Updated on 2024-08-30 GMT+08:00

Specifications for Setting Compaction Parameters When Spark Asynchronous Tasks Run Table Compaction

  • Do not manually run the run schedule command to generate a compaction plan while a write job on the table is still running.

    Incorrect example:

    run schedule on dsrTable;

    If other tasks are writing data to the table when this command runs, data will be lost.

  • When running the run compaction command, always set hoodie.run.compact.only.inline to true. Never set it to false.

    Incorrect example:

    set hoodie.run.compact.only.inline=false;
    run compaction on dsrTable;

    If other tasks are writing data to the table when these commands run, data will be lost.

    Correct example (asynchronous compaction):

    set hoodie.compact.inline=true;
    set hoodie.run.compact.only.inline=true;
    set hoodie.run.compact.only.inline=true;
    run compaction on dsrTable;
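
The two rules above can be checked mechanically before a script is submitted. The following is a minimal sketch in Python, not part of MRS or Hudi: the function name check_compaction_script and the messages it returns are hypothetical. It also assumes that leaving hoodie.run.compact.only.inline unset is as unsafe as setting it to false, and it cannot tell whether a write job is actually running, so it flags every manual run schedule statement for review.

```python
import re
from typing import Dict, List

# Matches statements such as "set hoodie.compact.inline = true".
SET_RE = re.compile(r"^set\s+([\w.]+)\s*=\s*(\S+)$", re.IGNORECASE)

ONLY_INLINE_KEY = "hoodie.run.compact.only.inline"


def check_compaction_script(sql: str) -> List[str]:
    """Return rule violations found in a semicolon-separated SQL script.

    Hypothetical helper for illustration only; it inspects text and never
    contacts Spark or Hudi.
    """
    problems: List[str] = []
    settings: Dict[str, str] = {}
    for raw in sql.split(";"):
        stmt = " ".join(raw.split())  # collapse whitespace and newlines
        if not stmt:
            continue
        match = SET_RE.match(stmt)
        if match:
            # Remember the most recent value of each setting.
            settings[match.group(1)] = match.group(2).lower()
            continue
        lowered = stmt.lower()
        if lowered.startswith("run schedule on "):
            # Rule 1: manual plan generation is unsafe while writers are active.
            problems.append("manual 'run schedule' found: " + stmt)
        elif lowered.startswith("run compaction on "):
            # Rule 2: the setting must be true before 'run compaction'.
            # Assumption: an unset value is treated the same as false.
            if settings.get(ONLY_INLINE_KEY) != "true":
                problems.append(ONLY_INLINE_KEY + " is not true before: " + stmt)
    return problems


if __name__ == "__main__":
    script = """
    set hoodie.run.compact.only.inline=false;
    run compaction on dsrTable;
    """
    for problem in check_compaction_script(script):
        print(problem)
```

A script that passes this check still has to follow the rules above at run time; the check only catches the two incorrect patterns shown in this section.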