Updated on 2024-12-11 GMT+08:00

Write Configuration

Table 1 Write configuration

Parameter

Description

Default Value

hoodie.datasource.write.table.name

Specifies the name of the Hudi table to be written.

None

hoodie.datasource.write.operation

Specifies the operation type for writing to the Hudi table. Currently, upsert, delete, insert, bulk_insert, insert_overwrite, and insert_overwrite_table are supported.

  • upsert: updates and inserts data.
  • delete: deletes data.
  • insert: inserts data.
  • bulk_insert: imports data during initial table creation. Do not use upsert or insert for the initial import.
  • insert_overwrite: performs insert and overwrite operations on static partitions.
  • insert_overwrite_table: performs insert and overwrite operations on dynamic partitions. It does not delete or overwrite the entire table immediately. Instead, it logically overwrites the metadata of the Hudi table, and Hudi removes the obsolete data through the clean mechanism. This is more efficient than combining bulk_insert with overwrite.

upsert

hoodie.datasource.write.table.type

Specifies the Hudi table type. Once the table type is specified, it cannot be changed. The value can be COPY_ON_WRITE or MERGE_ON_READ.

COPY_ON_WRITE

hoodie.datasource.write.precombine.field

Specifies the field used to merge and deduplicate rows with the same key before they are written.

A specific table field

hoodie.datasource.write.payload.class

Specifies the class used during an update to merge the incoming records with the existing records. You can implement your own class to customize the merge logic.

org.apache.hudi.common.model.DefaultHoodieRecordPayload

hoodie.datasource.write.recordkey.field

Specifies the primary key of the Hudi table. The Hudi table must have a unique primary key.

A specific table field

hoodie.datasource.write.partitionpath.field

Specifies the partition key. This parameter is used together with hoodie.datasource.write.keygenerator.class to meet the requirements of different partition scenarios.

None

hoodie.datasource.write.hive_style_partitioning

Specifies whether the partition mode is the same as that of Hive. You are advised to set this parameter to true.

true

hoodie.datasource.write.keygenerator.class

Specifies the key generator class. It is used together with hoodie.datasource.write.partitionpath.field and hoodie.datasource.write.recordkey.field to generate the primary key and partition mode.

NOTE:

If the value of this parameter is different from that saved in the table, a message is displayed, indicating that the value must be the same.

org.apache.hudi.keygen.ComplexKeyGenerator
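
Example: assembling the write parameters

The following Python sketch shows one way the parameters in Table 1 might be collected into a single options dictionary before a write. It is an illustration only: the helper name build_hudi_write_options, the table name example_table, and the field names id, ts, and dt are hypothetical and do not come from this document. The payload class, Hive-style partitioning setting, and key generator are set to the default values listed in the table. A real write may require further options that are not listed here.

```python
# Minimal sketch: collect the write parameters from Table 1 into a dict.
# The helper name, table name, and field names are hypothetical examples.

SUPPORTED_OPERATIONS = {
    "upsert", "delete", "insert", "bulk_insert",
    "insert_overwrite", "insert_overwrite_table",
}
TABLE_TYPES = {"COPY_ON_WRITE", "MERGE_ON_READ"}


def build_hudi_write_options(table_name, record_key, precombine_field,
                             partition_field=None, operation="upsert",
                             table_type="COPY_ON_WRITE"):
    """Return a dict of hoodie.datasource.write.* options."""
    # Reject values that Table 1 does not list.
    if operation not in SUPPORTED_OPERATIONS:
        raise ValueError("unsupported operation: " + operation)
    if table_type not in TABLE_TYPES:
        raise ValueError("unsupported table type: " + table_type)

    options = {
        "hoodie.datasource.write.table.name": table_name,
        "hoodie.datasource.write.operation": operation,
        "hoodie.datasource.write.table.type": table_type,
        "hoodie.datasource.write.recordkey.field": record_key,
        "hoodie.datasource.write.precombine.field": precombine_field,
        # Defaults taken from Table 1.
        "hoodie.datasource.write.payload.class":
            "org.apache.hudi.common.model.DefaultHoodieRecordPayload",
        "hoodie.datasource.write.hive_style_partitioning": "true",
        "hoodie.datasource.write.keygenerator.class":
            "org.apache.hudi.keygen.ComplexKeyGenerator",
    }
    # The partition key has no default, so set it only when provided.
    if partition_field is not None:
        options["hoodie.datasource.write.partitionpath.field"] = partition_field
    return options


options = build_hudi_write_options(
    "example_table", record_key="id", precombine_field="ts",
    partition_field="dt",
)

# Assuming a Spark DataFrame `df`, a target path `base_path`, and Hudi on the
# classpath, the options would typically be passed to the writer like this:
# df.write.format("hudi").options(**options).mode("append").save(base_path)
```

Validating the operation and table type before the write is a design choice of this sketch. It surfaces a mistyped value early instead of at job run time.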