Updated on 2026-05-20 GMT+08:00

Hudi Load, Truncate+Load, and Insert Overwrite Modes

Overview

DataArts Migration updates and incrementally loads Hudi data by performing write operations. Hudi supports three write modes: Load, Truncate+Load, and Insert Overwrite.

Hive Data Write Modes

  • Load mode

    This is a basic write mode of Hudi. In this mode, data is appended to a destination Hudi table, and existing data in the table is not cleared.

    • Characteristics
      • Destination table not to be cleared: The destination table is not cleared before data is written to it. New data is appended to the table.
      • Incremental write: This mode is suitable for adding new data to a destination table that already contains data.
    • Example

      The source table is the MySQL table named data, the destination table is a Hudi table named data, and the partition field is dt. You can configure a job to write incremental data in the MySQL table to the Hudi table. Hudi appends the data to the corresponding partition based on the partition field dt.

      Source MySQL table

      CREATE TABLE `data` (  `id` varchar(10) DEFAULT NULL,  `dt` date DEFAULT NULL) ENGINE=InnoDB DEFAULT CHARSET=utf8;

      Destination Hudi table

      CREATE TABLE `data` (  `id` varchar(10) DEFAULT NULL,  `dt` date DEFAULT NULL) ENGINE=InnoDB DEFAULT CHARSET=utf8;
  • Truncate+Load mode

    In this mode, data in the partitions of a destination Hudi table are cleared before new data is loaded.

    • Characteristics
      • Clearing data in the partition of a destination table: Data in the selected partitions of a destination table is cleared before new data is written.
      • Partition retention: Although data is cleared, the partition structure is retained. New data is written to the corresponding partition based on the partition field.
    • Example

      The source table is the MySQL table named data, the destination table is a Hudi table named data, and the partition field is dt.

      Configure a job to truncate and delete the data in the dt=2025-10-21 partition in the Hudi table, and then write the data in the MySQL table to the Hudi table. Hudi reorganizes the data to the corresponding partition based on the partition field dt.

      Source MySQL table

      CREATE TABLE `data` ( 
       `id` varchar(10) DEFAULT NULL, 
       `dt` date DEFAULT NULL
      ) ENGINE=InnoDB DEFAULT CHARSET=utf8;

      Destination Hive table

      CREATE TABLE `data` ( 
       `id` varchar(10)
      )
      PARTITIONED BY (`dt` date)
      STORED AS PARQUET
  • Insert Overwrite mode

    Data is first written to a temporary table, and then the data in the temporary table is written to the destination table using the INSERT OVERWRITE syntax of Hudi and overwrites existing data in the destination table.

    • Implementation

      This mode is implemented in Hudi through the following steps:

      1. Creating a temporary table: Create a temporary table to store data.
      2. Writing data to the temporary table: Write data to the temporary table.
      3. Overwriting existing data in the destination table: Use the INSERT OVERWRITE syntax to overwrite existing data in the destination table with the data in the temporary table.
    • Characteristics
      • Temporary table: The table temporarily stores data.
      • Overwriting existing data in the destination table: The INSERT OVERWRITE syntax of Hudi is used to overwrite existing data in the destination table with the data in the temporary table.
      • Writing data to dynamic and static partitions: The Insert Overwrite mode can write data to dynamic and static partitions.
        • Writing data to dynamic partitions: Values of partition fields in the source data are automatically identified and used to determine the destination partitions. Data in multiple partitions can be overwritten at a time. This meets your requirements for updating data in multiple partitions at the same time.
        • Writing data to static partitions: You need to specify the destination partition in the write operation. Data in only one partition can be overwritten at a time. This method is suitable for updating data in a specific partition.
    • Example

      The source table is the MySQL table named data, the destination table is a Hudi table named data, and the partition field is dt.

      The structure of the source MySQL table is as follows:

      CREATE TABLE `data` (  `id` varchar(10) DEFAULT NULL,  `dt` date DEFAULT NULL) ENGINE=InnoDB DEFAULT CHARSET=utf8;

      The structure of the destination Hudi table is as follows:

      CREATE TABLE `data` (  `id` varchar(10))PARTITIONED BY (`dt` date)STORED AS PARQUET LOCATION 'hdfs://path/to/hudi/table/data';
      • Static partition

        If you want to write all data in the MySQL table to the dt=2025-10-21 partition of the Hudi table, you can select the INSERT OVERWRITE mode and set the partition filter to dt=2025-10-21 in the write operation.

      • Dynamic partition

        If the MySQL table contains data from multiple partitions and you want the data to be automatically identified and to overwrite data in the destination partition based on the partition field value in the source data, you can select the Insert Overwrite mode. In this case, you do not need to specify the partition filter. You need to configure mapping of the dt partition field.

Summary

You can select an appropriate Hudi data write mode from Load, Truncate+Load, and Insert Overwrite to efficiently migrate and update data. You are advised to follow the recommendations in this practice to ensure data consistency and high system performance.