Updated on 2026-05-20 GMT+08:00

Hive Load, Truncate+Load, and Load Overwrite Modes

Overview

DataArts Migration migrates and updates Hive data by loading it. You can select the Load, Truncate+Load, or Load Overwrite loading mode to meet your requirements in different scenarios.

Hive Data Write Modes

  • Load mode

    This is the basic loading mode. In this mode, data files are directly loaded to a destination Hive table. DataArts Migration uses the native file write capability of Hive to adapt to partitioned and non-partitioned Hive file formats.

    • Characteristics
      • Destination table not to be cleared: The destination table is not cleared before data is loaded.
      • Application scenario: This mode is suitable for adding new data to a destination table that already contains data.
    • Example

      The source table is a MySQL table named data, the destination table is a Hive table named data, and the partition field is dt. After the job is configured, incremental data is written to the Hive table.

      Source MySQL table

      CREATE TABLE `data` (  `id` varchar(10) DEFAULT NULL,  `dt` date DEFAULT NULL) ENGINE=InnoDB DEFAULT CHARSET=utf8;

      Destination Hive table

      CREATE TABLE `data` (  `id` varchar(10))PARTITIONED BY (`dt` date);
  • Truncate+Load mode

    In this mode, data files in a destination table are cleared before new data is loaded. In the Truncate mode, data files in partitions are deleted, and the partitions are not deleted.

    • Characteristics
      • Deleting partition data: Data files in partitions are deleted, and the partition are not deleted.
      • Support for multiple partitions: Multiple partitions can be cleared and data can be loaded to these partitions at the same time.
      • Application scenario: This mode is suitable for updating all data in the partitions of a destination table.
    • Example

      The source table is a MySQL table named data, the destination table is a Hive table named data, and the partition field is dt. You want data in the dt=2025-10-21 partition to be deleted and data written to the Hive table.

      Source MySQL table

      CREATE TABLE `data` (  `id` varchar(10) DEFAULT NULL,  `dt` date DEFAULT NULL) ENGINE=InnoDB DEFAULT CHARSET=utf8;

      Destination Hive table

      CREATE TABLE `data` (  `id` varchar(10))PARTITIONED BY (`dt` date);
  • Load Overwrite mode

    In this mode, data is written to a temporary directory first. Then the temporary directory is loaded to the destination table using the LOAD DATA OVERWRITE syntax of Hive.

    • Characteristics
      • Temporary directory: The name of the temporary directory is in Table name_UUID format.
      • Overwriting the destination table: The LOAD DATA OVERWRITE syntax is used to overwrite the data in the destination table.
      • Application scenario: This mode is suitable for overwriting all data in the partitions of the destination table. It ensures data consistency in the overwritten partition.
      • Writing data to dynamic and static partitions: In the Load Overwrite mode, data can be written to dynamic or static partitions, meeting the requirements for overwriting data in a single partition or multiple partitions.
        • Writing data to dynamic partitions: Values of partition fields in the source data are automatically identified and used to determine the destination partitions. Data in multiple partitions can be overwritten at a time. This meets your requirements for updating data in multiple partitions at the same time.
        • Writing data to static partitions: You need to specify the destination partition in the loading statement. Data in only one partition can be overwritten at a time. This method is suitable for updating data in a specific partition.
    • Example

      The source table is a MySQL table named data, the destination table is a Hive table named data, and the partition field is dt. You want to write MySQL data to the Hive table again.

      Source MySQL table

      CREATE TABLE `data` (  `id` varchar(10) DEFAULT NULL,  `dt` date DEFAULT NULL) ENGINE=InnoDB DEFAULT CHARSET=utf8;

      Destination Hive table

      CREATE TABLE `data` (  `id` varchar(10))PARTITIONED BY (`dt` date);
      • Static partition

        You want to write MySQL data to a partition of the Hive table dt=2025-10-21.

        You can set Write Mode to LOAD_OVERWRITE and Partition Filter Condition to dt=2025-10-21.

      • Dynamic partition

        The source MySQL data is from multiple partitions. You want values of partition fields in the source data to be automatically identified and used to determine the destination partitions. You hope that data in multiple partitions can be overwritten at a time so that you can update data in multiple partitions at the same time.

        You can set Write Mode to LOAD_OVERWRITE, leave Partition Filter Condition empty, and configure partition field dt for the source and destination field mapping.

Summary

You can select an appropriate Hive data loading mode from Load, Truncate+Load, and Load Overwrite to efficiently migrate and update data. You are advised to follow the recommendations in this practice to ensure data consistency and high system performance.