Updated on 2024-10-23 GMT+08:00

To Hudi

Table 1 Parameter description

Type

Parameter

Description

Recommended Configuration

Basic parameters

Database Name

Name of the database to write data to. Click the icon next to the text box to open the dialog box for selecting the database.

dbadmin

Table Name

Name of the table to write data to. Click the icon next to the text box to open the dialog box for selecting the table.

This parameter can be set to a macro variable of date and time, and a path name can contain multiple macro variables. When a macro variable of date and time is used together with a scheduled job, incremental data can be synchronized periodically. For details, see Incremental Synchronization Using the Macro Variables of Date and Time.

NOTE:

If you have configured a macro variable of date and time and schedule the CDM job through DataArts Factory of DataArts Studio, the system replaces the macro variable with (Planned start time of the data development job - Offset) rather than (Actual start time of the CDM job - Offset).

cdm
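The replacement rule in the note above can be illustrated with a minimal Python sketch. It is not CDM code: the function name resolve_date_macro, the one-day offset, the output format, and both timestamps are assumptions chosen for illustration. It only shows why the planned start time and the actual start time can yield different values when a job starts late.

```python
from datetime import datetime, timedelta


def resolve_date_macro(base_time: datetime, offset: timedelta,
                       fmt: str = "%Y-%m-%d") -> str:
    """Return the text a date-and-time macro would expand to: (base_time - offset).

    Hypothetical helper; the real expansion is performed by the scheduler.
    """
    return (base_time - offset).strftime(fmt)


# Assumed example: the job is planned for 23:50 but actually starts 15 minutes late.
planned_start = datetime(2024, 10, 23, 23, 50)  # planned start of the data development job
actual_start = datetime(2024, 10, 24, 0, 5)     # actual start of the CDM job
offset = timedelta(days=1)

# Value used when the job is scheduled through DataArts Factory.
from_planned = resolve_date_macro(planned_start, offset)
# Value that would result if the actual start time were used instead.
from_actual = resolve_date_macro(actual_start, offset)

print(from_planned, from_actual)
```

Because the planned start time is used, a delayed run still resolves to the date of the intended scheduling period.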

Table Preparation Mode

Whether to automatically create Hudi tables

  • One-click creation: The destination table is automatically created.
  • Auto creation: If the destination database does not contain the table specified by Table Name, CDM will automatically create the table. If the table specified by Table Name already exists, no table is created and data is written to the existing table.

Auto creation

Writing Mode

Data write mode

  • TRUNCATE+LOAD: The TRUNCATE statement is executed to clear data in partitions before new data is written.
  • LOAD: No operation is performed before data is written.
  • INSERT_OVERWRITE: Data is overwritten.

LOAD

Partition

Partition information. When writing data to a partitioned table, specify the partitions to write data to.

Example: year=2020,location=sun.

-
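The partition example above (year=2020,location=sun) is a comma-separated list of key=value pairs. The following Python sketch shows one way to check such a string before entering it. The function name parse_partition_spec is hypothetical, and the validation rules are assumptions; the checks CDM itself applies are not described here.

```python
def parse_partition_spec(spec: str) -> dict[str, str]:
    """Split a 'key=value,key=value' partition string into an ordered mapping.

    Hypothetical helper for checking input; not part of CDM.
    """
    partitions: dict[str, str] = {}
    for item in spec.split(","):
        key, sep, value = item.partition("=")
        key, value = key.strip(), value.strip()
        # Reject entries such as 'year', '=2020', or 'year='.
        if not sep or not key or not value:
            raise ValueError(f"expected key=value, got {item!r}")
        partitions[key] = value
    return partitions


spec = parse_partition_spec("year=2020,location=sun")
print(spec)
```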

Advanced attributes

DB Write Time Field

When a table is automatically created, this field is automatically added to the table creation statement. When the data is written to the Hudi table, the value of this field is the current time. The field must be of the timestamp type.

-

Write Parameters

Parameters configured using the set syntax to control how data is inserted into Hudi through Spark SQL statements.

hoodie.combine.before.upsert
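As a rough illustration of the set syntax mentioned for Write Parameters, the Python sketch below renders key-value pairs as Spark SQL set statements. The helper render_set_statements is hypothetical, the value true is an assumed example (the table lists only the parameter name), and the trailing semicolon is an assumption about how statements are separated.

```python
def render_set_statements(params: dict[str, str]) -> list[str]:
    """Render each parameter as a Spark SQL 'set key=value;' statement.

    Hypothetical helper; CDM builds the actual statements itself.
    """
    return [f"set {key}={value};" for key, value in params.items()]


# Assumed value for illustration only.
write_params = {"hoodie.combine.before.upsert": "true"}
statements = render_set_statements(write_params)

for statement in statements:
    print(statement)
```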