Updated on 2024-10-23 GMT+08:00

To Hive

Data can be rapidly imported to MRS Hive.

Table 1 Parameter description

Type

Parameter

Description

Example Value

Basic parameters

Database

Database name. Click the icon next to the text box to open the dialog box for selecting a database.

default

Table Name

Destination table name. Click the icon next to the text box to open the dialog box for selecting a table.

This parameter can be set to a date and time macro variable, and a single table name can contain multiple macro variables. When date and time macro variables are used with a scheduled job, incremental data can be synchronized periodically. For details, see Incremental Synchronization Using the Macro Variables of Date and Time.

NOTE:

If you have configured a date and time macro variable and schedule the CDM job through DataArts Factory of DataArts Studio, the system replaces the macro variable with (Planned start time of the data development job - Offset) rather than (Actual start time of the CDM job - Offset).

TBL_X
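
The replacement rule in the note above can be illustrated with a minimal Python sketch. The `{date}` placeholder, the `resolve_table_name` function, and the sample times are hypothetical and exist only for illustration; they are not the CDM macro syntax, which is described in the referenced topic.

```python
from datetime import datetime, timedelta


def resolve_table_name(template: str, start_time: datetime,
                       offset: timedelta, fmt: str = "%Y%m%d") -> str:
    """Replace the hypothetical {date} placeholder with (start_time - offset)."""
    return template.replace("{date}", (start_time - offset).strftime(fmt))


# Assumed scenario: the job was planned for 23 October but actually
# started shortly after midnight on 24 October.
planned_start = datetime(2024, 10, 23, 0, 30)
actual_start = datetime(2024, 10, 24, 0, 5)
offset = timedelta(days=1)

# When scheduled through DataArts Factory, the planned start time is used.
print(resolve_table_name("TBL_{date}", planned_start, offset))  # TBL_20241022
# Using the actual start time would resolve to a different table.
print(resolve_table_name("TBL_{date}", actual_start, offset))   # TBL_20241023
```

Under these assumptions, a delayed run still writes to the table that matches its planned schedule.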

Hive Write Mode

Mode of writing data to Hive

  • TRUNCATE+LOAD: Data files in partitions are cleared, but partitions are not deleted.
  • LOAD: No operation is performed before data is written.
  • LOAD_OVERWRITE: A temporary directory named Table name_UUID is generated. The load overwrite syntax of Hive is used to load the temporary directory into the Hive table.

LOAD_OVERWRITE
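
As a rough illustration of how the three modes differ, the following Python sketch builds HiveQL statements that approximate each mode. It is an assumption about equivalent Hive syntax, not a description of what CDM runs internally; only the use of the load overwrite syntax for LOAD_OVERWRITE comes from the description above. The function names and the `/tmp` path are hypothetical.

```python
import uuid


def staging_dir_name(table: str) -> str:
    """Build a temporary directory name in the form <table name>_<UUID>."""
    return f"{table}_{uuid.uuid4()}"


def build_write_statements(table, mode, staging_dir, partition=None):
    """Return HiveQL statements that approximate the given write mode."""
    part = ""
    if partition:
        spec = ", ".join(f"{col}='{val}'" for col, val in partition.items())
        part = f" PARTITION ({spec})"
    load = f"LOAD DATA INPATH '{staging_dir}' INTO TABLE {table}{part}"
    if mode == "TRUNCATE+LOAD":
        # Assumption: clearing the data while keeping the partition
        # corresponds to TRUNCATE TABLE followed by a plain load.
        return [f"TRUNCATE TABLE {table}{part}", load]
    if mode == "LOAD":
        # Nothing is done before the data is written.
        return [load]
    if mode == "LOAD_OVERWRITE":
        # The staged directory replaces the existing data.
        return [f"LOAD DATA INPATH '{staging_dir}' OVERWRITE INTO TABLE {table}{part}"]
    raise ValueError(f"Unknown write mode: {mode}")


staging = f"/tmp/{staging_dir_name('TBL_X')}"
for stmt in build_write_statements("TBL_X", "LOAD_OVERWRITE", staging, {"dt": "2024-10-23"}):
    print(stmt)
```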

Partition Values

In TRUNCATE+LOAD mode, multiple partitions are supported. Enter the partition values in the corresponding text boxes.

In LOAD_OVERWRITE mode, data can be written to only one partition.

-

Advanced attributes

Source side null value conversion value

Null value conversion type

  • TO_NULL: The null value is not processed.
  • TO_EMPTY_STRING: The null value is converted to an empty string.
  • TO_NULL_STRING: The null value is converted to the string "null".

TO_NULL
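
The three conversion types can be summarized with a short Python sketch. The `convert_null` function is hypothetical and only models the described behavior for a single field value; it is not CDM code.

```python
def convert_null(value, mode="TO_NULL"):
    """Apply the null value conversion type to one source field value."""
    if value is not None:
        return value          # non-null values are never changed
    if mode == "TO_NULL":
        return None           # the null value is left as is
    if mode == "TO_EMPTY_STRING":
        return ""             # null becomes an empty string
    if mode == "TO_NULL_STRING":
        return "null"         # null becomes the literal string "null"
    raise ValueError(f"Unknown conversion type: {mode}")


row = ["alice", None, 42]
print([convert_null(v, "TO_NULL_STRING") for v in row])  # ['alice', 'null', 42]
```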

Newline character processing mode

Policy for processing the newline characters in the data written to Hive textfile tables.

You can select Delete, Replace with another character string, or Do not process.

Delete

Newline Replacement String

This parameter is available when Newline character processing mode is set to Replace with another character string.

It specifies the string that replaces newline characters.

N/A
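
Hive textfile tables use the newline character as the default row delimiter, so a newline inside a field value would split one record into several rows. The following Python sketch models the three processing options. The `process_newlines` function is hypothetical, and treating `\r\n`, `\r`, and `\n` all as newline characters is an assumption of this sketch.

```python
def process_newlines(text, mode="Delete", replacement=""):
    """Apply the newline processing mode to one field value."""
    if mode == "Do not process":
        return text
    if mode == "Delete":
        substitute = ""
    elif mode == "Replace with another character string":
        substitute = replacement  # the Newline Replacement String value
    else:
        raise ValueError(f"Unknown processing mode: {mode}")
    # Normalize Windows and old Mac line endings before substituting.
    normalized = text.replace("\r\n", "\n").replace("\r", "\n")
    return normalized.replace("\n", substitute)


value = "line one\nline two\r\nline three"
print(process_newlines(value))  # line oneline twoline three
print(process_newlines(value, "Replace with another character string", " "))
# line one line two line three
```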

Executing Analyze Statements

After all data is written, the ANALYZE TABLE statement is executed asynchronously to collect table statistics, which accelerates queries on the Hive table. The SQL statements are as follows:

  • Non-partitioned table: ANALYZE TABLE tablename COMPUTE STATISTICS
  • Partitioned table: ANALYZE TABLE tablename PARTITION(partcol1[=val1], partcol2[=val2], ...) COMPUTE STATISTICS
NOTE:
  • The Executing Analyze Statements parameter applies only to single-table migration.
  • Running the ANALYZE statement may put extra load on Hive.

Yes
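
The two statement forms above can be generated with a small Python sketch. The `build_analyze_statement` function is hypothetical; it only assembles the SQL text shown above and does not connect to Hive. A partition column mapped to `None` is emitted without a value, matching the optional `[=val]` part of the syntax.

```python
def build_analyze_statement(table, partition=None):
    """Build the ANALYZE TABLE statement for a table, optionally partitioned."""
    if not partition:
        return f"ANALYZE TABLE {table} COMPUTE STATISTICS"
    spec = ", ".join(
        col if val is None else f"{col}='{val}'"
        for col, val in partition.items()
    )
    return f"ANALYZE TABLE {table} PARTITION({spec}) COMPUTE STATISTICS"


print(build_analyze_statement("TBL_X"))
# ANALYZE TABLE TBL_X COMPUTE STATISTICS
print(build_analyze_statement("TBL_X", {"dt": "2024-10-23", "region": None}))
# ANALYZE TABLE TBL_X PARTITION(dt='2024-10-23', region) COMPUTE STATISTICS
```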