Updated on 2024-04-03 GMT+08:00

From HBase/CloudTable

If the source link of a job is an HBase or CloudTable link, that is, if data is exported from MRS HBase, FusionInsight HBase, CloudTable, or Apache HBase, configure the source job parameters based on Table 1.

  1. When you migrate data from CloudTable or HBase, CDM reads the first row of the table as a sample to build the field list. If the first row does not contain all fields of the table, you need to add the missing fields manually.
  2. Because HBase is schema-less, CDM cannot obtain the data types. If the data is stored in binary format, CDM cannot parse the data.
  3. Because HBase and CloudTable are schema-less storage systems, CDM requires that numeric fields in the source be stored in regular decimal format rather than in binary format when data is exported. For example, the value 100 needs to be stored as 100 rather than 01100100.
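
The difference between decimal and binary storage of the value 100 can be illustrated with a minimal Python sketch. The helper names below are hypothetical and are not part of CDM, HBase, or CloudTable. They only show what the two byte layouts look like and why decimal text is readable without a schema.

```python
import re

# Matches plain decimal text such as "100", "-7" or "3.14".
_DECIMAL_TEXT = re.compile(r"-?\d+(\.\d+)?")


def encode_as_decimal_text(value: int) -> bytes:
    # 100 -> b"100": three ASCII characters that read as a number.
    return str(value).encode("ascii")


def encode_as_raw_binary(value: int, width: int = 1) -> bytes:
    # 100 -> a single byte with the bit pattern 01100100 (big-endian).
    return value.to_bytes(width, "big")


def is_decimal_text(cell: bytes) -> bool:
    # True only if the bytes decode to plain decimal text.
    try:
        text = cell.decode("ascii")
    except UnicodeDecodeError:
        return False
    return _DECIMAL_TEXT.fullmatch(text) is not None


def bit_string(cell: bytes) -> str:
    # Render each byte as eight binary digits for inspection.
    return " ".join(format(byte, "08b") for byte in cell)


decimal_cell = encode_as_decimal_text(100)
binary_cell = encode_as_raw_binary(100)

print(decimal_cell, is_decimal_text(decimal_cell))
print(bit_string(binary_cell), is_decimal_text(binary_cell))
```

The check is only a heuristic, because some binary values happen to form valid digit characters. It is meant as a quick sanity test on sample cells before a migration, not as a description of how CDM parses data.
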
Table 1 Parameter description

| Category | Parameter | Description | Example Value |
| --- | --- | --- | --- |
| Basic parameters | Table Name | Name of the HBase table that data will be exported from. This parameter can be set to a macro variable of date and time, and the name can contain multiple macro variables. When the macro variable of date and time works with a scheduled job, the incremental data can be synchronized periodically. For details, see Incremental Synchronization Using the Macro Variables of Date and Time. NOTE: If you have configured a macro variable of date and time and schedule a CDM job through DataArts Factory of DataArts Studio, the system replaces the macro variable of date and time with (Planned start time of the data development job - Offset) rather than (Actual start time of the CDM job - Offset). | TBL_2 |
| Basic parameters | Column Families | (Optional) Column families to which the exported data belongs. | CF1&CF2 |
| Advanced attributes | Split Rowkey | (Optional) Whether to split a rowkey. The default value is No. | Yes |
| Advanced attributes | Rowkey Delimiter | (Optional) Delimiter used to split a rowkey. If this parameter is left empty, the rowkey will not be split. | \| |
| Advanced attributes | Start Time | (Optional) Start time (inclusive) for extracting data. The format is yyyy-MM-dd HH:mm:ss. Only the data generated at the specified time or later is extracted. This parameter can be set to a macro variable of date and time. When the macro variable of date and time works with a scheduled job, the incremental data can be synchronized periodically. For details, see Incremental Synchronization Using the Macro Variables of Date and Time. NOTE: If you have configured a macro variable of date and time and schedule a CDM job through DataArts Factory of DataArts Studio, the system replaces the macro variable of date and time with (Planned start time of the data development job - Offset) rather than (Actual start time of the CDM job - Offset). | 2019-01-01 20:00:00 |
| Advanced attributes | End Time | (Optional) End time (exclusive) for extracting data. The format is yyyy-MM-dd HH:mm:ss. Only the data generated before the specified time is extracted. This parameter can be set to a macro variable of date and time. For details, see Incremental Synchronization Using the Macro Variables of Date and Time. NOTE: If you have configured a macro variable of date and time and schedule a CDM job through DataArts Factory of DataArts Studio, the system replaces the macro variable of date and time with (Planned start time of the data development job - Offset) rather than (Actual start time of the CDM job - Offset). | 2019-02-01 20:00:00 |
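
The Start Time and End Time semantics (start included, end excluded) and the Rowkey Delimiter behavior described in Table 1 can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the timestamp being compared is assumed to be the time at which a record was generated, and nothing here reflects how CDM implements extraction internally.

```python
from datetime import datetime
from typing import List, Optional

# Python equivalent of the yyyy-MM-dd HH:mm:ss format used in Table 1.
TIME_FORMAT = "%Y-%m-%d %H:%M:%S"


def parse_time(text: str) -> datetime:
    # Parse a Start Time or End Time value such as "2019-01-01 20:00:00".
    return datetime.strptime(text, TIME_FORMAT)


def in_time_window(
    ts: datetime,
    start: Optional[datetime] = None,
    end: Optional[datetime] = None,
) -> bool:
    # Half-open window: the start is included, the end is excluded.
    # Either bound may be omitted because both parameters are optional.
    if start is not None and ts < start:
        return False
    if end is not None and ts >= end:
        return False
    return True


def split_rowkey(rowkey: str, delimiter: str) -> List[str]:
    # An empty delimiter means the rowkey is not split.
    if not delimiter:
        return [rowkey]
    return rowkey.split(delimiter)


start = parse_time("2019-01-01 20:00:00")
end = parse_time("2019-02-01 20:00:00")

print(in_time_window(start, start, end))  # record exactly at Start Time
print(in_time_window(end, start, end))    # record exactly at End Time
print(split_rowkey("user01|20190101", "|"))
```

Because the window is half-open, consecutive scheduled runs whose End Time equals the next run's Start Time neither skip nor duplicate a record that falls exactly on the boundary.
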