Updated on 2023-03-21 GMT+08:00

HBase/CloudTable Incremental Migration

You can use CDM to export data in a specified period of time from HBase (including MRS HBase, FusionInsight HBase, and Apache HBase) and CloudTable. The CDM scheduled jobs can be used together to implement incremental migration of HBase and CloudTable.

If you have configured a macro variable of date and time and schedule a CDM job through DataArts Studio DataArts Factory, the system replaces the macro variable of date and time with (Planned start time of the data development jobOffset) rather than (Actual start time of the CDM jobOffset).

When creating a table/file migration job and selecting the link to HBase or CloudTable as the source link, you can set the time range in advanced attributes.

Figure 1 Time range
  • Start time (including the value) for extracting data. The format is yyyy-MM-dd HH:mm:ss. Only the data generated at the specified time and later is extracted.
  • End time (excluding the value) for extracting data. The format is yyyy-MM-dd HH:mm:ss. Only the data generated before the time point is extracted.

The two parameters can be set to macro variables of date and time. Examples are as follows:

  • If Minimum Timestamp is set to ${dateformat(yyyy-MM-dd HH:mm:ss, -1, DAY)}, only the data generated after the day before is exported.
  • If Maximum Timestamp is set to ${dateformat(yyyy-MM-dd HH:mm:ss)}, only the data generated before the specified time point is exported.

If both parameters are configured, CDM exports only the data generated on the previous day. In addition, if the job is configured to execute at 00:00:00 every day, the data generated every day can be incrementally synchronized.