Updated on 2024-10-23 GMT+08:00

CDM Job

Functions

The CDM Job node is used to run a predefined CDM job for data migration.

If you have configured a date and time macro variable in a CDM job and schedule the CDM job through DataArts Factory, the system replaces the macro variable with (planned start time of the data development job, adjusted by the offset) rather than (actual start time of the CDM job, adjusted by the offset).
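The rule above can be illustrated with a minimal Python sketch. The helper name `resolve_macro_time`, the signed `offset` argument, and the sample timestamps are assumptions made for illustration only; they are not part of the CDM or DataArts Factory interfaces.

```python
from datetime import datetime, timedelta


def resolve_macro_time(base_time: datetime, offset: timedelta,
                       fmt: str = "%Y-%m-%d") -> str:
    """Hypothetical helper: apply a signed offset to a base time and format it."""
    return (base_time + offset).strftime(fmt)


# Assumed sample values: the job was planned for 2024-10-23 but actually
# started a day later (for example, during a rerun).
planned_start = datetime(2024, 10, 23, 0, 0)
actual_start = datetime(2024, 10, 24, 0, 5)
offset = timedelta(days=-1)

# Value substituted when the job is scheduled through DataArts Factory:
value_used = resolve_macro_time(planned_start, offset)
# Value that would result from the actual start time, which is NOT used:
value_not_used = resolve_macro_time(actual_start, offset)
```

In this sketch the two values differ by one day, which is why the choice of base time matters for reruns and delayed executions.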

Parameters

Table 1, Table 2, and Table 3 describe the parameters of the CDM Job node. Configure the lineage to identify the data flow direction, which can be viewed in the DataArts Catalog module.

Table 1 Parameters of CDM Job nodes

Parameter

Mandatory

Description

CDM Cluster Name

Yes

Name of the CDM cluster to which the CDM job to be executed belongs.

You can select two CDM clusters to improve job reliability.
  • If you select two clusters, jobs are delivered to them randomly to share the load. If one cluster is abnormal, jobs are switched to the other cluster.
  • If you select two clusters, you are advised to set Job Type to Existing jobs rather than New jobs and ensure that the job exists in both clusters. You can create a CDM job in one cluster, export it, and import it to the other cluster to implement job synchronization. For details, see Exporting and Importing CDM Jobs in Batches.
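The two-cluster behavior described above (random delivery with a switch to the other cluster when one is abnormal) can be sketched as follows. This is only a conceptual model: `pick_cluster`, the `is_healthy` callback, and the cluster names are hypothetical and do not reflect how the service is implemented.

```python
import random
from typing import Callable, Optional, Sequence


def pick_cluster(clusters: Sequence[str],
                 is_healthy: Callable[[str], bool],
                 rng: Optional[random.Random] = None) -> str:
    """Pick a cluster at random, skipping any cluster reported as abnormal."""
    rng = rng or random.Random()
    candidates = list(clusters)
    rng.shuffle(candidates)          # random order spreads the load
    for name in candidates:
        if is_healthy(name):         # fall through to the other cluster if abnormal
            return name
    raise RuntimeError("no healthy CDM cluster available")


# Assumed scenario: "cdm-a" is abnormal, so the job always lands on "cdm-b".
chosen = pick_cluster(["cdm-a", "cdm-b"], lambda name: name != "cdm-a")
```

Because either cluster may be chosen, the sketch also shows why the same job must exist in both clusters.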

Job Type

Yes

  • Existing jobs
  • New jobs
NOTE:
  • If Job Type is Existing jobs, the job node is not updated when the CDM job is modified. To update the job node, save the job where the node is located again to trigger a CDM job update.
  • If Job Type is New jobs, the system checks whether a CDM job with the same name is running.
    • If no CDM job with the same name is running, the system updates the job with that name based on the request body.
    • If a CDM job with the same name is running, the system updates the job after the current run is complete. During this period, the job may be started by other tasks. As a result, the extracted data may differ from what you expect (for example, the job configuration is not updated, or the macro of the running time is not correctly replaced). Therefore, do not create multiple jobs with the same name.

CDM Job Name

No

This parameter is required only when Job Type is set to Existing jobs. Name of the CDM job to be executed.

If the CDM job uses the job parameters or environment variables configured during data development, data can be indirectly migrated based on the parameters or variables during node scheduling in the DataArts Factory module.

CDM Job Message Body

No

This parameter is required only when Job Type is set to New jobs. Enter the JSON message body of the CDM job. For convenience, you can choose More > View Job JSON in the Operation column of an existing CDM job, copy the JSON content, and modify the content here.

If the CDM job uses the job parameters or environment variables configured during data development, data can be indirectly migrated based on the parameters or variables during node scheduling in the DataArts Factory module.

Node Name

Yes

Name of a node. The name must contain 1 to 128 characters, including only letters, numbers, underscores (_), hyphens (-), slashes (/), less-than signs (<), and greater-than signs (>).

By default, the node name is the same as that of the selected CDM job. If you want the node name to be different from the CDM job name, disable this function by referring to Disabling Auto Node Name Change.
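The node name rule in Table 1 can be checked before a job is saved. The following Python sketch is one way to express it; the function name `is_valid_node_name` is hypothetical, and the pattern assumes that "letters" means ASCII letters, which the rule does not state explicitly.

```python
import re

# 1 to 128 characters: letters, digits, underscores, hyphens, slashes, < and >.
# Assumption: only ASCII letters are accepted.
NODE_NAME_PATTERN = re.compile(r"[A-Za-z0-9_\-/<>]{1,128}")


def is_valid_node_name(name: str) -> bool:
    """Return True if the whole name matches the documented character rule."""
    return NODE_NAME_PATTERN.fullmatch(name) is not None


sample_ok = is_valid_node_name("cdm_job-01/<daily>")
sample_bad = is_valid_node_name("bad name")  # spaces are not allowed
```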

Table 2 Advanced parameters

Parameter

Mandatory

Description

Node Status Polling Interval (s)

Yes

Specifies how often the system checks whether the node task is complete. The value ranges from 1 to 60 seconds.

Max. Node Execution Duration

Yes

Execution timeout interval for the node. If retry is configured and the execution is not complete within the timeout interval, the node will be executed again.

Retry upon Failure

Yes

Whether to re-execute a node if it fails to be executed.

  • Yes: The node task will be re-executed, and the following parameters must be configured:
    • Maximum Retries
    • Retry Interval (seconds)
  • No: The node will not be re-executed. This is the default setting.
NOTE:
  • You are advised to configure automatic retry for only file migration jobs or database migration jobs with Import to Staging Table enabled to avoid data inconsistency caused by repeated data writes.
  • If parameter transfer is used for scheduling the CDM job, do not configure the Retry upon Failure parameter in the CDM job.
  • If both retry and a timeout duration are configured for a job node, the node can be retried when its execution times out.
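How the polling interval, maximum execution duration, and retry settings interact can be modeled with a short Python sketch. It is a conceptual illustration, not the scheduler's implementation: `run_node`, the status strings, and the injectable `sleep` and `clock` arguments are all assumptions.

```python
import time
from typing import Callable


def run_node(start: Callable[[], None],
             get_status: Callable[[], str],
             *,
             poll_interval_s: float,
             max_duration_s: float,
             max_retries: int = 0,
             retry_interval_s: float = 0.0,
             sleep: Callable[[float], None] = time.sleep,
             clock: Callable[[], float] = time.monotonic) -> bool:
    """Run a node, polling its status; retry on failure or on timeout."""
    if not 1 <= poll_interval_s <= 60:
        raise ValueError("polling interval must be between 1 and 60 seconds")
    for attempt in range(max_retries + 1):
        start()
        deadline = clock() + max_duration_s
        status = "running"
        while clock() < deadline:
            status = get_status()        # hypothetical: "running", "succeeded", "failed"
            if status != "running":
                break
            sleep(poll_interval_s)
        if status == "succeeded":
            return True
        # Reaching here means the attempt failed or timed out.
        if attempt < max_retries:
            sleep(retry_interval_s)
    return False
```

With `max_retries=0` the function mirrors the default setting (No): a failed or timed-out node is not executed again.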

Policy for Handling Subsequent Nodes If the Current Node Fails

Yes

Operation that will be performed if the node fails to be executed. Possible values:

  • Suspend execution plans of the subsequent nodes: stops running subsequent nodes. The job instance status is Failed.
  • End the current job execution plan: stops running the current job. The job instance status is Failed.
  • Go to the next node: ignores the execution failure of the current node. The job instance status is Failure ignored.
  • Suspend the current job execution plan: If the current job instance is in an abnormal state, the subsequent nodes of this node and the subsequent job instances that depend on the current job enter the waiting state.

Enable Dry Run

No

If you select this option, the node will not be executed, and a success message will be returned.

Task Groups

No

Select a task group. If you select a task group, you can control the maximum number of concurrent nodes in the task group in a fine-grained manner in scenarios where a job contains multiple nodes, a data patching task is ongoing, or a job is rerunning.
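The effect of a task group, capping how many nodes run at the same time, can be sketched with a semaphore. The `TaskGroup` class, its `max_concurrent` argument, and the sample node are hypothetical and serve only to illustrate the idea of a concurrency limit.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


class TaskGroup:
    """Hypothetical model of a task group with a concurrency cap."""

    def __init__(self, max_concurrent: int):
        self._slots = threading.BoundedSemaphore(max_concurrent)
        self._lock = threading.Lock()
        self._running = 0
        self.peak = 0  # highest number of nodes observed running at once

    def run(self, node):
        with self._slots:                # wait until a slot in the group is free
            with self._lock:
                self._running += 1
                self.peak = max(self.peak, self._running)
            try:
                return node()
            finally:
                with self._lock:
                    self._running -= 1


def sample_node():
    time.sleep(0.01)                     # stand-in for real node work
    return "ok"


group = TaskGroup(max_concurrent=2)
with ThreadPoolExecutor(max_workers=6) as pool:
    results = list(pool.map(lambda _: group.run(sample_node), range(6)))
```

Six nodes are submitted at once, but the group never lets more than two of them run concurrently.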

Table 3 Lineage

Parameter

Description

Input

Add

Click Add. In the Type drop-down list, select the type to be created. The value can be DWS, OBS, CSS, HIVE, DLI, or CUSTOM.

OK

Click OK to save the parameter settings.

Cancel

Click Cancel to cancel the parameter settings.

Modify

Click to modify the parameter settings. After the modification, save the settings.

Delete

Click to delete the parameter settings.

View Details

Click to view details about the table created based on the input lineage.

Output

Add

Click Add. In the Type drop-down list, select the type to be created. The value can be DWS, OBS, CSS, HIVE, DLI, or CUSTOM.

OK

Click OK to save the parameter settings.

Cancel

Click Cancel to cancel the parameter settings.

Modify

Click to modify the parameter settings. After the modification, save the settings.

Delete

Click to delete the parameter settings.

View Details

Click to view details about the table created based on the output lineage.