Updated on 2024-10-23 GMT+08:00

Creating an Offline Processing Migration Job

Notes and Constraints

  • Offline processing migration jobs are not supported in enterprise mode.
  • You can use offline processing migration jobs only after applying for trustlist membership. To apply, contact customer service or technical support.

Procedure

  1. Log in to the DataArts Studio console by following the instructions in Accessing the DataArts Studio Instance Console.
  2. On the DataArts Studio console, locate a workspace and click DataArts Factory.
  3. In the left navigation pane of DataArts Factory, choose Development > Develop Job.
  4. Create a migration job using either of the following methods:

    Method 1: On the Develop Job page, click Create Data Migration Job.

    Figure 1 Creating a migration job (method 1)

    Method 2: In the directory list, right-click a directory and select Create Data Migration Job.

    Figure 2 Creating a migration job (method 2)
  5. In the displayed Create Data Migration Job dialog box, configure job parameters. Table 1 describes the job parameters.
    Figure 3 Configuring data migration job parameters
    Table 1 Job parameters

    Parameter

    Description

    Job Name

    Name of the job. The name must contain 1 to 128 characters, including only letters, numbers, hyphens (-), underscores (_), and periods (.).

    Job Type

    Job type. Select Offline processing.

    • Offline processing: A large amount of collected data is processed and analyzed in batches. These tasks usually use optimized computing and storage resources to ensure efficient data processing and analysis. These tasks are usually executed periodically (for example, every day or every week) to process a large amount of historical data for batch analysis and data warehouses.
    • Real-time processing: New data generated continuously is processed and analyzed in real time to meet the requirements for data timeliness. This mode requires instant processing of data upon generation and returns the result or triggers operations.

    Select Directory

    Directory to which the job belongs. The root directory is selected by default.

  6. Click OK.
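
The Job Name rule in Table 1 can be checked before you open the dialog box, for example when job names are generated by a script. The following Python sketch is illustrative only: the function name is_valid_job_name is hypothetical, and the pattern assumes that "letters" means ASCII letters, which the console may not enforce in exactly this way.

```python
import re

# Pattern derived from the Job Name rule in Table 1: 1 to 128 characters,
# only letters, numbers, hyphens (-), underscores (_), and periods (.).
# Assumption: "letters" means ASCII letters; the console may accept more.
JOB_NAME_PATTERN = re.compile(r"[A-Za-z0-9._-]{1,128}")


def is_valid_job_name(name: str) -> bool:
    """Return True if name satisfies the documented Job Name constraints."""
    return JOB_NAME_PATTERN.fullmatch(name) is not None
```

For example, is_valid_job_name("mysql_to_dws.daily-01") returns True, while an empty name, a name with spaces, or a name longer than 128 characters returns False.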

Configuring Basic Job Information

After you configure the owner and priority for a job, you can search for the job by the owner and priority. The procedure is as follows:

Click the Basic Info tab on the right of the canvas to expand the configuration page and configure job parameters, as listed in Table 2.

Table 2 Basic job information

Parameter

Description

Owner

The owner configured during job creation is filled in automatically. You can change this value.

Executor

This parameter is available when Scheduling Identities is set to Yes.

User that executes the job. If you specify an executor, the job runs as that user. If you leave this parameter unspecified, the job runs as the user who submitted the job for startup.

Job Agency

This parameter is available when Scheduling Identities is set to Yes.

After an agency is configured, the job interacts with other services as the agency during job execution.

Priority

The priority configured during job creation is filled in automatically. You can change this value.

Execution Timeout

Timeout of the job instance. The timeout does not take effect if this parameter is set to 0 or left unset. If the notification function is enabled for the job and the execution time of the job instance exceeds this value, the system sends the specified notification.

Exclude Waiting Time from Instance Timeout Duration

Whether to exclude the waiting time from the instance execution timeout duration.

If you select this option, the time to wait before an instance starts running is excluded from the timeout duration. You can modify this setting on the Default Configuration page.

If you do not select this option, the time to wait before an instance starts running is included in the timeout duration.

Custom Parameter

Set the name and value of the parameter.

Job Tag

Configure job tags to manage jobs by category.

Click Add to add a tag to the job. You can also select a tag configured in Managing Job Tags.

Node Status Polling Interval (s)

How often the system checks whether the node task is complete. The value ranges from 1 to 60 seconds.

Max. Node Execution Duration

Execution timeout interval for the node. If retry is configured and the execution is not complete within the timeout interval, the node will be executed again.

Retry upon Failure

You can select Retry 3 times or Never. Never is recommended.

You are advised to configure automatic retry only for file migration jobs or for database migration jobs with Import to Staging Table enabled. Retrying other jobs can cause data inconsistency because data may be written repeatedly.

NOTE:

If you schedule the CDM migration job through DataArts Factory in DataArts Studio, do not configure this parameter. Instead, set Retry upon Failure for the CDM node in DataArts Factory.

Policy for Handling Subsequent Nodes If the Current Node Fails

Policy for handling subsequent nodes if the current node fails.

  • End the current job execution plan: Execution of the current job will stop, and the job instance status will become Failed. If the job is scheduled periodically, subsequent periodic scheduling will run properly.
  • Ignore the failure and set the job execution result to success: The failure of the current node will be ignored. The job instance status will become Successful. If the job is scheduled periodically, subsequent periodic scheduling will run properly.
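
The interaction between Execution Timeout and Exclude Waiting Time from Instance Timeout Duration in Table 2 can be sketched in Python as follows. This is a minimal model under stated assumptions, not the scheduler's actual logic: the function name instance_timed_out and all of its parameters are hypothetical, and the timeout is passed as a timedelta because this section does not state the unit used by the console.

```python
from datetime import datetime, timedelta
from typing import Optional


def instance_timed_out(
    submitted_at: datetime,
    started_at: Optional[datetime],
    now: datetime,
    timeout: Optional[timedelta],
    exclude_waiting_time: bool,
) -> bool:
    """Hypothetical model of the Execution Timeout check in Table 2."""
    # A timeout that is unset (None) or 0 does not take effect.
    if timeout is None or timeout <= timedelta(0):
        return False
    if exclude_waiting_time:
        # Waiting time is excluded: count only from when the instance starts running.
        if started_at is None:
            return False  # still waiting, so nothing has been counted yet
        elapsed = now - started_at
    else:
        # Waiting time is included: count from when the instance was submitted.
        elapsed = now - submitted_at
    return elapsed > timeout
```

With a 30-minute timeout, an instance that waited 10 minutes and has been running for 25 minutes is reported as timed out only when the waiting time is included (35 minutes elapsed), not when it is excluded (25 minutes elapsed).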

Configuring Job Parameters

Job parameters are global to a job and can be referenced by any node in it. The procedure is as follows:

Click Parameter Setup on the right of the editor and set the parameters described in Table 3.

Table 3 Job parameter setup

Functions

Description

Variables

Add

Click Add and enter the variable parameter name and parameter value in the text boxes.

  • Parameter

    Only letters, numbers, hyphens (-), and underscores (_) are allowed.

  • Parameter Value
    • A string value is a character string, for example, str1.
    • A numeric value is a number or an operation expression.

After the parameter is configured, it is referenced in the format of ${parameter name} in the job.

Edit Parameter Expression

Click the icon next to the parameter value text box. In the displayed dialog box, edit the parameter expression. For more expressions, see Expression Overview.

Modifying a Job

Change the parameter name or value in the corresponding text boxes.

Mask

If the parameter value is a key, click the mask icon to mask the value for security purposes.

Delete

Click the icon next to the parameter name and value text boxes to delete the job parameter.

Constant Parameter

Add

Click Add and enter the constant parameter name and parameter value in the text boxes.

  • Parameter

    Only letters, numbers, hyphens (-), and underscores (_) are allowed.

  • Parameter Value
    • A string value is a character string, for example, str1.
    • A numeric value is a number or an operation expression.

After the parameter is configured, it is referenced in the format of ${parameter name} in the job.

Edit Parameter Expression

Click the icon next to the parameter value text box. In the displayed dialog box, edit the parameter expression. For more expressions, see Expression Overview.

Modifying a Job

Change the parameter name or value in the corresponding text boxes and save the changes.

Delete

Click the icon next to the parameter name and value text boxes to delete the job parameter.
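
To illustrate the ${parameter name} reference format described in Table 3, the following Python sketch substitutes parameter values into a piece of text. It is only an illustration of the reference syntax: the function resolve_references and the sample parameter names are hypothetical, and DataArts Factory's own resolution (including the expressions covered in Expression Overview) is not modeled here.

```python
import re
from typing import Mapping

# Matches ${name}, where name follows the naming rule in Table 3:
# only letters, numbers, hyphens, and underscores.
# Assumption: "letters" means ASCII letters.
_REFERENCE = re.compile(r"\$\{([A-Za-z0-9_-]+)\}")


def resolve_references(text: str, params: Mapping[str, str]) -> str:
    """Replace each ${name} in text with params[name].

    Raises KeyError if a referenced parameter is not defined.
    """
    def _lookup(match: re.Match) -> str:
        name = match.group(1)
        if name not in params:
            raise KeyError(f"undefined job parameter: {name}")
        return params[name]

    return _REFERENCE.sub(_lookup, text)
```

For example, resolve_references("SELECT * FROM ${src-table} WHERE dt = '${biz_date}'", {"src-table": "orders", "biz_date": "2024-10-23"}) returns "SELECT * FROM orders WHERE dt = '2024-10-23'". Raising an error for an undefined name is a design choice of this sketch, made so that a mistyped reference is caught instead of being left in the text.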