Cloud Data Migration
- What's New
- Function Overview
- Service Overview
- Getting Started
- User Guide
- Permissions Management
- Managing Clusters
- Managing Links
- Supported Data Sources
- Creating Links
- Managing Drivers
- Managing Agents
- Managing Cluster Configurations
- Link to a Common Relational Database
- Link to an RDS for MySQL/MySQL Database
- Link to an Oracle Database
- Link to a Database Shard
- Link to DLI
- Link to Hive
- Link to HBase
- Link to HDFS
- Link to OBS
- Link to an FTP or SFTP Server
- Link to Redis/DCS
- Link to DDS
- Link to CloudTable
- Link to MongoDB
- Link to Cassandra
- Link to Kafka
- Link to DMS Kafka
- Link to Elasticsearch/CSS
- Managing Jobs
- Auditing
- Tutorials
- Creating an MRS Hive Link
- Creating a MySQL Link
- Migrating Data from MySQL to MRS Hive
- Migrating Data from MySQL to OBS
- Migrating Data from MySQL to DWS
- Migrating an Entire MySQL Database to RDS
- Migrating Data from Oracle to CSS
- Migrating Data from Oracle to DWS
- Migrating Data from OBS to CSS
- Migrating Data from OBS to DLI
- Migrating Data from MRS HDFS to OBS
- Migrating the Entire Elasticsearch Database to CSS
- More Cases and Practices
- Advanced Data Migration Guidance
- Incremental Migration
- Using Macro Variables of Date and Time
- Migration in Transaction Mode
- Encryption and Decryption During File Migration
- MD5 Verification
- Field Conversion
- Migrating Files with Specified Names
- Regular Expressions for Separating Semi-structured Text
- Recording the Time When Data Is Written to the Database
- File Formats
- Best Practices
- Scheduling a CDM Job by Transferring Parameters Using DataArts Factory
- Incremental Migration on CDM Supported by DLF
- Creating Table Migration Jobs in Batches Using CDM Nodes
- Case: Trade Data Statistics and Analysis
- Performance White Paper
- Security White Paper
- API Reference
- Before You Start
- API Overview
- Calling APIs
- Application Example
- API
- Public Data Structures
- Link Parameter Description
- Link to a Relational Database
- Link to OBS
- Link to OSS on Alibaba Cloud
- Link to KODO/COS
- Link to HDFS
- Link to HBase
- Link to CloudTable
- Link to Hive
- Link to an FTP or SFTP Server
- Link to MongoDB
- Link to Redis/DCS (to Be Brought Offline)
- Link to NAS/SFS (to Be Brought Offline)
- Link to Kafka
- Link to Elasticsearch/Cloud Search Service
- Link to DLI
- Link to CloudTable OpenTSDB
- Link to Amazon S3
- Link to DMS Kafka
- Source Job Parameters
- Destination Job Parameters
- Job Parameter Description
- Permissions Policies and Supported Actions
- Appendix
- FAQs
- General
- What Are the Differences Between CDM and Other Data Migration Services?
- What Are the Advantages of CDM?
- What Are the Security Protection Mechanisms of CDM?
- How Do I Reduce the Cost of Using CDM?
- Why Am I Billed Pay per Use When I Have Purchased a Yearly/Monthly CDM Incremental Package?
- How Do I Check the Remaining Validity Period of a Package?
- Will My Data Be Retained If My Package Expires or My Pay-per-Use Resources Are in Arrears?
- Can CDM Be Shared by Different Tenants?
- Can I Upgrade a CDM Cluster?
- How Is the Migration Performance of CDM?
- What Is the Number of Concurrent Jobs for Different CDM Cluster Versions?
- Functions
- Does CDM Support Incremental Data Migration?
- Does CDM Support Field Conversion?
- What Component Versions Are Recommended for Migrating Hadoop Data Sources?
- What Data Formats Are Supported When the Data Source Is Hive?
- Can I Synchronize Jobs to Other Clusters?
- Can I Create Jobs in Batches?
- Can I Schedule Jobs in Batches?
- How Do I Back Up CDM Jobs?
- How Do I Configure the Connection If Only Some Nodes in the HANA Cluster Can Communicate with the CDM Cluster?
- How Do I Use Java to Invoke CDM RESTful APIs to Create Data Migration Jobs?
- How Do I Connect the On-Premises Intranet or Third-Party Private Network to CDM?
- Does CDM Support Parameters or Variables?
- How Do I Set the Number of Concurrent Extractors for a CDM Migration Job?
- Does CDM Support Real-Time Migration of Dynamic Data?
- Can I Stop CDM Clusters?
- How Do I Obtain the Current Time Using an Expression?
- Troubleshooting
- What Should I Do If the Log Prompts that the Date Format Fails to Be Parsed?
- What Can I Do If the Map Field Tab Page Cannot Display All Columns?
- How Do I Select Distribution Columns When Using CDM to Migrate Data to DWS?
- What Do I Do If the Error Message "value too long for type character varying" Is Displayed When I Migrate Data to DWS?
- What Can I Do If Error Message "Unable to execute the SQL statement" Is Displayed When I Import Data from OBS to SQL Server?
- What Should I Do If the Cluster List Is Empty, I Have No Access Permission, or My Operation Is Denied?
- Why Is Error ORA-01555 Reported During Migration from Oracle to DWS?
- What Should I Do If the MongoDB Connection Migration Fails?
- What Should I Do If a Hive Migration Job Is Suspended for a Long Period of Time?
- What Should I Do If an Error Is Reported Because the Field Type Mapping Does Not Match During Data Migration Using CDM?
- What Should I Do If a JDBC Connection Timeout Error Is Reported During MySQL Migration?
- What Should I Do If a CDM Migration Job Fails After a Link from Hive to DWS Is Created?
- How Do I Use CDM to Export MySQL Data to an SQL File and Upload the File to an OBS Bucket?
- What Should I Do If CDM Fails to Migrate Data from OBS to DLI?
- What Should I Do If Error Message "Configuration Item [linkConfig.createBackendLinks] Does Not Exist" Is Displayed During Data Link Creation or Error Message "Configuration Item [throttlingConfig.concurrentSubJobs] Does Not Exist" Is Displayed During Job Creation?
- What Should I Do If Message "CORE_0031:Connect time out. (Cdm.0523)" Is Displayed During the Creation of an MRS Hive Link?
- What Should I Do If Message "CDM Does Not Support Auto Creation of an Empty Table with No Column" Is Displayed When I Enable Auto Table Creation?
- What Should I Do If I Cannot Obtain the Schema Name When Creating an Oracle Relational Database Migration Job?
Step 3: Creating and Executing a Job
Updated on 2022-09-15 GMT+08:00
Scenario
This section describes how to create a table migration job to migrate data tables from an on-premises MySQL database to DWS.
Procedure
- On the Cluster Management page, locate the cdm-aff1 cluster created in Step 1: Creating a Cluster.
- Click Job Management in the Operation column of the CDM cluster.
- Choose Table/File Migration > Create Job, and configure the required job information.
Figure 1 Creating a job
- Job Name: Enter a unique job name, for example, mysql2dws.
- Source Job Configuration
- Source Link Name: Select the mysqllink link created in Step 2: Creating Links.
- Use SQL: Select No.
- Schema/Tablespace: Select the MySQL database from which the table is to be exported.
- Table Name: Select the table from which data is to be exported.
- Retain the default values of other optional parameters. For details, see From a Common Relational Database.
- Destination Job Configuration
- Destination Link Name: Select the dwslink link created in Step 2: Creating Links.
- Schema/Tablespace: Select the database to which data is to be imported.
- Auto Table Creation: Select Auto creation. If the table specified by Table Name does not exist, CDM automatically creates the table in the DWS database.
- Table Name: Select the table to which data is to be imported.
- Advanced Attributes > Extend Field Length: Select Yes. MySQL and DWS may store Chinese characters with different encodings, so the same string can require different field lengths. A Chinese character occupies three bytes in UTF-8 encoding. If this parameter is set to Yes, the length of each character-type field is tripled during automatic table creation, which prevents write errors caused by character fields that are too short in the DWS table (see the byte-length sketch after this procedure).
- Retain the default values for other optional parameters. For details, see To DWS.
- Click Next. The Map Field tab page is displayed. CDM automatically maps table fields at the migration source and destination. Check whether the field mapping is correct.
- If the field mapping is incorrect, click the row where the field is located and drag the field to adjust the mapping.
- When importing data to DWS, you need to manually select the distribution columns of DWS. You are advised to select the distribution columns according to the following principles:
- Use the primary key as the distribution column.
- If the table uses a composite primary key, specify all of its columns as distribution columns.
- If no primary key is available and no distribution column is selected, DWS uses the first column as the distribution column by default, which poses a risk of data skew.
- If you want to convert the content of the source fields, perform the operations in this step. For details, see Converting Fields. In this example, field conversion is not required.
Figure 2 Field mapping
- Click Next and set task parameters. Generally, retain the default values of all parameters.
In this step, you can configure the following optional functions:
- Retry Upon Failure: If the job fails to be executed, you can determine whether to automatically retry. Retain the default value Never.
- Group: Select the group to which the job belongs. The default group is DEFAULT. On the Job Management page, jobs can be displayed, started, or exported by group.
- Schedule Execution: To configure scheduled jobs, see Scheduling Job Execution. Retain the default value No.
- Concurrent Extractors: Enter the number of extractors to be concurrently executed. Retain the default value 1.
- Write Dirty Data: Specify this parameter if records that fail to be processed or are filtered out during job execution need to be written to OBS for later inspection. An OBS link must be created first. Retain the default value No so that dirty data is not recorded.
- Delete Job After Completion: Retain the default value Do not delete.
- Click Save and Run. CDM starts to execute the job immediately.
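The tripling rule behind Extend Field Length follows from UTF-8 byte widths: a CJK character encodes to three bytes, so a column whose length is counted in bytes needs roughly three times the character count. A quick illustration in Python (the sample strings are arbitrary):

```python
# Each CJK character below occupies 3 bytes in UTF-8, so a byte-sized
# VARCHAR column needs about triple the character count to hold it.
for text in ["云", "数据迁移"]:
    print(text, len(text), "chars,", len(text.encode("utf-8")), "bytes")
# 云 1 chars, 3 bytes
# 数据迁移 4 chars, 12 bytes
```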
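The same job can also be created programmatically instead of through the console. The sketch below is a minimal, hypothetical example of calling the CDM job APIs covered in the API Reference; the endpoint host, the parameter keys (fromJobConfig.*, toJobConfig.*), and all placeholder values are assumptions that must be verified against Calling APIs and the Source/Destination Job Parameters sections for your region.

```python
# Hypothetical sketch of creating and starting the mysql2dws job through
# the CDM REST API. Verify the endpoint, URI, and parameter keys against
# the API Reference before use.
import requests

endpoint = "https://cdm.example-region.myhuaweicloud.com"  # assumed host
project_id = "<project_id>"
cluster_id = "<cluster_id>"                # ID of the cdm-aff1 cluster
headers = {"X-Auth-Token": "<IAM token>"}  # obtained as described in "Calling APIs"

job = {
    "jobs": [{
        "job_type": "NORMAL_JOB",        # table/file migration job
        "name": "mysql2dws",
        "from-link-name": "mysqllink",   # links created in Step 2
        "to-link-name": "dwslink",
        "from-config-values": {"configs": [{
            "name": "fromJobConfig",
            "inputs": [
                {"name": "fromJobConfig.schemaName", "value": "<mysql_db>"},
                {"name": "fromJobConfig.tableName", "value": "<source_table>"},
            ],
        }]},
        "to-config-values": {"configs": [{
            "name": "toJobConfig",
            "inputs": [
                {"name": "toJobConfig.schemaName", "value": "<dws_schema>"},
                {"name": "toJobConfig.tableName", "value": "<dest_table>"},
                # assumed key for the "Auto Table Creation" switch
                {"name": "toJobConfig.shouldCreateTable", "value": "true"},
            ],
        }]},
    }]
}

base = f"{endpoint}/v1.1/{project_id}/clusters/{cluster_id}/cdm/job"
resp = requests.post(base, json=job, headers=headers)            # create the job
resp.raise_for_status()

resp = requests.put(f"{base}/mysql2dws/start", headers=headers)  # run it
resp.raise_for_status()
print(resp.json())
```

As in the console flow, the job starts running as soon as the start call succeeds; its status can then be checked on the Job Management page or polled through the corresponding query API.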