Job Configuration Management
On the Settings tab page, you can perform the following operations:
Maximum Concurrent Extractors
Maximum number of concurrent extraction tasks in a cluster
- When data migration jobs are submitted, CDM splits each job into multiple tasks based on the Concurrent Extractors parameter in the job configuration.
Jobs for different data sources may be split based on different dimensions. Some jobs may not be split based on the Concurrent Extractors parameter.
- CDM submits the tasks to the running pool in sequence. The maximum number of tasks (defined by Maximum Concurrent Extractors) run concurrently. Excess tasks are queued.
By setting an appropriate number of concurrent extractors for a job and the maximum number of concurrent extractors for the cluster, you can accelerate migration. You can configure the number of concurrent extractors as follows:
- The maximum number of concurrent extractors for a cluster varies depending on the CDM cluster flavor. You are advised to set the maximum number of concurrent extractors to twice the number of vCPUs of the CDM cluster.
Table 1 Maximum number of concurrent extractors for a CDM cluster Flavor
vCPUs/Memory
Maximum Concurrent Extractors
cdm.large
8 vCPUs, 16 GB
16
cdm.xlarge
16 vCPUs, 32 GB
32
cdm.4xlarge
32 vCPUs, 64 GB
64
Figure 1 Setting Maximum Concurrent Extractors for a CDM cluster
- Configure the number of concurrent extractors based on the following rules:
- When data is to be migrated to files, CDM does not support multiple concurrent tasks. In this case, set a single process to extract data.
- If each row of the table contains less than or equal to 1 MB data, data can be extracted concurrently. If each row contains more than 1 MB data, it is recommended that data be extracted in a single thread.
- Set Concurrent Extractors for a job based on Maximum Concurrent Extractors for the cluster. It is recommended that Concurrent Extractors is less than Maximum Concurrent Extractors.
Figure 2 Setting Concurrent Extractors for a job
Scheduled Backup/Restoration
This function depends on the OBS service.
- Prerequisites
You have created the Link to OBS.
- Scheduled backup
On the Job Management page, click Settings and configure Scheduled Backup and its related parameters.
Table 2 Scheduled backup parameters Parameter
Description
Example Value
Scheduled Backup
Whether to enable automatic backup. This function is used to back up jobs but not links.
Enable
Backup Policy
- All jobs: CDM backs up all table/file migration jobs and entire DB migration jobs regardless of the job statuses. However, historical jobs are not backed up.
- All jobs by groups: You select one or more job groups to back up.
All jobs
Backup Cycle
Select the backup cycle.
- Day: The backup is performed daily at 00:00:00.
- Week: The backup is performed at 00:00:00 every Monday.
- Month: The backup is performed at 00:00:00 on the first day of each month.
Day
OBS Link for Writing Backups
Link used to back up jobs to OBS buckets. Select a link you have created on the Links page.
obslink
OBS Bucket
OBS bucket where backup files are stored
cdm
Backup Data Directory
Directory where backup files are stored
/cdm-bk/
- Restoring jobs
If automatic backup has been performed, the backup list is displayed on the Configuration Management tab page. The OBS buckets where the backup files reside, backup paths, and backup time are displayed.
You can click Restore Backup in the Operation column of the backup list to restore the CDM jobs.
Environment Variables of Job Parameters
When creating a migration job on CDM, the parameter (such as the OBS bucket name or file path) that can be manually configured, a field in a parameter, or a character in a field can be configured as a global variable, so that you can change parameter values in batches, or batch replace certain characters after jobs are exported or imported.
The following describes how to batch replace the OBS bucket name in a migration job.
- On the Job Management page, click the Configuration Management tab and configure environment variables.
bucket_1=A bucket_2=B
Variable bucket_1 indicates bucket A, and variable bucket_2 indicates bucket B.
- On the page for creating a CDM migration job, migrate data from bucket A to bucket B.
Set the source bucket name to ${bucket_1} and destination bucket name to ${bucket_2}.
Figure 3 Setting the bucket names to environment variables
- If you want to migrate data from bucket C to bucket D, you do not need to change the job parameters. You only need to change the environment variables on the Configuration Management tab page as follows:
bucket_1=C bucket_2=D
Feedback
Was this page helpful?
Provide feedbackThank you very much for your feedback. We will continue working to improve the documentation.