Cloud Data Migration
- What's New
- Function Overview
- Service Overview
- Getting Started
- User Guide
- Permissions Management
- Managing Clusters
- Managing Links
- Supported Data Sources
- Creating Links
- Managing Drivers
- Managing Agents
- Managing Cluster Configurations
- Link to a Common Relational Database
- Link to an RDS for MySQL/MySQL Database
- Link to an Oracle Database
- Link to a Database Shard
- Link to DLI
- Link to Hive
- Link to HBase
- Link to HDFS
- Link to OBS
- Link to an FTP or SFTP Server
- Link to Redis/DCS
- Link to DDS
- Link to CloudTable
- Link to MongoDB
- Link to Cassandra
- Link to Kafka
- Link to DMS Kafka
- Link to Elasticsearch/CSS
- Managing Jobs
- Auditing
- Tutorials
- Creating an MRS Hive Link
- Creating a MySQL Link
- Migrating Data from MySQL to MRS Hive
- Migrating Data from MySQL to OBS
- Migrating Data from MySQL to DWS
- Migrating an Entire MySQL Database to RDS
- Migrating Data from Oracle to CSS
- Migrating Data from Oracle to DWS
- Migrating Data from OBS to CSS
- Migrating Data from OBS to DLI
- Migrating Data from MRS HDFS to OBS
- Migrating the Entire Elasticsearch Database to CSS
- More Cases and Practices
- Advanced Data Migration Guidance
- Incremental Migration
- Using Macro Variables of Date and Time
- Migration in Transaction Mode
- Encryption and Decryption During File Migration
- MD5 Verification
- Field Conversion
- Migrating Files with Specified Names
- Regular Expressions for Separating Semi-structured Text
- Recording the Time When Data Is Written to the Database
- File Formats
- Best Practices
- Scheduling a CDM Job by Transferring Parameters Using DataArts Factory
- Incremental Migration on CDM Supported by DLF
- Creating Table Migration Jobs in Batches Using CDM Nodes
- Case: Trade Data Statistics and Analysis
- Performance White Paper
- Security White Paper
- API Reference
- Before You Start
- API Overview
- Calling APIs
- Application Example
- API
- Public Data Structures
- Link Parameter Description
- Link to a Relational Database
- Link to OBS
- Link to OSS on Alibaba Cloud
- Link to KODO/COS
- Link to HDFS
- Link to HBase
- Link to CloudTable
- Link to Hive
- Link to an FTP or SFTP Server
- Link to MongoDB
- Link to Redis/DCS (to Be Brought Offline)
- Link to NAS/SFS (to Be Brought Offline)
- Link to Kafka
- Link to Elasticsearch/Cloud Search Service
- Link to DLI
- Link to CloudTable OpenTSDB
- Link to Amazon S3
- Link to DMS Kafka
- Source Job Parameters
- Destination Job Parameters
- Job Parameter Description
- Permissions Policies and Supported Actions
- Appendix
- FAQs
- General
- What Are the Differences Between CDM and Other Data Migration Services?
- What Are the Advantages of CDM?
- What Are the Security Protection Mechanisms of CDM?
- How Do I Reduce the Cost of Using CDM?
- Why Am I Billed Pay per Use When I Have Purchased a Yearly/Monthly CDM Incremental Package?
- How Do I Check the Remaining Validity Period of a Package?
- Will My Data Be Retained If My Package Expires or My Pay-per-Use Resources Are in Arrears?
- Can CDM Be Shared by Different Tenants?
- Can I Upgrade a CDM Cluster?
- How Is the Migration Performance of CDM?
- What Is the Number of Concurrent Jobs for Different CDM Cluster Versions?
- Functions
- Does CDM Support Incremental Data Migration?
- Does CDM Support Field Conversion?
- What Component Versions Are Recommended for Migrating Hadoop Data Sources?
- What Data Formats Are Supported When the Data Source Is Hive?
- Can I Synchronize Jobs to Other Clusters?
- Can I Create Jobs in Batches?
- Can I Schedule Jobs in Batches?
- How Do I Back Up CDM Jobs?
- How Do I Configure the Connection If Only Some Nodes in the HANA Cluster Can Communicate with the CDM Cluster?
- How Do I Use Java to Invoke CDM RESTful APIs to Create Data Migration Jobs?
- How Do I Connect the On-Premises Intranet or Third-Party Private Network to CDM?
- Does CDM Support Parameters or Variables?
- How Do I Set the Number of Concurrent Extractors for a CDM Migration Job?
- Does CDM Support Real-Time Migration of Dynamic Data?
- Can I Stop CDM Clusters?
- How Do I Obtain the Current Time Using an Expression?
- Troubleshooting
- What Should I Do If the Log Prompts that the Date Format Fails to Be Parsed?
- What Can I Do If the Map Field Tab Page Cannot Display All Columns?
- How Do I Select Distribution Columns When Using CDM to Migrate Data to DWS?
- What Do I Do If the Error Message "value too long for type character varying" Is Displayed When I Migrate Data to DWS?
- What Can I Do If Error Message "Unable to execute the SQL statement" Is Displayed When I Import Data from OBS to SQL Server?
- What Should I Do If the Cluster List Is Empty, I Have No Access Permission, or My Operation Is Denied?
- Why Is Error ORA-01555 Reported During Migration from Oracle to DWS?
- What Should I Do If the MongoDB Connection Migration Fails?
- What Should I Do If a Hive Migration Job Is Suspended for a Long Period of Time?
- What Should I Do If an Error Is Reported Because the Field Type Mapping Does Not Match During Data Migration Using CDM?
- What Should I Do If a JDBC Connection Timeout Error Is Reported During MySQL Migration?
- What Should I Do If a CDM Migration Job Fails After a Link from Hive to DWS Is Created?
- How Do I Use CDM to Export MySQL Data to an SQL File and Upload the File to an OBS Bucket?
- What Should I Do If CDM Fails to Migrate Data from OBS to DLI?
- What Should I Do If Error Message "Configuration Item [linkConfig.createBackendLinks] Does Not Exist" Is Displayed During Data Link Creation or Error Message "Configuration Item [throttlingConfig.concurrentSubJobs] Does Not Exist" Is Displayed During Job Creation?
- What Should I Do If Message "CORE_0031:Connect time out. (Cdm.0523)" Is Displayed During the Creation of an MRS Hive Link?
- What Should I Do If Message "CDM Does Not Support Auto Creation of an Empty Table with No Column" Is Displayed When I Enable Auto Table Creation?
- What Should I Do If I Cannot Obtain the Schema Name When Creating an Oracle Relational Database Migration Job?
Step 3: Creating and Executing a Job
Updated on 2022-09-15 GMT+08:00
Scenario
This section describes how to create a table migration job to migrate data tables from an on-premises MySQL database to DWS.
Procedure
- On the Cluster Management page, locate the cdm-aff1 cluster created in Step 1: Creating a Cluster.
- Click Job Management in the Operation column of the CDM cluster.
- Choose Table/File Migration > Create Job, and configure the required job information.
Figure 1 Creating a job
- Job Name: Enter a unique job name, for example, mysql2dws.
- Source Job Configuration
- Source Link Name: Select the mysqllink link created in Step 2: Creating Links.
- Use SQL: Select No.
- Schema/Tablespace: Select the MySQL database from which the table is to be exported.
- Table Name: Select the table from which data is to be exported.
- Retain the default values of other optional parameters. For details, see From a Common Relational Database.
- Destination Job Configuration
- Destination Link Name: Select the dwslink link created in Step 2: Creating Links.
- Schema/Tablespace: Select the database to which data is to be imported.
- Auto Table Creation: Select Auto creation. If the table specified by Table Name does not exist, CDM automatically creates the table in the DWS database.
- Table Name: Select the table to which data is to be imported.
- Advanced Attributes > Extend Field Length: Select Yes. MySQL and DWS may store Chinese characters with different encodings, so the same string can require different field lengths. A Chinese character occupies three bytes in UTF-8 encoding. If this parameter is set to Yes, the length of each character-type field is tripled during automatic table creation, which prevents write errors caused by character fields that are too short in the DWS table (see the byte-length sketch after this procedure).
- Retain the default values for other optional parameters. For details, see To DWS.
- Click Next. The Map Field tab page is displayed. CDM automatically maps table fields at the migration source and destination. Check whether the field mapping is correct.
- If the field mapping is incorrect, click the row where the field is located and drag the field to adjust the mapping.
- When importing data to DWS, you need to manually select the distribution columns of DWS. You are advised to select the distribution columns according to the following principles:
- Use the primary key as the distribution column.
- If the table uses a composite primary key, specify all of its columns as distribution columns.
- If no primary key is available and no distribution column is selected, DWS uses the first column as the distribution column by default, which poses a risk of data skew.
- If you want to convert the content of the source fields, perform the operations in this step. For details, see Converting Fields. In this example, field conversion is not required.
Figure 2 Field mapping
- Click Next and set task parameters. Generally, retain the default values of all parameters.
In this step, you can configure the following optional functions:
- Retry Upon Failure: If the job fails to be executed, you can determine whether to automatically retry. Retain the default value Never.
- Group: Select the group to which the job belongs. The default group is DEFAULT. On the Job Management page, jobs can be displayed, started, or exported by group.
- Schedule Execution: To configure scheduled jobs, see Scheduling Job Execution. Retain the default value No.
- Concurrent Extractors: Enter the number of extractors to be concurrently executed. Retain the default value 1.
- Write Dirty Data: Specify this parameter if records that fail to be processed or are filtered out during job execution need to be written to OBS for later inspection. An OBS link must be created first. Retain the default value No so that dirty data is not recorded.
- Delete Job After Completion: Retain the default value Do not delete.
- Click Save and Run. CDM starts to execute the job immediately.
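The tripling rule behind Extend Field Length follows from UTF-8 byte widths: a CJK character encodes to three bytes, so a column whose length is counted in bytes needs roughly three times the character count. A quick illustration in Python (the sample strings are arbitrary):

```python
# Each CJK character below occupies 3 bytes in UTF-8, so a byte-sized
# VARCHAR column needs about triple the character count to hold it.
for text in ["云", "数据迁移"]:
    print(text, len(text), "chars,", len(text.encode("utf-8")), "bytes")
# 云 1 chars, 3 bytes
# 数据迁移 4 chars, 12 bytes
```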
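The same job can also be created programmatically instead of through the console. The sketch below is a minimal, hypothetical example of calling the CDM job APIs covered in the API Reference; the endpoint host, the parameter keys (fromJobConfig.*, toJobConfig.*), and all placeholder values are assumptions that must be verified against Calling APIs and the Source/Destination Job Parameters sections for your region.

```python
# Hypothetical sketch of creating and starting the mysql2dws job through
# the CDM REST API. Verify the endpoint, URI, and parameter keys against
# the API Reference before use.
import requests

endpoint = "https://cdm.example-region.myhuaweicloud.com"  # assumed host
project_id = "<project_id>"
cluster_id = "<cluster_id>"                # ID of the cdm-aff1 cluster
headers = {"X-Auth-Token": "<IAM token>"}  # obtained as described in "Calling APIs"

job = {
    "jobs": [{
        "job_type": "NORMAL_JOB",        # table/file migration job
        "name": "mysql2dws",
        "from-link-name": "mysqllink",   # links created in Step 2
        "to-link-name": "dwslink",
        "from-config-values": {"configs": [{
            "name": "fromJobConfig",
            "inputs": [
                {"name": "fromJobConfig.schemaName", "value": "<mysql_db>"},
                {"name": "fromJobConfig.tableName", "value": "<source_table>"},
            ],
        }]},
        "to-config-values": {"configs": [{
            "name": "toJobConfig",
            "inputs": [
                {"name": "toJobConfig.schemaName", "value": "<dws_schema>"},
                {"name": "toJobConfig.tableName", "value": "<dest_table>"},
                # assumed key for the "Auto Table Creation" switch
                {"name": "toJobConfig.shouldCreateTable", "value": "true"},
            ],
        }]},
    }]
}

base = f"{endpoint}/v1.1/{project_id}/clusters/{cluster_id}/cdm/job"
resp = requests.post(base, json=job, headers=headers)            # create the job
resp.raise_for_status()

resp = requests.put(f"{base}/mysql2dws/start", headers=headers)  # run it
resp.raise_for_status()
print(resp.json())
```

As in the console flow, the job starts running as soon as the start call succeeds; its status can then be checked on the Job Management page or polled through the corresponding query API.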