CDM Migration Principles
Migration Principles
When a tenant uses CDM, the CDM system provisions a fully managed CDM instance in the tenant's VPC. The instance can be accessed only through the console and RESTful APIs, so tenants cannot reach it through other interfaces such as SSH. This isolates data between CDM tenants, prevents data leakage, and secures transmission when data is migrated between cloud services within a VPC. Tenants can also use a VPN to migrate data from an on-premises data center to cloud services securely.
CDM works in push-pull mode: it pulls data from the migration source and pushes it to the migration destination, and all data access operations are initiated by CDM. SSL is used whenever the data source (such as RDS) supports it. Migration requires the usernames and passwords of the source and destination; this information is stored in the database of the CDM instance, and protecting it is critical to CDM security.
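Because a CDM instance is managed only through the console and RESTful APIs, routine operations (for example, listing clusters) are typically automated against the REST endpoint. The sketch below shows what such a call could look like in Python; the endpoint host, URI path, project ID, and token are illustrative assumptions rather than values from this document, so consult the API Reference for the exact URIs and authentication flow.

```python
import requests

# Illustrative sketch only: the endpoint host, URI path, project ID, and
# token below are placeholder assumptions; see the CDM API Reference for
# the real URIs and the IAM token-issuing API.
ENDPOINT = "https://cdm.example-region.example-cloud.com"   # assumed endpoint
PROJECT_ID = "0123456789abcdef"                             # placeholder project ID
TOKEN = "<IAM token obtained beforehand>"                   # placeholder token

def list_cdm_clusters():
    """Query the CDM clusters visible to this project over the RESTful API."""
    url = f"{ENDPOINT}/v1.1/{PROJECT_ID}/clusters"          # assumed path
    headers = {
        "X-Auth-Token": TOKEN,            # token-based auth header (assumed)
        "Content-Type": "application/json",
    }
    resp = requests.get(url, headers=headers, timeout=30)
    resp.raise_for_status()               # surface HTTP errors early
    return resp.json()

if __name__ == "__main__":
    print(list_cdm_clusters())
```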

Security Boundary and Risk Mitigation
As shown in Figure 2, CDM may face the following threats:
- Threats from the Internet: Malicious tenants may attack CDM through the CDM console.
- Threats from the data center: Malicious CDM administrators may obtain tenants' data source access information (usernames and passwords).
- Threats from malicious tenants: Malicious tenants steal data from other tenants.
- Data exposure to the public network: Data may be exposed when it is migrated over the public network.
CDM offers the following mechanisms to mitigate these potential security risks:
- Threats from the Internet: Tenants cannot log in to CDM instances from the public network; CDM is accessed only through the console, which is protected by a two-layer security mechanism.
- On the one hand, the cloud console framework requires user authentication when tenants access management consoles of cloud services.
- On the other hand, Web Application Firewall (WAF) filters all console requests and blocks attack code or malicious content.
- Threats from the data center: Tenants must provide the usernames and passwords of the migration source and destination to complete data migration. To prevent CDM administrators from obtaining this information and attacking tenants' important data sources, CDM provides a three-level protection mechanism.
- CDM stores passwords encrypted using AES-256 in the database of each instance to ensure tenant isolation. The database runs as user Ruby and listens only on 127.0.0.1, so tenants cannot access it remotely. (A generic sketch of this kind of credential encryption appears after this list.)
- After an instance is provisioned, CDM changes the passwords of the root and Ruby users to random values and does not store them anywhere. This prevents CDM administrators from accessing tenants' instances and the databases that hold password information.
- CDM instances work in push-pull mode. Therefore, the instances do not have any listening port enabled in the VPC, and tenants cannot access the local database or operating system from the VPC.
- Threats from malicious tenants: CDM runs instances on independent VMs, so that tenants' instances are completely isolated and secure. Malicious tenants cannot access instances of other tenants.
- Data exposure to the public network: In push-pull mode, even if elastic IP addresses (EIPs) are bound to CDM clusters, no ports are enabled on the EIPs, so attackers cannot access or attack CDM through them. However, when data is migrated over the public network, tenants' data sources are exposed to the public network and to third-party attacks. Tenants are therefore advised to configure ACLs or firewalls on the data source servers, for example, to allow access only from the EIPs bound to the CDM clusters.
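The password-protection level described above relies on AES-256 encryption of stored credentials. CDM's actual key management and storage format are internal to the service, but the following minimal Python sketch, using the third-party cryptography library's AES-256-GCM primitive, illustrates the general technique of encrypting a credential before writing it to a local database.

```python
import os
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

# Illustrative sketch only: CDM's real key management, nonce handling, and
# storage format are internal to the service and not documented here.

def encrypt_credential(plaintext: str, key: bytes) -> bytes:
    """Encrypt a credential with AES-256-GCM; the key must be 32 bytes."""
    nonce = os.urandom(12)                                   # fresh nonce per encryption
    ciphertext = AESGCM(key).encrypt(nonce, plaintext.encode("utf-8"), None)
    return nonce + ciphertext                                # store nonce with ciphertext

def decrypt_credential(blob: bytes, key: bytes) -> str:
    """Reverse of encrypt_credential."""
    nonce, ciphertext = blob[:12], blob[12:]
    return AESGCM(key).decrypt(nonce, ciphertext, None).decode("utf-8")

if __name__ == "__main__":
    key = AESGCM.generate_key(bit_length=256)                # 256-bit key -> AES-256
    blob = encrypt_credential("source-db-password", key)
    assert decrypt_credential(blob, key) == "source-db-password"
```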