What Is CDM?
Product Overview
Cloud Data Migration (CDM) is an efficient and easy-to-use batch data integration service. Built on the big data cloud migration and intelligent data lake solution, CDM migrates data and integrates multiple data sources into the data lake. This reduces the complexity of migrating and integrating data sources and improves migration and integration efficiency.
In the DataArts Studio service, CDM serves as the DataArts Migration component, which provides the same capabilities as the independent CDM service. In later sections of this document, cloud data migration and data integration both refer to CDM.
Built on a distributed computing framework and parallel processing technology, CDM helps you migrate massive data sets stably and efficiently. You can migrate data online and quickly construct the data structure you need.
Functions
- Table/file/entire DB migration
Tables or files can be migrated in batches. An entire database can be migrated between homogeneous and heterogeneous databases. A job can migrate hundreds of tables.
- Incremental data migration
CDM supports incremental migration of files, relational databases, and HBase/CloudTable. Incremental migration can also be implemented with WHERE clauses and macro variables of date and time.
- Migration in transaction mode
When a CDM job fails, CDM rolls the data back to the state it was in before the job started and automatically deletes data from the destination table.
- Field conversion
CDM supports field conversion functions, such as anonymization, character string operations, and date operations.
- File encryption
When files are migrated to a file system, CDM can encrypt the files written to the cloud.
- MD5 verification
CDM supports MD5 verification to check file consistency from end to end and outputs the verification result.
- Dirty data archiving
CDM can archive to dirty data logs any data that fails to be processed during migration, is filtered out, or does not comply with conversion or cleaning rules. You can set a threshold for the dirty data ratio to determine whether a task is successful.
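Three of the functions above can be illustrated with a minimal Python sketch. It is not CDM code and does not use any CDM API or CDM macro syntax: the function names, the inclusive threshold comparison, and the column name in the usage note are assumptions made for illustration only. It shows the general idea behind an end-to-end MD5 check, a dirty data ratio threshold, and a date-based WHERE clause for incremental migration.

```python
import hashlib
from datetime import date, timedelta


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def files_match(source_path, destination_path):
    """End-to-end check: both copies must hash to the same MD5 value."""
    return md5_of_file(source_path) == md5_of_file(destination_path)


def job_succeeds(total_records, dirty_records, max_dirty_ratio):
    """Hypothetical rule: the job succeeds while the dirty data ratio
    does not exceed the threshold (inclusive comparison is an assumption)."""
    if total_records == 0:
        return True
    return dirty_records / total_records <= max_dirty_ratio


def previous_day_filter(column, today):
    """Build a WHERE clause that selects rows from the day before `today`.

    This mimics what a date macro variable achieves; it is plain Python
    string formatting, not the macro syntax CDM itself uses.
    """
    start = today - timedelta(days=1)
    return f"{column} >= '{start.isoformat()}' AND {column} < '{today.isoformat()}'"
```

For example, `previous_day_filter("updated_at", date(2024, 3, 1))` yields a clause covering 2024-02-29 only, so a job scheduled daily would pick up one day of new rows per run.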