
Using CDM to Migrate Data from AnalyticDB for MySQL to a GaussDB(DWS) Cluster

This practice demonstrates how to use CDM to migrate data from AnalyticDB for MySQL (ADB) to GaussDB(DWS).

CDM is an efficient and easy-to-use service for batch data migration. For more information, see Cloud Data Migration.

This practice takes about 90 minutes and uses cloud services such as Virtual Private Cloud (VPC) and Subnet, Elastic IP (EIP), Cloud Data Migration (CDM), and GaussDB(DWS). The basic process is as follows:

  1. Prerequisites
  2. Step 1: Migrating Metadata
  3. Step 2: Migrating Table Data
  4. Step 3: Verifying Data Consistency
Figure 1 ADB migration

Notes and Constraints

  • If DELETE or UPDATE operations occur during CDM migration, data consistency cannot be guaranteed afterward. Re-migration will be required in such cases.
  • A single database migration job can migrate only one database at a time. To migrate multiple databases, you need to configure multiple migration jobs.
  • You need to create the destination databases and schemas in GaussDB(DWS) before synchronization.
  • In ADB, the database level corresponds to the schema level in GaussDB(DWS).
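  For example, a table referenced as sales_db.users in ADB (where sales_db is a database) keeps the same qualified name in GaussDB(DWS), but there sales_db is a schema inside the destination database. The names below are illustrative:

    -- ADB (MySQL dialect): sales_db is a database.
    SELECT COUNT(*) FROM sales_db.users;

    -- GaussDB(DWS): sales_db is a schema; run this while connected
    -- to the destination database (for example, migration).
    SELECT COUNT(*) FROM sales_db.users;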

Prerequisites

  • You have purchased GaussDB(DWS) and CDM clusters. For details, see CDM User Guide.
  • The ADB cluster and GaussDB(DWS) cluster can communicate with CDM. In this example, GaussDB(DWS) and CDM are created in the same region, VPC, and subnet.
  • You have the migration permission.
  • The source and destination clients have been installed.
  • The IP address of the CDM cluster has been whitelisted in the data security settings within the ADB source cluster.
  • The migration tools listed in Table 1 have been prepared: DSC and DataCheck.
  • The runtime environment of DataCheck meets the following requirements:
    • The server runs a 64-bit operating system; both Linux and Windows are supported.
    • JDK or JRE 1.8 has been installed on the system.
    • The server where DataCheck is installed and running can communicate with the database to be connected.
Table 1 Tools for ADB migration

  Tool        Description                               How to Obtain
  DSC         Syntax migration tool for GaussDB(DWS)    Obtain the download link.
  DataCheck   Data check tool                           Contact technical support.

Step 1: Migrating Metadata

  1. Export the source syntax. The source syntax represents the implementation logic of the customer's services. Exporting it from ADB and modifying it to be compatible with GaussDB(DWS) reduces the modeling workload and improves service migration efficiency.

    To do so, log in to the ADB console, select the target database, choose Export > Export Table Creation Statements, and save DDL_migration.sql.

    Because identifying the service scope requires familiarity with the services, it is recommended that the customer's DBA provide the source syntax.

  2. Use the DSC tool to convert the DDL syntax.

    1. Unzip the DSC tool package obtained in Prerequisites.
    2. Save the DDL syntax files to be converted into the input folder of DSC.
      Figure 2 input directory
    3. Open the command line tool: in Windows, double-click runDSC.bat; in Linux, run runDSC.sh.
    4. Convert the syntax:
      runDSC.bat -S mysql
      
      Figure 3 DDL conversion
    5. View the conversion result in the output folder.
      Figure 4 DDL conversion result
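    As a rough illustration of the conversion, a MySQL-style definition like the first statement below might be rewritten by DSC along the lines of the second. This is a hypothetical sketch; the exact output depends on the DSC version and configuration.

      -- Source (ADB/MySQL) DDL as exported:
      CREATE TABLE users (
        id BIGINT NOT NULL AUTO_INCREMENT,
        name VARCHAR(64),
        created DATETIME,
        PRIMARY KEY (id)
      ) ENGINE=InnoDB DEFAULT CHARSET=utf8;

      -- Possible GaussDB(DWS) DDL after conversion: storage clauses are
      -- dropped, AUTO_INCREMENT is replaced (for example, by a sequence),
      -- and a distribution clause is added.
      CREATE TABLE users (
        id BIGINT NOT NULL,
        name VARCHAR(64),
        created TIMESTAMP,
        PRIMARY KEY (id)
      ) DISTRIBUTE BY HASH(id);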

  3. Connect to GaussDB(DWS) and run the SQL statement below to create a destination database, which in this practice is migration.

    CREATE DATABASE database_name WITH ENCODING 'UTF8' DBCOMPATIBILITY 'mysql' TEMPLATE template0;
    

  4. The concept of a database in ADB is equivalent to the concept of a schema in GaussDB(DWS). A schema needs to be created in the destination GaussDB(DWS) database.

    First, switch to the newly created database, then execute the following SQL statement to create a schema.
    CREATE SCHEMA schema_name;
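    For example, if the source ADB database is named sales_db (an illustrative name), you might run:

      CREATE SCHEMA sales_db;
      SET search_path TO sales_db;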
    

  5. In the SQL editor window of GaussDB(DWS), select the created database and schema, and click Import to import the table creation statements converted in step 2.

    Figure 5 Importing DDL statements

  6. After the import, click the run button to execute the SQL statements and create the tables.
  7. Check whether the table is created:

    SELECT table_name FROM information_schema.tables WHERE table_schema = 'migration';
    

Step 2: Migrating Table Data

  1. Configure the source link for CDM.

    1. Log in to the CDM console and choose Cluster Management on the left.
    2. If CDM is connected to ADB through the public network, bind an EIP. For details, see Binding or Unbinding an EIP.
    3. Click Job Management next to the cluster name.
    4. Before establishing a job link for the first time, install the driver. Choose Links > Driver Management and install the MySQL driver.
    5. After the driver installation, click Create Link on the link management page, select MySQL and then click Next.
    6. Enter the ADB database information.
      Figure 6 ADB information
    7. Click Test to check connectivity, and then click Save.

  2. Configure the destination link for CDM.

    1. Similarly, choose Job Management > Links and click Create Link.
    2. Select Data Warehouse Service and click Next.
    3. Enter the GaussDB(DWS) database information.
      Figure 7 GaussDB(DWS) information
    4. Click Test to check connectivity, and then click Save.

  3. Configure and start a table-level migration job.

    1. Click the Table/File Migration tab. This tab displays single-table migration jobs.
    2. Enter the source and destination information.
      • Job Name: Enter a unique name.
      • Source Job Configuration
        • Source Link Name: Select the created MySQL source link.
        • Use SQL Statement: Select No.
        • Schema/Tablespace: Select the name of the schema or tablespace from which data is to be extracted.
        • Table Name: Select the name of the table from which data is to be extracted.
        • Retain default settings for other parameters.
      • Destination Job Configuration
        • Destination Link Name: Select the created GaussDB(DWS) destination link.
        • Schema/Tablespace: Select the GaussDB(DWS) database to which data is to be written.
        • Auto Table Creation: This parameter is displayed only when both the migration source and destination are relational databases.
        • Table Name: Select the name of the table to which data is to be written. You can also enter a table name that does not exist. CDM automatically creates the table in GaussDB(DWS).
        • Clear Data Before Import: Specify whether to clear data in the destination table before the migration task starts.
    3. Click Next to map fields.
      • If the field mapping is incorrect, you can drag the fields to adjust the mapping.
      • CDM expressions have built-in ability to convert fields of common strings, dates, and numbers. For details, see Configuring Field Converters.
    4. Confirm the information and click Next.
    5. On the task configuration page, configure the following parameters:
      • Retry Upon Failure: If the job fails to be executed, you can determine whether to automatically retry. Retain the default value Never.
      • Group: Select the group to which the job belongs. The default group is DEFAULT. On the Job Management page, jobs can be displayed, started, or exported by group.
      • Schedule Execution: To configure scheduled jobs, enable this function. Retain the default value No.
      • Concurrent Extractors: Indicates the number of extraction tasks that run concurrently. The default value is 1 and can be increased, but it is recommended to keep it at or below 4.
      • Write Dirty Data: Dirty data may be generated during data migration between tables. You are advised to select Yes.
    6. Confirm the information and click Save and Run.

      After the migration job begins, you can view the status in the job list.

  4. Configure and start a database-level migration job.

    1. Click the Entire DB Migration tab and click Create Job.
    2. Enter the source information on the left and the destination information on the right.
      • Job Name: Enter a unique name.
      • Source Job Configuration
        • Source Link Name: Select the created MySQL source link.
        • Use SQL Statement: Select No.
        • Schema/Tablespace: Select the name of the schema or tablespace from which data is to be extracted.
        • Table Name: Select the name of the table from which data is to be extracted.
        • Retain default settings for other parameters.
      • Destination Job Configuration
        • Destination Link Name: Select the created GaussDB(DWS) destination link.
        • Schema/Tablespace: Select the GaussDB(DWS) database to which data is to be written.
        • Auto Table Creation: This parameter is displayed only when both the migration source and destination are relational databases.
        • Clear Data Before Import: Specify whether to clear data in the destination table before the migration task starts.

    3. Click Next.
    4. Select all tables or the tables to migrate, click the right arrow in the middle to move them to the right pane, and then click Next.
    5. Configure job parameters.
      • Concurrent Subjobs: Indicates the number of tables migrated at the same time. The default value is 10; it is recommended to set it to less than 5.
      • Concurrent Extractors: Indicates the number of extraction tasks that run concurrently. The default value is 1 and can be increased, but it is recommended to keep it at or below 4.

      Confirm the information and click Save and Run.

    6. Wait until the migration job is complete. Click the job name to view the migration status of each table.

  5. Connect to GaussDB(DWS) and run the following SQL statements to check if the data has been migrated:

    SELECT 'migration.users',count(1) FROM migration.users UNION ALL
    SELECT 'migration.products',count(1) FROM migration.products UNION ALL
    SELECT 'migration.orders',count(1) FROM migration.orders UNION ALL
    SELECT 'migration.employees',count(1) FROM migration.employees;
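    To compare against the source, the equivalent counts can be run on ADB. In MySQL syntax the qualifier is the database name; this assumes the source ADB database is also named migration:

      SELECT 'users',count(1) FROM migration.users UNION ALL
      SELECT 'products',count(1) FROM migration.products UNION ALL
      SELECT 'orders',count(1) FROM migration.orders UNION ALL
      SELECT 'employees',count(1) FROM migration.employees;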
    

Step 3: Verifying Data Consistency

After the migration, check whether the data on the source and destination databases is consistent using DataCheck.

  1. Download and unzip DataCheck-*.zip, and then go to the DataCheck-* directory to use it. For details about how to use the files in the directory, see Table 2.
  2. Configure the tool package.

    • In Windows:

      Open the dbinfo.properties file in the conf folder and configure it based on your actual needs. (A sketch of the file appears after this step.)

      You can use the command below in the tool to generate the ciphertext of src.passwd and dws.passwd.

      encryption.bat password

      After the command is executed, an encrypted file is generated in the local bin directory.

    • In Linux:

      The method of generating the ciphertext is similar to that for Windows. The command is sh encryption.sh password. Other steps are the same.
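      The sketch below shows a minimal dbinfo.properties, assuming typical connection keys. Only src.passwd and dws.passwd are named in this practice; the other key names and values are illustrative, so check the template shipped in the conf folder for the exact keys.

        # Source (ADB/MySQL) connection -- key names illustrative
        src.ip=192.168.0.10
        src.port=3306
        src.user=check_user
        # Ciphertext generated with encryption.bat / encryption.sh
        src.passwd=<ciphertext>

        # Destination GaussDB(DWS) connection -- key names illustrative
        dws.ip=192.168.0.20
        dws.port=8000
        dws.user=dbadmin
        dws.passwd=<ciphertext>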

  3. Check data.

    In Windows:

    1. Open the check_input.xlsx file and enter the databases, source tables, and destination tables to be checked (if the databases are not entered, the values in the conf file are used). Fill in Row Range with a data query statement to specify the query scope; an example is shown after Figure 8.
      • If a source database name is configured in the conf file, it is used as the default in check_input.xlsx; a source database name specified in check_input.xlsx takes precedence.
      • The Check Strategy offers three levels: high, middle, and low. If unspecified, the default is low.

      The following figure shows the check_input.xlsx file for metadata comparison.

      Figure 8 check_input
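      For example, Row Range might hold a condition such as the following to check only a slice of the table (the column name is illustrative, and the exact form the tool expects may vary by version):

        where id >= 1 and id <= 100000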
    2. Run the datacheck.bat command in the bin directory to execute the check tool.

    3. View the generated check result file check_input_result.xlsx.

    In Linux:

    1. Edit and upload the check_input.xlsx file. Refer to step 1 for Windows.
    2. Run the sh datacheck.sh command to execute the check tool.

    3. View the check result in the check_input_result.xlsx file. (The check result analysis is the same as that for Windows.)

Related Information

Table 2 Description of the DataCheck directory

All of the following files and folders are under the DataCheck directory:

  • bin: Saves the entry scripts of the check tool (Windows version: datacheck.bat; Linux version: datacheck.sh).
  • conf: Configuration files, used to configure the connections to the source and destination databases and to set log printing.
  • lib: Stores the JAR packages required for running the check tool.
  • check_input.xlsx: Contains information about the tables to be checked, including schema, table, and column names, as well as the user's check level and check rules. Three check levels are supported: high, middle, and low. The default value is low.
  • logs: Not included in the package. This folder is generated automatically when the check tool runs and logs the tool's execution process.
  • check_input_result.xlsx: Not included in the package. This check result file is created in the same location as check_input.xlsx when the check tool runs.

Table 3 Basic functions of the data check tool

DataCheck can:

  • Check data in GaussDB(DWS), MySQL, and PostgreSQL databases.
  • Check common fields, such as numeric, time, and character types.
  • Support three check levels: high, middle, and low.
  • Check schemas, table names, and column names.
  • Specify the check scope of records. By default, all records are checked.
  • Support various check methods, including count(*), max, min, sum, avg, and sampling details check.
  • Output the check result and related check details.
Table 4 Data check levels

  • Low
    • Description: quantity check
    • Syntax: number of records: COUNT(*)
  • Middle
    • Description: quantity check and numeric type check
    • Syntax: number of records: COUNT(*); value check: MAX, MIN, SUM, and AVG
  • High
    • Description: quantity check, numeric type check, date type check, and character type check
    • Syntax: number of records: COUNT(*); value check: MAX, MIN, SUM, and AVG; date check: MAX and MIN; character check: ORDER BY ... LIMIT 1000, which reads the data and checks whether the content is the same
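As an illustration of what the levels imply, a high-level check on a table with a numeric column amount, a date column created, and a character column name might issue queries like the following on both the source and the destination (all table and column names are illustrative):

  -- Quantity check (all levels)
  SELECT COUNT(*) FROM migration.orders;

  -- Numeric value check (middle and high)
  SELECT MAX(amount), MIN(amount), SUM(amount), AVG(amount) FROM migration.orders;

  -- Date check (high)
  SELECT MAX(created), MIN(created) FROM migration.orders;

  -- Character check (high): compare ordered samples row by row
  SELECT name FROM migration.orders ORDER BY name LIMIT 1000;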