Help Center/ GaussDB(DWS)/ Best Practices/ Data Migration/ Using DRS to Synchronize MySQL Table Data to a GaussDB(DWS) Cluster in Real Time
Updated on 2024-10-29 GMT+08:00

Using DRS to Synchronize MySQL Table Data to a GaussDB(DWS) Cluster in Real Time

This practice demonstrates how to use Data Replication Service (DRS) to synchronize MySQL data to GaussDB(DWS) in real time. For details about DRS, see What Is DRS?

This practice takes about 60 minutes. The process is as follows:

  1. Preparations
  2. Step 1: Prepare a MySQL Source Table
  3. Step 2: Create a GaussDB(DWS) Cluster
  4. Step 3: Create a DRS Synchronization Task
  5. Step 4: Verify Data Synchronization

Scenario Description

In big data analysis scenarios, MySQL serves as an OLTP database. After MySQL is connected to the GaussDB(DWS) data warehouse for OLAP analysis, data written by MySQL in real time needs to be synchronized to the GaussDB(DWS) data warehouse in real time. DRS is used to perform the synchronization.

Figure 1 DRS real-time synchronization

Preparations

  • You have registered a Huawei account and enabled Huawei Cloud services.. Before using GaussDB(DWS), check the account status. The account cannot be in arrears or frozen.
  • The MySQL source table to be migrated has been prepared. In this practice, a Huawei Cloud RDS MySQL database is used as the source data. If your MySQL database is offline, ensure that the network connection is normal.

Step 1: Prepare a MySQL Source Table

  1. You have purchased an RDS MySQL DB engine (this practice use MySQL 8.0.x as an example). For details, see Buy a DB Instance.
  2. The source database rds_demo with the utf8mb4 character set has been created, and there is the table rds_t1 with data in the database.

Step 2: Create a GaussDB(DWS) Cluster

  1. Creating a Cluster. To ensure network connectivity, the GaussDB(DWS) cluster and RDS must be in the same region.
  2. Log in to the GaussDB(DWS) console, choose Dedicated Clusters > Clusters, locate the row that contains the target cluster, and click Login in the Operation column.

    This practice uses version 8.1.3.x as an example. 8.1.2 and earlier versions do not support this login mode. You can use Data Studio to connect to a cluster. For details, see Using Data Studio to Connect to a Cluster.

  3. After logging in to the GaussDB(DWS) database, create the database rds_demo for synchronization.

    1
    CREATE DATABASE rds_demo WITH ENCODING 'UTF-8' DBCOMPATIBILITY 'mysql' TEMPLATE template0;
    

  4. Switch to the rds_demo database and create a schema named rds_demo.

    1
    CREATE SCHEMA rds_demo;
    

  5. Create a table named rds_t1 in the schema rds_demo.

     1
     2
     3
     4
     5
     6
     7
     8
     9
    10
    11
    12
    13
    14
    15
    CREATE TABLE rds_demo.rds_t1 (
      area_id varchar(256) NOT NULL,
      area_name varchar(256) DEFAULT NULL,
      lifecycle varchar(256) DEFAULT NULL,
      user_num int DEFAULT NULL,
      income  bigint DEFAULT NULL,
      create_time timestamp DEFAULT CURRENT_TIMESTAMP,
       PRIMARY KEY (area_id)
    )distribute by hash(area_id);
    COMMENT on column rds_demo.rds_t1.area_id is 'Region Code';
    COMMENT on column rds_demo.rds_t1.area_name is 'Region Name';
    COMMENT on column rds_demo.rds_t1.lifecycle is 'Life Cycle';
     COMMENT on column rds_demo.rds_t1.user_num is 'Subscribers in Each Life Cycle';
     COMMENT on column rds_demo.rds_t1.income is 'Region Income';
    COMMENT on column rds_demo.rds_t1.create_time is 'Creation Time';
    

  6. Query table data. Currently, the table is empty.

    1
    SELECT * FROM rds_demo.rds_t1;
    

Step 3: Create a DRS Synchronization Task

  1. Choose Service List > Databases > Data Replication Service to switch to the DRS console.

  2. Choose Data Synchronization Management on the left and click Create Synchronization Task in the upper right corner.

  3. Configure basic parameters. For details, see Table 1.

    Table 1 Basic parameters

    Parameter

    Value

    Billing Mode

    Pay-per-use

    Region

    CN-Hong Kong. Ensure that RDS and GaussDB(DWS) are in the same region.

    Project

    CN-Hong Kong

    Task Name

    DRS-DWS

    Description

    -

  4. Configure the following parameters. For details, see Table 2.

    Table 2 Synchronized instance parameters

    Parameter

    Value

    Data Flow

    To the cloud

    Source DB Engine

    MySQL

    Destination DB engine

    GaussDB(DWS)

    Network Type

    In this practice, select VPC. If the MySQL database is offline, select Public Network.

    Instance Type

    Single

    Destination DB Instance

    Select the cluster created in Step 2: Create a GaussDB(DWS) Cluster.

    Synchronization Instance Subnet

    Select the subnet where the GaussDB(DWS) cluster resides. In this practice, RDS and GaussDB(DWS) are in the same VPC and subnet.

    Synchronous Mode

    Full+Incremental

    Specifications

    In this practice, select Micro. This option is selected based on the data volume and synchronization rate.

  5. Click Next and click I have read and understand this notice.

    Wait for about 5 to 10 minutes for the synchronization to complete.

  6. After the synchronization succeeds, enter the source database information and click Test Connection.

    Table 3 Source database information

    Parameter

    Value

    Database Type

    RDS DB Instance

    DB Instance Name

    Select the created RDS DB instance.

    Database Username

    root

    Database Password

    ****

  7. Enter the destination database information and click Test Connection. The connection test is successful.

    Table 4 Destination database information

    Parameter

    Value

    Database Username

    dbadmin

    Database Password

    ****

  8. Click Next, and then click Agree.
  9. Set the synchronization policy. For details, see Table 5.

    Table 5 Synchronization policy

    Parameter

    Value

    Flow Control

    No

    Synchronization Object Type

    Data

    Incremental Conflict Policy

    Overwrite

    Data Synchronization Topology

    One-to-one

    Synchronize DDLs

    Default

    Synchronization Object

    Tables

    Select the table to be synchronized from the source database. In this practice, select rds_t1 under rds_demo.

    Enter the name of the GaussDB(DWS) database that data is synchronized to: rds_demo

  10. Click Next, confirm the information, and click Next.

    Wait until the database parameter check is successful. If the check fails, click Check Again.

  11. Click Next, select Start upon task creation, verify other information, and click Submit in the lower right corner.

  12. In the dialog box that is displayed, confirm the information, select I have read and understand this notice, and click Start Task.

    Go back to the Data Synchronization Management page and wait for about 5 to 10 minutes. The synchronization is started successfully.

    Wait for about 5 minutes and continue with Step 4: Verify Data Synchronization.

Step 4: Verify Data Synchronization

  1. Log in to GaussDB(DWS) console again, and run the following statement to query the table data again. If the result is shown as follows, the full data synchronization is successful.

    1
    SELECT * FROM rds_demo.rds_t1;
    

  2. Switch to the RDS console, log in to the RDS database, and insert new data into the table rds_t1.

    1
    INSERT INTO rds_t1 VALUES ('5','new_area_name_05',34,64,1003,'2022-11-04');
    

  3. Switch back to the GaussDB(DWS) database and run the following statement to query table data:

    A row of data is added to the query result, indicating that the data in the MySQL database has been synchronized to GaussDB(DWS) in real time.
    1
    SELECT * FROM rds_demo.rds_t1;