Updated on 2024-03-28 GMT+08:00

Comparing Synchronization Items

Scenarios

This section describes how to compare synchronization items to check if there are any differences between source and destination databases. To minimize the impact on services and shorten the service interruption duration, the following comparison methods are provided:

  • Object-level comparison: compares objects such as databases, indexes, tables, views, stored procedures, functions, and table sorting rules.
  • Data-level comparison is classified into row comparison and value comparison.
    • Row comparison: It helps you compare the number of rows in the tables to be synchronized. This comparison method is recommended because it is fast.
    • Value comparison: It helps you check whether data in the synchronized table is consistent. The comparison process is relatively slow.
  • Account comparison: It compares usernames and permissions of the source and destination databases.
  • Periodic comparison: DRS periodically compares the number of rows in the source database table with those in the destination database table and displays the comparison results. To compare objects periodically, enable comparison policy.

When you check data consistency, compare the number of rows first. If the number of rows are inconsistent, you can then compare the data in the table to determine the inconsistent data.

Table 1 Supported comparison methods

Synchronization Direction

Data Flow

Object-level Comparison

Row Comparison

Value Comparison

Dynamic Comparison

Account-level Comparison

To the cloud

MySQL->MySQL

Supported

Supported

Supported

Supported

Not supported

To the cloud

MySQL->PostgreSQL

Supported

Supported

Not supported

Not supported

Not supported

To the cloud

MySQL -> GaussDB distributed

Supported

Supported

Not supported

Not supported

Not supported

To the cloud

MySQL -> GaussDB primary/standby

Supported

Supported

Not supported

Not supported

Not supported

To the cloud

MySQL->GaussDB(for MySQL)

Supported

Supported

Supported

Supported

Not supported

To the cloud

MySQL->GaussDB(DWS)

Supported

Supported

Not supported

Not supported

Not supported

To the cloud

MySQL->MariaDB

Supported

Supported

Not supported

Not supported

Not supported

To the cloud

PostgreSQL->PostgreSQL

Supported

Supported

Supported

Supported

Supported

To the cloud

PostgreSQL->GaussDB(DWS)

Supported

Supported

Not supported

Not supported

Not supported

To the cloud

PostgreSQL -> GaussDB primary/standby

Supported

Supported

Not supported

Not supported

Not supported

To the cloud

PostgreSQL -> GaussDB distributed

Supported

Supported

Not supported

Not supported

Not supported

To the cloud

DDM->MySQL

Supported

Supported

Not supported

Not supported

Not supported

To the cloud

DDM->GaussDB(DWS)

Not supported

Supported

Not supported

Not supported

Not supported

To the cloud

DDM->DDM

Supported

Supported

Not supported

Not supported

Not supported

To the cloud

Oracle->MySQL

Supported

Supported

Supported

Supported

Not supported

To the cloud

Oracle->GaussDB(for MySQL)

Supported

Supported

Supported

Supported

Not supported

To the cloud

Oracle -> GaussDB primary/standby

Supported

Supported

Supported

Supported

Not supported

To the cloud

Oracle -> GaussDB distributed

Supported

Supported

Supported

Supported

Not supported

To the cloud

Oracle->DDM

Supported

Supported

Not supported

Not supported

Not supported

To the cloud

Oracle->GaussDB(DWS)

Supported

Supported

Not supported

Not supported

Not supported

To the cloud

Oracle->PostgreSQL

Supported

Supported

Not supported

Not supported

Not supported

To the cloud

DB2 for LUW -> GaussDB primary/standby

Supported

Supported

Not supported

Not supported

Not supported

To the cloud

DB2 for LUW -> GaussDB distributed

Supported

Supported

Not supported

Not supported

Not supported

To the cloud

DB2 for LUW->GaussDB(DWS)

Supported

Supported

Not supported

Not supported

Not supported

To the cloud

TiDB->GaussDB(for MySQL)

Supported

Supported

Not supported

Not supported

Not supported

To the cloud

Microsoft SQL Server->GaussDB(DWS)

Supported

Supported

Not supported

Not supported

Not supported

To the cloud

Microsoft SQL Server -> GaussDB primary/standby

Supported

Supported

Supported

Not supported

Not supported

To the cloud

Microsoft SQL Server -> GaussDB distributed

Supported

Supported

Supported

Not supported

Not supported

To the cloud

Microsoft SQL Server->Microsoft SQL Server

Supported

Supported

Not supported

Not supported

Not supported

To the cloud

MongoDB->DDS

Supported

Supported

Not supported

Not supported

Not supported

To the cloud

MariaDB->MariaDB

Supported

Supported

Supported

Supported

Not supported

To the cloud

MariaDB->MySQL

Supported

Supported

Not supported

Not supported

Not supported

To the cloud

MariaDB->GaussDB(for MySQL)

Supported

Supported

Not supported

Not supported

Not supported

To the cloud

GaussDB(for MySQL)->GaussDB(for MySQL)

Supported

Supported

Supported

Supported

Not supported

From the cloud

MySQL->MySQL

Supported

Supported

Supported

Supported

Not supported

From the cloud

MySQL->Kafka

Not supported

Not supported

Not supported

Not supported

Not supported

From the cloud

MySQL->CSS/ES

Supported

Supported

Not supported

Not supported

Not supported

From the cloud

MySQL->MariaDB

Supported

Supported

Not supported

Not supported

Not supported

From the cloud

DDM->MySQL

Supported

Supported

Not supported

Not supported

Not supported

From the cloud

DDM->Oracle

Supported

Supported

Not supported

Not supported

Not supported

From the cloud

DDM->Kafka

Not supported

Not supported

Not supported

Not supported

Not supported

From the cloud

DDS->MongoDB

Supported

Supported

Supported

Not supported

Not supported

From the cloud

DDS->Kafka

Not supported

Not supported

Not supported

Not supported

Not supported

From the cloud

PostgreSQL->PostgreSQL

Supported

Supported

Supported

Supported

Not supported

From the cloud

PostgreSQL->Kafka

Not supported

Not supported

Not supported

Not supported

Not supported

From the cloud

GaussDB primary/standby -> MySQL

Supported

Supported

Not supported

Not supported

Not supported

From the cloud

GaussDB primary/standby -> Oracle

Supported

Supported

Not supported

Not supported

Not supported

From the cloud

GaussDB primary/standby -> Kafka

Not supported

Not supported

Not supported

Not supported

Not supported

From the cloud

GaussDB primary/standby -> GaussDB(DWS)

Supported

Supported

Supported

Not supported

Not supported

From the cloud

GaussDB primary/standby -> GaussDB distributed

Supported

Supported

Not supported

Not supported

Not supported

From the cloud

GaussDB primary/standby -> GaussDB primary/standby

Supported

Supported

Not supported

Not supported

Not supported

From the cloud

GaussDB distributed -> MySQL

Supported

Supported

Not supported

Not supported

Not supported

From the cloud

GaussDB distributed -> Oracle

Supported

Supported

Not supported

Not supported

Not supported

From the cloud

GaussDB distributed -> GaussDB(DWS)

Supported

Supported

Supported

Not supported

Not supported

From the cloud

GaussDB distributed -> Kafka

Not supported

Not supported

Not supported

Not supported

Not supported

From the cloud

GaussDB distributed -> GaussDB distributed

Supported

Supported

Not supported

Not supported

Not supported

From the cloud

GaussDB distributed->GaussDB primary/standby

Supported

Supported

Not supported

Not supported

Not supported

From the cloud

GaussDB(for MySQL)->MySQL

Supported

Supported

Supported

Supported

Not supported

From the cloud

GaussDB(for MySQL)->GaussDB(DWS)

Supported

Supported

Not supported

Not supported

Not supported

From the cloud

GaussDB(for MySQL)->Kafka

Not supported

Not supported

Not supported

Not supported

Not supported

From the cloud

GaussDB(for MySQL)->CSS/ES

Supported

Supported

Not supported

Not supported

Not supported

From the cloud

GaussDB(for MySQL)->Oracle

Supported

Supported

Not supported

Not supported

Not supported

From the cloud

MariaDB->MariaDB

Supported

Supported

Supported

Supported

Not supported

From the cloud

Microsoft SQL Server->Kafka

Not supported

Not supported

Not supported

Not supported

Not supported

Self-built -> Self-built

Oracle->Kafka

Not supported

Not supported

Not supported

Not supported

Not supported

Self-built -> Self-built

Oracle -> GaussDB primary/standby

Supported

Supported

Supported

Supported

Not supported

Self-built -> Self-built

Oracle -> GaussDB distributed

Supported

Supported

Supported

Supported

Not supported

Self-built -> Self-built

MySQL->Kafka

Not supported

Not supported

Not supported

Not supported

Not supported

Self-built -> Self-built

MySQL->CSS/ES

Supported

Supported

Not supported

Not supported

Not supported

Self-built -> Self-built

MySQL -> GaussDB primary/standby

Supported

Supported

Not supported

Not supported

Not supported

Self-built -> Self-built

MySQL -> GaussDB distributed

Supported

Supported

Not supported

Not supported

Not supported

Self-built -> Self-built

PostgreSQL->Kafka

Not supported

Not supported

Not supported

Not supported

Not supported

Self-built -> Self-built

GaussDB primary/standby -> MySQL

Supported

Supported

Not supported

Not supported

Not supported

Self-built -> Self-built

GaussDB primary/standby -> Oracle

Supported

Supported

Not supported

Not supported

Not supported

Self-built -> Self-built

GaussDB primary/standby -> Kafka

Not supported

Not supported

Not supported

Not supported

Not supported

Self-built -> Self-built

GaussDB primary/standby -> GaussDB primary/standby

Supported

Supported

Not supported

Not supported

Not supported

Self-built -> Self-built

GaussDB distributed -> Oracle

Supported

Supported

Not supported

Not supported

Not supported

Self-built -> Self-built

GaussDB distributed -> Kafka

Not supported

Not supported

Not supported

Not supported

Not supported

Self-built -> Self-built

GaussDB distributed -> GaussDB distributed

Supported

Supported

Not supported

Not supported

Not supported

Self-built -> Self-built

DB2 for LUW -> GaussDB primary/standby

Supported

Supported

Not supported

Not supported

Not supported

Self-built -> Self-built

DB2 for LUW -> GaussDB distributed

Supported

Supported

Not supported

Not supported

Not supported

Self-built -> Self-built

Microsoft SQL Server->Kafka

Not supported

Not supported

Not supported

Not supported

Not supported

Constraints

  • You can manually create a comparison task only when the task is in the incremental phase.
  • When a full migration task is complete, DRS automatically creates object-level and row comparison tasks. If operations are performed on data in the source database during data comparison, the comparison results may be inconsistent.
  • If DDL operations were performed on the source database, you need to compare the objects again to ensure the accuracy of the comparison results.
  • If data in the destination database is modified separately, the comparison results may be inconsistent.
  • Currently, only tables with primary keys support value comparison. For tables that do not support value comparison, you can compare rows. Therefore, you can compare data by row or value based on scenarios.
  • The DRS task cannot be suspended during value comparison. Otherwise, the comparison task may fail.
  • Some data types do not support value comparison. For details, see Which Data Types Does Not Support Content Comparison?
  • To prevent resources from being occupied for a long time, DRS limits the comparison duration. If the comparison duration exceeds the threshold, the comparison task stops automatically.
    • When a full migration task is complete, DRS automatically creates object-level and row comparison tasks. The comparison duration limits to 30 minutes. After the threshold, the comparison tasks automatically stop and the full migration task stops.
    • For a row comparison task manually created in the incremental phase, if the source database is a relational database, the row comparison duration limits to 60 minutes. If the source database is a non-relational database, for example, MongoDB, the row comparison duration limits to 30 minutes.
  • To avoid occupying resources, the comparison results of DRS tasks can be retained for a maximum of 60 days. After 60 days, the comparison results are automatically cleared.
  • In the many-to-one row comparison scenario, the number of rows in the table in the source database is compared with that in the aggregation table mapped to the destination database.
  • In the many-to-one synchronization scenario, value comparison is not recommended because data consistency cannot be ensured.
  • Value comparison is not supported for a task in which tables in one database are mapped to multiple databases.
  • If the source is a PostgreSQL database, the index and constraint names will be changed during table mapping. As a result, the index and constraint names are inconsistent.
  • If you want to compare values and the DRS task you create supports value comparison, select a large specification for your DRS instance when creating the DRS task.

Impact on Databases

  • Object comparison: System tables of the source and destination databases are queried, occupying about 10 sessions. The database is not affected. However, if there are a large number of objects (for example, hundreds of thousands of tables), the database may be overloaded.
  • Row comparison: The number of rows in the source and destination databases is queried, which occupies about 10 sessions. The SELECT COUNT statement does not affect the database. However, if a table contains a large amount of data (hundreds of millions of records), the database will be overloaded and the query results will be returned slowly.
  • Value comparison: All data in the source and destination databases is queried, and each field is compared. The query pressure on the database leads to high I/O. The query speed is limited by the I/O and network bandwidth of the source and destination databases. Value comparison occupies one or two CPUs, and about 10 sessions.
  • Account comparison: The accounts and permissions of the source and destination databases are queried, which does not affect the database.

Estimated Comparison Duration

  • Object comparison: Generally, the comparison results are returned within several minutes based on the query performance of the source database. If the amount of data is large, the comparison may take dozens of minutes.
  • Row comparison: The SELECT COUNT method is used. The query speed depends on the database performance.
  • Value comparison: If the database workload is not heavy and the network is normal, the comparison speed is about 5 MB/s.
  • Account comparison: The results are returned with the object-level comparison results. If the number of objects is small, the results are returned in several minutes.

Prerequisites

  • You have logged in to the DRS console.
  • A synchronization task has been started.

Creating a Comparison Task

  1. On the Data Synchronization Management page, click the target synchronization task name in the Task Name/ID column.
  2. Choose Synchronization Comparison.
  3. Compare synchronization items.

    • On the Object-Level Comparison tab, click Compare. Wait for a while and click to check whether the comparison results of the source and destination databases are consistent. Locate a comparison item you want to view and click View Details in the Operation column.
      Figure 1 Object-level comparison
    • On the Data-Level Comparison tab, click Create Comparison Task. In the displayed dialog box, specify Comparison Type, Comparison Method, Comparison Time, and Object. Then, click OK.
      Figure 2 Creating a data-level comparison task
      • Comparison Type: compares rows and values.
        • Row comparison: checks whether the source table has the same number of rows as the destination table.
          • After a task enters the incremental comparison phase, you can create a row comparison task.
          • For Oracle to GaussDB synchronization, the row comparison task is automatically triggered after the full synchronization is complete.
        • Value comparison: checks whether the source table has the same data as the destination table.
          • After a task enters the incremental synchronization phase, you can create a value comparison task. After the full synchronization is complete, data in the source database cannot be changed. Otherwise, the comparison result will be inconsistent.
          • After the synchronization from Oracle to GaussDB primary/standby and from Oracle to GaussDB distributed enters the incremental verification phase, the comparison service starts to extract data from the incremental logs of the tables involved in the task for continuous comparison. If you want to compare the status of other tables, you can cancel the ongoing comparison task and create a comparison task.

          Value comparison only applies to tables with single-column primary key or unique index. You can use row comparison for tables that do not support value comparison. Therefore, you can compare data by row or value based on scenarios.

      • Comparison Policy: DRS supports one-to-one and many-to-one comparison policies.
        • One-to-one: compares the number of rows in a table in the source database with that in the table mapped to the destination database.
        • Many-to-one: compares the number of rows in a table in the source database with that in the aggregate table mapped to the destination database.

          If you select Row Comparison for Comparison Type, the Comparison Policy option becomes available.

      • Comparison Method: DRS provides static and dynamic comparison methods.
        • Static: All data in the source and destination databases is compared. The comparison task ends as the comparison is completed. Static comparison can only be performed when there are no ongoing services.
        • Dynamic: All data in the source database is compared with that in the destination database. After the comparison task is complete, incremental data in the source and destination databases is compared in real time. A dynamic comparison can be performed when data is changing.
          • If you select Value for Comparison Type, the Comparison Method option becomes available.
          • During database-level synchronization, tables cannot be created in the source database during dynamic comparison. If you want to create a table in the source database, cancel the dynamic comparison first. After the new table is created and synchronized, restart the dynamic comparison.
      • Comparison Time: You can select Start upon task creation or Start at a specified time. There is a slight difference in time between the source and destination databases during synchronization. Data inconsistency may occur. You are advised to compare migration items during off-peak hours for more accurate results.
      • Object: You can select objects to be compared based on the scenarios.
    • Account comparison: Click the Account-Level Comparison tab to view the comparison results of database accounts and permissions.
      Figure 3 Account-level comparison
      • Full synchronization tasks do not support account comparisons.
      • Only PostgreSQL to PostgreSQL synchronization supports account comparison.

  4. After the comparison creation task is submitted, the Data-Level Comparison tab is displayed. Click to refresh the list and view the comparison result of the specified comparison type.

    Figure 4 Data-level comparison

    Value comparison only applies to tables with single-column primary key or unique index. You can use row comparison for tables that do not support value comparison. Therefore, you can compare data by row or value based on scenarios.

    If you want to view the row or value comparison details, click View Results.

    If you want to download the row comparison or value comparison result, locate a specified comparison type and click Export Report in the Operation column.

    Figure 5 Row comparison details
    Figure 6 Value comparison details
    • You can cancel a running task at any time and view the comparison report of a canceled comparison task.
    • You can sort the row comparison results displayed on the current page in ascending or descending order based on the number of rows in the source database table or the destination database table.
    • If a negative number is displayed in the differences column, the number of rows in the destination database table is greater than that in the source database table. If a positive number is displayed in the differences column, the number of rows in the source database table is greater than that in the destination database table.

Periodic Comparison

Periodic comparison indicates that DRS periodically compares the number of rows in the source database table with those in the destination database table and displays the comparison results.

  1. On the Data Synchronization Management page, click the target synchronization task name in the Task Name/ID column.
  2. Click the Synchronization Comparison tab.
  3. Click the Periodic Comparison tab and click Modify Comparison Policy to modify the comparison policy.

    Only MySQL-to-MySQL, MySQL-to-GaussDB(for MySQL), MySQL-to-GaussDB(DWS), and GaussDB(for MySQL)-to-MySQL synchronization tasks support periodic comparison.

  4. In the Modify Comparison Policy dialog box, enable periodic comparison, configure the comparison frequency and time, and click Yes.

    • After periodic comparison is enabled, DRS compares the number of rows at the scheduled time. You can view the comparison results on the Data-Level Comparison tab.
    • After periodic comparison is disabled, only historical comparison results can be viewed.
    • Modifications to the comparison policy settings take effect from the next comparison and do not affect the on-going periodic comparison tasks.
    • During periodic comparison, the source and destination databases will be read. Perform the comparison during off-peak hours.
    • During periodic comparison, ultra-large tables (those with more than 100 million rows) are automatically filtered out. You can use data-level comparison to spot check such large tables. It is not recommended that these large tables be compared periodically.
    Figure 7 Modify Comparison Policy