Updated on 2023-04-26 GMT+08:00

Comparing Synchronization Items

Scenarios

This section describes how to compare synchronization items to check if there are any differences between source and destination databases. To minimize the impact on services and shorten the service interruption duration, the following comparison methods are provided:

  • Object-level comparison: compares objects such as databases, indexes, tables, views, stored procedures, functions, and table sorting rules.
  • Data-level comparison is classified into row comparison and value comparison.
    • Row comparison: It helps you compare the number of rows in the tables to be synchronized. This comparison method is recommended because it is fast.
    • Value comparison: It helps you check whether data in the synchronized table is consistent. The comparison process is relatively slow.
  • Account comparison: It compares usernames and permissions of the source and destination databases.
  • Periodic comparison: DRS periodically compares the number of rows in the source database table with those in the destination database table and displays the comparison results. To compare objects periodically, enable comparison policy.

When you check data consistency, compare the number of rows first. If the number of rows are inconsistent, you can then compare the data in the table to determine the inconsistent data.

Table 1 Supported comparison methods

Synchronization Direction

Data Flow

Object-level Comparison

Row Comparison

Value Comparison

Account-level Comparison

To the cloud

MySQL->MySQL

Yes

Yes

Yes

No

To the cloud

MySQL->PostgreSQL

Yes

Yes

No

No

To the cloud

MySQL -> GaussDB distributed

Yes

Yes

No

No

To the cloud

MySQL -> GaussDB primary/standby

Yes

Yes

No

No

To the cloud

MySQL->GaussDB(for MySQL)

Yes

Yes

Yes

No

To the cloud

MySQL->GaussDB(DWS)

Yes

Yes

No

No

To the cloud

PostgreSQL->PostgreSQL

Yes

Yes

Yes

Yes

To the cloud

PostgreSQL->GaussDB(DWS)

Yes

Yes

No

No

To the cloud

PostgreSQL -> GaussDB primary/standby

Yes

Yes

Yes

No

To the cloud

PostgreSQL -> GaussDB distributed

Yes

Yes

Yes

No

To the cloud

DDM->MySQL

Yes

Yes

No

No

To the cloud

DDM->GaussDB(DWS)

No

Yes

Yes

No

To the cloud

DDM->DDM

Yes

Yes

No

No

To the cloud

Oracle->MySQL

Yes

Yes

Yes

No

To the cloud

Oracle->GaussDB(for MySQL)

Yes

Yes

Yes

No

To the cloud

Oracle -> GaussDB primary/standby

Yes

Yes

Yes

No

To the cloud

Oracle -> GaussDB distributed

Yes

Yes

Yes

No

To the cloud

Oracle->DDM

Yes

Yes

No

No

To the cloud

Oracle->GaussDB(DWS)

Yes

Yes

No

No

To the cloud

Oracle->PostgreSQL

Yes

Yes

No

No

To the cloud

DB2 for LUW -> GaussDB primary/standby

Yes

Yes

No

No

To the cloud

DB2 for LUW -> GaussDB distributed

Yes

Yes

No

No

To the cloud

DB2 for LUW->GaussDB(DWS)

Yes

Yes

No

No

To the cloud

TiDB->GaussDB(for MySQL)

Yes

Yes

No

No

To the cloud

Microsoft SQL Server->GaussDB(DWS)

Yes

Yes

No

No

To the cloud

Microsoft SQL Server -> GaussDB primary/standby

Yes

Yes

No

No

To the cloud

Microsoft SQL Server -> GaussDB distributed

Yes

Yes

No

No

To the cloud

Microsoft SQL Server->Microsoft SQL Server

Yes

Yes

No

No

To the cloud

MongoDB->DDS

Yes

Yes

No

No

From the cloud

MySQL->MySQL

Yes

Yes

Yes

No

From the cloud

MySQL->Kafka

No

No

No

No

From the cloud

MySQL->CSS/ES

Yes

Yes

No

No

From the cloud

DDM->MySQL

Yes

Yes

No

No

From the cloud

DDM->Oracle

Yes

Yes

No

No

From the cloud

DDM->Kafka

No

No

No

No

From the cloud

DDS->MongoDB

Yes

Yes

Yes

No

From the cloud

PostgreSQL->Kafka

No

No

No

No

From the cloud

GaussDB primary/standby -> MySQL

Yes

Yes

No

No

From the cloud

GaussDB primary/standby -> Oracle

Yes

Yes

No

No

From the cloud

GaussDB primary/standby -> Kafka

No

No

No

No

From the cloud

GaussDB primary/standby -> GaussDB(DWS)

Yes

Yes

Yes

No

From the cloud

GaussDB primary/standby -> GaussDB distributed

Yes

Yes

No

No

From the cloud

GaussDB primary/standby -> GaussDB primary/standby

Yes

Yes

No

No

From the cloud

GaussDB distributed -> MySQL

Yes

Yes

No

No

From the cloud

GaussDB distributed -> Oracle

Yes

Yes

No

No

From the cloud

GaussDB distributed -> GaussDB(DWS)

Yes

Yes

Yes

No

From the cloud

GaussDB distributed -> Kafka

No

No

No

No

From the cloud

GaussDB distributed -> GaussDB distributed

Yes

Yes

No

No

From the cloud

GaussDB distributed->GaussDB primary/standby

Yes

Yes

No

No

From the cloud

GaussDB(for MySQL)->MySQL

Yes

Yes

Yes

No

From the cloud

GaussDB(for MySQL)->GaussDB(DWS)

Yes

Yes

No

No

From the cloud

GaussDB(for MySQL)->Kafka

No

No

No

No

From the cloud

GaussDB(for MySQL)->CSS/ES

Yes

Yes

No

No

From the cloud

GaussDB(for MySQL)->Oracle

Yes

Yes

No

No

Self-built -> Self-built

Oracle->Kafka

No

No

No

No

Self-built -> Self-built

Oracle -> GaussDB primary/standby

Yes

Yes

Yes

No

Self-built -> Self-built

Oracle -> GaussDB distributed

Yes

Yes

Yes

No

Self-built -> Self-built

MySQL->Kafka

No

No

No

No

Self-built -> Self-built

MySQL->CSS/ES

Yes

Yes

No

No

Self-built -> Self-built

PostgreSQL->Kafka

No

No

No

No

Self-built -> Self-built

GaussDB primary/standby -> Oracle

Yes

Yes

No

No

Self-built -> Self-built

GaussDB primary/standby -> Kafka

No

No

No

No

Self-built -> Self-built

GaussDB distributed -> Oracle

Yes

Yes

No

No

Self-built -> Self-built

GaussDB distributed -> Kafka

No

No

No

No

Constraints

  • A comparison task can be created only when the task is in the incremental phase. When a full task is complete, DRS automatically creates object-level and row comparison tasks.
  • If DDL operations were performed on the source database, you need to compare the objects again to ensure the accuracy of the comparison results.
  • If data in the destination database is modified separately, the comparison results may be inconsistent.
  • Currently, only tables with primary keys support value comparison. For tables that do not support value comparison, you can compare rows. Therefore, you can compare data by row or value based on scenarios.
  • Some data types do not support value comparison. For details, see Which Data Types Does Not Support Content Comparison?
  • To prevent resources from being occupied for a long time, DRS limits the row comparison duration. If the row comparison duration exceeds the threshold, the row comparison task stops automatically. If the source database is a relational database, the row comparison duration is 60 minutes. If the source database is a non-relational database, for example, MongoDB, the row comparison duration is 30 minutes.
  • In the many-to-one row comparison scenario, the number of rows in the table in the source database is compared with that in the aggregation table mapped to the destination database.
  • If the source is a PostgreSQL database, the index and constraint names will be changed during table mapping. As a result, the index and constraint names are inconsistent.
  • If you want to compare values and the DRS task you create supports value comparison, select a large specification for your DRS instance when creating the DRS task.

Impact on Databases

  • Object comparison: System tables of the source and destination databases are queried, occupying about 10 sessions. The database is not affected. However, if there are a large number of objects (for example, hundreds of thousands of tables), the database may be overloaded.
  • Row comparison: The number of rows in the source and destination databases is queried, which occupies about 10 sessions. The SELECT COUNT statement does not affect the database. However, if a table contains a large amount of data (hundreds of millions of records), the database will be overloaded and the query results will be returned slowly.
  • Value comparison: All data in the source and destination databases is queried, and each field is compared. The query pressure on the database leads to high I/O. The query speed is limited by the I/O and network bandwidth of the source and destination databases. Value comparison occupies one or two CPUs, and about 10 sessions.
  • Account comparison: The accounts and permissions of the source and destination databases are queried, which does not affect the database.

Estimated Comparison Duration

  • Object comparison: Generally, the comparison results are returned within several minutes based on the query performance of the source database. If the amount of data is large, the comparison may take dozens of minutes.
  • Row comparison: The SELECT COUNT method is used. The query speed depends on the database performance.
  • Value comparison: If the database workload is not heavy and the network is normal, the comparison speed is about 5 MB/s.
  • Account comparison: The results are returned with the object-level comparison results. If the number of objects is small, the results are returned in several minutes.

Prerequisites

  • You have logged in to the DRS console.
  • A synchronization task has been started.

Creating a Comparison Task

  1. On the Data Synchronization Management page, click the target synchronization task name in the Task Name/ID column.
  2. Click the Synchronization Comparison tab.
  3. Compare synchronization items.

    • On the Object-Level Comparison tab, click Compare. Wait for a while and click to check whether the comparison results of the source and destination databases are consistent. Locate a comparison item you want to view and click View Details in the Operation column.
      Figure 1 Object-level comparison
    • On the Data-Level Comparison tab, click Create Comparison Task. In the displayed dialog box, specify Comparison Type, Comparison Method, Comparison Time, and Object. Then, click OK.
      Figure 2 Creating a data-level comparison task
      • Comparison Type: compares rows and values.
        • Row comparison: checks whether the source table has the same number of rows as the destination table.
          • After a task enters the incremental comparison phase, you can create a row comparison task.
          • For Oracle to GaussDB synchronization, the row comparison task is automatically triggered after the full synchronization is complete.
        • Value comparison: checks whether the source table has the same data as the destination table.
          • After a task enters the incremental synchronization phase, you can create a value comparison task. After the full synchronization is complete, data in the source database cannot be changed. Otherwise, the comparison result will be inconsistent.
          • After the synchronization from Oracle to GaussDB primary/standby and from Oracle to GaussDB distributed enters the incremental verification phase, the comparison service starts to extract data from the incremental logs of the tables involved in the task for continuous comparison. If you want to compare the status of other tables, you can cancel the ongoing comparison task and create a comparison task.

          Value comparison only applies to tables with single-column primary key or unique index. You can use row comparison for tables that do not support value comparison. Therefore, you can compare data by row or value based on scenarios.

      • Comparison Policy: DRS supports one-to-one and many-to-one comparison policies.
        • One-to-one: compares the number of rows in a table in the source database with that in the table mapped to the destination database.
        • Many-to-one: compares the number of rows in a table in the source database with that in the aggregate table mapped to the destination database.

          If you select Row Comparison for Comparison Type, the Comparison Policy option becomes available.

      • Comparison Method: DRS provides static and dynamic comparison methods.
        • Static: All data in the source and destination databases is compared. The comparison task ends as the comparison is completed. Static comparison can only be performed when there are no ongoing services.
        • Dynamic: All data in the source database is compared with that in the destination database. After the comparison task is complete, incremental data in the source and destination databases is compared in real time. A dynamic comparison can be performed when data is changing.
          • If you select Value for Comparison Type, the Comparison Method option becomes available.
          • During database-level synchronization, tables cannot be created in the source database during dynamic comparison. If you want to create a table in the source database, cancel the dynamic comparison first. After the new table is created and synchronized, restart the dynamic comparison.
      • Comparison Time: You can select Start upon task creation or Start at a specified time. There is a slight difference in time between the source and destination databases during synchronization. Data inconsistency may occur. You are advised to compare migration items during off-peak hours for more accurate results.
      • Object: You can select objects to be compared based on the scenarios.
    • Account comparison: Click the Account-Level Comparison tab to view the comparison results of database accounts and permissions.
      Figure 3 Account-level comparison
      • Full synchronization tasks do not support account comparisons.
      • Only PostgreSQL to PostgreSQL synchronization supports account comparison.

  4. After the comparison creation task is submitted, the Data-Level Comparison tab is displayed. Click to refresh the list and view the comparison result of the specified comparison type.

    Figure 4 Data-level comparison

    Value comparison only applies to tables with single-column primary key or unique index. You can use row comparison for tables that do not support value comparison. Therefore, you can compare data by row or value based on scenarios.

    If you want to view the row or value comparison details, click View Results.

    If you want to download the row comparison or value comparison result, locate a specified comparison type and click Export Report in the Operation column.

    Figure 5 Viewing data-level comparison details

    You can also view comparison details of canceled comparison tasks.

Periodic Comparison

Periodic comparison indicates that DRS periodically compares the number of rows in the source database table with those in the destination database table and displays the comparison results.

  1. On the Data Synchronization Management page, click the target synchronization task name in the Task Name/ID column.
  2. Click the Synchronization Comparison tab.
  3. Click the Periodic Comparison tab and click Modify Comparison Policy to modify the comparison policy.

    • Currently, only MySQL to GaussDB(DWS) synchronization supports periodic comparison.
    • Many-to-one tasks do not support periodic comparison.

  4. In the Modify Comparison Policy dialog box, enable periodic comparison, configure the comparison frequency and time, and click Yes.

    • After periodic comparison is enabled, DRS compares the number of rows at the scheduled time. You can view the comparison results on the Data-Level Comparison tab.
    • After periodic comparison is disabled, only historical comparison results can be viewed.
    • Modifications to the comparison policy settings take effect from the next comparison and do not affect the on-going periodic comparison tasks.
    • During periodic comparison, the source and destination databases will be read. Perform the comparison during off-peak hours.
    • During periodic comparison, ultra-large tables (those with more than 100 million rows) are automatically filtered out. You can use data-level comparison to spot check such large tables. It is not recommended that these large tables be compared periodically.
    Figure 6 Modify Comparison Policy