Product Architecture and Function Principles

The following figure shows the product architecture and function principles of DRS.

Figure 1 DRS product architecture

Architecture Description

Minimum permission design
1. Java Database Connectivity (JDBC) is used to connect to the source and destination databases, so you do not have to deploy programs on the databases.
2. A task runs on an independent and exclusively used VM. Data is isolated between tenants.
3. Access is restricted by IP address. Only the DRS instance IP address is allowed to access the source and destination databases.
Reliability design
1. Automatic reconnection: If the connection between DRS and your database is interrupted because of a network fault or a database switchover, DRS automatically retries the connection until the task is restored.
2. Resumable transfer: When the connection between the source and the destination is abnormal, DRS automatically marks the current replay point. After the fault is rectified, you can resume data transfer from the replay point to ensure data consistency.
3. If the VM where the DRS replication instance is located fails, services are automatically switched to a new VM with the IP address unchanged to ensure that the migration task is not interrupted.
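
The reconnection and replay-point behavior described above can be illustrated with a minimal sketch. It is not the DRS implementation: the names run_with_resume, ConnectionLost, and apply_event are hypothetical, the replay point is modeled as a simple list index, and the retry count is bounded here only so that the example terminates.

```python
import time


class ConnectionLost(Exception):
    """Hypothetical error raised when the link to a database drops."""


def run_with_resume(events, apply_event, max_retries=5, base_delay=0.0):
    """Apply events in order, retrying from the last replay point on failure.

    The replay point advances only after an event has been applied
    successfully, so a retry never skips or repeats an applied event.
    """
    replay_point = 0  # index of the next event to apply
    retries = 0
    while replay_point < len(events):
        try:
            apply_event(events[replay_point])
            replay_point += 1  # mark progress only after success
            retries = 0
        except ConnectionLost:
            retries += 1
            if retries > max_retries:
                raise  # give up; the caller can resume later from replay_point
            # Exponential backoff before reconnecting (assumed policy).
            time.sleep(base_delay * 2 ** (retries - 1))
    return replay_point
```

Advancing the replay point only after a successful apply is what makes the resume safe: a failure in the middle of an event leaves the marker on that event, so it is retried rather than lost.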

Principles of Real-Time Migration

Figure 2 Real-time migration principle

Take full+incremental migration as an example. A complete migration includes four phases.
1. Phase 1: Structure migration. DRS queries the databases, tables, and primary keys to be migrated from the source and creates corresponding objects in the destination.
2. Phase 2: Full data migration. DRS queries all data from the source in parallel and inserts it into the destination, which shortens the transfer time. Before the full migration starts, DRS begins extracting and saving incremental data to ensure data integrity and consistency in the subsequent incremental migration.
3. Phase 3: Incremental data migration. After the full migration task is complete, the incremental migration task is started. The incremental data generated after the start of the full migration is continuously parsed, converted, and replayed to the destination database until data is in sync between the source and destination databases.
4. Phase 4: Trigger and event migration. To prevent triggers and events from modifying data during the migration, they are migrated only after the migration task is complete.
Principles of the underlying modules for full migration:
Sharding module: calculates the sharding logic of each table using the optimized sharding algorithm.

Extraction module: queries data from the source database in parallel mode based on the calculated shard information.

Replay module: inserts the data queried by the extraction module into the destination database in parallel and multi-task mode.
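
The interaction of the three modules can be sketched as follows. This is an illustration under stated assumptions, not the DRS algorithm: the source table is an in-memory list of rows with an integer "id" primary key, shards are fixed-size key ranges (the optimized sharding algorithm is not described in this document), and the function names are hypothetical. For simplicity, extraction runs in parallel while replay is done serially.

```python
from concurrent.futures import ThreadPoolExecutor


def compute_shards(min_id, max_id, shard_size):
    """Sharding: split the inclusive key range into contiguous shards."""
    return [(lo, min(lo + shard_size - 1, max_id))
            for lo in range(min_id, max_id + 1, shard_size)]


def extract(source, shard):
    """Extraction: read the rows whose key falls inside one shard."""
    lo, hi = shard
    return [row for row in source if lo <= row["id"] <= hi]


def migrate_full(source, destination, shard_size=3, workers=4):
    """Extract all shards in parallel and replay them into destination."""
    if not source:
        return []
    ids = [row["id"] for row in source]
    shards = compute_shards(min(ids), max(ids), shard_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in shard order.
        for rows in pool.map(lambda s: extract(source, s), shards):
            for row in rows:
                destination[row["id"]] = row  # replay: insert by key
    return shards
```

Because the shards cover disjoint key ranges, each extraction query is independent of the others, which is what allows them to run concurrently.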
Principles of the underlying modules for incremental migration:
Log reading module: reads the original incremental log data (for example, binlog for MySQL) from the source database, parses the data, converts the data into the standard log format, and stores it locally.

Log replay module: processes and filters incremental logs based on the standard format converted by the log reading module, and synchronizes the incremental data to the destination database.
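
The two incremental modules can be sketched as a small pipeline. Everything here is assumed for illustration: the raw log entries are plain tuples rather than real binlog events, the "standard format" is a dictionary whose field names are invented, and the destination is an in-memory dictionary of tables.

```python
def to_standard(raw):
    """Log reading: convert a raw (hypothetical) log tuple to a standard dict."""
    position, op, table, key, values = raw
    return {"pos": position, "op": op.upper(), "table": table,
            "key": key, "values": values}


def replay(events, destination, tables, start_pos=0):
    """Log replay: filter events by table and apply them in log order.

    Events at or before start_pos are skipped, which lets a caller
    resume from the position returned by a previous run.
    """
    last_pos = start_pos
    for ev in sorted(events, key=lambda e: e["pos"]):
        if ev["pos"] <= start_pos:
            continue  # already applied in an earlier run
        if ev["table"] in tables:
            rows = destination.setdefault(ev["table"], {})
            if ev["op"] == "DELETE":
                rows.pop(ev["key"], None)
            else:
                # INSERT and UPDATE are both treated as an upsert here.
                rows[ev["key"]] = ev["values"]
        last_pos = ev["pos"]
    return last_pos
```

Separating parsing from replay means the replay step depends only on the standard format, not on the log format of any particular source engine.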

Principles of Real-Time Synchronization

Figure 3 Real-time synchronization principle

Real-time synchronization keeps data in sync between the source and destination databases. It mainly applies to real-time synchronization from OLTP to OLAP or from OLTP to big data components. The technical principles of full+incremental synchronization are basically the same as those of real-time migration. However, the two differ slightly in the following respects:

DRS supports heterogeneous synchronization (between different DB engines). This means that DRS converts the structure definition statements of the source database to match those of the destination database. In addition, DRS can map and convert database field types.
DRS allows you to configure data processing rules, so you can use these rules to extract, parse, and replay data to meet your service requirements.
Objects such as accounts, triggers, and events cannot be synchronized.
Real-time synchronization is often used in many-to-one scenarios. DDL operations in many-to-one and one-to-many scenarios receive special handling.
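
Field type mapping for heterogeneous synchronization can be pictured with a small sketch. The mapping table below is illustrative only (a few common MySQL-to-PostgreSQL conversions) and is not the mapping that DRS applies; the function names are hypothetical.

```python
# Illustrative source-to-destination type mapping (assumed, not from DRS).
TYPE_MAP = {
    "TINYINT": "SMALLINT",
    "DATETIME": "TIMESTAMP",
    "DOUBLE": "DOUBLE PRECISION",
    "LONGTEXT": "TEXT",
}


def convert_type(source_type):
    """Map one column type; unmapped types pass through unchanged."""
    base, paren, args = source_type.strip().upper().partition("(")
    base = base.strip()
    if base in TYPE_MAP:
        return TYPE_MAP[base]  # the length or precision suffix is dropped
    return base + paren + args


def convert_table(name, columns):
    """Build a destination CREATE TABLE statement from (column, type) pairs."""
    cols = ", ".join(f"{col} {convert_type(t)}" for col, t in columns)
    return f"CREATE TABLE {name} ({cols})"
```

For example, convert_table("orders", [("id", "int"), ("created", "datetime")]) produces a statement in which the datetime column is declared as TIMESTAMP while the int column is left as is.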

Principles of Real-Time Disaster Recovery

DRS uses real-time replication technology to implement disaster recovery between two databases. The underlying technical principles are the same as those of real-time migration. The difference is that real-time DR supports both forward synchronization and backward synchronization. In addition, disaster recovery is performed at the instance level, which means that individual databases and tables cannot be selected.
