Migrating Doris Data to MRS with CCR

The Syncer service captures binlogs from the source cluster and synchronizes them to the destination cluster in real time. This enables both full and incremental migration of Doris: historical metadata and data are copied first, and incremental metadata and data are then synchronized continuously.

Notes and Constraints

- Doris version constraints for migration based on Cross Cluster Replication (CCR):
  - The minimum supported versions are 2.0.15 for Doris 2.0 and 2.1.6 for Doris 2.1.
- Metadata and data migration consumes resources in the source cluster. Therefore, migrate the data during off-peak hours.
- Component version constraints:
  - Syncer version ≥ Destination Doris version ≥ Source Doris version
  - Starting from Syncer versions 2.1.8 and 3.0.4, Syncer no longer supports Doris 2.0.
- Data migration is supported only from earlier Doris versions to later ones.
- Metadata is automatically generated by CCR, so pre-creating it is not supported and may lead to migration failures.
- For details about parameter configurations, see Configuration Instructions on the Apache Doris official website.
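
The version ordering above can be checked before any packages are deployed. The following is a minimal sketch, assuming GNU `sort -V` is available on the host; the function names `version_le` and `check_ccr_versions` are hypothetical helpers, not part of Doris or Syncer.

```shell
#!/usr/bin/env bash
# Returns 0 if version $1 is lower than or equal to version $2.
# Relies on GNU sort -V (version sort); this is an assumption about the host.
version_le() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# Checks the chain: source Doris <= destination Doris <= Syncer.
check_ccr_versions() {
    local src="$1" dest="$2" syncer="$3"
    if version_le "$src" "$dest" && version_le "$dest" "$syncer"; then
        echo "OK"
    else
        echo "UNSUPPORTED"
        return 1
    fi
}

# Example values only; substitute the versions of your own clusters.
check_ccr_versions 2.1.6 2.1.7 2.1.8
```

The sketch only verifies the ordering rule. The minimum versions and the Doris 2.0 restriction listed above still need to be checked separately.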

Solution Architecture

Figure 1 Doris data migration

Full metadata and data migration: The CCR task performs full data synchronization, copying all source data to the destination in a single operation.
Incremental metadata and data migration: After completing full synchronization, the CCR task continues with incremental synchronization to maintain data consistency between the source and destination.

The migration solution supports various networking types, such as the public network, VPN, and Direct Connect. Select a networking type based on the site requirements. The migration can be performed only when the source and destination networks can communicate with each other.

**Table 1** Networking types
Migration Network Type	Advantages	Disadvantages
Direct Connect	Stable performance with millisecond-level latency; maximum bandwidth up to dozens of Gbit/s; highly secure data transmission	High costs (generally, the yearly/monthly billing mode is used); the source and destination private IP addresses cannot overlap; long service provisioning duration (generally, you need to apply for it one month in advance)
VPN	Flexible networking and setup at any time; good stability and security; moderate costs with only public network fee and VPN fee	High network latency; the source and destination private IP addresses cannot overlap
Public IP address	Support for migration when the source and destination private IP addresses are the same; bandwidth size from Mbit/s to Gbit/s; instant availability after purchase and quick binding; low costs	Poor stability; the bandwidth may not be fully used, and the migration speed is relatively slow; data transmitted over public networks is vulnerable to leakage

Full and incremental metadata and data migration

Enable binlog configuration on both the source Doris cluster and the destination cluster.
1. Add the following content to the fe.conf and be.conf files in the source Doris cluster and save the files:
```
enable_feature_binlog=true
```
2. Log in to MRS Manager of the destination cluster, choose Cluster > Services > Doris, click Configurations, and then click All Configurations. In the navigation pane on the left, choose FE(Role) > Customization.
  For details about how to log in to MRS Manager, see Accessing MRS Manager.
3. Add the custom parameter enable_feature_binlog to the custom parameter file fe.conf and set the parameter value to true.
4. In the navigation pane, choose BE(Role) > Customization. Add the custom parameter enable_feature_binlog to the custom parameter file be.conf and set the parameter value to true.
5. Click Save to save the settings. Choose Dashboard, and click More > Service Rolling Restart in the upper right corner. Enter the password of the user and click OK to perform a rolling restart for the Doris service to make the settings take effect.
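
On the source cluster, where the configuration files are edited by hand, the setting can be double-checked with a small script. This is a sketch under stated assumptions: `DORIS_HOME` and the `fe/conf` and `be/conf` paths are hypothetical and depend on how the source cluster was installed, and `binlog_enabled_in` is an illustrative helper.

```shell
#!/usr/bin/env bash
# Returns 0 if the given file contains an uncommented enable_feature_binlog=true line.
binlog_enabled_in() {
    grep -Eq '^[[:space:]]*enable_feature_binlog[[:space:]]*=[[:space:]]*true[[:space:]]*$' "$1"
}

# DORIS_HOME is a hypothetical installation path; adjust it to your layout.
DORIS_HOME="${DORIS_HOME:-/opt/doris}"
for conf in "$DORIS_HOME/fe/conf/fe.conf" "$DORIS_HOME/be/conf/be.conf"; do
    if [ -f "$conf" ] && binlog_enabled_in "$conf"; then
        echo "$conf: binlog enabled"
    else
        echo "$conf: enable_feature_binlog=true not found"
    fi
done
```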
Create a Linux ECS. The security group, VPC, and subnet of the ECS must be the same as those of the destination MRS cluster. For details, see Purchasing an ECS in Custom Config Mode.

Configure network connectivity between the ECS and the source Doris cluster according to your environment. Ensure that the glibc version on the ECS is 2.28 or later. You can verify it by running the following command:
```
ldd --version
```
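
The comparison against 2.28 can also be scripted. The sketch below assumes that the first line of the `ldd --version` output ends with the glibc version (as it does on common glibc-based distributions) and that GNU `sort -V` is available; `glibc_version` and `version_ge` are hypothetical helper names.

```shell
#!/usr/bin/env bash
# Extracts the trailing version number from the first line of `ldd --version`,
# for example "ldd (GNU libc) 2.28" -> "2.28". Prints nothing if it cannot be parsed.
glibc_version() {
    ldd --version 2>/dev/null | head -n 1 | grep -Eo '[0-9]+\.[0-9]+$'
}

# Returns 0 if version $1 is at least version $2 (GNU sort -V ordering).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

current="$(glibc_version || true)"
if [ -n "$current" ] && version_ge "$current" 2.28; then
    echo "glibc $current meets the 2.28 requirement"
else
    echo "glibc ${current:-unknown} does not meet the 2.28 requirement"
fi
```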
Download the Syncer package, for example, ccr-syncer-*-rc02-x64.tar.xz, from Cross Cluster Replication on the Doris official website, and upload it to a directory on the ECS that you created earlier.
Log in to the ECS containing the uploaded Syncer package as user root and decompress the Syncer package.
1. Navigate to the directory where the package is stored:
```
cd /path/to/package_directory   # Replace with the directory where you uploaded the package.
```
2. Decompress the Syncer package.
```
tar -xvf ccr-syncer-*-rc02-x64.tar.xz
```
3. Navigate to the bin directory.
```
cd ccr-syncer-*-rc02-x64/bin
```
4. Enable binlogs for all tables in the specified database to be migrated.
```
bash enable_db_binlog.sh -h host -p port -u user -P password -d db
```
  In the preceding command:
  - host indicates the IP address of the server that connects to the source Doris FE node.
  - port indicates the query port for connecting to the source Doris FE node. Port 9030 is used by default.
  - db indicates the name of the specified source Doris database.
  - user and password indicate the username and password for connecting to the source Doris.
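
If several databases are to be migrated, the script can be run once per database. The following dry-run sketch only prints the commands so that they can be reviewed first. It uses the flags shown above, while the host address, the database names, and the helper `binlog_cmd` are placeholders introduced for illustration.

```shell
#!/usr/bin/env bash
# Builds the enable_db_binlog.sh command line for one database.
# The flags are the ones shown above; all values are placeholders.
binlog_cmd() {
    local host="$1" port="$2" user="$3" password="$4" db="$5"
    printf 'bash enable_db_binlog.sh -h %s -p %s -u %s -P %s -d %s\n' \
        "$host" "$port" "$user" "$password" "$db"
}

# Dry run: print one command per database instead of executing it.
for db in sales_db orders_db; do
    binlog_cmd 192.0.2.10 9030 root '***' "$db"
done
```

After reviewing the printed commands, run them from the bin directory with the real password.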
Start Syncer.
```
bash start_syncer.sh --daemon
```

Create the CCR migration task by sending the following request to Syncer:
```
curl -X POST -H "Content-Type: application/json" -d '{
    "name": "ccr_test",
    "src": {
        "host": "Source FE IP address",
        "port": "9030",
        "thrift_port": "9020",
        "user": "root",
        "password": "",
        "database": "db_name",
        "table": "table_name"
    },
    "dest": {
        "host": "Destination FE IP address",
        "port": "9030",
        "thrift_port": "9020",
        "user": "root",
        "password": "",
        "database": "db_name",
        "table": "table_name"
    }
}' http://127.0.0.1:9190/create_ccr
```

For more information about the parameters, see Table 2.

You do not need to manually create a database and table before the migration.
After a CCR task begins, it continuously synchronizes incremental metadata and data. This process persists until the task is explicitly stopped.
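
Stopping a task is also done through the Syncer HTTP interface. The sketch below prints, rather than executes, the requests for pausing, resuming, or deleting a task. The `/pause`, `/resume`, and `/delete` endpoints are an assumption taken from the upstream ccr-syncer documentation and are not described on this page, so verify them against the Syncer version you deployed. `SYNCER_URL` and `syncer_job_cmd` are hypothetical names.

```shell
#!/usr/bin/env bash
# Address of the Syncer HTTP service; the default matches the create_ccr example above.
SYNCER_URL="${SYNCER_URL:-http://127.0.0.1:9190}"

# Prints the curl command for a task-level operation.
# $1: action (assumed endpoints: pause, resume, delete), $2: task name.
syncer_job_cmd() {
    local action="$1" job="$2"
    printf "curl -X POST -H 'Content-Type: application/json' -d '{\"name\": \"%s\"}' %s/%s\n" \
        "$job" "$SYNCER_URL" "$action"
}

# Dry run for the task created above.
syncer_job_cmd pause ccr_test
syncer_job_cmd delete ccr_test
```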

**Table 2** CCR migration task parameters
Parameter	Description	Example Value
name	Indicates the name of the CCR synchronization task, which is user-defined and must be unique.	ccr_test
src	Indicates that the following information is the source information.	-
dest	Indicates that the following information is the destination information.	-
host	Indicates the IP address of the Master FE.	10.10.xxx.xxx
port	Indicates the query port for connecting to the Doris FE node.	9030
thrift_port	Indicates the RPC port for connecting to the Doris FE node.	9020
user	Indicates the username for accessing Doris.	root
password	Indicates the password for accessing Doris.	********
database	Indicates the database name.	db_name
table	Indicates the name of the table to be migrated. If it is a database-level synchronization, leave the table name empty.	table_01
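
For a database-level synchronization, the request body is the same as in the table-level example except that table is left empty. The following sketch generates such a body from a few variables. The helper `ccr_payload`, the task name, and the 192.0.2.x addresses are illustrative assumptions; the ports, user, and empty password are copied from the example request.

```shell
#!/usr/bin/env bash
# Prints a create_ccr request body.
# $1: task name, $2: source FE host, $3: destination FE host,
# $4: database, $5: table (omit or pass "" for database-level synchronization).
ccr_payload() {
    local name="$1" src_host="$2" dest_host="$3" db="$4" table="${5:-}"
    cat <<EOF
{
  "name": "$name",
  "src": {
    "host": "$src_host", "port": "9030", "thrift_port": "9020",
    "user": "root", "password": "", "database": "$db", "table": "$table"
  },
  "dest": {
    "host": "$dest_host", "port": "9030", "thrift_port": "9020",
    "user": "root", "password": "", "database": "$db", "table": "$table"
  }
}
EOF
}

# Database-level task: the table argument is omitted, so "table" is empty.
ccr_payload ccr_db_test 192.0.2.10 192.0.2.20 db_name
```

The output can be saved to a file and submitted with `curl -X POST -H "Content-Type: application/json" -d @payload.json http://127.0.0.1:9190/create_ccr`, where payload.json is a file name chosen for this example.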

Use the MySQL client to connect to the destination Doris cluster and run the following command to check the migration task status:
```
show restore\G
```
As shown in the following figure, if State is FINISHED, the migration task is successfully executed.
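
The State check can be scripted when the MySQL client output is piped through a small filter. This is a minimal sketch that assumes the vertical (`\G`) output format, in which each field appears on its own line as `State: VALUE`. The helper `last_restore_state` is hypothetical, and the sample input is illustrative rather than captured from a real cluster.

```shell
#!/usr/bin/env bash
# Reads `show restore\G` output on stdin and prints the State of the last job listed.
last_restore_state() {
    awk -F': ' '/^[[:space:]]*State:/ { state = $2 } END { print state }'
}

# Illustrative input fragment in the vertical output format.
printf '          JobId: 1\n          State: FINISHED\n' | last_restore_state
```

In practice, the input would come from the MySQL client, for example `mysql -h <destination FE IP address> -P 9030 -uroot -D db_name -e 'show restore\G' | last_restore_state`, with the connection options adjusted to your cluster.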
After the data migration is complete, use the MgC Agent to verify the consistency of the migrated Doris data. For details, see Verifying Doris Data Migration.
