Updated on 2024-09-13 GMT+08:00

Migrating Data Between Huawei Cloud Elasticsearch Clusters Using the Read/Write Splitting Plugin

This topic describes how to migrate data between Huawei Cloud Elasticsearch clusters using the read/write splitting plugin.

Scenario

By default, the read/write splitting plugin of CSS is installed in Elasticsearch 7.6.2 and Elasticsearch 7.10.2 clusters. You can configure read/write splitting to perform near-real-time synchronization of index data between Elasticsearch clusters.

The read/write splitting plugin can be used for data migration only if both the source and destination clusters were created in CSS. Typical application scenarios include:

Cross-region or cross-account migration: Migrate the data of an Elasticsearch cluster in another region or under another account to the current cluster.
Cluster merge: Merge the index data of two Elasticsearch clusters.

Overview

Figure 1 Migration procedure

Figure 1 illustrates the general procedure for migrating data from one Huawei Cloud Elasticsearch cluster to another using the read/write splitting plugin of CSS.

Use the read/write splitting plugin to set up a connection between the source (primary) and destination (secondary) clusters.
Configure automatic index synchronization in the destination cluster to have data automatically synchronized from the source to the destination cluster. The synchronization interval is 30 seconds by default and can be changed.
Check the status of auto synchronization to see if data migration has completed.

For more information about the read/write splitting feature of CSS, see Configuring Read/Write Splitting.

Advantages

High data consistency: The read/write splitting feature enables data replication between a pair of primary/secondary clusters and ensures data synchronization between different shards.
Fast migration: The speed of data synchronization depends on the available bandwidth, not the source or destination cluster.
Configurable synchronization interval: The default data synchronization interval is 30 seconds. You can change it to reduce the delay of data synchronization.

Constraints

Ensure that the network between the clusters is connected.
If the source and destination clusters are in different VPCs, establish a VPC peering connection between them. For details, see VPC Peering Connection Overview.
The versions of the source and destination clusters must both be 7.6.2 or 7.10.2.

Prerequisites

The source and destination Elasticsearch clusters are available and both have the read/write splitting plugin installed. (By default, the read/write splitting plugin is installed in Elasticsearch 7.6.2 and 7.10.2.)
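
The version requirement above can be checked before any configuration is done. The following is a minimal sketch, assuming you have already fetched the JSON returned by `GET /` on each cluster (standard Elasticsearch reports the version under `version.number`). The function name `check_versions` is hypothetical, and the sketch only checks that each version is one of the two listed. It does not check whether the two versions match.

```python
from typing import Dict, List

# Versions named in this topic as having the read/write splitting plugin.
SUPPORTED_VERSIONS = {"7.6.2", "7.10.2"}


def check_versions(source_info: Dict, dest_info: Dict) -> List[str]:
    """Return a list of problems; an empty list means both versions are listed."""
    problems = []
    for role, info in (("source", source_info), ("destination", dest_info)):
        # `GET /` returns {"version": {"number": "7.10.2", ...}, ...}
        number = info.get("version", {}).get("number")
        if number not in SUPPORTED_VERSIONS:
            problems.append(
                f"{role} cluster version {number!r} is not 7.6.2 or 7.10.2"
            )
    return problems
```

An empty result means neither cluster was rejected on version grounds. Plugin availability still needs to be confirmed on the CSS console.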

Procedure

Obtain information about the Elasticsearch clusters to configure the data migration task.

**Table 1** Required Elasticsearch cluster information
Cluster	Cluster Type	Required Information	How to Obtain
Source cluster (primary cluster)	Huawei Cloud Elasticsearch cluster	Access address of the source cluster; username and password for accessing the source cluster (only for security-mode clusters)	For details about how to obtain the cluster address, see 1.c. Contact the service administrator to obtain the username and password.
Source cluster (primary cluster)	On-premises Elasticsearch cluster	Public network address of the source cluster; username and password for accessing the source cluster (only for security-mode clusters)	Contact the service administrator to obtain the information.
Source cluster (primary cluster)	Third-party Elasticsearch cluster	Access address of the source cluster; username and password for accessing the source cluster (only for security-mode clusters)	Contact the service administrator to obtain the information.
Destination cluster (secondary cluster)	Huawei Cloud Elasticsearch cluster	Access address of the destination cluster; username and password for accessing the destination cluster (only for security-mode clusters)	For details about how to obtain the access address, see 1.c. Contact the service administrator to obtain the username and password.

a. Log in to the CSS management console.
b. In the navigation pane on the left, choose Clusters > Elasticsearch.
c. In the Elasticsearch cluster list, obtain the cluster access address.
Figure 2 Obtaining cluster information

Log in to the Kibana console of the destination Elasticsearch cluster.
1. In the Elasticsearch cluster list, select the destination cluster, and click Access Kibana in the Operation column to log in to the Kibana console.
2. Click Dev Tools in the navigation tree on the left.

Use the read/write splitting plugin to set up a connection between the source and destination clusters.

Run the following command to configure information about the source cluster in the destination cluster:

```
PUT /_cluster/settings
{
  "persistent" : {
    "cluster" : {
      "remote.rest" : {
        "leader1" : {
          "seeds" : [
            "http://10.0.0.1:9200",
            "http://10.0.0.2:9200",
            "http://10.0.0.3:9200"
          ],
          "username": "elastic",
          "password": "*****"
        }
      }
    }
  }
}
```

**Table 2** Request body parameters
Parameter	Description
leader1	Name of the configuration task, which is user-defined and will be used for configuring read/write splitting later.
seeds	Access addresses of the source cluster. If HTTPS is enabled for the source cluster, the URI scheme must be https.
username	Username of the source cluster (primary cluster). This parameter is required only when security mode is enabled for the primary cluster.
password	Password of the source cluster (primary cluster). This parameter is required only when security mode is enabled for the primary cluster.

If the value of acknowledged is true in the command output, the configuration is successful.
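
If you script this step instead of typing it in Dev Tools, the request body can be assembled programmatically. The following is a minimal sketch that builds the same JSON structure as the command above and applies the HTTPS rule from Table 2. The function name `build_remote_settings` and its `https_enabled` flag are hypothetical helpers, not part of the plugin. Sending the body to the destination cluster is left out.

```python
from typing import Dict, List, Optional
from urllib.parse import urlparse


def build_remote_settings(
    name: str,
    seeds: List[str],
    username: Optional[str] = None,
    password: Optional[str] = None,
    https_enabled: bool = False,
) -> Dict:
    """Build the body for PUT /_cluster/settings on the destination cluster."""
    # Table 2: when HTTPS is enabled on the source cluster, seeds must use https.
    allowed = {"https"} if https_enabled else {"http", "https"}
    for seed in seeds:
        parsed = urlparse(seed)
        if parsed.scheme not in allowed or not parsed.netloc:
            raise ValueError(f"unexpected seed address: {seed!r}")

    remote: Dict = {"seeds": list(seeds)}
    # Credentials are needed only for security-mode source clusters.
    if username is not None and password is not None:
        remote["username"] = username
        remote["password"] = password

    return {"persistent": {"cluster": {"remote.rest": {name: remote}}}}
```

Passing `json.dumps(build_remote_settings("leader1", [...]))` as the request body should reproduce the command shown above, with the credentials omitted when the source cluster is not in security mode.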

Configure automatic index synchronization in the destination cluster to have data automatically synchronized from the source to the destination cluster.

Run the following command in the destination cluster to create a pattern-matching index synchronization policy, which synchronizes matched indexes from the source cluster to the destination cluster.

```
PUT auto_sync/pattern/pattern1
{
  "remote_cluster": "leader1",
  "remote_index_patterns": "log*",
  "local_index_pattern": "{{remote_index}}",
  "apply_exist_index": true
}
```

**Table 3** Request body parameters
Parameter	Description
pattern1	Name of the pattern for index matching.
remote_cluster	Configuration task name, for example, leader1 in the previous step.
remote_index_patterns	Pattern for matching indexes to be synchronized in the source cluster. The wildcard (*) is supported.
local_index_pattern	Index name pattern in the destination cluster. Template variables are substituted. For example, if this parameter is set to {{remote_index}}, the source index log1 is synchronized to an index that is also named log1 in the destination cluster.
apply_exist_index	Whether to synchronize existing indexes in the source cluster. The default value is true.
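
To preview which source indexes a policy would pick up, the pattern can be tried against a list of index names before the policy is created. The following is a rough sketch, assuming the plugin's `*` wildcard behaves like ordinary glob matching and that `{{remote_index}}` is replaced by the source index name. Both are assumptions about the plugin, and `preview_sync` is a hypothetical helper. Note that Python's `fnmatchcase` also interprets `?` and `[...]`, which the plugin may not.

```python
from fnmatch import fnmatchcase
from typing import Dict, List


def preview_sync(
    remote_indices: List[str],
    remote_index_pattern: str,
    local_index_pattern: str = "{{remote_index}}",
) -> Dict[str, str]:
    """Map each matching source index name to its expected destination name."""
    mapping = {}
    for name in remote_indices:
        # Assumption: '*' matches any run of characters, as in a shell glob.
        if fnmatchcase(name, remote_index_pattern):
            mapping[name] = local_index_pattern.replace("{{remote_index}}", name)
    return mapping
```

With the policy above, a source cluster that holds `log1`, `log-2024`, and `metrics` would be expected to synchronize only the first two, under the same names.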

Check the status of automatic synchronization in the destination cluster. The default synchronization interval is 30 seconds.
Run the following command to obtain the synchronization status of the matched indexes:
```
GET auto_sync/stats
```
If the value of failed_count is 0 in the command output, the synchronization is complete.
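
When the status is checked from a script, the same rule can be applied to the parsed response. The following is a minimal sketch, assuming the JSON returned by `GET auto_sync/stats` has already been loaded into a dictionary. Only the `failed_count` field named above is used, and `sync_has_failures` is a hypothetical helper.

```python
from typing import Dict


def sync_has_failures(stats: Dict) -> bool:
    """Return True if the auto_sync statistics report any failed synchronization."""
    if "failed_count" not in stats:
        # Fail loudly rather than treating a malformed response as success.
        raise KeyError("failed_count is missing from the auto_sync/stats response")
    return int(stats["failed_count"]) != 0
```

Because synchronization runs every 30 seconds by default, it may be worth repeating the check after at least one interval has passed.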
After the synchronization is complete, check the data consistency between the source and destination Elasticsearch clusters.

For example, log in to the Kibana console of the source cluster and of the destination cluster, go to Dev Tools in each, and run the GET _cat/indices command to check whether their indexes are consistent.
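
Comparing the two listings by eye becomes error-prone with many indexes. The following is a minimal sketch that compares document counts, assuming each listing was fetched with `GET _cat/indices?format=json`, which in standard Elasticsearch returns one object per index with `index` and `docs.count` fields. The helper name `compare_doc_counts` and the `prefix` filter are hypothetical. Matching document counts are only a rough indicator of consistency.

```python
from typing import Dict, List, Optional, Tuple


def compare_doc_counts(
    source_rows: List[Dict],
    dest_rows: List[Dict],
    prefix: str = "log",
) -> Dict[str, Tuple[int, Optional[int]]]:
    """Return {index: (source_count, dest_count)} for every index that differs.

    dest_count is None when the index does not exist in the destination cluster.
    """

    def to_counts(rows: List[Dict]) -> Dict[str, int]:
        # docs.count is reported as a string and may be null for closed indexes.
        return {row["index"]: int(row.get("docs.count") or 0) for row in rows}

    source = to_counts(source_rows)
    dest = to_counts(dest_rows)

    differences = {}
    for name, count in source.items():
        if not name.startswith(prefix):
            continue  # only look at indexes covered by the synchronization pattern
        if dest.get(name) != count:
            differences[name] = (count, dest.get(name))
    return differences
```

An empty result suggests that every synchronized index holds the same number of documents in both clusters.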