Migrating Data Between Huawei Cloud Elasticsearch Clusters Using Backup and Restoration
Data can be migrated between CSS Elasticsearch clusters by backing up and restoring cluster snapshots.
Scenarios
Data migration between Huawei Cloud Elasticsearch clusters via backup and restoration is applicable solely to scenarios where both the source and destination clusters are CSS clusters and rely on OBS. Typical application scenarios include:
- Cross-region or cross-account migration: Migrate the data of an Elasticsearch cluster in another region or under another account to the current cluster.
- Cross-version migration: Migrate data from an Elasticsearch cluster of an earlier version to a cluster of a later version.
- Cluster merge: Merge the index data of two Elasticsearch clusters.
Overview
Figure 1 shows the process of migrating data between Huawei Cloud Elasticsearch clusters using backup and restoration.
- Create a snapshot for the source Elasticsearch cluster and store the snapshot in an OBS bucket.
- Restore the data to a destination cluster using the snapshot stored in the OBS bucket.
Advantages
- Easy operation and management: The cluster snapshot function on the CSS console allows for simple, easy-to-manage, and automatic data backup and restoration.
- Applicable to large-scale data migration: Snapshot backup is suitable for scenarios involving large amounts of data, especially when the data volume reaches GB, TB, or even PB levels.
- Cross-region and cross-account migration: With the cross-region replication function of OBS, data can be migrated across different regions and accounts.
- Controllable restoration process: During data restoration, you can restore specific indexes or all indexes and specify the cluster status to be restored.
- Controllable migration duration: The data migration rate can be configured based on the migration duration evaluation formula. Ideally, the data migration rate matches the file replication rate.
Impact on Performance
Migrating data between clusters using backup and restoration essentially copies files at the data storage layer. This solution does not rely on any external Elasticsearch APIs, so it has little impact on the performance of the source cluster. For clusters that are not particularly latency-sensitive, the performance impact of this method can be ignored.
Constraints
- The version of the destination cluster must not be earlier than that of the source cluster. For details, see Snapshot version compatibility.
- The number of nodes in the destination cluster must be greater than half of that in the source cluster, and cannot be less than the number of shard replicas in the source cluster.
- The CPU, memory, and disk capacities of the destination cluster must not be lower than those of the source cluster.
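The constraints above can be checked before a migration is planned. The following is a minimal sketch, not a CSS tool: the ClusterSpec class and its field names are hypothetical, the resource figures are assumed to be cluster-wide totals, and the version check covers only the "not earlier than" rule (the snapshot version compatibility matrix still applies).

```python
from dataclasses import dataclass


@dataclass
class ClusterSpec:
    """Hypothetical summary of a cluster; field names are illustrative only."""
    version: tuple      # for example (7, 6, 2)
    nodes: int          # number of nodes in the cluster
    max_replicas: int   # highest number of shard replicas among the indexes
    vcpus: int          # assumed to be the cluster-wide total
    memory_gb: int      # assumed to be the cluster-wide total
    disk_gb: int        # assumed to be the cluster-wide total


def check_constraints(src: ClusterSpec, dst: ClusterSpec) -> list:
    """Return the violated constraints; an empty list means all checks pass."""
    problems = []
    if dst.version < src.version:
        problems.append("destination version is earlier than source version")
    # "Greater than half" means 2 * destination nodes > source nodes.
    if dst.nodes * 2 <= src.nodes:
        problems.append("destination nodes must exceed half of source nodes")
    if dst.nodes < src.max_replicas:
        problems.append("destination nodes are fewer than source shard replicas")
    for field in ("vcpus", "memory_gb", "disk_gb"):
        if getattr(dst, field) < getattr(src, field):
            problems.append(f"destination {field} is lower than source")
    return problems


if __name__ == "__main__":
    es1 = ClusterSpec((7, 6, 2), nodes=6, max_replicas=1,
                      vcpus=24, memory_gb=96, disk_gb=3000)
    es2 = ClusterSpec((7, 10, 2), nodes=4, max_replicas=1,
                      vcpus=24, memory_gb=96, disk_gb=3000)
    print(check_constraints(es1, es2))
```

The sample figures for Es-1 and Es-2 are made up for illustration.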
Migration Duration
The number of nodes or index shards in the source and destination clusters determines how long the data migration will take. Data migration consists of two phases: data backup and restoration. The backup duration is determined by the source cluster and the restoration duration is determined by the destination cluster. The formula for calculating the total migration duration is as follows:
- If the number of index shards is greater than the number of nodes:
Total duration (in seconds) = Size of migrated data (in GB) ÷ 0.04 GB/s (40 MB/s) ÷ (Number of nodes in the source cluster + Number of nodes in the destination cluster) × Number of indexes
- If the number of index shards is smaller than the number of nodes:
Total duration (in seconds) = Size of migrated data (in GB) ÷ 0.04 GB/s (40 MB/s) ÷ (Number of index shards in the source cluster + Number of index shards in the destination cluster) × Number of indexes

The duration estimated using the formula is the minimum possible, assuming that each node transmits data at the maximum rate of 40 MB/s. The actual duration also depends on factors such as network conditions and resource availability.
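The two formulas can be worked through in a few lines of Python. This is a rough sketch under stated assumptions: the function and parameter names are hypothetical, and because the text does not say which cluster's shard and node counts are compared, the sketch compares the combined totals of both clusters. When the totals are equal, both formulas give the same result.

```python
def estimate_migration_seconds(data_gb, src_nodes, dst_nodes,
                               src_shards, dst_shards, index_count,
                               rate_gb_per_s=0.04):
    """Estimate the minimum migration duration in seconds.

    rate_gb_per_s defaults to 0.04 GB/s (40 MB/s), the per-node rate
    used in the formulas.
    """
    total_nodes = src_nodes + dst_nodes
    total_shards = src_shards + dst_shards
    # Assumption: compare combined totals of both clusters.
    if total_shards > total_nodes:
        divisor = total_nodes
    else:
        divisor = total_shards
    return data_gb / rate_gb_per_s / divisor * index_count


if __name__ == "__main__":
    # Made-up example: 240 GB in one index, 3 nodes and 10 shards per cluster.
    seconds = estimate_migration_seconds(
        data_gb=240, src_nodes=3, dst_nodes=3,
        src_shards=10, dst_shards=10, index_count=1)
    print(f"Estimated minimum duration: {seconds / 60:.1f} minutes")
```

Treat the result as a lower bound only, since it assumes every node sustains the full 40 MB/s.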
Prerequisites
- The destination cluster (Es-2) and source cluster (Es-1) are available. You are advised to migrate a cluster during off-peak hours.
- Ensure that the destination cluster (Es-2) and source cluster (Es-1) are in the same region.
If the source and destination clusters are in different regions or under different accounts, replicate the snapshots from the OBS bucket of the source cluster to the OBS bucket of the destination cluster. For details, see Cross-Region Replication. Then, restore the snapshots in the destination cluster.
Procedure
- Log in to the Cloud Search Service management console.
- Choose Clusters > Elasticsearch. On the displayed page, click the source cluster name Es-1 to go to the basic information page.
- In the navigation pane, choose Cluster Snapshots, and set basic snapshot configurations.
Table 1 Basic configurations for a cluster snapshot
Parameter
Description
OBS Bucket
Select an OBS bucket for storing cluster snapshots.
Backup Path
Storage path of the cluster snapshot in the OBS bucket. You can retain the default value.
Maximum Backup Rate (per Second)
Maximum data backup rate per node. You can use the default value 40 MB.
Maximum Recovery Rate (per Second)
Maximum data recovery rate per node. You can set this parameter to 40 MB.
For OpenSearch clusters and Elasticsearch clusters later than 7.6.2, the recovery rate is also limited by the indices.recovery.max_bytes_per_sec parameter.
- If Maximum Recovery Rate (per Second) is less than indices.recovery.max_bytes_per_sec, the former takes effect.
- If Maximum Recovery Rate (per Second) is greater than indices.recovery.max_bytes_per_sec, the latter takes effect.
NOTE:
- To check the value of indices.recovery.max_bytes_per_sec, run the following command:
GET _cluster/settings
- To modify indices.recovery.max_bytes_per_sec, run the following command:
PUT _cluster/settings
{
  "transient": {
    "indices.recovery.max_bytes_per_sec": "100mb"
  }
}
IAM Agency
To store snapshots in an OBS bucket, you must have the required OBS access permissions. Select an IAM agency to grant the current account the permission to access and use OBS.
- If you are configuring an agency for the first time, click Automatically Create IAM Agency to create css-obs-agency.
- If an IAM agency was automatically created earlier, you can click One-click authorization to automatically remove the OBS Administrator permissions and add the following custom policies instead, for more fine-grained permission control.
"obs:bucket:GetBucketLocation", "obs:object:GetObjectVersion", "obs:object:GetObject", "obs:object:DeleteObject", "obs:bucket:HeadBucket", "obs:bucket:GetBucketStoragePolicy", "obs:object:DeleteObjectVersion", "obs:bucket:ListBucketVersions", "obs:bucket:ListBucket", "obs:object:PutObject"
- When OBS buckets use SSE-KMS encryption, the IAM agency must be granted KMS permissions. You can click Automatically Create IAM Agency and One-click authorization to have the following custom policies created automatically.
"kms:cmk:create", "kms:dek:create", "kms:cmk:get", "kms:dek:decrypt", "kms:cmk:list"
- To use Automatically Create IAM Agency and One-click authorization, the following minimum permissions are required:
"iam:agencies:listAgencies", "iam:roles:listRoles", "iam:agencies:getAgency", "iam:agencies:createAgency", "iam:permissions:listRolesForAgency", "iam:permissions:grantRoleToAgency", "iam:permissions:listRolesForAgencyOnProject", "iam:permissions:revokeRoleFromAgency", "iam:roles:createRole"
- To use an IAM agency, the following minimum permissions are required:
"iam:agencies:listAgencies", "iam:agencies:getAgency", "iam:permissions:listRolesForAgencyOnProject", "iam:permissions:listRolesForAgency"
NOTE: The agency name can contain only letters (case-sensitive), digits, underscores (_), and hyphens (-). Otherwise, the backup will fail.
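The interaction between Maximum Recovery Rate (per Second) and indices.recovery.max_bytes_per_sec described in Table 1 amounts to taking the smaller of the two limits. The following sketch illustrates this for OpenSearch clusters and Elasticsearch clusters later than 7.6.2. The function names are hypothetical, and the parser handles only simple values such as 40mb, assuming 1024-based units.

```python
import re

# Assumption: byte-size units are powers of 1024.
_UNITS = {"b": 1, "kb": 1024, "mb": 1024 ** 2, "gb": 1024 ** 3}


def parse_bytes(value: str) -> int:
    """Convert a value such as '40mb' or '100MB' to a number of bytes."""
    match = re.fullmatch(r"(\d+)\s*(b|kb|mb|gb)", value.strip().lower())
    if match is None:
        raise ValueError(f"unsupported size value: {value!r}")
    return int(match.group(1)) * _UNITS[match.group(2)]


def effective_recovery_rate(console_rate: str, indices_recovery_rate: str) -> str:
    """Return whichever limit is lower, because the lower limit takes effect."""
    return min(console_rate, indices_recovery_rate, key=parse_bytes)


if __name__ == "__main__":
    # Console setting of 40 MB against a cluster setting of 100mb.
    print(effective_recovery_rate("40mb", "100mb"))
    # Console setting of 200 MB against a cluster setting of 100mb.
    print(effective_recovery_rate("200mb", "100mb"))
```

If the restoration is slower than expected, comparing the two values this way shows which setting to raise.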
- Click Create. In the dialog box that is displayed, configure the parameters and click OK to manually create a snapshot.
Table 2 Snapshot creation parameters
Parameter
Description
Snapshot Name
User-defined snapshot name. You can retain the default value.
Index
Indexes that you want to back up using snapshots. The index names cannot contain spaces or uppercase letters, and cannot contain "\<|>/?. Use commas (,) to separate different index names. If you do not specify this parameter, all indexes in the cluster are backed up by default. You can use the asterisk (*) to match multiple indexes. For example, if you enter index*, then the data of all indexes whose names are prefixed with index will be backed up.
NOTE: Run the GET /_cat/indices command in Kibana to query the names of all indexes in the cluster.
Description
Snapshot description.
In the snapshot management list, if the snapshot status is Available, the snapshot has been created.
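The rules for the Index parameter in Table 2 (comma-separated names, no spaces or uppercase letters, none of the characters "\<|>/?, and the asterisk as a wildcard) can be previewed locally before a snapshot is created. This is an illustrative sketch with hypothetical function names; the console performs its own validation, which may differ in detail.

```python
import re

# Characters that the Index parameter does not accept.
FORBIDDEN_CHARS = set('"\\<|>/?')


def parse_index_field(value):
    """Split a comma-separated Index value and validate each entry."""
    names = [part for part in value.split(",") if part]
    for name in names:
        if any(ch.isupper() or ch.isspace() or ch in FORBIDDEN_CHARS
               for ch in name):
            raise ValueError(f"invalid index name or pattern: {name!r}")
    return names


def select_indexes(value, cluster_indexes):
    """Return the indexes that an Index value would select.

    An empty value selects every index, matching the documented default.
    """
    patterns = parse_index_field(value)
    if not patterns:
        return list(cluster_indexes)
    # Only the asterisk is treated as a wildcard; everything else is literal.
    regexes = [re.compile(re.escape(p).replace(r"\*", ".*")) for p in patterns]
    return [name for name in cluster_indexes
            if any(rx.fullmatch(name) for rx in regexes)]


if __name__ == "__main__":
    existing = ["index1", "index2", "logs"]  # made-up index names
    print(select_indexes("index*", existing))
```

The list passed as cluster_indexes can be taken from the output of GET /_cat/indices.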
- In the snapshot management list, click Restore in the Operation column of the snapshot and configure restoration parameters to restore data to destination cluster Es-2.
Table 3 Snapshot restoration parameters
Parameter
Description
Index
Specify the name of the index you want to restore.
Constraints:
- The value is a string of 0 to 1024 characters that cannot contain uppercase letters, spaces, or the following special characters: "\<|>/?.
- When restoring an index whose name is prefixed with .kibana, the index name must be specified.
- The .opendistro_security index cannot be restored.
Value range:
- You can use an asterisk (*) to match multiple indexes. For example, index* indicates that all indexes with the prefix index will be restored. When an asterisk (*) is used for index matching, the .opendistro_security index and any system indexes whose name is prefixed with .kibana are filtered out by default.
- You can restore indexes by specifying their names, for example, index1,index2,index3.
Default value:
By default, this parameter is left blank. That is, no index name is specified, and all indexes will be restored.
Rename Pattern
Index name matching rule. Enter a regular expression. Indexes that match the regular expression will be restored. The default value index_(.+) matches index names such as index_1, that is, index_ followed by one or more characters. The value is a string of 0 to 1024 characters that cannot contain uppercase letters, spaces, or the following special characters: "\<|>/?,.
NOTE: Rename Pattern and Rename Replacement take effect only when both are configured.
Rename Replacement
Rule for index renaming. The default value restored_index_$1 indicates that restored_ will be added to the beginning of the names of the matching restored indexes, where $1 stands for the text captured by the first group in Rename Pattern. The value is a string of 0 to 1024 characters that cannot contain uppercase letters, spaces, or the following special characters: "\<|>/?,.
NOTE: Rename Pattern and Rename Replacement take effect only when both are configured.
Cluster
Select a destination cluster, for example, Es-2.
Select or deselect Overwrite same-name indexes in the destination cluster. This option is deselected by default. Restoring data from a snapshot works by overwriting existing snapshot files. If the destination cluster contains indexes with the same names as indexes in the snapshot, you must select this option to restore them. Only indexes with the same name and the same shard structure can be restored this way; indexes with a different shard structure cannot be restored. Exercise caution when performing this operation, because existing data in the same-name indexes is overwritten.
In the snapshot list, when the Task Status changes to Restoration succeeded, the data migration is complete.
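The effect of the default Rename Pattern and Rename Replacement values in Table 3 can be previewed with a regular expression substitution. The sketch below uses Python's re module, which is not the regular expression engine that Elasticsearch uses, so only simple patterns are guaranteed to behave the same. It also assumes that names that do not match the pattern are left unchanged. The function names are hypothetical.

```python
import re


def java_to_python_replacement(replacement):
    """Translate group references such as $1 into Python's \\g<1> form."""
    return re.sub(r"\$(\d+)", r"\\g<\1>", replacement)


def renamed(index_name, rename_pattern, rename_replacement):
    """Return the name an index would get after the rename rule is applied.

    Assumption: names that do not match the pattern are returned unchanged.
    """
    python_replacement = java_to_python_replacement(rename_replacement)
    return re.sub(rename_pattern, python_replacement, index_name)


if __name__ == "__main__":
    # Default values from Table 3; the index names are made up.
    for name in ("index_logs", "index_2024", "metrics"):
        print(name, "->", renamed(name, "index_(.+)", "restored_index_$1"))
```

Running a preview like this against the index names in the snapshot helps confirm that the renamed indexes will not collide with existing indexes in the destination cluster.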
- After the data migration is complete, check the data consistency between the destination Elasticsearch cluster Es-2 and source Elasticsearch cluster Es-1. For example, run the _cat/indices command in the source and destination clusters, separately, to check whether their indexes are consistent.
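The consistency check in the last step can be partly automated by comparing the output of GET _cat/indices?format=json from both clusters. The following is a minimal sketch that compares document counts per index. The function names are hypothetical, the rows are assumed to have been fetched and parsed already, and indexes whose names start with a dot are skipped on the assumption that system indexes are not part of the migration.

```python
def doc_counts(cat_rows, skip_prefixes=(".",)):
    """Map index name to document count from _cat/indices JSON rows."""
    counts = {}
    for row in cat_rows:
        name = row["index"]
        if name.startswith(skip_prefixes):
            continue  # assumption: system indexes are not compared
        # docs.count arrives as a string and may be missing for closed indexes.
        counts[name] = int(row.get("docs.count") or 0)
    return counts


def diff_clusters(source_rows, dest_rows):
    """Return {index: (source_count, dest_count)} for indexes that differ.

    A count of None means the index is missing from that cluster.
    """
    src, dst = doc_counts(source_rows), doc_counts(dest_rows)
    return {
        name: (src.get(name), dst.get(name))
        for name in sorted(set(src) | set(dst))
        if src.get(name) != dst.get(name)
    }


if __name__ == "__main__":
    # Made-up sample rows in the shape returned with format=json.
    es1 = [{"index": "index1", "docs.count": "100"},
           {"index": "index2", "docs.count": "5"}]
    es2 = [{"index": "index1", "docs.count": "100"},
           {"index": "index2", "docs.count": "4"}]
    print(diff_clusters(es1, es2))
```

An empty result means the compared indexes have matching document counts. If indexes were renamed during restoration, map the names before comparing.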