Updated on 2024-04-19 GMT+08:00

Migrating Data Through Backup and Restoration (from Third-Party Elasticsearch)

To migrate data from a user-built or third-party Elasticsearch cluster to a Huawei Cloud Elasticsearch cluster, perform the steps in this section.

Prerequisites

  • Before using backup and restoration, ensure that:
    • Target Elasticsearch version ≥ Source Elasticsearch version
    • Number of candidate master nodes of the target Elasticsearch cluster > Half of the number of candidate master nodes of the source Elasticsearch cluster
  • Backup and restoration does not support incremental data synchronization. Stop data updates on the source cluster before backing up data, because changes made after the snapshot is taken are not migrated.
  • The target Elasticsearch cluster has been created in CSS.
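The version and master-node conditions above can be checked before starting. The following is a minimal sketch, assuming you already know the version strings and candidate master node counts of both clusters; the function names are hypothetical and not part of CSS or Elasticsearch.

```python
def parse_version(version: str) -> tuple:
    """Turn '7.10.2' into (7, 10, 2) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def check_prerequisites(source_version, target_version,
                        source_masters, target_masters):
    """Return a list of violated prerequisites (empty if all are met)."""
    problems = []
    # Target Elasticsearch version must be >= source Elasticsearch version.
    if parse_version(target_version) < parse_version(source_version):
        problems.append("target version is lower than source version")
    # Target candidate master nodes must exceed half of the source count.
    if not target_masters > source_masters / 2:
        problems.append("target has too few candidate master nodes")
    return problems
```

Versions are compared as integer tuples because a plain string comparison would rank 7.6.2 above 7.10.2.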

Migration Process

The following figure shows the cluster migration process when the source is a user-built or third-party Elasticsearch cluster, and the target is an Elasticsearch cluster of CSS.

Figure 1 Migration through backup and restoration

Procedure

  1. Log in to the third-party cloud where Elasticsearch is located and create a shared repository that supports the S3 protocol.
  2. Create a snapshot backup repository in the user-built or third-party Elasticsearch cluster to store Elasticsearch snapshot data.

    For example, create a backup repository named my_backup in the Elasticsearch cluster and associate it with the OSS repository created in step 1.

    PUT _snapshot/my_backup
    {
        # Repository type.
        "type": "oss",
        "settings": {
            # Private network domain name of the repository created in step 1.
            "endpoint": "http://oss-xxx.xxx.com",
            # Access key ID and secret access key of the repository. Hard-coded or plaintext access keys (AK/SK) are risky. For security purposes, encrypt your access keys and store them in a configuration file or environment variables. The values ak and sk below are placeholders; replace them with your own keys.
            "access_key_id": "ak",
            "secret_access_key": "sk",
            # Bucket name of the repository created in step 1.
            "bucket": "patent-esbak",
            # Whether to enable snapshot file compression.
            "compress": false,
            # Snapshot files larger than this value are split into chunks of this size when uploaded to the repository.
            "chunk_size": "1g",
            # Path in the bucket where snapshots are stored. The default value is the root directory.
            "base_path": "snapshot/"
        }
    }
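To keep the AK and SK out of the request text, the repository body can be assembled from environment variables. The following is a minimal sketch, not the documented procedure: the variable names SOURCE_AK and SOURCE_SK and the function name are assumptions made for illustration, and the settings simply mirror the example in step 2.

```python
import json
import os


def build_oss_repository_body(bucket, endpoint, base_path="snapshot/"):
    """Build the body for PUT _snapshot/<repository>, reading keys from the environment."""
    # SOURCE_AK and SOURCE_SK are hypothetical variable names chosen for this sketch.
    access_key = os.environ["SOURCE_AK"]
    secret_key = os.environ["SOURCE_SK"]
    return {
        "type": "oss",
        "settings": {
            "endpoint": endpoint,
            "access_key_id": access_key,
            "secret_access_key": secret_key,
            "bucket": bucket,
            "compress": False,
            "chunk_size": "1g",
            "base_path": base_path,
        },
    }


def to_request_json(body):
    """Serialize the body so it can be sent with any HTTP client."""
    return json.dumps(body)
```

A missing variable raises KeyError, so the request is never sent with an empty key.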

  3. Create a snapshot for the user-built or third-party Elasticsearch cluster.

    • Create a snapshot for all indexes.

      For example, create a snapshot named snapshot_1.

      PUT _snapshot/my_backup/snapshot_1?wait_for_completion=true
    • Create a snapshot for specified indexes.

      For example, create a snapshot named snapshot_test that contains indexes patent_analyse and patent.

      PUT _snapshot/my_backup/snapshot_test
      {
        "indices": "patent_analyse,patent"
      }

  4. View the snapshot creation progress of the cluster.

    • Run the following command to view information about all snapshots:
      GET _snapshot/my_backup/_all
    • Run the following command to view information about snapshot_1:
      GET _snapshot/my_backup/snapshot_1
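The responses to these requests list each snapshot with a state such as IN_PROGRESS or SUCCESS. The following is a minimal sketch of reading that state from an already parsed response; the function names are hypothetical, and the sample response is hand-written and trimmed to the two fields the sketch uses.

```python
def snapshot_states(response: dict) -> dict:
    """Map each snapshot name to its state from a get-snapshot response."""
    return {item["snapshot"]: item["state"]
            for item in response.get("snapshots", [])}


def all_succeeded(response: dict) -> bool:
    """True only if at least one snapshot is listed and all are in state SUCCESS."""
    states = snapshot_states(response)
    return bool(states) and all(state == "SUCCESS" for state in states.values())


# Illustrative, hand-written response; a real one carries more fields.
sample_response = {
    "snapshots": [
        {"snapshot": "snapshot_1", "state": "SUCCESS"},
        {"snapshot": "snapshot_test", "state": "IN_PROGRESS"},
    ]
}
```

Migrating the repository data to OBS before every snapshot reports SUCCESS risks copying an incomplete snapshot.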

  5. Migrate snapshot data from the repository to OBS.

    The Object Storage Migration Service (OMS) supports data migration from multiple cloud vendors to OBS. For details, see Migration from Other Clouds to Huawei Cloud.

  6. Create a repository in the Elasticsearch cluster of CSS and associate it with OBS. This repository will be used for restoring the snapshot data of the user-built or third-party Elasticsearch cluster.

    For example, create a repository named my_backup_all in the cluster and associate it with the destination OBS bucket.

    PUT _snapshot/my_backup_all/
    {
        "type" : "obs",
        "settings" : {
            # Private network domain name of OBS.
            "endpoint" : "obs.xxx.xxx.com",
            "region" : "xxx",
            # Access key (AK) and secret key (SK) for accessing OBS. Hard-coded or plaintext access keys (AK/SK) are risky. For security purposes, encrypt your access keys and store them in a configuration file or environment variables. The values ak and sk below are placeholders; replace them with your own keys.
            "access_key": "ak",
            "secret_key": "sk",
            # OBS bucket name, which must be the same as the destination OBS bucket name in the previous step.
            "bucket" : "esbak",
            "compress" : "false",
            "chunk_size" : "1g",
            # Note that there is no slash (/) after snapshot.
            "base_path" : "snapshot",
            "max_restore_bytes_per_sec": "100mb",
            "max_snapshot_bytes_per_sec": "100mb"
        }
    }

  7. Restore the snapshot data to the Elasticsearch cluster of CSS.

    1. Check information about all snapshots in the repository.
      GET _snapshot/my_backup_all/_all
    2. Restore a snapshot.
      • Restore all the indexes from a snapshot. For example, to restore all the indexes from snapshot_1, run the following command:
        POST _snapshot/my_backup_all/snapshot_1/_restore?wait_for_completion=true
      • Restore some indexes from a snapshot. For example, restore all indexes in snapshot_1 except the system indexes whose names start with .monitoring, .security, or .kibana.
        POST _snapshot/my_backup_all/snapshot_1/_restore
        {"indices":"*,-.monitoring*,-.security*,-.kibana*","ignore_unavailable":"true"}
      • Restore specified indexes from a snapshot and rename them. For example, in snapshot_1, restore index_1 to restored_index_1 and index_2 to restored_index_2.
        POST /_snapshot/my_backup_all/snapshot_1/_restore
        {
            # Restore only indexes index_1 and index_2 and ignore other indexes in the snapshot.
            "indices": "index_1,index_2",
            # Regular expression matched against the names of the indexes being restored.
            "rename_pattern": "index_(.+)",
            # Replacement applied to each matching index name. $1 refers to the first capture group.
            "rename_replacement": "restored_index_$1"
        }
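The effect of rename_pattern and rename_replacement can be previewed locally before restoring. The following is a minimal sketch that approximates the renaming with Python's re module; the function name is hypothetical, and because Elasticsearch applies Java regular expressions, unusual patterns may behave differently from this approximation.

```python
import re


def preview_rename(index_names, rename_pattern, rename_replacement):
    """Approximate the names that restored indexes would receive."""
    # Convert Java-style group references ($1) to Python's \g<1> form.
    python_replacement = re.sub(r"\$(\d+)", r"\\g<\1>", rename_replacement)
    compiled = re.compile(rename_pattern)
    # Names that do not match the pattern are left unchanged.
    return {name: compiled.sub(python_replacement, name)
            for name in index_names}


renamed = preview_rename(["index_1", "index_2"],
                         "index_(.+)", "restored_index_$1")
# renamed == {"index_1": "restored_index_1", "index_2": "restored_index_2"}
```

Checking the result against the indexes that already exist in the target cluster helps avoid a restore failure caused by a name conflict with an open index.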

  8. View the snapshot restoration result.

    • Run the following command to view the restoration progress of all indexes:
      GET /_recovery/
    • Run the following command to check the snapshot restoration result of a specified index:
      GET {index_name}/_recovery
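The recovery response groups shards by index, and each shard reports a stage that reaches DONE when its restoration has finished. The following is a minimal sketch that lists the shards still in progress from an already parsed response; the function name is hypothetical, and the sample response is hand-written and trimmed to the fields the sketch reads.

```python
def unfinished_shards(recovery_response: dict) -> list:
    """Return (index, shard id, stage) for every shard not yet in stage DONE."""
    pending = []
    for index_name, details in recovery_response.items():
        for shard in details.get("shards", []):
            if shard.get("stage") != "DONE":
                pending.append((index_name, shard.get("id"), shard.get("stage")))
    return pending


# Illustrative, hand-written response; a real one carries more fields.
sample_recovery = {
    "patent": {"shards": [
        {"id": 0, "type": "SNAPSHOT", "stage": "DONE"},
        {"id": 1, "type": "SNAPSHOT", "stage": "INDEX"},
    ]},
    "patent_analyse": {"shards": [
        {"id": 0, "type": "SNAPSHOT", "stage": "DONE"},
    ]},
}
```

An empty list means every shard in the response has finished recovering.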