Updated on 2024-11-29 GMT+08:00

Migration Solution

Server Resource Planning

To migrate Solr to Elasticsearch, prepare in advance the same number of server nodes as the existing Solr cluster has. These nodes replace the Solr nodes.

The CPU, memory, disk quantity, and disk size of the new nodes must be the same as those of the Solr nodes.

For example, assume that the Solr cluster has 64 nodes, each with 256 GB of memory and 24 x 600 GB disks.

The Elasticsearch cluster then also needs 64 nodes, each with 256 GB of memory and 24 x 600 GB disks.

Server Network Planning

The MRS network is divided into two planes: the service plane and the management plane. The two planes are physically isolated to ensure network security. External management network IP addresses can be configured for the active and standby management nodes, so you can manage the cluster over an external management network. Prepare enough IP addresses in the network environment to configure network information for the new nodes.

Migration Schedule

Data migration here means moving data to Elasticsearch when the source data is stored in HBase and the index data is stored in Solr. You can use the HBase data import and export tool in the Elasticsearch software package to import the data into Elasticsearch.

The tool uses HBase cluster resources while it is running and has memory requirements. The migration speed depends on the available resources. Example:

  • An HBase cluster has 92 nodes (with 256 GB memory), and an Elasticsearch cluster has 64 nodes (with 256 GB memory). The migration speed is 2 million records per second. It takes about 5.8 days to migrate a cluster with 1 trillion records.
  • An HBase cluster has 92 nodes (with 128 GB memory), and an Elasticsearch cluster has 20 nodes (with 128 GB memory). The migration speed is 500,000 records per second. It takes about 23.1 days to migrate a cluster with 1 trillion records.

Before the migration, you are advised to run a 2-hour test in the production environment to verify the data migration speed, and then determine the operation time based on the total data volume.
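The durations in the examples above come from dividing the total record count by the sustained migration speed. The following is a minimal sketch of that arithmetic, assuming the speed measured during the trial run stays constant for the whole migration. The function names and the trial figures are illustrative and are not part of the migration tool.

```python
SECONDS_PER_DAY = 86_400


def measured_rate(records_migrated: int, elapsed_seconds: float) -> float:
    """Return the average records per second observed during a trial run."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return records_migrated / elapsed_seconds


def estimated_days(total_records: int, records_per_second: float) -> float:
    """Return the estimated migration time in days at a constant speed."""
    if records_per_second <= 0:
        raise ValueError("records_per_second must be positive")
    return total_records / records_per_second / SECONDS_PER_DAY


if __name__ == "__main__":
    # Hypothetical 2-hour trial that migrated 14.4 billion records.
    rate = measured_rate(14_400_000_000, 2 * 3600)
    print(f"Measured speed: {rate:,.0f} records/s")
    print(f"Estimated time for 1 trillion records: "
          f"{estimated_days(10**12, rate):.2f} days")
```

Actual throughput may vary with cluster load, so treat the result as a planning estimate rather than a guarantee.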

Impact of the Migration

After the migration, Solr and Elasticsearch can provide services at the same time, but user service applications need to be migrated from Solr to Elasticsearch. Services are not available during the migration.

Preparations

Table 1 Migration environment information to be collected

| Parameter | Description | How to Obtain |
|---|---|---|
| Solr node information | Contains information about all IP addresses, CPUs, memory, and disks. It is used to prepare Elasticsearch hardware resources and test the data migration time. | Obtain the value from the cluster administrator. |
| Elasticsearch node information | Contains all IP addresses and port information. It is used to update configurations during service application migration and may be used to set network firewalls. | Obtain the value from the cluster administrator. |
| Administrator accounts of Solr and Elasticsearch | Contains the usernames and passwords. The accounts are used to query data to confirm the migration progress and result. | Obtain the value from the cluster administrator. |
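One way to confirm the migration progress with these accounts is to compare document counts on both sides. The sketch below assumes that the Solr count comes from a JSON select query with rows=0 (the numFound field) and the Elasticsearch count from the _count API (the count field), and that each Solr document maps to exactly one Elasticsearch document. The helper names and sample response bodies are hypothetical. Endpoints and authentication depend on your cluster and are not shown.

```python
import json


def solr_num_found(body: str) -> int:
    """Extract the hit count from a Solr JSON select response."""
    return int(json.loads(body)["response"]["numFound"])


def es_count(body: str) -> int:
    """Extract the document count from an Elasticsearch _count response."""
    return int(json.loads(body)["count"])


def progress(solr_docs: int, es_docs: int) -> float:
    """Return the fraction of Solr documents already present in Elasticsearch."""
    if solr_docs == 0:
        return 1.0  # Nothing to migrate.
    return es_docs / solr_docs


if __name__ == "__main__":
    # Sample bodies for illustration only, not captured from a real cluster.
    solr_body = ('{"responseHeader": {"status": 0}, '
                 '"response": {"numFound": 1000, "start": 0, "docs": []}}')
    es_body = ('{"count": 750, "_shards": {"total": 5, "successful": 5, '
               '"skipped": 0, "failed": 0}}')
    done = progress(solr_num_found(solr_body), es_count(es_body))
    print(f"Migrated: {done:.1%}")
```

Matching counts only indicate that the same number of documents exists on both sides. Spot-check individual records as well to confirm the result.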

Data Migration Process

The following describes the data migration process in the scenario where Solr service data needs to be migrated to Elasticsearch and the source data is stored in HBase, as shown in Figure 1.

Figure 1 Migration procedure