Updated on 2024-11-29 GMT+08:00

Data Migration

Scenario

If you replace Solr with Elasticsearch as the service platform, you need to import the Solr index data into Elasticsearch.

Prerequisites

  • The cluster is running properly.
  • The cluster client has been installed, and the client node has been connected to the Elasticsearch cluster.

Procedure

  1. Migrate the Solr source data from HBase to Elasticsearch in TableScanMR concurrent mode by referring to Migrating HDFS Data Using HDFS2ES.
  2. Verify the import result.

    1. Check whether a table has been imported.

      Run the curl command to query the number of records imported into the table, and compare it with the number of Solr records. If the two numbers are the same, the data import is complete.

      Example:

      • Normal mode:

        curl -XGET "http://EsNode IP address:EsNode port number/my_store1/_search"

      • Security mode:

        curl -XGET --tlsv1.2 --negotiate -k -u : "https://EsNode IP address:EsNode port number/my_store1/_search"

    2. Check whether all tables are imported.

      Run the curl command to list all tables in the Elasticsearch cluster and the number of records in each table. Compare these numbers with the corresponding Solr record counts. If they are the same, the data import is complete.

      Example:

      • Normal mode:

        curl -XGET "http://EsNode IP address:EsNode port number/_cat/indices"

      • Security mode:

        curl -XGET --tlsv1.2 --negotiate -k -u : "https://EsNode IP address:EsNode port number/_cat/indices"

      For details about how to use the curl command, see Running curl Commands in Linux.

  3. Stop Solr-based service applications and migrate them to Elasticsearch.

    After the migration, check whether the services are normal. If they are abnormal, roll back to Solr.

    If the services are normal, retain or delete the Solr service based on your maintenance requirements.
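
The record-count comparison in step 2 can be scripted. The following is a minimal sketch for normal mode, not part of the official tooling. It assumes the standard Elasticsearch _count endpoint (which this procedure does not mention) instead of _search, and the function names extract_count, compare_counts, and check_index are hypothetical. The Solr record count must be obtained separately and passed in. In security mode, the curl options shown in the examples above (--tlsv1.2 --negotiate -k -u :) would also be needed.

```shell
#!/usr/bin/env bash
# Sketch: compare the record count of one Elasticsearch table (index)
# with the record count reported by Solr. All names are illustrative.

# Read a _count response such as {"count":42,"_shards":{...}} from stdin
# and print only the number. Uses grep only, so no JSON parser is required.
extract_count() {
  grep -o '"count"[[:space:]]*:[[:space:]]*[0-9][0-9]*' | grep -o '[0-9][0-9]*$'
}

# Print MATCH or MISMATCH for a table, given both counts.
# Returns a non-zero status on mismatch so callers can stop early.
compare_counts() {
  local index="$1" es_count="$2" solr_count="$3"
  if [ "$es_count" = "$solr_count" ]; then
    echo "MATCH $index $es_count"
  else
    echo "MISMATCH $index es=$es_count solr=$solr_count"
    return 1
  fi
}

# Query Elasticsearch in normal mode (assumed _count endpoint) and compare.
# $1: base URL such as http://<EsNode IP address>:<EsNode port number>
# $2: table (index) name, $3: expected Solr record count
check_index() {
  local es_url="$1" index="$2" solr_count="$3" es_count
  es_count=$(curl -s -XGET "$es_url/$index/_count" | extract_count)
  compare_counts "$index" "$es_count" "$solr_count"
}
```

For example, calling check_index with the EsNode URL, the table name my_store1, and the Solr record count prints a MATCH line only when both counts are equal. A MISMATCH line suggests that the import is incomplete.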

Related Information

  • If some indexes fail to be imported, a json directory is generated in the directory of the data import tool package.

    To re-import the data that failed to be imported, run the input.sh script in the sbin directory (./sbin/input.sh).

  • In the Elasticsearch/tools/elasticsearch-data2es/hbase2es/logs directory of the cluster client, you can view the log information about the import process.
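
The check for failed imports described above can be sketched as follows. This is an illustration under stated assumptions, not a documented utility: the function names are hypothetical, the tool package directory is passed in by the caller, and the script assumes that input.sh is started from that directory as ./sbin/input.sh.

```shell
#!/usr/bin/env bash
# Sketch: re-run the import only if the json directory exists.
# The tool package directory ($1) is supplied by the caller.

# Succeed if a json directory exists under the tool package directory,
# which, per the note above, indicates that some indexes failed to import.
has_failed_imports() {
  local tool_dir="$1"
  [ -d "$tool_dir/json" ]
}

# Re-run input.sh from the tool package directory when failures are present.
retry_failed_imports() {
  local tool_dir="$1"
  if has_failed_imports "$tool_dir"; then
    echo "Failed imports found, re-running input.sh"
    # Assumption: input.sh is run from the tool package directory.
    (cd "$tool_dir" && ./sbin/input.sh)
  else
    echo "No failed imports found"
  fi
}
```

After a re-run, the logs directory mentioned above is the place to confirm whether the retry succeeded.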