
Migrating Data from HBase to MRS

Updated at: Oct 21, 2021 GMT+08:00

Scenarios

This section describes how to migrate data from an HBase cluster in an on-premises IDC or on a public cloud to MRS on HUAWEI CLOUD. The approach suits data volumes of tens of TBs or less. HUAWEI CLOUD CDM is used as the example migration tool.

Figure 1 HBase data migration

HBase stores its data, including HFile and WAL files, in HDFS. The hbase.rootdir configuration item specifies the HDFS path. On MRS, data is stored in the /hbase folder by default.
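
To confirm where a source cluster keeps its HBase data, you can read hbase.rootdir from the cluster's hbase-site.xml file. The following is a minimal sketch, assuming the standard Hadoop-style property layout. The sample XML, host name, and port are hypothetical placeholders, and read_property is an illustrative helper rather than part of HBase or CDM.

```python
import xml.etree.ElementTree as ET

# Hypothetical hbase-site.xml content; the host name and port are placeholders.
SAMPLE_HBASE_SITE = """<?xml version="1.0"?>
<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://namenode.example.com:8020/hbase</value>
  </property>
  <property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
  </property>
</configuration>
"""


def read_property(site_xml, name):
    """Return the value of a Hadoop-style <property>, or None if it is not set."""
    root = ET.fromstring(site_xml)
    for prop in root.iter("property"):
        if prop.findtext("name") == name:
            return prop.findtext("value")
    return None


# Print the HDFS path under which HBase stores HFile and WAL files.
print(read_property(SAMPLE_HBASE_SITE, "hbase.rootdir"))
```

On a real cluster, pass the text of the actual hbase-site.xml file instead of the sample string.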

HBase also provides its own mechanisms and tool commands for migrating data, for example, exporting snapshots, exporting and importing data, and CopyTable. For details, see the Apache HBase official website.
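
For orientation, the following sketch assembles the command lines of two of these HBase tools, ExportSnapshot and CopyTable. It only builds and prints the commands and does not run them. The snapshot name, HDFS address, ZooKeeper quorum, and mapper count are hypothetical placeholders, and the helper functions are illustrative. Check the options against the HBase version in use before relying on them.

```python
import shlex


def export_snapshot_cmd(snapshot, dest_root, mappers=16):
    """Build an ExportSnapshot command that copies a snapshot to another HDFS."""
    return [
        "hbase", "org.apache.hadoop.hbase.snapshot.ExportSnapshot",
        "-snapshot", snapshot,      # snapshot created beforehand in HBase shell
        "-copy-to", dest_root,      # HBase root directory on the destination HDFS
        "-mappers", str(mappers),   # number of parallel copy tasks
    ]


def copy_table_cmd(table, peer_zk, start_ms=None, end_ms=None):
    """Build a CopyTable command that copies a table to a peer cluster."""
    cmd = ["hbase", "org.apache.hadoop.hbase.mapreduce.CopyTable"]
    if start_ms is not None:
        cmd.append(f"--starttime={start_ms}")
    if end_ms is not None:
        cmd.append(f"--endtime={end_ms}")
    # peer_zk has the form quorum:port:znode-parent of the destination cluster.
    cmd += [f"--peer.adr={peer_zk}", table]
    return cmd


# All names and addresses below are placeholders.
print(shlex.join(export_snapshot_cmd("BTable_snap", "hdfs://dest-namenode:8020/hbase")))
print(shlex.join(copy_table_cmd("BTable", "zk1,zk2,zk3:2181:/hbase", end_ms=1587973835000)))
```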

This document describes how to migrate HBase data using the CDM service on HUAWEI CLOUD.

Solution Advantages

Scenario-based migration migrates snapshots and then restores the table data from them, which speeds up migration.

Full Data Migration

  1. Log in to the CDM management console.
  2. Create a CDM cluster. The security group, VPC, and subnet of the CDM cluster must be the same as those of the destination cluster to ensure that the CDM cluster can communicate with the MRS cluster.
  3. On the Cluster Management page, locate the row where the target cluster resides and click Job Management in the Operation column.
  4. On the Link Management tab page, click Create Link and select Hadoop release version.
  5. Add a connection to the source cluster by referring to Creating Links. Set Hadoop type to Apache Hadoop.

    (Optional) Use a user with high permissions to migrate HBase. To do so, click Show Advanced Attributes and add the attribute hadoop.user.name = Username (for example, omm).

    Figure 2 Link to the source cluster

  6. On the Link Management tab page, click Create Link and select Hadoop release version.
  7. Add a connection to the destination cluster by referring to Creating Links. Set Hadoop type to MRS.

    (Optional) Use a user with high permissions to migrate HBase. To do so, click Show Advanced Attributes and add the attribute hadoop.user.name = Username (for example, omm).

    Figure 3 Link to the destination cluster

  8. Choose Job Management > Scenario Migration, and click Create Job.
  9. The job parameter configuration page is displayed. Set the job name and set the migration scenario to HBase migration.
  10. Configure the source and destination job parameters and click Next.

    Figure 4 HBase job configuration

  11. Select the data table to be migrated and click Next.
  12. On the task configuration page that is displayed, keep the default settings and click Save.
  13. Choose Job Management > Scenario Migration and click Run in the Operation column of the job to be executed to start HBase data migration.
  14. After the migration is complete, you can run the same query statement in the source and destination clusters to compare the query results.

    Example:

    • Query the number of records in the BTable table on the source and destination clusters to check whether the number of data records is the same. Add the --endtime parameter to eliminate the impact of data updates on the source cluster during the migration.

      hbase org.apache.hadoop.hbase.mapreduce.RowCounter BTable --endtime=1587973835000

      Figure 5 Querying the number of records in the BTable table
    • In HBase shell, run scan 'BTable', {TIMERANGE => [1587973235000, 1587973835000]} to query the data in a specified period, and compare the results from the two clusters.
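
The --endtime and TIMERANGE values in these examples are timestamps in milliseconds since the Unix epoch. The following sketch shows one way to derive them from a cutoff time and to assemble the two verification commands. The cutoff is chosen so that it reproduces the example value 1587973835000. The helper names are illustrative, and parse_row_count assumes that the RowCounter job output contains a counter line of the form ROWS=<number>.

```python
import re
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_ms(moment):
    """Convert a timezone-aware datetime to epoch milliseconds."""
    if moment.tzinfo is None:
        raise ValueError("moment must be timezone-aware")
    return (moment - EPOCH) // timedelta(milliseconds=1)


def row_counter_cmd(table, end_ms):
    """Build the RowCounter command that counts rows written before end_ms."""
    return f"hbase org.apache.hadoop.hbase.mapreduce.RowCounter {table} --endtime={end_ms}"


def scan_statement(table, start_ms, end_ms):
    """Build the HBase shell scan statement for a time range."""
    return f"scan '{table}', {{TIMERANGE => [{start_ms}, {end_ms}]}}"


def parse_row_count(job_output):
    """Extract the row count. Assumption: the output has a 'ROWS=<n>' counter line."""
    match = re.search(r"\bROWS=(\d+)", job_output)
    if match is None:
        raise ValueError("no ROWS counter found in the output")
    return int(match.group(1))


# Cutoff of 2020-04-27 15:50:35 at GMT+08:00, which equals 1587973835000 ms.
cutoff = datetime(2020, 4, 27, 15, 50, 35, tzinfo=timezone(timedelta(hours=8)))
end_ms = to_epoch_ms(cutoff)
start_ms = end_ms - 10 * 60 * 1000  # ten minutes before the cutoff

print(row_counter_cmd("BTable", end_ms))
print(scan_statement("BTable", start_ms, end_ms))
```

Run the printed commands on both clusters with the same timestamps. If the output matches the assumed format, pass each RowCounter output to parse_row_count and compare the two numbers.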

Incremental Data Migration

If new data is written to the source cluster before the service cutover, you need to periodically migrate it to the destination cluster. Generally, the data volume updated every day is at the GB level. You can use the Entire DB Migration function of CDM to migrate new HBase data every day.

The Entire DB Migration function of CDM does not synchronize data deleted from the source HBase cluster to the destination cluster.

The HBase connector used for scenario migration cannot be reused for Entire DB Migration. Therefore, new HBase connectors are required.

  1. Repeat steps 1 to 7 in Full Data Migration to add two HBase connectors. Select Apache HBase as the connector type for the source cluster and MRS HBase as the connector type for the destination cluster.

    Figure 6 HBase incremental migration link

  2. Choose Job Management > Entire DB Migration, and click Create Job.
  3. On the job parameter configuration page, configure job parameters and click Next.

    • Job Name: Enter a user-defined job name, for example, hbase-increase.
    • Source Job Configuration: Set Source Link Name to the name of the link to the source cluster created in 1, and click Show Advanced Settings to configure the time range for data migration.
    • Destination Job Configuration: Set Destination Link Name to the name of the link to the destination cluster created in 1. Leave other parameters blank.
    Figure 7 HBase incremental migration job configuration

  4. Select the data table to be migrated and click Save.
  5. Choose Job Management > Entire DB Migration and click Run in the Operation column of the job to be executed to start HBase incremental data migration.
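
When the incremental job runs once a day, each run needs a time range that covers exactly one day. The following sketch computes such ranges as epoch milliseconds, the unit that HBase uses for cell timestamps. The time zone, dates, and function name are assumptions for illustration. The format that the CDM time range fields expect is not described in this section, so convert the values as required.

```python
from datetime import date, datetime, time, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def daily_window(day, tz):
    """Return (start_ms, end_ms) covering one calendar day in time zone tz."""
    start = datetime.combine(day, time.min, tzinfo=tz)  # midnight at day start
    end = start + timedelta(days=1)                     # midnight of the next day
    to_ms = lambda moment: (moment - EPOCH) // timedelta(milliseconds=1)
    return to_ms(start), to_ms(end)


# Assumed time zone: GMT+08:00. Compute the ranges for three consecutive days.
tz = timezone(timedelta(hours=8))
first_day = date(2020, 4, 27)
windows = [daily_window(first_day + timedelta(days=i), tz) for i in range(3)]

for start_ms, end_ms in windows:
    print(start_ms, end_ms)
```

Each range ends where the next one begins. HBase time ranges include the start timestamp and exclude the end timestamp, so consecutive ranges interpreted this way neither overlap nor leave gaps.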
