Copying Data

Updated at: Mar 25, 2021 GMT+08:00

Data copy scenarios are classified by the regions of the source and destination clusters and by the network connectivity between them:

Same Region

If the source cluster and destination cluster are in the same region, follow the instructions in Establishing a Data Transmission Channel to configure the network and set up a transmission channel. Then run the following DistCp command to copy the HDFS, HBase, and Hive data files and the Hive metadata backup files from the source cluster to the destination cluster.

$HADOOP_HOME/bin/hadoop distcp <src> <dist> -p

The parameters in the preceding command are described as follows:

  • $HADOOP_HOME: installation directory of the Hadoop client in the destination cluster
  • <src>: HDFS directory of the source cluster
  • <dist>: HDFS directory of the destination cluster
  • -p: preserves file status attributes, such as permission, owner, and group, during the copy
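
As an illustration of how the parameters fit together, the following is a minimal sketch that assembles the command and prints it for review instead of running it. The helper name build_distcp_cmd, the fallback directory /opt/hadoop, and the NameNode host names, port, and paths are hypothetical placeholders, not values taken from this guide.

```shell
#!/usr/bin/env bash
# Sketch only: every host name, port, and path below is a hypothetical placeholder.

# Assemble the DistCp command from the parameters described above.
# $1 = Hadoop client directory, $2 = <src>, $3 = <dist>
build_distcp_cmd() {
  # Same argument order as the documented command: <src> <dist> -p
  printf '%s/bin/hadoop distcp %s %s -p\n' "$1" "$2" "$3"
}

# Print the command for review; run it manually once the values are correct.
build_distcp_cmd "${HADOOP_HOME:-/opt/hadoop}" \
  "hdfs://source-nn:8020/user/hive/warehouse" \
  "hdfs://dest-nn:8020/user/hive/warehouse"
```

Printing the command first makes it easier to confirm the source and destination directories before a long-running copy starts.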

Different Regions

If the source cluster and destination cluster are in different regions, use the DistCp tool to copy the source cluster data to OBS, and then use the OBS cross-region replication function to copy the data to OBS in the region where the destination cluster resides. For details, see Cross-Region Replication. When DistCp copies data to OBS, the permission, owner, and group information of the files cannot be set on OBS. Therefore, export and copy the HDFS metadata together with the data to prevent the loss of HDFS file property information.
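
One possible way to keep the file property information is sketched below. It assumes that the metadata is exported as a saved recursive listing (hdfs dfs -ls -R) and that owner and group are later reapplied with hdfs dfs -chown; this guide does not prescribe a specific export method. The helper name listing_to_chown, the bucket name, the obs:// path form, and all paths are hypothetical. The sketch handles only owner and group for paths without spaces; the permission string remains available in the saved listing.

```shell
#!/usr/bin/env bash
# Sketch only: bucket name, obs:// path form, and all paths are assumptions.

# Convert saved `hdfs dfs -ls -R` lines into chown commands.
# Listing columns: permissions, replicas, owner, group, size, date, time, path.
# Lines that are not file or directory entries (for example "Found N items") are skipped.
listing_to_chown() {
  awk 'NF >= 8 && $1 ~ /^[-d]/ { print "hdfs dfs -chown " $3 ":" $4 " " $8 }'
}

# Typical use, run manually on a node with the Hadoop client installed:
#   hdfs dfs -ls -R /user/hive/warehouse > hdfs_metadata.txt
#   listing_to_chown < hdfs_metadata.txt > restore_owner.sh

# Print the copy-to-OBS command for review instead of running it.
echo "${HADOOP_HOME:-/opt/hadoop}/bin/hadoop distcp" \
  "hdfs://source-nn:8020/user/hive/warehouse" \
  "obs://example-bucket/migration/warehouse"
```

The generated restore_owner.sh would be reviewed and run against the destination cluster after the data has been loaded there.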

Migrating Data from an Offline Cluster to the Cloud

You can use the following method to migrate data from an offline cluster to the cloud.

  • Direct Connect

    Create a Direct Connect connection between the source cluster and destination cluster, enable network communication between the egress gateway of the offline cluster and the online VPC, and then run DistCp to copy the data as described in Same Region.
