Using CopyTable to Import Source Data to an HBase Cluster

CopyTable is a utility provided by HBase. It can copy part or all of a table, either within the same cluster or to another cluster. The target table must already exist. The CloudTable client tool includes CopyTable, so after deploying the client tool you can use CopyTable to import data to a CloudTable cluster.

Using CopyTable to Import Data

Prepare a Linux ECS as the client host and deploy the CloudTable HBase client tool on it.

For details, see Connecting to an HBase Normal Cluster Using HBase Shell.

When deploying the client tool, set the ZK link to the access address (Intranet) of the CloudTable HBase cluster where the source table resides.
(Optional) If you want to copy a table to another cluster, obtain the access address (Intranet) of the target CloudTable HBase cluster.

Choose Cluster Management in the navigation pane. In the cluster list, locate the required cluster and obtain the address in the Access Address (Intranet) column.
Before using CopyTable to copy table data, ensure that the target table exists in the target CloudTable HBase cluster. If the target table does not exist, create it first.

For details about how to create a table, see Getting Started with HBase.
On the client host, open a terminal, go to the hbase directory in the installation directory of the client tool, and run the CopyTable command to import data to the CloudTable cluster.

The following is an example command. It copies the data in TestTable whose timestamps fall within the specified one-hour range (from starttime to endtime, in milliseconds) to the target cluster.
```
cd ${Installation directory of the client tool}/hbase
./bin/hbase org.apache.hadoop.hbase.mapreduce.CopyTable --starttime=1265875194289 --endtime=1265878794289 --peer.adr=${ZK link of the target CloudTable cluster}:/hbase --families=myOldCf:myNewCf,cf2,cf3 TestTable
```
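The two timestamps in the example are Unix times in milliseconds that are exactly 3,600,000 ms (1 hour) apart. As a minimal sketch of how such a pair can be derived, the helper below prints the two options for a window of a given length. The function name time_range_opts and its arguments are illustrative assumptions and are not part of CopyTable or the client tool.

```shell
# Hypothetical helper: print CopyTable time-range options for a window
# that begins at a given start time (Unix ms) and lasts a given number
# of minutes.
time_range_opts() {
  start_ms=$1
  minutes=$2
  # 1 minute = 60 s * 1000 ms
  end_ms=$((start_ms + minutes * 60 * 1000))
  echo "--starttime=${start_ms} --endtime=${end_ms}"
}

# The one-hour window used in the example command above.
time_range_opts 1265875194289 60
# prints: --starttime=1265875194289 --endtime=1265878794289
```

The printed options can then be pasted into the CopyTable command in place of the hard-coded values.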

Overview of the CopyTable Command

The CopyTable command format is as follows:

CopyTable [general options] [--starttime=X] [--endtime=Y] [--new.name=NEW] [--peer.adr=ADR] <tablename>

For details about the CopyTable command, see CopyTable.

Common options are described below:

startrow: the start row
stoprow: the stop row
starttime: beginning of the time range (Unix time in milliseconds). If endtime is not specified, the range extends from starttime with no upper bound.
endtime: end of the time range (Unix time in milliseconds). This option is ignored if starttime is not specified.
versions: number of cell versions to be copied
new.name: name of a new table
peer.adr: address of the target cluster, in the format hbase.zookeeper.quorum:hbase.zookeeper.client.port:zookeeper.znode.parent. For CloudTable HBase clusters, the value is ${ZK link of the target CloudTable cluster}:/hbase.
families: list of column families to be copied. Multiple column families are separated by commas (,). To copy from sourceCfName to destCfName, specify sourceCfName:destCfName. To keep a column family name unchanged after copying, specify only cfName.
all.cells: also copies deletion markers and deleted cells.
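As a sketch of how these options combine, the helper below assembles a CopyTable command line and prints it instead of running it, so the command can be reviewed first. The function name copytable_cmd, the address zk1.example.com:2181, and the table name TestTableCopy are hypothetical placeholders. The option names are the ones listed above.

```shell
# Hypothetical helper: print (do not execute) a CopyTable command line.
#   $1  peer address in quorum:port:znode-parent form,
#       or "" to copy within the same cluster
#   $2  source table name
#   $3+ additional options, for example --new.name=... or --families=...
copytable_cmd() {
  peer=$1
  table=$2
  shift 2
  cmd="./bin/hbase org.apache.hadoop.hbase.mapreduce.CopyTable"
  for opt in "$@"; do
    cmd="$cmd $opt"
  done
  # Add --peer.adr only when a target cluster address was given.
  if [ -n "$peer" ]; then
    cmd="$cmd --peer.adr=$peer"
  fi
  echo "$cmd $table"
}

# Copy within the same cluster to a differently named table.
copytable_cmd "" TestTable --new.name=TestTableCopy

# Copy one renamed column family, including deletion markers,
# to another cluster (placeholder address).
copytable_cmd "zk1.example.com:2181:/hbase" TestTable \
  --families=myOldCf:myNewCf --all.cells
```

Run the printed command from the hbase directory of the client tool once the target table exists.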