Updated on 2022-09-14 GMT+08:00

Java APIs

API Usage Suggestions

  • Use org.apache.hadoop.hbase.Cell rather than org.apache.hadoop.hbase.KeyValue as the key-value (KV) data object.
  • Create connections with Connection connection = ConnectionFactory.createConnection(conf). HTablePool is obsolete and should not be used.
  • Use org.apache.hadoop.hbase.mapreduce rather than org.apache.hadoop.hbase.mapred.
  • Obtain the HBase management object (Admin) by calling the getAdmin() method of the constructed Connection object.

Common HBase APIs

Common HBase Java classes are as follows:

  • Interface Admin: the core management interface of HBase client applications. It encapsulates the HBase management APIs, such as table creation and table deletion. For details, see Table 1.
  • Interface Table: the interface for HBase reads and writes. It encapsulates the APIs for reading and writing HBase tables. For details, see Table 2.

Table 1 org.apache.hadoop.hbase.client.Admin

Method

Description

boolean tableExists(final TableName tableName)

This method is used to check whether a specified table exists. If the table exists in the hbase:meta table, true is returned. Otherwise, false is returned.

HTableDescriptor[] listTables(String regex)

This method is used to list the user tables whose names match the specified regular expression. Two other overloads are available: one takes a Pattern parameter, and the other takes no parameters and returns all user tables.

HTableDescriptor[] listTables(final Pattern pattern, final boolean includeSysTables)

This method is similar to the previous one, but lets you specify whether the result includes system tables. The previous method returns only user tables.

TableName[] listTableNames(String regex)

This method is used to list the names of the user tables that match the specified regular expression. Two other overloads are available: one takes a Pattern parameter, and the other takes no parameters and returns all user tables. This method is similar to listTables. The only difference is that it returns TableName[].

TableName[] listTableNames(final Pattern pattern, final boolean includeSysTables)

This method is similar to the previous one, but lets you specify whether the result includes system tables. The previous method returns only user tables.

void createTable(HTableDescriptor desc)

This method is used to create a table with only one region.

void createTable(HTableDescriptor desc, byte[] startKey, byte[] endKey, int numRegions)

This method is used to create a table with a specified number of regions. The end key of the first region is startKey, and the start key of the last region is endKey. If there are a large number of regions, the method call may time out.

void createTable(final HTableDescriptor desc, byte[][] splitKeys)

This method is used to create a table. The number of regions in the table and the start key of each region are determined by splitKeys. If there are a large number of regions, the method call may time out.

void createTableAsync(final HTableDescriptor desc, final byte[][] splitKeys)

This method is used to create a table. The number of regions in the table and the start key of each region are determined by splitKeys. This method is asynchronous and does not wait for the created table to come online.

void deleteTable(final TableName tableName)

This method is used to delete a specified table.

void truncateTable(final TableName tableName, final boolean preserveSplits)

This method is used to re-create a specified table, which deletes all of its data. If preserveSplits is set to true, the re-created table keeps the region boundaries of the original table. Otherwise, it has only one region.

void enableTable(final TableName tableName)

This method is used to enable a specified table. If the table has a large number of regions, the method call may time out.

void enableTableAsync(final TableName tableName)

This method is used to enable a specified table. It is asynchronous and does not wait for all regions to come online.

void disableTable(final TableName tableName)

This method is used to disable a specified table. If the table has a large number of regions, the method call may time out.

void disableTableAsync(final TableName tableName)

This method is used to disable a specified table. It is asynchronous and does not wait for all regions to go offline.

boolean isTableEnabled(TableName tableName)

This method is used to check whether a table is enabled. This method can be used with the enableTableAsync method to check whether the operation of enabling a table is complete.

boolean isTableDisabled(TableName tableName)

This method is used to check whether a table is disabled. This method can be used with the disableTableAsync method to check whether the operation of disabling a table is complete.

void addColumn(final TableName tableName, final HColumnDescriptor column)

This method is used to add a column family to a specified table.

void deleteColumn(final TableName tableName, final HColumnDescriptor column)

This method is used to delete a specified column family from a specified table.

void modifyColumn(final TableName tableName, final HColumnDescriptor column)

This method is used to modify a specified column family.
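
As an illustration of the splitKeys parameter of createTable in Table 1, the following is a minimal sketch of how a set of split keys might be built. It assumes row keys that begin with a two-character uppercase hexadecimal prefix ("00" to "FF"). The class SplitKeySketch and the method hexSplitKeys are hypothetical helpers, not part of the HBase API. The sketch relies on the fact that N split keys produce N + 1 regions, with each split key becoming the start key of one region.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper (not part of HBase): builds evenly spaced split keys
// for row keys that start with a two-character uppercase hex prefix.
class SplitKeySketch {

    // Returns numRegions - 1 split keys in ascending order.
    // N split keys yield N + 1 regions.
    static byte[][] hexSplitKeys(int numRegions) {
        if (numRegions < 2 || numRegions > 256) {
            throw new IllegalArgumentException("numRegions must be in [2, 256]");
        }
        byte[][] splitKeys = new byte[numRegions - 1][];
        for (int i = 1; i < numRegions; i++) {
            // First prefix value (0..255) that belongs to region i.
            int boundary = i * 256 / numRegions;
            splitKeys[i - 1] = String.format("%02X", boundary)
                    .getBytes(StandardCharsets.UTF_8);
        }
        return splitKeys;
    }

    public static void main(String[] args) {
        // For four regions, prints the three boundaries 40, 80, and C0.
        for (byte[] key : hexSplitKeys(4)) {
            System.out.println(new String(key, StandardCharsets.UTF_8));
        }
    }
}
```

The resulting array could then be passed as the splitKeys argument of createTable(desc, splitKeys). Whether a hex prefix suits a given table depends on its actual row key design.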

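Table 1 notes that isTableEnabled and isTableDisabled can be combined with enableTableAsync and disableTableAsync to check whether the operation has completed. The following is a minimal sketch of such a polling loop with a timeout. PollSketch and waitUntil are hypothetical names, and the condition in main is a stand-in for a real status check, so the sketch runs without a cluster.

```java
import java.util.function.BooleanSupplier;

// Hypothetical helper (not part of HBase): polls a condition until it holds
// or until the timeout elapses.
class PollSketch {

    // Returns true if the condition became true before the timeout,
    // false on timeout or if the waiting thread was interrupted.
    static boolean waitUntil(BooleanSupplier condition, long timeoutMs, long intervalMs) {
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() >= deadline) {
                return false;
            }
            try {
                Thread.sleep(intervalMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                return false;
            }
        }
    }

    public static void main(String[] args) {
        // Stand-in for a table that comes online after about 200 ms.
        long readyAt = System.currentTimeMillis() + 200;
        boolean enabled = waitUntil(() -> System.currentTimeMillis() >= readyAt, 2_000, 50);
        System.out.println("enabled = " + enabled);
    }
}
```

In a real client, the condition would call admin.isTableEnabled(tableName) after enableTableAsync(tableName). Because that call declares IOException, the lambda would need to catch and wrap the exception.
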
Table 2 org.apache.hadoop.hbase.client.Table

Method

Description

boolean exists(Get get)

This method is used to check whether the specified RowKey exists in the table.

boolean[] existsAll(List<Get> gets)

This method is used to check whether the specified RowKeys exist in the table. Each element of the returned Boolean array corresponds to the Get at the same position in the input list.

Result get(Get get)

This method is used to read data based on the specified RowKey.

Result[] get(List<Get> gets)

This method is used to read data in batches by specifying a batch of RowKeys.

ResultScanner getScanner(Scan scan)

This method is used to obtain a scanner object for the table. Query parameters such as StartRow, StopRow, and caching are specified through the input parameter scan.

void put(Put put)

This method is used to write a data record to the table.

void put(List<Put> puts)

This method is used to write a batch of data records to the table.

void close()

This method is used to release the resources held by the table object.
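
To make the StartRow and StopRow parameters of getScanner in Table 2 more concrete, the following sketch models a scan over sorted row keys. By default, an HBase scan includes StartRow and excludes StopRow. ScanRangeSketch is a hypothetical in-memory stand-in and does not call HBase. It uses String keys, which sort the same way as HBase byte-ordered row keys only under the assumption that the keys are plain ASCII.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Hypothetical in-memory stand-in for a table sorted by row key.
// It is not HBase code. It only models the range semantics of a scan.
class ScanRangeSketch {

    // Returns the row keys in [startRow, stopRow): start inclusive,
    // stop exclusive, in ascending key order.
    static List<String> scan(TreeMap<String, String> rows, String startRow, String stopRow) {
        return new ArrayList<>(rows.subMap(startRow, true, stopRow, false).keySet());
    }

    public static void main(String[] args) {
        TreeMap<String, String> rows = new TreeMap<>();
        rows.put("row1", "a");
        rows.put("row2", "b");
        rows.put("row3", "c");
        rows.put("row4", "d");
        // prints [row2, row3]: row4 is the stop row and is excluded
        System.out.println(scan(rows, "row2", "row4"));
    }
}
```

To include the last wanted row in a real scan, a common approach is to choose a StopRow that sorts immediately after it.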

Table 1 and Table 2 list only some common methods. F