Updated on 2024-04-02 GMT+08:00

Creating a Table

Function

Create a table by calling the createTable method of the org.apache.hadoop.hbase.client.Admin object, specifying the table name and column family name. A table can be created in either of the following modes. Creating a table with preassigned regions is strongly recommended.

  • Quickly create a table. The new table contains only one region, which is split into multiple regions as the data volume grows.
  • Create a table with preassigned regions. Multiple regions are assigned when the table is created, which speeds up writes at the start of a large data load.
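
As an illustration of preassigning regions, the following minimal sketch computes evenly spaced split keys in plain Java. It assumes that RowKeys start with a two-character lowercase hexadecimal prefix (for example, a hash prefix), which the original text does not state. The class name HexSplitSketch and the method hexSplits are hypothetical and are not part of the HBase API or the sample project.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of HBase or the sample project.
final class HexSplitSketch {

    // Returns (regions - 1) split keys that divide the prefix range 00..ff evenly.
    // Assumption: every RowKey starts with a two-character lowercase hex prefix.
    static byte[][] hexSplits(int regions) {
        if (regions < 2 || regions > 256) {
            throw new IllegalArgumentException("regions must be between 2 and 256");
        }
        byte[][] splits = new byte[regions - 1][];
        for (int i = 1; i < regions; i++) {
            // Boundary i sits at i/regions of the 256 possible prefixes.
            String boundary = String.format("%02x", i * 256 / regions);
            splits[i - 1] = boundary.getBytes(StandardCharsets.UTF_8);
        }
        return splits;
    }

    public static void main(String[] args) {
        for (byte[] split : hexSplits(4)) {
            // Prints 40, 80, and c0 on separate lines.
            System.out.println(new String(split, StandardCharsets.UTF_8));
        }
    }
}
```

An array produced this way could be passed as the split-key argument when the table is created. Whether evenly spaced keys suit a workload depends on how the RowKeys are actually distributed.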

The column names and column family names of an HBase table consist of letters, digits, and underscores (_) and cannot contain any special characters.
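
The naming rule above can be checked before a table is created. The following is a minimal sketch in plain Java, assuming that "letters" means ASCII letters. The class name HBaseNameCheck and the method isValidName are hypothetical and are not part of the HBase API.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of HBase or the sample project.
final class HBaseNameCheck {

    // Assumption: a valid name is one or more ASCII letters, digits, or underscores.
    private static final Pattern VALID_NAME = Pattern.compile("[A-Za-z0-9_]+");

    // Returns true only if the whole name matches the rule.
    static boolean isValidName(String name) {
        return name != null && VALID_NAME.matcher(name).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValidName("info"));       // true
        System.out.println(isValidName("user_info1")); // true
        System.out.println(isValidName("user-info"));  // false: contains a hyphen
    }
}
```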

Example Code

The following code snippet belongs to the testCreateTable method in the HBaseSample class of the com.huawei.bigdata.hbase.examples package.

public void testCreateTable() {
             LOG.info("Entering testCreateTable.");
             // Specify the table descriptor.
             TableDescriptorBuilder htd = TableDescriptorBuilder.newBuilder(tableName); // (1)

             // Set the column family name to info.
             ColumnFamilyDescriptorBuilder hcd = ColumnFamilyDescriptorBuilder.newBuilder(Bytes.toBytes("info")); // (2)

             // Set the data block encoding. HBase provides DIFF, FAST_DIFF, and PREFIX.
             hcd.setDataBlockEncoding(DataBlockEncoding.FAST_DIFF);
             // Set the compression algorithm. HBase provides two default compression
             // algorithms: GZ and SNAPPY.
             // GZ has the highest compression ratio but low compression and
             // decompression efficiency. It suits cold data.
             // SNAPPY has a lower compression ratio but high compression and
             // decompression efficiency. It suits hot data.
             // SNAPPY is recommended.
             hcd.setCompressionType(Compression.Algorithm.SNAPPY); // Note [1]
             htd.setColumnFamily(hcd.build()); // (3)
             Admin admin = null; 
             try {
               // Instantiate an Admin object.
               admin = conn.getAdmin(); // (4)
               if (!admin.tableExists(tableName)) {
                 LOG.info("Creating table...");
                 admin.createTable(htd.build()); // Note [2] (5)
                 LOG.info(admin.getClusterMetrics().toString());
                 // Arrays.toString logs the array contents instead of the array reference.
                 LOG.info(java.util.Arrays.toString(admin.listNamespaceDescriptors()));
                 LOG.info("Table created successfully.");
               } else {
                 LOG.warn("table already exists");
               }
             } catch (IOException e) {
                 LOG.error("Create table failed " ,e);
             } finally {
               if (admin != null) {
                 try {
                   // Close the Admin object.
                   admin.close();
                 } catch (IOException e) {
                   LOG.error("Failed to close admin " ,e);
                 }
               }
             }
             LOG.info("Exiting testCreateTable.");
  }     

Explanation

1. Create a table descriptor.

2. Create a column family descriptor.

3. Add the column family descriptor to the table descriptor.

4. Obtain an Admin object. The Admin object provides functions for creating a table, creating a column family, checking whether a table exists, modifying the table structure, modifying the column family structure, and deleting a table.

5. Invoke the table creation method of Admin.

Precautions

  • 1. The encoding and compression modes of a column family can be set. The code snippets are as follows:
    // Set the encoding algorithm. HBase supports the DIFF, FAST_DIFF, and PREFIX encoding algorithms.
     hcd.setDataBlockEncoding(DataBlockEncoding.FAST_DIFF);
      
     // Set the file compression mode. HBase provides the GZ and SNAPPY compression algorithms by default. 
     // GZ provides a high compression rate but low compression and decompression performance. GZ is suitable for cold data. 
     // SNAPPY provides a low compression rate but high compression and decompression performance. SNAPPY is suitable for hot data. 
     // It is recommended that SNAPPY be enabled by default. 
     hcd.setCompressionType(Compression.Algorithm.SNAPPY);
  • 2. A table can be created by specifying the start and end RowKeys or preassigning regions using RowKey arrays. The code snippets are as follows:
    // Create a table whose regions are preassigned.
     byte[][] splits = new byte[4][]; 
     splits[0] = Bytes.toBytes("A"); 
     splits[1] = Bytes.toBytes("H"); 
     splits[2] = Bytes.toBytes("O"); 
     splits[3] = Bytes.toBytes("U"); 
     admin.createTable(htd.build(), splits);
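
To show how the split keys in the snippet above partition the key space, the following minimal sketch works out which region a RowKey falls into. Four split keys yield five regions. Each region includes its start key and excludes its end key, and keys are compared as unsigned bytes in lexicographic order. The class SplitKeyDemo and its methods are hypothetical and use only the JDK (Java 9 or later for Arrays.compareUnsigned). They stand in for the lookup that HBase performs internally and are not part of its API.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical illustration, not part of HBase or the sample project.
final class SplitKeyDemo {

    // Encodes a string as UTF-8 bytes, as Bytes.toBytes(String) does in HBase.
    static byte[] bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Returns the zero-based index of the region that holds rowKey.
    // Assumption: splits is sorted in ascending unsigned byte order.
    static int regionIndex(byte[][] splits, byte[] rowKey) {
        int index = 0;
        for (byte[] split : splits) {
            if (Arrays.compareUnsigned(rowKey, split) >= 0) {
                index++; // rowKey is at or beyond this boundary
            } else {
                break;   // the first larger boundary ends the search
            }
        }
        return index;
    }

    public static void main(String[] args) {
        byte[][] splits = { bytes("A"), bytes("H"), bytes("O"), bytes("U") };
        System.out.println(regionIndex(splits, bytes("9abc")));  // 0: before "A"
        System.out.println(regionIndex(splits, bytes("Apple"))); // 1: in ["A", "H")
        System.out.println(regionIndex(splits, bytes("H")));     // 2: a split key starts its region
        System.out.println(regionIndex(splits, bytes("Zebra"))); // 4: at or after "U"
    }
}
```

If most RowKeys fall into the same range, for example all of them start with a lowercase letter, they land in one region and the preassignment brings little benefit, so split keys are best chosen to match the expected RowKey distribution.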