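Creating an HBase Table
Function Description
HBase creates a table through the createTable method of the org.apache.hadoop.hbase.client.Admin object, which takes the table name and the column family names. A table can be created in two ways; the pre-split Region approach is recommended: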
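- Quick creation: the table has a single Region when it is created and is automatically split into multiple Regions as the data volume grows.
- Pre-split creation: multiple Regions are allocated when the table is created, which improves write throughput during the initial phase of loading a large volume of data.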
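The pre-split approach needs a set of split keys that mark the Region boundaries. The following is a minimal sketch of how such keys might be computed. It assumes row keys begin with a two-digit lowercase hexadecimal prefix ("00" to "ff"); that key layout, the class name SplitKeySketch, and the method hexSplitKeys are hypothetical and are not part of the sample project. The resulting array can be passed to the Admin.createTable(TableDescriptor, byte[][]) overload.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of the sample project.
final class SplitKeySketch {

    // Returns numRegions - 1 evenly spaced split keys over the prefix range "00".."ff".
    // Assumption: row keys start with a two-digit lowercase hex prefix.
    static byte[][] hexSplitKeys(int numRegions) {
        if (numRegions < 2 || numRegions > 256) {
            throw new IllegalArgumentException("numRegions must be between 2 and 256");
        }
        byte[][] splitKeys = new byte[numRegions - 1][];
        for (int i = 1; i < numRegions; i++) {
            // Boundary i sits at i/numRegions of the 0..255 prefix space.
            int boundary = i * 256 / numRegions;
            splitKeys[i - 1] = String.format("%02x", boundary).getBytes(StandardCharsets.UTF_8);
        }
        return splitKeys;
    }

    public static void main(String[] args) {
        for (byte[] key : hexSplitKeys(4)) {
            System.out.println(new String(key, StandardCharsets.UTF_8));
        }
        // With the HBase client on the classpath, the keys would be used as:
        //   admin.createTable(htd, SplitKeySketch.hexSplitKeys(4));
    }
}
```

Evenly spaced keys only balance the load when the row key prefixes are themselves evenly distributed; for other key layouts the split points should follow the actual key distribution.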
Column names and column family names must not contain special characters; they may consist only of letters, digits, and underscores.
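The naming rule can be checked before the table is created. The sketch below encodes only the rule as stated here (letters, digits, and underscores); the class NameCheckSketch and the method isLegalName are hypothetical, and HBase's own validation may accept or reject a different set of characters.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of the sample project or the HBase API.
final class NameCheckSketch {

    // One or more ASCII letters, digits, or underscores.
    private static final Pattern LEGAL_NAME = Pattern.compile("[A-Za-z0-9_]+");

    // Returns true if the name follows the rule stated in this document.
    static boolean isLegalName(String name) {
        return name != null && LEGAL_NAME.matcher(name).matches();
    }

    public static void main(String[] args) {
        System.out.println(isLegalName("info"));      // letters only
        System.out.println(isLegalName("user_2024")); // letters, digits, underscore
        System.out.println(isLegalName("bad-name!")); // contains special characters
    }
}
```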
Sample Code
The following code snippet is in the testCreateTable method of the "HBaseExample" class in the com.huawei.bigdata.hbase.examples package.
public void testCreateTable() {
    LOG.info("Entering testCreateTable: " + tableName);

    // Set the column family name to info.
    byte[] fam = Bytes.toBytes("info");
    ColumnFamilyDescriptor familyDescriptor = ColumnFamilyDescriptorBuilder.newBuilder(fam)
        // Set the data block encoding. HBase provides DIFF, FAST_DIFF and PREFIX.
        // HBase 2.0 removed the PREFIX_TREE data block encoding from column families.
        .setDataBlockEncoding(DataBlockEncoding.FAST_DIFF)
        // Set the compression method. HBase provides two default compression
        // methods: GZ and SNAPPY.
        // GZ has a higher compression ratio but lower compression and
        // decompression efficiency, so it suits cold data.
        // SNAPPY has a lower compression ratio but higher compression and
        // decompression efficiency, so it suits hot data.
        // SNAPPY is recommended.
        .setCompressionType(Compression.Algorithm.SNAPPY)
        .build();

    TableDescriptor htd = TableDescriptorBuilder.newBuilder(tableName)
        .setColumnFamily(familyDescriptor)
        .build();

    Admin admin = null;
    try {
        // Instantiate an Admin object.
        admin = conn.getAdmin();
        if (!admin.tableExists(tableName)) {
            LOG.info("Creating table...");
            admin.createTable(htd);
            LOG.info(admin.getClusterMetrics());
            LOG.info(admin.listNamespaceDescriptors());
            LOG.info("Table created successfully.");
        } else {
            LOG.warn("Table already exists.");
        }
    } catch (IOException e) {
        LOG.error("Create table failed.", e);
    } finally {
        if (admin != null) {
            try {
                // Close the Admin object.
                admin.close();
            } catch (IOException e) {
                LOG.error("Failed to close admin.", e);
            }
        }
    }
    LOG.info("Exiting testCreateTable.");
}