Updated on 2024-04-29 GMT+08:00

Enabling a Term Index When Creating a Table

Function Description

Create the table by following the instructions in Creating a Table, and additionally configure a term index schema in the table properties.

Sample Code

public void testCreateTable() {
  LOG.info("Entering testCreateTable.");
  HTableDescriptor tableDesc = new HTableDescriptor(tableName);
  HColumnDescriptor cdm = new HColumnDescriptor(FAM_M);
  cdm.setDataBlockEncoding(DataBlockEncoding.FAST_DIFF);
  tableDesc.addFamily(cdm);
  HColumnDescriptor cdn = new HColumnDescriptor(FAM_N);
  cdn.setDataBlockEncoding(DataBlockEncoding.FAST_DIFF);
  tableDesc.addFamily(cdn);

  // Add bitmap index definitions.
  List<BitmapIndexDescriptor> bitmaps = new ArrayList<>();//(1)
  bitmaps.add(BitmapIndexDescriptor.builder()
    // Describe which column should be indexed.
    .setColumnName(FamilyOnlyName.valueOf(FAM_M))//(2)
    // Describe how to extract term(s) from KeyValue
    .setTermExtractor(TermExtractor.NAME_VALUE_EXTRACTOR)//(3)
    .build());
  // enableAutoIndex adds the required index properties to the HTableDescriptor.
  // SHARD_NUM must be less than or equal to the number of regions; see (4).
  IndexHelper.enableAutoIndex(tableDesc, SHARD_NUM, bitmaps);//(4)

  List<byte[]> splitList = Arrays.stream(SPLIT.split(LemonConstants.COMMA))
    .map(s -> org.lemon.common.Bytes.toBytes(s.trim()))
    .collect(Collectors.toList());
  byte[][] splitArray = splitList.toArray(new byte[splitList.size()][]);

  Admin admin = null;
  try {
    // Instantiate an Admin object.
    admin = conn.getAdmin();
    if (!admin.tableExists(tableName)) {
      LOG.info("Creating table...");
      admin.createTable(tableDesc, splitArray);
      LOG.info(admin.getClusterStatus());
      LOG.info(admin.listNamespaceDescriptors());
      LOG.info("Table created successfully.");
    } else {
      LOG.warn("table already exists");
    }
  } catch (IOException e) {
    LOG.error("Create table failed.", e);
  } finally {
    if (admin != null) {
      try {
        // Close the Admin object.
        admin.close();
      } catch (IOException e) {
        LOG.error("Failed to close admin ", e);
      }
    }
  }
  LOG.info("Exiting testCreateTable.");
}
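
The split-key handling in the sample can be tried in isolation. The following is a minimal sketch, not part of the product sample: it assumes that LemonConstants.COMMA is "," and that org.lemon.common.Bytes.toBytes encodes strings as UTF-8, and it replaces both with plain JDK calls. The class name, the method names, and the sample input are hypothetical. The region count relies on the HBase behavior that N split keys pre-split a table into N + 1 regions, and the shard check uses the less-than-or-equal bound stated in precaution (4), assuming that bound is measured against the region count.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

/** Illustrative helper (hypothetical names); uses only the JDK. */
final class SplitKeySketch {

  // Converts a comma-separated string into the byte[][] form that
  // Admin.createTable(tableDesc, splitKeys) accepts.
  // Assumption: UTF-8 stands in for org.lemon.common.Bytes.toBytes.
  static byte[][] toSplitKeys(String commaSeparated) {
    List<byte[]> keys = Arrays.stream(commaSeparated.split(","))
        .map(s -> s.trim().getBytes(StandardCharsets.UTF_8))
        .collect(Collectors.toList());
    return keys.toArray(new byte[keys.size()][]);
  }

  // N split keys pre-split a table into N + 1 regions.
  static int regionCount(byte[][] splitKeys) {
    return splitKeys.length + 1;
  }

  // Assumption: the shard bound in precaution (4) is compared with the region count.
  static boolean shardNumFits(int shardNum, byte[][] splitKeys) {
    return shardNum > 0 && shardNum <= regionCount(splitKeys);
  }

  public static void main(String[] args) {
    byte[][] keys = toSplitKeys("1, 2, 3"); // hypothetical split points
    System.out.println("split keys: " + keys.length);
    System.out.println("regions: " + regionCount(keys));
    System.out.println("shardNum 4 fits: " + shardNumFits(4, keys));
  }
}
```

Running a check like shardNumFits before calling enableAutoIndex makes a mismatch between SHARD_NUM and the split configuration visible before the table is created.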

Precautions

  • (1) A BitmapIndexDescriptor specifies which columns are indexed and which rule is used to extract terms from them. A data table can define one or more BitmapIndexDescriptor objects.
  • (2) Defines the columns from which terms are extracted. The options are as follows:
    • ExplicitColumnName: Specifies a column.
    • FamilyOnlyName: Indicates all columns in a column family.
    • PrefixColumnName: Indicates columns with a specific prefix.
  • (3) Defines a rule for extracting terms from columns. The options are as follows:
    • QualifierExtractor: Indicates that terms are extracted by column name.

      For example, if qualifier is Male and value is 1, the extracted term is Male.

    • QualifierValueExtractor: Indicates that terms are extracted by column name and value.

      For example, if qualifier is education and value is master, the extracted term is education:master.

    • QualifierArrayValueExtractor: Indicates that multiple terms can be extracted. The value is in JSON array format.

      For example, if qualifier is hobby and value is ["basketball","football","volleyball"], the extracted terms are as follows:

      hobby:basketball
      hobby:football
      hobby:volleyball

    • QualifierMapValueExtractor: Indicates that multiple terms can be extracted. The value is in JSON map format.

      For example, if qualifier is hobby and value is {"basketball":"9","football":"8","volleyball":"7"}, the extracted terms are as follows:

      hobby:basketball
      hobby:football
      hobby:volleyball
      hobby:basketball_9
      hobby:football_8
      hobby:volleyball_7

  • (4) SHARD_NUM, the number of shards in the index table, must be less than or equal to the number of regions in the data table.
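
The extraction rules in (3) can be illustrated with a small standalone sketch. It only reproduces the documented examples in plain Java and is not the product implementation: the class and method names are hypothetical, and the JSON array and JSON map values are assumed to be parsed already into a List and a Map, so no JSON library is needed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only (hypothetical names); mimics the documented term formats. */
final class TermExtractionSketch {

  // QualifierExtractor rule: the term is the column name; the value is ignored.
  static String byQualifier(String qualifier, String value) {
    return qualifier;
  }

  // QualifierValueExtractor rule: the term is "qualifier:value".
  static String byQualifierValue(String qualifier, String value) {
    return qualifier + ":" + value;
  }

  // QualifierArrayValueExtractor rule: one "qualifier:element" term per array element.
  static List<String> byArrayValue(String qualifier, List<String> elements) {
    List<String> terms = new ArrayList<>();
    for (String element : elements) {
      terms.add(qualifier + ":" + element);
    }
    return terms;
  }

  // QualifierMapValueExtractor rule: "qualifier:key" for every key,
  // followed by "qualifier:key_value" for every entry.
  static List<String> byMapValue(String qualifier, Map<String, String> entries) {
    List<String> terms = new ArrayList<>();
    for (String key : entries.keySet()) {
      terms.add(qualifier + ":" + key);
    }
    for (Map.Entry<String, String> entry : entries.entrySet()) {
      terms.add(qualifier + ":" + entry.getKey() + "_" + entry.getValue());
    }
    return terms;
  }

  public static void main(String[] args) {
    System.out.println(byQualifier("Male", "1"));
    System.out.println(byQualifierValue("education", "master"));
    System.out.println(byArrayValue("hobby",
        Arrays.asList("basketball", "football", "volleyball")));

    // LinkedHashMap keeps insertion order so the output order is stable.
    Map<String, String> scores = new LinkedHashMap<>();
    scores.put("basketball", "9");
    scores.put("football", "8");
    scores.put("volleyball", "7");
    System.out.println(byMapValue("hobby", scores));
  }
}
```

The map rule yields two terms per entry, so a value with three keys produces six terms, which matches the QualifierMapValueExtractor example above.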