Enabling a Term Index When Creating a Table
Function Description
Create the table by following the instructions in Creating a Table. In addition, configure a term index schema in the table properties.
Sample Code
    public void testCreateTable() {
        LOG.info("Entering testCreateTable.");
        HTableDescriptor tableDesc = new HTableDescriptor(tableName);
        HColumnDescriptor cdm = new HColumnDescriptor(FAM_M);
        cdm.setDataBlockEncoding(DataBlockEncoding.FAST_DIFF);
        tableDesc.addFamily(cdm);
        HColumnDescriptor cdn = new HColumnDescriptor(FAM_N);
        cdn.setDataBlockEncoding(DataBlockEncoding.FAST_DIFF);
        tableDesc.addFamily(cdn);

        // Add bitmap index definitions.
        List<BitmapIndexDescriptor> bitmaps = new ArrayList<>(); // (1)
        bitmaps.add(BitmapIndexDescriptor.builder()
            // Describe which column should be indexed.
            .setColumnName(FamilyOnlyName.valueOf(FAM_M)) // (2)
            // Describe how to extract term(s) from KeyValue
            .setTermExtractor(TermExtractor.NAME_VALUE_EXTRACTOR) // (3)
            .build());

        // It will help to add several properties into HTableDescriptor.
        // SHARD_NUM should be less than the region number
        IndexHelper.enableAutoIndex(tableDesc, SHARD_NUM, bitmaps); // (4)

        List<byte[]> splitList = Arrays.stream(SPLIT.split(LemonConstants.COMMA))
            .map(s -> org.lemon.common.Bytes.toBytes(s.trim()))
            .collect(Collectors.toList());
        byte[][] splitArray = splitList.toArray(new byte[splitList.size()][]);

        Admin admin = null;
        try {
            // Instantiate an Admin object.
            admin = conn.getAdmin();
            if (!admin.tableExists(tableName)) {
                LOG.info("Creating table...");
                admin.createTable(tableDesc, splitArray);
                LOG.info(admin.getClusterStatus());
                LOG.info(admin.listNamespaceDescriptors());
                LOG.info("Table created successfully.");
            } else {
                LOG.warn("table already exists");
            }
        } catch (IOException e) {
            LOG.error("Create table failed.", e);
        } finally {
            if (admin != null) {
                try {
                    // Close the Admin object.
                    admin.close();
                } catch (IOException e) {
                    LOG.error("Failed to close admin ", e);
                }
            }
        }
        LOG.info("Exiting testCreateTable.");
    }
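The sample derives the pre-split points from a comma-separated string. The following is a minimal, standalone sketch of that step in plain Java. The class and method names are hypothetical, UTF-8 encoding is assumed in place of org.lemon.common.Bytes.toBytes, and, unlike the sample, blank entries are skipped.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper for illustration only; it is not part of the index library.
final class SplitKeySketch {

    // Converts a string such as "d, h, m" into split keys encoded as UTF-8 bytes.
    static byte[][] parseSplitKeys(String commaSeparated) {
        return Arrays.stream(commaSeparated.split(","))
                .map(String::trim)
                .filter(s -> !s.isEmpty()) // assumption: blank entries are ignored
                .map(s -> s.getBytes(StandardCharsets.UTF_8))
                .toArray(byte[][]::new);
    }

    public static void main(String[] args) {
        byte[][] keys = parseSplitKeys("d, h, m");
        // A table created with N split keys starts with N + 1 regions.
        System.out.println("split keys: " + keys.length + ", regions: " + (keys.length + 1));
        // prints "split keys: 3, regions: 4"
    }
}
```

Because the number of split keys determines the initial number of regions, it may be useful to compute that number before choosing SHARD_NUM.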
Precautions
- (1) A BitmapIndexDescriptor specifies which columns are indexed and which rule is used to extract terms from them. A data table can define one or more BitmapIndexDescriptor objects.
- (2) Specifies the columns from which terms are extracted. The options are as follows:
  - ExplicitColumnName: Specifies a single column.
  - FamilyOnlyName: Indicates all columns in a column family.
  - PrefixColumnName: Indicates columns with a specific prefix.
- (3) Defines a rule for extracting terms from columns. The options are as follows:
  - QualifierExtractor: Indicates that terms are extracted by column name.
    For example, if qualifier is Male and value is 1, the extracted term is Male.
  - QualifierValueExtractor: Indicates that terms are extracted by column name and value.
    For example, if qualifier is education and value is master, the extracted term is education:master.
  - QualifierArrayValueExtractor: Indicates that multiple terms can be extracted. The value is in JSON array format.
    For example, if qualifier is hobby and value is ["basketball","football","volleyball"], the extracted terms are as follows:
    hobby:basketball, hobby:football, hobby:volleyball
  - QualifierMapValueExtractor: Indicates that multiple terms can be extracted. The value is in JSON map format.
    For example, if qualifier is hobby and value is {"basketball":"9","football":"8","volleyball":"7"}, the extracted terms are as follows:
    hobby:basketball, hobby:football, hobby:volleyball, hobby:basketball_9, hobby:football_8, hobby:volleyball_7
- (4) The number of shards (SHARD_NUM) in the index table must be less than or equal to the number of regions in the data table.
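
The extraction rules listed under (3) can be illustrated without the index library. The following is a minimal sketch in plain Java that reproduces the example terms given above. The class and method names are hypothetical and are not the library's API, and the regex-based parsing is a simplification that assumes flat JSON containing only string values.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration of the four extraction rules; not the library's implementation.
final class TermExtractionSketch {
    // Matches one quoted string, for example "basketball".
    private static final Pattern QUOTED = Pattern.compile("\"([^\"]*)\"");
    // Matches one "key":"value" entry of a flat JSON map.
    private static final Pattern ENTRY =
            Pattern.compile("\"([^\"]*)\"\\s*:\\s*\"([^\"]*)\"");

    // QualifierExtractor rule: the term is the column name; the value is ignored.
    static List<String> byQualifier(String qualifier, String value) {
        return Collections.singletonList(qualifier);
    }

    // QualifierValueExtractor rule: the term is "qualifier:value".
    static List<String> byQualifierValue(String qualifier, String value) {
        return Collections.singletonList(qualifier + ":" + value);
    }

    // QualifierArrayValueExtractor rule: one "qualifier:element" term per array element.
    static List<String> byArrayValue(String qualifier, String jsonArray) {
        List<String> terms = new ArrayList<>();
        Matcher m = QUOTED.matcher(jsonArray);
        while (m.find()) {
            terms.add(qualifier + ":" + m.group(1));
        }
        return terms;
    }

    // QualifierMapValueExtractor rule: "qualifier:key" for every key,
    // followed by "qualifier:key_value" for every entry.
    static List<String> byMapValue(String qualifier, String jsonMap) {
        List<String> keyTerms = new ArrayList<>();
        List<String> entryTerms = new ArrayList<>();
        Matcher m = ENTRY.matcher(jsonMap);
        while (m.find()) {
            keyTerms.add(qualifier + ":" + m.group(1));
            entryTerms.add(qualifier + ":" + m.group(1) + "_" + m.group(2));
        }
        keyTerms.addAll(entryTerms);
        return keyTerms;
    }

    public static void main(String[] args) {
        System.out.println(byQualifier("Male", "1"));
        System.out.println(byQualifierValue("education", "master"));
        System.out.println(byArrayValue("hobby",
                "[\"basketball\",\"football\",\"volleyball\"]"));
        System.out.println(byMapValue("hobby",
                "{\"basketball\":\"9\",\"football\":\"8\",\"volleyball\":\"7\"}"));
    }
}
```

Comparing the four outputs side by side may help when choosing a rule: the first two produce exactly one term per cell, while the array and map rules produce several.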