Using a Secondary Index
Scenario
HIndex enables HBase to index data by specific column values, which makes retrieving data by those values faster.
Constraints
- Column families are separated by semicolons (;).
- Columns and data types must be contained in square brackets ([]).
- The column data type is specified by using -> after the column name.
- If the column data type is not specified, the default data type (string) is used.
- The number sign (#) is used to separate two index details.
- The following parameter is optional:
-Dscan.caching: number of rows to cache when the data table is scanned. The default value is 1000.
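The separator rules above can be illustrated by assembling an index specification piece by piece. This is a minimal sketch, not an official tool: the index, column family, and column names are the placeholders used in the command table of this page, and the variable names (cf1_cols, idx1, index_specs, and so on) are chosen only for illustration.

```shell
# Sketch: assemble an HIndex index specification from its parts.
# IDX1, IDX2, cf1, cf2 and q1..q5 are placeholder names, not real objects.

# Each column sits in square brackets; "->" attaches a data type.
# A column without a type ([q2], [q3]) falls back to the default, string.
cf1_cols='[q1->String],[q2],[q3]'
cf2_cols='[q1->Integer],[q2->Long]'

# Column families inside one index are separated by semicolons.
idx1="IDX1=>cf1:${cf1_cols};cf2:${cf2_cols}"
idx2='IDX2=>cf1:[q5]'

# Two index details are separated by the number sign.
index_specs="${idx1}#${idx2}"

echo "${index_specs}"
```

The resulting string is what would be passed, in single quotes, as the value of -Dindexspecs.to.add.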
Procedure
- Install the HBase client. For details, see Using an HBase Client.
- Go to the client installation directory, for example, /opt/client.
cd /opt/client
- Run the following command to configure environment variables:
source bigdata_env
- If the cluster is in security mode, run the following command to authenticate the user, replacing Component service user with the name of your component service user. In normal mode, user authentication is not required.
kinit Component service user
- Run the following command to access HIndex:
hbase org.apache.hadoop.hbase.hindex.mapreduce.TableIndexer
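The steps above can be combined into one shell function, sketched below. The function names (run, start_table_indexer) and the variables CLIENT_DIR, KINIT_USER, and DRY_RUN are hypothetical helpers introduced for this example; only the cd, source, kinit, and hbase commands come from the procedure itself. Setting DRY_RUN=1 prints the commands instead of running them, which is a convenient way to check the sequence on a machine without a client installed.

```shell
# Sketch: the procedure as a single function.
# CLIENT_DIR, KINIT_USER and DRY_RUN are hypothetical variables for this example.

run() {
  # In dry-run mode print the command; otherwise execute it in this shell.
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

start_table_indexer() {
  # Go to the client installation directory (default taken from the example path).
  run cd "${CLIENT_DIR:-/opt/client}" || return 1
  # Configure environment variables.
  run source bigdata_env || return 1
  # Authenticate only when a user is given, that is, in security mode.
  if [ -n "${KINIT_USER:-}" ]; then
    run kinit "${KINIT_USER}" || return 1
  fi
  # Access HIndex; any extra arguments are passed through unchanged.
  run hbase org.apache.hadoop.hbase.hindex.mapreduce.TableIndexer "$@"
}
```

For example, in a normal-mode cluster, calling start_table_indexer -Dtablename.to.index=table1 -Dindexnames.to.build='IDX1' would run the three commands in order and stop at the first one that fails.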
Table 1 Common HIndex commands
Description
Command
Add Index
TableIndexer -Dtablename.to.index=table1 -Dindexspecs.to.add='IDX1=>cf1:[q1->datatype],[q2],[q3];cf2:[q1->datatype],[q2->datatype]#IDX2=>cf1:[q5]'
Create Index
TableIndexer -Dtablename.to.index=table1 -Dindexnames.to.build='IDX1#IDX2'
Delete Index
TableIndexer -Dtablename.to.index=table1 -Dindexnames.to.drop='IDX1#IDX2'
Disable Index
TableIndexer -Dtablename.to.index=table1 -Dindexnames.to.disable='IDX1#IDX2'
Add and Create Index
TableIndexer -Dtablename.to.index=table1 -Dindexspecs.to.add='IDX1=>cf1:[q1->datatype],[q2],[q3];cf2:[q1->datatype],[q2->datatype]#IDX2=>cf1:[q5]' -Dindexnames.to.build='IDX1'
Create Index for a Single Region
TableIndexer -Dtablename.to.index=table1 -Dregion.to.index=regionEncodedName -Dindexnames.to.build='IDX1#IDX2'
- IDX1: indicates the index name.
- cf1: indicates the column family name.
- q1: indicates the column name.
- datatype: indicates the data type, including String, Integer, Double, Float, Long, Short, Byte and Char.
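Because the datatype placeholder must be replaced with a concrete type name, it can help to check the name before building a column entry. The sketch below uses two hypothetical helper functions, is_supported_datatype and column_entry. It assumes the type names are written exactly as listed above and that the list is complete; neither point is confirmed by this page, so the check is deliberately strict.

```shell
# Sketch: validate a data type name and format one column entry.
# Both functions are hypothetical helpers, not part of HIndex.

# Succeeds only for the type names listed on this page (assumed exact spelling).
is_supported_datatype() {
  case "$1" in
    String|Integer|Double|Float|Long|Short|Byte|Char) return 0 ;;
    *) return 1 ;;
  esac
}

# Prints "[q1->Integer]" when a type is given, "[q2]" when it is omitted
# (an omitted type means the default, string).
column_entry() {
  col="$1"
  dtype="${2:-}"
  if [ -z "${dtype}" ]; then
    echo "[${col}]"
  elif is_supported_datatype "${dtype}"; then
    echo "[${col}->${dtype}]"
  else
    echo "unsupported data type: ${dtype}" >&2
    return 1
  fi
}
```

For instance, column_entry q1 Integer yields an entry that can be placed after cf1: in an index specification, while column_entry q1 Decimal reports an error instead of producing a specification that may be rejected later.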