Creating an HBase Global Secondary Index
Scenarios
- If a large amount of data exists in a table, you can add an index to a column.
- For user tables that do not have indexes, the GlobalTableIndexer tool allows you to add an index and build its index data at the same time.
Creating an HBase Global Secondary Index
Run the following command on the HBase client to add or create an index. After the command is executed, the specified index is added to the table.
hbase org.apache.hadoop.hbase.hindex.global.mapreduce.GlobalTableIndexer \
  -Dtablename.to.index='table' \
  -Dindexspecs.to.add='idx1=>cf1:[c1->string],[c2]#idx2=>cf2:[c1->string],[c2]#idx3=>cf1:[c1];cf2:[c1]' \
  -Dindexspecs.covered.family.to.add='idx2=>cf1' \
  -Dindexspecs.covered.to.add='idx1=>cf1:[c3],[c4]' \
  -Dindexspecs.coveredallcolumn.to.add='idx3=>true' \
  -Dindexspecs.splitkeys.to.set='idx1=>[\x010,\x011,\x012]#idx2=>[\x01a,\x01b,\x01c]#idx3=>[\x01d,\x01e,\x01f]'
The parameters are described as follows:
- tablename.to.index: Name of the data table for which an index is created
When this parameter is used to create an index, if the data table is empty, the created index will be in ACTIVE state. Otherwise, the index will be in INACTIVE state.
- indexspecs.to.addandbuild (optional): Adds the index and generates its index data in the same operation. If the data table is very large, using this parameter is not recommended. Use an index data generation tool instead.
This parameter and indexspecs.to.add cannot be used at the same time. When this parameter is used, the index is in BUILDING state while its data is being generated and changes to ACTIVE state after the index data is generated.
- indexspecs.to.add: Mapping between the index name and the column in the corresponding data table (definition of index column)
- indexspecs.covered.to.add (optional): Columns of the data table that are redundantly stored in the index table (definition of covered columns)
- indexspecs.covered.family.to.add (optional): Column families of the data table that are redundantly stored in the index table (definition of covered column families)
- indexspecs.coveredallcolumn.to.add (optional): Whether all data in the data table is redundantly stored in the index table (definition of covering all columns)
- indexspecs.splitkeys.to.set (optional): Pre-split points of an index table. Specify this parameter to prevent the regions of the index table from becoming hotspots. The pre-split format is as follows:
- '#': separate indexes
- '[]': contain splitkeys
- ',': separate splitkeys
Each splitkey of the pre-partition must start with \x01.
- idx1, idx2, and idx3: index names
- cf1 and cf2: column family names
- c1, c2, c3, and c4: column names
- string: data type. The value can be STRING, INTEGER, FLOAT, LONG, DOUBLE, SHORT, BYTE, or CHAR.
- '#' is used to separate indexes, ';' is used to separate column families, and ',' is used to separate column qualifiers.
- The column name and its data type must be included in '[]'.
- Column names and their data types are separated by '->'.
- If the data type of a specific column is not specified, the default data type (string) is used.
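The separator rules above are easy to get wrong when the values are typed by hand. The following is a minimal sketch, not part of the tool itself, that assembles the indexspecs.to.add and indexspecs.splitkeys.to.set values from smaller pieces and prints the resulting command for review instead of running it. The helper names join_by and split_entry and all variable names are hypothetical; the index, column family, and column names are the placeholders from the example command.

```shell
# Join the remaining arguments with the separator given as the first argument.
join_by() {
  local sep="$1"
  shift
  local out="$1"
  shift
  local part
  for part in "$@"; do
    out="${out}${sep}${part}"
  done
  printf '%s' "$out"
}

# Build one split-key entry in the form name=>[\x01k1,\x01k2,...].
# The literal text \x01 is prepended to every key, as the format requires.
split_entry() {
  local name="$1"
  shift
  local keys=""
  local key
  for key in "$@"; do
    keys="${keys:+${keys},}\\x01${key}"
  done
  printf '%s=>[%s]' "$name" "$keys"
}

# One definition per index: ';' separates column families,
# ',' separates column qualifiers, '->' attaches a data type.
idx1='idx1=>cf1:[c1->string],[c2]'
idx2='idx2=>cf2:[c1->string],[c2]'
idx3='idx3=>cf1:[c1];cf2:[c1]'

# '#' separates the individual indexes.
index_specs="$(join_by '#' "$idx1" "$idx2" "$idx3")"

split_keys="$(join_by '#' \
  "$(split_entry idx1 0 1 2)" \
  "$(split_entry idx2 a b c)" \
  "$(split_entry idx3 d e f)")"

# Print the command rather than executing it, so it can be checked first.
printf "hbase org.apache.hadoop.hbase.hindex.global.mapreduce.GlobalTableIndexer\n"
printf "  -Dtablename.to.index='%s'\n" 'table'
printf "  -Dindexspecs.to.add='%s'\n" "$index_specs"
printf "  -Dindexspecs.splitkeys.to.set='%s'\n" "$split_keys"
```

Keeping each index definition in its own variable makes it simpler to add or remove an index without disturbing the '#' separators.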
Deleting an HBase Global Secondary Index
Run the following command on the HBase client to delete an index:
hbase org.apache.hadoop.hbase.hindex.global.mapreduce.GlobalTableIndexer -Dtablename.to.index='table' -Dindexnames.to.drop='idx1#idx2'
The parameters are described as follows:
- tablename.to.index: Name of the data table that contains the indexes to be deleted
- indexnames.to.drop: Names of the indexes to be deleted. To delete multiple indexes, separate the index names with number signs (#).
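When several indexes are dropped at once, the '#'-separated list can be generated from a plain list of names. The sketch below is an illustration only: the variable name index_names is hypothetical, the index names are the placeholders from the example, and the command is printed for review rather than executed.

```shell
# Index names to drop, set as positional parameters.
set -- idx1 idx2

# "$*" joins the positional parameters with the first character of IFS.
# The assignment to IFS happens inside the command substitution, so the
# IFS of the calling shell is left unchanged.
index_names="$(IFS='#'; printf '%s' "$*")"

# Print the command rather than executing it, so it can be checked first.
printf "hbase org.apache.hadoop.hbase.hindex.global.mapreduce.GlobalTableIndexer -Dtablename.to.index='%s' -Dindexnames.to.drop='%s'\n" 'table' "$index_names"
```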