About HBase Global Secondary Indexes
Scenario
HBase secondary indexes can accelerate conditional queries with filters. There are local secondary indexes (LSIs, also called HIndexes) and global secondary indexes (GSIs). Compared with LSIs, GSIs have better query performance and are suitable for scenarios that require low read latency.
HBase GSIs use independent index tables to store index data. When a given query condition hits an index, a full table query is converted into an exact range query on an index table. This way, query speed is greatly improved. You do not need to modify your application code to enable HBase GSIs.
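The following is a minimal conceptual sketch of why an index hit is cheaper than a full table query. It is a toy in-memory model, not the HBase GSI implementation: the table contents, the `SEP` separator, and the index key layout (`<value><SEP><data row key>`) are all assumptions made for illustration.

```python
from bisect import bisect_left

# Toy data table: row key -> {column: value}. All names and values are made up.
data_table = {
    "row1": {"cf1:q1": "alice", "cf1:q2": "30"},
    "row2": {"cf1:q1": "bob", "cf1:q2": "25"},
    "row3": {"cf1:q1": "alice", "cf1:q2": "41"},
}

SEP = "\x00"  # assumed separator between the indexed value and the data row key


def build_index(table, column):
    """Return sorted toy index keys of the form <value><SEP><data row key>."""
    return sorted(
        row[column] + SEP + rowkey for rowkey, row in table.items() if column in row
    )


def full_scan(table, column, value):
    """Without an index: inspect every row and filter on the column value."""
    return sorted(k for k, row in table.items() if row.get(column) == value)


def index_lookup(index, value):
    """With an index: seek to the key prefix and read one contiguous range."""
    prefix = value + SEP
    pos = bisect_left(index, prefix)
    hits = []
    while pos < len(index) and index[pos].startswith(prefix):
        hits.append(index[pos][len(prefix):])
        pos += 1
    return hits


index = build_index(data_table, "cf1:q1")
print(index_lookup(index, "alice"))
```

Both functions return the same row keys, but `index_lookup` only touches the entries that share the queried prefix, whereas `full_scan` reads every row.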
Key features of HBase GSIs are as follows:
- Composite index
Multiple columns of different column families can be specified as index columns.
- Covering index
Multiple columns or column families can be stored redundantly in the index table to cover all the data a query needs. With a covering index, non-index columns can be read quickly as part of the index query.
- Index TTL
Index table TTL takes effect if data table TTL is enabled. To ensure consistency with the data table, the index table TTL is automatically inherited from the index columns and covering columns of the data table and cannot be specified separately.
- Online index change
Indexes can be created and deleted, and their status can be modified, online without affecting reads and writes on the data table.
- Online index repair
If the index data hit by a query is invalid, index data rebuilding is triggered to ensure that the final query result is correct.
- Index tool
The index tool helps you create and delete indexes, check and repair index consistency, modify index status, and rebuild index data.
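As an illustration of the composite index and covering index features listed above, the sketch below extends a toy in-memory model. It assumes a made-up key layout (index column values joined by a separator, followed by the data row key) and made-up table contents; it does not reflect how HBase actually encodes index rows.

```python
SEP = "\x00"  # assumed separator; the real key encoding is not described here

# Made-up data table with index columns in two different column families.
example_table = {
    "row1": {"cf1:q1": "alice", "cf2:q2": "30", "cf1:q3": "x"},
    "row2": {"cf1:q1": "bob", "cf2:q2": "25", "cf1:q3": "y"},
}


def build_covering_index(table, index_cols, covered_cols):
    """Build a toy composite index whose entries also carry covered columns."""
    entries = {}
    for rowkey, row in table.items():
        if not all(c in row for c in index_cols):
            continue  # rows missing an index column get no entry in this model
        key = SEP.join([row[c] for c in index_cols] + [rowkey])
        entries[key] = {c: row[c] for c in covered_cols if c in row}
    return dict(sorted(entries.items()))


def query_covered(index, values):
    """Return covered columns for entries whose leading index values match.

    A linear pass keeps the sketch short; a real index would seek to the prefix.
    """
    prefix = SEP.join(values) + SEP
    return [cols for key, cols in index.items() if key.startswith(prefix)]


idx = build_covering_index(example_table, ["cf1:q1", "cf2:q2"], ["cf1:q3"])
print(query_covered(idx, ["alice", "30"]))
```

Because the covered column `cf1:q3` is stored in the index entry itself, the query is answered without going back to the data table.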
Constraints
- Application Scenarios
- GSIs cannot be used together with HIndexes (LSIs). That is, they cannot be created in the same data table.
- Index tables do not support disaster recovery (DR).
- DISABLE, DROP, MODIFY, and TRUNCATE cannot be directly performed on index tables.
- An index definition cannot be modified. To change a definition, delete the index and create it again. Other DDL operations on indexes are allowed, for example, modifying the index status and deleting and creating indexes.
- HBase GSIs cannot be created for a table that contains data.
- Creating Indexes
- An index name can contain only characters that match the following regular expression: [a-zA-Z_0-9-.]. Other characters are not supported.
- The data table specified for index creation must exist. An index cannot be repeatedly created.
- The index table cannot have multiple versions.
Indexes cannot be created on data tables with multiple versions (VERSION>1). The data table must use VERSION=1.
- The number of indexes in a single data table cannot exceed five.
Do not create too many indexes for a data table. Otherwise, more storage is required and write operations become slower. If more than five indexes need to be created, add the hbase.gsi.max.index.count.per.table parameter to the custom configuration hbase.hmaster.config.expandor of HMaster, set the parameter to a value greater than 5, and restart HMaster to make the configuration take effect.
- The index name can contain a maximum of 18 characters.
Avoid long index names. If a longer name is required, add the hbase.gsi.max.index.name.length parameter to the custom configuration hbase.hmaster.config.expandor of HMaster, set the parameter to a value greater than 18, and restart HMaster to make the configuration take effect.
- Indexes cannot be created for index tables.
Indexes cannot be nested. Index tables are used only to accelerate queries and do not provide data table functions.
- Indexes that can be covered by existing indexes cannot be created.
An index cannot be created if its index columns are a subset of those in an existing index, because duplicate indexes waste storage space. In the following example, index 2 cannot be created:
Create a table.
create 't1','cf1'
Create index 1.
hbase org.apache.hadoop.hbase.hindex.global.mapreduce.GlobalTableIndexer -Dtablename.to.index='t1' -Dindexspecs.to.add='idx1=>cf1:[q1],[q2]'
Create index 2.
hbase org.apache.hadoop.hbase.hindex.global.mapreduce.GlobalTableIndexer -Dtablename.to.index='t1' -Dindexspecs.to.add='idx2=>cf1:[q1]'
- Indexes with the same name cannot be created in the same data table, but can be created in different data tables.
- The TTL of a column family in an index table is inherited from the original table, and must be the same as that of the original table.
The TTLs of all column families in an index table are the same and are inherited from a data table. The TTLs of associated column families in the data table must be the same. Otherwise, associated indexes cannot be created.
- User-defined properties for index tables are not supported.
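Several of the index creation constraints above can be checked before the indexer is run. The helper below is a hypothetical client-side pre-check, not part of HBase or its index tool. It assumes the default limits (18 characters, five indexes), assumes the character class applies to the whole name (the trailing `+` is added here), and interprets "covered by an existing index" as "the new index columns are a leading prefix of an existing index", which matches the idx1/idx2 example but is an assumption.

```python
import re

# Character class taken from the text; the '+' quantifier is an assumption.
NAME_RE = re.compile(r"[a-zA-Z_0-9.-]+")
MAX_NAME_LEN = 18  # default; raised via hbase.gsi.max.index.name.length
MAX_INDEXES = 5    # default; raised via hbase.gsi.max.index.count.per.table


def check_new_index(name, columns, existing):
    """Return a list of constraint violations for a proposed index.

    existing maps each index name already defined on the same data table
    to its ordered list of index columns.
    """
    problems = []
    if not NAME_RE.fullmatch(name):
        problems.append("invalid characters in index name")
    if len(name) > MAX_NAME_LEN:
        problems.append("index name too long")
    if name in existing:
        problems.append("index name already used on this table")
    if len(existing) >= MAX_INDEXES:
        problems.append("too many indexes on this table")
    for other, other_cols in existing.items():
        # Assumption: "covered" means the new columns are a leading prefix.
        if other_cols[: len(columns)] == columns:
            problems.append(f"covered by existing index {other}")
    return problems


existing = {"idx1": ["cf1:q1", "cf1:q2"]}
print(check_new_index("idx2", ["cf1:q1"], existing))
```

An empty list means none of the modeled constraints is violated. The server still enforces the full set of rules, including those not modeled here, such as TTL consistency and the single-version requirement.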