Creating HBase Global Secondary Indexes in Batches
Scenarios
If a user table already contains a large amount of data, you can run a MapReduce task to build index data for the existing data in batches.
Creating HBase Global Secondary Indexes in Batches
- Only indexes in the INACTIVE state can be built in batches. To rebuild the data of an index that is in another state, change the index state to INACTIVE first.
- If the data table contains a large amount of data, building the index takes a long time. You are advised to run the command in the background with nohup so that the task is not interrupted if the client session is closed unexpectedly.
Run the following command on the HBase client to create index data for existing data in batches:
hbase org.apache.hadoop.hbase.hindex.global.mapreduce.GlobalTableIndexer -Dtablename.to.index='table' -Dindexnames.to.build='idx1'
The related parameters are described as follows:
- tablename.to.index: indicates the name of the data table for which index data is to be built.
- indexnames.to.build: specifies the names of the indexes for which data needs to be generated in batches. You can specify multiple indexes and separate them with number signs (#).
- hbase.gsi.cleandata.enabled (optional): indicates whether to clear the index table before creating index data. The default value is false.
- hbase.gsi.cleandata.timeout (optional): indicates the timeout interval for clearing the index table before creating index data. The default value is 1800, in seconds.
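The following is a minimal sketch of how the command and parameters above might be combined in a script. The table name, the index names idx1 and idx2, the log file name, and the RUN_INDEXER switch are placeholders introduced for illustration. Passing hbase.gsi.cleandata.enabled as a -D option is an assumption based on how the other two parameters are passed; verify it on your cluster before relying on it. By default the script only prints the command so that it can be reviewed.

```shell
#!/bin/sh
# Placeholder values for illustration; replace with your own table and index names.
TABLE='table'
INDEXES='idx1 idx2'
LOG_FILE='gsi_build.log'
# Assumption: the optional parameter is passed as a -D option like the others.
CLEAN_FIRST='false'

# Join index names with '#', the separator used by indexnames.to.build.
join_indexes() (
  IFS='#'
  echo "$*"
)

# Print the full indexer command line so that it can be reviewed before running.
build_indexer_cmd() (
  table="$1"
  clean="$2"
  shift 2
  printf "hbase org.apache.hadoop.hbase.hindex.global.mapreduce.GlobalTableIndexer -Dtablename.to.index='%s' -Dindexnames.to.build='%s'" "$table" "$(join_indexes "$@")"
  if [ "$clean" = "true" ]; then
    printf " -Dhbase.gsi.cleandata.enabled=true"
  fi
  printf "\n"
)

# $INDEXES is intentionally unquoted so that it splits into one argument per index.
CMD="$(build_indexer_cmd "$TABLE" "$CLEAN_FIRST" $INDEXES)"
echo "$CMD"

# Set RUN_INDEXER=1 to start the task in the background with nohup.
if [ "${RUN_INDEXER:-0}" = "1" ]; then
  nohup sh -c "$CMD" > "$LOG_FILE" 2>&1 &
  echo "Started in background with PID $!; output goes to $LOG_FILE"
fi
```

Run the script once without RUN_INDEXER to check the printed command, then run it again with RUN_INDEXER=1 on the HBase client. Because the task is started with nohup and its output is redirected to the log file, it keeps running after the terminal is closed.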