Updating HBase Data in Batches Using BulkLoad

Scenario

You can use HBase BulkLoad to update data in batches by specifying the row key naming rule, the row key range, the field names, and the new field values.

Updating HBase Data in Batches Using BulkLoad

Run the following command to update the rows from row_start to row_stop and write the generated HFiles to /user/output/.

hbase com.huawei.hadoop.hbase.tools.bulkload.UpdateData \
  -Dupdate.rowkey.start="row_start" \
  -Dupdate.rowkey.stop="row_stop" \
  -Dupdate.hfile.output=/user/output/ \
  -Dupdate.qualifier=f1:c1,f2 \
  -Dupdate.qualifier.new.value=0,a \
  'table1'

-Dupdate.rowkey.start="row_start": indicates that the start row key is row_start.
-Dupdate.rowkey.stop="row_stop": indicates that the stop row key is row_stop.
-Dupdate.hfile.output=/user/output/: indicates that the generated HFiles are written to /user/output/.
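
The parameters above can be assembled from variables when the same update is run against several tables or row key ranges. The following is a minimal sketch, not part of the product tooling: the helper name build_update_cmd and its argument order are hypothetical, and it only prints the command line so that you can review it before running it.

```shell
# Hypothetical helper (not shipped with HBase): prints the UpdateData command
# line for the given arguments so it can be reviewed before it is run.
# Usage: build_update_cmd TABLE ROW_START ROW_STOP OUTPUT_DIR QUALIFIERS NEW_VALUES
build_update_cmd() {
  printf 'hbase com.huawei.hadoop.hbase.tools.bulkload.UpdateData'
  printf ' -Dupdate.rowkey.start="%s"' "$2"
  printf ' -Dupdate.rowkey.stop="%s"' "$3"
  printf ' -Dupdate.hfile.output=%s' "$4"
  printf ' -Dupdate.qualifier=%s' "$5"
  printf ' -Dupdate.qualifier.new.value=%s' "$6"
  printf " '%s'\n" "$1"
}

# Reproduces the example command from this section on a single line.
build_update_cmd table1 row_start row_stop /user/output/ f1:c1,f2 0,a
```

Printing the command instead of executing it keeps the sketch safe to try on a host without an HBase client.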

After transparent encryption is configured for HBase, see 7 for precautions on batch updating.

Run the following command to load HFiles:

hbase org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles <path/for/output> <tablename>
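
Before loading, it can help to confirm that the HFile output directory actually exists in HDFS. The following is a minimal sketch under that assumption: the wrapper name load_hfiles is hypothetical, and the check uses the standard hdfs dfs -test -d command.

```shell
# Hypothetical wrapper: loads HFiles from an output directory into a table,
# but only after checking that the directory exists in HDFS.
# Usage: load_hfiles OUTPUT_DIR TABLE
load_hfiles() {
  output_dir="$1"
  table="$2"
  # hdfs dfs -test -d returns 0 only if the path exists and is a directory.
  if ! hdfs dfs -test -d "$output_dir"; then
    echo "HFile directory not found: $output_dir" >&2
    return 1
  fi
  hbase org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles "$output_dir" "$table"
}

# Example call (requires a configured HBase and HDFS client):
# load_hfiles /user/output/ table1
```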

Precautions

During batch updating, the values of the specified fields are updated in every row that meets the specified conditions.
Batch updating cannot be performed on fields for which indexes have been created.
If you do not specify an output path with -Dupdate.hfile.output, the results are written to the default path /tmp/updatedata/table name.
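
The default output path rule can be sketched as a small helper that a follow-up load step could reuse. This is an illustration only: the function name resolve_output_dir is hypothetical, and it assumes that the default resolves to /tmp/updatedata/ followed by the table name.

```shell
# Hypothetical helper: returns the directory where the updated HFiles are
# expected, assuming the default is /tmp/updatedata/<table name> when no
# explicit output path was passed to UpdateData.
# Usage: resolve_output_dir TABLE [OUTPUT_DIR]
resolve_output_dir() {
  table="$1"
  explicit="${2:-}"
  if [ -n "$explicit" ]; then
    printf '%s\n' "$explicit"
  else
    printf '/tmp/updatedata/%s\n' "$table"
  fi
}

resolve_output_dir table1                 # prints /tmp/updatedata/table1
resolve_output_dir table1 /user/output/   # prints /user/output/
```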
