Optimizing Hive ORC Data Storage
Scenario
ORC is an efficient columnar storage format that provides a higher compression ratio and better read efficiency than other file formats.
You are advised to use ORC as the default Hive table storage format.
Prerequisites
You have logged in to the Hive client. For details, see Using the Hive Client.
Procedure
- Recommended: SNAPPY compression, which suits scenarios that need a balance between compression ratio and read efficiency.
CREATE TABLE xx (col_name data_type) STORED AS ORC TBLPROPERTIES ("orc.compress"="SNAPPY");
- Available: ZLIB compression, which suits scenarios that require a high compression ratio.
CREATE TABLE xx (col_name data_type) STORED AS ORC TBLPROPERTIES ("orc.compress"="ZLIB");
In these statements, xx indicates the Hive table name, and col_name and data_type are placeholders for each column's name and type.
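As a minimal sketch of how the statements above might be applied to existing data, the following example creates an ORC table with SNAPPY compression, copies rows into it from a text-format table, and then inspects the result. The table names sales_text and sales_orc and their columns are hypothetical and used only for illustration; adjust them to your own schema.

```sql
-- Hypothetical source table stored as delimited text (illustration only).
CREATE TABLE IF NOT EXISTS sales_text (
  order_id BIGINT,
  amount   DOUBLE,
  order_dt STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE;

-- Target table with the same columns, stored as ORC with SNAPPY compression.
CREATE TABLE IF NOT EXISTS sales_orc (
  order_id BIGINT,
  amount   DOUBLE,
  order_dt STRING
)
STORED AS ORC
TBLPROPERTIES ("orc.compress"="SNAPPY");

-- Rewrite the text data into the ORC table.
INSERT OVERWRITE TABLE sales_orc
SELECT order_id, amount, order_dt
FROM sales_text;

-- Inspect the table metadata to confirm the storage format and table properties.
DESCRIBE FORMATTED sales_orc;
```

The DESCRIBE FORMATTED output should list the ORC input and output formats and the orc.compress table property. To use ZLIB instead, change only the property value in the second statement.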