Optimizing Data Storage

Updated on 2022-11-18 GMT+08:00

Scenario

ORC is an efficient columnar storage format that provides a higher compression ratio and better read efficiency than other file formats.

You are advised to use ORC as the default Hive table storage format.
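
One possible way to apply this advice for a session is to change the default file format that Hive uses when a CREATE TABLE statement has no "stored as" clause. The sketch below assumes that the cluster allows hive.default.fileformat to be set at session level; the table name demo_default_orc is hypothetical.

```sql
-- Assumption: the session is permitted to override this Hive parameter.
SET hive.default.fileformat=ORC;

-- No "stored as" clause, so the session default format is used.
-- demo_default_orc is a hypothetical table name.
CREATE TABLE demo_default_orc (id INT, name STRING);

-- Inspect the input/output format classes to confirm the table uses ORC.
DESCRIBE FORMATTED demo_default_orc;
```

Stating "stored as orc" explicitly in each statement, as in the procedure on this page, does not depend on any session or cluster setting.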

Prerequisites

You have logged in to the Hive client. For details, see Using a Hive Client.

Procedure

  • Recommended: SNAPPY compression, which applies to scenarios that require a balance between compression ratio and read efficiency.

    Create table xx (col_name data_type) stored as orc tblproperties ("orc.compress"="SNAPPY");

  • Available: ZLIB compression, which applies to scenarios with high compression ratio requirements.

    Create table xx (col_name data_type) stored as orc tblproperties ("orc.compress"="ZLIB");

NOTE:

xx is a placeholder for the name of the Hive table. Replace col_name and data_type with the actual column names and data types.
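
The statements above can be filled in as follows. This is a minimal sketch, not a prescribed schema: the table names sales_orc and sales_text and all column names are hypothetical, and sales_text is assumed to be an existing table with matching columns.

```sql
-- Hypothetical table and columns, for illustration only.
CREATE TABLE sales_orc (
  order_id BIGINT,
  customer STRING,
  amount   DOUBLE
)
STORED AS ORC
TBLPROPERTIES ("orc.compress"="SNAPPY");

-- Copy rows from an existing table (assumed to be named sales_text).
-- The data is rewritten in ORC format with SNAPPY compression.
INSERT INTO TABLE sales_orc
SELECT order_id, customer, amount FROM sales_text;

-- Check that the table definition carries the ORC format and the
-- orc.compress property.
SHOW CREATE TABLE sales_orc;
```

To favor compression ratio over read efficiency, replace SNAPPY with ZLIB in the tblproperties clause.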
