Updated on 2022-09-14 GMT+08:00

Specifying the Output File Compression Format When Importing a Table

Question

How do I specify an output file compression format when importing a table?

Procedure

Hive supports the following compression formats:
org.apache.hadoop.io.compress.BZip2Codec
org.apache.hadoop.io.compress.Lz4Codec
org.apache.hadoop.io.compress.DeflateCodec
org.apache.hadoop.io.compress.SnappyCodec
org.apache.hadoop.io.compress.GzipCodec
  • To apply compression globally, that is, to all tables, set the following Hive service configuration parameters on the Manager page:
    • Set hive.exec.compress.output to true.
    • Set mapreduce.output.fileoutputformat.compress.codec to org.apache.hadoop.io.compress.BZip2Codec.

      This parameter takes effect only when hive.exec.compress.output is set to true.

  • To apply compression at the session level only, run the following commands before executing the import statement:
    set hive.exec.compress.output=true;
    set mapreduce.output.fileoutputformat.compress.codec=org.apache.hadoop.io.compress.SnappyCodec;
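As a minimal sketch, a session-level import could look like the following. The table names source_table and target_table are hypothetical placeholders, and the example assumes the target is a text-format table, so that the written files pick up the codec configured in the session.

```sql
-- Enable output compression for this session only.
set hive.exec.compress.output=true;
-- Choose the codec; any codec from the supported list can be used here.
set mapreduce.output.fileoutputformat.compress.codec=org.apache.hadoop.io.compress.SnappyCodec;

-- Hypothetical tables, for illustration only.
-- The files written for target_table are compressed with the codec set above.
INSERT OVERWRITE TABLE target_table
SELECT * FROM source_table;
```

Because both settings are issued with set, they apply only to the current session and do not change the global Hive service configuration.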