Updated on 2023-11-30 GMT+08:00

Specifying the Output File Compression Format When Importing a Hive Table

Symptom

The user does not know how to specify an output file compression format when importing a Hive table.

Procedure

Hive supports the following compression codecs, each specified by its class name:
org.apache.hadoop.io.compress.BZip2Codec
org.apache.hadoop.io.compress.Lz4Codec
org.apache.hadoop.io.compress.DeflateCodec
org.apache.hadoop.io.compress.SnappyCodec
org.apache.hadoop.io.compress.GzipCodec
  • To compress the output of all tables, set the following Hive service configuration parameters globally on the Manager page:
    • Set hive.exec.compress.output to true.
    • Set mapreduce.output.fileoutputformat.compress.codec to org.apache.hadoop.io.compress.BZip2Codec.

      The mapreduce.output.fileoutputformat.compress.codec parameter takes effect only when hive.exec.compress.output is set to true.

  • To compress the output for the current session only, run the following commands before executing the import statement:

    set hive.exec.compress.output=true;

    set mapreduce.output.fileoutputformat.compress.codec=org.apache.hadoop.io.compress.SnappyCodec;
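
The session-level settings can be combined with an import statement as in the following minimal sketch. The table names src_table and dst_table are hypothetical and are not part of the original procedure. The sketch assumes that dst_table is stored as a text file; columnar formats such as ORC typically control compression through their own table properties instead.

```sql
-- Enable output compression for this session only.
set hive.exec.compress.output=true;
set mapreduce.output.fileoutputformat.compress.codec=org.apache.hadoop.io.compress.SnappyCodec;

-- Hypothetical tables: src_table is the source, dst_table is a TEXTFILE target.
-- Files written by this statement should use the codec selected above.
INSERT OVERWRITE TABLE dst_table SELECT * FROM src_table;

-- Running "set" with only a parameter name prints its current value,
-- which can be used to confirm that the session setting is in place.
set hive.exec.compress.output;
```

Because these settings apply only to the current session, a new session falls back to the global values configured on the Manager page.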