Updated on 2022-09-14 GMT+08:00

Garbled Characters Returned upon a select Query If Text Files Are Compressed Using ARC4

Symptom

If a Hive query result table is compressed and stored using the ARC4 algorithm, running a select * query on the result table returns garbled characters.

Cause Analysis

The default Hive compression format is not ARC4, or output compression is disabled.
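
To see which of these two conditions applies, the current values can be inspected before changing anything. The following is a minimal sketch, assuming a Beeline session that is already connected to the affected cluster. In Hive, a set statement that names a property without assigning a value prints that property's current setting.

```sql
-- Print whether output compression is currently enabled (true or false).
set hive.exec.compress.output;

-- Print the codec currently configured for compressed job output.
-- A property that has not been set is typically reported as undefined.
set mapreduce.output.fileoutputformat.compress.codec;
```

If the first statement reports false, or the second reports a codec other than the ARC4 codec, the session does not match the settings given in the solution.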

Solution

  1. If garbled characters are returned after the select query, set the following parameters in Beeline:

    set mapreduce.output.fileoutputformat.compress.codec=org.apache.hadoop.io.encryption.arc4.ARC4BlockCodec;

    set hive.exec.compress.output=true;

  2. Import the data from the source table into a new table using block decompression.

    insert overwrite table tbl_result select * from tbl_source;

  3. Perform the query again.

    select * from tbl_result;