Garbled Characters Returned upon a select Query If Text Files Are Compressed Using ARC4
Symptom
If a Hive query result table is compressed and stored using the ARC4 algorithm, garbled characters are returned when a select * query is run on the result table.
Cause Analysis
The default Hive compression format is not ARC4, or output compression is disabled.
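To confirm which of the two causes applies, the current session values can be inspected before changing anything. This is a minimal sketch, assuming a Beeline session already connected to the affected cluster; in Hive, issuing set with a property name and no value displays the current setting.

```sql
-- Show whether output compression is enabled in this session.
set hive.exec.compress.output;

-- Show which codec is configured for compressed output.
set mapreduce.output.fileoutputformat.compress.codec;
```

If the first property is false, or the second does not name the ARC4 codec, the session matches the cause described above.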
Solution
- If garbled characters are returned after the SELECT query, set the following parameters in Beeline:
set mapreduce.output.fileoutputformat.compress.codec=org.apache.hadoop.io.encryption.arc4.ARC4BlockCodec;
set hive.exec.compress.output=true;
- Import the data from the source table into a new table using block decompression.
insert overwrite table tbl_result select * from tbl_source;
- Perform the query again.
select * from tbl_result;