Garbled Characters Returned Upon a Query If Text Files Are Compressed Using ARC4
Symptom
If a Hive query result table is compressed and stored using the ARC4 algorithm, a select * query on the result table returns garbled characters.
Cause Analysis
The default Hive compression format is not ARC4, or output compression is disabled.
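To confirm which of the two conditions applies, the current session values can be inspected first. The following is a minimal sketch, assuming it is run in the same Beeline session as the failing query: in Hive, set followed by a property name and no value prints that property's current value and does not change it.

```sql
-- Print the current value of each property; nothing is assigned here.
set hive.exec.compress.output;
set mapreduce.output.fileoutputformat.compress.codec;
```

If the first property is false, or the second is unset or names a codec other than the ARC4 codec, the session matches the cause described above.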
Solution
- If garbled characters are returned after the SELECT query, set the following parameters in Beeline:
set mapreduce.output.fileoutputformat.compress.codec=org.apache.hadoop.io.encryption.arc4.ARC4BlockCodec;
set hive.exec.compress.output=true;
- Import the data from the source table into a new table using block decompression.
insert overwrite table tbl_result select * from tbl_source;
- Run the query again on the new table.
select * from tbl_result;