Updated on 2022-11-15 GMT+08:00

An Error Is Reported If the Query Result of an Impala SQL Statement Executed on Hue Contains Chinese Characters

Symptom

The error message "UnicodeDecodeError: 'utf-8' codec can't decode byte in position 0: unexpected end of data" is displayed when an Impala SQL statement executed on Hue returns a query result that contains Chinese characters.

Cause Analysis

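Hive measures string length in characters, so a Chinese character has a length of 1. Impala measures string length in bytes, so a UTF-8 encoded Chinese character has a length of 3. As a result, when functions such as substr(), substring(), and strleft() are used to extract Chinese characters in Impala SQL, each Chinese character cannot be treated as having a length of 1. A length that ends in the middle of a character's 3 bytes returns an incomplete character, which Hue then fails to decode.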
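
The byte-versus-character difference can be reproduced outside Impala. The following is a minimal Python sketch, assuming the column value is stored as UTF-8; the sample string and the helper name try_decode are illustrative and not part of Hue or Impala. It shows that cutting a Chinese character after 1 byte triggers the same kind of decoding error, while cutting after 3 bytes yields a complete character.

```python
# Assumption: the value is UTF-8 encoded; the sample string is illustrative only.
text = "中文"
data = text.encode("utf-8")

char_len = len(text)  # length counted in characters
byte_len = len(data)  # length counted in bytes


def try_decode(raw: bytes) -> str:
    """Decode raw bytes as UTF-8, returning the error text on failure."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError as exc:
        return f"UnicodeDecodeError: {exc}"


cut_mid = try_decode(data[:1])  # stops inside the first character
cut_ok = try_decode(data[:3])   # stops exactly after the first character

print(char_len, byte_len)
print(cut_mid)
print(cut_ok)
```

The first slice keeps only the leading byte of a 3-byte sequence, so the decoder reports unexpected end of data; the second slice ends on a character boundary and decodes cleanly.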

Procedure

  1. Log in to the node where the Impala client is installed and run the following commands:

    cd Client installation directory

    source bigdata_env

  2. Run the following command to start the Impala client and connect to the bigdata database:

    impala-shell -d bigdata

  3. Run the following command to query table data, specifying the length in bytes (3 for one Chinese character):

    select strleft(worker,3) from eier;