"ERROR: invalid byte sequence for encoding 'UTF8': 0x00" Is Reported When Data Is Imported to GaussDB(DWS) Using COPY FROM
Symptom
"ERROR: invalid byte sequence for encoding 'UTF8': 0x00" is reported when data is imported to GaussDB(DWS) using COPY FROM.
Possible Causes
The data file comes from an Oracle database and is UTF-8 encoded. The error message also reports the line number at which the import failed. Because the file is too large to open in vim, the reported lines were extracted with the sed command and then opened in vim, but nothing abnormal was visible. After the file was split by line count with the split command, some of the resulting parts could be imported.
According to the GaussDB(DWS) documentation, the direct cause of this error is that fields and variables of the VARCHAR type cannot hold character strings containing '\0' (that is, the byte 0x00, Unicode code point U+0000). The solution is to delete '\0' from the character strings before importing the data.
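A NUL byte is easy to miss when scanning a line visually, so it can help to dump the reported line byte by byte and to count the NUL bytes in the file. The following is a minimal sketch under stated assumptions: the sample file and the line number 3 are hypothetical stand-ins for the real data file and the line number given in the error message.

```shell
#!/usr/bin/env bash
# Hypothetical stand-ins: replace "$file" with the real data file and
# "$line" with the line number reported in the COPY error message.
file=$(mktemp)
line=3

# Build a tiny sample containing one NUL byte so the sketch runs on its own.
printf 'a,b\nc,d\ne\000f,g\n' > "$file"

# Print only the reported line and dump it byte by byte; od -c shows NUL as \0.
sed -n "${line}p" "$file" | od -c

# Keep only the NUL bytes (-c complements the set, -d deletes) and count them.
nul_count=$(tr -cd '\000' < "$file" | wc -c | tr -d ' ')
echo "NUL bytes in file: $nul_count"
```

A nonzero count suggests that the file contains the 0x00 bytes that trigger the error.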
Handling Procedure
Run the sed command to remove every 0x00 byte from the file.
sed -i 's/\x00//g' file
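Parameter description: -i edits the original file in place. s/ starts a substitution, here replacing \x00 with nothing. The trailing g flag replaces every match on a line instead of only the first. The \x00 escape is supported by GNU sed.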
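If modifying the original file in place is not desirable, the NUL bytes can instead be removed while writing a cleaned copy. The following is a minimal sketch, not part of the documented procedure: it swaps in tr for sed, and the temporary files are hypothetical stand-ins for the real data file and the copy that would be passed to COPY FROM.

```shell
#!/usr/bin/env bash
# Hypothetical stand-ins for the exported data file and its cleaned copy.
src=$(mktemp)
clean=$(mktemp)

# Sample input with one NUL byte; use the real file in practice.
printf 'id,name\n1,ab\000c\n' > "$src"

# Delete every NUL byte while copying; the original file stays untouched.
tr -d '\000' < "$src" > "$clean"

# Verify: count the NUL bytes left in the copy and the bytes removed overall.
remaining=$(tr -cd '\000' < "$clean" | wc -c | tr -d ' ')
removed=$(( $(wc -c < "$src") - $(wc -c < "$clean") ))
echo "remaining=$remaining removed=$removed"
```

Keeping the original file makes it possible to compare sizes afterwards and to confirm that only the 0x00 bytes were dropped.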