"ERROR: invalid byte sequence for encoding 'UTF8': 0x00" Is Reported When Data Is Imported to GaussDB(DWS) Using COPY FROM
Symptom
"ERROR: invalid byte sequence for encoding 'UTF8': 0x00" is reported when data is imported to GaussDB(DWS) using COPY FROM.
Possible Causes
The data file comes from an Oracle database and is UTF-8 encoded. The error message also reports the line number at which the import failed. Because the file is too large to open with vim, the reported lines were extracted with the sed command and then opened in vim, but nothing abnormal was visible. After the file was split by line count with the split command, some of the resulting parts could be imported.
Analysis shows that fields or variables of the varchar type in GaussDB(DWS) cannot contain '\0' (that is, the byte 0x00, Unicode code point U+0000). Delete every '\0' from the data before importing it.
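To confirm that 0x00 bytes are actually present before changing anything, a check along the following lines may help. This is a minimal sketch, not part of the original procedure: the function names count_nul_bytes and show_line are hypothetical, and it assumes a shell with tr, wc, sed, and cat -v available (as on typical Linux systems).

```shell
# Count the 0x00 bytes in a file; prints 0 when the file is clean.
# Usage: count_nul_bytes FILE
count_nul_bytes() {
    # tr -cd deletes every byte except 0x00; wc -c counts what is left.
    tr -cd '\000' < "$1" | wc -c | tr -d ' '
}

# Print one line of a file with non-printing characters made visible.
# A 0x00 byte is displayed as ^@ by cat -v.
# Usage: show_line FILE LINE_NUMBER
show_line() {
    sed -n "${2}p" "$1" | cat -v
}
```

For example, `count_nul_bytes file` reports how many bytes would be removed, and `show_line file N` displays line N, where N is the line number taken from the error message.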
Handling Procedure
Run the sed command to delete every 0x00 byte from the data file.
sed -i 's/\x00//g;' file
Parameters:
- -i edits the file in place, overwriting the original.
- s/ starts a substitute command, which here replaces \x00 with nothing.
- /g applies the substitution to every match on a line instead of only the first one.
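If the original file should be kept unchanged, or if the local sed does not support the \x00 escape (it is a GNU sed extension), a variant based on tr can be used instead. The following is a sketch under those assumptions; the function name strip_nul_bytes is hypothetical and is not part of the original procedure.

```shell
# Write a copy of SRC without 0x00 bytes to DST, leaving SRC untouched.
# Returns non-zero if the copy fails or if DST still contains 0x00 bytes.
# Usage: strip_nul_bytes SRC DST
strip_nul_bytes() {
    # tr -d deletes every 0x00 byte while streaming, so large files are fine.
    tr -d '\000' < "$1" > "$2" || return 1
    # Verify the result: keep only 0x00 bytes from DST and count them.
    [ "$(tr -cd '\000' < "$2" | wc -c | tr -d ' ')" = "0" ]
}
```

For example, `strip_nul_bytes file file.clean` writes the cleaned copy to file.clean, which can then be imported with COPY FROM in place of the original file.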