Updated on 2025-06-30 GMT+08:00

Character Sets

GaussDB allows you to specify the following character sets for databases, schemas, tables, or columns. The default character set is utf8.

Table 1 Character sets

MySQL        GaussDB
utf8mb4      Supported.
utf8         Supported.
gbk          Supported.
gb18030      Supported.
binary       Supported.
latin1       Supported.

  • GaussDB treats utf8 and utf8mb4 as the same character set, in which a character is encoded in at most 4 bytes. If the current character set is utf8 and the collation is set to utf8mb4_bin, utf8mb4_general_ci, utf8mb4_unicode_ci, or utf8mb4_0900_ai_ci (for example, by running SELECT _utf8'a' collate utf8mb4_bin), MySQL reports an error but GaussDB does not. The same difference applies when the character set is utf8mb4 and the collation is set to utf8_bin, utf8_general_ci, or utf8_unicode_ci.
  • Lexical analysis operates on the byte stream. When a multi-byte character contains a byte that matches the code of a symbol such as '\', '\'', or '\\', GaussDB behaves differently from MySQL. In this case, you are advised to temporarily disable the escape character feature. For details, see the enable_escape_string option of the GUC parameter m_format_behavior_compat_options in "Configuring GUC Parameters > GUC Parameters > Version and Platform Compatibility > Platform and Client Compatibility" in the Administrator Guide.
  • GaussDB does not strictly validate characters that are invalid in the current character set, so such characters may be accepted and stored. MySQL, by contrast, validates them and reports an error.
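
The notes above can be explored on the client side before any statement reaches the database. The following Python sketch is illustrative only and is not part of GaussDB: the PY_CODECS mapping and the helper names is_valid and risky_chars are hypothetical, and the sketch assumes that Python's built-in codecs are a reasonable stand-in for the server's character sets. It finds multi-byte characters whose encoding contains a backslash or single-quote byte, and it rejects byte strings that are not valid in the chosen character set.

```python
from typing import List

# Hypothetical mapping from character set names in Table 1 to Python codecs.
# utf8 and utf8mb4 share one codec because GaussDB treats them as the same set.
PY_CODECS = {"utf8": "utf-8", "utf8mb4": "utf-8", "gbk": "gbk", "gb18030": "gb18030"}

# Byte values of the backslash and the single quote.
ESCAPE_BYTES = {0x5C, 0x27}


def is_valid(data: bytes, charset: str) -> bool:
    """Return True if data decodes cleanly in the given character set."""
    try:
        data.decode(PY_CODECS[charset])
        return True
    except UnicodeDecodeError:
        return False


def risky_chars(text: str, charset: str) -> List[str]:
    """List multi-byte characters whose encoding contains a backslash or quote byte."""
    codec = PY_CODECS[charset]
    found = []
    for ch in text:
        encoded = ch.encode(codec)
        if len(encoded) > 1 and ESCAPE_BYTES & set(encoded):
            found.append(ch)
    return found


if __name__ == "__main__":
    sample = b"\xbf\x5c".decode("gbk")          # one GBK character ending in byte 0x5C
    print(risky_chars(sample + "a", "gbk"))     # reports the GBK character, not 'a'
    print(risky_chars(sample, "utf8"))          # empty: non-ASCII UTF-8 bytes are >= 0x80
    print(is_valid(b"\xff", "utf8"))            # False: 0xFF never occurs in UTF-8
    print(len("\U0001F600".encode("utf-8")))    # 4, the longest UTF-8 encoding
```

A check like risky_chars could flag input for which disabling the escape character feature is worth considering, and is_valid could stand in for the validation that MySQL performs but GaussDB does not. Whether either check suits a given application is a judgment call.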