Updated on 2024-06-03 GMT+08:00

Character Sets and Collations

A character set defines how characters are encoded, and a collation defines how characters are compared and sorted. This section describes the character sets and collations in B-compatible GaussDB (sql_compatibility = 'B'). The character sets, collations, and syntax described in this section are supported only in B-compatible mode.

For details about the character sets supported by GaussDB, see "ENCODING" in CREATE DATABASE. For details about the supported collations, see the PG_COLLATION system catalog.

Some character sets have default collations in B-compatible mode. For details, see Table 1.

Character sets and collations are related as follows:

  • Each character set has one or more collations and has only one default collation.

  • Each collation has only one associated character set.

  • The sorting results of the same data using different collations may be different.

  • In GaussDB, utf8mb4 and utf8 are the same character set.

  • When sql_compatibility is set to 'B', the BINARY and SQL_ASCII character sets are the same.

  • You are advised to use the same character set for table columns as for server_encoding to avoid the performance loss caused by transcoding.
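
The point that the same data can sort differently under different collations can be illustrated outside the database. The following is a minimal Python sketch, not GaussDB code: the two sort keys only approximate the idea of a binary comparison and a case-insensitive comparison, and the sample words are invented for illustration.

```python
# Illustration only: Python sort keys stand in for database collations.
# They are NOT GaussDB's collation implementations.
words = ["banana", "Apple", "cherry", "apple", "Banana"]

# Binary-style comparison: order by raw code point values, so every
# uppercase ASCII letter sorts before every lowercase one.
binary_order = sorted(words)

# Case-insensitive-style comparison: fold case before comparing.
# sorted() is stable, so strings that compare equal keep their input order.
case_insensitive_order = sorted(words, key=str.casefold)

print(binary_order)            # ['Apple', 'Banana', 'apple', 'banana', 'cherry']
print(case_insensitive_order)  # ['Apple', 'apple', 'banana', 'Banana', 'cherry']
```

The input is identical in both calls; only the comparison rule changes, which is the role a collation plays for a character set.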

GaussDB supports the following features:

  • Multiple character sets can be used to store character strings.

  • Collations can be used to compare character strings.

  • Database-level, schema-level, table-level, and column-level character sets and collations are supported.

    Character strings with different character sets and collations cannot be used in the same server, database, table, or SQL statement.
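
To make the storage side concrete, the sketch below shows how one string becomes different byte sequences under two character sets, and what a transcoding step involves. It is a Python illustration under stated assumptions, not GaussDB behavior: UTF-8 and GBK are chosen only as familiar examples, and the sample string is arbitrary.

```python
# Illustration only: the same text stored under two character sets.
text = "数据库"  # arbitrary sample string (three CJK characters)

utf8_bytes = text.encode("utf-8")  # 3 bytes per character for these characters
gbk_bytes = text.encode("gbk")     # 2 bytes per character for these characters

print(len(utf8_bytes))  # 9
print(len(gbk_bytes))   # 6

# Transcoding: decode from the source character set, then re-encode in the
# target character set. This extra work is what is avoided when both sides
# already use the same character set.
transcoded = utf8_bytes.decode("utf-8").encode("gbk")
print(transcoded == gbk_bytes)  # True
```

Because the byte sequences differ, data must be converted whenever the two sides of an operation use different character sets, which is the cost behind the advice to keep column character sets consistent with server_encoding.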