Character Processing Functions and Operators
String functions and operators provided by GaussDB are for concatenating strings with each other, concatenating strings with non-strings, and matching the patterns of strings. Note: Except length-related functions, other functions and operators of string processing functions do not support parameters greater than 1 GB.
bin(number)
Description: Returns a binary string of a given number.
Parameter |
Type |
Description |
---|---|---|
number |
|
Given number. |
Return type: text.
Example:
1 2 3 4 5 |
gaussdb=# SELECT bin(5); bin -------- 101 (1 row) |

- The bin function takes effect only when sql_compatibility is set to 'MYSQL'.
- If the value of the input parameter number is a decimal, round it down.
- If the absolute value of the input parameter number exceeds the maximum value of the bigint unsigned type, convert the input parameter number to the maximum value of the bigint unsigned type.
bit_length(string)
Description: Specifies the number of bits occupied by a string.
Return type: int.
Example:
1 2 3 4 5 |
gaussdb=# SELECT bit_length('world'); bit_length ------------ 40 (1 row) |
btrim(string text [, characters text])
Description: Removes the longest string consisting only of characters in characters (a space by default) from the start and end of string.
Return type: text.
Example:
1 2 3 4 5 |
gaussdb=# SELECT btrim('sring' , 'ing'); btrim ------- sr (1 row) |
char_length(string) or character_length(string)
Description: Specifies the number of characters in a string.
Return type: int.
Example:
1 2 3 4 5 |
gaussdb=# SELECT char_length('hello'); char_length ------------- 5 (1 row) |
dump(expr[, return_fmt [, start_position [, length ] ] ])
Description: Returns the data type code, byte length, and internal representation of the input expression. return_fmt specifies the number system of the internal representation, start_position specifies the byte from which the internal representation starts, and length indicates the length of the data to be read.
Return type: text.

This function is valid only when the value of a_format_version is 10c and that of a_format_dev_version is s2 in an ORA-compatible database.
elt(pos,str1,str2,...)
Description: Returns the posth string.
Parameter |
Type |
Description |
---|---|---|
pos |
|
Specified position of the parameter. |
str1,str2,... |
|
List of character strings. |
Return type: text.
Example:
1 2 3 4 5 |
gaussdb=# SELECT elt(3, 'a', 'b', 'c'); elt ------- c (1 row) |

- The elt function takes effect only when sql_compatibility is set to 'MYSQL'.
- If the input parameter pos is less than 1 or exceeds the number of parameters, NULL is returned.
field(str,str1,str2,str3,...)
Description: The field function returns the position of str in the str1,str2,str3,...,strN list. The position increases from 1. If 0 is returned, str is not found. If str is NULL, 0 is returned. If the input parameters of a function are all digits, the comparison is performed based on digits. If the input parameters are all non-digits, the comparison is performed based on character strings. If the input parameters contain both digits and non-digits, the comparison is performed based on the double type.
Parameter |
Type |
Description |
---|---|---|
str,str1,str2,str3,... |
|
List of character strings. |
Return type: bigint.
Example:
1 2 3 4 5 |
gaussdb=# SELECT field( 'abc','1',1,'abc','abcd' ); field ------- 3 (1 row) |

- The field function takes effect only when sql_compatibility is set to 'MYSQL'.
- Since the version whose b_format_version is set to '5.7' and b_format_dev_version is set to 's1', the sql_mode parameter pad_char_to_full_length specifies whether to add spaces at the end of the char type, which affects the field comparison result. For details, see Table 1.
- Since the version whose b_format_version is set to '5.7' and b_format_dev_version is set to 's1', the behavior of the character, binary, numeric, and date and time types is M-compatible, which affects the field comparison result. For details, see Data Types. For the floating-point type in the numeric type, the precision may be different from that in MySQL due to different connection parameter settings. Therefore, this scenario is not recommended, or the numeric type is used instead. For details, see Connection Parameters.
- Since the version whose b_format_version is set to '5.7' and b_format_dev_version is set to 's2', characters can be escaped and constant character strings can obtain collations. The collations affect the field comparison result. For details, see the SET NAMES syntax in SET. For details about the rules for combining different collations of the character types, see Rules for Combining Character Sets and Collations.
insert(str1, pos, len, str2)
Description: Returns the processing result of str1. The substring starts from pos and the number of len characters in the character string is replaced with str2. If any input parameter is NULL, the return value is NULL.
Parameter |
Type |
Description |
---|---|---|
pos, len |
|
pos indicates the parameter in the specified position, and len indicates the replacement length. |
str1, str2 |
|
String. |
Return value type: The return value of the function is of the text type (both s1 and s2 are of the text type) or bytea type (any of s1 or s2 is of the bytea type).
Example:
1 2 3 4 5 |
gaussdb=# SELECT INSERT('abcdef',2,3,'gg'); insert -------- aggef (1 row) |

- The insert function takes effect only when sql_compatibility is set to 'MYSQL'.
- The value range of the input parameter of the Int64 type is limited. If the value exceeds the value range of –9223372036854775808 to 9223372036854775807, an error is reported. MySQL has no restriction on the range of input parameters of the numeric type. If an exception occurs, an alarm is generated and the upper or lower limit is used. Input parameters of the character string are restricted. The maximum length of the input parameter of the text type is 2^30 – 5 bytes, and the maximum length of the input parameter of the bytea type is 2^30 – 512 bytes.
instr(text,text,int,int)
Description: instr(string1,string2,int1,int2) returns the position matching string2 in string1 for the int2th time from the position specified by int1. int1 indicates the start position for matching, and int2 indicates the number of matching times.
Return type: int.
Example:
1 2 3 4 5 |
gaussdb=# SELECT instr( 'abcdabcdabcd', 'bcd', 2, 2 ); instr ------- 6 (1 row) |
instrb(text,text,int,int)
Description: instrb(string1,string2,int1,int2) returns the position matching string2 in string1 for the int2th time from the position specified by int1. int1 indicates the start position for matching, and int2 indicates the number of matching times. Different from the instr function, instrb is calculated in bytes and is not affected by the character set in use.
Return type: int.
Example:
1 2 3 4 5 |
gaussdb=# SELECT instrb( 'abcdabcdabcd', 'bcd', 2, 2 ); instrb ------- 6 (1 row) |

- This function is valid only when the value of a_format_version is 10c and that of a_format_dev_version is s1 in an ORA-compatible database.
- If the values of int1 and int2 are decimals, the values are truncated instead of being rounded off.
lengthb(text/bpchar)
Description: Obtains the number of bytes of a specified string.
Return type: int.
Example:
1 2 3 4 5 |
gaussdb=# SELECT lengthb('hello'); lengthb --------- 5 (1 row) |
lengthc(text/bpchar)
Description: Returns the length of a specified string, calculated according to Unicode character encoding rules.
Return type: int.
Example:
1 2 3 4 5 |
gaussdb=# SELECT lengthc('hello'); lengthc --------- 5 (1 row) |
left(str text, n int)
Description: Returns the first n characters in a string. When n is negative, all but the last |n| characters are returned.
Return type: text.
Example:
1 2 3 4 5 |
gaussdb=# SELECT left('abcde', 2); left ------ ab (1 row) |
length(string bytea, encoding name)
Description: Specifies the number of characters in string in the given encoding. The string must be valid in this encoding.
Return type: int.
Example:
1 2 3 4 5 |
gaussdb=# SELECT length('jose', 'UTF8'); length -------- 4 (1 row) |

If the length of the bytea type is queried and UTF8 encoding is specified, the maximum length can only be 536870888.
locate(substr, str[, pos])
Description: Returns the position where substr appears for the first time in str, starting from pos (1 by default). If substr cannot be found in str, this function returns 0.
Parameter |
Type |
Description |
---|---|---|
substr, str |
|
|
pos |
|
(Optional) Start position of the search. |
Return type: int.
Example:
1 2 3 4 5 6 7 8 9 10 11 |
gaussdb=# SELECT locate( 'b','abcabc'); locate ------- 2 (1 row) gaussdb=# SELECT locate( 'b','abcabc',3); locate ------- 5 (1 row) |

The locate function takes effect only when sql_compatibility is set to 'MYSQL'.
lpad_s(string text, length int [, fill text])
Description: Fills up string to length by appending the characters fill (a space by default). If the length of string is longer than length, an error is reported.
Return type: text.

- The length parameter indicates the display length of a character string. The display length of a single character is processed based on ORA-compatible requirements.
- During the function execution, if the remaining length is 1 and the next character is of the full-width type (2 bytes), a space character is added to the left of the string.
- If the value of length is a decimal, the value is rounded down instead of being rounded off.
- The string and fill parameters do not comply with the encoding specifications.
- The length parameter indicates the total length of characters in a character string. The length of a single character is fixed to 1.
- If the value of length is a decimal, the value is rounded off.
- The string and fill parameters do not comply with the encoding specifications.
lpad(string text, length int [, fill text])
Description: Fills up string to length by appending the characters fill (a space by default). If string is already longer than length, then it is truncated.
Return type: text.

- The length parameter indicates the display length of a character string. The display length of a single character is processed based on ORA-compatible requirements.
- During the function execution, if the remaining length is 1 and the next character is of the full-width type (2 bytes), a space character is added to the left of the string.
- If the value of length is a decimal, the value is rounded down instead of being rounded off.
- The string and fill parameters do not comply with the encoding specifications.
- The length parameter indicates the total length of characters in a character string. The length of a single character is fixed to 1.
- If the value of length is a decimal, the value is rounded off.
- The string and fill parameters do not comply with the encoding specifications.
make_set(bits,str1,str2...)
Description: str1 corresponds to bit 0 of the bits input parameter in the bit format, str2 corresponds to bit 1, and so on. If the lower bit of the corresponding bit form is 1, the corresponding str parameter is appended to the result and returned with a comma as the separator.
Parameter |
Type |
Description |
---|---|---|
make_set |
|
The returned result consists of some character strings selected from str1,str2,...,strN and separated by commas (,). |
Return type: text.
Example:
1 2 3 4 5 6 7 8 9 10 |
gaussdb=# SELECT make_set(7,'a','b','c','d'); make_set ---------- a,b,c (1 row) gaussdb=# SELECT make_set(b'1010101010101101010101010110101010101011010101010101101010101010','1','2','3','4','5','6','7','8','9','10','11','12','13','14','15','16','17','18','19','20','21','22','23','24','25','26','27','28','29','30','31','32','33','34','35','36','37','38','39','40','41','42','43','44','45','46','47','48','49','50','51','52','53','54','55','56','57','58','59','60','61','62','63','64','65','66','67','68','69'); make_set --------------------------------------------------------------------------------------------------- 2,4,6,8,10,12,13,15,17,19,21,23,25,26,28,30,32,34,36,38,39,41,43,45,47,49,51,52,54,56,58,60,62,64 (1 row) |

- The make_set function takes effect only when sql_compatibility is set to 'MYSQL'.
- When the bits parameter is an integer, the maximum range is int128, which is smaller than the MySQL range.
- When the bits parameter is of the date type (datetime, timestamp, date, or time), it is not supported because the conversion from the date type to the integer type is different from that in MySQL, which is not caused by this function.
- GaussDB and MySQL are inherently different in the bit and Boolean types, causing different returned results, which is not caused by this function. When the bits input parameter is of the Boolean type, the str input parameter of the bit and Boolean types is not supported.
notlike(x bytea name text, y bytea text)
Description: Compares x and y to check whether they are inconsistent.
Return type: Boolean.
Example:
1 2 3 4 5 6 7 8 9 10 |
gaussdb=# SELECT notlike(1,2); notlike -------------- t (1 row) gaussdb=# SELECT notlike(1,1); notlike -------------- f (1 row) |
octet_length(string)
Description: Specifies the number of bytes in a string.
Return type: int.
Example:
1 2 3 4 5 |
gaussdb=# SELECT octet_length('jose'); octet_length -------------- 4 (1 row) |
overlay(string placing string FROM int [for int])
Description: Replaces substrings. FROM int indicates the start position of the replacement in the first string. for int indicates the number of characters replaced in the first string.
Return type: text.
Example:
1 2 3 4 5 |
gaussdb=# SELECT overlay('hello' placing 'world' from 2 for 3 ); overlay --------- hworldo (1 row) |
position(substring in string)
Description: Specifies the position of a substring. Parameters are case-sensitive.
Return type: int. If the character string does not exist, 0 is returned.
Example:
1 2 3 4 5 |
gaussdb=# SELECT position('ing' in 'string'); position ---------- 4 (1 row) |
pg_client_encoding()
Description: Specifies the current client-side encoding name.
Return type: name
Example:
1 2 3 4 5 |
gaussdb=# SELECT pg_client_encoding(); pg_client_encoding -------------------- UTF8 (1 row) |
quote(str)
Description: Returns a string enclosed in single quotation marks and adds a backslash (\) before the instances of backslashes (\), single quotation marks ('), ASCII NUL (\0), and Control+Z (\Z). If the parameter is NULL, the return value is NULL without single quotation marks.
Parameter |
Type |
Description |
---|---|---|
quote |
|
Adds quotation marks to the input string and adds a backslash (\) before the backslash (\) and single quotation mark ('). |
Return type: text.
Example:
1 2 3 4 5 |
gaussdb=# SELECT quote('hello\' world'); quote -------------- 'hello\' world' (1 row) |

- The function takes effect only when sql_compatibility is set to 'MYSQL'.
- To enable the GUC parameters, run the following commands:
SET standard_conforming_strings=off; SET escape_string_warning=off; SET backslash_quote=on;
- If the str character string contains \Z, \r, \%, or \_, GaussDB does not escape it, which is different from MySQL. The slash followed by digits may also cause differences, for example, \563. This function difference is the escape character difference between GaussDB and MySQL, which is not caused by this function.
- The output format of \b in the str character string is different from that in MySQL. This is an inherent difference between GaussDB and MySQL and is not caused by this function.
- If the str character string contains \0, GaussDB cannot identify the character because the UTF-8 character set cannot identify the character. As a result, the input fails. This is an inherent difference between GaussDB and MySQL and is not caused by this function.
- If str is of the bit or Boolean type, these types are not supported because they are different in GaussDB and MySQL.
- GaussDB supports a maximum of 1 GB data transfer. The maximum length of the str input parameter is 536870908, and the maximum size of the result string returned by the function is 1 GB.
quote_ident(string text)
Description: Returns the given string suitably quoted to be used as an identifier in an SQL statement string (quotation marks are used as required). Quotation marks are added only if necessary (that is, if the string contains non-identifier characters or would be case-folded). Embedded quotation marks are properly doubled.
Return type: text.
Example:
1 2 3 4 5 |
gaussdb=# SELECT quote_ident('hello world'); quote_ident -------------- "hello world" (1 row) |
quote_literal(string text)
Description: Returns the given string suitably quoted to be used as text in an SQL statement string (quotation marks are used as required).
It supports XML data that is explicitly converted to the character type.
Return type: text.
Example:
1 2 3 4 5 |
gaussdb=# SELECT quote_literal('hello'); quote_literal --------------- 'hello' (1 row) |
If a command similar to the following exists, the text will be escaped.
1 2 3 4 5 |
gaussdb=# SELECT quote_literal(E'O\'hello'); quote_literal --------------- 'O''hello' (1 row) |
If a command similar to the following exists, the backslash will be properly doubled.
1 2 3 4 5 |
gaussdb=# SELECT quote_literal('O\hello'); quote_literal --------------- E'O\\hello' (1 row) |
If the parameter is NULL, NULL is returned. If the parameter may be null, you are advised to use quote_nullable.
1 2 3 4 5 |
gaussdb=# SELECT quote_literal(NULL); quote_literal --------------- (1 row) |
quote_literal(value anyelement)
Description: Converts the given value to text and then quotes it as a literal.
Return type: text.
Example:
1 2 3 4 5 |
gaussdb=# SELECT quote_literal(42.5); quote_literal --------------- '42.5' (1 row) |
If a command similar to the following exists, the given value will be escaped.
1 2 3 4 5 |
gaussdb=# SELECT quote_literal(E'O\'42.5'); quote_literal --------------- '0''42.5' (1 row) |
If a command similar to the following exists, the backslash will be properly doubled.
1 2 3 4 5 |
gaussdb=# SELECT quote_literal('O\42.5'); quote_literal --------------- E'O\\42.5' (1 row) |
quote_nullable(string text)
Description: Returns the given string suitably quoted to be used as a string in an SQL statement string (quotation marks are used as required).
It supports XML data that is explicitly converted to the character type.
Return type: text.
Example:
1 2 3 4 5 |
gaussdb=# SELECT quote_nullable('hello'); quote_nullable ---------------- 'hello' (1 row) |
If a command similar to the following exists, the text will be escaped.
1 2 3 4 5 |
gaussdb=# SELECT quote_nullable(E'O\'hello'); quote_nullable ---------------- 'O''hello' (1 row) |
If a command similar to the following exists, the backslash will be properly doubled.
1 2 3 4 5 |
gaussdb=# SELECT quote_nullable('O\hello'); quote_nullable ---------------- E'O\\hello' (1 row) |
If the parameter is NULL, NULL is returned.
1 2 3 4 5 |
gaussdb=# SELECT quote_nullable(NULL); quote_nullable ---------------- NULL (1 row) |
quote_nullable(value anyelement)
Description: Converts the given value to text and then quotes it as a literal.
Return type: text.
Example:
1 2 3 4 5 |
gaussdb=# SELECT quote_nullable(42.5); quote_nullable ---------------- '42.5' (1 row) |
If a command similar to the following exists, the given value will be escaped.
1 2 3 4 5 |
gaussdb=# SELECT quote_nullable(E'O\'42.5'); quote_nullable ---------------- 'O''42.5' (1 row) |
If a command similar to the following exists, the backslash will be properly doubled.
1 2 3 4 5 |
gaussdb=# SELECT quote_nullable('O\42.5'); quote_nullable ---------------- E'O\\42.5' (1 row) |
If the parameter is NULL, NULL is returned.
1 2 3 4 5 |
gaussdb=# SELECT quote_nullable(NULL); quote_nullable ---------------- NULL (1 row) |
space(count)
Description: Returns a string consisting of a specified number of spaces.
Parameter |
Type |
Description |
---|---|---|
count |
|
Number of spaces. |
Return type: text.
Example:
1 2 3 4 5 |
gaussdb=# SELECT space(5); space -------- (1 row) |

- The space function takes effect only when sql_compatibility is set to 'MYSQL'.
- If the input parameter count is less than 1073741819 and greater than 0, a string of count spaces is returned. Otherwise, an empty string is returned.
substring_inner(string [from int] [for int])
Description: Extracts a substring. from int indicates the start position of the truncation. for int indicates the number of bytes truncated.
Return type: text.
Example:
1 2 3 4 5 |
gaussdb=# SELECT substring_inner('adcde', 2,3); substring_inner ----------------- dcd (1 row) |
substring(string [from int] [for int])
Description: Extracts a substring. from int indicates the start position of the truncation. for int indicates the number of bytes truncated.
Return type: text.
Example:
1 2 3 4 5 |
gaussdb=# SELECT substring('Thomas' from 2 for 3); substring ----------- hom (1 row) |

This function is in the MySQL database. When the GUC parameter b_format_version is set to '5.7' and b_format_dev_version is set to 's1', [from int] can be a negative number, indicating the sequence number of the character counted from back to front. If this parameter is not set, [from int] is a negative number and the result is empty.
substring(string, pos, len)
Description: Extracts a substring. pos indicates the start position of the truncation. len indicates the number of bytes truncated.
Parameters: See Table 1 Parameters.
Parameter |
Type |
Description |
Value Range |
---|---|---|---|
string |
text |
Character string to be truncated. |
- |
pos |
int |
Start position of the character string to be truncated. |
The absolute value is less than the string length. |
len |
int |
Length of the character string to be truncated. |
The value is greater than 0. |
Return type: text.
Example:
1 2 3 4 5 |
gaussdb=# SELECT substring('substrteststring', -5, 5); substring ----------- tring (1 row) |

This function is in the MySQL database. When the GUC parameter b_format_version is set to '5.7' and b_format_dev_version is set to 's1', pos can be a negative number, indicating the sequence number of the character counted from back to front. If this parameter is not set, pos is a negative number and the result is empty.
substring_index(str,split,index)
Description: str is a string, split is a separator string, and index is the position from which the string is split. This function returns all contents on the left or right (left if the value of index is positive and right if the value of index is negative) of the str string that is split by split from the position specified by index.
Parameters: See Table 2 substring_index parameters.
Parameter |
Type |
Description |
---|---|---|
str |
text |
Character string to be truncated. |
split |
text |
Target character string to be split. |
index |
int |
Position from which the string is split. If the value of index is a positive number, all contents on the left of the position are obtained. If the value of index is a negative number, all contents on the right are obtained. |
Return type: text.
Example:
1 2 3 4 5 6 7 8 9 10 11 |
gaussdb=# SELECT substring_index('Test1splitTest2splitTest3splitTest4', 'split', 2); substring_index ----------------- Test1splitTest2 (1 row) gaussdb=# SELECT substring_index('Test1splitTest2splitTest3splitTest4', 'split', -2); substring_index ----------------- Test3splitTest4 (1 row) |
substring(string from pattern)
Description: Extracts substrings matching the POSIX-style regular expression. It returns the text that matches the pattern. If no match record is found, a null value is returned.
Return type: text.
Example:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 |
gaussdb=# SELECT substring('Thomas' from '...$'); substring ----------- mas (1 row) gaussdb=# SELECT substring('foobar' from 'o(.)b'); result -------- o (1 row) gaussdb=# SELECT substring('foobar' from '(o(.)b)'); result -------- oob (1 row) |

- If the POSIX regular expression contains any parentheses, the portion of the text that matched the first parenthesized sub-expression (the one whose left parenthesis comes first) is returned. You can put parentheses around the whole expression if you want to use parentheses within it without triggering this exception.
- When sql_compatibility is set to 'MYSQL', the GUC parameter b_format_version is set to '5.7', and b_format_dev_version is set to 's1', this function is used in the same way as the substr(str FROM pos) function. The output result and behavior of this function are the same as those of the substr(str FROM pos) function.
substring(string from pattern for escape)
Description: Extracts substrings matching the SQL regular expression. The declared schema must match the entire data string; otherwise, the function fails and returns a null value. To indicate the part of the schema that should be returned on success, the schema must contain two occurrences of the escape character followed by a double quotation mark ("). The text matching the portion of the pattern between these marks is returned.
Return type: text.
Example:
1 2 3 4 5 |
gaussdb=# SELECT substring('Thomas' from '%#"o_a#"_' for '#'); substring ----------- oma (1 row) |

When sql_compatibility is set to 'MYSQL', the GUC parameter b_format_version is set to '5.7', and b_format_dev_version is set to 's1', this function is used in the same way as the substr(str FROM pos FOR len) function. The output result and behavior of this function are the same as those of the substr(str FROM pos FOR len) function.
rawcat(raw,raw)
Description: Indicates the string concatenation function.
Return type: raw.
Example:
1 2 3 4 5 |
gaussdb=# SELECT rawcat('ab','cd'); rawcat -------- ABCD (1 row) |
regexp_substr(string text, pattern text [, position int [, occurrence int [, flags text]]])
Description: Extracts substrings from a regular expression. Its function is similar to substr. When a regular expression contains multiple parallel brackets, it also needs to be processed.
Parameter description:
- string: source character string used for matching.
- pattern: regular expression pattern string used for matching.
- position: start character of the source string used for matching. This parameter is optional. The default value is 1.
- occurrence: sequence number of the matched substring to be extracted. This parameter is optional. The default value is 1.
- flags: contains zero or multiple single-letter flags that change the matching behavior of the function. This parameter is optional. The options supported by flags and description are shown in Table 3.
Table 3 Options supported by flags Option
Description
'b'
Indicates the BRE matching without extension.
'c'
Indicates the case-sensitive matching.
'e'
Indicates the ERE matching with extension.
'i'
Indicates the case-insensitive matching.
'm'
Indicates the multi-line matching. If flags contains "m", use the multi-line matching. Otherwise, use the single-line matching.
'n'
The option meanings of "n" are related to the GUC parameter behavior_compat_options and the compatibility mode of the current database.- If the SQL syntax is ORA- and MySQL-compatible and the value of the GUC parameter behavior_compat_options contains aformat_regexp_match, the n option indicates that "." matches the linefeed '\n'. If 'n' is not specified, "." does not match the linefeed.
- In other cases, the 'n' option has the same meaning as the 'm' option.
'p'
Indicates partial linefeed-sensitive matching, which is similar to the linefeed-sensitive matching ('m' or 'n') and affects "." and square bracket expression, but does not affect ^ and $.
'q'
Indicates common character matching.
's'
Indicates the single-line matching. The meaning is opposite to those of 'm' and 'n'.
't'
Indicates the compact matching. The whitespace characters match themselves.
'w'
Indicates the reverse partial linefeed-sensitive matching. The meaning is opposite to that of 'p'.
'x'
Indicates the loose matching. The whitespace characters are ignored.
Return type: text.
Example:
1 2 3 4 5 6 7 8 9 10 11
gaussdb=# SELECT regexp_substr('str','[ac]'); regexp_substr --------------- (1 row) gaussdb=# SELECT regexp_substr('foobarbaz', 'b(..)', 3, 2) AS RESULT; result -------- baz (1 row)
regexp_count(string text, pattern text [, position int [, flags text]])
Description: obtains the number of substrings used for matching.
Parameter description:
- string: source character string used for matching.
- pattern: regular expression pattern string used for matching.
- position: start character of the source string used for matching. This parameter is optional. The default value is 1.
- flags: contains zero or multiple single-letter flags that change the matching behavior of the function. This parameter is optional. The options supported by flags and description are shown in Table 3.

When the function is in an ORA-compatible database, the value of a_format_version is 10c, and the value of a_format_dev_version is s1, the pattern parameter ending with a backslash (\) is valid.
Return type: int.
Example:
gaussdb=# SELECT regexp_count('foobarbaz','b(..)', 5) AS RESULT;
result
--------
1
(1 row)
regexp_instr(string text, pattern text [, position int [, occurrence int [, return_opt int [, flags text]]]])
Description: obtains the position (starting from 1) of the substring that meets the matching condition. If no substring is matched, 0 is returned.
Parameter description:
- string: source character string used for matching.
- pattern: regular expression pattern string used for matching.
- position: start character of the source string used for matching. This parameter is optional. The default value is 1.
- occurrence: sequence number of the matched substring to be obtained. This parameter is optional. The default value is 1.
- return_opt: specifies whether to return the position of the first or last character of the matched substring. This parameter is optional. If the value is 0, the position of the first character (starting from 1) of the matched substring is returned. If the value is greater than 0, the position of the next character of the end character of the matched substring is returned. The default value is 0.
- flags: contains zero or multiple single-letter flags that change the matching behavior of the function. This parameter is optional. The options supported by flags and description are shown in Table 3.
Return type: int.
Example:
gaussdb=# SELECT regexp_instr('foobarbaz','b(..)', 1, 1, 0) AS RESULT; result -------- 4 (1 row) gaussdb=# SELECT regexp_instr('foobarbaz','b(..)', 1, 2, 0) AS RESULT; result -------- 7 (1 row)
regexp_matches(string text, pattern text [, flags text])
Description: Returns all captured substrings resulting from matching a POSIX-style regular expression against string. If the pattern does not match, the function returns no rows. If the pattern contains no parenthesized sub-expressions, then each row returned is a single-element text array containing the substring matching the whole pattern. If the pattern contains parenthesized sub-expressions, the function returns a text array whose nth element is the substring matching the nth parenthesized sub-expression of the pattern.
The optional flags parameter contains zero or multiple single-letter flags that change the behavior of a function. i indicates that the matching is not related to uppercase and lowercase. g indicates that each matched substring is replaced, instead of replacing only the first one.

If the last parameter is provided but the parameter value is an empty string ('') and the SQL compatibility mode of the database is set to ORA, the returned result is an empty set. This is because the ORA compatible mode treats the empty string ('') as NULL. To resolve this problem, you can:
- Change the database SQL compatibility mode to TD.
- Do not provide the last parameter or do not set the last parameter to an empty string.
Return type: SETOF text[].
Example:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 |
gaussdb=# SELECT regexp_matches('foobarbequebaz', '(bar)(beque)'); regexp_matches ---------------- {bar,beque} (1 row) gaussdb=# SELECT regexp_matches('foobarbequebaz', 'barbeque'); regexp_matches ---------------- {barbeque} (1 row) gaussdb=# SELECT regexp_matches('foobarbequebazilbarfbonk', '(b[^b]+)(b[^b]+)', 'g'); regexp_matches ---------------- {bar,beque} {bazil,barf} (2 rows) |
regexp_match(string text, pattern text [, flags text])
Description: Returns a string array, which is the first substring obtained by matching the POSIX regular expression with the specified string. If the pattern does not match, the function returns a blank line. If the pattern contains no parenthesized sub-expressions, then each row returned is a single-element text array containing the substring matching the whole pattern. If the pattern contains parenthesized sub-expressions, the function returns a text array whose nth element is the substring matching the nth parenthesized sub-expression of the pattern.
The flags parameter is optional. If flags is not specified, tight syntax matching is used by default. The flags parameter contains zero or multiple single-letter flags that change the behavior of a function. Table 4 lists the supported flags.
Option |
Description |
---|---|
'b' |
Converts the POSIX regular expression to a BRE. |
'c' |
Indicates the case-sensitive matching. |
'e' |
Converts the POSIX regular expression to an ERE. |
'i' |
Indicates the case-insensitive matching. |
'n' |
Indicates newline-sensitive matching. A character string is regarded as multiple lines. Caret (^) and dollar sign ($) match the beginning and end of each line. Periods (.) do not match linefeed. |
'm' |
Indicates newline-sensitive matching. A character string is regarded as multiple lines. Caret (^) and dollar sign ($) match the beginning and end of each line. Periods (.) do not match linefeed. The matching mode of n is the same. |
'p' |
Indicates partial newline-sensitive matching. Periods (.) do not match linefeed. Caret (^) can only match the beginning of the first line and dollar sign ($) can only match the end of the last line. |
'q' |
Represents the regular expression is a text string, all of which are common characters. |
's' |
Indicates non-newline-sensitive matching. That is, a character string is regarded as a single line, and caret (^) and dollar sign ($) match the beginning and end of the line, respectively. In this mode, periods (.) in the regular expression can match any character, including the linefeed character. |
't' |
Indicates the tight syntax matching. |
'w' |
Indicates inverse partial newline-sensitive matching. In the regular expression, caret (^) matches the beginning of each line, and dollar sign ($) matches only the end of the last line. Period (.) can match any character, including the linefeed character. |
'x' |
Indicates the extended syntax matching, ignoring whitespace characters and comments in regular expressions. The following three scenarios are excluded:
|

If the parameter contains an empty string ('') and the compatibility mode of the database is set to ORA, the returned result is NULL. This is because the empty string is processed as NULL in ORA-compatible mode.
Return type: SETOF text[].
Example:
gaussdb=# SELECT regexp_match('foobarbequebaz', '(bar)(beque)'); regexp_match -------------- {bar,beque} (1 row) gaussdb=# SELECT (regexp_match('foobarbequebaz', 'bar.*que'))[1]; regexp_match -------------- barbeque (1 row) gaussdb=# SELECT regexp_match('Learning #PostgreSQL', 'R', 'c'); regexp_match -------------- (1 row) gaussdb=# SELECT regexp_match('hello world', 'h e l l o', 'x'); regexp_match -------------- {hello} (1 row)
regexp_split_to_array(string text, pattern text [, flags text ])
Description: Splits string using a POSIX-style regular expression as the delimiter. The regexp_split_to_array function behaves the same as regexp_split_to_table, except that regexp_split_to_array returns its result as an array of text.
Return type: text[].
Example:
1 2 3 4 5 |
gaussdb=# SELECT regexp_split_to_array('hello world', E'\\s+'); regexp_split_to_array ----------------------- {hello,world} (1 row) |
regexp_split_to_table(string text, pattern text [, flags text])
Description: Splits string using a POSIX-style regular expression as the delimiter. If there is no match to the pattern, the function returns the string. If there is at least one match, for each match it returns the text from the end of the last match (or the beginning of the string) to the beginning of the match. When there are no more matches, it returns the text from the end of the last match to the end of the string.
The flags parameter contains zero or multiple single-letter flags that change the behavior of a function. i indicates that the matching is not related to uppercase and lowercase. By default, each matching substring is replaced, instead of replacing only the first one.
Return type: SETOF text.
Example:
1 2 3 4 5 6 |
gaussdb=# SELECT regexp_split_to_table('hello world', E'\\s+'); regexp_split_to_table ----------------------- hello world (2 rows) |
repeat(string text, number int)
Description: Repeats string the specified number of times.
Return type: text.
Example:
1 2 3 4 5 |
gaussdb=# SELECT repeat('Pg', 4); repeat ---------- PgPgPgPg (1 row) |

The maximum size of memory allocated at a time cannot exceed 1 GB due to the memory allocation mechanism of the database. Therefore, the maximum value of number cannot exceed (1 GB – x)/lengthb (string) – 1. x indicates the length of the header information, which is usually greater than 4 bytes. The value varies among different scenarios.
replace(string text, from text, to text)
Description: Replaces all occurrences in string of substring from with substring to.
Return type: text.
Example:
1 2 3 4 5 |
gaussdb=# SELECT replace('abcdefabcdef', 'cd', 'XXX'); replace ---------------- abXXXefabXXXef (1 row) |
replace(string, substring)
Description: Deletes all substrings in a string.
String type: text.
Substring type: text.
Return type: text.
Example:
1 2 3 4 5 |
gaussdb=# SELECT replace('abcdefabcdef', 'cd'); replace ---------------- abefabef (1 row) |
reverse(str)
Description: Returns a reversed string (by character).
Return type: text.
Example:
1 2 3 4 5 |
gaussdb=# SELECT reverse('abcde'); reverse --------- edcba (1 row) |
right(str text, n int)
Description: Returns the last n characters in a string. When n is negative, all but the first |n| characters are returned.
Return type: text.
Example:
1 2 3 4 5 6 7 8 9 10 11 |
gaussdb=# SELECT right('abcde', 2); right ------- de (1 row) gaussdb=# SELECT right('abcde', -2); right ------- cde (1 row) |
rpad(string text, length int [, fill text])
Description: Fills up string to length by appending the characters fill (a space by default). If string is already longer than length, then it is truncated.
Return type: text.

- The length parameter indicates the display length of a character string. The display length of a single character is processed based on O-compatible requirements.
- During the function execution, if the remaining length is 1 and the next character is of the full-width type (2 bytes), a space character is added to the right of the string.
- If the value of length is a decimal, the value is truncated instead of being rounded off.
- The string and fill parameters do not comply with the encoding specifications.
- If the length of the string parameter exceeds 1 GB after padding, an error is reported.
- The length parameter indicates the total length of characters in a character string. The length of a single character is fixed to 1.
- If the value of length is a decimal, the value is rounded off.
- The string and fill parameters do not comply with the encoding specifications.
- If the length of the string parameter exceeds 1 GB after padding, an error is reported.
substrb(text,int,int)
Description: Extracts a substring. The first int indicates the start position of the subtraction. The second int indicates the number of characters extracted.
Return type: text.
Example:
1 2 3 4 5 |
gaussdb=# SELECT substrb('string',2,3); substrb --------- tri (1 row) |
substrb(text,int)
Description: Extracts a substring. int indicates the start position of the extraction.
Return type: text.
Example:
1 2 3 4 5 |
gaussdb=# SELECT substrb('string',2); substrb --------- tring (1 row) |
substr(text,int)/substr(str FROM pos)
Description: str is the target string, and pos is the position in the string. The string of str from the pos position is output to the end of the character.
Return type: text.
Example:
1 2 3 4 5 6 7 8 9 10 11 |
gaussdb=# SELECT substr('stringtest' from 4); substr --------- ingtest (1 row) gaussdb=# SELECT substr('stringtest', 4); substr --------- ingtest (1 row) |
substr(str FROM pos FOR len)
Description: Extracts a substring. The first int indicates the start position of the subtraction. The second int indicates the number of characters extracted. The value of pos can be a negative number. If the value is a negative number, the value is extracted from back to front.
Return type: text.
Example:
1 2 3 4 5 |
gaussdb=# SELECT substr('teststring' from 5 for 2); substr -------- st (1 row) |
substr(bytea,from,count)
Description: Extracts a substring from bytea. from specifies the position where the extraction starts. count specifies the length of the extracted substring.
Return type: text.
Example:
1 2 3 4 5 |
gaussdb=# SELECT substr('string',2,3); substr -------- tri (1 row) |
string || string
Description: Concatenates strings.
Return type: text.
Example:
1 2 3 4 5 |
gaussdb=# SELECT 'MPP'||'DB' AS RESULT; result -------- MPPDB (1 row) |
string || non-string or non-string || string
Description: Concatenates strings and non-strings.
Return type: text.
Example:
1 2 3 4 5 |
gaussdb=# SELECT 'Value: '||42 AS RESULT; result ----------- Value: 42 (1 row) |
split_part(string text, delimiter text, field int)
Description: Splits string on delimiter and returns the fieldth column (counting from text of the first appeared delimiter).
Return type: text.
Example:
1 2 3 4 5 |
gaussdb=# SELECT split_part('abc~@~def~@~ghi', '~@~', 2); split_part ------------ def (1 row) |
strpos(string, substring)
Description: Specifies the position of a substring. It is the same as position(substring in string). However, the parameter sequences of them are reversed.
Return type: int.
Example:
1 2 3 4 5 |
gaussdb=# SELECT strpos('source', 'rc'); strpos -------- 4 (1 row) |
to_hex(number int or bigint)
Description: Converts a number to a hexadecimal expression.
Return type: text.
Example:
1 2 3 4 5 |
gaussdb=# SELECT to_hex(2147483647); to_hex ---------- 7fffffff (1 row) |
translate(string text, from text, to text)
Description: Any character in string that matches a character in from is replaced by the corresponding character in to. If from is longer than to, extra characters occurred in from are removed.
Return type: text.
Example:
1 2 3 4 5 |
gaussdb=# SELECT translate('12345', '143', 'ax'); translate ----------- a2x5 (1 row) |
length(string)
Description: Obtains the number of characters in a string. When sql_compatibility is set to 'MYSQL', b_format_version is set to '5.7', and b_format_dev_version is set to 's1', if the string is of the string or text type, the number of bytes of the string is returned.
Return type: integer.
lengthb(string)
Description: Obtains the number of bytes in a string. The value depends on character sets (GBK and UTF8).
Return type: integer.
Example:
1 2 3 4 5 |
gaussdb=# SELECT lengthb('Chinese'); lengthb --------- 7 (1 row) |
substr(string,from)
Description:
Extracts substrings from a string.
from indicates the start position of the extraction.
- If from starts at 0, the value 1 is used.
- If the value of from is positive, all characters from from to the end are extracted.
- If the value of from is negative, the last n characters in the string are extracted, and n indicates the absolute value of from.
Return type: text.
Example:
If the value of from is positive:
1 2 3 4 5 |
gaussdb=# SELECT substr('ABCDEF',2); substr -------- BCDEF (1 row) |
If the value of from is negative:
1 2 3 4 5 |
gaussdb=# SELECT substr('ABCDEF',-2); substr -------- EF (1 row) |
substr(string,from,count)
Description:
Extracts substrings from a string.
from indicates the start position of the extraction.
count indicates the length of the extracted substring.
- If from starts at 0, the value 1 is used.
- If the value of from is positive, extract count characters starting from from.
- If the value of from is negative, extract the last n count characters in the string, in which n indicates the absolute value of from.
- If the value of count is smaller than 1, null is returned.
Return type: text.
Example:
If the value of from is positive:
1 2 3 4 5 |
gaussdb=# SELECT substr('ABCDEF',2,2); substr -------- BC (1 row) |
If the value of from is negative:
1 2 3 4 5 |
gaussdb=# SELECT substr('ABCDEF',-3,2); substr -------- DE (1 row) |
substrb(string,from)
Description: The functionality of this function is the same as that of SUBSTR(string,from). However, the calculation unit is byte.
Return type: text.
Example:
1 2 3 4 5 |
gaussdb=# SELECT substrb('ABCDEF',-2); substrb --------- EF (1 row) |
substrb(string,from,count)
Description: The functionality of this function is the same as that of SUBSTR(string,from,count). However, the calculation unit is byte.
Return type: text.
Example:
1 2 3 4 5 |
gaussdb=# SELECT substrb('ABCDEF',2,2); substrb --------- BC (1 row) |
to_single_byte(char)
Description: Converts all multi-byte characters in a string to single-byte characters.
Return type: text.

This function is valid only when the value of a_format_version is 10c and that of a_format_dev_version is s2 in an ORA-compatible database.
to_multi_byte(char)
Description: Converts all single-byte characters in a string to multi-byte characters.
Return type: text.

This function is valid only when the value of a_format_version is 10c and that of a_format_dev_version is s2 in an ORA-compatible database.
trim([leading |trailing |both] [characters] from string)
Description: Removes the longest string consisting only of characters in characters (a space by default) from the start, end, or both sides of string.
Return type: text.
Example:
1 2 3 4 5 |
gaussdb=# SELECT trim(BOTH 'x' FROM 'xTomxx'); btrim ------- Tom (1 row) |
1 2 3 4 5 |
gaussdb=# SELECT trim(LEADING 'x' FROM 'xTomxx'); ltrim ------- Tomxx (1 row) |
1 2 3 4 5 |
gaussdb=# SELECT trim(TRAILING 'x' FROM 'xTomxx'); rtrim ------- xTom (1 row) |
rtrim(string [, characters])
Description: Removes the longest string consisting only of characters in characters (a space by default) from the end of string.
Return type: text.
Example:
1 2 3 4 5 |
gaussdb=# SELECT rtrim('TRIMxxxx','x'); rtrim ------- TRIM (1 row) |
ltrim(string [, characters])
Description: Removes the longest string consisting only of characters in characters (a space by default) from the start of string.
Return type: text.
Example:
1 2 3 4 5 |
gaussdb=# SELECT ltrim('xxxxTRIM','x'); ltrim ------- TRIM (1 row) |
upper(string)
Description: Converts the string into the uppercase.
Return type: text.
Example:
1 2 3 4 5 |
gaussdb=# SELECT upper('tom'); upper ------- TOM (1 row) |
lower(string)
Description: Converts the string into the lowercase.
Return type: text.
Example:
1 2 3 4 5 |
gaussdb=# SELECT lower('TOM'); lower ------- tom (1 row) |
nls_upper(string [, nlsparam])
Description: Converts a character string to uppercase letters. You can specify a sorting rule to process special uppercase conversion rules in some languages. The format of nlsparam is 'nls_sort=sort_name', where sort_name is replaced by the specific sorting rule name. When the nlsparam parameter is not set, this function is equivalent to upper.
Return type: text.
Example:
1 2 3 4 5 |
gaussdb=# SELECT nls_upper('große'); nls_upper ----------- GROßE (1 row) |
1 2 3 4 5 |
gaussdb=# SELECT nls_upper('große', 'nls_sort = XGerman'); nls_upper ----------- GROSSE (1 row) |

This function can be used only when the value of a_format_version is 10c and that of a_format_dev_version is s2 in an ORA-compatible database.
nls_lower(string [, nlsparam])
Description: Converts a character string to lowercase letters. You can specify a sorting rule to process special lowercase conversion rules in some languages. The format of nlsparam is 'nls_sort=sort_name', where sort_name is replaced by the specific sorting rule name. When the nlsparam parameter is not set, this function is equivalent to lower.
Return type: text.
Example:
1 2 3 4 5 |
gaussdb=# SELECT nls_lower('INDIVISIBILITY'); nls_lower ---------------- indivisibility (1 row) |
1 2 3 4 5 |
gaussdb=# SELECT nls_lower('INDIVISIBILITY', 'nls_sort = XTurkish'); nls_lower ---------------- ındıvısıbılıty (1 row) |

This function can be used only when the value of a_format_version is 10c and that of a_format_dev_version is s2 in an ORA-compatible database.
instr(string,substring[,position,occurrence])
Description: Queries and returns the value of the substring position that occurs the occurrence (1 by default) times from the position (1 by default) in the string.
- If the value of position is 0, 0 is returned.
- If the value of position is a negative number, searches backwards from the last nth character in the string, in which n indicates the absolute value of position.
In this function, the calculation unit is character. One Chinese character is one character.
Return type: integer.
Example:
1 2 3 4 5 |
gaussdb=# SELECT instr('corporate floor','or', 3); instr ------- 5 (1 row) |
1 2 3 4 5 |
gaussdb=# SELECT instr('corporate floor','or',-3,2); instr ------- 2 (1 row) |
initcap(string)
Description: Capitalizes the first letter of each word in a string.
Return type: text.
Example:
1 2 3 4 5 |
gaussdb=# SELECT initcap('hi THOMAS'); initcap ----------- Hi Thomas (1 row) |

When a_format_version is set to 10c and a_format_dev_version is set to s2, if a case-insensitive character, such as Chinese, is followed by a case-sensitive character, only the first letter of the case-sensitive character is capitalized. Therefore, you are advised to set a_format_version to 10c and a_format_dev_version to s2.
ascii(string)
Description: Indicates the ASCII code of the first character in the string.
Return type: integer.
Example:
1 2 3 4 5 |
gaussdb=# SELECT ascii('xyz'); ascii ------- 120 (1 row) |
ascii2(string)
Description: Returns the decimal code of the first character of the input string in the database character set.
Return type: bigint.

This function is valid only when the value of a_format_version is 10c and that of a_format_dev_version is s2 in an ORA-compatible database.
asciistr(string)
Description: Converts non-ASCII characters in the input string to \XXXX, where XXXX indicates the UTF-16 code unit.
Return type: text.

This function is valid only when the value of a_format_version is 10c and that of a_format_dev_version is s2 in an ORA-compatible database.
unistr(string)
Description: Converts the coding sequence in a string into the corresponding character. Other characters remain unchanged.
Return type: text.

- This function is valid only when the value of a_format_version is 10c and that of a_format_dev_version is s2 in an ORA-compatible database.
- The backslash (\) must be followed by four hexadecimal characters to indicate the coding sequence, or another backslash (\) indicates that a single backslash (\) is entered.
- If the input parameter is of the time type, the time type is implicitly converted to the character string type.
vsize(expr)
Description: Returns the number of bytes of the input expression.
Return type: int.

This function is valid only when the value of a_format_version is 10c and that of a_format_dev_version is s2 in an ORA-compatible database.
replace(string varchar, search_string varchar, replacement_string varchar)
Description: Replaces all search_string in the string with replacement_string.
Return type: text.
Example:
1 2 3 4 5 |
gaussdb=# SELECT replace('jack and jue','j','bl'); replace ---------------- black and blue (1 row) |
concat(str1,str2)
Description: Connects str1 and str2 and returns the string. Note: concat calls the output function of the data type and the return value is uncertain. As a result, the optimizer cannot calculate the result in advance when generating a plan. If there are performance requirements, you are advised to use the operator ||.

- If sql_compatibility is set to 'MYSQL' and str1 or str2 is set to NULL, the returned result is NULL.
- The return value of the concat function is of the variable-length type. When the concat function is compared with table data, the character string length is lost in the combination result. As a result, the comparison results are different.
Return type: text.
Example:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 |
gaussdb=# SELECT concat('Hello', ' World!'); concat -------------- Hello World! (1 row) gaussdb=# SELECT concat('Hello', NULL); concat -------- (1 row) gaussdb=# CREATE TABLE test_space(c char(10)); CREATE TABLE gaussdb=# INSERT INTO test_space values('a'); INSERT 0 1 -- After spaces are padded, the character string is still a fixed-length character string. It is expected that the result can be found. gaussdb=# SELECT * FROM test_space WHERE c = 'a '; c ------------ a (1 row) -- The combination result is a variable-length character string. The comparison fails and the result cannot be found. gaussdb=# SELECT * FROM test_space WHERE c = 'a' || ' '; c --- (0 rows) |
chr(integer)
Description: For the UTF-8 character set, the input is encoded as Unicode and a UTF-8 character is returned. For other character sets, an ASCII character is returned.
Return type: text.

- When a_format_version is set to 10c and a_format_dev_version is set to s1 in an ORA-compatible database, the value is truncated instead of being rounded off if the value of integer is a decimal.
- For details about how to set and modify the character set and collation, see Character Sets and Collations.
Example:
1 2 3 4 5 6 7 8 9 10 11 12 |
gaussdb=# SELECT chr(65); chr ----- A (1 row) -- In the case of UTF-8 character set gaussdb=# select chr(19968); chr ----- — (1 row) |
chr(cvalue int|bigint)
Description: Converts cvalue to the character of the corresponding byte order and returns the character.
cvalue can be converted into a value of the int or bigint type. The value range is [0,2^32 – 1], corresponding to the range of unsigned int. A character array consisting of one to four bytes is returned based on the value of n. The byte arrays returned in different character sets are the same. However, due to different encoding rules, the result of the returned character string varies depending on the character set encoding.
If the character set is a single-byte character set, an ASCII character is returned after cvalue mod 256.
- If a byte in the input cvalue is 0, the output is truncated.
- If the input does not comply with the encoding rule of the current character set, an error is reported.
- If the input is NULL or 0, NULL is returned.
Return type: text.
Example:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 |
gaussdb=# SELECT chr(65); chr ----- A (1 row) gaussdb=# CREATE DATABASE gaussdb_o WITH DBCOMPATIBILITY 'ORA'; gaussdb=# \c gaussdb_o gaussdb_o=# SET a_format_version='10c'; gaussdb_o=# SET a_format_dev_version = 's1'; gaussdb_o=# SELECT chr(16705); chr ----- AA (1 row) -- The output is truncated. gaussdb_o=# SELECT chr(4259905); chr ----- A (1 row) gaussdb_o=# \c postgres gaussdb=# DROP DATABASE gaussdb_o; |

When a_format_version is set to 10c and a_format_dev_version is set to s1 in an ORA-compatible database, the chr function returns a character in the collation based on the input value. If the current character set for database encoding is a multi-byte character set, the return value contains one to four bytes. If the current character set for database encoding is a single-byte character set, the return value is a single byte obtained by performing the mod 256 operation on the input value. Otherwise, if the current character set for database encoding is UTF-8, the input is encoded as Unicode and a UTF-8 character is returned. For other character sets, an ASCII character is returned.
nchr(cvalue int|bigint)
Description: Returns the characters corresponding to the input parameter in the national character set. The GUC parameter nls_nchar_characterset specifies the country character set. Only AL16UTF16 and UTF8 are supported. This function is valid only when the value of the GUC parameter a_format_version is 10c and that of a_format_dev_version is s4 in an ORA-compatible database. If nls_nchar_characterset is set to AL16UTF16 and the input parameter exceeds two bytes, the input parameter will be truncated and the lowest two bytes will be retained. If nls_nchar_characterset is set to UTF8 and the input parameter exceeds three bytes, the value 0 is used.
Parameter: cvalue. cvalue can be converted to the int or bigint type. The value range is [0, 2^32 – 1], corresponding to the range of unsigned int. If the input parameter contains the decimal part, the decimal part will be deleted.
Return type: NVARCHAR2.

- The byte length of the return value of the function is different from that in database ORA.
- The return value of the function is restricted by the database character set. In different database character sets, if there is no mapping table for converting UTF8 to the database character set or the mapping table does not contain the UTF8 code, the current database character set does not support the UTF8 character corresponding to the input parameter. As a result, the result returned by the nchr(cvalue int|bigint) function is inconsistent with that in database ORA.
- If the current database character set does not support the UTF8 character corresponding to the input parameter, or the national character set is UTF8 but the input parameter does not comply with the UTF8 format, the byte array corresponding to the input parameter is returned. If a single byte is in the range of [0x00,0x7F], an ASCII character is returned. If a single byte is in the range of [0x80,0xFF], a question mark (?) is returned.
- For details about how to set and modify the character set and collation, see Character Sets and Collations.
regexp_substr(source_char, pattern)
Description: Extracts substrings from a regular expression. If the SQL syntax is ORA- and MySQL-compatible and the value of the GUC parameter behavior_compat_options contains aformat_regexp_match, the period (.) cannot match the '\n' character. If aformat_regexp_match is not contained, the period (.) matches the '\n' character by default.
Return type: text.
Example:
1 2 3 4 5 |
gaussdb=# SELECT regexp_substr('500 Hello World, Redwood Shores, CA', ',[^,]+,') "REGEXPR_SUBSTR"; REGEXPR_SUBSTR ------------------- , Redwood Shores, (1 row) |
regexp_replace(string, pattern, replacement [,flags ])
Description: Replaces substrings matching the POSIX-style regular expression. If no pattern is matched, the original string is returned. When a match exists, the matched substring in the returned string is replaced with the string specified by replacement. The replacement string can contain \n, where \n ranges from 1 to 9, to indicate that the source substring matching the nth parenthesized subexpression of the pattern should be inserted, and it can contain \& to indicate that the substring matching the entire pattern should be inserted.
Parameters:
- string: source character string used for matching.
- pattern: regular expression pattern string used for matching.
- flags: contains zero or multiple single-letter flags that change the matching behavior of the function. This parameter is optional. g indicates that all matched substrings are replaced with the specified content, instead of only the first matching item. Other options supported by flags and description are described in Table 3.
Return type: text
Example:
1 2 3 4 5 6 7 8 9 10 |
gaussdb=# SELECT regexp_replace('Thomas', '.[mN]a.', 'M'); regexp_replace ---------------- ThM (1 row) gaussdb=# SELECT regexp_replace('foobarbaz','b(..)', E'X\\1Y', 'g') AS RESULT; result ------------- fooXarYXazY (1 row) |
regexp_replace(string text, pattern text [, replacement text [, position int [, occurrence int [, flags text]]]])
Description: Replaces substrings matching the POSIX-style regular expression. If no pattern is matched, the original string is returned. When a match exists, the matched substring in the returned string is replaced with the string specified by replacement.
- string: source character string used for matching.
- pattern: regular expression pattern string used for matching.
- replacement: character string used to replace the matched substring. This parameter is optional. If no parameter value is specified or the parameter value is null, an empty string is used for replacement.
- position: start character of the source string used for matching. This parameter is optional. The default value is 1.
- occurrence: sequence number of the matched substring to be replaced. This parameter is optional. The default value is 1, indicating that the first matched substring is replaced.
- flags: contains zero or multiple single-letter flags that change the matching behavior of the function. This parameter is optional. The options supported by flags and description are described in Table 3.

When a_format_version is set to 10c and a_format_dev_version is set to s1 in an ORA-compatible database, the default value of occurrence is 0, indicating that all matched substrings are replaced and the pattern parameter ending with a backslash (\) is valid.
Return type: text.
Example:
gaussdb=# SELECT regexp_replace('foobarbaz','b(..)', E'X\\1Y', 2, 2, 'n') AS RESULT;
result
------------
foobarXazY
(1 row)
concat_ws(sep text, str"any" [, str"any" [, ...] ])
Description: Uses the first parameter as the separator, which is associated with all following parameters. The NULL parameter is ignored.

- If the first parameter value is NULL, the returned result is NULL.
- If the first parameter is provided but the parameter value is an empty string ('') and the SQL compatibility mode of the database is set to ORA, the returned result is NULL. This is because the ORA-compatible mode treats the empty string ('') as NULL. To resolve this problem, you can change the SQL compatibility mode of the database to MySQL, TD, or PG.
Return type: text.
Example:
1 2 3 4 5 |
gaussdb=# SELECT concat_ws(',', 'ABCDE', 2, NULL, 22); concat_ws ------------ ABCDE,2,22 (1 row) |
nlssort(string text, sort_method text)
Description: Returns the encoding value of a string in the sorting mode specified by sort_method. The encoding value can be used for sorting and determines the sequence of the string in the sorting mode. Currently, sort_method can be set to nls_sort=schinese_pinyin_m or nls_sort=generic_m_ci. nls_sort=generic_m_ci supports only the case-insensitive order for English characters.
String type: text.
sort_method type: text
Return type: text.
Example:
convert(string bytea, src_encoding name, dest_encoding name)
Description: Converts the string to dest_encoding. src_encoding specifies the source code encoding. The string must be valid in this encoding.
Return type: bytea.
Example:
1 2 3 4 5 |
gaussdb=# SELECT convert('text_in_utf8', 'UTF8', 'GBK'); convert ---------------------------- \x746578745f696e5f75746638 (1 row) |

If the rule for converting between source to target encoding (for example, GBK and LATIN1) does not exist, the string is returned without conversion. See the pg_conversion system catalog for details. server_encoding is specified during database initialization.
Example:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 |
gaussdb=# SHOW server_encoding; server_encoding ----------------- LATIN1 (1 row) gaussdb=# SELECT convert_from('some text', 'GBK'); convert_from -------------- some text (1 row) db_latin1=# SELECT convert_to('some text', 'GBK'); convert_to ---------------------- \x736f6d652074657874 (1 row) db_latin1=# SELECT convert('some text', 'GBK', 'LATIN1'); convert ---------------------- \x736f6d652074657874 (1 row) |
convert(expr, USING transcoding_name)
Description: Converts a parameter, which can be a character or a number, into a character string of the transcoding_name type, and returns the character string.
Return type: text.
Example:
1 2 3 4 5 |
gaussdb=# SELECT convert('asdas' using 'gbk'); convert --------- asdas (1 row) |

This function takes effect only in databases in MySQL-compatible mode.
convert_from(string bytea, src_encoding name)
Description: Converts a string using the coding mode of the database.
src_encoding specifies the source code encoding. The string must be valid in this encoding.
Return type: text.
Example:
1 2 3 4 5 |
gaussdb=# SELECT convert_from('text_in_utf8', 'UTF8'); convert_from -------------- text_in_utf8 (1 row) |
convert_to(string text, dest_encoding name)
Description: Converts a string to dest_encoding.
Return type: bytea.
Example:
1 2 3 4 5 |
gaussdb=# SELECT convert_to('some text', 'UTF8'); convert_to ---------------------- \x736f6d652074657874 (1 row) |
string [NOT] LIKE pattern [ESCAPE escape-character]
Description: Specifies the pattern matching function.
If the pattern does not include a percentage sign (%) or an underscore (_), this mode represents itself only. In this case, the behavior of LIKE is the same as the equal operator. The underscore (_) in the pattern matches any single character while one percentage sign (%) matches no or multiple characters.
To match with underscores (_) or percent signs (%), corresponding characters in pattern must lead escape characters. The default escape character is a backward slash (\) and can be specified using the ESCAPE clause. To match with escape characters, enter two escape characters.
Return type: Boolean.
Example:
1 2 3 4 5 |
gaussdb=# SELECT 'AA_BBCC' LIKE '%A@_B%' ESCAPE '@' AS RESULT; result -------- t (1 row) |
1 2 3 4 5 |
gaussdb=# SELECT 'AA_BBCC' LIKE '%A@_B%' AS RESULT; result -------- f (1 row) |
1 2 3 4 5 |
gaussdb=# SELECT 'AA@_BBCC' LIKE '%A@_B%' AS RESULT; result -------- t (1 row) |
REGEXP_LIKE(source_string, pattern [, match_parameter])
Description: Indicates the mode matching function of a regular expression.
source_string indicates the source string and pattern indicates the matching pattern of the regular expression. match_parameter indicates the matching items and the values are as follows:
- 'i': case-insensitive
- 'c': case-sensitive
- 'n': allowing the metacharacter "." in a regular expression to be matched with a linefeed.
- 'm': allows source_string to be regarded as multiple rows.
If match_parameter is ignored, case-sensitive is enabled by default, "." is not matched with a linefeed, and source_string is regarded as a single row.
Return type: Boolean.
Example:
1 2 3 4 5 |
gaussdb=# SELECT regexp_like('ABC', '[A-Z]'); regexp_like ------------- t (1 row) |
1 2 3 4 5 |
gaussdb=# SELECT regexp_like('ABC', '[D-Z]'); regexp_like ------------- f (1 row) |
1 2 3 4 5 |
gaussdb=# SELECT regexp_like('ABC', '[A-Z]','i'); regexp_like ------------- t (1 row) |
1 2 3 4 5 |
gaussdb=# SELECT regexp_like('ABC', '[A-Z]'); regexp_like ------------- t (1 row) |
format(formatstr text [, str"any" [, ...] ])
Description: Formats a string.
Return type: text.
Example:
1 2 3 4 5 |
gaussdb=# SELECT format('Hello %s, %1$s', 'World'); format -------------------- Hello World, World (1 row) |
md5(string)
Description: Encrypts a string in MD5 mode and returns a value in hexadecimal form.

The MD5 encryption algorithm has lower security and poses security risks. Therefore, you are advised to use a more secure encryption algorithm.
Return type: text.
Example:
1 2 3 4 5 |
gaussdb=# SELECT md5('ABC'); md5 ---------------------------------- 902fbdd2b1df0c4f70b4a5d23525e932 (1 row) |
sha(string) / sha1(string)
Description: Encrypts a string using SHA1 and returns a hexadecimal number. The sha and sha1 functions are the same.

- The SHA-1 encryption algorithm is not recommended because it has lower security and poses security risks.
- This function is valid only when GaussDB is compatible with the MY type (that is, sql_compatibility is set to 'MYSQL').
Return type: text.
Example:
1 2 3 4 5 6 7 8 9 10 |
gaussdb=# SELECT sha('ABC'); sha ------------------------------------------ 3c01bdbb26f358bab27f267924aa2c9a03fcfdb8 (1 row) gaussdb=# SELECT sha1('ABC'); sha1 ------------------------------------------ 3c01bdbb26f358bab27f267924aa2c9a03fcfdb8 (1 row) |
sha2(string, hash_length)
Description: Encrypts a string in SHA-2 mode and returns a value in hexadecimal form.
hash_length: corresponds to an SHA-2 algorithm. The value can be 0 (SHA-256), 224 (SHA-224), 256 (SHA-256), 384 (SHA-384), or 512 (SHA-512). For other values, NULL is returned.

- The SHA-224 encryption algorithm is not recommended because it has lower security and poses security risks.
- The SHA2 function records hash plaintext in logs. Therefore, you are advised not to use this function to encrypt sensitive information such as keys.
- This function is valid only when GaussDB is compatible with the MY type (that is, sql_compatibility is set to 'MYSQL').
Return type: text.
Example:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 |
gaussdb=# SELECT sha2('ABC',224); sha2 ---------------------------------------------------------- 107c5072b799c4771f328304cfe1ebb375eb6ea7f35a3aa753836fad (1 row) gaussdb=# SELECT sha2('ABC',256); sha2 ------------------------------------------------------------------ b5d4045c3f466fa91fe2cc6abe79232a1a57cdf104f7a26e716e0a1e2789df78 (1 row) gaussdb=# SELECT sha2('ABC',0); sha2 ------------------------------------------------------------------ b5d4045c3f466fa91fe2cc6abe79232a1a57cdf104f7a26e716e0a1e2789df78 (1 row) |
decode(string text, format text)
Description: Decodes binary data from textual representation.
Return type: bytea.
Example:
1 2 3 4 5 |
gaussdb=# SELECT decode('MTIzAAE=', 'base64'); decode -------------- \x3132330001 (1 row) |
similar_escape(pat text, esc text)
Description: Converts a regular expression of the SQL:2008 style to the POSIX style.
Return type: text.
Example:
gaussdb=# SELECT similar_escape('\s+ab','2');
similar_escape
----------------
^(?:\\s+ab)$
(1 row)
find_in_set(text, set)
Description: Finds the position of a given member in a set, counting from 1. If no record is found, 0 is returned. The distributed system does not support the SET data type. An error will be reported when this function is executed.
Return type: int2
find_in_set(str, strlist)
Description: Queries whether the strlist column contains str. If the strlist column contains str, the position of str in the strlist field is returned. The inputs are str and strlist. str indicates the character string to be queried. strlist is a set of character strings separated by commas (,). If str is not in strlist or strlist is an empty string, the return value is 0.
The parameters are described as follows.
Parameter |
Type |
Description |
---|---|---|
str |
text |
Target string |
strlist |
text |
A set of character strings |
Return type: int.
Example:
1 2 3 4 5 |
gaussdb=# SELECT find_in_set('ee','a,ee,c'); find_in_set ------------- 2 (1 row) |
encode(data bytea, format text)
Description: Encodes binary data into a textual representation.
Return type: text.
Example:
1 2 3 4 5 |
gaussdb=# SELECT encode(E'123\\000\\001', 'base64'); encode ---------- MTIzAAE= (1 row) |
strcmp(expr1,expr2)
Description: Compares two input strings based on the current character order. If the strings are the same, 0 is returned. If the first string is smaller than the second string, –1 is returned. Otherwise, 1 is returned.
The parameters are described as follows.
Parameter |
Type |
Description |
---|---|---|
expr1/expr2 |
|
Character string involved in comparison. |
Return type: integer.
Example:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 |
-- Switch to the MySQL-compatible database. gaussdb=# CREATE DATABASE gaussdb_m dbcompatibility='MYSQL'; gaussdb=# \c gaussdb_m -- Set the compatible version control parameter to enable the function of specifying the collation for constant character strings. gaussdb_m=# SET b_format_version='5.7'; gaussdb_m=# SET b_format_dev_version='s2'; gaussdb_m=# SELECT strcmp('abc', 'ABC'); strcmp -------- 0 (1 row) gaussdb_m=# SELECT strcmp('abc ', 'abc'); strcmp -------- 0 (1 row) gaussdb_m=# SELECT strcmp('1', 1); strcmp -------- 0 (1 row) gaussdb_m=# SELECT strcmp(123, 2); strcmp -------- -1 (1 row) |

- The strcmp function is valid only when sql_compatibility is set to 'MYSQL'.
- Since the version whose b_format_version is set to '5.7' and b_format_dev_version is set to 's1', the sql_mode parameter pad_char_to_full_length specifies whether to add spaces at the end of the char type, which affects the strcmp comparison result. For details, see Table 1.
- Since the version whose b_format_version is set to '5.7' and b_format_dev_version is set to 's1', the behavior of the character, binary, numeric, and date and time types is M-compatible, which affects the strcmp comparison result. For details, see Data Types. For the floating-point type in the numeric type, the precision may be different from that in MySQL due to different connection parameter settings. Therefore, this scenario is not recommended, or the numeric type is used instead. For details, see Connection Parameters.
- Since the version whose b_format_version is set to '5.7' and b_format_dev_version is set to 's2', character escape and constant character string obtaining collation are supported. The collation affects the strcmp comparison result. For details, see the SET NAMES syntax in SET. For details about the rules for combining different collations of the character types, see Rules for Combining Character Sets and Collations.

- For a string containing linefeed characters, for example, a string consisting of a linefeed character and a space, the value of LENGTH and LENGTHB in GaussDB is 2.
- In GaussDB, n in the CHAR(n) type indicates the number of characters. Therefore, for multiple-octet coded character sets, the length returned by the LENGTHB function may be longer than n.
- GaussDB supports multiple types of databases, including ORA, MySQL, TD, and PG. The lexical analyzer of ORA-compatible database is different from that of the other three databases. In an ORA-compatible database, an empty string is considered as NULL. Therefore, when an ORA-compatible database is used, if an empty string is used as a parameter in the preceding character operation function, no output is displayed. For example:
gaussdb=# SELECT translate('12345','123',''); translate ----------- (1 row)
This is because the kernel checks whether the input parameter contains NULL before calling the corresponding function. If the input parameter contains NULL, the kernel does not call the corresponding function. As a result, no output is displayed. In PG-compatible mode, the processing of character strings is the same as that of PG. Therefore, the preceding problem does not occur.
Feedback
Was this page helpful?
Provide feedbackThank you very much for your feedback. We will continue working to improve the documentation.See the reply and handling status in My Cloud VOC.
For any further questions, feel free to contact us through the chatbot.
Chatbot