Character Processing Functions and Operators
String functions and operators provided by GaussDB are for concatenating strings with each other, concatenating strings with non-strings, and matching the patterns of strings. Note: Except length-related functions, other functions and operators of string processing functions do not support parameters greater than 1 GB.
- bit_length(string)
Description: Specifies the number of bits occupied by a string.
Return type: int
Example:
1 2 3 4 5
gaussdb=# SELECT bit_length('world'); bit_length ------------ 40 (1 row)
- btrim(string text [, characters text])
Description: Removes the longest string consisting only of characters in characters (a space by default) from the start and end of string.
Return type: text
Example:
1 2 3 4 5
gaussdb=# SELECT btrim('sring' , 'ing'); btrim ------- sr (1 row)
- char_length(string) or character_length(string)
Description: Specifies the number of characters in a string.
Return type: int
Example:
1 2 3 4 5
gaussdb=# SELECT char_length('hello'); char_length ------------- 5 (1 row)
- dump(expr[, return_fmt [, start_position [, length ] ] ])
Description: Returns the data type code, byte length, and internal representation of the input expression. return_fmt specifies the number system of the internal representation, start_position specifies the byte from which the internal representation starts, and length indicates the length of the data to be read.
Return type: text
This function is valid only when the value of a_format_version is 10c and that of a_format_dev_version is s2 in an ORA-compatible database.
- instr(text,text,int,int)
Description: instr(string1,string2,int1,int2) returns the text from int1 to int2 in string1. The first int indicates the start position for matching, and the second int indicates the number of matching times.
Return type: int
Example:
1 2 3 4 5
gaussdb=# SELECT instr( 'abcdabcdabcd', 'bcd', 2, 2 ); instr ------- 6 (1 row)
- instrb(text,text,int,int)
Description: instrb(string1,string2,int1,int2) returns the position matching string2 in string1 for the int2th time from the position specified by int1. int1 indicates the start position for matching, and int2 indicates the number of matching times. Different from the instr function, instrb is calculated in bytes and is not affected by the character set in use.
Return type: int
Example:
1 2 3 4 5
gaussdb=# SELECT instrb( 'abcdabcdabcd', 'bcd', 2, 2 ); instrb ------- 6 (1 row)
- This function is valid only when the value of a_format_version is 10c and that of a_format_dev_version is s1 in an ORA-compatible database.
- If the values of int1 and int2 are decimals, the values are truncated instead of being rounded off.
- lengthb(text/bpchar)
Description: Obtains the number of bytes of a specified string.
Return type: int
Example:
1 2 3 4 5
gaussdb=# SELECT lengthb('hello'); lengthb --------- 5 (1 row)
- left(str text, n int)
Description: Returns the first n characters in a string. When n is negative, all but the last |n| characters are returned.
Return type: text
Example:
1 2 3 4 5
gaussdb=# SELECT left('abcde', 2); left ------ ab (1 row)
- length(string bytea, encoding name)
Description: Specifies the number of characters in string in the given encoding. The string must be valid in this encoding.
Return type: int
Example:
1 2 3 4 5
gaussdb=# SELECT length('jose', 'UTF8'); length -------- 4 (1 row)
If the length of the bytea type is queried and UTF8 encoding is specified, the maximum length can only be 536870888.
- lpad(string text, length int [, fill text])
Description: Fills up string to length by appending the characters fill (a space by default). If string is already longer than length, then it is truncated.
Return type: text
In the scenario where this function is in an ORA-compatible database, the value of a_format_version is 10c, and that of a_format_dev_version is s1:- The length parameter indicates the display length of a character string. The display length of a single character is processed based on ORA compatibility requirements. During the execution of the lpad function, if the remaining length is 1 and the next character is of the full-width type (2 bytes), a space character is added to the left of the string.
- If the value of length is a decimal, the value is truncated instead of being rounded off.
- notlike(x bytea name text, y bytea text)
Description: Compares x and y to check whether they are inconsistent.
Return type: Boolean
Example:
1 2 3 4 5 6 7 8 9 10
gaussdb=# SELECT notlike(1,2); notlike -------------- t (1 row) gaussdb=# SELECT notlike(1,1); notlike -------------- f (1 row)
- octet_length(string)
Description: Specifies the number of bytes in a string.
Return type: int
Example:
1 2 3 4 5
gaussdb=# SELECT octet_length('jose'); octet_length -------------- 4 (1 row)
- overlay(string placing string FROM int [for int])
Description: Replaces substrings. FROM int indicates the start position of the replacement in the first string. for int indicates the number of characters replaced in the first string.
Return type: text
Example:
1 2 3 4 5
gaussdb=# SELECT overlay('hello' placing 'world' from 2 for 3 ); overlay --------- hworldo (1 row)
- position(substring in string)
Description: Specifies the position of a substring. Parameters are case-sensitive.
Return type: int. If the character string does not exist, 0 is returned.
Example:
1 2 3 4 5
gaussdb=# SELECT position('ing' in 'string'); position ---------- 4 (1 row)
- pg_client_encoding()
Description: Specifies the current client encoding name.
Return type: name
Example:
1 2 3 4 5
gaussdb=# SELECT pg_client_encoding(); pg_client_encoding -------------------- UTF8 (1 row)
- quote_ident(string text)
Description: Returns the given string suitably quoted to be used as an identifier in an SQL statement string (quotation marks are used as required). Quotation marks are added only if necessary (that is, if the string contains non-identifier characters or would be case-folded). Embedded quotation marks are properly doubled.
Return type: text
Example:
1 2 3 4 5
gaussdb=# SELECT quote_ident('hello world'); quote_ident -------------- "hello world" (1 row)
- quote_literal(string text)
Description: Returns the given string suitably quoted to be used as text in an SQL statement string (quotation marks are used as required). It supports XML data that is explicitly converted to the character type.
Return type: text
Example:
1 2 3 4 5
gaussdb=# SELECT quote_literal('hello'); quote_literal --------------- 'hello' (1 row)
If a command similar to the following exists, the text will be escaped:
1 2 3 4 5
gaussdb=# SELECT quote_literal(E'O\'hello'); quote_literal --------------- 'O''hello' (1 row)
If a command similar to the following exists, the backslash will be properly doubled:
1 2 3 4 5
gaussdb=# SELECT quote_literal('O\hello'); quote_literal --------------- E'O\\hello' (1 row)
If the parameter is null, NULL is returned. If the parameter may be null, you are advised to use quote_nullable.
1 2 3 4 5
gaussdb=# SELECT quote_literal(NULL); quote_literal --------------- (1 row)
- quote_literal(value anyelement)
Description: Converts the given value to text and then quotes it as a literal.
Return type: text
Example:
1 2 3 4 5
gaussdb=# SELECT quote_literal(42.5); quote_literal --------------- '42.5' (1 row)
If a command similar to the following exists, the given value will be escaped:
1 2 3 4 5
gaussdb=# SELECT quote_literal(E'O\'42.5'); quote_literal --------------- '0''42.5' (1 row)
If a command similar to the following exists, the backslash will be properly doubled:
1 2 3 4 5
gaussdb=# SELECT quote_literal('O\42.5'); quote_literal --------------- E'O\\42.5' (1 row)
- quote_nullable(string text)
Description: Returns the given string suitably quoted to be used as a string in an SQL statement string (quotation marks are used as required).
It supports XML data that is explicitly converted to the character type.
Return type: text
Example:
1 2 3 4 5
gaussdb=# SELECT quote_nullable('hello'); quote_nullable ---------------- 'hello' (1 row)
If a command similar to the following exists, the text will be escaped:
1 2 3 4 5
gaussdb=# SELECT quote_nullable(E'O\'hello'); quote_nullable ---------------- 'O''hello' (1 row)
If a command similar to the following exists, the backslash will be properly doubled:
1 2 3 4 5
gaussdb=# SELECT quote_nullable('O\hello'); quote_nullable ---------------- E'O\\hello' (1 row)
If the parameter is null, NULL is returned.
1 2 3 4 5
gaussdb=# SELECT quote_nullable(NULL); quote_nullable ---------------- NULL (1 row)
- quote_nullable(value anyelement)
Description: Converts the given value to text and then quotes it as a literal.
Return type: text
Example:
1 2 3 4 5
gaussdb=# SELECT quote_nullable(42.5); quote_nullable ---------------- '42.5' (1 row)
If a command similar to the following exists, the given value will be escaped:
1 2 3 4 5
gaussdb=# SELECT quote_nullable(E'O\'42.5'); quote_nullable ---------------- 'O''42.5' (1 row)
If a command similar to the following exists, the backslash will be properly doubled:
1 2 3 4 5
gaussdb=# SELECT quote_nullable('O\42.5'); quote_nullable ---------------- E'O\\42.5' (1 row)
If the parameter is null, NULL is returned.
1 2 3 4 5
gaussdb=# SELECT quote_nullable(NULL); quote_nullable ---------------- NULL (1 row)
- substring_inner(string [from int] [for int])
Description: Extracts a substring. from int indicates the start position of the truncation. for int indicates the number of characters truncated.
Return type: text
Example:
1 2 3 4 5
gaussdb=# SELECT substring_inner('adcde', 2,3); substring_inner ----------------- dcd (1 row)
- substring(string [from int] [for int])
Description: Extracts a substring. from int indicates the start position of the truncation. for int indicates the number of characters truncated.
Return type: text
Example:
1 2 3 4 5
gaussdb=# SELECT substring('Thomas' from 2 for 3); substring ----------- hom (1 row)
- substring(string from pattern)
Description: Extracts substrings matching the POSIX-style regular expression. It returns the text that matches the pattern. If no match record is found, a null value is returned.
Return type: text
Example:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
gaussdb=# SELECT substring('Thomas' from '...$'); substring ----------- mas (1 row) gaussdb=# SELECT substring('foobar' from 'o(.)b'); result -------- o (1 row) gaussdb=# SELECT substring('foobar' from '(o(.)b)'); result -------- oob (1 row)
If the POSIX-style regular expression contains any parentheses, the portion of the text that matched the first parenthesized sub-expression (the one whose left parenthesis comes first) is returned. You can put parentheses around the whole expression if you want to use parentheses within it without triggering this exception.
- substring(string from pattern for escape)
Description: Extracts substrings matching the SQL regular expression. The declared schema must match the entire data string; otherwise, the function fails and returns a null value. To indicate the part of the schema that should be returned on success, the schema must contain two occurrences of the escape character followed by a double quotation mark ("). The text matching the portion of the pattern between these marks is returned.
Return type: text
Example:
1 2 3 4 5
gaussdb=# SELECT substring('Thomas' from '%#"o_a#"_' for '#'); substring ----------- oma (1 row)
- rawcat(raw,raw)
Description: Indicates the string concatenation function.
Return type: raw
Example:
1 2 3 4 5
gaussdb=# SELECT rawcat('ab','cd'); rawcat -------- ABCD (1 row)
- regexp_like(text,text,text)
Description: Indicates the mode matching function of a regular expression.
Return type: bool
Example:
1 2 3 4 5
gaussdb=# SELECT regexp_like('str','[ac]'); regexp_like ------------- f (1 row)
- regexp_substr(string text, pattern text [, position int [, occurrence int [, flags text]]])
Description: Extracts substrings from a regular expression. Its function is similar to substr. When a regular expression contains multiple parallel brackets, it also needs to be processed.
Parameter description:
- string: source character string used for matching.
- pattern: regular expression pattern string used for matching.
- position: start character of the source string used for matching. This parameter is optional. The default value is 1.
- occurrence: sequence number of the matched substring to be extracted. This parameter is optional. The default value is 1.
- flags: contains zero or multiple single-letter flags that change the matching behavior of the function. This parameter is optional. The options supported by flags and description are shown in Table 1.
Table 1 Options supported by flags Option
Description
'b'
Indicates the BRE matching without extension.
'c'
Indicates the case-sensitive matching.
'e'
Indicates the ERE matching with extension.
'i'
Indicates the case-insensitive matching.
'm'
Indicates the multi-line matching. If flags contains 'm', use the multi-line matching. Otherwise, use the single-line matching.
'n'
The meanings of the 'n' option are related to the GUC parameter behavior_compat_options and the compatibility mode of the current database.- If the SQL compatibility mode of the database is ORA or MySQL and the value of the GUC parameter behavior_compat_options contains aformat_regexp_match, the 'n' option indicates that "." matches the linefeed '\n'. If 'n' is not specified, "." does not match the linefeed.
- In other cases, the 'n' option has the same meaning as the 'm' option.
'p'
Indicates partial linefeed-sensitive matching, which is similar to the linefeed-sensitive matching ('m' or 'n') and affects "." and square bracket expression, but does not affect ^ and $.
'q'
Indicates common character matching.
's'
Indicates the single-line matching. The meaning is opposite to those of 'm' and 'n'.
't'
Indicates the compact matching. The whitespace characters match themselves.
'w'
Indicates the reverse partial linefeed-sensitive matching. The meaning is opposite to that of 'p'.
'x'
Indicates the loose matching. The whitespace characters are ignored.
Return type: text
Example:
1 2 3 4 5 6 7 8 9 10 11
gaussdb=# SELECT regexp_substr('str','[ac]'); regexp_substr --------------- (1 row) gaussdb=# SELECT regexp_substr('foobarbaz', 'b(..)', 3, 2) AS RESULT; result -------- baz (1 row)
- regexp_count(string text, pattern text [, position int [, flags text]])
Description: obtains the number of substrings used for matching.
Parameters:
- string: source character string used for matching.
- pattern: regular expression pattern string used for matching.
- position: start character of the source string used for matching. This parameter is optional. The default value is 1.
- flags: contains zero or multiple single-letter flags that change the matching behavior of the function. This parameter is optional. The options supported by flags and description are shown in Table 1.
When the function is in an ORA-compatible database, the value of a_format_version is 10c, and the value of a_format_dev_version is s1, the pattern parameter ending with a backslash (\) is valid.
Return type: int
Example:
gaussdb=# SELECT regexp_count('foobarbaz','b(..)', 5) AS RESULT; result -------- 1 (1 row)
- regexp_instr(string text, pattern text [, position int [, occurrence int [, return_opt int [, flags text]]]])
Description: obtains the position (starting from 1) of the substring that meets the matching condition. If no substring is matched, 0 is returned.
Parameter description:
- string: source character string used for matching.
- pattern: regular expression pattern string used for matching.
- position: start character of the source string used for matching. This parameter is optional. The default value is 1.
- occurrence: sequence number of the matched substring to be replaced. This parameter is optional. The default value is 1.
- return_opt: specifies whether to return the position of the first or last character of the matched substring. This parameter is optional. If the value is 0, the position of the first character (starting from 1) of the matched substring is returned. If the value is greater than 0, the position of the next character of the end character of the matched substring is returned. The default value is 0.
- flags: contains zero or multiple single-letter flags that change the matching behavior of the function. This parameter is optional. The options supported by flags and description are shown in Table 1.
Return type: int
Example:
gaussdb=# SELECT regexp_instr('foobarbaz','b(..)', 1, 1, 0) AS RESULT; result -------- 4 (1 row) gaussdb=# SELECT regexp_instr('foobarbaz','b(..)', 1, 2, 0) AS RESULT; result -------- 7 (1 row)
- regexp_matches(string text, pattern text [, flags text])
Description: Returns all captured substrings resulting from matching a POSIX-style regular expression against string. If the pattern does not match, the function returns no rows. If the pattern contains no parenthesized sub-expressions, then each row returned is a single-element text array containing the substring matching the whole pattern. If the pattern contains parenthesized sub-expressions, the function returns a text array whose nth element is the substring matching the nth parenthesized sub-expression of the pattern.
The optional flags argument contains zero or multiple single-letter flags that change the function behavior. i indicates that the matching is not related to uppercase and lowercase. g indicates that each matched substring is replaced, instead of replacing only the first one.
If the last parameter is provided but the parameter value is an empty string ('') and the SQL compatibility mode of the database is set to ORA, the returned result is an empty set. This is because the ORA compatible mode treats the empty string ('') as NULL. To resolve this problem, you can:
- Change the database SQL compatibility mode to TD.
- Do not provide the last parameter or do not set the last parameter to an empty string.
Return type: SETOF text[]
Example:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
gaussdb=# SELECT regexp_matches('foobarbequebaz', '(bar)(beque)'); regexp_matches ---------------- {bar,beque} (1 row) gaussdb=# SELECT regexp_matches('foobarbequebaz', 'barbeque'); regexp_matches ---------------- {barbeque} (1 row) gaussdb=# SELECT regexp_matches('foobarbequebazilbarfbonk', '(b[^b]+)(b[^b]+)', 'g'); regexp_matches -------------- {bar,beque} {bazil,barf} (2 rows)
- regexp_split_to_array(string text, pattern text [, flags text ])
Description: Splits string using a POSIX-style regular expression as the delimiter. The regexp_split_to_array function behaves the same as regexp_split_to_table, except that regexp_split_to_array returns its result as an array of text.
Return type: text[]
Example:
1 2 3 4 5
gaussdb=# SELECT regexp_split_to_array('hello world', E'\\s+'); regexp_split_to_array ----------------------- {hello,world} (1 row)
- regexp_split_to_table(string text, pattern text [, flags text])
Description: Splits string using a POSIX-style regular expression as the delimiter. If there is no match to the pattern, the function returns the string. If there is at least one match, for each match it returns the text from the end of the last match (or the beginning of the string) to the beginning of the match. When there are no more matches, it returns the text from the end of the last match to the end of the string.
The flags parameter is a text string containing zero or more single-letter flags that change the function's behavior. i indicates case-independent matching. If no parameter is contained, each matched substring, not only the first substring, is replaced by default.
Return type: SETOF text
Example:
1 2 3 4 5 6
gaussdb=# SELECT regexp_split_to_table('hello world', E'\\s+'); regexp_split_to_table ----------------------- hello world (2 rows)
- repeat(string text, number int)
Description: Repeats string the specified number of times.
Return type: text
Example:
1 2 3 4 5
gaussdb=# SELECT repeat('Pg', 4); repeat ---------- PgPgPgPg (1 row)
The maximum size of memory allocated at a time cannot exceed 1 GB due to the memory allocation mechanism of the database. Therefore, the maximum value of number cannot exceed (1 GB – x)/lengthb (string) – 1. x indicates the length of the header information, which is usually greater than 4 bytes. The value varies among different scenarios.
- replace(string text, from text, to text)
Description: Replaces all occurrences in string of substring from with substring to.
Return type: text
Example:
1 2 3 4 5
gaussdb=# SELECT replace('abcdefabcdef', 'cd', 'XXX'); replace ---------------- abXXXefabXXXef (1 row)
- replace(string, substring)
Description: Deletes all substrings in a string.
String type: text
Substring type: text
Return type: text
Example:
1 2 3 4 5
gaussdb=# SELECT replace('abcdefabcdef', 'cd'); replace ---------------- abefabef (1 row)
- reverse(str)
Description: Returns a reversed string (by character).
Return type: text
Example:
1 2 3 4 5
gaussdb=# SELECT reverse('abcde'); reverse --------- edcba (1 row)
- right(str text, n int)
Description: Returns the last n characters in a string. When n is negative, all but the first |n| characters are returned.
Return type: text
Example:
1 2 3 4 5 6 7 8 9 10 11
gaussdb=# SELECT right('abcde', 2); right ------- de (1 row) gaussdb=# SELECT right('abcde', -2); right ------- cde (1 row)
- rpad(string text, length int [, fill text])
Description: Fills up string to length by appending the characters fill (a space by default). If string is already longer than length, then it is truncated.
Return type: text
In the scenario where this function is in an ORA-compatible database, the value of a_format_version is 10c, and that of a_format_dev_version is s1:- The length parameter indicates the display length of a character string. The display length of a single character is processed based on ORA compatibility requirements. During the execution of the rpad function, if the remaining length is 1 and the next character is of the full-width type (2 bytes), a space character is added to the right of the string.
- If the value of length is a decimal, the value is truncated instead of being rounded off.
- substrb(text,int,int)
Description: Extracts a substring. The first int indicates the start position of the subtraction. The second int indicates the number of characters subtracted.
Return type: text
Example:
1 2 3 4 5
gaussdb=# SELECT substrb('string',2,3); substrb --------- tri (1 row)
- substrb(text,int)
Description: Extracts a substring. int indicates the start position of the extraction.
Return type: text
Example:
1 2 3 4 5
gaussdb=# SELECT substrb('string',2); substrb --------- tring (1 row)
- substr(bytea,from,count)
Description: Extracts a substring from bytea. from specifies the position where the extraction starts. count specifies the length of the extracted substring.
Return type: text
Example:
1 2 3 4 5
gaussdb=# SELECT substr('string',2,3); substr -------- tri (1 row)
- string || string
Description: Concatenates strings.
Return type: text
Example:
1 2 3 4 5
gaussdb=# SELECT 'MPP'||'DB' AS RESULT; result -------- MPPDB (1 row)
- string || non-string or non-string || string
Description: Concatenates strings and non-strings.
Return type: text
Example:
1 2 3 4 5
gaussdb=# SELECT 'Value: '||42 AS RESULT; result ----------- Value: 42 (1 row)
- split_part(string text, delimiter text, field int)
Description: Splits string on delimiter and returns the fieldth column (counting from text of the first appeared delimiter).
Return type: text
Example:
1 2 3 4 5
gaussdb=# SELECT split_part('abc~@~def~@~ghi', '~@~', 2); split_part ------------ def (1 row)
- strpos(string, substring)
Description: Specifies the position of a substring. It is the same as position(substring in string). However, the parameter sequences of them are reversed.
Return type: int
Example:
1 2 3 4 5
gaussdb=# SELECT strpos('source', 'rc'); strpos -------- 4 (1 row)
- to_hex(number int or bigint)
Description: Converts a number to a hexadecimal expression.
Return type: text
Example:
1 2 3 4 5
gaussdb=# SELECT to_hex(2147483647); to_hex ---------- 7fffffff (1 row)
- translate(string text, from text, to text)
Description: Any character in string that matches a character in from is replaced by the corresponding character in to. If from is longer than to, extra characters occurred in from are removed.
Return type: text
Example:
1 2 3 4 5
gaussdb=# SELECT translate('12345', '143', 'ax'); translate ----------- a2x5 (1 row)
- length(string)
Description: Obtains the number of characters in a string.
Return type: integer
Example:
1 2 3 4 5
gaussdb=# SELECT length('abcd'); length -------- 4 (1 row)
- lengthb(string)
Description: Obtains the number of characters in a string. The value depends on character sets (GBK and UTF8).
Return type: integer
Example:
1 2 3 4 5
gaussdb=# SELECT lengthb('Chinese'); lengthb --------- 7 (1 row)
- substr(string,from)
Extracts substrings from a string.
from indicates the start position of the extraction.
- If from starts at 0, the value 1 is used.
- If the value of from is positive, all characters from from to the end are extracted.
- If the value of from is negative, the last n characters in the string are extracted, in which n indicates the absolute value of from.
Return type: varchar
Example:
If the value of from is positive:
1 2 3 4 5
gaussdb=# SELECT substr('ABCDEF',2); substr -------- BCDEF (1 row)
If the value of from is negative:
1 2 3 4 5
gaussdb=# SELECT substr('ABCDEF',-2); substr -------- EF (1 row)
- substr(string,from,count)
Extracts substrings from a string.
from indicates the start position of the extraction.
count indicates the length of the extracted substring.
- If from starts at 0, the value 1 is used.
- If the value of from is positive, extract count characters starting from from.
- If the value of from is negative, extract the last n count characters in the string, in which n indicates the absolute value of from.
- If the value of count is smaller than 1, null is returned.
Return type: varchar
Example:
If the value of from is positive:
1 2 3 4 5
gaussdb=# SELECT substr('ABCDEF',2,2); substr -------- BC (1 row)
If the value of from is negative:
1 2 3 4 5
gaussdb=# SELECT substr('ABCDEF',-3,2); substr -------- DE (1 row)
- substrb(string,from)
Description: The functionality of this function is the same as that of SUBSTR(string,from). However, the calculation unit is byte.
Return type: bytea
Example:
1 2 3 4 5
gaussdb=# SELECT substrb('ABCDEF',-2); substrb --------- EF (1 row)
- substrb(string,from,count)
Description: The functionality of this function is the same as that of SUBSTR(string,from,count). However, the calculation unit is byte.
Return type: bytea
Example:
1 2 3 4 5
gaussdb=# SELECT substrb('ABCDEF',2,2); substrb --------- BC (1 row)
- to_single_byte(char)
Description: Converts all multi-byte characters in a string to single-byte characters.
Return type: text
This function is valid only when the value of a_format_version is 10c and that of a_format_dev_version is s2 in an ORA-compatible database.
- to_multi_byte(char)
Description: Converts all single-byte characters in a string to multi-byte characters.
Return type: text
This function is valid only when the value of a_format_version is 10c and that of a_format_dev_version is s2 in an ORA-compatible database.
- trim([leading |trailing |both] [characters] from string)
Description: Removes the longest string containing only the characters (a space by default) from the start/end/both ends of the string.
Return type: text
Example:
1 2 3 4 5
gaussdb=# SELECT trim(BOTH 'x' FROM 'xTomxx'); btrim ------- Tom (1 row)
1 2 3 4 5
gaussdb=# SELECT trim(LEADING 'x' FROM 'xTomxx'); ltrim ------- Tomxx (1 row)
1 2 3 4 5
gaussdb=# SELECT trim(TRAILING 'x' FROM 'xTomxx'); rtrim ------- xTom (1 row)
- rtrim(string [, characters])
Description: Removes the longest string containing only characters from characters (a space by default) from the end of string.
Return type: text
Example:
1 2 3 4 5
gaussdb=# SELECT rtrim('TRIMxxxx','x'); rtrim ------- TRIM (1 row)
- ltrim(string [, characters])
Description: Removes the longest string containing only characters from characters (a space by default) from the start of string.
Return type: text
Example:
1 2 3 4 5
gaussdb=# SELECT ltrim('xxxxTRIM','x'); ltrim ------- TRIM (1 row)
- upper(string)
Description: Converts the string into the uppercase.
Return type: text
Example:
1 2 3 4 5
gaussdb=# SELECT upper('tom'); upper ------- TOM (1 row)
- lower(string)
Description: Converts the string into the lowercase.
Return type: text
Example:
1 2 3 4 5
gaussdb=# SELECT lower('TOM'); lower ------- tom (1 row)
- nls_upper(string [, nlsparam])
Description: Converts a character string to uppercase letters. You can specify a sorting rule to process special uppercase conversion rules in some languages. The format of nlsparam is 'nls_sort=sort_name', where sort_name is replaced by the specific sorting rule name. When the nlsparam parameter is not set, this function is equivalent to upper.
Return type: text
Example:
1 2 3 4 5
gaussdb=# SELECT nls_upper('große'); nls_upper ----------- GROßE (1 row)
1 2 3 4 5
gaussdb=# SELECT nls_upper('große', 'nls_sort = XGerman'); nls_upper ----------- GROSSE (1 row)
This function can be used only when the value of a_format_version is 10c and that of a_format_dev_version is s2 in an ORA-compatible database.
- nls_lower(string [, nlsparam])
Description: Converts a character string to lowercase letters. You can specify a sorting rule to process special lowercase conversion rules in some languages. The format of nlsparam is 'nls_sort=sort_name', where sort_name is replaced by the specific sorting rule name. When the nlsparam parameter is not set, this function is equivalent to lower.
Return type: text
Example:
1 2 3 4 5
gaussdb=# SELECT nls_lower('INDIVISIBILITY'); nls_lower ---------------- indivisibility (1 row)
1 2 3 4 5
gaussdb=# SELECT nls_lower('INDIVISIBILITY', 'nls_sort = XTurkish'); nls_lower ---------------- ındıvısıbılıty (1 row)
This function can be used only when the value of a_format_version is 10c and that of a_format_dev_version is s2 in an ORA-compatible database.
- instr(string,substring[,position,occurrence])
Description: Queries and returns the value of the substring position that occurs the occurrence (1 by default) times from the position (1 by default) in the string.
- If the value of position is 0, 0 is returned.
- If the value of position is negative, searches backwards from the last nth character in the string, in which n indicates the absolute value of position.
In this function, the calculation unit is character. One Chinese character is one character.
Return type: integer
Example:
1 2 3 4 5
gaussdb=# SELECT instr('corporate floor','or', 3); instr ------- 5 (1 row)
1 2 3 4 5
gaussdb=# SELECT instr('corporate floor','or',-3,2); instr ------- 2 (1 row)
- initcap(string)
Description: Capitalizes the first letter of each word in a string.
Return type: text
Example:
1 2 3 4 5
gaussdb=# SELECT initcap('hi THOMAS'); initcap ----------- Hi Thomas (1 row)
When a_format_version is set to 10c and a_format_dev_version is set to s2 in ORA-compatible database, if a case-insensitive character, such as Chinese, is followed by a case-sensitive character, only the first letter of the case-sensitive character is capitalized. Therefore, you are advised to set a_format_version to 10c and a_format_dev_version to s2.
- ascii(string)
Description: Indicates the ASCII code of the first character in the string.
Return type: integer
Example:
1 2 3 4 5
gaussdb=# SELECT ascii('xyz'); ascii ------- 120 (1 row)
- ascii2(string)
Description: Returns the decimal code of the first character of the input string in the database character set.
Return type: integer
Example:
1 2 3 4 5
gaussdb=# SELECT ascii2('xyz'); ascii2 -------- 120 (1 row)
This function is valid only when the value of a_format_version is 10c and that of a_format_dev_version is s2 in an ORA-compatible database.
- asciistr(string)
Description: Converts non-ASCII characters in the input string to \XXXX, where XXXX indicates the UTF-16 code unit.
Return type: varchar
This function is valid only when the value of a_format_version is 10c and that of a_format_dev_version is s2 in an ORA-compatible database.
- unistr(string)
Description: Converts the coding sequence in a string into the corresponding character. Other characters remain unchanged.
Return type: text
- This function is valid only when the value of a_format_version is 10c and that of a_format_dev_version is s2 in an ORA-compatible database.
- The backslash (\) must be followed by four hexadecimal characters to indicate the coding sequence, or another backslash (\) indicates that a single backslash (\) is entered.
- If the input parameter is of the time type, the time type is implicitly converted to the character string type.
- vsize(expr)
Description: Returns the number of bytes of the input expression.
Return type: int
This function is valid only when the value of a_format_version is 10c and that of a_format_dev_version is s2 in an ORA-compatible database.
- replace(string varchar, search_string varchar, replacement_string varchar)
Description: Replaces all search_string in the string with replacement_string.
Return type: varchar
Example:
1 2 3 4 5
gaussdb=# SELECT replace('jack and jue','j','bl'); replace ---------------- black and blue (1 row)
- concat(str1,str2)
Description: Connects str1 and str2 and returns the string. Note: concat calls the output function of the data type and the return value is immutable. As a result, the optimizer cannot calculate the result in advance when generating a plan. If there are performance requirements, you are advised to use the operator ||.
- If sql_compatibility is set to 'MYSQL' and str1 or str2 is set to NULL, the returned result is NULL.
- The return value of the concat function is of the variable-length type. When the concat function is compared with table data, the character string length is lost in the combination result. As a result, the comparison results are different.
Return type: varchar
Example:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
gaussdb=# SELECT concat('Hello', ' World!'); concat -------------- Hello World! (1 row) gaussdb=# SELECT concat('Hello', NULL); concat -------- (1 row) gaussdb=# CREATE TABLE test_space(c char(10)); CREATE TABLE gaussdb=# INSERT INTO test_space VALUES('a'); INSERT 0 1 -- After spaces are padded, the character string is still a fixed-length character string. It is expected that the result can be found. gaussdb=# SELECT * FROM test_space WHERE c = 'a '; c ------------ a (1 row) -- The combination result is a variable-length character string. The comparison fails and the result cannot be found. gaussdb=# SELECT * FROM test_space WHERE c = 'a' || ' '; c --- (0 rows)
- chr(integer)
Description: For the UTF-8 character set, the input is encoded as Unicode and a UTF-8 character is returned. For other character sets, an ASCII character is returned.
Return type: text
When a_format_version is set to 10c and a_format_dev_version is set to s1 in an ORA-compatible database, the value is truncated instead of being rounded off if the value of integer is a decimal.
Example:
1 2 3 4 5 6 7 8 9 10 11 12
gaussdb=# SELECT chr(65); chr ----- A (1 row) -- If the UTF-8 character set is used (the database character set has been set to UTF-8 during database creation, the database character set cannot be reset). gaussdb=# SELECT chr(19968); chr ----- — (1 row)
- chr(cvalue int|bigint)
Description: Converts cvalue to the character of the corresponding byte order and returns the character.
cvalue can be converted into a value of the int or bigint type. The value range is [0,2^32 – 1], corresponding to the range of unsigned int. A character array consisting of one to four bytes is returned based on the value of n. The byte arrays returned in different character sets are the same. However, due to different encoding rules, the result of the returned character string varies depending on the character set encoding.
If the character set is a single-byte character set, an ASCII character is returned after cvalue mod 256.
Precautions:- If a byte in the input cvalue is 0, the output is truncated.
- If the input does not comply with the encoding rule of the current character set, an error is reported.
- If the input is NULL or 0, NULL is returned.
Return type: text
Example:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
gaussdb=# SELECT chr(65); chr ----- A (1 row) gaussdb=# CREATE DATABASE gaussdb_o WITH DBCOMPATIBILITY 'ORA'; gaussdb=# \c gaussdb_o gaussdb_o=# SET a_format_version='10c'; gaussdb_o=# SET a_format_dev_version = 's1'; gaussdb_o=# SELECT chr(16705); chr ----- AA (1 row) -- The output is truncated. gaussdb_o=# SELECT chr(4259905); chr ----- A (1 row) gaussdb_o=# \c postgres gaussdb=# DROP DATABASE gaussdb_o;
When a_format_version is set to 10c and a_format_dev_version is set to s1 in an ORA-compatible database, the chr function returns a character in the collation based on the input value. If the current character set for database encoding is a multi-byte character set, the return value contains one to four bytes. If the current character set for database encoding is a single-byte character set, the return value is a single byte obtained by performing the mod 256 operation on the input value. Otherwise, if the current character set for database encoding is UTF-8, the input is encoded as Unicode and a UTF-8 character is returned. For other character sets, an ASCII character is returned.
- regexp_substr(source_char, pattern)
Description: Extracts substrings from a regular expression. If the SQL syntax is ORA- and MySQL-compatible and the value of the GUC parameter behavior_compat_options contains aformat_regexp_match, the period (.) cannot match the '\n' character. If aformat_regexp_match is not contained, the period (.) matches the '\n' character by default.
Return type: text
Example:
1 2 3 4 5
gaussdb=# SELECT regexp_substr('500 Hello World, Redwood Shores, CA', ',[^,]+,') "REGEXPR_SUBSTR"; REGEXPR_SUBSTR ------------------- , Redwood Shores, (1 row)
- regexp_replace(string, pattern, replacement [,flags ])
Description: Replaces substrings matching the POSIX-style regular expression. The source string is returned unchanged if there is no match to the pattern. If there is a match, the source string is returned with the replacement string substituted for the matching substring.
The replacement string can contain \n, where n is 1 through 9, to indicate that the source substring matching the nth parenthesized sub-expression of the pattern should be inserted, and it can contain \& to indicate that the substring matching the entire pattern should be inserted.
The optional flags argument contains zero or multiple single-letter flags that change the function behavior. The options supported by flags and description are shown in Table 1.
Return type: varchar
Example:
1 2 3 4 5 6 7 8 9 10
gaussdb=# SELECT regexp_replace('Thomas', '.[mN]a.', 'M'); regexp_replace ---------------- ThM (1 row) gaussdb=# SELECT regexp_replace('foobarbaz','b(..)', E'X\\1Y', 'g') AS RESULT; result ------------- fooXarYXazY (1 row)
- repexp_replace(string text, pattern text [, replacement text [, position int [, occurrence int [, flags text]]]])
Description: Replaces substrings matching the POSIX-style regular expression. The source string is returned unchanged if there is no match to the pattern. If there is a match, the source string is returned with the replacement string substituted for the matching substring.
Parameter description:- string: source character string used for matching.
- pattern: regular expression pattern string used for matching.
- replacement:character string used to replace the matched substring. This parameter is optional. If no parameter value is specified or the parameter value is null, an empty string is used for replacement.
- position: start character of the source string used for matching. This parameter is optional. The default value is 1.
- occurrence: sequence number of the matched substring to be replaced. This parameter is optional. The default value is 1, indicating that the first matched substring is replaced.
- flags: contains zero or multiple single-letter flags that change the matching behavior of the function. This parameter is optional. The options supported by flags and description are shown in Table 1.
When a_format_version is set to 10c and a_format_dev_version is set to s1 in an ORA-compatible database, the default value of occurrence is 0, indicating that all matched substrings are replaced and the pattern parameter ending with a backslash (\) is valid.
Return type: text
Example:
gaussdb=# SELECT regexp_replace('foobarbaz','b(..)', E'X\\1Y', 2, 2, 'n') AS RESULT; result ------------ foobarXazY (1 row)
- concat_ws(sep text, str"any" [, str"any" [, ...] ])
Description: Uses the first parameter as the separator, which is associated with all following parameters. The NULL parameter is ignored.
- If the first parameter value is NULL, the returned result is NULL.
- If the first parameter is provided but the parameter value is an empty string ('') and the SQL compatibility mode of the database is set to ORA, the returned result is NULL. This is because the ORA-compatible mode treats the empty string ('') as NULL. To resolve this problem, you can change the SQL compatibility mode of the database to MySQL, TD, or PG.
Return type: text
Example:
1 2 3 4 5
gaussdb=# SELECT concat_ws(',', 'ABCDE', 2, NULL, 22); concat_ws ------------ ABCDE,2,22 (1 row)
- nlssort(string text, sort_method text)
Description: Returns the encoding value of a string in the sorting mode specified by sort_method. The encoding value can be used for sorting and determines the sequence of the string in the sorting mode. Currently, sort_method can be set to nls_sort=schinese_pinyin_m or nls_sort=generic_m_ci. nls_sort=generic_m_ci supports only the case-insensitive order for English characters.
String type: text
sort_method type: text
Return type: text
Example:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
gaussdb=# CREATE TABLE test(a text); gaussdb=# INSERT INTO test(a) VALUES('abC'); gaussdb=# INSERT INTO test(a) VALUES('abC'); gaussdb=# INSERT INTO test(a) VALUES('abc'); gaussdb=# SELECT * FROM test ORDER BY nlssort(a,'nls_sort=schinese_pinyin_m'); a -------- abc abC abC (3 rows) gaussdb=# SELECT * FROM test ORDER BY nlssort(a, 'nls_sort=generic_m_ci'); a -------- abC abc abC (3 rows) gaussdb=# DROP TABLE test;
- convert(string bytea, src_encoding name, dest_encoding name)
Description: Converts the string to dest_encoding. src_encoding specifies the source code encoding. The string must be valid in this encoding.
Return type: bytea
Example:
1 2 3 4 5
gaussdb=# SELECT convert('text_in_utf8', 'UTF8', 'GBK'); convert ---------------------------- \x746578745f696e5f75746638 (1 row)
If the rule for converting between source to target encoding (for example, GBK and LATIN1) does not exist, the string is returned without conversion. See the pg_conversion system catalog for details.
Example:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
gaussdb=# SHOW server_encoding; server_encoding ----------------- LATIN1 (1 row) gaussdb=# SELECT convert_from('some text', 'GBK'); convert_from -------------- some text (1 row) db_latin1=# SELECT convert_to('some text', 'GBK'); convert_to ---------------------- \x736f6d652074657874 (1 row) db_latin1=# SELECT convert('some text', 'GBK', 'LATIN1'); convert ---------------------- \x736f6d652074657874 (1 row)
- convert_from(string bytea, src_encoding name)
Description: Converts a string using the coding mode of the database.
src_encoding specifies the source code encoding. The string must be valid in this encoding.
Return type: text
Example:
1 2 3 4 5
gaussdb=# SELECT convert_from('text_in_utf8', 'UTF8'); convert_from -------------- text_in_utf8 (1 row)
- convert_to(string text, dest_encoding name)
Description: Converts a string to dest_encoding.
Return type: bytea
Example:
1 2 3 4 5
gaussdb=# SELECT convert_to('some text', 'UTF8'); convert_to ---------------------- \x736f6d652074657874 (1 row)
- string [NOT] LIKE pattern [ESCAPE escape-character]
Description: Specifies the pattern matching function.
If the pattern does not include a percentage sign (%) or an underscore (_), this mode represents itself only. In this case, the behavior of LIKE is the same as the equal operator. The underscore (_) in the pattern matches any single character while one percentage sign (%) matches no or multiple characters.
To match with underscores (_) or percent signs (%), corresponding characters in pattern must lead escape characters. The default escape character is a backward slash (\) and can be specified using the ESCAPE clause. To match with escape characters, enter two escape characters.
Return type: Boolean
Example:
1 2 3 4 5
gaussdb=# SELECT 'AA_BBCC' LIKE '%A@_B%' ESCAPE '@' AS RESULT; result -------- t (1 row)
1 2 3 4 5
gaussdb=# SELECT 'AA_BBCC' LIKE '%A@_B%' AS RESULT; result -------- f (1 row)
1 2 3 4 5
gaussdb=# SELECT 'AA@_BBCC' LIKE '%A@_B%' AS RESULT; result -------- t (1 row)
- REGEXP_LIKE(source_string, pattern [, match_parameter])
Description: Indicates the mode matching function of a regular expression.
source_string indicates the source string and pattern indicates the matching pattern of the regular expression. match_parameter indicates the matching items and the values are as follows:
- 'i': case-insensitive
- 'c': case-sensitive
- 'n': allowing the metacharacter "." in a regular expression to be matched with a linefeed.
- 'm': allows source_string to be regarded as multiple rows.
If match_parameter is ignored, case-sensitive is enabled by default, "." is not matched with a linefeed, and source_string is regarded as a single row.
Return type: Boolean
Example:
1 2 3 4 5
gaussdb=# SELECT regexp_like('ABC', '[A-Z]'); regexp_like ------------- t (1 row)
1 2 3 4 5
gaussdb=# SELECT regexp_like('ABC', '[D-Z]'); regexp_like ------------- f (1 row)
1 2 3 4 5
gaussdb=# SELECT regexp_like('ABC', '[A-Z]','i'); regexp_like ------------- t (1 row)
1 2 3 4 5
gaussdb=# SELECT regexp_like('ABC', '[A-Z]'); regexp_like ------------- t (1 row)
- format(formatstr text [, str"any" [, ...] ])
Description: Formats a string.
Return type: text
Example:
1 2 3 4 5
gaussdb=# SELECT format('Hello %s, %1$s', 'World'); format -------------------- Hello World, World (1 row)
- md5(string)
Description: Encrypts a string in MD5 mode and returns a value in hexadecimal form.
The MD5 encryption algorithm has lower security and poses security risks. Therefore, you are advised to use a more secure encryption algorithm.
Return type: text
Example:
1 2 3 4 5
gaussdb=# SELECT md5('ABC'); md5 ---------------------------------- 902fbdd2b1df0c4f70b4a5d23525e932 (1 row)
- sha(string) / sha1(string)
Description: Encrypts a string using SHA1 and returns a hexadecimal number. The sha and sha1 functions are the same.
- The SHA1 encryption algorithm is not recommended because it has lower security and poses security risks.
- This function is valid only when GaussDB is compatible with the MY type (that is, sql_compatibility is set to 'MYSQL').
Return type: text
Example:
1 2 3 4 5 6 7 8 9 10
gaussdb=# SELECT sha('ABC'); sha ------------------------------------------ 3c01bdbb26f358bab27f267924aa2c9a03fcfdb8 (1 row) gaussdb=# SELECT sha1('ABC'); sha1 ------------------------------------------ 3c01bdbb26f358bab27f267924aa2c9a03fcfdb8 (1 row)
- sha2(string, hash_length)
Description: Encrypts a string in SHA2 mode and returns a value in hexadecimal form.
hash_length: corresponds to an SHA2 algorithm. The value can be 0 (SHA-256), 224 (SHA-224), 256 (SHA-256), 384 (SHA-384), or 512 (SHA-512). For other values, NULL is returned.
- The SHA-224 encryption algorithm is not recommended because it has lower security and poses security risks.
- The SHA2 function records hash plaintext in logs. Therefore, you are advised not to use this function to encrypt sensitive information such as keys.
- This function is valid only when GaussDB is compatible with MySQL (that is, sql_compatibility is set to 'MYSQL').
Return type: text
Example:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
gaussdb=# SELECT sha2('ABC',224); sha2 ---------------------------------------------------------- 107c5072b799c4771f328304cfe1ebb375eb6ea7f35a3aa753836fad (1 row) gaussdb=# SELECT sha2('ABC',256); sha2 ------------------------------------------------------------------ b5d4045c3f466fa91fe2cc6abe79232a1a57cdf104f7a26e716e0a1e2789df78 (1 row) gaussdb=# SELECT sha2('ABC',0); sha2 ------------------------------------------------------------------ b5d4045c3f466fa91fe2cc6abe79232a1a57cdf104f7a26e716e0a1e2789df78 (1 row)
- decode(string text, format text)
Description: Decodes binary data from textual representation.
Return type: bytea
Example:
1 2 3 4 5
gaussdb=# SELECT decode('MTIzAAE=', 'base64'); decode -------------- \x3132330001 (1 row)
- similar_escape(pat text, esc text)
Description: Converts a regular expression of the SQL:2008 style to the POSIX style.
Return type: text
Example:
gaussdb=# select similar_escape('\s+ab','2'); similar_escape ---------------- ^(?:\\s+ab)$ (1 row)
- find_in_set(text, set)
Description: Finds the position of a given member in a set, counting from 1. If no record is found, 0 is returned. The distributed system does not support the SET data type. An error will be reported when this function is executed.
Return type: int2
- encode(data bytea, format text)
Description: Encodes binary data into a textual representation.
Return type: text
Example:
1 2 3 4 5
gaussdb=# SELECT encode(E'123\\000\\001', 'base64'); encode ---------- MTIzAAE= (1 row)
- For a string containing newline characters, for example, a string consisting of a newline character and a space, the value of length and lengthb in GaussDB is 2.
- In GaussDB, n in the CHAR(n) type indicates the number of characters. Therefore, for multiple-octet coded character sets, the length returned by the LENGTHB function may be longer than n.
- GaussDB supports multiple types of databases, including ORA, MySQL, TD, and PG. The lexical analyzer of ORA-compatible database is different from that of the other three databases. In an ORA-compatible database, an empty string is considered as NULL. Therefore, when an ORA-compatible database is used, if an empty string is used as a parameter in the preceding character operation function, no output is displayed. Example:
gaussdb=# SELECT translate('12345','123',''); translate ----------- (1 row)
This is because the kernel checks whether the input parameter contains NULL before calling the corresponding function. If the input parameter contains NULL, the kernel does not call the corresponding function. As a result, no output is displayed. In PG-compatible mode, the processing of character strings is the same as that of PostgreSQL. Therefore, the preceding problem does not occur.
Extension Functions and Operators
- pkg_bpchar_opc()
Description: Serves as an extension API to add the comparison operator between bpchar and text or between text and bpchar policies, so as to solve the problem that indexes cannot be matched when data of the bpchar and text types is compared. Only the system administrator can install extensions.
The extended function is for internal use only. You are advised not to use it.
Example:
The tables logs_nchar, logs_char, logs_varchar2, and logs_text are as follows:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61
/* logs_nchar table */ DROP TABLE IF EXISTS logs_nchar; CREATE TABLE logs_nchar ( log_id NCHAR(16), log_message VARCHAR2(255) NOT NULL ); DROP INDEX IF EXISTS idx_nchar_logid; CREATE INDEX idx_nchar_logid ON logs_nchar(log_id); INSERT INTO logs_nchar VALUES('FE306991300002 ', '111'); INSERT INTO logs_nchar VALUES('FE306991300003 ', '222'); INSERT INTO logs_nchar VALUES('FE306991300004 ', '222'); /* logs_char table */ DROP TABLE IF EXISTS logs_char; CREATE TABLE logs_char ( log_id CHAR(16), log_message VARCHAR2(255) NOT NULL ); DROP INDEX IF EXISTS idx_char_logid; CREATE INDEX idx_char_logid ON logs_char(log_id); INSERT INTO logs_char VALUES('FE306991300002 ', '111'); INSERT INTO logs_char VALUES('FE306991300003 ', '222'); INSERT INTO logs_char VALUES('FE306991300004 ', '222'); /* logs_varchar2 table */ DROP TABLE IF EXISTS logs_varchar2; CREATE TABLE logs_varchar2 ( log_id VARCHAR2(16), log_message VARCHAR2(255) NOT NULL ); DROP INDEX IF EXISTS idx_varchar2_logid; CREATE INDEX idx_varchar2_logid ON logs_varchar2(log_id); INSERT INTO logs_varchar2 VALUES('FE306991300002 ', '111'); INSERT INTO logs_varchar2 VALUES('FE306991300003 ', '222'); INSERT INTO logs_varchar2 VALUES('FE306991300004 ', '222'); /* logs_text table */ DROP TABLE IF EXISTS logs_text; CREATE TABLE logs_text ( log_id text, log_message VARCHAR2(255) NOT NULL ); DROP INDEX IF EXISTS idx_text_logid; CREATE INDEX idx_text_logid ON logs_text(log_id); INSERT INTO logs_text VALUES('FE306991300002 ', '111'); INSERT INTO logs_text VALUES('FE306991300003 ', '222'); INSERT INTO logs_text VALUES('FE306991300004 ', '222');
Compare the bpchar type with the text type (initial state, forward compatibility).1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33
/* The purpose is to obtain the detailed execution plan of all nodes. */ gaussdb=# SET max_datanode_for_plan = 64; SET /* If no extension is installed, when nchar and text are compared, nchar is implicitly converted to text because there is no bpchar or text index operator. That is, the fixed-length character type is converted to the variable-length character type. As a result, the execution plan changes and the index cannot be matched. */ gaussdb=# EXPLAIN SELECT * FROM logs_nchar WHERE log_id = RPAD(TRIM('FE306991300002 '),16,' '); QUERY PLAN ------------------------------------------------------------------------ ----------------------------------------------------------------- Data Node Scan (cost=0.00..0.00 rows=0 width=0) Node/s: All datanodes Remote SQL: SELECT log_id, log_message FROM public.logs_nchar WHERE log _id::text = rpad(btrim('FE306991300002 '::text), 16, ' '::text) Datanode Name: datanode1 Seq Scan on logs_nchar (cost=0.00..1.01 rows=1 width=584) Filter: ((log_id)::text = 'FE306991300002 '::text) Datanode Name: datanode2 Seq Scan on logs_nchar (cost=0.00..1.03 rows=1 width=584) Filter: ((log_id)::text = 'FE306991300002 '::text) (12 rows) /* The log_id column in the logs_nchar table is of nchar(16) type. The inserted data is 'FE306991300002 ', which is implicitly converted to text. During comparison, spaces are deleted, that is, 'FE306991300002'='FE306991300002 '. Therefore, the data is not matched. */ gaussdb=# SELECT * FROM logs_nchar WHERE log_id = RPAD(TRIM('FE306991300002 '),16,' '); log_id | log_message --------+------------- (0 rows)
Compare the bpchar type with the text type (the pkg_bpchar_opc extension is installed, which is ORA-compatible).1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35
/* The system administrator installs the pkg_bpchar_opc extension. The comparison operators of the bpchar and text types and index-related content are added to the database. */ gaussdb=# CREATE EXTENSION pkg_bpchar_opc; CREATE EXTENSION gaussdb=# SET max_datanode_for_plan = 64; SET gaussdb=# EXPLAIN SELECT * FROM logs_nchar WHERE log_id = RPAD(TRIM('FE306991300002 '),16,' '); QUERY PLAN ------------------------------------------------------------------------ ----------------------------------------------------------- Data Node Scan (cost=0.00..0.00 rows=0 width=0) Node/s: datanode2 Remote SQL: SELECT log_id, log_message FROM public.logs_nchar WHERE log _id = rpad(btrim('FE306991300002 '::text), 16, ' '::text) Datanode Name: datanode2 [Bypass] Index Scan using idx_nchar_logid on logs_nchar (cost=0.00..8.27 rows =1 width=584) Index Cond: (log_id = 'FE306991300002 '::text) (9 rows) /* In this case, when log_id is implicitly converted to the bpchar type and compared with the text type, the comparison operator and index information can be found, and the index can be matched. */ gaussdb=# SELECT * FROM logs_nchar WHERE log_id = RPAD(TRIM('FE306991300002 '),16,' '); log_id | log_message ------------------+------------- FE306991300002 | 111 (1 row) gaussdb=# DROP EXTENSION pkg_bpchar_opc; DROP EXTENSION
Compare the text type with the bpchar type (initial state, forward compatibility).1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28
gaussdb=# SET max_datanode_for_plan = 64; SET gaussdb=# EXPLAIN SELECT * FROM logs_text WHERE log_id = 'FE306991300002 '::bpchar; QUERY PLAN ------------------------------------------------------------------------ -------------------------------------- Data Node Scan (cost=0.00..0.00 rows=0 width=0) Node/s: datanode2 Remote SQL: SELECT log_id, log_message FROM public.logs_text WHERE log_ id = 'FE306991300002 '::bpchar::text Datanode Name: datanode2 [Bypass] Index Scan using idx_text_logid on logs_text (cost=0.00..8.27 rows=1 width=548) Index Cond: (log_id = 'FE306991300002'::text) (9 rows) gaussdb=# SELECT * FROM logs_text WHERE log_id = 'FE306991300002 '::bpchar; log_id | log_message --------+------------- (0 rows) gaussdb=# SELECT * FROM logs_text WHERE log_id = 'FE306991300002 '::text; log_id | log_message ------------------+------------- FE306991300002 | 111 (1 row)
Compare the text type with the bpchar type (the pkg_bpchar_opc extension is installed, which is ORA-compatible).1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32
gaussdb=# CREATE EXTENSION pkg_bpchar_opc; CREATE EXTENSION gaussdb=# SET max_datanode_for_plan = 64; SET gaussdb=# explain select * from logs_text t1 where t1.log_id ='FE306991300002 '::bpchar; QUERY PLAN ------------------------------------------------------------------------ ----------------------------------- Data Node Scan (cost=0.00..0.00 rows=0 width=0) Node/s: datanode2 Remote SQL: SELECT log_id, log_message FROM public.logs_text t1 WHERE l og_id = 'FE306991300002 '::bpchar Datanode Name: datanode2 [Bypass] Index Scan using idx_text_logid on logs_text t1 (cost=0.00..8.27 row s=1 width=548) Index Cond: (log_id = 'FE306991300002 '::bpchar) (9 rows) gaussdb=# SELECT * FROM logs_text WHERE log_id = 'FE306991300002 '::bpchar; log_id | log_message --------+------------- (0 rows) gaussdb=# SELECT * FROM logs_text WHERE log_id = 'FE306991300002 '::text; log_id | log_message ------------------+------------- FE306991300002 | 111 (1 row) gaussdb=# DROP EXTENSION pkg_bpchar_opc; DROP EXTENSION
Compare the text type with the bpchar type (initial state, forward compatibility).1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34
gaussdb=# SET max_datanode_for_plan = 64; SET gaussdb=# SELECT * FROM logs_varchar2 t1, logs_char t2 WHERE t1.log_id = t2.log_id; log_id | log_message | log_id | log_message ------------------+-------------+------------------+------------- FE306991300003 | 222 | FE306991300003 | 222 FE306991300004 | 222 | FE306991300004 | 222 FE306991300002 | 111 | FE306991300002 | 111 (3 rows) gaussdb=# EXPLAIN SELECT * FROM logs_varchar2 t1, logs_char t2 WHERE t1.log_id = t2.log_id; QUERY PLAN ------------------------------------------------------------------------ -------------- Streaming (type: GATHER) (cost=17.29..39.84 rows=20 width=1150) Node/s: All datanodes -> Hash Join (cost=13.29..31.42 rows=20 width=1150) Hash Cond: (((t1.log_id)::bpchar) = t2.log_id) -> Streaming(type: REDISTRIBUTE) (cost=0.00..17.98 rows=20 wi dth=566) Spawn on: All datanodes -> Seq Scan on logs_varchar2 t1 (cost=0.00..13.13 rows= 20 width=566) -> Hash (cost=13.13..13.13 rows=21 width=584) -> Seq Scan on logs_char t2 (cost=0.00..13.13 rows=20 w idth=584) (9 rows) gaussdb=# SELECT * FROM logs_varchar2 t1, logs_char t2 WHERE t1.log_id = 'FE306991300002 '; log_id | log_message | log_id | log_message ------------------+-------------+------------------+------------- FE306991300002 | 111 | FE306991300003 | 222 FE306991300002 | 111 | FE306991300002 | 111 FE306991300002 | 111 | FE306991300004 | 222 (3 rows)
Compare the text type with the bpchar type (the pkg_bpchar_opc extension is installed, which is ORA-compatible).1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87
gaussdb=# CREATE EXTENSION pkg_bpchar_opc; CREATE EXTENSION gaussdb=# SET max_datanode_for_plan = 64; SET /* This format is not recommended. After the extension is installed, the varchar2 type of log_id in the t1 table is implicitly converted to the text type. When it is compared with the log_id in the t2 table, the char type of log_id in the t2 table is implicitly converted to the bpchar type. In this case, spaces after log_id is removed by the database, that is, 'FE306991300002'='FE306991300002 '. Therefore, no data is matched. */ /* Incorrect example: */ gaussdb=# SELECT * FROM logs_varchar2 t1, logs_char t2 WHERE t1.log_id = t2.log_id; log_id | log_message | log_id | log_message --------+-------------+--------+------------- (0 rows) gaussdb=# EXPLAIN SELECT * FROM logs_varchar2 t1, logs_char t2 WHERE t1.log_id = t2.log_id; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------------------------- Streaming (type: GATHER) (cost=17.29..45.25 rows=20 width=1150) Node/s: All datanodes -> Hash Join (cost=13.29..36.83 rows=20 width=1150) Hash Cond: ((t1.log_id)::text = t2.log_id) -> Streaming(type: BROADCAST) (cost=0.00..23.37 rows=40 width =566) Spawn on: All datanodes -> Seq Scan on logs_varchar2 t1 (cost=0.00..13.13 rows= 20 width=566) -> Hash (cost=13.13..13.13 rows=21 width=584) -> Seq Scan on logs_char t2 (cost=0.00..13.13 rows=20 w idth=584) (9 rows) gaussdb=# SELECT * FROM logs_varchar2 t1, logs_char t2 WHERE t1.log_id = 'FE306991300002 '; log_id | log_message | log_id | log_message ------------------+-------------+------------------+------------- FE306991300002 | 111 | FE306991300002 | 111 FE306991300002 | 111 | FE306991300003 | 222 FE306991300002 | 111 | FE306991300004 | 222 (3 rows) /* This format is recommended to avoid the following problems: the data type of log_id in the t1 table is converted to the text type, spaces are reserved during comparison, and data cannot be matched when the data type of log_id in the t2 table is compared with the data type of log_id in the t1 table. The type of t1 table is forcibly converted to the bpchar type before the extension is installed, that is, 'FE306991300002' = 'FE306991300002'. Therefore, data is matched. */ /* Correct example: */ gaussdb=# SELECT * FROM logs_varchar2 t1, logs_char t2 WHERE t1.log_id::bpchar = t2.log_id; log_id | log_message | log_id | log_message ------------------+-------------+------------------+------------- FE306991300002 | 111 | FE306991300002 | 111 FE306991300004 | 222 | FE306991300004 | 222 FE306991300003 | 222 | FE306991300003 | 222 (3 rows) gaussdb=# EXPLAIN SELECT * FROM logs_varchar2 t1, logs_char t2 WHERE t1.log_id::bpchar = t2.log_id; QUERY PLAN ------------------------------------------------------------------------ -------------- Streaming (type: GATHER) (cost=17.29..45.25 rows=20 width=1150) Node/s: All datanodes -> Hash Join (cost=13.29..36.83 rows=20 width=1150) Hash Cond: ((t1.log_id)::bpchar = t2.log_id) -> Streaming(type: BROADCAST) (cost=0.00..23.37 rows=40 width =566) Spawn on: All datanodes -> Seq Scan on logs_varchar2 t1 (cost=0.00..13.13 rows= 20 width=566) -> Hash (cost=13.13..13.13 rows=21 width=584) -> Seq Scan on logs_char t2 (cost=0.00..13.13 rows=20 w idth=584) (9 rows) gaussdb=# SELECT * FROM logs_varchar2 t1, logs_char t2 WHERE t1.log_id = 'FE306991300002 '; log_id | log_message | log_id | log_message ------------------+-------------+------------------+------------- FE306991300002 | 111 | FE306991300002 | 111 FE306991300002 | 111 | FE306991300003 | 222 FE306991300002 | 111 | FE306991300004 | 222 (3 rows) gaussdb=# DROP EXTENSION pkg_bpchar_opc; DROP EXTENSION gaussdb=# DROP TABLE logs_nchar; gaussdb=# DROP TABLE logs_char; gaussdb=# DROP TABLE logs_varchar2; gaussdb=# DROP TABLE logs_text;
- This solves the problem that data and indexes cannot be properly matched when equality matching is performed between the bpchar type (containing multiple spaces) and the text type.
- The UB-tree and B-tree are involved. The comparison symbols include >, >=, <, <=, and <>.
- The impact scope involves implicit conversion between character types. For example, when a variable-length data type is compared with a fixed-length data type, the variable-length data type is preferentially converted to the text type instead of the original bpchar type.
- The pkg_bpchar_opc extension is disabled by default. You can check whether the extension is enabled in the pg_extension system catalog. If the extension data exists, the extension is enabled. If the extension data does not exist, the extension is disabled. When extension is disabled, forward compatibility is maintained. When extension is enabled, compatibility with database ORA is maintained. After the pkg_bpchar_opc extension is enabled, you need to set the max_datanode_for_plan parameter as required. This parameter specifies the number of execution plans displayed on DNs when the FQS plan is generated. For example, to set max_datanode_for_plan to 64, run the following command:
set max_datanode_for_plan = 64
Table 2 Functions supported by pkg_bpchar_opc Name
Description
pg_catalog.bpchar_text_lt
Compares the bpchar type with the text type to check whether the value on the left is less than the value on the right.
pg_catalog.bpchar_text_le
Compares the bpchar type with the text type to check whether the value on the left is less than or equal to the value on the right.
pg_catalog.bpchar_text_eq
Compares the bpchar type with the text type and checks whether the value on the left is equal to the value on the right.
pg_catalog.bpchar_text_ge
Compares the bpchar type with the text type to check whether the value on the left is greater than or equal to the value on the right.
pg_catalog.bpchar_text_gt
Compares the bpchar type with the text type to check whether the value on the left is greater than the value on the right.
pg_catalog.bpchar_text_ne
Compares the bpchar type with the text type and checks whether the value on the left is different from the value on the right.
pg_catalog.bpchar_text_cmp
Specifies that the index of the bpchar and text types supports comparison functions.
pg_catalog.text_bpchar_lt
Compares the text type with the bpchar type to check whether the value on the left is less than the value on the right.
pg_catalog.text_bpchar_le
Compares the text type with the bpchar type to check whether the value on the left is less than or equal to the value on the right.
pg_catalog.text_bpchar_eq
Compares the text type with the bpchar type and checks whether the value on the left is equal to the value on the right.
pg_catalog.text_bpchar_ge
Compares the text type with the bpchar type to check whether the value on the left is greater than or equal to the value on the right.
pg_catalog.text_bpchar_gt
Compares the text type with the bpchar type to check whether the value on the left is greater than the value on the right.
pg_catalog.text_bpchar_ne
Compares the text type with the bpchar type and checks whether the value on the left is different from the value on the right.
pg_catalog.text_bpchar_cmp
Specifies that the index of the text and bpchar types supports comparison functions.
pg_catalog.hashbpchartext
Specifies that the hash of the bpchar and text types supports comparison functions.
pg_catalog.hashtextbpchar
Specifies that the hash of the text and bpchar types supports comparison functions.
Feedback
Was this page helpful?
Provide feedbackThank you very much for your feedback. We will continue working to improve the documentation.See the reply and handling status in My Cloud VOC.
For any further questions, feel free to contact us through the chatbot.
Chatbot