Pattern Matching Operators
The database provides the [NOT] LIKE operator matching and regular expression pattern matching.
[NOT] LIKE
Description: Specifies whether a string matches the pattern string following LIKE. The LIKE expression returns true if the string matches the provided pattern. (As expected, the NOT LIKE expression returns false if the LIKE expression returns true, and vice versa.) Pattern matching can be performed on operands (string, binary, and bit string types) that contain 0 characters.
- This operator can succeed only when its pattern matches the entire string. If you want to match a sequence in any position within the string, the pattern must begin and end with a percent sign (%).
- The underscore (_) represents (matches) any single character. The percent sign (%) indicates the wildcard character of any string.
- To match a literal underscore or percent sign, the respective character in pattern must be preceded by the escape character. The default escape character is one backslash but a different one can be selected by using the ESCAPE clause.
- To match with escape characters, enter two escape characters. For example, to write a pattern constant containing a backslash (\), you need to enter two backslashes in SQL statements.
- Whether a value is case-sensitive depends on the collation. For example, the default value utf8mb4_general_ci is case-insensitive.
- The operator ~~ is equivalent to LIKE and ! ~~ is equivalent to NOT LIKE.
Examples:
- During string matching, the collation is utf8mb4_general_ci, which is case-insensitive.
m_db=# INSERT INTO t VALUES ('ab^CD&eF','ab^CD&eF','ab^CD&eF','ab^CD&eF','ab^CD&eF','ab^CD&eF'); m_db=# SELECT c FROM t WHERE c LIKE '%B^c%'; c ---------- ab^CD&eF (1 row) - For fields of the BINARY or BLOB type, the collation is utf8mb4_bin, which is case-sensitive.
m_db=# INSERT INTO t VALUES (b'0110001', 'ab^CD&eF', 'ab^CD&eF'); m_db=# SELECT b FROM t WHERE b LIKE 'ab%'; b ------------ ab^CD&eF (1 row) - Use the escape character to match _ and %.
m_db=# INSERT INTO t VALUES ('ab_CD%eF','ab_CD%eF','ab_CD%eF','ab_CD%eF','ab_CD%eF','ab_CD%eF'); m_db=# SELECT c FROM t WHERE c LIKE '%B\_c%'; c ---------- ab_CD%eF (1 row) - Specify the escape character.
m_db=# INSERT INTO t VALUES ('ab_CD%eF','ab_CD%eF','ab_CD%eF','ab_CD%eF','ab_CD%eF','ab_CD%eF'); m_db=# SELECT c FROM t WHERE c LIKE '%B^_c%' ESCAPE '^'; c ---------- ab_CD%eF (1 row)
[NOT] REGEXP
Description: The REGEXP operator is used for regular expression matching. True or false is returned based on whether the pattern matches the given string. The following table describes the regular expression operators.
|
Operator Name |
Description |
Syntax |
|---|---|---|
|
REGEXP |
Specifies whether a string matches the regular expression. |
expr REGEXP pat |
|
NOT REGEXP |
Specifies whether a string does not match the regular expression. The value is negative of REGEXP. |
expr NOT REGEXP pat |
Regular expression syntax:
- ^: matches the start of a string.
- $: matches the end of a string.
- .: matches any character (including the carriage return and newline characters).
- a*: matches zero or more "a" characters.
- a+: matches one or more "a" characters.
- a?: matches zero or one "a" character.
- de | abc: matches the "de" or "abc" character.
- (abc)*: matches zero or more "abc" characters.
- a{1} or a{2,3}: a{1} indicates that character "a" is repeated once, and a{2,3} indicates that character "a" is repeated twice or three times.
- [a-dX] or [^a-dX]: indicates range matching. [a-dX] matches a, b, c, d, or X. [^a-dX] matches characters that are not a, b, c, d, or X.
- [.characters.]: matches the collation used to verify elements. The following table lists the allowed character names and matched characters.
Name
Character
Name
Character
NUL
0
SOH
1
STX
2
ETX
3
EOT
4
ENQ
5
ACK
6
BEL
7
alert
7
BS
10
backspace
\b'
HT
11
tab
\t'
LF
12
newline
\n'
VT
13
vertical-tab
\v'
FF
14
form-feed
\f'
CR
15
carriage-return
\r'
SO
16
SI
17
DLE
20
DC1
21
DC2
22
DC3
23
DC4
24
NAK
25
SYN
26
ETB
27
CAN
30
EM
31
SUB
32
ESC
33
IS4
34
FS
34
IS3
35
GS
35
IS2
36
RS
36
IS1
37
US
37
space
'
exclamation-mark
!'
quotation-mark
"'
number-sign
#'
dollar-sign
$'
percent-sign
%'
ampersand
&'
apostrophe
\''
left-parenthesis
('
right-parenthesis
)'
asterisk
*'
plus-sign
+'
comma
,'
hyphen
-'
hyphen-minus
-'
period
.'
full-stop
.'
slash
/'
solidus
/'
zero
0'
one
1'
two
2'
three
3'
four
4'
five
5'
six
6'
seven
7'
eight
8'
nine
9'
colon
:'
semicolon
;'
less-than-sign
<'
equals-sign
='
greater-than-sign
>'
question-mark
?'
commercial-at
@'
left-square-bracket
['
backslash
\\'
reverse-solidus
\\'
right-square-bracket
]'
circumflex
^'
circumflex-accent
^'
underscore
_'
low-line
_'
grave-accent
`'
left-brace
{'
left-curly-bracket
{'
vertical-line
|'
right-brace
}'
right-curly-bracket
}'
tilde
~'
DEL
177
-
-
- [=character_class=]: specifies the equivalence class. A character matches all characters with the same sort proofreading value, including itself. For example, if o and (+) are of the same type, [[=o=]], [[=(+)=]], and [o(+)] are synonyms.
- [:character_class:]: matches all character classes that belong to the class. The following table lists the standard class names.
Character Class
Description
Character Range
alnum
Alphanumeric numeric character.
0–9, a–z, and A–Z
alpha
Alphanumeric character.
a–z and A–Z
blank
Blank character.
Blank character[\t]
cntrl
Control character.
[\x01-\x1F]
digit
Digit character.
[0-9]
graph
Graphic character.
[^\x01-\x20]
lower
Lowercase character.
[a-z]
print
Graphic character.
[^\x01-\x20]
punct
Punctuation character.
[-!"#$%&'( )*+,./:;<=>?@[\\]^_`{|}~]
space
Spaces, tabs, new lines, and carriage returns.
[\n\r\t\x0B]
upper
Uppercase character.
[A-Z]
xdigit
Hexadecimal numeric character.
[0-9a-fA-F]
- [[:<:]] and [[:>:]]: matches the start and end of a string respectively.
Examples:
m_db=# SELECT 'fo\nfo' REGEXP '^fo$'; ?column? ---------- f (1 row) m_db=# SELECT 'fofo' REGEXP '^fo'; ?column? ---------- t (1 row) m_db=# SELECT 'Ban' REGEXP '^Ba*n'; ?column? ---------- t (1 row) m_db=# SELECT 'Bn' REGEXP '^Ba?n'; ?column? ---------- t (1 row) m_db=# SELECT 'pi' REGEXP 'pi|apa'; ?column? ---------- t (1 row) m_db=# SELECT 'aXbc' NOT REGEXP '^[a-dXYZ]$'; ?column? ---------- t (1 row)
When enable_escape_string is enabled for m_format_behavior_compat_options, the following escape characters are supported:
|
Escape Character |
Description |
|---|---|
|
\' |
A single quotation mark. |
|
\" |
A double quotation mark. |
|
\b |
A backspace. |
|
\n |
A newline. |
|
\r |
A carriage return. |
|
\t |
A tab. |
|
\Z |
ASCII 26 (Control Z). |
|
\\ |
A backslash. |
|
\% |
A % character. |
|
\_ |
An underscore. |
|
\0 |
GaussDB reports the error: invalid byte sequence for encoding "UTF8": 0x00. |
When enable_escape_string is disabled in m_format_behavior_compat_options and standard_conforming_strings is set to 'off', escape_string_warning is set to 'off', backslash_quote is set to 'on', the following escape characters are supported:
|
Escape Character |
Description |
|---|---|
|
\' |
A single quotation mark. |
|
\b |
Backspace. |
|
\f |
Form feed, for example, C language. |
|
\n |
Newline, for example, C language. |
|
\r |
Carriage return, for example, C language. |
|
\t |
Horizontal tab, for example, C language. |
|
\uwxyz |
Character whose hexadecimal value is 0xwxyz, where wxyz is four hexadecimal digits. |
|
\xhhh |
Character whose hexadecimal value is 0xhhh, where hhh is any sequence of hexadecimal digits. |
|
\0 |
GaussDB reports the error: invalid byte sequence for encoding "UTF8": 0x00. |
|
\xy |
Character whose octal value is 0xy, where xy is two octal digits. |
|
\xyz |
Character whose octal value is 0xyz, where xyz is three octal digits. |
The lexical syntax is parsed based on byte streams. If a multi-byte character contains code that is consistent with symbols such as '\', '\'', and '\\', the behavior of the multi-byte character is inconsistent with that in MySQL. In this case, you are advised to disable the escape character function temporarily.
Feedback
Was this page helpful?
Provide feedbackThank you very much for your feedback. We will continue working to improve the documentation.See the reply and handling status in My Cloud VOC.
For any further questions, feel free to contact us through the chatbot.
Chatbot