Compute
Elastic Cloud Server
Huawei Cloud Flexus
Bare Metal Server
Auto Scaling
Image Management Service
Dedicated Host
FunctionGraph
Cloud Phone Host
Huawei Cloud EulerOS
Networking
Virtual Private Cloud
Elastic IP
Elastic Load Balance
NAT Gateway
Direct Connect
Virtual Private Network
VPC Endpoint
Cloud Connect
Enterprise Router
Enterprise Switch
Global Accelerator
Management & Governance
Cloud Eye
Identity and Access Management
Cloud Trace Service
Resource Formation Service
Tag Management Service
Log Tank Service
Config
OneAccess
Resource Access Manager
Simple Message Notification
Application Performance Management
Application Operations Management
Organizations
Optimization Advisor
IAM Identity Center
Cloud Operations Center
Resource Governance Center
Migration
Server Migration Service
Object Storage Migration Service
Cloud Data Migration
Migration Center
Cloud Ecosystem
KooGallery
Partner Center
User Support
My Account
Billing Center
Cost Center
Resource Center
Enterprise Management
Service Tickets
HUAWEI CLOUD (International) FAQs
ICP Filing
Support Plans
My Credentials
Customer Operation Capabilities
Partner Support Plans
Professional Services
Analytics
MapReduce Service
Data Lake Insight
CloudTable Service
Cloud Search Service
Data Lake Visualization
Data Ingestion Service
GaussDB(DWS)
DataArts Studio
Data Lake Factory
DataArts Lake Formation
IoT
IoT Device Access
Others
Product Pricing Details
System Permissions
Console Quick Start
Common FAQs
Instructions for Associating with a HUAWEI CLOUD Partner
Message Center
Security & Compliance
Security Technologies and Applications
Web Application Firewall
Host Security Service
Cloud Firewall
SecMaster
Anti-DDoS Service
Data Encryption Workshop
Database Security Service
Cloud Bastion Host
Data Security Center
Cloud Certificate Manager
Edge Security
Managed Threat Detection
Blockchain
Blockchain Service
Web3 Node Engine Service
Media Services
Media Processing Center
Video On Demand
Live
SparkRTC
MetaStudio
Storage
Object Storage Service
Elastic Volume Service
Cloud Backup and Recovery
Storage Disaster Recovery Service
Scalable File Service Turbo
Scalable File Service
Volume Backup Service
Cloud Server Backup Service
Data Express Service
Dedicated Distributed Storage Service
Containers
Cloud Container Engine
SoftWare Repository for Container
Application Service Mesh
Ubiquitous Cloud Native Service
Cloud Container Instance
Databases
Relational Database Service
Document Database Service
Data Admin Service
Data Replication Service
GeminiDB
GaussDB
Distributed Database Middleware
Database and Application Migration UGO
TaurusDB
Middleware
Distributed Cache Service
API Gateway
Distributed Message Service for Kafka
Distributed Message Service for RabbitMQ
Distributed Message Service for RocketMQ
Cloud Service Engine
Multi-Site High Availability Service
EventGrid
Dedicated Cloud
Dedicated Computing Cluster
Business Applications
Workspace
ROMA Connect
Message & SMS
Domain Name Service
Edge Data Center Management
Meeting
AI
Face Recognition Service
Graph Engine Service
Content Moderation
Image Recognition
Optical Character Recognition
ModelArts
ImageSearch
Conversational Bot Service
Speech Interaction Service
Huawei HiLens
Video Intelligent Analysis Service
Developer Tools
SDK Developer Guide
API Request Signing Guide
Terraform
Koo Command Line Interface
Content Delivery & Edge Computing
Content Delivery Network
Intelligent EdgeFabric
CloudPond
Intelligent EdgeCloud
Solutions
SAP Cloud
High Performance Computing
Developer Services
ServiceStage
CodeArts
CodeArts PerfTest
CodeArts Req
CodeArts Pipeline
CodeArts Build
CodeArts Deploy
CodeArts Artifact
CodeArts TestPlan
CodeArts Check
CodeArts Repo
Cloud Application Engine
MacroVerse aPaaS
KooMessage
KooPhone
KooDrive
On this page

Show all

Pattern Matching Operators

Updated on 2024-08-20 GMT+08:00

The database provides three independent methods for implementing pattern matching: SQL LIKE operator, SIMILAR TO operator, and POSIX-style regular expressions. Besides these basic operators, functions can be used to extract or replace matching substrings and to split a string at matching locations.

  • LIKE

    Description: Specifies whether the string matches the pattern string following LIKE. The LIKE expression returns true if the string matches the supplied pattern. (As expected, the NOT LIKE expression returns false if LIKE returns true, and vice versa.)

    Matching rules:
    1. This operator can succeed only when its pattern matches the entire string. If you want to match a sequence in any position within the string, the pattern must begin and end with a percent sign.
    2. An underscore (_) represents (matches) any single character. A percent sign (%) represents the wildcard character of any string.
    3. To match a literal underscore or percent sign, the respective character in pattern must be preceded by the escape character. The default escape character is one backslash but a different one can be selected by using the ESCAPE clause.
    4. To match with escape characters, enter two escape characters. For example, to write a pattern constant containing a backslash (\), you need to enter two backslashes in SQL statements.
      NOTE:

      When standard_conforming_strings is set to off, any backslashes you write in literal string constants will need to be doubled. Therefore, writing a pattern that matches a single backslash actually involves writing four backslashes in the statement (you can avoid this by selecting a different escape character with ESCAPE so that the backslash is no longer a special character of LIKE. But the backslash is still the special character of the character text analyzer, so you still need two backslashes.)

      In MySQL-compatible schema, it is also possible to select no escape character by writing ESCAPE ''. This effectively disables the escape mechanism, which makes it impossible to turn off the special meaning of underscore and percent signs in the schema.

    5. The keyword ILIKE can be used instead of LIKE to make the match case-insensitive.
    6. Operator ~~ is equivalent to LIKE, and operator ~~* corresponds to ILIKE.

    Example:

    1
    2
    3
    4
    5
    gaussdb=# SELECT 'abc' LIKE 'abc' AS RESULT;
     result
    -----------
     t
    (1 row)
    
    1
    2
    3
    4
    5
    gaussdb=# SELECT 'abc' LIKE 'a%' AS RESULT;
     result
    -----------
     t
    (1 row)
    
    1
    2
    3
    4
    5
    gaussdb=# SELECT 'abc' LIKE '_b_' AS RESULT;
     result
    -----------
     t
    (1 row)
    
    1
    2
    3
    4
    5
    gaussdb=# SELECT 'abc' LIKE 'c' AS RESULT;
     result
    -----------
     f
    (1 row)
    
  • SIMILAR TO

    Description: Returns true or false depending on whether the pattern matches the given string. It is similar to LIKE, but differs in that it uses the regular expression understanding pattern defined by the SQL standard.

    Matching rules:
    1. Similar to LIKE, this operator succeeds only when its pattern matches the entire string. If you want to match a sequence in any position within the string, the pattern must begin and end with a percent sign.
    2. An underscore (_) represents (matches) any single character. A percent sign (%) represents the wildcard character of any string.
    3. SIMILAR TO supports these pattern-matching metacharacters borrowed from POSIX-style regular expressions:

      Metacharacter

      Description

      |

      Specifies alternation (either of two alternatives).

      *

      Specifies repetition of the previous item zero or more times.

      +

      Specifies repetition of the previous item one or more times.

      ?

      Specifies repetition of the previous item zero or one time.

      {m}

      Specifies repetition of the previous item exactly m times.

      {m,}

      Specifies repetition of the previous item m or more times.

      {m,n}

      Specifies repetition of the previous item at least m times and does not exceed n times.

      ()

      Combines multiple items into a logical item.

      [...]

      Specifies a character class, just as in POSIX-style regular expressions.

    4. A preamble escape character disables the special meaning of any of these metacharacters. The rules for using escape characters are the same as those for LIKE.

    Regular expressions:

    The substring(string from pattern for escape) function extracts a substring that matches an SQL regular expression pattern.

    Example:

    1
    2
    3
    4
    5
    gaussdb=# SELECT 'abc' SIMILAR TO 'abc' AS RESULT;
     result
    -----------
     t
    (1 row)
    
    1
    2
    3
    4
    5
    gaussdb=# SELECT 'abc' SIMILAR TO 'a' AS RESULT;
     result
    -----------
     f
    (1 row)
    
    1
    2
    3
    4
    5
    gaussdb=# SELECT 'abc' SIMILAR TO '%(b|d)%' AS RESULT;
     result
    -----------
     t
    (1 row)
    
    1
    2
    3
    4
    5
    gaussdb=# SELECT 'abc' SIMILAR TO '(b|c)%'  AS RESULT;
     result
    -----------
     f
    (1 row)
    
  • POSIX-style regular expressions

    Description: A regular expression is a collation that is an abbreviated definition of a set of strings (a regular set). If a string is a member of a regular set described by a regular expression, the string matches the regular expression. POSIX-style regular expressions provide a more powerful means for pattern matching than the LIKE and SIMILAR TO operators. Table 1 lists all available operators for pattern matching using POSIX-style regular expressions.

    Table 1 Regular expression match operators

    Operator

    Description

    Example

    ~

    Matches a regular expression, which is case-sensitive.

    'thomas' ~ '.*thomas.*'

    ~*

    Matches a regular expression, which is case-insensitive.

    'thomas' ~* '.*Thomas.*'

    !~

    Does not match a regular expression, which is case-sensitive.

    'thomas' !~ '.*Thomas.*'

    !~*

    Does not match a regular expression, which is case-insensitive.

    'thomas' !~* '.*vadim.*'

    Matching rules:
    1. Unlike LIKE patterns, a regular expression is allowed to match anywhere within a string, unless the regular expression is explicitly anchored to the beginning or end of the string.
    2. Besides the metacharacters mentioned above, POSIX-style regular expressions also support the following pattern matching metacharacters:

      Metacharacter

      Description

      ^

      Specifies the match starting with a string.

      $

      Specifies the match at the end of a string.

      .

      Matches any single character.

    Regular expressions:

    POSIX-style regular expressions support the following functions:

    Example:

    1
    2
    3
    4
    5
    gaussdb=#  SELECT 'abc' ~ 'Abc' AS RESULT;
    result 
    --------
     f
    (1 row)
    
    1
    2
    3
    4
    5
    gaussdb=# SELECT 'abc' ~* 'Abc' AS RESULT;
     result 
    --------
     t
    (1 row)
    
    1
    2
    3
    4
    5
    gaussdb=# SELECT 'abc' !~ 'Abc' AS RESULT;
     result 
    --------
     t
    (1 row)
    
    1
    2
    3
    4
    5
    gaussdb=# SELECT 'abc'!~* 'Abc' AS RESULT;
     result 
    --------
     f
    (1 row)
    
    1
    2
    3
    4
    5
    gaussdb=# SELECT 'abc' ~ '^a' AS RESULT;
     result 
    --------
     t
    (1 row)
    
    1
    2
    3
    4
    5
    gaussdb=# SELECT 'abc' ~ '(b|d)'AS RESULT;
     result 
    --------
     t
    (1 row)
    
    1
    2
    3
    4
    5
    gaussdb=# SELECT 'abc' ~ '^(b|c)'AS RESULT;
     result 
    --------
     f
    (1 row)
    

    Although most regular expression searches can be executed quickly, they can still be artificially processed to require any length of time and any amount of memory. It is not recommended that you accept the regular expression search pattern from the non-security pattern source. If you must do this, you are advised to add the statement timeout limit. The search with the SIMILAR TO pattern has the same security risks as the SIMILAR TO provides many capabilities that are the same as those of the POSIX-style regular expression. The LIKE search is much simpler than the other two options. Therefore, it is more secure to accept the non-secure pattern source search.

  • [NOT] REGEXP/ [NOT] RLIKE

    Description: The REGEXP operator is used for regular expression matching and complies with POSIX-style regular expression matching rules. TRUE or FALSE is returned based on whether the pattern matches the given string. The following table describes the regular expression operators.

    Operator Name

    Description

    Syntax

    REGEXP

    Specifies whether a string matches the regular expression.

    expr REGEXP pat

    RLIKE

    Specifies whether a string matches the regular expression (same as REGEXP).

    expr RLIKE pat

    NOT REGEXP

    Specifies whether a string does not match the regular expression.

    expr NOT REGEXP pat

    NOT RLIKE

    Specifies whether the character string does not match the regular expression (same as NOT REGEXP).

    expr NOT RLIKE pat

    Matching rules:
    1. A regular expression is allowed to match anywhere within a string, unless the regular expression is explicitly anchored to the beginning or end of the string.
    2. The pattern matching metacharacters supported by the REGEXP operator are the same as those supported by POSIX-style regular expressions.
    3. The operator supports the following escape character matching.

      Escape Character

      Description

      \b

      Backspace key.

      \f

      Form feed character, for example, C language.

      \n

      Newline character, for example, C language.

      \r

      Carriage return character, for example, C language.

      \t

      Horizontal tab, for example, C language.

      \uwxyz

      Character whose hexadecimal value is 0xwxyz, where wxyz is four hexadecimal digits.

      \xhhh

      Character whose hexadecimal value is 0xhhh, where hhh is any sequence of hexadecimal digits.

      \0

      The GaussDB reports the error invalid byte sequence for encoding "UTF8": 0x00.

      \xy

      Character whose octal value is 0xy, where xy is two octal digits.

      \xyz

      Character whose octal value is 0xyz, where xyz is three octal digits.

    4. Ranges for matching a pattern string: [a-dX] and [^a-dX].

      [a-dX] matches any characters of a, b, c, d, and X. [^a-dX] matches characters other than a, b, c, d, or X.

      The hyphen (-) between two characters forms a range, indicating that all characters in the range are matched. To include a right square bracket (]), it must follow the left square bracket ([). To include a hyphen (-), it must be after the left square bracket ([) or before the right square bracket (]). Any character that does not have any special meaning enclosed in square brackets ([]) matches itself.

      Example:
      1
      2
      3
      4
      5
      gaussdb=# SELECT 'abd' REGEXP 'a[bc]d' AS RESULT;
       result 
      --------
       t
      (1 row)
      
      1
      2
      3
      4
      5
      gaussdb=# SELECT 'aed' REGEXP 'a[^bc]d' AS RESULT;
       result 
      --------
       t
      (1 row)
      
      1
      2
      3
      4
      5
      gaussdb=# SELECT 'a-' REGEXP 'a[-b]' AS RESULT;
       result 
      --------
       t
      (1 row)
      
      1
      2
      3
      4
      5
      gaussdb=# SELECT 'aX]bc' REGEXP '^[]a-dXYZ]*$' AS RESULT;
       result 
      --------
       t
      (1 row)
      
    5. [.characters.] in the pattern string pat matches the collation of the element. In bracket expressions using quare brackets ([]), the collation used to proofread elements is matched. The character is a single character or a character name such as space. A complete list of character names can be found in the regex/regc_locale file.
      Example:
      1
      2
      3
      4
      5
      gaussdb=# SELECT ' ' REGEXP '[[.space.]]' AS RESULT;
       result 
      --------
       t
      (1 row)
      
    6. Character class matching the [=character_class=] character in the pattern string pat. It is written in the square bracket expression. [=character_class=] indicates the equivalence class. A character matches all characters with the same sort proofreading value, including itself. For example, if o and (+) are of the same class, [[=o=]], [[=(+)=]], and [o(+)] are synonyms. The same class cannot be used as an endpoint of a range.
    7. Character class matching the [:character_class:] character in the pattern string pat. It is written in square brackets ([]). [:character_class:] is used to match the characters that match the character class. Other class names may be provided for specific regions. The character class cannot be used as an endpoint of a range. The following table lists the standard class names. If the backslash (\) is involved, set the parameters according to the description.

      Character Class

      Description

      Character Range

      alnum

      Alphanumeric numeric character.

      [0-9a-zA-Z]

      alpha

      Alphanumeric character.

      [a-zA-Z]

      blank

      Blank character.

      [\t], indicating a blank character

      cntrl

      Control character.

      [\x01-\x1F]

      digit

      Digit character.

      [0-9]

      graph

      Graphic character.

      [^\x01-\x20]

      lower

      Lowercase character.

      [a-z]

      print

      Graphic character.

      [^\x01-\x20]

      punct

      Punctuation character.

      [-!"#$%&'( )*+,./:;<=>?@[\\]^_`{|}~]

      space

      Spaces, tabs, new lines, and carriage returns.

      [\n\r\t\x0B]

      upper

      Uppercase character.

      [A-Z]

      xdigit

      Hexadecimal numeric character.

      [0-9a-fA-F]

      Example:

      1
      2
      3
      4
      5
      gaussdb=# SELECT '\n' REGEXP '[[:space:]]' AS RESULT;
       result 
      --------
       t
      (1 row)
      
    8. Start and end matching of the [[:<:]], [[:>:]] matching string in the pattern string pat.
      Example:
      1
      2
      3
      4
      5
      gaussdb=# SELECT 'a word a' REGEXP '[[:<:]]word[[:>:]]' AS RESULT;
       result 
      --------
       t
      (1 row)
      
    9. To match a text instance with special characters, add two backslashes (\) before the special characters (including (, ), ., ", ^, +, and ?). To match single quotation marks, you need to write '\'' in the source string to match '\\\'' in the pattern string. If the backslash (\) is involved, set the parameters according to the description.
      Example:
       1
       2
       3
       4
       5
       6
       7
       8
       9
      10
      11
      12
      13
      14
      15
      gaussdb=# SELECT 'a+b' REGEXP 'a\\+b' AS RESULT;
       result 
      --------
       t
      (1 row)
      gaussdb=# SELECT '\'' REGEXP '\\\'' AS RESULT;
       result 
      --------
       t
      (1 row)
      gaussdb=# SELECT '\\' REGEXP '\\\\' AS RESULT;
       result 
      --------
       t
      (1 row)
      
    10. When the case-sensitive character set and collation are set, regular expression matching is also case-sensitive.

      Example:

      1
      2
      3
      4
      5
      gaussdb=# SELECT 'abc' REGEXP 'ABC' COLLATE utf8mb4_bin AS RESULT;
       result 
      --------
       f
      (1 row)
      
    11. If the input parameter Expr or pat is null, NULL is returned.
       1
       2
       3
       4
       5
       6
       7
       8
       9
      10
      gaussdb=# SELECT NULL REGEXP '*' AS RESULT;
       result 
      --------
      
      (1 row)
      gaussdb=# SELECT '-' REGEXP NULL AS RESULT;
       result 
      --------
      
      (1 row)
      
      NOTE:
      This operator is supported only in B-compatible databases and is valid only when sql_compatibility is set to 'B'. b_format_version is set to '5.7'. In this case, the REGEXP operator is equivalent to the ~* operator, and the NOT REGEXP operator is equivalent to the !~* operator.
      • When b_format_dev_version is set to s2, standard_conforming_strings is set to off and escape_string_warning is set to off by default. In this case, any backslash written in the string constant must be double-written. Therefore, writing a pattern matching a single backslash is actually going to write four backslashes in the statement.
      • This operator supports only the string type, text type, and other data types that can be implicitly converted to the string type and text type. The bytea type is not supported.

We use cookies to improve our site and your experience. By continuing to browse our site you accept our cookie policy. Find out more

Feedback

Feedback

Feedback

0/500

Selected Content

Submit selected content with the feedback