
Configuration Examples

Updated at: Mar 13, 2020 GMT+08:00

A text search configuration specifies the components required to convert a document into a tsvector:

  • A parser, which decomposes text into tokens.
  • A list of dictionaries, which converts each token into a lexeme.
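
The two components listed above can be exercised separately. The following is a minimal sketch, assuming the built-in parser named default and the predefined english_stem dictionary are available in your cluster; the sample strings are illustrative only.

```sql
-- Stage 1: the parser splits the text into typed tokens
-- (one row per token, with a numeric token type ID).
SELECT * FROM ts_parse('default', 'PostgreSQL is highly scalable');

-- Stage 2: a dictionary normalizes a single token into a lexeme.
-- english_stem is assumed here to be the predefined English stemmer.
SELECT ts_lexize('english_stem', 'stars');
```

A text search configuration ties these stages together by mapping each token type produced by the parser to an ordered list of dictionaries.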

Each time the to_tsvector or to_tsquery function is invoked, a text search configuration is needed to determine how the text is processed. The GUC parameter default_text_search_config specifies the default text search configuration, which is used when the function call does not explicitly specify one.
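
As a brief illustration of the difference between an explicit and an implicit configuration, the sketch below calls the same functions both ways. The sample text is made up, and the result of the second and third queries depends on the current value of default_text_search_config in your session.

```sql
-- Explicit: the first argument names the configuration to use.
SELECT to_tsvector('english', 'The quick brown foxes jumped');

-- Implicit: no configuration is named, so the value of
-- default_text_search_config is used instead.
SHOW default_text_search_config;
SELECT to_tsvector('The quick brown foxes jumped');
SELECT to_tsquery('foxes & jumped');
```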

DWS provides some predefined text search configurations. You can also create user-defined text search configurations. In addition, to facilitate the management of text search objects, multiple gsql meta-commands are provided to display information about text search objects.
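
The meta-commands in question are assumed here to be the \dF family, of which \dF+ is used later in this procedure. A short sketch of how they might be used in a gsql session:

```sql
-- List text search configurations.
\dF
-- List text search dictionaries.
\dFd
-- List text search parsers.
\dFp
-- List text search templates.
\dFt
-- Show the token-to-dictionary mappings of one configuration.
\dF+ english
```

Each comment is placed on its own line because anything that follows a meta-command on the same line is treated as an argument to that command.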

Procedure

  1. Create a text search configuration ts_conf by copying the predefined text search configuration english.

    CREATE TEXT SEARCH CONFIGURATION ts_conf ( COPY = pg_catalog.english );
    CREATE TEXT SEARCH CONFIGURATION
    

  2. Create a Synonym dictionary.

    Assume that the definition file pg_dict.syn of the Synonym dictionary contains the following:

    postgres    pg
    pgsql       pg
    postgresql  pg
    

    Run the following statement to create the Synonym dictionary:

    CREATE TEXT SEARCH DICTIONARY pg_dict (
        TEMPLATE = synonym,
        SYNONYMS = pg_dict,
        FILEPATH = 'obs://bucket_name/path accesskey=ak secretkey=sk region=rg'
    );
    

  3. Create an Ispell dictionary english_ispell (the dictionary definition files come from an open-source dictionary).

    CREATE TEXT SEARCH DICTIONARY english_ispell (
        TEMPLATE = ispell,
        DictFile = english,
        AffFile = english,
        StopWords = english,
        FILEPATH = 'obs://bucket_name/path accesskey=ak secretkey=sk region=rg'
    );
    

  4. Modify the text search configuration ts_conf and change the dictionary list for tokens of certain types. For details about token types, see Parsers.

    ALTER TEXT SEARCH CONFIGURATION ts_conf
        ALTER MAPPING FOR asciiword, asciihword, hword_asciipart,
                          word, hword, hword_part
        WITH pg_dict, english_ispell, english_stem;
    

  5. In the text search configuration, drop the mappings for tokens of certain types so that those tokens are neither indexed nor searched.

    ALTER TEXT SEARCH CONFIGURATION ts_conf
        DROP MAPPING FOR email, url, url_path, sfloat, float;
    

  6. Use the text search debugging function ts_debug() to test the text search configuration ts_conf.

    SELECT * FROM ts_debug('ts_conf', '
    PostgreSQL, the highly scalable, SQL compliant, open source object-relational
    database management system, is now undergoing beta testing of the next
    version of our software.
    ');
    

  7. View the configuration with the gsql meta-command \dF+, and then set the default text search configuration of the current session to ts_conf. This setting is valid only for the current session.

    \dF+ ts_conf
          Text search configuration "public.ts_conf"
    Parser: "pg_catalog.default"
          Token      |            Dictionaries             
    -----------------+-------------------------------------
     asciihword      | pg_dict,english_ispell,english_stem
     asciiword       | pg_dict,english_ispell,english_stem
     file            | simple
     host            | simple
     hword           | pg_dict,english_ispell,english_stem
     hword_asciipart | pg_dict,english_ispell,english_stem
     hword_numpart   | simple
     hword_part      | pg_dict,english_ispell,english_stem
     int             | simple
     numhword        | simple
     numword         | simple
     uint            | simple
     version         | simple
     word            | pg_dict,english_ispell,english_stem
    
    SET default_text_search_config = 'public.ts_conf';
    SET
    SHOW default_text_search_config;
     default_text_search_config 
    ----------------------------
     public.ts_conf
    (1 row)
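
Once the procedure is complete, the new objects can be spot-checked. The following is a minimal sketch, assuming pg_dict and ts_conf were created exactly as in the steps above and that pg_dict.syn was loaded with the three entries shown; the sample sentences are illustrative only.

```sql
-- Test the synonym dictionary on its own: with the sample file,
-- 'postgres' is expected to map to the lexeme 'pg'.
SELECT ts_lexize('pg_dict', 'postgres');

-- Convert a document by naming the custom configuration explicitly.
SELECT to_tsvector('ts_conf', 'PostgreSQL is a scalable database');

-- Match a document against a query using the same configuration.
-- 'pgsql' and 'postgres' should both normalize to 'pg' under the
-- assumptions above, so the match is expected to succeed.
SELECT to_tsvector('ts_conf', 'pgsql is open source')
       @@ to_tsquery('ts_conf', 'postgres');
```

Naming the configuration explicitly keeps these checks independent of the session-level default_text_search_config setting.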
    
