Text Search Configuration Example
Text search configuration specifies the following components required for converting a document into a tsvector:
- A parser, decomposes a text into tokens.
- Dictionary list, converts each token into a lexeme.
Each time when the to_tsvector or to_tsquery function is invoked, a text search configuration is required to specify a processing procedure. The GUC parameter default_text_search_config specifies the default text search configuration, which will be used if the text search function does not explicitly specify a text search configuration.
GaussDB(DWS) provides some predefined text search configurations. You can also create user-defined text search configurations. In addition, to facilitate the management of text search objects, multiple gsql meta-commands are provided to display related information. For details, see "Meta-Command Reference" in the Tool Guide.
Procedure
- Create a text search configuration ts_conf by copying the predefined text search configuration english.
1 2
CREATE TEXT SEARCH CONFIGURATION ts_conf ( COPY = pg_catalog.english ); CREATE TEXT SEARCH CONFIGURATION
- Create a Synonym dictionary.
Assume that the definition file pg_dict.syn of the Synonym dictionary contains the following contents:
1 2 3
postgres pg pgsql pg postgresql pg
Run the following statement to create the Synonym dictionary:
1 2 3 4 5
CREATE TEXT SEARCH DICTIONARY pg_dict ( TEMPLATE = synonym, SYNONYMS = pg_dict, FILEPATH = 'obs://bucket01/obs.xxx.xxx.com accesskey=xxxxx secretkey=xxxxx region=eu-west-101' );
- Create an Ispell dictionary english_ispell (the dictionary definition file is from the open source dictionary).
1 2 3 4 5 6 7
CREATE TEXT SEARCH DICTIONARY english_ispell ( TEMPLATE = ispell, DictFile = english, AffFile = english, StopWords = english, FILEPATH = 'obs://bucket01/obs.xxx.xxx.com accesskey=xxxxx secretkey=xxxxx region=eu-west-101' );
- Modify the text search configuration ts_conf and change the dictionary list for tokens of certain types. For details about token types, see Text Search Parser.
1 2 3 4
ALTER TEXT SEARCH CONFIGURATION ts_conf ALTER MAPPING FOR asciiword, asciihword, hword_asciipart, word, hword, hword_part WITH pg_dict, english_ispell, english_stem;
- In the text search configuration, set non-index or set the search for tokens of certain types.
1 2
ALTER TEXT SEARCH CONFIGURATION ts_conf DROP MAPPING FOR email, url, url_path, sfloat, float;
- Use the text retrieval commissioning function ts_debug() to test the text search configuration ts_conf.
1 2 3 4 5
SELECT * FROM ts_debug('ts_conf', ' PostgreSQL, the highly scalable, SQL compliant, open source object-relational database management system, is now undergoing beta testing of the next version of our software. ');
- You can set the default text search configuration of the current session to ts_conf. This setting is valid only for the current session.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27
\dF+ ts_conf Text search configuration "public.ts_conf" Parser: "pg_catalog.default" Token | Dictionaries -----------------+------------------------------------- asciihword | pg_dict,english_ispell,english_stem asciiword | pg_dict,english_ispell,english_stem file | simple host | simple hword | pg_dict,english_ispell,english_stem hword_asciipart | pg_dict,english_ispell,english_stem hword_numpart | simple hword_part | pg_dict,english_ispell,english_stem int | simple numhword | simple numword | simple uint | simple version | simple word | pg_dict,english_ispell,english_stem SET default_text_search_config = 'public.ts_conf'; SET SHOW default_text_search_config; default_text_search_config ---------------------------- public.ts_conf (1 row)
Feedback
Was this page helpful?
Provide feedbackThank you very much for your feedback. We will continue working to improve the documentation.