Ispell Dictionary

Updated at: Aug 27, 2020 GMT+08:00

An Ispell dictionary is a morphological dictionary that normalizes different linguistic forms of a word into the same lexeme. For example, an English Ispell dictionary can match all declensions and conjugations of the search term bank, such as banking, banked, banks, banks', and bank's.

GaussDB(DWS) does not provide any predefined Ispell dictionaries or dictionary files, so you must supply them yourself. The .dict and .affix files can be in any of several open-source dictionary formats, including Ispell, MySpell, and Hunspell.

Procedure

  1. Obtain the dictionary definition file (.dict) and affix file (.affix).

    You can use an open-source dictionary. Open-source dictionary files may use the extensions .aff and .dic; in that case, rename them to .affix and .dict. For some dictionary files (for example, Norwegian dictionary files), you also need to convert the character encoding to UTF-8. The following commands convert and rename the files in one step:

    iconv -f ISO_8859-1 -t UTF-8 -o nn_no.affix nn_NO.aff
    iconv -f ISO_8859-1 -t UTF-8 -o nn_no.dict nn_NO.dic
    

  2. Create an Ispell dictionary.

    CREATE TEXT SEARCH DICTIONARY norwegian_ispell (
        TEMPLATE = ispell,
        DictFile = nn_no,
        AffFile = nn_no,
        -- OBS directory; a local directory takes the form 'file:///home/dicts'.
        FilePath = 'obs://bucket_name/path accesskey=ak secretkey=sk region=rg'
    );
    

    The full names of the Ispell dictionary files are nn_no.dict and nn_no.affix, and the dictionary is stored in the obs://bucket_name/path accesskey=ak secretkey=sk region=rg directory. For details about the syntax and parameters for creating an Ispell dictionary, see CREATE TEXT SEARCH DICTIONARY.

  3. Use the Ispell dictionary to split compound words.

    SELECT ts_lexize('norwegian_ispell', 'sjokoladefabrikk');
          ts_lexize      
    ---------------------
     {sjokolade,fabrikk}
    (1 row)
    

    MySpell does not support compound words, whereas Hunspell does. GaussDB(DWS) supports only the basic compound word operations of Hunspell. Ispell dictionaries generally recognize a limited set of words, so they should be followed by a broader dictionary, for example a Snowball dictionary, which recognizes everything.
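
The renaming and re-encoding described in step 1 can be wrapped in a small shell function. The following is a minimal sketch and not part of GaussDB(DWS): the function name convert_dict and its arguments are hypothetical, and it assumes that iconv is installed and that both source files use the encoding passed as the third argument (ISO_8859-1 by default, as in the Norwegian example).

```shell
# Hypothetical helper (not a GaussDB(DWS) tool).
# Usage: convert_dict SRC_BASE DST_BASE [SRC_ENCODING]
# Reads SRC_BASE.aff and SRC_BASE.dic and writes UTF-8 copies named
# DST_BASE.affix and DST_BASE.dict, the extensions required in step 1.
convert_dict() {
    src_base=$1
    dst_base=$2
    src_enc=${3:-ISO_8859-1}   # assumed default, taken from the Norwegian example

    # Redirect the output instead of using -o so that the sketch also works
    # with iconv implementations that lack that option.
    iconv -f "$src_enc" -t UTF-8 "$src_base.aff" > "$dst_base.affix" &&
    iconv -f "$src_enc" -t UTF-8 "$src_base.dic" > "$dst_base.dict"
}
```

For example, convert_dict nn_NO nn_no reads nn_NO.aff and nn_NO.dic and writes nn_no.affix and nn_no.dict, matching the file names used in step 2.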
