Poisson Analyzer Usage Guide
Poisson Analyzer provides two analyzers (poisson_index and poisson_search), two tokenizers (poisson_index and poisson_search), and one synonym filter (poisson_synonyms).
You can use the two built-in analyzers directly, without extra configuration. The two analyzers are configured as follows:
"poisson_index":
{
  "tokenizer": "poisson_index",
  "filter": [
    "stemmer",
    "poisson_synonyms",
    "stop"
  ]
}
If you need to customize the configuration, for example by uploading your own dictionaries, specify the tokenizer and filter in settings. The following is an example:
{
  "settings": {
    "index": {
      "number_of_shards": "5",
      "number_of_replicas": "1",
      "analysis": {
        "filter": {
          "my_filter": {
            "type": "poisson_synonyms",
            "poisson_synonyms_dict": [
              "word-1, word-2 => word-3",
              "word-4, word-5",
              "word-6, word-7, word-8"
            ],
            "poisson_synonyms_dict_paths": "synonyms.txt"
          }
        },
        "tokenizer": {
          "my_tokenizer": {
            "type": "poisson_index",
            "poisson_dict": [
              "keyword-1",
              "keyword-2",
              "keyword-3"
            ],
            "poisson_dict_paths": "main_dict1.txt,main_dict2.txt",
            "poisson_stopword_dict_paths": "stopword.txt",
            "ignore": true
          }
        },
        "analyzer": {
          "my_analyzer": {
            "filter": [
              "my_filter"
            ],
            "tokenizer": "my_tokenizer"
          }
        }
      }
    }
  }
}
| Type | Parameter Name | Parameter Type | Value | Description |
| filter | poisson_synonyms_dict | Array of poisson_synonyms_dict objects | ["word-1, word-2 => word-3", "word-4, word-5", "word-6, word-7, word-8"] | Synonym dictionary. Configuration rules of this parameter are as follows: |
| filter | poisson_synonyms_dict_paths | String | synonyms.txt | Synonym dictionary path. Multiple paths are separated by commas (,). Configuration rules of this parameter are as follows: |
| tokenizer | poisson_dict | Array of poisson_dict objects | ["keyword-1", "keyword-2", "keyword-3"] | Custom word segmentation dictionary. To switch a word from coarse-grained to fine-grained segmentation, append <= to the target word. Example: Mobile navigation<= |
| tokenizer | poisson_dict_paths | String | main_dict1.txt,main_dict2.txt | Custom dictionary path. Multiple paths are separated by commas (,). Configuration rules of this parameter are as follows: |
| tokenizer | poisson_stopword_dict_paths | String | stopword.txt | Custom stop word dictionary path. Multiple paths are separated by commas (,). |
| tokenizer | ignore | String | true or false | Whether to ignore case for custom stop words. true: case is ignored. false: case is not ignored. |
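The custom settings described above can also be assembled programmatically. The following Python sketch is illustrative only: the helper name build_poisson_settings is hypothetical, and the component names (my_filter, my_tokenizer, my_analyzer) are taken from the example above. It uses only the parameters listed in the table and joins multiple dictionary paths with commas, as the table describes.

```python
import json


def build_poisson_settings(synonyms, keywords, dict_paths,
                           stopword_paths, synonym_paths, ignore_case=True):
    """Assemble a settings body using the parameters from the table above.

    Hypothetical helper for illustration; it is not part of Poisson Analyzer.
    """
    return {
        "settings": {
            "index": {
                "analysis": {
                    "filter": {
                        "my_filter": {
                            "type": "poisson_synonyms",
                            "poisson_synonyms_dict": list(synonyms),
                            # Multiple paths are separated by commas.
                            "poisson_synonyms_dict_paths": ",".join(synonym_paths),
                        }
                    },
                    "tokenizer": {
                        "my_tokenizer": {
                            "type": "poisson_index",
                            "poisson_dict": list(keywords),
                            "poisson_dict_paths": ",".join(dict_paths),
                            "poisson_stopword_dict_paths": ",".join(stopword_paths),
                            "ignore": ignore_case,
                        }
                    },
                    "analyzer": {
                        "my_analyzer": {
                            "tokenizer": "my_tokenizer",
                            "filter": ["my_filter"],
                        }
                    },
                }
            }
        }
    }


body = build_poisson_settings(
    synonyms=["word-1, word-2 => word-3", "word-4, word-5"],
    # The trailing <= requests fine-grained segmentation for that word.
    keywords=["keyword-1", "keyword-2", "Mobile navigation<="],
    dict_paths=["main_dict1.txt", "main_dict2.txt"],
    stopword_paths=["stopword.txt"],
    synonym_paths=["synonyms.txt"],
)

# Serialize to JSON for use as a request body.
print(json.dumps(body, indent=2))
```

Generating the body this way avoids hand-editing mistakes such as a missing comma between dictionary entries. How the resulting JSON is submitted depends on your cluster and client, which this sketch does not assume.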