Poisson Analyzer Usage Guide
Poisson Analyzer provides two analyzers (poisson_index and poisson_search), two tokenizers (poisson_index and poisson_search), and one synonym filter (poisson_synonyms).
You can use the two built-in analyzers directly, without extra configuration. The built-in poisson_index analyzer is configured as follows:
"poisson_index": {
  "tokenizer": "poisson_index",
  "filter": [
    "stemmer",
    "poisson_synonyms",
    "stop"
  ]
}
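To see which tokens a built-in analyzer produces, you could send it some sample text. The following is a minimal sketch that assumes the cluster exposes the Elasticsearch-compatible `_analyze` API. The base URL `http://localhost:9200`, the helper name `build_analyze_request`, and the sample text are placeholders and are not part of Poisson Analyzer.

```python
import json
import urllib.request


def build_analyze_request(base_url, analyzer, text):
    """Build (but do not send) a POST request to the _analyze endpoint."""
    body = json.dumps({"analyzer": analyzer, "text": text}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/_analyze",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Assumed local endpoint; replace it with the address of your cluster.
request = build_analyze_request("http://localhost:9200", "poisson_index", "sample text")
print(request.full_url)
print(request.data.decode("utf-8"))
# To actually send the request: urllib.request.urlopen(request)
```

Building the request separately from sending it makes it easy to inspect the body first, and to compare poisson_index with poisson_search by changing only the analyzer name.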
To customize the configuration, for example by uploading your own dictionaries, specify the tokenizer and the filter in the index settings. The following is an example:
{
  "settings": {
    "index": {
      "number_of_shards": "5",
      "number_of_replicas": "1",
      "analysis": {
        "filter": {
          "my_filter": {
            "type": "poisson_synonyms",
            "poisson_synonyms_dict": [
              "word-1, word-2 => word-3",
              "word-4, word-5",
              "word-6, word-7, word-8"
            ],
            "poisson_synonyms_dict_paths": "synonyms.txt"
          }
        },
        "tokenizer": {
          "my_tokenizer": {
            "type": "poisson_index",
            "poisson_dict": [
              "keyword-1",
              "keyword-2",
              "keyword-3"
            ],
            "poisson_dict_paths": "main_dict1.txt,main_dict2.txt",
            "poisson_stopword_dict_paths": "stopword.txt",
            "ignore": true
          }
        },
        "analyzer": {
          "my_analyzer": {
            "filter": [
              "my_filter"
            ],
            "tokenizer": "my_tokenizer"
          }
        }
      }
    }
  }
}
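In the example above, my_analyzer refers to my_tokenizer and my_filter by name, so a misspelled name only surfaces when the index is created. The following sketch builds the same settings as a Python dict and checks those references locally before serializing the body. `check_analyzer_refs` and its `builtin` parameter are hypothetical helpers for illustration; the plugin does not provide them.

```python
import json

# Same settings as the example above, written as a Python dict.
body = {
    "settings": {
        "index": {
            "number_of_shards": "5",
            "number_of_replicas": "1",
            "analysis": {
                "filter": {
                    "my_filter": {
                        "type": "poisson_synonyms",
                        "poisson_synonyms_dict": [
                            "word-1, word-2 => word-3",
                            "word-4, word-5",
                            "word-6, word-7, word-8",
                        ],
                        "poisson_synonyms_dict_paths": "synonyms.txt",
                    }
                },
                "tokenizer": {
                    "my_tokenizer": {
                        "type": "poisson_index",
                        "poisson_dict": ["keyword-1", "keyword-2", "keyword-3"],
                        "poisson_dict_paths": "main_dict1.txt,main_dict2.txt",
                        "poisson_stopword_dict_paths": "stopword.txt",
                        "ignore": True,
                    }
                },
                "analyzer": {
                    "my_analyzer": {
                        "filter": ["my_filter"],
                        "tokenizer": "my_tokenizer",
                    }
                },
            },
        }
    }
}


def check_analyzer_refs(body, builtin=()):
    """Return (analyzer, kind, name) tuples for names that are used but not defined.

    Names listed in `builtin` (for example "stop") are treated as defined.
    """
    analysis = body["settings"]["index"]["analysis"]
    tokenizers = set(analysis.get("tokenizer", {})) | set(builtin)
    filters = set(analysis.get("filter", {})) | set(builtin)
    missing = []
    for name, analyzer in analysis.get("analyzer", {}).items():
        if analyzer.get("tokenizer") not in tokenizers:
            missing.append((name, "tokenizer", analyzer.get("tokenizer")))
        for flt in analyzer.get("filter", []):
            if flt not in filters:
                missing.append((name, "filter", flt))
    return missing


print(check_analyzer_refs(body))  # an empty list means every reference resolves
payload = json.dumps(body, indent=2)  # request body for index creation
```

Generating the JSON with `json.dumps` also rules out syntax slips such as a missing comma between array items.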
| Type | Parameter Name | Parameter Type | Value | Description |
| --- | --- | --- | --- | --- |
| filter | poisson_synonyms_dict | Array of poisson_synonyms_dict objects | ["word-1, word-2 => word-3", "word-4, word-5", "word-6, word-7, word-8"] | Synonym dictionary. Configuration rules of this parameter are as follows: |
| filter | poisson_synonyms_dict_paths | String | synonyms.txt | Synonym dictionary path. Multiple paths are separated by commas (,). Configuration rules of this parameter are as follows: |
| tokenizer | poisson_dict | Array of poisson_dict objects | ["keyword-1", "keyword-2", "keyword-3"] | Custom word segmentation dictionary. To switch a word from coarse-grained to fine-grained segmentation, append <= to the word. Example: Mobile navigation<= |
| tokenizer | poisson_dict_paths | String | main_dict1.txt,main_dict2.txt | Custom dictionary path. Multiple paths are separated by commas (,). Configuration rules of this parameter are as follows: |
| tokenizer | poisson_stopword_dict_paths | String | stopword.txt | Custom stop word dictionary path. Multiple paths are separated by commas (,). |
| tokenizer | ignore | String | true or false | Whether to ignore case for custom stop words. true: case is ignored. false: case is not ignored. |
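The path parameters each take one comma-separated string, and a dictionary entry is marked for fine-grained segmentation by appending `<=`. The following sketch shows two small helpers that produce both forms, so the values can be kept as Python lists. The function names `join_dict_paths` and `fine_grained` are hypothetical; only the comma separator and the `<=` suffix come from the parameter descriptions.

```python
def join_dict_paths(paths):
    """Join dictionary file names into the comma-separated form used by the *_paths parameters."""
    cleaned = [p.strip() for p in paths if p.strip()]
    for p in cleaned:
        if "," in p:
            # A comma inside a name would be read as a separator.
            raise ValueError(f"path must not contain a comma: {p!r}")
    return ",".join(cleaned)


def fine_grained(word):
    """Mark a custom dictionary entry for fine-grained segmentation by appending '<='."""
    return word if word.endswith("<=") else word + "<="


tokenizer = {
    "type": "poisson_index",
    "poisson_dict": ["keyword-1", fine_grained("Mobile navigation")],
    "poisson_dict_paths": join_dict_paths(["main_dict1.txt", "main_dict2.txt"]),
    "poisson_stopword_dict_paths": join_dict_paths(["stopword.txt"]),
    "ignore": True,
}
print(tokenizer["poisson_dict_paths"])  # main_dict1.txt,main_dict2.txt
print(tokenizer["poisson_dict"][1])     # Mobile navigation<=
```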