
Example

Analyzers

Elasticsearch provides the following two analyzers that use the word dictionary:

ik_max_word: segments the text at a fine-grained level.
ik_smart: segments the text at a coarse-grained level.

Example

Log in to the CSS management console. Switch to the Clusters page. Click the name of the target cluster to switch to the Basic Information page.
Prepare the main word dictionary file, stop word dictionary file, and synonym dictionary file. Upload the files encoded using UTF-8 without BOM to the corresponding OBS bucket, for example, obs-b8ed.

The default word dictionary contains common stop words. Therefore, you do not need to upload the stop words mentioned in the preceding example.
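
The dictionary files must be UTF-8 without a BOM, so it can help to check them before uploading. The following is a minimal Python sketch of such a check, not part of the CSS tooling; the function names are hypothetical and the file path is whatever local copy you plan to upload.

```python
import codecs
from pathlib import Path


def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 byte order mark, if one is present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


def normalize_dictionary_file(path: str) -> bool:
    """Rewrite the file at `path` without a BOM.

    Returns True if a BOM was removed, False if the file was already clean.
    Raises UnicodeDecodeError if the content is not valid UTF-8.
    """
    file_path = Path(path)
    original = file_path.read_bytes()
    cleaned = strip_utf8_bom(original)
    # Fail early if the remaining content is not valid UTF-8.
    cleaned.decode("utf-8")
    if cleaned != original:
        file_path.write_bytes(cleaned)
        return True
    return False
```

Running `normalize_dictionary_file` on each of the three files before the upload leaves clean files untouched and rewrites only those that start with a BOM.
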
Select the corresponding OBS path by referring to Configuring a Custom Word Dictionary, and select the corresponding main word dictionary file, stop word dictionary file, and synonym dictionary file. Click Save.
After the word dictionary status changes to Succeeded, switch to the Clusters page. In the cluster list, locate the row where the target cluster resides and click Kibana in the Operation column.

On the displayed page, click Dev Tools. Enter the following code and click the execute icon next to the request. You can view the word segmentation result in the right pane.

Use the ik_smart analyzer to perform word segmentation on Text used for word segmentation.

Example code:

POST /_analyze
{
  "analyzer":"ik_smart",
  "text":"Text used for word segmentation"
}

After the operation is completed, view the word segmentation result.

{
  "tokens": [
    {
      "token": "The word segmentation result",
      "start_offset": 0,
      "end_offset": 4,
      "type": "CN_WORD",
      "position": 0
    },
    {
      "token": "The word segmentation result",
      "start_offset": 5,
      "end_offset": 8,
      "type": "CN_WORD",
      "position": 1
    }
  ]
}
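
The same request can also be sent from a script instead of Kibana. The sketch below uses only the Python standard library; the base URL is a placeholder, and it assumes a cluster reachable over plain HTTP without authentication, which will not hold for security-enabled clusters. The function names are illustrative.

```python
import json
import urllib.request


def build_analyze_request(base_url, analyzer, text):
    """Build a POST /_analyze request for the given analyzer and text."""
    body = json.dumps({"analyzer": analyzer, "text": text}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/_analyze",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def analyze(base_url, analyzer, text):
    """Send the request and return the parsed JSON response as a dict."""
    request = build_analyze_request(base_url, analyzer, text)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

For example, `analyze("http://localhost:9200", "ik_smart", "Text used for word segmentation")` would return a dictionary with the same `tokens` list shown above, assuming that placeholder address points at the cluster.
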

Use the ik_max_word analyzer to perform word segmentation on Text used for word segmentation.

Example code:

POST /_analyze
{
  "analyzer":"ik_max_word",
  "text":"Text used for word segmentation"
}

After the operation is completed, view the word segmentation result.

{
  "tokens" : [
    {
      "token" : "The word segmentation result",
      "start_offset" : 0,
      "end_offset" : 4,
      "type" : "CN_WORD",
      "position" : 0
    },
    {
      "token" : "The word segmentation result",
      "start_offset" : 0,
      "end_offset" : 2,
      "type" : "CN_WORD",
      "position" : 1
    },
    {
      "token" : "The word segmentation result",
      "start_offset" : 0,
      "end_offset" : 1,
      "type" : "CN_WORD",
      "position" : 2
    },
    {
      "token" : "The word segmentation result",
      "start_offset" : 1,
      "end_offset" : 3,
      "type" : "CN_WORD",
      "position" : 3
    },
    {
      "token" : "The word segmentation result",
      "start_offset" : 2,
      "end_offset" : 4,
      "type" : "CN_WORD",
      "position" : 4
    },
    {
      "token" : "The word segmentation result",
      "start_offset" : 3,
      "end_offset" : 4,
      "type" : "CN_WORD",
      "position" : 5
    },
    {
      "token" : "The word segmentation result",
      "start_offset" : 5,
      "end_offset" : 8,
      "type" : "CN_WORD",
      "position" : 6
    },
    {
      "token" : "The word segmentation result",
      "start_offset" : 5,
      "end_offset" : 7,
      "type" : "CN_WORD",
      "position" : 7
    },
    {
      "token" : "The word segmentation result",
      "start_offset" : 6,
      "end_offset" : 8,
      "type" : "CN_WORD",
      "position" : 8
    },
    {
      "token" : "The word segmentation result",
      "start_offset" : 7,
      "end_offset" : 8,
      "type" : "CN_WORD",
      "position" : 9
    }
  ]
}
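
One way to see the difference between the two analyzers is to compare the character offsets in their results: in the sample output, ik_smart returns spans that do not overlap, while ik_max_word returns many overlapping spans over the same characters. The following Python sketch checks for that on a parsed `_analyze` response; the helper names are hypothetical and the sample offsets are copied from the two results above.

```python
def token_spans(response):
    """Return the (start_offset, end_offset) pair of every token."""
    return [(t["start_offset"], t["end_offset"]) for t in response["tokens"]]


def has_overlaps(response):
    """Return True if any two token spans share at least one character."""
    spans = sorted(token_spans(response))
    # After sorting by start offset, any overlap shows up between neighbours.
    return any(nxt[0] < cur[1] for cur, nxt in zip(spans, spans[1:]))


def make_response(offsets):
    """Build a minimal _analyze-style response from offset pairs (test helper)."""
    return {
        "tokens": [
            {"token": "The word segmentation result",
             "start_offset": start, "end_offset": end,
             "type": "CN_WORD", "position": position}
            for position, (start, end) in enumerate(offsets)
        ]
    }


# Offsets taken from the ik_smart and ik_max_word sample results.
smart = make_response([(0, 4), (5, 8)])
max_word = make_response([(0, 4), (0, 2), (0, 1), (1, 3), (2, 4),
                          (3, 4), (5, 8), (5, 7), (6, 8), (7, 8)])
```

With these samples, `has_overlaps(smart)` is false and `has_overlaps(max_word)` is true, which matches the coarse-grained and fine-grained descriptions.
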

Refer to the following procedure to perform related operations, including creating an index, importing data, conducting search based on the keyword, and viewing the search result.

Create an index named book. In this example, set both analyzer and search_analyzer to ik_max_word. You can also select ik_smart.

(Versions earlier than 7.x)

PUT /book
{
    "settings": {
        "number_of_shards": 2,
        "number_of_replicas": 1
    },
    "mappings": {
        "type1": {
            "properties": {
                "content": {
                    "type": "text",
                    "analyzer": "ik_max_word",
                    "search_analyzer": "ik_max_word"
                }
            }
        }
    }
}

(Version 7.x and later versions)

PUT /book
{
    "settings": {
        "number_of_shards": 2,
        "number_of_replicas": 1
    },
    "mappings": {
        "properties": {
            "content": {
                "type": "text",
                "analyzer": "ik_max_word",
                "search_analyzer": "ik_max_word"
            }
        }
    }
}
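
The two request bodies differ only in whether the field properties are nested under a mapping type. If the index is created from a script, that difference can be captured in one helper. This is a minimal Python sketch; the function name and its parameters are hypothetical, and the settings values are the ones used in this example.

```python
def build_book_index_body(major_version, analyzer="ik_max_word", type_name="type1"):
    """Build the PUT /book request body for the given Elasticsearch major version."""
    properties = {
        "content": {
            "type": "text",
            "analyzer": analyzer,
            "search_analyzer": analyzer,
        }
    }
    if major_version < 7:
        # Versions earlier than 7.x nest the properties under a mapping type.
        mappings = {type_name: {"properties": properties}}
    else:
        # Version 7.x and later use a typeless mapping.
        mappings = {"properties": properties}
    return {
        "settings": {"number_of_shards": 2, "number_of_replicas": 1},
        "mappings": mappings,
    }
```

Passing `analyzer="ik_smart"` produces the coarse-grained variant mentioned above.
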

Import data. Import the text information to the book index.
(Versions earlier than 7.x)
```
PUT /book/type1/1
{
  "content":"Imported text"
}
```
(Version 7.x and later versions)
```
PUT /book/_doc/1
{
  "content":"Imported text"
}
```

Conduct search based on the keywords.

(Versions earlier than 7.x)

GET /book/type1/_search
{
  "query": {
    "match": {
      "content": "Keyword"
    }
  }
}

(Version 7.x and later versions)

GET /book/_search
{
  "query": {
    "match": {
      "content": "Keyword"
    }
  }
}

Search result

(Versions earlier than 7.x)

{
  "took" : 12,
  "timed_out" : false,
  "_shards" : {
    "total" : 2,
    "successful" : 2,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 1,
    "max_score" : 1.7260926,
    "hits" : [
      {
        "_index" : "book",
        "_type" : "type1",
        "_id" : "1",
        "_score" : 1.7260926,
        "_source" : {
          "content" : "Imported text"
        }
      }
    ]
  }
}

(Version 7.x and later versions)

{
  "took" : 16,
  "timed_out" : false,
  "_shards" : {
    "total" : 2,
    "successful" : 2,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 1.7260926,
    "hits" : [
      {
        "_index" : "book",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 1.7260926,
        "_source" : {
          "content" : "Imported text"
        }
      }
    ]
  }
}
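
Note that `hits.total` is a plain number in the result for versions earlier than 7.x and an object with `value` and `relation` in the result for 7.x and later. A script that reads the hit count from either format could handle both as in the following Python sketch; the function name is hypothetical.

```python
def total_hits(response):
    """Return (count, is_exact) from a search response of either format."""
    total = response["hits"]["total"]
    if isinstance(total, dict):
        # 7.x and later: {"value": N, "relation": "eq" or "gte"}
        return total["value"], total.get("relation") == "eq"
    # Earlier than 7.x: a plain integer, which is always the exact count.
    return total, True
```

For both sample results above, the function returns a count of 1 that is marked as exact.
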

Refer to the following procedure to perform related operations, including creating an index, importing data, conducting search based on the synonym, and viewing the search result.

Create an index.

(Versions earlier than 7.x)

PUT myindex
{
  "settings": {
    "analysis": {
      "filter": {
        "my_synonym": {
          "type": "dynamic_synonym"
        }
      },
      "analyzer": {
        "ik_synonym": {
          "filter": [
            "my_synonym"
          ],
          "type": "custom",
          "tokenizer": "ik_smart"
        }
      }
    }
  },
  "mappings": {
    "mytype" :{
      "properties": {
        "desc": {
          "type": "text",
          "analyzer": "ik_synonym"
        }
      }
    }
  }
}

(Version 7.x and later versions)

PUT myindex
{
    "settings": {
        "analysis": {
            "filter": {
                "my_synonym": {
                    "type": "dynamic_synonym"
                }
            },
            "analyzer": {
                "ik_synonym": {
                    "filter": [
                        "my_synonym"
                    ],
                    "type": "custom",
                    "tokenizer": "ik_smart"
                }
            }
        }
    },
    "mappings": {
        "properties": {
            "desc": {
                "type": "text",
                "analyzer": "ik_synonym"
            }
        }
    }
}

Import data. Import the text information to the myindex index.
(Versions earlier than 7.x)
```
PUT /myindex/mytype/1
{
  "desc": "Imported text"
}
```
(Version 7.x and later versions)
```
PUT /myindex/_doc/1
{
  "desc": "Imported text"
}
```

Conduct search based on the synonym Keyword and view the search results.

Run the following command to search for Keyword:

GET /myindex/_search
{
  "query": {
    "match": {
      "desc": "Keyword"
    }
  }
}

Search result

(Versions earlier than 7.x)

{
  "took": 12,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 0.41048482,
    "hits": [
      {
        "_index": "myindex",
        "_type": "mytype",
        "_id": "1",
        "_score": 0.41048482,
        "_source": {
          "desc": "Imported text"
        }
      }
    ]
  }
}

(Version 7.x and later versions)

{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 0.1519955,
    "hits" : [
      {
        "_index" : "myindex",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 0.1519955,
        "_source" : {
          "desc" : "Imported text"
        }
      }
    ]
  }
}
