How Do I Use Custom Scoring of Elasticsearch to Query Data?

Elasticsearch can assign a custom score to each matched document and sort the results by that score. This section describes how to run custom scoring queries by using the function_score query of Elasticsearch.

Procedure

  1. Log in to the CSS management console.
  2. In the left navigation pane, click Clusters to switch to the Clusters page.
  3. In the cluster list, locate the row where the target cluster resides and click Kibana in the Operation column.
  4. In the left navigation pane of Kibana, click Dev Tools. Click Get to work to switch to the Console page.
  5. Prepare the data and create an index with a user-defined mapping.

    For example, the tv.json data file contains the following data:

    {
    "tv":[
    { "name": "tv1", "description": "USB, DisplayPort", "vote": 0.98 },
    { "name": "tv2", "description": "USB, HDMI", "vote": 0.99 },
    { "name": "tv3", "description": "USB", "vote": 0.5 },
    { "name": "tv4", "description": "USB, HDMI, DisplayPort", "vote": 0.7 }
    ]
    }
    Run the following command to create the mall index with a user-defined mapping that defines the data type of each field:
    PUT /mall?pretty
    {
      "mappings": {
        "tv": {
          "properties": {
            "description": {
              "type": "text",
              "fields": {
                "keyword": {
                  "type": "keyword"
                }
              }
            },
            "name": {
              "type": "text",
              "fields": {
                "keyword": {
                  "type": "keyword"
                }
              }
            },
            "vote": {
              "type": "float"
            }
          }
        }
      }
    }
  6. Run the following command to import the data in the tv.json file into the mall index:
    POST /mall/tv/_bulk?pretty
    { "index": {"_id": "1"}}
    { "name": "tv1", "description": "USB, DisplayPort", "vote": 0.98 }
    { "index": {"_id": "2"}}
    { "name": "tv2", "description": "USB, HDMI", "vote": 0.99 }
    { "index": {"_id": "3"}}
    { "name": "tv3", "description": "USB", "vote": 0.5 }
    { "index": {"_id": "4"}}
    { "name": "tv4", "description": "USB, HDMI, DisplayPort", "vote": 0.7 }
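    For a larger tv.json file, writing the bulk request body by hand is impractical. The following is a minimal Python sketch, not part of the CSS tooling, that builds the body from the tv.json structure shown in step 5. The function name build_bulk_body and the sequential IDs starting at 1 are assumptions made for illustration.

```python
import json

# Contents of the tv.json file, as shown in step 5.
TV_JSON = """
{
  "tv": [
    { "name": "tv1", "description": "USB, DisplayPort", "vote": 0.98 },
    { "name": "tv2", "description": "USB, HDMI", "vote": 0.99 },
    { "name": "tv3", "description": "USB", "vote": 0.5 },
    { "name": "tv4", "description": "USB, HDMI, DisplayPort", "vote": 0.7 }
  ]
}
"""


def build_bulk_body(records):
    """Return a bulk body: one action line plus one source line per record."""
    lines = []
    for doc_id, record in enumerate(records, start=1):
        # Action line: index the document under a sequential ID (assumption).
        lines.append(json.dumps({"index": {"_id": str(doc_id)}}))
        # Source line: the document itself, on a single line.
        lines.append(json.dumps(record))
    # The bulk API expects newline-delimited JSON that ends with a newline.
    return "\n".join(lines) + "\n"


bulk_body = build_bulk_body(json.loads(TV_JSON)["tv"])
print(bulk_body)
```

    The printed lines can be pasted below the POST /mall/tv/_bulk?pretty line on the Kibana Console page.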
  7. Query data by using custom scoring.
    • Compute the total score of each TV from its absolute praise rate (the vote field) and sort the matching TVs in descending order of total score.

      The following sample code queries TVs that have a USB, HDMI, or DisplayPort port. Each of the three ports that a TV has adds 1 to its query score, so a TV scores 1, 2, or 3. A TV with none of the three ports does not match the query and is not returned. The query score is then multiplied by the TV's absolute praise rate to obtain the total score, and the TVs are sorted in descending order of total score.

      GET /mall/tv/_search?pretty
      {
        "query":{
          "function_score":{
            "query":{
              "bool":{
                "should":[
                  {"constant_score":{
                      "filter":{"match":{"description":"USB"}}
                  }},
                  {"constant_score":{
                      "filter":{"match":{"description":"HDMI"}}
                  }},
                  {"constant_score":{
                      "filter":{"match":{"description":"DisplayPort"}}
                  }}
                ]
              }
            },
            "field_value_factor":{
              "field":"vote",
              "factor":1
            },
            "boost_mode":"multiply",
            "max_boost":10
          }
        }
      }

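      In the preceding example, the total score is calculated as follows: new_score = query_score × (factor × vote). For example, tv4 has all three ports and a vote of 0.7, so its total score is 3 × (1 × 0.7) = 2.1.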

      The command output is similar to the following:

      {
        "took": 13,
        "timed_out": false,
        "_shards": {
          "total": 5,
          "successful": 5,
          "failed": 0
        },
        "hits": {
          "total": 4,
          "max_score": 2.1,
          "hits": [
            {
              "_index": "mall",
              "_type": "tv",
              "_id": "4",
              "_score": 2.1,
              "_source": {
                "name": "tv4",
                "description": "USB, HDMI, DisplayPort",
                "vote": 0.7
              }
            },
            {
              "_index": "mall",
              "_type": "tv",
              "_id": "2",
              "_score": 1.98,
              "_source": {
                "name": "tv2",
                "description": "USB, HDMI",
                "vote": 0.99
              }
            },
            {
              "_index": "mall",
              "_type": "tv",
              "_id": "1",
              "_score": 1.96,
              "_source": {
                "name": "tv1",
                "description": "USB, DisplayPort",
                "vote": 0.98
              }
            },
            {
              "_index": "mall",
              "_type": "tv",
              "_id": "3",
              "_score": 0.5,
              "_source": {
                "name": "tv3",
                "description": "USB",
                "vote": 0.5
              }
            }
          ]
        }
      }

      The command output shows that Elasticsearch computes the total score of each TV from its absolute praise rate and returns the TVs in descending order of total score.
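      The scores in this output can be checked offline. The following is a minimal Python sketch of the same arithmetic, not of how Elasticsearch computes scores internally. It assumes that splitting the description on commas is a fair stand-in for the text analyzer on this sample data, and that max_boost caps the function value before it is multiplied by the query score. The names query_score and total_score are chosen for illustration.

```python
# Sample documents from the mall index (same data as tv.json).
TVS = [
    {"name": "tv1", "description": "USB, DisplayPort", "vote": 0.98},
    {"name": "tv2", "description": "USB, HDMI", "vote": 0.99},
    {"name": "tv3", "description": "USB", "vote": 0.5},
    {"name": "tv4", "description": "USB, HDMI, DisplayPort", "vote": 0.7},
]

PORTS = ("USB", "HDMI", "DisplayPort")
FACTOR = 1      # "factor" in field_value_factor
MAX_BOOST = 10  # "max_boost" in function_score


def query_score(description):
    """Count the matched ports: each constant_score clause contributes 1."""
    # Simplification: comma splitting stands in for the text analyzer.
    tokens = {token.strip().lower() for token in description.split(",")}
    return sum(1 for port in PORTS if port.lower() in tokens)


def total_score(tv):
    """new_score = query_score x (factor x vote), with the function value capped."""
    function_value = min(FACTOR * tv["vote"], MAX_BOOST)
    # Round to two decimals to hide floating-point noise in the product.
    return round(query_score(tv["description"]) * function_value, 2)


# Sort in descending order of total score, as the search response does.
ranked = sorted(TVS, key=total_score, reverse=True)
for tv in ranked:
    print(tv["name"], total_score(tv))
```

      On the sample data, the sketch yields the same order and scores as the command output: tv4, tv2, tv1, and then tv3.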

    • Compute the total score of each TV from its relative praise rate and sort the matching TVs in descending order of total score.

      The following sample code queries TVs that have a USB, HDMI, or DisplayPort port. Each of the three ports that a TV has adds 1 to its query score, so a TV scores 1, 2, or 3. A TV with none of the three ports does not match the query and is not returned. The query score is then multiplied by the TV's relative praise rate to obtain the total score. The relative praise rate is derived from a praise rate threshold, 0.8 in this example: if a TV's absolute praise rate is greater than 0.8, its relative praise rate is 1. Otherwise, its relative praise rate is 0.5. Finally, the TVs are sorted in descending order of total score.

      GET /mall/tv/_search?pretty
      {
        "query":{
          "function_score":{
            "query":{
              "bool":{
                "should":[
                  {"constant_score":{
                      "filter":{"match":{"description":"USB"}}
                  }},
                  {"constant_score":{
                      "filter":{"match":{"description":"HDMI"}}
                  }},
                  {"constant_score":{
                      "filter":{"match":{"description":"DisplayPort"}}
                  }}
                ]
              }
            },
            "script_score":{
              "script":{
                "params":{"threshold":0.8},
                "inline":"if (doc[\"vote\"].value > params.threshold) {return 1;} return 0.5;"
              }
            },
            "boost_mode":"multiply",
            "max_boost":10
          }
        }
      }

      In the preceding example, the total score is calculated as follows: new_score = query_score × relative praise rate, where the relative praise rate is 1 if the value of vote is greater than 0.8 and 0.5 otherwise. For example, tv4 has all three ports but a vote of 0.7, so its total score is 3 × 0.5 = 1.5.

      The command output is similar to the following:

      {
        "took": 634,
        "timed_out": false,
        "_shards": {
          "total": 5,
          "successful": 5,
          "failed": 0
        },
        "hits": {
          "total": 4,
          "max_score": 2,
          "hits": [
            {
              "_index": "mall",
              "_type": "tv",
              "_id": "2",
              "_score": 2,
              "_source": {
                "name": "tv2",
                "description": "USB, HDMI",
                "vote": 0.99
              }
            },
            {
              "_index": "mall",
              "_type": "tv",
              "_id": "1",
              "_score": 2,
              "_source": {
                "name": "tv1",
                "description": "USB, DisplayPort",
                "vote": 0.98
              }
            },
            {
              "_index": "mall",
              "_type": "tv",
              "_id": "4",
              "_score": 1.5,
              "_source": {
                "name": "tv4",
                "description": "USB, HDMI, DisplayPort",
                "vote": 0.7
              }
            },
            {
              "_index": "mall",
              "_type": "tv",
              "_id": "3",
              "_score": 0.5,
              "_source": {
                "name": "tv3",
                "description": "USB",
                "vote": 0.5
              }
            }
          ]
        }
      }

      The command output shows that Elasticsearch computes the total score of each TV from its relative praise rate and returns the TVs in descending order of total score.
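      The thresholded scores can be checked offline in the same way. The following is a minimal Python sketch of the arithmetic that the script_score script performs on the sample data. It assumes that splitting the description on commas is a fair stand-in for the text analyzer, and that max_boost caps the script result before it is multiplied by the query score. The names relative_rate and total_score are chosen for illustration.

```python
# Sample documents from the mall index (same data as tv.json).
TVS = [
    {"name": "tv1", "description": "USB, DisplayPort", "vote": 0.98},
    {"name": "tv2", "description": "USB, HDMI", "vote": 0.99},
    {"name": "tv3", "description": "USB", "vote": 0.5},
    {"name": "tv4", "description": "USB, HDMI, DisplayPort", "vote": 0.7},
]

PORTS = ("USB", "HDMI", "DisplayPort")
THRESHOLD = 0.8  # "params.threshold" in the script
MAX_BOOST = 10   # "max_boost" in function_score


def query_score(description):
    """Count the matched ports: each constant_score clause contributes 1."""
    # Simplification: comma splitting stands in for the text analyzer.
    tokens = {token.strip().lower() for token in description.split(",")}
    return sum(1 for port in PORTS if port.lower() in tokens)


def relative_rate(vote, threshold=THRESHOLD):
    """Mirror the script: 1 above the threshold, 0.5 otherwise."""
    return 1 if vote > threshold else 0.5


def total_score(tv):
    """new_score = query_score x relative praise rate, with the rate capped."""
    function_value = min(relative_rate(tv["vote"]), MAX_BOOST)
    return query_score(tv["description"]) * function_value


scores = {tv["name"]: total_score(tv) for tv in TVS}
for name, score in sorted(scores.items(), key=lambda item: item[1], reverse=True):
    print(name, score)
```

      tv1 and tv2 receive the same total score, so their relative order is not determined by the score. The sketch may list them in a different order than the search response does.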