Updated on 2022-08-31 GMT+08:00

How Do I Use Elasticsearch Custom Scoring to Query Data?

Perform the following procedure to use the custom scoring function of Elasticsearch to query data.

Procedure

  1. Log in to the CSS management console.
  2. In the left navigation pane, click Clusters to switch to the Clusters page.
  3. In the cluster list, locate the row that contains the target cluster and click Access Kibana in the Operation column.
  4. In the left navigation pane of Kibana, click Dev Tools. Click Get to work to switch to the Console page.
  5. Create an index with a user-defined mapping that defines the data type of each field.

    In this example, the data file tv.json contains the following data:

    {
    "tv":[
    { "name": "tv1", "description": "USB, DisplayPort", "vote": 0.98 },
    { "name": "tv2", "description": "USB, HDMI", "vote": 0.99 },
    { "name": "tv3", "description": "USB", "vote": 0.5 },
    { "name": "tv4", "description": "USB, HDMI, DisplayPort", "vote": 0.7 }
    ]
    }
    Run the following command to create the mall index and specify the user-defined mapping to define the data type:
    PUT /mall?pretty
    {
      "mappings": {
        "tv": {
          "properties": {
            "description": {
              "type": "text",
              "fields": {
                "keyword": {
                  "type": "keyword"
                }
              }
            },
            "name": {
              "type": "text",
              "fields": {
                "keyword": {
                  "type": "keyword"
                }
              }
            },
            "vote": {
              "type": "float"
            }
          }
        }
      }
    }
  6. Run the following command to import the data in the tv.json file into the mall index:
    POST /mall/tv/_bulk?pretty
    { "index": {"_id": "1"}}
    { "name": "tv1", "description": "USB, DisplayPort", "vote": 0.98 }
    { "index": {"_id": "2"}}
    { "name": "tv2", "description": "USB, HDMI", "vote": 0.99 }
    { "index": {"_id": "3"}}
    { "name": "tv3", "description": "USB", "vote": 0.5 }
    { "index": {"_id": "4"}}
    { "name": "tv4", "description": "USB, HDMI, DisplayPort", "vote": 0.7 }
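
The bulk request body in step 6 pairs each record with an action line. As a minimal sketch of how such a body could be generated from the tv.json records, the following Python snippet uses only the standard library. The helper name build_bulk_body and the choice of sequential IDs starting at 1 are assumptions made for illustration; they are not part of CSS or Elasticsearch.

```python
import json

# Records from tv.json (the "tv" array shown in step 5).
tvs = [
    {"name": "tv1", "description": "USB, DisplayPort", "vote": 0.98},
    {"name": "tv2", "description": "USB, HDMI", "vote": 0.99},
    {"name": "tv3", "description": "USB", "vote": 0.5},
    {"name": "tv4", "description": "USB, HDMI, DisplayPort", "vote": 0.7},
]


def build_bulk_body(records):
    """Hypothetical helper: return a newline-delimited _bulk body.

    Each record becomes two lines: an "index" action line carrying the
    document ID, followed by the document source itself.
    """
    lines = []
    for doc_id, record in enumerate(records, start=1):
        lines.append(json.dumps({"index": {"_id": str(doc_id)}}))
        lines.append(json.dumps(record))
    # The bulk API expects the body to end with a newline character.
    return "\n".join(lines) + "\n"


bulk_body = build_bulk_body(tvs)
print(bulk_body, end="")
```

The printed text can be pasted below the POST /mall/tv/_bulk?pretty line in the Kibana console.
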
  7. Query data by using custom scoring.
    • Score by absolute praise rate: the total score of each TV is computed from its absolute praise rate, and the matching products are sorted by total score, from highest to lowest.

      The following sample query searches for TVs that have a USB, HDMI, or DisplayPort port. Each matching TV receives a base score equal to the number of these ports it has: 1 if it has one, 2 if it has two, and 3 if it has all three. A TV with none of the three ports does not match the query and is not returned. The base score is then multiplied by the TV's absolute praise rate (the vote field) to obtain its total score. Finally, the TVs are sorted by total score, from highest to lowest.

      GET /mall/tv/_search?pretty
      {
        "query":{
          "function_score":{
            "query":{
              "bool":{
                "should":[
                  {"constant_score":{
                      "query":{"match":{"description":"USB"}}
                  }},
                  {"constant_score":{
                      "query":{"match":{"description":"HDMI"}}
                  }},
                  {"constant_score":{
                      "query":{"match":{"description":"DisplayPort"}}
                  }}
                ]
              }
            },
            "field_value_factor":{
              "field":"vote",
              "factor":1
            },
            "boost_mode":"multiply",
            "max_boost":10
          }
        }
      }

      In the preceding example, the total score is calculated as follows: new_score = query_score × (factor × vote), where query_score is the number of matched ports and factor is 1.

      The command output is similar to the following:

      {
        "took": 13,
        "timed_out": false,
        "_shards": {
          "total": 5,
          "successful": 5,
          "failed": 0
        },
        "hits": {
          "total": 4,
          "max_score": 2.1,
          "hits": [
            {
              "_index": "mall",
              "_type": "tv",
              "_id": "4",
              "_score": 2.1,
              "_source": {
                "name": "tv4",
                "description": "USB, HDMI, DisplayPort",
                "vote": 0.7
              }
            },
            {
              "_index": "mall",
              "_type": "tv",
              "_id": "2",
              "_score": 1.98,
              "_source": {
                "name": "tv2",
                "description": "USB, HDMI",
                "vote": 0.99
              }
            },
            {
              "_index": "mall",
              "_type": "tv",
              "_id": "1",
              "_score": 1.96,
              "_source": {
                "name": "tv1",
                "description": "USB, DisplayPort",
                "vote": 0.98
              }
            },
            {
              "_index": "mall",
              "_type": "tv",
              "_id": "3",
              "_score": 0.5,
              "_source": {
                "name": "tv3",
                "description": "USB",
                "vote": 0.5
              }
            }
          ]
        }
      }

      The preceding command output shows that Elasticsearch computes the total score of each TV from its absolute praise rate and sorts the TVs by total score, from highest to lowest.
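
The scores in this output can be checked by hand. The following Python sketch reproduces the arithmetic outside Elasticsearch under simplifying assumptions: it approximates the match query by splitting the description on commas, and the function names port_score and absolute_score are hypothetical. It is meant as an illustration of the formula, not as a description of how Elasticsearch scores internally.

```python
# Records from tv.json.
tvs = [
    {"name": "tv1", "description": "USB, DisplayPort", "vote": 0.98},
    {"name": "tv2", "description": "USB, HDMI", "vote": 0.99},
    {"name": "tv3", "description": "USB", "vote": 0.5},
    {"name": "tv4", "description": "USB, HDMI, DisplayPort", "vote": 0.7},
]

PORTS = ("USB", "HDMI", "DisplayPort")


def port_score(description):
    """Count matched ports; each constant_score clause contributes 1."""
    # Assumption: comma splitting stands in for the text analyzer.
    tokens = {token.strip().lower() for token in description.split(",")}
    return sum(1 for port in PORTS if port.lower() in tokens)


def absolute_score(tv, factor=1.0, max_boost=10.0):
    """new_score = query_score x (factor x vote), with the boost capped."""
    boost = min(factor * tv["vote"], max_boost)
    return port_score(tv["description"]) * boost


ranked = sorted(tvs, key=absolute_score, reverse=True)
for tv in ranked:
    print(tv["name"], round(absolute_score(tv), 2))
```

With the sample data, the ranking is tv4, tv2, tv1, tv3, which matches the order of the hits above.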

    • Score by relative praise rate: the total score of each TV is computed from its relative praise rate, and the matching products are sorted by total score, from highest to lowest.

      The following sample query searches for TVs that have a USB, HDMI, or DisplayPort port. Each matching TV receives a base score equal to the number of these ports it has: 1 if it has one, 2 if it has two, and 3 if it has all three. A TV with none of the three ports does not match the query and is not returned. The base score is then multiplied by the TV's relative praise rate to obtain its total score. The relative praise rate is derived from a threshold, 0.8 in this example: if a TV's absolute praise rate is higher than 0.8, the relative praise rate is 1; otherwise, it is 0.5. Finally, the TVs are sorted by total score, from highest to lowest.

      GET /mall/tv/_search?pretty
      {
        "query":{
          "function_score":{
            "query":{
              "bool":{
                "should":[
                  {"constant_score":{
                      "query":{"match":{"description":"USB"}}
                  }},
                  {"constant_score":{
                      "query":{"match":{"description":"HDMI"}}
                  }},
                  {"constant_score":{
                      "query":{"match":{"description":"DisplayPort"}}
                  }}
                ]
              }
            },
            "script_score":{
              "script":{
                "params":{"threshold":0.8},
                "inline":"if (doc[\"vote\"].value > params.threshold) {return 1;} return 0.5;"
              }
            },
            "boost_mode":"multiply",
            "max_boost":10
          }
        }
      }

      In the preceding example, the total score is calculated as follows: new_score = query_score × relative praise rate, where the relative praise rate is 1 if the value of vote is greater than 0.8 and 0.5 if the value of vote is 0.8 or less.

      The command output is similar to the following:

      {
        "took": 634,
        "timed_out": false,
        "_shards": {
          "total": 5,
          "successful": 5,
          "failed": 0
        },
        "hits": {
          "total": 4,
          "max_score": 2,
          "hits": [
            {
              "_index": "mall",
              "_type": "tv",
              "_id": "2",
              "_score": 2,
              "_source": {
                "name": "tv2",
                "description": "USB, HDMI",
                "vote": 0.99
              }
            },
            {
              "_index": "mall",
              "_type": "tv",
              "_id": "1",
              "_score": 2,
              "_source": {
                "name": "tv1",
                "description": "USB, DisplayPort",
                "vote": 0.98
              }
            },
            {
              "_index": "mall",
              "_type": "tv",
              "_id": "4",
              "_score": 1.5,
              "_source": {
                "name": "tv4",
                "description": "USB, HDMI, DisplayPort",
                "vote": 0.7
              }
            },
            {
              "_index": "mall",
              "_type": "tv",
              "_id": "3",
              "_score": 0.5,
              "_source": {
                "name": "tv3",
                "description": "USB",
                "vote": 0.5
              }
            }
          ]
        }
      }

      The preceding command output shows that Elasticsearch computes the total score of each TV from its relative praise rate and sorts the TVs by total score, from highest to lowest.
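
The threshold logic of the script can also be reproduced outside Elasticsearch. The following Python sketch mirrors the script_score expression under the same simplifying assumption as before (comma splitting stands in for the match query); the function names relative_rate and relative_score are hypothetical.

```python
# Records from tv.json.
tvs = [
    {"name": "tv1", "description": "USB, DisplayPort", "vote": 0.98},
    {"name": "tv2", "description": "USB, HDMI", "vote": 0.99},
    {"name": "tv3", "description": "USB", "vote": 0.5},
    {"name": "tv4", "description": "USB, HDMI, DisplayPort", "vote": 0.7},
]

PORTS = ("USB", "HDMI", "DisplayPort")


def port_score(description):
    """Count matched ports; each constant_score clause contributes 1."""
    tokens = {token.strip().lower() for token in description.split(",")}
    return sum(1 for port in PORTS if port.lower() in tokens)


def relative_rate(vote, threshold=0.8):
    """Mirror the script: 1 above the threshold, 0.5 otherwise."""
    return 1.0 if vote > threshold else 0.5


def relative_score(tv, threshold=0.8):
    """new_score = query_score x relative praise rate."""
    return port_score(tv["description"]) * relative_rate(tv["vote"], threshold)


scores = {tv["name"]: relative_score(tv) for tv in tvs}
for name, score in sorted(scores.items(), key=lambda item: item[1], reverse=True):
    print(name, score)
```

Here tv1 and tv2 tie at 2.0, so their relative order is not determined by the score alone; tv4 drops to 1.5 because its praise rate of 0.7 is below the threshold.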
