Updated on 2024-11-29 GMT+08:00

Scroll Query

To prevent deep paging, page-based queries (from and size) cannot return results beyond the first 10,000 hits. Use a scroll query to retrieve data past that limit.

The following is an example of a scroll query in security mode:

curl -XGET --tlsv1.2 --negotiate -k -u : "https://ip:httpport/myindex-001/_search?scroll=1m&pretty" -H 'Content-Type: application/json' -d'
{
  "query": {
    "match": {
      "age": "36"
    }
  },
  "size":1000
}'

To start a scroll, specify the scroll parameter in the initial search request. This parameter indicates how long the cursor window is kept alive. For example, scroll=1m keeps the cursor window alive for one minute.

The following response is returned:

{
  "_scroll_id" : "DnF1ZXJ5VGhlbkZldGNoMgAAAAAAAABPFlFHZzExcFdnUWJDU0d5bU==",
  "took" : 55,
  "timed_out" : false,
  "_shards" : {
    "total" : 50,
    "successful" : 50,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 16692062,
    "max_score" : 0.0,
    "hits" : [...1000 data ]
  }
}
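
The _scroll_id in the response is what later requests use to fetch the next batch of hits. The following is a minimal sketch, assuming the standard Elasticsearch _search/scroll endpoint and the same security-mode curl options as in the example above. The helper names scroll_body and scroll_next and the ES_URL variable are hypothetical and used only for illustration.

```shell
# Hypothetical helper: build the JSON body for a follow-up scroll request.
# $1 = scroll ID from the previous response, $2 = keep-alive (defaults to 1m).
scroll_body() {
  printf '{"scroll":"%s","scroll_id":"%s"}' "${2:-1m}" "$1"
}

# Hypothetical helper: fetch the next batch of hits for scroll ID $1.
# ES_URL is an assumed variable, for example https://ip:httpport.
scroll_next() {
  curl -XGET --tlsv1.2 --negotiate -k -u : "${ES_URL}/_search/scroll?pretty" \
    -H 'Content-Type: application/json' -d "$(scroll_body "$1" "${2:-1m}")"
}
```

Each call returns the next batch and renews the keep-alive. Repeat with the _scroll_id from the most recent response until the hits array comes back empty.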

Optimizing scroll: Scroll is generally used to retrieve a large amount of data. In most cases, only the data itself needs to be returned and its order does not matter. In this case, scroll can be optimized by sorting by _doc: the results are returned in index order without any actual sorting, which gives the highest execution efficiency.

The following is an example in security mode:

curl -XGET --tlsv1.2 --negotiate -k -u : "https://ip:httpport/myindex-001/_search?scroll=1m&pretty" -H 'Content-Type: application/json' -d'
{
  "query": {
    "match": {
      "age": "36"
    }
  },
  "size":1000,
  "sort": "_doc"
}'

When a scroll is opened, it stays alive until its lifetime expires. Clearing it explicitly as soon as it is no longer needed releases resources early and reduces the Elasticsearch load.

curl -XDELETE --tlsv1.2 --negotiate -k -u : "https://ip:httpport/_search/scroll?pretty" -H 'Content-Type: application/json' -d'
{
  "scroll_id":"DnF1ZXJ5VGhlbkZldGNoMgAAAAAAAABPFlFHZzExcFdnUWJDU0d5bU=="
}'
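
To clear a scroll from a script, the ID first has to be read from a search response. The following is a rough sketch, assuming the response contains "_scroll_id" followed by a quoted value on a single line, as in the response shown earlier. The helper names extract_scroll_id and clear_scroll and the ES_URL variable (for example https://ip:httpport) are hypothetical.

```shell
# Hypothetical helper: read a search response on stdin and print its _scroll_id.
# Assumes the field appears as "_scroll_id" : "..." on one line.
extract_scroll_id() {
  sed -n 's/.*"_scroll_id"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# Hypothetical helper: clear the scroll whose ID is passed as $1.
# ES_URL is an assumed variable pointing at the cluster.
clear_scroll() {
  curl -XDELETE --tlsv1.2 --negotiate -k -u : "${ES_URL}/_search/scroll?pretty" \
    -H 'Content-Type: application/json' -d "{\"scroll_id\":\"$1\"}"
}
```

For example, saving a response to a file and running clear_scroll "$(extract_scroll_id < response.json)" would release that scroll without waiting for its lifetime to expire. A JSON-aware tool is a more robust choice than sed if one is available on the client.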