Updated on 2024-11-29 GMT+08:00

Optimizing Aggregation

In most cases, an aggregation query on a single field is fast. When multiple fields are aggregated, a large number of buckets are generated and a large amount of Elasticsearch memory is consumed, which can cause memory overflow. Optimize aggregations based on your service scenarios to reduce the number of aggregations performed.

Changing the Default Depth-first Aggregation to Breadth-first Aggregation

Add the following setting to the terms aggregation: collect_mode: breadth_first.

depth_first (default): Builds the complete bucket tree, including all sub-aggregations, and then prunes the buckets that are not needed.

breadth_first: Calculates the current aggregation level first, prunes it to the required buckets, and then calculates the sub-aggregations only for the remaining buckets.
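The collect mode only makes a difference when an aggregation has sub-aggregations. The following is a minimal sketch of such a request body, assuming a hypothetical second field named city on the same index; the function name build_breadth_first_body is also illustrative and not part of any product tooling.

```shell
# Sketch only: "city" is a hypothetical field used to illustrate a sub-aggregation.
# The function prints the request body so it can be inspected before it is sent.
build_breadth_first_body() {
  cat <<'EOF'
{
  "size": 0,
  "aggregations": {
    "count_age": {
      "terms": {
        "field": "age",
        "size": 10,
        "collect_mode": "breadth_first"
      },
      "aggregations": {
        "count_city": {
          "terms": { "field": "city" }
        }
      }
    }
  }
}
EOF
}

# Print the body for review.
build_breadth_first_body

# To send it, pipe the body to curl (replace ip and httpport with real values):
# build_breadth_first_body | curl -XGET --tlsv1.2 --negotiate -k -u : "https://ip:httpport/myindex-001/_search?pretty" -H 'Content-Type: application/json' -d @-
```

With breadth_first, the count_age level is reduced to its top 10 buckets before count_city is calculated, so the sub-aggregation runs only for those buckets.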

Optimizing Aggregation Execution Mode

Add the following setting to each terms aggregation: execution_hint: map.

  • In map mode, the query results are stored directly in memory to build a map. When the result set is small, the query is fast.
  • However, when the result set is large (from millions to hundreds of millions of entries), the traditional aggregation mode is faster than the map mode.
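The trade-off above can be tried out by switching the hint per request. The following is a minimal sketch, assuming that the traditional mode corresponds to the global_ordinals execution hint of the Elasticsearch terms aggregation and that city is a hypothetical keyword field; the function name build_terms_body is illustrative.

```shell
# Sketch only: "city" is a hypothetical keyword field.
# $1 is the execution hint: "map" for small result sets,
# "global_ordinals" (assumed to be the traditional mode) for large ones.
build_terms_body() {
  hint="$1"
  cat <<EOF
{
  "size": 0,
  "aggregations": {
    "count_city": {
      "terms": {
        "field": "city",
        "execution_hint": "${hint}"
      }
    }
  }
}
EOF
}

# Print both variants for comparison.
build_terms_body map
build_terms_body global_ordinals

# To send one of them, pipe it to curl (replace ip and httpport with real values):
# build_terms_body map | curl -XGET --tlsv1.2 --negotiate -k -u : "https://ip:httpport/myindex-001/_search?pretty" -H 'Content-Type: application/json' -d @-
```

Comparing the response time of both variants on representative data is a simple way to decide which hint suits a given index.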

The following is an example of an aggregation query in security mode:

curl -XGET --tlsv1.2 --negotiate -k -u : "https://ip:httpport/myindex-001/_search?pretty" -H 'Content-Type: application/json' -d'
{
  "size" : 0,
  "aggregations" : {
    "count_age" : {
      "terms" : {
        "field" : "age"
      }
    }
  }
}'

The following is an example of an optimized aggregation query in security mode:

curl -XGET --tlsv1.2 --negotiate -k -u : "https://ip:httpport/myindex-001/_search?pretty" -H 'Content-Type: application/json' -d'
{
  "size" : 0,
  "aggregations" : {
    "count_age" : {
      "terms" : {
        "field" : "age",
        "execution_hint" : "map",
        "collect_mode" : "breadth_first"
      }
    }
  }
}'