Updated on 2025-08-14 GMT+08:00

Aggregate Functions

This section describes aggregate functions, including their syntax, parameters, and usage examples.

Function List

Table 1 Aggregate functions

Function

Description

approx_distinct

Estimates the number of unique values in x.

approx_percentile

Sorts the values of x in ascending order and returns the value at a specified percentile position (percentage).

arbitrary

Returns a random non-null value in x.

max_by

Queries the x value for the maximum y value or the x values for n maximum y values.

min_by

Queries the x value for the minimum y value or the x values for n minimum y values.

count

Used for counting.

max

Queries the maximum value in x.

min

Queries the minimum value in x.

avg

Calculates the arithmetic average value of x.

approx_distinct

This function estimates the number of unique values in x.

  • Estimates the number of unique values in x. By default, the standard error is 2.3%.

    Syntax: approx_distinct(x)

  • Estimates the number of unique values in x. You can customize the standard error.

    Syntax: approx_distinct(x, e)

Table 2 Parameter description

Parameter

Description

Type

Mandatory

x

Original field.

Any

Yes

e

Custom standard error.

Double; value range: [0.0115, 0.26]

No

Return value type: bigint

  • Example 1: Use the approx_distinct function to count the unique values of the clientIp field, with the standard error of 2.3%.
    Table 3 Query and analysis results

    Type

    Query Statement

    Returned Result

    Scenario

    SELECT approx_distinct(clientIp)

    1

  • Example 2: Use the approx_distinct function to count the unique values of the clientIp field, with the custom standard error of 10%.
    Table 4 Query and analysis results

    Type

    Query Statement

    Returned Result

    Scenario

    SELECT approx_distinct(clientIp, 0.1)

    1

approx_percentile

This function sorts the values of x in ascending order and returns the value at a specified percentile position (percentage).

  • Sorts x in ascending order and returns the value of x at the specified percentile position (percentage) as a double.

    Syntax: approx_percentile(x, percentage)

  • Sorts x in ascending order and returns the values of x at the specified percentile positions (percentage01 and percentage02), with the result presented as array(double,double).

    Syntax: approx_percentile(x, array[percentage01, percentage02)]

  • Computes the product of each element in x with the corresponding weight, sorts these products in ascending order, and returns the value of x at the specified percentile position (percentage) as a double.

    Syntax: approx_percentile(x, weight, percentage)

  • Computes the product of each element in x with the corresponding weight, sorts the products in ascending order, and returns the values of x at the specified percentile positions (percentage01 and percentage02), with the result presented as array(double,double).

    Syntax: approx_percentile(x, weight, array[percentage01, percentage02...])

  • Computes the product of each element in x with the corresponding weight, sorts these products in ascending order, and returns the value of x at the specified percentile position (percentage) as a double. The accuracy of the returned result can be set.

    Syntax: approx_percentile(x, weight, percentage, accuracy)

Table 5 Parameter description

Parameter

Description

Type

Mandatory

x

Original field.

Double

Yes

percentage

Percentage. The value range is [0, 1].

Double

Yes

weight

Weight, which is an integer greater than 1. When the weight is applied, the system sorts the products of x and the weight.

Integer

No

accuracy

Accuracy. The value range is (0, 1).

Double

No

Return value type: double or array(double,...,double)

  • Example 1: Sort the request_time column and return the value of the request_time field that is approximately at the 50% position.
    Table 6 Query and analysis results

    Type

    Query Statement

    Returned Result

    Scenario

    SELECT approx_percentile(request_time, 0.5)

    45.0

  • Example 2: Sort the request_time column and return the values of the request_time field that are at the 10%, 20%, and 70% positions.
    Table 7 Query and analysis results

    Type

    Query Statement

    Returned Result

    Scenario

    SELECT approx_percentile(request_time,array[0.1,0.2,0.7])

    [17.0, 24.0, 59.0]

  • Example 3: Sort the request_time column based on the product of request_time and weight, and return the value of the request_time field that is approximately at the 50% position. The weight is 60.
    Table 8 Query and analysis results

    Type

    Query Statement

    Returned Result

    Scenario

    SELECT approx_percentile(request_time, 60, 0.5)

    45.0

  • Example 4: Sort the request_time column based on the product of request_time and weight, and return the values of the request_time field that are approximately at the 80% and 90% positions. The weight is 60.
    Table 9 Query and analysis results

    Type

    Query Statement

    Returned Result

    Scenario

    SELECT approx_percentile(request_time, 60, array[0.8, 0.9])

    [66.0,73.0]

  • Example 5: Sort the request_time column based on the product of request_time and weight, and return the value of the request_time field that is approximately at the 50% position. The weight is 60 and the accuracy is 0.2.
    Table 10 Query and analysis results

    Type

    Query Statement

    Returned Result

    Scenario

    SELECT approx_percentile(request_time, 60, 0.5, 0.2)

    45.0

arbitrary

This function returns a random non-null value in x.

Syntax: arbitrary(x)

Table 11 Parameter description

Parameter

Description

Type

Mandatory

x

Original field.

Any

Yes

Return value type: same as the data type of the parameter value

Example: select arbitrary(region)

Table 12 Query and analysis results

Type

Query Statement

Returned Result

Scenario

arbitrary(region)

r2

max_by

This function queries the x value for the maximum y value or the x values for n maximum y values.

  • Queries the x value corresponding to the maximum y value.

    Syntax: max_by(x, y)

  • Queries the x values corresponding to the top n maximum y values.

    Syntax: max_by(x, y, n)

Table 13 Parameter description

Parameter

Description

Type

Mandatory

x

Original field.

Any data type

Yes

y

Original field.

Any data type

Yes

n

A positive integer.

Integer

No

Return value type: same as the data type of the parameter value

  • Example 1: Return the request method for the longest request.
    Table 14 Query and analysis results

    Type

    Query Statement

    Returned Result

    Scenario

    SELECT max_by(request_method, request_time)

    GET

  • Example 2: Return the request methods for the top 3 longest requests.
    Table 15 Query and analysis results

    Type

    Query Statement

    Returned Result

    Scenario

    SELECT max_by(request_method, request_time, 3)

    ["GET","GET","GET"]

min_by

This function queries the x value for the minimum y value or the x values for n minimum y values.

  • Queries the x value corresponding to the minimum y value.

    Syntax: min_by(x, y)

  • Queries the x values corresponding to the top n minimum y values.

    Syntax: min_by(x, y, n)

Table 16 Parameter description

Parameter

Description

Type

Mandatory

x

Original field.

Any data type

Yes

y

Original field.

Any data type

Yes

n

A positive integer.

Integer

No

Return value type: same as the data type of the parameter value

  • Example 1: Return the request method for the fastest request.
    Table 17 Query and analysis results

    Type

    Query Statement

    Returned Result

    Scenario

    SELECT min_by(request_method, request_time)

    POST

  • Example 2: Return the request methods for the top 3 fastest requests.
    Table 18 Query and analysis results

    Type

    Query Statement

    Returned Result

    Scenario

    SELECT min_by(request_method, request_time, 3)

    ["POST","POST","POST"]

count

This function is used for counting.

  • Counts all log events.

    Syntax: COUNT(*)

  • Counts all log events. It is equivalent to count(*).

    Syntax: COUNT(1)

  • Counts the number of log events whose values are not NULL in x.

    Syntax: COUNT(x)

Table 19 Parameter description

Parameter

Description

Type

Mandatory

x

Original field.

Any

Yes

Return value type: Int

Example: select COUNT(*)

Table 20 Query and analysis results

Type

Query Statement

Returned Result

Scenario

COUNT(*)

1

max

This function queries the maximum value in x.

  • Queries the maximum value in x.

    Syntax: max(x)

  • Queries the n maximum values in x and returns them as an array.

    Syntax: max(x, n)

Table 21 Parameter description

Parameter

Description

Type

Mandatory

x

Original field.

Any data

Yes

n

Number of maximum values to return.

Positive integer

No

Return value type: same as the data type of the parameter value

  • Example 1: Query the maximum value in x.
    Table 22 Query and analysis results

    Type

    Query Statement

    Returned Result

    Scenario

    SELECT max(x)

    99.0

  • Example 2: Query the two maximum values in x.
    Table 23 Query and analysis results

    Type

    Query Statement

    Returned Result

    Scenario

    SELECT max(x, 2)

    [99.0, 99.0]

min

This function queries the minimum value in x.

  • Queries the minimum value in x.

    Syntax: min(x)

  • Queries the n minimum values in x and returns them as an array.

    Syntax: min(x, n)

Table 24 Parameter description

Parameter

Description

Type

Mandatory

x

Original field.

Any data

Yes

n

Number of minimum values to return.

Positive integer

No

Return value type: same as the data type of the parameter value

  • Example 1: Query the minimum value in x.
    Table 25 Query and analysis results

    Type

    Query Statement

    Returned Result

    Scenario

    SELECT min(x)

    10.0

  • Example 2: Query the two minimum values in x.
    Table 26 Query and analysis results

    Type

    Query Statement

    Returned Result

    Scenario

    SELECT min(x, 2)

    [10.0, 10.0]

avg

This function calculates the arithmetic average value of x.

Syntax: AVG(x)

Table 27 Parameter description

Parameter

Description

Type

Mandatory

x

Original field.

number

Yes

Return value type: double

Example: select AVG(value)

Table 28 Query and analysis results

Type

Query Statement

Returned Result

Scenario

AVG(value)

1