Updated on 2024-05-31 GMT+08:00

Overview

Table 1 lists the aggregate functions supported by DLI.

Table 1 Aggregate functions

Function

Syntax

Value Type

Description

avg

avg(col), avg(DISTINCT col)

DOUBLE

Returns the average value.

corr

corr(col1, col2)

DOUBLE

Returns the coefficient of correlation of a pair of numeric columns.

count

count([distinct|all] <colname>)

BIGINT

Returns the number of records.

covar_pop

covar_pop(col1, col2)

DOUBLE

Returns the covariance of a pair of numeric columns.

covar_samp

covar_samp(col1, col2)

DOUBLE

Returns the sample covariance of a pair of numeric columns.

max

max(col)

DOUBLE

Returns the maximum value.

min

min(col)

DOUBLE

Returns the minimum value.

percentile

percentile(BIGINT col, p)

DOUBLE

Returns the percentage value point of the value area. The value of p must be between 0 and 1. Otherwise, NULL is returned. The value cannot be a float.

percentile_approx

percentile_approx(DOUBLE col, p [, B])

DOUBLE

Returns the approximate pth percentile of a numerical column within the group, including floating-point numbers. The value of p should be between 0 and 1. The parameter B controls the accuracy of the approximation, with a higher value of B resulting in a higher level of approximation. The default value is 10000. If the number of non-repeating values in the column is less than B, an exact percentile is returned.

stddev_pop

stddev_pop(col)

DOUBLE

Returns the deviation of a specified column.

stddev_samp

stddev_samp(col)

DOUBLE

Returns the sample deviation of a specified column.

sum

sum(col), sum(DISTINCT col)

DOUBLE

Returns the sum of the values in a column.

variance/var_pop

variance(col), var_pop(col)

DOUBLE

Returns the variance of a column.

var_samp

var_samp(col)

DOUBLE

Returns the sample variance of a specified column.