Updated on 2024-04-19 GMT+08:00

Aggregate Functions

Aggregate functions process all rows as input and produce a single aggregate value as the output.

Table 1 Aggregate functions

Function

Description

COUNT([ ALL ] expression | DISTINCT expression1 [, expression2]*)

By default or with the keyword ALL, returns the number of input rows where the expression is not NULL. Using DISTINCT calculates the count after removing duplicates.

COUNT(*) | COUNT(1)

Returns the number of input rows.
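The three COUNT forms differ only in how they treat NULL and duplicate values. The following minimal Python sketch models that behavior on an in-memory list. It is not the SQL engine's implementation; the sample data is hypothetical and `None` stands in for SQL NULL.

```python
# Hypothetical input column; None stands in for SQL NULL.
values = [10, 20, 20, None, 30]

# COUNT(*) or COUNT(1): every input row is counted, NULL or not.
count_star = len(values)

# COUNT(expression) or COUNT(ALL expression): rows with a NULL value are skipped.
count_all = sum(1 for v in values if v is not None)

# COUNT(DISTINCT expression): NULLs are skipped and duplicates are counted once.
count_distinct = len({v for v in values if v is not None})

print(count_star, count_all, count_distinct)  # prints: 5 4 3
```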

AVG([ ALL | DISTINCT ] expression)

By default or with the keyword ALL, returns the average value (arithmetic mean) of the expression across all input rows. Using DISTINCT calculates the average after removing duplicates.

SUM([ ALL | DISTINCT ] expression)

By default or with the keyword ALL, returns the sum of the expression across all input rows. Using DISTINCT calculates the sum after removing duplicates.

MAX([ ALL | DISTINCT ] expression)

By default or with the keyword ALL, returns the maximum value of the expression across all input rows. Using DISTINCT calculates the maximum after removing duplicates.

MIN([ ALL | DISTINCT ] expression)

By default or with the keyword ALL, returns the minimum value of the expression across all input rows. Using DISTINCT calculates the minimum after removing duplicates.

STDDEV_POP([ ALL | DISTINCT ] expression)

By default or with the keyword ALL, returns the population standard deviation of the expression across all input rows. Using DISTINCT calculates the standard deviation after removing duplicates.

STDDEV_SAMP([ ALL | DISTINCT ] expression)

By default or with the keyword ALL, returns the sample standard deviation of the expression across all input rows. Using DISTINCT calculates the standard deviation after removing duplicates.

VAR_POP([ ALL | DISTINCT ] expression)

By default or with the keyword ALL, returns the population variance (square of the population standard deviation) of the expression across all input rows. Using DISTINCT calculates the variance after removing duplicates.

VAR_SAMP([ ALL | DISTINCT ] expression)

By default or with the keyword ALL, returns the sample variance (square of the sample standard deviation) of the expression across all input rows. Using DISTINCT calculates the variance after removing duplicates.

COLLECT([ ALL | DISTINCT ] expression)

By default or with the keyword ALL, returns a multiset of the expression values across all input rows. NULL values are ignored. Using DISTINCT keeps only one instance of each value in the result.

VARIANCE([ ALL | DISTINCT ] expression)

A synonym for VAR_SAMP().
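The population and sample variants differ only in the divisor: the population forms divide the sum of squared deviations by n, the sample forms by n - 1. The sketch below illustrates this with Python's standard `statistics` module, used here as a stand-in for the SQL functions. The sample data is hypothetical.

```python
import math
import statistics

# Hypothetical input column with no NULLs; the mean is 5.0.
values = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

var_pop = statistics.pvariance(values)     # models VAR_POP: divides by n
var_samp = statistics.variance(values)     # models VAR_SAMP / VARIANCE: divides by n - 1
stddev_pop = statistics.pstdev(values)     # models STDDEV_POP
stddev_samp = statistics.stdev(values)     # models STDDEV_SAMP

# Each standard deviation is the square root of the matching variance.
assert math.isclose(stddev_pop, math.sqrt(var_pop))
assert math.isclose(stddev_samp, math.sqrt(var_samp))

# DISTINCT removes duplicate values before the statistic is computed.
var_pop_distinct = statistics.pvariance(sorted(set(values)))

print(var_pop, stddev_pop)  # prints: 4.0 2.0
```

For this data the sum of squared deviations is 32, so the population variance is 32 / 8 and the sample variance is 32 / 7.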

RANK()

Returns the rank of a value within a set of values. The result is 1 plus the number of rows that precede the current row in the ordering of the partition. Rows with equal ordering values receive the same rank, so the ranks may not be consecutive.

DENSE_RANK()

Returns the rank of a value within a set of values. The result is 1 plus the number of distinct ordering values that precede the current row, so rows with equal values share a rank. Unlike RANK, DENSE_RANK does not leave gaps in the ranking sequence.

ROW_NUMBER()

Assigns a unique sequential number, starting from 1, to each row within a window partition, based on the ordering of the rows. ROW_NUMBER is similar to RANK, but ROW_NUMBER numbers all rows sequentially (for example, 1, 2, 3, 4, 5), whereas RANK gives equal rows the same value (for example, 1, 2, 2, 4, 5).
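The difference between the three numbering functions shows up only when the ordering contains ties. The following Python sketch models their results for a single partition that is already sorted in descending order. The `scores` data is hypothetical, and the code illustrates the semantics rather than how a SQL engine computes them.

```python
# Hypothetical partition, already ordered as in ORDER BY score DESC.
scores = [90, 85, 85, 70, 60]

# ROW_NUMBER: a plain 1-based position, ties are numbered arbitrarily but uniquely.
row_number = [i + 1 for i in range(len(scores))]

# RANK: 1 plus the number of rows that sort strictly before the current row.
rank = [1 + sum(1 for other in scores if other > s) for s in scores]

# DENSE_RANK: 1 plus the number of distinct values that sort before the current row.
distinct_desc = sorted(set(scores), reverse=True)
dense_rank = [1 + distinct_desc.index(s) for s in scores]

print(row_number)  # prints: [1, 2, 3, 4, 5]
print(rank)        # prints: [1, 2, 2, 4, 5]
print(dense_rank)  # prints: [1, 2, 2, 3, 4]
```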

LEAD(expression [, offset] [, default])

Returns the value of the expression at the offset-th row after the current row in the window. If no such row exists, default is returned. If omitted, offset is 1 and default is NULL.

LAG(expression [, offset] [, default])

Returns the value of the expression at the offset-th row before the current row in the window. If no such row exists, default is returned. If omitted, offset is 1 and default is NULL.
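LEAD and LAG can be thought of as indexed lookups relative to the current row, with a fallback when the lookup runs off either end of the window. The Python sketch below models this. The helper functions `lead` and `lag` and the `prices` data are hypothetical names introduced for illustration, and `None` stands in for SQL NULL.

```python
def lead(rows, i, offset=1, default=None):
    """Value `offset` rows after row i, or `default` if that row does not exist."""
    j = i + offset
    return rows[j] if 0 <= j < len(rows) else default


def lag(rows, i, offset=1, default=None):
    """Value `offset` rows before row i, or `default` if that row does not exist."""
    j = i - offset
    return rows[j] if 0 <= j < len(rows) else default


# Hypothetical window, already in its ORDER BY order.
prices = [100, 105, 103, 110]

lead_1 = [lead(prices, i) for i in range(len(prices))]
lag_1 = [lag(prices, i) for i in range(len(prices))]
lag_2_zero = [lag(prices, i, 2, 0) for i in range(len(prices))]

print(lead_1)      # prints: [105, 103, 110, None]
print(lag_1)       # prints: [None, 100, 105, 103]
print(lag_2_zero)  # prints: [0, 0, 100, 105]
```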

FIRST_VALUE(expression)

Returns the first value in a set of ordered values.

LAST_VALUE(expression)

Returns the last value in a set of ordered values.

LISTAGG(expression [, separator])

Concatenates the values of string expressions and places the separator between them. The separator is not added at the end of the string. The default separator is a comma (,).
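The separator behavior of LISTAGG corresponds to a plain string join: the separator appears only between values, never after the last one. A minimal Python sketch follows, assuming non-NULL string inputs. The `listagg` helper and the sample values are hypothetical, and NULL handling is not modeled.

```python
def listagg(values, separator=","):
    """Join string values with `separator` between them and none at the end."""
    return separator.join(values)


# Hypothetical input rows.
names = ["alice", "bob", "carol"]

default_sep = listagg(names)
custom_sep = listagg(names, " | ")

print(default_sep)  # prints: alice,bob,carol
print(custom_sep)   # prints: alice | bob | carol
```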