窗口函数
窗口函数
窗口函数与OVER语句一起使用。OVER语句用于对数据进行分组,并对组内元素进行排序。窗口函数用于给组内的值生成序号。
窗口函数中的order by后面必须跟字段名,若order by后面跟数字,该数字会被按照常量处理,因此对目标列没有起到排序的作用。
- RANK()
描述:RANK函数为各组内值生成跳跃排序序号,其中,相同的值具有相同序号。
返回值类型:BIGINT
示例:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17
gaussdb=# CREATE TABLE rank_t1(a int, b int); gaussdb=# INSERT INTO rank_t1 VALUES(1,1),(1,1),(1, 2),(1, 3),(2, 4),(2, 5),(3,6); gaussdb=# SELECT a,b,RANK() OVER(PARTITION BY a ORDER BY b) FROM rank_t1; a | b | rank ---+---+------ 1 | 1 | 1 1 | 1 | 1 1 | 2 | 3 1 | 3 | 4 2 | 4 | 1 2 | 5 | 2 3 | 6 | 1 (7 rows) gaussdb=# DROP TABLE rank_t1;
- ROW_NUMBER()
描述:ROW_NUMBER函数为各组内值生成连续排序序号,其中,相同的值其序号也不相同。
返回值类型:BIGINT
示例:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17
gaussdb=# CREATE TABLE row_number_t1(a int, b int); gaussdb=# INSERT INTO row_number_t1 VALUES(1,1),(1,1),(1, 2),(1, 3),(2, 4),(2, 5),(3,6); gaussdb=# SELECT a,b,ROW_NUMBER() OVER(PARTITION BY a ORDER BY b) FROM row_number_t1; a | b | row_number ---+---+------------ 1 | 1 | 1 1 | 1 | 2 1 | 2 | 3 1 | 3 | 4 2 | 4 | 1 2 | 5 | 2 3 | 6 | 1 (7 rows) gaussdb=# DROP TABLE row_number_t1;
- DENSE_RANK()
描述:DENSE_RANK函数为各组内值生成连续排序序号,其中,相同的值具有相同序号。
返回值类型:BIGINT
示例:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17
gaussdb=# CREATE TABLE dense_rank_t1(a int, b int); gaussdb=# INSERT INTO dense_rank_t1 VALUES(1,1),(1,1),(1, 2),(1, 3),(2, 4),(2, 5),(3,6); gaussdb=# SELECT a,b,DENSE_RANK() OVER(PARTITION BY a ORDER BY b) FROM dense_rank_t1; a | b | dense_rank ---+---+------------ 1 | 1 | 1 1 | 1 | 1 1 | 2 | 2 1 | 3 | 3 2 | 4 | 1 2 | 5 | 2 3 | 6 | 1 (7 rows) gaussdb=# DROP TABLE dense_rank_t1;
- PERCENT_RANK()
描述:PERCENT_RANK函数为各组内对应值生成相对序号,即根据公式 (rank - 1) / (totalrows - 1)计算所得的值。其中rank为该值依据RANK函数所生成的对应序号,totalrows为该分组内的总元素个数。
返回值类型:DOUBLE PRECISION
示例:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17
gaussdb=# CREATE TABLE percent_rank_t1(a int, b int); gaussdb=# INSERT INTO percent_rank_t1 VALUES(1,1),(1,1),(1, 2),(1, 3),(2, 4),(2, 5),(3,6); gaussdb=# SELECT a,b,PERCENT_RANK() OVER(PARTITION BY a ORDER BY b) FROM percent_rank_t1; a | b | percent_rank ---+---+------------------ 1 | 1 | 0 1 | 1 | 0 1 | 2 | .666666666666667 1 | 3 | 1 2 | 4 | 0 2 | 5 | 1 3 | 6 | 0 (7 rows) gaussdb=# DROP TABLE percent_rank_t1;
- CUME_DIST()
描述:CUME_DIST函数为各组内对应值生成累积分布序号。即根据公式(小于等于当前值的数据行数)/(该分组总行数totalrows)计算所得的相对序号。
返回值类型:DOUBLE PRECISION
示例:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17
gaussdb=# CREATE TABLE cume_dist_t1(a int, b int); gaussdb=# INSERT INTO cume_dist_t1 VALUES(1,1),(1,1),(1, 2),(1, 3),(2, 4),(2, 5),(3,6); gaussdb=# SELECT a,b,CUME_DIST() OVER(PARTITION BY a ORDER BY b) FROM cume_dist_t1; a | b | cume_dist ---+---+----------- 1 | 1 | .5 1 | 1 | .5 1 | 2 | .75 1 | 3 | 1 2 | 4 | .5 2 | 5 | 1 3 | 6 | 1 (7 rows) gaussdb=# DROP TABLE cume_dist_t1;
- NTILE(num_buckets integer)
描述:NTILE函数根据num_buckets integer将有序的数据集合平均分配到num_buckets所指定数量的桶中,并将桶号分配给每一行。分配时应尽量做到平均分配。
返回值类型:INTEGER
示例:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17
gaussdb=# CREATE TABLE ntile_t1(a int, b int); gaussdb=# INSERT INTO ntile_t1 VALUES(1,1),(1,1),(1, 2),(1, 3),(2, 4),(2, 5),(3,6); gaussdb=# SELECT a,b,NTILE(2) OVER(PARTITION BY a ORDER BY b) FROM ntile_t1; a | b | ntile ---+---+------- 1 | 1 | 1 1 | 1 | 1 1 | 2 | 2 1 | 3 | 2 2 | 4 | 1 2 | 5 | 2 3 | 6 | 1 (7 rows) gaussdb=# DROP TABLE ntile_t1;
- LAG(value any [, offset integer [, default any ]])
描述:LAG函数为各组内对应值生成滞后值。即当前值对应的行数往前偏移offset位后所得行的value值作为序号。若经过偏移后行数不存在,则对应结果取为default值。若无指定,在默认情况下,offset取为1,default值取为NULL。default值的类型需要与value值的类型保持一致。
返回值类型:与参数数据类型相同
示例:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48
-- 建表并插入数据 gaussdb=# CREATE TABLE ta1 (hire_date date, last_name varchar(20), department_id int); CREATE TABLE gaussdb=# INSERT INTO ta1 VALUES('07-DEC-02', 'Raphaely', 30); INSERT 0 1 gaussdb=# INSERT INTO ta1 VALUES('24-JUL-05', 'Tobias', 30); INSERT 0 1 gaussdb=# INSERT INTO ta1 VALUES('24-DEC-05', 'Baida', 30); INSERT 0 1 gaussdb=# INSERT INTO ta1 VALUES('18-MAY-03', 'Khoo', 30); INSERT 0 1 gaussdb=# INSERT INTO ta1 values('15-NOV-06', 'Himuro', 30); INSERT 0 1 gaussdb=# INSERT INTO ta1 values('10-AUG-07', 'Colmenares', 30); INSERT 0 1 gaussdb=# INSERT INTO ta1 values('10-MAY-07', 'yq', 11); INSERT 0 1 gaussdb=# INSERT INTO ta1 values('10-MAY-08', 'zi', 11); INSERT 0 1 gaussdb=# INSERT INTO ta1 values('', 'yq1', 30); INSERT 0 1 gaussdb=# INSERT INTO ta1 values(null, 'yq2', 30); INSERT 0 1 gaussdb=# INSERT INTO ta1 values('10-DEC-07', 'yq3', 30); INSERT 0 1 gaussdb=# INSERT INTO ta1 values(null, null, 11); INSERT 0 1 gaussdb=# INSERT INTO ta1 values(null, null, 11); INSERT 0 1 -- 调用LAG,指定offset=3, default=null gaussdb=# SELECT hire_date, last_name, department_id, lag(hire_date, 3, null) OVER (PARTITION BY department_id ORDER BY last_name) AS "NextHired" FROM ta1 ORDER BY department_id; hire_date | last_name | department_id | NextHired ---------------------+------------+---------------+--------------------- 2007-05-10 00:00:00 | yq | 11 | 2008-05-10 00:00:00 | zi | 11 | | | 11 | | | 11 | 2007-05-10 00:00:00 2005-12-24 00:00:00 | Baida | 30 | 2007-08-10 00:00:00 | Colmenares | 30 | 2006-11-15 00:00:00 | Himuro | 30 | 2003-05-18 00:00:00 | Khoo | 30 | 2005-12-24 00:00:00 2002-12-07 00:00:00 | Raphaely | 30 | 2007-08-10 00:00:00 2005-07-24 00:00:00 | Tobias | 30 | 2006-11-15 00:00:00 | yq1 | 30 | 2003-05-18 00:00:00 | yq2 | 30 | 2002-12-07 00:00:00 2007-12-10 00:00:00 | yq3 | 30 | 2005-07-24 00:00:00 (13 rows)
- LEAD(value any [, offset integer [, default any ]])
描述:LEAD函数为各组内对应值生成提前值。即当前值对应的行数向后偏移offset位后所得行的value值作为序号。若经过向后偏移后行数超过当前组内的总行数,则对应结果取为default值。若无指定,在默认情况下,offset取为1,default值取为NULL。default值的类型需要与value值的类型保持一致。
返回值类型:与参数数据类型相同。
示例:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48
-- 建表并插入数据 gaussdb=# CREATE TABLE ta1 (hire_date date, last_name varchar(20), department_id int); CREATE TABLE gaussdb=# INSERT INTO ta1 values('07-DEC-02', 'Raphaely', 30); INSERT 0 1 gaussdb=# INSERT INTO ta1 values('24-JUL-05', 'Tobias', 30); INSERT 0 1 gaussdb=# INSERT INTO ta1 values('24-DEC-05', 'Baida', 30); INSERT 0 1 gaussdb=# INSERT INTO ta1 values('18-MAY-03', 'Khoo', 30); INSERT 0 1 gaussdb=# INSERT INTO ta1 values('15-NOV-06', 'Himuro', 30); INSERT 0 1 gaussdb=# INSERT INTO ta1 values('10-AUG-07', 'Colmenares', 30); INSERT 0 1 gaussdb=# INSERT INTO ta1 values('10-MAY-07', 'yq', 11); INSERT 0 1 gaussdb=# INSERT INTO ta1 values('10-MAY-08', 'zi', 11); INSERT 0 1 gaussdb=# INSERT INTO ta1 values('', 'yq1', 30); INSERT 0 1 gaussdb=# INSERT INTO ta1 values(null, 'yq2', 30); INSERT 0 1 gaussdb=# INSERT INTO ta1 values('10-DEC-07', 'yq3', 30); INSERT 0 1 gaussdb=# INSERT INTO ta1 values(null, null, 11); INSERT 0 1 gaussdb=# INSERT INTO ta1 values(null, null, 11); INSERT 0 1 -- 调用LEAD,指定offset=2 gaussdb=# SELECT hire_date, last_name, department_id, lead(hire_date, 2) OVER (PARTITION BY department_id ORDER BY last_name) AS "NextHired" FROM ta1 ORDER BY department_id; hire_date | last_name | department_id | NextHired ---------------------+------------+---------------+--------------------- 2007-05-10 00:00:00 | yq | 11 | 2008-05-10 00:00:00 | zi | 11 | | | 11 | | | 11 | 2005-12-24 00:00:00 | Baida | 30 | 2006-11-15 00:00:00 2007-08-10 00:00:00 | Colmenares | 30 | 2003-05-18 00:00:00 2006-11-15 00:00:00 | Himuro | 30 | 2002-12-07 00:00:00 2003-05-18 00:00:00 | Khoo | 30 | 2005-07-24 00:00:00 2002-12-07 00:00:00 | Raphaely | 30 | 2005-07-24 00:00:00 | Tobias | 30 | | yq1 | 30 | 2007-12-10 00:00:00 | yq2 | 30 | 2007-12-10 00:00:00 | yq3 | 30 | (13 rows)
- FIRST_VALUE(value any)
描述:FIRST_VALUE函数取各组内的第一个值作为返回结果。
返回值类型:与参数数据类型相同。
示例:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17
gaussdb=# CREATE TABLE first_value_t1(a int, b int); gaussdb=# INSERT INTO first_value_t1 VALUES(1,1),(1,1),(1, 2),(1, 3),(2, 4),(2, 5),(3,6); gaussdb=# SELECT a,b,FIRST_VALUE(b) OVER(PARTITION BY a ORDER BY b) FROM first_value_t1; a | b | first_value ---+---+------------- 1 | 1 | 1 1 | 1 | 1 1 | 2 | 1 1 | 3 | 1 2 | 4 | 4 2 | 5 | 4 3 | 6 | 6 (7 rows) gaussdb=# DROP TABLE first_value_t1;
- LAST_VALUE(value any)
描述:LAST_VALUE函数取各组内的最后一个值作为返回结果。
返回值类型:与参数数据类型相同。
示例:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17
gaussdb=# CREATE TABLE last_value_t1(a int, b int); gaussdb=# INSERT INTO last_value_t1 VALUES(1,1),(1,1),(1, 2),(1, 3),(2, 4),(2, 5),(3,6); gaussdb=# SELECT a,b,LAST_VALUE(b) OVER(PARTITION BY a ORDER BY b) FROM last_value_t1; a | b | last_value ---+---+------------ 1 | 1 | 1 1 | 1 | 1 1 | 2 | 2 1 | 3 | 3 2 | 4 | 4 2 | 5 | 5 3 | 6 | 6 (7 rows) gaussdb=# DROP TABLE last_value_t1;
- NTH_VALUE(value any, nth integer)
描述:NTH_VALUE函数返回该组内的第nth行作为结果。若该行不存在,则默认返回NULL。
返回值类型:与参数数据类型相同。
示例:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17
gaussdb=# CREATE TABLE nth_value_t1(a int, b int); gaussdb=# INSERT INTO nth_value_t1 VALUES(1,1),(1,1),(1, 2),(1, 3),(2, 4),(2, 5),(3,6); gaussdb=# SELECT a,b,NTH_VALUE(b, 2) OVER(PARTITION BY a order by b) FROM nth_value_t1; a | b | nth_value ---+---+----------- 1 | 1 | 1 1 | 1 | 1 1 | 2 | 1 1 | 3 | 1 2 | 4 | 2 | 5 | 5 3 | 6 | (7 rows) gaussdb=# DROP TABLE nth_value_t1;
- delta
参数:numeric
返回值类型:numeric
- spread
参数:real
返回值类型:real