更新时间:2025-05-29 GMT+08:00
分享

窗口函数

窗口函数与OVER语句一起使用。OVER语句用于对数据进行分组,并对组内元素进行排序。窗口函数用于给组内的值生成序号。

窗口函数中的order by后面必须跟字段名,若order by后面跟数字,该数字会被按照常量处理,因此对目标列没有起到排序的作用。

当父查询中包含子查询中的窗口函数的过滤条件时,支持将父查询中的过滤条件下推到子查询中。

  1. 只支持将父查询中关于窗口函数的<、<=、=过滤条件下推到子查询中。
  2. 限制条件范围的上边界可以是常量、常量表达式、参数、非VOLATILE类型的函数、非相关子链接。
  3. 只支持ROW_NUMBER()RANK()DENSE_RANK()三个窗口函数。

示例:

--创建表。
gaussdb=# CREATE TABLE t2 (c1 INT, c2 INT);
--导入数据。
gaussdb=# INSERT INTO t2 SELECT generate_series, generate_series FROM generate_series(1, 1000000);
--执行查询,查询正常无报错,且执行结果正确。
gaussdb=# EXPLAIN ANALYZE SELECT nc1 FROM (
SELECT row_number() over() rid,
t2.c1 nc1
FROM t2
) WHERE rid BETWEEN 1 AND (1 + 10 - 1);
                                                     QUERY PLAN                                                     
--------------------------------------------------------------------------------------------------------------------
Streaming (type: GATHER)  (cost=0.12..10.64 rows=5 width=4) (actual time=10.201..12.766 rows=10 loops=1)
   Node/s: All datanodes
   ->  Subquery Scan on __unnamed_subquery__  (cost=0.00..10.45 rows=5 width=4) (actual time=[5.310,6.092]..[5.310,6.092], rows=10)
         Filter: (__unnamed_subquery__.rid >= 1)
         ->  WindowAgg  (cost=0.00..10.25 rows=16 width=4) (actual time=[5.262,5.873]..[5.262,5.873], rows=10)
               row_number_filter: (row_number() OVER () <= 10)
               ->  Streaming(type: BROADCAST)  (cost=0.00..10.29 rows=32 width=4) (actual time=[5.162,5.682]..[5.162,5.682], rows=11)
                     Spawn on: All datanodes
                     ->  WindowAgg  (cost=0.00..10.25 rows=16 width=4) (actual time=[1.759,1.815]..[1.786,1.844], rows=20)
                           row_number_filter: (row_number() OVER () <= 10)
                           ->  Seq Scan on t2  (cost=0.00..13.13 rows=20 width=4) (actual time=[1.711,1.716]..[1.743,1.747], rows=22)
 Total runtime: 13.428 ms
(12 rows)
--环境清理,避免数据泄露。
gaussdb=# DROP TABLE t2;

RANK()

描述:RANK函数为各组内值生成跳跃排序序号,其中,相同的值具有相同序号。

返回值类型:BIGINT

示例:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
gaussdb=# CREATE TABLE rank_t1(a int, b int);

gaussdb=# INSERT INTO rank_t1 VALUES(1,1),(1,1),(1, 2),(1, 3),(2, 4),(2, 5),(3,6);

gaussdb=# SELECT a,b,RANK() OVER(PARTITION BY a ORDER BY b) FROM rank_t1;
 a | b | rank 
---+---+------
 1 | 1 |    1
 1 | 1 |    1
 1 | 2 |    3
 1 | 3 |    4
 2 | 4 |    1
 2 | 5 |    2
 3 | 6 |    1
(7 rows)

gaussdb=# DROP TABLE rank_t1;

ROW_NUMBER()

描述:ROW_NUMBER函数为各组内值生成连续排序序号,其中,相同的值其序号也不相同。

返回值类型:BIGINT

示例:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
gaussdb=# CREATE TABLE row_number_t1(a int, b int);

gaussdb=# INSERT INTO row_number_t1 VALUES(1,1),(1,1),(1, 2),(1, 3),(2, 4),(2, 5),(3,6);

gaussdb=# SELECT a,b,ROW_NUMBER() OVER(PARTITION BY a ORDER BY b) FROM row_number_t1;
 a | b | row_number 
---+---+------------
 1 | 1 |          1
 1 | 1 |          2
 1 | 2 |          3
 1 | 3 |          4
 2 | 4 |          1
 2 | 5 |          2
 3 | 6 |          1
(7 rows)

gaussdb=# DROP TABLE row_number_t1;

DENSE_RANK()

描述:DENSE_RANK函数为各组内值生成连续排序序号,其中,相同的值具有相同序号。

返回值类型:BIGINT

示例:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
gaussdb=# CREATE TABLE dense_rank_t1(a int, b int);

gaussdb=# INSERT INTO dense_rank_t1 VALUES(1,1),(1,1),(1, 2),(1, 3),(2, 4),(2, 5),(3,6);

gaussdb=# SELECT a,b,DENSE_RANK() OVER(PARTITION BY a ORDER BY b) FROM dense_rank_t1;
 a | b | dense_rank 
---+---+------------
 1 | 1 |          1
 1 | 1 |          1
 1 | 2 |          2
 1 | 3 |          3
 2 | 4 |          1
 2 | 5 |          2
 3 | 6 |          1
(7 rows)

gaussdb=# DROP TABLE dense_rank_t1;

PERCENT_RANK()

描述:PERCENT_RANK函数为各组内对应值生成相对序号,即根据公式 (rank - 1) / (totalrows - 1)计算所得的值。其中rank为该值依据RANK函数所生成的对应序号,totalrows为该分组内的总元素个数。

返回值类型:DOUBLE PRECISION

示例:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
gaussdb=# CREATE TABLE percent_rank_t1(a int, b int);

gaussdb=# INSERT INTO percent_rank_t1 VALUES(1,1),(1,1),(1, 2),(1, 3),(2, 4),(2, 5),(3,6);

gaussdb=# SELECT a,b,PERCENT_RANK() OVER(PARTITION BY a ORDER BY b) FROM percent_rank_t1;
 a | b |   percent_rank   
---+---+------------------
 1 | 1 |                0
 1 | 1 |                0
 1 | 2 | .666666666666667
 1 | 3 |                1
 2 | 4 |                0
 2 | 5 |                1
 3 | 6 |                0
(7 rows)

gaussdb=# DROP TABLE percent_rank_t1;

CUME_DIST()

描述:CUME_DIST函数为各组内对应值生成累积分布序号。即根据公式(小于等于当前值的数据行数)/(该分组总行数totalrows)计算所得的相对序号。

返回值类型:DOUBLE PRECISION

示例:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
gaussdb=# CREATE TABLE cume_dist_t1(a int, b int);

gaussdb=# INSERT INTO cume_dist_t1 VALUES(1,1),(1,1),(1, 2),(1, 3),(2, 4),(2, 5),(3,6);

gaussdb=# SELECT a,b,CUME_DIST() OVER(PARTITION BY a ORDER BY b) FROM cume_dist_t1;
 a | b | cume_dist 
---+---+-----------
 1 | 1 |        .5
 1 | 1 |        .5
 1 | 2 |       .75
 1 | 3 |         1
 2 | 4 |        .5
 2 | 5 |         1
 3 | 6 |         1
(7 rows)

gaussdb=# DROP TABLE cume_dist_t1;

NTILE(num_buckets integer)

描述:NTILE函数根据num_buckets integer将有序的数据集合平均分配到num_buckets所指定数量的桶中,并将桶号分配给每一行。分配时应尽量做到平均分配。

返回值类型:INTEGER

示例:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
gaussdb=# CREATE TABLE ntile_t1(a int, b int);

gaussdb=# INSERT INTO ntile_t1 VALUES(1,1),(1,1),(1, 2),(1, 3),(2, 4),(2, 5),(3,6);

gaussdb=# SELECT a,b,NTILE(2) OVER(PARTITION BY a ORDER BY b) FROM ntile_t1;
 a | b | ntile 
---+---+-------
 1 | 1 |     1
 1 | 1 |     1
 1 | 2 |     2
 1 | 3 |     2
 2 | 4 |     1
 2 | 5 |     2
 3 | 6 |     1
(7 rows)

gaussdb=# DROP TABLE ntile_t1;

LAG

描述:LAG函数为各组内对应值生成滞后值。即将当前值对应的行数向前偏移offset位后所得行的value值作为序号。若经过向前偏移后行数不存在,则对应结果取为default值。若无指定,在默认情况下,offset取值为1,default值取值为NULL。default值的类型需要与value值的类型保持一致。

语法:

LAG(value any [, offset integer [, default any ]])
LAG(value any ignore|respect nulls [, offset integer [, default any ]])
LAG(value any [, offset integer [, default any ]]) ignore|respect nulls

ignore|respect nulls表示是否将NULL包含在向前偏移值中。若无指定,则默认是respect nulls。若设置为ignore nulls,当value为NULL时,则将其不包含在向前偏移值中。如果开启了ignore nulls功能,则该函数会存在性能劣化。

返回值类型:与参数数据类型相同。

示例1:不开启ignore nulls功能,设置offset=3, default=null

-- 建表并插入数据
gaussdb=# CREATE TABLE ta1 (hire_date date, last_name varchar(20), department_id int);
CREATE TABLE
gaussdb=# INSERT INTO ta1 values('07-DEC-02', 'Raphaely', 30);
INSERT 0 1
gaussdb=# INSERT INTO ta1 values('24-JUL-05', 'Tobias',  30);
INSERT 0 1
gaussdb=# INSERT INTO ta1 values('24-DEC-05', 'Baida',  30);
INSERT 0 1
gaussdb=# INSERT INTO ta1 values('18-MAY-03', 'Khoo', 30);
INSERT 0 1
gaussdb=# INSERT INTO ta1 values('15-NOV-06', 'Himuro',  30);
INSERT 0 1
gaussdb=# INSERT INTO ta1 values('10-AUG-07', 'Colmenares',  30);
INSERT 0 1
gaussdb=# INSERT INTO ta1 values('10-MAY-07', 'yq',  11);
INSERT 0 1
gaussdb=# INSERT INTO ta1 values('10-MAY-08', 'zi',  11);
INSERT 0 1
gaussdb=# INSERT INTO ta1 values('', 'yq1',  30);
INSERT 0 1
gaussdb=# INSERT INTO ta1 values(null, 'yq2',  30);
INSERT 0 1
gaussdb=# INSERT INTO ta1 values('10-DEC-07', 'yq3',  30);
INSERT 0 1
gaussdb=# INSERT INTO ta1 values(null, null,  11);
INSERT 0 1
gaussdb=# INSERT INTO ta1 values(null, null,  11);
INSERT 0 1

-- 调用LAG,指定offset=3, default=null
gaussdb=# SELECT hire_date, last_name, department_id, lag(hire_date, 3, null)  OVER (PARTITION BY department_id ORDER BY last_name) AS "NextHired" FROM ta1 ORDER BY department_id;
      hire_date      | last_name  | department_id |      NextHired
---------------------+------------+---------------+---------------------
 2007-05-10 00:00:00 | yq         |            11 |
 2008-05-10 00:00:00 | zi         |            11 |
                     |            |            11 |
                     |            |            11 | 2007-05-10 00:00:00
 2005-12-24 00:00:00 | Baida      |            30 |
 2007-08-10 00:00:00 | Colmenares |            30 |
 2006-11-15 00:00:00 | Himuro     |            30 |
 2003-05-18 00:00:00 | Khoo       |            30 | 2005-12-24 00:00:00
 2002-12-07 00:00:00 | Raphaely   |            30 | 2007-08-10 00:00:00
 2005-07-24 00:00:00 | Tobias     |            30 | 2006-11-15 00:00:00
                     | yq1        |            30 | 2003-05-18 00:00:00
                     | yq2        |            30 | 2002-12-07 00:00:00
 2007-12-10 00:00:00 | yq3        |            30 | 2005-07-24 00:00:00
(13 rows)

示例2:开启ignore nulls功能,设置offset=3, default='01-JAN-00'

gaussdb=# SELECT hire_date, last_name, department_id, lag(hire_date, 3, '01-JAN-00') ignore nulls OVER (PARTITION BY department_id ORDER BY last_name) AS "NextHired"
 FROM ta1 ORDER BY department_id;
      hire_date      | last_name  | department_id |      NextHired
---------------------+------------+---------------+---------------------
 2007-05-10 00:00:00 | yq         |            11 | 2000-01-01 00:00:00
 2008-05-10 00:00:00 | zi         |            11 | 2000-01-01 00:00:00
                     |            |            11 | 2000-01-01 00:00:00
                     |            |            11 | 2000-01-01 00:00:00
 2005-12-24 00:00:00 | Baida      |            30 | 2000-01-01 00:00:00
 2007-08-10 00:00:00 | Colmenares |            30 | 2000-01-01 00:00:00
 2006-11-15 00:00:00 | Himuro     |            30 | 2000-01-01 00:00:00
 2003-05-18 00:00:00 | Khoo       |            30 | 2005-12-24 00:00:00
 2002-12-07 00:00:00 | Raphaely   |            30 | 2007-08-10 00:00:00
 2005-07-24 00:00:00 | Tobias     |            30 | 2006-11-15 00:00:00
                     | yq1        |            30 | 2003-05-18 00:00:00
                     | yq2        |            30 | 2003-05-18 00:00:00
 2007-12-10 00:00:00 | yq3        |            30 | 2003-05-18 00:00:00
(13 rows)

-- 删除表
gaussdb=# DROP TABLE ta1;
DROP TABLE

LEAD

描述:LEAD函数为各组内对应值生成提前值。即当前值对应的行数向后偏移offset位后所得行的value值作为序号。若经过向后偏移后行数超过当前组内的总行数,则对应结果取为default值。若无指定,在默认情况下,offset取为1,default值取为NULL。default值的类型需要与value值的类型保持一致。

语法:

LEAD(value any [, offset integer [, default any ]])
LEAD(value any ignore|respect nulls [, offset integer [, default any ]])
LEAD(value any [, offset integer [, default any ]]) ignore|respect nulls

ignore|respect nulls表示是否将NULL包含在向后偏移值中。若无指定,则默认是respect nulls。若设置为ignore nulls,当value为NULL时,则将其不包含在向后偏移值中。如果开启了ignore nulls功能,则该函数会存在性能劣化。

返回值类型:与参数数据类型相同。

示例1:不开启ignore nulls功能,设置offset=2,不指定default

-- 建表并插入数据
gaussdb=# CREATE TABLE ta1 (hire_date date, last_name varchar(20), department_id int);
CREATE TABLE
gaussdb=# INSERT INTO ta1 values('07-DEC-02', 'Raphaely', 30);
INSERT 0 1
gaussdb=# INSERT INTO ta1 values('24-JUL-05', 'Tobias',  30);
INSERT 0 1
gaussdb=# INSERT INTO ta1 values('24-DEC-05', 'Baida',  30);
INSERT 0 1
gaussdb=# INSERT INTO ta1 values('18-MAY-03', 'Khoo', 30);
INSERT 0 1
gaussdb=# INSERT INTO ta1 values('15-NOV-06', 'Himuro',  30);
INSERT 0 1
gaussdb=# INSERT INTO ta1 values('10-AUG-07', 'Colmenares',  30);
INSERT 0 1
gaussdb=# INSERT INTO ta1 values('10-MAY-07', 'yq',  11);
INSERT 0 1
gaussdb=# INSERT INTO ta1 values('10-MAY-08', 'zi',  11);
INSERT 0 1
gaussdb=# INSERT INTO ta1 values('', 'yq1',  30);
INSERT 0 1
gaussdb=# INSERT INTO ta1 values(null, 'yq2',  30);
INSERT 0 1
gaussdb=# INSERT INTO ta1 values('10-DEC-07', 'yq3',  30);
INSERT 0 1
gaussdb=# INSERT INTO ta1 values(null, null,  11);
INSERT 0 1
gaussdb=# INSERT INTO ta1 values(null, null,  11);
INSERT 0 1

-- 调用LEAD,指定offset=2
gaussdb=# SELECT hire_date, last_name, department_id, lead(hire_date, 2) OVER (PARTITION BY department_id ORDER BY last_name) AS "NextHired" FROM ta1 ORDER BY department_id;
      hire_date      | last_name  | department_id |      NextHired
---------------------+------------+---------------+---------------------
 2007-05-10 00:00:00 | yq         |            11 |
 2008-05-10 00:00:00 | zi         |            11 |
                     |            |            11 |
                     |            |            11 |
 2005-12-24 00:00:00 | Baida      |            30 | 2006-11-15 00:00:00
 2007-08-10 00:00:00 | Colmenares |            30 | 2003-05-18 00:00:00
 2006-11-15 00:00:00 | Himuro     |            30 | 2002-12-07 00:00:00
 2003-05-18 00:00:00 | Khoo       |            30 | 2005-07-24 00:00:00
 2002-12-07 00:00:00 | Raphaely   |            30 |
 2005-07-24 00:00:00 | Tobias     |            30 |
                     | yq1        |            30 | 2007-12-10 00:00:00
                     | yq2        |            30 |
 2007-12-10 00:00:00 | yq3        |            30 |
(13 rows)

示例2:开启ignore nulls功能,设置offset=2,default='01-JAN-00'

gaussdb=# SELECT hire_date, last_name, department_id, lead(hire_date, 2, '01-JAN-00') ignore nulls OVER (PARTITION BY department_id ORDER BY last_name) AS "NextHired" FROM ta1 ORDER BY department_id;
      hire_date      | last_name  | department_id |      NextHired
---------------------+------------+---------------+---------------------
 2007-05-10 00:00:00 | yq         |            11 | 2000-01-01 00:00:00
 2008-05-10 00:00:00 | zi         |            11 | 2000-01-01 00:00:00
                     |            |            11 | 2000-01-01 00:00:00
                     |            |            11 | 2000-01-01 00:00:00
 2005-12-24 00:00:00 | Baida      |            30 | 2006-11-15 00:00:00
 2007-08-10 00:00:00 | Colmenares |            30 | 2003-05-18 00:00:00
 2006-11-15 00:00:00 | Himuro     |            30 | 2002-12-07 00:00:00
 2003-05-18 00:00:00 | Khoo       |            30 | 2005-07-24 00:00:00
 2002-12-07 00:00:00 | Raphaely   |            30 | 2007-12-10 00:00:00
 2005-07-24 00:00:00 | Tobias     |            30 | 2000-01-01 00:00:00
                     | yq1        |            30 | 2000-01-01 00:00:00
                     | yq2        |            30 | 2000-01-01 00:00:00
 2007-12-10 00:00:00 | yq3        |            30 | 2000-01-01 00:00:00
(13 rows)

-- 删除表
gaussdb=# DROP TABLE ta1;
DROP TABLE

FIRST_VALUE()

描述:FIRST_VALUE函数获取各组内的第一个值作为返回结果。

返回值类型:与参数数据类型相同。

语法:

LAST_VALUE(value any)
LAST_VALUE(value any IGNORE|RESPECT NULLS)
LAST_VALUE(value any) IGNORE|RESPECT NULLS

IGNORE|RESPECT NULLS:表示是否要忽略NULL值,其中RESPECT NULLS是缺省值。若开启IGNORE NULLS选项,当查询的组内第一个值是NULL时,会继续往后查询,直到查询到一个非NULL值时进行返回;如果一直向后查询到当前组的结尾仍为NULL,则返回NULL。

示例1:不开启IGNORE NULLS,默认不忽略NULL值。

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
gaussdb=# CREATE TABLE first_value_t1(a int, b int);

gaussdb=# INSERT INTO first_value_t1 VALUES(1, NULL),(1,1),(1,1),(1, 2),(1, 3),(2, 4),(2, 5),(3,6);

gaussdb=# SELECT a,b,FIRST_VALUE(b) OVER(PARTITION BY a ORDER BY b) FROM first_value_t1;
 a | b | first_value 
---+---+-------------
 1 | 1 |           1
 1 | 1 |           1
 1 | 2 |           1
 1 | 3 |           1
 1 |   |           1
 2 | 4 |           4
 2 | 5 |           4
 3 | 6 |           6
(8 rows)
gaussdb=# DROP TABLE first_value_t1;

示例2:开启IGNORE NULLS。

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
gaussdb=# CREATE TABLE first_value_t1(a int, b int);

gaussdb=# INSERT INTO first_value_t1 VALUES(1, NULL),(1,1),(1,1),(1, 2),(1, 3),(2, 4),(2, 5),(3,6);

gaussdb=# SELECT a,b,FIRST_VALUE(b) IGNORE NULLS OVER(PARTITION BY a ORDER BY b DESC) FROM first_value_t1;
 a | b | first_value 
---+---+-------------
 1 |   |            
 1 | 3 |           3
 1 | 2 |           3
 1 | 1 |           3
 1 | 1 |           3
 2 | 5 |           5
 2 | 4 |           5
 3 | 6 |           6
(8 rows)

-- 对比不加IGNORE NULLS
gaussdb=# SELECT a,b,FIRST_VALUE(b) OVER (PARTITION BY a ORDER BY b DESC) FROM first_value_t1;
 a | b | first_value 
---+---+-------------
 1 |   |            
 1 | 3 |            
 1 | 2 |            
 1 | 1 |            
 1 | 1 |            
 2 | 5 |           5
 2 | 4 |           5
 3 | 6 |           6
(8 rows)

gaussdb=# DROP TABLE first_value_t1;

LAST_VALUE()

描述:LAST_VALUE函数获取各组内的最后一个值作为返回结果。

返回值类型:与参数数据类型相同。

语法:

LAST_VALUE(value any)
LAST_VALUE(value any IGNORE|RESPECT NULLS)
LAST_VALUE(value any) IGNORE|RESPECT NULLS

IGNORE|RESPECT NULLS:表示是否要忽略NULL值。其中RESPECT NULLS是缺省值。若开启IGNORE NULLS选项,当查询到的组内最后一个值是NULL时,会继续向前查询,直到查询到一个非NULL值时进行返回;如果一直向前查询到当前组的开头仍是NULL值,则返回NULL。

示例1:不开启IGNORE NULLS,默认不忽略NULL值。

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
gaussdb=# CREATE TABLE last_value_t1(a int, b int);

gaussdb=# INSERT INTO last_value_t1 VALUES(1, NULL),(1,1),(1,1),(1, 2),(1, 3),(2, 4),(2, 5),(3,6);

gaussdb=# SELECT a,b,LAST_VALUE(b) OVER(PARTITION BY a ORDER BY b) FROM last_value_t1;
 a | b | last_value 
---+---+------------
 1 | 1 |          1
 1 | 1 |          1
 1 | 2 |          2
 1 | 3 |          3
 1 |   |           
 2 | 4 |          4
 2 | 5 |          5
 3 | 6 |          6
(8 rows)
gaussdb=# DROP TABLE last_value_t1;

示例2:开启IGNORE NULLS。

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
gaussdb=# CREATE TABLE last_value_t1(a int, b int);

gaussdb=# INSERT INTO last_value_t1 VALUES(1, NULL),(1,1),(1,1),(1, 2),(1, 3),(2, 4),(2, 5),(3,6);

gaussdb=# SELECT a,b,LAST_VALUE(b) IGNORE NULLS OVER(PARTITION BY a ORDER BY b) FROM last_value_t1;
 a | b | last_value 
---+---+------------
 1 | 1 |          1
 1 | 1 |          1
 1 | 2 |          2
 1 | 3 |          3
 1 |   |          3
 2 | 4 |          4
 2 | 5 |          5
 3 | 6 |          6
(8 rows)

gaussdb=# DROP TABLE last_value_t1;

NTH_VALUE(value any, nth integer)

描述:NTH_VALUE函数返回该组内的第nth行作为结果。若该行不存在,则默认返回NULL。

返回值类型:与参数数据类型相同。

语法:

NTH_VALUE(value any, nth integer)
NTH_VALUE(value any, nth integer) IGNORE|RESPECT NULLS

IGNORE|RESPECT NULLS:表示是否要忽略NULL值。其中RESPECT NULLS是缺省值。若开启IGNORE NULLS选项,会在每个组从前向后查找非NULL值,直到查询到第nth个非NULL值时进行返回;如果查找到当前组的结尾都没有nth个非NULL值,则返回NULL。

示例1:不开启IGNORE NULLS,默认不忽略NULL值。

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
gaussdb=# CREATE TABLE nth_value_t1(a int, b int);

gaussdb=# INSERT INTO nth_value_t1 VALUES(1, NULL),(1, NULL),(1, NULL),(1,1),(1,1),(1, 2),(1, 3),(2, 4),(2, 5),(3,6);

gaussdb=# SELECT a,b,NTH_VALUE(b, 2) OVER(PARTITION BY a ORDER BY b) FROM nth_value_t1;
 a | b | nth_value 
---+---+-----------
 1 | 1 |         1
 1 | 1 |         1
 1 | 2 |         1
 1 | 3 |         1
 1 |   |         1
 1 |   |         1
 1 |   |         1
 2 | 4 |          
 2 | 5 |         5
 3 | 6 |          
(10 rows)

gaussdb=# DROP TABLE nth_value_t1;

示例2:开启IGNORE NULLS。

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
gaussdb=# CREATE TABLE nth_value_t1(a int, b int);

gaussdb=# INSERT INTO nth_value_t1 VALUES(1, NULL),(1, NULL),(1, NULL),(1,1),(1,1),(1, 2),(1, 3),(2, 4),(2, 5),(3,6);

gaussdb=# SELECT a,b,NTH_VALUE(b, 2) IGNORE NULLS OVER(PARTITION BY a ORDER BY b DESC) FROM nth_value_t1;
 a | b | nth_value 
---+---+-----------
 1 |   |          
 1 |   |          
 1 |   |          
 1 | 3 |          
 1 | 2 |         2
 1 | 1 |         2
 1 | 1 |         2
 2 | 5 |          
 2 | 4 |         4
 3 | 6 |          
(10 rows)

-- 对比不加 ignore nulls
SELECT a,b,NTH_VALUE(b, 2) OVER(PARTITION BY a ORDER BY b DESC) FROM nth_value_t1;
 a | b | nth_value 
---+---+-----------
 1 |   |          
 1 |   |          
 1 |   |          
 1 | 3 |          
 1 | 2 |          
 1 | 1 |          
 1 | 1 |          
 2 | 5 |          
 2 | 4 |         4
 3 | 6 |          
(10 rows)

gaussdb=# DROP TABLE nth_value_t1;

delta

描述:返回当前行和前一行的差值。

参数:numeric

返回值类型:numeric

spread

描述:该函数用于计算某段时间内最大和最小值得差值。

参数:real

返回值类型:real

RATIO_TO_REPORT(column_name)

描述:计算某一列的值占其所属分组中所有值总和的比例。

参数:数值类型,或任意可以隐式转换成数值类型的类型。

返回值类型:入参为float4和float8,返回值类型与入参一致;其余入参类型返回numeric类型。

RATIO_TO_REPORT(column_name)与OVER()一起使用时,OVER()中入参只支持PARTITION BY和NULL。

示例1:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
gaussdb=# CREATE TABLE ratio_to_report_t1(a int, b int);

gaussdb=# INSERT INTO ratio_to_report_t1 VALUES(1,1),(1,1),(1, 2),(1, 3),(2, 4),(2, 5),(3,6);

gaussdb=# SELECT a,b,RATIO_TO_REPORT(b) OVER(PARTITION BY a) FROM ratio_to_report_t1;
 a | b |    ratio_to_report     
---+---+------------------------
 1 | 1 |  .14285714285714285714
 1 | 1 |  .14285714285714285714
 1 | 2 |  .28571428571428571429
 1 | 3 |  .42857142857142857143
 2 | 4 |  .44444444444444444444
 2 | 5 |  .55555555555555555556
 3 | 6 | 1.00000000000000000000
(7 rows)

gaussdb=# DROP TABLE ratio_to_report_t1;

示例2:与其它函数嵌套使用。

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
gaussdb=# CREATE TABLE ratio_to_report_t1(a int, b int);

gaussdb=# INSERT INTO ratio_to_report_t1 VALUES(1,1),(1,1),(1, 2),(1, 3),(2, 4),(2, 5),(3,6);

gaussdb=# SELECT a,b,TO_CHAR(RATIO_TO_REPORT(b) OVER(PARTITION BY a), '$999eeee') FROM ratio_to_report_t1;
 a | b | to_char 
---+---+---------
 1 | 1 |  1e-01
 1 | 1 |  1e-01
 1 | 2 |  3e-01
 1 | 3 |  4e-01
 2 | 4 |  4e-01
 2 | 5 |  6e-01
 3 | 6 |  1e+00
(7 rows)

gaussdb=# DROP TABLE ratio_to_report_t1;

示例3:存储过程调用。

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
gaussdb=# CREATE TABLE ratio_to_report_t1(a int, b int);

gaussdb=# INSERT INTO ratio_to_report_t1 VALUES(1,1),(1,1),(1, 2),(1, 3),(2, 4),(2, 5),(3,6);

gaussdb=# CREATE OR REPLACE PROCEDURE ratio_to_report_proc IS CURSOR cur_1 IS SELECT a,b,RATIO_TO_REPORT(b) OVER(PARTITION BY a) FROM ratio_to_report_t1;
BEGIN
FOR cur IN cur_1 LOOP
RAISE INFO '%', cur.ratio_to_report;
END LOOP;
END;
/

gaussdb=# CALL RATIO_TO_REPORT_PROC();
INFO:  .14285714285714285714
INFO:  .14285714285714285714
INFO:  .28571428571428571429
INFO:  .42857142857142857143
INFO:  .44444444444444444444
INFO:  .55555555555555555556
INFO:  1.00000000000000000000
 ratio_to_report_proc 
----------------------

(1 row)

gaussdb=# DROP PROCEDURE ratio_to_report_proc;

gaussdb=# DROP TABLE ratio_to_report_t1;

KEEP( DENSE_RANK { FIRST | LAST } ORDER BY expr)

描述:将查询结果的数据行按指定规则进行排序,返回第一组或者最后一组的值,可用于聚集函数或窗口函数。

语法:

aggregate_function KEEP ( DENSE_RANK { FIRST | LAST }  ORDER BY expr [ DESC | ASC ] [ NULLS { FIRST | LAST } ]
      [, expr [ DESC | ASC ] [ NULLS { FIRST | LAST } ]]...) 
OVER ( [query_partition_clause] ) ]

返回值类型:在aggregate_function中指定列的数据类型。

示例:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
gaussdb=# CREATE TABLE keep_t1(a int, b int, c int, d int);

gaussdb=# INSERT INTO keep_t1 values(2, 111, 333, 10), (2, 11, 33, 10), (2, 11, 3, 1), (1, 22, 55, 10), (1, 2, 5, 100);

gaussdb=#  SELECT MIN(d) KEEP(DENSE_RANK FIRST ORDER BY b) OVER (PARTITION BY a) MIN_FIRST,
	MIN(b) KEEP(DENSE_RANK LAST ORDER BY b) OVER (PARTITION BY a) MIN_LAST,
	MAX(b) KEEP(DENSE_RANK FIRST ORDER BY b) OVER (PARTITION BY a) MAX_FIRST,
	MAX(b) KEEP(DENSE_RANK LAST ORDER BY b) OVER (PARTITION BY a) MAX_LAST FROM keep_t1;
 min_first | min_last | max_first | max_last
-----------+----------+-----------+----------
       100 |       22 |         2 |       22
       100 |       22 |         2 |       22
         1 |      111 |        11 |      111
         1 |      111 |        11 |      111
         1 |      111 |        11 |      111
(5 rows)
gaussdb=# DROP TABLE keep_t1;

相关文档