文档首页> 数据仓库服务 GaussDB(DWS)> IoT数仓> 函数和表达式

更新时间：2024-05-07 GMT+08:00

函数和表达式

时序计算函数一览表

表1 时序计算支持的函数一览
功能	函数
用于计算按照时间排序后两行之间的差值。	delta
该函数用于计算某段时间内最大和最小值的差值。	spread
对给定的列，返回出现频率最高的值，如果多个值频率相同，返回这些值中最小的那个值。	mode()
计算百分位，是percentile_cont的近似算法。	value_of_percentile
给定百分位，计算对应的值。是value_of_percentile的逆运算。	percentile_of_value
通过比较column2列的值，找到其中的最小值，输出对应行column1列的值。	first
通过比较column2列的值，找到其中的最大值，输出对应行column1列的值。	last
用于获取时序表在当前DN节点上tag表的行数，只能在DN节点上使用。	get_timeline_count_internal
用于获取时序表在各个DN节点上tag表的行数，只能在CN节点上使用。	get_timeline_count
用于清理tag表中无用的tagid对应的行数据。	gs_clean_tag_relation
用于迁移单个时序表的分区管理任务，仅在时序表从8.1.1升级到8.1.3版本时使用。	ts_table_part_policy_pgjob_to_pgtask
用于迁移本数据库所有时序表的分区管理任务，仅在时序表从8.1.1升级到8.1.3版本时使用。	proc_part_policy_pgjob_to_pgtask
用于打印SQL语句，每条语句可用于迁移单个时序表的分区管理任务，仅在时序表从8.1.1升级到8.1.3版本时使用。	print_sql_part_policy_pgjob_to_pgtask

表2 按照时间填充的表达式
功能	表达式
对数据按照时间列排序后，补充缺失的时间数据信息（聚合以后的结果），补充的方法是用按照时间排序后的前值填充后值。	time_fill(interval, time_column, start_time, end_time), fill_last(agg_function(agg_column))
对数据按照时间列排序后，补充缺失的时间数据信息（聚合以后的结果），补充的方法是用按照时间排序后的后值填充前值。	time_fill(interval, time_column, start_time, end_time), fill_first(agg_function(agg_column))
对数据按照时间列排序后，补充缺失的时间数据信息（聚合以后的结果），补充的方法是用按照时间排序后前后两者的值填充当前值。	time_fill(interval, time_column, start_time, end_time), fill_avg(agg_function(agg_column))

表3 参数说明
参数名	类型	描述	Required/Option
interval	时间间隔类型：INTERVAL，最小单位是1秒。	按照时间分组的时间间隔。	Required
time_column	时间类型，timestamp/timestamptz。	按照指定列做时间分组。	Required
start_time	时间类型，timestamp/timestamptz。	分组的起始时间。	Required
end_time	时间类型，timestamp/timestamptz。	分组的结束时间。	Required
agg_function(agg_column))	指定Agg函数对指定列做聚合。比如max(col)。	对Agg的结果按照指定的填充方法填充。	Required

time_fill函数需要作为聚合函数使用，group by需要引用自身计算结果，不支持与自身嵌套使用，不支持单条查询内多次调用，不支持作为下层计算节点使用，不支持与within group使用。
start时间戳的值必须小于finish时间戳的值，且两者间距需要大于window_width的值。
所有参数不支持空值，start和finish需为确定值。
time_fill在使用时，必须与fill_avg、fill_first或者fill_last配合使用；或者与Agg函数组合使用。
time_fill必须出现在group by中，并且group by中只能出现这一列。
time_fill不支持出现在处理select后面的其他位置，比如where后面或其他关联条件中。

示例：

创建表，并且插入数据：

   
        create table dcs_cpu(
idle real TSField,
vcpu_num int TSTag,
node text TSTag,
scope_name text TSTag,
server_ip text TSTag,
iowait numeric TSField,
time_string timestamp TSTime
)with (TTL='7 days', PERIOD = '1 day', orientation=timeseries) distribute by hash(node);
insert  into dcs_cpu VALUES(1.0,1,'node_a','scope_a','1.1.1.1',1.0,'2019-07-12 00:10:10');
insert  into dcs_cpu VALUES(2.0,2,'node_b','scope_a','1.1.1.2',2.0,'2019-07-12 00:12:10');
insert  into dcs_cpu VALUES(3.0,3,'node_c','scope_b','1.1.1.3',3.0,'2019-07-12 00:13:10');

以1min为单位，求分组内的均值。用前一时间段的值，填充后一时间段的值：

   
        select time_fill(interval '1 min',time_string,'2019-07-12 00:09:00','2019-07-12 00:14:00'), fill_last(avg(idle)) from dcs_cpu group by time_fill order by time_fill;
time_fill                  | fill_last
--------------------------+-----------
Fri Jul 12 00:09:00 2019 |
Fri Jul 12 00:10:00 2019 |         1
Fri Jul 12 00:11:00 2019 |         1
Fri Jul 12 00:12:00 2019 |         2
Fri Jul 12 00:13:00 2019 |         3
Fri Jul 12 00:14:00 2019 |         3
(6 rows)

以1min为单位，求分组内的均值。用后一时间段的值，填充前一时间段的值：

   
        select time_fill(interval '1 min',time_string,'2019-07-12 00:09:00','2019-07-12 00:14:00'), fill_first(avg(idle)) from dcs_cpu group by time_fill order by time_fill;
time_fill                  | fill_first
--------------------------+------------
Fri Jul 12 00:09:00 2019 |          1
Fri Jul 12 00:10:00 2019 |          1
Fri Jul 12 00:11:00 2019 |          2
Fri Jul 12 00:12:00 2019 |          2
Fri Jul 12 00:13:00 2019 |          3
Fri Jul 12 00:14:00 2019 |
(6 rows)

以1min为单位，求分组内的平均值，用前后两个时间的加权平均值，填充当前值：

   
        select time_fill(interval '1 min',time_string,'2019-07-12 00:09:00','2019-07-12 00:14:00'), fill_avg(avg(idle)) from dcs_cpu group by time_fill order by time_fill;
time_fill                  | fill_avg
--------------------------+----------
Fri Jul 12 00:09:00 2019 |        1
Fri Jul 12 00:10:00 2019 |        1
Fri Jul 12 00:11:00 2019 |      1.5
Fri Jul 12 00:12:00 2019 |        2
Fri Jul 12 00:13:00 2019 |        3
Fri Jul 12 00:14:00 2019 |        3
(6 rows)

delta(field numeric)

用于计算按照时间排序后两行之间的差值。

表4 参数说明
参数名	类型	描述	Required/Option
field	数值型	需要计算的列。	Required

该函数通常用于时序场景计算按照时间排序后相邻两行插值，用于流量，速度等指标监控。
delta是一个窗口函数，需要与over窗口函数使用。并且，over中rows语句不会改变delta函数结果，比如delta(value) over(order by time rows 1 preceding) 和 delta(value) over(order by time rows 3 preceding) 返回的结果是一致的。

示例：

SELECT
     delta(value) over (rows 1 preceding)
FROM 
     (VALUES ('2019-07-12 00:00:00'::timestamptz, 1),('2019-07-12 00:01:00'::timestamptz, 2),('2019-07-12 00:02:00'::timestamptz, 3)) v(time,value);

spread(field numeric)

该函数用于计算某段时间内最大和最小值的差值。

表5 参数说明
参数名	类型	描述	Required/Option
field	数值型	需要计算的列。	Required

该函数用于时序场景计算每个指标的增量，通常按照时间排序后计算。
每个分组内如果少于2个元组，返回结果为0，不要和over窗口函数混用。

示例：

SELECT
     SPREAD(value)
FROM 
     (VALUES ('2019-07-12 00:00:00'::timestamptz, 1),('2019-07-12 00:01:00'::timestamptz, 2),('2019-07-12 00:02:00'::timestamptz, 3)) v(time,value);

mode() within group (order by value anyelement)

对给定的列，返回出现频率最高的值，如果多个值频率相同，返回这些值中最小的那个值。

表6 参数说明
参数名	类型	描述	Required/Option
value	anyelement	查询列。	Required

需要与within group一起使用，无within group语句，会报错，该函数参数放在group的order by后面。
不能和over子句一起使用。

示例：

SELECT
    mode() within group (order by value)
FROM
    (VALUES ('2019-07-12 00:00:00'::timestamptz, 1),('2019-07-12 00:01:00'::timestamptz, 2),('2019-07-12 00:02:00'::timestamptz, 3)) v(time,value);

value_of_percentile(column float, percentile float, compression float)

对于给定的列按照从小到大的顺序返回百分位的近似值，是percentile_cont的近似结果，但是性能比percentile_cont好。

表7 参数说明
参数名	类型	描述	Required/Option
column	float	要计算百分位的列。	Required
percentile	float	指定的百分位的值，取值范围[0，1]。	Required
compression	float	指定的压缩系数，取值范围为[0，500]，默认值是300，这个值取值越大，函数计算过程占用内存越大，同时结果的精度也相对高。如果指定的值不在取值范围内，会按照300求解。	Required

示例：

SELECT value_of_percentile(values, 0.8, 0) from TABLE;

percentile_of_value(column float, percentilevalue float, compression float)

对于给定的列按照从小到大的顺序返回百分位。是value_of_percentile的逆过程。

表8 参数说明
参数名	类型	描述	Requried/Option
column	float	要计算百分位的列。	Required
percentilevalue	float	指定该值，用于计算它所在的百分位。	Required
compression	float	指定的压缩系数，取值范围为[0，500]，默认值是300，这个值取值越大，函数计算过程占用内存越大，同时结果的精度也相对高。如果指定的值不在取值范围内，会按照300求解。	Required

示例：

SELECT percentile_of_value(values, 80, 0) from TABLE;

first(column1, column2)

聚合函数。通过比较分组内column2列的值，找到其中的最小值，输出对应column1列的值。

表9 参数说明
参数名	类型	描述	Requried/Option
column1	bigint/text/double/numeric	最终的输出列。	Required
column2	timestamp/timestamptz/numeric	比较列。	Required

示例：复用time_fill表达式中的表定义和数据。

求按照scope_name分组，每个分组内按照时间排序最靠前的idle的值：

    
         select first(idle, time_string) from dcs_cpu group by scope_name;
first
-------
1
3
(2 rows)

last(column1, column2)

聚合函数。通过比较分组内column2列的值，找到其中的最大值，输出对应column1列的值。

**表10** 参数说明
参数名	类型	描述	Requried/Option
column1	bigint/text/double/numeric	最终的输出列。	Required
column2	timestamp/timestamptz/numeric	比较列。	Required

示例：复用time_fill表达式中的表定义和数据

求按照scope_name分组，每个分组内按照时间排序最靠后的idle的值：

    
         select last(idle, time_string) from dcs_cpu group by scope_name;
last
------
2
3
(2 rows)

get_timeline_count_internal(schema_name text, rel_name text)

函数用于获取时序表在当前DN节点上tag表的行数，只能在DN节点上使用。

参数名	类型	描述	Required/Option
schema_name	text	时序表所属schema的名称。	Required
rel_name	text	时序表的表名。	Required

示例：

创建表，并且插入数据：

    
         CREATE TABLE IF NOT EXISTS CPU(
scope_name text TSTag,
server_ip text TSTag,
group_path text TSTag,
time timestamptz TSTime,
idle numeric TSField
) with (orientation=TIMESERIES) distribute by hash(scope_name);
insert into CPU values('dcxtataetaeta','10.145.255.33','saetataetaeta','2020-04-07 17:12:09+08',  60639);
insert into CPU values('wrhtataetaeta','10.145.255.33','saetataetaeta','2020-04-07 17:12:09+08',  53311);
insert into CPU values('saetataetaeta','10.145.255.33','saetataetaeta','2020-04-07 17:12:09+08',  27101);
insert into CPU values('saetataetaeta','10.145.255.33','saetataetaeta','2020-04-07 17:12:09+08',  48005);

数据从delta表进入CU后，连接DN节点，执行该函数：

    
         select get_timeline_count_internal('public', 'cpu');
get_timeline_count_internal
-----------------------------
2
(1 row)

get_timeline_count(relname regclass)

函数用于获取时序表在各个DN节点上tag表的行数，只能在CN节点上使用。

参数名	类型	描述	Required/Option
relname	regclass	时序表的名称。	Required

示例：

建表和导入数据与get_timeline_count_internal函数实例相同，连接CN节点，执行该函数。

    
         select get_timeline_count('cpu');
get_timeline_count
--------------------
(dn_1,2)
(dn_2,1)
(2 rows)

gs_clean_tag_relation(tagOid oid)

函数用于清理tag表中无用的tagid对应的行数据。由于分区的自动删除，主表中的数据已经被清理，长期使用可能导致tag表中存在一些废弃数据，可以通过调用该函数，将长期以来不使用的tag表中的行数据进行清理，提高tag表的利用率。返回值为成功清理tag表的行数。

参数说明

参数名	类型	描述	Required/Option
tagOid	oid	淘汰指定tag表中无用的数据。	Required

示例：

    
     
       
       
         CREATE TABLE IF NOT EXISTS CPU(
scope_name text TSTag,
server_ip text TSTag,
group_path text TSTag,
time timestamptz TSTime,
idle numeric TSField,
system numeric TSField,
util numeric TSField,
vcpu_num numeric TSField,
guest numeric TSField,
iowait numeric TSField,
users numeric TSField) with (orientation=TIMESERIES) distribute by hash(scope_name);
SELECT oid FROM PG_CLASS WHERE relname='cpu';
  oid
-------
 19099
(1 row)
SELECT gs_clean_tag_relation(19099);
 gs_clean_tag_relation
-----------------------
                     0
(1 row)

        

      

    
   

ts_table_part_policy_pgjob_to_pgtask(schemaName text, tableName text)

该函数仅在时序表从8.1.1升级到8.1.3版本时使用，用于迁移单个时序表的分区管理任务。8.1.1版本时序表的分区管理任务在pg_jobs表，而8.1.3版本的时序表分区管理任务在pg_task表，在8.1.1版本升级到8.1.3版本时，需要将时序表的分区管理任务从pg_jobs迁移到pg_task中。该函数只迁移时序表分区管理任务，迁移完成后将原有的pg_jobs任务设置为broken状态。

参数名	类型	描述	Required/Option
schemaName	text	时序表所属schema的名称。	Required
tableName	text	时序表的名称。	Required

示例：

CALL ts_table_part_policy_pgjob_to_pgtask('public','cpu1');
WARNING:  The job on pg_jobs is migrated to pg_task, and the original job is broken, the job what is call proc_drop_partition('public.cpu1', interval '7 d'); , the job interval is interval '1 day'.
WARNING:  The job on pg_jobs is migrated to pg_task, and the original job is broken, the job what is call proc_add_partition('public.cpu1', interval '1 d'); , the job interval is interval '1 day'.
ts_table_part_policy_pgjob_to_pgtask

----------------------------------

(1 row)

proc_part_policy_pgjob_to_pgtask()

该函数仅在时序表从8.1.1升级到8.1.3版本时使用，函数用于迁移本数据库所有8.1.1版本时序表的分区管理任务。本函数将遍历本数据库中所有时序表，并检查时序表的分区管理任务是否迁移，如果没有迁移，则调用ts_table_part_policy_pgjob_to_pgtask函数，迁移该时序表的分区管理任务。如果中途出现迁移失败，则整体回滚。

示例：

CALL proc_part_policy_pgjob_to_pgtask();
NOTICE:  find table, name is cpu1, namespace is public.
WARNING:  The job on pg_jobs is migrated to pg_task, and the original job is broken, the job what is call proc_drop_partition('public.cpu1', interval '7 d'); , the job interval is interval '1 day'.
CONTEXT:  SQL statement "call ts_table_part_policy_pgjob_to_pgtask('public', 'cpu1');"
PL/pgSQL function proc_part_policy_pgjob_to_pgtask() line 17 at EXECUTE statement
WARNING:  The job on pg_jobs is migrated to pg_task, and the original job is broken, the job what is call proc_add_partition('public.cpu1', interval '1 d'); , the job interval is interval '1 day'.
CONTEXT:  SQL statement "call ts_table_part_policy_pgjob_to_pgtask('public', 'cpu1');"
PL/pgSQL function proc_part_policy_pgjob_to_pgtask() line 17 at EXECUTE statement
NOTICE:  find table, name is cpu2, namespace is public.
WARNING:  The job on pg_jobs is migrated to pg_task, and the original job is broken, the job what is call proc_add_partition('public.cpu2', interval '1 d'); , the job interval is interval '1 day'.
CONTEXT:  SQL statement "call ts_table_part_policy_pgjob_to_pgtask('public', 'cpu2');"
PL/pgSQL function proc_part_policy_pgjob_to_pgtask() line 17 at EXECUTE statement
proc_part_policy_pgjob_to_pgtask
--------------------------------------

(1 row)

print_sql_part_policy_pgjob_to_pgtask()

该函数仅在时序表从8.1.1升级到8.1.3版本时使用，该函数用于打印SQL语句，每条语句可用于迁移单个8.1.1版本时序表的分区管理任务。由于proc_part_policy_pgjob_to_pgtask函数的迁移粒度是数据库级别，因此引入本函数，使用者可以手动执行本函数的打印内容，以实现单个时序表的迁移粒度。

示例：

CALL print_sql_part_policy_pgjob_to_pgtask();
call ts_table_part_policy_pgjob_to_pgtask('public', 'cpu1');
call ts_table_part_policy_pgjob_to_pgtask('public', 'cpu2');
print_sql_part_policy_pgjob_to_pgtask
---------------------------------------

(1 row)

上一篇：CREATE INDEX

下一篇：IoT场景下GUC参数

意见反馈

文档内容是否对您有帮助？

有帮助没帮助

提供反馈

提交成功！非常感谢您的反馈，我们会继续努力做到更好！您可在我的云声建议查看反馈及问题处理状态。

系统繁忙，请稍后重试

在使用文档中是否遇到以下问题

内容与产品页面不一致

内容不易理解

缺失示例代码

步骤不可操作

搜不到想要的内容

缺少最佳实践

意见反馈（选填）

0/500

请至少选择一项反馈信息并填写问题反馈

字符长度不能超过500

直接提交取消

如您有其它疑问，您也可以通过华为云社区问答频道来与我们联系探讨

智能客服提问云社区提问