更新时间:2025-12-10 GMT+08:00
分享

表达式分区

在处理大规模数据集时,数据分区是提高查询性能的关键技术之一。然而,传统的分区方式需要用户手动定义分区逻辑,这不仅增加了用户的负担,还可能导致分区策略不够灵活。表达式分区用于支持Iceberg的隐藏分区功能,其核心设计是将“分区逻辑”从用户视角隐藏,交给Iceberg元数据自动管理。如何确保数据分区既高效又灵活?通过使用表达式分区,用户可以无需关心复杂的分区逻辑,Iceberg将自动根据数据特征和查询模式优化分区策略,从而提高查询性能和数据管理效率。

约束限制

  • 最多支持4个分区列。
  • 不支持void表达式。
  • 所有分区表达式不支持复杂数据类型。
  • truncate/bucket表达式不支持二进制数据。
  • INSERT INTO PARTITION功能仅支持identity表达式,不支持其他表达式。
  • Iceberg约束:同一个表列,不能出现在year/month/day/hour两个时间表达式中。 但是,可以同时出现在时间和bucket/truncate表达式中。
  • Iceberg约束:对分区键有默认命名规则,不允许与表的列名称与之冲突。
  • 当前不支持分区演进功能。

支持的分区表达式

FabricSQL要支持分区表达式以及对应数据类型列表如下:

表1 FabricSQL分区支持的表达式

表达式名称

支持的数据类型

计算结果的类型

说明

identity

smallint, int, bigint, long, numeric, date, timestamp,

text, varchar, bpchar

原数据类型

不支持float,double,boolean, bytea。

bucket[N]

smallint, int, bigint, long, numeric,

date, timestamp

text, bpchar, varchar

int

  • 先计算输出数值的hash值,再求模。 计算结果为 [0, N)范围之内。
  • 哈希算法采用的是MurmurHash3算法。

truncate[W]

smallint, int, bigint, long,numeric,

text, bpchar, varchar

原数据类型

year

date, timestamp

int

例:year('2025-10-10') 计算结果为 '2025'年。

month

date, timestamp

int

例:month('2025-10-10') 计算结果为'2025-10'月。

day

date, timestamp

int

例:day('2025-10-10') 计算结果为'2025-10-10'日。

hour

timestamp

int

例:hour('2025-10-10 10:10:10.000') 计算结果为'2025-10-10 10:00:00'时。

void

不支持

不支持

本质上是将所有数值映射到null值上。

所有表达式对于null值的处理结果必须返回null值。

identity表达式对输入的数值不做任何处理,直接输出原数值。

year表达式计算的是相对1970年、以年为单位的数值。month表达式计算的是相对1970-01-01、以月为单位的数值。day表达式计算的是相对1970-01-01、以天为单位的数值。hour表达式计算的是相对1970-01-01 00:00:00、以小时为单位的数值。Iceberg在记录分区值结果时,并不会直接记录整数数值,而是将它转换为可阅读的时间字符串。例如,Iceberg会将year('2025-10-10') 表达式计算出来的分区值记录为'2025'年,而不是整数55。

FabricSQL提供了SQL函数fq_iceberg_year()、fq_iceberg_month()、fq_iceberg_day()、fq_iceberg_hour()、fq_iceberg_truncate()、fq_iceberg_bucket()用于实现对应分区表达式year、month、day、hour、truncate、bucket的计算功能。

分区表达式:truncate

对于不同的数据类型,truncate表达式的计算方法不同,可以使用函数fq_iceberg_truncate() 进行运算求值。

对于整数类型,truncate(W, integer_value)返回的是不超过integer_value、且为W倍数的最大数值。 示例如下:

fabricsql=>select fq_iceberg_truncate (10, -2) ;
 fq_iceberg_truncate
---------------------
                 -10
(1 row)

fabricsql=> select fq_iceberg_truncate (10, 0) ;
 fq_iceberg_truncate
---------------------
                   0
(1 row)

fabricsql=> select fq_iceberg_truncate (10, 9) ;
 fq_iceberg_truncate
---------------------
                   0
(1 row)

fabricsql=> select fq_iceberg_truncate (10, 10) ;
 fq_iceberg_truncate
---------------------
                  10
(1 row)

fabricsql=> select fq_iceberg_truncate (10, 19) ;
 fq_iceberg_truncate
---------------------
                  10
(1 row)

对于字符串类型,truncate(W, string_value)返回的是string_value字符串中前缀长度为W的前缀子字符串。注意,长度以字符为单位。 示例如下:

fabricsql=> select *, length(t)  from fq_iceberg_truncate (2, '') as t;
 iceberg_truncate | length
------------------+--------
                  |      0
(1 row)

fabricsql=> select *, length(t)  from fq_iceberg_truncate (2, 'a') as t;
 iceberg_truncate | length
------------------+--------
 a                |      1
(1 row)

fabricsql=> select *, length(t)  from fq_iceberg_truncate (2, 'tt') as t;
 iceberg_truncate | length
------------------+--------
 tt               |      2
(1 row)

fabricsql=> select *, length(t)  from fq_iceberg_truncate (2, 'cow') as t;
 iceberg_truncate | length
------------------+--------
 co               |      2
(1 row)

fabricsql=> select *, length(t)  from fq_iceberg_truncate (2, '你好中国!') as t;
 iceberg_truncate | length
------------------+--------
 你好             |      2
(1 row)

对于numeric类型,truncate(W, num_value)返回的是不超过num_value、且小数点右移scale位后为W倍数的最大数值。示例如下:

fabricsql=> select fq_iceberg_truncate (50, 10.64::numeric(15,3));
 fq_iceberg_truncate
---------------------
              10.600
(1 row)

fabricsql=> select fq_iceberg_truncate (50, 10.66::numeric(15,3));
 fq_iceberg_truncate
---------------------
              10.650
(1 row)

fabricsql=> select fq_iceberg_truncate (50, -10.66::numeric(15,3));
 fq_iceberg_truncate
---------------------
             -10.700
(1 row)

创建表达式分区表

CREATE TABLE语法详情参考CREATE TABLE

创建一个表达式分区的Iceberg managed表SQL语句如下:

fabricsql=> set current_schema to test;
fabricsql=> CREATE TABLE iceberg_table(id bigint, data text, category text, ts timestamp, val numeric) partition by (bucket(16, val), truncate(10, id), c3 int comment 'this is c3 field', year(ts) ) store as iceberg;
CREATE TABLE
fabricsql=> select * from pg_get_partition_keys('test', 'iceberg_table');
     table     |  partition_key
---------------+-----------------
 iceberg_table | bucket(16,val)
 iceberg_table | truncate(10,id)
 iceberg_table | c3
 iceberg_table | year(ts)
(4 rows)

在示例中, c3是一个identity分区键。此外,还创建了bucket、truncate、year表达式的三个分区键。Iceberg会对分区键进行内部命名,其命名规则如下表所述。

表2 不同表达式分区键在Iceberg内部的默认命名规则

表达式

默认命名规则

样例

year(source_column)

<source_column>_year

year(ts) -> 'ts_year'

month(source_column)

<source_column>_month

month(ts) -> 'ts_month'

day(source_column)

<source_column>_day

day(ts) -> 'ts_day'

hour(source_column)

<source_column>_hour

hour(ts) -> 'ts_hour'

bucket(N, source_column)

<source_column>_bucket

bucket(ts) -> 'ts_bucket'

truncate(W, source_column)

<source_column>_trunc

truncate(ts) -> 'ts_trunc'

identity(source_column)

<source_column>

identity(ts) -> 'ts'

表的列名称不能够与分区键的名称冲突,否则会报错、建表失败。

fabricsql=> CREATE TABLE iceberg_table(id bigint, data text, category text, ts timestamp, val numeric) partition by (bucket(16, val),  c3 int comment 'this is c3 field', year(ts), ts_year bigint ) store as iceberg;
ERROR:  column "ts_year" specified more than once

Iceberg不允许时间类型的同一列同时出现在year、month、day、hour多个表达式中,但是允许该列出现在truncate、bucket其他表达式中。

fabricsql=> CREATE TABLE iceberg_table(id bigint, data text, category text, ts timestamp, val numeric) partition by (bucket(16, val),  c3 int comment 'this is c3 field', year(ts), month(ts) ) store as iceberg;
ERROR:  java.lang.IllegalArgumentException: Cannot add redundant partition: 1002: ts_year: year(4) conflicts with 1003: ts_month: month(4).

fabricsql=> CREATE TABLE iceberg_table(id bigint, data text, category text, ts timestamp, val numeric) partition by (bucket(16, val),  c3 int comment 'this is c3 field', year(ts), bucket(7, ts) ) store as iceberg;
CREATE TABLE

FabricSQL支持创建分区键的最大数量是4,否则会报错、建表失败。

fabricsql=> CREATE TABLE iceberg_table(id bigint, data text, category text, ts timestamp, val numeric) partition by (bucket(16, val), truncate(10, id), c3 int comment 'this is c3 field', year(ts) ) store as iceberg;
ERROR:  partitioned table should have no more than 4 partition keys

FabricSQL不支持将复杂类型作为分区键,否则会报错、建表失败。

fabricsql=> CREATE TABLE iceberg_table(id bigint, data text, category text, ts timestamp, val numeric) partition by (bucket(16, val), truncate(10, id), c3 struct<c: text, d: int>, year(ts) ) store as iceberg;
ERROR:  Complex type cannot be served as a value-partitioning column.

fabricsql=> CREATE TABLE iceberg_table(a int , b struct<c: text, d: int>) PARTITION BY(bucket(5, b.c), truncate(3, b.d)) store as ICEBERG;
ERROR:  syntax error at or near "."
LINE 1: ... struct<c: text, d: int>) PARTITION BY(bucket(5, b.c), trunc...
                                                             ^

FabricSQL支持创建表达式分区的Iceberg外表,需要满足额外条件:

  • 与OBS上Iceberg表的分区键数量相同。
  • 与OBS上Iceberg表的分区键定义完全一致,包括分区键的顺序、分区键的表达式、分区键操作的表列。

后续小节将以iceberg_test_part_expr表为样例,其表定义如下:

fabricsql=> set current_schema to test;
fabricsql=> CREATE TABLE iceberg_test_part_expr(id bigint, data text, category text, ts date) partition by (c3 int, truncate(4, data), year(ts)) store as iceberg;
CREATE TABLE

导入数据

实现表达式分区的Iceberg表支持数据的单条插入和批量导入功能。

fabricsql=> INSERT INTO iceberg_test_part_expr values(9, 'hellohello',   'shanxi', '2025-08-29', 1);
INSERT 1
fabricsql=> INSERT INTO iceberg_test_part_expr values(9, 'hellohello',   'shanxi', '2025-08-29', 1);
INSERT 1
fabricsql=> INSERT INTO iceberg_test_part_expr values(9, 'he',           'shanxi', '2024-09-29', 20);
INSERT 1
fabricsql=> INSERT INTO iceberg_test_part_expr values(9, 'helloworld',   'shanxi', '2023-06-29', 6);
INSERT 1
fabricsql=> INSERT INTO iceberg_test_part_expr values(9, 'hella',        'shanxi', '2022-05-29', 60);
INSERT 1
fabricsql=> INSERT INTO iceberg_test_part_expr values(9, 'zhangmingmin', 'shanxi', '2025-04-29', 100);
INSERT 1

对于INSERT INTO PARTITION功能,FabricSQL仅支持所有分区键为identity类型的表,并不支持其他类型表达式的分区表。

fabricsql=> INSERT INTO iceberg_test_part_expr PARTITION(c3=1, data='hell', ts='2025') values (9, 'hellohello',   'shanxi', '2025-08-29 01:05:59.550463') ;
ERROR:  INSERT INTO PARTITION clause only support identity partition.

可以使用函数pg_get_partitions()来查看表的所有分区数据,示例如下:

fabricsql=> select partition_name from pg_get_partitions('iceberg_test_part_expr');
      partition_name
--------------------------
 c3=1/data=hell/ts=2025
 c3=20/data=he/ts=2024
 c3=100/data=zhan/ts=2025
 c3=6/data=hell/ts=2023
 c3=60/data=hell/ts=2022
(5 rows)

基于列表达式的查询优化

在支持表达式分区之后,FabricSQL可下推列表达式,从而进行静态分区剪枝的查询优化。 示例如下:

fabricsql=> explain(analyze, predicate_only on, costs off, timing off) select * from iceberg_test_part_expr where c3 = 1;
                                        QUERY PLAN
------------------------------------------------------------------------------------------
  id |                                operation                                 | A-rows
 ----+--------------------------------------------------------------------------+--------
   1 | ->  Row Adapter                                                          |      2
   2 |    ->  Partitioned Vector Foreign Scan on iceberg_test_part_expr         |      2
   3 |       ->  Partitioned Vector Foreign Meta Scan on iceberg_test_part_expr |      2

            Predicate Information (identified by plan id)
 --------------------------------------------------------------------
   2 --Partitioned Vector Foreign Scan on iceberg_test_part_expr
         Filter: (c3 = 1)
         Pushdown Predicate Filter: (c3 = 1)
         Server Type: lf
         DN read from: direct
   3 --Partitioned Vector Foreign Meta Scan on iceberg_test_part_expr
         Filter: (c3 = 1)
         Pushdown Predicate Filter: (c3 = 1)
         Server Type: lf
         Pruning results: (Manifests total: 6, Manifests left: 2)
         DN read from: direct
(19 rows)

可以看到, c3=1列表达式下推后,扫描的分区数量由6个裁减为2个。

基于SQL函数的查询优化

在实现表达式分区之后,FabricSQL支持白名单SQL函数的下推优化。当前主要支持的SQL函数白名单如下:

表3 支持下推优化的SQL函数列表

分区表达式

SQL函数

下推条件

year

year(), date_trunc()

date_trunc()的时间单位是year。

month

date_trunc()

date_trunc()的时间单位是month。

day

date_trunc()

date_trunc()的时间单位是day。

hour

date_trunc()

date_trunc()的时间单位是hour。

truncate

left(), substring()

substring()必须从字符串头开始算起。

字符串函数left()和substring()下推进行静态分区剪枝的示例如下:

fabricsql=> explain(analyze, predicate_only on, costs off, timing off) select * from iceberg_test_part_expr where substring(data, 1, 3) > 'hello';
                                        QUERY PLAN
------------------------------------------------------------------------------------------
  id |                                operation                                 | A-rows
 ----+--------------------------------------------------------------------------+--------
   1 | ->  Row Adapter                                                          |      1
   2 |    ->  Partitioned Vector Foreign Scan on iceberg_test_part_expr         |      1
   3 |       ->  Partitioned Vector Foreign Meta Scan on iceberg_test_part_expr |      1

            Predicate Information (identified by plan id)
 --------------------------------------------------------------------
   2 --Partitioned Vector Foreign Scan on iceberg_test_part_expr
         Filter: ("substring"(data, 1, 3) > 'hello'::text)
         Server Type: lf
         DN read from: direct
   3 --Partitioned Vector Foreign Meta Scan on iceberg_test_part_expr
         Server Type: lf
         Pruning results: (Manifests total: 6, Manifests left: 1)
         DN read from: direct
(16 rows)

fabricsql=> explain(analyze, predicate_only on, costs off, timing off) select * from iceberg_test_part_expr where left(data, 5) < 'hello';
                                        QUERY PLAN
------------------------------------------------------------------------------------------
  id |                                operation                                 | A-rows
 ----+--------------------------------------------------------------------------+--------
   1 | ->  Row Adapter                                                          |      2
   2 |    ->  Partitioned Vector Foreign Scan on iceberg_test_part_expr         |      2
   3 |       ->  Partitioned Vector Foreign Meta Scan on iceberg_test_part_expr |      5

            Predicate Information (identified by plan id)
 --------------------------------------------------------------------
   2 --Partitioned Vector Foreign Scan on iceberg_test_part_expr
         Filter: ("left"(data, 5) < 'hello'::text)
         Server Type: lf
         DN read from: direct
   3 --Partitioned Vector Foreign Meta Scan on iceberg_test_part_expr
         Server Type: lf
         Pruning results: (Manifests total: 6, Manifests left: 5)
         DN read from: direct
(16 rows)

可以看到,substring()函数下推后,分区数量由6个裁减到1个。 而left()函数下推后,分区数量由6个裁减到5个。

时间类型函数date_trunc()下推进行静态分区剪枝的示例如下:

fabricsql=> explain(analyze, predicate_only on, costs off, timing off) select * from iceberg_test_part_expr where date_trunc('year', ts) > '2023-12-30 00:00:00';
                                             QUERY PLAN
-----------------------------------------------------------------------------------------------------
  id |                                operation                                 | A-rows
 ----+--------------------------------------------------------------------------+--------
   1 | ->  Row Adapter                                                          |      4
   2 |    ->  Partitioned Vector Foreign Scan on iceberg_test_part_expr         |      4
   3 |       ->  Partitioned Vector Foreign Meta Scan on iceberg_test_part_expr |      4

                            Predicate Information (identified by plan id)
 ---------------------------------------------------------------------------------------------------
   2 --Partitioned Vector Foreign Scan on iceberg_test_part_expr
         Filter: (date_trunc('year'::text, ts) > '2023-12-30 00:00:00'::timestamp without time zone)
         Server Type: lf
         DN read from: direct
   3 --Partitioned Vector Foreign Meta Scan on iceberg_test_part_expr
         Server Type: lf
         Pruning results: (Manifests total: 6, Manifests left: 4)
         DN read from: direct
(16 rows)

fabricsql=> explain(analyze, predicate_only on, costs off, timing off) select * from iceberg_test_part_expr where year(ts) < 2024;
                                        QUERY PLAN
------------------------------------------------------------------------------------------
  id |                                operation                                 | A-rows
 ----+--------------------------------------------------------------------------+--------
   1 | ->  Row Adapter                                                          |      2
   2 |    ->  Partitioned Vector Foreign Scan on iceberg_test_part_expr         |      2
   3 |       ->  Partitioned Vector Foreign Meta Scan on iceberg_test_part_expr |      2

            Predicate Information (identified by plan id)
 --------------------------------------------------------------------
   2 --Partitioned Vector Foreign Scan on iceberg_test_part_expr
         Filter: (year(ts) < 2024)
         Server Type: lf
         DN read from: direct
   3 --Partitioned Vector Foreign Meta Scan on iceberg_test_part_expr
         Server Type: lf
         Pruning results: (Manifests total: 6, Manifests left: 2)
         DN read from: direct
(16 rows)

可以看到, date_trunc()函数下推后,分区数量由6个裁减到4个。 而year()函数下推后,分区数量由6个裁减到2个。

相关文档