更新时间:2024-05-20 GMT+08:00

PBE动态剪枝

PBE动态剪枝支持范围如下所示:
  1. 支持分区类型:范围分区、哈希分区、列表分区。
  2. 支持表达式类型:比较表达式(<,<=,=,>=,>)、逻辑表达式、数组表达式。
  3. 支持部分隐式类型转换和函数:对于类型可以相互转换的场景和immutable函数可以支持PBE动态剪枝
  • PBE动态剪枝支持表达式、隐式转换、immutable函数,stable函数,不支持子查询表达式和volatile函数。对于stable函数,如to_timestamp等类型转换函数,可能会受GUC参数变化,影响剪枝结果。为了保持性能优化,此情况可以通过analyze表重新生成gplan解决。
  • 由于PBE动态剪枝是基于generic plan的剪枝,所以判断语句是否能PBE动态剪枝时,需要设置参数 plan_cache_mode = 'force_generic_plan',排除custom plan的干扰。
  • PBE动态剪枝支持的典型场景具体示例如下:
    1. 比较表达式
      --创建分区表
      gaussdb=# CREATE TABLE t1 (c1 int, c2 int)
      PARTITION BY RANGE (c1)
      (
          PARTITION p1 VALUES LESS THAN(10),
          PARTITION p2 VALUES LESS THAN(20),
          PARTITION p3 VALUES LESS THAN(MAXVALUE)
      );
      
      gaussdb=# PREPARE p1(int) AS SELECT * FROM t1 WHERE c1 = $1;
      PREPARE
      gaussdb=# EXPLAIN (VERBOSE ON, COSTS OFF) EXECUTE p1(1);
                               QUERY PLAN                         
      ------------------------------------------------------------
       Data Node Scan
         Output: t1.c1, t1.c2
         Node/s: datanode1
         Node expr: $1
         Remote query: SELECT c1, c2 FROM public.t1 WHERE c1 = $1
      
       Remote SQL: SELECT c1, c2 FROM public.t1 WHERE c1 = $1
       Datanode Name: datanode1
         Partition Iterator
           Output: c1, c2
           Iterations: PART
           ->  Partitioned Seq Scan on public.t1
                 Output: c1, c2
                 Filter: (t1.c1 = $1)
                 Selected Partitions:  1 (pbe-pruning)
      
      (16 rows)
    2. 逻辑表达式
      gaussdb=# PREPARE p2(INT, INT) AS SELECT * FROM t1 WHERE c1 = $1 AND c2 = $2;
      PREPARE
      gaussdb=# EXPLAIN (VERBOSE ON, COSTS OFF) EXECUTE p2(1, 2);
                                     QUERY PLAN                               
      ------------------------------------------------------------------------
       Data Node Scan
         Output: t1.c1, t1.c2
         Node/s: datanode1
         Node expr: $1
         Remote query: SELECT c1, c2 FROM public.t1 WHERE c1 = $1 AND c2 = $2
      
       Remote SQL: SELECT c1, c2 FROM public.t1 WHERE c1 = $1 AND c2 = $2
       Datanode Name: datanode1
         Partition Iterator
           Output: c1, c2
           Iterations: PART
           ->  Partitioned Seq Scan on public.t1
                 Output: c1, c2
                 Filter: ((t1.c1 = $1) AND (t1.c2 = $2))
                 Selected Partitions:  1 (pbe-pruning)
      
      (16 rows)
    3. 类型转换触发隐式转换
      gaussdb=# set plan_cache_mode = 'force_generic_plan';
      gaussdb=# PREPARE p3(TEXT) AS SELECT * FROM t1 WHERE c1 = $1;
      PREPARE
      gaussdb=# EXPLAIN (VERBOSE ON, COSTS OFF) EXECUTE p3('12');
                                   QUERY PLAN                             
      --------------------------------------------------------------------
       Data Node Scan
         Output: t1.c1, t1.c2
         Node/s: datanode1
         Node expr: $1
         Remote query: SELECT c1, c2 FROM public.t1 WHERE c1 = $1::bigint
      
       Remote SQL: SELECT c1, c2 FROM public.t1 WHERE c1 = $1::bigint
       Datanode Name: datanode1
         Partition Iterator
           Output: c1, c2
           Iterations: PART
           ->  Partitioned Seq Scan on public.t1
                 Output: c1, c2
                 Filter: (t1.c1 = ($1)::bigint)
                 Selected Partitions:  2 (pbe-pruning)
      
      (16 rows)
  • PBE动态剪枝不支持的典型场景具体示例如下:
    1. 子查询表达式
      gaussdb=# PREPARE p4(INT) AS SELECT * FROM t1 WHERE c1 = ALL(SELECT c2 FROM t1 WHERE c1 > $1);
      PREPARE
      gaussdb=# EXPLAIN (VERBOSE ON, COSTS OFF) EXECUTE p4(1);
                                     QUERY PLAN                                
      -------------------------------------------------------------------------
       Streaming (type: GATHER)
         Output: public.t1.c1, public.t1.c2
         Node/s: All datanodes
         ->  Partition Iterator
               Output: public.t1.c1, public.t1.c2
               Iterations: 3
               ->  Partitioned Seq Scan on public.t1
                     Output: public.t1.c1, public.t1.c2
                     Distribute Key: public.t1.c1
                     Filter: (SubPlan 1)
                     Selected Partitions:  1..3
                     SubPlan 1
                       ->  Materialize
                             Output: public.t1.c2
                             ->  Streaming(type: BROADCAST)
                                   Output: public.t1.c2
                                   Spawn on: All datanodes
                                   Consumer Nodes: All datanodes
                                   ->  Partition Iterator
                                         Output: public.t1.c2
                                         Iterations: 3
                                         ->  Partitioned Seq Scan on public.t1
                                               Output: public.t1.c2
                                               Distribute Key: public.t1.c1
                                               Filter: (public.t1.c1 > 1)
                                               Selected Partitions:  1..3
      (26 rows)
    2. 类型转换无法直接触发隐式转换
      gaussdb=# PREPARE p5(name) AS SELECT * FROM t1 WHERE c1 = $1;
      PREPARE
      gaussdb=# EXPLAIN (VERBOSE ON, COSTS OFF) EXECUTE p5('12');
                                      QUERY PLAN                                
      --------------------------------------------------------------------------
       Data Node Scan
         Output: t1.c1, t1.c2
         Node/s: All datanodes
         Remote query: SELECT c1, c2 FROM public.t1 WHERE c1::text = '12'::text
      
       Remote SQL: SELECT c1, c2 FROM public.t1 WHERE c1::text = '12'::text
       Datanode Name: datanode1
         Partition Iterator
           Output: c1, c2
           Iterations: 3
           ->  Partitioned Seq Scan on public.t1
                 Output: c1, c2
                 Filter: ((t1.c1)::text = '12'::text)
                 Selected Partitions:  1..3
      
      (15 rows)
    3. stable/volatile函数
      gaussdb=# create sequence seq;
      gaussdb=# PREPARE p6(TEXT) AS SELECT * FROM t1 WHERE c1 = currval($1);--volatile函数不支持剪枝
      PREPARE
      gaussdb=# EXPLAIN (VERBOSE ON, COSTS OFF) EXECUTE p6('seq');
                                       QUERY PLAN                                 
      ----------------------------------------------------------------------------
       Data Node Scan
         Output: t1.c1, t1.c2
         Node/s: All datanodes
         Remote query: SELECT c1, c2 FROM ONLY public.t1 WHERE true
         Coordinator quals: ((t1.c1)::numeric = currval(('seq'::text)::regclass))
      
       Remote SQL: SELECT c1, c2 FROM ONLY public.t1 WHERE true
       Datanode Name: datanode1
         Partition Iterator
           Output: c1, c2
           Iterations: 3
           ->  Partitioned Seq Scan on public.t1
                 Output: c1, c2
                 Selected Partitions:  1..3
      
      (15 rows)
      
      --清理示例环境。
      gaussdb=# DROP TABLE t1;