Updated on 2024-05-20 GMT+08:00

Dynamic Parameterized Path Pruning

Dynamic parameterized path pruning is supported in the following scenarios:
  1. Supported partitioning types: range partitioning, hash partitioning, and list partitioning.
  2. Supported operator types: index scan, index-only scan, and bitmap scan.
  3. Supported expression types: comparison expression (<, <=, =, >=, >) and logical expression.

Dynamic parameterized path pruning does not support subquery expressions, STABLE and VOLATILE functions, cross-QueryBlock parameterized paths, BitmapOr operator, or BitmapAnd operator.

  • Typical scenarios where dynamic parameterized path pruning is supported are as follows:
    1. Comparison expressions
      -- Create partitioned tables and indexes.
      gaussdb=# CREATE TABLE t1 (c1 INT, c2 INT)
      PARTITION BY RANGE (c1)
      (
          PARTITION p1 VALUES LESS THAN(10),
          PARTITION p2 VALUES LESS THAN(20),
          PARTITION p3 VALUES LESS THAN(MAXVALUE)
      );
      gaussdb=# CREATE TABLE t2 (c1 INT, c2 INT)
      PARTITION BY RANGE (c1)
      (
          PARTITION p1 VALUES LESS THAN(10),
          PARTITION p2 VALUES LESS THAN(20),
          PARTITION p3 VALUES LESS THAN(MAXVALUE)
      );
      gaussdb=# CREATE INDEX t1_c1 ON t1(c1) LOCAL;
      gaussdb=# CREATE INDEX t2_c1 ON t2(c1) LOCAL;
      gaussdb=# CREATE INDEX t1_c2 ON t1(c2) LOCAL;
      gaussdb=# CREATE INDEX t2_c2 ON t2(c2) LOCAL;
      
      gaussdb=# EXPLAIN (VERBOSE ON, COSTS OFF) SELECT /*+ nestloop(t1 t2) indexscan(t1) indexscan(t2) */ * FROM t2 JOIN t1 ON t1.c1 = t2.c1;
                                                                          QUERY PLAN                                                                     
      ---------------------------------------------------------------------------------------------------------------------------------------------------
       Data Node Scan
         Output: t2.c1, t2.c2, t1.c1, t1.c2
         Node/s: All datanodes
         Remote query: SELECT/*+ NestLoop(t1 t2) IndexScan(t1) IndexScan(t2)*/ t2.c1, t2.c2, t1.c1, t1.c2 FROM public.t2 JOIN public.t1 ON t1.c1 = t2.c1
      
       Remote SQL: SELECT/*+ NestLoop(t1 t2) IndexScan(t1) IndexScan(t2)*/ t2.c1, t2.c2, t1.c1, t1.c2 FROM public.t2 JOIN public.t1 ON t1.c1 = t2.c1
       Datanode Name: datanode1
         Nested Loop
           Output: t2.c1, t2.c2, t1.c1, t1.c2
           ->  Partition Iterator
                 Output: t2.c1, t2.c2
                 Iterations: 3
                 ->  Partitioned Index Scan using t2_c1 on public.t2
                       Output: t2.c1, t2.c2
                       Selected Partitions:  1..3
           ->  Partition Iterator
                 Output: t1.c1, t1.c2
                 Iterations: PART
                 ->  Partitioned Index Scan using t1_c1 on public.t1
                       Output: t1.c1, t1.c2
                       Index Cond: (t1.c1 = t2.c1)
                       Selected Partitions:  1 (ppi-pruning)
      
      (23 rows)
      
      gaussdb=# EXPLAIN (VERBOSE ON, COSTS OFF) SELECT /*+ nestloop(t1 t2) indexscan(t1) indexscan(t2) */ * FROM t2 JOIN t1 ON t1.c1 < t2.c1;
                                  QUERY PLAN                             
      -------------------------------------------------------------------
       Streaming (type: GATHER)
         Output: t2.c1, t2.c2, t1.c1, t1.c2
         Node/s: All datanodes
         ->  Nested Loop
               Output: t2.c1, t2.c2, t1.c1, t1.c2
               ->  Streaming(type: BROADCAST)
                     Output: t2.c1, t2.c2
                     Spawn on: All datanodes
                     Consumer Nodes: All datanodes
                     ->  Partition Iterator
                           Output: t2.c1, t2.c2
                           Iterations: 3
                           ->  Partitioned Seq Scan on public.t2
                                 Output: t2.c1, t2.c2
                                 Distribute Key: t2.c1
                                 Selected Partitions:  1..3
               ->  Partition Iterator
                     Output: t1.c1, t1.c2
                     Iterations: PART
                     ->  Partitioned Index Scan using t1_c1 on public.t1
                           Output: t1.c1, t1.c2
                           Distribute Key: t1.c1
                           Index Cond: (t1.c1 < t2.c1)
                           Selected Partitions:  1 (ppi-pruning)
      (24 rows)
      
      gaussdb=# EXPLAIN (VERBOSE ON, COSTS OFF) SELECT /*+ nestloop(t1 t2) indexscan(t1) indexscan(t2) */ * FROM t2 JOIN t1 ON t1.c1 < t2.c1;
                                  QUERY PLAN                             
      -------------------------------------------------------------------
       Streaming (type: GATHER)
         Output: t2.c1, t2.c2, t1.c1, t1.c2
         Node/s: All datanodes
         ->  Nested Loop
               Output: t2.c1, t2.c2, t1.c1, t1.c2
               ->  Streaming(type: BROADCAST)
                     Output: t2.c1, t2.c2
                     Spawn on: All datanodes
                     Consumer Nodes: All datanodes
                     ->  Partition Iterator
                           Output: t2.c1, t2.c2
                           Iterations: 3
                           ->  Partitioned Seq Scan on public.t2
                                 Output: t2.c1, t2.c2
                                 Distribute Key: t2.c1
                                 Selected Partitions:  1..3
               ->  Partition Iterator
                     Output: t1.c1, t1.c2
                     Iterations: PART
                     ->  Partitioned Index Scan using t1_c1 on public.t1
                           Output: t1.c1, t1.c2
                           Distribute Key: t1.c1
                           Index Cond: (t1.c1 > t2.c1)
                           Selected Partitions:  1..3 (ppi-pruning)
      (24 rows)
    2. Logical expressions
      gaussdb=# EXPLAIN (VERBOSE ON, COSTS OFF) SELECT /*+ nestloop(t1 t2) indexscan(t1) indexscan(t2) */ * FROM t2 JOIN t1 ON t1.c1 = t2.c1 AND t1.c2 = 2;
                                                                                 QUERY PLAN                                                                            
      -----------------------------------------------------------------------------------------------------------------------------------------------------------------
       Data Node Scan
         Output: t2.c1, t2.c2, t1.c1, t1.c2
         Node/s: All datanodes
         Remote query: SELECT/*+ NestLoop(t1 t2) IndexScan(t1) IndexScan(t2)*/ t2.c1, t2.c2, t1.c1, t1.c2 FROM public.t2 JOIN public.t1 ON t1.c1 = t2.c1 AND t1.c2 = 2
      
       Remote SQL: SELECT/*+ NestLoop(t1 t2) IndexScan(t1) IndexScan(t2)*/ t2.c1, t2.c2, t1.c1, t1.c2 FROM public.t2 JOIN public.t1 ON t1.c1 = t2.c1 AND t1.c2 = 2
       Datanode Name: datanode1
         Nested Loop
           Output: t2.c1, t2.c2, t1.c1, t1.c2
           ->  Partition Iterator
                 Output: t1.c1, t1.c2
                 Iterations: 3
                 ->  Partitioned Index Scan using t1_c2 on public.t1
                       Output: t1.c1, t1.c2
                       Index Cond: (t1.c2 = 2)
                       Selected Partitions:  1..3
           ->  Partition Iterator
                 Output: t2.c1, t2.c2
                 Iterations: PART
                 ->  Partitioned Index Scan using t2_c1 on public.t2
                       Output: t2.c1, t2.c2
                       Index Cond: (t2.c1 = t1.c1)
                       Selected Partitions:  1..3 (ppi-pruning)
      
      (24 rows)
  • Typical scenarios where dynamic parameterized path pruning is not supported are as follows:
    1. BitmapOr and BitmapAnd operators
      gaussdb=# set enable_seqscan=off;
      gaussdb=# EXPLAIN (VERBOSE ON, COSTS OFF) SELECT /*+ nestloop(t1 t2) */ * FROM t2 JOIN t1 ON t1.c1 = t2.c1 OR t1.c2 = 2;
      WARNING:  Statistics in some tables or columns(public.t2.c1, public.t1.c1, public.t1.c2) are not collected.
      HINT:  Do analyze for them in order to generate optimized plan.
                                    QUERY PLAN
      ----------------------------------------------------------------------
       Streaming (type: GATHER)
         Output: t2.c1, t2.c2, t1.c1, t1.c2
         Node/s: All datanodes
         ->  Nested Loop
               Output: t2.c1, t2.c2, t1.c1, t1.c2
               ->  Streaming(type: BROADCAST)
                     Output: t2.c1, t2.c2
                     Spawn on: All datanodes
                     Consumer Nodes: All datanodes
                     ->  Partition Iterator
                           Output: t2.c1, t2.c2
                           Iterations: 3
                           ->  Partitioned Seq Scan on public.t2
                                 Output: t2.c1, t2.c2
                                 Distribute Key: t2.c1
                                 Selected Partitions:  1..3
               ->  Partition Iterator
                     Output: t1.c1, t1.c2
                     Iterations: 3
                     ->  Partitioned Bitmap Heap Scan on public.t1
                           Output: t1.c1, t1.c2
                           Distribute Key: t1.c1
                           Recheck Cond: ((t1.c1 = t2.c1) OR (t1.c2 = 2))
                           Selected Partitions:  1..3
                           ->  BitmapOr
                                 ->  Partitioned Bitmap Index Scan on t1_c1
                                       Index Cond: (t1.c1 = t2.c1)
                                 ->  Partitioned Bitmap Index Scan on t1_c2
                                       Index Cond: (t1.c2 = 2)
      (29 rows)
    2. Implicit conversion
      gaussdb=# CREATE TABLE t3(c1 TEXT, c2 INT);
      CREATE TABLE
      gaussdb=# EXPLAIN (VERBOSE ON, COSTS OFF) SELECT * FROM t1 JOIN t3 ON t1.c1 = t3.c1;
      WARNING:  Statistics in some tables or columns(public.t1.c1, public.t3.c1) are not collected.
      HINT:  Do analyze for them in order to generate optimized plan.
                                  QUERY PLAN
      -------------------------------------------------------------------
       Streaming (type: GATHER)
         Output: t1.c1, t1.c2, t3.c1, t3.c2
         Node/s: All datanodes
         ->  Nested Loop
               Output: t1.c1, t1.c2, t3.c1, t3.c2
               Join Filter: (t1.c1 = ((t3.c1)::bigint))
               ->  Partition Iterator
                     Output: t1.c1, t1.c2
                     Iterations: 3
                     ->  Partitioned Index Scan using t1_c1 on public.t1
                           Output: t1.c1, t1.c2
                           Distribute Key: t1.c1
                           Selected Partitions:  1..3
               ->  Materialize
                     Output: t3.c1, t3.c2, ((t3.c1)::bigint)
                     ->  Streaming(type: REDISTRIBUTE)
                           Output: t3.c1, t3.c2, ((t3.c1)::bigint)
                           Distribute Key: ((t3.c1)::bigint)
                           Spawn on: All datanodes
                           Consumer Nodes: All datanodes
                           ->  Seq Scan on public.t3
                                 Output: t3.c1, t3.c2, t3.c1
                                 Distribute Key: t3.c1
      (23 rows)
    3. Functions
      gaussdb=# EXPLAIN (VERBOSE ON, COSTS OFF) SELECT * FROM t1 JOIN t3 ON t1.c1 = LENGTHB(t3.c1);
                               QUERY PLAN                          
      -------------------------------------------------------------
       Nested Loop
         Output: t1.c1, t1.c2, t3.c1, t3.c2
         ->  Seq Scan on public.t3
               Output: t3.c1, t3.c2
         ->  Partition Iterator
               Output: t1.c1, t1.c2
               Iterations: 3
               ->  Partitioned Index Scan using t1_c1 on public.t1
                     Output: t1.c1, t1.c2
                     Index Cond: (t1.c1 = lengthb(t3.c1))
                     Selected Partitions:  1..3
      (11 rows)
      
      -- Clean up the environment.
      gaussdb=# DROP TABLE t1;
      gaussdb=# DROP TABLE t2;
      gaussdb=# DROP TABLE t3;