Updated on 2024-06-07 GMT+08:00

Merge Append

Scenario

To globally sort a partitioned table, the SQL engine uses the PI operator and PartitionScan to perform a full scan on the partitioned table before sorting. In this case, it is difficult to perform global sorting based on the data partition algorithm. If the ORDER BY column contains indexes, the existing order cannot be used. To solve this problem, partitioned tables support Merge Append to improve the sorting mechanism.

Example

The following is an example of executing Merge Append:

gaussdb=#
CREATE TABLE test_range_pt (a INT, b INT, c INT)
PARTITION BY RANGE(a)
(
    PARTITION p1 VALUES LESS THAN (2000),
    PARTITION p2 VALUES LESS THAN (3000),
    PARTITION p3 VALUES LESS THAN (4000),
    PARTITION p4 VALUES LESS THAN (5000),
    PARTITION p5 VALUES LESS THAN (MAXVALUE)
)ENABLE ROW MOVEMENT;
INSERT INTO test_range_pt VALUES (generate_series(1,10000),generate_series(1,10000),generate_series(1,10000));
CREATE INDEX idx_range_b ON test_range_pt(b) LOCAL;
ANALYZE test_range_pt;

gaussdb=# EXPLAIN ANALYZE SELECT * FROM test_range_pt WHERE b >10 AND b < 5000 ORDER BY b LIMIT 10;
                                                                          QUERY PLAN                            
--------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.06..1.02 rows=10 width=12) (actual time=0.990..1.041 rows=10 loops=1)
   ->  Result  (cost=0.06..480.32 rows=10 width=12) (actual time=0.988..1.036 rows=10 loops=1)
         ->  Merge Append  (cost=0.06..480.32 rows=10 width=12) (actual time=0.985..1.026 rows=10 loops=1)
               Sort Key: b
               ->  Partitioned Index Scan using idx_range_b on test_range_pt  (cost=0.00..44.61 rows=998 width=12) (actual time=0.256..0.284 rows=10 loops=1)
                     Index Cond: ((b > 10) AND (b < 5000))
                     Selected Partitions:  1
               ->  Partitioned Index Scan using idx_range_b on test_range_pt  (cost=0.00..44.61 rows=998 width=12) (actual time=0.208..0.208 rows=1 loops=1)
                     Index Cond: ((b > 10) AND (b < 5000))
                     Selected Partitions:  2
               ->  Partitioned Index Scan using idx_range_b on test_range_pt  (cost=0.00..44.61 rows=998 width=12) (actual time=0.205..0.205 rows=1 loops=1)
                     Index Cond: ((b > 10) AND (b < 5000))
                     Selected Partitions:  3
               ->  Partitioned Index Scan using idx_range_b on test_range_pt  (cost=0.00..44.61 rows=998 width=12) (actual time=0.212..0.212 rows=1 loops=1)
                     Index Cond: ((b > 10) AND (b < 5000))
                     Selected Partitions:  4
               ->  Partitioned Index Scan using idx_range_b on test_range_pt  (cost=0.00..44.61 rows=998 width=12) (actual time=0.092..0.092 rows=0 loops=1)
                     Index Cond: ((b > 10) AND (b < 5000))
                     Selected Partitions:  5
 Total runtime: 1.656 ms
(20 rows)

-- Disable the Merge Append operator of a partitioned table.
gaussdb=# SET sql_beta_feature = 'disable_merge_append_partition';
SET
gaussdb=# EXPLAIN ANALYZE SELECT * FROM test_range_pt WHERE b >10 AND b < 5000 ORDER BY b LIMIT 10;
                                                                   QUERY PLAN                                   
------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=330.92..330.95 rows=10 width=12) (actual time=6.728..6.730 rows=10 loops=1)
   ->  Sort  (cost=330.92..343.40 rows=10 width=12) (actual time=6.727..6.729 rows=10 loops=1)
         Sort Key: b
         Sort Method: top-N heapsort  Memory: 26kB
         ->  Partition Iterator  (cost=0.00..223.07 rows=4991 width=12) (actual time=0.102..4.503 rows=4989 loops=1)
               Iterations: 5
               ->  Partitioned Index Scan using idx_range_b on test_range_pt  (cost=0.00..223.07 rows=4991 width=12) (actual time=0.253..3.666 rows=4989 loops=5)
                     Index Cond: ((b > 10) AND (b < 5000))
                     Selected Partitions:  1..5
 Total runtime: 6.945 ms
(10 rows)

gaussdb=# DROP TABLE test_range_pt;

The execution cost of Merge Append is much lower than that of the common execution mode.

Precautions and Constraints

  1. Merge Append can be executed only when the partition scanning path is Index/Index Only.
  2. Merge Append can be executed only when the partition pruning result is greater than 1.
  3. Merge Append can be executed only when all partitioned indexes are valid and are B-tree indexes.
  4. Merge Append can be executed only when the SQL statement contains the LIMIT clause.
  5. Merge Append cannot be executed when a filter exists during partition scanning.
  6. The Merge Append path is no longer generated when the GUC parameter sql_beta_feature is set to 'disable_merge_append_partition'.