Updated on 2024-05-20 GMT+08:00

Merge Append

Scenario

To globally sort a partitioned table, the SQL engine uses the PI operator and PartitionScan to perform a full scan on the partitioned table before sorting. In this case, it is difficult to perform global sorting based on the data partition algorithm. If the ORDER BY column contains ordered indexes, the existing order cannot be used. To solve this problem, partitioned tables support MergeAppend to improve the sorting mechanism.

Example

The following is an example of executing MergeAppend.

gaussdb=# CREATE TABLE test_range_pt (a INT, b INT, c INT)
PARTITION BY RANGE(a)
(
    PARTITION p1 VALUES LESS THAN (2000),
    PARTITION p2 VALUES LESS THAN (3000),
    PARTITION p3 VALUES LESS THAN (4000),
    PARTITION p4 VALUES LESS THAN (5000),
    PARTITION p5 VALUES LESS THAN (MAXVALUE)
)ENABLE ROW MOVEMENT;
gaussdb=# INSERT INTO test_range_pt VALUES (generate_series(1,10000),generate_series(1,10000),generate_series(1,10000));
gaussdb=# CREATE INDEX idx_range_b ON test_range_pt(b) LOCAL;
gaussdb=# ANALYZE test_range_pt;

gaussdb=# EXPLAIN ANALYZE SELECT * FROM test_range_pt WHERE b >10 AND b < 5000 ORDER BY b LIMIT 10;
                                                                          QUERY PLAN                            
--------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.06..1.02 rows=10 width=12) (actual time=0.990..1.041 rows=10 loops=1)
   ->  Result  (cost=0.06..480.32 rows=10 width=12) (actual time=0.988..1.036 rows=10 loops=1)
         ->  Merge Append  (cost=0.06..480.32 rows=10 width=12) (actual time=0.985..1.026 rows=10 loops=1)
               Sort Key: b
               ->  Partitioned Index Scan using idx_range_b on test_range_pt  (cost=0.00..44.61 rows=998 width=12) (actual time=0.256..0.284 rows=10 loops=1)
                     Index Cond: ((b > 10) AND (b < 5000))
                     Selected Partitions:  1
               ->  Partitioned Index Scan using idx_range_b on test_range_pt  (cost=0.00..44.61 rows=998 width=12) (actual time=0.208..0.208 rows=1 loops=1)
                     Index Cond: ((b > 10) AND (b < 5000))
                     Selected Partitions:  2
               ->  Partitioned Index Scan using idx_range_b on test_range_pt  (cost=0.00..44.61 rows=998 width=12) (actual time=0.205..0.205 rows=1 loops=1)
                     Index Cond: ((b > 10) AND (b < 5000))
                     Selected Partitions:  3
               ->  Partitioned Index Scan using idx_range_b on test_range_pt  (cost=0.00..44.61 rows=998 width=12) (actual time=0.212..0.212 rows=1 loops=1)
                     Index Cond: ((b > 10) AND (b < 5000))
                     Selected Partitions:  4
               ->  Partitioned Index Scan using idx_range_b on test_range_pt  (cost=0.00..44.61 rows=998 width=12) (actual time=0.092..0.092 rows=0 loops=1)
                     Index Cond: ((b > 10) AND (b < 5000))
                     Selected Partitions:  5
 Total runtime: 1.656 ms
(20 rows)

-- Disable the MergeAppend operator of a partitioned table.
gaussdb=# SET sql_beta_feature = 'disable_merge_append_partition';
SET
gaussdb=# EXPLAIN ANALYZE SELECT * FROM test_range_pt WHERE b >10 AND b < 5000 ORDER BY b LIMIT 10;
                                                                   QUERY PLAN                                   
------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=296.85..296.88 rows=10 width=12) (actual time=33.559..33.565 rows=10 loops=1)
   ->  Sort  (cost=296.85..309.33 rows=10 width=12) (actual time=33.555..33.557 rows=10 loops=1)
         Sort Key: b
         Sort Method: top-N heapsort  Memory: 26kB
         ->  Partition Iterator  (cost=0.00..189.00 rows=4991 width=12) (actual time=0.352..27.176 rows=4989 loops=1)
               Iterations: 5
               ->  Partitioned Seq Scan on test_range_pt  (cost=0.00..189.00 rows=4991 width=12) (actual time=16.874..25.637 rows=4989 loops=5)
                     Filter: ((b > 10) AND (b < 5000))
                     Rows Removed by Filter: 5011
                     Selected Partitions:  1..5
 Total runtime: 33.877 ms
(11 rows)

-- Cleanup example
gaussdb=# DROP TABLE test_range_pt;

Executing MergeAppend consumes much less resources than the common execution mode.

Precautions and Constraints

  1. MergeAppend can be executed only when the partition scanning path is Index/Index Only.
  2. MergeAppend can be executed only when the partition pruning result is greater than 1.
  3. MergeAppend can be executed only when all partitioned indexes are valid and are B-tree indexes.
  4. MergeAppend can be executed only when the SQL statement contains the LIMIT clause.
  5. MergeAppend cannot be executed when a filter exists during partition scanning.
  6. The MergeAppend path is no longer generated when the GUC parameter sql_beta_feature is set to 'disable_merge_append_partition'.