Help Center/ GaussDB/ Developer Guide(Centralized_V2.0-8.x)/ SQL Optimization/ Optimization Cases/ Adjusting the GUC Parameter costbased_rewrite_rule for Cost-based Query Rewriting
Updated on 2025-05-29 GMT+08:00

Adjusting the GUC Parameter costbased_rewrite_rule for Cost-based Query Rewriting

Query rewriting allows the optimizer to select different execution paths. However, performance will not be improved after some rules rewrite the SQL statements. A rewriting error may result in a 1,000-fold performance difference. Therefore, in the query rewriting phase, the cost-based evaluation policy needs to be supported for such rules to determine whether to apply the rules to rewrite SQL statements. These rules are controlled by the GUC parameter costbased_rewrite_rule. For details about the controlled rules, see Table 1.

Table 1 Control rules supported by GUC parameter costbased_rewrite_rule

Parameter

Description

none

Does not use any cost-based query rewriting policy.

pullup_subquery

The cost-based query rewriting policy is enabled for the rewriting rules of simple subquery expansion.

pullup_sublink_any_exists

The cost-based query rewriting policy is enabled for the rewriting rules of non-correlated sublinks of the ANY type and related sublinks of the [NOT] EXISTS type in a single or AND condition.

pullup_not_in_sublink

The cost-based query rewriting policy is enabled for the rewriting rules of non-correlated sublinks of the NOT IN type in a single or AND condition. This parameter is mutually exclusive with the disable_pullup_not_in_sublink parameter of the GUC parameter rewrite_rule. When this parameter is enabled, the disable_pullup_not_in_sublink parameter of the GUC parameter rewrite_rule does not take effect.

pullup_expr_sublink

This parameter takes effect in the rewriting scenarios of expression sublinks, non-correlated sublinks of the ANY type, and related sublinks of [NOT] the EXISTS type in the OR condition. This parameter is mutually exclusive with the disable_pullup_expr_sublink, enable_sublink_pullup_enhanced, and magicset parameters of the GUC parameter rewrite_rule. When this parameter is enabled, the related parameters of the GUC parameter rewrite_rule do not take effect.

intargetlist

The cost-based query rewriting policy is enabled for the rewriting rules of the sublinks of related expression in the TargetList. This parameter is mutually exclusive with the intargetlist and magicset parameters of the GUC parameter rewrite_rule. When this parameter is enabled, the intargetlist and magicset parameters of the GUC parameter rewrite_rule do not take effect.

enable_sublink_pullup_enhanced

The cost-based query rewriting policy is enabled for the rewriting rules of expression sublinks in the enhanced scenario. This parameter is affected by the pullup_expr_sublink parameter. This parameter takes effect only when the pullup_expr_sublink parameter is enabled for the expression sublink rewriting scenario in the AND condition. This parameter is mutually exclusive with the enable_sublink_pullup_enhanced parameter of the GUC parameter rewrite_rule. When this parameter is enabled, the enable_sublink_pullup_enhanced parameter of the GUC parameter rewrite_rule does not take effect.

The following uses typical cases to describe the application scenarios of the query rewriting rules controlled by the preceding parameters.

Preparing the Case Environment

To demonstrate rule application scenarios, you need to prepare the following table creation statements:

SET client_min_messages = warning;
SET CLIENT_ENCODING = 'UTF8';
-- Clean the environment.
DROP SCHEMA IF EXISTS costbased_rule_test cascade;
CREATE SCHEMA costbased_rule_test;
SET current_schema = costbased_rule_test;
SET enable_codegen = off;
DROP TABLE IF EXISTS costbased_rule_test.ct1;
DROP TABLE IF EXISTS costbased_rule_test.ct2;
DROP TABLE IF EXISTS costbased_rule_test.ct3;
DROP TABLE IF EXISTS costbased_rule_test.ct4;
-- Create a test table.
CREATE TABLE ct1 (a INT, b INT, c INT, d INT);
CREATE TABLE ct2 (a INT, b INT, c INT, d INT);
CREATE TABLE ct3 (a INT, b INT, c INT, d INT);
CREATE TABLE ct4 (a INT, b INT, c INT, d INT);
CREATE INDEX idx_ct1_b ON ct1(b);
CREATE INDEX idx_ct2_c ON ct2(c);
CREATE INDEX idx_ct3_c ON ct3(c);
-- Insert data.
INSERT INTO ct1 (a, b, c) VALUES (generate_series(1, 100), generate_series(200, 300), left(random()::int, 100));
INSERT INTO ct2 VALUES(1,2,3,4),(3,4,5,6);
INSERT INTO ct3 (a, b, c, d) VALUES (generate_series(1, 10), generate_series(20, 30), left(random()::int, 10), left(random()::int, 10));
-- Update statistics.
ANALYZE ct1;
ANALYZE ct2;
ANALYZE ct3;

Rewriting Rules of Non-correlated Sublinks in the Having Condition

SQL statements can be rewritten when the Having condition contains non-correlated expression sublinks with AGG. If the cost-based query rewriting evaluation policy is disabled, sublinks are rewritten based on rules. The generated plan is as follows:

gaussdb=# SET rewrite_rule=enable_sublink_pullup_enhanced;
SET
gaussdb=# SET costbased_rewrite_rule=none;
SET
gaussdb=# EXPLAIN SELECT ct2.a, SUM(ct2.b) AS value FROM ct2,ct1 WHERE ct2.a = ct1.b GROUP BY ct2.a HAVING SUM(ct2.b) >= (SELECT AVG(ct2.b) FROM ct2,ct1 WHERE ct2.a = ct1.b) ORDER BY value DESC;
                                             QUERY PLAN                                             
----------------------------------------------------------------------------------------------------
 Sort  (cost=33.14..33.14 rows=1 width=12)
   Sort Key: (sum(costbased_rule_test.ct2.b)) DESC
   ->  Nested Loop  (cost=33.04..33.13 rows=1 width=12)
         Join Filter: ((sum(costbased_rule_test.ct2.b)) >= (avg(costbased_rule_test.ct2.b)))
         ->  Aggregate  (cost=16.02..16.03 rows=1 width=36)
               ->  Nested Loop  (cost=0.00..15.52 rows=200 width=4)
                     ->  Seq Scan on ct2  (cost=0.00..1.02 rows=2 width=8)
                     ->  Index Only Scan using idx_ct1_b on ct1  (cost=0.00..6.25 rows=100 width=4)
                           Index Cond: (b = costbased_rule_test.ct2.a)
         ->  HashAggregate  (cost=17.02..17.04 rows=2 width=8)
               Group By Key: costbased_rule_test.ct2.a
               ->  Nested Loop  (cost=0.00..15.52 rows=200 width=8)
                     ->  Seq Scan on ct2  (cost=0.00..1.02 rows=2 width=8)
                     ->  Index Only Scan using idx_ct1_b on ct1  (cost=0.00..6.25 rows=100 width=4)
                           Index Cond: (b = costbased_rule_test.ct2.a)
(15 rows)

When costbased_rewrite_rule is set to pullup_expr_sublink,enable_sublink_pullup_enhanced, the optimizer evaluates whether to rewrite sublinks based on the cost. In this scenario, the plan generated without rewriting SQL statements is better. The generated plan is as follows:

gaussdb=# SET rewrite_rule=enable_sublink_pullup_enhanced;
SET
gaussdb=# SET costbased_rewrite_rule=pullup_expr_sublink,enable_sublink_pullup_enhanced;
SET
gaussdb=# EXPLAIN SELECT ct2.a, SUM(ct2.b) AS value FROM ct2,ct1 WHERE ct2.a = ct1.b GROUP BY ct2.a HAVING SUM(ct2.b) >= (SELECT AVG(ct2.b) FROM ct2,ct1 WHERE ct2.a = ct1.b) ORDER BY value DESC;
                                           QUERY PLAN                                           
------------------------------------------------------------------------------------------------
 Sort  (cost=33.09..33.09 rows=2 width=8)
   Sort Key: (sum(costbased_rule_test.ct2.b)) DESC
   InitPlan 1 (returns $1)
     ->  Aggregate  (cost=16.02..16.03 rows=1 width=36)
           ->  Nested Loop  (cost=0.00..15.52 rows=200 width=4)
                 ->  Seq Scan on ct2  (cost=0.00..1.02 rows=2 width=8)
                 ->  Index Only Scan using idx_ct1_b on ct1  (cost=0.00..6.25 rows=100 width=4)
                       Index Cond: (b = costbased_rule_test.ct2.a)
   ->  HashAggregate  (cost=17.02..17.04 rows=2 width=8)
         Group By Key: costbased_rule_test.ct2.a
         Filter: (sum(costbased_rule_test.ct2.b) >= $1)
         ->  Nested Loop  (cost=0.00..15.52 rows=200 width=8)
               ->  Seq Scan on ct2  (cost=0.00..1.02 rows=2 width=8)
               ->  Index Only Scan using idx_ct1_b on ct1  (cost=0.00..6.25 rows=100 width=4)
                     Index Cond: (b = costbased_rule_test.ct2.a)
(15 rows)

costbased_rewrite_rule=pullup_expr_sublink,enable_sublink_pullup_enhanced is mutually exclusive with rewrite_rule=disable_pullup_expr_sublink and rewrite_rule=enable_sublink_pullup_enhanced, and the value of costbased_rewrite_rule precedes that of rewrite_rule.

Pulling Up Non-correlated Sublinks of ANY/IN Type in the AND Condition

SQL statements can be rewritten when a single or AND condition contains non-correlated sublinks of ANY/IN type. If the cost-based query rewriting evaluation policy is disabled, sublinks are rewritten based on rules. The generated plan is as follows:

gaussdb=# SET rewrite_rule=none;
SET
gaussdb=# SET costbased_rewrite_rule=none;
SET
gaussdb=# EXPLAIN SELECT * FROM ct2 WHERE ct2.a IN (SELECT a FROM ct1 WHERE ct1.b > 1000);
                                   QUERY PLAN                                    
---------------------------------------------------------------------------------
 Nested Loop Semi Join  (cost=0.00..9.32 rows=1 width=16)
   Join Filter: (ct2.a = ct1.a)
   ->  Seq Scan on ct2  (cost=0.00..1.02 rows=2 width=16)
   ->  Materialize  (cost=0.00..8.27 rows=1 width=4)
         ->  Index Scan using idx_ct1_b on ct1  (cost=0.00..8.27 rows=1 width=4)
               Index Cond: (b > 1000)
(6 rows)

When costbased_rewrite_rule is set to pullup_sublink_any_exists, the optimizer evaluates whether to rewrite sublinks based on the cost. In this scenario, the plan generated without rewriting SQL statements is better. The generated plan is as follows:

gaussdb=# SET rewrite_rule=none;
SET
gaussdb=# SET costbased_rewrite_rule=pullup_sublink_any_exists;
SET
gaussdb=# EXPLAIN SELECT * FROM ct2 WHERE ct2.a IN (SELECT a FROM ct1 WHERE ct1.b > 1000);
                                 QUERY PLAN                                  
-----------------------------------------------------------------------------
 Seq Scan on ct2  (cost=8.27..9.29 rows=1 width=16)
   Filter: (hashed SubPlan 1)
   SubPlan 1
     ->  Index Scan using idx_ct1_b on ct1  (cost=0.00..8.27 rows=1 width=4)
           Index Cond: (b > 1000)
(5 rows)

This scenario is controlled only by the GUC parameter costbased_rewrite_rule=pullup_sublink_any_exists. The GUC parameter rewrite_rule does not have related control parameters.

Pulling Up Related Sublinks of [NOT]EXISTS Type in the AND Condition

SQL statements can be rewritten when a single or AND condition contains related sublinks of [NOT]EXISTS type. If the cost-based query rewriting evaluation policy is disabled, sublinks are rewritten based on rules. The generated plan is as follows:

gaussdb=# SET rewrite_rule=none;
SET
gaussdb=# SET costbased_rewrite_rule=none;
SET
gaussdb=# EXPLAIN SELECT * FROM ct2 WHERE EXISTS (SELECT 1 FROM ct3 WHERE ct3.b = ct2.c AND ct3.c < 1000);
                           QUERY PLAN                           
----------------------------------------------------------------
 Hash Right Semi Join  (cost=1.04..3.73 rows=2 width=16)
   Hash Cond: (ct3.b = ct2.c)
   ->  Seq Scan on ct3  (cost=0.00..2.38 rows=110 width=4)
         Filter: (c < 1000)
   ->  Hash  (cost=1.02..1.02 rows=2 width=16)
         ->  Seq Scan on ct2  (cost=0.00..1.02 rows=2 width=16)
(6 rows)

When costbased_rewrite_rule is set to pullup_sublink_any_exists, the optimizer evaluates whether to rewrite sublinks based on the cost. In this scenario, the plan generated without rewriting SQL statements is better. The generated plan is as follows:

gaussdb=# SET rewrite_rule=none;
SET
gaussdb=# SET costbased_rewrite_rule=pullup_sublink_any_exists;
SET
gaussdb=# EXPLAIN SELECT * FROM ct2 WHERE EXISTS (SELECT 1 FROM ct3 WHERE ct3.b = ct2.c AND ct3.c < 1000);
                         QUERY PLAN                          
-------------------------------------------------------------
 Seq Scan on ct2  (cost=0.00..1.55 rows=1 width=16)
   Filter: (alternatives: SubPlan 1 or hashed SubPlan 2)
   SubPlan 1
     ->  Seq Scan on ct3  (cost=0.00..2.65 rows=10 width=0)
           Filter: ((c < 1000) AND (b = ct2.c))
   SubPlan 2
     ->  Seq Scan on ct3  (cost=0.00..2.38 rows=110 width=4)
           Filter: (c < 1000)
(8 rows)

This scenario is controlled only by the GUC parameter costbased_rewrite_rule=pullup_sublink_any_exists. The GUC parameter rewrite_rule does not have related control parameters.

Pulling Up Non-correlated Sublinks of NOT IN Type in the AND Condition

SQL statements can be rewritten when a single or AND condition contains non-correlated sublinks of NOT IN type. If the cost-based query rewriting evaluation policy is disabled, sublinks are rewritten based on rules. The generated plan is as follows:

gaussdb=# SET rewrite_rule=none;
SET 
gaussdb=# SET costbased_rewrite_rule=none;
SET
gaussdb=# EXPLAIN SELECT * FROM ct1 WHERE ct1.b NOT IN (SELECT a FROM ct3);
                               QUERY PLAN                               
------------------------------------------------------------------------
 Nested Loop Anti Join  (cost=0.00..15437.60 rows=9100 width=16)
   Join Filter: ((ct1.b = ct3.a) OR (ct1.b IS NULL) OR (ct3.a IS NULL))
   ->  Seq Scan on ct1  (cost=0.00..145.00 rows=10100 width=16)
   ->  Materialize  (cost=0.00..2.65 rows=110 width=4)
         ->  Seq Scan on ct3  (cost=0.00..2.10 rows=110 width=4)
(5 rows)

When costbased_rewrite_rule is set to pullup_not_in_sublink, the optimizer evaluates whether to rewrite sublinks based on the cost. In this scenario, the plan generated without rewriting SQL statements is better. The generated plan is as follows:

gaussdb=# SET rewrite_rule=none;
SET
gaussdb=# SET costbased_rewrite_rule=pullup_not_in_sublink;
SET
gaussdb=# EXPLAIN SELECT * FROM ct1 WHERE ct1.b NOT IN (SELECT a FROM ct3);
                         QUERY PLAN                          
-------------------------------------------------------------
 Seq Scan on ct1  (cost=2.38..172.62 rows=5050 width=16)
   Filter: (NOT (hashed SubPlan 1))
   SubPlan 1
     ->  Seq Scan on ct3  (cost=0.00..2.10 rows=110 width=4)
(4 rows)

costbased_rewrite_rule=pullup_not_in_sublink is mutually exclusive with rewrite_rule=disable_pullup_not_in_sublink, and costbased_rewrite_rule precedes rewrite_rule.

Pulling Up Expression Sublinks in the AND Condition

SQL statements can be rewritten when a single or AND condition contains related sublinks of expression type. If the cost-based query rewriting evaluation policy is disabled, sublinks are rewritten based on rules. The generated plan is as follows:

gaussdb=# SET rewrite_rule=none;
SET
gaussdb=# SET costbased_rewrite_rule=none;
SET
gaussdb=# EXPLAIN SELECT * FROM ct1 WHERE ct1.a = (SELECT a FROM ct2 WHERE ct2.b = ct1.b LIMIT 1) AND ct1.b > 500;
                                                 QUERY PLAN                                                  
-------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=1.03..9.35 rows=1 width=16)
   Join Filter: ((ct1.b = subquery."?column?") AND (ct1.a = subquery.a))
   ->  Index Scan using idx_ct1_b on ct1  (cost=0.00..8.27 rows=1 width=16)
         Index Cond: (b > 500)
   ->  Subquery Scan on subquery  (cost=1.03..1.07 rows=1 width=8)
         Filter: (subquery."?column?" OPERATOR(pg_catalog.=) 1::bigint)
         ->  WindowAgg  (cost=1.03..1.06 rows=1 width=8)
               row_number_filter: (row_number() OVER (PARTITION BY ct2.b) OPERATOR(pg_catalog.<=) 1::bigint)
               ->  Sort  (cost=1.03..1.04 rows=1 width=8)
                     Sort Key: ct2.b
                     ->  Seq Scan on ct2  (cost=1.02..1.02 rows=1 width=8)
                           Filter: (b > 500)
(12 rows)

When costbased_rewrite_rule is set to pullup_expr_sublink, the optimizer evaluates whether to rewrite sublinks based on the cost. In this scenario, the plan generated without rewriting SQL statements is better. The generated plan is as follows:

gaussdb=# SET rewrite_rule=none;
SET
gaussdb=# SET costbased_rewrite_rule=pullup_expr_sublink;
SET
gaussdb=# EXPLAIN SELECT * FROM ct1 WHERE ct1.a = (SELECT a FROM ct2 WHERE ct2.b = ct1.b LIMIT 1) AND ct1.b > 500;
                              QUERY PLAN                              
----------------------------------------------------------------------
 Index Scan using idx_ct1_b on ct1  (cost=0.00..9.29 rows=1 width=16)
   Index Cond: (b > 500)
   Filter: (a = (SubPlan 1))
   SubPlan 1
     ->  Limit  (cost=0.00..1.02 rows=1 width=4)
           ->  Seq Scan on ct2  (cost=0.00..1.02 rows=1 width=4)
                 Filter: (b = ct1.b)
(7 rows)

costbased_rewrite_rule=pullup_expr_sublink is mutually exclusive with rewrite_rule=disable_pullup_expr_sublink, rewrite_rule=enable_sublink_pullup_enhanced, and rewrite_rule=magicset, and costbased_rewrite_rule precedes rewrite_rule.

Rewriting SQL Statements when the Expression Sublinks in the AND Condition Meet the MAGICSET Pushdown Rules

SQL statements can be rewritten when a single or AND condition contains related sublinks of expression type and MagicSet pushdown rules can push down some filter criteria to the sublinks. If the cost-based query rewriting evaluation policy is disabled, sublinks are rewritten based on rules. The generated plan is as follows:

gaussdb=# SET rewrite_rule=magicset;
SET
gaussdb=# SET costbased_rewrite_rule=none;
SET
gaussdb=# EXPLAIN SELECT * FROM ct1 WHERE  ct1.a = (SELECT a FROM ct2 WHERE ct2.b = ct1.b LIMIT 1) AND ct1.c=10;
                                                 QUERY PLAN                                                  
-------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=101.70..147.50 rows=1 width=16)
   ->  Subquery Scan on subquery  (cost=96.70..96.74 rows=1 width=8)
         Filter: (subquery."?column?" OPERATOR(pg_catalog.=) 1::bigint)
         ->  WindowAgg  (cost=96.70..96.72 rows=1 width=8)
               row_number_filter: (row_number() OVER (PARTITION BY ct2.b) OPERATOR(pg_catalog.<=) 1::bigint)
               ->  Sort  (cost=96.70..96.71 rows=1 width=8)
                     Sort Key: ct2.b
                     ->  Nested Loop Semi Join  (cost=5.00..96.69 rows=1 width=8)
                           ->  Seq Scan on ct2  (cost=0.00..1.02 rows=2 width=8)
                           ->  Bitmap Heap Scan on ct1  (cost=5.00..47.83 rows=1 width=4)
                                 Recheck Cond: (b = ct2.b)
                                 Filter: (c = 10)
                                 ->  Bitmap Index Scan on idx_ct1_b  (cost=0.00..5.00 rows=100 width=0)
                                       Index Cond: (b = ct2.b)
   ->  Bitmap Heap Scan on ct1  (cost=5.00..50.75 rows=1 width=16)
         Recheck Cond: (b = subquery."?column?")
         Filter: ((c = 10) AND (subquery.a = a))
         ->  Bitmap Index Scan on idx_ct1_b  (cost=0.00..5.00 rows=100 width=0)
               Index Cond: (b = subquery."?column?")
(19 rows)

When costbased_rewrite_rule is set to pullup_expr_sublink, the optimizer evaluates whether to rewrite the MagicSet pushdown conditions of sublinks based on the cost. In this scenario, the plan generated without rewriting MagicSet is better. The generated plan is as follows:

gaussdb=# SET rewrite_rule=magicset;
SET
gaussdb=# SET costbased_rewrite_rule=pullup_expr_sublink;
SET
gaussdb=# EXPLAIN SELECT * FROM ct1 WHERE  ct1.a = (SELECT a FROM ct2 WHERE ct2.b = ct1.b LIMIT 1)  AND ct1.c=10;
                                                 QUERY PLAN                                                  
-------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=6.03..51.82 rows=1 width=16)
   ->  Subquery Scan on subquery  (cost=1.03..1.06 rows=1 width=8)
         Filter: (subquery."?column?" OPERATOR(pg_catalog.=) 1::bigint)
         ->  WindowAgg  (cost=1.03..1.05 rows=1 width=8)
               row_number_filter: (row_number() OVER (PARTITION BY ct2.b) OPERATOR(pg_catalog.<=) 1::bigint)
               ->  Sort  (cost=1.03..1.03 rows=2 width=8)
                     Sort Key: ct2.b
                     ->  Seq Scan on ct2  (cost=0.00..1.02 rows=2 width=8)
   ->  Bitmap Heap Scan on ct1  (cost=5.00..50.75 rows=1 width=16)
         Recheck Cond: (b = subquery."?column?")
         Filter: ((c = 10) AND (subquery.a = a))
         ->  Bitmap Index Scan on idx_ct1_b  (cost=0.00..5.00 rows=100 width=0)
               Index Cond: (b = subquery."?column?")
(13 rows)

costbased_rewrite_rule=pullup_expr_sublink is mutually exclusive with rewrite_rule=disable_pullup_expr_sublink, rewrite_rule=enable_sublink_pullup_enhanced, and rewrite_rule=magicset, and costbased_rewrite_rule precedes rewrite_rule.

Rewriting SQL Statements when the Expression Sublinks in the AND Condition Meet the WINAGG Rules

SQL statements can be rewritten when a single or AND condition contains related sublinks of expression type and WINAGG rules are met. If the cost-based query rewriting evaluation policy is disabled, sublinks are rewritten based on rules. The generated plan is as follows:

gaussdb=# SET rewrite_rule=enable_sublink_pullup_enhanced;
SET
gaussdb=# SET costbased_rewrite_rule=none;
SET
gaussdb=# EXPLAIN SELECT * FROM ct1, ct2 WHERE ct1.b = ct2.c AND ct2.a = (SELECT min(b) FROM ct1 WHERE ct1.b = ct2.c);
                                            QUERY PLAN                                            
--------------------------------------------------------------------------------------------------
 Subquery Scan on inner_subquery  (cost=105.88..111.88 rows=1 width=36)
   Filter: (inner_subquery.a = inner_subquery.min)
   ->  WindowAgg  (cost=105.88..109.38 rows=200 width=32)
         ->  Sort  (cost=105.88..106.38 rows=200 width=32)
               Sort Key: ct2.c
               ->  Nested Loop  (cost=5.03..98.23 rows=200 width=32)
                     ->  Seq Scan on ct2  (cost=0.00..1.02 rows=2 width=16)
                     ->  Bitmap Heap Scan on ct1  (cost=5.03..47.61 rows=100 width=16)
                           Recheck Cond: (b = ct2.c)
                           ->  Bitmap Index Scan on idx_ct1_b  (cost=0.00..5.00 rows=100 width=0)
                                 Index Cond: (b = ct2.c)
(11 rows)

When costbased_rewrite_rule is set to pullup_expr_sublink,enable_sublink_pullup_enhanced, the optimizer evaluates whether to rewrite sublinks based on the cost. In this scenario, the plan generated without rewriting is better. The generated plan is as follows:

gaussdb=# SET rewrite_rule=enable_sublink_pullup_enhanced;
SET
gaussdb=# SET costbased_rewrite_rule=pullup_expr_sublink,enable_sublink_pullup_enhanced;
SET
gaussdb=# EXPLAIN SELECT * FROM ct1, ct2 WHERE ct1.b = ct2.c AND ct2.a = (SELECT min(b) FROM ct1 WHERE ct1.b = ct2.c);
                                              QUERY PLAN                                              
------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=5.03..52.45 rows=100 width=32)
   ->  Seq Scan on ct2  (cost=0.00..1.17 rows=1 width=16)
         Filter: (a = (SubPlan 2))
         SubPlan 2
           ->  Result  (cost=0.06..0.07 rows=1 width=0)
                 InitPlan 1 (returns $1)
                   ->  Limit  (cost=0.00..0.06 rows=1 width=4)
                         ->  Index Only Scan using idx_ct1_b on ct1  (cost=0.00..6.25 rows=1 width=4)
                               Index Cond: ((b IS NOT NULL) AND (b = ct2.c))
   ->  Bitmap Heap Scan on ct1  (cost=5.03..50.27 rows=100 width=16)
         Recheck Cond: (b = ct2.c)
         ->  Bitmap Index Scan on idx_ct1_b  (cost=0.00..5.00 rows=100 width=0)
               Index Cond: (b = ct2.c)
(13 rows)

costbased_rewrite_rule=pullup_expr_sublink,enable_sublink_pullup_enhanced is mutually exclusive with rewrite_rule=disable_pullup_expr_sublink, rewrite_rule=enable_sublink_pullup_enhanced, and rewrite_rule=magicset, and costbased_rewrite_rule precedes rewrite_rule.

Pulling Up Expression Sublinks in the OR Condition

SQL statements can be rewritten when an OR condition contains related sublinks of expression type. If the cost-based query rewriting evaluation policy is disabled, sublinks are rewritten based on rules. The generated plan is as follows:

gaussdb=# SET rewrite_rule=none;
SET
gaussdb=# SET costbased_rewrite_rule=none;
SET
gaussdb=# EXPLAIN SELECT * FROM ct2 WHERE ct2.a = (SELECT min(b) FROM ct1 WHERE ct1.b = ct2.c) OR ct2.b = (SELECT sum(a) FROM ct3 WHERE ct3.a = ct2.c);
                                 QUERY PLAN                                  
-----------------------------------------------------------------------------
 Nested Loop Left Join  (cost=199.20..202.11 rows=1 width=16)
   Join Filter: (subquery."?column?" = ct2.c)
   Filter: ((ct2.a = (min(ct1.b))) OR (ct2.b = subquery.sum))
   ->  Hash Right Join  (cost=196.55..198.84 rows=2 width=20)
         Hash Cond: (ct1.b = ct2.c)
         ->  HashAggregate  (cost=195.50..196.51 rows=101 width=4)
               Group By Key: ct1.b
               ->  Seq Scan on ct1  (cost=0.00..145.00 rows=10100 width=4)
         ->  Hash  (cost=1.02..1.02 rows=2 width=16)
               ->  Seq Scan on ct2  (cost=0.00..1.02 rows=2 width=16)
   ->  Materialize  (cost=2.65..2.90 rows=10 width=12)
         ->  Subquery Scan on subquery  (cost=2.65..2.85 rows=10 width=12)
               ->  HashAggregate  (cost=2.65..2.75 rows=10 width=4)
                     Group By Key: ct3.a
                     ->  Seq Scan on ct3  (cost=0.00..2.10 rows=110 width=4)
(15 rows)

When costbased_rewrite_rule is set to pullup_expr_sublink, the optimizer evaluates whether to rewrite sublinks based on the cost. In this scenario, the plan generated without rewriting is better. The generated plan is as follows:

gaussdb=# SET rewrite_rule=none;
SET
gaussdb=# SET costbased_rewrite_rule=pullup_expr_sublink;
SET
gaussdb=# EXPLAIN SELECT * FROM ct2 WHERE ct2.a = (SELECT min(b) FROM ct1 WHERE ct1.b = ct2.c) OR ct2.b = (SELECT sum(a) FROM ct3 WHERE ct3.a = ct2.c);
                                           QUERY PLAN                                           
------------------------------------------------------------------------------------------------
 Hash Right Join  (cost=3.69..4.10 rows=1 width=16)
   Hash Cond: (ct3.a = ct2.c)
   Filter: ((ct2.a = (SubPlan 2)) OR (ct2.b = (sum(ct3.a))))
   ->  HashAggregate  (cost=2.65..2.75 rows=10 width=4)
         Group By Key: ct3.a
         ->  Seq Scan on ct3  (cost=0.00..2.10 rows=110 width=4)
   ->  Hash  (cost=1.02..1.02 rows=2 width=16)
         ->  Seq Scan on ct2  (cost=0.00..1.02 rows=2 width=16)
   SubPlan 2
     ->  Result  (cost=0.06..0.07 rows=1 width=0)
           InitPlan 1 (returns $1)
             ->  Limit  (cost=0.00..0.06 rows=1 width=4)
                   ->  Index Only Scan using idx_ct1_b on ct1  (cost=0.00..6.25 rows=1 width=4)
                         Index Cond: ((b IS NOT NULL) AND (b = ct2.c))
(14 rows)

costbased_rewrite_rule=pullup_expr_sublink is mutually exclusive with rewrite_rule=disable_pullup_expr_sublink, rewrite_rule=enable_sublink_pullup_enhanced, and rewrite_rule=magicset, and costbased_rewrite_rule precedes rewrite_rule.

Pulling Up the Related Sublinks of the Expression Type in TargetList

SQL statements can be rewritten when TargetList contains related sublinks of expression type. If the cost-based query rewriting evaluation policy is disabled, sublinks are rewritten based on rules. The generated plan is as follows:

gaussdb=# SET rewrite_rule=intargetlist;
SET
gaussdb=# SET costbased_rewrite_rule=none;
SET
gaussdb=# EXPLAIN SELECT *, ct1.a = (SELECT a FROM ct2 WHERE ct2.b = ct1.b LIMIT 1)  FROM ct1 WHERE ct1.b >500;
                                                 QUERY PLAN                                                  
-------------------------------------------------------------------------------------------------------------
 Nested Loop Left Join  (cost=1.03..9.34 rows=1 width=20)
   Join Filter: (subquery."?column?" = ct1.b)
   ->  Index Scan using idx_ct1_b on ct1  (cost=0.00..8.27 rows=1 width=16)
         Index Cond: (b > 500)
   ->  Subquery Scan on subquery  (cost=1.03..1.06 rows=1 width=8)
         Filter: (subquery."?column?" OPERATOR(pg_catalog.=) 1::bigint)
         ->  WindowAgg  (cost=1.03..1.05 rows=1 width=8)
               row_number_filter: (row_number() OVER (PARTITION BY ct2.b) OPERATOR(pg_catalog.<=) 1::bigint)
               ->  Sort  (cost=1.03..1.03 rows=2 width=8)
                     Sort Key: ct2.b
                     ->  Seq Scan on ct2  (cost=0.00..1.02 rows=2 width=8)
(11 rows)

When costbased_rewrite_rule is set to intargetlist, the optimizer evaluates whether to rewrite sublinks based on the cost. In this scenario, the plan generated without rewriting is better. The generated plan is as follows:

gaussdb=# SET rewrite_rule=intargetlist;
SET
gaussdb=# SET costbased_rewrite_rule=intargetlist;
SET
gaussdb=# EXPLAIN SELECT *, ct1.a = (SELECT a FROM ct2 WHERE ct2.b = ct1.b LIMIT 1)  FROM ct1 WHERE ct1.b >500;
                              QUERY PLAN                              
----------------------------------------------------------------------
 Index Scan using idx_ct1_b on ct1  (cost=0.00..9.29 rows=1 width=16)
   Index Cond: (b > 500)
   SubPlan 1
     ->  Limit  (cost=0.00..1.02 rows=1 width=4)
           ->  Seq Scan on ct2  (cost=0.00..1.02 rows=1 width=4)
                 Filter: (b = ct1.b)
(6 rows)

costbased_rewrite_rule=intargetlist is mutually exclusive with rewrite_rule=intargetlist and rewrite_rule=magicset, and costbased_rewrite_rule precedes rewrite_rule.

Rewriting SQL Statements when the Expression Sublinks in TargetList Meet the MAGICSET Pushdown Rules

SQL statements can be rewritten when TargetList contains related sublinks of expression type and MagicSet pushdown rules can push down some filter criteria to the sublinks. If the cost-based query rewriting evaluation policy is disabled, sublinks are rewritten based on rules. The generated plan is as follows:

gaussdb=# SET rewrite_rule=intargetlist,magicset;
SET
gaussdb=# SET costbased_rewrite_rule=none;
SET
gaussdb=# EXPLAIN SELECT *,ct1.a = (SELECT a FROM ct2 WHERE ct2.c = ct1.c LIMIT 1) FROM ct1 WHERE ct1.b=10;
                                                     QUERY PLAN                                                   

------------------------------------------------------------------------------------------------------------------
--
 Nested Loop Left Join  (cost=55.97..103.00 rows=100 width=20)
   Join Filter: (subquery."?column?" = costbased_rule_test.ct1.c)
   ->  Bitmap Heap Scan on ct1  (cost=5.03..50.27 rows=100 width=16)
         Recheck Cond: (b = 10)
         ->  Bitmap Index Scan on idx_ct1_b  (cost=0.00..5.00 rows=100 width=0)
               Index Cond: (b = 10)
   ->  Materialize  (cost=50.94..50.98 rows=1 width=8)
         ->  Subquery Scan on subquery  (cost=50.94..50.98 rows=1 width=8)
               Filter: (subquery."?column?" OPERATOR(pg_catalog.=) 1::bigint)
               ->  WindowAgg  (cost=50.94..50.96 rows=1 width=8)
                     row_number_filter: (row_number() OVER (PARTITION BY ct2.c) OPERATOR(pg_catalog.<=) 1::bigint)
                     ->  Sort  (cost=50.94..50.95 rows=2 width=8)
                           Sort Key: ct2.c
                           ->  Nested Loop  (cost=50.52..50.93 rows=2 width=8)
                                 ->  HashAggregate  (cost=50.52..50.53 rows=1 width=4)
                                       Group By Key: costbased_rule_test.ct1.c
                                       ->  Bitmap Heap Scan on ct1  (cost=5.03..50.27 rows=100 width=4)
                                             Recheck Cond: (b = 10)
                                             ->  Bitmap Index Scan on idx_ct1_b  (cost=0.00..5.00 rows=100 width=0
)
                                                   Index Cond: (b = 10)
                                 ->  Index Scan using idx_ct2_c on ct2  (cost=0.00..0.39 rows=1 width=8)
                                       Index Cond: (c = costbased_rule_test.ct1.c)
(22 rows)

When costbased_rewrite_rule is set to intargetlist, the optimizer evaluates whether to rewrite the MagicSet pushdown conditions of sublinks based on the cost. In this scenario, the plan generated without rewriting MagicSet is better. The generated plan is as follows:

gaussdb=# SET rewrite_rule=intargetlist,magicset;
SET
gaussdb=# SET costbased_rewrite_rule=intargetlist;
SET
gaussdb=# EXPLAIN SELECT *,ct1.a = (SELECT a FROM ct2 WHERE ct2.c = ct1.c LIMIT 1) FROM ct1 WHERE ct1.b=10;
                                                    QUERY PLAN                                                    

------------------------------------------------------------------------------------------------------------------
-
 Hash Left Join  (cost=6.10..52.35 rows=100 width=20)
   Hash Cond: (ct1.c = subquery."?column?")
   ->  Bitmap Heap Scan on ct1  (cost=5.03..50.27 rows=100 width=16)
         Recheck Cond: (b = 10)
         ->  Bitmap Index Scan on idx_ct1_b  (cost=0.00..5.00 rows=100 width=0)
               Index Cond: (b = 10)
   ->  Hash  (cost=1.06..1.06 rows=1 width=8)
         ->  Subquery Scan on subquery  (cost=1.03..1.06 rows=1 width=8)
               Filter: (subquery."?column?" OPERATOR(pg_catalog.=) 1::bigint)
               ->  WindowAgg  (cost=1.03..1.05 rows=1 width=8)
                     row_number_filter: (row_number() OVER (PARTITION BY ct2.c) OPERATOR(pg_catalog.<=) 1::bigint)
                     ->  Sort  (cost=1.03..1.03 rows=2 width=8)
                           Sort Key: ct2.c
                           ->  Seq Scan on ct2  (cost=0.00..1.02 rows=2 width=8)
(14 rows)

costbased_rewrite_rule=intargetlist is mutually exclusive with rewrite_rule=intargetlist and rewrite_rule=magicset, and costbased_rewrite_rule precedes rewrite_rule.

Rewriting SQL Statements when Simple Subqueries Exist

SQL statements can be rewritten when simple subqueries exist. If the cost-based query rewriting evaluation policy is disabled, sublinks are rewritten based on rules. The generated plan is as follows:

gaussdb=# SET rewrite_rule=none;
SET
gaussdb=# SET costbased_rewrite_rule=none;
SET
gaussdb=# EXPLAIN SELECT * FROM ct2,(SELECT ct1.* FROM ct1,ct2 WHERE ct1.c > 10 AND ct1.a = 10) ct WHERE ct.a = ct2.d;
                                      QUERY PLAN
--------------------------------------------------------------------------------------
 Nested Loop  (cost=20000000097.15..2000000019660.81 rows=2 width=32)
   ->  Nested Loop  (cost=20000000097.15..2000000019652.51 rows=1 width=32)
         ->  Seq Scan on ct2  (cost=10000000000.00..1000000000102.50 rows=1 width=16)
               Filter: (d = 10)
         ->  Seq Scan on ct1  (cost=10000000097.15..1000000019550.00 rows=1 width=16)
               Filter: ((c > 10) AND (a = 10))
   ->  Index Only Scan using idx_ct2_c on ct2  (cost=0.00..8.28 rows=2 width=0)
(7 rows)

When costbased_rewrite_rule is set to pullup_subquery, the optimizer evaluates whether to rewrite simple subqueries based on the cost. In this scenario, the plan generated without rewriting simple subqueries is better. The generated plan is as follows:

gaussdb=# SET rewrite_rule=none;
SET
gaussdb=# SET costbased_rewrite_rule=pullup_subquery;
SET
gaussdb=# EXPLAIN SELECT * FROM ct2,(SELECT ct1.* FROM ct1,ct2 WHERE ct1.c > 10 AND ct1.a = 10) ct WHERE ct.a = ct2.d;
                                         QUERY PLAN
--------------------------------------------------------------------------------------------
 Nested Loop  (cost=20000000097.15..2000000019660.39 rows=1 width=32)
   Join Filter: (costbased_rule_test.ct2.d = ct1.a)
   ->  Seq Scan on ct2  (cost=10000000000.00..1000000000102.00 rows=2 width=16)
   ->  Materialize  (cost=10000000097.15..1000000019558.33 rows=2 width=16)
         ->  Nested Loop  (cost=10000000097.15..1000000019558.30 rows=2 width=16)
               ->  Seq Scan on ct1  (cost=10000000097.15..1000000019550.00 rows=1 width=16)
                     Filter: ((c > 10) AND (a = 10))
               ->  Index Only Scan using idx_ct2_c on ct2  (cost=0.00..8.28 rows=2 width=0)
(8 rows)

This scenario is controlled only by the GUC parameter costbased_rewrite_rule=pullup_subquery. The GUC parameter rewrite_rule does not have related control parameters.