文档首页/ 云数据库 GaussDB/ 开发指南(集中式_V2.0-8.x)/ SQL调优指南/ 实际调优案例/ 案例:调整基于代价的查询重写GUC参数costbased_rewrite_rule
更新时间:2024-11-12 GMT+08:00
分享

案例:调整基于代价的查询重写GUC参数costbased_rewrite_rule

查询重写能让优化器选择不同的执行路径,但部分规则对SQL的改写,并不能确保输出后的都是可以提升性能的计划。一旦改写错误可能导致千倍的性能差距,因此在查询改写阶段需要对此类规则支持基于代价的评估策略,决策是否应用规则对SQL进行改写,这些规则受GUC参数costbased_rewrite_rule控制,其受控的规则如表1所示。

表1 GUC参数costbased_rewrite_rule支持的参数控制规则说明列表

参数名

描述

none

不使用任何基于代价的查询改写策略。

pullup_subquery

简单子查询展开改写规则开启使用基于代价的查询改写策略。

pullup_sublink_any_exists

单个或AND条件里的ANY非相关子链接和[NOT] EXISTS相关子链接改写规则开启使用基于代价的查询改写策略。

pullup_not_in_sublink

单个或AND条件里的NOT IN非相关子链接改写规则开启使用基于代价的查询改写策略。该参数与GUC参数rewrite_rule的disable_pullup_not_in_sublink参数互斥,当开启该参数时,rewrite_rule的disable_pullup_not_in_sublink参数功能不生效。

pullup_expr_sublink

表达式子链接以及OR条件中的ANY非相关子链接和[NOT] EXISTS相关子链接改写场景生效。该参数与GUC参数rewrite_rule的disable_pullup_expr_sublink、enable_sublink_pullup_enhanced、magicset参数互斥,当开启该参数时rewrite_rule的相关参数功能不生效。

intargetlist

TargetList中的相关表达式子链接改写规则开启使用基于代价的查询改写策略。该参数与GUC参数rewrite_rule的intargetlist、magicset参数互斥,当开启该参数时,rewrite_rule的intargetlist、magicset参数功能不生效。

enable_sublink_pullup_enhanced

表达式子链接改写功能增强场景使用基于代价的查询改写策略。该参数受pullup_expr_sublink参数影响,AND条件中的表达式子链接改写场景需开启pullup_expr_sublink参数,该参数功能才生效。且与GUC参数rewrite_rule的enable_sublink_pullup_enhanced参数互斥,当开启该参数时,rewrite_rule的enable_sublink_pullup_enhanced参数功能不生效。

对于上述参数所控制的查询重写规则,以下通过典型案例说明其使用场景。

案例环境准备

为了便于规则的使用场景演示,需准备建表语句如下:

SET client_min_messages = warning;
SET CLIENT_ENCODING = 'UTF8';
--清理环境。
DROP SCHEMA IF EXISTS costbased_rule_test cascade;
CREATE SCHEMA costbased_rule_test;
SET current_schema = costbased_rule_test;
SET enable_codegen = off;
DROP TABLE IF EXISTS costbased_rule_test.ct1;
DROP TABLE IF EXISTS costbased_rule_test.ct2;
DROP TABLE IF EXISTS costbased_rule_test.ct3;
DROP TABLE IF EXISTS costbased_rule_test.ct4;
--创建测试表。
CREATE TABLE ct1 (a INT, b INT, c INT, d INT);
CREATE TABLE ct2 (a INT, b INT, c INT, d INT);
CREATE TABLE ct3 (a INT, b INT, c INT, d INT);
CREATE TABLE ct4 (a INT, b INT, c INT, d INT);
CREATE INDEX idx_ct1_b ON ct1(b);
CREATE INDEX idx_ct2_c ON ct2(c);
CREATE INDEX idx_ct3_c ON ct3(c);
--插入数据。
INSERT INTO ct1 (a, b, c) VALUES (generate_series(1, 100), generate_series(200, 300), left(random()::int, 100));
INSERT INTO ct2 VALUES(1,2,3,4),(3,4,5,6);
INSERT INTO ct3 (a, b, c, d) VALUES (generate_series(1, 10), generate_series(20, 30), left(random()::int, 10), left(random()::int, 10));
--更新统计信息。
ANALYZE ct1;
ANALYZE ct2;
ANALYZE ct3;

Having条件中的非相关子链接改写场景

Having条件中存在带有AGG的非相关表达式子链接时可以进行SQL等价改写。如果关闭基于代价的查询改写评估策略,子链接会基于规则进行改写,生成计划如下:

gaussdb=# SET rewrite_rule=enable_sublink_pullup_enhanced;
SET
gaussdb=# SET costbased_rewrite_rule=none;
SET
gaussdb=# EXPLAIN SELECT ct2.a, SUM(ct2.b) AS value FROM ct2,ct1 WHERE ct2.a = ct1.b GROUP BY ct2.a HAVING SUM(ct2.b) >= (SELECT AVG(ct2.b) FROM ct2,ct1 WHERE ct2.a = ct1.b) ORDER BY value DESC;
                                             QUERY PLAN                                             
----------------------------------------------------------------------------------------------------
 Sort  (cost=33.14..33.14 rows=1 width=12)
   Sort Key: (sum(costbased_rule_test.ct2.b)) DESC
   ->  Nested Loop  (cost=33.04..33.13 rows=1 width=12)
         Join Filter: ((sum(costbased_rule_test.ct2.b)) >= (avg(costbased_rule_test.ct2.b)))
         ->  Aggregate  (cost=16.02..16.03 rows=1 width=36)
               ->  Nested Loop  (cost=0.00..15.52 rows=200 width=4)
                     ->  Seq Scan on ct2  (cost=0.00..1.02 rows=2 width=8)
                     ->  Index Only Scan using idx_ct1_b on ct1  (cost=0.00..6.25 rows=100 width=4)
                           Index Cond: (b = costbased_rule_test.ct2.a)
         ->  HashAggregate  (cost=17.02..17.04 rows=2 width=8)
               Group By Key: costbased_rule_test.ct2.a
               ->  Nested Loop  (cost=0.00..15.52 rows=200 width=8)
                     ->  Seq Scan on ct2  (cost=0.00..1.02 rows=2 width=8)
                     ->  Index Only Scan using idx_ct1_b on ct1  (cost=0.00..6.25 rows=100 width=4)
                           Index Cond: (b = costbased_rule_test.ct2.a)
(15 rows)

但当设置costbased_rewrite_rule参数值为pullup_expr_sublink,enable_sublink_pullup_enhanced时,优化器会基于代价评估是否对子链接进行改写,因为此场景不进行SQL改写生成的计划更优,因此生成计划如下:

gaussdb=# SET rewrite_rule=enable_sublink_pullup_enhanced;
SET
gaussdb=# SET costbased_rewrite_rule=pullup_expr_sublink,enable_sublink_pullup_enhanced;
SET
gaussdb=# EXPLAIN SELECT ct2.a, SUM(ct2.b) AS value FROM ct2,ct1 WHERE ct2.a = ct1.b GROUP BY ct2.a HAVING SUM(ct2.b) >= (SELECT AVG(ct2.b) FROM ct2,ct1 WHERE ct2.a = ct1.b) ORDER BY value DESC;
                                           QUERY PLAN                                           
------------------------------------------------------------------------------------------------
 Sort  (cost=33.09..33.09 rows=2 width=8)
   Sort Key: (sum(costbased_rule_test.ct2.b)) DESC
   InitPlan 1 (returns $1)
     ->  Aggregate  (cost=16.02..16.03 rows=1 width=36)
           ->  Nested Loop  (cost=0.00..15.52 rows=200 width=4)
                 ->  Seq Scan on ct2  (cost=0.00..1.02 rows=2 width=8)
                 ->  Index Only Scan using idx_ct1_b on ct1  (cost=0.00..6.25 rows=100 width=4)
                       Index Cond: (b = costbased_rule_test.ct2.a)
   ->  HashAggregate  (cost=17.02..17.04 rows=2 width=8)
         Group By Key: costbased_rule_test.ct2.a
         Filter: (sum(costbased_rule_test.ct2.b) >= $1)
         ->  Nested Loop  (cost=0.00..15.52 rows=200 width=8)
               ->  Seq Scan on ct2  (cost=0.00..1.02 rows=2 width=8)
               ->  Index Only Scan using idx_ct1_b on ct1  (cost=0.00..6.25 rows=100 width=4)
                     Index Cond: (b = costbased_rule_test.ct2.a)
(15 rows)

costbased_rewrite_rule=pullup_expr_sublink,enable_sublink_pullup_enhanced与rewrite_rule=disable_pullup_expr_sublink、rewrite_rule=enable_sublink_pullup_enhanced互斥,且costbased_rewrite_rule的值优先级高于rewrite_rule的优先级。

AND条件中ANY/IN非相关子链接提升场景

单个或AND条件里存在ANY/IN非相关子链接时可以进行SQL等价改写。如果关闭基于代价的查询改写评估策略,子链接会基于规则进行改写,生成计划如下:

gaussdb=# SET rewrite_rule=none;
SET
gaussdb=# SET costbased_rewrite_rule=none;
SET
gaussdb=# EXPLAIN SELECT * FROM ct2 WHERE ct2.a IN (SELECT a FROM ct1 WHERE ct1.b > 1000);
                                   QUERY PLAN                                    
---------------------------------------------------------------------------------
 Nested Loop Semi Join  (cost=0.00..9.32 rows=1 width=16)
   Join Filter: (ct2.a = ct1.a)
   ->  Seq Scan on ct2  (cost=0.00..1.02 rows=2 width=16)
   ->  Materialize  (cost=0.00..8.27 rows=1 width=4)
         ->  Index Scan using idx_ct1_b on ct1  (cost=0.00..8.27 rows=1 width=4)
               Index Cond: (b > 1000)
(6 rows)

但当设置costbased_rewrite_rule参数值为pullup_sublink_any_exists时,则优化器会基于代价评估是否对子链接进行改写,因为此场景不进行SQL改写生成的计划更优,因此生成计划如下:

gaussdb=# SET rewrite_rule=none;
SET
gaussdb=# SET costbased_rewrite_rule=pullup_sublink_any_exists;
SET
gaussdb=# EXPLAIN SELECT * FROM ct2 WHERE ct2.a IN (SELECT a FROM ct1 WHERE ct1.b > 1000);
                                 QUERY PLAN                                  
-----------------------------------------------------------------------------
 Seq Scan on ct2  (cost=8.27..9.29 rows=1 width=16)
   Filter: (hashed SubPlan 1)
   SubPlan 1
     ->  Index Scan using idx_ct1_b on ct1  (cost=0.00..8.27 rows=1 width=4)
           Index Cond: (b > 1000)
(5 rows)

该场景仅受GUC参数costbased_rewrite_rule=pullup_sublink_any_exists控制,GUC参数rewrite_rule无相关控制参数。

AND条件中[NOT]EXISTS相关子连接提升场景

单个或AND条件里存在[NOT]EXISTS相关子链接时可以进行SQL等价改写。如果关闭基于代价的查询改写评估策略,子链接会基于规则进行改写,生成计划如下:

gaussdb=# SET rewrite_rule=none;
SET
gaussdb=# SET costbased_rewrite_rule=none;
SET
gaussdb=# EXPLAIN SELECT * FROM ct2 WHERE EXISTS (SELECT 1 FROM ct3 WHERE ct3.b = ct2.c AND ct3.c < 1000);
                           QUERY PLAN                           
----------------------------------------------------------------
 Hash Right Semi Join  (cost=1.04..3.73 rows=2 width=16)
   Hash Cond: (ct3.b = ct2.c)
   ->  Seq Scan on ct3  (cost=0.00..2.38 rows=110 width=4)
         Filter: (c < 1000)
   ->  Hash  (cost=1.02..1.02 rows=2 width=16)
         ->  Seq Scan on ct2  (cost=0.00..1.02 rows=2 width=16)
(6 rows)

但当设置costbased_rewrite_rule参数值为pullup_sublink_any_exists时,则优化器会基于代价评估是否对子链接进行改写,因为此场景不进行SQL改写生成的计划更优,因此生成计划如下:

gaussdb=# SET rewrite_rule=none;
SET
gaussdb=# SET costbased_rewrite_rule=pullup_sublink_any_exists;
SET
gaussdb=# EXPLAIN SELECT * FROM ct2 WHERE EXISTS (SELECT 1 FROM ct3 WHERE ct3.b = ct2.c AND ct3.c < 1000);
                         QUERY PLAN                          
-------------------------------------------------------------
 Seq Scan on ct2  (cost=0.00..1.55 rows=1 width=16)
   Filter: (alternatives: SubPlan 1 or hashed SubPlan 2)
   SubPlan 1
     ->  Seq Scan on ct3  (cost=0.00..2.65 rows=10 width=0)
           Filter: ((c < 1000) AND (b = ct2.c))
   SubPlan 2
     ->  Seq Scan on ct3  (cost=0.00..2.38 rows=110 width=4)
           Filter: (c < 1000)
(8 rows)

该场景仅受GUC参数costbased_rewrite_rule=pullup_sublink_any_exists控制,GUC参数rewrite_rule无相关控制参数。

AND条件中NOT IN非相关子连接提升场景

单个或AND条件里存在NOT IN非相关子链接时可以进行SQL等价改写。如果关闭基于代价的查询改写评估策略,子链接会基于规则进行改写,生成计划如下:

gaussdb=# SET rewrite_rule=none;
SET 
gaussdb=# SET costbased_rewrite_rule=none;
SET
gaussdb=# EXPLAIN SELECT * FROM ct1 WHERE ct1.b NOT IN (SELECT a FROM ct3);
                               QUERY PLAN                               
------------------------------------------------------------------------
 Nested Loop Anti Join  (cost=0.00..15437.60 rows=9100 width=16)
   Join Filter: ((ct1.b = ct3.a) OR (ct1.b IS NULL) OR (ct3.a IS NULL))
   ->  Seq Scan on ct1  (cost=0.00..145.00 rows=10100 width=16)
   ->  Materialize  (cost=0.00..2.65 rows=110 width=4)
         ->  Seq Scan on ct3  (cost=0.00..2.10 rows=110 width=4)
(5 rows)

但当设置costbased_rewrite_rule参数值为pullup_not_in_sublink时,则优化器会基于代价评估是否对子链接进行改写,因为此场景不进行SQL改写生成的计划更优,因此生成计划如下:

gaussdb=# SET rewrite_rule=none;
SET
gaussdb=# SET costbased_rewrite_rule=pullup_not_in_sublink;
SET
gaussdb=# EXPLAIN SELECT * FROM ct1 WHERE ct1.b NOT IN (SELECT a FROM ct3);
                         QUERY PLAN                          
-------------------------------------------------------------
 Seq Scan on ct1  (cost=2.38..172.62 rows=5050 width=16)
   Filter: (NOT (hashed SubPlan 1))
   SubPlan 1
     ->  Seq Scan on ct3  (cost=0.00..2.10 rows=110 width=4)
(4 rows)

costbased_rewrite_rule=pullup_not_in_sublink与rewrite_rule=disable_pullup_not_in_sublink互斥,且costbased_rewrite_rule的优先级高于rewrite_rule的优先级。

AND条件中表达式子链接提升场景

单个或AND条件里存在表达式类相关子链接时可以进行SQL等价改写。如果关闭基于代价的查询改写评估策略,子链接会基于规则进行改写,生成计划如下:

gaussdb=# SET rewrite_rule=none;
SET
gaussdb=# SET costbased_rewrite_rule=none;
SET
gaussdb=# EXPLAIN SELECT * FROM ct1 WHERE ct1.a = (SELECT a FROM ct2 WHERE ct2.b = ct1.b LIMIT 1) AND ct1.b > 500;
                                                 QUERY PLAN                                                  
-------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=1.03..9.35 rows=1 width=16)
   Join Filter: ((ct1.b = subquery."?column?") AND (ct1.a = subquery.a))
   ->  Index Scan using idx_ct1_b on ct1  (cost=0.00..8.27 rows=1 width=16)
         Index Cond: (b > 500)
   ->  Subquery Scan on subquery  (cost=1.03..1.07 rows=1 width=8)
         Filter: (subquery."?column?" OPERATOR(pg_catalog.=) 1::bigint)
         ->  WindowAgg  (cost=1.03..1.06 rows=1 width=8)
               row_number_filter: (row_number() OVER (PARTITION BY ct2.b) OPERATOR(pg_catalog.<=) 1::bigint)
               ->  Sort  (cost=1.03..1.04 rows=1 width=8)
                     Sort Key: ct2.b
                     ->  Seq Scan on ct2  (cost=1.02..1.02 rows=1 width=8)
                           Filter: (b > 500)
(12 rows)

但当设置costbased_rewrite_rule参数值为pullup_expr_sublink时,则优化器会基于代价评估是否对子链接进行改写,因为此场景不进行SQL改写生成的计划更优,因此生成计划如下:

gaussdb=# SET rewrite_rule=none;
SET
gaussdb=# SET costbased_rewrite_rule=pullup_expr_sublink;
SET
gaussdb=# EXPLAIN SELECT * FROM ct1 WHERE ct1.a = (SELECT a FROM ct2 WHERE ct2.b = ct1.b LIMIT 1) AND ct1.b > 500;
                              QUERY PLAN                              
----------------------------------------------------------------------
 Index Scan using idx_ct1_b on ct1  (cost=0.00..9.29 rows=1 width=16)
   Index Cond: (b > 500)
   Filter: (a = (SubPlan 1))
   SubPlan 1
     ->  Limit  (cost=0.00..1.02 rows=1 width=4)
           ->  Seq Scan on ct2  (cost=0.00..1.02 rows=1 width=4)
                 Filter: (b = ct1.b)
(7 rows)

costbased_rewrite_rule=pullup_expr_sublink与rewrite_rule=disable_pullup_expr_sublink、rewrite_rule=enable_sublink_pullup_enhanced、rewrite_rule=magicset互斥,且costbased_rewrite_rule的优先级高于rewrite_rule的优先级。

AND条件中表达式子链接满足MAGICSET下推改写场景

单个或AND条件里存在表达式类相关子链接,且满足MagicSet下推规则可以将部分过滤条件下推到子链接中时可以进行SQL等价改写。如果关闭基于代价的查询改写评估策略,子链接会基于规则进行改写,生成计划如下:

gaussdb=# SET rewrite_rule=magicset;
SET
gaussdb=# SET costbased_rewrite_rule=none;
SET
gaussdb=# EXPLAIN SELECT * FROM ct1 WHERE  ct1.a = (SELECT a FROM ct2 WHERE ct2.b = ct1.b LIMIT 1) AND ct1.c=10;
                                                 QUERY PLAN                                                  
-------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=101.70..147.50 rows=1 width=16)
   ->  Subquery Scan on subquery  (cost=96.70..96.74 rows=1 width=8)
         Filter: (subquery."?column?" OPERATOR(pg_catalog.=) 1::bigint)
         ->  WindowAgg  (cost=96.70..96.72 rows=1 width=8)
               row_number_filter: (row_number() OVER (PARTITION BY ct2.b) OPERATOR(pg_catalog.<=) 1::bigint)
               ->  Sort  (cost=96.70..96.71 rows=1 width=8)
                     Sort Key: ct2.b
                     ->  Nested Loop Semi Join  (cost=5.00..96.69 rows=1 width=8)
                           ->  Seq Scan on ct2  (cost=0.00..1.02 rows=2 width=8)
                           ->  Bitmap Heap Scan on ct1  (cost=5.00..47.83 rows=1 width=4)
                                 Recheck Cond: (b = ct2.b)
                                 Filter: (c = 10)
                                 ->  Bitmap Index Scan on idx_ct1_b  (cost=0.00..5.00 rows=100 width=0)
                                       Index Cond: (b = ct2.b)
   ->  Bitmap Heap Scan on ct1  (cost=5.00..50.75 rows=1 width=16)
         Recheck Cond: (b = subquery."?column?")
         Filter: ((c = 10) AND (subquery.a = a))
         ->  Bitmap Index Scan on idx_ct1_b  (cost=0.00..5.00 rows=100 width=0)
               Index Cond: (b = subquery."?column?")
(19 rows)

但当设置costbased_rewrite_rule参数值为pullup_expr_sublink时,则优化器会基于代价评估是否对子链接的MagicSet下推条件进行改写,因为此场景不进行MagicSet改写生成的计划更优,因此生成计划如下:

gaussdb=# SET rewrite_rule=magicset;
SET
gaussdb=# SET costbased_rewrite_rule=pullup_expr_sublink;
SET
gaussdb=# EXPLAIN SELECT * FROM ct1 WHERE  ct1.a = (SELECT a FROM ct2 WHERE ct2.b = ct1.b LIMIT 1)  AND ct1.c=10;
                                                 QUERY PLAN                                                  
-------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=6.03..51.82 rows=1 width=16)
   ->  Subquery Scan on subquery  (cost=1.03..1.06 rows=1 width=8)
         Filter: (subquery."?column?" OPERATOR(pg_catalog.=) 1::bigint)
         ->  WindowAgg  (cost=1.03..1.05 rows=1 width=8)
               row_number_filter: (row_number() OVER (PARTITION BY ct2.b) OPERATOR(pg_catalog.<=) 1::bigint)
               ->  Sort  (cost=1.03..1.03 rows=2 width=8)
                     Sort Key: ct2.b
                     ->  Seq Scan on ct2  (cost=0.00..1.02 rows=2 width=8)
   ->  Bitmap Heap Scan on ct1  (cost=5.00..50.75 rows=1 width=16)
         Recheck Cond: (b = subquery."?column?")
         Filter: ((c = 10) AND (subquery.a = a))
         ->  Bitmap Index Scan on idx_ct1_b  (cost=0.00..5.00 rows=100 width=0)
               Index Cond: (b = subquery."?column?")
(13 rows)

costbased_rewrite_rule=pullup_expr_sublink与rewrite_rule=disable_pullup_expr_sublink、rewrite_rule=enable_sublink_pullup_enhanced、rewrite_rule=magicset互斥,且costbased_rewrite_rule的优先级高于rewrite_rule的优先级。

AND条件中表达式子链接满足WINAGG改写场景

单个或AND条件里存在表达式类相关子链接,且满足WinAGG规则时可以进行SQL等价改写。如果关闭基于代价的查询改写评估策略,子链接会基于规则进行改写,生成计划如下:

gaussdb=# SET rewrite_rule=enable_sublink_pullup_enhanced;
SET
gaussdb=# SET costbased_rewrite_rule=none;
SET
gaussdb=# EXPLAIN SELECT * FROM ct1, ct2 WHERE ct1.b = ct2.c AND ct2.a = (SELECT min(b) FROM ct1 WHERE ct1.b = ct2.c);
                                            QUERY PLAN                                            
--------------------------------------------------------------------------------------------------
 Subquery Scan on inner_subquery  (cost=105.88..111.88 rows=1 width=36)
   Filter: (inner_subquery.a = inner_subquery.min)
   ->  WindowAgg  (cost=105.88..109.38 rows=200 width=32)
         ->  Sort  (cost=105.88..106.38 rows=200 width=32)
               Sort Key: ct2.c
               ->  Nested Loop  (cost=5.03..98.23 rows=200 width=32)
                     ->  Seq Scan on ct2  (cost=0.00..1.02 rows=2 width=16)
                     ->  Bitmap Heap Scan on ct1  (cost=5.03..47.61 rows=100 width=16)
                           Recheck Cond: (b = ct2.c)
                           ->  Bitmap Index Scan on idx_ct1_b  (cost=0.00..5.00 rows=100 width=0)
                                 Index Cond: (b = ct2.c)
(11 rows)

但当设置costbased_rewrite_rule参数值为pullup_expr_sublink,enable_sublink_pullup_enhanced时,则优化器会基于代价评估是否对子链接进行改写,因为此场景不进行改写生成的计划更优,因此生成计划如下:

gaussdb=# SET rewrite_rule=enable_sublink_pullup_enhanced;
SET
gaussdb=# SET costbased_rewrite_rule=pullup_expr_sublink,enable_sublink_pullup_enhanced;
SET
gaussdb=# EXPLAIN SELECT * FROM ct1, ct2 WHERE ct1.b = ct2.c AND ct2.a = (SELECT min(b) FROM ct1 WHERE ct1.b = ct2.c);
                                              QUERY PLAN                                              
------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=5.03..52.45 rows=100 width=32)
   ->  Seq Scan on ct2  (cost=0.00..1.17 rows=1 width=16)
         Filter: (a = (SubPlan 2))
         SubPlan 2
           ->  Result  (cost=0.06..0.07 rows=1 width=0)
                 InitPlan 1 (returns $1)
                   ->  Limit  (cost=0.00..0.06 rows=1 width=4)
                         ->  Index Only Scan using idx_ct1_b on ct1  (cost=0.00..6.25 rows=1 width=4)
                               Index Cond: ((b IS NOT NULL) AND (b = ct2.c))
   ->  Bitmap Heap Scan on ct1  (cost=5.03..50.27 rows=100 width=16)
         Recheck Cond: (b = ct2.c)
         ->  Bitmap Index Scan on idx_ct1_b  (cost=0.00..5.00 rows=100 width=0)
               Index Cond: (b = ct2.c)
(13 rows)

costbased_rewrite_rule=pullup_expr_sublink,enable_sublink_pullup_enhanced与rewrite_rule=disable_pullup_expr_sublink、rewrite_rule=enable_sublink_pullup_enhanced、rewrite_rule=magicset互斥,且costbased_rewrite_rule的优先级高于rewrite_rule的优先级。

OR条件中表达式子连接提升场景

OR条件里存在表达式类相关子链接,可以进行SQL等价改写。如果关闭基于代价的查询改写评估策略,子链接会基于规则进行改写,生成计划如下:

gaussdb=# SET rewrite_rule=none;
SET
gaussdb=# SET costbased_rewrite_rule=none;
SET
gaussdb=# EXPLAIN SELECT * FROM ct2 WHERE ct2.a = (SELECT min(b) FROM ct1 WHERE ct1.b = ct2.c) OR ct2.b = (SELECT sum(a) FROM ct3 WHERE ct3.a = ct2.c);
                                 QUERY PLAN                                  
-----------------------------------------------------------------------------
 Nested Loop Left Join  (cost=199.20..202.11 rows=1 width=16)
   Join Filter: (subquery."?column?" = ct2.c)
   Filter: ((ct2.a = (min(ct1.b))) OR (ct2.b = subquery.sum))
   ->  Hash Right Join  (cost=196.55..198.84 rows=2 width=20)
         Hash Cond: (ct1.b = ct2.c)
         ->  HashAggregate  (cost=195.50..196.51 rows=101 width=4)
               Group By Key: ct1.b
               ->  Seq Scan on ct1  (cost=0.00..145.00 rows=10100 width=4)
         ->  Hash  (cost=1.02..1.02 rows=2 width=16)
               ->  Seq Scan on ct2  (cost=0.00..1.02 rows=2 width=16)
   ->  Materialize  (cost=2.65..2.90 rows=10 width=12)
         ->  Subquery Scan on subquery  (cost=2.65..2.85 rows=10 width=12)
               ->  HashAggregate  (cost=2.65..2.75 rows=10 width=4)
                     Group By Key: ct3.a
                     ->  Seq Scan on ct3  (cost=0.00..2.10 rows=110 width=4)
(15 rows)

但当设置costbased_rewrite_rule参数值为pullup_expr_sublink时,则优化器会基于代价评估是否对子链接进行改写,因为此场景不进行改写生成的计划更优,因此生成计划如下:

gaussdb=# SET rewrite_rule=none;
SET
gaussdb=# SET costbased_rewrite_rule=pullup_expr_sublink;
SET
gaussdb=# EXPLAIN SELECT * FROM ct2 WHERE ct2.a = (SELECT min(b) FROM ct1 WHERE ct1.b = ct2.c) OR ct2.b = (SELECT sum(a) FROM ct3 WHERE ct3.a = ct2.c);
                                           QUERY PLAN                                           
------------------------------------------------------------------------------------------------
 Hash Right Join  (cost=3.69..4.10 rows=1 width=16)
   Hash Cond: (ct3.a = ct2.c)
   Filter: ((ct2.a = (SubPlan 2)) OR (ct2.b = (sum(ct3.a))))
   ->  HashAggregate  (cost=2.65..2.75 rows=10 width=4)
         Group By Key: ct3.a
         ->  Seq Scan on ct3  (cost=0.00..2.10 rows=110 width=4)
   ->  Hash  (cost=1.02..1.02 rows=2 width=16)
         ->  Seq Scan on ct2  (cost=0.00..1.02 rows=2 width=16)
   SubPlan 2
     ->  Result  (cost=0.06..0.07 rows=1 width=0)
           InitPlan 1 (returns $1)
             ->  Limit  (cost=0.00..0.06 rows=1 width=4)
                   ->  Index Only Scan using idx_ct1_b on ct1  (cost=0.00..6.25 rows=1 width=4)
                         Index Cond: ((b IS NOT NULL) AND (b = ct2.c))
(14 rows)

costbased_rewrite_rule=pullup_expr_sublink与rewrite_rule=disable_pullup_expr_sublink、rewrite_rule=enable_sublink_pullup_enhanced、rewrite_rule=magicset互斥,且costbased_rewrite_rule的优先级高于rewrite_rule的优先级。

TargetList中相关表达式子连接提升场景

TargetList中存在表达式类相关子链接时可以进行SQL等价改写。如果关闭基于代价的查询改写评估策略,子链接会基于规则进行改写,生成计划如下:

gaussdb=# SET rewrite_rule=intargetlist;
SET
gaussdb=# SET costbased_rewrite_rule=none;
SET
gaussdb=# EXPLAIN SELECT *, ct1.a = (SELECT a FROM ct2 WHERE ct2.b = ct1.b LIMIT 1)  FROM ct1 WHERE ct1.b >500;
                                                 QUERY PLAN                                                  
-------------------------------------------------------------------------------------------------------------
 Nested Loop Left Join  (cost=1.03..9.34 rows=1 width=20)
   Join Filter: (subquery."?column?" = ct1.b)
   ->  Index Scan using idx_ct1_b on ct1  (cost=0.00..8.27 rows=1 width=16)
         Index Cond: (b > 500)
   ->  Subquery Scan on subquery  (cost=1.03..1.06 rows=1 width=8)
         Filter: (subquery."?column?" OPERATOR(pg_catalog.=) 1::bigint)
         ->  WindowAgg  (cost=1.03..1.05 rows=1 width=8)
               row_number_filter: (row_number() OVER (PARTITION BY ct2.b) OPERATOR(pg_catalog.<=) 1::bigint)
               ->  Sort  (cost=1.03..1.03 rows=2 width=8)
                     Sort Key: ct2.b
                     ->  Seq Scan on ct2  (cost=0.00..1.02 rows=2 width=8)
(11 rows)

但当设置costbased_rewrite_rule参数值为intargetlist时,则优化器会基于代价评估是否对子链接进行改写,因为此场景不进行改写生成的计划更优,因此生成计划如下:

gaussdb=# SET rewrite_rule=intargetlist;
SET
gaussdb=# SET costbased_rewrite_rule=intargetlist;
SET
gaussdb=# EXPLAIN SELECT *, ct1.a = (SELECT a FROM ct2 WHERE ct2.b = ct1.b LIMIT 1)  FROM ct1 WHERE ct1.b >500;
                              QUERY PLAN                              
----------------------------------------------------------------------
 Index Scan using idx_ct1_b on ct1  (cost=0.00..9.29 rows=1 width=16)
   Index Cond: (b > 500)
   SubPlan 1
     ->  Limit  (cost=0.00..1.02 rows=1 width=4)
           ->  Seq Scan on ct2  (cost=0.00..1.02 rows=1 width=4)
                 Filter: (b = ct1.b)
(6 rows)

costbased_rewrite_rule=intargetlist与rewrite_rule=intargetlist、rewrite_rule=magicset互斥,且costbased_rewrite_rule的优先级高于rewrite_rule的优先级。

TargetList中表达式子链接满足MAGICSET下推改写场景

TargetList中存在表达式类相关子链接,且满足MagicSet下推规则可以将部分过滤条件下推到子链接中时可以进行SQL等价改写。如果关闭基于代价的查询改写评估策略,子链接会基于规则进行改写,生成计划如下:

gaussdb=# SET rewrite_rule=intargetlist,magicset;
SET
gaussdb=# SET costbased_rewrite_rule=none;
SET
gaussdb=# EXPLAIN SELECT *,ct1.a = (SELECT a FROM ct2 WHERE ct2.c = ct1.c LIMIT 1) FROM ct1 WHERE ct1.b=10;
                                                     QUERY PLAN                                                   

------------------------------------------------------------------------------------------------------------------
--
 Nested Loop Left Join  (cost=55.97..103.00 rows=100 width=20)
   Join Filter: (subquery."?column?" = costbased_rule_test.ct1.c)
   ->  Bitmap Heap Scan on ct1  (cost=5.03..50.27 rows=100 width=16)
         Recheck Cond: (b = 10)
         ->  Bitmap Index Scan on idx_ct1_b  (cost=0.00..5.00 rows=100 width=0)
               Index Cond: (b = 10)
   ->  Materialize  (cost=50.94..50.98 rows=1 width=8)
         ->  Subquery Scan on subquery  (cost=50.94..50.98 rows=1 width=8)
               Filter: (subquery."?column?" OPERATOR(pg_catalog.=) 1::bigint)
               ->  WindowAgg  (cost=50.94..50.96 rows=1 width=8)
                     row_number_filter: (row_number() OVER (PARTITION BY ct2.c) OPERATOR(pg_catalog.<=) 1::bigint)
                     ->  Sort  (cost=50.94..50.95 rows=2 width=8)
                           Sort Key: ct2.c
                           ->  Nested Loop  (cost=50.52..50.93 rows=2 width=8)
                                 ->  HashAggregate  (cost=50.52..50.53 rows=1 width=4)
                                       Group By Key: costbased_rule_test.ct1.c
                                       ->  Bitmap Heap Scan on ct1  (cost=5.03..50.27 rows=100 width=4)
                                             Recheck Cond: (b = 10)
                                             ->  Bitmap Index Scan on idx_ct1_b  (cost=0.00..5.00 rows=100 width=0
)
                                                   Index Cond: (b = 10)
                                 ->  Index Scan using idx_ct2_c on ct2  (cost=0.00..0.39 rows=1 width=8)
                                       Index Cond: (c = costbased_rule_test.ct1.c)
(22 rows)

但当设置costbased_rewrite_rule参数值为intargetlist时,则优化器会基于代价评估是否对子链接的MagicSet下推条件进行改写,因为此场景不进行MagicSet改写生成的计划更优,因此生成计划如下:

gaussdb=# SET rewrite_rule=intargetlist,magicset;
SET
gaussdb=# SET costbased_rewrite_rule=intargetlist;
SET
gaussdb=# EXPLAIN SELECT *,ct1.a = (SELECT a FROM ct2 WHERE ct2.c = ct1.c LIMIT 1) FROM ct1 WHERE ct1.b=10;
                                                    QUERY PLAN                                                    

------------------------------------------------------------------------------------------------------------------
-
 Hash Left Join  (cost=6.10..52.35 rows=100 width=20)
   Hash Cond: (ct1.c = subquery."?column?")
   ->  Bitmap Heap Scan on ct1  (cost=5.03..50.27 rows=100 width=16)
         Recheck Cond: (b = 10)
         ->  Bitmap Index Scan on idx_ct1_b  (cost=0.00..5.00 rows=100 width=0)
               Index Cond: (b = 10)
   ->  Hash  (cost=1.06..1.06 rows=1 width=8)
         ->  Subquery Scan on subquery  (cost=1.03..1.06 rows=1 width=8)
               Filter: (subquery."?column?" OPERATOR(pg_catalog.=) 1::bigint)
               ->  WindowAgg  (cost=1.03..1.05 rows=1 width=8)
                     row_number_filter: (row_number() OVER (PARTITION BY ct2.c) OPERATOR(pg_catalog.<=) 1::bigint)
                     ->  Sort  (cost=1.03..1.03 rows=2 width=8)
                           Sort Key: ct2.c
                           ->  Seq Scan on ct2  (cost=0.00..1.02 rows=2 width=8)
(14 rows)

costbased_rewrite_rule=intargetlist与rewrite_rule=intargetlist、rewrite_rule=magicset互斥,且costbased_rewrite_rule的优先级高于rewrite_rule的优先级。

简单子查询展开改写场景

存在简单子查询时可以进行SQL等价改写。如果关闭基于代价的查询改写评估策略,子链接会基于规则进行改写,生成计划如下:

gaussdb=# SET rewrite_rule=none;
SET
gaussdb=# SET costbased_rewrite_rule=none;
SET
gaussdb=# EXPLAIN SELECT * FROM ct2,(SELECT ct1.* FROM ct1,ct2 WHERE ct1.c > 10 AND ct1.a = 10) ct WHERE ct.a = ct2.d;
                                      QUERY PLAN
--------------------------------------------------------------------------------------
 Nested Loop  (cost=20000000097.15..2000000019660.81 rows=2 width=32)
   ->  Nested Loop  (cost=20000000097.15..2000000019652.51 rows=1 width=32)
         ->  Seq Scan on ct2  (cost=10000000000.00..1000000000102.50 rows=1 width=16)
               Filter: (d = 10)
         ->  Seq Scan on ct1  (cost=10000000097.15..1000000019550.00 rows=1 width=16)
               Filter: ((c > 10) AND (a = 10))
   ->  Index Only Scan using idx_ct2_c on ct2  (cost=0.00..8.28 rows=2 width=0)
(7 rows)

但当设置costbased_rewrite_rule参数值为pullup_subquery时,则优化器会基于代价评估是否对简单子查询进行改写,因为此场景不进行简单子查询改写生成的计划更优,因此生成计划如下:

gaussdb=# SET rewrite_rule=none;
SET
gaussdb=# SET costbased_rewrite_rule=pullup_subquery;
SET
gaussdb=# EXPLAIN SELECT * FROM ct2,(SELECT ct1.* FROM ct1,ct2 WHERE ct1.c > 10 AND ct1.a = 10) ct WHERE ct.a = ct2.d;
                                         QUERY PLAN
--------------------------------------------------------------------------------------------
 Nested Loop  (cost=20000000097.15..2000000019660.39 rows=1 width=32)
   Join Filter: (costbased_rule_test.ct2.d = ct1.a)
   ->  Seq Scan on ct2  (cost=10000000000.00..1000000000102.00 rows=2 width=16)
   ->  Materialize  (cost=10000000097.15..1000000019558.33 rows=2 width=16)
         ->  Nested Loop  (cost=10000000097.15..1000000019558.30 rows=2 width=16)
               ->  Seq Scan on ct1  (cost=10000000097.15..1000000019550.00 rows=1 width=16)
                     Filter: ((c > 10) AND (a = 10))
               ->  Index Only Scan using idx_ct2_c on ct2  (cost=0.00..8.28 rows=2 width=0)
(8 rows)

该场景仅受GUC参数costbased_rewrite_rule=pullup_subquery控制,GUC参数rewrite_rule无相关控制参数。

相关文档