案例:调整基于代价的查询重写GUC参数costbased_rewrite_rule
查询重写能让优化器选择不同的执行路径,但部分规则对SQL的改写,并不能确保输出后的都是可以提升性能的计划。一旦改写错误可能导致千倍的性能差距,因此在查询改写阶段需要对此类规则支持基于代价的评估策略,决策是否应用规则对SQL进行改写,这些规则受GUC参数costbased_rewrite_rule控制,其受控的规则如表1所示。
参数名 |
描述 |
---|---|
none |
不使用任何基于代价的查询改写策略。 |
pullup_subquery |
简单子查询展开改写规则开启使用基于代价的查询改写策略。 |
pullup_sublink_any_exists |
单个或AND条件里的ANY非相关子链接和[NOT] EXISTS相关子链接改写规则开启使用基于代价的查询改写策略。 |
pullup_not_in_sublink |
单个或AND条件里的NOT IN非相关子链接改写规则开启使用基于代价的查询改写策略。该参数与GUC参数rewrite_rule的disable_pullup_not_in_sublink参数互斥,当开启该参数时,rewrite_rule的disable_pullup_not_in_sublink参数功能不生效。 |
pullup_expr_sublink |
表达式子链接以及OR条件中的ANY非相关子链接和[NOT] EXISTS相关子链接改写场景生效。该参数与GUC参数rewrite_rule的disable_pullup_expr_sublink、enable_sublink_pullup_enhanced、magicset参数互斥,当开启该参数时rewrite_rule的相关参数功能不生效。 |
intargetlist |
TargetList中的相关表达式子链接改写规则开启使用基于代价的查询改写策略。该参数与GUC参数rewrite_rule的intargetlist、magicset参数互斥,当开启该参数时,rewrite_rule的intargetlist、magicset参数功能不生效。 |
enable_sublink_pullup_enhanced |
表达式子链接改写功能增强场景使用基于代价的查询改写策略。该参数受pullup_expr_sublink参数影响,AND条件中的表达式子链接改写场景需开启pullup_expr_sublink参数,该参数功能才生效。且与GUC参数rewrite_rule的enable_sublink_pullup_enhanced参数互斥,当开启该参数时,rewrite_rule的enable_sublink_pullup_enhanced参数功能不生效。 |
对于上述参数所控制的查询重写规则,以下通过典型案例说明其使用场景。
案例环境准备
为了便于规则的使用场景演示,需准备建表语句如下:
SET client_min_messages = warning; SET CLIENT_ENCODING = 'UTF8'; --清理环境。 DROP SCHEMA IF EXISTS costbased_rule_test cascade; CREATE SCHEMA costbased_rule_test; SET current_schema = costbased_rule_test; SET enable_codegen = off; DROP TABLE IF EXISTS costbased_rule_test.ct1; DROP TABLE IF EXISTS costbased_rule_test.ct2; DROP TABLE IF EXISTS costbased_rule_test.ct3; DROP TABLE IF EXISTS costbased_rule_test.ct4; --创建测试表。 CREATE TABLE ct1 (a INT, b INT, c INT, d INT); CREATE TABLE ct2 (a INT, b INT, c INT, d INT); CREATE TABLE ct3 (a INT, b INT, c INT, d INT); CREATE TABLE ct4 (a INT, b INT, c INT, d INT); CREATE INDEX idx_ct1_b ON ct1(b); CREATE INDEX idx_ct2_c ON ct2(c); CREATE INDEX idx_ct3_c ON ct3(c); --插入数据。 INSERT INTO ct1 (a, b, c) VALUES (generate_series(1, 100), generate_series(200, 300), left(random()::int, 100)); INSERT INTO ct2 VALUES(1,2,3,4),(3,4,5,6); INSERT INTO ct3 (a, b, c, d) VALUES (generate_series(1, 10), generate_series(20, 30), left(random()::int, 10), left(random()::int, 10)); --更新统计信息。 ANALYZE ct1; ANALYZE ct2; ANALYZE ct3;
Having条件中的非相关子链接改写场景
Having条件中存在带有AGG的非相关表达式子链接时可以进行SQL等价改写。如果关闭基于代价的查询改写评估策略,子链接会基于规则进行改写,生成计划如下:
gaussdb=# SET rewrite_rule=enable_sublink_pullup_enhanced; SET gaussdb=# SET costbased_rewrite_rule=none; SET gaussdb=# EXPLAIN SELECT ct2.a, SUM(ct2.b) AS value FROM ct2,ct1 WHERE ct2.a = ct1.b GROUP BY ct2.a HAVING SUM(ct2.b) >= (SELECT AVG(ct2.b) FROM ct2,ct1 WHERE ct2.a = ct1.b) ORDER BY value DESC; QUERY PLAN ---------------------------------------------------------------------------------------------------- Sort (cost=33.14..33.14 rows=1 width=12) Sort Key: (sum(costbased_rule_test.ct2.b)) DESC -> Nested Loop (cost=33.04..33.13 rows=1 width=12) Join Filter: ((sum(costbased_rule_test.ct2.b)) >= (avg(costbased_rule_test.ct2.b))) -> Aggregate (cost=16.02..16.03 rows=1 width=36) -> Nested Loop (cost=0.00..15.52 rows=200 width=4) -> Seq Scan on ct2 (cost=0.00..1.02 rows=2 width=8) -> Index Only Scan using idx_ct1_b on ct1 (cost=0.00..6.25 rows=100 width=4) Index Cond: (b = costbased_rule_test.ct2.a) -> HashAggregate (cost=17.02..17.04 rows=2 width=8) Group By Key: costbased_rule_test.ct2.a -> Nested Loop (cost=0.00..15.52 rows=200 width=8) -> Seq Scan on ct2 (cost=0.00..1.02 rows=2 width=8) -> Index Only Scan using idx_ct1_b on ct1 (cost=0.00..6.25 rows=100 width=4) Index Cond: (b = costbased_rule_test.ct2.a) (15 rows)
但当设置costbased_rewrite_rule参数值为pullup_expr_sublink,enable_sublink_pullup_enhanced时,优化器会基于代价评估是否对子链接进行改写,因为此场景不进行SQL改写生成的计划更优,因此生成计划如下:
gaussdb=# SET rewrite_rule=enable_sublink_pullup_enhanced; SET gaussdb=# SET costbased_rewrite_rule=pullup_expr_sublink,enable_sublink_pullup_enhanced; SET gaussdb=# EXPLAIN SELECT ct2.a, SUM(ct2.b) AS value FROM ct2,ct1 WHERE ct2.a = ct1.b GROUP BY ct2.a HAVING SUM(ct2.b) >= (SELECT AVG(ct2.b) FROM ct2,ct1 WHERE ct2.a = ct1.b) ORDER BY value DESC; QUERY PLAN ------------------------------------------------------------------------------------------------ Sort (cost=33.09..33.09 rows=2 width=8) Sort Key: (sum(costbased_rule_test.ct2.b)) DESC InitPlan 1 (returns $1) -> Aggregate (cost=16.02..16.03 rows=1 width=36) -> Nested Loop (cost=0.00..15.52 rows=200 width=4) -> Seq Scan on ct2 (cost=0.00..1.02 rows=2 width=8) -> Index Only Scan using idx_ct1_b on ct1 (cost=0.00..6.25 rows=100 width=4) Index Cond: (b = costbased_rule_test.ct2.a) -> HashAggregate (cost=17.02..17.04 rows=2 width=8) Group By Key: costbased_rule_test.ct2.a Filter: (sum(costbased_rule_test.ct2.b) >= $1) -> Nested Loop (cost=0.00..15.52 rows=200 width=8) -> Seq Scan on ct2 (cost=0.00..1.02 rows=2 width=8) -> Index Only Scan using idx_ct1_b on ct1 (cost=0.00..6.25 rows=100 width=4) Index Cond: (b = costbased_rule_test.ct2.a) (15 rows)
costbased_rewrite_rule=pullup_expr_sublink,enable_sublink_pullup_enhanced与rewrite_rule=disable_pullup_expr_sublink、rewrite_rule=enable_sublink_pullup_enhanced互斥,且costbased_rewrite_rule的值优先级高于rewrite_rule的优先级。
AND条件中ANY/IN非相关子链接提升场景
单个或AND条件里存在ANY/IN非相关子链接时可以进行SQL等价改写。如果关闭基于代价的查询改写评估策略,子链接会基于规则进行改写,生成计划如下:
gaussdb=# SET rewrite_rule=none; SET gaussdb=# SET costbased_rewrite_rule=none; SET gaussdb=# EXPLAIN SELECT * FROM ct2 WHERE ct2.a IN (SELECT a FROM ct1 WHERE ct1.b > 1000); QUERY PLAN --------------------------------------------------------------------------------- Nested Loop Semi Join (cost=0.00..9.32 rows=1 width=16) Join Filter: (ct2.a = ct1.a) -> Seq Scan on ct2 (cost=0.00..1.02 rows=2 width=16) -> Materialize (cost=0.00..8.27 rows=1 width=4) -> Index Scan using idx_ct1_b on ct1 (cost=0.00..8.27 rows=1 width=4) Index Cond: (b > 1000) (6 rows)
但当设置costbased_rewrite_rule参数值为pullup_sublink_any_exists时,则优化器会基于代价评估是否对子链接进行改写,因为此场景不进行SQL改写生成的计划更优,因此生成计划如下:
gaussdb=# SET rewrite_rule=none; SET gaussdb=# SET costbased_rewrite_rule=pullup_sublink_any_exists; SET gaussdb=# EXPLAIN SELECT * FROM ct2 WHERE ct2.a IN (SELECT a FROM ct1 WHERE ct1.b > 1000); QUERY PLAN ----------------------------------------------------------------------------- Seq Scan on ct2 (cost=8.27..9.29 rows=1 width=16) Filter: (hashed SubPlan 1) SubPlan 1 -> Index Scan using idx_ct1_b on ct1 (cost=0.00..8.27 rows=1 width=4) Index Cond: (b > 1000) (5 rows)
该场景仅受GUC参数costbased_rewrite_rule=pullup_sublink_any_exists控制,GUC参数rewrite_rule无相关控制参数。
AND条件中[NOT]EXISTS相关子连接提升场景
单个或AND条件里存在[NOT]EXISTS相关子链接时可以进行SQL等价改写。如果关闭基于代价的查询改写评估策略,子链接会基于规则进行改写,生成计划如下:
gaussdb=# SET rewrite_rule=none; SET gaussdb=# SET costbased_rewrite_rule=none; SET gaussdb=# EXPLAIN SELECT * FROM ct2 WHERE EXISTS (SELECT 1 FROM ct3 WHERE ct3.b = ct2.c AND ct3.c < 1000); QUERY PLAN ---------------------------------------------------------------- Hash Right Semi Join (cost=1.04..3.73 rows=2 width=16) Hash Cond: (ct3.b = ct2.c) -> Seq Scan on ct3 (cost=0.00..2.38 rows=110 width=4) Filter: (c < 1000) -> Hash (cost=1.02..1.02 rows=2 width=16) -> Seq Scan on ct2 (cost=0.00..1.02 rows=2 width=16) (6 rows)
但当设置costbased_rewrite_rule参数值为pullup_sublink_any_exists时,则优化器会基于代价评估是否对子链接进行改写,因为此场景不进行SQL改写生成的计划更优,因此生成计划如下:
gaussdb=# SET rewrite_rule=none; SET gaussdb=# SET costbased_rewrite_rule=pullup_sublink_any_exists; SET gaussdb=# EXPLAIN SELECT * FROM ct2 WHERE EXISTS (SELECT 1 FROM ct3 WHERE ct3.b = ct2.c AND ct3.c < 1000); QUERY PLAN ------------------------------------------------------------- Seq Scan on ct2 (cost=0.00..1.55 rows=1 width=16) Filter: (alternatives: SubPlan 1 or hashed SubPlan 2) SubPlan 1 -> Seq Scan on ct3 (cost=0.00..2.65 rows=10 width=0) Filter: ((c < 1000) AND (b = ct2.c)) SubPlan 2 -> Seq Scan on ct3 (cost=0.00..2.38 rows=110 width=4) Filter: (c < 1000) (8 rows)
该场景仅受GUC参数costbased_rewrite_rule=pullup_sublink_any_exists控制,GUC参数rewrite_rule无相关控制参数。
AND条件中NOT IN非相关子连接提升场景
单个或AND条件里存在NOT IN非相关子链接时可以进行SQL等价改写。如果关闭基于代价的查询改写评估策略,子链接会基于规则进行改写,生成计划如下:
gaussdb=# SET rewrite_rule=none; SET gaussdb=# SET costbased_rewrite_rule=none; SET gaussdb=# EXPLAIN SELECT * FROM ct1 WHERE ct1.b NOT IN (SELECT a FROM ct3); QUERY PLAN ------------------------------------------------------------------------ Nested Loop Anti Join (cost=0.00..15437.60 rows=9100 width=16) Join Filter: ((ct1.b = ct3.a) OR (ct1.b IS NULL) OR (ct3.a IS NULL)) -> Seq Scan on ct1 (cost=0.00..145.00 rows=10100 width=16) -> Materialize (cost=0.00..2.65 rows=110 width=4) -> Seq Scan on ct3 (cost=0.00..2.10 rows=110 width=4) (5 rows)
但当设置costbased_rewrite_rule参数值为pullup_not_in_sublink时,则优化器会基于代价评估是否对子链接进行改写,因为此场景不进行SQL改写生成的计划更优,因此生成计划如下:
gaussdb=# SET rewrite_rule=none; SET gaussdb=# SET costbased_rewrite_rule=pullup_not_in_sublink; SET gaussdb=# EXPLAIN SELECT * FROM ct1 WHERE ct1.b NOT IN (SELECT a FROM ct3); QUERY PLAN ------------------------------------------------------------- Seq Scan on ct1 (cost=2.38..172.62 rows=5050 width=16) Filter: (NOT (hashed SubPlan 1)) SubPlan 1 -> Seq Scan on ct3 (cost=0.00..2.10 rows=110 width=4) (4 rows)
costbased_rewrite_rule=pullup_not_in_sublink与rewrite_rule=disable_pullup_not_in_sublink互斥,且costbased_rewrite_rule的优先级高于rewrite_rule的优先级。
AND条件中表达式子链接提升场景
单个或AND条件里存在表达式类相关子链接时可以进行SQL等价改写。如果关闭基于代价的查询改写评估策略,子链接会基于规则进行改写,生成计划如下:
gaussdb=# SET rewrite_rule=none; SET gaussdb=# SET costbased_rewrite_rule=none; SET gaussdb=# EXPLAIN SELECT * FROM ct1 WHERE ct1.a = (SELECT a FROM ct2 WHERE ct2.b = ct1.b LIMIT 1) AND ct1.b > 500; QUERY PLAN ------------------------------------------------------------------------------------------------------------- Nested Loop (cost=1.03..9.35 rows=1 width=16) Join Filter: ((ct1.b = subquery."?column?") AND (ct1.a = subquery.a)) -> Index Scan using idx_ct1_b on ct1 (cost=0.00..8.27 rows=1 width=16) Index Cond: (b > 500) -> Subquery Scan on subquery (cost=1.03..1.07 rows=1 width=8) Filter: (subquery."?column?" OPERATOR(pg_catalog.=) 1::bigint) -> WindowAgg (cost=1.03..1.06 rows=1 width=8) row_number_filter: (row_number() OVER (PARTITION BY ct2.b) OPERATOR(pg_catalog.<=) 1::bigint) -> Sort (cost=1.03..1.04 rows=1 width=8) Sort Key: ct2.b -> Seq Scan on ct2 (cost=1.02..1.02 rows=1 width=8) Filter: (b > 500) (12 rows)
但当设置costbased_rewrite_rule参数值为pullup_expr_sublink时,则优化器会基于代价评估是否对子链接进行改写,因为此场景不进行SQL改写生成的计划更优,因此生成计划如下:
gaussdb=# SET rewrite_rule=none; SET gaussdb=# SET costbased_rewrite_rule=pullup_expr_sublink; SET gaussdb=# EXPLAIN SELECT * FROM ct1 WHERE ct1.a = (SELECT a FROM ct2 WHERE ct2.b = ct1.b LIMIT 1) AND ct1.b > 500; QUERY PLAN ---------------------------------------------------------------------- Index Scan using idx_ct1_b on ct1 (cost=0.00..9.29 rows=1 width=16) Index Cond: (b > 500) Filter: (a = (SubPlan 1)) SubPlan 1 -> Limit (cost=0.00..1.02 rows=1 width=4) -> Seq Scan on ct2 (cost=0.00..1.02 rows=1 width=4) Filter: (b = ct1.b) (7 rows)
costbased_rewrite_rule=pullup_expr_sublink与rewrite_rule=disable_pullup_expr_sublink、rewrite_rule=enable_sublink_pullup_enhanced、rewrite_rule=magicset互斥,且costbased_rewrite_rule的优先级高于rewrite_rule的优先级。
AND条件中表达式子链接满足MAGICSET下推改写场景
单个或AND条件里存在表达式类相关子链接,且满足MagicSet下推规则可以将部分过滤条件下推到子链接中时可以进行SQL等价改写。如果关闭基于代价的查询改写评估策略,子链接会基于规则进行改写,生成计划如下:
gaussdb=# SET rewrite_rule=magicset; SET gaussdb=# SET costbased_rewrite_rule=none; SET gaussdb=# EXPLAIN SELECT * FROM ct1 WHERE ct1.a = (SELECT a FROM ct2 WHERE ct2.b = ct1.b LIMIT 1) AND ct1.c=10; QUERY PLAN ------------------------------------------------------------------------------------------------------------- Nested Loop (cost=101.70..147.50 rows=1 width=16) -> Subquery Scan on subquery (cost=96.70..96.74 rows=1 width=8) Filter: (subquery."?column?" OPERATOR(pg_catalog.=) 1::bigint) -> WindowAgg (cost=96.70..96.72 rows=1 width=8) row_number_filter: (row_number() OVER (PARTITION BY ct2.b) OPERATOR(pg_catalog.<=) 1::bigint) -> Sort (cost=96.70..96.71 rows=1 width=8) Sort Key: ct2.b -> Nested Loop Semi Join (cost=5.00..96.69 rows=1 width=8) -> Seq Scan on ct2 (cost=0.00..1.02 rows=2 width=8) -> Bitmap Heap Scan on ct1 (cost=5.00..47.83 rows=1 width=4) Recheck Cond: (b = ct2.b) Filter: (c = 10) -> Bitmap Index Scan on idx_ct1_b (cost=0.00..5.00 rows=100 width=0) Index Cond: (b = ct2.b) -> Bitmap Heap Scan on ct1 (cost=5.00..50.75 rows=1 width=16) Recheck Cond: (b = subquery."?column?") Filter: ((c = 10) AND (subquery.a = a)) -> Bitmap Index Scan on idx_ct1_b (cost=0.00..5.00 rows=100 width=0) Index Cond: (b = subquery."?column?") (19 rows)
但当设置costbased_rewrite_rule参数值为pullup_expr_sublink时,则优化器会基于代价评估是否对子链接的MagicSet下推条件进行改写,因为此场景不进行MagicSet改写生成的计划更优,因此生成计划如下:
gaussdb=# SET rewrite_rule=magicset; SET gaussdb=# SET costbased_rewrite_rule=pullup_expr_sublink; SET gaussdb=# EXPLAIN SELECT * FROM ct1 WHERE ct1.a = (SELECT a FROM ct2 WHERE ct2.b = ct1.b LIMIT 1) AND ct1.c=10; QUERY PLAN ------------------------------------------------------------------------------------------------------------- Nested Loop (cost=6.03..51.82 rows=1 width=16) -> Subquery Scan on subquery (cost=1.03..1.06 rows=1 width=8) Filter: (subquery."?column?" OPERATOR(pg_catalog.=) 1::bigint) -> WindowAgg (cost=1.03..1.05 rows=1 width=8) row_number_filter: (row_number() OVER (PARTITION BY ct2.b) OPERATOR(pg_catalog.<=) 1::bigint) -> Sort (cost=1.03..1.03 rows=2 width=8) Sort Key: ct2.b -> Seq Scan on ct2 (cost=0.00..1.02 rows=2 width=8) -> Bitmap Heap Scan on ct1 (cost=5.00..50.75 rows=1 width=16) Recheck Cond: (b = subquery."?column?") Filter: ((c = 10) AND (subquery.a = a)) -> Bitmap Index Scan on idx_ct1_b (cost=0.00..5.00 rows=100 width=0) Index Cond: (b = subquery."?column?") (13 rows)
costbased_rewrite_rule=pullup_expr_sublink与rewrite_rule=disable_pullup_expr_sublink、rewrite_rule=enable_sublink_pullup_enhanced、rewrite_rule=magicset互斥,且costbased_rewrite_rule的优先级高于rewrite_rule的优先级。
AND条件中表达式子链接满足WINAGG改写场景
单个或AND条件里存在表达式类相关子链接,且满足WinAGG规则时可以进行SQL等价改写。如果关闭基于代价的查询改写评估策略,子链接会基于规则进行改写,生成计划如下:
gaussdb=# SET rewrite_rule=enable_sublink_pullup_enhanced; SET gaussdb=# SET costbased_rewrite_rule=none; SET gaussdb=# EXPLAIN SELECT * FROM ct1, ct2 WHERE ct1.b = ct2.c AND ct2.a = (SELECT min(b) FROM ct1 WHERE ct1.b = ct2.c); QUERY PLAN -------------------------------------------------------------------------------------------------- Subquery Scan on inner_subquery (cost=105.88..111.88 rows=1 width=36) Filter: (inner_subquery.a = inner_subquery.min) -> WindowAgg (cost=105.88..109.38 rows=200 width=32) -> Sort (cost=105.88..106.38 rows=200 width=32) Sort Key: ct2.c -> Nested Loop (cost=5.03..98.23 rows=200 width=32) -> Seq Scan on ct2 (cost=0.00..1.02 rows=2 width=16) -> Bitmap Heap Scan on ct1 (cost=5.03..47.61 rows=100 width=16) Recheck Cond: (b = ct2.c) -> Bitmap Index Scan on idx_ct1_b (cost=0.00..5.00 rows=100 width=0) Index Cond: (b = ct2.c) (11 rows)
但当设置costbased_rewrite_rule参数值为pullup_expr_sublink,enable_sublink_pullup_enhanced时,则优化器会基于代价评估是否对子链接进行改写,因为此场景不进行改写生成的计划更优,因此生成计划如下:
gaussdb=# SET rewrite_rule=enable_sublink_pullup_enhanced; SET gaussdb=# SET costbased_rewrite_rule=pullup_expr_sublink,enable_sublink_pullup_enhanced; SET gaussdb=# EXPLAIN SELECT * FROM ct1, ct2 WHERE ct1.b = ct2.c AND ct2.a = (SELECT min(b) FROM ct1 WHERE ct1.b = ct2.c); QUERY PLAN ------------------------------------------------------------------------------------------------------ Nested Loop (cost=5.03..52.45 rows=100 width=32) -> Seq Scan on ct2 (cost=0.00..1.17 rows=1 width=16) Filter: (a = (SubPlan 2)) SubPlan 2 -> Result (cost=0.06..0.07 rows=1 width=0) InitPlan 1 (returns $1) -> Limit (cost=0.00..0.06 rows=1 width=4) -> Index Only Scan using idx_ct1_b on ct1 (cost=0.00..6.25 rows=1 width=4) Index Cond: ((b IS NOT NULL) AND (b = ct2.c)) -> Bitmap Heap Scan on ct1 (cost=5.03..50.27 rows=100 width=16) Recheck Cond: (b = ct2.c) -> Bitmap Index Scan on idx_ct1_b (cost=0.00..5.00 rows=100 width=0) Index Cond: (b = ct2.c) (13 rows)
costbased_rewrite_rule=pullup_expr_sublink,enable_sublink_pullup_enhanced与rewrite_rule=disable_pullup_expr_sublink、rewrite_rule=enable_sublink_pullup_enhanced、rewrite_rule=magicset互斥,且costbased_rewrite_rule的优先级高于rewrite_rule的优先级。
OR条件中表达式子连接提升场景
OR条件里存在表达式类相关子链接,可以进行SQL等价改写。如果关闭基于代价的查询改写评估策略,子链接会基于规则进行改写,生成计划如下:
gaussdb=# SET rewrite_rule=none; SET gaussdb=# SET costbased_rewrite_rule=none; SET gaussdb=# EXPLAIN SELECT * FROM ct2 WHERE ct2.a = (SELECT min(b) FROM ct1 WHERE ct1.b = ct2.c) OR ct2.b = (SELECT sum(a) FROM ct3 WHERE ct3.a = ct2.c); QUERY PLAN ----------------------------------------------------------------------------- Nested Loop Left Join (cost=199.20..202.11 rows=1 width=16) Join Filter: (subquery."?column?" = ct2.c) Filter: ((ct2.a = (min(ct1.b))) OR (ct2.b = subquery.sum)) -> Hash Right Join (cost=196.55..198.84 rows=2 width=20) Hash Cond: (ct1.b = ct2.c) -> HashAggregate (cost=195.50..196.51 rows=101 width=4) Group By Key: ct1.b -> Seq Scan on ct1 (cost=0.00..145.00 rows=10100 width=4) -> Hash (cost=1.02..1.02 rows=2 width=16) -> Seq Scan on ct2 (cost=0.00..1.02 rows=2 width=16) -> Materialize (cost=2.65..2.90 rows=10 width=12) -> Subquery Scan on subquery (cost=2.65..2.85 rows=10 width=12) -> HashAggregate (cost=2.65..2.75 rows=10 width=4) Group By Key: ct3.a -> Seq Scan on ct3 (cost=0.00..2.10 rows=110 width=4) (15 rows)
但当设置costbased_rewrite_rule参数值为pullup_expr_sublink时,则优化器会基于代价评估是否对子链接进行改写,因为此场景不进行改写生成的计划更优,因此生成计划如下:
gaussdb=# SET rewrite_rule=none; SET gaussdb=# SET costbased_rewrite_rule=pullup_expr_sublink; SET gaussdb=# EXPLAIN SELECT * FROM ct2 WHERE ct2.a = (SELECT min(b) FROM ct1 WHERE ct1.b = ct2.c) OR ct2.b = (SELECT sum(a) FROM ct3 WHERE ct3.a = ct2.c); QUERY PLAN ------------------------------------------------------------------------------------------------ Hash Right Join (cost=3.69..4.10 rows=1 width=16) Hash Cond: (ct3.a = ct2.c) Filter: ((ct2.a = (SubPlan 2)) OR (ct2.b = (sum(ct3.a)))) -> HashAggregate (cost=2.65..2.75 rows=10 width=4) Group By Key: ct3.a -> Seq Scan on ct3 (cost=0.00..2.10 rows=110 width=4) -> Hash (cost=1.02..1.02 rows=2 width=16) -> Seq Scan on ct2 (cost=0.00..1.02 rows=2 width=16) SubPlan 2 -> Result (cost=0.06..0.07 rows=1 width=0) InitPlan 1 (returns $1) -> Limit (cost=0.00..0.06 rows=1 width=4) -> Index Only Scan using idx_ct1_b on ct1 (cost=0.00..6.25 rows=1 width=4) Index Cond: ((b IS NOT NULL) AND (b = ct2.c)) (14 rows)
costbased_rewrite_rule=pullup_expr_sublink与rewrite_rule=disable_pullup_expr_sublink、rewrite_rule=enable_sublink_pullup_enhanced、rewrite_rule=magicset互斥,且costbased_rewrite_rule的优先级高于rewrite_rule的优先级。
TargetList中相关表达式子连接提升场景
TargetList中存在表达式类相关子链接时可以进行SQL等价改写。如果关闭基于代价的查询改写评估策略,子链接会基于规则进行改写,生成计划如下:
gaussdb=# SET rewrite_rule=intargetlist; SET gaussdb=# SET costbased_rewrite_rule=none; SET gaussdb=# EXPLAIN SELECT *, ct1.a = (SELECT a FROM ct2 WHERE ct2.b = ct1.b LIMIT 1) FROM ct1 WHERE ct1.b >500; QUERY PLAN ------------------------------------------------------------------------------------------------------------- Nested Loop Left Join (cost=1.03..9.34 rows=1 width=20) Join Filter: (subquery."?column?" = ct1.b) -> Index Scan using idx_ct1_b on ct1 (cost=0.00..8.27 rows=1 width=16) Index Cond: (b > 500) -> Subquery Scan on subquery (cost=1.03..1.06 rows=1 width=8) Filter: (subquery."?column?" OPERATOR(pg_catalog.=) 1::bigint) -> WindowAgg (cost=1.03..1.05 rows=1 width=8) row_number_filter: (row_number() OVER (PARTITION BY ct2.b) OPERATOR(pg_catalog.<=) 1::bigint) -> Sort (cost=1.03..1.03 rows=2 width=8) Sort Key: ct2.b -> Seq Scan on ct2 (cost=0.00..1.02 rows=2 width=8) (11 rows)
但当设置costbased_rewrite_rule参数值为intargetlist时,则优化器会基于代价评估是否对子链接进行改写,因为此场景不进行改写生成的计划更优,因此生成计划如下:
gaussdb=# SET rewrite_rule=intargetlist; SET gaussdb=# SET costbased_rewrite_rule=intargetlist; SET gaussdb=# EXPLAIN SELECT *, ct1.a = (SELECT a FROM ct2 WHERE ct2.b = ct1.b LIMIT 1) FROM ct1 WHERE ct1.b >500; QUERY PLAN ---------------------------------------------------------------------- Index Scan using idx_ct1_b on ct1 (cost=0.00..9.29 rows=1 width=16) Index Cond: (b > 500) SubPlan 1 -> Limit (cost=0.00..1.02 rows=1 width=4) -> Seq Scan on ct2 (cost=0.00..1.02 rows=1 width=4) Filter: (b = ct1.b) (6 rows)
costbased_rewrite_rule=intargetlist与rewrite_rule=intargetlist、rewrite_rule=magicset互斥,且costbased_rewrite_rule的优先级高于rewrite_rule的优先级。
TargetList中表达式子链接满足MAGICSET下推改写场景
TargetList中存在表达式类相关子链接,且满足MagicSet下推规则可以将部分过滤条件下推到子链接中时可以进行SQL等价改写。如果关闭基于代价的查询改写评估策略,子链接会基于规则进行改写,生成计划如下:
gaussdb=# SET rewrite_rule=intargetlist,magicset; SET gaussdb=# SET costbased_rewrite_rule=none; SET gaussdb=# EXPLAIN SELECT *,ct1.a = (SELECT a FROM ct2 WHERE ct2.c = ct1.c LIMIT 1) FROM ct1 WHERE ct1.b=10; QUERY PLAN ------------------------------------------------------------------------------------------------------------------ -- Nested Loop Left Join (cost=55.97..103.00 rows=100 width=20) Join Filter: (subquery."?column?" = costbased_rule_test.ct1.c) -> Bitmap Heap Scan on ct1 (cost=5.03..50.27 rows=100 width=16) Recheck Cond: (b = 10) -> Bitmap Index Scan on idx_ct1_b (cost=0.00..5.00 rows=100 width=0) Index Cond: (b = 10) -> Materialize (cost=50.94..50.98 rows=1 width=8) -> Subquery Scan on subquery (cost=50.94..50.98 rows=1 width=8) Filter: (subquery."?column?" OPERATOR(pg_catalog.=) 1::bigint) -> WindowAgg (cost=50.94..50.96 rows=1 width=8) row_number_filter: (row_number() OVER (PARTITION BY ct2.c) OPERATOR(pg_catalog.<=) 1::bigint) -> Sort (cost=50.94..50.95 rows=2 width=8) Sort Key: ct2.c -> Nested Loop (cost=50.52..50.93 rows=2 width=8) -> HashAggregate (cost=50.52..50.53 rows=1 width=4) Group By Key: costbased_rule_test.ct1.c -> Bitmap Heap Scan on ct1 (cost=5.03..50.27 rows=100 width=4) Recheck Cond: (b = 10) -> Bitmap Index Scan on idx_ct1_b (cost=0.00..5.00 rows=100 width=0 ) Index Cond: (b = 10) -> Index Scan using idx_ct2_c on ct2 (cost=0.00..0.39 rows=1 width=8) Index Cond: (c = costbased_rule_test.ct1.c) (22 rows)
但当设置costbased_rewrite_rule参数值为intargetlist时,则优化器会基于代价评估是否对子链接的MagicSet下推条件进行改写,因为此场景不进行MagicSet改写生成的计划更优,因此生成计划如下:
gaussdb=# SET rewrite_rule=intargetlist,magicset; SET gaussdb=# SET costbased_rewrite_rule=intargetlist; SET gaussdb=# EXPLAIN SELECT *,ct1.a = (SELECT a FROM ct2 WHERE ct2.c = ct1.c LIMIT 1) FROM ct1 WHERE ct1.b=10; QUERY PLAN ------------------------------------------------------------------------------------------------------------------ - Hash Left Join (cost=6.10..52.35 rows=100 width=20) Hash Cond: (ct1.c = subquery."?column?") -> Bitmap Heap Scan on ct1 (cost=5.03..50.27 rows=100 width=16) Recheck Cond: (b = 10) -> Bitmap Index Scan on idx_ct1_b (cost=0.00..5.00 rows=100 width=0) Index Cond: (b = 10) -> Hash (cost=1.06..1.06 rows=1 width=8) -> Subquery Scan on subquery (cost=1.03..1.06 rows=1 width=8) Filter: (subquery."?column?" OPERATOR(pg_catalog.=) 1::bigint) -> WindowAgg (cost=1.03..1.05 rows=1 width=8) row_number_filter: (row_number() OVER (PARTITION BY ct2.c) OPERATOR(pg_catalog.<=) 1::bigint) -> Sort (cost=1.03..1.03 rows=2 width=8) Sort Key: ct2.c -> Seq Scan on ct2 (cost=0.00..1.02 rows=2 width=8) (14 rows)
costbased_rewrite_rule=intargetlist与rewrite_rule=intargetlist、rewrite_rule=magicset互斥,且costbased_rewrite_rule的优先级高于rewrite_rule的优先级。
简单子查询展开改写场景
存在简单子查询时可以进行SQL等价改写。如果关闭基于代价的查询改写评估策略,子链接会基于规则进行改写,生成计划如下:
gaussdb=# SET rewrite_rule=none; SET gaussdb=# SET costbased_rewrite_rule=none; SET gaussdb=# EXPLAIN SELECT * FROM ct2,(SELECT ct1.* FROM ct1,ct2 WHERE ct1.c > 10 AND ct1.a = 10) ct WHERE ct.a = ct2.d; QUERY PLAN -------------------------------------------------------------------------------------- Nested Loop (cost=20000000097.15..2000000019660.81 rows=2 width=32) -> Nested Loop (cost=20000000097.15..2000000019652.51 rows=1 width=32) -> Seq Scan on ct2 (cost=10000000000.00..1000000000102.50 rows=1 width=16) Filter: (d = 10) -> Seq Scan on ct1 (cost=10000000097.15..1000000019550.00 rows=1 width=16) Filter: ((c > 10) AND (a = 10)) -> Index Only Scan using idx_ct2_c on ct2 (cost=0.00..8.28 rows=2 width=0) (7 rows)
但当设置costbased_rewrite_rule参数值为pullup_subquery时,则优化器会基于代价评估是否对简单子查询进行改写,因为此场景不进行简单子查询改写生成的计划更优,因此生成计划如下:
gaussdb=# SET rewrite_rule=none; SET gaussdb=# SET costbased_rewrite_rule=pullup_subquery; SET gaussdb=# EXPLAIN SELECT * FROM ct2,(SELECT ct1.* FROM ct1,ct2 WHERE ct1.c > 10 AND ct1.a = 10) ct WHERE ct.a = ct2.d; QUERY PLAN -------------------------------------------------------------------------------------------- Nested Loop (cost=20000000097.15..2000000019660.39 rows=1 width=32) Join Filter: (costbased_rule_test.ct2.d = ct1.a) -> Seq Scan on ct2 (cost=10000000000.00..1000000000102.00 rows=2 width=16) -> Materialize (cost=10000000097.15..1000000019558.33 rows=2 width=16) -> Nested Loop (cost=10000000097.15..1000000019558.30 rows=2 width=16) -> Seq Scan on ct1 (cost=10000000097.15..1000000019550.00 rows=1 width=16) Filter: ((c > 10) AND (a = 10)) -> Index Only Scan using idx_ct2_c on ct2 (cost=0.00..8.28 rows=2 width=0) (8 rows)
该场景仅受GUC参数costbased_rewrite_rule=pullup_subquery控制,GUC参数rewrite_rule无相关控制参数。