案例:调整基于代价的查询重写GUC参数costbased_rewrite_rule
查询重写能让优化器选择不同的执行路径,但部分规则对SQL的改写,并不能确保输出后的都是可以提升性能的计划。一旦改写错误可能导致千倍的性能差距,因此在查询改写阶段需要对此类规则支持基于代价的评估策略,决策是否应用规则对SQL进行改写,这些规则受GUC参数costbased_rewrite_rule控制,其受控的规则如表1所示。
|
参数名 |
描述 |
|---|---|
|
none |
不使用任何基于代价的查询改写策略。 |
|
pullup_subquery |
简单子查询展开改写规则开启使用基于代价的查询改写策略。 |
|
pullup_sublink_any_exists |
单个或AND条件里的ANY非相关子链接和[NOT] EXISTS相关子链接改写规则开启使用基于代价的查询改写策略。 |
|
pullup_not_in_sublink |
单个或AND条件里的NOT IN非相关子链接改写规则开启使用基于代价的查询改写策略。该参数与GUC参数rewrite_rule的disable_pullup_not_in_sublink参数互斥,当开启该参数时,rewrite_rule的disable_pullup_not_in_sublink参数功能不生效。 |
|
pullup_expr_sublink |
表达式子链接以及OR条件中的ANY非相关子链接和[NOT] EXISTS相关子链接改写场景生效。该参数与GUC参数rewrite_rule的disable_pullup_expr_sublink、enable_sublink_pullup_enhanced、magicset参数互斥,当开启该参数时rewrite_rule的相关参数功能不生效。 |
|
intargetlist |
TargetList中的相关表达式子链接改写规则开启使用基于代价的查询改写策略。该参数与GUC参数rewrite_rule的intargetlist、magicset参数互斥,当开启该参数时,rewrite_rule的intargetlist、magicset参数功能不生效。 |
|
enable_sublink_pullup_enhanced |
表达式子链接改写功能增强场景使用基于代价的查询改写策略。该参数受pullup_expr_sublink参数影响,AND条件中的表达式子链接改写场景需开启pullup_expr_sublink参数,该参数功能才生效。且与GUC参数rewrite_rule的enable_sublink_pullup_enhanced 参数互斥,当开启该参数时,rewrite_rule的enable_sublink_pullup_enhanced参数功能不生效。 |
对于上述参数所控制的查询重写规则,以下通过典型案例说明其使用场景。
在分布式场景下,基于代价的改写策略会优先选择STREAM计划,当改写前后均无法生成STREAM计划时会退化为PGXC计划。然而如果生成的PGXC计划完全下推到DN上,则CN上的代价信息是不准确的。因此该场景下生成计划时仍然会基于规则进行查询重写。场景示例参见生成PGXC计划时退化为基于规则的查询重写。
案例环境准备
为了便于规则的使用场景演示,需准备建表语句如下:
SET client_min_messages = warning; SET CLIENT_ENCODING = 'UTF8'; --清理环境。 DROP SCHEMA IF EXISTS costbased_rule_test cascade; CREATE SCHEMA costbased_rule_test; SET current_schema = costbased_rule_test; SET enable_codegen = off; DROP TABLE IF EXISTS costbased_rule_test.ct1; DROP TABLE IF EXISTS costbased_rule_test.ct2; DROP TABLE IF EXISTS costbased_rule_test.ct3; DROP TABLE IF EXISTS costbased_rule_test.ct4; --创建测试表。 CREATE TABLE ct1 (a INT, b INT, c INT, d INT); CREATE TABLE ct2 (a INT, b INT, c INT, d INT); CREATE TABLE ct3 (a INT, b INT, c INT, d INT); CREATE TABLE ct4 (a INT, b INT, c INT, d INT); CREATE INDEX idx_ct1_b ON ct1(b); CREATE INDEX idx_ct2_c ON ct2(c); CREATE INDEX idx_ct3_c ON ct3(c); --插入数据。 INSERT INTO ct1 (a, b, c) VALUES (generate_series(1, 100), generate_series(200, 300), left(random()::int, 100)); INSERT INTO ct2 VALUES(1,2,3,4),(3,4,5,6); INSERT INTO ct3 (a, b, c, d) VALUES (generate_series(1, 10), generate_series(20, 30), left(random()::int, 10), left(random()::int, 10)); --更新统计信息。 ANALYZE ct1; ANALYZE ct2; ANALYZE ct3;
Having条件中的非相关子链接改写场景
Having条件中存在带有AGG的非相关表达式子链接时可以进行SQL等价改写。如果关闭基于代价的查询改写评估策略,子链接会基于规则进行改写,生成计划如下:
gaussdb=# SET rewrite_rule=enable_sublink_pullup_enhanced;
SET
gaussdb=# SET costbased_rewrite_rule=none;
SET
gaussdb=# EXPLAIN SELECT ct2.a, SUM(ct2.b) AS value FROM ct2,ct1 WHERE ct2.a = ct1.b GROUP BY ct2.a HAVING SUM(ct2.b) >= (SELECT AVG(ct2.b) FROM ct2,ct1 WHERE ct2.a = ct1.b) ORDER BY value DESC;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------
-------------
Streaming (type: GATHER) (cost=5.31..5.37 rows=3 width=12)
Merge Sort Key: (pg_catalog.sum((sum(costbased_rule_test.ct2.b)))) DESC
Node/s: All datanodes
-> Sort (cost=5.24..5.25 rows=3 width=12)
Sort Key: (pg_catalog.sum((sum(costbased_rule_test.ct2.b)))) DESC
-> Nested Loop (cost=5.13..5.23 rows=1 width=12)
Join Filter: (((pg_catalog.sum((sum(costbased_rule_test.ct2.b)))))::numeric >= (pg_catalog.avg((avg(costbased_rule_tes
t.ct2.b)))))
-> HashAggregate (cost=2.79..2.80 rows=2 width=24)
Group By Key: costbased_rule_test.ct2.a
-> Streaming(type: REDISTRIBUTE) (cost=2.66..2.76 rows=6 width=24)
Spawn on: All datanodes
-> HashAggregate (cost=2.66..2.68 rows=6 width=24)
Group By Key: costbased_rule_test.ct2.a
-> Merge Join (cost=1.46..2.16 rows=200 width=8)
Merge Cond: (costbased_rule_test.ct1.b = costbased_rule_test.ct2.a)
-> Index Only Scan using idx_ct1_b on ct1 (cost=0.00..90.75 rows=10100 width=4)
-> Sort (cost=1.29..1.30 rows=6 width=8)
Sort Key: costbased_rule_test.ct2.a
-> Streaming(type: BROADCAST) (cost=0.00..1.28 rows=6 width=8)
Spawn on: All datanodes
-> Seq Scan on ct2 (cost=0.00..1.01 rows=2 width=8)
-> Materialize (cost=2.33..2.41 rows=3 width=32)
-> Aggregate (cost=2.33..2.40 rows=3 width=36)
-> Streaming(type: BROADCAST) (cost=2.33..2.40 rows=9 width=36)
Spawn on: All datanodes
-> Aggregate (cost=2.33..2.34 rows=3 width=36)
-> Merge Join (cost=1.46..2.16 rows=200 width=4)
Merge Cond: (costbased_rule_test.ct1.b = costbased_rule_test.ct2.a)
-> Index Only Scan using idx_ct1_b on ct1 (cost=0.00..90.75 rows=10100 width=4)
-> Sort (cost=1.29..1.30 rows=6 width=8)
Sort Key: costbased_rule_test.ct2.a
-> Streaming(type: BROADCAST) (cost=0.00..1.28 rows=6 width=8)
Spawn on: All datanodes
-> Seq Scan on ct2 (cost=0.00..1.01 rows=2 width=8)
(34 rows)
但当设置costbased_rewrite_rule参数值为pullup_expr_sublink,enable_sublink_pullup_enhanced时,优化器会基于代价评估是否对子链接进行改写,因为此场景不进行SQL改写生成的计划更优,生成计划如下:
gaussdb=# SET rewrite_rule=enable_sublink_pullup_enhanced;
SET
gaussdb=# SET costbased_rewrite_rule=pullup_expr_sublink,enable_sublink_pullup_enhanced;
SET
gaussdb=# EXPLAIN SELECT ct2.a, SUM(ct2.b) AS value FROM ct2,ct1 WHERE ct2.a = ct1.b GROUP BY ct2.a HAVING SUM(ct2.b) >= (SELECT AVG(ct2.b) FROM ct2,ct1 WHERE ct2.a = ct1.b) ORDER BY value DESC;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------
---
Sort (cost=5.35..5.35 rows=2 width=24)
Sort Key: (pg_catalog.sum((sum(costbased_rule_test.ct2.b)))) DESC
InitPlan 1 (returns $0)
-> Streaming (type: GATHER) (cost=2.52..2.58 rows=3 width=36)
Node/s: All datanodes
-> Aggregate (cost=2.33..2.40 rows=3 width=36)
-> Streaming(type: BROADCAST) (cost=2.33..2.40 rows=9 width=36)
Spawn on: All datanodes
-> Aggregate (cost=2.33..2.34 rows=3 width=36)
-> Merge Join (cost=1.46..2.16 rows=200 width=4)
Merge Cond: (costbased_rule_test.ct1.b = costbased_rule_test.ct2.a)
-> Index Only Scan using idx_ct1_b on ct1 (cost=0.00..90.75 rows=10100 width=
4)
-> Sort (cost=1.29..1.30 rows=6 width=8)
Sort Key: costbased_rule_test.ct2.a
-> Streaming(type: BROADCAST) (cost=0.00..1.28 rows=6 width=8)
Spawn on: All datanodes
-> Seq Scan on ct2 (cost=0.00..1.01 rows=2 width=8)
-> HashAggregate (cost=2.79..2.94 rows=2 width=24)
Group By Key: costbased_rule_test.ct2.a
Filter: ((pg_catalog.sum((sum(costbased_rule_test.ct2.b))))::numeric >= $0)
-> Streaming (type: GATHER) (cost=2.79..2.94 rows=6 width=24)
Node/s: All datanodes
-> HashAggregate (cost=2.66..2.69 rows=6 width=24)
Group By Key: costbased_rule_test.ct2.a
-> Merge Join (cost=1.46..2.16 rows=200 width=8)
Merge Cond: (costbased_rule_test.ct1.b = costbased_rule_test.ct2.a)
-> Index Only Scan using idx_ct1_b on ct1 (cost=0.00..90.75 rows=10100 width=4)
-> Sort (cost=1.29..1.30 rows=6 width=8)
Sort Key: costbased_rule_test.ct2.a
-> Streaming(type: BROADCAST) (cost=0.00..1.28 rows=6 width=8)
Spawn on: All datanodes
-> Seq Scan on ct2 (cost=0.00..1.01 rows=2 width=8)
(32 rows)
costbased_rewrite_rule=pullup_expr_sublink,enable_sublink_pullup_enhanced与rewrite_rule=disable_pullup_expr_sublink、rewrite_rule=enable_sublink_pullup_enhanced互斥,且costbased_rewrite_rule的值优先级高于rewrite_rule的优先级。
AND条件中ANY/IN非相关子链接提升场景
单个或AND条件里存在ANY/IN非相关子链接时可以进行SQL等价改写。如果关闭基于代价的查询改写评估策略,子链接会基于规则进行改写,生成计划如下:
gaussdb=# SET rewrite_rule=none;
SET
gaussdb=# SET costbased_rewrite_rule=none;
SET
gaussdb=# EXPLAIN SELECT * FROM ct2 WHERE ct2.a IN (SELECT a FROM ct1 WHERE ct1.c > 500 AND ct1.a = 10);
QUERY PLAN
-----------------------------------------------------------------
Streaming (type: GATHER) (cost=0.19..66.72 rows=3 width=16)
Node/s: datanode2
-> Nested Loop Semi Join (cost=0.00..66.53 rows=3 width=16)
-> Seq Scan on ct2 (cost=0.00..1.01 rows=1 width=16)
Filter: (a = 10)
-> Seq Scan on ct1 (cost=0.00..65.50 rows=1 width=4)
Filter: ((c > 500) AND (a = 10))
(7 rows)
但当设置costbased_rewrite_rule参数值为pullup_sublink_any_exists时,则优化器会基于代价评估是否对子链接进行改写,因为此场景不进行SQL改写生成的计划更优,因此生成计划如下:
gaussdb=# SET rewrite_rule=none;
SET
gaussdb=# SET costbased_rewrite_rule=pullup_sublink_any_exists;
SET
gaussdb=# EXPLAIN SELECT * FROM ct2 WHERE ct2.a IN (SELECT a FROM ct1 WHERE ct1.c > 500 AND ct1.a = 10);
QUERY PLAN
-----------------------------------------------------------------------------------
Streaming (type: GATHER) (cost=65.59..66.66 rows=1 width=16)
Node/s: All datanodes
-> Seq Scan on ct2 (cost=65.52..66.54 rows=1 width=16)
Filter: (hashed SubPlan 1)
SubPlan 1
-> Materialize (cost=0.00..65.52 rows=9 width=4)
-> Streaming(type: BROADCAST) (cost=0.00..65.51 rows=3 width=4)
Spawn on: datanode2
-> Seq Scan on ct1 (cost=0.00..65.50 rows=1 width=4)
Filter: ((c > 500) AND (a = 10))
(10 rows)
该场景仅受GUC参数costbased_rewrite_rule=pullup_sublink_any_exists控制,GUC参数rewrite_rule无相关控制参数。
AND条件中[NOT]EXISTS相关子连接提升场景
单个或AND条件里存在[NOT]EXISTS相关子链接时可以进行SQL等价改写。如果关闭基于代价的查询改写评估策略,子链接会基于规则进行改写,生成计划如下:
gaussdb=# SET rewrite_rule=none;
SET
gaussdb=# SET costbased_rewrite_rule=none;
SET
gaussdb=# EXPLAIN SELECT * FROM ct2 WHERE EXISTS (SELECT 1 FROM ct3 WHERE ct3.b = ct2.c AND ct3.c < 1000);
QUERY PLAN
------------------------------------------------------------------------------------------
Streaming (type: GATHER) (cost=4.66..5.95 rows=2 width=16)
Node/s: All datanodes
-> Nested Loop (cost=4.60..5.82 rows=2 width=16)
Join Filter: (ct2.c = ct3.b)
-> Streaming(type: REDISTRIBUTE) (cost=0.00..1.13 rows=2 width=16)
Spawn on: All datanodes
-> Seq Scan on ct2 (cost=0.00..1.01 rows=2 width=16)
-> Materialize (cost=4.60..4.65 rows=11 width=4)
-> HashAggregate (cost=4.60..4.64 rows=11 width=4)
Group By Key: ct3.b
-> Streaming(type: REDISTRIBUTE) (cost=3.55..4.57 rows=33 width=4)
Spawn on: All datanodes
-> HashAggregate (cost=3.55..3.66 rows=33 width=4)
Group By Key: ct3.b
-> Seq Scan on ct3 (cost=0.00..3.46 rows=110 width=4)
Filter: (c < 1000)
(16 rows)
但当设置costbased_rewrite_rule参数值为pullup_sublink_any_exists时,则优化器会基于代价评估是否对子链接进行改写,因为此场景不进行SQL改写生成的计划更优,因此生成计划如下:
gaussdb=# SET rewrite_rule=none;
SET
gaussdb=# SET costbased_rewrite_rule=pullup_sublink_any_exists;
SET
gaussdb=# EXPLAIN SELECT * FROM ct2 WHERE EXISTS (SELECT 1 FROM ct3 WHERE ct3.b = ct2.c AND ct3.c < 1000);
QUERY PLAN
-------------------------------------------------------------------------------------------
Streaming (type: GATHER) (cost=0.06..1.26 rows=1 width=16)
Node/s: All datanodes
-> Seq Scan on ct2 (cost=0.00..1.14 rows=1 width=16)
Filter: (SubPlan 1)
SubPlan 1
-> Result (cost=0.00..14.02 rows=330 width=4)
Filter: (ct3.b = ct2.c)
-> Materialize (cost=0.00..14.02 rows=330 width=4)
-> Streaming(type: BROADCAST) (cost=0.00..13.47 rows=330 width=4)
Spawn on: All datanodes
-> Seq Scan on ct3 (cost=0.00..3.46 rows=110 width=4)
Filter: (c < 1000)
(12 rows)
该场景仅受GUC参数costbased_rewrite_rule=pullup_sublink_any_exists控制,GUC参数rewrite_rule无相关控制参数。
AND条件中NOT IN非相关子连接提升场景
单个或AND条件里存在NOT IN非相关子链接时可以进行SQL等价改写。如果关闭基于代价的查询改写评估策略,子链接会基于规则进行改写,生成计划如下:
gaussdb=# SET rewrite_rule=none;
SET
gaussdb=# SET costbased_rewrite_rule=none;
SET
gaussdb=# EXPLAIN SELECT * FROM ct1 WHERE ct1.b NOT IN (SELECT a FROM ct3);
QUERY PLAN
-----------------------------------------------------------------------------------
Streaming (type: GATHER) (cost=4.00..5681.51 rows=9100 width=16)
Node/s: All datanodes
-> Nested Loop Anti Join (cost=0.00..5302.39 rows=9100 width=16)
Join Filter: ((ct1.b = ct3.a) OR (ct1.b IS NULL) OR (ct3.a IS NULL))
-> Seq Scan on ct1 (cost=0.00..48.67 rows=10100 width=16)
-> Materialize (cost=0.00..13.93 rows=330 width=4)
-> Streaming(type: BROADCAST) (cost=0.00..13.38 rows=330 width=4)
Spawn on: All datanodes
-> Seq Scan on ct3 (cost=0.00..3.37 rows=110 width=4)
(9 rows)
但当设置costbased_rewrite_rule参数值为pullup_not_in_sublink时,则优化器会基于代价评估是否对子链接进行改写,因为此场景不进行SQL改写生成的计划更优,因此生成计划如下:
gaussdb=# SET rewrite_rule=none;
SET
gaussdb=# SET costbased_rewrite_rule=pullup_not_in_sublink;
SET
gaussdb=# EXPLAIN SELECT * FROM ct1 WHERE ct1.b NOT IN (SELECT a FROM ct3);
QUERY PLAN
------------------------------------------------------------------------------------
Streaming (type: GATHER) (cost=8.96..272.42 rows=5050 width=16)
Node/s: All datanodes
-> Seq Scan on ct1 (cost=4.96..62.05 rows=5050 width=16)
Filter: (NOT (hashed SubPlan 1))
SubPlan 1
-> Materialize (cost=0.00..4.14 rows=990 width=4)
-> Streaming(type: BROADCAST) (cost=0.00..3.59 rows=330 width=4)
Spawn on: All datanodes
-> Seq Scan on ct3 (cost=0.00..3.37 rows=110 width=4)
(9 rows)
costbased_rewrite_rule=pullup_not_in_sublink与rewrite_rule=disable_pullup_not_in_sublink互斥,且costbased_rewrite_rule的优先级高于rewrite_rule的优先级。
AND条件中表达式子链接提升场景
单个或AND条件里存在表达式类相关子链接时可以进行SQL等价改写。如果同时关闭基于规则和基于代价的查询改写评估策略,则子链接不会进行改写,生成计划如下:
gaussdb=# SET rewrite_rule=disable_pullup_expr_sublink;
SET
gaussdb=# SET costbased_rewrite_rule=none;
SET
gaussdb=# EXPLAIN SELECT * FROM ct1 WHERE ct1.a = (SELECT a FROM ct2 WHERE ct2.b = ct1.b LIMIT 1) AND ct1.b > 50;
QUERY PLAN
----------------------------------------------------------------------------------------------------
Streaming (type: GATHER) (cost=1.07..4454.57 rows=50 width=16)
Node/s: All datanodes
-> Seq Scan on ct1 (cost=0.01..4452.44 rows=50 width=16)
Filter: ((b > 50) AND (a = (SubPlan 1)))
SubPlan 1
-> Limit (cost=1.30..1.30 rows=3 width=8)
-> Sort (cost=1.30..1.31 rows=3 width=8)
Sort Key: ct2.a
-> Result (cost=0.00..1.29 rows=6 width=8)
Filter: (ct2.b = ct1.b)
-> Materialize (cost=0.00..1.29 rows=6 width=8)
-> Streaming(type: BROADCAST) (cost=0.00..1.28 rows=6 width=8)
Spawn on: All datanodes
-> Seq Scan on ct2 (cost=0.00..1.01 rows=2 width=8)
(14 rows)
但当设置costbased_rewrite_rule参数值为pullup_expr_sublink时,则优化器会基于代价评估是否对子链接进行改写,因为此场景进行SQL改写生成的计划更优,因此生成计划如下:
gaussdb=# SET rewrite_rule=disable_pullup_expr_sublink;
SET
gaussdb=# SET costbased_rewrite_rule=pullup_expr_sublink;
SET
gaussdb=# EXPLAIN SELECT * FROM ct1 WHERE ct1.a = (SELECT a FROM ct2 WHERE ct2.b = ct1.b LIMIT 1) AND ct1.b > 50;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------
Streaming (type: GATHER) (cost=1.31..1.83 rows=3 width=16)
Node/s: All datanodes
-> Merge Join (cost=1.25..1.71 rows=3 width=16)
Merge Cond: (subquery."?column?" = ct1.b)
Join Filter: (ct1.a = subquery.a)
-> Sort (cost=1.24..1.24 rows=1 width=8)
Sort Key: subquery."?column?"
-> Streaming(type: REDISTRIBUTE) (cost=1.05..1.23 rows=1 width=8)
Spawn on: All datanodes
-> Subquery Scan on subquery (cost=1.05..1.08 rows=1 width=8)
Filter: (subquery."?column?" OPERATOR(pg_catalog.=) 1::bigint)
-> WindowAgg (cost=1.05..1.07 rows=1 width=8)
row_number_filter: (row_number() OVER (PARTITION BY ct2.b) OPERATOR(pg_catalog.<=) 1::bigint)
-> Sort (cost=1.05..1.06 rows=1 width=8)
Sort Key: ct2.b
-> Streaming(type: REDISTRIBUTE) (cost=1.01..1.04 rows=1 width=8)
Spawn on: All datanodes
-> Seq Scan on ct2 (cost=1.01..1.01 rows=1 width=8)
Filter: (b > 50)
-> Index Scan using idx_ct1_b on ct1 (cost=0.00..159.16 rows=10099 width=16)
Index Cond: (b > 50)
(21 rows)
costbased_rewrite_rule=pullup_expr_sublink与rewrite_rule=disable_pullup_expr_sublink、rewrite_rule=enable_sublink_pullup_enhanced、rewrite_rule=magicset互斥,且costbased_rewrite_rule的优先级高于rewrite_rule的优先级。
AND条件中表达式子链接满足MAGICSET下推改写场景
单个或AND条件里存在表达式类相关子链接,且满足MagicSet下推规则可以将部分过滤条件下推到子链接中时可以进行SQL等价改写。如果关闭基于代价的查询改写评估策略,子链接会基于规则进行改写,生成计划如下:
gaussdb=# SET rewrite_rule=magicset;
SET
gaussdb=# SET costbased_rewrite_rule=none;
SET
gaussdb=# EXPLAIN SELECT * FROM ct1 WHERE ct1.a = (SELECT a FROM ct2 WHERE ct2.b = ct1.b LIMIT 1) AND ct1.c=10;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------
Streaming (type: GATHER) (cost=62.95..78.78 rows=3 width=16)
Node/s: All datanodes
-> Nested Loop (cost=62.89..78.66 rows=3 width=16)
-> Streaming(type: REDISTRIBUTE) (cost=58.39..58.57 rows=1 width=8)
Spawn on: All datanodes
-> Subquery Scan on subquery (cost=58.39..58.43 rows=1 width=8)
Filter: (subquery."?column?" OPERATOR(pg_catalog.=) 1::bigint)
-> WindowAgg (cost=58.39..58.42 rows=3 width=8)
row_number_filter: (row_number() OVER (PARTITION BY ct2.b) OPERATOR(pg_catalog.<=) 1::bigint)
-> Sort (cost=58.39..58.40 rows=3 width=8)
Sort Key: ct2.b
-> Nested Loop Semi Join (cost=0.00..58.38 rows=3 width=8)
Join Filter: (ct2.b = costbased_rule_test.ct1.b)
-> Streaming(type: REDISTRIBUTE) (cost=0.00..1.13 rows=2 width=8)
Spawn on: All datanodes
-> Seq Scan on ct2 (cost=0.00..1.01 rows=2 width=8)
-> Materialize (cost=0.00..57.24 rows=1 width=4)
-> Streaming(type: REDISTRIBUTE) (cost=0.00..57.23 rows=1 width=4)
Spawn on: All datanodes
-> Seq Scan on ct1 (cost=0.00..57.09 rows=1 width=4)
Filter: (c = 10)
-> Bitmap Heap Scan on ct1 (cost=4.50..20.08 rows=1 width=16)
Recheck Cond: (b = subquery."?column?")
Filter: ((c = 10) AND (subquery.a = a))
-> Bitmap Index Scan on idx_ct1_b (cost=0.00..4.50 rows=100 width=0)
Index Cond: (b = subquery."?column?")
(26 rows)
但当设置costbased_rewrite_rule参数值为pullup_expr_sublink时,则优化器会基于代价评估是否对子链接的MagicSet下推条件进行改写,因为此场景不进行MagicSet改写生成的计划更优,因此生成计划如下:
gaussdb=# SET rewrite_rule=magicset;
SET
gaussdb=# SET costbased_rewrite_rule=pullup_expr_sublink;
SET
gaussdb=# EXPLAIN SELECT * FROM ct1 WHERE ct1.a = (SELECT a FROM ct2 WHERE ct2.b = ct1.b LIMIT 1) AND ct1.c=10;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------
Streaming (type: GATHER) (cost=5.61..21.43 rows=3 width=16)
Node/s: All datanodes
-> Nested Loop (cost=5.54..21.31 rows=3 width=16)
-> Streaming(type: REDISTRIBUTE) (cost=1.05..1.22 rows=1 width=8)
Spawn on: All datanodes
-> Subquery Scan on subquery (cost=1.05..1.08 rows=1 width=8)
Filter: (subquery."?column?" OPERATOR(pg_catalog.=) 1::bigint)
-> WindowAgg (cost=1.05..1.07 rows=1 width=8)
row_number_filter: (row_number() OVER (PARTITION BY ct2.b) OPERATOR(pg_catalog.<=) 1::bigint)
-> Sort (cost=1.05..1.05 rows=1 width=8)
Sort Key: ct2.b
-> Streaming(type: REDISTRIBUTE) (cost=0.00..1.04 rows=2 width=8)
Spawn on: All datanodes
-> Seq Scan on ct2 (cost=0.00..1.01 rows=2 width=8)
-> Bitmap Heap Scan on ct1 (cost=4.50..20.08 rows=1 width=16)
Recheck Cond: (b = subquery."?column?")
Filter: ((c = 10) AND (subquery.a = a))
-> Bitmap Index Scan on idx_ct1_b (cost=0.00..4.50 rows=100 width=0)
Index Cond: (b = subquery."?column?")
(19 rows)
costbased_rewrite_rule=pullup_expr_sublink与rewrite_rule=disable_pullup_expr_sublink、rewrite_rule=enable_sublink_pullup_enhanced、rewrite_rule=magicset参数互斥,且costbased_rewrite_rule的优先级高于rewrite_rule的优先级。
AND条件中表达式子链接满足WINAGG改写场景
单个或AND条件里存在表达式类相关子链接,且满足WinAGG规则时可以进行SQL等价改写。如果关闭基于代价的查询改写评估策略,子链接会基于规则进行改写,生成计划如下:
gaussdb=# SET rewrite_rule=enable_sublink_pullup_enhanced;
SET
gaussdb=# SET costbased_rewrite_rule=none;
SET
gaussdb=# EXPLAIN SELECT * FROM ct1, ct2 WHERE ct1.b = ct2.c AND ct2.a = (SELECT min(b) FROM ct1 WHERE ct1.b = ct2.c);
QUERY PLAN
----------------------------------------------------------------------------------------------------------------
Streaming (type: GATHER) (cost=10.71..14.23 rows=1 width=36)
Node/s: All datanodes
-> Subquery Scan on inner_subquery (cost=10.64..14.11 rows=1 width=36)
Filter: (inner_subquery.a = inner_subquery.min)
-> WindowAgg (cost=10.64..13.27 rows=200 width=32)
-> Sort (cost=10.64..11.02 rows=200 width=32)
Sort Key: ct2.c
-> Streaming(type: REDISTRIBUTE) (cost=1.46..5.22 rows=200 width=32)
Spawn on: All datanodes
-> Merge Join (cost=1.46..2.18 rows=200 width=32)
Merge Cond: (ct1.b = ct2.c)
-> Index Scan using idx_ct1_b on ct1 (cost=0.00..150.75 rows=10100 width=16)
-> Sort (cost=1.29..1.30 rows=3 width=16)
Sort Key: ct2.c
-> Streaming(type: BROADCAST) (cost=0.00..1.28 rows=6 width=16)
Spawn on: All datanodes
-> Seq Scan on ct2 (cost=0.00..1.01 rows=2 width=16)
(17 rows)
但当设置costbased_rewrite_rule参数值为pullup_expr_sublink,enable_sublink_pullup_enhanced时,则优化器会基于代价评估是否对子链接进行改写,因为此场景不进行改写生成的计划更优,因此生成计划如下:
gaussdb=# SET rewrite_rule=enable_sublink_pullup_enhanced;
SET
gaussdb=# SET costbased_rewrite_rule=pullup_expr_sublink,enable_sublink_pullup_enhanced;
SET
gaussdb=# EXPLAIN SELECT * FROM ct1, ct2 WHERE ct1.b = ct2.c AND ct2.a = (SELECT min(b) FROM ct1 WHERE ct1.b = ct2.c);
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------
Streaming (type: GATHER) (cost=3.55..5.99 rows=100 width=32)
Node/s: All datanodes
-> Merge Join (cost=1.49..1.87 rows=100 width=32)
Merge Cond: (costbased_rule_test.ct1.b = ct2.c)
-> Index Scan using idx_ct1_b on ct1 (cost=0.00..150.75 rows=10100 width=16)
-> Sort (cost=1.40..1.41 rows=3 width=16)
Sort Key: ct2.c
-> Streaming(type: BROADCAST) (cost=0.00..1.39 rows=3 width=16)
Spawn on: All datanodes
-> Seq Scan on ct2 (cost=0.00..1.12 rows=1 width=16)
Filter: (a = (SubPlan 2))
SubPlan 2
-> Result (cost=0.10..0.11 rows=3 width=0)
InitPlan 1 (returns $1)
-> Limit (cost=0.00..0.10 rows=3 width=4)
-> Result (cost=0.00..1009.68 rows=3 width=4)
Filter: (costbased_rule_test.ct1.b = ct2.c)
-> Materialize (cost=0.00..1009.68 rows=3 width=4)
-> Streaming(type: BROADCAST) (cost=0.00..959.18 rows=30300 width=4)
Spawn on: All datanodes
-> Seq Scan on ct1 (cost=0.00..48.67 rows=10100 width=4)
Filter: (b IS NOT NULL)
(22 rows)
costbased_rewrite_rule=pullup_expr_sublink,enable_sublink_pullup_enhanced与rewrite_rule=disable_pullup_expr_sublink、rewrite_rule=enable_sublink_pullup_enhanced、rewrite_rule=magicset参数互斥,且costbased_rewrite_rule的优先级高于rewrite_rule的优先级。
OR条件中表达式子连接提升场景
OR条件里存在表达式类相关子链接,可以进行SQL等价改写。如果关闭基于代价的查询改写评估策略,子链接会基于规则进行改写,生成计划如下:
gaussdb=# SET rewrite_rule=none;
SET
gaussdb=# SET costbased_rewrite_rule=none;
SET
gaussdb=# EXPLAIN SELECT * FROM ct2 WHERE ct2.a = (SELECT min(b) FROM ct1 WHERE ct1.b = ct2.c) OR ct2.b = (SELECT sum(a) FROM ct3 WHERE ct3.a = ct2.c);
QUERY PLAN
---------------------------------------------------------------------------------------------------
Streaming (type: GATHER) (cost=73.05..75.50 rows=3 width=16)
Node/s: All datanodes
-> Nested Loop Left Join (cost=72.99..75.38 rows=3 width=16)
Join Filter: (ct3.a = ct2.c)
Filter: ((ct2.a = (min((min(ct1.b))))) OR (ct2.b = (sum(ct3.a))))
-> Nested Loop Left Join (cost=69.43..71.67 rows=2 width=20)
Join Filter: (ct1.b = ct2.c)
-> Streaming(type: REDISTRIBUTE) (cost=0.00..1.13 rows=2 width=16)
Spawn on: All datanodes
-> Seq Scan on ct2 (cost=0.00..1.01 rows=2 width=16)
-> Materialize (cost=69.43..70.20 rows=101 width=8)
-> HashAggregate (cost=69.43..69.77 rows=101 width=8)
Group By Key: ct1.b
-> Streaming(type: REDISTRIBUTE) (cost=65.51..68.93 rows=303 width=8)
Spawn on: All datanodes
-> HashAggregate (cost=65.51..66.52 rows=303 width=8)
Group By Key: ct1.b
-> Seq Scan on ct1 (cost=0.00..48.67 rows=10100 width=4)
-> HashAggregate (cost=3.55..3.59 rows=12 width=12)
Group By Key: ct3.a
-> Seq Scan on ct3 (cost=0.00..3.37 rows=110 width=4)
(21 rows)
但当设置costbased_rewrite_rule参数值为pullup_expr_sublink时,则优化器会基于代价评估是否对子链接进行改写,因为此场景不进行改写生成的计划更优,因此生成计划如下:
gaussdb=# SET rewrite_rule=none;
SET
gaussdb=# SET costbased_rewrite_rule=pullup_expr_sublink;
SET
gaussdb=# EXPLAIN SELECT * FROM ct2 WHERE ct2.a = (SELECT min(b) FROM ct1 WHERE ct1.b = ct2.c) OR ct2.b = (SELECT sum(a) FROM ct3 WHERE ct3.a = ct2.c);
QUERY PLAN
------------------------------------------------------------------------------------------------------------
Streaming (type: GATHER) (cost=3.75..5.07 rows=3 width=16)
Node/s: All datanodes
-> Hash Left Join (cost=3.68..4.95 rows=3 width=16)
Hash Cond: (ct2.c = subquery."?column?")
Filter: ((ct2.a = (SubPlan 2)) OR (ct2.b = subquery.sum))
-> Streaming(type: REDISTRIBUTE) (cost=0.00..1.13 rows=2 width=16)
Spawn on: All datanodes
-> Seq Scan on ct2 (cost=0.00..1.01 rows=2 width=16)
-> Hash (cost=3.63..3.63 rows=12 width=12)
-> Subquery Scan on subquery (cost=3.55..3.63 rows=12 width=12)
-> HashAggregate (cost=3.55..3.59 rows=12 width=12)
Group By Key: ct3.a
-> Seq Scan on ct3 (cost=0.00..3.37 rows=110 width=4)
SubPlan 2
-> Result (cost=0.10..0.11 rows=3 width=0)
InitPlan 1 (returns $1)
-> Limit (cost=0.00..0.10 rows=3 width=4)
-> Result (cost=0.00..1009.68 rows=3 width=4)
Filter: (costbased_rule_test.ct1.b = ct2.c)
-> Materialize (cost=0.00..1009.68 rows=3 width=4)
-> Streaming(type: BROADCAST) (cost=0.00..959.18 rows=30300 width=4)
Spawn on: All datanodes
-> Seq Scan on ct1 (cost=0.00..48.67 rows=10100 width=4)
Filter: (b IS NOT NULL)
(24 rows)
costbased_rewrite_rule=pullup_expr_sublink与rewrite_rule=disable_pullup_expr_sublink、rewrite_rule=enable_sublink_pullup_enhanced、rewrite_rule=magicset参数互斥,且costbased_rewrite_rule的优先级高于rewrite_rule的优先级。
TargetList中相关表达式子连接提升场景
TargetList中存在表达式类相关子链接时可以进行SQL等价改写。如果同时关闭基于规则和基于代价的查询改写评估策略,子链接不进行改写,生成计划如下:
gaussdb=# SET rewrite_rule=none;
SET
gaussdb=# SET costbased_rewrite_rule=none;
SET
gaussdb=# EXPLAIN SELECT *, ct1.a = (SELECT a FROM ct2 WHERE ct2.b = ct1.b LIMIT 1) FROM ct1 WHERE ct1.b >50;
QUERY PLAN
----------------------------------------------------------------------------------------------------
Streaming (type: GATHER) (cost=4.00..4871.89 rows=10099 width=16)
Node/s: All datanodes
-> Seq Scan on ct1 (cost=0.00..4451.14 rows=10099 width=16)
Filter: (b > 50)
SubPlan 1
-> Limit (cost=1.30..1.30 rows=3 width=8)
-> Sort (cost=1.30..1.31 rows=3 width=8)
Sort Key: ct2.a
-> Result (cost=0.00..1.29 rows=6 width=8)
Filter: (ct2.b = ct1.b)
-> Materialize (cost=0.00..1.29 rows=6 width=8)
-> Streaming(type: BROADCAST) (cost=0.00..1.28 rows=6 width=8)
Spawn on: All datanodes
-> Seq Scan on ct2 (cost=0.00..1.01 rows=2 width=8)
(14 rows)
但当设置costbased_rewrite_rule参数值为intargetlist时,则优化器会基于代价评估是否对子链接进行改写,因为此场景进行改写生成的计划更优,因此生成计划如下:
gaussdb=# SET rewrite_rule=none;
SET
gaussdb=# SET costbased_rewrite_rule=intargetlist;
SET
gaussdb=# EXPLAIN SELECT *, ct1.a = (SELECT a FROM ct2 WHERE ct2.b = ct1.b LIMIT 1) FROM ct1 WHERE ct1.b >50;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------
Streaming (type: GATHER) (cost=5.36..496.38 rows=10099 width=20)
Node/s: All datanodes
-> Hash Left Join (cost=1.36..75.63 rows=10099 width=20)
Hash Cond: (ct1.b = subquery."?column?")
-> Seq Scan on ct1 (cost=0.00..57.09 rows=10099 width=16)
Filter: (b > 50)
-> Hash (cost=1.35..1.35 rows=3 width=8)
-> Streaming(type: BROADCAST) (cost=1.05..1.35 rows=3 width=8)
Spawn on: All datanodes
-> Subquery Scan on subquery (cost=1.05..1.08 rows=1 width=8)
Filter: (subquery."?column?" OPERATOR(pg_catalog.=) 1::bigint)
-> WindowAgg (cost=1.05..1.07 rows=1 width=8)
row_number_filter: (row_number() OVER (PARTITION BY ct2.b) OPERATOR(pg_catalog.<=) 1::bigint)
-> Sort (cost=1.05..1.05 rows=1 width=8)
Sort Key: ct2.b
-> Streaming(type: REDISTRIBUTE) (cost=0.00..1.04 rows=2 width=8)
Spawn on: All datanodes
-> Seq Scan on ct2 (cost=0.00..1.01 rows=2 width=8)
(18 rows)
costbased_rewrite_rule=intargetlist与rewrite_rule=ntargetlist、rewrite_rule=magicset互斥,且costbased_rewrite_rule的优先级高于rewrite_rule的优先级。
TargetList中表达式子链接满足MAGICSET下推改写场景
TargetList中存在表达式类相关子链接,且满足MagicSet下推规则可以将部分过滤条件下推到子链接中时可以进行SQL等价改写。如果关闭基于代价的查询改写评估策略,子链接会基于规则进行改写,生成计划如下:
gaussdb=# SET rewrite_rule=intargetlist,magicset;
SET
gaussdb=# SET costbased_rewrite_rule=none;
SET
gaussdb=# EXPLAIN SELECT *,ct1.a = (SELECT a FROM ct2 WHERE ct2.c = ct1.c LIMIT 1) FROM ct1 WHERE ct1.b=10;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------
-------------
Streaming (type: GATHER) (cost=27.90..46.26 rows=100 width=20)
Node/s: All datanodes
-> Nested Loop Left Join (cost=25.83..42.13 rows=100 width=20)
Join Filter: (subquery."?column?" = costbased_rule_test.ct1.c)
-> Bitmap Heap Scan on ct1 (cost=4.51..19.92 rows=100 width=16)
Recheck Cond: (b = 10)
-> Bitmap Index Scan on idx_ct1_b (cost=0.00..4.50 rows=100 width=0)
Index Cond: (b = 10)
-> Materialize (cost=21.33..21.64 rows=3 width=8)
-> Streaming(type: BROADCAST) (cost=21.33..21.63 rows=3 width=8)
Spawn on: All datanodes
-> Subquery Scan on subquery (cost=21.33..21.36 rows=1 width=8)
Filter: (subquery."?column?" OPERATOR(pg_catalog.=) 1::bigint)
-> WindowAgg (cost=21.33..21.35 rows=3 width=8)
row_number_filter: (row_number() OVER (PARTITION BY ct2.c) OPERATOR(pg_catalog.<=) 1::bigint)
-> Sort (cost=21.33..21.33 rows=3 width=8)
Sort Key: ct2.c
-> Nested Loop (cost=20.16..21.32 rows=2 width=8)
Join Filter: (ct2.c = costbased_rule_test.ct1.c)
-> Streaming(type: REDISTRIBUTE) (cost=0.00..1.13 rows=2 width=8)
Spawn on: All datanodes
-> Seq Scan on ct2 (cost=0.00..1.01 rows=2 width=8)
-> Materialize (cost=20.16..20.18 rows=1 width=4)
-> HashAggregate (cost=20.16..20.17 rows=1 width=4)
Group By Key: costbased_rule_test.ct1.c
-> Streaming(type: REDISTRIBUTE) (cost=20.00..20.16 rows=3 width=4)
Spawn on: All datanodes
-> HashAggregate (cost=20.00..20.01 rows=3 width=4)
Group By Key: costbased_rule_test.ct1.c
-> Bitmap Heap Scan on ct1 (cost=4.51..19.92 rows=100 width=4)
Recheck Cond: (b = 10)
-> Bitmap Index Scan on idx_ct1_b (cost=0.00..4.50 rows=
100 width=0)
Index Cond: (b = 10)
(33 rows)
但当设置costbased_rewrite_rule参数值为intargetlist时,则优化器会基于代价评估是否对子链接的MagicSet下推条件进行改写,因为此场景不进行MagicSet改写生成的计划更优,因此生成计划如下:
gaussdb=# SET rewrite_rule=intargetlist,magicset;
SET
gaussdb=# SET costbased_rewrite_rule=intargetlist;
SET
gaussdb=# EXPLAIN SELECT *,ct1.a = (SELECT a FROM ct2 WHERE ct2.c = ct1.c LIMIT 1) FROM ct1 WHERE ct1.b=10;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------
Streaming (type: GATHER) (cost=7.93..25.74 rows=100 width=20)
Node/s: All datanodes
-> Hash Left Join (cost=5.87..21.62 rows=100 width=20)
Hash Cond: (ct1.c = subquery."?column?")
-> Bitmap Heap Scan on ct1 (cost=4.51..19.92 rows=100 width=16)
Recheck Cond: (b = 10)
-> Bitmap Index Scan on idx_ct1_b (cost=0.00..4.50 rows=100 width=0)
Index Cond: (b = 10)
-> Hash (cost=1.35..1.35 rows=3 width=8)
-> Streaming(type: BROADCAST) (cost=1.05..1.35 rows=3 width=8)
Spawn on: All datanodes
-> Subquery Scan on subquery (cost=1.05..1.08 rows=1 width=8)
Filter: (subquery."?column?" OPERATOR(pg_catalog.=) 1::bigint)
-> WindowAgg (cost=1.05..1.07 rows=1 width=8)
row_number_filter: (row_number() OVER (PARTITION BY ct2.c) OPERATOR(pg_catalog.<=) 1::bigint)
-> Sort (cost=1.05..1.05 rows=1 width=8)
Sort Key: ct2.c
-> Streaming(type: REDISTRIBUTE) (cost=0.00..1.04 rows=2 width=8)
Spawn on: All datanodes
-> Seq Scan on ct2 (cost=0.00..1.01 rows=2 width=8)
(20 rows)
costbased_rewrite_rule=intargetlist与rewrite_rule=intargetlist、rewrite_rule=magicset参数互斥,且costbased_rewrite_rule的优先级高于rewrite_rule的优先级。
简单子查询展开改写场景
存在简单子查询时可以进行SQL等价改写。如果关闭基于代价的查询改写评估策略,子链接会基于规则进行改写,生成计划如下:
gaussdb=# SET rewrite_rule=none;
SET
gaussdb=# SET costbased_rewrite_rule=none;
SET
gaussdb=# EXPLAIN SELECT * FROM ct2,(SELECT ct1.* FROM ct1,ct2,ct3 WHERE ct1.c > 10 AND ct1.a = 10) ct WHERE ct.a = ct2.d;
QUERY PLAN
---------------------------------------------------------------------------------------------------
Streaming (type: GATHER) (cost=4.00..100.52 rows=660 width=32)
Node/s: All datanodes
-> Nested Loop (cost=0.00..73.02 rows=660 width=32)
-> Nested Loop (cost=0.00..69.52 rows=110 width=16)
-> Streaming(type: BROADCAST) (cost=0.00..65.78 rows=3 width=16)
Spawn on: datanode2
-> Seq Scan on ct1 (cost=0.00..65.50 rows=1 width=16)
Filter: ((c > 10) AND (a = 10))
-> Seq Scan on ct3 (cost=0.00..3.37 rows=110 width=0)
-> Materialize (cost=0.00..2.58 rows=6 width=16)
-> Streaming(type: BROADCAST) (cost=0.00..2.57 rows=6 width=16)
Spawn on: All datanodes
-> Nested Loop (cost=0.00..2.30 rows=2 width=16)
-> Seq Scan on ct2 (cost=0.00..1.01 rows=2 width=0)
-> Materialize (cost=0.00..1.29 rows=3 width=16)
-> Streaming(type: BROADCAST) (cost=0.00..1.28 rows=3 width=16)
Spawn on: All datanodes
-> Seq Scan on ct2 (cost=0.00..1.01 rows=1 width=16)
Filter: (d = 10)
(19 rows)
但当设置costbased_rewrite_rule参数值为pullup_subquery时,则优化器会基于代价评估是否对简单子查询进行改写,因为此场景不进行简单子查询改写生成的计划更优,因此生成计划如下:
gaussdb=# SET rewrite_rule=none;
SET
gaussdb=# SET costbased_rewrite_rule=pullup_subquery;
SET
gaussdb=# EXPLAIN SELECT * FROM ct2,(SELECT ct1.* FROM ct1,ct2,ct3 WHERE ct1.c > 10 AND ct1.a = 10) ct WHERE ct.a = ct2.d;
QUERY PLAN
----------------------------------------------------------------------------------------------------
Streaming (type: GATHER) (cost=1.37..73.72 rows=4 width=32)
Node/s: All datanodes
-> Hash Join (cost=1.31..73.60 rows=4 width=32)
Hash Cond: (ct1.a = costbased_rule_test.ct2.d)
-> Nested Loop (cost=0.00..71.37 rows=220 width=16)
-> Seq Scan on ct3 (cost=0.00..3.37 rows=110 width=0)
-> Materialize (cost=0.00..67.08 rows=6 width=16)
-> Streaming(type: BROADCAST) (cost=0.00..67.07 rows=6 width=16)
Spawn on: All datanodes
-> Nested Loop (cost=0.00..66.80 rows=2 width=16)
-> Streaming(type: BROADCAST) (cost=0.00..65.78 rows=3 width=16)
Spawn on: datanode2
-> Seq Scan on ct1 (cost=0.00..65.50 rows=1 width=16)
Filter: ((c > 10) AND (a = 10))
-> Seq Scan on ct2 (cost=0.00..1.01 rows=2 width=0)
-> Hash (cost=1.28..1.28 rows=6 width=16)
-> Streaming(type: BROADCAST) (cost=0.00..1.28 rows=6 width=16)
Spawn on: All datanodes
-> Seq Scan on ct2 (cost=0.00..1.01 rows=2 width=16)
(19 rows)
该场景仅受GUC参数costbased_rewrite_rule=pullup_subquery控制,GUC参数rewrite_rule无相关控制参数。
生成PGXC计划时退化为基于规则的查询重写
生成PGXC计划时,DN执行计划的代价计算不准确,故基于代价比较计划的结果不准确,基于代价的查询重写也因此退化为基于规则的查询重写。
示例如下,该计划效果更佳(基于规则改写,将带有EXISTS的子链接进行了提升)。
gaussdb=# SET enable_stream_operator=off;
SET
gaussdb=# SET costbased_rewrite_rule= pullup_sublink_any_exists;
SET
gaussdb=# EXPLAIN ANALYZE SELECT * FROM ct3 WHERE EXISTS (SELECT 1 from ct1 WHERE ct1.a = ct3.a AND ct1.c < 1000);
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------
Hash Join (cost=26.02..26.66 rows=110 width=16) (actual time=23.507..23.744 rows=110 loops=1)
Hash Cond: (ct3.a = ct1.a)
-> Data Node Scan on ct3 "_REMOTE_TABLE_QUERY_" (cost=0.00..0.00 rows=110 width=16) (actual time=0.741..0.817 rows=110 loops=1)
Node/s: All datanodes
-> Hash (cost=25.59..25.59 rows=34 width=4) (actual time=22.596..22.596 rows=100 loops=1)
Buckets: 32768 Batches: 1 Memory Usage: 4kB
-> HashAggregate (cost=25.25..25.59 rows=34 width=4) (actual time=22.517..22.552 rows=100 loops=1)
Group By Key: ct1.a
-> Data Node Scan on ct1 "_REMOTE_TABLE_QUERY_" (cost=0.00..0.00 rows=10100 width=4) (actual time=1.129..17.014 rows=10100 loops=1)
Node/s: All datanodes
Total runtime: 24.440 ms
(11 rows)
而不应生成如下所示的计划(代价显示虽低但实际更慢)。
gaussdb=# EXPLAIN ANALYZE SELECT * FROM ct3 WHERE EXISTS (SELECT /*+ no_expand */ 1 from ct1 WHERE ct1.a = ct3.a AND ct1.c < 1000);
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------
Data Node Scan on ct3 "_REMOTE_TABLE_QUERY_" (cost=0.00..0.00 rows=55 width=16) (actual time=8.168..120.004 rows=110 loops=1)
Node/s: All datanodes
Coordinator quals: (SubPlan 1)
SubPlan 1
-> Data Node Scan on ct1 "_REMOTE_TABLE_QUERY_" (cost=0.00..0.00 rows=101 width=0) (actual time=119.024..119.024 rows=110 loops=110)
Node/s: All datanodes
Coordinator quals: (a = ct3.a)
Total runtime: 123.777 ms
(8 rows)