案例:改写SQL排除剪枝干扰
分区表查询中表达式一般不是单纯的分区键,而是包含分区键的表达式的Filter条件,这种类型的Filter条件是不能用来剪枝的。
优化前
t_ddw_f10_op_cust_asset_mon为分区表,分区键为year_mth,此字段是由年月两个值拼接而成的整数。
测试SQL如下:
1 2 3 4 5 |
SELECT count(1) FROM t_ddw_f10_op_cust_asset_mon b1 WHERE b1.year_mth < substr('20200722',1 ,6 ) AND b1.year_mth + 1 >= substr('20200722',1 ,6 ); |
测试结果显示此SQL的表Scan耗时长达10s,查询SQL语句的执行计划如下
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 |
EXPLAIN (ANALYZE ON, VERBOSE ON) SELECT count(1) FROM t_ddw_f10_op_cust_asset_mon b1 WHERE b1.year_mth < substr('20200722',1 ,6 ) AND b1.year_mth + 1 >= cast(substr('20200722',1 ,6 ) AS int); QUERY PLAN ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- id | operation | A-time | A-rows | E-rows | E-distinct | Peak Memory | E-memory | A-width | E-width | E-costs ----+-------------------------------------------------------------------------------+-----------------------+----------+----------+------------+--------------+----------+---------+---------+----------- 1 | -> Aggregate | 10662.260 | 1 | 1 | | 32KB | | | 8 | 593656.42 2 | -> Streaming (type: GATHER) | 10662.172 | 4 | 4 | | 136KB | | | 8 | 593656.42 3 | -> Aggregate | [9692.785, 10656.068] | 4 | 4 | | [24KB, 24KB] | 1MB | | 8 | 593646.42 4 | -> Partition Iterator | [8787.198, 9629.138] | 16384000 | 32752850 | | [16KB, 16KB] | 1MB | | 0 | 573175.88 5 | -> Partitioned Seq Scan on public.t_ddw_f10_op_cust_asset_mon b1 | [8365.655, 9152.115] | 16384000 | 32752850 | | [32KB, 32KB] | 1MB | | 0 | 573175.88 SQL Diagnostic Information ------------------------------------------------------------------------------------------- Partitioned table unprunable Qual table public.t_ddw_f10_op_cust_asset_mon b1: left side of expression "((year_mth + 1) > 202008)" invokes function-call/type-conversion Predicate Information (identified by plan id) ---------------------------------------------------------------------------------- 4 --Partition Iterator Iterations: 6 5 --Partitioned Seq Scan on public.t_ddw_f10_op_cust_asset_mon b1 Filter: ((b1.year_mth < 202007::bigint) AND ((b1.year_mth + 1) >= 202007)) Rows Removed by Filter: 81920000 Partitions Selected by Static Prune: 1..6 |
优化后
分析语句的执行计划,查看执行计划中的SQL自诊断信息,发现如下诊断信息:
SQL Diagnostic Information ------------------------------------------------------------------------------------------ Partitioned table unprunable Qual table public.t_ddw_f10_op_cust_asset_mon b1: left side of expression "((year_mth + 1) > 202008)" invokes function-call/type-conversion
Filter条件中存在表达式(year_mth + 1) > 202008,这种表达式一侧不是单纯的分区键、而是包含分区键的表达式的Filter条件是不能用来剪枝的,因而导致查询语句扫描了几乎整个分区表的数据。
跟原始SQL语句对比,可以确定表达式 '(year_mth + 1) > 202008' 是从表达式 'b1.year_mth + 1 > substr('20200822',1 ,6 )' 衍生而来,按照诊断信息把修改SQL语句为如下方式:
1 2 3 4 5 |
SELECT count(1) FROM t_ddw_f10_op_cust_asset_mon b1 WHERE b1.year_mth <= substr('20200822',1 ,6 ) AND b1.year_mth > cast(substr('20200822',1 ,6 ) AS int) - 1; |
改写之后,SQL语句的执行信息如下,可以看到不剪枝告警已经消除,剪枝后需要扫描分区数为1,执行时间从10s提升至3s。
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 |
EXPLAIN (analyze ON, verbose ON) SELECT count(1) FROM t_ddw_f10_op_cust_asset_mon b1 WHERE b1.year_mth < substr('20200722',1 ,6 ) AND b1.year_mth >= cast(substr('20200722',1 ,6 ) AS int) - 1; QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- id | operation | A-time | A-rows | E-rows | E-distinct | Peak Memory | E-memory | A-width | E-width | E-costs ----+-------------------------------------------------------------------------------+----------------------+----------+----------+------------+--------------+----------+---------+---------+----------- 1 | -> Aggregate | 3009.796 | 1 | 1 | | 32KB | | | 8 | 501541.70 2 | -> Streaming (type: GATHER) | 3009.718 | 4 | 4 | | 136KB | | | 8 | 501541.70 3 | -> Aggregate | [2675.509, 3003.298] | 4 | 4 | | [24KB, 24KB] | 1MB | | 8 | 501531.70 4 | -> Partition Iterator | [1820.725, 2053.836] | 16384000 | 16380697 | | [16KB, 16KB] | 1MB | | 0 | 491293.75 5 | -> Partitioned Seq Scan on public.t_ddw_f10_op_cust_asset_mon b1 | [1420.972, 1590.083] | 16384000 | 16380697 | | [16KB, 16KB] | 1MB | | 0 | 491293.75 Predicate Information (identified by plan id) ---------------------------------------------------------------------------- 4 --Partition Iterator Iterations: 1 5 --Partitioned Seq Scan on public.t_ddw_f10_op_cust_asset_mon b1 Filter: ((b1.year_mth < 202007::bigint) AND (b1.year_mth >= 202006)) Partitions Selected by Static Prune: 6 |