更新时间:2024-07-01 GMT+08:00

案例:改写SQL排除剪枝干扰

分区表查询中表达式一般不是单纯的分区键,而是包含分区键的表达式的Filter条件,这种类型的Filter条件是不能用来剪枝的。

优化前

t_ddw_f10_op_cust_asset_mon为分区表,分区键为year_mth,此字段是由年月两个值拼接而成的整数。

测试SQL如下:

1
2
3
4
5
SELECT
    count(1) 
FROM t_ddw_f10_op_cust_asset_mon b1
WHERE b1.year_mth  < substr('20200722',1 ,6 )
AND b1.year_mth + 1 >= substr('20200722',1 ,6 );

测试结果显示此SQL的表Scan耗时长达10s,查询SQL语句的执行计划如下

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
EXPLAIN (ANALYZE ON, VERBOSE ON)
SELECT
    count(1)
FROM t_ddw_f10_op_cust_asset_mon b1
WHERE b1.year_mth < substr('20200722',1 ,6 )
AND b1.year_mth + 1 >= cast(substr('20200722',1 ,6 ) AS int);
                                                                                                QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  id |                                   operation                                   |        A-time         |  A-rows  |  E-rows  | E-distinct | Peak Memory  | E-memory | A-width | E-width |  E-costs
 ----+-------------------------------------------------------------------------------+-----------------------+----------+----------+------------+--------------+----------+---------+---------+-----------
   1 | ->  Aggregate                                                                 | 10662.260             |        1 |        1 |            | 32KB         |          |         |       8 | 593656.42
   2 |    ->  Streaming (type: GATHER)                                               | 10662.172             |        4 |        4 |            | 136KB        |          |         |       8 | 593656.42
   3 |       ->  Aggregate                                                           | [9692.785, 10656.068] |        4 |        4 |            | [24KB, 24KB] | 1MB      |         |       8 | 593646.42
   4 |          ->  Partition Iterator                                               | [8787.198, 9629.138]  | 16384000 | 32752850 |            | [16KB, 16KB] | 1MB      |         |       0 | 573175.88
   5 |             ->  Partitioned Seq Scan on public.t_ddw_f10_op_cust_asset_mon b1 | [8365.655, 9152.115]  | 16384000 | 32752850 |            | [32KB, 32KB] | 1MB      |         |       0 | 573175.88

                                 SQL Diagnostic Information
 -------------------------------------------------------------------------------------------
 Partitioned table unprunable Qual
         table public.t_ddw_f10_op_cust_asset_mon b1:
         left side of expression "((year_mth + 1) > 202008)" invokes function-call/type-conversion

                   Predicate Information (identified by plan id)
 ----------------------------------------------------------------------------------
   4 --Partition Iterator
         Iterations: 6
   5 --Partitioned Seq Scan on public.t_ddw_f10_op_cust_asset_mon b1
         Filter: ((b1.year_mth < 202007::bigint) AND ((b1.year_mth + 1) >= 202007))
         Rows Removed by Filter: 81920000
         Partitions Selected by Static Prune: 1..6

优化后

分析语句的执行计划,查看执行计划中的SQL自诊断信息,发现如下诊断信息:

                                 SQL Diagnostic Information
 ------------------------------------------------------------------------------------------
 Partitioned table unprunable Qual
         table public.t_ddw_f10_op_cust_asset_mon b1:
         left side of expression "((year_mth + 1) > 202008)" invokes function-call/type-conversion

Filter条件中存在表达式(year_mth + 1) > 202008,这种表达式一侧不是单纯的分区键、而是包含分区键的表达式的Filter条件是不能用来剪枝的,因而导致查询语句扫描了几乎整个分区表的数据。

跟原始SQL语句对比,可以确定表达式 '(year_mth + 1) > 202008' 是从表达式 'b1.year_mth + 1 > substr('20200822',1 ,6 )' 衍生而来,按照诊断信息把修改SQL语句为如下方式:

1
2
3
4
5
SELECT
    count(1) 
FROM t_ddw_f10_op_cust_asset_mon b1
WHERE b1.year_mth <= substr('20200822',1 ,6 )
AND b1.year_mth > cast(substr('20200822',1 ,6 ) AS int) - 1;

改写之后,SQL语句的执行信息如下,可以看到不剪枝告警已经消除,剪枝后需要扫描分区数为1,执行时间从10s提升至3s。

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
EXPLAIN (analyze ON, verbose ON)
SELECT
    count(1)
FROM t_ddw_f10_op_cust_asset_mon b1
WHERE b1.year_mth < substr('20200722',1 ,6 )
AND b1.year_mth >= cast(substr('20200722',1 ,6 ) AS int) - 1;
                                                                                                QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  id |                                   operation                                   |        A-time        |  A-rows  |  E-rows  | E-distinct | Peak Memory  | E-memory | A-width | E-width |  E-costs
 ----+-------------------------------------------------------------------------------+----------------------+----------+----------+------------+--------------+----------+---------+---------+-----------
   1 | ->  Aggregate                                                                 | 3009.796             |        1 |        1 |            | 32KB         |          |         |       8 | 501541.70
   2 |    ->  Streaming (type: GATHER)                                               | 3009.718             |        4 |        4 |            | 136KB        |          |         |       8 | 501541.70
   3 |       ->  Aggregate                                                           | [2675.509, 3003.298] |        4 |        4 |            | [24KB, 24KB] | 1MB      |         |       8 | 501531.70
   4 |          ->  Partition Iterator                                               | [1820.725, 2053.836] | 16384000 | 16380697 |            | [16KB, 16KB] | 1MB      |         |       0 | 491293.75
   5 |             ->  Partitioned Seq Scan on public.t_ddw_f10_op_cust_asset_mon b1 | [1420.972, 1590.083] | 16384000 | 16380697 |            | [16KB, 16KB] | 1MB      |         |       0 | 491293.75

                Predicate Information (identified by plan id)
 ----------------------------------------------------------------------------
   4 --Partition Iterator
         Iterations: 1
   5 --Partitioned Seq Scan on public.t_ddw_f10_op_cust_asset_mon b1
         Filter: ((b1.year_mth < 202007::bigint) AND (b1.year_mth >= 202006))
         Partitions Selected by Static Prune: 6