Help Center/ GaussDB/ Developer Guide(Distributed_8.x)/ SQL Optimization/ Optimization Cases/ Case: Rewriting SQL Statements to Eliminate Pruning Interference
Updated on 2024-08-20 GMT+08:00

Case: Rewriting SQL Statements to Eliminate Pruning Interference

Symptom

In a test at a site, ddw_f10_op_cust_asset_mon is a partitioned table and the partitioning key is year_mth whose value is a combined string of month and year values.

The tested SQL statements are displayed as follows:

1
2
3
4
select  
    count(1) 
from t_ddw_f10_op_cust_asset_mon b1
where b1.year_mth between to_char(add_months(to_date(''20170222'','yyyymmdd'), -11),'yyyymm') and substr(''20170222'',1 ,6 );

The test result shows the Scan operation on the tables in the SQL statement takes 135s. This may be the performance bottleneck.

add_months is a local adaptation function.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
CREATE OR REPLACE FUNCTION ADD_MONTHS(date, integer) RETURNS date
    AS $$
    SELECT
    CASE 
    WHEN (EXTRACT(day FROM $1) = EXTRACT(day FROM (date_trunc('month', $1) + INTERVAL '1 month - 1 day'))) THEN
        date_trunc('month', $1) + CAST($2 + 1 || ' month - 1 day' as interval)
    ELSE
        $1 + CAST($2 || ' month' as interval)
    END
    $$
    LANGUAGE SQL
    IMMUTABLE;

Optimization

According to the statement execution plan, the base table filter is displayed as follows:

Filter: (((year_mth)::text <= '201702'::text) AND ((year_mth)::text >= to_char(add_months(to_date('20170222'::text, 'YYYYMMDD'::text), (-11)), 'YYYYMM'::text)))

The filter condition contains a non-constant expression to_char(add_months(to_date(''20170222'','yyyymmdd'), -11),'yyyymm'), which cannot be used for pruning. Therefore, the query statement scans all data in the partitioned table.

to_date and to_char are stable functions as queried in pg_proc. According to the function behavior described in the database, this type of functions cannot be converted to Const values in the preprocessing phase, which is the root cause why partition pruning cannot be performed.

Based on the preceding analysis, the optimization expression can be used for partition pruning, which is the key to performance optimization. The original SQL statements can be written to as follows:

1
2
3
4
select  
    count(1) 
from t_ddw_f10_op_cust_asset_mon b1
where b1.year_mth between(substr(ADD_MONTHS('20170222'::date, -11), 1, 4)||substr(ADD_MONTHS('20170222'::date, -11), 6, 2)) and substr(''20170222'',1 ,6 );

The execution time of modified SQL statements is reduced from 135s to 18s.