Help Center> GaussDB(DWS)> Troubleshooting> Cluster Performance> SQL Query Is Slow Because Partitions Are Not Pruned
Updated on 2024-01-25 GMT+08:00

SQL Query Is Slow Because Partitions Are Not Pruned

Symptom

The query of three SQL statements is slow. The queried partitioned table contains 18.5 billion data records, and the query criteria do not contain the partition key.

1
2
SELECT passtime FROM table where passtime<'2020-02-19 15:28:14' and passtime>'2020-02-18 15:28:37' order by passtime desc limit 10; 
SELECT max(passtime) FROM table where passtime<'2020-02-19 15:28:14' and passtime>'2020-02-18 15:28:37'; 

For a column-store table, the partition key is createtime and the hash distribution key is motorvehicleid.

Possible Causes

The query criteria of slow SQL statements do not include the partition field. As a result, partitions are not pruned from the execution plan and the entire table is scanned, severely deteriorating performance.

Cause Analysis

  1. Some services of the customer are slow, and all these services involve the same table tb_motor_vehicle.
  2. Collect several typical slow SQL statements and print their execution plans. The execution plans show that during the execution of the two SQL statements, time is mostly spent on the partition scanning of the Partitioned CStore Scan on public.tb_motor_vehicle column-store table.

  3. According to the customer, the partition key of the table is createtime. However, the query criteria of the involved SQL statements do not contain createtime. It can be confirmed that partitions are not pruned from the execution plan of slow SQL statements. As a result, the entire table with 18.5 billion data records is scanned, and the scanning performance is poor.
  4. After the partition key is added to the query criteria, the optimized SQL statement and execution plan are as follows. The query duration is slashed from more than 10 minutes to about 12 seconds.

    1
    SELECT passtime FROM tb_motor_vehicle WHERE createtime > '2020-02-19 00:00:00' AND createtime < '2020-02-20 00:00:00' AND passtime > '2020-02-19 00:00:00' AND passtime < '2020-02-20 00:00:00' ORDER BY passtime DESC LIMIT 10000;
    

Handling Procedure

Add the partition query criteria to the slow SQL statements to prevent full table scanning.