Updated on 2024-05-07 GMT+08:00

Overview

An SQL execution plan is a node tree that shows the detailed steps GaussDB takes to run an SQL statement. Each step is represented by a database operator.

You can run the EXPLAIN command to view the execution plan that the optimizer generates for a query. The output of EXPLAIN has one row for each execution node, showing the basic node type and the optimizer's cost estimate for executing that node, as shown below.

gaussdb=#  explain select * from t1,t2 where t1.c1=t2.c2;
                                   QUERY PLAN                                    
---------------------------------------------------------------------------------
 Streaming (type: GATHER)  (cost=14.17..29.07 rows=20 width=180)
   Node/s: All datanodes
   ->  Hash Join  (cost=13.29..27.75 rows=20 width=180)
         Hash Cond: (t2.c2 = t1.c1)
         ->  Streaming(type: REDISTRIBUTE)  (cost=0.00..14.31 rows=20 width=104)
               Spawn on: All datanodes
               ->  Seq Scan on t2  (cost=0.00..13.13 rows=20 width=104)
         ->  Hash  (cost=13.13..13.13 rows=21 width=76)
               ->  Seq Scan on t1  (cost=0.00..13.13 rows=20 width=76)
(9 rows)
  • Nodes at the bottom level are scan nodes. They scan tables and return raw rows. The type of scan node (for example, sequential scan or index scan) depends on the table access method. A bottom-level node may also scan a row source that is not read directly from a table, such as a VALUES clause or a function that returns rows. These sources have their own scan node types.
  • If the query requires join, aggregation, sorting, or other operations on the raw rows, additional nodes appear above the scan nodes to perform these operations. Because each operation can usually be performed in more than one way, different types of execution nodes may appear here.
  • The first row (the topmost node) shows the estimated total execution cost of the plan. This is the value that the optimizer tries to minimize.
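
To see scan nodes whose row source is not a table, you can explain queries such as the following. This is a minimal sketch: the alias v(x) is illustrative, and generate_series is assumed to be available as a set-returning function. The exact node names and costs depend on the GaussDB version and configuration, so no output is reproduced here.

```sql
-- Row source is a VALUES clause rather than a table.
EXPLAIN SELECT * FROM (VALUES (1), (2), (3)) AS v(x);

-- Row source is a function that returns rows
-- (generate_series is assumed to be available).
EXPLAIN SELECT * FROM generate_series(1, 3);
```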

Execution Plan Display Format

GaussDB provides four display formats: normal, pretty, summary, and run.

  • normal: uses the default printing format.
  • pretty: uses the improved display format provided by GaussDB. This format includes plan node IDs, which make performance analysis more direct.
  • summary: adds an analysis of the printed information on top of the pretty format.
  • run: exports the information shown in the summary format as a CSV file for further analysis.

The following is an example of an execution plan in the pretty format.

gaussdb=#  explain select * from t1,t2 where t1.c1=t2.c2;
 id |                operation                | E-rows | E-width | E-costs 
----+-----------------------------------------+--------+---------+---------
  1 | ->  Streaming (type: GATHER)            |     20 |     180 | 29.07
  2 |    ->  Hash Join (3,5)                  |     20 |     180 | 27.75
  3 |       ->  Streaming(type: REDISTRIBUTE) |     20 |     104 | 14.31
  4 |          ->  Seq Scan on t2             |     20 |     104 | 13.13
  5 |       ->  Hash                          |     21 |      76 | 13.13
  6 |          ->  Seq Scan on t1             |     20 |      76 | 13.13
(6 rows)

 Predicate Information (identified by plan id) 
-----------------------------------------------
   2 --Hash Join (3,5)
         Hash Cond: (t2.c2 = t1.c1)
(2 rows)

You can change the display format of execution plans by setting the GUC parameter explain_perf_mode. Later examples use the pretty format by default.
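
As a minimal sketch, the format can be switched with an ordinary SET statement, assuming the parameter may be changed at the session level in your deployment. The query reuses the tables t1 and t2 from the examples above.

```sql
-- Check the display format currently in effect.
SHOW explain_perf_mode;

-- Switch to the pretty format (assumed to apply to the current session only).
SET explain_perf_mode = pretty;
EXPLAIN SELECT * FROM t1, t2 WHERE t1.c1 = t2.c2;

-- Switch back to the normal format.
SET explain_perf_mode = normal;
EXPLAIN SELECT * FROM t1, t2 WHERE t1.c1 = t2.c2;
```

Running the same query under both settings makes it easy to compare the two layouts side by side.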

Execution Plan Information

In addition to choosing a display format, you can use different EXPLAIN syntax to show more or less detail about an execution plan. The common forms are listed below. For the full syntax, see EXPLAIN.

  • EXPLAIN statement: generates an execution plan without executing the statement. Here, statement stands for the SQL statement to be analyzed.
  • EXPLAIN ANALYZE statement: generates an execution plan, executes the statement, and displays an execution summary. Actual runtime statistics are added to the display, including the total elapsed time spent in each plan node (in milliseconds) and the total number of rows each node actually returned.
  • EXPLAIN PERFORMANCE statement: generates an execution plan, executes the statement, and displays all execution information.
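
Applied to the join query used earlier, the three forms look as follows. This is a sketch only: the tables t1 and t2 are those of the earlier examples, and the output is omitted because it depends on the data and environment.

```sql
-- Plan only: the query is not executed.
EXPLAIN SELECT * FROM t1, t2 WHERE t1.c1 = t2.c2;

-- Plan plus actual per-node time and row counts: the query is executed.
EXPLAIN ANALYZE SELECT * FROM t1, t2 WHERE t1.c1 = t2.c2;

-- Plan plus all execution information: the query is executed.
EXPLAIN PERFORMANCE SELECT * FROM t1, t2 WHERE t1.c1 = t2.c2;
```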

To measure the runtime cost of each node in the execution plan, EXPLAIN ANALYZE and EXPLAIN PERFORMANCE add profiling overhead to query execution. As a result, running a query under EXPLAIN ANALYZE or EXPLAIN PERFORMANCE sometimes takes longer than running the query normally. The amount of extra time depends on the complexity of the query and on the platform.

Therefore, if an SQL statement has not finished after running for a long time, run the EXPLAIN command to view its execution plan and locate the problem from the plan alone. If the SQL statement can run to completion, run EXPLAIN ANALYZE or EXPLAIN PERFORMANCE to check both the execution plan and the actual execution information, and then locate the problem.