Execution Information
During SQL optimization, you can use EXPLAIN ANALYZE or EXPLAIN PERFORMANCE to check the actual execution information of an SQL statement. Comparing the actual execution with the optimizer's estimates provides a basis for tuning the statement. EXPLAIN PERFORMANCE additionally provides the execution information on each DN, whereas EXPLAIN ANALYZE does not.
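Note that both statements actually execute the SQL statement in order to collect the runtime figures. For a data-modifying statement, a common practice is therefore to collect the execution information inside a transaction and roll it back afterwards. The following is a minimal sketch; the UPDATE statement is illustrative only and is not part of the example that follows.
-- The statement is executed and its runtime information is printed,
-- but the data change is discarded by the ROLLBACK.
gaussdb=# BEGIN;
gaussdb=# EXPLAIN ANALYZE UPDATE t1 SET c2 = c2 + 1 WHERE c1 < 5;
gaussdb=# ROLLBACK;
The same approach works with EXPLAIN PERFORMANCE.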
The following example shows the execution result in pretty mode. First, create the test tables and collect statistics:
-- Create two tables and insert data into them.
gaussdb=# CREATE TABLE t1 (c1 int, c2 int);
gaussdb=# CREATE TABLE t2 (c1 int, c2 int);
gaussdb=# INSERT INTO t1 SELECT generate_series(1,10), generate_series(1,10);
gaussdb=# INSERT INTO t2 SELECT generate_series(1,10), generate_series(1,10);
gaussdb=# ANALYZE t1;
gaussdb=# ANALYZE t2;
The output of running EXPLAIN PERFORMANCE is as follows:
gaussdb=# EXPLAIN PERFORMANCE SELECT sum(t2.c1) FROM t1,t2 WHERE t1.c1 = t2.c2 GROUP BY t1.c2;
 id |                    operation                    |     A-time      | A-rows | E-rows | E-distinct |  Peak Memory   | E-memory | A-width | E-width | E-costs
----+-------------------------------------------------+-----------------+--------+--------+------------+----------------+----------+---------+---------+---------
  1 | ->  Streaming (type: GATHER)                    | 64.022          |     10 |     10 |            | 82KB           |          |         |      16 | 3.31
  2 |    ->  HashAggregate                            | [54.464,58.410] |     10 |     10 |            | [27KB, 29KB]   | 16MB     | [20,20] |      16 | 2.84
  3 |       ->  Streaming(type: REDISTRIBUTE)         | [54.385,58.318] |     10 |     10 |            | [36KB, 36KB]   | 2MB      |         |       8 | 2.76
  4 |          ->  Hash Join (5,7)                    | [0.471,0.906]   |     10 |     10 | 5, 5       | [8KB, 8KB]     | 1MB      |         |       8 | 2.64
  5 |             ->  Streaming(type: REDISTRIBUTE)   | [0.053,0.385]   |     10 |     10 |            | [36KB, 36KB]   | 2MB      |         |       8 | 1.46
  6 |                ->  Seq Scan on public.t2        | [0.014,0.021]   |     10 |     10 |            | [13KB, 13KB]   | 1MB      |         |       8 | 1.05
  7 |             ->  Hash                            | [0.055,0.071]   |     10 |     10 |            | [293KB, 293KB] | 16MB     | [24,24] |       8 | 1.05
  8 |                ->  Seq Scan on public.t1        | [0.019,0.024]   |     10 |     10 |            | [13KB, 13KB]   | 1MB      |         |       8 | 1.05
(8 rows)

 Predicate Information (identified by plan id)
-----------------------------------------------
   4 --Hash Join (5,7)
         Hash Cond: (t2.c2 = t1.c1)
(2 rows)

 Memory Information (identified by plan id)
-----------------------------------------------------------------------------------------------
 Coordinator Query Peak Memory:
         Query Peak Memory: 1MB
 DataNode Query Peak Memory
         datanode1 Query Peak Memory: 6MB
         datanode2 Query Peak Memory: 6MB
   1 --Streaming (type: GATHER)
         Peak Memory: 82KB, Estimate Memory: 64MB
   2 --HashAggregate
         datanode1 Peak Memory: 29KB, Estimate Memory: 16MB, Width: 20
         datanode2 Peak Memory: 27KB, Estimate Memory: 16MB, Width: 20
   3 --Streaming(type: REDISTRIBUTE)
         datanode1 Peak Memory: 36KB, Estimate Memory: 2MB
         datanode2 Peak Memory: 36KB, Estimate Memory: 2MB
         datanode1 Stream Network: 1kB, Network Poll Time: 0.000; Data Deserialize Time: 0.000
         datanode2 Stream Network: 1kB, Network Poll Time: 0.000; Data Deserialize Time: 0.000
   4 --Hash Join (5,7)
         datanode1 Peak Memory: 8KB, Estimate Memory: 1024KB
         datanode2 Peak Memory: 8KB, Estimate Memory: 1024KB
   5 --Streaming(type: REDISTRIBUTE)
         datanode1 Peak Memory: 36KB, Estimate Memory: 2MB
         datanode2 Peak Memory: 36KB, Estimate Memory: 2MB
         datanode1 Stream Network: 1kB, Network Poll Time: 0.000; Data Deserialize Time: 0.000
         datanode2 Stream Network: 1kB, Network Poll Time: 0.000; Data Deserialize Time: 0.000
   6 --Seq Scan on public.t2
         datanode1 Peak Memory: 13KB, Estimate Memory: 1024KB
         datanode2 Peak Memory: 13KB, Estimate Memory: 1024KB
   7 --Hash
         datanode1 Peak Memory: 293KB, Estimate Memory: 16MB, Width: 24
         datanode2 Peak Memory: 293KB, Estimate Memory: 16MB, Width: 24
         datanode1 Buckets: 32768  Batches: 1  Memory Usage: 1kB
         datanode2 Buckets: 32768  Batches: 1  Memory Usage: 1kB
   8 --Seq Scan on public.t1
         datanode1 Peak Memory: 13KB, Estimate Memory: 1024KB
         datanode2 Peak Memory: 13KB, Estimate Memory: 1024KB
(34 rows)

 Targetlist Information (identified by plan id)
------------------------------------------------
   1 --Streaming (type: GATHER)
         Output: (sum(t2.c1)), t1.c2
         Node/s: All datanodes
   2 --HashAggregate
         Output: sum(t2.c1), t1.c2
         Group By Key: t1.c2
   3 --Streaming(type: REDISTRIBUTE)
         Output: t1.c2, t2.c1
         Distribute Key: t1.c2
         Spawn on: All datanodes
         Consumer Nodes: All datanodes
   4 --Hash Join (5,7)
         Output: t1.c2, t2.c1
   5 --Streaming(type: REDISTRIBUTE)
         Output: t2.c1, t2.c2
         Distribute Key: t2.c2
         Spawn on: All datanodes
         Consumer Nodes: All datanodes
   6 --Seq Scan on public.t2
         Output: t2.c1, t2.c2
         Distribute Key: t2.c1
   7 --Hash
         Output: t1.c2, t1.c1
   8 --Seq Scan on public.t1
         Output: t1.c2, t1.c1
         Distribute Key: t1.c1
(26 rows)

 Datanode Information (identified by plan id)
-----------------------------------------------------------------------------------------
   1 --Streaming (type: GATHER)
         (actual time=57.765..64.022 rows=10 loops=1)
         (Buffers: 0)
         (CPU: ex c/r=16607284, ex row=10, ex cyc=166072848, inc cyc=166072848)
   2 --HashAggregate
         datanode1 (actual time=54.460..54.464 rows=6 loops=1)
         datanode2 (actual time=58.407..58.410 rows=4 loops=1)
         datanode1 (Buffers: 0)
         datanode2 (Buffers: 0)
         datanode1 (CPU: ex c/r=34274, ex row=6, ex cyc=205644, inc cyc=141279464)
         datanode2 (CPU: ex c/r=59030, ex row=4, ex cyc=236120, inc cyc=151512902)
   3 --Streaming(type: REDISTRIBUTE)
         datanode1 (actual time=19.021..54.385 rows=6 loops=1)
         datanode2 (actual time=14.622..58.318 rows=4 loops=1)
         datanode1 (Buffers: 0)
         datanode2 (Buffers: 0)
         datanode1 (CPU: ex c/r=23512303, ex row=6, ex cyc=141073820, inc cyc=141073820)
         datanode2 (CPU: ex c/r=37819195, ex row=4, ex cyc=151276782, inc cyc=151276782)
   4 --Hash Join (5,7)
         datanode1 (actual time=0.451..0.471 rows=6 loops=1)
         datanode2 (actual time=0.869..0.906 rows=4 loops=1)
         datanode1 (Buffers: shared hit=1)
         datanode2 (Buffers: shared hit=1)
         datanode1 (CPU: ex c/r=78319, ex row=12, ex cyc=939828, inc cyc=1220560)
         datanode2 (CPU: ex c/r=145606, ex row=8, ex cyc=1164852, inc cyc=2347916)
   5 --Streaming(type: REDISTRIBUTE)
         datanode1 (actual time=0.046..0.053 rows=6 loops=1)
         datanode2 (actual time=0.358..0.385 rows=4 loops=1)
         datanode1 (Buffers: 0)
         datanode2 (Buffers: 0)
         datanode1 (CPU: ex c/r=22796, ex row=6, ex cyc=136780, inc cyc=136780)
         datanode2 (CPU: ex c/r=249361, ex row=4, ex cyc=997444, inc cyc=997444)
   6 --Seq Scan on public.t2
         datanode1 (actual time=0.016..0.021 rows=6 loops=1)
         datanode2 (actual time=0.010..0.014 rows=4 loops=1)
         datanode1 (Buffers: shared hit=1)
         datanode2 (Buffers: shared hit=1)
         datanode1 (CPU: ex c/r=8410, ex row=6, ex cyc=50460, inc cyc=50460)
         datanode2 (CPU: ex c/r=8769, ex row=4, ex cyc=35076, inc cyc=35076)
   7 --Hash
         datanode1 (actual time=0.055..0.055 rows=6 loops=1)
         datanode2 (actual time=0.071..0.071 rows=4 loops=1)
         datanode1 (Buffers: shared hit=1)
         datanode2 (Buffers: shared hit=1)
         datanode1 (CPU: ex c/r=16183, ex row=6, ex cyc=97100, inc cyc=143952)
         datanode2 (CPU: ex c/r=32248, ex row=4, ex cyc=128992, inc cyc=185620)
   8 --Seq Scan on public.t1
         datanode1 (actual time=0.014..0.019 rows=6 loops=1)
         datanode2 (actual time=0.018..0.024 rows=4 loops=1)
         datanode1 (Buffers: shared hit=1)
         datanode2 (Buffers: shared hit=1)
         datanode1 (CPU: ex c/r=7808, ex row=6, ex cyc=46852, inc cyc=46852)
         datanode2 (CPU: ex c/r=14157, ex row=4, ex cyc=56628, inc cyc=56628)
(53 rows)

 User Define Profiling
---------------------------------------------------------------------------
 Segment Id: 2  Track name: Datanode build connection
        datanode1 (time=0.117 total_calls=1 loops=1)
        datanode2 (time=0.161 total_calls=1 loops=1)
 Plan Node id: 1  Track name: coordinator get datanode connection
        coordinator1: (time=0.023 total_calls=1 loops=1)
 Plan Node id: 1  Track name: Coordinator check and update node definition
        coordinator1: (time=0.001 total_calls=1 loops=1)
 Plan Node id: 1  Track name: Coordinator serialize plan
        coordinator1: (time=1.289 total_calls=1 loops=1)
 Plan Node id: 1  Track name: Coordinator send query id with sync
        coordinator1: (time=0.091 total_calls=1 loops=1)
 Plan Node id: 1  Track name: Coordinator send begin command
        coordinator1: (time=0.002 total_calls=1 loops=1)
 Plan Node id: 1  Track name: Coordinator start transaction and send query
        coordinator1: (time=0.044 total_calls=1 loops=1)
 Plan Node id: 2  Track name: Datanode start up stream thread
        datanode1 (time=0.172 total_calls=1 loops=1)
        datanode2 (time=0.173 total_calls=1 loops=1)
(18 rows)

 ====== Query Summary =====
--------------------------------------------------------------------------
 Datanode executor start time [datanode1, datanode2]: [0.451 ms,0.475 ms]
 Datanode executor run time [datanode1, datanode2]: [54.512 ms,58.450 ms]
 Datanode executor end time [datanode2, datanode1]: [0.098 ms,0.105 ms]
 Remote query poll time: 0.000 ms, Deserialze time: 0.000 ms
 System available mem: 8190361KB
 Query Max mem: 8280473KB
 Query estimated mem: 6464KB
 Coordinator executor start time: 0.285 ms
 Coordinator executor run time: 64.083 ms
 Coordinator executor end time: 0.117 ms
 Total network : 1kB
 Planner runtime: 1.242 ms
 Plan size: 4251 byte
 Query Id: 72620543991481051
 Total runtime: 64.786 ms
(15 rows)

-- Drop the tables.
gaussdb=# DROP TABLE t1,t2;
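In this output, E-rows matches A-rows for every operator because ANALYZE was run right after the data was loaded. A typical way to use the report is to look for operators where the estimated and actual row counts diverge widely, which often indicates stale statistics. The following sketch is illustrative only; it assumes the t1 and t2 tables created above still exist (that is, it is run before the DROP TABLE statement).
-- Load more data without refreshing statistics.
gaussdb=# INSERT INTO t1 SELECT generate_series(1,10000), generate_series(1,10000);
-- E-rows for the scan on t1 still reflects the old statistics,
-- while A-rows shows the real row count.
gaussdb=# EXPLAIN PERFORMANCE SELECT sum(t2.c1) FROM t1,t2 WHERE t1.c1 = t2.c2 GROUP BY t1.c2;
-- Refresh the statistics and check again: E-rows and A-rows converge.
gaussdb=# ANALYZE t1;
gaussdb=# EXPLAIN PERFORMANCE SELECT sum(t2.c1) FROM t1,t2 WHERE t1.c1 = t2.c2 GROUP BY t1.c2;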
In the preceding EXPLAIN PERFORMANCE output, the execution information consists of the following parts:
- The plan is displayed as a table that contains the following columns: id, operation, A-time, A-rows, E-rows, E-distinct, Peak Memory, E-memory, A-width, E-width, and E-costs. The meanings of the plan columns (id, operation, and the columns starting with E) are the same as those in the output of EXPLAIN. For details, see Execution Plans. The columns A-time, A-rows, E-distinct, E-memory, Peak Memory, and A-width are defined as follows:
- A-time: execution completion time of the current operator. When the operator runs on multiple DNs, the two values in brackets are the minimum and maximum completion time across the DNs.
- A-rows: number of tuples actually output by the operator.
- E-distinct: estimated number of distinct values at both ends of the hash join operator.
- E-memory: estimated memory to be used by the operator. (When the value of query_mem is 0, this column is not displayed in the result.)
- Peak Memory: peak memory used by the operator during execution.
- A-width: actual width of each output tuple of the current operator. This column is displayed only for memory-intensive operators, including (Vec)HashJoin, (Vec)HashAgg, (Vec)HashSetOp, (Vec)Sort, and (Vec)Materialize. For (Vec)HashJoin, the width is calculated from its right subtree operator and is displayed on the row of that right subtree.
- Predicate Information (identified by plan id):
This part displays static information that does not change during plan execution, such as join conditions and filter predicates. (A further example that adds a filter predicate is sketched at the end of this section.)
In this example:
- In 4 --Hash Join (5,7), 4 indicates the operator whose ID is 4, and (5,7) indicates that the operator whose ID is 5 and the operator whose ID is 7 are joined.
- Hash Cond: (t2.c2 = t1.c1) indicates the hash join condition.
- Memory Information (identified by plan id):
This part displays the memory usage information printed by certain operators (mainly Hash and Sort): peak memory, control memory, estimate memory, width, auto spread num, and early spilled. If data is spilled to disk, it also shows the spill details: spill Time(s), inner/outer partition spill num, temp file num, spilled data volume, and written disk IO [min, max].
- Targetlist Information (identified by plan id):
This part displays the target columns provided by each operator.
- Datanode Information (identified by plan id):
This part displays the execution time, CPU usage, and buffer usage of each operator on each DN.
- User Define Profiling:
This part displays the thread interaction between CNs and DNs in the stream plan.
- ====== Query Summary =====:
This part summarizes the statement execution, including the total execution time and network traffic, the maximum and minimum executor start and end time across the DNs, the available system memory when the statement is executed, and the estimated memory of the statement.
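As noted under Predicate Information above, filter conditions are also listed in that part. The following minimal sketch is illustrative; it reuses the t1 and t2 tables from the example and assumes they have not been dropped yet. Adding a filter to the query would typically add a Filter entry for the scan on t2 to Predicate Information, next to the Hash Cond entry shown earlier.
-- The extra predicate on t2.c1 is pushed down to the scan on t2 and
-- appears as a Filter condition in Predicate Information.
gaussdb=# EXPLAIN PERFORMANCE SELECT sum(t2.c1) FROM t1,t2 WHERE t1.c1 = t2.c2 AND t2.c1 > 5 GROUP BY t1.c2;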