Updated on 2024-06-03 GMT+08:00

Introduction to Plan Trace

  1. This feature is intended for database kernel developers performing in-depth analysis of slow SQL statements. Other users are advised not to use it.
  2. When this feature is enabled, optimizer information is written to the system catalog during DML execution. A read transaction therefore becomes a write transaction, and functions that must run in a read transaction, such as pg_create_logical_replication_slot, cannot be executed.
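
For the second point, one way to avoid the conflict is to confirm that tracing is off in the session before calling such a function. The following is a minimal sketch, assuming the parameter is changed at session level with SET, as in the procedure described later in this section:

```sql
-- Check whether plan trace is currently enabled in this session.
SHOW enable_plan_trace;

-- Disable it before calling a function that requires a read transaction
-- (for example, pg_create_logical_replication_slot).
SET enable_plan_trace = off;
```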

The plan trace feature shows how the optimizer arrives at a query plan. A plan trace records key information such as how path costs are calculated, which paths are selected, and which paths are eliminated, which helps you analyze the root cause of a slow SQL statement. The feature can be used in either of two ways. Both examples use the following tables:

-- Prepare tables.
CREATE TABLE tb_a(c1 int);
CREATE TABLE tb_b(c1 int);
CREATE INDEX tb_a_idx_c1 ON tb_a(c1);
CREATE INDEX idx_b ON tb_b(c1);

Method 1: Use the GUC parameter enable_plan_trace to enable the plan trace feature. The procedure is as follows:

  1. Enable the enable_plan_trace GUC parameter.

    SET enable_plan_trace = on;

  2. Run the service SQL statement. For example:

    SELECT * FROM tb_a a, tb_b b WHERE a.c1 = b.c1 AND a.c1=1;

  3. View the newly generated plan trace in the gs_my_plan_trace view.

    SELECT * FROM gs_my_plan_trace ORDER BY modifydate DESC LIMIT 1;

    Plan trace records are generally large. If you connect to the database with gsql, you are advised to run the \x command first to switch the query result display to expanded mode.

    Because the records are large, only key fragments of the trace for this example are shown below.

    Fragment 1: In the trace, you can view the SQL statement and plan that are being executed.

    query_id      | 4f078a966a1c4a434167b2f780bbfd92
    query         | select * from tb_a a, tb_b b where a.c1 = b.c1 and a.c1=1;
    unique_sql_id | 2108646922
    plan          | Datanode Name: datanode
                  | Nested Loop  (cost=0.00..81.88 rows=144 width=8)
                  |   ->  Seq Scan on tb_a a  (cost=0.00..40.03 rows=12 width=4)
                  |         Filter: (c1 = '***')
                  |   ->  Materialize  (cost=0.00..40.09 rows=12 width=4)
                  |         ->  Seq Scan on tb_b b  (cost=0.00..40.03 rows=12 width=4)
                  |               Filter: (c1 = '***')

    Fragment 2: In the trace, you can view the key GUC parameters used by the current SQL statement.

    plan_trace    | [key_guc]
                  | enable_pbe_optimization=1
                  | plan_cache_mode=0
                  | random_page_cost=4.000
                  | enable_hashjoin=1
                  | enable_mergejoin=1
                  | enable_nestloop=1
                  | enable_seqscan=1
                  | effective_cache_size=16385
                  | work_mem=65536
                  | default_statistics_target=100
                  | cost_param=0
                  | =[key_guc]=

    Fragment 3: In the trace, you can view the process of calculating the path cost of the current SQL query plan.

                  | [optcost_initial_cost_nestloop]
                  | method_initial_state: inner_pathid,2 outer_pathid,1 inner_start_cost,0.000000 inner_total_cost,40.025000 outer_start_cost,0.000000 outer_total_cost,40.025000 outer_path_rows,12.000000
                  | cal: inner_rescan_start_cost,0.000000 inner_rescan_total_cost,40.025000
                  | cal: inner_run_cost = inner_total_cost - inner_start_cost 40.025000, 40.025000, 0.000000
                  | cal: inner_rescan_run_cost = inner_rescan_total_cost - inner_rescan_start_cost 40.025000
                  | cal: startup_cost += outer_start_cost + inner_start_cost 0.000000
                  | cal: run_cost += outer_total_cost - outer_start_cost 40.025000
                  | cal: run_cost += (outer_path_rows - 1) * inner_rescan_start_cost 40.025000
                  | cal: run_cost += inner_run_cost 80.050000
                  | cal: run_cost += (outer_path_rows - 1) * inner_rescan_run_cost 520.325000
                  | Initial nestloop cost: startup_cost: 0.000000, total_cost: 520.325000
                  | =[optcost_initial_cost_nestloop]=
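
    The arithmetic in this fragment can be re-checked by hand. The following sketch only replays the logged "cal:" steps with the values from method_initial_state (it is not the optimizer's own code). It should return 520.325, which matches the logged total_cost of 520.325000:

    ```sql
    -- Replay of the [optcost_initial_cost_nestloop] steps with the logged inputs:
    -- outer/inner start cost 0.000, outer/inner total cost 40.025, outer_path_rows 12.
    SELECT (0.000 + 0.000)                  -- startup_cost: outer_start_cost + inner_start_cost
         + (40.025 - 0.000)                 -- run_cost += outer_total_cost - outer_start_cost
         + (12 - 1) * 0.000                 -- run_cost += (outer_path_rows - 1) * inner_rescan_start_cost
         + (40.025 - 0.000)                 -- run_cost += inner_run_cost
         + (12 - 1) * (40.025 - 0.000)      -- run_cost += (outer_path_rows - 1) * inner_rescan_run_cost
           AS total_cost;
    ```

    The dominant term is the last one: the inner side is rescanned once for each of the remaining 11 outer rows.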

    Fragment 4: In the trace, you can view the elimination process of the base table path. The trace shows, in order, (1) the old path that is removed, (2) the comparison results that explain why it was removed, and (3) the details of the new path that is accepted.

                  | An old path is removed with cost = 0.000000 .. 521.765000;  rows = 144.000000
                  | The old path and the comparison results are:
                  | {
                  |          old pathid=00000004    Cost = NewBetter        |       PathKeys = Equal        |          BMS = Equal          |         Rows = Equal
                  | }
                  | A new path is accepted with cost = 0.000000 .. 81.880000;  rows = 144.000000
                  | The detail information of the new path:
                  | {
                  |         NestLoop(1:tb_a  2:tb_b ) pathid=00000005 hasparam=0 rows=144 multiple=1.000000 tuples=0.00 rpages=0.00 ipages=0.00 selec=0.00000000 ml=0 iscost=1 lossy=0 uidx=0)  dop=1 cost=0.00..81.88 hint 0 trace_id=#1##3##5#      clauses:  outerpathid=00000001 innerpathid=00000003
                  | }
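
The lookup in step 3 can be narrowed so that the large trace text is fetched only when it is needed. The following is a minimal sketch that uses only columns visible in the fragments above (query_id, query, unique_sql_id, plan_trace, and modifydate). The LIMIT of 5 and the filter value are illustrative:

```sql
-- In gsql, toggle expanded display first so that wide records stay readable.
\x

-- List the most recent traces without pulling the large trace text.
SELECT query_id, unique_sql_id, modifydate, query
FROM gs_my_plan_trace
ORDER BY modifydate DESC
LIMIT 5;

-- Fetch the full trace for one statement; 2108646922 is the unique_sql_id shown in Fragment 1.
SELECT plan_trace
FROM gs_my_plan_trace
WHERE unique_sql_id = 2108646922
ORDER BY modifydate DESC
LIMIT 1;
```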

Method 2: Use the system function gs_plan_trace_watch_sqlid to enable the plan trace feature. The procedure is as follows:

  1. Obtain the unique SQL ID of the SQL statement from the dbe_perf.statement view. For example:

    SELECT * FROM dbe_perf.statement WHERE query LIKE '%tb_a%'; 

    The value of unique_sql_id in the command output is as follows:

    node_name            | datanode1
    node_id              | 0
    user_name            | qiumc
    user_id              | 10
    unique_sql_id        | 1921680825
    query                | select * from tb_a a, tb_b b where a.id=b.id and a.c1=?;
    n_calls              | 3
    min_elapse_time      | 8880
    max_elapse_time      | 12371
    total_elapse_time    | 32036

  2. As a user with the sysadmin permission, call the gs_plan_trace_watch_sqlid function to watch the unique SQL ID. For example:

    SELECT gs_plan_trace_watch_sqlid(1921680825);

  3. As long as no plan trace has been generated for the unique SQL ID, the ID is kept in an in-memory list. You can call the gs_plan_trace_show_sqlids() function to view the list of unique SQL IDs whose plan traces are still to be collected. For example:

    SELECT gs_plan_trace_show_sqlids();

    The execution result of the SQL statement is as follows:

    -[ RECORD 1 ]-------------+------------
    gs_plan_trace_show_sqlids | 1921680825,

  4. Run the SQL statement that corresponds to the watched unique SQL ID. For example:

    SELECT * FROM tb_a a, tb_b b WHERE a.c1=b.c1 AND a.c1=1;

    A plan trace record is generated for the statement. The subsequent steps are the same as step 3 of method 1.

    Only users with the sysadmin, opradmin, or monadmin permission can call the gs_plan_trace_watch_sqlid and gs_plan_trace_show_sqlids functions. If a common user executes an SQL statement whose unique SQL ID is being watched by an administrator, the common user can query the gs_my_plan_trace view to see the plan trace generated by that execution.
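
Method 2 can be summarized as a single session. The sketch below reuses the example ID 1921680825 from step 1 and selects only columns that appear in the outputs above. It assumes that the caller holds one of the required permissions and is also the user who runs the watched statement, because gs_my_plan_trace shows only the traces generated by the current user:

```sql
-- 1. Find the unique SQL ID (narrower column list than SELECT *).
SELECT unique_sql_id, n_calls, query
FROM dbe_perf.statement
WHERE query LIKE '%tb_a%';

-- 2. Watch the ID, then confirm that it is in the pending list.
SELECT gs_plan_trace_watch_sqlid(1921680825);
SELECT gs_plan_trace_show_sqlids();

-- 3. After the watched statement has run, look up its most recent trace.
SELECT query_id, modifydate, plan_trace
FROM gs_my_plan_trace
WHERE unique_sql_id = 1921680825
ORDER BY modifydate DESC
LIMIT 1;
```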

Plan trace records are generally large, so clear them promptly to avoid occupying a large amount of disk space. You can use the gs_plan_trace_delete function to delete the plan trace records that you generated.

For example, run the following SQL statement:

SELECT gs_plan_trace_delete(TIMESTAMPTZ '2023-01-10 17:16:42.652543+08');

This deletes all plan trace records of the current user that are dated 2023-01-10 17:16:42.652543+08 or earlier. In this way, each user can delete their own plan trace data.
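
Instead of a fixed timestamp, the cutoff can be computed relative to the current time. This is a sketch only: the 7-day retention period is an arbitrary example, and the preview query assumes that the cutoff is compared against the modifydate column, which is not stated explicitly here.

```sql
-- Preview how many of your records fall at or before the cutoff
-- (assumption: modifydate is the column that the cutoff applies to).
SELECT count(*)
FROM gs_my_plan_trace
WHERE modifydate <= now() - INTERVAL '7 days';

-- Delete the current user's plan trace records up to the same cutoff.
SELECT gs_plan_trace_delete(now() - INTERVAL '7 days');
```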