Help Center/ GaussDB/ Centralized_3.x/ SQL Optimization/ Introduction to Plan Trace
Updated on 2024-05-07 GMT+08:00

Introduction to Plan Trace

  1. This feature is used by database kernel developers for in-depth analysis of slow SQL statements. It is not recommended that non-kernel developers use this feature.
  2. After this feature is enabled, optimizer information is recorded in the system catalog during DML execution. As a result, the original read transaction becomes a write transaction, and functions that must be executed in the read transaction, such as pg_create_logical_replication_slot, cannot be executed.

You can use the plan trace feature to view the optimization process of a query plan. In the plan trace, you can view key information such as the path calculation process, path selection process, and path elimination process in the plan to help analyze the root cause of slow SQL statements. This feature can be used in either of the following ways:

-- Prepare tables.
CREATE TABLE tb_a(c1 int);
CREATE TABLE tb_b(c1 int);
CREATE INDEX tb_a_idx_c1 ON tb_a(c1);
CREATE INDEX idx_b ON tb_b(c1);

Method 1: Use the GUC parameter enable_plan_trace to enable the plan trace feature. The procedure is as follows:

  1. Enable the plan trace GUC function.

    set enable_plan_trace = on;

  2. Run the service SQL statement. For example, the service SQL statement is as follows:

    select * from tb_a a, tb_b b where a.c1 = b.c1 and a.c1=1;

  3. View the newly generated plan trace in the gs_my_plan_trace view.

    select * from gs_my_plan_trace order by modifydate limit 1; 

    Generally, plan trace records are large. If gsql is used to connect to the database, you are advised to run the \x command to change the display mode of gsql query results to Expanded.

    The plan trace records are usually large. Therefore, only fragments of the key trace execution result of this example are provided.

    Fragment 1: In the trace, you can view the SQL statement and plan that are being executed.

    query_id      | 69e138356181711a21de1211f639892b
    query         | select * from tb_a a, tb_b b where a.c1 = b.c1 and a.c1=1;
    unique_sql_id | 3388945134
    plan          | Datanode Name: datanode
                  | Nested Loop  (cost=0.00..10.75 rows=144 width=8)
                  |   ->  Index Only Scan using tb_a_idx_c1 on tb_a a  (cost=0.00..4.46 rows=12 width=4)
                  |         Index Cond: (c1 = '***')
                  |   ->  Materialize  (cost=0.00..4.52 rows=12 width=4)
                  |         ->  Index Only Scan using idx_b on tb_b b  (cost=0.00..4.46 rows=12 width=4)
                  |               Index Cond: (c1 = '***')

    Fragment 2: In the trace, you can view the key GUC parameters used by the current SQL statement.

    plan_trace    | [key_guc]
                  | enable_pbe_optimization=1
                  | plan_cache_mode=0
                  | random_page_cost=4.000
                  | enable_hashjoin=1
                  | enable_mergejoin=1
                  | enable_nestloop=1
                  | enable_seqscan=1
                  | effective_cache_size=16385
                  | work_mem=65536
                  | default_statistics_target=100
                  | cost_param=0
                  | =[key_guc]=

    Fragment 3: In the trace, you can view the process of calculating the path cost of the current SQL query plan.

                  | [btcostestimate]
                  | cal: num_sa_scans,1.000000
                  | cal: num_index_tuples=btree_selectivity * index_tuples,48.629630,0.004863,10000.000000
                  | cal: num_index_tuples = rint(num_index_tuples / num_sa_scans),49.000000
                  |
                  | [adt_genericcostestimate]
                  | input: loop_count,1.000000 num_index_tuples,49.000000 index_total_pages,37.000000
                  | cal: num_sa_scans,1.000000 idx_local_tupls,10000.000000
                  | cal: index_selectivity,0.004863
                  | cal: num_index_pages=ceil(num_index_tuples/idx_local_tupls * index_total_pages),1.000000
                  | cal: num_scans=num_sa_scans * loop_count,1.000000
                  | cal: index_total_cost=num_index_pages * spc_random_page_cost,4.000000
                  | cal: index_total_cost += num_index_tuples * num_sa_scans * (cpu_index_tuple_cost + qual_op_cost),4.367500
                  | cal: index_total_cost += num_sa_scans * 100.0 * cpu_operator_cost,4.617500
                  | =[adt_genericcostestimate]=
                  | =[btcostestimate]=

    Fragment 4: In the trace, you can see the elimination process of the base table path: 1. The old path is eliminated. 2. Reasons why the old path was eliminated; 3. Information about the new path.

                  | An old path is removed with cost = 881.806443 .. 932.541443;  rows = 49.000000
                  | The old path and the comparison results are:
                  | {
                  |          old pathid=00000005    Cost = NewBetter        |       PathKeys = Equal        |          BMS = Equal          |         Rows = Equal
                  | }
                  | A new path is accepted with cost = 284.629750 .. 341.718970;  rows = 49.000000
                  | The detail information of the new path:
                  | {
                  |         HashJoin(1:tb_a  2:tb_b ) pathid=00000011 hasparam=0 rows=49 multiple=1.000000 tuples=0.00 rpages=0.00 ipages=0.00 selec=0.00000000 ml=0 iscost=1 lossy=0 uidx=0)  dop=1 cost=284.63..341.72 hint 0 trace_id=#3##4##11#           clauses: a.id = b.id(norm_

    Fragment 5: In the trace, you can see the elimination process of the join path: 1. The old path is eliminated. 2. Reasons why the old path was eliminated; 3. Information about the new path.

                  | An old path is removed with cost = 4.629750 .. 7591.045220;  rows = 49.000000
                  | The old path and the comparison results are:
                  | {
                  |          old pathid=00000008    Cost = Equal            |       PathKeys = Equal        |          BMS = Equal          |         Rows = Equal
                  |         Small fuzzy factor is used!
                  | }
                  | A new path is accepted with cost = 4.629750 .. 7566.167720;  rows = 49.000000
                  | The detail information of the new path:
                  | {
                  |         NestLoop(1:tb_a  2:tb_b ) pathid=00000014 hasparam=0 rows=49 multiple=1.000000 tuples=0.00 rpages=0.00 ipages=0.00 selec=0.00000000 ml=0 iscost=1 lossy=0 uidx=0)  dop=1 cost=4.63..7566.17 hint 0 trace_id=#4##13##14#           clauses: a.id = b.id(norm_
                  |         Small fuzzy factor is used!

    The trace information is easy to understand and is not described here. The personnel who use this feature need to analyze the trace information by themselves.

Method 2: Use the system function gs_plan_trace_watch_sqlid to enable the plan trace feature. The procedure is as follows:

  1. Obtain the unique SQL ID of the SQL statement from the dbe_perf.statement system catalog. For example, run the following SQL statement to obtain the unique SQL ID:

    select * from dbe_perf.statement where query like '%tb_a%'; 

    The value of unique_sql_id in the command output is as follows:

    node_name            | datanode1
    node_id              | 0
    user_name            | qiumc
    user_id              | 10
    unique_sql_id        | 1921680825
    query                | select * from tb_a a, tb_b b where a.id=b.id and a.c1=?;
    n_calls              | 3
    min_elapse_time      | 8880
    max_elapse_time      | 12371
    total_elapse_time    | 32036

  2. A user with the sysadmin permission calls the gs_plan_trace_watch_sqlid function to listen to the unique SQL ID. Example:

    select gs_plan_trace_watch_sqlid(1921680825);

  3. If no plan trace is generated for the unique SQL ID, the unique SQL ID is saved in a memory list. You can use the gs_plan_trace_show_sqlids() function to view the unique SQL ID list of the plan trace to be collected. An example SQL statement is as follows:

    select gs_plan_trace_show_sqlids();
    The execution result of the SQL statement is as follows:
    -[ RECORD 1 ]-------------+------------
    gs_plan_trace_show_sqlids | 1921680825,

  4. If you run the following SQL statement:

    select * from tb_a a, tb_b b where a.id=b.id and a.c1=1; 

    You can also generate plan trace records for the SQL statement.

    Only users with the sysadmin, opradmin, or monadmin permission can call the gs_plan_trace_watch_sqlid and gs_plan_trace_show_sqlids functions. If a common user executes the SQL statement with the unique SQL ID listened by the administrator, the common user can use the gs_my_plan_trace view to view the plan trace generated by the common user.

Generally, plan trace records are large. You need to clear them in a timely manner. Otherwise, a large amount of disk space is occupied. You can use the gs_plan_trace_delete function to delete the plan trace records generated by yourself.

For example, run the following SQL statement:

select gs_plan_trace_delete(TIMESTAMPTZ '2023-01-10 17:16:42.652543+08')

You can delete all plan trace records earlier than or equal to the 2023-01-10 17:16:42.652543+08 for the current user. In this way, each user can delete its own plan trace data.