Help Center > > Developer Guide> Resource Load Management> Resource Monitoring> Historical TopSQL

Historical TopSQL

Updated at:Jul 15, 2020 GMT+08:00

After the execution of a job is complete, you can retrieve its historical information, including the resource usage (such as the usage of memory, disk, CPU time, and I/O), running status (such as error, terminated, and exception), and performance alarms. You can use historical resource monitoring views at the query- and operator-levels to query for historical TopSQLs whose execution cost is greater than resource_track_cost.

By default, query-level resource monitoring is enabled. To enable resource monitoring at both the query and operator levels, set resource_track_level to operator. The following jobs support historical resource monitoring:

  • Jobs whose execution cost estimated by the optimizer is greater than or equal to resource_track_cost. You can run the EXPLAIN statement to query for the execution cost of a statement.
  • Jobs whose execution time recorded in the real-time resource monitoring view is greater than or equal to resource_track_duration.

The historical resource monitoring view is described in the following table. You can query such a view in the same way as querying database tables. Common users can only query views with the prefix gs.

Table 1 Historical resource monitoring view

Monitoring Level

Monitored Node

Query View

View Description

Query

Current CN

GS_WLM_SESSION_HISTORY

Queries for the load of an executed job.

All CNs

PGXC_WLM_SESSION_HISTORY

Queries for the load of an executed job.

Operator

Current CN

GS_WLM_OPERATOR_HISTORY

Queries for the operator resource information of an executed job.

All CNs

PGXC_WLM_OPERATOR_HISTORY

Queries for the operator resource information of an executed job.

  • Prefixes gs and pgxc indicate views showing single CN information and those showing cluster information, respectively. Common users can log in to a CN in the cluster to query only views with the gs prefix.
  • You can query the gs_wlm_session_info, gs_wlm_operator_info, pgxc_wlm_session_info, and pgxc_wlm_operator_info views only after connecting to the postgres database.
  • If instance fault occurs, some SQL statement information may fail to be recorded in historical resource monitoring views.
  • The SQL statements that can be recorded in historical resource monitoring views are the same as those recorded in real-time resource monitoring views. For details, see SQL statements recorded in real-time resource monitoring views.

The resource monitoring information stored in the history view (for example, Table 1) is archived to the info view (for example, Table 2) at an interval of 3 minutes. After being archived, the records are deleted from the history view. The info view can be queried only when the postgres database is connected.

Table 2 Archiving view of historical resource monitoring

Monitoring Level

Monitored Node

Query View

View Description

Query

Current CN

GS_WLM_SESSION_INFO

Queries for the load of an executed job.

All CNs

PGXC_WLM_SESSION_INFO

Queries for the load of an executed job.

Operator

Current CN

GS_WLM_OPERATOR_INFO

Queries for the operator resource information of an executed job.

All CNs

PGXC_WLM_OPERATOR_INFO

Queries for the operator resource information of an executed job.

If enable_resource_record is set to on, storage space expansion may occur and thereby slightly affects the performance. Therefore, set is to off if record archiving is unnecessary.

Typical Queries for Historical Top SQLs

The query for statements whose execution is complete on the current CN within the last 3 minutes is used as an example. To query for the SQL statements executed on all CNs, replace the prefix gs of the view name in the query statement with pgxc. To query for the SQL statements whose execution is complete 3 minutes ago, replace the suffix history of the view name in the query statement with info.

  • Query level. By default, the first column in each query statement provided in the example is query (that is, the SQL statement to be queried).
    • Query for the start time, end time, execution time, and execution status of the SQL statement. The time unit is ms.
      1
      SELECT query,start_time,finish_time,duration,status FROM gs_wlm_session_history ORDER BY start_time DESC;
      
    • Query for the maximum memory peak of the SQL statement among all DNs, average memory usage during statement execution, and the memory usage skew ratio across DNs. The memory unit is MB.
      1
      SELECT query,max_peak_memory,average_peak_memory,memory_skew_percent FROM gs_wlm_session_history ORDER BY start_time DESC;
      
    • Query for the spill of the SQL statement to each DN, maximum data spill volume among all DNs, average data spill volume among all DNs, and spill skew ratio across DNs. The data volume unit is MB.
      1
      SELECT query,spill_info,max_spill_size,average_spill_size,spill_skew_percent FROM gs_wlm_session_history ORDER BY start_time DESC;
      
    • Query for the maximum execution time of the SQL statements among all DNs, average execution time of the SQL statement among all DNs, and the execution time skew ratio across DNs. The time unit is ms.
      1
      SELECT query,max_dn_time,average_dn_time,dntime_skew_percent FROM gs_wlm_session_history ORDER BY start_time DESC;
      
    • Query for the maximum CPU time of the SQL statement among all DNs, the total CPU time on all DNs, and the CPU time skew across DNs. The time unit is ms.
      1
      SELECT query,max_cpu_time,total_cpu_time,cpu_skew_percent FROM gs_wlm_session_history ORDER BY start_time DESC;
      
    • Query for the maximum IOPS peak of the SQL statement among all DNs, the average IOPS peak among all DNs, and the I/O skew ratio across DNs. The I/Os are counted by ones for column storage and by 10 thousands for row storage.
      1
      SELECT query,max_peak_iops,average_peak_iops,iops_skew_percent FROM gs_wlm_session_history ORDER BY start_time DESC;
      
  • Operator level. By default, the first column in each query statement provided in the example is plan_node_name (that is, the operator name matching plan_node_id).
    • Query for the start time for the operator to process data, total data processing duration of the operator, and operator execution status. The time unit is ms.
      1
      SELECT plan_node_name,start_time,duration FROM gs_wlm_operator_history ORDER BY start_time DESC;
      
    • Query for the maximum memory peak of the operator among all DNs, average memory peak among all DNs, and memory usage skew ratio across DNs. The memory unit is MB.
      1
      SELECT plan_node_name,max_peak_memory,average_peak_memory,memory_skew_percent FROM gs_wlm_operator_history ORDER BY start_time DESC;
      
    • Query for the maximum data spill volume of the operator among all DNs, average data spill volume among all DNs, and spill skew ratio across DNs. The data volume unit is MB.
      1
      SELECT plan_node_name,max_spill_size,average_spill_size,spill_skew_percent FROM gs_wlm_operator_history ORDER BY start_time DESC;
      
    • Query for the maximum execution time of the operator among all DNs, total execution time on all DNs, and execution time skew ratio across DNs. The time unit is ms.
      1
      SELECT plan_node_name,max_cpu_time,total_cpu_time,cpu_skew_percent FROM gs_wlm_operator_history ORDER BY start_time DESC;
      

Precautions

  • If queries are abnormally terminated due to FATAL or PANIC errors, their status will be displayed as aborted and no detailed information will be recorded.
  • Status information in the optimization phase of query parsing cannot be monitored.
  • Prefixes gs and pgxc indicate historical views showing single CN information and those showing cluster information, respectively. Common users can query only the views with the prefix gs.
  • The number of data records that can be retained in memory is limited by the reserved memory size. After a real-time query is complete, its information is imported to a historical view and will be deleted from the history view 3 minutes after being stored in a hash table in memory. The upper limit of records varies by view. For a query-level view, whether a record can be stored depends on the upper limit of records allowed by the memory at that time point. On each CN, the maximum number of real-time query-level records that can be stored in memory is max_session_realt_num (set to 15240 by default). The maximum number of historical records that can be stored in memory is max_session_hist_num (set to 18995 by default). The average execution time of queries in a service system is greater than run_time (in the unit of s). The maximum number of concurrent jobs allowed by real-time views on each CN is equal to num_realt_active, and also to max_session_realt_num. The maximum number of concurrent jobs allowed by historical views on each CN is equal to num_hist_active, and to max_session_hist_num/(180/run_time).
  • For operator-level views, whether a record can be stored depends on the upper limit allowed by the memory at that time point. If the number of plan nodes plus the number of records in the memory exceeds the upper limit, the record cannot be stored. On each CN, the maximum numbers of real-time and historical operator-level records that can be stored in memory are max_oper_realt_num (set to 93622 by default) and max_oper_hist_num (set to 187245 by default), respectively. The average number of plan nodes for queries is num_plan_node. The maximum number of concurrent jobs allowed by real-time views on each CN is equal to num_realt_active, and also to max_oper_realt_num/num_plan_node. The maximum number of concurrent jobs allowed by historical views on each CN is equal to num_hist_active, and to max_oper_hist_num/(180/run_time)/num_plan_node.

Did you find this page helpful?

Submit successfully!

Thank you for your feedback. Your feedback helps make our documentation better.

Failed to submit the feedback. Please try again later.

Which of the following issues have you encountered?







Please complete at least one feedback item.

Content most length 200 character

Content is empty.

OK Cancel