Updated on 2024-04-26 GMT+08:00

Monitoring Metrics

You can check the status and available resources of a cluster and learn about its real-time resource consumption through the GaussDB(DWS) monitoring items.

Table 1 describes GaussDB(DWS) monitoring metrics.

Table 1 GaussDB(DWS) monitoring metrics

Monitored Object

Metric Description

Value Range

Monitoring Interval (Raw Data)

Abnormal Job Monitoring Statistics

Query ID

String

30s

Statement executed for exception handling

String

Block time before the statement is executed

≥ 0

Elapsed time when the statement is executed

≥ 0

Total time used by the CPU on the DN when the statement is executed for exception handling

≥ 0

CPU usage skew on the DN when the statement is executed for exception handling

≥ 0

cgroups used for exception handling during statement execution

String

Status of a statement after exception handling

String

Exception handling action executed by the statement

String

Reason why the statement is processed abnormally

String

Node Status Statistics

Host name

String

60s

Host status

String

Instances by Status

Host name

String

60s

Instance type

String

Instance role

String

Instance status

String

Instance status cause

String

Cluster Status

Cluster status

String

30s

Whether an active/standby switchover has occurred

String

Whether redistribution has occurred

String

Whether the current cluster is read-only

String

CPU status

The default status is ALL.

String

30s

User-mode CPU time %

≥ 0.0

CPU time the process with a negative nice value (%)

≥ 0.0

Kernel mode time (%)

≥ 0.0

I/O wait time (%)

≥ 0.0

Hard interruption time (%)

≥ 0.0

Software interrupt time (%)

≥ 0.0

Time spent by the virtual CPU in involuntary waiting when the virtual machine manager serves another virtual processor (%)

≥ 0.0

Time spent running the virtual processor (%)

≥ 0.0

Idle time except for disk waiting operations (%)

≥ 0.0

Hyper-threading capability

Yes/No

Hyper-threading enabled

Yes/No

Number of processes in the runable status

≥ 0

Number of processes waiting for I/O completion.

≥ 0

Active Sessions

Database name

String

30s

Instance name

String

Number of all user sessions

≥ 0

Distinct username

≥ 0

Distinct application name

≥ 0

Number of queries in the active or fastpathfunctioncall state.

≥ 0

Disk Capacity Statistics

Instance name

String

86400s

Database name

String

Database size

≥ 0

Transaction Status.

Database name

String

60s

Instance name

String

Number of rows returned through a global database scan

≥ 0

Number of rows returned by querying indexes in the database

≥ 0

Number of rows inserted by queries in this database

≥ 0

Number of rows updated by queries in this database

≥ 0

Number of rows deleted by queries in this database

≥ 0

Number of transactions in this database that have been committed

≥ 0

Number of transactions in this database that have been rolled back

≥ 0

Number of deadlocks detected in this database

≥ 0

Number of disk blocks read in this database

≥ 0

Number of disk blocks that have been found in the cache. In this case, the read is unnecessary. (This includes only the postgresql buffer cache and does not include the file system cache of the operating system.)

≥ 0

Time spent reading data file blocks by backends in this database, in milliseconds

≥ 0.0

Time spent writing data file blocks by backends in this database, in milliseconds

≥ 0.0

Number of queries canceled due to database recovery conflicts (conflicts that occur only on the standby server).

≥ 0

Number of temporary files created by queries in this database. This parameter calculate all temporary files (such as sorting or hashing) and ignores the log_temp_files setting.

≥ 0

Total amount of data written to temporary files by queries in this database. This parameter calculates all temporary files and ignores the log_temp_files setting.

≥ 0

Database capacity, in bytes.

≥ 0

Number of rows returned by global database scanning in a unit time

≥ 0

Number of rows returned by querying indexes in the database in a unit time

≥ 0

Number of rows inserted through database query in a unit time

≥ 0

Number of rows updated through database query in unit time

≥ 0

Number of rows deleted by querying the database in a unit time

≥ 0

Number of transactions that have been submitted in the database per unit time

≥ 0

Number of transactions that have been rolled back in the database per unit time

≥ 0

Number of deadlocks retrieved in the database per unit time

≥ 0

File Handle

Name of the disk file system

String

30s

Total inode capacity (unit: KB)

≥ 0

Used capacity (unit: KB)

≥ 0

Node Disk Usage

Name of the disk file system

≥ 0

30s

Total space (unit: KB)

≥ 0

Used capacity (unit: KB)

≥ 0

Available capacity (unit: KB)

≥ 0

Disk usage

≥ 0

gsar NIC Usage Statistics

Node name

String

30s

NIC name

String

NIC IP address

String

Data received by the NIC (unit: KB)

≥ 0

Number of packets received by the NIC

≥ 0

Average length of received packets (unit: byte)

≥ 0

Number of received data packets that are dropped by the NIC

≥ 0

Port transmit discard rate

≥ 0.0

Data sent by the network adapter (unit: KB)

≥ 0

Number of packets sent by the NIC

≥ 0

Average length of received packets (unit: byte)

≥ 0

gsar TCP Statistics

Number of retransmitted TCP packets due to timeout

≥ 0

30s

Number of sent TCP packets

≥ 0

Number of retransmitted TCP packets

≥ 0

TCP retransmission rate

≥ 0.0

Node Disk I/O Statistics

Disk name (devicename)

String

30s

Number of transmissions per second (transferpersecond). The size of each transmission is unknown.

≥ 0.0

Amount of data read from the device per second (unit: KB)

≥ 0.0

Amount of data written to the device per second (unit: KB)

≥ 0.0

Total amount of read data (unit: KB)

≥ 0.0

Total amount of written data (unit: KB)

≥ 0.0

Number of times that read requests to the device are combined per second

≥ 0.0

Number of times that write requests to the device are combined per second

≥ 0.0

Number of completed reads per second

≥ 0.0

Number of completed writes per second

≥ 0.0

Amount of data read per second (unit: KB)

≥ 0.0

Amount of data written per second (unit: KB)

≥ 0.0

Average data volume of each I/O operation (unit: sector)

≥ 0.0

Average request queue length

≥ 0.0

Average waiting time for each I/O request (unit: ms)

≥ 0.0

Average processing time for each I/O request (unit: ms)

≥ 0.0

Percentage of the time when the I/O queue is not empty (I/O operation time divided by the total time)

≥ 0.0

Instance Memory Monitoring Statistics

Instance name

String

60s

Memory size occupied by the instance

≥ 0.0

Memory size used by a process

≥ 0.0

Maximum dynamic memory

≥ 0.0

Used dynamic memory

≥ 0.0

Dynamic peak memory

≥ 0.0

Maximum dynamic shared memory context

≥ 0.0

Dynamic peak value of the shared memory context

≥ 0.0

Maximum shared memory

≥ 0.0

Used shared memory

≥ 0.0

Maximum memory allowed by column store

≥ 0.0

Memory used in column store

≥ 0.0

Maximum memory that can be used by the communication library

≥ 0.0

Used memory size of the communication library

≥ 0.0

Peak memory usage of the communication library

≥ 0.0

Maximum memory that can be used by top SQLs to record historical job monitoring information

≥ 0.0

Peak memory usage of the top SQL that records historical job monitoring information

≥ 0.0

Memory used by top SQLs to record historical job monitoring information

≥ 0.0

Other used memory

≥ 0.0

Memory size occupied by pooler connections

≥ 0.0

Memory size occupied by pooler idle connections

≥ 0.0

Memory size used by column-store compression and decompression

≥ 0.0

Memory reserved for the UDFWorker process

≥ 0.0

Memory size used by the MMAP

≥ 0.0

Instance Resource Statistics

Instance name

String

60s

Read the value (CPU usage %) in postmaster.pID/cm_server.pID/gtm.pID/etcd.pID.

≥ 0.0

Read the value (memory usage %) in postmaster.pID/cm_server.pID/gtm.pID/etcd.pID.

≥ 0.0

Instance Disk Size Statistics

Instance name

String

86400s

Storage location

String

Disk space used by all databases on the current instance

≥ 0

Node Memory Statistics

Size of all available RAMs, that is, the remaining physical memory minus the reserved bits and kernel usage (unit: KB)

≥ 0

30s

Unused memory in the system. The value is lowfree+highfree (unit: KB).

≥ 0

Size of the cache used for the block device (unit: KB)

≥ 0

Size of the file buffer (unit: KB)

≥ 0

Total swap space (unit: KB)

≥ 0

Size of the RAM memory temporarily stored in the swap file (unit: KB)

≥ 0

Memory size of the virus-infected page (unit: KB)

≥ 0

Network Status Statistics

NIC name

String

30s

NIC status (up/down)

up/down

NIC speed (1000 Mbit/s or 100 Mbit/s)

≥ 0

Total data received by the NIC (unit: byte)

≥ 0

Number of packets received by the NIC

≥ 0

Total number of NIC receiving errors

≥ 0

Number of received data packets that are dropped by the NIC

≥ 0

Number of FIFO buffer errors during reception

≥ 0

Number of received packet frame errors

≥ 0

Number of received compressed data packets

≥ 0

Number of received multicast frames

≥ 0

Total data sent by the NIC (unit: byte)

≥ 0

Total number of packets sent by the NIC

≥ 0

Total number of NIC sending errors

≥ 0

Total number of data packets discarded by the NIC during transmission

≥ 0

Number of FIFO buffer errors during sending

≥ 0

Number of collisions detected on the sending interface

≥ 0

Number of carrier losses detected by the device driver during transmission

≥ 0

Number of sent compressed data packets

≥ 0

Specifies whether NIC multi-queue is supported.

Yes/No

NIC multi-queue is enabled.

Yes/No

Specifies the CPU affinity of a multi-queue NIC.

String

Indicates whether the NIC works in duplex mode.

String

Network speed

≥ 0.0

Node SQL Statistics

Node name

String

60s

Username

String

Number of SELECT statements

≥ 0

Number of UPDATE statements

≥ 0

Number of INSERT statements

≥ 0

Number of DELETE statements

≥ 0

Number of MERGEINTO statements

≥ 0

Number of DDL statements

≥ 0

Number of DML statements

≥ 0

Number of DCL statements

≥ 0

Total response time of SELECT statements

≥ 0

Average response time of SELECT statements

≥ 0

Maximum response time of SELECT statements

≥ 0

Minimum response time of SELECT statements

≥ 0

Total response time of UPDATE statements

≥ 0

Maximum response time of UPDATE statements

≥ 0

Minimum response time of UPDATE statements

≥ 0

Minimum response time of UPDATE statements

≥ 0

Total response time of DELETE statements

≥ 0

Average response time of DELETE statements

≥ 0

Maximum response time of DELETE statements

≥ 0

Minimum response time of DELETE statements

≥ 0

Total response time of INSERT statements

≥ 0

Average response time of INSERT statements

≥ 0

Maximum response time of INSERT statements

≥ 0

Minimum response time of INSERT statements

≥ 0

Difference between the numbers of SELECT statements at two time points

≥ 0

Difference between the numbers of UPDATE statements at two time points

≥ 0

Difference between the numbers of INSERT statements at two time points

≥ 0

Difference between the numbers of DELETE statements at two time points

≥ 0

Difference between the numbers of MERGE INTO statements at two time points

≥ 0

Difference between the numbers of DDL statements at two time points

≥ 0

Difference between the numbers of DML statements at two time points

≥ 0

Difference between the numbers of DCL statements at two time points

≥ 0

Difference between the total SELECT response time at two time points

≥ 0

Difference between the average SELECT response time at two time points

≥ 0

Difference between the maximum SELECT response time at two time points

≥ 0

Difference between the minimum SELECT response time at two time points

≥ 0

Difference between the total UPDATE response time at two time points

≥ 0

Difference between the maximum UPDATE response time at two time points

≥ 0

Difference between the minimum UPDATE response time at two time points

≥ 0

Difference between the minimum UPDATE response time at two time points

≥ 0

Difference between the total DELETE response time at two time points

≥ 0

Difference between the average DELETE response time at two time points

≥ 0

Difference between the maximum DELETE response time difference at two time points

≥ 0

Difference between the minimum DELETE response time difference at two time points

≥ 0

Difference between the total INSERT response time at two time points

≥ 0

Difference between the average INSERT response time at two time points

≥ 0

Difference between the maximum INSERT response time at two time points

≥ 0

Difference between the minimum INSERT response time at two time points

≥ 0

System Status Statistics

TCP protocol stack retransmission rate (%)

≥ 0.0

30s

Top SQL Statistics

Database name

String

60s

Instance name

String

Thread ID (session ID and session ID)

String

Internal query_ID used for statement execution

String

Job type, which can be set using the guc parameter query_band. The default value is a null string.

String

The value is obtained from the query_band field. The position is 0.

String

The value is obtained from the query_band field. The position is 1.

String

Username used for connecting to the backend

String

Name of the application that is connected to the backend

String

IP address of the client connected to the backend. A null value suggests a Unix socket connection or an internal server process, such as autovacuum.

String

Host name of the connected client, as reported by a reverse DNS lookup of client_addr. This column will only be non-null for IP connections, and only when log_hostname is enabled.

String

TCP port number used by the client to communicate with the backend. If the Unix socket is used, the value is -1.

String

Whether the backend is currently waiting on a lock. If yes, the value is true.

Yes/No

Time when the statement starts to be executed

≥ 0

Block time before the statement is executed. The unit is ms.

≥ 0

Duration that a statement has been executed (unit: ms)

≥ 0

Estimated execution time of a statement. The unit is ms.

≥ 0

Estimated remaining time of statement execution. The unit is ms.

≥ 0

Resource status in workload management

String

Resource pool used by the user

String

Priority of a job in the resource pool. The options are as follows:

  • 1: Low
  • 2: Medium
  • 4: High
  • 8: Rush

≥ 0

Cgroup used by the statement

String

Minimum memory peak of a statement across all DNs. The unit is MB.

≥ 0

Maximum memory peak of a statement across all DNs. The unit is MB.

≥ 0

Average memory usage during statement execution. The unit is MB.

≥ 0

Memory usage skew of a statement among DNs.

≥ 0

Estimated memory used by the statement. The unit is MB.

≥ 0

Statement spill information on all DNs

String

Minimum spilled data among all DNs when a spill occurs. The default value is 0 (unit: MB).

≥ 0

Maximum spilled data among all DNs when a spill occurs. The default value is 0 (unit: MB).

≥ 0

Average spilled data among all DNs when a spill occurs. The default value is 0 (unit: MB).

≥ 0

DN spill skew when a spill occurs

≥ 0

Minimum execution time of a statement across all DNs. The unit is ms.

≥ 0

Maximum execution time of a statement across all DNs. The unit is ms.

≥ 0

Average execution time of a statement across all DNs. The unit is ms.

≥ 0

Execution time skew of a statement among DNs.

≥ 0

Minimum CPU time of a statement across all DNs. The unit is ms.

≥ 0

Maximum CPU time of a statement across all DNs. The unit is ms.

≥ 0

Total CPU time of a statement across all DNs. The unit is ms.

≥ 0

CPU time skew of a statement among DNs.

≥ 0

Warning. The following warnings and warnings related to SQL self-diagnosis tuning are displayed:

String

Average IOPS peak of a statement across all DNs. It is counted by ones in a column-store table and by ten thousands in a row-store table.

≥ 0

I/O skew of a statement among DNs

≥ 0

Maximum IOPS of a statement across all DNs. It is counted by ones in a column-store table and by ten thousands in a row-store table.

≥ 0

Minimum IOPS of a statement across all DNs. It is counted by ones in a column-store table and by ten thousands in a row-store table.

≥ 0

Query statement

String

Query plan

String

Real-time running status of the current query statement. The value can be active, idle, idleintransaction, idleintransaction (aborted), fastpathfunctioncall, or disabled.

String

Running status of the query statement in the resource pool. The value can be pending, running, finished, aborted, active or unknown.

String

Statement attribute (ordinary, simple, complicated, or internal).

String

Fast and slow lanes (fast or slow)

String

Whether a query is a system query

Yes/No

Whether the query is a system query (adapting to monitor)

Yes/No

Time when this process was started, that is, when the client connected to the server

≥ 0

Execution time so far.

≥ 0

Time when the current transaction was started (NULL if no transactions are active). If the current query is the first of its transaction, the value of this column is the same as that of the query_start column.

≥ 0

Time when the status was changed in the previous time

≥ 0

Time when a statement starts to be executed

≥ 0

Actual execution duration of the statement, in seconds.

≥ 0

Historical Queries

Database name

String

180s

Instance name

String

Username

String

Name of the application connected to the backend

String

IP address of the client connected to the backend. A null value suggests a Unix socket connection or an internal server process, such as autovacuum.

String

Host name of the connected client, as reported by a reverse DNS lookup of client_addr. This column will only be non-null for IP connections, and only when log_hostname is enabled.

String

TCP port number used by the client to communicate with the backend. If the Unix socket is used, the value is -1.

String

Job type, which can be set using the guc parameter query_band. The default value is a null string.

String

The value is obtained from the query_band field. The position is 0.

String

The value is obtained from the query_band field. The position is 1.

String

Duration that a statement is blocked before being executed, including the statement parsing and optimization duration. The unit is ms.

≥ 0

Execution start time of a statement (unit: ms)

≥ 0

Execution end time of a statement (unit: ms)

≥ 0

Duration that a statement has been executed (unit: ms)

≥ 0

Estimated statement execution time (unit: ms)

≥ 0

Statement execution end status:

  • Finished (normal)
  • Aborted (Abnormal)

String

Exception information displayed if the final statement execution status is aborted.

String

Resource pool used by the user

String

Priority of a job in the resource pool. The options are as follows:

  • 8: Rush
  • 4: High
  • 2: Medium
  • 1: Low

≥ 0

Cgroup used by the statement

String

Minimum memory peak of a statement across all DNs. The unit is MB.

≥ 0

Maximum memory peak of a statement across all DNs. The unit is MB.

≥ 0

Average memory usage during statement execution. The unit is MB.

≥ 0

Memory usage skew of a statement among DNs

≥ 0

Statement spill information on all DNs

String

Minimum spilled data among all DNs when a spill occurs. The unit is MB. The default value is 0.

≥ 0

Maximum spilled data among all DNs when a spill occurs. The unit is MB. The default value is 0.

≥ 0

Average spilled data among all DNs when a spill occurs. The unit is MB. The default value is 0.

≥ 0

DN spill skew when a spill occurs

≥ 0

Minimum execution time of a statement across all DNs. The unit is ms.

≥ 0

Maximum execution time of a statement across all DNs. The unit is ms.

≥ 0

Average execution time of a statement across all DNs. The unit is ms.

≥ 0

Execution time skew of a statement among DNs.

≥ 0

Minimum CPU time of a statement across all DNs. The unit is ms.

≥ 0

Maximum CPU time of a statement across all DNs. The unit is ms.

≥ 0

Total CPU time of a statement across all DNs. The unit is ms.

≥ 0

CPU time skew of a statement among DNs.

≥ 0

Minimum IOPS of a statement across all DNs. It is counted by ones in a column-store table and by ten thousands in a row-store table.

≥ 0

Maximum IOPS of a statement across all DNs. It is counted by ones in a column-store table and by ten thousands in a row-store table.

≥ 0

Average IOPS peak of a statement across all DNs. It is counted by ones in a column-store table and by ten thousands in a row-store table.

≥ 0

I/O skew of a statement among DNs

≥ 0

Warning. The following warnings and warnings related to SQL self-diagnosis tuning are displayed:

String

Query ID

String

Statement executed

String

Execution plan of a statement

String

Logical cluster of the user running the statement

String

Schema Usage Statistics

Database name

String

3600s

Schema name

String

Used capacity (unit: byte)

≥ 0

Total capacity (unit: byte)

≥ 0

Session Statistics

Database name

String

180s

Instance name

String

Thread ID (It can be used as a session ID or a connection ID.)

String

Database username

String

User application name

String

Client address

String

Host name of the client

String

TCP port number used by the client to communicate with the background. If the Unix socket is used, the value is -1.

String

Time when this process was started, that is, when the client connected to the server

≥ 0

Execution time so far.

≥ 0

Time when the current transaction was started (NULL if no transactions are active). If the current query is the first of its transaction, the value of this column is the same as that of the query_start column.

≥ 0

Time when the status was changed in the previous time

≥ 0

Whether the backend is currently waiting on a lock. If yes, the value is true.

Yes/No

Current overall state of this backend.

String

Resource pool used by the user

String

Actual execution duration of the statement, in seconds.

≥ 0

ID of a query

String

Time when the statement starts to be executed

≥ 0

cgroups currently used by the statement.

String

Lock type

String

Lock mode

String

Indicates whether to hold a lock when lock waiting exists. The value is true.

Yes/No

Resource that is waiting for the lock

String

Statement type

String

Query SQL.

String

Indicates whether the query is performed by the system.

Yes/No

Query plan

String

SQL Probe Statistics

Query ID of a probe task

String

30s

Cluster ID

String

Cluster project ID

String

Task type of the probe SQL

String

Time when a probe SQL task is created

≥ 0

SQL execution time

≥ 0

probe_ID of the associated probe SQL

String

Probe

String

Task status of the SQL probe. The options are as follows:

  • Running
  • Success
  • Fail

String

Table Dirty Page Rate Statistics

Database name

String

7200s

Schema name

String

Table name (full name)

String

Table owner

String

Table size (unit: byte)

≥ 0

Dirty page rate

≥ 0.0

Table Skew Monitoring Statistics

Database name

String

7200s

Schema name

String

Table name (full name)

String

Table owner

String

Table size (unit: byte)

≥ 0

Skew rate

≥ 0.0

Resource Pool Statistics

Load resource pool

String

120s

CPU quota of the resource pool

≥ 0

Memory quota of the resource pool

≥ 0

Disk quota of the resource pool

≥ 0

Maximum number of concurrent simple jobs allowed by the resource pool

≥ 0

Maximum number of concurrent queries allowed by the resource pool

≥ 0

CPU usage of the resource pool

≥ 0.0

Memory usage of the resource pool

≥ 0.0

Disk usage of the resource pool

≥ 0.0

Number of concurrent simple jobs in the resource pool

≥ 0

Current number of concurrent requests in the resource pool

≥ 0

Resource Pool User Statistics

Load resource pool

String

30s

CPU quota of a user.

≥ 0

Memory quota of a user

≥ 0

Specifies the disk quota of a user

≥ 0

CPU usage of a user

≥ 0.0

Memory usage of a user

≥ 0.0

Disk usage of a user

≥ 0.0

User ID

String

Username

String