What Metrics Are Supported by the Agent?

OS Metric: CPU

Metric	Name	Description	Value Range	Unit	Conversion Rule	Earliest Agent Version Required	Monitoring Period (Raw Data)
cpu_usage	(Agent) CPU Usage	Used to monitor CPU usage Collection method (Linux): Check the metric value changes in file /proc/stat in a collection period. You can run the top command to check the %Cpu(s) value. Collection method (Windows): Obtain the metric value using the API GetSystemTimes.	0-100	%	N/A	2.4.1	1 minute
cpu_usage_idle	(Agent) Idle CPU Usage	Percentage of the time that CPU is idle Unit: Percent Collection method (Linux): Check the metric value changes in file /proc/stat in a collection period. Collection method (Windows): Obtain the metric value using the API GetSystemTimes.	0-100	%	N/A	2.4.5	1 minute
cpu_usage_other	(Agent) Other Process CPU Usage	Other CPU usage of the monitored object Collection method (Linux and Windows): Other Process CPU Usage = 100% – Idle CPU Usage – Kernel Space CPU Usage – User Space CPU Usage	0-100	%	N/A	2.4.5	1 minute
cpu_usage_system	(Agent) Kernel Space CPU Usage	Percentage of time that the CPU is used by kernel space Collection method (Linux): Check the metric value changes in file /proc/stat in a collection period. You can run the top command to check the %Cpu(s) sy value. Collection method (Windows): Obtain the metric value using the API GetSystemTimes.	0-100	%	N/A	2.4.5	1 minute
cpu_usage_user	(Agent) User Space CPU Usage	Percentage of time that the CPU is used by user space Collection method (Linux): Check the metric value changes in file /proc/stat in a collection period. You can run the top command to check the %Cpu(s) us value. Collection method (Windows): Obtain the metric value using the API GetSystemTimes.	0-100	%	N/A	2.4.5	1 minute
cpu_usage_nice	(Agent) Nice Process CPU Usage	Percentage of the time that the CPU is in user mode with low-priority processes which can easily be interrupted by higher-priority processes Collection method (Linux): Check the metric value changes in file /proc/stat in a collection period. You can run the top command to check the %Cpu(s) ni value. Collection method (Windows): not supported	0-100	%	N/A	2.4.5	1 minute
cpu_usage_iowait	(Agent) iowait Process CPU Usage	Percentage of time that the CPU is waiting for I/O operations to complete Collection method (Linux): Check the metric value changes in file /proc/stat in a collection period. You can run the top command to check the %Cpu(s) wa value. Collection method (Windows): not supported	0-100	%	N/A	2.4.5	1 minute
cpu_usage_irq	(Agent) CPU Interrupt Time	Percentage of time that the CPU is servicing interrupts Collection method (Linux): Check the metric value changes in file /proc/stat in a collection period. You can run the top command to check the %Cpu(s) hi value. Collection method (Windows): not supported	0-100	%	N/A	2.4.5	1 minute
cpu_usage_softirq	(Agent) CPU Software Interrupt Time	Percentage of time that the CPU is servicing software interrupts Collection method (Linux): Check the metric value changes in file /proc/stat in a collection period. You can run the top command to check the %Cpu(s) si value. Collection method (Windows): not supported	0-100	%	N/A	2.4.5	1 minute
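
The Linux collection method above (comparing /proc/stat counters across a collection period) can be sketched as follows. This is a minimal illustration, not the Agent's actual implementation: the function names are hypothetical, only the first seven counters of the aggregate cpu line are used, and treating everything except idle time as "usage" is an assumption.

```python
def parse_cpu_line(stat_text):
    """Return the first seven counters of the aggregate 'cpu' line of /proc/stat."""
    names = ("user", "nice", "system", "idle", "iowait", "irq", "softirq")
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "cpu":
            return dict(zip(names, (int(v) for v in fields[1:8])))
    raise ValueError("no aggregate cpu line found")


def cpu_percentages(before, after):
    """Turn two counter snapshots into percentages of the elapsed CPU time."""
    deltas = {name: after[name] - before[name] for name in before}
    total = sum(deltas.values())
    if total <= 0:
        return {name: 0.0 for name in deltas}
    pct = {name: 100.0 * value / total for name, value in deltas.items()}
    # Assumption: overall usage is everything that is not idle time.
    pct["usage"] = 100.0 - pct["idle"]
    return pct
```

On a Linux host, the two snapshots would be the contents of /proc/stat read at the start and end of the collection period.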

OS Metric: CPU Load

Metric	Name	Description	Value Range	Unit	Conversion Rule	Earliest Agent Version Required	Monitoring Period (Raw Data)
load_average1	(Agent) 1-Minute Load Average	CPU load averaged over the last 1 minute Collection method (Linux): The metric value is load1 in /proc/loadavg divided by the number of logical CPUs. You can run the top command to check the value of load1. Collection method (Windows): not supported	≥ 0	None	N/A	2.4.1	1 minute
load_average5	(Agent) 5-Minute Load Average	CPU load averaged over the last 5 minutes Collection method (Linux): The metric value is load5 in /proc/loadavg divided by the number of logical CPUs. You can run the top command to check the value of load5. Collection method (Windows): not supported	≥ 0	None	N/A	2.4.1	1 minute
load_average15	(Agent) 15-Minute Load Average	CPU load averaged over the last 15 minutes Collection method (Linux): The metric value is load15 in /proc/loadavg divided by the number of logical CPUs. You can run the top command to check the value of load15. Collection method (Windows): not supported	≥ 0	None	N/A	2.4.1	1 minute
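
Because the load values are divided by the number of logical CPUs, the reported figures are usually smaller than those shown by top. A minimal sketch of that normalization, assuming the first three fields of /proc/loadavg are the 1-, 5-, and 15-minute averages (the function name and the logical_cpus parameter are hypothetical):

```python
import os


def normalized_load(loadavg_text, logical_cpus=None):
    """Divide the three load averages by the number of logical CPUs."""
    cpus = logical_cpus or os.cpu_count() or 1
    load1, load5, load15 = (float(v) for v in loadavg_text.split()[:3])
    return {
        "load_average1": load1 / cpus,
        "load_average5": load5 / cpus,
        "load_average15": load15 / cpus,
    }
```

For example, a raw 1-minute load of 2.00 on a 4-vCPU server would be reported as 0.5.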

OS Metric: Memory

Metric	Name	Description	Value Range	Unit	Conversion Rule	Earliest Agent Version Required	Monitoring Period (Raw Data)
mem_available	(Agent) Available Memory	Amount of memory that is available and can be given instantly to processes Collection method (Linux): Obtain the metric value from /proc/meminfo. If MemAvailable is displayed in /proc/meminfo, use that value. If MemAvailable is not displayed in /proc/meminfo, MemAvailable = MemFree + Buffers + Cached. Collection method (Windows): formula (Available memory – Used memory) The value is obtained by calling the Windows API GlobalMemoryStatusEx.	≥ 0	GB	N/A	2.4.5	1 minute
mem_usedPercent	(Agent) Memory Usage	Memory usage of the instance Collection method (Linux): Obtain the metric value from the /proc/meminfo file. If MemAvailable is displayed in /proc/meminfo, MemUsedPercent = (MemTotal – MemAvailable)/MemTotal. If MemAvailable is not displayed in /proc/meminfo, MemUsedPercent = (MemTotal – MemFree – Buffers – Cached)/MemTotal. Collection method (Windows): Used memory size/Total memory size x 100%	0-100	%	N/A	2.4.1	1 minute
mem_free	(Agent) Idle Memory	Amount of memory that is not being used Collection method (Linux): Obtain the metric value from /proc/meminfo. Collection method (Windows): not supported	≥ 0	GB	N/A	2.4.5	1 minute
mem_buffers	(Agent) Buffer	Amount of memory that is being used for buffers Collection method (Linux): Obtain the metric value from /proc/meminfo. You can run the top command to check the KiB Mem:buffers value. Collection method (Windows): not supported	≥ 0	GB	N/A	2.4.5	1 minute
mem_cached	(Agent) Cache	Amount of memory that is being used for file caches Collection method (Linux): Obtain the metric value from /proc/meminfo. You can run the top command to check the KiB Swap:cached Mem value. Collection method (Windows): not supported	≥ 0	GB	N/A	2.4.5	1 minute
total_open_files	(Agent) Total File Handles	Total handles used by all processes Collection method (Linux): Use the /proc/{pid}/fd file to summarize the handles used by all processes. Collection method (Windows): not supported	≥ 0	Count	N/A	2.4.5	1 minute
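
The Linux rules for mem_available and mem_usedPercent, including the fallback when MemAvailable is missing, can be sketched as follows. This is an illustrative sketch with hypothetical function names. It works on the text of /proc/meminfo, whose values are in kB.

```python
def parse_meminfo(text):
    """Map each /proc/meminfo key to its integer value (in kB)."""
    values = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            values[key.strip()] = int(parts[0])
    return values


def mem_available_kb(info):
    """Use MemAvailable if present, else MemFree + Buffers + Cached."""
    if "MemAvailable" in info:
        return info["MemAvailable"]
    return info["MemFree"] + info["Buffers"] + info["Cached"]


def mem_used_percent(info):
    """(MemTotal - MemAvailable) / MemTotal, expressed as a percentage."""
    total = info["MemTotal"]
    return 100.0 * (total - mem_available_kb(info)) / total
```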

OS Metric: Disk

Currently, CES Agent can collect only physical disk metrics and does not support disks mounted using the network file system protocol.

By default, CES Agent does not monitor Docker-related mount points, that is, mount points whose paths start with any of the following prefixes (separated by semicolons):

/var/lib/docker;/mnt/paas/kubernetes;/var/lib/mesos
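
A minimal sketch of how such a prefix exclusion could be applied to a list of mount points. The function name is hypothetical, and a plain string-prefix match is an assumption. The Agent's exact matching rule is not documented here.

```python
# Prefix list copied from the line above, split on the semicolons.
EXCLUDED_PREFIXES = "/var/lib/docker;/mnt/paas/kubernetes;/var/lib/mesos".split(";")


def is_monitored(mount_point):
    """Return False for mount points under any excluded prefix (assumed rule)."""
    return not any(mount_point.startswith(prefix) for prefix in EXCLUDED_PREFIXES)
```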

Metric	Name	Description	Value Range	Unit	Conversion Rule	Earliest Agent Version Required	Monitoring Period (Raw Data)
disk_free	(Agent) Available Disk Space	Free space on the disks Collection method (Linux): Run the df -h command to check the value in the Avail column. The path of the mount point prefix cannot exceed 64 characters. It must start with a letter, and contain only digits, letters, hyphens (-), periods (.), and swung dashes (~). Collection method (Windows): Use the Windows Management Instrumentation (WMI) API GetDiskFreeSpaceExW to obtain disk space data. The path of the mount point prefix cannot exceed 64 characters. It must start with a letter, and contain only digits, letters, hyphens (-), periods (.), and swung dashes (~).	≥ 0	GB	N/A	2.4.1	1 minute
disk_total	(Agent) Disk Storage Capacity	Total disk capacity Collection method (Linux): Run the df -h command to check the value in the Size column. The path of the mount point prefix cannot exceed 64 characters. It must start with a letter, and contain only digits, letters, hyphens (-), periods (.), and swung dashes (~). Collection method (Windows): Use the WMI API GetDiskFreeSpaceExW to obtain disk space data. The path of the mount point prefix cannot exceed 64 characters. It must start with a letter, and contain only digits, letters, hyphens (-), periods (.), and swung dashes (~).	≥ 0	GB	N/A	2.4.5	1 minute
disk_used	(Agent) Used Disk Space	Disk's used space Collection method (Linux): Run the df -h command to check the value in the Used column. The path of the mount point prefix cannot exceed 64 characters. It must start with a letter, and contain only digits, letters, hyphens (-), periods (.), and swung dashes (~). Collection method (Windows): Use the WMI API GetDiskFreeSpaceExW to obtain disk space data. The path of the mount point prefix cannot exceed 64 characters. It must start with a letter, and contain only digits, letters, hyphens (-), periods (.), and swung dashes (~).	≥ 0	GB	N/A	2.4.5	1 minute
disk_usedPercent	(Agent) Disk Usage	Percentage of used disk space. It is calculated as follows: Disk Usage = Used Disk Space/Disk Storage Capacity. Collection method (Linux): It is calculated as follows: Used/Size. The path of the mount point prefix cannot exceed 64 characters. It must start with a letter, and contain only digits, letters, hyphens (-), periods (.), and swung dashes (~). Collection method (Windows): Use the WMI API GetDiskFreeSpaceExW to obtain disk space data. The path of the mount point prefix cannot exceed 64 characters. It must start with a letter, and contain only digits, letters, hyphens (-), periods (.), and swung dashes (~).	0-100	%	N/A	2.4.1	1 minute
disk_rwstate	(Agent) Disk Read/Write Status	Read and write status of the disk attached to the monitored object. The status can be 0 (read and write) or 1 (read-only). Collection method (Linux): Obtain the disk attachment status from the /proc/1/mountinfo file. Collection method (Windows): not supported	0: read and write 1: read-only	None	N/A	2.5.6	1 minute
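
To cross-check the disk space metrics for a single mount point, the following sketch uses Python's standard shutil.disk_usage call rather than parsing df output. It is not the Agent's collection code. The function name is hypothetical, and the conversion to GB with a factor of 1024^3 is an assumption.

```python
import shutil

GIB = 1024 ** 3  # assumed conversion factor for the GB unit


def disk_space_metrics(path="/"):
    """Return total, used, and free space in GB plus the usage percentage."""
    usage = shutil.disk_usage(path)
    used_percent = 100.0 * usage.used / usage.total if usage.total else 0.0
    return {
        "disk_total": usage.total / GIB,
        "disk_used": usage.used / GIB,
        "disk_free": usage.free / GIB,
        "disk_usedPercent": used_percent,  # Used / Size, as in the table
    }
```

Small differences from df are possible because df rounds its output and may account for reserved blocks differently.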

OS Metric: Disk I/O

Metric	Name	Description	Value Range	Unit	Conversion Rule	Earliest Agent Version Required	Monitoring Period (Raw Data)
disk_agt_read_bytes_rate	(Agent) Disks Read Rate	Volume of data read from the instance per second Collection method (Linux): Calculate the data changes in the sixth column of the corresponding device in file /proc/diskstats in a collection period. The path of the mount point prefix cannot exceed 64 characters. It must start with a letter, and contain only digits, letters, hyphens (-), periods (.), and swung dashes (~). Collection method (Windows): Use the Win32_PerfFormattedData_PerfDisk_LogicalDisk object in WMI to obtain disk I/O data. The path of the mount point prefix cannot exceed 64 characters. It must start with a letter, and contain only digits, letters, hyphens (-), periods (.), and swung dashes (~). When CPU usage is high, data collection may time out and no monitoring data is obtained.	≥ 0	byte/s	1024(IEC)	2.4.5	1 minute
disk_agt_read_requests_rate	(Agent) Disks Read Requests	Number of read requests sent to the monitored disk per second Collection method (Linux): Calculate the data changes in the fourth column of the corresponding device in file /proc/diskstats in a collection period. The path of the mount point prefix cannot exceed 64 characters. It must start with a letter, and contain only digits, letters, hyphens (-), periods (.), and swung dashes (~). Collection method (Windows): Use the Win32_PerfFormattedData_PerfDisk_LogicalDisk object in WMI to obtain disk I/O data. The path of the mount point prefix cannot exceed 64 characters. It must start with a letter, and contain only digits, letters, hyphens (-), periods (.), and swung dashes (~). When CPU usage is high, data collection may time out and no monitoring data is obtained.	≥ 0	Request/s	N/A	2.4.5	1 minute
disk_agt_write_bytes_rate	(Agent) Disks Write Rate	Volume of data written to the instance per second Collection method (Linux): Calculate the data changes in the tenth column of the corresponding device in file /proc/diskstats in a collection period. The path of the mount point prefix cannot exceed 64 characters. It must start with a letter, and contain only digits, letters, hyphens (-), periods (.), and swung dashes (~). Collection method (Windows): Use the Win32_PerfFormattedData_PerfDisk_LogicalDisk object in WMI to obtain disk I/O data. The path of the mount point prefix cannot exceed 64 characters. It must start with a letter, and contain only digits, letters, hyphens (-), periods (.), and swung dashes (~). When CPU usage is high, data collection may time out and no monitoring data is obtained.	≥ 0	byte/s	1024(IEC)	2.4.5	1 minute
disk_agt_write_requests_rate	(Agent) Disks Write Requests	Number of write requests sent to the monitored disk per second Collection method (Linux): Calculate the data changes in the eighth column of the corresponding device in file /proc/diskstats in a collection period. The path of the mount point prefix cannot exceed 64 characters. It must start with a letter, and contain only digits, letters, hyphens (-), periods (.), and swung dashes (~). Collection method (Windows): Use the Win32_PerfFormattedData_PerfDisk_LogicalDisk object in WMI to obtain disk I/O data. The path of the mount point prefix cannot exceed 64 characters. It must start with a letter, and contain only digits, letters, hyphens (-), periods (.), and swung dashes (~). When CPU usage is high, data collection may time out and no monitoring data is obtained.	≥ 0	Request/s	N/A	2.4.5	1 minute
disk_readTime	(Agent) Average Read Request Time	The average time taken for disk read operations Collection method (Linux): Calculate the data changes in the seventh column of the corresponding device in file /proc/diskstats in a collection period. The path of the mount point prefix cannot exceed 64 characters. It must start with a letter, and contain only digits, letters, hyphens (-), periods (.), and swung dashes (~). Collection method (Windows): not supported	≥ 0	ms/count	N/A	2.4.5	1 minute
disk_writeTime	(Agent) Average Write Request Time	The average time taken for disk write operations Collection method (Linux): Calculate the data changes in the eleventh column of the corresponding device in file /proc/diskstats in a collection period. The path of the mount point prefix cannot exceed 64 characters. It must start with a letter, and contain only digits, letters, hyphens (-), periods (.), and swung dashes (~). Collection method (Windows): not supported	≥ 0	ms/count	N/A	2.4.5	1 minute
disk_ioUtils	(Agent) Disk I/O Usage	Percentage of the time that the disk has had I/O requests queued to the total disk operation time Collection method (Linux): Calculate the data changes in the thirteenth column of the corresponding device in file /proc/diskstats in a collection period. The path of the mount point prefix cannot exceed 64 characters. It must start with a letter, and contain only digits, letters, hyphens (-), periods (.), and swung dashes (~). Collection method (Windows): not supported	0-100	%	N/A	2.4.1	1 minute
disk_queue_length	(Agent) Disk Queue Length	Average number of read or write requests queued up for completion for the monitored disk in the monitoring period Collection method (Linux): Calculate the data changes in the fourteenth column of the corresponding device in file /proc/diskstats in a collection period. The path of the mount point prefix cannot exceed 64 characters. It must start with a letter, and contain only digits, letters, hyphens (-), periods (.), and swung dashes (~). Collection method (Windows): not supported	≥ 0	count	N/A	2.4.5	1 minute
disk_write_bytes_per_operation	Disk Bytes Per Write Operation	Average number of bytes in an I/O write for the monitored disk in the monitoring period Collection method (Linux): Divide the data changes in the tenth column of the corresponding device by the data changes in the eighth column in file /proc/diskstats in a collection period. The path of the mount point prefix cannot exceed 64 characters. It must start with a letter, and contain only digits, letters, hyphens (-), periods (.), and swung dashes (~). Collection method (Windows): not supported	≥ 0	Byte/op	N/A	2.4.5	1 minute
disk_read_bytes_per_operation	Disk Bytes Per Read Operation	Average number of bytes in an I/O read for the monitored disk in the monitoring period Collection method (Linux): Divide the data changes in the sixth column of the corresponding device by the data changes in the fourth column in file /proc/diskstats in a collection period. The path of the mount point prefix cannot exceed 64 characters. It must start with a letter, and contain only digits, letters, hyphens (-), periods (.), and swung dashes (~). Collection method (Windows): not supported	≥ 0	Byte/op	N/A	2.4.5	1 minute
disk_io_svctm	(Agent) Disk I/O Service Time	Average time in an I/O read or write for the monitored disk in the monitoring period Collection method (Linux): Divide the data changes in the thirteenth column of the corresponding device by the sum of the data changes in the fourth and eighth columns in file /proc/diskstats in a collection period. The path of the mount point prefix cannot exceed 64 characters. It must start with a letter, and contain only digits, letters, hyphens (-), periods (.), and swung dashes (~). Collection method (Windows): not supported	≥ 0	ms/op	N/A	2.4.5	1 minute
disk_device_used_percent	Block Device Usage	Percentage of total disk space that is used. The calculation formula is as follows: Used storage space of all mounted disk partitions/Total disk storage space. Collection method (Linux): Summarize the disk usage of each mount point, calculate the total disk size based on the disk sector size and number of sectors, and calculate the overall disk usage. Collection method (Windows): not supported	0-100	%	N/A	2.5.6	1 minute
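
The /proc/diskstats column arithmetic described in this table can be sketched as follows. Column numbers are 1-based and count the major number, minor number, and device name as columns 1 to 3, matching the table. The sketch assumes that columns 6 and 10 count 512-byte sectors and that columns 7, 11, and 13 are in milliseconds, which is how the Linux kernel documents them. The function names are hypothetical, and how the Agent handles rounding or counter wrap-around is not covered.

```python
SECTOR_BYTES = 512  # /proc/diskstats reports sectors in 512-byte units


def parse_diskstats(text, device):
    """Return {column_number: value} for columns 4-14 of one device."""
    for line in text.splitlines():
        cols = line.split()
        if len(cols) >= 14 and cols[2] == device:
            return {n: int(cols[n - 1]) for n in range(4, 15)}
    raise KeyError(device)


def disk_io_metrics(before, after, interval_s):
    """Derive per-period metrics from two snapshots taken interval_s apart."""
    d = {n: after[n] - before[n] for n in before}
    reads, writes = d[4], d[8]
    return {
        "disk_agt_read_bytes_rate": d[6] * SECTOR_BYTES / interval_s,
        "disk_agt_read_requests_rate": reads / interval_s,
        "disk_agt_write_bytes_rate": d[10] * SECTOR_BYTES / interval_s,
        "disk_agt_write_requests_rate": writes / interval_s,
        "disk_readTime": d[7] / reads if reads else 0.0,
        "disk_writeTime": d[11] / writes if writes else 0.0,
        "disk_ioUtils": 100.0 * d[13] / (interval_s * 1000),
        "disk_io_svctm": d[13] / (reads + writes) if reads + writes else 0.0,
    }
```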

OS Metric: File System

Metric	Name	Description	Value Range	Unit	Conversion Rule	Earliest Agent Version Required	Monitoring Period (Raw Data)
disk_fs_rwstate	(Agent) File System Read/Write Status	Read and write status of the mounted file system of the monitored object The status can be 0 (read and write) or 1 (read-only). Collection method (Linux): Check file system information in the fourth column in file /proc/mounts. Collection method (Windows): not supported	0: read and write 1: read-only	None	N/A	2.4.5	1 minute
disk_inodesTotal	(Agent) Disk inode Total	Total number of index nodes on the disk Collection method (Linux): Run the df -i command to check the value in the Inodes column. The path of the mount point prefix cannot exceed 64 characters. It must start with a letter, and contain only digits, letters, hyphens (-), periods (.), and swung dashes (~). Collection method (Windows): not supported	≥ 0	None	N/A	2.4.5	1 minute
disk_inodesUsed	(Agent) Total inode Used	Number of used index nodes on the disk Collection method (Linux): Run the df -i command to check the value in the IUsed column. The path of the mount point prefix cannot exceed 64 characters. It must start with a letter, and contain only digits, letters, hyphens (-), periods (.), and swung dashes (~). Collection method (Windows): not supported	≥ 0	None	N/A	2.4.5	1 minute
disk_inodesUsedPercent	(Agent) Percentage of Total inode Used	Percentage of used index nodes on the disk Collection method (Linux): Run the df -i command to check the value in the IUse% column. The path of the mount point prefix cannot exceed 64 characters. It must start with a letter, and contain only digits, letters, hyphens (-), periods (.), and swung dashes (~). Collection method (Windows): not supported	0-100	%	N/A	2.4.1	1 minute
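
For disk_fs_rwstate, the fourth column of /proc/mounts holds the comma-separated mount options, where ro marks a read-only file system. A minimal sketch of that check (the function name is hypothetical, and treating every mount without ro as read/write is an assumption):

```python
def fs_rwstate(mounts_text):
    """Map each mount point to 0 (read and write) or 1 (read-only)."""
    states = {}
    for line in mounts_text.splitlines():
        cols = line.split()
        if len(cols) < 4:
            continue
        options = cols[3].split(",")
        states[cols[1]] = 1 if "ro" in options else 0
    return states
```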

OS Metric: NTP

Metric	Name	Description	Value Range	Unit	Conversion Rule	Earliest Agent Version Required	Monitoring Period (Raw Data)
ntp_offset	(Agent) NTP Offset	NTP offset of the monitored object Collection method (Linux): Run the ntpq -p or chronyc sources -v command. Collection method (Windows): not supported	≥ 0	ms	N/A	2.7.1	1 minute
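
As an illustration of reading the offset from ntpq -p output: the peer currently selected for synchronization is marked with a leading asterisk, and the ninth column is its offset in milliseconds. The sketch below parses captured command output. The function name is hypothetical, and returning the absolute value (to match the ≥ 0 value range) is an assumption about how the Agent reports it.

```python
def ntp_offset_ms(ntpq_output):
    """Return the absolute offset (ms) of the selected peer, or None."""
    for line in ntpq_output.splitlines():
        if line.startswith("*"):  # '*' marks the current system peer
            fields = line.split()
            if len(fields) >= 10:
                return abs(float(fields[8]))
    return None
```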

OS Metric: TCP Connections

By default, two basic metrics related to TCP connections are collected: (Agent) TCP TOTAL and (Agent) TCP ESTABLISHED.

Metric	Name	Description	Value Range	Unit	Conversion Rule	Earliest Agent Version Required	Monitoring Period (Raw Data)
net_tcp_total	(Agent) Total Number of TCP Connections	Total number of TCP connections Collection method (Linux): Run the ss -s command to view the total number of TCP connections. NOTE: The number of TCP connections refers to the number of all active TCP connections on an ECS. Collection method (Windows): Obtain the metric value using the GetTcpTable2 API.	≥ 0	count	N/A	2.4.1	1 minute
net_tcp_established	(Agent) Number of connections in the ESTABLISHED state	Number of TCP connections in the ESTABLISHED state Collection method (Linux): Obtain TCP connections in all states from the /proc/net/tcp file, and then collect the number of connections in each state. Collection method (Windows): Obtain the metric value using the GetTcpTable2 API.	≥ 0	count	N/A	2.4.1	1 minute
net_tcp_sys_sent	(Agent) Number of connections in the TCP SYS_SENT state.	Number of TCP connections that are being requested by the client (TCP state SYN_SENT) Collection method (Linux): Obtain TCP connections in all states from the /proc/net/tcp file, and then collect the number of connections in each state. Collection method (Windows): Obtain the metric value using the GetTcpTable2 API.	≥ 0	count	N/A	2.4.5	1 minute
net_tcp_sys_recv	(Agent) Number of connections in the TCP SYS_RECV state.	Number of pending TCP connections received by the server (TCP state SYN_RECV) Collection method (Linux): Obtain TCP connections in all states from the /proc/net/tcp file, and then collect the number of connections in each state. Collection method (Windows): Obtain the metric value using the GetTcpTable2 API.	≥ 0	count	N/A	2.4.5	1 minute
net_tcp_fin_wait1	(Agent) Number of TCP connections in the FIN_WAIT1 state.	Number of TCP connections waiting for ACK packets when the connections are being actively closed by the client Collection method (Linux): Obtain TCP connections in all states from the /proc/net/tcp file, and then collect the number of connections in each state. Collection method (Windows): Obtain the metric value using the GetTcpTable2 API.	≥ 0	count	N/A	2.4.5	1 minute
net_tcp_fin_wait2	(Agent) Number of TCP connections in the FIN_WAIT2 state.	Number of TCP connections in the FIN_WAIT2 state Collection method (Linux): Obtain TCP connections in all states from the /proc/net/tcp file, and then collect the number of connections in each state. Collection method (Windows): Obtain the metric value using the GetTcpTable2 API.	≥ 0	count	N/A	2.4.5	1 minute
net_tcp_time_wait	(Agent) TCP TIME_WAIT Connections	Number of TCP connections in the TIME_WAIT state Collection method (Linux): Obtain TCP connections in all states from the /proc/net/tcp file, and then collect the number of connections in each state. Collection method (Windows): Obtain the metric value using the GetTcpTable2 API.	≥ 0	count	N/A	2.4.5	1 minute
net_tcp_close	(Agent) Number of TCP connections in the CLOSE state.	Number of closed TCP connections Collection method (Linux): Obtain TCP connections in all states from the /proc/net/tcp file, and then collect the number of connections in each state. Collection method (Windows): Obtain the metric value using the GetTcpTable2 API.	≥ 0	count	N/A	2.4.5	1 minute
net_tcp_close_wait	(Agent) TCP CLOSE_WAIT Connections	Number of TCP connections in the CLOSE_WAIT state Collection method (Linux): Obtain TCP connections in all states from the /proc/net/tcp file, and then collect the number of connections in each state. Collection method (Windows): Obtain the metric value using the GetTcpTable2 API.	≥ 0	count	N/A	2.4.5	1 minute
net_tcp_last_ack	(Agent) Number of TCP connections in the LAST_ACK state.	Number of TCP connections waiting for ACK packets when the connections are being passively closed by the client Collection method (Linux): Obtain TCP connections in all states from the /proc/net/tcp file, and then collect the number of connections in each state. Collection method (Windows): Obtain the metric value using the GetTcpTable2 API.	≥ 0	count	N/A	2.4.5	1 minute
net_tcp_listen	(Agent) Number of TCP connections in the LISTEN state.	Number of TCP connections in the LISTEN state Collection method (Linux): Obtain TCP connections in all states from the /proc/net/tcp file, and then collect the number of connections in each state. Collection method (Windows): Obtain the metric value using the GetTcpTable2 API.	≥ 0	count	N/A	2.4.5	1 minute
net_tcp_closing	(Agent) Number of TCP connections in the CLOSING state.	Number of TCP connections to be automatically closed by the server and the client at the same time Collection method (Linux): Obtain TCP connections in all states from the /proc/net/tcp file, and then collect the number of connections in each state. Collection method (Windows): Obtain the metric value using the GetTcpTable2 API.	≥ 0	count	N/A	2.4.5	1 minute
net_tcp_retrans	(Agent) TCP Retransmission Rate	Percentage of packets that are resent Collection method (Linux): Obtain the metric value from the /proc/net/snmp file. The value is the ratio of the number of retransmitted packets to the number of sent packets in a collection period. Collection method (Windows): Obtain the metric value using the GetTcpStatistics API.	0-100	%	N/A	2.4.5	1 minute
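
Counting connections per state from /proc/net/tcp can be sketched as follows. The fourth column of each row is the state as a two-digit hexadecimal code, and the mapping below follows the Linux kernel's TCP state numbering. The function names are hypothetical. Note that /proc/net/tcp lists IPv4 sockets only. Whether the Agent also reads /proc/net/tcp6 is not stated in this document.

```python
from collections import Counter

# Linux kernel TCP state codes as they appear in /proc/net/tcp.
TCP_STATES = {
    "01": "ESTABLISHED", "02": "SYN_SENT", "03": "SYN_RECV",
    "04": "FIN_WAIT1", "05": "FIN_WAIT2", "06": "TIME_WAIT",
    "07": "CLOSE", "08": "CLOSE_WAIT", "09": "LAST_ACK",
    "0A": "LISTEN", "0B": "CLOSING",
}


def count_tcp_states(proc_net_tcp_text):
    """Count connections per state, skipping the header row."""
    counts = Counter()
    for line in proc_net_tcp_text.splitlines()[1:]:
        cols = line.split()
        if len(cols) > 3:
            counts[TCP_STATES.get(cols[3].upper(), "UNKNOWN")] += 1
    return counts


def retransmission_percent(sent_delta, retrans_delta):
    """Retransmitted segments as a percentage of sent segments in a period."""
    if sent_delta <= 0:
        return 0.0
    return 100.0 * retrans_delta / sent_delta
```

The two arguments of retransmission_percent would be the changes in the sent and retransmitted segment counters of /proc/net/snmp over one collection period.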

OS Metric: NIC

Metric	Name	Description	Value Range	Unit	Conversion Rule	Earliest Agent Version Required	Monitoring Period (Raw Data)
net_bitRecv	(Agent) Outbound Bandwidth	Number of bits sent by this NIC per second Collection method (Linux): Check metric value changes in file /proc/net/dev in a collection period. Collection method (Windows): Use the MibIfRow object in WMI to obtain network metric data.	≥ 0	bit/s	1024(IEC)	2.4.1	1 minute
net_bitSent	(Agent) Inbound Bandwidth	Number of bits received by this NIC per second Collection method (Linux): Check metric value changes in file /proc/net/dev in a collection period. Collection method (Windows): Use the MibIfRow object in WMI to obtain network metric data.	≥ 0	bit/s	1024(IEC)	2.4.1	1 minute
net_packetRecv	(Agent) NIC Packet Receive Rate	Number of packets received by this NIC per second Collection method (Linux): Check metric value changes in file /proc/net/dev in a collection period. Collection method (Windows): Use the MibIfRow object in WMI to obtain network metric data.	≥ 0	Count/s	N/A	2.4.1	1 minute
net_packetSent	(Agent) NIC Packet Send Rate	Number of packets sent by this NIC per second Collection method (Linux): Check metric value changes in file /proc/net/dev in a collection period. Collection method (Windows): Use the MibIfRow object in WMI to obtain network metric data.	≥ 0	Count/s	N/A	2.4.1	1 minute
net_errin	(Agent) Receive Error Rate	Percentage of receive errors detected by this NIC per second Collection method (Linux): Check metric value changes in file /proc/net/dev in a collection period. Collection method (Windows): not supported	0-100	%	N/A	2.4.5	1 minute
net_errout	(Agent) Transmit Error Rate	Percentage of transmit errors detected by this NIC per second Collection method (Linux): Check metric value changes in file /proc/net/dev in a collection period. Collection method (Windows): not supported	0-100	%	N/A	2.4.5	1 minute
net_dropin	(Agent) Received Packet Drop Rate	Percentage of packets received by this NIC which were dropped per second Collection method (Linux): Check metric value changes in file /proc/net/dev in a collection period. Collection method (Windows): not supported	0-100	%	N/A	2.4.5	1 minute
net_dropout	(Agent) Transmitted Packet Drop Rate	Percentage of packets transmitted by this NIC which were dropped per second Collection method (Linux): Check metric value changes in file /proc/net/dev in a collection period. Collection method (Windows): not supported	0-100	%	N/A	2.4.5	1 minute
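
The NIC metrics are derived from counter changes in /proc/net/dev. In that file each interface row lists eight receive counters (bytes, packets, errs, drop, ...) followed by eight transmit counters in the same order. The sketch below uses neutral rx/tx names rather than the metric IDs. The function names are hypothetical, and expressing error and drop rates as a percentage of packets in the period is an assumption.

```python
def parse_net_dev(text):
    """Return the byte, packet, error, and drop counters per interface."""
    stats = {}
    for line in text.splitlines():
        if ":" not in line:
            continue  # the two header rows contain no colon
        name, _, rest = line.partition(":")
        v = [int(x) for x in rest.split()]
        stats[name.strip()] = {
            "rx_bytes": v[0], "rx_packets": v[1], "rx_errs": v[2], "rx_drop": v[3],
            "tx_bytes": v[8], "tx_packets": v[9], "tx_errs": v[10], "tx_drop": v[11],
        }
    return stats


def nic_rates(before, after, interval_s):
    """Compute bit rates, packet rates, and error percentages for one NIC."""
    d = {key: after[key] - before[key] for key in before}

    def percent(part, whole):
        return 100.0 * part / whole if whole else 0.0

    return {
        "rx_bit_rate": d["rx_bytes"] * 8 / interval_s,
        "tx_bit_rate": d["tx_bytes"] * 8 / interval_s,
        "rx_packet_rate": d["rx_packets"] / interval_s,
        "tx_packet_rate": d["tx_packets"] / interval_s,
        "rx_error_percent": percent(d["rx_errs"], d["rx_packets"]),
        "tx_error_percent": percent(d["tx_errs"], d["tx_packets"]),
    }
```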

Process Monitoring Metrics

Metric	Name	Description	Value Range	Unit	Conversion Rule	Earliest Agent Version Required	Monitoring Period (Raw Data)
proc_pHashId_cpu	(Agent) CPU Usage	CPU consumed by a process. pHashId is the MD5 value of the process name and process ID. Collection method (Linux): Check the metric value changes in file /proc/pid/stat. Collection method (Windows): Call the Windows API GetProcessTimes to obtain the CPU usage of the process.	0–1 x Number of vCPUs	%	N/A	2.4.1	1 minute
proc_pHashId_mem	(Agent) Memory Usage	Memory consumed by a process. pHashId is the MD5 value of the process name and process ID. Collection method (Linux): RSS x PAGESIZE/MemTotal. Obtain the RSS value by checking the second column of file /proc/pid/statm. Obtain the PAGESIZE value by running the getconf PAGESIZE command. Obtain the MemTotal value by checking file /proc/meminfo. Collection method (Windows): Call the Windows API GlobalMemoryStatusEx to obtain the total memory size. Call GetProcessMemoryInfo to obtain the used memory size. Divide the used memory size by the total memory size to get the memory usage.	0-100	%	N/A	2.4.1	1 minute
proc_pHashId_file	(Agent) Number of opened files	Number of files opened by a process. pHashId is the MD5 value of the process name and process ID. Collection method (Linux): Run the ls -l /proc/pid/fd command to view the number of opened files. Collection method (Windows): not supported	≥ 0	Count	N/A	2.4.1	1 minute
proc_running_count	(Agent) Running processes	Number of processes that are running Collection method (Linux): Obtain the state of each process by checking the State value in the /proc/pid/status file, and then count the processes in each state. Collection method (Windows): not supported	≥ 0	None	N/A	2.4.1	1 minute
proc_idle_count	(Agent) Idle Processes	Number of processes that are idle Collection method (Linux): Obtain the state of each process by checking the State value in the /proc/pid/status file, and then count the processes in each state. Collection method (Windows): not supported	≥ 0	None	N/A	2.4.1	1 minute
proc_zombie_count	(Agent) Zombie Processes	Number of zombie processes Collection method (Linux): Obtain the state of each process by checking the State value in the /proc/pid/status file, and then count the processes in each state. Collection method (Windows): not supported	≥ 0	None	N/A	2.4.1	1 minute
proc_blocked_count	(Agent) Blocked Processes	Number of processes that are blocked Collection method (Linux): Obtain the state of each process by checking the State value in the /proc/pid/status file, and then count the processes in each state. Collection method (Windows): not supported	≥ 0	None	N/A	2.4.1	1 minute
proc_sleeping_count	(Agent) Sleeping Processes	Number of processes that are sleeping Collection method (Linux): Obtain the state of each process by checking the State value in the /proc/pid/status file, and then count the processes in each state. Collection method (Windows): not supported	≥ 0	None	N/A	2.4.1	1 minute
proc_total_count	(Agent) Total Processes	Total number of processes on the monitored object Collection method (Linux): Obtain the state of each process by checking the State value in the /proc/pid/status file, and then count the processes in each state. Collection method (Windows): Obtain the total number of processes by using the system process status support module psapi.dll.	≥ 0	None	N/A	2.4.1	1 minute
proc_specified_count	(Agent) Specified Processes	Number of specified processes Collection method (Linux): Obtain the state of each process by checking the State value in the /proc/pid/status file, and then count the processes in each state. Collection method (Windows): Obtain the total number of processes by using the system process status support module psapi.dll.	≥ 0	None	N/A	2.4.1	1 minute
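
The Linux collection methods in the table above can be illustrated with a minimal Python sketch. It is not the Agent's implementation. The function names and the mapping from state letters to metric names are assumptions made for illustration; the table only states that memory usage is RSS multiplied by PAGESIZE and divided by MemTotal, and that process states are read from /proc/pid/status. The sketch works on values and file contents passed in as arguments, so it can be tried without access to /proc.

```python
import re
from collections import Counter

# Assumed mapping from the kernel's single-letter state codes to the
# metric names in the table; the Agent's real mapping is not documented here.
STATE_TO_METRIC = {
    "R": "proc_running_count",
    "S": "proc_sleeping_count",
    "D": "proc_blocked_count",
    "Z": "proc_zombie_count",
    "I": "proc_idle_count",
}


def memory_usage_percent(rss_pages: int, page_size: int, mem_total_kb: int) -> float:
    """Return RSS * PAGESIZE / MemTotal as a percentage.

    rss_pages:    second column of /proc/<pid>/statm (resident pages)
    page_size:    output of `getconf PAGESIZE`, in bytes
    mem_total_kb: MemTotal from /proc/meminfo, which is reported in kB
    """
    return rss_pages * page_size / (mem_total_kb * 1024) * 100


def parse_state(status_text: str) -> str:
    """Return the state letter from the contents of /proc/<pid>/status."""
    match = re.search(r"^State:\s+(\S)", status_text, re.MULTILINE)
    return match.group(1) if match else "?"


def count_states(status_texts) -> Counter:
    """Count processes per metric name; unmapped states are skipped."""
    counts = Counter()
    for text in status_texts:
        metric = STATE_TO_METRIC.get(parse_state(text))
        if metric:
            counts[metric] += 1
    return counts


if __name__ == "__main__":
    samples = [
        "Name:\tbash\nState:\tS (sleeping)\nPid:\t101\n",
        "Name:\tworker\nState:\tR (running)\nPid:\t102\n",
        "Name:\tdefunct\nState:\tZ (zombie)\nPid:\t103\n",
    ]
    print(count_states(samples))
    print(memory_usage_percent(rss_pages=25600, page_size=4096, mem_total_kb=4096000))
```

On a live Linux host, the same functions could be fed with the text of each /proc/pid/status file and the values read from /proc/pid/statm and /proc/meminfo.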

GPU Specifications

If a GPU server has eight GPU cards and persistence mode (PM) is disabled, data collection may fail. In this case, enable persistence mode and restart the monitoring process.

Category	Metric Name	Description	Value Range	Unit	Conversion Rule	Earliest Agent Version Required	Collection Interval
GPU Specifications	gpu_status	GPU health status of the VM. This metric is a composite metric. Possible causes: 1. The ECC exceeded the threshold. 2. The GPU memory address failed to be remapped. 3. The GPU card is in the rev ff state. 4. infoROM error. 5. There are pages to be isolated. 6. The remapped rows are incorrect. (For details, see the following detailed metrics.) Collection method (Linux): Call APIs from the GPU driver library file libnvidia-ml.so.1 to obtain the GPU status. Collection method (Windows): Call APIs from the GPU driver library file nvml.dll to obtain the GPU status.	0: healthy 1: subhealthy 2: faulty	None	N/A	2.4.5	1 minute
	gpu_performance_state	Performance state of the GPU: P0-P15 or P32. P0 indicates the maximum performance state, P15 indicates the minimum performance state, and P32 indicates an unknown state. Collection method (Linux): Call the NvmlDeviceGetPerformanceState API from the GPU driver library file libnvidia-ml.so.1 to obtain the GPU performance level. Collection method (Windows): Call the NvmlDeviceGetPerformanceState API from the GPU driver library file nvml.dll to obtain the GPU performance level.	P0-P15: P0 indicates the maximum performance state, and P15 indicates the minimum performance state. P32 indicates an unknown state.	None	N/A	2.4.1	1 minute
	gpu_power_draw	Power of the GPU. If the power exceeds the maximum power or is an incorrect value, the GPU hardware may be faulty. Collection method (Linux): Call the NvmlDeviceGetPowerUsage API from the GPU driver library file libnvidia-ml.so.1 to obtain the GPU power. Collection method (Windows): Call the NvmlDeviceGetPowerUsage API from the GPU driver library file nvml.dll to obtain the GPU power.	≥ 0	W	N/A	2.4.5	1 minute
	gpu_temperature	Temperature of the GPU. If the temperature exceeds the maximum operating temperature threshold or is an incorrect value, the GPU hardware may be faulty. Collection method (Linux): Call the NvmlDeviceGetTemperature API from the GPU driver library file libnvidia-ml.so.1 to obtain the GPU temperature. Collection method (Windows): Call the NvmlDeviceGetTemperature API from the GPU driver library file nvml.dll to obtain the GPU temperature.	≥ 0	°C	N/A	2.4.5	1 minute
	gpu_usage_gpu	GPU computing power usage. The GPU computing power usage is displayed in percentage. The value is an instantaneous value at the sampling point. Collection method (Linux): Call the NvmlDeviceGetUtilizationRates API from the GPU driver library file libnvidia-ml.so.1 to obtain the GPU computing power usage. Collection method (Windows): Call the NvmlDeviceGetUtilizationRates API from nvml.dll to obtain the GPU computing power usage.	0-100	%	N/A	2.4.1	1 minute
	gpu_usage_mem	GPU memory usage. The GPU memory usage is displayed in percentage. The value is an instantaneous value at the sampling point. Collection method (Linux): Call the NvmlDeviceGetUtilizationRates API from the GPU driver library file libnvidia-ml.so.1 to obtain the GPU memory usage. Collection method (Windows): Call the NvmlDeviceGetUtilizationRates API from nvml.dll to obtain the GPU memory usage.	0-100	%	N/A	2.4.1	1 minute
	gpu_used_mem	Used GPU memory. The amount of GPU memory in use is displayed. The value is an instantaneous value at the sampling point. Collection method (Linux): Call the NvmlDeviceGetMemoryInfo API from the GPU driver library file libnvidia-ml.so.1 to obtain the used GPU memory. Collection method (Windows): Call the NvmlDeviceGetMemoryInfo API from the GPU driver library file nvml.dll to obtain the used GPU memory.	≥ 0	MB	N/A	2.4.5	1 minute
	gpu_free_mem	Remaining GPU memory. The amount of GPU memory that is not in use is displayed. Collection method (Linux): Call the NvmlDeviceGetMemoryInfo API from the GPU driver library file libnvidia-ml.so.1 to obtain the remaining GPU memory. Collection method (Windows): Call the NvmlDeviceGetMemoryInfo API from nvml.dll to obtain the remaining GPU memory.	≥ 0	MB	N/A	2.4.5	1 minute
	gpu_usage_encoder	GPU encoder usage. The GPU encoder usage is displayed in percentage. The value is an instantaneous value at the sampling point. Collection method (Linux): Call the NvmlDeviceGetEncoderUtilization API from the GPU driver library file libnvidia-ml.so.1 to obtain the GPU encoding capability usage. Collection method (Windows): Call the NvmlDeviceGetEncoderUtilization API from nvml.dll to obtain the GPU encoding capability usage.	0-100	%	N/A	2.4.5	1 minute
	gpu_usage_decoder	GPU decoder usage. The GPU decoder usage is displayed in percentage. The value is an instantaneous value at the sampling point. Collection method (Linux): Call the NvmlDeviceGetDecoderUtilization API from the GPU driver library file libnvidia-ml.so.1 to obtain the GPU decoding capability usage. Collection method (Windows): Call the NvmlDeviceGetDecoderUtilization API from nvml.dll to obtain the GPU decoding capability usage.	0-100	%	N/A	2.4.5	1 minute
	gpu_graphics_clocks	GPU graphics (shader) clock frequency. Displays the GPU clock frequencies related to graphics performance. If no graphics capability is used, you can ignore it. Collection method (Linux): Call the NvmlDeviceGetClockInfo API from the GPU driver library file libnvidia-ml.so.1 to obtain the GPU graphics clock frequency. Collection method (Windows): Call the NvmlDeviceGetClockInfo API from the GPU driver library file nvml.dll to obtain the GPU graphics clock frequency.	≥ 0	MHz	N/A	2.4.5	1 minute
	gpu_sm_clocks	Streaming multiprocessor (SM) clock frequency of the GPU. Displays the clock frequency closely related to CUDA core computing of the GPU. Collection method (Linux): Call the NvmlDeviceGetClockInfo API from the GPU driver library file libnvidia-ml.so.1 to obtain the streaming multiprocessor clock frequency of the GPU. Collection method (Windows): Call the NvmlDeviceGetClockInfo API from the GPU driver library file nvml.dll to obtain the streaming multiprocessor clock frequency of the GPU.	≥ 0	MHz	N/A	2.4.5	1 minute
	gpu_mem_clocks	Memory clock frequency of the GPU. Clock frequency for controlling the GPU memory running speed. Collection method (Linux): Call the NvmlDeviceGetClockInfo API from the GPU driver library file libnvidia-ml.so.1 to obtain the GPU memory clock frequency. Collection method (Windows): Call the NvmlDeviceGetClockInfo API from the GPU driver library file nvml.dll to obtain the GPU memory clock frequency.	≥ 0	MHz	N/A	2.4.5	1 minute
	gpu_video_clocks	Video (including codec) clock frequency of the GPU. Displays the codec clock frequency of the current GPU. Collection method (Linux): Call the NvmlDeviceGetClockInfo API from the GPU driver library file libnvidia-ml.so.1 to obtain the video clock frequency of the GPU. Collection method (Windows): Call the NvmlDeviceGetClockInfo API from the GPU driver library file nvml.dll to obtain the GPU video clock frequency.	≥ 0	MHz	N/A	2.4.5	1 minute
	gpu_tx_throughput_pci	Outbound bandwidth of the GPU. Displays the amount of data sent by the GPU to the host via PCIe. Collection method (Linux): Call the NvmlDeviceGetPcieThroughput API from libnvidia-ml.so.1 to obtain the outbound bandwidth of the GPU. Collection method (Windows): Call the NvmlDeviceGetPcieThroughput API from nvml.dll to obtain the outbound bandwidth of the GPU.	≥ 0	MByte/s	N/A	2.4.5	1 minute
	gpu_rx_throughput_pci	Inbound bandwidth of the GPU. Displays the amount of data sent by the host to the GPU via PCIe. Collection method (Linux): Call the NvmlDeviceGetPcieThroughput API from libnvidia-ml.so.1 to obtain the inbound bandwidth of the GPU. Collection method (Windows): Call the NvmlDeviceGetPcieThroughput API from nvml.dll to obtain the inbound bandwidth of the GPU.	≥ 0	MByte/s	N/A	2.4.5	1 minute
	gpu_volatile_correctable	Number of correctable ECC errors since the GPU was last reset. The value is reset to 0 each time the GPU is reset. Collection method (Linux): Call the NvmlDeviceGetTotalEccErrors and NvmlDeviceGetMemoryErrorCounter APIs from the GPU driver library file libnvidia-ml.so.1 to obtain the number of correctable ECC errors since the GPU was last reset. Collection method (Windows): Call the NvmlDeviceGetTotalEccErrors and NvmlDeviceGetMemoryErrorCounter APIs from the GPU driver library file nvml.dll to obtain the number of correctable ECC errors since the GPU was last reset.	≥ 0	count	N/A	2.4.5	1 minute
	gpu_volatile_uncorrectable	Number of uncorrectable ECC errors since the GPU was last reset. The value is reset to 0 each time the GPU is reset. Collection method (Linux): Call the NvmlDeviceGetTotalEccErrors and NvmlDeviceGetMemoryErrorCounter APIs from the GPU driver library file libnvidia-ml.so.1 to obtain the number of uncorrectable ECC errors since the GPU was last reset. Collection method (Windows): Call the NvmlDeviceGetTotalEccErrors and NvmlDeviceGetMemoryErrorCounter APIs from the GPU driver library file nvml.dll to obtain the number of uncorrectable ECC errors since the GPU was last reset.	≥ 0	count	N/A	2.4.5	1 minute
	gpu_aggregate_correctable	Number of correctable ECC errors on the GPU. Collection method (Linux): Call the NvmlDeviceGetTotalEccErrors and NvmlDeviceGetMemoryErrorCounter APIs from the GPU driver library file libnvidia-ml.so.1 to obtain the number of correctable ECC errors on the GPU. Collection method (Windows): Call the NvmlDeviceGetTotalEccErrors and NvmlDeviceGetMemoryErrorCounter APIs from the GPU driver library file nvml.dll to obtain the number of correctable ECC errors on the GPU.	≥ 0	count	N/A	2.4.5	1 minute
	gpu_aggregate_uncorrectable	Number of uncorrectable ECC errors on the GPU. Collection method (Linux): Call the NvmlDeviceGetTotalEccErrors and NvmlDeviceGetMemoryErrorCounter APIs from the GPU driver library file libnvidia-ml.so.1 to obtain the number of uncorrectable ECC errors on the GPU. Collection method (Windows): Call the NvmlDeviceGetTotalEccErrors and NvmlDeviceGetMemoryErrorCounter APIs from the GPU driver library file nvml.dll to obtain the number of uncorrectable ECC errors on the GPU.	≥ 0	count	N/A	2.4.5	1 minute
	gpu_retired_page_single_bit	Number of pages retired because of single-bit ECC errors, that is, the number of single-bit pages isolated by the GPU. Collection method (Linux): Call the NvmlDeviceGetRetiredPages API from the GPU driver library file libnvidia-ml.so.1 to obtain the number of single-bit pages isolated by the GPU. Collection method (Windows): Call the NvmlDeviceGetRetiredPages API from the GPU driver library file nvml.dll to obtain the number of single-bit pages isolated by the GPU.	≥ 0	count	N/A	2.4.5	1 minute
	gpu_retired_page_double_bit	Number of pages retired because of double-bit ECC errors, that is, the number of double-bit pages isolated by the GPU. Collection method (Linux): Call the NvmlDeviceGetRetiredPages API from the GPU driver library file libnvidia-ml.so.1 to obtain the number of double-bit pages isolated by the GPU. Collection method (Windows): Call the NvmlDeviceGetRetiredPages API from the GPU driver library file nvml.dll to obtain the number of double-bit pages isolated by the GPU.	≥ 0	count	N/A	2.4.5	1 minute
	gpu_lnkcap_speed	Maximum speed supported by the PCIe link of the GPU, that is, the maximum data throughput capability of the GPU on the PCIe bus. Collection method (Linux): Use lspci -d 10de: -vv | grep -i lnkcap to query the maximum speed supported by the PCIe link of the GPU. Collection method (Windows): Use (gwmi Win32_Bus -Filter 'DeviceID like "PCI%"').GetRelated('Win32_PnPEntity') to query the maximum speed supported by the PCIe link of the GPU.	≥ 0	GT/s	N/A	2.6.7	1 minute
	gpu_lnkcap_width	Maximum PCIe link width of the GPU, that is, the maximum number of PCIe lanes supported by the GPU. Collection method (Linux): Use lspci -d 10de: -vv | grep -i lnkcap to query the maximum link width supported by the PCIe link of the GPU. Collection method (Windows): Use (gwmi Win32_Bus -Filter 'DeviceID like "PCI%"').GetRelated('Win32_PnPEntity') to query the maximum link width supported by the PCIe link of the GPU.	≥ 0	count	N/A	2.6.7	1 minute
	gpu_lnksta_speed	PCIe connection speed of the GPU, that is, the link speed currently negotiated between the GPU and the host. Collection method (Linux): Use lspci -d 10de: -vv | grep -i lnksta to query the PCIe connection speed of the GPU. Collection method (Windows): not supported	≥ 0	GT/s	N/A	2.6.7	1 minute
	gpu_lnksta_width	PCIe link width of the GPU, that is, the number of lanes currently negotiated for the PCIe link. Collection method (Linux): Use lspci -d 10de: -vv | grep -i lnksta to query the PCIe link width of the GPU. Collection method (Windows): not supported	≥ 0	count	N/A	2.6.7	1 minute
	gpu_nvlink_number	Number of NVLink links of the GPU. Number of NVLink links supported by the GPU. For example, A100 supports 12 NVLink links. Collection method (Linux): Call the nvmlDeviceGetFieldValue API from the GPU driver library file libnvidia-ml.so.1 to obtain the number of NVLink links of the GPU. Collection method (Windows): not supported	≥ 0	count	N/A	2.6.7	1 minute
	gpu_nvlink_bandwidth	NVLink bandwidth of the GPU. Indicates the total bandwidth for data transmission used by the GPU. Collection method (Linux): Call the nvmlDeviceGetFieldValue API from the GPU driver library file libnvidia-ml.so.1 to obtain the NVLink bandwidth of the GPU. Collection method (Windows): not supported	≥ 0	GB/s	N/A	2.6.7	1 minute
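
The four PCIe link metrics in the table above are read from lspci output. The following Python sketch shows one way the LnkCap (capability) and LnkSta (current status) lines might be parsed into those values. It is an illustration under stated assumptions, not the Agent's parser: the function name, the result keys, and the sample text are hypothetical, and the sample only imitates the usual layout of lspci -vv output.

```python
import re

# Matches "LnkCap:" or "LnkSta:" lines. "LnkCap2:" and "LnkSta2:" are skipped
# because the colon must directly follow the field name.
_LINK_RE = re.compile(r"(LnkCap|LnkSta):.*?Speed\s+([\d.]+)GT/s.*?Width\s+x(\d+)")

# Hypothetical sample in the style of `lspci -d 10de: -vv` output.
SAMPLE_LSPCI = """\
\t\tLnkCap:\tPort #0, Speed 16GT/s, Width x16, ASPM not supported
\t\tLnkSta:\tSpeed 8GT/s (downgraded), Width x16 (ok)
\t\tLnkCap2: Supported Link Speeds: 2.5-16GT/s, Crosslink- Retimer+
"""


def parse_pcie_links(lspci_output: str) -> dict:
    """Extract speed (GT/s) and width (lanes) from LnkCap and LnkSta lines.

    Only the first LnkCap and first LnkSta line are used, so this sketch
    assumes the output describes a single GPU.
    """
    result = {}
    for line in lspci_output.splitlines():
        match = _LINK_RE.search(line)
        if not match:
            continue
        kind = match.group(1).lower()  # "lnkcap" or "lnksta"
        result.setdefault(f"{kind}_speed", float(match.group(2)))
        result.setdefault(f"{kind}_width", int(match.group(3)))
    return result


if __name__ == "__main__":
    print(parse_pcie_links(SAMPLE_LSPCI))
```

A lnksta value lower than the corresponding lnkcap value, as in the sample, indicates that the link negotiated below the maximum that the GPU supports.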
