OS Monitoring Metrics Supported by ECSs with the Agent Installed

Description

OS monitoring provides system-level, proactive, and fine-grained monitoring. It requires the Agent to be installed on the ECSs to be monitored. This section describes the OS monitoring metrics reported to Cloud Eye. Monitoring data is collected once every minute.

OS monitoring supports metrics about the CPU, CPU load, memory, disk, disk I/O, file system, GPU, network interface, NTP, and TCP connections.

After the Agent is installed, you can view monitoring metrics of ECSs running different OSs.

Cloud Eye can monitor dimensions nested to a maximum depth of four levels (levels 0 to 3). Level 3 is the deepest level. For example, if the monitored dimension of a metric is instance_id,mount_point, instance_id indicates level 0 and mount_point indicates level 1.

Namespace

AGT.ECS

OS Metric: CPU

**Table 1** CPU metrics
Metric	Parameter	Description	Value Range	Unit	Conversion Rule	Dimension	Monitoring Period (Raw Data)
cpu_usage	(Agent) CPU Usage	CPU usage of the monitored object Linux: Check metric value changes in file /proc/stat in a collection period. Run the top command to check the %Cpu(s) value. Windows: Obtain the metric value using the Windows API GetSystemTimes.	0-100	%	N/A	instance_id	1 minute
cpu_usage_idle	(Agent) Idle CPU Usage	Percentage of time that CPU is idle Linux: Check metric value changes in file /proc/stat in a collection period. Windows: Obtain the metric value using the Windows API GetSystemTimes.	0-100	%	N/A	instance_id	1 minute
cpu_usage_user	(Agent) User Space CPU Usage	Percentage of time that the CPU is used by user space Linux: Check metric value changes in file /proc/stat in a collection period. Run the top command to check the %Cpu(s) us value. Windows: Obtain the metric value using the Windows API GetSystemTimes.	0-100	%	N/A	instance_id	1 minute
cpu_usage_system	(Agent) Kernel Space CPU Usage	Percentage of time that the CPU is used by kernel space Linux: Check metric value changes in file /proc/stat in a collection period. Run the top command to check the %Cpu(s) sy value. Windows: Obtain the metric value using the Windows API GetSystemTimes.	0-100	%	N/A	instance_id	1 minute
cpu_usage_other	(Agent) Other Process CPU Usage	Percentage of time that the CPU is used by other processes Linux and Windows: Other Process CPU Usage = 100% – Idle CPU Usage – Kernel Space CPU Usage – User Space CPU Usage	0-100	%	N/A	instance_id	1 minute
cpu_usage_nice	(Agent) Nice Process CPU Usage	Percentage of time that the CPU is used by the Nice process Linux: Check metric value changes in file /proc/stat in a collection period. Run the top command to check the %Cpu(s) ni value. Windows is not supported currently.	0-100	%	N/A	instance_id	1 minute
cpu_usage_iowait	(Agent) iowait Process CPU Usage	Percentage of time that the CPU is waiting for I/O operations to complete Linux: Check metric value changes in file /proc/stat in a collection period. Run the top command to check the %Cpu(s) wa value. Windows is not supported currently.	0-100	%	N/A	instance_id	1 minute
cpu_usage_irq	(Agent) CPU Interrupt Time	Percentage of time that the CPU is servicing interrupts Linux: Check metric value changes in file /proc/stat in a collection period. Run the top command to check the %Cpu(s) hi value. Windows is not supported currently.	0-100	%	N/A	instance_id	1 minute
cpu_usage_softirq	(Agent) CPU Software Interrupt Time	Percentage of time that the CPU is servicing software interrupts Linux: Check metric value changes in file /proc/stat in a collection period. Run the top command to check the %Cpu(s) si value. Windows is not supported currently.	0-100	%	N/A	instance_id	1 minute
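
The Linux collection method in Table 1 (value changes in /proc/stat over a collection period) can be illustrated with a minimal Python sketch. It is not the Agent's implementation: the function names are hypothetical, and treating each percentage as a counter delta divided by the sum of all deltas is an assumption based on how top reports %Cpu(s).

```python
# Field order of the aggregate "cpu" line in /proc/stat (first eight counters).
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")


def parse_cpu_line(stat_text):
    """Return the aggregate CPU counters from /proc/stat contents as a dict."""
    for line in stat_text.splitlines():
        parts = line.split()
        if parts and parts[0] == "cpu":
            values = [int(v) for v in parts[1:1 + len(FIELDS)]]
            return dict(zip(FIELDS, values))
    raise ValueError("no aggregate 'cpu' line found")


def cpu_percentages(before, after):
    """Percentage of elapsed CPU time spent in each state between two samples."""
    deltas = {key: after[key] - before[key] for key in before}
    total = sum(deltas.values())
    if total <= 0:
        return {key: 0.0 for key in deltas}
    return {key: 100.0 * delta / total for key, delta in deltas.items()}
```

To use it, read /proc/stat twice, one collection period apart, and pass both parsed samples to cpu_percentages. Under the same assumption, overall CPU usage corresponds to 100 minus the idle percentage.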

OS Metric: CPU Load

**Table 2** CPU load metrics
Metric	Parameter	Description	Value Range	Unit	Conversion Rule	Dimension	Monitoring Period (Raw Data)
load_average1	(Agent) 1-Minute Load Average	CPU load averaged over the last 1 minute Linux: The metric value is load1 in file /proc/loadavg divided by the number of logical CPUs. Run the top command to check the load1 value.	≥ 0	N/A	N/A	instance_id	1 minute
load_average5	(Agent) 5-Minute Load Average	CPU load averaged over the last 5 minutes Linux: The metric value is load5 in file /proc/loadavg divided by the number of logical CPUs. Run the top command to check the load5 value.	≥ 0	N/A	N/A	instance_id	1 minute
load_average15	(Agent) 15-Minute Load Average	CPU load averaged over the last 15 minutes Linux: The metric value is load15 in file /proc/loadavg divided by the number of logical CPUs. Run the top command to check the load15 value.	≥ 0	N/A	N/A	instance_id	1 minute
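
As a rough illustration of Table 2, the sketch below divides the three load averages in /proc/loadavg by the number of logical CPUs. It assumes that the descriptions above mean "load value divided by logical CPU count". The function name is hypothetical.

```python
def load_per_cpu(loadavg_text, cpu_count):
    """Divide the three load averages in /proc/loadavg by the logical CPU count."""
    if cpu_count <= 0:
        raise ValueError("cpu_count must be positive")
    # The first three fields of /proc/loadavg are the 1-, 5-, and 15-minute loads.
    load1, load5, load15 = (float(v) for v in loadavg_text.split()[:3])
    return {
        "load_average1": load1 / cpu_count,
        "load_average5": load5 / cpu_count,
        "load_average15": load15 / cpu_count,
    }
```

On a live Linux host, cpu_count could be taken from os.cpu_count() and loadavg_text from the contents of /proc/loadavg.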

OS Metric: Memory

**Table 3** Memory metrics
Metric	Parameter	Description	Value Range	Unit	Conversion Rule	Dimension	Monitoring Period (Raw Data)
mem_available	(Agent) Available Memory	Available memory size of the monitored object Linux: Obtain the metric value from /proc/meminfo. If MemAvailable is displayed in /proc/meminfo, use that value. Otherwise, MemAvailable = MemFree + Buffers + Cached. Windows: Available memory = Total memory – Used memory. The value is obtained by calling the Windows API GlobalMemoryStatusEx.	≥0	GB	N/A	instance_id	1 minute
mem_usedPercent	(Agent) Memory Usage	Memory usage of the monitored object Linux: Obtain the metric value from the /proc/meminfo file. If MemAvailable is displayed in /proc/meminfo, MemUsedPercent = (MemTotal – MemAvailable)/MemTotal. Otherwise, MemUsedPercent = (MemTotal – MemFree – Buffers – Cached)/MemTotal. Windows: Memory usage = Used memory size/Total memory size × 100%.	0-100	%	N/A	instance_id	1 minute
mem_free	(Agent) Idle Memory	Amount of memory that is not being used Linux: Obtain the metric value from /proc/meminfo. Windows is not supported currently.	≥0	GB	N/A	instance_id	1 minute
mem_buffers	(Agent) Buffer	Amount of memory that is being used for buffers Linux: Obtain the metric value from /proc/meminfo. Run the top command to check the KiB Mem:buffers value. Windows is not supported currently.	≥0	GB	N/A	instance_id	1 minute
mem_cached	(Agent) Cache	Amount of memory that is being used for file caches Linux: Obtain the metric value from /proc/meminfo. Run the top command to check the KiB Swap:cached Mem value. Windows is not supported currently.	≥0	GB	N/A	instance_id	1 minute
total_open_files	(Agent) Total File Handles	Total handles used by all processes Linux: Use the /proc/{pid}/fd file to summarize the handles used by all processes. Windows is not supported currently.	≥ 0	Count	N/A	instance_id	1 minute
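
The MemAvailable fallback described for mem_available and mem_usedPercent can be sketched as follows. This is an illustration of the formulas in Table 3 rather than the Agent's code. The function names are hypothetical, and values are kept in KiB as reported by /proc/meminfo.

```python
def parse_meminfo(text):
    """Map each /proc/meminfo key to its numeric value (KiB for sized entries)."""
    info = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        fields = rest.split()
        if fields:
            info[key.strip()] = int(fields[0])
    return info


def mem_available_kib(info):
    """Use MemAvailable when present, else MemFree + Buffers + Cached."""
    if "MemAvailable" in info:
        return info["MemAvailable"]
    return info["MemFree"] + info["Buffers"] + info["Cached"]


def mem_used_percent(info):
    """(MemTotal - available) / MemTotal, expressed as a percentage."""
    return 100.0 * (info["MemTotal"] - mem_available_kib(info)) / info["MemTotal"]
```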

OS Metric: Disk

Currently, only physical disks are monitored. NFS-attached disks cannot be monitored.
By default, Docker-related mount points are excluded from monitoring. Mount points with the following prefixes are excluded:
```
/var/lib/docker;/mnt/paas/kubernetes;/var/lib/mesos
```
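
The prefix rule above can be pictured with a short Python sketch. It assumes that a mount point is excluded when its path starts with one of the listed prefixes. The constant and function names are hypothetical and are not part of the Agent.

```python
# Prefixes copied from the list above.
EXCLUDED_PREFIXES = ("/var/lib/docker", "/mnt/paas/kubernetes", "/var/lib/mesos")


def is_monitored(mount_point, excluded=EXCLUDED_PREFIXES):
    """Return False for mount points that start with an excluded prefix."""
    return not mount_point.startswith(excluded)
```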

**Table 4** Disk metrics
Metric	Parameter	Description	Value Range	Unit	Conversion Rule	Dimension	Monitoring Period (Raw Data)
disk_free	(Agent) Available Disk Space	Free space on the disks Linux: Run the df -h command to check the value in the Avail column. The path of the mount point prefix cannot exceed 64 characters. It must start with a letter, and contain only digits, letters, hyphens (-), periods (.), and swung dashes (~). Windows: Use the WMI interface to call GetDiskFreeSpaceExW API to obtain disk space data. The path of the mount point prefix cannot exceed 64 characters. It must start with a letter, and contain only digits, letters, hyphens (-), periods (.), and swung dashes (~).	≥0	GB	N/A	instance_id,mount_point	1 minute
disk_total	(Agent) Disk Storage Capacity	Total space on the disks, including used and free Linux: Run the df -h command to check the value in the Size column. The path of the mount point prefix cannot exceed 64 characters. It must start with a letter, and contain only digits, letters, hyphens (-), periods (.), and swung dashes (~). Windows: Use the WMI interface to call GetDiskFreeSpaceExW API to obtain disk space data. The path of the mount point prefix cannot exceed 64 characters. It must start with a letter, and contain only digits, letters, hyphens (-), periods (.), and swung dashes (~).	≥0	GB	N/A	instance_id,mount_point	1 minute
disk_used	(Agent) Used Disk Space	Used space on the disks Linux: Run the df -h command to check the value in the Used column. The path of the mount point prefix cannot exceed 64 characters. It must start with a letter, and contain only digits, letters, hyphens (-), periods (.), and swung dashes (~). Windows: Use the WMI interface to call GetDiskFreeSpaceExW API to obtain disk space data. The path of the mount point prefix cannot exceed 64 characters. It must start with a letter, and contain only digits, letters, hyphens (-), periods (.), and swung dashes (~).	≥0	GB	N/A	instance_id,mount_point	1 minute
disk_usedPercent	(Agent) Disk Usage	Percentage of total disk space that is used, which is calculated as follows: Disk Usage = Used Disk Space/Disk Storage Capacity Linux: It is calculated as follows: Used/Size. The path of the mount point prefix cannot exceed 64 characters. It must start with a letter, and contain only digits, letters, hyphens (-), periods (.), and swung dashes (~). Windows: Use the WMI interface to call GetDiskFreeSpaceExW API to obtain disk space data. The path of the mount point prefix cannot exceed 64 characters. It must start with a letter, and contain only digits, letters, hyphens (-), periods (.), and swung dashes (~).	0-100	%	N/A	instance_id,mount_point	1 minute

OS Metric: Disk I/O

**Table 5** Disk I/O metrics
Metric	Parameter	Description	Value Range	Unit	Conversion Rule	Dimension	Monitoring Period (Raw Data)
disk_agt_read_bytes_rate	(Agent) Disks Read Rate	Number of bytes read from the monitored disk per second Linux: The disk read rate is calculated based on the data changes in the sixth column of the corresponding device in file /proc/diskstats in a collection period. The path of the mount point prefix cannot exceed 64 characters. It must start with a letter, and contain only digits, letters, hyphens (-), periods (.), and swung dashes (~). Windows: The disk I/O data is obtained through the Win32_PerfFormattedData_PerfDisk_LogicalDisk object in WMI. The object is obtained once in each collection period. The instantaneous value returned by the object indicates the metric value in a collection period. The path of the mount point prefix cannot exceed 64 characters. It must start with a letter, and contain only digits, letters, hyphens (-), periods (.), and swung dashes (~). When the CPU usage is high, monitoring data obtaining timeout may occur and result in the failure of obtaining monitoring data.	≥ 0	byte/s	1024(IEC)	instance_id,disk instance_id,mount_point	1 minute
disk_agt_read_requests_rate	(Agent) Disks Read Requests	Number of read requests sent to the monitored disk per second Linux: The disk read requests are calculated based on the data changes in the fourth column of the corresponding device in file /proc/diskstats in a collection period. The path of the mount point prefix cannot exceed 64 characters. It must start with a letter, and contain only digits, letters, hyphens (-), periods (.), and swung dashes (~). Windows: The disk I/O data is obtained through the Win32_PerfFormattedData_PerfDisk_LogicalDisk object in WMI. The object is obtained once in each collection period. The instantaneous value returned by the object indicates the metric value in a collection period. The path of the mount point prefix cannot exceed 64 characters. It must start with a letter, and contain only digits, letters, hyphens (-), periods (.), and swung dashes (~). When the CPU usage is high, monitoring data obtaining timeout may occur and result in the failure of obtaining monitoring data.	≥ 0	Request/s	N/A	instance_id,disk instance_id,mount_point	1 minute
disk_agt_write_bytes_rate	(Agent) Disks Write Rate	Number of bytes written to the monitored disk per second Linux: The disk write rate is calculated based on the data changes in the tenth column of the corresponding device in file /proc/diskstats in a collection period. The path of the mount point prefix cannot exceed 64 characters. It must start with a letter, and contain only digits, letters, hyphens (-), periods (.), and swung dashes (~). Windows: The disk I/O data is obtained through the Win32_PerfFormattedData_PerfDisk_LogicalDisk object in WMI. The object is obtained once in each collection period. The instantaneous value returned by the object indicates the metric value in a collection period. The path of the mount point prefix cannot exceed 64 characters. It must start with a letter, and contain only digits, letters, hyphens (-), periods (.), and swung dashes (~). When the CPU usage is high, monitoring data obtaining timeout may occur and result in the failure of obtaining monitoring data.	≥ 0	byte/s	1024(IEC)	instance_id,disk instance_id,mount_point	1 minute
disk_agt_write_requests_rate	(Agent) Disks Write Requests	Number of write requests sent to the monitored disk per second Linux: The disk write requests are calculated based on the data changes in the eighth column of the corresponding device in file /proc/diskstats in a collection period. The path of the mount point prefix cannot exceed 64 characters. It must start with a letter, and contain only digits, letters, hyphens (-), periods (.), and swung dashes (~). Windows: The disk I/O data is obtained through the Win32_PerfFormattedData_PerfDisk_LogicalDisk object in WMI. The object is obtained once in each collection period. The instantaneous value returned by the object indicates the metric value in a collection period. The path of the mount point prefix cannot exceed 64 characters. It must start with a letter, and contain only digits, letters, hyphens (-), periods (.), and swung dashes (~). When the CPU usage is high, monitoring data obtaining timeout may occur and result in the failure of obtaining monitoring data.	≥ 0	Request/s	N/A	instance_id,disk instance_id,mount_point	1 minute
disk_readTime	(Agent) Average Read Request Time	Average amount of time that read requests have waited on the disks Linux: The average read request time is calculated based on the data changes in the seventh column of the corresponding device in file /proc/diskstats in a collection period. The path of the mount point prefix cannot exceed 64 characters. It must start with a letter, and contain only digits, letters, hyphens (-), periods (.), and swung dashes (~). Windows is not supported currently.	≥ 0	ms/Count	N/A	instance_id,disk instance_id,mount_point	1 minute
disk_writeTime	(Agent) Average Write Request Time	Average amount of time that write requests have waited on the disks Linux: The average write request time is calculated based on the data changes in the eleventh column of the corresponding device in file /proc/diskstats in a collection period. The path of the mount point prefix cannot exceed 64 characters. It must start with a letter, and contain only digits, letters, hyphens (-), periods (.), and swung dashes (~). Windows is not supported currently.	≥ 0	ms/Count	N/A	instance_id,disk instance_id,mount_point	1 minute
disk_ioUtils	(Agent) Disk I/O Usage	Percentage of the time that the disk has had I/O requests queued to the total disk operation time Linux: The disk I/O usage is calculated based on the data changes in the thirteenth column of the corresponding device in file /proc/diskstats in a collection period. The path of the mount point prefix cannot exceed 64 characters. It must start with a letter, and contain only digits, letters, hyphens (-), periods (.), and swung dashes (~). Windows is not supported currently.	0-100	%	N/A	instance_id,disk instance_id,mount_point	1 minute
disk_queue_length	(Agent) Disk Queue Length	This metric reflects the disk usage in a specified period and can be used to evaluate the disk I/O performance. A larger value indicates a busier disk and poorer I/O performance. Linux: The metric value is calculated by dividing the data changes in the fourteenth column of the corresponding device in /proc/diskstats in a collection period by the metric collection period. The path of the mount point prefix cannot exceed 64 characters. It must start with a letter, and contain only digits, letters, hyphens (-), periods (.), and swung dashes (~). Windows is not supported currently.	≥ 0	Count	N/A	instance_id,disk instance_id,mount_point	1 minute
disk_write_bytes_per_operation	(Agent) Average Disk Write Size	Average number of bytes in an I/O write for the monitored disk in the monitoring period Linux: The average disk write size is calculated based on the data changes in the tenth column of the corresponding device to divide that of the eighth column in file /proc/diskstats in a collection period. The path of the mount point prefix cannot exceed 64 characters. It must start with a letter, and contain only digits, letters, hyphens (-), periods (.), and swung dashes (~). Windows is not supported currently.	≥ 0	Byte/op	N/A	instance_id,disk instance_id,mount_point	1 minute
disk_read_bytes_per_operation	(Agent) Average Disk Read Size	Average number of bytes in an I/O read for the monitored disk in the monitoring period Linux: The average disk read size is calculated based on the data changes in the sixth column of the corresponding device to divide that of the fourth column in file /proc/diskstats in a collection period. The path of the mount point prefix cannot exceed 64 characters. It must start with a letter, and contain only digits, letters, hyphens (-), periods (.), and swung dashes (~). Windows is not supported currently.	≥ 0	Byte/op	N/A	instance_id,disk instance_id,mount_point	1 minute
disk_io_svctm	(Agent) Disk I/O Service Time	Average time in an I/O read or write for the monitored disk in the monitoring period Linux: The average disk I/O service time is calculated based on the data changes in the thirteenth column of the corresponding device to divide the sum of data changes in the fourth and eighth columns in file /proc/diskstats in a collection period. The path of the mount point prefix cannot exceed 64 characters. It must start with a letter, and contain only digits, letters, hyphens (-), periods (.), and swung dashes (~). Windows is not supported currently.	≥ 0	ms/op	N/A	instance_id,disk instance_id,mount_point	1 minute
disk_device_used_percent	Block Device Usage	Percentage of the physical disk space that is used on the monitored object. Calculation formula: Used storage space of all mounted disk partitions/Total disk storage space Linux: Obtain the disk usage of each mount point and sum it to get the total used storage space. Calculate the total disk storage space from the disk sector size and the number of sectors. Windows is not supported currently.	0-100	%	N/A	instance_id,disk	1 minute
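
The column references in Table 5 can be made concrete with a small sketch that computes several metrics from two /proc/diskstats samples. It is illustrative only. It assumes 512-byte sectors for columns 6 and 10, and millisecond units for columns 13 and 14. The function names are hypothetical.

```python
SECTOR_BYTES = 512  # assumption: /proc/diskstats counts 512-byte sectors


def parse_diskstats(text):
    """Map device name -> columns 4 to 14 of its /proc/diskstats line."""
    stats = {}
    for line in text.splitlines():
        cols = line.split()
        if len(cols) >= 14:
            stats[cols[2]] = [int(c) for c in cols[3:14]]
    return stats


def disk_rates(before, after, interval_s):
    """Derive per-second metrics for one device from two counter samples."""
    def delta(column):
        # Columns are 1-based as in the table; the lists start at column 4.
        return after[column - 4] - before[column - 4]

    interval_ms = interval_s * 1000.0
    return {
        "disk_agt_read_requests_rate": delta(4) / interval_s,
        "disk_agt_read_bytes_rate": delta(6) * SECTOR_BYTES / interval_s,
        "disk_agt_write_requests_rate": delta(8) / interval_s,
        "disk_agt_write_bytes_rate": delta(10) * SECTOR_BYTES / interval_s,
        # Assumption: busy time (column 13, ms) as a share of elapsed time.
        "disk_ioUtils": 100.0 * delta(13) / interval_ms,
        # Assumption: weighted I/O time (column 14, ms) over elapsed time.
        "disk_queue_length": delta(14) / interval_ms,
    }
```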

OS Metric: File System

**Table 6** File system metrics
Metric	Parameter	Description	Value Range	Unit	Conversion Rule	Dimension	Monitoring Period (Raw Data)
disk_fs_rwstate	(Agent) File System Read/Write Status	Read and write status of the mounted file system of the monitored object. Value: 0 (read and write) or 1 (read only) Linux: Check file system information in the fourth column in file /proc/mounts.	0: readable and writable 1: read-only	N/A	N/A	instance_id,mount_point	1 minute
disk_inodesTotal	(Agent) Disk inode Total	Total number of index nodes on the disk Linux: Run the df -i command to check the value in the Inodes column. The path of the mount point prefix cannot exceed 64 characters. It must start with a letter, and contain only digits, letters, hyphens (-), periods (.), and swung dashes (~).	≥ 0	Count	N/A	instance_id,mount_point	1 minute
disk_inodesUsed	(Agent) Total inode Used	Number of used index nodes on the disk Linux: Run the df -i command to check the value in the IUsed column. The path of the mount point prefix cannot exceed 64 characters. It must start with a letter, and contain only digits, letters, hyphens (-), periods (.), and swung dashes (~).	≥ 0	Count	N/A	instance_id,mount_point	1 minute
disk_inodesUsedPercent	(Agent) Percentage of Total inode Used	Ratio of used index nodes to the total index nodes on the disk Linux: Run the df -i command to check the value in the IUse% column. The path of the mount point prefix cannot exceed 64 characters. It must start with a letter, and contain only digits, letters, hyphens (-), periods (.), and swung dashes (~).	0-100	%	N/A	instance_id,mount_point	1 minute

The Windows OS does not support the file system metrics.

OS Metric: NIC

**Table 7** NIC metrics
Metric	Parameter	Description	Value Range	Unit	Conversion Rule	Dimension	Monitoring Period (Raw Data)
net_bitRecv	(Agent) Outbound Bandwidth	Number of bits sent by this NIC per second Linux: Check metric value changes in file /proc/net/dev in a collection period. Windows: Use the MibIfRow object in the WMI to obtain network metric data.	≥ 0	bit/s	1024(IEC)	instance_id	1 minute
net_bitSent	(Agent) Inbound Bandwidth	Number of bits received by this NIC per second Linux: Check metric value changes in file /proc/net/dev in a collection period. Windows: Use the MibIfRow object in the WMI to obtain network metric data.	≥ 0	bit/s	1024(IEC)	instance_id	1 minute
net_packetRecv	(Agent) NIC Packet Receive Rate	Number of packets received by this NIC per second Linux: Check metric value changes in file /proc/net/dev in a collection period. Windows: Use the MibIfRow object in the WMI to obtain network metric data.	≥ 0	Counts/s	N/A	instance_id	1 minute
net_packetSent	(Agent) NIC Packet Send Rate	Number of packets sent by this NIC per second Linux: Check metric value changes in file /proc/net/dev in a collection period. Windows: Use the MibIfRow object in the WMI to obtain network metric data.	≥ 0	Counts/s	N/A	instance_id	1 minute
net_errin	(Agent) Receive Error Rate	Percentage of error packets detected by this NIC to the total number of packets received by the NIC per second Linux: Check metric value changes in file /proc/net/dev in a collection period. Windows is not supported currently.	0-100	%	N/A	instance_id	1 minute
net_errout	(Agent) Transmit Error Rate	Percentage of transmit errors detected by this NIC per second Linux: Check metric value changes in file /proc/net/dev in a collection period. Windows is not supported currently.	0-100	%	N/A	instance_id	1 minute
net_dropin	(Agent) Received Packet Drop Rate	Percentage of packets received by this NIC which were dropped per second Linux: Check metric value changes in file /proc/net/dev in a collection period. Windows is not supported currently.	0-100	%	N/A	instance_id	1 minute
net_dropout	(Agent) Transmitted Packet Drop Rate	Percentage of packets transmitted by this NIC which were dropped per second Linux: Check metric value changes in file /proc/net/dev in a collection period. Windows is not supported currently.	0-100	%	N/A	instance_id	1 minute
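
For the Linux collection method in Table 7, the sketch below derives bit and packet rates from two /proc/net/dev samples. It is a minimal illustration with hypothetical function names. The result keys use neutral rx/tx names instead of the metric names, and they are not identifiers defined by Cloud Eye.

```python
def parse_net_dev(text):
    """Map interface name -> (rx_bytes, rx_packets, tx_bytes, tx_packets)."""
    counters = {}
    for line in text.splitlines():
        if ":" not in line:
            continue  # skip the two header lines
        name, _, rest = line.partition(":")
        nums = rest.split()
        if len(nums) >= 10:
            # Receive counters come first; transmit counters start at index 8.
            counters[name.strip()] = (
                int(nums[0]), int(nums[1]), int(nums[8]), int(nums[9])
            )
    return counters


def nic_rates(before, after, interval_s):
    """Per-second rates for one interface between two counter samples."""
    rx_bytes, rx_pkts, tx_bytes, tx_pkts = (a - b for a, b in zip(after, before))
    return {
        "rx_bits_per_s": rx_bytes * 8 / interval_s,
        "tx_bits_per_s": tx_bytes * 8 / interval_s,
        "rx_packets_per_s": rx_pkts / interval_s,
        "tx_packets_per_s": tx_pkts / interval_s,
    }
```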

OS Metric: NTP

**Table 8** NTP metrics
Metric	Parameter	Description	Value Range	Unit	Conversion Rule	Dimension	Monitoring Period (Raw Data)
ntp_offset	(Agent) NTP Offset	NTP offset of the monitored object Linux: Run chronyc sources -v to obtain the offset. Windows is not supported currently.	≥ 0	ms	N/A	instance_id	1 minute

OS Metric: TCP

**Table 9** TCP metrics
Metric	Parameter	Description	Value Range	Unit	Conversion Rule	Dimension	Monitoring Period (Raw Data)
net_tcp_total	(Agent) Total TCP Connections	Total number of TCP connections in all states Linux: Obtain TCP connections in all states from the /proc/net/tcp file, and then collect the number of connections in each state. Windows: Obtain the metric value using the Windows API GetTcpTable2.	≥ 0	Count	N/A	instance_id	1 minute
net_tcp_established	(Agent) TCP ESTABLISHED Connection	Number of TCP connections in the ESTABLISHED state Linux: Obtain TCP connections in all states from the /proc/net/tcp file, and then collect the number of connections in each state. Windows: Obtain the metric value using the Windows API GetTcpTable2.	≥ 0	Count	N/A	instance_id	1 minute
net_tcp_sys_sent	(Agent) TCP SYS_SENT Connections	Number of TCP connections in the SYN_SENT state, that is, connections being requested by the client Linux: Obtain TCP connections in all states from the /proc/net/tcp file, and then collect the number of connections in each state. Windows: Obtain the metric value using the Windows API GetTcpTable2.	≥ 0	Count	N/A	instance_id	1 minute
net_tcp_sys_recv	(Agent) TCP SYS_RECV Connections	Number of TCP connections in the SYN_RECV state, that is, pending connections received by the server Linux: Obtain TCP connections in all states from the /proc/net/tcp file, and then collect the number of connections in each state. Windows: Obtain the metric value using the Windows API GetTcpTable2.	≥ 0	Count	N/A	instance_id	1 minute
net_tcp_fin_wait1	(Agent) TCP FIN_WAIT1 Connections	Number of TCP connections waiting for ACK packets when the connections are being actively closed by the client Linux: Obtain TCP connections in all states from the /proc/net/tcp file, and then collect the number of connections in each state. Windows: Obtain the metric value using the Windows API GetTcpTable2.	≥ 0	Count	N/A	instance_id	1 minute
net_tcp_fin_wait2	(Agent) TCP FIN_WAIT2 Connections	Number of TCP connections in the FIN_WAIT2 state Linux: Obtain TCP connections in all states from the /proc/net/tcp file, and then collect the number of connections in each state. Windows: Obtain the metric value using the Windows API GetTcpTable2.	≥ 0	Count	N/A	instance_id	1 minute
net_tcp_time_wait	(Agent) TCP TIME_WAIT Connections	Number of TCP connections in the TIME_WAIT state Linux: Obtain TCP connections in all states from the /proc/net/tcp file, and then collect the number of connections in each state. Windows: Obtain the metric value using the Windows API GetTcpTable2.	≥ 0	Count	N/A	instance_id	1 minute
net_tcp_close	(Agent) TCP CLOSE Connections	Number of closed TCP connections Linux: Obtain TCP connections in all states from the /proc/net/tcp file, and then collect the number of connections in each state. Windows: Obtain the metric value using the Windows API GetTcpTable2.	≥ 0	Count	N/A	instance_id	1 minute
net_tcp_close_wait	(Agent) TCP CLOSE_WAIT Connections	Number of TCP connections in the CLOSE_WAIT state Linux: Obtain TCP connections in all states from the /proc/net/tcp file, and then collect the number of connections in each state. Windows: Obtain the metric value using the Windows API GetTcpTable2.	≥ 0	Count	N/A	instance_id	1 minute
net_tcp_last_ack	(Agent) TCP LAST_ACK Connections	Number of TCP connections waiting for ACK packets when the connections are being passively closed by the client Linux: Obtain TCP connections in all states from the /proc/net/tcp file, and then collect the number of connections in each state. Windows: Obtain the metric value using the Windows API GetTcpTable2.	≥ 0	Count	N/A	instance_id	1 minute
net_tcp_listen	(Agent) TCP LISTEN Connections	Number of TCP connections in the LISTEN state Linux: Obtain TCP connections in all states from the /proc/net/tcp file, and then collect the number of connections in each state. Windows: Obtain the metric value using the Windows API GetTcpTable2.	≥ 0	Count	N/A	instance_id	1 minute
net_tcp_closing	(Agent) TCP CLOSING Connections	Number of TCP connections to be automatically closed by the server and the client at the same time Linux: Obtain TCP connections in all states from the /proc/net/tcp file, and then collect the number of connections in each state. Windows: Obtain the metric value using the Windows API GetTcpTable2.	≥ 0	Count	N/A	instance_id	1 minute
net_tcp_retrans	(Agent) TCP Retransmission Rate	Percentage of packets that are resent Linux: Obtain the metric value from the /proc/net/snmp file. The value is the ratio of the number of retransmitted packets to the number of sent packets in a collection period. Windows: Obtain the metric value using the Windows API GetTcpStatistics.	0-100	%	N/A	instance_id	1 minute
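
Counting connections per state from /proc/net/tcp, as described for Linux in Table 9, can be sketched as follows. The hexadecimal state codes are the ones used by the Linux kernel. The function name is hypothetical, and the sketch covers only the IPv4 table (IPv6 sockets are listed in /proc/net/tcp6).

```python
from collections import Counter

# Hexadecimal values of the "st" column in /proc/net/tcp.
TCP_STATES = {
    "01": "ESTABLISHED", "02": "SYN_SENT", "03": "SYN_RECV",
    "04": "FIN_WAIT1", "05": "FIN_WAIT2", "06": "TIME_WAIT",
    "07": "CLOSE", "08": "CLOSE_WAIT", "09": "LAST_ACK",
    "0A": "LISTEN", "0B": "CLOSING",
}


def count_tcp_states(proc_net_tcp_text):
    """Count connections per TCP state; the first line is the column header."""
    counts = Counter()
    for line in proc_net_tcp_text.splitlines()[1:]:
        cols = line.split()
        if len(cols) >= 4:
            counts[TCP_STATES.get(cols[3].upper(), "UNKNOWN")] += 1
    return counts
```

The total across all states, sum(counts.values()), corresponds to net_tcp_total under this reading.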

OS Metric: GPU

**Table 10** GPU metrics
Metric	Parameter	Description	Value Range	Unit	Conversion Rule	Dimension	Monitoring Period (Raw Data)
gpu_status	(Agent) GPU Health Status	Overall measurement of the GPU health Linux: Obtain the metric value using the libnvidia-ml.so.1 library file of the graphics card. Windows: Obtain the metric value using the nvml.dll library of the graphics card.	0: healthy 1: subhealthy 2: faulty	N/A	N/A	instance_id instance_id,gpu	1 minute
gpu_usage_encoder	(Agent) Encoding Usage	Encoding capability usage of the GPU Linux: Obtain the metric value using the libnvidia-ml.so.1 library file of the graphics card. Windows: Obtain the metric value using the nvml.dll library of the graphics card.	0-100	%	N/A	instance_id instance_id,gpu	1 minute
gpu_usage_decoder	(Agent) Decoding Usage	Decoding capability usage of the GPU Linux: Obtain the metric value using the libnvidia-ml.so.1 library file of the graphics card. Windows: Obtain the metric value using the nvml.dll library of the graphics card.	0-100	%	N/A	instance_id instance_id,gpu	1 minute
gpu_volatile_correctable	(Agent) Volatile Correctable ECC Errors	Number of correctable ECC errors since the GPU was last reset. The value is reset to 0 each time the GPU is reset. Linux: Obtain the metric value using the libnvidia-ml.so.1 library file of the graphics card. Windows: Obtain the metric value using the nvml.dll library of the graphics card.	≥ 0	Count	N/A	instance_id instance_id,gpu	1 minute
gpu_volatile_uncorrectable	(Agent) Volatile Uncorrectable ECC Errors	Number of uncorrectable ECC errors since the GPU was last reset. The value is reset to 0 each time the GPU is reset. Linux: Obtain the metric value using the libnvidia-ml.so.1 library file of the graphics card. Windows: Obtain the metric value using the nvml.dll library of the graphics card.	≥ 0	Count	N/A	instance_id instance_id,gpu	1 minute
gpu_aggregate_correctable	(Agent) Aggregate Correctable ECC Errors	Aggregate correctable ECC errors on the GPU Linux: Obtain the metric value using the libnvidia-ml.so.1 library file of the graphics card. Windows: Obtain the metric value using the nvml.dll library of the graphics card.	≥ 0	Count	N/A	instance_id instance_id,gpu	1 minute
gpu_aggregate_uncorrectable	(Agent) Aggregate Uncorrectable ECC Errors	Aggregate uncorrectable ECC Errors on the GPU Linux: Obtain the metric value using the libnvidia-ml.so.1 library file of the graphics card. Windows: Obtain the metric value using the nvml.dll library of the graphics card.	≥ 0	Count	N/A	instance_id instance_id,gpu	1 minute
gpu_retired_page_single_bit	(Agent) Retired Page Single Bit Errors	Number of retired page single bit errors, which indicates the number of single-bit pages blocked by the graphics card Linux: Obtain the metric value using the libnvidia-ml.so.1 library file of the graphics card. Windows: Obtain the metric value using the nvml.dll library of the graphics card.	≥ 0	Count	N/A	instance_id instance_id,gpu	1 minute
gpu_retired_page_double_bit	(Agent) Retired Page Double Bit Errors	Number of retired page double bit errors, which indicates the number of double-bit pages blocked by the graphics card Linux: Obtain the metric value using the libnvidia-ml.so.1 library file of the graphics card. Windows: Obtain the metric value using the nvml.dll library of the graphics card.	≥ 0	Count	N/A	instance_id instance_id,gpu	1 minute
gpu_performance_state	(Agent) Performance Status	GPU performance status Linux: Obtain the metric value using the libnvidia-ml.so.1 library file of the graphics card. Windows: Obtain the metric value using the nvml.dll library of the graphics card.	P0-P15, P32 P0: indicates the maximum performance status. P15: indicates the minimum performance status. P32: indicates the unknown performance status.	N/A	N/A	instance_id instance_id,gpu	1 minute
gpu_usage_mem	(Agent) GPU Memory Usage	GPU memory usage Linux: Obtain the metric value using the libnvidia-ml.so.1 library file of the graphics card. Windows: Obtain the metric value using the nvml.dll library of the graphics card.	0-100	%	N/A	instance_id instance_id,gpu	1 minute
gpu_usage_gpu	(Agent) GPU Usage	GPU compute usage Linux: Obtain the metric value using the libnvidia-ml.so.1 library file of the graphics card. Windows: Obtain the metric value using the nvml.dll library of the graphics card.	0-100	%	N/A	instance_id instance_id,gpu	1 minute

Dimension

Dimension	Key	Value
ECS	instance_id	Specifies the ECS ID. You can obtain the value by referring to Querying Server Monitoring Metrics from Different Dimensions.

Example of Querying Multi-Level Dimension Metrics Using APIs

If an object has multiple dimension levels, you need to specify the monitored dimension levels when you use an API to query the metrics of this object.

For example, if you want to query the remaining storage capacity (disk_free) of a disk mount point for an ECS, the dimension of the metric is instance_id,mount_point, where instance_id indicates level 0 and mount_point indicates level 1.

To query a single metric by calling an API, the mount_point dimension is used as follows:
```
dim.0=instance_id,3d65c1ac-9a9f-4c5f-a054-35184a087bb2&dim.1=mount_point,6666cd76f96956469e7be39d750cc7d9
```
3d65c1ac-9a9f-4c5f-a054-35184a087bb2 and 6666cd76f96956469e7be39d750cc7d9 are the values of instance_id and mount_point, respectively. For details about how to obtain these values, see the Dimension table.
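
As a small illustration, the query-string fragment above can be assembled from an ordered list of dimension name and value pairs. The helper name is hypothetical. The sketch only formats the dim.N parameters, does not call any API, and does not URL-encode values.

```python
def dimension_query(dimensions):
    """Build 'dim.0=name,value&dim.1=name,value' from ordered (name, value) pairs."""
    # Dimensions can be nested to at most four levels (levels 0 to 3).
    if not 1 <= len(dimensions) <= 4:
        raise ValueError("between one and four dimension levels are supported")
    return "&".join(
        f"dim.{level}={name},{value}"
        for level, (name, value) in enumerate(dimensions)
    )
```

The list order determines the level: the first pair becomes dim.0 and the second becomes dim.1.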

To query multiple metrics by calling an API, the mount_point dimension is used as follows:
```
"dimensions": [
    {
        "name": "instance_id",
        "value": "3d65c1ac-9a9f-4c5f-a054-35184a087bb2"
    },
    {
        "name": "mount_point",
        "value": "6666cd76f96956469e7be39d750cc7d9"
    }
]
```

3d65c1ac-9a9f-4c5f-a054-35184a087bb2 and 6666cd76f96956469e7be39d750cc7d9 are the values of instance_id and mount_point, respectively. For details about how to obtain these values, see the Dimension table.
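
The dimensions array for a multi-metric query can be generated from the same kind of ordered name and value pairs. The sketch below builds only that array. The helper name is hypothetical, and the rest of the request body is not shown here.

```python
import json


def dimensions_payload(dimensions):
    """Convert ordered (name, value) pairs into the 'dimensions' array."""
    return [{"name": name, "value": value} for name, value in dimensions]


# Example: serialize the array for the ECS and mount point used in this section.
example = json.dumps(
    {"dimensions": dimensions_payload([
        ("instance_id", "3d65c1ac-9a9f-4c5f-a054-35184a087bb2"),
        ("mount_point", "6666cd76f96956469e7be39d750cc7d9"),
    ])},
    indent=4,
)
```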
