OS Monitoring Metrics Supported by ECSs with the Agent Installed

Description

OS monitoring provides system-level, proactive, and fine-grained monitoring. It requires the Agent to be installed on the ECSs to be monitored. This section describes OS monitoring metrics reported to Cloud Eye.

OS monitoring supports metrics about the CPU, CPU load, memory, disk, disk I/O, file system, NIC, NTP, TCP, GPU, and NPU.

After the Agent is installed, you can view monitoring metrics of ECSs running different OSs. Monitoring data is collected every 1 minute.

Namespace

AGT.ECS

OS Metric: CPU

**Table 1** CPU metrics
Metric	Parameter	Description	Value Range	Monitored Object & Dimension	Monitoring Period (Raw Data)
cpu_usage	(Agent) CPU Usage	CPU usage of the monitored object Unit: percent Linux: Check metric value changes in file /proc/stat in a collection period. Run the top command to check the %Cpu(s) value. Windows: Obtain the metric value using the Windows API GetSystemTimes.	0-100	ECS	1 minute
cpu_usage_idle	(Agent) Idle CPU Usage	Percentage of time that CPU is idle Unit: percent Linux: Check metric value changes in file /proc/stat in a collection period. Windows: Obtain the metric value using the Windows API GetSystemTimes.	0-100	ECS	1 minute
cpu_usage_user	(Agent) User Space CPU Usage	Percentage of time that the CPU is used by user space Unit: percent Linux: Check metric value changes in file /proc/stat in a collection period. Run the top command to check the %Cpu(s) us value. Windows: Obtain the metric value using the Windows API GetSystemTimes.	0-100	ECS	1 minute
cpu_usage_system	(Agent) Kernel Space CPU Usage	Percentage of time that the CPU is used by kernel space Unit: percent Linux: Check metric value changes in file /proc/stat in a collection period. Run the top command to check the %Cpu(s) sy value. Windows: Obtain the metric value using the Windows API GetSystemTimes.	0-100	ECS	1 minute
cpu_usage_other	(Agent) Other Process CPU Usage	Percentage of time that the CPU is used by other processes Unit: percent Linux and Windows: Other Process CPU Usage = 100% - Idle CPU Usage - Kernel Space CPU Usage - User Space CPU Usage	0-100	ECS	1 minute
cpu_usage_nice	(Agent) Nice Process CPU Usage	Percentage of time that the CPU is in user mode with low-priority processes which can easily be interrupted by higher-priority processes Unit: percent Linux: Check metric value changes in file /proc/stat in a collection period. Run the top command to check the %Cpu(s) ni value. Windows is not supported currently.	0-100	ECS	1 minute
cpu_usage_iowait	(Agent) iowait Process CPU Usage	Percentage of time that the CPU is waiting for I/O operations to complete Unit: percent Linux: Check metric value changes in file /proc/stat in a collection period. Run the top command to check the %Cpu(s) wa value. Windows is not supported currently.	0-100	ECS	1 minute
cpu_usage_irq	(Agent) CPU Interrupt Time	Percentage of time that the CPU is servicing interrupts Unit: percent Linux: Check metric value changes in file /proc/stat in a collection period. Run the top command to check the %Cpu(s) hi value. Windows is not supported currently.	0-100	ECS	1 minute
cpu_usage_softirq	(Agent) CPU Software Interrupt Time	Percentage of time that the CPU is servicing software interrupts Unit: percent Linux: Check metric value changes in file /proc/stat in a collection period. Run the top command to check the %Cpu(s) si value. Windows is not supported currently.	0-100	ECS	1 minute
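
The /proc/stat-based collection described in Table 1 can be sketched as follows. This is a minimal illustration, not the Agent's implementation: the function names are hypothetical, and treating overall CPU usage as 100 minus the idle percentage is an assumption.

```python
def parse_cpu_line(stat_text):
    """Return the aggregate 'cpu' counters from /proc/stat text (in jiffies)."""
    fields = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")
    for line in stat_text.splitlines():
        parts = line.split()
        if parts and parts[0] == "cpu":
            values = [int(v) for v in parts[1:1 + len(fields)]]
            return dict(zip(fields, values))
    raise ValueError("no aggregate 'cpu' line found")


def cpu_percentages(before, after):
    """Percentage of the collection period spent in each state.

    'before' and 'after' are two snapshots returned by parse_cpu_line.
    """
    deltas = {key: after[key] - before[key] for key in after}
    total = sum(deltas.values())
    if total <= 0:
        return {key: 0.0 for key in deltas}
    pct = {key: 100.0 * value / total for key, value in deltas.items()}
    # Assumption: overall usage is everything that is not idle time.
    pct["usage"] = 100.0 - pct["idle"]
    return pct
```

In practice the two snapshots would come from reading /proc/stat at the start and end of a collection period, for example one minute apart.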

OS Metric: CPU Load

**Table 2** CPU load metrics
Metric	Parameter	Description	Value Range	Monitored Object & Dimension	Monitoring Period (Raw Data)
load_average1	(Agent) 1-Minute Load Average	CPU load averaged over the last 1 minute Linux: The metric value is load1 in file /proc/loadavg divided by the number of logical CPUs. Run the top command to check the load1 value.	≥ 0	ECS	1 minute
load_average5	(Agent) 5-Minute Load Average	CPU load averaged over the last 5 minutes Linux: The metric value is load5 in file /proc/loadavg divided by the number of logical CPUs. Run the top command to check the load5 value.	≥ 0	ECS	1 minute
load_average15	(Agent) 15-Minute Load Average	CPU load averaged over the last 15 minutes Linux: The metric value is load15 in file /proc/loadavg divided by the number of logical CPUs. Run the top command to check the load15 value.	≥ 0	ECS	1 minute

The Windows OS does not support the CPU load metrics.
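
As a rough sketch of the Linux collection method in Table 2, the snippet below divides the three load averages in /proc/loadavg by the number of logical CPUs. Reading the table's wording as "load value divided by logical CPU count" is an interpretation, and the function name is hypothetical.

```python
import os


def load_per_cpu(loadavg_text, cpu_count=None):
    """Normalize the 1-, 5-, and 15-minute load averages by logical CPU count."""
    if cpu_count is None:
        cpu_count = os.cpu_count() or 1
    # The first three fields of /proc/loadavg are load1, load5, and load15.
    load1, load5, load15 = (float(x) for x in loadavg_text.split()[:3])
    return {
        "load_average1": load1 / cpu_count,
        "load_average5": load5 / cpu_count,
        "load_average15": load15 / cpu_count,
    }
```

Under this reading, a value near 1.0 means the run queue roughly matches the number of logical CPUs.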

OS Metric: Memory

**Table 3** Memory metrics
Metric	Parameter	Description	Value Range	Monitored Object & Dimension	Monitoring Period (Raw Data)
mem_available	(Agent) Available Memory	Amount of memory that is available and can be given instantly to processes Unit: GB Linux: Obtain the metric value from /proc/meminfo. If MemAvailable is displayed in /proc/meminfo, use that value. If MemAvailable is not displayed in /proc/meminfo, MemAvailable = MemFree + Buffers + Cached Windows: The metric value is calculated as total memory minus used memory. The values are obtained by calling the Windows API GlobalMemoryStatusEx.	≥ 0	ECS	1 minute
mem_usedPercent	(Agent) Memory Usage	Memory usage of the monitored object Unit: percent Linux: Obtain the metric value from the /proc/meminfo file: (MemTotal - MemAvailable)/MemTotal If MemAvailable is displayed in /proc/meminfo, MemUsedPercent = (MemTotal-MemAvailable)/MemTotal If MemAvailable is not displayed in /proc/meminfo, MemUsedPercent = (MemTotal – MemFree – Buffers – Cached)/MemTotal Windows: The calculation formula is as follows: Used memory size/Total memory size*100%.	0-100	ECS	1 minute
mem_free	(Agent) Idle Memory	Amount of memory that is not being used Unit: GB Linux: Obtain the metric value from /proc/meminfo. Windows is not supported currently.	≥ 0	ECS	1 minute
mem_buffers	(Agent) Buffer	Amount of memory that is being used for buffers Unit: GB Linux: Obtain the metric value from /proc/meminfo. Run the top command to check the KiB Mem:buffers value. Windows is not supported currently.	≥ 0	ECS	1 minute
mem_cached	(Agent) Cache	Amount of memory that is being used for file caches Unit: GB Linux: Obtain the metric value from /proc/meminfo. Run the top command to check the KiB Swap:cached Mem value. Windows is not supported currently.	≥ 0	ECS	1 minute
total_open_files	(Agent) Total File Handles	Total handles used by all processes Unit: count Linux: Use the /proc/{pid}/fd file to summarize the handles used by all processes. Windows is not supported currently.	≥ 0	ECS	1 minute
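
The MemAvailable fallback and the memory usage formula from Table 3 can be illustrated as follows. This is a sketch only: the function names are hypothetical, and values stay in KiB as reported by /proc/meminfo rather than being converted to GB.

```python
def parse_meminfo(text):
    """Map /proc/meminfo keys to their integer values (KiB for sized entries)."""
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts:
            info[key.strip()] = int(parts[0])
    return info


def mem_available_kib(info):
    """Use MemAvailable when present, else MemFree + Buffers + Cached."""
    if "MemAvailable" in info:
        return info["MemAvailable"]
    return info["MemFree"] + info["Buffers"] + info["Cached"]


def mem_used_percent(info):
    """(MemTotal - MemAvailable) / MemTotal, expressed as a percentage."""
    total = info["MemTotal"]
    return 100.0 * (total - mem_available_kib(info)) / total
```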

OS Metric: Disk

Currently, only physical disks are monitored. The NFS-attached disks cannot be monitored.
By default, Docker-related mount points are excluded from monitoring. The excluded mount point prefixes are as follows:
```
/var/lib/docker;/mnt/paas/kubernetes;/var/lib/mesos
```
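
A simple way to picture this filtering is a plain string-prefix check against the semicolon-separated list. The sketch below assumes that matching rule; the source does not state how the Agent compares paths, and the names used here are hypothetical.

```python
DEFAULT_EXCLUDED_PREFIXES = "/var/lib/docker;/mnt/paas/kubernetes;/var/lib/mesos"


def is_excluded(mount_point, prefixes=DEFAULT_EXCLUDED_PREFIXES):
    """Return True if the mount point starts with any excluded prefix."""
    # Assumption: plain prefix matching on the path string.
    return any(mount_point.startswith(p) for p in prefixes.split(";") if p)
```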

**Table 4** Disk metrics
Metric	Parameter	Description	Value Range	Monitored Object & Dimension	Monitoring Period (Raw Data)
disk_free	(Agent) Available Disk Space	Free space on the disks Unit: GB Linux: Run the df -h command to check the value in the Avail column. The path of the mount point prefix cannot exceed 64 characters. It must start with a letter, and contain only digits, letters, hyphens (-), periods (.), and swung dashes (~). Windows: Use the WMI interface to call GetDiskFreeSpaceExW API to obtain disk space data. The path of the mount point prefix cannot exceed 64 characters. It must start with a letter, and contain only digits, letters, hyphens (-), periods (.), and swung dashes (~).	≥ 0	ECS - Mount point	1 minute
disk_total	(Agent) Disk Storage Capacity	Total space on the disks, including used and free Unit: GB Linux: Run the df -h command to check the value in the Size column. The path of the mount point prefix cannot exceed 64 characters. It must start with a letter, and contain only digits, letters, hyphens (-), periods (.), and swung dashes (~). Windows: Use the WMI interface to call GetDiskFreeSpaceExW API to obtain disk space data. The path of the mount point prefix cannot exceed 64 characters. It must start with a letter, and contain only digits, letters, hyphens (-), periods (.), and swung dashes (~).	≥ 0	ECS - Mount point	1 minute
disk_used	(Agent) Used Disk Space	Used space on the disks Unit: GB Linux: Run the df -h command to check the value in the Used column. The path of the mount point prefix cannot exceed 64 characters. It must start with a letter, and contain only digits, letters, hyphens (-), periods (.), and swung dashes (~). Windows: Use the WMI interface to call GetDiskFreeSpaceExW API to obtain disk space data. The path of the mount point prefix cannot exceed 64 characters. It must start with a letter, and contain only digits, letters, hyphens (-), periods (.), and swung dashes (~).	≥ 0	ECS - Mount point	1 minute
disk_usedPercent	(Agent) Disk Usage	Percentage of total disk space that is used, which is calculated as follows: Disk Usage = Used Disk Space/Disk Storage Capacity Unit: percent Linux: It is calculated as follows: Used/Size. The path of the mount point prefix cannot exceed 64 characters. It must start with a letter, and contain only digits, letters, hyphens (-), periods (.), and swung dashes (~). Windows: Use the WMI interface to call GetDiskFreeSpaceExW API to obtain disk space data. The path of the mount point prefix cannot exceed 64 characters. It must start with a letter, and contain only digits, letters, hyphens (-), periods (.), and swung dashes (~).	0-100	ECS - Mount point	1 minute
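
The relationship between the four metrics in Table 4 can be sketched with Python's standard library. This is not how the Agent collects data: `shutil.disk_usage` is swapped in for `df` and `GetDiskFreeSpaceExW`, the function name is hypothetical, and treating GB as 1024^3 bytes is an assumption.

```python
import shutil


def disk_metrics(mount_point):
    """Return size, used, and free space in GB plus usage as a percentage."""
    usage = shutil.disk_usage(mount_point)
    gib = 1024 ** 3  # assumption: GB is interpreted as 2**30 bytes
    return {
        "disk_total": usage.total / gib,
        "disk_used": usage.used / gib,
        "disk_free": usage.free / gib,
        # Disk Usage = Used Disk Space / Disk Storage Capacity
        "disk_usedPercent": 100.0 * usage.used / usage.total if usage.total else 0.0,
    }
```

Note that `df` may print a slightly different Use% value than Used/Size, because it accounts for blocks reserved for the root user.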

OS Metric: Disk I/O

**Table 5** Disk I/O metrics
Metric	Parameter	Description	Value Range	Monitored Object & Dimension	Monitoring Period (Raw Data)
disk_agt_read_bytes_rate	(Agent) Disks Read Rate	Number of bytes read from the monitored disk per second Unit: byte/s Linux: The disk read rate is calculated based on the data changes in the sixth column of the corresponding device in file /proc/diskstats in a collection period. The path of the mount point prefix cannot exceed 64 characters. It must start with a letter, and contain only digits, letters, hyphens (-), periods (.), and swung dashes (~). Windows: Use Win32_PerfFormattedData_PerfDisk_LogicalDisk object in the WMI to obtain disk I/O data. The path of the mount point prefix cannot exceed 64 characters. It must start with a letter, and contain only digits, letters, hyphens (-), periods (.), and swung dashes (~). When the CPU usage is high, monitoring data obtaining timeout may occur and result in the failure of obtaining monitoring data.	≥ 0 bytes/s	ECS - Disk ECS - Mount point	1 minute
disk_agt_read_requests_rate	(Agent) Disks Read Requests	Number of read requests sent to the monitored disk per second Unit: request/s Linux: The disk read requests are calculated based on the data changes in the fourth column of the corresponding device in file /proc/diskstats in a collection period. The path of the mount point prefix cannot exceed 64 characters. It must start with a letter, and contain only digits, letters, hyphens (-), periods (.), and swung dashes (~). Windows: Use Win32_PerfFormattedData_PerfDisk_LogicalDisk object in the WMI to obtain disk I/O data. The path of the mount point prefix cannot exceed 64 characters. It must start with a letter, and contain only digits, letters, hyphens (-), periods (.), and swung dashes (~). When the CPU usage is high, monitoring data obtaining timeout may occur and result in the failure of obtaining monitoring data.	≥ 0 requests/s	ECS - Disk ECS - Mount point	1 minute
disk_agt_write_bytes_rate	(Agent) Disks Write Rate	Number of bytes written to the monitored disk per second Unit: byte/s Linux: The disk write rate is calculated based on the data changes in the tenth column of the corresponding device in file /proc/diskstats in a collection period. The path of the mount point prefix cannot exceed 64 characters. It must start with a letter, and contain only digits, letters, hyphens (-), periods (.), and swung dashes (~). Windows: Use Win32_PerfFormattedData_PerfDisk_LogicalDisk object in the WMI to obtain disk I/O data. The path of the mount point prefix cannot exceed 64 characters. It must start with a letter, and contain only digits, letters, hyphens (-), periods (.), and swung dashes (~). When the CPU usage is high, monitoring data obtaining timeout may occur and result in the failure of obtaining monitoring data.	≥ 0 bytes/s	ECS - Disk ECS - Mount point	1 minute
disk_agt_write_requests_rate	(Agent) Disks Write Requests	Number of write requests sent to the monitored disk per second Unit: request/s Linux: The disk write requests are calculated based on the data changes in the eighth column of the corresponding device in file /proc/diskstats in a collection period. The path of the mount point prefix cannot exceed 64 characters. It must start with a letter, and contain only digits, letters, hyphens (-), periods (.), and swung dashes (~). Windows: Use Win32_PerfFormattedData_PerfDisk_LogicalDisk object in the WMI to obtain disk I/O data. The path of the mount point prefix cannot exceed 64 characters. It must start with a letter, and contain only digits, letters, hyphens (-), periods (.), and swung dashes (~). When the CPU usage is high, monitoring data obtaining timeout may occur and result in the failure of obtaining monitoring data.	≥ 0 requests/s	ECS - Disk ECS - Mount point	1 minute
disk_readTime	(Agent) Average Read Request Time	Average amount of time that read requests have waited on the disks Unit: ms/count Linux: The average read request time is calculated based on the data changes in the seventh column of the corresponding device in file /proc/diskstats in a collection period. The path of the mount point prefix cannot exceed 64 characters. It must start with a letter, and contain only digits, letters, hyphens (-), periods (.), and swung dashes (~). Windows is not supported currently.	≥ 0 ms/Count	ECS - Disk ECS - Mount point	1 minute
disk_writeTime	(Agent) Average Write Request Time	Average amount of time that write requests have waited on the disks Unit: ms/count Linux: The average write request time is calculated based on the data changes in the eleventh column of the corresponding device in file /proc/diskstats in a collection period. The path of the mount point prefix cannot exceed 64 characters. It must start with a letter, and contain only digits, letters, hyphens (-), periods (.), and swung dashes (~). Windows is not supported currently.	≥ 0 ms/Count	ECS - Disk ECS - Mount point	1 minute
disk_ioUtils	(Agent) Disk I/O Usage	Percentage of the time that the disk has had I/O requests queued to the total disk operation time Unit: percent Linux: The disk I/O usage is calculated based on the data changes in the thirteenth column of the corresponding device in file /proc/diskstats in a collection period. The path of the mount point prefix cannot exceed 64 characters. It must start with a letter, and contain only digits, letters, hyphens (-), periods (.), and swung dashes (~). Windows is not supported currently.	0-100	ECS - Disk ECS - Mount point	1 minute
disk_queue_length	(Agent) Disk Queue Length	Average number of read or write requests queued up for completion for the monitored disk in the monitoring period Unit: count Linux: The average disk queue length is calculated based on the data changes in the fourteenth column of the corresponding device in file /proc/diskstats in a collection period. The path of the mount point prefix cannot exceed 64 characters. It must start with a letter, and contain only digits, letters, hyphens (-), periods (.), and swung dashes (~). Windows is not supported currently.	≥ 0	ECS - Disk ECS - Mount point	1 minute
disk_write_bytes_per_operation	(Agent) Average Disk Write Size	Average number of bytes in an I/O write for the monitored disk in the monitoring period Unit: byte/op Linux: The average disk write size is calculated based on the data changes in the tenth column of the corresponding device to divide that of the eighth column in file /proc/diskstats in a collection period. The path of the mount point prefix cannot exceed 64 characters. It must start with a letter, and contain only digits, letters, hyphens (-), periods (.), and swung dashes (~). Windows is not supported currently.	≥ 0 bytes/op	ECS - Disk ECS - Mount point	1 minute
disk_read_bytes_per_operation	(Agent) Average Disk Read Size	Average number of bytes in an I/O read for the monitored disk in the monitoring period Unit: byte/op Linux: The average disk read size is calculated based on the data changes in the sixth column of the corresponding device to divide that of the fourth column in file /proc/diskstats in a collection period. The path of the mount point prefix cannot exceed 64 characters. It must start with a letter, and contain only digits, letters, hyphens (-), periods (.), and swung dashes (~). Windows is not supported currently.	≥ 0 bytes/op	ECS - Disk ECS - Mount point	1 minute
disk_io_svctm	(Agent) Disk I/O Service Time	Average time in an I/O read or write for the monitored disk in the monitoring period Unit: ms/op Linux: The average disk I/O service time is calculated based on the data changes in the thirteenth column of the corresponding device to divide the sum of data changes in the fourth and eighth columns in file /proc/diskstats in a collection period. The path of the mount point prefix cannot exceed 64 characters. It must start with a letter, and contain only digits, letters, hyphens (-), periods (.), and swung dashes (~). Windows is not supported currently.	≥ 0	ECS - Disk ECS - Mount point	1 minute
disk_device_used_percent	Block Device Usage	Percentage of the physical disk usage of the monitored object. Calculation formula: Used storage space of all mounted disk partitions/Total disk storage space Collection method for Linux ECSs: Obtain the disk usage of each mount point, calculate the total disk storage space based on the disk sector size and the number of sectors, and then you can calculate the used storage space in total. Windows ECSs do not support this metric.	0-100	ECS - Disk	1 minute
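
The column-based calculations in Table 5 can be illustrated with two snapshots of one device line from /proc/diskstats. This is a sketch under assumptions: the function name is hypothetical, and sector counts are converted using 512 bytes per sector, the unit the kernel uses in this file.

```python
SECTOR_BYTES = 512  # /proc/diskstats reports sectors in 512-byte units


def disk_io_rates(before_cols, after_cols, interval_s):
    """Derive I/O metrics from two whitespace-split /proc/diskstats lines."""

    def delta(column):
        # 'column' is 1-based, matching the column numbers used in the table.
        return int(after_cols[column - 1]) - int(before_cols[column - 1])

    reads, writes = delta(4), delta(8)
    return {
        "disk_agt_read_bytes_rate": delta(6) * SECTOR_BYTES / interval_s,
        "disk_agt_read_requests_rate": reads / interval_s,
        "disk_agt_write_bytes_rate": delta(10) * SECTOR_BYTES / interval_s,
        "disk_agt_write_requests_rate": writes / interval_s,
        "disk_readTime": delta(7) / reads if reads else 0.0,
        "disk_writeTime": delta(11) / writes if writes else 0.0,
        # Column 13 is milliseconds spent doing I/O during the period.
        "disk_ioUtils": 100.0 * delta(13) / (interval_s * 1000.0),
    }
```

Each argument is the result of calling `split()` on the line for the same device, taken at the start and end of the collection period.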

OS Metric: File System

**Table 6** File system metrics
Metric	Parameter	Description	Value Range	Monitored Object & Dimension	Monitoring Period (Raw Data)
disk_fs_rwstate	(Agent) File System Read/Write Status	Read and write status of the mounted file system of the monitored object Possible values are 0 (read and write) and 1 (read only). Linux: Check file system information in the fourth column in file /proc/mounts.	0: readable and writable 1: read-only	ECS - Mount point	1 minute
disk_inodesTotal	(Agent) Disk inode Total	Total number of index nodes on the disk Linux: Run the df -i command to check the value in the Inodes column. The path of the mount point prefix cannot exceed 64 characters. It must start with a letter, and contain only digits, letters, hyphens (-), periods (.), and swung dashes (~).	≥ 0	ECS - Mount point	1 minute
disk_inodesUsed	(Agent) Total inode Used	Number of used index nodes on the disk Linux: Run the df -i command to check the value in the IUsed column. The path of the mount point prefix cannot exceed 64 characters. It must start with a letter, and contain only digits, letters, hyphens (-), periods (.), and swung dashes (~).	≥ 0	ECS - Mount point	1 minute
disk_inodesUsedPercent	(Agent) Percentage of Total inode Used	Percentage of used index nodes on the disk Unit: percent Linux: Run the df -i command to check the value in the IUse% column. The path of the mount point prefix cannot exceed 64 characters. It must start with a letter, and contain only digits, letters, hyphens (-), periods (.), and swung dashes (~).	0-100	ECS - Mount point	1 minute

The Windows OS does not support the file system metrics.
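
The read/write status check against the fourth column of /proc/mounts can be sketched as below. The function name is hypothetical, and deciding the state by looking for the `ro` mount option is an assumption about how that column is interpreted.

```python
def fs_rwstate(mounts_text):
    """Map each mount point to 0 (read and write) or 1 (read only)."""
    states = {}
    for line in mounts_text.splitlines():
        cols = line.split()
        if len(cols) >= 4:
            # Column 4 holds comma-separated mount options, e.g. "rw,relatime".
            options = cols[3].split(",")
            states[cols[1]] = 1 if "ro" in options else 0
    return states
```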

OS Metric: NIC

**Table 7** NIC metrics
Metric	Parameter	Description	Value Range	Monitored Object & Dimension	Monitoring Period (Raw Data)
net_bitRecv	(Agent) Outbound Bandwidth	Number of bits sent by this NIC per second Unit: bit/s Linux: Check metric value changes in file /proc/net/dev in a collection period. Windows: Use the MibIfRow object in the WMI to obtain network metric data.	≥ 0 bit/s	ECS	1 minute
net_bitSent	(Agent) Inbound Bandwidth	Number of bits received by this NIC per second Unit: bit/s Linux: Check metric value changes in file /proc/net/dev in a collection period. Windows: Use the MibIfRow object in the WMI to obtain network metric data.	≥ 0 bit/s	ECS	1 minute
net_packetRecv	(Agent) NIC Packet Receive Rate	Number of packets received by this NIC per second Unit: count/s Linux: Check metric value changes in file /proc/net/dev in a collection period. Windows: Use the MibIfRow object in the WMI to obtain network metric data.	≥ 0 Counts/s	ECS	1 minute
net_packetSent	(Agent) NIC Packet Send Rate	Number of packets sent by this NIC per second Unit: count/s Linux: Check metric value changes in file /proc/net/dev in a collection period. Windows: Use the MibIfRow object in the WMI to obtain network metric data.	≥ 0 Counts/s	ECS	1 minute
net_errin	(Agent) Receive Error Rate	Percentage of receive errors detected by this NIC per second Unit: percent Linux: Check metric value changes in file /proc/net/dev in a collection period. Windows is not supported currently.	0-100	ECS	1 minute
net_errout	(Agent) Transmit Error Rate	Percentage of transmit errors detected by this NIC per second Unit: percent Linux: Check metric value changes in file /proc/net/dev in a collection period. Windows is not supported currently.	0-100	ECS	1 minute
net_dropin	(Agent) Received Packet Drop Rate	Percentage of packets received by this NIC which were dropped per second Unit: percent Linux: Check metric value changes in file /proc/net/dev in a collection period. Windows is not supported currently.	0-100	ECS	1 minute
net_dropout	(Agent) Transmitted Packet Drop Rate	Percentage of packets transmitted by this NIC which were dropped per second Unit: percent Linux: Check metric value changes in file /proc/net/dev in a collection period. Windows is not supported currently.	0-100	ECS	1 minute
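
The /proc/net/dev-based rates in Table 7 come from counter differences over a collection period. The sketch below shows the bandwidth and packet-rate arithmetic only; the function names and the neutral `rx_`/`tx_` result keys are hypothetical and are not the metric IDs from the table.

```python
def parse_net_dev(text):
    """Return per-interface byte and packet counters from /proc/net/dev text."""
    counters = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        values = rest.split()
        # Data lines have 8 receive fields followed by 8 transmit fields.
        if sep and len(values) >= 16:
            nums = [int(v) for v in values[:16]]
            counters[name.strip()] = {
                "rx_bytes": nums[0], "rx_packets": nums[1],
                "tx_bytes": nums[8], "tx_packets": nums[9],
            }
    return counters


def nic_rates(before, after, interval_s):
    """Bits per second and packets per second for one interface."""
    return {
        "rx_bit_per_s": (after["rx_bytes"] - before["rx_bytes"]) * 8 / interval_s,
        "tx_bit_per_s": (after["tx_bytes"] - before["tx_bytes"]) * 8 / interval_s,
        "rx_packets_per_s": (after["rx_packets"] - before["rx_packets"]) / interval_s,
        "tx_packets_per_s": (after["tx_packets"] - before["tx_packets"]) / interval_s,
    }
```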

OS Metric: NTP

**Table 8** NTP metrics
Metric	Parameter	Description	Value Range	Monitored Object & Dimension	Monitoring Period (Raw Data)
ntp_offset	(Agent) NTP Offset	NTP offset of the monitored object Unit: ms Collection method for Linux ECSs: Run chronyc sources -v to obtain the offset.	≥ 0 ms	ECS	1 minute

OS Metric: TCP

**Table 9** TCP metrics
Metric	Parameter	Description	Value Range	Monitored Object & Dimension	Monitoring Period (Raw Data)
net_tcp_total	(Agent) TCP TOTAL	Total number of TCP connections in all states Unit: count Linux: Obtain TCP connections in all states from the /proc/net/tcp file, and then collect the number of connections in each state. Windows: Obtain the metric value using the Windows API GetTcpTable2.	≥ 0	ECS	1 minute
net_tcp_established	(Agent) TCP ESTABLISHED	Number of TCP connections in the ESTABLISHED state Unit: count Linux: Obtain TCP connections in all states from the /proc/net/tcp file, and then collect the number of connections in each state. Windows: Obtain the metric value using the Windows API GetTcpTable2.	≥ 0	ECS	1 minute
net_tcp_sys_sent	(Agent) TCP SYS_SENT	Number of TCP connections that are being requested by the client Unit: count Linux: Obtain TCP connections in all states from the /proc/net/tcp file, and then collect the number of connections in each state. Windows: Obtain the metric value using the Windows API GetTcpTable2.	≥ 0	ECS	1 minute
net_tcp_sys_recv	(Agent) TCP SYS_RECV	Number of pending TCP connections received by the server Unit: count Linux: Obtain TCP connections in all states from the /proc/net/tcp file, and then collect the number of connections in each state. Windows: Obtain the metric value using the Windows API GetTcpTable2.	≥ 0	ECS	1 minute
net_tcp_fin_wait1	(Agent) TCP FIN_WAIT1	Number of TCP connections waiting for ACK packets when the connections are being actively closed by the client Unit: count Linux: Obtain TCP connections in all states from the /proc/net/tcp file, and then collect the number of connections in each state. Windows: Obtain the metric value using the Windows API GetTcpTable2.	≥ 0	ECS	1 minute
net_tcp_fin_wait2	(Agent) TCP FIN_WAIT2	Number of TCP connections in the FIN_WAIT2 state Unit: count Linux: Obtain TCP connections in all states from the /proc/net/tcp file, and then collect the number of connections in each state. Windows: Obtain the metric value using the Windows API GetTcpTable2.	≥ 0	ECS	1 minute
net_tcp_time_wait	(Agent) TCP TIME_WAIT	Number of TCP connections in the TIME_WAIT state Unit: count Linux: Obtain TCP connections in all states from the /proc/net/tcp file, and then collect the number of connections in each state. Windows: Obtain the metric value using the Windows API GetTcpTable2.	≥ 0	ECS	1 minute
net_tcp_close	(Agent) TCP CLOSE	Number of closed TCP connections Unit: count Linux: Obtain TCP connections in all states from the /proc/net/tcp file, and then collect the number of connections in each state. Windows: Obtain the metric value using the Windows API GetTcpTable2.	≥ 0	ECS	1 minute
net_tcp_close_wait	(Agent) TCP CLOSE_WAIT	Number of TCP connections in the CLOSE_WAIT state Unit: count Linux: Obtain TCP connections in all states from the /proc/net/tcp file, and then collect the number of connections in each state. Windows: Obtain the metric value using the Windows API GetTcpTable2.	≥ 0	ECS	1 minute
net_tcp_last_ack	(Agent) TCP LAST_ACK	Number of TCP connections waiting for ACK packets when the connections are being passively closed by the client Unit: count Linux: Obtain TCP connections in all states from the /proc/net/tcp file, and then collect the number of connections in each state. Windows: Obtain the metric value using the Windows API GetTcpTable2.	≥ 0	ECS	1 minute
net_tcp_listen	(Agent) TCP LISTEN	Number of TCP connections in the LISTEN state Unit: count Linux: Obtain TCP connections in all states from the /proc/net/tcp file, and then collect the number of connections in each state. Windows: Obtain the metric value using the Windows API GetTcpTable2.	≥ 0	ECS	1 minute
net_tcp_closing	(Agent) TCP CLOSING	Number of TCP connections to be automatically closed by the server and the client at the same time Unit: count Linux: Obtain TCP connections in all states from the /proc/net/tcp file, and then collect the number of connections in each state. Windows: Obtain the metric value using the Windows API GetTcpTable2.	≥ 0	ECS	1 minute
net_tcp_retrans	(Agent) TCP Retransmission Rate	Percentage of packets that are resent Unit: percent Linux: Obtain the metric value from the /proc/net/snmp file. The value is the ratio of the number of retransmitted packets to the number of sent packets in a collection period. Windows: Obtain the metric value using the Windows API GetTcpStatistics.	0-100	ECS	1 minute
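
The Linux collection methods in Table 9 can be sketched as follows: connection states are counted from the `st` column of /proc/net/tcp, and a retransmission percentage is derived from the `Tcp:` counters in /proc/net/snmp. The function names are hypothetical, the sketch assumes the rate is retransmitted segments divided by sent segments within the period, and it covers only the IPv4 table.

```python
from collections import Counter

# Kernel TCP state codes as they appear (in hex) in the 'st' column.
TCP_STATES = {
    "01": "ESTABLISHED", "02": "SYN_SENT", "03": "SYN_RECV",
    "04": "FIN_WAIT1", "05": "FIN_WAIT2", "06": "TIME_WAIT",
    "07": "CLOSE", "08": "CLOSE_WAIT", "09": "LAST_ACK",
    "0A": "LISTEN", "0B": "CLOSING",
}


def count_tcp_states(proc_net_tcp_text):
    """Count connections per state; the first line of the file is a header."""
    counts = Counter()
    for line in proc_net_tcp_text.splitlines()[1:]:
        cols = line.split()
        if len(cols) >= 4:
            counts[TCP_STATES.get(cols[3].upper(), "UNKNOWN")] += 1
    return counts


def parse_snmp_tcp(snmp_text):
    """Pair the 'Tcp:' header line with the 'Tcp:' value line."""
    rows = [l.split()[1:] for l in snmp_text.splitlines() if l.startswith("Tcp:")]
    header, values = rows[0], rows[1]
    return dict(zip(header, (int(v) for v in values)))


def retrans_percent(before, after):
    """Retransmitted segments as a percentage of sent segments in the period."""
    sent = after["OutSegs"] - before["OutSegs"]
    resent = after["RetransSegs"] - before["RetransSegs"]
    return 100.0 * resent / sent if sent > 0 else 0.0
```

The total connection count is then `sum(count_tcp_states(text).values())`.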

OS Metric: GPU

**Table 10** GPU metrics
Metric	Parameter	Description	Value Range	Monitored Object & Dimension	Monitoring Period (Raw Data)
gpu_status	GPU Health Status	Overall measurement of the GPU health Unit: none Linux: Obtain the metric value using the libnvidia-ml.so.1 library file of the graphics card. Windows: Obtain the metric value using the nvml.dll library of the graphics card.	0: The GPU is healthy. 1: The GPU is subhealthy. 2: The GPU is faulty.	ECS ECS - GPU	1 minute
gpu_usage_encoder	Encoding Usage	Encoding capability usage on the GPU Unit: percent Linux: Obtain the metric value using the libnvidia-ml.so.1 library file of the graphics card. Windows: Obtain the metric value using the nvml.dll library of the graphics card.	0-100	ECS ECS - GPU	1 minute
gpu_usage_decoder	Decoding Usage	Decoding capability usage on the GPU Unit: percent Linux: Obtain the metric value using the libnvidia-ml.so.1 library file of the graphics card. Windows: Obtain the metric value using the nvml.dll library of the graphics card.	0-100	ECS ECS - GPU	1 minute
gpu_volatile_correctable	Volatile Correctable ECC Errors	Number of correctable ECC errors since the GPU was last reset. The value is reset to 0 each time the GPU is reset. Unit: count Linux: Obtain the metric value using the libnvidia-ml.so.1 library file of the graphics card. Windows: Obtain the metric value using the nvml.dll library of the graphics card.	≥ 0	ECS ECS - GPU	1 minute
gpu_volatile_uncorrectable	Volatile Uncorrectable ECC Errors	Number of uncorrectable ECC errors since the GPU was last reset. The value is reset to 0 each time the GPU is reset. Unit: count Linux: Obtain the metric value using the libnvidia-ml.so.1 library file of the graphics card. Windows: Obtain the metric value using the nvml.dll library of the graphics card.	≥ 0	ECS ECS - GPU	1 minute
gpu_aggregate_correctable	Aggregate Correctable ECC Errors	Aggregate correctable ECC errors on the GPU Unit: count Linux: Obtain the metric value using the libnvidia-ml.so.1 library file of the graphics card. Windows: Obtain the metric value using the nvml.dll library of the graphics card.	≥ 0	ECS ECS - GPU	1 minute
gpu_aggregate_uncorrectable	Aggregate Uncorrectable ECC Errors	Aggregate uncorrectable ECC Errors on the GPU Unit: count Linux: Obtain the metric value using the libnvidia-ml.so.1 library file of the graphics card. Windows: Obtain the metric value using the nvml.dll library of the graphics card.	≥ 0	ECS ECS - GPU	1 minute
gpu_retired_page_single_bit	Retired Page Single Bit Errors	Number of retired page single bit errors, which indicates the number of single-bit pages blocked by the graphics card Unit: count Linux: Obtain the metric value using the libnvidia-ml.so.1 library file of the graphics card. Windows: Obtain the metric value using the nvml.dll library of the graphics card.	≥ 0	ECS ECS - GPU	1 minute
gpu_retired_page_double_bit	Retired Page Double Bit Errors	Number of retired page double bit errors, which indicates the number of double-bit pages blocked by the graphics card Unit: count Linux: Obtain the metric value using the libnvidia-ml.so.1 library file of the graphics card. Windows: Obtain the metric value using the nvml.dll library of the graphics card.	≥ 0	ECS ECS - GPU	1 minute
gpu_performance_state	(Agent) Performance Status	GPU performance of the monitored object Unit: none Linux: Obtain the metric value using the libnvidia-ml.so.1 library file of the graphics card. Windows: Obtain the metric value using the nvml.dll library of the graphics card.	P0-P15, P32 P0: indicates the maximum performance status. P15: indicates the minimum performance status. P32: indicates the unknown performance status.	ECS ECS - GPU	1 minute
gpu_usage_mem	(Agent) GPU Memory Usage	GPU memory usage of the monitored object Unit: percent Linux: Obtain the metric value using the libnvidia-ml.so.1 library file of the graphics card. Windows: Obtain the metric value using the nvml.dll library of the graphics card.	0-100	ECS ECS - GPU	1 minute
gpu_usage_gpu	(Agent) GPU Usage	GPU usage of the monitored object Unit: percent Linux: Obtain the metric value using the libnvidia-ml.so.1 library file of the graphics card. Windows: Obtain the metric value using the nvml.dll library of the graphics card.	0-100	ECS ECS - GPU	1 minute
gpu_free_mem	GPU Free Memory	Free Memory on the GPU Unit: MB Linux: Obtain the metric value using the libnvidia-ml.so.1 library file of the graphics card. Windows: Obtain the metric value using the nvml.dll library of the graphics card.	≥ 0 MB	ECS ECS - GPU	1 minute
gpu_graphics_clocks	GPU Graphics Clocks	Current Graphics Clocks on the GPU Unit: MHz Linux: Obtain the metric value using the libnvidia-ml.so.1 library file of the graphics card. Windows: Obtain the metric value using the nvml.dll library of the graphics card.	≥ 0 MHz	ECS ECS - GPU	1 minute
gpu_mem_clocks	GPU Memory Clocks	Current Memory Clocks on the GPU Unit: MHz Linux: Obtain the metric value using the libnvidia-ml.so.1 library file of the graphics card. Windows: Obtain the metric value using the nvml.dll library of the graphics card.	≥ 0 MHz	ECS ECS - GPU	1 minute
gpu_power_draw	GPU Draw Power	Draw Power on the GPU Unit: W Linux: Obtain the metric value using the libnvidia-ml.so.1 library file of the graphics card. Windows: Obtain the metric value using the nvml.dll library of the graphics card.	NA	ECS ECS - GPU	1 minute
gpu_rx_throughput_pci	GPU PCI Rx Throughput	Current PCI Rx Throughput on the GPU Unit: MByte/s Linux: Obtain the metric value using the libnvidia-ml.so.1 library file of the graphics card. Windows: Obtain the metric value using the nvml.dll library of the graphics card.	≥ 0 MByte/s	ECS ECS - GPU	1 minute
gpu_sm_clocks	GPU SM Clocks	Current SM Clocks on the GPU Unit: MHz Linux: Obtain the metric value using the libnvidia-ml.so.1 library file of the graphics card. Windows: Obtain the metric value using the nvml.dll library of the graphics card.	≥ 0 MHz	ECS ECS - GPU	1 minute
gpu_temperature	GPU Temperature	Current Temperature on the GPU Unit: °C Linux: Obtain the metric value using the libnvidia-ml.so.1 library file of the graphics card. Windows: Obtain the metric value using the nvml.dll library of the graphics card.	≥ 0 °C	ECS ECS - GPU	1 minute
gpu_tx_throughput_pci	GPU PCI Tx Throughput	Current PCI Tx Throughput on the GPU Unit: MByte/s Linux: Obtain the metric value using the libnvidia-ml.so.1 library file of the graphics card. Windows: Obtain the metric value using the nvml.dll library of the graphics card.	≥ 0 MByte/s	ECS ECS - GPU	1 minute
gpu_used_mem	GPU Used Memory	Memory Used on the GPU Unit: MB Linux: Obtain the metric value using the libnvidia-ml.so.1 library file of the graphics card. Windows: Obtain the metric value using the nvml.dll library of the graphics card.	≥ 0 MB	ECS ECS - GPU	1 minute
gpu_video_clocks	GPU Video Clocks	Current Video Clocks on the GPU Unit: MHz Linux: Obtain the metric value using the libnvidia-ml.so.1 library file of the graphics card. Windows: Obtain the metric value using the nvml.dll library of the graphics card.	≥ 0 MHz	ECS ECS - GPU	1 minute
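
The value range of gpu_performance_state (P0 to P15, plus P32 for unknown) can be turned into a readable label with a small helper. The following Python sketch is illustrative only: the function name describe_performance_state is hypothetical, and it assumes the value is available as a string such as "P0", which this section does not confirm.

```python
import re

# Matches values such as "P0", "P15", or "P32".
_STATE_RE = re.compile(r"^P(\d+)$")


def describe_performance_state(state: str) -> str:
    """Return a readable label for a gpu_performance_state value.

    Assumption: the value is a string like "P0". The ranges come from
    Table 10: P0 = maximum, P15 = minimum, P32 = unknown.
    """
    match = _STATE_RE.match(state.strip().upper())
    if match is None:
        raise ValueError(f"not a performance state: {state!r}")
    level = int(match.group(1))
    if level == 32:
        return "unknown"
    if not 0 <= level <= 15:
        raise ValueError(f"performance state out of range: {state!r}")
    if level == 0:
        return "maximum performance"
    if level == 15:
        return "minimum performance"
    return f"intermediate performance (level {level})"


if __name__ == "__main__":
    for sample in ("P0", "P8", "P15", "P32"):
        print(sample, "->", describe_performance_state(sample))
```

Values outside the documented range raise ValueError rather than being silently mapped, so unexpected data is easy to spot.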

OS Metrics: NPU

**Table 11** NPU metrics
Metric	Parameter	Description	Value Range	Monitored Object & Dimension	Monitoring Period (Raw Data)
npu_device_health	NPU Device Health	An overall measurement of the NPU health Unit: none Linux: Obtain the metric value from the libdcmi.so library file of the NPU card.	0: healthy 1: minor alarms 2: major alarms 3: critical alarms	ECS ECS - NPU	1 minute
npu_util_rate_mem	NPU Util Rate Mem	The utilization rate of the NPU memory Unit: percent Linux: Obtain the metric value from the libdcmi.so library file of the NPU card.	0-100	ECS ECS - NPU	1 minute
npu_util_rate_ai_core	NPU Util Rate AI Core	The utilization rate of the NPU AI Core Unit: percent Linux: Obtain the metric value from the libdcmi.so library file of the NPU card.	0-100	ECS ECS - NPU	1 minute
npu_util_rate_ai_cpu	NPU Util Rate AI Cpu	The utilization rate of the NPU's AI CPU Unit: percent Linux: Obtain the metric value from the libdcmi.so library file of the NPU card.	0-100	ECS ECS - NPU	1 minute
npu_util_rate_ctrl_cpu	NPU Util Rate Ctrl CPU	The utilization rate of the NPU's Control CPU Unit: percent Linux: Obtain the metric value from the libdcmi.so library file of the NPU card.	0-100	ECS ECS - NPU	1 minute
npu_util_rate_mem_bandwidth	NPU Util Rate Mem Bandwidth	The utilization rate of the NPU memory bandwidth Unit: percent Linux: Obtain the metric value from the libdcmi.so library file of the NPU card.	0-100	ECS ECS - NPU	1 minute
npu_freq_mem	NPU Freq Mem	Current clock frequency of the NPU memory Unit: MHz Linux: Obtain the metric value from the libdcmi.so library file of the NPU card.	≥ 0	ECS ECS - NPU	1 minute
npu_freq_ai_core	NPU Freq AI Core	Current clock frequency of the NPU's AI Core Unit: MHz Linux: Obtain the metric value from the libdcmi.so library file of the NPU card.	≥ 0	ECS ECS - NPU	1 minute
npu_usage_mem	NPU Usage Mem	Current used NPU memory Unit: MB Linux: Obtain the metric value from the libdcmi.so library file of the NPU card.	≥ 0	ECS ECS - NPU	1 minute
npu_sbe	NPU SBE	Number of single-bit errors on the NPU Unit: count Linux: Obtain the metric value from the libdcmi.so library file of the NPU card.	≥ 0	ECS ECS - NPU	1 minute
npu_dbe	NPU DBE	Number of double-bit errors on the NPU Unit: count Linux: Obtain the metric value from the libdcmi.so library file of the NPU card.	≥ 0	ECS ECS - NPU	1 minute
npu_power	NPU Power	The power of the NPU (current power for 310P, rated power for 310) Unit: W Linux: Obtain the metric value from the libdcmi.so library file of the NPU card.	≥ 0	ECS ECS - NPU	1 minute
npu_temperature	NPU temperature	Current temperature of the NPU Unit: °C Linux: Obtain the metric value from the libdcmi.so library file of the NPU card.	≥ 0	ECS ECS - NPU	1 minute

The Windows OS does not support NPU metrics.
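
The npu_device_health codes in Table 11 can be decoded as follows. This is a minimal Python sketch: the names NPU_HEALTH_LABELS, npu_health_label, and needs_attention are hypothetical, and the sketch assumes the metric value has already been retrieved as an integer.

```python
# Code-to-label mapping taken from the Value Range column of Table 11.
NPU_HEALTH_LABELS = {
    0: "healthy",
    1: "minor alarms",
    2: "major alarms",
    3: "critical alarms",
}


def npu_health_label(value: int) -> str:
    """Return the documented label for an npu_device_health code."""
    # Codes outside 0-3 are not documented, so report them as such.
    return NPU_HEALTH_LABELS.get(value, f"unrecognized code {value}")


def needs_attention(value: int) -> bool:
    """Treat every code other than 0 (healthy) as needing attention."""
    return value != 0


if __name__ == "__main__":
    for code in (0, 1, 2, 3):
        print(code, npu_health_label(code), needs_attention(code))
```

Treating any nonzero code as needing attention is a design choice of this sketch, not a rule stated by the documentation.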

OS Metrics: DAVP

**Table 12** DAVP metrics
Metric	Parameter	Description	Value Range	Monitored Object & Dimension	Monitoring Period (Raw Data)
davp_device_health	DAVP Device Health	An overall measurement of the DAVP health Unit: none Linux: Obtain the metric value from the libdcmi.so library file in the VAtools tool of the DAVP card.	0: healthy 1: abnormal	ECS ECS - DAVP	1 minute
davp_util_rate_mem	DAVP Util Rate Mem	The utilization rate of the DAVP memory Unit: percent Linux: Obtain the metric value from the libdcmi.so library file in the VAtools tool of the DAVP card.	0-100	ECS ECS - DAVP	1 minute
davp_usage_mem	DAVP Usage Mem	DAVP memory currently in use Unit: MB Linux: Obtain the metric value from the libdcmi.so library file in the VAtools tool of the DAVP card.	≥ 0	ECS ECS - DAVP	1 minute
davp_util_rate_ai_core	DAVP Util Rate AI Core	The utilization rate of the DAVP AI Core Unit: percent Linux: Obtain the metric value from the libdcmi.so library file in the VAtools tool of the DAVP card.	0-100	ECS ECS - DAVP	1 minute
davp_util_rate_vdsp_core	DAVP Util Rate Vdsp Core	The utilization rate of the DAVP Vdsp Core Unit: percent Linux: Obtain the metric value from the libdcmi.so library file in the VAtools tool of the DAVP card.	0-100	ECS ECS - DAVP	1 minute
davp_util_rate_enc_core	DAVP Util Rate Enc Core	The utilization rate of the DAVP Enc Core Unit: percent Linux: Obtain the metric value from the libdcmi.so library file in the VAtools tool of the DAVP card.	0-100	ECS ECS - DAVP	1 minute
davp_util_rate_dec_core	DAVP Util Rate Dec Core	The utilization rate of the DAVP Dec Core Unit: percent Linux: Obtain the metric value from the libdcmi.so library file in the VAtools tool of the DAVP card.	0-100	ECS ECS - DAVP	1 minute
davp_sysc_temperature	Davp System Module Temperature	Current temperature of the DAVP system module Unit: °C Linux: Obtain the metric value from the libdcmi.so library file in the VAtools tool of the DAVP card.	≥ 0	ECS ECS - DAVP	1 minute

The Windows OS does not support DAVP metrics.

Dimensions

Dimension	Key	Value
ECS	instance_id	Specifies the ECS ID.
ECS - Disk	disk	Specifies the disks attached to an ECS.
ECS - Mount point	mount_point	Specifies the mount point of a disk.
ECS - GPU	gpu	Specifies the graphics card of an ECS.
ECS - NPU	npu	Specifies the NPU card of an NPU-based ECS.
ECS - DAVP	davp	Specifies the DaoCloud DAVP1 video acceleration card of a DAVP-based ECS.
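
A metric in this section is identified by the namespace AGT.ECS, a metric name, and one or more dimension key-value pairs from the table above. The Python sketch below shows one way to assemble such an identifier. The function name build_metric_descriptor, the dictionary layout, and the sample values ("ecs-id-example", "gpu0") are assumptions made for illustration; check the Cloud Eye API reference for the actual request format.

```python
from typing import Dict, List, Optional, Tuple

NAMESPACE = "AGT.ECS"

# Device-level dimension keys listed in the Dimensions table.
SUB_DIMENSION_KEYS = {"disk", "mount_point", "gpu", "npu", "davp"}


def build_metric_descriptor(
    metric_name: str,
    instance_id: str,
    sub_dimension: Optional[Tuple[str, str]] = None,
) -> Dict[str, object]:
    """Assemble an illustrative metric identifier.

    instance_id is always included; an optional (key, value) pair adds a
    device-level dimension such as ("gpu", "gpu0").
    """
    dimensions: List[Dict[str, str]] = [
        {"name": "instance_id", "value": instance_id}
    ]
    if sub_dimension is not None:
        key, value = sub_dimension
        if key not in SUB_DIMENSION_KEYS:
            raise ValueError(f"unknown dimension key: {key!r}")
        dimensions.append({"name": key, "value": value})
    return {
        "namespace": NAMESPACE,
        "metric_name": metric_name,
        "dimensions": dimensions,
    }


if __name__ == "__main__":
    # Hypothetical identifiers, used only to show the resulting shape.
    print(build_metric_descriptor("gpu_usage_gpu", "ecs-id-example", ("gpu", "gpu0")))
```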
