Metrics Supported by the Agent
OS Metric: CPU
Metric | Name | Description | Unit | Supported Version | Monitoring Period (Raw Data) |
---|---|---|---|---|---|
cpu_usage | (Agent) CPU Usage | CPU usage of the monitored object | % | 2.4.1 | 1 minute |
cpu_usage_idle | (Agent) Idle CPU Usage | Percentage of time that the CPU is idle | % | 2.4.5 | 1 minute |
cpu_usage_other | (Agent) Other Process CPU Usage | Other CPU usage of the monitored object | % | 2.4.5 | 1 minute |
cpu_usage_system | (Agent) Kernel Space CPU Usage | Percentage of time that the CPU is used by kernel space | % | 2.4.5 | 1 minute |
cpu_usage_user | (Agent) User Space CPU Usage | Percentage of time that the CPU is used by user space | % | 2.4.5 | 1 minute |
cpu_usage_nice | (Agent) Nice Process CPU Usage | Percentage of time that the CPU is in user mode running low-priority processes, which can easily be interrupted by higher-priority processes | % | 2.4.5 | 1 minute |
cpu_usage_iowait | (Agent) iowait Process CPU Usage | Percentage of time that the CPU is waiting for I/O operations to complete | % | 2.4.5 | 1 minute |
cpu_usage_irq | (Agent) CPU Interrupt Time | Percentage of time that the CPU is servicing interrupts | % | 2.4.5 | 1 minute |
cpu_usage_softirq | (Agent) CPU Software Interrupt Time | Percentage of time that the CPU is servicing software interrupts | % | 2.4.5 | 1 minute |
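As an illustration only, the sketch below shows one common way to derive per-state CPU percentages on Linux: take two samples of the aggregate `cpu` line in `/proc/stat` and divide each counter's increase by the total increase. The source does not say how the Agent collects these values. The mapping of kernel fields to metric IDs, and the definition of `cpu_usage` as "everything that is not idle", are assumptions made for this example.

```python
# Field order of the first seven counters on the aggregate "cpu" line of /proc/stat.
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq")


def parse_cpu_line(line):
    """Return {field: jiffies} for a line such as 'cpu  1000 0 500 8000 ...'."""
    parts = line.split()
    if parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    return dict(zip(FIELDS, (int(v) for v in parts[1:1 + len(FIELDS)])))


def cpu_percentages(before, after):
    """Percentage of elapsed CPU time spent in each state between two samples."""
    deltas = {f: after[f] - before[f] for f in FIELDS}
    total = sum(deltas.values())
    if total <= 0:
        raise ValueError("samples must be taken at different times")
    pct = {f"cpu_usage_{f}": 100.0 * d / total for f, d in deltas.items()}
    # Assumption: overall usage is all time that was not spent idle.
    pct["cpu_usage"] = 100.0 - pct["cpu_usage_idle"]
    return pct


# Two hypothetical samples taken some time apart.
s1 = parse_cpu_line("cpu  1000 0 500 8000 300 100 100 0 0 0")
s2 = parse_cpu_line("cpu  1200 0 600 8600 350 125 125 0 0 0")
result = cpu_percentages(s1, s2)
```

The sketch ignores the `steal` and `guest` counters and does not model `cpu_usage_other`, whose exact definition is not given here.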
OS Metric: CPU Load
Metric | Name | Description | Unit | Supported Version | Monitoring Period (Raw Data) |
---|---|---|---|---|---|
load_average1 | (Agent) 1-Minute Load Average | CPU load averaged over the last 1 minute | None | 2.4.1 | 1 minute |
load_average5 | (Agent) 5-Minute Load Average | CPU load averaged over the last 5 minutes | None | 2.4.1 | 1 minute |
load_average15 | (Agent) 15-Minute Load Average | CPU load averaged over the last 15 minutes | None | 2.4.1 | 1 minute |
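On Linux, the three load averages are typically exposed as the first three fields of `/proc/loadavg`. The sketch below parses text in that format and labels the values with the metric IDs above. Whether the Agent reads this file is an assumption, and the sample text is made up.

```python
def parse_loadavg(text):
    """Parse the first three fields of /proc/loadavg-style text."""
    one, five, fifteen = (float(v) for v in text.split()[:3])
    return {
        "load_average1": one,
        "load_average5": five,
        "load_average15": fifteen,
    }


# Hypothetical content of /proc/loadavg.
sample = "0.52 0.41 0.30 2/345 12345"
loads = parse_loadavg(sample)
```

Python's standard library also offers `os.getloadavg()` on Unix-like systems, which returns the same three values as a tuple.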
OS Metric: Memory
Metric | Name | Description | Unit | Supported Version | Monitoring Period (Raw Data) |
---|---|---|---|---|---|
mem_available | (Agent) Available Memory | Amount of memory that is available and can be given instantly to processes | GB | 2.4.5 | 1 minute |
mem_usedPercent | (Agent) Memory Usage | Memory usage of the instance | % | 2.4.1 | 1 minute |
mem_free | (Agent) Idle Memory | Amount of memory that is not being used | GB | 2.4.5 | 1 minute |
mem_buffers | (Agent) Buffer | Amount of memory that is being used for buffers | GB | 2.4.5 | 1 minute |
mem_cached | (Agent) Cache | Amount of memory that is being used for file caches | GB | 2.4.5 | 1 minute |
total_open_files | (Agent) Total File Handles | Total handles used by all processes | None | 2.4.5 | 1 minute |
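To make the units concrete, the following sketch converts `/proc/meminfo`-style values (reported in kB) to GB and computes a usage percentage. The source does not state the Agent's exact formula. This example assumes usage is `(MemTotal - MemAvailable) / MemTotal` and that 1 GB means 1024 * 1024 kB. The sample figures are invented.

```python
# Hypothetical excerpt of /proc/meminfo.
SAMPLE_MEMINFO = """\
MemTotal:        8000000 kB
MemFree:         1000000 kB
MemAvailable:    4000000 kB
Buffers:          200000 kB
Cached:          2500000 kB
"""

KB_PER_GB = 1024 * 1024  # assumption: binary gigabytes


def parse_meminfo(text):
    """Return {key: value_in_kB} for 'Key:   value kB' lines."""
    values = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        if rest.strip():
            values[key.strip()] = int(rest.split()[0])
    return values


def memory_metrics(info):
    """Map meminfo fields to the metric IDs in the table above."""
    total = info["MemTotal"]
    return {
        "mem_available": info["MemAvailable"] / KB_PER_GB,
        "mem_free": info["MemFree"] / KB_PER_GB,
        "mem_buffers": info["Buffers"] / KB_PER_GB,
        "mem_cached": info["Cached"] / KB_PER_GB,
        # Assumed formula; the source only says "memory usage of the instance".
        "mem_usedPercent": 100.0 * (total - info["MemAvailable"]) / total,
    }


metrics = memory_metrics(parse_meminfo(SAMPLE_MEMINFO))
```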
OS Metric: Disk
Currently, the CES Agent collects metrics for physical disks only and does not support disks mounted over the Network File System (NFS) protocol.
By default, the CES Agent does not monitor Docker-related mount points, that is, mount points whose paths start with one of the following prefixes:
/var/lib/docker;/mnt/paas/kubernetes;/var/lib/mesos
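The prefix rule can be pictured with a short sketch. It splits the semicolon-separated list above and drops any mount point that starts with one of the prefixes. The function name and the sample mount points are hypothetical. Plain string-prefix matching is an assumption about how the rule is applied.

```python
# Prefix list copied from the documentation line above.
EXCLUDED_PREFIXES = "/var/lib/docker;/mnt/paas/kubernetes;/var/lib/mesos".split(";")


def is_monitored(mount_point, excluded=EXCLUDED_PREFIXES):
    """Return False when the mount point starts with an excluded prefix."""
    return not any(mount_point.startswith(prefix) for prefix in excluded)


# Hypothetical mount points on a host that also runs containers.
mounts = [
    "/",
    "/data",
    "/var/lib/docker/overlay2/abc/merged",
    "/mnt/paas/kubernetes/kubelet/pods/x",
]
monitored = [m for m in mounts if is_monitored(m)]
```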
Metric | Name | Description | Unit | Supported Version | Monitoring Period (Raw Data) |
---|---|---|---|---|---|
disk_free | (Agent) Available Disk Space | Free space on the disks | GB | 2.4.1 | 1 minute |
disk_total | (Agent) Disk Storage Capacity | Total disk capacity | GB | 2.4.5 | 1 minute |
disk_used | (Agent) Used Disk Space | Used space on the disks | GB | 2.4.5 | 1 minute |
disk_usedPercent | (Agent) Disk Usage | Percentage of used disk space. It is calculated as follows: Disk Usage = Used Disk Space/Disk Storage Capacity. | % | 2.4.1 | 1 minute |
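The Disk Usage formula above can be reproduced with Python's standard `shutil.disk_usage`, as in the sketch below. The ratio is multiplied by 100 to express it in the table's unit (%). Values obtained this way may differ slightly from the Agent's because file systems account for reserved blocks differently. The helper name is hypothetical.

```python
import shutil


def disk_used_percent(used, total):
    """Disk Usage = Used Disk Space / Disk Storage Capacity, as a percentage."""
    if total <= 0:
        raise ValueError("total capacity must be positive")
    return 100.0 * used / total


# Live reading for the root file system of the machine running this script.
usage = shutil.disk_usage("/")
percent = disk_used_percent(usage.used, usage.total)
```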
OS Metric: Disk I/O
OS Metric: File System
Metric | Name | Description | Unit | Supported Version | Monitoring Period (Raw Data) |
---|---|---|---|---|---|
disk_fs_rwstate | (Agent) File System Read/Write Status | Read and write status of the mounted file system of the monitored object. Possible statuses are 0 (read and write) and 1 (read only). | None | 2.4.5 | 1 minute |
disk_inodesTotal | (Agent) Disk inode Total | Total number of index nodes on the disk | None | 2.4.5 | 1 minute |
disk_inodesUsed | (Agent) Total inode Used | Number of used index nodes on the disk | None | 2.4.5 | 1 minute |
disk_inodesUsedPercent | (Agent) Percentage of Total inode Used | Percentage of used index nodes on the disk | % | 2.4.1 | 1 minute |
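For readers who want to cross-check the inode metrics by hand, the sketch below derives the three values from a total and a free inode count, such as those returned by `os.statvfs` on Unix-like systems (`f_files` and `f_ffree`). How the Agent obtains them is not stated in the source, so treat this as an approximation.

```python
import os


def inode_metrics(total, free):
    """Derive used count and used percentage from total and free inode counts."""
    used = total - free
    # Some file systems report zero total inodes; avoid dividing by zero.
    percent = 100.0 * used / total if total else 0.0
    return {
        "disk_inodesTotal": total,
        "disk_inodesUsed": used,
        "disk_inodesUsedPercent": percent,
    }


example = inode_metrics(total=1000, free=750)

# Live reading for the root file system, where os.statvfs is available.
if hasattr(os, "statvfs"):
    st = os.statvfs("/")
    live = inode_metrics(st.f_files, st.f_ffree)
```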
OS Metric: TCP
Metric | Name | Description | Unit | Supported Version | Monitoring Period (Raw Data) |
---|---|---|---|---|---|
net_tcp_total | (Agent) Total Number of TCP Connections | Total number of TCP connections | None | 2.4.1 | 1 minute |
net_tcp_established | (Agent) Number of connections in the ESTABLISHED state | Number of TCP connections in the ESTABLISHED state | None | 2.4.1 | 1 minute |
net_tcp_sys_sent | (Agent) Number of connections in the TCP SYN_SENT state | Number of TCP connections that are being requested by the client | None | 2.4.5 | 1 minute |
net_tcp_sys_recv | (Agent) Number of connections in the TCP SYN_RECV state | Number of pending TCP connections received by the server | None | 2.4.5 | 1 minute |
net_tcp_fin_wait1 | (Agent) Number of TCP connections in the FIN_WAIT1 state | Number of TCP connections waiting for ACK packets when the connections are being actively closed by the client | None | 2.4.5 | 1 minute |
net_tcp_fin_wait2 | (Agent) Number of TCP connections in the FIN_WAIT2 state | Number of TCP connections in the FIN_WAIT2 state | None | 2.4.5 | 1 minute |
net_tcp_time_wait | (Agent) Number of TCP connections in the TIME_WAIT state | Number of TCP connections in the TIME_WAIT state | None | 2.4.5 | 1 minute |
net_tcp_close | (Agent) Number of TCP connections in the CLOSE state | Number of closed TCP connections | None | 2.4.5 | 1 minute |
net_tcp_close_wait | (Agent) Number of TCP connections in the CLOSE_WAIT state | Number of TCP connections in the CLOSE_WAIT state | None | 2.4.5 | 1 minute |
net_tcp_last_ack | (Agent) Number of TCP connections in the LAST_ACK state | Number of TCP connections waiting for ACK packets when the connections are being passively closed by the client | None | 2.4.5 | 1 minute |
net_tcp_listen | (Agent) Number of TCP connections in the LISTEN state | Number of TCP connections in the LISTEN state | None | 2.4.5 | 1 minute |
net_tcp_closing | (Agent) Number of TCP connections in the CLOSING state | Number of TCP connections that are being closed by the server and the client at the same time | None | 2.4.5 | 1 minute |
net_tcp_retrans | (Agent) TCP Retransmission Rate | Percentage of packets that are resent | % | 2.4.5 | 1 minute |
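The per-state counters can be approximated on Linux by tallying the `st` column of `/proc/net/tcp`, which holds the kernel's hexadecimal TCP state code. The sketch below does this for text in that format and labels each count with the metric ID from the table; note that the kernel's SYN_SENT and SYN_RECV states correspond to the documented IDs `net_tcp_sys_sent` and `net_tcp_sys_recv`. Whether the Agent works this way, and whether `net_tcp_total` is the sum of all states, are assumptions. The sample rows are made up and truncated to the columns the parser needs.

```python
from collections import Counter

# Kernel TCP state codes (the "st" column of /proc/net/tcp) mapped to metric IDs.
STATE_TO_METRIC = {
    "01": "net_tcp_established",
    "02": "net_tcp_sys_sent",   # kernel state SYN_SENT
    "03": "net_tcp_sys_recv",   # kernel state SYN_RECV
    "04": "net_tcp_fin_wait1",
    "05": "net_tcp_fin_wait2",
    "06": "net_tcp_time_wait",
    "07": "net_tcp_close",
    "08": "net_tcp_close_wait",
    "09": "net_tcp_last_ack",
    "0A": "net_tcp_listen",
    "0B": "net_tcp_closing",
}

SAMPLE = """\
  sl  local_address rem_address   st tx_queue rx_queue
   0: 00000000:0016 00000000:0000 0A 00000000:00000000
   1: 0100007F:0019 00000000:0000 0A 00000000:00000000
   2: 0A00020F:0016 0A000202:D4A6 01 00000000:00000000
   3: 0A00020F:8C3E 0A000202:01BB 06 00000000:00000000
"""


def count_tcp_states(text):
    """Count connections per state from /proc/net/tcp-style text."""
    counts = Counter()
    for line in text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 4:
            continue
        metric = STATE_TO_METRIC.get(fields[3].upper())
        if metric:
            counts[metric] += 1
    # Assumption: the total is the sum over all recognized states.
    counts["net_tcp_total"] = sum(counts.values())
    return counts


tcp_counts = count_tcp_states(SAMPLE)
```

A complete count on a real host would also need to read `/proc/net/tcp6` for IPv6 sockets.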
OS Metric: NIC
Metric | Name | Description | Unit | Supported Version | Monitoring Period (Raw Data) |
---|---|---|---|---|---|
net_bitRecv | (Agent) Inbound Bandwidth | Number of bits received by this NIC per second | bit/s | 2.4.1 | 1 minute |
net_bitSent | (Agent) Outbound Bandwidth | Number of bits sent by this NIC per second | bit/s | 2.4.1 | 1 minute |
net_packetRecv | (Agent) NIC Packet Receive Rate | Number of packets received by this NIC per second | Count/s | 2.4.1 | 1 minute |
net_packetSent | (Agent) NIC Packet Send Rate | Number of packets sent by this NIC per second | Count/s | 2.4.1 | 1 minute |
net_errin | (Agent) Receive Error Rate | Percentage of receive errors detected by this NIC per second | % | 2.4.5 | 1 minute |
net_errout | (Agent) Transmit Error Rate | Percentage of transmit errors detected by this NIC per second | % | 2.4.5 | 1 minute |
net_dropin | (Agent) Received Packet Drop Rate | Percentage of packets received by this NIC which were dropped per second | % | 2.4.5 | 1 minute |
net_dropout | (Agent) Transmitted Packet Drop Rate | Percentage of packets transmitted by this NIC which were dropped per second | % | 2.4.5 | 1 minute |
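Per-second NIC rates are usually derived from cumulative interface counters sampled at two points in time. The sketch below shows that arithmetic for the bandwidth and packet-rate metrics. It assumes `net_bitRecv` tracks received traffic and `net_bitSent` tracks sent traffic, as the identifiers suggest. The counter names (`rx_bytes` and so on) and the sample values are hypothetical, and the Agent's actual collection method is not described in the source.

```python
def nic_rates(prev, curr, interval_s):
    """Convert two cumulative counter samples into per-second rates."""
    if interval_s <= 0:
        raise ValueError("interval must be positive")

    def per_second(key):
        return (curr[key] - prev[key]) / interval_s

    return {
        "net_bitRecv": per_second("rx_bytes") * 8,   # bytes to bits
        "net_bitSent": per_second("tx_bytes") * 8,
        "net_packetRecv": per_second("rx_packets"),
        "net_packetSent": per_second("tx_packets"),
    }


# Two hypothetical samples taken 60 seconds apart (the 1-minute period above).
prev = {"rx_bytes": 1_000_000, "tx_bytes": 500_000, "rx_packets": 1000, "tx_packets": 800}
curr = {"rx_bytes": 1_750_000, "tx_bytes": 650_000, "rx_packets": 1600, "tx_packets": 1100}
rates = nic_rates(prev, curr, 60)
```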
Process Monitoring Metrics
Metric | Name | Description | Unit | Supported Version | Monitoring Period (Raw Data) |
---|---|---|---|---|---|
proc_pHashId_cpu | (Agent) CPU Usage | CPU consumed by a process. pHashId is the MD5 value of the process name and process ID. | % | 2.4.1 | 1 minute |
proc_pHashId_mem | (Agent) Memory Usage | Memory consumed by a process. pHashId is the MD5 value of the process name and process ID. | % | 2.4.1 | 1 minute |
proc_pHashId_file | (Agent) Number of opened files | Number of files opened by a process. pHashId is the MD5 value of the process name and process ID. | Count | 2.4.1 | 1 minute |
proc_running_count | (Agent) Number of running processes | Number of processes that are running | None | 2.4.1 | 1 minute |
proc_idle_count | (Agent) Idle Processes | Number of processes that are idle | None | 2.4.1 | 1 minute |
proc_zombie_count | (Agent) Zombie Processes | Number of zombie processes | None | 2.4.1 | 1 minute |
proc_blocked_count | (Agent) Blocked Processes | Number of processes that are blocked | None | 2.4.1 | 1 minute |
proc_sleeping_count | (Agent) Sleeping Processes | Number of processes that are sleeping | None | 2.4.1 | 1 minute |
proc_total_count | (Agent) Total Processes | Total number of processes on the monitored object | None | 2.4.1 | 1 minute |
proc_specified_count | (Agent) Specified Processes | Number of specified processes | Count | 2.4.1 | 1 minute |
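The per-process metric names embed an MD5 value built from the process name and process ID. The source does not specify how the two are combined before hashing, so the sketch below simply concatenates them; that format is an assumption, and the resulting hashes should not be expected to match the ones the Agent reports. It only illustrates the shape of the metric names.

```python
import hashlib


def p_hash_id(process_name, pid):
    """MD5 hex digest of the process name and PID.

    Assumption: name and PID are concatenated directly. The real input
    format used by the Agent is not documented here.
    """
    return hashlib.md5(f"{process_name}{pid}".encode("utf-8")).hexdigest()


def process_metric_names(process_name, pid):
    """Build the three per-process metric names for one process."""
    digest = p_hash_id(process_name, pid)
    return [f"proc_{digest}_{suffix}" for suffix in ("cpu", "mem", "file")]


# Hypothetical process.
names = process_metric_names("nginx", 1234)
```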
GPU Specifications
Collection method (Linux): Invoke the libnvidia-ml.so.1 library file of the GPU card. Collection method (Windows): Invoke the nvml.dll library file of the GPU card.
Metric | Description | Unit | Supported Version |
---|---|---|---|
gpu_status | GPU health status of the VM. This is a comprehensive metric. 0 indicates healthy, 1 indicates subhealthy, and 2 indicates faulty. | - | 2.4.5 |
gpu_performance_state | Performance state of the GPU. Possible values are P0 to P15 and P32. P0 indicates the maximum performance state, P15 indicates the minimum performance state, and P32 indicates an unknown state. | - | 2.4.1 |
gpu_power_draw | Power of the GPU | W | 2.4.5 |
gpu_temperature | Temperature of the GPU | °C | 2.4.5 |
gpu_usage_gpu | GPU computing power usage | % | 2.4.1 |
gpu_usage_mem | GPU memory usage | % | 2.4.1 |
gpu_used_mem | Used GPU memory | MB | 2.4.5 |
gpu_free_mem | Remaining GPU memory | MB | 2.4.5 |
gpu_usage_encoder | GPU encoding capability usage | % | 2.4.5 |
gpu_usage_decoder | GPU decoding capability usage | % | 2.4.5 |
gpu_graphics_clocks | Graphics (shader) clock frequency of the GPU | MHz | 2.4.5 |
gpu_sm_clocks | Streaming multiprocessor (SM) clock frequency of the GPU | MHz | 2.4.5 |
gpu_mem_clock | Memory clock frequency of the GPU | MHz | 2.4.5 |
gpu_video_clocks | Video (including codec) clock frequency of the GPU | MHz | 2.4.5 |
gpu_tx_throughput_pci | Outbound bandwidth of the GPU | MByte/s | 2.4.5 |
gpu_rx_throughput_pci | Inbound bandwidth of the GPU | MByte/s | 2.4.5 |
gpu_volatile_correctable | Number of correctable ECC errors since the GPU was last reset. The value is reset to 0 each time the GPU is reset. | N/A | 2.4.5 |
gpu_volatile_uncorrectable | Number of uncorrectable ECC errors since the GPU was last reset. The value is reset to 0 each time the GPU is reset. | N/A | 2.4.5 |
gpu_aggregate_correctable | Number of correctable ECC errors on the GPU | N/A | 2.4.5 |
gpu_aggregate_uncorrectable | Number of uncorrectable ECC errors on the GPU | N/A | 2.4.5 |
gpu_retired_page_single_bit | Number of pages retired because of single-bit errors, that is, the number of single-bit pages blocked by the graphics card | N/A | 2.4.5 |
gpu_retired_page_double_bit | Number of pages retired because of double-bit errors, that is, the number of double-bit pages isolated by the graphics card | N/A | 2.4.5 |
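When consuming `gpu_status` and `gpu_performance_state` in scripts or alarm handlers, it can help to translate the numeric codes into the meanings documented above. The helper names below are hypothetical, and the sketch covers only the values this page describes.

```python
# Codes documented for gpu_status.
GPU_STATUS = {0: "healthy", 1: "subhealthy", 2: "faulty"}


def describe_gpu_status(value):
    """Translate a gpu_status code; anything undocumented is reported as unknown."""
    return GPU_STATUS.get(value, "unknown")


def describe_performance_state(state):
    """Translate a gpu_performance_state number (0 to 15, or 32)."""
    if state == 0:
        return "P0 (maximum performance)"
    if state == 15:
        return "P15 (minimum performance)"
    if state == 32:
        return "P32 (unknown state)"
    if 0 < state < 15:
        return f"P{state}"
    raise ValueError(f"undocumented performance state: {state}")


status_text = describe_gpu_status(1)
state_text = describe_performance_state(8)
```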