Updated on 2024-09-06 GMT+08:00

Basic Metrics: VM Metrics

This section describes the categories, names, and meanings of VM metrics reported by ICAgents to AOM.

Table 1 VM metrics

Category

Metric

Metric Name

Description

Value Range

Unit

Network metrics

aom_node_network_receive_bytes

Downlink Rate (BPS)

Inbound traffic rate of a measured object

≥ 0

Bytes/s

aom_node_network_receive_packets

Downlink Rate (PPS)

Number of data packets received by a NIC per second

≥ 0

Packets/s

aom_node_network_receive_error_packets

Downlink Error Rate

Number of error packets received by a NIC per second

≥ 0

Count/s

aom_node_network_transmit_bytes

Uplink Rate (BPS)

Outbound traffic rate of a measured object

≥ 0

Bytes/s

aom_node_network_transmit_error_packets

Uplink Error Rate

Number of error packets sent by a NIC per second

≥ 0

Count/s

aom_node_network_transmit_packets

Uplink Rate (PPS)

Number of data packets sent by a NIC per second

≥ 0

Packets/s

aom_node_network_total_bytes

Total Rate (BPS)

Total inbound and outbound traffic rate of a measured object

≥ 0

Bytes/s

Disk metrics

aom_node_disk_read_kilobytes

Disk Read Rate

Volume of data read from a disk per second

≥ 0

KB/s

aom_node_disk_write_kilobytes

Disk Write Rate

Volume of data written into a disk per second

≥ 0

KB/s

Disk partition metrics

aom_host_diskpartition_thinpool_metadata_percent

Thin Pool's Metadata Space Usage

Percentage of the thin pool's used metadata space to the total metadata space on a CCE node

0–100

%

aom_host_diskpartition_thinpool_data_percent

Thin Pool's Data Space Usage

Percentage of the thin pool's used data space to the total data space on a CCE node

0–100

%

aom_host_diskpartition_total_capacity_megabytes

Thin Pool's Disk Partition Space

Total disk partition space of the thin pool on a CCE node

≥ 0

MB

File system metrics

aom_node_disk_available_capacity_megabytes

Available Disk Space

Disk space that has not been used

≥ 0

MB

aom_node_disk_capacity_megabytes

Total Disk Space

Total disk space

≥ 0

MB

aom_node_disk_rw_status

Disk Read/Write Status

Read or write status of a disk

0 or 1

  • 0: read/write
  • 1: read-only

N/A

aom_node_disk_usage

Disk Usage

Percentage of the used disk space to the total disk space

0–100

%

Host metrics

aom_node_cpu_limit_core

Total CPU Cores

Total number of CPU cores requested for a measured object

≥ 1

Cores

aom_node_cpu_used_core

Used CPU Cores

Number of CPU cores used by a measured object

≥ 0

Cores

aom_node_cpu_usage

CPU Usage

CPU usage of a measured object

0–100

%

aom_node_memory_free_megabytes

Available Physical Memory

Available physical memory of a measured object

≥ 0

MB

aom_node_virtual_memory_free_megabytes

Available Virtual Memory

Available virtual memory of a measured object

≥ 0

MB

aom_node_gpu_memory_free_megabytes

GPU Memory Capacity

Total GPU memory of a measured object

> 0

MB

aom_node_gpu_memory_usage

GPU Memory Usage

Percentage of the used GPU memory to the total GPU memory

0–100

%

aom_node_gpu_memory_used_megabytes

Used GPU Memory

GPU memory used by a measured object

≥ 0

MB

aom_node_gpu_usage

GPU Usage

GPU usage of a measured object

0–100

%

aom_node_npu_memory_free_megabytes

Total NPU Memory

Total NPU memory of a measured object

NOTE:

NPU metrics can be collected only from CCE hosts.

> 0

MB

aom_node_npu_memory_usage

NPU Memory Usage

Percentage of the used NPU memory to the total NPU memory

NOTE:

NPU metrics can be collected only from CCE hosts.

0–100

%

aom_node_npu_memory_used_megabytes

Used NPU Memory

NPU memory used by a measured object

NOTE:

NPU metrics can be collected only from CCE hosts.

≥ 0

MB

aom_node_npu_usage

NPU Usage

NPU usage of a measured object

NOTE:

NPU metrics can be collected only from CCE hosts.

0–100

%

aom_node_npu_temperature_centigrade

NPU Temperature

NPU temperature of a measured object

NOTE:

NPU metrics can be collected only from CCE hosts.

-

°C

aom_node_memory_usage

Physical Memory Usage

Percentage of used physical memory to the total physical memory requested for a measured object

0–100

%

aom_node_status

Host Status

Host status

0 or 1

  • 0: Normal
  • 1: Abnormal

N/A

aom_node_ntp_offset_ms

NTP Offset

Offset between the local time of the host and the NTP server time. The closer the NTP offset is to 0, the closer the local time of the host is to the time of the NTP server.

-

ms

aom_node_ntp_server_status

NTP Server Status

Whether the host is connected to the NTP server

0 or 1

  • 0: Connected
  • 1: Not connected

N/A

aom_node_ntp_status

NTP Synchronization Status

Whether the local time of the host is synchronized with the NTP server time

0 or 1

  • 0: Synchronized
  • 1: Not synchronized

N/A

aom_node_process_number

Processes

Number of processes on a measured object

≥ 0

N/A

aom_node_gpu_temperature_centigrade

GPU Temperature

GPU temperature of a measured object

-

°C

aom_node_memory_total_megabytes

Total Physical Memory

Total physical memory requested for a measured object

≥ 0

MB

aom_node_virtual_memory_total_megabytes

Virtual Memory Size

Total virtual memory of a measured object

≥ 0

MB

aom_node_virtual_memory_usage

Virtual Memory Usage

Percentage of the used virtual memory to the total virtual memory

0–100

%

aom_node_current_threads_num

Current Threads

Number of threads created on a host

≥ 0

N/A

aom_node_sys_max_threads_num

Max Threads

Maximum number of threads that can be created on a host

≥ 0

N/A

aom_node_phy_disk_total_capacity_megabytes

Total Physical Disk Space

Total disk space of a host

≥ 0

MB

aom_node_physical_disk_total_used_megabytes

Used Physical Disk Space

Used disk space of a host

≥ 0

MB

aom_billing_hostUsed

Hosts

Number of hosts connected per day

≥ 0

N/A

Cluster metrics

aom_cluster_cpu_limit_core

Total CPU Cores

Total number of CPU cores requested for a measured object

≥ 1

Cores

aom_cluster_cpu_used_core

Used CPU Cores

Number of CPU cores used by a measured object

≥ 0

Cores

aom_cluster_cpu_usage

CPU Usage

CPU usage of a measured object

0–100

%

aom_cluster_disk_available_capacity_megabytes

Available Disk Space

Disk space that has not been used

≥ 0

MB

aom_cluster_disk_capacity_megabytes

Total Disk Space

Total disk space

≥ 0

MB

aom_cluster_disk_usage

Disk Usage

Percentage of the used disk space to the total disk space

0–100

%

aom_cluster_memory_free_megabytes

Available Physical Memory

Available physical memory of a measured object

≥ 0

MB

aom_cluster_virtual_memory_free_megabytes

Available Virtual Memory

Available virtual memory of a measured object

≥ 0

MB

aom_cluster_gpu_memory_free_megabytes

Available GPU Memory

Available GPU memory of a measured object

> 0

MB

aom_cluster_gpu_memory_usage

GPU Memory Usage

Percentage of the used GPU memory to the total GPU memory

0–100

%

aom_cluster_gpu_memory_used_megabytes

Used GPU Memory

GPU memory used by a measured object

≥ 0

MB

aom_cluster_gpu_usage

GPU Usage

GPU usage of a measured object

0–100

%

aom_cluster_memory_usage

Physical Memory Usage

Percentage of used physical memory to the total physical memory requested for a measured object

0–100

%

aom_cluster_network_receive_bytes

Downlink Rate (BPS)

Inbound traffic rate of a measured object

≥ 0

Bytes/s

aom_cluster_network_transmit_bytes

Uplink Rate (BPS)

Outbound traffic rate of a measured object

≥ 0

Bytes/s

aom_cluster_memory_total_megabytes

Total Physical Memory

Total physical memory requested for a measured object

≥ 0

MB

aom_cluster_virtual_memory_total_megabytes

Virtual Memory Size

Total virtual memory of a measured object

≥ 0

MB

aom_cluster_virtual_memory_usage

Virtual Memory Usage

Percentage of the used virtual memory to the total virtual memory

0–100

%

Container metrics

aom_container_cpu_limit_core

Total CPU Cores

Total number of CPU cores restricted for a measured object

≥ 1

Cores

aom_container_cpu_used_core

Used CPU Cores

Number of CPU cores used by a measured object

≥ 0

Cores

aom_container_cpu_usage

CPU Usage

CPU usage of a measured object, that is, the percentage of used CPU cores to the total CPU cores restricted for the measured object

0–100

%

aom_container_disk_read_kilobytes

Disk Read Rate

Volume of data read from a disk per second

≥ 0

KB/s

aom_container_disk_write_kilobytes

Disk Write Rate

Volume of data written into a disk per second

≥ 0

KB/s

aom_container_filesystem_available_capacity_megabytes

Available File System Capacity

Available file system capacity of a measured object. This metric is available only for containers that use the Device Mapper storage driver in Kubernetes clusters of version 1.11 or later.

≥ 0

MB

aom_container_filesystem_capacity_megabytes

Total File System Capacity

Total file system capacity of a measured object. This metric is available only for containers that use the Device Mapper storage driver in Kubernetes clusters of version 1.11 or later.

≥ 0

MB

aom_container_filesystem_usage

File System Usage

File system usage of a measured object, that is, the percentage of used file system capacity to total file system capacity. This metric is available only for containers that use the Device Mapper storage driver in Kubernetes clusters of version 1.11 or later.

0–100

%

aom_container_gpu_memory_free_megabytes

GPU Memory Capacity

Total GPU memory of a measured object

> 0

MB

aom_container_gpu_memory_usage

GPU Memory Usage

Percentage of the used GPU memory to the total GPU memory

0–100

%

aom_container_gpu_memory_used_megabytes

Used GPU Memory

GPU memory used by a measured object

≥ 0

MB

aom_container_gpu_usage

GPU Usage

GPU usage of a measured object

0–100

%

aom_container_npu_memory_free_megabytes

Total NPU Memory

Total NPU memory of a measured object

> 0

MB

aom_container_npu_memory_usage

NPU Memory Usage

Percentage of the used NPU memory to the total NPU memory

0–100

%

aom_container_npu_memory_used_megabytes

Used NPU Memory

NPU memory used by a measured object

≥ 0

MB

aom_container_npu_usage

NPU Usage

NPU usage of a measured object

0–100

%

aom_container_memory_request_megabytes

Total Physical Memory

Total physical memory restricted for a measured object

≥ 0

MB

aom_container_memory_usage

Physical Memory Usage

Percentage of the used physical memory to the total physical memory restricted for a measured object

0–100

%

aom_container_memory_used_megabytes

Used Physical Memory

Used physical memory of a measured object

≥ 0

MB

aom_container_network_receive_bytes

Downlink Rate (BPS)

Inbound traffic rate of a measured object

≥ 0

Bytes/s

aom_container_network_receive_packets

Downlink Rate (PPS)

Number of data packets received by a NIC per second

≥ 0

Packets/s

aom_container_network_receive_error_packets

Downlink Error Rate

Number of error packets received by a NIC per second

≥ 0

Count/s

aom_container_network_rx_error_packets

Error Packets Received

Number of error packets received by a measured object

≥ 0

Count

aom_container_network_transmit_bytes

Uplink Rate (BPS)

Outbound traffic rate of a measured object

≥ 0

Bytes/s

aom_container_network_transmit_error_packets

Uplink Error Rate

Number of error packets sent by a NIC per second

≥ 0

Count/s

aom_container_network_transmit_packets

Uplink Rate (PPS)

Number of data packets sent by a NIC per second

≥ 0

Packets/s

aom_process_status

Status

Docker container status

0 or 1

  • 0: Normal
  • 1: Abnormal

N/A

aom_container_memory_workingset_usage

Working Set Memory Usage

Usage of the working set memory

0–100

%

aom_container_memory_workingset_used_megabytes

Used Working Set Memory

Working set memory that has been used

≥ 0

MB

Process metrics

aom_process_cpu_limit_core

Total CPU Cores

Total number of CPU cores requested for a measured object

≥ 1

Cores

aom_process_cpu_used_core

Used CPU Cores

Number of CPU cores used by a measured object

≥ 0

Cores

aom_process_cpu_usage

CPU Usage

CPU usage of a measured object, that is, the percentage of used CPU cores to the CPU cores requested for the measured object

0–100

%

aom_process_handle_count

Handles

Number of handles used by a measured object

≥ 0

N/A

aom_process_max_handle_count

Max Handles

Maximum number of handles used by a measured object

≥ 0

N/A

aom_process_memory_request_megabytes

Total Physical Memory

Total physical memory requested for a measured object

≥ 0

MB

aom_process_memory_usage

Physical Memory Usage

Percentage of used physical memory to the total physical memory requested for a measured object

0–100

%

aom_process_memory_used_megabytes

Used Physical Memory

Used physical memory of a measured object

≥ 0

MB

aom_process_status

Status

Process status

0 or 1

  • 0: Normal
  • 1: Abnormal

N/A

aom_process_thread_count

Threads

Number of threads used by a measured object

≥ 0

N/A

aom_process_virtual_memory_total_megabytes

Virtual Memory Size

Total virtual memory requested for a measured object

≥ 0

MB

  • If the host type is CCE, you can view disk partition metrics. The supported OSs are CentOS 7.6 and EulerOS 2.5.
  • To check the Docker storage driver type, log in to the CCE node as the root user and run the docker info | grep 'Storage Driver' command. Thin pool metrics can be viewed only if the command output shows that the driver type is Device Mapper.
  • Memory usage = (Physical memory capacity – Available physical memory capacity)/Physical memory capacity. Virtual memory usage = ((Physical memory capacity + Total virtual memory capacity) – (Available physical memory capacity + Available virtual memory capacity))/(Physical memory capacity + Total virtual memory capacity). Currently, the virtual memory of a newly created VM is 0 MB by default. If no virtual memory is configured, the memory usage shown on the monitoring page is the same as the virtual memory usage.
  • For the total and used physical disk space, only the space of file systems on local disk partitions is counted. Network file systems mounted to the host (such as JuiceFS, NFS, and SMB) are not counted.
  • Cluster metrics are aggregated by AOM based on host metrics, and do not include the metrics of master hosts.
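
The memory usage formulas in the notes above can be sketched in Python as follows. This is only an illustration under stated assumptions, not AOM's actual implementation: the function and parameter names are hypothetical, all inputs are in MB, and the result is multiplied by 100 on the assumption that the usage metrics (value range 0–100) are reported as percentages.

```python
def memory_usage(total_mb: float, free_mb: float) -> float:
    """Physical memory usage in percent.

    Follows the note: (capacity - available) / capacity.
    The factor of 100 is an assumption so the result matches the 0-100 range.
    """
    if total_mb <= 0:
        raise ValueError("total_mb must be greater than 0")
    return 100.0 * (total_mb - free_mb) / total_mb


def virtual_memory_usage(
    total_mb: float,
    free_mb: float,
    virtual_total_mb: float,
    virtual_free_mb: float,
) -> float:
    """Virtual memory usage in percent.

    Follows the note: physical and virtual memory are summed before the
    used share is computed.
    """
    capacity = total_mb + virtual_total_mb
    if capacity <= 0:
        raise ValueError("combined capacity must be greater than 0")
    available = free_mb + virtual_free_mb
    return 100.0 * (capacity - available) / capacity


if __name__ == "__main__":
    # Hypothetical host: 8192 MB physical memory, 2048 MB available.
    print(memory_usage(8192, 2048))
    # No virtual memory configured (0 MB): same value as memory usage.
    print(virtual_memory_usage(8192, 2048, 0, 0))
    # With 2048 MB of virtual memory, 1024 MB of it available.
    print(virtual_memory_usage(8192, 2048, 2048, 1024))
```

With no virtual memory configured, both functions return the same value, which matches the statement that the memory usage and the virtual memory usage are identical in that case.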