Updated on 2024-04-11 GMT+08:00

Basic Metrics: IEF Metrics

This section describes the types, names, and meanings of IEF metrics reported to AOM.

After IEF metrics are reported to AOM, AOM converts them based on mapping rules and displays the results on the Metric Browsing page.
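The conversion can be pictured as a lookup from the name IEF reports to the name AOM displays. The sketch below is illustrative only and is not AOM's implementation: the dictionary holds a small excerpt of Table 1, and the `to_aom_name` helper and the category keys (`"host"`, `"container"`, `"process"`) are hypothetical names chosen for this example. The category is part of the key because the same IEF name (for example, `cpuUsage`) maps to a different AOM name in each category.

```python
# Excerpt of Table 1: (category, name reported by IEF) -> name displayed on AOM.
# The category keys and this helper are hypothetical, for illustration only.
IEF_TO_AOM = {
    ("host", "cpuUsage"): "aom_node_cpu_usage",
    ("host", "memUsedRate"): "aom_node_memory_usage",
    ("host", "node_power"): "node_power",
    ("container", "cpuUsage"): "aom_container_cpu_usage",
    ("container", "memUsedRate"): "memUsedRate",
    ("process", "cpuUsage"): "aom_process_cpu_usage",
    ("process", "gpuUtil"): "gpuUtil",
}


def to_aom_name(category: str, ief_name: str) -> str:
    """Return the AOM display name for an IEF metric listed in the excerpt."""
    key = (category, ief_name)
    if key not in IEF_TO_AOM:
        raise KeyError(f"no mapping for {ief_name!r} in category {category!r}")
    return IEF_TO_AOM[key]


if __name__ == "__main__":
    print(to_aom_name("host", "cpuUsage"))
    print(to_aom_name("container", "cpuUsage"))
```

Some metrics, such as `node_power` and the process GPU metrics, are displayed under the same name that IEF reports.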

Table 1 IEF metrics

| Category | Sub-Category | Metrics Displayed on AOM | Metrics Reported by IEF | Metric Name | Description | Value Range | Unit |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Host metrics | CPU | aom_node_cpu_limit_core | cpuCoreLimit | Total CPU Cores | Total number of CPU cores requested for a measured object | ≥ 1 | Cores |
| Host metrics | CPU | aom_node_cpu_used_core | cpuCoreUsed | Used CPU Cores | Number of CPU cores used by a measured object | ≥ 0 | Cores |
| Host metrics | CPU | aom_node_cpu_usage | cpuUsage | CPU Usage | CPU usage of a measured object | 0–100 | % |
| Host metrics | Memory | aom_node_memory_total_megabytes | totalMem | Total Physical Memory | Total physical memory requested for a measured object | ≥ 0 | MB |
| Host metrics | Memory | aom_node_memory_free_megabytes | freeMem | Available Physical Memory | Available physical memory of a measured object | ≥ 0 | MB |
| Host metrics | Memory | aom_node_memory_usage | memUsedRate | Physical Memory Usage | Percentage of used physical memory relative to the total physical memory requested for a measured object | 0–100 | % |
| Host metrics | Memory | aom_node_virtual_memory_usage | virMemUsedRate | Virtual Memory Usage | Percentage of used virtual memory relative to the total virtual memory | 0–100 | % |
| Host metrics | Network | aom_node_network_receive_bytes | recvBytesRate | Downlink Rate (BPS) | Inbound traffic rate of a measured object | ≥ 0 | Bytes/s |
| Host metrics | Network | aom_node_network_transmit_bytes | sendBytesRate | Uplink Rate (BPS) | Outbound traffic rate of a measured object | ≥ 0 | Bytes/s |
| Host metrics | Disk | aom_node_disk_capacity_megabytes | diskCapacity | Total Disk Space | Total disk space | ≥ 0 | MB |
| Host metrics | Disk | aom_node_disk_available_capacity_megabytes | diskAvailableCapacity | Available Disk Space | Disk space that has not been used | ≥ 0 | MB |
| Host metrics | Disk | aom_node_disk_usage | diskUsedRate | Disk Usage | Percentage of used disk space relative to the total disk space | 0–100 | % |
| Host metrics | Disk | aom_node_disk_read_kilobytes | diskReadRate | Disk Read Rate | Volume of data read from a disk per second | ≥ 0 | KB/s |
| Host metrics | Disk | aom_node_disk_write_kilobytes | diskWriteRate | Disk Write Rate | Volume of data written to a disk per second | ≥ 0 | KB/s |
| Host metrics | GPU | aom_node_gpu_memory_free_megabytes | gpuMemCapacity | GPU Memory Capacity | Total GPU memory of a measured object | ≥ 0 | MB |
| Host metrics | GPU | aom_node_gpu_memory_usage | gpuMemUsage | GPU Memory Usage | Percentage of used GPU memory relative to the total GPU memory | 0–100 | % |
| Host metrics | GPU | aom_node_gpu_memory_used_megabytes | gpuMemUsed | Used GPU Memory | GPU memory used by a measured object | ≥ 0 | MB |
| Host metrics | GPU | aom_node_gpu_usage | gpuUtil | GPU Usage | GPU usage of a measured object | 0–100 | % |
| Host metrics | Host | aom_node_process_number | processNum | Number of Processes | Number of running processes on a measured object | ≥ 0 | N/A |
| Host metrics | Atlas 500 AI Edge Station | aom_node_npu_temperature_centigrade | node_temperature | Node Temperature | Temperature of the Atlas 500 AI Edge Station node, which is reported by calling the edgecore API | ≥ 0 | °C |
| Host metrics | Atlas 500 AI Edge Station | node_power | node_power | Node Power | Power of the Atlas 500 AI Edge Station node, which is reported by calling the edgecore API | ≥ 0 | W |
| Host metrics | Atlas 500 AI Edge Station | node_voltage | node_voltage | Node Voltage | Voltage of the Atlas 500 AI Edge Station node, which is reported by calling the edgecore API | ≥ 0 | V |
| Host metrics | Atlas 500 AI Edge Station | npu_temperature | npu_temperature | Chip Temperature | NPU temperature of the Atlas 500 AI Edge Station node, which is reported by calling the edgecore API | ≥ 0 | °C |
| Host metrics | Atlas 500 AI Edge Station | npu_health | npu_health | Chip Health Status | NPU health status of the Atlas 500 AI Edge Station node, which is reported by calling the edgecore API | ≥ 0 | N/A |
| Host metrics | Atlas 500 AI Edge Station | ai_cpu_rate | ai_cpu_rate | AI CPU Usage | AI CPU usage of the Ascend AI accelerator card, which is reported by calling the edgecore API | 0–100 | % |
| Host metrics | Atlas 500 AI Edge Station | ai_core_rate | ai_core_rate | AI Core Usage | AI Core usage of the Ascend AI accelerator card, which is reported by calling the edgecore API | 0–100 | % |
| Host metrics | Atlas 500 AI Edge Station | ctrl_cpu_rate | ctrl_cpu_rate | Control CPU Usage | Control CPU usage of the Ascend AI accelerator card, which is reported by calling the edgecore API | 0–100 | % |
| Host metrics | Atlas 500 AI Edge Station | ddr_cap_rate | ddr_cap_rate | DDR Memory Usage | DDR memory usage of the Atlas 500 AI Edge Station node, which is reported by calling the edgecore API | 0–100 | % |
| Host metrics | Atlas 500 AI Edge Station | ddr_bw_rate | ddr_bw_rate | DDR Bandwidth Usage | DDR bandwidth usage of the Atlas 500 AI Edge Station node, which is reported by calling the edgecore API | 0–100 | % |
| Container metrics | CPU | aom_container_cpu_limit_core | cpuCoreLimit | Total CPU Cores | Total number of CPU cores requested for a measured object | ≥ 1 | Cores |
| Container metrics | CPU | aom_container_cpu_used_core | cpuCoreUsed | Used CPU Cores | Number of CPU cores used by a measured object | ≥ 0 | Cores |
| Container metrics | CPU | aom_container_cpu_usage | cpuUsage | CPU Usage | CPU usage of a measured object | 0–100 | % |
| Container metrics | Memory | aom_container_memory_request_megabytes | memCapacity | Total Physical Memory | Total physical memory requested for a measured object | ≥ 0 | MB |
| Container metrics | Memory | aom_container_memory_used_megabytes | memUsed | Used Physical Memory | Used physical memory of a measured object | ≥ 0 | MB |
| Container metrics | Memory | memUsedRate | memUsedRate | Physical Memory Usage | Percentage of used physical memory relative to the total physical memory requested for a measured object | 0–100 | % |
| Container metrics | Disk | aom_container_disk_read_kilobytes | diskReadRate | Disk Read Rate | Volume of data read from a disk per second | ≥ 0 | KB/s |
| Container metrics | Disk | aom_container_disk_write_kilobytes | diskWriteRate | Disk Write Rate | Volume of data written to a disk per second | ≥ 0 | KB/s |
| Container metrics | Network | aom_container_network_receive_bytes | recvBytesRate | Downlink Rate (BPS) | Inbound traffic rate of a measured object | ≥ 0 | Bytes/s |
| Container metrics | Network | aom_container_network_transmit_bytes | sendBytesRate | Uplink Rate (BPS) | Outbound traffic rate of a measured object | ≥ 0 | Bytes/s |
| Container metrics | GPU | aom_container_gpu_memory_free_megabytes | gpuMemCapacity | GPU Memory Capacity | Total GPU memory of a measured object | ≥ 0 | MB |
| Container metrics | GPU | aom_container_gpu_memory_usage | gpuMemUsage | GPU Memory Usage | Percentage of used GPU memory relative to the total GPU memory | 0–100 | % |
| Container metrics | GPU | aom_container_gpu_memory_used_megabytes | gpuMemUsed | Used GPU Memory | GPU memory used by a measured object | ≥ 0 | MB |
| Container metrics | GPU | aom_container_gpu_usage | gpuUtil | GPU Usage | GPU usage of a measured object | 0–100 | % |
| Container metrics | Container status | aom_container_status | status | Container Status | Container status | ≥ 0 | N/A |
| Process metrics | CPU | aom_process_cpu_usage | cpuUsage | CPU Usage | CPU usage of a measured object | 0–100 | % |
| Process metrics | Memory | aom_process_memory_used_megabytes | memUsed | Used Physical Memory | Used physical memory of a measured object | ≥ 0 | MB |
| Process metrics | Process status | aom_process_status | status | Process Status | Process status | ≥ 0 | N/A |
| Process metrics | GPU | gpuMemCapacity | gpuMemCapacity | GPU Memory Capacity | Total GPU memory of a measured object | ≥ 0 | MB |
| Process metrics | GPU | gpuMemUsage | gpuMemUsage | GPU Memory Usage | Percentage of used GPU memory relative to the total GPU memory | 0–100 | % |
| Process metrics | GPU | gpuMemUsed | gpuMemUsed | Used GPU Memory | GPU memory used by a measured object | ≥ 0 | MB |
| Process metrics | GPU | gpuUtil | gpuUtil | GPU Usage | GPU usage of a measured object | 0–100 | % |
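
The value ranges and percentage definitions in Table 1 can be checked with a few lines of code. This is a minimal sketch under stated assumptions: the helpers `usage_percent` and `in_value_range` are hypothetical names introduced here, and the formula simply follows the table's wording ("percentage of used ... to the total ..."). How AOM or IEF actually computes these values is not described in this section.

```python
def usage_percent(used: float, total: float) -> float:
    """Percentage of 'used' relative to 'total', following the Table 1 wording."""
    if total <= 0:
        raise ValueError("total must be greater than 0")
    return used / total * 100.0


def in_value_range(value: float, unit: str, minimum: float = 0.0) -> bool:
    """Check a sample against the Table 1 ranges.

    Metrics with unit '%' must fall within 0-100; all other metrics only
    have a lower bound (0 by default, 1 for Total CPU Cores).
    """
    if unit == "%":
        return 0.0 <= value <= 100.0
    return value >= minimum


if __name__ == "__main__":
    # Example: a container using 512 MB of the 2048 MB it requested.
    rate = usage_percent(512, 2048)
    print(rate, in_value_range(rate, "%"))
```

For Total CPU Cores, pass `minimum=1` to reflect the `≥ 1` range in the table.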