Updated on 2025-07-17 GMT+08:00

NPU Metrics

If the CCE AI Suite (Ascend NPU) add-on version is 2.1.55 or later, you can use NPU-Exporter to monitor and collect Ascend AI processor metrics. NPU-Exporter obtains and reports runtime data of Ascend AI processors, such as the number of processors and the real-time receive rate of network ports. These metrics are referred to as NPU metrics. Monitoring them gives you real-time visibility into NPU performance and helps you detect and resolve potential issues so that NPUs run stably and efficiently. This section describes the NPU metrics reported by NPU-Exporter.

Billing

NPU metrics are custom metrics. Uploading such metrics to AOM incurs fees. For details, see Pricing Details.

Applicable NPU Nodes

NPU metrics can be monitored and collected only for the nodes listed in AI-accelerated ECSs.

NPU Metrics

NPU-Exporter can collect 73 NPU metrics. This section focuses on the common metrics supported by nodes listed in AI-accelerated ECSs. For details, see Table 1.

Table 1 NPU metrics

| Category | Metric | Description | Metric Label | Field Type |
|---|---|---|---|---|
| NPU | npu_chip_info_name | Name and ID of an Ascend AI processor | container_name: a container name | String |
| | | | id: an NPU ID | String |
| | | | model_name: name of an Ascend AI processor | String |
| | | | namespace: a namespace name | String |
| | | | pcie_bus_info: PCIe information of an Ascend AI processor | String |
| | | | pod_name: a pod name | String |
| | | | vdie_id: unique ID of an Ascend AI processor, which can be used as the UUID of the NPU | String |
| | npu_chip_info_health_status | Health status of an Ascend AI processor. Options: 0 (unhealthy), 1 (healthy) | container_name: a container name | String |
| | | | id: an NPU ID | String |
| | | | model_name: name of an Ascend AI processor | String |
| | | | namespace: a namespace name | String |
| | | | pcie_bus_info: PCIe information of an Ascend AI processor | String |
| | | | pod_name: a pod name | String |
| | | | vdie_id: unique ID of an Ascend AI processor, which can be used as the UUID of the NPU | String |
| | npu_chip_info_power | Power consumption of an Ascend AI processor, in watts (W). NOTE: If the NPU on the node is Snt3P, this metric specifies the board power consumption. If the NPU is Snt3, this metric specifies the power consumption of the Ascend AI processor. | container_name: a container name | String |
| | | | id: an NPU ID | String |
| | | | model_name: name of an Ascend AI processor | String |
| | | | namespace: a namespace name | String |
| | | | pcie_bus_info: PCIe information of an Ascend AI processor | String |
| | | | pod_name: a pod name | String |
| | | | vdie_id: unique ID of an Ascend AI processor, which can be used as the UUID of the NPU | String |
| | npu_chip_info_temperature | Temperature of an Ascend AI processor, in degrees Celsius (°C) | container_name: a container name | String |
| | | | id: an NPU ID | String |
| | | | model_name: name of an Ascend AI processor | String |
| | | | namespace: a namespace name | String |
| | | | pcie_bus_info: PCIe information of an Ascend AI processor | String |
| | | | pod_name: a pod name | String |
| | | | vdie_id: unique ID of an Ascend AI processor, which can be used as the UUID of the NPU | String |
| | npu_chip_info_utilization | AI Core usage of an Ascend AI processor, in percent (%) | container_name: a container name | String |
| | | | id: an NPU ID | String |
| | | | model_name: name of an Ascend AI processor | String |
| | | | namespace: a namespace name | String |
| | | | pcie_bus_info: PCIe information of an Ascend AI processor | String |
| | | | pod_name: a pod name | String |
| | | | vdie_id: unique ID of an Ascend AI processor, which can be used as the UUID of the NPU | String |
| | npu_chip_info_vector_utilization | AI Vector usage of an Ascend AI processor | container_name: a container name | String |
| | | | id: an NPU ID | String |
| | | | model_name: name of an Ascend AI processor | String |
| | | | namespace: a namespace name | String |
| | | | pcie_bus_info: PCIe information of an Ascend AI processor | String |
| | | | pod_name: a pod name | String |
| | | | vdie_id: unique ID of an Ascend AI processor, which can be used as the UUID of the NPU | String |
| DDR | npu_chip_info_used_memory | Used DDR memory of an Ascend AI processor, in MB | container_name: a container name | String |
| | | | id: an NPU ID | String |
| | | | model_name: name of an Ascend AI processor | String |
| | | | namespace: a namespace name | String |
| | | | pcie_bus_info: PCIe information of an Ascend AI processor | String |
| | | | pod_name: a pod name | String |
| | | | vdie_id: unique ID of an Ascend AI processor, which can be used as the UUID of the NPU | String |
| | npu_chip_info_total_memory | Total DDR memory of an Ascend AI processor, in MB | container_name: a container name | String |
| | | | id: an NPU ID | String |
| | | | model_name: name of an Ascend AI processor | String |
| | | | namespace: a namespace name | String |
| | | | pcie_bus_info: PCIe information of an Ascend AI processor | String |
| | | | pod_name: a pod name | String |
| | | | vdie_id: unique ID of an Ascend AI processor, which can be used as the UUID of the NPU | String |
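
The two DDR metrics in Table 1 can be combined into a usage percentage per processor. The following is a minimal sketch, assuming the metrics are available as Prometheus text-format lines. The sample text and all label values in it are made up for illustration, and parse_samples and ddr_usage_percent are hypothetical helper names, not part of NPU-Exporter. Only the metric and label names come from Table 1.

```python
import re

# Hypothetical scrape output. The label values are invented for illustration;
# only the metric names and label names are taken from Table 1.
SAMPLE = '''\
# Example lines in Prometheus text format (assumed, not captured from a real node)
npu_chip_info_used_memory{container_name="train",id="0",model_name="example-model",namespace="default",pcie_bus_info="0000:00:0d.0",pod_name="train-0",vdie_id="example-vdie-0"} 8192
npu_chip_info_total_memory{container_name="train",id="0",model_name="example-model",namespace="default",pcie_bus_info="0000:00:0d.0",pod_name="train-0",vdie_id="example-vdie-0"} 32768
'''

# metric_name{labels} value [timestamp]
LINE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)\{(?P<labels>.*)\}\s+(?P<value>\S+)(?:\s+-?\d+)?$'
)
# key="value" pairs inside the braces
LABEL_RE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def parse_samples(text):
    """Return a list of (metric_name, labels_dict, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        match = LINE_RE.match(line)
        if match is None:
            continue  # skip lines this simple parser does not understand
        labels = dict(LABEL_RE.findall(match.group("labels")))
        samples.append((match.group("name"), labels, float(match.group("value"))))
    return samples


def ddr_usage_percent(samples):
    """Map vdie_id to used/total DDR memory as a percentage."""
    used, total = {}, {}
    for name, labels, value in samples:
        key = labels.get("vdie_id")  # described in Table 1 as the processor's unique ID
        if name == "npu_chip_info_used_memory":
            used[key] = value
        elif name == "npu_chip_info_total_memory":
            total[key] = value
    # Skip processors with a missing or zero total to avoid division by zero.
    return {k: 100.0 * used[k] / total[k] for k in used if total.get(k)}


if __name__ == "__main__":
    print(ddr_usage_percent(parse_samples(SAMPLE)))
```

The sketch keys the result on vdie_id because Table 1 describes it as usable as the UUID of the NPU. The id label alone may repeat across nodes.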

Helpful Links

You can use NPU-Exporter to monitor these NPU metrics. For details, see Comprehensive Monitoring of NPU Metrics.
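
As a rough illustration of what such monitoring might check, the sketch below flags processors that report an unhealthy status or a high temperature. It assumes the metric values have already been collected into simple records. The Sample type, the find_problem_chips helper, the example values, and the 85 °C threshold are all hypothetical and are not taken from NPU-Exporter or AOM. Only the metric names and the meaning of health status 0 come from Table 1.

```python
from typing import List, NamedTuple, Tuple


class Sample(NamedTuple):
    """One collected metric value (hypothetical record shape)."""
    metric: str    # metric name from Table 1
    vdie_id: str   # unique ID of the Ascend AI processor
    value: float


def find_problem_chips(samples: List[Sample],
                       max_temp_c: float = 85.0) -> List[Tuple[str, str]]:
    """Return (vdie_id, reason) pairs for processors that need attention.

    max_temp_c is an illustrative threshold, not a vendor recommendation.
    """
    problems = []
    for s in samples:
        if s.metric == "npu_chip_info_health_status" and s.value == 0:
            # Table 1: 0 means the Ascend AI processor is unhealthy.
            problems.append((s.vdie_id, "unhealthy"))
        elif s.metric == "npu_chip_info_temperature" and s.value > max_temp_c:
            problems.append(
                (s.vdie_id, f"temperature {s.value:.0f} C exceeds {max_temp_c:.0f} C")
            )
    return problems


if __name__ == "__main__":
    # Invented example values for two processors.
    example = [
        Sample("npu_chip_info_health_status", "example-vdie-0", 1),
        Sample("npu_chip_info_temperature", "example-vdie-0", 90),
        Sample("npu_chip_info_health_status", "example-vdie-1", 0),
        Sample("npu_chip_info_temperature", "example-vdie-1", 60),
    ]
    for vdie_id, reason in find_problem_chips(example):
        print(vdie_id, reason)
```

In practice, such checks are usually expressed as alarm rules in the monitoring system rather than in a script. The function form is used here only to make the logic explicit.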