Updated on 2025-11-04 GMT+08:00

(Optional) Configuring Monitoring and Alarms

Monitoring Configurations

Using Cloud Eye to Monitor NPU Resources

Lite Server uses Cloud Eye for monitoring. The agent plug-in that Cloud Eye requires is pre-installed on Lite Server, so you can view NPU resource data directly in Cloud Eye. For details, see Monitoring Lite Server Resources.

Key Metrics

The table below lists the common monitoring metrics. For details about all supported metrics, see Lite Server Monitoring Metrics.

Table 1 Common monitoring metrics

| No. | Category | Metric | Display Name | Description | Unit | Number System | Range |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | DDR | npu_util_rate_mem | NPU Memory Usage | NPU memory usage | % | N/A | 0%–100% |
| 2 | DDR | npu_util_rate_mem_bandwidth | NPU Memory Bandwidth Usage | NPU memory bandwidth usage | % | N/A | 0%–100% |
| 3 | HBM | npu_hbm_bandwidth_util | HBM Bandwidth Usage | NPU HBM bandwidth usage | % | N/A | 0%–100% |
| 4 | HBM | npu_util_rate_hbm_bw | HBM Bandwidth Usage | NPU HBM bandwidth usage | % | N/A | 0%–100% |
| 5 | AI Core | npu_util_rate_ai_core | NPU AI Core Usage | AI core usage of the NPU | % | N/A | 0%–100% |
| 6 | / | mem_usedPercent | Memory Usage | Memory usage of the monitored object. Linux: the value is calculated from the /proc/meminfo file. If MemAvailable is displayed in /proc/meminfo, MemUsedPercent = (MemTotal - MemAvailable)/MemTotal. If MemAvailable is not displayed in /proc/meminfo, MemUsedPercent = (MemTotal - MemFree - Buffers - Cached)/MemTotal. Windows: MemUsedPercent = Used memory size/Total memory size x 100%. | % | N/A | 0%–100% |
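The Linux formula for mem_usedPercent can be reproduced locally to cross-check the value shown in Cloud Eye. The following is a minimal sketch, not the Cloud Eye agent's actual implementation. The function names (parse_meminfo, mem_used_percent, read_mem_used_percent) are hypothetical and chosen for illustration only. It assumes the standard /proc/meminfo layout, in which each line has the form "Name: value kB".

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a {field name: value in kB} dict."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines without a "Name: value" shape
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def mem_used_percent(fields):
    """Apply the Linux formula from Table 1 and return a percentage."""
    total = fields["MemTotal"]
    if "MemAvailable" in fields:
        # MemAvailable is displayed: (MemTotal - MemAvailable) / MemTotal
        used = total - fields["MemAvailable"]
    else:
        # Fallback: (MemTotal - MemFree - Buffers - Cached) / MemTotal
        used = total - fields["MemFree"] - fields["Buffers"] - fields["Cached"]
    return 100.0 * used / total


def read_mem_used_percent(path="/proc/meminfo"):
    """Hypothetical helper: compute the percentage for the local Linux host."""
    with open(path) as f:
        return mem_used_percent(parse_meminfo(f.read()))
```

On a Linux host, calling read_mem_used_percent() returns the current percentage. Small differences from the value in Cloud Eye are expected, because the two readings are taken at different moments.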

Alarm Settings

You can use Cloud Eye to collect key events and cloud resource operation events. When such an event occurs, Cloud Eye sends you an alarm. For the list of events that can trigger alarms, see Supported Events.