Updated on 2025-08-19 GMT+08:00

Overview

Whether you are using ECSs or BMSs, you can use server monitoring to track various OS metrics, monitor server resource usage, and query monitoring data when faults occur.

Server monitoring consists of basic monitoring, process monitoring, and OS monitoring for servers.

  • Basic monitoring covers metrics automatically reported by ECSs. The data is collected every 5 minutes. For details, see Cloud Product Metrics. Basic monitoring is unavailable for BMSs.
  • OS monitoring provides proactive, fine-grained OS monitoring for ECSs and BMSs on which the Agent is installed. Data is collected every minute and includes metrics such as CPU usage and memory usage. For more information, see Cloud Product Metrics.
  • Process monitoring tracks active processes on hosts. By default, Cloud Eye collects the CPU usage, memory usage, and number of open files of active processes.
  • Windows and Linux OSs are supported. For details, see What OSs Does the Agent Support?
  • Recommended specifications for server monitoring are 2 vCPUs and 4 GiB of memory for Linux servers, and 4 vCPUs and 8 GiB of memory or higher for Windows servers.
  • To install the Agent on a Linux server, you must have root permissions. For a Windows server, you must have administrator permissions.
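
The recommended specifications in the list above can be checked before installing the Agent. The following is a minimal sketch, not part of Cloud Eye: the function name meets_recommendation and the RECOMMENDED table are hypothetical, and treating the recommended values as minimums is an assumption.

```python
# Recommended specifications quoted in the list above: (vCPUs, memory in GiB).
# Treating them as minimums is an assumption made for this sketch.
RECOMMENDED = {
    "linux": (2, 4),
    "windows": (4, 8),
}


def meets_recommendation(os_family: str, vcpus: int, memory_gib: float) -> bool:
    """Return True if the server is at or above the recommended specification."""
    min_vcpus, min_memory_gib = RECOMMENDED[os_family.lower()]
    return vcpus >= min_vcpus and memory_gib >= min_memory_gib
```

For example, meets_recommendation("Linux", 2, 4) returns True, while the same server evaluated as "Windows" does not meet the recommendation.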

Constraints

Server monitoring is available only for servers that use Huawei Cloud public images. If you use a private image, Cloud Eye does not provide technical support for any issues that arise.

Monitoring Capabilities

Server monitoring tracks multiple metrics, including CPU, memory, disk, and network usage, to meet basic monitoring and O&M requirements for servers. For details about metrics, see Cloud Product Metrics.

Resource Usage

The Agent uses minimal system resources: no more than 10% of a single CPU core and no more than 200 MB of memory. In typical operation, it uses less than 5% of a single core and less than 100 MB of memory.
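
The limits stated above can be read as a simple comparison of a usage sample against two ceilings. The following is a minimal sketch under stated assumptions: AgentUsage and exceeded_limits are hypothetical names, the default values are the figures quoted in this section, and the code is not the Agent's actual implementation (the real thresholds are set in the Agent configuration file).

```python
from dataclasses import dataclass
from typing import List

# Ceilings quoted in this section; assumed defaults for illustration only.
CPU_LIMIT_PERCENT = 10.0  # percent of a single CPU core
MEMORY_LIMIT_MB = 200.0   # resident memory in MB


@dataclass
class AgentUsage:
    cpu_percent: float  # percent of a single core used by the Agent
    memory_mb: float    # memory used by the Agent, in MB


def exceeded_limits(
    usage: AgentUsage,
    cpu_limit: float = CPU_LIMIT_PERCENT,
    memory_limit: float = MEMORY_LIMIT_MB,
) -> List[str]:
    """Return the names of the limits that the usage sample exceeds."""
    exceeded = []
    if usage.cpu_percent > cpu_limit:
        exceeded.append("cpu")
    if usage.memory_mb > memory_limit:
        exceeded.append("memory")
    return exceeded
```

A typical sample such as AgentUsage(cpu_percent=4.0, memory_mb=90.0) exceeds neither limit, so the function returns an empty list.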

In some scenarios, the CPU and memory usage of the Agent may increase sharply due to server operations. If the resource usage exceeds the threshold, circuit-breaking will be activated. The following table describes some common scenarios and typical solutions.

Table 1 High Agent resource usage scenarios

Cause: Too many TCP connections

Scenario: By default, the Agent collects only two basic metrics, TCP TOTAL and TCP ESTABLISHED, which consume little CPU. If you enable any detailed TCP metric by updating the configuration file, the Agent starts collecting all TCP metrics, which consumes a lot of CPU resources.

Basic TCP metrics: TCP TOTAL and TCP ESTABLISHED

Detailed TCP metrics: TCP SYS_SENT, TCP SYS_RECV, TCP FIN_WAIT1, TCP FIN_WAIT2, TCP TIME_WAIT, TCP CLOSE, TCP CLOSE_WAIT, TCP LAST_ACK, TCP LISTEN, and TCP CLOSING

Procedure:

Method 1: Modify the configuration file to stop collecting detailed TCP metrics and reduce CPU usage. For details, see How Do I Enable or Disable Metric Collection by Modifying the Configuration File?

Method 2: Modify the configuration file to change the Agent resource usage threshold. For details, see How Do I Change the Agent Resource Usage Threshold by Modifying the Configuration File?

Cause: Too many file handles

Scenario: While running, the Agent monitors all files opened by processes on the server to count the total number of file handles. If there are too many file handles, the Agent task is re-executed, resulting in high CPU usage.

Procedure:

Method 1: Modify the configuration file to reduce the Agent's metric update frequency for processes, which lowers CPU usage. For details, see How Do I Change the Process Collection Frequency by Modifying the Configuration File?

Method 2: Modify the configuration file to change the Agent resource usage threshold. For details, see How Do I Change the Agent Resource Usage Threshold by Modifying the Configuration File?

Cause: Too many processes

Scenario: While running, the Agent scans all processes on the server and collects process-level metrics by reading process information. If there are too many processes, the Agent task is re-executed, leading to high CPU usage.