
Updated on 2025-10-27 GMT+08:00


What Should I Do If the Monitoring Is Periodically Interrupted or the Agent Status Keeps Changing?

Symptom

Monitoring interruptions and unstable Agent status may be caused by Agent overload. The Agent is overloaded if you see either of the following symptoms:

On the Server Monitoring page, the Agent status frequently changes between Running and Faulty.
The metric graphs on the dashboard show gaps, that is, the monitoring data is discontinuous.

Constraints

The restoration method in this section applies only to the new Agent version. If your Agent is an earlier version, upgrade it to the new version first.

Run the following command to check the current Agent version:

if [[ -f /usr/local/uniagent/extension/install/telescope/bin/telescope ]]; then /usr/local/uniagent/extension/install/telescope/bin/telescope -v; elif [[ -f /usr/local/telescope/bin/telescope ]]; then echo "old agent"; else echo 0; fi

If old agent is displayed, an earlier version of the Agent is installed.
If a version ID is returned, the new version of the Agent is installed.
If 0 is returned, the Agent is not installed.

Possible Causes

When its CPU or memory usage is too high, the Agent applies the circuit breaker pattern to prevent other services on the server from being affected. The circuit breaker is triggered automatically when the Agent is overloaded, and no monitoring data is reported while it is in effect.

Circuit Breaker Principles

By default, Agent overload detection works as follows:

The system checks the CPU usage and memory usage of the Agent process every minute. If the CPU usage exceeds 30% or the memory usage exceeds 700 MB (the tier-2 threshold), the Agent process exits. If neither value exceeds the tier-2 threshold, the system checks whether either of them exceeds the tier-1 threshold (10% CPU usage or 200 MB of memory). If either value exceeds the tier-1 threshold in three consecutive checks, the Agent process exits and the event is recorded.
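The per-minute check described above can be modeled with a short shell sketch. This is only an illustration of the documented default thresholds, not the Agent's actual code: the function names classify_sample and should_exit are hypothetical, CPU usage is simplified to a whole-number percentage, and it assumes that a check below both thresholds resets the count of consecutive tier-1 violations.

```shell
# Default thresholds as documented in this section.
CPU_FIRST=10            # tier-1 CPU threshold (%)
MEM_FIRST=209715200     # tier-1 memory threshold (200 MB in bytes)
CPU_SECOND=30           # tier-2 CPU threshold (%)
MEM_SECOND=734003200    # tier-2 memory threshold (700 MB in bytes)

# classify_sample CPU_PCT MEM_BYTES (hypothetical helper)
# Prints tier2, tier1, or ok for one check.
classify_sample() {
    if [ "$1" -gt "$CPU_SECOND" ] || [ "$2" -gt "$MEM_SECOND" ]; then
        echo tier2
    elif [ "$1" -gt "$CPU_FIRST" ] || [ "$2" -gt "$MEM_FIRST" ]; then
        echo tier1
    else
        echo ok
    fi
}

# should_exit SAMPLE... (hypothetical helper), each SAMPLE is "cpu:mem"
# Prints "exit" if the sequence of checks would make the Agent exit,
# otherwise prints "running".
should_exit() {
    streak=0
    for sample in "$@"; do
        case $(classify_sample "${sample%%:*}" "${sample##*:}") in
            tier2) echo exit; return 0 ;;          # one tier-2 violation is enough
            tier1) streak=$((streak + 1)) ;;       # count consecutive tier-1 violations
            *)     streak=0 ;;                     # assumption: a clean check resets the count
        esac
        if [ "$streak" -ge 3 ]; then
            echo exit
            return 0
        fi
    done
    echo running
}
```

For example, should_exit 12:1000 12:1000 12:1000 models three consecutive checks at 12% CPU, and should_exit 5:800000000 models a single check at about 763 MB of memory.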

After the Agent exits, the daemon process automatically starts the Agent process and checks the exit records. If there are three consecutive exit records, the Agent will hibernate for 20 minutes, during which monitoring data will not be collected.

When too many disks are attached to a server, the CPU or memory usage of the Agent process becomes high. In this case, you can adjust the tier-1 and tier-2 thresholds as described in Procedure so that the circuit breaker is triggered at levels that match the actual resource usage.
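Before choosing new thresholds, it can help to measure what the Agent process actually consumes. The following is a minimal sketch for Linux that assumes the procps ps command and that the Agent process is named telescope; the function name agent_usage is hypothetical. Note that ps reports CPU usage averaged over the lifetime of the process, so short spikes are better observed with top.

```shell
# agent_usage PID (hypothetical helper)
# Prints "<cpu_pct> <rss_bytes>" for the given process.
# ps reports RSS in KiB on Linux, so it is converted to bytes here
# to match the unit used by the memory thresholds.
agent_usage() {
    ps -o pcpu= -o rss= -p "$1" | awk '{ printf "%s %d\n", $1, $2 * 1024 }'
}

# Example (assumes the Agent process name is exactly "telescope"):
# for pid in $(pgrep -x telescope); do agent_usage "$pid"; done
```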

Procedure

Use the root account to log in to the ECS or BMS for which the Agent does not report data.
Optional: Go to the Agent installation path.
For Windows, the path is C:\Program Files\uniagent\extension\install\telescope\bin.

For Linux, the path is /usr/local/uniagent/extension/install/telescope/bin.

Modify configuration file conf.json.

Run the following command to open conf.json:
vi conf.json

Add the following parameters to the conf.json file. For details about the parameters, see Table 1.

**Table 1** Parameters
Parameter	Description
cpu_first_pct_threshold	Tier-1 threshold of CPU usage. The default value is 10 (%).
memory_first_threshold	Tier-1 threshold of memory usage. The default value is 209715200 (200 MB). The unit is byte.
cpu_second_pct_threshold	Tier-2 threshold of CPU usage. The default value is 30 (%).
memory_second_threshold	Tier-2 threshold of memory usage. The default value is 734003200 (700 MB). The unit is byte.
Note: To query the CPU usage and memory usage of the Agent, use the following methods:
Linux: Run top -p PID, where PID is the process ID of the telescope process.
Windows: View the details about the Agent process in Task Manager.

{
    "cpu_first_pct_threshold": xx,
    "memory_first_threshold": xxx,
    "cpu_second_pct_threshold": xx,
    "memory_second_threshold": xxx
}
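Because the memory thresholds are given in bytes, converting from MB by hand is error-prone. The sketch below shows one way to generate the fragment; the helper names mb_to_bytes and print_thresholds are hypothetical, and the example values are for illustration only, not recommendations. If conf.json already contains other settings, the parameters need to be merged into the existing JSON object rather than pasted as a second object.

```shell
# mb_to_bytes MB (hypothetical helper)
# Prints the value in bytes, using 1 MB = 1024 * 1024 bytes
# (this matches the documented defaults: 200 MB = 209715200).
mb_to_bytes() {
    echo $(( $1 * 1024 * 1024 ))
}

# print_thresholds CPU1_PCT MEM1_MB CPU2_PCT MEM2_MB (hypothetical helper)
# Prints the four threshold parameters as a JSON fragment.
print_thresholds() {
    printf '{\n'
    printf '    "cpu_first_pct_threshold": %s,\n' "$1"
    printf '    "memory_first_threshold": %s,\n' "$(mb_to_bytes "$2")"
    printf '    "cpu_second_pct_threshold": %s,\n' "$3"
    printf '    "memory_second_threshold": %s\n' "$(mb_to_bytes "$4")"
    printf '}\n'
}

# Example with illustrative values only:
# print_thresholds 20 400 50 1000
```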

Save the conf.json file and exit.
:wq

Restart the Agent.
- Windows:
  - In the directory where the Agent installation package is stored, double-click the shutdown.bat script to stop the Agent, and then execute the start.bat script to start the Agent.
- Linux:
  - Check the Agent process ID.
    ps -ef |grep telescope
  - Run kill -9 PID to stop the process and then wait for 3 to 5 minutes for the Agent to automatically restart.
    kill -9 PID
    Figure 1 Restarting the Agent
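Instead of waiting a fixed 3 to 5 minutes on Linux, you can poll until the Agent process is back. This is a minimal sketch that assumes pgrep is available and that the Agent process name is exactly telescope; the function name wait_for_process is hypothetical.

```shell
# wait_for_process NAME TIMEOUT_SECONDS [INTERVAL_SECONDS] (hypothetical helper)
# Returns 0 as soon as a process with exactly this name exists,
# or 1 if none has appeared once the timeout has elapsed.
wait_for_process() {
    waited=0
    interval=${3:-10}
    while ! pgrep -x "$1" >/dev/null 2>&1; do
        if [ "$waited" -ge "$2" ]; then
            return 1
        fi
        sleep "$interval"
        waited=$((waited + interval))
    done
    return 0
}

# Example, run after stopping the Agent with kill -9 PID:
# wait_for_process telescope 300 && echo "Agent is running again"
```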