What Should I Do If the Monitoring Is Periodically Interrupted or the Agent Status Keeps Changing?
Symptom
Monitoring interruptions and unstable Agent status may be caused by Agent overload. The Agent is overloaded if you see either of the following symptoms:
- On the Server Monitoring page, the Agent status frequently changes between Running and Faulty.
- The metric data on the dashboard is discontinuous, with gaps for some periods.
Constraints
The restoration method in this section applies only to the new Agent version. If your Agent is an earlier version, upgrade it to the new version first. Check the installed version as follows:
- Linux:
- Log in to the server as user root.
- Check the Agent version:
if [[ -f /usr/local/uniagent/extension/install/telescope/bin/telescope ]]; then /usr/local/uniagent/extension/install/telescope/bin/telescope -v; elif [[ -f /usr/local/telescope/bin/telescope ]]; then echo "old agent"; else echo 0; fi
- If old agent is returned, the earlier version of the Agent is used. Manage the Agent based on its version.
- If a version is returned, the new version of the Agent is in use. Manage the Agent based on its version.
- If 0 is returned, the Agent is not installed.
- Windows: Determine the Agent version based on the installation path. Default installation paths:
- New version: C:\Program Files\uniagent\extension\install\telescope
- Earlier version: C:\Program Files\telescope
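For scripted checks on Linux, the one-line command above can be wrapped in a small function. This is only a sketch: the function name agent_flavor and its optional ROOT argument are hypothetical, and ROOT exists only so the function can be tried against a scratch directory. The two paths are the ones listed in this section.

```shell
#!/bin/sh
# Hypothetical helper; not shipped with the Agent.
# agent_flavor [ROOT]
# Prints "new", "old", or "none" depending on which telescope binary exists.
# ROOT is an optional path prefix (empty on a real server).
agent_flavor() (
  root=${1:-}
  if [ -f "$root/usr/local/uniagent/extension/install/telescope/bin/telescope" ]; then
    echo "new"     # new Agent version: the procedure in this section applies
  elif [ -f "$root/usr/local/telescope/bin/telescope" ]; then
    echo "old"     # earlier Agent version: upgrade first
  else
    echo "none"    # Agent not installed
  fi
)

agent_flavor
```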
Possible Causes
To prevent other services on the server from being affected, the Agent implements the circuit breaker pattern when its CPU or memory usage is too high. The circuit breaker is triggered automatically when the Agent is overloaded, and no monitoring data is reported while it is in effect.
Circuit Breaker Principles
By default, the Agent detection mechanism works as follows:
- The system checks the CPU usage and memory usage of the Agent process every minute.
- If the CPU usage exceeds 30% or the memory usage exceeds 700 MB (the tier-2 threshold), the Agent process exits.
- If neither value exceeds the tier-2 threshold, the system checks whether the CPU usage exceeds 10% or the memory usage exceeds 200 MB (the tier-1 threshold). If either value exceeds the tier-1 threshold three consecutive times, the Agent process exits and the exit is recorded.
After the Agent exits, the daemon process automatically starts the Agent process and checks the exit records. If there are three consecutive exit records, the Agent will hibernate for 20 minutes, during which monitoring data will not be collected.
When too many disks are attached to a server, the CPU or memory usage of the Agent process becomes high. You can adjust the tier-1 and tier-2 thresholds as described in Procedure so that the circuit breaker is triggered at levels that match the actual resource usage.
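The default check described above can be sketched in shell as follows. This is an illustration of the documented rules, not the Agent's actual implementation: the function names check_agent and daemon_action are hypothetical, CPU usage is treated as a whole-number percentage, and resetting the counter after an exit is an assumption.

```shell
#!/bin/sh
# Hypothetical sketch of the default circuit breaker rules.

CPU_FIRST=10            # tier-1 CPU threshold (%)
MEM_FIRST=209715200     # tier-1 memory threshold (200 MB, in bytes)
CPU_SECOND=30           # tier-2 CPU threshold (%)
MEM_SECOND=734003200    # tier-2 memory threshold (700 MB, in bytes)

# check_agent CPU_PCT MEM_BYTES PRIOR_TIER1_VIOLATIONS
# Prints "exit" or "ok", followed by the updated count of consecutive
# tier-1 violations.
check_agent() (
  cpu=$1; mem=$2; strikes=$3
  if [ "$cpu" -gt "$CPU_SECOND" ] || [ "$mem" -gt "$MEM_SECOND" ]; then
    echo "exit 0"                       # tier-2 exceeded: exit immediately
  elif [ "$cpu" -gt "$CPU_FIRST" ] || [ "$mem" -gt "$MEM_FIRST" ]; then
    strikes=$((strikes + 1))
    if [ "$strikes" -ge 3 ]; then
      echo "exit 0"                     # third consecutive tier-1 violation
    else
      echo "ok $strikes"
    fi
  else
    echo "ok 0"                         # below tier-1: the streak is broken
  fi
)

# daemon_action CONSECUTIVE_EXIT_RECORDS
# After a restart, three consecutive exit records lead to a 20-minute hibernation.
daemon_action() {
  if [ "$1" -ge 3 ]; then echo "hibernate"; else echo "run"; fi
}

check_agent 15 104857600 2   # prints "exit 0"
```

With the defaults, an Agent that steadily uses 15% CPU exits on its third check, and after three such exits in a row the daemon lets it hibernate. Raising the tier-1 CPU threshold above the Agent's steady usage avoids that loop.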
Procedure
- Log in to the ECS or BMS where the Agent does not report data.
- Optional: Go to the Agent installation path.
For Windows, the path is C:\Program Files\uniagent\extension\install\telescope\bin.
For Linux, the path is /usr/local/uniagent/extension/install/telescope/bin.
- Modify the configuration file conf.json.
- Run the following command to open conf.json:
vi conf.json
- Add the following parameters to the conf.json file. For details about the parameters, see Table 1.
Table 1 Parameters
| Parameter | Description |
| --- | --- |
| cpu_first_pct_threshold | Tier-1 threshold of CPU usage, in percent. The default value is 10. |
| memory_first_threshold | Tier-1 threshold of memory usage, in bytes. The default value is 209715200 (200 MB). |
| cpu_second_pct_threshold | Tier-2 threshold of CPU usage, in percent. The default value is 30. |
| memory_second_threshold | Tier-2 threshold of memory usage, in bytes. The default value is 734003200 (700 MB). |
Before choosing values, query the current CPU usage and memory usage of the Agent process (for example, with top on Linux or Task Manager on Windows). Then add the parameters, replacing xx and xxx with the values you chose:
{
  "cpu_first_pct_threshold": xx,
  "memory_first_threshold": xxx,
  "cpu_second_pct_threshold": xx,
  "memory_second_threshold": xxx
}
- Save the conf.json file and exit.
:wq
- Restart the Agent.
- Windows:
- In the C:\Program Files\uniagent\extension\install\telescope directory, double-click shutdown.bat to stop the Agent, and then run start.bat to start the Agent.
- In the C:\Program Files\uniagent\script directory, double-click shutdown.bat to stop the Agent, and then run start.bat to start the Agent.
- Linux:
service ces-uniagent stop
service ces-uniagent start
/usr/local/uniagent/extension/install/telescope/telescoped restart
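Because the memory thresholds in conf.json are expressed in bytes, a small helper can reduce conversion mistakes when preparing the values for step 3. The following is a minimal sketch: the function names mb_to_bytes and render_thresholds are hypothetical, the sample values are illustrative rather than recommended, and the check that tier-1 values stay below tier-2 values is an assumption of this sketch rather than a documented rule.

```shell
#!/bin/sh
# Illustrative helpers; not part of the Agent.

# Convert megabytes to bytes (1 MB = 1024 * 1024 bytes, which matches the
# documented defaults: 200 MB = 209715200 and 700 MB = 734003200).
mb_to_bytes() {
  echo $(( $1 * 1024 * 1024 ))
}

# render_thresholds CPU1_PCT MEM1_MB CPU2_PCT MEM2_MB
# Prints the four parameters in the form expected inside conf.json.
render_thresholds() {
  if [ "$1" -ge "$3" ] || [ "$2" -ge "$4" ]; then
    echo "tier-1 thresholds must be lower than tier-2 thresholds" >&2
    return 1
  fi
  printf '"cpu_first_pct_threshold": %s,\n' "$1"
  printf '"memory_first_threshold": %s,\n' "$(mb_to_bytes "$2")"
  printf '"cpu_second_pct_threshold": %s,\n' "$3"
  printf '"memory_second_threshold": %s\n' "$(mb_to_bytes "$4")"
}

# Example values only; base real thresholds on the Agent's measured usage.
render_thresholds 20 400 50 1024
```

The printed lines can be pasted into the existing JSON object in conf.json. The Agent must still be restarted afterwards, as described in the last step.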