ALM-12016 CPU Usage Exceeds the Threshold
Alarm Description
The system checks the CPU usage every 30 seconds and compares it with the threshold. A default threshold is provided. This alarm is generated when the CPU usage exceeds the threshold for a configurable number of consecutive checks (Trigger Count, 10 by default).
The alarm is cleared in either of the following scenarios:
- Trigger Count is 1 and the CPU usage is less than or equal to the threshold.
- Trigger Count is greater than 1 and the CPU usage is less than or equal to 90% of the threshold.
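The generation and clearance rules above can be sketched as a small state machine. This is only an illustrative model under stated assumptions, not FusionInsight Manager's implementation: the class name `CpuAlarm`, its fields, and the assumption that the alarm is raised on the check that completes the run of consecutive over-threshold samples are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CpuAlarm:
    """Hypothetical model of the raise/clear rule described above."""
    threshold: float          # CPU usage threshold, in percent
    trigger_count: int = 10   # consecutive over-threshold checks required
    active: bool = False      # whether the alarm is currently raised
    _streak: int = 0          # current run of over-threshold checks

    def check(self, usage: float) -> bool:
        """Feed one periodic sample and return whether the alarm is active."""
        if usage > self.threshold:
            self._streak += 1
            # Assumption: the alarm is raised on the check that completes the run.
            if self._streak >= self.trigger_count:
                self.active = True
        else:
            self._streak = 0
            # Clear level: the threshold itself when Trigger Count is 1,
            # otherwise 90% of the threshold.
            if self.trigger_count == 1:
                clear_level = self.threshold
            else:
                clear_level = 0.9 * self.threshold
            if self.active and usage <= clear_level:
                self.active = False
        return self.active
```

With a threshold of 80 and a Trigger Count greater than 1, a sample of 75 does not clear an active alarm in this model, because the clear level is 72. This gap keeps the alarm from flapping when usage hovers near the threshold.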
Alarm Attributes
Alarm ID | Alarm Severity | Alarm Type | Service Type | Auto Cleared |
---|---|---|---|---|
12016 | Major | Physical resource | FusionInsight Manager | Yes |
Alarm Parameters
Type | Parameter | Description |
---|---|---|
Location Information | Source | Specifies the cluster or system for which the alarm is generated. |
Location Information | ServiceName | Specifies the service for which the alarm is generated. |
Location Information | RoleName | Specifies the role for which the alarm is generated. |
Location Information | HostName | Specifies the host for which the alarm is generated. |
Additional Information | Trigger Condition | Specifies the triggering condition for which the alarm is generated. |
Impact on the System
- Latency: If the CPU usage of a host is too high, service processes may run slowly and services may be delayed.
- Service failure: If the host CPU usage is too high, service processing may slow down, time out, or fail. As a result, jobs may fail to run.
Possible Causes
- The alarm threshold or Trigger Count is set incorrectly.
- The CPU configuration cannot meet service requirements, so the CPU usage reaches the upper limit. Alternatively, the service is in peak hours, so the CPU usage reaches the upper limit within a short period of time.
Handling Procedure
Check whether the alarm threshold and Trigger Count are configured correctly.
- Change the alarm threshold and Trigger Count based on the actual CPU usage.
On FusionInsight Manager, choose O&M > Alarm > Thresholds > Name of the desired cluster > Host > CPU > Host CPU Usage and change Trigger Count based on the actual CPU usage, as shown in Figure 1.
Trigger Count sets how many consecutive checks must find the CPU usage above the threshold before the alarm is generated.
On the Host CPU Usage page, click Modify in the Operation column to change the alarm threshold, as shown in Figure 2.
- After 2 minutes, check whether the alarm is cleared.
- If yes, no further action is required.
- If no, check whether the CPU usage reaches the upper limit, as described in the next part of this procedure.
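When choosing a Trigger Count, it can help to translate it into time. The sketch below assumes that the alarm is raised on the last of the consecutive over-threshold checks and uses the 30-second check interval stated in the alarm description. The function name and its parameters are hypothetical.

```python
CHECK_INTERVAL_S = 30  # check period stated in the alarm description


def min_sustained_seconds(trigger_count: int, interval_s: int = CHECK_INTERVAL_S) -> int:
    """Time spanned by `trigger_count` consecutive checks, first to last.

    Assumption: the alarm is raised on the last check of the run, so the
    high CPU usage must persist for at least this long.
    """
    if trigger_count < 1:
        raise ValueError("trigger_count must be at least 1")
    return (trigger_count - 1) * interval_s
```

Under these assumptions, the default Trigger Count of 10 corresponds to at least 270 seconds of sustained high usage, while a Trigger Count of 1 raises the alarm on a single over-threshold check.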
Check whether the CPU usage reaches the upper limit.
- In the alarm list on FusionInsight Manager, open the alarm details in the row where the alarm is located to view the address of the host that reported it.
- On the Hosts page, click the node on which the alarm is reported.
- Observe the CPU usage for 5 minutes. If the CPU usage exceeds the threshold multiple times, contact the system administrator to add more CPUs.
- Check whether the service is currently in peak hours. If the alarm is generated during peak hours, expand the capacity of the node or contact the system administrator to add more CPUs.
- Check whether the alarm is cleared.
- If yes, no further action is required.
- If no, collect fault information as described in the next part of this procedure.
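If you prefer to cross-check the 5-minute observation from a shell on the affected Linux host, the following minimal sketch derives CPU usage from two snapshots of the aggregate `cpu` line in `/proc/stat`. It is not how FusionInsight Manager computes the metric. Treating iowait as idle time is an assumption, and all function names are hypothetical.

```python
import time
from typing import Tuple


def cpu_times(stat_line: str) -> Tuple[int, int]:
    """Return (idle, total) jiffies from the aggregate 'cpu' line of /proc/stat."""
    fields = stat_line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    values = [int(v) for v in fields[1:]]
    # Field order: user nice system idle iowait irq softirq steal guest guest_nice
    # Assumption: iowait is counted as idle time.
    idle = values[3] + (values[4] if len(values) > 4 else 0)
    # guest and guest_nice are already included in user and nice, so skip them.
    total = sum(values[:8])
    return idle, total


def cpu_usage_percent(before: str, after: str) -> float:
    """CPU usage in percent between two snapshots of the 'cpu' line."""
    idle0, total0 = cpu_times(before)
    idle1, total1 = cpu_times(after)
    delta_total = total1 - total0
    if delta_total <= 0:
        return 0.0
    return 100.0 * (1 - (idle1 - idle0) / delta_total)


def read_cpu_line(path: str = "/proc/stat") -> str:
    """Read the first line of /proc/stat, which aggregates all CPUs."""
    with open(path) as f:
        return f.readline()


def count_breaches(threshold: float, duration_s: int = 300, interval_s: int = 30) -> int:
    """Sample for `duration_s` seconds and count intervals above `threshold`."""
    breaches = 0
    prev = read_cpu_line()
    for _ in range(duration_s // interval_s):
        time.sleep(interval_s)
        cur = read_cpu_line()
        if cpu_usage_percent(prev, cur) > threshold:
            breaches += 1
        prev = cur
    return breaches
```

Calling `count_breaches` with the threshold configured for this alarm samples the host for 5 minutes at 30-second intervals and reports how many intervals exceeded that threshold.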
Collect fault information.
- On FusionInsight Manager of the active cluster, choose O&M > Log > Download.
- In Service, select OmmServer and click OK.
- In Time Range, set Start Date to 10 minutes before the alarm generation time and End Date to 10 minutes after the alarm generation time, and then click Download.
- Contact the O&M engineers and send the collected log information.
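The log collection window in the steps above is simple date arithmetic. The sketch below shows one way to compute the Start Date and End Date values from the alarm generation time. The function name is hypothetical.

```python
from datetime import datetime, timedelta
from typing import Tuple


def log_window(alarm_time: datetime, margin_minutes: int = 10) -> Tuple[datetime, datetime]:
    """Return (start, end): the alarm time minus and plus the margin."""
    margin = timedelta(minutes=margin_minutes)
    return alarm_time - margin, alarm_time + margin
```

Using `datetime` arithmetic handles windows that cross midnight or a month boundary, which is easy to get wrong when adjusting the time fields by hand.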
Alarm Clearance
After the fault is rectified, the system automatically clears this alarm.
Related Information
None.