ALM-12016 CPU Usage Exceeds the Threshold
Description
The system checks the CPU usage every 30 seconds and compares it with the threshold, which has a default value. This alarm is generated when the CPU usage exceeds the threshold a configurable number of consecutive times (Trigger Count, 10 by default).
The alarm is cleared in either of the following scenarios:
- Trigger Count is 1 and the CPU usage is less than or equal to the threshold.
- Trigger Count is greater than 1 and the CPU usage is less than or equal to 90% of the threshold.
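The trigger and clear rules above can be sketched as a small state machine. This is an illustrative model only, not the FusionInsight Manager implementation: the class name `CpuAlarm` and its fields are hypothetical, and it assumes that the alarm fires once the number of consecutive over-threshold checks reaches Trigger Count and that a single qualifying check is enough to clear it.

```python
from dataclasses import dataclass


@dataclass
class CpuAlarm:
    """Hypothetical model of the ALM-12016 trigger/clear rules."""
    threshold: float          # CPU usage threshold, in percent
    trigger_count: int = 10   # consecutive over-threshold checks needed
    active: bool = False      # whether the alarm is currently raised
    streak: int = 0           # current run of over-threshold checks

    def observe(self, usage: float) -> bool:
        """Feed one 30-second check; return True if the alarm is active afterwards."""
        if not self.active:
            # Count consecutive checks above the threshold; any other check resets the run.
            self.streak = self.streak + 1 if usage > self.threshold else 0
            if self.streak >= self.trigger_count:
                self.active = True
        else:
            # With Trigger Count > 1 the clear level is 90% of the threshold,
            # which keeps the alarm from flapping around the threshold.
            if self.trigger_count == 1:
                clear_level = self.threshold
            else:
                clear_level = 0.9 * self.threshold
            if usage <= clear_level:
                self.active = False
                self.streak = 0
        return self.active
```

For example, with `CpuAlarm(threshold=80.0, trigger_count=3)`, three checks in a row above 80% raise the alarm, and under these assumptions it stays raised until a check reports a value at or below 72%.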
Attribute
Alarm ID | Alarm Severity | Auto Clear |
---|---|---|
12016 | Major | Yes |
Parameters
Name | Meaning |
---|---|
Source | Specifies the cluster or system for which the alarm is generated. |
ServiceName | Specifies the service for which the alarm is generated. |
RoleName | Specifies the role for which the alarm is generated. |
HostName | Specifies the host for which the alarm is generated. |
Trigger Condition | Specifies the threshold triggering the alarm. If the current indicator value exceeds this threshold, the alarm is generated. |
Impact on the System
Service processes respond slowly or become unavailable.
Possible Causes
- The alarm threshold or Trigger Count (alarm smoothing times) is set incorrectly.
- The CPU configuration cannot meet service requirements and the CPU usage has reached the upper limit, or the service is in peak hours and the CPU usage reaches the upper limit within a short period of time.
Procedure
Check whether the alarm threshold and Trigger Count are set correctly.
- Change the alarm threshold and Trigger Count based on the actual CPU usage.
On FusionInsight Manager, choose O&M > Alarm > Thresholds > Name of the desired cluster > Host > CPU > Host CPU Usage and change Trigger Count based on the actual CPU usage, as shown in Figure 1.
Trigger Count sets the number of consecutive checks in which the CPU usage must exceed the threshold before the alarm is generated.
On the Host CPU Usage page, click Modify in the Operation column to change the alarm threshold, as shown in Figure 2.
- After 2 minutes, check whether the alarm is cleared.
- If yes, no further action is required.
- If no, go to 3, the first step under Check whether the CPU usage reaches the upper limit.
Check whether the CPU usage reaches the upper limit.
- In the alarm list on FusionInsight Manager, click in the row where the alarm is located to view the alarm host address in the alarm details.
- On the Hosts page, click the node on which the alarm is reported.
- View the CPU usage for 5 minutes. If the CPU usage exceeds the threshold multiple times, contact the system administrator to add more CPUs.
- Check whether the alarm was generated during peak hours. If it was, expand the capacity of the node or contact the system administrator to add more CPUs.
- Check whether the alarm is cleared.
- If yes, no further action is required.
- If no, go to 8, the first step under Collect fault information.
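The 5-minute observation in this check can be sketched as a simple count over the collected samples. This is only an illustration: the function names are hypothetical, the samples are assumed to be CPU usage percentages read from the host's monitoring page (ten values if one is taken every 30 seconds), and "multiple times" is assumed to mean at least two.

```python
from typing import Iterable


def exceed_count(samples: Iterable[float], threshold: float) -> int:
    """Return how many CPU usage samples are strictly above the threshold."""
    return sum(1 for usage in samples if usage > threshold)


def exceeds_multiple_times(samples: Iterable[float], threshold: float,
                           min_hits: int = 2) -> bool:
    """Return True if the threshold was exceeded at least min_hits times.

    min_hits=2 is an assumed reading of "multiple times"; adjust as needed.
    """
    return exceed_count(samples, threshold) >= min_hits


# Ten hypothetical samples covering 5 minutes at one sample per 30 seconds.
window = [62.0, 71.5, 88.0, 91.2, 79.9, 85.3, 64.0, 58.7, 90.1, 83.4]
sustained = exceeds_multiple_times(window, threshold=80.0)
```

A result of `True` corresponds to the case in which the procedure recommends contacting the system administrator to add more CPUs.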
Collect fault information.
- On FusionInsight Manager of the active cluster, choose O&M > Log > Download.
- Select OmmServer for Service and click OK.
- In Time Range, set Start Date to 10 minutes before the alarm generation time and End Date to 10 minutes after the alarm generation time, and click Download.
- Contact the O&M personnel and send them the collected logs.
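The log collection time range described above is simple date arithmetic. The following sketch computes the Start Date and End Date from the alarm generation time; the function name `log_window` and the example timestamp are hypothetical, and the values still have to be entered manually on the Download page.

```python
from datetime import datetime, timedelta
from typing import Tuple


def log_window(alarm_time: datetime,
               margin_minutes: int = 10) -> Tuple[datetime, datetime]:
    """Return (start, end) covering margin_minutes before and after the alarm."""
    margin = timedelta(minutes=margin_minutes)
    return alarm_time - margin, alarm_time + margin


# Hypothetical alarm generation time, as read from the alarm details.
start, end = log_window(datetime(2024, 1, 1, 0, 5))
print("Start Date:", start.isoformat(sep=" "))
print("End Date:  ", end.isoformat(sep=" "))
```

Using `datetime` arithmetic avoids mistakes when the window crosses an hour or day boundary, as it does in this example.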
Alarm Clearing
After the fault is rectified, the system automatically clears this alarm.
Related Information
None