ALM-12016 CPU Usage Exceeds the Threshold (For MRS 2.x or Earlier)
Description
The system checks the CPU usage every 30 seconds and compares the check result with the default threshold. The CPU usage has a default threshold. This alarm is generated when the CPU usage exceeds the threshold for several times (configurable, 10 times by default) consecutively.
This alarm is cleared when the average CPU usage is less than or equal to 90% of the threshold.
Attribute
Alarm ID |
Alarm Severity |
Auto Clear |
---|---|---|
12016 |
Major |
Yes |
Parameters
Parameter |
Description |
---|---|
ServiceName |
Specifies the service for which the alarm is generated. |
RoleName |
Specifies the role for which the alarm is generated. |
HostName |
Specifies the host for which the alarm is generated. |
Trigger Condition |
Generates an alarm when the actual indicator value exceeds the specified threshold. |
Impact on the System
Processes respond slowly or do not work.
Possible Causes
- The alarm threshold or alarm hit number is improperly configured.
- The CPU configuration cannot meet service requirements. The CPU usage reaches the upper limit.
Procedure
- Check whether the alarm threshold or alarm hit number is properly configured.
- Log in to MRS Manager and change the alarm threshold and alarm hit number based on CPU usage.
- Choose System > Threshold Configuration > Device > Host > CPU > CPU Usage > CPU Usage and change the alarm threshold based on the actual CPU usage.
- Choose System > Threshold Configuration > Device > Host > CPU > CPU Usage > CPU Usage and change hit number based on the actual CPU usage.
This option defines the alarm check phase. Interval indicates the alarm check period and hit number indicates the number of times when the CPU usage exceeds the threshold. An alarm is generated when the CPU usage exceeds the threshold for several times consecutively.
- Wait 2 minutes and check whether the alarm is automatically cleared.
- If yes, no further action is required.
- If no, go to 2.
- Expand the system.
- Go to the MRS cluster details page. In the alarm list on the alarm management tab page, click the row that contains the alarm. In the alarm details, view the address of the node.
- Log in to the node for which the alarm is generated.
- Run cat /proc/stat | awk 'NR==1'|awk '{for(i=2;i<=NF;i++)j+=$i;print "" 100 - ($5+$6) * 100 / j;}' to check the system CPU usage.
- If the CPU usage exceeds the threshold, expand the CPU capacity.
- Check whether the alarm is cleared.
- If yes, no further action is required.
- If no, go to 3.
- Collect fault information.
- On MRS Manager, choose .
- Contact the O&M engineers and send the collected logs.
Reference
None
Feedback
Was this page helpful?
Provide feedbackThank you very much for your feedback. We will continue working to improve the documentation.