ALM-12191 Disk I/O Usage Exceeds the Threshold
Alarm Description
The system checks the disk I/O usage every 30 seconds and compares it with the threshold. This alarm is generated when the disk I/O usage exceeds the threshold for a specified number of consecutive checks (3 by default).
If the trigger count (hit number) is 1, this alarm is cleared when the disk I/O usage is less than or equal to the threshold. If the trigger count is greater than 1, this alarm is cleared when the disk I/O usage is less than or equal to 90% of the threshold.
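The trigger and clear rules described above can be sketched as a small state machine. This is an illustrative model only, not FusionInsight Manager code: the class name `DiskIoAlarm`, its fields, and the 80% threshold in the example are assumptions made for the sketch.

```python
from dataclasses import dataclass


@dataclass
class DiskIoAlarm:
    """Illustrative model of the trigger/clear rule (hypothetical names)."""
    threshold: float          # alarm threshold in percent (assumed value)
    trigger_count: int = 3    # consecutive violations needed to raise the alarm
    active: bool = False
    _hits: int = 0

    def observe(self, usage: float) -> bool:
        """Feed one sample (one per 30-second check) and return the alarm state."""
        if not self.active:
            # Count consecutive samples above the threshold; any other sample resets.
            self._hits = self._hits + 1 if usage > self.threshold else 0
            if self._hits >= self.trigger_count:
                self.active = True
        else:
            # With a trigger count of 1 the alarm clears at the threshold itself;
            # otherwise it clears only at or below 90% of the threshold.
            if self.trigger_count == 1:
                clear_level = self.threshold
            else:
                clear_level = 0.9 * self.threshold
            if usage <= clear_level:
                self.active = False
                self._hits = 0
        return self.active


alarm = DiskIoAlarm(threshold=80.0)
states = [alarm.observe(u) for u in (85, 90, 95, 78, 70)]
print(states)
```

In this run the alarm is raised on the third consecutive sample above 80 and stays active at 78, because 78 is still above 90% of the threshold (72). It clears only at 70.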
Alarm Attributes
Alarm ID | Alarm Severity | Alarm Type | Service Type | Auto Cleared |
---|---|---|---|---|
12191 | Major | Physical resource | FusionInsight Manager | Yes |
Alarm Parameters
Type | Parameter | Description |
---|---|---|
Location Information | Source | Specifies the cluster or system for which the alarm was generated. |
Location Information | ServiceName | Specifies the service for which the alarm was generated. |
Location Information | RoleName | Specifies the role for which the alarm was generated. |
Location Information | HostName | Specifies the host for which the alarm was generated. |
Additional Information | Trigger Condition | Specifies the alarm triggering condition. |
Impact on the System
- Latency: Service processes may run slowly, increasing latency.
- Service failure: Service processing may slow down, time out, or fail. As a result, jobs may fail to run.
Possible Causes
- The alarm threshold or alarm trigger count is improperly configured.
- The disk configuration cannot meet service requirements, so the disk I/O usage reaches the upper limit. Alternatively, services are in peak hours and the disk I/O usage reaches the upper limit within a short period.
Handling Procedure
Check whether the alarm threshold or alarm trigger count is properly configured.
1. Modify the alarm threshold and alarm trigger count based on the actual disk I/O usage.
   a. Log in to FusionInsight Manager, choose O&M > Alarm > Thresholds, click the name of the desired cluster, and choose Host > Disk > Disk IO Utilization.
   b. Click the edit button next to Trigger Count and change it to a proper value based on the actual service usage.
      Trigger Count is the number of consecutive checks in which the threshold must be exceeded before the alarm is triggered.
   c. Click Modify in the Operation column of the row that contains the rule and change the alarm threshold.
2. Wait 2 minutes and check whether the alarm is automatically cleared.
   - If yes, no further action is required.
   - If no, go to 3.
Check whether the disk I/O usage reaches the upper limit.
3. On FusionInsight Manager, choose O&M > Alarm > Alarms. In the alarm list, expand the alarm details and, in the Location area, click the name of the host for which the alarm was generated.
4. On the overview page of the host, observe the real-time disk I/O usage for about 5 minutes. If the disk I/O usage exceeds the threshold multiple times, contact the MRS cluster administrator to upgrade the disk specifications.
   If the Disk IO Utilization chart is not displayed, click the drop-down arrow on the right, select Customize, select the desired item, and click OK.
5. Check whether the alarm was generated during peak hours. If it was, expand the node capacity or contact the MRS cluster administrator to upgrade the disk specifications.
6. Check whether the alarm is cleared.
   - If yes, no further action is required.
   - If no, go to 7.
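To cross-check the chart from a shell on the affected Linux host, disk busy time can be estimated from two snapshots of `/proc/diskstats`. The sketch below is an assumption-laden illustration: it uses the "time spent doing I/Os" counter (column 13 of each line, in milliseconds), which is the same basis as the `%util` column of `iostat`. Whether FusionInsight Manager derives its Disk IO Utilization metric in exactly this way is not stated in this document. The function names and the sample counter values are hypothetical.

```python
from typing import Dict


def parse_io_ticks(diskstats_text: str) -> Dict[str, int]:
    """Map device name -> milliseconds spent doing I/O (column 13 of /proc/diskstats)."""
    ticks = {}
    for line in diskstats_text.splitlines():
        fields = line.split()
        if len(fields) >= 13:
            ticks[fields[2]] = int(fields[12])
    return ticks


def io_utilization(before: str, after: str, interval_s: float) -> Dict[str, float]:
    """Percent of the interval each device was busy, capped at 100."""
    start, end = parse_io_ticks(before), parse_io_ticks(after)
    interval_ms = interval_s * 1000.0
    return {
        dev: min(100.0, 100.0 * (end[dev] - start[dev]) / interval_ms)
        for dev in end
        if dev in start
    }


# Made-up snapshots taken 5 seconds apart (sample values, not real measurements).
snapshot_1 = "   8       0 sda 100 0 800 50 200 0 1600 120 0 1000 170"
snapshot_2 = "   8       0 sda 150 0 1200 80 300 0 2400 200 0 5500 280"
util = io_utilization(snapshot_1, snapshot_2, interval_s=5)
print(util)
```

On a live host, the two snapshots would be the contents of `/proc/diskstats` read a few seconds apart. A value that stays near 100% over several minutes points to a saturated disk, not a misconfigured threshold.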
Collect fault information.
7. On FusionInsight Manager, choose O&M. In the navigation pane on the left, choose Log > Download.
8. Expand the Service drop-down list, select NodeAgent for the target cluster, and click OK.
9. Click the edit icon in the upper right corner, and set Start Date and End Date for log collection to 10 minutes before and 10 minutes after the alarm generation time, respectively. Then, click Download.
10. Contact O&M engineers and provide the collected logs.
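The log collection window is the alarm generation time minus and plus 10 minutes. A minimal sketch of that arithmetic follows. The helper name `log_collection_window` and the alarm time are hypothetical and serve only as an illustration.

```python
from datetime import datetime, timedelta


def log_collection_window(alarm_time: datetime, margin_minutes: int = 10):
    """Return (start, end) covering the margin before and after the alarm time."""
    margin = timedelta(minutes=margin_minutes)
    return alarm_time - margin, alarm_time + margin


# Hypothetical alarm generation time.
start, end = log_collection_window(datetime(2024, 5, 1, 14, 5))
print(start, end)
```

For an alarm generated at 14:05, Start Date would be 13:55 and End Date would be 14:15.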
Alarm Clearance
This alarm is automatically cleared after the fault is rectified.
Related Information
None.