DWS_2000000009 Node Data Disk I/O Usage Exceeds the Threshold
Description
GaussDB(DWS) collects the data disk I/O usage of each cluster node every 30 seconds. This alarm is generated when the average usage of a data disk on a node exceeds 90% (configurable) in the last 10 minutes (configurable), and is automatically cleared when the average usage drops below 85% (alarm threshold minus 5%).
- If the data disk I/O usage of a node is always greater than the alarm threshold, the alarm is generated again 24 hours later (configurable).
- When using an SSD storage-based cluster, disk I/O can surpass 100% as the service volume grows. But, this does not always mean there is a performance bottleneck. To confirm the alarm's validity, you should evaluate the service's actual running status.
Alarm Attributes
Alarm ID |
Alarm Category |
Alarm Severity |
Alarm Type |
Service Type |
Auto Cleared |
---|---|---|---|---|---|
DWS_2000000009 |
Management plane alarm |
Urgent: > 90% |
Operation alarm |
GaussDB(DWS) |
Yes |
Alarm Parameters
Category |
Name |
Description |
---|---|---|
Location information |
Name |
Node Data Disk I/O Usage Exceeds the Threshold |
Type |
Operation alarm |
|
Generation time |
Time when the alarm is generated |
|
Other information |
Cluster ID |
Cluster details such as resourceId and domain_id |
Impact on the System
- High disk I/O usage affects data read and write performance, thereby affecting cluster performance.
- A large number of disk writes occupy the disk capacity. If the disk capacity exceeds 90%, the cluster becomes read-only.
Possible Causes
- A large number of read or write operations are performed during peak hours.
- A large amount of data spills to disks due to the execution of complex statements.
- Data is scanned by the Scan operator.
Handling Procedure
- On the Clusters > Dedicated Clusters page, locate the row that contains the target cluster and click Monitoring in the Operation column.
- In the navigation pane on the left, choose Monitoring > Node Monitoring. On the Node Monitoring page, click the Disks tab to view the data disk I/O usage and disk I/O rate.
If the disk I/O rate is high and the data disk usage keeps increasing, it indicates that services are writing data to disks. This may be caused by complex queries.
- Click Queries in the navigation pane to view the real-time queries.
If the execution time of a statement exceeds the expected time, stop the query and check the disk I/O usage again. For details, see 2.
Alarm Clearance
This alarm is automatically cleared when the data disk I/O usage drops to a certain value.
Feedback
Was this page helpful?
Provide feedbackThank you very much for your feedback. We will continue working to improve the documentation.See the reply and handling status in My Cloud VOC.
For any further questions, feel free to contact us through the chatbot.
Chatbot