DWS_2000000009 Node Data Disk I/O Usage Exceeds the Threshold
Description
GaussDB(DWS) collects the data disk I/O usage of each cluster node every 30 seconds. This alarm is generated when the average usage of a data disk on a node exceeds 90% (configurable) in the last 10 minutes (configurable), and is automatically cleared when the average usage drops below 85% (alarm threshold minus 5%).
- If the data disk I/O usage of a node remains above the alarm threshold, the alarm is generated again 24 hours later (configurable).
- In an SSD-based cluster, disk I/O usage can exceed 100% as the service volume grows. This does not necessarily indicate a performance bottleneck. To confirm that the alarm is valid, evaluate the actual running status of your services.
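The generation and clearance rules above can be sketched in a few lines of Python. This is only an illustration of the described behavior under the default values, not the service's actual implementation. The class `DiskIoAlarm`, its `observe` method, and the constant names are hypothetical, and the sketch assumes that evaluation starts only once a full 10-minute window of samples is available.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Deque, Optional, Tuple

SAMPLE_INTERVAL_S = 30        # collection interval stated in the description
WINDOW_S = 10 * 60            # averaging window (configurable)
THRESHOLD_PCT = 90.0          # alarm threshold (configurable)
CLEAR_MARGIN_PCT = 5.0        # alarm clears below threshold minus 5%
REALARM_S = 24 * 60 * 60      # re-alarm interval (configurable)


@dataclass
class DiskIoAlarm:
    """Hypothetical evaluator for one data disk on one node."""
    samples: Deque[Tuple[float, float]] = field(default_factory=deque)
    active: bool = False
    last_raised: Optional[float] = None

    def observe(self, ts: float, usage_pct: float) -> Optional[str]:
        """Record one sample and return 'raise', 'clear', or None."""
        self.samples.append((ts, usage_pct))
        # Drop samples that have fallen out of the averaging window.
        while self.samples and self.samples[0][0] <= ts - WINDOW_S:
            self.samples.popleft()
        # Assumption: do not evaluate until the window is full.
        if len(self.samples) < WINDOW_S // SAMPLE_INTERVAL_S:
            return None
        avg = sum(value for _, value in self.samples) / len(self.samples)
        if not self.active:
            if avg > THRESHOLD_PCT:
                self.active = True
                self.last_raised = ts
                return "raise"
            return None
        if avg < THRESHOLD_PCT - CLEAR_MARGIN_PCT:
            self.active = False
            return "clear"
        # Usage is still above the threshold: repeat the alarm after 24 hours.
        if avg > THRESHOLD_PCT and ts - self.last_raised >= REALARM_S:
            self.last_raised = ts
            return "raise"
        return None
```

The gap between the 90% generation threshold and the 85% clearance threshold keeps the alarm from flapping when the average hovers around the threshold.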
Alarm Attributes
Alarm ID | Alarm Category | Alarm Severity | Alarm Type | Service Type | Auto Cleared
---|---|---|---|---|---
DWS_2000000009 | Management plane alarm | Urgent: > 90% | Operation alarm | GaussDB(DWS) | Yes
Alarm Parameters
Category | Name | Description
---|---|---
Location information | Name | Node Data Disk I/O Usage Exceeds the Threshold
 | Type | Operation alarm
 | Generation time | Time when the alarm is generated
Other information | Cluster ID | Cluster details such as resourceId and domain_id
Impact on the System
- High disk I/O usage affects data read and write performance, thereby affecting cluster performance.
- A large number of disk writes consume disk capacity. If disk usage exceeds 90%, the cluster becomes read-only.
Possible Causes
- A large number of read or write operations are performed during peak hours.
- A large amount of data spills to disks due to the execution of complex statements.
- Large volumes of data are scanned by the Scan operator.
Handling Procedure
1. On the Clusters > Dedicated Clusters page, locate the row that contains the target cluster and click Monitoring in the Operation column.
2. In the navigation pane on the left, choose Monitoring > Node Monitoring. On the Node Monitoring page, click the Disks tab to view the data disk I/O usage and disk I/O rate.
   If the disk I/O rate is high and the data disk usage keeps increasing, services are writing data to disks. This may be caused by complex queries.
3. Click Queries in the navigation tree on the left to view the real-time queries.
   If the execution time of a statement exceeds the expected time, stop the query and check the disk I/O usage again as described in step 2.
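The check in the last step, picking out statements that have run longer than expected, can be illustrated with a small Python sketch. It assumes you have copied the query ID and elapsed time from the Queries page by hand and supplied your own expected duration for each statement. `RunningQuery` and `queries_to_review` are hypothetical names for illustration and do not correspond to any GaussDB(DWS) API.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RunningQuery:
    """Hypothetical record filled in manually from the Queries page."""
    query_id: str
    elapsed_s: float    # how long the statement has been running
    expected_s: float   # how long you expect it to take (your own estimate)


def queries_to_review(queries: List[RunningQuery]) -> List[RunningQuery]:
    """Return statements that exceed their expected time, worst overrun first."""
    overdue = [q for q in queries if q.elapsed_s > q.expected_s]
    return sorted(overdue, key=lambda q: q.elapsed_s - q.expected_s, reverse=True)


if __name__ == "__main__":
    candidates = queries_to_review([
        RunningQuery("q1", elapsed_s=120, expected_s=60),
        RunningQuery("q2", elapsed_s=30, expected_s=60),
        RunningQuery("q3", elapsed_s=600, expected_s=100),
    ])
    for q in candidates:
        print(q.query_id, q.elapsed_s - q.expected_s)
```

Sorting by overrun puts the statements most likely to be driving disk I/O at the top, so they can be reviewed first before deciding whether to stop them.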
Alarm Clearance
This alarm is automatically cleared when the average data disk I/O usage drops below the alarm threshold minus 5% (85% with the default threshold of 90%).