Updated on 2024-09-05 GMT+08:00

DWS_2000000009 Node Data Disk I/O Usage Exceeds the Threshold

Description

GaussDB(DWS) collects the data disk I/O usage of each cluster node every 30 seconds. This alarm is generated when the average I/O usage of a data disk on a node exceeds 90% (configurable) over the last 10 minutes (configurable). It is automatically cleared when the average usage drops below 85%, that is, the alarm threshold minus 5 percentage points.

  • If the data disk I/O usage of a node stays above the alarm threshold, the alarm is generated again 24 hours later (configurable).
  • In a cluster that uses SSD storage, the reported disk I/O usage can exceed 100% as the service volume grows. This does not necessarily indicate a performance bottleneck. To confirm whether the alarm is valid, evaluate the actual running status of your services.
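The generate, clear, and re-generate rules above can be sketched as a small state machine. This is only an illustration of the described behavior, not the service's implementation: the class name `DiskIoAlarm`, the constant names, and the choice to wait for a full 10-minute window before evaluating are all assumptions.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Deque, Optional, Tuple

SAMPLE_INTERVAL_S = 30          # collection interval stated in the description
WINDOW_S = 10 * 60              # averaging window (configurable)
THRESHOLD_PCT = 90.0            # alarm threshold (configurable)
CLEAR_MARGIN_PCT = 5.0          # alarm clears below threshold minus 5 points
RESEND_INTERVAL_S = 24 * 3600   # re-generation interval (configurable)


@dataclass
class DiskIoAlarm:
    """Hypothetical model of the alarm logic for one data disk."""
    samples: Deque[Tuple[float, float]] = field(default_factory=deque)
    active: bool = False
    last_raised: Optional[float] = None

    def observe(self, ts: float, usage_pct: float) -> Optional[str]:
        """Record one sample; return "raise", "clear", or None."""
        self.samples.append((ts, usage_pct))
        # Drop samples that have fallen out of the averaging window.
        while self.samples and self.samples[0][0] <= ts - WINDOW_S:
            self.samples.popleft()
        # Assumption: evaluate only once a full window of samples exists.
        if len(self.samples) < WINDOW_S // SAMPLE_INTERVAL_S:
            return None
        avg = sum(v for _, v in self.samples) / len(self.samples)
        if not self.active:
            if avg > THRESHOLD_PCT:
                self.active = True
                self.last_raised = ts
                return "raise"
            return None
        if avg < THRESHOLD_PCT - CLEAR_MARGIN_PCT:
            self.active = False
            return "clear"
        # Still above the threshold: generate again after the resend interval.
        if avg > THRESHOLD_PCT and ts - self.last_raised >= RESEND_INTERVAL_S:
            self.last_raised = ts
            return "raise"
        return None
```

Because the clear level sits 5 points below the alarm threshold, an average that hovers between 85% and 90% neither generates nor clears the alarm, which keeps it from flapping.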

Alarm Attributes

  • Alarm ID: DWS_2000000009
  • Alarm Category: Management plane alarm
  • Alarm Severity: Urgent (> 90%)
  • Alarm Type: Operation alarm
  • Service Type: GaussDB(DWS)
  • Auto Cleared: Yes

Alarm Parameters

Category: Location information

  • Name: Node Data Disk I/O Usage Exceeds the Threshold
  • Type: Operation alarm
  • Generation time: Time when the alarm is generated

Category: Other information

  • Cluster ID: Cluster details such as resourceId and domain_id

Impact on the System

  • High disk I/O usage slows data reads and writes, which degrades cluster performance.
  • A large number of disk writes consumes disk capacity. If disk usage exceeds 90%, the cluster becomes read-only.

Possible Causes

  • A large number of read or write operations are performed during peak hours.
  • A large amount of data spills to disks due to the execution of complex statements.
  • Queries scan large amounts of data through Scan operators.

Handling Procedure

  1. On the Clusters > Dedicated Clusters page, locate the row that contains the target cluster and click Monitoring in the Operation column.
  2. In the navigation pane on the left, choose Monitoring > Node Monitoring. On the Node Monitoring page, click the Disks tab to view the data disk I/O usage and disk I/O rate.

    If the disk I/O rate is high and the data disk usage keeps increasing, services are writing data to the disks. Complex queries may be the cause.

  3. In the navigation pane on the left, click Queries to view real-time queries.

    If a statement has been running longer than expected, stop the query and then check the disk I/O usage again, as described in step 2.
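The check in step 3 can be sketched as a small triage helper that ranks running statements by how far they have overrun their expected duration. This is an illustration only: the `RunningQuery` record and its fields are hypothetical and do not correspond to any GaussDB(DWS) console or API structure, and the expected duration is a value you supply yourself.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RunningQuery:
    """Hypothetical record copied from the real-time query list."""
    query_id: int        # identifier shown for the query (assumed field)
    elapsed_s: float     # how long the statement has been running
    expected_s: float    # how long you expect this statement to take
    sql: str


def overdue_queries(queries: List[RunningQuery]) -> List[RunningQuery]:
    """Return queries that ran longer than expected, largest overrun first."""
    overdue = [q for q in queries if q.elapsed_s > q.expected_s]
    return sorted(overdue, key=lambda q: q.elapsed_s - q.expected_s, reverse=True)
```

Stopping the statements at the top of this list first, then checking the disk I/O usage again, follows the order of the procedure above.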

Alarm Clearance

This alarm is automatically cleared when the average data disk I/O usage drops below the alarm threshold minus 5 percentage points (85% by default).