Updated on 2024-06-14 GMT+08:00

DWS_2000000012 Node Data Disk Latency Exceeds the Threshold

Description

GaussDB(DWS) collects the data disk latency of each node in the cluster every 30 seconds. This alarm is generated when the average latency of a data disk on a node exceeds the threshold of 400 ms (configurable) over the last 10 minutes (configurable). It is cleared automatically when the average latency drops below the threshold.

If the data disk latency of a node stays above the alarm threshold, the alarm is generated again after 24 hours (configurable).
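The rule described above can be illustrated with a minimal sketch. It assumes default settings (30-second samples, a 10-minute window, a 400 ms threshold, and a 24-hour repeat interval) and evaluates the average over whatever samples are currently in the window. The class and method names are hypothetical and do not reflect the actual GaussDB(DWS) implementation.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Deque, Optional, Tuple

SAMPLE_INTERVAL_S = 30        # collection period stated in the description
WINDOW_S = 10 * 60            # averaging window (configurable)
THRESHOLD_MS = 400.0          # latency threshold (configurable)
RENOTIFY_S = 24 * 60 * 60     # interval before the alarm is generated again


@dataclass
class LatencyAlarm:
    """Hypothetical evaluator for one data disk; an illustration only."""
    samples: Deque[Tuple[float, float]] = field(default_factory=deque)
    active: bool = False
    last_raised: Optional[float] = None

    def observe(self, ts: float, latency_ms: float) -> Optional[str]:
        """Record one sample and return "raise", "clear", or None."""
        self.samples.append((ts, latency_ms))
        # Drop samples that have fallen out of the averaging window.
        while self.samples[0][0] <= ts - WINDOW_S:
            self.samples.popleft()
        avg = sum(v for _, v in self.samples) / len(self.samples)

        if avg > THRESHOLD_MS:
            # Raise on first breach, or again once the repeat interval passes.
            if not self.active or ts - self.last_raised >= RENOTIFY_S:
                self.active = True
                self.last_raised = ts
                return "raise"
        elif avg < THRESHOLD_MS and self.active:
            self.active = False
            return "clear"
        return None
```

Feeding the evaluator one sample every SAMPLE_INTERVAL_S seconds shows why a short latency spike does not clear or re-trigger the alarm immediately: the decision depends on the window average, not on the latest sample.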

Alarm Attributes

Alarm ID: DWS_2000000012

Alarm Severity: Major

Auto Clear: Yes

Alarm Parameters

Parameter: Description

Alarm Source: Indicates the name of the system for which the alarm is generated, for example, GaussDB(DWS).

Cluster Name: Indicates the cluster for which the alarm is generated.

Location Information: Includes the ID and name of the cluster and of the instance for which the alarm is generated, for example, cluster_id: xxxx-xxxx-xxxx-xxxx, cluster_name: test_dws, instance_id: xxxx-xxxx-xxxx-xxxx, instance_name: test_dws-dws-cn-cn-1-1.

Detail Information: Detailed information about the alarm, including the cluster, instance, disk, and threshold information. Example: CloudService=DWS, resourceId=xxxx-xxxx-xxxx-xxxx, resourceIdName=test_dws, instance_id: xxxx-xxxx-xxxx-xxxx, instance_name: test_dws-dws-cn-cn-1-1, host_name: host-192-168-1-122, disk_name: /dev/vdb, first_alarm_time: 2022-01-30 10:30:00. The data disk I/O usage of the node within 10 minutes is 90.54%, exceeding the threshold 90%.

Generated: Time when the alarm is generated.

Status: Indicates the status of the current alarm.
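When alarms are forwarded to a ticketing or notification tool, it can help to split the Detail Information text into fields. The following is a minimal sketch that assumes the text always follows the layout of the example above: comma-separated pairs using "=" or ":" as the separator, followed by a free-text sentence. The function name parse_detail and the "message" key are hypothetical, and the real format may differ.

```python
import re
from typing import Dict

# Sample text copied from the Detail Information example.
EXAMPLE = (
    "CloudService=DWS, resourceId= xxxx-xxxx-xxxx-xxxx, resourceIdName=test_dws, "
    "instance_id: xxxx-xxxx-xxxx-xxxx, instance_name: test_dws-dws-cn-cn-1-1, "
    "host_name: host-192-168-1-122, disk_name: /dev/vdb, "
    "first_alarm_time: 2022-01-30 10:30:00. "
    "The data disk I/O usage of the node within 10 minutes is 90.54%, "
    "exceeding the threshold 90%."
)


def parse_detail(detail: str) -> Dict[str, str]:
    """Split alarm detail text into key-value fields plus a trailing message."""
    # Everything before the first ". " is treated as key-value pairs;
    # the remainder is kept as free text under the assumed "message" key.
    head, _, message = detail.partition(". ")
    fields: Dict[str, str] = {}
    for part in head.split(","):
        # Split on the first "=" or ":" only, so times such as 10:30:00 survive.
        pieces = re.split(r"[=:]", part, maxsplit=1)
        if len(pieces) == 2:
            fields[pieces[0].strip()] = pieces[1].strip()
    fields["message"] = message.strip()
    return fields


if __name__ == "__main__":
    for key, value in parse_detail(EXAMPLE).items():
        print(f"{key}: {value}")
```

The host_name and disk_name fields identify which node and disk to inspect on the Node Monitoring page.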

Impact on the System

High disk latency slows down data reads and writes, which degrades cluster performance.

Possible Causes

The database is in peak hours and there are a large number of read and write requests.

Handling Procedure

  1. On the Clusters > Dedicated Clusters page, locate the row that contains the target cluster and click Monitoring in the Operation column.
  2. In the navigation pane on the left, choose Monitoring > Node Monitoring. On the Node Monitoring page, view the CPU usage, disk usage, and memory usage.

    If the CPU usage and disk I/O rate are high, the cluster is in peak hours. You can adjust the latency threshold based on service requirements. For details, see steps 3 and 4.

  3. Click Alarms, switch to the Alarms tab page, and click Alarm Rule Management in the upper left corner.
  4. Locate the row that contains Node Data Disk Latency Exceeds the Threshold, and click Modify in the Operation column. On the displayed page, change the threshold.

Alarm Clearance

This alarm is automatically cleared when the average data disk latency drops below the alarm threshold.