DWS_2000000027 Memory Usage of a GaussDB(DWS) Cluster Node Exceeds the Threshold

Alarm Description

GaussDB(DWS) collects the instance memory usage of each node in a cluster every 60 seconds. If the average instance memory usage of a node exceeds 90% (adjustable), an alarm is reported. The alarm is cleared when the average memory usage falls below 85% (5 percentage points below the reporting threshold).

If the average instance memory usage of a node stays above the alarm threshold, the alarm is reported again after 24 hours (adjustable).
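The report, clear, and re-report rules above form a simple hysteresis loop. The following Python sketch illustrates that logic only. It is not the service's implementation: the class name MemoryAlarm, its fields, and the observe method are hypothetical, and how the service averages samples is not described here, so the caller is assumed to pass in an already averaged value.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MemoryAlarm:
    """Hypothetical model of the alarm rules described above."""
    threshold: float = 90.0           # reporting threshold in percent (adjustable)
    clear_margin: float = 5.0         # alarm clears below threshold - clear_margin
    resend_after_s: int = 24 * 3600   # re-report interval in seconds (adjustable)
    active: bool = False
    last_report_s: Optional[int] = None

    def observe(self, now_s: int, avg_usage: float) -> Optional[str]:
        """Process one averaged sample; return 'report', 'clear', or None."""
        if not self.active:
            if avg_usage > self.threshold:
                self.active = True
                self.last_report_s = now_s
                return "report"
            return None
        # Alarm is active: clear only once usage drops below the lower bound.
        if avg_usage < self.threshold - self.clear_margin:
            self.active = False
            self.last_report_s = None
            return "clear"
        # Still above the threshold after the re-report interval: report again.
        if (avg_usage > self.threshold
                and now_s - self.last_report_s >= self.resend_after_s):
            self.last_report_s = now_s
            return "report"
        return None
```

With the default values, a sample of 87% that arrives while the alarm is active neither clears nor re-reports it, because 87% lies between the 85% clearing bound and the 90% reporting threshold.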

Attributes

Alarm ID	Alarm Category	Alarm Severity	Alarm Type	Service Type	Auto Cleared
DWS_2000000027	Management plane alarm	Urgent: > 90%	Operation alarm	GaussDB(DWS)	Yes

Alarm Parameters

Category	Name	Description
Location information	Name	Instance Memory Usage of a Cluster Node Exceeds the Threshold
	Type	Operation alarm
	Generation time	Time when the alarm is generated
Other information	Cluster ID	Cluster details such as resourceId and domain_id

Impact on the System

If instance memory usage remains high for an extended period, service processes may slow down or become unavailable.

Possible Causes

Complex services consume a large amount of instance memory.
The instance memory of the cluster is too small to meet service requirements.

Handling Procedure

Check the memory usage of each node instance.
1. Log in to the GaussDB(DWS) console.
2. Choose Management > Alarms. In the cluster selection drop-down list in the upper right corner, select the cluster for which the alarm is generated. View the alarms of the cluster in the last seven days, and use the location information to identify the node for which the alarm is generated.
3. Choose Dedicated Clusters > Clusters, locate the row that contains the cluster for which the alarm is generated, and click Monitoring Panel in the Operation column.
4. Choose Monitoring > Performance > Monitoring View. On the displayed page, choose an instance to see its memory usage rate. Verify the information and click OK.
  Figure 1 Adding a monitoring view for instance memory usage
5. The memory usage of each instance in the current cluster is displayed at the bottom of the page. In the upper left corner, select a time range (the last 1, 3, 12, or 24 hours, or the last 7 days) to check for any sudden increase in the memory usage of an instance.
  Figure 2 Instance memory usage monitoring view
  - If the memory usage of an instance frequently increases and then returns to normal within a short time, the memory usage is spiking temporarily during service execution. In this case, you can adjust the alarm threshold to reduce the number of reported alarms.
  - If the instance memory usage remains high for a long time, the cluster is overloaded. In this case, check the cluster services or scale up the cluster specifications. For details, see Changing the Node Flavor.
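The distinction between short spikes and sustained high usage can also be checked offline on exported monitoring samples. The following Python sketch is one possible heuristic, not a feature of the console: the function name classify_usage and the sustained_fraction cutoff of 0.8 are assumptions chosen for illustration.

```python
from typing import Sequence


def classify_usage(samples: Sequence[float],
                   threshold: float = 90.0,
                   sustained_fraction: float = 0.8) -> str:
    """Classify memory usage samples (in percent) over a time window.

    Returns 'normal' if no sample exceeds the threshold, 'sustained' if at
    least sustained_fraction of the samples exceed it, and 'spiky' otherwise.
    The 0.8 cutoff is an illustrative assumption, not a product default.
    """
    if not samples:
        raise ValueError("samples must not be empty")
    above = sum(1 for usage in samples if usage > threshold)
    if above == 0:
        return "normal"
    if above / len(samples) >= sustained_fraction:
        return "sustained"
    return "spiky"
```

A 'spiky' result corresponds to the first case above, where adjusting the alarm threshold may be enough. A 'sustained' result corresponds to the second case, where the services or the node flavor need attention.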
Check whether the memory usage alarm configuration of the instance is proper.
1. Choose Management > Alarms and click View Alarm Rule.
2. Locate the row that contains rule Instance Memory Usage of a Cluster Node Exceeds the Threshold, and click Modify in the Operation column. The Modifying an Alarm Rule page is displayed.
3. Adjust the alarm threshold and detection period. A higher alarm threshold and a longer detection period result in lower alarm sensitivity. For details about the GUI configuration, see Alarm Rules.
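To see how the threshold and detection period affect sensitivity, the rule can be replayed against past samples. The Python sketch below assumes that an alarm fires once usage has stayed above the threshold for a whole detection period, expressed as a number of consecutive 60-second samples. This interpretation and the function name count_alarms are assumptions for illustration; the actual evaluation performed by the service may differ.

```python
from typing import Sequence


def count_alarms(samples: Sequence[float],
                 threshold: float,
                 period_samples: int) -> int:
    """Count alarms a rule would raise on a series of usage samples.

    Assumed rule: an alarm fires once each time usage stays above the
    threshold for period_samples consecutive samples.
    """
    if period_samples < 1:
        raise ValueError("period_samples must be at least 1")
    alarms = 0
    run = 0  # length of the current run of samples above the threshold
    for usage in samples:
        if usage > threshold:
            run += 1
            if run == period_samples:
                alarms += 1
        else:
            run = 0
    return alarms


history = [95, 50, 92, 93, 50, 96, 97, 98, 50]
for threshold, period in [(90, 1), (90, 2), (90, 3), (95, 1)]:
    print(threshold, period, count_alarms(history, threshold, period))
```

On this sample series, lengthening the period from 1 to 3 samples reduces the count from 3 alarms to 1, and raising the threshold from 90 to 95 with a period of 1 also reduces it to 1.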