Alarm Rules
Overview
- Concepts related to threshold alarms
  - Alarm rule: consists of the alarm rule name, rule description, clusters associated with the rule, alarm policy triggering relationship, and one or more alarm policies. An alarm rule can apply to one or all clusters. You select the relationship between alarm policies in Triggered Policies.
  - Alarm policy: consists of the triggers, constraint, and alarm severity for an alarm metric.
  - Alarm metric: indicates a database cluster metric, which is generally time series data, for example, node CPU usage and amount of data flushed to disks.
- Alarm rule types
  - Default rule: predefined rules that reflect best practices for DWS threshold alarms.
  - User-defined rule: personalized alarm rules that you create by configuring or combining monitoring metrics. (The current version supports only user-defined alarm rules of schema usage.)
- Alarm rule operations
  - Modify: modifies an alarm rule. This operation is available for all alarm rules. You can modify all items of a user-defined rule but only some items of a default rule.
  - Enable/Disable: enables or disables an alarm rule. This operation is available for all alarm rules. When an alarm rule is enabled, it is added to the check list of the alarm engine and can be triggered normally. Disabled rules are not in the check list.
  - Delete: deletes an alarm rule. You can delete only user-defined rules. Default alarm rules cannot be deleted.
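The relationships among rules, policies, and operations described above can be pictured with a small data model. The following is only an illustrative sketch, not the DWS implementation: the class names, field names, and the `AlarmEngine` helper are all hypothetical, and triggers and constraints are kept as free text.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum


class RuleType(Enum):
    DEFAULT = "default"
    USER_DEFINED = "user-defined"


@dataclass
class AlarmPolicy:
    metric: str      # alarm metric the policy watches (hypothetical identifier)
    trigger: str     # threshold condition, free text in this sketch
    constraint: str  # suppression condition, free text in this sketch
    severity: str    # alarm severity of this policy


@dataclass
class AlarmRule:
    name: str
    description: str
    clusters: list[str]          # one cluster name, or ["All"]
    triggered_policies: str      # relationship between policies
    policies: list[AlarmPolicy]  # one or more policies
    rule_type: RuleType = RuleType.USER_DEFINED
    enabled: bool = True


class AlarmEngine:
    """Hypothetical engine that keeps rules and exposes the three operations."""

    def __init__(self) -> None:
        self.rules: dict[str, AlarmRule] = {}

    def add(self, rule: AlarmRule) -> None:
        self.rules[rule.name] = rule

    def check_list(self) -> list[str]:
        # Only enabled rules are checked; disabled rules are left out.
        return [r.name for r in self.rules.values() if r.enabled]

    def set_enabled(self, name: str, enabled: bool) -> None:
        # Enable/Disable applies to default and user-defined rules alike.
        self.rules[name].enabled = enabled

    def delete(self, name: str) -> None:
        # Only user-defined rules can be deleted.
        if self.rules[name].rule_type is RuleType.DEFAULT:
            raise PermissionError("default alarm rules cannot be deleted")
        del self.rules[name]
```

In this sketch, disabling a rule removes it from `check_list()` without deleting it, and calling `delete()` on a default rule raises an error, mirroring the restrictions listed above.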
Precautions
After a cluster is migrated, to monitor alarms of the new cluster, change the cluster bound to the alarm rule to the new cluster. You can also create an alarm rule for the new cluster.
Viewing Alarm Rules
- Log in to the DWS console.
- In the navigation tree on the left, choose Monitoring > Alarm.
- Click View Alarm Rule in the upper left corner. On the page that is displayed, you can see the threshold alarm rules of database cluster monitoring metrics, as shown in the following figure. For details, see Table 1.
The alarm policy is triggered based on the current configuration.
Table 1 Threshold alarms of DMS alarm sources

| Alarm Type | Alarm Name | Alarm Severity | Alarm Description |
| --- | --- | --- | --- |
| Default | Node CPU Usage Exceeds the Threshold | Critical | This alarm is generated if the threshold of CPU usage (system + user) of any node in the cluster is exceeded within the specified period and the constraint is not met. The alarm will be cleared when the CPU usage (system + user) is lower than the threshold and the constraint is not met. |
| Default | Node Data Disk Usage Exceeds the Threshold | Critical: > 85%; Major: > 80% | This alarm is generated if the threshold of data disk (/var/chroot/DWS/data[n]) usage of any node in the cluster is exceeded within the specified period and the constraint is not met. The alarm will be cleared when the data disk (/var/chroot/DWS/data[n]) usage is lower than the threshold and the constraint is not met. |
| Default | Node Data Disk I/O Usage Exceeds the Threshold | Critical | This alarm is generated if the threshold of data disk (/var/chroot/DWS/data[n]) I/O usage (util) of any node in the cluster is exceeded within the specified period and the constraint is not met. The alarm will be cleared when the data disk (/var/chroot/DWS/data[n]) I/O usage (util) is lower than the threshold and the constraint is not met. |
| Default | Node Data Disk Latency Exceeds the Threshold | Major | This alarm is generated if the threshold of data disk (/var/chroot/DWS/data[n]) I/O latency (await) of any node in the cluster is exceeded within the specified period and the constraint is not met. The alarm will be cleared when the data disk (/var/chroot/DWS/data[n]) I/O latency (await) is lower than the threshold and the constraint is not met. |
| Default | Data Spilled to Disks of the Query Statement Exceeds the Threshold | Critical | This alarm is generated if the threshold of data flushed to disks by a SQL statement in the cluster is exceeded within the specified period and the constraint is not met. The alarm can be cleared only after you handle the SQL statement. |
| Default | Number of Queuing Query Statements Exceeds the Threshold | Critical | This alarm is generated if the threshold of the number of queuing SQL statements is exceeded within the specified period. The alarm will be cleared when the number of queuing SQL statements is less than the threshold. |
| Default | Queue Congestion in the Default Cluster Resource Pool | Critical | This alarm is generated if the queue in the default resource pool of a cluster is congested and no alarm suppression conditions are met. This alarm will be cleared if the queue is not congested. |
| Default | Long SQL Probe Execution Duration in a Cluster | Critical | This alarm is generated if the DMS alarm module detects that the execution duration of a SQL probe on a server exceeds the threshold and no alarm suppression conditions are met. If no execution duration exceeds the threshold, the alarm will be automatically cleared. NOTE: The alarm is supported only in 8.1.1.300 and later cluster versions. For earlier versions, contact technical support. |
| Default | A Vacuum Full Operation That Holds a Table Lock for A Long Time Exists in the Cluster | Major | In a specified period, the DMS alarm module detects that VACUUM FULL has been running for a long time in the cluster and blocks other operations. This alarm is generated if there are other SQL statements in the lock wait state and no suppression conditions are met. This alarm will be cleared if VACUUM FULL in the cluster does not cause lock waits. NOTE: If this alarm is generated, contact technical support engineers. |
| Default | Instance Memory Usage of a Cluster Node Exceeds the Threshold | Critical | This alarm is generated if the DMS alarm module detects that the instance memory usage on a node in a cluster exceeds the threshold and no alarm suppression conditions are met. If the usage decreases, the alarm will be automatically cleared. NOTE: If this alarm is generated, contact technical support engineers. |
| Default | Dynamic Memory Usage of a Cluster Node Exceeds the Threshold | Critical | This alarm is generated if the DMS alarm module detects that the dynamic memory usage on a node in a cluster exceeds the threshold and no alarm suppression conditions are met. If the usage decreases, the alarm will be automatically cleared. NOTE: If this alarm is generated, contact technical support engineers. |
| Default | Disk Usage of a DWS Cluster Resource Pool Exceeds the Threshold | Critical | The DMS alarm module generates an alarm if the disk usage of the cluster resource pool exceeds the set threshold within a specific time frame and the suppression conditions are not met. The alarm is cleared when the DMS alarm module detects that the disk usage of the cluster resource pool is below the threshold. NOTE: If this alarm is generated, contact technical support engineers. |
| Default | Session Usage in a DWS Cluster Exceeds the Threshold | Critical | This alarm is generated by the DMS alarm module if the session usage in the cluster goes beyond the set threshold within a specific time frame and the suppression conditions are not met. The alarm will be cleared once the session usage in the cluster drops below the threshold. NOTE: If this alarm is generated, contact technical support engineers. |
| Default | Active Session Usage in a DWS Cluster Exceeds the Threshold | Critical | The DMS alarm module generates an alarm if the active session usage in the cluster exceeds the set threshold within a specific time frame and the suppression conditions are not met. The alarm is cleared when the DMS alarm module detects that the active session usage in the cluster is below the threshold. NOTE: If this alarm is generated, contact technical support engineers. |
| Default | Number of Database Deadlocks in a DWS Cluster Exceeds the Threshold | Critical | If the number of deadlocks in the cluster database exceeds the threshold within a specific time frame and the suppression conditions are not met, the DMS alarm module will generate an alarm. The alarm will be cleared once the DMS alarm module detects that the number of deadlocks in the cluster database is below the threshold. NOTE: If this alarm is generated, contact technical support engineers. |
| Default | Database Session Usage of the DWS Cluster Exceeds the Threshold | Critical | The DMS alarm module will generate an alarm if the session usage of the cluster database goes over the threshold within a specific time frame and the suppression conditions are not met. The alarm will be cleared once the DMS alarm module detects that the session usage of the cluster database is below the threshold. NOTE: If this alarm is generated, contact technical support engineers. |
| Custom | Name of the user-defined threshold alarm | User-defined alarm severity | Alarm description |
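The Node Data Disk Usage row is the only default alarm with two severity levels (Critical above 85%, Major above 80%). As a rough illustration of how such tiered thresholds could be evaluated for "any node in the cluster", here is a minimal sketch. The function names and the input format are assumptions made for this example, and the sketch ignores the evaluation period and the constraint.

```python
from typing import Dict, Optional

# Thresholds taken from the Node Data Disk Usage row, highest severity first.
DISK_USAGE_LEVELS = [("Critical", 85.0), ("Major", 80.0)]


def disk_usage_severity(usage_percent: float) -> Optional[str]:
    """Return the severity for one usage sample, or None if no threshold is exceeded."""
    for severity, threshold in DISK_USAGE_LEVELS:
        if usage_percent > threshold:
            return severity
    return None


def nodes_over_threshold(usage_by_node: Dict[str, float]) -> Dict[str, str]:
    """Map each node that exceeds a threshold to its severity (hypothetical helper)."""
    result = {}
    for node, usage in usage_by_node.items():
        severity = disk_usage_severity(usage)
        if severity is not None:
            result[node] = severity
    return result
```

Because the comparison is strict, a node at exactly 85% is reported as Major rather than Critical, and a node at exactly 80% raises no alarm in this sketch.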
Modifying an Alarm Rule
- Log in to the DWS console.
- In the navigation tree on the left, choose Monitoring > Alarm.
- Click View Alarm Rule in the upper left corner.
- On the Alarm Rules page that is displayed, click Modify in the Operation column of the target alarm rule.
- Read-only users (with the DWS ReadOnlyAccess permission) cannot modify alarm rules.
- Default rules have limited options that you can modify, such as cluster binding, alarm policy trigger threshold, data capture interval, and alarm suppression conditions. However, custom rules offer more flexibility, allowing you to modify all options.
Table 2 Alarm rule parameters

| Parameter | Description | Example Value |
| --- | --- | --- |
| Alarm Rule | The rule name contains 6 to 64 characters and must start with a non-digit character. | - |
| Description | User-defined description, which contains a maximum of 490 characters. | - |
| Associated Cluster | You can select a cluster of the current tenant from the drop-down list box as the monitoring cluster of the alarm module. | All |
| Triggered Policies | Policy triggering relationships are as follows. Independent: Alarm policies are triggered independently of each other. Priority: Alarm policies are triggered by priority. Policies of a lower priority will be automatically triggered after those of a higher priority. | Independent |
| Alarm Policy | The alarm policies are as follows. Metric: DWS monitoring metric, which is the data source used by the alarm engine for threshold determination. Alarm Object: databases in the selected cluster and schemas in the selected databases. Trigger: calculation rule for threshold determination of a monitoring metric. Select the average value within a period of time of a metric to reduce the probability of alarm oscillation. Constraint: suppresses the repeated triggering and clearance of alarms of the same type within the specified period. Alarm Severity: includes Urgent, Important, Minor, and Prompt. | - |
- Confirm the information and click OK.
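To see why the Trigger setting recommends an average and what the Constraint setting suppresses, consider the following sketch. It is not how the DWS alarm engine is implemented. The class names, the sample-count window, and the timestamp handling are assumptions chosen for illustration.

```python
from collections import deque
from typing import Optional


class AveragedTrigger:
    """Fires when the mean of the last `window` samples exceeds `threshold`."""

    def __init__(self, threshold: float, window: int) -> None:
        self.threshold = threshold
        self.samples = deque(maxlen=window)  # oldest sample is dropped automatically

    def observe(self, value: float) -> bool:
        self.samples.append(value)
        if len(self.samples) < self.samples.maxlen:
            return False  # not enough data for a full window yet
        return sum(self.samples) / len(self.samples) > self.threshold


class Constraint:
    """Suppresses repeated alarms of the same type within `period_s` seconds."""

    def __init__(self, period_s: float) -> None:
        self.period_s = period_s
        self.last_sent: Optional[float] = None

    def allow(self, now: float) -> bool:
        if self.last_sent is not None and now - self.last_sent < self.period_s:
            return False  # still inside the suppression period
        self.last_sent = now
        return True


# A single spike (95) does not fire; a sustained high level does.
trigger = AveragedTrigger(threshold=80.0, window=3)
fired = [trigger.observe(v) for v in [95, 40, 60, 92, 93]]
# fired == [False, False, False, False, True]
```

Averaging over a window means a brief spike does not raise an alarm, and the constraint keeps an alarm that stays above the threshold from being reported again until the suppression period has elapsed.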
Creating an Alarm Rule
- Log in to the DWS console.
- In the navigation tree on the left, choose Monitoring > Alarm.
- Click View Alarm Rule in the upper left corner.
- Click Create Alarm Rule in the upper right corner. You can configure items, such as the alarm rule name, rule description, associated cluster, and alarm policy. For details, see Table 2.
Currently, only alarm rules of schema usage metrics can be created on DWS.
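Before submitting a new rule, it can help to check the name and description against the limits in Table 2. The helper below is a hypothetical client-side check written for this example. It covers only the limits stated in Table 2, and the console may apply additional restrictions that are not listed there.

```python
from typing import List


def validate_rule_fields(name: str, description: str = "") -> List[str]:
    """Return a list of problems found; an empty list means the fields pass."""
    errors = []
    # Table 2: the rule name contains 6 to 64 characters.
    if not 6 <= len(name) <= 64:
        errors.append("rule name must contain 6 to 64 characters")
    # Table 2: the rule name must start with a non-digit character.
    if name[:1].isdigit():
        errors.append("rule name must start with a non-digit character")
    # Table 2: the description contains a maximum of 490 characters.
    if len(description) > 490:
        errors.append("description must not exceed 490 characters")
    return errors
```

For example, `validate_rule_fields("schema_usage_rule")` returns an empty list, while a name such as `"1rule_x"` is rejected because it starts with a digit.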