Updated on 2025-09-25 GMT+08:00

Alarm Rules

Overview

  • Concepts related to threshold alarms
    • Alarm rule: consists of the alarm rule name, rule description, associated clusters, alarm policy triggering relationship, and alarm policies. An alarm rule can apply to one cluster or to all clusters, and can contain one or more alarm policies. You select the relationship between alarm policies in Triggered Policies. Each alarm policy defines its own triggers and constraint.
    • Alarm policy: consists of the triggers, constraint, and alarm severity for an alarm metric.
    • Alarm metric: a database cluster metric, generally time series data, such as node CPU usage or the amount of data flushed to disks.
  • Alarm rule types
    • Default rule: a predefined rule that reflects best practices for DWS threshold alarms.
    • User-defined rule: a custom alarm rule that you create by configuring or combining monitoring metrics. (The current version supports only user-defined alarm rules for schema usage.)
  • Alarm rule operations
    • Modify: modifies an alarm rule. This operation applies to all alarm rules. You can modify all items of a user-defined alarm rule but only some items of a default alarm rule.
    • Enable/Disable: enables or disables an alarm rule. This operation applies to all alarm rules. An enabled rule is added to the check list of the alarm engine and can be triggered normally. A disabled rule is not in the check list.
    • Delete: deletes an alarm rule. You can delete only user-defined rules. Default alarm rules cannot be deleted.
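
The concepts and operations above can be summarized in a small data model. This is only an illustrative sketch, not how DWS implements alarm rules: the class names `AlarmPolicy` and `AlarmRule`, their fields, and the helpers `check_list` and `delete_rule` are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class AlarmPolicy:
    """Hypothetical model: trigger, constraint, and severity for one metric."""
    metric: str       # for example, "node CPU usage"
    trigger: str      # threshold condition, kept as free text in this sketch
    constraint: str   # suppression condition, kept as free text in this sketch
    severity: str     # for example, "Critical" or "Major"


@dataclass
class AlarmRule:
    """Hypothetical model of an alarm rule as described in the overview."""
    name: str
    description: str
    clusters: List[str]                      # one cluster, or ["All"]
    policies: List[AlarmPolicy]
    triggered_policies: str = "Independent"  # or "Priority"
    is_default: bool = False
    enabled: bool = True


def check_list(rules: List[AlarmRule]) -> List[AlarmRule]:
    """Only enabled rules are in the check list of the alarm engine."""
    return [rule for rule in rules if rule.enabled]


def delete_rule(rules: List[AlarmRule], name: str) -> List[AlarmRule]:
    """Remove a user-defined rule by name; default rules cannot be deleted."""
    for rule in rules:
        if rule.name == name and rule.is_default:
            raise ValueError("default alarm rules cannot be deleted")
    return [rule for rule in rules if rule.name != name]
```

In this sketch, disabling a rule only removes it from the list returned by `check_list`; the rule itself is kept and can be enabled again.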

Precautions

To monitor alarms of a new cluster after a cluster migration, change the cluster bound to the alarm rule to the new cluster. Alternatively, create an alarm rule for the new cluster.

Viewing Alarm Rules

  1. Log in to the DWS console.
  2. In the navigation tree on the left, choose Monitoring > Alarm.
  3. Click View Alarm Rule in the upper left corner. On the page that is displayed, you can see the threshold alarm rules of database cluster monitoring metrics, as shown in the following figure. For details, see Table 1.

    The alarm policy is triggered based on the current configuration.

    Table 1 Threshold alarms of DMS alarm sources

    | Alarm Type | Alarm Name | Alarm Severity | Alarm Description |
    | --- | --- | --- | --- |
    | Default | Node CPU Usage Exceeds the Threshold | Critical | This alarm is generated if the CPU usage (system + user) of any node in the cluster exceeds the threshold within the specified period and the constraint is not met. The alarm will be cleared when the CPU usage (system + user) is lower than the threshold and the constraint is not met. |
    | Default | Node Data Disk Usage Exceeds the Threshold | Critical: > 85%; Major: > 80% | This alarm is generated if the data disk (/var/chroot/DWS/data[n]) usage of any node in the cluster exceeds the threshold within the specified period and the constraint is not met. The alarm will be cleared when the data disk (/var/chroot/DWS/data[n]) usage is lower than the threshold and the constraint is not met. |
    | Default | Node Data Disk I/O Usage Exceeds the Threshold | Critical | This alarm is generated if the data disk (/var/chroot/DWS/data[n]) I/O usage (util) of any node in the cluster exceeds the threshold within the specified period and the constraint is not met. The alarm will be cleared when the data disk (/var/chroot/DWS/data[n]) I/O usage (util) is lower than the threshold and the constraint is not met. |
    | Default | Node Data Disk Latency Exceeds the Threshold | Major | This alarm is generated if the data disk (/var/chroot/DWS/data[n]) I/O latency (await) of any node in the cluster exceeds the threshold within the specified period and the constraint is not met. The alarm will be cleared when the data disk (/var/chroot/DWS/data[n]) I/O latency (await) is lower than the threshold and the constraint is not met. |
    | Default | Data Spilled to Disks of the Query Statement Exceeds the Threshold | Critical | This alarm is generated if the amount of data flushed to disks by a SQL statement in the cluster exceeds the threshold within the specified period and the constraint is not met. The alarm can be cleared only after you handle the SQL statement. |
    | Default | Number of Queuing Query Statements Exceeds the Threshold | Critical | This alarm is generated if the number of queuing SQL statements exceeds the threshold within the specified period. The alarm will be cleared when the number of queuing SQL statements is less than the threshold. |
    | Default | Queue Congestion in the Default Cluster Resource Pool | Critical | This alarm is generated if the queue in the default resource pool of a cluster is congested and no alarm suppression conditions are met. The alarm will be cleared when the queue is no longer congested. |
    | Default | Long SQL Probe Execution Duration in a Cluster | Critical | This alarm is generated if the DMS alarm module detects that a SQL probe execution duration on a server exceeds the threshold and no alarm suppression conditions are met. If no execution duration exceeds the threshold, the alarm will be automatically cleared. NOTE: The alarm is supported only in 8.1.1.300 and later cluster versions. For earlier versions, contact technical support. |
    | Default | A Vacuum Full Operation That Holds a Table Lock for A Long Time Exists in the Cluster | Major | This alarm is generated if, within the specified period, the DMS alarm module detects that VACUUM FULL has been running for a long time in the cluster and is blocking other operations, other SQL statements are in the lock wait state, and no suppression conditions are met. The alarm will be cleared when VACUUM FULL in the cluster no longer causes lock waits. NOTE: If this alarm is generated, contact technical support engineers. |
    | Default | Instance Memory Usage of a Cluster Node Exceeds the Threshold | Critical | This alarm is generated if the DMS alarm module detects that the instance memory usage on a node in a cluster exceeds the threshold and no alarm suppression conditions are met. If the usage decreases, the alarm will be automatically cleared. NOTE: If this alarm is generated, contact technical support engineers. |
    | Default | Dynamic Memory Usage of a Cluster Node Exceeds the Threshold | Critical | This alarm is generated if the DMS alarm module detects that the dynamic memory usage on a node in a cluster exceeds the threshold and no alarm suppression conditions are met. If the usage decreases, the alarm will be automatically cleared. NOTE: If this alarm is generated, contact technical support engineers. |
    | Default | Disk Usage of a DWS Cluster Resource Pool Exceeds the Threshold | Critical | This alarm is generated if the DMS alarm module detects that the disk usage of the cluster resource pool exceeds the threshold within the specified period and the suppression conditions are not met. The alarm will be cleared when the DMS alarm module detects that the disk usage of the cluster resource pool is below the threshold. NOTE: If this alarm is generated, contact technical support engineers. |
    | Default | Session Usage in a DWS Cluster Exceeds the Threshold | Critical | This alarm is generated if the DMS alarm module detects that the session usage in the cluster exceeds the threshold within the specified period and the suppression conditions are not met. The alarm will be cleared when the session usage in the cluster drops below the threshold. NOTE: If this alarm is generated, contact technical support engineers. |
    | Default | Active Session Usage in a DWS Cluster Exceeds the Threshold | Critical | This alarm is generated if the DMS alarm module detects that the active session usage in the cluster exceeds the threshold within the specified period and the suppression conditions are not met. The alarm will be cleared when the DMS alarm module detects that the active session usage in the cluster is below the threshold. NOTE: If this alarm is generated, contact technical support engineers. |
    | Default | Number of Database Deadlocks in a DWS Cluster Exceeds the Threshold | Critical | This alarm is generated if the DMS alarm module detects that the number of deadlocks in the cluster database exceeds the threshold within the specified period and the suppression conditions are not met. The alarm will be cleared when the DMS alarm module detects that the number of deadlocks in the cluster database is below the threshold. NOTE: If this alarm is generated, contact technical support engineers. |
    | Default | Database Session Usage of the DWS Cluster Exceeds the Threshold | Critical | This alarm is generated if the DMS alarm module detects that the session usage of the cluster database exceeds the threshold within the specified period and the suppression conditions are not met. The alarm will be cleared when the DMS alarm module detects that the session usage of the cluster database is below the threshold. NOTE: If this alarm is generated, contact technical support engineers. |
    | Custom | Name of the user-defined threshold alarm | User-defined alarm severity | Alarm description |
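
As a worked example of how a severity follows from a threshold, the sketch below maps a data disk usage value to the severities listed in Table 1 for Node Data Disk Usage Exceeds the Threshold (Critical: > 85%; Major: > 80%). The function name `disk_usage_severity` is hypothetical, and the sketch deliberately ignores the evaluation period and the constraint, which the real alarm engine also takes into account.

```python
from typing import Optional

# Thresholds from Table 1, ordered from most to least severe.
DISK_USAGE_SEVERITIES = [("Critical", 85.0), ("Major", 80.0)]


def disk_usage_severity(usage_percent: float) -> Optional[str]:
    """Return the severity for a data disk usage value, or None if no threshold is exceeded."""
    for severity, threshold in DISK_USAGE_SEVERITIES:
        if usage_percent > threshold:  # strictly greater, as in "> 85%"
            return severity
    return None
```

Checking the most severe threshold first means that a usage of 90% is reported as Critical rather than Major, even though it exceeds both thresholds.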

Modifying an Alarm Rule

  1. Log in to the DWS console.
  2. In the navigation tree on the left, choose Monitoring > Alarm.
  3. Click View Alarm Rule in the upper left corner.
  4. On the Alarm Rules page that is displayed, click Modify in the Operation column of the target alarm rule.

    • Read-only users (with the DWS ReadOnlyAccess permission) cannot modify alarm rules.
    • Default rules have limited options that you can modify, such as cluster binding, alarm policy trigger threshold, data capture interval, and alarm suppression conditions. However, custom rules offer more flexibility, allowing you to modify all options.
    Table 2 Alarm rule parameters

    | Parameter | Description | Example Value |
    | --- | --- | --- |
    | Alarm Rule | The rule name contains 6 to 64 characters and must start with a non-digit character. | - |
    | Description | User-defined description, which contains a maximum of 490 characters. | - |
    | Associated Cluster | You can select a cluster of the current tenant from the drop-down list box as the monitoring cluster of the alarm module. | All |
    | Triggered Policies | Policy triggering relationships are as follows: • Independent: Alarm policies are triggered independently of each other. • Priority: Alarm policies are triggered by priority. Policies of a lower priority will be automatically triggered after those of a higher priority. | Independent |
    | Alarm Policy | An alarm policy consists of the following items: • Metric: DWS monitoring metric, which is the data source used by the alarm engine for threshold determination. • Alarm Object: databases in the selected cluster and schemas in the selected databases. • Trigger: calculation rule for threshold determination of a monitoring metric. Selecting the average value of a metric over a period of time reduces the probability of alarm oscillation. • Constraint: suppresses the repeated triggering and clearance of alarms of the same type within the specified period. • Alarm Severity: includes Urgent, Important, Minor, and Prompt. | - |

  5. Confirm the information and click OK.
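
The Trigger and Constraint items in Table 2 can be illustrated with a short sketch. It is one possible interpretation, not the DWS alarm engine: the function `average_exceeds`, the class `Constraint`, and the use of seconds as the time unit are all assumptions made for illustration.

```python
from statistics import mean
from typing import Optional, Sequence


def average_exceeds(samples: Sequence[float], threshold: float) -> bool:
    """Trigger on the average of a non-empty window instead of on any single sample."""
    return mean(samples) > threshold


class Constraint:
    """Hypothetical suppression: ignore repeated alarms of the same type within period_s seconds."""

    def __init__(self, period_s: float) -> None:
        self.period_s = period_s
        self._last_reported: Optional[float] = None

    def allow(self, now_s: float) -> bool:
        """Return True if an alarm at time now_s should be reported."""
        if self._last_reported is not None and now_s - self._last_reported < self.period_s:
            return False  # still inside the suppression window
        self._last_reported = now_s
        return True
```

With a threshold of 80, the window `[50, 95, 60]` does not trigger because its average is below the threshold, even though one sample is above it. This is why averaging reduces alarm oscillation.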

Creating an Alarm Rule

  1. Log in to the DWS console.
  2. In the navigation tree on the left, choose Monitoring > Alarm.
  3. Click View Alarm Rule in the upper left corner.
  4. Click Create Alarm Rule in the upper right corner. Configure items such as the alarm rule name, rule description, associated cluster, and alarm policy. For details, see Table 2.

    Currently, only alarm rules of schema usage metrics can be created on DWS.
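
Before you enter values on the console, the name and description limits from Table 2 can be checked locally. The sketch below is a minimal illustration of those two documented limits only. The function name `validate_rule_input` is hypothetical, and the console may enforce further restrictions that are not described here.

```python
from typing import List

NAME_MIN, NAME_MAX = 6, 64   # rule name length limits from Table 2
DESCRIPTION_MAX = 490        # description length limit from Table 2


def validate_rule_input(name: str, description: str) -> List[str]:
    """Return a list of problems; an empty list means the input passes these checks."""
    problems = []
    if not NAME_MIN <= len(name) <= NAME_MAX:
        problems.append("name must contain 6 to 64 characters")
    if name[:1].isdigit():
        problems.append("name must start with a non-digit character")
    if len(description) > DESCRIPTION_MAX:
        problems.append("description must not exceed 490 characters")
    return problems
```

Returning every problem at once, rather than stopping at the first one, lets you correct the name and the description in a single pass.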