Updated on 2025-08-15 GMT+08:00

Preventing ELB Alarm Storms Using AOM Alarm Grouping Rules

This section describes how to configure alarm noise reduction. Before sending an alarm notification, AOM processes alarms based on noise reduction rules to prevent alarm storms.

Scenario

When analyzing applications, resources, and businesses, e-commerce O&M personnel find that there are too many alarms, many of them identical. As a result, they cannot locate faults from the alarms or monitor applications comprehensively.

Solution

The following shows how to use grouping rules to prevent alarm storms when monitoring metrics at the ELB business layer.

  1. Step 1: Create a Grouping Rule: Filter alarm subsets and then group them based on different conditions. Alarms in the same group are aggregated to trigger one notification.
  2. Step 2: Create a Metric Alarm Rule (Configuration Mode Set to Select from all metrics): Set an alarm rule and associate it with the grouping rule to monitor resources (such as hosts and components) in real time.

Step 1: Create a Grouping Rule

In this example, when a critical or major alarm is generated, the notification rule named apm is triggered, and alarms are grouped by alarm source and severity. To create a grouping rule, do as follows:

  1. Log in to the AOM 2.0 console.
  2. In the navigation pane, choose Alarm Center > Alarm Noise Reduction.
  3. On the Grouping Rules tab page, click Create and set the rule name and grouping condition.

    Figure 1 Creating a grouping rule
    Table 1 Grouping rule parameters

    Parameter

    Description

    Example Value

    Rule Name

    Name of a grouping rule.

    Enter up to 100 characters and do not start or end with an underscore (_). Only letters, digits, and underscores are allowed.

    rule

    Enterprise Project

    Enterprise project name.

    • If Enterprise Project is set to All on the global settings page, select an enterprise project from the drop-down list here.
    • If you have already selected an enterprise project on the global settings page, this option is grayed out and cannot be changed.

    default

    Description

    Description of a grouping rule. Enter up to 1,024 characters. In this example, leave this parameter blank.

    -

    Grouping Condition

    Conditions used to filter alarms. After alarms are filtered, you can set alarm notification rules for the matching alarms.

    • Alarm Severity: severity of a metric or event alarm. Options: Critical, Major, Minor, and Warning.
    • Alarm Source: name of the service that triggers the alarm or event. Options include AOM, LTS, and CCE.

    • Alarm Severity + Equals to + Critical & Major
    • Alarm Source + Equals to + AOM

    Notification Rule

    You can associate an alarm notification rule with an SMN topic and a message template. If log, resource, or metric data meets the alarm condition, the system sends an alarm notification based on the associated SMN topic and message template.

    apm

    Combine Notifications

    Combines grouped alarms based on specified fields. Alarms in the same group are aggregated for sending one notification. In this example, select By alarm source + severity.

    By alarm source + severity: Alarms triggered by the same alarm source and of the same severity are combined into one group for sending notifications.

    By alarm source + severity

    Initial Wait Time

    Interval for sending an alarm notification after alarms are combined for the first time. A value in seconds is recommended to prevent alarm storms.

    15s

    Batch Processing Interval

    Waiting time for sending an alarm notification after the combined alarm data changes. The change here refers to a new alarm or an alarm status change.

    60s

    Repeat Interval

    Waiting time for sending an alarm notification after the combined alarm data becomes duplicate. Duplication means that no new alarm is generated and no alarm status is changed while other attributes (such as titles and content) are changed.

    1 hour

  4. Click Confirm.
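
The filtering and combining behavior configured in Table 1 can be illustrated with a minimal Python sketch. It is not AOM code: the alarm records, their field names, and the helper functions are hypothetical, and the sketch assumes that both grouping conditions must be met for an alarm to match.

```python
from collections import defaultdict

# Hypothetical alarm records; the field names are illustrative, not AOM's schema.
ALARMS = [
    {"source": "AOM", "severity": "Critical", "name": "cpu_high_host_a"},
    {"source": "AOM", "severity": "Critical", "name": "cpu_high_host_b"},
    {"source": "AOM", "severity": "Major", "name": "cpu_high_host_c"},
    {"source": "AOM", "severity": "Minor", "name": "disk_usage_host_a"},
    {"source": "LTS", "severity": "Critical", "name": "log_keyword_error"},
]


def matches_grouping_condition(alarm):
    """Example Grouping Condition: severity is Critical or Major, and source is AOM.

    Treating the two conditions as a logical AND is an assumption of this sketch.
    """
    return alarm["severity"] in {"Critical", "Major"} and alarm["source"] == "AOM"


def combine_by_source_and_severity(alarms):
    """Model of "By alarm source + severity": one group per (source, severity) pair."""
    groups = defaultdict(list)
    for alarm in alarms:
        if matches_grouping_condition(alarm):
            groups[(alarm["source"], alarm["severity"])].append(alarm["name"])
    return dict(groups)


groups = combine_by_source_and_severity(ALARMS)
for (source, severity), names in groups.items():
    # Each group results in a single notification, however many alarms it holds.
    print(f"{source}/{severity}: 1 notification covering {len(names)} alarm(s)")
```

In this sample, five alarms produce two notifications: one for the two AOM critical alarms and one for the AOM major alarm. The minor alarm and the LTS alarm do not match the grouping condition.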

Step 2: Create a Metric Alarm Rule (Configuration Mode Set to Select from all metrics)

You can set threshold conditions in metric alarm rules for resource metrics. If a metric value meets the threshold condition, a threshold alarm will be generated. If no metric data is reported, an insufficient data event will be generated.
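
As a rough illustration of how a threshold rule of this kind behaves, the following Python sketch models the example values used later in this section (Avg > 1, trigger after 3 consecutive periods, clear after 1 period, insufficient data after 1 empty period). It is a simplified model rather than AOM's implementation: the function name, the status strings, and the choice to return to Normal as soon as data reappears are assumptions.

```python
from statistics import mean

THRESHOLD = 1        # detection rule "Avg > 1"
TRIGGER_PERIODS = 3  # consecutive breaching periods before an alarm is generated
CLEAR_PERIODS = 1    # consecutive non-breaching periods before the alarm is cleared
NODATA_PERIODS = 1   # consecutive empty periods before "Insufficient data"


def evaluate(periods):
    """Return the rule status after each statistical period.

    periods: list of sample lists, one per period; an empty list means no data.
    """
    status = "Normal"
    breaches = recoveries = empties = 0
    history = []
    for samples in periods:
        if not samples:
            empties += 1
            breaches = recoveries = 0
            if empties >= NODATA_PERIODS:
                status = "Insufficient data"
        else:
            empties = 0
            if status == "Insufficient data":
                status = "Normal"  # assumption: data is back, so leave the no-data state
            if mean(samples) > THRESHOLD:
                breaches += 1
                recoveries = 0
                if breaches >= TRIGGER_PERIODS:
                    status = "Alarm"
            else:
                recoveries += 1
                breaches = 0
                if recoveries >= CLEAR_PERIODS:
                    status = "Normal"
        history.append(status)
    return history


history = evaluate([[0.5], [2, 3], [1.5], [4], [0.2], []])
print(history)
```

The alarm is raised only in the fourth period, after three consecutive periods whose average exceeds the threshold, and is cleared by the first period below it.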

The following describes how to create an alarm rule for monitoring all metrics at the ELB business layer.

  1. In the navigation pane, choose Alarm Center > Alarm Rules.
  2. On the Prometheus Monitoring tab page, click Create Alarm Rule.
  3. Set basic information about the alarm rule by referring to Table 2.

    Table 2 Basic information

    Parameter

    Description

    Example Value

    Original Rule Name

    Original name of the alarm rule.

    Enter a maximum of 256 characters and do not start or end with any special character. Only letters, digits, underscores (_), and hyphens (-) are allowed.

    monitor

    Rule Name

    Name of a rule. Max.: 256 characters. Only letters, digits, hyphens (-), and underscores (_) are allowed. Do not start or end with a hyphen or underscore. In this example, leave this parameter blank.

    NOTE:
    • If you set Rule Name, it will be displayed preferentially.
    • After an alarm rule is created, you can change Rule Name but cannot change Original Rule Name. When you change Rule Name and then move the cursor over it, both Original Rule Name and Rule Name can be viewed.

    -

    Enterprise Project

    Select the required enterprise project. The default value is default.

    default

    Description

    Description of the rule. Enter up to 1,024 characters. In this example, leave this parameter blank.

    -

  4. Set the detailed information about the alarm rule.

    1. Set Rule Type to Metric alarm rule and Configuration Mode to Select from all metrics.
    2. Select Prometheus_AOM_Default (default) for Prometheus Instance.
    3. Set alarm rule details. Table 3 describes the parameters.
      After the setting is complete, the monitored metric data is displayed in a line graph above the alarm conditions. You can click Add Metric to add more metrics and set the statistical period and detection rules for them.
      Table 3 Alarm rule details

      Parameter

      Description

      Example Value

      Multiple Metrics

      Calculation is performed based on the preset alarm conditions one by one. An alarm is triggered when one of the conditions is met.

      Multiple Metrics

      Metric

      Metric to be monitored. Click the Metric text box. In the resource tree on the right, select a target metric by resource type.

      aom_process_cpu_usage

      Statistical Period

      Interval at which metric data is collected.

      1 minute

      Conditions

      Metric monitoring scope. If this parameter is left blank, all resources are covered. In this example, leave this parameter blank.

      -

      Grouping Condition

      Aggregate metric data by the specified field and calculate the aggregation result.

      Not grouped

      Rule

      Detection rule of a metric alarm, which consists of the statistical mode (Avg, Min, Max, Sum, and Samples), determination criterion (≥, ≤, >, and <), and threshold value.

      Avg > 1

      Trigger Condition

      When the metric value meets the alarm condition for a specified number of consecutive periods, a metric alarm will be generated.

      3

      Alarm Severity

      Severity of a metric alarm.

    4. Click Advanced Settings and set information such as Check Interval and Alarm Clearance. For details about the parameters, see Table 4.
      Table 4 Advanced settings

      Parameter

      Description

      Example Value

      Check Interval

      Interval at which metric query and analysis results are checked.

      Custom interval: 1 minute

      Alarm Clearance

      The alarm will be cleared when the alarm condition is not met for a specified number of consecutive periods.

      1

      Action Taken for Insufficient Data

      Action to be taken if there is no or insufficient metric data within the monitoring period. Enable this option if needed.

      Enabled: If the data is insufficient for 1 period, the status will change to Insufficient data and an alarm will be sent.

      Tags

      Tags added to alarm rules. They are synchronized to TMS, can be used to filter alarm rules and to group alarms for noise reduction, and can be referenced as "${event.metadata.tag key}" in message templates. In this example, leave this parameter blank.

      -

      Annotations

      Attributes (key-value pairs) added to alarm rules. Annotations are not synchronized to TMS, but can be used to group alarms for noise reduction and can be referenced as "${event.metadata.annotation key}" in message templates. In this example, leave this parameter blank.

      -

  5. Set an alarm notification policy. For details, see Table 5.

    Figure 2 Selecting the alarm noise reduction mode
    Table 5 Alarm notification policy parameters

    Parameter

    Description

    Example Value

    Notify When

    Set the scenario for sending alarm notifications. By default, Alarm triggered and Alarm cleared are selected.

    • Alarm triggered: If the alarm trigger condition is met, the system sends an alarm notification to the specified personnel by email or SMS.
    • Alarm cleared: If the alarm clearance condition is met, the system sends an alarm notification to the specified personnel by email or SMS.

    Alarm triggered and Alarm cleared

    Alarm Mode

    Alarm mode. Select Alarm noise reduction.

    Alarm noise reduction: Alarms are sent only after being processed based on noise reduction rules, preventing alarm storms.

    Alarm noise reduction

    Grouping Rule

    Filter alarm subsets and then group them based on the grouping conditions. Alarms in the same group are aggregated to trigger one notification.

    rule

  6. Click Confirm. Then click View Rule to view the created rule.

    If a metric value meets the configured alarm condition, a metric alarm will be generated. To view the alarm, choose Alarm Center > Alarm List in the navigation pane. The generated AOM critical and major alarms will be aggregated based on the rule set in Step 1: Create a Grouping Rule for notification.
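
The three timing parameters of the grouping rule (Initial Wait Time, Batch Processing Interval, and Repeat Interval) decide when each aggregated notification is sent. The Python sketch below is one reading of the parameter descriptions in Table 1, using the example values 15s, 60s, and 1 hour. The function and argument names are hypothetical, and the actual scheduling inside AOM may differ.

```python
INITIAL_WAIT = 15       # seconds, Initial Wait Time
BATCH_INTERVAL = 60     # seconds, Batch Processing Interval
REPEAT_INTERVAL = 3600  # seconds, Repeat Interval (1 hour)


def next_send_time(group_created_at, last_sent_at, changed):
    """Return the earliest time (in seconds) at which a group may notify again.

    group_created_at: time the first alarm entered the group
    last_sent_at:     time of the previous notification, or None if none was sent
    changed:          True if a new alarm arrived or an alarm status changed
                      since the previous notification
    """
    if last_sent_at is None:
        # First notification: wait briefly so that a burst of alarms is combined.
        return group_created_at + INITIAL_WAIT
    if changed:
        # The group content changed: send an update after the batch interval.
        return last_sent_at + BATCH_INTERVAL
    # Nothing new: repeat the same notification only after the repeat interval.
    return last_sent_at + REPEAT_INTERVAL


print(next_send_time(0, None, False))  # first notification
print(next_send_time(0, 15, True))     # update after a change
print(next_send_time(0, 15, False))    # repeat of unchanged data
```

Under these assumptions, a burst of alarms that starts at time 0 yields one notification at 15 seconds, at most one update per minute while the group keeps changing, and one reminder per hour when nothing changes.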