Alarm Noise Reduction

Updated on 2024-10-30 GMT+08:00

This section describes how to configure alarm noise reduction. Before sending an alarm notification, AOM processes alarms based on noise reduction rules to prevent alarm storms.

Scenario

When analyzing applications, resources, and businesses, e-commerce O&M personnel find that there are too many alarms and that many of them are identical. They still need to detect faults from these alarms and monitor applications comprehensively.

Solution

Use AOM to set alarm rules to monitor the usage of resources (such as hosts and components) in the environment in real time. When AOM or an external service is abnormal, an alarm is triggered immediately. AOM also provides the alarm noise reduction function. Before sending an alarm notification, AOM processes alarms based on noise reduction rules. This helps you identify critical problems and avoid alarm storms.

Alarm noise reduction consists of four parts: grouping, suppression, silence, and deduplication.

  • Grouping: You can filter different subsets of alarms and then group them according to certain conditions. Alarms in the same group are aggregated to trigger one notification.
  • Suppression: By using suppression rules, you can suppress or block notifications related to specific alarms. For example, when a major alarm is generated, less severe alarms can be suppressed. As another example, when a node is faulty, all other alarms of the processes or containers on this node can be suppressed.
  • Silence: You can create a silence rule to shield alarm notifications in a specified period. The rule takes effect immediately after it is created.
  • Deduplication: AOM has built-in deduplication rules. The service backend automatically deduplicates alarms, so you do not need to create rules manually.
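The suppression and silence behavior described above can be illustrated with a minimal sketch. It is not AOM's implementation: the Alarm fields, the severity ranking, and both function names are assumptions made for illustration only.

```python
from dataclasses import dataclass
from datetime import datetime

# Assumed severity ranking for illustration; a lower number means more severe.
SEVERITY_RANK = {"critical": 0, "major": 1, "minor": 2, "warning": 3}


@dataclass(frozen=True)
class Alarm:
    source: str          # hypothetical field: the resource that raised the alarm
    severity: str        # one of the keys in SEVERITY_RANK
    raised_at: datetime  # hypothetical field: when the alarm was generated


def suppress_less_severe(alarms):
    """Keep only the most severe alarms of each source; drop the rest."""
    most_severe = {}
    for alarm in alarms:
        rank = SEVERITY_RANK[alarm.severity]
        most_severe[alarm.source] = min(rank, most_severe.get(alarm.source, rank))
    return [a for a in alarms if SEVERITY_RANK[a.severity] == most_severe[a.source]]


def drop_silenced(alarms, silence_start, silence_end):
    """Drop alarms raised inside the window [silence_start, silence_end)."""
    return [a for a in alarms if not (silence_start <= a.raised_at < silence_end)]
```

In this sketch, a warning alarm on a node that also has a major alarm would not produce a notification, and no alarm raised during the silence window would be sent.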

Monitoring ELB metrics at the business layer is used as an example here.

Step 1: Create a Grouping Rule

In this example, when a critical or major alarm is generated, the Monitor_host action rule is triggered, and alarms are grouped by alarm source. To create a grouping rule:

  1. Log in to the AOM 2.0 console.
  2. In the navigation pane, choose Alarm Management > Alarm Noise Reduction.
  3. On the Grouping Rules tab page, click Create and set the rule name and grouping condition.

    Figure 1 Creating a grouping rule
    Table 1 Alarm combination rule

    Combine Notifications

    Combines grouped alarms based on specified fields. Alarms in the same group are aggregated for sending one notification.

    Notifications can be combined:

    • By alarm source: Alarms triggered by the same alarm source are combined into one group for sending notifications.
    • By alarm source + severity: Alarms triggered by the same alarm source and of the same severity are combined into one group for sending notifications.
    • By alarm source + all tags: Alarms triggered by the same alarm source and with the same tag are combined into one group for sending notifications.

    Initial Wait Time

    Time to wait before sending the first notification after alarms are combined for the first time. A value in seconds is recommended to prevent alarm storms.

    Value range: 0s to 10 minutes. Recommended: 15s.

    Batch Processing Interval

    Time to wait before sending a notification after the combined alarm data changes. A value in minutes is recommended. If you want to receive alarm notifications as soon as possible, use a value in seconds.

    A change here means that a new alarm is generated or the status of an alarm changes.

    Value range: 5s to 30 minutes. Recommended: 60s.

    Repeat Interval

    Time to wait before sending a notification again when the combined alarm data is duplicate. A value in hours is recommended.

    Data is considered duplicate when no new alarm is generated and no alarm status changes, even if other attributes (such as titles and content) change.

    Value range: 0 minutes to 15 days. Recommended: 1 hour.
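As a rough illustration of Table 1, the following sketch groups alarms by each of the three combination modes and checks timing settings against the listed value ranges. The alarm dictionaries, mode names, and function names are hypothetical; only the ranges and recommended values come from the table.

```python
from collections import defaultdict

# Value ranges from Table 1, converted to seconds.
LIMITS = {
    "initial_wait": (0, 10 * 60),               # 0s to 10 minutes
    "batch_interval": (5, 30 * 60),             # 5s to 30 minutes
    "repeat_interval": (0, 15 * 24 * 60 * 60),  # 0 minutes to 15 days
}
# Recommended values from Table 1, in seconds.
RECOMMENDED = {"initial_wait": 15, "batch_interval": 60, "repeat_interval": 3600}


def validate_timing(settings):
    """Raise ValueError if any setting falls outside its documented range."""
    for name, value in settings.items():
        low, high = LIMITS[name]
        if not low <= value <= high:
            raise ValueError(f"{name}={value}s is outside {low}s to {high}s")
    return settings


def group_key(alarm, mode):
    """Build the grouping key for one alarm (mode names are hypothetical)."""
    if mode == "source":
        return (alarm["source"],)
    if mode == "source+severity":
        return (alarm["source"], alarm["severity"])
    if mode == "source+tags":
        return (alarm["source"], tuple(sorted(alarm["tags"].items())))
    raise ValueError(f"unknown mode: {mode}")


def combine(alarms, mode):
    """Group alarms so that each group would trigger one notification."""
    groups = defaultdict(list)
    for alarm in alarms:
        groups[group_key(alarm, mode)].append(alarm)
    return dict(groups)
```

With two alarms of different severity from the same source, grouping by alarm source yields one group, while grouping by alarm source + severity yields two.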

Step 2: Create a Metric Alarm Rule (Configuration Mode Set to Select from all metrics)

You can set threshold conditions in metric alarm rules for resource metrics. If a metric value meets the threshold condition, a threshold alarm will be generated. If no metric data is reported, an insufficient data event will be generated.
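The two outcomes described above (a threshold alarm, or an insufficient data event when nothing is reported) can be sketched as follows. This is a simplified illustration, not AOM's evaluation logic: it assumes the check compares only the latest sample, and the function name and return strings are hypothetical.

```python
import operator

# Comparison operators a threshold condition might use (assumed set).
COMPARATORS = {
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
}


def check_metric(samples, op, threshold):
    """Classify one check of a metric against a threshold condition.

    samples: metric values reported in the check interval, oldest first.
    """
    if not samples:
        # No metric data was reported in this interval.
        return "insufficient data event"
    latest = samples[-1]
    if COMPARATORS[op](latest, threshold):
        return "threshold alarm"
    return "normal"
```

For example, check_metric([70, 95], ">", 80) classifies the interval as a threshold alarm because the latest sample exceeds 80.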

Metric alarm rules can be created in the following modes: Select from all metrics and PromQL. The following describes how to create an alarm rule for monitoring all metrics at the ELB business layer.

  1. Log in to the AOM 2.0 console.
  2. In the navigation pane, choose Alarm Management > Alarm Rules.
  3. On the Metric/Event Alarm Rules tab page, click Create.
  4. Set the basic information about the alarm rule, such as the rule name.
  5. Set the detailed information about the alarm rule.

    1. Set Rule Type to Metric alarm rule and Configuration Mode to Select from all metrics.
    2. Set parameters such as the metric, environment, and check interval.
      Figure 2 Setting the detailed information about the alarm rule
    3. Set alarm tags and annotations so that alarms can be grouped and associated with alarm noise reduction policies for sending notifications. Because a business-layer metric was selected in the previous substep, set Alarm Tag to aom_monitor_level:business.
      Figure 3 Customizing tag information
      NOTE:

      For metrics chosen in Select from all metrics mode, the tag is in the format of "key:value". Generally, key is set to aom_monitor_level. value depends on the layer of the metric:

      • Infrastructure metrics: infrastructure
      • Middleware metrics: middleware
      • Application metrics: application
      • Business metrics: business

  6. Set an alarm notification policy. There are two alarm notification modes. In this example, the alarm noise reduction mode is selected.

    Alarm noise reduction: Alarms are sent only after being processed based on noise reduction rules, preventing alarm storms.
    Figure 4 Selecting the alarm noise reduction mode

  7. Click Confirm. Then, click Back to Alarm Rule List to view the created alarm rule.

    As shown in the following figure, a metric alarm rule is created. Click the expand icon in front of the rule name to view its details.

    Figure 5 Creating a metric alarm rule

    In the expanded list, if a metric value meets the configured alarm condition, a metric alarm is generated on the alarm page. To view the alarm, choose Alarm Management > Alarm List in the navigation pane.

    If the preset notification policy is met, the system sends an alarm notification to the specified personnel by email, SMS, or WeCom.
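The "key:value" alarm tag format from the NOTE in step 5 can be checked with a small helper before a tag is entered. This is only a sketch: the function name is hypothetical, and restricting aom_monitor_level to the four listed layers is an assumption drawn from that NOTE.

```python
# Layer values listed in the NOTE for the aom_monitor_level key.
ALLOWED_LEVELS = {"infrastructure", "middleware", "application", "business"}


def parse_alarm_tag(tag):
    """Split a "key:value" alarm tag and validate the monitoring layer."""
    key, sep, value = tag.partition(":")
    if not sep or not key or not value:
        raise ValueError(f"tag must look like key:value, got {tag!r}")
    if key == "aom_monitor_level" and value not in ALLOWED_LEVELS:
        raise ValueError(f"unknown monitoring layer: {value!r}")
    return key, value
```

The tag used in this example, aom_monitor_level:business, parses into the key aom_monitor_level and the value business.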
