Overview

Updated on 2023-11-28 GMT+08:00

Application Operations Management (AOM) is a one-stop, multi-dimensional O&M platform for cloud applications. It monitors applications and related cloud resources in real time, and collects and correlates resource metrics, logs, and events to analyze application health. It also supports alarm reporting and data visualization, helping you detect faults promptly and track the running status of applications, resources, and services.

Specifically, AOM monitors and centrally manages servers, storage devices, networks, web containers, and applications hosted in Docker and Kubernetes. This helps prevent problems, simplifies fault locating, and reduces O&M costs. Unlike traditional monitoring systems, AOM organizes service monitoring by application. It meets enterprises' requirements for high efficiency and fast iteration, provides effective IT support for their services, and protects and optimizes their IT assets, helping enterprises achieve their strategic goals.

Console Description

Table 1 AOM console description

Category

Description

Overview

Both the O&M overview and dashboard are provided.

  • O&M

    The O&M page supports full-link, multi-layer, and one-stop O&M for resources, applications, and user experience.

  • Dashboard

    A dashboard displays different graphs, such as line graphs and digit graphs, on the same screen so that you can view monitoring data comprehensively.

Alarm center

The alarm center displays the alarm list, event list, alarm rules, and notification rules.

  • Alarm list

    An alarm is information reported when AOM or an external service is abnormal or is in a state that may cause exceptions. Handle alarms promptly. Otherwise, service exceptions may occur.

    The alarm list displays the alarms generated within a specified time range.

  • Event list

    Events carry important information about changes in AOM or an external service. Such changes do not necessarily cause exceptions.

    The event list displays the events generated within a specified time range.

  • Alarm rules

    By setting alarm rules, you can define event conditions for services or threshold conditions for resource metrics. If the resource data of a service meets the event condition, an event alarm is generated. If the metric data of a resource meets the threshold condition, a threshold alarm is generated. If no metric data is reported, an insufficient data event is generated. In this way, you can discover and handle exceptions as early as possible.

  • Alarm notification

    AOM supports alarm notification. You can create notification rules and alarm action rules, and configure alarm noise reduction. When alarms are reported due to an exception in AOM or an external service, alarm information can be sent to specified personnel by email or Short Message Service (SMS) message. In this way, they can rectify faults in time to avoid service loss.
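The threshold behavior described under Alarm rules can be sketched in a few lines of Python. This is only an illustration, not AOM's implementation or API: the `ThresholdRule` class, the `evaluate` function, the metric name `cpu_usage`, and the returned status strings are all hypothetical, and the sketch assumes that the most recent sample alone decides the outcome.

```python
import operator
from dataclasses import dataclass
from typing import Sequence

# Hypothetical set of comparison operators a rule may use.
_OPS = {">": operator.gt, ">=": operator.ge, "<": operator.lt, "<=": operator.le}


@dataclass(frozen=True)
class ThresholdRule:
    metric: str       # illustrative metric name, e.g. "cpu_usage"
    op: str           # one of the keys in _OPS
    threshold: float


def evaluate(rule: ThresholdRule, samples: Sequence[float]) -> str:
    """Classify the reported samples of one metric against a rule."""
    if not samples:
        # No metric data was reported for the period.
        return "insufficient_data"
    compare = _OPS[rule.op]
    # Assumption: only the most recent sample is compared with the threshold.
    if compare(samples[-1], rule.threshold):
        return "threshold_alarm"
    return "normal"


rule = ThresholdRule(metric="cpu_usage", op=">", threshold=80.0)
print(evaluate(rule, [42.0, 91.5]))  # threshold_alarm
print(evaluate(rule, []))            # insufficient_data
```

A real rule would typically also specify a statistical period and how many consecutive periods must breach the threshold. Those settings are omitted here to keep the three outcomes (normal, threshold alarm, insufficient data) easy to see.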

Monitoring

Functions such as application monitoring, component monitoring, host monitoring, container monitoring, and metric monitoring are provided.

  • Application monitoring

    An application is a group of identical or similar components divided based on service requirements. AOM supports monitoring by application.

  • Component monitoring

    Components refer to the services that you deploy, including containers and common processes.

    The Component Monitoring page displays information such as type, CPU usage, memory usage, and status of each component. AOM supports drill-down from components to instances, and then to containers, enabling multi-dimensional monitoring.

  • Host monitoring

    The Host Monitoring page enables you to monitor common system devices such as disks and file systems, as well as the resource usage and health status of hosts and of the service processes or instances running on them.

  • Container monitoring

    For container monitoring, only workloads deployed using Cloud Container Engine (CCE) and applications created using ServiceStage are monitored.

  • Metric monitoring

    The Metric Monitoring page displays metric data of each resource. You can monitor metric values and trends in real time, add desired metrics to dashboards, create threshold rules, and export monitoring reports. In this way, you can monitor services and analyze data in real time.

  • Cloud service monitoring

    The Cloud Service Monitoring page displays historical performance curves of each cloud service instance. You can view cloud service data of the last six months.

Log

Functions such as log search, log files, log dumps, and path configuration are provided.

  • Log search

    AOM enables you to quickly query logs and locate faults based on log sources and contexts.

  • Log files

    You can quickly view log files of component instances to locate faults.

  • Log dumps

    AOM enables you to dump logs to Object Storage Service (OBS) buckets for long-term storage.

  • Path configuration

    AOM can collect and display container and VM logs. VM refers to an Elastic Cloud Server (ECS) or a Bare Metal Server (BMS) running Linux. Before collecting logs, ensure that you have configured a log collection path.

  • Log buckets

    A log bucket is a logical group of log files. You can dump log files, create statistical rules, and view logs by log bucket.

  • Statistical rules

    A statistical rule takes effect by log bucket. You can configure keywords in statistical rules. Then, AOM periodically counts the number of such keywords in log buckets and generates log metrics.

  • Log structuring

    Log structuring splits raw logs into fields using regular expressions or special characters so that the structured logs can be queried and analyzed with SQL syntax.

  • Accessing LTS

    By adding access rules, you can map logs of CCE, Cloud Container Instance (CCI), or custom clusters in AOM to Log Tank Service (LTS). Then you can view and analyze the logs on LTS. Mapping itself does not incur extra fees, but duplicate mapping does.
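To make the ideas behind log structuring and statistical rules more concrete, the following Python sketch splits raw log lines into named fields with a regular expression and counts keyword occurrences to produce simple log metrics. It is a minimal sketch under stated assumptions, not AOM's processing logic: the log layout, the `structure` and `count_keywords` functions, and the sample log lines are all made up for illustration, and the SQL query step is not shown.

```python
import re
from collections import Counter
from typing import Dict, Iterable, List, Optional

# Assumed log layout: "<date> <time> <LEVEL> <message>".
LINE_PATTERN = re.compile(
    r"(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"(?P<level>[A-Z]+) "
    r"(?P<message>.*)"
)


def structure(line: str) -> Optional[Dict[str, str]]:
    """Split one raw log line into named fields, or return None if it does not match."""
    match = LINE_PATTERN.fullmatch(line)
    return match.groupdict() if match else None


def count_keywords(lines: Iterable[str], keywords: List[str]) -> Counter:
    """Count how often each keyword occurs across the raw log lines."""
    counts = Counter({keyword: 0 for keyword in keywords})
    for line in lines:
        for keyword in keywords:
            counts[keyword] += line.count(keyword)
    return counts


raw = [
    "2023-11-28 10:15:02 ERROR connection timeout",
    "2023-11-28 10:15:03 INFO retry succeeded",
    "2023-11-28 10:15:09 ERROR disk full",
]
records = [r for r in map(structure, raw) if r is not None]
print(records[0]["level"])  # ERROR
print(count_keywords(raw, ["ERROR", "timeout"]))
```

Once lines are split into fields such as `timestamp`, `level`, and `message`, they can be filtered or aggregated like table rows, which is what makes SQL-style analysis possible. Running the keyword count on a schedule is one way a periodic log metric could be derived.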

Configuration management

Functions such as ICAgent management, application discovery, and log configuration are provided.

  • ICAgent management

    ICAgent collects metrics, logs, and application performance data in real time. For hosts purchased from the Elastic Cloud Server (ECS) or Bare Metal Server (BMS) console, you need to install ICAgent manually. For hosts purchased from the CCE console, ICAgent is installed automatically.

  • Data subscription

    AOM allows you to subscribe to metrics or alarms. After the subscription, data can be forwarded to custom Kafka or Distributed Message Service (DMS) topics for you to retrieve.

  • Application discovery

    AOM can discover applications and collect their metrics based on configured rules.

  • Log configuration

    Log quotas and delimiters can be configured.

  • Quota configuration

    Earlier metrics will be deleted when the metric quota is exceeded.

    You can change the metric quota by switching between the basic edition and pay-per-use edition. In the basic edition, limited functions are provided for free.

  • Metric configuration

    You can enable the metric collection function to collect metrics (excluding SLA and custom metrics).
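The quota behavior described under Quota configuration (earlier metrics are deleted once the quota is exceeded) resembles an oldest-first eviction policy. The Python sketch below illustrates that idea only. The `MetricStore` class, its methods, and the quota value are hypothetical, and how AOM actually counts metrics against the quota is not specified here.

```python
from collections import deque
from typing import List


class MetricStore:
    """Keeps at most `quota` metric entries; the oldest entries are dropped first."""

    def __init__(self, quota: int) -> None:
        # A deque with maxlen discards items from the left when it is full.
        self._entries = deque(maxlen=quota)

    def add(self, name: str, value: float) -> None:
        self._entries.append((name, value))

    def names(self) -> List[str]:
        return [name for name, _ in self._entries]


store = MetricStore(quota=2)  # illustrative quota
for index, name in enumerate(["m1", "m2", "m3"]):
    store.add(name, float(index))
print(store.names())  # ['m2', 'm3']
```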

Process for Using AOM

The following figure shows the process of using AOM.

Figure 1 Process of using AOM

  1. (Mandatory) Subscribe to AOM.
  2. (Optional) Create IAM users and set permissions.
  3. (Mandatory) Purchase a cloud host.
  4. (Mandatory) Install the ICAgent.

    ICAgent is a collector used to collect metric, log, and application performance data in real time.

    If a cloud host is purchased through CCE, ICAgent is automatically installed on it.

  5. (Optional) Configure an application discovery rule.

    Applications that meet built-in application discovery rules are automatically discovered after the ICAgent is installed. For applications that cannot be discovered using built-in rules, customize an application discovery rule.

  6. (Optional) Configure a log collection path.

    To use AOM to monitor host logs, configure a log collection path first.

  7. (Optional) Implement O&M.

    Use AOM functions such as Monitoring Overview, Alarm Management, Resource Monitoring, and Log Management to perform routine O&M.
