Function Overview
-
AOM
-
Application Operations Management (AOM) is a one-stop, multi-dimensional O&M management platform for cloud applications. It integrates observability data sources such as Cloud Eye, Log Tank Service (LTS), Application Performance Management (APM), real user experience, and backend link data, and provides unified application resource management, automated O&M, and one-stop observability analysis. With AOM, you can detect faults promptly, monitor applications, resources, and services in real time, and improve the capability and efficiency of automated O&M.
- Hosting & Running: AOM seamlessly interconnects with multiple upper-layer O&M services. It can quickly collect metric data from services such as ServiceStage, FunctionGraph, and Cloud Service Engine (CSE), and display them in real time.
- Observability Analysis: Provides observability analysis capabilities such as exception detection, historical data analysis, performance analysis, correlation analysis, and scenario-based analysis through transaction, container, and Prometheus monitoring based on the metric system.
- Collection Management: Manages plug-ins centrally and issues instructions for operations such as script delivery and execution.
- Openness: Supports native Prometheus Query Language (PromQL), data reporting through APIs, data viewing through Grafana, and data dumping through Kafka.
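As an illustration of the PromQL support mentioned above, the following minimal sketch builds a range-query URL in the style of the open-source Prometheus HTTP API (`/api/v1/query_range` with `query`, `start`, `end`, and `step` parameters). The base address and the metric name are hypothetical placeholders; the real service address of a Prometheus instance and any required authentication must be taken from the AOM console and API reference.

```python
from urllib.parse import urlencode


def build_range_query_url(base_url, promql, start, end, step):
    """Return a GET URL for a Prometheus-style range query.

    base_url is assumed to be the service address of a Prometheus
    instance; the path and parameter names follow the open-source
    Prometheus HTTP API.
    """
    params = {"query": promql, "start": start, "end": end, "step": step}
    return f"{base_url.rstrip('/')}/api/v1/query_range?{urlencode(params)}"


# Hypothetical address and metric, for illustration only.
url = build_range_query_url(
    "https://aom.example.com/v1/my-project",
    'rate(http_requests_total{job="web"}[5m])',
    start=1700000000,
    end=1700003600,
    step=60,
)
print(url)
```

The sketch only assembles the URL and sends nothing over the network, so it can be adapted to whichever HTTP client and authentication scheme the deployment requires.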
-
-
Access Center
-
AOM monitors metric and log data from multiple dimensions, at different layers, and in multiple scenarios. In the access center, you can quickly connect the metrics and logs that you want to monitor. After the connection is complete, you can view the metrics, logs, and statuses of related resources or applications on the Metric Browsing page.
Regions: EU-Dublin
-
-
Dashboard
-
With a dashboard, different graphs (such as line graphs and digit graphs) are displayed on the same screen, so you can view metric data or log data comprehensively. You can add key resource metrics to a dashboard and monitor them in real time. You can also compare the same metric of different resources on one screen. In addition, you can add routine O&M metrics to a dashboard so that you can perform routine checks without re-selecting metrics when you open AOM again.
Regions: EU-Dublin
-
-
Alarm Management
-
Alarm management allows you to query alarms so that you can quickly detect, locate, and rectify faults. AOM provides both alarms and events. By customizing notification actions, you can receive alarm information by email or Short Message Service (SMS) message, so that you can detect and handle exceptions as early as possible.
-
-
Metric Browsing
-
The Metric Browsing page displays metric data of each resource. You can monitor metric values and trends in real time, and create alarm rules for real-time service data monitoring and analysis.
Regions: EU-Dublin
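To make the idea of an alarm rule on a metric more concrete, the sketch below checks whether a series of sampled values stays above a threshold for a given number of consecutive periods. This is a simplified illustration and not AOM's actual rule engine; the function name, the "greater than" comparison, and the sample values are all assumptions.

```python
def breaches_threshold(values, threshold, consecutive):
    """Return True if `values` exceeds `threshold` for at least
    `consecutive` samples in a row (hypothetical rule semantics)."""
    streak = 0
    for value in values:
        # Extend the streak on a breach; reset it otherwise.
        streak = streak + 1 if value > threshold else 0
        if streak >= consecutive:
            return True
    return False


# Hypothetical CPU usage samples (percent), one per statistical period.
cpu_usage = [70, 85, 90, 95]
print(breaches_threshold(cpu_usage, threshold=80, consecutive=3))
```

Requiring several consecutive breaches rather than a single one is a common way to avoid alarms on short spikes.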
-
-
Log Analysis
-
AOM provides powerful log management capabilities. It collects logs from Linux ECSs and bare metal servers and displays them on the AOM page for search. Log search helps you quickly find the logs you need among a large number of logs. After configuring a VM log collection path, you can collect customized log files and display them on the AOM page for search. Log dump provides long-term storage. Log streams help you quickly query the logs you need and locate faults by analyzing log source information and raw context data.
-
-
Prometheus Monitoring
-
Prometheus monitoring fully interconnects with the open-source Prometheus ecosystem. It monitors various components, and provides multiple out-of-the-box dashboards and fully hosted Prometheus services.
Regions: EU-Dublin
Creating Prometheus Instances
Managing Prometheus Instances
Configuring a Recording Rule
Metric Management
Dashboard Monitoring
Access Guide
Obtaining the Service Address of a Prometheus Instance
Viewing Prometheus Instance Data Through Grafana
Reading Prometheus Instance Data Through Remote Read
Reporting Self-Built Prometheus Instance Data to AOM
Resource Usage Statistics
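Because Prometheus monitoring interconnects with the open-source Prometheus ecosystem, query results can typically be processed with ordinary Prometheus tooling. The sketch below parses an instant-query response in the JSON layout used by the open-source Prometheus HTTP API. The sample payload is invented for illustration, and it is an assumption that a given AOM Prometheus instance returns exactly this layout.

```python
import json


def latest_values(response_text):
    """Map each series' label set to its sampled value.

    Expects the open-source Prometheus instant-query layout:
    {"status": "success", "data": {"resultType": "vector", "result": [...]}}
    """
    body = json.loads(response_text)
    if body.get("status") != "success":
        raise ValueError(f"query failed: {body.get('error', 'unknown error')}")
    data = body["data"]
    if data["resultType"] != "vector":
        raise ValueError(f"unexpected result type: {data['resultType']}")
    values = {}
    for series in data["result"]:
        # Each sample is a [timestamp, "value-as-string"] pair.
        _timestamp, raw_value = series["value"]
        key = tuple(sorted(series["metric"].items()))
        values[key] = float(raw_value)
    return values


# Invented sample payload, for illustration only.
sample = json.dumps({
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {"metric": {"__name__": "up", "job": "node"}, "value": [1700000000, "1"]},
            {"metric": {"__name__": "up", "job": "mysql"}, "value": [1700000000, "0"]},
        ],
    },
})
result = latest_values(sample)
print(result)
```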
-
-
Business Monitoring (BETA)
-
You can create log metric rules to extract ELB log data reported to LTS as metrics and monitor them on the metric browsing and dashboard pages.
Regions: EU-Dublin
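The idea behind a log metric rule, turning log lines into a numeric metric, can be sketched as follows. The log format, the field name `status`, and the grouping into status classes are hypothetical; real ELB logs in LTS have their own structure, and the actual rules are configured on the AOM console rather than written as code.

```python
import re
from collections import Counter

# Hypothetical log format: a key=value field carrying the HTTP status code.
STATUS_PATTERN = re.compile(r"status=(\d{3})")


def count_by_status_class(lines):
    """Count log lines per HTTP status class (2xx, 4xx, 5xx, ...)."""
    counts = Counter()
    for line in lines:
        match = STATUS_PATTERN.search(line)
        if match:  # Lines without a status field are skipped.
            counts[match.group(1)[0] + "xx"] += 1
    return counts


# Invented sample lines, for illustration only.
sample_logs = [
    "2024-01-01T00:00:01Z elb-demo GET /index.html status=200 latency_ms=12",
    "2024-01-01T00:00:02Z elb-demo GET /api/items status=502 latency_ms=3001",
    "2024-01-01T00:00:03Z elb-demo GET /health status=200 latency_ms=2",
    "malformed line without a status field",
]
counts = count_by_status_class(sample_logs)
print(dict(counts))
```

A count such as the number of 5xx responses per period is the kind of value that could then be charted on a dashboard or used in an alarm rule.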
-
-
Infrastructure Monitoring
-
AOM provides infrastructure monitoring for workloads, clusters, hosts, processes, and cloud services. Workload monitoring shows the resource usage, status, and alarms of workloads in a timely manner. Cluster monitoring lets you track basic monitoring metrics and the related alarms and events of a cluster in real time. Host monitoring displays resource usage, trends, and alarms. With process monitoring, you can configure rules to discover applications deployed on your hosts and collect their associated metrics, so that you can monitor applications and components.
Regions: EU-Dublin
-
-
Settings
-
AOM provides the Service Authorization, Authentication, Global Settings, Collection Settings, Log Settings, and Menu Settings functions. Service Authorization lets you grant permissions to access multiple cloud services in one click. Authentication lets you create an access code and configure API calling permissions for the current user. Global Settings lets you enable or disable metric collection and control whether TMS tags are displayed in alarm message content. On the Log Settings page, you can set quotas, configure delimiters, and enable or disable ICAgent collection. Menu Settings lets you show or hide functions such as Overview and Application Insight in the navigation pane of the console. On the Collection Settings page, you can install and manage UniAgent, centrally manage the ICAgent plug-ins in CCE clusters, manage host groups and proxy areas, and view the operation logs of UniAgent and the ICAgent plug-ins.
-