Updated on 2025-03-03 GMT+08:00

Prometheus Monitoring Overview

Prometheus monitoring fully interconnects with the open-source Prometheus ecosystem. It monitors various components, and provides multiple out-of-the-box dashboards and fully hosted Prometheus services.

Prometheus is an open-source monitoring and alarm system. It features a multi-dimensional data model, flexible querying with PromQL, and visualized data display. For more information, see the official Prometheus documentation.

Prometheus instances are logical units used to manage Prometheus data collection, storage, and analysis. Table 1 lists different types of instances classified based on monitored objects and application scenarios.

Table 1 Prometheus instance description

Prometheus Instance Type

Monitored Object

Monitoring Capability

Application Scenario

Default Prometheus instance

  • Metrics reported using the API for adding monitoring data
  • Cloud service metrics reported through the APIs of services such as IoT Device Access (IoTDA), ModelArts, Intelligent EdgeFabric (IEF), and Cloud Container Instance (CCI)
  • Metrics reported using ICAgents

Monitors the metrics reported to AOM using APIs or ICAgents.

Applicable when you use remote write from a self-built Prometheus server for remote storage, and when you ingest container, cloud service, or host metrics.

Prometheus instance for CCE

CCE

  • Provides native container service integration and container metric monitoring capabilities.
  • By default, the following service discovery capabilities are enabled: Kubernetes SD, ServiceMonitor, and PodMonitor.

Applicable when you need to monitor CCE clusters and applications running on them.

Prometheus instance for ECS

ECS

Provides integrated monitoring for ECS applications and components (such as databases and middleware) in a Virtual Private Cloud (VPC) using the UniAgent (Exporter) installed in this VPC.

Applicable when you need to monitor application components running in a VPC (usually an ECS cluster) on the cloud. Through Access Center, you can add Prometheus middleware and custom plug-ins for monitoring.

Prometheus instance for cloud services

Multiple cloud services

Monitors multiple cloud services. Only one Prometheus instance for cloud services can be created in an enterprise project.

Applicable when you need to centrally collect, store, and display monitoring data of cloud services.

Common Prometheus instance

Self-built Prometheus

  • Provides remote storage for Prometheus time series databases.
  • Provides a self-developed monitoring dashboard to display data.

    You maintain the self-built Prometheus servers and configure metric management and data collection yourself.

Applicable when you have your own Prometheus servers but need to ensure data storage availability and scalability through remote write.

Prometheus instance for multi-account aggregation

CCE, ECS, and other cloud service resources of multiple accounts in the same organization

Aggregates the data of CCE, ECS, and other cloud service resources of multiple accounts in the same organization for monitoring and maintenance.

The following metrics can be ingested through this Prometheus instance:

Applicable when you need to centrally monitor the CCE, ECS, and other cloud service resources of multiple accounts in the same organization.

Prometheus for APM

APM traces

Integrates APM's application monitoring capabilities to monitor traces for Java, Go, Python, Node.js, PHP, .NET, and C++ applications.

Applicable when you have enabled APM and need to monitor application traces.

Functions

AOM Prometheus monitoring supports monitoring data collection, storage, computing, display, and alarm reporting. It monitors metrics of containers, cloud services, middleware, databases, applications, and services. The following lists the functions supported by AOM Prometheus monitoring.

Table 2 Monitored object access

Function

Description

Managing Prometheus Instances

AOM supports multiple types of Prometheus instances. You can create Prometheus instances as required.

Connecting a CCE Cluster

AOM supports the Prometheus cloud-native monitoring plug-in. You can install the plug-in for CCE clusters through Integration Center to report metrics to the Prometheus instance for CCE.

Only Prometheus instances for CCE support this function.

Connecting Middleware to AOM

AOM supports the Prometheus middleware plug-in. You can install the middleware Exporter for VMs through Access Center to report metrics to the Prometheus instance for ECS.

Only Prometheus instances for ECS support this function.

Connecting Cloud Services to AOM

You can connect cloud services to AOM through Cloud Service Connection to report metrics to the Prometheus instance for cloud services.

Only Prometheus instances for cloud services support this function.

Configuring Multi-Account Aggregation for Unified Monitoring

You can connect multiple member accounts within the same organization through Account Access to monitor metrics. Through data multi-write, cross-VPC access can be achieved without exposing the network information about servers.

Table 3 Monitoring metric collection

Function

Description

Managing Prometheus Instance Metrics

You can check, add, and discard metrics.

Only default and common Prometheus instances and Prometheus instances for CCE, cloud services, and ECS support this function.

Table 4 Data processing

Function

Description

Configuring the Remote Read Address to Enable Self-built Prometheus to Read Data from AOM

With the remote read and remote write addresses, a self-built Prometheus server can store its monitoring data in an AOM Prometheus instance (remote storage) and read that data back from AOM.

Configuring Recording Rules to Improve Metric Query Efficiency

Recording rules move computation to the write side, reducing resource usage on the query side. In large-scale clusters and complex service scenarios in particular, recording rules reduce PromQL complexity, which improves query performance and prevents slow configuration and queries for users.

Only Prometheus instances for CCE support this function.

Configuring Data Multi-Write to Dump Metrics to Self-Built Prometheus Instances

Cross-VPC access is enabled through data multi-write.
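
The effect of a recording rule described in Table 4 can be illustrated with a minimal Python sketch. It is not AOM's implementation: the metric names, the `namespace` tag, and the helpers `apply_recording_rule` and `query` are hypothetical and exist only for illustration. The aggregation runs once when the rule is evaluated, so a later query reads the precomputed series instead of re-aggregating raw data.

```python
from collections import defaultdict

# Hypothetical raw samples: (metric name, tags, value).
RAW_SAMPLES = [
    ("http_requests_total", {"namespace": "shop", "pod": "web-1"}, 120.0),
    ("http_requests_total", {"namespace": "shop", "pod": "web-2"}, 80.0),
    ("http_requests_total", {"namespace": "pay", "pod": "api-1"}, 45.0),
]


def apply_recording_rule(samples, source_metric, group_by, new_metric):
    """Sum `source_metric` by one tag and return the result as new series.

    Roughly mirrors a rule whose expression is
    `sum by (<group_by>) (<source_metric>)`.
    """
    totals = defaultdict(float)
    for name, tags, value in samples:
        if name == source_metric and group_by in tags:
            totals[tags[group_by]] += value
    return [(new_metric, {group_by: key}, total) for key, total in sorted(totals.items())]


# Evaluated once on the write side...
PRECOMPUTED = apply_recording_rule(
    RAW_SAMPLES, "http_requests_total", "namespace", "namespace:http_requests:sum"
)


def query(series, metric, tags):
    """Return the value of the first series matching the metric name and tags."""
    for name, series_tags, value in series:
        if name == metric and series_tags == tags:
            return value
    return None


# ...so a later query is a simple lookup rather than an aggregation.
print(query(PRECOMPUTED, "namespace:http_requests:sum", {"namespace": "shop"}))
```

The trade-off is extra stored series in exchange for cheaper queries, which tends to pay off when the same aggregation is queried repeatedly (for example, by dashboards).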

Advantages

Table 5 Advantages

Out-of-the-box usability

  • Installs and deploys Kubernetes and cloud products in a few clicks.
  • Connects to various application components and alarm tools in a few clicks.

Low cost

  • Multiple metrics, including those of standard Kubernetes components, are free of charge.
  • Provides fully hosted services and eliminates the need to purchase additional resources, reducing monitoring costs and generating almost zero maintenance costs.
  • Integrates with CCE for monitoring services, reducing the time for creating a container monitoring system from 2 days to 10 minutes. A Prometheus instance for CCE can report the data of multiple CCE clusters.

Open-source compatibility

  • Supports custom multi-dimensional data models, HTTP API modules, and PromQL query.
  • Monitored objects can be discovered through static file configuration and dynamic discovery, facilitating migration and access.

Unlimited data

  • Supports cloud storage. There is no limit on the data to store. Distributed storage on the cloud ensures data reliability.
  • Supports the Prometheus instance for multi-account aggregation. Therefore, metric data of multiple accounts can be aggregated for unified monitoring.

High performance

  • Is more lightweight and consumes fewer resources than open-source products. Uses single-process integrated Agents to monitor Kubernetes clusters, improving collection performance by 20 times.
  • Deploys Agents on the user side to retain the native collection capability and minimize resource usage.
  • Uses the collection-storage-separated architecture to improve the overall performance.
  • Optimizes the collection component to improve the single-replica collection capability and reduce resource consumption.
  • Balances collection tasks through multi-replica horizontal expansion to implement dynamic scaling and solve open-source horizontal expansion problems.

High availability

  • Dual-replica: Data collection, processing, and storage components support multi-replica horizontal expansion, ensuring the high availability of core data links.
  • Horizontal expansion: Elastic scaling can be performed based on the cluster scale.

Basic Concepts

The following lists the basic concepts about Prometheus monitoring.

Table 6 Basic concepts

Item

Description

Exporter

Collects monitoring data and exposes it in a standardized format to external systems through the Prometheus monitoring function. Hundreds of official or third-party Exporters are available. For details, see Exporters.

Target

Object from which a Prometheus probe collects (scrapes) metrics. A target either exposes its own operation and service metrics or serves as a proxy to expose the operation and service metrics of a monitored object.

Job

Configuration set for a group of targets. A job specifies the collection (scrape) interval, access limit, and other behavior for its group of targets.

Prometheus monitoring

Prometheus monitoring fully interconnects with the open-source Prometheus ecosystem. It monitors various components, and provides multiple out-of-the-box dashboards and fully hosted Prometheus services.

Prometheus instances

Logical units used to collect, store, and analyze Prometheus data.

Prometheus probes

Deployed in the Kubernetes clusters on the user or cloud product side. Prometheus probes automatically discover targets, collect metrics, and remotely write data to databases.

PromQL

Prometheus query language. Supports both range queries (over a specified time span) and instant queries, and provides multiple built-in functions and operators. Raw data can be aggregated, sliced, predicted, and combined.

Sample

Value at a point in time in a time series. For Prometheus monitoring, each sample consists of a value of the float64 data type and a timestamp with millisecond precision.

Alarm rules

Alarm configuration for Prometheus monitoring. An alarm rule can be specified using PromQL.

Tags

Key-value pairs that describe a metric.

Metric management

Automatically discovers collection targets without static configuration (service discovery). Supports multiple discovery modes (such as Kubernetes SD, Consul, and Eureka) and exposes collection targets through ServiceMonitor or PodMonitor.

Recording rules

Prometheus monitoring's recording rule capability. You can use PromQL to process raw data into new metrics to improve query efficiency.

Time series

Identified by a metric name and a set of tags. A time series is a stream of timestamped values that belong to the same metric and the same set of tagged dimensions.

Remote storage

Self-developed time series data storage component. It supports the Prometheus remote write protocol and is fully hosted by cloud products.

Cloud product monitoring

Seamlessly integrates monitoring data of multiple cloud products. To monitor cloud products, connect them first.

Metrics

Labeled data exposed by targets, which can fully reflect the operation or service status of monitored objects. Prometheus monitoring uses the standard data format of OpenMetrics to describe metrics.
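
The data-model terms in Table 6 (sample, tags, time series, and metrics) can be tied together with a minimal Python sketch. The class and function names are hypothetical, and the rendered line uses only the simple `name{tag="value"} value` form of the Prometheus text exposition format, without HELP/TYPE metadata or timestamps.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    """One point in a time series: a float64 value and a millisecond timestamp."""
    value: float
    timestamp_ms: int


def series_key(metric_name, tags):
    """A time series is identified by its metric name plus its full tag set."""
    return (metric_name, tuple(sorted(tags.items())))


def escape_tag_value(value):
    """Escape backslash, double quote, and newline in a tag value."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render_line(metric_name, tags, sample):
    """Render one sample of a series as a text exposition line (no timestamp)."""
    if not tags:
        return f"{metric_name} {sample.value}"
    body = ",".join(f'{k}="{escape_tag_value(v)}"' for k, v in sorted(tags.items()))
    return f"{metric_name}{{{body}}} {sample.value}"


# Tag order does not matter: both dicts name the same time series.
a = series_key("http_requests_total", {"method": "GET", "code": "200"})
b = series_key("http_requests_total", {"code": "200", "method": "GET"})
print(a == b)

sample = Sample(value=1027.0, timestamp_ms=1_700_000_000_000)
print(render_line("http_requests_total", {"method": "GET", "code": "200"}, sample))
```

Changing any tag value produces a different key and therefore a different time series, which is why tags with many distinct values increase the number of stored series.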