Updated on 2025-08-15 GMT+08:00

Building a Comprehensive Metric System

This section describes how to build a metric system and a dashboard for all-round, multi-dimensional, and visualized monitoring of resources and applications.

Scenario

In the Internet era, user experience is the top priority. Page response speed, access latency, and access success rate directly affect user experience. If these metrics cannot be obtained in a timely manner, problems go unnoticed and users may be lost. In this example, the O&M personnel of an online shopping mall use open-source software to collect metrics, but the metrics are scattered and cannot be displayed centrally.

Solution

AOM implements one-stop, multi-dimensional O&M for cloud applications. In the access center, you can ingest metrics of businesses, applications, and Prometheus middleware. You can also customize dashboards for monitoring and set alarm rules through a unified entry to implement routine inspection and ensure that services run normally.

Table 1 Metric system supported by AOM

| Type | Source | Example | How to Ingest |
| --- | --- | --- | --- |
| Business metrics | Device log SDKs and extracted ELB logs | UV, PV, latency, access failure rate, and access traffic | Ingest Business Metrics |
| Business metrics | Transaction monitoring or reported custom metrics | URL calls, maximum concurrency, and maximum response time | Ingest Business Metrics |
| Application metrics | Component performance graphs or API performance data | URL calls, average latency, error calls, and throughput | Ingest Application Metrics |
| Middleware metrics | Native or cloud middleware data | File system capacity and file system usage | Ingest Middleware Metrics |
| Other layer metrics | Generally container or cloud service data, such as compute, storage, network, and database data | CPU usage, memory usage, and health status | Ingest Metrics at Other Layers (Example: container metrics and cloud service metrics) |

Prerequisites

Step 1: Build a Metric System

  1. Ingest business metrics.

    1. Log in to the AOM 2.0 console.
    2. In the navigation pane, choose Access Center > Access Center.

      If you want to switch from the new Access Center to the old one, click Back to Old Version in the upper right corner.

    3. In the Business panel on the right, click a target card.
      • Ingesting ELB log metrics
        1. Log metrics can be automatically ingested.
        2. Choose Dashboard > Dashboard in the navigation pane, select the created dashboard, and click the icon in the upper right corner of the page. On the Log Sources tab, enter an SQL statement to query the log metrics. For example, to check traffic metrics, enter the corresponding SQL statement and click Search.
      • Ingesting APM transaction metrics
        1. Install an APM probe for the workload. For details, see Installing an APM Probe.
        2. After the installation is complete, log in to the console of the service where the probe is installed and trigger the collection of APM transaction metrics. In the example of an online shopping mall, you can add a product to the shopping cart to trigger the collection.
        3. Log in to the AOM 2.0 console.
        4. In the navigation pane, choose Metric Browsing. In the right pane, select the ingested APM metrics to view.

  2. Ingest application metrics.

    1. To install an APM probe for a workload, perform the following steps:
      1. Log in to the CCE console and click a target cluster.
      2. Choose Workloads in the navigation pane, and select the type of workload whose metrics are to be reported to AOM.
      3. Click a target workload. On the APM Settings tab page, click Edit in the lower right corner.
      4. Select the APM 2.0 probe, set Probe Version to latest-x86, set APM Environment to phoenixenv1, and select the created application phoenixapp1 from the APM App drop-down list.
      5. Click Save.
    2. After the installation is complete, log in to the console of the service where the probe is installed and trigger the collection of application metrics. In the example of an online shopping mall, you can add a product to the shopping cart to trigger the collection.
    3. Log in to the AOM 2.0 console.
    4. In the navigation pane, choose Metric Browsing. In the right pane, select the ingested application metrics to view.

  3. Ingest middleware metrics.

    1. Upload the Exporter package to the ECS and start it.
      1. Download the mysqld_exporter-0.14.0.linux-amd64.tar.gz package from https://prometheus.io/download/.
      2. Log in to the ECS as the root user, upload the Exporter software package to the ECS, and decompress it.
      3. Log in to the RDS console. On the Instances page, click an RDS DB instance name in the instance list. On the basic information page, view the RDS security group.
      4. Check whether port 3306 is enabled in the RDS security group.
        Figure 1 Checking whether the RDS port is enabled
      5. Go to the decompressed folder and configure the mysql.cnf file on the ECS:
        cd mysqld_exporter-0.14.0.linux-amd64 
        vi mysql.cnf

        For example, add the following content to the mysql.cnf file. Lines starting with # are comments that describe the setting below them:

        [client]
        # RDS username
        user=root
        # RDS password
        password=****
        # RDS public IP address
        host=192.168.0.198
        # RDS port
        port=3306

      6. Run the following command to start the mysqld_exporter tool:
        nohup ./mysqld_exporter --config.my-cnf="mysql.cnf" --collect.global_status --collect.global_variables &
      7. Run the following command to check whether the tool is started properly:
        curl http://127.0.0.1:9104/metrics

        If the command output shown in Figure 2 is displayed, the tool is started properly.

        Figure 2 Checking metrics
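Before starting mysqld_exporter, it can help to confirm that the mysql.cnf file from the step above is well formed. The following is a minimal sketch in Python, not part of AOM or the exporter. The function name `check_client_section` and the list of required keys are assumptions that simply mirror the example file. In this sketch, a note left after a value on the same line is read as part of the value, so a port written with a trailing remark is reported as invalid.

```python
import configparser

# Assumption: these are the keys used in the example mysql.cnf in this section.
REQUIRED_KEYS = ("user", "password", "host", "port")


def check_client_section(cnf_text):
    """Return a list of problems found in the [client] section (empty if none)."""
    parser = configparser.ConfigParser(interpolation=None)
    try:
        parser.read_string(cnf_text)
    except configparser.Error as exc:
        return [f"cannot parse file: {exc.__class__.__name__}"]
    if not parser.has_section("client"):
        return ["missing [client] section"]

    problems = []
    for key in REQUIRED_KEYS:
        if not parser.get("client", key, fallback="").strip():
            problems.append(f"missing value for {key}")

    port = parser.get("client", "port", fallback="").strip()
    if port and not port.isdigit():
        problems.append(f"port is not a number: {port!r}")
    return problems


# Illustrative content only; replace the values with your own RDS settings.
sample = """\
[client]
# RDS username
user=root
password=example-password
host=192.168.0.198
port=3306
"""

print(check_client_section(sample))
```

To check a real file, read it first, for example `check_client_section(open("mysql.cnf").read())`, and start the exporter only when the returned list is empty.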
    2. Ingest middleware metrics using VM access mode.
      1. Log in to the AOM 2.0 console.
      2. In the navigation pane, choose Global Settings. On the displayed page, choose UniAgents.
      3. On the UniAgents page, install the UniAgent for the ECS. For details, see Manual Installation.

        To switch from the new UniAgent management page to the old one, click Back to Old Version.

      4. In the navigation pane, choose Access Center > Access Center. In the Prometheus Middleware panel on the right, click a target card.
      5. In the dialog box that is displayed, configure a collection task and install Exporter. For details, see Exporter Access in the VM Scenario.
      6. Click Create.
    3. After the ingestion is complete, choose Metric Browsing in the navigation pane on the left. In the right pane, view the ingested middleware metrics.
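To know which metric names to look for in Metric Browsing, the output of the earlier `curl http://127.0.0.1:9104/metrics` check can also be read programmatically. The sketch below is illustrative only: the helper names are hypothetical, the sample text is made up rather than captured from a real exporter, and the parser assumes sample lines carry no trailing timestamp. It treats `mysql_up` equal to 1 as the sign that the exporter can reach the database.

```python
import urllib.request


def fetch_metrics(url="http://127.0.0.1:9104/metrics", timeout=5):
    """Download the exporter's metrics page as text (same endpoint as the curl check)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


def parse_metrics(text):
    """Map each sample line 'name{labels} value' to a float, skipping comments."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # blank line or HELP/TYPE comment
        # Assumption: no timestamp follows the value, so the value is the last field.
        name, _, value = line.rpartition(" ")
        try:
            samples[name] = float(value)
        except ValueError:
            continue  # not a numeric sample; ignore it
    return samples


def exporter_is_healthy(samples):
    """True when the exporter reports that the MySQL server is reachable."""
    return samples.get("mysql_up") == 1.0


# Illustrative sample, not real exporter output.
sample_text = """\
# HELP mysql_up Whether the MySQL server is up.
# TYPE mysql_up gauge
mysql_up 1
# TYPE mysql_global_status_threads_connected untyped
mysql_global_status_threads_connected 5
"""

samples = parse_metrics(sample_text)
print(sorted(samples))
print(exporter_is_healthy(samples))
```

On the ECS, `parse_metrics(fetch_metrics())` would run the same check against the live exporter, assuming it listens on the default port 9104 used in this section.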

  4. Ingest metrics at other layers. The following shows how to ingest container metrics and cloud service metrics. For how to ingest other types of metrics, see Connecting to AOM.

    1. Log in to the AOM 2.0 console.
    2. In the navigation pane, choose Access Center > Access Center.
    3. In the Prometheus Running Environments or Prometheus Cloud Services panel, click a target card.
      • Select a container metric card:

        For example, select the CCE card. The ICAgent is installed by default after you purchase a CCE cluster.

      • Select a cloud service metric card:
        1. Click a cloud service card. In the dialog box that is displayed, select the cloud service to monitor, for example, RDS or DCS.
        2. Select an enterprise project and a Prometheus instance for cloud services. By default, the Prometheus instance for cloud services under your specified enterprise project is selected. It is grayed out and cannot be changed here.
        3. Click Connect Now.
    4. After the connection is complete, choose Metric Browsing in the navigation pane on the left. In the right pane, select the ingested metrics to view.

Step 2: Add a Dashboard for Unified Monitoring

  1. Create a metric alarm rule.

    You can set threshold conditions in metric alarm rules for resource metrics. If a metric value meets the threshold condition, a threshold alarm will be generated. If no metric data is reported, an insufficient data event will be generated.

    Metric alarm rules can be created in the following modes: Select from all metrics and PromQL. The following uses Select from all metrics as an example.

    1. Log in to the AOM 2.0 console.
    2. In the navigation pane, choose Alarm Center > Alarm Rules.
    3. On the Prometheus Monitoring tab page, click Create Alarm Rule.
    4. Set the basic information about the alarm rule, such as the rule name.
    5. Set parameters about the alarm rule. Set Rule Type to Metric alarm rule and Configuration Mode to Select from all metrics, and select a Prometheus instance from the drop-down list.
    6. Set alarm rule details.

      You need to set information such as the statistical period, condition, detection rule, trigger condition, and alarm severity. The detection rule consists of the statistical mode (Avg, Min, Max, Sum, and Samples), determination criterion (≥, ≤, >, and <), and threshold value. For example, if Statistical Period is 1 minute, Rule is Avg >1, Consecutive Periods is 3, and Alarm Severity is Critical, a critical alarm will be generated when the average metric value is greater than 1 for three consecutive periods.

    7. Under Advanced Settings, set information such as Check Interval and Alarm Clearance. In this example, retain the default settings.
    8. Set an alarm notification policy. For details, see Table 2.
      Figure 3 Alarm notification
      Table 2 Alarm notification policy parameters

      | Parameter | Description | Example Value |
      | --- | --- | --- |
      | Notify When | Set the scenario for sending alarm notifications. By default, Alarm triggered and Alarm cleared are selected. Alarm triggered: If the alarm trigger condition is met, the system sends an alarm notification to the specified personnel by email or SMS. Alarm cleared: If the alarm clearance condition is met, the system sends an alarm notification to the specified personnel by email or SMS. | Retain the default value. |
      | Alarm Mode | Direct alarm reporting: An alarm is directly sent when the alarm condition is met. If you select this mode, set an interval for notification and specify whether to enable a notification rule. Frequency: interval for sending alarm notifications. Select a desired value from the drop-down list. Notification Rule: After the rule is enabled, the system sends notifications based on the associated SMN topic and message template. If there is no notification rule you want to select, click Add Rule in the drop-down list to create one. For details, see Creating an Alarm Notification Rule. | Alarm Mode: Select Direct alarm reporting. Frequency: Select Once. Notification Rule: aomtest |

    9. Click Confirm. Then click View Rule to view the created rule.

      Click a rule name to view details. If a monitored object meets the configured alarm condition, a metric alarm is generated on the alarm list page. To view the alarm, choose Alarm Center > Alarm List in the navigation pane. If a host meets the preset notification policy, the system sends an alarm notification to the specified personnel by email, SMS, or WeCom.
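The detection rule example in the alarm rule details (Avg >1 for three consecutive 1-minute periods) can be reasoned about with a small sketch. This is only an illustration of the arithmetic, not how AOM evaluates rules internally. The function name `avg_rule_fires` is hypothetical, and treating a period without data as a break in the streak is an assumption made here (AOM itself reports an insufficient data event in that case).

```python
from statistics import mean


def avg_rule_fires(samples_per_period, threshold=1.0, consecutive=3):
    """Return True if the per-period average exceeds the threshold
    for the given number of consecutive periods."""
    streak = 0
    for period in samples_per_period:
        # Assumption: a period with no samples resets the streak.
        if period and mean(period) > threshold:
            streak += 1
            if streak >= consecutive:
                return True
        else:
            streak = 0
    return False


# Each inner list holds the metric values collected in one statistical period.
steady_breach = [[0.5, 0.7], [1.2, 1.4], [1.1, 1.3], [2.0, 1.5]]
interrupted = [[1.2], [0.9], [1.5], [1.6]]

print(avg_rule_fires(steady_breach))
print(avg_rule_fires(interrupted))
```

In the first series the averages of the last three periods are all above 1, so the rule fires. In the second series the dip to 0.9 resets the count, so only two consecutive periods exceed the threshold and no alarm is generated.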

  2. Create a dashboard.

    1. Create a dashboard.
      1. Log in to the AOM 2.0 console.
      2. In the navigation pane, choose Dashboard > Dashboard.
      3. Click Add Dashboard in the upper left corner of the list.
      4. In the displayed dialog box, set parameters.
        Bind the dashboard to the created application so that you can monitor key metrics of the application on the Application Monitoring page.
        Figure 4 Creating a dashboard
      5. Click OK.
    2. Add a graph to the dashboard.
      1. In the dashboard list, click the created dashboard.
      2. On the target dashboard page, click the icon in the upper right corner to add a graph to the dashboard. Select a graph type as required.
        Table 3 Adding a graph

        | Graph Type | Data Source | Scenario |
        | --- | --- | --- |
        | Metric graph | Metric data | Monitors the metrics about the business layer, application layer, and Prometheus middleware. |
        | Log graph | Log data | Monitors business metrics or other log metrics, such as key metrics (latency, throughput, and errors) cleaned based on ELB logs. |

        The following describes how to add a metric graph for CPU usage and a log graph for latency.

        • Add a metric graph for CPU usage.

          Select the CPU Usage metric. After the setting is complete, the metric graph shown in Figure 5 is displayed.

          Figure 5 Adding a metric graph
        • Add a log graph for latency. Click the Log Sources tab and set parameters to add a log graph.
          You can directly obtain the SQL query statement from the graph.
          1. In the upper right corner of the graph display area, click Show Chart.
          2. In the Charts list, select required log metrics to monitor.
          3. The query statement corresponding to the metric is automatically filled in the SQL statement setting area.

          After setting the parameters, click Add to Dashboard.

      3. Repeat the preceding operations to add more graphs to the dashboard. Then click the save icon to save the dashboard.