Updated on 2025-05-22 GMT+08:00

OPS06-02 Defining Observable Objects

  • Risk level

    High

  • Key strategies

    The following table shows how to classify observable objects.

    Observability Layer

    Function/Major Metric

    IT resource monitoring

    IT resource monitoring tracks and reports the performance and capacity of IT resources so that your services run stably and reliably.

    Application monitoring

    Application monitoring tracks resources across different layers (applications, service components, and environments) based on application and resource management. Each layer has its own set of monitored metrics. Monitor alarms at the business, application, middleware, and infrastructure layers, and bind dashboards to display system charts, metric sources, and log sources. Key metrics include available memory, the number of WAITING threads, and the number of TIMED_WAITING threads.

    Process monitoring

    Process monitoring tracks the active processes on a host. By default, it collects information such as the CPU usage, memory usage, and number of open files of these processes. If you have customized process monitoring, the number of processes that contain the specified keywords is also monitored. Key metrics include the number of running processes, idle processes, and zombie processes.

    Log monitoring

    The log configuration service extracts specified keywords from logs so that the monitoring service can monitor key metrics in logs and report alarms for them. Key metrics include log size, the number of access logs, and the number of error logs.

    Custom monitoring

    The custom monitoring page displays all the metrics that you define. You can call APIs to report the monitoring data collected for those metrics to the monitoring service.

    Middleware monitoring

    The monitoring platform allows you to quickly install and configure middleware plug-ins and offers ready-to-use dashboards for monitoring. Currently, the following middleware plug-ins are supported:

    MySQL, Redis, MongoDB, Nginx, Node, HAProxy, COMP_EXPORTER, COMP_REDIS_EXPORTER, and COMP_MYSQL_EXPORTER

    Server monitoring

    Server monitoring provides basic monitoring and OS monitoring at different granularities. Basic monitoring covers the metrics reported by ECSs. OS monitoring provides proactive, fine-grained OS monitoring for ECSs and requires the agent (a plug-in) to be installed on every ECS to be monitored. Key metrics include CPU_UTIL, DISK_READ_BYTES_RATE, and outband incoming rate.

    On-premises component monitoring

    On-premises component monitoring provides unified monitoring of on-premises components. The monitoring platform interconnects with Grafana and Prometheus to monitor services, applications, on-premises IDCs, and middleware.

    Network performance management monitoring

    Network performance management monitoring covers the full-link network that connects clients, networks, edges, and clouds. This helps you identify network faults quickly and keep track of the network's status. Key metrics include application response time, DNS resolution time, TCP connection setup time, and access traffic.
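
The log monitoring and custom monitoring rows can be illustrated together with a minimal sketch: count keyword matches in log lines, then shape the counts as records that could be reported through a metric API. The rule names, regular expressions, record field names (`namespace`, `metric_name`, `value`, `collect_time`), and sample log lines are all assumptions made for illustration. They are not the monitoring service's actual schema, so check the service's API reference before adapting this.

```python
import re

# Hypothetical keyword rules: metric name -> pattern to look for in a log line.
KEYWORD_RULES = {
    "error_log_count": re.compile(r"\bERROR\b"),
    "access_log_count": re.compile(r'"(GET|POST|PUT|DELETE) '),
}


def count_log_metrics(lines, rules=KEYWORD_RULES):
    """Return per-rule match counts plus the total log size in bytes."""
    metrics = {name: 0 for name in rules}
    size = 0
    for line in lines:
        size += len(line.encode("utf-8"))
        for name, pattern in rules.items():
            if pattern.search(line):
                metrics[name] += 1
    metrics["log_size_bytes"] = size
    return metrics


def to_custom_metric_records(metrics, namespace, collect_time_ms):
    """Shape counts as records for a metric-reporting API.

    The field names here are assumptions, not a real service schema.
    """
    return [
        {
            "namespace": namespace,
            "metric_name": name,
            "value": value,
            "collect_time": collect_time_ms,
        }
        for name, value in sorted(metrics.items())
    ]


# Made-up sample input for illustration only.
SAMPLE_LINES = [
    '10.0.0.1 - - "GET /index.html HTTP/1.1" 200 512',
    "2025-05-22 10:00:01 ERROR payment timeout",
    "2025-05-22 10:00:02 INFO retry scheduled",
]

if __name__ == "__main__":
    sample_metrics = count_log_metrics(SAMPLE_LINES)
    for record in to_custom_metric_records(sample_metrics, "demo.logs", 0):
        print(record)
```

Keeping the counting step separate from the record-shaping step means the same counts can be reused if the reporting format changes. The actual upload call is deliberately left out because it depends on the monitoring service in use.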