Updated on 2022-06-01 GMT+08:00

Glossary

Metrics

Metrics reflect resource performance data or status. A metric consists of a namespace, dimension, name, and unit.

A metric namespace can be regarded as a container for storing metrics. Metrics in different namespaces are independent of each other, so metrics of different applications are never aggregated into the same statistics. Each metric has certain features, and a dimension can be regarded as a category of such features. Figure 1 describes the relationships among namespaces, dimensions, and cluster metrics.

Figure 1 Cluster metrics
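
The structure described above can be illustrated with a minimal sketch. This is not the AOM data model or API: the Metric class, the group_by_namespace helper, and all namespace, dimension, and metric names below are hypothetical and chosen only to show how a namespace keeps metrics with the same name from being aggregated together.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Metric:
    """Hypothetical model of a metric: namespace, name, unit, and dimensions."""
    namespace: str
    name: str
    unit: str
    # Each dimension is a (category, value) pair, e.g. ("clusterName", "demo").
    dimensions: Tuple[Tuple[str, str], ...]


def group_by_namespace(
    samples: List[Tuple[Metric, float]],
) -> Dict[Tuple[str, str], List[float]]:
    """Group sample values by (namespace, metric name).

    Because the namespace is part of the key, values from different
    namespaces never end up in the same list.
    """
    grouped: Dict[Tuple[str, str], List[float]] = defaultdict(list)
    for metric, value in samples:
        grouped[(metric.namespace, metric.name)].append(value)
    return dict(grouped)


# All names below are made up for illustration.
samples = [
    (Metric("example.cluster", "cpuUsage", "Percent", (("clusterName", "demo"),)), 40.0),
    (Metric("example.cluster", "cpuUsage", "Percent", (("clusterName", "demo"),)), 60.0),
    (Metric("example.app", "cpuUsage", "Percent", (("appName", "shop"),)), 90.0),
]

# Average each group separately; the two namespaces stay independent.
averages = {
    key: sum(values) / len(values)
    for key, values in group_by_namespace(samples).items()
}
```

In this sketch, the two cpuUsage metrics share a name but are averaged separately because they belong to different namespaces.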

Hosts

Each host in AOM corresponds to a VM or physical machine. A host can be your own VM or physical machine, or a VM (for example, an ECS) that you created. A host can be connected to AOM for monitoring as long as its OS is supported by AOM and ICAgent has been installed on it.

ICAgent

ICAgent is the collector of AOM. It runs on hosts to collect metrics, logs, and application performance data in real time. Before using AOM, ensure that ICAgent has been installed on the hosts to be monitored. Otherwise, AOM cannot be used.

Logs

AOM supports search and analysis of massive quantities of logs.

Alarms

Alarms are reported when AOM or an external service, such as ServiceStage, Cloud Container Engine (CCE), or Application Performance Management (APM), is abnormal or may cause exceptions. Alarms indicate conditions that cause or may cause service exceptions, so they need to be handled.

There are two alarm clearance modes:

  • Automatic clearance: After a fault is rectified, AOM automatically clears the corresponding alarm, for example, a threshold alarm.
  • Manual clearance: After a fault is rectified, AOM does not automatically clear the corresponding alarm, for example, an ICAgent installation failure alarm. In this case, clear the alarm manually.
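
The difference between the two clearance modes can be sketched as follows. This is a minimal illustration, not AOM's implementation: the ClearanceMode enum, the Alarm class, and the on_fault_rectified and clear_manually functions are hypothetical names introduced only for this example.

```python
from dataclasses import dataclass
from enum import Enum


class ClearanceMode(Enum):
    AUTOMATIC = "automatic"
    MANUAL = "manual"


@dataclass
class Alarm:
    """Hypothetical alarm record with a clearance mode and an active flag."""
    name: str
    mode: ClearanceMode
    active: bool = True


def on_fault_rectified(alarm: Alarm) -> Alarm:
    """Called when the underlying fault is fixed.

    Only alarms in automatic clearance mode are cleared here;
    manual-mode alarms stay active until a user clears them.
    """
    if alarm.mode is ClearanceMode.AUTOMATIC:
        alarm.active = False
    return alarm


def clear_manually(alarm: Alarm) -> Alarm:
    """Represents a user explicitly clearing the alarm."""
    alarm.active = False
    return alarm


threshold_alarm = on_fault_rectified(Alarm("threshold alarm", ClearanceMode.AUTOMATIC))
install_alarm = on_fault_rectified(Alarm("ICAgent installation failure", ClearanceMode.MANUAL))
still_active_after_fix = install_alarm.active  # manual alarms survive the fix
clear_manually(install_alarm)
```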

Events

Events generally carry important information. They are reported when AOM or an external service, such as ServiceStage, CCE, or APM, undergoes a change. Such changes do not necessarily cause service exceptions. Events do not need to be handled.

Threshold Rules

Static threshold rules: You can set threshold conditions for resource metrics. AOM reports a threshold alarm when the value of a metric reaches the preset threshold, or reports an insufficient data event when no metric data is reported. In addition, a custom trigger policy is executed. When the status of a static threshold rule (Exceeded, OK, or Insufficient) changes, a notification is sent by email or SMS message. In this way, you can detect and handle exceptions as early as possible.
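
The status logic of a static threshold rule can be sketched as follows. This is a simplified illustration under stated assumptions, not AOM's evaluation logic: the evaluate_status and status_changes functions are hypothetical, "reaches the preset threshold" is modeled as a greater-than-or-equal comparison, and a missing data point is represented by None. Real rules may use other comparison operators, statistical periods, and trigger policies.

```python
from typing import List, Optional, Tuple


def evaluate_status(value: Optional[float], threshold: float) -> str:
    """Map one metric sample to a rule status.

    Assumption: the rule triggers when value >= threshold.
    None stands for "no metric data was reported".
    """
    if value is None:
        return "Insufficient"
    return "Exceeded" if value >= threshold else "OK"


def status_changes(
    values: List[Optional[float]], threshold: float
) -> List[Tuple[int, str, str]]:
    """Return (sample index, old status, new status) for each status change.

    Each entry marks a point where a notification would be sent
    in this simplified model.
    """
    changes: List[Tuple[int, str, str]] = []
    previous: Optional[str] = None
    for index, value in enumerate(values):
        current = evaluate_status(value, threshold)
        if previous is not None and current != previous:
            changes.append((index, previous, current))
        previous = current
    return changes


# Hypothetical CPU usage samples (percent) checked against a threshold of 80.
changes = status_changes([50.0, 85.0, None, 40.0], threshold=80.0)
```

With these sample values, the status moves from OK to Exceeded, then to Insufficient when data is missing, and back to OK, which gives three status changes.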