Monitoring Overview

CCE works with AOM to monitor clusters comprehensively. When a node is created, the AOM ICAgent (the DaemonSet named icagent in the kube-system namespace of the cluster) is installed by default. The ICAgent collects monitoring data of the underlying resources and of the workloads running on the cluster. It also collects custom metrics of workloads.

Resource metrics
Basic resource monitoring covers CPU, memory, and disk. For details, see Resource Metrics. You can view these metrics for clusters, nodes, and workloads on the CCE or AOM console.
Custom metrics
The ICAgent collects custom metrics of applications and uploads them to AOM. For details, see Custom Monitoring.

In addition, you can install the Prometheus add-on in a cluster and use Prometheus to collect and display monitoring data. For details, see Monitoring by Using the prometheus Add-on.

Resource Metrics

**Table 1** Resource metrics
Metric	Description
CPU Allocation Rate	Indicates the percentage of CPUs allocated to workloads.
Memory Allocation Rate	Indicates the percentage of memory allocated to workloads.
CPU Usage	Indicates the CPU usage.
Memory Usage	Indicates the memory usage.
Disk Usage	Indicates the disk usage.
Down	Indicates the speed at which data is downloaded to a node. The unit is KB/s.
Up	Indicates the speed at which data is uploaded from a node. The unit is KB/s.
Disk Read Rate	Indicates the data volume read from a disk per second. The unit is KB/s.
Disk Write Rate	Indicates the data volume written to a disk per second. The unit is KB/s.

Viewing Cluster Monitoring Data

In the navigation pane of the CCE console, choose Resource Management > Clusters. Click the icon on the cluster card to access the cluster monitoring page.


The cluster monitoring page displays the monitoring status of cluster resources; the CPU, memory, and disk usage of all nodes in the cluster; and the CPU and memory allocation rates.

Explanation of monitoring metrics:

CPU allocation rate = Sum of CPU quotas requested by pods in the cluster / Sum of allocatable CPU quotas of all nodes (excluding master nodes) in the cluster
Memory allocation rate = Sum of memory quotas requested by pods in the cluster / Sum of allocatable memory quotas of all nodes (excluding master nodes) in the cluster
CPU usage = Average CPU usage of all nodes (excluding master nodes) in the cluster
Memory usage = Average memory usage of all nodes (excluding master nodes) in the cluster
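
The four formulas above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not how CCE or AOM computes the values internally. The Node class, its field names, and the sample figures are hypothetical. The rates are multiplied by 100 so that they read as percentages.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Node:
    """Hypothetical per-node snapshot; field names are illustrative only."""
    allocatable_cpu: float    # cores
    allocatable_mem: float    # GiB
    cpu_usage: float          # percent
    mem_usage: float          # percent
    is_master: bool = False


def cluster_metrics(nodes: Iterable[Node],
                    pod_cpu_requests: Iterable[float],
                    pod_mem_requests: Iterable[float]) -> dict:
    """Apply the four formulas; master nodes are excluded from every term."""
    workers = [n for n in nodes if not n.is_master]
    if not workers:
        raise ValueError("at least one non-master node is required")
    alloc_cpu = sum(n.allocatable_cpu for n in workers)
    alloc_mem = sum(n.allocatable_mem for n in workers)
    return {
        "cpu_allocation_rate": 100 * sum(pod_cpu_requests) / alloc_cpu,
        "mem_allocation_rate": 100 * sum(pod_mem_requests) / alloc_mem,
        "cpu_usage": sum(n.cpu_usage for n in workers) / len(workers),
        "mem_usage": sum(n.mem_usage for n in workers) / len(workers),
    }


# Made-up sample data: two worker nodes and one master node.
nodes = [
    Node(4.0, 8.0, 30.0, 60.0),
    Node(4.0, 8.0, 50.0, 40.0),
    Node(8.0, 16.0, 90.0, 90.0, is_master=True),  # ignored by the formulas
]
metrics = cluster_metrics(nodes, [1.0, 2.0, 1.0], [2.0, 2.0, 4.0])
print(metrics)
```

In this sample, pods request 4 of the 8 allocatable cores on the worker nodes, so the CPU allocation rate is 50%, regardless of how busy the master node is.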

Allocatable node resources (CPU or memory) = Total amount – Reserved amount – Eviction thresholds. For details, see Formula for Calculating the Reserved Resources of a Node.
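
As a worked example of this formula, the following sketch subtracts the reserved amount and the eviction threshold from the node total. The function name and the numbers are assumptions chosen for illustration. The actual reserved amounts depend on the node specifications described in the referenced section.

```python
def allocatable(total: int, reserved: int, eviction_threshold: int) -> int:
    """Allocatable = total - reserved - eviction threshold (one unit throughout)."""
    result = total - reserved - eviction_threshold
    if result < 0:
        raise ValueError("reserved amount and eviction threshold exceed the total")
    return result


# Hypothetical node: 16384 MiB total, 2048 MiB reserved, 100 MiB eviction threshold.
node_allocatable_mib = allocatable(16384, 2048, 100)
print(node_allocatable_mib)
```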

On the cluster monitoring page, you can also view monitoring data of nodes, workloads, and pods. You can click the icon to view the detailed data.


Viewing Monitoring Data of Master Nodes

CCE allows you to view monitoring data of master nodes. The monitoring data of a master node is displayed in the upper right corner of the cluster details page. Click More to go to the AOM console.


Viewing Monitoring Data of Worker Nodes

In addition to the cluster monitoring page, you can also view node monitoring data on the node console by clicking Monitoring in the row where the node resides.


The node list page also displays the allocatable resources of each node. Allocatable resources indicate the upper limit of resources that pods on the node can still request, and are calculated from the pod requests. They do not indicate the resources that are actually available on the node.

The calculation formulas are as follows:

Allocatable CPU = Total CPU – Requested CPU of all pods – Reserved CPU for other resources
Allocatable memory = Total memory – Requested memory of all pods – Reserved memory for other resources
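
The same calculation as a short sketch, applied once for CPU and once for memory. The function name, units (millicores and MiB), and values are hypothetical. The point is that the result depends on what pods have requested, not on what they currently consume.

```python
from typing import Iterable


def node_list_allocatable(total: int,
                          pod_requests: Iterable[int],
                          reserved: int) -> int:
    """Allocatable = total - requests of all pods - amount reserved for other resources."""
    return total - sum(pod_requests) - reserved


# Hypothetical node with 8 cores (8000 millicores) and 16384 MiB of memory.
cpu_left_m = node_list_allocatable(8000, [500, 1000, 250], 400)
mem_left_mib = node_list_allocatable(16384, [1024, 2048], 2048)
print(cpu_left_m, mem_left_mib)
```

A pod that requests 1000 millicores but sits idle still lowers the allocatable CPU by 1000 millicores, which is why this figure can differ from the usage shown on the monitoring page.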
