Updated on 2024-06-17 GMT+08:00

Viewing Cluster Information

Navigation Path

Choose Container Insights > Clusters and click the cluster name in Cluster Statistics. The displayed page consists of the following tabs:

Viewing Cluster Details

The cluster details page shows monitoring data for a single cluster, including the resource overview, top resource consumption statistics, and usage statistics. Use it to track the resource usage and trends of a cluster and to handle potential risks before they affect cluster operation.

Hover over a chart to view the monitoring data for each minute.

Figure 1 Cluster details page

Table 1 Modules on the cluster details page

Module

Description

Resource Health

Resource health is evaluated across several dimensions: the health score, the number of risk items to be processed, the risk level, and the proportion of diagnosed risk items for master nodes, clusters, worker nodes, workloads, and external dependencies. Abnormal data is displayed in red. For more diagnosis results, go to the Health Diagnosis tab.

NOTICE:

You can view the resource health status of a cluster only when kube-prometheus-stack is deployed in server mode in the cluster.

Resource Overview

This module displays the proportion of abnormal nodes, workloads, and pods and the total number of namespaces. It also shows the proportion of abnormal control plane components and master nodes, the total QPS of the API server, and the request error rate of the API server.

The API server on the control plane is the API service provider of the cluster. If it is abnormal, the entire cluster may become inaccessible and workloads that depend on it may fail to run normally. To help you quickly identify and fix such problems, this module provides the total QPS and request error rate of the API server.

Top Resource Consumption Statistics

This module displays the top 5 nodes, Deployments, StatefulSets, and pods by CPU usage and by memory usage, based on statistics collected by UCS, helping you identify the heaviest resource consumers.

NOTE:
  • CPU usage

    Workload CPU usage = Average CPU usage in each pod of the workload

    Pod CPU usage = Used CPU cores/Sum of CPU limits of containers in the pod (If CPU limits are not specified, the total number of CPU cores on the node is used instead.)

  • Memory usage

    Workload memory usage = Average memory usage in each pod of the workload

    Pod memory usage = Used physical memory/Sum of memory limits of containers in the pod (If memory limits are not specified, the total memory of the node is used instead.)

Data Plane Monitoring

By default, this module shows the resource usage of each dimension over the last hour, last 8 hours, and last 24 hours. To view more monitoring information, click View All Metrics to access the Dashboard tab. For details, see Dashboard.
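
The CPU and memory usage formulas in the Top Resource Consumption Statistics note can be illustrated with a short Python sketch. This is not UCS code. The names PodSample, usage_ratio, workload_usage, and top_n are hypothetical, the sample values are made up for illustration, and the sketch assumes that when no limits are specified the node's total capacity serves as the denominator.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean
from typing import Optional


@dataclass
class PodSample:
    """Hypothetical record of one pod's usage of a single resource (CPU or memory)."""
    name: str
    workload: str
    used: float                  # used CPU cores, or used physical memory
    limit_sum: Optional[float]   # sum of container limits; None if no limits are set
    node_capacity: float         # total CPU cores or memory of the node hosting the pod


def usage_ratio(used: float, limit_sum: Optional[float], node_capacity: float) -> float:
    """Pod usage = used / sum of limits, or used / node capacity if limits are not set."""
    # Assumption: a missing (or zero) limit falls back to the node's total capacity.
    denominator = limit_sum if limit_sum else node_capacity
    return used / denominator


def workload_usage(pods: list[PodSample]) -> dict[str, float]:
    """Workload usage = average usage of the pods that belong to the workload."""
    ratios = defaultdict(list)
    for pod in pods:
        ratios[pod.workload].append(usage_ratio(pod.used, pod.limit_sum, pod.node_capacity))
    return {name: mean(values) for name, values in ratios.items()}


def top_n(usage_by_name: dict[str, float], n: int = 5) -> list[tuple[str, float]]:
    """Return the n entries with the highest usage, highest first."""
    return sorted(usage_by_name.items(), key=lambda item: item[1], reverse=True)[:n]


# Illustrative CPU samples (values in cores); not real monitoring data.
pods = [
    PodSample("web-1", "web", used=0.5, limit_sum=1.0, node_capacity=4.0),
    PodSample("web-2", "web", used=0.25, limit_sum=1.0, node_capacity=4.0),
    PodSample("job-1", "job", used=1.0, limit_sum=None, node_capacity=4.0),
]

for name, ratio in top_n(workload_usage(pods)):
    print(f"{name}: {ratio:.1%}")
```

The same usage_ratio function applies to memory by passing used physical memory, the sum of memory limits, and the node's total memory. Note that a pod without limits is measured against the whole node, so its usage can look low even when it consumes more than pods that do have limits.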