Updated on 2024-06-17 GMT+08:00

Enabling Monitoring for Huawei Cloud Clusters

This section describes how to enable monitoring for Huawei Cloud clusters.

Constraints

The kube-prometheus-stack add-on may already be installed in a Huawei Cloud cluster before you enable monitoring. If the add-on is in the Installing, Upgrading, Deleting, or Rolling back status, monitoring cannot be enabled. For details about the add-on status, see Add-on Status Description.
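
The constraint above can be expressed as a simple pre-check. The following Python sketch is illustrative only and does not call any UCS API: the function name can_enable_monitoring and the convention of passing None when the add-on is not installed are assumptions. Only the four blocking statuses come from this section.

```python
from typing import Optional

# Add-on statuses that, according to this section, block enabling monitoring.
BLOCKING_STATUSES = {"Installing", "Upgrading", "Deleting", "Rolling back"}


def can_enable_monitoring(addon_status: Optional[str]) -> bool:
    """Return True if the kube-prometheus-stack status does not block enabling.

    addon_status is the status text shown for the add-on, or None if the
    add-on is not installed (a convention assumed for this sketch).
    """
    if addon_status is None:
        return True
    return addon_status.strip() not in BLOCKING_STATUSES
```

For example, can_enable_monitoring("Upgrading") returns False, so monitoring cannot be enabled while the add-on is in that status.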

Prerequisites

A Huawei Cloud cluster has been registered with UCS. For details, see Huawei Cloud Clusters.

Procedure

  1. Log in to the UCS console. In the navigation pane on the left, choose Container Intelligent Analysis.
  2. Select a fleet or a cluster that is not in a fleet, and click Enable Monitoring.

    Figure 1 Selecting a fleet or a cluster not in the fleet

  3. Select a Huawei Cloud cluster.
  4. Click Next: Configure Connection to complete the metric collection settings.

    Specifications

    • Deployment Mode: The Agent and Server modes are supported. The Agent mode uses fewer cluster resources and provides Prometheus metric collection for the cluster, but it does not support HPA or health diagnosis based on custom Prometheus statements. The Server mode provides Prometheus metric collection for the cluster and also supports HPA and health diagnosis based on custom Prometheus statements. The Server mode depends on a PVC and consumes a large amount of memory.
    • Add-on Specifications: If Deployment Mode is set to Agent, the default add-on specifications are used. If Deployment Mode is set to Server, the available specifications are Demo (≤ 100 containers), Small (≤ 2,000 containers), Medium (≤ 5,000 containers), and Large (> 5,000 containers). Each specification has different requirements on cluster resources, such as CPU and memory. For details about the resource quotas of each specification, see Resource Quota Requirements of Different Specifications.

    Parameters

    • Interconnection Mode: Currently, only AOM can be interconnected.
    • AOM Instance: Container monitoring reports metrics to AOM in a unified manner. Therefore, you need to select an AOM Prometheus for CCE instance. The default metrics are collected for free but custom metrics are billed by AOM.
    • Collection Period: period for Prometheus to collect and report metrics. The value ranges from 10 to 120 seconds. The default value is 15 seconds.
    • Storage: (Required when Deployment Mode is set to Server) Used for temporary storage (PVC) of Prometheus data. By default, Huawei Cloud clusters use PVCs of the csi-disk-topology storage type. If an available PVC (pvc-prometheus-server) exists in the namespace monitoring, it can be used as the storage source.
      • EVS Disk Type: You can select High I/O, Ultra-high I/O, or Common I/O.
      • Capacity: capacity specified when a PVC is created or the maximum storage limit when the pod storage is selected.

      Using EVS disks for add-on storage will incur extra fees. For details, see Product Pricing Details.

    For details about the add-on, see kube-prometheus-stack.

  5. Click Confirm. The Container Insights > Clusters page is displayed. The access status of the cluster is Installing.

    After monitoring is enabled for the cluster, metrics such as the CPU usage and CPU allocation rate of the cluster are displayed in the list, indicating that the cluster is monitored by CIA.

    If monitoring fails to be enabled for the cluster, rectify the fault by referring to FAQs.
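
The settings chosen in step 4 follow a few simple rules, which the Python sketch below illustrates for planning purposes. It is not part of the console or any UCS API. The function names pick_addon_spec, validate_collection_period, and storage_required, and the label "Default" for the Agent mode specifications, are assumptions made for this example. The container thresholds, the 10 to 120 second range, and the 15 second default come from this section.

```python
DEFAULT_COLLECTION_PERIOD = 15  # seconds, the default stated in this section


def pick_addon_spec(deployment_mode: str, container_count: int) -> str:
    """Map a deployment mode and container count to an add-on specification."""
    if container_count < 0:
        raise ValueError("container_count must not be negative")
    if deployment_mode == "Agent":
        # Agent mode always uses the default specifications.
        return "Default"
    if deployment_mode != "Server":
        raise ValueError("deployment_mode must be 'Agent' or 'Server'")
    if container_count <= 100:
        return "Demo"
    if container_count <= 2000:
        return "Small"
    if container_count <= 5000:
        return "Medium"
    return "Large"


def validate_collection_period(seconds: int = DEFAULT_COLLECTION_PERIOD) -> int:
    """Return the period if it is within 10 to 120 seconds, else raise."""
    if not 10 <= seconds <= 120:
        raise ValueError("collection period must be between 10 and 120 seconds")
    return seconds


def storage_required(deployment_mode: str) -> bool:
    """Storage (PVC) settings are required only in Server mode."""
    return deployment_mode == "Server"
```

For example, pick_addon_spec("Server", 3500) returns "Medium", and storage_required("Agent") returns False because the Storage parameter applies only to the Server mode.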