
Prometheus

Introduction

Prometheus is an open-source system monitoring and alerting framework. Inspired by Google's Borgmon monitoring system, it was created in 2012 by former Google engineers working at SoundCloud, developed as an open-source community project, and officially released in 2015. In 2016, Prometheus joined the Cloud Native Computing Foundation (CNCF) as its second hosted project, after Kubernetes.

CCE allows you to quickly install Prometheus as an add-on.

Official website of Prometheus: https://prometheus.io/

Open source community: https://github.com/prometheus/prometheus

Constraints

The Prometheus add-on is supported only in clusters of v1.21 and earlier.

Features

As a next-generation monitoring framework, Prometheus has the following features:

  • Powerful multi-dimensional data model
    1. Time series data is identified by a metric name and a set of key-value pairs (labels).
    2. Multi-dimensional labels can be set for all metrics.
    3. Metric names do not need to be dot-separated hierarchical strings.
    4. Data can be aggregated, sliced, and diced along these dimensions.
    5. Sample values are 64-bit floating-point numbers, and label values can contain Unicode characters.
  • Flexible and powerful query language (PromQL): A single query can add, multiply, and join multiple metrics (see the query example after this list).
  • Easy to manage: The Prometheus server is a single standalone binary that works locally and does not depend on distributed storage.
  • Efficient: Each sample occupies only about 3.5 bytes, and a single Prometheus server can handle millions of metrics.
  • The pull mode is used to collect time series data, which facilitates local testing and prevents faulty servers from pushing bad metrics.
  • Time series data can be pushed to the Prometheus server through a push gateway.
  • Monitored targets can be obtained through service discovery or static configuration.
  • Multiple visualization GUIs are available.
  • Easy to scale
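
For example, the following sketch runs a single PromQL query against the Prometheus HTTP API after the add-on is installed. The Service name prometheus and the metric container_cpu_usage_seconds_total are assumptions used for illustration; adjust them to your environment.

# Forward the Prometheus web port to the local machine. The Service name
# "prometheus" in the "monitoring" namespace is an assumption; check the actual
# name with "kubectl get svc -n monitoring".
kubectl -n monitoring port-forward svc/prometheus 9090:9090 &

# A single query that aggregates (sum by) and applies arithmetic (* 1000) across
# many series: per-namespace container CPU usage over the last 5 minutes, in millicores.
curl -sG 'http://localhost:9090/api/v1/query' \
  --data-urlencode 'query=sum by (namespace) (rate(container_cpu_usage_seconds_total[5m])) * 1000'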

Installing the Add-on

  1. Log in to the CCE console and click the cluster name to access the cluster console. Choose Add-ons in the navigation pane, locate Prometheus on the right, and click Install.
  2. In the Configuration step, set the following parameters:

    Table 1 Prometheus add-on parameters

    Add-on Specifications: Select an add-on specification based on service requirements. The options are as follows:

    • Demo (<= 100 containers): Applies to experience and function demonstration environments. With this specification, Prometheus occupies few resources but has limited processing capability. You are advised to use it when the cluster contains no more than 100 containers.
    • Small (<= 2,000 containers): You are advised to use this specification when the cluster contains no more than 2,000 containers.
    • Medium (<= 5,000 containers): You are advised to use this specification when the cluster contains no more than 5,000 containers.
    • Large (> 5,000 containers): You are advised to use this specification when the cluster contains more than 5,000 containers.

    Pods: Number of pods that will be created for the selected add-on specification. This value cannot be modified.

    Containers: CPU and memory quotas of the containers for the selected add-on specification. The quotas cannot be modified.

    Data Retention (days): Number of days for which custom monitoring data is stored. The default value is 15 days.

    Storage: Cloud disks can be used as storage. Set the following parameters as prompted:

    • AZ: Set this parameter based on site requirements. An AZ is a physical region where resources use independent power supplies and networks. AZs are physically isolated but interconnected through an internal network.
    • Disk Type: Common I/O, high I/O, and ultra-high I/O are supported.
    • Capacity: Enter the storage capacity based on service requirements. The default value is 10 GB.

    NOTE: If a PVC already exists in the monitoring namespace, the configured storage is used as the storage source.

  3. Click Install. After the installation is complete, the add-on deploys the following instances in the cluster (see the verification commands after this list).

    • prometheus-operator: deploys and manages the Prometheus Server based on CustomResourceDefinitions (CRDs), and monitors and processes the events related to these CRDs. It is the control center of the entire system.
    • prometheus (server): the Prometheus server deployed by the operator based on the Prometheus CRD. It can be regarded as a StatefulSet.
    • prometheus-kube-state-metrics: exposes the status of Kubernetes objects (such as Deployments, nodes, and pods) as Prometheus metrics.
    • custom-metrics-apiserver: aggregates custom metrics to the native Kubernetes API server.
    • prometheus-node-exporter: deployed on each node to collect node monitoring data.
    • grafana: visualizes monitoring data.
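
    To verify the deployment, you can list the workloads and pods in the monitoring namespace. The exact names and replica counts depend on the selected add-on specifications.

    # List the add-on workloads and pods deployed in the monitoring namespace.
    kubectl get deployments,statefulsets,daemonsets -n monitoring
    kubectl get pods -n monitoring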

Providing Resource Metrics Through the Metrics API

Resource metrics of containers and nodes, such as CPU and memory usage, can be obtained through the Kubernetes Metrics API. Resource metrics can be directly accessed, for example, by using the kubectl top command, or used by HPA or CustomedHPA policies for auto scaling.
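
For example, once the Metrics API is enabled as described below, an HPA policy can scale a workload based on the CPU usage reported through this API. The following manifest is a minimal sketch: the Deployment name scale-target, the namespace, and the 70% target utilization are illustrative assumptions, and apiVersion autoscaling/v2beta2 is used because the add-on targets clusters of v1.21 and earlier.

apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: example-cpu-hpa        # hypothetical name
  namespace: default
spec:
  scaleTargetRef:              # the workload to scale; "scale-target" is a placeholder
    apiVersion: apps/v1
    kind: Deployment
    name: scale-target
  minReplicas: 1
  maxReplicas: 5
  metrics:
  - type: Resource
    resource:
      name: cpu                # CPU usage served through the Metrics API
      target:
        type: Utilization
        averageUtilization: 70 # scale out when average CPU utilization exceeds 70%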

The add-on can provide the Kubernetes Metrics API, which is disabled by default. To enable it, create the following APIService object:

apiVersion: apiregistration.k8s.io/v1
kind: APIService
metadata:
  labels:
    app: custom-metrics-apiserver
    release: cceaddon-prometheus
  name: v1beta1.metrics.k8s.io
spec:
  group: metrics.k8s.io
  groupPriorityMinimum: 100
  insecureSkipTLSVerify: true
  service:
    name: custom-metrics-apiserver
    namespace: monitoring
    port: 443
  version: v1beta1
  versionPriority: 100

You can save the object as a file, name it metrics-apiservice.yaml, and run the following command:

kubectl create -f metrics-apiservice.yaml
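
Optionally, check that the APIService has been registered and that its AVAILABLE column shows True:

kubectl get apiservice v1beta1.metrics.k8s.io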

Run the kubectl top pod -n monitoring command. If information similar to the following is displayed, the Metrics API can be accessed:

# kubectl top pod -n monitoring
NAME                                                      CPU(cores)   MEMORY(bytes)
......
custom-metrics-apiserver-d4f556ff9-l2j2m                  38m          44Mi
......
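
You can also query the aggregated Metrics API directly. The raw path below is the standard metrics.k8s.io endpoint for pod metrics in the monitoring namespace:

kubectl get --raw "/apis/metrics.k8s.io/v1beta1/namespaces/monitoring/pods"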

When uninstalling the add-on, run the following kubectl command to delete the APIService object. Otherwise, the metrics-server add-on cannot be installed later due to the residual APIService resource.

kubectl delete APIService v1beta1.metrics.k8s.io
