文档首页/ 云容器引擎 CCE/ 用户指南/ 可观测性/ 可观测性最佳实践/ 使用Prometheus监控Master节点组件指标
更新时间:2024-08-17 GMT+08:00

使用Prometheus监控Master节点组件指标

本文将介绍如何使用Prometheus对Master节点的kube-apiserver、kube-controller、kube-scheduler、etcd-server组件进行监控。

通过监控中心查看Master节点组件指标

云原生监控中心已支持对Master节点的kube-apiserver组件进行监控,您在集群中开通云原生监控中心后(安装云原生监控插件版本为3.5.0及以上),可以查看仪表盘中的APIServer视图,监控API指标。

如需对kube-controller、kube-scheduler、etcd-server组件进行监控,请参考以下步骤。

此3个组件监控指标不在容器基础指标范围,监控中心将该类指标上报至AOM后会进行收费,因此监控中心会默认屏蔽采集该类指标。

  1. 登录CCE控制台,单击集群名称进入集群详情页。
  2. 在左侧导航栏中选择“配置与密钥”,并切换至“monitoring”命名空间,找到名为“persistent-user-config”的配置项。
  3. 单击“更新”,对配置数据进行编辑,并在serviceMonitorDisable字段下删除以下配置。

      serviceMonitorDisable:
      - monitoring/kube-controller
      - monitoring/kube-scheduler
      - monitoring/etcd-server
      - monitoring/log-operator
    图1 删除配置

  4. 单击“确定”。
  5. 等待5分钟后,您可前往AOM控制台,在“指标浏览”中找到集群上报的AOM实例,查看上述组件的指标。

    图2 查看指标

自建Prometheus采集Master节点组件指标

如果您需要通过Prometheus采集Master节点组件指标,可通过以下指导进行配置。

  • 集群版本需要v1.19及以上。
  • 在集群中需安装自建的Prometheus,您可参考Prometheus使用Helm模板进行安装。安装自建Prometheus后,还需要使用prometheus-operator纳管该Prometheus实例,具体操作步骤请参见Prometheus Operator

    由于Prometheus(停止维护)插件版本已停止演进,不再支持该功能特性,请避免使用。

  1. 使用kubectl连接集群。
  2. 修改Prometheus的ClusterRole。

    kubectl edit ClusterRole prometheus -n {namespace}
    在rules字段添加以下内容:
    rules:
    ...
    - apiGroups:
      - proxy.exporter.k8s.io
      resources:
      - "*"
      verbs: ["get", "list", "watch"]

  3. 创建并编辑kube-apiserver.yaml文件。

    vi kube-apiserver.yaml
    文件内容如下:
    apiVersion: monitoring.coreos.com/v1
    kind: ServiceMonitor
    metadata:
      labels:
        app.kubernetes.io/name: apiserver
      name: kube-apiserver
      namespace: monitoring    #修改为Prometheus安装的命名空间
    spec:
      endpoints:
      - bearerTokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token
        interval: 30s
        metricRelabelings:
        - action: keep
          regex: (aggregator_unavailable_apiservice|apiserver_admission_controller_admission_duration_seconds_bucket|apiserver_admission_webhook_admission_duration_seconds_bucket|apiserver_admission_webhook_admission_duration_seconds_count|apiserver_client_certificate_expiration_seconds_bucket|apiserver_client_certificate_expiration_seconds_count|apiserver_current_inflight_requests|apiserver_request_duration_seconds_bucket|apiserver_request_total|go_goroutines|kubernetes_build_info|process_cpu_seconds_total|process_resident_memory_bytes|rest_client_requests_total|workqueue_adds_total|workqueue_depth|workqueue_queue_duration_seconds_bucket|aggregator_unavailable_apiservice_total|rest_client_request_duration_seconds_bucket)
          sourceLabels:
          - __name__
        - action: drop
          regex: apiserver_request_duration_seconds_bucket;(0.15|0.25|0.3|0.35|0.4|0.45|0.6|0.7|0.8|0.9|1.25|1.5|1.75|2.5|3|3.5|4.5|6|7|8|9|15|25|30|50)
          sourceLabels:
          - __name__
          - le
        port: https
        scheme: https
        tlsConfig:
          caFile: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
          serverName: kubernetes
      jobLabel: component
      namespaceSelector:
        matchNames:
        - default
      selector:
        matchLabels:
          component: apiserver
          provider: kubernetes

    创建ServiceMonitor:

    kubectl apply -f kube-apiserver.yaml

  4. 创建并编辑kube-controller.yaml文件。

    vi kube-controller.yaml
    文件内容如下:
    apiVersion: monitoring.coreos.com/v1
    kind: ServiceMonitor
    metadata:
      labels:
        app.kubernetes.io/name: kube-controller
      name: kube-controller-manager
      namespace: monitoring    #修改为Prometheus安装的命名空间
    spec:
      endpoints:
        - bearerTokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token
          interval: 15s
          honorLabels: true
          port: https
          relabelings:
            - regex: (.+)
              replacement: /apis/proxy.exporter.k8s.io/v1beta1/kube-controller-proxy/${1}/metrics
              sourceLabels:
                - __address__
              targetLabel: __metrics_path__
            - regex: (.+)
              replacement: ${1}
              sourceLabels:
                - __address__
              targetLabel: instance
            - replacement: kubernetes.default.svc.cluster.local:443
              targetLabel: __address__
          scheme: https
          tlsConfig:
            caFile: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
      jobLabel: app
      namespaceSelector:
        matchNames:
          - kube-system
      selector:
        matchLabels:
          app: kube-controller-proxy
          version: v1

    创建ServiceMonitor:

    kubectl apply -f kube-controller.yaml

  5. 创建并编辑kube-scheduler.yaml文件。

    vi kube-scheduler.yaml
    文件内容如下:
    apiVersion: monitoring.coreos.com/v1
    kind: ServiceMonitor
    metadata:
      labels:
        app.kubernetes.io/name: kube-scheduler
      name: kube-scheduler
      namespace: monitoring    #修改为Prometheus安装的命名空间
    spec:
      endpoints:
        - bearerTokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token
          interval: 15s
          honorLabels: true
          port: https
          relabelings:
            - regex: (.+)
              replacement: /apis/proxy.exporter.k8s.io/v1beta1/kube-scheduler-proxy/${1}/metrics
              sourceLabels:
                - __address__
              targetLabel: __metrics_path__
            - regex: (.+)
              replacement: ${1}
              sourceLabels:
                - __address__
              targetLabel: instance
            - replacement: kubernetes.default.svc.cluster.local:443
              targetLabel: __address__
          scheme: https
          tlsConfig:
            caFile: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
      jobLabel: app
      namespaceSelector:
        matchNames:
          - kube-system
      selector:
        matchLabels:
          app: kube-scheduler-proxy
          version: v1

    创建ServiceMonitor:

    kubectl apply -f kube-scheduler.yaml

  6. 创建并编辑etcd-server.yaml文件。

    vi etcd-server.yaml
    文件内容如下:
    apiVersion: monitoring.coreos.com/v1
    kind: ServiceMonitor
    metadata:
      labels:
        app.kubernetes.io/name: etcd-server
      name: etcd-server
      namespace: monitoring    #修改为Prometheus安装的命名空间
    spec:
      endpoints:
        - bearerTokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token
          interval: 15s
          honorLabels: true
          port: https
          relabelings:
            - regex: (.+)
              replacement: /apis/proxy.exporter.k8s.io/v1beta1/etcd-server-proxy/${1}/metrics
              sourceLabels:
                - __address__
              targetLabel: __metrics_path__
            - regex: (.+)
              replacement: ${1}
              sourceLabels:
                - __address__
              targetLabel: instance
            - replacement: kubernetes.default.svc.cluster.local:443
              targetLabel: __address__
          scheme: https
          tlsConfig:
            caFile: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
      jobLabel: app
      namespaceSelector:
        matchNames:
          - kube-system
      selector:
        matchLabels:
          app: etcd-server-proxy
          version: v1

    创建ServiceMonitor:

    kubectl apply -f etcd-server.yaml

  7. 创建完成后,访问Prometheus,单击“Status > Targets”,可以查看到Prometheus监控目标中已包含上述三个Master节点组件。