Updated: 2024-08-17 GMT+08:00
Monitoring Master Node Components with Prometheus
This topic describes how to use Prometheus to monitor the kube-apiserver, kube-controller, kube-scheduler, and etcd-server components on Master nodes.
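Viewing Master Node Component Metrics in the Monitoring Center
The Cloud Native Monitoring Center already supports monitoring of the kube-apiserver component on Master nodes. After you enable the Cloud Native Monitoring Center in a cluster (with the Cloud Native Monitoring add-on version 3.5.0 or later installed), you can open the APIServer view in the dashboard to monitor API metrics.
To monitor the kube-controller, kube-scheduler, and etcd-server components, perform the following steps.
The metrics of these three components are not basic container metrics, and reporting them to AOM through the Monitoring Center incurs charges. For this reason, the Monitoring Center does not collect them by default.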
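- Log in to the CCE console and click the cluster name to open the cluster details page.
- In the left navigation pane, choose “ConfigMaps and Secrets”, switch to the “monitoring” namespace, and locate the ConfigMap named “persistent-user-config”.
- Click “Update”, edit the configuration data, and delete the following configuration under the serviceMonitorDisable field.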
serviceMonitorDisable:
  - monitoring/kube-controller
  - monitoring/kube-scheduler
  - monitoring/etcd-server
  - monitoring/log-operator
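Figure 1 Deleting the configuration
- Click “OK”.
- Wait about 5 minutes, then go to the AOM console, find the AOM instance that the cluster reports to under “Metric Browsing”, and view the metrics of the components above.
Figure 2 Viewing metrics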
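Collecting Master Node Component Metrics with a Self-Managed Prometheus
If you need to collect Master node component metrics with your own Prometheus, configure it as described below.
- The cluster version must be v1.19 or later.
- A self-managed Prometheus must be installed in the cluster. You can refer to Prometheus and install it with a Helm chart. After installation, the Prometheus instance must also be managed by prometheus-operator. For details, see Prometheus Operator.
The Prometheus (end of maintenance) add-on is no longer being developed and does not support this feature. Avoid using it.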
- Use kubectl to connect to the cluster.
- Modify the ClusterRole used by Prometheus.
kubectl edit ClusterRole prometheus -n {namespace}
Add the following content to the rules field:
rules:
...
- apiGroups:
  - proxy.exporter.k8s.io
  resources:
  - "*"
  verbs: ["get", "list", "watch"]
- Create and edit the kube-apiserver.yaml file.
vi kube-apiserver.yaml
The file content is as follows:
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  labels:
    app.kubernetes.io/name: apiserver
  name: kube-apiserver
  namespace: monitoring  # Change to the namespace where Prometheus is installed
spec:
  endpoints:
  - bearerTokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token
    interval: 30s
    metricRelabelings:
    - action: keep
      regex: (aggregator_unavailable_apiservice|apiserver_admission_controller_admission_duration_seconds_bucket|apiserver_admission_webhook_admission_duration_seconds_bucket|apiserver_admission_webhook_admission_duration_seconds_count|apiserver_client_certificate_expiration_seconds_bucket|apiserver_client_certificate_expiration_seconds_count|apiserver_current_inflight_requests|apiserver_request_duration_seconds_bucket|apiserver_request_total|go_goroutines|kubernetes_build_info|process_cpu_seconds_total|process_resident_memory_bytes|rest_client_requests_total|workqueue_adds_total|workqueue_depth|workqueue_queue_duration_seconds_bucket|aggregator_unavailable_apiservice_total|rest_client_request_duration_seconds_bucket)
      sourceLabels:
      - __name__
    - action: drop
      regex: apiserver_request_duration_seconds_bucket;(0.15|0.25|0.3|0.35|0.4|0.45|0.6|0.7|0.8|0.9|1.25|1.5|1.75|2.5|3|3.5|4.5|6|7|8|9|15|25|30|50)
      sourceLabels:
      - __name__
      - le
    port: https
    scheme: https
    tlsConfig:
      caFile: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
      serverName: kubernetes
  jobLabel: component
  namespaceSelector:
    matchNames:
    - default
  selector:
    matchLabels:
      component: apiserver
      provider: kubernetes
Create the ServiceMonitor:
kubectl apply -f kube-apiserver.yaml
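To make the effect of the two metricRelabelings rules in kube-apiserver.yaml easier to see, the following Python sketch replays them on a few sample series. It is only an illustration under stated assumptions: it relies on standard Prometheus relabeling behavior (source label values are joined with ";" and the regex must match the whole joined string), it uses shortened copies of the two regexes, and the function name survives and the sample inputs are hypothetical.

```python
import re

# Shortened copy of the "keep" regex from kube-apiserver.yaml
# (assumption: only three of the listed metric names are reproduced here).
KEEP = re.compile(
    r"(apiserver_request_total|apiserver_request_duration_seconds_bucket|go_goroutines)"
)

# Shortened copy of the "drop" regex: same shape as in the manifest,
# with only a few of the bucket bounds.
DROP = re.compile(r"apiserver_request_duration_seconds_bucket;(0.15|0.25|1.5|30)")


def survives(name: str, le: str = "") -> bool:
    """Return True if a sample would pass both metricRelabelings rules."""
    # Rule 1 (action: keep, sourceLabels: [__name__]):
    # the sample is discarded unless the metric name fully matches.
    if not KEEP.fullmatch(name):
        return False
    # Rule 2 (action: drop, sourceLabels: [__name__, le]):
    # the two label values are joined with ";" before matching.
    # A missing "le" label is treated as an empty string.
    if DROP.fullmatch(f"{name};{le}"):
        return False
    return True


if __name__ == "__main__":
    samples = [
        ("apiserver_request_total", ""),
        ("apiserver_request_duration_seconds_bucket", "0.25"),
        ("apiserver_request_duration_seconds_bucket", "0.5"),
        ("etcd_request_duration_seconds_bucket", "0.5"),
    ]
    for name, le in samples:
        print(name, le or "-", "kept" if survives(name, le) else "discarded")
```

The first rule works as an allowlist of metric names, and the second then thins out histogram buckets of apiserver_request_duration_seconds_bucket, which reduces the number of stored series.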
- Create and edit the kube-controller.yaml file.
vi kube-controller.yaml
The file content is as follows:
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  labels:
    app.kubernetes.io/name: kube-controller
  name: kube-controller-manager
  namespace: monitoring  # Change to the namespace where Prometheus is installed
spec:
  endpoints:
  - bearerTokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token
    interval: 15s
    honorLabels: true
    port: https
    relabelings:
    - regex: (.+)
      replacement: /apis/proxy.exporter.k8s.io/v1beta1/kube-controller-proxy/${1}/metrics
      sourceLabels:
      - __address__
      targetLabel: __metrics_path__
    - regex: (.+)
      replacement: ${1}
      sourceLabels:
      - __address__
      targetLabel: instance
    - replacement: kubernetes.default.svc.cluster.local:443
      targetLabel: __address__
    scheme: https
    tlsConfig:
      caFile: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
  jobLabel: app
  namespaceSelector:
    matchNames:
    - kube-system
  selector:
    matchLabels:
      app: kube-controller-proxy
      version: v1
Create the ServiceMonitor:
kubectl apply -f kube-controller.yaml
- Create and edit the kube-scheduler.yaml file.
vi kube-scheduler.yaml
The file content is as follows:
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  labels:
    app.kubernetes.io/name: kube-scheduler
  name: kube-scheduler
  namespace: monitoring  # Change to the namespace where Prometheus is installed
spec:
  endpoints:
  - bearerTokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token
    interval: 15s
    honorLabels: true
    port: https
    relabelings:
    - regex: (.+)
      replacement: /apis/proxy.exporter.k8s.io/v1beta1/kube-scheduler-proxy/${1}/metrics
      sourceLabels:
      - __address__
      targetLabel: __metrics_path__
    - regex: (.+)
      replacement: ${1}
      sourceLabels:
      - __address__
      targetLabel: instance
    - replacement: kubernetes.default.svc.cluster.local:443
      targetLabel: __address__
    scheme: https
    tlsConfig:
      caFile: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
  jobLabel: app
  namespaceSelector:
    matchNames:
    - kube-system
  selector:
    matchLabels:
      app: kube-scheduler-proxy
      version: v1
Create the ServiceMonitor:
kubectl apply -f kube-scheduler.yaml
- Create and edit the etcd-server.yaml file.
vi etcd-server.yaml
The file content is as follows:
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  labels:
    app.kubernetes.io/name: etcd-server
  name: etcd-server
  namespace: monitoring  # Change to the namespace where Prometheus is installed
spec:
  endpoints:
  - bearerTokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token
    interval: 15s
    honorLabels: true
    port: https
    relabelings:
    - regex: (.+)
      replacement: /apis/proxy.exporter.k8s.io/v1beta1/etcd-server-proxy/${1}/metrics
      sourceLabels:
      - __address__
      targetLabel: __metrics_path__
    - regex: (.+)
      replacement: ${1}
      sourceLabels:
      - __address__
      targetLabel: instance
    - replacement: kubernetes.default.svc.cluster.local:443
      targetLabel: __address__
    scheme: https
    tlsConfig:
      caFile: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
  jobLabel: app
  namespaceSelector:
    matchNames:
    - kube-system
  selector:
    matchLabels:
      app: etcd-server-proxy
      version: v1
Create the ServiceMonitor:
kubectl apply -f etcd-server.yaml
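The kube-controller, kube-scheduler, and etcd-server ServiceMonitors share the same three relabeling rules, which route each scrape through the API server instead of contacting the component directly. The Python sketch below replays those rules on one discovered target to show the resulting labels. It is an illustration only: the function name relabel and the sample addresses are hypothetical, and the rules are assumed to be applied in order with whole-string regex matching, as Prometheus relabeling normally does.

```python
import re

# Fixed scrape address taken from the third relabeling rule in the manifests.
API_SERVER = "kubernetes.default.svc.cluster.local:443"


def relabel(address: str, proxy: str) -> dict:
    """Apply the three relabeling rules to one discovered target.

    address: the endpoint address discovered for the proxy Service
             (hypothetical sample values are used below).
    proxy:   the path segment used in the manifest, for example
             "kube-controller-proxy".
    """
    labels = {"__address__": address}

    # Rule 1: copy the original address into the metrics path,
    # so the request targets the proxy API on the API server.
    match = re.fullmatch(r"(.+)", labels["__address__"])
    if match:
        labels["__metrics_path__"] = (
            f"/apis/proxy.exporter.k8s.io/v1beta1/{proxy}/{match.group(1)}/metrics"
        )

    # Rule 2: keep the original address as the "instance" label,
    # so series from different component instances stay distinguishable.
    match = re.fullmatch(r"(.+)", labels["__address__"])
    if match:
        labels["instance"] = match.group(1)

    # Rule 3: replace the scrape address with the API server address.
    # This runs last, after rules 1 and 2 have read the original value.
    labels["__address__"] = API_SERVER
    return labels


if __name__ == "__main__":
    result = relabel("192.168.0.10:10257", "kube-controller-proxy")
    for key, value in result.items():
        print(f"{key} = {value}")
```

Because every request ends up at the API server under the proxy.exporter.k8s.io API group, the Prometheus service account needs the get, list, and watch permissions on that group that were added to the ClusterRole earlier.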
- After the ServiceMonitors are created, open Prometheus and click “Status > Targets”. The Prometheus targets now include the Master node components configured above.
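The same check can be scripted against the Prometheus HTTP API, whose /api/v1/targets endpoint returns the active targets as JSON. The following Python sketch groups target health by job label. It is a minimal sketch, not part of the product: the function name health_by_job is hypothetical, and SAMPLE is a hand-written payload with made-up job names and addresses that only mimics the shape of a real response.

```python
import json
from collections import defaultdict


def health_by_job(targets_json: str) -> dict:
    """Group the health of active targets by their job label."""
    payload = json.loads(targets_json)
    summary = defaultdict(list)
    for target in payload["data"]["activeTargets"]:
        # Fall back to "unknown" if a target carries no job label.
        job = target["labels"].get("job", "unknown")
        summary[job].append(target["health"])
    return dict(summary)


# Hand-written sample in the shape of a /api/v1/targets response.
# Job names and instances are illustrative assumptions.
SAMPLE = json.dumps(
    {
        "status": "success",
        "data": {
            "activeTargets": [
                {
                    "labels": {
                        "job": "kube-controller-proxy",
                        "instance": "192.168.0.10:10257",
                    },
                    "health": "up",
                },
                {
                    "labels": {
                        "job": "etcd-server-proxy",
                        "instance": "192.168.0.10:2379",
                    },
                    "health": "down",
                },
            ]
        },
    }
)

if __name__ == "__main__":
    for job, states in health_by_job(SAMPLE).items():
        print(job, states)
```

In a real cluster, the input would come from a request to /api/v1/targets on your Prometheus address. A target reported as down usually points to a permission or path problem worth checking in the ClusterRole and ServiceMonitor definitions.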