
Monitoring Master Node Components Using Prometheus

This section describes how to use Prometheus to monitor the kube-apiserver, kube-controller, kube-scheduler, and etcd-server components on the master nodes.

Collecting the Metrics of Master Node Components Using Self-Built Prometheus

This section describes how to collect the metrics of master node components using self-built Prometheus.

  • The cluster version must be 1.19 or later.
  • Install self-built Prometheus using Helm by referring to Prometheus, and use prometheus-operator to manage the installed Prometheus by referring to Prometheus Operator. A minimal installation sketch is provided after this list.

    Because the Prometheus add-on (Prometheus) has reached end of maintenance and does not support this function, you are advised not to use this add-on.
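
    For reference, the following is a minimal installation sketch using the community kube-prometheus-stack Helm chart, which bundles Prometheus and prometheus-operator. The repository, release name, and namespace shown here are examples; adjust them to your environment.

    # Add the community Helm chart repository and refresh the index.
    helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
    helm repo update
    # Install Prometheus together with prometheus-operator into the monitoring namespace (example release name and namespace).
    helm install prometheus prometheus-community/kube-prometheus-stack -n monitoring --create-namespace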

  1. Use kubectl to connect to the cluster.
  2. Modify the ClusterRole of Prometheus.

    kubectl edit ClusterRole prometheus -n {namespace}
    Add the following content under the rules field:
    rules:
    ...
    - apiGroups:
      - proxy.exporter.k8s.io
      resources:
      - "*"
      verbs: ["get", "list", "watch"]
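
    To confirm that the rule was added, you can check the ClusterRole again (assuming it is named prometheus, as above):
    kubectl get clusterrole prometheus -o yaml | grep -B 2 -A 4 proxy.exporter.k8s.io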

  3. Create a file named kube-apiserver.yaml and edit it.

    vi kube-apiserver.yaml
    Example file content:
    apiVersion: monitoring.coreos.com/v1
    kind: ServiceMonitor
    metadata:
      labels:
        app.kubernetes.io/name: apiserver
      name: kube-apiserver
      namespace: monitoring    # Change it to the namespace where Prometheus will be installed.
    spec:
      endpoints:
      - bearerTokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token
        interval: 30s
        metricRelabelings:
        - action: keep
          regex: (aggregator_unavailable_apiservice|apiserver_admission_controller_admission_duration_seconds_bucket|apiserver_admission_webhook_admission_duration_seconds_bucket|apiserver_admission_webhook_admission_duration_seconds_count|apiserver_client_certificate_expiration_seconds_bucket|apiserver_client_certificate_expiration_seconds_count|apiserver_current_inflight_requests|apiserver_request_duration_seconds_bucket|apiserver_request_total|go_goroutines|kubernetes_build_info|process_cpu_seconds_total|process_resident_memory_bytes|rest_client_requests_total|workqueue_adds_total|workqueue_depth|workqueue_queue_duration_seconds_bucket|aggregator_unavailable_apiservice_total|rest_client_request_duration_seconds_bucket)
          sourceLabels:
          - __name__
        - action: drop
          regex: apiserver_request_duration_seconds_bucket;(0.15|0.25|0.3|0.35|0.4|0.45|0.6|0.7|0.8|0.9|1.25|1.5|1.75|2.5|3|3.5|4.5|6|7|8|9|15|25|30|50)
          sourceLabels:
          - __name__
          - le
        port: https
        scheme: https
        tlsConfig:
          caFile: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
          serverName: kubernetes
      jobLabel: component
      namespaceSelector:
        matchNames:
        - default
      selector:
        matchLabels:
          component: apiserver
          provider: kubernetes

    Create a ServiceMonitor:

    kubectl apply -f kube-apiserver.yaml

  4. Create a file named kube-controller.yaml and edit it.

    vi kube-controller.yaml
    Example file content:
    apiVersion: monitoring.coreos.com/v1
    kind: ServiceMonitor
    metadata:
      labels:
        app.kubernetes.io/name: kube-controller
      name: kube-controller-manager
      namespace: monitoring    # Change it to the namespace where Prometheus will be installed.
    spec:
      endpoints:
        - bearerTokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token
          interval: 15s
          honorLabels: true
          port: https
          relabelings:
            - regex: (.+)
              replacement: /apis/proxy.exporter.k8s.io/v1beta1/kube-controller-proxy/${1}/metrics
              sourceLabels:
                - __address__
              targetLabel: __metrics_path__
            - regex: (.+)
              replacement: ${1}
              sourceLabels:
                - __address__
              targetLabel: instance
            - replacement: kubernetes.default.svc.cluster.local:443
              targetLabel: __address__
          scheme: https
          tlsConfig:
            caFile: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
      jobLabel: app
      namespaceSelector:
        matchNames:
          - kube-system
      selector:
        matchLabels:
          app: kube-controller-proxy
          version: v1

    Create a ServiceMonitor:

    kubectl apply -f kube-controller.yaml

  5. Create a file named kube-scheduler.yaml and edit it.

    vi kube-scheduler.yaml
    Example file content:
    apiVersion: monitoring.coreos.com/v1
    kind: ServiceMonitor
    metadata:
      labels:
        app.kubernetes.io/name: kube-scheduler
      name: kube-scheduler
      namespace: monitoring    # Change it to the namespace where Prometheus will be installed.
    spec:
      endpoints:
        - bearerTokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token
          interval: 15s
          honorLabels: true
          port: https
          relabelings:
            - regex: (.+)
              replacement: /apis/proxy.exporter.k8s.io/v1beta1/kube-scheduler-proxy/${1}/metrics
              sourceLabels:
                - __address__
              targetLabel: __metrics_path__
            - regex: (.+)
              replacement: ${1}
              sourceLabels:
                - __address__
              targetLabel: instance
            - replacement: kubernetes.default.svc.cluster.local:443
              targetLabel: __address__
          scheme: https
          tlsConfig:
            caFile: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
      jobLabel: app
      namespaceSelector:
        matchNames:
          - kube-system
      selector:
        matchLabels:
          app: kube-scheduler-proxy
          version: v1

    Create a ServiceMonitor:

    kubectl apply -f kube-scheduler.yaml

  6. Create a file named etcd-server.yaml and edit it.

    vi etcd-server.yaml
    Example file content:
    apiVersion: monitoring.coreos.com/v1
    kind: ServiceMonitor
    metadata:
      labels:
        app.kubernetes.io/name: etcd-server
      name: etcd-server
      namespace: monitoring    # Change it to the namespace where Prometheus will be installed.
    spec:
      endpoints:
        - bearerTokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token
          interval: 15s
          honorLabels: true
          port: https
          relabelings:
            - regex: (.+)
              replacement: /apis/proxy.exporter.k8s.io/v1beta1/etcd-server-proxy/${1}/metrics
              sourceLabels:
                - __address__
              targetLabel: __metrics_path__
            - regex: (.+)
              replacement: ${1}
              sourceLabels:
                - __address__
              targetLabel: instance
            - replacement: kubernetes.default.svc.cluster.local:443
              targetLabel: __address__
          scheme: https
          tlsConfig:
            caFile: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
      jobLabel: app
      namespaceSelector:
        matchNames:
          - kube-system
      selector:
        matchLabels:
          app: etcd-server-proxy
          version: v1

    Create a ServiceMonitor:

    kubectl apply -f etcd-server.yaml

  7. Access Prometheus and choose Status > Targets.

    The preceding master node components are displayed.
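
    You can also confirm from the Prometheus expression browser that metrics are being scraped, for example by counting the healthy targets per job. The job names depend on the jobLabel configured in each ServiceMonitor, so the output varies by environment:

    count by (job) (up == 1)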

kube-apiserver Metrics

Each entry in the following table lists the metric category, metric name, type, description, and example PromQL statements.

Request metrics

apiserver_request_total

Counter

The total number of API requests received by kube-apiserver, broken down by labels.

Example labels:

  • verb: HTTP request method, such as GET, POST, PUT, DELETE, PATCH, LIST, or WATCH.
  • group: Kubernetes API group, such as apps/v1 or networking.k8s.io/v1.
  • version: API version, such as v1 or v1beta1.
  • resource: Kubernetes resource type, such as pods, Deployments, Services, or nodes.
  • subresource: sub-resource of a resource (some operations are only available for sub-attributes of the resource), such as logs (viewing logs), exec (executing commands), or status (updating status).
  • scope: application scope of a request, such as cluster (cluster-level), namespace (namespace-level), or resource (single resource).
  • component: source component of a request, such as kube-controller-manager or kube-scheduler.
  • client: the client that initiates a request. It may be an internal component or an external service.
  • code: HTTP response status code, such as 200 (OK), 404 (Not Found), 500 (Internal Server Error), or 429 (Too Many Requests).
  • To query the total request rate (measured by QPS):
    sum(rate(apiserver_request_total[5m]))
  • To query the request errors (5xx returned):
    sum(rate(apiserver_request_total{code=~"5.."}[5m])) by (resource, verb)
  • To query the requests subject to rate limiting (status code 429):
    sum(rate(apiserver_request_total{code="429"}[5m])) by (resource)
  • To query the clients that frequently send requests:
    topk(5, sum(rate(apiserver_request_total[5m])) by (client))

apiserver_request_duration_seconds_bucket

Histogram

Response latency distribution, broken down by labels (such as request type, resource, and status code). This metric can be used to analyze P50, P90, and P99 latencies, identify slow requests and high-latency resources, and monitor kube-apiserver performance.

Example labels:

  • le: core label of the histogram. It indicates the number of requests that took less than or equal to an interval (measured in seconds). For example, le="0.005" indicates the number of requests that took less than or equal to 5 ms.
  • verb: HTTP request method, such as GET, POST, PUT, DELETE, PATCH, LIST, or WATCH.
  • group: Kubernetes API group, such as apps/v1 or networking.k8s.io/v1.
  • version: API version, such as v1 or v1beta1.
  • resource: Kubernetes resource type, such as pods, Deployments, Services, or nodes.
  • subresource: sub-resource of a resource (some operations are only available for sub-attributes of the resource), such as logs (viewing logs), exec (executing commands), or status (updating status).
  • scope: application scope of a request, such as cluster (cluster-level), namespace (namespace-level), or resource (single resource).
  • component: source component of a request, such as kube-controller-manager or kube-scheduler.
  • client: the client that initiates a request. It may be an internal component or an external service.
  • To query the P99 latency (99% of the requests completed within given latency):
    histogram_quantile(0.99,sum(rate(apiserver_request_duration_seconds_bucket[5m])) by (le, resource, verb))

    Example output:

    {resource="pods", verb="GET"}    0.8  # 99% of GET requests for pods were completed within 0.8s or less.
    {resource="deployments", verb="POST"} 1.2 # 99% of POST requests for Deployments were completed within 1.2s or less.
  • To query the P90 latency by resource type:
    histogram_quantile(0.90,sum(rate(apiserver_request_duration_seconds_bucket[5m])) by (le, resource))

    Example output:

    {resource="pods"}      0.5  # 90% of pod requests were completed within 0.5s or less.
    {resource="services"}  0.3  # 90% of Service requests were completed within 0.3s or less.
  • To monitor high-latency requests (> 1s):
    sum(rate(apiserver_request_duration_seconds_bucket{le="+Inf"}[5m])) by (resource, verb)- sum(rate(apiserver_request_duration_seconds_bucket{le="1.0"}[5m])) by (resource, verb)

    Example output:

    {resource="pods", verb="LIST"}  50  # There were 50 LIST requests for pods that took more than 1s in 5 minutes.

apiserver_current_inflight_requests

Gauge

The number of API requests that are being executed. This metric is typically broken down by the request_kind label, which has the following values:

  • readOnly: read requests, which do not change the cluster status. Read requests are usually used to read resources, for example, obtaining the pod list or querying the node status.
  • mutating: write requests, which change the cluster status. Write requests are usually used to create, update, or delete resources, for example, creating a pod or updating a Service.
  • To view the number of write requests that are being executed:
    apiserver_current_inflight_requests{request_kind="mutating"}
  • To view the number of read requests that are being executed:
    apiserver_current_inflight_requests{request_kind="readOnly"}
  • To view the total number of requests that are being executed:
    sum(apiserver_current_inflight_requests)

etcd_request_duration_seconds_bucket

Histogram

etcd request latency.

Example labels:

  • le: core label of the histogram. It indicates the number of requests that took less than or equal to an interval (measured in seconds). For example, le="0.005" indicates the number of requests that took less than or equal to 5 ms.
  • type: type of the operation object.
  • operation: operation type.
  • To query P99 latency:
    histogram_quantile(0.99,sum(rate(etcd_request_duration_seconds_bucket[5m])) by (le, type))
  • To compare read and write latencies:
    histogram_quantile(0.95,rate(etcd_request_duration_seconds_bucket{type=~"range|put"}[5m]))
  • To detect slow requests (> 1s):
    sum(rate(etcd_request_duration_seconds_bucket{le="+Inf"}[5m]))  - sum(rate(etcd_request_duration_seconds_bucket{le="1"}[5m]))

Error and rate limiting metrics

apiserver_flowcontrol_current_executing_requests

Gauge

One of the core metrics in API Priority and Fairness (APF). It reflects the number of requests being executed by the API server in real time. For details, see API Priority and Fairness.

Example labels:

  • priority_level: request priority level. Options:
    • exempt: used for requests that are not subject to flow control (such as requests for key system operations). Requests of this priority level do not occupy the concurrency quota.
    • system: used for requests from Kubernetes control plane components (such as kube-controller-manager and kube-scheduler).
    • node-high: used for health status updates from nodes.
    • leader-election: used for leader election requests (such as requests from kube-controller-manager).
    • workload-high/low: used for workload requests with high and low priorities, respectively.
    • global-default: used for default requests that do not match any FlowSchemas.
    • catch-all: (default) used for all requests that are not explicitly classified. Requests of this priority have a very low concurrency quota.
  • flow_schema: the FlowSchema that matches requests. A FlowSchema can identify the request source (such as kube-controller-manager or kube-scheduler).
  • To view the number of requests executed by priority level:

    apiserver_flowcontrol_current_executing_requests
  • To query resource usages by priority level:
    sum(apiserver_flowcontrol_current_executing_requests{priority_level=~"system|leader-election"}) / sum(apiserver_flowcontrol_nominal_limit_seats{priority_level=~"system|leader-election"})

apiserver_flowcontrol_current_inqueue_requests

Gauge

The number of requests waiting to be executed in the flow control queue. These requests have been received but are not executed because the number of concurrent requests has reached the configured limit. For details, see API Priority and Fairness.

Example labels:

  • priority_level: request priority level (such as system or workload-high).
  • flow_schema: the FlowSchema that matches requests. A FlowSchema can identify the request source (such as kube-controller-manager or kube-scheduler).
  • To identify queued requests at the system priority level:

    apiserver_flowcontrol_current_inqueue_requests{priority_level="system"}
  • To identify requests that have accumulated in the queues over the past 5 minutes:
    delta(apiserver_flowcontrol_current_inqueue_requests[5m])

apiserver_flowcontrol_nominal_limit_seats

Gauge

The nominal request concurrency limit per priority level. For details, see API Priority and Fairness.

This metric is classified by the priority_level label, which indicates the request priority level (such as system or workload-high).

  • To view the nominal request concurrency limits for all priority levels:
    apiserver_flowcontrol_nominal_limit_seats
  • To calculate resource usages based on executing requests:
    sum(apiserver_flowcontrol_current_executing_requests) by (priority_level) / apiserver_flowcontrol_nominal_limit_seats

apiserver_flowcontrol_current_limit_seats

Gauge

The request concurrency limit per priority level. This metric allows you to learn the load of the API server and determine whether to adjust the traffic control policy in high-load scenarios. For details, see API Priority and Fairness.

Unlike nominal_limit_seats, the value of this metric may be affected by the global traffic control policy.

This metric is classified by the priority_level label, which indicates the request priority level (such as system or workload-high).

To view the request concurrency limit at a priority level:
apiserver_flowcontrol_current_limit_seats{priority_level="system"}

apiserver_flowcontrol_current_executing_seats

Gauge

The number of seats corresponding to the requests currently being executed in a priority queue. This metric reflects the concurrent resources being consumed in the queue and helps you understand the actual load of the queue. For details, see API Priority and Fairness. This metric is classified by the priority_level label, which indicates the request priority level (such as system or workload-high).

If the value of current_executing_seats is close to that of current_limit_seats, the concurrent resources of the queue may be about to be used up. You can increase the values of max-mutating-requests-inflight and max-requests-inflight to optimize the configuration. For details, see Modifying Cluster Configurations.

  • To view the number of concurrent seats that have been occupied for a specific priority level:
    apiserver_flowcontrol_current_executing_seats{priority_level="system"}
  • To calculate the seat usage (current usage/current limit):
    sum(apiserver_flowcontrol_current_executing_seats) by (priority_level) / sum(apiserver_flowcontrol_current_limit_seats) by (priority_level)

apiserver_flowcontrol_current_inqueue_seats

Gauge

The concurrent resources consumed by the requests waiting in queues for all priority levels. For details, see API Priority and Fairness.

This metric is classified by the priority_level label, which indicates the request priority level (such as system or workload-high).

  • To view the seats occupied by requests waiting in the queue for a specific priority level:
    apiserver_flowcontrol_current_inqueue_seats{priority_level="system"}
  • To calculate the percentage of queued requests (the number of queued seats/total concurrency quota):
    sum(apiserver_flowcontrol_current_inqueue_seats) by (priority_level)/sum(apiserver_flowcontrol_nominal_limit_seats) by (priority_level)

apiserver_flowcontrol_request_execution_seconds_bucket

Histogram

The execution time of API requests. For details, see API Priority and Fairness.

Example labels:

  • le: core label of the histogram. It indicates the number of requests that took less than or equal to an interval (measured in seconds).
  • priority_level: request priority level (such as system or workload-high).
  • flow_schema: the FlowSchema that matches requests. A FlowSchema can identify the request source (such as kube-controller-manager or kube-scheduler).
  • To calculate the execution time for 99% of requests:
    histogram_quantile(0.99,  sum(rate(apiserver_flowcontrol_request_execution_seconds_bucket[5m])) by (le, priority_level))
  • To detect slow requests (> 1s):
    sum(rate(apiserver_flowcontrol_request_execution_seconds_bucket{le="+Inf"}[5m]))- sum(rate(apiserver_flowcontrol_request_execution_seconds_bucket{le="1"}[5m]))

apiserver_flowcontrol_request_wait_duration_seconds_bucket

Histogram

The waiting time of API requests in a queue. For details, see API Priority and Fairness.

Example labels:

  • le: core label of the histogram. It indicates the number of requests that took less than or equal to an interval (measured in seconds). For example, le="0.005" indicates the number of requests that took less than or equal to 5 ms.
  • priority_level: request priority level (such as system or workload-high).
  • flow_schema: the FlowSchema that matches requests. A FlowSchema can identify the request source (such as kube-controller-manager or kube-scheduler).
  • To calculate the waiting time for 95% of requests:
    histogram_quantile(0.95,  sum(rate(apiserver_flowcontrol_request_wait_duration_seconds_bucket[5m])) by (le, priority_level))
  • To detect long-lasting requests (> 5s):
    sum(rate(apiserver_flowcontrol_request_wait_duration_seconds_bucket{le="+Inf"}[5m]))- sum(rate(apiserver_flowcontrol_request_wait_duration_seconds_bucket{le="5"}[5m]))

apiserver_flowcontrol_dispatched_requests_total

Counter

The total number of API requests that have been scheduled (started to be executed). For details, see API Priority and Fairness.

  • To calculate the request rate at each priority level:
    sum(rate(apiserver_flowcontrol_dispatched_requests_total[5m])) by (priority_level)
  • To compare the number of requests in different FlowSchemas:
    sum(rate(apiserver_flowcontrol_dispatched_requests_total[5m])) by (flow_schema)

apiserver_flowcontrol_rejected_requests_total

Counter

The total number of rejected API requests. Requests are often rejected due to traffic control or insufficient resources. For details, see API Priority and Fairness.

Example labels:

  • priority_level: request priority level.
  • flow_schema: the FlowSchema that matches requests. A FlowSchema can identify the request source (such as kube-controller-manager or kube-scheduler).
  • reason: the reason why a request is rejected. Options:
    • queue-full: Too many requests were already queued.
    • concurrency-limit: The number of requests exceeded the concurrency limit, and the excess requests were immediately rejected with HTTP 429 (Too Many Requests).
    • time-out: The request was still in the queue when its queuing time expired.
    • cancelled: The request was canceled while waiting and has been ejected from the queue.
  • To calculate the request rejection rate:
    sum(rate(apiserver_flowcontrol_rejected_requests_total[5m])) by (priority_level, reason)
  • To calculate the rejection rate ratio:
    sum(rate(apiserver_flowcontrol_rejected_requests_total[5m])) by (priority_level)/sum(rate(apiserver_flowcontrol_dispatched_requests_total[5m])) by (priority_level)

apiserver_flowcontrol_request_concurrency_limit

Gauge

The maximum number of concurrent requests for a priority queue.

This metric is deprecated in Kubernetes 1.30 and removed from Kubernetes 1.31. You are advised to use apiserver_flowcontrol_nominal_limit_seats in clusters of v1.31 or later.

To view the current global concurrency limit:

apiserver_flowcontrol_request_concurrency_limit

Authentication and authorization metrics

apiserver_admission_controller_admission_duration_seconds_bucket

Histogram

The time that the admission controller takes to process API requests.

Example labels:

  • le: core label of the histogram. It indicates the number of requests that took less than or equal to an interval (measured in seconds). For example, le="0.005" indicates the number of requests that took less than or equal to 5 ms.
  • name: name of the admission controller that processes requests, such as MutatingAdmissionWebhook or ValidatingAdmissionWebhook.
  • operation: operation, such as CREATE, UPDATE, or DELETE.
  • type: operation type.
    • validate: checks whether a request is valid.
    • admit: determines whether a valid request is allowed.
  • rejected: whether the request was rejected. The value can be true or false.
  • To sort the results by controller name:
    sort_desc(histogram_quantile(0.99,rate(apiserver_admission_controller_admission_duration_seconds_bucket[5m])))
  • To calculate the processing time for 99% of requests:
    histogram_quantile(0.99,sum(rate(apiserver_admission_controller_admission_duration_seconds_bucket[5m])) by (le, name))

apiserver_admission_webhook_admission_duration_seconds_bucket

Histogram

The time that the admission webhook takes to process requests.

To calculate the processing time for 99% of requests:

histogram_quantile(0.99,sum(rate(apiserver_admission_webhook_admission_duration_seconds_bucket[5m])) by (le, name))

Service availability metrics

up

Gauge

Service availability. Options:

  • 1: A service is available.
  • 0: A service is unavailable.

To check the availability of the current service:

up
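
The request and error-rate queries in this table can also be turned into alerting rules. The following is a minimal sketch of a PrometheusRule managed by prometheus-operator that fires when more than 5% of kube-apiserver requests return 5xx status codes. The resource name, namespace, threshold, and severity label are assumptions; align them with your environment and with how your Prometheus instance selects rules.

apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: kube-apiserver-alerts    # Example name.
  namespace: monitoring          # Change it to the namespace where Prometheus is installed.
spec:
  groups:
  - name: kube-apiserver.rules
    rules:
    - alert: KubeApiserverHighErrorRate
      expr: |
        sum(rate(apiserver_request_total{code=~"5.."}[5m]))
          / sum(rate(apiserver_request_total[5m])) > 0.05
      for: 10m
      labels:
        severity: warning    # Example severity label.
      annotations:
        description: More than 5% of kube-apiserver requests returned 5xx status codes over the last 5 minutes.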

kube-controller Metrics

Each entry in the following table lists the metric name, type, description, and example PromQL statements.

workqueue_adds_total

Counter

The number of adds processed by the workqueue.

  • To calculate the task addition rate of each queue:
    rate(workqueue_adds_total[5m])
  • To detect the abnormal high addition rate (> 1,000/minute):
    rate(workqueue_adds_total[1m]) > 1000/60
  • To sort the results by controller:
    topk(3, sum by(name) (rate(workqueue_adds_total[5m])))

workqueue_depth

Gauge

How big the workqueue is. If the queue depth remains high for a long time, the controller cannot process tasks in the queue in a timely manner, causing task stacking.

To view the depths of all queues:
workqueue_depth

workqueue_queue_duration_seconds_bucket

Histogram

How long in seconds a task stays in the workqueue before being executed.

Example labels:

  • le: core label of the histogram. It indicates the number of requests that took less than or equal to an interval (measured in seconds). For example, le="0.005" indicates the number of requests that took less than or equal to 5 ms.
  • name: task name.
  • To calculate the queuing time for 99% of requests:
    histogram_quantile(0.99,  rate(workqueue_queue_duration_seconds_bucket[5m]))
  • To detect long-lasting tasks (> 10s):
    sum(rate(workqueue_queue_duration_seconds_bucket{le="+Inf"}[5m])) - sum(rate(workqueue_queue_duration_seconds_bucket{le="10"}[5m]))

rest_client_requests_total

Counter

The number of HTTP requests initiated by the Kubernetes client.

Example labels:

  • code: HTTP response status code, such as 200 (OK), 404 (Not Found), 500 (Internal Server Error), or 429 (Too Many Requests).
  • host: host address.
  • method: HTTP request method, such as GET, POST, PUT, DELETE, PATCH, LIST, or WATCH.
  • To calculate the request rate (by status code):
    sum by(code) (rate(rest_client_requests_total[5m]))
  • To detect the 5xx error rate:
    rate(rest_client_requests_total{code=~"5.."}[5m]) / rate(rest_client_requests_total[5m])
  • To collect request statistics by target service:
    sum by(host) (rate(rest_client_requests_total[5m]))

rest_client_request_duration_seconds_bucket

Histogram

The latency of HTTP requests from the client.

Example labels:

  • le: core label of the histogram. It indicates the number of requests that took less than or equal to an interval (measured in seconds). For example, le="0.005" indicates the number of requests that took less than or equal to 5 ms.
  • host: host address.
  • verb: HTTP request method, such as GET, POST, PUT, DELETE, PATCH, LIST, or WATCH.
  • To calculate the P95 latency:
    histogram_quantile(0.95,  rate(rest_client_request_duration_seconds_bucket[5m]))
  • To detect slow requests (> 2s):
    sum(rate(rest_client_request_duration_seconds_bucket{le="+Inf"}[5m]))- sum(rate(rest_client_request_duration_seconds_bucket{le="2"}[5m]))
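
These queries are also commonly used as alert conditions. A minimal sketch of two expressions follows; the thresholds are assumptions to tune for your cluster, and the expressions can be wired into alerting rules in the same way as for kube-apiserver.

# Fires when a controller workqueue keeps a large backlog (threshold is an assumption).
workqueue_depth > 100

# Fires when the controller's client requests are being rate limited by kube-apiserver.
rate(rest_client_requests_total{code="429"}[5m]) > 0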

kube-scheduler Metrics

Each entry in the following table lists the metric name, type, description, and example PromQL statements.

scheduler_scheduler_cache_size

Gauge

The number of nodes, pods, and assumed pods (pods to be scheduled) in the scheduler cache.

To view the number of cached pods:

scheduler_scheduler_cache_size{type="pod"}

scheduler_pending_pods

Gauge

The number of pending pods. This metric can be used to identify scheduling bottlenecks.

This metric is usually classified by the following queue labels:

  • active: the number of pods that are ready and waiting to be scheduled.
  • backoff: the number of pods that failed to be scheduled and are waiting in the backoff queue before being retried.
  • gated: the number of pods that are declared unschedulable or have scheduling gates and are not yet ready for scheduling.
  • unschedulable: the number of pods that were determined to be unschedulable.

To view the number of pods in a specific queue:

scheduler_pending_pods{queue="backoff"}

scheduler_pod_scheduling_attempts_bucket

Histogram

The number of attempts to schedule a pod.

Generally, this metric is labeled by le. The value can be 1, 2, 4, 8, 16, or +Inf.

To detect high-frequency retries (more than eight attempts):
sum(rate(scheduler_pod_scheduling_attempts_bucket{le="+Inf"}[5m]))- sum(rate(scheduler_pod_scheduling_attempts_bucket{le="8"}[5m]))
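
A pending-pod backlog is a common alert condition for the scheduler. A minimal sketch follows; the queue label value comes from the scheduler_pending_pods description above, and the expression should be paired with a suitable for duration in an alerting rule.

# Fires when pods stay in the unschedulable queue.
scheduler_pending_pods{queue="unschedulable"} > 0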

etcd-server Metrics

Each entry in the following table lists the metric category, metric name, type, description, and example PromQL statements.

etcd leader status metrics

etcd_server_has_leader

Gauge

etcd elects one member of a cluster as the leader and the other members as followers. The leader periodically sends heartbeats to all members to keep the cluster stable.

This metric indicates whether there is a leader among the etcd servers. Options:

  • 1: There is a leader among the etcd servers.
  • 0: There is no leader among the etcd servers.

To view the leader status:

etcd_server_has_leader

etcd_server_is_leader

Gauge

Whether an etcd member is the leader. Options:

  • 1: The etcd member is the leader.
  • 0: The etcd member is not the leader.

To check whether an etcd member is the leader:

etcd_server_is_leader

etcd_server_leader_changes_seen_total

Counter

The number of leader changes within a specific period of time.

To monitor the leader change frequency within 1 hour:

rate(etcd_server_leader_changes_seen_total[1h])

etcd storage metrics

etcd_mvcc_db_total_size_in_bytes

Gauge

The total size of the etcd backend database, in bytes.

To calculate the storage space usage:

etcd_mvcc_db_total_size_in_use_in_bytes  / etcd_mvcc_db_total_size_in_bytes

etcd_mvcc_db_total_size_in_use_in_bytes

Gauge

The size of the etcd backend database that is actually in use, in bytes.
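
To view the in-use size of the backend database for each member, query the metric directly:

etcd_mvcc_db_total_size_in_use_in_bytes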

etcd_debugging_mvcc_keys_total

Gauge

The total number of keys in the etcd.

To monitor the increase of keys:

rate(etcd_debugging_mvcc_keys_total[5m])

etcd write performance metrics

etcd_disk_backend_commit_duration_seconds_bucket

Histogram

The time that etcd takes to commit a data change to its backend storage, that is, to persist the change to disk.

To calculate the P99 latency for writes:

histogram_quantile(0.99,rate(etcd_disk_backend_commit_duration_seconds_bucket[5m]))

etcd_server_proposals_committed_total

Gauge

The total number of consensus proposals committed by etcd.

To calculate the write failure rate:

rate(etcd_server_proposals_failed_total[5m]) / rate(etcd_server_proposals_committed_total[5m])

etcd_server_proposals_applied_total

Gauge

The number of applied or executed proposals.

To calculate the write success rate:

rate(etcd_server_proposals_applied_total[5m]) / rate(etcd_server_proposals_committed_total[5m])

etcd_server_proposals_pending

Gauge

The number of pending proposals.

To detect the stacked writes:

etcd_server_proposals_pending

etcd_server_proposals_failed_total

Counter

The number of failed proposals.

To calculate the write failure rate:

rate(etcd_server_proposals_failed_total[5m]) / rate(etcd_server_proposals_committed_total[5m])
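
Several of the queries above are also useful as alert conditions. A minimal sketch of two expressions follows; the leader-change threshold is an assumption, and the expressions can be wired into alerting rules in the same way as for kube-apiserver.

# Fires when an etcd member reports that the cluster has no leader.
etcd_server_has_leader == 0

# Fires when the leader changes frequently within an hour (threshold is an assumption).
increase(etcd_server_leader_changes_seen_total[1h]) > 3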

Helpful Link

For more information about Kubernetes system component metrics, see Kubernetes Metrics Reference.