Updated on 2024-09-30 GMT+08:00

Creating a VPA Policy

Kubernetes Vertical Pod Autoscaler (VPA) scales pods vertically. It does this by analyzing the historical usage of container resources and automatically adjusting the CPU and memory resources requested by pods. VPA can adjust container resource requests within a specific range to match the service load. It increases the resource requests when the service load increases and reduces them when the service load decreases to conserve computing resources. Additionally, VPA can provide recommendations on CPU and memory requests to optimize container resource utilization while ensuring that the containers have enough resources to function properly.

Overview

VPA collects and analyzes resource metrics for each container, adjusts the requested resources based on actual usage, and maintains the ratio of resource limit to request before and after the adjustment. VPA can increase or decrease CPU and memory resources as needed.

The rules are as follows:

  • VPA generates the CPU and memory resource recommendations using the data collected by the Metrics API.
  • VPA, in theory, recommends at least 250 MiB of memory for each pod, which works out to 250 MiB divided by the number of containers in the pod for each container. Likewise, it recommends at least 25m of CPU for each pod, or 25m divided by the number of containers in the pod for each container.

    When creating a VPA, you can set the minimum and maximum resources allowed for each container in the containerPolicies field.

  • If a container has both a resource request and a limit configured, VPA adjusts the request to match its recommendation and derives a recommended limit from the request-to-limit ratio that was set when the container was created.

    For example, assume that a container requests 100m of CPU and has a limit of 200m (a ratio of 1:2). If VPA recommends a CPU request of 80m, the container's CPU limit becomes 160m.

  • VPA does not check its recommendations against other resource constraints. If a recommendation conflicts with such a constraint, it is not adjusted to fit, so the resource configuration suggested by VPA may exceed the constraint.

    For example, assume that the total memory requested in a namespace cannot exceed 2 GiB. If VPA recommends a high memory configuration for a pod in that namespace, the total memory requested in the namespace may exceed 2 GiB after the pod's resource configuration is updated. In that case, the pod will not be scheduled.
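The arithmetic behind these rules can be sketched in a few lines of Python. This is only an illustration of the numbers quoted above, not VPA's actual implementation: the function names are hypothetical, and the pod sizes used in the quota check are invented for the example.

```python
from fractions import Fraction

MIN_POD_MEMORY_MIB = 250  # per-pod memory floor quoted in the rules above
MIN_POD_CPU_MILLI = 25    # per-pod CPU floor quoted in the rules above


def per_container_floor(container_count):
    """Split the per-pod floors evenly across the containers of a pod."""
    if container_count < 1:
        raise ValueError("a pod has at least one container")
    return {
        "cpu_milli": MIN_POD_CPU_MILLI / container_count,
        "memory_mib": MIN_POD_MEMORY_MIB / container_count,
    }


def scaled_limit(original_request, original_limit, recommended_request):
    """Keep the original limit-to-request ratio when the request changes."""
    ratio = Fraction(original_limit, original_request)
    return int(recommended_request * ratio)


def exceeds_quota(namespace_total, old_request, new_request, quota):
    """Check whether swapping one pod's request pushes the total past the quota."""
    return namespace_total - old_request + new_request > quota


# Two containers share the 250 MiB / 25m pod floors.
print(per_container_floor(2))

# Request 100m, limit 200m, recommended request 80m (values from the text).
print(scaled_limit(100, 200, 80))

# Hypothetical namespace: 1536 MiB requested in total, one pod grows from
# 512 MiB to 1280 MiB, quota is 2 GiB (2048 MiB).
print(exceeds_quota(1536, 512, 1280, 2048))
```

With the values from the text, the second call yields a limit of 160m, matching the example above.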

Prerequisites

Precautions

The VPA feature is being tested. Exercise caution when using this feature.

  • When VPA adjusts a pod's resource configuration dynamically, the pod will be recreated and could be scheduled to a different node. However, VPA cannot ensure that the scheduling will be successful.
  • VPA can dynamically adjust the resource configuration of pods managed by controllers (such as Deployments and StatefulSets), but not of pods that are not managed by a controller.
  • VPA cannot be used together with an HPA that scales on CPU or memory metrics.
  • VPA admission webhooks update the pod resource configuration. If there are other admission webhooks present in the cluster, make sure they do not conflict with the VPA admission webhooks.
  • VPA handles the majority of OOM events, but it may not handle all of them.
  • VPA performance has not yet been tested in large clusters.
  • VPA may recommend more resources than are available, for example, more than a node can provide or more than a resource quota allows. Recreated pods may then remain in the Pending state because they cannot be scheduled.
  • Configuring multiple VPAs for the same workload can lead to inconsistent behavior.

Procedure

  1. Use kubectl to access the cluster. For details, see Connecting to a Cluster Using kubectl.
  2. Deploy a sample workload. If there is a workload in the cluster, skip this step.

    kubectl create -f hamster.yaml

    Example configuration of hamster.yaml:

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: hamster
    spec:
      selector:
        matchLabels:
          app: hamster
      replicas: 2
      template:
        metadata:
          labels:
            app: hamster
        spec:
          containers:
            - name: hamster
              image: registry.k8s.io/ubuntu-slim:0.1
              resources:
                requests:
                  cpu: 100m
                  memory: 50Mi
              command: ["/bin/sh"]
              args:
                - "-c"
                - "while true; do timeout 0.5s yes >/dev/null; sleep 0.5s; done" 

  3. Create a VPA.

    kubectl create -f hamster-vpa.yaml

    Example configuration of hamster-vpa.yaml:

    apiVersion: autoscaling.k8s.io/v1
    kind: VerticalPodAutoscaler
    metadata:
      name: hamster-vpa
    spec:
      targetRef:
        apiVersion: "apps/v1"
        kind: Deployment
        name: hamster
      updatePolicy:
        updateMode: "Off"
      resourcePolicy:
        containerPolicies:
          - containerName: '*'
            minAllowed:
              cpu: 100m
              memory: 50Mi
            maxAllowed:
              cpu: 1
              memory: 500Mi
            controlledResources: ["cpu", "memory"]

    Table 1 Key fields of a VPA

    | Field | Mandatory | Description |
    |---|---|---|
    | spec.targetRef | Yes | Target workload of the VPA. Workload types such as Deployment, StatefulSet, and DaemonSet are supported. |
    | spec.updatePolicy.updateMode | No | Policy for applying the resources recommended by the VPA. The default value is Auto. Options: Off (the VPA only generates recommendations and does not change the requested resources of pods); Recreate (the VPA generates recommendations and automatically changes the requested resources of pods); Initial (the VPA generates recommendations and applies them when a pod is created, but leaves the requested resources of existing pods unchanged); Auto (the VPA behaves as it does with Recreate). |
    | spec.resourcePolicy.containerPolicies | No | VPA policies for container resources, including the minimum and maximum resources allowed for containers. For details, see Table 2. |

    Table 2 Key fields in containerPolicy

    | Field | Mandatory | Description |
    |---|---|---|
    | containerName | Yes | Container name |
    | minAllowed | No | Minimum resources allowed for a container. The VPA recommendation cannot be lower than this value. Supported resources: CPU and memory. |
    | maxAllowed | No | Maximum resources allowed for a container. The VPA recommendation cannot be higher than this value. Supported resources: CPU and memory. |
    | controlledResources | No | Types of container resources controlled by the VPA. The default value is ["cpu", "memory"]. Supported resources: CPU and memory. |
    | mode | No | Whether the VPA policy takes effect for the container. The default value is Auto. Options: Auto (the VPA policy for the container is enabled); Off (the VPA policy for the container is disabled). |
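How updateMode and the per-container mode combine can be summarized as a small decision function. The Python sketch below only restates the option descriptions in Table 1 and Table 2. The function name and the pod_is_new flag are hypothetical, and it assumes that a container-level Off takes precedence over the VPA-level setting.

```python
UPDATE_MODES = {"Off", "Initial", "Recreate", "Auto"}
CONTAINER_MODES = {"Auto", "Off"}


def vpa_changes_requests(update_mode="Auto", container_mode="Auto", pod_is_new=False):
    """Return True if the VPA would write its recommendation into the
    container's requested resources, following the option descriptions
    in Table 1 and Table 2."""
    if update_mode not in UPDATE_MODES:
        raise ValueError(f"unknown updateMode: {update_mode}")
    if container_mode not in CONTAINER_MODES:
        raise ValueError(f"unknown container mode: {container_mode}")

    # The VPA policy is disabled for this container, or the VPA only recommends.
    if container_mode == "Off" or update_mode == "Off":
        return False
    # Initial applies recommendations only when a pod is created.
    if update_mode == "Initial":
        return pod_is_new
    # Recreate and Auto also update existing pods.
    return True


print(vpa_changes_requests("Off"))                       # recommend only
print(vpa_changes_requests("Initial", pod_is_new=True))  # applied at creation
print(vpa_changes_requests("Auto", container_mode="Off"))
```

The sample hamster-vpa.yaml uses updateMode "Off", so under this reading the VPA only publishes recommendations and leaves the Deployment's pods untouched.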

  4. Wait for the VPA to generate the resource recommendations and run the following command to check the VPA recommendations:

    kubectl get vpa hamster-vpa -oyaml

    The command output is as follows:

    apiVersion: autoscaling.k8s.io/v1
    kind: VerticalPodAutoscaler
    metadata:
      name: hamster-vpa
      namespace: default
    spec:
      resourcePolicy:
        containerPolicies:
        - containerName: '*'
          controlledResources:
          - cpu
          - memory
          maxAllowed:
            cpu: 1
            memory: 500Mi
          minAllowed:
            cpu: 100m
            memory: 50Mi
      targetRef:
        apiVersion: apps/v1
        kind: Deployment
        name: hamster
      updatePolicy:
        updateMode: "Off"
    status:
      conditions:
      - lastTransitionTime: "2024-06-27T07:37:01Z"
        status: "True"
        type: RecommendationProvided
      recommendation:
        containerRecommendations:
        - containerName: hamster
          lowerBound:
            cpu: 475m
            memory: 262144k
          target:
            cpu: 587m
            memory: 262144k
          uncappedTarget:
            cpu: 587m
            memory: 262144k
          upperBound:
            cpu: 673m
            memory: 262144k

    The status.recommendation field specifies the resource recommendations generated by the VPA.

    If updateMode is set to Auto, the VPA changes the pod's requested resources to match the recommendations. This change causes the pod to be recreated.

    Table 3 Key fields in containerRecommendation

    | Field | Description |
    |---|---|
    | containerName | Name of the container to which the recommendation applies |
    | target | Recommended resources generated by the VPA, capped by the minimum and maximum resources configured in containerPolicies. The VPA uses these values when it adjusts the requested resources of pods. |
    | lowerBound | Lower bound of the recommended resources |
    | upperBound | Upper bound of the recommended resources |
    | uncappedTarget | Recommended resources generated by the VPA before the minimum and maximum resources configured in containerPolicies are applied |
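The relationship between uncappedTarget and target can be illustrated by recomputing the cap from the policy bounds. The Python sketch below is a simplified model and not VPA's own code: the helper names are hypothetical, the quantity parsers handle only the suffixes that appear in this example plus a few common ones, and the embedded JSON is a trimmed copy of the output above, in the shape that kubectl get vpa hamster-vpa -o json would be expected to return.

```python
import json

# Memory suffixes handled by this sketch (two-letter binary suffixes first).
_MEMORY_UNITS = {"Ki": 1024, "Mi": 1024**2, "Gi": 1024**3,
                 "k": 1000, "M": 1000**2, "G": 1000**3}


def parse_cpu_milli(quantity):
    """Convert a CPU quantity such as '587m' or 1 to millicores."""
    text = str(quantity)
    if text.endswith("m"):
        return int(text[:-1])
    return round(float(text) * 1000)


def parse_memory_bytes(quantity):
    """Convert a memory quantity such as '50Mi' or '262144k' to bytes."""
    text = str(quantity)
    for suffix, factor in _MEMORY_UNITS.items():
        if text.endswith(suffix):
            return int(text[:-len(suffix)]) * factor
    return int(text)


def clamp(value, low, high):
    return max(low, min(high, value))


def capped_target(uncapped, min_allowed, max_allowed):
    """Clamp an uncapped recommendation into [minAllowed, maxAllowed]."""
    return {
        "cpu_milli": clamp(parse_cpu_milli(uncapped["cpu"]),
                           parse_cpu_milli(min_allowed["cpu"]),
                           parse_cpu_milli(max_allowed["cpu"])),
        "memory_bytes": clamp(parse_memory_bytes(uncapped["memory"]),
                              parse_memory_bytes(min_allowed["memory"]),
                              parse_memory_bytes(max_allowed["memory"])),
    }


vpa = json.loads("""
{
  "spec": {"resourcePolicy": {"containerPolicies": [
    {"containerName": "*",
     "minAllowed": {"cpu": "100m", "memory": "50Mi"},
     "maxAllowed": {"cpu": "1", "memory": "500Mi"}}]}},
  "status": {"recommendation": {"containerRecommendations": [
    {"containerName": "hamster",
     "target": {"cpu": "587m", "memory": "262144k"},
     "uncappedTarget": {"cpu": "587m", "memory": "262144k"}}]}}
}
""")

policy = vpa["spec"]["resourcePolicy"]["containerPolicies"][0]
for rec in vpa["status"]["recommendation"]["containerRecommendations"]:
    recomputed = capped_target(rec["uncappedTarget"],
                               policy["minAllowed"], policy["maxAllowed"])
    print(rec["containerName"], recomputed)
```

In the sample output, the uncapped values already lie between minAllowed and maxAllowed, so target equals uncappedTarget. The memory value 262144k is 262,144,000 bytes, which is exactly 250 MiB. A hypothetical uncapped recommendation of 1500m CPU would be capped to the maxAllowed value of 1 (1000m) under this model.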