Creating a VPA Policy
Kubernetes Vertical Pod Autoscaler (VPA) scales pods vertically. It does this by analyzing the historical usage of container resources and automatically adjusting the CPU and memory resources requested by pods. VPA can adjust container resource requests within a specific range to match the service load. It increases the resource requests when the service load increases and reduces them when the service load decreases to conserve computing resources. Additionally, VPA can provide recommendations on CPU and memory requests to optimize container resource utilization while ensuring that the containers have enough resources to function properly.
Overview
VPA collects and analyzes resource metrics for each container, adjusts the requested resources based on actual usage, and maintains the ratio of resource limit to request before and after the adjustment. VPA can increase or decrease CPU and memory resources as needed.
The rules are as follows:
- VPA generates the CPU and memory resource recommendations using the data collected by the Metrics API.
- In principle, VPA recommends at least 250 MiB of memory for each pod, which means 250 MiB divided by the number of containers in the pod for each container. It also recommends at least 25m of CPU for each pod, which means 25m divided by the number of containers in the pod for each container.
When setting up a VPA, you can define the minimum and maximum resources allowed for containers by configuring the containerPolicies field.
- If a container has both a resource request and a resource limit configured, VPA adjusts the container's request to match its recommendation and derives a new limit from the request-to-limit ratio that was set when the container was created.
For example, assume that a container has a CPU request of 100m and a CPU limit of 200m (a ratio of 1:2). If VPA recommends a CPU request of 80m, the container's CPU limit becomes 160m.
- VPA does not guarantee that its recommendations fit within other resource constraints. If a recommendation conflicts with such a constraint, the recommendation is not adjusted to fit it, so the resource configuration suggested by VPA may exceed that constraint.
For example, assume that the total memory requested in a namespace cannot exceed 2 GiB. If VPA recommends a high memory configuration for a pod in that namespace, the total memory requested in the namespace may exceed 2 GiB after the pod's resource configuration is updated. In that case, the pod cannot be created or scheduled.
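The ratio and per-container minimum rules above come down to simple arithmetic. The following Python sketch reproduces it for illustration only. The function names are hypothetical, the integer rounding is an assumption, and this is not the VPA implementation.

```python
def scaled_limit(orig_request_m: int, orig_limit_m: int, new_request_m: int) -> int:
    """Scale a limit so that the original limit-to-request ratio is preserved.

    All values are CPU millicores (for example, 100 means 100m).
    Rounding down to a whole millicore is an assumption of this sketch.
    """
    if orig_request_m <= 0:
        raise ValueError("original request must be positive")
    # new_limit / new_request == orig_limit / orig_request
    return new_request_m * orig_limit_m // orig_request_m


def per_container_minimum(pod_minimum: float, container_count: int) -> float:
    """Split a pod-level minimum evenly across the containers in the pod."""
    if container_count <= 0:
        raise ValueError("a pod has at least one container")
    return pod_minimum / container_count


# Example from the text: request 100m, limit 200m, recommended request 80m.
print(scaled_limit(100, 200, 80))      # 160

# A pod with two containers: 250 MiB and 25m are split evenly.
print(per_container_minimum(250, 2))   # 125.0 (MiB per container)
print(per_container_minimum(25, 2))    # 12.5 (millicores per container)
```

The same ratio calculation applies to memory if the values are expressed in a common unit such as bytes.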
Prerequisites
- The cluster version must be 1.25 or later.
- An add-on that provides Metrics API has been installed in the cluster. You can select one of the following add-ons based on your service requirements:
- Kubernetes Metrics Server: provides basic resource usage metrics, such as container CPU and memory usage.
- Cloud Native Cluster Monitoring: provides basic resource usage metrics using Prometheus. You need to register Prometheus as a service of Metrics API. For details, see Providing Resource Metrics Through the Metrics API.
- The Vertical Pod Autoscaler add-on has been installed in the cluster.
Precautions
The VPA feature is being tested. Exercise caution when using this feature.
- When VPA adjusts a pod's resource configuration dynamically, the pod will be recreated and could be scheduled to a different node. However, VPA cannot ensure that the scheduling will be successful.
- VPA can dynamically adjust the resource configuration of pods that are managed by controllers (such as Deployments and StatefulSets). It cannot adjust pods that are not managed by a controller.
- Do not use VPA together with an HPA that scales on CPU or memory metrics. The two cannot operate on the same workload simultaneously.
- VPA admission webhooks update the pod resource configuration. If there are other admission webhooks present in the cluster, make sure they do not conflict with the VPA admission webhooks.
- VPA handles the majority of OOM events, but it may not handle all of them.
- VPA performance has not yet been tested in large clusters.
- VPA may recommend more resources than are available, for example, more than the nodes can provide or more than a resource quota allows. As a result, recreated pods may remain in the Pending state and cannot be scheduled.
- Configuring multiple VPAs for the same workload can lead to inconsistent behavior.
Procedure
- Use kubectl to access the cluster. For details, see Connecting to a Cluster Using kubectl.
- Deploy a sample workload. If there is a workload in the cluster, skip this step.
kubectl create -f hamster.yaml
Example configuration of hamster.yaml:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: hamster
spec:
  selector:
    matchLabels:
      app: hamster
  replicas: 2
  template:
    metadata:
      labels:
        app: hamster
    spec:
      containers:
        - name: hamster
          image: registry.k8s.io/ubuntu-slim:0.1
          resources:
            requests:
              cpu: 100m
              memory: 50Mi
          command: ["/bin/sh"]
          args:
            - "-c"
            - "while true; do timeout 0.5s yes >/dev/null; sleep 0.5s; done"
- Create a VPA.
kubectl create -f hamster-vpa.yaml
Example configuration of hamster-vpa.yaml:

apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: hamster-vpa
spec:
  targetRef:
    apiVersion: "apps/v1"
    kind: Deployment
    name: hamster
  updatePolicy:
    updateMode: "Off"
  resourcePolicy:
    containerPolicies:
      - containerName: '*'
        minAllowed:
          cpu: 100m
          memory: 50Mi
        maxAllowed:
          cpu: 1
          memory: 500Mi
        controlledResources: ["cpu", "memory"]
Table 1 Key fields of a VPA

| Field | Mandatory | Description |
|---|---|---|
| spec.targetRef | Yes | The target workload of the VPA. Workload types such as Deployment, StatefulSet, and DaemonSet are supported. |
| spec.updatePolicy.updateMode | No | The policy for applying the resources recommended by the VPA. The default value is Auto. Options: Off (the VPA only generates recommended resources and does not change the requested resources of pods); Recreate (the VPA generates recommended resources and automatically changes the requested resources of pods); Initial (the VPA generates recommended resources and applies them when a pod is created, but leaves the requested resources of existing pods unchanged); Auto (the VPA behaves similarly to Recreate). |
| spec.resourcePolicy.containerPolicies | No | The VPA policies for container resources, including the minimum and maximum resources allowed for containers. For details, see Table 2. |
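The updateMode options in Table 1 can be read as a small decision rule. The Python sketch below encodes that reading. The function name and the `pod_is_being_created` flag are hypothetical, and the sketch ignores details such as when the VPA decides that an existing pod actually needs to be recreated.

```python
def vpa_may_change_requests(update_mode: str, pod_is_being_created: bool) -> bool:
    """Return True if, per Table 1, the VPA may apply its recommendation to a pod."""
    if update_mode == "Off":
        # Recommendations are generated but never applied.
        return False
    if update_mode == "Initial":
        # Applied only when a pod is created; existing pods are left alone.
        return pod_is_being_created
    if update_mode in ("Recreate", "Auto"):
        # Applied to new pods and, by recreating them, to existing pods.
        return True
    raise ValueError(f"unknown updateMode: {update_mode!r}")


for mode in ("Off", "Initial", "Recreate", "Auto"):
    print(mode, vpa_may_change_requests(mode, pod_is_being_created=False))
```

Starting with Off, as in the example configuration, lets you review the recommendations before allowing the VPA to recreate pods.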
Table 2 Key fields in containerPolicy

| Field | Mandatory | Description |
|---|---|---|
| containerName | Yes | Container name |
| minAllowed | No | The minimum amount of resources allowed for a container. The resources recommended by the VPA cannot be lower than this value. Supported resources: CPU and memory. |
| maxAllowed | No | The maximum amount of resources allowed for a container. The resources recommended by the VPA cannot be higher than this value. Supported resources: CPU and memory. |
| controlledResources | No | The types of container resources controlled by the VPA. The default value is ["cpu", "memory"]. Supported resources: CPU and memory. |
| mode | No | Whether the VPA policy takes effect for the container. The default value is Auto. Options: Auto (the VPA policy for the container is enabled); Off (the VPA policy for the container is disabled). |
- Wait for the VPA to generate the resource recommendations and run the following command to check the VPA recommendations:
kubectl get vpa hamster-vpa -oyaml
The command output is as follows:
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: hamster-vpa
  namespace: default
spec:
  resourcePolicy:
    containerPolicies:
    - containerName: '*'
      controlledResources:
      - cpu
      - memory
      maxAllowed:
        cpu: 1
        memory: 500Mi
      minAllowed:
        cpu: 100m
        memory: 50Mi
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: hamster
  updatePolicy:
    updateMode: "Off"
status:
  conditions:
  - lastTransitionTime: "2024-06-27T07:37:01Z"
    status: "True"
    type: RecommendationProvided
  recommendation:
    containerRecommendations:
    - containerName: hamster
      lowerBound:
        cpu: 475m
        memory: 262144k
      target:
        cpu: 587m
        memory: 262144k
      uncappedTarget:
        cpu: 587m
        memory: 262144k
      upperBound:
        cpu: 673m
        memory: 262144k
The status.recommendation field specifies the resource recommendations generated by the VPA.
If updateMode is set to Auto, the VPA changes the requested resources of the pod based on the recommended resources. This change causes the pod to be recreated.
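If you want to read the recommendations in a script rather than by eye, one possible approach is to request JSON output (for example, with kubectl get vpa hamster-vpa -o json) and extract status.recommendation. The Python sketch below assumes the JSON has the same structure as the YAML output shown above. The function name is hypothetical, and the sample is a trimmed copy of that output.

```python
import json


def container_targets(vpa: dict) -> dict:
    """Map each container name to the target requests recommended by the VPA.

    Returns an empty dict if the VPA has not produced a recommendation yet.
    """
    recommendations = (
        vpa.get("status", {})
        .get("recommendation", {})
        .get("containerRecommendations", [])
    )
    return {r["containerName"]: r.get("target", {}) for r in recommendations}


# Trimmed sample with the same structure as the command output above.
sample = json.loads("""
{
  "status": {
    "recommendation": {
      "containerRecommendations": [
        {
          "containerName": "hamster",
          "target": {"cpu": "587m", "memory": "262144k"}
        }
      ]
    }
  }
}
""")

print(container_targets(sample))
# {'hamster': {'cpu': '587m', 'memory': '262144k'}}
```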
Table 3 Key fields in containerRecommendation

| Field | Description |
|---|---|
| containerName | Name of the container for which the VPA policy takes effect |
| target | Recommended resources generated by the VPA, calculated within the minimum and maximum resources configured in the containerPolicy field. The VPA uses these values to dynamically adjust the resources requested by pods. |
| lowerBound | The minimum recommended resources |
| upperBound | The maximum recommended resources |
| uncappedTarget | Recommended resources generated by the VPA before the minimum and maximum resources configured in the containerPolicy field are applied |
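The relationship between uncappedTarget and target can be thought of as clamping the raw recommendation into the range from minAllowed to maxAllowed. The Python sketch below illustrates this with the values from the example. The helper names are hypothetical, the quantity parsers handle only the few suffixes used here, and the sketch is a simplified reading rather than the VPA's actual algorithm.

```python
# Multipliers for the memory suffixes handled by this sketch.
_MEMORY_FACTORS = {
    "Ki": 1024, "Mi": 1024**2, "Gi": 1024**3,
    "k": 1000, "M": 1000**2, "G": 1000**3,
}


def cpu_millis(quantity: str) -> int:
    """Convert a CPU quantity such as '587m' or '1' to millicores."""
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return round(float(quantity) * 1000)


def memory_bytes(quantity: str) -> int:
    """Convert a memory quantity such as '500Mi' or '262144k' to bytes."""
    for suffix, factor in _MEMORY_FACTORS.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor
    return int(quantity)  # plain number of bytes


def cap(value: int, min_allowed: int, max_allowed: int) -> int:
    """Clamp an uncapped recommendation into [min_allowed, max_allowed]."""
    return max(min_allowed, min(value, max_allowed))


# Values from the example: minAllowed 100m / 50Mi, maxAllowed 1 / 500Mi.
print(cap(cpu_millis("587m"), cpu_millis("100m"), cpu_millis("1")))
# 587
print(cap(memory_bytes("262144k"), memory_bytes("50Mi"), memory_bytes("500Mi")))
# 262144000

# A hypothetical uncapped CPU recommendation of 2 cores is cut to maxAllowed.
print(cap(cpu_millis("2"), cpu_millis("100m"), cpu_millis("1")))
# 1000
```

In the example output, the uncapped values already fall within the allowed range, which is why target and uncappedTarget are identical.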