Creating a VPA Policy

Updated on 2025-02-18 GMT+08:00

Kubernetes Vertical Pod Autoscaler (VPA) scales pods vertically. It does this by analyzing the historical usage of container resources and automatically adjusting the CPU and memory resources requested by pods. VPA can adjust container resource requests within a specific range to match the service load. It increases the resource requests when the service load increases and reduces them when the service load decreases to conserve computing resources. Additionally, VPA can provide recommendations on CPU and memory requests to optimize container resource utilization while ensuring that the containers have enough resources to function properly.

Overview

VPA collects and analyzes resource metrics for each container, adjusts the requested resources based on actual usage, and maintains the ratio of resource limit to request before and after the adjustment. VPA can increase or decrease CPU and memory resources as needed.

The rules are as follows:

  • VPA generates the CPU and memory resource recommendations using the data collected by the Metrics API.
  • By default, VPA recommends at least 250 MiB of memory for each pod, which means 250 MiB divided by the number of containers in the pod for each container. Likewise, it recommends at least 25m of CPU for each pod, which means 25m divided by the number of containers in the pod for each container.

    When setting up a VPA, you can set the minimum and maximum amount of resources that the VPA may recommend for a container by configuring the containerPolicies field.

  • If a container has both a resource request and a resource limit configured, VPA adjusts the container's request to match its recommendation and derives a recommended limit that keeps the request-to-limit ratio set when the container was first created.

    Assume that a container requests 100m of CPU and has a limit of 200m (a ratio of 1:2). If VPA recommends a CPU request of 80m, the container's CPU limit becomes 160m.

  • VPA does not check its recommendations against other resource constraints, such as namespace resource quotas. If a recommendation conflicts with such a constraint, the recommendation is not adjusted to fit it, so the resource configuration suggested by VPA may exceed the constraint.

    Assume that the total memory requested in a namespace cannot exceed 2 GiB. If VPA recommends a high memory request for a pod in that namespace, the total memory requested in the namespace may exceed 2 GiB after the pod's resource configuration is updated. In that case, the pod will not be scheduled.
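The arithmetic behind the per-container minimums and the request-to-limit ratio can be sketched in a few lines of Python. This is only an illustration of the rules described above, not VPA's actual implementation. The function names are hypothetical, and truncating the scaled limit to whole millicores is an assumption.

```python
from fractions import Fraction


def per_container_minimum(pod_minimum: float, container_count: int) -> float:
    """Split a pod-level minimum evenly among the pod's containers."""
    return pod_minimum / container_count


def scaled_limit(orig_request_m: int, orig_limit_m: int, new_request_m: int) -> int:
    """Return the limit (in millicores) that keeps the original limit:request ratio.

    Truncation to whole millicores is an assumption made for this sketch.
    """
    ratio = Fraction(orig_limit_m, orig_request_m)
    return int(new_request_m * ratio)


# Pod with two containers: 250 MiB and 25m are shared between them.
print(per_container_minimum(250, 2))  # 125.0 (MiB per container)
print(per_container_minimum(25, 2))   # 12.5 (millicores per container)

# Request 100m, limit 200m, recommended request 80m.
print(scaled_limit(100, 200, 80))     # 160
```

The last call reproduces the example from the list: a 1:2 ratio turns a recommended request of 80m into a limit of 160m.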

Prerequisites

Precautions

NOTICE:

The VPA feature is being tested. Exercise caution when using this feature.

  • When VPA adjusts a pod's resource configuration dynamically, the pod will be recreated and could be scheduled to a different node. However, VPA cannot ensure that the scheduling will be successful.
  • VPA can dynamically adjust the resource configuration of pods managed by a controller (such as a Deployment or StatefulSet), but not that of standalone pods that are not managed by a controller.
  • VPA cannot be used together with an HPA that scales the same workload based on CPU or memory metrics.
  • VPA admission webhooks update the pod resource configuration. If there are other admission webhooks present in the cluster, make sure they do not conflict with the VPA admission webhooks.
  • VPA handles the majority of OOM events, but it may not handle all of them.
  • VPA performance has not yet been tested in large clusters.
  • VPA may recommend more resources than are currently available, for example, more than a node can provide or more than a resource quota allows. This can leave recreated pods stuck in the Pending state, unable to be scheduled.
  • Configuring multiple VPAs for the same workload can lead to inconsistent behavior.
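Because VPA does not take quotas into account, it can help to check a recommendation against the namespace quota before switching a VPA out of Off mode. The following Python sketch shows that check under stated assumptions: the function name and all numbers are hypothetical, and it only models a quota on the total requested memory, as in the 2 GiB example in the overview.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3


def fits_memory_quota(quota_bytes: int, other_requests_bytes: int,
                      replicas: int, recommended_bytes: int) -> bool:
    """Return True if the workload's pods still fit the namespace memory quota.

    other_requests_bytes is the memory already requested by other pods
    in the same namespace.
    """
    total = other_requests_bytes + replicas * recommended_bytes
    return total <= quota_bytes


# Hypothetical namespace: 2 GiB quota, 1 GiB already requested by other pods,
# and a workload with 2 replicas.
print(fits_memory_quota(2 * GIB, 1 * GIB, 2, 500 * MIB))  # True
print(fits_memory_quota(2 * GIB, 1 * GIB, 2, 600 * MIB))  # False
```

In the second case the two replicas would request 1200 MiB in total, which pushes the namespace past 2 GiB, so the recreated pods could not all be scheduled.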

Procedure

  1. Use kubectl to access the cluster. For details, see Connecting to a Cluster Using kubectl.
  2. Deploy a sample workload. If a workload already runs in the cluster, skip this step.

    kubectl create -f hamster.yaml

    Example configuration of hamster.yaml:
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: hamster
    spec:
      selector:
        matchLabels:
          app: hamster
      replicas: 2
      template:
        metadata:
          labels:
            app: hamster
        spec:
          containers:
            - name: hamster
              image: registry.k8s.io/ubuntu-slim:0.1
              resources:
                requests:
                  cpu: 100m
                  memory: 50Mi
              command: ["/bin/sh"]
              args:
                - "-c"
                - "while true; do timeout 0.5s yes >/dev/null; sleep 0.5s; done" 

  3. Create a VPA.

    kubectl create -f hamster-vpa.yaml

    Example configuration of hamster-vpa.yaml:
    apiVersion: autoscaling.k8s.io/v1
    kind: VerticalPodAutoscaler
    metadata:
      name: hamster-vpa
    spec:
      targetRef:
        apiVersion: "apps/v1"
        kind: Deployment
        name: hamster
      updatePolicy:
        updateMode: "Off"
      resourcePolicy:
        containerPolicies:
          - containerName: '*'
            minAllowed:
              cpu: 100m
              memory: 50Mi
            maxAllowed:
              cpu: 1
              memory: 500Mi
            controlledResources: ["cpu", "memory"]

    Table 1 Key fields of a VPA

    | Field | Mandatory | Description |
    |---|---|---|
    | spec.targetRef | Yes | The target workload of the VPA. Workload types such as Deployment, StatefulSet, and DaemonSet are supported. |
    | spec.updatePolicy.updateMode | No | How the VPA applies its recommended resources. The default value is Auto. Options: Off (the VPA only generates recommended resources and does not change the requested resources of pods); Recreate (the VPA generates recommended resources and automatically changes the requested resources of pods); Initial (the VPA generates recommended resources and applies them when a pod is created, but leaves the requested resources of existing pods unchanged); Auto (the VPA behaves as if this parameter were set to Recreate). |
    | spec.resourcePolicy.containerPolicies | No | VPA policies for container resources, including the minimum and maximum amount of resources allowed for a container. For details, see Table 2. |

    Table 2 Key fields in containerPolicy

    | Field | Mandatory | Description |
    |---|---|---|
    | containerName | Yes | Container name |
    | minAllowed | No | The minimum amount of resources allowed for a container. The VPA does not recommend less than this value. Supported resources: CPU and memory. |
    | maxAllowed | No | The maximum amount of resources allowed for a container. The VPA does not recommend more than this value. Supported resources: CPU and memory. |
    | controlledResources | No | The types of container resources controlled by the VPA. The default value is ["cpu", "memory"]. Supported resources: CPU and memory. |
    | mode | No | Whether the VPA policy applies to the container. The default value is Auto. Options: Auto (the VPA policy for the container is enabled); Off (the VPA policy for the container is disabled). |

  4. Wait for the VPA to generate the resource recommendations and run the following command to check the VPA recommendations:

    kubectl get vpa hamster-vpa -oyaml

    Command output:

    apiVersion: autoscaling.k8s.io/v1
    kind: VerticalPodAutoscaler
    metadata:
      name: hamster-vpa
      namespace: default
    spec:
      resourcePolicy:
        containerPolicies:
        - containerName: '*'
          controlledResources:
          - cpu
          - memory
          maxAllowed:
            cpu: 1
            memory: 500Mi
          minAllowed:
            cpu: 100m
            memory: 50Mi
      targetRef:
        apiVersion: apps/v1
        kind: Deployment
        name: hamster
      updatePolicy:
        updateMode: "Off"
    status:
      conditions:
      - lastTransitionTime: "2024-06-27T07:37:01Z"
        status: "True"
        type: RecommendationProvided
      recommendation:
        containerRecommendations:
        - containerName: hamster
          lowerBound:
            cpu: 475m
            memory: 262144k
          target:
            cpu: 587m
            memory: 262144k
          uncappedTarget:
            cpu: 587m
            memory: 262144k
          upperBound:
            cpu: 673m
            memory: 262144k

    The status.recommendation field specifies the resource recommendations generated by the VPA.

    If updateMode is set to Auto, the VPA changes the requested resources of the pod based on the recommended resources. This change causes the pod to be recreated.
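To consume the recommendation in a script rather than read it by eye, the status can be parsed as JSON. The sketch below assumes the object was exported with kubectl get vpa hamster-vpa -o json and uses a trimmed copy of the command output above as sample data. The helper name container_targets is hypothetical.

```python
import json


def container_targets(vpa: dict) -> dict:
    """Map each container name to its recommended target resources."""
    recommendations = (
        vpa.get("status", {})
        .get("recommendation", {})
        .get("containerRecommendations", [])
    )
    return {rec["containerName"]: rec["target"] for rec in recommendations}


# Trimmed copy of the status shown in the command output above.
sample = json.loads("""
{
  "status": {
    "recommendation": {
      "containerRecommendations": [
        {
          "containerName": "hamster",
          "target": {"cpu": "587m", "memory": "262144k"}
        }
      ]
    }
  }
}
""")

print(container_targets(sample))  # {'hamster': {'cpu': '587m', 'memory': '262144k'}}
print(container_targets({}))      # {} (no recommendation generated yet)
```

Returning an empty mapping when the status is missing keeps the helper usable while the VPA is still collecting data.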

    Table 3 Key fields in containerRecommendation

    | Field | Description |
    |---|---|
    | containerName | Name of the container for which the VPA policy takes effect |
    | target | Recommended resources generated by the VPA, calculated within the minimum and maximum amount of resources configured in the containerPolicy field. The VPA uses these values to dynamically adjust the resources requested by pods. |
    | lowerBound | The minimum amount of recommended resources |
    | upperBound | The maximum amount of recommended resources |
    | uncappedTarget | Actual recommended resources generated by the VPA, calculated without applying the minimum and maximum amount of resources configured in the containerPolicy field |
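The relationship between uncappedTarget and target can be illustrated as a clamp to the range defined by minAllowed and maxAllowed. The following Python sketch is a simplified model based on the field descriptions above, not the VPA's own code. The parsers are hypothetical helpers that handle only the quantity suffixes used on this page plus a few common ones.

```python
_MEMORY_SUFFIXES = {
    "Ki": 1024, "Mi": 1024 ** 2, "Gi": 1024 ** 3,
    "k": 1000, "M": 1000 ** 2, "G": 1000 ** 3,
}


def parse_cpu(quantity) -> int:
    """Return millicores for CPU quantities such as '587m' or 1."""
    text = str(quantity)
    if text.endswith("m"):
        return int(text[:-1])
    return int(float(text) * 1000)


def parse_memory(quantity) -> int:
    """Return bytes for memory quantities such as '500Mi' or '262144k'."""
    text = str(quantity)
    # Binary suffixes are checked first so that 'Mi' is not read as 'M'.
    for suffix in ("Ki", "Mi", "Gi", "k", "M", "G"):
        if text.endswith(suffix):
            return int(text[: -len(suffix)]) * _MEMORY_SUFFIXES[suffix]
    return int(text)


def cap(value: int, min_allowed: int, max_allowed: int) -> int:
    """Clamp a recommendation to the [min_allowed, max_allowed] range."""
    return max(min_allowed, min(value, max_allowed))


# Values from the sample policy and output: the uncapped CPU target of 587m
# already lies between minAllowed (100m) and maxAllowed (1).
print(cap(parse_cpu("587m"), parse_cpu("100m"), parse_cpu(1)))   # 587

# A hypothetical uncapped target of 1500m would be capped at maxAllowed.
print(cap(parse_cpu("1500m"), parse_cpu("100m"), parse_cpu(1)))  # 1000

# The memory target 262144k equals 250 MiB, the per-pod minimum.
print(parse_memory("262144k") == 250 * 1024 ** 2)                # True
```

This also explains why target and uncappedTarget are identical in the sample output: neither the CPU nor the memory recommendation falls outside the configured range.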
