Updated on 2025-05-22 GMT+08:00

PERF03-03 Applying Auto Scaling

  • Risk level

    Medium

  • Key strategies

    For scalable workloads such as stateless applications, use compute services that support auto scaling so that resources are adjusted dynamically to match demand. Auto scaling provides sufficient resources during peak hours and prevents over-allocation during off-peak hours. Both VM and container auto scaling adjust application capacity dynamically, but containers respond faster and use resources more efficiently than VMs. Use Auto Scaling (AS) in VM scenarios, and use Cluster AutoScaling (CA) and Horizontal Pod Autoscaling (HPA) in container scenarios.

    The following describes the auto scaling policies applicable to containers:

    CCE supports auto scaling for workloads and nodes.

    • Workload scaling: auto scaling at the scheduling layer, which changes the scheduling capacity occupied by a workload. For example, HPA, a scaling component at the scheduling layer, adjusts the number of pods that run an application. Changing the number of pods changes the scheduling capacity that the workload occupies.
    • Node scaling: auto scaling at the resource layer. When the planned cluster nodes cannot accommodate the workloads to be scheduled, ECS resources are added to support scheduling.

    Workload scaling and node scaling can work separately or together.

    For details, see Using HPA and CA for Auto Scaling of Workloads and Nodes.
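
    As a rough illustration of workload scaling, the sketch below follows the replica formula documented for the upstream Kubernetes HPA: the desired pod count is the current count multiplied by the ratio of the observed metric to the target metric, rounded up. The function name, parameter names, and default bounds are assumptions made for this example and are not part of any CCE API. The 10% tolerance mirrors the upstream Kubernetes default and may differ in a given cluster.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10, tolerance=0.1):
    """Return the pod count an HPA-style controller would aim for.

    All names and defaults here are illustrative assumptions.
    """
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")

    ratio = current_metric / target_metric

    # Inside the tolerance band, keep the current size to avoid flapping.
    if abs(ratio - 1.0) <= tolerance:
        desired = current_replicas
    else:
        desired = math.ceil(current_replicas * ratio)

    # Never leave the configured scaling range.
    return max(min_replicas, min(max_replicas, desired))


if __name__ == "__main__":
    # 4 pods at 90% CPU against a 60% target scale out to 6 pods.
    print(desired_replicas(4, 90, 60))
```

    In this sketch, a metric within 10% of the target leaves the pod count unchanged, and the result is always clamped to the minimum and maximum pod counts.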

    The following table describes the workload scaling components:

    | Type | Component | Description | Reference |
    | --- | --- | --- | --- |
    | HPA | Kubernetes Metrics Server | A built-in Kubernetes component that enables horizontal scaling of pods. It is based on Kubernetes HPA and adds a cooldown time window and scaling thresholds for applications. | HPA Policy |
    | CustomedHPA | CCE Advanced HPA | An enhanced auto scaling feature that scales Deployments based on metrics (CPU usage and memory usage) or on a periodic schedule (a specific time point every day, week, month, or year). | CustomedHPA Policy |
    | CustomedHPA | Prometheus | An open-source system monitoring and alarm framework that collects public metrics (CPU usage and memory usage) of kubelet in the Kubernetes cluster. | N/A |
    | CronHPA | CCE Advanced HPA | CronHPA can scale workloads in a cluster in or out at a fixed time. It can work with HPA policies to periodically adjust the HPA scaling scope, which enables workload scaling in complex scenarios. | CronHPA Policy |
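
    To make the interaction between a timed policy and HPA more concrete, the sketch below models "periodically adjusting the HPA scaling scope" as a set of daily rules, each of which sets the minimum and maximum pod counts from a given time onward. This is a simplified model and not the CCE implementation: the `BoundsRule` class, the `active_bounds` and `clamp` functions, and the default range are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from datetime import time


@dataclass(frozen=True)
class BoundsRule:
    """Hypothetical daily rule: from `start` onward, use these pod bounds."""
    start: time
    min_replicas: int
    max_replicas: int


def active_bounds(rules, now, default=(1, 5)):
    """Return the (min, max) pod bounds in effect at time-of-day `now`."""
    if not rules:
        return default
    ordered = sorted(rules, key=lambda r: r.start)
    started = [r for r in ordered if r.start <= now]
    # Before the first rule of the day, yesterday's last rule still applies.
    rule = started[-1] if started else ordered[-1]
    return (rule.min_replicas, rule.max_replicas)


def clamp(desired, bounds):
    """Keep a metric-driven pod count inside the currently active bounds."""
    low, high = bounds
    return max(low, min(high, desired))


if __name__ == "__main__":
    rules = [
        BoundsRule(time(8, 0), min_replicas=4, max_replicas=20),   # business hours
        BoundsRule(time(22, 0), min_replicas=1, max_replicas=5),   # overnight
    ]
    bounds = active_bounds(rules, time(9, 30))
    # A metric-driven request for 2 pods is raised to the daytime minimum.
    print(bounds, clamp(2, bounds))
```

    The design choice illustrated here is that the timed policy does not set the pod count directly. It only widens or narrows the range, and metric-driven scaling continues to operate inside that range.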

    The following table describes the node scaling components:

    | Component | Description | Scenario | Reference |
    | --- | --- | --- | --- |
    | CCE Cluster Autoscaler | An open-source Kubernetes component for horizontal scaling of nodes, which CCE has optimized in scheduling, auto scaling, and costs. | Online services, deep learning, and large-scale computing with limited resource budgets | Node Scaling |
    | CCE Cloud Bursting Engine for CCI | Extends Kubernetes APIs to serverless container platforms (such as CCI), so you no longer have to worry about node resources. | Online traffic surge, CI/CD, big data, and more | Elastic Scaling of CCE Pods to CCI |
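
    The node scaling idea, adding nodes when pending pods cannot be scheduled on the existing ones, can be approximated with a small capacity calculation. The sketch below is a deliberately simplified model that considers only CPU requests and uses first-fit-decreasing packing. The actual Cluster Autoscaler evaluates scheduling against node pool configurations and considers far more than CPU. The function name and all parameters are hypothetical.

```python
def plan_scale_out(pending_millicores, free_millicores,
                   node_millicores, max_new_nodes):
    """Estimate how many nodes to add so that pending pods can be placed.

    pending_millicores: CPU requests of pods that are waiting to be scheduled
    free_millicores:    unused CPU on each existing node
    node_millicores:    allocatable CPU of one new node
    max_new_nodes:      upper limit on nodes that may be added
    """
    free = list(free_millicores)  # remaining capacity on existing nodes
    new_nodes = []                # remaining capacity on nodes to be added

    # Place the largest requests first (first-fit decreasing).
    for request in sorted(pending_millicores, reverse=True):
        for pool in (free, new_nodes):  # prefer capacity that already exists
            slot = next((i for i, cap in enumerate(pool) if cap >= request), None)
            if slot is not None:
                pool[slot] -= request
                break
        else:
            # The pod fits nowhere, so plan one more node for it.
            if request > node_millicores:
                raise ValueError("pod request exceeds the size of a single node")
            new_nodes.append(node_millicores - request)

    return min(len(new_nodes), max_new_nodes)


if __name__ == "__main__":
    # One existing node has 600m free. New nodes offer 2000m each.
    print(plan_scale_out([500, 1500, 1000, 2000], [600], 2000, max_new_nodes=10))
```

    If the estimate exceeds `max_new_nodes`, some pods remain pending in this model. That is the kind of gap that bursting to a serverless platform such as CCI is meant to cover.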

  • Related cloud services and tools
    • AS
    • CCE
    • CCI