
CCE Cluster Autoscaler

Updated on 2024-01-26 GMT+08:00


CCE Cluster Autoscaler is an important Kubernetes controller. It supports microservice scaling and is key to serverless design.

When the CPU or memory usage of a microservice is too high, horizontal pod autoscaling is triggered to add pods to reduce the load. These pods can be automatically reduced when the load is low, allowing the microservice to run as efficiently as possible.

CCE simplifies the creation, upgrade, and manual scaling of Kubernetes clusters, in which traffic loads change over time. To balance resource usage and workload performance of nodes, Kubernetes introduces the autoscaler add-on to automatically adjust the number of nodes in a cluster based on the resource usage required for workloads deployed in the cluster. For details, see Creating a Node Scaling Policy.

Open source community:

How the Add-on Works

autoscaler controls auto scale-out and scale-in.

  • Auto scale-out
    You can choose either of the following methods:
    • If pods in a cluster cannot be scheduled due to insufficient worker nodes, cluster scaling is triggered to add nodes. The nodes to be added have the same specification as configured for the node pool to which the nodes belong.
      Auto scale-out will be performed when:
      • Node resources are insufficient.
      • No node affinity policy is set in the pod scheduling configuration. If a node has been configured as an affinity node for pods, no node will be automatically added when the pods cannot be scheduled. For details about how to configure the node affinity policy, see Scheduling Policy (Affinity/Anti-affinity).
    • When the cluster meets the node scaling policy, cluster scale-out is also triggered. For details, see Creating a Node Scaling Policy.

    The add-on follows the "No Less, No More" policy. For example, if three cores are required for creating a pod and the system supports four-core and eight-core nodes, autoscaler will preferentially create a four-core node.

  • Auto scale-in
    When a cluster node is idle for a period of time (10 minutes by default), cluster scale-in is triggered, and the node is automatically deleted. However, a node cannot be deleted from a cluster if the following pods exist:
    • Pods that do not meet specific requirements set in Pod Disruption Budgets (PodDisruptionBudget)
    • Pods that cannot be scheduled to other nodes due to constraints such as affinity and anti-affinity policies
    • Pods that have the 'false' annotation
    • Pods (except those created by DaemonSets in the kube-system namespace) that exist in the kube-system namespace on the node
    • Pods that are not created by a controller (Deployment/ReplicaSet/Job/StatefulSet)

    When a node meets the scale-in conditions, autoscaler adds the DeletionCandidateOfClusterAutoscaler taint to the node in advance to prevent pods from being scheduled to the node. After the autoscaler add-on is uninstalled, if the taint still exists on the node, manually delete it.
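
The scale-out sizing rule and the scale-in blockers described above can be sketched in a few lines of Python. This is only an illustration, not the add-on's implementation: the `Pod` fields, `pick_node_cores`, and `scale_in_blockers` are hypothetical names, and skipping DaemonSet pods outright is an assumption based on the statement elsewhere in this document that DaemonSet pods do not need to be evicted.

```python
from dataclasses import dataclass
from typing import List, Optional

CONTROLLER_KINDS = {"Deployment", "ReplicaSet", "Job", "StatefulSet"}


def pick_node_cores(required_cores: float, node_core_sizes: List[int]) -> Optional[int]:
    """'No Less, No More': smallest node size (in cores) that fits the pod, or None."""
    fitting = [size for size in node_core_sizes if size >= required_cores]
    return min(fitting) if fitting else None


@dataclass
class Pod:
    # All fields are hypothetical stand-ins for what the add-on reads from the cluster.
    name: str
    namespace: str = "default"
    owner_kind: Optional[str] = None     # None means the pod was created directly
    violates_pdb: bool = False           # eviction would break a PodDisruptionBudget
    can_move: bool = True                # False if affinity/anti-affinity pins it here
    marked_not_evictable: bool = False   # carries the 'false' annotation


def scale_in_blockers(pods: List[Pod]) -> List[str]:
    """Names of pods that would keep an idle node from being deleted."""
    blockers = []
    for pod in pods:
        if pod.owner_kind == "DaemonSet":
            # Assumption: DaemonSet pods never need eviction, so they never block.
            continue
        if (pod.violates_pdb
                or not pod.can_move
                or pod.marked_not_evictable
                or pod.namespace == "kube-system"
                or pod.owner_kind not in CONTROLLER_KINDS):
            blockers.append(pod.name)
    return blockers
```

With the example from the text, `pick_node_cores(3, [4, 8])` selects the four-core node. A node is a deletion candidate in this sketch only when `scale_in_blockers` returns an empty list for the pods running on it.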


  • Ensure that there are sufficient resources for installing the add-on.
  • The default node pool does not support auto scaling. For details, see Description of DefaultPool.
  • When autoscaler is used, some taints or annotations may affect auto scaling. Therefore, do not use the following taints or annotations in clusters:
    • The taint works on nodes. The Kubernetes-native autoscaler protects against abnormal scale-outs by periodically evaluating the proportion of available nodes in the cluster. When the proportion of non-ready nodes exceeds 45%, protection is triggered. In this case, all nodes with the taint in the cluster are filtered out from the autoscaler template and recorded as non-ready nodes, which affects cluster scaling.
    • The annotation works on pods, which determines whether DaemonSet pods can be evicted by autoscaler. For details, see Well-Known Labels, Annotations and Taints.
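
The effect of the taint on the 45% protection check can be illustrated with a small calculation. This is a minimal sketch under stated assumptions: the `Node` type, its `has_ignore_taint` flag, and the function name are hypothetical, and the only behavior modeled is the one described above (tainted nodes are counted as non-ready, and protection triggers when the non-ready share exceeds 45%).

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Node:
    name: str
    ready: bool = True
    has_ignore_taint: bool = False  # hypothetical flag: node carries the taint above


def scale_out_protection_triggered(nodes: List[Node], max_non_ready_ratio: float = 0.45) -> bool:
    """True when the share of non-ready nodes exceeds the protection threshold."""
    if not nodes:
        return False
    # Nodes with the taint are recorded as non-ready, even if they are healthy.
    non_ready = sum(1 for n in nodes if not n.ready or n.has_ignore_taint)
    return non_ready / len(nodes) > max_non_ready_ratio
```

In this model, a ten-node cluster in which every node is healthy but five nodes carry the taint would already trigger protection, which is why the taint should be avoided.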

Installing the Add-on

  1. Log in to the CCE console and click the cluster name to access the cluster console. Choose Add-ons in the navigation pane, locate CCE Cluster Autoscaler on the right, and click Install.
  2. On the Install Add-on page, configure the specifications.

    Table 1 Add-on configuration



    Add-on Specifications

    The add-on can be deployed in the following specifications:

    • Single: The add-on is deployed with only one pod.
    • HA50: The add-on is deployed with two pods, serving a cluster with 50 nodes and ensuring high availability.
    • HA200: The add-on is deployed with two pods, serving a cluster with 200 nodes and ensuring high availability. Each pod uses more resources than those of the HA50 specification.
    • Custom: You can customize the number of pods and specifications as required.

    When the autoscaler add-on is deployed in HA or customized mode, anti-affinity policies exist between add-on instances and the add-on instances are deployed on different nodes. Therefore, the number of available nodes in the cluster must be greater than or equal to the number of add-on instances to ensure high availability of the add-on.


    Number of pods that will be created to match the selected add-on specifications.

    If you select Custom, you can adjust the number of pods as required.


    • Preferred: Deployment pods of the add-on will be preferentially scheduled to nodes in different AZs. If all the nodes in the cluster are deployed in the same AZ, the pods will be scheduled to that AZ.
    • Required: Deployment pods of the add-on will be forcibly scheduled to nodes in different AZs. If there are fewer AZs than pods, the extra pods will fail to run.


    CPU and memory quotas of the container allowed for the selected add-on specifications.

    If you select Custom, you can adjust the container specifications as required.

  3. Configure the add-on parameters.

    Table 2 Add-on parameters




    You can select the following options as required:

    • Nodes are automatically added (from the node pool) when pods in the cluster cannot be scheduled.

      That is, when a pod is in the Pending state, automatic scale-out is performed. If a node has been configured as an affinity node for pods, no node will be automatically added when the pods cannot be scheduled. Generally, an HPA policy works with such scaling. For details, see Using HPA and CA for Auto Scaling of Workloads and Nodes.

      If this parameter is not selected, scaling can be performed only through node scaling policies.

    • Auto node scale-in
      • Node Idle Time (min): Time for which a node should be unneeded before it is eligible for scale-down. Default value: 10 minutes.
      • Scale-in Threshold: If the percentage of both requested CPU and memory on a node is below this threshold, auto scale-down will be triggered to delete the node from the cluster. The default value is 0.5, which means 50%.
      • Stabilization Window (s)

        How long after a scale-out that a scale-in evaluation resumes. Default value: 10 minutes.


        If both auto scale-out and scale-in exist in a cluster, set How long after a scale-out that a scale-in evaluation resumes to 0 minutes. This prevents node scale-in from being blocked by continuous scale-out of some node pools or by retries after a scale-out failure, which would otherwise waste node resources.

        How long after the node deletion that a scale-in evaluation resumes. Default value: 10 minutes.

        How long after a scale-in failure that a scale-in evaluation resumes. Default value: 3 minutes. For details about the impact and relationship between the scale-in cooling intervals configured in the node pool and autoscaler, see Scale-In Cool-Down Period.

      • Max. Nodes for Batch Deletion: Maximum number of empty nodes that can be deleted at the same time. Default value: 10.
        This feature applies only to idle nodes. Idle nodes can be concurrently scaled in. Nodes that are not idle can only be scaled in one by one.

        During node scale-in, a node is considered idle if none of the pods on it need to be evicted (for example, DaemonSet pods). Otherwise, the node is not idle.

      • Check Interval: Interval for rechecking a node that could not be removed previously. Default value: 5 minutes.

    Total Nodes

    Maximum number of nodes that can be managed by the cluster, within which cluster scale-out is performed.

    Total CPUs

    Maximum sum of CPU cores of all nodes in a cluster, within which cluster scale-out is performed.

    Total Memory (GB)

    Maximum sum of memory of all nodes in a cluster, within which cluster scale-out is performed.

  4. After the configuration is complete, click Install.
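
To make the parameters in step 3 more concrete, the following Python sketch shows how the scale-in threshold, node idle time, and cluster-wide limits could be evaluated. It is an illustration under stated assumptions, not the add-on's code: the type and function names are hypothetical, the ratios are computed against allocatable resources (the text above does not say which denominator is used), and each scale-out is assumed to add one node.

```python
from dataclasses import dataclass


@dataclass
class NodeUsage:
    cpu_requested: float        # cores requested by pods on the node
    cpu_allocatable: float      # assumption: ratio is taken against allocatable cores
    mem_requested_gb: float
    mem_allocatable_gb: float
    unneeded_minutes: float     # how long the node has stayed below the threshold


def is_scale_in_candidate(node: NodeUsage, threshold: float = 0.5,
                          node_idle_time_min: float = 10) -> bool:
    """Both CPU and memory request ratios must be below the threshold for long enough."""
    cpu_ratio = node.cpu_requested / node.cpu_allocatable
    mem_ratio = node.mem_requested_gb / node.mem_allocatable_gb
    below = cpu_ratio < threshold and mem_ratio < threshold
    return below and node.unneeded_minutes >= node_idle_time_min


@dataclass
class ClusterLimits:
    total_nodes: int
    total_cpus: int
    total_memory_gb: int


def scale_out_allowed(nodes: int, cpus: int, memory_gb: int,
                      new_node_cpus: int, new_node_memory_gb: int,
                      limits: ClusterLimits) -> bool:
    """Adding one node must keep the cluster within Total Nodes, CPUs, and Memory."""
    return (nodes + 1 <= limits.total_nodes
            and cpus + new_node_cpus <= limits.total_cpus
            and memory_gb + new_node_memory_gb <= limits.total_memory_gb)
```

Note that a node with low CPU requests but high memory requests is not a candidate in this sketch, because both ratios must be below the threshold.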


Table 3 autoscaler

Container Component


Resource Type


Auto scaling for Kubernetes clusters


Scale-In Cool-Down Period

Scale-in cooling intervals can be configured in the node pool settings and the autoscaler add-on settings.

Scale-in cooling interval configured in a node pool

This interval indicates the period during which nodes added to the current node pool after a scale-out operation cannot be deleted. This interval takes effect at the node pool level.

Scale-in cooling interval configured in the autoscaler add-on

The interval after a scale-out indicates the period during which the entire cluster cannot be scaled in after the autoscaler add-on triggers a scale-out (because of unschedulable pods, metrics, or scaling policies). This interval takes effect at the cluster level.

The interval after a node is deleted indicates the period during which the cluster cannot be scaled in after the autoscaler add-on triggers scale-in. This interval takes effect at the cluster level.

The interval after a failed scale-in indicates the period during which the cluster cannot be scaled in after a scale-in triggered by the autoscaler add-on fails. This interval takes effect at the cluster level.
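
One way to picture how the node pool interval and the three cluster-level intervals interact is the sketch below. It assumes that scale-in of a node is possible only after every applicable interval has elapsed; this combination rule, as well as the function and parameter names, are assumptions made for illustration and are not confirmed by this document. The defaults mirror the add-on defaults listed earlier (10, 10, and 3 minutes).

```python
from datetime import datetime, timedelta
from typing import Optional


def cluster_scale_in_resumes_at(
    last_scale_out: Optional[datetime],
    last_node_deletion: Optional[datetime],
    last_scale_in_failure: Optional[datetime],
    after_scale_out: timedelta = timedelta(minutes=10),
    after_deletion: timedelta = timedelta(minutes=10),
    after_failure: timedelta = timedelta(minutes=3),
) -> Optional[datetime]:
    """Earliest time at which cluster-level scale-in evaluation may resume."""
    deadlines = []
    if last_scale_out is not None:
        deadlines.append(last_scale_out + after_scale_out)
    if last_node_deletion is not None:
        deadlines.append(last_node_deletion + after_deletion)
    if last_scale_in_failure is not None:
        deadlines.append(last_scale_in_failure + after_failure)
    # Assumption: the longest remaining cluster-level interval wins.
    return max(deadlines) if deadlines else None


def node_can_be_scaled_in(now: datetime, node_added_at: datetime,
                          pool_cooldown: timedelta,
                          cluster_resumes_at: Optional[datetime]) -> bool:
    """The node pool interval and the cluster-level intervals must both have elapsed."""
    pool_ok = now >= node_added_at + pool_cooldown
    cluster_ok = cluster_resumes_at is None or now >= cluster_resumes_at
    return pool_ok and cluster_ok
```

In this model, a node pool interval longer than the cluster-level intervals keeps newly added nodes in that pool protected even after the rest of the cluster has become eligible for scale-in.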
