CCE Cluster Autoscaler

Introduction

The CCE Cluster Autoscaler add-on is built on the community Autoscaler component. It automatically adjusts the number of cluster nodes based on the resource needs of applications, optimizing resource utilization and performance. Autoscaler is the Kubernetes component that controls node scaling. When a cluster does not have enough node resources to schedule pods, Autoscaler adds nodes to provide resources for those pods. If the resource utilization of the added nodes later becomes low, Autoscaler automatically removes them. For details about how to implement node auto scaling, see Creating a Node Auto Scaling Policy.

Open-source community: https://github.com/kubernetes/autoscaler

How the Add-on Works

Autoscaler controls auto scale-out and scale-in.

Auto scale-out
You can choose either of the following methods:
- If a pod cannot be scheduled due to insufficient resources of worker nodes, CCE will add more nodes to the cluster. The new nodes have the same resource quotas as those configured for the node pools that the new nodes are in.
  Auto scale-out will be performed when:
  - Node resources are insufficient.
  - No node affinity policy is set in the scheduling configurations of the pod. If node affinity is configured for the pod, the system will not automatically add more nodes to the cluster. For details about how to configure node affinity policies, see Configuring Node Affinity Scheduling (nodeAffinity).
- When the cluster meets the node scaling policy, cluster scale-out is also triggered. For details, see Creating a Node Auto Scaling Policy.
The add-on follows the "No Less, No More" policy. For example, if three cores are required for creating a pod and the system supports four-core and eight-core nodes, Autoscaler will preferentially create a four-core node.
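The "No Less, No More" example above can be sketched as a smallest-fit selection. This is only an illustration that compares CPU cores. The function name pick_flavor and its inputs are hypothetical, and the actual add-on may weigh other factors such as memory and node pool priorities.

```python
from typing import Optional, Sequence


def pick_flavor(required_cores: float, flavor_cores: Sequence[int]) -> Optional[int]:
    """Return the smallest flavor (in cores) that can hold the request.

    Hypothetical helper for illustration; it only compares CPU cores.
    Returns None when no available flavor is large enough.
    """
    candidates = [cores for cores in flavor_cores if cores >= required_cores]
    return min(candidates) if candidates else None


# A pod needs three cores and the node pool offers 4-core and 8-core nodes.
print(pick_flavor(3, [4, 8]))
```

In this sketch the four-core flavor is chosen because it is the smallest one that still fits the three-core request.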
Auto scale-in
When a cluster node is idle for a period of time (10 minutes by default), cluster scale-in is triggered, and the node is automatically deleted. However, a node cannot be deleted from a cluster if the following pods exist:
- Pods that do not meet specific requirements set in Pod Disruption Budgets (PodDisruptionBudget)
- Pods that cannot be scheduled to other nodes due to constraints such as affinity and anti-affinity policies
- Pods that have the cluster-autoscaler.kubernetes.io/safe-to-evict: 'false' annotation
- Pods in the kube-system namespace on the node, except those created by DaemonSets
- Pods that are not created by a controller (Deployment, ReplicaSet, Job, or StatefulSet)
When a node meets the scale-in conditions, Autoscaler adds the DeletionCandidateOfClusterAutoscaler taint to the node in advance to prevent pods from being scheduled to the node. If the taint still exists on the node after the CCE Cluster Autoscaler add-on is uninstalled, manually delete it.
To ensure system stability and efficient resource utilization, CCE Cluster Autoscaler uses a conservative policy: nodes that are not entirely idle are drained one at a time. When these nodes host pods configured with graceful termination, draining takes longer, so the overall scale-in process may take more time to complete.
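Some of the pod-level conditions that block scale-in can be checked from a pod manifest alone. The following is a minimal sketch, assuming the pod is available as a dictionary shaped like a Kubernetes manifest. The function scale_in_blockers is hypothetical, and treating DaemonSet-owned pods as non-blocking outside kube-system is an assumption. PodDisruptionBudget and affinity checks are omitted because they need cluster-wide state.

```python
from typing import Any, Dict, List

SAFE_TO_EVICT = "cluster-autoscaler.kubernetes.io/safe-to-evict"
# Controller kinds named in the list above.
CONTROLLER_KINDS = {"Deployment", "ReplicaSet", "Job", "StatefulSet"}


def scale_in_blockers(pod: Dict[str, Any]) -> List[str]:
    """Return the reasons (possibly none) why this pod would block node scale-in."""
    meta = pod.get("metadata", {})
    annotations = meta.get("annotations") or {}
    owner_kinds = {ref.get("kind") for ref in meta.get("ownerReferences") or []}

    reasons = []
    if annotations.get(SAFE_TO_EVICT) == "false":
        reasons.append("safe-to-evict annotation is 'false'")
    if "DaemonSet" in owner_kinds:
        # Assumption: DaemonSet pods do not block scale-in by themselves.
        return reasons
    if meta.get("namespace") == "kube-system":
        reasons.append("non-DaemonSet pod in kube-system")
    if not owner_kinds & CONTROLLER_KINDS:
        reasons.append("not created by a controller")
    return reasons


bare_pod = {"metadata": {"name": "debug", "namespace": "default"}}
print(scale_in_blockers(bare_pod))
```

An empty list means that none of the checked conditions apply. It does not guarantee that the node can be removed.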

Notes and Constraints

There must be enough resources in the cluster during the add-on installation.
The default node pool does not support auto scaling. For details, see Description of DefaultPool.
Node scale-in will cause PVC/PV data loss for the local PVs associated with the node. These PVCs and PVs cannot be restored or used again. In a node scale-in, a pod that uses the local PV will be evicted from the node. A new pod will be created, but it remains in a pending state because the label of the PVC bound to it conflicts with the node label.
When CCE Cluster Autoscaler is used, some taints or annotations may affect auto scaling. Therefore, do not use the following taints or annotations in clusters:
- ignore-taint.cluster-autoscaler.kubernetes.io: The taint works on nodes. Kubernetes-native Autoscaler supports protection against abnormal scale-outs and periodically evaluates the proportion of available nodes in the cluster. When the proportion of non-ready nodes exceeds 45%, protection is triggered. In this case, all nodes with the ignore-taint.cluster-autoscaler.kubernetes.io taint in the cluster are filtered out from the Autoscaler template and recorded as non-ready nodes, which affects cluster scaling.
- cluster-autoscaler.kubernetes.io/enable-ds-eviction: The annotation works on pods and determines whether DaemonSet pods can be evicted by Autoscaler. For details, see Well-Known Labels, Annotations and Taints.
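To find existing uses of this taint and annotation, you could scan exported node and pod manifests. The following is a minimal sketch, assuming the objects have already been loaded as dictionaries (for example, from JSON exported by kubectl). Both helper functions are hypothetical, and the taint key is matched as a prefix on the assumption that a suffix may follow it.

```python
from typing import Any, Dict, Iterable, List

IGNORE_TAINT_PREFIX = "ignore-taint.cluster-autoscaler.kubernetes.io"
DS_EVICTION_ANNOTATION = "cluster-autoscaler.kubernetes.io/enable-ds-eviction"


def nodes_with_ignore_taint(nodes: Iterable[Dict[str, Any]]) -> List[str]:
    """Return names of nodes that carry a taint whose key starts with the prefix."""
    flagged = []
    for node in nodes:
        taints = node.get("spec", {}).get("taints") or []
        if any(str(t.get("key", "")).startswith(IGNORE_TAINT_PREFIX) for t in taints):
            flagged.append(node["metadata"]["name"])
    return flagged


def pods_with_ds_eviction_annotation(pods: Iterable[Dict[str, Any]]) -> List[str]:
    """Return names of pods that set the enable-ds-eviction annotation."""
    flagged = []
    for pod in pods:
        annotations = pod.get("metadata", {}).get("annotations") or {}
        if DS_EVICTION_ANNOTATION in annotations:
            flagged.append(pod["metadata"]["name"])
    return flagged


nodes = [
    {"metadata": {"name": "node-a"}, "spec": {"taints": [{"key": IGNORE_TAINT_PREFIX + "/demo"}]}},
    {"metadata": {"name": "node-b"}, "spec": {}},
]
print(nodes_with_ignore_taint(nodes))
```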

Installing the Add-on

Log in to the CCE console and click the cluster name to access the cluster console.
In the navigation pane, choose Add-ons. Locate CCE Cluster Autoscaler on the right and click Install.

On the Install Add-on page, configure the specifications as needed.

There are three types of preset specifications based on the cluster scale. You can select one as required. The system will configure the number of pods and resource quotas for the add-on based on the selected preset specifications. You can see the configurations on the console.

If your cluster is large and the preset specifications do not meet your needs, you can customize the resource specifications and estimate the number of pods in the cluster to better determine the memory usage of the add-on. The recommended memory request and limit in typical large clusters can be calculated as follows:

Memory request = Number of pods × Size of the pod YAML files (KB)/10000 × 0.28 GiB + 1 GiB
Memory limit = Memory request + 2 GiB

For example, if there are 20,000 pods and the YAML file size of each pod is 10 KB, the memory request would be 6.6 GiB (2 × 10 × 0.28 GiB + 1 GiB) and the memory limit would be 8.6 GiB (6.6 GiB + 2 GiB). (The calculated values may differ from the recommendations listed in Table 1. You can refer to that table or use these formulas.)
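The two formulas can be written as a small helper to try other cluster sizes. This is a minimal sketch. The function name autoscaler_memory_gib is hypothetical, and the results are rounded to two decimal places for readability.

```python
from typing import Tuple


def autoscaler_memory_gib(pod_count: int, pod_yaml_kb: float) -> Tuple[float, float]:
    """Return (memory request, memory limit) in GiB using the formulas above."""
    request = pod_count * pod_yaml_kb / 10000 * 0.28 + 1
    limit = request + 2
    return round(request, 2), round(limit, 2)


# 20,000 pods with a 10 KB YAML file each, as in the example above.
print(autoscaler_memory_gib(20000, 10))  # (6.6, 8.6)
```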

**Table 1** Recommended add-on memory size in large-scale scenarios
Number of Pods (10 KB for Each Pod YAML)	Recommended Memory Request	Recommended Memory Limit
10000	4 GiB	6 GiB
30000	8 GiB	10 GiB
50000	16 GiB	18 GiB
80000	24 GiB	26 GiB
100000	28 GiB	30 GiB

Configure deployment policies for the add-on pods.

Scheduling policies do not take effect on add-on pods of the DaemonSet type.
When configuring multi-AZ deployment or node affinity, ensure that there are nodes meeting the scheduling policy and that resources are sufficient in the cluster. Otherwise, the add-on cannot run.

**Table 2** Configurations for add-on scheduling
Parameter	Description
Multi-AZ Deployment	Preferred: Deployment pods of the add-on will be preferentially scheduled to nodes in different AZs. If all the nodes in the cluster are deployed in the same AZ, the pods will be scheduled to different nodes in that AZ. Equivalent mode: Deployment pods of the add-on are evenly scheduled to the nodes in the cluster in each AZ. If a new AZ is added, you are advised to increase add-on pods for cross-AZ HA deployment. With the Equivalent multi-AZ deployment, the difference between the number of add-on pods in different AZs will be less than or equal to 1. If resources in one of the AZs are insufficient, pods cannot be scheduled to that AZ. Forcible: Deployment pods of the add-on are forcibly scheduled to nodes in different AZs. There can be at most one pod in each AZ. If nodes in a cluster are not in different AZs, some add-on pods cannot run properly. If a node is faulty, add-on pods on it may fail to be migrated.
Node Affinity	Not configured: Node affinity is disabled for the add-on. Specify node: Specify the nodes where the add-on is deployed. If you do not specify the nodes, the add-on will be randomly scheduled based on the default cluster scheduling policy. Specify node pool: Specify the node pool where the add-on is deployed. If you do not specify the node pools, the add-on will be randomly scheduled based on the default cluster scheduling policy. Customize affinity: Enter the labels of the nodes where the add-on is to be deployed for more flexible scheduling policies. If you do not specify node labels, the add-on will be randomly scheduled based on the default cluster scheduling policy. If multiple custom affinity policies are configured, ensure that there are nodes that meet all the affinity policies in the cluster. Otherwise, the add-on cannot run.
Toleration	Taints and tolerations together allow (but do not force) the add-on Deployment to be scheduled to a node with matching taints, and control how the Deployment is evicted after the node where it runs is tainted. By default, the add-on adds tolerations for the node.kubernetes.io/not-ready and node.kubernetes.io/unreachable taints, each with a tolerance time window of 60s. For details, see Configuring Tolerance Policies.
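For reference, the default tolerations described in Table 2 would look roughly like the following when written as Kubernetes toleration entries. This is a sketch, not the add-on's actual manifest. The Exists operator and the NoExecute effect are assumptions (tolerationSeconds applies only to NoExecute taints), and the helper name default_tolerations is hypothetical.

```python
import json
from typing import Any, Dict, List


def default_tolerations(seconds: int = 60) -> List[Dict[str, Any]]:
    """Build toleration entries for the two taints named in Table 2."""
    keys = ("node.kubernetes.io/not-ready", "node.kubernetes.io/unreachable")
    return [
        {
            "key": key,
            "operator": "Exists",    # assumption: tolerate any value of the taint
            "effect": "NoExecute",   # assumption: needed for tolerationSeconds to apply
            "tolerationSeconds": seconds,
        }
        for key in keys
    ]


print(json.dumps(default_tolerations(), indent=2))
```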

After the configuration is complete, click Install.

Components

**Table 3** Add-on components
Component	Description	Resource Type
Autoscaler	Auto scaling for Kubernetes clusters	Deployment

Helpful Links

After the add-on is installed, you can create node scaling policies to increase or decrease the number of nodes. For details, see Creating a Node Auto Scaling Policy.
If a node pool contains multiple node flavors, there are priorities for auto scaling. For details, see Priorities for Scaling Node Pools.

Change History

**Table 4** CCE Cluster Autoscaler add-on adapted to clusters v1.31
Add-on Version	Supported Cluster Version	New Feature	Community Version
1.31.13	v1.31	The scale-down delay and scale-down utilization thresholds can be configured for node pools.	1.31.1
1.31.8	v1.31	CCE clusters v1.31 are supported.	1.31.1

**Table 5** CCE Cluster Autoscaler add-on adapted to clusters v1.30
Add-on Version	Supported Cluster Version	New Feature	Community Version
1.30.51	v1.30	The scale-down delay and scale-down utilization thresholds can be configured for node pools.	1.30.1
1.30.18	v1.30	Fixed some issues.	1.30.1
1.30.15	v1.30	Clusters v1.30 are supported. Added the name of the target node pool to the events.	1.30.1

**Table 6** CCE Cluster Autoscaler add-on adapted to clusters v1.29
Add-on Version	Supported Cluster Version	New Feature	Community Version
1.29.84	v1.29	The scale-down delay and scale-down utilization thresholds can be configured for node pools.	1.29.1
1.29.53	v1.29	Fixed some issues.	1.29.1
1.29.50	v1.29	Added the name of the target node pool to the events.	1.29.1
1.29.17	v1.29	Optimized events.	1.29.1
1.29.13	v1.29	Clusters v1.29 are supported.	1.29.1

**Table 7** CCE Cluster Autoscaler add-on adapted to clusters v1.28
Add-on Version	Supported Cluster Version	New Feature	Community Version
1.28.123	v1.28	The scale-down delay and scale-down utilization thresholds can be configured for node pools.	1.28.1
1.28.91	v1.28	Fixed some issues.	1.28.1
1.28.88	v1.28	Added the name of the target node pool to the events.	1.28.1
1.28.55	v1.28	Optimized events.	1.28.1
1.28.51	v1.28	Optimized the logic for generating alarms when resources in a node pool are sold out.	1.28.1
1.28.22	v1.28	Fixed some issues.	1.28.1
1.28.17	v1.28	Fixed the issue that scale-in cannot be performed when there are custom pod controllers in a cluster.	1.28.1

**Table 8** CCE Cluster Autoscaler add-on adapted to clusters v1.27
Add-on Version	Supported Cluster Version	New Feature	Community Version
1.27.155	v1.27	The scale-down delay and scale-down utilization thresholds can be configured for node pools.	1.27.1
1.27.122	v1.27	Fixed some issues.	1.27.1
1.27.119	v1.27	Added the name of the target node pool to the events.	1.27.1
1.27.88	v1.27	Optimized events.	1.27.1
1.27.84	v1.27	Optimized the logic for generating alarms when resources in a node pool are sold out.	1.27.1
1.27.55	v1.27	Fixed some issues.	1.27.1
1.27.51	v1.27	Fixed some issues.	1.27.1
1.27.14	v1.27	Fixed the scale-in failure of nodes of different specifications in the same node pool and unexpected PreferNoSchedule taint issues.	1.27.1

**Table 9** CCE Cluster Autoscaler add-on adapted to clusters v1.25
Add-on Version	Supported Cluster Version	New Feature	Community Version
1.25.184	v1.25	The scale-down delay and scale-down utilization thresholds can be configured for node pools.	1.25.0
1.25.153	v1.25	Fixed some issues.	1.25.0
1.25.152	v1.25	Added the name of the target node pool to the events.	1.25.0
1.25.120	v1.25	Optimized events.	1.25.0
1.25.116	v1.25	Optimized the logic for generating alarms when resources in a node pool are sold out.	1.25.0
1.25.88	v1.25	Fixed some issues.	1.25.0
1.25.84	v1.25	Fixed some issues.	1.25.0
1.25.46	v1.25	Fixed the scale-in failure of nodes of different specifications in the same node pool and unexpected PreferNoSchedule taint issues.	1.25.0
1.25.34	v1.25	Optimized the method of identifying GPUs and NPUs. Used the remaining node quota of a cluster for the extra nodes that are beyond the cluster scale.	1.25.0
1.25.21	v1.25	Fixed the issue that the autoscaler's least-waste is disabled by default. Fixed the issue that if a node scale-out failed in a node pool, the same operation cannot be performed in another node pool and the add-on has to restart. The default taint tolerance duration is changed to 60s. Fixed the issue that scale-out is still triggered after the scale-out rule is disabled.	1.25.0
1.25.11	v1.25	Supported anti-affinity scheduling of add-on pods on nodes in different AZs. Added the tolerance time during which the pods with temporary storage volumes can be unscheduled. Fixed the issue that the number of node pools cannot be restored when scaling group resources are insufficient.	1.25.0
1.25.7	v1.25	CCE clusters v1.25 are supported. Modified the memory request and limit of a customized flavor. Allowed a node pool with auto scaling disabled to report a scaling failure event. Fixed the bug that NPU node scale-out is triggered again during scale-out.	1.25.0

**Table 10** CCE Cluster Autoscaler add-on adapted to clusters v1.23
Add-on Version	Supported Cluster Version	New Feature	Community Version
1.23.157	v1.23	Fixed some issues.	1.23.0
1.23.156	v1.23	Added the name of the target node pool to the events.	1.23.0
1.23.125	v1.23	Optimized events.	1.23.0
1.23.121	v1.23	Optimized the logic for generating alarms when resources in a node pool are sold out.	1.23.0
1.23.95	v1.23	Fixed some issues.	1.23.0
1.23.91	v1.23	Fixed some issues.	1.23.0
1.23.54	v1.23	Fixed the scale-in failure of nodes of different specifications in the same node pool and unexpected PreferNoSchedule taint issues.	1.23.0
1.23.44	v1.23	Optimized the method of identifying GPUs and NPUs. Used the remaining node quota of a cluster for the extra nodes that are beyond the cluster scale.	1.23.0
1.23.31	v1.23	Fixed the issue that the autoscaler's least-waste is disabled by default. Fixed the issue that if a node scale-out failed in a node pool, the same operation cannot be performed in another node pool and the add-on has to restart. The default taint tolerance duration is changed to 60s. Fixed the issue that scale-out is still triggered after the scale-out rule is disabled.	1.23.0
1.23.21	v1.23	Supported anti-affinity scheduling of add-on pods on nodes in different AZs. Added the tolerance time during which the pods with temporary storage volumes can be unscheduled. Fixed the issue that the number of node pools cannot be restored when scaling group resources are insufficient.	1.23.0
1.23.17	v1.23	Supported NPUs and secure containers. Supported node scaling policies without a step. Fixed a bug so that deleted node pools are automatically removed. Supported priority-based scheduling. Supported the emptyDir scheduling policy. Fixed a bug so that scale-in can be triggered on the nodes whose capacity is lower than the scale-in threshold when the node scaling policy is disabled. Modified the memory request and limit of a customized flavor. Allowed a node pool with auto scaling disabled to report a scaling failure event. Fixed the bug that NPU node scale-out is triggered again during scale-out.	1.23.0
1.23.10	v1.23	Optimized logging. Supported scale-in waiting so that operations such as data dump can be performed before a node is deleted.	1.23.0
1.23.9	v1.23	Added the nodenetworkconfigs.crd.yangtse.cni resource object permission.	1.23.0
1.23.8	v1.23	Fixed the issue that scale-out fails when the number of nodes to be added at a time exceeds the upper limit in periodic scale-outs.	1.23.0
1.23.7	v1.23	None	1.23.0
1.23.3	v1.23	CCE clusters v1.23 are supported.	1.23.0

**Table 11** CCE Cluster Autoscaler add-on adapted to clusters v1.21
Add-on Version	Supported Cluster Version	New Feature	Community Version
1.21.114	v1.21	Optimized the logic for generating alarms when resources in a node pool are sold out.	1.21.0
1.21.89	v1.21	Fixed some issues.	1.21.0
1.21.86	v1.21	Fixed the issue that the node pool auto scaling cannot meet expectations after AZ topology constraints are configured for nodes.	1.21.0
1.21.51	v1.21	Fixed the scale-in failure of nodes of different specifications in the same node pool and unexpected PreferNoSchedule taint issues.	1.21.0
1.21.43	v1.21	Optimized the method of identifying GPUs and NPUs. Used the remaining node quota of a cluster for the extra nodes that are beyond the cluster scale.	1.21.0
1.21.29	v1.21	Supported anti-affinity scheduling of add-on pods on nodes in different AZs. Added the tolerance time during which the pods with temporary storage volumes can be unscheduled. Fixed the issue that the number of node pools cannot be restored when scaling group resources are insufficient. Fixed the issue that if a node scale-out failed in a node pool, the same operation cannot be performed in another node pool and the add-on has to restart. The default taint tolerance duration is changed to 60s. Fixed the issue that scale-out is still triggered after the scale-out rule is disabled.	1.21.0
1.21.20	v1.21	Supported anti-affinity scheduling of add-on pods on nodes in different AZs. Added the tolerance time during which the pods with temporary storage volumes can be unscheduled. Fixed the issue that the number of node pools cannot be restored when scaling group resources are insufficient.	1.21.0
1.21.16	v1.21	Supported NPUs and secure containers. Supported node scaling policies without a step. Fixed a bug so that deleted node pools are automatically removed. Supported priority-based scheduling. Supported the emptyDir scheduling policy. Fixed a bug so that scale-in can be triggered on the nodes whose capacity is lower than the scale-in threshold when the node scaling policy is disabled. Modified the memory request and limit of a customized flavor. Allowed a node pool with auto scaling disabled to report a scaling failure event. Fixed the bug that NPU node scale-out is triggered again during scale-out.	1.21.0
1.21.9	v1.21	Optimized logging. Supported scale-in waiting so that operations such as data dump can be performed before a node is deleted.	1.21.0
1.21.8	v1.21	Added the nodenetworkconfigs.crd.yangtse.cni resource object permission.	1.21.0
1.21.6	v1.21	Fixed the issue that authentication fails due to incorrect signature in the add-on request retries.	1.21.0
1.21.4	v1.21	Fixed the issue that authentication fails due to incorrect signature in the add-on request retries.	1.21.0
1.21.2	v1.21	Fixed the issue that auto scaling may be blocked due to a failure in deleting an unregistered node.	1.21.0
1.21.1	v1.21	Fixed the issue that the node pool modification in the existing periodic auto scaling rule does not take effect.	1.21.0

**Table 12** CCE Cluster Autoscaler add-on adapted to clusters v1.19
Add-on Version	Supported Cluster Version	New Feature	Community Version
1.19.76	v1.19	Optimized the method of identifying GPUs and NPUs. Used the remaining node quota of a cluster for the extra nodes that are beyond the cluster scale.	1.19.0
1.19.56	v1.19	Fixed the scale-in failure of nodes of different specifications in the same node pool and unexpected PreferNoSchedule taint issues.	1.19.0
1.19.48	v1.19	Optimized the method of identifying GPUs and NPUs. Used the remaining node quota of a cluster for the extra nodes that are beyond the cluster scale.	1.19.0
1.19.35	v1.19	Supported anti-affinity scheduling of add-on pods on nodes in different AZs. Added the tolerance time during which the pods with temporary storage volumes can be unscheduled. Fixed the issue that the number of node pools cannot be restored when scaling group resources are insufficient. Fixed the issue that if a node scale-out failed in a node pool, the same operation cannot be performed in another node pool and the add-on has to restart. The default taint tolerance duration is changed to 60s. Fixed the issue that scale-out is still triggered after the scale-out rule is disabled.	1.19.0
1.19.27	v1.19	Supported anti-affinity scheduling of add-on pods on nodes in different AZs. Added the tolerance time during which the pods with temporary storage volumes can be unscheduled. Fixed the issue that the number of node pools cannot be restored when scaling group resources are insufficient.	1.19.0
1.19.22	v1.19	Supported NPUs and secure containers. Supported node scaling policies without a step. Fixed a bug so that deleted node pools are automatically removed. Supported priority-based scheduling. Supported the emptyDir scheduling policy. Fixed a bug so that scale-in can be triggered on the nodes whose capacity is lower than the scale-in threshold when the node scaling policy is disabled. Modified the memory request and limit of a customized flavor. Allowed a node pool with auto scaling disabled to report a scaling failure event. Fixed the bug that NPU node scale-out is triggered again during scale-out.	1.19.0
1.19.14	v1.19	Optimized logging. Supported scale-in waiting so that operations such as data dump can be performed before a node is deleted.	1.19.0
1.19.13	v1.19	Fixed the issue that scale-out fails when the number of nodes to be added at a time exceeds the upper limit in periodic scale-outs.	1.19.0
1.19.12	v1.19	Fixed the issue that authentication fails due to incorrect signature in the add-on request retries.	1.19.0
1.19.11	v1.19	Fixed the issue that authentication fails due to incorrect signature in the add-on request retries.	1.19.0
1.19.9	v1.19	Fixed the issue that auto scaling may be blocked due to a failure in deleting an unregistered node.	1.19.0
1.19.8	v1.19	Fixed the issue that the node pool modification in the existing periodic auto scaling rule does not take effect.	1.19.0
1.19.7	v1.19	Performed a regular upgrade of add-on dependencies.	1.19.0
1.19.6	v1.19	Fixed the issue that repeated scale-out is triggered when taints are asynchronously updated.	1.19.0
1.19.3	v1.19	Supported scheduled scaling policies based on the total number of nodes, CPU limit, and memory limit and fixed other functional defects.	1.19.0

**Table 13** CCE Cluster Autoscaler add-on adapted to clusters v1.17
Add-on Version	Supported Cluster Version	New Feature	Community Version
1.17.27	v1.17	Optimized logging. Fixed a bug so that deleted node pools are automatically removed. Supported priority-based scheduling. Fixed the issue that taints on newly added nodes are overwritten. Fixed a bug so that scale-in can be triggered on the nodes whose capacity is lower than the scale-in threshold when the node scaling policy is disabled. Modified the memory request and limit of a customized flavor. Allowed a node pool with auto scaling disabled to report a scaling failure event.	1.17.0
1.17.22	v1.17	Optimized logging.	1.17.0
1.17.21	v1.17	Fixed the issue that scale-out fails when the number of nodes to be added at a time exceeds the upper limit in periodic scale-outs.	1.17.0
1.17.19	v1.17	Fixed the issue that authentication fails due to incorrect signature in the add-on request retries.	1.17.0
1.17.17	v1.17	Fixed the issue that auto scaling may be blocked due to a failure in deleting an unregistered node.	1.17.0
1.17.16	v1.17	Fixed the issue that the node pool modification in the existing periodic auto scaling rule does not take effect.	1.17.0
1.17.15	v1.17	Unified resource specification configuration unit.	1.17.0
1.17.14	v1.17	Fixed the issue that repeated scale-out is triggered when taints are asynchronously updated.	1.17.0
1.17.8	v1.17	Fixed bugs.	1.17.0
1.17.7	v1.17	Added log content and fixed bugs.	1.17.0
1.17.5	v1.17	Supported clusters v1.17 and allowed scaling events to be displayed on the CCE console.	1.17.0
1.17.2	v1.17	Clusters v1.17 are supported.	1.17.0