
Updated on 2026-05-26 GMT+08:00

Creating a Node Auto Scaling Policy

When many applications and services run in a cluster, the compute resources of its nodes are fixed while the load is dynamic. This can cause the following problems:

During peak hours, if the node quantity is insufficient, workload pods cannot run, resulting in slow response and request timeout.
During off-peak hours, if there are a large number of nodes, a lot of compute resources are wasted.

These problems degrade application performance and user experience, and they increase O&M complexity and costs. Kubernetes provides node auto scaling (Cluster Autoscaler), which automatically adds or removes nodes based on the cluster load. CCE provides node auto scaling through the CCE Cluster Autoscaler add-on. Nodes with different flavors can be automatically added or removed across AZs on demand.

Prerequisites

Before using node auto scaling, you must install the CCE Cluster Autoscaler add-on of v1.13.8 or later in the cluster.

To use node flavor priorities, the Autoscaler version must be 1.19.35, 1.21.28, 1.23.30, 1.25.20, or later. To balance load among AZs, the version must be 1.23.122, 1.25.117, 1.27.85, 1.28.52, or later.
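
The version requirements above can be checked with a short script. The following is a minimal sketch, not a CCE API: the function and constant names are hypothetical, and it assumes that each listed version is the minimum for its own major.minor release line, that release lines newer than the newest listed one also qualify, and that unlisted older or intermediate release lines do not.

```python
# Minimum add-on versions per release line, copied from the prerequisites above.
FLAVOR_PRIORITY_MIN = ["1.19.35", "1.21.28", "1.23.30", "1.25.20"]
AZ_BALANCE_MIN = ["1.23.122", "1.25.117", "1.27.85", "1.28.52"]


def parse(version):
    """Turn 'v1.25.20' or '1.25.20' into the tuple (1, 25, 20)."""
    return tuple(int(part) for part in version.lstrip("v").split("."))


def meets_minimum(installed, minimums):
    """Return True if the installed version satisfies one of the minimums.

    Assumption: each minimum applies to its own major.minor release line,
    and any release line newer than the newest listed one also qualifies.
    """
    have = parse(installed)
    floors = {parse(m)[:2]: parse(m) for m in minimums}
    line = have[:2]
    if line in floors:
        return have >= floors[line]
    return line > max(floors)
```

For example, `meets_minimum("1.25.19", FLAVOR_PRIORITY_MIN)` is false under these assumptions because the 1.25 release line requires 1.25.20 or later.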

Constraints

If there are no nodes in a node pool, the CCE Cluster Autoscaler add-on cannot obtain the CPU or memory data of the node, and the node auto scaling rule triggered based on CPU or memory metrics will not take effect.
If the GPU or NPU driver is not installed, the CCE Cluster Autoscaler add-on considers the node unavailable, preventing CPU- or memory-based auto scaling rules from taking effect.
When CCE Cluster Autoscaler is used, some taints or annotations may affect auto scaling. Therefore, do not use the following taints or annotations in clusters:
- ignore-taint.cluster-autoscaler.kubernetes.io: The taint works on nodes. Kubernetes-native Autoscaler supports protection against abnormal scale-outs and periodically evaluates the proportion of available nodes in the cluster. When the proportion of non-ready nodes exceeds 45%, protection will be triggered. In this case, all nodes with the ignore-taint.cluster-autoscaler.kubernetes.io taint in the cluster are filtered out from the Autoscaler template and recorded as non-ready nodes, which affects cluster scaling.
- cluster-autoscaler.kubernetes.io/enable-ds-eviction: The annotation works on pods, which determines whether DaemonSet pods can be evicted by Autoscaler. For details, see Well-Known Labels, Annotations and Taints.
If the CCE Cluster Autoscaler add-on is installed, node pool auto scaling cannot be triggered when some node affinity policies of workloads cause pod scheduling failures. The following node labels should not be used for node affinity policies of the workloads in the cluster: node.kubernetes.io/subnetid, os.architecture, os.name, and os.version.
During the initial scale-out, resources are calculated based on estimates. During subsequent scale-outs, calculations use actual resources reported by the node. Due to system background noise (such as underlying architecture or OS differences), the initial scale-out may provision excess nodes in extreme scenarios. Enable automatic scale-in or manually remove excess nodes as needed. If workloads fail to schedule due to insufficient resources, switch to a higher-specification flavor group.

Configuring a Node Pool Auto Scaling Policy

Log in to the CCE console and click the cluster name to access the cluster console.
In the navigation pane, choose Nodes. On the Node Pools tab, locate the row containing the target node pool and click Auto Scaling.
- If CCE Cluster Autoscaler is not installed, click Install, configure add-on parameters based on service requirements, click Install, and wait until the add-on is installed. For details about add-on configurations, see CCE Cluster Autoscaler.
- If CCE Cluster Autoscaler has been installed, configure scaling policies.

Configure auto scaling policies.

Auto Scaling Configuration

Custom rules: Click Add Rule. In the dialog box displayed, configure parameters. You can add multiple node scaling rules, including a maximum of one CPU-based rule and one memory-based rule. The total number of rules cannot exceed 10.

The following table lists custom rules.

**Table 1** Custom rules
Rule Type	Configuration
Metric-based	Trigger: Select CPU allocation rate or Memory allocation rate and enter a value. The rate must be greater than the value specified in the node resource requirements for a node scale-in when you configure a scaling policy (Configuring an Auto Scaling Policy for a Cluster).
	NOTE:
	- Resource allocation rate (%) = Resources requested by pods in the node pool/Resources allocatable to pods in the node pool
	- If multiple rules meet scaling conditions, the rules are executed in either of the following modes:
	  - If rules based on the CPU allocation rate and memory allocation rate are configured and two or more rules meet the scale-out conditions, the rule that will add the most nodes will be executed.
	  - If a rule based on the CPU allocation rate and a periodic rule are configured and both the rules meet the scale-out conditions, the periodic rule executed early changes the node pool to the scaling state. As a result, the metric-based rule cannot be executed. After the periodic rule is executed and the node pool status becomes normal, the metric-based rule will not be executed. If the metric-based rule is executed early, the periodic rule will be executed after the metric-based rule is executed.
	- If rules based on the CPU allocation rate and memory allocation rate are configured, the policy detection period varies with the processing logic of each loop of the CCE Cluster Autoscaler add-on. A scale-out is triggered once the conditions are met, but it is constrained by other factors such as the cooldown period and node pool status.
	- If the number of nodes reaches the cluster scale upper limit, the maximum node quantity range of the node pool, or the maximum node quantity range defined by the specification, metric-based scale-out will not be triggered.
	- If the number of nodes, the number of vCPUs, or the amount of memory reaches the upper limit for a node scale-out, a metric-based scale-out will not be triggered.
	Action: Configure an action to be performed when the triggering conditions are met.
	- Custom: Plan the number of nodes added to a node pool.
	- Auto calculation: When the triggering conditions are met, nodes are automatically added and the allocation rate is restored to a value lower than the threshold. The formula is as follows: The number of nodes to be added = Resource requests of pods in the node pool/(Available resources of a single node x Target allocation rate) - The number of current nodes + 1
Periodic	Trigger Time: You can select a specific time every day, every week, every month, or every year. Action: specifies an action to be carried out when the trigger time is reached. Plan the number of nodes added to the node pool.
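
The allocation rate and the auto calculation formula in Table 1 can be worked through with a small example. This is a sketch under stated assumptions rather than the add-on's implementation: the function names are hypothetical, the quotient is assumed to be rounded down to a whole number of nodes (the formula does not state the rounding), and the result is clamped at zero for convenience. Quantities are integers (for example, CPU in millicores) so that the arithmetic is exact.

```python
def allocation_rate_percent(requested, allocatable):
    """Resource allocation rate (%) = requested / allocatable * 100."""
    return 100 * requested / allocatable


def nodes_to_add(requested, per_node, target_percent, current_nodes):
    """Apply the auto calculation formula from Table 1.

    Assumption: the quotient is rounded down before the current node count
    is subtracted. A negative result is clamped to 0 (also an assumption).
    """
    needed = (requested * 100) // (per_node * target_percent)
    return max(needed - current_nodes + 1, 0)


# Pods request 60,000 millicores on 8 nodes with 8,000 allocatable millicores each.
print(allocation_rate_percent(60000, 8 * 8000))   # 93.75
print(nodes_to_add(60000, 8000, 70, 8))           # 3
print(allocation_rate_percent(60000, 11 * 8000))  # about 68.2, below the 70% target
```

In this example, adding 3 nodes brings the CPU allocation rate from 93.75% to roughly 68%, which is below the 70% target, matching the description of auto calculation.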

Nodes: specifies the minimum number of nodes during scale-in and the maximum number of nodes during scale-out. The number of nodes in a node pool will always be within the specified node range during auto scaling, but this setting does not apply to manual scaling.
For example, if the number of nodes in a node pool has reached the maximum, auto scaling cannot be triggered, but manual scaling is still allowed.
Cooldown Period: a period during which the nodes added to the current node pool cannot be scaled in.

Auto Scaling Object

Specifications: Configure whether to enable auto scaling for node flavors in a node pool.

If multiple flavors are configured for a node pool, you can specify the number of nodes and the scaling priority for each flavor. Multiple flavors are supported only by add-ons v1.21.18, v1.23.19, v1.25.9, or later.

View cluster-level auto scaling configurations, which take effect for all node pools in the cluster. On the Policies page, you can only view cluster-level auto scaling policies. To modify these policies, go to the Settings page. For details, see Configuring an Auto Scaling Policy for a Cluster.
Click OK.

Configuring an Auto Scaling Policy for a Cluster

An auto scaling policy applies to all node pools in a cluster. After the policy is modified, the Autoscaler add-on will be restarted.

Log in to the CCE console and click the cluster name to access the cluster console.
In the navigation pane, choose Settings. Then click the Auto Scaling tab.
- If CCE Cluster Autoscaler is not installed, configure add-on parameters based on service requirements, click Install, and wait until the add-on is installed. For details about add-on configurations, see CCE Cluster Autoscaler.
- If CCE Cluster Autoscaler has been installed, configure scaling policies.
Configure auto scale-out.
- Auto Scale-out when the load cannot be scheduled: When workload pods in a cluster cannot be scheduled (pods remain in the pending state), CCE automatically adds nodes to the node pool. If a pod has been scheduled to a node, the node will not be involved in an auto scale-out. Such auto scaling typically works with an HPA policy. For details, see Using an HPA and a CA for Auto Scaling of Workloads and Nodes.
  If this function is not enabled, custom scaling rules are the only option for performing a scale-out.
- Upper limit of resources to be expanded: the upper limits for the cluster's resources, such as the number of nodes, the number of vCPUs, and the amount of memory. Once an upper limit is reached, no new nodes will be automatically added.
- Scale-Out Priority: You can drag and drop the node pools in a list to adjust their scale-out priorities.

Configure auto scale-in. Auto scale-in is disabled by default. After it is enabled, you can configure Node Scale-In Conditions and Node Scale-In Policy. If the nodes in the cluster meet the scale-in conditions, the nodes are removed automatically.

Node Scale-In Conditions

By default, nodes in a cluster are evaluated against the default scale-in conditions. If custom scale-in conditions are specified for a node pool, the nodes in that node pool are evaluated against the custom scale-in conditions instead.

**Table 2** Node scale-in conditions
Parameter	Description
Default Scale-In Conditions	If the CPU and memory allocation rates of a node are lower than a certain percentage (50% by default) for a period of time (10 minutes by default), or the node is unavailable for a period of time (20 minutes by default), the node will be scaled in.
	Allocation rate = Total requested resources of all pods/Allocatable resources on the node
	- If the option Exclude CPU and memory resources pre-allocated to DaemonSet pods is selected, CCE will not consider the CPU and memory resources pre-allocated to DaemonSet pods when determining whether to scale in cluster nodes. This means that the resources used by DaemonSet pods will not affect the scale-in decision.
	- If this option is not selected, the resources pre-allocated to DaemonSet pods will be included in the resource allocation calculations. This can cause the CPU and memory allocation rates to exceed the node scale-in threshold, potentially preventing nodes with low CPU and memory utilization from being scaled in.
(Optional) Custom Scale-In Conditions	You can configure scale-in conditions for each node pool. If the CPU and memory allocation rates of nodes in a node pool are lower than a certain percentage (50% by default) for a period of time (10 minutes by default), the node pool will be scaled in. Custom scale-in conditions are supported when the CCE Cluster Autoscaler add-on version is v1.25.181, v1.27.152, v1.28.120, v1.29.81, v1.30.48, v1.31.10, or later. If the auto scaling function is not enabled for all flavors in a node pool, custom scale-in conditions configured for the node pool do not take effect. For details about how to enable the auto scaling function for a node pool, see Configuring a Node Pool Auto Scaling Policy.
Scale-In Exception Scenarios	In any of the following exception scenarios, CCE will not scale in a node even if the node resources or status meets scale-in conditions:
	- Resources on other nodes in the cluster are insufficient.
	- Scale-in protection is enabled on the node. To enable or disable node scale-in protection, choose Nodes in the navigation pane and then click the Nodes tab. Locate the target node and choose More > Enable Scale-in Protection or Disable Scale-in Protection in the Operation column.
	- There is a pod with the non-scale label on the node.
	- Policies such as reliability have been configured on some containers on the node.
	- There are non-DaemonSet pods in the kube-system namespace on the node.
	- (Optional) A container managed by a third-party pod controller is running on a node. Third-party pod controllers are for custom workloads except Kubernetes-native workloads such as Deployments and StatefulSets. Such controllers can be created using CustomResourceDefinitions.
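
The effect of the Exclude CPU and memory resources pre-allocated to DaemonSet pods option on the allocation rate in Table 2 can be illustrated as follows. This is a simplified sketch with hypothetical names and data shapes: it models the option by leaving DaemonSet pod requests out of the numerator, and it does not model the observation period or the exception scenarios.

```python
def node_allocation_rates(pods, allocatable, exclude_daemonsets=False):
    """Return (cpu_rate, memory_rate) in percent for one node.

    pods: list of dicts with 'cpu' and 'memory' requests and an 'is_daemonset' flag.
    allocatable: dict with the node's allocatable 'cpu' and 'memory'.
    """
    counted = [p for p in pods if not (exclude_daemonsets and p["is_daemonset"])]
    cpu = sum(p["cpu"] for p in counted)
    memory = sum(p["memory"] for p in counted)
    return 100 * cpu / allocatable["cpu"], 100 * memory / allocatable["memory"]


def below_scale_in_threshold(pods, allocatable, threshold=50, exclude_daemonsets=False):
    """True if both CPU and memory allocation rates are under the threshold."""
    cpu_rate, mem_rate = node_allocation_rates(pods, allocatable, exclude_daemonsets)
    return cpu_rate < threshold and mem_rate < threshold


# One application pod and one DaemonSet pod (CPU in millicores, memory in MiB).
node = {"cpu": 4000, "memory": 8192}
pods = [
    {"cpu": 1500, "memory": 2048, "is_daemonset": False},
    {"cpu": 800, "memory": 1024, "is_daemonset": True},
]
print(below_scale_in_threshold(pods, node))                           # False
print(below_scale_in_threshold(pods, node, exclude_daemonsets=True))  # True
```

With DaemonSet requests counted, the CPU allocation rate is 57.5%, which is above the default 50% threshold, so the node is not a scale-in candidate. With them excluded, the rate drops to 37.5% and the node qualifies.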

Node Scale-In Policy

**Table 3** Node scale-in policy configurations
Item	Description	Default Value
Max Nodes for Batch Deletion	Maximum number of idle nodes that can be deleted concurrently. The actual number of nodes that can be deleted concurrently is also affected by the flow control value of the node deletion API. Only idle nodes can be concurrently scaled in. Nodes that are not idle can only be scaled in one by one. NOTE: During a node scale-in, a node is considered idle if none of its pods need to be evicted (DaemonSet pods, for example, do not need to be evicted). Otherwise, the node is not idle.	10
Check Period	Interval at which a node can be checked again after it is determined that the node cannot be scaled in	5 minutes
Cooldown Duration (If multiple cooldown durations are active in a cluster, scale-in evaluation will only resume after all cooldown durations have completed.)	The length of time before CCE starts evaluating scale-in again after an auto scale-in in a cluster.	10 minutes
	The length of time before CCE starts evaluating scale-in again after an auto scale-out in a cluster. NOTE: If both auto scale-outs and scale-ins exist in a cluster, set this parameter to 0 minutes. This prevents unexpected waste of node resources caused by a blocked node scale-in due to continuous scale-outs of some node pools or retries upon a scale-out failure.	10 minutes
	The length of time before CCE starts evaluating scale-in again after an auto scale-in failure in a cluster.	3 minutes

Click Confirm Settings.

Cooldown Period

The impact and relationship between the two cooldown periods configured for a node pool are as follows:

Cooldown Period During a Scale-out

This period controls how long nodes that a scale-out added to the current node pool cannot be removed. This setting applies to the node pool.

Cooldown Period During a Scale-in

The period after a scale-out controls how long the cluster cannot be scaled in after the CCE Cluster Autoscaler add-on triggers a scale-out (due to unschedulable pods, metrics, or periodic scaling policies). This setting applies to the cluster.

The period after a node is removed controls how long the cluster cannot be scaled in after the CCE Cluster Autoscaler add-on triggers a scale-in. This setting applies to the cluster.

The period after a failed scale-in controls how long the cluster cannot be scaled in after a scale-in triggered by the CCE Cluster Autoscaler add-on fails. This setting applies to the cluster.
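
Because scale-in evaluation resumes only after every active cluster-level cooldown has finished, the effective wait is set by the cooldown that expires last. The following sketch illustrates this with the default durations from Table 3. The event names and the function are hypothetical and are not part of the add-on.

```python
from datetime import datetime, timedelta

# Default cooldown durations from Table 3, keyed by the event that starts them.
COOLDOWNS = {
    "scale_out": timedelta(minutes=10),
    "scale_in": timedelta(minutes=10),
    "scale_in_failure": timedelta(minutes=3),
}


def scale_in_resumes_at(last_events, cooldowns=COOLDOWNS):
    """Return the earliest time scale-in evaluation can resume.

    last_events maps an event name to the time it last happened. Every
    active cooldown must finish, so the latest expiry wins.
    """
    expiries = [when + cooldowns[name] for name, when in last_events.items()]
    return max(expiries) if expiries else None


t0 = datetime(2026, 1, 1, 12, 0)
events = {"scale_out": t0, "scale_in_failure": t0 + timedelta(minutes=8)}
print(scale_in_resumes_at(events))  # 2026-01-01 12:11:00
```

Here the scale-out cooldown ends at 12:10, but a scale-in failure at 12:08 starts a 3-minute cooldown that ends at 12:11, so evaluation resumes at 12:11.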

Period for CCE Cluster Autoscaler to Retry a Scale-out

If a node pool failed to be scaled out due to reasons such as insufficient resources or quota or an error during node installation, the CCE Cluster Autoscaler add-on can retry the scale-out in the same node pool or another node pool. The retry period varies depending on failure causes:

When resources in a node pool are sold out or the user quota is insufficient, the CCE Cluster Autoscaler add-on cools down the node pool for 5 minutes, 10 minutes, or 20 minutes. The maximum cooldown period is 30 minutes. Then, the CCE Cluster Autoscaler add-on switches to another node pool in the next 10 seconds for a scale-out until the desired nodes are added or all node pools have been cooled down.
If there is an error during node installation in a node pool, the node pool will enter a 5-minute cooldown period. After the period expires, the CCE Cluster Autoscaler add-on can trigger a node pool scale-out again. If the faulty node is automatically reclaimed, the CCE Cluster Autoscaler add-on re-evaluates the cluster status within 1 minute and triggers a node pool scale-out as needed.
During a node pool scale-out, if a node remains in the installing state for a long time, the CCE Cluster Autoscaler add-on tolerates the node for a maximum of 15 minutes. After the tolerance period expires, the CCE Cluster Autoscaler add-on re-evaluates the cluster status and triggers a node pool scale-out as needed.

Example YAML

The following is a YAML example of a node scaling policy:

apiVersion: autoscaling.cce.io/v1alpha1
kind: HorizontalNodeAutoscaler
metadata:
  name: xxxx
  namespace: kube-system
spec:
  disable: false
  rules:
  - action:
      type: ScaleUp
      unit: Node
      value: 1
    cronTrigger:
      schedule: 47 20 * * *
    disable: false
    ruleName: cronrule
    type: Cron
  - action:
      type: ScaleUp
      unit: Node
      value: 2
    disable: false
    metricTrigger:
      metricName: Cpu
      metricOperation: '>'
      metricValue: "40"
      unit: Percent
    ruleName: metricrule
    type: Metric
  targetNodepoolIds:
  - 7d48eca7-3419-11ea-bc29-0255ac1001a8
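
The schedule value in the example, 47 20 * * *, can be read field by field. The sketch below assumes the standard five-field cron layout (minute, hour, day of month, month, day of week); this page only states that the value is a cron expression, and it does not state the time zone. The helper name is hypothetical.

```python
CRON_FIELDS = ("minute", "hour", "day_of_month", "month", "day_of_week")


def describe_schedule(schedule):
    """Split a five-field cron expression into named fields."""
    parts = schedule.split()
    if len(parts) != len(CRON_FIELDS):
        raise ValueError(f"expected {len(CRON_FIELDS)} fields, got {len(parts)}")
    return dict(zip(CRON_FIELDS, parts))


print(describe_schedule("47 20 * * *"))
```

Under that assumption, the example rule fires at minute 47 of hour 20 every day, that is, daily at 20:47.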

**Table 4** Key parameters
Parameter	Type	Description
spec.disable	Bool	Whether the scaling policy is disabled. This parameter takes effect for all rules in the policy.
spec.rules	Array	All rules in a scaling policy.
spec.rules[x].ruleName	String	Rule name.
spec.rules[x].type	String	Rule type. Cron and Metric are supported.
spec.rules[x].disable	Bool	Rule switch. Currently, only false is supported.
spec.rules[x].action.type	String	Rule action type. Currently, only ScaleUp is supported.
spec.rules[x].action.unit	String	Rule action unit. Currently, only Node is supported.
spec.rules[x].action.value	Integer	Rule action value.
spec.rules[x].cronTrigger	N/A	Optional. This parameter is valid only in periodic rules.
spec.rules[x].cronTrigger.schedule	String	Cron expression of a periodic rule.
spec.rules[x].metricTrigger	N/A	Optional. This parameter is valid only in metric-based rules.
spec.rules[x].metricTrigger.metricName	String	Metric of a metric-based rule. Currently, Cpu and Memory are supported.
spec.rules[x].metricTrigger.metricOperation	String	Comparison operator of a metric-based rule. Only > is supported.
spec.rules[x].metricTrigger.metricValue	String	Threshold of the metric-based rule. The value must be a string that contains an integer from 1 to 100. If the value is set to -1, the threshold is automatically calculated.
spec.rules[x].metricTrigger.unit	String	Unit of the metric-based rule threshold. Only percent (%) is supported. The example above sets this field to Percent.
spec.targetNodepoolIds	Array	All node pools associated with the scaling policy.
spec.targetNodepoolIds[x]	String	UID of the node pool associated with the scaling policy.
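
The constraints in Table 4, together with the rule limits described for custom rules (at most one CPU-based rule, one memory-based rule, and 10 rules in total), can be checked before a policy is applied. The following is a minimal sketch with hypothetical function names. It validates a Python representation of spec.rules, it is not a CCE tool, and it assumes the rule limits apply to a single policy object.

```python
def valid_metric_value(value):
    """metricValue must be a string: an integer from 1 to 100, or "-1"."""
    if not isinstance(value, str):
        return False
    if value == "-1":
        return True
    return value.isascii() and value.isdigit() and 1 <= int(value) <= 100


def validate_rules(rules):
    """Return a list of problems found in spec.rules (empty if none)."""
    problems = []
    if len(rules) > 10:
        problems.append("more than 10 rules")
    names = [r.get("metricTrigger", {}).get("metricName")
             for r in rules if r.get("type") == "Metric"]
    for name in ("Cpu", "Memory"):
        if names.count(name) > 1:
            problems.append(f"more than one {name} rule")
    for r in rules:
        label = r.get("ruleName", "<unnamed>")
        action = r.get("action", {})
        if r.get("type") not in ("Cron", "Metric"):
            problems.append(f"{label}: type must be Cron or Metric")
        if action.get("type") != "ScaleUp" or action.get("unit") != "Node":
            problems.append(f"{label}: action must be ScaleUp with unit Node")
        if r.get("type") == "Metric":
            trigger = r.get("metricTrigger", {})
            if trigger.get("metricName") not in ("Cpu", "Memory"):
                problems.append(f"{label}: metricName must be Cpu or Memory")
            if trigger.get("metricOperation") != ">":
                problems.append(f"{label}: metricOperation must be >")
            if not valid_metric_value(trigger.get("metricValue")):
                problems.append(f"{label}: metricValue must be a string 1-100 or -1")
    return problems


# The two rules from the example policy above, as Python dictionaries.
example_rules = [
    {"ruleName": "cronrule", "type": "Cron", "disable": False,
     "action": {"type": "ScaleUp", "unit": "Node", "value": 1},
     "cronTrigger": {"schedule": "47 20 * * *"}},
    {"ruleName": "metricrule", "type": "Metric", "disable": False,
     "action": {"type": "ScaleUp", "unit": "Node", "value": 2},
     "metricTrigger": {"metricName": "Cpu", "metricOperation": ">",
                       "metricValue": "40", "unit": "Percent"}},
]
print(validate_rules(example_rules))  # []
```

A common mistake this catches is writing metricValue as a number (40) instead of a string ("40").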

Common Issues

Why is the "no scale task needed with desired node count n" event reported?
CCE Cluster Autoscaler caches node pool information. This may trigger "no scale task needed with desired node count n." This event does not affect any node pool and can be safely ignored.
