
autoscaler

Introduction

Autoscaler is an important Kubernetes controller. It supports microservice scaling and is key to serverless design.

When the CPU or memory usage of a microservice is too high, horizontal pod autoscaling is triggered to add pods and reduce the load. When the load drops, the added pods can be automatically removed, allowing the microservice to run as efficiently as possible.
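
In Kubernetes, this behavior is typically declared through a HorizontalPodAutoscaler resource. The following is a minimal sketch, assuming an existing Deployment named my-service; the replica range and the 70% CPU target are illustrative values, not CCE defaults:

  apiVersion: autoscaling/v2
  kind: HorizontalPodAutoscaler
  metadata:
    name: my-service-hpa
  spec:
    scaleTargetRef:
      apiVersion: apps/v1
      kind: Deployment
      name: my-service           # the workload to scale (assumed to exist)
    minReplicas: 2               # pods kept when the load is low
    maxReplicas: 10              # upper bound when the load is high
    metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70 # add pods when average CPU usage exceeds 70%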

CCE simplifies the creation, upgrade, and manual scaling of Kubernetes clusters, but the traffic load on a cluster changes over time. To balance node resource usage and workload performance, Kubernetes provides the autoscaler add-on, which automatically resizes a cluster based on the resources required by the workloads deployed in it. For details, see Creating a Node Scaling Policy.

Open source community: https://github.com/kubernetes/autoscaler

How the Add-on Works

autoscaler controls auto scale-out and scale-in.

  • Auto scale-out

    If pods in a cluster cannot be scheduled due to insufficient worker nodes, cluster scaling is triggered to add nodes. The nodes to be added have the same specification as configured for the node pool to which the nodes belong. For details, see Creating a Node Scaling Policy.

    The add-on follows the "No Less, No More" policy. For example, if three cores are required for creating a pod and the system supports four-core and eight-core nodes, autoscaler will preferentially create a four-core node.

    Auto scale-out will be performed when:

    • Node resources are insufficient.
    • No node affinity policy is set in the pod scheduling configuration. That is, if a node has been configured as an affinity node for pods, no node will be automatically added when those pods cannot be scheduled. For details about how to configure a node affinity policy, see Node Affinity. (A pinned pod of this kind appears in the sketch at the end of this section.)
  • Auto scale-in

    When a cluster node is idle for a period of time (10 minutes by default), cluster scale-in is triggered, and the node is automatically deleted. However, a node cannot be deleted from a cluster if the following pods exist:

    • Pods whose eviction would violate the requirements set in a PodDisruptionBudget
    • Pods that cannot be scheduled to other nodes due to constraints such as affinity and anti-affinity policies
    • Pods that have the cluster-autoscaler.kubernetes.io/safe-to-evict: 'false' annotation
    • Pods (except those created by kube-system DaemonSet) that exist in the kube-system namespace on the node
    • Pods that are not created by a controller (Deployment, ReplicaSet, Job, or StatefulSet)
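
The sketch below illustrates some of these constraints with minimal manifests (all names and images are illustrative):

  # 1. A PodDisruptionBudget: autoscaler cannot evict pods if doing so
  #    would leave fewer than minAvailable replicas of the app.
  apiVersion: policy/v1
  kind: PodDisruptionBudget
  metadata:
    name: my-service-pdb
  spec:
    minAvailable: 2
    selector:
      matchLabels:
        app: my-service
  ---
  # 2. The safe-to-evict annotation prevents autoscaler from removing
  #    the node this pod runs on.
  apiVersion: v1
  kind: Pod
  metadata:
    name: critical-pod
    annotations:
      cluster-autoscaler.kubernetes.io/safe-to-evict: "false"
  spec:
    containers:
    - name: app
      image: nginx:alpine
  ---
  # 3. Required node affinity pins a pod to a specific node. Per the
  #    scale-out conditions above, no node is added for such a pod, and
  #    during scale-in it cannot be rescheduled to other nodes.
  apiVersion: v1
  kind: Pod
  metadata:
    name: pinned-pod
  spec:
    affinity:
      nodeAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          nodeSelectorTerms:
          - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
              - node-192-168-0-10   # illustrative node name
    containers:
    - name: app
      image: nginx:alpine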

Notes and Constraints

  • Ensure that there are sufficient resources for installing the add-on.
  • Only pay-per-use VM nodes can be added or removed by autoscaler.
  • The default node pool does not support auto scaling. For details, see Description of DefaultPool.

Installing the Add-on

  1. Log in to the CCE console. In the navigation pane, choose Add-ons. On the Add-on Marketplace tab page, click Install Add-on under autoscaler.
  2. On the Install Add-on page, select the cluster and the add-on version, and click Next: Configuration.
  3. Configure add-on installation parameters listed in Table 1.

    Table 1 Basic settings

    Add-on Specifications (available in all versions)

    The add-on can be deployed in the following specifications:

    • Single: The add-on is deployed with only one pod.
    • HA50: The add-on is deployed with two pods, serving a cluster with 50 nodes and ensuring high availability.
    • HA200: The add-on is deployed with two pods, serving a cluster with 200 nodes and ensuring high availability. Each pod uses more resources than those of the HA50 specification.
    • Custom: You can customize the number of pods and specifications as required.

    Instances (available in all versions)

    Number of pods that will be created to match the selected add-on specifications. The number cannot be modified.

    Container (available in all versions)

    CPU and memory quotas of the container allowed for the selected add-on specifications. The quotas cannot be modified.

    Login Mode (available only in certain versions)

    Select a login mode for the worker nodes to be added during auto scale-out. Password and key pair login modes are supported.

    If you select Password:

    • Password: Set a password for logging in to the added worker nodes as user root.
    • Confirm Password: Enter the password again.

    If you select Key pair:

    Key pair: Select an existing key pair or create a new one for identity authentication during remote login to the added nodes.

    Auto Scale-In (available in all versions)

    Off: Auto scale-in is disabled. Only auto scale-out is allowed.

    On: Auto scale-in is enabled. The scale-in settings apply to all node pools in the cluster that have auto scaling enabled.

    • Idle Time (min): How long a node should be unneeded before it is eligible for scale-in. Default value: 10 minutes.
    • Resource Usage: If both the CPU and memory usage of a node are below this threshold, auto scale-in is triggered to delete the node from the cluster. The default value is 0.5, which means 50%.
    • Scale-in Cooldown After Scale-out: How long after a scale-out before scale-in evaluation resumes. Default value: 10 minutes.
      NOTE:

      If a cluster both scales out and scales in, you are advised to set Scale-in Cooldown After Scale-out to 0 minutes. This prevents scale-in from being blocked by continuous scale-out of some node pools or by retries after a scale-out failure, which would waste node resources.

    • Scale-in Cooldown After Node Deletion: How long after a node is deleted before scale-in evaluation resumes. Default value: 10 minutes.
    • Scale-in Cooldown After Failure: How long after a scale-in failure before scale-in evaluation resumes. Default value: 3 minutes. For details about how the scale-in cooldown intervals configured in the node pool and in autoscaler interact, see Scale-in Cooling Interval.
    • Max empty bulk delete: The maximum number of empty nodes that can be deleted at the same time. Default value: 10.
    • Node Recheck Timeout: How long autoscaler waits before rechecking a node that it previously could not remove. Default value: 5 minutes.
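
    For reference, these settings correspond approximately to the following command-line options of the open-source cluster-autoscaler (a sketch of a container command fragment; the mapping is an assumption based on the upstream project linked above, and CCE may wire these options differently):

      # Fragment of an upstream cluster-autoscaler container command;
      # the values mirror the defaults described above.
      command:
      - ./cluster-autoscaler
      - --scale-down-enabled=true               # Auto Scale-In: On
      - --scale-down-unneeded-time=10m          # Idle Time (min)
      - --scale-down-utilization-threshold=0.5  # Resource Usage
      - --scale-down-delay-after-add=10m        # Scale-in Cooldown After Scale-out
      - --scale-down-delay-after-delete=10m     # Scale-in Cooldown After Node Deletion
      - --scale-down-delay-after-failure=3m     # Scale-in Cooldown After Failure
      - --max-empty-bulk-delete=10              # Max empty bulk delete
      - --unremovable-node-recheck-timeout=5m   # Node Recheck Timeout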

    Node Pool Configuration (available only in certain versions)

    Configuration of the default node pool. A node pool is a group of compute nodes with the same node type (VM or BMS), specifications, and labels. When the cluster needs to be scaled out, autoscaler automatically adds nodes from node pools. If no custom node pool is available, autoscaler uses the default node pool.

    Click Add Node Pool Configuration and set the following parameters:

    • AZ: A physical location where resources use independent power supplies and networks. AZs are physically isolated from one another but interconnected through an internal network.
    • OS: OS of the nodes to be created.
    • Taints: No taints are added by default.
      Taints allow nodes to repel a set of pods. You can add a maximum of 10 taints for each node pool. Each taint contains the following parameters:
      • Key: A key must contain 1 to 63 characters starting with a letter or digit. Only letters, digits, hyphens (-), underscores (_), and periods (.) are allowed. A DNS subdomain name can be used as the prefix of a key.
      • Value: A value must start with a letter or digit and can contain a maximum of 63 characters, including letters, digits, hyphens (-), underscores (_), and periods (.).
      • Effect: Available options are NoSchedule, PreferNoSchedule, and NoExecute.
      NOTICE:
      • If taints are used, you must configure tolerations in the YAML files of pods (see the sketch after this list). Otherwise, scale-out may fail or pods cannot be scheduled onto the added nodes.
      • Taints cannot be modified after configuration. Incorrect taints may cause a scale-out failure or prevent pods from being scheduled onto the added nodes.
    • Resource Tags: Resource tags can be added to classify resources.
      NOTE:

      You can create predefined tags in Tag Management Service (TMS). Predefined tags are visible to all service resources that support the tagging function. You can use these tags to improve tagging and resource migration efficiency.

    • Specifications: CPU and memory of the added nodes.
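
    If taints are configured, pods intended for the scaled-out nodes need a matching toleration, as noted above. A minimal sketch, assuming a node pool taint with key dedicated, value gpu, and effect NoSchedule (all illustrative values):

      apiVersion: v1
      kind: Pod
      metadata:
        name: gpu-workload      # illustrative name
      spec:
        tolerations:
        - key: dedicated        # must match the taint key
          operator: Equal
          value: gpu            # must match the taint value
          effect: NoSchedule    # must match the taint effect
        containers:
        - name: app
          image: nginx:alpine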

    To configure more add-on parameters, click Advanced Settings at the bottom of this page.

    Table 2 Advanced settings

    Total Nodes (available in all versions)

    Maximum number of nodes that can be managed by the cluster, within which cluster scale-out is performed.

    Total Cores (available in all versions)

    Maximum sum of CPU cores of all nodes in a cluster, within which cluster scale-out is performed.

    Total Memory (GB) (available in all versions)

    Maximum sum of memory of all nodes in a cluster, within which cluster scale-out is performed.

    Auto Scale-Out (available only in certain versions)

    Triggered when there are pods unscheduled: Selected by default. If this option is selected, scale-out is triggered when pods in the cluster cannot be scheduled.
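
    In the open-source cluster-autoscaler, the equivalent limits are set with the following options (a sketch; the values are illustrative, and the core and memory limits use the upstream min:max format):

      # Fragment of an upstream cluster-autoscaler container command.
      command:
      - ./cluster-autoscaler
      - --max-nodes-total=50    # Total Nodes
      - --cores-total=0:200     # Total Cores (min:max)
      - --memory-total=0:1000   # Total Memory in GiB (min:max)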

  4. When the configuration is complete, click Install.

    After the add-on is installed, click Go Back to Previous Page. On the Add-on Instance tab page, select the corresponding cluster to view the running instance. If the instance is in the Running state, the add-on has been installed in the cluster.

Upgrading the Add-on

  1. Log in to the CCE console. In the navigation pane, choose Add-ons. On the Add-on Instance tab page, click Upgrade under autoscaler.

    • If the Upgrade button is unavailable, the current add-on is already up-to-date and no upgrade is required.
    • If the Upgrade button is available, click Upgrade to upgrade the add-on.
    • During the upgrade, the autoscaler add-on of the original version on cluster nodes will be discarded, and the add-on of the target version will be installed.

  2. In the dialog box displayed, set parameters and upgrade the add-on. For details about the parameters, see the parameter description in Installing the Add-on.

Uninstalling the Add-on

  1. Log in to the CCE console. In the navigation pane, choose Add-ons. On the Add-on Instance tab page, select the target cluster and click Uninstall under autoscaler.
  2. In the dialog box displayed, click Yes to uninstall the add-on.