Using Karpenter for Auto Scaling of Nodes
Karpenter is a dynamic, high-performance, open-source cluster auto scaling solution for Kubernetes. It aims to use the right number of nodes at the right time, simplifying Kubernetes infrastructure management. Compared with Cluster Autoscaler, Karpenter reduces the resource scaling time from minutes to seconds, significantly improving workload efficiency in clusters and lowering costs.
Karpenter:
- Monitors pods marked as unschedulable. Pods may become unschedulable for reasons such as insufficient CPU or memory resources, unmet selector conditions, mismatched node taints and tolerations, or occupied host ports.
- Evaluates the scheduling requirements of the unschedulable pods.
- Provisions new nodes that meet the requirements of those pods.
- Deletes nodes when they are no longer needed, for example, when nodes are idle or resources expire.
Core Features and Advantages
- Event-driven, rapid scale-out: Traditional Cluster Autoscaler (CA) relies on periodic polling and cloud auto-scaling groups, which typically requires three to five minutes to add new nodes. Karpenter abandons this model and instead listens continuously for pod scheduling events within the cluster, achieving millisecond-level response times. It evaluates the resource requirements of pending pods directly and calls cloud provider APIs without intermediate layers. This enables rapid provisioning of the most suitable compute instances to handle traffic spikes and high-concurrency workloads.
- Node pool-free architecture: Karpenter bypasses the constraints of traditional fixed node pools. Through declarative NodePool policies, it enables flexible configuration of AZs, instance architectures, and billing models. The system accurately parses pod requirements for CPU, memory, and scheduling constraints such as affinity and tolerations. Combined with FlexusX instances, Karpenter can configure CPU-to-memory ratios on demand to ensure that node specifications precisely match actual service requirements. This prevents resource overcommitment and waste at the source.
- Continuous intelligent consolidation: Karpenter manages efficiency throughout the entire cluster lifecycle, not merely scale-out speed. The system continuously evaluates real-time resource utilization. With a consolidation policy such as WhenEmptyOrUnderutilized enabled, Karpenter automatically reclaims completely idle nodes and proactively evicts and migrates pods from underutilized nodes in a controlled manner. By aggregating scattered workloads onto fewer or more cost-effective instances, Karpenter minimizes the cost of idle resources.
- Capacity and cost awareness:
- Karpenter integrates with the cloud provider's billing model. During scale-out, it performs multi-dimensional cost calculations based on real-time instance pricing and intelligently selects the instance combination with the lowest unit cost that meets pod requirements. It also natively supports flexible hybrid deployment of on-demand and spot instances, minimizing overall compute costs.
- Karpenter also maintains real-time awareness of cloud provider capacity. If the preferred instance type is sold out, triggering a resource insufficiency error, Karpenter automatically retries and seamlessly falls back to other available instance types or AZs within milliseconds. This eliminates the scaling bottlenecks caused by single node pool shortages in traditional CA implementations and ensures service continuity.
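The cost-aware selection and sold-out fallback described above can be pictured with a small sketch. It is not Karpenter's actual algorithm: the `InstanceType` class, the `pick_instance` function, and the catalog sizes and prices are all hypothetical and only illustrate the idea of choosing the cheapest instance type that fits a request and moving to the next option when the preferred type is unavailable.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InstanceType:
    name: str
    cpu: float           # vCPUs
    memory_gib: float    # GiB
    hourly_price: float  # hypothetical price in arbitrary units


def pick_instance(candidates, cpu_request, memory_request_gib, sold_out=frozenset()):
    """Return the cheapest candidate that fits the request and is not sold out.

    Returns None when no candidate qualifies.
    """
    usable = [
        it for it in candidates
        if it.name not in sold_out
        and it.cpu >= cpu_request
        and it.memory_gib >= memory_request_gib
    ]
    return min(usable, key=lambda it: it.hourly_price, default=None)


# Hypothetical catalog: names, sizes, and prices are illustrative only.
CATALOG = [
    InstanceType("small", 2, 8, 0.10),
    InstanceType("medium", 4, 16, 0.19),
    InstanceType("large", 8, 32, 0.37),
]

first_choice = pick_instance(CATALOG, cpu_request=3, memory_request_gib=6)
fallback = pick_instance(CATALOG, cpu_request=3, memory_request_gib=6, sold_out={"medium"})
print(first_choice.name, fallback.name)
```

In this sketch the second call simulates the preferred type being sold out, so the next cheapest type that still fits is returned instead of failing the scale-out.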
Prerequisites
- A cluster that meets Karpenter's prerequisites is available, with worker nodes provisioned for Karpenter to run on.
- EIPs are bound to the nodes so that images can be pulled from the Internet during chart installation.
- The Karpenter container requires Internet access. In a standard cluster, bind an EIP to the node where Karpenter runs. In a Turbo cluster, bind an EIP to the Karpenter pod.
Notes and Constraints
- Karpenter requires the corresponding cloud service provider permissions to create or delete nodes. Currently, Karpenter uses Access Key and Secret Key (AK and SK) credentials. In the future, Karpenter will be available as a system add-on in the CCE add-on marketplace and support custom add-on agencies or pod identity authentication.
- Both Karpenter and CCE Cluster Autoscaler are node-scaling add-ons. They cannot be installed together, as doing so may cause mutual interference.
- Karpenter cannot scale in the nodes where CCE system add-on pods are running.
Deploying Karpenter
- Obtain a chart.
Go to the chart page, select a suitable version, and download the Helm chart in .tgz format. This section uses chart version 0.2.1 as an example, which applies to CCE clusters of v1.29 or later. Configuration items may vary between chart versions, so the configuration in this section applies only to chart version 0.2.1.
- Upload the chart.
- Log in to the CCE console and click the cluster name to access the cluster console. In the navigation pane, choose App Templates and click Upload Chart in the upper right corner.
- Click Add, select the chart to be uploaded, and click Upload.

- Specify the value.yaml file.
You can create a value.yaml configuration file on the local PC to configure workload installation parameters. During the installation, you only need to import this configuration file for custom installation. Other unspecified parameters will use the default settings.
The settings are as follows:

# Default values for karpenter-provider-huawei

# -- Number of controller replicas
replicaCount: 1

# Controller image configuration
image:
  # -- Controller image repository
  repository: swr.ap-southeast-3.myhuaweicloud.com/huaweiclouddeveloper/cce/karpenter/controller
  # -- Controller image tag
  tag: "0.2.1"
  # -- Image pull policy
  pullPolicy: IfNotPresent

# kube-rbac-proxy sidecar configuration
rbacProxy:
  image:
    # -- kube-rbac-proxy image repository
    repository: quay.io/brancz/kube-rbac-proxy
    # -- kube-rbac-proxy image tag
    tag: "v0.16.0"
    # -- Image pull policy
    pullPolicy: IfNotPresent

# -- Image pull secrets for controller pods
imagePullSecrets: []
# - name: registry-credentials

# -- Name prefix for all resources
namePrefix: "karpenter-provider-huawei-"

serviceAccount:
  # -- Create ServiceAccount
  create: true
  # -- ServiceAccount name
  name: controller-manager

# Controller arguments
controller:
  # -- Metrics port
  metricsPort: 8080
  # -- Health probe port
  healthProbePort: 8081

# Huawei Cloud credentials
# The generated Secret uses HUAWEICLOUD_SDK_AK, HUAWEICLOUD_SDK_SK,
# and HUAWEICLOUD_SDK_REGION_ID from this block, plus
# HUAWEICLOUD_SDK_CCE_CLUSTER_ID from clusterInfo.clusterID.
# If credentials.create=false, the existing Secret should provide the same keys.
credentials:
  # -- Create a Secret for credentials
  create: true
  # -- Secret name for Huawei Cloud credentials
  name: "huawei-credentials"
  # -- Use an existing Secret when credentials.create is false
  existingSecret: ""
  # -- Huawei Cloud access key
  accessKey: "your-access-key"
  # -- Huawei Cloud secret key
  secretKey: "your-secret-key"
  # -- Huawei Cloud region ID
  region: "your-region-id"

clusterInfo:
  # -- Huawei Cloud CCE cluster ID
  clusterID: "your-cluster-id"
  # -- Cluster category: Optional. Enter "eni" for Turbo network types.
  # For other network types (vpc-router or overlay_l2), enter other values.
  category: ""
  # -- yangtseEipInfo: Supports user-defined EIPs (Elastic IPs) bound to Karpenter pods.
  # This only takes effect when clusterInfo.category is set to "eni".
  yangtseEipInfo:
    yangtse.io/pod-with-eip: "true"

# Controller resources
resources:
  limits:
    cpu: "1"
    memory: 512Mi
  requests:
    cpu: 200m
    memory: 256Mi

# -- Pod security context
podSecurityContext:
  runAsNonRoot: true
  seccompProfile:
    type: RuntimeDefault

# -- Container security context for manager
securityContext:
  readOnlyRootFilesystem: true
  allowPrivilegeEscalation: false
  capabilities:
    drop:
      - "ALL"

# -- Node selector for controller pods
nodeSelector: {}

# -- Tolerations for controller pods
tolerations: []

# -- Affinity for controller pods
affinity: {}

- If clusterInfo.category is set to eni, an EIP is automatically bound to a Karpenter pod. You are advised to bind an EIP to a pod in a CCE Turbo cluster. You can configure more EIP parameters in yangtseEipInfo. For details about parameter settings, see Configuring an EIP for a Pod in a CCE Turbo Cluster.
- Parameters such as clusterInfo.clusterID, credentials.accessKey, credentials.secretKey, and credentials.region are mandatory. Change other parameter values based on service requirements.
- Create a release.
- Log in to the CCE console and click the target cluster name. In the left navigation pane, choose App Templates.
- Locate the uploaded chart and click Install.
- Configure Release Name, Namespace, and Select Version.
- Click Add next to Configuration File, select the YAML file created locally, and click Install.

- On the Releases tab, view the status of the release.
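Before the configuration file is imported, it can help to confirm that the mandatory parameters (clusterInfo.clusterID, credentials.accessKey, credentials.secretKey, and credentials.region) are no longer empty or left at their placeholder values. The following is a minimal sketch under stated assumptions: the `missing_mandatory` helper is hypothetical, and the `values` dictionary stands in for the parsed value.yaml content (parsing the file itself would need a YAML library, which is not shown here).

```python
# Placeholders in the sample file all start with "your-" (for example "your-access-key").
PLACEHOLDER_PREFIX = "your-"

# (section, key) pairs that the chart documentation marks as mandatory.
MANDATORY_KEYS = [
    ("clusterInfo", "clusterID"),
    ("credentials", "accessKey"),
    ("credentials", "secretKey"),
    ("credentials", "region"),
]


def missing_mandatory(values):
    """Return dotted names of mandatory settings that are empty or still placeholders."""
    problems = []
    for section, key in MANDATORY_KEYS:
        value = values.get(section, {}).get(key, "")
        if not value or str(value).startswith(PLACEHOLDER_PREFIX):
            problems.append(f"{section}.{key}")
    return problems


# Stand-in for the parsed value.yaml; all values here are illustrative.
values = {
    "credentials": {
        "accessKey": "AK-EXAMPLE",
        "secretKey": "your-secret-key",
        "region": "ap-southeast-3",
    },
    "clusterInfo": {"clusterID": "", "category": "eni"},
}
print(missing_mandatory(values))
```

Here the empty cluster ID and the untouched secret key placeholder are reported, while the other two settings pass.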

Performing Verification
- Create NodePool and CCENodeClass resources. For details about the resource parameters, see Configuration Parameters.
apiVersion: karpenter.k8s.huawei/v1alpha1
kind: CCENodeClass
metadata:
  name: demo-cce-elastic-cpu
spec:
  # Add one subnet per target AZ if you want this demo NodePool to launch
  # across multiple zones.
  subnetSelectorTerms:
    - id: "30abcb7b-ddeb-4a93-9e83-0b21f59b07a3"
  imsSelector:
    imsFamily: "Huawei Cloud EulerOS 2.0"
  blockDeviceMappings:
    root:
      volumeSize: 40
      volumeType: SSD
    k8s:
      volumeSize: 100
      volumeType: SSD
  runtimeConfiguration:
    type: containerd
  login:
    userPassword:
      username: root
      password: "JDYk*****"
---
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: demo-cce-elastic-cpu
spec:
  disruption:
    budgets:
      - nodes: "1"
    consolidationPolicy: WhenEmptyOrUnderutilized
    consolidateAfter: 30s
  limits:
    # Allow enough headroom for the 600-replica GSS validation to scale past 4 x 12c nodes.
    cpu: "100"
    memory: 384Gi
  template:
    metadata:
      labels:
        demo.huawei.com/scenario: cpu-elastic
        demo.huawei.com/nodepool-profile: shared-burst-and-consolidation
    spec:
      nodeClassRef:
        group: karpenter.k8s.huawei
        kind: CCENodeClass
        name: demo-cce-elastic-cpu
      requirements:
        - key: kubernetes.io/arch
          operator: In
          values:
            - amd64
        - key: kubernetes.io/os
          operator: In
          values:
            - linux
        - key: karpenter.sh/capacity-type
          operator: In
          values:
            - on-demand
        # Keep zone unconstrained so Karpenter can use every AZ exposed by the
        # selected CCENodeClass subnets. Re-add a topology.kubernetes.io/zone
        # requirement if you need to pin this demo to specific AZs.
        - key: node.kubernetes.io/instance-type
          operator: In
          values:
            - c9.large.4
            - c9.xlarge.4
            - c9.2xlarge.4
            - c9.4xlarge.4

- Create a Deployment to verify that Karpenter can provision nodes.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cpu-burst
  namespace: default
spec:
  replicas: 0
  selector:
    matchLabels:
      app: cpu-burst
  template:
    metadata:
      labels:
        app: cpu-burst
    spec:
      terminationGracePeriodSeconds: 10
      nodeSelector:
        demo.huawei.com/scenario: cpu-elastic
      containers:
        - name: web
          image: nginx:1.27-alpine
          imagePullPolicy: IfNotPresent
          ports:
            - containerPort: 80
              name: http
          resources:
            requests:
              cpu: "1400m"
              memory: "1200Mi"
            limits:
              cpu: "1400m"
              memory: "1200Mi"

- Scale out the Deployment.
kubectl scale deployment cpu-burst --replicas=10
After the Deployment is scaled out, the newly started pods cannot be scheduled onto the existing nodes and remain pending. Once Karpenter has created new nodes, the pods are scheduled onto them.
$ kubectl get node
NAME            STATUS   ROLES    AGE     VERSION
192.168.1.15    Ready    <none>   3h28m   v1.33.5-r20-33.0.4.9
192.168.1.158   Ready    <none>   42m     v1.33.5-r20-33.0.4.9
192.168.1.168   Ready    <none>   4m7s    v1.33.5-r20-33.0.4.9
$ kubectl get pod -l app=cpu-burst
NAME                         READY   STATUS    RESTARTS   AGE
cpu-burst-5d84f5647c-697j2   1/1     Running   0          9m35s
cpu-burst-5d84f5647c-9r9rs   1/1     Running   0          9m35s
cpu-burst-5d84f5647c-9swmq   1/1     Running   0          9m35s
cpu-burst-5d84f5647c-bqrmw   1/1     Running   0          9m35s
cpu-burst-5d84f5647c-jc7f9   1/1     Running   0          9m35s
cpu-burst-5d84f5647c-pshpx   1/1     Running   0          9m35s
cpu-burst-5d84f5647c-qkzfm   1/1     Running   0          9m36s
cpu-burst-5d84f5647c-rl7mk   1/1     Running   0          9m35s
cpu-burst-5d84f5647c-tnt92   1/1     Running   0          9m35s
cpu-burst-5d84f5647c-w728l   1/1     Running   0          9m35s
- Scale in the Deployment.
kubectl scale deployment cpu-burst --replicas=5
After the Deployment is scaled in, Karpenter first creates new nodes with fewer cores and then reclaims the old nodes.
$ kubectl get node
NAME            STATUS   ROLES    AGE     VERSION
192.168.1.15    Ready    <none>   3h30m   v1.33.5-r20-33.0.4.9
192.168.1.168   Ready    <none>   6m25s   v1.33.5-r20-33.0.4.9
192.168.1.71    Ready    <none>   2m14s   v1.33.5-r20-33.0.4.9
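To reason about how many nodes a scale-out or scale-in of the cpu-burst Deployment might need, the pod requests (1400m CPU and 1200Mi memory) can be compared against node sizes. The sketch below is a rough upper-bound estimate only. The vCPU and memory sizes listed for the c9 flavors are assumptions based on the usual ECS flavor naming pattern and should be checked against the ECS specifications, and the calculation ignores resources reserved for the system and for add-on pods, so real nodes hold fewer pods.

```python
import math

# Requests of one cpu-burst pod, taken from the Deployment in this section.
POD_CPU_MILLI = 1400
POD_MEMORY_MIB = 1200

# ASSUMED flavor sizes as (vCPUs, memory in GiB); verify before relying on them.
ASSUMED_FLAVORS = {
    "c9.large.4": (2, 8),
    "c9.xlarge.4": (4, 16),
    "c9.2xlarge.4": (8, 32),
    "c9.4xlarge.4": (16, 64),
}


def pods_per_node(vcpus, memory_gib):
    """Upper bound on pods per node, ignoring system-reserved resources."""
    by_cpu = (vcpus * 1000) // POD_CPU_MILLI
    by_memory = (memory_gib * 1024) // POD_MEMORY_MIB
    return min(by_cpu, by_memory)


def nodes_needed(replicas, vcpus, memory_gib):
    """Minimum number of identical nodes needed to hold the given replicas."""
    return math.ceil(replicas / pods_per_node(vcpus, memory_gib))


for flavor, (vcpus, memory_gib) in ASSUMED_FLAVORS.items():
    print(
        flavor,
        "pods/node:", pods_per_node(vcpus, memory_gib),
        "nodes for 10 replicas:", nodes_needed(10, vcpus, memory_gib),
        "nodes for 5 replicas:", nodes_needed(5, vcpus, memory_gib),
    )
```

Under these assumptions the workload is CPU-bound on every flavor, which is one way to see why fewer replicas can be repacked onto smaller nodes after a scale-in.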


Uninstalling the Release
- Log in to the CCE console and click the target cluster name. In the left navigation pane, choose App Templates.
- On the Releases tab page, locate the row that contains the installed release and choose More > Uninstall in the Operation column.

Configuration Parameters
- NodePool resource parameters: For details, see NodePools.
- CCENodeClass resource parameters:
Table 1 CCENodeClass resource parameters

| Parameter | Mandatory | Type | Description |
|---|---|---|---|
| subnetSelectorTerms | Yes | Array of SubnetSelectorTerm objects | Definition: Node subnet information. Constraints: N/A |
| ecsGroupId | No | String | Definition: ECS group ID. If this parameter is specified, nodes will be created in the specified ECS group. Constraints: N/A. Range: N/A. Default Value: N/A |
| imsSelector | Yes | IMSSelector object | Definition: Node OS and image. Constraints: N/A |
| blockDeviceMappings | Yes | BlockDeviceMappings object | Definition: Node disk device. Constraints: N/A |
| login | Yes | Login object | Definition: Node login mode. Constraints: N/A |
| runtimeConfiguration | No | RuntimeConfiguration object | Definition: Runtime configuration. Constraints: N/A |
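As a way to see how the mandatory fields in Table 1 nest, the following sketch assembles a CCENodeClass manifest as a Python dictionary and prints it as JSON, which kubectl can apply just like YAML. The `build_cce_node_class` helper is hypothetical, the field layout simply mirrors the example manifest in Performing Verification, and the subnet ID and password values are placeholders.

```python
import json


def build_cce_node_class(name, subnet_ids, ims_family, root_disk, k8s_disk,
                         username, password):
    """Assemble a CCENodeClass manifest from the mandatory Table 1 fields."""
    return {
        "apiVersion": "karpenter.k8s.huawei/v1alpha1",
        "kind": "CCENodeClass",
        "metadata": {"name": name},
        "spec": {
            # One selector term per subnet (for example, one subnet per target AZ).
            "subnetSelectorTerms": [{"id": subnet_id} for subnet_id in subnet_ids],
            "imsSelector": {"imsFamily": ims_family},
            "blockDeviceMappings": {"root": root_disk, "k8s": k8s_disk},
            "login": {"userPassword": {"username": username, "password": password}},
        },
    }


manifest = build_cce_node_class(
    name="demo-cce-elastic-cpu",
    subnet_ids=["your-subnet-network-id"],  # placeholder, not a real ID
    ims_family="Huawei Cloud EulerOS 2.0",
    root_disk={"volumeSize": 40, "volumeType": "SSD"},
    k8s_disk={"volumeSize": 100, "volumeType": "SSD"},
    username="root",
    password="your-salted-password",  # placeholder, not a real credential
)
print(json.dumps(manifest, indent=2))
```

Optional fields such as ecsGroupId and runtimeConfiguration are left out of the sketch and would be added under spec in the same way.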
Table 2 SubnetSelectorTerm

| Parameter | Mandatory | Type | Description |
|---|---|---|---|
| id | Yes | String | Definition: Network ID of the subnet that the network interface belongs to. Constraints: N/A. Range: Log in to the VPC console. In the left navigation pane, choose Virtual Private Cloud > Subnets. Click the target subnet name and copy the Network ID on the Summary tab page. Default Value: N/A |
Table 3 IMSSelector

| Parameter | Mandatory | Type | Description |
|---|---|---|---|
| imsFamily | Yes | String | Definition: Node OS. Constraints: N/A. Range: N/A. Default Value: N/A |
Table 4 BlockDeviceMappings

| Parameter | Mandatory | Type | Description |
|---|---|---|---|
| root | Yes | BlockDevice object | Definition: System disk. Constraints: N/A |
| k8s | Yes | BlockDevice object | Definition: Data disk used by runtime and Kubernetes. Constraints: N/A |
| users | No | Array of BlockDevice objects | Definition: User data volume. Constraints: N/A |
Table 5 BlockDevice

| Parameter | Mandatory | Type | Description |
|---|---|---|---|
| volumeSize | Yes | Int | Definition: Disk size, in GiB. Constraints: N/A. Range: root volume: 20 to 1024; Kubernetes volume: 20 to 32768; user volume: 10 to 32768. Default Value: N/A |
| volumeType | Yes | String | Definition: Disk type. Constraints: N/A. Range: SAS (High I/O disk); SSD (Ultra-high I/O disk); SATA (Common I/O disk; SATA disks are no longer available from EVS, and only existing nodes have disks of this type); ESSD (Extreme SSD disk); GPSSD (General Purpose SSD disk); ESSD2 (Extreme SSD V2 disk); GPSSD2 (General Purpose SSD V2 disk). Default Value: N/A |
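The size ranges and disk types in Table 5 lend themselves to a quick pre-check before a CCENodeClass is applied. The sketch below is only an illustration: the `check_block_device` helper and its message wording are hypothetical, and the role names root, k8s, and users follow the keys of BlockDeviceMappings in Table 4.

```python
# Allowed volumeSize ranges in GiB, per BlockDeviceMappings key (from Table 5).
VOLUME_SIZE_RANGES_GIB = {
    "root": (20, 1024),
    "k8s": (20, 32768),
    "users": (10, 32768),
}

# Disk types listed in Table 5. SATA is listed but no longer offered for new disks.
VOLUME_TYPES = {"SAS", "SSD", "SATA", "ESSD", "GPSSD", "ESSD2", "GPSSD2"}


def check_block_device(role, device):
    """Return a list of problems found in one BlockDevice entry."""
    problems = []
    low, high = VOLUME_SIZE_RANGES_GIB[role]
    size = device.get("volumeSize")
    if not isinstance(size, int) or not low <= size <= high:
        problems.append(f"{role}: volumeSize must be an integer from {low} to {high} GiB")
    volume_type = device.get("volumeType")
    if volume_type not in VOLUME_TYPES:
        problems.append(f"{role}: unknown volumeType {volume_type!r}")
    return problems


print(check_block_device("root", {"volumeSize": 40, "volumeType": "SSD"}))
print(check_block_device("root", {"volumeSize": 2048, "volumeType": "NVME"}))
```

The first call uses the root disk from the example manifest and reports no problems; the second is deliberately out of range and uses a type that is not in the table.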
Table 6 Login

| Parameter | Mandatory | Type | Description |
|---|---|---|---|
| userPassword | Yes | UserPassword object | Definition: Node login mode. Constraints: N/A |
Table 7 UserPassword

| Parameter | Mandatory | Type | Description |
|---|---|---|---|
| username | No | String | Definition: Node login username. Constraints: N/A. Range: N/A. Default Value: N/A |
| password | Yes | String | Definition: Login password. If a username and password are used when a node is created, this field is shielded in the response body. Constraints: The password field must be encrypted using a unique salt per credential during node creation. For details, see Adding a Salt in the password Field When Creating a Node. Range: A password must contain 8 to 26 characters; must contain at least three of the following character types: uppercase letters, lowercase letters, digits, and special characters !@$%^-_=+[{}]:,./?; and cannot contain the username or the username spelled backwards. Default Value: N/A |
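The password rules in Table 7 can be checked locally on the plaintext password before it is salted. The following is a minimal sketch, not an official validator: the `password_problems` helper is hypothetical, the username comparison is assumed to be case-sensitive, and characters outside the four listed types are neither counted nor rejected because the table does not say how they are treated.

```python
import string

# Special characters accepted by the rule in Table 7.
SPECIAL_CHARACTERS = set("!@$%^-_=+[{}]:,./?")


def password_problems(password, username=""):
    """Return a list of Table 7 rules that the plaintext password violates."""
    problems = []
    if not 8 <= len(password) <= 26:
        problems.append("length must be 8 to 26 characters")
    character_types = [
        any(c in string.ascii_uppercase for c in password),
        any(c in string.ascii_lowercase for c in password),
        any(c in string.digits for c in password),
        any(c in SPECIAL_CHARACTERS for c in password),
    ]
    if sum(character_types) < 3:
        problems.append("must contain at least three character types")
    # Assumption: the username check is case-sensitive.
    if username and (username in password or username[::-1] in password):
        problems.append("must not contain the username or its reverse")
    return problems


print(password_problems("Abcdef12", username="root"))
print(password_problems("Xtoor123!", username="root"))
```

The second example fails only because it contains the username spelled backwards, which is easy to miss when choosing a password by hand.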