CCE Advanced HPA
CCE Advanced HPA (cce-hpa-controller) is a CCE-developed add-on that flexibly scales Deployments in or out based on metrics such as CPU usage and memory usage.
Main Functions
- Scaling can be performed based on the percentage of the current number of pods.
- The minimum scaling step can be set.
- Different scaling operations can be performed based on the actual metric values.
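The interaction between these three functions can be illustrated with a short sketch. It is not the add-on's implementation: the function names, the round-up rule, and the tier format are all assumptions made for illustration, since this page does not state how the controller rounds or combines the values.

```python
import math


def scaling_step(current_pods: int, percent: float, min_step: int) -> int:
    """Number of pods to add or remove in one scaling action.

    Assumption: the percentage of the current pod count is rounded up,
    and the result is never smaller than the minimum scaling step.
    """
    if current_pods < 0 or percent < 0 or min_step < 0:
        raise ValueError("inputs must be non-negative")
    return max(math.ceil(current_pods * percent / 100), min_step)


def percent_for_metric(metric_value: float, tiers: list[tuple[float, float]]) -> float:
    """Pick a scaling percentage from hypothetical (threshold, percent) tiers.

    The highest threshold that the metric value reaches wins; 0 means
    no tier matched, so no scaling is triggered.
    """
    chosen = 0.0
    for threshold, percent in sorted(tiers):
        if metric_value >= threshold:
            chosen = percent
    return chosen


# Hypothetical policy: scale by 10% at 70% CPU usage, by 30% at 90% CPU usage.
tiers = [(70, 10), (90, 30)]
step = scaling_step(current_pods=40, percent=percent_for_metric(95, tiers), min_step=3)
print(step)
```

With 40 pods and 95% CPU usage, the 30% tier applies and the step is 12 pods. With 10 pods and a 15% rule, the percentage alone would give 2 pods, so a minimum step of 3 takes over.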
Notes and Constraints
- If the add-on version is earlier than 1.2.11, the Prometheus add-on must be installed. If the add-on version is 1.2.11 or later, an add-on that provides the metrics API must be installed. Select one of the following add-ons based on your cluster version and requirements:
  - Kubernetes Metrics Server: provides basic resource usage metrics, such as container CPU and memory usage. It is supported by all cluster versions.
  - Cloud Native Cluster Monitoring: available only in clusters of v1.17 or later.
    - Auto scaling based on basic resource metrics: Prometheus needs to be registered as a metrics API. For details, see Providing Resource Metrics Through the Metrics API.
    - Auto scaling based on custom metrics: Custom metrics need to be aggregated to the Kubernetes API server. For details, see Creating an HPA Policy Using Custom Metrics.
  - Prometheus: Prometheus needs to be registered as a metrics API. For details, see Providing Resource Metrics Through the Metrics API. This add-on supports only clusters of v1.21 or earlier.
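The version constraints above can be condensed into a small decision helper. This is only a sketch of the rules as written on this page. The function names are hypothetical, and the helper does not query a real cluster or any CCE API.

```python
def parse_version(text: str) -> tuple[int, ...]:
    """Turn a version such as 'v1.21' or '1.2.11' into a comparable tuple."""
    return tuple(int(part) for part in text.lstrip("v").split("."))


def metrics_providers(addon_version: str, cluster_version: str) -> list[str]:
    """List the metrics add-ons that satisfy the constraints for this setup."""
    addon = parse_version(addon_version)
    cluster = parse_version(cluster_version)[:2]  # compare major.minor only

    # Add-on versions earlier than 1.2.11 require the Prometheus add-on.
    if addon < (1, 2, 11):
        return ["Prometheus"]

    # Kubernetes Metrics Server is supported by all cluster versions.
    providers = ["Kubernetes Metrics Server"]
    if cluster >= (1, 17):
        providers.append("Cloud Native Cluster Monitoring")
    if cluster <= (1, 21):
        providers.append("Prometheus")
    return providers


print(metrics_providers("1.4.3", "v1.29"))
print(metrics_providers("1.2.10", "v1.19"))
```

For add-on 1.4.3 on a v1.29 cluster, the helper returns Kubernetes Metrics Server and Cloud Native Cluster Monitoring. For add-on 1.2.10, only Prometheus qualifies.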
Installing the Add-on
- Log in to the CCE console and click the cluster name to access the cluster console. In the navigation pane, choose Add-ons, locate CCE Advanced HPA on the right, and click Install.
- On the Install Add-on page, configure the specifications as needed.
- If you select Preset, the add-on specifications are automatically configured with the values recommended by CCE. These values are suitable for most scenarios and can be viewed on the console.
- If you select Custom, you can modify the number of replicas, CPUs, and memory of each add-on component as required.
Replicas: A single replica cannot provide HA, so use one replica only for verification. In commercial scenarios, configure multiple replicas based on the cluster specifications.
CPU Quota and Memory Quota: The resource quotas of a component depend on the number of containers and the number of scaling policies in the cluster. For typical situations, it is recommended that you configure 500m CPU cores and 1,000 MiB of memory for every 5,000 containers in the cluster, plus 100m CPU cores and 500 MiB of memory for every 1,000 scaling policies.
- Configure deployment policies for the add-on pods.
- Scheduling policies do not take effect on add-on instances of the DaemonSet type.
- When configuring multi-AZ deployment or node affinity, ensure that there are nodes meeting the scheduling policy and that resources are sufficient in the cluster. Otherwise, the add-on cannot run.
Table 1 Configurations for add-on scheduling
Parameter
Description
Multi AZ
- Preferred: Deployment pods of the add-on will be preferentially scheduled to nodes in different AZs. If all the nodes in the cluster are deployed in the same AZ, the pods will be scheduled to different nodes in that AZ.
- Equivalent mode: Deployment pods of the add-on are evenly scheduled to the nodes in the cluster in each AZ. If a new AZ is added, you are advised to increase add-on pods for cross-AZ HA deployment. With the Equivalent multi-AZ deployment, the difference between the number of add-on pods in different AZs will be less than or equal to 1. If resources in one of the AZs are insufficient, pods cannot be scheduled to that AZ.
- Required: Deployment pods of the add-on are forcibly scheduled to nodes in different AZs. There can be at most one pod in each AZ. If nodes in a cluster are not in different AZs, some add-on pods cannot run properly. If a node is faulty, add-on pods on it may fail to be migrated.
Node Affinity
- Not configured: Node affinity is disabled for the add-on.
- Node Affinity: Specify the nodes where the add-on is deployed. If you do not specify the nodes, the add-on will be randomly scheduled based on the default cluster scheduling policy.
- Specified Node Pool Scheduling: Specify the node pool where the add-on is deployed. If you do not specify the node pool, the add-on will be randomly scheduled based on the default cluster scheduling policy.
- Custom Policies: Enter the labels of the nodes where the add-on is to be deployed for more flexible scheduling policies. If you do not specify node labels, the add-on will be randomly scheduled based on the default cluster scheduling policy.
If multiple custom affinity policies are configured, ensure that there are nodes that meet all the affinity policies in the cluster. Otherwise, the add-on cannot run.
Toleration
Tolerations work together with taints: they allow, but do not force, the add-on Deployment to be scheduled to nodes that have matching taints, and they control how the Deployment is evicted after the node it runs on is tainted.
By default, the add-on adds a toleration policy for each of the node.kubernetes.io/not-ready and node.kubernetes.io/unreachable taints, with a tolerance time window of 60s.
For details, see Configuring Tolerance Policies.
- Click Install.
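The CPU Quota and Memory Quota guidance in the Custom specifications step is simple arithmetic, which the following sketch works through. Two points are assumptions rather than statements from this page: the container and scaling-policy contributions are added together, and a partial block of containers or policies is rounded up to a whole block. The function name is hypothetical.

```python
import math


def recommended_quota(containers: int, scaling_policies: int) -> tuple[int, int]:
    """Return (cpu_millicores, memory_mib) for an add-on component.

    Guidance from the page: 500m CPU and 1,000 MiB per 5,000 containers,
    100m CPU and 500 MiB per 1,000 scaling policies.
    Assumption: the two parts are summed and partial blocks round up.
    """
    if containers < 0 or scaling_policies < 0:
        raise ValueError("counts must be non-negative")

    container_blocks = math.ceil(containers / 5000)
    policy_blocks = math.ceil(scaling_policies / 1000)

    cpu_millicores = 500 * container_blocks + 100 * policy_blocks
    memory_mib = 1000 * container_blocks + 500 * policy_blocks
    return cpu_millicores, memory_mib


cpu, memory = recommended_quota(containers=12000, scaling_policies=2500)
print(f"{cpu}m CPU, {memory} MiB memory")
```

For a cluster with 12,000 containers and 2,500 scaling policies, this gives three container blocks and three policy blocks, that is, 1800m CPU and 4,500 MiB of memory. Treat the result as a starting point and adjust it against the observed usage of the component.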
Components
Component | Description | Resource Type |
---|---|---|
customedhpa-controller | CCE auto scaling component, which scales in or out Deployments based on metrics such as CPU usage and memory usage | Deployment |
Change History
Add-on Version | Supported Cluster Version | New Feature |
---|---|---|
1.4.3 | v1.21, v1.23, v1.25, v1.27, v1.28, v1.29 | Fixed some issues. |
1.4.2 | v1.21, v1.23, v1.25, v1.27, v1.28, v1.29 | CCE clusters 1.29 are supported. |
1.3.42 | v1.21, v1.23, v1.25, v1.27, v1.28 | CCE clusters 1.28 are supported. |
1.3.14 | v1.19, v1.21, v1.23, v1.25, v1.27 | CCE clusters 1.27 are supported. |
1.3.10 | v1.19, v1.21, v1.23, v1.25 | Periodic scaling is not affected by the cooldown period. |
1.3.7 | v1.19, v1.21, v1.23, v1.25 | Supported anti-affinity scheduling of add-on pods on nodes in different AZs. |
1.3.3 | v1.19, v1.21, v1.23, v1.25 | |
1.3.1 | v1.19, v1.21, v1.23 | CCE clusters 1.23 are supported. |
1.2.12 | v1.15, v1.17, v1.19, v1.21 | Optimizes the add-on performance to reduce resource consumption. |
1.2.11 | v1.15, v1.17, v1.19, v1.21 | |
1.2.10 | v1.15, v1.17, v1.19, v1.21 | CCE clusters 1.21 are supported. |
1.2.4 | v1.15, v1.17, v1.19 | |
1.2.3 | v1.15, v1.17, v1.19 | Supports ARM64 nodes. |
1.2.2 | v1.15, v1.17, v1.19 | Enhances the health check function. |
1.2.1 | v1.15, v1.17, v1.19 | |
1.1.3 | v1.15, v1.17 | Supports periodic scaling rules. |