Modifying Cluster Configurations
Scenario
CCE allows you to modify cluster configuration parameters so that core components run according to your requirements.
Procedure
- Log in to the CCE console. In the navigation pane, choose Clusters.
- Locate the target cluster, click ... to view more operations on the cluster, and choose Manage.
- On the Manage Component page, change the values of the Kubernetes parameters listed in the following tables.
Table 1 kube-apiserver configurations

Item: Toleration time for nodes in NotReady state
Parameter: default-not-ready-toleration-seconds
Description: Specifies the default toleration time. This setting applies to all pods by default. You can configure a different toleration time for individual pods, in which case the toleration time configured for the pod takes precedence. For details, see Configuring Tolerance Policies. (A per-pod example is sketched after this table.)
If the toleration time is too short, pods may be migrated frequently during transient issues such as network jitter. If it is too long, services may be interrupted for that entire period after a node becomes faulty.
Value: Default: 300s

Item: Toleration time for nodes in unreachable state
Parameter: default-unreachable-toleration-seconds
Description: Specifies the default toleration time. This setting applies to all pods by default. You can configure a different toleration time for individual pods, in which case the toleration time configured for the pod takes precedence. For details, see Configuring Tolerance Policies. (A per-pod example is sketched after this table.)
If the toleration time is too short, pods may be migrated frequently during transient issues such as network jitter. If it is too long, services may be interrupted for that entire period after a node becomes faulty.
Value: Default: 300s

Item: Maximum Number of Concurrent Modification API Calls
Parameter: max-mutating-requests-inflight
Description: Maximum number of concurrent mutating requests. When this limit is exceeded, the server rejects requests.
The value 0 indicates no limit on the number of concurrent mutating requests. This parameter is related to the cluster scale. You are advised not to change the value.
Value: Manual configuration is no longer supported since cluster v1.21. The value is automatically set based on the cluster scale.
- 200 for clusters with 50 or 200 nodes
- 500 for clusters with 1000 nodes
- 1000 for clusters with 2000 nodes

Item: Maximum Number of Concurrent Non-Modification API Calls
Parameter: max-requests-inflight
Description: Maximum number of concurrent non-mutating requests. When this limit is exceeded, the server rejects requests.
The value 0 indicates no limit on the number of concurrent non-mutating requests. This parameter is related to the cluster scale. You are advised not to change the value.
Value: Manual configuration is no longer supported since cluster v1.21. The value is automatically set based on the cluster scale.
- 400 for clusters with 50 or 200 nodes
- 1000 for clusters with 1000 nodes
- 2000 for clusters with 2000 nodes

Item: NodePort port range
Parameter: service-node-port-range
Description: NodePort port range. After changing the value, go to the security group page and update the TCP/UDP rule that allows ports 30000 to 32767 in the node security groups accordingly. Otherwise, ports outside the default range cannot be accessed externally.
If the port number is smaller than 20106, it may conflict with the CCE health check port, which can make the cluster unavailable. If the port number is greater than 32767, it may conflict with the ports in net.ipv4.ip_local_port_range, which can affect network performance.
Value: Default: 30000 to 32767
Value range:
Min > 20105
Max < 32768

Item: Overload Control
Parameter: support-overload
Description: Cluster overload control. After this function is enabled, concurrent requests are dynamically throttled based on the resource pressure on the master nodes to keep the master nodes and the cluster running stably.
This parameter is available only in clusters of v1.23 or later.
Value:
- false: Overload control is disabled.
- true: Overload control is enabled.

Item: Node Restriction Add-on
Parameter: enable-admission-plugin-node-restriction
Description: This add-on allows the kubelet on a node to operate only the objects that belong to that node, which enhances isolation in multi-tenant scenarios or scenarios with high security requirements.
This parameter is available only in clusters of v1.23.14-r0, v1.25.9-r0, v1.27.6-r0, v1.28.4-r0, or later versions.
Value: Default: true

Item: Pod Node Selector Add-on
Parameter: enable-admission-plugin-pod-node-selector
Description: This add-on allows cluster administrators to configure a default node selector through namespace annotations, so that pods run only on specific nodes and configuration is simplified. (An annotation example is sketched after this table.)
This parameter is available only in clusters of v1.23.14-r0, v1.25.9-r0, v1.27.6-r0, v1.28.4-r0, or later versions.
Value: Default: true

Item: Pod Toleration Limit Add-on
Parameter: enable-admission-plugin-pod-toleration-restriction
Description: This add-on allows cluster administrators to configure default values and limits for pod tolerations through namespace annotations, enabling fine-grained control over pod scheduling and protection of key resources. (An annotation example is sketched after this table.)
This parameter is available only in clusters of v1.23.14-r0, v1.25.9-r0, v1.27.6-r0, v1.28.4-r0, or later versions.
Value: Default: false
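For the two toleration-time parameters above, the cluster-wide default can be overridden per pod. The following is a minimal sketch using the official Kubernetes Python client; the pod name, image, namespace, and the 120-second value are illustrative assumptions, not CCE defaults.

```python
from kubernetes import client, config

config.load_kube_config()  # or config.load_incluster_config() when running inside the cluster

# Pod-level tolerations override default-not-ready-toleration-seconds and
# default-unreachable-toleration-seconds for this pod only.
pod = client.V1Pod(
    metadata=client.V1ObjectMeta(name="toleration-demo"),
    spec=client.V1PodSpec(
        containers=[client.V1Container(name="app", image="nginx:alpine")],
        tolerations=[
            client.V1Toleration(key="node.kubernetes.io/not-ready",
                                operator="Exists", effect="NoExecute",
                                toleration_seconds=120),
            client.V1Toleration(key="node.kubernetes.io/unreachable",
                                operator="Exists", effect="NoExecute",
                                toleration_seconds=120),
        ],
    ),
)
client.CoreV1Api().create_namespaced_pod(namespace="default", body=pod)
```

A shorter per-pod value lets latency-sensitive workloads be rescheduled sooner than the 300s cluster default, while batch workloads can tolerate a longer outage.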
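The Pod Node Selector and Pod Toleration Limit add-ons above correspond to the upstream PodNodeSelector and PodTolerationRestriction admission plugins, which read namespace annotations. The sketch below uses the upstream annotation keys; the namespace name, node label, and toleration are hypothetical examples.

```python
from kubernetes import client, config

config.load_kube_config()
v1 = client.CoreV1Api()

annotations = {
    # PodNodeSelector: default node selector for pods created in this namespace.
    "scheduler.alpha.kubernetes.io/node-selector": "env=prod",
    # PodTolerationRestriction: default tolerations added to pods in this namespace (JSON list).
    "scheduler.alpha.kubernetes.io/defaultTolerations":
        '[{"key": "dedicated", "operator": "Equal", "value": "team-a", "effect": "NoSchedule"}]',
}
v1.patch_namespace("team-a", {"metadata": {"annotations": annotations}})
```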
Table 2 Scheduler configurations

Item: Default cluster scheduler
Parameter: default-scheduler
Description:
- kube-scheduler: provides the standard scheduling capabilities of the community.
- volcano: compatible with kube-scheduler scheduling capabilities and provides enhanced scheduling. For details, see Volcano Scheduling.
Value: Default: kube-scheduler

Item: QPS for communicating with kube-apiserver
Parameter: kube-api-qps
Description: QPS for communicating with kube-apiserver.
Value:
- If the number of nodes in a cluster is less than 1000, the default value is 100.
- If the number of nodes in a cluster is 1000 or more, the default value is 200.

Item: Burst for communicating with kube-apiserver
Parameter: kube-api-burst
Description: Burst for communicating with kube-apiserver.
Value:
- If the number of nodes in a cluster is less than 1000, the default value is 100.
- If the number of nodes in a cluster is 1000 or more, the default value is 200.

Item: Whether to enable GPU sharing
Parameter: enable-gpu-share
Description: Whether to enable GPU sharing. This parameter is supported only in clusters of v1.23.7-r10, v1.25.3-r0, or later versions.
- Before disabling it, ensure that no pod in the cluster uses shared GPUs (that is, no pod carries the cce.io/gpu-decision annotation) and that GPU virtualization is disabled. (A check script is sketched after this table.)
- When it is enabled, every pod that uses GPU resources in the cluster must carry the cce.io/gpu-decision annotation.
Value: Default: true
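As a quick way to verify the precondition above before turning GPU sharing off, this sketch lists pods that carry the cce.io/gpu-decision annotation using the official Python client; the output format is an illustrative choice.

```python
from kubernetes import client, config

config.load_kube_config()
v1 = client.CoreV1Api()

# Pods carrying the cce.io/gpu-decision annotation depend on shared GPU scheduling.
shared_gpu_pods = [
    f"{pod.metadata.namespace}/{pod.metadata.name}"
    for pod in v1.list_pod_for_all_namespaces().items
    if (pod.metadata.annotations or {}).get("cce.io/gpu-decision")
]
print(shared_gpu_pods or "No pods use shared GPUs; enable-gpu-share can be disabled.")
```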
Table 3 kube-controller-manager configurations

Item: Concurrent processing of Deployments
Parameter: concurrent-deployment-syncs
Description: Number of Deployment objects that can be synchronized concurrently.
Value: Default: 5

Item: Concurrent processing of Endpoints
Parameter: concurrent-endpoint-syncs
Description: Number of endpoint syncing operations that can be performed concurrently.
Value: Default: 5

Item: Concurrent garbage collector workers
Parameter: concurrent-gc-syncs
Description: Number of garbage collector workers that can run concurrently.
Value: Default: 20

Item: Concurrent processing of Jobs
Parameter: concurrent-job-syncs
Description: Number of Job objects that can be synchronized concurrently.
Value: Default: 5

Item: Concurrent processing of CronJobs
Parameter: concurrent-cron-job-syncs
Description: Number of CronJob objects that can be synchronized concurrently.
Value: Default: 5

Item: Concurrent processing of namespaces
Parameter: concurrent-namespace-syncs
Description: Number of namespace objects that can be synchronized concurrently.
Value: Default: 10

Item: Concurrent processing of ReplicaSets
Parameter: concurrent-replicaset-syncs
Description: Number of ReplicaSets that can be synchronized concurrently.
Value: Default: 5

Item: Concurrent processing of resource quotas
Parameter: concurrent-resource-quota-syncs
Description: Number of resource quotas that can be synchronized concurrently.
Value: Default: 5

Item: Concurrent processing of Services
Parameter: concurrent-service-syncs
Description: Number of Services that can be synchronized concurrently.
Value: Default: 10

Item: Concurrent processing of service account tokens
Parameter: concurrent-serviceaccount-token-syncs
Description: Number of service account token objects that can be synchronized concurrently.
Value: Default: 5

Item: Concurrent processing of ttl-after-finished
Parameter: concurrent-ttl-after-finished-syncs
Description: Number of ttl-after-finished-controller workers that can run concurrently.
Value: Default: 5

Item: RC
Parameter: concurrent_rc_syncs (used in clusters of v1.19 or earlier)
concurrent-rc-syncs (used in clusters of v1.21 through v1.25.3-r0)
Description: Number of replication controllers that can be synchronized concurrently.
NOTE: This parameter is no longer supported in clusters of v1.25.3-r0 and later versions.
Value: Default: 5

Item: Cluster elastic computing period
Parameter: horizontal-pod-autoscaler-sync-period
Description: Period at which the horizontal pod autoscaler evaluates and scales pods. A smaller value results in faster scaling responses but a higher CPU load.
NOTE: Configure this parameter carefully: a long period makes the controller respond slowly, while a short period may overload the cluster control plane.
Value: Default: 15 seconds

Item: Horizontal Pod Scaling Tolerance
Parameter: horizontal-pod-autoscaler-tolerance
Description: Determines how readily the horizontal pod autoscaler acts on its scaling policies. If this parameter is set to 0, scaling is triggered as soon as the related metrics deviate from the target. (See the sketch after this table for how the tolerance affects scaling decisions.)
Configuration suggestion: If service resource usage rises sharply over time, retain a certain tolerance to prevent unexpected scaling when resource usage is high.
Value: Default: 0.1

Item: QPS for communicating with kube-apiserver
Parameter: kube-api-qps
Description: QPS for communicating with kube-apiserver.
Value:
- If the number of nodes in a cluster is less than 1000, the default value is 100.
- If the number of nodes in a cluster is 1000 or more, the default value is 200.

Item: Burst for communicating with kube-apiserver
Parameter: kube-api-burst
Description: Burst for communicating with kube-apiserver.
Value:
- If the number of nodes in a cluster is less than 1000, the default value is 100.
- If the number of nodes in a cluster is 1000 or more, the default value is 200.

Item: Maximum number of terminated pods kept before the pod GC deletes them
Parameter: terminated-pod-gc-threshold
Description: Number of terminated pods that can exist in a cluster. If there are more terminated pods than this threshold, the excess terminated pods are deleted.
NOTE: If this parameter is set to 0, all pods in the terminated state are retained.
Value: Default: 1000
Value range: 10 to 12500
If the cluster version is v1.21.11-r40, v1.23.8-r0, v1.25.6-r0, v1.27.3-r0, or later, the value range is 0 to 100000.

Item: Unhealthy AZ Threshold
Parameter: unhealthy-zone-threshold
Description: When more than this proportion of pods in an AZ is unhealthy, the AZ itself is considered unhealthy, and scheduling pods to nodes in that AZ is restricted to limit the impact of the unhealthy AZ.
This parameter is available only in clusters of v1.23.14-r0, v1.25.9-r0, v1.27.6-r0, v1.28.4-r0, or later versions.
NOTE: If this parameter is set to a large value, pods in unhealthy AZs will be migrated on a large scale, which may lead to risks such as an overloaded cluster.
Value: Default: 0.55
Value range: 0 to 1

Item: Node Eviction Rate
Parameter: node-eviction-rate
Description: Number of nodes per second from which pods are evicted when the AZ is healthy. The default value is 0.1, meaning that pods are evicted from at most one node every 10 seconds.
This parameter is available only in clusters of v1.23.14-r0, v1.25.9-r0, v1.27.6-r0, v1.28.4-r0, or later versions.
NOTE: If this parameter is set to a large value, the cluster may be overloaded. Additionally, if too many pods are evicted at once, they cannot all be rescheduled, which slows down fault recovery.
Value: Default: 0.1

Item: Secondary Node Eviction Rate
Parameter: secondary-node-eviction-rate
Description: Number of nodes per second from which pods are evicted when the AZ is unhealthy. The default value is 0.01, meaning that pods are evicted from at most one node every 100 seconds.
This parameter is available only in clusters of v1.23.14-r0, v1.25.9-r0, v1.27.6-r0, v1.28.4-r0, or later versions.
NOTE: There is no need to set this parameter to a large value when an AZ is unhealthy; doing so may overload the cluster.
Value: Default: 0.01
Configure this parameter together with node-eviction-rate. It is typically set to one-tenth of node-eviction-rate.

Item: Large Cluster Threshold
Parameter: large-cluster-size-threshold
Description: A cluster whose node count exceeds this value is considered a large cluster.
This parameter is available only in clusters of v1.23.14-r0, v1.25.9-r0, v1.27.6-r0, v1.28.4-r0, or later versions.
NOTE: kube-controller-manager automatically adjusts its configuration for large clusters to optimize performance. Setting an excessively small threshold for a small cluster will therefore deteriorate its performance.
Value: Default: 50
For clusters with a large number of nodes, set this parameter to a value larger than the default for better performance and faster controller responses. Retain the default value for small clusters. Before adjusting this parameter in a production environment, verify the impact of the change in a test environment.
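To make the effect of horizontal-pod-autoscaler-tolerance concrete, the following minimal Python sketch mirrors the upstream HPA scaling rule (desired replicas = ceil(current replicas x current metric / target metric), skipped while the ratio stays within the tolerance band); the 70% CPU target and replica counts are illustrative assumptions.

```python
import math

def desired_replicas(current_replicas: int, current_metric: float,
                     target_metric: float, tolerance: float = 0.1) -> int:
    """Simplified HPA decision using horizontal-pod-autoscaler-tolerance."""
    ratio = current_metric / target_metric
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas  # within the tolerance band: no scaling
    return math.ceil(current_replicas * ratio)

# With the default tolerance of 0.1 and a 70% CPU utilization target:
print(desired_replicas(4, 76, 70))  # 4 -> ratio ~1.09 is within tolerance, no scaling
print(desired_replicas(4, 80, 70))  # 5 -> ratio ~1.14 exceeds tolerance, scale out
```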
Table 4 Networking component configurations (supported only by clusters using a VPC network)

Item: Retaining the non-masqueraded CIDR blocks of the original pod IP addresses
Parameter: nonMasqueradeCIDRs
Description: In a CCE cluster using the VPC network model, when a container needs to access external networks, the source pod IP address is masqueraded (SNAT) to the IP address of the node where the pod runs. After this parameter is configured, the node does not perform SNAT on traffic destined for the listed CIDR blocks. This function is available only in clusters of v1.23.14-r0, v1.25.9-r0, v1.27.6-r0, v1.28.4-r0, or later versions.
By default, nodes in a cluster do not perform SNAT on packets destined for 10.0.0.0/8, 172.16.0.0/12, and 192.168.0.0/16, which CCE detects as private CIDR blocks. Instead, these packets are forwarded directly over the underlying VPC. (These three CIDR blocks are considered internal to the cluster and are reachable at Layer 3 by default. See the sketch after this table.)
Value: Default: 10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16
NOTE: To enable cross-node pod access, the CIDR block of the node where the target pod runs must be added.
Similarly, to enable cross-ECS pod access in a VPC, the CIDR block of the ECS where the target pod runs must be added.
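The SNAT decision described above can be illustrated with a short, self-contained Python sketch; the destination addresses are arbitrary examples and the logic is a simplification of the node's actual masquerading rules.

```python
import ipaddress

# Default non-masqueraded CIDR blocks from Table 4.
NON_MASQUERADE_CIDRS = ["10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16"]

def snat_applied(dst_ip: str, cidrs=NON_MASQUERADE_CIDRS) -> bool:
    """Return True if traffic to dst_ip would have its source IP masqueraded to the node IP."""
    addr = ipaddress.ip_address(dst_ip)
    return not any(addr in ipaddress.ip_network(c) for c in cidrs)

print(snat_applied("192.168.1.20"))  # False: inside a non-masqueraded block, forwarded over the VPC as-is
print(snat_applied("8.8.8.8"))       # True: external address, source IP is replaced with the node IP
```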
Table 5 Extended controller configurations (supported only by clusters of v1.21 and later)

Item: Enable resource quota management
Parameter: enable-resource-quota
Description: Whether to automatically create a ResourceQuota object when a namespace is created. With quota management, you can control the number of workloads of each type and the upper limits of resources in a namespace or related dimensions. (A quota example is sketched after this table.)
Value:
- false: Auto creation is disabled.
- true: Auto creation is enabled. For details about the resource quota defaults, see Configuring Resource Quotas.
NOTE: In high-concurrency scenarios (for example, creating pods in batches), resource quota management may cause some requests to fail due to conflicts. Do not enable this function unless necessary. If you enable it, ensure that the request client has a retry mechanism.
Default: false
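Independent of the automatic creation controlled by enable-resource-quota, a ResourceQuota can also be created explicitly for a namespace. Below is a minimal sketch using the official Python client; the quota name, namespace, and limits are hypothetical.

```python
from kubernetes import client, config

config.load_kube_config()
v1 = client.CoreV1Api()

# Hypothetical quota capping pod count and aggregate CPU/memory requests in one namespace.
quota = client.V1ResourceQuota(
    metadata=client.V1ObjectMeta(name="team-a-quota"),
    spec=client.V1ResourceQuotaSpec(
        hard={"pods": "100", "requests.cpu": "8", "requests.memory": "16Gi"},
    ),
)
v1.create_namespaced_resource_quota(namespace="team-a", body=quota)
```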
- Click OK.