Node Resource Reservation Policy
Some node resources are used to run mandatory Kubernetes system components and resources that make the node part of your cluster. Therefore, the total amount of node resources differs from the amount of allocatable node resources in your cluster. The larger the node specifications, the more containers can be deployed on the node, so more node resources need to be reserved to run Kubernetes components.
To ensure node stability, CCE reserves a certain amount of node resources for Kubernetes components (such as kubelet, kube-proxy, and docker) based on the node specifications.
CCE calculates the node resources that can be allocated to your workloads as follows:
Allocatable resources = Total amount - Reserved amount - Eviction threshold
The memory eviction threshold is fixed at 100 MiB.
Total amount indicates the available memory of the ECS, excluding the memory used by system components. Therefore, the total amount is slightly less than the memory of the node flavor.
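The following minimal Python sketch (not CCE source code) illustrates the formula; the node and reservation figures are hypothetical, and the reserved amount comes from the reservation rules described below.

```python
# A minimal sketch (not CCE source code) of the allocatable-memory formula above.
# All values are in MiB: total_mib is the memory reported by the OS, and
# reserved_mib is derived from the reservation rules described below.
EVICTION_THRESHOLD_MIB = 100  # memory eviction threshold, fixed at 100 MiB

def allocatable_memory(total_mib: float, reserved_mib: float) -> float:
    """Allocatable resources = Total amount - Reserved amount - Eviction threshold."""
    return total_mib - reserved_mib - EVICTION_THRESHOLD_MIB

# Hypothetical example: a node reporting 7700 MiB with 1500 MiB reserved
print(allocatable_memory(7700, 1500))  # 6100
```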
When the memory consumed by all pods on a node increases, the following behaviors may occur:
- When the available memory of the node is lower than the eviction threshold, kubelet is triggered to evict the pod. For details about the eviction threshold in Kubernetes, see Node-pressure Eviction.
- If a node triggers an OS out-of-memory (OOM) event before kubelet reclaims memory, the system terminates the container. However, unlike pod eviction, kubelet restarts the container based on the RestartPolicy of the pod.
Rules for Reserving Node Memory v1
For clusters of versions earlier than v1.21.4-r0 and v1.23.3-r0, the v1 model is used for node memory reservation. For clusters of v1.21.4-r0, v1.23.3-r0, or later, the node memory reservation model is optimized to v2. For details, see Rules for Reserving Node Memory v2.
Upgrading a cluster to v1.21.4-r0, v1.23.3-r0, or a later version may cause workloads to be evicted from nodes whose resources are already fully used, because the system components require more reserved resources.
You can use the following formula to calculate how much memory should be reserved for running containers on a node:
Total reserved amount = Reserved memory for system components + Reserved memory for kubelet to manage pods
| Total Memory (TM) | Reserved Memory for System Components |
| --- | --- |
| TM ≤ 8 GiB | 0 MiB |
| 8 GiB < TM ≤ 16 GiB | [(TM – 8 GiB) x 1024 x 10%] MiB |
| 16 GiB < TM ≤ 128 GiB | [8 GiB x 1024 x 10% + (TM – 16 GiB) x 1024 x 6%] MiB |
| TM > 128 GiB | [8 GiB x 1024 x 10% + 112 GiB x 1024 x 6% + (TM – 128 GiB) x 1024 x 2%] MiB |
| Total Memory (TM) | Number of Pods | Reserved Memory for kubelet |
| --- | --- | --- |
| TM ≤ 2 GiB | None | TM x 25% |
| TM > 2 GiB | 0 < Max. pods on a node ≤ 16 | 700 MiB |
| | 16 < Max. pods on a node ≤ 32 | [700 + (Max. pods on a node – 16) x 18.75] MiB |
| | 32 < Max. pods on a node ≤ 64 | [1024 + (Max. pods on a node – 32) x 6.25] MiB |
| | 64 < Max. pods on a node ≤ 128 | [1230 + (Max. pods on a node – 64) x 7.80] MiB |
| | Max. pods on a node > 128 | [1740 + (Max. pods on a node – 128) x 11.20] MiB |
For a node with a small memory capacity, adjust the maximum number of pods based on the site requirements. Alternatively, when creating a node on the CCE console, you can adjust the maximum number of pods for the node based on the node specifications.
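The following minimal Python sketch (not CCE source code) illustrates the v1 rules in the two tables above. It assumes the TM x 25% value for nodes with 2 GiB of memory or less is expressed in MiB, and the 32 GiB, 64-pod example is hypothetical.

```python
# A minimal sketch (not CCE source code) of the v1 memory reservation rules.
# Inputs: node memory in GiB and the maximum number of pods configured for
# the node. Results are in MiB.

def reserved_for_system_components(tm_gib: float) -> float:
    if tm_gib <= 8:
        return 0
    if tm_gib <= 16:
        return (tm_gib - 8) * 1024 * 0.10
    if tm_gib <= 128:
        return 8 * 1024 * 0.10 + (tm_gib - 16) * 1024 * 0.06
    return 8 * 1024 * 0.10 + 112 * 1024 * 0.06 + (tm_gib - 128) * 1024 * 0.02

def reserved_for_kubelet(tm_gib: float, max_pods: int) -> float:
    if tm_gib <= 2:
        return tm_gib * 1024 * 0.25  # assumed to be TM x 25%, converted to MiB
    if max_pods <= 16:
        return 700
    if max_pods <= 32:
        return 700 + (max_pods - 16) * 18.75
    if max_pods <= 64:
        return 1024 + (max_pods - 32) * 6.25
    if max_pods <= 128:
        return 1230 + (max_pods - 64) * 7.80
    return 1740 + (max_pods - 128) * 11.20

# Hypothetical example: a 32 GiB node configured with a maximum of 64 pods
# reserves about 819.2 + 983.04 = 1802.24 MiB for system components,
# plus 1024 + 32 x 6.25 = 1224 MiB for kubelet.
tm, pods = 32, 64
print(reserved_for_system_components(tm) + reserved_for_kubelet(tm, pods))  # about 3026.24
```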
Rules for Reserving Node Memory v2
For clusters of v1.21.4-r0, v1.23.3-r0, or later, the node memory reservation model is optimized to v2 and can be dynamically adjusted using the node pool parameters kube-reserved-mem and system-reserved-mem. For details, see Modifying Node Pool Configurations.
The total reserved node memory of the v2 model is equal to the sum of that reserved for the OS and that reserved for CCE to manage pods.
Reserved memory includes basic and floating parts. For the OS, the floating memory depends on the node specifications. For CCE, the floating memory depends on the number of pods on a node.
| Reserved for | Basic/Floating | Reservation | Used by |
| --- | --- | --- | --- |
| OS | Basic | Fixed at 400 MiB | OS service components such as sshd and systemd-journald |
| | Floating (depending on the node memory) | 25 MiB/GiB | Kernel |
| CCE | Basic | Fixed at 500 MiB | Container engine components, such as kubelet and kube-proxy, when the node is unloaded |
| | Floating (depending on the number of pods on the node) | Docker: 20 MiB/pod; containerd: 5 MiB/pod | Container engine components when the number of pods increases. NOTE: When the v2 model reserves memory for a node by default, the default maximum number of pods is estimated based on the memory. For details, see Table 1. |
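The following minimal Python sketch (not CCE source code) illustrates the v2 rules in the table above; the 16 GiB containerd node running 40 pods is a hypothetical example.

```python
# A minimal sketch (not CCE source code) of the v2 memory reservation rules.
# The per-pod overhead depends on the container engine (Docker or containerd).
def v2_reserved_memory(node_mem_gib: float, pods: int, engine: str = "containerd") -> float:
    """Return the total reserved memory in MiB (OS part + CCE part)."""
    os_reserved = 400 + 25 * node_mem_gib       # basic 400 MiB + floating 25 MiB per GiB
    per_pod = 20 if engine == "docker" else 5   # Docker: 20 MiB/pod, containerd: 5 MiB/pod
    cce_reserved = 500 + per_pod * pods         # basic 500 MiB + floating per-pod part
    return os_reserved + cce_reserved

# Hypothetical example: a 16 GiB containerd node running 40 pods reserves
# roughly (400 + 400) + (500 + 200) = 1500 MiB.
print(v2_reserved_memory(16, 40))  # 1500
```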
Rules for Reserving Node CPU
| Total CPU Cores (Total) | Reserved CPU Cores |
| --- | --- |
| Total ≤ 1 core | Total x 6% |
| 1 core < Total ≤ 2 cores | 1 core x 6% + (Total – 1 core) x 1% |
| 2 cores < Total ≤ 4 cores | 1 core x 6% + 1 core x 1% + (Total – 2 cores) x 0.5% |
| Total > 4 cores | 1 core x 6% + 1 core x 1% + 2 cores x 0.5% + (Total – 4 cores) x 0.25% |
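The following minimal Python sketch (not CCE source code) illustrates the CPU reservation rules above; the 16-core example is hypothetical.

```python
# A minimal sketch (not CCE source code) of the CPU reservation rules.
# Input and output are in cores.
def reserved_cpu(total_cores: float) -> float:
    if total_cores <= 1:
        return total_cores * 0.06
    if total_cores <= 2:
        return 1 * 0.06 + (total_cores - 1) * 0.01
    if total_cores <= 4:
        return 1 * 0.06 + 1 * 0.01 + (total_cores - 2) * 0.005
    return 1 * 0.06 + 1 * 0.01 + 2 * 0.005 + (total_cores - 4) * 0.0025

# Hypothetical example: a 16-core node reserves
# 0.06 + 0.01 + 0.01 + 12 x 0.0025 = about 0.11 cores.
print(reserved_cpu(16))  # about 0.11
```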
Rules for CCE to Reserve Data Disks on Nodes
CCE uses Logical Volume Manager (LVM) to manage disks. LVM creates a metadata area on a disk to store logical and physical volume information, which occupies 4 MiB of space. Therefore, the actual available disk space of a node is the disk size minus 4 MiB.
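For example, a hypothetical 100 GiB data disk provides 100 x 1024 MiB – 4 MiB = 102,396 MiB of usable space.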