Updated on 2024-04-25 GMT+08:00

Cloud Native 2.0 Network

Model Definition

Cloud Native 2.0 network is a container network model developed by CCE that deeply integrates the elastic network interfaces (ENIs) and sub-ENIs of Virtual Private Cloud (VPC). Container IP addresses are allocated from the VPC CIDR block. ELB passthrough networking directs access requests straight to containers, and security groups and EIPs can be bound to pods, which delivers high performance.

Figure 1 Cloud Native 2.0 network

Pod-to-pod communication

  • Pods on BMS nodes use ENIs, whereas pods on ECS nodes use sub-ENIs. Sub-ENIs are attached to ENIs through VLAN sub-interfaces.
  • On the same node: Packets are forwarded through the VPC ENI or sub-ENI.
  • Across nodes: Packets are forwarded through the VPC ENI or sub-ENI.

Constraints

This network model is available only to CCE Turbo clusters.

Advantages and Disadvantages

Advantages

  • Because the container network uses the VPC directly, network problems are easy to locate and the network delivers the highest performance.
  • External networks in a VPC can be directly connected to container IP addresses.
  • The load balancing, security group, and EIP capabilities provided by VPC can be directly used by pods.

Disadvantages

The container network directly uses VPC, which occupies the VPC address space. Therefore, you must properly plan the container CIDR block before creating a cluster.

Application Scenarios

  • High performance requirements and use of other VPC network capabilities: Cloud Native Network 2.0 uses the VPC directly, so its performance is almost the same as that of the VPC network. It therefore suits scenarios with demanding bandwidth and latency requirements, such as live streaming and e-commerce flash sales.
  • Large-scale networking: Cloud Native Network 2.0 supports a maximum of 2000 ECS nodes and 100,000 containers.

Container IP Address Management

In the Cloud Native Network 2.0 model, BMS nodes use ENIs and ECS nodes use sub-ENIs.

  • The IP address of the pod is directly allocated from the VPC subnet configured for the container network. You do not need to allocate an independent small network segment to the node.
  • To add an ECS node to a cluster, bind the ENI that carries the sub-ENI first. After the ENI is bound, you can bind the sub-ENI.
  • Number of ENIs bound to an ECS node: For clusters of v1.19.16-r40, v1.21.11-r0, v1.23.9-r0, v1.25.4-r0, v1.27.1-r0, and later versions, the value is 1. For clusters of earlier versions, the value is the maximum number of sub-ENIs that can be bound to the node divided by 64 (rounded up).
  • ENIs bound to an ECS node = Number of ENIs used to bear sub-ENIs + Number of sub-ENIs currently used by pods + Number of pre-bound sub-ENIs
  • ENIs bound to a BMS node = Number of ENIs currently used by pods + Number of pre-bound ENIs
  • When a pod is created, an available ENI is randomly allocated from the pre-bound ENI pool of the node.
  • When the pod is deleted, the ENI is released back to the ENI pool of the node.
  • When a node is deleted, the ENIs are released back to the pool, and the sub-ENIs are deleted.
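
The ENI accounting for an ECS node in the list above can be sketched as follows. This is a minimal illustration rather than CCE's implementation: the function names and the `legacy` flag are hypothetical, and the divisor of 64 is taken from the rule for earlier cluster versions.

```python
import math

SUB_ENIS_PER_CARRIER = 64  # divisor from the rule for earlier cluster versions


def carrier_enis(max_sub_enis: int, legacy: bool) -> int:
    """Number of ENIs bound to an ECS node to carry sub-ENIs."""
    if not legacy:
        return 1  # newer cluster versions always bind a single ENI
    return math.ceil(max_sub_enis / SUB_ENIS_PER_CARRIER)


def ecs_bound_enis(carriers: int, used_sub_enis: int, prebound_sub_enis: int) -> int:
    """Total ENIs bound to an ECS node, following the formula in the list."""
    return carriers + used_sub_enis + prebound_sub_enis


print(carrier_enis(256, legacy=True))                           # 4
print(ecs_bound_enis(1, used_sub_enis=6, prebound_sub_enis=4))  # 11
```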

Cloud Native Network 2.0 supports dynamic and threshold-based ENI pre-binding policies. The following table lists the scenarios.

Table 1 Comparison between ENI pre-binding policies

Dynamic ENI Pre-binding Policy (Default)

  • Management policy:
    • nic-minimum-target: minimum number of ENIs (pre-bound and unused + used) bound to a node.
    • nic-maximum-target: If the number of ENIs bound to a node exceeds the value of this parameter, the system does not proactively pre-bind ENIs.
    • nic-warm-target: minimum number of pre-bound ENIs on a node.
    • nic-max-above-warm-target: ENIs are unbound and reclaimed only when the number of idle ENIs minus the value of nic-warm-target is greater than the threshold.
  • Application scenario: Accelerates pod startup while improving IP resource utilization. This mode applies to scenarios where the number of IP addresses in the container network segment is insufficient. For details about the preceding parameters, see Pre-binding ENIs for CCE Turbo Clusters.

Threshold-based ENI Pre-binding Policy

  • Management policy:
    • Low threshold of the number of bound ENIs: minimum number of ENIs (unused + used) bound to a node.
    • High threshold of the number of bound ENIs: maximum number of ENIs that can be bound to a node. If the number of ENIs bound to a node exceeds the value of this parameter, the system unbinds the idle ENIs.
  • Application scenario: Applies to scenarios where the number of IP addresses in the container CIDR block is sufficient and the number of pods on nodes changes sharply but is fixed in a certain range.

  • For clusters of versions from 1.19.16-r2, 1.21.5-r0, or 1.23.3-r0 to 1.19.16-r4, 1.21.7-r0, or 1.23.5-r0, only the nic-minimum-target and nic-warm-target parameters are supported. The threshold-based pre-binding policy takes priority over the dynamic ENI pre-binding policy.
  • For clusters of 1.19.16-r4, 1.21.7-r0, 1.23.5-r0, 1.25.1-r0, or later, all of the preceding parameters are supported. The dynamic ENI pre-binding policy takes priority over the threshold-based pre-binding policy.

Figure 2 Dynamic ENI pre-binding policy

CCE provides four parameters for the dynamic ENI pre-binding policy. Set these parameters properly.

Table 2 Parameters of the dynamic ENI pre-binding policy

nic-minimum-target

  • Default value: 10
  • Description: Minimum number of ENIs bound to a node. The value can be a number or a percentage. Set both nic-minimum-target and nic-maximum-target to the same value or percentage.
    • Value: The value must be a positive integer. For example, 10 indicates that at least 10 ENIs are bound to a node. If the value exceeds the ENI quota of the node, the ENI quota is used.
    • Percentage: The value ranges from 1% to 100%. For example, if the value is 10% and the ENI quota of a node is 128, at least 12 (rounded down) ENIs are bound to the node.
  • Suggestion: Set these parameters based on the number of pods.

nic-maximum-target

  • Default value: 0
  • Description: If the number of ENIs bound to a node exceeds the value of nic-maximum-target, the system does not proactively pre-bind ENIs. If the value of this parameter is greater than or equal to the value of nic-minimum-target, the check on the maximum number of pre-bound ENIs is enabled. Otherwise, the check is disabled. The value can be a number or a percentage. Set both nic-minimum-target and nic-maximum-target to the same value or percentage.
    • Value: The value must be a non-negative integer. For example, 0 indicates that the check on the maximum number of pre-bound ENIs is disabled. If the value exceeds the ENI quota of the node, the ENI quota is used.
    • Percentage: The value ranges from 1% to 100%. For example, if the value is 50% and the ENI quota of a node is 128, the maximum number of pre-bound ENIs is 64 (rounded down).
  • Suggestion: Set these parameters based on the number of pods.

nic-warm-target

  • Default value: 2
  • Description: Minimum number of pre-bound ENIs on a node. The value must be a number. When the value of nic-warm-target plus the number of bound ENIs is greater than the value of nic-maximum-target, the system pre-binds only the difference between the value of nic-maximum-target and the number of bound ENIs.
  • Suggestion: Set this parameter to the number of pods that can be scaled out instantaneously within 10 seconds.

nic-max-above-warm-target

  • Default value: 2
  • Description: Pre-bound ENIs are unbound and reclaimed only when the number of idle ENIs on a node minus the value of nic-warm-target is greater than this threshold. The value must be a number.
    • A larger value slows down the reclamation of idle ENIs and accelerates pod startup. However, IP address utilization decreases, which matters most when IP addresses are insufficient. Exercise caution when increasing the value.
    • A smaller value accelerates the reclamation of idle ENIs and improves IP address utilization. However, when a large number of pods are created at once, the startup of some pods slows down.
  • Suggestion: Set this parameter based on the difference between the number of pods that are frequently scaled on most nodes within minutes and the number of pods that are instantly scaled out on most nodes within 10 seconds.

The preceding parameters support global configuration at the cluster level and custom settings at the node pool level. The latter takes priority over the former.
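
The number-or-percentage forms accepted by nic-minimum-target and nic-maximum-target in Table 2 can be illustrated with a small sketch. The function `resolve_target` is hypothetical and only mirrors the rules stated in the table (percentages rounded down, plain numbers capped at the ENI quota); how CCE parses the setting internally is not described here.

```python
def resolve_target(setting: str, eni_quota: int) -> int:
    """Convert a number or percentage setting into an ENI count for one node."""
    text = setting.strip()
    if text.endswith("%"):
        percent = int(text[:-1])
        if not 1 <= percent <= 100:
            raise ValueError("percentage must be between 1% and 100%")
        return eni_quota * percent // 100  # rounded down, as in Table 2
    # A plain number is capped at the ENI quota of the node.
    return min(int(text), eni_quota)


print(resolve_target("10%", 128))  # 12
print(resolve_target("50%", 128))  # 64
print(resolve_target("200", 128))  # 128
```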

The container networking component maintains a scalable pre-bound ENI pool for each node. Every 10 seconds, the component checks the pool and calculates the number of ENIs to pre-bind or unbind:

  • Number of pre-bound ENIs = min(nic-maximum-target – Number of bound ENIs, max(nic-minimum-target – Number of bound ENIs, nic-warm-target – Number of idle ENIs))
  • Number of ENIs to be unbound = min(Number of idle ENIs – nic-warm-target – nic-max-above-warm-target, Number of bound ENIs – nic-minimum-target)

The number of pre-bound ENIs on the node remains in the following range:

  • Minimum number of ENIs to be pre-bound = min(max(nic-minimum-target – Number of bound ENIs, nic-warm-target), nic-maximum-target – Number of bound ENIs)
  • Maximum number of ENIs to be pre-bound = max(nic-warm-target + nic-max-above-warm-target, Number of bound ENIs – nic-minimum-target)
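
The first two formulas above can be evaluated with a short sketch. It rests on two assumptions that the formulas do not spell out: negative results are clamped to zero, and the nic-maximum-target term is skipped when the check is disabled (nic-maximum-target less than nic-minimum-target, as described in Table 2). The function names are hypothetical.

```python
def max_check_enabled(min_target: int, max_target: int) -> bool:
    """The upper-limit check applies only when max_target >= min_target."""
    return max_target >= min_target


def enis_to_prebind(min_target, max_target, warm_target, bound, idle):
    """ENIs to pre-bind in one 10-second reconciliation round."""
    wanted = max(min_target - bound, warm_target - idle)
    if max_check_enabled(min_target, max_target):
        wanted = min(max_target - bound, wanted)
    return max(wanted, 0)  # assumption: never a negative count


def enis_to_unbind(min_target, warm_target, max_above_warm, bound, idle):
    """Idle ENIs to unbind in one 10-second reconciliation round."""
    surplus = min(idle - warm_target - max_above_warm, bound - min_target)
    return max(surplus, 0)  # assumption: never a negative count


# Default settings: 10 / 0 / 2 / 2.
print(enis_to_prebind(10, 0, 2, bound=0, idle=0))   # 10 (new node)
print(enis_to_prebind(10, 0, 2, bound=20, idle=1))  # 1 (refill the warm pool)
print(enis_to_unbind(10, 2, 2, bound=20, idle=8))   # 4 (reclaim the surplus)
```

With the defaults, a new node is brought up to 10 bound ENIs, and idle ENIs are reclaimed only once more than 4 (2 + 2) of them are idle.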

When a pod is created, an idle ENI (the earliest unused one) is preferentially allocated from the pool. If no idle ENI is available, a new sub-ENI is bound to the pod.

When the pod is deleted, the corresponding ENI is released back to the pre-bound ENI pool of the node and enters a 2-minute cooldown period, during which it can be bound to another pod. If the ENI is not bound to any pod within 2 minutes, it is released.
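
The allocation and cooldown behavior described above can be modeled with a first-in, first-out queue. This is a simplified sketch under stated assumptions: "earliest unused" is read as "idle for the longest time", timestamps are passed in as plain seconds, and the interaction with nic-warm-target is ignored. The class `IdleEniPool` is hypothetical.

```python
from collections import deque

COOLDOWN_SECONDS = 120  # the 2-minute cooldown described above


class IdleEniPool:
    """Idle ENIs of one node, oldest first."""

    def __init__(self):
        self._idle = deque()  # entries are (eni_id, released_at)

    def release(self, eni_id: str, now: float) -> None:
        """Return an ENI to the pool when its pod is deleted."""
        self._idle.append((eni_id, now))

    def allocate(self):
        """Hand out the ENI that has been idle longest, or None if the pool is empty."""
        if not self._idle:
            return None  # the caller would bind a new sub-ENI instead
        eni_id, _ = self._idle.popleft()
        return eni_id

    def reclaim_expired(self, now: float) -> list:
        """Remove and return ENIs that stayed idle for the whole cooldown."""
        expired = []
        while self._idle and now - self._idle[0][1] >= COOLDOWN_SECONDS:
            expired.append(self._idle.popleft()[0])
        return expired
```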

Figure 3 Threshold-based policy

CCE provides a configuration parameter for the threshold algorithms. You can set this parameter based on the service plan, cluster scale, and number of ENIs that can be bound to a node.

  • Low threshold of the number of bound ENIs: Minimum number of ENIs (unused + used) bound to a node. The default value is 0.
    • Minimum number of pre-bound ENIs on an ECS node = Number of ENIs bound to the node at the low threshold × Number of sub-ENIs on the node
    • Minimum number of pre-bound ENIs on a BMS node = Number of ENIs bound to the node at the low threshold × Number of ENIs on the node
  • High threshold of the number of bound ENIs: Maximum number of ENIs that can be bound to a node. The default value is 0. If the number of ENIs bound to a node exceeds the value of this parameter, the system unbinds the idle ENIs.
    • Maximum number of pre-bound ENIs on an ECS node = Number of bound ENIs at the high threshold × Number of sub-ENIs on the node
    • Maximum number of pre-bound ENIs on a BMS node = Number of bound ENIs at the high threshold × Number of ENIs on the node

The container networking component maintains a scalable ENI pool for each node.

  • If the number of bound ENIs (used ENIs + pre-bound ENIs) is less than the number of pre-bound ENIs at the low threshold, ENIs are bound until the two numbers are equal.
  • If the number of bound ENIs (used ENIs + pre-bound ENIs) is greater than the number of pre-bound ENIs at the high threshold and the number of pre-bound ENIs is greater than 0, pre-bound ENIs that have not been used for more than 2 minutes are released periodically. Release stops when either of the following conditions is met:
    • The number of bound ENIs equals the number of pre-bound ENIs at the high threshold.
    • The number of used ENIs is greater than the number of pre-bound ENIs at the high threshold, and the number of pre-bound ENIs on the node is 0.
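
The threshold-based behavior can be sketched as a single reconciliation step. The example treats the thresholds as ratios of the node's ENI capacity, using the nic-threshold=0.3:0.6 setting that Table 3 lists for BMS nodes. Rounding down is an assumption, the 2-minute idle condition is left out for brevity, and the function names are hypothetical.

```python
import math


def threshold_counts(capacity: int, low: float, high: float):
    """Translate the low and high thresholds into ENI counts for one node."""
    return math.floor(capacity * low), math.floor(capacity * high)


def reconcile(used: int, prebound: int, low_count: int, high_count: int) -> int:
    """Positive: ENIs to bind. Negative: idle ENIs to release. Zero: no action."""
    bound = used + prebound
    if bound < low_count:
        return low_count - bound
    if bound > high_count and prebound > 0:
        # Only idle (pre-bound) ENIs can be released.
        return -min(prebound, bound - high_count)
    return 0


low_count, high_count = threshold_counts(128, 0.3, 0.6)
print(low_count, high_count)                      # 38 76
print(reconcile(10, 5, low_count, high_count))    # 23
print(reconcile(70, 20, low_count, high_count))   # -14
```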

Recommendation for CIDR Block Planning

As described in Cluster Network Structure, network addresses in a cluster can be divided into three parts: node network, container network, and service network. When planning network addresses, consider the following aspects:

  • The three CIDR blocks cannot overlap. Otherwise, a conflict occurs. All subnets (including those created from the secondary CIDR block) in the VPC where the cluster resides cannot conflict with the container and Service CIDR blocks.
  • Ensure that each CIDR block has sufficient IP addresses.
    • The IP addresses in the node CIDR block must match the cluster scale. Otherwise, nodes cannot be created due to insufficient IP addresses.
    • The IP addresses in the container CIDR block must match the service scale. Otherwise, pods cannot be created due to insufficient IP addresses.

In the Cloud Native Network 2.0 model, the container CIDR block and node CIDR block share the network addresses in a VPC. Use different subnets for containers and nodes. If they share a subnet, containers or nodes may fail to be created due to insufficient IP addresses.

In addition, a subnet can be added to the container CIDR block after a cluster is created to increase the number of available IP addresses. In this case, ensure that the added subnet does not conflict with other subnets in the container CIDR block.
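
A CIDR plan can be checked for overlaps before the cluster is created. The sketch below uses Python's standard ipaddress module. The CIDR values are hypothetical examples chosen to resemble the addresses used later in this section, and `find_overlaps` is an illustrative helper, not a CCE tool.

```python
import ipaddress
from itertools import combinations


def find_overlaps(blocks: dict) -> list:
    """Return the name pairs of CIDR blocks that overlap."""
    networks = {name: ipaddress.ip_network(cidr) for name, cidr in blocks.items()}
    return [
        (a, b)
        for (a, net_a), (b, net_b) in combinations(networks.items(), 2)
        if net_a.overlaps(net_b)
    ]


# Hypothetical plan: node subnet, container subnet, and Service CIDR block.
plan = {
    "node": "10.1.0.0/24",
    "container": "10.1.16.0/20",
    "service": "10.247.0.0/16",
}
print(find_overlaps(plan))                                      # []
print(ipaddress.ip_network(plan["container"]).num_addresses)    # 4096
```

The address count of the container subnet gives a rough upper bound on the number of pods and pre-bound ENIs it can hold.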

Figure 4 Setting the CIDR blocks (when creating the cluster)

Example of Cloud Native Network 2.0 Access

Create a CCE Turbo cluster, which contains three ECS nodes.

Access the details page of one node. The node has one primary ENI and one extended ENI, both of which are ENIs rather than sub-ENIs. The extended ENI belongs to the container CIDR block and carries the sub-ENIs that are attached to pods.

Figure 5 Node ENIs

Create a Deployment in the cluster.

kind: Deployment
apiVersion: apps/v1
metadata:
  name: example
  namespace: default
spec:
  replicas: 6
  selector:
    matchLabels:
      app: example
  template:
    metadata:
      labels:
        app: example
    spec:
      containers:
        - name: container-0
          image: 'nginx:perl'
          resources:
            limits:
              cpu: 250m
              memory: 512Mi
            requests:
              cpu: 250m
              memory: 512Mi
      imagePullSecrets:
        - name: default-secret

View the created pods.

$ kubectl get pod -owide
NAME                       READY   STATUS    RESTARTS   AGE   IP            NODE         NOMINATED NODE   READINESS GATES
example-5bdc5699b7-54v7g   1/1     Running   0          7s    10.1.18.2     10.1.0.167   <none>           <none>
example-5bdc5699b7-6dzx5   1/1     Running   0          7s    10.1.18.216   10.1.0.186   <none>           <none>
example-5bdc5699b7-gq7xs   1/1     Running   0          7s    10.1.16.63    10.1.0.144   <none>           <none>
example-5bdc5699b7-h9rvb   1/1     Running   0          7s    10.1.16.125   10.1.0.167   <none>           <none>
example-5bdc5699b7-s9fts   1/1     Running   0          7s    10.1.16.89    10.1.0.144   <none>           <none>
example-5bdc5699b7-swq6q   1/1     Running   0          7s    10.1.17.111   10.1.0.167   <none>           <none>

The IP addresses of all pods belong to sub-ENIs, which are attached to the extended ENI of the node.

For example, the extended ENI of node 10.1.0.167 is 10.1.17.172. On the Network Interfaces page of the Network Console, you can see that three sub-ENIs are attached to the extended ENI 10.1.17.172. Their IP addresses are the IP addresses of the three pods on that node.

Figure 6 Pod ENIs
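
The number of sub-ENIs expected on each node's extended ENI equals the number of pods scheduled to that node. As a rough cross-check, the following sketch tallies the kubectl output shown earlier. The parsing is an assumption based on the column layout of that listing, and `pods_per_node` is a hypothetical helper.

```python
from collections import Counter

KUBECTL_OUTPUT = """\
NAME                       READY   STATUS    RESTARTS   AGE   IP            NODE         NOMINATED NODE   READINESS GATES
example-5bdc5699b7-54v7g   1/1     Running   0          7s    10.1.18.2     10.1.0.167   <none>           <none>
example-5bdc5699b7-6dzx5   1/1     Running   0          7s    10.1.18.216   10.1.0.186   <none>           <none>
example-5bdc5699b7-gq7xs   1/1     Running   0          7s    10.1.16.63    10.1.0.144   <none>           <none>
example-5bdc5699b7-h9rvb   1/1     Running   0          7s    10.1.16.125   10.1.0.167   <none>           <none>
example-5bdc5699b7-s9fts   1/1     Running   0          7s    10.1.16.89    10.1.0.144   <none>           <none>
example-5bdc5699b7-swq6q   1/1     Running   0          7s    10.1.17.111   10.1.0.167   <none>           <none>
"""


def pods_per_node(output: str) -> Counter:
    """Count pods per node from `kubectl get pod -owide` text output."""
    counts = Counter()
    for line in output.splitlines():
        fields = line.split()
        if len(fields) >= 7 and fields[0] != "NAME":
            counts[fields[6]] += 1  # NODE is the seventh column
    return counts


print(pods_per_node(KUBECTL_OUTPUT)["10.1.0.167"])  # 3
```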

In the VPC, the IP address of the pod can be successfully accessed.

Performance on Batch Creating Pods in a CCE Turbo Cluster

Pods in a CCE Turbo cluster request ENIs or sub-ENIs from VPC. Pods are bound with ENIs or sub-ENIs after pod scheduling is complete. The pod creation speed is constrained by how fast ENIs are created and bound. The following table describes the constraints.

Table 3 Time required for creating ENIs

ECS nodes

  • ENI type: Sub-ENI
  • Maximum number of supported ENIs: 256
  • Binding ENI to node: Specifying the ENI of a node to create a sub-ENI
  • ENI availability: Within 1 second
  • Concurrency control: Tenant-level: 600/minute
  • Default pre-binding configuration of ENI to node:
    • For clusters of versions earlier than 1.19.16-r2, 1.21.5-r0, or 1.23.3-r0, ENI pre-binding is not supported.
    • For clusters of versions between 1.19.16-r2, 1.21.5-r0, or 1.23.3-r0 and 1.19.16-r4, 1.21.7-r0, or 1.23.5-r0, dynamic ENI pre-binding is supported (nic-minimum-target=10; nic-warm-target=2).
    • For clusters of 1.19.16-r4, 1.21.7-r0, 1.23.5-r0, 1.25.1-r0, and later, dynamic pre-binding is supported (nic-minimum-target=10; nic-maximum-target=0; nic-warm-target=2; nic-max-above-warm-target=2).

BMS nodes

  • ENI type: ENI
  • Maximum number of supported ENIs: 128
  • Binding ENI to node: Binding an ENI to a node
  • ENI availability: 20s to 30s
  • Concurrency control: Node-level: 3 concurrently
  • Default pre-binding configuration of ENI to node:
    • For clusters earlier than 1.19.16-r4, 1.21.7-r0, and 1.23.5-r0, the total number of ENIs is determined by the high and low thresholds (nic-threshold=0.3:0.6).
    • For clusters of 1.19.16-r4, 1.21.7-r0, 1.23.5-r0, 1.25.1-r0, 1.28.1-r0, and later, dynamic pre-binding is supported (nic-minimum-target=10; nic-maximum-target=0; nic-warm-target=2; nic-max-above-warm-target=2).

Pre-binding consumes container subnet IP addresses and affects the number of pods that can run in the cluster. You should properly plan and configure dynamic pre-binding based on the service scale. For details, see Pre-Binding Container ENI for CCE Turbo Clusters.

Creating a Pod on an ECS Node (Using Sub-ENIs)

  • If no pre-bound ENI is available on the node to which a pod is scheduled, the API used for creating a sub-ENI is called to create a sub-ENI using the ENI of the node and allocate the sub-ENI to the pod.
  • If a pre-bound ENI is available on the node to which a pod is scheduled, the unused sub-ENI that is created the earliest is allocated to the pod.
  • Limited by the concurrent creation speed of sub-ENIs, a maximum of 600 pods can be created per minute without pre-binding. If a larger-scale creation is required, you can configure pre-binding for sub-ENIs as needed.
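
The effect of the 600-per-minute limit can be estimated with a short calculation. This is a lower-bound sketch only: it ignores scheduling, image pulls, and container startup, and it assumes that pods land on nodes where pre-bound sub-ENIs are available. The function name is hypothetical.

```python
import math

SUB_ENI_RATE_PER_MINUTE = 600  # tenant-level limit from Table 3


def minutes_to_create(pods: int, prebound_sub_enis: int = 0) -> int:
    """Minimum whole minutes needed to create sub-ENIs for a batch of pods."""
    pending = max(pods - prebound_sub_enis, 0)
    return math.ceil(pending / SUB_ENI_RATE_PER_MINUTE)


print(minutes_to_create(1500))                         # 3
print(minutes_to_create(1500, prebound_sub_enis=900))  # 1
```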

Creating a Pod on a BMS Node (Using ENIs)

  • If no pre-bound ENI is available on the node to which a pod is scheduled, the API used for binding an ENI to a node is called to bind an ENI and allocate it to the pod. It takes about 20 to 30 seconds to bind an ENI to a BMS node.
  • If a pre-bound ENI is available on the node to which a pod is scheduled, the unused ENI that is created the earliest is allocated to the pod.
  • Limited by the speed of binding ENIs to BMS nodes, only three pods on the same node can start within 20 seconds without pre-binding. Therefore, pre-bind all available ENIs for BMS nodes.
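
The cost of skipping pre-binding on a BMS node can be estimated from the figures in Table 3 (three concurrent bindings per node, 20 to 30 seconds each). The sketch assumes that bindings proceed in full rounds of three and ignores all other startup costs. The function name is hypothetical.

```python
import math

CONCURRENT_BINDINGS = 3  # node-level concurrency from Table 3
BIND_SECONDS = (20, 30)  # time to bind one ENI to a BMS node


def bms_wait_seconds(pods: int, prebound_enis: int = 0):
    """Return (best, worst) seconds spent binding ENIs for pods on one node."""
    pending = max(pods - prebound_enis, 0)
    rounds = math.ceil(pending / CONCURRENT_BINDINGS)
    return rounds * BIND_SECONDS[0], rounds * BIND_SECONDS[1]


print(bms_wait_seconds(10))                    # (80, 120)
print(bms_wait_seconds(10, prebound_enis=10))  # (0, 0)
```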