Reliability Functions

Cluster HA

Three CCE master nodes can be deployed in HA mode to ensure cluster reliability.

Data Backup and Restoration

For persistent storage, you can use CCE to mount storage volumes created from EVS disks to a path in a container. CCE works with EVS to support snapshots. If data is lost, you can roll back the disk to the state it was in when a snapshot was created.

For details, see Snapshots and Backups.

Health Checks

Container health can be checked periodically while the container is running. If no health check is configured, the pod cannot detect application exceptions or automatically restart the application to recover it. As a result, the pod status can appear normal even though the application in the pod is not working.

Kubernetes provides the following health check probes:

Liveness probe (livenessProbe): checks whether a container is still alive. It is similar to the ps command that checks whether a process exists. If the liveness probe of a container fails, the cluster restarts the container. If the liveness probe is successful, no further action is taken.
Readiness probe (readinessProbe): checks whether a container is ready to process user requests. An application may take a long time to start up and provide services, for example, because it needs to load data from disk or wait for an external module to start. In this case, the application process is running, but the application is not yet ready to serve requests. This is where the readiness probe comes in. If the readiness probe of a container fails, the cluster stops directing service traffic to the container. If the readiness probe is successful, the container can be accessed.
Startup probe (startupProbe): checks whether a containerized application has started. If such a probe is configured, liveness and readiness checks are disabled until it succeeds, so that those probes do not interfere with application startup. This is useful for slow-starting containers, because it prevents the kubelet from terminating them before they are up and running.
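
As a rough illustration of how the three probes fit together, the following Python sketch builds a container spec as a plain dictionary that mirrors the Kubernetes manifest fields startupProbe, livenessProbe, and readinessProbe. The container name, image, port, the /healthz and /ready paths, and all threshold values are assumptions chosen for the example, not values prescribed by CCE.

```python
import json


def build_container_with_probes(name, image, port):
    """Return a container spec (as a dict) with startup, liveness, and readiness probes.

    The HTTP paths and thresholds below are illustrative assumptions.
    """
    return {
        "name": name,
        "image": image,
        "ports": [{"containerPort": port}],
        # Startup probe: liveness and readiness checks are held off until it succeeds.
        "startupProbe": {
            "httpGet": {"path": "/healthz", "port": port},  # hypothetical endpoint
            "periodSeconds": 10,
            "failureThreshold": 30,
        },
        # Liveness probe: the container is restarted after repeated failures.
        "livenessProbe": {
            "httpGet": {"path": "/healthz", "port": port},  # hypothetical endpoint
            "periodSeconds": 10,
            "failureThreshold": 3,
        },
        # Readiness probe: while it fails, service traffic is not sent to the container.
        "readinessProbe": {
            "httpGet": {"path": "/ready", "port": port},  # hypothetical endpoint
            "periodSeconds": 5,
            "failureThreshold": 3,
        },
    }


def startup_budget_seconds(container):
    """Approximate time the application has to start before the startup probe gives up."""
    probe = container["startupProbe"]
    return probe["failureThreshold"] * probe["periodSeconds"]


container = build_container_with_probes("web", "nginx:alpine", 8080)
print(json.dumps(container, indent=2))
print("startup budget (s):", startup_budget_seconds(container))
```

With these example values, a slow-starting application gets roughly failureThreshold x periodSeconds = 300 seconds to start before the kubelet treats the startup as failed.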

For details, see Configuring Container Health Check.

Anti-Affinity

CCE supports node anti-affinity. When creating a node pool, you can select an ECS group and configure an anti-affinity policy for it. Then ECSs in the same ECS group will be distributed to different hosts for higher reliability.

CCE supports affinity and anti-affinity between workloads and nodes as well as between workloads:

Node affinity: You can choose whether workloads are deployed on, or kept away from, a specified node or AZ.
Affinity or anti-affinity between workloads: Workloads can be deployed on the same node to minimize the usage of network resources, or on different nodes to minimize the impact of potential breakdowns.
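
The two policies above correspond to the nodeAffinity and podAntiAffinity fields of a Kubernetes pod spec. The sketch below builds both as Python dictionaries. The label app=web and the zone names are hypothetical; kubernetes.io/hostname and topology.kubernetes.io/zone are the standard Kubernetes node labels, assumed here to be present on the cluster nodes.

```python
def node_affinity_for_zones(zones, exclude=False):
    """Require pods to run in (or, with exclude=True, outside) the given zones."""
    return {
        "nodeAffinity": {
            "requiredDuringSchedulingIgnoredDuringExecution": {
                "nodeSelectorTerms": [
                    {
                        "matchExpressions": [
                            {
                                "key": "topology.kubernetes.io/zone",
                                "operator": "NotIn" if exclude else "In",
                                "values": list(zones),
                            }
                        ]
                    }
                ]
            }
        }
    }


def pod_anti_affinity(app_label, topology_key="kubernetes.io/hostname"):
    """Keep pods carrying the same app label on different nodes."""
    return {
        "podAntiAffinity": {
            "requiredDuringSchedulingIgnoredDuringExecution": [
                {
                    "labelSelector": {"matchLabels": {"app": app_label}},
                    "topologyKey": topology_key,
                }
            ]
        }
    }


# Combine both rules into one affinity section of a pod template.
affinity = {}
affinity.update(node_affinity_for_zones(["zone-a", "zone-b"]))  # hypothetical zone names
affinity.update(pod_anti_affinity("web"))  # hypothetical workload label
```

Replacing podAntiAffinity with podAffinity in the same structure would express the opposite policy, co-locating the workloads on one node.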

For details, see Scheduling Policies (Affinity and Anti-affinity).

Overload Control

CCE clusters support overload control. If this function is enabled, the number of concurrent requests is dynamically limited based on the resource load on the master nodes, which keeps the master nodes and the cluster running stably.

For details, see Enabling Overload Control for a Cluster.

Auto Scaling

CCE supports auto scaling of workloads and nodes:

Workload scaling: auto scaling at the scheduling layer, which changes the scheduling capacity occupied by a workload. For example, you can use HPA, a scaling component at the scheduling layer, to adjust the number of pods that run an application.
Node scaling: auto scaling at the resource layer. When the planned cluster nodes cannot accommodate the workloads to be scheduled, ECS or CCI resources are provided to support scheduling.

Workload scaling and node scaling can work separately or together.
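
To make workload scaling more concrete, the following sketch reproduces the basic replica calculation documented for the Kubernetes HPA: the desired pod count is the current count multiplied by the ratio of the observed metric to its target, rounded up. The function name, the default bounds, and the sample numbers are assumptions for illustration. The real controller also handles details not modeled here, such as unready pods and stabilization windows.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10, tolerance=0.1):
    """Simplified HPA calculation: ceil(current_replicas * current_metric / target_metric).

    No scaling happens while the ratio stays within the tolerance around 1.0.
    The result is clamped to [min_replicas, max_replicas].
    """
    ratio = current_metric / target_metric
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    desired = math.ceil(current_replicas * ratio)
    return max(min_replicas, min(max_replicas, desired))


# 4 pods at 90% average CPU with a 60% target: scale out.
print(desired_replicas(4, 90, 60))   # 6
# 4 pods at 30% average CPU with a 60% target: scale in.
print(desired_replicas(4, 30, 60))   # 2
# 4 pods at 200% of the target metric: capped by max_replicas.
print(desired_replicas(4, 200, 50))  # 10
```

If the extra pods cannot be scheduled on the existing nodes, node scaling is what supplies the additional capacity.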

For details, see Auto Scaling Overview.