Upgrading a Pod Without Interrupting Services
Application Scenarios
In a Kubernetes cluster, an application deployed as a Deployment can be exposed to external networks through a LoadBalancer Service. When the application is updated or upgraded, the Deployment creates new pods that gradually replace the old ones. During this process, services may be interrupted.
Solution
To prevent an application upgrade from interrupting services, configure Deployments and Services as follows:
- In a Deployment, upgrade pods in Rolling upgrade mode. In this mode, pods are replaced in batches rather than all at once, so you can control the upgrade speed and the number of pods being replaced concurrently. For example, configure maxSurge to limit how many new pods can be created above the desired pod count, and maxUnavailable to limit how many pods can be unavailable at the same time. Ensure that there are always pods available to serve requests during the upgrade.
- A LoadBalancer Service supports two types of service affinity:
  - Cluster-level service affinity (externalTrafficPolicy: Cluster). In this mode, if a request reaches a node that has no pod of the workload, the request is forwarded to a pod on another node. During the cross-node forwarding, the source IP address may be lost.
  - Node-level service affinity (externalTrafficPolicy: Local). In this mode, requests are forwarded directly to the nodes where the pods reside. No cross-node forwarding is involved, so the source IP address is preserved. However, if the node where a pod resides changes during the rolling upgrade, the ELB backend servers change accordingly, which may interrupt services. In this case, upgrade pods in place so that at least one pod is always running properly on each ELB backend node.
The table below lists the solutions to ensure service continuity during a pod upgrade.
| Scenario | Service | Deployment |
|---|---|---|
| The source IP address does not need to be preserved. | Select the Cluster-level service affinity. | Select Rolling upgrade for Upgrade Mode, configure graceful termination, and use the liveness probe and the readiness probe. |
| The source IP address needs to be preserved. | Select the Node-level service affinity. | Select Rolling upgrade for Upgrade Mode, configure graceful termination, use the liveness probe and readiness probe, and add some node affinity rules. Ensure that there is at least one pod running on each node during the upgrade. |
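
The two service affinity options in the table correspond to the externalTrafficPolicy field of a LoadBalancer Service. The manifest below is a minimal sketch only: the Service name, label selector, and ports are hypothetical, and any provider-specific load balancer annotations that your cluster requires are deliberately left out.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-app-lb              # hypothetical Service name
spec:
  type: LoadBalancer
  # Local   = node-level service affinity (source IP address preserved)
  # Cluster = cluster-level service affinity (source IP address may be lost)
  externalTrafficPolicy: Local
  selector:
    app: my-app                # hypothetical label; must match the Deployment's pod labels
  ports:
  - name: http
    protocol: TCP
    port: 80                   # port exposed by the load balancer (assumed)
    targetPort: 80             # container port (assumed)
```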
Procedure
In this example, a workload has 200 pods and is exposed to external networks through a LoadBalancer Service. Upgrading a workload associated with a LoadBalancer Service or an ingress involves cross-service calls, so the rolling upgrade parameters must be configured carefully.
- Log in to the CCE console and click the cluster name to access the cluster console. In the navigation pane, choose Workloads.
- In the workload list, click Upgrade in the Operation column of the workload to be upgraded. The Upgrade Workload page is displayed.
- Enable the liveness probe and readiness probe. In the Container Settings area, click Health Check and enable both probes. In this example, TCP is selected for Check Method. Configure Period (s), Delay (s), and Timeout (s) based on how your application behaves. Some applications take a long time to start. If these values are too small, the probes can fail before startup completes and the container will be restarted repeatedly.
In this example, the readiness probe delay is set to 20 seconds to control the interval between batches of the rolling upgrade.
Figure 1 Enabling the liveness probe and readiness probe
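
For reference, the console settings above roughly correspond to the following Deployment fragment. This is a sketch under assumptions: the container name, image, and port are hypothetical, the mapping of Delay (s), Period (s), and Timeout (s) to initialDelaySeconds, periodSeconds, and timeoutSeconds is assumed, and only the 20-second readiness delay comes from this example. The other values are placeholders to tune for your application.

```yaml
# Fragment of a Deployment manifest (apiVersion: apps/v1, kind: Deployment)
spec:
  template:
    spec:
      containers:
      - name: app                    # hypothetical container name
        image: nginx:latest          # hypothetical image
        ports:
        - containerPort: 80          # assumed service port
        livenessProbe:
          tcpSocket:
            port: 80
          initialDelaySeconds: 20    # placeholder; raise for slow-starting applications
          periodSeconds: 10          # placeholder
          timeoutSeconds: 1          # placeholder
        readinessProbe:
          tcpSocket:
            port: 80
          initialDelaySeconds: 20    # the 20-second delay used in this example
          periodSeconds: 10          # placeholder
          timeoutSeconds: 1          # placeholder
```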
- Configure a rolling upgrade. In the Advanced Settings area, click Upgrade and select Rolling upgrade for Upgrade Mode. This ensures that pods of the old version are gradually replaced with pods of the new version.
In this example, maxUnavailable and maxSurge are both set to 2% to control the batch size of the rolling upgrade. With 200 pods, each setting resolves to 4 pods, so up to 8 pods are replaced per batch. Combined with the 20-second readiness probe delay, about eight pods are upgraded every 20 seconds.
Figure 2 Configuring a rolling upgrade
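
Expressed as a manifest, the rolling upgrade settings in this example might look like the fragment below. It is a sketch rather than the exact configuration the console generates. The comments spell out how the percentages resolve for 200 replicas: Kubernetes rounds a maxSurge percentage up and a maxUnavailable percentage down.

```yaml
# Fragment of a Deployment manifest (apiVersion: apps/v1, kind: Deployment)
spec:
  replicas: 200
  strategy:
    type: RollingUpdate
    rollingUpdate:
      # 2% of 200 = 4: up to 4 extra pods may be created above the desired count
      maxSurge: "2%"
      # 2% of 200 = 4: up to 4 pods may be unavailable at any time
      maxUnavailable: "2%"
      # Together: up to 4 + 4 = 8 pods are being replaced in each batch
```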
- Configure graceful termination.
  - In the Container Settings area, click Lifecycle and configure pre-stop processing. Set it to the time the pod needs to finish processing its remaining requests, most of which are requests on persistent connections. For example, you can have the container sleep for 30s after it receives a deletion request so that it has enough time to process the remaining requests.
  - In the Advanced Settings area, click Upgrade. Configure Scale-In Time Window (terminationGracePeriodSeconds), which is the maximum time the pod is given to run the pre-stop command and shut down before it is forcibly stopped. The scale-in time window must be greater than the pre-stop processing time specified in Lifecycle. As a guideline, set it to the pre-stop processing time plus 30s. If, for example, the pre-stop processing time is 30s, set the scale-in time window to 60s.
Figure 3 Entering the pre-stop command
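
The graceful termination settings above can be sketched as the following Deployment fragment. The container name and image are hypothetical, and the pre-stop command assumes that the image provides /bin/sh and sleep. The 30s and 60s values are the ones used in this example.

```yaml
# Fragment of a Deployment manifest (apiVersion: apps/v1, kind: Deployment)
spec:
  template:
    spec:
      # Scale-in time window: pre-stop processing time (30s) plus 30s
      terminationGracePeriodSeconds: 60
      containers:
      - name: app                  # hypothetical container name
        image: nginx:latest        # hypothetical image
        lifecycle:
          preStop:
            exec:
              # Sleep for 30s so that remaining requests can be processed
              # before the container receives the termination signal
              command: ["/bin/sh", "-c", "sleep 30"]
```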
- Add node affinity rules. This step is required only when Node-level is selected for Service Affinity of the Service. In the Advanced Settings area, click Scheduling and add Node Affinity rules. When adding a scheduling policy, specify the nodes on which the workload pods must run.
Figure 4 Adding node affinity rules
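
A node affinity rule of this kind could be written as the fragment below. This is a minimal sketch: the node names are hypothetical, and matching on the kubernetes.io/hostname label is only one way to select nodes. Any node label that identifies the intended ELB backend nodes would work.

```yaml
# Fragment of a Deployment manifest (apiVersion: apps/v1, kind: Deployment)
spec:
  template:
    spec:
      affinity:
        nodeAffinity:
          # Hard requirement: pods are scheduled only onto the listed nodes
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: kubernetes.io/hostname
                operator: In
                values:
                - node-a           # hypothetical node name
                - node-b           # hypothetical node name
```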
- After the configuration is complete, click Upgrade Workload.
On the Pods tab, check that the old pods are stopped only after the newly created pods are displayed. This ensures that there are always some workload pods running.