
Upgrading Pods Without Interrupting Services

Updated on 2024-05-31 GMT+08:00

Application Scenarios

In a Kubernetes cluster, applications can be deployed using Deployments and accessed externally through LoadBalancer Services. When an application is updated or upgraded, new pods are created in the Deployment and gradually replace the old ones. During this process, services may be interrupted.

Solution

To prevent an application upgrade from interrupting services, configure Deployments and Services as follows:

  • In a Deployment, upgrade pods in the Rolling upgrade mode. In this mode, pods are updated gradually rather than all at once, so you can control the update speed and the number of pods updated concurrently to keep services available during the upgrade. For example, you can configure the maxSurge and maxUnavailable parameters to control how many new pods are created and how many old pods are deleted at a time, ensuring that there are always pods available to provide services.
  • A LoadBalancer Service supports two types of service affinity:
    • Cluster-level service affinity (externalTrafficPolicy: Cluster): If there is no pod on the node that receives a request, the request is forwarded to a pod on another node. During this cross-node forwarding, the source IP address may be lost.
    • Node-level service affinity (externalTrafficPolicy: Local): Requests are forwarded directly to the node where a pod resides, with no cross-node forwarding, so the source IP address is preserved. However, if the node where a pod resides changes during a rolling upgrade, the ELB backend servers change accordingly, which may interrupt services. In this case, keep pods on the same nodes during the upgrade, for example, by adding node affinity policies, to ensure that there is always at least one pod running properly on each ELB backend node.
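
The service affinity described above corresponds to the externalTrafficPolicy field of the Service. A minimal sketch (the Service name, selector, and ports are illustrative, not taken from this document):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: nginx                      # illustrative name
spec:
  type: LoadBalancer
  externalTrafficPolicy: Local     # node-level affinity; use Cluster for cluster-level
  selector:
    app: nginx
  ports:
    - port: 80
      targetPort: 80
```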

The following solutions ensure service continuity during a pod upgrade, depending on the scenario:

  • Scenario: The source IP address does not need to be preserved.
    • Service: Select cluster-level service affinity.
    • Deployment: Select Rolling upgrade for Upgrade Mode, configure graceful termination, and enable Liveness probe and Ready probe.
  • Scenario: The source IP address needs to be preserved.
    • Service: Select node-level service affinity.
    • Deployment: Select Rolling upgrade for Upgrade Mode, configure graceful termination, enable Liveness probe and Ready probe, and add node affinity policies to ensure that at least one pod is running on each node during the upgrade.

Procedure

In this example, the workload has 200 replicas and is exposed through a LoadBalancer Service. A rolling upgrade of a workload associated with LoadBalancer or ingress Services may involve multiple Services, so pay close attention to the rolling upgrade parameter settings.

  1. Log in to the CCE console and click the cluster name to access the cluster console. In the navigation pane, choose Workloads.
  2. In the workload list, click Upgrade in the Operation column of the workload to be upgraded. The Upgrade Workload page is displayed.

    1. Enable the liveness probe and ready probe. In the Container Settings area, click Health Check and enable Liveness probe and Ready probe. In this example, TCP is selected as the check method. Configure the parameters based on your requirements. Parameters such as Period (s), Delay (s), and Timeout (s) must be properly configured: some applications take a long time to start, and setting these values too small will lead to repeated restarts.

      In this example, the ready probe delay is set to 20 seconds to control the interval at which pods are rolled in batches.

      Figure 1 Enabling the liveness probe and ready probe
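
The probe settings above map to the livenessProbe and readinessProbe fields of the container spec. A sketch of the corresponding manifest fragment (the container name, image, port, and period/timeout values are illustrative assumptions; only the 20-second delay comes from this example):

```yaml
# Fragment of a Deployment's pod template
containers:
  - name: nginx                 # illustrative name and image
    image: nginx:latest
    livenessProbe:
      tcpSocket:
        port: 80
      initialDelaySeconds: 20
      periodSeconds: 10
      timeoutSeconds: 1
    readinessProbe:
      tcpSocket:
        port: 80
      initialDelaySeconds: 20   # the 20s ready probe delay used in this example
      periodSeconds: 10
```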
    2. Configure a rolling upgrade. In the Advanced Settings area, click Upgrade and select Rolling upgrade for Upgrade Mode. This ensures that old pods are gradually replaced with new ones.

      In this example, maxUnavailable and maxSurge are both set to 2% to control the rolling step. Combined with the 20-second ready probe delay, this means that eight pods are upgraded every 20 seconds.

      Figure 2 Configuring a rolling upgrade
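
In manifest form, the rolling upgrade settings above correspond to the Deployment's update strategy (with 200 replicas, 2% works out to four surge pods plus four unavailable pods, eight pods in flight per batch):

```yaml
# Fragment of the Deployment spec corresponding to the settings above
spec:
  replicas: 200
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 2%   # at most 4 of 200 pods unavailable at a time
      maxSurge: 2%         # at most 4 extra pods created at a time
```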
    3. Configure a graceful termination.
      1. In the Container Settings area, click Lifecycle and configure pre-stop processing. Set the pre-stop processing time to the time the Service needs to process all remaining requests, most of which are persistent connection requests. For example, you can make the workload sleep for 30s after it receives a deletion request so that it has sufficient time to process the remaining requests and services keep running properly.
      2. In the Advanced Settings area, click Upgrade and configure Scale-In Time Window (terminationGracePeriodSeconds), which specifies how long to wait before the container is forcibly stopped. The scale-in time window must be greater than the pre-stop processing time; a common practice is to add 30s to it. For example, if the pre-stop processing time is 30s, set the scale-in time window to 60s.
      Figure 3 Entering the pre-stop command
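
In manifest form, graceful termination combines a preStop lifecycle hook with terminationGracePeriodSeconds in the pod template (the container name and sleep command are illustrative; the 30s/60s values follow the example above):

```yaml
# Fragment of the Deployment's pod template
spec:
  terminationGracePeriodSeconds: 60     # scale-in time window: pre-stop time + 30s
  containers:
    - name: nginx                       # illustrative name
      lifecycle:
        preStop:
          exec:
            command: ["/bin/sh", "-c", "sleep 30"]   # drain remaining requests for 30s
```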
    4. Add node affinity policies. Add this kind of policy when Node-level is selected for a Service's service affinity. In the Advanced Settings area, click Scheduling and add node affinity policies, specifying the nodes with which the workload has affinity.
      Figure 4 Adding node affinity policies
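
The scheduling policy above corresponds to a nodeAffinity rule in the pod template. A sketch pinning pods to specific nodes by hostname (the node names are hypothetical placeholders):

```yaml
# Fragment of the Deployment's pod template
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: kubernetes.io/hostname
                operator: In
                values:
                  - node-1    # illustrative node names
                  - node-2
```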

  3. After the configuration is complete, click Upgrade Workload.

    On the Pods tab, you can see that an old pod is stopped only after a newly created pod is up. This ensures that there is always a pod running in the workload.
