Before You Start
Before the upgrade, you can check whether your cluster can be upgraded and which versions are available on the CCE console. For details, see Upgrade Overview.
Precautions
Before upgrading a cluster, pay attention to the following points:
- A cluster upgrade cannot be rolled back. Perform the upgrade at a proper time to minimize the impact on your services. To ensure data security, back up your data before the upgrade.
- Before upgrading a cluster, ensure that no high-risk operations are performed in the cluster. Otherwise, the cluster upgrade may fail or the configuration may be lost after the upgrade. Common high-risk operations include modifying cluster node configurations locally and modifying the configurations of the listeners managed by CCE on the ELB console. Instead, modify configurations on the CCE console so that the modifications can be automatically inherited during the upgrade.
- Before upgrading a cluster, ensure the cluster is working properly.
- Before upgrading a cluster, learn about the features and differences of each cluster version in Kubernetes Release Notes to prevent exceptions due to the use of an incompatible cluster version. For example, check whether any APIs deprecated in the target version are used in the cluster. Otherwise, calling the APIs may fail after the upgrade. For details, see Deprecated APIs.
During a cluster upgrade, pay attention to the following points that may affect your services:
- During a cluster upgrade, do not perform any operations on the cluster. In particular, do not stop, restart, or delete nodes. Otherwise, the upgrade will fail.
- During a cluster upgrade, the running workloads will not be interrupted, but access to the API server will be temporarily interrupted.
- During a cluster upgrade, the node.kubernetes.io/upgrade taint (effect: NoSchedule) will be added to the nodes in the cluster and removed automatically after the upgrade is complete. Do not add taints with the same key to a node; even if such taints have different effects, they may be mistakenly deleted by the system after the upgrade.
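For reference, the taint added by CCE has the following shape on a node object (the node name below is illustrative; the taint is added and removed by CCE itself, so do not create or delete it manually):

```yaml
# Excerpt of a node during a cluster upgrade (illustrative):
apiVersion: v1
kind: Node
metadata:
  name: example-node                    # hypothetical node name
spec:
  taints:
    - key: node.kubernetes.io/upgrade   # added by CCE when the upgrade starts
      effect: NoSchedule                # removed automatically after the upgrade
```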
Constraints
- CCE clusters with VM nodes can be upgraded.
- If there are any nodes created using a private image, the cluster cannot be upgraded.
- After the cluster is upgraded, if the Kubernetes Release Notes indicate that a containerd vulnerability in the container engine has been fixed, manually restart containerd for the fix to take effect. The existing pods also need to be restarted.
- If the docker.sock file on a node is mounted into a pod using hostPath (the Docker in Docker scenario), Docker restarts during the upgrade, but the docker.sock file mounted into the pod does not change accordingly. As a result, your services may malfunction. You are advised to mount the directory that contains docker.sock instead of the file itself (see the sketch after this list).
- When clusters using the tunnel network model are upgraded to v1.19.16-r4, v1.21.7-r0, v1.23.5-r0, v1.25.1-r0, or later, the SNAT rule whose destination address is the container CIDR block but the source address is not the container CIDR block will be removed. If you have configured VPC routes to directly access all pods outside the cluster, only the pods on the corresponding nodes can be directly accessed after the upgrade.
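A minimal sketch of the recommended docker.sock mount, assuming a pod that runs the Docker CLI (the pod name, image, and command are illustrative): mounting the /var/run directory rather than the docker.sock file keeps the socket reachable even after Docker restarts and recreates the file.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: docker-client                   # hypothetical pod name
spec:
  containers:
    - name: cli
      image: docker:24-cli              # illustrative image that contains the docker CLI
      command: ["sleep", "infinity"]
      volumeMounts:
        - name: docker-run
          mountPath: /var/run           # mount the directory, not /var/run/docker.sock
  volumes:
    - name: docker-run
      hostPath:
        path: /var/run                  # host directory that contains docker.sock
        type: Directory
```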
Deprecated APIs
With the evolution of Kubernetes APIs, APIs are periodically reorganized or upgraded, and old APIs are deprecated and finally deleted. The following tables list the deprecated APIs in each Kubernetes community version. For details about more deprecated APIs, see Deprecated API Migration Guide.
- APIs Deprecated in Kubernetes v1.25
- APIs Deprecated in Kubernetes v1.22
- APIs Deprecated in Kubernetes v1.16
When an API is deprecated, existing resources are not affected. However, requests that create or edit resources using the deprecated API version will be intercepted.
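For example, a CronJob manifest that still uses the deprecated batch/v1beta1 version can usually be migrated by changing only the apiVersion (the name, schedule, and image below are illustrative):

```yaml
apiVersion: batch/v1                    # previously batch/v1beta1
kind: CronJob
metadata:
  name: demo-cronjob                    # hypothetical name
spec:
  schedule: "*/5 * * * *"
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
            - name: job
              image: busybox:1.36       # illustrative image
              command: ["sh", "-c", "date"]
```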
APIs Deprecated in Kubernetes v1.25

Resource Name | Deprecated API Version | Substitute API Version | Change Description
---|---|---|---
CronJob | batch/v1beta1 | batch/v1 (available since v1.21) | None
EndpointSlice | discovery.k8s.io/v1beta1 | discovery.k8s.io/v1 (available since v1.21) | This API version introduces field changes. For details, see the Deprecated API Migration Guide.
Event | events.k8s.io/v1beta1 | events.k8s.io/v1 (available since v1.19) | This API version introduces field changes. For details, see the Deprecated API Migration Guide.
HorizontalPodAutoscaler | autoscaling/v2beta1 | autoscaling/v2 (available since v1.23) | None
PodDisruptionBudget | policy/v1beta1 | policy/v1 (available since v1.21) | If spec.selector is set to null ({}) in a policy/v1 PodDisruptionBudget, all pods in the namespace are selected. (In policy/v1beta1, an empty spec.selector selects no pods.) If spec.selector is not specified, no pods are selected in either API version.
PodSecurityPolicy | policy/v1beta1 | None | Since v1.25, the PodSecurityPolicy resource no longer provides APIs of the policy/v1beta1 version, and the PodSecurityPolicy admission controller is removed. Replace it with Pod Security Admission (see Configuring Pod Security Admission).
RuntimeClass | node.k8s.io/v1beta1 | node.k8s.io/v1 (available since v1.20) | None
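The PodDisruptionBudget selector change above is easy to miss during migration. A minimal policy/v1 sketch (the name and minAvailable value are illustrative):

```yaml
apiVersion: policy/v1                   # previously policy/v1beta1
kind: PodDisruptionBudget
metadata:
  name: demo-pdb                        # hypothetical name
spec:
  minAvailable: 1
  # In policy/v1, an empty selector ({}) selects all pods in the namespace.
  # In policy/v1beta1, the same empty selector selected no pods.
  selector: {}
```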
APIs Deprecated in Kubernetes v1.22

Resource Name | Deprecated API Version | Substitute API Version | Change Description
---|---|---|---
MutatingWebhookConfiguration, ValidatingWebhookConfiguration | admissionregistration.k8s.io/v1beta1 | admissionregistration.k8s.io/v1 (available since v1.16) | This API version introduces field changes. For details, see the Deprecated API Migration Guide.
CustomResourceDefinition | apiextensions.k8s.io/v1beta1 | apiextensions.k8s.io/v1 (available since v1.16) | This API version introduces field changes. For details, see the Deprecated API Migration Guide.
APIService | apiregistration.k8s.io/v1beta1 | apiregistration.k8s.io/v1 (available since v1.10) | None
TokenReview | authentication.k8s.io/v1beta1 | authentication.k8s.io/v1 (available since v1.6) | None
LocalSubjectAccessReview, SelfSubjectAccessReview, SubjectAccessReview, SelfSubjectRulesReview | authorization.k8s.io/v1beta1 | authorization.k8s.io/v1 (available since v1.16) | spec.group was renamed spec.groups in v1 (patch #32709).
CertificateSigningRequest | certificates.k8s.io/v1beta1 | certificates.k8s.io/v1 (available since v1.19) | This API version introduces field changes. For details, see the Deprecated API Migration Guide.
Lease | coordination.k8s.io/v1beta1 | coordination.k8s.io/v1 (available since v1.14) | None
Ingress | networking.k8s.io/v1beta1, extensions/v1beta1 | networking.k8s.io/v1 (available since v1.19) | This API version introduces field changes. For details, see the Deprecated API Migration Guide.
IngressClass | networking.k8s.io/v1beta1 | networking.k8s.io/v1 (available since v1.19) | None
ClusterRole, ClusterRoleBinding, Role, RoleBinding | rbac.authorization.k8s.io/v1beta1 | rbac.authorization.k8s.io/v1 (available since v1.8) | None
PriorityClass | scheduling.k8s.io/v1beta1 | scheduling.k8s.io/v1 (available since v1.14) | None
CSIDriver, CSINode, StorageClass, VolumeAttachment | storage.k8s.io/v1beta1 | storage.k8s.io/v1 | This API version introduces field changes. For details, see the Deprecated API Migration Guide.
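For example, migrating an Ingress from networking.k8s.io/v1beta1 or extensions/v1beta1 to networking.k8s.io/v1 requires a pathType for each path and a service-based backend (the host, service name, and port below are illustrative):

```yaml
apiVersion: networking.k8s.io/v1        # previously networking.k8s.io/v1beta1 or extensions/v1beta1
kind: Ingress
metadata:
  name: demo-ingress                    # hypothetical name
spec:
  rules:
    - host: demo.example.com
      http:
        paths:
          - path: /
            pathType: Prefix            # required in networking.k8s.io/v1
            backend:
              service:                  # replaces serviceName/servicePort
                name: demo-svc
                port:
                  number: 80
```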
APIs Deprecated in Kubernetes v1.16

Resource Name | Deprecated API Version | Substitute API Version | Change Description
---|---|---|---
NetworkPolicy | extensions/v1beta1 | networking.k8s.io/v1 (available since v1.8) | None
DaemonSet | extensions/v1beta1, apps/v1beta2 | apps/v1 (available since v1.9) | spec.selector is now mandatory and cannot be changed after the object is created. The labels of an existing template can be used as the selector for seamless migration.
Deployment | extensions/v1beta1, apps/v1beta1, apps/v1beta2 | apps/v1 (available since v1.9) | spec.selector is now mandatory and cannot be changed after the object is created. The labels of an existing template can be used as the selector for seamless migration.
StatefulSet | apps/v1beta1, apps/v1beta2 | apps/v1 (available since v1.9) | spec.selector is now mandatory and cannot be changed after the object is created. The labels of an existing template can be used as the selector for seamless migration.
ReplicaSet | extensions/v1beta1, apps/v1beta1, apps/v1beta2 | apps/v1 (available since v1.9) | spec.selector is now mandatory and cannot be changed after the object is created. The labels of an existing template can be used as the selector for seamless migration.
PodSecurityPolicy | extensions/v1beta1 | policy/v1beta1 (available since v1.10) | PodSecurityPolicy in the policy/v1beta1 API version is removed in v1.25.
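For example, a Deployment migrated from extensions/v1beta1 or apps/v1beta1 to apps/v1 must declare spec.selector explicitly, and the selector must match the template labels (the name, labels, and image below are illustrative):

```yaml
apiVersion: apps/v1                     # previously extensions/v1beta1, apps/v1beta1, or apps/v1beta2
kind: Deployment
metadata:
  name: demo-deploy                     # hypothetical name
spec:
  replicas: 2
  selector:                             # mandatory in apps/v1 and immutable after creation
    matchLabels:
      app: demo
  template:
    metadata:
      labels:
        app: demo                       # must match spec.selector
    spec:
      containers:
        - name: app
          image: nginx:alpine           # illustrative image
```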
Version Differences
Each upgrade path below lists the version differences to be aware of and the self-checks to perform before the upgrade.

Upgrade path: v1.23 to v1.25

- Version difference: Since Kubernetes v1.25, PodSecurityPolicy has been replaced by Pod Security Admission (see Configuring Pod Security Admission).
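If your workloads currently rely on PodSecurityPolicy, namespaces can be labeled for Pod Security Admission instead. A minimal sketch, assuming a hypothetical namespace and enforcement levels chosen for illustration:

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: demo                                       # hypothetical namespace
  labels:
    pod-security.kubernetes.io/enforce: baseline   # reject pods that violate the baseline standard
    pod-security.kubernetes.io/audit: restricted   # record violations of the stricter standard
    pod-security.kubernetes.io/warn: restricted    # warn on violations of the stricter standard
```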
Upgrade path: v1.21 to v1.23

- Version difference: For an earlier version of the Nginx Ingress Controller (community version v0.49 or earlier, or CCE nginx-ingress version v1.x.x), ingresses can be managed by the Nginx Ingress Controller even if kubernetes.io/ingress.class: nginx is not set in the ingress annotations. For a later version of the Nginx Ingress Controller (community version v1.0.0 or later, or CCE nginx-ingress version v2.x.x), ingresses created without specifying the Nginx ingress type will not be managed by the Nginx Ingress Controller, and their ingress rules will become invalid, which interrupts services.
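To keep an ingress managed by the newer Nginx Ingress Controller, declare its class explicitly. A minimal sketch (the ingress name and backend service are illustrative):

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: demo-nginx-ingress              # hypothetical name
  # For controller versions earlier than v1.0.0, the class is declared with the
  # kubernetes.io/ingress.class: nginx annotation instead of spec.ingressClassName.
spec:
  ingressClassName: nginx               # explicitly binds the ingress to the Nginx Ingress Controller
  defaultBackend:
    service:
      name: demo-svc                    # hypothetical backend service
      port:
        number: 80
```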
Upgrade path: v1.19 to v1.23

- Version difference: The differences listed for the v1.19 to v1.21 and v1.21 to v1.23 paths also apply to this path.
Upgrade path: v1.19 to v1.21

- Version difference: The bug of exec probe timeouts is fixed in Kubernetes v1.21. Before this fix, the exec probe ignored the timeoutSeconds field: the probe ran indefinitely, even beyond its configured deadline, and stopped only when a result was returned. If this field is not specified, the default value 1 is used. Because the field takes effect after the upgrade, a probe that takes longer than 1 second may cause the application health check to fail and the application to restart frequently.
  - Self-check: Before the upgrade, check whether the timeout is properly set for your exec probes.
- Version difference: kube-apiserver in CCE v1.19 or later requires that the Subject Alternative Names (SANs) field be configured in the certificate of your webhook server. Otherwise, kube-apiserver cannot call the webhook server after the upgrade, and containers cannot be started properly. Root cause: X.509 CommonName is discarded in Go 1.15, and kube-apiserver in CCE v1.19 is compiled with Go 1.15. If your webhook certificate does not have SANs, kube-apiserver no longer processes the CommonName field of the X.509 certificate as the host name by default, so authentication fails.
  - Self-check: Before the upgrade, check whether the SAN field is configured in the certificate of your webhook server.
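As part of the exec probe self-check, make sure timeoutSeconds reflects how long the probe command can actually take, since the 1-second default is enforced after the upgrade. A minimal sketch (the pod name, image, command, and values are illustrative):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: probe-demo                      # hypothetical pod name
spec:
  containers:
    - name: app
      image: nginx:alpine               # illustrative image
      livenessProbe:
        exec:
          command: ["sh", "-c", "/opt/health-check.sh"]   # illustrative check command
        timeoutSeconds: 5               # enforced since v1.21; defaults to 1s if omitted
        periodSeconds: 10
        failureThreshold: 3
```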
Upgrade path: v1.15 to v1.19

- Version difference: The control plane of CCE clusters of v1.19 is incompatible with kubelet v1.15. If a node fails to be upgraded, or a node that has not been upgraded restarts after the master nodes are upgraded, the node is very likely to become NotReady. This is because the node restarts kubelet and triggers node registration, and the default registration labels of v1.15 (failure-domain.beta.kubernetes.io/is-baremetal and kubernetes.io/availablezone) are regarded as invalid by clusters of v1.19. The valid labels in clusters of v1.19 are node.kubernetes.io/baremetal and failure-domain.beta.kubernetes.io/zone.
- Version difference: In CCE v1.15 and v1.19 clusters, the Docker storage driver file system is switched from XFS to ext4. As a result, the import package sequence in the pods of an upgraded Java application may be abnormal, causing pod exceptions.
  - Self-check: Before the upgrade, check the Docker configuration file /etc/docker/daemon.json on the node and check whether the value of dm.fs is xfs.

```json
{
  "storage-driver": "devicemapper",
  "storage-opts": [
    "dm.thinpooldev=/dev/mapper/vgpaas-thinpool",
    "dm.use_deferred_removal=true",
    "dm.fs=xfs",
    "dm.use_deferred_deletion=true"
  ]
}
```

- Version difference: kube-apiserver in CCE v1.19 or later requires that the Subject Alternative Names (SANs) field be configured in the certificate of your webhook server. Otherwise, kube-apiserver cannot call the webhook server after the upgrade, and containers cannot be started properly. Root cause: X.509 CommonName is discarded in Go 1.15, and kube-apiserver in CCE v1.19 is compiled with Go 1.15. Without SANs, the CommonName field is no longer processed as the host name, so authentication fails.
  - Self-check: Before the upgrade, check whether the SAN field is configured in the certificate of your webhook server.
    NOTICE: To mitigate the impact of version differences on the cluster upgrade, CCE performs special processing during the upgrade from v1.15 to v1.19 and still supports certificates without SANs. However, no special processing is performed for subsequent upgrades. You are advised to rectify your certificate as soon as possible.
- Version difference: In clusters of v1.17.17 and later, CCE automatically creates pod security policies (PSPs) for you, which restrict the creation of pods with unsafe configurations, for example, pods for which net.core.somaxconn under a sysctl is configured in the security context.
  - Self-check: After the upgrade, allow insecure system configurations as required. For details, see Configuring a Pod Security Policy.
- Version difference: If initContainer or Istio is used in an in-place upgrade of a cluster of v1.15, note that QoS classes are calculated differently in kubelet v1.16 and later. In kubelet v1.15 and earlier, only the containers in spec.containers are counted; in kubelet v1.16 and later, the containers in both spec.containers and spec.initContainers are counted. The QoS class of a pod may therefore change after the upgrade, causing the containers in the pod to restart.
  - Self-check: You are advised to modify the QoS class of the service containers before the upgrade to avoid this problem. For details, see Table 4.
Upgrade path: v1.13 to v1.15

- Version difference: After a VPC network cluster is upgraded, the master nodes occupy an extra CIDR block due to the upgrade of network components. If no container CIDR block is available for a new node, the pods scheduled to that node cannot run.
  - Self-check: This problem typically occurs when the nodes in the cluster are about to exhaust the container CIDR block. For example, if the container CIDR block is 10.0.0.0/16, the number of available IP addresses is 65,536. The VPC network allocates a fixed-size CIDR block to each node (the mask determines the maximum number of container IP addresses per node). If that upper limit is 128, the cluster supports a maximum of 512 (65536/128) nodes, including the three master nodes. After the cluster is upgraded, each of the three master nodes occupies one additional CIDR block, so only 506 nodes are supported.
Table 4 QoS classes before and after the upgrade

Init Container (Calculated Based on spec.initContainers) | Service Container (Calculated Based on spec.containers) | Pod (Calculated Based on spec.containers and spec.initContainers) | Impacted or Not
---|---|---|---
Guaranteed | BestEffort | Burstable | Yes
Guaranteed | Burstable | Burstable | No
Guaranteed | Guaranteed | Guaranteed | No
BestEffort | BestEffort | BestEffort | No
BestEffort | Burstable | Burstable | No
BestEffort | Guaranteed | Burstable | Yes
Burstable | BestEffort | Burstable | Yes
Burstable | Burstable | Burstable | No
Burstable | Guaranteed | Burstable | Yes
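As an illustration of the first row in Table 4, the following sketch (pod name and images are illustrative) shows a pod whose QoS class changes from BestEffort to Burstable after the upgrade; giving the service container its own requests and limits before the upgrade moves the combination into one of the unaffected rows.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: qos-demo                        # hypothetical pod name
spec:
  initContainers:
    - name: init
      image: busybox:1.36               # illustrative image
      command: ["sh", "-c", "echo init done"]
      resources:                        # requests equal limits -> Guaranteed init container
        requests:
          cpu: 100m
          memory: 128Mi
        limits:
          cpu: 100m
          memory: 128Mi
  containers:
    - name: app
      image: nginx:alpine               # illustrative image
      # No resources -> BestEffort service container. In kubelet v1.15 the pod QoS is
      # calculated from spec.containers only (BestEffort); in v1.16+ both lists are
      # counted and the pod becomes Burstable, so its containers restart after the
      # upgrade. Adding requests/limits here avoids the QoS change.
```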
Upgrade Backup
Backup methods:
- etcd database backup: CCE automatically backs up the etcd database during the cluster upgrade.
- Master node backup (recommended, manual confirmation required): On the upgrade confirmation page, click Backup to back up the entire master node of the cluster. The backup process uses the Cloud Backup and Recovery (CBR) service and takes about 20 minutes. If there are many cloud backup tasks at the current site, the backup time may be prolonged.