Updated on 2024-01-26 GMT+08:00

Pre-upgrade Check

The system performs a comprehensive check before a cluster upgrade. If the cluster does not meet the check conditions, the upgrade cannot continue. To reduce upgrade risks, you can check the cluster in advance against the check items listed in this section.

Table 1 Check items

No.

Check Item

Description

1

Node Restrictions

  • Check whether the node is available.
  • Check whether the node OS supports the upgrade.
  • Check whether there are unexpected node pool tags on the node.
  • Check whether the Kubernetes node name is consistent with the ECS name.

2

Upgrade Management

Check whether the current user is in the upgrade blocklist.

3

Add-ons

  • Check whether the add-on status is normal.
  • Check whether the add-on supports the target version.

4

Helm Charts

Check whether the current HelmRelease record contains deprecated Kubernetes APIs that are not supported by the target cluster version. If it does, the Helm chart may be unavailable after the upgrade.

5

SSH Connectivity of Master Nodes

Check whether CCE can connect to your master nodes.

6

Node Pools

Check the node pool status.

7

Security Groups

Check whether the security group allows the master node to access nodes using ICMP.

8

Arm Node Restrictions

  • Check whether the cluster contains Arm nodes.

9

To-Be-Migrated Nodes

Check whether the node needs to be migrated.

10

Discarded Kubernetes Resources

Check whether there are discarded resources in the cluster.

11

Compatibility Risks

Read the version compatibility differences and ensure that your services are not affected by them. Patch upgrades do not involve version compatibility differences.

12

Node CCE Agent Versions

Check whether cce-agent on the current node is of the latest version.

13

Node CPU Usage

Check whether the CPU usage of the node exceeds 90%.

14

CRDs

  • Check whether the key CRD packageversions.version.cce.io of the cluster is deleted.
  • Check whether the key CRD network-attachment-definitions.k8s.cni.cncf.io of the cluster is deleted.

15

Node Disks

  • Check whether the key data disks on the node meet the upgrade requirements.
  • Check whether the /tmp directory has at least 500 MiB of available space.

16

Node DNS

  • Check whether the DNS configuration of the current node can resolve the OBS address.
  • Check whether the current node can access the OBS address of the storage upgrade component package.

17

Node Key Directory File Permissions

Check whether the key directory /var/paas on the nodes contains files with abnormal owners or owner groups.

18

Kubelet

Check whether the kubelet on the node is running properly.

19

Node Memory

Check whether the memory usage of the node exceeds 90%.

20

Node Clock Synchronization Server

Check whether the clock synchronization server ntpd or chronyd of the node is running properly.

21

Node OS

Check whether the OS kernel version of the node is supported by CCE.

22

Node CPUs

Check whether the number of CPUs on the master node is greater than 2.

23

Node Python Commands

Check whether the Python commands are available on a node.

24

ASM Version

  • Check whether ASM is used by the cluster.
  • Check whether the current ASM version supports the target cluster version.

25

Node Readiness

Check whether the nodes in the cluster are ready.

26

Node journald

Check whether journald of a node is normal.

27

containerd.sock

Check whether the containerd.sock file exists on the node. This file affects the startup of the container runtime on EulerOS.

28

Internal Errors

Before the upgrade, check whether an internal error occurs.

29

Node Mount Points

Check whether inaccessible mount points exist on the node.

30

Kubernetes Node Taints

Check whether the taint needed for cluster upgrade exists on the node.

31

everest Restrictions

Check whether there are any compatibility restrictions on the current everest add-on.

32

cce-hpa-controller Restrictions

Check whether the current cce-hpa-controller add-on has compatibility restrictions.

33

Enhanced CPU Policies

Check whether the current cluster version and the target version support enhanced CPU policy.

34

Health of Worker Node Components

Check whether the container runtime and network components on the worker nodes are healthy.

35

Health of Master Node Components

Check whether the Kubernetes, container runtime, and network components of the master nodes are healthy.

36

Memory Resource Limit of Kubernetes Components

Check whether the resources of Kubernetes components, such as etcd and kube-controller-manager, exceed the upper limit.

37

Discarded Kubernetes APIs

The system scans the audit logs of the past day to check whether any APIs deprecated in the target Kubernetes version have been called.
NOTE:

Due to the limited time range of audit logs, this check item is only an auxiliary method. APIs to be deprecated may have been used in the cluster without appearing in the audit logs of the past day. Check the API usage carefully.

38

IPv6 Capabilities of a CCE Turbo Cluster

If IPv6 is enabled for a CCE Turbo cluster, check whether the target cluster version supports IPv6.

39

Node NetworkManager

Check whether NetworkManager of a node is normal.

40

Node ID File

Check the ID file format.

41

Node Configuration Consistency

When you upgrade a CCE cluster to v1.19 or later, the system checks whether key configuration files on the nodes have been modified in the background.

42

Node Configuration File

Check whether the configuration files of key components exist on the node.

43

CoreDNS Configuration Consistency

Check whether the current CoreDNS key configuration Corefile is different from the Helm release record. The difference may be overwritten during the add-on upgrade, affecting domain name resolution in the cluster.

44

sudo Commands of a Node

Check whether the sudo commands and sudo-related files of the node are working.

45

Key Commands of Nodes

Check whether some key commands that the node upgrade depends on are working.

46

Mounting of a Sock File on a Node

Check whether the docker/containerd.sock file on the node is mounted to a pod through a hostPath. During the upgrade, Docker or containerd restarts, but the sock file in the container does not change. As a result, errors may occur in your services.

47

HTTPS Load Balancer Certificate Consistency

Check whether the certificate used by an HTTPS load balancer has been modified on ELB.

48

Node Mounting

Check whether the default mount directory and soft link on the node have been manually mounted or modified.

49

Login Permissions of User paas on a Node

Check whether user paas is allowed to log in to a node.

50

Private IPv4 Addresses of Load Balancers

Check whether the load balancer associated with a Service has been allocated a private IPv4 address.

51

Historical Upgrade Records

Check whether the source version of the cluster is earlier than v1.11 and the target version is later than v1.23.

52

CIDR Block of the Cluster Management Plane

Check whether the CIDR block of the cluster management plane is the same as that configured on the backbone network.

53

GPU Add-on

Check whether the GPU add-on is involved in the upgrade. If it is, GPU driver installation during the creation of a GPU node may be affected.

54

Nodes' System Parameter Settings

Check whether the default system parameter settings on your nodes are modified.

55

Residual Package Versions

Check whether there are residual package versions in the current cluster.

56

Node Commands

Check whether the commands required for the upgrade are available on the node.

57

Node Swap

Check whether swap has been enabled on cluster nodes.

58

nginx-ingress Upgrade

Check whether there are compatibility issues that may occur during the nginx-ingress upgrade.
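
Some of the node-level items in Table 1 can be approximated by hand on a Linux node before you start the upgrade. The following Python sketch covers three of them: available space in /tmp (item 15), memory usage (item 19), and swap (item 57). It is an illustration only and not the check that CCE itself runs. The function names are hypothetical, and computing memory usage from MemTotal and MemAvailable in /proc/meminfo is an assumption, because this section does not state how the system measures usage.

```python
import shutil

MIB = 1024 * 1024


def tmp_has_space(path="/tmp", required_mib=500):
    """Return True if `path` has at least `required_mib` MiB of free space."""
    return shutil.disk_usage(path).free >= required_mib * MIB


def memory_usage_percent(meminfo_text):
    """Return used memory as a percentage, given the text of /proc/meminfo.

    Assumption: usage = (MemTotal - MemAvailable) / MemTotal.
    """
    fields = {}
    for line in meminfo_text.splitlines():
        key, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            fields[key.strip()] = int(parts[0])  # first token is the numeric value
    total = fields["MemTotal"]
    available = fields["MemAvailable"]
    return 100.0 * (total - available) / total


def swap_enabled(swaps_text):
    """Return True if the text of /proc/swaps lists at least one swap area."""
    lines = [line for line in swaps_text.splitlines() if line.strip()]
    return len(lines) > 1  # the first non-empty line is the column header


def run_local_checks():
    """Print the three results for the local node (Linux only)."""
    print("/tmp has 500 MiB available:", tmp_has_space())
    with open("/proc/meminfo") as f:
        print("memory usage exceeds 90%:", memory_usage_percent(f.read()) > 90.0)
    with open("/proc/swaps") as f:
        print("swap enabled:", swap_enabled(f.read()))
```

Calling run_local_checks() on a node prints one line per item. The two parsing functions take file contents as strings, so they can also be applied to output collected from several nodes.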