Updated on 2024-11-11 GMT+08:00

Draining a Node

Scenario

After you enable nodal drainage on the console, CCE marks the node as unschedulable and safely evicts all pods on the node that comply with Rules for Draining Nodes. New pods will no longer be scheduled to this node.

When a node becomes faulty, nodal drainage quickly isolates the faulty node. The pods evicted from the faulty node will be scheduled by the workload controller to other nodes that are running properly.

To ensure service availability during drainage, specify a disruption budget for your application. Otherwise, the application may become unavailable during pod rescheduling.
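As an illustration of the disruption budget mentioned above, the following sketch prints a standard Kubernetes PodDisruptionBudget manifest (policy/v1). The function name pdb_manifest, the app label key, and the example values my-app and 1 are assumptions made for this example and are not taken from CCE.

```shell
#!/bin/sh
# Print a PodDisruptionBudget manifest for pods that carry a given "app" label.
# All argument values are hypothetical examples:
#   $1  value of the app label, for example my-app
#   $2  minAvailable, for example 1 or 50%
pdb_manifest() {
  app="$1"
  min_available="$2"
  cat <<EOF
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: ${app}-pdb
spec:
  minAvailable: ${min_available}
  selector:
    matchLabels:
      app: ${app}
EOF
}
```

A possible usage is `pdb_manifest my-app 1 | kubectl apply -f -`, which asks Kubernetes to keep at least one matching pod available while pods are being evicted. Adjust the selector to the labels your workload actually uses.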

Prerequisites

  • A cluster is available and the cluster version meets the following requirements:
    • v1.21: v1.21.10-r0 or later
    • v1.23: v1.23.8-r0 or later
    • v1.25: v1.25.3-r0 or later
    • Versions later than v1.25
  • To drain a node as an IAM user, you must have at least one of the following permissions (for details, see Namespace Permissions (Kubernetes RBAC-based)):
    • cluster-admin (administrator): read and write permissions on all resources in all namespaces.
    • drainage-editor: drain a node.
    • drainage-viewer: can view the nodal drainage status but cannot drain a node.
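The version requirements listed above can be checked with a small helper. This is a minimal sketch, assuming version strings in the form v1.23.8-r0. The function name supports_drainage is hypothetical, and minor versions that are not listed in the prerequisites (for example v1.22) are treated as unsupported.

```shell
#!/bin/sh
# Return 0 if the given CCE cluster version meets the drainage prerequisites,
# non-zero otherwise. Assumes major version 1 and the format v1.<minor>.<patch>-r<N>.
supports_drainage() {
  ver="${1#v}"                                # 1.23.8-r0
  core="${ver%%-*}"                           # 1.23.8
  minor="$(echo "$core" | cut -d. -f2)"       # 23
  patch="$(echo "$core" | cut -d. -f3)"       # 8
  case "$minor" in
    21) min_patch=10 ;;                       # v1.21.10-r0 or later
    23) min_patch=8 ;;                        # v1.23.8-r0 or later
    25) min_patch=3 ;;                        # v1.25.3-r0 or later
    *)  [ "$minor" -gt 25 ]; return $? ;;     # versions later than v1.25
  esac
  # Every minimum revision is r0, so comparing the patch number is sufficient.
  [ "$patch" -ge "$min_patch" ]
}
```

For example, `supports_drainage v1.23.8-r0 && echo supported` prints supported, while v1.21.9-r0 is rejected.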

Rules for Draining Nodes

When a node is drained, all pods on the node will be safely evicted. However, CCE will take specific actions for pods that meet certain filtering criteria.

| Filter Criterion                                          | Forced Drainage Enabled | Forced Drainage Disabled |
|-----------------------------------------------------------|-------------------------|--------------------------|
| The status.phase field of the pod is Succeeded or Failed. | Deletion                | Deletion                 |
| The pod is not managed by the workload controller.        | Deletion                | Drainage cancellation    |
| The pod is managed by a DaemonSet.                        | None                    | Drainage cancellation    |
| A volume of the emptyDir type is mounted to the pod.      | Eviction                | Drainage cancellation    |
| The pod is a static pod directly managed by kubelet.      | None                    | None                     |

The following operations may be performed on pods during node drainage:

  • Deletion: The pod is deleted from the current node and will not be scheduled to other nodes.
  • Eviction: The pod is deleted from the current node and rescheduled to another node.
  • None: The pod will not be evicted or deleted.
  • Drainage cancellation: If the node contains a pod that triggers drainage cancellation, the drainage process of the node is terminated and no pod is evicted or deleted.
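The rules above can be read as a lookup from a filter criterion and the Forced Drainage setting to an action. The sketch below encodes that lookup for illustration only. The function name drain_action and the criterion keywords (finished, unmanaged, daemonset, emptydir, static) are hypothetical labels for the five criteria and are not identifiers used by CCE.

```shell
#!/bin/sh
# Print the action CCE takes for a pod, following the rules for draining nodes.
#   $1  criterion keyword (hypothetical): finished | unmanaged | daemonset | emptydir | static
#   $2  whether forced drainage is enabled: true | false
drain_action() {
  criterion="$1"
  force="$2"
  case "$force" in
    true|false) ;;
    *) echo "force must be true or false" >&2; return 2 ;;
  esac
  case "$criterion:$force" in
    finished:*)       echo "Deletion" ;;               # status.phase is Succeeded or Failed
    unmanaged:true)   echo "Deletion" ;;               # not managed by a workload controller
    unmanaged:false)  echo "Drainage cancellation" ;;
    daemonset:true)   echo "None" ;;                   # managed by a DaemonSet
    daemonset:false)  echo "Drainage cancellation" ;;
    emptydir:true)    echo "Eviction" ;;               # emptyDir volume mounted
    emptydir:false)   echo "Drainage cancellation" ;;
    static:*)         echo "None" ;;                   # static pod managed by kubelet
    *) echo "unknown criterion: $criterion" >&2; return 1 ;;
  esac
}
```

For example, `drain_action emptydir false` prints Drainage cancellation, which is why a non-forced drainage stops when a pod with an emptyDir volume is present on the node.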

Procedure

This section describes how to drain a node on the CCE console or using kubectl.

Using the CCE console

  1. Log in to the CCE console and click the cluster name to access the cluster console.
  2. In the navigation pane, choose Nodes. On the displayed page, click the Nodes tab.
  3. Locate the target node and choose More > Nodal Drainage in the Operation column.
  4. In the Nodal Drainage window displayed, set parameters.

    • Timeout (s): A node drain task automatically fails after the specified timeout expires. A value of 0 indicates that the task will not time out.
    • Forced Drainage: If this function is enabled, pods managed by DaemonSets will be ignored, pods with emptyDir volumes will be evicted, and pods not managed by controllers will be deleted. For details, see Rules for Draining Nodes.

  5. Click OK and wait until the node drainage is complete.

Using kubectl

  1. Use kubectl to access the cluster. For details, see Connecting to a Cluster Using kubectl.
  2. Edit the YAML file for drainage.

    The following is an example of Drainage-test.yaml:

    apiVersion: node.cce.io/v1
    kind: Drainage
    metadata:
      name: 192.168.1.67-1721616409999   # Drainage resource name
    spec:
      nodeName: 192.168.1.67     # Kubernetes name of the node to be drained, which can be obtained by running the kubectl get node command
      force: true
      timeout: 0

    • nodeName: node to be drained. The parameter value is the node name in Kubernetes, not the name displayed on the console.

      You can run the kubectl get node command to obtain a node name in Kubernetes.

    • force: whether to forcibly drain the node. The value true means that drainage is forced, and false means that it is not.
    • timeout: timeout, in seconds. A node drain task automatically fails after the specified timeout expires. A value of 0 indicates that the task will not time out.

  3. Create the drainage resource.

    kubectl create -f Drainage-test.yaml

    If information similar to the following is displayed, the drainage resource has been created:

    drainage.node.cce.io/192.168.1.67-1721616409999 created

  4. Check the result.

    kubectl get drainages 192.168.1.67-1721616409999 -o yaml

    If phase is Succeeded, the operation is successful.

    apiVersion: node.cce.io/v1
    kind: Drainage
    metadata:
      creationTimestamp: "2024-07-22T03:12:56Z"
      generation: 1
      name: 192.168.1.67-1721616409999
      resourceVersion: "2683143"
      uid: 3ec131e4-0505-4c88-8255-ef9d0eb02712
    spec:
      force: true
      nodeName: 192.168.1.67
      timeout: 0
    status:
      conditions:
      - lastTransitionTime: "2024-07-22T03:12:56Z"
        message: start to drain node
        reason: Started
        status: "True"
        type: Started
      - lastTransitionTime: "2024-07-22T03:13:26Z"
        message: node has been drained
        reason: Succeeded
        status: "True"
        type: Finished
      phase: Succeeded
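The kubectl steps above can be scripted. The following is a minimal sketch rather than an official tool: the function names drainage_manifest and drainage_phase and the KUBECTL override variable are hypothetical, and the resource name simply mirrors the node-name-plus-timestamp pattern of the example above, which is assumed to be a convention and not a requirement.

```shell
#!/bin/sh
# Print a Drainage manifest like Drainage-test.yaml.
#   $1  node name in Kubernetes (obtain it with "kubectl get node")
#   $2  force: true or false (defaults to false)
#   $3  timeout in seconds; 0 means the task never times out (defaults to 0)
#   $4  name suffix; defaults to the current Unix time (assumed naming convention)
drainage_manifest() {
  node="$1"
  force="${2:-false}"
  timeout="${3:-0}"
  suffix="${4:-$(date +%s)}"
  cat <<EOF
apiVersion: node.cce.io/v1
kind: Drainage
metadata:
  name: ${node}-${suffix}
spec:
  nodeName: ${node}
  force: ${force}
  timeout: ${timeout}
EOF
}

# Print only the status.phase field of a Drainage resource.
# KUBECTL is a hypothetical override, for example KUBECTL=echo for a dry run.
drainage_phase() {
  ${KUBECTL:-kubectl} get drainages "$1" -o jsonpath='{.status.phase}'
}
```

A possible usage is `drainage_manifest 192.168.1.67 true 0 | kubectl create -f -`, followed by `drainage_phase <resource-name>` with the name reported by kubectl create, until the printed phase is Succeeded.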

Cancelling Node Drainage

This section describes how to stop a node drainage that is currently in progress.

In clusters of v1.23.16-r0, v1.25.11-r0, v1.27.8-r0, v1.28.6-r0, v1.29.2-r0, or later versions, node drainage can be canceled.

This operation will abort drainage on nodes, but workloads that have been evicted from these nodes will not be automatically migrated back.

Using the CCE console

  1. Log in to the CCE console and click the cluster name to access the cluster console.
  2. In the navigation pane, choose Nodes. On the displayed page, click the Nodes tab.
  3. Locate the node that is being drained and click Cancel Drainage.
  4. In the displayed dialog box, click OK. The node status changes to Drainage cancelled. You can click Enable Scheduling to restore the node to the schedulable state.

Using kubectl

  1. Use kubectl to access the cluster. For details, see Connecting to a Cluster Using kubectl.
  2. Check drainage resources.

    kubectl get drainages

    Command output:

    NAME                         AGE
    192.168.1.67-1721616409999   13s

  3. Cancel drainage.

    kubectl annotate drainages 192.168.1.67-1721616409999 node.cce.io/drainage-disable=true

  4. Check the result.

    kubectl get drainages 192.168.1.67-1721616409999 -o yaml

    If phase in the command output is Cancelled, the drainage has been canceled.

    apiVersion: node.cce.io/v1
    kind: Drainage
    metadata:
      annotations:
        node.cce.io/drainage-disable: "true"
      creationTimestamp: "2024-07-22T03:12:56Z"
      generation: 1
      name: 192.168.1.67-1721616409999
      resourceVersion: "2689858"
      uid: 3ec131e4-0505-4c88-8255-ef9d0eb02712
    spec:
      force: true
      nodeName: 192.168.1.67
      timeout: 0
    status:
      conditions:
      - lastTransitionTime: "2024-07-22T03:12:56Z"
        message: start to drain node
        reason: Started
        status: "True"
        type: Started
      - lastTransitionTime: "2024-07-22T03:13:26Z"
        message: node has been drained
        reason: Succeeded
        status: "True"
        type: Finished
      - lastTransitionTime: "2024-07-22T03:37:48Z"
        message: node drainage has been cancelled
        reason: Cancelled
        status: "True"
        type: Cancelled
      phase: Cancelled
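The cancellation steps above can be wrapped in two small helpers. This is a sketch under stated assumptions: the function names cancel_drainage and is_cancelled and the KUBECTL override variable are hypothetical, while the annotation key and the Cancelled phase are taken from the example above.

```shell
#!/bin/sh
# Request cancellation of a drainage by annotating its Drainage resource.
#   $1  name of the Drainage resource (see "kubectl get drainages")
# KUBECTL is a hypothetical override, for example KUBECTL=echo for a dry run.
cancel_drainage() {
  ${KUBECTL:-kubectl} annotate drainages "$1" node.cce.io/drainage-disable=true
}

# Return 0 if the phase of the Drainage resource is Cancelled, 1 otherwise.
is_cancelled() {
  phase="$(${KUBECTL:-kubectl} get drainages "$1" -o jsonpath='{.status.phase}')"
  [ "$phase" = "Cancelled" ]
}
```

A possible usage is `cancel_drainage 192.168.1.67-1721616409999 && is_cancelled 192.168.1.67-1721616409999 && echo cancelled`. The phase may take a moment to change, so the check might need to be repeated.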