Updated on 2023-12-07 GMT+08:00

What Should I Do If a Workload Remains in the Creating State?

Symptom

The workload remains in the creating state.

Troubleshooting Process

The possible causes are listed in descending order of likelihood. Check them in that order to locate the cause of the problem quickly.

If the fault persists after a possible cause is rectified, check other possible causes.

Check Item 1: Whether the cce-pause Image Is Deleted by Mistake

Symptom

When you create a workload, an error message indicates that the pod sandbox cannot be created because the cce-pause:3.1 image fails to be pulled:

Failed to create pod sandbox: rpc error: code = Unknown desc = failed to get sandbox image "cce-pause:3.1": failed to pull image "cce-pause:3.1": failed to pull and unpack image "docker.io/library/cce-pause:3.1": failed to resolve reference "docker.io/library/cce-pause:3.1": pulling from host **** failed with status code [manifests 3.1]: 400 Bad Request

Possible Causes

The image is a system image added during node creation. If the image is deleted by mistake, the workload cannot be created.
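Before repairing anything, it can help to confirm that the image is actually missing on the node. The following is a minimal sketch, not an official CCE tool: the helper names has_pause_image and list_images are hypothetical, and the sketch assumes that either the docker CLI or the ctr CLI is installed on the node.

```shell
#!/bin/sh
# Return 0 if an image listing read from stdin mentions cce-pause with tag 3.1.
# Handles both "cce-pause   3.1" (docker images) and "cce-pause:3.1" (ctr).
has_pause_image() {
    grep -Eq 'cce-pause(:|[[:space:]]+)3\.1([[:space:]]|$)'
}

# List local images with whichever runtime CLI is found first (assumption:
# a Docker node has docker, a containerd node has ctr).
list_images() {
    if command -v docker >/dev/null 2>&1; then
        docker images
    elif command -v ctr >/dev/null 2>&1; then
        ctr -n k8s.io images ls
    fi
}
```

On the faulty node, `list_images | has_pause_image && echo present || echo missing` reports whether the image needs to be imported again.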

Solution

  1. Log in to the faulty node.
  2. Decompress the cce-pause image installation package.

    tar -xzvf /opt/cloud/cce/package/node-package/pause-*.tgz

  3. Import the image.

    • For a node that uses the Docker container runtime:
      docker load -i ./pause/package/image/cce-pause-3.1.tar
    • For a node that uses the containerd container runtime:
      ctr -n k8s.io image import ./pause/package/image/cce-pause-3.1.tar

  4. Create a workload.
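Steps 2 and 3 above can be sketched as one small script. This is an illustration under assumptions rather than a supported procedure: the function names pause_import_cmd and restore_pause_image are hypothetical, the runtime is passed in by hand instead of being detected, and docker load is given the archive through its -i option.

```shell
#!/bin/sh
# Print the import command for the given runtime ("docker" or "containerd").
pause_import_cmd() {
    runtime=$1
    tarball=${2:-./pause/package/image/cce-pause-3.1.tar}
    case "$runtime" in
        docker)     echo "docker load -i $tarball" ;;
        containerd) echo "ctr -n k8s.io image import $tarball" ;;
        *)          echo "unknown runtime: $runtime" >&2; return 1 ;;
    esac
}

# Unpack the installation package, then run the matching import command.
restore_pause_image() {
    tar -xzvf /opt/cloud/cce/package/node-package/pause-*.tgz &&
    cmd=$(pause_import_cmd "$1") &&
    $cmd
}
```

For example, `restore_pause_image containerd` would be run on a containerd node, after which the workload can be created again.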

Check Item 2: Modifying Node Specifications After the CPU Management Policy Is Enabled in the Cluster

When the CPU management policy is enabled in a cluster, the kubelet option cpu-manager-policy is set to static. This policy grants enhanced CPU affinity and exclusivity to pods with certain resource characteristics on the node. If you then modify the CCE node specifications on the ECS console, the original CPU information recorded on the node no longer matches the new CPU information. As a result, workloads on the node cannot be restarted or created.
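To see the mismatch before changing anything, the recorded state can be compared with the CPUs the node has now. The sketch below rests on an assumption that is not stated in this guide: that cpu_manager_state is a single-line JSON document with policyName and defaultCpuSet fields, as written by upstream kubelet. The helper names json_field and show_cpu_state are hypothetical.

```shell
#!/bin/sh
# Pull one string field out of a single-line JSON document read from stdin.
# A plain sed match is enough for a flat file; jq is a sturdier choice if installed.
json_field() {
    sed -n 's/.*"'"$1"'"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Print the recorded policy and CPU set next to the CPU count visible right now.
show_cpu_state() {
    file=${1:-/mnt/paas/kubernetes/kubelet/cpu_manager_state}
    echo "policy:        $(json_field policyName < "$file")"
    echo "recorded CPUs: $(json_field defaultCpuSet < "$file")"
    echo "current CPUs:  $(nproc)"
}
```

If the recorded CPU set covers a different number of CPUs than nproc reports after the specification change, the state file is likely stale.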

  1. Log in to the CCE node (ECS) and delete the cpu_manager_state file.

    Example command for deleting the file:

    rm -rf /mnt/paas/kubernetes/kubelet/cpu_manager_state

  2. Restart the node or kubelet. The following is the kubelet restart command:

    systemctl restart kubelet

    Verify that workloads on the node can be successfully restarted or created.

    For details, see What Should I Do If I Fail to Restart or Create Workloads on a Node After Modifying the Node Specifications?
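The two steps above can also be run in a slightly more cautious way that keeps a copy of the state file instead of deleting it outright. This is a sketch under assumptions: the function names and the backup path /tmp/cpu_manager_state.bak are hypothetical, and the node is assumed to manage kubelet through systemd, as the restart command above implies.

```shell
#!/bin/sh
# Move the stale state file to a backup path instead of deleting it outright.
# Both paths are parameters; the backup default is only illustrative.
backup_and_remove_state() {
    file=${1:-/mnt/paas/kubernetes/kubelet/cpu_manager_state}
    backup=${2:-/tmp/cpu_manager_state.bak}
    if [ ! -f "$file" ]; then
        echo "no state file at $file"
        return 0
    fi
    mv "$file" "$backup" && echo "moved $file to $backup"
}

# Restart kubelet and report whether the service came back up.
restart_kubelet() {
    systemctl restart kubelet && systemctl is-active kubelet
}
```

On the node, `backup_and_remove_state && restart_kubelet` performs both steps. Afterwards, `kubectl get pods -o wide --field-selector spec.nodeName=<node-name>`, run from a machine with cluster access, shows whether pods on that node start again.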