Updated on 2024-11-13 GMT+08:00

How Can I Find the Fault for an Abnormal Workload?

If a workload is abnormal, check the pod events first to locate the fault, and then rectify it.

Fault Locating

To locate the fault of an abnormal workload, take the following steps:

  1. Check whether the pod is running properly.

    1. Log in to the CCE console.
    2. Click the cluster name to access the cluster console. In the navigation pane, choose Workloads.
    3. In the upper left corner of the page, select a namespace, locate the target workload, and view its status.
      • If the workload is not ready, view the pod events to determine the cause. For details, see Viewing Pod Events. Then find the solution that matches the events in Common Pod Issues.
      • If the workload is processing, wait for the operation to complete.
      • If the workload is running, no action is required. If the workload status is normal but the workload cannot be accessed, check whether access within the cluster is normal.
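
The status check in step 1 can also be approximated from the command line. The following is a minimal sketch, assuming kubectl is already configured for the cluster; the function name list_unready_pods is hypothetical and is not part of CCE or kubectl.

```shell
# Hypothetical helper: list pods in a namespace that are not fully ready.
# Usage: list_unready_pods <namespace>
list_unready_pods() {
  # Columns printed with --no-headers: NAME READY STATUS RESTARTS AGE
  kubectl get pods -n "${1:-default}" --no-headers |
    awk '{
      split($2, ready, "/")          # READY has the form ready/total
      if ($3 == "Completed") next    # pods that finished normally are skipped
      if ($3 != "Running" || ready[1] != ready[2]) print $1, $2, $3
    }'
}
```

For example, list_unready_pods demo prints the name, ready count, and status of each pod in the demo namespace that needs attention. The namespace name is only an illustration.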

  2. Check whether access within the cluster is normal.

    Log in to the CCE console or use kubectl to obtain the pod IP address. Then, log in to the node or the pod and use curl or another tool to call the workload's APIs manually and check whether the expected result is returned.

    If {Container IP address}:{Port number} cannot be accessed, log in to the container that runs the service and access 127.0.0.1:{Port number} to check whether the service itself responds, and then locate the fault.
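
The access check in step 2 might look like the following sketch. It assumes that the workload serves HTTP and that curl is available both on the node and in the container image, which is not always the case. The function names pod_ip, probe, and probe_in_pod are hypothetical.

```shell
# Hypothetical helper: print the pod IP address.
# Usage: pod_ip <pod> [namespace]
pod_ip() {
  kubectl get pod "$1" -n "${2:-default}" -o jsonpath='{.status.podIP}'
}

# Hypothetical helper: print the HTTP status code returned by <host>:<port>.
# curl prints 000 when no connection could be made.
# Usage: probe <host> <port> [path]
probe() {
  curl -sS -o /dev/null -m 5 -w '%{http_code}\n' "http://$1:$2${3:-/}"
}

# Hypothetical helper: run the same probe against 127.0.0.1 inside the container.
# Usage: probe_in_pod <pod> <port> [namespace]
probe_in_pod() {
  kubectl exec "$1" -n "${3:-default}" -- \
    curl -sS -o /dev/null -m 5 -w '%{http_code}\n' "http://127.0.0.1:$2/"
}
```

A typical sequence is to run probe "$(pod_ip my-pod demo)" 8080 from a node and, if that fails, probe_in_pod my-pod 8080 demo. The pod name, namespace, and port here are placeholders.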

  3. Check whether the access result is as expected.

    If the workload is accessible within the cluster but the access result is not as expected, check the workload configurations. For example, verify that the image tag and environment variables are correctly configured.
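
The configuration check in step 3 can be sketched with kubectl as follows. The sketch assumes that the workload is a Deployment; other workload types need a different resource name. The function names show_images and show_env are hypothetical.

```shell
# Hypothetical helper: print "<container name><TAB><image>" for each container.
# Usage: show_images <deployment> [namespace]
show_images() {
  kubectl get deployment "$1" -n "${2:-default}" \
    -o jsonpath='{range .spec.template.spec.containers[*]}{.name}{"\t"}{.image}{"\n"}{end}'
}

# Hypothetical helper: list the environment variables in the pod template.
# Usage: show_env <deployment> [namespace]
show_env() {
  kubectl set env "deployment/$1" -n "${2:-default}" --list
}
```

Compare the printed image tags and variables with the values that the workload is expected to use.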

Common Pod Issues

| Status | Description | Solution |
|---|---|---|
| Pending | The pod scheduling failed. | For details, see What Should I Do If Pod Scheduling Fails? |
| Pending | A storage volume fails to be mounted to a pod. | For details, see What Should I Do If a Storage Volume Cannot Be Mounted or the Mounting Times Out? |
| Pending | The storage volume mounting failed. | For details, see What Should I Do If a Workload Exception Occurs Due to a Storage Volume Mount Failure? |
| FailedPullImage | The image pull failed. | For details, see What Should I Do If a Pod Fails to Pull the Image? |
| ImagePullBackOff | The image failed to be pulled again. | For details, see What Should I Do If a Pod Fails to Pull the Image? |
| CreateContainerError | The container startup failed. | For details, see What Should I Do If Container Startup Fails? |
| CrashLoopBackOff | The container failed to restart. | For details, see What Should I Do If Container Startup Fails? |
| Evicted | A pod is in the Evicted state, and the pod keeps being evicted. | For details, see What Should I Do If a Pod Fails to Be Evicted? |
| Creating | A pod is in the Creating state. | For details, see What Should I Do If a Workload Remains in the Creating State? |
| Terminating | A pod is in the Terminating state. | For details, see What Should I Do If a Pod Remains in the Terminating State? |
| Stopped | A pod is in the Stopped state. | For details, see What Should I Do If a Workload Is Stopped Caused by Pod Deletion? |
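
When many pods need to be triaged, the lookup above can be expressed as a small shell function. This is only an illustrative sketch: the function name faq_for_status is hypothetical, and the topic titles are copied from the Common Pod Issues entries.

```shell
# Hypothetical helper: print the FAQ topic(s) listed for a pod status.
# Usage: faq_for_status <status>
faq_for_status() {
  case "$1" in
    Pending)
      # Pending has three possible causes; the pod events tell them apart.
      echo "What Should I Do If Pod Scheduling Fails?"
      echo "What Should I Do If a Storage Volume Cannot Be Mounted or the Mounting Times Out?"
      echo "What Should I Do If a Workload Exception Occurs Due to a Storage Volume Mount Failure?"
      ;;
    FailedPullImage|ImagePullBackOff)
      echo "What Should I Do If a Pod Fails to Pull the Image?" ;;
    CreateContainerError|CrashLoopBackOff)
      echo "What Should I Do If Container Startup Fails?" ;;
    Evicted)
      echo "What Should I Do If a Pod Fails to Be Evicted?" ;;
    Creating)
      echo "What Should I Do If a Workload Remains in the Creating State?" ;;
    Terminating)
      echo "What Should I Do If a Pod Remains in the Terminating State?" ;;
    Stopped)
      echo "What Should I Do If a Workload Is Stopped Caused by Pod Deletion?" ;;
    *)
      echo "No entry for status: $1" >&2
      return 1 ;;
  esac
}
```

For example, faq_for_status CrashLoopBackOff prints the container startup topic, and an unlisted status returns a non-zero exit code.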

Viewing Pod Events

Method 1

On the CCE console, click the workload name to go to the workload details page, locate the row containing the abnormal pod, and choose More > View Events in the Operation column.

Figure 1 Viewing pod events

Method 2

Run kubectl describe pod {Pod name} to view pod events. The following shows an example:

$ kubectl describe pod prepare-58bd7bdf9-fthrp
...
Events:
  Type     Reason            Age   From               Message
  ----     ------            ----  ----               -------
  Warning  FailedScheduling  49s   default-scheduler  0/2 nodes are available: 2 Insufficient cpu.
  Warning  FailedScheduling  49s   default-scheduler  0/2 nodes are available: 2 Insufficient cpu.
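
To narrow the output to the events themselves, the commands can be wrapped as in the following sketch. It assumes kubectl access to the cluster; the function names pod_events and pod_warnings are hypothetical.

```shell
# Hypothetical helper: print only the Events section of "kubectl describe pod".
# Usage: pod_events <pod> [namespace]
pod_events() {
  kubectl describe pod "$1" -n "${2:-default}" | sed -n '/^Events:/,$p'
}

# Hypothetical helper: list Warning events recorded for one pod, oldest first.
# Usage: pod_warnings <pod> [namespace]
pod_warnings() {
  kubectl get events -n "${2:-default}" \
    --field-selector "involvedObject.name=$1,type=Warning" \
    --sort-by=.lastTimestamp
}
```

The Reason column of the returned events, such as FailedScheduling in the example above, is the value to match against Common Pod Issues.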