What Should I Do If a Pod Fails to Pull the Image?

Updated on 2024-01-04 GMT+08:00

Fault Locating

When a workload enters the state "Pod not ready: Back-off pulling image "xxxxx"", a Kubernetes event of Failed to pull image or Failed to re-pull image is reported. For details about how to view Kubernetes events, see Viewing Pod Events.

Troubleshooting Process

Determine the cause based on the event information, as listed in Table 1.

Table 1 FailedPullImage

  • Event information: Failed to pull image "xxx": rpc error: code = Unknown desc = Error response from daemon: Get xxx: denied: You may not login yet
    Cause: You have not logged in to the image repository.
    Solution: Check Item 1: Whether imagePullSecret Is Specified When You Use kubectl to Create a Workload

  • Event information: Failed to pull image "nginx:v1.1": rpc error: code = Unknown desc = Error response from daemon: Get https://registry-1.docker.io/v2/: dial tcp: lookup registry-1.docker.io: no such host
    Cause: The image address is incorrectly configured.
    Solution: Check Item 2: Whether the Image Address Is Correct When a Third-Party Image Is Used
    Solution: Check Item 3: Whether an Incorrect Secret Is Used When a Third-Party Image Is Used

  • Event information: Failed create pod sandbox: rpc error: code = Unknown desc = failed to create a sandbox for pod "nginx-6dc48bf8b6-l8xrw": Error response from daemon: mkdir xxxxx: no space left on device
    Cause: The disk space is insufficient.
    Solution: Check Item 4: Whether the Node Disk Space Is Insufficient

  • Event information: Failed to pull image "xxx": rpc error: code = Unknown desc = error pulling image configuration: xxx x509: certificate signed by unknown authority
    Cause: An unknown or insecure certificate is used by the third-party image repository from which the image is pulled.
    Solution: Check Item 5: Whether the Remote Image Repository Uses an Unknown or Insecure Certificate

  • Event information: Failed to pull image "xxx": rpc error: code = Unknown desc = context canceled
    Cause: The image size is too large.
    Solution: Check Item 6: Whether the Image Size Is Too Large

  • Event information: Failed to pull image "docker.io/bitnami/nginx:1.22.0-debian-11-r3": rpc error: code = Unknown desc = Error response from daemon: Get https://registry-1.docker.io/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
    Cause: The node cannot connect to the image repository.
    Solution: Check Item 7: Connection to the Image Repository

  • Event information: ERROR: toomanyrequests: Too Many Requests.
    Or: you have reached your pull rate limit, you may increase the limit by authenticating and upgrading
    Cause: The rate is limited because the number of image pulls has reached the upper limit.
    Solution: Check Item 8: Whether the Number of Public Image Pull Times Reaches the Upper Limit
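
As a rough sketch of the lookup in Table 1, the shell function below maps an event message to the matching check item. The function name classify_pull_error and the substring patterns are illustrative choices taken from the sample messages above; they are not part of CCE or Kubernetes.

```shell
# Map a pod event message to the check item in Table 1 that covers it.
# Patterns are substrings of the sample messages; real messages may vary.
classify_pull_error() {
  case "$1" in
    *"You may not login yet"*)                    echo "Check Item 1" ;;
    *"no such host"*)                             echo "Check Item 2 or 3" ;;
    *"no space left on device"*)                  echo "Check Item 4" ;;
    *"certificate signed by unknown authority"*)  echo "Check Item 5" ;;
    *"context canceled"*)                         echo "Check Item 6" ;;
    *"Client.Timeout exceeded"*)                  echo "Check Item 7" ;;
    *"toomanyrequests"*|*"pull rate limit"*)      echo "Check Item 8" ;;
    *)                                            echo "unknown"; return 1 ;;
  esac
}
```

Pass it a message copied from the pod events, for example: classify_pull_error 'Failed to pull image "xxx": rpc error: code = Unknown desc = context canceled'.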

Check Item 1: Whether imagePullSecret Is Specified When You Use kubectl to Create a Workload

If the workload status is abnormal and a Kubernetes event is displayed indicating that the pod fails to pull the image, check whether the imagePullSecrets field exists in the YAML file.

Items to Check

  • If an image needs to be pulled from SWR, the name parameter under imagePullSecrets must be set to default-secret.
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: nginx
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: nginx
      strategy:
        type: RollingUpdate
      template:
        metadata:
          labels:
            app: nginx
        spec:
          containers:
          - image: nginx 
            imagePullPolicy: Always
            name: nginx
          imagePullSecrets:
          - name: default-secret
  • If an image needs to be pulled from a third-party image repository, the name parameter under imagePullSecrets must be set to the name of the secret you created.

    When you use kubectl to create a workload from a third-party image, specify the imagePullSecrets field, in which name indicates the name of the secret used to pull the image.
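
A quick way to run this check on a manifest before applying it is a plain-text search for the field. The sketch below is a minimal example; the function name has_image_pull_secrets and the file name in the usage note are hypothetical, and the check does not verify that the referenced secret exists in the cluster.

```shell
# Return 0 if the manifest declares an imagePullSecrets field, 1 otherwise.
# This is a text match only; it does not parse YAML or contact the cluster.
has_image_pull_secrets() {
  grep -Eq '^[[:space:]]*imagePullSecrets:' "$1"
}
```

For example: has_image_pull_secrets nginx-deployment.yaml || echo "imagePullSecrets is missing".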

Check Item 2: Whether the Image Address Is Correct When a Third-Party Image Is Used

CCE allows you to create workloads using images pulled from third-party image repositories.

Enter the third-party image address in the format ip:port/path/name:version or name:version. If no tag is specified, latest is used by default.

  • For a private repository, enter an image address in the format of ip:port/path/name:version.
  • For an open-source Docker repository, enter an image address in the format of name:version, for example, nginx:latest.
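
To make the default-tag rule concrete, the sketch below appends :latest to an address that has no tag. The function name normalize_image_address is hypothetical, and leaving digest references (name@sha256:...) unchanged is an assumption added for completeness.

```shell
# Append ":latest" when the image address carries no tag.
# Only the final path component is inspected, so the port in ip:port is not
# mistaken for a tag.
normalize_image_address() {
  addr="$1"
  last="${addr##*/}"            # e.g. "nginx" or "nginx:v1.1"
  case "$last" in
    *@*|*:*) echo "$addr" ;;    # already has a digest or a tag
    *)       echo "${addr}:latest" ;;
  esac
}
```

For example, normalize_image_address 10.0.0.1:5000/team/nginx prints 10.0.0.1:5000/team/nginx:latest, while nginx:v1.1 is printed unchanged.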

The following information is displayed when an image fails to be pulled because the image address is incorrect.

Failed to pull image "nginx:v1.1": rpc error: code = Unknown desc = Error response from daemon: Get https://registry-1.docker.io/v2/: dial tcp: lookup registry-1.docker.io: no such host

Solution

You can either edit your YAML file to change the image address or log in to the CCE console to replace the image on the Upgrade tab on the workload details page.

Check Item 3: Whether an Incorrect Secret Is Used When a Third-Party Image Is Used

Generally, a third-party image repository can be accessed only after authentication (using your account and password). CCE uses the secret authentication mode to pull images. Therefore, you need to create a secret for an image repository before pulling images from the repository.

Solution

If your secret is incorrect, images will fail to be pulled. In this case, create a new secret.
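
One way to create such a secret from the command line is kubectl create secret docker-registry. The wrapper below is a minimal sketch; the function name create_pull_secret and the argument order are hypothetical, and all values are placeholders that you supply.

```shell
# Create a docker-registry secret for a third-party image repository.
# Arguments: secret name, namespace, registry address, username, password.
create_pull_secret() {
  kubectl create secret docker-registry "$1" \
    --namespace="$2" \
    --docker-server="$3" \
    --docker-username="$4" \
    --docker-password="$5"
}
```

The secret must be in the same namespace as the workload, and its name is the value to set under imagePullSecrets.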

Check Item 4: Whether the Node Disk Space Is Insufficient

If the Kubernetes event contains the information "no space left on device", the node has no disk space left for storing the image, so the image fails to be pulled. To resolve this issue, clear unused images or expand the disk capacity.

Failed create pod sandbox: rpc error: code = Unknown desc = failed to create a sandbox for pod "nginx-6dc48bf8b6-l8xrw": Error response from daemon: mkdir xxxxx: no space left on device

Run the following command to check the disk space available for storing images on a node:

lvs
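
lvs reports logical volume sizes. For a volume that carries a mounted file system, df shows how much of that space is used. The sketch below reads the usage percentage from df; the function name fs_use_percent is hypothetical, and the path in the usage note is the container engine mount point from the lsblk output shown under Solution 2, which may differ on your node.

```shell
# Print the used-space percentage (without the % sign) of the file system
# that holds the given path. Uses POSIX output format (df -P) so that each
# file system occupies exactly one line.
fs_use_percent() {
  df -P "$1" | awk 'NR==2 { sub(/%/, "", $5); print $5 }'
}
```

For example: fs_use_percent /var/lib/docker.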

Solution 1: Clearing images

Perform the following operations to clear unused images:
  • Nodes that use containerd
    1. Obtain local images on the node.
      crictl images -v
    2. Delete unneeded images by image ID.
      crictl rmi <image-id>
  • Nodes that use Docker
    1. Obtain local images on the node.
      docker images
    2. Delete unneeded images by image ID.
      docker rmi <image-id>
NOTE:

Do not delete system images such as the cce-pause image. Otherwise, pods may fail to be created.

Solution 2: Expanding the disk capacity

To expand a disk capacity, perform the following steps:

  1. Expand the capacity of the data disk on the EVS console.
  2. Log in to the CCE console and click the cluster. In the navigation pane, choose Nodes. Click More > Sync Server Data in the row containing the target node.
  3. Log in to the target node.
  4. Run the lsblk command to check the block device information of the node.

    How the data disk is divided depends on the container storage rootfs:

    • Overlayfs: No independent thin pool is allocated. Image data is stored in the dockersys disk.
      # lsblk
      NAME                MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
      vda                   8:0    0   50G  0 disk 
      └─vda1                8:1    0   50G  0 part /
      vdb                   8:16   0  200G  0 disk 
      ├─vgpaas-dockersys  253:0    0   90G  0 lvm  /var/lib/docker               # Space used by the container engine
      └─vgpaas-kubernetes 253:1    0   10G  0 lvm  /mnt/paas/kubernetes/kubelet  # Space used by Kubernetes

      Run the following commands on the node to add the new disk capacity to the dockersys disk:

      pvresize /dev/vdb 
      lvextend -l+100%FREE -n vgpaas/dockersys
      resize2fs /dev/vgpaas/dockersys
    • Devicemapper: A thin pool is allocated to store image data.
      # lsblk
      NAME                                MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
      vda                                   8:0    0   50G  0 disk 
      └─vda1                                8:1    0   50G  0 part /
      vdb                                   8:16   0  200G  0 disk 
      ├─vgpaas-dockersys                  253:0    0   18G  0 lvm  /var/lib/docker    
      ├─vgpaas-thinpool_tmeta             253:1    0    3G  0 lvm                   
      │ └─vgpaas-thinpool                 253:3    0   67G  0 lvm                   # Space used by thinpool
      │   ...
      ├─vgpaas-thinpool_tdata             253:2    0   67G  0 lvm  
      │ └─vgpaas-thinpool                 253:3    0   67G  0 lvm  
      │   ...
      └─vgpaas-kubernetes                 253:4    0   10G  0 lvm  /mnt/paas/kubernetes/kubelet
      • Run the following commands on the node to add the new disk capacity to the thinpool disk:
        pvresize /dev/vdb 
        lvextend -l+100%FREE -n vgpaas/thinpool
      • Run the following commands on the node to add the new disk capacity to the dockersys disk:
        pvresize /dev/vdb 
        lvextend -l+100%FREE -n vgpaas/dockersys
        resize2fs /dev/vgpaas/dockersys
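
Which logical volume to extend for image data depends on which of the two layouts above the node uses. The sketch below reads lsblk output and prints the volume that stores image data. The function name image_data_lv is hypothetical, and the check relies only on the volume names shown in the two examples above; other layouts are not handled.

```shell
# Read lsblk output on stdin and print the logical volume that stores image
# data: the thin pool for Devicemapper, otherwise dockersys (Overlayfs).
image_data_lv() {
  if grep -q 'vgpaas-thinpool'; then
    echo "vgpaas/thinpool"
  else
    echo "vgpaas/dockersys"
  fi
}
```

For example: lsblk | image_data_lv.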

Check Item 5: Whether the Remote Image Repository Uses an Unknown or Insecure Certificate

When a pod pulls an image from a third-party image repository that uses an unknown or insecure certificate, the node fails to pull the image. The pod event list contains the event "Failed to pull image" with the cause "x509: certificate signed by unknown authority".

NOTE:

The security of EulerOS 2.9 images is enhanced, and some insecure or expired certificates are removed from the system. It is therefore normal for this error to be reported for some third-party images on EulerOS 2.9 nodes but not on other types of nodes. You can also perform the following operations to rectify the fault.

Solution

  1. Check the address (IP address or domain name) and port number of the third-party image server for which the error message "unknown authority" is displayed.

    The address and port number of the server are shown in the event information "Failed to pull image".
    Failed to pull image "bitnami/redis-cluster:latest": rpc error: code = Unknown desc = error pulling image configuration: Get https://production.cloudflare.docker.com/registry-v2/docker/registry/v2/blobs/sha256/e8/e83853f03a2e792614e7c1e6de75d63e2d6d633b4e7c39b9d700792ee50f7b56/data?verify=1636972064-AQbl5RActnudDZV%2F3EShZwnqOe8%3D: x509: certificate signed by unknown authority

    In this example, the address of the third-party image server is production.cloudflare.docker.com. No port is specified, so the default HTTPS port 443 is used.

  2. Load the root certificate of the third-party image server to the node where the third-party image is to be downloaded.

    Run the following commands on EulerOS and CentOS nodes, replacing {server_url}:{server_port} with the address and port number obtained in Step 1, for example, production.cloudflare.docker.com:443:
    openssl s_client -showcerts -connect {server_url}:{server_port} < /dev/null | sed -ne '/-BEGIN CERTIFICATE-/,/-END CERTIFICATE-/p' > /etc/pki/ca-trust/source/anchors/tmp_ca.crt
    update-ca-trust
    systemctl restart docker
    Run the following commands on Ubuntu nodes:
    openssl s_client -showcerts -connect {server_url}:{server_port} < /dev/null | sed -ne '/-BEGIN CERTIFICATE-/,/-END CERTIFICATE-/p' > /usr/local/share/ca-certificates/tmp_ca.crt
    update-ca-certificates
    systemctl restart docker

    If the container engine of the node is containerd, replace systemctl restart docker with systemctl restart containerd.
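
Step 1 can be scripted. The sketch below extracts the server address and port from an event message in the form expected by openssl s_client -connect. The function name registry_endpoint is hypothetical, and the code assumes that the message contains an https:// URL and that port 443 applies when the URL names no port.

```shell
# Print "host:port" for the first https:// URL found in an event message.
# Returns 1 if the message contains no https:// URL.
registry_endpoint() {
  hostport=$(printf '%s\n' "$1" | grep -o 'https://[^/ ]*' | head -n 1)
  hostport=${hostport#https://}
  [ -n "$hostport" ] || return 1
  case "$hostport" in
    *:*) echo "$hostport" ;;          # the URL already names a port
    *)   echo "${hostport}:443" ;;    # default HTTPS port
  esac
}
```

For the event shown in Step 1, the function prints production.cloudflare.docker.com:443, which can be used in place of {server_url}:{server_port}.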

Check Item 6: Whether the Image Size Is Too Large

The pod event list contains the event "Failed to pull image". This may be caused by a large image size.

Failed to pull image "XXX": rpc error: code = Unknown desc = context canceled

However, the image can be manually pulled by running the docker pull command.

Possible Causes

In Kubernetes clusters, there is a default timeout period for pulling images. If the image pulling progress is not updated within a certain period of time, the download will be canceled. If the node performance is poor or the image size is too large, the image may fail to be pulled and the workload may fail to be started.

Solution

  • Solution 1 (recommended):
    1. Log in to the node and manually pull the image.
      • containerd nodes:
        crictl pull <image-address>
      • Docker nodes:
        docker pull <image-address>
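    2. When creating a workload, ensure that imagePullPolicy is set to IfNotPresent (the default configuration) so that the workload uses the image that has already been pulled to the node. If the image tag is latest or is omitted, Kubernetes defaults imagePullPolicy to Always, so set the field explicitly in that case.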
  • Solution 2 (applies to clusters of v1.25 or later): Modify the configuration parameters of the node pools. The configuration parameters for nodes in the DefaultPool node pool cannot be modified.
    1. Log in to the CCE console.
    2. Click the cluster name to access the cluster console. Choose Nodes in the navigation pane and click the Node Pools tab.
    3. Locate the row that contains the target node pool and click Manage.
    4. In the window that slides out from the right, modify the image-pull-progress-timeout parameter under Docker/containerd. This parameter specifies the timeout interval for pulling an image.
    5. Click OK.

Check Item 7: Connection to the Image Repository

Symptom

The following error message is displayed during workload creation:

Failed to pull image "docker.io/bitnami/nginx:1.22.0-debian-11-r3": rpc error: code = Unknown desc = Error response from daemon: Get https://registry-1.docker.io/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)

Possible Causes

The node cannot connect to the image repository because it has no network access to it. SWR allows you to pull images only from the official Docker repository. To pull images from other repositories, the node must be able to access the Internet.

Solution

  • Bind a public IP address to the node that needs to pull the images.
  • Upload the image to SWR and then pull the image from SWR.
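
Before choosing a solution, you can confirm the symptom from the node itself. The sketch below checks whether an HTTPS connection to a registry can be established. The function name registry_reachable, the /v2/ path, and the timeout values are assumptions for illustration. Any HTTP response, including 401 Unauthorized, counts as reachable because only connectivity is being tested.

```shell
# Return 0 if an HTTPS connection to the registry succeeds.
# $1: registry host, for example registry-1.docker.io
# $2: connect timeout in seconds (optional, defaults to 5)
registry_reachable() {
  curl --silent --output /dev/null \
    --connect-timeout "${2:-5}" --max-time 15 \
    "https://$1/v2/"
}
```

For example: registry_reachable registry-1.docker.io || echo "cannot reach the image repository".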

Check Item 8: Whether the Number of Public Image Pull Times Reaches the Upper Limit

Symptom

The following error message is displayed during workload creation:

ERROR: toomanyrequests: Too Many Requests.

Or

you have reached your pull rate limit, you may increase the limit by authenticating and upgrading: https://www.docker.com/increase-rate-limits.

Possible Causes

Docker Hub limits the number of container image pull requests. For details, see Understanding Your Docker Hub Rate Limit.

Solution

Push the frequently used image to SWR and then pull the image from SWR.
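
The copy can be done with standard Docker commands from a machine that can reach both repositories. The function below is a minimal sketch; the name mirror_to_swr is hypothetical, the target address is whatever SWR image address applies to your region and organization, and the code assumes that you have already logged in to SWR with docker login.

```shell
# Copy an image from a public repository to another repository such as SWR.
# $1: source image address, $2: target image address.
# Stops at the first command that fails.
mirror_to_swr() {
  docker pull "$1" &&
  docker tag "$1" "$2" &&
  docker push "$2"
}
```

After the push, change the image address of the workload to the target address so that subsequent pulls go to SWR.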
