Effective Troubleshooting in Kubernetes with Temporary Containers
Pods are the basic building blocks in Kubernetes. They are disposable and replaceable, and once a pod is created, regular containers cannot be added to it. Ephemeral containers (called temporary containers on this page) are a special type of container that can be added temporarily to a running pod for debugging.
In most cases, container exceptions can be debugged using the kubectl exec or kubectl logs command to access the container and diagnose faults. However, the exec command may fail if containers are in a Crash state or if their images lack debugging tools. In such situations, ephemeral containers offer a practical solution. These containers can be injected into a running pod to inspect its status and execute commands, aiding in the resolution of issues that are difficult to replicate.
Prerequisites
- The cluster version is v1.23 or later. Ephemeral containers are enabled by default in these versions.
- kubectl has been installed. You have been granted the required cluster access permissions. For details, see Accessing a Cluster Using kubectl.
- You have prepared a debugging tool image. Images with preinstalled tool packages are recommended, for example:
- container-trouble-shooting: a public image provided by the Huawei Cloud container team. This image comes with network diagnosis tools, performance diagnosis tools, and development environments preinstalled, such as GDB, Python, Delve, strace, tcpdump, traceroute, telnet, Nmap, bind-utils, iPerf3, net-tools, ethtool, iftop, pstack, GCC, Golang, and perf.
Image address: swr-gallery.swr-pro.myhuaweicloud.com/library/container-trouble-shooting:v1
- nicolaka/netshoot: a network diagnosis tool set (tcpdump, netstat, curl, and more)
Assigning Permissions (RBAC)
To follow the principle of least privilege, grant a user or service account permission to modify only the temporary containers of pods:
- Create a role-test.yaml file and grant only the permissions to modify the temporary containers to the user.
vi role-test.yaml
The content is as follows:
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: ephemeral-debugger
rules:
- apiGroups: [""]
  resources: ["pods/ephemeralcontainers"]
  verbs: ["update", "patch"]    # Only the temporary containers of the pods can be modified.
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: debugger-binding
subjects:
- kind: User
  name: "xxx"    # User ID
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: ephemeral-debugger
  apiGroup: rbac.authorization.k8s.io
For details about how to obtain a user ID, see Obtaining Account, IAM User, Group, Project, Region, and Agency Information.
- Create the preceding RBAC configurations.
kubectl create -f role-test.yaml
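To confirm that the binding took effect, the permission can be checked with kubectl auth can-i and user impersonation. The following is a minimal sketch, assuming the caller is allowed to impersonate users (cluster administrators typically are). The function name check_debug_rbac is illustrative and not part of kubectl.

```shell
# Check whether a user may patch the ephemeralcontainers subresource of pods.
# Usage: check_debug_rbac <user-id> [namespace]
# check_debug_rbac is a hypothetical helper name used only for this example.
check_debug_rbac() {
  user="$1"
  namespace="${2:-default}"
  kubectl auth can-i patch pods \
    --subresource=ephemeralcontainers \
    --as="$user" \
    --namespace="$namespace"
}
```

For example, check_debug_rbac xxx prints yes or no for the default namespace. Depending on the kubectl version, running kubectl debug -it may additionally need permission to read the pod and to attach to it, so a no-op debugging attempt with the restricted account is a useful second check.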
Injecting a Temporary Container
- Inject a temporary container into a pod and enter the interactive shell:
kubectl debug <pod-name> -it --image=swr-gallery.swr-pro.myhuaweicloud.com/library/container-trouble-shooting:v1 --target=<target-container>
The parameters in this command are described as follows:
- -it: Open an interactive terminal attached to the temporary container.
- --image: Specify the image of the temporary container.
- --target: (Optional) Share the process namespace of the target container with the temporary container.
For example, to debug the nginx container in the myapp pod, run the following command:
kubectl debug myapp -it --image=swr-gallery.swr-pro.myhuaweicloud.com/library/container-trouble-shooting:v1 --target=nginx
You can also debug using a copy of the pod by running the following command:
kubectl debug myapp -it --image=swr-gallery.swr-pro.myhuaweicloud.com/library/container-trouble-shooting:v1 --share-processes --copy-to=myapp-debug
In the preceding command:
- --share-processes: enables process namespace sharing in the pod copy, so that the debugging container can view the processes of the other containers in the pod.
- --copy-to: creates a copy of the pod with the debugging container added, instead of modifying the original pod, and specifies the name of the new pod.
After these commands are executed, information similar to the following is displayed:
Pod/myapp created
Defaulting container name to debugger.
If you do not see a command prompt, try pressing Enter.
/ #
- Start the debugging and use various debugging tools to check the container status. For details about typical scenarios, see Typical Problem Diagnosis Scenarios.
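After a debugging session, it helps to know what was left behind. An ephemeral container cannot be removed from a pod once it has been added; it stays in the pod spec until the pod itself is deleted. A pod copy created with --copy-to also has to be deleted manually. The following sketch shows both checks. The function names are illustrative, and myapp and myapp-debug are the example names used above.

```shell
# List the names of ephemeral containers already injected into a pod.
# Usage: list_ephemeral_containers <pod-name>
list_ephemeral_containers() {
  kubectl get pod "$1" -o jsonpath='{.spec.ephemeralContainers[*].name}'
}

# Delete the pod copy created with --copy-to once debugging is finished.
# Usage: cleanup_debug_copy <copied-pod-name>
cleanup_debug_copy() {
  kubectl delete pod "$1"
}
```

For example, list_ephemeral_containers myapp shows the debugging containers attached to the original pod, and cleanup_debug_copy myapp-debug removes the copy.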
Typical Problem Diagnosis Scenarios
- Network Problem Check
Scenario: A pod is unable to access external or internal services.
- Use a debugging tool image to examine the network:
kubectl debug myapp -it --image=swr-gallery.swr-pro.myhuaweicloud.com/library/container-trouble-shooting:v1
- Run the following command within the temporary container to check the network:
tcpdump -i eth0 port 80                      # Capture HTTP traffic.
netstat -tuln                                # Check the listening status of the ports.
dig my-service.namespace.svc.cluster.local   # Check DNS resolution.
curl -v http://backend:8080                  # Test service connectivity.
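When a service is unreachable, it is often useful to separate a DNS failure from a connection failure before capturing packets. The following is a minimal sketch of such a check, assuming that getent and nc (netcat) are available in the debugging image, which is not confirmed for every image. The function name check_service and its output strings are illustrative.

```shell
# Distinguish DNS resolution failures from TCP connection failures.
# Usage: check_service <host> <port>
# Prints "dns-fail", "connect-fail", or "ok".
check_service() {
  host="$1"
  port="$2"
  # Step 1: resolve the name through the pod's resolver configuration.
  if ! getent hosts "$host" >/dev/null; then
    echo "dns-fail"
    return 1
  fi
  # Step 2: try to open a TCP connection without sending data (3 s timeout).
  if ! nc -z -w 3 "$host" "$port"; then
    echo "connect-fail"
    return 2
  fi
  echo "ok"
}
```

For example, check_service backend 8080 run inside the temporary container reports which stage fails, which indicates whether to look at DNS configuration or at network policies and endpoints next.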
- File System Check
Scenario: Logs from the main container are missing, or the configuration file is faulty.
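- Create a temporary container that shares the process namespace of the main container. A temporary container does not mount the volumes or file system of the main container automatically. With --target, the file system of the main container is typically reachable through /proc/<PID>/root.
kubectl debug myapp -it --image=busybox --target=nginx
- Check the files. PID 1 is assumed to be the main Nginx process.
ls /proc/1/root/var/log/nginx            # View the log directory.
cat /proc/1/root/etc/nginx/nginx.conf    # Verify the configuration file content.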
- Process/Performance Analysis
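Scenario: Abnormal CPU or memory usage is observed. Tools such as htop, strace, and perf can be used.
- Create a temporary container and share the process namespace of the target container with it using the --target parameter. Use an image that includes the required tools. The plain alpine image does not ship strace or perf.
kubectl debug myapp -it --image=swr-gallery.swr-pro.myhuaweicloud.com/library/container-trouble-shooting:v1 --target=nginx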
- Run the following command within the temporary container to check the process:
# View the process list.
ps aux
# Monitor resource usage per thread.
top -H
# Trace system calls.
strace -p 1 # PID 1 is the main Nginx process.
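To decide which process to trace, the processes can first be ranked by CPU usage. The following is a small sketch, assuming the debugging image provides the procps version of ps (the BusyBox ps does not support these options). The function name top_cpu is illustrative.

```shell
# Print the header line plus the N processes with the highest CPU usage.
# Usage: top_cpu [count]   (count defaults to 5)
top_cpu() {
  count="${1:-5}"
  # --sort=-pcpu orders by CPU usage, highest first (procps ps only).
  ps -eo pid,pcpu,pmem,comm --sort=-pcpu | head -n "$((count + 1))"
}
```

For example, top_cpu 3 lists the three busiest processes visible in the shared process namespace. The PID in the first column can then be passed to strace -p.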