How Do I Configure a Pod to Use the Acceleration Capability of a GPU Node?
Problem Description
I have purchased a GPU node, but my workloads still run slowly. How do I configure a pod to use the acceleration capability of the GPU node?
Solution
Solution 1:
You are advised to remove the unschedulable taints from the GPU nodes in the cluster so that the GPU plug-in driver can be installed properly. You also need to install a later version of the GPU driver.
If a container does not need to run on a GPU node in your cluster, you can configure affinity and anti-affinity policies to prevent it from being scheduled to the GPU nodes.
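The anti-affinity policy mentioned above could look like the following minimal sketch. It assumes that GPU nodes carry a label with the key accelerator; that key, the pod name, and the image are illustrative assumptions, so check the actual labels with kubectl get nodes --show-labels before using it.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: cpu-only-app              # hypothetical name
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: accelerator      # assumed GPU node label key; verify on your nodes
            operator: DoesNotExist  # schedule only to nodes without this label
  containers:
  - name: app
    image: nginx:alpine           # placeholder image for a workload that needs no GPU
```

With this rule, the scheduler places the pod only on nodes that do not have the assumed label, which keeps GPU capacity free for workloads that need it.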
Solution 2:
You are advised to install the GPU driver of a later version and use kubectl to update the GPU plug-in configuration. Add the following configuration:
tolerations:
- operator: "Exists"
After the configuration is added, the GPU plug-in driver can be installed properly on GPU nodes that have taints.
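Once the driver and plug-in are running, a pod normally has to request a GPU explicitly to be accelerated. The following is a minimal sketch, assuming that the plug-in exposes GPUs under the resource name nvidia.com/gpu (the name used by the NVIDIA device plug-in; confirm it in the output of kubectl describe node). The pod name and image are hypothetical.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: gpu-demo                  # hypothetical name
spec:
  tolerations:
  - operator: "Exists"            # only needed if the GPU nodes keep their taints
  containers:
  - name: cuda-app
    image: example.com/cuda-app:latest   # hypothetical image that contains a GPU workload
    resources:
      limits:
        nvidia.com/gpu: 1         # assumed resource name; requests one GPU
```

A pod without such a limit can still be scheduled to a GPU node but is generally not allocated a GPU device, which is one possible reason a workload remains slow.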