Why Is an Error Reported When a GPU-Related Operation Is Performed in a Container Entered Using exec?
Symptom
After entering a container using exec and performing a GPU-related operation (such as running nvidia-smi or a TensorFlow GPU training task), the error message "cannot open shared object file: No such file or directory" is displayed.
Possible Cause
The CUDA library in the container is located in /usr/local/nvidia/lib64, but this directory is not included in LD_LIBRARY_PATH by default, so the dynamic linker cannot find the library. The directory must be added to LD_LIBRARY_PATH before GPU-related operations are performed.
Solution
Log in to the GPU-accelerated container using kubectl exec or the console, run the export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/nvidia/lib64 command, and then retry the GPU-related operation.
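The steps above can be sketched as follows (the library path /usr/local/nvidia/lib64 is the one described in this page; the container and pod names are placeholders):

```shell
# Enter the GPU-accelerated container (replace <pod-name> with your pod).
# kubectl exec -it <pod-name> -- /bin/bash

# Append the CUDA library directory to the dynamic-linker search path.
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/nvidia/lib64

# Confirm the variable now contains the directory before retrying
# nvidia-smi or the training task.
echo "$LD_LIBRARY_PATH"
```

Because a variable exported in an exec session is lost when the session ends, the setting can instead be made persistent in the image, for example with a Dockerfile instruction such as `ENV LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/nvidia/lib64`.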