Error Message "No CUDA-capable device is detected" Displayed in Logs
Updated on 2022-12-08 GMT+08:00
Symptom
An error similar to one of the following is displayed while the program is running:
1. 'failed call to cuInit: CUDA_ERROR_NO_DEVICE: no CUDA-capable device is detected'
2. 'No CUDA-capable device is detected although requirements are installed'
Possible Causes
The possible causes are as follows:
- CUDA_VISIBLE_DEVICES has been incorrectly set.
- CUDA operations are performed on GPUs with IDs that are not specified by CUDA_VISIBLE_DEVICES.
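Both causes come down to the same mechanism: CUDA_VISIBLE_DEVICES lists the physical GPUs a process may use, and the visible GPUs are renumbered from 0 inside the process. The sketch below (a hypothetical illustration; the value "2,3" is only an example) shows the resulting ID mapping, which explains why an operation on an unlisted GPU ID fails:

```python
import os

# Hypothetical example: the process is restricted to physical GPUs 2 and 3.
# Inside the process they are renumbered as 0 and 1, so requesting
# device 2 or 3 in code would trigger the "no CUDA-capable device" error.
os.environ["CUDA_VISIBLE_DEVICES"] = "2,3"

visible = os.environ["CUDA_VISIBLE_DEVICES"].split(",")
# physical GPU ID -> ID visible inside the process
mapping = {phys: idx for idx, phys in enumerate(visible)}
print(mapping)  # {'2': 0, '3': 1}
```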
Solution
- Do not change the CUDA_VISIBLE_DEVICES value in the code. Use its default value.
- Ensure that the specified GPU IDs are within the available GPU IDs.
- If the error persists, print the CUDA_VISIBLE_DEVICES value and debug the code in a notebook, or run the following commands and check whether the returned result is True:
import torch
torch.cuda.is_available()
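The checks above can be combined into one defensive pattern. The helper below is a hypothetical sketch (pick_device is not a ModelArts or PyTorch API): it verifies that the requested GPU ID is within the available range before using it, and falls back to the CPU otherwise, which avoids both failure modes listed under Possible Causes:

```python
import os

import torch

def pick_device(requested: int) -> torch.device:
    # Hypothetical helper: use cuda:<requested> only if that ID is
    # within the range of GPUs visible to this process.
    if torch.cuda.is_available() and requested < torch.cuda.device_count():
        return torch.device(f"cuda:{requested}")
    return torch.device("cpu")

# None means the variable is unset, i.e. all GPUs are visible (the default).
print(os.environ.get("CUDA_VISIBLE_DEVICES"))
print(pick_device(0))
```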
Summary and Suggestions
Before creating a training job, debug the training code in the ModelArts development environment to eliminate as many code migration errors as possible.
- Use the online notebook environment for debugging. For details, see Using JupyterLab to Develop a Model.
- Use the local IDE (PyCharm or VS Code) to access the cloud environment for debugging. For details, see Using the Local IDE to Develop a Model.
Parent topic: GPU Issues