Error Message "No CUDA-capable device is detected" Displayed in Logs
Symptom
1. 'failed call to cuInit: CUDA_ERROR_NO_DEVICE: no CUDA-capable device is detected'
2. 'No CUDA-capable device is detected although requirements are installed'
Possible Causes
The possible causes are as follows:
- CUDA_VISIBLE_DEVICES has been incorrectly set.
- CUDA operations are performed on GPUs with IDs that are not specified by CUDA_VISIBLE_DEVICES.
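The second cause can be illustrated with a minimal sketch. CUDA renumbers the GPUs listed in CUDA_VISIBLE_DEVICES starting from 0 inside the process, so an ID that is valid on the node may not be valid in the code. The helper name visible_mapping is hypothetical, and the sketch assumes that every listed entry names a real GPU.

```python
def visible_mapping(value):
    """Map in-process GPU IDs to the entries listed in CUDA_VISIBLE_DEVICES.

    `value` is the raw variable value, or None if the variable is unset.
    Hypothetical helper; assumes every listed entry names a real GPU.
    """
    if value is None:
        return None  # unset: no restriction, all GPUs on the node are visible
    entries = [e.strip() for e in value.split(",") if e.strip()]
    # Inside the process, the visible GPUs are renumbered from 0 in list order.
    return dict(enumerate(entries))


mapping = visible_mapping("2,3")
print(mapping)       # {0: '2', 1: '3'}
print(3 in mapping)  # False: ID 3 is not a valid ID inside this process
```

In this example, with CUDA_VISIBLE_DEVICES set to 2,3, the code should address the two GPUs as IDs 0 and 1. An empty value yields an empty mapping, which corresponds to no GPU being visible to the process.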
Solution
- Do not change the CUDA_VISIBLE_DEVICES value in the code. Use its default value.
- Ensure that every GPU ID specified in the code is one of the available GPU IDs.
- If the error persists, print the CUDA_VISIBLE_DEVICES value and debug it in the notebook, or run the following commands to check whether the returned result is True:
import torch
torch.cuda.is_available()
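The two checks above can be combined into one notebook cell, sketched below. The function name cuda_diagnostics and the report keys are hypothetical. The sketch assumes PyTorch may or may not be installed and uses only os.environ, torch.cuda.is_available(), and torch.cuda.device_count().

```python
import os


def cuda_diagnostics():
    """Collect basic facts for debugging 'no CUDA-capable device is detected'."""
    # None means the variable is unset; an empty string hides every GPU.
    report = {"CUDA_VISIBLE_DEVICES": os.environ.get("CUDA_VISIBLE_DEVICES")}
    try:
        import torch
    except ImportError:
        report["torch_installed"] = False
        return report
    report["torch_installed"] = True
    report["cuda_available"] = torch.cuda.is_available()
    report["device_count"] = torch.cuda.device_count()
    return report


for key, value in cuda_diagnostics().items():
    print(f"{key}: {value!r}")
```

If cuda_available is False or device_count is 0 while CUDA_VISIBLE_DEVICES holds an unexpected value, the variable is the first thing to correct.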
Summary and Suggestions
Before creating a training job, debug the training code in the ModelArts development environment to eliminate as many code migration errors as possible.