Updated on 2025-08-22 GMT+08:00

No GPU Detected in a Training Job

Symptom

The following error message is displayed while a ModelArts training job is running:

failed call to cuInit: CUDA_ERROR_NO_DEVICE: no CUDA-capable device is detected

Possible Causes

The error message indicates that the training program cannot detect any GPUs.

Solution

Check whether the following configuration is present in the training code. If it is not, add it so that the GPUs are visible to the program:

import os
os.environ['CUDA_VISIBLE_DEVICES'] = '0,1,2,3,4,5,6,7'

In the preceding setting, each number (for example, 0) is the ID of a GPU on the server. The value can contain any GPU IDs that should be visible to the program, for example, 0, 1, 2, or 3. If a GPU ID is not included in the setting, the corresponding GPU is unavailable to the program.
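The setting can also be built from a list of GPU IDs, as in the minimal sketch below. The helper name set_visible_gpus is hypothetical and not part of ModelArts or CUDA. The sketch assumes an 8-GPU server, as in the setting above. Note that CUDA reads CUDA_VISIBLE_DEVICES when it is initialized, so the variable should be set before the training framework first accesses the GPU, typically at the top of the script.

```python
import os


def set_visible_gpus(gpu_ids):
    """Hypothetical helper: expose only the given GPU IDs to CUDA-based libraries.

    Returns the comma-separated string written to CUDA_VISIBLE_DEVICES.
    """
    value = ",".join(str(gpu_id) for gpu_id in gpu_ids)
    os.environ["CUDA_VISIBLE_DEVICES"] = value
    return value


# Assumption: the server has 8 GPUs with IDs 0 to 7.
# Call this before the training framework initializes CUDA.
set_visible_gpus(range(8))
print("CUDA_VISIBLE_DEVICES =", os.environ["CUDA_VISIBLE_DEVICES"])
```

To expose only a subset of the GPUs, pass a shorter list, for example set_visible_gpus([0, 1]).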