Updated on 2024-08-15 GMT+08:00

Why Is the T4 GPU Display Abnormal?

Symptom

An ECS that uses NVIDIA Tesla T4 GPUs, for example a PI2 or G6 ECS, does not report its GPUs. When you run nvidia-smi to check GPU usage, the following information is displayed:
No devices were found

Possible Causes

The NVIDIA Tesla T4 is a newer NVIDIA GPU model for which the GSP firmware is enabled and used by default. With the GSP firmware in use, the GPU cannot be identified.
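Before changing anything, it can help to check which GSP firmware setting the loaded driver reports. The following is a minimal sketch, not an official procedure. It assumes that the driver lists its module parameters as "Name: value" lines in /proc/driver/nvidia/params, including an EnableGpuFirmware entry. The helper name gsp_param_value is hypothetical.

```shell
#!/bin/sh
# Print the EnableGpuFirmware value from a driver parameter file.
# Assumption: the loaded NVIDIA driver exposes "Name: value" lines in
# /proc/driver/nvidia/params. Returns non-zero if the file or the entry
# is missing (for example, when the nvidia module is not loaded).
gsp_param_value() {
    params_file="${1:-/proc/driver/nvidia/params}"
    [ -r "$params_file" ] || return 1
    # Match the parameter name exactly so that similarly named entries
    # are not picked up by mistake.
    awk -F': *' '$1 == "EnableGpuFirmware" { print $2; found = 1 }
                 END { exit !found }' "$params_file"
}

# Example usage on the ECS:
#   lspci | grep -i nvidia   # confirm that the GPU is visible on the PCI bus
#   gsp_param_value          # 0 is the value this article sets to disable GSP
```

If the function prints a value other than 0, the option described in the methods below has not been applied to the loaded module.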

Method 1

The settings in this method are temporary. They are lost after the ECS is restarted.

  1. Remove the NVIDIA kernel modules.

    rmmod nvidia_drm

    rmmod nvidia_modeset

    rmmod nvidia

  2. Disable the GSP firmware and reload the NVIDIA kernel modules.

    modprobe nvidia NVreg_EnableGpuFirmware=0

    modprobe nvidia_drm

    modprobe nvidia_modeset

  3. If the fault persists, contact customer service.
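The steps of Method 1 can also be collected into one function, sketched below under stated assumptions. The DRY_RUN switch and the function names run and reload_nvidia_without_gsp are hypothetical and added only for illustration. The final nvidia-smi call is an added verification step, not part of the original procedure.

```shell
#!/bin/sh
# Run a command, or only print it when DRY_RUN=1 (hypothetical switch).
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Unload the NVIDIA modules and reload them with GSP firmware disabled.
# Must be run as root. Stops at the first command that fails.
reload_nvidia_without_gsp() {
    # Unload the dependent modules first, then the core module.
    run rmmod nvidia_drm || return 1
    run rmmod nvidia_modeset || return 1
    run rmmod nvidia || return 1
    # Reload in the order given in Method 1.
    run modprobe nvidia NVreg_EnableGpuFirmware=0 || return 1
    run modprobe nvidia_drm || return 1
    run modprobe nvidia_modeset || return 1
    # Added check: the GPU should now be listed by nvidia-smi.
    run nvidia-smi
}

# Example usage:
#   DRY_RUN=1; reload_nvidia_without_gsp   # print the commands only
#   DRY_RUN=0; reload_nvidia_without_gsp   # execute them as root
```

If rmmod reports that a module is in use, another module or process still holds the driver and must be stopped before the modules can be removed.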

Method 2

  1. Run the following command to open the /etc/modprobe.d/nvidia.conf file:

    vim /etc/modprobe.d/nvidia.conf

    Press i to enter insert mode.

  2. Add the following line to /etc/modprobe.d/nvidia.conf:

    options nvidia NVreg_EnableGpuFirmware=0

    Press Esc, then enter :wq! to save the file and exit.

  3. Run the following command to restart the ECS:

    reboot

  4. If the fault persists, contact customer service.
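As an alternative to editing the file interactively in Method 2, the option line can be appended from a script. The following is a minimal sketch. The helper name ensure_gsp_disabled is hypothetical, and the function assumes that an existing configuration file ends with a newline.

```shell
#!/bin/sh
# Append the GSP option to a modprobe configuration file only if the exact
# line is not already present, so that repeated runs do not duplicate it.
ensure_gsp_disabled() {
    conf_file="${1:-/etc/modprobe.d/nvidia.conf}"
    option_line="options nvidia NVreg_EnableGpuFirmware=0"
    # -x matches whole lines only, -F treats the pattern as a fixed string.
    if ! grep -qxF "$option_line" "$conf_file" 2>/dev/null; then
        printf '%s\n' "$option_line" >> "$conf_file"
    fi
}

# Example usage (as root):
#   ensure_gsp_disabled
#   reboot
# After the ECS restarts, run nvidia-smi again to check that the GPU is listed.
```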