Updated on 2024-02-29 GMT+08:00

Why Is the GPU Driver Abnormal?

Symptom

When you run the following command on a GPU-accelerated ECS to view the GPU usage, the system reports that the program cannot be executed or that the file path does not exist.

nvidia-smi

Information similar to the following is displayed:

-bash: /bin/nvidia-smi: No such file or directory

or

nvidia-smi: command not found
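
The symptom can also be confirmed from a script. The following is a minimal sketch, assuming a POSIX-compatible shell; check_cmd is a hypothetical helper name, not part of the driver or of any ECS tooling. It only reports whether a command resolves on the current PATH.

```shell
#!/bin/sh
# check_cmd is a hypothetical helper (an assumption for illustration).
# It prints "found" if the given command resolves on PATH, else "missing".
check_cmd() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found"
    else
        echo "missing"
    fi
}

# "missing" here corresponds to the "command not found" symptom above.
check_cmd nvidia-smi
```

A result of "missing" only shows that the command cannot be located. It does not by itself tell you whether the driver was never installed or was uninstalled later.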

Possible Causes

The GPU driver on the ECS is abnormal, was never installed, or has been uninstalled.

Solution

  • If the driver has been uninstalled:

    Run the history command and check whether a driver uninstallation command was run earlier.

    Go to the /var/log directory and check whether the nvidia-uninstall.log file exists. If the log exists, the GPU driver has been uninstalled. Reinstall the GPU driver.

  • If the driver has been installed but the driver status is abnormal:
    1. Uninstall the driver.
      • Method 1: Run the nvidia-uninstall command to uninstall the driver.

        If the system displays a message indicating that the command does not exist, use method 2.

      • Method 2: Run the whereis nvidia command to query the version of the driver installed on the ECS.
        Figure 1 Installed driver version

        Download the driver package of that same version from the NVIDIA official website. (The package is needed both to uninstall and to reinstall the driver.)

        For example, if the driver version is nvidia-396.44, run the sh NVIDIA-Linux-x86_64-396.44.run --uninstall command to uninstall the driver.

    2. Reinstall the driver.

      For details, see Installing a Driver and Toolkit.
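
The decision flow in this section can be sketched as a small script. This is an illustration under stated assumptions, not an official tool: next_step and uninstall_cmd are hypothetical helper names, the log directory defaults to /var/log as described above, and the package file name follows the NVIDIA-Linux-x86_64-<version>.run pattern from the example. The script only prints what to do next; it does not uninstall or install anything.

```shell
#!/bin/sh
# next_step is a hypothetical helper (an assumption for illustration).
# It prints which step of this section applies:
#   reinstall           the uninstall log exists, so the driver was uninstalled
#   uninstall-method-1  nvidia-uninstall is available on PATH
#   uninstall-method-2  fall back to the .run package with --uninstall
# The optional argument overrides the log directory (default: /var/log).
next_step() {
    log_dir="${1:-/var/log}"
    if [ -f "$log_dir/nvidia-uninstall.log" ]; then
        echo "reinstall"
    elif command -v nvidia-uninstall >/dev/null 2>&1; then
        echo "uninstall-method-1"
    else
        echo "uninstall-method-2"
    fi
}

# uninstall_cmd is also hypothetical. It builds the method 2 command line
# from a version string such as "nvidia-396.44" or "396.44".
uninstall_cmd() {
    version="${1#nvidia-}"    # drop a leading "nvidia-" if present
    echo "sh NVIDIA-Linux-x86_64-${version}.run --uninstall"
}

next_step
uninstall_cmd nvidia-396.44
```

The sketch cannot tell a driver that was never installed from one that is installed but abnormal, so treat its output as a hint and confirm with the manual checks above before uninstalling.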