How Do I Install the NVIDIA Driver on a P1 ECS?

Prerequisites

  • An EIP has been bound to the target ECS.
  • You have obtained the driver installation packages required for your OS. For details, see Table 1.

Table 1 NVIDIA drivers

  Driver        Installation Package             How to Obtain
  GPU driver    NVIDIA-Linux-x86_64-375.66.run   http://www.nvidia.com/download/driverResults.aspx/118955/en-us
  CUDA Toolkit  cuda_8.0.61_375.26_linux.run     https://developer.nvidia.com/compute/cuda/8.0/Prod2/local_installers/cuda_8.0.61_375.26_linux-run

Procedure

  1. Log in to the target ECS and run the following command to switch to user root:

    sudo su

  2. Install GCC, g++, and make.

    • For Ubuntu 16.04 64bit, run the following commands:

      sudo apt-get install gcc

      sudo apt-get install g++

      sudo apt-get install make

    • For CentOS 7.3, you do not need to install the software.
    • For Debian 8.0, run the following commands:

      sudo apt-get install gcc

      sudo apt-get install g++

      sudo apt-get install make

      sudo apt-get install linux-headers-$(uname -r)
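The per-OS package lists in step 2 could be folded into one helper. The following is a minimal sketch, assuming the ID field of /etc/os-release identifies the distribution as ubuntu, debian, or centos. The function name build_tools_cmd is hypothetical, and the -y flag (which answers the confirmation prompt automatically) is an addition that does not appear in the commands above.

```shell
#!/bin/sh
# Print the command that installs the build tools for a given OS ID.
# build_tools_cmd is a hypothetical helper name, not part of any tool.
build_tools_cmd() {
    case "$1" in
        ubuntu)
            echo "apt-get install -y gcc g++ make" ;;
        debian)
            # Debian additionally needs headers matching the running kernel.
            echo "apt-get install -y gcc g++ make linux-headers-$(uname -r)" ;;
        centos)
            # Step 2 states that CentOS 7.3 needs no extra packages,
            # so print the shell no-op command.
            echo ":" ;;
        *)
            echo "unsupported OS: $1" >&2
            return 1 ;;
    esac
}

# Usage as root (assumption: /etc/os-release exists and defines ID):
#   . /etc/os-release && eval "$(build_tools_cmd "$ID")"
```

Printing the command instead of running it keeps the helper easy to inspect before anything is installed.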

  3. (Optional) Disable the Nouveau driver.

    Perform this step if the Nouveau driver is installed on the target ECS. Disabling it prevents a conflict with the NVIDIA driver.

    1. Run the following command to check whether the Nouveau driver is running on the target ECS:

      lsmod | grep nouveau

      • If yes, go to 3.b.
      • If no, go to 4.
    2. Add the following statements to the end of the /etc/modprobe.d/blacklist.conf file:

      blacklist nouveau

      options nouveau modeset=0

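    3. Run the following commands to back up the existing initramfs image and regenerate it:

      mv /boot/initramfs-$(uname -r).img /boot/initramfs-$(uname -r).img.bak

      update-initramfs -u

      On Ubuntu and Debian, the image is typically named /boot/initrd.img-$(uname -r). If the mv command reports that the file does not exist, back up that file instead.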

    4. Run the following command to restart the ECS:

      reboot
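If step 3 is repeated, appending to blacklist.conf by hand can leave duplicate entries. The following is a minimal sketch of an idempotent append, assuming a POSIX shell and grep. The helper name append_once is hypothetical.

```shell
#!/bin/sh
# Append a line to a file only if that exact line is not already present.
# append_once is a hypothetical helper name.
append_once() {
    file=$1
    line=$2
    # -q: quiet, -x: match the whole line, -F: fixed string (no regex).
    # A missing file makes grep fail, so the line is appended and the
    # file is created.
    if ! grep -qxF -- "$line" "$file" 2>/dev/null; then
        printf '%s\n' "$line" >> "$file"
    fi
}

# Usage as root on the ECS:
#   append_once /etc/modprobe.d/blacklist.conf "blacklist nouveau"
#   append_once /etc/modprobe.d/blacklist.conf "options nouveau modeset=0"
```

Running the two usage lines any number of times should leave exactly one copy of each statement in the file.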

  4. (Optional) Disable the X service.

    If the ECS boots into a GUI, disable the X service before installing the NVIDIA driver.

    1. Run the following command to switch to multi-user mode:

      systemctl set-default multi-user.target

    2. Run the following command to restart the ECS:

      reboot

  5. (Optional) Install the GPU driver.

    You can either use the GPU driver provided in the CUDA Toolkit installation package or download the required GPU driver. Unless otherwise specified, you are advised to install GPU driver NVIDIA-Linux-x86_64-375.66.run, which has been fully verified.
    1. Upload the GPU driver installation package NVIDIA-Linux-x86_64-xxx.yy.run to the /tmp directory of the ECS.

      To download the GPU driver, go to http://www.nvidia.com/Download/index.aspx?lang=en.

      Figure 1 Downloading the GPU driver
    2. Run the following command in the /tmp directory to install the GPU driver:

      sh ./NVIDIA-Linux-x86_64-xxx.yy.run

    3. Run the following command to delete the installation package:

      rm -f NVIDIA-Linux-x86_64-xxx.yy.run
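The placeholder xxx.yy in the package name stands for the driver version, for example 375.66 in NVIDIA-Linux-x86_64-375.66.run. The following minimal sketch extracts that version from a file name, which can help when checking which package was uploaded. The helper name driver_version is hypothetical.

```shell
#!/bin/sh
# Extract the driver version (xxx.yy) from a GPU driver runfile name.
# driver_version is a hypothetical helper name.
driver_version() {
    name=$(basename "$1")
    case "$name" in
        NVIDIA-Linux-x86_64-*.run)
            v=${name#NVIDIA-Linux-x86_64-}   # strip the fixed prefix
            echo "${v%.run}" ;;              # strip the .run suffix
        *)
            echo "unexpected file name: $name" >&2
            return 1 ;;
    esac
}

# Usage:
#   driver_version /tmp/NVIDIA-Linux-x86_64-375.66.run   # prints 375.66
```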

  6. Install the CUDA Toolkit.

    Unless otherwise specified, you are advised to install CUDA Toolkit cuda_8.0.61_375.26_linux.run, which has been fully verified.
    1. Upload the CUDA Toolkit installation package cuda_a.b.cc_xxx.yy_linux.run to the /tmp directory of the ECS.

      To download the CUDA Toolkit, go to https://developer.nvidia.com/cuda-downloads.

    2. Run the following command in the /tmp directory to make the installation package executable:

      chmod +x cuda_a.b.cc_xxx.yy_linux.run

    3. Run the following command to install the CUDA Toolkit:

      ./cuda_a.b.cc_xxx.yy_linux.run -toolkit -samples -silent -override --tmpdir=/tmp/

    4. Run the following command to delete the installation package:

      rm -f cuda_a.b.cc_xxx.yy_linux.run

    5. Run the following commands to check whether the installation is successful:

      cd /usr/local/cuda/samples/1_Utilities/deviceQueryDrv/

      make

      ./deviceQueryDrv

      If the command output contains "Result = PASS", both the CUDA Toolkit and the GPU driver have been installed successfully.

      ./deviceQueryDrv Starting...  
         
       CUDA Device Query (Driver API) statically linked version   
       Detected 1 CUDA Capable device(s)  
         
       Device 0: "Tesla P100-PCIE-16GB"  
         CUDA Driver Version:                           8.0  
         CUDA Capability Major/Minor version number:    6.0  
         Total amount of global memory:                 16276 MBytes (17066885120 bytes)  
         (56) Multiprocessors, ( 64) CUDA Cores/MP:     3584 CUDA Cores  
         GPU Max Clock rate:                            1329 MHz (1.33 GHz)  
         Memory Clock rate:                             715 Mhz  
         Memory Bus Width:                              4096-bit  
         L2 Cache Size:                                 4194304 bytes  
         Max Texture Dimension Sizes                    1D=(131072) 2D=(131072, 65536) 3D=(16384, 16384, 16384)  
         Maximum Layered 1D Texture Size, (num) layers  1D=(32768), 2048 layers  
         Maximum Layered 2D Texture Size, (num) layers  2D=(32768, 32768), 2048 layers  
         Total amount of constant memory:               65536 bytes  
         Total amount of shared memory per block:       49152 bytes  
         Total number of registers available per block: 65536  
         Warp size:                                     32  
         Maximum number of threads per multiprocessor:  2048  
         Maximum number of threads per block:           1024  
         Max dimension size of a thread block (x,y,z): (1024, 1024, 64)  
         Max dimension size of a grid size (x,y,z):    (2147483647, 65535, 65535)  
         Texture alignment:                             512 bytes  
         Maximum memory pitch:                          2147483647 bytes  
         Concurrent copy and kernel execution:          Yes with 2 copy engine(s)  
         Run time limit on kernels:                     No  
         Integrated GPU sharing Host Memory:            No  
         Support host page-locked memory mapping:       Yes  
         Concurrent kernel execution:                   Yes  
         Alignment requirement for Surfaces:            Yes  
         Device has ECC support:                        Enabled  
         Device supports Unified Addressing (UVA):      Yes  
         Device PCI Domain ID / Bus ID / location ID:   0 / 0 / 6  
         Compute Mode:  
            < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >  
       Result = PASS 
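The check for "Result = PASS" can also be scripted instead of read by eye. The following is a minimal sketch, assuming the sample prints that string exactly as in the output above. The helper name device_query_passed is hypothetical.

```shell
#!/bin/sh
# Succeed only if the text on standard input contains "Result = PASS".
# device_query_passed is a hypothetical helper name.
device_query_passed() {
    grep -q 'Result = PASS'
}

# Usage on the ECS, from the deviceQueryDrv sample directory:
#   ./deviceQueryDrv | device_query_passed && echo "driver and toolkit OK"
```

Because the helper only inspects text, it can also be run against a saved log of the sample's output.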

For an OS that uses yum and dracut (for example, CentOS 7.3), perform the following steps:

  1. Log in to the target ECS and run the following command to switch to user root:

    sudo su

  2. (Optional) Install GCC, g++, make, and kernel-devel.

    Perform this step only if these packages have not been installed.

    yum install gcc

    yum install gcc-c++

    yum install make

    yum install kernel-devel-$(uname -r)

  3. (Optional) Disable the Nouveau driver.

    Perform this step if the Nouveau driver is installed on the target ECS. Disabling it prevents a conflict with the NVIDIA driver.

    1. Run the following command to check whether the Nouveau driver is running on the target ECS:

      lsmod | grep nouveau

      • If yes, go to 3.b.
      • If no, go to 4.
    2. Add the following statement to the end of the /etc/modprobe.d/blacklist.conf file:

      blacklist nouveau

    3. Run the following commands to back up the existing initramfs image and regenerate it:

      mv /boot/initramfs-$(uname -r).img /boot/initramfs-$(uname -r).img.bak

      dracut -v /boot/initramfs-$(uname -r).img $(uname -r)

    4. Run the following command to restart the ECS:

      reboot
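After the reboot, it can be worth confirming that the blacklist took effect. The following is a minimal sketch that reads lsmod output and reports whether a module named exactly nouveau is loaded. It matches on the first column only, so lines that merely mention nouveau in the "Used by" column are ignored. The helper name nouveau_loaded is hypothetical.

```shell
#!/bin/sh
# Read lsmod-style output on standard input and succeed if a module named
# exactly "nouveau" is listed in the first column.
# nouveau_loaded is a hypothetical helper name.
nouveau_loaded() {
    awk '$1 == "nouveau" { found = 1 } END { exit found ? 0 : 1 }'
}

# Usage on the ECS after the reboot:
#   if lsmod | nouveau_loaded; then
#       echo "nouveau is still loaded"
#   else
#       echo "nouveau is disabled"
#   fi
```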

  4. (Optional) Disable the X service.

    If the ECS boots into a GUI, disable the X service before installing the NVIDIA driver.

    1. Run the following command to switch to multi-user mode:

      systemctl set-default multi-user.target

    2. Run the following command to restart the ECS:

      reboot
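Whether step 4 applies can be checked before changing anything. The following is a minimal sketch, assuming the GUI is started through the systemd default target graphical.target, in which case systemctl get-default prints that name. The helper name boots_into_gui is hypothetical.

```shell
#!/bin/sh
# Decide from a systemd default-target name whether step 4 applies.
# boots_into_gui is a hypothetical helper name.
boots_into_gui() {
    [ "$1" = "graphical.target" ]
}

# Usage as root on the ECS:
#   if boots_into_gui "$(systemctl get-default)"; then
#       systemctl set-default multi-user.target && reboot
#   fi
```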

  5. (Optional) Install the GPU driver.

    You can either use the GPU driver provided in the CUDA Toolkit installation package or download the required GPU driver. Unless otherwise specified, you are advised to install GPU driver NVIDIA-Linux-x86_64-375.66.run, which has been fully verified.
    1. Upload the GPU driver installation package NVIDIA-Linux-x86_64-xxx.yy.run to the /tmp directory of the ECS.

      To download the GPU driver, go to http://www.nvidia.com/Download/index.aspx?lang=en.

      Figure 2 Downloading the driver installation package
    2. Run the following command in the /tmp directory to install the GPU driver:

      sh ./NVIDIA-Linux-x86_64-xxx.yy.run

    3. Run the following command to delete the installation package:

      rm -f NVIDIA-Linux-x86_64-xxx.yy.run

  6. Install the CUDA Toolkit.

    Unless otherwise specified, you are advised to install CUDA Toolkit cuda_8.0.61_375.26_linux.run, which has been fully verified.
    1. Upload the CUDA Toolkit installation package cuda_a.b.cc_xxx.yy_linux.run to the /tmp directory of the ECS.

      To download the CUDA Toolkit, go to https://developer.nvidia.com/cuda-downloads.

    2. Run the following command in the /tmp directory to make the installation package executable:

      chmod +x cuda_a.b.cc_xxx.yy_linux.run

    3. Run the following command to install the CUDA Toolkit:

      ./cuda_a.b.cc_xxx.yy_linux.run -toolkit -samples -silent -override --tmpdir=/tmp/

    4. Run the following command to delete the installation package:

      rm -f cuda_a.b.cc_xxx.yy_linux.run

    5. Run the following commands to check whether the installation is successful:

      cd /usr/local/cuda/samples/1_Utilities/deviceQueryDrv/

      make

      ./deviceQueryDrv

      If the command output contains "Result = PASS", both the CUDA Toolkit and the GPU driver have been installed successfully.

      ./deviceQueryDrv Starting...  
         
       CUDA Device Query (Driver API) statically linked version   
       Detected 1 CUDA Capable device(s)  
         
       Device 0: "Tesla P100-PCIE-16GB"  
         CUDA Driver Version:                           8.0  
         CUDA Capability Major/Minor version number:    6.0  
         Total amount of global memory:                 16276 MBytes (17066885120 bytes)  
         (56) Multiprocessors, ( 64) CUDA Cores/MP:     3584 CUDA Cores  
         GPU Max Clock rate:                            1329 MHz (1.33 GHz)  
         Memory Clock rate:                             715 Mhz  
         Memory Bus Width:                              4096-bit  
         L2 Cache Size:                                 4194304 bytes  
         Max Texture Dimension Sizes                    1D=(131072) 2D=(131072, 65536) 3D=(16384, 16384, 16384)  
         Maximum Layered 1D Texture Size, (num) layers  1D=(32768), 2048 layers  
         Maximum Layered 2D Texture Size, (num) layers  2D=(32768, 32768), 2048 layers  
         Total amount of constant memory:               65536 bytes  
         Total amount of shared memory per block:       49152 bytes  
         Total number of registers available per block: 65536  
         Warp size:                                     32  
         Maximum number of threads per multiprocessor:  2048  
         Maximum number of threads per block:           1024  
         Max dimension size of a thread block (x,y,z): (1024, 1024, 64)  
         Max dimension size of a grid size (x,y,z):    (2147483647, 65535, 65535)  
         Texture alignment:                             512 bytes  
         Maximum memory pitch:                          2147483647 bytes  
         Concurrent copy and kernel execution:          Yes with 2 copy engine(s)  
         Run time limit on kernels:                     No  
         Integrated GPU sharing Host Memory:            No  
         Support host page-locked memory mapping:       Yes  
         Concurrent kernel execution:                   Yes  
         Alignment requirement for Surfaces:            Yes  
         Device has ECC support:                        Enabled  
         Device supports Unified Addressing (UVA):      Yes  
         Device PCI Domain ID / Bus ID / location ID:   0 / 0 / 6  
         Compute Mode:  
            < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >  
       Result = PASS
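The placeholder name cuda_a.b.cc_xxx.yy_linux.run used in step 6 encodes two versions: a.b.cc is the CUDA Toolkit version and xxx.yy is the version of the GPU driver bundled in the package. The following minimal sketch splits a file name into those two fields, assuming the name follows that pattern. The helper name cuda_versions is hypothetical.

```shell
#!/bin/sh
# Split a CUDA runfile name of the form cuda_<toolkit>_<driver>_linux.run
# into its two version fields. cuda_versions is a hypothetical helper name.
cuda_versions() {
    name=$(basename "$1")
    case "$name" in
        cuda_*_*_linux.run) ;;
        *) echo "unexpected file name: $name" >&2; return 1 ;;
    esac
    rest=${name#cuda_}      # e.g. 8.0.61_375.26_linux.run
    toolkit=${rest%%_*}     # text before the first underscore
    rest=${rest#*_}         # e.g. 375.26_linux.run
    driver=${rest%%_*}      # text before the next underscore
    echo "toolkit=$toolkit driver=$driver"
}

# Usage:
#   cuda_versions /tmp/cuda_8.0.61_375.26_linux.run
#   # prints: toolkit=8.0.61 driver=375.26
```

Comparing the driver field with the version of a separately installed GPU driver shows whether the two packages differ, as they do for the recommended pair in Table 1 (375.66 and 375.26).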