Updated on 2023-11-15 GMT+08:00

Notice on the NVIDIA GPU Driver Vulnerability (CVE-2021-1056)

Description

NVIDIA disclosed a vulnerability (CVE-2021-1056) in the NVIDIA GPU drivers that is related to device isolation. When a container is started in non-privileged mode, an attacker can exploit this vulnerability to create special character device files inside the container and gain access to all GPU devices on the host.
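
The attack path relies on creating NVIDIA character device files (major number 195, for example /dev/nvidia0) inside a container with mknod. As a hedged illustration of a general hardening check (not an official NVIDIA mitigation), and assuming Docker is the container runtime on the node, you can verify whether a container keeps the MKNOD capability:

# Show which capabilities were explicitly added or dropped for a running container.
# "my-app" is an illustrative container name.
docker inspect --format '{{.HostConfig.CapAdd}} {{.HostConfig.CapDrop}}' my-app

# Start a test container with CAP_MKNOD dropped; creating an NVIDIA device node
# (character device, major 195) inside it should then fail.
docker run --rm --cap-drop MKNOD ubuntu:20.04 \
  sh -c 'mknod /dev/nvidia0 c 195 0 || echo "mknod blocked as expected"'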

For more information about this vulnerability, see CVE-2021-1056.

According to the official NVIDIA announcement, if your CCE cluster uses GPU-enabled nodes (ECSs) with the recommended NVIDIA GPU driver (Tesla 396.37), your driver is not affected by this vulnerability. If you have installed a custom NVIDIA GPU driver or updated the driver on a node, that node may be affected.

Table 1 Vulnerability information

Type                    CVE-ID           Severity    Discovered
Privilege escalation    CVE-2021-1056    Medium      2021-01-07

Impact

According to the vulnerability notice provided by NVIDIA, the affected NVIDIA GPU driver versions are those earlier than the fixed versions listed in Solution.

For more information, see the official NVIDIA website.

Note:

  • The NVIDIA GPU driver version recommended for CCE clusters and the gpu-beta add-on is not listed among the affected versions disclosed on the official NVIDIA website. If NVIDIA releases official updates, you will be notified and provided with possible solutions to fix this vulnerability.
  • If you selected a custom NVIDIA GPU driver version or updated the GPU driver on a node, check whether your driver is affected by this vulnerability by referring to the affected versions described in Impact.

Querying the NVIDIA Driver Version of a GPU Node

Log in to your GPU node and run the following command to view the driver version:

[root@XXX36 bin]# ./nvidia-smi 
Fri Apr 16 10:28:28 2021       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.32.03    Driver Version: 460.32.03    CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla T4            Off  | 00000000:21:01.0 Off |                    0 |
| N/A   68C    P0    31W /  70W |      0MiB / 15109MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

The preceding command output indicates that the GPU driver version of the node is 460.32.03.
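
If you only need the version string, for example in a script, nvidia-smi can also print it directly (assuming the binary is in your PATH):

# Print only the driver version, one line per GPU
nvidia-smi --query-gpu=driver_version --format=csv,noheader
# Example output: 460.32.03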

Solution

Upgrade the GPU driver on the node to the target version based on Impact; a simple version check is sketched after the list below.

After upgrading your NVIDIA GPU driver, you need to restart the GPU node, which will temporarily affect your services.

  • If your node driver version is in the 418 series, upgrade it to 418.181.07.
  • If your node driver version is in the 450 series, upgrade it to 450.102.04.
  • If your node driver version is in the 460 series, upgrade it to 460.32.03.
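
The following hedged sketch compares the driver version reported by nvidia-smi with the fixed version of its branch. The thresholds are the fixed versions listed above, and the comparison relies on sort -V for version ordering:

#!/bin/bash
# Sketch: report whether the node's driver is older than the fixed release
# of its branch. Assumes nvidia-smi is in the PATH.
ver=$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)
branch=${ver%%.*}
case "$branch" in
  418) fixed="418.181.07" ;;
  450) fixed="450.102.04" ;;
  460) fixed="460.32.03" ;;
  *)   echo "Branch $branch: check the official NVIDIA bulletin."; exit 0 ;;
esac
# sort -V orders versions numerically; if the installed version sorts first
# and differs from the fixed version, it is earlier than the fix.
if [ "$(printf '%s\n' "$ver" "$fixed" | sort -V | head -n 1)" = "$ver" ] && [ "$ver" != "$fixed" ]; then
  echo "Driver $ver is earlier than $fixed: upgrade is recommended."
else
  echo "Driver $ver is at or later than $fixed."
fi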

If you upgrade the GPU driver on a CCE cluster node, also upgrade or reinstall the gpu-beta add-on and enter the download address of the fixed NVIDIA GPU driver when installing the add-on.
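
After the node restarts and the add-on has been reinstalled, the following hedged check confirms the new driver version and that the GPU is still advertised to the cluster. It assumes the add-on exposes GPUs as the nvidia.com/gpu extended resource, and <node-name> is a placeholder for your GPU node:

# On the GPU node: confirm the upgraded driver version
nvidia-smi --query-gpu=driver_version --format=csv,noheader

# From a host with cluster access: confirm the GPU resource is reported for the
# node (nvidia.com/gpu is an assumption about the add-on's resource name)
kubectl describe node <node-name> | grep -i "nvidia.com/gpu"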