Creating a Workload That Will Receive vGPU Support

This section describes how to use GPU virtualization to isolate the compute and GPU memory and efficiently use GPU resources.

Prerequisites

You have prepared GPU virtualization resources.
If you want to create a cluster by running commands, use kubectl to connect to the cluster. For details, see Connecting to a Cluster Using kubectl.

Constraints

The init container does not support GPU virtualization.

For a single GPU:
- A maximum of 20 vGPUs can be created.
- A maximum of 20 pods that use the isolation capability can be scheduled.
- Only workloads in the same isolation mode can be scheduled. (GPU virtualization supports two isolation modes: GPU memory isolation and isolation of GPU memory and compute.)
For different containers of the same workload:
- You can configure one GPU model and cannot configure two or more GPU models concurrently.
- You can configure the same GPU usage mode and cannot configure virtualization and non-virtualization modes concurrently.
After a GPU is virtualized, the GPU cannot be used by workloads that use shared GPU resources.

Creating a Workload That Will Receive vGPU Support on the Console

Log in to the UCS console.
Click the on-premises cluster name to access its details page, choose Workloads in the navigation pane, and click Create Workload in the upper right corner.
Configure workload parameters. In Container Settings, choose Basic Info and set the GPU quota.

Video memory: The value must be a positive integer, in MiB. If the configured GPU memory exceeds that of a single GPU, GPU scheduling cannot be performed.

Computing power: The value must be a multiple of 5, in %, and cannot exceed 100.

Figure 1 Configuring workload information
Configure other parameters and click Create Workload.

Verify the isolation capability of GPU virtualization.

kubectl exec -it gpu-app -- nvidia-smi

Expected output:

Wed Apr 12 07:54:59 2023
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.141.03   Driver Version: 470.141.03   CUDA Version: 11.4     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla V100-SXM2...  Off  | 00000000:21:01.0 Off |                    0 |
| N/A   27C    P0    37W / 300W |   4792MiB /  5000MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
+-----------------------------------------------------------------------------+

5,000 MiB of GPU memory is allocated to the container, and 4,792 MiB is used.

Run the following command on the node to check the isolation of the GPU memory:

export PATH=$PATH:/usr/local/nvidia/bin;nvidia-smi

Expected output:

Wed Apr 12 09:31:10 2023
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.141.03   Driver Version: 470.141.03   CUDA Version: 11.4     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla V100-SXM2...  Off  | 00000000:21:01.0 Off |                    0 |
| N/A   27C    P0    37W / 300W |   4837MiB / 16160MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A    760445      C   python                           4835MiB |
+-----------------------------------------------------------------------------+

16,160 MiB of GPU memory is allocated to the GPU node, and 4,837 MiB is used by the pod.

Creating a Workload That Will Receive vGPU Support Using kubectl

Create a workload that will support vGPUs. Create a gpu-app.yaml file.

There are two isolation modes: GPU memory isolation and isolation of both GPU memory and compute. volcano.sh/gpu-core.percentage cannot be set separately for GPU compute isolation.

Isolate the GPU memory only:

apiVersion: apps/v1
 kind: Deployment
 metadata:
   name: gpu-app
   labels:
     app: gpu-app
 spec:
   replicas: 1
   selector:
     matchLabels:
       app: gpu-app
   template:
      metadata:
       labels:
         app: gpu-app
     spec:
       containers:
       - name: container-1
         image: <your_image_address>     # Replace it with your image address.
         resources:
           limits:
            volcano.sh/gpu-mem: 5000    # GPU memory allocated to the pod
       imagePullSecrets:
         - name: default-secret

Isolate both the GPU memory and compute:

apiVersion: apps/v1
 kind: Deployment
 metadata:
   name: gpu-app
   labels:
     app: gpu-app
 spec:
   replicas: 1
   selector:
     matchLabels:
       app: gpu-app
   template:
      metadata:
       labels:
         app: gpu-app
     spec:
       containers:
       - name: container-1
         image: <your_image_address>     # Replace it with your image address.
         resources:
           limits:
             volcano.sh/gpu-mem: 5000    # GPU memory allocated to the pod
            volcano.sh/gpu-core.percentage: 25    # Compute allocated to the pod
       imagePullSecrets:
         - name: default-secret

**Table 1** Key parameters
Parameter	Required	Description
volcano.sh/gpu-mem	No	The value must be a positive integer, in MiB. If the configured GPU memory exceeds that of a single GPU, GPU scheduling cannot be performed.
volcano.sh/gpu-core.percentage	No	The value must be a multiple of 5, in %, and cannot exceed 100.

Run the following command to create a workload:
```
kubectl apply -f gpu-app.yaml
```

Verify the isolation.

kubectl exec -it gpu-app -- nvidia-smi

Expected output:

Wed Apr 12 07:54:59 2023
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.141.03   Driver Version: 470.141.03   CUDA Version: 11.4     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla V100-SXM2...  Off  | 00000000:21:01.0 Off |                    0 |
| N/A   27C    P0    37W / 300W |   4792MiB /  5000MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
+-----------------------------------------------------------------------------+

5,000 MiB of GPU memory is allocated to the container, and 4,792 MiB is used.

Run the following command on the node to check GPU memory isolation:

/usr/local/nvidia/bin/nvidia-smi

Expected output:

Wed Apr 12 09:31:10 2023
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.141.03   Driver Version: 470.141.03   CUDA Version: 11.4     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla V100-SXM2...  Off  | 00000000:21:01.0 Off |                    0 |
| N/A   27C    P0    37W / 300W |   4837MiB / 16160MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A    760445      C   python                           4835MiB |
+-----------------------------------------------------------------------------+

16,160 MiB of GPU memory is allocated to the node, and 4,837 MiB is used by the pod in this example.

Parent topic: GPU Virtualization

Previous topic: Preparing GPU Virtualization Resources

Next topic: Monitoring GPU Virtualization Resources