Updated on 2024-06-17 GMT+08:00

Creating a Workload That Will Receive vGPU Support

This section describes how to use GPU virtualization to isolate GPU compute and memory so that GPU resources are used efficiently.

Prerequisites

Constraints

  • Init containers do not support GPU virtualization.
  • For a single GPU:
    • A maximum of 20 vGPUs can be created.
    • A maximum of 20 pods that use the isolation capability can be scheduled.
    • Only workloads in the same isolation mode can be scheduled. (GPU virtualization supports two isolation modes: GPU memory isolation and isolation of GPU memory and compute.)
  • For different containers of the same workload (see the sketch after this list):
    • Only one GPU model can be configured. Two or more GPU models cannot be configured at the same time.
    • The same GPU usage mode must be configured for all containers. Virtualization and non-virtualization modes cannot be mixed.
  • After a GPU is virtualized, the GPU cannot be used by workloads that use shared GPU resources.
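
A minimal sketch of a configuration that satisfies these constraints is shown below. It assumes both containers of the workload use the same isolation mode (GPU memory isolation); the container names and the image placeholder are illustrative only. Configuring one container with virtualized GPU resources and another with non-virtualized GPU resources in the same workload is not allowed.

  containers:
  - name: container-1                  # Illustrative name
    image: <your_image_address>
    resources:
      limits:
        volcano.sh/gpu-mem: 5000       # GPU memory isolation
  - name: container-2                  # Illustrative name
    image: <your_image_address>
    resources:
      limits:
        volcano.sh/gpu-mem: 5000       # Same mode as container-1; modes cannot be mixed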

Creating a Workload That Will Receive vGPU Support on the Console

  1. Log in to the UCS console.
  2. Click the on-premises cluster name to access its details page, choose Workloads in the navigation pane, and click Create Workload in the upper right corner.
  3. Configure workload parameters. In Container Settings, choose Basic Info and set the GPU quota.

    Video memory: The value must be a positive integer, in MiB. If the configured GPU memory exceeds that of a single GPU, GPU scheduling cannot be performed.

    Computing power: The value must be a multiple of 5, in %, and cannot exceed 100.

    Figure 1 Configuring workload information
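
    The GPU quota set on the console maps to the container's GPU resource limits described in the kubectl section below. As an illustrative reference only, setting Video memory to 5,000 MiB and Computing power to 25% corresponds to limits such as the following:

      resources:
        limits:
          volcano.sh/gpu-mem: 5000                # Video memory, in MiB
          volcano.sh/gpu-core.percentage: 25      # Computing power, in %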

  4. Configure other parameters and click Create Workload.
  5. Verify the isolation capability of GPU virtualization.

    • Log in to the target container and check its GPU memory.
      kubectl exec -it gpu-app -- nvidia-smi
      Expected output:
      Wed Apr 12 07:54:59 2023
      +-----------------------------------------------------------------------------+
      | NVIDIA-SMI 470.141.03   Driver Version: 470.141.03   CUDA Version: 11.4     |
      |-------------------------------+----------------------+----------------------+
      | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
      | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
      |                               |                      |               MIG M. |
      |===============================+======================+======================|
      |   0  Tesla V100-SXM2...  Off  | 00000000:21:01.0 Off |                    0 |
      | N/A   27C    P0    37W / 300W |   4792MiB /  5000MiB |      0%      Default |
      |                               |                      |                  N/A |
      +-------------------------------+----------------------+----------------------+
      +-----------------------------------------------------------------------------+
      | Processes:                                                                  |
      |  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
      |        ID   ID                                                   Usage      |
      |=============================================================================|
      +-----------------------------------------------------------------------------+

      5,000 MiB of GPU memory is allocated to the container, and 4,792 MiB is used.

    • Run the following command on the node to check GPU memory isolation:
      export PATH=$PATH:/usr/local/nvidia/bin;nvidia-smi

      Expected output:

      Wed Apr 12 09:31:10 2023
      +-----------------------------------------------------------------------------+
      | NVIDIA-SMI 470.141.03   Driver Version: 470.141.03   CUDA Version: 11.4     |
      |-------------------------------+----------------------+----------------------+
      | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
      | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
      |                               |                      |               MIG M. |
      |===============================+======================+======================|
      |   0  Tesla V100-SXM2...  Off  | 00000000:21:01.0 Off |                    0 |
      | N/A   27C    P0    37W / 300W |   4837MiB / 16160MiB |      0%      Default |
      |                               |                      |                  N/A |
      +-------------------------------+----------------------+----------------------+
      +-----------------------------------------------------------------------------+
      | Processes:                                                                  |
      |  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
      |        ID   ID                                                   Usage      |
      |=============================================================================|
      |    0   N/A  N/A    760445      C   python                           4835MiB |
      +-----------------------------------------------------------------------------+

      The GPU node provides 16,160 MiB of GPU memory in total, and the pod uses 4,837 MiB.

Creating a Workload That Will Receive vGPU Support Using kubectl

  1. Log in to the master node and use kubectl to connect to the cluster.
  2. Create a workload that supports vGPUs by creating a gpu-app.yaml file.

    GPU virtualization supports two isolation modes: GPU memory isolation and isolation of both GPU memory and compute. volcano.sh/gpu-core.percentage cannot be set on its own to isolate only GPU compute.

    • Isolate the GPU memory only:
      apiVersion: apps/v1
      kind: Deployment
      metadata:
        name: gpu-app
        labels:
          app: gpu-app
      spec:
        replicas: 1
        selector:
          matchLabels:
            app: gpu-app
        template:
          metadata:
            labels:
              app: gpu-app
          spec:
            containers:
            - name: container-1
              image: <your_image_address>     # Replace it with your image address.
              resources:
                limits:
                  volcano.sh/gpu-mem: 5000    # GPU memory allocated to the pod
            imagePullSecrets:
            - name: default-secret
    • Isolate both the GPU memory and compute:
      apiVersion: apps/v1
      kind: Deployment
      metadata:
        name: gpu-app
        labels:
          app: gpu-app
      spec:
        replicas: 1
        selector:
          matchLabels:
            app: gpu-app
        template:
          metadata:
            labels:
              app: gpu-app
          spec:
            containers:
            - name: container-1
              image: <your_image_address>     # Replace it with your image address.
              resources:
                limits:
                  volcano.sh/gpu-mem: 5000              # GPU memory allocated to the pod
                  volcano.sh/gpu-core.percentage: 25    # Compute allocated to the pod
            imagePullSecrets:
            - name: default-secret
      Table 1 Key parameters

      Parameter                      | Required | Description
      -------------------------------|----------|------------------------------------------------------------------------------
      volcano.sh/gpu-mem             | No       | GPU memory allocated to the pod. The value must be a positive integer, in MiB. If the configured GPU memory exceeds that of a single GPU, GPU scheduling cannot be performed.
      volcano.sh/gpu-core.percentage | No       | Compute allocated to the pod. The value must be a multiple of 5, in %, and cannot exceed 100.
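
      (Optional) Before creating the workload, you can check whether the node reports the virtualized GPU resources. Assuming the GPU virtualization add-on registers volcano.sh/gpu-mem and volcano.sh/gpu-core.percentage as allocatable extended resources on the node (resource names may vary by add-on version), a quick check is the following command, where <node_name> is replaced with the name of the GPU node:

      kubectl describe node <node_name> | grep -A 10 Allocatable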

  3. Run the following command to create a workload:

    kubectl apply -f gpu-app.yaml
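
    Before verifying isolation, you can confirm that the pod created by the Deployment is running. The app=gpu-app label comes from the YAML above:

    kubectl get pod -l app=gpu-app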

  4. Verify the isolation.

    • Log in to a container and check its GPU memory.
      kubectl exec -it gpu-app -- nvidia-smi

      Expected output:

      Wed Apr 12 07:54:59 2023
      +-----------------------------------------------------------------------------+
      | NVIDIA-SMI 470.141.03   Driver Version: 470.141.03   CUDA Version: 11.4     |
      |-------------------------------+----------------------+----------------------+
      | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
      | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
      |                               |                      |               MIG M. |
      |===============================+======================+======================|
      |   0  Tesla V100-SXM2...  Off  | 00000000:21:01.0 Off |                    0 |
      | N/A   27C    P0    37W / 300W |   4792MiB /  5000MiB |      0%      Default |
      |                               |                      |                  N/A |
      +-------------------------------+----------------------+----------------------+
      +-----------------------------------------------------------------------------+
      | Processes:                                                                  |
      |  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
      |        ID   ID                                                   Usage      |
      |=============================================================================|
      +-----------------------------------------------------------------------------+

      5,000 MiB of GPU memory is allocated to the container, and 4,792 MiB is used.

    • Run the following command on the node to check GPU memory isolation:
      /usr/local/nvidia/bin/nvidia-smi

      Expected output:

      Wed Apr 12 09:31:10 2023
      +-----------------------------------------------------------------------------+
      | NVIDIA-SMI 470.141.03   Driver Version: 470.141.03   CUDA Version: 11.4     |
      |-------------------------------+----------------------+----------------------+
      | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
      | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
      |                               |                      |               MIG M. |
      |===============================+======================+======================|
      |   0  Tesla V100-SXM2...  Off  | 00000000:21:01.0 Off |                    0 |
      | N/A   27C    P0    37W / 300W |   4837MiB / 16160MiB |      0%      Default |
      |                               |                      |                  N/A |
      +-------------------------------+----------------------+----------------------+
      +-----------------------------------------------------------------------------+
      | Processes:                                                                  |
      |  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
      |        ID   ID                                                   Usage      |
      |=============================================================================|
      |    0   N/A  N/A    760445      C   python                           4835MiB |
      +-----------------------------------------------------------------------------+

      The GPU node provides 16,160 MiB of GPU memory in total, and the pod uses 4,837 MiB in this example.
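
    • (Optional) Check compute isolation. If volcano.sh/gpu-core.percentage is configured (25 in the example YAML), GPU utilization observed from inside the container while a GPU workload is running is expected to stay around the configured percentage. The exact behavior depends on the GPU virtualization implementation; a simple way to sample utilization is:
      kubectl exec -it gpu-app -- nvidia-smi dmon -s u -c 5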