Updated on 2023-12-19 GMT+08:00

Overview

On-premises clusters use xGPU to virtualize GPUs so that GPU memory and compute can be allocated dynamically. A single GPU can be virtualized into a maximum of 20 vGPUs. Dynamic allocation is more flexible than static allocation: you can assign each service only the GPU resources it needs to run stably, which improves GPU utilization.

Advantages

GPU virtualization of on-premises clusters has the following advantages:

  • Flexible: Compute and GPU memory can be allocated at a fine granularity. Compute is allocated in increments of 5% of a GPU, and GPU memory in increments of 1 MiB.
  • Isolated: Two isolation modes are supported: isolation of GPU memory only, and isolation of both GPU memory and compute.
  • Compatible: Services do not need to be recompiled, and the CUDA library does not need to be replaced.
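
The allocation limits described above (5% compute steps, 1 MiB memory steps, at most 20 vGPUs per GPU) can be illustrated with a minimal Python sketch. This is not the xGPU API: the names VGpuRequest, validate_request, and fits_on_one_gpu are hypothetical. The sketch also assumes that each request specifies both compute and memory, and that the vGPUs on one physical GPU cannot oversubscribe its compute or memory.

```python
from dataclasses import dataclass
from typing import Sequence

COMPUTE_STEP_PERCENT = 5  # compute granularity: 5% of one GPU
MAX_VGPUS_PER_GPU = 20    # a GPU can be virtualized into at most 20 vGPUs


@dataclass(frozen=True)
class VGpuRequest:
    """Hypothetical request shape used for illustration only."""
    compute_percent: int  # share of one GPU's compute, in percent
    memory_mib: int       # GPU memory in MiB (integer, so 1 MiB granularity)


def validate_request(req: VGpuRequest, gpu_memory_mib: int) -> None:
    """Raise ValueError if one request violates the stated granularity."""
    if not isinstance(req.memory_mib, int) or req.memory_mib <= 0:
        raise ValueError("GPU memory must be a positive whole number of MiB")
    if req.memory_mib > gpu_memory_mib:
        raise ValueError("GPU memory exceeds the capacity of one GPU")
    if not 0 < req.compute_percent <= 100:
        raise ValueError("compute must be between 5% and 100%")
    if req.compute_percent % COMPUTE_STEP_PERCENT != 0:
        raise ValueError("compute must be a multiple of 5%")


def fits_on_one_gpu(requests: Sequence[VGpuRequest], gpu_memory_mib: int) -> bool:
    """Check whether a set of vGPU requests can share one physical GPU.

    Assumption (not stated in the source): compute and memory are not
    oversubscribed across the vGPUs of a single GPU.
    """
    for req in requests:
        validate_request(req, gpu_memory_mib)
    return (
        len(requests) <= MAX_VGPUS_PER_GPU
        and sum(r.compute_percent for r in requests) <= 100
        and sum(r.memory_mib for r in requests) <= gpu_memory_mib
    )
```

Under these assumptions, twenty requests of 5% compute and 1024 MiB each exactly fill a GPU with 20480 MiB of memory, while a request for 7% compute is rejected because it is not a multiple of 5%.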