Virtual GPU Burst Scheduling
Burst scheduling is an elastic GPU resource scheduling policy. It guarantees each pod its requested share of compute power while letting pods dynamically borrow compute power that other pods on the same GPU leave idle, maximizing overall utilization. This policy dynamically optimizes the allocation of GPU compute power while retaining the existing scheduling and isolation architecture for GPU memory.
GPU virtualization divides each GPU's time into slices based on the number of containers requesting GPU resources. These time slices, labeled segment 1, segment 2, ..., segment N, distribute GPU compute power among the containers. As shown in Figure 1, the upper part shows the isolated scheduling policy for both compute and GPU memory resources, and the lower part shows the burst scheduling policy. Assume that containers 1, 2, and 3 request 5%, 5%, and 10% of the compute power, respectively, and that container 3 is not currently using the GPU. The comparison focuses on compute scheduling only.
- Isolated scheduling: CCE allocates the requested compute power to each container: 5% to container 1, 5% to container 2, and 10% to container 3.
- Burst scheduling: While container 3 is idle, its unused compute power is dynamically reallocated to containers 1 and 2 in proportion to their requests (5:5), giving 50% each to containers 1 and 2 and 0% to container 3. When container 3 starts using the GPU again, CCE reallocates compute power based on the initial 5:5:10 ratio: 25% for container 1, 25% for container 2, and 50% for container 3.
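The reallocation described above follows a simple proportional rule: at any moment, 100% of the GPU compute power is split among the currently active containers in proportion to their requested percentages, and idle containers receive nothing until they become active again. The sketch below is an illustrative model only; the function name and the idle-detection rule are assumptions for clarity, not CCE internals.

```python
def burst_allocation(requests, active):
    """Illustrative model of burst scheduling.

    requests: dict mapping container name -> requested compute percentage.
    active:   set of containers currently using the GPU.
    Returns the effective compute percentage each container gets:
    100% of the GPU is split among active containers, pro rata to
    their requests; idle containers get 0.
    """
    total = sum(requests[c] for c in active)
    return {c: (100 * requests[c] / total if c in active else 0)
            for c in requests}

requests = {"c1": 5, "c2": 5, "c3": 10}

# Container 3 idle: containers 1 and 2 split the whole GPU 50/50.
print(burst_allocation(requests, {"c1", "c2"}))
# {'c1': 50.0, 'c2': 50.0, 'c3': 0}

# All three active: allocation returns to the 5:5:10 request ratio.
print(burst_allocation(requests, {"c1", "c2", "c3"}))
# {'c1': 25.0, 'c2': 25.0, 'c3': 50.0}
```

The same rule explains the 80%/20% split in the use case later in this page: with requests of 20% and 5% and both containers active, 20/(20+5) of the GPU goes to the first container.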
Prerequisites
- A CCE standard or Turbo cluster of v1.23.8-r0, v1.25.3-r0, or later is available.
- GPU nodes with cluster-wide virtualization enabled are available in the cluster. For details, see Preparing Virtual GPU Resources.
- The CCE AI Suite (NVIDIA GPU) add-on of v2.1.50, v2.7.67, or later has been installed in the cluster. For details, see CCE AI Suite (NVIDIA GPU).
- Volcano of v1.10.5 or later has been installed. For details, see Volcano Scheduler.
Notes and Constraints
- Before burst scheduling is enabled, all GPU and GPU virtualization tasks in the cluster must be migrated or stopped to prevent service interruptions or scheduling failures.
- After burst scheduling is enabled:
- Only compute power supports burst scheduling. GPU memory allocation remains based on the configured quota.
- Containers supporting isolated scheduling for both compute and GPU memory resources (policy=1) can no longer be created in the cluster.
- Containers supporting isolated scheduling for GPU memory resources only (policy=0) can still be created, but they cannot be scheduled on the same GPU as burst containers.
- Containers supporting GPU sharing can still be created, but they cannot be scheduled on the same GPU as burst containers.
Enabling Virtual GPU Burst Scheduling
- Log in to the CCE console and click the cluster name to access the cluster console. The Overview page is displayed.
- In the navigation pane, choose Add-ons. In the right pane, find the CCE AI Suite (NVIDIA GPU) add-on and click Edit.
- In the Install Add-on dialog box, click Edit YAML. Search for enabled_xgpu_burst in the YAML file and set it to true. The following is an example:
```yaml
...
custom:
  annotations: {}
  compatible_with_legacy_api: false
  component_schedulername: kube-scheduler
  disable_mount_path_v1: false
  disable_nvidia_gsp: true
  driver_mount_paths: bin,lib64
  enable_fault_isolation: true
  enable_health_monitoring: true
  enable_metrics_monitoring: true
  enable_simple_lib64_mount: true
  enable_xgpu: false
  enabled_xgpu_burst: true
  gpu_driver_config: {}
  health_check_xids_v2: 74,79
  inject_ld_Library_path: ''
  install_nvidia_peermem: false
  is_driver_from_nvidia: true
...
```
- After the setting, click OK in the lower right corner of the page. CCE AI Suite (NVIDIA GPU) is then automatically upgraded. After the add-on status changes to Running, the burst function takes effect.
Using Burst Scheduling
- Install kubectl on an existing ECS and access a cluster using kubectl. For details, see Accessing a Cluster Using kubectl.
- Run the following command to create a YAML file, which is used for virtual GPU burst scheduling container creation:
vim xgpu-burst.yaml

Example file content:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: xgpu-burst
  labels:
    app: xgpu-burst
spec:
  replicas: 1
  selector:
    matchLabels:
      app: xgpu-burst
      xgpu.burst/enabled: "true"     # Enable the burst function.
  template:
    metadata:
      labels:
        app: xgpu-burst
        xgpu.burst/enabled: "true"   # Pod labels must match the selector.
    spec:
      containers:
        - name: container-1
          image: <your_image_address>    # Replace it with your image address.
          resources:
            limits:
              volcano.sh/gpu-mem.128Mi: 40          # GPU memory allocated to the pod: 40 x 128 MiB = 5120 MiB.
              volcano.sh/gpu-core.percentage: 25    # Compute power allocated to the pod, as a percentage.
      imagePullSecrets:
        - name: default-secret
      schedulerName: volcano
```
- After the burst function is enabled, containers supporting isolated scheduling for both compute and GPU memory resources cannot be created. If the volcano.sh/gpu-mem.128Mi and volcano.sh/gpu-core.percentage parameters are specified in resources.requests and resources.limits, the xgpu.burst/enabled: true label must be set. Otherwise, the workload will not be scheduled.
- Run the following command to create the workload:
kubectl apply -f xgpu-burst.yaml

If information similar to the following is displayed, the workload has been created:
deployment.apps/xgpu-burst created
- Run the following command to view the created pod:
kubectl get pod -n default

Information similar to the following is displayed:
NAME                         READY   STATUS    RESTARTS   AGE
xgpu-burst-6bdb4d7cb-pmtc2   1/1     Running   0          21s
- Log in to the pod and check the scheduling policy used by it.
kubectl exec -it xgpu-burst-6bdb4d7cb-pmtc2 -- cat /proc/xgpu/0/policy

If 6 is displayed in the command output, the burst scheduling policy is in use.
Disabling Virtual GPU Burst Scheduling
Before disabling virtual GPU burst scheduling, migrate or stop containers using this capability in the cluster. Otherwise, virtual GPU containers will not be scheduled to GPU nodes running burst containers, blocking task scheduling and wasting resources.
- Log in to the CCE console and click the cluster name to access the cluster console. The Overview page is displayed.
- In the navigation pane, choose Add-ons. In the right pane, find the CCE AI Suite (NVIDIA GPU) add-on and click Edit.
- In the Install Add-on dialog box, click Edit YAML. Search for enabled_xgpu_burst in the YAML file and set it to false. The following is an example:
```yaml
...
custom:
  annotations: {}
  compatible_with_legacy_api: false
  component_schedulername: kube-scheduler
  disable_mount_path_v1: false
  disable_nvidia_gsp: true
  driver_mount_paths: bin,lib64
  enable_fault_isolation: true
  enable_health_monitoring: true
  enable_metrics_monitoring: true
  enable_simple_lib64_mount: true
  enable_xgpu: false
  enabled_xgpu_burst: false
  gpu_driver_config: {}
  health_check_xids_v2: 74,79
  inject_ld_Library_path: ''
  install_nvidia_peermem: false
  is_driver_from_nvidia: true
...
```
- After the setting, click OK in the lower right corner of the page. CCE AI Suite (NVIDIA GPU) is then automatically upgraded. After the add-on status changes to Running, the burst function is disabled.
Use Cases for Virtual GPU Burst Scheduling
Assume that there is a GPU node in your cluster. You create a workload with the burst scheduling policy: a single pod with one container requesting 20% of the GPU compute power. Initially, the entire GPU compute power is available to this container. Next, you create another workload with the same burst scheduling policy: a separate pod with one container requesting 5% of the GPU compute power. The GPU node then dynamically reallocates compute power based on the request ratio (20:5) of the two containers: 80% to the first container and 20% to the second.
- Install kubectl on an existing ECS and access a cluster using kubectl. For details, see Accessing a Cluster Using kubectl.
- Run the following command to create a YAML file, which is used for the creation of the first burst scheduling container:
vim xgpu-burst1.yaml

Example file content:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: xgpu-burst1
  labels:
    app: xgpu-burst1
spec:
  replicas: 1
  selector:
    matchLabels:
      app: xgpu-burst1
      xgpu.burst/enabled: "true"     # Enable the burst function.
  template:
    metadata:
      labels:
        app: xgpu-burst1
        xgpu.burst/enabled: "true"   # Pod labels must match the selector.
    spec:
      containers:
        - name: container-1
          image: nginx:latest              # Replace it with your image address.
          command: ["<dosomething>"]       # Replace it with the actual GPU service command.
          resources:
            limits:
              volcano.sh/gpu-mem.128Mi: 40          # GPU memory allocated to the pod: 40 x 128 MiB = 5120 MiB.
              volcano.sh/gpu-core.percentage: 20    # Compute power allocated to the pod, as a percentage.
      imagePullSecrets:
        - name: default-secret
      schedulerName: volcano
```
- Run the following command to create the workload:
kubectl apply -f xgpu-burst1.yaml

If information similar to the following is displayed, the workload has been created:
deployment.apps/xgpu-burst1 created
- Run the following command to view the created pod:
kubectl get pod -n default

If information similar to the following is displayed, the pod is running:
NAME                          READY   STATUS    RESTARTS   AGE
xgpu-burst1-6bdb4d7cb-pmtc2   1/1     Running   0          21s
- Log in to the GPU node and run the following command to check the compute power allocation for the workload container:
xgpu-smi
The entire GPU compute power is allocated to the single container:

```
Fri Mar  7 03:36:03 2025
+---------------------------------------------------------------------------------------+
| HUAWEI CLOUD XGPU-SMI                                               XGPU Version: 1.0 |
|=========================================+======================+======================|
| Container-Id          | GPU | GPU-Util/Limit | GPU-Memory-Usage/Limit                 |
+-----------------------------------------+----------------------+----------------------+
| 5eff70afff85          | 0   | 100% / 20%     | 1028Mi / 5120Mi                        |
|=========================================+======================+======================|
...
```
- Run the following command to create a YAML file, which is used for the creation of the second burst scheduling container:
vim xgpu-burst2.yaml

Example file content:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: xgpu-burst2
  labels:
    app: xgpu-burst2
spec:
  replicas: 1
  selector:
    matchLabels:
      app: xgpu-burst2
      xgpu.burst/enabled: "true"     # Enable the burst function.
  template:
    metadata:
      labels:
        app: xgpu-burst2
        xgpu.burst/enabled: "true"   # Pod labels must match the selector.
    spec:
      containers:
        - name: container-1
          image: nginx:latest              # Replace it with your image address.
          command: ["<dosomething>"]       # Replace it with the actual GPU service command.
          resources:
            limits:
              volcano.sh/gpu-mem.128Mi: 40         # GPU memory allocated to the pod: 40 x 128 MiB = 5120 MiB.
              volcano.sh/gpu-core.percentage: 5    # Compute power allocated to the pod, as a percentage.
      imagePullSecrets:
        - name: default-secret
      schedulerName: volcano
```
- Run the following command to create the workload:
kubectl apply -f xgpu-burst2.yaml

If information similar to the following is displayed, the workload has been created:
deployment.apps/xgpu-burst2 created
- Run the following command to view the created pod:
kubectl get pod -n default

If information similar to the following is displayed, both pods are running:
NAME                          READY   STATUS    RESTARTS   AGE
xgpu-burst1-6bdb4d7cb-pmtc2   1/1     Running   0          21s
xgpu-burst2-5xdb4d7cb-qmld3   1/1     Running   0          21s
- Log in to the GPU node and run the following command to check the compute power allocation for the workload containers. Compute power reallocation may take some time to complete.
xgpu-smi
The GPU node dynamically reallocates compute power based on the request ratio (20:5) of the two containers: 80% to the first container and 20% to the second.

```
Fri Mar  7 03:36:03 2025
+---------------------------------------------------------------------------------------+
| HUAWEI CLOUD XGPU-SMI                                               XGPU Version: 1.0 |
|=========================================+======================+======================|
| Container-Id          | GPU | GPU-Util/Limit | GPU-Memory-Usage/Limit                 |
+-----------------------------------------+----------------------+----------------------+
| 5eff70afff85          | 0   | 80% / 20%      | 1024Mi / 5120Mi                        |
| 98d5201b7ea3          | 0   | 20% / 5%       | 2041Mi / 5120Mi                        |
|=========================================+======================+======================|
...
```