Updated on 2025-01-07 GMT+08:00

Configuring Auto Scaling for xGPU Nodes

If there are not enough GPU virtualization resources in a cluster, xGPU nodes can be scaled out automatically. This section describes how to create an auto scaling policy for xGPU nodes.

Prerequisites

Step 1: Configure the Node Pool

  1. Log in to the CCE console and click the cluster name to access the cluster console. In the navigation pane, choose Nodes.
  2. Click Create Node Pool to create an xGPU node pool. For details, see Creating a Node Pool.

    For details about requirements on xGPU nodes, such as the specifications, OS, and runtime, see Preparing xGPU Resources.

  3. After the node pool is created, click Auto Scaling. In the AS Object area, enable Auto Scaling for the target specification and click OK.

Step 2: Configure Heterogeneous Resources

  1. In the navigation pane, choose Settings. Then, click the Heterogeneous Resources tab.
  2. In the GPU Settings area, locate Node Pool Configurations and select the created node pool.
  3. Select a driver that meets GPU virtualization requirements and enable GPU virtualization based on Preparing xGPU Resources.

    Figure 1 Heterogeneous Resources

  4. Click Confirm configuration.

Step 3: Create a GPU Virtualization Workload and Trigger Capacity Expansion

Create a Deployment that uses GPU virtualization resources and requests more GPU memory than the cluster can currently provide. For details, see Using GPU Virtualization. For example, if the cluster has a total of 16 GiB of GPU memory available and each pod requires 1 GiB, configure 17 pods. The pods then require 17 GiB of GPU memory in total, so at least one pod stays pending, which triggers a scale-out.
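The Deployment described above might look like the following minimal sketch. It assumes the volcano.sh/gpu-mem.128Mi extended resource name used by CCE GPU virtualization (in which GPU memory is requested in 128 MiB units, so 8 units equal 1 GiB); the Deployment name and container image are placeholders and should be replaced with your actual workload:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: xgpu-scale-test        # placeholder name
spec:
  replicas: 17                 # 17 pods x 1 GiB = 17 GiB, exceeding the 16 GiB available
  selector:
    matchLabels:
      app: xgpu-scale-test
  template:
    metadata:
      labels:
        app: xgpu-scale-test
    spec:
      containers:
      - name: container-1
        image: nginx:latest    # placeholder image; use your GPU workload image
        resources:
          requests:
            volcano.sh/gpu-mem.128Mi: 8   # 8 x 128 MiB = 1 GiB of GPU memory per pod
          limits:
            volcano.sh/gpu-mem.128Mi: 8
```

Because only 16 of the 17 replicas can be scheduled onto the existing GPU memory, the remaining pending pod causes the autoscaler to add an xGPU node from the node pool configured in Step 1.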

After a short period of time, you can see the newly added xGPU node on the node pool details page, which confirms that the scale-out was triggered.