GPU Scheduling
You can use GPUs in CCE containers.
Prerequisites
- A GPU node is ready for use. For details, see Buying a Node.
- The gpu-beta add-on has been installed. During installation, select the GPU driver to use on the node. For details, see gpu-beta.
Using GPUs
Create a workload and request GPUs. You can specify the number of GPUs as follows:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: gpu-test
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      app: gpu-test
  template:
    metadata:
      labels:
        app: gpu-test
    spec:
      containers:
      - image: nginx:perl
        name: container-0
        resources:
          requests:
            cpu: 250m
            memory: 512Mi
            nvidia.com/gpu: 1   # Number of requested GPUs
          limits:
            cpu: 250m
            memory: 512Mi
            nvidia.com/gpu: 1   # Maximum number of GPUs that can be used
      imagePullSecrets:
      - name: default-secret

nvidia.com/gpu specifies the number of GPUs to be requested. The value can be smaller than 1. For example, nvidia.com/gpu: 0.5 indicates that multiple pods share a GPU.
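The fractional request described above can be sketched as a minimal pod spec. This is an illustration only: the pod name gpu-share-test is hypothetical, and keeping the limit equal to the request is an assumption rather than something stated here. It also assumes the installed GPU add-on accepts fractional values, because upstream Kubernetes only accepts whole numbers for extended resources such as nvidia.com/gpu.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: gpu-share-test          # Hypothetical name used for this sketch
  namespace: default
spec:
  containers:
  - image: nginx:perl
    name: container-0
    resources:
      requests:
        nvidia.com/gpu: 0.5     # Fractional value: the pod shares a GPU with other pods
      limits:
        nvidia.com/gpu: 0.5     # Assumption: limit kept equal to the request
  imagePullSecrets:
  - name: default-secret
```

Two pods created from a spec like this could then be placed on the same GPU, subject to how the add-on allocates shared GPUs.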
After nvidia.com/gpu is specified, workloads will not be scheduled to nodes without GPUs. If GPUs are insufficient, a Kubernetes event similar to "0/2 nodes are available: 2 Insufficient nvidia.com/gpu." will be reported.
To use GPUs on the CCE console, select the GPU quota and specify the percentage of GPUs reserved for the container when creating a workload.
GPU Node Labels
CCE labels GPU-enabled nodes that are ready for use. Different types of GPU-enabled nodes have different labels.
When using GPUs, you can configure affinity between pods and nodes based on these labels so that pods are scheduled to the correct nodes. The following example uses nodeSelector to schedule the pod to a node labeled accelerator: nvidia-t4.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: gpu-test
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      app: gpu-test
  template:
    metadata:
      labels:
        app: gpu-test
    spec:
      nodeSelector:
        accelerator: nvidia-t4
      containers:
      - image: nginx:perl
        name: container-0
        resources:
          requests:
            cpu: 250m
            memory: 512Mi
            nvidia.com/gpu: 1   # Number of requested GPUs
          limits:
            cpu: 250m
            memory: 512Mi
            nvidia.com/gpu: 1   # Maximum number of GPUs that can be used
      imagePullSecrets:
      - name: default-secret
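The same label-based placement can also be written with the standard Kubernetes nodeAffinity field instead of nodeSelector. The sketch below is one possible form, not the documented CCE method: the pod name gpu-affinity-test is hypothetical, and the accelerator: nvidia-t4 label is taken from the example above, so replace it with the label that your GPU nodes actually carry.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: gpu-affinity-test       # Hypothetical name used for this sketch
  namespace: default
spec:
  affinity:
    nodeAffinity:
      # Hard requirement: the pod stays pending if no node matches
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: accelerator    # Label key taken from the example above
            operator: In
            values:
            - nvidia-t4         # Add more values to allow other GPU types
  containers:
  - image: nginx:perl
    name: container-0
    resources:
      requests:
        nvidia.com/gpu: 1
      limits:
        nvidia.com/gpu: 1
  imagePullSecrets:
  - name: default-secret
```

Compared with nodeSelector, the In operator accepts a list of values, which may be convenient when a workload can run on more than one type of GPU node.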