Scheduling Overview

CCE supports multiple resource and task scheduling policies to enhance application performance and overall cluster resource utilization. This section describes the main functions of CPU scheduling, GPU/NPU heterogeneous scheduling, Volcano scheduling, and cloud native hybrid deployment.

CPU Scheduling

CCE provides CPU management policies that enable the allocation of complete physical CPU cores to applications. This improves application performance and reduces scheduling latency.

Function	Description	Documentation
CPU policy	If a node runs a large number of CPU-intensive pods, workloads may be migrated between CPU cores. For CPU-sensitive applications, you can allocate dedicated physical cores to them using the CPU management policy provided by Kubernetes. This improves application performance and reduces scheduling latency.	CPU Policy
Enhanced CPU policy	Based on the conventional CPU management policy, this policy supports intelligent scheduling for burstable pods, whose CPU request and limit values must be positive integers. These pods can use specific CPU cores preferentially, but they do not exclusively use these CPU cores.	Enhanced CPU Policy
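The difference between the two policies can be sketched in a few lines of Python. This is a simplified illustration, not CCE's or kubelet's implementation: the function names `to_millicores` and `classify` are hypothetical, only CPU values are inspected, and the check assumes a single-container pod. In upstream Kubernetes, exclusive cores additionally require the pod to be in the Guaranteed QoS class, which also depends on memory requests and limits.

```python
def to_millicores(quantity: str) -> int:
    """Convert a Kubernetes CPU quantity such as "2", "0.5", or "500m" to millicores."""
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return round(float(quantity) * 1000)


def classify(request: str, limit: str) -> str:
    """Roughly classify a container's CPU settings (hypothetical helper).

    Assumes request <= limit and ignores memory, so it only approximates
    the QoS rules that the real policies rely on.
    """
    req, lim = to_millicores(request), to_millicores(limit)
    whole_cores = req > 0 and lim > 0 and req % 1000 == 0 and lim % 1000 == 0
    if not whole_cores:
        # Fractional CPU values: the container runs in the shared CPU pool.
        return "shared pool"
    if req == lim:
        # Integer request equal to limit: candidate for dedicated cores.
        return "exclusive cores (CPU policy)"
    # Integer request below an integer limit: a burstable pod that may be
    # given preferred, but not exclusive, cores.
    return "preferred cores (enhanced CPU policy)"


if __name__ == "__main__":
    for request, limit in [("2", "2"), ("2", "4"), ("500m", "1")]:
        print(request, limit, "->", classify(request, limit))
```

A container with a request and limit of 2 CPUs is a candidate for dedicated cores, while one with a request of 2 and a limit of 4 matches the burstable case that the enhanced policy targets.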

GPU Scheduling

CCE provides GPU scheduling for clusters, facilitating refined resource allocation and optimizing resource utilization. This accommodates the specific GPU compute needs of diverse workloads, thereby enhancing the overall scheduling efficiency and service performance of the cluster.

Function	Description	Documentation
Default GPU scheduling in Kubernetes	You can specify the number of GPUs that a pod requests. The value can be less than 1 so that multiple pods can share a single GPU.	Default GPU Scheduling in Kubernetes
GPU monitoring	Prometheus and Grafana are used to monitor GPU metrics. Monitoring helps you optimize compute performance, identify faults quickly, and schedule resources efficiently, which improves GPU utilization and reduces O&M costs.	GPU Monitoring
GPU auto scaling	CCE allows you to configure auto scaling policies for workloads and nodes based on GPU metrics to dynamically schedule and optimize resources. This improves computing efficiency, ensures stable service operation, and reduces O&M costs.	GPU Auto Scaling
GPU fault handling	If a GPU becomes faulty, CCE promptly reports an event and isolates the faulty GPU based on the event information. This ensures that other functional GPUs can continue operating, minimizing the impact on services.	GPU Fault Handling
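As a rough illustration of requesting a GPU count, including a value below 1 for sharing, the following Python sketch builds a pod manifest as a dictionary. The resource name `nvidia.com/gpu` is an assumption (it is the name commonly exposed by the NVIDIA device plugin), and the helper `gpu_pod_manifest`, the pod name, and the image are hypothetical. Check the Default GPU Scheduling in Kubernetes documentation for the exact resource name and value rules that apply to your cluster.

```python
import json
from fractions import Fraction


def gpu_pod_manifest(name: str, image: str, gpus: str) -> dict:
    """Build a minimal pod manifest that requests `gpus` GPUs (hypothetical helper)."""
    if Fraction(gpus) <= 0:
        raise ValueError("GPU count must be greater than 0")
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "containers": [
                {
                    "name": name,
                    "image": image,
                    # Assumed resource name; a value below 1 asks for a share
                    # of one GPU, as described in the table above.
                    "resources": {"limits": {"nvidia.com/gpu": gpus}},
                }
            ]
        },
    }


if __name__ == "__main__":
    manifest = gpu_pod_manifest("gpu-demo", "example/image:latest", "0.5")
    print(json.dumps(manifest, indent=2))
```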

NPU Scheduling

CCE provides NPU scheduling for clusters, facilitating efficient processing of inference and image recognition tasks.

Function	Description	Documentation
Complete NPU allocation	CCE allocates NPU resources to workload pods based on the requested count.	Complete NPU Allocation
NPU monitoring	Monitoring NPU metrics in a cluster helps you identify performance bottlenecks, optimize resource utilization, and quickly locate exceptions. In CCE standard and Turbo clusters, NPU-Exporter collects NPU metrics using DCMI or the hccn tool and uploads them to the cloud native monitoring system for real-time monitoring and alarm reporting, which improves system stability and reliability.	NPU Monitoring

Volcano Scheduling

Volcano is a Kubernetes-based batch processing platform that supports machine learning, deep learning, bioinformatics, genomics, and other big data applications. It provides general-purpose, high-performance computing capabilities, such as job scheduling, heterogeneous chip management, and job running management.

Function	Description	Documentation
NUMA affinity scheduling	Volcano aims to make the scheduler NUMA topology aware so that: (1) pods are not scheduled to nodes whose NUMA topology does not match their requirements; (2) pods are scheduled to the node with the most suitable NUMA topology.	NUMA Affinity Scheduling
Application scaling priority policies	With application scaling priority policies, you can customize the scaling order of pods across different node types to manage resources more efficiently.	Application Scaling Priority Policies
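The two goals of NUMA affinity scheduling, filtering out unsuitable nodes and then preferring the best match, can be illustrated with a toy model. This is not Volcano's algorithm: the function `pick_node`, the node names, and the "tightest fit" scoring rule are assumptions made for the example, and the model only considers pods that need all of their CPUs from a single NUMA node.

```python
from typing import Dict, List, Optional


def pick_node(free_cpus_per_numa: Dict[str, List[int]], cpus_needed: int) -> Optional[str]:
    """Pick a node for a pod that needs `cpus_needed` CPUs on one NUMA node.

    `free_cpus_per_numa` maps a node name to the free CPU count of each of
    its NUMA nodes. Returns None when no node qualifies.
    """
    leftovers = {}
    for node, zones in free_cpus_per_numa.items():
        # Filter step: keep only NUMA nodes that can hold the whole request.
        fitting = [free for free in zones if free >= cpus_needed]
        if fitting:
            # Score step (assumed rule): the fewer CPUs left over, the better.
            leftovers[node] = min(fitting) - cpus_needed
    if not leftovers:
        return None
    # Sort names first so that ties are broken deterministically.
    return min(sorted(leftovers), key=leftovers.get)


if __name__ == "__main__":
    nodes = {"node-a": [2, 2], "node-b": [8, 1], "node-c": [4, 6]}
    print(pick_node(nodes, 4))
```

In this example, `node-a` has four free CPUs in total but is filtered out because no single NUMA node can hold the request, and `node-c` is chosen over `node-b` because it leaves no CPUs stranded on the selected NUMA node.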

Cloud Native Hybrid Deployment

The cloud native hybrid deployment solution focuses on the Volcano and Kubernetes ecosystems to help users improve resource utilization and efficiency and reduce costs.

Function	Description	Documentation
Dynamic resource oversubscription	Based on the types of online and offline jobs, Volcano schedules workloads onto resources that have been requested but are not in use (the difference between the requested and the used resources). This enables resource oversubscription and hybrid deployment, which improves cluster resource utilization.	Dynamic Resource Oversubscription
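The quantity that oversubscription draws on is the gap between requested and used resources. The arithmetic can be sketched as follows; the function name `oversubscribable_millicores` and the sample figures are hypothetical, and a real scheduler may apply further limits that this sketch does not model.

```python
def oversubscribable_millicores(requested: int, used: int) -> int:
    """Return the requested-but-unused CPU, in millicores, never below zero."""
    return max(requested - used, 0)


if __name__ == "__main__":
    # Hypothetical node: pods have requested 6000m of CPU but use only 2500m.
    spare = oversubscribable_millicores(requested=6000, used=2500)
    print(f"{spare}m of CPU is requested but unused")
```

With these sample values, 3500m of CPU is requested but idle, which is the pool that the description above refers to.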
