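Priority-Based Scheduling

Priority indicates the importance of a job relative to other jobs. Volcano is compatible with the Pod priority definition (PriorityClass) in Kubernetes. With this capability enabled, the scheduler gives precedence to scheduling high-priority workloads.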

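Prerequisites

A cluster of v1.19 or later has been created. For details, see Creating a Standard Cluster.
The Volcano add-on has been installed. For details, see Volcano Scheduler.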

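Introduction to Priority-Based Scheduling

Users run many kinds of services in a cluster, such as core and non-core services and online and offline services. Based on each service's importance and SLA requirements, you can assign a corresponding priority to each service type. For example, assigning a high priority to core and online services ensures that they obtain cluster resources first.

Table 1 lists the priority-based scheduling types supported by CCE clusters.

Table 1 Service priority assurance scheduling
Scheduling Type	Description	Supported Schedulers
Priority-based scheduling	The scheduler gives precedence to running high-priority workloads but does not proactively evict low-priority workloads that are already running. This setting is enabled by default and cannot be disabled.	kube-scheduler / Volcano scheduler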
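
The non-evicting behavior described in Table 1 can also be expressed on an individual PriorityClass in upstream Kubernetes. The following is a minimal sketch, assuming the cluster exposes the standard `preemptionPolicy` field of `scheduling.k8s.io/v1`. The name `high-priority-nonpreempting` is hypothetical, and this document does not confirm whether the Volcano scheduler honors the field.

```yaml
# Hypothetical PriorityClass; the name and value are assumptions for illustration.
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: high-priority-nonpreempting
value: 1000000
# Never: Pods of this class are queued ahead of lower-priority Pods
# but do not preempt Pods that are already running.
preemptionPolicy: Never
globalDefault: false
description: "High priority without preemption (illustrative only)."
```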

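Configuring a Priority-Based Scheduling Policy

Log in to the CCE console.
Click the cluster name to open the cluster. In the navigation pane on the left, choose "Configuration Center". On the right, click the "Scheduling Configuration" tab.
In the "Service priority assurance scheduling" area, configure priority-based scheduling.
- Priority-based scheduling: The scheduler gives precedence to running high-priority workloads but does not proactively evict low-priority workloads that are already running. This setting is enabled by default and cannot be disabled.

After the configuration is complete, you can use a priority definition (PriorityClass) in a workload or Volcano Job for priority-based scheduling.

Create one or more priority definitions (PriorityClass).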

apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: high-priority
value: 1000000
globalDefault: false
description: ""
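
The `globalDefault` field in the definition above controls whether a PriorityClass applies to Pods that do not set `priorityClassName`. As a hedged illustration, the sketch below defines a cluster-wide default. The name `default-priority` and the value 1000 are hypothetical and are not part of the original example. Kubernetes allows at most one PriorityClass with `globalDefault: true`.

```yaml
# Hypothetical default PriorityClass; the name and value are assumptions for illustration.
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: default-priority
value: 1000
# Pods created without priorityClassName receive this priority.
globalDefault: true
description: "Default priority for Pods that do not specify a PriorityClass."
```

A global default affects only Pods created after the PriorityClass exists. Pods that already exist keep their current priority.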

Create a workload or Volcano Job and specify priorityClassName.

Workload

apiVersion: apps/v1
kind: Deployment
metadata:
  name: high-test
  labels:
    app: high-test
spec:
  replicas: 5
  selector:
    matchLabels:
      app: test
  template:
    metadata:
      labels:
        app: test
    spec:
      priorityClassName: high-priority
      schedulerName: volcano
      containers:
      - name: test
        image: busybox
        imagePullPolicy: IfNotPresent
        command: ['sh', '-c', 'echo "Hello, Kubernetes!" && sleep 3600']
        resources:
          requests:
            cpu: 500m
          limits:
            cpu: 500m

Volcano Job

apiVersion: batch.volcano.sh/v1alpha1
kind: Job
metadata:
  name: vcjob
spec:
  schedulerName: volcano
  minAvailable: 4
  priorityClassName: high-priority
  tasks:
    - replicas: 4
      name: "test"
      template:
        spec:
          containers:
            - image: alpine
              command: ["/bin/sh", "-c", "sleep 1000"]
              imagePullPolicy: IfNotPresent
              name: running
              resources:
                requests:
                  cpu: "1"
          restartPolicy: OnFailure
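
A Volcano Job can contain several tasks, and the Pod template of each task accepts the standard Pod field `priorityClassName`. The sketch below assumes that the Volcano priority plugin orders tasks within a Job by Pod priority. It reuses the `high-priority` class and assumes a `low-priority` PriorityClass also exists (one is defined in the example later in this document). The Job and task names are hypothetical.

```yaml
apiVersion: batch.volcano.sh/v1alpha1
kind: Job
metadata:
  name: vcjob-task-priority        # hypothetical name
spec:
  schedulerName: volcano
  minAvailable: 3
  priorityClassName: high-priority # job-level priority
  tasks:
    - replicas: 1
      name: "master"
      template:
        spec:
          priorityClassName: high-priority  # task-level priority (assumed to be honored within the Job)
          containers:
            - image: alpine
              command: ["/bin/sh", "-c", "sleep 1000"]
              imagePullPolicy: IfNotPresent
              name: running
              resources:
                requests:
                  cpu: "1"
          restartPolicy: OnFailure
    - replicas: 2
      name: "worker"
      template:
        spec:
          priorityClassName: low-priority   # lower priority than the master task
          containers:
            - image: alpine
              command: ["/bin/sh", "-c", "sleep 1000"]
              imagePullPolicy: IfNotPresent
              name: running
              resources:
                requests:
                  cpu: "1"
          restartPolicy: OnFailure
```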

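Priority-Based Scheduling Example

Suppose the cluster has two idle nodes and workloads at three priority levels: high-priority, med-priority, and low-priority. The high-priority workload runs first and occupies all cluster resources. The med-priority and low-priority workloads are then submitted. Because all resources are held by the higher-priority workload, both remain in the Pending state. When the high-priority workload ends, the med-priority workload is scheduled first, in line with the priority-based scheduling rule.

Use priority.yaml to create three priority definitions (PriorityClass): high-priority, med-priority, and low-priority.

The content of priority.yaml is as follows: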

apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: high-priority
value: 100
globalDefault: false
description: "This priority class should be used for volcano job only."
---
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: med-priority
value: 50
globalDefault: false
description: "This priority class should be used for volcano job only."
---
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: low-priority
value: 10
globalDefault: false
description: "This priority class should be used for volcano job only."

Create the PriorityClass objects:

kubectl apply -f priority.yaml

View the priority definitions:

kubectl get PriorityClass

The command output is as follows:

NAME                      VALUE        GLOBAL-DEFAULT   AGE
high-priority             100          false            97s
low-priority              10           false            97s
med-priority              50           false            97s
system-cluster-critical   2000000000   false            6d6h
system-node-critical      2000001000   false            6d6h

Create the high-priority workload priority-high, which occupies all cluster resources.

high_priority_job.yaml

apiVersion: batch.volcano.sh/v1alpha1
kind: Job
metadata:
  name: priority-high
spec:
  schedulerName: volcano
  minAvailable: 4
  priorityClassName: high-priority
  tasks:
    - replicas: 4
      name: "test"
      template:
        spec:
          containers:
            - image: alpine
              command: ["/bin/sh", "-c", "sleep 1000"]
              imagePullPolicy: IfNotPresent
              name: running
              resources:
                requests:
                  cpu: "1"
          restartPolicy: OnFailure

Run the following command to submit the job:

kubectl apply -f high_priority_job.yaml

Run kubectl get pod to check the Pod status:

NAME                   READY   STATUS    RESTARTS   AGE
priority-high-test-0   1/1     Running   0          3s
priority-high-test-1   1/1     Running   0          3s
priority-high-test-2   1/1     Running   0          3s
priority-high-test-3   1/1     Running   0          3s

At this point, all node resources in the cluster are occupied.

Create the medium-priority workload priority-medium and the low-priority workload priority-low.

med_priority_job.yaml

apiVersion: batch.volcano.sh/v1alpha1
kind: Job
metadata:
  name: priority-medium
spec:
  schedulerName: volcano
  minAvailable: 4
  priorityClassName: med-priority
  tasks:
    - replicas: 4
      name: "test"
      template:
        spec:
          containers:
            - image: alpine
              command: ["/bin/sh", "-c", "sleep 1000"]
              imagePullPolicy: IfNotPresent
              name: running
              resources:
                requests:
                  cpu: "1"
          restartPolicy: OnFailure

low_priority_job.yaml

apiVersion: batch.volcano.sh/v1alpha1
kind: Job
metadata:
  name: priority-low
spec:
  schedulerName: volcano
  minAvailable: 4
  priorityClassName: low-priority
  tasks:
    - replicas: 4
      name: "test"
      template:
        spec:
          containers:
            - image: alpine
              command: ["/bin/sh", "-c", "sleep 1000"]
              imagePullPolicy: IfNotPresent
              name: running
              resources:
                requests:
                  cpu: "1"
          restartPolicy: OnFailure

Run the following commands to submit the jobs:

kubectl apply -f med_priority_job.yaml
kubectl apply -f low_priority_job.yaml

Run kubectl get pod to check the Pod status. Because cluster resources are insufficient, the new Pods are in the Pending state:

NAME                     READY   STATUS    RESTARTS   AGE
priority-high-test-0     1/1     Running   0          3m29s
priority-high-test-1     1/1     Running   0          3m29s
priority-high-test-2     1/1     Running   0          3m29s
priority-high-test-3     1/1     Running   0          3m29s
priority-low-test-0      0/1     Pending   0          2m26s
priority-low-test-1      0/1     Pending   0          2m26s
priority-low-test-2      0/1     Pending   0          2m26s
priority-low-test-3      0/1     Pending   0          2m26s
priority-medium-test-0   0/1     Pending   0          2m36s
priority-medium-test-1   0/1     Pending   0          2m36s
priority-medium-test-2   0/1     Pending   0          2m36s
priority-medium-test-3   0/1     Pending   0          2m36s

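Delete the high-priority workload priority-high to release cluster resources. The medium-priority workload priority-medium is then scheduled first.

Run kubectl delete -f high_priority_job.yaml to release the resources, then check the Pod scheduling status: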

NAME                     READY   STATUS    RESTARTS   AGE
priority-low-test-0      0/1     Pending   0          5m18s
priority-low-test-1      0/1     Pending   0          5m18s
priority-low-test-2      0/1     Pending   0          5m18s
priority-low-test-3      0/1     Pending   0          5m18s
priority-medium-test-0   1/1     Running   0          5m28s
priority-medium-test-1   1/1     Running   0          5m28s
priority-medium-test-2   1/1     Running   0          5m28s
priority-medium-test-3   1/1     Running   0          5m28s
