Volcano

Updated on 2025-02-26 GMT+08:00

Introduction

Volcano is a batch processing platform built on Kubernetes. It provides a series of features required by machine learning, deep learning, bioinformatics, genomics, and other big data applications, serving as a powerful supplement to native Kubernetes capabilities.

Volcano provides general-purpose, high-performance computing capabilities such as a job scheduling engine, heterogeneous chip management, and job running management. It serves end users through computing frameworks in industries such as AI, big data, gene sequencing, and rendering. (Volcano is open source on GitHub.)

Volcano provides job scheduling, job management, and queue management for computing applications. Its main features are as follows:

  • Runs diverse computing frameworks, such as TensorFlow, MPI, and Spark, in containers on Kubernetes. Provides common APIs for batch computing jobs through CRDs, various add-ons, and advanced job lifecycle management.
  • Provides advanced scheduling capabilities for batch computing and high-performance computing scenarios, including gang scheduling, preemptive priority scheduling, bin packing, resource reservation, and task topology.
  • Manages queues effectively for scheduling jobs, and supports complex job scheduling capabilities such as queue priority and multi-level queues.

Open source community: https://github.com/volcano-sh/volcano
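
For example, a minimal Volcano Job that gang-schedules two worker pods (neither pod starts until both can be scheduled) might look as follows. This is a sketch: the job name, queue, and image are placeholders.

    apiVersion: batch.volcano.sh/v1alpha1
    kind: Job
    metadata:
      name: gang-demo                 # hypothetical job name
    spec:
      schedulerName: volcano          # schedule this job with Volcano
      minAvailable: 2                 # gang scheduling: run only when 2 pods fit
      queue: default                  # submit the job to the default queue
      tasks:
        - replicas: 2
          name: worker
          template:
            spec:
              restartPolicy: Never
              containers:
                - name: worker
                  image: busybox      # placeholder image
                  command: ["sh", "-c", "echo running && sleep 60"]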

Installing the Add-on

NOTICE:

In an on-premises cluster, the Volcano add-on pods do not support multi-AZ deployment or node affinity policies.

After the Volcano add-on is installed in an on-premises cluster, Volcano can be specified as the scheduler for a workload only in the workload's YAML, as shown in the example below.
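
A minimal sketch of specifying Volcano as the scheduler in a workload YAML (the Deployment name and image are placeholders):

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: scheduler-demo            # hypothetical name
    spec:
      replicas: 2
      selector:
        matchLabels:
          app: scheduler-demo
      template:
        metadata:
          labels:
            app: scheduler-demo
        spec:
          schedulerName: volcano      # use Volcano instead of the default scheduler
          containers:
            - name: app
              image: nginx:latest     # placeholder image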

  1. Log in to the UCS console and click the cluster name to access the cluster console. In the navigation pane, choose Add-ons. Locate Volcano and click Install.
  2. Select Standalone, Custom, or HA for Add-on Specifications.

    If you select Custom, the following requests and limits are recommended for volcano-controller and volcano-scheduler:

    • If the number of nodes is less than 100, retain the default configuration. The requested vCPUs are 500m, and the limit is 2000m. The requested memory is 500 MiB, and the limit is 2000 MiB.
    • If there are more than 100 nodes, increase the requested vCPUs by 500m and the requested memory by 1000 MiB each time 100 nodes (10,000 pods) are added. Set the vCPU limit to 1500m more than the vCPU request, and the memory limit to 1000 MiB more than the memory request.
      NOTE:

      Recommended formulas for calculating the requested values:

      • Requested vCPUs: Multiply the number of target nodes by the number of target pods, interpolate between the node × pod products of the rows in Table 1, and round the request and limit up to the closest specifications in the table.

        For example, for 2,000 nodes (20,000 pods), the product of the number of nodes and the number of pods is 40 million, which is closest to the 700/70,000 specification (whose product is 49 million). Set the CPU request to 4000m and the limit to 5500m.

      • Requested memory: It is recommended that 2.4 GiB of memory be allocated to every 1,000 nodes and 1 GiB of memory be allocated to every 10,000 pods. The requested memory is the sum of these two values. (The obtained value may be different from the recommended value in Table 1. You can use either of them.)

        Requested memory = Number of nodes/1000 × 2.4 GiB + Number of pods/10000 × 1 GiB

        For example, for 2,000 nodes and 20,000 pods, the requested memory is 6.8 GiB (2000/1000 × 2.4 GiB + 20,000/10,000 × 1 GiB).

      Table 1 Recommended requests and limits for volcano-controller and volcano-scheduler

      Nodes/Pods in a Cluster | CPU Request (m) | CPU Limit (m) | Memory Request (Mi) | Memory Limit (Mi)
      50/5,000   | 500   | 2,000 | 500   | 2,000
      100/10,000 | 1,000 | 2,500 | 1,500 | 2,500
      200/20,000 | 1,500 | 3,000 | 2,500 | 3,500
      300/30,000 | 2,000 | 3,500 | 3,500 | 4,500
      400/40,000 | 2,500 | 4,000 | 4,500 | 5,500
      500/50,000 | 3,000 | 4,500 | 5,500 | 6,500
      600/60,000 | 3,500 | 5,000 | 6,500 | 7,500
      700/70,000 | 4,000 | 5,500 | 7,500 | 8,500

  3. Configure the parameters of the default Volcano scheduler. For details, see Table 2.

    colocation_enable: ''
    default_scheduler_conf:
      actions: 'allocate, backfill'
      tiers:
        - plugins:
            - name: 'priority'
            - name: 'gang'
            - name: 'conformance'
        - plugins:
            - name: 'drf'
            - name: 'predicates'
            - name: 'nodeorder'
        - plugins:
            - name: 'cce-gpu-topology-predicate'
            - name: 'cce-gpu-topology-priority'
            - name: 'cce-gpu'
        - plugins:
            - name: 'nodelocalvolume'
            - name: 'nodeemptydirvolume'
            - name: 'nodeCSIscheduling'
            - name: 'networkresource'
    Table 2 Volcano add-ons

    Each entry lists the add-on name, its function, its configurable parameters (Description), and a configuration example (Demonstration) where one applies.

    resource_exporter_enable
    Function: Collects the NUMA topology information of a node.
    Description: Values:
    • true: The NUMA topology information of the current node can be viewed.
    • false: Collection of the NUMA topology information of the current node is disabled.

    binpack
    Function: Schedules pods to nodes with high resource utilization to reduce resource fragments.
    Description:
    • binpack.weight: weight of the binpack add-on.
    • binpack.cpu: CPU weight. The default value is 1.
    • binpack.memory: memory weight. The default value is 1.
    • binpack.resources: additional resource types to consider.
    Demonstration:
    - plugins:
      - name: binpack
        arguments:
          binpack.weight: 10
          binpack.cpu: 1
          binpack.memory: 1
          binpack.resources: nvidia.com/gpu, example.com/foo
          binpack.resources.nvidia.com/gpu: 2
          binpack.resources.example.com/foo: 3

    conformance
    Function: Prevents key pods, such as those in the kube-system namespace, from being preempted.

    gang
    Function: Considers a group of pods as a whole when allocating resources.

    priority
    Function: Schedules pods based on custom workload priorities.

    overcommit
    Function: Inflates cluster resources by a factor when workloads are enqueued to improve enqueuing efficiency. If all workloads are Deployments, remove this add-on or set the raising factor to 2.0.
    Description: overcommit-factor: raising factor. The default value is 1.2.
    Demonstration:
    - plugins:
      - name: overcommit
        arguments:
          overcommit-factor: 2.0

    drf
    Function: Schedules resources based on dominant resource fairness (DRF): the job whose dominant resource share is smallest is scheduled first.

    predicates
    Function: Determines whether a task can be bound to a node using a series of evaluation algorithms, such as node/pod affinity, taint toleration, node port conflicts, volume limits, and volume zone matching.

    nodeorder
    Function: Scores all nodes for a task using a series of scoring algorithms.
    Description:
    • nodeaffinity.weight: pods are scheduled based on node affinity. The default value is 1.
    • podaffinity.weight: pods are scheduled based on pod affinity. The default value is 1.
    • leastrequested.weight: pods are scheduled to the node with the least requested resources. The default value is 1.
    • balancedresource.weight: pods are scheduled to the node with balanced resource usage. The default value is 1.
    • mostrequested.weight: pods are scheduled to the node with the most requested resources. The default value is 0.
    • tainttoleration.weight: pods are scheduled to nodes with high taint tolerance. The default value is 1.
    • imagelocality.weight: pods are scheduled to the node where the required images already exist. The default value is 1.
    • selectorspread.weight: pods are evenly scheduled to different nodes. The default value is 0.
    • volumebinding.weight: pods are scheduled to the node with the local PV delayed binding policy. The default value is 1.
    • podtopologyspread.weight: pods are scheduled based on pod topology spread constraints. The default value is 2.
    Demonstration:
    - plugins:
      - name: nodeorder
        arguments:
          leastrequested.weight: 1
          mostrequested.weight: 0
          nodeaffinity.weight: 1
          podaffinity.weight: 1
          balancedresource.weight: 1
          tainttoleration.weight: 1
          imagelocality.weight: 1
          volumebinding.weight: 1
          podtopologyspread.weight: 2

    cce-gpu-topology-predicate
    Function: GPU topology scheduling preselection algorithm.

    cce-gpu-topology-priority
    Function: GPU topology scheduling priority algorithm.

    cce-gpu
    Function: Works with the gpu add-on to allocate GPU resources, including decimal GPU configurations.

    numaaware
    Function: NUMA topology scheduling.
    Description: weight: weight of the numa-aware add-on. A configuration sketch follows this table.

    networkresource
    Function: Preselects and filters nodes based on ENI requirements. The parameters are passed by CCE and do not need to be configured manually.
    Description: NetworkType: network type (eni or vpc-router).

    nodelocalvolume
    Function: Filters out nodes that do not meet local volume requirements.

    nodeemptydirvolume
    Function: Filters out nodes that do not meet emptyDir requirements.

    nodeCSIscheduling
    Function: Filters out nodes with abnormal everest components.
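
    For example, the numaaware add-on can be enabled in the scheduler configuration as follows (a sketch; the weight of 10 is an assumed example value):

    - plugins:
      - name: numa-aware   # enabling this add-on also enables resource_exporter
        arguments:
          weight: 10       # assumed example weight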

  4. Click Install.
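
After the installation is complete, you can check that the Volcano pods are running (a quick check, assuming kubectl access to the cluster):

    # kubectl get pods -n kube-system | grep volcano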

Modifying the volcano-scheduler Configurations Using the Console

Volcano allows you to configure the scheduler during installation, upgrade, and editing. The configuration will be synchronized to volcano-scheduler-configmap.

This section describes how to configure volcano-scheduler.

NOTE:

Only Volcano v1.7.1 and later support this function. On the new add-on page, options such as plugins.eas_service and resource_exporter_enable are replaced by default_scheduler_conf.

Log in to the CCE console and click the cluster name to access the cluster console. In the navigation pane, choose Add-ons. On the right of the displayed page, locate Volcano and click Install or Upgrade. In the Parameters area, configure the volcano-scheduler parameters.

  • Using resource_exporter:
    {
        "ca_cert": "",
        "default_scheduler_conf": {
            "actions": "allocate, backfill",
            "tiers": [
                {
                    "plugins": [
                        {
                            "name": "priority"
                        },
                        {
                            "name": "gang"
                        },
                        {
                            "name": "conformance"
                        }
                    ]
                },
                {
                    "plugins": [
                        {
                            "name": "drf"
                        },
                        {
                            "name": "predicates"
                        },
                        {
                            "name": "nodeorder"
                        }
                    ]
                },
                {
                    "plugins": [
                        {
                            "name": "cce-gpu-topology-predicate"
                        },
                        {
                            "name": "cce-gpu-topology-priority"
                        },
                        {
                            "name": "cce-gpu"
                        },
                        {
                            "name": "numa-aware" # Adding this plugin also enables resource_exporter.
                        }
                    ]
                },
                {
                    "plugins": [
                        {
                            "name": "nodelocalvolume"
                        },
                        {
                            "name": "nodeemptydirvolume"
                        },
                        {
                            "name": "nodeCSIscheduling"
                        },
                        {
                            "name": "networkresource"
                        }
                    ]
                }
            ]
        },
        "server_cert": "",
        "server_key": ""
    }

    After the parameters are configured, the numa-aware add-on and resource_exporter can be used at the same time.
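
    A workload can then request a NUMA topology policy through a pod annotation. The following is a sketch based on the upstream Volcano convention; verify the annotation key and values against your add-on version:

    apiVersion: v1
    kind: Pod
    metadata:
      name: numa-demo                                       # hypothetical name
      annotations:
        volcano.sh/numa-topology-policy: single-numa-node   # assumed annotation key
    spec:
      schedulerName: volcano
      containers:
        - name: app
          image: nginx:latest                               # placeholder image
          resources:
            requests:
              cpu: 2
              memory: 2Gi
            limits:
              cpu: 2
              memory: 2Gi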

  • Using eas_service:
    {
        "ca_cert": "",
        "default_scheduler_conf": {
            "actions": "allocate, backfill",
            "tiers": [
                {
                    "plugins": [
                        {
                            "name": "priority"
                        },
                        {
                            "name": "gang"
                        },
                        {
                            "name": "conformance"
                        }
                    ]
                },
                {
                    "plugins": [
                        {
                            "name": "drf"
                        },
                        {
                            "name": "predicates"
                        },
                        {
                            "name": "nodeorder"
                        }
                    ]
                },
                {
                    "plugins": [
                        {
                            "name": "cce-gpu-topology-predicate"
                        },
                        {
                            "name": "cce-gpu-topology-priority"
                        },
                        {
                            "name": "cce-gpu"
                        },
                        {
                            "name": "eas",
                            "custom": {
                                "availability_zone_id": "",
                                "driver_id": "",
                                "endpoint": "",
                                "flavor_id": "",
                                "network_type": "",
                                "network_virtual_subnet_id": "",
                                "pool_id": "",
                                "project_id": "",
                                "secret_name": "eas-service-secret"
                            }
                        }
                    ]
                },
                {
                    "plugins": [
                        {
                            "name": "nodelocalvolume"
                        },
                        {
                            "name": "nodeemptydirvolume"
                        },
                        {
                            "name": "nodeCSIscheduling"
                        },
                        {
                            "name": "networkresource"
                        }
                    ]
                }
            ]
        },
        "server_cert": "",
        "server_key": ""
    }
  • Using ief:
    {
        "ca_cert": "",
        "default_scheduler_conf": {
            "actions": "allocate, backfill",
            "tiers": [
                {
                    "plugins": [
                        {
                            "name": "priority"
                        },
                        {
                            "name": "gang"
                        },
                        {
                            "name": "conformance"
                        }
                    ]
                },
                {
                    "plugins": [
                        {
                            "name": "drf"
                        },
                        {
                            "name": "predicates"
                        },
                        {
                            "name": "nodeorder"
                        }
                    ]
                },
                {
                    "plugins": [
                        {
                            "name": "cce-gpu-topology-predicate"
                        },
                        {
                            "name": "cce-gpu-topology-priority"
                        },
                        {
                            "name": "cce-gpu"
                        },
                        {
                            "name": "ief",
                            "enableBestNode": true
                        }
                    ]
                },
                {
                    "plugins": [
                        {
                            "name": "nodelocalvolume"
                        },
                        {
                            "name": "nodeemptydirvolume"
                        },
                        {
                            "name": "nodeCSIscheduling"
                        },
                        {
                            "name": "networkresource"
                        }
                    ]
                }
            ]
        },
        "server_cert": "",
        "server_key": ""
    }

Retaining the Original Configurations of volcano-scheduler-configmap

If you want to use the original configurations after the add-on is upgraded, perform the following steps:

  1. Check and back up the original volcano-scheduler-configmap configuration.

    Example:
    # kubectl edit cm volcano-scheduler-configmap -n kube-system
    apiVersion: v1
    data:
      default-scheduler.conf: |-
        actions: "enqueue, allocate, backfill"
        tiers:
        - plugins:
          - name: priority
          - name: gang
          - name: conformance
        - plugins:
          - name: drf
          - name: predicates
          - name: nodeorder
          - name: binpack
            arguments:
              binpack.cpu: 100
              binpack.weight: 10
              binpack.resources: nvidia.com/gpu
              binpack.resources.nvidia.com/gpu: 10000
        - plugins:
          - name: cce-gpu-topology-predicate
          - name: cce-gpu-topology-priority
          - name: cce-gpu
        - plugins:
          - name: nodelocalvolume
          - name: nodeemptydirvolume
          - name: nodeCSIscheduling
          - name: networkresource
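
    To keep a backup copy before the upgrade, you can also export the ConfigMap to a file (the backup file name is arbitrary):

    # kubectl get cm volcano-scheduler-configmap -n kube-system -o yaml > volcano-scheduler-configmap-backup.yaml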

  2. Enter the customized content in the Parameters area on the console.

    {
        "ca_cert": "",
        "default_scheduler_conf": {
            "actions": "enqueue, allocate, backfill",
            "tiers": [
                {
                    "plugins": [
                        {
                            "name": "priority"
                        },
                        {
                            "name": "gang"
                        },
                        {
                            "name": "conformance"
                        }
                    ]
                },
                {
                    "plugins": [
                        {
                            "name": "drf"
                        },
                        {
                            "name": "predicates"
                        },
                        {
                            "name": "nodeorder"
                        },
                        {
                            "name": "binpack",
                            "arguments": {
                                "binpack.cpu": 100,
                                "binpack.weight": 10,
                                "binpack.resources": "nvidia.com/gpu",
                                "binpack.resources.nvidia.com/gpu": 10000
                            }
                        }
                    ]
                },
                {
                    "plugins": [
                        {
                            "name": "cce-gpu-topology-predicate"
                        },
                        {
                            "name": "cce-gpu-topology-priority"
                        },
                        {
                            "name": "cce-gpu"
                        }
                    ]
                },
                {
                    "plugins": [
                        {
                            "name": "nodelocalvolume"
                        },
                        {
                            "name": "nodeemptydirvolume"
                        },
                        {
                            "name": "nodeCSIscheduling"
                        },
                        {
                            "name": "networkresource"
                        }
                    ]
                }
            ]
        },
        "server_cert": "",
        "server_key": ""
    }
    NOTE:

    After the parameters are configured, the original content of volcano-scheduler-configmap will be overwritten. During an add-on upgrade, check whether volcano-scheduler-configmap has been customized; if it has, enter the customized content again on the upgrade page.

Change History

NOTICE:

You are advised to upgrade Volcano to the latest version that matches the cluster.

Table 3 Cluster version mapping

Cluster Version | Add-on Version
v1.25 | 1.7.1 and 1.7.2
v1.23 | 1.7.1 and 1.7.2
v1.21 | 1.7.1 and 1.7.2
v1.19.16 | 1.3.7, 1.3.10, 1.4.5, 1.7.1, and 1.7.2
v1.19 | 1.3.7, 1.3.10, and 1.4.5
v1.17 (End of maintenance) | 1.3.7, 1.3.10, and 1.4.5
v1.15 (End of maintenance) | 1.3.7, 1.3.10, and 1.4.5

Table 4 CCE add-on versions

1.9.1 (supported clusters: /v1.19.16.*|v1.21.*|v1.23.*|v1.25.*/)

  • Fixed the issue that the networkresource add-on counted pipeline pods as occupying supplementary network interfaces (sub-ENIs).
  • Fixed the issue where the binpack add-on scored nodes with insufficient resources.
  • Fixed the processing of resources for pods in an unknown end state.
  • Optimized event output.
  • Supported HA deployment by default.

1.7.2 (supported clusters: /v1.19.16.*|v1.21.*|v1.23.*|v1.25.*/)

  • Supported Kubernetes 1.25.
  • Improved Volcano scheduling.

1.7.1 (supported clusters: /v1.19.16.*|v1.21.*|v1.23.*|v1.25.*/)

  • Supported Kubernetes 1.25.

1.6.5 (supported clusters: /v1.19.*|v1.21.*|v1.23.*/)

  • Served as the CCE default scheduler.
  • Supported unified scheduling in hybrid deployments.

1.4.5 (supported clusters: /v1.17.*|v1.19.*|v1.21.*/)

  • Changed the deployment mode of volcano-scheduler from StatefulSet to Deployment, and fixed the issue that pods could not be automatically migrated when a node was abnormal.

1.4.2 (supported clusters: /v1.15.*|v1.17.*|v1.19.*|v1.21.*/)

  • Resolved the issue that cross-GPU allocation failed.
  • Supported the updated EAS API.

1.3.3 (supported clusters: /v1.15.*|v1.17.*|v1.19.*|v1.21.*/)

  • Fixed the scheduler crash caused by GPU exceptions and the admission failure for privileged init containers.

1.3.1 (supported clusters: /v1.15.*|v1.17.*|v1.19.*/)

  • Upgraded the Volcano framework to the latest version.
  • Supported Kubernetes 1.19.
  • Added the numa-aware add-on.
  • Fixed the Deployment scaling issue in the multi-queue scenario.
  • Adjusted the algorithm add-ons enabled by default.

1.2.5 (supported clusters: /v1.15.*|v1.17.*|v1.19.*/)

  • Fixed the OutOfcpu issue in some scenarios.
  • Fixed the issue that pods could not be scheduled when capabilities were set for a queue.
  • Made the log time of the Volcano components consistent with the system time.
  • Fixed the issue of preemption between multiple queues.
  • Fixed the issue that the result of the ioaware add-on did not meet expectations in some extreme scenarios.
  • Supported hybrid clusters.

1.2.3 (supported clusters: /v1.15.*|v1.17.*|v1.19.*/)

  • Fixed the training task OOM issue caused by insufficient precision.
  • Fixed the GPU scheduling issue in CCE 1.15 and later versions. Rolling upgrade of CCE versions during task distribution is not supported.
  • Fixed the issue where the queue status was unknown in certain scenarios.
  • Fixed the panic that occurred when a PVC was mounted to a job in a specific scenario.
  • Fixed the issue that decimals could not be configured for GPU jobs.
  • Added the ioaware add-on.
  • Added the ring controller.
