Updated on 2025-05-21 GMT+08:00

Capacity Reservation

A resource reservation sets aside compute nodes for a specified period so that the designated users can use them exclusively or share them. The following describes how to create, check, and manage reservations with the Slurm scontrol command.

Creating Resource Reservations

Run the scontrol create reservation command (as the administrator):

scontrol create reservation ReservationName=<name> StartTime=<start-time> Duration=<duration> {Nodes=<node-list> | NodeCnt=<node-count>} Partition=<partition-name> Users=<user-list> Flags=<options>

Parameter description:

  • ReservationName: specifies the unique name of the reservation record.
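  • StartTime: defines when the reservation begins. The format is YYYY-MM-DD[THH:MM[:SS]] or now+<count><unit>, where the unit is seconds, minutes, hours, days, or weeks (for example, now+1hours).
  • Duration: specifies how long the reservation lasts. The format is [DD-]HH:MM:SS (for example, 2-12:00:00 indicates 2 days and 12 hours, and 1:00:00 indicates 1 hour).
  • Nodes: lists the nodes to reserve by name (for example, node01, or ALL for every node). To reserve a number of nodes instead of named nodes, specify NodeCnt=<count> in place of Nodes.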
  • Partition: determines which partition the reservation applies to.
  • Users: assigns the reservation to specific users. Separate multiple users with commas (for example, user1,user2). The creator can access the reservation by default.
  • Flags: controls how resources are allocated.
    • MAINT: marks the reservation for maintenance purposes. Only administrators can access it.
    • OVERLAP: allows the reservation to overlap with other reservations.
    • IGNORE_JOBS: allows the reservation to be created without considering currently running jobs and to take effect immediately.
    • SPEC_NODES: indicates that the reservation is for specific named nodes rather than nodes selected by Slurm. Slurm reports this flag in its output when the reservation is created with Nodes.
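
The Duration format above can be checked with a little arithmetic. The following is a minimal sketch, not part of Slurm: duration_to_seconds is a hypothetical helper name, and it handles only the [DD-]HH:MM:SS form used in this section.

```shell
# Convert a duration written as [DD-]HH:MM:SS to seconds.
# Hypothetical helper for illustration; it is not a Slurm command.
duration_to_seconds() {
    printf '%s\n' "$1" | awk '
        # Accept only [DD-]HH:MM:SS.
        /^([0-9]+-)?[0-9]+:[0-9]+:[0-9]+$/ {
            days = 0
            rest = $0
            if (index(rest, "-") > 0) {
                split(rest, parts, "-")
                days = parts[1]
                rest = parts[2]
            }
            split(rest, t, ":")
            print days * 86400 + t[1] * 3600 + t[2] * 60 + t[3]
            ok = 1
        }
        # Fail with a non-zero status if the input did not match.
        END { exit !ok }
    '
}

duration_to_seconds 2-12:00:00   # prints 216000 (2 days and 12 hours)
duration_to_seconds 1:00:00      # prints 3600 (1 hour)
```

A value that does not match the expected form prints nothing and returns a non-zero status, so the helper can also serve as a quick format check before a reservation request is sent to the administrator.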

Create a capacity reservation policy in the backend of the cockpit (to reserve a compute node for user test1).

Checking Resource Reservation

  • Check all reservations:
    scontrol show reservation
  • Check a specific reservation:
    scontrol show reservation <name>
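
When the check is scripted, a single field is often all that is needed. The sketch below assumes the record is printed on one line as space-separated KEY=VALUE pairs, which is what the scontrol -o (one line per record) option produces. reservation_field is a hypothetical helper name, and the sample record is hand-written and abbreviated, not real command output.

```shell
# Print the value of one KEY=VALUE field from reservation records
# read on standard input. Hypothetical helper; not part of Slurm.
reservation_field() {
    tr ' ' '\n' | sed -n "s/^$1=//p"
}

# Hand-written, abbreviated record for illustration only.
# Real scontrol output contains more fields.
sample='ReservationName=test_job Nodes=node01 Users=alice'

printf '%s\n' "$sample" | reservation_field Users   # prints alice

# On a cluster with Slurm installed, the same helper could be used as:
#   scontrol -o show reservation test_job | reservation_field Nodes
```

Values that themselves contain spaces would be cut short by this approach, so it suits simple fields such as Nodes, Users, or State.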

Check the capacity reservation policy in the backend of the cockpit.

Managing Reservations

  • Update reservation parameters (for example, to change the duration).
    scontrol update ReservationName=<name> Duration=<new-duration>

  • Delete a reservation.
    scontrol delete ReservationName=<name>

Preventing Non-specified Users from Using Reserved Resources

You can submit a job in either of the following ways (the cockpit is recommended):

  • Submitting a job on the cockpit UI

    If the user who submits the job is not one of the users specified in the reservation, the job is not executed.

  • Submitting a job by running a command

    Use --reservation in the job submission command to specify the reservation name.

    # Submit a batch job that runs in the reservation.
    sbatch --reservation=<name> --partition=<partition> job.sh
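
The reservation can also be requested from inside the batch script, so that the options do not have to be repeated on every submission. The following sketch writes such a script. The names team_project and workq are taken from Scenario 2 in this section, while the job name, the 30-minute time limit, and the hostname workload are assumptions chosen for illustration.

```shell
# Write a batch script that requests the reservation through #SBATCH lines.
cat > job.sh <<'EOF'
#!/bin/bash
#SBATCH --job-name=reserved-demo
#SBATCH --reservation=team_project
#SBATCH --partition=workq
#SBATCH --time=00:30:00

# The real workload goes here; hostname is only a stand-in.
srun hostname
EOF

# On a cluster with Slurm installed, submit it with:
#   sbatch job.sh
```

Options given on the sbatch command line take precedence over the #SBATCH lines, so the same script can still be pointed at a different reservation when needed.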

Use Cases

Scenario 1: Reserve a node for testing.

scontrol create reservation \
     ReservationName=test_job \
     StartTime=now+30minutes \
     Duration=1:00:00 \
     Nodes=node01 \
     Users=alice
  • Starting 30 minutes from now, user alice can use node01 exclusively for 1 hour.

Scenario 2: Reserve nodes for multi-user collaborative tasks.

scontrol create reservation \
     ReservationName=team_project \
     StartTime=2024-01-01T09:00:00 \
     Duration=24:00:00 \
     NodeCnt=4 \
     Partition=workq \
     Users=user1,user2,user3
  • user1, user2, and user3 share four nodes in the workq partition for 24 hours, starting at 09:00 on 2024-01-01.

Scenario 3: Reserve nodes for maintenance purposes.

scontrol create reservation \
     ReservationName=maintenance \
     StartTime=now \
     Duration=8:00:00 \
     Nodes=ALL \
     Flags=MAINT,IGNORE_JOBS
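  • The administrator reserves all nodes for 8 hours of maintenance, starting immediately. IGNORE_JOBS allows the reservation to be created while jobs are still running. Those jobs are not terminated automatically; cancel them manually if the maintenance requires it.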

Precautions

  • Permissions: Only administrators can create reservations. Regular users must ask an administrator to create one for them.
  • Time conflicts: A node cannot be included in two reservations whose time windows overlap unless OVERLAP is specified.
  • Job time limits: Jobs must start within the reserved time window and cannot run beyond the reserved duration.
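
The job time limit rule comes down to one comparison: the requested time limit must not exceed the time left in the reservation. The sketch below illustrates that comparison only. job_fits_reservation is a hypothetical helper name, and all three arguments are plain second counts supplied by the caller.

```shell
# Succeed if a job with the given time limit can finish before the
# reservation ends. Hypothetical helper; not part of Slurm.
#   $1 = job time limit in seconds
#   $2 = reservation end time, in seconds since the epoch
#   $3 = current time, in seconds since the epoch (defaults to now)
job_fits_reservation() (
    now=${3:-$(date +%s)}
    remaining=$(( $2 - now ))
    [ "$1" -le "$remaining" ]
)

# Example with fixed numbers: the reservation ends 3600 seconds from "now".
if job_fits_reservation 1800 4600 1000; then
    echo "a 30-minute job fits"
fi
if ! job_fits_reservation 7200 4600 1000; then
    echo "a 2-hour job does not fit"
fi
```

The reservation end time is shown in the EndTime field of the scontrol show reservation output and would need to be converted to seconds before it is passed to this helper.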

Resource reservation helps ensure that critical tasks get the resources they need and run on time. For complex requirements, contact the cluster administrator to configure the reservation.