Bin Packing
Bin packing schedules multiple jobs onto nodes in a compact manner to maximize resource utilization and reduce resource fragmentation. Slurm uses a priority-based scheduling algorithm by default, but you can configure and tune its parameters to achieve bin packing.
Bin Packing of the Slurm Scheduler
- Resource allocation policies
- Default policy: By default, Slurm allocates resources node by node using a First Fit policy. You can switch to Best Fit or Consumable Resource (CR) optimization.
- Enable compact allocation.
# Enable compact allocation in the slurm.conf file.
SelectType=select/cons_tres
SelectTypeParameters=CR_CPU_Memory
With this setting, the scheduler tracks CPUs and memory as consumable resources, so jobs can be packed onto shared nodes instead of each job occupying a whole node.
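To make the scheduler explicitly favor filling existing nodes before starting on new ones, the CR_Pack_Nodes option can be added to SelectTypeParameters. The following is a minimal slurm.conf sketch; check that the option is available in your Slurm release before relying on it.
# slurm.conf: pack job allocations onto as few nodes as possible.
SelectType=select/cons_tres
SelectTypeParameters=CR_CPU_Memory,CR_Pack_Nodes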
- Node sharing mode
- Node sharing: Multiple jobs can share node resources (with CPU-based allocation being the typical configuration).
# Configure node sharing in the slurm.conf file.
NodeName=node[1-100] Sockets=1 CoresPerSocket=16 ThreadsPerCore=2 RealMemory=64000
- Job parameters:
sbatch -p [partition-name] --ntasks=4 --cpus-per-task=4 --mem=4G job.sh # Request 4 tasks with 4 vCPUs each and 4 GB of memory per node.
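Whether jobs can share a node is also governed by the partition's OverSubscribe setting. Below is a minimal sketch, assuming a hypothetical partition named shared that contains the nodes defined above.
# slurm.conf: a partition whose nodes can be shared by multiple jobs.
# With select/cons_tres, jobs on the same node receive distinct CPUs;
# OverSubscribe=YES additionally lets jobs that request it share CPUs.
PartitionName=shared Nodes=node[1-100] Default=YES State=UP OverSubscribe=YES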
- Forced compact allocation
Use the --distribution parameter to pack tasks as tightly as possible within each node instead of spreading them across nodes.
sbatch -p [partition] --nodes=2 --ntasks-per-node=8 --distribution=block:block job.sh
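To verify how the tasks were actually placed, you can inspect the detailed allocation of a job. A quick check, assuming a hypothetical job ID of 12345:
# Show which CPUs on which nodes were assigned to the job.
scontrol -d show job 12345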
Configuration Examples
Example 1: Restricting bin packing by QoS and partition
- Creating a QoS to limit the resources of a single job
sacctmgr add qos packed_qos MaxCPUsPerUser=100 MaxMemPerUser=100G
- Specifying resources when submitting a job
sbatch --qos=packed_qos --ntasks=10 --cpus-per-task=2 --mem-per-cpu=2G job.sh
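Before a user can submit with --qos=packed_qos, the QoS typically has to be granted to that user's association in the accounting database. The following is a sketch, assuming a hypothetical user named alice:
# Grant the QoS to the user's association so --qos=packed_qos is accepted.
sacctmgr -i modify user name=alice set qos+=packed_qos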
Example 2: Using --exclusive and node sharing together
- Exclusive node (This avoids resource contention but may waste resources.)
sbatch --exclusive --nodes=1 job.sh
- Shared node (This improves resource utilization. In recent Slurm releases, the --share option has been renamed --oversubscribe.)
sbatch --oversubscribe --ntasks=4 job.sh # Allow other jobs to share the node's remaining resources.
Monitoring and Debugging
- Viewing node resource usage
sinfo -N -o "%N %C %e %m" # Show each node's CPUs (allocated/idle/other/total), free memory, and total memory.
- Analyzing job distribution
squeue -o "%i %P %u %T %C %m %N" # Show each job's ID, partition, user, state, CPU and memory requests, and allocated nodes.
- Detecting resource fragments
scontrol show nodes # Inspect each node's allocated and idle resources in detail.
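To flag fragmented nodes quickly, you can filter the per-node CPU counters from sinfo for nodes that are only partially allocated. A small sketch; the %C field prints allocated/idle/other/total CPUs:
# List nodes that are partially allocated, i.e. potential fragments.
sinfo -N -h -o "%N %C" | awk -F'[ /]' '$2 > 0 && $3 > 0 {print $1, "allocated=" $2, "idle=" $3}' | sort -u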
Precautions
- Resource oversubscription risks: Packing jobs too tightly may cause resource contention (for example, insufficient memory). Monitor for OOM events (see the sacct sketch after this list).
- Impact on job priority: High-priority jobs may disrupt compact placement. Balance fair access against resource utilization.
- Configuration complexity: Enabling sophisticated scheduling policies (such as cons_tres) may increase scheduling latency.
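One way to monitor memory pressure on packed nodes is to check completed jobs for the OUT_OF_MEMORY state in the accounting records. A minimal sketch, assuming accounting is enabled and using a hypothetical job ID:
# Show the state, exit code, and peak memory usage of a job's steps.
sacct -j 12345 -o JobID,JobName,State,ExitCode,MaxRSS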
By combining resource allocation policies, job parameters, and QoS restrictions, you can implement efficient bin packing and significantly improve cluster resource utilization.