
Slurm QOS and partitions#

Introduction#

In this section we present two Slurm configuration mechanisms that we use to control resource utilization on our clusters:

  • Quality of Service (QOS) affects the scheduling priority, the preemption, and the resource limits of submitted jobs.
  • Partitions act as job queues, imposing restrictions on submitted jobs, such as on job sizes or times.

Quick start#

When configuring a Slurm job, you should explicitly define the QOS and the partition where your job will be submitted. You can do this either:

  • by adding the corresponding directives (e.g. #SBATCH --qos=serial) to your Slurm script; or
  • by passing the options to the sbatch command directly (e.g. sbatch -q serial slurm_script.sh), as in the sketch below.
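
As a minimal sketch, a job script that sets both explicitly could look as follows (the serial QOS and standard partition are taken from the Jed examples further down; adjust them to your cluster, and replace the workload with your own):

#!/bin/bash
#SBATCH --qos=serial          # QOS to use (example value)
#SBATCH --partition=standard  # partition to submit to (example value)
#SBATCH --time=01:00:00       # requested wall time
#SBATCH --ntasks=1            # number of tasks

srun hostname                 # placeholder for your actual workload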

Some clusters define a default QOS or partition, which will be used if you do not explicitly specify the corresponding parameters. Nevertheless, we recommend always defining the QOS and the partition of your job, even when using the default values. This reduces ambiguity and helps during debugging.

You can find more details about our allocation policies in the documentation.

Slurm Quality of Service (QOS)#

View QOS information#

You can view information on the available QOS by running the following command on a cluster frontend:

sacctmgr show qos
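
If the full output is too verbose, you can restrict it to a few columns with the format option; the selection below is just one possible choice of fields:

sacctmgr show qos format=Name,Priority,MaxWall,MaxTRES,MaxJobsPU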

Kuma, the GPU cluster#

On Kuma, the newest GPU cluster, we have the following QOS structure:

| QOS    | Priority | Max Wall-time | Max resources per job     |
| ------ | -------- | ------------- | ------------------------- |
| normal | normal   | 3-00:00:00    | 8 nodes                   |
| long   | low      | 7-00:00:00    | 8 nodes                   |
| build  | high     | 04:00:00      | 1 node, 0 GPUs, 16 cores  |
| debug  | high     | 01:00:00      | 2 GPUs                    |

The default QOS is normal.
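
For example, a quick test under the debug QOS could be submitted as follows (the h100 partition is listed in the Kuma partition table further down; the GPU count and wall time are illustrative values):

sbatch -q debug -p h100 --gres=gpu:1 --time=00:30:00 my_script.sh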

Jed, the CPU cluster#

On Jed, the CPU cluster, we have the following QOS structure:

| QOS      | Priority | Max Wall-time | Min resources per job | Max resources per job | Max jobs per user |
| -------- | -------- | ------------- | --------------------- | --------------------- | ----------------- |
| serial   | low      | 7-00:00:00    | 1 core                | 1 node                | 10001             |
| parallel | high     | 15-00:00:00   | 1 node + 1 core       | 32 nodes              | 10001             |
| free     | lowest   | 6:00:00       | 1 core                | 1 node                | 150               |
| debug    | highest  | 2:00:00       | 1 core                | 18 cores              | 1                 |
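
As an illustration, a multi-node job under the parallel QOS on Jed could be submitted as follows (the node count and wall time are placeholder values):

sbatch -q parallel -p standard --nodes=2 --time=1-00:00:00 my_script.sh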

Special Jed QOS#

On Jed, there are two additional QOS, meant to provide access to nodes with more RAM than the standard nodes (which have 512 GB of RAM):

| QOS     | Node properties | Max Wall-time | Max nodes per job | Max jobs per user |
| ------- | --------------- | ------------- | ----------------- | ----------------- |
| bigmem  | 1 TB of RAM     | 15-00:00:00   | 21                | 1001              |
| hugemem | 2 TB of RAM     | 3-00:00:00    | 2                 | 4                 |

To access these nodes you also need to select the partition with the same name. For instance, to run a job on hugemem you would submit:

sbatch -p hugemem -q hugemem ...
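
As a fuller sketch, a bigmem job script could look like the following (the memory request of 900G is an illustrative value below the 1 TB of RAM of the bigmem nodes; adjust it and the workload to your needs):

#!/bin/bash
#SBATCH --partition=bigmem   # bigmem partition (1 TB nodes)
#SBATCH --qos=bigmem         # matching QOS
#SBATCH --mem=900G           # example memory request, adjust as needed
#SBATCH --time=1-00:00:00    # requested wall time

srun ./my_memory_hungry_app  # placeholder for your actual workload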

Izar, the academic GPU cluster#

On Izar, our academic GPU cluster, we have yet another QOS structure, with a total of four options:

| QOS    | Priority | Max Wall-time | Max resources per job                    |
| ------ | -------- | ------------- | ---------------------------------------- |
| normal | normal   | 3-00:00:00    | n/a                                      |
| long   | low      | 7-00:00:00    | n/a                                      |
| debug  | high     | 1:00:00       | 2 GPUs                                   |
| build  | high     | 8:00:00       | 1 node, 20 cores, 90 GB of RAM, 0 GPUs   |

The default QOS is normal, so if the standard values are fine for you, you don't need to add any QOS to your jobs.
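
If you do need a non-default QOS, for instance for a short debugging run with two GPUs, the submission could look as follows (the gpu partition is listed in the Izar partition table below; the wall time is illustrative):

sbatch -q debug -p gpu --gres=gpu:2 --time=00:30:00 my_script.sh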

Limits exceeded

If these limits are reached, the jobs will be held with the reason QOSResourceLimit.

Helvetios, the academic CPU cluster#

On Helvetios, the academic CPU cluster, we have the following QOS structure:

| QOS      | Priority | Max Wall-time | Max resources per job |
| -------- | -------- | ------------- | --------------------- |
| serial   | normal   | 3-00:00:00    | 1 node, 5000 cores    |
| parallel | high     | 15-00:00:00   | 32 nodes, 3060 cores  |
| free     | lowest   | 06:00:00      | 1 node                |
| debug    | high     | 02:00:00      | 8 cores               |

The default QOS is serial.
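
For example, a single-core job under the serial QOS could be submitted as follows (the standard partition is listed in the partition table further down; the wall time is an illustrative value):

sbatch -q serial -p standard --ntasks=1 --time=02:00:00 my_script.sh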

debug QOS#

With the debug QOS you are limited to one job at a time. The debug QOS is meant to be as general as possible: while the number of cores is fairly small, there is no limit on the number of nodes, so you can use up to 18 nodes (with one core per node). For instance, if you want to test whether your MPI code is compiled properly, you can run:

sbatch -q debug --nodes=2 --ntasks-per-node=9 --time=10:00 my_script.sh

Slurm partitions#

Slurm partitions are job queues, each defining different constraints, such as job size limits, job time limits, or the users permitted to use the partition.

Requesting a partition#

Generally, when you use Slurm to run jobs, you should request the partition where your job will be executed. There are two equivalent ways to do that:

  • add #SBATCH --partition=<PARTITION> to your Slurm script;
    • example: #SBATCH --partition=bigmem
  • pass the requested partition as a -p or --partition argument to the sbatch command.
    • example: sbatch --partition=bigmem slurm_script.sh

Default partitions#

Some partitions are defined as default partitions, for example the standard partition on the CPU clusters. When no --partition is given in the sbatch script, jobs are assigned to the default partition.

Always request a partition

When you want your job to be executed on a default partition, you are still encouraged to explicitly request it, for example by passing #SBATCH --partition=standard. This will make your configurations more robust and make debugging easier.

View partition information#

You can view basic information on the available partitions by running:

sinfo

To view detailed information on the available partitions, execute:

scontrol show partitions
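
If you prefer a more compact overview, sinfo also accepts a custom output format; the field selection below (partition, availability, time limit, node count) is just one possible choice:

sinfo --format="%P %a %l %D"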

List of Slurm partitions on CPU clusters#

The table below illustrates the typical partitions that most SCITAS users are expected to use. We present the following columns:

  • Partition: the name of the Slurm partition.
  • Attached QOS: the Quality of Service (QOS) attached to the partition, if any. If one is attached, the partition has the same limits as that QOS.
  • Clusters: the clusters where the partition is available.

| Partition | Attached QOS | Clusters        |
| --------- | ------------ | --------------- |
| standard  | n/a          | Jed, Helvetios  |
| bigmem    | bigmem       | Jed             |
| hugemem   | hugemem      | Jed             |

The standard partition is the default partition on all CPU clusters.

List of Slurm partitions on GPU clusters#

The tables below illustrate the partitions available on the GPU clusters.

Partitions on Izar, the academic GPU cluster#

The table below illustrates the partitions available on the Izar GPU cluster. We present the following columns:

  • Partition: the name of the Slurm partition.
  • Allowed QOS: the QOS that are allowed to run on the partition.

| Partition | Allowed QOS |
| --------- | ----------- |
| gpu       | all         |
| gpu-xl    | all         |
| test      | normal      |

gpu is the default partition on Izar. The gpu-xl partition is a subset of gpu.
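
For example, to explicitly target the gpu-xl partition with the default QOS, a submission could look as follows (the GPU count and wall time are illustrative values):

sbatch -p gpu-xl -q normal --gres=gpu:1 --time=01:00:00 my_script.sh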

Partitions on Kuma, the GPU cluster#

The table below illustrates the partitions available on the Kuma GPU cluster. We present the following columns:

  • Partition: the name of the Slurm partition.
  • Allowed QOS: the QOS that are allowed to run on the partition.

| Partition | Allowed QOS |
| --------- | ----------- |
| h100      | all         |
| l40s      | all         |

No default partition on Kuma

There is no default partition on Kuma, so you have to choose one! Otherwise, the submission will fail with the following message: sbatch: error: Batch job submission failed: No partition specified or system default partition
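
For example, explicitly selecting the l40s partition avoids this error (the GPU count and wall time are illustrative values):

sbatch -p l40s -q normal --gres=gpu:1 --time=01:00:00 my_script.sh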