
# Slurm QOS and partitions

## Introduction

With the arrival of the Jed cluster in early 2023, we adapted the partition and QOS structures in our clusters. Since then, we have also successfully implemented these changes in the Helvetios cluster.

In most cases you only need to specify the QOS you want; the matching partition is then selected automatically. You can do this in one of two ways:

  • by adding e.g. `#SBATCH --qos=serial` to your Slurm script;
  • by passing it to the sbatch command directly (e.g. `sbatch -q serial slurm_script.sh`).
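As an illustration, a minimal job script using the serial QOS might look like the sketch below. The script name, executable, and resource values are placeholders, not prescribed values:

```shell
#!/bin/bash
#SBATCH --qos=serial          # run in the serial QOS; the partition is set to match
#SBATCH --ntasks=1            # a single task...
#SBATCH --cpus-per-task=4     # ...with 4 cores, well within the one-node limit
#SBATCH --time=01:00:00       # 1 hour of wall-time

srun ./my_program             # my_program is a placeholder for your executable
```

You would then submit it with `sbatch my_script.sh`.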

## Standard QOS

Here's a quick look at the QOS that will be of interest to most SCITAS users:

| QOS        | Usage                                                        |
|------------|--------------------------------------------------------------|
| `serial`   | for jobs from 1 core, up to 1 node                           |
| `parallel` | for jobs of more than 1 full node                            |
| `free`     | low priority, for users without paid access to our clusters  |
| `debug`    | high priority, for testing codes or inputs                   |

There are no limits on the number of jobs a user can submit to serial, parallel, or free.

For debug you are limited to one job at a time. The debug QOS is meant to be as general as possible: while the total number of cores is fairly small (18), there is no limit on the number of nodes, so you can use up to 18 nodes with one core per node. For example, to test whether your MPI code is compiled properly you can do:

```console
$ sbatch -q debug --nodes=2 --ntasks-per-node=9 --time=10:00 my_script.sh
```

A more detailed description can be found in the following table:

| QOS        | Priority | Max wall-time | Min resources per job | Max resources per job | Max resources per user | Max jobs per user |
|------------|----------|---------------|-----------------------|-----------------------|------------------------|-------------------|
| `serial`   | low      | 3-00:00:00    | 1 core                | 1 node                | 7560 cores             | 10001             |
| `parallel` | high     | 15-00:00:00   | 1 node + 1 core       | 32 nodes              | 12672 cores            | 10001             |
| `free`     | lowest   | 6:00:00       | 1 core                | 1 node                | 1368 cores             | 150               |
| `debug`    | highest  | 2:00:00       | 1 core                | 18 cores              |                        | 1                 |
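If you prefer to query the limits actually configured on the cluster rather than rely on a table, Slurm's accounting tool can list them. A sketch, assuming your site exposes the accounting database and noting that the exact format field names can vary between Slurm versions:

```shell
# List the QOS defined on the cluster with their priorities and limits
sacctmgr show qos format=Name,Priority,MaxWall,MaxTRESPerUser%30,MaxJobsPerUser

# Show which QOS your own account is allowed to use
sacctmgr show assoc where user=$USER format=User,Account,QOS%40
```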

## Special QOS

On Jed, there are two more QOS. These are meant to provide access to nodes with more RAM than the standard nodes, which have 512 GB of RAM. These QOS are:

| QOS       | Node properties | Max wall-time | Max nodes per job | Max resources per user | Max jobs per user |
|-----------|-----------------|---------------|-------------------|------------------------|-------------------|
| `bigmem`  | 1 TB of RAM     | 15-00:00:00   | 21                | 1512 cores             | 1001              |
| `hugemem` | 2 TB of RAM     | 3-00:00:00    | 2                 | 144 cores              | 4                 |

To access these nodes you also need to choose the partition with the same name. So, for instance, to run a job on hugemem you'd need to:

```console
$ sbatch -p hugemem -q hugemem ...
```
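Putting the two options together, a job script for a large-memory node could be sketched as follows. The memory and time values are illustrative only, chosen to fit the hugemem limits above:

```shell
#!/bin/bash
#SBATCH --partition=hugemem   # the partition name must match...
#SBATCH --qos=hugemem         # ...the QOS name for these nodes
#SBATCH --nodes=1             # hugemem allows at most 2 nodes per job
#SBATCH --mem=1900G           # illustrative: most of a 2 TB node's RAM
#SBATCH --time=1-00:00:00     # within the 3-day limit

srun ./my_memory_hungry_program   # placeholder executable
```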

## Izar, the GPU cluster

On Izar, our GPU cluster, we still have a different QOS structure, with a total of five options:

| QOS        | Priority | Max wall-time | Max resources per job    |
|------------|----------|---------------|--------------------------|
| `gpu`      | normal   | 3-00:00:00    | 1 node                   |
| `week`     | low      | 7-00:00:00    | 2 nodes                  |
| `gpu_free` | lowest   | 12:00:00      | 1 node                   |
| `debug`    | lowest   | 1:00:00       | 1 node, 1 GPU, 20 cores  |
| `build`    | lowest   | 8:00:00       | 1 node                   |

The default QOS is `gpu`, so if the standard values are fine for you, you don't need to add any QOS to your jobs.
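For instance, a simple GPU job under the default `gpu` QOS could be sketched as below. The `--gres` syntax is standard Slurm; the executable and resource values are placeholders:

```shell
#!/bin/bash
#SBATCH --gres=gpu:1          # request one GPU (the default "gpu" QOS applies)
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=10    # illustrative core count
#SBATCH --time=12:00:00       # within the 3-day limit of the gpu QOS

srun ./my_gpu_program         # placeholder executable
```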

For `gpu_free` there are two significant limits:

  • any single user is limited to no more than 3 nodes;
  • all `gpu_free` jobs combined can use a maximum of 5 nodes.

**Limits exceeded**

If these limits are reached, the jobs will be held with the reason `QOSResourceLimit`.


Last update: January 29, 2024