Billing

Basic information

When you use our clusters with a paid account, all your jobs, with the exceptions listed below, are billed to the account concerned.

Exceptions:

Jobs that failed because of a node failure (state NODE_FAIL)
Jobs in a non-billed QOS and/or partition, for example the debug QOS on jed

Sausage

You can use sausage to display what you have already consumed.

# sausage -h
usage: Sausage [-h] [-v] [-V] [-u USERNAME] [-a] [-A ACCOUNT] [-s START]
               [-e END] [-x] [-t]

SCITAS Account Usage.

options:
  -h, --help            show this help message and exit
  -v, --verbose         Verbose (default: False)
  -V, --version         Version
  -u USERNAME, --username USERNAME
                        If not provided whoami is considered (default: None)
  -a, --all             all users from an account are printed (default: False)
  -A ACCOUNT, --account ACCOUNT
                        Prints account consumption per cluster (default: None)
  -s START, --start START
                        Start date - format YYYY-MM-DD (default: 2023-06-01)
  -e END, --end END     End date (included) - format YYYY-MM-DD (default:
                        2023-06-30)
  -x, --csv             Print result in csv style (default: False)
  -t, --total           Print result with total (default: False)

By default, sausage displays the information for your user for the current month.

jed # date; sausage
Tue Jun 20 09:29:13 CEST 2023
╭────────────────────────────────────────────────────────────────────╮
│                        USERNAME : ncoudene                         │
│             Global usage from 2023-06-01 to 2023-06-30             │
│╭──────────┬──────────┬──────┬───────┬───────┬─────────┬───────────╮│
││Account   │Cluster   │# jobs│GPU [h]│CPU [h]│eCO₂ [kg]│Costs [CHF]││
│├──────────┼──────────┼──────┼───────┼───────┼─────────┼───────────┤│
││scitas-ge │jed       │     4│    0.0│    0.0│      0.0│        0.0││
││scitas-ge │helvetios │     1│    0.0│    0.0│      0.0│        0.0││
│╰──────────┴──────────┴──────┴───────┴───────┴─────────┴───────────╯│
╰────────────────────────────────────────────────────────────────────╯
sausage v0.12.1.2

You can browse your past consumption by passing a start date, and add --total to print a summary:

jed # date; sausage --start 2023-01-01 --total
Tue Jun 20 09:31:49 CEST 2023
╭────────────────────────────────────────────────────────────────────╮
│                        USERNAME : ncoudene                         │
│             Global usage from 2023-01-01 to 2023-06-30             │
│╭──────────┬──────────┬──────┬───────┬───────┬─────────┬───────────╮│
││Account   │Cluster   │# jobs│GPU [h]│CPU [h]│eCO₂ [kg]│Costs [CHF]││
│├──────────┼──────────┼──────┼───────┼───────┼─────────┼───────────┤│
││scitas-ge │helvetios │   182│    0.0│    1.1│      0.0│        0.0││
││scitas-ge │jed       │   123│    0.0│   82.0│      0.1│        0.5││
││scitas-ge │izar      │     6│    0.0│    0.0│      0.0│        0.0││
│╰──────────┴──────────┴──────┴───────┴───────┴─────────┴───────────╯│
│ Walltime GPU             h    0.0                                  │
│ Walltime CPU             h   83.2                                  │
│ Number of jobs              311.0                                  │
│ Est. carbon footprint   kg    0.1                                  │
│ Costs                  CHF    0.5                                  │
╰────────────────────────────────────────────────────────────────────╯
sausage v0.12.1.2
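
To query a specific past month, both --start and --end are needed. The following is a minimal sketch that computes the bounds of the previous month and builds the matching command line. It only uses the --start, --end and --total options listed in the help output above; the helper names previous_month_range and sausage_command are hypothetical and not part of sausage.

```python
from datetime import date, timedelta


def previous_month_range(today: date) -> tuple[str, str]:
    """Return (start, end) of the month before `today` as YYYY-MM-DD strings."""
    first_of_this_month = today.replace(day=1)
    # The day before the first of this month is the last day of the previous month.
    last_of_previous = first_of_this_month - timedelta(days=1)
    first_of_previous = last_of_previous.replace(day=1)
    return first_of_previous.isoformat(), last_of_previous.isoformat()


def sausage_command(today: date) -> list[str]:
    """Build the argument list for a previous-month report with totals."""
    start, end = previous_month_range(today)
    # --end is inclusive according to the help text.
    return ["sausage", "--start", start, "--end", end, "--total"]


print(" ".join(sausage_command(date(2023, 6, 20))))
# prints: sausage --start 2023-05-01 --end 2023-05-31 --total
```

The resulting list can be passed to subprocess.run on a cluster front-end where sausage is installed.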

Running Jobs with capping enabled

By default, all jobs are submitted with capping. This means that when you submit a job, Slurm estimates whether it would make you exceed your capping limit. If so, the job is not submitted.

The check performed at submission has five steps:

Estimate the cost of the soon-to-be submitted job: job_estimation
Check whether your username and your account have a capping limit; if so, record them as username_capping and account_capping.
Calculate what your username (username_consumed) and your account (account_consumed) have already consumed on our clusters. This information comes from sausage.
Calculate what you are expected to consume based on the jobs that your username (username_queued) and your account (account_queued) have queued or running on all our clusters.
Check whether a limit would be exceeded. The job is rejected if username_capping - username_consumed - username_queued - job_estimation <= 0 or account_capping - account_consumed - account_queued - job_estimation <= 0.
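
The final comparison can be sketched in a few lines of Python. This is only an illustration of the formula above, not the code that runs on the clusters: the function names are hypothetical, and representing a missing limit as None is an assumption, since the way the real check encodes "no limit" is not described here. The figures are taken from the sample messages further down this page.

```python
from typing import Optional


def remaining_budget(capping: float, consumed: float,
                     queued: float, job_estimation: float) -> float:
    """Budget left once consumed, queued and the new job are subtracted (CHF)."""
    return capping - consumed - queued - job_estimation


def is_capped(capping: Optional[float], consumed: float,
              queued: float, job_estimation: float) -> bool:
    """True when the limit is reached, i.e. the remaining budget is <= 0."""
    if capping is None:
        # Assumption: None means that no capping limit is set.
        return False
    return remaining_budget(capping, consumed, queued, job_estimation) <= 0


def job_allowed(username: dict, account: dict, job_estimation: float) -> bool:
    """The job goes through only if neither the username nor the account is capped."""
    return not (is_capped(job_estimation=job_estimation, **username)
                or is_capped(job_estimation=job_estimation, **account))


account = {"capping": 10_000, "consumed": 715.45, "queued": 29.35}
username = {"capping": 420, "consumed": 421.85, "queued": 0.0}
print(job_allowed(username, account, job_estimation=0.0))
# prints: False
```

Here the account is well within its limit, but the username has already consumed more than its 420 CHF capping, so the job is refused.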

Default behaviour

By default, if any of these steps fails, your job is submitted anyway.

Support

If you need to change your capping limit (username_capping and/or account_capping), please ask your account administrator to request the change from EPFL support at 1234@epfl.ch.

When you submit a job, you will see messages like the following.

Example where no limit is reached:

$ srun --qos serial hostname
srun: info: [ESTIMATION] The estimated cost of this job is CHF 0.00
srun: info: [CAPPING]    All users of the account scitas-ge have consumed 715.41 CHF
srun: info: [CAPPING]    In addition, based on queued and running jobs all users of the account scitas-ge will consume up to 58.74 CHF
srun: info: [CAPPING]    Your username ncoudene have consumed 0.00 CHF
srun: info: [CAPPING]    In addition, based on queued and running jobs your username ncoudene will consume up to 0.00 CHF
srun: info: ╭──────────────────────────────┬─────────────┬─────────────┬─────────────╮
srun: info: │ [in CHF]                     │ Capping     │ Consumed    │ Queued      │
srun: info: ├──────────────────────────────┼─────────────┼─────────────┼─────────────┤
srun: info: │ account : scitas-ge          │ 10,000      │ 715.45      │ 58.75       │
srun: info: ├──────────────────────────────┼─────────────┼─────────────┼─────────────┤
srun: info: │ username : ncoudene          │ 0           │ 0.0         │ 0.0         │
srun: info: ╰──────────────────────────────┴─────────────┴─────────────┴─────────────╯
h042

Example where a limit is reached:

$ srun --qos serial  hostname
srun: error: [ESTIMATION] The estimated cost of this job is CHF 0.00
srun: error: [CAPPING]    All users of the account scitas-ge have consumed 715.41 CHF
srun: error: [CAPPING]    In addition, based on queued and running jobs all users of the account scitas-ge will consume up to 29.30 CHF
srun: error: [CAPPING]    Your username ylopes have consumed 421.83 CHF
srun: error: [CAPPING]    In addition, based on queued and running jobs your username ylopes will consume up to 0.00 CHF
srun: error: ╭──────────────────────────────┬─────────────┬─────────────┬─────────────╮
srun: error: │ [in CHF]                     │ Capping     │ Consumed    │ Queued      │
srun: error: ├──────────────────────────────┼─────────────┼─────────────┼─────────────┤
srun: error: │ account : scitas-ge          │ 10,000      │ 715.45      │ 29.35       │
srun: error: ├──────────────────────────────┼─────────────┼─────────────┼─────────────┤
srun: error: │ 🔥username : ylopes          │ 420         │ 421.85      │ 0.0         │
srun: error: ╰──────────────────────────────┴─────────────┴─────────────┴─────────────╯
srun: error: [CAPPING]    🛑 You reached a capping.
srun: error: Unable to allocate resources: Unspecified error

Memory limit

On our clusters, we limit the memory you can use per allocated CPU: MaxMemPerCPU

For more information, please read our documentation on Memory Allocation.

The limit is calculated as follows:

[NODE_MEMORY] / [NODE_CPU_COUNT]

For example, for a standard node on jed, this limit is 504000 / 72 = 7000 (in MB, the unit Slurm uses for MaxMemPerCPU).

This means that if a job requests more memory than its CPU count allows (CPUS * MaxMemPerCPU), Slurm increases the CPU count of the job to match.
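
As a rough illustration, the adjustment can be sketched as below. The function names are hypothetical, and rounding up to the next whole CPU is an assumption about how the count is derived. The sketch uses the figure of 7000 from the jed example; the value actually configured on a cluster may differ, so the counts reported in the submission warnings may not match this sketch exactly.

```python
def max_mem_per_cpu(node_memory: int, node_cpu_count: int) -> int:
    """Memory limit per allocated CPU: NODE_MEMORY / NODE_CPU_COUNT."""
    return node_memory // node_cpu_count


def cpus_for_memory(requested_cpus: int, requested_memory: int, limit: int) -> int:
    """CPU count after adjustment for a total memory request.

    Assumption: the job needs enough CPUs so that CPUS * limit covers the
    requested memory, rounded up to a whole CPU.
    """
    needed = (requested_memory + limit - 1) // limit  # integer ceiling division
    # The CPU count is only ever raised, never lowered.
    return max(requested_cpus, needed)


limit = max_mem_per_cpu(504000, 72)
print(limit)
# prints: 7000

# One CPU with 7200 of memory exceeds 1 * 7000, so a second CPU is needed.
print(cpus_for_memory(1, 7200, limit))
# prints: 2
```

A request that already fits, such as 4 CPUs with 14000 of memory in total, keeps its original CPU count.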

A warning is printed at job submission when you request more memory than the limit allows:

jed # srun --mem 256000 hostname
[...]
srun: info: [MEMORY]     ⚠️  WARNING: The amount of memory you asked for corresponds to 36 cpus.
srun: info: [MEMORY]     ⚠️  WARNING: For this reason, your job will be assigned 36 cpus instead of 1.0.
[...]
jst003

jed # srun --mem-per-cpu 7200 hostname
[...]
srun: info: [MEMORY]     ⚠️  WARNING: The amount of memory you asked for corresponds to 2 cpus.
srun: info: [MEMORY]     ⚠️  WARNING: For this reason, your job will be assigned 2 cpus instead of 1.0.
[...]
jst003

Capping

Keep in mind that your estimated job cost will be calculated based on the corrected cpu count.

Last update: June 22, 2023