
Clusters

Kuma Cluster Full Production & Pricing – Nov 1st

We are excited to announce the successful completion of the beta testing phase for the Kuma GPU cluster, which is scheduled to enter full production on November 1st, 2024. Your participation in the beta phase has been invaluable: in total, approximately 450,000 GPU hours of compute jobs were executed. This extensive testing allowed us to identify and resolve various hardware and software issues, ensuring that Kuma is ready for production.

Kuma Beta Opening

After a successful restricted beta with more than 80,000 jobs submitted, we are pleased to announce that Kuma, the new GPU-based cluster, is now available for testing! This marks an important milestone as we transition from the Izar cluster, which will soon be reassigned to educational purposes, to the much more powerful Kuma cluster. You can now connect to the login node at kuma.hpc.epfl.ch to begin testing your codes.
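To connect, you can use SSH from a terminal. A minimal sketch, assuming you log in with your usual cluster username (the placeholder below is hypothetical and should be replaced with your own account name):

    ssh <your-username>@kuma.hpc.epfl.ch

Once logged in, you should land on the Kuma login node, from which you can start compiling and testing your codes.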

Annual SCITAS maintenance

This announcement may affect your work, so we strongly recommend taking the time to read it thoroughly.

Our annual maintenance period is scheduled from February 5 to February 19, 2024. This maintenance is essential for enhancing our services and includes the following key upgrades:

Downfall vulnerability

The Downfall vulnerability, identified as CVE-2022-40982, enables a user to access and steal data from other users who share the same computer. It affects most Intel CPUs from the 6th generation (Skylake) through the 11th generation (Tiger Lake) inclusive. For instance, a malicious app obtained from an app store could use the Downfall attack to steal sensitive information such as passwords, encryption keys, and private data like banking details, personal emails, and messages.
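On Linux, the kernel exposes this vulnerability under the name gather_data_sampling, and you can check whether a mitigation is active on a given node by reading the corresponding sysfs entry; a quick check from a shell:

    cat /sys/devices/system/cpu/vulnerabilities/gather_data_sampling

Typical outputs are along the lines of "Mitigation: Microcode" or "Vulnerable", depending on whether the updated Intel microcode has been applied.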

Jed frontend had to be rebooted due to a power issue

This morning, around 6:30, the Jed frontend was shut down by an unexpected power issue on its direct power line. As a result, we had to reboot the frontend, which may have caused some connections to be dropped.

Neither the running jobs nor those waiting in the queue were affected.
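If you would like to confirm the state of your own jobs, and assuming the cluster runs the Slurm scheduler, you can list them from a terminal:

    squeue -u $USER

Jobs listed as RUNNING or PENDING were not impacted by the frontend reboot.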

We are currently investigating the cause of this problem. We apologize for the inconvenience it may have caused you.