Compiling LAMMPS with Kokkos+ML-PACE

GPU Cluster

This document targets Kuma, because it describes a version of LAMMPS compiled with GPU support. Apart from some Kuma-specific options, the same procedure should also work on Izar.

LAMMPS and Kokkos

LAMMPS has had support for Kokkos for some time now. Until recently you needed to install Kokkos first and then compile LAMMPS with it.

Nowadays LAMMPS integrates more tightly with Kokkos and, by default, it will install its own version. LAMMPS is fairly strict about which Kokkos versions it works with, so the easiest option is to let it do its thing.

For this tutorial we download the latest pre-release version of LAMMPS, the 19 Nov 2024 patch release, because Kokkos support is under active development. This version depends on Kokkos 4.4.1, in case you want to install Kokkos yourself.
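Fetching that release could look like the sketch below. It assumes the public LAMMPS GitHub repository and the usual `patch_<date>` tag naming; the checkout directory name and the `fetch_lammps` helper are made up for this example.

```shell
# Assumed tag name for the 19 Nov 2024 patch release.
LAMMPS_TAG="patch_19Nov2024"
# Hypothetical checkout directory; pick any name you like.
LAMMPS_SRC="lammps-${LAMMPS_TAG}"

fetch_lammps() {
  # A shallow clone of a single tag keeps the download small.
  git clone --depth 1 --branch "$LAMMPS_TAG" \
    https://github.com/lammps/lammps.git "$LAMMPS_SRC"
}
```

Running `fetch_lammps && cd "$LAMMPS_SRC"` leaves you at the top of the source tree, which is where the cmake command further down is meant to be run.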

Compiling LAMMPS with Kokkos

Since we're targeting Kuma, we're opting for the gcc stack to compile LAMMPS.

In the past you needed to specify the target hardware architecture when compiling LAMMPS. Nowadays LAMMPS integrates better with the Kokkos hardware discovery and offers an option (Kokkos_ARCH_NATIVE) that targets the underlying hardware without you having to specify it.

This means that if LAMMPS sees a GPU it will compile Kokkos specifically for that GPU, so the build has to run on a node that has the GPU you intend to use. On Kuma, the build QOS cannot access the GPUs (since most codes don't need a GPU to be compiled), so you'll have to use a job in either the debug or the normal QOS. One possibility for compiling LAMMPS would be:

Sinteract -p h100 -c 16 -m 40G -t 1:00:00 -g gpu:1 -q debug

which would give you access to 16 cores, 40 GB of RAM and one H100 GPU for up to one hour.

Once you're logged in to a node, you'll need to load the following set of modules to take advantage of our optimized tools:

module load gcc openmpi openblas cmake cuda python fftw/3.3.10-openmp
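Before configuring, it can help to confirm that the modules put the expected tools on PATH and that the job actually sees a GPU. The following is only a sketch: the list of tool names is an assumption based on the modules loaded above, and `check_tools` is a helper name made up for this example.

```shell
# Report every requested tool that is not found on PATH.
# Returns 1 if at least one tool is missing, 0 otherwise.
check_tools() {
  missing=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Assumed tool list; adjust it to your module set.
if check_tools cmake mpicxx nvcc nvidia-smi; then
  # nvidia-smi -L lists the GPUs visible to this job.
  nvidia-smi -L || echo "no GPU visible to this job" >&2
fi
```

If a tool is reported as missing, or no GPU is listed, Kokkos_ARCH_NATIVE will not be able to detect the GPU, so it is worth fixing the environment before running cmake.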

A variation of the following configure command, run from the top of the LAMMPS source tree, should get you started:

cmake ./cmake -B build \
  -DBUILD_SHARED_LIBS=ON \
  -DLAMMPS_EXCEPTIONS=OFF \
  -DBUILD_MPI=ON -DBUILD_OMP=ON \
  -DMPI_CXX_COMPILER=$MPICXX \
  -DCMAKE_BUILD_TYPE=Release \
  -DFFT=FFTW3 \
  -DPKG_COMPRESS=ON -DPKG_CORESHELL=ON -DPKG_DIPOLE=ON -DPKG_GRANULAR=ON \
  -DPKG_KSPACE=ON -DPKG_MANYBODY=ON -DPKG_MC=ON -DPKG_EXTRA-PAIR=ON \
  -DPKG_MOLECULE=ON -DPKG_PERI=ON -DPKG_PYTHON=ON -DPKG_QEQ=ON \
  -DPKG_REPLICA=ON -DPKG_RIGID=ON -DPKG_SHOCK=ON -DPKG_ML-SNAP=ON -DPKG_SRD=ON \
  -DPKG_ATC=ON -DPKG_DIFFRACTION=ON -DPKG_EXTRA-DUMP=ON -DPKG_ML-PACE=ON \
  -DPKG_VORONOI=ON -DPKG_MISC=ON \
  -DPKG_KOKKOS=ON -DKokkos_ENABLE_CUDA=ON -DKokkos_ENABLE_OPENMP=ON \
  -DKokkos_ARCH_NATIVE=ON -DFFT_KOKKOS=CUFFT \
  -DCMAKE_INSTALL_PREFIX=/path/to/your/lammps/install/dir

Which PKG_SOMETHING options you enable depends on your needs: you can drop some of the ones shown above or add others. Please note that the options Kokkos_ENABLE_CUDA=ON and FFT_KOKKOS=CUFFT are specific to a GPU cluster. If you're trying something along these lines on Jed, you should not use these two options.
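One way to keep a single build script for both cases is to generate the Kokkos-related flags from a small helper. This is a sketch, not a tested recipe for Jed: `kokkos_flags` is a hypothetical function name, and the CPU variant simply drops the two GPU-specific options mentioned above.

```shell
# Print the Kokkos-related cmake flags for a "gpu" or "cpu" build.
kokkos_flags() {
  common="-DPKG_KOKKOS=ON -DKokkos_ENABLE_OPENMP=ON -DKokkos_ARCH_NATIVE=ON"
  case "$1" in
    gpu) echo "$common -DKokkos_ENABLE_CUDA=ON -DFFT_KOKKOS=CUFFT" ;;
    cpu) echo "$common" ;;
    *)   echo "usage: kokkos_flags gpu|cpu" >&2; return 1 ;;
  esac
}
```

The output is meant to be spliced, unquoted so that it splits into separate arguments, into the configure line in place of the hard-coded Kokkos options, for example `cmake ./cmake -B build $(kokkos_flags cpu)` followed by the remaining options.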

Once this is done, you're almost ready to start. There is a minor issue with the choice of compiler flags: unless you make a small change, the build will fail with an error almost at the end. Removing a couple of compilation flags works around the issue. We'll do this with a sed command, after which you're ready to compile LAMMPS.

sed -i 's/ -Xcudafe --diag_suppress=unrecognized_pragma,--diag_suppress=128//' build/CMakeFiles/lmp.dir/flags.make
cmake --build build -j $SLURM_CPUS_PER_TASK
cmake --install build
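To see what the sed expression above does, it can be tried on a single sample line first. The flags line below is invented for illustration; the real flags.make is longer and its exact contents depend on your configuration.

```shell
# Made-up example of a flags line containing the problematic options.
sample='CXX_FLAGS = -O3 -Xcudafe --diag_suppress=unrecognized_pragma,--diag_suppress=128 -fopenmp'

# Same substitution as in the build step, applied to the sample only.
result=$(echo "$sample" | sed 's/ -Xcudafe --diag_suppress=unrecognized_pragma,--diag_suppress=128//')

echo "$result"
```

On the real file, `grep -c -e '-Xcudafe' build/CMakeFiles/lmp.dir/flags.make` run after the sed step counts the lines that still contain the flag; if it does not print 0, the flags in your flags.make probably differ from the pattern and the expression needs adjusting.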

At this point you have your own LAMMPS installed in /path/to/your/lammps/install/dir.
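A quick sanity check of the installed binary might look like the following sketch. It assumes the executable is called `lmp` and sits in the `bin` directory of the install prefix; the library directory (`lib64` or `lib`) varies between systems, and `has_package` is a helper name invented here.

```shell
# Use the same value you passed to CMAKE_INSTALL_PREFIX.
LAMMPS_PREFIX="/path/to/your/lammps/install/dir"
export PATH="$LAMMPS_PREFIX/bin:$PATH"
# The build uses shared libraries, so the loader must be able to find them.
export LD_LIBRARY_PATH="$LAMMPS_PREFIX/lib64:$LAMMPS_PREFIX/lib:${LD_LIBRARY_PATH:-}"

# lmp -h prints, among other things, the packages compiled into the binary.
has_package() {
  lmp -h | grep -qw -e "$1"
}
```

With the same modules loaded, `has_package KOKKOS && has_package ML-PACE` should succeed. A run on one GPU would then be started with something like `srun lmp -k on g 1 -sf kk -in in.lammps`, where `in.lammps` is a placeholder for your own input file.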

GPU choice and hardware capabilities

Please note that we compiled this on the H100 partition. These GPUs support double precision (FP64), unlike the L40s GPUs. The binary produced here won't work on the L40s partition. If you don't need the higher precision, the same instructions, run on an L40s node, can be used to compile a version for that hardware.

Happy computing!