Compiling LAMMPS with Kokkos+ML-PACE
GPU Cluster
This document targets Kuma, because it describes a version of LAMMPS compiled with GPU support. Apart from a few Kuma-specific options, the same procedure should also work on Izar.
LAMMPS and Kokkos
LAMMPS has had support for Kokkos for some time now. Until recently you needed to install Kokkos first and then compile LAMMPS with it.
Nowadays LAMMPS integrates more tightly with Kokkos and, by default, it will download and build its own copy. LAMMPS is fairly strict about which Kokkos versions it will work with, so the easiest option is to let it do its thing.
For this tutorial we are downloading the latest pre-release version of LAMMPS, the patch release of 19 Nov 2024, because Kokkos support is being actively developed. This version depends on Kokkos 4.4.1, in case you want to install Kokkos yourself.
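One way to fetch that release is a shallow clone of the corresponding git tag. This is a minimal sketch: the tag name patch_19Nov2024 is assumed to follow the usual LAMMPS patch_<date> naming, and the directory name is only a suggestion.

```shell
# Tag name assumed to follow the LAMMPS "patch_<date>" convention.
LAMMPS_TAG="patch_19Nov2024"
# Hypothetical local directory name for the source tree.
LAMMPS_SRC="lammps-${LAMMPS_TAG}"

# Shallow clone of just that tag. Wrapped in a function so that nothing
# is downloaded until you call it explicitly.
fetch_lammps() {
    git clone --depth 1 --branch "${LAMMPS_TAG}" \
        https://github.com/lammps/lammps.git "${LAMMPS_SRC}"
}
```

Calling `fetch_lammps && cd "${LAMMPS_SRC}"` leaves you at the top of the source tree, which is where a command of the form `cmake ./cmake -B build` expects to be run.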
Compiling LAMMPS with Kokkos
Since we're targeting Kuma, we're opting for the gcc stack to compile LAMMPS.
In the past you needed to specify the architecture of the hardware you were
targeting at the LAMMPS compilation time. Nowadays LAMMPS integrates better
with the Kokkos hardware discovery and it has an option (Kokkos_ARCH_NATIVE)
which will target the underlying hardware, without having to specify it.
This means that if LAMMPS sees a GPU it will compile Kokkos specifically for
that GPU. On Kuma, the build QOS cannot access the GPUs (since most codes
don't need a GPU to be compiled), so you'll have to opt for either a debug or
normal QOS job. So, a possibility for compiling LAMMPS would be:
which would give you access to 16 cores, 40 GB of RAM and one H100 GPU for up to one hour.
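As a sketch of such a request using only generic Slurm options: the QOS name debug comes from the text above, while the `--gres` syntax and the absence of an explicit partition are assumptions to check against the cluster's own documentation.

```shell
# Interactive job with 16 cores, 40 GB of RAM and one GPU for one hour.
# Assumptions: GPUs are requested with --gres=gpu:N and no --partition
# option is needed; add --partition=<name> if your site requires one.
# Wrapped in a function so that no job is submitted until you call it.
request_build_node() {
    srun --qos=debug --nodes=1 --ntasks=1 --cpus-per-task=16 \
         --mem=40G --gres=gpu:1 --time=01:00:00 --pty bash
}
```

Passing `--cpus-per-task` explicitly matters here, because Slurm only sets the SLURM_CPUS_PER_TASK variable used in the build step when that option is given.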
Once you're logged in to a node, you'll need to load the following set of modules to take advantage of our optimized tools:
A variation of this procedure should get you started:
cmake ./cmake -B build \
-DBUILD_SHARED_LIBS=ON \
-DLAMMPS_EXCEPTIONS=OFF \
-DBUILD_MPI=ON -DBUILD_OMP=ON \
-DMPI_CXX_COMPILER=$MPICXX \
-DCMAKE_BUILD_TYPE=Release \
-DFFT=FFTW3 \
-DPKG_COMPRESS=ON -DPKG_CORESHELL=ON -DPKG_DIPOLE=ON -DPKG_GRANULAR=ON \
-DPKG_KSPACE=ON -DPKG_MANYBODY=ON -DPKG_MC=ON -DPKG_EXTRA-PAIR=ON \
-DPKG_MOLECULE=ON -DPKG_PERI=ON -DPKG_PYTHON=ON -DPKG_QEQ=ON \
-DPKG_REPLICA=ON -DPKG_RIGID=ON -DPKG_SHOCK=ON -DPKG_ML-SNAP=ON -DPKG_SRD=ON \
-DPKG_ATC=ON -DPKG_DIFFRACTION=ON -DPKG_EXTRA-DUMP=ON -DPKG_ML-PACE=ON \
-DPKG_VORONOI=ON -DPKG_MISC=ON \
-DPKG_KOKKOS=ON -DKokkos_ENABLE_CUDA=ON -DKokkos_ENABLE_OPENMP=ON \
-DKokkos_ARCH_NATIVE=ON -DFFT_KOKKOS=CUFFT \
-DCMAKE_INSTALL_PREFIX=/path/to/your/lammps/install/dir
The choice of PKG_SOMETHING depends on your needs and you can opt not to
include the ones shown above, or to include others. Please note that the
options Kokkos_ENABLE_CUDA=ON and FFT_KOKKOS=CUFFT are specific to a GPU
cluster. If you're trying something along these lines on Jed you should not use
these two options.
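For a CPU-only machine, the Kokkos part of the configuration would then shrink to something like the following sketch. The variable name is hypothetical, and FFT_KOKKOS is simply left at whatever default LAMMPS picks.

```shell
# Kokkos-related options from the command above, minus the two
# GPU-specific ones (Kokkos_ENABLE_CUDA and FFT_KOKKOS=CUFFT).
# KOKKOS_CPU_FLAGS is a hypothetical helper variable, not a LAMMPS setting.
KOKKOS_CPU_FLAGS="-DPKG_KOKKOS=ON -DKokkos_ENABLE_OPENMP=ON -DKokkos_ARCH_NATIVE=ON"

echo "${KOKKOS_CPU_FLAGS}"
```

In the cmake invocation, the two lines carrying the Kokkos options would be replaced by an unquoted `${KOKKOS_CPU_FLAGS}` so that the shell splits it into separate arguments.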
Once this is done, you're almost ready to start. There is a minor issue with the compiler flags that the configuration step generates: unless you make a small change, the build fails with an error almost at the end. Removing a couple of compilation flags works around the issue. We'll do this with a sed command, after which you're ready to compile LAMMPS.
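To see what the substitution does before touching the real file, it can be tried on a made-up flags line. The sample below is purely illustrative; the actual contents of flags.make will differ.

```shell
# Hypothetical flags line, for illustration only.
sample='CXX_FLAGS = -O3 -Xcudafe --diag_suppress=unrecognized_pragma,--diag_suppress=128 -fopenmp'

# Same substitution as applied to flags.make, but applied to the sample
# string and without -i, so no file is modified.
cleaned=$(printf '%s\n' "$sample" \
    | sed 's/ -Xcudafe --diag_suppress=unrecognized_pragma,--diag_suppress=128//')

echo "$cleaned"
```

On the real file, running `sed -n 's/…//p'` with the same pattern and without `-i` prints only the lines that would change, which is a cheap way to confirm that the pattern matches.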
$ sed -i 's/ -Xcudafe --diag_suppress=unrecognized_pragma,--diag_suppress=128//' build/CMakeFiles/lmp.dir/flags.make
$ cmake --build build -j $SLURM_CPUS_PER_TASK
$ cmake --install build
At this point you have your own LAMMPS installed in /path/to/your/lammps/install/dir.
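A quick sanity check is to ask the new binary which packages it contains. This sketch assumes the executable was installed as bin/lmp under the install prefix; because the build uses shared libraries, you may also need the prefix's library directory on LD_LIBRARY_PATH.

```shell
# Same placeholder prefix as in the cmake command; adjust to your path.
LMP_PREFIX="/path/to/your/lammps/install/dir"
LMP="${LMP_PREFIX}/bin/lmp"

if [ -x "${LMP}" ]; then
    # "lmp -h" prints the help text, which includes the installed packages.
    "${LMP}" -h | grep -E 'KOKKOS|ML-PACE'
else
    echo "lmp not found at ${LMP}; adjust LMP_PREFIX" >&2
fi
```

To actually use Kokkos on a GPU, LAMMPS is typically launched with something like `lmp -k on g 1 -sf kk -in <input file>`, where `-k on g 1` enables Kokkos with one GPU and `-sf kk` selects the Kokkos variants of the styles that have one.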
GPU choice and hardware capabilities
Please note that we compiled this on the H100 partition. Because Kokkos_ARCH_NATIVE targets the GPU that is present at build time, the binary produced here won't work on the L40s partition. The H100 GPUs also offer fast double precision (FP64), whereas the L40s GPUs have only very limited FP64 throughput. If you don't need the higher precision, these instructions can be used to compile a separate version for that hardware.
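If you would rather name the target explicitly than rely on Kokkos_ARCH_NATIVE, the GPU's compute capability can be mapped to a Kokkos architecture option. The helper below is a hypothetical sketch covering only the two GPU types mentioned here, and it assumes that H100 reports compute capability 9.0 and L40s reports 8.9.

```shell
# Map a CUDA compute capability to a Kokkos architecture option.
# kokkos_arch_flag is a hypothetical helper, not part of LAMMPS or Kokkos.
kokkos_arch_flag() {
    case "$1" in
        9.0) echo "-DKokkos_ARCH_HOPPER90=ON" ;;  # H100
        8.9) echo "-DKokkos_ARCH_ADA89=ON" ;;     # L40s
        *)   echo "-DKokkos_ARCH_NATIVE=ON" ;;    # fall back to autodetection
    esac
}

# On a GPU node, show what the driver reports (needs a driver recent
# enough to support the compute_cap query field).
if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi --query-gpu=name,compute_cap --format=csv,noheader
else
    echo "nvidia-smi not available on this node" >&2
fi
```

For example, `kokkos_arch_flag 8.9` prints the option to substitute for `-DKokkos_ARCH_NATIVE=ON` when building for the L40s partition.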
Happy computing!