The cluster that we have access to uses an older version of glibc (2.28), because of which uspex-25 fails to run on it. To circumvent this problem, we built an Apptainer image based on Ubuntu 24.04 and Miniconda (following our tests on a local PC) and tested it on the cluster. We can run it directly on the login nodes, i.e. without a SLURM script. We execute it using the following script:
#!/bin/sh
while [ ! -f ./USPEX_IS_DONE ] ; do
    date >> log
    apptainer exec /path/to/uspex25_conda.sif /opt/uspex25
    sleep 10
done
But when we set "1 : whichCluster" in the input file so that USPEX submits jobs through SLURM, it fails to run, since our Apptainer image doesn't have SLURM installed in it. We did try to bind the SLURM binaries and libraries from the cluster into our image:
#!/bin/sh
while [ ! -f ./USPEX_IS_DONE ] ; do
    date >> log
    apptainer exec \
        --bind /usr/bin/sbatch:/usr/bin/sbatch \
        --bind /usr/bin/srun:/usr/bin/srun \
        --bind /lib64/slurm:/lib64/slurm \
        --bind /home/apps/SPACK/spack/opt/spack/linux-almalinux8-cascadelake/gcc-13.2.0/lz4-1.9.4-zpae24lwb6eqptlaeyzxzrtr6rfnuzvg/lib:/home/apps/SPACK/spack/opt/spack/linux-almalinux8-cascadelake/gcc-13.2.0/lz4-1.9.4-zpae24lwb6eqptlaeyzxzrtr6rfnuzvg/lib \
        --bind /etc/slurm:/etc/slurm \
        /path/to/uspex25_conda.sif /opt/uspex25
    sleep 10
done
but it still fails.
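To narrow down why the bound binaries fail, one option is to ask the dynamic linker inside the container which shared libraries sbatch and srun still cannot find. The following is only a sketch: the helper name missing_libs and the IMAGE variable are made up for illustration, and the bind list is a subset of the one above. SLURM client commands usually also need access to the cluster's munge socket and a slurm user entry inside the container, but whether that is the cause here is an assumption.

```shell
#!/bin/sh
# Hypothetical helper: read ldd output on stdin and print the names of
# the shared libraries that the dynamic linker reports as "not found".
missing_libs() {
    awk '/not found/ { print $1 }'
}

# Assumed image location; adjust to the real path.
IMAGE=${IMAGE:-/path/to/uspex25_conda.sif}

# Only run the check where apptainer and the image are actually available.
if command -v apptainer >/dev/null 2>&1 && [ -f "$IMAGE" ]; then
    for bin in /usr/bin/sbatch /usr/bin/srun; do
        echo "== unresolved libraries for $bin =="
        apptainer exec \
            --bind "$bin:$bin" \
            --bind /lib64/slurm:/lib64/slurm \
            --bind /etc/slurm:/etc/slurm \
            "$IMAGE" ldd "$bin" | missing_libs
    done
fi
```

Every library name printed this way would need its own bind (or an LD_LIBRARY_PATH entry) before sbatch can start inside the image.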
If we submit the uspex25 calculation using a SLURM script instead, it runs, but then the problem is the walltime limit of the job.
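A possible workaround for the walltime limit, shown only as an untested sketch, is to wrap the same loop in a batch script that stops shortly before the limit and resubmits itself. The 24-hour limit, the 10-minute margin, the job name, the SCRIPT path, and the helper names time_left and run_loop are all assumptions for illustration. It also assumes that sbatch is usable on the compute node and that a single uspex25 invocation finishes well within the margin.

```shell
#!/bin/sh
#SBATCH --job-name=uspex25
#SBATCH --time=24:00:00

# Assumed values: LIMIT must match --time above (in seconds);
# MARGIN is how long before the limit we stop starting new iterations.
LIMIT=${LIMIT:-86400}
MARGIN=${MARGIN:-600}
# Hypothetical path of this script, used for resubmission.
SCRIPT=${SCRIPT:-/path/to/this_job_script.sh}
START=$(date +%s)

# time_left START NOW LIMIT MARGIN
# Succeeds while the elapsed time is still below LIMIT minus MARGIN.
time_left() {
    [ $(( $2 - $1 )) -lt $(( $3 - $4 )) ]
}

run_loop() {
    while [ ! -f ./USPEX_IS_DONE ] && time_left "$START" "$(date +%s)" "$LIMIT" "$MARGIN"; do
        date >> log
        apptainer exec /path/to/uspex25_conda.sif /opt/uspex25
        sleep 10
    done
    # If USPEX has not finished, queue a follow-up job to continue.
    [ -f ./USPEX_IS_DONE ] || sbatch "$SCRIPT"
}

# Only start the loop when running inside a SLURM job.
if [ -n "$SLURM_JOB_ID" ]; then
    run_loop
fi
```

The loop itself is unchanged; only the time check and the final sbatch call are new.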
If anyone has been able to use uspex25 on a cluster with the SLURM job scheduler, could you please help us?