Using a singularity container on a slurm managed cluster


Brett Chapman

Sep 8, 2020, 11:59:30 PM
to singularity

Hi, I'm using some Singularity containers that I built on my cluster, which run tools that directly interact with slurm. Specifically I'm using get_homologues-est (https://github.com/eead-csic-compbio/get_homologues), which needs access to slurm commands like sacct, sinfo etc. from the host (or access from within the container), and REPET (https://hub.docker.com/r/urgi/docker_vre_aio), which has its own installation of slurm and its own slurm.conf file (which I could replace with my system's slurm.conf, or simply modify, as they're different versions of slurm). At the moment I'm running the tools on single nodes, which is just too slow for the large jobs I'm running.

I've read that Singularity works with HPC systems. I usually run singularity exec from a *.sif container with --bind to mount the files I need for the job.

For example I usually run my tools like so:

srun -n 1 singularity exec --bind ${PWD}:${PWD} ${GET_HOMOLOGUES_IMAGE} get_homologues-est.pl -d cds -D -n ${SLURM_NTASKS_PER_NODE} -o &> log.cds.pfam

If I needed to get the singularity containers to see the host slurm installation, what steps would I need to take? Would I need to bind particular directories? (e.g. bind the /etc/slurm-llnl/, /var/run/ directories). Would the host slurm version need to be the exact same in the container? Has anyone set up something similar for their slurm cluster?
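For what it's worth, a first attempt at those binds might look something like the sketch below. The paths here are guesses for an Ubuntu host using the slurm-llnl packaging and would need checking against the actual installation; binding individual binaries like this also assumes their shared-library dependencies resolve inside the container:

```shell
# Hypothetical invocation: bind the host's slurm config, the munge
# socket directory, and a couple of slurm client binaries into the
# container, then run sinfo from inside it as a smoke test.
# All paths are assumptions and must match your real installation.
singularity exec \
    --bind /etc/slurm-llnl:/etc/slurm-llnl \
    --bind /var/run/munge:/var/run/munge \
    --bind /usr/bin/sinfo:/usr/bin/sinfo \
    --bind /usr/bin/sacct:/usr/bin/sacct \
    ${GET_HOMOLOGUES_IMAGE} sinfo
```

If sinfo runs cleanly inside the container, the tool's own calls to sacct/sinfo should find the same binaries.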

My version of Singularity is 3.6

My slurm version on the host system is 19.05.5

My host system OS is Ubuntu 20.04

I git pull Singularity and install it from source.

Thanks for your help.

Regards

Brett

David Godlove

Sep 9, 2020, 10:12:58 AM
to singularity
So the way that I've made this work in the past is to perform the following actions:

1) In the %post section, add whatever slurm user is in use on the cluster.  Something like this:  adduser --disabled-password --gecos "" slurm
2) In the %environment section, set PATH and LD_LIBRARY_PATH to the locations of all the slurm binaries and slurm/munge libraries.  This will be their location inside the container after you bind mount those things at runtime.
3) At runtime, bind mount the location of all of the slurm binaries, slurm/munge libraries, and state directories (e.g. /var/run/munge, /usr/local/logs) that slurm and munge will need to do their thang.
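Putting the build-time steps above together, a minimal definition-file sketch might look like this. The slurm user name and the /opt/slurm and library paths are illustrative assumptions; substitute whatever your cluster actually uses:

```
Bootstrap: docker
From: ubuntu:20.04

%post
    # Step 1: create the same slurm user that exists on the host cluster
    adduser --disabled-password --gecos "" slurm

%environment
    # Step 2: point PATH and LD_LIBRARY_PATH at the locations where the
    # host's slurm binaries and slurm/munge libraries will appear once
    # they are bind-mounted in at runtime (paths are assumptions)
    export PATH=/opt/slurm/bin:$PATH
    export LD_LIBRARY_PATH=/opt/slurm/lib:/usr/lib/x86_64-linux-gnu:$LD_LIBRARY_PATH
```

Step 3 then happens on the singularity exec command line with the corresponding --bind flags.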

There are a bunch of bits and pieces to put together, and the location of many of them will depend on your setup.  So be prepared to try, get an error, google, rinse and repeat for a while until you get an initial def file put together.


Erik Sjölund

Oct 1, 2020, 1:33:58 PM
to singu...@lbl.gov
This is very helpful!
Do you know if it is important to have the same Linux distribution (and version) both in the container and on the host to get this working?
Best regards,
Erik Sjölund

David Godlove

Oct 1, 2020, 2:03:15 PM
to singularity
So, you are bind-mounting compiled binaries and libraries from the host system into the container, which is always going to be fraught with caveats.  I think the biggest issue will probably be the glibc version on the host versus in the container.  I've gotten this to work with an Ubuntu Bionic container running on a RHEL 7 host.  YMMV.
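One quick sanity check along these lines is to compare glibc versions before committing to the approach; the image name below is a placeholder:

```shell
# Print the host's glibc version, then run the same command through the
# container (e.g. `singularity exec your-image.sif ldd --version`) and
# compare the two outputs. Large mismatches suggest the bind-mounted
# slurm binaries may not run inside the container.
ldd --version | head -n 1
```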

Rémy Dernat

Oct 2, 2020, 4:25:55 AM
to singu...@lbl.gov
Hi,

Sorry (...), but seriously, I do not understand why people still design containers like this?!  Containers are not VMs!  I recently had to convert a Docker container bundling slurm + MySQL + a bioinformatics tool (!!) into a Singularity container...  Container authors should not assume anything about the computing environment, nor force the user to choose a particular job scheduler (let alone run a database engine as a job workload on an HPC/HTC cluster, when the job should run extensively in parallel).

Rémy.

David Godlove

Oct 2, 2020, 1:05:53 PM
to singularity
This is a good point, Rémy, and I agree that it's not a great way to use a container.  I would go a step further and say that your SOFTWARE should not expect to use some particular batch scheduling system.  But lots of folks write software that expects a batch scheduler in an effort to make things "simple" for the end user.  In my experience these attempts almost always make things harder for the user in the long run, and they also necessitate yucky tricks like this when you try to containerize the software.
