Best practice with MPI and singularity

Steven Brandt

Oct 11, 2017, 11:39:04 AM
to singularity
From everything I've read, the way to use MPI and Singularity seems to be to have the same MPI inside and outside the container (where "the same" includes configuration options), and that MPI should probably be OpenMPI 2.0 or greater. The run command looks like this:

  mpirun -np 4 -hosts file singularity exec my.img myprog

Is that the best practice?

Or is there a way to use the same MPI that's inside the container while on the outside?

What I wish I could do is this:

  singularity exec my.img mpirun -np 4 -hosts file singularity exec my.img myprog

Then I could have one MPI to rule them all. Obviously, the above doesn't work due to permission issues. Thanks.


Martin Cuma

Oct 12, 2017, 12:33:39 PM
to singularity
Steven,

IMO the main issue is container portability (being able to run on any host) vs. the ability to run in parallel on multiple hosts.

If you run on just one host, you can package whatever MPI you want into the container and run the MPI program entirely inside the container on that host: singularity exec my.img mpirun -np 4 myprog.

When running in parallel on multiple hosts, the container needs to be launched with the host-based MPI, so there has to be some level of compatibility between the host and container MPI. They don't necessarily need to be exactly the same, but they do need to be binary compatible.
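A minimal sketch of the two launch modes described above, using the placeholder names my.img and myprog from this thread. The helper functions are hypothetical and only print the command line instead of running it; host selection options are left out because they differ between MPI implementations.

```shell
# Single host: the container's own mpirun starts all ranks inside one container.
single_host_cmd() {
    local image="$1"
    local nprocs="$2"
    local prog="$3"
    printf 'singularity exec %s mpirun -np %s %s\n' "$image" "$nprocs" "$prog"
}

# Multiple hosts: the host's mpirun starts one container per rank, so the
# host MPI and the container MPI have to be binary compatible.
multi_host_cmd() {
    local image="$1"
    local nprocs="$2"
    local prog="$3"
    printf 'mpirun -np %s singularity exec %s %s\n' "$nprocs" "$image" "$prog"
}

# Dry run: print both command lines for comparison.
single_host_cmd my.img 4 myprog
multi_host_cmd my.img 4 myprog
```

The only difference is which mpirun sits on the outside: in the first form the host never sees MPI at all, in the second the host MPI has to talk to the MPI library linked into myprog.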

I have tested various recent MPICH derivatives (MPICH, MVAPICH2, Intel MPI), though not exhaustively. Since they share a common ABI (https://wiki.mpich.org/mpich/index.php/ABI_Compatibility_Initiative), they work with each other most of the time. The exception is when one MPI is built with an option that the other build lacks. An example is the stock Ubuntu MPICH, which uses BLCR checkpointing that Intel MPI does not package; see a container we built that revealed this at https://github.com/CHPC-UofU/Singularity-meep-mpi.

What we also did was to simply bring in our cluster's Lmod modules and then use the MPI binaries from the host in the container. If the MPI is built against an old enough glibc (e.g. Intel MPI), it works on many Linux distros. This is obviously not very portable, though.

My choice for a container that is as portable as possible would be a basic MPICH build inside the container (no IB, BLCR, etc.), with the container user having the choice to use an MPICH ABI compatible MPI such as MVAPICH2, IMPI, or Cray MPI. To bring in IB, one would have to LD_PRELOAD the libmpi.so from the host that has IB support built in, or prepend its directory to LD_LIBRARY_PATH.
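As a rough sketch of the LD_LIBRARY_PATH variant: the helper below builds the new search path with the host library directory in front. HOST_MPI_LIB and its default /opt/host-mpi/lib are hypothetical, and the directory would also have to be visible inside the container (for example through a bind mount) for this to have any effect.

```shell
# Prepend a directory to a colon-separated search path without leaving an
# empty entry (the loader treats an empty entry as the current directory).
prepend_path() {
    local dir="$1"
    local current="$2"
    if [ -n "$current" ]; then
        printf '%s:%s\n' "$dir" "$current"
    else
        printf '%s\n' "$dir"
    fi
}

# Hypothetical location of the host's IB-enabled libmpi.so; adjust per site.
HOST_MPI_LIB="${HOST_MPI_LIB:-/opt/host-mpi/lib}"

NEW_LD_PATH="$(prepend_path "$HOST_MPI_LIB" "${LD_LIBRARY_PATH:-}")"
echo "LD_LIBRARY_PATH would become: $NEW_LD_PATH"

# Inside the container one would then set, before starting the MPI program:
#   export LD_LIBRARY_PATH="$NEW_LD_PATH"
```

Because the host directory comes first, the dynamic loader picks up the host libmpi.so ahead of the basic MPICH build inside the container, which only works when the two are ABI compatible.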

OpenMPI has its own ABI and seems to be more aware of the container vs. host binary compatibility (https://github.com/open-mpi/ompi/wiki/Container-Versioning), but perhaps that's just because its ABI is more fluid than that of MPICH.

HTH,
MC


Steven Brandt

Oct 27, 2017, 11:46:40 AM
to singularity
So I came up with an answer to my own question which may be of interest to others. I inserted the code below into my .bashrc. Once it is in place, I can edit ~/sing.txt to contain the path name of a valid Singularity image. That way I can use the same container to launch the jobs as to run them, and I don't have to worry about which MPI choices are available on the machine I want to run on. To revert to my normal environment, I can delete sing.txt, log out, and log back in.

I think this hack might be of interest to others.

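# If ~/sing.txt exists, read the image path from it.
if [ -r "$HOME/sing.txt" ]
then
    IMAGE=$(cat "$HOME/sing.txt")
fi
if [ -n "$IMAGE" ]
then
    if [ -r "$IMAGE" ]
    then
        # SINGULARITY_CONTAINER is empty on the host, so this only
        # re-executes the login shell when not already inside a container.
        if [ -z "$SINGULARITY_CONTAINER" ]
        then
            exec singularity exec "$IMAGE" bash --login
        fi
    else
        echo "Could not read image file $IMAGE" >&2
    fi
fi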
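
Switching in and out of the container could be wrapped in two small helpers. This is only a sketch: sing_enable, sing_disable, and the SING_FILE variable are hypothetical names, with SING_FILE defaulting to the ~/sing.txt file that the snippet above reads.

```shell
# Marker file read by the .bashrc snippet; overridable for testing.
SING_FILE="${SING_FILE:-$HOME/sing.txt}"

# Record an image path so that the next login shell starts inside it.
sing_enable() {
    local image="$1"
    if [ ! -r "$image" ]; then
        echo "Could not read image file $image" >&2
        return 1
    fi
    printf '%s\n' "$image" > "$SING_FILE"
}

# Remove the marker file so that the next login uses the normal environment.
sing_disable() {
    rm -f "$SING_FILE"
}
```

Either change takes effect at the next login, since the .bashrc snippet only runs when a new shell starts.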