IMO the main issue is the tension between container portability (being able to run on any host) and the ability to run in parallel across multiple hosts.
If you only run on a single host, you can package whatever MPI you want into the container and run the MPI program entirely inside it: singularity exec my.img mpirun -np 4 myprog.
When running in parallel on multiple hosts, the container needs to be launched with the host-based MPI, so there has to be some level of compatibility between the host and container MPI. They don't necessarily need to be exactly the same, but they do need to be binary compatible.
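The multi-host pattern looks roughly like this - the host's MPI launcher starts one container instance per rank, and the ranks wire up through the host MPI. This is a sketch, not a site-specific recipe: the image name, program name, and host names are hypothetical, and the `-hosts` option is the MPICH/Hydra spelling (other launchers use a hostfile).

```shell
# On the host: the host mpirun launches one containerized process per rank.
# Requires the host MPI and the MPI the program was built against inside
# the image to be ABI compatible (hypothetical names throughout).
mpirun -np 8 -hosts node01,node02 singularity exec my.img myprog
```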
I have tested various recent MPICH derivatives (MPICH, MVAPICH2, Intel MPI), though not exhaustively. Since they share a common ABI (https://wiki.mpich.org/mpich/index.php/ABI_Compatibility_Initiative), they work with each other most of the time. The exception is when one MPI is built with an option that the other MPI build lacks - for example, the Ubuntu stock MPICH uses BLCR checkpointing, which Intel MPI does not package; see a container we built that revealed this at https://github.com/CHPC-UofU/Singularity-meep-mpi.
What we also did was simply bring in our cluster's Lmod modules and then use the MPI binaries from the host inside the container. If the MPI is built against an old enough glibc (e.g. Intel MPI), it works on many Linux distros. This is obviously not very portable, though.
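That host-MPI approach can be sketched as a bind mount plus an environment override. The install paths below are hypothetical placeholders for wherever your site keeps its MPI; `SINGULARITYENV_*` variables are how Singularity passes environment into the container.

```shell
# Bind the host's MPI install tree into the container and point the
# container's library path at it (paths are hypothetical, adjust to your site).
export SINGULARITYENV_LD_LIBRARY_PATH=/opt/intel/impi/latest/lib/release
singularity exec -B /opt/intel my.img \
    /opt/intel/impi/latest/bin/mpirun -np 4 myprog
```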
My choice for an as-portable-as-possible container would be a basic MPICH build inside the container (no IB, BLCR, etc.), with the container user having the choice to use MPICH ABI compatible MPIs like MVAPICH2, Intel MPI, or Cray MPI. To bring in InfiniBand support, one would have to LD_PRELOAD, or prepend to LD_LIBRARY_PATH, the libmpi.so from the host that has the IB support built in.
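The library swap can be sketched like this. It leans entirely on the MPICH ABI: the container's plain libmpi.so.12 and the host's IB-enabled libmpi.so.12 (from MVAPICH2, Intel MPI, etc.) export the same interface, so the dynamic linker can substitute one for the other. The host library path is a hypothetical example.

```shell
# Prepend the host's IB-enabled, MPICH-ABI-compatible library directory so the
# container's loader picks up the host libmpi.so.12 instead of the container's
# basic MPICH build (host path is hypothetical).
export SINGULARITYENV_LD_LIBRARY_PATH=/opt/mvapich2/lib
mpirun -np 8 singularity exec -B /opt/mvapich2 my.img myprog
```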
OpenMPI has its own ABI and seems to be more aware of the container vs. host binary compatibility problem (https://github.com/open-mpi/ompi/wiki/Container-Versioning), but perhaps that's just because its ABI is more fluid than MPICH's.