mpi and portability


Cédric Clerget

unread,
May 11, 2017, 5:55:24 PM
to singularity
Hello,

I will be speaking next week at a workshop about reproducible science and portability, and I don't want to say anything inaccurate about MPI and Singularity containers.

I managed to run MPI applications with Singularity and OpenMPI.

So I installed version 2.1.0rc4 on the host (CentOS 6) and in the container (Ubuntu 16.04). Following the documentation, I simply compiled OpenMPI in the container with
./configure && make && make install.
On the host: ./configure --with-sge --with-psm && make && make install

Everything works as expected with a hello-world example. To be sure it ran over InfiniBand, I launched a PingPong between two hosts,
and the latency results showed it was using Ethernet.
The solution was to install the libpsm-infinipath1 and libpsm-infinipath1-dev packages and recompile OMPI with ./configure --with-psm

All the documentation just does ./configure in the container without any options.
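To make the fix above concrete, here is a hedged sketch of the container-side build with PSM support enabled; the download URL, version, and paths are illustrative, not prescriptive:

```shell
# Inside the container (Ubuntu 16.04): install the PSM headers first,
# then build Open MPI so configure can enable InfiniBand/PSM support.
apt-get update
apt-get install -y libpsm-infinipath1 libpsm-infinipath1-dev build-essential wget

wget https://www.open-mpi.org/software/ompi/v2.1/downloads/openmpi-2.1.0.tar.gz
tar xf openmpi-2.1.0.tar.gz
cd openmpi-2.1.0
./configure --with-psm && make && make install
```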

I read in this group that MVAPICH works without problems with Singularity, so I gave it a try: same behaviour, I needed to install the PSM headers and recompile too.

and that brought me to these questions:
  • are there options to pass to configure for OMPI/MVAPICH on the host?
  • for portability, should I embed all the libs/headers needed to work with many hardware configurations (Mellanox, QLogic, Intel)?
I would be grateful if you would share your experience with this.

Regards,
Cédric Clerget

Gregory M. Kurtzer

unread,
May 11, 2017, 6:16:09 PM
to singu...@lbl.gov
Hi Cedric,

Yes, always be truthful! I second that!

Regarding your findings, yes, you are 100% correct in that the IB support must be present within the container for the MPI to be able to communicate with the underlying hardware. There is no way to virtualize that as of yet, and yes, this does have an impact on portability due to the reliance on kernel<->userspace compatibility within the OFED stack. We would like to mitigate this but it will take collaboration with the OFED community which still needs to happen (and introductions would be greatly appreciated from anybody on the list).

Singularity by default will blur the lines between container and host as much as possible, and that includes sharing devices between the environments. So from a container perspective, Singularity really lends itself to this easily. But, from a user-space and environment perspective, you will still need the necessary libraries to communicate with the underlying hardware; this is true in a container or when running on the host proper.
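The hybrid model Greg describes is usually driven from the host side; a hedged sketch of a typical launch (image name and binary path are hypothetical):

```shell
# Host-side mpirun starts one container process per rank; each rank's
# Open MPI inside the container then talks to the fabric hardware directly
# through the shared device files.
mpirun -np 4 singularity exec ./ubuntu-ompi.img /usr/local/bin/mpi_hello
```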

Now to your questions.

1. The configure options (as far as I know) will be auto-discovered as long as you have the necessary IB development environment installed wherever you are building OMPI/MVAPICH.

2. Yes, you should embed all of the libraries and headers necessary to work on the hardware configurations you wish to be compatible with. Luckily, we have figured this out with GPUs, but not OFED, QLogic, or OmniPath.
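One hedged way to act on point 2 is to install the user-space libraries for several fabrics inside the container so the same image can run on different hardware; package names below are from Ubuntu 16.04 and will differ on other distros:

```shell
# Inside the container: user-space stacks for PSM (QLogic/Intel) and
# verbs/RDMA (Mellanox and other OFED hardware).
apt-get install -y libpsm-infinipath1 libpsm-infinipath1-dev \
                   libibverbs1 libibverbs-dev librdmacm1 librdmacm-dev
```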

Hope that helps!

Greg



--
You received this message because you are subscribed to the Google Groups "singularity" group.
To unsubscribe from this group and stop receiving emails from it, send an email to singularity+unsubscribe@lbl.gov.

Cédric Clerget

unread,
May 11, 2017, 6:42:59 PM
to singularity
Hi Greg,

Thank you very much for your quick response. It's clearer now.

Cédric

Gregory M. Kurtzer

unread,
May 11, 2017, 6:52:48 PM
to singu...@lbl.gov
Excellent! Thanks and have fun with your presentation/workshop!

Greg


Chris Hines

unread,
May 11, 2017, 6:58:30 PM
to singu...@lbl.gov
Hi Greg,


2. Yes, you should embed all of the libraries and headers necessary to work on the hardware configurations you wish to be compatible with. Luckily, we have figured this out with GPUs, but not OFED, Qlogic, or OmniPath.

So that seems perfectly reasonable to me. Indeed I was able to achieve similar functionality by bind mounting OpenMPI from my CentOS host into my Ubuntu container (i.e. srun worked external to the container, OFED worked internal to it).
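For context, a hedged sketch of the bind-mount approach described above; the host paths and image name are examples, not the actual setup:

```shell
# Mount the host's Open MPI tree (built against the host's OFED libs)
# into the container at the same path, then run a container process
# against the host-provided MPI.
singularity exec --bind /usr/local/openmpi:/usr/local/openmpi \
    ubuntu-16.04.img /usr/local/openmpi/bin/ompi_info
```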

The thing is, I repeated this using OMPI 1.10.3 as well as 2.1.0, and that seemed to work as well, which is great, but flies in the face of this:


To achieve proper container’ized Open MPI support, you must use Open MPI version 2.1 


Any chance you can explain what "proper container'ized support" is? I think Cedric and I both assumed that it meant that as long as you had any old libmpi.so.20 in the container, orted would magically figure out how to use OFED.

PS. Happy to make a PR to update the docs, but I just want to understand what 2.1 enables that 1.10 didn't.

Cheers,
--
Chris.

vanessa s

unread,
May 11, 2017, 7:14:15 PM
to singu...@lbl.gov
Hey Chris,

+1 to PR to the docs! And we just had some work on this recently:


If you have additions to make, or specific points you want clarified, would you mind posting a PR for that page? We should have the docs for 2.3 released with your questions answered. Thanks!

Best,

Vanessa
--
Vanessa Villamia Sochat
Stanford University '16

Gregory M. Kurtzer

unread,
May 11, 2017, 7:21:18 PM
to singu...@lbl.gov
Hi Chris,

In a manner of speaking, it really depends on the MPI. For example (as I understand it) Intel MPI includes the IB/OmniPath support within the MPI implementation itself and doesn't require the OFED user-space libraries, whereas something like OMPI does. Now if you compiled your OFED libraries in the same place you were bind mounting the OpenMPI from on your host, *and* if those libraries were glibc compatible with your container (which I am assuming they were, because you didn't mention any problems), then all would indeed work as expected!

Hope that helps, and yes on the PR to the docs! PLEASE!

Thanks!

Greg

Chris Hines

unread,
May 11, 2017, 8:24:44 PM
to singu...@lbl.gov
Thanks Greg,

Now if you compiled your OFED libraries in the same place you were bind mounting the OpenMPI from on your host, *and* if those libraries were glibc compatible with your container (which I am assuming they were, because you didn't mention any problems), then all would indeed work as expected!

You've surmised correctly! In this case I was running a recent Ubuntu (16.04) container on an older (CentOS 7) host, with OFED and MPI compiled with the older CentOS 7 glibc. 
I guess my strategy of bind mounting helps me run new software on older stable cluster nodes, but would not help with the reverse strategy of running old stable containers for reproducible science on new clusters.
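One hedged way to sanity-check the glibc-compatibility assumption Greg raised is to resolve the host-built library with the container's own loader; paths and image name below are hypothetical:

```shell
# Run ldd from *inside* the container against the bind-mounted, host-built
# libmpi. Unresolved symbols or "version GLIBC_x.y not found" errors mean
# the host build is newer than the container's glibc can satisfy.
singularity exec --bind /usr/local/openmpi ubuntu-16.04.img \
    ldd /usr/local/openmpi/lib/libmpi.so
```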

So, is there any functional difference in container integration between Open MPI 1.x series and Open MPI 2.1 series? I'm not sure which (if any) of the above assumptions I can relax for 2.1.

@Vanessa: That helps, but you didn't notice that I submitted that PR to you ;-) I want to update it to make sure that it's crystal clear what the Open MPI 2.1 series enables and what the differences with the Open MPI 1.x and 2.0 series are (at the moment I can't find any when using bind mounts and container glibc > host glibc, so the example should work for 1.10 as well as 2.1, although I need to verify).
 
Hope that helps, and yes on the PR to the docs! PLEASE!

 Definitely!
 
Cheers,
--
Chris.

Gregory M. Kurtzer

unread,
May 11, 2017, 8:42:51 PM
to singu...@lbl.gov
On Thu, May 11, 2017 at 5:24 PM, Chris Hines <chris...@monash.edu> wrote:
Thanks Greg,

Now if you compiled your OFED libraries in the same place you were bind mounting the OpenMPI from on your host, *and* if those libraries were glibc compatible with your container (which I am assuming they were, because you didn't mention any problems), then all would indeed work as expected!

You've surmised correctly! In this case I was running a recent Ubuntu (16.04) container on an older (CentOS 7) host, with OFED and MPI compiled with the older CentOS 7 glibc. 
I guess my strategy of bind mounting helps me run new software on older stable cluster nodes, but would not help with the reverse strategy of running old stable containers for reproducible science on new clusters.

Yes, this is exactly correct with my experience.
 

So, is there any functional difference in container integration between Open MPI 1.x series and Open MPI 2.1 series? I'm not sure which (if any) of the above assumptions I can relax for 2.1.

Yes, I think there is, but nothing I can state definitively. I had a talk with some of the OMPI devels a while back and they mentioned some advantages, specifically a versioning handshake along the PMI layer, which would help with version mismatches between host and containers. But don't quote me on whether that is working properly in the 2.x series, nor do I know if there have been enough 2.x releases to adequately test this.
 

@Vanessa: That helps, but you didn't notice that I submitted that PR to you ;-) I want to update it to make sure that its crystal clear what the Open MPI 2.1 series enables and what the differences with the Open MPI 1.x and 2.0 series is (at the moment I can't find any when using bind mounts and container glibc > host glibc, so the example should work for 1.10 as well as 2.1 although I need to verify)
 
Hope that helps, and yes on the PR to the docs! PLEASE!

 Definitely!

Thank you!


 

vanessa s

unread,
May 11, 2017, 10:01:32 PM
to singu...@lbl.gov
Thanks everyone for the work on this - the PRs (I believe) are properly updated and merged! And Chris, sorry I didn't associate your Github username with Chris... too many i's and 1's :)


Martin Cuma

unread,
May 12, 2017, 12:21:47 PM
to singularity
Again a bit of a shameless plug: here's our MPI container recipe: https://github.com/CHPC-UofU/Singularity-ubuntu-mpi. We run CentOS 7 on the host, and the container runs Ubuntu 16.04.

The InfiniBand stack is brought in from the Ubuntu repos. Notice that I am too lazy to build much inside the container, so I am bringing in the Intel compiler and the MPI from the host (on an NFS-mounted file system).
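For readers who don't follow the link, here is a hedged sketch of a Singularity 2.x definition file in the spirit of that recipe (not the recipe itself): the IB user-space stack comes from the Ubuntu repos, and the compiler/MPI are expected to arrive via host bind mounts at hypothetical mount points:

```
BootStrap: debootstrap
OSVersion: xenial
MirrorURL: http://us.archive.ubuntu.com/ubuntu/

%post
    apt-get update
    # user-space InfiniBand/verbs stack from the Ubuntu repos
    apt-get install -y libibverbs1 libibverbs-dev librdmacm1 ibverbs-utils
    # hypothetical mount points for the host-provided compiler and MPI
    mkdir -p /opt/intel /opt/mpi
```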

I haven't done extensive testing on this with real applications, but I have it on my todo list once things calm down a bit.

HTH,
MC

Gregory M. Kurtzer

unread,
May 12, 2017, 12:25:40 PM
to singu...@lbl.gov
Hi Martin,

Nothing wrong with a shameless plug! As a matter of fact, I would encourage you to contribute your recipes into the development branch of Singularity under the `examples/` directory using the same format and syntax as is defined there.

Thanks!


Martin Cuma

unread,
May 12, 2017, 12:29:11 PM
to singularity
Thanks Greg, good idea. I'll put it on my list. I first want to find some time to do a few more experiments with things like the GPUs and MPI. But, once it's on a list, it'll get done sometime ;-)

MC

vanessa s

unread,
May 12, 2017, 1:56:12 PM
to singu...@lbl.gov
Cool! I'll watch out for that PR, and if it's ok with you, will add a gist for the recipe to the docs/2.3, so others can find it easily too :)

On Fri, May 12, 2017 at 12:29 PM, Martin Cuma <mart...@gmail.com> wrote:
Thanks Greg, good idea. I'll put it on my list. I first want to find some time to do a few more experiments with things like the GPUs and MPI. But, once it's on a list, it'll get done sometime ;-)

MC
