--
You received this message because you are subscribed to the Google Groups "singularity" group.
To unsubscribe from this group and stop receiving emails from it, send an email to singularity...@lbl.gov.
Hey guys,
I've heard many times that Singularity has nice support in Open MPI 2.0, but could someone describe how exactly this integration affects the execution of an MPI application? Older Open MPI and MPICH work inside a SAPP as well, so I don't really see what Open MPI 2.0 brings us.
Moreover, I see that MPI support in Singularity is positioned as one of the features implemented better than in Shifter (correct?). But Shifter also allows running MPI apps; at least I see Cray runs MPICH in Shifter's chroot (not sure about Open MPI, though). Could you please explain what the difference is (if any) between running, say, MPICH with Singularity versus running it in Shifter (from an HPC perspective, of course)?
Hi Ralph and Gregory,

Thank you both for such detailed answers! I see your replies complement each other. I am a bit confused by the whole picture now, though, so could you confirm that I have got the ideas right:

1. All implementations of MPI should work with Singularity containers by default (maybe not as optimally as they could, but they should always start and finish correctly). Actually, I recently tested MPICH+Singularity with several workload managers and it worked fine (I did not benchmark it against Open MPI). I did not manage to make Singularity+MPI work in LSF, but that is a different story that deserves a separate thread.
2. An MPI process calls dlopen, so the more MPI processes start on a node, the more times dlopen is called. Open MPI 2.0.1 somehow solves this magically (I don't get how) and dlopen is called only once per node. Other MPI implementations and older Open MPI versions are not Singularity-aware, so they will still call dlopen each time an MPI process spawns.
3. The dlopen issue affects only process start-up time and does not affect process execution, so at small scale with long-running processes there is no difference between Open MPI 2.0.1 and older Open MPI versions (or other MPI implementations).
4. When a sapp is built, Singularity detects Open MPI (even older than 2.0.1, right?) and resolves all dependencies automatically, adding all the files to the sapp. But with, say, MVAPICH2 the dependencies are not resolved automatically, so the user has to add some things manually.
5. Apart from solving the dlopen issue, Open MPI 2.0.1 does some splitting between the host and the container, which allows the user/admin not to optimize Open MPI for a target platform. I really don't get how Singularity does this, but I get the problem. Could you explain what Singularity or Open MPI 2.0.1 does for that specifically?
On May 13, 2016, at 9:52 AM, Taras Shapovalov <shapov...@gmail.com> wrote:

> 1. All implementations of MPI should work with Singularity containers by default (maybe not as optimally as they could, but they should always start and finish correctly). Actually, I recently tested MPICH+Singularity with several workload managers and it worked fine (I did not benchmark it against Open MPI). I did not manage to make Singularity+MPI work in LSF, but that is a different story that deserves a separate thread.

Correct - the LSF issue is likely a problem of getting the required setup info passed by LSF.

> 2. An MPI process calls dlopen, so the more MPI processes start on a node, the more times dlopen is called. Open MPI 2.0.1 somehow solves this magically (I don't get how) and dlopen is called only once per node. Other MPI implementations and older Open MPI versions are not Singularity-aware, so they will still call dlopen each time an MPI process spawns.

Not exactly. Singularity solves the dlopen problem by itself. What the container does is wrap all the dlopen'ed libraries into the container, so all dlopen calls by the app are resolved locally. Thus you automatically resolve the I/O-node bottleneck scaling issue. What OMPI adds is that it pulls the container only once per node. Other mpiexec implementations will pull the container again for every local process. So if you have 100 procs/node, OMPI will result in 100x fewer "pulls" through that I/O node.

> 3. The dlopen issue affects only process start-up time and does not affect process execution, so at small scale with long-running processes there is no difference between Open MPI 2.0.1 and older Open MPI versions (or other MPI implementations).

Correct.

> 4. When a sapp is built, Singularity detects Open MPI (even older than 2.0.1, right?) and resolves all dependencies automatically, adding all the files to the sapp. But with, say, MVAPICH2 the dependencies are not resolved automatically, so the user has to add some things manually.

Correct.

> 5. Apart from solving the dlopen issue, Open MPI 2.0.1 does some splitting between the host and the container, which allows the user/admin not to optimize Open MPI for a target platform. I really don't get how Singularity does this, but I get the problem. Could you explain what Singularity or Open MPI 2.0.1 does for that specifically?

When running under mpiexec with Singularity, OMPI's local daemon on each node actually runs outside of the containers. We then fork/exec the container itself, and the container is defined so that it auto-executes the application process. This allows us to minimize the services overhead, keeping all services outside of your container (and thus shared across all containers). Other approaches have the daemon *inside* the container, so you get one daemon for each container - and thus one daemon for each local application. That means higher overhead and therefore lower performance.
HTH
Ralph
Hi guys,

Thanks for the great answers! Now it is more or less clear how this works. To be absolutely sure, could you please also confirm these statements (taken from your answers):

1. Ralph's answer mentions mpiexec, but Gregory's answer is about mpirun. So everything discussed here applies to both utilities included in the Open MPI distribution.
2. Running Open MPI processes in a single container is implemented only in Singularity v2. In v1 each Open MPI process will still be executed in a different container.
3. Let's compare two scenarios: Singularity runs the child processes in a single container, versus each child running in a separate container. The dlopen optimization happens in the first scenario because the opened library is loaded into memory once per Singularity container; dlopen then magically returns the same handle to each child process inside the container, which should be faster. Or is there some other low-level dlopen optimization in the first scenario?
On May 16, 2016, at 5:54 AM, Gregory M. Kurtzer <gmku...@lbl.gov> wrote:

On Mon, May 16, 2016 at 1:17 AM, Taras Shapovalov <shapov...@gmail.com> wrote:
> Hi guys,
>
> Thanks for the great answers! Now it is more or less clear how this works. To be absolutely sure, could you please also confirm these statements (taken from your answers):
>
> 1. Ralph's answer mentions mpiexec, but Gregory's answer is about mpirun. So everything discussed here applies to both utilities included in the Open MPI distribution.

Ralph can speak definitively here, but I believe my answer applies to both.