Communication between singularity containers


Raimon Bosch

Jun 21, 2016, 10:37:49 AM
to singularity


Hi,

We are trying to run experiments using singularity containers. The idea is to run OpenMPI among several containers and check performance results.

How can one container communicate with another? In Docker this is clear: every container gets an assigned IP that you can ping. What is the situation with Singularity? Is it possible to assign an IP to each container? Can I connect to them via ssh?

Thanks in advance,

Ralph Castain

Jun 21, 2016, 10:49:19 AM
to singu...@lbl.gov
Singularity is fully supported by OMPI (and vice versa). If you grab a copy of the OMPI master and build it --with-singularity=<path-to-singularity> (or have the singularity binary in your default PATH), then all you have to do is use mpirun as you normally do, but provide the container as your “app”.

We’ll take care of the rest. Our initial studies showed zero performance degradation by running inside Singularity, and the launch penalty is near-zero as well (and gets better when compared against dl_open’d dynamic jobs running at scale). I’ll let Greg answer the question of how to address the running container.


--
You received this message because you are subscribed to the Google Groups "singularity" group.
To unsubscribe from this group and stop receiving emails from it, send an email to singularity...@lbl.gov.

Gregory M. Kurtzer

Jun 21, 2016, 10:51:03 AM
to singularity
Hi Raimon,

The communication model of a Singularity container is very different from that of a Docker implementation. For all practical purposes, Docker emulates a virtual machine: each container has its own IP address and thus its own ssh server. That carries its own set of complexities. For example, networks need to be segregated/VLAN'ed; DNS/host resolution needs to be dynamic and passed down to the containers (so they can reach each other); ssh daemons and other processes must run inside the containers; everything must be managed via an existing scheduling system; and the list goes on and on.

Think of it this way: Singularity does not do any of that. It runs a program within the container as if it were running on the host itself, so communicating between containers is as easy as communicating between programs. For MPI, the MPI installation on the physical host (outside the container) invokes the container subsystem, which then invokes the MPI programs within the container; those programs communicate back to the MPI daemon (orted) outside the container on the host to get access to the host resources. In this model, all available resources and infrastructure can be leveraged at full bandwidth by the contained processes, and all of the aforementioned complexities of running on a virtualized mini-cluster are circumvented.

There is additional information I have written at:

http://singularity.lbl.gov/#hpc

That page is still coming along and needs more information, but if you have any questions, comments, or change proposals please let us know!

Thanks and hope that helps!






--
Gregory M. Kurtzer
High Performance Computing Services (HPCS)
University of California
Lawrence Berkeley National Laboratory
One Cyclotron Road, Berkeley, CA 94720

John Hearns

Jun 21, 2016, 11:46:50 AM
to singu...@lbl.gov
> We’ll take care of the rest. Our initial studies showed zero performance degradation by running inside Singularity, and the launch penalty is near-zero as well

May I just say - I haz a happee. Lolz.
Sorry - normal service will be resumed as soon as possible.  And yes I am a sad person when the thought of running MPI processes in containers makes me happy.

Greg Keller

Jun 21, 2016, 12:39:09 PM
to singu...@lbl.gov
Any chance Intel MPI will "just work"?

Ralph Castain

Jun 21, 2016, 12:46:14 PM
to singu...@lbl.gov
It might, minus perhaps some launch optimizations. 

Gregory M. Kurtzer

Jun 21, 2016, 12:46:37 PM
to singularity
On Tue, Jun 21, 2016 at 8:46 AM, 'John Hearns' via singularity <singu...@lbl.gov> wrote:
> > We’ll take care of the rest. Our initial studies showed zero performance degradation by running inside Singularity, and the launch penalty is near-zero as well

> May I just say - I haz a happee. Lolz.
> Sorry - normal service will be resumed as soon as possible. And yes I am a sad person when the thought of running MPI processes in containers makes me happy.


 I think I may need to quote you somewhere! LOL


Balazs Gerofi

Jun 21, 2016, 12:50:04 PM
to singu...@lbl.gov
Hello Greg,

I've tested Intel MPI and it works fine.
One caveat: if you run over IB you will need to add the network drivers (libdapl* and friends) to the container image.
Unfortunately these are not revealed just by inspecting your binary with ldd, but you can figure them out at runtime.
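A quick way to see the gap being described: ldd reports only link-time dependencies, so libraries pulled in later via dlopen() have to be observed while the process runs. A minimal sketch, using /bin/ls as a stand-in for your MPI binary (strace is an assumption; it may not be installed everywhere):

```shell
# Link-time view: only what the binary was linked against.
ldd /bin/ls

# Runtime view: every shared object the process actually opens,
# including anything loaded via dlopen(). Requires strace.
strace -e trace=open,openat /bin/ls 2>&1 | grep -o '/[^ "]*\.so[^ "]*' | sort -u
```

Running both against the real MPI application and diffing the results shows which libraries only appear at runtime.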

Best,
Balazs 

Gregory M. Kurtzer

Jun 21, 2016, 12:53:20 PM
to singularity
That sounds like a perfect FAQ!

Using Singularity v2, it just means installing the dapl RPM into the container, correct?

Greg Keller

Jun 21, 2016, 12:54:11 PM
to singu...@lbl.gov
Balazs,

Thanks for the tip. Hopefully it will be easy enough to teach Singularity to infer those requirements automagically.

Ralph Castain

Jun 21, 2016, 12:59:16 PM
to singu...@lbl.gov
I’m a little surprised that those dependencies wouldn’t be “discoverable” - the linker must be able to find them, yes? How are they communicated to the linker?

Balazs Gerofi

Jun 21, 2016, 1:08:17 PM
to singu...@lbl.gov
Ralph,

I think Intel MPI uses dlopen() internally based on what you specify in the I_MPI_FABRICS environment variable; if you don't use IB it doesn't need those libraries.
Of course the files need to be on your LD_LIBRARY_PATH or in the ld.so.cache.
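A small sketch of the mechanics at play here: dlopen() resolves bare library names through the normal dynamic-linker search order, so inside the container the fabric libraries must be reachable via LD_LIBRARY_PATH or already known to the ld.so cache. The /usr/lib64/dapl path below is hypothetical:

```shell
# Check whether the loader cache inside the container already knows
# about the DAPL libraries (no match means they are missing):
ldconfig -p | grep dapl || echo "dapl libraries not in ld.so.cache"

# Otherwise, point the runtime loader at wherever they were copied:
export LD_LIBRARY_PATH=/usr/lib64/dapl:$LD_LIBRARY_PATH
```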

Balazs

Ralph Castain

Jun 21, 2016, 1:16:21 PM
to singu...@lbl.gov
Ah indeed! Hmmm…I suppose one could just write an IMPI module in Singularity that added all the possible libraries…otherwise, the user is going to have to know in advance and manually add them.

Balazs Gerofi

Jun 21, 2016, 1:17:27 PM
to singu...@lbl.gov
Hi Greg,

it could be that the CentOS dapl and ibverbs packages would be sufficient, I copied them from the OFED distribution.

I still think it would be nice if there was a standard way of discovering and adding dependencies (as in v1), perhaps with some additional twist to automatically add things like IB drivers..? 

Balazs

Gregory M. Kurtzer

Jun 21, 2016, 1:22:17 PM
to singularity
I am still considering how best to do some level of internal dependency checking during bootstrap. I can bring back some of the ldd dependency-walking code and other InstallFile checks into v2, but if this happens outside of a bootstrap I still won't be able to catch it.

I would recommend that we create some FAQs and example bootstrap definitions for IMPI support over IB.
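The kind of ldd dependency walking mentioned here can be sketched in a few lines of shell. This is an illustrative reconstruction, not the actual v1 code; /var/tmp/container and /bin/ls are placeholder paths:

```shell
# Copy a binary into a container tree along with every shared library
# it links against, as reported by ldd.
CONTAINER=/var/tmp/container   # hypothetical container root
BIN=/bin/ls                    # binary to install

mkdir -p "$CONTAINER/bin"
cp "$BIN" "$CONTAINER/bin/"

# ldd prints lines like "libc.so.6 => /lib/.../libc.so.6 (0x...)";
# pull out the absolute paths and mirror them into the container tree.
for lib in $(ldd "$BIN" | grep -o '/[^ ]*'); do
    mkdir -p "$CONTAINER$(dirname "$lib")"
    cp "$lib" "$CONTAINER$lib"
done
```

Note this only catches link-time dependencies; anything loaded later via dlopen(), like the IB drivers discussed above, still has to be added by hand.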

Ralph Castain

Jun 21, 2016, 1:32:33 PM
to singu...@lbl.gov
Yeah, based on what Balazs said, it sounds like we either create an IMPI module that just loads all possible network drivers into the container, or require that the user pre-determine what they are going to use and then load it manually.

Balazs Gerofi

Jun 21, 2016, 1:40:12 PM
to singu...@lbl.gov
Ralph, this also brings up the issue of where you would want to run your containers later.
For example, do you guys launch mpirun from the underlying host or are you using a containerized version of that as well?

If the mpirun command doesn't match the MPI library your application is linked against, you can run into problems.
I guess this is more a general issue of communication between native and containerized components.

Balazs

Ralph Castain

Jun 21, 2016, 1:46:24 PM
to singu...@lbl.gov
mpirun and its attendant daemons run outside the container, and you are correct that this can create an issue. We’ve been resolving it by ensuring that PMIx (which is the only point of contact between the app and the daemons) has the ability to run cross-version; this will be enabled in an upcoming PMIx release.

Containerizing mpirun won’t help as the issue is compatibility between the mpirun infrastructure and the app, and either side might be statically linked or have a different OMPI version in its container/path.

Gregory M. Kurtzer

Jun 21, 2016, 1:47:12 PM
to singularity
I am concerned that this is a crutch for people who are not defining optional dependencies in their bootstrap. This is obviously not a hard dependency, otherwise YUM/APT/DNF would bring them in automatically. In Singularity v1 we had our own dependency solver, so we could easily add functionality for this type of thing, but now we rely on the operating system's installer and dependency resolver.

It is my perspective that this needs to be handled via the bootstrap definition file and proper documentation (which I'm still working on, sorry everyone for the delay!).

Raimon Bosch

Jun 22, 2016, 4:54:45 AM
to singularity

Hi Gregory,

Thank you for your answer. One of our experiments needs to run OpenMPI among several servers. This means that we should put one of our containers in host01, another in host02 and another in host03 and collect the results.

How can I do this execution in parallel if I need to communicate with more than one server?

Dave Love

Jun 22, 2016, 7:02:02 AM
to singu...@lbl.gov
Ralph Castain <r...@open-mpi.org> writes:

> Singularity is fully supported by OMPI (and vice versa). If you grab a
> copy of the OMPI master

Apparently not by any version we would have installed, though. [I often
wonder if OMPI people understand the implications for services of being
told to run a live development version that is incompatible with
everything else on the system, or potentially not even installable on
the system.]

When I tracked down the MCA module and tried to figure out what it was
doing in the new, undocumented framework, it appeared to be supporting
singularity v1. Given the instability of OMPI, I don't see how to keep
things in sync for an HPC service. In principle, as I understand it, a
singularity distribution could bundle a compatible MCA module, but it
would likely be compatible with few OMPI versions.

Of course I don't mean to flame, I still think OMPI is the best option
after surveying them and pushing for it originally, and I'm grateful to
people working on it, but the maintenance is frustrating for people
running a service. I know I should get back to the OMPI list and say
things there, even if it's not effective.

> and build it
> —with-singularity=<path-to-singularity> (or have the singularity path
> in your default path), then all you have to do is use mpirun as you
> normally do, but provide the container as your “app”.

I don't see that. See the following.

Dave Love

Jun 22, 2016, 7:05:48 AM
to singu...@lbl.gov
"Gregory M. Kurtzer" <gmku...@lbl.gov> writes:

> [...] management via an existing scheduling system, and the list goes
> on and on.

In this connexion, the startup for more-or-less privileged users via a
privileged daemon (rather than the resource manager) seems the most
important -- at least to a resource manager maintainer.

[By the way, if you're interested in security aspects, there's
<https://www.nccgroup.trust/globalassets/our-research/us/whitepapers/2016/april/ncc_group_understanding_hardening_linux_containers-10pdf>,
for instance.]

> Think of it this way, Singularity does not do any of that... It runs a
> program within the container as if it were running on the host itself, so
> to communicate between containers is as easy as communicating between
> programs. So for MPI, it would happen with the MPI on the physical host
> (outside the container) invoking the container subsystem which then invokes
> the MPI programs within the container and the MPI programs within the
> container communicate back to the MPI (orted) outside the container on the
> host to get access to the host resources.

But if I run a normal program not built against the system MPI, I expect
it to fail (typically un-obviously). You seem to be saying that's
entirely supported; could someone explain the magic that allows
incompatible OMPI components to communicate? Also, how does it work if
I do

mpirun --mca <parameter> <value> --<option> container

when <parameter> and <option> don't exist for the OMPI version inside
the container?

> In this model all available
> resources and infrastructure can be leveraged at full bandwidth by the
> contained processes and all of the aforementioned complexities akin to
> running on a virtualized mini-cluster are circumvented.

This should probably be another FAQ: How do you make the resources
visible? For instance, I can't see the Lustre, PVFS, or large NFS
filesystems' mounts inside the container or how to change that.

> There is additional information I have written at:
>
> http://singularity.lbl.gov/#hpc

If I follow those instructions (practicality aside) with singularity
v2.0, the ranks fail like this:

--------------------------------------------------------------------------
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environment
problems. This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):

ompi_mpi_init: ompi_rte_init failed
--> Returned "(null)" (-43) instead of "Success" (0)
--------------------------------------------------------------------------

It's the same whether I run the container or run mpirun <program> directly.

Dave Love

Jun 22, 2016, 7:07:41 AM
to singu...@lbl.gov
Balazs Gerofi <bge...@riken.jp> writes:

> Hi Greg,
>
> it could be that the CentOS dapl and ibverbs packages would be sufficient,
> I copied them from the OFED distribution.

[I've no idea what lengths IMPI might go to, but components from
different distributions may well not be compatible, and might not be
compatible with the kernel drivers.]

ibverbs isn't generally sufficient, even if you're not running whatever
Infiniband was called in the end. You probably need an HCA-specific
component too -- libmlx4 in our case.

> I still think it would be nice if there was a standard way of discovering
> and adding dependencies (as in v1), perhaps with some additional twist to
> automatically add things like IB drivers..?

I thought the main idea of v2 was to get that right through packaging
(modulo optional packages, as above, and v2.0 doesn't support
bootstrapping with the multiple repos you likely need) as the old system
didn't work.

Dave Love

Jun 22, 2016, 7:08:30 AM
to singu...@lbl.gov
"Gregory M. Kurtzer" <gmku...@lbl.gov> writes:

>> May I just say - I haz a happee. Lolz.
>> Sorry - normal service will be resumed as soon as possible. And yes I am
>> a sad person when the thought of running MPI processes in containers makes
>> me happy.

It's not running in containers that's the issue -- people have been
doing that -- but how, and understanding the implications.

>
>
> I think I may need to quote you somewhere! LOL

Please keep it that side of the Atlantic.

Gregory M. Kurtzer

Jun 22, 2016, 8:42:54 AM
to singu...@lbl.gov
Hi Raimon,

The quick answer is that you have mpirun handle it as you normally would, with the container file living on a shared file system:

$ mpirun singularity exec ~/container.img mpi_prog_in_container

Let the MPI outside the container launch the Singularity container on each host as it would normally launch any MPI program. mpirun will call Singularity, and Singularity will launch the MPI program inside the container on each of your hosts/servers.

Hope that helps!

Raimon Bosch

Jun 22, 2016, 9:56:23 AM
to singularity


Hi Gregory,

I'm not sure I would achieve the same with your commands. In an environment based on Docker or virtual machines we would do something like this [not applicable to Singularity]:

> cd $OPEN_MPI/bin && mpirun -np 4 --hostfile hosts.txt ./trace.sh ./bt.C.4

where hosts.txt* is:

>vm-ip-01-on-host01 slots=2
>vm-ip-01-on-host02 slots=2

* vm-ip-XX-on-hostXX are IPs i.e. 172.100.60.XX

and trace.sh is:

>#!/bin/bash
>
>export EXTRAE_HOME=/opt/extrae/
>export EXTRAE_CONFIG_FILE=/extrae.xml
>export LD_PRELOAD=${EXTRAE_HOME}/lib/libmpitrace.so
>
>## Run the desired program
>$*

As you see, we only perform one execution and OpenMPI transparently manages communication between containers or virtual machines. This command works whether the VMs are on the same host or not.

What I understand from your response is that we should now execute OpenMPI on each host and then merge the results manually. I don't know yet how to do this merge step, or if there is any way to centralize everything like I would with VMs.

Thanks in advance,

Gregory M. Kurtzer

Jun 22, 2016, 10:41:58 AM
to singu...@lbl.gov
Hi Raimon,

Sorry I wasn't clear. I am not yet at my computer and thinking while typing on an iPhone hinders my mental processes. Lol

If I understand your example properly, you have a Docker or VM infrastructure already set up and you are invoking the mpirun commands from within the virtual environment. Singularity works on a very different premise, because integrating a virtual cluster into an existing cluster and scheduling system is a mess.

So you start from the physical nodes, which already have access to all other nodes in the cluster, are already scheduled properly, and have direct access to optimized hardware and file systems... and you call mpirun.

The mpirun command will take the standard format as you illustrated with the following change to call Singularity inline:

$ mpirun -np 4 --hostfile hosts.txt singularity exec ~/container.img trace.sh bt.C.4

This assumes the following:

1. The container image which contains the programs you want to run is at ~/container.img and accessible at this path on all nodes referenced in hosts.txt
2. The hosts.txt references other physical nodes you want to run on
3. The executables trace.sh and bt.C.4 are both inside the container and in a standard path

In this case we are performing one execution and MPI + singularity is managing all of the communication between processes, nodes and containers. Also it is now using any optimized hardware (eg. Infiniband) and existing high performance file systems (which should not be accessible via a virtualized or Docker'ized cluster for security reasons).

This way is actually MUCH simpler than what you are proposing because there is no need to manage any virtual nodes, virtual networks, or resource manager hacks. It really is as easy as running any other MPI process on an existing cluster.

Hope that helps better!

Raimon Bosch

Jun 22, 2016, 10:46:19 AM
to singularity

I think it is clearer now. Thanks for your answer! I will post if I get interesting results.

Dave Love

Jun 22, 2016, 12:01:07 PM
to singu...@lbl.gov
Balazs Gerofi <bge...@riken.jp> writes:

> Ralph,
>
> I think Intel MPI uses dlopen() internally based on what you specify as the
> I_MPI_FABRICS environment variable, if you don't use IB it doesn't need
> those libraries.
> Of course the files need to be on your LD_LIBRARY_PATH or in the
> ld.so.cache.

Similarly for Open MPI, in case people don't know.

Raimon Bosch

Jun 23, 2016, 4:53:44 AM
to singularity

One last question: what if I want to execute more than one container on the same host? With this technique I am always bound to the same container. One of our experiments measures the performance of several containers working in parallel on the same node. We also have experiments with N containers per host in a multi-host environment.

Ralph Castain

Jun 23, 2016, 9:09:13 AM
to singu...@lbl.gov
I think you are misunderstanding the basic nature of the Singularity “container”. It’s just a file system overlay. So “sharing” a container is no different than running on a node where the procs all see the same file system. Thus, having multiple containers that are identical makes no sense - it’s all the same file system.

Now if you want to run different containers (e.g., with different libraries or OS in them), then you would use mpirun’s MPMD syntax - for example:

mpirun -n 1 <container1> : -n 1 <container2>

HTH
Ralph

Gregory M. Kurtzer

Jun 23, 2016, 9:21:46 AM
to singu...@lbl.gov
In addition to what Ralph mentioned, if you are just referring to benchmarking different containers on the same host, that is just a matter of launching a different container image. There are indeed some limitations due to consumable resource allocations (e.g. loop devices), but nothing you should worry about for normal usage.
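For reference, the loop-device limit is easy to inspect on a host: each mounted image binds one /dev/loop* device, and many older kernels create only eight by default. A minimal sketch (the modprobe line is shown only as a comment, since it needs root and a reboot-time decision):

```shell
# How many loop devices does this host expose?
ls /dev/loop* 2>/dev/null | wc -l

# Which ones are currently bound to an image? (may require root)
losetup -a 2>/dev/null || true

# On older kernels the count is fixed at module load time, e.g.:
#   modprobe loop max_loop=64
```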


Raimon Bosch

Jul 5, 2016, 11:25:32 AM
to singularity, r...@open-mpi.org

That solution does not work with the NAS MPI benchmark. bt.C.16 expects 16 processes, and when you split the processes it throws an exception because the number of processes is lower than 16.

I am still trying to figure out how to do this. Let me know if you have any suggestion.

Cheers,

Gregory M. Kurtzer

Jul 5, 2016, 12:21:48 PM
to singularity, Ralph Castain
Hi Raimon,

I am confused as to what issue you are having. Singularity supports running across nodes as well as running multiple processes per node in any number of containers. Can you paste your command and the error you are getting? Maybe that will help.

Thanks!




Raimon Bosch

Jul 6, 2016, 4:25:24 AM
to singularity, r...@open-mpi.org

Hi Gregory,

It fails depending on the environment. On my Ubuntu 14.04 machine it worked fine, but on this Debian jessie instance I get the following:

> ERROR: Failed to associate image to loop: Device or resource busy

Maybe it is because we are keeping the containers on a GlusterFS shared disk?

Here you have the entire output:

> sudo mpirun -n 1 singularity exec /mnt/glusterfs/singularity/nasmpi-1.img /trace.sh /NPB/NPB3.3-MPI/bin/bt.C.4 : -n 1 singularity exec /mnt/glusterfs/singularity/nasmpi-2.img /trace.sh /NPB/NPB3.3-MPI/bin/bt.C.4 : -n 1 singularity exec /mnt/glusterfs/singularity/nasmpi-3.img /trace.sh /NPB/NPB3.3-MPI/bin/bt.C.4 : -n 1 singularity exec /mnt/glusterfs/singularity/nasmpi-4.img /trace.sh /NPB/NPB3.3-MPI/bin/bt.C.4
ERROR: Failed to associate image to loop: Device or resource busy
ERROR: Failed to associate image to loop: Device or resource busy
/bin/bash: warning: setlocale: LC_ALL: cannot change locale (en_US.UTF-8)
ERROR: Failed to associate image to loop: Device or resource busy
--------------------------------------------------------------------------
mpirun has exited due to process rank 2 with PID 63416 on
node bscgrid30 exiting improperly. There are two reasons this could occur:

1. this process did not call "init" before exiting, but others in
the job did. This can cause a job to hang indefinitely while it waits
for all processes to call "init". By rule, if one process calls "init",
then ALL processes must call "init" prior to termination.

2. this process called "init", but exited without calling "finalize".
By rule, all processes that call "init" MUST call "finalize" prior to
exiting or it will be considered an "abnormal termination"

This may have caused other processes in the application to be
terminated by signals sent by mpirun (as reported here).
--------------------------------------------------------------------------

Thanks in advance,

Raimon Bosch

Jul 6, 2016, 4:43:32 AM
to singularity, r...@open-mpi.org

When I do "df -h" I see the singularity container still mounted. Maybe I need to run a command to unmount it:

> df -h
Filesystem                 Size  Used Avail Use% Mounted on
****
tmpfs                      3.2G     0  3.2G   0% /run/user/1006
****

Gregory M. Kurtzer

Jul 6, 2016, 10:00:36 AM
to singu...@lbl.gov, r...@open-mpi.org
Hi, 

/run/user is associated with the Singularity container?

Can you show me the output of 'losetup -a' please?

Why are you are running it with sudo, you should not need to.

It is weird; isn't -n a synonym for -np, and if so, shouldn't it execute 1 process on the given node? It seems like it is doing more.

Lastly, what version of Singularity is this? If from Git master when did you do the last pull? Can you try this in debug mode and with a simple binary for testing:

mpirun -n 1 singularity -d exec /mnt/glusterfs/singularity/nasmpi-1.img true

And send that output please. 


Raimon Bosch

Jul 6, 2016, 10:20:37 AM
to singularity, r...@open-mpi.org

Hi Gregory,


> /run/user is associated with the Singularity container?

I guess it is, because the containers are 3G in size and that matches these instances under /run/user/**. Unmounting them did not help.


> Can you show me the output of 'losetup -a' please?

"sudo losetup -a" returns empty


> Why are you are running it with sudo, you should not need to.

I execute with sudo because the container inside needs root. This is an old Docker container that only has a single root user owning all the files (probably I should change this in the future).


> It is weird, isn't -n a synonym for -np and if so, shouldn't it executing 1 process on the given node? It seems like it is doing more.

On my local machine the behaviour is correct. I tested it with -np and the behaviour is the same.


> Lastly, what version of Singularity is this?

It is master. I did "git clone https://github.com/gmkurtzer/singularity.git" and followed the installation steps.

As a side comment, if I deploy with a single container I don't encounter this problem. I think that when I want to mount extra containers the OS gets confused, or maybe Singularity tries to assign containers to a /dev/loop* device that is busy instead of looking for one that is available. In my final test I will need at least 16 containers on one host. Is that possible with Singularity, given that I only see 8 loop devices?

Here you have the debug output:

> sudo mpirun -n 1 singularity -d exec /mnt/glusterfs/singularity/nasmpi-singularity.img true
enabling debugging
ending argument loop
+ '[' -f /usr/local/etc/singularity/init ']'
+ . /usr/local/etc/singularity/init
++ unset module
++ PATH=/bin:/sbin:/usr/bin:/usr/sbin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
++ HISTFILE=/dev/null
++ export PATH HISTFILE
++ '[' -n 1 ']'
++ SINGULARITY_NO_NAMESPACE_PID=1
++ export SINGULARITY_NO_NAMESPACE_PID
+ true
+ case $1 in
+ break
+ '[' -z /mnt/glusterfs/singularity/nasmpi-singularity.img ']'
+ SINGULARITY_IMAGE=/mnt/glusterfs/singularity/nasmpi-singularity.img
+ export SINGULARITY_IMAGE
+ shift
+ exec /usr/local/libexec/singularity/sexec true
VERBOSE [U=0,P=3944]       message.c:46:init()                        : Setting messagelevel to: 5
DEBUG   [U=0,P=3944]       sexec.c:127:main()                         : Gathering and caching user info.
DEBUG   [U=0,P=3944]       privilege.c:43:get_user_privs()            : Called get_user_privs(struct s_privinfo *uinfo)
DEBUG   [U=0,P=3944]       privilege.c:54:get_user_privs()            : Returning get_user_privs(struct s_privinfo *uinfo) = 0
DEBUG   [U=0,P=3944]       sexec.c:134:main()                         : Checking if we can escalate privs properly.
DEBUG   [U=0,P=3944]       privilege.c:61:escalate_privs()            : Called escalate_privs(void)
DEBUG   [U=0,P=3944]       privilege.c:73:escalate_privs()            : Returning escalate_privs(void) = 0
DEBUG   [U=0,P=3944]       sexec.c:141:main()                         : Setting privs to calling user
DEBUG   [U=0,P=3944]       privilege.c:79:drop_privs()                : Called drop_privs(struct s_privinfo *uinfo)
DEBUG   [U=0,P=3944]       privilege.c:87:drop_privs()                : Dropping privileges to GID = '0'
DEBUG   [U=0,P=3944]       privilege.c:93:drop_privs()                : Dropping privileges to UID = '0'
DEBUG   [U=0,P=3944]       privilege.c:103:drop_privs()               : Confirming we have correct GID
DEBUG   [U=0,P=3944]       privilege.c:109:drop_privs()               : Confirming we have correct UID
DEBUG   [U=0,P=3944]       privilege.c:115:drop_privs()               : Returning drop_privs(struct s_privinfo *uinfo) = 0
DEBUG   [U=0,P=3944]       sexec.c:146:main()                         : Obtaining user's homedir
DEBUG   [U=0,P=3944]       sexec.c:150:main()                         : Obtaining file descriptor to current directory
DEBUG   [U=0,P=3944]       sexec.c:155:main()                         : Getting current working directory path string
DEBUG   [U=0,P=3944]       sexec.c:161:main()                         : Obtaining SINGULARITY_COMMAND from environment
DEBUG   [U=0,P=3944]       sexec.c:168:main()                         : Obtaining SINGULARITY_IMAGE from environment
DEBUG   [U=0,P=3944]       sexec.c:174:main()                         : Checking container image is a file: /mnt/glusterfs/singularity/nasmpi-singularity.img
DEBUG   [U=0,P=3944]       sexec.c:180:main()                         : Building configuration file location
DEBUG   [U=0,P=3944]       sexec.c:183:main()                         : Config location: /usr/local/etc/singularity/singularity.conf
DEBUG   [U=0,P=3944]       sexec.c:185:main()                         : Checking Singularity configuration is a file: /usr/local/etc/singularity/singularity.conf
DEBUG   [U=0,P=3944]       sexec.c:191:main()                         : Checking Singularity configuration file is owned by root
DEBUG   [U=0,P=3944]       sexec.c:197:main()                         : Opening Singularity configuration file
DEBUG   [U=0,P=3944]       sexec.c:210:main()                         : Checking Singularity configuration for 'sessiondir prefix'
DEBUG   [U=0,P=3944]       config_parser.c:43:config_get_key_value()  : Called config_get_key_value(fp, sessiondir prefix)
DEBUG   [U=0,P=3944]       config_parser.c:61:config_get_key_value()  : Return config_get_key_value(fp, sessiondir prefix) = NULL
DEBUG   [U=0,P=3944]       file.c:48:file_id()                        : Called file_id(/mnt/glusterfs/singularity/nasmpi-singularity.img)
VERBOSE [U=0,P=3944]       file.c:58:file_id()                        : Generated file_id: 0.39.12911060245380037651
DEBUG   [U=0,P=3944]       file.c:60:file_id()                        : Returning file_id(/mnt/glusterfs/singularity/nasmpi-singularity.img) = 0.39.12911060245380037651
DEBUG   [U=0,P=3944]       sexec.c:217:main()                         : Set sessiondir to: /tmp/.singularity-session-0.39.12911060245380037651
DEBUG   [U=0,P=3944]       sexec.c:221:main()                         : Set containername to: nasmpi-singularity.img
DEBUG   [U=0,P=3944]       sexec.c:223:main()                         : Setting loop_dev_* paths
DEBUG   [U=0,P=3944]       sexec.c:229:main()                         : Set image mount path to: /usr/local/var/singularity/mnt
LOG     [U=0,P=3944]       sexec.c:231:main()                         : Command=exec, Container=/mnt/glusterfs/singularity/nasmpi-singularity.img, CWD=/tmp/result, Arg1=true
DEBUG   [U=0,P=3944]       sexec.c:236:main()                         : Set prompt to: Singularity/nasmpi-singularity.img>
DEBUG   [U=0,P=3944]       sexec.c:238:main()                         : Checking if we are opening image as read/write
DEBUG   [U=0,P=3944]       sexec.c:240:main()                         : Opening image as read only: /mnt/glusterfs/singularity/nasmpi-singularity.img
DEBUG   [U=0,P=3944]       sexec.c:247:main()                         : Setting shared lock on file descriptor: 6
DEBUG   [U=0,P=3944]       sexec.c:267:main()                         : Checking for namespace daemon pidfile
DEBUG   [U=0,P=3944]       sexec.c:301:main()                         : Escalating privledges
DEBUG   [U=0,P=3944]       privilege.c:61:escalate_privs()            : Called escalate_privs(void)
DEBUG   [U=0,P=3944]       privilege.c:73:escalate_privs()            : Returning escalate_privs(void) = 0
VERBOSE [U=0,P=3944]       sexec.c:306:main()                         : Creating/Verifying session directory: /tmp/.singularity-session-0.39.12911060245380037651
DEBUG   [U=0,P=3944]       file.c:196:s_mkpath()                      : Creating directory: /tmp/.singularity-session-0.39.12911060245380037651
DEBUG   [U=0,P=3944]       sexec.c:320:main()                         : Setting shared lock on session directory
DEBUG   [U=0,P=3944]       sexec.c:331:main()                         : Caching info into sessiondir
DEBUG   [U=0,P=3944]       file.c:255:fileput()                       : Called fileput(/tmp/.singularity-session-0.39.12911060245380037651/image, nasmpi-singularity.img)
DEBUG   [U=0,P=3944]       sexec.c:337:main()                         : Checking for set loop device
DEBUG   [U=0,P=3944]       loop-control.c:52:obtain_loop_dev()        : Called obtain_loop_dev(void)
DEBUG   [U=0,P=3944]       loop-control.c:66:obtain_loop_dev()        : Found available existing loop device number: 0
VERBOSE [U=0,P=3944]       loop-control.c:81:obtain_loop_dev()        : Using loop device: /dev/loop0
DEBUG   [U=0,P=3944]       loop-control.c:95:obtain_loop_dev()        : Returning obtain_loop_dev(void) = /dev/loop0
DEBUG   [U=0,P=3944]       loop-control.c:106:associate_loop()        : Called associate_loop(image_fp, loop_fp, 1)
DEBUG   [U=0,P=3944]       loop-control.c:109:associate_loop()        : Setting loop flags to LO_FLAGS_AUTOCLEAR
VERBOSE [U=0,P=3944]       image.c:39:image_offset()                  : Calculating image offset
VERBOSE [U=0,P=3944]       image.c:48:image_offset()                  : Found image at an offset of 31 bytes
DEBUG   [U=0,P=3944]       image.c:53:image_offset()                  : Returning image_offset(image_fp) = 31
DEBUG   [U=0,P=3944]       loop-control.c:114:associate_loop()        : Setting image offset to: 31
VERBOSE [U=0,P=3944]       loop-control.c:116:associate_loop()        : Associating image to loop device
VERBOSE [U=0,P=3944]       loop-control.c:122:associate_loop()        : Setting loop device flags
DEBUG   [U=0,P=3944]       loop-control.c:130:associate_loop()        : Returning associate_loop(image_fp, loop_fp, 1) = 0
DEBUG   [U=0,P=3944]       file.c:255:fileput()                       : Called fileput(/tmp/.singularity-session-0.39.12911060245380037651/loop_dev, /dev/loop0)
DEBUG   [U=0,P=3944]       sexec.c:375:main()                         : Creating container image mount path: /usr/local/var/singularity/mnt
DEBUG   [U=0,P=3944]       sexec.c:441:main()                         : Checking to see if we are joining an existing namespace
VERBOSE [U=0,P=3944]       sexec.c:444:main()                         : Creating namespace process
DEBUG   [U=0,P=3944]       privilege.c:79:drop_privs()                : Called drop_privs(struct s_privinfo *uinfo)
DEBUG   [U=0,P=3944]       privilege.c:87:drop_privs()                : Dropping privileges to GID = '0'
DEBUG   [U=0,P=3944]       privilege.c:93:drop_privs()                : Dropping privileges to UID = '0'
DEBUG   [U=0,P=3944]       privilege.c:103:drop_privs()               : Confirming we have correct GID
DEBUG   [U=0,P=3944]       privilege.c:109:drop_privs()               : Confirming we have correct UID
DEBUG   [U=0,P=3944]       privilege.c:115:drop_privs()               : Returning drop_privs(struct s_privinfo *uinfo) = 0
DEBUG   [U=0,P=3949]       sexec.c:449:main()                         : Hello from namespace child process
VERBOSE [U=0,P=3949]       sexec.c:461:main()                         : Not virtualizing PID namespace
DEBUG   [U=0,P=3949]       sexec.c:480:main()                         : Virtualizing FS namespace
DEBUG   [U=0,P=3949]       sexec.c:488:main()                         : Virtualizing mount namespace
DEBUG   [U=0,P=3949]       sexec.c:495:main()                         : Making mounts private
DEBUG   [U=0,P=3949]       sexec.c:505:main()                         : Mounting Singularity image file read/write
DEBUG   [U=0,P=3949]       mounts.c:48:mount_image()                  : Called mount_image(/dev/loop0, /usr/local/var/singularity/mnt, 0)
DEBUG   [U=0,P=3949]       mounts.c:50:mount_image()                  : Checking mount point is present
DEBUG   [U=0,P=3949]       mounts.c:56:mount_image()                  : Checking loop is a block device
DEBUG   [U=0,P=3949]       mounts.c:75:mount_image()                  : Trying to mount read only as ext4 with discard option
DEBUG   [U=0,P=3949]       mounts.c:88:mount_image()                  : Returning mount_image(/dev/loop0, /usr/local/var/singularity/mnt, 0) = 0
DEBUG   [U=0,P=3949]       sexec.c:518:main()                         : Checking if container has /bin/sh
DEBUG   [U=0,P=3949]       sexec.c:526:main()                         : Checking to see if we should do bind mounts
DEBUG   [U=0,P=3949]       sexec.c:530:main()                         : Checking configuration file for 'mount home'
DEBUG   [U=0,P=3949]       config_parser.c:69:config_get_key_bool()   : Called config_get_key_bool(fp, mount home, 1)
DEBUG   [U=0,P=3949]       config_parser.c:43:config_get_key_value()  : Called config_get_key_value(fp, mount home)
DEBUG   [U=0,P=3949]       config_parser.c:54:config_get_key_value()  : Return config_get_key_value(fp, mount home) = yes
DEBUG   [U=0,P=3949]       config_parser.c:75:config_get_key_bool()   : Return config_get_key_bool(fp, mount home, 1) = 1
VERBOSE [U=0,P=3949]       sexec.c:536:main()                         : Mounting home directory base path: /root
DEBUG   [U=0,P=3949]       mounts.c:96:mount_bind()                   : Called mount_bind(/root, 19992816, 1)
DEBUG   [U=0,P=3949]       mounts.c:98:mount_bind()                   : Checking that source exists and is a file or directory
DEBUG   [U=0,P=3949]       mounts.c:104:mount_bind()                  : Checking that destination exists and is a file or directory
DEBUG   [U=0,P=3949]       mounts.c:110:mount_bind()                  : Calling mount(/root, /usr/local/var/singularity/mnt//root, ...)
DEBUG   [U=0,P=3949]       mounts.c:124:mount_bind()                  : Returning mount_bind(/root, 19992816, 1) = 0
DEBUG   [U=0,P=3949]       sexec.c:551:main()                         : Checking configuration file for 'bind path'
DEBUG   [U=0,P=3949]       config_parser.c:43:config_get_key_value()  : Called config_get_key_value(fp, bind path)
DEBUG   [U=0,P=3949]       config_parser.c:54:config_get_key_value()  : Return config_get_key_value(fp, bind path) = /etc/resolv.conf
VERBOSE [U=0,P=3949]       sexec.c:566:main()                         : Found 'bind path' = /etc/resolv.conf, /etc/resolv.conf
VERBOSE [U=0,P=3949]       sexec.c:583:main()                         : Binding '/etc/resolv.conf' to 'nasmpi-singularity.img:/etc/resolv.conf'
DEBUG   [U=0,P=3949]       mounts.c:96:mount_bind()                   : Called mount_bind(/etc/resolv.conf, 19995920, 1)
DEBUG   [U=0,P=3949]       mounts.c:98:mount_bind()                   : Checking that source exists and is a file or directory
DEBUG   [U=0,P=3949]       mounts.c:104:mount_bind()                  : Checking that destination exists and is a file or directory
DEBUG   [U=0,P=3949]       mounts.c:110:mount_bind()                  : Calling mount(/etc/resolv.conf, /usr/local/var/singularity/mnt//etc/resolv.conf, ...)
DEBUG   [U=0,P=3949]       mounts.c:124:mount_bind()                  : Returning mount_bind(/etc/resolv.conf, 19995920, 1) = 0
DEBUG   [U=0,P=3949]       config_parser.c:43:config_get_key_value()  : Called config_get_key_value(fp, bind path)
DEBUG   [U=0,P=3949]       config_parser.c:54:config_get_key_value()  : Return config_get_key_value(fp, bind path) = /etc/hosts
VERBOSE [U=0,P=3949]       sexec.c:566:main()                         : Found 'bind path' = /etc/hosts, /etc/hosts
VERBOSE [U=0,P=3949]       sexec.c:583:main()                         : Binding '/etc/hosts' to 'nasmpi-singularity.img:/etc/hosts'
DEBUG   [U=0,P=3949]       mounts.c:96:mount_bind()                   : Called mount_bind(/etc/hosts, 19998528, 1)
DEBUG   [U=0,P=3949]       mounts.c:98:mount_bind()                   : Checking that source exists and is a file or directory
DEBUG   [U=0,P=3949]       mounts.c:104:mount_bind()                  : Checking that destination exists and is a file or directory
DEBUG   [U=0,P=3949]       mounts.c:110:mount_bind()                  : Calling mount(/etc/hosts, /usr/local/var/singularity/mnt//etc/hosts, ...)
DEBUG   [U=0,P=3949]       mounts.c:124:mount_bind()                  : Returning mount_bind(/etc/hosts, 19998528, 1) = 0
DEBUG   [U=0,P=3949]       config_parser.c:43:config_get_key_value()  : Called config_get_key_value(fp, bind path)
DEBUG   [U=0,P=3949]       config_parser.c:54:config_get_key_value()  : Return config_get_key_value(fp, bind path) = /dev
VERBOSE [U=0,P=3949]       sexec.c:566:main()                         : Found 'bind path' = /dev, /dev
VERBOSE [U=0,P=3949]       sexec.c:583:main()                         : Binding '/dev' to 'nasmpi-singularity.img:/dev'
DEBUG   [U=0,P=3949]       mounts.c:96:mount_bind()                   : Called mount_bind(/dev, 20000832, 1)
DEBUG   [U=0,P=3949]       mounts.c:98:mount_bind()                   : Checking that source exists and is a file or directory
DEBUG   [U=0,P=3949]       mounts.c:104:mount_bind()                  : Checking that destination exists and is a file or directory
DEBUG   [U=0,P=3949]       mounts.c:110:mount_bind()                  : Calling mount(/dev, /usr/local/var/singularity/mnt//dev, ...)
DEBUG   [U=0,P=3949]       mounts.c:124:mount_bind()                  : Returning mount_bind(/dev, 20000832, 1) = 0
DEBUG   [U=0,P=3949]       config_parser.c:43:config_get_key_value()  : Called config_get_key_value(fp, bind path)
DEBUG   [U=0,P=3949]       config_parser.c:54:config_get_key_value()  : Return config_get_key_value(fp, bind path) = /tmp
VERBOSE [U=0,P=3949]       sexec.c:566:main()                         : Found 'bind path' = /tmp, /tmp
VERBOSE [U=0,P=3949]       sexec.c:583:main()                         : Binding '/tmp' to 'nasmpi-singularity.img:/tmp'
DEBUG   [U=0,P=3949]       mounts.c:96:mount_bind()                   : Called mount_bind(/tmp, 20003376, 1)
DEBUG   [U=0,P=3949]       mounts.c:98:mount_bind()                   : Checking that source exists and is a file or directory
DEBUG   [U=0,P=3949]       mounts.c:104:mount_bind()                  : Checking that destination exists and is a file or directory
DEBUG   [U=0,P=3949]       mounts.c:110:mount_bind()                  : Calling mount(/tmp, /usr/local/var/singularity/mnt//tmp, ...)
DEBUG   [U=0,P=3949]       mounts.c:124:mount_bind()                  : Returning mount_bind(/tmp, 20003376, 1) = 0
DEBUG   [U=0,P=3949]       config_parser.c:43:config_get_key_value()  : Called config_get_key_value(fp, bind path)
DEBUG   [U=0,P=3949]       config_parser.c:54:config_get_key_value()  : Return config_get_key_value(fp, bind path) = /var/tmp
VERBOSE [U=0,P=3949]       sexec.c:566:main()                         : Found 'bind path' = /var/tmp, /var/tmp
VERBOSE [U=0,P=3949]       sexec.c:583:main()                         : Binding '/var/tmp' to 'nasmpi-singularity.img:/var/tmp'
DEBUG   [U=0,P=3949]       mounts.c:96:mount_bind()                   : Called mount_bind(/var/tmp, 20005936, 1)
DEBUG   [U=0,P=3949]       mounts.c:98:mount_bind()                   : Checking that source exists and is a file or directory
DEBUG   [U=0,P=3949]       mounts.c:104:mount_bind()                  : Checking that destination exists and is a file or directory
DEBUG   [U=0,P=3949]       mounts.c:110:mount_bind()                  : Calling mount(/var/tmp, /usr/local/var/singularity/mnt//var/tmp, ...)
DEBUG   [U=0,P=3949]       mounts.c:124:mount_bind()                  : Returning mount_bind(/var/tmp, 20005936, 1) = 0
DEBUG   [U=0,P=3949]       config_parser.c:43:config_get_key_value()  : Called config_get_key_value(fp, bind path)
DEBUG   [U=0,P=3949]       config_parser.c:54:config_get_key_value()  : Return config_get_key_value(fp, bind path) = /home
VERBOSE [U=0,P=3949]       sexec.c:566:main()                         : Found 'bind path' = /home, /home
VERBOSE [U=0,P=3949]       sexec.c:583:main()                         : Binding '/home' to 'nasmpi-singularity.img:/home'
DEBUG   [U=0,P=3949]       mounts.c:96:mount_bind()                   : Called mount_bind(/home, 20008528, 1)
DEBUG   [U=0,P=3949]       mounts.c:98:mount_bind()                   : Checking that source exists and is a file or directory
DEBUG   [U=0,P=3949]       mounts.c:104:mount_bind()                  : Checking that destination exists and is a file or directory
DEBUG   [U=0,P=3949]       mounts.c:110:mount_bind()                  : Calling mount(/home, /usr/local/var/singularity/mnt//home, ...)
DEBUG   [U=0,P=3949]       mounts.c:124:mount_bind()                  : Returning mount_bind(/home, 20008528, 1) = 0
DEBUG   [U=0,P=3949]       config_parser.c:43:config_get_key_value()  : Called config_get_key_value(fp, bind path)
DEBUG   [U=0,P=3949]       config_parser.c:61:config_get_key_value()  : Return config_get_key_value(fp, bind path) = NULL
VERBOSE [U=0,P=3949]       sexec.c:633:main()                         : Not staging passwd or group (running as root)
VERBOSE [U=0,P=3949]       sexec.c:638:main()                         : Forking exec process
DEBUG   [U=0,P=3949]       sexec.c:770:main()                         : Dropping privs...
DEBUG   [U=0,P=3949]       privilege.c:79:drop_privs()                : Called drop_privs(struct s_privinfo *uinfo)
DEBUG   [U=0,P=3949]       privilege.c:87:drop_privs()                : Dropping privileges to GID = '0'
DEBUG   [U=0,P=3949]       privilege.c:93:drop_privs()                : Dropping privileges to UID = '0'
DEBUG   [U=0,P=3949]       privilege.c:103:drop_privs()               : Confirming we have correct GID
DEBUG   [U=0,P=3949]       privilege.c:109:drop_privs()               : Confirming we have correct UID
DEBUG   [U=0,P=3949]       privilege.c:115:drop_privs()               : Returning drop_privs(struct s_privinfo *uinfo) = 0
VERBOSE [U=0,P=3949]       sexec.c:776:main()                         : Waiting for Exec process...
DEBUG   [U=0,P=3959]       sexec.c:642:main()                         : Hello from exec child process
VERBOSE [U=0,P=3959]       sexec.c:644:main()                         : Entering container file system space
DEBUG   [U=0,P=3959]       sexec.c:649:main()                         : Changing dir to '/' within the new root
DEBUG   [U=0,P=3959]       sexec.c:657:main()                         : Checking configuration file for 'mount proc'
DEBUG   [U=0,P=3959]       config_parser.c:69:config_get_key_bool()   : Called config_get_key_bool(fp, mount proc, 1)
DEBUG   [U=0,P=3959]       config_parser.c:43:config_get_key_value()  : Called config_get_key_value(fp, mount proc)
DEBUG   [U=0,P=3959]       config_parser.c:54:config_get_key_value()  : Return config_get_key_value(fp, mount proc) = yes
DEBUG   [U=0,P=3959]       config_parser.c:75:config_get_key_bool()   : Return config_get_key_bool(fp, mount proc, 1) = 1
VERBOSE [U=0,P=3959]       sexec.c:661:main()                         : Mounting /proc
DEBUG   [U=0,P=3959]       sexec.c:674:main()                         : Checking configuration file for 'mount sys'
DEBUG   [U=0,P=3959]       config_parser.c:69:config_get_key_bool()   : Called config_get_key_bool(fp, mount sys, 1)
DEBUG   [U=0,P=3959]       config_parser.c:43:config_get_key_value()  : Called config_get_key_value(fp, mount sys)
DEBUG   [U=0,P=3959]       config_parser.c:54:config_get_key_value()  : Return config_get_key_value(fp, mount sys) = yes
DEBUG   [U=0,P=3959]       config_parser.c:75:config_get_key_bool()   : Return config_get_key_bool(fp, mount sys, 1) = 1
VERBOSE [U=0,P=3959]       sexec.c:678:main()                         : Mounting /sys
VERBOSE [U=0,P=3959]       sexec.c:692:main()                         : Dropping all privileges
DEBUG   [U=0,P=3959]       privilege.c:121:drop_privs_perm()          : Called drop_privs_perm(struct s_privinfo *uinfo)
DEBUG   [U=0,P=3959]       privilege.c:129:drop_privs_perm()          : Resetting supplementary groups
DEBUG   [U=0,P=3959]       privilege.c:135:drop_privs_perm()          : Dropping real and effective privileges to GID = '0'
DEBUG   [U=0,P=3959]       privilege.c:141:drop_privs_perm()          : Dropping real and effective privileges to UID = '0'
DEBUG   [U=0,P=3959]       privilege.c:151:drop_privs_perm()          : Confirming we have correct GID
DEBUG   [U=0,P=3959]       privilege.c:157:drop_privs_perm()          : Confirming we have correct UID
DEBUG   [U=0,P=3959]       privilege.c:163:drop_privs_perm()          : Returning drop_privs_perm(struct s_privinfo *uinfo) = 0
VERBOSE [U=0,P=3959]       sexec.c:699:main()                         : Changing to correct working directory: /tmp/result
DEBUG   [U=0,P=3959]       sexec.c:713:main()                         : Setting environment variable 'SINGULARITY_CONTAINER=1'
VERBOSE [U=0,P=3959]       sexec.c:732:main()                         : COMMAND=exec
DEBUG   [U=0,P=3959]       container_actions.c:59:container_exec()    : Called container_exec(2, **argv)
VERBOSE [U=0,P=3959]       container_actions.c:65:container_exec()    : Exec'ing program: true
VERBOSE [U=0,P=3949]       sexec.c:785:main()                         : Exec parent process returned: 0
VERBOSE [U=0,P=3944]       sexec.c:804:main()                         : Starting cleanup...
DEBUG   [U=0,P=3944]       sexec.c:955:main()                         : Checking to see if we are the last process running in this sessiondir
DEBUG   [U=0,P=3944]       sexec.c:959:main()                         : Escalating privs to clean session directory
DEBUG   [U=0,P=3944]       privilege.c:61:escalate_privs()            : Called escalate_privs(void)
DEBUG   [U=0,P=3944]       privilege.c:73:escalate_privs()            : Returning escalate_privs(void) = 0
VERBOSE [U=0,P=3944]       sexec.c:964:main()                         : Cleaning sessiondir: /tmp/.singularity-session-0.39.12911060245380037651
DEBUG   [U=0,P=3944]       file.c:212:s_rmdir()                       : Removing dirctory: /tmp/.singularity-session-0.39.12911060245380037651
DEBUG   [U=0,P=3944]       loop-control.c:138:disassociate_loop()     : Called disassociate_loop(loop_fp)
VERBOSE [U=0,P=3944]       loop-control.c:140:disassociate_loop()     : Disassociating image from loop device
DEBUG   [U=0,P=3944]       loop-control.c:146:disassociate_loop()     : Returning disassociate_loop(loop_fp) = 0
DEBUG   [U=0,P=3944]       privilege.c:79:drop_privs()                : Called drop_privs(struct s_privinfo *uinfo)
DEBUG   [U=0,P=3944]       privilege.c:87:drop_privs()                : Dropping privileges to GID = '0'
DEBUG   [U=0,P=3944]       privilege.c:93:drop_privs()                : Dropping privileges to UID = '0'
DEBUG   [U=0,P=3944]       privilege.c:103:drop_privs()               : Confirming we have correct GID
DEBUG   [U=0,P=3944]       privilege.c:109:drop_privs()               : Confirming we have correct UID
DEBUG   [U=0,P=3944]       privilege.c:115:drop_privs()               : Returning drop_privs(struct s_privinfo *uinfo) = 0
VERBOSE [U=0,P=3944]       sexec.c:981:main()                         : Cleaning up...

Thanks,
To unsubscribe from this group and stop receiving emails from it, send an email to singularity+unsubscribe@lbl.gov.


-- 
Gregory M. Kurtzer
High Performance Computing Services (HPCS)
University of California
Lawrence Berkeley National Laboratory
One Cyclotron Road, Berkeley, CA 94720


Raimon Bosch

unread,
Jul 6, 2016, 10:39:44 AM7/6/16
to singularity, r...@open-mpi.org

Just guessing... but could it be that Singularity does not detect that the containers are different and tries to mount all 4 at the same mount point?

Gregory M. Kurtzer

unread,
Jul 6, 2016, 11:54:37 AM7/6/16
to singularity, Ralph Castain
Hi Raimon,


On Wed, Jul 6, 2016 at 7:39 AM, Raimon Bosch <raimon...@gmail.com> wrote:

Just guessing... but could it be that Singularity does not detect that the containers are different and tries to mount all 4 at the same mount point?

Yes, it does exactly that but thanks to CLONE_NEWNS, the mount namespaces never overlap or even see each other.
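(For the curious: each process's mount namespace is visible under /proc, so two separately launched `singularity exec` processes can be seen to live in different namespaces. A rough sketch; the namespace ID printed will differ per system.)

```shell
# Print the mount namespace ID of the current process; a process that has
# called unshare(CLONE_NEWNS) will report a different ID here than its parent.
readlink /proc/self/ns/mnt
```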

Looking through your debug output, there are no errors. Can you run the command again with debugging enabled executing /NPB/NPB3.3-MPI/bin/bt.C.4 instead of /bin/true?

Thanks!



Raimon Bosch

unread,
Jul 7, 2016, 3:32:31 AM7/7/16
to singularity, r...@open-mpi.org


Hi Gregory,

Attached is the full execution in debug mode.

This is the command I ran:
sudo sudo mpirun -n 1 singularity -d exec /mnt/glusterfs/singularity/nasmpi-singularity.img /trace.sh /NPB/NPB3.3-MPI/bin/bt.C.4 : -n 1 singularity -d exec /mnt/glusterfs/singularity/nasmpi-singularity-2.img /trace.sh /NPB/NPB3.3-MPI/bin/bt.C.4 : -n 1 singularity -d exec /mnt/glusterfs/singularity/nasmpi-singularity-3.img /trace.sh /NPB/NPB3.3-MPI/bin/bt.C.4 : -n 1 singularity -d exec /mnt/glusterfs/singularity/nasmpi-singularity-4.img /trace.sh /NPB/NPB3.3-MPI/bin/bt.C.4 2>&1 | tee /tmp/out.log
out.log

Greg Keller

unread,
Jul 7, 2016, 9:29:22 AM7/7/16
to singu...@lbl.gov, r...@open-mpi.org

The repeated/nested sudo at the start of the command confused me. Did you intend to run the command with -u [username]?

What is its purpose?

Cheers!
Greg - but not THE Greg


Gregory M. Kurtzer

unread,
Jul 7, 2016, 11:36:20 AM7/7/16
to singularity, Ralph Castain
There is something weird going on... When you did the same with the container command 'true' it only spawned one singularity container, with this command it is spawning 4. Do you have any idea why this is happening?

Also, there may be a race condition that I am investigating...


Gregory M. Kurtzer

unread,
Jul 7, 2016, 3:47:39 PM7/7/16
to singularity, Ralph Castain
Hi Raimon,

Is /tmp/ a shared file system or does it have any non-standard mount options? As far as I can tell it doesn't appear that flock() is being honored. What is the output of:

$ grep proc /proc/mounts

You could try changing the configuration entry for session dir location to something like:

session dir = /var/singularity/sessions/
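A quick way to test the flock() suspicion on a candidate session directory is a sketch like the following (using util-linux's `flock`; the /tmp path is just an example):

```shell
# Hold an exclusive lock on a test file, then verify that a second
# non-blocking attempt fails. On a file system where flock() is not
# honored, the second attempt would wrongly succeed.
f=/tmp/.flock-test
exec 9>"$f"
flock -n 9
if flock -n "$f" -c true; then
    echo "flock NOT honored"
else
    echo "flock honored"
fi
exec 9>&-
rm -f "$f"
```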

Also, I added more debugging information in [master 6128198], which will hopefully make this easier to debug in the future, and implemented a very basic check that flock() is being honored. Please give that a shot (both with and without debugging) and see if it helps. I've been testing with a large local run and have not been able to trigger a problem on my system:

$ mpirun --oversubscribe -n 250 singularity exec container.img true

Keep in mind that we still have some confusion as to why mpirun is spawning 4 singularity processes for this command rather than the single process it spawned when we called /bin/true.

Hope that helps!

Raimon Bosch

unread,
Jul 11, 2016, 5:25:23 AM7/11/16
to singularity, r...@open-mpi.org

The "sudo sudo" is a typo, sorry. I ran it with a single sudo.

Raimon Bosch

unread,
Jul 11, 2016, 6:15:26 AM7/11/16
to singularity, r...@open-mpi.org


Hi Gregory,

Thanks for your answer. I checked all the things you asked for and sent you the new log:


El jueves, 7 de julio de 2016, 21:47:39 (UTC+2), Gregory M. Kurtzer escribió:
Hi Raimon,

Is /tmp/ a shared file system or does it have any non-standard mount options? As far as I can tell it doesn't appear that flock() is being honored. What is the output of:

It is not shared.

$ grep proc /proc/mounts

proc /proc proc rw,nosuid,nodev,noexec,relatime 0 0
systemd-1 /proc/sys/fs/binfmt_misc autofs rw,relatime,fd=22,pgrp=1,timeout=300,minproto=5,maxproto=5,direct 0 0
binfmt_misc /proc/sys/fs/binfmt_misc binfmt_misc rw,relatime 0 0
 

You could try changing the configuration entry for session dir location to something like:

session dir = /var/singularity/sessions/

I've changed this, but the error persisted.
 

Also, I added more debugging information in [master 6128198], which will hopefully make this easier to debug in the future, and implemented a very basic check that flock() is being honored. Please give that a shot (both with and without debugging) and see if it helps. I've been testing with a large local run and have not been able to trigger a problem on my system:

$ mpirun --oversubscribe -n 250 singularity exec container.img true

Keep in mind that we still have some confusion as to why mpirun is spawning 4 singularity processes for this command rather than the single process it spawned when we called /bin/true.


Attached you have the output with the new master.

Yes, the weird thing is that I didn't have this problem on Ubuntu 14.04. I'm not sure whether the problem comes from Debian Jessie or whether it is a configuration issue. I haven't yet been able to test spawning my containers from my native disk instead of the shared one; I don't have much space left to test that. Maybe that's the issue.
 
out.log
Message has been deleted

Gregory M. Kurtzer

unread,
Jul 11, 2016, 9:14:27 AM7/11/16
to singularity, Ralph Castain
We have two issues going on:

1. It should be running only one job, but it is still running 4. Are you running it on an interactive scheduler allocation?

2. I am still seeing an issue with the flock() not being honored... Let me dig through this a bit more and let you know.

BTW: You should be using a much newer version, not an older one. Try the 2.x development branch from the OMPI GitHub.


On Mon, Jul 11, 2016 at 5:11 AM, Raimon Bosch <raimon...@gmail.com> wrote:

Maybe the error comes from OpenMPI 1.6.5. Now I'm testing with an earlier version.


Gregory M. Kurtzer

unread,
Jul 11, 2016, 9:19:14 AM7/11/16
to singularity, Ralph Castain
Can you tell me a bit more about the type of nodes and the file systems? Also, I'd like to see all of /proc/mounts if possible (at least for locally mounted file systems).

Thanks!

Gregory M. Kurtzer

unread,
Jul 11, 2016, 3:50:42 PM7/11/16
to singu...@lbl.gov, r...@open-mpi.org
I blame my phone... or maybe I was just not reading carefully enough. Is the command below what you are running? That would explain why we are seeing multiple Singularity commands running, and possibly why I was unable to replicate the problem. It also gives me a better use case for testing the error you are seeing: you are running with 4 different containers and I was not.

While I will debug that and try to replicate, I am curious... Why are you using different containers?

Thanks!


Raimon Bosch

unread,
Jul 12, 2016, 5:09:04 AM7/12/16
to singularity, r...@open-mpi.org

Well, we have a set of experiments to do. We measure MPI performance across several scenarios: 16 virtual machines running on the same host, 4 virtual machines on the same host, 1 virtual machine on one host, and also multi-host scenarios such as 16 VMs across 4 hosts.

So in order to compare VMs, Docker, and Singularity, I need to reproduce the same scenarios on all platforms. If we can't solve this multi-container scenario, I can try running the experiments reusing the same container, but I'm not sure the results will be comparable.
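One way to express the one-rank-per-container scenario with Open MPI is an appfile, which lets each rank use a different "app" (here, a different container image). This is only a sketch: the image names and the `./mpi_benchmark` binary are hypothetical placeholders, not from the thread.

```shell
# Hypothetical sketch: an Open MPI appfile launching one rank per container
# image, mirroring the "4 VMs on one host" scenario. Names are examples.
cat > appfile <<'EOF'
-np 1 singularity exec container1.img ./mpi_benchmark
-np 1 singularity exec container2.img ./mpi_benchmark
-np 1 singularity exec container3.img ./mpi_benchmark
-np 1 singularity exec container4.img ./mpi_benchmark
EOF

# Launch all four ranks with:
#   mpirun --app appfile
# versus the single-container baseline:
#   mpirun -np 4 singularity exec container.img ./mpi_benchmark
echo "wrote $(grep -c 'singularity exec' appfile) rank entries to appfile"
```

Whether the per-rank containers behave identically to four VMs is exactly the open question of this thread.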

Gregory M. Kurtzer

unread,
Jul 12, 2016, 7:38:58 AM7/12/16
to singu...@lbl.gov, r...@open-mpi.org
On the backend of what Singularity is doing, I don't believe you are going to see a performance difference. This is because Singularity is not joining the namespaces of existing processes like that. For example, if you do:

$ singularity exec container.img ps

on as many containers as you can run at once, they will all show that the command 'ps' is PID 1, and thus they all exist in their own namespaces. The only way to share namespaces is to use start/stop before exec'ing the Singularity commands for each container you wish to use. Once you "start" a container, all subsequent Singularity commands will use that namespace. But about the only performance optimization you will see is roughly a hundredth of a second of startup time (in my experience).
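The start/exec/stop workflow Greg describes might look like the following sketch (subcommand names follow his wording; check your Singularity version's help output, and note `container.img` is a placeholder):

```shell
# Sketch of shared-namespace usage, guarded so it is a no-op where
# singularity is not installed.
if command -v singularity >/dev/null 2>&1; then
  singularity start container.img          # create persistent namespaces
  singularity exec container.img ps        # runs inside the started namespaces
  singularity exec container.img hostname  # shares those same namespaces
  singularity stop container.img           # tear the namespaces down
else
  echo "singularity not found; workflow shown for illustration only"
fi
```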

There is one other factor that may affect performance that I can foresee; file system IO. A single image file on a large job may have greater contention. But on the flip side it also means more aggressive caching of both the image file itself as well as contents. <shrug>

Anyway I hope that helps. I will be trying to resolve the issue you identified later today and I will report back on it. Can you make a GitHub issue for that race condition bug when using multiple images if you have a chance?

Thanks!


Raimon Bosch

unread,
Jul 12, 2016, 8:39:02 AM7/12/16
to singularity, r...@open-mpi.org

Sure. I'll create the issue.

Raimon Bosch

unread,
Jul 12, 2016, 8:46:17 AM7/12/16
to singularity, r...@open-mpi.org