Hi Mike,
that sounds interesting - I wonder what the behaviour would look like
with explicit bind mounts to the file system?
We recently came across an odd behaviour with Docker containers and NFS
mounts, where we bound paths from the NFS mounts into the container
namespace (and learnt a bit about how the kernel treats bind mounts
compared to file system mounts).
The thing was that even after unmounting the fs mount as root, the *fs
mount as such* still persisted: no mount was visible to root, but from
the container context one could still read/write the file
system/namespace, and everything got synced (we cross-checked with
Wireshark that NFS traffic was indeed still flowing while the fs was
'unmounted' for root).
So, we learnt along the way that while the 'original' mount was gone,
the bind mount into the container kept the fs mount alive (to me it
felt a bit like hard links to inodes...)
I documented a bit in
https://confluence.desy.de/display/~hartmath/Containers%2C+file+systems+and+bind+mount+oddities
It would be interesting to see what the processes' mountinfo entries
look like for each of the containers (assuming that the behaviour is
the same for block devices as for NFS)?
Cheers,
Thomas
On 10/04/2019 17.14, Mike wrote:
> Hi,
>
> I believe I have discovered an interesting use case for Singularity
> instances.
>
> If you run multiple "singularity exec /xxx/.sif /cmd/" on the same host
> (typical e.g. for single-threaded MPI jobs on multi-CPU systems), each
> invocation will get its own mount, and file system contents will be
> buffered separately for each mount. If processes are accessing
> essentially the same data in the container, they will compete for buffer
> cache space to hold multiple copies of identical data, possibly
> amounting to a significant portion of the entire memory, thus detracting
> from memory effectively available to processes' address space.
>
> One can easily demonstrate this behavior by preparing e.g. a
> ubuntu:latest container which has e.g. a 1GB file in its /data directory
> and then starting two or more containers on the same host. I could
> reproduce this for Singularity 2.6 and 3.1, and with kernels 3.10, 4.18,
> and 5.0.7.
>
> *Demonstration of multiple buffer cache allocation*
>
> For the example given below, I used Singularity 3.1.1 on a virtual host
> running Ubuntu 18.04 LTS; kernel = 4.18. I monitored buffer cache usage
> with top(1).
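(As an aside: as far as I know, the buff/cache figure top(1) shows is Buffers + Cached + SReclaimable from /proc/meminfo, so the measurement could also be scripted instead of read off the screen. A small sketch, with made-up sample values:)

```python
# Sketch: compute the "buff/cache" number (MiB) the way top(1) does,
# from /proc/meminfo text: Buffers + Cached + SReclaimable, all in kB.

def buff_cache_mib(meminfo_text):
    """Sum Buffers + Cached + SReclaimable (kB) and return MiB."""
    kb = {}
    for line in meminfo_text.splitlines():
        key, _, rest = line.partition(":")
        kb[key] = int(rest.split()[0])
    return (kb["Buffers"] + kb["Cached"] + kb["SReclaimable"]) / 1024

# Illustrative sample (real use: open('/proc/meminfo').read()):
sample = "MemTotal: 8000000 kB\nBuffers: 102400 kB\nCached: 204800 kB\nSReclaimable: 51200 kB\n"
print(buff_cache_mib(sample))  # -> 350.0
```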
>
> Start two separate shell sessions: singularity shell xxx.sif
>
> From another window, empty the buffer cache: sync; echo 3 | sudo tee
> /proc/sys/vm/drop_caches
>
> While monitoring buffer cache usage, issue in each Singularity session:
> cp /data/file1GB /dev/null
>
> Buffer cache usage (MiB):
> before: 209.363 buff/cache (a)
> after first cp: 3393.145 buff/cache delta (a,1) = 3184
> after second cp: 5458.289 buff/cache delta (1,2) = 2065
> after termination: 1236.008 buff/cache delta (a,b) = 1027
>
> Noteworthy observations:
> It appears that we need approx 1GB to cache the relevant portion of the
> SIF file in the host context, and roughly another 2GB of cache per
> invocation for buffering the data within the container.
>
> *Reducing buffer cache usage by Singularity instances*
>
> Repeating the same experiment with Singularity instances...
>
> singularity instance start xxx.sif c.i
>
> In both sessions:
> singularity shell instance://c.i
>
> Drop buffer cache
> sync; echo 3 | sudo tee /proc/sys/vm/drop_caches
>
> In each session as above:
> cp /data/file1GB /dev/null
>
> before: 199.441 buff/cache (a)
> after first cp: 3386.508 buff/cache delta (a,1) = 3187
> after second cp: 3386.965 buff/cache delta (1,2) < 1
> after termination: 1246.730 buff/cache delta (a,b) = 1047
>
> As expected, the buffer cache is shared between all Singularity
> processes because they are running in the same namespace.
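(One way to verify the shared-namespace claim from outside would be to compare the processes' mount-namespace links under /proc. A sketch - Linux-only, and assuming permission to read both /proc entries:)

```python
# Sketch: two processes share a mount namespace iff their
# /proc/<pid>/ns/mnt symlinks point at the same namespace inode,
# e.g. 'mnt:[4026531840]'.
import os

def same_mount_ns(pid_a, pid_b):
    link = lambda pid: os.readlink(f"/proc/{pid}/ns/mnt")
    return link(pid_a) == link(pid_b)

# A process trivially shares its namespace with itself:
print(same_mount_ns("self", "self"))  # -> True
```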
>
> Still, I wonder why reading a 1GB file uses 2GB to cache file system
> contents inside the container.
>
> *Practical considerations*