Adding files in a build recipe

Oliver Freyermuth

Oct 6, 2017, 11:13:24 AM
to singu...@lbl.gov
Dear Singularities,

I have just now found the need to add files to containers, and it seems there are two options (%setup and %files).
However, what is unclear to me is how I can make the build-recipe reasonably portable, i.e. "how to specify the paths".

Right now, I have a git-tree organized as follows:

recipes/SL6/default.def
recipes/Ubuntu1604/default.def
resources/profile/setupATLAS.sh
resources/profile/setupBelle.sh

Now I would like to:
- Be able to call "singularity bootstrap" from anywhere, i.e. not caring about the actual working directory of the "singularity" bootstrapping process.
- Copy the resource-files into my containers.
- Achieve that without hardcoding any absolute paths in the build recipe.

This boils down to the question: How do I specify the path correctly in the build recipe?
Is it expected to be:
- relative to the working directory of the "singularity bootstrap" process? That would be very much against portability.
- relative to the location of the build recipe? Then I could probably use "../../resources/profile/setupATLAS.sh /etc/profile.d/setupATLAS.sh" in my "%files" section.

Maybe it's even possible to pass it in, i.e. use something like
%files
${resourcedir:-../../resources}/profile/setupATLAS.sh
inside the recipe, and use "${resourcedir}" from the bootstrapping host's environment if it is set?

This is not really clear to me from the documentation, but maybe I just missed it ;-).

All the best and many thanks for your help!
Oliver

vanessa s

Oct 6, 2017, 12:09:42 PM
to singu...@lbl.gov
Hey Oliver,

I think you have a few options.

> I have just now found the need to add files to containers, and it seems there are two options (%setup and %files).
> However, what is unclear to me is how I can make the build-recipe reasonably portable, i.e. "how to specify the paths".

%setup shouldn't be needed for much, as files are (as of recent versions) added to the container prior to %post. If you need to create directories for the files that don't exist yet, you would add them in %setup:

mkdir -p ${SINGULARITY_ROOTFS}/data

and then copy via %files:

%files
script.sh /data/script.sh
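Putting those together, a minimal recipe could look something like this (just a sketch - the ubuntu:16.04 base and script.sh are placeholders, and script.sh is assumed to sit in the directory you build from):

Bootstrap: docker
From: ubuntu:16.04

%setup
    # runs on the host; ${SINGULARITY_ROOTFS} points at the container's filesystem
    mkdir -p ${SINGULARITY_ROOTFS}/data

%files
    script.sh /data/script.sh

%post
    # by now the file is inside the container
    chmod +x /data/script.sh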

> Right now, I have a git-tree organized as follows:

> recipes/SL6/default.def
> recipes/Ubuntu1604/default.def
> resources/profile/setupATLAS.sh
> resources/profile/setupBelle.sh

The good news (if you want a service) is that Singularity Hub (2.0), which will be released after Singularity 2.4, is going to support this structure. The convention is to find (recursively) any file called "Singularity" and build it if it's been updated. The extension of the file is the tag. So for your files above, you would have a repo connected to Singularity Hub with this organization:

recipes/SL6/Singularity.SL6
recipes/Ubuntu1604/Singularity.Ubuntu16.04

For files, at least for Singularity Hub, the builder is always going to set the base as the repo base, so you would still add files like:

%files
resources/profile/setupBelle.sh
resources/profile/setupATLAS.sh

and in the above, those would go to the root of the image under the same name.
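If you want them to land somewhere specific instead (your /etc/profile.d/ case), each %files line can also take an explicit destination:

%files
resources/profile/setupATLAS.sh /etc/profile.d/setupATLAS.sh
resources/profile/setupBelle.sh /etc/profile.d/setupBelle.sh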

> Now I would like to:
> - Be able to call "singularity bootstrap" from anywhere, i.e. not caring about the actual working directory of the "singularity" bootstrapping process.

I'm not sure I totally follow here - Singularity minimally needs a definition file (Singularity) and a path to an image, and the build context is important. Docker is the same - when you build, it looks for the Dockerfile and the local context. What would make sense is to have some standard organization of the build directory, and then have a common bootstrap file (adding files from the same folder, finding the definition) sort of automatically. Another idea is to have a wrapper around singularity that keeps a record of build bases (directories) associated with recipe or uri names, and then when you call to bootstrap / build with the uri, it changes directories appropriately, etc.
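To sketch that wrapper idea (purely hypothetical, not an existing Singularity feature), it could be as simple as resolving the paths and entering the recipe's folder before bootstrapping:

#!/bin/bash
# build-from-recipe.sh <image> <recipe> - hypothetical wrapper, not part of Singularity
set -euo pipefail
image=$(readlink -f "$1")    # make both paths absolute so they survive the cd
recipe=$(readlink -f "$2")
cd "$(dirname "$recipe")"    # the recipe's directory becomes the build context
singularity bootstrap "$image" "$recipe"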

> - Copy the resource-files into my containers.
> - Achieve that without hardcoding any absolute paths in the build recipe.

You shouldn't need to - given that you have a "resources" folder in the root of the build directory, if you are running it from there you can use relative paths.

> This boils down to the question: How do I specify the path correctly in the build recipe?
> Is it expected to be:
> - relative to the working directory of the "singularity bootstrap" process? That would be very much against portability.
> - relative to the location of the build recipe? Then I could probably use "../../resources/profile/setupATLAS.sh /etc/profile.d/setupATLAS.sh" in my "%files" section.

It's relative to where you are calling it from. I don't think this breaks portability, if you think about the ways that people share containers and recipes. For containers, the work is done and the recipes are inside - this is what we care most about. For portability of the predecessor to the container (recipe, files) we use version control (e.g. GitHub), and we don't care about absolute paths. It's portable because someone else can download my repo and build my container.

>
> Maybe it's even possible to pass it in, i.e. use something like
> %files
> ${resourcedir:-../../resources}/profile/setupATLAS.sh
> inside the recipe, and use "${resourcedir}" from the bootstraping host's environment if it is set?

I think it would be unlikely for most to have one researcher's special environment variable, but I could be wrong.


> This is not really clear to me from the documentation, but maybe I just missed it ;-).

If you could better define what exactly isn't reasonable about relative paths from some base, I can offer suggestions. If you are looking for a local image manager to make it easy to push your own containers (and find them later) check out Singularity Registry https://singularityhub.github.io/sregistry/


--
Vanessa Villamia Sochat
Stanford University

vanessa s

Oct 6, 2017, 12:11:22 PM
to singu...@lbl.gov
If you want some kind of internal organization, check out SCI-F:


Maybe a mapping of some organization of local files into internal modules (apps) would work well


Oliver Freyermuth

Oct 6, 2017, 1:06:14 PM
to singu...@lbl.gov, vanessa s
Hey Vanessa,

first off - many thanks for your quick, detailed, and elaborate reply!
Singularity is the first software project I've encountered with such a quick response time ;-).

> %setup shouldn't be needed for much, as files are (as of recent versions) added to the container prior to post. If you need to make directories for the files that don't exist, you would want to add them in %setup:
Thanks for the explanation and example!

> The good news (if you want a service) is that Singularity Hub (2.0) that will be released after Singularity 2.4 is going to support this structure. The standard is to find (recursively) any file called "Singularity" and build if it's been updated.
This is good news indeed. However, for now, we will start off with an internal git repository and some build scaffolding in the form of shell scripts.
There are several reasons at the moment (e.g. pre-2.0 shub still requests unacceptable permissions for the github account), but that is certainly an option for the future!

>> Now I would like to:
>> - Be able to call "singularity bootstrap" from anywhere, i.e. not caring about the actual working directory of the "singularity" bootstrapping process.
>
> I'm not sure I totally follow here - Singularity needs minimally a definition file (Singularity) and path to an image -
> the build context is important. Docker is the same - when you build it looks for the Dockerfile and local context.
> What would make sense is to have some standard organization of the build directory, and then have a common bootstrap file (adding files from the same folder, finding the definition) sort of automatically.
> Another idea is to have a wrapper around singularity that keeps a record of build bases (directories) associated with recipe or uri names, and then when you call to bootstrap /build with the uri, it changes directories appropriately, etc.
Up to now, the context is not (yet) important for me, since I can do e.g.:
export TMPDIR=$(mktemp -d)
mount -t tmpfs -o size=80% none ${TMPDIR}
export SINGULARITY_REPO_CHECKOUT=/singularity/singularity_build
singularity bootstrap ${TMPDIR} ${SINGULARITY_REPO_CHECKOUT}/recipes/SL6/default.def
And there is no need at all to take care of what the current working directory is. The issue arises as soon as I want to include files from the host.

For docker (please correct me if I am wrong), the build context is enforced to be the place where the Dockerfile is placed, so again it does not matter where I call the actual "docker build" binary.
If singularity really uses the working directory where the binary is started as build context, I consider this a portability and reproducibility issue,
since a user always has to enter the directory where the singularity recipe is placed to achieve reproducible behaviour on his / her machine.
Otherwise, it could happen that he/she accidentally includes files or directories from a totally different place.

Let's visualize with a security-relevant, but not unrealistic example. Let's assume my recipe contains:

%files
.ssh/id_rsa /home/user/.ssh/id_rsa
.ssh/id_rsa.pub /home/user/.ssh/id_rsa.pub

If the user checks out my repo and now executes, in his / her home:
singularity bootstrap someimage.img ~/my_git_repo_checkout/recipes/SL6/Singularity
the container will be built successfully, but the result would contain the ssh-keys of the building user(!) taken from the home directory instead of those which reside in
~/my_git_repo_checkout/recipes/SL6/.ssh/
No error would be thrown. If the user now pushes this build to a public registry, he / she has a huge problem and is not even aware of it
(and likely not even aware of it if he / she read the recipe beforehand!).

In my eyes, that's at least a portability / reproducibility issue, if not a security issue, since the executable working directory matters for the build result -
while for docker this is not the case as far as I know, it uses a fixed base (the one of the Dockerfile).

> Another idea is to have a wrapper around singularity that keeps a record of build bases (directories) associated with recipe or uri names, and then when you call to bootstrap /build with the uri, it changes directories appropriately, etc.
I already have such a wrapper (which for example can build .tar.gz containers using the intermediate tmpfs-ramdisk - really fast!), so I can implement that easily ;-).

> If you could better define what exactly isn't reasonable about relative paths from some base, I can offer suggestions.
I actually think relative paths are fine - but I'm not happy with the base ;-).
I would prefer the base to be similar to how it is handled in Docker (by default at least),
and / or to be able to specify it from outside e.g. via a command line parameter, which would allow for example:
singularity bootstrap --buildroot /my_cloned_repo/resource_set_1/ SL6_with_resource_set_1.img /my_cloned_repo/recipes/SL6.def
singularity bootstrap --buildroot /my_cloned_repo/resource_set_2/ SL6_with_resource_set_2.img /my_cloned_repo/recipes/SL6.def
This can be pretty useful if I have to build a series of containers which are all the same, but just have a different set of resources.

As it is now, everybody building a singularity container seems to have to change manually (or via wrapper) to the directory of the build recipe before bootstrapping to ensure things work out.
This is something I dislike, and it's not even documented (I think) that the build context is the working directory.

> If you are looking for a local image manager to make it easy to push your own containers (and find them later) check out Singularity Registry https://singularityhub.github.io/sregistry/
I actually looked at this project shortly before deciding on the shell-script solution.
What we are doing here is building containers for our HPC cluster, and we deploy them directly to CVMFS, to have them readily available on all worker nodes and to take advantage of the superior caching mechanics of CVMFS (we put extracted images there, not image files).
So there is no push / pull infrastructure required, and our users have access to our local CVMFS repository (to "read" containers).
Of course, a registry has quite a few more nice features, but for us the overhead (web server, database, worker, etc.) was too large for the requirements at hand.

It might be very useful if also users should be able to modify container recipes, but right now only we admins are taking care to prepare that ;-).


So all in all, my suggestion would be to re-think the build context concept to follow the safer and more reproducible concept from Docker,
and at least document what the build context is (as Docker does).
As a bonus, the "--buildroot" parameter (or whatever name is preferred) would be a nice-to-have.

For now, however, I think I will solve my issue at hand by just extending the wrapper to explicitly enter the directory in which the build recipe is contained,
and (sym)link all common resources into place, so they will be found right next to (or in a directory below) the build recipe.
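In shell terms, the plan is roughly (paths for illustration only):

cd ${SINGULARITY_REPO_CHECKOUT}/recipes/SL6   # enter the recipe's directory first
ln -s ../../resources resources               # shared resources now resolve relative to the recipe
singularity bootstrap ${TMPDIR} default.def   # %files can refer to resources/profile/...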


Many thanks for the elaborate reply (and for making singularity the successful project it already is!) :-),
Oliver

vanessa s

Oct 6, 2017, 1:25:35 PM
to Oliver Freyermuth, singu...@lbl.gov
> For docker (please correct me if I am wrong), the build context is enforced to be the place where the Dockerfile is placed, so again it does not matter where I call the actual "docker build" binary.
> If singularity really uses the working directory where the binary is started as build context, I consider this a portability and reproducibility issue,

Oh sorry, I misspoke then - Singularity is an executable added to your path that you can call from anywhere. It works just as Docker does with regards to the build recipe, but is more flexible because it doesn't automatically add everything in sight (e.g., why .dockerignore exists).
 
> since a user always has to enter the directory where the singularity recipe is placed to achieve reproducible behaviour on his / her machine.

Another clarification - this is only the case when you build. Before you build the image and have different files, of course it's not a reproducible container. Once you build the container, then you can be assured reproducibility as everything is packaged inside.
 
> Otherwise, it could happen that he/she accidentally includes files or directories from a totally different place.

If you have a simple practice (like Docker) of keeping recipes and relevant build files in the same place, this seems logical to me.
 
> Let's visualize with a security-relevant, but not unrealistic example. Let's assume my recipe contains:
>
> %files
> .ssh/id_rsa /home/user/.ssh/id_rsa
> .ssh/id_rsa.pub /home/user/.ssh/id_rsa.pub

Is that even a good idea? These kinds of things might be more appropriate for a user to map to their own home (so the container doesn't store them, and in fact if you are in home you would map the calling user's home directory over it anyway) or to produce (again mapped to the user's host) at runtime. I don't think it's generally good practice to store credentials, certificates, etc. in the (static / not used) image.
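For example (just a sketch - it assumes /home/user/.ssh exists as a mount point in the image and that bind control is enabled), the user could map their keys in at runtime, so nothing is ever stored in the image:

# the keys stay on the host and are only visible while the container runs
singularity exec --bind ${HOME}/.ssh:/home/user/.ssh someimage.img cat /home/user/.ssh/id_rsa.pub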
 
> If the user checks out my repo and now executes, in his / her home:
> singularity bootstrap someimage.img ~/my_git_repo_checkout/recipes/SL6/Singularity
> the container will be built successfully, but the result would contain the ssh-keys of the building user(!) taken from the home directory instead of those which reside in
> ~/my_git_repo_checkout/recipes/SL6/.ssh/

yeah DON'T do that. That borders on malicious to have a user build a container and knowingly copy their secrets (whatever they might be) into it. 
 
> No error would be thrown. If the user now pushes this build to a public registry, he / she has a huge problem and is not even aware of it
> (and likely not even aware of it if he / she read the recipe beforehand!).

This is also an issue of education about security. This is just the same as saying I could add credentials to a Github repo and push it. But I think I get what you are saying - the user would be doing this unknowingly. All I can say is that they should look at the recipe file first, OR use a pre-built container that can't take their files and distribute them. 
 
> In my eyes, that's at least a portability / reproducibility issue, if not a security issue, since the executable working directory matters for the build result -
> while for docker this is not the case as far as I know, it uses a fixed base (the one of the Dockerfile).

I still don't follow. Singularity is exactly the same as Docker in this regard, the commands ADD/COPY correspond to our %files section, and the paths are relative to the build folder.
 
> Another idea is to have a wrapper around singularity that keeps a record of build bases (directories) associated with recipe or uri names, and then when you call to bootstrap /build with the uri, it changes directories appropriately, etc.
> I already have such a wrapper (which for example can build .tar.gz containers using the intermediate tmpfs-ramdisk - really fast!), so I can implement that easily ;-).

great!

 
> If you could better define what exactly isn't reasonable about relative paths from some base, I can offer suggestions.
> I actually think relative paths are fine - but I'm not happy with the base ;-).
> I would prefer the base to be similar to how it is handled in Docker (by default at least),
> and / or to be able to specify it from outside e.g. via a command line parameter, which would allow for example:
> singularity bootstrap --buildroot /my_cloned_repo/resource_set_1/ SL6_with_resource_set_1.img /my_cloned_repo/recipes/SL6.def
> singularity bootstrap --buildroot /my_cloned_repo/resource_set_2/ SL6_with_resource_set_2.img /my_cloned_repo/recipes/SL6.def
> This can be pretty useful if I have to build a series of containers which are all the same, but just have a different set of resources.

So you are saying you want to completely close off directories outside of the PWD / build directory? So I could add files relative, but not do something like ../../parent/directories ? That seems reasonable to do, we could add some kind of variable for a "build_dir" and spit out an error if the user tries to copy / go outside of it.
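A minimal sketch of that check (hypothetical - neither the build_dir variable nor the error exist in Singularity today):

#!/bin/bash
# reject a %files source that resolves outside the build directory
build_dir=$(readlink -f "$1")
src=$(readlink -f "$2")
case "$src" in
    "$build_dir"/*) echo "OK: $src is inside $build_dir" ;;
    *) echo "ERROR: $src escapes the build directory" >&2; exit 1 ;;
esac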
 
> As it is now, everybody building a singularity container seems to have to change manually (or via wrapper) to the directory of the build recipe before bootstrapping to ensure things work out.
> This is something I dislike, and it's not even documented (I think) that the build context is the working directory.

/shrug here, I guess I am naturally inclined to want to keep different projects / containers in their own folders, to match with my github repos.
 
> If you are looking for a local image manager to make it easy to push your own containers (and find them later) check out Singularity Registry https://singularityhub.github.io/sregistry/
> I actually looked at this project shortly before deciding on the shell-script solution.
> What we are doing here is building containers for our HPC cluster, and we deploy them directly to CVMFS, to have them readily available on all worker nodes and to take advantage of the superior caching mechanics of CVMFS (we put extracted images there, not image files).
> So there is no push / pull infrastructure required, and our users have access to our local CVMFS repository (to "read" containers).
> Of course, a registry has quite a few more nice features, but for us the overhead (web server, database, worker, etc.) was too large for the requirements at hand.
>
> It might be very useful if also users should be able to modify container recipes, but right now only we admins are taking care to prepare that ;-).

This would be the rationale for Github. If you have a recipe in a Github repo, a user can easily contribute via PR. Then you discuss, make changes, and it goes through tests. When the merge is done, it deploys with a push to your registry. That way you have user contributions to (admin-)served images, and it happens automatically with the PR merge/push.

> So all in all, my suggestion would be to re-think the build context concept to follow the safer and more reproducible concept from Docker,
> and at least document what the build context is (as Docker does).
> As a bonus, the "--buildroot" parameter (or whatever name is preferred) would be a nice-to-have.

This is an idea that would be useful to explore. Could you please add an issue to the board?

 

> For now, however, I think I will solve my issue at hand by just extending the wrapper to explicitly enter the directory in which the build recipe is contained,
> and (sym)link all common resources into place, so they will be found right next to (or in a directory below) the build recipe.

haha, awesome. Thanks to you too! :) 

--
Vanessa Villamia Sochat
Stanford University '16

Oliver Freyermuth

Oct 6, 2017, 2:31:25 PM
to vanessa s, singu...@lbl.gov
Hi Vanessa,

many thanks for the follow up :-).

On 06.10.2017 at 19:25, vanessa s wrote:
>> For docker (please correct me if I am wrong), the build context is enforced to be the place where the Dockerfile is placed, so again it does not matter where I call the actual "docker build" binary.
>> If singularity really uses the working directory where the binary is started as build context, I consider this a portability and reproducibility issue,
> Oh sorry, I misspoke then - Singularity is an executable added to your path that you can call from anywhere. It works just as Docker does with regards to the build recipe, but is more flexible because it doesn't automatically add everything in sight (e.g., why .dockerignore exists).
Yes, I agree, singularity is much nicer in this regard (there are also many things I dislike about the Dockerfile format itself...).

>> since a user always has to enter the directory where the singularity recipe is placed to achieve reproducible behaviour on his / her machine.
> Another clarification - this is /only/ the case when you build. Before you build the image and have different files, of course it's not a reproducible container. Once you build the container, then you can be assured reproducibility as everything is packaged inside.
Yes, I know - my complaint was really about the reproducibility of the actual build procedure.

>> %files
>> .ssh/id_rsa /home/user/.ssh/id_rsa
>> .ssh/id_rsa.pub /home/user/.ssh/id_rsa.pub
> Is that even a good idea? These kinds of things might be more appropriate for a user to map to their own home (so the container doesn't store them, and in fact if you are in home you would map the calling user's home directory over it anyway) or to produce (again mapped to the user's host) at runtime. I don't think it's generally good practice to store credentials, certificates, etc. in the (static / not used) image.
For sure this is a very bad idea - I would never actually do that, this was just an example to get the idea.

> yeah DON'T do that. That borders on malicious to have a user build a container and knowingly copy their secrets (whatever they might be) into it.
I won't - but it would be nice if it were not possible from the start ;-). Better to protect the users as well as possible ;-).

> This is also an issue of education about security. This is just the same as saying I could add credentials to a Github repo and push it. But I think I get what you are saying - the user would be doing this unknowingly. All I can say is that they should look at the recipe file first, OR use a pre-built container that can't take their files and distribute them.
That's exactly my point. Even from the recipe, it would not be obvious that the container might pick up my credentials if I start "singularity bootstrap" from inside my home,
unless you know that singularity, unlike Docker, uses the working directory as build context and not the location of the Singularity build recipe.
Since most people will know Docker, that difference may come as a huge surprise.

> I still don't follow. Singularity is exactly the same as Docker in this regard, the commands ADD/COPY correspond to our %files section, and the paths are relative to the build folder.
Basically, what I'd like to point out is that docker closes off all directories outside of the build directory - and the build directory is explicitly set to the location of the build recipe, and not the PWD.
This prevents any accidental or malicious attempt to pick up stuff from the host which should not enter the container.

> So you are saying you want to completely close off directories /outside /of the PWD / build directory? So I could add files relative, but not do something like ../../parent/directories ? That seems reasonable to do, we could add some kind of variable for a "build_dir" and spit out an error if the user tries to copy / go outside of it.
Exactly :-).
I think this would be safer in the long run, and protect you from accidentally shooting yourself in the foot (at least in this way ;-) ).
Additionally, I would prefer the default build context to explicitly be the location of the Singularity recipe instead of the PWD, but this goes hand in hand.

>> As it is now, everybody building a singularity container seems to have to change manually (or via wrapper) to the directory of the build recipe before bootstrapping to ensure things work out.
>> This is something I dislike, and it's not even documented (I think) that the build context is the working directory.
> /shrug here, I guess I am naturally inclined to want to keep different projects / containers in their own folders, to match with my github repos.
I agree for the "classical" use case. In our case, we would like to build containers for different OSes - but meant to run on the same infrastructure,
and provide comparable environments.
So it is natural they would share some resources, like stuff to go into /etc/profile.d/.
But symlinks can solve that as well ;-).

>> It might be very useful if also users should be able to modify container recipes, but right now only we admins are taking care to prepare that ;-).
> This would be the rationale for Github. If you have a recipe in a Github repo, a user can easily contribute via PR. Then you discuss, change, and it goes through tests. When the merge is done, then you have it deploy with a push to your registry. That way you have user contribution to (admin) served images, and it happens automatically with the PR merge/push.
We will consider that in a later stage of the project, when our users are more used to containers.

>> So all in all, my suggestion would be to re-think the build context concept to follow the safer and more reproducible concept from Docker,
>> and at least document what the build context is (as Docker does).
>> As a bonus, the "--buildroot" parameter (or whatever name is preferred) would be a nice-to-have.
>
> This is an idea that would be useful to explore. Could you please add an issue to the board?
Good idea! Here it is: https://github.com/singularityware/singularity/issues/1025

> haha, awesome. Thanks to you too! :)
Thanks to you, singularity is awesome :-).

vanessa s

Oct 6, 2017, 3:42:58 PM
to Oliver Freyermuth, singu...@lbl.gov
> Yes, I know - my complaint was really about the reproducibility of the actual build procedure.

This is a tough one! The premise is that there is one "right" way to build an image. The most basic thing the software can take account for is having one recipe that is used to completely re-make the image. What we don't control for is the content that is added to it - this is a decision of the user. From an analysis of the software, given this basic function, reproducibility is assured.

What Singularity does not cover (as it's out of scope) is the manager that sets up the particulars of some build to then call a singularity build.  Think of all the options and needs there are! I'll go over the big ones I see:

Single Users (Cloud)
Many users don't have an HPC center that can afford to do much more than install Singularity. This is the reason for Singularity Hub - you can use Github (just as researchers normally do) and push a file called "Singularity", put file dependencies in the repo, assume the build is done relative to the repo base, and the images build automatically and are delivered via the singularity shub:// uri. Another cloud manager of this type could reproduce perfectly just knowing the repos.

Institution (possibly private)
Singularity Hub didn't meet the need of institutions that didn't want their images to be public, or didn't want to have 1000 Github repos. They also want to build in their own particular way (build server? slurm? still Github with user PRs?). This is the rationale for Singularity Registry - you do your thing, then push images to it, and they are also available via shub://

SLURM / HPC
This is the use case not yet addressed, mostly because every cluster is so different that I couldn't imagine a "one thing works for all" solution. Actually, I started making a solution for this - a singularity image with a builder and enforced internal structure --> https://github.com/singularityhub/singularity-registry - but after spending a few months on it, I heard from most that they wanted Docker, and just started over.

The best I can offer, aside from writing your own little bash script with specifics for the cluster, is to think about the recipe. Arguably, if you didn't use local files but just retrieved them from a secure place (checking your sums and whatnot), then the entire Singularity build process would be reproducible from anywhere with a definition file and an internet connection. You also might want to think about macros in the recipe - given a simple format for keeping things (see the repo I linked above for what I was working on), some user could plug in their base directory as a macro, and files would be copied relative to that. Without the macro, it would assume PWD. That would be a cool idea!
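For instance (a sketch - the URL and checksum are placeholders, and curl is assumed to exist in the image), the recipe could fetch and verify everything in %post instead of using %files at all:

%post
    # fetch the resource from a trusted server instead of the build host
    curl -fsSL https://example.org/profile/setupATLAS.sh -o /etc/profile.d/setupATLAS.sh
    # fail the build if the content doesn't match the expected checksum
    echo "<expected-sha256>  /etc/profile.d/setupATLAS.sh" | sha256sum -c -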

> This is also an issue of education about security. This is just the same as saying I could add credentials to a Github repo and push it. But I think I get what you are saying - the user would be doing this unknowingly. All I can say is that they should look at the recipe file first, OR use a pre-built container that can't take their files and distribute them.

yes!
 
> That's exactly my point. Even from the recipe, it would not be obvious that the container might pick up my credentials if I start "singularity bootstrap" from inside my home,
> unless you know that singularity, unlike Docker, uses the working directory as build context and not the location of the Singularity build recipe.
> Since most people will know Docker, that difference may come as a huge surprise.

To be specific again, this PWD is only relevant for %setup and for adding files in %files, which is what we've been talking about. If you build from a folder and add file foo, it is going to look for the file in that folder before the %post section is executed. Once you are in %post, the build itself happens somewhere under /var (I don't remember the full path, actually).
 
> I still don't follow. Singularity is exactly the same as Docker in this regard, the commands ADD/COPY correspond to our %files section, and the paths are relative to the build folder.
> Basically, what I'd like to point out is that docker closes off all directories outside of the build directory - and the build directory is explicitly set to the location of the build recipe, and not the PWD.
> This prevents any accidental or malicious attempt to pick up stuff from the host which should not enter the container.

Yes this makes a lot of sense - I'd like to see this added to Singularity.  

> So you are saying you want to completely close off directories /outside /of the PWD / build directory? So I could add files relative, but not do something like ../../parent/directories ? That seems reasonable to do, we could add some kind of variable for a "build_dir" and spit out an error if the user tries to copy / go outside of it.
> Exactly :-).
> I think this would be safer in the long run, and protect you from accidentally shooting yourself in the foot (at least in this way ;-) ).
> Additionally, I would prefer the default build context to explicitly be the location of the Singularity recipe instead of the PWD, but this goes hand in hand.

yes that's very Dockery :)
 

> We will consider that in a later stage of the project, when our users are more used to containers.

great! I hope they catch up faster than I, I'm still not used to these container things :)
 
>> So all in all, my suggestion would be to re-think the build context concept to follow the safer and more reproducible concept from Docker,
>> and at least document what the build context is (as Docker does).
>> As a bonus, the "--buildroot" parameter (or whatever name is preferred) would be a nice-to-have.
>
> This is an idea that would be useful to explore. Could you please add an issue to the board?
> Good idea! Here it is: https://github.com/singularityware/singularity/issues/1025

> haha, awesome. Thanks to you too! :)
> Thanks to you, singularity is awesome :-).

thanks!! +1 on buildroot.
 