Best practice for uploading containers to sregistry with tags/versions


Mike Moore

Oct 17, 2018, 3:30:10 PM
to singularity
Hi,

  So, I was wondering if anyone had any recommendations or best practices around the tagging/versioning of images uploaded to sregistry?  I've seen cases where a container will have both a "latest" and a <version> entry in the registry.  Is it simply one upload with a "--tag latest --tag <version>", or two separate uploads, one for latest and one for <version>?  Just trying to figure out the best way to do this and provide the correct documentation to our users/developers.

-Mike


v

Oct 17, 2018, 3:48:34 PM
to singu...@lbl.gov
Hey Mike,

I can definitely help answer this one! Do you mean Singularity Registry Server, or Singularity Hub? They are slightly different so I can adjust my answer based on that. The short answer is that the tag "latest" is nothing special: it is just the default tag used when the user doesn't specify one (e.g., built from the "Singularity" file in the Github repo (no extension), or pulled without specifying a tag, as in shub://vsoch/hello-world). This is a convention taken from Docker, and to be honest I don't think it's the best practice, because "latest" now is not "latest" later. It's generally much better to pull with a version (either a commit or container hash for Singularity Hub), and as a builder you don't need to think about this (it is figured out automatically from the container and from your repository).

For Singularity Registry, we can't derive a commit, but we can derive the container hash. So generally I would think of a tag as a moving thing (e.g., latest today isn't latest tomorrow), and when you interact with images (such as specifying usage of one in a script) it's best practice to include the hash (shub://<username>/<repo>:<tag>@<hash>).
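
A minimal sketch of building such a pinned reference in a script. The function name pinned_uri is made up for illustration. It also assumes the registry's hash is the MD5 digest of the image file, which is consistent with the 32-character hash in the push output later in this thread but is not confirmed here.

```shell
#!/bin/sh
# Build a pinned reference of the form <collection>/<name>:<tag>@<hash>.
# ASSUMPTION: the hash is the MD5 digest of the image file.
# pinned_uri is an illustrative helper, not part of sregistry.
pinned_uri() {
    name="$1"    # e.g. chicken/nugget
    tag="$2"     # e.g. ketchup
    image="$3"   # path to the image file on disk
    # md5sum prints "<digest>  <path>"; keep only the digest
    hash=$(md5sum "$image" | cut -d' ' -f1)
    printf '%s:%s@%s\n' "$name" "$tag" "$hash"
}
```

Calling pinned_uri chicken/nugget ketchup /path/to/image.simg prints a string you can paste into a script, so the script keeps pointing at the same file even if the tag moves.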

What is missing for Singularity Registry are better hooks into CI (Circle or Travis). I've actually shown this before with Travis ---> https://github.com/singularityhub/singularity-ci/ and hooks like that would push the container and also give the registry a way back to its source (the Github repo and commit!). The good news is that we don't need some special builder or plugin, just an example recipe that will do the build and push. I'd be happy to write this up for you (was planning on updating that repo and making one for Circle too :)

Best,

Vanessa

--
You received this message because you are subscribed to the Google Groups "singularity" group.
To unsubscribe from this group and stop receiving emails from it, send an email to singularity...@lbl.gov.


--
Vanessa Villamia Sochat
Stanford University '16

v

Oct 17, 2018, 4:13:40 PM
to singu...@lbl.gov
To be more specific answering your question - there is no universal best practice, because it depends on your use case. I can answer under the context of wanting to create then use reproducible software.

1. If you build a container and you have no link back to its recipe, under version control, you can never go back and recreate it, if needed. You can't hand it off to a colleague and have him/her understand how the guts were generated. In my world, a container built under any context without a recipe linked to it, and one that is version controlled, is not good enough.
2. If you build a container and have no place to put it where it's programmatically accessible, this is another fail case. I can build and have containers on my local machine to my heart's content, but if others aren't able to grab the exact same container, via some API, well that doesn't get me very far either.
3. This final point is what I think of as the "scrappy developer" quality. If you don't do 1 and 2 using tools and resources that are open and publicly available, it doesn't matter if you can afford to use it, the next person that you want to share with (that cannot) isn't as lucky as you, and you fail. For example, as a scrappy developer I would want to take advantage of building containers in the most version controlled and accessible way that I can. This means using services like CircleCI and Travis because A., the build recipes and the configs for the CI are both under version control, together, and B. Anyone can literally clone my repo, connect to the (free) service, and get the exact same thing. The containers built satisfy conditions 1. and 2. above. The only additional thing you need here is a Registry to push them to, but you can have confidence that if the registry explodes, you at least have a history of how it was built, and the recipe it was built from.

Outside of those points, you can do whatever you want. You can have a big folder of containers you build in random places (local? some remote you are paying for?) but at the end of the day, you can't really reproduce that, or at least a third party that has (or loses) your container and wants to re-generate it cannot. You can choose to keep the recipe internal to the container, but then you delete the container, oops so much for that. If I were a general researcher, I would maximally take advantage of simple, free resources like Github and Continuous Integration (to build) and then for the registry, optimize based on what is available to me. Not sure if any of this is helpful, but glad to chat more :)

Best,

V

Mike Moore

Oct 17, 2018, 4:22:13 PM
to singularity
Hi Vanessa,

  I am primarily talking about Singularity Registry.  I am OK with just pushing versioned containers and not "latest".  I was just trying to understand whether "latest" could just be an extra tag applied to an upload with a <version> tag so that we can declare a newer release as latest, or if it would have to be its own separate upload.  In a more general sense, can we have multiple tags for different things with a single image upload or is it one tag per upload?

-Mike

Mike Moore

Oct 17, 2018, 4:30:01 PM
to singularity
Oh, I completely agree with your points.  We are going to have a system with github/gitlab for recipes, some sort of CI to connect the recipes to the build process and publication via sregistry.  I'm working on two fronts.  One is to introduce Singularity to our environment as a Preview, with fewer restrictions.  While our user base is learning/digesting that, I'll get the version control/CI/large sregistry system in place to allow us to tie the image back to the recipe.  I have nightmares about how would we get everything back if the world blew up...  We're not there yet...  But this is a step on the road.

Thanks!
-Mike

v

Oct 17, 2018, 4:38:32 PM
to singu...@lbl.gov
Right now I'm still thinking that it's best to not introduce an additional layer of abstraction (meaning having multiple tags pointing at the same container) because then we lose the 1:1 ratio between a container and what (is supposed to be) somewhat of a URI. If the user (oups!) forgets that "release1.0" also refers to "release1.0test" and then does a force delete of the test, well then either Singularity Registry needs to protect the user from him or herself, or we just abide by the command and delete both images (not what we wanted to do). So how to avoid this? We have the very simple "rule" that a container uri, meaning a <repo>/<username>:<tag>, has a 1:1 association with a container. Yes, you can push the same tag multiple times (and get different versions based on the hash, e.g., <repo>/<username>:<tag>@<hash>), but minimally if you modify some tag1, you can be sure it's not going to accidentally also modify some other tag2. Does that make sense?

Your project sounds exciting! And actually, by using the CI as a middleman you can keep your registry pretty locked down with respect to giving push access - you can essentially manage / control via Github, have many people work on a container (without giving write permission to the registry) via pull requests, and then also get the version control bit in too!

To go back in time a bit, the entire setup of Singularity Hub is based on the idea that if the entire thing blew up, you wouldn't need anything other than the Github repos to rebuild the images. Everything from the recipes, to the tags, are kept there. This is why I don't let users define tags (in the interface) on the fly, which is something you can do in Docker Hub. I had those nightmares too :)

Another thing to remember (or point out) is that you don't necessarily have to use Singularity Registry, if your institution is set up to have some other service. For example, the sregistry client will push images to most Google places, or AWS, and the (still cool part!) is that you can achieve this same thing still via the same nicely set up continuous integration.

v

Oct 17, 2018, 4:45:22 PM
to singu...@lbl.gov
And for both the registry and the client to interact with storage endpoints, both are completely, truly, open source!


I do my best to respond to the needs of the community, but there is only one of me, so I generally have to triage when it comes to requests for features. But this also means that if something is important to you, it's probably important to others as well, and well, having it be implemented is just a pull request away :) It is hacktoberfest after all right? 

Anyway, I'm really excited about some of the CI stuff, I'll work on that soon and have some cleaner examples to include CircleCI too. Here is the early work I did, probably still good :)

v

Oct 17, 2018, 5:51:45 PM
to singu...@lbl.gov
I just brought up a local registry to do some more testing for you, and here is a little more information to help. First, here is how I'm pushing an image without specifying a tag (note that I've exported SREGISTRY_CLIENT=registry so it's not trying to interact with another endpoint):

$ sregistry push --name chicken/nugget /home/vanessa/Desktop/busybox.simg 
[client|registry] [database|sqlite:////home/vanessa/.singularity/sregistry.db]
[1. Collection return status 200 OK]
[================================] 0/0 MB - 00:00:00
[Return status 200 Upload Complete]

I'll only show that chunk of output this first time. I can now use the "search" command to see the containers in my registry. Note that since I didn't supply a tag, it pushed as "latest":

$ sregistry search 
Collections
1  chicken/nugget:latest http://127.0.0.1/containers/1

Let's say I push again. Again, no tag. What happens?

$ sregistry push --name chicken/nugget /home/vanessa/Desktop/busybox.simg 
$ sregistry search 
1  chicken/nugget:latest http://127.0.0.1/containers/2

The subtle difference is that the URL for the container is different - instead of ending in 1, we have 2. This isn't really a meaningful identifier, but it does tell you that the container is different. It used to be the case that I would maintain all pushes of a container, but at some point (with Singularity Hub) I found that users were pushing infinitely, not noticing or realizing I was saving every version and there is no way that this is maintainable (since Singularity Hub has tight funding). So the default is that if you, as a user, push the same container again, you understand you are essentially replacing the previous one like that. Unless you freeze it. What do I mean by that? Let me show you. First, let's push a container with a different tag.

$ sregistry push --name chicken/nugget:ketchup /home/vanessa/Desktop/busybox.simg 
$ sregistry search
Collections
1  chicken/nugget:ketchup http://127.0.0.1/containers/3
2  chicken/nugget:latest http://127.0.0.1/containers/2

Now we have the ketchup tag! Woohoo! And we can verify (if we have access to the web interface) that there are two containers in collection "chicken"

image.png

See the unlocked green button? Neither of these containers is frozen. We can overwrite them. Let's freeze one, shall we?

image.png

Oooh now it's blue! Let's try pushing ketchup again from the command line, now that it's frozen.

$ sregistry push --name chicken/nugget:ketchup /home/vanessa/Desktop/busybox.simg 
[client|registry] [database|sqlite:////home/vanessa/.singularity/sregistry.db]
[1. Collection return status 200 OK]
[================================] 0/0 MB - 00:00:00
[Return status 200 chicken/nugget:ketchup@80897b282135094777f9157d9273572a already exists.]

This would be how to protect your images and prevent overwrite, if needed. If you want to provide a "latest" tag, then allow for an overwrite to latest. For a specific version? Push that container with some release-1.2 and freeze it.
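
To tie this back to the original question, here is a minimal sketch of publishing one image file under both a fixed version tag and a moving "latest" as two separate pushes, using only the push syntax shown above. The names push_release, run, and DRY_RUN are made up for illustration and are not part of sregistry. Freezing the versioned container is still done in the web interface as described above.

```shell
#!/bin/sh
# Push one image file under a version tag and again under "latest".
# push_release, run and DRY_RUN are illustrative names, not sregistry features.
DRY_RUN="${DRY_RUN:-1}"   # default: only print the commands

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$@"         # show what would be executed
    else
        "$@"              # actually execute the command
    fi
}

push_release() {
    name="$1"      # e.g. chicken/nugget
    version="$2"   # e.g. release-1.2
    image="$3"     # path to the image file
    run sregistry push --name "$name:$version" "$image"
    run sregistry push --name "$name:latest" "$image"
}
```

With the default DRY_RUN=1, push_release chicken/nugget release-1.2 busybox.simg only prints the two push commands. Set DRY_RUN=0 (and SREGISTRY_CLIENT=registry, as above) to run them for real.
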
Now what about the integrity of the files themselves? The list doesn't show it, but the version (hash of the file) is maintained in the database:

image.png

and accessible via the API:

image.png
and when you pull it, you can list your own images:

image.png

Yes, the hash is duplicated for two images because they are a push of the same file (with different name) because I'm lazy :)

and then inspect them for more information, this is what the singularity registry client provides in its local database:

image.png

So based on what I've shown you here, we have a very easy strategy for linking the recipes to an automated build with version control!

 - add the recipes to Github
 - link to Travis or Circle
 - push to sregistry after build, testing, discussion, and for the push have the "tag" be the commit id (and any other additional tags that are meaningful like release, or latest).
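
As a rough sketch of the last step, a CI job could derive the tag from the commit being built. This assumes the job runs on Travis, which sets TRAVIS_COMMIT, and falls back to asking git otherwise. The helper names commit_tag and push_command are made up for illustration, and the script only prints the push command rather than running it.

```shell
#!/bin/sh
# Choose a tag for the push: the CI-provided commit if set, else ask git.
# TRAVIS_COMMIT is set by Travis CI; commit_tag and push_command are
# illustrative helper names, not part of sregistry.
commit_tag() {
    if [ -n "${TRAVIS_COMMIT:-}" ]; then
        printf '%s\n' "$TRAVIS_COMMIT"
    else
        git rev-parse HEAD
    fi
}

# Print the push command a CI job could run once build and tests pass.
push_command() {
    name="$1"    # e.g. chicken/nugget
    image="$2"   # path to the freshly built image
    printf 'sregistry push --name %s:%s %s\n' "$name" "$(commit_tag)" "$image"
}
```

Because the tag is the commit id, anyone holding the container reference can find the exact recipe revision it was built from.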

Best,

Vanessa



v

Oct 17, 2018, 7:38:44 PM
to singu...@lbl.gov
Okey doke, I updated the travis builder example for you!


I worked on CircleCI but... no virtualization of mount namespace allowed! So we aren't there yet. But there are a crapton of CI services (even Github just announced a native one!) so we have many options. :)

Mike Moore

Oct 18, 2018, 10:20:20 AM
to singularity
Wow.  Thanks Vanessa.  We are just thinking about our build system now.  We released access to our POC sregistries recently and are beginning to look at scaling issues and general questions about Singularity usage, user permissions, and things like container versioning and how that is handled in the registry.

You mentioned that we could use other storage endpoints like google spaces or AWS.  I might be mistaken, but I didn't think Singularity itself could pull down an image from anywhere but docker://, shub://, or local paths.  Could you do something like?

singularity shell https://<url to image in say Amazon S3>

and have it work?  Or do we have to do separate pull using sregistry-cli before we run singularity?  It would certainly simplify the architecture if we could take advantage of on-premise and Cloud object storage solutions for container images.  That would also help out with some teams who standardized on Docker images without input on the security implications of that model.

Again, thank you so much for the work you have done here with the CI work.  Our goal is to get to the point where the recipes are the canonical source of our containers...  That is a lot less to backup than multiple multi-gigabyte images.


v

Oct 18, 2018, 10:38:15 AM
to singu...@lbl.gov
Hey Mike,

Singularity itself can handle a lot of docker registry locations - the previous versions (before 3.0) had working pulls for URIs for AWS, Nvidia, and of course Docker Hub. I haven't tried them since, but they should hopefully be working because conforming to OCI makes this easier! The sregistry client (Singularity Registry Global Client) is also optimized to do just the management commands (pull, push, inspect, mv, search, etc.) given some remote endpoint (e.g., nvidia, aws, docker hub, google storage, etc.). There are instructions for each of the clients here --> https://singularityhub.github.io/sregistry-cli/clients and of course if you have any issues please open one and I will help out!

So in summary:

 1. For standard docker registries, first try actions (pull, shell, etc.) with Singularity native.
 2. For more management (e.g., you pull an image and keep it organized, push to a non docker registry) you can try Singularity Registry Global Client (sregistry-cli)
 
To take advantage of (most) cloud storage that *isn't a proper docker registry* (aws, nvidia, and Docker Hub are proper registries), sregistry-cli can help you out. The goal of the software is to allow for flexibility, because most institutions have different kinds of storage they would put their images in, not necessarily a proper registry.  If you do want a proper registry for Singularity Images, then I'd again suggest Singularity Registry with CI to push. The (long ago derived) goal of this was to be an open source registry that is powered by contributions of its users.

Oh! And with a little help from the Circle crew, I got the circle-ci example working! --> https://github.com/singularityhub/circle-ci And guess what? This is kind of nuts, but images up to 3GB will actually be storable as artifacts! Meaning you can use their API (you get a token) to download them with curl. That is really sweet, because it's a total version control --> build --> storage without needing any additional anything. I wouldn't trust the storage for long term (e.g., a publication) but for short term (while developing or otherwise) it's a really easy solution for building without sudo. You just can't have images that are too big :)

Github just added actions too - which introduces another beautiful opportunity for building. I signed up for beta but don't have the button yet :) 

Best,

Vanessa

--
You received this message because you are subscribed to the Google Groups "singularity" group.
To unsubscribe from this group and stop receiving emails from it, send an email to singularity...@lbl.gov.

v

Oct 18, 2018, 10:43:54 AM
to singu...@lbl.gov

Not to be confused with some of your previous professors in the math department. Different kind of artifact :)