Proposal: Trusted Builds - finding and trusting the source of a container on the public registry


Solomon Hykes

Aug 27, 2013, 8:29:43 PM
to docker-dev
So far the public registry has been a great way to get started and discover the possibilities of Docker, but many are hesitant to actually use third-party containers in their application because there's no easy and trustworthy way to *get the sources* of these containers.

Here are the 2 questions I hear the most about the public registry:

- Question 1: "I found a cool container on the public registry. How can I find its Dockerfile and its source?"

- Question 2: "I don't trust this random binary download on the registry. How do I know for sure what source it was built from?"


To solve this problem, I propose to add a "trusted build" feature to the public registry.

This feature would provide an alternative way to publish a container on the public registry. Instead of the usual binary upload via "docker push", the publisher could submit a source URL to the registry. That would then trigger a build process *managed by the Docker team*, which would pull the source from the given URL, build the container, and push it.

A container published via a Trusted Build would be accessible in all the usual ways. But it would be distinguishable by 1) a special "trusted build" notice on the page, and 2) a link to the source it was built from. Conversely, images which are not trusted builds could have a notice indicating that we cannot guarantee their content.

This would provide a solid answer to both questions:

"How can I find its Dockerfile and its source?" Answer: if it's a trusted build, just follow the source link!

"How do I know for sure what source it was built from?" Answer: if it's a trusted build, we (the docker team, and specifically the infrastructure maintainers of the docker project) guarantee that the container was built from the indicated source. Obviously you need to trust the Docker team and the source - but you don't need to trust an arbitrary third-party binary.
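To make the proposed flow concrete, here is a minimal sketch of the registry-side pipeline. Everything here is hypothetical (the names `TrustedBuild`, `build_from_source`, and the injected `fetch`/`build`/`push` steps are illustrative, not real Docker API): pull the source, build on infrastructure the Docker team controls, push, and record the source link.

```python
# Hypothetical sketch of the registry-side trusted-build pipeline.
# All names below are illustrative; nothing here is real Docker API.
from dataclasses import dataclass

@dataclass
class TrustedBuild:
    image_id: str
    source_url: str   # the source link shown on the registry page
    trusted: bool

def build_from_source(source_url, fetch, build, push):
    """Run the whole pipeline on registry-controlled infrastructure,
    so the published image is guaranteed to come from source_url."""
    source = fetch(source_url)    # e.g. git clone
    image_id = build(source)      # e.g. docker build
    push(image_id)                # publish to the registry
    return TrustedBuild(image_id=image_id, source_url=source_url, trusted=True)
```

A plain "docker push", by contrast, skips the `fetch` and `build` steps and publishes an opaque binary, which is exactly why it could not carry the "trusted build" notice.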



So, consider this a request for feedback. Do you consider this useful? Would you use it?

Jérôme Petazzoni

Aug 27, 2013, 8:52:54 PM
to Solomon Hykes, docker-dev
That would be awesome + it would be a great stepping stone to multi-arch!


--
You received this message because you are subscribed to the Google Groups "docker-dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to docker-dev+...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Brandon Philips

Aug 28, 2013, 1:51:02 PM
to Solomon Hykes, docker-dev
On Tue, Aug 27, 2013 at 5:29 PM, Solomon Hykes <sol...@dotcloud.com> wrote:
> This feature would provide an alternative way to publish a container on the
> public registry. Instead of the usual binary upload via "docker push", the
> publisher could submit a source URL to the registry. That would then trigger
> a build process *managed by the Docker team*, which would pull the source
> from the given URL, build the container, and push it.

I have been thinking about signed containers recently too. But, from a
slightly different angle:

1) Establishing trust: I want a mechanism inside of docker push to GPG
sign a container tag just like signing a git tag. And then to verify
that tag on a docker pull. At the end of the day I need to trust the
developer not to have nefarious code and having nefarious code built
by someone else doesn't add much to the security. But! Being able to
verify that the developer I trust was actually the one to release
version X of the container is a really useful property.

We should be able to do this all in native Go too:
https://code.google.com/p/go/source/browse/?repo=crypto#hg%2Fopenpgp

2) Source discovery: The source code discovery is really important too
and that would be a nice feature to add.

3) Random aside: It would be cool if you could start from a signed git
tag in source, build the container from it and then verify the built
container matches the developer's container bit for bit. Debian and
Tor have been talking about something like this:
https://wiki.debian.org/ReproducibleBuilds#Why_do_we_want_reproducible_builds.3F
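The signing flow in point 1 could be sketched like this. This is a toy stand-in only: a real implementation would use OpenPGP keys (e.g. via the Go openpgp package linked above), not the HMAC used here to keep the example self-contained; the point is where sign and verify hook into push and pull.

```python
# Toy sketch of where tag signing and verification hook into push/pull.
# HMAC is a stand-in for real OpenPGP signatures; do not use this as-is.
import hashlib
import hmac

def sign_tag(key: bytes, tag: str, image_bytes: bytes) -> str:
    """On 'docker push': bind the tag name to the image content."""
    digest = hashlib.sha256(image_bytes).hexdigest()
    return hmac.new(key, f"{tag}:{digest}".encode(), hashlib.sha256).hexdigest()

def verify_tag(key: bytes, tag: str, image_bytes: bytes, signature: str) -> bool:
    """On 'docker pull': check the signature before trusting the image."""
    expected = sign_tag(key, tag, image_bytes)
    return hmac.compare_digest(expected, signature)
```

With real asymmetric keys, anyone holding the developer's public key could verify that version X of the tag really came from that developer, which is the property Brandon describes.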

Brandon

Paul Nasrat

Aug 28, 2013, 2:22:19 PM
to Brandon Philips, Solomon Hykes, docker-dev
On 28 August 2013 13:51, Brandon Philips <bra...@ifup.co> wrote:
On Tue, Aug 27, 2013 at 5:29 PM, Solomon Hykes <sol...@dotcloud.com> wrote:
>> This feature would provide an alternative way to publish a container on the
>> public registry. Instead of the usual binary upload via "docker push", the
>> publisher could submit a source URL to the registry. That would then trigger
>> a build process *managed by the Docker team*, which would pull the source
>> from the given URL, build the container, and push it.
>
> I have been thinking about signed containers recently too. But, from a
> slightly different angle:
>
> 1) Establishing trust: I want a mechanism inside of docker push to GPG
> sign a container tag just like signing a git tag. And then to verify
> that tag on a docker pull. At the end of the day I need to trust the
> developer not to have nefarious code and having nefarious code built
> by someone else doesn't add much to the security. But! Being able to
> verify that the developer I trust was actually the one to release
> version X of the container is a really useful property.
>
> We should be able to do this all in native Go too:
> https://code.google.com/p/go/source/browse/?repo=crypto#hg%2Fopenpgp
>
> 2) Source discovery: The source code discovery is really important too
> and that would be a nice feature to add.

Source discovery and linking are probably required in the case of GPL software. At the moment, if I put up a base container with a modified binary from the original distribution, there is no path to get the source.

> 3) Random aside: It would be cool if you could start from a signed git
> tag in source, build the container from it and then verify the built
> container matches the developer's container bit for bit. Debian and
> Tor have been talking about something like this:
> https://wiki.debian.org/ReproducibleBuilds#Why_do_we_want_reproducible_builds.3F

Interesting - at the very least, if you are composing from packages, you should be able to compare the checksums (e.g. rpm -Va). I think there is definitely some form of docker container workflow that includes an optional verification step.

Paul 

Tianon

Sep 3, 2013, 5:59:20 PM
to docke...@googlegroups.com
What if I were to create an image using this method, but use FROM on an untrusted image?  Would my new image be a trusted image, or is it automatically sullied by being based on an untrusted image?

Also, wouldn't the original ubuntu base images be untrusted, since they're not created from a Dockerfile, and are instead created by debootstrap?  I realize you could manually mark those as "trusted", since they are official after all, but that would preclude other debootstrap-based images from being trusted, even if they're created using a standard, repeatable method/script.

Don't get me wrong, I really love the idea, I just want to make sure the obvious issues are covered.

Solomon Hykes

Sep 3, 2013, 8:34:02 PM
to Tianon, docker-dev
Hey Tianon,

On Tue, Sep 3, 2013 at 2:59 PM, Tianon <admw...@gmail.com> wrote:
> What if I were to create an image using this method, but use FROM on an untrusted image? Would my new image be a trusted image, or is it automatically sullied by being based on an untrusted image?

Good question!

First, to be specific, the *build* would be trusted, not the image. In other words, you can trust that the image advertises authentic source code (Dockerfile included). But if the source does something evil, the image will too, and a trusted build won't protect you against that.

In other words: Trusted Image = Trusted Source + Trusted Build.


To answer your question: I think a trusted build should be trusted all the way down. So having an untrusted dependency (in the form of a FROM pointing to an untrusted build) should sully your build and make it untrusted as well.
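That "trusted all the way down" rule could be sketched as a recursive check over the FROM chain (all names hypothetical; this is a model of the rule, not registry code):

```python
# Sketch of the "trusted all the way down" rule: an image is a trusted
# build only if it was built by the registry pipeline AND every ancestor
# in its FROM chain was too. Names are hypothetical.
def is_trusted(image: str, parent_of: dict, built_by_registry: set) -> bool:
    if image not in built_by_registry:
        return False
    parent = parent_of.get(image)   # None means this is a base image
    return parent is None or is_trusted(parent, parent_of, built_by_registry)
```

A single untrusted FROM anywhere in the chain sullies everything built on top of it.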

> Also, wouldn't the original ubuntu base images be untrusted, since they're not created from a Dockerfile, and are instead created by debootstrap? I realize you could manually mark those as "trusted", since they are official after all

One way or the other we have to deal with the turtle at the bottom :)

You're right, we don't exactly mark images manually, but we do something similar. As it turns out, the compressed output of a minimal debootstrap is small enough to be comparable in size to a very large source repository. So we just did this: https://github.com/dotcloud/ubuntu-quantal

In other words, base images will also have a Dockerfile. It doesn't make the need for bootstrapping go away - but it makes it less of a special case for our build and release toolchain.

> , but that would preclude other debootstrap-based images from being trusted, even if they're created using a standard, repeatable method/script.


Since base images are built from source+Dockerfile just like any other images, all the other rules apply identically. Anybody can 'docker build' any image, but only a trusted registry can provide you a trusted build.

> Don't get me wrong, I really love the idea, I just want to make sure the obvious issues are covered.

This is what the mailing list is for :) Tell me how you feel about the suggestions above.

Thanks for the feedback!



Tianon

Sep 3, 2013, 10:36:35 PM
to docke...@googlegroups.com, Tianon
I think that sounds quite excellent (especially your clarifications on the points I raised), and would certainly enjoy using such a feature.  This plan would also help to create a centralized library of Dockerfiles that one can learn cool Dockerfile techniques from.

When you say "managed by the Docker team", do you mean that you guys would just run the build server/farm for these specialized images, or that you would actually be reviewing the builds and Dockerfiles to ensure that they are benign (or something similar to that)?  If the latter, I'd be worried about the burden this places on the already heavy-laden Docker team, but otherwise this sounds like an excellent real-world use case for "docker-in-docker". :)

Bonus points for making sure this translates well to private registries, too.

Bo Shi

Sep 5, 2013, 1:35:35 AM
to Paul Nasrat, Brandon Philips, Solomon Hykes, docker-dev
Combining elements of Solomon's proposal and Brandon's comments begins
to sound a lot like a Launchpad PPA service for docker images. Neat.
That said, Solomon, do you think trust and/or discovery will compete
for developer resources with getting to "production-ready"? If so,
from my own perspective as a user planning to deploy a private
image repository, I would definitely opt for getting to
production-ready faster.

Thanks,
Bo


As an aside, I think Brandon's approach of de-coupling trust and
discovery is an elegant approach to addressing the questions you
enumerated.



--
Bo Shi
617-942-1744

Juan Batiz-Benet

Sep 25, 2013, 5:30:49 AM
to docke...@googlegroups.com, Paul Nasrat, Brandon Philips, Solomon Hykes

Trusted Builds sound great! I would particularly +1 trust propagating through dependencies, as Solomon pointed out.


I would strongly suggest that all repos/images in the main public registry become trusted builds. If an image is on the main docker index, it ought to be trusted, and have its source (Dockerfile) viewable. Having a "Trusted vs. Untrusted" dichotomy (and some images lacking source) is additional complexity thrown at users. A better design might enforce trusted builds throughout.


Other than the hassle of doing this for already published repos/images, are there technical problems preventing it? I'm new to docker, but perhaps:

- Already-published images can't be rebuilt without the maintainer re-uploading? (Maybe then enforce it for all builds going forward and ping maintainers?)

- Can't trust base images of less-popular/custom distros?


Cheers,
Juan

Stephen Handley

Oct 20, 2013, 9:19:46 AM
to docke...@googlegroups.com
I'm new to Docker, and so may be missing something obvious, but I don't understand the utility in organizing the public repo around images with Dockerfiles as secondary objects. It seems like the most general use case of Docker will be running proprietary code and an accompanying private image repo, where users create their own local images.

Having the public repository be organized primarily around Dockerfiles rather than images seems like it would be more useful, especially if there was support for software-specific templates (i.e. parameterized docker commands needed to install and run just "node" or "memcache" or "redis" or...). These could be combined to generate Dockerfiles which would themselves be shared as examples for the community to learn from. Instead of the current linear model, it would be more of a mixin-based approach to creating Dockerfiles. Vendors/maintainers of software projects could include a DockerfileTemplate in the root directory of their source, which would offer a "trusted" snippet that could be included in a Dockerfile via something like
TEMPLATE <name> <args>

so for example:
ADD ./mongodb.conf /data/mongodb.conf
TEMPLATE mongodb { conf : "/data/mongodb.conf", port: 27017, data : "/data/db" }

would be expanded into:
...
RUN apt-get install mongodb-10gen
RUN mkdir -p /data/db
EXPOSE 27017
ENTRYPOINT ["mongod", "-f", "/data/mongodb.conf"]

Instead of building machinery / process / convention around images and the accompanying centralized build and verification process, the docker team would just need to manage access to the template namespace. 
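Stephen's hypothetical TEMPLATE directive could expand roughly like this. Everything below is illustrative only (no such directive exists in Docker); the mongodb entry mirrors the example expansion in his message.

```python
# Toy expansion of the hypothetical TEMPLATE directive described above.
# Template bodies would be published by software vendors/maintainers.
TEMPLATES = {
    "mongodb": lambda conf, port, data: [
        "RUN apt-get install mongodb-10gen",
        f"RUN mkdir -p {data}",
        f"EXPOSE {port}",
        f'ENTRYPOINT ["mongod", "-f", "{conf}"]',
    ],
}

def expand(name: str, **args) -> list:
    """Expand 'TEMPLATE <name> <args>' into plain Dockerfile lines."""
    return TEMPLATES[name](**args)
```

The registry's job then shrinks from verifying builds to gatekeeping the template namespace, which is the trade-off Stephen is proposing.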

Joffrey Fuhrer

Oct 20, 2013, 2:34:20 PM
to Stephen Handley, docke...@googlegroups.com
Hi Stephen,

I think the short answer to "Why is the public index not about sharing Dockerfiles instead of images" is that images are immutable, while Dockerfiles in the general case can (will) produce different results based on a variety of factors, and Docker is all about the guarantee of immutability. 

I highly recommend reading through "The Whole Story" on the docker website, which explains (amongst other things) why this is important: http://www.docker.io/the_whole_story/





--
Joffrey F
[Docker-Team] docker-py and registry maintainer

Stephen Handley

Oct 20, 2013, 2:56:22 PM
to docke...@googlegroups.com, Stephen Handley
Hey Joffrey! Awesome, thank you for the clarification. Any links/info about what factors cause Dockerfiles to produce different results?

Joffrey Fuhrer

Oct 20, 2013, 6:32:13 PM
to Stephen Handley, docke...@googlegroups.com
No links, but it's actually pretty simple. It stems from the fact that any arbitrary shell command can be run from a Dockerfile.

The most common "real-life" example would be

FROM ubuntu
RUN sudo apt-get update

Depending on when it runs, this will yield different results, based on the state of the apt repositories at build time. In general, this means that anything that gets data from the network can and will change depending on the availability of the network, of the host you're contacting, of the file you're trying to reach, etc.
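The non-determinism boils down to a build step reading mutable external state. In the sketch below, a snapshot label stands in for "whatever the network returned that day":

```python
# Why the same Dockerfile is not the same image: if any build input is
# mutable external state (the apt index, a downloaded URL...), the output
# changes even though the Dockerfile does not. The snapshot label here
# stands in for that external state.
import hashlib

def build(dockerfile: str, external_state: str) -> str:
    """Model a build as a hash of everything that influenced it."""
    return hashlib.sha256((dockerfile + external_state).encode()).hexdigest()

dockerfile = "FROM ubuntu\nRUN apt-get update"
img_august = build(dockerfile, "apt-index@2013-08-27")
img_september = build(dockerfile, "apt-index@2013-09-03")
assert img_august != img_september   # same Dockerfile, different image
```

This is why the public index pins immutable images rather than Dockerfiles: the image is the only artifact whose content is fixed.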

You could say that this is easily prevented by forbidding the use of apt-get, curl, wget, etc., but:
1. These kinds of restrictions are easily circumvented.
2. It would make Dockerfiles highly inefficient as tools for building images.

HTH :)