building docker without docker & its Dockerfile


unclejack

Aug 23, 2013, 2:46:13 PM
to docke...@googlegroups.com
Hello,

It looks like there is indeed a problem with building docker on various distributions. Gentoo needs to build docker without having docker installed and running. Most build environments can probably handle a simple makefile, but using the Dockerfile seems like overkill.


Ideally, we should have an alternative small build script that takes care of producing the docker binary. Perhaps this script could also accept an output path, to place the binary in its final location.

Marco Hennings

Aug 23, 2013, 3:06:37 PM
to docke...@googlegroups.com
Hello,

I think it would be best if the Makefile does the build, and is used within the Dockerfile.

It really should not do anything more than just build the code.
If a distro wants to integrate docker and build a distro-specific package, that is all that is needed.

It also means we do not need to support two builds, if the Makefile is reused.

Besides this, it makes dev environments a bit friendlier.


Of course it is cool that docker can bootstrap itself, but I think that's something for official releases only.
For example: if I want to build locally, I am not so interested in pushing the result to EC2.


Hope we find a good solution,

Kind regards,

Marco

Solomon Hykes

Aug 23, 2013, 3:27:44 PM
to Marco Hennings, docker-dev
Hi guys,

Here's the current situation:


1) We have an official build and release pipeline based on the Dockerfile, which deprecates the previous Makefile-based pipeline. This is really helpful to have an official build environment for all contributors, automate releases and tests, etc. This is the recommended way to build and install docker, and the only one we will support.

2) That official pipeline relies on a very simple make.sh script (conceptually similar to a Makefile, let's leave the "shell script vs Makefile" conversation for another time :) That script is very simple, partly because it expects to run in a correct build environment. You are free to run that make.sh script on your own, but it's your responsibility to run it in a correct environment, and if that environment is not the official build container, we can't support it.

3) There are many software distributions that may be a) potentially interested in distributing docker, and b) unable or unwilling to use the official docker-based pipeline. That includes Debian, Ubuntu, RHEL/Fedora, Gentoo, Suse, Arch but also packaging tools like Homebrew. That's totally normal and understandable, and we want to make their lives easier. BUT we can't maintain and support a pipeline for each of these distributions which respects their specific policies on build environments, mirrors, dependencies etc. We just don't have the time to learn how to do it right for everyone.

As for the specific case of Gentoo: I don't know the specifics of packaging software for Gentoo. Maybe it really is just providing the Makefile. But maybe there's a few tweaks that would be more Gentoo-friendly. Organize the Makefile in a certain way, add a little manifest here, create an account there. Maybe even help "gentoo-ize" some upstream dependencies. Maybe there's a "gentoo way" of upgrading Go.

I understand 100% why distros need these adjustments to upstream. We just don't have time to do the work, let alone do it properly. Instead, we would like to have 1 point of contact for each distro who is willing and able to do that work downstream. And we want to make sure that person gets everything he/she needs from upstream.

This is what we're trying to do with Debian and Ubuntu (note: if you're an Ubuntu "master of the universe" and want to help, we're interested!). And that's what we need to do with other distros as well, including Gentoo.

Any volunteers for Gentoo? :)


Thanks for reading, I hope this clarifies the situation!



--
You received this message because you are subscribed to the Google Groups "docker-dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to docker-dev+...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Marco Hennings

Aug 23, 2013, 3:36:33 PM
to docke...@googlegroups.com, Marco Hennings
Hi,

As a Gentoo user, I think I can volunteer. It's not easy to get an official ebuild into Gentoo, though you can provide one in an "overlay". That's something similar to a PPA, but source-based.
It's not my primary interest, but I think it should be done.

The current state in Gentoo is:
just do a make and it works

That's one of the reasons I use Gentoo: building from source just works.

So there is not much special to do, besides depending on Go.

Kind regards,

Marco

Solomon Hykes

Aug 23, 2013, 4:02:31 PM
to Marco Hennings, docker-dev
On Fri, Aug 23, 2013 at 12:36 PM, Marco Hennings <marco.h...@freiheit.com> wrote:
> Hi,
>
> As a Gentoo user, I think I can volunteer. It's not easy to get an official ebuild into Gentoo, though you can provide one in an "overlay". That's something similar to a PPA, but source-based.
> It's not my primary interest, but I think it should be done.

Cool! Then I guess let the list know if you need anything, or if you would like someone else to help you or take over.
 
> The current state in Gentoo is:
> just do a make and it works

One of the reasons we're discontinuing the mega-Makefile is that it's really "do a make and it might work, if you have the right version of Go, your GOPATH is correctly set, CGO is properly disabled, etc." We had several cases where it *seemed* to work, but in reality the build was flawed in non-obvious ways - a certain version of main.go built with another version of the docker submodules for example!
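Solomon's list of pitfalls can be made concrete. A hedged sketch, not official docker code: the kind of environment checks a standalone build would need before running `go build`. The minimum Go version and the paths below are illustrative assumptions.

```shell
#!/bin/sh
# Hedged sketch, not official docker code: environment checks a standalone
# Makefile would need before `go build`. The minimum Go version and paths
# are illustrative assumptions.
set -e

# version_ge A B: true if dotted version A is >= B (relies on GNU `sort -V`)
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# check_build_env GO_VERSION GOPATH CGO_ENABLED: verify the three pitfalls
# named above: Go version, GOPATH, and CGO being disabled.
check_build_env() {
    version_ge "$1" 1.1 || { echo "go too old: $1"; return 1; }
    [ -n "$2" ]         || { echo "GOPATH not set"; return 1; }
    [ "$3" = 0 ]        || { echo "CGO_ENABLED must be 0"; return 1; }
}
```

A real build would only run something like `CGO_ENABLED=0 go build` after checks of this kind pass; the point is that "do a make" alone verifies none of them.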

> That's one of the reasons I use Gentoo: building from source just works.
>
> So there is not much special to do, besides depending on Go.

... and specifying build dependencies and the build environment in a Gentoo-friendly way. Which I have no idea how to do :)


Thanks for the help! 

bra...@ifup.co

Aug 23, 2013, 6:21:53 PM
to docke...@googlegroups.com
Hello-

I wrote a build script to build docker 0.6.0 without dependencies or networking. It is inspired by how Brad Fitz's camlistore and some other projects build Go stuff.

Here is the branch, happy to send a PR: https://github.com/philips/docker/tree/add-build-and-deps

And here is an example ebuild:


Thanks for the work on 0.6.0! Looking forward to trying it out.

Brandon

tianon

Aug 23, 2013, 6:50:03 PM
to docke...@googlegroups.com
For what it's worth, I've adapted my own ebuild to use a pretty terrible sed hack to grab what's needed directly from the Dockerfile: https://github.com/tianon/gentoo-overlay/blob/61c8520f4d006b766548f39f39d36da37b4c2701/app-emulation/lxc-docker/lxc-docker-9999.ebuild#L62

As a somewhat simpler patch-over until we can get some proper vendor handling of some kind (whether that's copying the deps or otherwise), I propose (and am of course happy to provide a pull request to illustrate/implement) that we create a hack/release/deps.sh file that takes a single argument (the src directory) and just performs the clone lines from the Dockerfile, i.e.:
PKG=github.com/kr/pty REV=27435c699; git clone http://$PKG $1/$PKG && (cd $1/$PKG && git checkout -f $REV)
PKG=github.com/gorilla/context/ REV=708054d61e5; git clone http://$PKG $1/$PKG && (cd $1/$PKG && git checkout -f $REV)
PKG=github.com/gorilla/mux/ REV=9b36453141c; git clone http://$PKG $1/$PKG && (cd $1/$PKG && git checkout -f $REV)
PKG=github.com/dotcloud/tar/ REV=d06045a6d9; git clone http://$PKG $1/$PKG && (cd $1/$PKG && git checkout -f $REV)
PKG=code.google.com/p/go.net/ REV=84a4013f96e0; hg clone http://$PKG $1/$PKG && (cd $1/$PKG && hg checkout $REV)

Then the Dockerfile gets a line similar to this right before performing hack/release/make.sh:
run cd /go/src/github.com/dotcloud/docker && hack/release/deps.sh /go/src

This alone would obviously simplify my ebuild immensely since I could replace that whole sed hack with a simple hack/release/deps.sh .gopath/src.
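For illustration, a hedged sketch of what the proposed hack/release/deps.sh could look like; this is not the actual file, and the DRY_RUN knob is an addition of this sketch so the script can be inspected without network access.

```shell
#!/bin/sh
# Hedged sketch of the proposed hack/release/deps.sh (not the actual file).
# $1 is the target src directory. With DRY_RUN=1 the script only prints the
# commands it would run instead of cloning anything.
set -e

run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then echo "$@"; else "$@"; fi
}

# clone_dep VCS PKG REV: fetch PKG into $SRC/$PKG and pin it to revision REV
clone_dep() {
    vcs=$1 pkg=$2 rev=$3
    run "$vcs" clone "http://$pkg" "$SRC/$pkg"
    if [ "$vcs" = git ]; then
        run sh -c "cd '$SRC/$pkg' && git checkout -f '$rev'"
    else
        run sh -c "cd '$SRC/$pkg' && hg checkout '$rev'"
    fi
}

if [ $# -ge 1 ]; then
    SRC=$1
    clone_dep git github.com/kr/pty          27435c699
    clone_dep git github.com/gorilla/context 708054d61e5
    clone_dep git github.com/gorilla/mux     9b36453141c
    clone_dep git github.com/dotcloud/tar    d06045a6d9
    clone_dep hg  code.google.com/p/go.net   84a4013f96e0
fi
```

An ebuild would then invoke it as `hack/release/deps.sh .gopath/src`, exactly as proposed above.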

Solomon Hykes

Aug 23, 2013, 6:59:10 PM
to tianon, docke...@googlegroups.com
Would this pull request help? It implements a simple vendoring script.  

@solomonstre
@getdocker


tianon

Aug 23, 2013, 7:08:55 PM
to docke...@googlegroups.com, tianon
It certainly wouldn't hurt, but the vendoring issue seems important enough to deserve a serious amount of thought and time, so my proposal is just a refactoring of what we've got right now, to buy time to evaluate the different vendoring options more clearly.

Solomon Hykes

Aug 23, 2013, 7:20:31 PM
to tianon, docke...@googlegroups.com, tianon
Yeah, whether we vendor or not would be a separate decision.
But rename vendor.sh to deps.sh and you might have a script you can call directly as part of your build. We would call it too as part of our build, and everybody would be happy :)

@solomonstre
@getdocker

Tianon

Aug 23, 2013, 7:27:02 PM
to docke...@googlegroups.com, tianon
My apologies, I didn't see https://github.com/dotcloud/docker/pull/1606 - that would certainly be sufficient for my ebuild, even with the name being vendor.sh (doesn't bother me a bit what the name is, as long as it gets the job done).  It's already essentially what I've proposed here. *thumbsup*

Lokesh Mandvekar

Aug 23, 2013, 7:33:32 PM
to Tianon, docke...@googlegroups.com
Hi all,

I'm packaging docker for Fedora, and I'm volunteering to be the point of
contact for Fedora stuff. Hope I'm replying to the right thread :)

Thanks,
--
Lokesh

Solomon Hykes

Aug 23, 2013, 7:58:03 PM
to Lokesh Mandvekar, Tianon, docke...@googlegroups.com
Thanks Lokesh! Yes, this is the thread where I mentioned the need for a point of contact per distro.
We are discussing a combination of:

1) A human-readable file listing the build and run dependencies of docker, for the convenience of distro maintainers.
2) Breaking down the steps of our release pipeline (get system deps, get Go deps, build, run tests) into small discrete shell scripts, which may be used separately for custom builds (e.g. for Gentoo).

Let this list know if you need anything.

@solomonstre
@getdocker

Lokesh Mandvekar

Aug 23, 2013, 8:01:14 PM
to Solomon Hykes, Tianon, docke...@googlegroups.com
Great, thanks Solomon. Guess I'll find out more once I submit the package review request
(soon) and I'll get back in case of updates.

Marco Hennings

Aug 24, 2013, 3:15:26 AM
to docke...@googlegroups.com
Hello,

I think checking in the dependencies inside the repo and providing a version-update script is a nice way to ensure a tagged release stays stable even if dependencies change.
It also simplifies the build process.

The build script did not work for me, mostly because of missing build options.
I've added a PR with these fixes at https://github.com/philips/docker/pull/1

Wouldn't it be easier to use a simplified Makefile instead of a build .sh?

Kind regards,

Marco

Solomon Hykes

Aug 24, 2013, 3:22:21 AM
to Marco Hennings, docke...@googlegroups.com
Guys, there is already a build script in master: hack/release/make.sh. Can we please make sure we're not duplicating effort?

@solomonstre
@getdocker



Marco Hennings

Aug 24, 2013, 6:04:37 AM
to Solomon Hykes, docke...@googlegroups.com
2013/8/24 Solomon Hykes <solomo...@dotcloud.com>:
> Guys, there is already a build script in master: hack/release/make.sh. Can
> we please make sure we're not duplicating effort?

Sure, I am just interested in trying out the build with the
checked-in dependencies Brandon has done.
It seems to be a good alternative for the dependency handling part.

For the build itself I think it would be best if we split parts of the
make.sh into a common.sh and reuse it for a local build.

Please have a look at:
https://github.com/mhennings/docker/compare/build-without-docker

Marco Hennings

Aug 24, 2013, 6:24:26 AM
to docke...@googlegroups.com

Tianon,

Your ebuild works fine.

Maybe we should use it for an official docker overlay?

Kind regards,

Marco

Brandon Philips

Aug 24, 2013, 9:32:52 AM
to marco.h...@freiheit.com, Solomon Hykes, docke...@googlegroups.com
On Sat, Aug 24, 2013 at 3:04 AM, Marco Hennings
<marco.h...@freiheit.com> wrote:
> Sure, I am just interested in trying out the build with the
> checked-in dependencies Brandon has done.
> It seems to be a good alternative for the dependency handling part.

Thanks, we have been using it on etcd with success.

I did a little bit of work to have your build-without-docker use the
third_party repo too:

https://github.com/philips/docker/commit/a277b464953e00de8f637caf00c2952cc91c452d
https://github.com/philips/docker/commits/build-without-docker

> For the build itself I think it would be best if we split parts of the
> make.sh into a common.sh and reuse it for a local build.
>
> Please have a look at:
> https://github.com/mhennings/docker/compare/build-without-docker

Worked great for me, thanks for using the existing tools.

Brandon

Jérôme Petazzoni

Aug 24, 2013, 3:24:20 PM
to Brandon Philips, docker-dev
Do we really want to check in dependencies within the repo?
I would understand the rationale if those dependencies were non-versioned, or hosted on unreliable 3rd party systems.
But in the present case, they are versioned and hosted on github and code.google.com.
If we really want a self-contained repo, can't we have a separate "docker-vendored" repo, which would vendor in Docker and all its dependencies?
It would keep the development repo clean and lean.
Just my 2c, and allow me to apologize in advance if I sound too much concerned about unnecessary aesthetics :-)

Marco Hennings

Aug 24, 2013, 3:34:06 PM
to Jérôme Petazzoni, Brandon Philips, docker-dev
I understand both the reasons for checking in the dependencies and the reasons against it.

Reasons for it I see are:
- stability of earlier versions
- simpler to make sure every build has the same result; less effort in maintaining

Against it:
- size
- leaving the choice to distros to use newer dependencies

If there is a Go way to do it, I would prefer that.
The way Brandon suggests seems to be something like a best practice,
so I like it; not saying we should do it.

Kind regards,

Marco

--
Marco Hennings
Dipl.-Ing.

freiheit.com technologies gmbh
Straßenbahnring 22 / 20251 Hamburg, Germany
fon +49 40 / 890584-0
fax +49 40 / 890584-20
HRB Hamburg 70814

F5D8 4B57 6D93 76E8 87C7 2045 04C4 1F36 CD24 84A9
Geschäftsführer: Claudia Dietze, Stefan Richter, Jörg Kirchhof

Anand Patil

Aug 24, 2013, 3:48:11 PM
to docke...@googlegroups.com, Marco Hennings
Hi everyone,

However you end up doing the build script, it would be great to have a way to install a fixed version of docker on new boxes, such as "curl get.docker.io/version/0.6.1 | sh".

Thanks,
Anand

Brandon Philips

Aug 25, 2013, 5:52:39 PM
to Jérôme Petazzoni, docker-dev
On Sat, Aug 24, 2013 at 12:24 PM, Jérôme Petazzoni
<jerome.p...@dotcloud.com> wrote:
> If we really want a self-contained repo, can't we have a separate
> "docker-vendored" repo, which would vendor in Docker and all its
> dependencies?

Introducing another git repo that people have to wait to update
doesn't feel right.

If I check in code that uses new APIs from library X, then something in
the docker repo should also change to reflect that we are pinning version X
in the build system. There are three options to do that, as far as I can
see:

1) Checkin all of the dependencies

I like this pattern because you get everything you need to build a
branch with a single git fetch. Other projects like camlistore use it,
and I have been happy with it on etcd. A small bonus is that a build
from git doesn't require network access.

2) Keep a deps file with library repo and tree ref

This works and is roughly equivalent to 1. The Dockerfile is doing this
today, and mhennings and I can modify fetch_deps to do this too:
https://github.com/mhennings/docker/compare/build-without-docker#L1R43

3) Use git submodules

I used git submodules on luvit, and they are hard to use if you don't
understand git really well. Not recommended :(
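As an illustration of option 2, a hedged sketch (not an existing docker file) of a deps-file parser; entries would be "vcs package revision" triples such as `git github.com/kr/pty 27435c699`. Actual cloning is left to the caller, so the sketch stays network-free.

```shell
#!/bin/sh
# Hedged sketch of option 2 (a deps file with library repo and tree ref);
# not an existing docker file. The deps file holds "vcs package revision"
# triples, one per line, and this parser emits the fetch steps.
set -e

# parse_deps FILE: print one fetch line per entry,
# skipping blank lines and '#' comments
parse_deps() {
    while read -r vcs pkg rev; do
        case "$vcs" in ''|'#'*) continue ;; esac
        echo "$vcs clone http://$pkg -> \$GOPATH/src/$pkg @ $rev"
    done < "$1"
}
```

A fetch script would loop over these lines and perform the actual `git clone` / `hg clone` plus the pinned checkout, much like the Dockerfile does today.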

> It would keep the development repo clean and lean.

If everything is in docker.git it will be less bandwidth than cloning
the full history of every library repo. Either way you have to pull
down dependencies if you want to build docker.

I am leaning toward option 1 because it is familiar to me. Option 2 is
fine too if someone feels strongly about it.

Jerome: which option do you want? I am willing to append to mhennings'
branch and send a PR.

Thanks!

Brandon

Solomon Hykes

Aug 25, 2013, 7:04:50 PM
to Brandon Philips, Jérôme Petazzoni, docker-dev
Let's go with option 1.

@solomonstre
@getdocker


Brandon Philips

Aug 25, 2013, 7:34:11 PM
to Solomon Hykes, Jérôme Petazzoni, docker-dev
On Sun, Aug 25, 2013 at 4:04 PM, Solomon Hykes
<solomo...@dotcloud.com> wrote:
> Let's go with option 1.

Sent my proposal PR https://github.com/dotcloud/docker/pull/1668

Happy to fix stuff up. Thanks.

Brandon

Eric Myhre

Aug 25, 2013, 7:52:52 PM
to docker-dev

> Reasons for it i see are:
> - stability of earlier versions
> - simpler to make sure every build has the same result; less effort in
> maintaining
>
> Against it:
> - size
> - leaving choice to distros to use newer dependencies

There's something I find colossally absent from this list:

it's *semantically wrong* to put something like github.com/kr/pty in github.com/dotcloud/docker.

kr's pty library isn't in docker, and it's not written by dotcloud.


Here's a working example of how that hurts me, right now: I have a project that references the docker source, and also references the kr/pty source. Today, that project refers to both of them, and that works, and nothing's duplicated in my project, and it *makes sense semantically*. If the docker repo starts *also* including kr/pty as a giant cp'd blob with incoherent history, *my* project working tree now looks like absolutely incoherent slop.


Also, I'm in almost complete agreement with jpetazzo:

On 08/24/2013 02:24 PM, Jérôme Petazzoni wrote:
> I would understand the rationale if those dependencies were non-versioned,
> or hosted on unreliable 3rd party systems.
> But in the present case, they are versioned and hosted on github and
> code.google.com.
...
> It would keep the development repo clean and lean.
> Just my 2c, and allow me to apologize in advance if I sound too much
> concerned about unnecessary aesthetics :-)
>

... with the exception that I disagree that it's dismissible as just aesthetics.


Maintaining separate repositories, either in the docker-deps approach or a submodule approach is "hard", yes. Maybe even "moderately irritating". But I believe we tend to come up with scripts for things that are moderately irritating, and it makes them less irritating ;)

David Calavera

Aug 25, 2013, 8:24:54 PM
to docke...@googlegroups.com
I've added my two cents as a reply to that pull request. Please don't vendor the dependencies in docker's repo:




Brandon Philips

Aug 25, 2013, 8:48:00 PM
to Eric Myhre, docker-dev
On Sun, Aug 25, 2013 at 4:52 PM, Eric Myhre <ha...@exultant.us> wrote:
> If the docker repo starts *also* including kr/pty as a giant cp'd blob with incoherent history, *my* project working tree now looks like absolutely incoherent slop.

The third_party directory is only used at build time if you use the
make*.sh scripts. It doesn't affect manually setting up a Go workspace
with a shared version of kr/pty, for example.

Including all deps is a practical decision to have simple, fast, and
offline production builds. Other projects do it, for example:

https://github.com/bradfitz/camlistore/tree/master/third_party
https://github.com/coreos/etcd/tree/master/third_party

We can also look at the node.js community which has decided that
checking in dependencies is the right thing to do for applications:

https://npmjs.org/doc/faq.html#Should-I-check-my-node_modules-folder-into-git

Brandon

David Calavera

Aug 25, 2013, 10:43:23 PM
to docke...@googlegroups.com, Eric Myhre
Brandon,

the main difference between docker and those examples is that you're not going to deploy docker from source to your production servers that have a docker server running. Docker is a precompiled tool that goes through a series of steps before it replaces a production docker: mainly being built for a specific distro and packaged in a specific format.

The npm guide is correct when it says:

Check node_modules into git for things you deploy, such as websites and apps.

Because you cannot depend on third parties when you have to push your website or app straight to live production. Again, this is not docker's case.

unclejack

Aug 26, 2013, 9:35:39 AM
to docke...@googlegroups.com, Eric Myhre

@David Calavera

Gopack might be the right tool when you have control over all the dependencies and their continuous uninterrupted availability. It also doesn't solve the problem of having to fetch these resources after doing a fresh checkout on a CI system.

Why are you saying that they shouldn't be checked into the repository? That would greatly simplify a lot of things for many of the people involved with this project, including the maintainers.

Docker has become so much more than just a tool a company released as open source. It's a project which is already being used by a lot of companies. Making sure that this code can always be built is a very good idea, even when a random third party removes a public repository or the site hosting that repository can't be accessed. Having all the dependencies checked into the repository is a big plus for enterprise users. It's a sign of stability, and it helps build trust that the project is serious about sticking around for a long time.

I've had troubles building docker when Github was under DDoS and that's just one of the problems with relying on fetching dependencies from third party repositories.

Tianon

Aug 26, 2013, 12:02:02 PM
to docke...@googlegroups.com
In regards to an "official docker overlay", I've moved the Gentoo ebuilds out of my personal overlay and into a dedicated docker-overlay: https://github.com/tianon/docker-overlay (and would love to have the blessing to drop the "somewhat almost" in that description and be the "official" maintainer for Gentoo).

On the subject at hand in this thread, I can definitely see where unclejack is coming from, with there being no guarantee of upstream being stable (either connection-wise or oh-noes-the-repo-is-renamed-or-worse-outright-deleted), and although I agree with jpetazzo that it's aesthetically unpleasing, the appeal of build stability is there in droves (which is something the Gentoo team will definitely be very keen about if we ever want to push upstream into the portage tree directly, on top of the direct benefit to the everyday developers of docker).

Eric Myhre

Aug 26, 2013, 12:57:15 PM
to docke...@googlegroups.com
I find that this discussion is getting confusing. I want to try to un-muddy the waters by simply re-enumerating the already discussed approaches.

The Dockerfile used to build Docker currently falls in the top category, "script (clone at build time)".

(If the gods of email line wrapping have no mercy, I also posted it as a gist: https://gist.github.com/heavenlyhash/6343783 )

```
approach         core        offline?      fast   reproducible      transparent    multiple    permanent bloat
                                                                    when library   histories   to source repo
---------------  ----------  ------------  -----  ----------------  -------------  ----------  ---------------
script (clone    semantic    no/fail       no     yes (`checkout    yes            no          no
at build time)                                    -f $HASH`)

script (clone    semantic    yes           yes    yes               yes            no          no
once, checkout
at build time)

go get           wget        yes           yes    no/fail           yes            no          no
                             (caching, if
                             not smart)

git submodules   semantic    yes           yes    yes               yes            no          no

gopath           semantic    yes           yes    yes               yes            no          no

docker+deps      blob-copy   yes           yes    yes               yes            yes         no
(separate
vendored repo)

vendoring        blob-copy   yes           yes    yes               no/fail        yes         yes/fail
docker (main
source repo)
```

Since the "good" answer for some columns is the affirmative and others in the negative and that's quite confusing at a first read, when there's a clear good answer and a clear fail answer, I've marked the fail as "[yes or no]/fail". (You may debate the importance of the column as you wish, but I think it is fair to note the unit vector.)

The "offline" column is, I hope, self-explanatory.

I included "fast" as a column because it's come up in discussion before, but really, as far as I can tell, "fast" is a synonym of "offline".

The "reproducible" column means it gives every person who pulls the docker repo a reliable way to get exact copies of exact versions of deps that the docker devs intend. (It doesn't mean "reliable in the case of github down", because that's already covered by the "offline" column.)

These columns are the ones of immediate relevance to docker.

The following columns are more about the holistic health of the open source ecosystem around docker:

The "multiple histories" column refers to whether or not third party sources end up with multiple histories. If pty_linux.go has one history in github.com/kr/pty, and then it has another history with different dates, graph, and hashes in docker, then that's multiple histories. If I someday google for some source code in kr/pty/pty_linux.go and I end up on the github source pages for docker, then that's multiple histories.

The "transparent when library" column refers to whether or not a developer who brings the docker source into another project will have to be aware of how docker drags in its deps. If I have a project that submodules docker, and submodules kr/pty, and I can `find | grep pty_linux.go | wc -l` and I get anything other than '1', you get a fail in this column.

The "permanent bloat to source repo" column describes what kind of diffs end up in your source repo if dependencies are substantially changed -- if the diff is large, and permanently increases the size of cloning the source repo, this column contains a yes; if the diff is small because it's a semantic reference to other sources, this column contains a no.

</unbiased>

Scanning across the rows here:

"script (clone at build time)" fails, because it doesn't work offline;
"script (clone once, checkout at build time)" does not clearly fail;
"go get" fails, because it's not reproducible;
"git submodules" do not clearly fail;
"gopath" does not clearly fail (although perhaps I should have added a column to mention that it itself adds a dep);
"docker+deps (separate vendored repo)" does not clearly fail;
"vendoring (main source repo)" fails because it's not transparent to libraries in the ecosystem, and it's a permanent bloat to source repo.

This is why I argue in favor of submodules, gopath, a script that does semantic checkouts at build time, or a separate docker+deps repo. These approaches are strictly more capable than the alternatives.


Eric Myhre

Aug 26, 2013, 12:55:14 PM
to docke...@googlegroups.com
> be the right tool when you have control over all the
> dependencies and their continuous uninterrupted availability. It also
> doesn't solve the problem of having to fetch these resources after doing a
> fresh checkout on a CI system.
> [...]
> troubles building docker when Github was under DDoS and that's
> just one of the problems with relying on fetching dependencies from third
> party repositories.

This is a valid concern to raise against using scripts that re-download dependencies at run time, every time.

This is not a valid concern against gopack, or submodules, or other mechanisms that are based on semantic references and use source control. You run `gp` after `git clone`, and it caches; or you run `git submodule update --init` after you clone (or `git clone --recursive`), and it caches! When I argue for submodules or other mechanisms that are based on semantic references that use source control and maintain coherent histories, I am -not- advocating using the network at build time. Those are extremely orthogonal, since we live in a world of DVCS.
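The submodule workflow described here can be demonstrated entirely locally. A hedged sketch: the repo names are made up, and the `protocol.file.allow=always` override is only required on newer git versions (it is harmless on older ones that accept `-c`).

```shell
#!/bin/sh
# Hedged local demonstration: a submodule is fetched once at clone time and
# cached thereafter. No network is involved because both repos live on the
# local filesystem. Repo names are illustrative.
set -e

work=$(mktemp -d)
cd "$work"

# a stand-in "dependency" repo (playing the role of kr/pty)
git init -q dep
( cd dep && echo 'package pty' > pty.go && git add pty.go \
  && git -c user.email=a@b -c user.name=t commit -qm 'dep' )

# the "main" repo pins the dependency as a submodule
git init -q main
( cd main \
  && git -c protocol.file.allow=always submodule --quiet add "$work/dep" third_party/dep \
  && git -c user.email=a@b -c user.name=t commit -qm 'pin dep' )

# a fresh consumer gets source plus pinned deps with one recursive clone
git -c protocol.file.allow=always clone -q --recursive main consumer
```

After the initial recursive clone, subsequent builds need no network at all, which is exactly the caching behavior described above.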

That aside, these concerns are also not an argument that specifically favors vendoring into the source repo, because vendoring into a separate docker+deps repo and leaving the source repo pristine would behave equally in all ways. (Except, of course, the "source repo pristine" part.)

Similarly, earlier it was said:

> Including all deps is a practical decision to have simple, fast and
> offline production builds.

I wholeheartedly agree that fast, offline builds are a glorious and desirable thing. I couldn't possibly agree any more strongly. The ability to do fast, consistent, and (after, of course, the initial clone) offline builds are the single most important attribute of a project to me when I'm determining how seriously I take it.

In original context, that quote was used as an argument in support of vendoring into the source repo. I don't think it's actually an argument in support of vendoring. I think that's an undifferentiating argument, and supports of any of the options proposed in this thread.

Vendoring is the weakest possible way to go about fast, offline builds, and the most cumbersome. It checks off that main criterion, but makes many sacrifices, because it's based on abandoning semantics and doing blob-copies. In many ways, vendoring reminds me of the way I would blob-copy folders of my whole source tree around before I discovered my first version control system; then one day I met subversion and I instantly went "...Oh. ...Now I know better. Now I have a tool that works with the semantics that I really meant all along."

Git submodules are better than vendoring. Gopath, as proposed by @calavera, while I haven't used it before, smells right and looks better than vendoring. Making a separate repo, and doing whatever you want with its history, is better than vendoring. The stuff in docker/Dockerfile right now:

```
run PKG=github.com/kr/pty REV=27435c699; git clone http://$PKG /go/src/$PKG && cd /go/src/$PKG && git checkout -f $REV
run PKG=github.com/gorilla/context/ REV=708054d61e5; git clone http://$PKG /go/src/$PKG && cd /go/src/$PKG && git checkout -f $REV
run PKG=github.com/gorilla/mux/ REV=9b36453141c; git clone http://$PKG /go/src/$PKG && cd /go/src/$PKG && git checkout -f $REV
run PKG=github.com/dotcloud/tar/ REV=d06045a6d9; git clone http://$PKG /go/src/$PKG && cd /go/src/$PKG && git checkout -f $REV
run PKG=code.google.com/p/go.net/ REV=84a4013f96e0; hg clone http://$PKG /go/src/$PKG && cd /go/src/$PKG && hg checkout $REV
```

... is much, much better than vendoring. The only problem with this code currently in the Dockerfile is that it hardcodes http references. But that's a problem with the script as it stands, not a problem with the core concept! To make this Dockerfile work offline, you just add a variable replacing "http://" with "file://", and this is not mere serendipity: it works because referring to source through its version control references is the correct approach.
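A minimal sketch of that substitution follows. The `fetch_dep` helper and the `SCHEME`/`SRCROOT` variables are hypothetical names for illustration, not anything in the actual Dockerfile or build scripts:

```shell
# Hypothetical sketch: factor the transport scheme out of the Dockerfile's
# clone commands, so one script serves both online and offline (mirror) builds.
SCHEME="${SCHEME:-https://}"   # e.g. file:///path/to/local/mirror/ for offline
SRCROOT="${SRCROOT:-/go/src}"

fetch_dep() {
  pkg="$1"; rev="$2"
  git clone -q "${SCHEME}${pkg}" "${SRCROOT}/${pkg}" &&
    git -C "${SRCROOT}/${pkg}" checkout -q -f "${rev}"
}

# Usage, mirroring the Dockerfile lines (needs network, or a local mirror):
#   fetch_dep github.com/kr/pty 27435c699
```

Because the reference is a commit hash plus a URL scheme, pointing `SCHEME` at a filesystem mirror yields byte-identical checkouts with zero network traffic.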



Another issue that differentiates the available approaches discussed is the degree of long-term impact to the source repo.

Git submodules? Git rm, and there's a diff, and it's fine. Gopath? Git rm, there's a diff, and it's fine. Separate repo with vendored deps? Do whatever you want; it's separate for a reason, and it's fine. But with blob-copies vendored into your main source repository, if you ever want to change your organization or substantially change your dependencies, you have to either accept the permanent bloat of schmutz in your history, or bust out the filter-branch commands and start rewriting history, at which point everyone who ever mentioned a commit hash now holds a reference to not-history, and it's just not a good time.

The pattern there? Of all the presented approaches, only one needs a strategy for *damage control*: vendoring. (And the damage control available for it isn't even very good.) Everything else can be cleaned up in perfectly normal ways that cooperate reasonably with your version control system.
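For contrast with vendoring's damage control, the submodule "clean exit" really is just an ordinary commit. A hedged sketch, again with throwaway local repos and hypothetical paths:

```shell
# Hedged sketch: throwaway local repos; paths are hypothetical.
export GIT_AUTHOR_NAME=t GIT_AUTHOR_EMAIL=t@example.com
export GIT_COMMITTER_NAME=t GIT_COMMITTER_EMAIL=t@example.com
proj=$(mktemp -d)

git init -q "$proj/dep"
git -C "$proj/dep" commit -q --allow-empty -m 'dep history'
git init -q "$proj/app"
git -C "$proj/app" -c protocol.file.allow=always \
  submodule --quiet add "$proj/dep" vendor/dep
git -C "$proj/app" commit -q -m 'add dep'

# Changing your mind later is one small, reviewable diff: no filter-branch,
# no rewritten hashes, no blobs left behind in history.
git -C "$proj/app" rm -q vendor/dep
git -C "$proj/app" commit -q -m 'drop dep'
```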

Frederick Kautz

unread,
Aug 26, 2013, 1:56:29 PM8/26/13
to Eric Myhre, docke...@googlegroups.com
At the very minimum, if deps aren't checked in, they should at least be archived via a github fork. If a repo is deleted, recovery will be much easier.

--
Frederick F. Kautz IV




David Calavera

unread,
Aug 27, 2013, 1:05:03 AM8/27/13
to docke...@googlegroups.com, Eric Myhre
I cannot agree more with Eric Myhre.

Frederick, the good part of distributed version control systems is that each of us who has checked out those repos to build docker has a complete copy of them. If one of them is deleted tomorrow, you only need to push your copy to your favorite server and we can keep working as if nothing had happened.
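A sketch of that recovery path, with local throwaway repos standing in for the vanished upstream's surviving clone and its replacement server (all names hypothetical):

```shell
# Hedged sketch: local throwaway repos stand in for a surviving full clone
# and the replacement server it gets pushed to.
export GIT_AUTHOR_NAME=t GIT_AUTHOR_EMAIL=t@example.com
export GIT_COMMITTER_NAME=t GIT_COMMITTER_EMAIL=t@example.com
salvage=$(mktemp -d)

# The full local clone everyone already has (e.g. under GOPATH):
git init -q "$salvage/clone"
git -C "$salvage/clone" commit -q --allow-empty -m 'upstream history'

# Upstream vanishes; push the local copy to a new home, refs and all:
git init -q --bare "$salvage/newhome.git"
git -C "$salvage/clone" remote add newhome "$salvage/newhome.git"
git -C "$salvage/clone" push -q newhome --mirror
```

`push --mirror` carries every branch and tag, which is exactly why a DVCS clone is a complete backup, not just a working copy.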

Brandon Philips

unread,
Aug 30, 2013, 2:56:07 PM8/30/13
to docker-dev
What's the way forward? I have sent two pull requests:

Shared make.sh:

https://github.com/dotcloud/docker/pull/1725

And a separate make-without-docker.sh:

https://github.com/dotcloud/docker/pull/1688

Neither of these PRs pulls in dependencies; they simply solve the
build-without-Docker use case.

Marco Hennings

unread,
Aug 30, 2013, 3:17:53 PM8/30/13
to docke...@googlegroups.com
Maybe this should be a topic for the next IRC meeting. 

Jérôme Petazzoni

unread,
Aug 30, 2013, 3:32:16 PM8/30/13
to Brandon Philips, docker-dev
I'll try to review those today or this weekend.
Sorry, I was kept very busy this week with "the road without AUFS"!

Brandon Philips

unread,
Aug 30, 2013, 6:42:44 PM8/30/13
to Jérôme Petazzoni, docker-dev
On Fri, Aug 30, 2013 at 12:32 PM, Jérôme Petazzoni
<jerome.p...@dotcloud.com> wrote:
> Sorry, was kept very busy this week with "the road without AUFS"!

No problem, I get that.

I just want to settle on some agreeable solution so I don't have to
maintain a fork.

Thanks!

Brandon

Frederick Kautz

unread,
Aug 31, 2013, 5:13:11 AM8/31/13
to David Calavera, Eric Myhre, docke...@googlegroups.com
David, in general I agree. However, how often do you update the git repos in your GOPATH? Keeping a fork doesn't guarantee an always up-to-date copy, but it's easier to script the mirror's update and recover from it than to ask around for whoever ran `go get -u` most recently.
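Scripting that refresh is indeed straightforward; here is a sketch, assuming the mirrors are bare `--mirror` clones collected under one directory (repos and paths hypothetical):

```shell
# Hedged sketch: a stand-in upstream plus a bare --mirror fork of it;
# repos and paths are hypothetical.
export GIT_AUTHOR_NAME=t GIT_AUTHOR_EMAIL=t@example.com
export GIT_COMMITTER_NAME=t GIT_COMMITTER_EMAIL=t@example.com
mroot=$(mktemp -d)

git init -q "$mroot/upstream"
git -C "$mroot/upstream" commit -q --allow-empty -m v1
git clone -q --mirror "$mroot/upstream" "$mroot/mirrors/upstream.git"

# Upstream moves on; one cron-able loop keeps every mirror current:
git -C "$mroot/upstream" commit -q --allow-empty -m v2
for repo in "$mroot"/mirrors/*.git; do
  git -C "$repo" remote update --prune >/dev/null
done
```

Dropping that loop into cron is the "easier to script" part: the mirrors stay fresh without anyone remembering to run `go get -u`.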

Sent from Mailbox for iPad


On Mon, Aug 26, 2013 at 10:05 PM, David Calavera <david.c...@gmail.com> wrote:

Frederick Kautz

unread,
Aug 31, 2013, 5:17:09 AM8/31/13
to Brandon Philips, Jérôme Petazzoni, docker-dev
Also, by fork, I specifically am thinking of a mirror, not a separate dev branch or vendorized into the project. :)

Sent from Mailbox for iPad


Solomon Hykes

unread,
Sep 10, 2013, 2:27:57 PM9/10/13
to Frederick Kautz, Brandon Philips, Jérôme Petazzoni, docker-dev
Hi everyone,

After much discussion here and on irc, and much appreciated contributions (thanks Brandon and Morgan), I have opened a pull request with a bunch of improvements to the build tool. This includes vendoring, a "packager's manual" which summarizes requirements for building and running docker, more modular build scripts, and a dev-friendly way to develop & test docker using docker.

