Tagging and stamping docker images


Jeff Grafton

Jan 18, 2017, 4:48:00 PM
to bazel-discuss
I've been working on converting the process for building release tarballs for Kubernetes from a set of shell scripts to Bazel.

It's basically working (I have some complaints about pkg_tar), but I mainly need some extensions to the docker_build rule.

Currently for releases, we:
1. Build binaries
2. Save md5sum of each binary into a .docker_tag file
3. Build a docker image for each of these binaries
4. Tag each docker image with binary_name:md5sum
5. Run docker save to create a tarball for each image
6. Bundle all of these tarballs and .docker_tag files into a larger tarball

Then in deployments, we
1. Run docker load on each docker tarball, which loads the image with the binary_name:md5sum tag
2. Refer to images using the .docker_tag file we created
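
The tagging part of the release steps above can be sketched roughly as follows. This is a minimal Python illustration, not the actual Kubernetes release scripts: the function names `md5_of_file` and `write_docker_tag`, and the choice to name the tag file `<binary>.docker_tag`, are assumptions made for the example.

```python
import hashlib
from pathlib import Path


def md5_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex md5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_docker_tag(binary: Path, out_dir: Path) -> str:
    """Write <binary name>.docker_tag (hypothetical naming) into out_dir.

    Returns the image reference in binary_name:md5sum form.
    """
    tag = md5_of_file(binary)
    (out_dir / f"{binary.name}.docker_tag").write_text(tag + "\n")
    return f"{binary.name}:{tag}"
```

Because the tag depends only on the bytes of the binary, rebuilding an unchanged binary yields the same tag, which is what lets the deployment side look the image up from the .docker_tag file.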

Currently, the docker_build rule does not save any tags in the generated tarfile, though the tools/build_defs/docker/create_image.py script does seem to support them.
Would it be reasonable to extend the docker_build rule to support tags? Is this something that should be done only if --stamp is given?

On a similar note, it's a little frustrating (though not a deal-breaker) that docker images don't have a valid created time. I guess this is done for caching reasons, but could we change this behavior if --stamp is given?
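
One way to picture the --stamp idea: keep a fixed placeholder timestamp by default so the output stays cacheable, and use the real build time only when stamping is requested. The sketch below is hypothetical Python, not the docker_build implementation; the `stamp` flag, the `build_timestamp` parameter, and the placeholder value are all assumptions.

```python
from datetime import datetime, timezone
from typing import Optional

# Assumed placeholder; a fixed value keeps the image bytes reproducible.
UNSTAMPED_CREATED = "0001-01-01T00:00:00Z"


def created_field(stamp: bool, build_timestamp: Optional[int] = None) -> str:
    """Return a value for an image config's "created" field.

    build_timestamp is seconds since the Unix epoch (a hypothetical input
    supplied by the build when stamping is enabled).
    """
    if not stamp or build_timestamp is None:
        return UNSTAMPED_CREATED
    moment = datetime.fromtimestamp(build_timestamp, tz=timezone.utc)
    return moment.strftime("%Y-%m-%dT%H:%M:%SZ")
```

With this shape, unstamped builds always produce the same config, and only stamped builds pay the cost of a changing timestamp.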

Regarding tags, I created a commit to do what I wanted here: https://github.com/ixdy/bazel/commit/3b29803eb528ff525c7024190ffbf4b08c598cf2
It adds the tags always, not just with --stamp, but maybe that could be changed.

I stole the "read from file" functionality from labels, since I couldn't think of a better way to use the md5sum of the binaries we're bundling, though it is a little gross.

In this commit, I also added an option to disable adding the package name to the repository, since our git tree structure doesn't match our container registry structure.

I haven't changed anything around creation time yet, since that's not critical at this point.

Any thoughts on what I've proposed?

(appendix: PR where I am using the modified docker_build rule: https://github.com/kubernetes/kubernetes/pull/39898, in particular the build/BUILD file.)

miked...@google.com

Jan 18, 2017, 4:50:59 PM
to bazel-discuss, Matthew Moore
+mattmoor@

Matthew Moore

Jan 18, 2017, 4:54:53 PM
to Michael Danese, Damien Martin-guillerez, bazel-discuss
+dmarting
--
Matthew Moore
DI/Docker (aka Convoy)
Developer Infrastructure @ Google

Damien Martin-guillerez

Jan 18, 2017, 5:44:57 PM
to Matthew Moore, Michael Danese, bazel-discuss
On Wed, Jan 18, 2017 at 10:54 PM Matthew Moore <matt...@google.com> wrote:
+dmarting

On Wed, Jan 18, 2017 at 1:50 PM, <miked...@google.com> wrote:
+mattmoor@

On Wednesday, January 18, 2017 at 1:48:00 PM UTC-8, Jeff Grafton wrote:
> I've been working on converting the process for building release tarballs for Kubernetes from a set of shell scripts to Bazel.
>
>
> It's basically working (I have some complaints about pkg_tar), but I mainly need some extensions to the docker_build rule.
>
>
> Currently for releases, we:
> 1. Build binaries
> 2. Save md5sum of each binary into a .docker_tag file
> 3. Build a docker image for each of these binaries
> 4. Tag each docker image with binary_name:md5sum
> 5. Run docker save to create a tarball for each image
> 6. Bundle all of these tarballs and .docker_tag files into a larger tarball
>
>
> Then in deployments, we
> 1. Run docker load on each docker tarball, which loads the image with the binary_name:md5sum tag
> 2. Refer to images using the .docker_tag file we created
>
>
> Currently, the docker_build rule does not save any tags in the generated tarfile, though the tools/build_defs/docker/create_image.py script does seem to support them.
> Would it be reasonable to extend the docker_build rule to support tags?

Totally reasonable, just no time for doing it.
 
> Is this something that should be done only if --stamp is given?

I don't know. Does the tag contain nondeterministic information, related to the time of the build for example?

>
>
>
> On a similar note, it's a little frustrating (though not a deal-breaker) that docker images don't have a valid created time. I guess this is done for caching reasons, but could we change this behavior if --stamp is given?

That was the plan, but I don't think we have this information from Skylark right now :( +Laurent Le Brun might know?

Someone sent a code review for this that was pretty good, but he seems to have abandoned it. It was just missing proper tests; see https://cr.bazel.build/6470/
 
>
>
> Regarding tags, I created a commit to do what I wanted here: https://github.com/ixdy/bazel/commit/3b29803eb528ff525c7024190ffbf4b08c598cf2
> It adds the tags always, not just with --stamp, but maybe that could be changed.

This should go through a proper code review, but I see no big blocker for it. We were setting tags only at load time before; we actually discussed doing this when updating to the Docker 1.10 format, but it was not done for backward-compatibility reasons.
 
>
>
> I stole the "read from file" functionality from labels, since I couldn't think of a better way to use the md5sum of the binaries we're bundling, though it is a little gross.

That sounds reasonable to me, but I might ask you to change it during code review if you decide to send it.
 
>
>
> In this commit, I also added an option to disable adding the package name to the repository, since our git tree structure doesn't match our container registry structure.

The current rule supports forcing the repository name. Why not use that?
 
>
>
> I haven't changed anything around creation time yet, since that's not critical at this point.

Well, the good news is the code already exists somewhere :)

Matthew Moore

Jan 18, 2017, 5:50:12 PM
to Damien Martin-guillerez, Michael Danese, bazel-discuss
docker_build will produce tags, but you have to build :foo.tar vs. :foo

I think it is always a bazel/path/to/target:foo label though, not something you have fine-grained control over.
-M

Jeff Grafton

Jan 18, 2017, 6:04:29 PM
to Matthew Moore, Damien Martin-guillerez, Michael Danese, bazel-discuss
docker_build doesn't tag the layers.

For example:

$ bazel build build/kube-proxy.tar
$ docker load -i bazel-bin/build/kube-proxy.tar
34b2007ad853: Loading layer [==================================================>] 1.935 MB/1.935 MB
699123784092: Loading layer [==================================================>]  10.7 MB/10.7 MB
dc67e3957f4f: Loading layer [==================================================>] 3.932 MB/3.932 MB
5d57ee9b0c7c: Loading layer [==================================================>] 28.14 MB/28.14 MB
Loaded image ID: sha256:aa1ccb4f7c1fd7a98078f0f272ea0f6edbdd71ccc9d0a1ab02b04d5449fbd8e5
Loaded image ID: sha256:5a00e6ccb81ef304e1bb9995ea9605f199aa96659a44237d58ca96982daf9af8
Loaded image ID: sha256:958de2058835203cf8e82a71640f3bc1e142e104bf4fcaa1dc57584f54e2d98f
Loaded image ID: sha256:e55e9aefc3da90d85c28278843586e561d00e03324bf7ca39cfe62c247d50c34
$ docker images
REPOSITORY          TAG                 IMAGE ID            CREATED             SIZE
<none>              <none>              e55e9aefc3da        292 years ago       44.27 MB
$ tar xf bazel-bin/build/kube-proxy.tar -C /tmp/kp
$ cd /tmp/kp
$ jq <repositories .
{
    "kube-proxy": "d85186a71557febb70d2bc392de5fd0bee4857223ff787366b524eaf2b3a3e7f"
  }
$ jq <manifest.json .
[
  {
    "Config": "aa1ccb4f7c1fd7a98078f0f272ea0f6edbdd71ccc9d0a1ab02b04d5449fbd8e5.json",
    "Layers": [
      "f3046efb26d33baaf970966a5cb49a3cf7149b0b1e0f98a106526530cf0a69f9/layer.tar"
    ]
  },
  {
    "Config": "5a00e6ccb81ef304e1bb9995ea9605f199aa96659a44237d58ca96982daf9af8.json",
    "Layers": [
      "f3046efb26d33baaf970966a5cb49a3cf7149b0b1e0f98a106526530cf0a69f9/layer.tar",
      "ed21564970a3f98b66245d57dae09cb735c52d40f628874483eaf5e87c12242e/layer.tar"
    ],
    "Parent": "sha256:aa1ccb4f7c1fd7a98078f0f272ea0f6edbdd71ccc9d0a1ab02b04d5449fbd8e5"
  },
  {
    "Config": "958de2058835203cf8e82a71640f3bc1e142e104bf4fcaa1dc57584f54e2d98f.json",
    "Layers": [
      "f3046efb26d33baaf970966a5cb49a3cf7149b0b1e0f98a106526530cf0a69f9/layer.tar",
      "ed21564970a3f98b66245d57dae09cb735c52d40f628874483eaf5e87c12242e/layer.tar",
      "2ee72e0b63bcbd9a7702b9bcad8c289b89359ebac449afff875ac8cd32921418/layer.tar"
    ],
    "Parent": "sha256:5a00e6ccb81ef304e1bb9995ea9605f199aa96659a44237d58ca96982daf9af8"
  },
  {
    "Config": "e55e9aefc3da90d85c28278843586e561d00e03324bf7ca39cfe62c247d50c34.json",
    "Layers": [
      "f3046efb26d33baaf970966a5cb49a3cf7149b0b1e0f98a106526530cf0a69f9/layer.tar",
      "ed21564970a3f98b66245d57dae09cb735c52d40f628874483eaf5e87c12242e/layer.tar",
      "2ee72e0b63bcbd9a7702b9bcad8c289b89359ebac449afff875ac8cd32921418/layer.tar",
      "d85186a71557febb70d2bc392de5fd0bee4857223ff787366b524eaf2b3a3e7f/layer.tar"
    ],
    "Parent": "sha256:958de2058835203cf8e82a71640f3bc1e142e104bf4fcaa1dc57584f54e2d98f"
  }
]


With my change, the layers are tagged:
$ docker load -i bazel-bin/build/kube-proxy.tar
34b2007ad853: Loading layer [==================================================>] 1.935 MB/1.935 MB
699123784092: Loading layer [==================================================>]  10.7 MB/10.7 MB
dc67e3957f4f: Loading layer [==================================================>] 3.932 MB/3.932 MB
5d57ee9b0c7c: Loading layer [==================================================>] 28.14 MB/28.14 MB
$ docker images
REPOSITORY                                  TAG                                IMAGE ID            CREATED             SIZE
gcr.io/google_containers/kube-proxy/build   50c9998493d410691898c4ba428c1131   e55e9aefc3da        292 years ago       44.27 MB
$ tar xf bazel-bin/build/kube-proxy.tar -C /tmp/kp
$ cd /tmp/kp
$ jq <manifest.json .
[
  {
    "Config": "aa1ccb4f7c1fd7a98078f0f272ea0f6edbdd71ccc9d0a1ab02b04d5449fbd8e5.json",
    "Layers": [
      "f3046efb26d33baaf970966a5cb49a3cf7149b0b1e0f98a106526530cf0a69f9/layer.tar"
    ]
  },
  {
    "Config": "5a00e6ccb81ef304e1bb9995ea9605f199aa96659a44237d58ca96982daf9af8.json",
    "Layers": [
      "f3046efb26d33baaf970966a5cb49a3cf7149b0b1e0f98a106526530cf0a69f9/layer.tar",
      "ed21564970a3f98b66245d57dae09cb735c52d40f628874483eaf5e87c12242e/layer.tar"
    ],
    "Parent": "sha256:aa1ccb4f7c1fd7a98078f0f272ea0f6edbdd71ccc9d0a1ab02b04d5449fbd8e5"
  },
  {
    "Config": "958de2058835203cf8e82a71640f3bc1e142e104bf4fcaa1dc57584f54e2d98f.json",
    "Layers": [
      "f3046efb26d33baaf970966a5cb49a3cf7149b0b1e0f98a106526530cf0a69f9/layer.tar",
      "ed21564970a3f98b66245d57dae09cb735c52d40f628874483eaf5e87c12242e/layer.tar",
      "2ee72e0b63bcbd9a7702b9bcad8c289b89359ebac449afff875ac8cd32921418/layer.tar"
    ],
    "Parent": "sha256:5a00e6ccb81ef304e1bb9995ea9605f199aa96659a44237d58ca96982daf9af8"
  },
  {
    "Config": "e55e9aefc3da90d85c28278843586e561d00e03324bf7ca39cfe62c247d50c34.json",
    "Layers": [
      "f3046efb26d33baaf970966a5cb49a3cf7149b0b1e0f98a106526530cf0a69f9/layer.tar",
      "ed21564970a3f98b66245d57dae09cb735c52d40f628874483eaf5e87c12242e/layer.tar",
      "2ee72e0b63bcbd9a7702b9bcad8c289b89359ebac449afff875ac8cd32921418/layer.tar",
      "d85186a71557febb70d2bc392de5fd0bee4857223ff787366b524eaf2b3a3e7f/layer.tar"
    ],
    "Parent": "sha256:958de2058835203cf8e82a71640f3bc1e142e104bf4fcaa1dc57584f54e2d98f",
    "RepoTags": [
    ]
  }
]


Notably the RepoTags are missing from the manifest.json currently.

(This snippet also demonstrates my complaint about adding the package name to the repository - we use names like gcr.io/google_containers/kube-proxy, and I'd like to keep the docker_build rules out of the root BUILD file.)
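
As an illustration of the RepoTags point above, here is a small Python sketch that adds `RepoTags` to the top image entry of a manifest.json like the ones listed. It is a sketch under assumptions, not the code in create_image.py or in the linked commit: the helper name `add_repo_tags` is hypothetical, and it assumes the last manifest entry is the top-level image, as in the listings above.

```python
import json
from typing import List


def add_repo_tags(manifest_json: str, repository: str, tags: List[str]) -> str:
    """Return manifest JSON with "RepoTags" set on the last (top) entry."""
    manifest = json.loads(manifest_json)
    if not manifest:
        raise ValueError("manifest has no image entries")
    # Each RepoTags element is a "repository:tag" string.
    manifest[-1]["RepoTags"] = [f"{repository}:{tag}" for tag in tags]
    return json.dumps(manifest, indent=2)
```

Only the top entry gets tagged here; the intermediate entries stay untagged, so a load would name just the final image.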

--
You received this message because you are subscribed to a topic in the Google Groups "bazel-discuss" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/bazel-discuss/l0j4sID3ogE/unsubscribe.
To unsubscribe from this group and all its topics, send an email to bazel-discuss+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/bazel-discuss/CACW46hcacQAVkZqcdRtHVDhLZb31m%2B2H_9%2Bzjz286aOq11NmUw%40mail.gmail.com.

For more options, visit https://groups.google.com/d/optout.

Jeff Grafton

Jan 18, 2017, 6:08:56 PM
to Damien Martin-guillerez, Matthew Moore, Michael Danese, bazel-discuss
On Wed, Jan 18, 2017 at 2:44 PM, 'Damien Martin-guillerez' via bazel-discuss <bazel-...@googlegroups.com> wrote:


> Would it be reasonable to extend the docker_build rule to support tags?

> Totally reasonable, just no time for doing it.

OK, I'll send you a PR or a Gerrit CR. (I guess the Bazel team prefers Gerrit?)
 
 
> Is this something that should be done only if --stamp is given?
>
> I don't know. Does the tag contain nondeterministic information, related to the time of the build for example?

The way I'm using it, the tag is based on an md5sum of a build output, so it should be deterministic. I'm not sure how others might use it.
I basically need something that's dependent on the state of the build tree.
 

>
>
>
> On a similar note, it's a little frustrating (though not a deal-breaker) that docker images don't have a valid created time. I guess this is done for caching reasons, but could we change this behavior if --stamp is given?

> That was the plan, but I don't think we have this information from Skylark right now :( +Laurent Le Brun might know?
>
> Someone sent a code review for this that was pretty good, but he seems to have abandoned it. It was just missing proper tests; see https://cr.bazel.build/6470/

OK, I might look to see if I can follow up on this. It's not a huge priority right now.
 
 
 
>
>
> I stole the "read from file" functionality from labels, since I couldn't think of a better way to use the md5sum of the binaries we're bundling, though it is a little gross.

> That sounds reasonable to me, but I might ask you to change it during code review if you decide to send it.

As noted earlier, I need some way of using the same tag both for the docker image and for a separate file saved elsewhere. There may be other ways to achieve this, but this seemed simplest.
 
 
>
>
> In this commit, I also added an option to disable adding the package name to the repository, since our git tree structure doesn't match our container registry structure.

> The current rule supports forcing the repository name. Why not use that?

The docker_build rule treats the repository attribute as a prefix, not the entire name. It always adds the package name, which is frustrating.
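
A rough model of the naming behavior described here, written as a hypothetical Python helper (the function `image_repository` and its `include_package` parameter are illustrative assumptions, not docker_build attributes): by default the package path is appended to the repository prefix, and the proposed option would drop it.

```python
def image_repository(repository: str, package: str, include_package: bool = True) -> str:
    """Compose the repository name an image would be tagged with."""
    prefix = repository.rstrip("/")
    if include_package and package:
        # Default behavior as described: the package path follows the prefix.
        return f"{prefix}/{package.strip('/')}"
    return prefix
```

With the values from the docker images output earlier in the thread, a prefix of gcr.io/google_containers/kube-proxy and a package of build give gcr.io/google_containers/kube-proxy/build by default, and gcr.io/google_containers/kube-proxy when the package is omitted.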
 
 

Damien Martin-guillerez

Jan 18, 2017, 6:18:07 PM
to Jeff Grafton, Matthew Moore, Michael Danese, bazel-discuss
On Thu, Jan 19, 2017 at 12:08 AM Jeff Grafton <jgra...@google.com> wrote:

> OK, I'll send you a PR or a Gerrit CR. (I guess the Bazel team prefers Gerrit?)

I definitely prefer Gerrit :) but if you are allergic to it we can do the CR on Github too.
 
 
 
> Is this something that should be done only if --stamp is given?
>
> I don't know. Does the tag contain nondeterministic information, related to the time of the build for example?
>
> The way I'm using it, the tag is based on an md5sum of a build output, so it should be deterministic. I'm not sure how others might use it.
> I basically need something that's dependent on the state of the build tree.

Depending on the state of the build tree is OK; that does not need to be treated as stamp information.
 
 


Jeff Grafton

Jan 18, 2017, 6:36:35 PM
to bazel-discuss, jgra...@google.com, matt...@google.com, miked...@google.com
One thing I've noticed while working with the docker_build rule is that the way we (Kubernetes) treat Docker repositories differs from the way Bazel seems to.

For us, we have a separate repo for every binary, e.g.
...
etc

And for each repo, our tags correspond to the version of code, e.g.

It seems the way Bazel wants us to do this, it'd be
  :kube-proxy
  :kube-apiserver
etc

... which doesn't make much sense. I'm not really sure how versioning fits into the way the docker_build rule is currently constructed.
(My change, which adds image_tags, starts to support this, though.)

(And I guess there's the docker run behavior, which also tags things, but not the tarfile? I'm not really sure how that fits in.)

I've used Gerrit before, so not that allergic. :) https://bazel-review.googlesource.com/8370