Getting started building Docker images


cha...@gmail.com

Feb 23, 2018, 2:46:58 PM
to bazel-discuss
I have spent two days trying to figure out how to build a Docker image with Bazel and have only had limited success. I have been cutting and pasting snippets from various articles, documentation, and any other source I can find. Although I have managed to build and push an image to a registry, it does not work the way I want it to.

I am trying to build a Docker image like the one described in this Dockerfile.

FROM python:2.7

WORKDIR /usr/local/src/heartbeat
ADD requirements.txt /usr/local/src/heartbeat
ADD heartbeat/heartbeat.py /usr/local/src/heartbeat

RUN pip install -r requirements.txt
CMD ["python", "heartbeat.py"]


I have found all sorts of examples that do not seem complete to me. I started with https://bazel.build/, https://github.com/bazelbuild/rules_docker and https://medium.com/bitnami-perspectives/building-docker-images-without-docker-c619061b13a9.

I have stumbled on https://github.com/google/subpar and https://github.com/bazelbuild/rules_docker#py_image but cannot seem to put it all together.

Any recommendations on where to start learning how to build a trivial container with a Python app?

Thanks
Charlie

rodr...@google.com

Feb 26, 2018, 11:54:02 AM
to bazel-discuss
Hey Charlie,

I'd expect that to be possible by following these instructions to get your Python script working in Bazel (without Docker):

https://github.com/bazelbuild/rules_python#setup
https://github.com/bazelbuild/rules_python#importing-pip-dependencies
https://github.com/bazelbuild/rules_python#consuming-pip-dependencies

then, if you switch your py_binary to a py_image as described here:

https://github.com/bazelbuild/rules_docker#py_image

you should be able to run the same script inside a Docker container with (assuming you have `py_image(name="heartbeat_image", ...)`):

bazel run :heartbeat_image

Bazel's support for Python is in an early stage, so you might run into issues. If you do, describe what you did and the exact error you got, and I'll try to help. There are also a few other approaches for using Bazel for Python scripts - try searching the archive of bazel-discuss for more info.

Paul Johnston

Feb 26, 2018, 12:22:43 PM
to bazel-discuss
I would focus your efforts on getting your Python app building and running with Bazel first, and ignore the Docker part for now.

Once you have a functioning py_binary, just substitute it with py_image and it should just work once overlaid onto https://github.com/GoogleCloudPlatform/distroless/tree/master/python2.7 (the py_image default base image).

cha...@gmail.com

Feb 26, 2018, 3:19:50 PM
to bazel-discuss
I managed to get the py_binary to build, but it does not run correctly. If I am in a virtual environment where I have already pip-installed the packages from my requirements.txt file, it runs fine. If I deactivate that environment and try to run it, I get an error.

INFO: Build completed successfully, 1 total action

INFO: Running command line: bazel-bin/heartbeat_bin
Traceback (most recent call last):
  File "/private/var/tmp/_bazel_brgl/ae72ec8531e850566373626f2dd415c9/execroot/__main__/bazel-out/darwin-fastbuild/bin/heartbeat_bin.runfiles/__main__/heartbeat/heartbeat.py", line 1, in <module>
    from snowplow_tracker import Tracker, Emitter, logger
  File "/private/var/tmp/_bazel_brgl/ae72ec8531e850566373626f2dd415c9/execroot/__main__/bazel-out/darwin-fastbuild/bin/heartbeat_bin.runfiles/pypi__snowplow_tracker_0_8_0/snowplow_tracker/__init__.py", line 3, in <module>
    from snowplow_tracker.emitters import logger, Emitter, AsyncEmitter, CeleryEmitter, RedisEmitter
  File "/private/var/tmp/_bazel_brgl/ae72ec8531e850566373626f2dd415c9/execroot/__main__/bazel-out/darwin-fastbuild/bin/heartbeat_bin.runfiles/pypi__snowplow_tracker_0_8_0/snowplow_tracker/emitters.py", line 33, in <module>
    from celery import Celery
  File "/private/var/tmp/_bazel_brgl/ae72ec8531e850566373626f2dd415c9/execroot/__main__/bazel-out/darwin-fastbuild/bin/heartbeat_bin.runfiles/pypi__celery_3_1_11/celery/__init__.py", line 130, in <module>
    from celery import five
  File "/private/var/tmp/_bazel_brgl/ae72ec8531e850566373626f2dd415c9/execroot/__main__/bazel-out/darwin-fastbuild/bin/heartbeat_bin.runfiles/pypi__celery_3_1_11/celery/five.py", line 51, in <module>
    from kombu.five import monotonic
ImportError: No module named kombu.five
ERROR: Non-zero return code '1' from command: Process exited with status 1

I assume that it is not loading all the python packages correctly. That said, when I look through the files in the bazel-out directory, they all seem to be there.

Justine Tunney

Feb 26, 2018, 3:28:03 PM
to cha...@gmail.com, bazel-discuss
Typically what happens at Google is that you have some sort of sh_binary or py_binary that generates not only the Dockerfile, but the source tree as well. The source tree is schlepped into a big symlink tree, called the runfiles dir. With a tiny bit of heroics, your sh_binary or py_binary can grab all those files, create the Dockerfile, and put them all in a .zip file. Then you have a genrule that runs the sh_binary or py_binary.


--
You received this message because you are subscribed to the Google Groups "bazel-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to bazel-discuss+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/bazel-discuss/b650e4e2-070a-484d-b4b5-c8384427021d%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Rodrigo Queiro

Feb 27, 2018, 3:23:57 AM
to cha...@gmail.com, bazel-discuss
Can you post the output of:

    ls bazel-out/darwin-fastbuild/bin/heartbeat_bin.runfiles/
    find bazel-out/darwin-fastbuild/bin/heartbeat_bin.runfiles/pypi__kombu* -name '*.py'

and also your requirements.txt?

It seems that some of the dependencies (snowplow_tracker, celery) are correctly added to runfiles by rules_python, but kombu.five is not. This might be related to https://github.com/bazelbuild/rules_python/issues/14.
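
The same check can be scripted. The following is only a minimal sketch: it assumes the `pypi__<name>_<version>` directory naming that appears in the runfiles paths of the traceback above, and the helper names `runfiles_packages` and `missing_packages` are made up for illustration.

```python
import os
import re


def runfiles_packages(entries):
    """Extract normalized package names from pypi__<name>_<version> entries."""
    names = set()
    for entry in entries:
        # Assumed layout: "pypi__" + name + "_" + version starting with a digit.
        match = re.match(r"pypi__(.+?)_(\d.*)$", entry)
        if match:
            names.add(match.group(1).lower().replace("_", "-"))
    return names


def missing_packages(entries, required):
    """Return the required packages that have no directory in the runfiles tree."""
    present = runfiles_packages(entries)
    return sorted(
        name for name in required
        if name.lower().replace("_", "-") not in present
    )


if __name__ == "__main__":
    # Path taken from the commands above; adjust to your own target.
    runfiles = "bazel-out/darwin-fastbuild/bin/heartbeat_bin.runfiles"
    if os.path.isdir(runfiles):
        print(missing_packages(os.listdir(runfiles), ["celery", "kombu"]))
```

Any name it reports is a package that the Python import machinery will not find at run time, regardless of what is installed in a local virtualenv.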

-- 
Google Germany GmbH | Erika-Mann-Strasse | 80636 Muenchen | Germany

AG Hamburg, HRB 86891 | Sitz der Gesellschaft: Hamburg | Geschäftsführer: Paul Manicle, Halimah DeLaine Prado

Evan Jones

Feb 27, 2018, 9:48:04 AM
to bazel-discuss
In general: Bazel's Python rules don't really work very well with external dependencies, without a whole lot of work. There is work happening at Google to help resolve this, but I suspect it will take a while to get it working. See the threads and documents here if you want more information and/or want to contribute: https://groups.google.com/forum/#!forum/bazel-sig-python


The thing I've had success with is writing my own rules, which generate a zip that includes effectively the entire virtualenv for a Python target. I hope to not have to maintain this once the Bazel Python rules actually work: https://github.com/TriggerMail/rules_pyz


Good luck!

Evan

cha...@gmail.com

Feb 27, 2018, 11:00:41 AM
to bazel-discuss
~/code/snowplow-heartbeat: [bazel-build*]$ ls bazel-out/darwin-fastbuild/bin/heartbeat_bin.runfiles
MANIFEST pypi__PyContracts_1_7_6 pypi__decorator_4_2_1 pypi__pyparsing_2_2_0 pypi__requests_2_2_1
__init__.py pypi__celery_3_1_11 pypi__gevent_1_0_2 pypi__python_dotenv_0_7_1 pypi__six_1_11_0
__main__ pypi__click_6_7 pypi__greenlet_0_4_10 pypi__redis_2_9_1 pypi__snowplow_tracker_0_8_0


~/code/snowplow-heartbeat: [bazel-build*]$ find bazel-out/darwin-fastbuild/bin/heartbeat_bin.runfiles/pypi__kombu* -name '*.py'
find: bazel-out/darwin-fastbuild/bin/heartbeat_bin.runfiles/pypi__kombu*: No such file or directory


~/code/snowplow-heartbeat: [bazel-build*]$ cat requirements.txt
pytest==3.4.0
snowplow_tracker==0.8.0
python-dotenv==0.7.1


Requirements.txt was generated by pipreqs. When I run 'pip install -r requirements.txt' in a new virtual environment, it installs all the necessary packages.
(env) ~/code/snowplow-heartbeat: [bazel-build*]$ pip list --format=columns
Package          Version
---------------- ---------
amqp             1.4.9
anyjson          0.3.3
attrs            17.4.0
billiard         3.3.0.23
celery           3.1.11
certifi          2018.1.18
chardet          3.0.4
click            6.7
decorator        4.2.1
docopt           0.6.2
funcsigs         1.0.2
gevent           1.0.2
greenlet         0.4.10
idna             2.6
kombu            3.0.37
pip              9.0.1
pipreqs          0.4.9
pluggy           0.6.0
py               1.5.2
PyContracts      1.7.6
pyparsing        2.2.0
pytest           3.4.0
python-dotenv    0.7.1
pytz             2018.3
redis            2.9.1
requests         2.2.1
setuptools       38.5.1
six              1.11.0
snowplow-tracker 0.8.0
urllib3          1.22
wheel            0.30.0
yarg             0.1.9

Rodrigo Queiro

Feb 27, 2018, 11:41:42 AM
to Charlie White, bazel-discuss
Thanks for the information!

The issue seems to be that the .whl file for celery==3.1.11 on PyPI only contains a METADATA file (and not metadata.json), and rules_python doesn't properly support the METADATA file (I just filed an issue for this). This means rules_python doesn't fetch celery's dependencies. You can work around the problem by manually adding the dependencies to requirements.txt:

kombu==3.0.37
billiard==3.3.0.23
pytz==2018.3

and also to the `deps` of `:heartbeat`:

    deps = [
        requirement("billiard"),
        requirement("kombu"),
        requirement("pytz"),
        requirement("snowplow_tracker"),
    ],
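
To see which dependencies a wheel declares in its METADATA file, something like the following sketch may help. It relies only on the Python standard library and on METADATA using email-style headers with `Requires-Dist` fields; the function name `wheel_requirements` and the toy wheel contents are made up for illustration, and this is not how rules_python itself reads wheels.

```python
import io
import zipfile
from email.parser import Parser


def wheel_requirements(wheel_file):
    """Return the Requires-Dist entries declared in a wheel's METADATA file."""
    with zipfile.ZipFile(wheel_file) as wheel:
        metadata_names = [
            name for name in wheel.namelist()
            if name.endswith(".dist-info/METADATA")
        ]
        if not metadata_names:
            return []
        text = wheel.read(metadata_names[0]).decode("utf-8")
    # METADATA is a block of RFC 822 style headers, so the email parser can read it.
    headers = Parser().parsestr(text)
    return headers.get_all("Requires-Dist") or []


def make_toy_wheel():
    """Build an in-memory stand-in for a wheel; the contents are invented."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as wheel:
        wheel.writestr(
            "toy-1.0.dist-info/METADATA",
            "Metadata-Version: 2.0\nName: toy\nVersion: 1.0\n"
            "Requires-Dist: kombu (>=3.0)\nRequires-Dist: pytz\n",
        )
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    print(wheel_requirements(make_toy_wheel()))
```

Passing the path of a downloaded .whl file instead of the toy archive lists the requirements that would need to be added by hand as part of the workaround.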

However, you could well be better off with https://github.com/TriggerMail/rules_pyz as Evan suggested - you should try it if you keep having trouble with rules_python.



Evan Jones

Feb 27, 2018, 11:45:28 AM
to Rodrigo Queiro, Charlie White, bazel-discuss
Wow! I didn't know wheels could exclude this file. My rules_pyz will have the same problem: it uses the same whl.py tool to get dependencies from wheels.

Evan



cha...@gmail.com

Feb 27, 2018, 12:58:53 PM
to bazel-discuss
Unfortunately even with those changes I am still seeing the same error.

Charlie White

Feb 27, 2018, 6:07:17 PM
to bazel-discuss
Correction: it does work. I just had to set those dependencies on the py_binary, the actual target I was building. Operator error on my part. Not the worst workaround that I have dealt with, but I would have had no idea how to fix it if it were not for your help.

Now I need to figure out py_image. Apparently it is not a straight swap of py_binary for py_image.

Justine Tunney

Feb 27, 2018, 10:14:31 PM
to Evan Jones, bazel-discuss
> https://github.com/TriggerMail/rules_pyz

I wrote Bazel's code for downloading files. See ed7ced0018dc5c5ebd6fc8afc7158037ac1df00d. It's designed to be carrier grade (see also). Why reinvent this?

Here's how I import Java Maven packages into Bazel: https://youtu.be/xdMDuhJTKMI See source code and README explanation.

Long configs with pinning are painful, but worthwhile. See motivational reading.


Justine Tunney

Feb 27, 2018, 10:27:39 PM
to Evan Jones, bazel-discuss
> The thing I've had success with is writing my own rules, which generates a zip […]

Here's a Skylark rule named zip_file() that generalizes zip file creation in Bazel. It has zero dependencies. The only thing it requires is Skylark and @bazel_tools//tools/zip:zipper (which comes included in Bazel). Here are examples of it being used to create App Engine deploy .war files, with web server assets.


Rodrigo Queiro

Feb 28, 2018, 3:55:51 AM
to Charlie White, bazel-discuss
On Wed, Feb 28, 2018 at 12:07 AM Charlie White <cha...@gmail.com> wrote:
> Correction.  It does work, I just had to set those dependencies on the py_binary, the actual target I was building.  Operator error on my part.  Not the worst workaround that I have dealt with, but I would have had no idea on how to fix it if it were not for your help.
>
> Now I need to figure out the py_image.  It is not, apparently, a straight change of py_binary to py_image

Glad to hear you made progress! Let us know how you get on with switching to py_image. 

Paul Johnston

Feb 28, 2018, 10:53:41 AM
to bazel-discuss
Good to know about that rule, thanks @jart

Charlie White

Feb 28, 2018, 11:53:32 AM
to bazel-discuss
I have successfully created an image with py_image and have been able to push it to a GitLab registry with container_push. I was able to run that Docker image with one issue: environment variables. In my original Dockerfile I set an environment variable that is used by the application. I have not yet figured out how to set that variable in the image as I could with the docker build command. When I deploy it to a Kubernetes pod, I will specify the proper value depending on the environment it is running in. What I would like to do is set a default value in the container so it will execute without error. I have added a default value to the Python code, but would rather set it at the container level. One shouldn't have to modify code to change an environment variable default. I'm sure that it is "easy" to do; I just haven't found that option yet.
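
For reference, the code-level fallback described above usually looks something like this sketch. The variable name `HEARTBEAT_COLLECTOR` and its default value are hypothetical, since the thread does not name the real ones.

```python
import os

# Hypothetical variable name and default; the thread does not give the real ones.
ENV_VAR = "HEARTBEAT_COLLECTOR"
DEFAULT_COLLECTOR = "localhost:8080"


def collector_endpoint(environ=None):
    """Return the configured endpoint, falling back to the default when unset."""
    if environ is None:
        environ = os.environ
    return environ.get(ENV_VAR, DEFAULT_COLLECTOR)


if __name__ == "__main__":
    print(collector_endpoint())
```

The drawback is the one already noted: the default lives in application code rather than in the image configuration.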

After that, I am going to tackle running tests against the container. Ultimately I am trying to get a CI/CD system running that will build the Docker image, execute tests in a container running that image, tag and push the image to the GitLab registry, and finally deploy the image to a K8s cluster. I feel that I am really close to accomplishing this goal and hope to have CI/CD running by EOD.

Thanks to all for all the help and suggestions.

Rodrigo Queiro

Feb 28, 2018, 12:01:17 PM
to Charlie White, bazel-discuss
The only way I know is to use container_image to create a layer with the envvar (using its `env` parameter), and either pass that to py_image's `base` parameter, or pass the py_image as the container_image's `base` parameter.
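
A rough sketch of the second variant (a container_image layered on top of the py_image) is shown below as a BUILD file fragment. The load paths are the ones rules_docker documented around this time and may differ in other versions; the target names, source paths, variable name and value are illustrative assumptions, and the `deps` are omitted.

```python
# BUILD fragment (Starlark). All names and values below are illustrative assumptions.
load("@io_bazel_rules_docker//container:container.bzl", "container_image")
load("@io_bazel_rules_docker//python:image.bzl", "py_image")

py_image(
    name = "heartbeat_image",
    srcs = ["heartbeat/heartbeat.py"],
    main = "heartbeat/heartbeat.py",
    # deps = [...] as on the working py_binary
)

container_image(
    name = "heartbeat_image_with_env",
    base = ":heartbeat_image",
    # Default value baked into the image; a deployment can still override it.
    env = {"HEARTBEAT_COLLECTOR": "localhost:8080"},
)
```

With this arrangement, the target to push or deploy would be the outer one (`:heartbeat_image_with_env` here) rather than the py_image itself.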


Evan Jones

Mar 1, 2018, 3:48:36 PM
to Justine Tunney, bazel-discuss
On Tue, Feb 27, 2018 at 10:13 PM, Justine Tunney <ja...@google.com> wrote:
> https://github.com/TriggerMail/rules_pyz
>
> I wrote Bazel's code for downloading files. See ed7ced0018dc5c5ebd6fc8afc7158037ac1df00d. It's designed to be carrier grade (see also). Why reinvent this?

Thanks for the feedback! This is a pretty horrific hack so I apologize in advance if you looked at it. :) However, pip_generate generates a .bzl file with a list of things for Bazel to download later. I believe that means it needs to run outside of Bazel, since bazel stuff can't generate BUILD rules if I understand it correctly. This is actually just downloading the file to be able to read what other things it depends on, and to generate the URL/hash for Bazel to download later. It then discards the file, after extracting the information out of it. In the build, it is the files downloaded by Bazel that are actually used. It is inspired by how Bazel Deps works: https://github.com/johnynek/bazel-deps

I think this is actually the same purpose and motivation for why your tool downloads the URLs, if I understand it correctly? (I may not)




> Long configs with pinning are painful, but worthwhile. See motivational reading.

Heck yes I agree! See the output generated by pip_generate as an example: https://github.com/TriggerMail/rules_pyz_example/blob/master/third_party/pypi/pypi_rules.bzl
 
Evan

Evan Jones

Mar 1, 2018, 3:52:41 PM
to Justine Tunney, bazel-discuss
On Tue, Feb 27, 2018 at 10:26 PM, Justine Tunney <ja...@google.com> wrote:
> The thing I've had success with is writing my own rules, which generates a zip […]

> Here's a Skylark rule named zip_file() that generalizes zip file creation in Bazel. It has zero dependencies. The only thing it requires is Skylark and @bazel_tools//tools/zip:zipper (which comes included in Bazel). Here are examples of it being used to create App Engine deploy .war files, with web server assets.

This may be another part where I don't understand Bazel: I actually need to merge a set of input zip files (Python wheels) together into a single output. It seems to me that the "merge" part may not be trivial, but I don't recall why I found it hard. My recollection is that Bazel rules need to explicitly list the paths of all their outputs, so you can't trivially just unzip a zip into an output directory, then say that another rule depends on everything in that output directory or something.
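
Outside of Bazel, the merge step itself can be sketched in a few lines of Python. This is only an illustration of combining several archives into one, not the Go tool mentioned below and not zip_file(); the "first archive wins" policy for duplicate paths is an assumption made for the example.

```python
import io
import zipfile


def merge_zips(inputs, output):
    """Copy every entry of each input archive into one output archive.

    The first archive to provide a given path wins; later duplicates are skipped.
    Returns the sorted list of paths written.
    """
    seen = set()
    with zipfile.ZipFile(output, "w", zipfile.ZIP_DEFLATED) as merged:
        for source in inputs:
            with zipfile.ZipFile(source) as archive:
                for name in archive.namelist():
                    if name in seen:
                        continue
                    seen.add(name)
                    merged.writestr(name, archive.read(name))
    return sorted(seen)


def make_zip(files):
    """Build an in-memory archive from a {path: text} mapping (demo helper)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for path, text in files.items():
            archive.writestr(path, text)
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    first = make_zip({"a/__init__.py": "# first", "a/x.py": "x = 1"})
    second = make_zip({"a/__init__.py": "# second", "b.py": "b = 2"})
    print(merge_zips([first, second], io.BytesIO()))
```

Duplicate paths such as shared `__init__.py` files are the main thing a merge policy has to decide on.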

I should make another attempt to build an output directory tree with symlinks for all the files. There was some reason that seemed hard too, but again I don't recall why.

I wrote this Go tool in "anger" since it is a tool I'm much more familiar with than Skylark/Bazel :)

Evan

Justine Tunney

Mar 1, 2018, 7:57:50 PM
to Evan Jones, bazel-discuss
> input zip files (Python wheels) together into a single output

zip_file() does that.

david.o...@gmail.com

Mar 2, 2018, 1:45:29 AM
to bazel-discuss
On Friday, March 2, 2018 at 1:57:50 AM UTC+1, Justine Tunney wrote:
> > input zip files (Python wheels) together into a single output
>
>
> zip_file() does that.

Why is it not part of Bazel, or at least bazel-skylib?
Note that Buck offers a zip_file() rule out of the box: [1].

[1] https://buckbuild.com/rule/zip_file.html