Hang on 'build' using python API client

333 views
Skip to first unread message

Marcus Cobden

unread,
Sep 17, 2014, 9:25:36 AM9/17/14
to docke...@googlegroups.com
I'm quite regularly experiencing hangs when calling .build(...) from the python client.
The client hangs when I try to consume from the stream generator.

Does anyone have any suggestions as to why this might be hanging?

The call I'm making is the following:

client.build(
            tag=self._image_tag,
            path=self._repos_dir,
            rm=True,
            stream=True)

I see these logs from my worker & the docker daemon:

[2014-09-17 12:20:41,009: DEBUG/Worker-5] "POST /run/docker.sock/v1.12/build?q=False&rm=True&t=job_54197c973aedca001425cec5_img&nocache=False HTTP/1.1" 200 None
[debug] server.go:1046 Calling GET /_ping
[info] GET /v1.12/_ping
[debug] server.go:1046 Calling POST /build
[info] POST /v1.12/build?q=False&rm=True&t=job_54197c973aedca001425cec5_img&nocache=False
[c5bbc5f0] +job build()
[debug] archive.go:90 [tar autodetect] n: [68 111 99 107 101 114 102 105 108 101]
[debug] archive.go:90 [tar autodetect] n: [68 111 99 107 101 114 102 105 108 101]
[debug] dispatchers.go:193 Command to be executed: [/bin/sh -c set -e; export DEBIAN_FRONTEND=noninteractive; sed -E -i 's|archive\.ubuntu\.com|rdg-linux.cisco.com|g' /etc/apt/sources.list /etc/apt/sources.list.d/proposed.list; apt-get -q update; apt-get -q install -y --no-install-recommends python python-dev python-pip build-essential git; rm -rf /var/lib/apt/lists/*]
[debug] internals.go:361 [BUILDER] Use cached version
[debug] internals.go:361 [BUILDER] Use cached version
[c5bbc5f0] -job build() = OK (0)

Jérôme Petazzoni

unread,
Sep 18, 2014, 10:28:34 AM9/18/14
to Marcus Cobden, docker-dev
What version of Docker are you using? With which backend? Can you reproduce the issue with a simple Dockerfile?

--
You received this message because you are subscribed to the Google Groups "docker-dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to docker-dev+...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.



--

Marcus

unread,
Sep 18, 2014, 11:47:10 AM9/18/14
to Jérôme Petazzoni, docker-dev
docker-py: 0.5.0

docker version                             
Client version: 1.2.0
Client API version: 1.14
Go version (client): go1.3.1
Git commit (client): fa7b24f
OS/Arch (client): linux/amd64
Server version: 1.2.0
Server API version: 1.14
Go version (server): go1.3.1
Git commit (server): fa7b24f

default docker backend

Dockerfile is as follows:

FROM ubuntu:14.04
RUN set -e; \
    export DEBIAN_FRONTEND=noninteractive; \

    apt-get -q update; \
    apt-get -q install -y --no-install-recommends \
        python \
        python-dev \
        python-pip \
        build-essential \
        git; \
    rm -rf /var/lib/apt/lists/*
CMD true

Jérôme Petazzoni

unread,
Sep 18, 2014, 12:04:40 PM9/18/14
to Marcus, docker-dev
Great! Could you also post a small Python file + requirements.txt to reproduce, and let us know how often it happens (roughly)? 

Thanks!

Marcus

unread,
Sep 18, 2014, 1:18:57 PM9/18/14
to Jérôme Petazzoni, docker-dev
Just run test.sh; should hang within ~60s! (After all the building is cached)

Sven Dowideit

unread,
Sep 19, 2014, 12:59:07 AM9/19/14
to docke...@googlegroups.com, jerome.p...@docker.com
I cloned, cd'd into the dir, then ran ./test.sh after changing the #! to /bin/sh

and it didn't hang.

it did however fail to run the tester image - as it appears not to exist.

So running the initial build -t tester manually and then running the final one results in lots of continuous output:


[cf3147b8] -job log(untag, 8336b5d9a7d97337416c41e6ec012df3c52b9cb74a6a2672fd55c72480900907, ) = OK (0)
[cf3147b8] -job image_delete(img_1411102616.75) = OK (0)
[debug] server.go:1036 Calling POST /build
[info] POST /v1.12/build?q=False&rm=True&t=img_1411102616.86&nocache=False
[cf3147b8] +job build()
[debug] archive.go:88 [tar autodetect] n: [68 111 99 107 101 114 102 105 108 101]
[debug] archive.go:88 [tar autodetect] n: [68 111 99 107 101 114 102 105 108 101]
[debug] build.go:382 Error unmarshalling: invalid character 's' looking for beginning of value, setting to /bin/sh -c
[debug] build.go:292 Command to be executed: [/bin/sh -c set -e;     export DEBIAN_FRONTEND=noninteractive;     apt-get -q update;     apt-get -q install -y --no-install-recommends         python         python-dev         python-pip         build-essential         git;     rm -rf /var/lib/apt/lists/*]
[debug] build.go:266 [BUILDER] Use cached version
[debug] build.go:382 Error unmarshalling: json: cannot unmarshal bool into Go value of type []string, setting to /bin/sh -c
[debug] build.go:266 [BUILDER] Use cached version
[cf3147b8] -job build() = OK (0)
[debug] server.go:1036 Calling DELETE /images/{name:.*}
[info] DELETE /v1.12/images/img_1411102616.86?noprune=True&force=False
[cf3147b8] +job image_delete(img_1411102616.86)
[cf3147b8] +job log(untag, 8336b5d9a7d97337416c41e6ec012df3c52b9cb74a6a2672fd55c72480900907, )
[cf3147b8] -job log(untag, 8336b5d9a7d97337416c41e6ec012df3c52b9cb74a6a2672fd55c72480900907, ) = OK (0)
[cf3147b8] -job image_delete(img_1411102616.86) = OK (0)
Context upload complete. Build started
{u'stream': u' ---> 826544226fdc\n'}
{u'stream': u'Step 1 : RUN set -e;     export DEBIAN_FRONTEND=noninteractive;     apt-get -q update;     apt-get -q install -y --no-install-recommends         python         python-dev         python-pip         build-essential         git;     rm -rf /var/lib/apt/lists/*\n'}
{u'stream': u' ---> Using cache\n'}
{u'stream': u' ---> 7f4143a10733\n'}
{u'stream': u'Step 2 : CMD true\n'}
{u'stream': u' ---> Using cache\n'}
{u'stream': u' ---> 8336b5d9a7d9\n'}
{u'stream': u'Successfully built 8336b5d9a7d9\n'}
Build complete.
+ :
+ python /tester/test.py
Context upload complete. Build started
{u'stream': u' ---> 826544226fdc\n'}
{u'stream': u'Step 1 : RUN set -e;     export DEBIAN_FRONTEND=noninteractive;     apt-get -q update;     apt-get -q install -y --no-install-recommends         python         python-dev         python-pip         build-essential         git;     rm -rf /var/lib/apt/lists/*\n'}
{u'stream': u' ---> Using cache\n'}
{u'stream': u' ---> 7f4143a10733\n'}
{u'stream': u'Step 2 : CMD true\n'}
{u'stream': u' ---> Using cache\n'}
{u'stream': u' ---> 8336b5d9a7d9\n'}
{u'stream': u'Successfully built 8336b5d9a7d9\n'}
Build complete.
+ :

So it looks like your example isn't hanging on my boot2docker 1.2.0 system

could you please send the output of your `docker info`?

can you try b2d?


Sven

Marcus

unread,
Sep 19, 2014, 4:13:44 AM9/19/14
to Sven Dowideit, docker-dev, Jérôme Petazzoni
docker info:

Containers: 0
Images: 176
Storage Driver: aufs
 Root Dir: /var/lib/docker/aufs
 Dirs: 176
Execution Driver: native-0.2
Kernel Version: 3.13.0-24-generic
Operating System: Ubuntu 14.04.1 LTS
Debug mode (server): true
Debug mode (client): false
Fds: 9
Goroutines: 10
EventsListeners: 0
Init Path: /usr/bin/docker
WARNING: No swap limit support

I haven't had a chance to try b2d yet, but given this seems to be a race condition-based deadlock I suspect that spinning up a local VM would change the performance characteristics somewhat.
FWIW I'm running this system from an SSD, on an i7-4600U CPU.


--
You received this message because you are subscribed to a topic in the Google Groups "docker-dev" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/docker-dev/fhrTqwF7IJQ/unsubscribe.
To unsubscribe from this group and all its topics, send an email to docker-dev+...@googlegroups.com.

Jérôme Petazzoni

unread,
Sep 19, 2014, 2:32:10 PM9/19/14
to Marcus, Sven Dowideit, docker-dev
I let it running for an hour this morning, and it didn't hang.
Kernel 3.14, Docker 1.2, aufs storage, native exec, on 4-core phys machine.

Marcus

unread,
Sep 23, 2014, 8:19:17 AM9/23/14
to Jérôme Petazzoni, Sven Dowideit, docker-dev
Hmm, frustrating. It still hangs for me.

Here's a recording of the output:
http://showterm.io/74eeda21fee65bc82197c

I've added a "docker info" every 5 seconds, so that's what's still running at the end.

(sorry for the duplicate, forgot to reply-all first time)

Marcus

unread,
Oct 3, 2014, 8:37:35 AM10/3/14
to Jérôme Petazzoni, Sven Dowideit, docker-dev
I've dug into this a bit more; I've verified that there's nothing wrong in the python client's _stream_helper method:

I removed the socket_fp.makefile call, and moved to using sock_fp.recv(4096), and batching up lines manually.
In requests where it hangs, NOTHING is ever successfully read from the socket.

I've also tried out a global (using fnctl on a an open file) mutex around the build and remove_image calls, and it still hangs on build sometimes.
I'm going to asusme the remove_image calls are not the problem.

So, my current reasoning about where the problem could be:
1. Python client is doing something wrong with unix sockets & HTTP
2. Build api call has a random deadlock in it somewhere.
3. Something is broken when running docker-in-docker
4. My configuration for running docker-in-docker has some mistake in it causing random lockups

Any help would be greatly appreciated!

Thanks,
Marcus

Marcus

unread,
Oct 3, 2014, 11:08:02 AM10/3/14
to Jérôme Petazzoni, Sven Dowideit, docker-dev
I stuck my application under strace, and it looks like the problem is #1:
The python client seems to be dropping some buffered content in between parsing the HTTP headers and grabbing the raw socket for streaming mode.

Marcus

unread,
Oct 3, 2014, 5:01:27 PM10/3/14
to Jérôme Petazzoni, Sven Dowideit, docker-dev

Jérôme Petazzoni

unread,
Oct 3, 2014, 5:10:58 PM10/3/14
to Marcus, Sven Dowideit, docker-dev
Well played. Thanks for tracking this down!
Reply all
Reply to author
Forward
0 new messages