Hi,
I have been experiencing 'random' crashes of the docker daemon:
Apr 20 08:37:41 minion3 dockerd[1827]: fatal error: unexpected signal during runtime execution
Apr 20 08:37:41 minion3 dockerd[1827]: [signal 0xb code=0x1 addr=0xf0 pc=0xf0]
Apr 20 08:37:41 minion3 dockerd[1827]: runtime stack:
Apr 20 08:37:41 minion3 dockerd[1827]: runtime.gothrow(0x1585a70, 0x2a)
Apr 20 08:37:41 minion3 dockerd[1827]: /usr/lib/go/src/runtime/panic.go:503 +0x8e
Apr 20 08:37:41 minion3 dockerd[1827]: runtime.sigpanic()
Apr 20 08:37:41 minion3 dockerd[1827]: /usr/lib/go/src/runtime/sigpanic_unix.go:14 +0x5e
Apr 20 08:37:41 minion3 dockerd[1827]: goroutine 519 [syscall, locked to thread]:
Apr 20 08:37:41 minion3 dockerd[1827]: runtime.cgocall_errno(0x408cb3, 0xc2089d4730, 0x0)
Apr 20 08:37:41 minion3 dockerd[1827]: /usr/lib/go/src/runtime/cgocall.go:130 +0xf5 fp=0xc2089d4710 sp=0xc2089d46e8
Apr 20 08:37:41 minion3 dockerd[1827]: github.com/docker/docker/daemon/logger/journald._Cfunc_wait_for_data_or_close(0x7f5368000900, 0xc200000043, 0x0)
Apr 20 08:37:41 minion3 dockerd[1827]: github.com/docker/docker/daemon/logger/journald/_obj/_cgo_gotypes.go:164 +0x43 fp=0xc2089d4730 sp=0xc2089d4710
Apr 20 08:37:41 minion3 dockerd[1827]: github.com/docker/docker/daemon/logger/journald.func·001()
Apr 20 08:37:41 minion3 dockerd[1827]: /build/amd64-usr/var/tmp/portage/app-emulation/docker-1.9.1-r3/work/docker-1.9.1/.gopath/src/github.com/docker/docker/daemon/logger/
Apr 20 08:37:41 minion3 dockerd[1827]: runtime.goexit()
...
This led me to assume the crash was a result of the bug
https://github.com/docker/docker/issues/19728
I don't restart journald myself, but it seems that it was temporarily unavailable.
I was able to reproduce the bug by running a noisy, constantly restarting container (it was missing one of its dependencies) and restarting systemd-journald.
My hypothesis was that if I used a log driver other than journald (--log-driver=json-file), docker would no longer depend on journald.
Now if I stop journald, docker keeps running, but as soon as I start journald again, systemd restarts docker.
Then I found
https://bugs.freedesktop.org/show_bug.cgi?id=84923 which shows a simple test that demonstrates the bug, and I can reproduce it on my system.
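As far as I understand that report, the failure is the usual broken-pipe case: a process keeps writing to a stream whose reader has gone away. A minimal Python sketch of that failure mode, using an ordinary pipe as a stand-in for the journald stream (that substitution and the function name are my own assumptions for illustration; the code does not touch journald or docker):

```python
import errno
import os


def write_after_reader_closed() -> str:
    """Write to a pipe whose read end is already closed; return the errno name."""
    read_fd, write_fd = os.pipe()
    os.close(read_fd)  # stand-in for the log reader going away
    try:
        os.write(write_fd, b"log line\n")
    except OSError as exc:
        # CPython ignores SIGPIPE by default, so the failed write
        # surfaces as an exception instead of killing the process.
        return errno.errorcode[exc.errno]
    finally:
        os.close(write_fd)
    return "OK"


if __name__ == "__main__":
    print(write_after_reader_closed())
```

On Linux this reports EPIPE. A process that leaves SIGPIPE at its default disposition would be terminated by the signal instead, which would match a daemon dying when journald comes back.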
Then I found
https://bugzilla.redhat.com/show_bug.cgi?id=1300076 which shows a solution: redirecting the output on stdout and stderr directly to journald with
https://github.com/projectatomic/forward-journald
My question is: what is the best course of action I can take?
I really don't want my database containers to be killed when the logging of another container fails.
DISTRIB_ID=CoreOS
DISTRIB_RELEASE=899.15.0
DISTRIB_CODENAME="Red Dog"
DISTRIB_DESCRIPTION="CoreOS 899.15.0"