Docker container terminating before ExecStop

449 views
Skip to first unread message

Melvin Lim

unread,
Oct 30, 2014, 5:46:15 PM10/30/14
to coreo...@googlegroups.com
Hi folks,

I'm new to CoreOS and I'm trying to setup my dev environment using Vagrant and coreos-beta (I also tried alpha).

I have one docker container running Postgres, and it doesn't shut down cleanly if I do vagrant halt or reload.

If I do fleetctl stop and start, then it's okay.

Here's my unit file:

[Unit]
Description=App PostgreSQL Service
After=docker.service
After=opt-data.mount
Requires=docker.service
Requires=opt-data.mount

[Service]
TimeoutStartSec=0
ExecStartPre=-/usr/bin/docker kill "app-psql"
ExecStartPre=-/usr/bin/docker rm "app-psql"
ExecStartPre=/usr/bin/docker pull paintedfox/postgresql
ExecStart=/usr/bin/docker run -p 5432:5432 --name="app-psql" -v /opt/data/app-psql/data:/data -v /opt/data/app-psql/logs:/var/log/postgresql -e USER="super" -e PASS=XXXXX paintedfox/postgresql
ExecStop=/usr/bin/docker ps
ExecStop=/usr/bin/docker stop "app-psql"

You see that the I put two ExecStop commands: docker ps and docker stop

Here's the output from journalctl in reverse order (-r). I think the problem is the container is exiting before it executes the ExecStop, that's why the docker ps is showing blank.

Oct 30 20:58:50 core-01 systemd[1]: Stopped app-psql.service.
Oct 30 20:58:50 core-01 docker[1225]: app-psql
Oct 30 20:58:50 core-01 docker[1205]: CONTAINER ID        IMAGE               COMMAND             CREATED             ST
Oct 30 20:58:50 core-01 systemd[1]: app-psql.service: main process exited, code=exited, status=255/n/a
Oct 30 20:58:49 core-01 systemd[1]: Stopping App PostgreSQL Service...

Here's the more complete journalct log:

Oct 30 20:39:29 core-01 systemd[1]: Stopping Docker Application Container Engine...
Oct 30 20:39:29 core-01 systemd[1]: Unit app-psql.service entered failed state.
Oct 30 20:39:29 core-01 systemd[1]: Stopped App PostgreSQL Service.
Oct 30 20:39:29 core-01 docker[1059]: app-psql
Oct 30 20:39:29 core-01 docker[604]: [697d2d23] -job stop(app-psql) = ERR (1)
Oct 30 20:39:29 core-01 docker[604]: Container already stopped
Oct 30 20:39:29 core-01 docker[604]: [697d2d23] +job stop(app-psql)
Oct 30 20:39:29 core-01 docker[604]: [info] POST /v1.14/containers/app-psql/stop?t=10
Oct 30 20:39:29 core-01 docker[1040]: CONTAINER ID        IMAGE               COMMAND             CREATED             ST
Oct 30 20:39:29 core-01 fleetd[566]: INFO manager.go:231: Removing systemd unit app-psql.service
Oct 30 20:39:29 core-01 fleetd[566]: INFO manager.go:89: Triggered systemd unit app-psql.service stop: job=1786
Oct 30 20:39:29 core-01 docker[604]: [697d2d23] -job containers() = OK (0)
Oct 30 20:39:29 core-01 docker[604]: [697d2d23] +job containers()
Oct 30 20:39:29 core-01 docker[604]: [info] GET /v1.14/containers/json
Oct 30 20:39:29 core-01 systemd[1]: Stopped OpenSSH per-connection server daemon (10.0.2.2:56040).
Oct 30 20:39:29 core-01 systemd[1]: app-psql.service: main process exited, code=exited, status=255/n/a
Oct 30 20:39:29 core-01 systemd[1]: Stopped Getty on tty1.
Oct 30 20:39:29 core-01 systemd[1]: Stopped Cluster reboot manager.
Oct 30 20:39:29 core-01 systemd[1]: Stopped Network Name Resolution.
Oct 30 20:39:29 core-01 systemd[1]: Stopped Login Service.
Oct 30 20:39:29 core-01 systemd[1]: Stopped Network Time Service.
Oct 30 20:39:29 core-01 systemd[1]: Stopping App PostgreSQL Service...
Oct 30 20:39:29 core-01 systemd[1]: Removed slice system-system\x2dcloudinit.slice.
Oct 30 20:39:29 core-01 locksmithd[504]: Received interrupt/termination signal - shutting down.
Oct 30 20:39:29 core-01 systemd[1]: Stopping system-system\x2dcloudinit.slice.
Oct 30 20:39:29 core-01 systemd[1]: Stopped Load cloud-config from /usr/share/oem/cloud-config.yml.
Oct 30 20:39:29 core-01 systemd[1]: Stopping Load cloud-config from /usr/share/oem/cloud-config.yml...
Oct 30 20:39:29 core-01 systemd[1]: Stopped Modifies /etc/environment for CoreOS.
Oct 30 20:39:29 core-01 docker[604]: [697d2d23] -job release_interface(ded9d15b168e72659d5971e3e124c74656103b77f30aed4a4
Oct 30 20:39:29 core-01 docker[604]: [697d2d23] -job wait(ded9d15b168e72659d5971e3e124c74656103b77f30aed4a46cf546bcbbf23
Oct 30 20:39:29 core-01 docker[604]: [697d2d23] +job wait(ded9d15b168e72659d5971e3e124c74656103b77f30aed4a46cf546bcbbf23
Oct 30 20:39:29 core-01 docker[604]: [info] POST /v1.14/containers/ded9d15b168e72659d5971e3e124c74656103b77f30aed4a46cf5
Oct 30 20:39:29 core-01 docker[604]: [697d2d23] -job attach(ded9d15b168e72659d5971e3e124c74656103b77f30aed4a46cf546bcbbf
Oct 30 20:39:29 core-01 docker[604]: [697d2d23] +job release_interface(ded9d15b168e72659d5971e3e124c74656103b77f30aed4a4
Oct 30 20:39:29 core-01 docker[604]: [697d2d23] -job log(die, ded9d15b168e72659d5971e3e124c74656103b77f30aed4a46cf546bcb
Oct 30 20:39:29 core-01 docker[604]: [697d2d23] +job log(die, ded9d15b168e72659d5971e3e124c74656103b77f30aed4a46cf546bcb
Oct 30 20:39:29 core-01 fleetd[566]: ERROR event.go:91: etcd watcher {Watch /_coreos.com/fleet/job} returned error: canc
Oct 30 20:39:29 core-01 fleetd[566]: INFO client.go:278: Failed getting response from http://localhost:4001/: cancelled
Oct 30 20:39:29 core-01 fleetd[566]: ERROR event.go:91: etcd watcher {Watch /_coreos.com/fleet/job} returned error: canc
Oct 30 20:39:29 core-01 fleetd[566]: INFO client.go:278: Failed getting response from http://localhost:4001/: cancelled
Oct 30 20:39:29 core-01 fleetd[566]: INFO fleet.go:89: Gracefully shutting down
Oct 30 20:39:29 core-01 systemd[1]: Stopping Modifies /etc/environment for CoreOS...
Oct 30 20:39:29 core-01 systemd[1]: Stopped Load cloud-config from /var/tmp/hostname.yml.
Oct 30 20:39:29 core-01 ntpd[460]: ntpd exiting on signal 15
Oct 30 20:39:29 core-01 kernel: docker0: port 1(veth6cdf) entered disabled state
Oct 30 20:39:29 core-01 kernel: device veth6cdf left promiscuous mode
Oct 30 20:39:29 core-01 kernel: docker0: port 1(veth6cdf) entered disabled state
Oct 30 20:39:29 core-01 systemd[1]: Stopping Load cloud-config from /var/tmp/hostname.yml...
Oct 30 20:39:29 core-01 systemd[1]: Stopped Load cloud-config from /var/tmp/networks.yml.
Oct 30 20:39:29 core-01 systemd[1]: Stopping Load cloud-config from /var/tmp/networks.yml...
Oct 30 20:39:29 core-01 systemd[1]: Stopped target Load system-provided cloud configs.
Oct 30 20:39:29 core-01 systemd[1]: Stopping Load system-provided cloud configs.
Oct 30 20:39:29 core-01 systemd[1]: Removed slice system-user\x2dcloudinit.slice.
Oct 30 20:39:29 core-01 systemd[1]: Stopping system-user\x2dcloudinit.slice.
Oct 30 20:39:29 core-01 systemd[1]: Stopped Load cloud-config from /var/lib/coreos-vagrant/vagrantfile-user-data.
Oct 30 20:39:29 core-01 systemd[1]: Stopping Load cloud-config from /var/lib/coreos-vagrant/vagrantfile-user-data...
Oct 30 20:39:29 core-01 systemd[1]: Stopped target Load user-provided cloud configs.
Oct 30 20:39:29 core-01 systemd[1]: Stopping Load user-provided cloud configs.
Oct 30 20:39:29 core-01 systemd[1]: Stopping Login Service...
Oct 30 20:39:29 core-01 systemd[1]: Stopping Cluster reboot manager...
Oct 30 20:39:29 core-01 systemd-networkd[763]: docker0         : lost carrier
Oct 30 20:39:29 core-01 systemd-networkd[763]: veth6cdf        : could not find udev device
Oct 30 20:39:29 core-01 systemd-networkd[763]: veth6cdf        : lost carrier
Oct 30 20:39:29 core-01 systemd[1]: Stopping Network Time Service...
Oct 30 20:39:29 core-01 systemd[1]: Stopping Network Name Resolution...
Oct 30 20:39:29 core-01 systemd[1]: Stopping Getty on tty1...
Oct 30 20:39:29 core-01 systemd[1]: Stopped target Login Prompts.
Oct 30 20:39:29 core-01 systemd[1]: Stopping Login Prompts.
Oct 30 20:39:29 core-01 systemd[1]: Stopped target Multi-User System.
Oct 30 20:39:29 core-01 systemd[1]: Stopping Multi-User System.
Oct 30 20:39:29 core-01 systemd[1]: Stopping fleet daemon...
Oct 30 20:39:29 core-01 systemd[1]: Stopping OpenSSH per-connection server daemon (10.0.2.2:56040)...
Oct 30 20:39:29 core-01 systemd[1]: Removed slice system-addon\x2dconfig.slice.
Oct 30 20:39:29 core-01 systemd[1]: Stopping system-addon\x2dconfig.slice.
Oct 30 20:39:29 core-01 systemd[1]: Removed slice system-addon\x2drun.slice.
Oct 30 20:39:29 core-01 systemd[1]: Stopping system-addon\x2drun.slice.
Oct 30 20:39:29 core-01 systemd[1]: Stopped docker container ded9d15b168e72659d5971e3e124c74656103b77f30aed4a46cf546bcbb
Oct 30 20:39:29 core-01 systemd[1]: Stopping docker container ded9d15b168e72659d5971e3e124c74656103b77f30aed4a46cf546bcb
Oct 30 20:39:29 core-01 sudo[1022]: core : TTY=unknown ; PWD=/home/core ; USER=root ; COMMAND=/bin/bash -l
Oct 30 20:39:28 core-01 sshd[984]: Accepted publickey for core from 10.0.2.2 port 56040 ssh2: RSA dd:3b:b8:2e:85:04:06:e
Oct 30 20:39:28 core-01 sshd[984]: SSH: Server;Ltype: Authname;Remote: 10.0.2.2-56040;Name: core [preauth]
Oct 30 20:39:28 core-01 sshd[984]: SSH: Server;Ltype: Kex;Remote: 10.0.2.2-56040;Enc: aes128-cbc;MAC: hmac-sha1;Comp: no


Can you guys please give me a hint how to fix this?

Thanks,

Melvin

Brandon Philips

unread,
Feb 3, 2015, 11:56:28 PM2/3/15
to Melvin Lim, coreos-user
On Thu, Oct 30, 2014 at 2:46 PM, Melvin Lim <mlv...@gmail.com> wrote:
> Can you guys please give me a hint how to fix this?

It was a bug in both systemd and docker. We hunted it down and the
full journey can be found here:
https://github.com/coreos/bugs/issues/186

Once this patch lands in libcontainer we should be able to get this
all into a CoreOS release:
https://github.com/docker/libcontainer/pull/359

Thanks,

Brandon
Reply all
Reply to author
Forward
0 new messages