unit restarts forever when run via fleet

Brian Lalor

Aug 7, 2015, 4:47:18 PM
to coreos-dev
I'm having a fleet/systemd problem I could use a hand with.  I'm running the latest fleet on CentOS 7, which uses systemd 208.  I have a unit file that runs a docker container, using the standard pattern in the documentation (pull, kill, rm, run) and it works fine when run directly via systemd. But when the unit's deployed via fleet, it constantly restarts.  I can't figure out what's different.  I have a hunch that it’s some interaction between fleetd and systemd that causes systemd to relaunch the unit when polling for status, but I’m really at a loss to troubleshoot this.  Any suggestions?

Thanks,
Brian

— 
Brian Lalor

Brandon Philips

Aug 7, 2015, 5:46:30 PM
to coreos-dev
Can you paste your unit file here?

Brian Lalor

Aug 8, 2015, 8:33:32 AM
to coreo...@googlegroups.com
Hm, I suppose that would have been good info to provide up front…

Here’s a simplified scenario:

[(master) hackathon2015]> cat fleet-units/sleep-forever.service 
[Unit]
Description=just a sleeper

[Service]
ExecStart=/bin/sh -c "echo 'hello world'; while sleep 10; do echo 'hi'; done; echo 'goodbye world'"

[X-Fleet]
MachineMetadata=role=docker

Then I do "fleetctl start sleep-forever.service".  "journalctl -f -u sleep-forever" shows this:

Aug 08 12:26:05 docker-006 systemd[1]: [/run/fleet/units/sleep-forever.service:7] Unknown section 'X-Fleet'. Ignoring.
Aug 08 12:26:05 docker-006 systemd[1]: [/run/fleet/units/sleep-forever.service:8] Assignment outside of section. Ignoring.
Aug 08 12:26:05 docker-006 systemd[1]: [/run/fleet/units/sleep-forever.service:7] Unknown section 'X-Fleet'. Ignoring.
Aug 08 12:26:05 docker-006 systemd[1]: [/run/fleet/units/sleep-forever.service:8] Assignment outside of section. Ignoring.
Aug 08 12:26:05 docker-006 systemd[1]: Starting just a sleeper...
Aug 08 12:26:05 docker-006 systemd[1]: Started just a sleeper.
Aug 08 12:26:05 docker-006 sh[1112]: hello world
Aug 08 12:26:15 docker-006 systemd[1]: Stopping just a sleeper...
Aug 08 12:26:15 docker-006 systemd[1]: Stopped just a sleeper.
Aug 08 12:26:15 docker-006 systemd[1]: [/run/fleet/units/sleep-forever.service:7] Unknown section 'X-Fleet'. Ignoring.
Aug 08 12:26:15 docker-006 systemd[1]: [/run/fleet/units/sleep-forever.service:8] Assignment outside of section. Ignoring.
Aug 08 12:26:25 docker-006 systemd[1]: [/run/fleet/units/sleep-forever.service:7] Unknown section 'X-Fleet'. Ignoring.
Aug 08 12:26:25 docker-006 systemd[1]: [/run/fleet/units/sleep-forever.service:8] Assignment outside of section. Ignoring.
Aug 08 12:26:25 docker-006 systemd[1]: [/run/fleet/units/sleep-forever.service:7] Unknown section 'X-Fleet'. Ignoring.
Aug 08 12:26:25 docker-006 systemd[1]: [/run/fleet/units/sleep-forever.service:8] Assignment outside of section. Ignoring.
Aug 08 12:26:25 docker-006 systemd[1]: Starting just a sleeper...
Aug 08 12:26:25 docker-006 systemd[1]: Started just a sleeper.
Aug 08 12:26:25 docker-006 sh[1146]: hello world
Aug 08 12:26:29 docker-006 systemd[1]: Stopping just a sleeper...
Aug 08 12:26:29 docker-006 systemd[1]: Stopped just a sleeper.
Aug 08 12:26:29 docker-006 systemd[1]: [/run/fleet/units/sleep-forever.service:7] Unknown section 'X-Fleet'. Ignoring.
Aug 08 12:26:29 docker-006 systemd[1]: [/run/fleet/units/sleep-forever.service:8] Assignment outside of section. Ignoring.

If I put the same unit file in /etc/systemd/system and run "systemctl start sleep-forever", I see this instead:

Aug 08 12:27:16 docker-006 systemd[1]: [/etc/systemd/system/sleep-forever.service:9] Unknown section 'X-Fleet'. Ignoring.
Aug 08 12:27:16 docker-006 systemd[1]: [/etc/systemd/system/sleep-forever.service:10] Assignment outside of section. Ignoring.
Aug 08 12:27:16 docker-006 systemd[1]: Starting just a sleeper...
Aug 08 12:27:16 docker-006 systemd[1]: Started just a sleeper.
Aug 08 12:27:16 docker-006 sh[1193]: hello world
Aug 08 12:27:26 docker-006 sh[1193]: hi
Aug 08 12:27:36 docker-006 sh[1193]: hi
Aug 08 12:27:46 docker-006 sh[1193]: hi
Aug 08 12:27:56 docker-006 sh[1193]: hi
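
To make the difference between the two runs concrete, here is a minimal sketch that measures how long the unit stays up in the fleet-managed case. It assumes the short journal format shown above and that all lines fall on the same day; the function name `uptimes` and the regular expression are illustrative, not part of fleet or systemd.

```python
import re

# Start/stop lines copied from the fleet-managed journal output above.
JOURNAL = """\
Aug 08 12:26:05 docker-006 systemd[1]: Starting just a sleeper...
Aug 08 12:26:05 docker-006 systemd[1]: Started just a sleeper.
Aug 08 12:26:05 docker-006 sh[1112]: hello world
Aug 08 12:26:15 docker-006 systemd[1]: Stopping just a sleeper...
Aug 08 12:26:15 docker-006 systemd[1]: Stopped just a sleeper.
Aug 08 12:26:25 docker-006 systemd[1]: Starting just a sleeper...
Aug 08 12:26:25 docker-006 systemd[1]: Started just a sleeper.
Aug 08 12:26:25 docker-006 sh[1146]: hello world
Aug 08 12:26:29 docker-006 systemd[1]: Stopping just a sleeper...
Aug 08 12:26:29 docker-006 systemd[1]: Stopped just a sleeper.
"""

# Matches only systemd's "Started ..." and "Stopping ..." messages and
# captures the time of day (assumption: short journal format, single day).
LINE = re.compile(
    r"^\w{3} \d{2} (\d{2}):(\d{2}):(\d{2}) \S+ systemd\[1\]: (Started|Stopping) "
)


def uptimes(journal_text):
    """Return the seconds between each 'Started' and the next 'Stopping'."""
    result = []
    started_at = None
    for line in journal_text.splitlines():
        match = LINE.match(line)
        if not match:
            continue
        hours, minutes, seconds = (int(match.group(i)) for i in (1, 2, 3))
        now = hours * 3600 + minutes * 60 + seconds
        if match.group(4) == "Started":
            started_at = now
        elif started_at is not None:
            result.append(now - started_at)
            started_at = None
    return result


print(uptimes(JOURNAL))  # prints [10, 4]
```

Under fleet the unit is stopped 10 and then 4 seconds after starting, while the directly started copy keeps printing "hi" every 10 seconds.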


— 
Brian Lalor

Brian Lalor

Aug 8, 2015, 8:48:19 AM
to coreo...@googlegroups.com
I set fleet’s verbosity=2; fleet is definitely restarting this unit unnecessarily. Here’s the log: https://gist.github.com/blalor/8fed0123183db176a019

The key line seems to be
Aug 08 12:35:48 docker-006 fleetd[4227]: DEBUG reconcile.go:321: AgentReconciler attempting tasks [{UnloadUnit unit loaded but not scheduled here %!s(*job.Unit=&{sleep-forever.service {map[] []} })} {ReloadUnitFiles always reload unit files %!s(*job.Unit=<nil>)}]
After that, fleetd stops, removes, reloads and starts my unit.  I’m running 0.11.2.  
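
For readers unfamiliar with the idea behind that log line, here is a toy sketch of a reconciliation pass: compare the units that should be on this machine with the units that are loaded, and emit tasks for the difference. This is not fleet's code. The function `reconcile`, the `LoadUnit` task name and its reason string are made up for illustration; only the `UnloadUnit` name and its reason text come from the log above. The second half shows one way a restart loop could arise if the agent's view of the schedule flips between passes, which is an assumption and not something confirmed here.

```python
def reconcile(desired, loaded):
    """Return (task, reason, unit) tuples that move `loaded` toward `desired`."""
    tasks = []
    # Units present on the machine but absent from the schedule get unloaded.
    for unit in sorted(loaded - desired):
        tasks.append(("UnloadUnit", "unit loaded but not scheduled here", unit))
    # Units in the schedule but missing on the machine get loaded.
    # (Task name and reason text here are hypothetical.)
    for unit in sorted(desired - loaded):
        tasks.append(("LoadUnit", "unit scheduled here but not loaded", unit))
    return tasks


unit = "sleep-forever.service"

# Pass 1: the schedule, as seen by the agent, does not list the unit,
# so the running unit is unloaded.
print(reconcile(desired=set(), loaded={unit}))

# Pass 2: the schedule lists the unit again, so it is loaded afresh.
print(reconcile(desired={unit}, loaded=set()))

# Steady state: nothing to do when both views agree.
print(reconcile(desired={unit}, loaded={unit}))
```

In a healthy steady state the last call is the only one that should occur, and it yields no tasks.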

This looks like a fleet problem; I’ll open an issue.

— 
Brian Lalor

Adarsh J

Aug 8, 2015, 9:07:46 AM
to coreo...@googlegroups.com
Maybe try `fleetctl list-machines` and make sure that the `METADATA` column
lists `role=docker` (the value required by your X-Fleet section)?

(The metadata is usually assigned in the `fleet` section of cloud-config
- https://coreos.com/os/docs/latest/cloud-config.html#fleet - or through the
FLEET_METADATA environment variable in fleetd's service file.)
Regards,
Adarsh J
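
The check suggested above can be sketched as a small comparison between a machine's metadata string and the unit's `MachineMetadata` requirement. This assumes the comma-separated `key=value` form seen in `role=docker`; the helper names and the `region=east` value are hypothetical, and fleet's real matching rules may be richer than this.

```python
def parse_metadata(text):
    """Parse 'k1=v1,k2=v2' into a dict, skipping items without '='."""
    pairs = (item.split("=", 1) for item in text.split(",") if "=" in item)
    return {key.strip(): value.strip() for key, value in pairs}


def satisfies(machine_metadata, required):
    """True if every required key=value pair is present on the machine."""
    machine = parse_metadata(machine_metadata)
    return all(
        machine.get(key) == value
        for key, value in parse_metadata(required).items()
    )


# Requirement taken from the unit's X-Fleet section.
required = "role=docker"

print(satisfies("role=docker", required))           # prints True
print(satisfies("role=web,region=east", required))  # prints False
```

If the only machine carrying `role=docker` passes this check, a metadata mismatch is unlikely to explain the restarts.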

Brian Lalor

Aug 8, 2015, 9:29:09 AM
to coreo...@googlegroups.com
I created https://github.com/coreos/fleet/issues/1324.

Those do line up. There’s only a single machine with that role set; the unit’s not getting moved around, it’s just getting restarted on this machine.

Brian Lalor
bla...@bravo5.org
