I've had a heck of a time figuring this out, so I thought I would share. In a nutshell, this shows how to introduce your own service file (skydns.service) to a CoreOS machine via the cloud-config write_files mechanism, and then how to start that unit and make it systemd's responsibility to keep it up.
I have CoreOS running on DigitalOcean with etcd/fleet/flanneld. I decided I wanted discovery and DNS to be part of my installation, so I added skydns and registrator to the cloud-config file. I have some reservations about mixing *my* configuration with the CoreOS configuration, but I really wanted the common denominator on a fleet machine to include container discovery and DNS.
To that end, I created service files and had no problem running skydns and registrator manually *after* the CoreOS fleet booted and synced up. But I really wanted a fleet machine to be ready to go as soon as it was created on DigitalOcean, with no more fiddling needed.
So I thought it would be simple to add two more units to the cloud-config. I couldn't get them to work reliably: sometimes they would start, sometimes not. It basically boiled down to my lack of understanding of systemd. I did two things. First, I put an [Install] section in the service definition. Second, I set 'enable: true' in the cloud-config's unit definition, which uses the [Install] section to create the links that make the service part of systemd's responsibility. Of course there needs to be 'command: start' as well. I had that in my initial versions, but on its own, if anything failed during the attempted start, everything would stop and no retry would happen. By adding 'enable: true' I made it part of systemd's mission to keep this service up and to retry as appropriate.
Also, I can now kill the container running skydns, and systemd will restart it! (The Restart=always line in the [Service] section is what asks systemd to do that.)
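To make the 'enable: true' step a bit more concrete, here is a small sketch of what enabling a unit amounts to: a symlink from the target's .wants directory to the unit file, driven by WantedBy in the [Install] section. This is only an illustration in a scratch directory, not what systemd actually runs; the trimmed-down unit file and the variable names (root, unit, target) are my own assumptions for the example.

```shell
#!/bin/sh
# Illustration only: mimic, in a scratch directory, the symlink that
# enabling a unit creates from its [Install] section.
set -eu

root=$(mktemp -d)                 # stand-in for /etc/systemd/system
unit="$root/skydns.service"

# A trimmed-down copy of the unit, just enough for the demonstration.
cat > "$unit" <<'EOF'
[Unit]
Description=SkyDNS service discovery

[Service]
Restart=always
ExecStart=/usr/bin/docker run --name skydns -p 53:53/udp tacodata/skydns-coreos

[Install]
WantedBy=multi-user.target
EOF

# Read the WantedBy= value, but only from inside the [Install] section.
target=$(awk -F= '/^\[/{s=$0} s=="[Install]" && $1=="WantedBy"{print $2}' "$unit")

# "Enabling" boils down to linking the unit into <target>.wants/.
mkdir -p "$root/$target.wants"
ln -s "$unit" "$root/$target.wants/skydns.service"

# Show the resulting link (the scratch directory is left in place for inspection).
ls -l "$root/$target.wants"
```

On a real host the equivalent is 'systemctl enable skydns.service', and 'systemctl is-enabled skydns.service' reports whether the link is in place. Without an [Install] section there is nothing to tell systemd where to link the unit, which matches the flaky behaviour I saw before adding it.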
All of this is guesswork on my part; I have read through the systemd and CoreOS documentation. If there is a better way to do this, I'd like to know! Here is the relevant piece of my cloud-config file:
#cloud-config
hostname: a
...
write_files:
  - path: /etc/systemd/system/skydns.service
    permissions: 0644
    owner: core:core
    content: |
      [Unit]
      Description=SkyDNS service discovery
      After=flanneld.service docker.service etcd.service
      Requires=flanneld.service docker.service etcd.service
      [Service]
      Restart=always
      ExecStartPre=-/usr/bin/env docker kill skydns
      ExecStartPre=-/usr/bin/env docker rm skydns
      ExecStartPre=/usr/bin/env docker pull tacodata/skydns-coreos
      ExecStart=/usr/bin/env bash -c '/usr/bin/docker run --name skydns -p 53:53/udp tacodata/skydns-coreos'
      ExecStop=-/usr/bin/docker stop skydns
      [Install]
      WantedBy=multi-user.target
...
coreos:
  etcd:
    ...
  units:
    ...
    - name: skydns.service
      enable: true
      command: start
I think it is worth mentioning that the section:
does not exist in this service definition. However, the service runs on all CoreOS hosts because it is part of the #cloud-config.