Alex Crawford writes:
> On 04/16, Nate Williams wrote:
> > Thanks. I spent many hours this weekend coaxing TF to generate simple
> > Ignition files, and was able to get an identical configuration file as
> > removed above. The node works great at first bootup, but the box
> > refuses to allow SSH connections after reboot. :(
>
> Uh oh. Let's dig into this.
>
> > The issue is not cloud-config vs. Ignition, it's requiring another
> > binary (ct) to be run to generate the configs dynamically, which
> > doesn't play well in Terraform. TF does have a way to generate Ignition
> > configuration files, but it's > 9 months out-of-date, and given recent
> > Ignition configuration changes (configuring etcd/locksmith),
> > creating configuration is non-trivial, to say the least.
>
> Ah, the new features you are referring to are not a part of Ignition,
> but actually a part of Container Linux Configs (which are transpiled
> into Ignition Configs). We are working on a Container Linux Config
> provider for Terraform which will allow you to take advantage of these
> features.
Whoo hoo!
> > To summarize, I no longer have SSH races at first boot, but I can no
> > longer connect to the instance after it's rebooted, so I'm still stuck. :(
>
> Do you have access to the serial logs from the machine?
I just found out I can get access to the serial logs on AWS. Yay!
TL;DR
-----
[FAILED] Failed to start etcd (System Application Container).
See 'systemctl status etcd-member.service' for details.
-----
(Edited logs attached below)
I believe it's complaining about a dependency loop, since that's
happened a bunch of times as well, but since I can't log in (SSH is
accepting connections, but it won't allow logins using my PEM key), I
can't do any more debugging.
At this point on this particular server, SSH is running, but the PEM
key is not being accepted, so I can't log in(??). Other times I have
no SSH at all, and other times etcd is not running. It's very
inconsistent.
> Can you show the output of `systemctl cat sshd@`? It's very odd that it
> works on the first boot but not subsequent boots. Ignition is designed
> to make sure that the first boot is no longer a special case.
Since I can't log in, I can't run any commands like 'systemctl cat
sshd@' on the rebooted box, but I may be able to do that by rebuilding
the box and not rebooting it.
Here you go:
------------------ cut here ------------------
$ systemctl cat sshd@
# /usr/lib/systemd/system/sshd@.service
[Unit]
Description=OpenSSH per-connection server daemon
After=syslog.target auditd.service

[Service]
ExecStart=-/usr/sbin/sshd -i -e
StandardInput=socket
StandardError=syslog

# /usr/lib64/systemd/system/sshd@.service.d/sshd-keygen.conf
[Unit]
Wants=sshd-keygen.service
After=sshd-keygen.service

# /etc/systemd/system/sshd@.service.d/waiter.conf
[Unit]
After=waiter.service
------------------ cut here ------------------
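For reference, the output above is one unit file plus two drop-ins, and systemd adds up dependency settings such as After= and Wants= across all of them rather than letting a later file replace an earlier one. The sketch below is only an illustration of that merge, not part of any CoreOS tooling: the function name merged_dependencies and the SSHD_CAT constant are made up here, and the parser handles just the simple "Key=value" lines seen in this output.

```python
# Dependency settings that systemd accumulates across a unit and its drop-ins.
DEPENDENCY_KEYS = {"After", "Before", "Wants", "Requires"}


def merged_dependencies(cat_output):
    """Collect [Unit] dependency settings from `systemctl cat` style text.

    Hypothetical helper: it only understands plain "Key=value" lines and
    ignores everything outside the [Unit] section.
    """
    merged = {}
    section = None
    for raw in cat_output.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # "# /path" lines only name the file a fragment came from
        if line.startswith("[") and line.endswith("]"):
            section = line[1:-1]
            continue
        key, sep, value = line.partition("=")
        key = key.strip()
        if sep and section == "Unit" and key in DEPENDENCY_KEYS:
            # Entries are appended, so drop-ins extend the vendor unit's list.
            merged.setdefault(key, []).extend(value.split())
    return merged


# The `systemctl cat sshd@` output pasted above.
SSHD_CAT = """\
# /usr/lib/systemd/system/sshd@.service
[Unit]
Description=OpenSSH per-connection server daemon
After=syslog.target auditd.service
[Service]
ExecStart=-/usr/sbin/sshd -i -e
StandardInput=socket
StandardError=syslog
# /usr/lib64/systemd/system/sshd@.service.d/sshd-keygen.conf
[Unit]
Wants=sshd-keygen.service
After=sshd-keygen.service
# /etc/systemd/system/sshd@.service.d/waiter.conf
[Unit]
After=waiter.service
"""

if __name__ == "__main__":
    for key, units in merged_dependencies(SSHD_CAT).items():
        print(key + "=" + " ".join(units))
```

Read this way, every per-connection sshd@ instance is ordered after syslog.target, auditd.service, sshd-keygen.service, and waiter.service, which is the ordering the waiter.conf drop-in was meant to add.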
Thanks for your help!
Nate
------------------------- serial logs -------------------------
[ 0.000000] Linux version 4.9.16-coreos-r1 (jenkins@localhost) (gcc
version 4.9.3 (Gentoo Hardened 4.9.3 p1.5, pie-0.6.4) ) #1 SMP Fri Mar
31 02:07:42 UTC 2017
[ 0.000000] Command line: BOOT_IMAGE=/coreos/vmlinuz-a
mount.usr=/dev/mapper/usr
verity.usr=PARTUUID=7130c94a-213a-4e5a-8e26-6cce9662f132 rootflags=rw
mount.usrflags=ro consoleblank=0 root=LABEL=ROOT console=ttyS0,115200n8
coreos.first_boot=1
coreos.randomize_disk_guid=00000000-0000-0000-0000-000000000001
coreos.oem.id=ec2 modprobe.blacklist=xen_fbfront net.ifnames=0
verity.usrhash=bdba6be8dc5acdf23b3ad079600f3ae57c7e30a6994e9431357eb80cee98609a
...
Welcome to dracut-044 (Initramfs)!
...
[ 4.661367] systemd[1]: Starting Ignition (files)... Starting
Ignition (files)...
[ 4.670481] ignition[427]: Ignition v0.12.1
[ 4.674338] ignition[[ OK ] Started Ignition (files).
[427]: files: op(1): [started] processing unit
"coreos-metadata-sshkeys@.service"Starting Reload Configuration from the
Real Root...
[ OK ] Reached target ignition.target.
[ 4.699320] ignition[427]: files: op(1): [finished] processing unit
"coreos-metadata-sshkeys@.service"
[ 4.720300] systemd[1]: Started Ignition (files).
[ 4.731987] ignition[427]: files: op(2): [started] enabling unit
"coreos-metadata-sshkeys@.service"
[ 4.748717] systemd[1]: Starting Reload Configuration from the Real
Root...
[ 4.760672] ignition[427]: files: op(2): [finished] enabling unit
"coreos-metadata-sshkeys@.service"
[ 4.780003] systemd[1]: Reached target ignition.target.
[ 4.788329] ignition[427]: files: op(3): [started] processing unit
"waiter.service"
[ 4.803233] systemd[1]: Reloading.
[ 4.815863] ignition[427]: files: op(3): op(4): [started] writing
unit "waiter.service" at "etc/systemd/system/waiter.service"
[ 4.836170] ignition[427]: files: op(3): op(4): [finished] writing
unit "waiter.service" at "etc/systemd/system/waiter.service"
[ OK ] Started Reload Configuration from the Real Root.[
4.859880] systemd[1]: Started Reload Configuration from the Real Root.
[ 4.864673] ignition[427]: files: op(3): [finished] processing unit
"waiter.service"
[ 4.873021]
[ OK ] Reached target Initrd File Systems.
[ OK ] Reached target Initrd Default Target.
Starting dracut pre-pivot and cleanup hook...
[ OK ] Started dracut pre-pivot and cleanup hook.
Starting Cleaning Up and Shutting Down Daemons...
ignition[427]: files: op(5): [started] enabling unit
"waiter.service"[ OK ] Stopped Cleaning Up and Shutting Down
Daemons.
[ OK [ 4.954111] systemd[1]: Reached target Initrd File
Systems.
[ 4.960598] ignition[427]: files: op(5): [finished] enabling unit
"waiter.service"] Stopped dracut pre-pivot and cleanup hook.
[ 4.976630] systemd[1]: Reached target Initrd Default
Target.
[ 4.987463] ignition[427]: files: op(6): [started] processing unit
"sshd@.service"Stopping Network Name Resolution...
[ OK ] Stopped target ignition.target.
[ OK ] Stopped dracut pre-mount hook.
[ OK ] Stopped target Timers.
[ OK ] Stopped target Remote File Systems.
[ OK ] Stopped target Remote File Systems (Pre).
[ OK ] Stopped target Initrd Default Target.
[ OK ] Stopped dracut initqueue hook.
[ OK ] Stopped target Basic System.
[ OK ] Stopped target Paths.
[ OK ] Stopped Dispatch Password Requests to Console
Directory Watch.
[ OK ] Stopped target System Initialization.
[ OK ] Stopped udev Coldplug all Devices.
[ OK ] Stopped dracut pre-trigger hook.
[ OK ] Stopped target Swap.
[ OK ] Stopped target Encrypted Volumes.
[ OK ] Stopped target Slices.
[ OK ] Stopped target Sockets.
[ OK ] Stopped target Local File Systems.
[ OK ] Stopped target Local File Systems (Pre).
[ OK ] Stopped Network Name Resolution.
[ OK ] Stopped target Network.
Stopping Network Service...
[ OK ] Stopped Network Service.
[ 5.103805]
Stopping udev Kernel Device Manager...systemd[1]: Starting
dracut pre-pivot and cleanup hook...
[ 5.114216] ignition[427]: files: op(6): op(7): [started] writing
drop-in "waiter.conf" at
"etc/systemd/system/sshd@.service.d/waiter.conf"
[ OK ] Stopped Apply Kernel Variables.
[ OK ] Closed Network Service Netlink Socket.
[ OK ] Stopped udev Kernel Device Manager.
[ OK ] Stopped dracut pre-udev hook.
[ 5.141620] systemd[1]: Started dracut pre-pivot and cleanup hook.
[ OK ] Stopped dracut cmdline hook.
[ 5.153768] ignition[427]: files: op(6): op(7): [finished] writing
drop-in "waiter.conf" at
"etc/systemd/system/sshd@.service.d/waiter.conf"
[ OK ] Stopped Create Static Device Nodes in /dev.
[ OK ] Stopped Create list of required sta...ce nodes for the
current kernel.
[ 5.193120] systemd[1]: Starting Cleaning Up and Shutting Down
Daemons...
[ 5.210286] ignition[427]: files: op(6): [finished] processing unit
"sshd@.service"
[ 5.226612] systemd[1]: Stopped Cleaning Up and Shutting Down
Daemons.
[ 5.232403] ignition[427]: files: op(8): [started] enabling unit
"sshd@.service"
[ 5.237776] systemd[1]: Stopped dracut pre-pivot and cleanup hook.
[ 5.242417] ignition[427]: files: op(8): [finished] enabling unit
"sshd@.service"
[ 5.260429] systemd[1]: Stopping Network Name Resolution...
[ 5.272399] ignition[427]: files: op(9): [started] processing unit
"etcd-member.service"
[ 5.278414] systemd[1]: Stopped target ignition.target.
[ 5.282728] ignition[427]: files: op(9): [finished] processing unit
"etcd-member.service"
[ 5.288112] systemd[1]: Stopped dracut pre-mount hook.
[ 5.292092] ignition[427]: files: op(a): [started] enabling unit
"etcd-member.service"
[ 5.297375] systemd[1]: Stopped target Timers.
[ 5.301425] ignition[427]: files: op(a): [finished] enabling unit
"etcd-member.service"
[ 5.307425] systemd[1]: Stopped target Remote File Systems.
....
Welcome to Container Linux by CoreOS 1298.7.0 (Ladybug)!
...
[ OK ] Started Update is Completed.
[ OK ] Reached target System Initialization.
[ OK ] Listening on OpenSSH Server Socket.
Starting rkt metadata service socket.
[ OK ] Started Periodic Garbage Collection for rkt.
[ OK ] Started Daily Cleanup of Temporary Directories.
[ OK ] Started Watch for a cloud-config at
/var/lib/coreos-install/user_data.
[ OK ] Started Watch for update engine configuration changes.
[ OK ] Reached target Paths.
Starting Docker Socket for the API.
[ OK ] Listening on D-Bus System Message Bus Socket.
[ OK ] Started Daily Log Rotation.
[ OK ] Reached target Timers.
[ OK ] Listening on rkt metadata service socket.
[ OK ] Listening on Docker Socket for the API.
[ OK ] Reached target Sockets.
[ OK ] Reached target Basic System.
Starting Generate sshd host keys...
Starting etcd (System Application Container)...
Starting Garbage Collection for rkt...
Starting Install an ssh key from /proc/cmdline...
[ OK ] Started Cluster reboot manager.
Starting Generate /run/coreos/motd...
Starting Extend Filesystems...
Starting CoreOS Metadata Agent (SSH Keys)...
Starting Update Engine...
Starting Login Service...
[ OK ] Started D-Bus System Message Bus.
[ 14.148336] extend-filesystems[829]: resize2fs 1.42.13 (17-May-2015)
[ 14.156400] EXT4-fs (xvda9): resizing filesystem from 553472 to
1489915 blocks
[ 14.215232] EXT4-fs (xvda9): resized filesystem to 1489915
Starting Load cloud-config from
/usr/share/oem/cloud-config.yml...
Starting Network Service...
[ OK ] Started Install an ssh key from /proc/cmdline.
[ OK ] Started Generate /run/coreos/motd.
[ OK ] Started Login Service.
[ OK ] Started Update Engine.
[ 14.369113] [ OK ] Started Network Service.
[ OK ] Reached target Network.
Starting Network Name Resolution...
[ OK ] Started Extend Filesystems.
extend-filesystems[829]: Filesystem at /dev/xvda9 is mounted on /;
on-line resizing required Starting Hostname Service...
[ 14.434704] extend-filesystems[829]: old_desc_blocks = 1,
new_desc_blocks = 1
[ 14.449765] [ OK ] Started Hostname Service.
Starting Authorization Manager...
[ OK ] Started Network Name Resolution.
extend-filesystems[829]: The filesystem on /dev/xvda9 is now 1489915
(4k) blocks long.
Starting Cloudinit from EC2-style metadata...
[ OK ] Started Cloudinit from EC2-style metadata.
[ OK ] Started Load cloud-config from
/usr/share/oem/cloud-config.yml.
[ OK ] Reached target Load system-provided cloud configs.
[ OK ] Reached target Load user-provided cloud configs.
[ OK ] Started Authorization Manager.
[FAILED] Failed to start etcd (System Application Container).
See 'systemctl status etcd-member.service' for details.
[ OK ] Started Garbage Collection for rkt.
Starting waiter.service...
[ OK ] Started Generate sshd host keys.
Starting Generate /run/issue...
[ OK ] Started Generate /run/issue.
Starting Permit User Sessions...
[ OK ] Started CoreOS Metadata Agent (SSH Keys).
[ OK ] Started Permit User Sessions.
[ OK ] Started Serial Getty on ttyS0.
[ OK ] Started Getty on tty1.
[ OK ] Reached target Login Prompts.
[ OK ] Created slice system-sshd.slice.
SSH host key: SHA256:XXXX (ASDFADSF)
SSH host key: SHA256:YYYY (DSA)
SSH host key: SHA256:WWW (RSA)
SSH host key: SHA256:ZZZ (ECDSA)
eth0: 10.0.254.5 fe80::ce:9aff:fe86:873d
ip-10-0-254-5 login:
------------------------- serial logs -------------------------
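Since the serial console is the only window into the box when logins fail, a small script can help pull the interesting lines out of a capture like the one above. The following is a rough sketch under stated assumptions, not an existing tool: the name summarize_units and the three regular expressions are made up for this example, it works line by line (so status messages that were wrapped across two lines or interleaved with kernel output will be missed), and it also strips leftover terminal color codes in case the capture still has them.

```python
import re

# Leftover color codes such as "[0;32m" / "[0m", with or without the ESC byte.
COLOR_CODE = re.compile(r"\x1b?\[[0-9;]+m")
STARTING = re.compile(r"Starting (.+?)\.\.\.")
STARTED = re.compile(r"Started (.+?)\.$")
FAILED = re.compile(r"Failed to start (.+?)\.$")


def summarize_units(log_text):
    """Report units that failed, and units that began starting but were
    never reported as started or failed in the captured text."""
    starting = []  # keeps first-seen order
    started = set()
    failed = set()
    for raw in log_text.splitlines():
        line = COLOR_CODE.sub("", raw).strip()
        match = FAILED.search(line)
        if match:
            failed.add(match.group(1))
            continue
        match = STARTED.search(line)
        if match:
            started.add(match.group(1))
            continue
        match = STARTING.search(line)
        if match and match.group(1) not in starting:
            starting.append(match.group(1))
    pending = [u for u in starting if u not in started and u not in failed]
    return {"failed": sorted(failed), "pending": pending}


if __name__ == "__main__":
    import sys

    # Usage: python summarize_units.py < serial-console.log
    report = summarize_units(sys.stdin.read())
    print("failed: ", report["failed"])
    print("pending:", report["pending"])
```

A unit listed as "pending" only means no matching "Started" line appeared in the captured text; with an edited or truncated log that is a hint about where to look next, not proof that the unit hung.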