Failing ssh-logon due to missing /etc/ssh/sshd_config


Gustav Karlsson

Jun 6, 2018, 6:37:01 AM
to CoreOS User
Hi,

I'm fairly new to Container Linux. I have a host that sometimes comes up in a bad state after first boot, inaccessible via ssh. The error I see on the host is:

    sshd[1227]: /etc/ssh/sshd_config: No such file or directory

* We don't have any special configuration for sshd
* There are no other failed units as far as I can tell from the log
* I have been unable to access the console when the host is in the bad state
* Version 1745.4.0

Does anyone have any clue as to what might cause such a state?

Regards,
Gustav

David Michael

Jun 6, 2018, 8:46:14 AM
to Gustav Karlsson, CoreOS User
What does "ls -al /etc/ssh" look like when it's not working? Does the
log have any errors about tmpfiles? The /etc/ssh/sshd_config symlink
should be created automatically by /usr/lib/tmpfiles.d/ssh.conf on
every boot if it does not exist, and the symlink points to an
immutable path under /usr that always exists (unless something is
mounted over it).

Thanks.

David
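
The check David describes can be scripted so that it runs unattended on a host that cannot be reached over ssh. This is only a minimal sketch: the function name `check_sshd_config` and its optional root-prefix argument are assumptions made for illustration, not part of Container Linux.

```shell
#!/bin/sh
# Hypothetical helper: report the state of etc/ssh/sshd_config.
# Usage: check_sshd_config [ROOT]
# ROOT is an optional prefix (assumption, handy for testing); empty means the live system.
check_sshd_config() {
    root="${1:-}"
    cfg="$root/etc/ssh/sshd_config"
    if [ -L "$cfg" ]; then
        target=$(readlink "$cfg")
        if [ -e "$cfg" ]; then
            # -e follows the link, so the target exists
            echo "symlink ok: $cfg -> $target"
        else
            echo "dangling symlink: $cfg -> $target"
            return 1
        fi
    elif [ -e "$cfg" ]; then
        echo "regular file: $cfg"
    else
        echo "missing: $cfg"
        return 1
    fi
}
```

Calling `check_sshd_config` with no argument inspects the running system. A "dangling symlink" result would point toward something mounted over the target under /usr, while "missing" would suggest the tmpfiles entry was never applied.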

Gustav Karlsson

Jun 6, 2018, 8:57:34 AM
to CoreOS User
Thank you for the information!

Unfortunately, I have not been able to access the host in the failed state. I can only stop or reboot it via the UI, so I have been unable to check the symlink.

All units referring to tmpfiles seem to have run successfully. The host usually starts working after a stop+start.

There is one other custom unit that is failing, one that attempts to install a package using pip, i.e. downloading some stuff from the Internet. It might or might not be related.

/Gustav

Gustav Karlsson

Jun 7, 2018, 4:46:07 AM
to CoreOS User
I set up a systemd unit that dumps some state to the journal, so I managed to get some more information.

    ls -al /etc/ssh
    total 564
    drwxr-xr-x.  2 root root   4096 May 24 21:51 .
    drwxr-xr-x. 37 root root   4096 Jun  7 08:16 ..
    -rw-r--r--.  1 root root 553185 May 23 08:29 moduli

I am thinking that maybe the boot sequence is not yet complete because of some network issues.

There are traces of some serious network issues. It looks like DNS lookups are failing (this is from my custom curl check):

    curl: (6) Could not resolve host: <some host>

What unit is generating the /etc/ssh tmpfiles?

Regards,
Gustav
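
On the question of which unit creates the /etc/ssh entries: on systemd hosts, tmpfiles.d configuration is normally applied at boot by systemd-tmpfiles-setup.service, which runs `systemd-tmpfiles --create`. To see which configuration file declares a given path, something like the sketch below may help. The function name `find_tmpfiles_entries` is hypothetical, and the directories to search are passed in by the caller.

```shell
#!/bin/sh
# Hypothetical helper: list tmpfiles.d lines that mention a path prefix.
# Usage: find_tmpfiles_entries PATH_PREFIX DIR...
find_tmpfiles_entries() {
    prefix="$1"
    shift
    for dir in "$@"; do
        # Skip directories that do not exist on this host
        [ -d "$dir" ] || continue
        # -F matches the prefix as a fixed string; the extra /dev/null
        # argument makes grep print the file name even for a single match file
        grep -F -- "$prefix" "$dir"/*.conf /dev/null 2>/dev/null
    done
    return 0
}
```

A typical call would be `find_tmpfiles_entries /etc/ssh /usr/lib/tmpfiles.d /etc/tmpfiles.d`. Whether the unit itself ran can then be checked with `systemctl status systemd-tmpfiles-setup.service` or `journalctl -b -u systemd-tmpfiles-setup.service`.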

Gustav Karlsson

Jun 7, 2018, 7:55:16 AM
to CoreOS User
I have found what is triggering the issue, but I have not really understood why. Anyway, we have two systemd units:

mount-mydisk:

    [Unit]
    Before=local-fs.target

    [Mount]
    What=...
    Where=...

    [Install]
    RequiredBy=local-fs.target

grow-mydisk:

    [Unit]
    Description=Grow mydisk
    After=mydisk.mount
    Requires=mydisk.mount

    [Service]
    Type=oneshot
    ExecStart=/usr/sbin/xfs_growfs ...
    RemainAfterExit=true

    [Install]
    WantedBy=multi-user.target


If I add the line "Before=local-fs.target" to the grow-mydisk unit, host creation starts to fail sporadically, for reasons I have not understood.


Regards,
Gustav
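
One possible explanation, not confirmed anywhere in this thread: a service with default dependencies is ordered after sysinit.target, and sysinit.target is itself ordered after local-fs.target, so adding Before=local-fs.target to such a service creates an ordering cycle. systemd resolves a cycle by dropping one of the jobs involved, and if the dropped job happens to be the one that applies tmpfiles.d, the /etc/ssh/sshd_config symlink would not be created on that boot. The sketch below looks for the messages systemd logs in that situation. The function name `find_ordering_cycles` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print log lines that report a systemd ordering cycle.
# Reads log text on stdin, so it works on a live journal or a saved log file.
# Exit status is non-zero when no such line is found (grep's own behaviour).
find_ordering_cycles() {
    grep -i -- "ordering cycle"
}
```

On an affected host this could be run as `journalctl -b | find_ordering_cycles`. Running `systemd-analyze verify` against the service file (assumed here to be named grow-mydisk.service) is another way to have systemd report cycles it detects. If a cycle is the cause, setting DefaultDependencies=no on the service is the usual way to order a unit before local-fs.target, though that has not been tested against this setup.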