systemd-networkd failed to bring up interfaces following an OS update

Wojtek Czekalski

Jul 22, 2022, 3:27:05 AM
to Flatcar Container Linux User
This issue only happened on our older hardware. All servers from Dell's RX10 series (R810 for example) were impacted. Newer ones were not.

Nodes were initially provisioned with 3139.2.3 (5.15.48-flatcar) and were then upgraded to 3227.2.0 (5.15.55-flatcar).

After the restart that followed the OS update, journalctl showed a systemd message like "could not bring up interface no such file or directory". I also recorded the reboot sequence in case it's informative. I forgot to show the error message in the recording, but it was literally just what I wrote above.

Here's the boot sequence in case it's helpful. This issue is not too bad for us since we are operating at a rather small scale. It was unexpected nevertheless, and it prompted us to disable automatic OS updates.

Jeremi Piotrowski

Jul 22, 2022, 4:01:08 AM
to Flatcar Container Linux User
Accidentally sent a private response so here's an approximate repost:
Sorry about this. Let's open a GitHub issue with some logs so that we can track down what went wrong.
Is this an upgrade gone wrong or is there an issue with 3227.2.0 on Dell R810?

The etcd-lock strategy in locksmith might be relevant here: it limits the damage to at most 1 node knocked out before the others stop rebooting.
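
For reference, a rough sketch of switching the reboot strategy. It assumes the strategy is read from /etc/flatcar/update.conf as a REBOOT_STRATEGY= line and that locksmithd can reach an etcd cluster. The helper name set_reboot_strategy is made up for illustration.

```shell
# Hypothetical helper: set REBOOT_STRATEGY in an update.conf-style file.
# $1 = strategy (e.g. etcd-lock), $2 = config file (defaults to the assumed Flatcar path).
set_reboot_strategy() {
  strategy="$1"
  conf="${2:-/etc/flatcar/update.conf}"
  touch "$conf"
  # Keep every other setting, drop any existing REBOOT_STRATEGY line.
  grep -v '^REBOOT_STRATEGY=' "$conf" > "$conf.tmp" || true
  echo "REBOOT_STRATEGY=$strategy" >> "$conf.tmp"
  mv "$conf.tmp" "$conf"
}

# On a node (as root), something like:
#   set_reboot_strategy etcd-lock
#   systemctl restart locksmithd
```

With this strategy a node takes a lock in etcd before rebooting, so a node that never comes back keeps holding the lock and the remaining nodes do not reboot into the same problem.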

Wojtek Czekalski

Aug 4, 2022, 3:34:04 AM
to Flatcar Container Linux User
(I also accidentally sent a private response)

In which repo do I create an issue? What kind of logs do you mean? Unfortunately I only have the screen recording.

It’s an issue with the upgrade itself. 3227.2.0 has been running well for 12 days now. I presume it’s a race condition in systemd-networkd. I have seen some related changes in the diff and the changelog. I cannot pinpoint what’s the issue though.

Jeremi Piotrowski

Aug 4, 2022, 3:58:40 AM
to Flatcar Container Linux User
The repo for issues is https://github.com/flatcar-linux/flatcar. See this comment for the needed logs: https://github.com/flatcar-linux/Flatcar/issues/807#issuecomment-1193755286 (seems to be a similar issue). The recording is nice but nothing beats text logs that we can grep through. I'd suggest saving the logs to disk first, then rolling back (https://www.flatcar.org/docs/latest/setup/debug/manual-rollbacks/#performing-a-manual-rollback), or (in grub) selecting the *other* partition with 3139.2.3 so that you can get online and extract the logs.
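
A minimal sketch of the "save the logs to disk" step, to be run once the node is back online on the working partition. It assumes the journal is persistent, so the failed boot is still there as the previous boot (-1). The function name and the output directory are placeholders.

```shell
# Hypothetical helper: dump the journal of the previous boot to text files.
# $1 = output directory (placeholder default below).
save_boot_logs() {
  out="${1:-/var/tmp/networkd-debug}"
  mkdir -p "$out"
  # List the boots the journal knows about; -1 is the boot before the current one.
  journalctl --list-boots > "$out/boots.txt"
  # Full journal of the previous boot, and the systemd-networkd unit on its own.
  journalctl -b -1 --no-pager > "$out/journal-previous-boot.txt"
  journalctl -b -1 -u systemd-networkd --no-pager > "$out/networkd-previous-boot.txt"
}
```

If the failed boot is not the one directly before the current boot, check boots.txt and adjust the -b offset accordingly.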

It may also be helpful to try enabling debug logging in systemd-networkd like this: https://superuser.com/a/1234599.
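
Roughly, that comes down to a systemd drop-in that sets SYSTEMD_LOG_LEVEL=debug for the service. A sketch, assuming the usual drop-in location under /etc/systemd/system (the helper name and the 10-debug.conf file name are arbitrary):

```shell
# Hypothetical helper: write a drop-in that enables debug logging for systemd-networkd.
# $1 = drop-in directory (defaults to the standard location for this unit).
write_networkd_debug_dropin() {
  dir="${1:-/etc/systemd/system/systemd-networkd.service.d}"
  mkdir -p "$dir"
  printf '[Service]\nEnvironment=SYSTEMD_LOG_LEVEL=debug\n' > "$dir/10-debug.conf"
}

# On a node (as root), something like:
#   write_networkd_debug_dropin
#   systemctl daemon-reload
#   systemctl restart systemd-networkd
```

The drop-in has to be in place before the reboot into the new version, so that the debug output covers the boot where the interfaces fail to come up.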

> It’s an issue with the upgrade itself. 3227.2.0 has been running well for 12 days now. I presume it’s a race condition in systemd-networkd. I have seen some related changes in the diff and the changelog. I cannot pinpoint what’s the issue though.

Good to know. Unfortunately we've hit several issues with systemd-networkd in this release because v250 hit stable and seems to have changed behaviour in subtle ways. The important thing is that we track down what is causing this so that we understand what went wrong and how to prevent it in the future.