Bug: machine id is never generated

Moessbauer, Felix

Jul 21, 2022, 11:41:10 AM
to isar-...@googlegroups.com, Gylstorff, Quirin, Schmidl, Tobias, Schild, Henning, jan.k...@siemens.com
Hi,

when booting plain ISAR images (with Debian 11), "/etc/machine-id" is never generated.
This breaks several services that depend on having the ID.
One example is systemd-networkd with DHCP, which logs error messages like this one (and networking breaks):

systemd-networkd[277]: enp8s0: DHCP6 CLIENT: Failed to set identifier: No such file or directory

The error can manually be fixed by running "systemd-machine-id-setup", but this obviously does not work in embedded scenarios.
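
To tell which of these states a target is actually in, a small check like the following might help. This is only a sketch: the function name `machine_id_state` is made up for illustration, and it assumes the file format described in machine-id(5) (32 lowercase hex digits plus a newline), with the "uninitialized" marker discussed further down in this thread.

```sh
#!/bin/sh
# Classify the state of a machine-id file.
# Usage: machine_id_state [path]   (defaults to /etc/machine-id)
# The function name is hypothetical, not part of systemd or Isar.
machine_id_state() {
    file="${1:-/etc/machine-id}"
    if [ ! -e "$file" ]; then
        echo "missing"
        return
    fi
    # Command substitution strips the trailing newline for us.
    content="$(cat "$file")"
    case "$content" in
        "")            echo "empty" ;;
        uninitialized) echo "uninitialized" ;;
        *[!0-9a-f]*)   echo "invalid" ;;   # anything that is not lowercase hex
        *)
            # A valid ID is exactly 32 hex digits long.
            if [ "${#content}" -eq 32 ]; then
                echo "populated"
            else
                echo "invalid"
            fi
            ;;
    esac
}
```

On a target, `machine_id_state` without arguments inspects /etc/machine-id; passing a path such as `"$ROOTFS/etc/machine-id"` inspects an unpacked image instead.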

The root cause could be that /etc is read-only mounted when the first-boot-complete.target is reached.
At least the logs indicate this:

journalctl --grep machine --no-pager
-- Journal begins at Thu 2022-07-21 15:07:21 UTC, ends at Thu 2022-07-21 15:35:20 UTC. --
Jul 21 15:07:21 test-image systemd[1]: System cannot boot: Missing /etc/machine-id and /etc is mounted read-only.
Jul 21 15:07:21 test-image systemd[1]: 1) /etc/machine-id exists and is populated.
Jul 21 15:07:21 test-image systemd[1]: 2) /etc/machine-id exists and is empty.
Jul 21 15:07:21 test-image systemd[1]: 3) /etc/machine-id is missing and /etc is writable.
Jul 21 15:07:24 test-image systemd[1]: Condition check resulted in Commit a transient machine-id on disk being skipped.

Maybe Quirin knows more, as he implemented the postproc removal in 8b5e3f9.
IIRC Jan also mentioned that "rw" has to be added to the kernel cmdline, but this looks like a workaround as well.
For the sake of completeness:

cat /proc/cmdline
initrd=\initrd.img-5.10.0-16-amd64 LABEL=Boot root=PARTUUID=9ad286e9-8df8-42c2-ac7d-0f2d7387d03d rootwait console=tty0 console=ttyS0,115200

IMO this is a pretty critical bug as it affects a ton of images.

Best regards,
Felix

--
Siemens AG, Linux Expert Center
Otto-Hahn-Ring 6, 81739 München, Germany

Henning Schild

Jul 21, 2022, 12:26:37 PM
to Moessbauer, Felix (T CED SES-DE), isar-...@googlegroups.com, Gylstorff, Quirin (T CED SES-DE), Schmidl, Tobias (T CED SES-DE), Kiszka, Jan (T CED)
On Thu, 21 Jul 2022 17:41:07 +0200
"Moessbauer, Felix (T CED SES-DE)"
<felix.mo...@siemens.com> wrote:

> Hi,
>
> when booting plain ISAR images (with Debian11), the "/etc/machine-id"
> is never generated. This breaks a couple of services that depend on
> having the id. An example is the systemd-networkd with DHCP, leading
> to error messages like this one (and breaking networking):
>
> systemd-networkd[277]: enp8s0: DHCP6 CLIENT: Failed to set
> identifier: No such file or directory

Yes, that would be just one of many possible problems. It also plays
into the whole "first boot" functionality, which was never really
usable with isar.

I think before the last patch to remove the file we never had a first
boot; now every boot is the first, it seems.

As a workaround for systemd-networkd you can likely use the MAC as the
DHCP identifier instead of the DUID, but that is not nice.
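
A rough sketch of what such a workaround could look like, assuming the `ClientIdentifier=mac` setting from systemd.network(5). The helper name and the file name are made up, and the sketch limits the link to DHCPv4, because DHCPv6 always needs a DUID. Whether dropping DHCPv6 is acceptable depends on the network.

```sh
#!/bin/sh
# Write a .network file that makes the DHCPv4 client identify itself by MAC.
# Usage: write_dhcp_mac_network <interface> [target-dir]
# Helper name and file name are hypothetical; the target dir defaults to
# the usual location for local networkd configuration.
write_dhcp_mac_network() {
    iface="$1"
    dir="${2:-/etc/systemd/network}"
    cat > "$dir/10-$iface-dhcp-mac.network" <<EOF
[Match]
Name=$iface

[Network]
# DHCPv4 only: DHCPv6 cannot work without a DUID
DHCP=ipv4

[DHCPv4]
# Use the link-layer address instead of a DUID-based client ID
ClientIdentifier=mac
EOF
}
```

For the interface from the log this would be `write_dhcp_mac_network enp8s0`, followed by a restart of systemd-networkd.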

> The error can manually be fixed by running
> "systemd-machine-id-setup", but this obviously does not work in
> embedded scenarios.
>
> The root cause could be that /etc is read-only mounted when the
> first-boot-complete.target is reached. At least the logs indicate
> this:
>
> journalctl --grep machine --no-pager
> -- Journal begins at Thu 2022-07-21 15:07:21 UTC, ends at Thu 2022-07-21 15:35:20 UTC. --
> Jul 21 15:07:21 test-image systemd[1]: System cannot boot: Missing /etc/machine-id and /etc is mounted read-only.
> Jul 21 15:07:21 test-image systemd[1]: 1) /etc/machine-id exists and is populated.
> Jul 21 15:07:21 test-image systemd[1]: 2) /etc/machine-id exists and is empty.
> Jul 21 15:07:21 test-image systemd[1]: 3) /etc/machine-id is missing and /etc is writable.
> Jul 21 15:07:24 test-image systemd[1]: Condition check resulted in Commit a transient machine-id on disk being skipped.
>
> Maybe Quirin knows more, as he implemented the postproc removal in
> 8b5e3f9. IIRC Jan also mentioned something that "rw" has to be added
> to the kernel cmdline, but this looks like a workaround as well. For
> the sake of completeness:
>
> cat /proc/cmdline
> initrd=\initrd.img-5.10.0-16-amd64 LABEL=Boot root=PARTUUID=9ad286e9-8df8-42c2-ac7d-0f2d7387d03d rootwait console=tty0 console=ttyS0,115200

We should find out what that is; maybe it is even a systemd bug. Not many
systems actually do the first boot on the literal first boot.
In Debian the first boot is when systemd gets installed.

But "ro" on the kernel cmdline should be the normal case, where the
initrd or systemd at some point makes it writable. "rw" on the cmdline
sounds like a hack.
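
For comparing images, a quick way to see which of the two flags a given command line carries could look like this. It is just a sketch with a made-up function name. It assumes that, if both flags are given, the later one takes effect, and so it reports the last one seen.

```sh
#!/bin/sh
# Print "ro", "rw" or "unset" for a kernel command line passed as $1.
# The function name is hypothetical. The body runs in a subshell so that
# "set -f" does not leak into the calling shell.
cmdline_root_mode() (
    set -f          # no globbing while we word-split the command line
    mode="unset"
    # Word-split on purpose: each kernel parameter becomes one loop item.
    for arg in $1; do
        case "$arg" in
            ro|rw) mode="$arg" ;;
        esac
    done
    echo "$mode"
)
```

On a running system: `cmdline_root_mode "$(cat /proc/cmdline)"`. For the command line quoted above this prints "unset", since it contains neither flag.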

> IMO this is a pretty critical bug as it affects a ton of images.

Yes, maybe. We could also collect which distros are affected and whether it
was caused by Quirin's patch replacing the empty file with "no file".

Henning

Schmidl, Tobias

Jul 21, 2022, 1:08:19 PM
to isar-...@googlegroups.com, Moessbauer, Felix, Gylstorff, Quirin, jan.k...@siemens.com, Schild, Henning
Hi Felix,

On Thursday, 21.07.2022 at 15:41 +0000, Moessbauer, Felix (T
CED SES-DE) wrote:
> Hi,
>
> when booting plain ISAR images (with Debian11), the "/etc/machine-id" is
> never generated.
> This breaks a couple of services that depend on having the id.
> An example is the systemd-networkd with DHCP, leading to error messages
> like this one (and breaking networking):
>
> systemd-networkd[277]: enp8s0: DHCP6 CLIENT: Failed to set identifier:
> No such file or directory
>
> The error can manually be fixed by running "systemd-machine-id-setup",
> but this obviously does not work in embedded scenarios.
>
> The root cause could be that /etc is read-only mounted when the first-
> boot-complete.target is reached.
> At least the logs indicate this:
>

I've looked into the same thing for expand-on-first-boot, which would also
benefit from a `ConditionFirstBoot=yes` in its service file. I've seen the
same pattern.

What I don't understand is that on a normal Debian the regeneration of
/etc/machine-id works without any problems. Somehow our setup must differ
here.

Kind regards,

Tobias

Gylstorff Quirin

Jul 22, 2022, 5:33:06 AM
to Moessbauer, Felix (T CED SES-DE), isar-...@googlegroups.com, Schmidl, Tobias (T CED SES-DE), Schild, Henning (T CED SES-DE), Kiszka, Jan (T CED)


On 7/21/22 17:41, Moessbauer, Felix (T CED SES-DE) wrote:
> Hi,
>
> when booting plain ISAR images (with Debian11), the "/etc/machine-id" is never generated.
> This breaks a couple of services that depend on having the id.
> An example is the systemd-networkd with DHCP, leading to error messages like this one (and breaking networking):
>
> systemd-networkd[277]: enp8s0: DHCP6 CLIENT: Failed to set identifier: No such file or directory
>
> The error can manually be fixed by running "systemd-machine-id-setup", but this obviously does not work in embedded scenarios.
>
> The root cause could be that /etc is read-only mounted when the first-boot-complete.target is reached.
> At least the logs indicate this:
>
> journalctl --grep machine --no-pager
> -- Journal begins at Thu 2022-07-21 15:07:21 UTC, ends at Thu 2022-07-21 15:35:20 UTC. --
> Jul 21 15:07:21 test-image systemd[1]: System cannot boot: Missing /etc/machine-id and /etc is mounted read-only.
> Jul 21 15:07:21 test-image systemd[1]: 1) /etc/machine-id exists and is populated.
> Jul 21 15:07:21 test-image systemd[1]: 2) /etc/machine-id exists and is empty.
> Jul 21 15:07:21 test-image systemd[1]: 3) /etc/machine-id is missing and /etc is writable.
> Jul 21 15:07:24 test-image systemd[1]: Condition check resulted in Commit a transient machine-id on disk being skipped.

/etc is read-only after the initrd hands control to systemd. The remount
of the rootfs from ro to rw happens in systemd.
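
To see which state the rootfs is in at a given moment, the mount options can be read from /proc/mounts. A minimal sketch follows; the function name is invented, and the optional second argument only exists so the parsing can be tried on a saved copy of the file.

```sh
#!/bin/sh
# Print "ro" or "rw" for a mount point, reading a /proc/mounts-style file.
# Usage: mount_mode <mountpoint> [mounts-file]
# Returns non-zero if the mount point is not listed. Hypothetical helper.
mount_mode() {
    mountpoint="$1"
    mounts_file="${2:-/proc/mounts}"
    # Fields: device mountpoint fstype options dump pass.
    # The last matching entry wins, since later mounts shadow earlier ones.
    awk -v mp="$mountpoint" '
        $2 == mp {
            mode = "rw"
            n = split($4, opts, ",")
            for (i = 1; i <= n; i++)
                if (opts[i] == "ro") mode = "ro"
        }
        END {
            if (mode == "") exit 1
            print mode
        }
    ' "$mounts_file"
}
```

`mount_mode /` in an early-boot debug shell and again after boot shows whether the remount has happened yet.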


I tested it on my Debian machine, and without /etc/machine-id the same
behavior occurs.

From my testing, setting '/etc/machine-id' to "uninitialized\n" results in
the intended behavior with an active first-boot mechanism.


```
modified meta/classes/image-postproc-extension.bbclass
@@ -57,7 +57,7 @@ ROOTFS_POSTPROCESS_COMMAND =+ "image_postprocess_machine_id"
 image_postprocess_machine_id() {
     # systemd(1) takes care of recreating the machine-id on first boot
     sudo rm -f '${IMAGE_ROOTFS}/var/lib/dbus/machine-id'
-    sudo rm -f '${IMAGE_ROOTFS}/etc/machine-id'
+    echo "uninitialized" | sudo tee '${IMAGE_ROOTFS}/etc/machine-id'
 }
```

This still needs to be tested with other images.
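
One way to check a built rootfs before booting it is to compare the file byte for byte against the expected content. This is only a sketch; the function name is made up and the rootfs path is whatever directory the image was unpacked to.

```sh
#!/bin/sh
# Succeed only if <rootfs>/etc/machine-id holds exactly "uninitialized\n".
# A missing file, an empty file or a doubled newline all count as failure.
# The function name is hypothetical.
rootfs_machine_id_is_uninitialized() {
    rootfs="$1"
    printf 'uninitialized\n' | cmp -s - "$rootfs/etc/machine-id"
}
```

Usage would be something like `rootfs_machine_id_is_uninitialized /path/to/rootfs && echo OK`. The byte-exact comparison is deliberate, so that stray extra characters in the file do not go unnoticed.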

Quirin

Gylstorff Quirin

Jul 22, 2022, 6:55:24 AM
to Moessbauer, Felix (T CED SES-DE), isar-...@googlegroups.com, Schmidl, Tobias (T CED SES-DE), Schild, Henning (T CED SES-DE), Kiszka, Jan (T CED), Bezdeka, Florian (CT RDA IOT SES-DE)
We also tested it with Fedora 36 (thanks to Florian), and if you have
no `/etc/machine-id`, systemd also fails.

Setting the content of /etc/machine-id to "uninitialized\n" leads to the
intended behavior.

Henning Schild

Jul 22, 2022, 3:54:51 PM
to Schmidl, Tobias (T CED SES-DE), isar-...@googlegroups.com, Moessbauer, Felix (T CED SES-DE), Gylstorff, Quirin (T CED SES-DE), Kiszka, Jan (T CED)
On Thu, 21 Jul 2022 19:08:14 +0200
"Schmidl, Tobias (T CED SES-DE)" <tobias...@siemens.com> wrote:

> Hi Felix,
>
> On Thursday, 21.07.2022 at 15:41 +0000, Moessbauer,
> Felix (T CED SES-DE) wrote:
> > Hi,
> >
> > when booting plain ISAR images (with Debian11), the
> > "/etc/machine-id" is never generated.
> > This breaks a couple of services that depend on having the id.
> > An example is the systemd-networkd with DHCP, leading to error
> > messages like this one (and breaking networking):
> >
> > systemd-networkd[277]: enp8s0: DHCP6 CLIENT: Failed to set
> > identifier: No such file or directory
> >
> > The error can manually be fixed by running
> > "systemd-machine-id-setup", but this obviously does not work in
> > embedded scenarios.
> >
> > The root cause could be that /etc is read-only mounted when the
> > first- boot-complete.target is reached.
> > At least the logs indicate this:
> >
>
> I've examined the same thing, for expand-on-first-boot, which would
> also profit from a `ConditionFirstBoot=yes` in its service file. I've
> seen the same pattern.

Hooking into systemd's understanding of "first boot" would be good for
expand-on-first-boot and sshd-regen-keys when just talking plain isar.
Both recipes use a weird trick to only run once. A nasty pattern that
might have spread into layer recipes.

It would be really nice if isar-generated images could make use of
`ConditionFirstBoot=yes` in any recipe that has such needs. I think we
should try to get there with isar. The only problem is that Debian
assumes the "first boot" to be the install time, so we might break
assumptions. But if we do, I am sure Debian will try to cater for it once
there is any kind of problem and we explain the case.
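
As a rough idea of what a recipe could ship, here is a sketch that prints such a unit. The helper name, description and command are placeholders, not taken from any existing isar recipe. The ordering against first-boot-complete.target follows systemd.special(7) and assumes a systemd that has that target, as the one in this thread does.

```sh
#!/bin/sh
# Print a oneshot unit that systemd only starts on first boot.
# Usage: print_first_boot_unit <description> <command>
# Helper name and both arguments are hypothetical placeholders.
print_first_boot_unit() {
    description="$1"
    command="$2"
    cat <<EOF
[Unit]
Description=$description
ConditionFirstBoot=yes
# Hold back first-boot-complete.target until this unit has run
Before=first-boot-complete.target
Wants=first-boot-complete.target

[Service]
Type=oneshot
ExecStart=$command

[Install]
WantedBy=multi-user.target
EOF
}
```

For example, `print_first_boot_unit "Run my one-time setup" /usr/local/bin/my-first-boot-task > my-first-boot-task.service`. With this, the "run only once" logic lives in systemd rather than in each recipe's own script.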

regards,
Henning

Henning Schild

Oct 6, 2022, 9:22:48 AM
to Moessbauer, Felix (T CED SES-DE), isar-...@googlegroups.com, Gylstorff, Quirin (T CED SES-DE), Schmidl, Tobias (T CED SES-DE), Kiszka, Jan (T CED)
On Thu, 21 Jul 2022 17:41:07 +0200
"Moessbauer, Felix (T CED SES-DE)"
<felix.mo...@siemens.com> wrote:

> Hi,
>
> when booting plain ISAR images (with Debian11), the "/etc/machine-id"
> is never generated. This breaks a couple of services that depend on
> having the id. An example is the systemd-networkd with DHCP, leading
> to error messages like this one (and breaking networking):
>
> systemd-networkd[277]: enp8s0: DHCP6 CLIENT: Failed to set
> identifier: No such file or directory

It seems this can also cause issues for NetworkManager, where one can lose
the network when a lease expires and a renewal is due. I hope that
https://github.com/ilbers/isar/commit/693d76b8c06af fixed that, but I'll
just drop the error message here for completeness.

NetworkManager[526]: <error> [1665061214.9410] /etc/machine-id: no
valid machine-id. Use fake one based on secret-key: ...

Henning

Moessbauer, Felix

Oct 7, 2022, 3:11:49 AM
to Schild, Henning, isar-...@googlegroups.com, Gylstorff, Quirin, Schmidl, Tobias, jan.k...@siemens.com
> From: Schild, Henning (T CED SES-DE) <henning...@siemens.com>
> Sent: Thursday, October 6, 2022 9:23 PM
> To: Moessbauer, Felix (T CED SES-DE) <felix.mo...@siemens.com>
> Cc: isar-...@googlegroups.com; Gylstorff, Quirin (T CED SES-DE)
> <quirin.g...@siemens.com>; Schmidl, Tobias (T CED SES-DE)
> <tobias...@siemens.com>; Kiszka, Jan (T CED) <jan.k...@siemens.com>
> Subject: Re: Bug: machine id is never generated
>
> On Thu, 21 Jul 2022 17:41:07 +0200
> "Moessbauer, Felix (T CED SES-DE)"
> <felix.mo...@siemens.com> wrote:
>
> > Hi,
> >
> > when booting plain ISAR images (with Debian11), the "/etc/machine-id"
> > is never generated. This breaks a couple of services that depend on
> > having the id. An example is the systemd-networkd with DHCP, leading
> > to error messages like this one (and breaking networking):
> >
> > systemd-networkd[277]: enp8s0: DHCP6 CLIENT: Failed to set
> > identifier: No such file or directory
>
> Seems that can also cause issues for NetworkManager where one can lose the
> network when a lease expires and a renew is due. I hope that
> https://github.com/ilbers/isar/commit/693d76b8c06af fixed that
> but just drop the error message here for completeness.
>
> NetworkManager[526]: <error> [1665061214.9410] /etc/machine-id: no valid
> machine-id. Use fake one based on secret-key: ...

Hi Henning, yes, that's the typical pattern we see when /etc/machine-id is not available.
But do you want to report a bug, or what is the intention of the statement above?
The missing /etc/machine-id is fixed by the mentioned commit.
Did you try a build that includes this patch?

Felix

Henning Schild

Oct 7, 2022, 4:13:02 AM
to Moessbauer, Felix (T CED SES-DE), isar-...@googlegroups.com, Gylstorff, Quirin (T CED SES-DE), Schmidl, Tobias (T CED SES-DE), Kiszka, Jan (T CED)
On Fri, 7 Oct 2022 09:11:46 +0200
I just wanted to note that NetworkManager was also affected if one had
an Isar version not yet containing the fix.

So a heads up for all NetworkManager users to take a more recent Isar.

Henning