System upgrade atomicity

Pavel Löbl

Jan 29, 2024, 8:42:20 PM
to EFI Boot Guard
Hi,

I'm considering using EFI boot in an upcoming ARM-based embedded system deployment, as it seems to be the future on ARM now. What I'm missing is the whole picture of the update mechanism, as EFI Boot Guard is only one piece in the chain. So I would appreciate hearing some real-world experiences.

What we usually did in the past was a full A/B scheme, starting from the BL2/SPL: two hardware boot partitions (eMMC) for firmware, two user boot partitions with FIT images and two root filesystems. A nice property of this is that there is only one place which says what slot you boot from: the EXT_CSD register in the eMMC, which tells the ROM code which slot we are booting from. So after the update we simply flip this bit and everything gets loaded from the other slots.
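
For illustration, the single switch described above is the BOOT_PARTITION_ENABLE field (bits 5:3) of the PARTITION_CONFIG byte, EXT_CSD index 179 in the eMMC specification. The following is a minimal Python sketch of the bit arithmetic only. The function names are made up for this example, and actually writing the register (for instance with mmc-utils) is not shown.

```python
# PARTITION_CONFIG is EXT_CSD byte 179 (eMMC spec); layout of that byte:
#   bit  6    BOOT_ACK
#   bits 5:3  BOOT_PARTITION_ENABLE (1 = boot0, 2 = boot1, 7 = user area)
#   bits 2:0  PARTITION_ACCESS
EXT_CSD_PARTITION_CONFIG = 179
BOOT_EN_SHIFT = 3
BOOT_EN_MASK = 0x7 << BOOT_EN_SHIFT


def active_boot_partition(partition_config: int) -> int:
    """Return the BOOT_PARTITION_ENABLE field of a PARTITION_CONFIG byte."""
    return (partition_config & BOOT_EN_MASK) >> BOOT_EN_SHIFT


def flip_boot_partition(partition_config: int) -> int:
    """Return the byte with boot0 <-> boot1 swapped, other bits untouched."""
    current = active_boot_partition(partition_config)
    if current not in (1, 2):
        raise ValueError(f"not booting from a hardware boot partition: {current}")
    other = 2 if current == 1 else 1
    return (partition_config & ~BOOT_EN_MASK & 0xFF) | (other << BOOT_EN_SHIFT)
```

Because the whole slot selection is one field in one byte, the switch is a single register write, which is what makes this scheme easy to reason about.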

In the EFI world I would probably use capsule update for firmware, which can still be backed by hardware partitions in eMMC for redundancy. But how do I update the bootloader (EFI Boot Guard)? Just by copying the binary to the ESP and calling rename()? Or by relying on some EFI variables and the firmware to load the correct bootloader?

That would mean we have the active boot slot information stored in three places: the MMC register for firmware, an EFI variable for the bootloader slot and the bootloader configuration files for the UKI. This would not be a real issue if the boot protocol between all these stages were stable for the whole product life cycle. But if changes are needed, we can get into trouble if the system crashes during an update. We could end up with some "slot mixing" on the next boot, as we are not able to update all active slots atomically.
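
To make the "slot mixing" concern concrete, here is a small Python sketch that enumerates the states a power cut can leave behind when three independent selectors are switched one after another. It assumes each individual switch is atomic and that they are flipped in a fixed order. The stage names and function names are illustrative only.

```python
STAGES = ("firmware", "bootloader", "os")  # MMC register, EFI variable, EBG config


def states_after_power_cut(old: str, new: str) -> list[dict[str, str]]:
    """List every selector state a power cut can leave behind.

    Assumes the three selectors are switched one after another in STAGES
    order, each switch being atomic on its own.
    """
    states = []
    for switched in range(len(STAGES) + 1):
        states.append({
            stage: (new if i < switched else old)
            for i, stage in enumerate(STAGES)
        })
    return states


def is_mixed(state: dict[str, str]) -> bool:
    """True if the stages would be loaded from different slots."""
    return len(set(state.values())) > 1


for state in states_after_power_cut("A", "B"):
    print(state, "MIXED" if is_mixed(state) else "consistent")
```

Of the four reachable states, the two intermediate ones are mixed. With a single selector there are only two states, and both are consistent.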

Don't get me wrong, I fully understand that standardization of the boot protocol is a good thing. It just brings some challenges. So any experience with how to handle this, mainly how to update the bootloader, would be welcome.

Pavel

Jan Kiszka

Jan 30, 2024, 1:43:10 AM
to Pavel Löbl, EFI Boot Guard, Quirin Gylstorff
On 30.01.24 02:42, Pavel Löbl wrote:
> Hi,
>
> I'm considering using EFI boot in an upcoming ARM-based embedded
> system deployment, as it seems to be the future on ARM now. What I'm
> missing is the whole picture of the update mechanism, as EFI Boot
> Guard is only one piece in the chain. So I would appreciate hearing
> some real-world experiences.
>
> What we usually did in the past was a full A/B scheme, starting from
> the BL2/SPL: two hardware boot partitions (eMMC) for firmware, two
> user boot partitions with FIT images and two root filesystems. A nice
> property of this is that there is only one place which says what slot
> you boot from: the EXT_CSD register in the eMMC, which tells the ROM
> code which slot we are booting from. So after the update we simply
> flip this bit and everything gets loaded from the other slots.
>
> In the EFI world I would probably use capsule update for firmware,
> which can still be backed by hardware partitions in eMMC for
> redundancy. But how do I update the bootloader (EFI Boot Guard)? Just
> by copying the binary to the ESP and calling rename()? Or by relying
> on some EFI variables and the firmware to load the correct bootloader?
>

Updating EBG itself robustly indeed requires support by the EFI firmware
to switch the boot paths. The renaming approach is what we currently do
in isar-cip-core [1] (Quirin just added it), but that is not perfect.
This might be improved by using BootNext, relying on EFI variables and
their robustness - something we didn't want to do yet for the more
common and frequent update of the OS, which is one reason for EBG to
exist. I'm not even sure right now how well (and robustly) BootNext
already works with the generally preferred UEFI provider on ARM, i.e.
U-Boot.
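
As a rough sketch of what the BootNext route looks like from Linux userspace: BootNext is a UINT16 global EFI variable naming the Boot#### entry to use once, and efivarfs exposes each variable as a file whose content is a 32-bit attribute word followed by the data. The Python below only assumes those two facts. The function names are made up, and whether a given firmware accepts the write at runtime and honours it robustly is exactly the open question.

```python
import struct

# EFI global variable GUID and attribute bits from the UEFI specification.
EFI_GLOBAL_GUID = "8be4df61-93ca-11d2-aa0d-00e098032b8c"
NON_VOLATILE, BOOTSERVICE_ACCESS, RUNTIME_ACCESS = 0x1, 0x2, 0x4
BOOTNEXT_PATH = f"/sys/firmware/efi/efivars/BootNext-{EFI_GLOBAL_GUID}"


def bootnext_payload(entry: int) -> bytes:
    """Build the efivarfs file content selecting Boot#### for the next boot."""
    if not 0 <= entry <= 0xFFFF:
        raise ValueError("boot entry number must fit in 16 bits")
    attrs = NON_VOLATILE | BOOTSERVICE_ACCESS | RUNTIME_ACCESS
    # efivarfs layout: 32-bit attributes, then the variable data (UINT16 here).
    return struct.pack("<IH", attrs, entry)


def set_bootnext(entry: int, path: str = BOOTNEXT_PATH) -> None:
    """Write BootNext in a single write(); needs root and a mounted efivarfs."""
    with open(path, "wb", buffering=0) as f:
        f.write(bootnext_payload(entry))
```

An updater would install the new EBG binary under a second boot entry and call something like `set_bootnext(1)`, leaving the default BootOrder untouched until the new binary has proven itself. The entry number is hypothetical.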

> That would mean we have the active boot slot information stored in
> three places: the MMC register for firmware, an EFI variable for the
> bootloader slot and the bootloader configuration files for the UKI.
> This would not be a real issue if the boot protocol between all these
> stages were stable for the whole product life cycle. But if changes
> are needed, we can get into trouble if the system crashes during an
> update. We could end up with some "slot mixing" on the next boot, as
> we are not able to update all active slots atomically.

Well, selecting firmware slots should not affect the firmware's
selection of OS slots, in theory. EFI variables should be stored in a
central place and should not be touched by firmware updates. EBG keeps
its own state in a central place as well (the BGENV partitions). In
turn, you don't want to touch the firmware every time you only update
something in the OS. That's also why something like EBG should be there
to decouple things.
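
To illustrate the kind of decoupling meant here, the following Python sketch models a bootloader choosing between redundant configuration copies by revision counter, with a trial-boot state for fallback. It is a simplified model for discussion: the class, field and state names are assumptions and are not taken from EBG's source.

```python
from dataclasses import dataclass
from enum import Enum


class State(Enum):
    OK = "ok"
    INSTALLED = "installed"   # freshly written by the updater, never booted
    TESTING = "testing"       # booted once, not yet confirmed by the OS
    FAILED = "failed"


@dataclass
class Env:
    name: str
    revision: int
    state: State


def select_env(envs: list[Env]) -> Env:
    """Pick the config to boot, falling back if a trial boot was not confirmed."""
    latest = max(envs, key=lambda e: e.revision)
    if latest.state is State.TESTING:
        # We are booting again without the OS having confirmed the update.
        latest.state = State.FAILED
        latest.revision = 0
        latest = max(envs, key=lambda e: e.revision)
    elif latest.state is State.INSTALLED:
        latest.state = State.TESTING
    return latest
```

An OS update in this model only writes a new configuration copy with a higher revision. Neither the firmware nor any EFI variable is touched.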

>
> Don't get me wrong, I fully understand that standardization of the
> boot protocol is a good thing. It just brings some challenges. So any
> experience with how to handle this, mainly how to update the
> bootloader, would be welcome.
>

UEFI isn't perfect yet, for sure, and that is also why EBG exists. But
even that combo does not make a perfect world yet. However, the
direction is inevitable and already a major improvement over all those
custom boot procedures we had in the past - you can now use an almost
arbitrary Linux on your boards. That's why basically all archs are
standardizing on it (see RISC-V or Loongson). But I think we need to
improve the UEFI standard and its implementations further, which should
eventually make EBG fully obsolete as well.

Jan

[1]
https://gitlab.com/cip-project/cip-core/isar-cip-core/-/commit/ce889bd38e6e4024d5f78aaf8274ff840d731359

--
Siemens AG, Technology
Linux Expert Center

Pavel Löbl

Jan 30, 2024, 3:38:20 AM
to EFI Boot Guard
Yes. It seems rename on FAT should be fairly atomic, so this might be a valid
option.
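
For reference, the rename approach boils down to the usual write-then-rename pattern, sketched below in Python. The function name is made up for this example. How atomic the final step really is depends on the filesystem driver and the storage underneath, so on a FAT-formatted ESP this should be read as best effort rather than a guarantee.

```python
import os
import tempfile


def replace_file(target: str, data: bytes) -> None:
    """Write data next to target, flush it, then rename it over target."""
    directory = os.path.dirname(os.path.abspath(target))
    fd, tmp = tempfile.mkstemp(dir=directory, prefix="tmp-")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())      # file content reaches storage first
        os.replace(tmp, target)       # rename(2) semantics on Linux
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise
    # Flush the directory entry as well.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

A call such as `replace_file("/boot/efi/EFI/BOOT/BOOTAA64.EFI", new_binary)` would be the bootloader update step. The mount point is an assumption about the target system.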


> That would mean we have the active boot slot information stored in
> three places: the MMC register for firmware, an EFI variable for the
> bootloader slot and the bootloader configuration files for the UKI.
> This would not be a real issue if the boot protocol between all these
> stages were stable for the whole product life cycle. But if changes
> are needed, we can get into trouble if the system crashes during an
> update. We could end up with some "slot mixing" on the next boot, as
> we are not able to update all active slots atomically.

> Well, selecting firmware slots should not affect the firmware's
> selection of OS slots, in theory. EFI variables should be stored in a
> central place and should not be touched by firmware updates. EBG keeps
> its own state in a central place as well (the BGENV partitions). In
> turn, you don't want to touch the firmware every time you only update
> something in the OS. That's also why something like EBG should be there
> to decouple things.

If things were stable and truly decoupled, then yes. But if changes come
in the future, you need to track versions and decide whether you also need
to flash the lower stages of the boot chain. Especially if downgrades need
to be supported as well, it can get quite tricky. So updating everything
every time and having the switch in one place was an easy way out of this.
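
A rough Python sketch of the bookkeeping this implies, assuming every image declares which boot protocol version it expects from the stage below and which one it offers to the stage above. The `Image` type, its fields and the version numbers are hypothetical. The check is an equality comparison, so it treats upgrades and downgrades the same way.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Image:
    version: str
    provides: int   # boot protocol offered to the next stage
    needs: int      # boot protocol expected from the previous stage


def stages_to_flash(installed: list[Image], bundle: list[Image]) -> list[int]:
    """Walk from the OS downwards and collect the stages that must be flashed.

    Index 0 is the firmware, the last index is the OS.
    """
    to_flash = [len(bundle) - 1]          # the OS is always updated
    for lower in range(len(bundle) - 2, -1, -1):
        upper_new = bundle[lower + 1]
        if installed[lower].provides == upper_new.needs:
            break                          # interface still matches, stop here
        to_flash.append(lower)
    return sorted(to_flash)
```

Every stage added to the result is one more selector to switch, which is where the atomicity problem comes back in.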

I'm also not sure what the common practice with device trees is. I guess in an
ideal EFI setup the DT would be provided by the firmware. Maybe that would work
if you are using the mainline bindings all the time, as those should be fairly
stable. But mainlining usually takes some time, so if things are not so ideal,
you would rather place the DT inside the UKI and then rely on the EFI DT fixup
protocol if runtime changes are needed? But this also introduces some coupling:
if there are some rough binding changes, the firmware might not be able to fix
up the provided DT, I guess.

So in practice I could probably boot "any distribution" if either that distro
uses a mainline kernel and my firmware provides a mainline DT, or the distro
provides the DT for my board and the installed firmware version is able to fix
it up.

Pavel



Jan Kiszka

Jan 30, 2024, 12:05:27 PM
to Pavel Löbl, EFI Boot Guard
It's not fully atomic, if only because you can't atomically replace an
existing file via a rename of another one (if I'm not wrong). And then
the filesystem operations themselves are not atomic w.r.t. the backing
storage. IOW, there are remaining invalid intermediate states that would
leave the system unbootable when a power cut hits you at the wrong time.
That should be better via BootNext and without renaming. In theory.

Yes, DT replacement via UKI is currently the plan B until a firmware
has a complete DT with official bindings (accepted by mainline). This is
supported by EBG, and there is also ongoing work to have systemd-boot
improved in that regard (https://github.com/systemd/systemd/pull/29726).

Missing standardization of DT fixup is a problem, and I have raised that
more than once (https://github.com/ARM-software/ebbr/issues/68).
With U-Boot underneath, you have a de-facto standard. With EDK II and a
vendor not fully understanding things, you can have funny effects
(Tegra...) that require funny workarounds (dummy DT strings to have
enough space for non-standard patching by the firmware).

>
> So in practice I could probably boot "any distribution" if either that
> distro uses a mainline kernel and my firmware provides a mainline DT,
> or the distro provides the DT for my board and the installed firmware
> version is able to fix it up.

We are not there yet, and if you look at recent integrations even of rather
well-behaving vendors
(https://groups.google.com/d/msgid/isar-users/cover.1705490373.git.jan.kiszka%40siemens.com),
there is still a need for customization, also around which DT to use
(BeaglePlay is already fine with mainline, VisionFive 2 still needs its
own for all features). But if you don't start somewhere, you will never
reach that "any distribution".

Also, the whole secure boot story becomes a different beast via UEFI, in
general a much friendlier one.

Jan