efibootguard 0.7-r0 itco watchdog not firing


François Dussault

Oct 28, 2021, 8:08:42 PM
to EFI Boot Guard
Hi,

I wonder what could be the reason for the iTCO watchdog not firing when not confirming (i.e. not calling bg_setenv -c) for the duration of 10 secs (value set in BGENV.DAT of both EFI partitions).
We're using a standard Haswell CPU from Intel.

When booting linux I can see:
iTCO_wdt: Intel TCO Watchdog Timer Driver v1.11
iTCO_wdt: Found a Lynx Point TCO device (Version=2, TCOBASE=0x1860)
iTCO_wdt: initialized. heartbeat=30sec (nowayout=0)

We ship a yocto release with efibootguard along with swupdate.
Nevertheless, when I artificially provoke a partition switch by incrementing the revision, setting the watchdog timer to 10 sec, setting ustate to "INSTALLED" and rebooting (ustate becomes "TESTING" after the reboot), I can wait for over 5 mins and nothing reboots.
If I reboot manually the ustate becomes "FAILED" and revision is 0, but the reboot didn't happen from the watchdog (only manually from me).

I wonder if there could be something in Linux that pings the watchdog and thus prevents the efibg watchdog from firing.

The EFI console says (just before booting linux):
Detected Intel TCO watchdog
Starting C:BOOT0:bzImage with watchdog set to 10 seconds

Yet nothing really happens past those 10 secs.

This is problematic. We have machines in the medical field that don't update properly, and since the watchdog never fires, nobody noticed until the issue was specifically reported during a demo we did!

Thanks!

Jan Kiszka

Oct 29, 2021, 1:12:54 AM
to François Dussault, EFI Boot Guard
If Linux boots, it may also take over the watchdog from EFI Boot Guard
and drive it until userland picks up. Check
CONFIG_WATCHDOG_HANDLE_BOOT_ENABLED (can also be tuned via kernel
command line).

Jan

--
Siemens AG, T RDA IOT
Corporate Competence Center Embedded Linux

François Dussault

Oct 29, 2021, 11:05:42 AM
to Jan Kiszka, EFI Boot Guard
Thank you very much for the answer.

When running:
cat /proc/config.gz | gunzip -c | grep -i CONFIG_WATCHDOG
I get:

CONFIG_WATCHDOG=y
CONFIG_WATCHDOG_CORE=y
# CONFIG_WATCHDOG_NOWAYOUT is not set
# CONFIG_WATCHDOG_HANDLE_BOOT_ENABLED is not set
# CONFIG_WATCHDOG_SYSFS is not set
# CONFIG_WATCHDOG_PRETIMEOUT_GOV is not set

So I suppose I have this part correct since CONFIG_WATCHDOG_HANDLE_BOOT_ENABLED is not set.
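
For reference, the same check can be scripted. The following is a minimal Python sketch, not part of EFI Boot Guard: the helper names kconfig_state and cmdline_override are hypothetical, and it assumes the command-line form of the option is spelled watchdog.handle_boot_enabled, as in the kernel's watchdog parameter documentation. It only reports what is configured; it does not prove how a particular driver such as iTCO_wdt behaves.

```python
import gzip


def kconfig_state(config_text, option):
    """Return the value of a kernel config option ('y', 'm', ...),
    'not set' if it is explicitly disabled, or None if it is absent."""
    for line in config_text.splitlines():
        line = line.strip()
        if line == f"# {option} is not set":
            return "not set"
        # Match "OPTION=" exactly so CONFIG_WATCHDOG does not match
        # CONFIG_WATCHDOG_CORE.
        if line.startswith(option + "="):
            return line.split("=", 1)[1]
    return None


def cmdline_override(cmdline, param="watchdog.handle_boot_enabled"):
    """Return the value given for param on the kernel command line, or None.
    The parameter name is an assumption, see the note above."""
    value = None
    for token in cmdline.split():
        if token.startswith(param + "="):
            value = token.split("=", 1)[1]  # keep the last occurrence
    return value


def read_running_config(path="/proc/config.gz"):
    """Read the running kernel's config (requires CONFIG_IKCONFIG_PROC)."""
    with gzip.open(path, "rt") as fh:
        return fh.read()
```

On a target, kconfig_state(read_running_config(), "CONFIG_ITCO_WDT") would additionally show whether the iTCO driver is built in ("y") or a module ("m"), and cmdline_override(open("/proc/cmdline").read()) whether the boot-time handling was changed on the command line.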

Yet my EBG watchdog doesn't work :(
Could there be something else interfering with the boot-enabled iTCO watchdog?
I see that Lynx Point iTCO support was added in v0.6 yet I'm at 0.7. I suppose it *SHOULD* work.

Jan Kiszka

Oct 29, 2021, 11:13:13 AM
to François Dussault, EFI Boot Guard
On 29.10.21 17:05, François Dussault wrote:
> Thank you very much for the answer.
>
> When doing :
> cat /proc/config.gz|gunzip -c|grep -i CONFIG_WATCHDOG I have:
>
> CONFIG_WATCHDOG=y
> CONFIG_WATCHDOG_CORE=y
> # CONFIG_WATCHDOG_NOWAYOUT is not set
> *# CONFIG_WATCHDOG_HANDLE_BOOT_ENABLED is not set*
> # CONFIG_WATCHDOG_SYSFS is not set
> # CONFIG_WATCHDOG_PRETIMEOUT_GOV is not set
>
> So I suppose I have this part correct since
> CONFIG_WATCHDOG_HANDLE_BOOT_ENABLED is not set.
>
> Yet my EBG watchdog doesn't work :(
> Could there be something else interfering with the boot-enabled iTCO
> watchdog?
> I see that Lynx Point iTCO support was added in v0.6 yet I'm at 0.7. I
> suppose it *SHOULD* work.

Some things to check/try:
- Exclude the kernel completely from the test, just add an infinite
loop at the end of EBG main
- Do older versions / revisions of EBG work?
- When not booting via EBG, does the kernel drive the iTCO correctly?
Does it manage to trigger a reboot?

François Dussault

Nov 1, 2021, 9:40:32 AM
to Jan Kiszka, EFI Boot Guard
I somewhat isolated the problem. I used efibg 0.9 and tested with an infinite loop at the end of efi_main() just before loading the image, and indeed in this case the watchdog does work.
Furthermore, I noticed that disabling the driver (in my case, it is built into the kernel) with initcall_blacklist=iTCO_wdt_init_module in the kernel options does fix my issue (basically, the watchdog reboots the system a little while after booting into Linux, as expected).

So it seems the cause of my issues is that the iTCO_wdt driver somehow resets the watchdog set by efibg at boot. I wonder if it's possible to make the driver load without interfering with the efibg watchdog set at boot.

Second, I noticed in "bg_setenv -c" that the code for the "-c" argument doesn't contain anything that stops the watchdog when confirming. I was somewhat expecting this based on the state machine found in "UPDATE.md", which seems to state that the watchdog reboot doesn't happen if we confirm in time. So even after confirming, my system will reboot (even if, hypothetically, iTCO_wdt did leave the efibg watchdog alone).

One way I see of respecting the UPDATE.md state machine would be to compile the kernel with iTCO_wdt as a module and prevent it from being loaded at boot. Then I could confirm with "modprobe iTCO_wdt; bg_setenv -c", i.e. loading the module to cancel the watchdog reboot, and then confirming.

But... isn't bg_setenv -c supposed to take care of the watchdog?! I'm confused.

Jan Kiszka

Nov 1, 2021, 11:01:42 AM
to François Dussault, Christian Storm, EFI Boot Guard
On 01.11.21 14:40, François Dussault wrote:
> I somewhat isolated the problem. I used efibg 0.9 and tested with an
> infinite loop at the end of efi_main() just before loading the image,
> and indeed in this case the watchdog does work.
> Furthermore, when booting, I noticed that disabling the module (in my
> case, it is built-in the kernel) with
> initcall_blacklist=iTCO_wdt_init_module in the kernel options does fix
> my issue (basically the watchdog does reboot the system a little while
> after booting into linux as expected).
>
> So it seems the cause of my issues is that the iTCO_wdt module somewhat
> resets the watchdog set by efibg at boot. I wonder if it's possible to
> make the module load without messing with the efibg watchdog set at boot.
>

This is what handle_boot_enabled is supposed to do: When disabled, the
kernel driver will not take care of an already running watchdog until
userspace picks up. It should just issue a message like

"watchdog0 running and kernel based pre-userspace handler disabled"

provided it detects the watchdog as running. Seems that is not the case
for the iTCO driver (I see no set_bit(WDOG_HW_RUNNING, ...)). It might
be that the kernel driver actually initializes the watchdog into a
stopped state. That would be bad for our use case, indeed.

Christian, you worked a lot with iTCO. Did you observe such a behavior
with the Linux driver?

> Second, I noticed in "bg_setenv -c" that the code for the "-c" argument
> doesn't contain anything that stops the watchdog when confirming. I was
> somewhat expecting this based on the state machine found in
> "UPDATE.md" which seems to state the watchdog reboot doesn't happen if
> we confirm in time. So even after confirming, my system will reboot
> (even if iTCO_wdt did leave the efibg watchdog alone in an hypothetical
> situation).
>
> One way I see of respecting UPDATE.md state machine (one way to make it
> work) would be to compile the kernel with iTCO_wdt as a module and
> preventing it to be loaded at boot. Then I could confirm with "modprobe
> iTCO_wdt; bg_setenv -c" i.e. loading the module to cancel the watchdog
> reboot, and then confirming
>
> But... isn't bg_setenv -c supposed to take care of the watchdog?! I'm
> confused.
>

Nope, bg_setenv just flips the state in the EBG environment so that the
next boot - no matter if caused by a watchdog or a regular reboot -
does not go back to the previous version. At the point
bg_setenv -c is called, the kernel driver is supposed to manage the
watchdog; EBG only starts it and is not in charge of it afterwards.
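
The split of responsibilities described here can be illustrated with a small model. This is a simplified Python sketch based only on the transitions mentioned in this thread (INSTALLED becomes TESTING on the first boot, TESTING becomes FAILED on a reboot without confirmation). The names UState, on_boot and on_confirm are hypothetical, and "OK" as the confirmed state is an assumption; it is not the EBG implementation.

```python
from enum import Enum


class UState(Enum):
    OK = "OK"                # assumed name of the confirmed state
    INSTALLED = "INSTALLED"
    TESTING = "TESTING"
    FAILED = "FAILED"


def on_boot(state):
    """Model of what happens to ustate when the system (re)boots.
    It does not matter whether the reboot came from the watchdog or not."""
    if state is UState.INSTALLED:
        return UState.TESTING   # first boot of the new version
    if state is UState.TESTING:
        return UState.FAILED    # rebooted without confirmation: roll back
    return state


def on_confirm(state):
    """Model of `bg_setenv -c`: only the stored state changes.
    Note that nothing here touches the watchdog hardware.
    Only the TESTING case discussed in the thread is modelled; other
    states are passed through unchanged in this sketch."""
    if state is UState.TESTING:
        return UState.OK
    return state
```

In this model, on_boot(on_confirm(on_boot(UState.INSTALLED))) stays OK, while on_boot(on_boot(UState.INSTALLED)) ends in FAILED. Whether a reboot actually happens is entirely up to whoever feeds the watchdog.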

Christian Storm

Nov 2, 2021, 9:10:19 AM
to EFI Boot Guard
Hi,

> On 01.11.21 14:40, François Dussault wrote:
> > I somewhat isolated the problem. I used efibg 0.9 and tested with an
> > infinite loop at the end of efi_main() just before loading the image,
> > and indeed in this case the watchdog does work.
> > Furthermore, when booting, I noticed that disabling the module (in my
> > case, it is built-in the kernel) with
> > initcall_blacklist=iTCO_wdt_init_module in the kernel options does fix
> > my issue (basically the watchdog does reboot the system a little while
> > after booting into linux as expected).
> >
> > So it seems the cause of my issues is that the iTCO_wdt module somewhat
> > resets the watchdog set by efibg at boot. I wonder if it's possible to
> > make the module load without messing with the efibg watchdog set at boot.
> >
>
> This is what handle_boot_enabled is supposed to do: When disabled, the
> kernel driver will not take care of an already running watchdog until
> userspace picks up. It should just issue a message like
>
> "watchdog0 running and kernel based pre-userspace handler disabled"
>
> provided it detects the watchdog as running. Seems that is not the case
> for the iTCO driver (I see no set_bit(WDOG_HW_RUNNING, ...)). It might
> be that the kernel driver actually initializes the watchdog into a
> stopped state. That would be bad for our use case, indeed.
>
> Christian, you worked a lot with iTCO. Did you observe such a behavior
> with the Linux driver?

It's been a while but with the kernel version I used back then, no,
I haven't seen such a behavior.


> > Second, I noticed in "bg_setenv -c" that the code for the "-c" argument
> > doesn't contain anything that stops the watchdog when confirming. I was
> > somewhat expecting this based on the state machine found in
> > "UPDATE.md" which seems to state the watchdog reboot doesn't happen if
> > we confirm in time. So even after confirming, my system will reboot
> > (even if iTCO_wdt did leave the efibg watchdog alone in an hypothetical
> > situation).
> >
> > One way I see of respecting UPDATE.md state machine (one way to make it
> > work) would be to compile the kernel with iTCO_wdt as a module and
> > preventing it to be loaded at boot. Then I could confirm with "modprobe
> > iTCO_wdt; bg_setenv -c" i.e. loading the module to cancel the watchdog
> > reboot, and then confirming
> >
> > But... isn't bg_setenv -c supposed to take care of the watchdog?! I'm
> > confused.
> >
>
> Nope, bg_setenv just flips the state in the EBG environment so that the
> next boot - no matter if caused by a watchdog or are regular reboot -
> does not cause to go back to the previous version. At the point
> bt_setenv -c is called, the kernel driver is supposed to manage the
> watchdog, EBG is only starting it and later on not in charge of it anymore.

To second this, this behavior is actually the whole idea of
watchdog-supervising the boot process: The watchdog is enabled in the EFI
domain *prior* to booting into the new "firmware", read: kernel, and then,
within the watchdog interval, the kernel takes over the *armed* watchdog.
The latter has not always been the case, e.g., some BBBs back then
insisted on unconditionally initializing the watchdog but the situation
has meanwhile greatly improved.

If booted successfully, you have to confirm this by telling EBG not to
roll back on next boot, i.e., bg_setenv -c. The userspace part of EBG
is out of the game regarding the watchdog; only the EFI part is involved,
in terms of arming it. So, userspace and kernel must continue feeding
the watchdog.
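
As an illustration of "userspace must continue feeding", here is a minimal Python sketch of a feeder using the standard Linux watchdog character device, where any write to the device counts as a keepalive. The function name feed_watchdog and its parameters are hypothetical; in a real system this job is usually done by an existing service rather than a hand-written script.

```python
import time


def feed_watchdog(device="/dev/watchdog", interval=5.0, rounds=None,
                  sleep=time.sleep):
    """Open the watchdog device and ping it every `interval` seconds.

    rounds=None feeds forever; a number limits the pings (useful for
    testing). Returns the number of pings written.

    The device is closed WITHOUT writing the magic character 'V', so the
    hardware timer is expected to keep running after this function
    returns and will reset the machine unless something else feeds it.
    """
    pings = 0
    # buffering=0 so every write reaches the driver immediately
    with open(device, "wb", buffering=0) as wd:
        while rounds is None or pings < rounds:
            wd.write(b"\0")   # any write is treated as a keepalive
            pings += 1
            sleep(interval)
    return pings
```

The interval has to be shorter than the timeout the watchdog is running with (10 seconds in the setup discussed here, 30 seconds once iTCO_wdt applies its default heartbeat). A sensible order after an update is to start feeding first, run whatever health checks the product needs, and only then call bg_setenv -c.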




Kind regards,
Christian

--
Dr. Christian Storm
Siemens AG, Technology, T RDA IOT SES-DE
Otto-Hahn-Ring 6, 81739 München, Germany