Re: CSD lockup during kexec due to unbounded busy-wait in pl011_console_write_atomic (arm64)

Breno Leitao

Nov 26, 2025, 9:13:51 AM
to gli...@google.com, el...@google.com, dvy...@google.com, usamaa...@gmail.com, leo...@arm.com, linux-ar...@lists.infradead.org, linux-...@vger.kernel.org, kerne...@meta.com, rmi...@meta.com, john....@linutronix.de, pml...@suse.com, li...@armlinux.org.uk, pau...@kernel.org, kasa...@googlegroups.com
On Tue, Nov 25, 2025 at 08:02:16AM -0800, Breno Leitao wrote:
> 6. Meanwhile, kfence's toggle_allocation_gate() on another CPU attempts to
> perform a synchronous operation across all CPUs, which correctly triggers a CSD
> lock timeout because CPU#0 is stuck in the busy loop with IRQs disabled.

I've hacked a patch to disable kfence IPIs during machine shutdown, and
with it loaded, I don't reproduce the problem described in this thread.

Author: Breno Leitao <lei...@debian.org>
Date: Tue Nov 25 07:21:55 2025 -0800

mm/kfence: add reboot notifier to disable KFENCE on shutdown

Register a reboot notifier to disable KFENCE and cancel any pending
timer work during system shutdown. This prevents potential IPI
synchronization issues that can occur when KFENCE is active during
the reboot process.

The notifier runs with high priority (INT_MAX) to ensure KFENCE is
disabled early in the shutdown sequence.

Signed-off-by: Breno Leitao <lei...@debian.org>

diff --git a/mm/kfence/core.c b/mm/kfence/core.c
index 727c20c94ac5..5810afaaf6b4 100644
--- a/mm/kfence/core.c
+++ b/mm/kfence/core.c
@@ -26,6 +26,7 @@
 #include <linux/panic_notifier.h>
 #include <linux/random.h>
 #include <linux/rcupdate.h>
+#include <linux/reboot.h>
 #include <linux/sched/clock.h>
 #include <linux/seq_file.h>
 #include <linux/slab.h>
@@ -819,6 +820,21 @@ static struct notifier_block kfence_check_canary_notifier = {

 static struct delayed_work kfence_timer;

+static int kfence_reboot_callback(struct notifier_block *nb,
+				  unsigned long action, void *data)
+{
+	/* Disable KFENCE to avoid IPI synchronization during shutdown */
+	WRITE_ONCE(kfence_enabled, false);
+	/* Cancel any pending timer work */
+	cancel_delayed_work_sync(&kfence_timer);
+	return NOTIFY_OK;
+}
+
+static struct notifier_block kfence_reboot_notifier = {
+	.notifier_call = kfence_reboot_callback,
+	.priority = INT_MAX, /* Run early to stop timers ASAP */
+};
+
 #ifdef CONFIG_KFENCE_STATIC_KEYS
 /* Wait queue to wake up allocation-gate timer task. */
 static DECLARE_WAIT_QUEUE_HEAD(allocation_wait);
@@ -901,6 +917,8 @@ static void kfence_init_enable(void)
 	if (kfence_check_on_panic)
 		atomic_notifier_chain_register(&panic_notifier_list, &kfence_check_canary_notifier);

+	register_reboot_notifier(&kfence_reboot_notifier);
+
 	WRITE_ONCE(kfence_enabled, true);
 	queue_delayed_work(system_unbound_wq, &kfence_timer, 0);


Alexander, Marco and Kasan maintainers:

What is the potential impact of disabling KFENCE during reboot
procedures?

The primary motivation is to avoid triggering IPIs during the machine
teardown process, mainly when the nbconsole is not running in threaded
mode.

Marco Elver

Nov 26, 2025, 9:55:05 AM
to Breno Leitao, gli...@google.com, dvy...@google.com, usamaa...@gmail.com, leo...@arm.com, linux-ar...@lists.infradead.org, linux-...@vger.kernel.org, kerne...@meta.com, rmi...@meta.com, john....@linutronix.de, pml...@suse.com, li...@armlinux.org.uk, pau...@kernel.org, kasa...@googlegroups.com
Just place it under the #ifdef CONFIG_KFENCE_STATIC_KEYS below, I do
not think this is required if CONFIG_KFENCE_STATIC_KEYS is unset.

> #ifdef CONFIG_KFENCE_STATIC_KEYS
> /* Wait queue to wake up allocation-gate timer task. */
> static DECLARE_WAIT_QUEUE_HEAD(allocation_wait);
> @@ -901,6 +917,8 @@ static void kfence_init_enable(void)
> if (kfence_check_on_panic)
> atomic_notifier_chain_register(&panic_notifier_list, &kfence_check_canary_notifier);
>
> + register_reboot_notifier(&kfence_reboot_notifier);
> +
> WRITE_ONCE(kfence_enabled, true);
> queue_delayed_work(system_unbound_wq, &kfence_timer, 0);
>
>
> Alexander, Marco and Kasan maintainers:
>
> What is the potential impact of disabling KFENCE during reboot
> procedures?

But only if CONFIG_KFENCE_STATIC_KEYS is enabled?
That would be reasonable, given our recommendation has been to disable
CONFIG_KFENCE_STATIC_KEYS since
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=4f612ed3f748962cbef1316ff3d323e2b9055b6e
in most cases.

I believe some low-CPU count systems are still benefiting from it, but
in general, I'd advise against it.

> The primary motivation is to avoid triggering IPIs during the machine
> teardown process, mainly when the nbconsole is not running in threaded
> mode.

Thanks,
-- Marco

Breno Leitao

Nov 26, 2025, 10:54:47 AM
to Marco Elver, gli...@google.com, dvy...@google.com, usamaa...@gmail.com, leo...@arm.com, linux-ar...@lists.infradead.org, linux-...@vger.kernel.org, kerne...@meta.com, rmi...@meta.com, john....@linutronix.de, pml...@suse.com, li...@armlinux.org.uk, pau...@kernel.org, kasa...@googlegroups.com
Hello Marco,

On Wed, Nov 26, 2025 at 03:54:26PM +0100, Marco Elver wrote:
> On Wed, 26 Nov 2025 at 15:13, Breno Leitao <lei...@debian.org> wrote:
> > +static int kfence_reboot_callback(struct notifier_block *nb,
> > + unsigned long action, void *data)
> > +{
> > + /* Disable KFENCE to avoid IPI synchronization during shutdown */
> > + WRITE_ONCE(kfence_enabled, false);
> > + /* Cancel any pending timer work */
> > + cancel_delayed_work_sync(&kfence_timer);
> > + return NOTIFY_OK;
> > +}
> > +
> > +static struct notifier_block kfence_reboot_notifier = {
> > + .notifier_call = kfence_reboot_callback,
> > + .priority = INT_MAX, /* Run early to stop timers ASAP */
> > +};
>
> Just place it under the #ifdef CONFIG_KFENCE_STATIC_KEYS below, I do
> not think this is required if CONFIG_KFENCE_STATIC_KEYS is unset.

Ack. This is only needed for CONFIG_KFENCE_STATIC_KEYS, my bad.

> > Alexander, Marco and Kasan maintainers:
> >
> > What is the potential impact of disabling KFENCE during reboot
> > procedures?
>
> But only if CONFIG_KFENCE_STATIC_KEYS is enabled?
> That would be reasonable, given our recommendation has been to disable
> CONFIG_KFENCE_STATIC_KEYS since
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=4f612ed3f748962cbef1316ff3d323e2b9055b6e
> in most cases.
>
> I believe some low-CPU count systems are still benefiting from it, but
> in general, I'd advise against it.

Thanks for your review and guidance.

Just to confirm my understanding: You’re okay with me adding this
notifier specifically for CONFIG_KFENCE_STATIC_KEYS (which is what
I need), but you would not support adding it for the general case where
!CONFIG_KFENCE_STATIC_KEYS, correct?

Thanks again,
--breno

Marco Elver

Nov 26, 2025, 11:09:39 AM
to Breno Leitao, gli...@google.com, dvy...@google.com, usamaa...@gmail.com, leo...@arm.com, linux-ar...@lists.infradead.org, linux-...@vger.kernel.org, kerne...@meta.com, rmi...@meta.com, john....@linutronix.de, pml...@suse.com, li...@armlinux.org.uk, pau...@kernel.org, kasa...@googlegroups.com
On Wed, 26 Nov 2025 at 16:54, Breno Leitao <lei...@debian.org> wrote:
[..]
> > > Alexander, Marco and Kasan maintainers:
> > >
> > > What is the potential impact of disabling KFENCE during reboot
> > > procedures?
> >
> > But only if CONFIG_KFENCE_STATIC_KEYS is enabled?
> > That would be reasonable, given our recommendation has been to disable
> > CONFIG_KFENCE_STATIC_KEYS since
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=4f612ed3f748962cbef1316ff3d323e2b9055b6e
> > in most cases.
> >
> > I believe some low-CPU count systems are still benefiting from it, but
> > in general, I'd advise against it.
>
> Thanks for your review and guidance.
>
> Just to confirm my understanding: You’re okay with me adding this
> notifier specifically for CONFIG_KFENCE_STATIC_KEYS (which is what
> I need), but you would not support adding it for the general case where
> !CONFIG_KFENCE_STATIC_KEYS, correct?

Yes, correct. If there's a real issue with CONFIG_KFENCE_STATIC_KEYS,
it's worth fixing if there are still valid uses for it. But I wouldn't
pessimize the now default mode, which is !CONFIG_KFENCE_STATIC_KEYS,
as it doesn't appear to have this problem.

Thanks,
-- Marco

Breno Leitao

Nov 26, 2025, 11:37:51 AM
to Marco Elver, gli...@google.com, dvy...@google.com, usamaa...@gmail.com, leo...@arm.com, linux-ar...@lists.infradead.org, linux-...@vger.kernel.org, kerne...@meta.com, rmi...@meta.com, john....@linutronix.de, pml...@suse.com, li...@armlinux.org.uk, pau...@kernel.org, kasa...@googlegroups.com
On Wed, Nov 26, 2025 at 05:08:59PM +0100, Marco Elver wrote:
> On Wed, 26 Nov 2025 at 16:54, Breno Leitao <lei...@debian.org> wrote:
> [..]
> > > > Alexander, Marco and Kasan maintainers:
> > > >
> > > > What is the potential impact of disabling KFENCE during reboot
> > > > procedures?
> > >
> > > But only if CONFIG_KFENCE_STATIC_KEYS is enabled?
> > > That would be reasonable, given our recommendation has been to disable
> > > CONFIG_KFENCE_STATIC_KEYS since
> > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=4f612ed3f748962cbef1316ff3d323e2b9055b6e
> > > in most cases.
> > >
> > > I believe some low-CPU count systems are still benefiting from it, but
> > > in general, I'd advise against it.
> >
> > Thanks for your review and guidance.
> >
> > Just to confirm my understanding: You’re okay with me adding this
> > notifier specifically for CONFIG_KFENCE_STATIC_KEYS (which is what
> > I need), but you would not support adding it for the general case where
> > !CONFIG_KFENCE_STATIC_KEYS, correct?
>
> Yes, correct. If there's a real issue with CONFIG_KFENCE_STATIC_KEYS,
> it's worth fixing if there are still valid uses for it.

Thanks for clarifying. I'll submit the patch with changes limited to
CONFIG_KFENCE_STATIC_KEYS.

--breno
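
For reference, the shape agreed on in this thread (a reboot notifier that
is only compiled in under CONFIG_KFENCE_STATIC_KEYS) can be modelled in
user space. This is a minimal sketch, not the patch that was submitted.
Everything prefixed with mock_ is hypothetical, as is the simplified
struct notifier_block: they only stand in for the kernel's reboot notifier
chain and delayed work. The plain assignment replaces WRITE_ONCE(), and
clearing mock_timer_pending replaces cancel_delayed_work_sync().

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumption: defined here only to model a kernel built with static keys. */
#define CONFIG_KFENCE_STATIC_KEYS 1

#define NOTIFY_OK 1

/* Simplified user-space stand-in for the kernel's struct notifier_block. */
struct notifier_block {
	int (*notifier_call)(struct notifier_block *nb,
			     unsigned long action, void *data);
	struct notifier_block *next;
	int priority;
};

/* Hypothetical reboot chain, kept sorted by descending priority. */
static struct notifier_block *mock_reboot_chain;

static void mock_register_reboot_notifier(struct notifier_block *nb)
{
	struct notifier_block **p = &mock_reboot_chain;

	assert(nb && nb->notifier_call);
	/* Insert before the first entry with a strictly lower priority. */
	while (*p && (*p)->priority >= nb->priority)
		p = &(*p)->next;
	nb->next = *p;
	*p = nb;
}

/* Walk the chain in priority order, as a reboot would. */
static void mock_call_reboot_chain(unsigned long action)
{
	struct notifier_block *nb;

	for (nb = mock_reboot_chain; nb; nb = nb->next)
		nb->notifier_call(nb, action, NULL);
}

static bool kfence_enabled;
static bool mock_timer_pending;	/* stands in for the queued kfence_timer */

#ifdef CONFIG_KFENCE_STATIC_KEYS
static int kfence_reboot_callback(struct notifier_block *nb,
				  unsigned long action, void *data)
{
	(void)nb;
	(void)action;
	(void)data;
	/* Kernel code would use WRITE_ONCE(kfence_enabled, false). */
	kfence_enabled = false;
	/* Kernel code would call cancel_delayed_work_sync(&kfence_timer). */
	mock_timer_pending = false;
	return NOTIFY_OK;
}

static struct notifier_block kfence_reboot_notifier = {
	.notifier_call = kfence_reboot_callback,
	.priority = INT_MAX,	/* run before lower-priority notifiers */
};
#endif

static void mock_kfence_init_enable(void)
{
#ifdef CONFIG_KFENCE_STATIC_KEYS
	/* Only the static-keys build registers the notifier. */
	mock_register_reboot_notifier(&kfence_reboot_notifier);
#endif
	kfence_enabled = true;
	mock_timer_pending = true;
}
```

Calling mock_kfence_init_enable() and then mock_call_reboot_chain(0)
leaves kfence_enabled false and no timer pending. With the #define
removed, nothing is registered and the default
!CONFIG_KFENCE_STATIC_KEYS path is left untouched, which is the
behaviour Marco asked for.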