Re: v5.19-rc2-rt3: mm/kfence might_sleep() splat


Sebastian Andrzej Siewior

Jun 24, 2022, 5:05:29 AM
to Mike Galbraith, RT, Alexander Potapenko, Marco Elver, Dmitry Vyukov, kasa...@googlegroups.com
On 2022-06-18 11:34:51 [+0200], Mike Galbraith wrote:
> I moved the prandom_u32_max() call in kfence_guarded_alloc() out from
> under raw spinlock to shut this one up.

Care to send a patch? I don't even know why kfence_metadata::lock is a
raw_spinlock_t. This is the case since the beginning of the code.

> [ 1.128544] BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:46
> [ 1.128546] in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 90, name: kworker/u16:3
> [ 1.128547] preempt_count: 1, expected: 0
> [ 1.128548] RCU nest depth: 1, expected: 1
> [ 1.128549] CPU: 3 PID: 90 Comm: kworker/u16:3 Tainted: G W 5.19.0.g0639b59-master-rt #2 55e5fbd63d8381661776ddec390c2b764f305c0b
> [ 1.128551] Hardware name: MEDION MS-7848/MS-7848, BIOS M7848W08.20C 09/23/2013
> [ 1.128552] Workqueue: events_unbound async_run_entry_fn
> [ 1.128556] Call Trace:
> [ 1.128557] <TASK>
> [ 1.128558] dump_stack_lvl+0x44/0x58
> [ 1.128562] __might_resched+0x141/0x160
> [ 1.128566] rt_spin_lock+0x2d/0x70
> [ 1.128569] get_random_u32+0x45/0x100
> [ 1.128575] __kfence_alloc+0x3f4/0x6c0
> [ 1.128647] kmem_cache_alloc_lru+0x1d8/0x220
> [ 1.128649] xas_alloc+0x9b/0xc0
> [ 1.128651] xas_create+0x20c/0x390
> [ 1.128653] xas_store+0x52/0x5a0
> [ 1.128655] __filemap_add_folio+0x189/0x5a0
> [ 1.128660] filemap_add_folio+0x38/0xa0
> [ 1.128661] __filemap_get_folio+0x1b0/0x580
> [ 1.128665] pagecache_get_page+0x13/0x80
> [ 1.128667] simple_write_begin+0x20/0x2d0
> [ 1.128669] generic_perform_write+0xae/0x1e0
> [ 1.128671] __generic_file_write_iter+0x141/0x180
> [ 1.128672] generic_file_write_iter+0x5d/0xb0
> [ 1.128674] __kernel_write+0x139/0x2f0
> [ 1.128676] kernel_write+0x56/0x1a0
> [ 1.128678] xwrite.constprop.8+0x35/0x8e
> [ 1.128682] do_copy+0xee/0x13a
> [ 1.128685] write_buffer+0x27/0x37
> [ 1.128687] flush_buffer+0x34/0x8b
> [ 1.128690] unxz+0x1b8/0x301
> [ 1.128695] unpack_to_rootfs+0x17f/0x2ae
> [ 1.128698] do_populate_rootfs+0x59/0x108
> [ 1.128700] async_run_entry_fn+0x2b/0x110
> [ 1.128701] process_one_work+0x21f/0x4a0
> [ 1.128703] worker_thread+0x39/0x3d0
> [ 1.128706] kthread+0x13e/0x160
> [ 1.128709] ret_from_fork+0x1f/0x30
> [ 1.128711] </TASK>

Sebastian

Marco Elver

Jun 24, 2022, 5:11:32 AM
to Sebastian Andrzej Siewior, Mike Galbraith, RT, Alexander Potapenko, Dmitry Vyukov, kasa...@googlegroups.com
On Fri, 24 Jun 2022 at 11:05, Sebastian Andrzej Siewior
<big...@linutronix.de> wrote:
>
> On 2022-06-18 11:34:51 [+0200], Mike Galbraith wrote:
> > I moved the prandom_u32_max() call in kfence_guarded_alloc() out from
> > under raw spinlock to shut this one up.
>
> Care to send a patch? I don't even know why kfence_metadata::lock is a
> raw_spinlock_t. This is the case since the beginning of the code.

Because kfence_handle_page_fault() may be called from anywhere, incl.
other raw_spinlock critical sections. We have this problem with all
debugging tools where the bug may manifest anywhere.

A patch for it already exists in -mm:
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git/commit/?h=mm-hotfixes-stable&id=327b18b7aaed5de3b548212e3ab75133bf323759
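
For readers following the thread, here is a minimal userspace sketch of the pattern Mike describes: draw the random value first, then take the lock that must never sleep. It is not the kernel patch. The struct, the atomic_flag spinlock (standing in for a raw_spinlock_t), and random_below() (standing in for prandom_u32_max()) are all hypothetical names chosen for illustration. The motivation, as the trace suggests, is that on PREEMPT_RT the random helper ends up in rt_spin_lock(), which may sleep and so must not run inside a raw spinlock critical section.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-in for a metadata object; not the kernel's struct. */
struct meta {
    atomic_flag lock;   /* plays the role of the raw (never-sleeping) spinlock */
    unsigned int slot;  /* randomly chosen placement */
    int in_use;
};

static void meta_init(struct meta *m)
{
    atomic_flag_clear(&m->lock);
    m->slot = 0;
    m->in_use = 0;
}

static void meta_lock(struct meta *m)
{
    /* Busy-wait until the flag was previously clear. */
    while (atomic_flag_test_and_set_explicit(&m->lock, memory_order_acquire))
        ;
}

static void meta_unlock(struct meta *m)
{
    atomic_flag_clear_explicit(&m->lock, memory_order_release);
}

/* Assumed stand-in for prandom_u32_max(): a value in [0, bound). */
static unsigned int random_below(unsigned int bound)
{
    assert(bound > 0);
    return (unsigned int)rand() % bound;
}

static void guarded_alloc(struct meta *m, unsigned int nslots)
{
    /* Do the potentially sleeping work before entering the critical section. */
    unsigned int pick = random_below(nslots);

    meta_lock(m);
    m->slot = pick;     /* only plain stores happen under the lock */
    m->in_use = 1;
    meta_unlock(m);
}
```

The critical section now contains nothing that could block, which is the property a raw spinlock requires on PREEMPT_RT.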

Sebastian Andrzej Siewior

Jun 24, 2022, 5:19:46 AM
to Marco Elver, Mike Galbraith, RT, Alexander Potapenko, Dmitry Vyukov, kasa...@googlegroups.com
On 2022-06-24 11:10:55 [+0200], Marco Elver wrote:
> On Fri, 24 Jun 2022 at 11:05, Sebastian Andrzej Siewior
> <big...@linutronix.de> wrote:
> >
> > On 2022-06-18 11:34:51 [+0200], Mike Galbraith wrote:
> > > I moved the prandom_u32_max() call in kfence_guarded_alloc() out from
> > > under raw spinlock to shut this one up.
> >
> > Care to send a patch? I don't even know why kfence_metadata::lock is a
> > raw_spinlock_t. This is the case since the beginning of the code.
>
> Because kfence_handle_page_fault() may be called from anywhere, incl.
> other raw_spinlock critical sections. We have this problem with all
> debugging tools where the bug may manifest anywhere.

Oh, thank you. I had some vague memory of this but couldn't find anything.
Thanks for the pointer.

Sebastian