[syzbot] [rcu?] [bcachefs?] BUG: unable to handle kernel NULL pointer dereference in rcu_core (3)


syzbot

Feb 4, 2025, 7:34:21 PM
to ak...@linux-foundation.org, jo...@joshtriplett.org, kent.ov...@linux.dev, linux-b...@vger.kernel.org, linux-...@vger.kernel.org, linu...@kvack.org, pau...@kernel.org, r...@vger.kernel.org, syzkall...@googlegroups.com
Hello,

syzbot found the following issue on:

HEAD commit: 0de63bb7d919 Merge tag 'pull-fix' of git://git.kernel.org/..
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=10faf5f8580000
kernel config: https://syzkaller.appspot.com/x/.config?x=1909f2f0d8e641ce
dashboard link: https://syzkaller.appspot.com/bug?extid=80e5d6f453f14a53383a
compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=16b69d18580000

Downloadable assets:
disk image (non-bootable): https://storage.googleapis.com/syzbot-assets/7feb34a89c2a/non_bootable_disk-0de63bb7.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/1142009a30a7/vmlinux-0de63bb7.xz
kernel image: https://storage.googleapis.com/syzbot-assets/5d9e46a8998d/bzImage-0de63bb7.xz
mounted in repro: https://storage.googleapis.com/syzbot-assets/526692501242/mount_0.gz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+80e5d6...@syzkaller.appspotmail.com

slab radix_tree_node start ffff88803bf382c0 pointer offset 24 size 576
BUG: kernel NULL pointer dereference, address: 0000000000000000
#PF: supervisor instruction fetch in kernel mode
#PF: error_code(0x0010) - not-present page
PGD 0 P4D 0
Oops: Oops: 0010 [#1] PREEMPT SMP KASAN NOPTI
CPU: 0 UID: 0 PID: 5705 Comm: syz-executor Not tainted 6.14.0-rc1-syzkaller-00020-g0de63bb7d919 #0
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014
RIP: 0010:0x0
Code: Unable to access opcode bytes at 0xffffffffffffffd6.
RSP: 0018:ffffc90000007bd8 EFLAGS: 00010246
RAX: dffffc0000000000 RBX: 1ffff110077e705c RCX: 23438dd059a4b100
RDX: 0000000000000100 RSI: 0000000000000000 RDI: ffff88803bf382d8
RBP: ffffc90000007e10 R08: ffffffff819f146c R09: 1ffff11003f8519a
R10: dffffc0000000000 R11: 0000000000000000 R12: ffffffff81a6d507
R13: ffff88803bf382e0 R14: 0000000000000000 R15: ffff88803bf382d8
FS: 0000555567992500(0000) GS:ffff88801fc00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffffffffffffd6 CR3: 000000004da38000 CR4: 0000000000352ef0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<IRQ>
rcu_do_batch kernel/rcu/tree.c:2546 [inline]
rcu_core+0xaaa/0x17a0 kernel/rcu/tree.c:2802
handle_softirqs+0x2d4/0x9b0 kernel/softirq.c:561
__do_softirq kernel/softirq.c:595 [inline]
invoke_softirq kernel/softirq.c:435 [inline]
__irq_exit_rcu+0xf7/0x220 kernel/softirq.c:662
irq_exit_rcu+0x9/0x30 kernel/softirq.c:678
instr_sysvec_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1049 [inline]
sysvec_apic_timer_interrupt+0xa6/0xc0 arch/x86/kernel/apic/apic.c:1049
</IRQ>
<TASK>
asm_sysvec_apic_timer_interrupt+0x1a/0x20 arch/x86/include/asm/idtentry.h:702
RIP: 0010:__raw_spin_unlock_irqrestore include/linux/spinlock_api_smp.h:152 [inline]
RIP: 0010:_raw_spin_unlock_irqrestore+0xd8/0x140 kernel/locking/spinlock.c:194
Code: 9c 8f 44 24 20 42 80 3c 23 00 74 08 4c 89 f7 e8 fe 78 2d f6 f6 44 24 21 02 75 52 41 f7 c7 00 02 00 00 74 01 fb bf 01 00 00 00 <e8> c3 0f 95 f5 65 8b 05 d4 58 0b 74 85 c0 74 43 48 c7 04 24 0e 36
RSP: 0018:ffffc900030fef60 EFLAGS: 00000206
RAX: 23438dd059a4b100 RBX: 1ffff9200061fdf0 RCX: ffffffff819b316a
RDX: dffffc0000000000 RSI: ffffffff8c0aa680 RDI: 0000000000000001
RBP: ffffc900030feff8 R08: ffffffff942f9847 R09: 1ffffffff285f308
R10: dffffc0000000000 R11: fffffbfff285f309 R12: dffffc0000000000
R13: 1ffff9200061fdec R14: ffffc900030fef80 R15: 0000000000000246
spin_unlock_irqrestore include/linux/spinlock.h:406 [inline]
rmqueue_bulk mm/page_alloc.c:2329 [inline]
__rmqueue_pcplist+0x21fd/0x2a90 mm/page_alloc.c:3004
rmqueue_pcplist mm/page_alloc.c:3046 [inline]
rmqueue mm/page_alloc.c:3077 [inline]
get_page_from_freelist+0x886/0x37a0 mm/page_alloc.c:3474
__alloc_frozen_pages_noprof+0x292/0x710 mm/page_alloc.c:4739
alloc_pages_mpol+0x311/0x660 mm/mempolicy.c:2270
folio_alloc_mpol_noprof mm/mempolicy.c:2289 [inline]
vma_alloc_folio_noprof+0x12b/0x260 mm/mempolicy.c:2324
folio_prealloc+0x2e/0x170
wp_page_copy mm/memory.c:3435 [inline]
do_wp_page+0x1253/0x49b0 mm/memory.c:3827
handle_pte_fault mm/memory.c:5905 [inline]
__handle_mm_fault+0x24d5/0x70f0 mm/memory.c:6032
handle_mm_fault+0x3e5/0x8d0 mm/memory.c:6201
do_user_addr_fault arch/x86/mm/fault.c:1388 [inline]
handle_page_fault arch/x86/mm/fault.c:1480 [inline]
exc_page_fault+0x2b9/0x8b0 arch/x86/mm/fault.c:1538
asm_exc_page_fault+0x26/0x30 arch/x86/include/asm/idtentry.h:623
RIP: 0010:__put_user_4+0x11/0x20 arch/x86/lib/putuser.S:88
Code: 1f 84 00 00 00 00 00 66 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 48 89 cb 48 c1 fb 3f 48 09 d9 0f 01 cb <89> 01 31 c9 0f 01 ca c3 cc cc cc cc 0f 1f 00 90 90 90 90 90 90 90
RSP: 0018:ffffc900030fff00 EFLAGS: 00050202
RAX: 0000000000000005 RBX: 0000000000000000 RCX: 00005555679927d0
RDX: 0000000000000000 RSI: ffffffff8c0ab8e0 RDI: ffffffff8c608a00
RBP: ffff888000dfcf20 R08: ffffffff901b5177 R09: 1ffffffff2036a2e
R10: dffffc0000000000 R11: fffffbfff2036a2f R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000005 R15: dffffc0000000000
schedule_tail+0x96/0xb0 kernel/sched/core.c:5312
ret_from_fork+0x24/0x80 arch/x86/kernel/process.c:144
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
</TASK>
Modules linked in:
CR2: 0000000000000000
---[ end trace 0000000000000000 ]---
RIP: 0010:0x0
Code: Unable to access opcode bytes at 0xffffffffffffffd6.
RSP: 0018:ffffc90000007bd8 EFLAGS: 00010246
RAX: dffffc0000000000 RBX: 1ffff110077e705c RCX: 23438dd059a4b100
RDX: 0000000000000100 RSI: 0000000000000000 RDI: ffff88803bf382d8
RBP: ffffc90000007e10 R08: ffffffff819f146c R09: 1ffff11003f8519a
R10: dffffc0000000000 R11: 0000000000000000 R12: ffffffff81a6d507
R13: ffff88803bf382e0 R14: 0000000000000000 R15: ffff88803bf382d8
FS: 0000555567992500(0000) GS:ffff88801fc00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffffffffffffd6 CR3: 000000004da38000 CR4: 0000000000352ef0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
----------------
Code disassembly (best guess):
0: 9c pushf
1: 8f 44 24 20 pop 0x20(%rsp)
5: 42 80 3c 23 00 cmpb $0x0,(%rbx,%r12,1)
a: 74 08 je 0x14
c: 4c 89 f7 mov %r14,%rdi
f: e8 fe 78 2d f6 call 0xf62d7912
14: f6 44 24 21 02 testb $0x2,0x21(%rsp)
19: 75 52 jne 0x6d
1b: 41 f7 c7 00 02 00 00 test $0x200,%r15d
22: 74 01 je 0x25
24: fb sti
25: bf 01 00 00 00 mov $0x1,%edi
* 2a: e8 c3 0f 95 f5 call 0xf5950ff2 <-- trapping instruction
2f: 65 8b 05 d4 58 0b 74 mov %gs:0x740b58d4(%rip),%eax # 0x740b590a
36: 85 c0 test %eax,%eax
38: 74 43 je 0x7d
3a: 48 rex.W
3b: c7 .byte 0xc7
3c: 04 24 add $0x24,%al
3e: 0e (bad)
3f: 36 ss


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzk...@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

If the report is already addressed, let syzbot know by replying with:
#syz fix: exact-commit-title

If you want syzbot to run the reproducer, reply with:
#syz test: git://repo/address.git branch-or-commit-hash
If you attach or paste a git patch, syzbot will apply it before testing.

If you want to overwrite report's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)

If the report is a duplicate of another one, reply with:
#syz dup: exact-subject-of-another-report

If you want to undo deduplication, reply with:
#syz undup

Paul E. McKenney

Feb 5, 2025, 9:56:22 AM
to syzbot, ak...@linux-foundation.org, jo...@joshtriplett.org, kent.ov...@linux.dev, linux-b...@vger.kernel.org, linux-...@vger.kernel.org, linu...@kvack.org, r...@vger.kernel.org, syzkall...@googlegroups.com
The usual way that this happens is that someone clobbers the rcu_head
structure of something that has been passed to call_rcu(). The most
popular way of clobbering this structure is to pass the same something to
call_rcu() twice in a row, but other creative arrangements are possible.

Building your kernel with CONFIG_DEBUG_OBJECTS_RCU_HEAD=y can usually
spot invoking call_rcu() twice in a row.
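
One plausible way a double call_rcu() ends in a jump to address 0 can be seen in a toy userspace model. This is only a sketch under stated assumptions: every toy_* name is hypothetical, the list is a plain singly linked queue rather than the kernel's segmented callback list, and the drain loop counts NULL callbacks instead of calling them.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for struct rcu_head and a callback list (hypothetical). */
struct toy_head {
	struct toy_head *next;
	void (*func)(struct toy_head *);
};

struct toy_list {
	struct toy_head *head;
	struct toy_head **tail;
};

static int toy_invoked;

static void toy_cb(struct toy_head *h)
{
	(void)h;
	toy_invoked++;
}

static void toy_list_init(struct toy_list *l)
{
	l->head = NULL;
	l->tail = &l->head;
}

/* Tail enqueue with no protection against queueing the same node twice. */
static void toy_call(struct toy_list *l, struct toy_head *h,
		     void (*func)(struct toy_head *))
{
	h->func = func;
	h->next = NULL;
	*l->tail = h;
	l->tail = &h->next;
}

/*
 * Visit up to `limit` entries. Returns how many entries already had a
 * NULL func when visited; real code would jump to address 0 there.
 */
static int toy_do_batch(struct toy_list *l, int limit)
{
	int null_calls = 0;

	while (l->head && limit-- > 0) {
		struct toy_head *h = l->head;
		void (*f)(struct toy_head *) = h->func;

		l->head = h->next;
		h->func = NULL;	/* cleared once the callback has been picked up */
		if (!f) {
			null_calls++;
			continue;
		}
		f(h);
	}
	return null_calls;
}

/* Queue the same node twice in a row, then drain three entries. */
int toy_double_queue_demo(void)
{
	struct toy_list l;
	struct toy_head a;

	toy_list_init(&l);
	toy_call(&l, &a, toy_cb);
	toy_call(&l, &a, toy_cb);	/* second call makes a.next point at a */
	return toy_do_batch(&l, 3);
}
```

In this model the second enqueue turns the node into a one-element cycle, so the first visit runs the callback and every later visit finds the cleared function pointer.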

Thanx, Paul

syzbot

Jun 8, 2025, 2:58:04 AM
to ak...@linux-foundation.org, ayaanmi...@gmail.com, ayaanmir...@gmail.com, jo...@joshtriplett.org, kent.ov...@linux.dev, linux-b...@vger.kernel.org, linux-...@vger.kernel.org, linu...@kvack.org, lu...@kernel.org, pau...@kernel.org, pet...@infradead.org, r...@vger.kernel.org, syzkall...@googlegroups.com, tg...@linutronix.de
syzbot has bisected this issue to:

commit 14152654805256d760315ec24e414363bfa19a06
Author: Kent Overstreet <kent.ov...@linux.dev>
Date: Mon Nov 25 05:21:27 2024 +0000

bcachefs: Bad btree roots are now autofix

bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=12fa0a82580000
start commit: 99fa936e8e4f Merge tag 'affs-6.14-rc5-tag' of git://git.ke..
git tree: upstream
final oops: https://syzkaller.appspot.com/x/report.txt?x=11fa0a82580000
console output: https://syzkaller.appspot.com/x/log.txt?x=16fa0a82580000
kernel config: https://syzkaller.appspot.com/x/.config?x=523d3ff8e053340a
dashboard link: https://syzkaller.appspot.com/bug?extid=80e5d6f453f14a53383a
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=119d35a8580000

Reported-by: syzbot+80e5d6...@syzkaller.appspotmail.com
Fixes: 141526548052 ("bcachefs: Bad btree roots are now autofix")

For information about bisection process see: https://goo.gl/tpsmEJ#bisection

Kent Overstreet

Jun 8, 2025, 11:26:37 AM
to Paul E. McKenney, syzbot, ak...@linux-foundation.org, jo...@joshtriplett.org, linux-b...@vger.kernel.org, linux-...@vger.kernel.org, linu...@kvack.org, r...@vger.kernel.org, syzkall...@googlegroups.com
I don't think it's that - syzbot's .config already has that enabled.
KASAN, too.

And the only place we do call_rcu() is from rcu_pending.c, where we've
got a rearming rcu callback - but we track whether it's outstanding, and
we do all relevant operations with a lock held.

And we only use rcu_pending.c with SRCU, not regular RCU.

We do use kfree_rcu() in a few places (all boring, I expect), but that
doesn't (generally?) use the rcu callback list.

So I'm not sure this is even a bcachefs bug.

Uladzislau Rezki

Jun 8, 2025, 2:23:41 PM
to Kent Overstreet, Paul E. McKenney, Paul E. McKenney, syzbot, ak...@linux-foundation.org, jo...@joshtriplett.org, linux-b...@vger.kernel.org, linux-...@vger.kernel.org, linu...@kvack.org, r...@vger.kernel.org, syzkall...@googlegroups.com
Right, kvfree_rcu() does not intersect with regular callbacks, it has
its own path.

It looks like the problem is here:

<snip>
f = rhp->func;
debug_rcu_head_callback(rhp);
WRITE_ONCE(rhp->func, (rcu_callback_t)0L);
f(rhp);
<snip>

We do not check whether the callback, "f", is NULL. If it is, the kernel
crashes right away. For example:

call_rcu(&rh, NULL);

@Paul, do you think it makes sense to narrow down the callers that
apparently pass NULL as a callback? To me that seems to be the case in
this bug, but we do not know the source.

It would at least give a stack trace of the caller that passes NULL.
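
A minimal userspace sketch of why the check belongs at queueing time, assuming hypothetical demo_* names and using __func__ as a stand-in for the stack trace a kernel warning would print. It is not kernel code; it only shows that the caller is still known when the callback is queued and is no longer known when the queue is drained later.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

typedef void (*demo_cb_t)(void *arg);

struct demo_entry {
	demo_cb_t func;
	void *arg;
};

#define DEMO_MAX 8

static struct demo_entry demo_queue[DEMO_MAX];
static int demo_len;
static const char *demo_offender = "";

/* Reject a NULL callback while the caller can still be named. */
static int demo_enqueue(demo_cb_t func, void *arg, const char *caller)
{
	if (!func) {
		demo_offender = caller;
		fprintf(stderr, "NULL callback queued by %s()\n", caller);
		return -1;
	}
	if (demo_len == DEMO_MAX)
		return -1;
	demo_queue[demo_len].func = func;
	demo_queue[demo_len].arg = arg;
	demo_len++;
	return 0;
}

/* Deferred invocation: nothing here records who queued each entry. */
static void demo_run_all(void)
{
	for (int i = 0; i < demo_len; i++)
		demo_queue[i].func(demo_queue[i].arg);
	demo_len = 0;
}

static void demo_bump(void *arg)
{
	(*(int *)arg)++;
}

static int demo_buggy_caller(void)
{
	return demo_enqueue(NULL, NULL, __func__);
}

/* Returns 1 if the bad caller was named and the good callback still ran. */
int demo_null_check(void)
{
	int hits = 0;

	demo_enqueue(demo_bump, &hits, __func__);
	demo_buggy_caller();
	demo_run_all();
	return hits == 1 && strcmp(demo_offender, "demo_buggy_caller") == 0;
}
```

Without the check in demo_enqueue(), the NULL entry would only be noticed inside demo_run_all(), where the offending caller is no longer on the stack.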

--
Uladzislau Rezki

Paul E. McKenney

Jun 8, 2025, 8:25:08 PM
to Uladzislau Rezki, Kent Overstreet, syzbot, ak...@linux-foundation.org, jo...@joshtriplett.org, linux-b...@vger.kernel.org, linux-...@vger.kernel.org, linu...@kvack.org, r...@vger.kernel.org, syzkall...@googlegroups.com
Adding a check for NULL func passed to __call_rcu_common(), you mean?

That wouldn't hurt, and would either (as you say) catch the culprit
or show that the problem is elsewhere.

Thanx, Paul

Hillf Danton

Jun 9, 2025, 1:29:43 AM
to syzbot, linux-...@vger.kernel.org, syzkall...@googlegroups.com
On Sun, Jun 08, 2025 at 08:23:36PM +0200, Uladzislau Rezki wrote:
> On Sun, Jun 08, 2025 at 11:26:28AM -0400, Kent Overstreet wrote:
> > On Wed, Feb 05, 2025 at 06:56:19AM -0800, Paul E. McKenney wrote:
> > > On Tue, Feb 04, 2025 at 04:34:18PM -0800, syzbot wrote:
> > > > Hello,
> > > >
> > > > syzbot found the following issue on:
> > > >
> > > > HEAD commit: 0de63bb7d919 Merge tag 'pull-fix' of git://git.kernel.org/..
> > > > git tree: upstream
> > > > console output: https://syzkaller.appspot.com/x/log.txt?x=10faf5f8580000
> > > > kernel config: https://syzkaller.appspot.com/x/.config?x=1909f2f0d8e641ce
> > > > dashboard link: https://syzkaller.appspot.com/bug?extid=80e5d6f453f14a53383a
> > > > compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
> > > > syz repro: https://syzkaller.appspot.com/x/repro.syz?x=16b69d18580000

#syz test upstream master

--- x/kernel/rcu/tree.c
+++ y/kernel/rcu/tree.c
@@ -3071,6 +3071,7 @@ __call_rcu_common(struct rcu_head *head,

/* Misaligned rcu_head! */
WARN_ON_ONCE((unsigned long)head & (sizeof(void *) - 1));
+ BUG_ON(!func);

if (debug_rcu_head_queue(head)) {
/*
--

syzbot

Jun 9, 2025, 2:01:06 AM
to hda...@sina.com, linux-...@vger.kernel.org, syzkall...@googlegroups.com
Hello,

syzbot has tested the proposed patch and the reproducer did not trigger any issue:

Reported-by: syzbot+80e5d6...@syzkaller.appspotmail.com
Tested-by: syzbot+80e5d6...@syzkaller.appspotmail.com

Tested on:

commit: 19272b37 Linux 6.16-rc1
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=17e2ca82580000
kernel config: https://syzkaller.appspot.com/x/.config?x=713d218acd33d94
dashboard link: https://syzkaller.appspot.com/bug?extid=80e5d6f453f14a53383a
compiler: Debian clang version 20.1.6 (++20250514063057+1e4d39e07757-1~exp1~20250514183223.118), Debian LLD 20.1.6
patch: https://syzkaller.appspot.com/x/patch.diff?x=1398f570580000

Note: testing is done by a robot and is best-effort only.

Uladzislau Rezki

Jun 9, 2025, 4:35:40 AM
to Paul E. McKenney, Uladzislau Rezki, Kent Overstreet, syzbot, ak...@linux-foundation.org, jo...@joshtriplett.org, linux-b...@vger.kernel.org, linux-...@vger.kernel.org, linu...@kvack.org, r...@vger.kernel.org, syzkall...@googlegroups.com
Yes. Currently there is no check at all, so passing NULL just triggers
a kernel panic.

>
> That wouldn't hurt, and would either (as you say) catch the culprit
> or show that the problem is elsewhere.
>
I can add it then and send out the patch if there are no objections.

--
Uladzislau Rezki

Paul E. McKenney

Jun 9, 2025, 5:47:35 AM
to Uladzislau Rezki, Kent Overstreet, syzbot, ak...@linux-foundation.org, jo...@joshtriplett.org, linux-b...@vger.kernel.org, linux-...@vger.kernel.org, linu...@kvack.org, r...@vger.kernel.org, syzkall...@googlegroups.com
No objections from me!

Thanx, Paul

Joel Fernandes

Jun 9, 2025, 10:21:07 AM
to pau...@kernel.org, Uladzislau Rezki, Kent Overstreet, syzbot, ak...@linux-foundation.org, jo...@joshtriplett.org, linux-b...@vger.kernel.org, linux-...@vger.kernel.org, linu...@kvack.org, r...@vger.kernel.org, syzkall...@googlegroups.com
Me neither! And I can push that into an -rc release as well once I have it
(since it is related to a potential bug).

thanks,

- Joel


Vlastimil Babka

Jun 9, 2025, 2:29:00 PM
to Uladzislau Rezki, Kent Overstreet, Paul E. McKenney, syzbot, ak...@linux-foundation.org, jo...@joshtriplett.org, linux-b...@vger.kernel.org, linux-...@vger.kernel.org, linu...@kvack.org, r...@vger.kernel.org, syzkall...@googlegroups.com
On 6/8/25 20:23, Uladzislau Rezki wrote:
> On Sun, Jun 08, 2025 at 11:26:28AM -0400, Kent Overstreet wrote:
>>
>> I don't think it's that - syzbot's .config already has that enabled.
>> KASAN, too.
>>
>> And the only place we do call_rcu() is from rcu_pending.c, where we've
>> got a rearming rcu callback - but we track whether it's outstanding, and
>> we do all relevant operations with a lock held.
>>
>> And we only use rcu_pending.c with SRCU, not regular RCU.
>>
>> We do use kfree_rcu() in a few places (all boring, I expect), but that
>> doesn't (generally?) use the rcu callback list.
>>
> Right, kvfree_rcu() does not intersect with regular callbacks, it has
> its own path.

You mean due to the batching? Maybe the batching should be disabled with
CONFIG_DEBUG_OBJECTS_RCU_HEAD=y if it prevents it from detecting issues?
Otherwise we now have kvfree_rcu_cb(), so the special handling of
kvfree_rcu() is gone in the non-batching case.

> It looks like the problem is here:
>
> <snip>
> f = rhp->func;
> debug_rcu_head_callback(rhp);
> WRITE_ONCE(rhp->func, (rcu_callback_t)0L);
> f(rhp);
> <snip>
>
> We do not check whether the callback, "f", is NULL. If it is, the kernel
> crashes right away. For example:
>
> call_rcu(&rh, NULL);
>
> @Paul, do you think it makes sense to narrow down the callers that
> apparently pass NULL as a callback? To me that seems to be the case in
> this bug, but we do not know the source.
>
> It would at least give a stack trace of the caller that passes NULL.

Right, AFAIU this kind of check is now possible; previously NULL was being
interpreted as a valid __is_kvfree_rcu_offset() (i.e. rcu_head at offset 0).
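
The ambiguity can be sketched in userspace, assuming a scheme in which the callback slot holds either a function address or a small byte offset of the rcu_head inside its enclosing object. All toy_* names and the 4096 cutoff are assumptions made for illustration, not a copy of the kernel's definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cutoff: values below it are treated as offsets, not code. */
#define TOY_OFFSET_LIMIT 4096u

struct toy_rcu_head {
	struct toy_rcu_head *next;
	uintptr_t func;		/* a function address or a small offset */
};

/* rcu_head placed first, so its offset inside the object is 0. */
struct toy_head_first {
	struct toy_rcu_head rh;
	int payload;
};

/* rcu_head placed after another member, so its offset is non-zero. */
struct toy_head_later {
	long payload;
	struct toy_rcu_head rh;
};

enum toy_action { TOY_FREE_OBJECT, TOY_CALL_FUNCTION };

static enum toy_action toy_classify(uintptr_t func)
{
	return func < TOY_OFFSET_LIMIT ? TOY_FREE_OBJECT : TOY_CALL_FUNCTION;
}

/* Returns 1 when a NULL "callback" is indistinguishable from offset 0. */
int toy_null_looks_like_offset(void)
{
	uintptr_t first = offsetof(struct toy_head_first, rh);
	uintptr_t later = offsetof(struct toy_head_later, rh);
	uintptr_t null_func = 0;

	return first == null_func &&
	       later != null_func &&
	       toy_classify(first) == TOY_FREE_OBJECT &&
	       toy_classify(later) == TOY_FREE_OBJECT &&
	       toy_classify(null_func) == TOY_FREE_OBJECT;
}
```

Under such an encoding a NULL check at queueing time would reject legitimate objects whose rcu_head sits at offset 0; once offsets are no longer passed through the callback slot, NULL can only mean a missing callback.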

> --
> Uladzislau Rezki
>

Uladzislau Rezki

Jun 10, 2025, 8:19:53 AM
to Joel Fernandes, pau...@kernel.org, Uladzislau Rezki, Kent Overstreet, syzbot, ak...@linux-foundation.org, jo...@joshtriplett.org, linux-b...@vger.kernel.org, linux-...@vger.kernel.org, linu...@kvack.org, r...@vger.kernel.org, syzkall...@googlegroups.com
I will prepare it and send it out today.

--
Uladzislau Rezki

Uladzislau Rezki

Jun 10, 2025, 8:33:26 AM
to Vlastimil Babka, Uladzislau Rezki, Kent Overstreet, Paul E. McKenney, syzbot, ak...@linux-foundation.org, jo...@joshtriplett.org, linux-b...@vger.kernel.org, linux-...@vger.kernel.org, linu...@kvack.org, r...@vger.kernel.org, syzkall...@googlegroups.com
On Mon, Jun 09, 2025 at 08:28:56PM +0200, Vlastimil Babka wrote:
> On 6/8/25 20:23, Uladzislau Rezki wrote:
> > On Sun, Jun 08, 2025 at 11:26:28AM -0400, Kent Overstreet wrote:
> >>
> >> I don't think it's that - syzbot's .config already has that enabled.
> >> KASAN, too.
> >>
> >> And the only place we do call_rcu() is from rcu_pending.c, where we've
> >> got a rearming rcu callback - but we track whether it's outstanding, and
> >> we do all relevant operations with a lock held.
> >>
> >> And we only use rcu_pending.c with SRCU, not regular RCU.
> >>
> >> We do use kfree_rcu() in a few places (all boring, I expect), but that
> >> doesn't (generally?) use the rcu callback list.
> >>
> > Right, kvfree_rcu() does not intersect with regular callbacks, it has
> > its own path.
>
> You mean due to the batching? Maybe the batching should be disabled with
> CONFIG_DEBUG_OBJECTS_RCU_HEAD=y if it prevents it from detecting issues?
> Otherwise we now have kvfree_rcu_cb(), so the special handling of
> kvfree_rcu() is gone in the non-batching case.
>
Not really. I meant that the call_rcu() API does not check whether the
passed callback, which is executed after a grace period (GP), is NULL.
If it is, we get the NULL pointer dereference bug.

Since the callback is invoked from the rcu_core() context, we cannot
identify the caller in order to blame someone :)

As for batching, we do support CONFIG_DEBUG_OBJECTS_RCU_HEAD there. It
helps to identify double frees and probably leaks.

Uladzislau Rezki

Jun 11, 2025, 11:59:02 AM
to syzbot, ak...@linux-foundation.org, jo...@joshtriplett.org, kent.ov...@linux.dev, linux-b...@vger.kernel.org, linux-...@vger.kernel.org, linu...@kvack.org, pau...@kernel.org, r...@vger.kernel.org, syzkall...@googlegroups.com
On Tue, Feb 04, 2025 at 04:34:18PM -0800, syzbot wrote:
#syz test

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 475f31deed14..b297a32c6779 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -3047,6 +3047,10 @@ __call_rcu_common(struct rcu_head *head, rcu_callback_t func, bool lazy_in)
/* Misaligned rcu_head! */
WARN_ON_ONCE((unsigned long)head & (sizeof(void *) - 1));

+ /* Avoid NULL dereference if callback is NULL. */
+ if (WARN_ON_ONCE(!func))
+ return;
+
if (debug_rcu_head_queue(head)) {
/*
* Probable double call_rcu(), so leak the callback.
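
The `if (WARN_ON_ONCE(!func)) return;` pattern in the patch above can be modelled in userspace as follows. This is a sketch under assumptions: the sketch_* names are hypothetical, and a helper function with a caller-supplied flag stands in for the kernel macro. The point it illustrates is that the warning is reported only the first time, while the condition is returned on every call, so every NULL callback is dropped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

typedef void (*sketch_cb_t)(void);

static int sketch_warnings;
static int sketch_queued;

/* Report only the first time the condition holds; always return it. */
static bool sketch_warn_on_once(bool cond, bool *already, const char *what)
{
	if (cond && !*already) {
		*already = true;
		sketch_warnings++;
		fprintf(stderr, "WARNING: %s\n", what);
	}
	return cond;
}

/* Stand-in for the queueing function: bail out early on a NULL callback. */
static void sketch_call(sketch_cb_t func)
{
	static bool warned;

	if (sketch_warn_on_once(!func, &warned, "NULL callback"))
		return;
	sketch_queued++;
}

static void sketch_noop(void)
{
}

/* Encodes the result as warnings * 10 + queued callbacks. */
int sketch_demo(void)
{
	sketch_call(NULL);		/* warns and is dropped */
	sketch_call(sketch_noop);	/* queued */
	sketch_call(NULL);		/* dropped silently */
	return sketch_warnings * 10 + sketch_queued;
}
```

Both NULL calls are rejected, but only the first produces a warning, which is why a single report in the console log would be enough to identify one offending caller.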

syzbot

Jun 11, 2025, 2:02:05 PM
to ak...@linux-foundation.org, jo...@joshtriplett.org, kent.ov...@linux.dev, linux-b...@vger.kernel.org, linux-...@vger.kernel.org, linu...@kvack.org, pau...@kernel.org, r...@vger.kernel.org, syzkall...@googlegroups.com, ure...@gmail.com
Hello,

syzbot tried to test the proposed patch but the build/boot failed:

failed to apply patch:
checking file kernel/rcu/tree.c
patch: **** unexpected end of file in patch



Tested on:

commit: aef17cb3 Revert "mm/damon/Kconfig: enable CONFIG_DAMON..
git tree: upstream
kernel config: https://syzkaller.appspot.com/x/.config?x=523d3ff8e053340a
patch: https://syzkaller.appspot.com/x/patch.diff?x=17de99d4580000

Uladzislau Rezki

Jun 11, 2025, 3:15:13 PM
to syzbot, ak...@linux-foundation.org, jo...@joshtriplett.org, kent.ov...@linux.dev, linux-b...@vger.kernel.org, linux-...@vger.kernel.org, linu...@kvack.org, pau...@kernel.org, r...@vger.kernel.org, syzkall...@googlegroups.com, ure...@gmail.com
#syz test

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index e8a4b720d7d2..14d4499c6fc3 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -3072,6 +3072,10 @@ __call_rcu_common(struct rcu_head *head, rcu_callback_t func, bool lazy_in)

syzbot

Jun 11, 2025, 3:57:08 PM
to ak...@linux-foundation.org, jo...@joshtriplett.org, kent.ov...@linux.dev, linux-b...@vger.kernel.org, linux-...@vger.kernel.org, linu...@kvack.org, pau...@kernel.org, r...@vger.kernel.org, syzkall...@googlegroups.com, ure...@gmail.com
Hello,

syzbot has tested the proposed patch and the reproducer did not trigger any issue:

Reported-by: syzbot+80e5d6...@syzkaller.appspotmail.com
Tested-by: syzbot+80e5d6...@syzkaller.appspotmail.com

Tested on:

commit: 488ef356 KEYS: Invert FINAL_PUT bit
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=129a660c580000
kernel config: https://syzkaller.appspot.com/x/.config?x=713d218acd33d94
dashboard link: https://syzkaller.appspot.com/bug?extid=80e5d6f453f14a53383a
compiler: Debian clang version 20.1.6 (++20250514063057+1e4d39e07757-1~exp1~20250514183223.118), Debian LLD 20.1.6
patch: https://syzkaller.appspot.com/x/patch.diff?x=170e460c580000

Boqun Feng

Jun 11, 2025, 4:58:46 PM
to syzbot, ak...@linux-foundation.org, jo...@joshtriplett.org, kent.ov...@linux.dev, linux-b...@vger.kernel.org, linux-...@vger.kernel.org, linu...@kvack.org, pau...@kernel.org, r...@vger.kernel.org, syzkall...@googlegroups.com, ure...@gmail.com
On Wed, Jun 11, 2025 at 12:57:04PM -0700, syzbot wrote:
> Hello,
>
> syzbot has tested the proposed patch and the reproducer did not trigger any issue:
>
> Reported-by: syzbot+80e5d6...@syzkaller.appspotmail.com
> Tested-by: syzbot+80e5d6...@syzkaller.appspotmail.com
>
> Tested on:
>
> commit: 488ef356 KEYS: Invert FINAL_PUT bit
> git tree: upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=129a660c580000

Is there a way to see the whole console output? If Ulad's patch fixes
the exact issue, we should be able to see a WARN_ON_ONCE() triggered.

Regards,
Boqun

Aleksandr Nogikh

Jun 12, 2025, 3:42:46 AM
to Boqun Feng, syzbot, ak...@linux-foundation.org, jo...@joshtriplett.org, kent.ov...@linux.dev, linux-b...@vger.kernel.org, linux-...@vger.kernel.org, linu...@kvack.org, pau...@kernel.org, r...@vger.kernel.org, syzkall...@googlegroups.com, ure...@gmail.com
On Wed, Jun 11, 2025 at 10:58 PM Boqun Feng <boqun...@gmail.com> wrote:
>
> On Wed, Jun 11, 2025 at 12:57:04PM -0700, syzbot wrote:
> > Hello,
> >
> > syzbot has tested the proposed patch and the reproducer did not trigger any issue:
> >
> > Reported-by: syzbot+80e5d6...@syzkaller.appspotmail.com
> > Tested-by: syzbot+80e5d6...@syzkaller.appspotmail.com
> >
> > Tested on:
> >
> > commit: 488ef356 KEYS: Invert FINAL_PUT bit
> > git tree: upstream
> > console output: https://syzkaller.appspot.com/x/log.txt?x=129a660c580000
>
> Is there a way to see the whole console output? If Ulad's patch fixes
> the exact issue, we should be able to see a WARN_ON_ONCE() triggered.

If WARN_ON_ONCE() were triggered, the associated kernel panic output
would have been at the end of this log.
FWIW the last time the bug was observed on syzbot was 100 days ago, so
it has likely been fixed since then or has become much harder to
reproduce.

> > compiler: Debian clang version 20.1.6 (++20250514063057+1e4d39e07757-1~exp1~20250514183223.118), Debian LLD 20.1.6
> > patch: https://syzkaller.appspot.com/x/patch.diff?x=170e460c580000
> >
> > Note: testing is done by a robot and is best-effort only.
> >
>

--
Aleksandr

Uladzislau Rezki

Jun 12, 2025, 5:37:20 AM
to Aleksandr Nogikh, Boqun Feng, syzbot, ak...@linux-foundation.org, jo...@joshtriplett.org, kent.ov...@linux.dev, linux-b...@vger.kernel.org, linux-...@vger.kernel.org, linu...@kvack.org, pau...@kernel.org, r...@vger.kernel.org, syzkall...@googlegroups.com, ure...@gmail.com
On Thu, Jun 12, 2025 at 09:42:32AM +0200, Aleksandr Nogikh wrote:
> On Wed, Jun 11, 2025 at 10:58 PM Boqun Feng <boqun...@gmail.com> wrote:
> >
> > On Wed, Jun 11, 2025 at 12:57:04PM -0700, syzbot wrote:
> > > Hello,
> > >
> > > syzbot has tested the proposed patch and the reproducer did not trigger any issue:
> > >
> > > Reported-by: syzbot+80e5d6...@syzkaller.appspotmail.com
> > > Tested-by: syzbot+80e5d6...@syzkaller.appspotmail.com
> > >
> > > Tested on:
> > >
> > > commit: 488ef356 KEYS: Invert FINAL_PUT bit
> > > git tree: upstream
> > > console output: https://syzkaller.appspot.com/x/log.txt?x=129a660c580000
> >
> > Is there a way to see the whole console output? If Ulad's patch fixes
> > the exact issue, we should be able to see a WARN_ON_ONCE() triggered.
>
> If WARN_ON_ONCE() were triggered, the associated kernel panic output
> would have been at the end of this log.
>
> >
> > Regards,
> > Boqun
> >
> > > kernel config: https://syzkaller.appspot.com/x/.config?x=713d218acd33d94
> > > dashboard link: https://syzkaller.appspot.com/bug?extid=80e5d6f453f14a53383a
>
> FWIW the last time the bug was observed on syzbot was 100 days ago, so
> it has likely been fixed since then or has become much harder to
> reproduce.
>
That is even worse, if it has lasted for 100 days already.

--
Uladzislau Rezki

Boqun Feng

Jun 12, 2025, 1:20:38 PM
to Uladzislau Rezki (Sony), Aleksandr Nogikh, syzbot, Andrew Morton, Josh Triplett, kent.ov...@linux.dev, linux-b...@vger.kernel.org, linux-...@vger.kernel.org, linu...@kvack.org, Paul E. McKenney, r...@vger.kernel.org, syzkall...@googlegroups.com
My understanding is that the evidence shows that the
issue that directly caused the null-ptr-deref was
fixed 100 days ago.

Regards,
Boqun

> --
> Uladzislau Rezki

syzbot

Jun 26, 2025, 1:21:16 PM
to syzkall...@googlegroups.com
Auto-closing this bug as obsolete.
No recent activity, existing reproducers are no longer triggering the issue.