[syzbot] [io-uring?] BUG: unable to handle kernel NULL pointer dereference in percpu_ref_put_many

syzbot

Dec 23, 2024, 2:52:29 PM

to asml.s...@gmail.com, ax...@kernel.dk, io-u...@vger.kernel.org, linux-...@vger.kernel.org, syzkall...@googlegroups.com

Hello,

syzbot found the following issue on:

HEAD commit: eabcdba3ad40 Merge tag 'for-6.13-rc3-tag' of git://git.ker..
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=10871f44580000
kernel config: https://syzkaller.appspot.com/x/.config?x=c22efbd20f8da769
dashboard link: https://syzkaller.appspot.com/bug?extid=3dcac84cc1d50f43ed31
compiler: gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=141bccf8580000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=135f7730580000

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/a9904ed2be77/disk-eabcdba3.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/fb8d571e1cb3/vmlinux-eabcdba3.xz
kernel image: https://storage.googleapis.com/syzbot-assets/76349070db25/bzImage-eabcdba3.xz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+3dcac8...@syzkaller.appspotmail.com

BUG: kernel NULL pointer dereference, address: 0000000000000000
#PF: supervisor instruction fetch in kernel mode
#PF: error_code(0x0010) - not-present page
PGD 0 P4D 0
Oops: Oops: 0010 [#1] PREEMPT SMP KASAN PTI
CPU: 0 UID: 0 PID: 11082 Comm: syz-executor246 Not tainted 6.13.0-rc3-syzkaller-00073-geabcdba3ad40 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 11/25/2024
RIP: 0010:0x0
Code: Unable to access opcode bytes at 0xffffffffffffffd6.
RSP: 0018:ffffc9000413f9e0 EFLAGS: 00010046
RAX: 0000000000000000 RBX: ffff88807c722018 RCX: ffffffff8497d56c
RDX: 1ffff110287e09e1 RSI: ffffffff8497d57a RDI: ffff88807c722018
RBP: ffff888143f04f00 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000001 R11: 0000000000000002 R12: ffff88807c722020
R13: 0000000000000000 R14: 0000000000000000 R15: ffff8880745b4a10
FS: 0000000000000000(0000) GS:ffff8880b8600000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffffffffffffd6 CR3: 000000000db7e000 CR4: 00000000003526f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
percpu_ref_put_many.constprop.0+0x269/0x2a0 include/linux/percpu-refcount.h:335
percpu_ref_put include/linux/percpu-refcount.h:351 [inline]
percpu_ref_kill_and_confirm+0x94/0x180 lib/percpu-refcount.c:396
percpu_ref_kill include/linux/percpu-refcount.h:149 [inline]
io_ring_ctx_wait_and_kill+0x86/0x250 io_uring/io_uring.c:2973
io_uring_release+0x39/0x50 io_uring/io_uring.c:2995
__fput+0x3f8/0xb60 fs/file_table.c:450
task_work_run+0x14e/0x250 kernel/task_work.c:239
exit_task_work include/linux/task_work.h:43 [inline]
do_exit+0xadd/0x2d70 kernel/exit.c:938
do_group_exit+0xd3/0x2a0 kernel/exit.c:1087
get_signal+0x2576/0x2610 kernel/signal.c:3017
arch_do_signal_or_restart+0x90/0x7e0 arch/x86/kernel/signal.c:337
exit_to_user_mode_loop kernel/entry/common.c:111 [inline]
exit_to_user_mode_prepare include/linux/entry-common.h:329 [inline]
__syscall_exit_to_user_mode_work kernel/entry/common.c:207 [inline]
syscall_exit_to_user_mode+0x150/0x2a0 kernel/entry/common.c:218
do_syscall_64+0xda/0x250 arch/x86/entry/common.c:89
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f1575ca04e9
Code: Unable to access opcode bytes at 0x7f1575ca04bf.
RSP: 002b:00007f1575c5b218 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca
RAX: fffffffffffffe00 RBX: 00007f1575d2a308 RCX: 00007f1575ca04e9
RDX: 0000000000000000 RSI: 0000000000000080 RDI: 00007f1575d2a308
RBP: 00007f1575d2a300 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00007f1575d2a30c
R13: 00007f1575cf7074 R14: 006e716e5f797265 R15: 0030656c69662f2e
</TASK>
Modules linked in:
CR2: 0000000000000000
---[ end trace 0000000000000000 ]---
RIP: 0010:0x0
Code: Unable to access opcode bytes at 0xffffffffffffffd6.
RSP: 0018:ffffc9000413f9e0 EFLAGS: 00010046
RAX: 0000000000000000 RBX: ffff88807c722018 RCX: ffffffff8497d56c
RDX: 1ffff110287e09e1 RSI: ffffffff8497d57a RDI: ffff88807c722018
RBP: ffff888143f04f00 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000001 R11: 0000000000000002 R12: ffff88807c722020
R13: 0000000000000000 R14: 0000000000000000 R15: ffff8880745b4a10
FS: 0000000000000000(0000) GS:ffff8880b8600000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffffffffffffd6 CR3: 000000000db7e000 CR4: 00000000003526f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400

---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzk...@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

If the report is already addressed, let syzbot know by replying with:
#syz fix: exact-commit-title

If you want syzbot to run the reproducer, reply with:
#syz test: git://repo/address.git branch-or-commit-hash
If you attach or paste a git patch, syzbot will apply it before testing.

If you want to overwrite report's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)

If the report is a duplicate of another one, reply with:
#syz dup: exact-subject-of-another-report

If you want to undo deduplication, reply with:
#syz undup

Jens Axboe

Dec 23, 2024, 3:33:39 PM

to syzbot, asml.s...@gmail.com, io-u...@vger.kernel.org, linux-...@vger.kernel.org, syzkall...@googlegroups.com, linux...@lists.infradead.org, Hannes Reinecke, Sagi Grimberg

On 12/23/24 12:52 PM, syzbot wrote:
> Hello,
>
> syzbot found the following issue on:
>
> HEAD commit: eabcdba3ad40 Merge tag 'for-6.13-rc3-tag' of git://git.ker..
> git tree: upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=10871f44580000
> kernel config: https://syzkaller.appspot.com/x/.config?x=c22efbd20f8da769
> dashboard link: https://syzkaller.appspot.com/bug?extid=3dcac84cc1d50f43ed31
> compiler: gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40
> syz repro: https://syzkaller.appspot.com/x/repro.syz?x=141bccf8580000
> C reproducer: https://syzkaller.appspot.com/x/repro.c?x=135f7730580000

I ran this one but hit this instead:

==================================================================
BUG: KASAN: slab-out-of-bounds in nvmet_root_discovery_nqn_store+0x110/0x180
Write of size 256 at addr ffff000009e71180 by task refcrash/775

CPU: 0 UID: 0 PID: 775 Comm: refcrash Not tainted 6.13.0-rc4 #2
Hardware name: linux,dummy-virt (DT)
Call trace:
show_stack+0x1c/0x30 (C)
__dump_stack+0x24/0x30
dump_stack_lvl+0x60/0x80
print_address_description+0x88/0x220
print_report+0x4c/0x60
kasan_report+0x94/0xf0
kasan_check_range+0x248/0x288
__asan_memset+0x30/0x60
nvmet_root_discovery_nqn_store+0x110/0x180
configfs_write_iter+0x220/0x2e8
do_iter_readv_writev+0x2e0/0x458
vfs_writev+0x220/0x728
do_writev+0xf8/0x1a8
__arm64_sys_writev+0x80/0x98
invoke_syscall+0x7c/0x258
el0_svc_common+0x108/0x1d0
do_el0_svc+0x4c/0x60
el0_svc+0x4c/0xa0
el0t_64_sync_handler+0x70/0x100
el0t_64_sync+0x170/0x178

Allocated by task 1:
kasan_save_track+0x2c/0x60
kasan_save_alloc_info+0x3c/0x48
__kasan_kmalloc+0x80/0x98
__kmalloc_node_track_caller_noprof+0x2f0/0x590
kstrndup+0x4c/0xb8
nvmet_subsys_alloc+0x1c4/0x498
nvmet_init_discovery+0x20/0x48
nvmet_init+0x18c/0x1c0
do_one_initcall+0x1a4/0x718
do_initcall_level+0x178/0x348
do_initcalls+0x58/0xa0
do_basic_setup+0x7c/0x98
kernel_init_freeable+0x268/0x380
kernel_init+0x24/0x148
ret_from_fork+0x10/0x20

The buggy address belongs to the object at ffff000009e71180
which belongs to the cache kmalloc-64 of size 64
The buggy address is located 0 bytes inside of
allocated 37-byte region [ffff000009e71180, ffff000009e711a5)

The buggy address belongs to the physical page:
page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x49e71
anon flags: 0x3ffe00000000000(node=0|zone=0|lastcpupid=0x1fff)
page_type: f5(slab)
raw: 03ffe00000000000 ffff0000070028c0 fffffdffc0523d80 dead000000000005
raw: 0000000000000000 0000000000200020 00000001f5000000 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
ffff000009e71080: 00 00 00 00 00 00 00 00 fc fc fc fc fc fc fc fc
ffff000009e71100: 00 00 00 00 00 00 00 fc fc fc fc fc fc fc fc fc
>ffff000009e71180: 00 00 00 00 05 fc fc fc fc fc fc fc fc fc fc fc
^
ffff000009e71200: 00 00 00 00 00 00 00 00 fc fc fc fc fc fc fc fc
ffff000009e71280: 00 00 00 00 00 00 00 00 fc fc fc fc fc fc fc fc
==================================================================
Disabling lock debugging due to kernel taint

which makes me think something else is the culprit here. The test case
doesn't do much outside of creating two rings, it doesn't actually use
them.

CC'ing likely suspects on the nvme front. This is on 6.13-rc4 fwiw.

--
Jens Axboe
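
For readers following the KASAN report above: it shows a 256-byte memset landing on a 37-byte buffer that kstrndup() had sized to fit the string. A minimal userspace sketch of that size mismatch, and of one way to avoid it, might look like the following. NQN_FIELD_LEN, struct subsys, dup_n(), full_field_clear_overruns() and store_name() are hypothetical names chosen for illustration. This is not the kernel's code and not necessarily the approach taken by the patch linked later in the thread.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical field width, matching the 256-byte write in the report. */
#define NQN_FIELD_LEN 256

/* Hypothetical stand-in for an object that owns an exact-fit name buffer. */
struct subsys {
    char *name;       /* heap buffer sized to the string, as kstrndup() would do */
    size_t name_size; /* bytes actually allocated, including the trailing NUL */
};

/* Duplicate at most len bytes of src into a fresh NUL-terminated buffer. */
static char *dup_n(const char *src, size_t len, size_t *size_out)
{
    const char *end = memchr(src, '\0', len);
    size_t n = end ? (size_t)(end - src) : len;
    char *buf = malloc(n + 1);

    if (!buf)
        return NULL;
    memcpy(buf, src, n);
    buf[n] = '\0';
    *size_out = n + 1;
    return buf;
}

/* Would clearing a full NQN_FIELD_LEN-byte field overrun this buffer? */
static int full_field_clear_overruns(const struct subsys *s)
{
    return NQN_FIELD_LEN > s->name_size;
}

/*
 * Store a new name by swapping in a fresh exact-fit buffer instead of
 * clearing and copying into the old one. Returns 0 on success, -1 on error.
 */
static int store_name(struct subsys *s, const char *src, size_t len)
{
    size_t size;
    char *fresh;

    assert(s != NULL && src != NULL);
    if (len >= NQN_FIELD_LEN)
        return -1; /* reject names that cannot fit the field */
    fresh = dup_n(src, len, &size);
    if (!fresh)
        return -1;
    free(s->name); /* free(NULL) is a no-op on first use */
    s->name = fresh;
    s->name_size = size;
    return 0;
}
```

With a 37-byte allocation as in the report, full_field_clear_overruns() returns nonzero, which is the condition KASAN flagged on the memset.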

Jens Axboe

Dec 23, 2024, 3:51:40 PM

to syzbot, asml.s...@gmail.com, io-u...@vger.kernel.org, linux-...@vger.kernel.org, syzkall...@googlegroups.com

#syz set subsystems: nvme

--
Jens Axboe

Jens Axboe

Dec 23, 2024, 3:55:32 PM

to Caleb Sander, syzbot, asml.s...@gmail.com, io-u...@vger.kernel.org, linux-...@vger.kernel.org, syzkall...@googlegroups.com, linux...@lists.infradead.org, Hannes Reinecke, Sagi Grimberg

On 12/23/24 1:52 PM, Caleb Sander wrote:
> This is probably the same bug that is being addressed by
> https://lore.kernel.org/lkml/20241218185000.1...@gmail.com/T/

Yep that looks highly plausible. We should get this queued for 6.13 and
marked for stable, it's missing the cc stable tag.

--
Jens Axboe

Caleb Sander

Dec 24, 2024, 4:44:01 AM

to Jens Axboe, syzbot, asml.s...@gmail.com, io-u...@vger.kernel.org, linux-...@vger.kernel.org, syzkall...@googlegroups.com, linux...@lists.infradead.org, Hannes Reinecke, Sagi Grimberg

This is probably the same bug that is being addressed by
https://lore.kernel.org/lkml/20241218185000.1...@gmail.com/T/

Tetsuo Handa

Jan 4, 2025, 8:55:14 AM

to syzbot, asml.s...@gmail.com, ax...@kernel.dk, io-u...@vger.kernel.org, linux-...@vger.kernel.org, syzkall...@googlegroups.com

#syz dup: general protection fault in account_kernel_stack (3)
