[syzbot] [bpf?] [trace?] WARNING: locking bug in __lock_task_sighand

syzbot

Nov 28, 2024, 8:07:23 AM
to and...@kernel.org, a...@kernel.org, b...@vger.kernel.org, dan...@iogearbox.net, edd...@gmail.com, hao...@google.com, john.fa...@gmail.com, jo...@kernel.org, kps...@kernel.org, linux-...@vger.kernel.org, linux-tra...@vger.kernel.org, marti...@linux.dev, mathieu....@efficios.com, mattbo...@google.com, mhir...@kernel.org, ros...@goodmis.org, s...@fomichev.me, so...@kernel.org, syzkall...@googlegroups.com, yongho...@linux.dev
Hello,

syzbot found the following issue on:

HEAD commit: 2c22dc1ee3a1 Merge tag 'mailbox-v6.13' of git://git.kernel..
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=17f2bee8580000
kernel config: https://syzkaller.appspot.com/x/.config?x=8df9bf3383f5970
dashboard link: https://syzkaller.appspot.com/bug?extid=97da3d7e0112d59971de
compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40

Unfortunately, I don't have any reproducer for this issue yet.

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/9137c3e19e21/disk-2c22dc1e.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/1aad80837d89/vmlinux-2c22dc1e.xz
kernel image: https://storage.googleapis.com/syzbot-assets/d7979d71d6d2/bzImage-2c22dc1e.xz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+97da3d...@syzkaller.appspotmail.com

=============================
[ BUG: Invalid wait context ]
6.12.0-syzkaller-09435-g2c22dc1ee3a1 #0 Not tainted
-----------------------------
iou-wrk-9958/9967 is trying to lock:
ffff88802744ae58 (&sighand->siglock){-.-.}-{3:3}, at: __lock_task_sighand+0x149/0x2d0 kernel/signal.c:1379
other info that might help us debug this:
context-{5:5}
3 locks held by iou-wrk-9958/9967:
#0: ffff88814d2870c0 (&acct->lock){+.+.}-{2:2}, at: io_acct_run_queue io_uring/io-wq.c:260 [inline]
#0: ffff88814d2870c0 (&acct->lock){+.+.}-{2:2}, at: io_wq_worker+0x44b/0xed0 io_uring/io-wq.c:654
#1: ffffffff8e93c520 (rcu_read_lock){....}-{1:3}, at: rcu_lock_acquire include/linux/rcupdate.h:337 [inline]
#1: ffffffff8e93c520 (rcu_read_lock){....}-{1:3}, at: rcu_read_lock include/linux/rcupdate.h:849 [inline]
#1: ffffffff8e93c520 (rcu_read_lock){....}-{1:3}, at: __bpf_trace_run kernel/trace/bpf_trace.c:2350 [inline]
#1: ffffffff8e93c520 (rcu_read_lock){....}-{1:3}, at: bpf_trace_run2+0x1fc/0x540 kernel/trace/bpf_trace.c:2392
#2: ffffffff8e93c520 (rcu_read_lock){....}-{1:3}, at: rcu_lock_acquire include/linux/rcupdate.h:337 [inline]
#2: ffffffff8e93c520 (rcu_read_lock){....}-{1:3}, at: rcu_read_lock include/linux/rcupdate.h:849 [inline]
#2: ffffffff8e93c520 (rcu_read_lock){....}-{1:3}, at: __lock_task_sighand+0x29/0x2d0 kernel/signal.c:1362
stack backtrace:
CPU: 1 UID: 0 PID: 9967 Comm: iou-wrk-9958 Not tainted 6.12.0-syzkaller-09435-g2c22dc1ee3a1 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:94 [inline]
dump_stack_lvl+0x241/0x360 lib/dump_stack.c:120
print_lock_invalid_wait_context kernel/locking/lockdep.c:4826 [inline]
check_wait_context kernel/locking/lockdep.c:4898 [inline]
__lock_acquire+0x15a8/0x2100 kernel/locking/lockdep.c:5176
lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5849
__raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
_raw_spin_lock_irqsave+0xd5/0x120 kernel/locking/spinlock.c:162
__lock_task_sighand+0x149/0x2d0 kernel/signal.c:1379
lock_task_sighand include/linux/sched/signal.h:743 [inline]
do_send_sig_info kernel/signal.c:1267 [inline]
group_send_sig_info+0x274/0x310 kernel/signal.c:1418
bpf_send_signal_common+0x3c4/0x630 kernel/trace/bpf_trace.c:881
____bpf_send_signal kernel/trace/bpf_trace.c:886 [inline]
bpf_send_signal+0x1d/0x30 kernel/trace/bpf_trace.c:884
bpf_prog_631417f49dd64198+0x25/0x48
bpf_dispatcher_nop_func include/linux/bpf.h:1290 [inline]
__bpf_prog_run include/linux/filter.h:701 [inline]
bpf_prog_run include/linux/filter.h:708 [inline]
__bpf_trace_run kernel/trace/bpf_trace.c:2351 [inline]
bpf_trace_run2+0x2ec/0x540 kernel/trace/bpf_trace.c:2392
trace_contention_end+0x114/0x140 include/trace/events/lock.h:122
__pv_queued_spin_lock_slowpath+0xb7e/0xdb0 kernel/locking/qspinlock.c:557
pv_queued_spin_lock_slowpath arch/x86/include/asm/paravirt.h:584 [inline]
queued_spin_lock_slowpath+0x42/0x50 arch/x86/include/asm/qspinlock.h:51
queued_spin_lock include/asm-generic/qspinlock.h:114 [inline]
do_raw_spin_lock+0x272/0x370 kernel/locking/spinlock_debug.c:116
io_acct_run_queue io_uring/io-wq.c:260 [inline]
io_wq_worker+0x44b/0xed0 io_uring/io-wq.c:654
ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
</TASK>


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzk...@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

If the report is already addressed, let syzbot know by replying with:
#syz fix: exact-commit-title

If you want to overwrite report's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)

If the report is a duplicate of another one, reply with:
#syz dup: exact-subject-of-another-report

If you want to undo deduplication, reply with:
#syz undup

Alexei Starovoitov

Nov 29, 2024, 11:47:55 AM
to syzbot, Puranjay Mohan, Andrii Nakryiko, Alexei Starovoitov, bpf, Daniel Borkmann, Eddy Z, Hao Luo, John Fastabend, Jiri Olsa, KP Singh, LKML, linux-trace-kernel, Martin KaFai Lau, Mathieu Desnoyers, Matt Bobrowski, Masami Hiramatsu, Steven Rostedt, Stanislav Fomichev, Song Liu, syzkaller-bugs, Yonghong Song
Puranjay, Andrii and All,

It looks like if (irqs_disabled()) is not enough.
Should we change it to preemptible()?

It will likely make it async all the time,
but in this case it's an ok trade-off?
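
One way to read the first report, sketched as a small userspace model. It loosely follows the kind of check lockdep appears to perform for "Invalid wait context": take the context's wait type, lower it to the smallest inner wait type among the held locks, and reject the new lock if its outer wait type is larger. The struct, the function name, and the assumption that the printed {outer:inner} numbers map to the enum values below are illustrative and are not taken from the kernel source.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Wait types, numbered as in the report's {outer:inner} pairs.
 * Assumption: these correspond to lockdep's wait-type values on this config. */
enum wait_type {
    WAIT_FREE = 1, WAIT_SPIN = 2, WAIT_CONFIG = 3, WAIT_SLEEP = 4, WAIT_MAX = 5
};

/* Hypothetical, simplified view of a lock class. */
struct lock_class_model {
    const char *name;
    int outer;
    int inner;
};

/* Returns true if taking `next` is allowed while `held[0..n_held)` are held.
 * `context_inner` is the starting context (5 in both reports). */
bool wait_context_ok(const struct lock_class_model *held, size_t n_held,
                     int context_inner, const struct lock_class_model *next)
{
    assert(next != NULL);
    int curr_inner = context_inner;
    for (size_t i = 0; i < n_held; i++) {
        /* The most restrictive held lock bounds what may be taken next. */
        if (held[i].inner < curr_inner)
            curr_inner = held[i].inner;
    }
    return next->outer <= curr_inner;
}

/* Locks listed as held in the first report (io_wq_worker path). */
const struct lock_class_model first_report_held[] = {
    { "&acct->lock",   2, 2 },  /* {2:2}; taken via do_raw_spin_lock in the backtrace */
    { "rcu_read_lock", 1, 3 },
    { "rcu_read_lock", 1, 3 },
};

/* The lock being acquired: {3:3}. */
const struct lock_class_model siglock = { "&sighand->siglock", 3, 3 };
```

Calling wait_context_ok(first_report_held, 3, WAIT_MAX, &siglock) returns false in this model because &acct->lock lowers the bound to 2 while siglock needs 3. The backtrace shows &acct->lock being taken through do_raw_spin_lock with no IRQ-disabling variant in the path, so if IRQs were in fact enabled there, a check based only on irqs_disabled() would not divert this case to the async path.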

syzbot

Dec 2, 2024, 5:14:31 AM
to alexei.st...@gmail.com, and...@kernel.org, a...@kernel.org, b...@vger.kernel.org, dan...@iogearbox.net, edd...@gmail.com, hao...@google.com, john.fa...@gmail.com, jo...@kernel.org, kps...@kernel.org, linux-...@vger.kernel.org, linux-tra...@vger.kernel.org, marti...@linux.dev, mathieu....@efficios.com, mattbo...@google.com, mhir...@kernel.org, net...@vger.kernel.org, pura...@kernel.org, ros...@goodmis.org, s...@fomichev.me, so...@kernel.org, syzkall...@googlegroups.com, yongho...@linux.dev
syzbot has found a reproducer for the following issue on:

HEAD commit: 45e04eb4d9d8 bpf: Refactor bpf_tracing_func_proto() and re..
git tree: bpf-next
console+strace: https://syzkaller.appspot.com/x/log.txt?x=167e17c0580000
kernel config: https://syzkaller.appspot.com/x/.config?x=fb680913ee293bcc
dashboard link: https://syzkaller.appspot.com/bug?extid=97da3d7e0112d59971de
compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=114a7d30580000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=1395ff78580000

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/f45e1a59de79/disk-45e04eb4.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/6e2405d2c818/vmlinux-45e04eb4.xz
kernel image: https://storage.googleapis.com/syzbot-assets/2c2415798034/bzImage-45e04eb4.xz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+97da3d...@syzkaller.appspotmail.com

=============================
[ BUG: Invalid wait context ]
6.12.0-syzkaller-g45e04eb4d9d8 #0 Not tainted
-----------------------------
syz-executor227/5855 is trying to lock:
ffff8880262a8018 (&sighand->siglock){-...}-{3:3}, at: __lock_task_sighand+0x149/0x2d0 kernel/signal.c:1379
other info that might help us debug this:
context-{5:5}
8 locks held by syz-executor227/5855:
#0: ffff88802f97ea90 (&vma->vm_lock->lock){++++}-{4:4}, at: vma_start_read include/linux/mm.h:716 [inline]
#0: ffff88802f97ea90 (&vma->vm_lock->lock){++++}-{4:4}, at: lock_vma_under_rcu+0x34b/0x790 mm/memory.c:6278
#1: ffffffff8e93c520 (rcu_read_lock){....}-{1:3}, at: rcu_lock_acquire include/linux/rcupdate.h:337 [inline]
#1: ffffffff8e93c520 (rcu_read_lock){....}-{1:3}, at: rcu_read_lock include/linux/rcupdate.h:849 [inline]
#1: ffffffff8e93c520 (rcu_read_lock){....}-{1:3}, at: do_fault_around mm/memory.c:5279 [inline]
#1: ffffffff8e93c520 (rcu_read_lock){....}-{1:3}, at: do_read_fault mm/memory.c:5313 [inline]
#1: ffffffff8e93c520 (rcu_read_lock){....}-{1:3}, at: do_fault mm/memory.c:5456 [inline]
#1: ffffffff8e93c520 (rcu_read_lock){....}-{1:3}, at: do_pte_missing mm/memory.c:3979 [inline]
#1: ffffffff8e93c520 (rcu_read_lock){....}-{1:3}, at: handle_pte_fault+0x21c3/0x68a0 mm/memory.c:5801
#2: ffffffff8e93c520 (rcu_read_lock){....}-{1:3}, at: rcu_lock_acquire include/linux/rcupdate.h:337 [inline]
#2: ffffffff8e93c520 (rcu_read_lock){....}-{1:3}, at: rcu_read_lock include/linux/rcupdate.h:849 [inline]
#2: ffffffff8e93c520 (rcu_read_lock){....}-{1:3}, at: filemap_map_pages+0x243/0x20d0 mm/filemap.c:3645
#3: ffffffff8e93c520 (rcu_read_lock){....}-{1:3}, at: rcu_lock_acquire include/linux/rcupdate.h:337 [inline]
#3: ffffffff8e93c520 (rcu_read_lock){....}-{1:3}, at: rcu_read_lock include/linux/rcupdate.h:849 [inline]
#3: ffffffff8e93c520 (rcu_read_lock){....}-{1:3}, at: __pte_offset_map+0x82/0x380 mm/pgtable-generic.c:287
#4: ffff8880791b2df8 (ptlock_ptr(ptdesc)#2){+.+.}-{3:3}, at: spin_lock include/linux/spinlock.h:351 [inline]
#4: ffff8880791b2df8 (ptlock_ptr(ptdesc)#2){+.+.}-{3:3}, at: __pte_offset_map_lock+0x1ba/0x300 mm/pgtable-generic.c:402
#5: ffffffff8e93c4a0 (rcu_read_lock_sched){....}-{1:2}, at: rcu_lock_acquire include/linux/rcupdate.h:337 [inline]
#5: ffffffff8e93c4a0 (rcu_read_lock_sched){....}-{1:2}, at: rcu_read_lock_sched include/linux/rcupdate.h:941 [inline]
#5: ffffffff8e93c4a0 (rcu_read_lock_sched){....}-{1:2}, at: pfn_valid+0xf6/0x450 include/linux/mmzone.h:2048
#6: ffffffff8e93c520 (rcu_read_lock){....}-{1:3}, at: trace_call_bpf+0xbc/0x8a0
#7: ffffffff8e93c520 (rcu_read_lock){....}-{1:3}, at: rcu_lock_acquire include/linux/rcupdate.h:337 [inline]
#7: ffffffff8e93c520 (rcu_read_lock){....}-{1:3}, at: rcu_read_lock include/linux/rcupdate.h:849 [inline]
#7: ffffffff8e93c520 (rcu_read_lock){....}-{1:3}, at: __lock_task_sighand+0x29/0x2d0 kernel/signal.c:1362
stack backtrace:
CPU: 0 UID: 0 PID: 5855 Comm: syz-executor227 Not tainted 6.12.0-syzkaller-g45e04eb4d9d8 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:94 [inline]
dump_stack_lvl+0x241/0x360 lib/dump_stack.c:120
print_lock_invalid_wait_context kernel/locking/lockdep.c:4826 [inline]
check_wait_context kernel/locking/lockdep.c:4898 [inline]
__lock_acquire+0x15a8/0x2100 kernel/locking/lockdep.c:5176
lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5849
__raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
_raw_spin_lock_irqsave+0xd5/0x120 kernel/locking/spinlock.c:162
__lock_task_sighand+0x149/0x2d0 kernel/signal.c:1379
lock_task_sighand include/linux/sched/signal.h:743 [inline]
do_send_sig_info kernel/signal.c:1267 [inline]
group_send_sig_info+0x274/0x310 kernel/signal.c:1418
bpf_send_signal_common+0x3c4/0x630 kernel/trace/bpf_trace.c:870
____bpf_send_signal_thread kernel/trace/bpf_trace.c:887 [inline]
bpf_send_signal_thread+0x1a/0x30 kernel/trace/bpf_trace.c:885
bpf_prog_b7be628660dc1b90+0x23/0x29
bpf_dispatcher_nop_func include/linux/bpf.h:1290 [inline]
__bpf_prog_run include/linux/filter.h:701 [inline]
bpf_prog_run include/linux/filter.h:708 [inline]
bpf_prog_run_array include/linux/bpf.h:2177 [inline]
trace_call_bpf+0x369/0x8a0 kernel/trace/bpf_trace.c:146
perf_trace_run_bpf_submit+0x82/0x180 kernel/events/core.c:10473
do_perf_trace_lock include/trace/events/lock.h:50 [inline]
perf_trace_lock+0x388/0x490 include/trace/events/lock.h:50
trace_lock_release include/trace/events/lock.h:69 [inline]
lock_release+0x9cc/0xa30 kernel/locking/lockdep.c:5860
rcu_lock_release include/linux/rcupdate.h:347 [inline]
rcu_read_unlock_sched include/linux/rcupdate.h:962 [inline]
pfn_valid+0x3eb/0x450 include/linux/mmzone.h:2058
page_table_check_set+0x22/0x540 mm/page_table_check.c:110
__page_table_check_ptes_set+0x30f/0x410 mm/page_table_check.c:225
page_table_check_ptes_set include/linux/page_table_check.h:74 [inline]
set_ptes include/linux/pgtable.h:288 [inline]
set_pte_range+0x724/0x750 mm/memory.c:5067
filemap_map_order0_folio mm/filemap.c:3624 [inline]
filemap_map_pages+0x11c6/0x20d0 mm/filemap.c:3678
do_fault_around mm/memory.c:5280 [inline]
do_read_fault mm/memory.c:5313 [inline]
do_fault mm/memory.c:5456 [inline]
do_pte_missing mm/memory.c:3979 [inline]
handle_pte_fault+0x31d6/0x68a0 mm/memory.c:5801
__handle_mm_fault mm/memory.c:5944 [inline]
handle_mm_fault+0x1106/0x1bb0 mm/memory.c:6112
do_user_addr_fault arch/x86/mm/fault.c:1338 [inline]
handle_page_fault arch/x86/mm/fault.c:1481 [inline]
exc_page_fault+0x459/0x8c0 arch/x86/mm/fault.c:1539
asm_exc_page_fault+0x26/0x30 arch/x86/include/asm/idtentry.h:623
RIP: 0033:0x7f2c735865f8
Code: Unable to access opcode bytes at 0x7f2c735865ce.
RSP: 002b:00007ffcc6892fb8 EFLAGS: 00010202
RAX: 00007f2c735b6ad8 RBX: 0000000000000000 RCX: 0000000000000004
RDX: 00007f2c735b7ce0 RSI: 0000000000000000 RDI: 00007f2c735b6ad8
RBP: 00007f2c735b5118 R08: 00007f2c7350e2b0 R09: 00007f2c7350e2b0
R10: 0000000000000000 R11: 0000000000000246 R12: 00007f2c735b7cc8
R13: 0000000000000000 R14: 00007f2c735b7ce0 R15: 00007f2c7350e590
</TASK>


---
If you want syzbot to run the reproducer, reply with:
#syz test: git://repo/address.git branch-or-commit-hash
If you attach or paste a git patch, syzbot will apply it before testing.

Puranjay Mohan

Dec 2, 2024, 7:42:52 AM
to Alexei Starovoitov, syzbot, Andrii Nakryiko, Alexei Starovoitov, bpf, Daniel Borkmann, Eddy Z, Hao Luo, John Fastabend, Jiri Olsa, KP Singh, LKML, linux-trace-kernel, Martin KaFai Lau, Mathieu Desnoyers, Matt Bobrowski, Masami Hiramatsu, Steven Rostedt, Stanislav Fomichev, Song Liu, syzkaller-bugs, Yonghong Song
Alexei Starovoitov <alexei.st...@gmail.com> writes:

> Puranjay, Andrii and All,
>
> It looks like if (irqs_disabled()) is not enough.
> Should we change it to preemptible()?
>
> It will likely make it async all the time,
> but in this case it's an ok trade-off?
>

Yes, as BPF programs can run in all kinds of contexts.

We should replace 'if (irqs_disabled())' with 'if (!preemptible())'

because the definition is:

#define preemptible() (preempt_count() == 0 && !irqs_disabled())

and we need 'if ((preempt_count() != 0) || irqs_disabled())', which is
exactly the negation of preemptible(). In either of these cases we want
to send the signal asynchronously.
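
A minimal userspace sketch of the two checks, to make the difference concrete. The struct and function names are hypothetical stand-ins for preempt_count() and irqs_disabled(); only the preemptible() expression itself is taken from the definition quoted above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the state the kernel macros read. */
struct ctx_model {
    int preempt_count;   /* models preempt_count() */
    bool irqs_disabled;  /* models irqs_disabled() */
};

/* Mirrors: #define preemptible() (preempt_count() == 0 && !irqs_disabled()) */
bool model_preemptible(const struct ctx_model *c)
{
    assert(c != NULL);
    return c->preempt_count == 0 && !c->irqs_disabled;
}

/* Current check: defer the signal only when IRQs are disabled. */
bool defer_old(const struct ctx_model *c)
{
    assert(c != NULL);
    return c->irqs_disabled;
}

/* Proposed check: defer whenever the context is not preemptible,
 * i.e. preempt_count != 0 || irqs_disabled (De Morgan on the macro). */
bool defer_new(const struct ctx_model *c)
{
    return !model_preemptible(c);
}
```

For a context modelled as { .preempt_count = 1, .irqs_disabled = false }, for example a section with preemption disabled but IRQs on, defer_old() returns false while defer_new() returns true. The two checks agree whenever IRQs are disabled or the context is fully preemptible.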

I will try to test the fix as Syzbot has now found a reproducer.

Thanks,
Puranjay

Alexei Starovoitov

Dec 17, 2024, 6:50:09 PM
to Puranjay Mohan, syzbot, Andrii Nakryiko, Alexei Starovoitov, bpf, Daniel Borkmann, Eddy Z, Hao Luo, John Fastabend, Jiri Olsa, KP Singh, LKML, linux-trace-kernel, Martin KaFai Lau, Mathieu Desnoyers, Matt Bobrowski, Masami Hiramatsu, Steven Rostedt, Stanislav Fomichev, Song Liu, syzkaller-bugs, Yonghong Song
Puranjay,

Any progress on a patch?

Alexei Starovoitov

Dec 20, 2024, 12:30:55 PM
to Puranjay Mohan, syzbot, Andrii Nakryiko, Alexei Starovoitov, bpf, Daniel Borkmann, Eddy Z, Hao Luo, John Fastabend, Jiri Olsa, KP Singh, LKML, linux-trace-kernel, Martin KaFai Lau, Mathieu Desnoyers, Matt Bobrowski, Masami Hiramatsu, Steven Rostedt, Stanislav Fomichev, Song Liu, syzkaller-bugs, Yonghong Song
ping.

Puranjay Mohan

Jan 15, 2025, 5:39:14 AM
to Alexei Starovoitov, syzbot, Andrii Nakryiko, Alexei Starovoitov, bpf, Daniel Borkmann, Eddy Z, Hao Luo, John Fastabend, Jiri Olsa, KP Singh, LKML, linux-trace-kernel, Martin KaFai Lau, Mathieu Desnoyers, Matt Bobrowski, Masami Hiramatsu, Steven Rostedt, Stanislav Fomichev, Song Liu, syzkaller-bugs, Yonghong Song
Hi Alexei,
Sorry for being AWOL. I was on a long vacation in India and just got
back.

Here is the patch to fix this: https://lore.kernel.org/all/20250115103647....@kernel.org/

Thanks,
Puranjay

#syz test: https://github.com/puranjaymohan/bpf.git bpf_preemt_fix

syzbot

Jan 15, 2025, 6:47:04 AM
to alexei.st...@gmail.com, and...@kernel.org, a...@kernel.org, b...@vger.kernel.org, dan...@iogearbox.net, edd...@gmail.com, hao...@google.com, john.fa...@gmail.com, jo...@kernel.org, kps...@kernel.org, linux-...@vger.kernel.org, linux-tra...@vger.kernel.org, marti...@linux.dev, mathieu....@efficios.com, mattbo...@google.com, mhir...@kernel.org, pura...@kernel.org, ros...@goodmis.org, s...@fomichev.me, so...@kernel.org, syzkall...@googlegroups.com, yongho...@linux.dev
Hello,

syzbot has tested the proposed patch but the reproducer is still triggering an issue:
lost connection to test machine



Tested on:

commit: c547a7b6 bpf: trace: send signals asynchronously if !p..
git tree: https://github.com/puranjaymohan/bpf.git bpf_preemt_fix
console output: https://syzkaller.appspot.com/x/log.txt?x=178257c4580000
kernel config: https://syzkaller.appspot.com/x/.config?x=aadf89e2f6db86cc
dashboard link: https://syzkaller.appspot.com/bug?extid=97da3d7e0112d59971de
compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40

Note: no patches were applied.