[v6.1] possible deadlock in rcu_report_exp_cpu_mult

syzbot
Mar 16, 2024, 6:50:20 PM
to syzkaller...@googlegroups.com

Hello,

syzbot found the following issue on:

HEAD commit: d7543167affd Linux 6.1.82
git tree: linux-6.1.y
console output: https://syzkaller.appspot.com/x/log.txt?x=133f31be180000
kernel config: https://syzkaller.appspot.com/x/.config?x=59059e181681c079
dashboard link: https://syzkaller.appspot.com/bug?extid=3b001e9ea0e979613227
compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=13fd8c81180000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=1395afc1180000

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/a2421980b49a/disk-d7543167.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/52a6bb44161f/vmlinux-d7543167.xz
kernel image: https://storage.googleapis.com/syzbot-assets/9b3723bf43a9/bzImage-d7543167.xz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+3b001e...@syzkaller.appspotmail.com

=====================================================
WARNING: HARDIRQ-safe -> HARDIRQ-unsafe lock order detected
6.1.82-syzkaller #0 Not tainted
-----------------------------------------------------
syz-executor277/3542 [HC0[0]:SC0[2]:HE0:SE0] is trying to acquire:
ffff8880771eb820 (&htab->buckets[i].lock){+...}-{2:2}, at: sock_hash_delete_elem+0xac/0x2f0 net/core/sock_map.c:932

and this task is already holding:
ffffffff8d12f7d8 (rcu_node_0){-.-.}-{2:2}, at: rcu_note_context_switch+0x2a5/0xf10 kernel/rcu/tree_plugin.h:326
which would create a new lock dependency:
(rcu_node_0){-.-.}-{2:2} -> (&htab->buckets[i].lock){+...}-{2:2}

but this new dependency connects a HARDIRQ-irq-safe lock:
(rcu_node_0){-.-.}-{2:2}

... which became HARDIRQ-irq-safe at:
lock_acquire+0x1f8/0x5a0 kernel/locking/lockdep.c:5662
__raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
_raw_spin_lock_irqsave+0xd1/0x120 kernel/locking/spinlock.c:162
rcu_report_exp_cpu_mult+0x27/0x2e0 kernel/rcu/tree_exp.h:238
__flush_smp_call_function_queue+0x60c/0xd00 kernel/smp.c:676
__sysvec_call_function_single+0xbb/0x360 arch/x86/kernel/smp.c:267
sysvec_call_function_single+0x89/0xb0 arch/x86/kernel/smp.c:262
asm_sysvec_call_function_single+0x16/0x20 arch/x86/include/asm/idtentry.h:661
memset_erms+0xb/0x10 arch/x86/lib/memset_64.S:64
kasan_poison mm/kasan/shadow.c:98 [inline]
kasan_unpoison+0x5d/0x80 mm/kasan/shadow.c:138
register_global mm/kasan/generic.c:219 [inline]
__asan_register_globals+0x38/0x70 mm/kasan/generic.c:231
asan.module_ctor+0x11/0x20
do_ctors init/main.c:1156 [inline]
do_basic_setup+0x58/0x81 init/main.c:1403
kernel_init_freeable+0x45c/0x60f init/main.c:1624
kernel_init+0x19/0x290 init/main.c:1512
ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:307

to a HARDIRQ-irq-unsafe lock:
(&htab->buckets[i].lock){+...}-{2:2}

... which became HARDIRQ-irq-unsafe at:
...
lock_acquire+0x1f8/0x5a0 kernel/locking/lockdep.c:5662
__raw_spin_lock_bh include/linux/spinlock_api_smp.h:126 [inline]
_raw_spin_lock_bh+0x31/0x40 kernel/locking/spinlock.c:178
sock_hash_delete_elem+0xac/0x2f0 net/core/sock_map.c:932
bpf_prog_43221478a22f23b5+0x3a/0x3e
bpf_dispatcher_nop_func include/linux/bpf.h:989 [inline]
__bpf_prog_run include/linux/filter.h:600 [inline]
bpf_prog_run include/linux/filter.h:607 [inline]
__bpf_trace_run kernel/trace/bpf_trace.c:2273 [inline]
bpf_trace_run2+0x1fd/0x410 kernel/trace/bpf_trace.c:2312
trace_contention_end+0x12f/0x170 include/trace/events/lock.h:122
__mutex_lock_common kernel/locking/mutex.c:612 [inline]
__mutex_lock+0x2ed/0xd80 kernel/locking/mutex.c:747
ep_send_events fs/eventpoll.c:1657 [inline]
ep_poll fs/eventpoll.c:1827 [inline]
do_epoll_wait+0x814/0x1e60 fs/eventpoll.c:2262
__do_sys_epoll_wait fs/eventpoll.c:2274 [inline]
__se_sys_epoll_wait fs/eventpoll.c:2269 [inline]
__x64_sys_epoll_wait+0x253/0x2a0 fs/eventpoll.c:2269
do_syscall_x64 arch/x86/entry/common.c:51 [inline]
do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:81
entry_SYSCALL_64_after_hwframe+0x63/0xcd

other info that might help us debug this:

Possible interrupt unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&htab->buckets[i].lock);
                               local_irq_disable();
                               lock(rcu_node_0);
                               lock(&htab->buckets[i].lock);
  <Interrupt>
    lock(rcu_node_0);

*** DEADLOCK ***
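
What the traces below boil down to: a BPF program (bpf_prog_43221478a22f23b5) attached to the lock:contention_end tracepoint calls sock_hash_delete_elem(), which takes the sockhash bucket lock with spin_lock_bh(), i.e. with hardirqs still enabled. Here the tracepoint fired while rcu_note_context_switch() was already holding rcu_node_0, a lock that is also taken from hardirq context (rcu_report_exp_cpu_mult()), so lockdep records the HARDIRQ-safe -> HARDIRQ-unsafe ordering shown above. A minimal BPF program of that shape would look roughly like the following sketch (map layout, key type, and names are illustrative, not taken from the reproducer):

#include "vmlinux.h"
#include <bpf/bpf_helpers.h>

struct {
	__uint(type, BPF_MAP_TYPE_SOCKHASH);
	__uint(max_entries, 64);
	__type(key, __u32);
	__type(value, __u64);
} sock_hash SEC(".maps");

SEC("tracepoint/lock/contention_end")
int on_contention_end(void *ctx)
{
	__u32 key = 0;

	/* In the kernel this ends up in sock_hash_delete_elem(), which
	 * takes &htab->buckets[i].lock via spin_lock_bh() (hardirq-unsafe)
	 * under whatever spinlocks the contended-lock site already holds. */
	bpf_map_delete_elem(&sock_hash, &key);
	return 0;
}

char _license[] SEC("license") = "GPL";

Because lock:contention_end can fire from almost any spinlock slowpath, such a program effectively injects the bucket lock underneath arbitrary kernel locks, including hardirq-safe ones.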

4 locks held by syz-executor277/3542:
#0: ffff888028ab5b58 (&mm->mmap_lock){++++}-{3:3}, at: mmap_read_trylock include/linux/mmap_lock.h:136 [inline]
#0: ffff888028ab5b58 (&mm->mmap_lock){++++}-{3:3}, at: get_mmap_lock_carefully mm/memory.c:5304 [inline]
#0: ffff888028ab5b58 (&mm->mmap_lock){++++}-{3:3}, at: lock_mm_and_find_vma+0x2e/0x2e0 mm/memory.c:5366
#1: ffffffff8d12a940 (rcu_read_lock){....}-{1:2}, at: rcu_lock_acquire include/linux/rcupdate.h:319 [inline]
#1: ffffffff8d12a940 (rcu_read_lock){....}-{1:2}, at: rcu_read_lock include/linux/rcupdate.h:760 [inline]
#1: ffffffff8d12a940 (rcu_read_lock){....}-{1:2}, at: filemap_map_pages+0x277/0x12c0 mm/filemap.c:3415
#2: ffffffff8d12f7d8 (rcu_node_0){-.-.}-{2:2}, at: rcu_note_context_switch+0x2a5/0xf10 kernel/rcu/tree_plugin.h:326
#3: ffffffff8d12a940 (rcu_read_lock){....}-{1:2}, at: rcu_lock_acquire include/linux/rcupdate.h:319 [inline]
#3: ffffffff8d12a940 (rcu_read_lock){....}-{1:2}, at: rcu_read_lock include/linux/rcupdate.h:760 [inline]
#3: ffffffff8d12a940 (rcu_read_lock){....}-{1:2}, at: __bpf_trace_run kernel/trace/bpf_trace.c:2272 [inline]
#3: ffffffff8d12a940 (rcu_read_lock){....}-{1:2}, at: bpf_trace_run2+0x110/0x410 kernel/trace/bpf_trace.c:2312

the dependencies between HARDIRQ-irq-safe lock and the holding lock:
-> (rcu_node_0){-.-.}-{2:2} {
IN-HARDIRQ-W at:
lock_acquire+0x1f8/0x5a0 kernel/locking/lockdep.c:5662
__raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
_raw_spin_lock_irqsave+0xd1/0x120 kernel/locking/spinlock.c:162
rcu_report_exp_cpu_mult+0x27/0x2e0 kernel/rcu/tree_exp.h:238
__flush_smp_call_function_queue+0x60c/0xd00 kernel/smp.c:676
__sysvec_call_function_single+0xbb/0x360 arch/x86/kernel/smp.c:267
sysvec_call_function_single+0x89/0xb0 arch/x86/kernel/smp.c:262
asm_sysvec_call_function_single+0x16/0x20 arch/x86/include/asm/idtentry.h:661
memset_erms+0xb/0x10 arch/x86/lib/memset_64.S:64
kasan_poison mm/kasan/shadow.c:98 [inline]
kasan_unpoison+0x5d/0x80 mm/kasan/shadow.c:138
register_global mm/kasan/generic.c:219 [inline]
__asan_register_globals+0x38/0x70 mm/kasan/generic.c:231
asan.module_ctor+0x11/0x20
do_ctors init/main.c:1156 [inline]
do_basic_setup+0x58/0x81 init/main.c:1403
kernel_init_freeable+0x45c/0x60f init/main.c:1624
kernel_init+0x19/0x290 init/main.c:1512
ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:307
IN-SOFTIRQ-W at:
lock_acquire+0x1f8/0x5a0 kernel/locking/lockdep.c:5662
__raw_spin_lock include/linux/spinlock_api_smp.h:133 [inline]
_raw_spin_lock+0x2a/0x40 kernel/locking/spinlock.c:154
rcu_accelerate_cbs_unlocked+0x8a/0x230 kernel/rcu/tree.c:1184
rcu_core+0x5a0/0x17e0 kernel/rcu/tree.c:2547
__do_softirq+0x2e9/0xa4c kernel/softirq.c:571
invoke_softirq kernel/softirq.c:445 [inline]
__irq_exit_rcu+0x155/0x240 kernel/softirq.c:650
irq_exit_rcu+0x5/0x20 kernel/softirq.c:662
sysvec_apic_timer_interrupt+0x91/0xb0 arch/x86/kernel/apic/apic.c:1106
asm_sysvec_apic_timer_interrupt+0x16/0x20 arch/x86/include/asm/idtentry.h:653
__alloc_pages+0x3ee/0x770 mm/page_alloc.c:5571
alloc_page_interleave+0x22/0x1c0 mm/mempolicy.c:2115
__get_free_pages+0x8/0x30 mm/page_alloc.c:5595
kasan_populate_vmalloc_pte+0x35/0xf0 mm/kasan/shadow.c:271
apply_to_pte_range mm/memory.c:2645 [inline]
apply_to_pmd_range mm/memory.c:2689 [inline]
apply_to_pud_range mm/memory.c:2725 [inline]
apply_to_p4d_range mm/memory.c:2761 [inline]
__apply_to_page_range+0x9c5/0xcc0 mm/memory.c:2795
alloc_vmap_area+0x1977/0x1ac0 mm/vmalloc.c:1646
__get_vm_area_node+0x16c/0x360 mm/vmalloc.c:2505
__vmalloc_node_range+0x394/0x1460 mm/vmalloc.c:3183
alloc_thread_stack_node kernel/fork.c:311 [inline]
dup_task_struct+0x3e5/0x6d0 kernel/fork.c:988
copy_process+0x637/0x4060 kernel/fork.c:2098
fork_idle+0xa1/0x264 kernel/fork.c:2602
idle_init kernel/smpboot.c:55 [inline]
idle_threads_init+0x118/0x22b kernel/smpboot.c:74
smp_init+0x14/0x149 kernel/smp.c:1118
kernel_init_freeable+0x40c/0x60f init/main.c:1615
kernel_init+0x19/0x290 init/main.c:1512
ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:307
INITIAL USE at:
lock_acquire+0x1f8/0x5a0 kernel/locking/lockdep.c:5662
__raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
_raw_spin_lock_irqsave+0xd1/0x120 kernel/locking/spinlock.c:162
rcutree_prepare_cpu+0x6d/0x520 kernel/rcu/tree.c:4173
rcu_init+0xb4/0x200 kernel/rcu/tree.c:4854
start_kernel+0x20d/0x53f init/main.c:1031
secondary_startup_64_no_verify+0xcf/0xdb
}
... key at: [<ffffffff91cd2d60>] rcu_init_one.rcu_node_class+0x0/0x20

the dependencies between the lock to be acquired
and HARDIRQ-irq-unsafe lock:
-> (&htab->buckets[i].lock){+...}-{2:2} {
HARDIRQ-ON-W at:
lock_acquire+0x1f8/0x5a0 kernel/locking/lockdep.c:5662
__raw_spin_lock_bh include/linux/spinlock_api_smp.h:126 [inline]
_raw_spin_lock_bh+0x31/0x40 kernel/locking/spinlock.c:178
sock_hash_delete_elem+0xac/0x2f0 net/core/sock_map.c:932
bpf_prog_43221478a22f23b5+0x3a/0x3e
bpf_dispatcher_nop_func include/linux/bpf.h:989 [inline]
__bpf_prog_run include/linux/filter.h:600 [inline]
bpf_prog_run include/linux/filter.h:607 [inline]
__bpf_trace_run kernel/trace/bpf_trace.c:2273 [inline]
bpf_trace_run2+0x1fd/0x410 kernel/trace/bpf_trace.c:2312
trace_contention_end+0x12f/0x170 include/trace/events/lock.h:122
__mutex_lock_common kernel/locking/mutex.c:612 [inline]
__mutex_lock+0x2ed/0xd80 kernel/locking/mutex.c:747
ep_send_events fs/eventpoll.c:1657 [inline]
ep_poll fs/eventpoll.c:1827 [inline]
do_epoll_wait+0x814/0x1e60 fs/eventpoll.c:2262
__do_sys_epoll_wait fs/eventpoll.c:2274 [inline]
__se_sys_epoll_wait fs/eventpoll.c:2269 [inline]
__x64_sys_epoll_wait+0x253/0x2a0 fs/eventpoll.c:2269
do_syscall_x64 arch/x86/entry/common.c:51 [inline]
do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:81
entry_SYSCALL_64_after_hwframe+0x63/0xcd
INITIAL USE at:
lock_acquire+0x1f8/0x5a0 kernel/locking/lockdep.c:5662
__raw_spin_lock_bh include/linux/spinlock_api_smp.h:126 [inline]
_raw_spin_lock_bh+0x31/0x40 kernel/locking/spinlock.c:178
sock_hash_delete_elem+0xac/0x2f0 net/core/sock_map.c:932
bpf_prog_43221478a22f23b5+0x3a/0x3e
bpf_dispatcher_nop_func include/linux/bpf.h:989 [inline]
__bpf_prog_run include/linux/filter.h:600 [inline]
bpf_prog_run include/linux/filter.h:607 [inline]
__bpf_trace_run kernel/trace/bpf_trace.c:2273 [inline]
bpf_trace_run2+0x1fd/0x410 kernel/trace/bpf_trace.c:2312
trace_contention_end+0x12f/0x170 include/trace/events/lock.h:122
__mutex_lock_common kernel/locking/mutex.c:612 [inline]
__mutex_lock+0x2ed/0xd80 kernel/locking/mutex.c:747
ep_send_events fs/eventpoll.c:1657 [inline]
ep_poll fs/eventpoll.c:1827 [inline]
do_epoll_wait+0x814/0x1e60 fs/eventpoll.c:2262
__do_sys_epoll_wait fs/eventpoll.c:2274 [inline]
__se_sys_epoll_wait fs/eventpoll.c:2269 [inline]
__x64_sys_epoll_wait+0x253/0x2a0 fs/eventpoll.c:2269
do_syscall_x64 arch/x86/entry/common.c:51 [inline]
do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:81
entry_SYSCALL_64_after_hwframe+0x63/0xcd
}
... key at: [<ffffffff920af300>] sock_hash_alloc.__key+0x0/0x20
... acquired at:
lock_acquire+0x1f8/0x5a0 kernel/locking/lockdep.c:5662
__raw_spin_lock_bh include/linux/spinlock_api_smp.h:126 [inline]
_raw_spin_lock_bh+0x31/0x40 kernel/locking/spinlock.c:178
sock_hash_delete_elem+0xac/0x2f0 net/core/sock_map.c:932
bpf_prog_43221478a22f23b5+0x3a/0x3e
bpf_dispatcher_nop_func include/linux/bpf.h:989 [inline]
__bpf_prog_run include/linux/filter.h:600 [inline]
bpf_prog_run include/linux/filter.h:607 [inline]
__bpf_trace_run kernel/trace/bpf_trace.c:2273 [inline]
bpf_trace_run2+0x1fd/0x410 kernel/trace/bpf_trace.c:2312
trace_contention_end+0x14c/0x190 include/trace/events/lock.h:122
__pv_queued_spin_lock_slowpath+0x935/0xc50 kernel/locking/qspinlock.c:560
pv_queued_spin_lock_slowpath arch/x86/include/asm/paravirt.h:591 [inline]
queued_spin_lock_slowpath+0x42/0x50 arch/x86/include/asm/qspinlock.h:51
queued_spin_lock include/asm-generic/qspinlock.h:114 [inline]
do_raw_spin_lock+0x269/0x370 kernel/locking/spinlock_debug.c:115
rcu_note_context_switch+0x2a5/0xf10 kernel/rcu/tree_plugin.h:326
__schedule+0x32e/0x4550 kernel/sched/core.c:6458
preempt_schedule_common+0x83/0xd0 kernel/sched/core.c:6727
preempt_schedule+0xd9/0xe0 kernel/sched/core.c:6751
preempt_schedule_thunk+0x16/0x18 arch/x86/entry/thunk_64.S:34
__raw_spin_unlock include/linux/spinlock_api_smp.h:143 [inline]
_raw_spin_unlock+0x36/0x40 kernel/locking/spinlock.c:186
spin_unlock include/linux/spinlock.h:391 [inline]
filemap_map_pages+0xffa/0x12c0 mm/filemap.c:3470
do_fault_around mm/memory.c:4581 [inline]
do_read_fault mm/memory.c:4607 [inline]
do_fault mm/memory.c:4741 [inline]
handle_pte_fault mm/memory.c:5013 [inline]
__handle_mm_fault mm/memory.c:5155 [inline]
handle_mm_fault+0x33e2/0x5340 mm/memory.c:5276
do_user_addr_fault arch/x86/mm/fault.c:1380 [inline]
handle_page_fault arch/x86/mm/fault.c:1471 [inline]
exc_page_fault+0x26f/0x660 arch/x86/mm/fault.c:1527
asm_exc_page_fault+0x22/0x30 arch/x86/include/asm/idtentry.h:570


stack backtrace:
CPU: 0 PID: 3542 Comm: syz-executor277 Not tainted 6.1.82-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/29/2024
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0x1e3/0x2cb lib/dump_stack.c:106
print_bad_irq_dependency kernel/locking/lockdep.c:2604 [inline]
check_irq_usage kernel/locking/lockdep.c:2843 [inline]
check_prev_add kernel/locking/lockdep.c:3094 [inline]
check_prevs_add kernel/locking/lockdep.c:3209 [inline]
validate_chain+0x4d16/0x5950 kernel/locking/lockdep.c:3825
__lock_acquire+0x125b/0x1f80 kernel/locking/lockdep.c:5049
lock_acquire+0x1f8/0x5a0 kernel/locking/lockdep.c:5662
__raw_spin_lock_bh include/linux/spinlock_api_smp.h:126 [inline]
_raw_spin_lock_bh+0x31/0x40 kernel/locking/spinlock.c:178
sock_hash_delete_elem+0xac/0x2f0 net/core/sock_map.c:932
bpf_prog_43221478a22f23b5+0x3a/0x3e
bpf_dispatcher_nop_func include/linux/bpf.h:989 [inline]
__bpf_prog_run include/linux/filter.h:600 [inline]
bpf_prog_run include/linux/filter.h:607 [inline]
__bpf_trace_run kernel/trace/bpf_trace.c:2273 [inline]
bpf_trace_run2+0x1fd/0x410 kernel/trace/bpf_trace.c:2312
trace_contention_end+0x14c/0x190 include/trace/events/lock.h:122
__pv_queued_spin_lock_slowpath+0x935/0xc50 kernel/locking/qspinlock.c:560
pv_queued_spin_lock_slowpath arch/x86/include/asm/paravirt.h:591 [inline]
queued_spin_lock_slowpath+0x42/0x50 arch/x86/include/asm/qspinlock.h:51
queued_spin_lock include/asm-generic/qspinlock.h:114 [inline]
do_raw_spin_lock+0x269/0x370 kernel/locking/spinlock_debug.c:115
rcu_note_context_switch+0x2a5/0xf10 kernel/rcu/tree_plugin.h:326
__schedule+0x32e/0x4550 kernel/sched/core.c:6458
preempt_schedule_common+0x83/0xd0 kernel/sched/core.c:6727
preempt_schedule+0xd9/0xe0 kernel/sched/core.c:6751
preempt_schedule_thunk+0x16/0x18 arch/x86/entry/thunk_64.S:34
__raw_spin_unlock include/linux/spinlock_api_smp.h:143 [inline]
_raw_spin_unlock+0x36/0x40 kernel/locking/spinlock.c:186
spin_unlock include/linux/spinlock.h:391 [inline]
filemap_map_pages+0xffa/0x12c0 mm/filemap.c:3470
do_fault_around mm/memory.c:4581 [inline]
do_read_fault mm/memory.c:4607 [inline]
do_fault mm/memory.c:4741 [inline]
handle_pte_fault mm/memory.c:5013 [inline]
__handle_mm_fault mm/memory.c:5155 [inline]
handle_mm_fault+0x33e2/0x5340 mm/memory.c:5276
do_user_addr_fault arch/x86/mm/fault.c:1380 [inline]
handle_page_fault arch/x86/mm/fault.c:1471 [inline]
exc_page_fault+0x26f/0x660 arch/x86/mm/fault.c:1527
asm_exc_page_fault+0x22/0x30 arch/x86/include/asm/idtentry.h:570
RIP: 0033:0x7f92a849fdb8
Code: e8 9d 7d f9 ff 48 85 db 75 f0 48 8b 3d 49 73 03 00 48 83 c5 08 48 81 fd f8 07 00 00 75 cc 48 83 c4 08 5b 5d e9 79 7d f9 ff c3 <48> 83 ec 08 48 83 c4 08 c3 00 00 00 00 00 00 00 00 00 00 00 00 00
RSP: 002b:00007ffe7f2fd5b8 EFLAGS: 00010202
RAX: 00007f92a84cfaf8 RBX: 0000000000000000 RCX: 0000000000000004
RDX: 00007f92a84d1da0 RSI: 0000000000000000 RDI: 00007f92a84cfaf8
RBP: 00007f92a84ce138 R08: 000055555706e610 R09: 000055555706e610
R10: 0000000000000000 R11: 0000000000000246 R12: 00007f92a84d1d88
R13: 0000000000000000 R14: 00007f92a84d1da0 R15: 00007f92a84224c0
</TASK>
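
For anyone triaging: the inversion exists because the bucket lock in net/core/sock_map.c is only BH-safe, yet it is now reachable (via the tracepoint program) from under a hardirq-safe lock. One possible direction, sketched here untested against 6.1.y, is to make the bucket lock hardirq-safe so the new dependency becomes harmless; another is to keep such sockhash map writes out of this tracing context entirely:

	/* sketch for sock_hash_delete_elem(), net/core/sock_map.c;
	 * surrounding code elided, not a tested patch */
	unsigned long flags;
	...
-	spin_lock_bh(&bucket->lock);
+	spin_lock_irqsave(&bucket->lock, flags);
	/* lookup, hlist_del_rcu(), sock_map_unref(), free as before */
-	spin_unlock_bh(&bucket->lock);
+	spin_unlock_irqrestore(&bucket->lock, flags);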


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzk...@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

If the report is already addressed, let syzbot know by replying with:
#syz fix: exact-commit-title

If you want syzbot to run the reproducer, reply with:
#syz test: git://repo/address.git branch-or-commit-hash
If you attach or paste a git patch, syzbot will apply it before testing.

If you want to overwrite the report's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)

If the report is a duplicate of another one, reply with:
#syz dup: exact-subject-of-another-report

If you want to undo deduplication, reply with:
#syz undup