[v6.1] possible deadlock in hsr_dev_xmit (2)

syzbot

Oct 18, 2024, 2:52:28 AM
to syzkaller...@googlegroups.com
Hello,

syzbot found the following issue on:

HEAD commit: 54d90d17e8ce Linux 6.1.113
git tree: linux-6.1.y
console output: https://syzkaller.appspot.com/x/log.txt?x=15ae8c5f980000
kernel config: https://syzkaller.appspot.com/x/.config?x=3757166a6a7e985
dashboard link: https://syzkaller.appspot.com/bug?extid=00fd05b0dd1cceac22c6
compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40

Unfortunately, I don't have any reproducer for this issue yet.

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/911f6b814ba2/disk-54d90d17.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/171ad8824108/vmlinux-54d90d17.xz
kernel image: https://storage.googleapis.com/syzbot-assets/ec19432ca7b0/bzImage-54d90d17.xz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+00fd05...@syzkaller.appspotmail.com

============================================
WARNING: possible recursive locking detected
6.1.113-syzkaller #0 Not tainted
--------------------------------------------
syz.3.1684/8806 is trying to acquire lock:
ffff88805d9bed88 (&hsr->seqnr_lock){+.-.}-{2:2}, at: spin_lock_bh include/linux/spinlock.h:356 [inline]
ffff88805d9bed88 (&hsr->seqnr_lock){+.-.}-{2:2}, at: hsr_dev_xmit+0x13a/0x210 net/hsr/hsr_device.c:219

but task is already holding lock:
ffff88801ef78d88 (&hsr->seqnr_lock){+.-.}-{2:2}, at: spin_lock_bh include/linux/spinlock.h:356 [inline]
ffff88801ef78d88 (&hsr->seqnr_lock){+.-.}-{2:2}, at: send_hsr_supervision_frame+0x272/0xad0 net/hsr/hsr_device.c:300

other info that might help us debug this:
Possible unsafe locking scenario:

CPU0
----
lock(&hsr->seqnr_lock);
lock(&hsr->seqnr_lock);

*** DEADLOCK ***

May be due to missing lock nesting notation

7 locks held by syz.3.1684/8806:
#0: ffffc900001e0bc0 ((&hsr->announce_timer)){+.-.}-{0:0}, at: call_timer_fn+0xc2/0x6b0 kernel/time/timer.c:1501
#1: ffffffff8d32afc0 (rcu_read_lock){....}-{1:2}, at: rcu_lock_acquire include/linux/rcupdate.h:350 [inline]
#1: ffffffff8d32afc0 (rcu_read_lock){....}-{1:2}, at: rcu_read_lock include/linux/rcupdate.h:791 [inline]
#1: ffffffff8d32afc0 (rcu_read_lock){....}-{1:2}, at: hsr_announce+0x9f/0x340 net/hsr/hsr_device.c:377
#2: ffff88801ef78d88 (&hsr->seqnr_lock){+.-.}-{2:2}, at: spin_lock_bh include/linux/spinlock.h:356 [inline]
#2: ffff88801ef78d88 (&hsr->seqnr_lock){+.-.}-{2:2}, at: send_hsr_supervision_frame+0x272/0xad0 net/hsr/hsr_device.c:300
#3: ffffffff8d32afc0 (rcu_read_lock){....}-{1:2}, at: rcu_lock_acquire include/linux/rcupdate.h:350 [inline]
#3: ffffffff8d32afc0 (rcu_read_lock){....}-{1:2}, at: rcu_read_lock include/linux/rcupdate.h:791 [inline]
#3: ffffffff8d32afc0 (rcu_read_lock){....}-{1:2}, at: hsr_forward_skb+0xaa/0x2390 net/hsr/hsr_forward.c:614
#4: ffffffff8d32b020 (rcu_read_lock_bh){....}-{1:2}, at: local_bh_disable include/linux/bottom_half.h:20 [inline]
#4: ffffffff8d32b020 (rcu_read_lock_bh){....}-{1:2}, at: rcu_read_lock_bh include/linux/rcupdate.h:843 [inline]
#4: ffffffff8d32b020 (rcu_read_lock_bh){....}-{1:2}, at: __dev_queue_xmit+0x2d6/0x3d50 net/core/dev.c:4220
#5: ffffffff8d32afc0 (rcu_read_lock){....}-{1:2}, at: rcu_lock_acquire include/linux/rcupdate.h:350 [inline]
#5: ffffffff8d32afc0 (rcu_read_lock){....}-{1:2}, at: rcu_read_lock include/linux/rcupdate.h:791 [inline]
#5: ffffffff8d32afc0 (rcu_read_lock){....}-{1:2}, at: br_dev_xmit+0x212/0x18e0 net/bridge/br_device.c:49
#6: ffffffff8d32b020 (rcu_read_lock_bh){....}-{1:2}, at: local_bh_disable include/linux/bottom_half.h:20 [inline]
#6: ffffffff8d32b020 (rcu_read_lock_bh){....}-{1:2}, at: rcu_read_lock_bh include/linux/rcupdate.h:843 [inline]
#6: ffffffff8d32b020 (rcu_read_lock_bh){....}-{1:2}, at: __dev_queue_xmit+0x2d6/0x3d50 net/core/dev.c:4220

stack backtrace:
CPU: 1 PID: 8806 Comm: syz.3.1684 Not tainted 6.1.113-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024
Call Trace:
<IRQ>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0x1e3/0x2cb lib/dump_stack.c:106
print_deadlock_bug kernel/locking/lockdep.c:2983 [inline]
check_deadlock kernel/locking/lockdep.c:3026 [inline]
validate_chain+0x4711/0x5950 kernel/locking/lockdep.c:3812
__lock_acquire+0x125b/0x1f80 kernel/locking/lockdep.c:5049
lock_acquire+0x1f8/0x5a0 kernel/locking/lockdep.c:5662
__raw_spin_lock_bh include/linux/spinlock_api_smp.h:126 [inline]
_raw_spin_lock_bh+0x31/0x40 kernel/locking/spinlock.c:178
spin_lock_bh include/linux/spinlock.h:356 [inline]
hsr_dev_xmit+0x13a/0x210 net/hsr/hsr_device.c:219
__netdev_start_xmit include/linux/netdevice.h:4853 [inline]
netdev_start_xmit include/linux/netdevice.h:4867 [inline]
xmit_one net/core/dev.c:3627 [inline]
dev_hard_start_xmit+0x261/0x8c0 net/core/dev.c:3643
__dev_queue_xmit+0x1b5d/0x3d50 net/core/dev.c:4297
dev_queue_xmit include/linux/netdevice.h:3021 [inline]
br_dev_queue_push_xmit+0x6fe/0x8c0 net/bridge/br_forward.c:53
NF_HOOK+0x39f/0x450 include/linux/netfilter.h:302
br_forward_finish+0xe1/0x130 net/bridge/br_forward.c:66
NF_HOOK+0x39f/0x450 include/linux/netfilter.h:302
__br_forward+0x430/0x5f0 net/bridge/br_forward.c:115
deliver_clone net/bridge/br_forward.c:131 [inline]
maybe_deliver+0xb3/0x150 net/bridge/br_forward.c:189
br_flood+0x2e7/0x440 net/bridge/br_forward.c:231
br_dev_xmit+0x1194/0x18e0
__netdev_start_xmit include/linux/netdevice.h:4853 [inline]
netdev_start_xmit include/linux/netdevice.h:4867 [inline]
xmit_one net/core/dev.c:3627 [inline]
dev_hard_start_xmit+0x261/0x8c0 net/core/dev.c:3643
__dev_queue_xmit+0x1b5d/0x3d50 net/core/dev.c:4297
dev_queue_xmit include/linux/netdevice.h:3021 [inline]
hsr_xmit net/hsr/hsr_forward.c:380 [inline]
hsr_forward_do net/hsr/hsr_forward.c:471 [inline]
hsr_forward_skb+0x17f3/0x2390 net/hsr/hsr_forward.c:619
send_hsr_supervision_frame+0x540/0xad0 net/hsr/hsr_device.c:323
hsr_announce+0x1a4/0x340 net/hsr/hsr_device.c:379
call_timer_fn+0x1ad/0x6b0 kernel/time/timer.c:1504
expire_timers kernel/time/timer.c:1549 [inline]
__run_timers+0x67c/0x890 kernel/time/timer.c:1820
run_timer_softirq+0x63/0xf0 kernel/time/timer.c:1833
handle_softirqs+0x2ee/0xa40 kernel/softirq.c:571
__do_softirq kernel/softirq.c:605 [inline]
invoke_softirq kernel/softirq.c:445 [inline]
__irq_exit_rcu+0x157/0x240 kernel/softirq.c:654
irq_exit_rcu+0x5/0x20 kernel/softirq.c:666
instr_sysvec_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1106 [inline]
sysvec_apic_timer_interrupt+0xa0/0xc0 arch/x86/kernel/apic/apic.c:1106
</IRQ>
<TASK>
asm_sysvec_apic_timer_interrupt+0x16/0x20 arch/x86/include/asm/idtentry.h:691
RIP: 0010:finish_task_switch+0x1d3/0x810 kernel/sched/core.c:5120
Code: 38 0b 00 48 83 c4 08 4c 89 f7 e8 68 31 00 00 0f 1f 44 00 00 4c 89 f7 e8 1b 36 52 09 e8 d6 f2 31 00 fb 49 8d bc 24 f8 15 00 00 <48> 89 f8 48 c1 e8 03 49 bd 00 00 00 00 00 fc ff df 42 0f b6 04 28
RSP: 0018:ffffc9000402f2c8 EFLAGS: 00000282
RAX: ed340e3f36c2c400 RBX: ffff888027a1bbb4 RCX: ffffffff97337103
RDX: dffffc0000000000 RSI: ffffffff8b0c02c0 RDI: ffff88807a70d178
RBP: ffffc9000402f310 R08: dffffc0000000000 R09: ffffed10171e7539
R10: 0000000000000000 R11: dffffc0000000001 R12: ffff88807a70bb80
R13: 1ffff110171e76e3 R14: ffff8880b8f3a9c0 R15: ffff8880b8f3b718
context_switch kernel/sched/core.c:5244 [inline]
__schedule+0x1447/0x4570 kernel/sched/core.c:6558
schedule+0xbf/0x180 kernel/sched/core.c:6634
schedule_timeout+0xac/0x300 kernel/time/timer.c:1941
unix_wait_for_peer+0x24b/0x330 net/unix/af_unix.c:1443
unix_dgram_sendmsg+0x1348/0x2050 net/unix/af_unix.c:2022
sock_sendmsg_nosec net/socket.c:718 [inline]
__sock_sendmsg net/socket.c:730 [inline]
____sys_sendmsg+0x5a5/0x8f0 net/socket.c:2519
___sys_sendmsg net/socket.c:2573 [inline]
__sys_sendmmsg+0x3ab/0x730 net/socket.c:2659
__do_sys_sendmmsg net/socket.c:2688 [inline]
__se_sys_sendmmsg net/socket.c:2685 [inline]
__x64_sys_sendmmsg+0x9c/0xb0 net/socket.c:2685
do_syscall_x64 arch/x86/entry/common.c:51 [inline]
do_syscall_64+0x3b/0xb0 arch/x86/entry/common.c:81
entry_SYSCALL_64_after_hwframe+0x68/0xd2
RIP: 0033:0x7f1e8cd7dff9
Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 a8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f1e8db4f038 EFLAGS: 00000246 ORIG_RAX: 0000000000000133
RAX: ffffffffffffffda RBX: 00007f1e8cf35f80 RCX: 00007f1e8cd7dff9
RDX: 0000000000000651 RSI: 0000000020000000 RDI: 0000000000000004
RBP: 00007f1e8cdf0296 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 0000000000000000 R14: 00007f1e8cf35f80 R15: 00007ffe61af77d8
</TASK>
sched: RT throttling activated
----------------
Code disassembly (best guess), 1 bytes skipped:
0: 0b 00 or (%rax),%eax
2: 48 83 c4 08 add $0x8,%rsp
6: 4c 89 f7 mov %r14,%rdi
9: e8 68 31 00 00 call 0x3176
e: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1)
13: 4c 89 f7 mov %r14,%rdi
16: e8 1b 36 52 09 call 0x9523636
1b: e8 d6 f2 31 00 call 0x31f2f6
20: fb sti
21: 49 8d bc 24 f8 15 00 lea 0x15f8(%r12),%rdi
28: 00
* 29: 48 89 f8 mov %rdi,%rax <-- trapping instruction
2c: 48 c1 e8 03 shr $0x3,%rax
30: 49 bd 00 00 00 00 00 movabs $0xdffffc0000000000,%r13
37: fc ff df
3a: 42 0f b6 04 28 movzbl (%rax,%r13,1),%eax
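
Note that the two addresses in the report differ (ffff88805d9bed88 vs ffff88801ef78d88), so these are two separate hsr instances whose seqnr_lock belongs to the same lockdep class. The sketch below is a simplified userspace model of that class-based check, not kernel code; the names Lock and HeldLocks are hypothetical, and real lockdep has further rules (read recursion, nest_lock) that are left out here. It only illustrates why a second lock of the same class and subclass is flagged even when the instance is different.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Lock:
    """A lock instance; lock_class stands in for lockdep's lock class."""
    address: int
    lock_class: str
    subclass: int = 0  # a different subclass would model a nesting annotation


@dataclass
class HeldLocks:
    """Simplified per-task list of held locks (hypothetical model)."""
    held: list = field(default_factory=list)

    def acquire(self, lock: Lock) -> bool:
        """Record the lock; return True if the same class/subclass is already held."""
        recursive = any(
            h.lock_class == lock.lock_class and h.subclass == lock.subclass
            for h in self.held
        )
        self.held.append(lock)
        return recursive

    def release(self, lock: Lock) -> None:
        self.held.remove(lock)


# Addresses taken from the report above; both locks share one class.
outer = Lock(0xFFFF88801EF78D88, "&hsr->seqnr_lock")
inner = Lock(0xFFFF88805D9BED88, "&hsr->seqnr_lock")

task = HeldLocks()
first_flagged = task.acquire(outer)   # nothing held yet
second_flagged = task.acquire(inner)  # same class already held
print(first_flagged, second_flagged)  # prints: False True
```

In this model, giving the inner lock a different subclass would silence the check. Whether such an annotation would be correct for hsr is not something this report establishes.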


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzk...@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

If the report is already addressed, let syzbot know by replying with:
#syz fix: exact-commit-title

If you want to overwrite the report's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)

If the report is a duplicate of another one, reply with:
#syz dup: exact-subject-of-another-report

If you want to undo deduplication, reply with:
#syz undup

syzbot

Jan 11, 2025, 1:12:27 PM
to syzkaller...@googlegroups.com
syzbot has found a reproducer for the following issue on:

HEAD commit: c63962be84ef Linux 6.1.124
git tree: linux-6.1.y
console output: https://syzkaller.appspot.com/x/log.txt?x=15dd2cb0580000
kernel config: https://syzkaller.appspot.com/x/.config?x=2f99ac2134f3ff64
dashboard link: https://syzkaller.appspot.com/bug?extid=00fd05b0dd1cceac22c6
compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
userspace arch: arm64
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=17ac5ef8580000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=17a9cbc4580000

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/7ce6b92b931c/disk-c63962be.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/edbb3cdca2c4/vmlinux-c63962be.xz
kernel image: https://storage.googleapis.com/syzbot-assets/728732ee05ab/Image-c63962be.gz.xz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+00fd05...@syzkaller.appspotmail.com

============================================
WARNING: possible recursive locking detected
6.1.124-syzkaller #0 Not tainted
--------------------------------------------
kworker/1:1/24 is trying to acquire lock:
ffff0000ded30d88 (&hsr->seqnr_lock){+.-.}-{2:2}, at: spin_lock_bh include/linux/spinlock.h:356 [inline]
ffff0000ded30d88 (&hsr->seqnr_lock){+.-.}-{2:2}, at: hsr_dev_xmit+0xf8/0x2d8 net/hsr/hsr_device.c:219

but task is already holding lock:
ffff0000d932cd88 (&hsr->seqnr_lock){+.-.}-{2:2}, at: spin_lock_bh include/linux/spinlock.h:356 [inline]
ffff0000d932cd88 (&hsr->seqnr_lock){+.-.}-{2:2}, at: hsr_dev_xmit+0xf8/0x2d8 net/hsr/hsr_device.c:219

other info that might help us debug this:
Possible unsafe locking scenario:

CPU0
----
lock(&hsr->seqnr_lock);
lock(&hsr->seqnr_lock);

*** DEADLOCK ***

May be due to missing lock nesting notation

11 locks held by kworker/1:1/24:
#0: ffff0000d5590938 ((wq_completion)ipv6_addrconf){+.+.}-{0:0}, at: process_one_work+0x664/0x1404 kernel/workqueue.c:2265
#1: ffff80001d2e7c20 ((work_completion)(&(&ifa->dad_work)->work)){+.+.}-{0:0}, at: process_one_work+0x6a8/0x1404 kernel/workqueue.c:2267
#2: ffff8000180c4a08 (rtnl_mutex){+.+.}-{3:3}, at: rtnl_lock+0x20/0x2c net/core/rtnetlink.c:74
#3: ffff800015c65360 (rcu_read_lock){....}-{1:2}, at: rcu_lock_acquire+0x10/0x4c include/linux/rcupdate.h:349
#4: ffff800015c65360 (rcu_read_lock){....}-{1:2}, at: rcu_lock_acquire+0x10/0x4c include/linux/rcupdate.h:349
#5: ffff800015c653c0 (rcu_read_lock_bh){....}-{1:2}, at: rcu_lock_acquire+0x18/0x54 include/linux/rcupdate.h:349
#6: ffff0000d932cd88 (&hsr->seqnr_lock){+.-.}-{2:2}, at: spin_lock_bh include/linux/spinlock.h:356 [inline]
#6: ffff0000d932cd88 (&hsr->seqnr_lock){+.-.}-{2:2}, at: hsr_dev_xmit+0xf8/0x2d8 net/hsr/hsr_device.c:219
#7: ffff800015c65360 (rcu_read_lock){....}-{1:2}, at: rcu_lock_acquire+0x10/0x4c include/linux/rcupdate.h:349
#8: ffff800015c653c0 (rcu_read_lock_bh){....}-{1:2}, at: rcu_lock_acquire+0x18/0x54 include/linux/rcupdate.h:349
#9: ffff800015c65360 (rcu_read_lock){....}-{1:2}, at: rcu_lock_acquire+0x10/0x4c include/linux/rcupdate.h:349
#10: ffff800015c653c0 (rcu_read_lock_bh){....}-{1:2}, at: rcu_lock_acquire+0x18/0x54 include/linux/rcupdate.h:349

stack backtrace:
CPU: 1 PID: 24 Comm: kworker/1:1 Not tainted 6.1.124-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024
Workqueue: ipv6_addrconf addrconf_dad_work
Call trace:
dump_backtrace+0x1c8/0x1f4 arch/arm64/kernel/stacktrace.c:158
show_stack+0x2c/0x3c arch/arm64/kernel/stacktrace.c:165
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0x108/0x170 lib/dump_stack.c:106
dump_stack+0x1c/0x5c lib/dump_stack.c:113
__lock_acquire+0x6310/0x7680 kernel/locking/lockdep.c:5049
lock_acquire+0x26c/0x7cc kernel/locking/lockdep.c:5662
__raw_spin_lock_bh include/linux/spinlock_api_smp.h:126 [inline]
_raw_spin_lock_bh+0x54/0x6c kernel/locking/spinlock.c:178
spin_lock_bh include/linux/spinlock.h:356 [inline]
hsr_dev_xmit+0xf8/0x2d8 net/hsr/hsr_device.c:219
__netdev_start_xmit include/linux/netdevice.h:4888 [inline]
netdev_start_xmit include/linux/netdevice.h:4902 [inline]
xmit_one net/core/dev.c:3627 [inline]
dev_hard_start_xmit+0x25c/0x9a4 net/core/dev.c:3643
__dev_queue_xmit+0x161c/0x34d0 net/core/dev.c:4303
dev_queue_xmit include/linux/netdevice.h:3043 [inline]
br_dev_queue_push_xmit+0x584/0x730 net/bridge/br_forward.c:53
NF_HOOK+0x35c/0x408 include/linux/netfilter.h:302
br_forward_finish+0xd0/0x118 net/bridge/br_forward.c:66
NF_HOOK+0x35c/0x408 include/linux/netfilter.h:302
__br_forward+0x2f0/0x458 net/bridge/br_forward.c:115
deliver_clone net/bridge/br_forward.c:131 [inline]
maybe_deliver+0xc8/0x178 net/bridge/br_forward.c:189
br_flood+0x28c/0x3f8 net/bridge/br_forward.c:231
br_dev_xmit+0xdec/0x1520
__netdev_start_xmit include/linux/netdevice.h:4888 [inline]
netdev_start_xmit include/linux/netdevice.h:4902 [inline]
xmit_one net/core/dev.c:3627 [inline]
dev_hard_start_xmit+0x25c/0x9a4 net/core/dev.c:3643
__dev_queue_xmit+0x161c/0x34d0 net/core/dev.c:4303
dev_queue_xmit include/linux/netdevice.h:3043 [inline]
hsr_xmit net/hsr/hsr_forward.c:380 [inline]
hsr_forward_do net/hsr/hsr_forward.c:471 [inline]
hsr_forward_skb+0x1070/0x1c84 net/hsr/hsr_forward.c:621
hsr_dev_xmit+0x104/0x2d8 net/hsr/hsr_device.c:220
__netdev_start_xmit include/linux/netdevice.h:4888 [inline]
netdev_start_xmit include/linux/netdevice.h:4902 [inline]
xmit_one net/core/dev.c:3627 [inline]
dev_hard_start_xmit+0x25c/0x9a4 net/core/dev.c:3643
__dev_queue_xmit+0x161c/0x34d0 net/core/dev.c:4303
dev_queue_xmit include/linux/netdevice.h:3043 [inline]
neigh_connected_output+0x344/0x3d4 net/core/neighbour.c:1592
neigh_output include/net/neighbour.h:544 [inline]
ip6_finish_output2+0xdb8/0x1b54 net/ipv6/ip6_output.c:138
__ip6_finish_output net/ipv6/ip6_output.c:205 [inline]
ip6_finish_output+0x5a4/0x940 net/ipv6/ip6_output.c:216
NF_HOOK_COND include/linux/netfilter.h:291 [inline]
ip6_output+0x274/0x594 net/ipv6/ip6_output.c:237
dst_output include/net/dst.h:444 [inline]
NF_HOOK include/linux/netfilter.h:302 [inline]
ndisc_send_skb+0xc38/0x179c net/ipv6/ndisc.c:511
ndisc_send_ns+0xd4/0x164 net/ipv6/ndisc.c:669
addrconf_dad_work+0x99c/0x1390 net/ipv6/addrconf.c:4222
process_one_work+0x7ac/0x1404 kernel/workqueue.c:2292
worker_thread+0x8e4/0xfec kernel/workqueue.c:2439
kthread+0x250/0x2d8 kernel/kthread.c:376
ret_from_fork+0x10/0x20 arch/arm64/kernel/entry.S:864
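
Reading the call trace bottom-up, hsr_dev_xmit takes seqnr_lock (line 219) and calls hsr_forward_skb (line 220) with the lock still held. The frame is then queued to a bridge device (br_dev_xmit), flooded to a bridge port, and arrives in hsr_dev_xmit of a second hsr device. The sketch below is a rough userspace model of that stacking, assuming the topology inferred from the trace (hsr over bridge over hsr). The class and device names are hypothetical and none of this is kernel code.

```python
class Device:
    """Generic stacked device: transmits by handing the frame to its lowers."""

    def __init__(self, name):
        self.name = name
        self.lowers = []  # devices this device transmits through

    def xmit(self, held, events):
        for lower in self.lowers:
            lower.xmit(held, events)


class HsrDevice(Device):
    """Models an hsr device that holds its seqnr_lock while forwarding."""

    def xmit(self, held, events):
        already = [h for h in held if h[1] == "seqnr_lock"]
        if already:
            # Another hsr device's seqnr_lock is held further up the stack.
            events.append((already[-1][0], self.name))
        held.append((self.name, "seqnr_lock"))
        try:
            super().xmit(held, events)  # forward with the lock held
        finally:
            held.pop()


def nested_acquisitions(top):
    """Return (outer, inner) pairs of hsr devices whose locks nest."""
    events = []
    top.xmit([], events)
    return events


# Topology inferred from the trace: hsr_a -> bridge -> hsr_b.
hsr_a, bridge, hsr_b = HsrDevice("hsr_a"), Device("bridge"), HsrDevice("hsr_b")
hsr_a.lowers.append(bridge)
bridge.lowers.append(hsr_b)
print(nested_acquisitions(hsr_a))  # prints: [('hsr_a', 'hsr_b')]
```

A single hsr device with no hsr device beneath it produces no nested pair in this model, which matches the report firing only when the devices are stacked.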


---
If you want syzbot to run the reproducer, reply with:
#syz test: git://repo/address.git branch-or-commit-hash
If you attach or paste a git patch, syzbot will apply it before testing.