BUG: soft lockup in ieee80211_tasklet_handler

syzbot

Feb 23, 2021, 12:55:21 PM

to da...@davemloft.net, joha...@sipsolutions.net, ku...@kernel.org, linux-...@vger.kernel.org, linux-w...@vger.kernel.org, net...@vger.kernel.org, syzkall...@googlegroups.com

Hello,

syzbot found the following issue on:

HEAD commit: 3b9cdafb Merge tag 'pinctrl-v5.12-1' of git://git.kernel.o..
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=153024bcd00000
kernel config: https://syzkaller.appspot.com/x/.config?x=22008533485b2c35
dashboard link: https://syzkaller.appspot.com/bug?extid=27df43cf7ae73de7d8ee

Unfortunately, I don't have any reproducer for this issue yet.

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+27df43...@syzkaller.appspotmail.com

watchdog: BUG: soft lockup - CPU#0 stuck for 122s! [syz-executor.4:18357]
Modules linked in:
irq event stamp: 20542405
hardirqs last enabled at (20542404): [<ffffffff89200d42>] asm_sysvec_irq_work+0x12/0x20 arch/x86/include/asm/idtentry.h:661
hardirqs last disabled at (20542405): [<ffffffff8901540c>] sysvec_apic_timer_interrupt+0xc/0x100 arch/x86/kernel/apic/apic.c:1100
softirqs last enabled at (18968488): [<ffffffff89200eaf>] asm_call_irq_on_stack+0xf/0x20
softirqs last disabled at (18968491): [<ffffffff89200eaf>] asm_call_irq_on_stack+0xf/0x20
CPU: 0 PID: 18357 Comm: syz-executor.4 Not tainted 5.11.0-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:jhash+0x339/0x5d0 include/linux/jhash.h:95
Code: fc ff df 48 89 fa 48 c1 ea 03 0f b6 04 02 48 89 fa 83 e2 07 38 d0 7f 08 84 c0 0f 85 8a 02 00 00 0f b6 43 06 c1 e0 10 41 01 c7 <e8> 92 7f b2 fd 48 8d 7b 05 48 b8 00 00 00 00 00 fc ff df 48 89 fa
RSP: 0018:ffffc90000007a08 EFLAGS: 00000297
RAX: 0000000000000000 RBX: ffff88801419dc5a RCX: 000000000000000c
RDX: 0000000000000000 RSI: ffff88806da10000 RDI: 0000000000000003
RBP: 0000000000000006 R08: ffffffff89bed200 R09: ffffffff83c0d347
R10: 000000000000000c R11: 0000000000000006 R12: 0000000000000006
R13: 00000000b59356c8 R14: 00000000b59356c8 R15: 00000000b59356c8
FS: 0000000000000000(0000) GS:ffff8880b9c00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000557aa2d18ea8 CR3: 000000001466b000 CR4: 0000000000350ef0
Call Trace:
<IRQ>
rht_key_hashfn include/linux/rhashtable.h:159 [inline]
__rhashtable_lookup+0x22b/0x780 include/linux/rhashtable.h:596
rhltable_lookup include/linux/rhashtable.h:688 [inline]
sta_info_hash_lookup net/mac80211/sta_info.c:162 [inline]
sta_info_get_bss+0x144/0x3f0 net/mac80211/sta_info.c:199
__ieee80211_rx_handle_packet net/mac80211/rx.c:4694 [inline]
ieee80211_rx_list+0x910/0x2680 net/mac80211/rx.c:4819
ieee80211_rx_napi+0xf7/0x3d0 net/mac80211/rx.c:4842
ieee80211_rx include/net/mac80211.h:4524 [inline]
ieee80211_tasklet_handler+0xd4/0x130 net/mac80211/main.c:235
tasklet_action_common.constprop.0+0x1d7/0x2d0 kernel/softirq.c:555
__do_softirq+0x29b/0x9f6 kernel/softirq.c:343
asm_call_irq_on_stack+0xf/0x20
</IRQ>
__run_on_irqstack arch/x86/include/asm/irq_stack.h:26 [inline]
run_on_irqstack_cond arch/x86/include/asm/irq_stack.h:77 [inline]
do_softirq_own_stack+0xaa/0xd0 arch/x86/kernel/irq_64.c:77
invoke_softirq kernel/softirq.c:226 [inline]
__irq_exit_rcu kernel/softirq.c:420 [inline]
irq_exit_rcu+0x134/0x200 kernel/softirq.c:432
sysvec_apic_timer_interrupt+0x4d/0x100 arch/x86/kernel/apic/apic.c:1100
asm_sysvec_apic_timer_interrupt+0x12/0x20 arch/x86/include/asm/idtentry.h:635
RIP: 0010:mm_update_next_owner+0x50f/0x7a0 kernel/exit.c:391
Code: 06 00 00 48 8d a8 f0 f9 ff ff 49 39 c7 0f 84 eb fe ff ff e8 43 e3 2e 00 48 8d 85 a0 04 00 00 48 89 c2 48 c1 ea 03 80 3c 1a 00 <0f> 85 96 01 00 00 4c 8b b5 a0 04 00 00 4d 39 e6 75 95 49 89 c5 e9
RSP: 0018:ffffc90002067b18 EFLAGS: 00000246
RAX: ffff888012bcbc20 RBX: dffffc0000000000 RCX: 0000000000000000
RDX: 1ffff11002579784 RSI: ffffffff814470fd RDI: ffff888012bcbf38
RBP: ffff888012bcb780 R08: 0000000000000000 R09: ffffffff8bc0a083
R10: ffffffff8144705f R11: 0000000000000001 R12: ffff888073213f00
R13: ffff888012bcb780 R14: 0000000000000000 R15: ffff88802655c110
exit_mm kernel/exit.c:500 [inline]
do_exit+0xb67/0x2ae0 kernel/exit.c:812
do_group_exit+0x125/0x310 kernel/exit.c:922
get_signal+0x42c/0x2100 kernel/signal.c:2773
arch_do_signal_or_restart+0x2a8/0x1eb0 arch/x86/kernel/signal.c:811
handle_signal_work kernel/entry/common.c:147 [inline]
exit_to_user_mode_loop kernel/entry/common.c:171 [inline]
exit_to_user_mode_prepare+0x148/0x250 kernel/entry/common.c:208
__syscall_exit_to_user_mode_work kernel/entry/common.c:290 [inline]
syscall_exit_to_user_mode+0x19/0x50 kernel/entry/common.c:301
entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x465ef9
Code: Unable to access opcode bytes at RIP 0x465ecf.
RSP: 002b:00007f599085f218 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca
RAX: fffffffffffffe00 RBX: 000000000056bf68 RCX: 0000000000465ef9
RDX: 0000000000000000 RSI: 0000000000000080 RDI: 000000000056bf68
RBP: 000000000056bf60 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 000000000056bf6c
R13: 00007ffd3669b9ff R14: 00007f599085f300 R15: 0000000000022000
Sending NMI from CPU 0 to CPUs 1:
NMI backtrace for cpu 1
CPU: 1 PID: 18367 Comm: syz-executor.5 Not tainted 5.11.0-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:queued_write_lock_slowpath+0x131/0x270 kernel/locking/qrwlock.c:76
Code: 00 00 00 00 fc ff df 49 01 c7 41 83 c6 03 41 0f b6 07 41 38 c6 7c 08 84 c0 0f 85 fe 00 00 00 8b 03 3d 00 01 00 00 74 19 f3 90 <41> 0f b6 07 41 38 c6 7c ec 84 c0 74 e8 48 89 df e8 ea c7 5c 00 eb
RSP: 0018:ffffc90001ee7ca0 EFLAGS: 00000006
RAX: 0000000000000300 RBX: ffffffff8bc0a080 RCX: ffffffff8159eafa
RDX: fffffbfff1781411 RSI: 0000000000000004 RDI: ffffffff8bc0a080
RBP: 00000000000000ff R08: 0000000000000001 R09: ffffffff8bc0a083
R10: fffffbfff1781410 R11: 0000000000000000 R12: 1ffff920003dcf95
R13: ffffffff8bc0a084 R14: 0000000000000003 R15: fffffbfff1781410
FS: 0000000000000000(0000) GS:ffff8880b9d00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000004 CR3: 0000000026091000 CR4: 0000000000350ee0
Call Trace:
queued_write_lock include/asm-generic/qrwlock.h:97 [inline]
do_raw_write_lock+0x1ce/0x280 kernel/locking/spinlock_debug.c:207
exit_notify kernel/exit.c:667 [inline]
do_exit+0xcaf/0x2ae0 kernel/exit.c:845
do_group_exit+0x125/0x310 kernel/exit.c:922
__do_sys_exit_group kernel/exit.c:933 [inline]
__se_sys_exit_group kernel/exit.c:931 [inline]
__x64_sys_exit_group+0x3a/0x50 kernel/exit.c:931
do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x465ef9
Code: Unable to access opcode bytes at RIP 0x465ecf.
RSP: 002b:00007ffd63249718 EFLAGS: 00000246 ORIG_RAX: 00000000000000e7
RAX: ffffffffffffffda RBX: 000000000000001e RCX: 0000000000465ef9
RDX: 000000000041920b RSI: ffffffffffffffbc RDI: 0000000000000000
RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000001
R13: 0000000000000000 R14: 0000000000000001 R15: 00007ffd63249810

---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzk...@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

Hillf Danton

Feb 23, 2021, 9:30:42 PM

to syzbot, joha...@sipsolutions.net, Hillf Danton, linux-...@vger.kernel.org, linux-w...@vger.kernel.org, net...@vger.kernel.org, syzkall...@googlegroups.com

Tue, 23 Feb 2021 09:55:20 -0800

It's not the second time we have seen a lockup like this.

Add budget for the 80211 softint handler - it's feasible not to try to
build the giant pyramid in a week.

--- x/net/mac80211/main.c
+++ y/net/mac80211/main.c
@@ -224,9 +224,15 @@ static void ieee80211_tasklet_handler(un
 {
 	struct ieee80211_local *local = (struct ieee80211_local *) data;
 	struct sk_buff *skb;
+	int i = 0;
+
+	while (i++ < 64) {
+		skb = skb_dequeue(&local->skb_queue);
+		if (!skb)
+			skb = skb_dequeue(&local->skb_queue_unreliable);
+		if (!skb)
+			return;

-	while ((skb = skb_dequeue(&local->skb_queue)) ||
-	       (skb = skb_dequeue(&local->skb_queue_unreliable))) {
 		switch (skb->pkt_type) {
 		case IEEE80211_RX_MSG:
 			/* Clear skb->pkt_type in order to not confuse kernel
@@ -245,6 +251,8 @@ static void ieee80211_tasklet_handler(un
 			break;
 		}
 	}
+
+	tasklet_schedule(&local->tasklet);
 }

 static void ieee80211_restart_work(struct work_struct *work)
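
For readers who want to see the budget idea outside the kernel, here is a minimal userspace sketch. It is not the mac80211 code: `struct queue`, `queue_push`, `queue_pop`, and `drain_with_budget` are hypothetical names, and the boolean return value stands in for the `tasklet_schedule()` call in the patch. The budget of 64 used in the usage note mirrors the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical fixed-size FIFO standing in for an skb queue. */
#define QUEUE_CAP 256

struct queue {
	int items[QUEUE_CAP];
	size_t head;	/* index of the next item to pop */
	size_t len;	/* number of queued items */
};

/* Append one item; returns false when the queue is full. */
static bool queue_push(struct queue *q, int item)
{
	if (q->len == QUEUE_CAP)
		return false;
	q->items[(q->head + q->len) % QUEUE_CAP] = item;
	q->len++;
	return true;
}

/* Pop the oldest item into *out; returns false when the queue is empty. */
static bool queue_pop(struct queue *q, int *out)
{
	if (q->len == 0)
		return false;
	*out = q->items[q->head];
	q->head = (q->head + 1) % QUEUE_CAP;
	q->len--;
	return true;
}

/*
 * Process at most `budget` items, preferring `primary` over `secondary`
 * (as the patch prefers skb_queue over skb_queue_unreliable).
 * Returns true if the budget ran out, meaning the caller should
 * reschedule itself instead of looping; returns false once both queues
 * are empty. `*processed` counts the items handled in this call.
 */
static bool drain_with_budget(struct queue *primary, struct queue *secondary,
			      int budget, int *processed)
{
	int item;

	assert(budget > 0);
	*processed = 0;
	while (*processed < budget) {
		if (!queue_pop(primary, &item) && !queue_pop(secondary, &item))
			return false;	/* nothing left, no reschedule needed */
		(void)item;		/* real code would handle the packet here */
		(*processed)++;
	}
	return true;			/* budget exhausted, ask to be run again */
}
```

A caller would invoke `drain_with_budget(&a, &b, 64, &n)` and, on a true return, arrange to be run again later, which gives other softirq work and the interrupted task a chance to make progress between batches.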

syzbot

Mar 2, 2021, 3:10:23 AM

to da...@davemloft.net, hda...@sina.com, joha...@sipsolutions.net, ku...@kernel.org, linux-...@vger.kernel.org, linux-w...@vger.kernel.org, net...@vger.kernel.org, syzkall...@googlegroups.com

syzbot has found a reproducer for the following issue on:

HEAD commit: 7a7fd0de Merge branch 'kmap-conversion-for-5.12' of git://..
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=14df34ead00000
kernel config: https://syzkaller.appspot.com/x/.config?x=e0da2d01cc636e2c
dashboard link: https://syzkaller.appspot.com/bug?extid=27df43cf7ae73de7d8ee
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=154a476cd00000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=1152fb82d00000

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+27df43...@syzkaller.appspotmail.com

watchdog: BUG: soft lockup - CPU#0 stuck for 123s! [syz-executor290:22312]
Modules linked in:
irq event stamp: 18402725
hardirqs last enabled at (18402724): [<ffffffff89200d42>] asm_sysvec_irq_work+0x12/0x20 arch/x86/include/asm/idtentry.h:658
hardirqs last disabled at (18402725): [<ffffffff8902dd0b>] sysvec_apic_timer_interrupt+0xb/0xc0 arch/x86/kernel/apic/apic.c:1100
softirqs last enabled at (18165196): [<ffffffff8144d934>] invoke_softirq kernel/softirq.c:221 [inline]
softirqs last enabled at (18165196): [<ffffffff8144d934>] __irq_exit_rcu kernel/softirq.c:422 [inline]
softirqs last enabled at (18165196): [<ffffffff8144d934>] irq_exit_rcu+0x134/0x200 kernel/softirq.c:434
softirqs last disabled at (18165199): [<ffffffff8144d934>] invoke_softirq kernel/softirq.c:221 [inline]
softirqs last disabled at (18165199): [<ffffffff8144d934>] __irq_exit_rcu kernel/softirq.c:422 [inline]
softirqs last disabled at (18165199): [<ffffffff8144d934>] irq_exit_rcu+0x134/0x200 kernel/softirq.c:434
CPU: 0 PID: 22312 Comm: syz-executor290 Not tainted 5.12.0-rc1-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:write_comp_data kernel/kcov.c:218 [inline]
RIP: 0010:__sanitizer_cov_trace_switch+0x63/0xf0 kernel/kcov.c:320
Code: 4d 8b 10 31 c9 65 4c 8b 24 25 00 f0 01 00 4d 85 d2 74 6b 4c 89 e6 bf 03 00 00 00 4c 8b 4c 24 20 49 8b 6c c8 10 e8 2d ff ff ff <84> c0 74 47 49 8b 84 24 b8 14 00 00 41 8b bc 24 b4 14 00 00 48 8b
RSP: 0018:ffffc900000078d8 EFLAGS: 00000246
RAX: 0000000000000000 RBX: 0000000000000003 RCX: 0000000000000006
RDX: 0000000000000000 RSI: ffff88801c370000 RDI: 0000000000000003
RBP: 00000000000000b0 R08: ffffffff8a84bea0 R09: ffffffff885fcfcf
R10: 0000000000000008 R11: 0000000000000080 R12: ffff88801c370000
R13: 0000000000000080 R14: ffff888012b6a450 R15: 0000000000000000
FS: 0000000000000000(0000) GS:ffff8880b9c00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000004d0110 CR3: 0000000027282000 CR4: 0000000000350ef0
Call Trace:
<IRQ>
ieee80211_rx_h_mgmt net/mac80211/rx.c:3588 [inline]
ieee80211_rx_handlers+0x89ef/0xae60 net/mac80211/rx.c:3793
ieee80211_invoke_rx_handlers net/mac80211/rx.c:3823 [inline]
ieee80211_prepare_and_rx_handle+0x22ad/0x5070 net/mac80211/rx.c:4537
__ieee80211_rx_handle_packet net/mac80211/rx.c:4635 [inline]
ieee80211_rx_list+0x930/0x2680 net/mac80211/rx.c:4819
ieee80211_rx_napi+0xf7/0x3d0 net/mac80211/rx.c:4842
ieee80211_rx include/net/mac80211.h:4524 [inline]
ieee80211_tasklet_handler+0xd4/0x130 net/mac80211/main.c:235
tasklet_action_common.constprop.0+0x1d7/0x2d0 kernel/softirq.c:557
__do_softirq+0x29b/0x9f6 kernel/softirq.c:345
invoke_softirq kernel/softirq.c:221 [inline]
__irq_exit_rcu kernel/softirq.c:422 [inline]
irq_exit_rcu+0x134/0x200 kernel/softirq.c:434
sysvec_apic_timer_interrupt+0x93/0xc0 arch/x86/kernel/apic/apic.c:1100
</IRQ>
asm_sysvec_apic_timer_interrupt+0x12/0x20 arch/x86/include/asm/idtentry.h:632
RIP: 0010:mm_update_next_owner+0x432/0x7a0 kernel/exit.c:388
Code: 8d ad b0 fb ff ff 48 81 fd 50 c8 cb 8b 0f 84 65 01 00 00 e8 90 e6 2e 00 48 8d bd dc fb ff ff 48 89 f8 48 c1 e8 03 0f b6 14 18 <48> 89 f8 83 e0 07 83 c0 03 38 d0 7c 08 84 d2 0f 85 b5 02 00 00 44
RSP: 0018:ffffc9000ab77b18 EFLAGS: 00000217
RAX: 1ffff110041046f5 RBX: dffffc0000000000 RCX: 0000000000000000
RDX: 0000000000000000 RSI: ffffffff814470e0 RDI: ffff8880208237ac
RBP: ffff888020823bd0 R08: 0000000000000000 R09: ffffffff8bc0a083
R10: ffffffff8144711f R11: 0000000000000001 R12: ffff888018b00000
R13: ffff888020823780 R14: 0000000000200000 R15: ffff888011520010
exit_mm kernel/exit.c:500 [inline]
do_exit+0xb02/0x2a60 kernel/exit.c:812
do_group_exit+0x125/0x310 kernel/exit.c:922
get_signal+0x42c/0x2100 kernel/signal.c:2773
arch_do_signal_or_restart+0x2a8/0x1eb0 arch/x86/kernel/signal.c:811
handle_signal_work kernel/entry/common.c:147 [inline]
exit_to_user_mode_loop kernel/entry/common.c:171 [inline]
exit_to_user_mode_prepare+0x148/0x250 kernel/entry/common.c:208
__syscall_exit_to_user_mode_work kernel/entry/common.c:290 [inline]
syscall_exit_to_user_mode+0x19/0x50 kernel/entry/common.c:301
entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x453dd9
Code: Unable to access opcode bytes at RIP 0x453daf.
RSP: 002b:00007fcbbf2d5218 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca
RAX: fffffffffffffe00 RBX: 00000000004d8268 RCX: 0000000000453dd9
RDX: 0000000000000000 RSI: 0000000000000080 RDI: 00000000004d8268
RBP: 00000000004d8260 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00000000004d826c
R13: 00007ffe178897df R14: 00007fcbbf2d5300 R15: 0000000000022000
Sending NMI from CPU 0 to CPUs 1:
NMI backtrace for cpu 1
CPU: 1 PID: 22313 Comm: syz-executor290 Not tainted 5.12.0-rc1-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:queued_write_lock_slowpath+0x131/0x270 kernel/locking/qrwlock.c:76
Code: 00 00 00 00 fc ff df 49 01 c7 41 83 c6 03 41 0f b6 07 41 38 c6 7c 08 84 c0 0f 85 fe 00 00 00 8b 03 3d 00 01 00 00 74 19 f3 90 <41> 0f b6 07 41 38 c6 7c ec 84 c0 74 e8 48 89 df e8 8a 5c 5d 00 eb
RSP: 0018:ffffc9000a37fa60 EFLAGS: 00000006
RAX: 0000000000000300 RBX: ffffffff8bc0a080 RCX: ffffffff8159ecfa
RDX: fffffbfff1781411 RSI: 0000000000000004 RDI: ffffffff8bc0a080
RBP: 00000000000000ff R08: 0000000000000001 R09: ffffffff8bc0a083
R10: fffffbfff1781410 R11: 0000000000000000 R12: 1ffff9200146ff4d
R13: ffffffff8bc0a084 R14: 0000000000000003 R15: fffffbfff1781410
FS: 0000000000000000(0000) GS:ffff8880b9d00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000004a0d38 CR3: 0000000027282000 CR4: 0000000000350ee0
Call Trace:
queued_write_lock include/asm-generic/qrwlock.h:97 [inline]
do_raw_write_lock+0x1ce/0x280 kernel/locking/spinlock_debug.c:207
exit_notify kernel/exit.c:667 [inline]
do_exit+0xc4a/0x2a60 kernel/exit.c:845
do_group_exit+0x125/0x310 kernel/exit.c:922
get_signal+0x42c/0x2100 kernel/signal.c:2773
arch_do_signal_or_restart+0x2a8/0x1eb0 arch/x86/kernel/signal.c:811
handle_signal_work kernel/entry/common.c:147 [inline]
exit_to_user_mode_loop kernel/entry/common.c:171 [inline]
exit_to_user_mode_prepare+0x148/0x250 kernel/entry/common.c:208
__syscall_exit_to_user_mode_work kernel/entry/common.c:290 [inline]
syscall_exit_to_user_mode+0x19/0x50 kernel/entry/common.c:301
entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x453dd9
Code: Unable to access opcode bytes at RIP 0x453daf.
RSP: 002b:00007fcbbf2b4218 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca
RAX: fffffffffffffe00 RBX: 00000000004d8278 RCX: 0000000000453dd9
RDX: 0000000000000000 RSI: 0000000000000080 RDI: 00000000004d8278
RBP: 00000000004d8270 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00000000004d827c
R13: 00007ffe178897df R14: 00007fcbbf2b4300 R15: 0000000000022000

Johannes Berg

Mar 2, 2021, 9:18:25 AM

to Hillf Danton, syzbot, linux-...@vger.kernel.org, linux-w...@vger.kernel.org, net...@vger.kernel.org, syzkall...@googlegroups.com

On Wed, 2021-02-24 at 10:30 +0800, Hillf Danton wrote:
>
> Add budget for the 80211 softint handler - it's feasible not to try to
> build the giant pyramid in a week.
>
> --- x/net/mac80211/main.c
> +++ y/net/mac80211/main.c
> @@ -224,9 +224,15 @@ static void ieee80211_tasklet_handler(un
>  {
>  	struct ieee80211_local *local = (struct ieee80211_local *) data;
>  	struct sk_buff *skb;
> +	int i = 0;
> +
> +	while (i++ < 64) {
> +		skb = skb_dequeue(&local->skb_queue);
> +		if (!skb)
> +			skb = skb_dequeue(&local->skb_queue_unreliable);
> +		if (!skb)
> +			return;

I guess that's not such a bad idea, but I do wonder how we get here,
userspace can submit packets faster than we can process?

It feels like a simulation-only case, tbh, since over the air you have
limits how much bandwidth you can get ... unless you have a very slow
CPU?

In any case, if you want anything merged you're going to have to submit
a proper patch with a real commit message and Signed-off-by, etc.

johannes

Dmitry Vyukov

Mar 2, 2021, 2:01:47 PM

to Johannes Berg, Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo, Hillf Danton, syzbot, LKML, linux-wireless, netdev, syzkaller-bugs

Looking at the reproducer, which mostly contains just perf_event_open,
it may be the old known issue of perf_event_open with some extreme
parameters bringing down the kernel.
+perf maintainers
And as far as I remember +Peter had some patch to restrict
perf_event_open parameters.

r0 = perf_event_open(&(0x7f000001d000)={0x1, 0x70, 0x0, 0x0, 0x0, 0x0,
0x0, 0x3ff, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0,
0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0,
0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0xfffffffe, 0x0, @perf_config_ext}, 0x0,
0x0, 0xffffffffffffffff, 0x0)

Hillf Danton

Mar 3, 2021, 3:59:39 AM

to Johannes Berg, Hillf Danton, syzbot, linux-...@vger.kernel.org, linux-w...@vger.kernel.org, syzkall...@googlegroups.com

On Tue, 02 Mar 2021 15:18:16 +0100 Johannes Berg wrote:
> On Wed, 2021-02-24 at 10:30 +0800, Hillf Danton wrote:
> >
> > Add budget for the 80211 softint handler - it's feasible not to try to
> > build the giant pyramid in a week.
> >
> > --- x/net/mac80211/main.c
> > +++ y/net/mac80211/main.c
> > @@ -224,9 +224,15 @@ static void ieee80211_tasklet_handler(un
> >  {
> >  	struct ieee80211_local *local = (struct ieee80211_local *) data;
> >  	struct sk_buff *skb;
> > +	int i = 0;
> > +
> > +	while (i++ < 64) {
> > +		skb = skb_dequeue(&local->skb_queue);
> > +		if (!skb)
> > +			skb = skb_dequeue(&local->skb_queue_unreliable);
> > +		if (!skb)
> > +			return;
>
> I guess that's not such a bad idea, but I do wonder how we get here,
> userspace can submit packets faster than we can process?

I wonder why syzbot did not make handlers other than
ieee80211_tasklet_handler stand out.

>
> It feels like a simulation-only case, tbh, since over the air you have
> limits how much bandwidth you can get ... unless you have a very slow
> CPU?

Even with a slower CPU I want to run a FIFO task every tick - it can bear
latencies like two seconds.

Dmitry Vyukov

Mar 3, 2021, 4:07:04 AM

to Hillf Danton, Johannes Berg, syzbot, LKML, linux-wireless, syzkaller-bugs

On Wed, Mar 3, 2021 at 9:59 AM Hillf Danton <hda...@sina.com> wrote:
>
> On Tue, 02 Mar 2021 15:18:16 +0100 Johannes Berg wrote:
> > On Wed, 2021-02-24 at 10:30 +0800, Hillf Danton wrote:
> > >
> > > Add budget for the 80211 softint handler - it's feasible not to try to
> > > build the giant pyramid in a week.
> > >
> > > --- x/net/mac80211/main.c
> > > +++ y/net/mac80211/main.c
> > > @@ -224,9 +224,15 @@ static void ieee80211_tasklet_handler(un
> > >  {
> > >  	struct ieee80211_local *local = (struct ieee80211_local *) data;
> > >  	struct sk_buff *skb;
> > > +	int i = 0;
> > > +
> > > +	while (i++ < 64) {
> > > +		skb = skb_dequeue(&local->skb_queue);
> > > +		if (!skb)
> > > +			skb = skb_dequeue(&local->skb_queue_unreliable);
> > > +		if (!skb)
> > > +			return;
> >
> > I guess that's not such a bad idea, but I do wonder how we get here,
> > userspace can submit packets faster than we can process?
>
> I wonder why syzbot did not make handlers other than
> ieee80211_tasklet_handler stand out.

syzbot has no relation to this whatsoever. It's just a proxy between
the kernel and you. Ask the kernel ;)

Johannes Berg

Mar 4, 2021, 3:31:32 AM

to Dmitry Vyukov, Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo, Hillf Danton, syzbot, LKML, linux-wireless, netdev, syzkaller-bugs

On Tue, 2021-03-02 at 20:01 +0100, Dmitry Vyukov wrote:
>
> Looking at the reproducer, which mostly contains just perf_event_open,
> it may be the old known issue of perf_event_open with some extreme
> parameters bringing down the kernel.
> +perf maintainers
> And as far as I remember +Peter had some patch to restrict
> perf_event_open parameters.
>
> r0 = perf_event_open(&(0x7f000001d000)={0x1, 0x70, 0x0, 0x0, 0x0, 0x0,
> 0x0, 0x3ff, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0,
> 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0,
> 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0xfffffffe, 0x0, @perf_config_ext}, 0x0,
> 0x0, 0xffffffffffffffff, 0x0)

Oh! Thanks for looking.

Seems that also applies to

https://syzkaller.appspot.com/bug?extid=d6219cf21f26bdfcc22e

FWIW. I was still tracking that one, but never had a chance to look at
it (also way down the list since it was reported as directly in hwsim)

johannes

Hillf Danton

Nov 13, 2022, 4:26:06 AM

to syzbot, linux-...@vger.kernel.org, syzkall...@googlegroups.com

On 02 Mar 2021 00:10:22 -0800

> syzbot has found a reproducer for the following issue on:
>
> HEAD commit: 7a7fd0de Merge branch 'kmap-conversion-for-5.12' of git://..
> git tree: upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=14df34ead00000
> kernel config: https://syzkaller.appspot.com/x/.config?x=e0da2d01cc636e2c
> dashboard link: https://syzkaller.appspot.com/bug?extid=27df43cf7ae73de7d8ee
> syz repro: https://syzkaller.appspot.com/x/repro.syz?x=154a476cd00000
> C reproducer: https://syzkaller.appspot.com/x/repro.c?x=1152fb82d00000

Add debug info.

#syz test https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 7a7fd0de

--- x/net/mac80211/main.c
+++ m/net/mac80211/main.c
@@ -224,6 +224,7 @@ static void ieee80211_tasklet_handler(st
 {
 	struct ieee80211_local *local = from_tasklet(local, t, tasklet);
 	struct sk_buff *skb;
+	unsigned long ts = jiffies + 2 * HZ;

 	while ((skb = skb_dequeue(&local->skb_queue)) ||
 	       (skb = skb_dequeue(&local->skb_queue_unreliable))) {
@@ -244,6 +245,9 @@ static void ieee80211_tasklet_handler(st
 			dev_kfree_skb(skb);
 			break;
 		}
+
+		if (WARN_ON_ONCE(time_after(jiffies, ts)))
+			break;
 	}
 }
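
The deadline check in this debug patch relies on `time_after()`, which compares tick counters in a way that survives counter wraparound. Below is a userspace sketch of that comparison. `ticks_after`, `deadline_from`, and `deadline_expired` are hypothetical names, and the 32-bit counter width is an assumption chosen to make the wraparound easy to see (the kernel macro operates on `unsigned long`).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Wraparound-safe "a is later than b" for a free-running tick counter.
 * The unsigned subtraction wraps modulo 2^32; reinterpreting the result
 * as a signed value (two's complement assumed) gives the signed distance
 * between the two readings. Valid while a and b are less than 2^31 ticks
 * apart.
 */
static bool ticks_after(uint32_t a, uint32_t b)
{
	return (int32_t)(b - a) < 0;
}

/* Compute a deadline `timeout` ticks after `now`; the sum may wrap. */
static uint32_t deadline_from(uint32_t now, uint32_t timeout)
{
	assert(timeout < INT32_MAX);
	return now + timeout;
}

/* True once `now` has moved strictly past `deadline`. */
static bool deadline_expired(uint32_t now, uint32_t deadline)
{
	return ticks_after(now, deadline);
}
```

In the patch, `ts = jiffies + 2 * HZ` plays the role of the deadline, and the loop breaks (with a one-time warning) once `jiffies` has passed it.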

--

syzbot

Nov 13, 2022, 5:29:21 AM

to hda...@sina.com, linux-...@vger.kernel.org, syzkall...@googlegroups.com

Hello,

syzbot has tested the proposed patch and the reproducer did not trigger any issue:

Reported-and-tested-by: syzbot+27df43...@syzkaller.appspotmail.com

Tested on:

commit: 7a7fd0de Merge branch 'kmap-conversion-for-5.12' of gi..
git tree: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
console output: https://syzkaller.appspot.com/x/log.txt?x=148ea079880000
kernel config: https://syzkaller.appspot.com/x/.config?x=ad1d200e85d8538d
dashboard link: https://syzkaller.appspot.com/bug?extid=27df43cf7ae73de7d8ee
compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
patch: https://syzkaller.appspot.com/x/patch.diff?x=147b763e880000

Note: testing is done by a robot and is best-effort only.

syzbot

Jan 7, 2023, 3:49:39 AM

to syzkall...@googlegroups.com

Auto-closing this bug as obsolete.
No recent activity, existing reproducers are no longer triggering the issue.
