[syzbot] [wireless?] INFO: task hung in cfg80211_event_work (2)

9 views
Skip to first unread message

syzbot

unread,
Jul 12, 2023, 8:52:53 AM7/12/23
to da...@davemloft.net, edum...@google.com, joha...@sipsolutions.net, ku...@kernel.org, linux-...@vger.kernel.org, linux-w...@vger.kernel.org, net...@vger.kernel.org, pab...@redhat.com, syzkall...@googlegroups.com
Hello,

syzbot found the following issue on:

HEAD commit: 123212f53f3e Add linux-next specific files for 20230707
git tree: linux-next
console output: https://syzkaller.appspot.com/x/log.txt?x=12442688a80000
kernel config: https://syzkaller.appspot.com/x/.config?x=15ec80b62f588543
dashboard link: https://syzkaller.appspot.com/bug?extid=85f0eb24e10cec9b8a10
compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=1049ee88a80000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=12b31cbca80000

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/098f7ee2237c/disk-123212f5.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/88defebbfc49/vmlinux-123212f5.xz
kernel image: https://storage.googleapis.com/syzbot-assets/d5e9343ec16a/bzImage-123212f5.xz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+85f0eb...@syzkaller.appspotmail.com

INFO: task kworker/u4:0:10 blocked for more than 143 seconds.
Not tainted 6.4.0-next-20230707-syzkaller #0
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:kworker/u4:0 state:D
stack:25000 pid:10 ppid:2 flags:0x00004000
Workqueue: cfg80211 cfg80211_event_work

Call Trace:
<TASK>
context_switch kernel/sched/core.c:5380 [inline]
__schedule+0xc9a/0x5880 kernel/sched/core.c:6709
schedule+0xde/0x1a0 kernel/sched/core.c:6785
schedule_preempt_disabled+0x13/0x20 kernel/sched/core.c:6844
__mutex_lock_common kernel/locking/mutex.c:679 [inline]
__mutex_lock+0xa3b/0x1350 kernel/locking/mutex.c:747
wiphy_lock include/net/cfg80211.h:5773 [inline]
cfg80211_event_work+0x1f/0x40 net/wireless/core.c:332
process_one_work+0xa34/0x16f0 kernel/workqueue.c:2597
worker_thread+0x67d/0x10c0 kernel/workqueue.c:2748
kthread+0x344/0x440 kernel/kthread.c:389
ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:308
</TASK>

Showing all locks held in the system:
3 locks held by kworker/u4:0/10:
#0:
ffff888016688938
(
(wq_completion)cfg80211){+.+.}-{0:0}, at: arch_atomic64_set arch/x86/include/asm/atomic64_64.h:20 [inline]
(wq_completion)cfg80211){+.+.}-{0:0}, at: raw_atomic64_set include/linux/atomic/atomic-arch-fallback.h:2608 [inline]
(wq_completion)cfg80211){+.+.}-{0:0}, at: raw_atomic_long_set include/linux/atomic/atomic-long.h:79 [inline]
(wq_completion)cfg80211){+.+.}-{0:0}, at: atomic_long_set include/linux/atomic/atomic-instrumented.h:3196 [inline]
(wq_completion)cfg80211){+.+.}-{0:0}, at: set_work_data kernel/workqueue.c:675 [inline]
(wq_completion)cfg80211){+.+.}-{0:0}, at: set_work_pool_and_clear_pending kernel/workqueue.c:702 [inline]
(wq_completion)cfg80211){+.+.}-{0:0}, at: process_one_work+0x8fd/0x16f0 kernel/workqueue.c:2567
#1: ffffc900000f7db0 ((work_completion)(&rdev->event_work)){+.+.}-{0:0}, at: process_one_work+0x930/0x16f0 kernel/workqueue.c:2571
#2: ffff88807dc48768 (&rdev->wiphy.mtx){+.+.}-{3:3}, at: wiphy_lock include/net/cfg80211.h:5773 [inline]
#2: ffff88807dc48768 (&rdev->wiphy.mtx){+.+.}-{3:3}, at: cfg80211_event_work+0x1f/0x40 net/wireless/core.c:332
1 lock held by rcu_tasks_kthre/13:
#0:
ffffffff8c9a3db0 (rcu_tasks.tasks_gp_mutex){+.+.}-{3:3}, at: rcu_tasks_one_gp+0x31/0xd80 kernel/rcu/tasks.h:522
1 lock held by rcu_tasks_trace/14:
#0: ffffffff8c9a3ab0 (rcu_tasks_trace.tasks_gp_mutex){+.+.}-{3:3}, at: rcu_tasks_one_gp+0x31/0xd80 kernel/rcu/tasks.h:522
1 lock held by khungtaskd/28:
#0: ffffffff8c9a49c0 (rcu_read_lock){....}-{1:2}, at: debug_show_all_locks+0x55/0x340 kernel/locking/lockdep.c:6615
2 locks held by getty/4776:
#0:
ffff8880287e2098 (&tty->ldisc_sem){++++}-{0:0}, at: tty_ldisc_ref_wait+0x26/0x80 drivers/tty/tty_ldisc.c:243
#1: ffffc900015872f0 (&ldata->atomic_read_lock){+.+.}-{3:3}, at: n_tty_read+0xf08/0x13f0 drivers/tty/n_tty.c:2187
8 locks held by kworker/0:0/5070:

=============================================

NMI backtrace for cpu 1
CPU: 1 PID: 28 Comm: khungtaskd Not tainted 6.4.0-next-20230707-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 07/03/2023
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0xd9/0x150 lib/dump_stack.c:106
nmi_cpu_backtrace+0x29c/0x350 lib/nmi_backtrace.c:113
nmi_trigger_cpumask_backtrace+0x2a4/0x300 lib/nmi_backtrace.c:62
trigger_all_cpu_backtrace include/linux/nmi.h:160 [inline]
check_hung_uninterruptible_tasks kernel/hung_task.c:222 [inline]
watchdog+0xe16/0x1090 kernel/hung_task.c:379
kthread+0x344/0x440 kernel/kthread.c:389
ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:308
</TASK>
Sending NMI from CPU 1 to CPUs 0:
NMI backtrace for cpu 0
CPU: 0 PID: 5070 Comm: kworker/0:0 Not tainted 6.4.0-next-20230707-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 07/03/2023
Workqueue: events cfg80211_wiphy_work
RIP: 0010:arch_static_branch arch/x86/include/asm/jump_label.h:27 [inline]
RIP: 0010:static_key_false include/linux/jump_label.h:207 [inline]
RIP: 0010:trace_irq_enable.constprop.0+0x2/0x100 include/trace/events/preemptirq.h:40
Code: 00 75 0b 48 83 ec 80 5b 5d 41 5c 41 5d c3 e8 45 74 94 08 0f 1f 44 00 00 f3 0f 1e fa c3 66 66 2e 0f 1f 84 00 00 00 00 00 55 53 <66> 90 65 8b 1d b1 2b 78 7e 83 fb 07 0f 87 cb 00 00 00 89 db be 08
RSP: 0018:ffffc90000007d4SeaBIOS (version 1.8.2-google)


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzk...@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

If the bug is already fixed, let syzbot know by replying with:
#syz fix: exact-commit-title

If you want syzbot to run the reproducer, reply with:
#syz test: git://repo/address.git branch-or-commit-hash
If you attach or paste a git patch, syzbot will apply it before testing.

If you want to change bug's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)

If the bug is a duplicate of another bug, reply with:
#syz dup: exact-subject-of-another-report

If you want to undo deduplication, reply with:
#syz undup

Johannes Berg

unread,
Aug 16, 2023, 4:27:16 AM8/16/23
to syzbot, linux-w...@vger.kernel.org, pab...@redhat.com, syzkall...@googlegroups.com
Hi,

So this is interesting. In the report in the email, we can't really see
all that much. But, in the sample report shown on the dashboard:

> dashboard link: https://syzkaller.appspot.com/bug?extid=85f0eb24e10cec9b8a10

we see:

> INFO: task kworker/u4:4:57 blocked for more than 143 seconds.
> Not tainted 6.5.0-rc3-next-20230728-syzkaller #0
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> task:kworker/u4:4 state:D
> stack:24336 pid:57 ppid:2 flags:0x00004000
> Workqueue: cfg80211 cfg80211_event_work
>
> Call Trace:
> <TASK>
> context_switch kernel/sched/core.c:5382 [inline]
> __schedule+0xee1/0x59f0 kernel/sched/core.c:6711
> schedule+0xe7/0x1b0 kernel/sched/core.c:6787
> schedule_preempt_disabled+0x13/0x20 kernel/sched/core.c:6846
> __mutex_lock_common kernel/locking/mutex.c:679 [inline]
> __mutex_lock+0x967/0x1340 kernel/locking/mutex.c:747
> wiphy_lock include/net/cfg80211.h:5773 [inline]
> cfg80211_event_work+0x1f/0x40 net/wireless/core.c:332

which OK, we're blocked on the lock - but not clear why. However we get
shown the other CPUs, and one of them is:

> Sending NMI from CPU 0 to CPUs 1:
> NMI backtrace for cpu 1
> CPU: 1 PID: 3155 Comm: kworker/1:2 Not tainted 6.5.0-rc3-next-20230728-syzkaller #0
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 07/12/2023
> Workqueue: events cfg80211_wiphy_work
> RIP: 0010:io_serial_in+0x87/0xa0 drivers/tty/serial/8250/8250_port.c:410
> Code: ad ee fc 48 8d 7d 40 44 89 e1 48 b8 00 00 00 00 00 fc ff df 48 89 fa d3 e3 48 c1 ea 03 80 3c 02 00 75 16 66 03 5d 40 89 da ec <5b> 0f b6 c0 5d 41 5c c3 e8 2c 7b 43 fd eb a6 e8 85 7b 43 fd eb e3
> RSP: 0018:ffffc9000cdb7128 EFLAGS: 00000002
> RAX: dffffc0000000000 RBX: 00000000000003fd RCX: 0000000000000000
> RDX: 00000000000003fd RSI: ffffffff849856d0 RDI: ffffffff92318020
> RBP: ffffffff92317fe0 R08: 0000000000000001 R09: 000000000000001f
> R10: 0000000000000000 R11: 205d383254202020 R12: 0000000000000000
> R13: 0000000000000020 R14: fffffbfff2463055 R15: dffffc0000000000
> FS: 0000000000000000(0000) GS:ffff8880b9900000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00007f608b73d868 CR3: 000000002bdaf000 CR4: 00000000003506e0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> Call Trace:
> <NMI>
> </NMI>
> <TASK>
> serial_in drivers/tty/serial/8250/8250.h:117 [inline]
> serial_lsr_in drivers/tty/serial/8250/8250.h:139 [inline]
> wait_for_lsr+0xd7/0x180 drivers/tty/serial/8250/8250_port.c:2081
> serial8250_console_fifo_write drivers/tty/serial/8250/8250_port.c:3365 [inline]
> serial8250_console_write+0xce5/0x1060 drivers/tty/serial/8250/8250_port.c:3443
> console_emit_next_record kernel/printk/printk.c:2910 [inline]
> console_flush_all+0x4e8/0xf70 kernel/printk/printk.c:2966
> console_unlock+0x10c/0x260 kernel/printk/printk.c:3035
> vprintk_emit+0x189/0x630 kernel/printk/printk.c:2307
> vprintk+0x7b/0x90 kernel/printk/printk_safe.c:45
> _printk+0xc8/0x100 kernel/printk/printk.c:2332
> ieee80211_parse_ch_switch_ie+0xeb0/0x1030 net/mac80211/spectmgmt.c:88
> ieee80211_ibss_process_chanswitch.constprop.0+0x274/0xfd0 net/mac80211/ibss.c:803
> ieee80211_rx_mgmt_spectrum_mgmt net/mac80211/ibss.c:931 [inline]
> ieee80211_ibss_rx_queued_mgmt+0x23cf/0x2ee0 net/mac80211/ibss.c:1666
> ieee80211_iface_process_skb net/mac80211/iface.c:1604 [inline]
> ieee80211_iface_work+0xa1f/0xd40 net/mac80211/iface.c:1658
> cfg80211_wiphy_work+0x24e/0x330 net/wireless/core.c:435

So indeed another worker is holding the mutex it's waiting for, but some
log message is taking _forever_ to get out there.

Not sure we can do anything about that, seems more like a syzbot
artifact? I'm not even sure why serial would take forever in a VM :)

johannes

syzbot

unread,
Sep 16, 2023, 6:46:29 PM9/16/23
to big...@linutronix.de, da...@davemloft.net, edum...@google.com, joha...@sipsolutions.net, kernelj...@gmail.com, ku...@kernel.org, linux-...@vger.kernel.org, linux-w...@vger.kernel.org, net...@vger.kernel.org, pab...@redhat.com, syzkall...@googlegroups.com, tg...@linutronix.de
syzbot has bisected this issue to:

commit d15121be7485655129101f3960ae6add40204463
Author: Paolo Abeni <pab...@redhat.com>
Date: Mon May 8 06:17:44 2023 +0000

Revert "softirq: Let ksoftirqd do its job"

bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=1411bef8680000
start commit: 727dbda16b83 Merge tag 'hardening-v6.6-rc1' of git://git.k..
git tree: upstream
final oops: https://syzkaller.appspot.com/x/report.txt?x=1611bef8680000
console output: https://syzkaller.appspot.com/x/log.txt?x=1211bef8680000
kernel config: https://syzkaller.appspot.com/x/.config?x=2c13b5305ee70d27
dashboard link: https://syzkaller.appspot.com/bug?extid=85f0eb24e10cec9b8a10
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=15913b90680000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=156720a8680000

Reported-by: syzbot+85f0eb...@syzkaller.appspotmail.com
Fixes: d15121be7485 ("Revert "softirq: Let ksoftirqd do its job"")

For information about bisection process see: https://goo.gl/tpsmEJ#bisection
Reply all
Reply to author
Forward
0 new messages