[syzbot] [net?] [nfc?] INFO: task hung in nfc_targets_found

12 views
Skip to first unread message

syzbot

unread,
Jan 3, 2024, 7:59:27 PMJan 3
to da...@davemloft.net, edum...@google.com, krzysztof...@linaro.org, ku...@kernel.org, linux-...@vger.kernel.org, net...@vger.kernel.org, pab...@redhat.com, syzkall...@googlegroups.com
Hello,

syzbot found the following issue on:

HEAD commit: 453f5db0619e Merge tag 'trace-v6.7-rc7' of git://git.kerne..
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=141bc48de80000
kernel config: https://syzkaller.appspot.com/x/.config?x=f8e72bae38c079e4
dashboard link: https://syzkaller.appspot.com/bug?extid=2b131f51bb4af224ab40
compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40

Unfortunately, I don't have any reproducer for this issue yet.

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/6ed2cada0823/disk-453f5db0.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/1fa106f79073/vmlinux-453f5db0.xz
kernel image: https://storage.googleapis.com/syzbot-assets/c75ff5078c5c/bzImage-453f5db0.xz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+2b131f...@syzkaller.appspotmail.com

INFO: task kworker/u4:93:7607 blocked for more than 143 seconds.
Not tainted 6.7.0-rc7-syzkaller-00049-g453f5db0619e #0
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:kworker/u4:93 state:D stack:20248 pid:7607 tgid:7607 ppid:2 flags:0x00004000
Workqueue: nfc2_nci_rx_wq nci_rx_work
Call Trace:
<TASK>
context_switch kernel/sched/core.c:5376 [inline]
__schedule+0x1961/0x4ab0 kernel/sched/core.c:6688
__schedule_loop kernel/sched/core.c:6763 [inline]
schedule+0x149/0x260 kernel/sched/core.c:6778
schedule_preempt_disabled+0x13/0x20 kernel/sched/core.c:6835
__mutex_lock_common kernel/locking/mutex.c:679 [inline]
__mutex_lock+0x6a3/0xd60 kernel/locking/mutex.c:747
device_lock include/linux/device.h:992 [inline]
nfc_targets_found+0x26f/0x590 net/nfc/core.c:778
nci_ntf_packet+0x4431/0x4f40
nci_rx_work+0x14c/0x2b0 net/nfc/nci/core.c:1522
process_one_work kernel/workqueue.c:2627 [inline]
process_scheduled_works+0x90f/0x1420 kernel/workqueue.c:2700
worker_thread+0xa5f/0x1000 kernel/workqueue.c:2781
kthread+0x2d3/0x370 kernel/kthread.c:388
ret_from_fork+0x48/0x80 arch/x86/kernel/process.c:147
ret_from_fork_asm+0x11/0x20 arch/x86/entry/entry_64.S:242
</TASK>
INFO: task kworker/0:1:11541 blocked for more than 143 seconds.
Not tainted 6.7.0-rc7-syzkaller-00049-g453f5db0619e #0
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:kworker/0:1 state:D stack:23280 pid:11541 tgid:11541 ppid:2 flags:0x00004000
Workqueue: events nfc_urelease_event_work
Call Trace:
<TASK>
context_switch kernel/sched/core.c:5376 [inline]
__schedule+0x1961/0x4ab0 kernel/sched/core.c:6688
__schedule_loop kernel/sched/core.c:6763 [inline]
schedule+0x149/0x260 kernel/sched/core.c:6778
schedule_preempt_disabled+0x13/0x20 kernel/sched/core.c:6835
__mutex_lock_common kernel/locking/mutex.c:679 [inline]
__mutex_lock+0x6a3/0xd60 kernel/locking/mutex.c:747
nci_request net/nfc/nci/core.c:149 [inline]
nci_stop_poll+0x123/0x480 net/nfc/nci/core.c:873
nfc_stop_poll+0xea/0x220 net/nfc/core.c:259
nfc_urelease_event_work+0x160/0x2f0 net/nfc/netlink.c:1852
process_one_work kernel/workqueue.c:2627 [inline]
process_scheduled_works+0x90f/0x1420 kernel/workqueue.c:2700
worker_thread+0xa5f/0x1000 kernel/workqueue.c:2781
kthread+0x2d3/0x370 kernel/kthread.c:388
ret_from_fork+0x48/0x80 arch/x86/kernel/process.c:147
ret_from_fork_asm+0x11/0x20 arch/x86/entry/entry_64.S:242
</TASK>
INFO: task syz-executor.1:27827 blocked for more than 143 seconds.
Not tainted 6.7.0-rc7-syzkaller-00049-g453f5db0619e #0
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:syz-executor.1 state:D stack:25392 pid:27827 tgid:27824 ppid:27038 flags:0x00004006
Call Trace:
<TASK>
context_switch kernel/sched/core.c:5376 [inline]
__schedule+0x1961/0x4ab0 kernel/sched/core.c:6688
__schedule_loop kernel/sched/core.c:6763 [inline]
schedule+0x149/0x260 kernel/sched/core.c:6778
schedule_timeout+0xb0/0x300 kernel/time/timer.c:2143
do_wait_for_common kernel/sched/completion.c:95 [inline]
__wait_for_common kernel/sched/completion.c:116 [inline]
wait_for_common kernel/sched/completion.c:127 [inline]
wait_for_completion+0x354/0x620 kernel/sched/completion.c:148
__flush_workqueue+0x730/0x1630 kernel/workqueue.c:3192
nci_close_device+0x193/0x600 net/nfc/nci/core.c:579
nci_unregister_device+0x40/0x240 net/nfc/nci/core.c:1297
virtual_ncidev_close+0x59/0x90 drivers/nfc/virtual_ncidev.c:168
__fput+0x3cc/0x9e0 fs/file_table.c:394
task_work_run+0x24a/0x300 kernel/task_work.c:180
get_signal+0x166e/0x1840 kernel/signal.c:2680
arch_do_signal_or_restart+0x96/0x860 arch/x86/kernel/signal.c:309
exit_to_user_mode_loop+0x6a/0x100 kernel/entry/common.c:168
exit_to_user_mode_prepare+0xb1/0x150 kernel/entry/common.c:204
__syscall_exit_to_user_mode_work kernel/entry/common.c:285 [inline]
syscall_exit_to_user_mode+0x69/0x2a0 kernel/entry/common.c:296
do_syscall_64+0x52/0x110 arch/x86/entry/common.c:89
entry_SYSCALL_64_after_hwframe+0x63/0x6b
RIP: 0033:0x7fab9d07cce9
RSP: 002b:00007fab9ddde0c8 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
RAX: fffffffffffffff2 RBX: 00007fab9d19bf80 RCX: 00007fab9d07cce9
RDX: 0000000000000064 RSI: 0000000020000500 RDI: 0000000000000003
RBP: 00007fab9d0c947a R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 000000000000000b R14: 00007fab9d19bf80 R15: 00007ffe08b6c548
</TASK>

Showing all locks held in the system:
1 lock held by khungtaskd/29:
#0: ffffffff8d92dae0 (rcu_read_lock){....}-{1:2}, at: rcu_lock_acquire include/linux/rcupdate.h:301 [inline]
#0: ffffffff8d92dae0 (rcu_read_lock){....}-{1:2}, at: rcu_read_lock include/linux/rcupdate.h:747 [inline]
#0: ffffffff8d92dae0 (rcu_read_lock){....}-{1:2}, at: debug_show_all_locks+0x55/0x2a0 kernel/locking/lockdep.c:6614
2 locks held by getty/4825:
#0: ffff88802a6320a0 (&tty->ldisc_sem){++++}-{0:0}, at: tty_ldisc_ref_wait+0x25/0x70 drivers/tty/tty_ldisc.c:243
#1: ffffc90002f062f0 (&ldata->atomic_read_lock){+.+.}-{3:3}, at: n_tty_read+0x6b4/0x1e10 drivers/tty/n_tty.c:2201
3 locks held by kworker/u4:93/7607:
#0: ffff8880368b3938 ((wq_completion)nfc2_nci_rx_wq){+.+.}-{0:0}, at: process_one_work kernel/workqueue.c:2602 [inline]
#0: ffff8880368b3938 ((wq_completion)nfc2_nci_rx_wq){+.+.}-{0:0}, at: process_scheduled_works+0x825/0x1420 kernel/workqueue.c:2700
#1: ffffc900156b7d20 ((work_completion)(&ndev->rx_work)){+.+.}-{0:0}, at: process_one_work kernel/workqueue.c:2602 [inline]
#1: ffffc900156b7d20 ((work_completion)(&ndev->rx_work)){+.+.}-{0:0}, at: process_scheduled_works+0x825/0x1420 kernel/workqueue.c:2700
#2: ffff88802bed5100 (&dev->mutex){....}-{3:3}, at: device_lock include/linux/device.h:992 [inline]
#2: ffff88802bed5100 (&dev->mutex){....}-{3:3}, at: nfc_targets_found+0x26f/0x590 net/nfc/core.c:778
6 locks held by kworker/0:1/11541:
#0: ffff888012c70d38 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work kernel/workqueue.c:2602 [inline]
#0: ffff888012c70d38 ((wq_completion)events){+.+.}-{0:0}, at: process_scheduled_works+0x825/0x1420 kernel/workqueue.c:2700
#1: ffffc90004a6fd20 ((work_completion)(&w->w)){+.+.}-{0:0}, at: process_one_work kernel/workqueue.c:2602 [inline]
#1: ffffc90004a6fd20 ((work_completion)(&w->w)){+.+.}-{0:0}, at: process_scheduled_works+0x825/0x1420 kernel/workqueue.c:2700
#2: ffffffff8ee4d808 (nfc_devlist_mutex){+.+.}-{3:3}, at: nfc_urelease_event_work+0xa7/0x2f0 net/nfc/netlink.c:1843
#3: ffff88802bed5508 (&genl_data->genl_data_mutex){+.+.}-{3:3}, at: nfc_urelease_event_work+0x111/0x2f0 net/nfc/netlink.c:1849
#4: ffff88802bed5100 (&dev->mutex){....}-{3:3}, at: device_lock include/linux/device.h:992 [inline]
#4: ffff88802bed5100 (&dev->mutex){....}-{3:3}, at: nfc_stop_poll+0x3c/0x220 net/nfc/core.c:247
#5: ffff88802bed4350 (&ndev->req_lock){+.+.}-{3:3}, at: nci_request net/nfc/nci/core.c:149 [inline]
#5: ffff88802bed4350 (&ndev->req_lock){+.+.}-{3:3}, at: nci_stop_poll+0x123/0x480 net/nfc/nci/core.c:873
3 locks held by kworker/1:2/16113:
#0: ffff888012c70d38 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work kernel/workqueue.c:2602 [inline]
#0: ffff888012c70d38 ((wq_completion)events){+.+.}-{0:0}, at: process_scheduled_works+0x825/0x1420 kernel/workqueue.c:2700
#1: ffffc9000322fd20 ((work_completion)(&w->w)){+.+.}-{0:0}, at: process_one_work kernel/workqueue.c:2602 [inline]
#1: ffffc9000322fd20 ((work_completion)(&w->w)){+.+.}-{0:0}, at: process_scheduled_works+0x825/0x1420 kernel/workqueue.c:2700
#2: ffffffff8ee4d808 (nfc_devlist_mutex){+.+.}-{3:3}, at: nfc_urelease_event_work+0xa7/0x2f0 net/nfc/netlink.c:1843
3 locks held by kworker/1:9/17613:
#0: ffff888012c70d38 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work kernel/workqueue.c:2602 [inline]
#0: ffff888012c70d38 ((wq_completion)events){+.+.}-{0:0}, at: process_scheduled_works+0x825/0x1420 kernel/workqueue.c:2700
#1: ffffc9000d927d20 ((work_completion)(&w->w)){+.+.}-{0:0}, at: process_one_work kernel/workqueue.c:2602 [inline]
#1: ffffc9000d927d20 ((work_completion)(&w->w)){+.+.}-{0:0}, at: process_scheduled_works+0x825/0x1420 kernel/workqueue.c:2700
#2: ffffffff8ee4d808 (nfc_devlist_mutex){+.+.}-{3:3}, at: nfc_urelease_event_work+0xa7/0x2f0 net/nfc/netlink.c:1843
3 locks held by kworker/0:5/22239:
#0: ffff888012c70d38 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work kernel/workqueue.c:2602 [inline]
#0: ffff888012c70d38 ((wq_completion)events){+.+.}-{0:0}, at: process_scheduled_works+0x825/0x1420 kernel/workqueue.c:2700
#1: ffffc90014e57d20 ((work_completion)(&w->w)){+.+.}-{0:0}, at: process_one_work kernel/workqueue.c:2602 [inline]
#1: ffffc90014e57d20 ((work_completion)(&w->w)){+.+.}-{0:0}, at: process_scheduled_works+0x825/0x1420 kernel/workqueue.c:2700
#2: ffffffff8ee4d808 (nfc_devlist_mutex){+.+.}-{3:3}, at: nfc_urelease_event_work+0xa7/0x2f0 net/nfc/netlink.c:1843
3 locks held by kworker/1:0/23072:
#0: ffff888012c70d38 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work kernel/workqueue.c:2602 [inline]
#0: ffff888012c70d38 ((wq_completion)events){+.+.}-{0:0}, at: process_scheduled_works+0x825/0x1420 kernel/workqueue.c:2700
#1: ffffc9000a5afd20 ((work_completion)(&w->w)){+.+.}-{0:0}, at: process_one_work kernel/workqueue.c:2602 [inline]
#1: ffffc9000a5afd20 ((work_completion)(&w->w)){+.+.}-{0:0}, at: process_scheduled_works+0x825/0x1420 kernel/workqueue.c:2700
#2: ffffffff8ee4d808 (nfc_devlist_mutex){+.+.}-{3:3}, at: nfc_urelease_event_work+0xa7/0x2f0 net/nfc/netlink.c:1843
1 lock held by syz-executor.1/27827:
#0: ffff88802bed4350 (&ndev->req_lock){+.+.}-{3:3}, at: nci_close_device+0x10a/0x600 net/nfc/nci/core.c:561
2 locks held by syz-executor.1/27953:
#0: ffffffff8e0dcb48 (misc_mtx){+.+.}-{3:3}, at: misc_open+0x5c/0x380 drivers/char/misc.c:129
#1: ffffffff8ee4d808 (nfc_devlist_mutex){+.+.}-{3:3}, at: nfc_register_device+0x3c/0x320 net/nfc/core.c:1116
1 lock held by syz-executor.5/27976:
#0: ffffffff8e0dcb48 (misc_mtx){+.+.}-{3:3}, at: misc_open+0x5c/0x380 drivers/char/misc.c:129
1 lock held by syz-executor.5/27979:
#0: ffffffff8e0dcb48 (misc_mtx){+.+.}-{3:3}, at: misc_open+0x5c/0x380 drivers/char/misc.c:129
1 lock held by syz-executor.5/27981:
#0: ffffffff8e0dcb48 (misc_mtx){+.+.}-{3:3}, at: misc_open+0x5c/0x380 drivers/char/misc.c:129
1 lock held by syz-executor.5/27984:
#0: ffffffff8e0dcb48 (misc_mtx){+.+.}-{3:3}, at: misc_open+0x5c/0x380 drivers/char/misc.c:129
1 lock held by syz-executor.5/27986:
#0: ffffffff8e0dcb48 (misc_mtx){+.+.}-{3:3}, at: misc_open+0x5c/0x380 drivers/char/misc.c:129
1 lock held by syz-executor.1/28013:
#0: ffffffff8e0dcb48 (misc_mtx){+.+.}-{3:3}, at: misc_open+0x5c/0x380 drivers/char/misc.c:129

=============================================

NMI backtrace for cpu 0
CPU: 0 PID: 29 Comm: khungtaskd Not tainted 6.7.0-rc7-syzkaller-00049-g453f5db0619e #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 11/17/2023
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0x1e7/0x2d0 lib/dump_stack.c:106
nmi_cpu_backtrace+0x498/0x4d0 lib/nmi_backtrace.c:113
nmi_trigger_cpumask_backtrace+0x198/0x310 lib/nmi_backtrace.c:62
trigger_all_cpu_backtrace include/linux/nmi.h:160 [inline]
check_hung_uninterruptible_tasks kernel/hung_task.c:222 [inline]
watchdog+0xfaf/0xff0 kernel/hung_task.c:379
kthread+0x2d3/0x370 kernel/kthread.c:388
ret_from_fork+0x48/0x80 arch/x86/kernel/process.c:147
ret_from_fork_asm+0x11/0x20 arch/x86/entry/entry_64.S:242
</TASK>
Sending NMI from CPU 0 to CPUs 1:
NMI backtrace for cpu 1
CPU: 1 PID: 7038 Comm: kworker/u4:23 Not tainted 6.7.0-rc7-syzkaller-00049-g453f5db0619e #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 11/17/2023
Workqueue: bat_events batadv_nc_worker
RIP: 0010:bytes_is_nonzero mm/kasan/generic.c:84 [inline]
RIP: 0010:memory_is_nonzero mm/kasan/generic.c:102 [inline]
RIP: 0010:memory_is_poisoned_n mm/kasan/generic.c:127 [inline]
RIP: 0010:memory_is_poisoned mm/kasan/generic.c:159 [inline]
RIP: 0010:check_region_inline mm/kasan/generic.c:178 [inline]
RIP: 0010:kasan_check_range+0x79/0x290 mm/kasan/generic.c:187
Code: 4d 89 c1 49 c1 e9 03 49 be 01 00 00 00 00 fc ff df 4f 8d 3c 31 4c 89 fd 4c 29 dd 48 83 fd 10 7f 29 48 85 ed 0f 84 3e 01 00 00 <4c> 89 cd 48 f7 d5 48 01 dd 41 80 3b 00 0f 85 c5 01 00 00 49 ff c3
RSP: 0018:ffffc9000d9379f0 EFLAGS: 00000202
RAX: 0000000000000001 RBX: 1ffff1100f2ad320 RCX: ffffffff816e8191
RDX: 0000000000000001 RSI: 0000000000000004 RDI: ffff888079569900
RBP: 0000000000000001 R08: ffff888079569903 R09: 1ffff1100f2ad320
R10: dffffc0000000000 R11: ffffed100f2ad320 R12: ffff888079569900
R13: 1ffff92001b26f50 R14: dffffc0000000001 R15: ffffed100f2ad321
FS: 0000000000000000(0000) GS:ffff8880b9900000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000000c00a3bd000 CR3: 000000000d731000 CR4: 00000000003506f0
DR0: 000000000000d8dd DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<NMI>
</NMI>
<TASK>
instrument_atomic_read_write include/linux/instrumented.h:96 [inline]
atomic_try_cmpxchg_acquire include/linux/atomic/atomic-instrumented.h:1294 [inline]
queued_spin_lock include/asm-generic/qspinlock.h:111 [inline]
do_raw_spin_lock+0x141/0x370 kernel/locking/spinlock_debug.c:115
spin_lock_bh include/linux/spinlock.h:356 [inline]
batadv_nc_purge_paths+0xe8/0x3a0 net/batman-adv/network-coding.c:442
batadv_nc_worker+0x328/0x610 net/batman-adv/network-coding.c:720
process_one_work kernel/workqueue.c:2627 [inline]
process_scheduled_works+0x90f/0x1420 kernel/workqueue.c:2700
worker_thread+0xa5f/0x1000 kernel/workqueue.c:2781
kthread+0x2d3/0x370 kernel/kthread.c:388
ret_from_fork+0x48/0x80 arch/x86/kernel/process.c:147
ret_from_fork_asm+0x11/0x20 arch/x86/entry/entry_64.S:242
</TASK>


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzk...@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

If the report is already addressed, let syzbot know by replying with:
#syz fix: exact-commit-title

If you want to overwrite report's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)

If the report is a duplicate of another one, reply with:
#syz dup: exact-subject-of-another-report

If you want to undo deduplication, reply with:
#syz undup

Hillf Danton

unread,
Jan 4, 2024, 12:05:19 AMJan 4
to syzbot, krzysztof...@linaro.org, Tetsuo Handa, Alan Stern, Greg KH, Linus Torvalds, linux-...@vger.kernel.org, net...@vger.kernel.org, syzkall...@googlegroups.com
On Wed, 03 Jan 2024 16:59:25 -0800
syz-executor.1:27827 kworker/u4:93/7607 kworker/0:1/11541
=== === ===
nci_close_device() nci_rx_work() nfc_urelease_event_work()
mutex_lock(&ndev->req_lock) device_lock()
flush_workqueue(ndev->rx_wq) mutex_lock(&ndev->req_lock)
device_lock()

Looks like lockdep failed to detect deadlock once more because of device_lock().

Tetsuo Handa

unread,
Jan 4, 2024, 5:35:37 AMJan 4
to Hillf Danton, syzbot, Alan Stern, Greg KH, krzysztof...@linaro.org, Linus Torvalds, linux-...@vger.kernel.org, net...@vger.kernel.org, syzkall...@googlegroups.com
On 2024/01/04 14:05, Hillf Danton wrote:
> On Wed, 03 Jan 2024 16:59:25 -0800
>> HEAD commit: 453f5db0619e Merge tag 'trace-v6.7-rc7' of git://git.kerne..
>> git tree: upstream
>> console output: https://syzkaller.appspot.com/x/log.txt?x=141bc48de80000
>> kernel config: https://syzkaller.appspot.com/x/.config?x=f8e72bae38c079e4
>> dashboard link: https://syzkaller.appspot.com/bug?extid=2b131f51bb4af224ab40
>> compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
>>
>
> syz-executor.1:27827 kworker/u4:93/7607 kworker/0:1/11541
> === === ===
> nci_close_device() nci_rx_work() nfc_urelease_event_work()
> mutex_lock(&ndev->req_lock) device_lock()
> flush_workqueue(ndev->rx_wq) mutex_lock(&ndev->req_lock)
> device_lock()
>
> Looks like lockdep failed to detect deadlock once more because of device_lock().

Yes, this is yet another circular locking dependency hidden by device_lock().

Calling flush_workqueue(ndev->rx_wq) with ndev->req_lock has to be avoided,
for nci_close_device() has ndev->req_lock => dev->dev dependency and
nfc_urelease_event_work() has dev->dev => ndev->req_lock dependency.

nci_close_device() {
mutex_lock(&ndev->req_lock); // ffff88802bed4350
flush_workqueue(ndev->rx_wq); // wait for nci_rx_work() to complete
mutex_unlock(&ndev->req_lock); // ffff88802bed4350
}

nci_rx_work() { // ndev->rx_work is on ndev->rx_wq
nci_ntf_packet() {
device_lock(&dev->dev); // ffff88802bed5100
device_unlock(&dev->dev); // ffff88802bed5100
}
}

nfc_urelease_event_work() {
mutex_lock(&nfc_devlist_mutex); // ffffffff8ee4d808
mutex_lock(&dev->genl_data.genl_data_mutex); // ffff88802bed5508
nfc_stop_poll() {
device_lock(&dev->dev); // ffff88802bed5100
nci_stop_poll() {
nci_request() {
mutex_lock(&ndev->req_lock); // ffff88802bed4350
mutex_unlock(&ndev->req_lock); // ffff88802bed4350
}
}
device_unlock(&dev->dev); // ffff88802bed5100
}
mutex_unlock(&dev->genl_data.genl_data_mutex); // ffff88802bed5508
mutex_unlock(&nfc_devlist_mutex); // ffffffff8ee4d808
}

I consider that we need to enable lockdep validation on dev->dev mutex
( https://lkml.kernel.org/r/c7fb01a9-3e12-77ed...@I-love.SAKURA.ne.jp )
but was some alternative to my proposal at
https://lkml.kernel.org/r/1ad499bb-0c53-7529...@I-love.SAKURA.ne.jp
proposed? Is it time to retry my proposal?

syzbot

unread,
Jan 11, 2024, 2:08:27 PMJan 11
to da...@davemloft.net, edum...@google.com, gre...@linuxfoundation.org, hda...@sina.com, krzysztof...@linaro.org, ku...@kernel.org, linux-...@vger.kernel.org, net...@vger.kernel.org, pab...@redhat.com, penguin...@i-love.sakura.ne.jp, st...@rowland.harvard.edu, syzkall...@googlegroups.com, torv...@linux-foundation.org
syzbot has found a reproducer for the following issue on:

HEAD commit: acc657692aed keys, dns: Fix size check of V1 server-list h..
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=10c95b33e80000
kernel config: https://syzkaller.appspot.com/x/.config?x=5c882ebde8a5f3b4
dashboard link: https://syzkaller.appspot.com/bug?extid=2b131f51bb4af224ab40
compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=103698bde80000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=1617e0fbe80000

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/015b89b33490/disk-acc65769.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/5e0949d278bb/vmlinux-acc65769.xz
kernel image: https://storage.googleapis.com/syzbot-assets/0fc2cb65191a/bzImage-acc65769.xz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+2b131f...@syzkaller.appspotmail.com

INFO: task kworker/u4:2:38 blocked for more than 143 seconds.
Not tainted 6.7.0-syzkaller-02320-gacc657692aed #0
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:kworker/u4:2 state:D stack:22728 pid:38 tgid:38 ppid:2 flags:0x00004000
Workqueue: nfc3_nci_rx_wq nci_rx_work
Call Trace:
<TASK>
context_switch kernel/sched/core.c:5399 [inline]
__schedule+0x177f/0x4960 kernel/sched/core.c:6726
__schedule_loop kernel/sched/core.c:6801 [inline]
schedule+0x149/0x260 kernel/sched/core.c:6816
schedule_preempt_disabled+0x13/0x20 kernel/sched/core.c:6873
__mutex_lock_common kernel/locking/mutex.c:684 [inline]
__mutex_lock+0x6a3/0xd60 kernel/locking/mutex.c:752
device_lock include/linux/device.h:992 [inline]
nfc_targets_found+0x26f/0x590 net/nfc/core.c:778
nci_ntf_packet+0x4431/0x4f40
nci_rx_work+0x14c/0x2b0 net/nfc/nci/core.c:1522
process_one_work kernel/workqueue.c:2633 [inline]
process_scheduled_works+0x90f/0x1420 kernel/workqueue.c:2706
worker_thread+0xa5f/0x1000 kernel/workqueue.c:2787
kthread+0x2d3/0x370 kernel/kthread.c:388
ret_from_fork+0x48/0x80 arch/x86/kernel/process.c:147
ret_from_fork_asm+0x11/0x20 arch/x86/entry/entry_64.S:242
</TASK>
INFO: task kworker/1:2:927 blocked for more than 143 seconds.
Not tainted 6.7.0-syzkaller-02320-gacc657692aed #0
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:kworker/1:2 state:D stack:24400 pid:927 tgid:927 ppid:2 flags:0x00004000
Workqueue: events nfc_urelease_event_work
Call Trace:
<TASK>
context_switch kernel/sched/core.c:5399 [inline]
__schedule+0x177f/0x4960 kernel/sched/core.c:6726
__schedule_loop kernel/sched/core.c:6801 [inline]
schedule+0x149/0x260 kernel/sched/core.c:6816
schedule_preempt_disabled+0x13/0x20 kernel/sched/core.c:6873
__mutex_lock_common kernel/locking/mutex.c:684 [inline]
__mutex_lock+0x6a3/0xd60 kernel/locking/mutex.c:752
nfc_urelease_event_work+0x111/0x2f0 net/nfc/netlink.c:1849
process_one_work kernel/workqueue.c:2633 [inline]
process_scheduled_works+0x90f/0x1420 kernel/workqueue.c:2706
worker_thread+0xa5f/0x1000 kernel/workqueue.c:2787
kthread+0x2d3/0x370 kernel/kthread.c:388
ret_from_fork+0x48/0x80 arch/x86/kernel/process.c:147
ret_from_fork_asm+0x11/0x20 arch/x86/entry/entry_64.S:242
</TASK>
INFO: task syz-executor393:5158 blocked for more than 143 seconds.
Not tainted 6.7.0-syzkaller-02320-gacc657692aed #0
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:syz-executor393 state:D stack:25488 pid:5158 tgid:5158 ppid:5119 flags:0x00004006
Call Trace:
<TASK>
context_switch kernel/sched/core.c:5399 [inline]
__schedule+0x177f/0x4960 kernel/sched/core.c:6726
__schedule_loop kernel/sched/core.c:6801 [inline]
schedule+0x149/0x260 kernel/sched/core.c:6816
schedule_timeout+0xb0/0x300 kernel/time/timer.c:2159
do_wait_for_common kernel/sched/completion.c:95 [inline]
__wait_for_common kernel/sched/completion.c:116 [inline]
wait_for_common kernel/sched/completion.c:127 [inline]
wait_for_completion+0x354/0x620 kernel/sched/completion.c:148
__flush_workqueue+0x730/0x1630 kernel/workqueue.c:3198
nci_close_device+0x193/0x600 net/nfc/nci/core.c:579
nci_unregister_device+0x40/0x240 net/nfc/nci/core.c:1297
virtual_ncidev_close+0x59/0x90 drivers/nfc/virtual_ncidev.c:168
__fput+0x428/0x890 fs/file_table.c:382
__do_sys_close fs/open.c:1554 [inline]
__se_sys_close fs/open.c:1539 [inline]
__x64_sys_close+0x7e/0x100 fs/open.c:1539
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xf5/0x230 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x63/0x6b
RIP: 0033:0x7f2bd642ca9a
RSP: 002b:00007ffc00a39ae0 EFLAGS: 00000293 ORIG_RAX: 0000000000000003
RAX: ffffffffffffffda RBX: 0000000000000004 RCX: 00007f2bd642ca9a
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000003
RBP: 0000000000104645 R08: 0000000000000000 R09: 00007ffc00a39bb0
R10: 0000000000000000 R11: 0000000000000293 R12: 00007f2bd64bd1fc
R13: 00007ffc00a39b40 R14: 00007f2bd63f60b0 R15: 00007ffc00a39bb0
</TASK>
INFO: task syz-executor393:5159 blocked for more than 144 seconds.
Not tainted 6.7.0-syzkaller-02320-gacc657692aed #0
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:syz-executor393 state:D stack:25576 pid:5159 tgid:5158 ppid:5119 flags:0x00004006
Call Trace:
<TASK>
context_switch kernel/sched/core.c:5399 [inline]
__schedule+0x177f/0x4960 kernel/sched/core.c:6726
__schedule_loop kernel/sched/core.c:6801 [inline]
schedule+0x149/0x260 kernel/sched/core.c:6816
schedule_preempt_disabled+0x13/0x20 kernel/sched/core.c:6873
__mutex_lock_common kernel/locking/mutex.c:684 [inline]
__mutex_lock+0x6a3/0xd60 kernel/locking/mutex.c:752
nci_request net/nfc/nci/core.c:149 [inline]
nci_start_poll+0xa30/0xf30 net/nfc/nci/core.c:854
nfc_start_poll+0x188/0x300 net/nfc/core.c:225
nfc_genl_start_poll+0x1eb/0x350 net/nfc/netlink.c:828
genl_family_rcv_msg_doit net/netlink/genetlink.c:972 [inline]
genl_family_rcv_msg net/netlink/genetlink.c:1052 [inline]
genl_rcv_msg+0xacf/0xe40 net/netlink/genetlink.c:1067
netlink_rcv_skb+0x1df/0x430 net/netlink/af_netlink.c:2545
genl_rcv+0x28/0x40 net/netlink/genetlink.c:1076
netlink_unicast_kernel net/netlink/af_netlink.c:1342 [inline]
netlink_unicast+0x7e6/0x980 net/netlink/af_netlink.c:1368
netlink_sendmsg+0xa37/0xd70 net/netlink/af_netlink.c:1910
sock_sendmsg_nosec net/socket.c:730 [inline]
__sock_sendmsg net/socket.c:745 [inline]
____sys_sendmsg+0x592/0x890 net/socket.c:2584
___sys_sendmsg net/socket.c:2638 [inline]
__sys_sendmsg+0x2b0/0x3a0 net/socket.c:2667
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xf5/0x230 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x63/0x6b
RIP: 0033:0x7f2bd642db49
RSP: 002b:00007f2bd63e7138 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 0000000000000008 RCX: 00007f2bd642db49
RDX: 0000000000000000 RSI: 0000000020000440 RDI: 0000000000000005
RBP: 0000000000000003 R08: 0000000000000003 R09: 0000000000000000
R10: 0000000000000008 R11: 0000000000000246 R12: 0000000000000008
R13: 0000000000000000 R14: 00007ffc00a39930 R15: 00007ffc00a39a18
</TASK>
INFO: task kworker/0:3:5161 blocked for more than 144 seconds.
Not tainted 6.7.0-syzkaller-02320-gacc657692aed #0
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:kworker/0:3 state:D stack:24328 pid:5161 tgid:5161 ppid:2 flags:0x00004000
Workqueue: events nfc_urelease_event_work
Call Trace:
<TASK>
context_switch kernel/sched/core.c:5399 [inline]
__schedule+0x177f/0x4960 kernel/sched/core.c:6726
__schedule_loop kernel/sched/core.c:6801 [inline]
schedule+0x149/0x260 kernel/sched/core.c:6816
schedule_preempt_disabled+0x13/0x20 kernel/sched/core.c:6873
__mutex_lock_common kernel/locking/mutex.c:684 [inline]
__mutex_lock+0x6a3/0xd60 kernel/locking/mutex.c:752
nfc_urelease_event_work+0xa7/0x2f0 net/nfc/netlink.c:1843
process_one_work kernel/workqueue.c:2633 [inline]
process_scheduled_works+0x90f/0x1420 kernel/workqueue.c:2706
worker_thread+0xa5f/0x1000 kernel/workqueue.c:2787
kthread+0x2d3/0x370 kernel/kthread.c:388
ret_from_fork+0x48/0x80 arch/x86/kernel/process.c:147
ret_from_fork_asm+0x11/0x20 arch/x86/entry/entry_64.S:242
</TASK>
INFO: task syz-executor393:5165 blocked for more than 144 seconds.
Not tainted 6.7.0-syzkaller-02320-gacc657692aed #0
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:syz-executor393 state:D stack:25808 pid:5165 tgid:5164 ppid:5122 flags:0x00004006
Call Trace:
<TASK>
context_switch kernel/sched/core.c:5399 [inline]
__schedule+0x177f/0x4960 kernel/sched/core.c:6726
__schedule_loop kernel/sched/core.c:6801 [inline]
schedule+0x149/0x260 kernel/sched/core.c:6816
schedule_preempt_disabled+0x13/0x20 kernel/sched/core.c:6873
__mutex_lock_common kernel/locking/mutex.c:684 [inline]
__mutex_lock+0x6a3/0xd60 kernel/locking/mutex.c:752
genl_lock net/netlink/genetlink.c:33 [inline]
genl_op_lock net/netlink/genetlink.c:58 [inline]
genl_rcv_msg+0x121/0xe40 net/netlink/genetlink.c:1066
netlink_rcv_skb+0x1df/0x430 net/netlink/af_netlink.c:2545
genl_rcv+0x28/0x40 net/netlink/genetlink.c:1076
netlink_unicast_kernel net/netlink/af_netlink.c:1342 [inline]
netlink_unicast+0x7e6/0x980 net/netlink/af_netlink.c:1368
netlink_sendmsg+0xa37/0xd70 net/netlink/af_netlink.c:1910
sock_sendmsg_nosec net/socket.c:730 [inline]
__sock_sendmsg net/socket.c:745 [inline]
____sys_sendmsg+0x592/0x890 net/socket.c:2584
___sys_sendmsg net/socket.c:2638 [inline]
__sys_sendmsg+0x2b0/0x3a0 net/socket.c:2667
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xf5/0x230 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x63/0x6b
RIP: 0033:0x7f2bd642db49
RSP: 002b:00007f2bd63e7138 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 0000000000000008 RCX: 00007f2bd642db49
RDX: 0000000000000000 RSI: 0000000020000440 RDI: 0000000000000005
RBP: 0000000000000003 R08: 0000000000000003 R09: 0000000000000000
R10: 0000000000000008 R11: 0000000000000246 R12: 0000000000000008
R13: 0000000000000000 R14: 00007ffc00a39930 R15: 00007ffc00a39a18
</TASK>
INFO: task syz-executor393:5175 blocked for more than 145 seconds.
Not tainted 6.7.0-syzkaller-02320-gacc657692aed #0
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:syz-executor393 state:D stack:27632 pid:5175 tgid:5164 ppid:5122 flags:0x00004006
Call Trace:
<TASK>
context_switch kernel/sched/core.c:5399 [inline]
__schedule+0x177f/0x4960 kernel/sched/core.c:6726
__schedule_loop kernel/sched/core.c:6801 [inline]
schedule+0x149/0x260 kernel/sched/core.c:6816
schedule_preempt_disabled+0x13/0x20 kernel/sched/core.c:6873
__mutex_lock_common kernel/locking/mutex.c:684 [inline]
__mutex_lock+0x6a3/0xd60 kernel/locking/mutex.c:752
nfc_unregister_device+0x175/0x2a0 net/nfc/core.c:1181
virtual_ncidev_close+0x59/0x90 drivers/nfc/virtual_ncidev.c:168
__fput+0x428/0x890 fs/file_table.c:382
task_work_run+0x24a/0x300 kernel/task_work.c:180
get_signal+0x166e/0x1840 kernel/signal.c:2669
arch_do_signal_or_restart+0x96/0x860 arch/x86/kernel/signal.c:309
exit_to_user_mode_loop kernel/entry/common.c:105 [inline]
exit_to_user_mode_prepare include/linux/entry-common.h:328 [inline]
__syscall_exit_to_user_mode_work kernel/entry/common.c:201 [inline]
syscall_exit_to_user_mode+0xc8/0x370 kernel/entry/common.c:212
do_syscall_64+0x102/0x230 arch/x86/entry/common.c:89
entry_SYSCALL_64_after_hwframe+0x63/0x6b
RIP: 0033:0x7f2bd642db49
RSP: 002b:00007f2bd63c6138 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
RAX: fffffffffffffff2 RBX: 00007f2bd64bd1f8 RCX: 00007f2bd642db49
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000003
RBP: 00007f2bd64bd1f0 R08: 00007f2bd63c66c0 R09: 0000000000000000
R10: 00007f2bd63c66c0 R11: 0000000000000246 R12: 00007f2bd64bd1fc
R13: 000000000000006e R14: 00007ffc00a39930 R15: 00007ffc00a39a18
</TASK>
INFO: task syz-executor393:5170 blocked for more than 145 seconds.
Not tainted 6.7.0-syzkaller-02320-gacc657692aed #0
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:syz-executor393 state:D stack:24816 pid:5170 tgid:5169 ppid:5118 flags:0x00004006
Call Trace:
<TASK>
context_switch kernel/sched/core.c:5399 [inline]
__schedule+0x177f/0x4960 kernel/sched/core.c:6726
__schedule_loop kernel/sched/core.c:6801 [inline]
schedule+0x149/0x260 kernel/sched/core.c:6816
schedule_preempt_disabled+0x13/0x20 kernel/sched/core.c:6873
__mutex_lock_common kernel/locking/mutex.c:684 [inline]
__mutex_lock+0x6a3/0xd60 kernel/locking/mutex.c:752
genl_lock net/netlink/genetlink.c:33 [inline]
genl_op_lock net/netlink/genetlink.c:58 [inline]
genl_rcv_msg+0x121/0xe40 net/netlink/genetlink.c:1066
netlink_rcv_skb+0x1df/0x430 net/netlink/af_netlink.c:2545
genl_rcv+0x28/0x40 net/netlink/genetlink.c:1076
netlink_unicast_kernel net/netlink/af_netlink.c:1342 [inline]
netlink_unicast+0x7e6/0x980 net/netlink/af_netlink.c:1368
netlink_sendmsg+0xa37/0xd70 net/netlink/af_netlink.c:1910
sock_sendmsg_nosec net/socket.c:730 [inline]
__sock_sendmsg net/socket.c:745 [inline]
____sys_sendmsg+0x592/0x890 net/socket.c:2584
___sys_sendmsg net/socket.c:2638 [inline]
__sys_sendmsg+0x2b0/0x3a0 net/socket.c:2667
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xf5/0x230 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x63/0x6b
RIP: 0033:0x7f2bd642db49
RSP: 002b:00007f2bd63e7138 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 00007f2bd64bd1e8 RCX: 00007f2bd642db49
RDX: 0000000000000000 RSI: 0000000020000140 RDI: 0000000000000005
RBP: 00007f2bd64bd1e0 R08: 00007f2bd63e76c0 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00007f2bd64bd1ec
R13: 0000000000000011 R14: 00007ffc00a39930 R15: 00007ffc00a39a18
</TASK>
INFO: task syz-executor393:5183 blocked for more than 145 seconds.
Not tainted 6.7.0-syzkaller-02320-gacc657692aed #0
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:syz-executor393 state:D stack:26864 pid:5183 tgid:5169 ppid:5118 flags:0x00004006
Call Trace:
<TASK>
context_switch kernel/sched/core.c:5399 [inline]
__schedule+0x177f/0x4960 kernel/sched/core.c:6726
__schedule_loop kernel/sched/core.c:6801 [inline]
schedule+0x149/0x260 kernel/sched/core.c:6816
schedule_preempt_disabled+0x13/0x20 kernel/sched/core.c:6873
__mutex_lock_common kernel/locking/mutex.c:684 [inline]
__mutex_lock+0x6a3/0xd60 kernel/locking/mutex.c:752
nfc_unregister_device+0x175/0x2a0 net/nfc/core.c:1181
virtual_ncidev_close+0x59/0x90 drivers/nfc/virtual_ncidev.c:168
__fput+0x428/0x890 fs/file_table.c:382
task_work_run+0x24a/0x300 kernel/task_work.c:180
get_signal+0x166e/0x1840 kernel/signal.c:2669
arch_do_signal_or_restart+0x96/0x860 arch/x86/kernel/signal.c:309
exit_to_user_mode_loop kernel/entry/common.c:105 [inline]
exit_to_user_mode_prepare include/linux/entry-common.h:328 [inline]
__syscall_exit_to_user_mode_work kernel/entry/common.c:201 [inline]
syscall_exit_to_user_mode+0xc8/0x370 kernel/entry/common.c:212
do_syscall_64+0x102/0x230 arch/x86/entry/common.c:89
entry_SYSCALL_64_after_hwframe+0x63/0x6b
RIP: 0033:0x7f2bd642db49
RSP: 002b:00007f2bd63c6138 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
RAX: fffffffffffffff2 RBX: 00007f2bd64bd1f8 RCX: 00007f2bd642db49
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000003
RBP: 00007f2bd64bd1f0 R08: 00007f2bd63c66c0 R09: 0000000000000000
R10: 00007ffc00a39a17 R11: 0000000000000246 R12: 00007f2bd64bd1fc
R13: 000000000000006e R14: 00007ffc00a39930 R15: 00007ffc00a39a18
</TASK>
INFO: task syz-executor393:5189 blocked for more than 146 seconds.
Not tainted 6.7.0-syzkaller-02320-gacc657692aed #0
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:syz-executor393 state:D stack:28176 pid:5189 tgid:5169 ppid:5118 flags:0x00004006
Call Trace:
<TASK>
context_switch kernel/sched/core.c:5399 [inline]
__schedule+0x177f/0x4960 kernel/sched/core.c:6726
__schedule_loop kernel/sched/core.c:6801 [inline]
schedule+0x149/0x260 kernel/sched/core.c:6816
schedule_preempt_disabled+0x13/0x20 kernel/sched/core.c:6873
__mutex_lock_common kernel/locking/mutex.c:684 [inline]
__mutex_lock+0x6a3/0xd60 kernel/locking/mutex.c:752
genl_lock net/netlink/genetlink.c:33 [inline]
genl_op_lock net/netlink/genetlink.c:58 [inline]
genl_rcv_msg+0x121/0xe40 net/netlink/genetlink.c:1066
netlink_rcv_skb+0x1df/0x430 net/netlink/af_netlink.c:2545
genl_rcv+0x28/0x40 net/netlink/genetlink.c:1076
netlink_unicast_kernel net/netlink/af_netlink.c:1342 [inline]
netlink_unicast+0x7e6/0x980 net/netlink/af_netlink.c:1368
netlink_sendmsg+0xa37/0xd70 net/netlink/af_netlink.c:1910
sock_sendmsg_nosec net/socket.c:730 [inline]
__sock_sendmsg net/socket.c:745 [inline]
____sys_sendmsg+0x592/0x890 net/socket.c:2584
___sys_sendmsg net/socket.c:2638 [inline]
__sys_sendmsg+0x2b0/0x3a0 net/socket.c:2667
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xf5/0x230 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x63/0x6b
RIP: 0033:0x7f2bd642db49
RSP: 002b:00007f2bd63a5138 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 0000000000000008 RCX: 00007f2bd642db49
RDX: 0000000000000000 RSI: 0000000020000440 RDI: 0000000000000005
RBP: 0000000000000003 R08: 0000000000000003 R09: 0000000000000000
R10: 0000000000000008 R11: 0000000000000246 R12: 0000000000000008
R13: 0000000000000000 R14: 00007ffc00a39930 R15: 00007ffc00a39a18
</TASK>
Future hung task reports are suppressed, see sysctl kernel.hung_task_warnings
INFO: lockdep is turned off.
NMI backtrace for cpu 0
CPU: 0 PID: 29 Comm: khungtaskd Not tainted 6.7.0-syzkaller-02320-gacc657692aed #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 11/17/2023
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0x1e7/0x2d0 lib/dump_stack.c:106
nmi_cpu_backtrace+0x498/0x4d0 lib/nmi_backtrace.c:113
nmi_trigger_cpumask_backtrace+0x198/0x310 lib/nmi_backtrace.c:62
trigger_all_cpu_backtrace include/linux/nmi.h:160 [inline]
check_hung_uninterruptible_tasks kernel/hung_task.c:222 [inline]
watchdog+0xfaf/0xff0 kernel/hung_task.c:379
kthread+0x2d3/0x370 kernel/kthread.c:388
ret_from_fork+0x48/0x80 arch/x86/kernel/process.c:147
ret_from_fork_asm+0x11/0x20 arch/x86/entry/entry_64.S:242
</TASK>
Sending NMI from CPU 0 to CPUs 1:
NMI backtrace for cpu 1
CPU: 1 PID: 49 Comm: kworker/u4:3 Not tainted 6.7.0-syzkaller-02320-gacc657692aed #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 11/17/2023
Workqueue: events_unbound cfg80211_wiphy_work
RIP: 0010:check_kcov_mode kernel/kcov.c:173 [inline]
RIP: 0010:write_comp_data kernel/kcov.c:236 [inline]
RIP: 0010:__sanitizer_cov_trace_switch+0xa3/0x110 kernel/kcov.c:341
Code: 84 8a 00 00 00 4c 8b 4c 24 20 65 4c 8b 1d 05 db 74 7e 31 d2 eb 08 48 ff c2 49 39 d2 74 71 4c 8b 74 d6 10 65 8b 05 f5 da 74 7e <a9> 00 01 ff 00 74 11 a9 00 01 00 00 74 de 41 83 bb fc 15 00 00 00
RSP: 0018:ffffc90000b9f198 EFLAGS: 00000202
RAX: 0000000080000000 RBX: 0000000000000028 RCX: ffff888013f18000
RDX: 0000000000000003 RSI: ffffffff8edb9710 RDI: 0000000000000000
RBP: ffffc90000b9f5d0 R08: 0000000000000001 R09: ffffffff8aa108e3
R10: 0000000000000028 R11: ffff888013f18000 R12: 0000000000000008
R13: dffffc0000000000 R14: 0000000000000003 R15: ffff888026966474
FS: 0000000000000000(0000) GS:ffff8880b9900000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00005562339c8600 CR3: 000000000d732000 CR4: 00000000003506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<NMI>
</NMI>
<TASK>
_ieee802_11_parse_elems_full+0x6f3/0x4340 net/mac80211/util.c:1049
ieee802_11_parse_elems_full+0xd47/0x2470 net/mac80211/util.c:1647
ieee802_11_parse_elems_crc net/mac80211/ieee80211_i.h:2288 [inline]
ieee802_11_parse_elems net/mac80211/ieee80211_i.h:2295 [inline]
ieee80211_rx_mgmt_probe_beacon net/mac80211/ibss.c:1573 [inline]
ieee80211_ibss_rx_queued_mgmt+0x4b3/0x2d30 net/mac80211/ibss.c:1604
ieee80211_iface_process_skb net/mac80211/iface.c:1589 [inline]
ieee80211_iface_work+0x805/0xd90 net/mac80211/iface.c:1643
cfg80211_wiphy_work+0x21e/0x250 net/wireless/core.c:437
process_one_work kernel/workqueue.c:2633 [inline]
process_scheduled_works+0x90f/0x1420 kernel/workqueue.c:2706
worker_thread+0xa5f/0x1000 kernel/workqueue.c:2787
kthread+0x2d3/0x370 kernel/kthread.c:388
ret_from_fork+0x48/0x80 arch/x86/kernel/process.c:147
ret_from_fork_asm+0x11/0x20 arch/x86/entry/entry_64.S:242
</TASK>
INFO: NMI handler (nmi_cpu_backtrace_handler) took too long to run: 1.624 msecs


---
If you want syzbot to run the reproducer, reply with:
#syz test: git://repo/address.git branch-or-commit-hash
If you attach or paste a git patch, syzbot will apply it before testing.

syzbot

unread,
Jan 13, 2024, 12:29:05 PMJan 13
to air...@gmail.com, dan...@ffwll.ch, da...@davemloft.net, dian...@chromium.org, dri-...@lists.freedesktop.org, edum...@google.com, gre...@linuxfoundation.org, hda...@sina.com, jernej....@gmail.com, krzysztof...@linaro.org, ku...@kernel.org, linux-ar...@lists.infradead.org, linux-...@vger.kernel.org, linux...@lists.linux.dev, mri...@kernel.org, net...@vger.kernel.org, pab...@redhat.com, penguin...@i-love.sakura.ne.jp, sam...@sholland.org, st...@rowland.harvard.edu, syzkall...@googlegroups.com, torv...@linux-foundation.org, u.klein...@pengutronix.de, we...@csie.org
syzbot has bisected this issue to:

commit d665e3c9d37ad31aec2d0d9d086e7c903ddd7250
Author: Uwe Kleine-König <u.klein...@pengutronix.de>
Date: Sun May 7 16:26:06 2023 +0000

drm/sun4i: Convert to platform remove callback returning void

bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=146b91f5e80000
start commit: acc657692aed keys, dns: Fix size check of V1 server-list h..
git tree: upstream
final oops: https://syzkaller.appspot.com/x/report.txt?x=166b91f5e80000
console output: https://syzkaller.appspot.com/x/log.txt?x=126b91f5e80000
Reported-by: syzbot+2b131f...@syzkaller.appspotmail.com
Fixes: d665e3c9d37a ("drm/sun4i: Convert to platform remove callback returning void")

For information about bisection process see: https://goo.gl/tpsmEJ#bisection

Edward Adam Davis

unread,
Jan 14, 2024, 2:35:46 AMJan 14
to syzbot+2b131f...@syzkaller.appspotmail.com, linux-...@vger.kernel.org, syzkall...@googlegroups.com
please test task hung in nfc_targets_found

#syz test https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git acc657692aed

diff --git a/net/nfc/nci/core.c b/net/nfc/nci/core.c
index 6c9592d05120..9a277228a875 100644
--- a/net/nfc/nci/core.c
+++ b/net/nfc/nci/core.c
@@ -145,6 +145,8 @@ inline int nci_request(struct nci_dev *ndev,
{
int rc;

+ if (test_bit(NCI_UNREG, &ndev->flags))
+ return -ENODEV;
/* Serialize all requests */
mutex_lock(&ndev->req_lock);
/* check the state after obtaing the lock against any races

syzbot

unread,
Jan 14, 2024, 3:05:07 AMJan 14
to ead...@qq.com, linux-...@vger.kernel.org, syzkall...@googlegroups.com
Hello,

syzbot has tested the proposed patch and the reproducer did not trigger any issue:

Reported-and-tested-by: syzbot+2b131f...@syzkaller.appspotmail.com

Tested on:

commit: acc65769 keys, dns: Fix size check of V1 server-list h..
git tree: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
console output: https://syzkaller.appspot.com/x/log.txt?x=12b062dbe80000
compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
patch: https://syzkaller.appspot.com/x/patch.diff?x=16749b35e80000

Note: testing is done by a robot and is best-effort only.

Edward Adam Davis

unread,
Jan 14, 2024, 3:20:29 AMJan 14
to syzbot+2b131f...@syzkaller.appspotmail.com, da...@davemloft.net, edum...@google.com, gre...@linuxfoundation.org, hda...@sina.com, krzysztof...@linaro.org, ku...@kernel.org, linux-...@vger.kernel.org, net...@vger.kernel.org, pab...@redhat.com, penguin...@i-love.sakura.ne.jp, st...@rowland.harvard.edu, syzkall...@googlegroups.com, torv...@linux-foundation.org
nci_start_poll() holds the dev->mutex required by the kworker of nci_close_device(),
and the related tasks are as follows:
|cpu0 |cpu1 |cpu2 |
|nci_close_device() | | |
|mutex_lock(&ndev->req_lock); | | |
|... |nfc_genl_start_poll() | |
|flush_workqueue(ndev->rx_wq) |mutex_lock(&dev->genl_data.genl_data_mutex); | |
| |nfc_start_poll() | |
| |device_lock(&dev->dev); |process_one_work() |
| |nci_start_poll() |nfc_targets_found() |
| |nci_request() |device_lock(&dev->dev); |
| |mutex_lock(&ndev->req_lock); | |

Therefore, before applying for req_lock in nci_request(), it should be determined
whether the execution of nci_close_device() has already begun.

Reported-and-tested-by: syzbot+2b131f...@syzkaller.appspotmail.com
Signed-off-by: Edward Adam Davis <ead...@qq.com>
---
net/nfc/nci/core.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/net/nfc/nci/core.c b/net/nfc/nci/core.c
index 6c9592d05120..9a277228a875 100644
--- a/net/nfc/nci/core.c
+++ b/net/nfc/nci/core.c
@@ -145,6 +145,8 @@ inline int nci_request(struct nci_dev *ndev,
{
int rc;

+ if (test_bit(NCI_UNREG, &ndev->flags))
+ return -ENODEV;
/* Serialize all requests */
mutex_lock(&ndev->req_lock);
/* check the state after obtaing the lock against any races
--
2.43.0

Krzysztof Kozlowski

unread,
Jan 15, 2024, 4:36:45 AMJan 15
to Edward Adam Davis, syzbot+2b131f...@syzkaller.appspotmail.com, da...@davemloft.net, edum...@google.com, gre...@linuxfoundation.org, hda...@sina.com, ku...@kernel.org, linux-...@vger.kernel.org, net...@vger.kernel.org, pab...@redhat.com, penguin...@i-love.sakura.ne.jp, st...@rowland.harvard.edu, syzkall...@googlegroups.com, torv...@linux-foundation.org
nci_close_device() clears the NCI_UP, which is tested here, just after
acquiring mutex. And there is explicit comment about it just below your
code. Why it is not relevant?

Your code looks really unnecessary, at least with that code flow from
commit msg. Especially considering you do it outside of mutex, so how
does it solve anything?

Best regards,
Krzysztof

Tetsuo Handa

unread,
Jan 15, 2024, 6:09:16 AMJan 15
to Krzysztof Kozlowski, Edward Adam Davis, syzbot+2b131f...@syzkaller.appspotmail.com, da...@davemloft.net, edum...@google.com, gre...@linuxfoundation.org, hda...@sina.com, ku...@kernel.org, linux-...@vger.kernel.org, net...@vger.kernel.org, pab...@redhat.com, st...@rowland.harvard.edu, syzkall...@googlegroups.com, torv...@linux-foundation.org
On 2024/01/15 18:36, Krzysztof Kozlowski wrote:
>> diff --git a/net/nfc/nci/core.c b/net/nfc/nci/core.c
>> index 6c9592d05120..9a277228a875 100644
>> --- a/net/nfc/nci/core.c
>> +++ b/net/nfc/nci/core.c
>> @@ -145,6 +145,8 @@ inline int nci_request(struct nci_dev *ndev,
>> {
>> int rc;
>>
>> + if (test_bit(NCI_UNREG, &ndev->flags))
>> + return -ENODEV;
>
> nci_close_device() clears the NCI_UP, which is tested here, just after
> acquiring mutex. And there is explicit comment about it just below your
> code. Why it is not relevant?

Because the deadlock happens at mutex_lock(&ndev->req_lock), which is
before test_bit(NCI_UP, &ndev->flags) is called. Please see
https://lkml.kernel.org/r/d314e471-0251-461e...@I-love.SAKURA.ne.jp .

Krzysztof Kozlowski

unread,
Jan 15, 2024, 6:15:57 AMJan 15
to Tetsuo Handa, Edward Adam Davis, syzbot+2b131f...@syzkaller.appspotmail.com, da...@davemloft.net, edum...@google.com, gre...@linuxfoundation.org, hda...@sina.com, ku...@kernel.org, linux-...@vger.kernel.org, net...@vger.kernel.org, pab...@redhat.com, st...@rowland.harvard.edu, syzkall...@googlegroups.com, torv...@linux-foundation.org
On 15/01/2024 12:08, Tetsuo Handa wrote:
> On 2024/01/15 18:36, Krzysztof Kozlowski wrote:
>>> diff --git a/net/nfc/nci/core.c b/net/nfc/nci/core.c
>>> index 6c9592d05120..9a277228a875 100644
>>> --- a/net/nfc/nci/core.c
>>> +++ b/net/nfc/nci/core.c
>>> @@ -145,6 +145,8 @@ inline int nci_request(struct nci_dev *ndev,
>>> {
>>> int rc;
>>>
>>> + if (test_bit(NCI_UNREG, &ndev->flags))
>>> + return -ENODEV;
>>
>> nci_close_device() clears the NCI_UP, which is tested here, just after
>> acquiring mutex. And there is explicit comment about it just below your
>> code. Why it is not relevant?
>
> Because the deadlock happens at mutex_lock(&ndev->req_lock), which is
> before test_bit(NCI_UP, &ndev->flags) is called. Please see
> https://lkml.kernel.org/r/d314e471-0251-461e...@I-love.SAKURA.ne.jp .

I see, yet still this code looks like moving or copying existing
test_bit(NCI_UP) outside of mutex, which is usually not the correct
solution for deadlocks. First of all, flags are supposed to be
manipulated under lock, so the code here can be quite re-ordered. What
stops the CPU to test negative in above (so: !NCI_UNREG) and then
execute nci_close_device before waiting on mutex here? Nothing.

The problem seems to be locking and solution is not to add one if()
outside of locking.

Best regards,
Krzysztof

Reply all
Reply to author
Forward
0 new messages