[v6.1] possible deadlock in nci_close_device


syzbot

Apr 3, 2024, 12:08:30 AM
to syzkaller...@googlegroups.com
Hello,

syzbot found the following issue on:

HEAD commit: e5cd595e23c1 Linux 6.1.83
git tree: linux-6.1.y
console output: https://syzkaller.appspot.com/x/log.txt?x=120e5823180000
kernel config: https://syzkaller.appspot.com/x/.config?x=99d0cbbc2b2c7cfd
dashboard link: https://syzkaller.appspot.com/bug?extid=b48d4c917c8276ea9df3
compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40

Unfortunately, I don't have any reproducer for this issue yet.

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/cd28292a2eef/disk-e5cd595e.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/e8297fd856b2/vmlinux-e5cd595e.xz
kernel image: https://storage.googleapis.com/syzbot-assets/ea8c74634429/bzImage-e5cd595e.xz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+b48d4c...@syzkaller.appspotmail.com

======================================================
WARNING: possible circular locking dependency detected
6.1.83-syzkaller #0 Not tainted
------------------------------------------------------
kworker/0:7/3611 is trying to acquire lock:
ffff88807a488350 (&ndev->req_lock){+.+.}-{3:3}, at: nci_close_device+0x106/0x5f0 net/nfc/nci/core.c:560

but task is already holding lock:
ffffffff8e547a28 (rfkill_global_mutex){+.+.}-{3:3}, at: rfkill_sync_work+0x25/0xe0 net/rfkill/core.c:1040

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #2 (rfkill_global_mutex){+.+.}-{3:3}:
lock_acquire+0x1f8/0x5a0 kernel/locking/lockdep.c:5662
__mutex_lock_common kernel/locking/mutex.c:603 [inline]
__mutex_lock+0x132/0xd80 kernel/locking/mutex.c:747
rfkill_register+0x30/0x880 net/rfkill/core.c:1057
nfc_register_device+0x144/0x310 net/nfc/core.c:1132
nci_register_device+0x7be/0x900 net/nfc/nci/core.c:1265
virtual_ncidev_open+0x55/0xc0 drivers/nfc/virtual_ncidev.c:146
misc_open+0x304/0x380 drivers/char/misc.c:143
chrdev_open+0x54a/0x630 fs/char_dev.c:414
do_dentry_open+0x7f9/0x10f0 fs/open.c:882
do_open fs/namei.c:3628 [inline]
path_openat+0x2644/0x2e60 fs/namei.c:3785
do_filp_open+0x230/0x480 fs/namei.c:3812
do_sys_openat2+0x13b/0x500 fs/open.c:1318
do_sys_open fs/open.c:1334 [inline]
__do_sys_openat fs/open.c:1350 [inline]
__se_sys_openat fs/open.c:1345 [inline]
__x64_sys_openat+0x243/0x290 fs/open.c:1345
do_syscall_x64 arch/x86/entry/common.c:51 [inline]
do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:81
entry_SYSCALL_64_after_hwframe+0x63/0xcd

-> #1 (nci_mutex){+.+.}-{3:3}:
lock_acquire+0x1f8/0x5a0 kernel/locking/lockdep.c:5662
__mutex_lock_common kernel/locking/mutex.c:603 [inline]
__mutex_lock+0x132/0xd80 kernel/locking/mutex.c:747
virtual_nci_close+0x13/0x40 drivers/nfc/virtual_ncidev.c:44
nci_open_device net/nfc/nci/core.c:544 [inline]
nci_dev_up+0x954/0xd40 net/nfc/nci/core.c:631
nfc_dev_up+0x185/0x330 net/nfc/core.c:118
nfc_genl_dev_up+0x80/0xd0 net/nfc/netlink.c:770
genl_family_rcv_msg_doit net/netlink/genetlink.c:756 [inline]
genl_family_rcv_msg net/netlink/genetlink.c:833 [inline]
genl_rcv_msg+0xc1a/0xf70 net/netlink/genetlink.c:850
netlink_rcv_skb+0x1cd/0x410 net/netlink/af_netlink.c:2508
genl_rcv+0x24/0x40 net/netlink/genetlink.c:861
netlink_unicast_kernel net/netlink/af_netlink.c:1326 [inline]
netlink_unicast+0x7d8/0x970 net/netlink/af_netlink.c:1352
netlink_sendmsg+0xa26/0xd60 net/netlink/af_netlink.c:1874
sock_sendmsg_nosec net/socket.c:718 [inline]
__sock_sendmsg net/socket.c:730 [inline]
____sys_sendmsg+0x5a5/0x8f0 net/socket.c:2514
___sys_sendmsg net/socket.c:2568 [inline]
__sys_sendmsg+0x2a9/0x390 net/socket.c:2597
do_syscall_x64 arch/x86/entry/common.c:51 [inline]
do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:81
entry_SYSCALL_64_after_hwframe+0x63/0xcd

-> #0 (&ndev->req_lock){+.+.}-{3:3}:
check_prev_add kernel/locking/lockdep.c:3090 [inline]
check_prevs_add kernel/locking/lockdep.c:3209 [inline]
validate_chain+0x1661/0x5950 kernel/locking/lockdep.c:3825
__lock_acquire+0x125b/0x1f80 kernel/locking/lockdep.c:5049
lock_acquire+0x1f8/0x5a0 kernel/locking/lockdep.c:5662
__mutex_lock_common kernel/locking/mutex.c:603 [inline]
__mutex_lock+0x132/0xd80 kernel/locking/mutex.c:747
nci_close_device+0x106/0x5f0 net/nfc/nci/core.c:560
nci_dev_down+0x37/0x40 net/nfc/nci/core.c:638
nfc_dev_down net/nfc/core.c:161 [inline]
nfc_rfkill_set_block+0x16d/0x2f0 net/nfc/core.c:179
rfkill_set_block+0x1e7/0x430 net/rfkill/core.c:345
rfkill_sync_work+0x8a/0xe0 net/rfkill/core.c:1042
process_one_work+0x8a9/0x11d0 kernel/workqueue.c:2292
worker_thread+0xa47/0x1200 kernel/workqueue.c:2439
kthread+0x28d/0x320 kernel/kthread.c:376
ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:307

other info that might help us debug this:

Chain exists of:
&ndev->req_lock --> nci_mutex --> rfkill_global_mutex

Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(rfkill_global_mutex);
                               lock(nci_mutex);
                               lock(rfkill_global_mutex);
  lock(&ndev->req_lock);

*** DEADLOCK ***

4 locks held by kworker/0:7/3611:
#0: ffff888012470938 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x7a9/0x11d0 kernel/workqueue.c:2267
#1: ffffc9000402fd20 ((work_completion)(&rfkill->sync_work)){+.+.}-{0:0}, at: process_one_work+0x7a9/0x11d0 kernel/workqueue.c:2267
#2: ffffffff8e547a28 (rfkill_global_mutex){+.+.}-{3:3}, at: rfkill_sync_work+0x25/0xe0 net/rfkill/core.c:1040
#3: ffff88807aa28100 (&dev->mutex){....}-{3:3}, at: device_lock include/linux/device.h:837 [inline]
#3: ffff88807aa28100 (&dev->mutex){....}-{3:3}, at: nfc_dev_down net/nfc/core.c:143 [inline]
#3: ffff88807aa28100 (&dev->mutex){....}-{3:3}, at: nfc_rfkill_set_block+0x4c/0x2f0 net/nfc/core.c:179

stack backtrace:
CPU: 0 PID: 3611 Comm: kworker/0:7 Not tainted 6.1.83-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 03/27/2024
Workqueue: events rfkill_sync_work
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0x1e3/0x2cb lib/dump_stack.c:106
check_noncircular+0x2fa/0x3b0 kernel/locking/lockdep.c:2170
check_prev_add kernel/locking/lockdep.c:3090 [inline]
check_prevs_add kernel/locking/lockdep.c:3209 [inline]
validate_chain+0x1661/0x5950 kernel/locking/lockdep.c:3825
__lock_acquire+0x125b/0x1f80 kernel/locking/lockdep.c:5049
lock_acquire+0x1f8/0x5a0 kernel/locking/lockdep.c:5662
__mutex_lock_common kernel/locking/mutex.c:603 [inline]
__mutex_lock+0x132/0xd80 kernel/locking/mutex.c:747
nci_close_device+0x106/0x5f0 net/nfc/nci/core.c:560
nci_dev_down+0x37/0x40 net/nfc/nci/core.c:638
nfc_dev_down net/nfc/core.c:161 [inline]
nfc_rfkill_set_block+0x16d/0x2f0 net/nfc/core.c:179
rfkill_set_block+0x1e7/0x430 net/rfkill/core.c:345
rfkill_sync_work+0x8a/0xe0 net/rfkill/core.c:1042
process_one_work+0x8a9/0x11d0 kernel/workqueue.c:2292
worker_thread+0xa47/0x1200 kernel/workqueue.c:2439
kthread+0x28d/0x320 kernel/kthread.c:376
ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:307
</TASK>
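
For readers triaging this, the three chain entries above can be read as "held -> acquired" edges in a lock-order graph, and the warning amounts to that graph containing a cycle. The following is only an illustrative sketch, not lockdep's actual algorithm: the edge list is transcribed by hand from the #0/#1/#2 entries in this report, and `EDGES` and `find_cycle` are hypothetical names introduced for the example.

```python
from collections import defaultdict

# Edges read "held -> acquired". They are transcribed by hand from the
# dependency chain in the report; the held lock for #1 and #2 is inferred
# from the "Chain exists of" line, not from the stack traces themselves.
EDGES = [
    ("&ndev->req_lock", "nci_mutex"),            # #1: nci_dev_up -> virtual_nci_close
    ("nci_mutex", "rfkill_global_mutex"),        # #2: virtual_ncidev_open -> rfkill_register
    ("rfkill_global_mutex", "&ndev->req_lock"),  # #0: rfkill_sync_work -> nci_close_device
]


def find_cycle(edges):
    """Return one lock-order cycle as a list of lock names, or None."""
    graph = defaultdict(list)
    for held, acquired in edges:
        graph[held].append(acquired)

    visiting = []   # locks on the current depth-first path, in order
    done = set()    # locks whose outgoing edges are fully explored

    def dfs(node):
        if node in visiting:
            # Back edge: the path from the first visit of `node` closes a cycle.
            return visiting[visiting.index(node):] + [node]
        if node in done:
            return None
        visiting.append(node)
        for nxt in graph[node]:
            cycle = dfs(nxt)
            if cycle:
                return cycle
        visiting.pop()
        done.add(node)
        return None

    for start in list(graph):
        cycle = dfs(start)
        if cycle:
            return cycle
    return None


cycle = find_cycle(EDGES)
print(" --> ".join(cycle))
```

Dropping any one of the three edges makes `find_cycle` return `None`, which is the general shape of a fix: one of the three paths has to stop taking its second lock while still holding the first.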


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzk...@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

If the report is already addressed, let syzbot know by replying with:
#syz fix: exact-commit-title

If you want to overwrite report's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)

If the report is a duplicate of another one, reply with:
#syz dup: exact-subject-of-another-report

If you want to undo deduplication, reply with:
#syz undup