[v5.15] possible deadlock in hci_conn_hash_flush (2)

syzbot
Jan 23, 2024, 6:43:18 PM
to syzkaller...@googlegroups.com

Hello,

syzbot found the following issue on:

HEAD commit: ddcaf4999061 Linux 5.15.147
git tree: linux-5.15.y
console output: https://syzkaller.appspot.com/x/log.txt?x=1016c56be80000
kernel config: https://syzkaller.appspot.com/x/.config?x=8c65db3d25098c3c
dashboard link: https://syzkaller.appspot.com/bug?extid=6540c42c3d5224a370e8
compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40

Unfortunately, I don't have any reproducer for this issue yet.

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/fe87fb57528f/disk-ddcaf499.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/f64608a2759c/vmlinux-ddcaf499.xz
kernel image: https://storage.googleapis.com/syzbot-assets/84cae5bc6ed5/bzImage-ddcaf499.xz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+6540c4...@syzkaller.appspotmail.com

Bluetooth: hci8: hardware error 0x00
======================================================
WARNING: possible circular locking dependency detected
5.15.147-syzkaller #0 Not tainted
------------------------------------------------------
kworker/u5:0/146 is trying to acquire lock:
ffff888022f06070 ((work_completion)(&(&conn->timeout_work)->work)){+.+.}-{0:0}, at: __flush_work+0xcf/0x1a0 kernel/workqueue.c:3090

but task is already holding lock:
ffffffff8db25028 (hci_cb_list_lock){+.+.}-{3:3}, at: hci_disconn_cfm include/net/bluetooth/hci_core.h:1523 [inline]
ffffffff8db25028 (hci_cb_list_lock){+.+.}-{3:3}, at: hci_conn_hash_flush+0xb8/0x220 net/bluetooth/hci_conn.c:1624

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #3 (hci_cb_list_lock){+.+.}-{3:3}:
lock_acquire+0x1db/0x4f0 kernel/locking/lockdep.c:5623
__mutex_lock_common+0x1da/0x25a0 kernel/locking/mutex.c:596
__mutex_lock kernel/locking/mutex.c:729 [inline]
mutex_lock_nested+0x17/0x20 kernel/locking/mutex.c:743
hci_connect_cfm include/net/bluetooth/hci_core.h:1508 [inline]
hci_remote_features_evt+0x6d1/0xb60 net/bluetooth/hci_event.c:3336
hci_event_packet+0x6fe/0x1550 net/bluetooth/hci_event.c:6398
hci_rx_work+0x232/0x990 net/bluetooth/hci_core.c:5155
process_one_work+0x8a1/0x10c0 kernel/workqueue.c:2310
worker_thread+0xaca/0x1280 kernel/workqueue.c:2457
kthread+0x3f6/0x4f0 kernel/kthread.c:319
ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:298

-> #2 (&hdev->lock){+.+.}-{3:3}:
lock_acquire+0x1db/0x4f0 kernel/locking/lockdep.c:5623
__mutex_lock_common+0x1da/0x25a0 kernel/locking/mutex.c:596
__mutex_lock kernel/locking/mutex.c:729 [inline]
mutex_lock_nested+0x17/0x20 kernel/locking/mutex.c:743
sco_sock_connect+0x181/0x8e0 net/bluetooth/sco.c:587
__sys_connect_file net/socket.c:1918 [inline]
__sys_connect+0x38b/0x410 net/socket.c:1935
__do_sys_connect net/socket.c:1945 [inline]
__se_sys_connect net/socket.c:1942 [inline]
__x64_sys_connect+0x76/0x80 net/socket.c:1942
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x61/0xcb

-> #1 (sk_lock-AF_BLUETOOTH-BTPROTO_SCO){+.+.}-{0:0}:
lock_acquire+0x1db/0x4f0 kernel/locking/lockdep.c:5623
lock_sock_nested+0x44/0x100 net/core/sock.c:3239
lock_sock include/net/sock.h:1668 [inline]
sco_sock_timeout+0xbd/0x230 net/bluetooth/sco.c:96
process_one_work+0x8a1/0x10c0 kernel/workqueue.c:2310
worker_thread+0xaca/0x1280 kernel/workqueue.c:2457
kthread+0x3f6/0x4f0 kernel/kthread.c:319
ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:298

-> #0 ((work_completion)(&(&conn->timeout_work)->work)){+.+.}-{0:0}:
check_prev_add kernel/locking/lockdep.c:3053 [inline]
check_prevs_add kernel/locking/lockdep.c:3172 [inline]
validate_chain+0x1649/0x5930 kernel/locking/lockdep.c:3788
__lock_acquire+0x1295/0x1ff0 kernel/locking/lockdep.c:5012
lock_acquire+0x1db/0x4f0 kernel/locking/lockdep.c:5623
__flush_work+0xeb/0x1a0 kernel/workqueue.c:3090
__cancel_work_timer+0x519/0x6a0 kernel/workqueue.c:3181
sco_conn_del+0x205/0x300 net/bluetooth/sco.c:204
hci_disconn_cfm include/net/bluetooth/hci_core.h:1526 [inline]
hci_conn_hash_flush+0x10d/0x220 net/bluetooth/hci_conn.c:1624
hci_dev_do_close+0x9f6/0x1070 net/bluetooth/hci_core.c:1795
hci_error_reset+0xeb/0x190 net/bluetooth/hci_core.c:2340
process_one_work+0x8a1/0x10c0 kernel/workqueue.c:2310
worker_thread+0xaca/0x1280 kernel/workqueue.c:2457
kthread+0x3f6/0x4f0 kernel/kthread.c:319
ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:298

other info that might help us debug this:

Chain exists of:
(work_completion)(&(&conn->timeout_work)->work) --> &hdev->lock --> hci_cb_list_lock

Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(hci_cb_list_lock);
                               lock(&hdev->lock);
                               lock(hci_cb_list_lock);
  lock((work_completion)(&(&conn->timeout_work)->work));

*** DEADLOCK ***
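
To make the reported cycle easier to follow, here is a minimal sketch that models the dependency chain above as a directed graph, with each edge meaning "held, then acquired". It is an illustrative simplification, not lockdep's actual algorithm. The shortened lock names and the helper functions `find_path` and `closes_cycle` are hypothetical. The edges are transcribed from chain entries #1 to #3, and the new acquisition (#0) is the flush of `conn->timeout_work` while `hci_cb_list_lock` is held.

```python
# Existing dependencies from the report, as "held -> then acquired":
#   #1: sco_sock_timeout (running as conn->timeout_work) takes the SCO socket lock
#   #2: sco_sock_connect takes &hdev->lock with the socket lock held
#   #3: hci_remote_features_evt takes hci_cb_list_lock with &hdev->lock held
existing = {
    "conn->timeout_work": ["sk_lock-AF_BLUETOOTH-BTPROTO_SCO"],
    "sk_lock-AF_BLUETOOTH-BTPROTO_SCO": ["&hdev->lock"],
    "&hdev->lock": ["hci_cb_list_lock"],
}


def find_path(graph, start, goal):
    """Return a list of nodes leading from start to goal, or None (iterative DFS)."""
    stack = [(start, [start])]
    seen = set()
    while stack:
        node, path = stack.pop()
        if node == goal:
            return path
        if node in seen:
            continue
        seen.add(node)
        for nxt in graph.get(node, []):
            stack.append((nxt, path + [nxt]))
    return None


def closes_cycle(graph, held, wanted):
    """A new edge held -> wanted closes a cycle if wanted already reaches held."""
    return find_path(graph, wanted, held)


# New acquisition (#0): sco_conn_del flushes conn->timeout_work
# while hci_cb_list_lock is held.
cycle = closes_cycle(existing, "hci_cb_list_lock", "conn->timeout_work")
print(" --> ".join(cycle) if cycle else "no cycle")
```

The path found runs from `conn->timeout_work` through the socket lock and `&hdev->lock` to `hci_cb_list_lock`; the new acquisition adds the edge back to the start, which is the circular dependency lockdep is warning about.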

5 locks held by kworker/u5:0/146:
#0: ffff888088696138 ((wq_completion)hci8){+.+.}-{0:0}, at: process_one_work+0x78a/0x10c0 kernel/workqueue.c:2283
#1: ffffc9000179fd20 ((work_completion)(&hdev->error_reset)){+.+.}-{0:0}, at: process_one_work+0x7d0/0x10c0 kernel/workqueue.c:2285
#2: ffff888088410ff0 (&hdev->req_lock){+.+.}-{3:3}, at: hci_dev_do_close+0x63/0x1070 net/bluetooth/hci_core.c:1737
#3: ffff888088410078 (&hdev->lock){+.+.}-{3:3}, at: hci_dev_do_close+0x431/0x1070 net/bluetooth/hci_core.c:1782
#4: ffffffff8db25028 (hci_cb_list_lock){+.+.}-{3:3}, at: hci_disconn_cfm include/net/bluetooth/hci_core.h:1523 [inline]
#4: ffffffff8db25028 (hci_cb_list_lock){+.+.}-{3:3}, at: hci_conn_hash_flush+0xb8/0x220 net/bluetooth/hci_conn.c:1624

stack backtrace:
CPU: 1 PID: 146 Comm: kworker/u5:0 Not tainted 5.15.147-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 11/17/2023
Workqueue: hci8 hci_error_reset
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0x1e3/0x2cb lib/dump_stack.c:106
check_noncircular+0x2f8/0x3b0 kernel/locking/lockdep.c:2133
check_prev_add kernel/locking/lockdep.c:3053 [inline]
check_prevs_add kernel/locking/lockdep.c:3172 [inline]
validate_chain+0x1649/0x5930 kernel/locking/lockdep.c:3788
__lock_acquire+0x1295/0x1ff0 kernel/locking/lockdep.c:5012
lock_acquire+0x1db/0x4f0 kernel/locking/lockdep.c:5623
__flush_work+0xeb/0x1a0 kernel/workqueue.c:3090
__cancel_work_timer+0x519/0x6a0 kernel/workqueue.c:3181
sco_conn_del+0x205/0x300 net/bluetooth/sco.c:204
hci_disconn_cfm include/net/bluetooth/hci_core.h:1526 [inline]
hci_conn_hash_flush+0x10d/0x220 net/bluetooth/hci_conn.c:1624
hci_dev_do_close+0x9f6/0x1070 net/bluetooth/hci_core.c:1795
hci_error_reset+0xeb/0x190 net/bluetooth/hci_core.c:2340
process_one_work+0x8a1/0x10c0 kernel/workqueue.c:2310
worker_thread+0xaca/0x1280 kernel/workqueue.c:2457
kthread+0x3f6/0x4f0 kernel/kthread.c:319
ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:298
</TASK>


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzk...@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

If the report is already addressed, let syzbot know by replying with:
#syz fix: exact-commit-title

If you want to overwrite the report's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)

If the report is a duplicate of another one, reply with:
#syz dup: exact-subject-of-another-report

If you want to undo deduplication, reply with:
#syz undup