[v6.1] possible deadlock in hci_conn_hash_flush (2)

syzbot

Feb 27, 2024, 2:43:19 AM
to syzkaller...@googlegroups.com
Hello,

syzbot found the following issue on:

HEAD commit: 81e1dc2f7001 Linux 6.1.79
git tree: linux-6.1.y
console output: https://syzkaller.appspot.com/x/log.txt?x=15057daa180000
kernel config: https://syzkaller.appspot.com/x/.config?x=1f1b2df4155f9cc0
dashboard link: https://syzkaller.appspot.com/bug?extid=5f8bd15002bf22f8adf4
compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40

Unfortunately, I don't have any reproducer for this issue yet.

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/c15e68f002df/disk-81e1dc2f.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/48d036ef2f3c/vmlinux-81e1dc2f.xz
kernel image: https://storage.googleapis.com/syzbot-assets/f372bf435776/bzImage-81e1dc2f.xz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+5f8bd1...@syzkaller.appspotmail.com

Bluetooth: hci0: Controller not accepting commands anymore: ncmd = 0
Bluetooth: hci0: Injecting HCI hardware error event
Bluetooth: hci0: hardware error 0x00
======================================================
WARNING: possible circular locking dependency detected
6.1.79-syzkaller #0 Not tainted
------------------------------------------------------
kworker/u5:3/3585 is trying to acquire lock:
ffff888073524870 ((work_completion)(&(&conn->timeout_work)->work)){+.+.}-{0:0}, at: __flush_work+0xe5/0xad0 kernel/workqueue.c:3072

but task is already holding lock:
ffffffff8e3edb68 (hci_cb_list_lock){+.+.}-{3:3}, at: hci_disconn_cfm include/net/bluetooth/hci_core.h:1800 [inline]
ffffffff8e3edb68 (hci_cb_list_lock){+.+.}-{3:3}, at: hci_conn_hash_flush+0xb8/0x2a0 net/bluetooth/hci_conn.c:2500

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #3 (hci_cb_list_lock){+.+.}-{3:3}:
lock_acquire+0x1f8/0x5a0 kernel/locking/lockdep.c:5662
__mutex_lock_common kernel/locking/mutex.c:603 [inline]
__mutex_lock+0x132/0xd80 kernel/locking/mutex.c:747
hci_connect_cfm include/net/bluetooth/hci_core.h:1785 [inline]
hci_remote_features_evt+0x664/0xab0 net/bluetooth/hci_event.c:3780
hci_event_func net/bluetooth/hci_event.c:7507 [inline]
hci_event_packet+0xaa1/0x1510 net/bluetooth/hci_event.c:7559
hci_rx_work+0x3cd/0xce0 net/bluetooth/hci_core.c:4093
process_one_work+0x8a9/0x11d0 kernel/workqueue.c:2292
worker_thread+0xa47/0x1200 kernel/workqueue.c:2439
kthread+0x28d/0x320 kernel/kthread.c:376
ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:306

-> #2 (&hdev->lock){+.+.}-{3:3}:
lock_acquire+0x1f8/0x5a0 kernel/locking/lockdep.c:5662
__mutex_lock_common kernel/locking/mutex.c:603 [inline]
__mutex_lock+0x132/0xd80 kernel/locking/mutex.c:747
sco_sock_connect+0x181/0x8f0 net/bluetooth/sco.c:593
__sys_connect_file net/socket.c:2006 [inline]
__sys_connect+0x2c9/0x300 net/socket.c:2023
__do_sys_connect net/socket.c:2033 [inline]
__se_sys_connect net/socket.c:2030 [inline]
__x64_sys_connect+0x76/0x80 net/socket.c:2030
do_syscall_x64 arch/x86/entry/common.c:51 [inline]
do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:81
entry_SYSCALL_64_after_hwframe+0x63/0xcd

-> #1 (sk_lock-AF_BLUETOOTH-BTPROTO_SCO){+.+.}-{0:0}:
lock_acquire+0x1f8/0x5a0 kernel/locking/lockdep.c:5662
lock_sock_nested+0x44/0x100 net/core/sock.c:3484
lock_sock include/net/sock.h:1745 [inline]
sco_sock_timeout+0xbd/0x230 net/bluetooth/sco.c:97
process_one_work+0x8a9/0x11d0 kernel/workqueue.c:2292
worker_thread+0xa47/0x1200 kernel/workqueue.c:2439
kthread+0x28d/0x320 kernel/kthread.c:376
ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:306

-> #0 ((work_completion)(&(&conn->timeout_work)->work)){+.+.}-{0:0}:
check_prev_add kernel/locking/lockdep.c:3090 [inline]
check_prevs_add kernel/locking/lockdep.c:3209 [inline]
validate_chain+0x1661/0x5950 kernel/locking/lockdep.c:3825
__lock_acquire+0x125b/0x1f80 kernel/locking/lockdep.c:5049
lock_acquire+0x1f8/0x5a0 kernel/locking/lockdep.c:5662
__flush_work+0xfe/0xad0 kernel/workqueue.c:3072
__cancel_work_timer+0x519/0x6a0 kernel/workqueue.c:3163
sco_conn_del+0x205/0x300 net/bluetooth/sco.c:205
hci_disconn_cfm include/net/bluetooth/hci_core.h:1803 [inline]
hci_conn_hash_flush+0x10e/0x2a0 net/bluetooth/hci_conn.c:2500
hci_dev_close_sync+0x9a9/0xfc0 net/bluetooth/hci_sync.c:4977
hci_dev_do_close net/bluetooth/hci_core.c:554 [inline]
hci_error_reset+0x104/0x250 net/bluetooth/hci_core.c:1059
process_one_work+0x8a9/0x11d0 kernel/workqueue.c:2292
worker_thread+0xa47/0x1200 kernel/workqueue.c:2439
kthread+0x28d/0x320 kernel/kthread.c:376
ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:306

other info that might help us debug this:

Chain exists of:
(work_completion)(&(&conn->timeout_work)->work) --> &hdev->lock --> hci_cb_list_lock

Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(hci_cb_list_lock);
                               lock(&hdev->lock);
                               lock(hci_cb_list_lock);
  lock((work_completion)(&(&conn->timeout_work)->work));

*** DEADLOCK ***

5 locks held by kworker/u5:3/3585:
#0: ffff88807d828138 ((wq_completion)hci0){+.+.}-{0:0}, at: process_one_work+0x7a9/0x11d0 kernel/workqueue.c:2267
#1: ffffc900043ffd20 ((work_completion)(&hdev->error_reset)){+.+.}-{0:0}, at: process_one_work+0x7a9/0x11d0 kernel/workqueue.c:2267
#2: ffff8880217310b8 (&hdev->req_lock){+.+.}-{3:3}, at: hci_dev_do_close net/bluetooth/hci_core.c:552 [inline]
#2: ffff8880217310b8 (&hdev->req_lock){+.+.}-{3:3}, at: hci_error_reset+0xfc/0x250 net/bluetooth/hci_core.c:1059
#3: ffff888021730078 (&hdev->lock){+.+.}-{3:3}, at: hci_dev_close_sync+0x48d/0xfc0 net/bluetooth/hci_sync.c:4964
#4: ffffffff8e3edb68 (hci_cb_list_lock){+.+.}-{3:3}, at: hci_disconn_cfm include/net/bluetooth/hci_core.h:1800 [inline]
#4: ffffffff8e3edb68 (hci_cb_list_lock){+.+.}-{3:3}, at: hci_conn_hash_flush+0xb8/0x2a0 net/bluetooth/hci_conn.c:2500

stack backtrace:
CPU: 1 PID: 3585 Comm: kworker/u5:3 Not tainted 6.1.79-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/25/2024
Workqueue: hci0 hci_error_reset
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0x1e3/0x2cb lib/dump_stack.c:106
check_noncircular+0x2fa/0x3b0 kernel/locking/lockdep.c:2170
check_prev_add kernel/locking/lockdep.c:3090 [inline]
check_prevs_add kernel/locking/lockdep.c:3209 [inline]
validate_chain+0x1661/0x5950 kernel/locking/lockdep.c:3825
__lock_acquire+0x125b/0x1f80 kernel/locking/lockdep.c:5049
lock_acquire+0x1f8/0x5a0 kernel/locking/lockdep.c:5662
__flush_work+0xfe/0xad0 kernel/workqueue.c:3072
__cancel_work_timer+0x519/0x6a0 kernel/workqueue.c:3163
sco_conn_del+0x205/0x300 net/bluetooth/sco.c:205
hci_disconn_cfm include/net/bluetooth/hci_core.h:1803 [inline]
hci_conn_hash_flush+0x10e/0x2a0 net/bluetooth/hci_conn.c:2500
hci_dev_close_sync+0x9a9/0xfc0 net/bluetooth/hci_sync.c:4977
hci_dev_do_close net/bluetooth/hci_core.c:554 [inline]
hci_error_reset+0x104/0x250 net/bluetooth/hci_core.c:1059
process_one_work+0x8a9/0x11d0 kernel/workqueue.c:2292
worker_thread+0xa47/0x1200 kernel/workqueue.c:2439
kthread+0x28d/0x320 kernel/kthread.c:376
ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:306
</TASK>
Bluetooth: hci0: Opcode 0x0c03 failed: -110
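
The dependency chain in the lockdep report above can be read as a small directed graph in which an edge "A -> B" means that B was acquired (or waited on) while A was held. The following is a minimal Python sketch, not lockdep's actual algorithm, that encodes the four edges from entries #0 to #3 and checks whether the new edge closes a cycle. The short node names and the helpers find_path() and closes_cycle() are hypothetical, chosen only for illustration.

```python
from collections import defaultdict

# Lock-order edges as read off the report: (held, acquired).
# Node names are illustrative shorthands, not kernel identifiers.
EXISTING_EDGES = [
    ("timeout_work", "sk_lock_sco"),    # #1: sco_sock_timeout() calls lock_sock()
    ("sk_lock_sco", "hdev_lock"),       # #2: &hdev->lock taken in sco_sock_connect()
    ("hdev_lock", "hci_cb_list_lock"),  # #3: hci_connect_cfm() in hci_remote_features_evt()
]
# #0: sco_conn_del() flushes conn->timeout_work under hci_cb_list_lock.
NEW_EDGE = ("hci_cb_list_lock", "timeout_work")


def find_path(edges, start, goal):
    """Return a list of nodes leading from start to goal, or None."""
    graph = defaultdict(list)
    for held, acquired in edges:
        graph[held].append(acquired)
    stack = [(start, [start])]
    seen = set()
    while stack:
        node, path = stack.pop()
        if node == goal:
            return path
        if node in seen:
            continue
        seen.add(node)
        for nxt in graph[node]:
            stack.append((nxt, path + [nxt]))
    return None


def closes_cycle(existing, new_edge):
    """Return the cycle created by adding new_edge, or None if there is none."""
    held, wanted = new_edge
    # A cycle exists if 'held' is already reachable from 'wanted'.
    path = find_path(existing, wanted, held)
    return None if path is None else [held] + path


if __name__ == "__main__":
    cycle = closes_cycle(EXISTING_EDGES, NEW_EDGE)
    print(" -> ".join(cycle) if cycle else "no cycle")
```

Under these assumptions the sketch walks from the flushed work item back to hci_cb_list_lock, which matches the circular dependency that lockdep warns about.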


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzk...@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

If the report is already addressed, let syzbot know by replying with:
#syz fix: exact-commit-title

If you want to overwrite the report's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)

If the report is a duplicate of another one, reply with:
#syz dup: exact-subject-of-another-report

If you want to undo deduplication, reply with:
#syz undup