[syzbot] INFO: task can't die in __lock_sock

28 views
Skip to first unread message

syzbot

unread,
Aug 15, 2021, 12:47:23 PM8/15/21
to da...@davemloft.net, johan....@gmail.com, ku...@kernel.org, linux-b...@vger.kernel.org, linux-...@vger.kernel.org, luiz....@gmail.com, mar...@holtmann.org, net...@vger.kernel.org, syzkall...@googlegroups.com
Hello,

syzbot found the following issue on:

HEAD commit: 4b358aabb93a Add linux-next specific files for 20210813
git tree: linux-next
console output: https://syzkaller.appspot.com/x/log.txt?x=1603f181300000
kernel config: https://syzkaller.appspot.com/x/.config?x=b99612666fbe2d6a
dashboard link: https://syzkaller.appspot.com/bug?extid=7d51f807c81b190a127d
compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.1

Unfortunately, I don't have any reproducer for this issue yet.

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+7d51f8...@syzkaller.appspotmail.com

INFO: task syz-executor.4:21120 can't die for more than 143 seconds.
task:syz-executor.4 state:D stack:28448 pid:21120 ppid: 6572 flags:0x00004004
Call Trace:
context_switch kernel/sched/core.c:4711 [inline]
__schedule+0x93a/0x26f0 kernel/sched/core.c:5966
schedule+0xd3/0x270 kernel/sched/core.c:6045
__lock_sock+0x13d/0x260 net/core/sock.c:2645
lock_sock_nested+0xf6/0x120 net/core/sock.c:3178
lock_sock include/net/sock.h:1612 [inline]
bt_sock_wait_state+0x249/0x590 net/bluetooth/af_bluetooth.c:557
rfcomm_sock_connect+0x3a5/0x460 net/bluetooth/rfcomm/sock.c:416
__sys_connect_file+0x155/0x1a0 net/socket.c:1890
__sys_connect+0x161/0x190 net/socket.c:1907
__do_sys_connect net/socket.c:1917 [inline]
__se_sys_connect net/socket.c:1914 [inline]
__x64_sys_connect+0x6f/0xb0 net/socket.c:1914
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x4665e9
RSP: 002b:00007fa7b02bf188 EFLAGS: 00000246 ORIG_RAX: 000000000000002a
RAX: ffffffffffffffda RBX: 000000000056bf80 RCX: 00000000004665e9
RDX: 0000000000000080 RSI: 0000000020000000 RDI: 0000000000000004
RBP: 00000000004bfcc4 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 000000000056bf80
R13: 00007fffd83567bf R14: 00007fa7b02bf300 R15: 0000000000022000
INFO: task syz-executor.4:21120 blocked for more than 143 seconds.
Not tainted 5.14.0-rc5-next-20210813-syzkaller #0
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:syz-executor.4 state:D stack:28448 pid:21120 ppid: 6572 flags:0x00004004
Call Trace:
context_switch kernel/sched/core.c:4711 [inline]
__schedule+0x93a/0x26f0 kernel/sched/core.c:5966
schedule+0xd3/0x270 kernel/sched/core.c:6045
__lock_sock+0x13d/0x260 net/core/sock.c:2645
lock_sock_nested+0xf6/0x120 net/core/sock.c:3178
lock_sock include/net/sock.h:1612 [inline]
bt_sock_wait_state+0x249/0x590 net/bluetooth/af_bluetooth.c:557
rfcomm_sock_connect+0x3a5/0x460 net/bluetooth/rfcomm/sock.c:416
__sys_connect_file+0x155/0x1a0 net/socket.c:1890
__sys_connect+0x161/0x190 net/socket.c:1907
__do_sys_connect net/socket.c:1917 [inline]
__se_sys_connect net/socket.c:1914 [inline]
__x64_sys_connect+0x6f/0xb0 net/socket.c:1914
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x4665e9
RSP: 002b:00007fa7b02bf188 EFLAGS: 00000246 ORIG_RAX: 000000000000002a
RAX: ffffffffffffffda RBX: 000000000056bf80 RCX: 00000000004665e9
RDX: 0000000000000080 RSI: 0000000020000000 RDI: 0000000000000004
RBP: 00000000004bfcc4 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 000000000056bf80
R13: 00007fffd83567bf R14: 00007fa7b02bf300 R15: 0000000000022000
INFO: task syz-executor.4:21124 can't die for more than 143 seconds.
task:syz-executor.4 state:D stack:29104 pid:21124 ppid: 6572 flags:0x00000004
Call Trace:
context_switch kernel/sched/core.c:4711 [inline]
__schedule+0x93a/0x26f0 kernel/sched/core.c:5966
schedule+0xd3/0x270 kernel/sched/core.c:6045
__lock_sock+0x13d/0x260 net/core/sock.c:2645
lock_sock_nested+0xf6/0x120 net/core/sock.c:3178
lock_sock include/net/sock.h:1612 [inline]
rfcomm_sk_state_change+0xb4/0x390 net/bluetooth/rfcomm/sock.c:73
__rfcomm_dlc_close+0x1b6/0x8a0 net/bluetooth/rfcomm/core.c:489
rfcomm_dlc_close+0x1ea/0x240 net/bluetooth/rfcomm/core.c:520
__rfcomm_sock_close+0xac/0x260 net/bluetooth/rfcomm/sock.c:220
rfcomm_sock_shutdown+0xe9/0x210 net/bluetooth/rfcomm/sock.c:931
__sys_shutdown_sock net/socket.c:2242 [inline]
__sys_shutdown_sock net/socket.c:2236 [inline]
__sys_shutdown+0xf1/0x1b0 net/socket.c:2254
__do_sys_shutdown net/socket.c:2262 [inline]
__se_sys_shutdown net/socket.c:2260 [inline]
__x64_sys_shutdown+0x50/0x70 net/socket.c:2260
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x4665e9
RSP: 002b:00007fa7b029e188 EFLAGS: 00000246 ORIG_RAX: 0000000000000030
RAX: ffffffffffffffda RBX: 000000000056c038 RCX: 00000000004665e9
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000004
RBP: 00000000004bfcc4 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 000000000056c038
R13: 00007fffd83567bf R14: 00007fa7b029e300 R15: 0000000000022000
INFO: task syz-executor.4:21124 blocked for more than 143 seconds.
Not tainted 5.14.0-rc5-next-20210813-syzkaller #0
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:syz-executor.4 state:D stack:29104 pid:21124 ppid: 6572 flags:0x00000004
Call Trace:
context_switch kernel/sched/core.c:4711 [inline]
__schedule+0x93a/0x26f0 kernel/sched/core.c:5966
schedule+0xd3/0x270 kernel/sched/core.c:6045
__lock_sock+0x13d/0x260 net/core/sock.c:2645
lock_sock_nested+0xf6/0x120 net/core/sock.c:3178
lock_sock include/net/sock.h:1612 [inline]
rfcomm_sk_state_change+0xb4/0x390 net/bluetooth/rfcomm/sock.c:73
__rfcomm_dlc_close+0x1b6/0x8a0 net/bluetooth/rfcomm/core.c:489
rfcomm_dlc_close+0x1ea/0x240 net/bluetooth/rfcomm/core.c:520
__rfcomm_sock_close+0xac/0x260 net/bluetooth/rfcomm/sock.c:220
rfcomm_sock_shutdown+0xe9/0x210 net/bluetooth/rfcomm/sock.c:931
__sys_shutdown_sock net/socket.c:2242 [inline]
__sys_shutdown_sock net/socket.c:2236 [inline]
__sys_shutdown+0xf1/0x1b0 net/socket.c:2254
__do_sys_shutdown net/socket.c:2262 [inline]
__se_sys_shutdown net/socket.c:2260 [inline]
__x64_sys_shutdown+0x50/0x70 net/socket.c:2260
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x4665e9
RSP: 002b:00007fa7b029e188 EFLAGS: 00000246 ORIG_RAX: 0000000000000030
RAX: ffffffffffffffda RBX: 000000000056c038 RCX: 00000000004665e9
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000004
RBP: 00000000004bfcc4 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 000000000056c038
R13: 00007fffd83567bf R14: 00007fa7b029e300 R15: 0000000000022000
INFO: lockdep is turned off.
NMI backtrace for cpu 1
CPU: 1 PID: 27 Comm: khungtaskd Not tainted 5.14.0-rc5-next-20210813-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106
nmi_cpu_backtrace.cold+0x47/0x144 lib/nmi_backtrace.c:105
nmi_trigger_cpumask_backtrace+0x1ae/0x220 lib/nmi_backtrace.c:62
trigger_all_cpu_backtrace include/linux/nmi.h:146 [inline]
check_hung_uninterruptible_tasks kernel/hung_task.c:254 [inline]
watchdog+0xcb7/0xed0 kernel/hung_task.c:339
kthread+0x3e5/0x4d0 kernel/kthread.c:319
ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:295
Sending NMI from CPU 1 to CPUs 0:
NMI backtrace for cpu 0
CPU: 0 PID: 6267 Comm: in:imklog Not tainted 5.14.0-rc5-next-20210813-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:__sanitizer_cov_trace_pc+0x37/0x60 kernel/kcov.c:197
Code: 81 e1 00 01 00 00 65 48 8b 14 25 40 f0 01 00 a9 00 01 ff 00 74 0e 85 c9 74 35 8b 82 3c 15 00 00 85 c0 74 2b 8b 82 18 15 00 00 <83> f8 02 75 20 48 8b 8a 20 15 00 00 8b 92 1c 15 00 00 48 8b 01 48
RSP: 0018:ffffc9000c577648 EFLAGS: 00000246
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
RDX: ffff8880775b9c80 RSI: ffffffff83f260b5 RDI: 0000000000000003
RBP: 0000000000000000 R08: 000000000000002f R09: 0000000000000000
R10: ffffffff83f26037 R11: 0000000000000000 R12: 0000000000000004
R13: ffffffff898ca243 R14: ffffc9008c577a4f R15: ffffc9000c577a51
FS: 00007f9245542700(0000) GS:ffff8880b9c00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f9a4b72b008 CR3: 000000006fe68000 CR4: 00000000001506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
number+0x205/0xae0 lib/vsprintf.c:490
vsnprintf+0xf09/0x14f0 lib/vsprintf.c:2875
sprintf+0xc0/0x100 lib/vsprintf.c:3011
print_syslog kernel/printk/printk.c:1257 [inline]
info_print_prefix+0x2d5/0x340 kernel/printk/printk.c:1287
record_print_text+0x14d/0x3e0 kernel/printk/printk.c:1339
syslog_print+0x48c/0x580 kernel/printk/printk.c:1539
do_syslog.part.0+0x202/0x640 kernel/printk/printk.c:1658
do_syslog+0x49/0x60 kernel/printk/printk.c:1643
kmsg_read+0x90/0xb0 fs/proc/kmsg.c:40
pde_read fs/proc/inode.c:311 [inline]
proc_reg_read+0x119/0x300 fs/proc/inode.c:321
vfs_read+0x1b5/0x600 fs/read_write.c:494
ksys_read+0x12d/0x250 fs/read_write.c:634
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7f9247b8522d
Code: c1 20 00 00 75 10 b8 00 00 00 00 0f 05 48 3d 01 f0 ff ff 73 31 c3 48 83 ec 08 e8 4e fc ff ff 48 89 04 24 b8 00 00 00 00 0f 05 <48> 8b 3c 24 48 89 c2 e8 97 fc ff ff 48 89 d0 48 83 c4 08 48 3d 01
RSP: 002b:00007f9245521580 EFLAGS: 00000293 ORIG_RAX: 0000000000000000
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f9247b8522d
RDX: 0000000000001fa0 RSI: 00007f9245521da0 RDI: 0000000000000004
RBP: 000055eb855a59d0 R08: 0000000000000000 R09: 0000000004000001
R10: 0000000000000001 R11: 0000000000000293 R12: 00007f9245521da0
R13: 0000000000001fa0 R14: 0000000000001f9f R15: 00007f9245521dd7
----------------
Code disassembly (best guess):
0: 81 e1 00 01 00 00 and $0x100,%ecx
6: 65 48 8b 14 25 40 f0 mov %gs:0x1f040,%rdx
d: 01 00
f: a9 00 01 ff 00 test $0xff0100,%eax
14: 74 0e je 0x24
16: 85 c9 test %ecx,%ecx
18: 74 35 je 0x4f
1a: 8b 82 3c 15 00 00 mov 0x153c(%rdx),%eax
20: 85 c0 test %eax,%eax
22: 74 2b je 0x4f
24: 8b 82 18 15 00 00 mov 0x1518(%rdx),%eax
2a: 83 f8 02 cmp $0x2,%eax <-- trapping instruction
2d: 75 20 jne 0x4f
2f: 48 8b 8a 20 15 00 00 mov 0x1520(%rdx),%rcx
36: 8b 92 1c 15 00 00 mov 0x151c(%rdx),%edx
3c: 48 8b 01 mov (%rcx),%rax
3f: 48 rex.W


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzk...@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

syzbot

unread,
Sep 1, 2021, 1:34:22 PM9/1/21
to da...@davemloft.net, desmond...@gmail.com, johan....@gmail.com, ku...@kernel.org, linux-b...@vger.kernel.org, linux-...@vger.kernel.org, luiz....@gmail.com, mar...@holtmann.org, net...@vger.kernel.org, syzkall...@googlegroups.com
syzbot has found a reproducer for the following issue on:

HEAD commit: c1b13fe76e95 Add linux-next specific files for 20210901
git tree: linux-next
console output: https://syzkaller.appspot.com/x/log.txt?x=12c6034d300000
kernel config: https://syzkaller.appspot.com/x/.config?x=e2afff7bc32736e5
dashboard link: https://syzkaller.appspot.com/bug?extid=7d51f807c81b190a127d
compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.1
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=14d42469300000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=1107d815300000

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+7d51f8...@syzkaller.appspotmail.com

INFO: task syz-executor157:6562 blocked for more than 143 seconds.
Not tainted 5.14.0-next-20210901-syzkaller #0
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:syz-executor157 state:D stack:26880 pid: 6562 ppid: 6530 flags:0x00004006
Call Trace:
context_switch kernel/sched/core.c:4955 [inline]
__schedule+0x940/0x26f0 kernel/sched/core.c:6302
schedule+0xd3/0x270 kernel/sched/core.c:6381
__lock_sock+0x13d/0x260 net/core/sock.c:2644
lock_sock_nested+0xf6/0x120 net/core/sock.c:3185
lock_sock include/net/sock.h:1612 [inline]
rfcomm_sk_state_change+0xb4/0x390 net/bluetooth/rfcomm/sock.c:73
__rfcomm_dlc_close+0x1b6/0x8a0 net/bluetooth/rfcomm/core.c:489
rfcomm_dlc_close+0x1ea/0x240 net/bluetooth/rfcomm/core.c:520
__rfcomm_sock_close+0xac/0x260 net/bluetooth/rfcomm/sock.c:220
rfcomm_sock_shutdown+0xe9/0x210 net/bluetooth/rfcomm/sock.c:931
rfcomm_sock_release+0x5f/0x140 net/bluetooth/rfcomm/sock.c:951
__sock_release+0xcd/0x280 net/socket.c:649
sock_close+0x18/0x20 net/socket.c:1314
__fput+0x288/0x9f0 fs/file_table.c:280
task_work_run+0xdd/0x1a0 kernel/task_work.c:164
exit_task_work include/linux/task_work.h:32 [inline]
do_exit+0xbae/0x2a30 kernel/exit.c:825
do_group_exit+0x125/0x310 kernel/exit.c:922
get_signal+0x47f/0x2160 kernel/signal.c:2868
arch_do_signal_or_restart+0x2a9/0x1c40 arch/x86/kernel/signal.c:865
handle_signal_work kernel/entry/common.c:148 [inline]
exit_to_user_mode_loop kernel/entry/common.c:172 [inline]
exit_to_user_mode_prepare+0x17d/0x290 kernel/entry/common.c:209
__syscall_exit_to_user_mode_work kernel/entry/common.c:291 [inline]
syscall_exit_to_user_mode+0x19/0x60 kernel/entry/common.c:302
do_syscall_64+0x42/0xb0 arch/x86/entry/common.c:86
entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x445fe9
RSP: 002b:00007fff85049fe8 EFLAGS: 00000246 ORIG_RAX: 000000000000002a
RAX: fffffffffffffffc RBX: 0000000000000003 RCX: 0000000000445fe9
RDX: 0000000000000080 RSI: 0000000020000000 RDI: 0000000000000004
RBP: 0000000000000003 R08: 000000ff00000001 R09: 000000ff00000001
R10: 0000000000000000 R11: 0000000000000246 R12: 00000000014112b8
R13: 0000000000000072 R14: 00007fff8504a040 R15: 0000000000000003

Showing all locks held in the system:
1 lock held by khungtaskd/26:
#0: ffffffff8b97fbe0 (rcu_read_lock){....}-{1:2}, at: debug_show_all_locks+0x53/0x260 kernel/locking/lockdep.c:6446
1 lock held by krfcommd/2876:
#0: ffffffff8d31ede8 (rfcomm_mutex){+.+.}-{3:3}, at: rfcomm_process_sessions net/bluetooth/rfcomm/core.c:1979 [inline]
#0: ffffffff8d31ede8 (rfcomm_mutex){+.+.}-{3:3}, at: rfcomm_run+0x2ed/0x4a20 net/bluetooth/rfcomm/core.c:2086
1 lock held by in:imklog/6232:
4 locks held by syz-executor157/6562:
#0: ffff888145e26210 (&sb->s_type->i_mutex_key#13){+.+.}-{3:3}, at: inode_lock include/linux/fs.h:786 [inline]
#0: ffff888145e26210 (&sb->s_type->i_mutex_key#13){+.+.}-{3:3}, at: __sock_release+0x86/0x280 net/socket.c:648
#1: ffff88801d622120 (sk_lock-AF_BLUETOOTH-BTPROTO_RFCOMM){+.+.}-{0:0}, at: lock_sock include/net/sock.h:1612 [inline]
#1: ffff88801d622120 (sk_lock-AF_BLUETOOTH-BTPROTO_RFCOMM){+.+.}-{0:0}, at: rfcomm_sock_shutdown+0x54/0x210 net/bluetooth/rfcomm/sock.c:928
#2: ffffffff8d31ede8 (rfcomm_mutex){+.+.}-{3:3}, at: rfcomm_dlc_close+0x34/0x240 net/bluetooth/rfcomm/core.c:507
#3: ffff88807edd9928 (&d->lock){+.+.}-{3:3}, at: __rfcomm_dlc_close+0x162/0x8a0 net/bluetooth/rfcomm/core.c:487

=============================================

NMI backtrace for cpu 1
CPU: 1 PID: 26 Comm: khungtaskd Not tainted 5.14.0-next-20210901-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106
nmi_cpu_backtrace.cold+0x47/0x144 lib/nmi_backtrace.c:105
nmi_trigger_cpumask_backtrace+0x1ae/0x220 lib/nmi_backtrace.c:62
trigger_all_cpu_backtrace include/linux/nmi.h:146 [inline]
check_hung_uninterruptible_tasks kernel/hung_task.c:254 [inline]
watchdog+0xcb7/0xed0 kernel/hung_task.c:339
kthread+0x3e5/0x4d0 kernel/kthread.c:319
ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:295
Sending NMI from CPU 1 to CPUs 0:
NMI backtrace for cpu 0
CPU: 0 PID: 2958 Comm: systemd-journal Not tainted 5.14.0-next-20210901-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:check_kcov_mode kernel/kcov.c:163 [inline]
RIP: 0010:__sanitizer_cov_trace_pc+0x7/0x60 kernel/kcov.c:197
Code: fd ff ff b9 ff ff ff ff ba 08 00 00 00 4d 8b 03 48 0f bd ca 49 8b 45 00 48 63 c9 e9 64 ff ff ff 0f 1f 00 65 8b 05 39 e6 8b 7e <89> c1 48 8b 34 24 81 e1 00 01 00 00 65 48 8b 14 25 40 f0 01 00 a9
RSP: 0018:ffffc900014dfde0 EFLAGS: 00000282
RAX: 0000000080000000 RBX: ffffc900014dff58 RCX: 1ffff9200029bfc7
RDX: dffffc0000000000 RSI: 1ffff9200029bfcd RDI: ffffc900014dfe38
RBP: 0000000000000000 R08: 0000000000000000 R09: ffffffff8176c71a
R10: ffffffff81765c97 R11: 0000000000000002 R12: 0000000000000053
R13: 0000000000000002 R14: 0000000000000000 R15: ffffc900014dfe30
FS: 00007f43756768c0(0000) GS:ffff8880b9c00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f4372a49000 CR3: 000000001a5d4000 CR4: 00000000001506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
get_current arch/x86/include/asm/current.h:15 [inline]
seccomp_run_filters kernel/seccomp.c:402 [inline]
__seccomp_filter+0x88/0x1040 kernel/seccomp.c:1180
__secure_computing+0xfc/0x360 kernel/seccomp.c:1311
syscall_trace_enter.constprop.0+0x94/0x270 kernel/entry/common.c:68
do_syscall_64+0x16/0xb0 arch/x86/entry/common.c:76
entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7f4374931687
Code: 00 b8 ff ff ff ff c3 0f 1f 40 00 48 8b 05 09 d8 2b 00 64 c7 00 5f 00 00 00 b8 ff ff ff ff c3 0f 1f 40 00 b8 53 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d e1 d7 2b 00 f7 d8 64 89 01 48
RSP: 002b:00007ffc79978938 EFLAGS: 00000293 ORIG_RAX: 0000000000000053
RAX: ffffffffffffffda RBX: 00007ffc7997b850 RCX: 00007f4374931687
RDX: 00007f43753a2a00 RSI: 00000000000001ed RDI: 00005646c59898a0
RBP: 00007ffc79978970 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000069 R11: 0000000000000293 R12: 0000000000000000
R13: 0000000000000000 R14: 00007ffc7997b850 R15: 00007ffc79978e60
----------------
Code disassembly (best guess), 3 bytes skipped:
0: b9 ff ff ff ff mov $0xffffffff,%ecx
5: ba 08 00 00 00 mov $0x8,%edx
a: 4d 8b 03 mov (%r11),%r8
d: 48 0f bd ca bsr %rdx,%rcx
11: 49 8b 45 00 mov 0x0(%r13),%rax
15: 48 63 c9 movslq %ecx,%rcx
18: e9 64 ff ff ff jmpq 0xffffff81
1d: 0f 1f 00 nopl (%rax)
20: 65 8b 05 39 e6 8b 7e mov %gs:0x7e8be639(%rip),%eax # 0x7e8be660
* 27: 89 c1 mov %eax,%ecx <-- trapping instruction
29: 48 8b 34 24 mov (%rsp),%rsi
2d: 81 e1 00 01 00 00 and $0x100,%ecx
33: 65 48 8b 14 25 40 f0 mov %gs:0x1f040,%rdx
3a: 01 00
3c: a9 .byte 0xa9

syzbot

unread,
Sep 1, 2021, 9:34:12 PM9/1/21
to da...@davemloft.net, desmond...@gmail.com, johan....@gmail.com, ku...@kernel.org, linux-b...@vger.kernel.org, linux-...@vger.kernel.org, luiz....@gmail.com, luiz.vo...@intel.com, mar...@holtmann.org, net...@vger.kernel.org, syzkall...@googlegroups.com
syzbot has bisected this issue to:

commit b7ce436a5d798bc59e71797952566608a4b4626b
Author: Desmond Cheong Zhi Xi <desmond...@gmail.com>
Date: Tue Aug 10 04:14:09 2021 +0000

Bluetooth: switch to lock_sock in RFCOMM

bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=16f90ffe300000
start commit: 29ce8f970107 Merge git://git.kernel.org/pub/scm/linux/kern..
git tree: net-next
final oops: https://syzkaller.appspot.com/x/report.txt?x=15f90ffe300000
console output: https://syzkaller.appspot.com/x/log.txt?x=11f90ffe300000
kernel config: https://syzkaller.appspot.com/x/.config?x=d2f9d4c9ff8c5ae7
dashboard link: https://syzkaller.appspot.com/bug?extid=7d51f807c81b190a127d
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=1630a66d300000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=16d07c6d300000

Reported-by: syzbot+7d51f8...@syzkaller.appspotmail.com
Fixes: b7ce436a5d79 ("Bluetooth: switch to lock_sock in RFCOMM")

For information about bisection process see: https://goo.gl/tpsmEJ#bisection

Hillf Danton

unread,
Sep 1, 2021, 11:18:13 PM9/1/21
to syzbot, desmond...@gmail.com, Eric Dumazet, linux-b...@vger.kernel.org, linux-...@vger.kernel.org, luiz....@gmail.com, net...@vger.kernel.org, syzkall...@googlegroups.com
On Wed, 01 Sep 2021 10:34:21 -0700
> syzbot has found a reproducer for the following issue on:
>
> HEAD commit: c1b13fe76e95 Add linux-next specific files for 20210901
> git tree: linux-next
> console output: https://syzkaller.appspot.com/x/log.txt?x=12c6034d300000
> kernel config: https://syzkaller.appspot.com/x/.config?x=e2afff7bc32736e5
> dashboard link: https://syzkaller.appspot.com/bug?extid=7d51f807c81b190a127d
> compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.1
> syz repro: https://syzkaller.appspot.com/x/repro.syz?x=14d42469300000
> C reproducer: https://syzkaller.appspot.com/x/repro.c?x=1107d815300000
>
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+7d51f8...@syzkaller.appspotmail.com
>
> INFO: task syz-executor157:6562 blocked for more than 143 seconds.
> Not tainted 5.14.0-next-20210901-syzkaller #0
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> task:syz-executor157 state:D stack:26880 pid: 6562 ppid: 6530 flags:0x00004006
> Call Trace:
> context_switch kernel/sched/core.c:4955 [inline]
> __schedule+0x940/0x26f0 kernel/sched/core.c:6302
> schedule+0xd3/0x270 kernel/sched/core.c:6381
> __lock_sock+0x13d/0x260 net/core/sock.c:2644
> lock_sock_nested+0xf6/0x120 net/core/sock.c:3185
> lock_sock include/net/sock.h:1612 [inline]

This is due to b7ce436a5d79 ("Bluetooth: switch to lock_sock in RFCOMM").
But sk is already locked before b7ce436a5d79.

What is wierd here is lock_sock() fails to complain about recursive locking
like this one if syzbot turned lockdep on. Any light on this, Eric?

Thanks
Hillf

Desmond Cheong Zhi Xi

unread,
Sep 2, 2021, 3:54:40 PM9/2/21
to Hillf Danton, syzbot, Eric Dumazet, linux-b...@vger.kernel.org, linux-...@vger.kernel.org, luiz....@gmail.com, net...@vger.kernel.org, syzkall...@googlegroups.com
Sorry, this one was my bad. The patch swapped out spin_lock_bh for
lock_sock, to provide synchronization with other functions that use
lock_sock.

Problem is that in one of the call traces, we hit the following deadlock:

rfcomm_sock_close():
lock_sock();
__rfcomm_sock_close():
rfcomm_dlc_close():
__rfcomm_dlc_close():
rfcomm_sk_state_change():
lock_sock();

But we don't always hold onto the socket lock before calling
rfcomm_sk_state_change.

I'm still working and testing a fix, but I think one possibility is to
schedule rfcomm_sk_state_change on a workqueue. This seems to fit with
the rest of the code, since in rfcomm_sock_shutdown we call
rfcomm_sock_close then wait for the sk state to change to BT_CLOSED.

Hillf Danton

unread,
Sep 2, 2021, 11:31:17 PM9/2/21
to Desmond Cheong Zhi Xi, Eric Dumazet, syzbot, linux-b...@vger.kernel.org, linux-...@vger.kernel.org, luiz....@gmail.com, net...@vger.kernel.org, syzkall...@googlegroups.com
Hi Desmond,
>
>Sorry, this one was my bad.

No, I see b7ce436a5d79 as the MVP in 2021 so far as it in combination with
the smart syzbot lifts up the lid that sk fails to detect deadlock in the
simple scenario like

lock_sock(sk);
lock_sock(sk);

and I could not see it without your work.

Worse is task hang was reported while the reason behind is deadlock.

Thanks again
Hillf

Hillf Danton

unread,
Sep 20, 2021, 10:24:03 AM9/20/21
to syzbot, desmond...@gmail.com, Eric Dumazet, hda...@sina.com, linux-...@vger.kernel.org, net...@vger.kernel.org, syzkall...@googlegroups.com
On Wed, 01 Sep 2021 18:34:11 -0700
The diff is trying to catch the recursive locking before b7ce436a5d79, which
is task hang in this report instead.

#syz test: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master

--- x/net/core/sock.c
+++ y/net/core/sock.c
@@ -3180,6 +3180,8 @@ EXPORT_SYMBOL(sock_init_data);
void lock_sock_nested(struct sock *sk, int subclass)
{
might_sleep();
+ mutex_acquire(&sk->sk_lock.dep_map, subclass, 0, _RET_IP_);
+ mutex_release(&sk->sk_lock.dep_map, _RET_IP_);
spin_lock_bh(&sk->sk_lock.slock);
if (sk->sk_lock.owned)
__lock_sock(sk);
--

syzbot

unread,
Sep 20, 2021, 11:50:08 AM9/20/21
to desmond...@gmail.com, edum...@google.com, hda...@sina.com, linux-...@vger.kernel.org, net...@vger.kernel.org, syzkall...@googlegroups.com
Hello,

syzbot has tested the proposed patch but the reproducer is still triggering an issue:
possible deadlock in rfcomm_sk_state_change

============================================
WARNING: possible recursive locking detected
5.15.0-rc2-syzkaller #0 Not tainted
--------------------------------------------
syz-executor.0/9050 is trying to acquire lock:
ffff88807ce5d120 (sk_lock-AF_BLUETOOTH-BTPROTO_RFCOMM){+.+.}-{0:0}, at: lock_sock include/net/sock.h:1612 [inline]
ffff88807ce5d120 (sk_lock-AF_BLUETOOTH-BTPROTO_RFCOMM){+.+.}-{0:0}, at: rfcomm_sk_state_change+0xb4/0x390 net/bluetooth/rfcomm/sock.c:73

but task is already holding lock:
ffff88807ce5d120 (sk_lock-AF_BLUETOOTH-BTPROTO_RFCOMM){+.+.}-{0:0}, at: lock_sock include/net/sock.h:1612 [inline]
ffff88807ce5d120 (sk_lock-AF_BLUETOOTH-BTPROTO_RFCOMM){+.+.}-{0:0}, at: rfcomm_sock_shutdown+0x54/0x210 net/bluetooth/rfcomm/sock.c:928

other info that might help us debug this:
Possible unsafe locking scenario:

CPU0
----
lock(sk_lock-AF_BLUETOOTH-BTPROTO_RFCOMM);
lock(sk_lock-AF_BLUETOOTH-BTPROTO_RFCOMM);

*** DEADLOCK ***

May be due to missing lock nesting notation

4 locks held by syz-executor.0/9050:
#0: ffff88806c8fa010 (&sb->s_type->i_mutex_key#13){+.+.}-{3:3}, at: inode_lock include/linux/fs.h:786 [inline]
#0: ffff88806c8fa010 (&sb->s_type->i_mutex_key#13){+.+.}-{3:3}, at: __sock_release+0x86/0x280 net/socket.c:648
#1: ffff88807ce5d120 (sk_lock-AF_BLUETOOTH-BTPROTO_RFCOMM){+.+.}-{0:0}, at: lock_sock include/net/sock.h:1612 [inline]
#1: ffff88807ce5d120 (sk_lock-AF_BLUETOOTH-BTPROTO_RFCOMM){+.+.}-{0:0}, at: rfcomm_sock_shutdown+0x54/0x210 net/bluetooth/rfcomm/sock.c:928
#2: ffffffff8d320f28 (rfcomm_mutex){+.+.}-{3:3}, at: rfcomm_dlc_close+0x34/0x240 net/bluetooth/rfcomm/core.c:507
#3: ffff88806875dd28 (&d->lock){+.+.}-{3:3}, at: __rfcomm_dlc_close+0x162/0x8a0 net/bluetooth/rfcomm/core.c:487

stack backtrace:
CPU: 1 PID: 9050 Comm: syz-executor.0 Not tainted 5.15.0-rc2-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106
print_deadlock_bug kernel/locking/lockdep.c:2944 [inline]
check_deadlock kernel/locking/lockdep.c:2987 [inline]
validate_chain kernel/locking/lockdep.c:3776 [inline]
__lock_acquire.cold+0x149/0x3ab kernel/locking/lockdep.c:5015
lock_acquire kernel/locking/lockdep.c:5625 [inline]
lock_acquire+0x1ab/0x510 kernel/locking/lockdep.c:5590
lock_sock_nested+0x4e/0x140 net/core/sock.c:3183
lock_sock include/net/sock.h:1612 [inline]
rfcomm_sk_state_change+0xb4/0x390 net/bluetooth/rfcomm/sock.c:73
__rfcomm_dlc_close+0x1b6/0x8a0 net/bluetooth/rfcomm/core.c:489
rfcomm_dlc_close+0x1ea/0x240 net/bluetooth/rfcomm/core.c:520
__rfcomm_sock_close+0xac/0x260 net/bluetooth/rfcomm/sock.c:220
rfcomm_sock_shutdown+0xe9/0x210 net/bluetooth/rfcomm/sock.c:931
rfcomm_sock_release+0x5f/0x140 net/bluetooth/rfcomm/sock.c:951
__sock_release+0xcd/0x280 net/socket.c:649
sock_close+0x18/0x20 net/socket.c:1314
__fput+0x288/0x9f0 fs/file_table.c:280
task_work_run+0xdd/0x1a0 kernel/task_work.c:164
get_signal+0x1b35/0x2160 kernel/signal.c:2641
arch_do_signal_or_restart+0x2a9/0x1c40 arch/x86/kernel/signal.c:865
handle_signal_work kernel/entry/common.c:148 [inline]
exit_to_user_mode_loop kernel/entry/common.c:172 [inline]
exit_to_user_mode_prepare+0x17d/0x290 kernel/entry/common.c:209
__syscall_exit_to_user_mode_work kernel/entry/common.c:291 [inline]
syscall_exit_to_user_mode+0x19/0x60 kernel/entry/common.c:302
do_syscall_64+0x42/0xb0 arch/x86/entry/common.c:86
entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x4665f9
Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f0f295fa188 EFLAGS: 00000246 ORIG_RAX: 000000000000002a
RAX: fffffffffffffffc RBX: 000000000056bf80 RCX: 00000000004665f9
RDX: 0000000000000080 RSI: 0000000020000000 RDI: 0000000000000004
RBP: 00000000004bfcc4 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 000000000056bf80
R13: 00007fff56050a0f R14: 00007f0f295fa300 R15: 0000000000022000


Tested on:

commit: e4e737bb Linux 5.15-rc2
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=1256b6e3300000
kernel config: https://syzkaller.appspot.com/x/.config?x=25af12c0a765245
dashboard link: https://syzkaller.appspot.com/bug?extid=7d51f807c81b190a127d
compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
patch: https://syzkaller.appspot.com/x/patch.diff?x=10e5ef77300000

Thomas Gleixner

unread,
Sep 22, 2021, 10:16:41 AM9/22/21
to syzbot, desmond...@gmail.com, edum...@google.com, hda...@sina.com, linux-...@vger.kernel.org, net...@vger.kernel.org, syzkall...@googlegroups.com
On Mon, Sep 20 2021 at 08:50, syzbot wrote:
> syzbot has tested the proposed patch but the reproducer is still triggering an issue:
> possible deadlock in rfcomm_sk_state_change
>
> ============================================
> WARNING: possible recursive locking detected
> 5.15.0-rc2-syzkaller #0 Not tainted
> --------------------------------------------
> syz-executor.0/9050 is trying to acquire lock:
> ffff88807ce5d120 (sk_lock-AF_BLUETOOTH-BTPROTO_RFCOMM){+.+.}-{0:0}, at: lock_sock include/net/sock.h:1612 [inline]
> ffff88807ce5d120 (sk_lock-AF_BLUETOOTH-BTPROTO_RFCOMM){+.+.}-{0:0}, at: rfcomm_sk_state_change+0xb4/0x390 net/bluetooth/rfcomm/sock.c:73
>
> but task is already holding lock:
> ffff88807ce5d120 (sk_lock-AF_BLUETOOTH-BTPROTO_RFCOMM){+.+.}-{0:0}, at: lock_sock include/net/sock.h:1612 [inline]
> ffff88807ce5d120 (sk_lock-AF_BLUETOOTH-BTPROTO_RFCOMM){+.+.}-{0:0}, at: rfcomm_sock_shutdown+0x54/0x210 net/bluetooth/rfcomm/sock.c:928

it's not only possible recursion. It's real. Same lock instance and the
stack trace tells how this happens

lock_sock_nested+0x4e/0x140 net/core/sock.c:3183
lock_sock include/net/sock.h:1612 [inline]

Lock is already held. See below.

rfcomm_sk_state_change+0xb4/0x390 net/bluetooth/rfcomm/sock.c:73
__rfcomm_dlc_close+0x1b6/0x8a0 net/bluetooth/rfcomm/core.c:489
rfcomm_dlc_close+0x1ea/0x240 net/bluetooth/rfcomm/core.c:520
__rfcomm_sock_close+0xac/0x260 net/bluetooth/rfcomm/sock.c:220

sock lock is held from here.

rfcomm_sock_shutdown+0xe9/0x210 net/bluetooth/rfcomm/sock.c:931
rfcomm_sock_release+0x5f/0x140 net/bluetooth/rfcomm/sock.c:951
__sock_release+0xcd/0x280 net/socket.c:649
sock_close+0x18/0x20 net/socket.c:1314
__fput+0x288/0x9f0 fs/file_table.c:280
task_work_run+0xdd/0x1a0 kernel/task_work.c:164

I assume that the lock_sock*() lockdep change was applied on top of
Linus tree. The previous reports were showing lockups IIRC because
lockdep had no chance to see that due to the placement of the acquire
annotation.

Thanks,

tglx

Desmond Cheong Zhi Xi

unread,
Oct 4, 2021, 1:50:27 AM10/4/21
to syzbot, syzkall...@googlegroups.com
0001-test.patch

syzbot

unread,
Oct 4, 2021, 2:10:12 AM10/4/21
to desmond...@gmail.com, syzkall...@googlegroups.com
Hello,

syzbot has tested the proposed patch and the reproducer did not trigger any issue:

Reported-and-tested-by: syzbot+7d51f8...@syzkaller.appspotmail.com

Tested on:

commit: 4539ca67 Bluetooth: Rename driver .prevent_wake to .wa..
git tree: git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next.git master
kernel config: https://syzkaller.appspot.com/x/.config?x=a86ff78615d8caa2
dashboard link: https://syzkaller.appspot.com/bug?extid=7d51f807c81b190a127d
compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
patch: https://syzkaller.appspot.com/x/patch.diff?x=1519ce3f300000

Note: testing is done by a robot and is best-effort only.

Desmond Cheong Zhi Xi

unread,
Oct 4, 2021, 2:51:17 AM10/4/21
to syzbot, syzkall...@googlegroups.com
Missed out some error handling in the previous patch.

Best,
Desmond
0001-test.patch

syzbot

unread,
Oct 4, 2021, 3:10:06 AM10/4/21
to desmond...@gmail.com, syzkall...@googlegroups.com
Hello,

syzbot has tested the proposed patch and the reproducer did not trigger any issue:

Reported-and-tested-by: syzbot+7d51f8...@syzkaller.appspotmail.com

Tested on:

commit: 4539ca67 Bluetooth: Rename driver .prevent_wake to .wa..
git tree: git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next.git master
kernel config: https://syzkaller.appspot.com/x/.config?x=a86ff78615d8caa2
dashboard link: https://syzkaller.appspot.com/bug?extid=7d51f807c81b190a127d
compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
patch: https://syzkaller.appspot.com/x/patch.diff?x=134cd28f300000

Desmond Cheong Zhi Xi

unread,
Aug 28, 2022, 10:55:28 PM8/28/22
to syzbot, syzkall...@googlegroups.com
Testing if the patch still fixes the bug.

Best,
Desmond
0001-Bluetooth-fix-deadlock-for-RFCOMM-sk-state-change.patch

syzbot

unread,
Aug 29, 2022, 4:04:13 AM8/29/22
to desmond...@gmail.com, syzkall...@googlegroups.com
Hello,

syzbot has tested the proposed patch and the reproducer did not trigger any issue:

Reported-and-tested-by: syzbot+7d51f8...@syzkaller.appspotmail.com

Tested on:

commit: b8288548 Bluetooth: hci_sync: hold hdev->lock when cle..
console output: https://syzkaller.appspot.com/x/log.txt?x=167c5e75080000
kernel config: https://syzkaller.appspot.com/x/.config?x=6eb7ef93c1f86f57
dashboard link: https://syzkaller.appspot.com/bug?extid=7d51f807c81b190a127d
compiler: Debian clang version 13.0.1-++20220126092033+75e33f71c2da-1~exp1~20220126212112.63, GNU ld (GNU Binutils for Debian) 2.35.2
patch: https://syzkaller.appspot.com/x/patch.diff?x=1099c063080000

syzbot

unread,
Jan 16, 2024, 8:36:14 AMJan 16
to syzkall...@googlegroups.com
Auto-closing this bug as obsolete.
No recent activity, existing reproducers are no longer triggering the issue.
Reply all
Reply to author
Forward
0 new messages