[syzbot] INFO: task hung in rfcomm_process_sessions (2)

14 views
Skip to first unread message

syzbot

unread,
Mar 29, 2022, 9:34:21ā€ÆPM3/29/22
to da...@davemloft.net, johan....@gmail.com, ku...@kernel.org, linux-b...@vger.kernel.org, linux-...@vger.kernel.org, luiz....@gmail.com, mar...@holtmann.org, net...@vger.kernel.org, pab...@redhat.com, syzkall...@googlegroups.com
Hello,

syzbot found the following issue on:

HEAD commit: cffb2b72d3ed Merge tag 'kgdb-5.18-rc1' of git://git.kernel..
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=11ae58bb700000
kernel config: https://syzkaller.appspot.com/x/.config?x=69c8957f4ac2dea6
dashboard link: https://syzkaller.appspot.com/bug?extid=d761775dff24be3ad4be
compiler: Debian clang version 11.0.1-2, GNU ld (GNU Binutils for Debian) 2.35.2
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=103e2107700000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=149c7ec7700000

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+d76177...@syzkaller.appspotmail.com

INFO: task krfcommd:2706 blocked for more than 143 seconds.
Tainted: G W 5.17.0-syzkaller-12817-gcffb2b72d3ed #0
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:krfcommd state:D stack:28984 pid: 2706 ppid: 2 flags:0x00004000
Call Trace:
<TASK>
context_switch kernel/sched/core.c:5073 [inline]
__schedule+0x937/0x1090 kernel/sched/core.c:6382
schedule+0xeb/0x1b0 kernel/sched/core.c:6454
schedule_preempt_disabled+0xf/0x20 kernel/sched/core.c:6513
__mutex_lock_common+0xd1f/0x2590 kernel/locking/mutex.c:673
__mutex_lock kernel/locking/mutex.c:733 [inline]
mutex_lock_nested+0x1a/0x20 kernel/locking/mutex.c:785
rfcomm_process_sessions+0x21/0x3f0 net/bluetooth/rfcomm/core.c:2015
rfcomm_run+0x195/0x2c0 net/bluetooth/rfcomm/core.c:2122
kthread+0x2a3/0x2d0 kernel/kthread.c:376
ret_from_fork+0x1f/0x30
</TASK>
INFO: task syz-executor296:3618 blocked for more than 143 seconds.
Tainted: G W 5.17.0-syzkaller-12817-gcffb2b72d3ed #0
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:syz-executor296 state:D stack:26552 pid: 3618 ppid: 3612 flags:0x00004006
Call Trace:
<TASK>
context_switch kernel/sched/core.c:5073 [inline]
__schedule+0x937/0x1090 kernel/sched/core.c:6382
schedule+0xeb/0x1b0 kernel/sched/core.c:6454
__lock_sock+0x1cf/0x330 net/core/sock.c:2770
lock_sock_nested+0x9f/0x100 net/core/sock.c:3317
lock_sock include/net/sock.h:1682 [inline]
rfcomm_sk_state_change+0x63/0x300 net/bluetooth/rfcomm/sock.c:73
__rfcomm_dlc_close+0x2cc/0x480 net/bluetooth/rfcomm/core.c:489
rfcomm_dlc_close+0x10d/0x1c0 net/bluetooth/rfcomm/core.c:520
__rfcomm_sock_close+0x101/0x220 net/bluetooth/rfcomm/sock.c:220
rfcomm_sock_shutdown+0xa5/0x220 net/bluetooth/rfcomm/sock.c:905
rfcomm_sock_release+0x55/0x120 net/bluetooth/rfcomm/sock.c:925
__sock_release net/socket.c:650 [inline]
sock_close+0xd8/0x260 net/socket.c:1318
__fput+0x3f6/0x860 fs/file_table.c:317
task_work_run+0x146/0x1c0 kernel/task_work.c:164
exit_task_work include/linux/task_work.h:32 [inline]
do_exit+0x5e3/0x20f0 kernel/exit.c:794
do_group_exit+0x2af/0x2b0 kernel/exit.c:924
get_signal+0x23e8/0x23f0 kernel/signal.c:2903
arch_do_signal_or_restart+0xa1/0x740 arch/x86/kernel/signal.c:868
handle_signal_work kernel/entry/common.c:149 [inline]
exit_to_user_mode_loop kernel/entry/common.c:187 [inline]
exit_to_user_mode_prepare+0x165/0x200 kernel/entry/common.c:222
__syscall_exit_to_user_mode_work kernel/entry/common.c:304 [inline]
syscall_exit_to_user_mode+0x2e/0x70 kernel/entry/common.c:315
entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7f4e731503f9
RSP: 002b:00007fff14a0e988 EFLAGS: 00000246 ORIG_RAX: 000000000000002a
RAX: fffffffffffffffc RBX: 0000555556a333b8 RCX: 00007f4e731503f9
RDX: 0000000000000080 RSI: 0000000020000000 RDI: 0000000000000004
RBP: 0000000000000003 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000004
R13: 0000000000000072 R14: 00007fff14a0ea10 R15: 0000000000000000
</TASK>
INFO: lockdep is turned off.
NMI backtrace for cpu 1
CPU: 1 PID: 28 Comm: khungtaskd Tainted: G W 5.17.0-syzkaller-12817-gcffb2b72d3ed #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0x1dc/0x2d8 lib/dump_stack.c:106
nmi_cpu_backtrace+0x45f/0x490 lib/nmi_backtrace.c:111
nmi_trigger_cpumask_backtrace+0x16a/0x280 lib/nmi_backtrace.c:62
trigger_all_cpu_backtrace include/linux/nmi.h:146 [inline]
check_hung_uninterruptible_tasks kernel/hung_task.c:212 [inline]
watchdog+0xc82/0xcd0 kernel/hung_task.c:369
kthread+0x2a3/0x2d0 kernel/kthread.c:376
ret_from_fork+0x1f/0x30
</TASK>
Sending NMI from CPU 1 to CPUs 0:
NMI backtrace for cpu 0
CPU: 0 PID: 8 Comm: kworker/u4:0 Tainted: G W 5.17.0-syzkaller-12817-gcffb2b72d3ed #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Workqueue: events_unbound toggle_allocation_gate
RIP: 0010:bcmp+0x1/0x170 lib/string.c:801
Code: 7e fd eb c5 44 89 f9 80 e1 07 38 c1 7c ce 4c 89 ff e8 33 7c 7e fd eb c4 31 ed 89 e8 5b 41 5c 41 5d 41 5e 41 5f 5d c3 66 90 55 <41> 57 41 56 41 55 41 54 53 49 89 d6 49 89 f7 48 89 fb 49 bd 00 00
RSP: 0018:ffffc900000d7930 EFLAGS: 00000297
RAX: 1ffffffff15103dd RBX: ffffffff81d6d87e RCX: dffffc0000000000
RDX: 0000000000000005 RSI: ffffffff8f7f6000 RDI: ffffffff81d6d87e
RBP: ffffc900000d7a80 R08: ffffffff81d6d87e R09: ffffffff845cf58d
R10: 0000000000000003 R11: ffff88813fe6d700 R12: ffffffff8a881e6a
R13: ffffffff8f7f6000 R14: ffffffff8f7f6000 R15: 0000000000000000
FS: 0000000000000000(0000) GS:ffff8880b9a00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000055e1e7f11600 CR3: 000000000ca8e000 CR4: 00000000003506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
__jump_label_patch+0x24d/0x410 arch/x86/kernel/jump_label.c:65
arch_jump_label_transform_queue+0x49/0xd0 arch/x86/kernel/jump_label.c:137
__jump_label_update+0x15d/0x350 kernel/jump_label.c:451
static_key_disable_cpuslocked+0xcc/0x1b0 kernel/jump_label.c:207
static_key_disable+0x16/0x20 kernel/jump_label.c:215
toggle_allocation_gate+0x3c8/0x460 mm/kfence/core.c:793
process_one_work+0x83c/0x11a0 kernel/workqueue.c:2289
worker_thread+0xa6c/0x1290 kernel/workqueue.c:2436
kthread+0x2a3/0x2d0 kernel/kthread.c:376
ret_from_fork+0x1f/0x30
</TASK>
----------------
Code disassembly (best guess):
0: 7e fd jle 0xffffffff
2: eb c5 jmp 0xffffffc9
4: 44 89 f9 mov %r15d,%ecx
7: 80 e1 07 and $0x7,%cl
a: 38 c1 cmp %al,%cl
c: 7c ce jl 0xffffffdc
e: 4c 89 ff mov %r15,%rdi
11: e8 33 7c 7e fd callq 0xfd7e7c49
16: eb c4 jmp 0xffffffdc
18: 31 ed xor %ebp,%ebp
1a: 89 e8 mov %ebp,%eax
1c: 5b pop %rbx
1d: 41 5c pop %r12
1f: 41 5d pop %r13
21: 41 5e pop %r14
23: 41 5f pop %r15
25: 5d pop %rbp
26: c3 retq
27: 66 90 xchg %ax,%ax
29: 55 push %rbp
* 2a: 41 57 push %r15 <-- trapping instruction
2c: 41 56 push %r14
2e: 41 55 push %r13
30: 41 54 push %r12
32: 53 push %rbx
33: 49 89 d6 mov %rdx,%r14
36: 49 89 f7 mov %rsi,%r15
39: 48 89 fb mov %rdi,%rbx
3c: 49 rex.WB
3d: bd .byte 0xbd


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzk...@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
syzbot can test patches for this issue, for details see:
https://goo.gl/tpsmEJ#testing-patches

Hillf Danton

unread,
Mar 30, 2022, 6:26:57ā€ÆAM3/30/22
to syzbot, linux-...@vger.kernel.org, syzkall...@googlegroups.com
On Tue, 29 Mar 2022 18:34:19 -0700
See if it is due to recursive deadlock wrt lock_sock, something like

lock_sock(sk);
...
lock_sock(sk);

Hillf

#syz test: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/ cffb2b72d3ed

--- y/net/bluetooth/rfcomm/sock.c
+++ x/net/bluetooth/rfcomm/sock.c
@@ -64,13 +64,17 @@ static void rfcomm_sk_data_ready(struct
static void rfcomm_sk_state_change(struct rfcomm_dlc *d, int err)
{
struct sock *sk = d->owner, *parent;
+ bool locked = false;

if (!sk)
return;

BT_DBG("dlc %p state %ld err %d", d, d->state, err);

- lock_sock(sk);
+ if (!sk->sk_shutdown) {
+ lock_sock(sk);
+ locked = true;
+ }

if (err)
sk->sk_err = err;
@@ -91,7 +95,8 @@ static void rfcomm_sk_state_change(struc
sk->sk_state_change(sk);
}

- release_sock(sk);
+ if (locked)
+ release_sock(sk);

if (parent && sock_flag(sk, SOCK_ZAPPED)) {
/* We have to drop DLC lock here, otherwise
--

syzbot

unread,
Mar 30, 2022, 7:54:07ā€ÆAM3/30/22
to hda...@sina.com, linux-...@vger.kernel.org, syzkall...@googlegroups.com
Hello,

syzbot has tested the proposed patch and the reproducer did not trigger any issue:

Reported-and-tested-by: syzbot+d76177...@syzkaller.appspotmail.com

Tested on:

commit: cffb2b72 Merge tag 'kgdb-5.18-rc1' of git://git.kernel..
git tree: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
kernel config: https://syzkaller.appspot.com/x/.config?x=69c8957f4ac2dea6
dashboard link: https://syzkaller.appspot.com/bug?extid=d761775dff24be3ad4be
compiler: Debian clang version 11.0.1-2, GNU ld (GNU Binutils for Debian) 2.35.2
patch: https://syzkaller.appspot.com/x/patch.diff?x=121c55b9700000

Note: testing is done by a robot and is best-effort only.

syzbot

unread,
Feb 1, 2023, 11:55:30ā€ÆPM2/1/23
to da...@davemloft.net, edum...@google.com, hda...@sina.com, johan....@gmail.com, ku...@kernel.org, linux-b...@vger.kernel.org, linux-...@vger.kernel.org, luiz....@gmail.com, luiz.vo...@intel.com, mar...@holtmann.org, net...@vger.kernel.org, pab...@redhat.com, syzkall...@googlegroups.com, yin...@chromium.org
syzbot suspects this issue was fixed by commit:

commit 1d80d57ffcb55488f0ec0b77928d4f82d16b6a90
Author: Ying Hsu <yin...@chromium.org>
Date: Wed Jan 11 03:16:14 2023 +0000

Bluetooth: Fix possible deadlock in rfcomm_sk_state_change

bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=154b77b5480000
start commit: d888c83fcec7 fs: fix fd table size alignment properly
git tree: upstream
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=162ca1fd700000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=1036182d700000

If the result looks correct, please mark the issue as fixed by replying with:

#syz fix: Bluetooth: Fix possible deadlock in rfcomm_sk_state_change

For information about bisection process see: https://goo.gl/tpsmEJ#bisection

syzbot

unread,
Feb 2, 2023, 7:28:11ā€ÆAM2/2/23
to yin...@chromium.org, yin...@chromium.org, syzkall...@googlegroups.com
> #syz fix: Bluetooth: Fix possible deadlock in rfcomm_sk_state_change

Your 'fix:' command is accepted, but please keep syzkall...@googlegroups.com mailing list in CC next time. It serves as a history of what happened with each bug report. Thank you.

Hillf Danton

unread,
Feb 2, 2023, 9:01:47ā€ÆAM2/2/23
to syzbot, edum...@google.com, linux-b...@vger.kernel.org, linux-...@vger.kernel.org, luiz....@gmail.com, mar...@holtmann.org, linu...@kvack.org, syzkall...@googlegroups.com, yin...@chromium.org
On Wed, 01 Feb 2023 20:55:28 -0800
If it is the correct fix then why was task hung reported instead of
deadlock? What went wrong with lockdep at cffb2b72d3ed upstream?

Hillf
Reply all
Reply to author
Forward
0 new messages