[syzbot] INFO: task hung in io_uring_del_tctx_node (2)

16 views
Skip to first unread message

syzbot

unread,
Mar 15, 2022, 11:47:24ā€ÆAM3/15/22
to asml.s...@gmail.com, ax...@kernel.dk, io-u...@vger.kernel.org, linux-...@vger.kernel.org, syzkall...@googlegroups.com
Hello,

syzbot found the following issue on:

HEAD commit: dda64ead7e82 Merge tag 'trace-v5.17-rc6' of git://git.kern..
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=15d7671d700000
kernel config: https://syzkaller.appspot.com/x/.config?x=aba0ab2928a512c2
dashboard link: https://syzkaller.appspot.com/bug?extid=771a9fd5d128e0a5708c
compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=11cb0461700000

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+771a9f...@syzkaller.appspotmail.com

INFO: task syz-executor.1:31000 blocked for more than 143 seconds.
Not tainted 5.17.0-rc7-syzkaller-00200-gdda64ead7e82 #0
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:syz-executor.1 state:D stack:29160 pid:31000 ppid: 15800 flags:0x00004004
Call Trace:
<TASK>
context_switch kernel/sched/core.c:4995 [inline]
__schedule+0xa94/0x4910 kernel/sched/core.c:6304
schedule+0xd2/0x260 kernel/sched/core.c:6377
schedule_preempt_disabled+0xf/0x20 kernel/sched/core.c:6436
__mutex_lock_common kernel/locking/mutex.c:673 [inline]
__mutex_lock+0xa32/0x12f0 kernel/locking/mutex.c:733
io_uring_del_tctx_node+0x109/0x20a fs/io_uring.c:9875
io_uring_clean_tctx fs/io_uring.c:9891 [inline]
io_uring_cancel_generic+0x5c3/0x695 fs/io_uring.c:9969
io_uring_files_cancel include/linux/io_uring.h:16 [inline]
do_exit+0x523/0x2a30 kernel/exit.c:761
do_group_exit+0xd2/0x2f0 kernel/exit.c:935
get_signal+0x45a/0x2490 kernel/signal.c:2863
arch_do_signal_or_restart+0x2a9/0x1c40 arch/x86/kernel/signal.c:868
handle_signal_work kernel/entry/common.c:148 [inline]
exit_to_user_mode_loop kernel/entry/common.c:172 [inline]
exit_to_user_mode_prepare+0x17d/0x290 kernel/entry/common.c:207
__syscall_exit_to_user_mode_work kernel/entry/common.c:289 [inline]
syscall_exit_to_user_mode+0x19/0x60 kernel/entry/common.c:300
do_syscall_64+0x42/0xb0 arch/x86/entry/common.c:86
entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7fe6cca58049
RSP: 002b:00007fe6cc1cd218 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca
RAX: fffffffffffffe00 RBX: 00007fe6ccb6af68 RCX: 00007fe6cca58049
RDX: 0000000000000000 RSI: 0000000000000080 RDI: 00007fe6ccb6af68
RBP: 00007fe6ccb6af60 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00007fe6ccb6af6c
R13: 00007ffc7aa5c47f R14: 00007fe6cc1cd300 R15: 0000000000022000
</TASK>

Showing all locks held in the system:
1 lock held by khungtaskd/27:
#0: ffffffff8bb820a0 (rcu_read_lock){....}-{1:2}, at: debug_show_all_locks+0x53/0x260 kernel/locking/lockdep.c:6460
2 locks held by getty/3275:
#0: ffff88814aeae098 (&tty->ldisc_sem){++++}-{0:0}, at: tty_ldisc_ref_wait+0x22/0x80 drivers/tty/tty_ldisc.c:244
#1: ffffc90002b5e2e8 (&ldata->atomic_read_lock){+.+.}-{3:3}, at: n_tty_read+0xcf0/0x1230 drivers/tty/n_tty.c:2075
1 lock held by syz-executor.1/31000:
#0: ffff88801f8d10a8 (&ctx->uring_lock){+.+.}-{3:3}, at: io_uring_del_tctx_node+0x109/0x20a fs/io_uring.c:9875
2 locks held by syz-executor.1/31016:

=============================================

NMI backtrace for cpu 1
CPU: 1 PID: 27 Comm: khungtaskd Not tainted 5.17.0-rc7-syzkaller-00200-gdda64ead7e82 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106
nmi_cpu_backtrace.cold+0x47/0x144 lib/nmi_backtrace.c:111
nmi_trigger_cpumask_backtrace+0x1e6/0x230 lib/nmi_backtrace.c:62
trigger_all_cpu_backtrace include/linux/nmi.h:146 [inline]
check_hung_uninterruptible_tasks kernel/hung_task.c:212 [inline]
watchdog+0xc1d/0xf50 kernel/hung_task.c:369
kthread+0x2e9/0x3a0 kernel/kthread.c:377
ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:295
</TASK>
Sending NMI from CPU 1 to CPUs 0:
NMI backtrace for cpu 0
CPU: 0 PID: 3752 Comm: kworker/u4:5 Not tainted 5.17.0-rc7-syzkaller-00200-gdda64ead7e82 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Workqueue: bat_events batadv_nc_worker
RIP: 0010:__lock_is_held kernel/locking/lockdep.c:5380 [inline]
RIP: 0010:lock_is_held_type+0xa7/0x140 kernel/locking/lockdep.c:5682
Code: 12 e9 8a 00 00 00 83 c3 01 41 3b 9c 24 58 0a 00 00 7d 7d 48 63 c3 48 89 ee 48 8d 04 80 4d 8d 7c c5 00 4c 89 ff e8 99 fe ff ff <85> c0 74 d8 41 83 fe ff 41 bd 01 00 00 00 74 14 31 c0 41 f6 47 22
RSP: 0018:ffffc90002db7b00 EFLAGS: 00000096
RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000001
RDX: 0000000000000000 RSI: ffffffff8bb81fe0 RDI: ffff888075e36b08
RBP: ffffffff8bb81fe0 R08: 0000000000000000 R09: ffffffff8d93f017
R10: fffffbfff1b27e02 R11: 0000000000000000 R12: ffff888075e36080
R13: ffff888075e36ae0 R14: 00000000ffffffff R15: ffff888075e36b08
FS: 0000000000000000(0000) GS:ffff8880b9c00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000055f753521608 CR3: 000000002103b000 CR4: 00000000003506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
lock_is_held include/linux/lockdep.h:283 [inline]
rcu_read_lock_sched_held+0x3a/0x70 kernel/rcu/update.c:125
trace_lock_release include/trace/events/lock.h:58 [inline]
lock_release+0x522/0x720 kernel/locking/lockdep.c:5650
__raw_spin_unlock_bh include/linux/spinlock_api_smp.h:165 [inline]
_raw_spin_unlock_bh+0x12/0x30 kernel/locking/spinlock.c:210
spin_unlock_bh include/linux/spinlock.h:394 [inline]
batadv_nc_purge_paths+0x2a5/0x3a0 net/batman-adv/network-coding.c:471
batadv_nc_worker+0x8f9/0xfa0 net/batman-adv/network-coding.c:720
process_one_work+0x9ac/0x1650 kernel/workqueue.c:2307
worker_thread+0x657/0x1110 kernel/workqueue.c:2454
kthread+0x2e9/0x3a0 kernel/kthread.c:377
ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:295
</TASK>
----------------
Code disassembly (best guess):
0: 12 e9 adc %cl,%ch
2: 8a 00 mov (%rax),%al
4: 00 00 add %al,(%rax)
6: 83 c3 01 add $0x1,%ebx
9: 41 3b 9c 24 58 0a 00 cmp 0xa58(%r12),%ebx
10: 00
11: 7d 7d jge 0x90
13: 48 63 c3 movslq %ebx,%rax
16: 48 89 ee mov %rbp,%rsi
19: 48 8d 04 80 lea (%rax,%rax,4),%rax
1d: 4d 8d 7c c5 00 lea 0x0(%r13,%rax,8),%r15
22: 4c 89 ff mov %r15,%rdi
25: e8 99 fe ff ff callq 0xfffffec3
* 2a: 85 c0 test %eax,%eax <-- trapping instruction
2c: 74 d8 je 0x6
2e: 41 83 fe ff cmp $0xffffffff,%r14d
32: 41 bd 01 00 00 00 mov $0x1,%r13d
38: 74 14 je 0x4e
3a: 31 c0 xor %eax,%eax
3c: 41 rex.B
3d: f6 .byte 0xf6
3e: 47 rex.RXB
3f: 22 .byte 0x22


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzk...@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
syzbot can test patches for this issue, for details see:
https://goo.gl/tpsmEJ#testing-patches

syzbot

unread,
Jul 10, 2022, 2:35:30ā€ÆAM7/10/22
to asml.s...@gmail.com, ax...@kernel.dk, io-u...@vger.kernel.org, linux-...@vger.kernel.org, syzkall...@googlegroups.com
syzbot has found a reproducer for the following issue on:

HEAD commit: cb71b93c2dc3 Add linux-next specific files for 20220628
git tree: linux-next
console+strace: https://syzkaller.appspot.com/x/log.txt?x=1779602c080000
kernel config: https://syzkaller.appspot.com/x/.config?x=badbc1adb2d582eb
dashboard link: https://syzkaller.appspot.com/bug?extid=771a9fd5d128e0a5708c
compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=17011514080000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=1204baa2080000

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+771a9f...@syzkaller.appspotmail.com

INFO: task syz-executor616:3671 blocked for more than 143 seconds.
Not tainted 5.19.0-rc4-next-20220628-syzkaller #0
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:syz-executor616 state:D stack:27536 pid: 3671 ppid: 3600 flags:0x00104004
Call Trace:
<TASK>
context_switch kernel/sched/core.c:5184 [inline]
__schedule+0xa09/0x4f10 kernel/sched/core.c:6496
schedule+0xd2/0x1f0 kernel/sched/core.c:6568
schedule_preempt_disabled+0xf/0x20 kernel/sched/core.c:6627
__mutex_lock_common kernel/locking/mutex.c:679 [inline]
__mutex_lock+0xa70/0x1350 kernel/locking/mutex.c:747
io_uring_del_tctx_node+0x110/0x2b6 io_uring/tctx.c:175
io_uring_clean_tctx+0xce/0x174 io_uring/tctx.c:191
io_uring_cancel_generic+0x5aa/0x602 io_uring/io_uring.c:2885
io_uring_files_cancel include/linux/io_uring.h:44 [inline]
do_exit+0x4f9/0x29f0 kernel/exit.c:750
do_group_exit+0xd2/0x2f0 kernel/exit.c:925
__do_sys_exit_group kernel/exit.c:936 [inline]
__se_sys_exit_group kernel/exit.c:934 [inline]
__x64_sys_exit_group+0x3a/0x50 kernel/exit.c:934
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x46/0xb0
RIP: 0033:0x7ff32cbe16f9
RSP: 002b:00007ffcbffc2738 EFLAGS: 00000246 ORIG_RAX: 00000000000000e7
RAX: ffffffffffffffda RBX: 00007ff32cc563f0 RCX: 00007ff32cbe16f9
RDX: 000000000000003c RSI: 00000000000000e7 RDI: 0000000000000000
RBP: 0000000000000000 R08: ffffffffffffffc0 R09: 0000000000004140
R10: 0000000000008011 R11: 0000000000000246 R12: 00007ff32cc563f0
R13: 0000000000000001 R14: 0000000000000000 R15: 0000000000000001
</TASK>

Showing all locks held in the system:
1 lock held by rcu_tasks_kthre/12:
#0: ffffffff8bd864f0 (rcu_tasks.tasks_gp_mutex){+.+.}-{3:3}, at: rcu_tasks_one_gp+0x26/0xc70 kernel/rcu/tasks.h:507
1 lock held by rcu_tasks_trace/13:
#0: ffffffff8bd861f0 (rcu_tasks_trace.tasks_gp_mutex){+.+.}-{3:3}, at: rcu_tasks_one_gp+0x26/0xc70 kernel/rcu/tasks.h:507
1 lock held by khungtaskd/28:
#0: ffffffff8bd87040 (rcu_read_lock){....}-{1:2}, at: debug_show_all_locks+0x53/0x260 kernel/locking/lockdep.c:6491
2 locks held by getty/3275:
#0: ffff88807f2e8098 (&tty->ldisc_sem){++++}-{0:0}, at: tty_ldisc_ref_wait+0x22/0x80 drivers/tty/tty_ldisc.c:244
#1: ffffc90002d162f0 (&ldata->atomic_read_lock){+.+.}-{3:3}, at: n_tty_read+0xe50/0x13c0 drivers/tty/n_tty.c:2177
1 lock held by syz-executor616/3671:
#0: ffff88807eb3c0a8 (&ctx->uring_lock){+.+.}-{3:3}, at: io_uring_del_tctx_node+0x110/0x2b6 io_uring/tctx.c:175
5 locks held by iou-sqp-3671/3672:

=============================================

NMI backtrace for cpu 1
CPU: 1 PID: 28 Comm: khungtaskd Not tainted 5.19.0-rc4-next-20220628-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 06/29/2022
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106
nmi_cpu_backtrace.cold+0x47/0x144 lib/nmi_backtrace.c:111
nmi_trigger_cpumask_backtrace+0x1e6/0x230 lib/nmi_backtrace.c:62
trigger_all_cpu_backtrace include/linux/nmi.h:146 [inline]
check_hung_uninterruptible_tasks kernel/hung_task.c:212 [inline]
watchdog+0xc18/0xf50 kernel/hung_task.c:369
kthread+0x2e9/0x3a0 kernel/kthread.c:376
ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:302
</TASK>
Sending NMI from CPU 1 to CPUs 0:
NMI backtrace for cpu 0
CPU: 0 PID: 3672 Comm: iou-sqp-3671 Not tainted 5.19.0-rc4-next-20220628-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 06/29/2022
RIP: 0010:get_current arch/x86/include/asm/current.h:15 [inline]
RIP: 0010:__sanitizer_cov_trace_pc+0x13/0x60 kernel/kcov.c:199
Code: 02 66 0f 1f 44 00 00 48 8b be a8 01 00 00 e8 b4 ff ff ff 31 c0 c3 90 65 8b 05 29 b8 87 7e 89 c1 48 8b 34 24 81 e1 00 01 00 00 <65> 48 8b 14 25 80 6f 02 00 a9 00 01 ff 00 74 0e 85 c9 74 35 8b 82
RSP: 0018:ffffc9000313f650 EFLAGS: 00000246
RAX: 0000000080000000 RBX: ffff88813fffc320 RCX: 0000000000000000
RDX: ffff8880220a8000 RSI: ffffffff81a97d33 RDI: 0000000000000004
RBP: 0000000000000004 R08: 0000000000000004 R09: 0000000000000004
R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000000
R13: ffff88813fffa000 R14: ffffc9000313f7e0 R15: dffffc0000000000
FS: 000055555676a300(0000) GS:ffff8880b9a00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007ff32cc59370 CR3: 0000000024fc8000 CR4: 00000000003506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
zonelist_zone_idx include/linux/mmzone.h:1160 [inline]
next_zones_zonelist include/linux/mmzone.h:1191 [inline]
shrink_zones mm/vmscan.c:3474 [inline]
do_try_to_free_pages+0x403/0x16f0 mm/vmscan.c:3590
try_to_free_mem_cgroup_pages+0x31e/0x920 mm/vmscan.c:3904
try_charge_memcg+0x500/0x1440 mm/memcontrol.c:2756
obj_cgroup_charge_pages mm/memcontrol.c:3163 [inline]
__memcg_kmem_charge_page+0x183/0x3d0 mm/memcontrol.c:3189
__alloc_pages+0x1ef/0x510 mm/page_alloc.c:5523
alloc_pages+0x1aa/0x310 mm/mempolicy.c:2280
io_refill_buffer_cache io_uring/kbuf.c:379 [inline]
io_add_buffers io_uring/kbuf.c:405 [inline]
io_provide_buffers+0x87d/0xff0 io_uring/kbuf.c:457
io_issue_sqe+0x15e/0xd20 io_uring/io_uring.c:1601
io_queue_sqe io_uring/io_uring.c:1778 [inline]
io_submit_sqe io_uring/io_uring.c:2036 [inline]
io_submit_sqes+0x9a6/0x1ec0 io_uring/io_uring.c:2147
__io_sq_thread io_uring/sqpoll.c:193 [inline]
io_sq_thread+0x708/0xf60 io_uring/sqpoll.c:252
ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:302
</TASK>
----------------
Code disassembly (best guess), 1 bytes skipped:
0: 66 0f 1f 44 00 00 nopw 0x0(%rax,%rax,1)
6: 48 8b be a8 01 00 00 mov 0x1a8(%rsi),%rdi
d: e8 b4 ff ff ff callq 0xffffffc6
12: 31 c0 xor %eax,%eax
14: c3 retq
15: 90 nop
16: 65 8b 05 29 b8 87 7e mov %gs:0x7e87b829(%rip),%eax # 0x7e87b846
1d: 89 c1 mov %eax,%ecx
1f: 48 8b 34 24 mov (%rsp),%rsi
23: 81 e1 00 01 00 00 and $0x100,%ecx
* 29: 65 48 8b 14 25 80 6f mov %gs:0x26f80,%rdx <-- trapping instruction
30: 02 00
32: a9 00 01 ff 00 test $0xff0100,%eax
37: 74 0e je 0x47
39: 85 c9 test %ecx,%ecx
3b: 74 35 je 0x72
3d: 8b .byte 0x8b
3e: 82 .byte 0x82

Hillf Danton

unread,
Jul 10, 2022, 6:08:01ā€ÆAM7/10/22
to syzbot, linux-...@vger.kernel.org, syzkall...@googlegroups.com
On Sat, 09 Jul 2022 23:35:29 -0700
Do not charge IO worker by default.

#syz test https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git cb71b93c2dc3

--- y/mm/memcontrol.c
+++ m/mm/memcontrol.c
@@ -3184,6 +3184,8 @@ int __memcg_kmem_charge_page(struct page
struct obj_cgroup *objcg;
int ret = 0;

+ if (current->flags & PF_IO_WORKER)
+ return 0;
objcg = get_obj_cgroup_from_current();
if (objcg) {
ret = obj_cgroup_charge_pages(objcg, gfp, 1 << order);
--

syzbot

unread,
Jul 10, 2022, 8:09:07ā€ÆAM7/10/22
to hda...@sina.com, linux-...@vger.kernel.org, syzkall...@googlegroups.com
Hello,

syzbot has tested the proposed patch and the reproducer did not trigger any issue:

Reported-and-tested-by: syzbot+771a9f...@syzkaller.appspotmail.com

Tested on:

commit: cb71b93c Add linux-next specific files for 20220628
git tree: https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
console output: https://syzkaller.appspot.com/x/log.txt?x=11f557ec080000
kernel config: https://syzkaller.appspot.com/x/.config?x=badbc1adb2d582eb
dashboard link: https://syzkaller.appspot.com/bug?extid=771a9fd5d128e0a5708c
compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
patch: https://syzkaller.appspot.com/x/patch.diff?x=113c2eb7f00000

Note: testing is done by a robot and is best-effort only.

syzbot

unread,
Jul 30, 2023, 6:24:48ā€ÆPM7/30/23
to syzkall...@googlegroups.com
Auto-closing this bug as obsolete.
No recent activity, existing reproducers are no longer triggering the issue.
Reply all
Reply to author
Forward
0 new messages