WARNING in kthread_park

syzbot

Sep 30, 2020, 6:29:22 PM
to ax...@kernel.dk, io-u...@vger.kernel.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org, syzkall...@googlegroups.com, vi...@zeniv.linux.org.uk
Hello,

syzbot found the following issue on:

HEAD commit: d1d2220c Add linux-next specific files for 20200924
git tree: linux-next
console output: https://syzkaller.appspot.com/x/log.txt?x=15b1918d900000
kernel config: https://syzkaller.appspot.com/x/.config?x=254e028a642027c
dashboard link: https://syzkaller.appspot.com/bug?extid=e7eea402700c6db193be
compiler: gcc (GCC) 10.1.0-syz 20200507

Unfortunately, I don't have any reproducer for this issue yet.

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+e7eea4...@syzkaller.appspotmail.com

------------[ cut here ]------------
WARNING: CPU: 1 PID: 28162 at kernel/kthread.c:547 kthread_park+0x17c/0x1b0 kernel/kthread.c:547
Modules linked in:
CPU: 1 PID: 28162 Comm: syz-executor.3 Not tainted 5.9.0-rc6-next-20200924-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:kthread_park+0x17c/0x1b0 kernel/kthread.c:547
Code: 2a 04 27 00 0f 0b e9 fb fe ff ff e8 1e 04 27 00 0f 0b e8 17 04 27 00 41 bc da ff ff ff 5b 44 89 e0 5d 41 5c c3 e8 04 04 27 00 <0f> 0b 41 bc f0 ff ff ff eb be e8 f5 03 27 00 0f 0b eb b2 e8 bc 74
RSP: 0018:ffffc90017ebfd50 EFLAGS: 00010293
RAX: 0000000000000000 RBX: ffff888092c0ca00 RCX: ffffffff814e2ccf
RDX: ffff88804f30a040 RSI: ffffffff814e2d6c RDI: 0000000000000007
RBP: ffff88804f46c440 R08: 0000000000000000 R09: ffff888092c0ca07
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000001
R13: 0000000000000000 R14: ffff888096f7d000 R15: 0000000000000000
FS: 0000000002a7e940(0000) GS:ffff8880ae500000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000001590004 CR3: 000000020a32f000 CR4: 00000000001506e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
io_sq_thread_park fs/io_uring.c:7145 [inline]
io_sq_thread_park fs/io_uring.c:7139 [inline]
io_uring_flush+0x10a6/0x1640 fs/io_uring.c:8596
filp_close+0xb4/0x170 fs/open.c:1276
__close_fd+0x2f/0x50 fs/file.c:671
__do_sys_close fs/open.c:1295 [inline]
__se_sys_close fs/open.c:1293 [inline]
__x64_sys_close+0x69/0x100 fs/open.c:1293
do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x417901
Code: 75 14 b8 03 00 00 00 0f 05 48 3d 01 f0 ff ff 0f 83 a4 1a 00 00 c3 48 83 ec 08 e8 0a fc ff ff 48 89 04 24 b8 03 00 00 00 0f 05 <48> 8b 3c 24 48 89 c2 e8 53 fc ff ff 48 89 d0 48 83 c4 08 48 3d 01
RSP: 002b:00007fffe4c281e0 EFLAGS: 00000293 ORIG_RAX: 0000000000000003
RAX: ffffffffffffffda RBX: 0000000000000004 RCX: 0000000000417901
RDX: 0000000000000000 RSI: ffffffff8840e309 RDI: 0000000000000003
RBP: 0000000000000001 R08: ffffffff8134e496 R09: 0000000092f8bb6d
R10: 00007fffe4c282d0 R11: 0000000000000293 R12: 000000000118d9c0
R13: 000000000118d9c0 R14: ffffffffffffffff R15: 000000000118cf4c


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzk...@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

Hillf Danton

Oct 1, 2020, 1:41:52 AM
to syzbot, ax...@kernel.dk, io-u...@vger.kernel.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org, Pavel Begunkov, Hillf Danton, syzkall...@googlegroups.com, vi...@zeniv.linux.org.uk

On Wed, 30 Sep 2020 15:29:21 -0700
This looks like a leaked park caught at kernel/kthread.c:547, since
io_sq_thread_park() itself is balanced with io_sq_thread_unpark().

The mix of park and stop in io_put_sq_data(), however, is likely to
leave the last kthread_parkme() in io_sq_thread() unattended, because
no unpark follows it.
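
To illustrate the imbalance described above, here is a minimal userspace
sketch in C. It is not kernel code. It assumes the warning at
kernel/kthread.c:547 fires when a park is requested while an earlier park
request is still pending (the -EBUSY path), and it models only that flag.
The names struct model_kthread, model_park() and model_unpark() are
hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model of a parkable thread; not kernel code. */
struct model_kthread {
	bool should_park;	/* stands in for a pending park request */
	int warnings;		/* counts simulated warning hits */
};

/*
 * Request a park. If an earlier park was never undone, record a
 * warning and fail with -EBUSY, which is the assumed behaviour of
 * the check that produced the report.
 */
int model_park(struct model_kthread *t)
{
	assert(t != NULL);
	if (t->should_park) {
		t->warnings++;
		return -EBUSY;
	}
	t->should_park = true;
	return 0;
}

/* Undo a park request so that a later park can succeed again. */
void model_unpark(struct model_kthread *t)
{
	assert(t != NULL);
	t->should_park = false;
}
```

Starting from a zeroed struct model_kthread, a first model_park() returns 0,
a second one without an intervening model_unpark() returns -EBUSY and
raises the warning count to 1, and parking works again after
model_unpark(). Under the stated assumption, this is the pattern a park
with no matching unpark would produce.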

Apart from adding an unpark behind the stop (both park and unpark
would then go before the put), the park is replaced here with a wait,
out of over-caution, to ensure the kthread is reclaimed, at the cost
of a minor tweak to the put function. Another, simpler option is to
cut the park altogether, and we can try that after this one.

--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -240,6 +240,7 @@ struct io_sq_data {

 	struct task_struct *thread;
 	struct wait_queue_head wait;
+	struct completion *stop_done;
 };

 struct io_ring_ctx {
@@ -6887,6 +6888,7 @@ static int io_sq_thread(void *data)
 	}

 	kthread_parkme();
+	complete(sqd->stop_done);

 	return 0;
 }
@@ -7059,21 +7061,20 @@ static int io_sqe_files_unregister(struc
 	return 0;
 }

-static void io_put_sq_data(struct io_sq_data *sqd)
+static void io_put_sq_data(struct io_ring_ctx *ctx)
 {
+	struct io_sq_data *sqd = ctx->sq_data;
+
 	if (refcount_dec_and_test(&sqd->refs)) {
-		/*
-		 * The park is a bit of a work-around, without it we get
-		 * warning spews on shutdown with SQPOLL set and affinity
-		 * set to a single CPU.
-		 */
 		if (sqd->thread) {
-			kthread_park(sqd->thread);
+			reinit_completion(&ctx->sq_thread_comp);
+			sqd->stop_done = &ctx->sq_thread_comp;
 			kthread_stop(sqd->thread);
+			wait_for_completion(&ctx->sq_thread_comp);
 		}
-
 		kfree(sqd);
 	}
+	ctx->sq_data = NULL;
 }

 static struct io_sq_data *io_attach_sq_data(struct io_uring_params *p)
@@ -7169,8 +7170,7 @@ static void io_sq_thread_stop(struct io_
 			io_sq_thread_unpark(sqd);
 		}

-		io_put_sq_data(sqd);
-		ctx->sq_data = NULL;
+		io_put_sq_data(ctx);
 	}
 }


syzbot

Nov 25, 2020, 5:25:11 PM
to syzkall...@googlegroups.com
Auto-closing this bug as obsolete.
Crashes did not happen for a while, no reproducer and no activity.