[syzbot] [xfs?] WARNING in __queue_delayed_work

syzbot

Apr 7, 2023, 1:04:51 PM
to djw...@kernel.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org, linu...@vger.kernel.org, syzkall...@googlegroups.com
Hello,

syzbot found the following issue on:

HEAD commit: 7e364e56293b Linux 6.3-rc5
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=13241195c80000
kernel config: https://syzkaller.appspot.com/x/.config?x=e3b9dc6616d797bb
dashboard link: https://syzkaller.appspot.com/bug?extid=5ed016962f5137a09c7c
compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
userspace arch: i386

Unfortunately, I don't have any reproducer for this issue yet.

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+5ed016...@syzkaller.appspotmail.com

------------[ cut here ]------------
WARNING: CPU: 1 PID: 102 at kernel/workqueue.c:1445 __queue_work+0xd44/0x1120 kernel/workqueue.c:1444
Modules linked in:
CPU: 1 PID: 102 Comm: kswapd0 Not tainted 6.3.0-rc5-syzkaller #0
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.14.0-2 04/01/2014
RIP: 0010:__queue_work+0xd44/0x1120 kernel/workqueue.c:1444
Code: e0 07 83 c0 03 38 d0 7c 09 84 d2 74 05 e8 74 0c 81 00 8b 5b 2c 31 ff 83 e3 20 89 de e8 c5 fb 2f 00 85 db 75 42 e8 6c ff 2f 00 <0f> 0b e9 3c f9 ff ff e8 60 ff 2f 00 0f 0b e9 ce f8 ff ff e8 54 ff
RSP: 0000:ffffc90000ce7638 EFLAGS: 00010093
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
RDX: ffff888015f73a80 RSI: ffffffff8152d854 RDI: 0000000000000005
RBP: 0000000000000002 R08: 0000000000000005 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: ffffe8ffffb03348
R13: ffff888078462000 R14: ffffe8ffffb03390 R15: ffff888078462000
FS: 0000000000000000(0000) GS:ffff88802ca80000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000cfa5bb CR3: 0000000025fde000 CR4: 0000000000150ee0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
__queue_delayed_work+0x1c8/0x270 kernel/workqueue.c:1672
mod_delayed_work_on+0xe1/0x220 kernel/workqueue.c:1746
xfs_inodegc_shrinker_scan fs/xfs/xfs_icache.c:2212 [inline]
xfs_inodegc_shrinker_scan+0x250/0x4f0 fs/xfs/xfs_icache.c:2191
do_shrink_slab+0x428/0xaa0 mm/vmscan.c:853
shrink_slab+0x175/0x660 mm/vmscan.c:1013
shrink_one+0x502/0x810 mm/vmscan.c:5343
shrink_many mm/vmscan.c:5394 [inline]
lru_gen_shrink_node mm/vmscan.c:5511 [inline]
shrink_node+0x2064/0x35f0 mm/vmscan.c:6459
kswapd_shrink_node mm/vmscan.c:7262 [inline]
balance_pgdat+0xa02/0x1ac0 mm/vmscan.c:7452
kswapd+0x677/0xd60 mm/vmscan.c:7712
kthread+0x2e8/0x3a0 kernel/kthread.c:376
ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:308
</TASK>


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzk...@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

Hillf Danton

Apr 10, 2023, 8:20:44 AM
to syzbot, djw...@kernel.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org, linu...@vger.kernel.org, syzkall...@googlegroups.com
On 07 Apr 2023 10:04:49 -0700
Looks like a valid race.

xfs_inodegc_shrinker_scan()             xfs_inodegc_stop()
---                                     ---
if (!xfs_is_inodegc_enabled(mp))
        return SHRINK_STOP;
                                        if (!xfs_clear_inodegc_enabled(mp))
                                                return;
                                        xfs_inodegc_queue_all(mp);
                                        drain_workqueue(mp->m_inodegc_wq);
                                          wq->flags |= __WQ_DRAINING;
mod_delayed_work_on()
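
FWIW the warning itself is the __WQ_DRAINING guard in __queue_work();
roughly (paraphrasing kernel/workqueue.c in v6.3, not an exact quote):

	/* queueing on a draining wq is only allowed for chained work */
	if (unlikely(wq->flags & __WQ_DRAINING) &&
	    WARN_ON_ONCE(!is_chained_work(wq)))
		return;

The shrinker is not running from a work item on that workqueue, so
is_chained_work() fails and the WARN fires.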

Darrick J. Wong

Apr 10, 2023, 9:15:59 PM
to Hillf Danton, syzbot, linux-...@vger.kernel.org, linux-...@vger.kernel.org, linu...@vger.kernel.org, syzkall...@googlegroups.com
On Mon, Apr 10, 2023 at 08:20:22PM +0800, Hillf Danton wrote:
> On 07 Apr 2023 10:04:49 -0700
> > Hello,
> >
> > syzbot found the following issue on:
> >
> > HEAD commit: 7e364e56293b Linux 6.3-rc5
> > git tree: upstream
> > console output: https://syzkaller.appspot.com/x/log.txt?x=13241195c80000
> > kernel config: https://syzkaller.appspot.com/x/.config?x=e3b9dc6616d797bb
> > dashboard link: https://syzkaller.appspot.com/bug?extid=5ed016962f5137a09c7c
> > compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
> > userspace arch: i386
> >
> > Unfortunately, I don't have any reproducer for this issue yet.
> >
> > IMPORTANT: if you fix the issue, please add the following tag to the commit:
> > Reported-by: syzbot+5ed016...@syzkaller.appspotmail.com
> >
> > ------------[ cut here ]------------
> > WARNING: CPU: 1 PID: 102 at kernel/workqueue.c:1445 __queue_work+0xd44/0x1120 kernel/workqueue.c:1444
> > Modules linked in:
> > CPU: 1 PID: 102 Comm: kswapd0 Not tainted 6.3.0-rc5-syzkaller #0
> > Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.14.0-2 04/01/2014
> > RIP: 0010:__queue_work+0xd44/0x1120 kernel/workqueue.c:1444

Gross. I just got one of these splats too, in:

__queue_work+0x3a2/0x4a0
call_timer_fn+0x24/0x120
__run_timers.part.0+0x170/0x280
run_timer_softirq+0x31/0x60
__do_softirq+0xf6/0x2fd
irq_exit_rcu+0xc5/0x110
sysvec_apic_timer_interrupt+0x8e/0xc0
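
That path is just the delayed work timer expiring after the drain had
already started; delayed_work_timer_fn() re-enters __queue_work() directly
(roughly, from kernel/workqueue.c):

	void delayed_work_timer_fn(struct timer_list *t)
	{
		struct delayed_work *dwork = from_timer(dwork, t, timer);

		/* called from an irqsafe timer, irqs already off */
		__queue_work(dwork->cpu, dwork->wq, &dwork->work);
	}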

I guess someone might have done:

Thread 0: xfs_inodegc_queue_all(mp);

Thread 1: <add inodegc work>
Thread 1: mod_delayed_work_on(cpu, mp->m_inodegc_wq, &gc->work, <nonzero>);

Thread 0: drain_workqueue(mp->m_inodegc_wq);

<Timer fires, splats, everyone halts>

But I can't really tell; the VM froze. Any suggestions on how to fix this?

--D

Hillf Danton

Apr 10, 2023, 10:49:40 PM
to Darrick J. Wong, syzbot, linux-...@vger.kernel.org, linux-...@vger.kernel.org, linu...@vger.kernel.org, syzkall...@googlegroups.com
On Mon, 10 Apr 2023 18:15:57 -0700 "Darrick J. Wong" <djw...@kernel.org>
Take a look at the diff below.
To close the race, a quick fix is to put drain_workqueue() under a new
mp->m_inodegc_stop_mutex on one hand, and on the other to queue work only
after a successful mutex_trylock().

+++ b/fs/xfs/xfs_icache.c
@@ -1917,10 +1917,14 @@ xfs_inodegc_stop(
 	if (!xfs_clear_inodegc_enabled(mp))
 		return;
 
+	mutex_lock(&mp->m_inodegc_stop_mutex);
+
 	xfs_inodegc_queue_all(mp);
 	drain_workqueue(mp->m_inodegc_wq);
 
 	trace_xfs_inodegc_stop(mp, __return_address);
+
+	mutex_unlock(&mp->m_inodegc_stop_mutex);
 }
 
 /*
@@ -2042,7 +2046,6 @@ xfs_inodegc_queue(
 	struct xfs_inodegc	*gc;
 	int			items;
 	unsigned int		shrinker_hits;
-	unsigned long		queue_delay = 1;
 
 	trace_xfs_inode_set_need_inactive(ip);
 	spin_lock(&ip->i_flags_lock);
@@ -2060,16 +2063,18 @@ xfs_inodegc_queue(
 	 * is scheduled to run on this CPU.
 	 */
 	if (!xfs_is_inodegc_enabled(mp)) {
+out:
 		put_cpu_ptr(gc);
 		return;
 	}
 
-	if (xfs_inodegc_want_queue_work(ip, items))
-		queue_delay = 0;
+	if (!mutex_trylock(&mp->m_inodegc_stop_mutex))
+		goto out;
 
 	trace_xfs_inodegc_queue(mp, __return_address);
-	mod_delayed_work(mp->m_inodegc_wq, &gc->work, queue_delay);
+	mod_delayed_work(mp->m_inodegc_wq, &gc->work, 0);
 	put_cpu_ptr(gc);
+	mutex_unlock(&mp->m_inodegc_stop_mutex);
 
 	if (xfs_inodegc_want_flush_work(ip, items, shrinker_hits)) {
 		trace_xfs_inodegc_throttle(mp, __return_address);
@@ -2201,6 +2206,9 @@ xfs_inodegc_shrinker_scan(
 	if (!xfs_is_inodegc_enabled(mp))
 		return SHRINK_STOP;
 
+	if (!mutex_trylock(&mp->m_inodegc_stop_mutex))
+		return SHRINK_STOP;
+
 	trace_xfs_inodegc_shrinker_scan(mp, sc, __return_address);
 
 	for_each_online_cpu(cpu) {
@@ -2213,6 +2221,7 @@ xfs_inodegc_shrinker_scan(
 			no_items = false;
 		}
 	}
+	mutex_unlock(&mp->m_inodegc_stop_mutex);
 
 	/*
 	 * If there are no inodes to inactivate, we don't want the shrinker
--
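
One caveat: the diff references mp->m_inodegc_stop_mutex, which does not
exist in the tree, so a companion hunk would have to add and initialize it.
Hypothetically, something like:

	/* fs/xfs/xfs_mount.h: hypothetical new field in struct xfs_mount */
	struct mutex		m_inodegc_stop_mutex;

	/* initialized once while setting up the per-mount structures */
	mutex_init(&mp->m_inodegc_stop_mutex);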

syzbot

Aug 1, 2023, 12:57:52 PM
to syzkall...@googlegroups.com
Auto-closing this bug as obsolete.
Crashes did not happen for a while, no reproducer and no activity.