linux-next boot error: WARNING in prepare_kswapd_sleep


syzbot

Nov 24, 2020, 2:54:22 AM
to ak...@linux-foundation.org, linux-...@vger.kernel.org, linu...@kvack.org, linux...@vger.kernel.org, s...@canb.auug.org.au, syzkall...@googlegroups.com
Hello,

syzbot found the following issue on:

HEAD commit: d9137320 Add linux-next specific files for 20201124
git tree: linux-next
console output: https://syzkaller.appspot.com/x/log.txt?x=17b14079500000
kernel config: https://syzkaller.appspot.com/x/.config?x=2ac6081150c8eac
dashboard link: https://syzkaller.appspot.com/bug?extid=ce635500093181f39c1c
compiler: gcc (GCC) 10.1.0-syz 20200507

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+ce6355...@syzkaller.appspotmail.com

------------[ cut here ]------------
WARNING: CPU: 1 PID: 2192 at include/linux/memcontrol.h:621 arch_static_branch arch/x86/include/asm/jump_label.h:25 [inline]
WARNING: CPU: 1 PID: 2192 at include/linux/memcontrol.h:621 mem_cgroup_disabled include/linux/memcontrol.h:504 [inline]
WARNING: CPU: 1 PID: 2192 at include/linux/memcontrol.h:621 mem_cgroup_lruvec include/linux/memcontrol.h:616 [inline]
WARNING: CPU: 1 PID: 2192 at include/linux/memcontrol.h:621 clear_pgdat_congested mm/vmscan.c:3443 [inline]
WARNING: CPU: 1 PID: 2192 at include/linux/memcontrol.h:621 prepare_kswapd_sleep mm/vmscan.c:3480 [inline]
WARNING: CPU: 1 PID: 2192 at include/linux/memcontrol.h:621 prepare_kswapd_sleep+0xed/0x250 mm/vmscan.c:3456
Modules linked in:
CPU: 1 PID: 2192 Comm: kswapd0 Not tainted 5.10.0-rc5-next-20201124-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:mem_cgroup_lruvec include/linux/memcontrol.h:621 [inline]
RIP: 0010:clear_pgdat_congested mm/vmscan.c:3443 [inline]
RIP: 0010:prepare_kswapd_sleep mm/vmscan.c:3480 [inline]
RIP: 0010:prepare_kswapd_sleep+0xed/0x250 mm/vmscan.c:3456
Code: 89 ee 48 89 df e8 73 d3 ff ff 31 ff 41 89 c4 89 c6 e8 87 19 d7 ff 45 84 e4 74 cc e8 6d 21 d7 ff 0f 1f 44 00 00 e8 63 21 d7 ff <0f> 0b 48 c7 c0 28 8d ee 8c 48 ba 00 00 00 00 00 fc ff df 48 c1 e8
RSP: 0000:ffffc900085bfda0 EFLAGS: 00010293
RAX: 0000000000000000 RBX: ffff88813fffb000 RCX: ffffffff81998e19
RDX: ffff8880168c1ac0 RSI: ffffffff81998e2d RDI: 0000000000000001
RBP: 0000000000000000 R08: 0000000000000ab3 R09: 0000000000000f89
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000001
R13: 0000000000000004 R14: 0000000000000000 R15: 0000000000000003
FS: 0000000000000000(0000) GS:ffff8880b9f00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 000000000b08e000 CR4: 00000000001506e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
kswapd_try_to_sleep mm/vmscan.c:3784 [inline]
kswapd+0x37d/0xdb0 mm/vmscan.c:3899
kthread+0x3b1/0x4a0 kernel/kthread.c:292
ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:296
Kernel panic - not syncing: panic_on_warn set ...
CPU: 1 PID: 2192 Comm: kswapd0 Not tainted 5.10.0-rc5-next-20201124-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:79 [inline]
dump_stack+0x107/0x163 lib/dump_stack.c:120
panic+0x306/0x73d kernel/panic.c:231
__warn.cold+0x35/0x44 kernel/panic.c:605
report_bug+0x1bd/0x210 lib/bug.c:198
handle_bug+0x3c/0x60 arch/x86/kernel/traps.c:239
exc_invalid_op+0x14/0x40 arch/x86/kernel/traps.c:259
asm_exc_invalid_op+0x12/0x20 arch/x86/include/asm/idtentry.h:578
RIP: 0010:mem_cgroup_lruvec include/linux/memcontrol.h:621 [inline]
RIP: 0010:clear_pgdat_congested mm/vmscan.c:3443 [inline]
RIP: 0010:prepare_kswapd_sleep mm/vmscan.c:3480 [inline]
RIP: 0010:prepare_kswapd_sleep+0xed/0x250 mm/vmscan.c:3456
Code: 89 ee 48 89 df e8 73 d3 ff ff 31 ff 41 89 c4 89 c6 e8 87 19 d7 ff 45 84 e4 74 cc e8 6d 21 d7 ff 0f 1f 44 00 00 e8 63 21 d7 ff <0f> 0b 48 c7 c0 28 8d ee 8c 48 ba 00 00 00 00 00 fc ff df 48 c1 e8
RSP: 0000:ffffc900085bfda0 EFLAGS: 00010293
RAX: 0000000000000000 RBX: ffff88813fffb000 RCX: ffffffff81998e19
RDX: ffff8880168c1ac0 RSI: ffffffff81998e2d RDI: 0000000000000001
RBP: 0000000000000000 R08: 0000000000000ab3 R09: 0000000000000f89
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000001
R13: 0000000000000004 R14: 0000000000000000 R15: 0000000000000003
kswapd_try_to_sleep mm/vmscan.c:3784 [inline]
kswapd+0x37d/0xdb0 mm/vmscan.c:3899
kthread+0x3b1/0x4a0 kernel/kthread.c:292
ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:296
Kernel Offset: disabled
Rebooting in 86400 seconds..
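
The log above shows a one-shot warning being escalated to a panic because panic_on_warn is set. A minimal userspace sketch of that interaction follows. Every name in it (MODEL_WARN_ON_ONCE, model_warn, the counters) is a hypothetical stand-in, not the kernel's implementation, and the model only counts a "panic" instead of halting.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-ins for illustration; not the kernel's definitions. */
static bool panic_on_warn = true;	/* models the panic_on_warn setting */
static int warnings_reported;
static int panics_triggered;

static void model_warn(const char *cond)
{
	warnings_reported++;
	fprintf(stderr, "WARNING: %s\n", cond);
	if (panic_on_warn) {
		/* A real kernel would stop here; the model just counts. */
		panics_triggered++;
		fprintf(stderr, "Kernel panic - not syncing: panic_on_warn set ...\n");
	}
}

/* Report a true condition at most once per call site. */
#define MODEL_WARN_ON_ONCE(cond)				\
	do {							\
		static bool warned;				\
		if ((cond) && !warned) {			\
			warned = true;				\
			model_warn(#cond);			\
		}						\
	} while (0)

/* Models a lookup that treats a NULL memcg as unexpected. */
static void model_lruvec_lookup(const void *memcg)
{
	MODEL_WARN_ON_ONCE(!memcg);
}

/* Two NULL lookups: only the first one reports. Returns the panic count. */
static int demo_two_null_lookups(void)
{
	model_lruvec_lookup(NULL);
	model_lruvec_lookup(NULL);
	assert(warnings_reported == 1);	/* second hit is suppressed */
	return panics_triggered;
}
```

Calling demo_two_null_lookups() from a main() prints one warning and one panic line. With panic_on_warn cleared, the same call would only print the warning.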


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzk...@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

Lorenzo Stoakes

Nov 24, 2020, 1:00:04 PM
to syzbot, Andrew Morton, Linux Kernel Mailing List, linux-mm, Linux-Next Mailing List, Stephen Rothwell, syzkaller-bugs, Alex Shi, Hui Su
On Tue, 24 Nov 2020 at 07:54, syzbot
<syzbot+ce6355...@syzkaller.appspotmail.com> wrote:
> syzbot found the following issue on:
>
> HEAD commit: d9137320 Add linux-next specific files for 20201124

This appears to be caused by 4b2904f3 ("mm/memcg: add missed
warning in mem_cgroup_lruvec"), which adds a VM_WARN_ON_ONCE(!memcg) to
mem_cgroup_lruvec(). Callers other than mem_cgroup_page_lruvec() can
legitimately pass a NULL memcg, so the warning fires for them.

Moving the check back into mem_cgroup_page_lruvec() resolves the
issue. I enclose a simple version of this below, and am happy to submit it
as a proper patch if this is the right approach:


diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 87ed56dc75f9..27cc40a490b2 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -618,7 +618,6 @@ static inline struct lruvec *mem_cgroup_lruvec(struct mem_cgroup *memcg,
 		goto out;
 	}
 
-	VM_WARN_ON_ONCE(!memcg);
 	if (!memcg)
 		memcg = root_mem_cgroup;
 
@@ -645,6 +644,7 @@ static inline struct lruvec *mem_cgroup_lruvec(struct mem_cgroup *memcg,
 static inline struct lruvec *mem_cgroup_page_lruvec(struct page *page,
 						    struct pglist_data *pgdat)
 {
+	VM_WARN_ON_ONCE_PAGE(!page_memcg(page), page);
 	return mem_cgroup_lruvec(page_memcg(page), pgdat);
 }
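
To illustrate why the placement of the check matters, here is a small userspace sketch. It assumes heavily simplified stand-in types (one lruvec per memcg, no per-node data) and a plain counter in place of the kernel's warning macros, so it models the idea of the diff rather than the kernel code itself. All function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel types; only what the sketch needs. */
struct lruvec { int unused; };
struct mem_cgroup { struct lruvec lruvec; };
struct page { struct mem_cgroup *memcg; };

static struct mem_cgroup root_memcg;	/* models root_mem_cgroup */
static int warnings;			/* counts model warnings */

/* Check placed in the generic lookup: any NULL memcg warns. */
static struct lruvec *lookup_warning_inside(struct mem_cgroup *memcg)
{
	if (!memcg) {
		warnings++;		/* models VM_WARN_ON_ONCE(!memcg) */
		memcg = &root_memcg;
	}
	return &memcg->lruvec;
}

/* Check removed from the generic lookup: NULL silently means "root". */
static struct lruvec *lookup_fixed(struct mem_cgroup *memcg)
{
	if (!memcg)
		memcg = &root_memcg;
	return &memcg->lruvec;
}

/* Check placed in the page helper, where a NULL memcg is unexpected. */
static struct lruvec *page_lookup_fixed(struct page *page)
{
	if (!page->memcg)
		warnings++;		/* models VM_WARN_ON_ONCE_PAGE() */
	return lookup_fixed(page->memcg);
}

/* A node-level caller passing NULL trips the check in the generic lookup. */
static int count_warnings_before_fix(void)
{
	warnings = 0;
	lookup_warning_inside(NULL);
	return warnings;
}

/* The same caller, plus a charged page, is silent once the check moves. */
static int node_caller_warnings_after_fix(void)
{
	struct mem_cgroup cg = { { 0 } };
	struct page charged = { &cg };

	warnings = 0;
	assert(lookup_fixed(NULL) == &root_memcg.lruvec);
	assert(page_lookup_fixed(&charged) == &cg.lruvec);
	return warnings;
}

/* A page with no memcg is still reported by the page helper. */
static int uncharged_page_warnings_after_fix(void)
{
	struct page uncharged = { NULL };

	warnings = 0;
	page_lookup_fixed(&uncharged);
	return warnings;
}
```

In this model the node-level caller warns once before the change and not at all after it, while an uncharged page is still caught.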

Alex Shi

Nov 25, 2020, 1:24:56 AM
to Lorenzo Stoakes, syzbot, Andrew Morton, Linux Kernel Mailing List, linux-mm, Linux-Next Mailing List, Stephen Rothwell, syzkaller-bugs, Hui Su
Acked.

Right. Would you like to drop the bad commit 4b2904f3 ("mm/memcg: add missed
warning in mem_cgroup_lruvec") and replace it with yours?

Furthermore, could you try another patch?

Thanks
Alex

From 073b222bd06a96c39656b0460c705e48c7eedafc Mon Sep 17 00:00:00 2001
From: Alex Shi <alex...@linux.alibaba.com>
Date: Wed, 25 Nov 2020 14:06:33 +0800
Subject: [PATCH] mm/memcg: bail out early when !memcg in mem_cgroup_lruvec

In some scenarios mem_cgroup_lruvec() is called with a NULL memcg, i.e.
mem_cgroup_lruvec(NULL, pgdat), so we can bail out early and skip the
unnecessary checks.

Also warn if both parameters are NULL.

Signed-off-by: Alex Shi <alex...@linux.alibaba.com>
---
include/linux/memcontrol.h | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 3a995bb3157f..5e4da83eb9ce 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -613,7 +613,9 @@ static inline struct lruvec *mem_cgroup_lruvec(struct mem_cgroup *memcg,
 	struct mem_cgroup_per_node *mz;
 	struct lruvec *lruvec;
 
-	if (mem_cgroup_disabled()) {
+	VM_WARN_ON_ONCE(!memcg && !pgdat);
+
+	if (mem_cgroup_disabled() || !memcg) {
 		lruvec = &pgdat->__lruvec;
 		goto out;
 	}
--
2.29.GIT
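
As a rough illustration of what the early bail-out changes, the sketch below compares it with the root fallback seen in the earlier diff. It is a userspace model under stated assumptions: the per-node lookup that follows the fallback is not shown in this thread, so it is collapsed into a single hypothetical node_lruvec field, and all function names are invented for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins; the per-node lookup is collapsed into one field. */
struct lruvec { int unused; };
struct pglist_data { struct lruvec __lruvec; };
struct mem_cgroup { struct lruvec node_lruvec; };

static bool memcg_disabled;		/* models mem_cgroup_disabled() */
static struct mem_cgroup root_memcg;	/* models root_mem_cgroup */
static int warnings;			/* counts model warnings */

/* Existing behaviour: a NULL memcg falls back to the root memcg. */
static struct lruvec *lookup_root_fallback(struct mem_cgroup *memcg,
					   struct pglist_data *pgdat)
{
	if (memcg_disabled)
		return &pgdat->__lruvec;
	if (!memcg)
		memcg = &root_memcg;
	return &memcg->node_lruvec;
}

/* Proposed behaviour: a NULL memcg takes the node's own lruvec directly. */
static struct lruvec *lookup_early_bailout(struct mem_cgroup *memcg,
					   struct pglist_data *pgdat)
{
	if (!memcg && !pgdat)
		warnings++;	/* models VM_WARN_ON_ONCE(!memcg && !pgdat) */
	if (memcg_disabled || !memcg)
		return &pgdat->__lruvec;
	return &memcg->node_lruvec;
}

static bool null_memcg_uses_node_lruvec(void)
{
	struct pglist_data node = { { 0 } };
	bool same = lookup_early_bailout(NULL, &node) == &node.__lruvec;

	assert(warnings == 0);	/* pgdat is valid, so no warning */
	return same;
}

static bool null_memcg_uses_root_lruvec(void)
{
	struct pglist_data node = { { 0 } };

	return lookup_root_fallback(NULL, &node) == &root_memcg.node_lruvec;
}

static bool real_memcg_is_unaffected(void)
{
	struct pglist_data node = { { 0 } };
	struct mem_cgroup cg = { { 0 } };

	return lookup_early_bailout(&cg, &node) == &cg.node_lruvec &&
	       lookup_root_fallback(&cg, &node) == &cg.node_lruvec;
}
```

In this model the two variants agree for a real memcg and differ only in which lruvec a NULL memcg resolves to. Whether those two lruvecs are equivalent in the kernel is not settled by this sketch.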

Lorenzo Stoakes

Nov 25, 2020, 6:22:11 AM
to Alex Shi, Johannes Weiner, Hui Su, Stephen Rothwell, Shakeel Butt, Roman Gushchin, syzbot, Linux Kernel Mailing List, linux-mm, Linux-Next Mailing List, syzkaller-bugs, Lorenzo Stoakes
Move the memcg check to mem_cgroup_page_lruvec(), as there are callers which
may legitimately invoke mem_cgroup_lruvec() with !memcg, whereas !memcg
should never occur in mem_cgroup_page_lruvec().

We expect a page to have always been charged to a memcg before
mem_cgroup_page_lruvec() is invoked, so add a warning to assert that this
is the case.

Signed-off-by: Lorenzo Stoakes <lsto...@gmail.com>
Reported-by: syzbot+ce6355...@syzkaller.appspotmail.com
---
include/linux/memcontrol.h | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 87ed56dc75f9..3e6a1df3bdb9 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -618,7 +618,6 @@ static inline struct lruvec *mem_cgroup_lruvec(struct mem_cgroup *memcg,
 		goto out;
 	}
 
-	VM_WARN_ON_ONCE(!memcg);
 	if (!memcg)
 		memcg = root_mem_cgroup;
 
@@ -645,7 +644,10 @@ static inline struct lruvec *mem_cgroup_lruvec(struct mem_cgroup *memcg,
 static inline struct lruvec *mem_cgroup_page_lruvec(struct page *page,
 						    struct pglist_data *pgdat)
 {
-	return mem_cgroup_lruvec(page_memcg(page), pgdat);
+	struct mem_cgroup *memcg = page_memcg(page);
+
+	VM_WARN_ON_ONCE_PAGE(!memcg, page);
+	return mem_cgroup_lruvec(memcg, pgdat);
 }
 
 static inline bool lruvec_holds_page_lru_lock(struct page *page,
--
2.29.2

Lorenzo Stoakes

Nov 25, 2020, 6:25:34 AM
to Alex Shi, syzbot, Andrew Morton, Linux Kernel Mailing List, linux-mm, Linux-Next Mailing List, Stephen Rothwell, syzkaller-bugs, Hui Su
On Wed, 25 Nov 2020 at 06:25, Alex Shi <alex...@linux.alibaba.com> wrote:
> Acked.

Thanks. I submitted it as an actual patch, refactored slightly to
avoid calling page_memcg() twice.

> Furthermore, could you try another patch?

I tried that patch against the syzkaller failure case and it worked fine!

Cheers, Lorenzo

Alex Shi

Nov 25, 2020, 7:15:17 AM
to Lorenzo Stoakes, Johannes Weiner, Hui Su, Stephen Rothwell, Shakeel Butt, Roman Gushchin, syzbot, Linux Kernel Mailing List, linux-mm, Linux-Next Mailing List, syzkaller-bugs
Acked-by: Alex Shi <alex...@linux.alibaba.com>

Stephen Rothwell

Nov 25, 2020, 5:27:17 PM
to Alex Shi, Lorenzo Stoakes, Johannes Weiner, Hui Su, Shakeel Butt, Roman Gushchin, syzbot, Linux Kernel Mailing List, linux-mm, Linux-Next Mailing List, syzkaller-bugs, Andrew Morton
Hi all,
I have added that patch to the akpm tree in linux-next today as a fix
for "mm/memcg: add missed warning in mem_cgroup_lruvec".

Andrew: the original patch is here:
https://lore.kernel.org/lkml/20201125112202.3...@gmail.com/
--
Cheers,
Stephen Rothwell