[syzbot] general protection fault in list_lru_add

34 views
Skip to first unread message

syzbot

unread,
Mar 23, 2022, 6:03:22 PM3/23/22
to ak...@linux-foundation.org, linux-...@vger.kernel.org, linu...@kvack.org, songm...@bytedance.com, syzkall...@googlegroups.com, torv...@linux-foundation.org
Hello,

syzbot found the following issue on:

HEAD commit: 6b1f86f8e9c7 Merge tag 'folio-5.18b' of git://git.infradea..
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=1330b513700000
kernel config: https://syzkaller.appspot.com/x/.config?x=b99d35252f93aed2
dashboard link: https://syzkaller.appspot.com/bug?extid=f8c45ccc7d5d45fc5965
compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=142a1f25700000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=1618e40b700000

The issue was bisected to:

commit 5abc1e37afa0335c52608d640fd30910b2eeda21
Author: Muchun Song <songm...@bytedance.com>
Date: Tue Mar 22 21:41:19 2022 +0000

mm: list_lru: allocate list_lru_one only when needed

bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=13ea4c71700000
final oops: https://syzkaller.appspot.com/x/report.txt?x=101a4c71700000
console output: https://syzkaller.appspot.com/x/log.txt?x=17ea4c71700000

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+f8c45c...@syzkaller.appspotmail.com
Fixes: 5abc1e37afa0 ("mm: list_lru: allocate list_lru_one only when needed")

general protection fault, probably for non-canonical address 0xdffffc0000000001: 0000 [#1] PREEMPT SMP KASAN
KASAN: null-ptr-deref in range [0x0000000000000008-0x000000000000000f]
CPU: 0 PID: 2964 Comm: udevd Tainted: G W 5.17.0-syzkaller-02172-g6b1f86f8e9c7 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:list_add_tail include/linux/list.h:102 [inline]
RIP: 0010:list_lru_add+0x277/0x510 mm/list_lru.c:129
Code: 04 64 4d 8d 7c c7 10 4c 89 3c 24 e8 c3 f6 ca ff 49 8d 47 08 48 89 c2 48 89 44 24 10 48 b8 00 00 00 00 00 fc ff df 48 c1 ea 03 <80> 3c 02 00 0f 85 4d 02 00 00 4d 8b 77 08 48 89 df 48 8b 14 24 4c
RSP: 0018:ffffc90002c17db0 EFLAGS: 00010202
RAX: dffffc0000000000 RBX: ffff88823bc54fc0 RCX: 0000000000000000
RDX: 0000000000000001 RSI: ffffffff81acf7ad RDI: ffffffff8d93ddd0
RBP: ffff8880256da7f0 R08: 0000000000000000 R09: ffffffff8d93ddd7
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000001
R13: ffff88807fb2a880 R14: 0000000000000080 R15: 0000000000000000
FS: 00007f711b82e840(0000) GS:ffff8880b9c00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f565fc7a718 CR3: 000000001a735000 CR4: 00000000003506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
d_lru_add fs/dcache.c:431 [inline]
retain_dentry fs/dcache.c:685 [inline]
dput+0x7a7/0xdb0 fs/dcache.c:908
__fput+0x3ab/0x9f0 fs/file_table.c:330
task_work_run+0xdd/0x1a0 kernel/task_work.c:164
tracehook_notify_resume include/linux/tracehook.h:188 [inline]
exit_to_user_mode_loop kernel/entry/common.c:190 [inline]
exit_to_user_mode_prepare+0x27e/0x290 kernel/entry/common.c:222
__syscall_exit_to_user_mode_work kernel/entry/common.c:304 [inline]
syscall_exit_to_user_mode+0x19/0x60 kernel/entry/common.c:315
do_syscall_64+0x42/0xb0 arch/x86/entry/common.c:86
entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7f711b92a467
Code: 44 00 00 48 8b 15 11 aa 0c 00 f7 d8 64 89 02 b8 ff ff ff ff eb bc 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 b8 03 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 01 c3 48 8b 15 e1 a9 0c 00 f7 d8 64 89 02 b8
RSP: 002b:00007ffe9fd16aa8 EFLAGS: 00000202 ORIG_RAX: 0000000000000003
RAX: 0000000000000000 RBX: 000055991078b240 RCX: 00007f711b92a467
RDX: 00007f711b9f1780 RSI: 0000000000000000 RDI: 000000000000000c
RBP: 0000000000000000 R08: 0000000000000000 R09: 00007f711b9f5a60
R10: 0000000000000200 R11: 0000000000000202 R12: 00007f711b9f2380
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
</TASK>
Modules linked in:
---[ end trace 0000000000000000 ]---
RIP: 0010:list_add_tail include/linux/list.h:102 [inline]
RIP: 0010:list_lru_add+0x277/0x510 mm/list_lru.c:129
Code: 04 64 4d 8d 7c c7 10 4c 89 3c 24 e8 c3 f6 ca ff 49 8d 47 08 48 89 c2 48 89 44 24 10 48 b8 00 00 00 00 00 fc ff df 48 c1 ea 03 <80> 3c 02 00 0f 85 4d 02 00 00 4d 8b 77 08 48 89 df 48 8b 14 24 4c
RSP: 0018:ffffc90002c17db0 EFLAGS: 00010202
RAX: dffffc0000000000 RBX: ffff88823bc54fc0 RCX: 0000000000000000
RDX: 0000000000000001 RSI: ffffffff81acf7ad RDI: ffffffff8d93ddd0
RBP: ffff8880256da7f0 R08: 0000000000000000 R09: ffffffff8d93ddd7
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000001
R13: ffff88807fb2a880 R14: 0000000000000080 R15: 0000000000000000
FS: 00007f711b82e840(0000) GS:ffff8880b9c00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f565fc7a718 CR3: 000000001a735000 CR4: 00000000003506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
----------------
Code disassembly (best guess):
0: 04 64 add $0x64,%al
2: 4d 8d 7c c7 10 lea 0x10(%r15,%rax,8),%r15
7: 4c 89 3c 24 mov %r15,(%rsp)
b: e8 c3 f6 ca ff callq 0xffcaf6d3
10: 49 8d 47 08 lea 0x8(%r15),%rax
14: 48 89 c2 mov %rax,%rdx
17: 48 89 44 24 10 mov %rax,0x10(%rsp)
1c: 48 b8 00 00 00 00 00 movabs $0xdffffc0000000000,%rax
23: fc ff df
26: 48 c1 ea 03 shr $0x3,%rdx
* 2a: 80 3c 02 00 cmpb $0x0,(%rdx,%rax,1) <-- trapping instruction
2e: 0f 85 4d 02 00 00 jne 0x281
34: 4d 8b 77 08 mov 0x8(%r15),%r14
38: 48 89 df mov %rbx,%rdi
3b: 48 8b 14 24 mov (%rsp),%rdx
3f: 4c rex.WR


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzk...@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
For information about bisection process see: https://goo.gl/tpsmEJ#bisection
syzbot can test patches for this issue, for details see:
https://goo.gl/tpsmEJ#testing-patches

Linus Torvalds

unread,
Mar 23, 2022, 7:11:35 PM3/23/22
to syzbot, Andrew Morton, Linux Kernel Mailing List, Linux-MM, Muchun Song, syzkaller-bugs
Muchun,
mind taking a look at this asap? This seems like a pretty core thing,
and the fact that it oopses on that

> RIP: 0010:list_add_tail include/linux/list.h:102 [inline]
> RIP: 0010:list_lru_add+0x277/0x510 mm/list_lru.c:129
> d_lru_add fs/dcache.c:431 [inline]
> retain_dentry fs/dcache.c:685 [inline]
> dput+0x7a7/0xdb0 fs/dcache.c:908

just worries me a lot.

The dentry lru list rules are odd but not outrageously so. The main
oddity is that the DCACHE_LRU_LIST bit in the dentry flags indicate
whether the dentry is on a LRU list or not.

And it's not one single list - it can be *either* the usual
sb->lists_dentry_lru list, or the special "shrink list".

But this oops is for the regular d_lru_add() path that adds the dentry
to the sb->s_dentry_lru list as the dentry count goes down to zero
(and it's not one of the dentries that get insta-free'd).

So it looks like quite a regular path and something is horribly wrong
with a very core data structure.

Linus

Muchun Song

unread,
Mar 23, 2022, 10:19:34 PM3/23/22
to Linus Torvalds, syzbot, Andrew Morton, Linux Kernel Mailing List, Linux-MM, syzkaller-bugs
On Thu, Mar 24, 2022 at 7:11 AM Linus Torvalds
<torv...@linux-foundation.org> wrote:
>
> Muchun,
> mind taking a look at this asap? This seems like a pretty core thing,
> and the fact that it oopses on that

Sorry for this. I am looking at this now.

>
> > RIP: 0010:list_add_tail include/linux/list.h:102 [inline]
> > RIP: 0010:list_lru_add+0x277/0x510 mm/list_lru.c:129
> > d_lru_add fs/dcache.c:431 [inline]
> > retain_dentry fs/dcache.c:685 [inline]
> > dput+0x7a7/0xdb0 fs/dcache.c:908
>
> just worries me a lot.
>
> The dentry lru list rules are odd but not outrageously so. The main
> oddity is that the DCACHE_LRU_LIST bit in the dentry flags indicate
> whether the dentry is on a LRU list or not.
>
> And it's not one single list - it can be *either* the usual
> sb->lists_dentry_lru list, or the special "shrink list".
>
> But this oops is for the regular d_lru_add() path that adds the dentry
> to the sb->s_dentry_lru list as the dentry count goes down to zero
> (and it's not one of the dentries that get insta-free'd).

After this commit, the rules of dentry allocations changed.
The dentry should be allocated by kmem_cache_alloc_lru()
to set up the dentry reclaim context correctly (e.g. allocating
its list_lru_one). This issue seems that list_lru_one wasn't
allocated, then NULL pointer reference.

I'm trying to reproduce this and looking for the root cause.

Thanks.

Linus Torvalds

unread,
Mar 23, 2022, 11:05:31 PM3/23/22
to Muchun Song, syzbot, Andrew Morton, Linux Kernel Mailing List, Linux-MM, syzkaller-bugs
On Wed, Mar 23, 2022 at 7:19 PM Muchun Song <songm...@bytedance.com> wrote:
>
> After this commit, the rules of dentry allocations changed.
> The dentry should be allocated by kmem_cache_alloc_lru()

Yeah, I looked at that, but I can't find any way there could be other
allocations - not only are there strict rules how to initialize
everything, but the dentries are free'd using

kmem_cache_free(dentry_cache, dentry);

and as a result if they were allocated any other way I would expect
things would go south very quickly.

The only other thing I could come up with is some breakage in the
superblock lifetime so that &dentry->d_sb->s_dentry_lru would have
problems, but again, this is *such* core code and not some unusual
path, that I would be very very surprised if it wouldn't have
triggered other issues long long ago.

That's why I'd be more inclined to worry about the list_lru code being
somehow broken.

Linus

Muchun Song

unread,
Mar 24, 2022, 4:44:56 AM3/24/22
to Linus Torvalds, syzbot, Andrew Morton, Linux Kernel Mailing List, Linux-MM, syzkaller-bugs
I also have the same concern. I have been trying for a few hours to
reproduce this issue, but it didn't oops on my test machine. And I'll
continue reproducing this.

Thanks.

Dmitry Vyukov

unread,
Mar 24, 2022, 4:50:27 AM3/24/22
to Muchun Song, Linus Torvalds, syzbot, Andrew Morton, Linux Kernel Mailing List, Linux-MM, syzkaller-bugs
syzbot triggered it 222 times in a day, so it's most likely real:
https://syzkaller.appspot.com/bug?extid=f8c45ccc7d5d45fc5965

There are 2 reproducers, but they look completely different. May be a race.
You may also try to use syzbot's patch testing feature to get some
additional debug info.

Muchun Song

unread,
Mar 24, 2022, 12:13:04 PM3/24/22
to Dmitry Vyukov, Linus Torvalds, syzbot, Andrew Morton, Linux Kernel Mailing List, Linux-MM, syzkaller-bugs
Do you know how to tell the syzbot to test the following patch?
I found some infos from github, it says "#syz test:", is it like the following?
Thanks.

#syz test: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git
master

diff --git a/mm/list_lru.c b/mm/list_lru.c
index c669d87001a6..ddb2ee627d32 100644
--- a/mm/list_lru.c
+++ b/mm/list_lru.c
@@ -67,6 +67,7 @@ list_lru_from_kmem(struct list_lru *lru, int nid, void *ptr,
struct list_lru_node *nlru = &lru->node[nid];
struct list_lru_one *l = &nlru->lru;
struct mem_cgroup *memcg = NULL;
+ int kmemcg_id;

if (!list_lru_memcg_aware(lru))
goto out;
@@ -75,7 +76,13 @@ list_lru_from_kmem(struct list_lru *lru, int nid, void *ptr,
if (!memcg)
goto out;

- l = list_lru_from_memcg_idx(lru, nid, memcg_kmem_id(memcg));
+ kmemcg_id = memcg_kmem_id(memcg);
+ l = list_lru_from_memcg_idx(lru, nid, kmemcg_id);
+ if (!l) {
+ pr_info("BUG: the memcg(%px)->objcg(%px), kmemcg_id: %d\n",
+ memcg, memcg->objcg, kmemcg_id);
+ BUG();
+ }
out:
if (memcg_ptr)
*memcg_ptr = memcg;

Dmitry Vyukov

unread,
Mar 24, 2022, 12:18:40 PM3/24/22
to Muchun Song, Linus Torvalds, syzbot, Andrew Morton, Linux Kernel Mailing List, Linux-MM, syzkaller-bugs
Yes, this is correct. You can now see the request listed here:
https://syzkaller.appspot.com/bug?extid=f8c45ccc7d5d45fc5965

but the patch was truncated (probably you email client messed
whitespaces). In such case it's more reliable to attach the patch as
text file.

Muchun Song

unread,
Mar 24, 2022, 12:39:28 PM3/24/22
to Dmitry Vyukov, Linus Torvalds, syzbot, Andrew Morton, Linux Kernel Mailing List, Linux-MM, syzkaller-bugs
Cool!.

> but the patch was truncated (probably you email client messed
> whitespaces). In such case it's more reliable to attach the patch as
> text file.

Thanks for your reminder.

#syz test: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git
5abc1e37afa0335c52608d640fd30910b2eeda21
test.patch

syzbot

unread,
Mar 24, 2022, 3:41:19 PM3/24/22
to ak...@linux-foundation.org, dvy...@google.com, linux-...@vger.kernel.org, linu...@kvack.org, songm...@bytedance.com, syzkall...@googlegroups.com, torv...@linux-foundation.org
Hello,

syzbot has tested the proposed patch and the reproducer did not trigger any issue:

Reported-and-tested-by: syzbot+f8c45c...@syzkaller.appspotmail.com

Tested on:

commit: ed464352 Merge tag 'arm-dt-5.18' of git://git.kernel.o..
git tree: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git master
kernel config: https://syzkaller.appspot.com/x/.config?x=113ff5d072bc15d6
dashboard link: https://syzkaller.appspot.com/bug?extid=f8c45ccc7d5d45fc5965
compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
patch: https://syzkaller.appspot.com/x/patch.diff?x=1208043d700000

Note: testing is done by a robot and is best-effort only.

syzbot

unread,
Mar 24, 2022, 3:44:15 PM3/24/22
to ak...@linux-foundation.org, dvy...@google.com, linux-...@vger.kernel.org, linu...@kvack.org, songm...@bytedance.com, syzkall...@googlegroups.com, torv...@linux-foundation.org
Hello,

syzbot tried to test the proposed patch but the build/boot failed:

failed to apply patch:
checking file mm/list_lru.c
Hunk #1 succeeded at 76 with fuzz 2 (offset 9 lines).
Hunk #2 FAILED at 76.
1 out of 2 hunks FAILED



Tested on:

commit: 5abc1e37 mm: list_lru: allocate list_lru_one only when..
git tree: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git
patch: https://syzkaller.appspot.com/x/patch.diff?x=164cecb3700000

Linus Torvalds

unread,
Mar 24, 2022, 3:45:19 PM3/24/22
to syzbot, Andrew Morton, Dmitry Vyukov, Linux Kernel Mailing List, Linux-MM, Muchun Song, syzkaller-bugs
On Thu, Mar 24, 2022 at 12:41 PM syzbot
<syzbot+f8c45c...@syzkaller.appspotmail.com> wrote:
>
> syzbot has tested the proposed patch and the reproducer did not trigger any issue:

Heh, well that's unfortunate.

I think the issue is that it triggered a new BUG() that didn't match
the previous NULL pointer dereference, so it thinks things are
"fixed".

Linus

Linus Torvalds

unread,
Mar 24, 2022, 3:47:48 PM3/24/22
to syzbot, Andrew Morton, Dmitry Vyukov, Linux Kernel Mailing List, Linux-MM, Muchun Song, syzkaller-bugs
Oh, no, it's because it used the truncated patch that didn't do anything:

> patch: https://syzkaller.appspot.com/x/patch.diff?x=1208043d700000

and maybe (due to the racy nature) nothing actually happened.

Linus

Muchun Song

unread,
Mar 24, 2022, 9:44:32 PM3/24/22
to Linus Torvalds, syzbot, Andrew Morton, Dmitry Vyukov, Linux Kernel Mailing List, Linux-MM, syzkaller-bugs
On Fri, Mar 25, 2022 at 3:47 AM Linus Torvalds
It is not easy to reproduce. I'm also trying to reproduce locally.
test.patch

syzbot

unread,
Mar 25, 2022, 12:49:10 AM3/25/22
to ak...@linux-foundation.org, dvy...@google.com, linux-...@vger.kernel.org, linu...@kvack.org, songm...@bytedance.com, syzkall...@googlegroups.com, torv...@linux-foundation.org
Hello,

syzbot tried to test the proposed patch but the build/boot failed:

mm/list_lru.c:88:14: error: implicit declaration of function 'memcg_kmem_id' [-Werror=implicit-function-declaration]


Tested on:

commit: 5abc1e37 mm: list_lru: allocate list_lru_one only when..
git tree: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git
dashboard link: https://syzkaller.appspot.com/bug?extid=f8c45ccc7d5d45fc5965
compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
patch: https://syzkaller.appspot.com/x/patch.diff?x=12e2d063700000

Muchun Song

unread,
Mar 25, 2022, 5:52:32 AM3/25/22
to syzbot, Andrew Morton, LKML, Linux Memory Management List, syzkaller-bugs, Linus Torvalds
I can reproduce this (base commit: 5abc1e37afa0335c52608d640fd30910b2eeda21)
on my machine locally and have some updates on this. I added the following
patch to print more infos.

We can see that we put the dentry (ffff88807ebda0f8) into
the list_lru (ffff888011bd47f0). But we do not allocate struct
list_lru_one for the memcg (ffff88801c530000). Then it panics.

I have added a pr_info into memcg_list_lru_alloc() which
will print the address of struct list_lru which we have
allocated struct list_lru_one for. However, I cannot find
this print info in the full dmesg (see the attachment).
It seems that this address (ffff88807ebda0f8) is not
allocated by kmem_cache_alloc_lru(). But I haven't
found the root cause. I will continue to investigate.


diff --git a/mm/list_lru.c b/mm/list_lru.c
index fc938d8ff48f..9dd9424cea4f 100644
--- a/mm/list_lru.c
+++ b/mm/list_lru.c
@@ -39,6 +39,7 @@ static void list_lru_unregister(struct list_lru *lru)
if (!list_lru_memcg_aware(lru))
return;

+ pr_info("smcdef: list_lru_unregister: %px\n", lru);
mutex_lock(&list_lrus_mutex);
list_del(&lru->list);
mutex_unlock(&list_lrus_mutex);
@@ -76,6 +77,7 @@ list_lru_from_kmem(struct list_lru *lru, int nid, void *ptr,
struct list_lru_node *nlru = &lru->node[nid];
struct list_lru_one *l = &nlru->lru;
struct mem_cgroup *memcg = NULL;
+ int kmemcg_id;

if (!lru->mlrus)
goto out;
@@ -84,7 +86,16 @@ list_lru_from_kmem(struct list_lru *lru, int nid, void *ptr,
if (!memcg)
goto out;

- l = list_lru_from_memcg_idx(lru, nid, memcg_cache_id(memcg));
+ kmemcg_id = memcg_cache_id(memcg);
+ l = list_lru_from_memcg_idx(lru, nid, kmemcg_id);
+ if (!l) {
+ pr_info("the memcg(%px)->objcg(%px), kmemcg_id: %d,
ptr: %px, lru: %px\n",
+ memcg, memcg->objcg, kmemcg_id, ptr, lru);
+ memcg = parent_mem_cgroup(memcg);
+ kmemcg_id = memcg_cache_id(memcg);
+ pr_info("the memcg(%px)->objcg(%px), kmemcg_id: %d, lru: %px\n",
+ memcg, memcg->objcg, kmemcg_id,
list_lru_from_memcg_idx(lru, nid, kmemcg_id));
+ }
out:
if (memcg_ptr)
*memcg_ptr = memcg;
@@ -503,6 +514,8 @@ void memcg_drain_all_list_lrus(struct mem_cgroup
*src, struct mem_cgroup *dst)
struct list_lru *lru;
int src_idx = src->kmemcg_id;

+ pr_info("smcdef offline src: %px(%d), dst: %px(%d)\n", src,
src_idx, dst, dst->kmemcg_id);
+
/*
* Change kmemcg_id of this cgroup and all its descendants to the
* parent's id, and then move all entries from this cgroup's list_lrus
@@ -567,12 +580,14 @@ int memcg_list_lru_alloc(struct mem_cgroup
*memcg, struct list_lru *lru,
if (!table)
return -ENOMEM;

+ pr_info("smcdef memcg->css.cgroup->level: %d, lru: %px\n",
memcg->css.cgroup->level, lru);
/*
* Because the list_lru can be reparented to the parent cgroup's
* list_lru, we should make sure that this cgroup and all its
* ancestors have allocated list_lru_per_memcg.
*/
for (i = 0; memcg; memcg = parent_mem_cgroup(memcg), i++) {
+ pr_info("smcdef memcg: %px, kmemcg_id: %d\n", memcg,
memcg->kmemcg_id);
if (memcg_list_lru_allocated(memcg, lru))
break;

@@ -592,10 +607,13 @@ int memcg_list_lru_alloc(struct mem_cgroup
*memcg, struct list_lru *lru,
int index = table[i].memcg->kmemcg_id;
struct list_lru_per_memcg *mlru = table[i].mlru;

- if (index < 0 ||
rcu_dereference_protected(mlrus->mlru[index], true))
+ if (index < 0 ||
rcu_dereference_protected(mlrus->mlru[index], true)) {
kfree(mlru);
- else
+ pr_info("smcdef allocated i: %d, memcg: %px,
kmemcg_id: %d\n", i, memcg, index);
+ } else {
+ pr_info("smcdef new i: %d, memcg: %px,
kmemcg_id: %d\n", i, memcg, index);
rcu_assign_pointer(mlrus->mlru[index], mlru);
+ }
}
spin_unlock_irqrestore(&lru->lock, flags);
file.txt

Linus Torvalds

unread,
Mar 25, 2022, 4:29:22 PM3/25/22
to Muchun Song, syzbot, Andrew Morton, LKML, Linux Memory Management List, syzkaller-bugs
On Fri, Mar 25, 2022 at 2:52 AM Muchun Song <songm...@bytedance.com> wrote:
>
> We can see that we put the dentry (ffff88807ebda0f8) into
> the list_lru (ffff888011bd47f0). But we do not allocate struct
> list_lru_one for the memcg (ffff88801c530000). Then it panics.

Hmm.

Looking at memcg_slab_pre_alloc_hook(), I note that it will return
success without doing the LRU checking for several cases.

So since you can reproduce the problem, I would suggest you add some
debug code to __d_alloc() that prints out something big if it gets a
dentry but you can't look up the list_lru_one() for that dentry.

Hmm?

The only other situation I can think of is if dentry->d_sb were to
change during the dentry lifetime, but I don't think that can happen.
The only assignment I can find with "git grep" is that

dentry->d_sb = sb;

in __d_alloc(), and while it's possible my grep pattern was bogus, it
sounds unlikely.

Linus

Muchun Song

unread,
Mar 27, 2022, 1:27:55 AM3/27/22
to Linus Torvalds, syzbot, Andrew Morton, LKML, Linux Memory Management List, syzkaller-bugs
I have found the root cause, it was caused by kfence. Here is
the fix patch [1].

[1] https://lore.kernel.org/all/20220327051853.5...@bytedance.com/

Thanks.
Reply all
Reply to author
Forward
0 new messages