[syzbot] [mm?] UBSAN: shift-out-of-bounds in do_shrink_slab

syzbot

Jun 1, 2024, 3:08:28 AM
to ak...@linux-foundation.org, da...@fromorbit.com, linux-...@vger.kernel.org, linu...@kvack.org, muchu...@linux.dev, roman.g...@linux.dev, syzkall...@googlegroups.com, zhengq...@bytedance.com
Hello,

syzbot found the following issue on:

HEAD commit: 6dc544b66971 Add linux-next specific files for 20240528
git tree: linux-next
console output: https://syzkaller.appspot.com/x/log.txt?x=14c7f806980000
kernel config: https://syzkaller.appspot.com/x/.config?x=6a363b35598e573d
dashboard link: https://syzkaller.appspot.com/bug?extid=981b8efffb3d71c46bef
compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40

Unfortunately, I don't have any reproducer for this issue yet.

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/334699ab67f8/disk-6dc544b6.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/4ca32b2218ce/vmlinux-6dc544b6.xz
kernel image: https://storage.googleapis.com/syzbot-assets/400bc5f019b3/bzImage-6dc544b6.xz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+981b8e...@syzkaller.appspotmail.com

------------[ cut here ]------------
UBSAN: shift-out-of-bounds in mm/shrinker.c:406:18
shift exponent -1 is negative
CPU: 0 PID: 5278 Comm: syz-executor.1 Not tainted 6.10.0-rc1-next-20240528-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 04/02/2024
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0x241/0x360 lib/dump_stack.c:114
ubsan_epilogue lib/ubsan.c:231 [inline]
__ubsan_handle_shift_out_of_bounds+0x3c8/0x420 lib/ubsan.c:468
do_shrink_slab+0xe26/0x1160 mm/shrinker.c:406
shrink_slab_memcg mm/shrinker.c:548 [inline]
shrink_slab+0x87c/0x14d0 mm/shrinker.c:626
shrink_node_memcgs mm/vmscan.c:5923 [inline]
shrink_node+0xb82/0x4150 mm/vmscan.c:5961
shrink_zones mm/vmscan.c:6205 [inline]
do_try_to_free_pages+0x789/0x1cb0 mm/vmscan.c:6267
try_to_free_mem_cgroup_pages+0x48f/0xb10 mm/vmscan.c:6598
try_charge_memcg+0x704/0x1850 mm/memcontrol.c:2946
obj_cgroup_charge_pages mm/memcontrol.c:3420 [inline]
__memcg_kmem_charge_page+0xe2/0x250 mm/memcontrol.c:3446
__alloc_pages_noprof+0x28c/0x6c0 mm/page_alloc.c:4712
__alloc_pages_node_noprof include/linux/gfp.h:269 [inline]
alloc_pages_node_noprof include/linux/gfp.h:296 [inline]
bpf_ringbuf_area_alloc kernel/bpf/ringbuf.c:122 [inline]
bpf_ringbuf_alloc+0xcb/0x420 kernel/bpf/ringbuf.c:170
ringbuf_map_alloc+0x1d7/0x2f0 kernel/bpf/ringbuf.c:204
map_create+0x90c/0x1200 kernel/bpf/syscall.c:1333
__sys_bpf+0x6d1/0x810 kernel/bpf/syscall.c:5669
__do_sys_bpf kernel/bpf/syscall.c:5794 [inline]
__se_sys_bpf kernel/bpf/syscall.c:5792 [inline]
__x64_sys_bpf+0x7c/0x90 kernel/bpf/syscall.c:5792
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7efea107cee9
Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 e1 20 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b0 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007efea1de60c8 EFLAGS: 00000246 ORIG_RAX: 0000000000000141
RAX: ffffffffffffffda RBX: 00007efea11b3fa0 RCX: 00007efea107cee9
RDX: 0000000000000048 RSI: 00000000200002c0 RDI: 0000000000000000
RBP: 00007efea10c947f R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 000000000000000b R14: 00007efea11b3fa0 R15: 00007fff6651b5d8
</TASK>
---[ end trace ]---


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzk...@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

If the report is already addressed, let syzbot know by replying with:
#syz fix: exact-commit-title

If you want to overwrite report's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)

If the report is a duplicate of another one, reply with:
#syz dup: exact-subject-of-another-report

If you want to undo deduplication, reply with:
#syz undup

Dave Chinner

Jun 2, 2024, 8:51:56 PM
to syzbot, ak...@linux-foundation.org, linux-...@vger.kernel.org, linu...@kvack.org, muchu...@linux.dev, roman.g...@linux.dev, syzkall...@googlegroups.com, zhengq...@bytedance.com
total_scan = nr >> priority;

Ok, that means the shrinker has been passed a priority of -1 from
the core memory reclaim code. That means it is more likely that
something has gone wrong with the higher level struct scan_control
sc->priority handling, not something in the shrinker code itself.
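
For illustration only: in C, a shift by a negative count is undefined, which is what UBSAN is flagging at mm/shrinker.c:406. Below is a minimal userspace sketch of the same shape of computation with the shift count range-checked. The helper name scan_target_checked() is hypothetical and this is not the kernel's code.

```c
#include <limits.h>
#include <stdbool.h>

/*
 * Hypothetical userspace helper, not kernel code. It mirrors the shape of
 * "total_scan = nr >> priority" but refuses shift counts that C leaves
 * undefined: negative, or at least as wide as the type being shifted.
 */
static bool scan_target_checked(unsigned long nr, int priority,
                                unsigned long *out)
{
    if (priority < 0 || priority >= (int)(sizeof(nr) * CHAR_BIT))
        return false;   /* the shift would be undefined behaviour */

    *out = nr >> priority;
    return true;
}
```

With priority == 12 (the kernel's DEF_PRIORITY) and nr == 4096 the helper stores 1; with priority == -1 it returns false instead of performing the shift.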

> shrink_slab_memcg mm/shrinker.c:548 [inline]
> shrink_slab+0x87c/0x14d0 mm/shrinker.c:626
> shrink_node_memcgs mm/vmscan.c:5923 [inline]
> shrink_node+0xb82/0x4150 mm/vmscan.c:5961
> shrink_zones mm/vmscan.c:6205 [inline]
> do_try_to_free_pages+0x789/0x1cb0 mm/vmscan.c:6267

This has a loop that does:

do {
.....
shrink_zones(zonelist, sc);
.....
} while (--sc->priority >= 0);

and all the callers initialise sc->priority to DEF_PRIORITY. Hence
I can't see how shrink_zones() gets called with sc->priority
== -1 from here or anywhere else that decrements sc->priority. This
needs someone with more core mm reclaim expertise than I have to
triage this further.

-Dave.
--
Dave Chinner
da...@fromorbit.com

Qi Zheng

Jun 3, 2024, 3:41:32 AM
to syzbot, ak...@linux-foundation.org, da...@fromorbit.com, linux-...@vger.kernel.org, linu...@kvack.org, muchu...@linux.dev, roman.g...@linux.dev, syzkall...@googlegroups.com, zhengq...@bytedance.com, shakee...@linux.dev, Johannes Weiner
Hi,

I think this bug was introduced by commit 6be5e186fd65
("mm: vmscan: restore incremental cgroup iteration"), and
can be fixed by commit 9c8805439853 ("mm: vmscan: reset sc->priority on
retry").
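
A rough userspace model of why a retry matters, assuming (as the title of the fix suggests) that a retry path re-enters the priority loop without reinitialising sc->priority. The function lowest_priority_seen() and its parameters are hypothetical; this is a sketch of the loop arithmetic, not the vmscan code.

```c
#define DEF_PRIORITY 12   /* assumed to match the kernel's default */

/*
 * Hypothetical model, not kernel code. Runs the reclaim-style
 * do/while loop "passes" times and returns the lowest priority value
 * that the loop body (standing in for shrink_zones()) ever saw.
 */
static int lowest_priority_seen(int passes, int reset_on_retry)
{
    int priority = DEF_PRIORITY;
    int lowest = DEF_PRIORITY;

    for (int pass = 0; pass < passes; pass++) {
        if (reset_on_retry)
            priority = DEF_PRIORITY;

        do {
            if (priority < lowest)
                lowest = priority;   /* body runs with this priority */
        } while (--priority >= 0);
        /* The loop always exits with priority == -1. */
    }
    return lowest;
}
```

In this model a single pass never goes below 0. A second pass without a reset starts at -1, so the body runs once with a negative priority; resetting before each pass keeps the minimum at 0.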

Thanks,
Qi

Roman Gushchin

Jun 9, 2024, 7:51:30 PM
to Qi Zheng, syzbot, ak...@linux-foundation.org, da...@fromorbit.com, linux-...@vger.kernel.org, linu...@kvack.org, muchu...@linux.dev, syzkall...@googlegroups.com, zhengq...@bytedance.com, shakee...@linux.dev, Johannes Weiner
On Mon, Jun 03, 2024 at 11:25:42AM +0800, Qi Zheng wrote:
> Hi,
>
> I think this bug was introduced by commit 6be5e186fd65
> ("mm: vmscan: restore incremental cgroup iteration"), and
> can be fixed by commit 9c8805439853 ("mm: vmscan: reset sc->priority on
> retry").

I'm almost sure it's the same issue.

Thanks