[syzbot ci] Re: mm: improve folio refcount scalability

syzbot ci

6:07 AM (13 hours ago)
to ak...@linux-foundation.org, apo...@nvidia.com, artem...@huawei.com, baoli...@linux.alibaba.com, da...@kernel.org, gladysh...@h-partners.com, gorbun...@h-partners.com, harr...@oracle.com, kir...@shutemov.name, liam.h...@oracle.com, linux-...@vger.kernel.org, linu...@kvack.org, lorenzo...@oracle.com, mho...@suse.com, muchu...@linux.dev, rp...@kernel.org, sur...@google.com, torv...@linuxfoundation.org, vba...@suse.cz, wi...@infradead.org, yuz...@google.com, z...@nvidia.com, syz...@lists.linux.dev, syzkall...@googlegroups.com
syzbot ci has tested the following series

[v2] mm: improve folio refcount scalability
https://lore.kernel.org/all/cover.1776350895....@h-partners.com
* [PATCH v2 1/2] mm: drop page refcount zero state semantics
* [PATCH v2 2/2] mm: implement page refcount locking via dedicated bit

and found the following issue:
kernel BUG in get_page_bootmem

Full report is available here:
https://ci.syzbot.org/series/eb14b73a-c461-4be5-b5af-91864e939f4c

***

kernel BUG in get_page_bootmem

tree: mm-new
URL: https://kernel.googlesource.com/pub/scm/linux/kernel/git/akpm/mm.git
base: f4279f87cd6c82ebdaccdc56f38e7b80ca7fcc03
arch: amd64
compiler: Debian clang version 21.1.8 (++20251221033036+2078da43e25a-1~exp1~20251221153213.50), Debian LLD 21.1.8
config: https://ci.syzbot.org/builds/60ced5f4-8c33-43ea-a4ee-92d9b2b8f949/config

ACPI: HPET id: 0x8086a201 base: 0xfed00000
CPU topo: Max. logical packages: 2
CPU topo: Max. logical nodes: 1
CPU topo: Num. nodes per package: 1
CPU topo: Max. logical dies: 2
CPU topo: Max. dies per package: 1
CPU topo: Max. threads per core: 1
CPU topo: Num. cores per package: 1
CPU topo: Num. threads per package: 1
CPU topo: Allowing 2 present CPUs plus 0 hotplug CPUs
kvm-guest: APIC: eoi() replaced with kvm_guest_apic_eoi_write()
PM: hibernation: Registered nosave memory: [mem 0x00000000-0x00000fff]
PM: hibernation: Registered nosave memory: [mem 0x0009f000-0x000fffff]
PM: hibernation: Registered nosave memory: [mem 0x7ffdf000-0xffffffff]
[gap 0xc0000000-0xfed1bfff] available for PCI devices
Booting paravirtualized kernel on KVM
clocksource: refined-jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 19112604462750000 ns
Zone ranges:
DMA [mem 0x0000000000001000-0x0000000000ffffff]
DMA32 [mem 0x0000000001000000-0x00000000ffffffff]
Normal [mem 0x0000000100000000-0x000000023fffffff]
Device empty
Movable zone start for each node
Early memory node ranges
node 0: [mem 0x0000000000001000-0x000000000009efff]
node 0: [mem 0x0000000000100000-0x000000007ffdefff]
node 0: [mem 0x0000000100000000-0x0000000160000fff]
node 1: [mem 0x0000000160001000-0x000000023fffffff]
Initmem setup node 0 [mem 0x0000000000001000-0x0000000160000fff]
Initmem setup node 1 [mem 0x0000000160001000-0x000000023fffffff]
On node 0, zone DMA: 1 pages in unavailable ranges
On node 0, zone DMA: 97 pages in unavailable ranges
On node 0, zone Normal: 33 pages in unavailable ranges
setup_percpu: NR_CPUS:8 nr_cpumask_bits:2 nr_cpu_ids:2 nr_node_ids:2
percpu: Embedded 71 pages/cpu s250120 r8192 d32504 u2097152
kvm-guest: PV spinlocks disabled, no host support
Kernel command line: earlyprintk=serial net.ifnames=0 sysctl.kernel.hung_task_all_cpu_backtrace=1 ima_policy=tcb nf-conntrack-ftp.ports=20000 nf-conntrack-tftp.ports=20000 nf-conntrack-sip.ports=20000 nf-conntrack-irc.ports=20000 nf-conntrack-sane.ports=20000 binder.debug_mask=0 rcupdate.rcu_expedited=1 rcupdate.rcu_cpu_stall_cputime=1 no_hash_pointers page_owner=on sysctl.vm.nr_hugepages=4 sysctl.vm.nr_overcommit_hugepages=4 secretmem.enable=1 sysctl.max_rcu_stall_to_panic=1 msr.allow_writes=off coredump_filter=0xffff root=/dev/sda console=ttyS0 vsyscall=native numa=fake=2 kvm-intel.nested=1 spec_store_bypass_disable=prctl nopcid vivid.n_devs=64 vivid.multiplanar=1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2 netrom.nr_ndevs=32 rose.rose_ndevs=32 smp.csd_lock_timeout=100000 watchdog_thresh=55 workqueue.watchdog_thresh=140 sysctl.net.core.netdev_unregister_timeout_secs=140 dummy_hcd.num=32 max_loop=32 nbds_max=32 \
Kernel command line: comedi.comedi_num_legacy_minors=4 panic_on_warn=1 root=/dev/sda console=ttyS0 root=/dev/sda1
Unknown kernel command line parameters "nbds_max=32", will be passed to user space.
printk: log buffer data + meta data: 262144 + 917504 = 1179648 bytes
software IO TLB: area num 2.
Fallback order for Node 0: 0 1
Fallback order for Node 1: 1 0
Built 2 zonelists, mobility grouping on. Total pages: 1834877
Policy zone: Normal
mem auto-init: stack:all(zero), heap alloc:on, heap free:off
stackdepot: allocating hash table via alloc_large_system_hash
stackdepot hash table entries: 1048576 (order: 12, 16777216 bytes, linear)
stackdepot: allocating space for 8192 stack pools via memblock
------------[ cut here ]------------
kernel BUG at ./include/linux/page_ref.h:171!
Oops: invalid opcode: 0000 [#1] SMP KASAN PTI
CPU: 0 UID: 0 PID: 0 Comm: swapper Not tainted syzkaller #0 PREEMPT(undef)
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
RIP: 0010:get_page_bootmem+0x188/0x190
Code: 86 ff 90 0f 0b e8 98 52 86 ff 90 0f 0b e8 90 52 86 ff 48 89 df 48 c7 c6 00 e4 dd 8b e8 51 d7 e8 fe 90 0f 0b e8 79 52 86 ff 90 <0f> 0b 66 0f 1f 44 00 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90
RSP: 0000:ffffffff8e407e50 EFLAGS: 00010093
RAX: ffffffff823f42b7 RBX: ffffea00057ffec0 RCX: ffffffff8e494ec0
RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000000000001
RBP: 0000000000000001 R08: ffffea00057ffef7 R09: 1ffffd4000afffde
R10: dffffc0000000000 R11: fffff94000afffdf R12: dffffc0000000000
R13: 0000000000000000 R14: ffffea00057ffef4 R15: 0000000000000003
FS: 0000000000000000(0000) GS:ffff88818de62000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffff88823ffff000 CR3: 000000000e54c000 CR4: 00000000000000b0
Call Trace:
<TASK>
register_page_bootmem_info_node+0x88/0x410
register_page_bootmem_info+0x77/0xc0
mem_init+0x5a/0xb0
mm_core_init+0x79/0xb0
start_kernel+0x15a/0x3d0
x86_64_start_reservations+0x24/0x30
x86_64_start_kernel+0x143/0x1c0
common_startup_64+0x13e/0x147
</TASK>
Modules linked in:
---[ end trace 0000000000000000 ]---
RIP: 0010:get_page_bootmem+0x188/0x190
Code: 86 ff 90 0f 0b e8 98 52 86 ff 90 0f 0b e8 90 52 86 ff 48 89 df 48 c7 c6 00 e4 dd 8b e8 51 d7 e8 fe 90 0f 0b e8 79 52 86 ff 90 <0f> 0b 66 0f 1f 44 00 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90
RSP: 0000:ffffffff8e407e50 EFLAGS: 00010093
RAX: ffffffff823f42b7 RBX: ffffea00057ffec0 RCX: ffffffff8e494ec0
RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000000000001
RBP: 0000000000000001 R08: ffffea00057ffef7 R09: 1ffffd4000afffde
R10: dffffc0000000000 R11: fffff94000afffdf R12: dffffc0000000000
R13: 0000000000000000 R14: ffffea00057ffef4 R15: 0000000000000003
FS: 0000000000000000(0000) GS:ffff88818de62000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffff88823ffff000 CR3: 000000000e54c000 CR4: 00000000000000b0


***

If these findings have caused you to resend the series or submit a
separate fix, please add the following tag to your commit message:
Tested-by: syz...@syzkaller.appspotmail.com

---
This report is generated by a bot. It may contain errors.
syzbot ci engineers can be reached at syzk...@googlegroups.com.

To test a patch for this bug, please reply with `#syz test`
(should be on a separate line).

The patch should be attached to the email.
Note: arguments like custom git repos and branches are not supported.

Gorbunov Ivan

8:32 AM (11 hours ago)
to syzbot ci, ak...@linux-foundation.org, apo...@nvidia.com, artem...@huawei.com, baoli...@linux.alibaba.com, da...@kernel.org, gladysh...@h-partners.com, harr...@oracle.com, kir...@shutemov.name, liam.h...@oracle.com, linux-...@vger.kernel.org, linu...@kvack.org, lorenzo...@oracle.com, mho...@suse.com, muchu...@linux.dev, rp...@kernel.org, sur...@google.com, torv...@linuxfoundation.org, vba...@suse.cz, wi...@infradead.org, yuz...@google.com, z...@nvidia.com, syz...@lists.linux.dev, syzkall...@googlegroups.com
Apologies to all. The logic in the debug check was accidentally inverted
during a rebase.

#syz test

diff --git a/include/linux/page_ref.h b/include/linux/page_ref.h
index 32194e953674..ca6e43b0cf95 100644
--- a/include/linux/page_ref.h
+++ b/include/linux/page_ref.h
@@ -64,7 +64,7 @@ static inline void __page_ref_unfreeze(struct page *page, int v)
 
 static inline bool __page_count_is_frozen(int count)
 {
-	return count > 0 && !((count & PAGEREF_FROZEN_BIT) != 0);
+	return count & PAGEREF_FROZEN_BIT;
 }
 
 static inline int page_ref_count(const struct page *page)
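
For readers following the one-line change above, here is a minimal standalone sketch of the two predicates side by side. It is not taken from the series: the value of PAGEREF_FROZEN_BIT is not shown in this thread, so the bit chosen below is a placeholder assumption, and the two function names are hypothetical. It only illustrates that the removed expression returns true for a positive count whose frozen bit is clear, while the replacement returns true exactly when the bit is set.

```c
#include <stdbool.h>

/*
 * Placeholder value (assumption): the real PAGEREF_FROZEN_BIT is defined
 * by the patch series and does not appear in this thread.
 */
#define PAGEREF_FROZEN_BIT 0x40000000

/* Expression removed by the patch: true for a positive count with the bit CLEAR. */
static inline bool count_is_frozen_inverted(int count)
{
	return count > 0 && !((count & PAGEREF_FROZEN_BIT) != 0);
}

/* Expression added by the patch: true exactly when the bit is SET. */
static inline bool count_is_frozen_fixed(int count)
{
	return count & PAGEREF_FROZEN_BIT;
}
```

Under these assumptions, a count of 1 is reported as frozen by the removed expression and as not frozen by the replacement; a count with the placeholder bit set gives the opposite pair of answers.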

syzbot ci

9:21 AM (10 hours ago)
to gorbun...@h-partners.com, ak...@linux-foundation.org, apo...@nvidia.com, artem...@huawei.com, baoli...@linux.alibaba.com, da...@kernel.org, gladysh...@h-partners.com, harr...@oracle.com, kir...@shutemov.name, liam.h...@oracle.com, linux-...@vger.kernel.org, linu...@kvack.org, lorenzo...@oracle.com, mho...@suse.com, muchu...@linux.dev, rp...@kernel.org, sur...@google.com, syz...@lists.linux.dev, syzkall...@googlegroups.com, torv...@linuxfoundation.org, vba...@suse.cz, wi...@infradead.org, yuz...@google.com, z...@nvidia.com, syz...@lists.linux.dev, syzkall...@googlegroups.com
syzbot ci has tested the suggested fix patch on top of the following series:

[v2] mm: improve folio refcount scalability
https://lore.kernel.org/all/cover.1776350895....@h-partners.com

Patch: https://ci.syzbot.org/jobs/1f75dd6a-7a6f-4420-ae4b-67a071622e07/patch

The patch testing request could not be completed:
Testing failed due to an infrastructure error.
Testing results:
* [build 0] Build Patched: error

Full report is available here:
https://ci.syzbot.org/session/0e12b11a-0902-43fb-b549-6c0cc5ae45eb