[syzbot] [mm?] BUG: unable to handle kernel paging request in move_pages

syzbot

Jul 17, 2025, 3:13:35 PM
to ak...@linux-foundation.org, linux-...@vger.kernel.org, linu...@kvack.org, pet...@redhat.com, syzkall...@googlegroups.com
Hello,

syzbot found the following issue on:

HEAD commit: e8352908bdcd Add linux-next specific files for 20250716
git tree: linux-next
console+strace: https://syzkaller.appspot.com/x/log.txt?x=17f81382580000
kernel config: https://syzkaller.appspot.com/x/.config?x=b7b0e60e17dc5717
dashboard link: https://syzkaller.appspot.com/bug?extid=b446dbe27035ef6bd6c2
compiler: Debian clang version 20.1.7 (++20250616065708+6146a88f6049-1~exp1~20250616065826.132), Debian LLD 20.1.7
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=10041382580000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=10eb158c580000

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/ae8cc81c1781/disk-e8352908.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/57aaea991896/vmlinux-e8352908.xz
kernel image: https://storage.googleapis.com/syzbot-assets/feb871619bd4/bzImage-e8352908.xz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+b446db...@syzkaller.appspotmail.com

BUG: unable to handle page fault for address: ffffea6000391008
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 13fff8067 P4D 13fff8067 PUD 0
Oops: Oops: 0000 [#1] SMP KASAN PTI
CPU: 1 UID: 0 PID: 5860 Comm: syz-executor832 Not tainted 6.16.0-rc6-next-20250716-syzkaller #0 PREEMPT(full)
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 05/07/2025
RIP: 0010:_compound_head include/linux/page-flags.h:284 [inline]
RIP: 0010:move_pages+0xbe6/0x1430 mm/userfaultfd.c:1824
Code: c1 ec 06 4b 8d 1c 2c 48 83 c3 08 48 89 d8 48 c1 e8 03 48 b9 00 00 00 00 00 fc ff df 80 3c 08 00 74 08 48 89 df e8 9a 30 f4 ff <48> 8b 1b 48 89 de 48 83 e6 01 31 ff e8 59 70 8f ff 48 89 d8 48 83
RSP: 0018:ffffc90003f778a8 EFLAGS: 00010246
RAX: 1ffffd4c00072201 RBX: ffffea6000391008 RCX: dffffc0000000000
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
RBP: 0000000000000000 R08: 0000000000000003 R09: 0000000000000004
R10: dffffc0000000000 R11: fffff520007eef00 R12: 0000006000391000
R13: ffffea0000000000 R14: 200018000e4401fd R15: 00002000003ab000
FS: 00007ff35708f6c0(0000) GS:ffff8881258aa000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffea6000391008 CR3: 0000000074390000 CR4: 00000000003526f0
Call Trace:
<TASK>
userfaultfd_move fs/userfaultfd.c:1923 [inline]
userfaultfd_ioctl+0x2e8b/0x4c80 fs/userfaultfd.c:2046
vfs_ioctl fs/ioctl.c:51 [inline]
__do_sys_ioctl fs/ioctl.c:598 [inline]
__se_sys_ioctl+0xfc/0x170 fs/ioctl.c:584
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0xfa/0x3b0 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7ff3570d6519
Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 51 18 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b0 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007ff35708f218 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00007ff357160308 RCX: 00007ff3570d6519
RDX: 0000200000000180 RSI: 00000000c028aa05 RDI: 0000000000000003
RBP: 00007ff357160300 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00007ff35712d074
R13: 0000200000000180 R14: 0000200000000188 R15: 00002000002b9000
</TASK>
Modules linked in:
CR2: ffffea6000391008
---[ end trace 0000000000000000 ]---
RIP: 0010:_compound_head include/linux/page-flags.h:284 [inline]
RIP: 0010:move_pages+0xbe6/0x1430 mm/userfaultfd.c:1824
Code: c1 ec 06 4b 8d 1c 2c 48 83 c3 08 48 89 d8 48 c1 e8 03 48 b9 00 00 00 00 00 fc ff df 80 3c 08 00 74 08 48 89 df e8 9a 30 f4 ff <48> 8b 1b 48 89 de 48 83 e6 01 31 ff e8 59 70 8f ff 48 89 d8 48 83
RSP: 0018:ffffc90003f778a8 EFLAGS: 00010246
RAX: 1ffffd4c00072201 RBX: ffffea6000391008 RCX: dffffc0000000000
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
RBP: 0000000000000000 R08: 0000000000000003 R09: 0000000000000004
R10: dffffc0000000000 R11: fffff520007eef00 R12: 0000006000391000
R13: ffffea0000000000 R14: 200018000e4401fd R15: 00002000003ab000
FS: 00007ff35708f6c0(0000) GS:ffff8881258aa000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffea6000391008 CR3: 0000000074390000 CR4: 00000000003526f0
----------------
Code disassembly (best guess):
0: c1 ec 06 shr $0x6,%esp
3: 4b 8d 1c 2c lea (%r12,%r13,1),%rbx
7: 48 83 c3 08 add $0x8,%rbx
b: 48 89 d8 mov %rbx,%rax
e: 48 c1 e8 03 shr $0x3,%rax
12: 48 b9 00 00 00 00 00 movabs $0xdffffc0000000000,%rcx
19: fc ff df
1c: 80 3c 08 00 cmpb $0x0,(%rax,%rcx,1)
20: 74 08 je 0x2a
22: 48 89 df mov %rbx,%rdi
25: e8 9a 30 f4 ff call 0xfff430c4
* 2a: 48 8b 1b mov (%rbx),%rbx <-- trapping instruction
2d: 48 89 de mov %rbx,%rsi
30: 48 83 e6 01 and $0x1,%rsi
34: 31 ff xor %edi,%edi
36: e8 59 70 8f ff call 0xff8f7094
3b: 48 89 d8 mov %rbx,%rax
3e: 48 rex.W
3f: 83 .byte 0x83
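
The faulting address can be rebuilt from the register dump and the disassembly above. The following is a minimal sketch, not part of the syzbot report. It assumes that R13 holds the vmemmap base, that the `add $0x8` is the offset of the field read by _compound_head(), and that R14 holds the page-table entry value from which the PFN was taken. PFN_MASK and the net shift of 6 (a 64-byte struct page) are likewise assumptions; only the register values are copied from the log.

```python
# Register values copied verbatim from the oops above.
R12 = 0x0000006000391000
R13 = 0xFFFFEA0000000000
R14 = 0x200018000E4401FD
RAX = 0x1FFFFD4C00072201
RBX = 0xFFFFEA6000391008
CR2 = 0xFFFFEA6000391008

U64 = (1 << 64) - 1
FIELD_OFFSET = 0x8              # from `add $0x8,%rbx` in the disassembly
PFN_MASK = 0x000FFFFFFFFFF000   # assumption: PFN lives in bits 12..51
NET_SHIFT = 6                   # assumption: (>> 12 for PFN) then (* 64 bytes)


def struct_page_offset(entry: int) -> int:
    """Offset into the page array implied by a page-table entry value."""
    return (entry & PFN_MASK) >> NET_SHIFT


def faulting_address(base: int, offset: int) -> int:
    """Mirror `lea (%r12,%r13,1),%rbx` followed by `add $0x8,%rbx`."""
    return (base + offset + FIELD_OFFSET) & U64


def kasan_shadow_index(addr: int) -> int:
    """Mirror `shr $0x3,%rax`, the KASAN shadow lookup before the load."""
    return addr >> 3


# The offset derived from R14 matches R12, so the address came from R14.
assert struct_page_offset(R14) == R12
# base + offset + 8 is exactly the address that faulted.
assert faulting_address(R13, R12) == RBX == CR2
# RAX still holds the shadow index computed just before the trap.
assert kasan_shadow_index(RBX) == RAX

print(hex(faulting_address(R13, struct_page_offset(R14))))
```

If these assumptions hold, the load trapped because the PFN extracted from R14 points far beyond any populated part of the page array, which fits a value that was never a valid present mapping.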


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzk...@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

If the report is already addressed, let syzbot know by replying with:
#syz fix: exact-commit-title

If you want syzbot to run the reproducer, reply with:
#syz test: git://repo/address.git branch-or-commit-hash
If you attach or paste a git patch, syzbot will apply it before testing.

If you want to overwrite report's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)

If the report is a duplicate of another one, reply with:
#syz dup: exact-subject-of-another-report

If you want to undo deduplication, reply with:
#syz undup

Peter Xu

Jul 28, 2025, 5:08:26 PM
to syzbot, ak...@linux-foundation.org, linux-...@vger.kernel.org, linu...@kvack.org, syzkall...@googlegroups.com, Lokesh Gidra, Suren Baghdasaryan
Copy Lokesh and Suren.
--
Peter Xu

Suren Baghdasaryan

Jul 28, 2025, 10:51:13 PM
to Peter Xu, syzbot, ak...@linux-foundation.org, linux-...@vger.kernel.org, linu...@kvack.org, syzkall...@googlegroups.com, Lokesh Gidra
On Mon, Jul 28, 2025 at 9:08 PM Peter Xu <pet...@redhat.com> wrote:
>
> Copy Lokesh and Suren.

Thanks! I'll take a closer look tomorrow morning.

Lokesh Gidra

Jul 29, 2025, 4:08:16 AM
to Suren Baghdasaryan, Peter Xu, syzbot, ak...@linux-foundation.org, linux-...@vger.kernel.org, linu...@kvack.org, syzkall...@googlegroups.com
On Mon, Jul 28, 2025 at 7:51 PM Suren Baghdasaryan <sur...@google.com> wrote:
>
> On Mon, Jul 28, 2025 at 9:08 PM Peter Xu <pet...@redhat.com> wrote:
> >
> > Copy Lokesh and Suren.

Thanks Peter!
>
> Thanks! I'll take a closer look tomorrow morning.
>
I think the issue is that we are incorrectly handling src holes in the
THP case. The reproducer is setting 'mode' to
UFFDIO_MOVE_MODE_ALLOW_SRC_HOLES and it seems like the src address is
indeed untouched at the time MOVE ioctl is invoked and hence likely
has a hole.

When this mode is set, we (correctly) don't fail with -ENOENT, but
then instead of skipping the page, we keep going with THP move, which
involves fetching the folio unconditionally from the src_pmd, which is
expected to have no page mapped there.

Suren, can you please double check if my hypothesis is correct?

Suren Baghdasaryan

Jul 29, 2025, 1:51:39 PM
to Lokesh Gidra, Peter Xu, syzbot, ak...@linux-foundation.org, linux-...@vger.kernel.org, linu...@kvack.org, syzkall...@googlegroups.com
On Tue, Jul 29, 2025 at 1:08 AM Lokesh Gidra <lokes...@google.com> wrote:
>
> On Mon, Jul 28, 2025 at 7:51 PM Suren Baghdasaryan <sur...@google.com> wrote:
> >
> > On Mon, Jul 28, 2025 at 9:08 PM Peter Xu <pet...@redhat.com> wrote:
> > >
> > > Copy Lokesh and Suren.
>
> Thanks Peter!
> >
> > Thanks! I'll take a closer look tomorrow morning.
> >
> I think the issue is that we are incorrectly handling src holes in the
> THP case. The reproducer is setting 'mode' to
> UFFDIO_MOVE_MODE_ALLOW_SRC_HOLES and it seems like the src address is
> indeed untouched at the time MOVE ioctl is invoked and hence likely
> has a hole.
>
> When this mode is set, we (correctly) don't fail with -ENOENT, but
> then instead of skipping the page, we keep going with THP move, which
> involves fetching the folio unconditionally from the src_pmd, which is
> expected to have no page mapped there.
>
> Suren, can you please double check if my hypothesis is correct?

I think in the case of a hole the prior call to pmd_trans_huge_lock()
would return NULL and we would not handle it as THP move.
I was able to reproduce the crash, though the call stack is a bit
different. Will try to figure it out.

Suren Baghdasaryan

Jul 30, 2025, 1:09:43 PM
to Lokesh Gidra, Peter Xu, syzbot, ak...@linux-foundation.org, linux-...@vger.kernel.org, linu...@kvack.org, syzkall...@googlegroups.com
On Tue, Jul 29, 2025 at 10:51 AM Suren Baghdasaryan <sur...@google.com> wrote:
>
> On Tue, Jul 29, 2025 at 1:08 AM Lokesh Gidra <lokes...@google.com> wrote:
> >
> > On Mon, Jul 28, 2025 at 7:51 PM Suren Baghdasaryan <sur...@google.com> wrote:
> > >
> > > On Mon, Jul 28, 2025 at 9:08 PM Peter Xu <pet...@redhat.com> wrote:
> > > >
> > > > Copy Lokesh and Suren.
> >
> > Thanks Peter!
> > >
> > > Thanks! I'll take a closer look tomorrow morning.
> > >
> > I think the issue is that we are incorrectly handling src holes in the
> > THP case. The reproducer is setting 'mode' to
> > UFFDIO_MOVE_MODE_ALLOW_SRC_HOLES and it seems like the src address is
> > indeed untouched at the time MOVE ioctl is invoked and hence likely
> > has a hole.
> >
> > When this mode is set, we (correctly) don't fail with -ENOENT, but
> > then instead of skipping the page, we keep going with THP move, which
> > involves fetching the folio unconditionally from the src_pmd, which is
> > expected to have no page mapped there.
> >
> > Suren, can you please double check if my hypothesis is correct?
>
> I think in the case of a hole the prior call to pmd_trans_huge_lock()
> would return NULL and we would not handle it as THP move.
> I was able to reproduce the crash, though the call stack is a bit
> different. Will try to figure it out.

OK, pmd_trans_huge_lock() actually confuses a non-present PMD with a
swap/migration entry and does not return NULL in such cases. I posted
a fix here: https://lore.kernel.org/all/20250730170733....@google.com/
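
The failure mode described in this thread can be illustrated with a toy model. This is only a sketch under stated assumptions: it is not kernel code and not the posted fix. PmdState, toy_trans_huge_lock(), move_unchecked() and move_checked() are hypothetical names invented here. The one behaviour taken from the thread is that the lock helper returns NULL for a hole but not for a non-present swap/migration entry.

```python
from enum import Enum, auto


class PmdState(Enum):
    """Hypothetical, simplified states of a source PMD."""
    NONE = auto()              # hole: nothing mapped
    HUGE_PRESENT = auto()      # present huge mapping with a folio behind it
    NONPRESENT_ENTRY = auto()  # swap/migration entry, no folio to fetch


def toy_trans_huge_lock(state: PmdState) -> bool:
    """Toy stand-in for the lock helper: it fails only for a hole."""
    return state is not PmdState.NONE


def move_unchecked(state: PmdState) -> str:
    """Treats every successful lock as a present huge page."""
    if toy_trans_huge_lock(state):
        if state is not PmdState.HUGE_PRESENT:
            # Models the bad dereference seen in the oops.
            raise LookupError("no folio behind a non-present entry")
        return "thp-move"
    return "pte-path"


def move_checked(state: PmdState) -> str:
    """Re-checks the entry after the lock; one possible defensive pattern."""
    if toy_trans_huge_lock(state):
        if state is not PmdState.HUGE_PRESENT:
            return "retry"  # hypothetical outcome, e.g. back off and retry
        return "thp-move"
    return "pte-path"


for state in PmdState:
    try:
        unchecked = move_unchecked(state)
    except LookupError as exc:
        unchecked = f"crash ({exc})"
    print(f"{state.name}: unchecked={unchecked}, checked={move_checked(state)}")
```

In this model a hole never reaches the THP path, which matches the earlier observation in the thread, while a non-present entry does reach it unless the caller re-checks the PMD after taking the lock.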