[syzbot] [mm?] WARNING in unmap_page_range (2)

syzbot

Nov 15, 2023, 8:32:21 AM
to ak...@linux-foundation.org, da...@redhat.com, linux-...@vger.kernel.org, linux-...@vger.kernel.org, linu...@kvack.org, syzkall...@googlegroups.com, usama...@collabora.com, wangkef...@huawei.com
Hello,

syzbot found the following issue on:

HEAD commit: ac347a0655db Merge tag 'arm64-fixes' of git://git.kernel.o..
git tree: upstream
console+strace: https://syzkaller.appspot.com/x/log.txt?x=15ff3057680000
kernel config: https://syzkaller.appspot.com/x/.config?x=287570229f5c0a7c
dashboard link: https://syzkaller.appspot.com/bug?extid=7ca4b2719dc742b8d0a4
compiler: gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=162a25ff680000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=13d62338e80000

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/00e30e1a5133/disk-ac347a06.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/07c43bc37935/vmlinux-ac347a06.xz
kernel image: https://storage.googleapis.com/syzbot-assets/c6690c715398/bzImage-ac347a06.xz

The issue was bisected to:

commit 12f6b01a0bcbeeab8cc9305673314adb3adf80f7
Author: Muhammad Usama Anjum <usama...@collabora.com>
Date: Mon Aug 21 14:15:15 2023 +0000

fs/proc/task_mmu: add fast paths to get/clear PAGE_IS_WRITTEN flag

bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=14e5591f680000
final oops: https://syzkaller.appspot.com/x/report.txt?x=16e5591f680000
console output: https://syzkaller.appspot.com/x/log.txt?x=12e5591f680000

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+7ca4b2...@syzkaller.appspotmail.com
Fixes: 12f6b01a0bcb ("fs/proc/task_mmu: add fast paths to get/clear PAGE_IS_WRITTEN flag")

------------[ cut here ]------------
WARNING: CPU: 0 PID: 5059 at mm/memory.c:1520 zap_pte_range mm/memory.c:1520 [inline]
WARNING: CPU: 0 PID: 5059 at mm/memory.c:1520 zap_pmd_range mm/memory.c:1582 [inline]
WARNING: CPU: 0 PID: 5059 at mm/memory.c:1520 zap_pud_range mm/memory.c:1611 [inline]
WARNING: CPU: 0 PID: 5059 at mm/memory.c:1520 zap_p4d_range mm/memory.c:1632 [inline]
WARNING: CPU: 0 PID: 5059 at mm/memory.c:1520 unmap_page_range+0x1711/0x2c00 mm/memory.c:1653
Modules linked in:
CPU: 0 PID: 5059 Comm: syz-executor416 Not tainted 6.6.0-syzkaller-16039-gac347a0655db #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/09/2023
RIP: 0010:zap_pte_range mm/memory.c:1520 [inline]
RIP: 0010:zap_pmd_range mm/memory.c:1582 [inline]
RIP: 0010:zap_pud_range mm/memory.c:1611 [inline]
RIP: 0010:zap_p4d_range mm/memory.c:1632 [inline]
RIP: 0010:unmap_page_range+0x1711/0x2c00 mm/memory.c:1653
Code: 0f 8e 4a 12 00 00 48 8b 44 24 30 31 ff 0f b6 58 08 89 de e8 d1 00 bf ff 84 db 0f 85 88 f3 ff ff e9 0a f4 ff ff e8 8f 05 bf ff <0f> 0b e9 77 f3 ff ff e8 83 05 bf ff 48 83 44 24 10 08 e9 9d f6 ff
RSP: 0018:ffffc900034bf8f8 EFLAGS: 00010293
RAX: 0000000000000000 RBX: 0000000000000007 RCX: ffffffff81c894fd
RDX: ffff88801ff66040 RSI: ffffffff81c89561 RDI: 0000000000000007
RBP: 0000000000000000 R08: 0000000000000007 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: ffff888074017008 R14: dffffc0000000000 R15: 0000000000000004
FS: 0000000000000000(0000) GS:ffff8880b9800000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f70d28ca0d0 CR3: 000000001d5be000 CR4: 00000000003506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
unmap_single_vma+0x194/0x2b0 mm/memory.c:1699
unmap_vmas+0x229/0x470 mm/memory.c:1743
exit_mmap+0x1ad/0xa60 mm/mmap.c:3308
__mmput+0x12a/0x4d0 kernel/fork.c:1349
mmput+0x62/0x70 kernel/fork.c:1371
exit_mm kernel/exit.c:567 [inline]
do_exit+0x9ad/0x2ae0 kernel/exit.c:858
do_group_exit+0xd4/0x2a0 kernel/exit.c:1021
__do_sys_exit_group kernel/exit.c:1032 [inline]
__se_sys_exit_group kernel/exit.c:1030 [inline]
__x64_sys_exit_group+0x3e/0x50 kernel/exit.c:1030
do_syscall_x64 arch/x86/entry/common.c:51 [inline]
do_syscall_64+0x3f/0x110 arch/x86/entry/common.c:82
entry_SYSCALL_64_after_hwframe+0x63/0x6b
RIP: 0033:0x7f70d284ef39
Code: Unable to access opcode bytes at 0x7f70d284ef0f.
RSP: 002b:00007ffc9cfa2fb8 EFLAGS: 00000246 ORIG_RAX: 00000000000000e7
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f70d284ef39
RDX: 000000000000003c RSI: 00000000000000e7 RDI: 0000000000000000
RBP: 00007f70d28c9270 R08: ffffffffffffffb8 R09: 65732f636f72702f
R10: 0000000000000000 R11: 0000000000000246 R12: 00007f70d28c9270
R13: 0000000000000000 R14: 00007f70d28c9cc0 R15: 00007f70d2820ae0
</TASK>


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzk...@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
For information about bisection process see: https://goo.gl/tpsmEJ#bisection

If the report is already addressed, let syzbot know by replying with:
#syz fix: exact-commit-title

If you want syzbot to run the reproducer, reply with:
#syz test: git://repo/address.git branch-or-commit-hash
If you attach or paste a git patch, syzbot will apply it before testing.

If you want to overwrite report's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)

If the report is a duplicate of another one, reply with:
#syz dup: exact-subject-of-another-report

If you want to undo deduplication, reply with:
#syz undup

Andrew Morton

Nov 15, 2023, 5:00:12 PM
to syzbot, da...@redhat.com, linux-...@vger.kernel.org, linux-...@vger.kernel.org, linu...@kvack.org, syzkall...@googlegroups.com, usama...@collabora.com, wangkef...@huawei.com
On Wed, 15 Nov 2023 05:32:19 -0800 syzbot <syzbot+7ca4b2...@syzkaller.appspotmail.com> wrote:

> Hello,
>
> syzbot found the following issue on:
>
> HEAD commit: ac347a0655db Merge tag 'arm64-fixes' of git://git.kernel.o..
> git tree: upstream
> console+strace: https://syzkaller.appspot.com/x/log.txt?x=15ff3057680000
> kernel config: https://syzkaller.appspot.com/x/.config?x=287570229f5c0a7c
> dashboard link: https://syzkaller.appspot.com/bug?extid=7ca4b2719dc742b8d0a4
> compiler: gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40
> syz repro: https://syzkaller.appspot.com/x/repro.syz?x=162a25ff680000
> C reproducer: https://syzkaller.appspot.com/x/repro.c?x=13d62338e80000
>
> Downloadable assets:
> disk image: https://storage.googleapis.com/syzbot-assets/00e30e1a5133/disk-ac347a06.raw.xz
> vmlinux: https://storage.googleapis.com/syzbot-assets/07c43bc37935/vmlinux-ac347a06.xz
> kernel image: https://storage.googleapis.com/syzbot-assets/c6690c715398/bzImage-ac347a06.xz
>
> The issue was bisected to:
>
> commit 12f6b01a0bcbeeab8cc9305673314adb3adf80f7
> Author: Muhammad Usama Anjum <usama...@collabora.com>
> Date: Mon Aug 21 14:15:15 2023 +0000
>
> fs/proc/task_mmu: add fast paths to get/clear PAGE_IS_WRITTEN flag

Thanks. The bisection is surprising, but the mentioned patch does
mess with pagemap.

How about we add this?

From: Andrew Morton <ak...@linux-foundation.org>
Subject: mm/memory.c:zap_pte_range() print bad swap entry
Date: Wed Nov 15 01:54:18 PM PST 2023

We have a report of this WARN() triggering. Let's print the offending
swp_entry_t to help diagnosis.

Link: https://lkml.kernel.org/r/000000000000b0...@google.com
Cc: Muhammad Usama Anjum <usama...@collabora.com>
Signed-off-by: Andrew Morton <ak...@linux-foundation.org>
---

mm/memory.c | 1 +
1 file changed, 1 insertion(+)

--- a/mm/memory.c~a
+++ a/mm/memory.c
@@ -1521,6 +1521,7 @@ static unsigned long zap_pte_range(struc
continue;
} else {
/* We should have covered all the swap entry types */
+ pr_alert("unrecognized swap entry 0x%lx\n", entry.val);
WARN_ON_ONCE(1);
}
pte_clear_not_present_full(mm, addr, pte, tlb->fullmm);
_
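
Once the patch above prints `entry.val`, the raw value still has to be split into a swap type and an offset by hand. The following is a minimal userspace sketch of that split, assuming a 64-bit kernel where the type occupies the top `MAX_SWAPFILES_SHIFT` bits of the value, as in include/linux/swapops.h. The constants are copied in as assumptions and the `decode_*` helper names are hypothetical; check them against the tree being debugged.

```c
#include <stdint.h>

/*
 * Assumed layout for a 64-bit kernel (see include/linux/swapops.h and
 * include/linux/swap.h); these values are assumptions of this sketch.
 */
#define MODEL_MAX_SWAPFILES_SHIFT 5
#define MODEL_BITS_PER_XA_VALUE   63
#define MODEL_SWP_TYPE_SHIFT \
	(MODEL_BITS_PER_XA_VALUE - MODEL_MAX_SWAPFILES_SHIFT)
#define MODEL_SWP_OFFSET_MASK \
	((UINT64_C(1) << MODEL_SWP_TYPE_SHIFT) - 1)

/* Hypothetical helper: extract the swap type from a raw entry.val. */
static unsigned int decode_swp_type(uint64_t val)
{
	return (unsigned int)(val >> MODEL_SWP_TYPE_SHIFT);
}

/* Hypothetical helper: extract the offset (for a pte marker, the marker bits). */
static uint64_t decode_swp_offset(uint64_t val)
{
	return val & MODEL_SWP_OFFSET_MASK;
}
```

Which type number corresponds to a pte marker depends on the kernel configuration, so the decoded type should be compared against the `SWP_*` definitions of the exact build that produced the log.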

David Hildenbrand

Nov 16, 2023, 4:19:19 AM
to Andrew Morton, syzbot, Muhammad Usama Anjum, linux-...@vger.kernel.org, linux-...@vger.kernel.org, linu...@kvack.org, syzkall...@googlegroups.com, usama...@collabora.com, wangkef...@huawei.com, Peter Xu
I'm curious if

1) make_uffd_wp_pte() won't end up overwriting existing pte markers, for
example, if PTE_MARKER_POISONED is set. [unrelated to this bug]

2) We get the error on arm64, which does *not* support uffd-wp. Do we
maybe end up calling make_uffd_wp_pte() and place a pte marker, even
though we don't have CONFIG_PTE_MARKER_UFFD_WP?


static inline bool pte_marker_entry_uffd_wp(swp_entry_t entry)
{
#ifdef CONFIG_PTE_MARKER_UFFD_WP
return is_pte_marker_entry(entry) &&
(pte_marker_get(entry) & PTE_MARKER_UFFD_WP);
#else
return false;
#endif
}

Will always return false without CONFIG_PTE_MARKER_UFFD_WP.

But make_uffd_wp_pte() might just happily place an entry. Hm.
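
The suspected mismatch can be modelled in userspace. The sketch below is not kernel code: every `model_*` / `MODEL_*` name is hypothetical, and the teardown check is a deliberate simplification of zap_pte_range(). It only illustrates how a writer that installs a marker unconditionally, paired with a reader that is compiled out, would end in the "unrecognized entry" branch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Set to 1 to model a build with CONFIG_PTE_MARKER_UFFD_WP enabled. */
#ifndef MODEL_PTE_MARKER_UFFD_WP
#define MODEL_PTE_MARKER_UFFD_WP 0
#endif

#define MODEL_MARKER_UFFD_WP 0x1u /* stands in for PTE_MARKER_UFFD_WP */

enum model_kind { MODEL_NONE, MODEL_MARKER };

struct model_pte {
	enum model_kind kind;
	uint32_t marker;
};

/* Writer: installs a marker on an empty pte regardless of the config. */
static void model_make_uffd_wp(struct model_pte *pte)
{
	if (pte->kind == MODEL_NONE) {
		pte->kind = MODEL_MARKER;
		pte->marker = MODEL_MARKER_UFFD_WP;
	}
}

/* Reader: mirrors the shape of pte_marker_entry_uffd_wp(). */
static bool model_marker_is_uffd_wp(const struct model_pte *pte)
{
#if MODEL_PTE_MARKER_UFFD_WP
	return pte->kind == MODEL_MARKER &&
	       (pte->marker & MODEL_MARKER_UFFD_WP);
#else
	(void)pte;
	return false;
#endif
}

/* Simplified teardown: true if the pte would reach the WARN branch. */
static bool model_zap_warns(const struct model_pte *pte)
{
	if (pte->kind == MODEL_NONE)
		return false;
	return !model_marker_is_uffd_wp(pte);
}
```

With the default `MODEL_PTE_MARKER_UFFD_WP 0`, calling `model_make_uffd_wp()` on an empty pte and then `model_zap_warns()` returns true; building with `-DMODEL_PTE_MARKER_UFFD_WP=1` makes the same sequence return false.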


The following might fix the problem:

diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 51e0ec658457..ae1cf19918d3 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -1830,8 +1830,10 @@ static void make_uffd_wp_pte(struct vm_area_struct *vma,
ptent = pte_swp_mkuffd_wp(ptent);
set_pte_at(vma->vm_mm, addr, pte, ptent);
} else {
+#ifdef CONFIG_PTE_MARKER_UFFD_WP
set_pte_at(vma->vm_mm, addr, pte,
make_pte_marker(PTE_MARKER_UFFD_WP));
+#endif
}
}


But I am *pretty* sure that that whole machinery should be fenced off.
It does make 0 sense to mess with uffd-wp if there is no uffd-wp support.

--
Cheers,

David / dhildenb

Peter Xu

Nov 16, 2023, 1:00:44 PM
to David Hildenbrand, Andrew Morton, syzbot, Muhammad Usama Anjum, linux-...@vger.kernel.org, linux-...@vger.kernel.org, linu...@kvack.org, syzkall...@googlegroups.com, wangkef...@huawei.com
It should be fine, as:

static void make_uffd_wp_pte(struct vm_area_struct *vma,
unsigned long addr, pte_t *pte)
{
pte_t ptent = ptep_get(pte);

#ifndef CONFIG_USERFAULTFD_

if (pte_present(ptent)) {
pte_t old_pte;

old_pte = ptep_modify_prot_start(vma, addr, pte);
ptent = pte_mkuffd_wp(ptent);
ptep_modify_prot_commit(vma, addr, pte, old_pte, ptent);
} else if (is_swap_pte(ptent)) {
ptent = pte_swp_mkuffd_wp(ptent);
set_pte_at(vma->vm_mm, addr, pte, ptent);
} else { <----------------- this must be pte_none() already
set_pte_at(vma->vm_mm, addr, pte,
make_pte_marker(PTE_MARKER_UFFD_WP));
}
}

I'd like to double check with Muhammad (as I didn't actually follow his
work in the latest versions.. quite a lot changed), but I _think_
fundamentally we missed something important in the fast path, and I think
it applies even to archs that support uffd..

diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index e91085d79926..3b81baabd22a 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -2171,7 +2171,8 @@ static int pagemap_scan_pmd_entry(pmd_t *pmd, unsigned long start,
return 0;
}

- if (!p->vec_out) {
+ if (!p->vec_out &&
+ (p->arg.flags & PM_SCAN_WP_MATCHING)) {
/* Fast path for performing exclusive WP */
for (addr = start; addr != end; pte++, addr += PAGE_SIZE) {
if (pte_uffd_wp(ptep_get(pte)))

There's yet another report in fs list that triggers other issues:

https://lore.kernel.org/all/00000000000077...@google.com/

I'll think over that and I plan to prepare a small patchset to fix all I
saw.

Thanks,

--
Peter Xu

David Hildenbrand

Nov 16, 2023, 1:13:52 PM
to Peter Xu, Andrew Morton, syzbot, Muhammad Usama Anjum, linux-...@vger.kernel.org, linux-...@vger.kernel.org, linu...@kvack.org, syzkall...@googlegroups.com, wangkef...@huawei.com
> It should be fine, as:
>
> static void make_uffd_wp_pte(struct vm_area_struct *vma,
> unsigned long addr, pte_t *pte)
> {
> pte_t ptent = ptep_get(pte);
>
> #ifndef CONFIG_USERFAULTFD_
>
> if (pte_present(ptent)) {
> pte_t old_pte;
>
> old_pte = ptep_modify_prot_start(vma, addr, pte);
> ptent = pte_mkuffd_wp(ptent);
> ptep_modify_prot_commit(vma, addr, pte, old_pte, ptent);
> } else if (is_swap_pte(ptent)) {
> ptent = pte_swp_mkuffd_wp(ptent);
> set_pte_at(vma->vm_mm, addr, pte, ptent);
> } else { <----------------- this must be pte_none() already
> set_pte_at(vma->vm_mm, addr, pte,
> make_pte_marker(PTE_MARKER_UFFD_WP));
> }
> }

Indeed! Is pte_swp_mkuffd_wp() reasonable for pte markers? I remember
that we don't support multiple markers yet, so it might be good enough.

>
>>
>> 2) We get the error on arm64, which does *not* support uffd-wp. Do we
>> maybe end up calling make_uffd_wp_pte() and place a pte marker, even
>> though we don't have CONFIG_PTE_MARKER_UFFD_WP?
>>
>>
>> static inline bool pte_marker_entry_uffd_wp(swp_entry_t entry)
>> {
>> #ifdef CONFIG_PTE_MARKER_UFFD_WP
>> return is_pte_marker_entry(entry) &&
>> (pte_marker_get(entry) & PTE_MARKER_UFFD_WP);
>> #else
>> return false;
>> #endif
>> }
>>
>> Will always return false without CONFIG_PTE_MARKER_UFFD_WP.
>>
>> But make_uffd_wp_pte() might just happily place an entry. Hm.
>>
>>
>> The following might fix the problem:
>>

[...]

>
> I'd like to double check with Muhammad (as I didn't actually follow his
> work in the latest versions.. quite a lot changed), but I _think_
> fundamentally we missed something important in the fast path, and I think
> it applies even to archs that support uffd..
>
> diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
> index e91085d79926..3b81baabd22a 100644
> --- a/fs/proc/task_mmu.c
> +++ b/fs/proc/task_mmu.c
> @@ -2171,7 +2171,8 @@ static int pagemap_scan_pmd_entry(pmd_t *pmd, unsigned long start,
> return 0;
> }
>
> - if (!p->vec_out) {
> + if (!p->vec_out &&
> + (p->arg.flags & PM_SCAN_WP_MATCHING)) {

Ouch, yes. So that's the global fence I was wondering where to find it.

Peter Xu

Nov 16, 2023, 3:04:12 PM
to David Hildenbrand, Andrew Morton, syzbot, Muhammad Usama Anjum, linux-...@vger.kernel.org, linux-...@vger.kernel.org, linu...@kvack.org, syzkall...@googlegroups.com, wangkef...@huawei.com
On Thu, Nov 16, 2023 at 07:13:44PM +0100, David Hildenbrand wrote:
> > It should be fine, as:
> >
> > static void make_uffd_wp_pte(struct vm_area_struct *vma,
> > unsigned long addr, pte_t *pte)
> > {
> > pte_t ptent = ptep_get(pte);
> >
> > #ifndef CONFIG_USERFAULTFD_
> >
> > if (pte_present(ptent)) {
> > pte_t old_pte;
> >
> > old_pte = ptep_modify_prot_start(vma, addr, pte);
> > ptent = pte_mkuffd_wp(ptent);
> > ptep_modify_prot_commit(vma, addr, pte, old_pte, ptent);
> > } else if (is_swap_pte(ptent)) {
> > ptent = pte_swp_mkuffd_wp(ptent);
> > set_pte_at(vma->vm_mm, addr, pte, ptent);
> > } else { <----------------- this must be pte_none() already
> > set_pte_at(vma->vm_mm, addr, pte,
> > make_pte_marker(PTE_MARKER_UFFD_WP));
> > }
> > }
>
> > Indeed! Is pte_swp_mkuffd_wp() reasonable for pte markers? I remember that
> > we don't support multiple markers yet, so it might be good enough.

Not really that reasonable, but nothing harmful either that I see so far;
the current code handles any pte marker without caring about any of those
hint bits.

I can also reproduce this syzbot error easily with !UFFD config on x86.
Let me send the patchset to fix current known issues first.

Thanks,

--
Peter Xu
