mm: BUG in page_move_anon_rmap


Dmitry Vyukov

Jul 1, 2016, 11:32:12 AM
to linu...@kvack.org, Andrew Morton, Kirill A. Shutemov, Vlastimil Babka, Hugh Dickins, LKML, Andrey Ryabinin, Konstantin Khlebnikov, Greg Thelen, Suleiman Souhlal, syzkaller, Kostya Serebryany, Alexander Potapenko, Sasha Levin
Hello,

I am getting the following crashes while running the syzkaller fuzzer on
00bf377d19ad3d80cbc7a036521279a86e397bfb (Jun 29). So far I have not
managed to reproduce it outside of the fuzzer, but the fuzzer hits it
about once per hour.

flags: 0xfffe0000044079(locked|uptodate|dirty|lru|active|head|swapbacked)
page dumped because: VM_BUG_ON_PAGE(page->index !=
linear_page_index(vma, address))
page->mem_cgroup:ffff88003e829be0
------------[ cut here ]------------
kernel BUG at mm/rmap.c:1103!
invalid opcode: 0000 [#2] SMP DEBUG_PAGEALLOC KASAN
Modules linked in:
CPU: 0 PID: 7043 Comm: syz-fuzzer Tainted: G D 4.7.0-rc5+ #22
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
task: ffff8800342f46c0 ti: ffff880034008000 task.ti: ffff880034008000
RIP: 0010:[<ffffffff817693d8>] [<ffffffff817693d8>]
page_move_anon_rmap+0x278/0x310 mm/rmap.c:1103
RSP: 0000:ffff88003400fad0 EFLAGS: 00010286
RAX: ffff8800342f46c0 RBX: ffffea0000928000 RCX: 0000000000000000
RDX: 0000000000000000 RSI: ffff88003ec16de8 RDI: ffffed0006801f41
RBP: ffff88003400fb00 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000000 R11: ffffed000fffea01 R12: ffff88006776b8e8
R13: 001000000c829e00 R14: ffff88006247c3e8 R15: 000000000c829e00
FS: 00007f7627bc5700(0000) GS:ffff88003ec00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000000c829fd8000 CR3: 0000000034b23000 CR4: 00000000000006f0
Stack:
ffffea0000928000 ffffea000092f600 ffff88006776b8e8 ffffea0000928000
ffffea0000928001 000000c829fd8000 ffff88003400fc38 ffffffff8173a25f
0000000000000086 ffff88003400fbd0 ffffea0000928001 ffff880036cd3ec0
Call Trace:
[<ffffffff8173a25f>] do_wp_page+0x7df/0x1c90 mm/memory.c:2402
[<ffffffff817404f5>] handle_pte_fault+0x1e85/0x4960 mm/memory.c:3381
[< inline >] __handle_mm_fault mm/memory.c:3489
[<ffffffff8174443b>] handle_mm_fault+0xeab/0x11a0 mm/memory.c:3518
[<ffffffff81290f77>] __do_page_fault+0x457/0xbb0 arch/x86/mm/fault.c:1356
[<ffffffff8129181f>] trace_do_page_fault+0xdf/0x5b0 arch/x86/mm/fault.c:1449
[<ffffffff81281c24>] do_async_page_fault+0x14/0xd0 arch/x86/kernel/kvm.c:265
[<ffffffff86a9d538>] async_page_fault+0x28/0x30 arch/x86/entry/entry_64.S:923
Code: 0b e8 dd d5 e2 ff 48 c7 c6 40 f7 d0 86 48 89 df e8 2e 4a fc ff
0f 0b e8 c7 d5 e2 ff 48 c7 c6 c0 f7 d0 86 48 89 df e8 18 4a fc ff <0f>
0b e8 b1 d5 e2 ff 4c 89 ee 4c 89 e7 e8 96 80 02 00 49 89 c5
RIP [<ffffffff817693d8>] page_move_anon_rmap+0x278/0x310 mm/rmap.c:1103
RSP <ffff88003400fad0>
---[ end trace b6c02a1136e2a9ec ]---
BUG: sleeping function called from invalid context at include/linux/sched.h:2955
in_atomic(): 1, irqs_disabled(): 0, pid: 7043, name: syz-fuzzer
lockdep is turned off.
CPU: 0 PID: 7043 Comm: syz-fuzzer Tainted: G D 4.7.0-rc5+ #22
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
ffffffff880b58e0 ffff88003400f5c0 ffffffff82cc924f ffffffff342f46c0
fffffbfff1016b1c ffff8800342f46c0 0000000000001b83 0000000000000000
0000000000000000 dffffc0000000000 ffff88003400f5e8 ffffffff813efbfb
Call Trace:
[< inline >] __dump_stack lib/dump_stack.c:15
[<ffffffff82cc924f>] dump_stack+0x12e/0x18f lib/dump_stack.c:51
[<ffffffff813efbfb>] ___might_sleep+0x27b/0x3a0 kernel/sched/core.c:7573
[<ffffffff813efdb0>] __might_sleep+0x90/0x1a0 kernel/sched/core.c:7535
[< inline >] threadgroup_change_begin include/linux/sched.h:2955
[<ffffffff813a175f>] exit_signals+0x7f/0x430 kernel/signal.c:2392
[<ffffffff8137a6a4>] do_exit+0x234/0x2c80 kernel/exit.c:701
[<ffffffff81204331>] oops_end+0xa1/0xd0 arch/x86/kernel/dumpstack.c:250
[<ffffffff812045c6>] die+0x46/0x60 arch/x86/kernel/dumpstack.c:308
[< inline >] do_trap_no_signal arch/x86/kernel/traps.c:192
[<ffffffff811fd9f2>] do_trap+0x192/0x380 arch/x86/kernel/traps.c:238
[<ffffffff811fde4e>] do_error_trap+0x11e/0x280 arch/x86/kernel/traps.c:275
[<ffffffff811ff18b>] do_invalid_op+0x1b/0x20 arch/x86/kernel/traps.c:288
[<ffffffff86a9cf0e>] invalid_op+0x1e/0x30 arch/x86/entry/entry_64.S:761
[<ffffffff8173a25f>] do_wp_page+0x7df/0x1c90 mm/memory.c:2402
[<ffffffff817404f5>] handle_pte_fault+0x1e85/0x4960 mm/memory.c:3381
[< inline >] __handle_mm_fault mm/memory.c:3489
[<ffffffff8174443b>] handle_mm_fault+0xeab/0x11a0 mm/memory.c:3518
[<ffffffff81290f77>] __do_page_fault+0x457/0xbb0 arch/x86/mm/fault.c:1356
[<ffffffff8129181f>] trace_do_page_fault+0xdf/0x5b0 arch/x86/mm/fault.c:1449
[<ffffffff81281c24>] do_async_page_fault+0x14/0xd0 arch/x86/kernel/kvm.c:265
[<ffffffff86a9d538>] async_page_fault+0x28/0x30 arch/x86/entry/entry_64.S:923
note: syz-fuzzer[7043] exited with preempt_count 1

Andrey Ryabinin

Jul 1, 2016, 12:02:49 PM
to Dmitry Vyukov, linu...@kvack.org, Andrew Morton, Kirill A. Shutemov, Vlastimil Babka, Hugh Dickins, LKML, Konstantin Khlebnikov, Greg Thelen, Suleiman Souhlal, syzkaller, Kostya Serebryany, Alexander Potapenko, Sasha Levin


On 07/01/2016 06:31 PM, Dmitry Vyukov wrote:
> Hello,
>
> I am getting the following crashes while running the syzkaller fuzzer on
> 00bf377d19ad3d80cbc7a036521279a86e397bfb (Jun 29). So far I have not
> managed to reproduce it outside of the fuzzer, but the fuzzer hits it
> about once per hour.
>
> flags: 0xfffe0000044079(locked|uptodate|dirty|lru|active|head|swapbacked)

This report is incomplete: it is missing the preceding line with the page address, mapcount, index, etc.

> page dumped because: VM_BUG_ON_PAGE(page->index !=
> linear_page_index(vma, address))
> page->mem_cgroup:ffff88003e829be0
> ------------[ cut here ]------------
> kernel BUG at mm/rmap.c:1103!
> invalid opcode: 0000 [#2] SMP DEBUG_PAGEALLOC KASAN
> Modules linked in:
> CPU: 0 PID: 7043 Comm: syz-fuzzer Tainted: G D 4.7.0-rc5+ #22

So the kernel is already tainted. Can you show us the first oops message?

Dmitry Vyukov

Jul 1, 2016, 12:11:59 PM
to Andrey Ryabinin, linu...@kvack.org, Andrew Morton, Kirill A. Shutemov, Vlastimil Babka, Hugh Dickins, LKML, Konstantin Khlebnikov, Greg Thelen, Suleiman Souhlal, syzkaller, Kostya Serebryany, Alexander Potapenko, Sasha Levin
On Fri, Jul 1, 2016 at 6:03 PM, Andrey Ryabinin <arya...@virtuozzo.com> wrote:
>
>
> On 07/01/2016 06:31 PM, Dmitry Vyukov wrote:
>> Hello,
>>
>> I am getting the following crashes while running the syzkaller fuzzer on
>> 00bf377d19ad3d80cbc7a036521279a86e397bfb (Jun 29). So far I have not
>> managed to reproduce it outside of the fuzzer, but the fuzzer hits it
>> about once per hour.
>>
>> flags: 0xfffe0000044079(locked|uptodate|dirty|lru|active|head|swapbacked)
>
> This report is incomplete: it is missing the preceding line with the page address, mapcount, index, etc.
>
>> page dumped because: VM_BUG_ON_PAGE(page->index !=
>> linear_page_index(vma, address))
>> page->mem_cgroup:ffff88003e829be0
>> ------------[ cut here ]------------
>> kernel BUG at mm/rmap.c:1103!
>> invalid opcode: 0000 [#2] SMP DEBUG_PAGEALLOC KASAN
>> Modules linked in:
>> CPU: 0 PID: 7043 Comm: syz-fuzzer Tainted: G D 4.7.0-rc5+ #22
>
> So the kernel is already tainted. Can you show us the first oops message?

Here are 3 reports from non-tainted kernels:
https://gist.githubusercontent.com/dvyukov/b70bc7ce5d1b69d36c00949ea7dec8ae/raw/0551cd816bf9d7c13ef8249c72dd32b976626086/gistfile1.txt
https://gist.githubusercontent.com/dvyukov/461bd8b185bcd374ccb9ace852b89441/raw/4f77600467717e776ec1c10d136bdf23ddbab3e1/gistfile1.txt
https://gist.githubusercontent.com/dvyukov/0078ec38b3e320173610cf6a0c2e107b/raw/488384222fe5e25d1d425ca29782e0b3e9273ffa/gistfile1.txt

Hugh Dickins

Jul 4, 2016, 7:11:03 PM
to Dmitry Vyukov, linu...@kvack.org, Andrew Morton, Kirill A. Shutemov, Vlastimil Babka, Hugh Dickins, LKML, Andrey Ryabinin, Konstantin Khlebnikov, Greg Thelen, Suleiman Souhlal, syzkaller, Kostya Serebryany, Alexander Potapenko, Sasha Levin
I think 0798d3c022dc ("mm: thp: avoid false positive VM_BUG_ON_PAGE in
page_move_anon_rmap()") is flawed. It would certainly be interesting
to hear whether this patch fixes your crashes:

--- 4.7-rc6/include/linux/pagemap.h 2016-05-29 15:47:38.303064058 -0700
+++ linux/include/linux/pagemap.h 2016-07-04 15:44:46.635147739 -0700
@@ -408,7 +408,7 @@ static inline pgoff_t linear_page_index(
 	pgoff_t pgoff;
 	if (unlikely(is_vm_hugetlb_page(vma)))
 		return linear_hugepage_index(vma, address);
-	pgoff = (address - vma->vm_start) >> PAGE_SHIFT;
+	pgoff = (long)(address - vma->vm_start) >> PAGE_SHIFT;
 	pgoff += vma->vm_pgoff;
 	return pgoff;
 }

But if it does work, I'm not sure that we really want to extend
linear_page_index() to go in a backward direction, just for this case.
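
To illustrate what the "(long)" cast changes, here is a minimal standalone sketch (not kernel code). It assumes the failing case is an address that lies below vma->vm_start, which is what "go in a backward direction" suggests. The struct vma_sketch type, the helper names, and the vm_start value are invented for illustration; the address is CR2 from the oops rounded down to a 2 MiB boundary, which is also an assumption.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12	/* assumption: 4 KiB pages, as on x86-64 */

/* Hypothetical stand-in for the two vm_area_struct fields used here. */
struct vma_sketch {
	uint64_t vm_start;	/* first address covered by the mapping */
	uint64_t vm_pgoff;	/* page offset corresponding to vm_start */
};

/* Unsigned shift, as in the unpatched linear_page_index(). */
static uint64_t index_unsigned(const struct vma_sketch *vma, uint64_t address)
{
	uint64_t pgoff = (address - vma->vm_start) >> PAGE_SHIFT;

	return pgoff + vma->vm_pgoff;
}

/*
 * Signed shift, as in the "(long)" patch. Right-shifting a negative
 * signed value is implementation-defined in ISO C; GCC and Clang
 * perform an arithmetic (sign-extending) shift.
 */
static uint64_t index_signed(const struct vma_sketch *vma, uint64_t address)
{
	int64_t delta = (int64_t)(address - vma->vm_start);
	uint64_t pgoff = (uint64_t)(delta >> PAGE_SHIFT);

	return pgoff + vma->vm_pgoff;
}

/* Usage example with an invented VMA that starts one page above a 2 MiB boundary. */
static void check_example(void)
{
	const struct vma_sketch vma = {
		.vm_start = 0xc829e01000ULL,			/* hypothetical */
		.vm_pgoff = 0xc829e01000ULL >> PAGE_SHIFT,
	};
	/* CR2 from the oops, rounded down to a 2 MiB boundary (assumed). */
	const uint64_t address = 0xc829fd8000ULL & ~((1ULL << 21) - 1);

	/* The negative difference wraps and leaves a stray bit 52 set. */
	assert(index_unsigned(&vma, address) == 0x001000000c829e00ULL);
	/* With the signed shift the index is simply one page below vm_pgoff. */
	assert(index_signed(&vma, address) == 0x000000000c829e00ULL);
}
```

Under these assumptions the unsigned result equals the value in R13 of the oops above and the signed result equals the value in R15. That would be consistent with page->index being correct and only the arithmetic inside the check going wrong, but it is a reading of the register dump, not something the report states.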

I think I'd prefer to skip page_move_anon_rmap()'s linear_page_index()
check in the PageTransHuge case; or, indeed, remove that check (and
the address arg) completely - address plays no further part there.
But let's wait to see what Kirill prefers before committing.

Hugh

Kirill A. Shutemov

Jul 5, 2016, 9:21:55 AM
to Hugh Dickins, Andrea Arcangeli, Dmitry Vyukov, linu...@kvack.org, Andrew Morton, Vlastimil Babka, LKML, Andrey Ryabinin, Konstantin Khlebnikov, Greg Thelen, Suleiman Souhlal, syzkaller, Kostya Serebryany, Alexander Potapenko, Sasha Levin
Yeah, I would rather kill the VM_BUG_ON_PAGE() altogether and simplify
the page_move_anon_rmap() interface. It doesn't make sense to add even
more glue to get the assert to work.

--
Kirill A. Shutemov

Hugh Dickins

Jul 8, 2016, 2:58:27 PM
to Dmitry Vyukov, Hugh Dickins, Kirill A. Shutemov, Andrea Arcangeli, linu...@kvack.org, Andrew Morton, Vlastimil Babka, LKML, Andrey Ryabinin, Konstantin Khlebnikov, Greg Thelen, Suleiman Souhlal, syzkaller, Kostya Serebryany, Alexander Potapenko, Sasha Levin
Hi Dmitry,
Kirill and I agree on what the final patch should be like, but I don't
want to post that until we've heard back from you whether the "(long)"
patch above works - if it does work, fine, we go ahead with removing a
check which has never shown anything but its own problems; but if the
"(long)" patch does not work, then we need to worry and keep the check.

Hugh

Dmitry Vyukov

Jul 11, 2016, 7:01:29 AM
to Hugh Dickins, Kirill A. Shutemov, Andrea Arcangeli, linu...@kvack.org, Andrew Morton, Vlastimil Babka, LKML, Andrey Ryabinin, Konstantin Khlebnikov, Greg Thelen, Suleiman Souhlal, syzkaller, Kostya Serebryany, Alexander Potapenko, Sasha Levin
Sorry for the delay. Yes, it seems to fix the issue. I see no crashes
where I would otherwise expect to see a dozen of them.

Hugh Dickins

Jul 12, 2016, 7:44:36 AM
to Dmitry Vyukov, Hugh Dickins, Kirill A. Shutemov, Andrea Arcangeli, linu...@kvack.org, Andrew Morton, Vlastimil Babka, LKML, Andrey Ryabinin, Konstantin Khlebnikov, Greg Thelen, Suleiman Souhlal, syzkaller, Kostya Serebryany, Alexander Potapenko, Sasha Levin
Thanks a lot, Dmitry, that's very reassuring: patch follows, not touching
linear_page_index(), but removing the problem VM_BUG_ON_PAGE().

Hugh