mm: use-after-free in collapse_huge_page

Dmitry Vyukov

Aug 28, 2016, 6:42:43 AM
to Andrew Morton, Vlastimil Babka, Kirill A. Shutemov, Mel Gorman, Johannes Weiner, linu...@kvack.org, LKML, Vegard Nossum, Sasha Levin, Konstantin Khlebnikov, Andrey Ryabinin, Greg Thelen, Suleiman Souhlal, Hugh Dickins, David Rientjes, syzkaller, Kostya Serebryany, Alexander Potapenko
Hello,

I've got the following use-after-free in collapse_huge_page while
running syzkaller fuzzer. It is in khugepaged, so not reproducible. On
commit 61c04572de404e52a655a36752e696bbcb483cf5 (Aug 25).

==================================================================
BUG: KASAN: use-after-free in collapse_huge_page+0x28b1/0x3500 at addr
ffff88006c731388
Read of size 8 by task khugepaged/1327
CPU: 0 PID: 1327 Comm: khugepaged Not tainted 4.8.0-rc3+ #33
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
ffffffff884b8280 ffff88003c207920 ffffffff82d1b239 ffffffff89ec1520
fffffbfff1097050 ffff88003e94c700 ffff88006c731300 ffff88006c7313c0
0000000000000000 ffff88003c207b88 ffff88003c207948 ffffffff817da1fc
Call Trace:
[<ffffffff817da82e>] __asan_report_load8_noabort+0x3e/0x40
mm/kasan/report.c:322
[<ffffffff817ff651>] collapse_huge_page+0x28b1/0x3500 mm/khugepaged.c:1004
[< inline >] khugepaged_scan_pmd mm/khugepaged.c:1205
[< inline >] khugepaged_scan_mm_slot mm/khugepaged.c:1718
[< inline >] khugepaged_do_scan mm/khugepaged.c:1799
[<ffffffff8180206b>] khugepaged+0x1dcb/0x2b30 mm/khugepaged.c:1844
[<ffffffff813e8ddf>] kthread+0x23f/0x2d0 drivers/block/aoe/aoecmd.c:1303
[<ffffffff86c256cf>] ret_from_fork+0x1f/0x40 arch/x86/entry/entry_64.S:393
Object at ffff88006c731300, in cache vm_area_struct size: 192
Allocated:
PID = 23069
[<ffffffff8122b7d6>] save_stack_trace+0x26/0x50 arch/x86/kernel/stacktrace.c:67
[<ffffffff817d95e6>] save_stack+0x46/0xd0 mm/kasan/kasan.c:479
[< inline >] set_track mm/kasan/kasan.c:491
[<ffffffff817d985d>] kasan_kmalloc+0xad/0xe0 mm/kasan/kasan.c:582
[<ffffffff817d9d92>] kasan_slab_alloc+0x12/0x20 mm/kasan/kasan.c:521
[<ffffffff817d4fcb>] kmem_cache_alloc+0x12b/0x710 mm/slab.c:3573
[< inline >] kmem_cache_zalloc ./include/linux/slab.h:626
[<ffffffff8177d1ed>] mmap_region+0x63d/0xfe0 mm/mmap.c:1486
[<ffffffff8177e52d>] do_mmap+0x99d/0xbf0 mm/mmap.c:1297
[< inline >] do_mmap_pgoff ./include/linux/mm.h:2044
[<ffffffff81722a26>] vm_mmap_pgoff+0x156/0x1a0 mm/util.c:302
[< inline >] SYSC_mmap_pgoff mm/mmap.c:1347
[<ffffffff81777288>] SyS_mmap_pgoff+0x208/0x580 mm/mmap.c:1305
[< inline >] SYSC_mmap arch/x86/kernel/sys_x86_64.c:95
[<ffffffff8120cc36>] SyS_mmap+0x16/0x20 arch/x86/kernel/sys_x86_64.c:86
[<ffffffff86c25480>] entry_SYSCALL_64_fastpath+0x23/0xc1
arch/x86/entry/entry_64.S:207
Freed:
PID = 23069
[<ffffffff8122b7d6>] save_stack_trace+0x26/0x50 arch/x86/kernel/stacktrace.c:67
[<ffffffff817d95e6>] save_stack+0x46/0xd0 mm/kasan/kasan.c:479
[< inline >] set_track mm/kasan/kasan.c:491
[<ffffffff817d9e12>] kasan_slab_free+0x72/0xc0 mm/kasan/kasan.c:555
[< inline >] __cache_free mm/slab.c:3515
[<ffffffff817d6f96>] kmem_cache_free+0x76/0x300 mm/slab.c:3775
[<ffffffff817727a2>] remove_vma+0x162/0x1b0 mm/mmap.c:168
[< inline >] remove_vma_list mm/mmap.c:2286
[<ffffffff81779017>] do_munmap+0x7c7/0xf00 mm/mmap.c:2509
[<ffffffff8177cd02>] mmap_region+0x152/0xfe0 mm/mmap.c:1459
[<ffffffff8177e52d>] do_mmap+0x99d/0xbf0 mm/mmap.c:1297
[< inline >] do_mmap_pgoff ./include/linux/mm.h:2044
[<ffffffff81722a26>] vm_mmap_pgoff+0x156/0x1a0 mm/util.c:302
[< inline >] SYSC_mmap_pgoff mm/mmap.c:1347
[<ffffffff81777288>] SyS_mmap_pgoff+0x208/0x580 mm/mmap.c:1305
[< inline >] SYSC_mmap arch/x86/kernel/sys_x86_64.c:95
[<ffffffff8120cc36>] SyS_mmap+0x16/0x20 arch/x86/kernel/sys_x86_64.c:86
[<ffffffff86c25480>] entry_SYSCALL_64_fastpath+0x23/0xc1
arch/x86/entry/entry_64.S:207
Memory state around the buggy address:
ffff88006c731280: 00 00 00 00 00 00 00 00 fc fc fc fc fc fc fc fc
ffff88006c731300: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>ffff88006c731380: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
^
ffff88006c731400: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
ffff88006c731480: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
==================================================================
Disabling lock debugging due to kernel taint
==================================================================
BUG: KASAN: use-after-free in pmdp_collapse_flush+0x146/0x160 at addr
ffff88006c731350
Read of size 8 by task khugepaged/1327
CPU: 0 PID: 1327 Comm: khugepaged Tainted: G B 4.8.0-rc3+ #33
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
ffffffff884b8280 ffff88003c2078e0 ffffffff82d1b239 ffffffff00000000
fffffbfff1097050 ffff88003e94c700 ffff88006c731300 ffff88006c7313c0
0000000020000000 ffff88003c207b88 ffff88003c207908 ffffffff817da1fc
Call Trace:
[< inline >] __dump_stack lib/dump_stack.c:15
[<ffffffff82d1b239>] dump_stack+0x12e/0x185 lib/dump_stack.c:51
[<ffffffff817da1fc>] kasan_object_err+0x1c/0x70 mm/kasan/report.c:154
[< inline >] print_address_description mm/kasan/report.c:192
[<ffffffff817da44e>] kasan_report_error+0x1ae/0x490 mm/kasan/report.c:281
[< inline >] kasan_report mm/kasan/report.c:301
[<ffffffff817da82e>] __asan_report_load8_noabort+0x3e/0x40
mm/kasan/report.c:322
[<ffffffff81799f86>] pmdp_collapse_flush+0x146/0x160 mm/pgtable-generic.c:186
[<ffffffff817fde79>] collapse_huge_page+0x10d9/0x3500 mm/khugepaged.c:1019
[< inline >] khugepaged_scan_pmd mm/khugepaged.c:1205
[< inline >] khugepaged_scan_mm_slot mm/khugepaged.c:1718
[< inline >] khugepaged_do_scan mm/khugepaged.c:1799
[<ffffffff8180206b>] khugepaged+0x1dcb/0x2b30 mm/khugepaged.c:1844
[<ffffffff813e8ddf>] kthread+0x23f/0x2d0 drivers/block/aoe/aoecmd.c:1303
[<ffffffff86c256cf>] ret_from_fork+0x1f/0x40 arch/x86/entry/entry_64.S:393
Object at ffff88006c731300, in cache vm_area_struct size: 192
Allocated:
PID = 23069
[<ffffffff8122b7d6>] save_stack_trace+0x26/0x50 arch/x86/kernel/stacktrace.c:67
[<ffffffff817d95e6>] save_stack+0x46/0xd0 mm/kasan/kasan.c:479
[< inline >] set_track mm/kasan/kasan.c:491
[<ffffffff817d985d>] kasan_kmalloc+0xad/0xe0 mm/kasan/kasan.c:582
[<ffffffff817d9d92>] kasan_slab_alloc+0x12/0x20 mm/kasan/kasan.c:521
[<ffffffff817d4fcb>] kmem_cache_alloc+0x12b/0x710 mm/slab.c:3573
[< inline >] kmem_cache_zalloc ./include/linux/slab.h:626
[<ffffffff8177d1ed>] mmap_region+0x63d/0xfe0 mm/mmap.c:1486
[<ffffffff8177e52d>] do_mmap+0x99d/0xbf0 mm/mmap.c:1297
[< inline >] do_mmap_pgoff ./include/linux/mm.h:2044
[<ffffffff81722a26>] vm_mmap_pgoff+0x156/0x1a0 mm/util.c:302
[< inline >] SYSC_mmap_pgoff mm/mmap.c:1347
[<ffffffff81777288>] SyS_mmap_pgoff+0x208/0x580 mm/mmap.c:1305
[< inline >] SYSC_mmap arch/x86/kernel/sys_x86_64.c:95
[<ffffffff8120cc36>] SyS_mmap+0x16/0x20 arch/x86/kernel/sys_x86_64.c:86
[<ffffffff86c25480>] entry_SYSCALL_64_fastpath+0x23/0xc1
arch/x86/entry/entry_64.S:207
Freed:
PID = 23069
[<ffffffff8122b7d6>] save_stack_trace+0x26/0x50 arch/x86/kernel/stacktrace.c:67
[<ffffffff817d95e6>] save_stack+0x46/0xd0 mm/kasan/kasan.c:479
[< inline >] set_track mm/kasan/kasan.c:491
[<ffffffff817d9e12>] kasan_slab_free+0x72/0xc0 mm/kasan/kasan.c:555
[< inline >] __cache_free mm/slab.c:3515
[<ffffffff817d6f96>] kmem_cache_free+0x76/0x300 mm/slab.c:3775
[<ffffffff817727a2>] remove_vma+0x162/0x1b0 mm/mmap.c:168
[< inline >] remove_vma_list mm/mmap.c:2286
[<ffffffff81779017>] do_munmap+0x7c7/0xf00 mm/mmap.c:2509
[<ffffffff8177cd02>] mmap_region+0x152/0xfe0 mm/mmap.c:1459
[<ffffffff8177e52d>] do_mmap+0x99d/0xbf0 mm/mmap.c:1297
[< inline >] do_mmap_pgoff ./include/linux/mm.h:2044
[<ffffffff81722a26>] vm_mmap_pgoff+0x156/0x1a0 mm/util.c:302
[< inline >] SYSC_mmap_pgoff mm/mmap.c:1347
[<ffffffff81777288>] SyS_mmap_pgoff+0x208/0x580 mm/mmap.c:1305
[< inline >] SYSC_mmap arch/x86/kernel/sys_x86_64.c:95
[<ffffffff8120cc36>] SyS_mmap+0x16/0x20 arch/x86/kernel/sys_x86_64.c:86
[<ffffffff86c25480>] entry_SYSCALL_64_fastpath+0x23/0xc1
arch/x86/entry/entry_64.S:207
Memory state around the buggy address:
ffff88006c731200: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
ffff88006c731280: 00 00 00 00 00 00 00 00 fc fc fc fc fc fc fc fc
>ffff88006c731300: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
^
ffff88006c731380: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
ffff88006c731400: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
==================================================================
==================================================================
BUG: KASAN: use-after-free in pmdp_collapse_flush+0x137/0x160 at addr
ffff88006c731340
Read of size 8 by task khugepaged/1327
CPU: 0 PID: 1327 Comm: khugepaged Tainted: G B 4.8.0-rc3+ #33
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
ffffffff884b8280 ffff88003c2078e0 ffffffff82d1b239 ffffffff00000000
fffffbfff1097050 ffff88003e94c700 ffff88006c731300 ffff88006c7313c0
0000000020000000 ffff88003c207b88 ffff88003c207908 ffffffff817da1fc
Call Trace:
[< inline >] __dump_stack lib/dump_stack.c:15
[<ffffffff82d1b239>] dump_stack+0x12e/0x185 lib/dump_stack.c:51
[<ffffffff817da1fc>] kasan_object_err+0x1c/0x70 mm/kasan/report.c:154
[< inline >] print_address_description mm/kasan/report.c:192
[<ffffffff817da44e>] kasan_report_error+0x1ae/0x490 mm/kasan/report.c:281
[< inline >] kasan_report mm/kasan/report.c:301
[<ffffffff817da82e>] __asan_report_load8_noabort+0x3e/0x40
mm/kasan/report.c:322
[<ffffffff81799f77>] pmdp_collapse_flush+0x137/0x160 mm/pgtable-generic.c:186
[<ffffffff817fde79>] collapse_huge_page+0x10d9/0x3500 mm/khugepaged.c:1019
[< inline >] khugepaged_scan_pmd mm/khugepaged.c:1205
[< inline >] khugepaged_scan_mm_slot mm/khugepaged.c:1718
[< inline >] khugepaged_do_scan mm/khugepaged.c:1799
[<ffffffff8180206b>] khugepaged+0x1dcb/0x2b30 mm/khugepaged.c:1844
[<ffffffff813e8ddf>] kthread+0x23f/0x2d0 drivers/block/aoe/aoecmd.c:1303
[<ffffffff86c256cf>] ret_from_fork+0x1f/0x40 arch/x86/entry/entry_64.S:393
Object at ffff88006c731300, in cache vm_area_struct size: 192
Allocated:
PID = 23069
[<ffffffff8122b7d6>] save_stack_trace+0x26/0x50 arch/x86/kernel/stacktrace.c:67
[<ffffffff817d95e6>] save_stack+0x46/0xd0 mm/kasan/kasan.c:479
[< inline >] set_track mm/kasan/kasan.c:491
[<ffffffff817d985d>] kasan_kmalloc+0xad/0xe0 mm/kasan/kasan.c:582
[<ffffffff817d9d92>] kasan_slab_alloc+0x12/0x20 mm/kasan/kasan.c:521
[<ffffffff817d4fcb>] kmem_cache_alloc+0x12b/0x710 mm/slab.c:3573
[< inline >] kmem_cache_zalloc ./include/linux/slab.h:626
[<ffffffff8177d1ed>] mmap_region+0x63d/0xfe0 mm/mmap.c:1486
[<ffffffff8177e52d>] do_mmap+0x99d/0xbf0 mm/mmap.c:1297
[< inline >] do_mmap_pgoff ./include/linux/mm.h:2044
[<ffffffff81722a26>] vm_mmap_pgoff+0x156/0x1a0 mm/util.c:302
[< inline >] SYSC_mmap_pgoff mm/mmap.c:1347
[<ffffffff81777288>] SyS_mmap_pgoff+0x208/0x580 mm/mmap.c:1305
[< inline >] SYSC_mmap arch/x86/kernel/sys_x86_64.c:95
[<ffffffff8120cc36>] SyS_mmap+0x16/0x20 arch/x86/kernel/sys_x86_64.c:86
[<ffffffff86c25480>] entry_SYSCALL_64_fastpath+0x23/0xc1
arch/x86/entry/entry_64.S:207
Freed:
PID = 23069
[<ffffffff8122b7d6>] save_stack_trace+0x26/0x50 arch/x86/kernel/stacktrace.c:67
[<ffffffff817d95e6>] save_stack+0x46/0xd0 mm/kasan/kasan.c:479
[< inline >] set_track mm/kasan/kasan.c:491
[<ffffffff817d9e12>] kasan_slab_free+0x72/0xc0 mm/kasan/kasan.c:555
[< inline >] __cache_free mm/slab.c:3515
[<ffffffff817d6f96>] kmem_cache_free+0x76/0x300 mm/slab.c:3775
[<ffffffff817727a2>] remove_vma+0x162/0x1b0 mm/mmap.c:168
[< inline >] remove_vma_list mm/mmap.c:2286
[<ffffffff81779017>] do_munmap+0x7c7/0xf00 mm/mmap.c:2509
[<ffffffff8177cd02>] mmap_region+0x152/0xfe0 mm/mmap.c:1459
[<ffffffff8177e52d>] do_mmap+0x99d/0xbf0 mm/mmap.c:1297
[< inline >] do_mmap_pgoff ./include/linux/mm.h:2044
[<ffffffff81722a26>] vm_mmap_pgoff+0x156/0x1a0 mm/util.c:302
[< inline >] SYSC_mmap_pgoff mm/mmap.c:1347
[<ffffffff81777288>] SyS_mmap_pgoff+0x208/0x580 mm/mmap.c:1305
[< inline >] SYSC_mmap arch/x86/kernel/sys_x86_64.c:95
[<ffffffff8120cc36>] SyS_mmap+0x16/0x20 arch/x86/kernel/sys_x86_64.c:86
[<ffffffff86c25480>] entry_SYSCALL_64_fastpath+0x23/0xc1
arch/x86/entry/entry_64.S:207
Memory state around the buggy address:
ffff88006c731200: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
ffff88006c731280: 00 00 00 00 00 00 00 00 fc fc fc fc fc fc fc fc
>ffff88006c731300: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
^
ffff88006c731380: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
ffff88006c731400: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
==================================================================
==================================================================
BUG: KASAN: use-after-free in collapse_huge_page+0x231c/0x3500 at addr
ffff88006c731388
Read of size 8 by task khugepaged/1327
CPU: 0 PID: 1327 Comm: khugepaged Tainted: G B 4.8.0-rc3+ #33
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
ffffffff884b8280 ffff88003c207920 ffffffff82d1b239 ffffffff00000000
fffffbfff1097050 ffff88003e94c700 ffff88006c731300 ffff88006c7313c0
0000000000000000 ffff88003c207b88 ffff88003c207948 ffffffff817da1fc
Call Trace:
[< inline >] __dump_stack lib/dump_stack.c:15
[<ffffffff82d1b239>] dump_stack+0x12e/0x185 lib/dump_stack.c:51
[<ffffffff817da1fc>] kasan_object_err+0x1c/0x70 mm/kasan/report.c:154
[< inline >] print_address_description mm/kasan/report.c:192
[<ffffffff817da44e>] kasan_report_error+0x1ae/0x490 mm/kasan/report.c:281
[< inline >] kasan_report mm/kasan/report.c:301
[<ffffffff817da82e>] __asan_report_load8_noabort+0x3e/0x40
mm/kasan/report.c:322
[<ffffffff817ff0bc>] collapse_huge_page+0x231c/0x3500 mm/khugepaged.c:1038
[< inline >] khugepaged_scan_pmd mm/khugepaged.c:1205
[< inline >] khugepaged_scan_mm_slot mm/khugepaged.c:1718
[< inline >] khugepaged_do_scan mm/khugepaged.c:1799
[<ffffffff8180206b>] khugepaged+0x1dcb/0x2b30 mm/khugepaged.c:1844
[<ffffffff813e8ddf>] kthread+0x23f/0x2d0 drivers/block/aoe/aoecmd.c:1303
[<ffffffff86c256cf>] ret_from_fork+0x1f/0x40 arch/x86/entry/entry_64.S:393
Object at ffff88006c731300, in cache vm_area_struct size: 192
Allocated:
PID = 23069
[<ffffffff8122b7d6>] save_stack_trace+0x26/0x50 arch/x86/kernel/stacktrace.c:67
[<ffffffff817d95e6>] save_stack+0x46/0xd0 mm/kasan/kasan.c:479
[< inline >] set_track mm/kasan/kasan.c:491
[<ffffffff817d985d>] kasan_kmalloc+0xad/0xe0 mm/kasan/kasan.c:582
[<ffffffff817d9d92>] kasan_slab_alloc+0x12/0x20 mm/kasan/kasan.c:521
[<ffffffff817d4fcb>] kmem_cache_alloc+0x12b/0x710 mm/slab.c:3573
[< inline >] kmem_cache_zalloc ./include/linux/slab.h:626
[<ffffffff8177d1ed>] mmap_region+0x63d/0xfe0 mm/mmap.c:1486
[<ffffffff8177e52d>] do_mmap+0x99d/0xbf0 mm/mmap.c:1297
[< inline >] do_mmap_pgoff ./include/linux/mm.h:2044
[<ffffffff81722a26>] vm_mmap_pgoff+0x156/0x1a0 mm/util.c:302
[< inline >] SYSC_mmap_pgoff mm/mmap.c:1347
[<ffffffff81777288>] SyS_mmap_pgoff+0x208/0x580 mm/mmap.c:1305
[< inline >] SYSC_mmap arch/x86/kernel/sys_x86_64.c:95
[<ffffffff8120cc36>] SyS_mmap+0x16/0x20 arch/x86/kernel/sys_x86_64.c:86
[<ffffffff86c25480>] entry_SYSCALL_64_fastpath+0x23/0xc1
arch/x86/entry/entry_64.S:207
Freed:
PID = 23069
[<ffffffff8122b7d6>] save_stack_trace+0x26/0x50 arch/x86/kernel/stacktrace.c:67
[<ffffffff817d95e6>] save_stack+0x46/0xd0 mm/kasan/kasan.c:479
[< inline >] set_track mm/kasan/kasan.c:491
[<ffffffff817d9e12>] kasan_slab_free+0x72/0xc0 mm/kasan/kasan.c:555
[< inline >] __cache_free mm/slab.c:3515
[<ffffffff817d6f96>] kmem_cache_free+0x76/0x300 mm/slab.c:3775
[<ffffffff817727a2>] remove_vma+0x162/0x1b0 mm/mmap.c:168
[< inline >] remove_vma_list mm/mmap.c:2286
[<ffffffff81779017>] do_munmap+0x7c7/0xf00 mm/mmap.c:2509
[<ffffffff8177cd02>] mmap_region+0x152/0xfe0 mm/mmap.c:1459
[<ffffffff8177e52d>] do_mmap+0x99d/0xbf0 mm/mmap.c:1297
[< inline >] do_mmap_pgoff ./include/linux/mm.h:2044
[<ffffffff81722a26>] vm_mmap_pgoff+0x156/0x1a0 mm/util.c:302
[< inline >] SYSC_mmap_pgoff mm/mmap.c:1347
[<ffffffff81777288>] SyS_mmap_pgoff+0x208/0x580 mm/mmap.c:1305
[< inline >] SYSC_mmap arch/x86/kernel/sys_x86_64.c:95
[<ffffffff8120cc36>] SyS_mmap+0x16/0x20 arch/x86/kernel/sys_x86_64.c:86
[<ffffffff86c25480>] entry_SYSCALL_64_fastpath+0x23/0xc1
arch/x86/entry/entry_64.S:207
Memory state around the buggy address:
ffff88006c731280: 00 00 00 00 00 00 00 00 fc fc fc fc fc fc fc fc
ffff88006c731300: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>ffff88006c731380: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
^
ffff88006c731400: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
ffff88006c731480: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
==================================================================


For the record here is full crash log:
https://gist.githubusercontent.com/dvyukov/9366a1585f95df0251b9310e4fe33bb1/raw/ad635fb9594a733a95cd6f6c82dffa847f62c2ea/gistfile1.txt
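
As a reading aid for the reports above (an editorial sketch, not part of the original report): all four reports name the same freed object, at ffff88006c731300 in the vm_area_struct cache with size 192, and the faulting addresses differ only in their offset into it. The shadow-byte meanings used below are the generic KASAN slab values (00 accessible, fb freed object, fc redzone). The helper names are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Generic KASAN shadow values that appear in the dumps above. */
#define SHADOW_ACCESSIBLE 0x00 /* all 8 bytes of the granule are valid */
#define SHADOW_FREED      0xfb /* granule belongs to a freed slab object */
#define SHADOW_REDZONE    0xfc /* slab redzone between objects */

/* Hypothetical helper: human-readable name for a shadow byte. */
static const char *shadow_name(uint8_t b)
{
	switch (b) {
	case SHADOW_ACCESSIBLE: return "accessible";
	case SHADOW_FREED:      return "freed object";
	case SHADOW_REDZONE:    return "redzone";
	default:                return "other";
	}
}

/* Hypothetical helper: offset of addr inside [obj, obj + size), or -1. */
static long offset_in_object(uint64_t addr, uint64_t obj, uint64_t size)
{
	assert(size > 0);
	if (addr < obj || addr - obj >= size)
		return -1;
	return (long)(addr - obj);
}

/* Print where each faulting address from the reports lands in the object. */
static void describe_reports(void)
{
	const uint64_t obj = 0xffff88006c731300ULL; /* "Object at ..." */
	const uint64_t size = 192;                  /* "size: 192" */
	const uint64_t faults[] = {
		0xffff88006c731388ULL, /* collapse_huge_page reads */
		0xffff88006c731350ULL, /* pmdp_collapse_flush read */
		0xffff88006c731340ULL, /* pmdp_collapse_flush read */
	};

	for (size_t i = 0; i < sizeof(faults) / sizeof(faults[0]); i++)
		printf("%#llx -> offset %ld, shadow 0x%02x (%s)\n",
		       (unsigned long long)faults[i],
		       offset_in_object(faults[i], obj, size),
		       SHADOW_FREED, shadow_name(SHADOW_FREED));
}
```

Every faulting address falls inside the 192-byte object, which matches the fb shadow bytes under the caret in each dump: khugepaged keeps reading fields of one vm_area_struct that was already freed by the munmap path shown in the "Freed:" stack.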

Kirill A. Shutemov

Aug 29, 2016, 8:43:10 AM
to Dmitry Vyukov, Ebru Akagunduz, Andrea Arcangeli, Andrew Morton, Vlastimil Babka, Mel Gorman, Johannes Weiner, linu...@kvack.org, LKML, Vegard Nossum, Sasha Levin, Konstantin Khlebnikov, Andrey Ryabinin, Greg Thelen, Suleiman Souhlal, Hugh Dickins, David Rientjes, syzkaller, Kostya Serebryany, Alexander Potapenko
On Sun, Aug 28, 2016 at 12:42:21PM +0200, Dmitry Vyukov wrote:
> Hello,
>
> I've got the following use-after-free in collapse_huge_page while
> running syzkaller fuzzer. It is in khugepaged, so not reproducible. On
> commit 61c04572de404e52a655a36752e696bbcb483cf5 (Aug 25).
>
> ==================================================================
> BUG: KASAN: use-after-free in collapse_huge_page+0x28b1/0x3500 at addr
> ffff88006c731388
> Read of size 8 by task khugepaged/1327
> CPU: 0 PID: 1327 Comm: khugepaged Not tainted 4.8.0-rc3+ #33
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
> ffffffff884b8280 ffff88003c207920 ffffffff82d1b239 ffffffff89ec1520
> fffffbfff1097050 ffff88003e94c700 ffff88006c731300 ffff88006c7313c0
> 0000000000000000 ffff88003c207b88 ffff88003c207948 ffffffff817da1fc
> Call Trace:
> [<ffffffff817da82e>] __asan_report_load8_noabort+0x3e/0x40
> mm/kasan/report.c:322
> [<ffffffff817ff651>] collapse_huge_page+0x28b1/0x3500 mm/khugepaged.c:1004

Okay, I think the patch below should do the trick. Build tested only.

Andrea, Ebru, could you re-check if it's reasonable.

From bc6c3589a3da75fabd6d440cce3c8ea23b69c2e5 Mon Sep 17 00:00:00 2001
From: "Kirill A. Shutemov" <kirill....@linux.intel.com>
Date: Mon, 29 Aug 2016 15:32:50 +0300
Subject: [PATCH] khugepaged: fix use-after-free in collapse_huge_page()

hugepage_vma_revalidate() tries to re-check if we still should try to
collapse small pages into huge one after the re-acquiring mmap_sem.

The problem Dmitry Vyukov reported[1] is that the vma found by
hugepage_vma_revalidate() can be suitable for huge pages, but not the
same vma we had before dropping mmap_sem. And dereferencing original vma
can lead to fun results..

Let's use vma hugepage_vma_revalidate() found instead of assuming it's
the same as what we had before the lock was dropped.

[1] http://lkml.kernel.org/r/CACT4Y+Z3gigBvhca9kRJFcjX...@mail.gmail.com

Signed-off-by: Kirill A. Shutemov <kirill....@linux.intel.com>
Reported-by: Dmitry Vyukov <dvy...@google.com>
---
mm/khugepaged.c | 17 +++++++++--------
1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 79c52d0061af..3c8253f160ca 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -838,7 +838,8 @@ static bool hugepage_vma_check(struct vm_area_struct *vma)
* value (scan code).
*/

-static int hugepage_vma_revalidate(struct mm_struct *mm, unsigned long address)
+static int hugepage_vma_revalidate(struct mm_struct *mm, unsigned long address,
+ struct vm_area_struct **vmap)
{
struct vm_area_struct *vma;
unsigned long hstart, hend;
@@ -846,7 +847,7 @@ static int hugepage_vma_revalidate(struct mm_struct *mm, unsigned long address)
if (unlikely(khugepaged_test_exit(mm)))
return SCAN_ANY_PROCESS;

- vma = find_vma(mm, address);
+ *vmap = vma = find_vma(mm, address);
if (!vma)
return SCAN_VMA_NULL;

@@ -898,13 +899,13 @@ static bool __collapse_huge_page_swapin(struct mm_struct *mm,
/* do_swap_page returns VM_FAULT_RETRY with released mmap_sem */
if (ret & VM_FAULT_RETRY) {
down_read(&mm->mmap_sem);
- if (hugepage_vma_revalidate(mm, address)) {
+ if (hugepage_vma_revalidate(mm, address, &vma)) {
/* vma is no longer available, don't continue to swapin */
trace_mm_collapse_huge_page_swapin(mm, swapped_in, referenced, 0);
return false;
}
/* check if the pmd is still valid */
- if (mm_find_pmd(mm, address) != pmd)
+ if (mm_find_pmd(mm, address) != pmd || vma != fe.vma)
return false;
}
if (ret & VM_FAULT_ERROR) {
@@ -923,7 +924,6 @@ static bool __collapse_huge_page_swapin(struct mm_struct *mm,
static void collapse_huge_page(struct mm_struct *mm,
unsigned long address,
struct page **hpage,
- struct vm_area_struct *vma,
int node, int referenced)
{
pmd_t *pmd, _pmd;
@@ -933,6 +933,7 @@ static void collapse_huge_page(struct mm_struct *mm,
spinlock_t *pmd_ptl, *pte_ptl;
int isolated = 0, result = 0;
struct mem_cgroup *memcg;
+ struct vm_area_struct *vma;
unsigned long mmun_start; /* For mmu_notifiers */
unsigned long mmun_end; /* For mmu_notifiers */
gfp_t gfp;
@@ -961,7 +962,7 @@ static void collapse_huge_page(struct mm_struct *mm,
}

down_read(&mm->mmap_sem);
- result = hugepage_vma_revalidate(mm, address);
+ result = hugepage_vma_revalidate(mm, address, &vma);
if (result) {
mem_cgroup_cancel_charge(new_page, memcg, true);
up_read(&mm->mmap_sem);
@@ -994,7 +995,7 @@ static void collapse_huge_page(struct mm_struct *mm,
* handled by the anon_vma lock + PG_lock.
*/
down_write(&mm->mmap_sem);
- result = hugepage_vma_revalidate(mm, address);
+ result = hugepage_vma_revalidate(mm, address, &vma);
if (result)
goto out;
/* check if the pmd is still valid */
@@ -1202,7 +1203,7 @@ out_unmap:
if (ret) {
node = khugepaged_find_target_node();
/* collapse_huge_page will return with the mmap_sem released */
- collapse_huge_page(mm, address, hpage, vma, node, referenced);
+ collapse_huge_page(mm, address, hpage, node, referenced);
}
out:
trace_mm_khugepaged_scan_pmd(mm, page, writable, referenced,
--
Kirill A. Shutemov

Andrea Arcangeli

Aug 29, 2016, 11:35:53 AM
to Kirill A. Shutemov, Dmitry Vyukov, Ebru Akagunduz, Andrew Morton, Vlastimil Babka, Mel Gorman, Johannes Weiner, linu...@kvack.org, LKML, Vegard Nossum, Sasha Levin, Konstantin Khlebnikov, Andrey Ryabinin, Greg Thelen, Suleiman Souhlal, Hugh Dickins, David Rientjes, syzkaller, Kostya Serebryany, Alexander Potapenko
Hello Kirill,

On Mon, Aug 29, 2016 at 03:42:33PM +0300, Kirill A. Shutemov wrote:
> @@ -898,13 +899,13 @@ static bool __collapse_huge_page_swapin(struct mm_struct *mm,
> /* do_swap_page returns VM_FAULT_RETRY with released mmap_sem */
> if (ret & VM_FAULT_RETRY) {
> down_read(&mm->mmap_sem);
> - if (hugepage_vma_revalidate(mm, address)) {
> + if (hugepage_vma_revalidate(mm, address, &vma)) {
> /* vma is no longer available, don't continue to swapin */
> trace_mm_collapse_huge_page_swapin(mm, swapped_in, referenced, 0);
> return false;
> }
> /* check if the pmd is still valid */
> - if (mm_find_pmd(mm, address) != pmd)
> + if (mm_find_pmd(mm, address) != pmd || vma != fe.vma)
> return false;
> }
> if (ret & VM_FAULT_ERROR) {

You check if the vma changed if the mmap_sem was released by the
VM_FAULT_RETRY case but not below:

/*
* Prevent all access to pagetables with the exception of
* gup_fast later handled by the ptep_clear_flush and the VM
> @@ -994,7 +995,7 @@ static void collapse_huge_page(struct mm_struct *mm,
> * handled by the anon_vma lock + PG_lock.
> */
> down_write(&mm->mmap_sem);
> - result = hugepage_vma_revalidate(mm, address);
> + result = hugepage_vma_revalidate(mm, address, &vma);
> if (result)
> goto out;
> /* check if the pmd is still valid */
if (mm_find_pmd(mm, address) != pmd)
goto out;

Here you go ahead without caring whether the vma has changed, as long as
the "vma" pointer was updated to the new one, the pmd is still present
and stable (present and not huge), and all vma details match as
before.

Either we care that the vma changed in both places or we don't in
either of the two places.

The idea was that even if the vma changed it doesn't matter, because
it's still fine to proceed with a collapse if all revalidation checks
pass.

What we failed at was refreshing the vma pointer to the new one after
the vma revalidation passed, so that the code that goes ahead uses the
right vma pointer and not the stale one we got initially.
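
The "refresh the pointer" idea can be sketched in user-space C (an editorial illustration, not kernel code). The types `struct region` and `struct space` and the functions `space_find()` and `space_revalidate()` are hypothetical stand-ins for vm_area_struct, mm_struct, find_vma() and hugepage_vma_revalidate(). The sketch assumes the regions are kept in ascending address order and that the caller holds whatever lock protects the table.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_REGIONS 4

/* Hypothetical stand-ins for vm_area_struct and mm_struct. */
struct region { unsigned long start, end; };
struct space  { struct region *regions[MAX_REGIONS]; };

/* Toy lookup: first region whose end lies above addr, or NULL. */
static struct region *space_find(struct space *sp, unsigned long addr)
{
	for (size_t i = 0; i < MAX_REGIONS; i++) {
		struct region *r = sp->regions[i];

		if (r && addr < r->end)
			return r;
	}
	return NULL;
}

/*
 * Re-check addr after the lock has been re-taken and hand the region
 * found *now* back through out. Returns 0 on success, nonzero if the
 * address is no longer covered. The caller must use *out afterwards
 * and forget any pointer it held before the lock was dropped.
 */
static int space_revalidate(struct space *sp, unsigned long addr,
			    struct region **out)
{
	struct region *r;

	assert(out != NULL);
	r = space_find(sp, addr);
	*out = r;
	if (!r)
		return 1; /* nothing at or above addr */
	if (addr < r->start)
		return 2; /* addr falls in a hole below the region */
	return 0;
}
```

The point of the out-parameter is that a successful revalidation and a valid pointer always arrive together, so the caller has no reason to keep using the pointer from before the lock was dropped.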

Now, checking fe.vma != vma may give the perception of being safer,
but in reality it is not, because the vma may be freed and reallocated
at exactly the same address...

So I think the vma != fe.vma check should be removed: the safety of
the vma revalidation cannot come from checking that the pointer has not
changed; it must come from something else.
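
The free-and-reallocate hazard can be shown with a small user-space C sketch (an editorial illustration only). A one-slot pool makes the address reuse deterministic. The generation counter is an assumption added here purely to make the recycling visible; the patch in this thread does not add one and relies on re-running the lookup under mmap_sem instead. All names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical object with a generation stamp set at allocation time. */
struct obj { unsigned long start, end, gen; };

/* One-slot pool, so a free followed by an alloc reuses the same address. */
static struct obj pool_slot;
static bool pool_busy;
static unsigned long pool_gen;

static struct obj *pool_alloc(unsigned long start, unsigned long end)
{
	if (pool_busy)
		return NULL;
	pool_busy = true;
	pool_slot.start = start;
	pool_slot.end = end;
	pool_slot.gen = ++pool_gen; /* every allocation gets a new stamp */
	return &pool_slot;
}

static void pool_free(struct obj *o)
{
	assert(o == &pool_slot && pool_busy);
	pool_busy = false;
}

/* Pointer equality: cannot tell a recycled object from the original. */
static bool same_by_pointer(const struct obj *then, const struct obj *now)
{
	return then == now;
}

/* Generation compare: notices that the slot was recycled. */
static bool same_by_generation(unsigned long gen_then, const struct obj *now)
{
	return now != NULL && now->gen == gen_then;
}
```

After `pool_free()` and a second `pool_alloc()`, the old and new pointers compare equal even though they describe different ranges, which is why an equality check on the vma pointer proves nothing about whether the vma is still the same one.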

Now reading __collapse_huge_page_swapin I noticed some other unrelated
issues.

if (referenced < HPAGE_PMD_NR/2) {
trace_mm_collapse_huge_page_swapin(mm, swapped_in, referenced, 0);
return false;
}

referenced is never updated by __collapse_huge_page_swapin. So the
above check would be better moved out of the loop, before we start to
kmap the pte.

The bigger issue is that, leaving it where it is right now, we don't
seem to unmap the pte as needed when the "return false" above runs,
which means we're leaking kmap_atomic entries. That will also get
fixed by moving the check before the loop starts and before the first
pte_offset_map.

swapped_in will always show up as zero instead of 1 in the tracing,
but I doubt it makes any difference, so I would move the check out of
the loop instead of adding a pte_unmap before returning, so it runs
faster.
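
The shape of the leak and of the suggested fix can be sketched in user-space C (an editorial illustration, not the kernel code). `map_table()` and `unmap_table()` are hypothetical stand-ins for pte_offset_map() and pte_unmap(), a counter models the outstanding kmap_atomic entries, and `NR_ENTRIES` stands in for HPAGE_PMD_NR.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_ENTRIES 512 /* stand-in for HPAGE_PMD_NR */

/* Toy model of outstanding mappings: map increments, unmap decrements. */
static int live_mappings;

static int *map_table(int *table)
{
	live_mappings++;
	return table;
}

static void unmap_table(int *entry)
{
	(void)entry;
	assert(live_mappings > 0);
	live_mappings--;
}

/* Check inside the loop: the early return skips the unmap. */
static bool swapin_check_in_loop(int *table, int referenced)
{
	int *entry = map_table(table);

	for (size_t i = 0; i < NR_ENTRIES; i++) {
		if (referenced < NR_ENTRIES / 2)
			return false; /* leaks: entry is still mapped */
		(void)entry[i];       /* per-entry work would go here */
	}
	unmap_table(entry);
	return true;
}

/* Check hoisted before the first map: nothing to undo on early return. */
static bool swapin_check_hoisted(int *table, int referenced)
{
	int *entry;

	if (referenced < NR_ENTRIES / 2)
		return false;

	entry = map_table(table);
	for (size_t i = 0; i < NR_ENTRIES; i++)
		(void)entry[i];       /* per-entry work would go here */
	unmap_table(entry);
	return true;
}
```

Because `referenced` does not change inside the loop, hoisting the test changes no outcome except that the early exit now happens before anything has been mapped.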

Ebru Akagunduz

Sep 2, 2016, 8:50:38 AM
to Kirill A. Shutemov, dvy...@google.com, ak...@linux-foundation.org, vba...@suse.cz, mgo...@techsingularity.net, han...@cmpxchg.org, linu...@kvack.org, linux-...@vger.kernel.org, vegard...@oracle.com, levins...@gmail.com, koc...@gmail.com, ryabin...@gmail.com, gth...@google.com, sule...@google.com, hu...@google.com, rien...@google.com, syzk...@googlegroups.com, k...@google.com, gli...@google.com
I don't understand why we need to remove the vma parameter and re-create it here.
> unsigned long mmun_end; /* For mmu_notifiers */
> gfp_t gfp;
> @@ -961,7 +962,7 @@ static void collapse_huge_page(struct mm_struct *mm,
> }
>
> down_read(&mm->mmap_sem);
And without the fe.vma check, this patch seems to work for me.

Andrea, I've just sent a fix patch for leaking mapped ptes.

Kind regards,
Ebru

Kirill A. Shutemov

Sep 7, 2016, 8:26:37 AM
to Andrea Arcangeli, Dmitry Vyukov, Ebru Akagunduz, Andrew Morton, Vlastimil Babka, Mel Gorman, Johannes Weiner, linu...@kvack.org, LKML, Vegard Nossum, Sasha Levin, Konstantin Khlebnikov, Andrey Ryabinin, Greg Thelen, Suleiman Souhlal, Hugh Dickins, David Rientjes, syzkaller, Kostya Serebryany, Alexander Potapenko
[ Finally back to this. ]

Here's updated version.

From 14d748bd8a7eb003efc10b1e5d5b8a644e7181b1 Mon Sep 17 00:00:00 2001
From: "Kirill A. Shutemov" <kirill....@linux.intel.com>
Date: Mon, 29 Aug 2016 15:32:50 +0300
Subject: [PATCH] khugepaged: fix use-after-free in collapse_huge_page()

hugepage_vma_revalidate() tries to re-check if we still should try to
collapse small pages into huge one after the re-acquiring mmap_sem.

The problem Dmitry Vyukov reported[1] is that the vma found by
hugepage_vma_revalidate() can be suitable for huge pages, but not the
same vma we had before dropping mmap_sem. And dereferencing original vma
can lead to fun results..

Let's use vma hugepage_vma_revalidate() found instead of assuming it's
the same as what we had before the lock was dropped.

[1] http://lkml.kernel.org/r/CACT4Y+Z3gigBvhca9kRJFcjX...@mail.gmail.com

Signed-off-by: Kirill A. Shutemov <kirill....@linux.intel.com>
Reported-by: Dmitry Vyukov <dvy...@google.com>
---
mm/khugepaged.c | 15 ++++++++-------
1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index f401e9dfcc0c..728d7790dc2d 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -838,7 +838,8 @@ static bool hugepage_vma_check(struct vm_area_struct *vma)
* value (scan code).
*/

-static int hugepage_vma_revalidate(struct mm_struct *mm, unsigned long address)
+static int hugepage_vma_revalidate(struct mm_struct *mm, unsigned long address,
+ struct vm_area_struct **vmap)
{
struct vm_area_struct *vma;
unsigned long hstart, hend;
@@ -846,7 +847,7 @@ static int hugepage_vma_revalidate(struct mm_struct *mm, unsigned long address)
if (unlikely(khugepaged_test_exit(mm)))
return SCAN_ANY_PROCESS;

- vma = find_vma(mm, address);
+ *vmap = vma = find_vma(mm, address);
if (!vma)
return SCAN_VMA_NULL;

@@ -898,7 +899,7 @@ static bool __collapse_huge_page_swapin(struct mm_struct *mm,
/* do_swap_page returns VM_FAULT_RETRY with released mmap_sem */
if (ret & VM_FAULT_RETRY) {
down_read(&mm->mmap_sem);
- if (hugepage_vma_revalidate(mm, address)) {
+ if (hugepage_vma_revalidate(mm, address, &fe.vma)) {
/* vma is no longer available, don't continue to swapin */
trace_mm_collapse_huge_page_swapin(mm, swapped_in, referenced, 0);
return false;
@@ -923,7 +924,6 @@ static bool __collapse_huge_page_swapin(struct mm_struct *mm,
static void collapse_huge_page(struct mm_struct *mm,
unsigned long address,
struct page **hpage,
- struct vm_area_struct *vma,
int node, int referenced)
{
pmd_t *pmd, _pmd;
@@ -933,6 +933,7 @@ static void collapse_huge_page(struct mm_struct *mm,
spinlock_t *pmd_ptl, *pte_ptl;
int isolated = 0, result = 0;
struct mem_cgroup *memcg;
+ struct vm_area_struct *vma;
unsigned long mmun_start; /* For mmu_notifiers */
unsigned long mmun_end; /* For mmu_notifiers */
gfp_t gfp;
@@ -961,7 +962,7 @@ static void collapse_huge_page(struct mm_struct *mm,
}

down_read(&mm->mmap_sem);
- result = hugepage_vma_revalidate(mm, address);
+ result = hugepage_vma_revalidate(mm, address, &vma);
if (result) {
mem_cgroup_cancel_charge(new_page, memcg, true);
up_read(&mm->mmap_sem);
@@ -994,7 +995,7 @@ static void collapse_huge_page(struct mm_struct *mm,
* handled by the anon_vma lock + PG_lock.
*/
down_write(&mm->mmap_sem);
- result = hugepage_vma_revalidate(mm, address);
+ result = hugepage_vma_revalidate(mm, address, &vma);
if (result)
goto out;
/* check if the pmd is still valid */

Andrea Arcangeli

Sep 7, 2016, 8:40:50 AM
to Kirill A. Shutemov, Dmitry Vyukov, Ebru Akagunduz, Andrew Morton, Vlastimil Babka, Mel Gorman, Johannes Weiner, linu...@kvack.org, LKML, Vegard Nossum, Sasha Levin, Konstantin Khlebnikov, Andrey Ryabinin, Greg Thelen, Suleiman Souhlal, Hugh Dickins, David Rientjes, syzkaller, Kostya Serebryany, Alexander Potapenko
On Wed, Sep 07, 2016 at 03:25:59PM +0300, Kirill A. Shutemov wrote:
> Here's updated version.
>
> From 14d748bd8a7eb003efc10b1e5d5b8a644e7181b1 Mon Sep 17 00:00:00 2001
> From: "Kirill A. Shutemov" <kirill....@linux.intel.com>
> Date: Mon, 29 Aug 2016 15:32:50 +0300
> Subject: [PATCH] khugepaged: fix use-after-free in collapse_huge_page()
>
> hugepage_vma_revalidate() tries to re-check if we still should try to
> collapse small pages into huge one after the re-acquiring mmap_sem.
>
> The problem Dmitry Vyukov reported[1] is that the vma found by
> hugepage_vma_revalidate() can be suitable for huge pages, but not the
> same vma we had before dropping mmap_sem. And dereferencing original vma
> can lead to fun results..
>
> Let's use vma hugepage_vma_revalidate() found instead of assuming it's
> the same as what we had before the lock was dropped.
>
> [1] http://lkml.kernel.org/r/CACT4Y+Z3gigBvhca9kRJFcjX...@mail.gmail.com
>
> Signed-off-by: Kirill A. Shutemov <kirill....@linux.intel.com>
> Reported-by: Dmitry Vyukov <dvy...@google.com>
> ---
> mm/khugepaged.c | 15 ++++++++-------
> 1 file changed, 8 insertions(+), 7 deletions(-)

Reviewed-by: Andrea Arcangeli <aarc...@redhat.com>