[PATCH] mm/hugetlb resv map memory leak for placeholder entries

6 views
Skip to first unread message

Mike Kravetz

unread,
Dec 1, 2015, 9:53:06 PM12/1/15
to Dmitry Vyukov, Andrew Morton, Naoya Horiguchi, Hillf Danton, David Rientjes, Kirill A. Shutemov, Dave Hansen, linu...@kvack.org, linux-...@vger.kernel.org, Hugh Dickins, Greg Thelen, Kostya Serebryany, Alexander Potapenko, Sasha Levin, Eric Dumazet, syzkaller, Mike Kravetz, sta...@vger.kernel.org
Dmitry Vyukov reported the following memory leak

unreferenced object 0xffff88002eaafd88 (size 32):
comm "a.out", pid 5063, jiffies 4295774645 (age 15.810s)
hex dump (first 32 bytes):
28 e9 4e 63 00 88 ff ff 28 e9 4e 63 00 88 ff ff (.Nc....(.Nc....
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
backtrace:
[< inline >] kmalloc include/linux/slab.h:458
[<ffffffff815efa64>] region_chg+0x2d4/0x6b0 mm/hugetlb.c:398
[<ffffffff815f0c63>] __vma_reservation_common+0x2c3/0x390 mm/hugetlb.c:1791
[< inline >] vma_needs_reservation mm/hugetlb.c:1813
[<ffffffff815f658e>] alloc_huge_page+0x19e/0xc70 mm/hugetlb.c:1845
[< inline >] hugetlb_no_page mm/hugetlb.c:3543
[<ffffffff815fc561>] hugetlb_fault+0x7a1/0x1250 mm/hugetlb.c:3717
[<ffffffff815fd349>] follow_hugetlb_page+0x339/0xc70 mm/hugetlb.c:3880
[<ffffffff815a2bb2>] __get_user_pages+0x542/0xf30 mm/gup.c:497
[<ffffffff815a400e>] populate_vma_page_range+0xde/0x110 mm/gup.c:919
[<ffffffff815a4207>] __mm_populate+0x1c7/0x310 mm/gup.c:969
[<ffffffff815b74f1>] do_mlock+0x291/0x360 mm/mlock.c:637
[< inline >] SYSC_mlock2 mm/mlock.c:658
[<ffffffff815b7a4b>] SyS_mlock2+0x4b/0x70 mm/mlock.c:648

Dmitry identified a potential memory leak in the routine region_chg,
where a region descriptor is not free'ed on an error path.

However, the root cause for the above memory leak resides in region_del.
In this specific case, a "placeholder" entry is created in region_chg. The
associated page allocation fails, and the placeholder entry is left in the
reserve map. This is "by design" as the entry should be deleted when the
map is released. The bug is in the region_del routine which is used to
delete entries within a specific range (and when the map is released).
region_del did not handle the case where a placeholder entry exactly matched
the start of the range range to be deleted. In this case, the entry would
not be deleted and leaked. The fix is to take these special placeholder
entries into account in region_del.

The region_chg error path leak is also fixed.

Fixes: feba16e25a57 ("add region_del() to delete a specific range of entries")
Cc: sta...@vger.kernel.org [4.3]
Signed-off-by: Mike Kravetz <mike.k...@oracle.com>
Reported-by: Dmitry Vyukov <dvy...@google.com>
---
mm/hugetlb.c | 12 ++++++++++--
1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 1101ccd94..ba07014 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -372,8 +372,10 @@ retry_locked:
spin_unlock(&resv->lock);

trg = kmalloc(sizeof(*trg), GFP_KERNEL);
- if (!trg)
+ if (!trg) {
+ kfree(nrg);
return -ENOMEM;
+ }

spin_lock(&resv->lock);
list_add(&trg->link, &resv->region_cache);
@@ -483,7 +485,13 @@ static long region_del(struct resv_map *resv, long f, long t)
retry:
spin_lock(&resv->lock);
list_for_each_entry_safe(rg, trg, head, link) {
- if (rg->to <= f)
+ /*
+ * file_region ranges are normally of the form [from, to).
+ * However, there may be a "placeholder" entry in the map
+ * which is of the form (from, to) with from == to. Check
+ * for placeholder entries as well.
+ */
+ if (rg->to <= f && rg->to != rg->from)
continue;
if (rg->from >= t)
break;
--
2.4.3

Hillf Danton

unread,
Dec 2, 2015, 2:12:38 AM12/2/15
to Mike Kravetz, Dmitry Vyukov, Andrew Morton, Naoya Horiguchi, David Rientjes, Kirill A. Shutemov, Dave Hansen, linu...@kvack.org, linux-...@vger.kernel.org, Hugh Dickins, Greg Thelen, Kostya Serebryany, Alexander Potapenko, Sasha Levin, Eric Dumazet, syzkaller
Acked-by: Hillf Danton <hill...@alibaba-inc.com>

Dmitry Vyukov

unread,
Dec 2, 2015, 4:26:47 AM12/2/15
to syzkaller, Mike Kravetz, Andrew Morton, Naoya Horiguchi, David Rientjes, Kirill A. Shutemov, Dave Hansen, linu...@kvack.org, LKML, Hugh Dickins, Greg Thelen, Kostya Serebryany, Alexander Potapenko, Sasha Levin, Eric Dumazet
FWIW, I see this leak also with mlock, mmap, get_mempolicy and page
faults. So it is not specific only to the new fancy mlock2.
> --
> You received this message because you are subscribed to the Google Groups "syzkaller" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to syzkaller+...@googlegroups.com.
> To post to this group, send email to syzk...@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/syzkaller/04ad01d12cd0%24c9bfe070%245d3fa150%24%40alibaba-inc.com.
> For more options, visit https://groups.google.com/d/optout.

Mike Kravetz

unread,
Dec 2, 2015, 10:32:23 AM12/2/15
to Dmitry Vyukov, syzkaller, Andrew Morton, Naoya Horiguchi, David Rientjes, Kirill A. Shutemov, Dave Hansen, linu...@kvack.org, LKML, Hugh Dickins, Greg Thelen, Kostya Serebryany, Alexander Potapenko, Sasha Levin, Eric Dumazet
On 12/02/2015 01:26 AM, Dmitry Vyukov wrote:
> FWIW, I see this leak also with mlock, mmap, get_mempolicy and page
> faults. So it is not specific only to the new fancy mlock2.

I assume/hope the patch addresses leaks with those other calls as well?

--
Mike Kravetz

Mike Kravetz

unread,
Dec 2, 2015, 2:50:27 PM12/2/15
to Dmitry Vyukov, Andrew Morton, Naoya Horiguchi, Hillf Danton, David Rientjes, Kirill A. Shutemov, Dave Hansen, linu...@kvack.org, linux-...@vger.kernel.org, Hugh Dickins, Greg Thelen, Kostya Serebryany, Alexander Potapenko, Sasha Levin, Eric Dumazet, syzkaller, sta...@vger.kernel.org
That check is not sufficient/correct. It will allow placeholder entries
BEFORE the range to be deleted to fall through. This would result in
modifications of those placeholder entries to create bogus/bad entries of
the form [from, to) where from > to.

A V2 of the patch with a more specific check will be sent shortly.
--
Mike Kravetz

Mike Kravetz

unread,
Dec 2, 2015, 3:14:17 PM12/2/15
to Dmitry Vyukov, Andrew Morton, Naoya Horiguchi, Hillf Danton, David Rientjes, Kirill A. Shutemov, Dave Hansen, linu...@kvack.org, linux-...@vger.kernel.org, Hugh Dickins, Greg Thelen, Kostya Serebryany, Alexander Potapenko, Sasha Levin, Eric Dumazet, syzkaller, Mike Kravetz, sta...@vger.kernel.org
V2: The original version of the patch did not correctly handle placeholder
entries before the range to be deleted. The new check is more specific
and only matches placeholders at the start of range.

Fixes: feba16e25a57 ("add region_del() to delete a specific range of entries")
Cc: sta...@vger.kernel.org [4.3]
Signed-off-by: Mike Kravetz <mike.k...@oracle.com>
Reported-by: Dmitry Vyukov <dvy...@google.com>
---
mm/hugetlb.c | 14 ++++++++++++--
1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 1101ccd94..c895ab9 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -372,8 +372,10 @@ retry_locked:
spin_unlock(&resv->lock);

trg = kmalloc(sizeof(*trg), GFP_KERNEL);
- if (!trg)
+ if (!trg) {
+ kfree(nrg);
return -ENOMEM;
+ }

spin_lock(&resv->lock);
list_add(&trg->link, &resv->region_cache);
@@ -483,8 +485,16 @@ static long region_del(struct resv_map *resv, long f, long t)
retry:
spin_lock(&resv->lock);
list_for_each_entry_safe(rg, trg, head, link) {
- if (rg->to <= f)
+ /*
+ * Skip regions before the range to be deleted. file_region
+ * ranges are normally of the form [from, to). However, there
+ * may be a "placeholder" entry in the map which is of the form
+ * (from, to) with from == to. Check for placeholder entries
+ * at the beginning of the range to be deleted.
+ */
+ if (rg->to <= f && (rg->to != rg->from || rg->to != f))
continue;
+
if (rg->from >= t)
break;

--
2.4.3

Hillf Danton

unread,
Dec 3, 2015, 2:00:21 AM12/3/15
to Mike Kravetz, Dmitry Vyukov, Andrew Morton, Naoya Horiguchi, David Rientjes, Kirill A. Shutemov, Dave Hansen, linu...@kvack.org, linux-...@vger.kernel.org, Hugh Dickins, Greg Thelen, Kostya Serebryany, Alexander Potapenko, Sasha Levin, Eric Dumazet, syzkaller
Acked-by: Hillf Danton <hill...@alibaba-inc.com>
Reply all
Reply to author
Forward
0 new messages