Greetings,
PHC is a probabilistic heap checker that I have been working on. It has
landed, and I am planning to enable it on Monday morning AEST (which is
Sunday afternoon/evening in many parts of the world). For now it will only
be enabled on Linux64 Nightly builds (and also local developer builds), but
I hope to also enable it on Win64 Nightly builds soon.
Here is a short description of PHC that comes from the comment at the top
of memory/replace/phc/PHC.cpp:
// PHC is a probabilistic heap checker. A tiny fraction of randomly chosen heap
// allocations are subject to some expensive checking via the use of OS page
// access protection. A failed check triggers a crash, whereupon useful
// information about the failure is put into the crash report. The cost and
// coverage for each user is minimal, but spread over the entire user base the
// coverage becomes significant.
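To make the mechanism concrete, here is a heavily simplified sketch of the
core trick (my own illustration, not the actual PHC code): a sampled
allocation gets a whole OS page, and freeing it flips the page's protection
so that any later access through a stale pointer faults immediately.

  #include <sys/mman.h>
  #include <cassert>
  #include <cstddef>

  // Hand out a whole OS page for a sampled allocation.
  void* SampledAlloc(size_t aSize) {
    assert(aSize <= 4096);
    return mmap(nullptr, 4096, PROT_READ | PROT_WRITE,
                MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  }

  // "Free" the allocation by making its page inaccessible. Any later read
  // or write through a stale pointer now raises a fault, which the crash
  // reporter can turn into a useful report.
  void SampledFree(void* aPtr) {
    mprotect(aPtr, 4096, PROT_NONE);
  }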
I have included the rest of that comment, which describes the
implementation in more detail, at the bottom of this email. Also see bug
1523276 <https://bugzilla.mozilla.org/show_bug.cgi?id=1523276> for
additional details.
This is a tool that will hopefully detect use-after-free errors in the
wild. It is a somewhat complex wrapper around our heap allocator, and
everything depends on the heap allocator. This means that any bugs in PHC
have the potential to cause a wide variety of problems. Please keep an eye
out for any such problems. It's hard to say exactly what those problems
might be... during testing I saw and fixed some deadlocks and some crashes
within arena_dalloc(), but other effects are possible. Also, PHC is
completely non-deterministic (by design), which further complicates the
detection of problems.
Please let me know if you have any questions or concerns.
Nick
// The idea comes from Chromium, where it is called GWP-ASAN. (Firefox uses PHC
// as the name because GWP-ASAN is long, awkward, and doesn't have any
// particular meaning.)
//
// In the current implementation up to 64 allocations per process can become
// PHC allocations. These allocations must be page-sized or smaller. Each PHC
// allocation gets its own page, and when the allocation is freed its page is
// marked inaccessible until the page is reused for another allocation. This
// means that a use-after-free defect (which includes double-frees) will be
// caught if the use occurs before the page is reused for another allocation.
// The crash report will contain stack traces for the allocation site, the free
// site, and the use-after-free site, which is often enough to diagnose the
// defect.
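As a rough illustration of the bookkeeping this implies (the names and
layout below are mine, not PHC's actual definitions), each of the 64 pages
carries metadata along these lines:

  #include <cstddef>

  struct StackTrace {
    static const size_t kMaxFrames = 16;
    void* mPcs[kMaxFrames];  // program counters of the captured frames
    size_t mLength;
  };

  enum class PageState {
    Unused,  // not yet handed out, or ready for reuse
    InUse,   // a live PHC allocation
    Freed,   // freed and protected; any access here is a caught UAF
  };

  struct PageInfo {
    PageState mState;
    StackTrace mAllocStack;  // recorded when the page is allocated
    StackTrace mFreeStack;   // recorded when the page is freed
  };

  // On a fault within a Freed page, the crash report can include
  // mAllocStack, mFreeStack, and the stack of the faulting access.
  PageInfo gPages[64];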
//
// The design space for the randomization strategy is large. The current
// implementation has a large random delay before it starts operating, and a
// small random delay between each PHC allocation attempt. Each freed PHC
// allocation is quarantined for a medium random delay before being reused, in
// order to increase the chance of catching UAFs.
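In other words, there are three knobs, roughly like the following (the
averages here are invented for illustration; the real ranges live in
PHC.cpp):

  #include <cstddef>
  #include <random>

  std::mt19937_64 gRng;

  // A delay uniformly distributed around the given average.
  size_t RandomDelay(size_t aAvg) {
    return std::uniform_int_distribution<size_t>(1, 2 * aAvg)(gRng);
  }

  size_t gStartupDelay = RandomDelay(64 * 1024);  // large: before PHC starts
  size_t gAllocDelay   = RandomDelay(512);        // small: between attempts
  size_t gReuseDelay   = RandomDelay(8 * 1024);   // medium: quarantine before
                                                  // a freed page is reused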
//
// The basic cost of PHC's operation is as follows.
//
// - The memory cost is 64 * 4 KiB = 256 KiB per process (assuming 4 KiB
//   pages) plus some metadata (including stack traces) for each page.
//
// - Every allocation requires a size check and a decrement-and-check of an
//   atomic counter. When the counter reaches zero a PHC allocation can occur,
//   which involves marking a page as accessible and getting a stack trace for
//   the allocation site. Otherwise, mozjemalloc performs the allocation.
//
// - Every deallocation requires a range check on the pointer to see if it
//   involves a PHC allocation. (The choice to only do PHC allocations that are
//   a page or smaller enables this range check, because the 64 pages are
//   contiguous. Allowing larger allocations would make this more complicated,
//   and we definitely don't want something as slow as a hash table lookup on
//   every deallocation.) PHC deallocations involve marking a page as
//   inaccessible and getting a stack trace for the deallocation site.
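Putting the two per-operation checks just described together, the hot paths
might look something like this (a sketch with illustrative names and
placeholder bodies, not the real PHC code):

  #include <atomic>
  #include <cstdint>
  #include <cstdlib>

  static const size_t kPageSize = 4096;
  static const size_t kNumPages = 64;
  std::atomic<size_t> gAllocCounter{5000};  // randomly seeded in reality
  uintptr_t gPagesStart;  // start of the contiguous 64-page region

  // Placeholder hooks; the real versions flip page protection and record
  // stack traces, roughly as in the first sketch above.
  void* PHCAlloc(size_t aSize) { return malloc(aSize); }
  void PHCFree(void* aPtr) { free(aPtr); }

  void* MaybePHCMalloc(size_t aSize) {
    // Two cheap checks on every allocation: the size, then the counter.
    // (The real code also resets the counter to a fresh random delay.)
    if (aSize <= kPageSize && gAllocCounter.fetch_sub(1) == 1) {
      return PHCAlloc(aSize);
    }
    return malloc(aSize);  // the common case: plain mozjemalloc
  }

  void MaybePHCFree(void* aPtr) {
    // Because the 64 pages are contiguous, the range check is just two
    // pointer comparisons.
    uintptr_t p = reinterpret_cast<uintptr_t>(aPtr);
    if (p >= gPagesStart && p < gPagesStart + kNumPages * kPageSize) {
      PHCFree(aPtr);
    } else {
      free(aPtr);
    }
  }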
//
// In the future, we may add guard pages between the used pages in order
// to detect buffer overflows/underflows. This would change the memory cost to
// (64 * 2 + 1) * 4 KiB = 516 KiB per process and complicate the machinery
// somewhat.
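For what it's worth, the arithmetic in both cost figures checks out:

  #include <cstddef>

  constexpr size_t kPageSize = 4 * 1024;
  constexpr size_t kAllocPages = 64;

  // Current layout: 64 contiguous allocation pages.
  static_assert(kAllocPages * kPageSize == 256 * 1024, "256 KiB, as above");

  // With guards interleaved (guard | alloc | guard | ... | alloc | guard),
  // each allocation page shares a guard with its neighbour.
  static_assert((kAllocPages * 2 + 1) * kPageSize == 516 * 1024,
                "516 KiB, as above");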