Today at the MoFo meeting Damon announced that the initial "infallible malloc" work is close to landing. He mentioned the motivation for this work: code size and performance wins, removing the burden of OOM checks from C++ development (in the common case), and crashing "reliably" and un-exploitably on OOM. I won't focus on the motivation in this post; instead I want to expand on the MoFo announcement a bit by describing the new APIs and the changes they may imply for new C++ code.
"Infallible malloc" is a blanket term for "infallible allocators." "Allocators" are all the libc functions and libc++ operators that dynamically allocate heap memory: malloc(), calloc(), realloc(), strdup(), strndup(), posix_memalign(), memalign(), valloc(), ::operator new(), and ::operator new[](). (Not all platforms define all these functions.) "Infallible" versions of these allocators will never fail; for us failure means returning NULL, so these infallible allocators will never return NULL.
For each allocator, we've created "infallible" and "fallible" variants. For the libc functions (malloc() et al.), these variants are named "moz_xmalloc()" and "moz_malloc()" respectively. The "moz_x*" prefix is intended to mean "infallible."
*VERY IMPORTANT*: both the "moz_" and "moz_x" allocators return memory that can be freed by whatever the |free()| symbol resolves to. This is explicitly part of their specification.
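To make the fallible/infallible split concrete, here is a minimal sketch. It is not the libmozalloc implementation: the names sketch_malloc() and sketch_xmalloc() are hypothetical, and the abort-on-failure behaviour is only an assumption about the simplest possible infallible wrapper. Both variants hand back memory that plain free() can release.

```cpp
#include <cstddef>
#include <cstdio>
#include <cstdlib>

// Hypothetical names for illustration only; these are NOT the real
// moz_malloc()/moz_xmalloc() implementations.

// Fallible variant: forwards to malloc() and may return NULL.
inline void* sketch_malloc(std::size_t size) {
  return std::malloc(size);
}

// Infallible variant: never returns NULL. A zero-byte request is rounded
// up to one byte, because malloc(0) is allowed to return NULL.
inline void* sketch_xmalloc(std::size_t size) {
  void* ptr = std::malloc(size ? size : 1);
  if (!ptr) {
    std::fputs("out of memory\n", stderr);
    std::abort();  // assumed failure policy for this sketch
  }
  return ptr;
}
```

A caller of sketch_xmalloc() can skip the NULL check, a caller of sketch_malloc() cannot, and either result is released with free().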
The libc++ operators don't follow this scheme, obviously; the "infallible" ::operator new is declared as
void* operator new(size_t size)
and is invoked by "normal" C++ allocations, e.g.
Foo* f = new Foo();
This |new Foo()| expression will never evaluate to NULL. The memory it allocates is freed by |delete f|.
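The "fallible" ::operator new is declared as
void* operator new(size_t size, const mozilla::fallible_t&)
and is invoked with an explicit fallible_t argument, e.g.
using mozilla::fallible_t;
Bar* b = new (fallible_t()) Bar();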
Here |b| should be checked for NULL. The memory allocated should be freed by |delete (fallible_t()) b| for consistency. (But just between the two of us, |delete b| will also work.)
These functions and operators are defined in a new library, libmozalloc. The header declaring these functions and operators is included from nscore.h, so your code probably won't need to use it directly. When libmozalloc lands, an accompanying patch will land that migrates Gecko code *except for JS* to the infallible allocators, by default. These allocators are carefully defined and used in such a way that they don't encroach on JS allocators and don't export symbols that would conflict with "standard" ones.
The immediate next step for this work (after it lands) is to use our static analysis and rewrite tools to remove code made superfluous by infallible malloc. Module owners, prepare yourselves for a patch storm ;).
Hopefully this explains the situation in a bit more detail. There were two more questions I heard that are probably better presented in FAQ style.
(Q) (as alluded to earlier) What happens when under the covers an infallible allocator fails? (I.e., the allocator moz_*alloc wraps returns NULL.) (A) The *first-stage* patch will simply |abort()|. In all likelihood, this won't be much different from what currently happens on OOM (it may happen a few microseconds earlier than previously). The *second-stage* infallible allocators will somehow attempt to reclaim memory when malloc() et al. return NULL. This may be done by some combination of keeping a private stash of memory and/or exposing a "memory reclamation" API, allowing clients to register "cleanup functions" to be invoked on OOM. It's debatable how effective this strategy will be, however, and it may not be pursued. A third stage might attempt to monitor memory stats using OS-level APIs and send a "low memory" notification, upon which memory reclamation would ensue as described above. This is, however, very much in the planning stage right now.
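The "cleanup functions" idea from the second stage could look roughly like the sketch below. Everything in it is an assumption made for illustration: RegisterCleanup(), AllocOrReclaim(), and the pluggable RawAllocFn are hypothetical names, not a proposed libmozalloc API. On failure it runs the registered cleanups one at a time, retries after each, and aborts only if nothing helps.

```cpp
#include <cstddef>
#include <cstdlib>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical reclamation API; nothing here is part of libmozalloc.
using CleanupFn = std::function<void()>;
using RawAllocFn = std::function<void*(std::size_t)>;

// Process-wide list of cleanup callbacks.
inline std::vector<CleanupFn>& CleanupRegistry() {
  static std::vector<CleanupFn> registry;
  return registry;
}

// Clients call this up front, while memory is still available.
inline void RegisterCleanup(CleanupFn fn) {
  CleanupRegistry().push_back(std::move(fn));
}

// Tries `alloc`. On failure, runs each registered cleanup in turn and
// retries after each one. Aborts if every retry still fails.
inline void* AllocOrReclaim(std::size_t size, const RawAllocFn& alloc) {
  void* ptr = alloc(size);
  if (ptr) {
    return ptr;
  }
  for (const CleanupFn& cleanup : CleanupRegistry()) {
    cleanup();
    ptr = alloc(size);
    if (ptr) {
      return ptr;
    }
  }
  std::abort();
}
```

A real version would have to make sure the cleanup path itself never allocates (std::function and std::vector may), which is part of why the effectiveness of this strategy is debatable.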
(Q) When should I use a fallible allocator rather than a default infallible one? (A) I don't have a good answer for this; it seems to be a judgment call. Certain situations clearly want fallible allocation: a many-MB cache is probably one such situation. (If it fails to grow by another 10MB, oh well.) In general it's probably best to ask your module owner on a per-case basis, but anyone with more Gecko experience or clearer guidelines please chime in.
As I missed out on yesterday's meeting, how does this relate to 1.9.2?
> Axel Hecht wrote:
>> As I missed out on yesterday's meeting, how does this relate to 1.9.2?
> In no way at all. This will land on trunk only.
To be clear, this is being proposed to land on trunk-only, after we branch for mozilla-1.9.2. This will be one of the first "large code change" areas for 1.9.3 as I understand things, though we should talk through that at today's developer's meeting at 11am PDT.
Chris Jones wrote:
> Here |b| should be checked for NULL. The memory allocated should be freed by |delete (fallible_t()) b| for consistency. (But just between the two of us, |delete b| will also work.)
Why are we doing it this way?
That is, if the goal is to eventually maybe make |delete b| not work, then having it work for now will mean that people write code that way. This will be especially true if they follow our best practices and use nsAutoPtr and the like, right?
If we can guarantee that we'll never need to make |delete b| not work, I say we should just have |delete b| work up front and be the preferred style, just like we use free() for our malloc/etc.
> (Q) When should I use a fallible allocator rather than a default infallible one?
> (A) I don't have a good answer for this; it seems to be a judgment call. Certain situations clearly want fallible allocation: a many-MB cache is probably one such situation. (If it fails to grow by another 10MB, oh well.) In general it's probably best to ask your module owner on a per-case basis, but anyone with more Gecko experience or clearer guidelines please chime in.
One important case is any allocation whose size is reasonably directly controlled by the webpage (image buffers, canvas imagedata, some cases where length properties are settable in the DOM, various JS allocations, etc).
That is, we don't want to allow trivial DOS by web pages. ;)
> Today at the MoFo meeting Damon announced that the initial "infallible malloc" work is close to landing. He mentioned the motivation for this work: code size and performance wins, removing the burden of OOM checks from C++ development (in the common case), and crashing "reliably" and un-exploitably on OOM. I won't focus on the motivation in this post; instead I want to expand on the MoFo announcement a bit by describing the new APIs and the changes they may imply for new C++ code.
What is the thought on how non-tier-1 platforms should handle this? Do we have to port your code (where is it?) or will it fully rely on the platform allocators? Peter.
Boris Zbarsky wrote:
> Chris Jones wrote:
>> Here |b| should be checked for NULL. The memory allocated should be freed by |delete (fallible_t()) b| for consistency. (But just between the two of us, |delete b| will also work.)
> Why are we doing it this way?
> That is, if the goal is to eventually maybe make |delete b| not work, then having it work for now will mean that people write code that way. This will be especially true if they follow our best practices and use nsAutoPtr and the like, right?
> If we can guarantee that we'll never need to make |delete b| not work, I say we should just have |delete b| work up front and be the preferred style, just like we use free() for our malloc/etc.
|delete b| is explicitly specified to work with fallible new. This issue just reduces to a discussion of C++ style, and IMHO matching the allocator with its partner deallocator seems preferable. But you raise good points re: our smart pointers.
Peter Weilbacher wrote:
> On 04.08.2009 06:05, Chris Jones wrote:
>> Today at the MoFo meeting Damon announced that the initial "infallible malloc" work is close to landing. He mentioned the motivation for this work: code size and performance wins, removing the burden of OOM checks from C++ development (in the common case), and crashing "reliably" and un-exploitably on OOM. I won't focus on the motivation in this post; instead I want to expand on the MoFo announcement a bit by describing the new APIs and the changes they may imply for new C++ code.
> What is the thought on how non-tier-1 platforms should handle this? Do we have to port your code (where is it?) or will it fully rely on the platform allocators?
Short answers are (1) no need to port this code; (2) it's not checked in yet, but will live in memory/mozalloc; (3) yes.
In slightly more detail, the new code (libmozalloc) wraps the libc symbols listed in the parent post. This means that it leaves the symbols "malloc" et al. undefined. When Firefox is loaded, libmozalloc will use whatever the dynamic linker resolves "malloc" et al. to. With --enable-jemalloc builds, this will be jemalloc symbols; otherwise it'll be system symbols. (Sorry if this is pedantic, just trying to be clear.)
Chris Jones wrote:
>> If we can guarantee that we'll never need to make |delete b| not work, I say we should just have |delete b| work up front and be the preferred style, just like we use free() for our malloc/etc.
> |delete b| is explicitly specified to work with fallible new. This issue just reduces to a discussion of C++ style, and IMHO matching the allocator with its partner deallocator seems preferable. But you raise good points re: our smart pointers.
My real issue, and smart pointers are just a good example of this, is that this setup involves the code doing the deallocation knowing which allocator (fallible or not) was used to allocate. This _can_ be done, in general, with some extra effort, but if there's no practical need for that effort, is it worth the work?
Chris Jones wrote:
> *VERY IMPORTANT*: both the "moz_" and "moz_x" allocators return memory that can be freed by whatever the |free()| symbol resolves to. This is explicitly part of their specification.
>> *VERY IMPORTANT*: both the "moz_" and "moz_x" allocators return memory that can be freed by whatever the |free()| symbol resolves to. This is explicitly part of their specification.
> So why not moz_free()?
There wasn't a compelling reason for its creation, and a single free() makes life slightly easier for developers. (Yes, this apparently contradicts what I wrote about matched new/delete.)
> using mozilla::fallible_t;
> Bar* b = new (fallible_t()) Bar();
> Here |b| should be checked for NULL. The memory allocated should be freed by |delete (fallible_t()) b| for consistency. (But just between the two of us, |delete b| will also work.)
To the best of my knowledge, there ain't no such thing as a placement-delete syntax in C++. |delete (fallible_t()) b| shouldn't compile. If it compiles for you, you must be using a compiler that accepts it as an extension.
The only case where operator delete(void*, extra_params) may be called is when a corresponding placement new is used to create an object, and that object's constructor throws an exception. -- Igor Tandetnik
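The behaviour described above can be checked with a small sketch. The assumptions are loud ones: the `sketch` namespace, the `Ok` and `Throws` types, `ConstructThrowing()`, and the call counter are all hypothetical names, and the global operator new/delete replacements only stand in for what libmozalloc is described as providing. The placement form of operator delete is never named in a delete-expression; the language calls it implicitly, and only when the constructor throws.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>
#include <stdexcept>

namespace sketch {
struct fallible_t {};          // hypothetical stand-in for mozilla::fallible_t
int placementDeleteCalls = 0;  // counts implicit placement-delete calls
}  // namespace sketch

// "Infallible" global operator new: aborts instead of returning NULL.
void* operator new(std::size_t size) {
  if (void* ptr = std::malloc(size ? size : 1)) {
    return ptr;
  }
  std::abort();
}

// The single deallocator used by every ordinary |delete p|.
void operator delete(void* ptr) noexcept { std::free(ptr); }

// Fallible placement form: returns NULL on failure, so callers must check.
void* operator new(std::size_t size, const sketch::fallible_t&) noexcept {
  return std::malloc(size ? size : 1);
}

// Matching placement deallocator. Only invoked by the compiler when the
// constructor in a |new (fallible_t()) T| expression throws.
void operator delete(void* ptr, const sketch::fallible_t&) noexcept {
  ++sketch::placementDeleteCalls;
  std::free(ptr);
}

struct Ok {
  int value = 7;
};

struct Throws {
  Throws() { throw std::runtime_error("constructor failed"); }
};

// Returns true if constructing a Throws with fallible new propagated the
// constructor's exception.
bool ConstructThrowing() {
  try {
    (void)new (sketch::fallible_t()) Throws();
    return false;
  } catch (const std::runtime_error&) {
    return true;
  }
}
```

With these definitions, an object created by `new (sketch::fallible_t()) Ok()` is released with a plain `delete`, and the counter only moves when ConstructThrowing() runs.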
>> using mozilla::fallible_t;
>> Bar* b = new (fallible_t()) Bar();
>> Here |b| should be checked for NULL. The memory allocated should be freed by |delete (fallible_t()) b| for consistency. (But just between the two of us, |delete b| will also work.)
> To the best of my knowledge, there ain't no such thing as a placement-delete syntax in C++. |delete (fallible_t()) b| shouldn't compile. If it compiles for you, you must be using a compiler that accepts it as an extension.
> The only case where operator delete(void*, extra_params) may be called is when a corresponding placement new is used to create an object, and that object's constructor throws an exception.
Hmm, you're quite right. Apologies for the glaring error.
> Peter Weilbacher wrote:
>> What is the thought on how non-tier-1 platforms should handle this? Do we have to port your code (where is it?) or will it fully rely on the platform allocators?
> Short answers are (1) no need to port this code;
Great. :-)
> (2) it's not checked in yet, but will live in memory/mozalloc;
No user repo or bug number yet?
> (3) yes.
> In slightly more detail, the new code (libmozalloc) wraps the libc symbols listed in the parent post. This means that it leaves the symbols "malloc" et al. undefined. When Firefox is loaded, libmozalloc will use whatever the dynamic linker resolves "malloc" et al. to. With --enable-jemalloc builds, this will be jemalloc symbols; otherwise it'll be system symbols. (Sorry if this is pedantic, just trying to be clear.)
Peter Weilbacher wrote:
> On 04/08/09 16:56, Chris Jones wrote:
>> Peter Weilbacher wrote:
>> (2) it's not checked in yet, but will live in memory/mozalloc;
> |delete b| is explicitly specified to work with fallible new. This issue just reduces to a discussion of C++ style, and IMHO matching the allocator with its partner deallocator seems preferable.
IMHO making all deallocation use the same code seems preferable.
> using mozilla::fallible_t;
> Bar* b = new (fallible_t()) Bar();
Any particular reason why we don't follow Gecko naming conventions and use Fallible as the name here?
> (A) The *first-stage* patch will simply |abort()|. In all likelihood, this won't be much different from what currently happens on OOM (it may happen a few microseconds earlier than previously). The *second-stage* infallible allocators will somehow attempt to reclaim memory when malloc() et al. return NULL. This may be done by some combination of keeping a private stash of memory and/or exposing a "memory reclamation" API, allowing clients to register "cleanup functions" to be invoked on OOM. It's debatable how effective this strategy will be, however, and it may not be pursued.
While I agree that the first-stage patch abort()ing is not much different from what currently happens on OOM, and I think we should go ahead and check that in, I would prefer to not abandon manual OOM checking by removing our partial implementation of it before we have demonstrated that another approach can work.
>> using mozilla::fallible_t;
>> Bar* b = new (fallible_t()) Bar();
> Any particular reason why we don't follow Gecko naming conventions and use Fallible as the name here?
This lives below Gecko, and these declarations most resemble the std::nothrow_t new/delete, so I figured that was the appropriate "module" to copy. I don't care, though.
>> (A) The *first-stage* patch will simply |abort()|. In all likelihood, this won't be much different from what currently happens on OOM (it may happen a few microseconds earlier than previously). The *second-stage* infallible allocators will somehow attempt to reclaim memory when malloc() et al. return NULL. This may be done by some combination of keeping a private stash of memory and/or exposing a "memory reclamation" API, allowing clients to register "cleanup functions" to be invoked on OOM. It's debatable how effective this strategy will be, however, and it may not be pursued.
> While I agree that the first-stage patch abort()ing is not much different from what currently happens on OOM, and I think we should go ahead and check that in, I would prefer to not abandon manual OOM checking by removing our partial implementation of it before we have demonstrated that another approach can work.
On Aug 4, 1:51 pm, Chris Jones <cjo...@mozilla.com> wrote:
>> While I agree that the first-stage patch abort()ing is not much different from what currently happens on OOM, and I think we should go ahead and check that in, I would prefer to not abandon manual OOM checking by removing our partial implementation of it before we have demonstrated that another approach can work.
> What do you mean by "can work?"
Perhaps what bz said in his post today:
> One important case is any allocation whose size is reasonably directly controlled by the webpage (image buffers, canvas imagedata, some cases where length properties are settable in the DOM, various JS allocations, etc).
> That is, we don't want to allow trivial DOS by web pages. ;)
Separating allocations that may be controlled from web content from all others is a real problem, which is why SpiderMonkey continues to handle OOM and recover. See
(I'm sure cjones has read it, I cite it here in case others have not).
We have some failure to handle OOM leading to nearly-null or null pointer dereference crashes. Some of these failures are in OS-specific library code. Nevertheless, JS at least perseveres. Other parts of the codebase seem to have given up, and anyway without static analysis support it's hard to promise perfection.
But the road we're heading down could easily abort over a marginal allocation, or one attempted during "recovery" from a detected nearly-out-of-memory condition. The worst case is very bad. Worse than today's behavior, where
Process isolation of web engines would change the user experience here, and does in Chrome, allowing the better programmer experience of ignoring OOM. You'd get a restart and reload from cache, though. Some state might be lost, but this would not necessarily be a bug, due to cache controls. In any event it seems too likely to dismiss out of hand.
> Robert O'Callahan wrote:
>> While I agree that the first-stage patch abort()ing is not much different from what currently happens on OOM, and I think we should go ahead and check that in, I would prefer to not abandon manual OOM checking by removing our partial implementation of it before we have demonstrated that another approach can work.
> What do you mean by "can work?"
I'd like to be able to demo, for example, an infinite document.write loop or loading of an unlimited number of massive images eventually triggering some OOM handling that leads to the page being cut off or closed without crashing the browser.
>>> *VERY IMPORTANT*: both the "moz_" and "moz_x" allocators return memory that can be freed by whatever the |free()| symbol resolves to. This is explicitly part of their specification.
>>So why not moz_free()?
>There wasn't a compelling reason for its creation
Robert O'Callahan wrote:
> On 5/8/09 8:51 AM, Chris Jones wrote:
>> Robert O'Callahan wrote:
>>> While I agree that the first-stage patch abort()ing is not much different from what currently happens on OOM, and I think we should go ahead and check that in, I would prefer to not abandon manual OOM checking by removing our partial implementation of it before we have demonstrated that another approach can work.
>> What do you mean by "can work?"
> I'd like to be able to demo, for example, an infinite document.write loop or loading of an unlimited number of massive images eventually triggering some OOM handling that leads to the page being cut off or closed without crashing the browser.
While I would prefer not to learn anything about all of the preceding discussion, including whether it's relevant to me, I just want to interject that when my JavaScript hits OOM I would like a debugger to help me fix it.
Karl Tomlinson wrote:
>>>> using mozilla::fallible_t;
>>>> Bar* b = new (fallible_t()) Bar();
>> ... these declarations most resemble the std::nothrow_t new ...
> What would be the difference between
> operator new(size_t, const mozilla::fallible_t&)
> and operator new(size_t, const std::nothrow_t&) ?
|operator new(size_t, const mozilla::fallible_t&)| may later do memory accounting and/or reclamation. It might also |throw(std::bad_alloc)| in some distant future.
On Aug 3, 9:05 pm, Chris Jones <cjo...@mozilla.com> wrote:
> (Q) When should I use a fallible allocator rather than a default infallible one?
> (A) I don't have a good answer for this; it seems to be a judgment call. Certain situations clearly want fallible allocation: a many-MB cache is probably one such situation. (If it fails to grow by another 10MB, oh well.) In general it's probably best to ask your module owner on a per-case basis, but anyone with more Gecko experience or clearer guidelines please chime in.
I think it's out of order to go down a path that has bad worst-case behavior (worse than today's) without a better theory for partitioning allocator callsites into fallible and infallible.
The large cache allocation fallible allocator use-case is clear from Chris's FAQ item.
But beyond caches, web content can control some calls, and even directly control the size_t parameter of certain allocator calls; these calls must be fallible, with OOM detection and recovery. Indeed size_t calculations (scaling array length to size in bytes) must be done very carefully to avoid overflow (wraparound in size_t).
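The careful size_t calculation mentioned above might look like the following sketch. FallibleArrayAlloc() is a hypothetical helper, not an existing Gecko function. It refuses any request where scaling the element count to bytes would wrap around, and otherwise behaves like a fallible allocator whose result the caller must check.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Hypothetical helper: fallibly allocates room for `count` elements of
// `elemSize` bytes each. Returns NULL if the byte count would overflow
// size_t or if the underlying malloc() fails.
inline void* FallibleArrayAlloc(std::size_t count, std::size_t elemSize) {
  // count * elemSize wraps exactly when count > SIZE_MAX / elemSize.
  if (elemSize != 0 && count > SIZE_MAX / elemSize) {
    return nullptr;
  }
  return std::malloc(count * elemSize);
}
```

Without the division check, a page-controlled length such as SIZE_MAX / 4 + 1 with 4-byte elements would wrap to a zero-byte request and hand back a far smaller buffer than the caller believes it has.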
Say we partition successfully to find all such allocator callsites and make them fallible and handled.
Nearby, often subsequently in a related allocation, or possibly before to set up a peer data structure, there might be a small allocation whose size is not controlled, but whose call is controlled by web content. If it fails and the process aborts, then user-perceived quality of implementation may suffer unduly.
If malicious web content can DOS the user by arranging failure at such an infallible callsite, then there might be a competitive issue with browsers that don't abort (whether they restart a renderer process or catch/detect OOM and recover).
This makes me think that infallible malloc work should depend on either analysis to find and check all fallible calls, or else on some Chromium-inspired process-isolated renderer work that is not really scoped yet (i.e., beyond Electrolysis), to put the horse before the cart where it belongs.
Another approach: static analysis.
Clearly there are many allocator callsites in our codebase. If we know some are unlikely to be reached due to web content influence, and if we lack a good recovery technique other than lots of null checks, then possibly those callsites could use infallible allocation. A lot of "if"s there.
Suppose we can propagate attributes around the control flow graph, from source methods to allocator callsite sinks. Anything coming from the network or filesystem inducing an allocation is marked F. Anything coming from the user (user input events) is marked I. Is it feasible (I'm thinking of cqual++ or whatever it was from the oink suite) to analyze for allocation sites reached only by I and never F or F+I? Is this model too simplistic to identify causal "influence"?
At first glance it seems almost no allocation site could be infallible. But I thought I'd throw this out to the group.
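As a toy illustration of the F/I marking model above (and nothing like a real cqual++ or oink analysis), the propagation can be sketched as a forward reachability pass over a graph. The FlowGraph type, the node numbering, and the Mark enum are all hypothetical; real causal "influence" would need far more than plain reachability.

```cpp
#include <cstddef>
#include <queue>
#include <vector>

// Hypothetical marks: F = network/filesystem input, I = user input.
enum Mark : unsigned { kNone = 0, kF = 1, kI = 2 };

struct FlowGraph {
  std::vector<std::vector<std::size_t>> successors;  // node -> next nodes
  std::vector<unsigned> marks;                       // bitmask of Marks

  explicit FlowGraph(std::size_t nodeCount)
      : successors(nodeCount), marks(nodeCount, 0u) {}

  void AddEdge(std::size_t from, std::size_t to) {
    successors[from].push_back(to);
  }

  // Pushes `mark` from `source` to every node reachable from it.
  // All edges are assumed to be added before propagation starts.
  void Propagate(std::size_t source, Mark mark) {
    std::queue<std::size_t> work;
    if ((marks[source] & mark) == 0) {
      marks[source] |= mark;
      work.push(source);
    }
    while (!work.empty()) {
      std::size_t node = work.front();
      work.pop();
      for (std::size_t next : successors[node]) {
        if ((marks[next] & mark) == 0) {
          marks[next] |= mark;
          work.push(next);
        }
      }
    }
  }

  // True if the allocation site is reached by I and never by F.
  bool ReachedOnlyByUserInput(std::size_t site) const {
    return marks[site] == kI;
  }
};
```

In this model an allocation site is a candidate for infallible allocation only if ReachedOnlyByUserInput() holds for it. Any site that also sits downstream of a network or filesystem source picks up F and stays fallible.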