On Sun, Jan 24, 2016 at 8:27 PM, Richard Gooch <rg+go...@safe-mbox.com> wrote:
>
> Ideally, one could recover() from an OOM panic, but that currently does not
> work. There is even code in the standard package library that attempts to do
> a recover() (to change the panic message), but it doesn't work. Commit
> b0d2713b77f80986f688d18bd0df03ed56d6e7b5 by Rob Pike attempted this 4 years
> ago. Perhaps it worked once?

That code doesn't recover from an OOM panic (as you note, an OOM is
not a panic, and it cannot be recovered). That code recovers from an
attempt to create a slice so large that it does not fit in memory.
Such an attempt, which can never succeed, does panic, and can be
recovered (see the makeslice function near the start of
https://golang.org/src/runtime/slice.go). I agree that it's a subtle,
and often useless, distinction. It was a distinction that meant
slightly more at the time that change was written, because back then
int was 32 bits even on a 64-bit machine, and an attempt to create a
slice that required more than 31 bits to index it would panic.
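
A minimal sketch of that distinction, under stated assumptions: the function name makeHuge is hypothetical and is not taken from the commit discussed above. The length is passed in at run time so the compiler does not reject it as an out-of-range constant, and the element type is int64 so the requested byte count overflows on both 32-bit and 64-bit platforms. The panic raised by make should be caught by recover; a genuine out-of-memory failure would instead terminate the process without giving the deferred function a chance to help.

```go
package main

import "fmt"

// makeHuge (a hypothetical name) asks for a slice whose size can never
// fit in the address space. The length is a parameter so that the
// compiler cannot reject it as an out-of-range constant.
func makeHuge(n int) (s []int64, err error) {
	defer func() {
		if r := recover(); r != nil {
			// make panicked: this is an ordinary, recoverable run-time
			// panic, not an out-of-memory failure.
			err = fmt.Errorf("make failed: %v", r)
		}
	}()
	return make([]int64, n), nil
}

func main() {
	maxInt := int(^uint(0) >> 1) // largest int on this platform
	_, err := makeHuge(maxInt)
	fmt.Println(err) // non-nil: the panic was recovered
}
```

Note that this only covers requests that can never succeed. A request that is merely larger than the memory currently available does not go down this path.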

> I realise that running defer() code when a memory allocation has failed is
> problematic, since the recovery code will also need to allocate memory. However, the
> common case is probably that a large allocation failed, and there is room
> for small allocations needed for cleanup, so just allowing applications to
> catch OOM panics would probably help most of the time. If there is another
> memory allocation failure during recovery, kill the application. There could
> also be a reserved chunk of memory that is freed/made available during OOM
> recovery, which is re-reserved once recovery is complete. That would also
> allow effective recovery if a small memory allocation failed. As long as the
> recovery code uses less memory than the reserved size, this approach should
> be reliable.

I don't agree that the common case is that a large allocation failed.
That is one case, but I think the common case is that the program has
a memory leak and has in fact run out of memory. Adding a reserved
chunk of memory introduces a new allocation approach that is almost
never used, and is therefore more likely to be buggy, and (I suspect)
will rarely help.

> A less attractive option is to add a trymake() built-in function, which
> returns a value,error tuple. I like this less because there are many other
> ways in which memory is allocated, so one cannot catch them all. It also
> requires changing a large number of callsites. Nevertheless, it would be
> better than the current situation.

It's true that there are many ways that allocation can occur, but
there aren't all that many ways that a large allocation can occur. If
we restrict ourselves to slices, which seems reasonable at first
glance, then there is really only one way: the make function (one
could write a truly large composite literal, but that seems
implausible). If we can agree that the only kind of memory allocation
from which one can plausibly reliably recover is a large one, then I
think an approach like trymake might seem more reasonable.
Ian