On Tue, Aug 8, 2017 at 7:35 PM, Bill Wood <wpwo...@gmail.com> wrote:
> Thanks! I can't find how the size of these zones is set; all the online
> docs seem to say that DMA32 is 4 GB. Perhaps it is only 2 GB on these
> machines?
I'll take a look, but I've never worried about it. Even if we're
using the wrong size for DMA32, the effect on fragmentation will be
insignificant.
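
For anyone who wants to check the zone sizes on a given machine, here is
a minimal sketch. It is not something from the Chromium OS tree. It
assumes the usual /proc/zoneinfo layout ("Node N, zone NAME" headers
followed by a "present <pages>" line) and a 4 KiB page size; the function
name zone_sizes_mb is made up for illustration.

```python
import re


def zone_sizes_mb(zoneinfo_text, page_size=4096):
    """Return {(node, zone_name): size in MiB} from /proc/zoneinfo-style text.

    Assumes each zone section starts with "Node N, zone NAME" and contains
    one "present <pages>" line. page_size defaults to 4 KiB (an assumption).
    """
    sizes = {}
    zone = None
    for line in zoneinfo_text.splitlines():
        header = re.match(r"Node\s+(\d+),\s+zone\s+(\S+)", line)
        if header:
            zone = (int(header.group(1)), header.group(2))
            continue
        fields = line.split()
        # Only the two-field "present <pages>" line is of interest here.
        if zone is not None and len(fields) == 2 and fields[0] == "present":
            sizes[zone] = int(fields[1]) * page_size / (1024 * 1024)
    return sizes
```

On a live system the input would be the contents of /proc/zoneinfo; a
DMA32 entry near 2048 would suggest the zone is 2 GB rather than 4 GB.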
> Perhaps you could answer another question I have... why was it decided to
> set extra_free_kbytes to 350? That seems to have the effect of leaving
> about 500 to 550 MB of memory free, which seems like a waste on a 4 GB
> system and might reduce battery life if kswapd is often running?
Yes, that's all correct. In practice, however, the main effect is to
trigger tab discards earlier, so it slightly increases the number of
tab reloads (very little on average; we can monitor average usage in
the field via the stats reporting). In extreme cases it may impact
someone's workflow, if that workflow requires switching cyclically
through a set of tabs whose total footprint is too large to fit, but
we don't see that very often, not even inside Google, where as you
can imagine we have the most demanding users. (It's never happened
to me.)
Conversely, without that margin we run more often into situations in
which the memory allocation rate is higher than the freeing rate via
tab discards. When that happens the kernel starts OOM-killing
processes. This results in suboptimal loss of tabs (especially when
many tabs share the same "renderer" process), but, worse, the OOM-kill
code has a lot of lock contention and deadlocks, which cause temporary
or permanent freezes. This produces janky behavior, with freezes lasting
seconds, and kernel crashes when the hang detector kicks in. This
is much worse than the extra discards, so it justifies the apparent
waste of RAM.
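
To make the rate argument concrete, here is a toy calculation. It is
only a sketch under simplifying assumptions (constant allocation and
discard rates, nothing else reclaiming memory), and every number fed to
it is hypothetical rather than measured. The function name
seconds_until_oom is made up for illustration.

```python
def seconds_until_oom(margin_mb, alloc_mb_per_s, discard_mb_per_s):
    """Toy model: how long a free-memory margin lasts before it is exhausted.

    Returns None when discards keep up with allocations, so the margin
    never drains. All rates are assumed constant, which real workloads
    are not.
    """
    net_drain = alloc_mb_per_s - discard_mb_per_s
    if net_drain <= 0:
        return None
    return margin_mb / net_drain
```

With hypothetical rates of 300 MB/s allocated and 200 MB/s freed, a
500 MB margin gives tab discarding ten times as long to catch up as a
50 MB margin would before the kernel has to resort to OOM kills.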
The OOM-kill code is hard to fix. A good starting point for reading
about it is this LWN article; there are several more good ones on LWN.
https://lwn.net/Articles/668126/