Tatu Saloranta
May 3, 2012, 3:17:06 PM
to java-serializat...@googlegroups.com
I thought it might make sense to explain how I think regular on-going
garbage collection activity affects measurements, with respect to heap
size.
Others can correct me if I am wrong in some aspects; and based on
shared understanding maybe we can first figure out what we want, and
from that, what settings would work best.
For this discussion I only consider so-called "young generation"
collection: this refers to the fast copy-compacting collection the JVM
does between "Eden" and "Survivor" spaces. These spaces are
equally-sized, and at least for small heap sizes cover most of the
heap. Eden is where all normal memory allocation occurs; filling it up
triggers a YoungGen collection. Survivor space is used by the JVM to
implement compaction during this collection, which is why the spaces
are equal-sized.
I assume we can (and should) configure things in such a way that no
Old Generation collections occur -- but even if they do, they are rare
enough that they either have no effect (if we choose the minimum) or
are amortized over run time to also have little effect.
1. What is the size of Eden Space?
Since this is the relevant memory size for calculations, we need to
know how this depends on total heap size. We may also want to just
explicitly force this size, with "-XX:NewSize=32m" (etc).
The default method for determining it is actually adaptive, meaning
that the JVM may change the ratios of the spaces it uses. But looking
at descriptions, it looks like the young generation as a whole is
given either 1/2 or 1/3 of the heap.
This would mean that with 256 megs we could get 128 megs for young
generation; and of this, 64 megs for Eden space.
With 16 megs it would mean a mere 4 megs -- but since there is some
stuff that must go in OldGen (eventually), I bet it can go as low as 1
meg. We shall see that this is an unreasonably small amount, and that
it heavily punishes "litter bug codecs".
But for now, assume we have an allocation space of either 64 or 4
megs for the two heap sizes that have been used (256 and 16 megs).
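The sizing guess above can be written down and compared against what a running JVM reports. This is only a sketch: `EdenSizeSketch` and `modelEdenBytes` are hypothetical names, and the "Eden is half of the young generation" split is the assumption from the text, not something the JVM guarantees. Memory pool names differ between collectors, so the loop prints every pool instead of looking for one by name.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;

final class EdenSizeSketch {
    private static final long MB = 1024L * 1024L;

    // Model from the text (an assumption, not a JVM guarantee):
    // young generation = heap / youngDivisor, and Eden = half of that.
    static long modelEdenBytes(long heapBytes, int youngDivisor) {
        return heapBytes / youngDivisor / 2;
    }

    public static void main(String[] args) {
        System.out.println("model, 256m heap: " + modelEdenBytes(256 * MB, 2) / MB + "m Eden");
        System.out.println("model, 16m heap: " + modelEdenBytes(16 * MB, 2) / MB + "m Eden");

        // What the running JVM reports; pool names vary by collector.
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage();
            if (usage == null) {
                continue; // pool is no longer valid
            }
            long max = usage.getMax(); // -1 when no maximum is defined
            System.out.println(pool.getName()
                    + ": committed=" + usage.getCommitted() / MB + "m"
                    + ", max=" + (max < 0 ? "undefined" : (max / MB + "m")));
        }
    }
}
```

Running it once with each heap setting used for the tests (for example with -Xmx256m and then -Xmx16m) would show how far the real Eden size is from the model.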
2. How often would YoungGen GC occur?
We have some idea of the throughput of codecs; the fastest ones can
push through an incredible number of iterations. The amount of garbage
produced depends on the codec in question.
But to give some idea, let us assume that codec A can do 100000 (100k)
iterations per second, assuming unlimited memory (no GC of any kind).
This would be an average throughput; many codecs do more, others do less.
Further, assume that for each iteration, codec A produces 1k of
garbage: all codecs produce some, to create transient state. I don't
know what the exact range is, but I assume many produce more (JDK
default serialization, for example, does), and at least some produce
less.
From this, we get a needed allocation rate of 100 megs per second;
this would mean either about 1.5 GCs per second (256 meg heap, 64 meg
Eden) or 25 per second (16 meg heap, 4 meg Eden).
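The arithmetic behind those two figures is just allocation rate divided by Eden size. A minimal sketch, with hypothetical names (`GcFrequencySketch`, `gcsPerSecond`) and round decimal units (1k = 1,000 bytes, 1 meg = 1,000,000 bytes) assumed to keep the numbers in line with the text:

```java
final class GcFrequencySketch {
    // YoungGen collections per second: bytes allocated per second
    // divided by the bytes that fit in Eden before it fills up.
    static double gcsPerSecond(double iterationsPerSecond,
                               double garbageBytesPerIteration,
                               double edenBytes) {
        return iterationsPerSecond * garbageBytesPerIteration / edenBytes;
    }

    public static void main(String[] args) {
        double iterationsPerSecond = 100_000; // codec A, assumed in the text
        double garbagePerIteration = 1_000;   // 1k of garbage per iteration
        System.out.println("64 meg Eden: "
                + gcsPerSecond(iterationsPerSecond, garbagePerIteration, 64_000_000));
        System.out.println("4 meg Eden: "
                + gcsPerSecond(iterationsPerSecond, garbagePerIteration, 4_000_000));
    }
}
```

The first value comes out slightly above 1.5 (the text rounds down); the second is exactly 25.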
3. How long does a single YoungGen GC take?
The amount of time taken depends first and foremost on the number of
_live_ objects (non-garbage), not the amount of garbage. This is
because a copy-compact strategy is used, where only live objects are
copied over. Garbage then is "free", although there is some amount of
overhead that depends on size (non-local memory refs etc).
Typical times I have seen in production systems are in single-digit
milliseconds; say, 4 - 10 milliseconds.
Our test cases should have "mostly garbage" settings, so I think our
per-GC overhead is pretty low: we can find this out by forcing
printing of GC logs.
But once again, let's assume a value like 4 milliseconds for both settings.
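Besides GC logs (for example via the -verbose:gc flag), the per-GC cost can be estimated from inside the JVM. The sketch below is an assumption about how one might do it, not part of the existing test harness: `GcTimeSketch` is a hypothetical name, the allocation loop is only a stand-in for a codec, and the totals are summed over all collectors, so an Old Generation collection would be counted too.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

final class GcTimeSketch {
    // Returns {total collections, total collection millis}, summed over
    // all collectors (young and old) that the JVM exposes.
    static long[] gcTotals() {
        long count = 0;
        long millis = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Both getters return -1 when the value is not available.
            count += Math.max(0, gc.getCollectionCount());
            millis += Math.max(0, gc.getCollectionTime());
        }
        return new long[] { count, millis };
    }

    public static void main(String[] args) {
        long[] before = gcTotals();
        long sink = 0;
        for (int i = 0; i < 2_000_000; i++) {
            byte[] garbage = new byte[1024]; // stand-in for ~1k of garbage per iteration
            garbage[0] = (byte) i;
            sink += garbage[0]; // keep the allocation from being optimized away
        }
        long[] after = gcTotals();
        long gcs = after[0] - before[0];
        long ms = after[1] - before[1];
        System.out.println("collections=" + gcs + ", total ms=" + ms + " (sink=" + sink + ")");
        if (gcs > 0) {
            System.out.println("average ms per collection=" + (double) ms / gcs);
        }
    }
}
```

Taking the snapshot before and after a real test run, instead of the stand-in loop, would give the average pause to plug into the estimates here in place of the assumed 4 milliseconds.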
4. Relative effect?
For Codec A, then, we would get:
* 256 meg heap: 1.5 x 4 msecs == 6 msecs, which is about 0.6% of runtime.
* 16 meg heap: 25 x 4 msecs == 100 msecs, which is about 10% of runtime.
and for the hypothetical case of a 64 meg heap:
* 64 meg heap (16 meg Eden): 6 x 4 msecs == 24 msecs == 2.4%
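The three percentages follow from multiplying GC frequency by pause time. As a small sketch (hypothetical names, and using the rounded GC counts and the assumed 4 millisecond pause from the text):

```java
final class GcOverheadSketch {
    // Fraction of each second spent in YoungGen GC, using the simple
    // approximation that is good enough at low GC rates.
    static double overheadFraction(double gcsPerSecond, double pauseMillis) {
        return gcsPerSecond * pauseMillis / 1000.0;
    }

    public static void main(String[] args) {
        double pauseMillis = 4.0; // assumed per-GC pause from the text
        System.out.println("256 meg heap: " + 100 * overheadFraction(1.5, pauseMillis) + "%");
        System.out.println("16 meg heap: " + 100 * overheadFraction(25, pauseMillis) + "%");
        System.out.println("64 meg heap: " + 100 * overheadFraction(6, pauseMillis) + "%");
    }
}
```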
5. But wait! How does it change _relative_ numbers? (litterbugs!)
The above assumed moderate garbage production, which pretty much
requires that no buffers are re-created for each iteration.
This is not true for all codecs: JDK default serialization, for
example, does heavy construction of buffers (I think).
And with buffers that work well for large(r) serializations, using 8k
or even 64k buffers is not unusual.
So assuming codec B allocated 100k per iteration, we would get
YoungGen GCs 100 times as frequently. This heavily skews numbers, and
simple calculations won't give the correct number for throughput
(since GC is mixed in with "real" work; the earlier math is a good
enough approximation only for lower rates).
With a 256 meg heap, then, you would get a YoungGen GC every 640
iterations; and with 16 megs, every 40 iterations.
And even with the big heap, overhead would jump to a large share of
the time (for 100,000 iterations we would get about 150 GCs, which
take 600 msecs on top of the 1 second of real work; this would give
us throughput of about 60k per second, i.e. about 40% GC overhead).
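When GC takes a large share of the time, it has to be added to the time spent on real work before dividing. A sketch of that calculation, with hypothetical names and the same assumed inputs as in the text (100k iterations per second without GC, 100k of garbage per iteration, 64 meg Eden, 4 millisecond pauses, decimal units):

```java
final class MixedGcThroughputSketch {
    // Returns {gcCount, gcSeconds, iterationsPerSecond, gcFractionOfWallTime}.
    static double[] estimate(double iterations,
                             double noGcIterationsPerSecond,
                             double garbageBytesPerIteration,
                             double edenBytes,
                             double pauseMillis) {
        double workSeconds = iterations / noGcIterationsPerSecond;
        double gcCount = iterations * garbageBytesPerIteration / edenBytes;
        double gcSeconds = gcCount * pauseMillis / 1000.0;
        double wallSeconds = workSeconds + gcSeconds; // GC pauses add to real work
        return new double[] {
            gcCount, gcSeconds, iterations / wallSeconds, gcSeconds / wallSeconds
        };
    }

    public static void main(String[] args) {
        double[] r = estimate(100_000, 100_000, 100_000, 64_000_000, 4.0);
        System.out.println("GCs=" + r[0] + ", GC seconds=" + r[1]
                + ", throughput/s=" + r[2] + ", GC share=" + r[3]);
    }
}
```

Without rounding, this gives about 156 GCs, a bit over 61k iterations per second, and a GC share a little under 40%, which agrees with the rounded figures above.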
6. One good thing about YoungGen GC: stability
The back-of-envelope calculations above suggest that YoungGen GC can
heavily affect numbers. This is true.
But one important thing to note is that its effect is fairly stable
on a per-codec basis, given that heap size is fixed (for the test)
and garbage production is fixed (per codec and input). It should not
be highly variable once we fix these variables.
So we can get stable numbers. The problem remains that of whether to:
(a) Hide memory allocation inefficiencies by using the largest heaps
we can, forcing a large YoungGen size,
(b) Punish memory-hungry codecs by using tiny heaps, or
(c) Try to present both overall efficiency and derived memory efficiency.
To do (c), we could run the tests with both (a) and (b), and calculate
a per-codec throughput ratio (a / b) -- this would indicate how
gracefully a codec degrades with more limited heap sizes.
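To get a feel for what the (a / b) ratio might look like, here is a sketch that feeds the model from this post into it. All names are hypothetical, and the throughputs are modeled from the assumed figures used earlier (4 millisecond pauses, 64 meg versus 4 meg Eden), not measured; in practice the two inputs to `degradationRatio` would be the measured throughputs from the two runs.

```java
final class HeapDegradationSketch {
    // Modeled throughput: no-GC rate slowed down by GC time added per
    // second of real work (same model as the earlier estimates).
    static double modeledThroughput(double noGcIterationsPerSecond,
                                    double garbageBytesPerIteration,
                                    double edenBytes,
                                    double pauseMillis) {
        double gcsPerWorkSecond =
                noGcIterationsPerSecond * garbageBytesPerIteration / edenBytes;
        double gcSecondsPerWorkSecond = gcsPerWorkSecond * pauseMillis / 1000.0;
        return noGcIterationsPerSecond / (1.0 + gcSecondsPerWorkSecond);
    }

    // Ratio (a / b): 1.0 means no slowdown with the small heap.
    static double degradationRatio(double largeHeapThroughput, double smallHeapThroughput) {
        return largeHeapThroughput / smallHeapThroughput;
    }

    public static void main(String[] args) {
        double largeEden = 64_000_000; // assumed Eden for the 256 meg heap
        double smallEden = 4_000_000;  // assumed Eden for the 16 meg heap
        double codecA = degradationRatio(
                modeledThroughput(100_000, 1_000, largeEden, 4.0),
                modeledThroughput(100_000, 1_000, smallEden, 4.0));
        double codecB = degradationRatio(
                modeledThroughput(100_000, 100_000, largeEden, 4.0),
                modeledThroughput(100_000, 100_000, smallEden, 4.0));
        System.out.println("codec A ratio=" + codecA + ", codec B ratio=" + codecB);
    }
}
```

Under this model codec A barely notices the smaller heap (ratio a little over 1), while the litterbug codec B slows down by a factor of nearly 7.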
-+ Tatu +-