Bit more on effects of GC...


Tatu Saloranta

May 3, 2012, 3:17:06 PM
to java-serializat...@googlegroups.com
I thought it might make sense to explain how I think regular on-going
garbage collection activity affects measurements, with respect to heap
size.
Others can correct me if I am wrong in some aspects; and based on
shared understanding maybe we can first figure out what we want, and
from that, what settings would work best.

For this discussion I only consider so-called "young generation"
collection: this refers to the fast copy-compacting collection the JVM
does between "Eden" and "Survivor" spaces. For simplicity I treat these
spaces as equally sized (on HotSpot the actual split is governed by
-XX:SurvivorRatio, and Eden is usually the larger part), and at least
for small heap sizes they cover most of the heap. Eden is where all
normal memory allocation occurs; filling it triggers a YoungGen
collection. Survivor space is where the JVM copies live objects during
this collection, which is how compaction is achieved.

I assume we can (and should) configure in such a way that no Old
Generation collection occurs -- but even if one does, it is a rare
enough occurrence that it either has no effect (choose minimum) or
is amortized over run time to also have little effect.

1. What is the size of Eden Space?

Since this is the relevant memory size for calculations, we need to
know how this depends on total heap size. We may also want to just
explicitly force the young generation size, with "-XX:NewSize=32m" (etc).

The default method for determining it is actually adaptive, meaning that
the JVM may change the ratios of spaces it uses. But looking at
descriptions, it looks like the whole young generation is given either
1/2 or 1/3 of the heap.
This would mean that with 256 megs we could get 128 megs for young
generation; and of this, 64 megs for Eden space.
With 16 megs it would mean a mere 4 megs -- but since there is some
stuff that must go in OldGen (eventually), I bet it can go as low as 1
meg. We shall see that this is an unreasonably small amount, and heavily
punishes "litter bug codecs".

But for now, consider we have allocation space of either 64 or 4 megs,
for two sizes that have been used (256, 16 meg heap sizes).
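
Rather than guessing the Eden size from the total heap size, it can be read from the running JVM. The following is only a minimal sketch using the standard java.lang.management API; the class name HeapPools is made up for illustration, and pool names (something like "Eden Space" or "PS Eden Space") vary by collector and JVM version, so they should not be hard-coded.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: lists the heap memory pools the JVM reports.
final class HeapPools {

    /** Returns one "name: committed=... max=..." line per heap pool. */
    static List<String> describeHeapPools() {
        List<String> lines = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.HEAP) {
                continue; // skip non-heap pools such as code cache
            }
            MemoryUsage usage = pool.getUsage();
            if (usage == null) {
                continue; // pool is no longer valid
            }
            // getMax() returns -1 when the maximum is undefined
            lines.add(pool.getName()
                    + ": committed=" + usage.getCommitted()
                    + " max=" + usage.getMax());
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : describeHeapPools()) {
            System.out.println(line);
        }
    }
}
```

Running this once with each heap setting under discussion (for example -Xmx256m and -Xmx16m) would show what the Eden size really is, instead of the 64 and 4 meg estimates used here.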

2. How often would YoungGen GC occur?

We have some idea of throughput of codecs; and fastest ones can push
through incredible number of iterations. Amount of garbage produced
depends on codec in question.

But to give some idea, let us assume that codec A can do 100000 (100k)
iterations per second, assuming unlimited memory (no GC of any kind).
This would be an average throughput; many codecs do more, others do less.
Further, assume that for each iteration, codec A produces 1k of
garbage: all codecs produce some, to create transient state. I don't
know what the exact range is, but I assume many produce more (JDK
default serialization, for example does); and at least some less.

From this, we get a needed allocation rate of 100 megs per second; and
this would mean either about 1.5 GCs per second (256), or 25 per second
(16).

3. How long does a single YoungGen GC take?

Amount of time taken depends first and foremost on number of _live_
objects (non-garbage), not amount of garbage. This is because a
copy-compact strategy is used, where only live objects are copied
over. Garbage then is "free"; although there is some amount of
overhead that depends on size (non-local memory refs etc).
Typical times I have seen in production systems are in single-digit
milliseconds; say, 4 - 10 milliseconds.

Our test cases should have "mostly garbage" settings, so I think our
per-GC overhead is pretty low: we can find this out by forcing
printing of GC logs.
But once again, let's assume a value like 4 milliseconds for both settings.

4. Relative effect?

For Codec A, then, we would get:

* 256 meg heap: 1.5 x 4 msecs, which is about 0.6% of runtime.
* 16 meg heap: 25 x 4 msecs, which is about 10% of runtime

and for hypothetical case of 64 meg heap:

* 64 meg heap (16 meg eden): 6 x 4 msecs == 24 msecs == 2.4%
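
The arithmetic behind these three figures can be sketched as below. This is just the back-of-envelope model from this message written out, not a measurement; the class and method names are made up, and it uses the same loose units as above (1 meg = 1000k).

```java
// Hypothetical helper: back-of-envelope YoungGen overhead for a codec.
final class GcOverheadEstimate {

    /** YoungGen collections per second: allocation rate divided by Eden size. */
    static double gcsPerSecond(double iterationsPerSec,
                               double garbagePerIterationKb,
                               double edenMb) {
        double allocationMbPerSec = iterationsPerSec * garbagePerIterationKb / 1000.0;
        return allocationMbPerSec / edenMb;
    }

    /** Fraction of each second spent in GC, assuming a fixed pause per collection. */
    static double overheadFraction(double gcsPerSecond, double pauseMillis) {
        return gcsPerSecond * pauseMillis / 1000.0;
    }

    public static void main(String[] args) {
        double[] edenSizesMb = {64, 16, 4}; // for 256, 64 and 16 meg heaps
        for (double edenMb : edenSizesMb) {
            double gcs = gcsPerSecond(100000, 1, edenMb);
            System.out.println(edenMb + " meg eden: " + gcs + " GCs/sec, overhead "
                    + (100.0 * overheadFraction(gcs, 4)) + "%");
        }
    }
}
```

Without rounding the GC counts, the 64 and 16 meg Eden cases come out as 1.5625 and 6.25 GCs per second, so the overheads are a touch higher than the rounded figures above (roughly 0.6% and 2.5%); the 4 meg case is exactly 25 per second and 10%.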

5. But wait! How does it change _relative_ numbers? (litterbugs!)

Above was assuming moderate garbage production, but one that pretty
much assumes that no buffers are re-created.
This is not true for all codecs: JVM default serialization, for
example, does heavy construction of buffers (I think).
And with buffers that work well for large(r) serializations, using 8k
or even 64k buffers is not unusual.

So assuming codec B allocated 100k per iteration, we would get
100 times as frequent YoungGen GC. This heavily skews numbers, and
simple calculations won't give the correct number for throughput (since GC
is mixed in with "real" work; the earlier math is a good enough
approximation for lower rates).

With 256 meg heap, then, you would get a YoungGen GC once every 640
iterations; and with 16, every 40 iterations.
And even with big heap, GC overhead would jump to a large share of time
(for 100,000 iterations we would get about 150 GCs, which take 600 msecs;
on top of 1 second of real work this would give us throughput of about
60k, i.e. roughly 40% GC overhead).
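
When GC time is no longer small compared to the real work, the pause has to be added to the per-iteration cost rather than subtracted as a percentage. A rough sketch of that calculation, under the same assumptions as above (100k iterations per second without GC, 4 msec per collection, 1 meg = 1000k); the names are made up for illustration.

```java
// Hypothetical helper: throughput once GC pauses are mixed in with real work.
final class LitterbugEstimate {

    /** Number of iterations that fill Eden once. */
    static double iterationsPerGc(double edenMb, double garbagePerIterationKb) {
        return edenMb * 1000.0 / garbagePerIterationKb;
    }

    /** Iterations per second when each iteration also pays its share of GC pauses. */
    static double effectiveThroughput(double rawIterationsPerSec,
                                      double iterationsPerGc,
                                      double pauseMillis) {
        double workSecPerIteration = 1.0 / rawIterationsPerSec;
        double gcSecPerIteration = (pauseMillis / 1000.0) / iterationsPerGc;
        return 1.0 / (workSecPerIteration + gcSecPerIteration);
    }

    public static void main(String[] args) {
        double[] edenSizesMb = {64, 4}; // for 256 and 16 meg heaps
        for (double edenMb : edenSizesMb) {
            double perGc = iterationsPerGc(edenMb, 100); // codec B: 100k per iteration
            System.out.println(edenMb + " meg eden: GC every " + perGc
                    + " iterations, throughput "
                    + effectiveThroughput(100000, perGc, 4) + "/sec");
        }
    }
}
```

For codec B this gives a bit over 61k iterations per second with the 256 meg heap, matching the estimate above, and only about 9k per second with the 16 meg heap under the same model.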

6. One good thing about YoungGen GC: stability

Above back of envelope calculations suggest that YoungGen GC can
heavily affect numbers. This is true.

But one important thing to note is that its effect is somewhat stable,
on per-codec basis, considering that heap size is fixed (for test),
and garbage production is fixed (per codec and input). It should not
be highly variable, once we fix these variables.

So we can get stable numbers. The problem remains that of whether to:

(a) Hide memory allocation inefficiencies, by using largest heaps we
can, forcing large YoungGen size
(b) Punish memory-hungry codecs by using tiny heaps, or
(c) Try to present both overall efficiency and derived memory efficiency.

To do (c), we could run results on both (a) and (b), and calculate
per-codec throughput ratio (a / b) -- this would indicate how
gracefully codec degrades with more limited heap sizes.
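
Computing the ratio for (c) is simple once both runs exist. A minimal sketch, assuming per-codec throughput numbers from the large-heap run (a) and the small-heap run (b) are already at hand; the class and method names are made up.

```java
// Hypothetical helper: how much a codec slows down when the heap shrinks.
final class DegradationRatio {

    /**
     * Ratio a / b of large-heap to small-heap throughput.
     * 1.0 means no degradation; larger values mean the codec
     * suffers more from a limited heap.
     */
    static double ratio(double largeHeapThroughput, double smallHeapThroughput) {
        if (largeHeapThroughput < 0 || smallHeapThroughput <= 0) {
            throw new IllegalArgumentException("throughput must be positive");
        }
        return largeHeapThroughput / smallHeapThroughput;
    }
}
```

A codec that produces little garbage should land close to 1.0, while a "litterbug" would show a clearly larger value.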

-+ Tatu +-

Kannan Goundan

May 3, 2012, 5:49:19 PM
to java-serializat...@googlegroups.com
That sounds like a good idea.  Since our results page already has too many separate rectangles, I wonder if we could combine (a) and (b) into a single chart (by making the results for (b) just extend off the bar for (a))?

Gah, I've wanted to improve the chart rendering for a while, I just can't find the time...



--
You received this message because you are subscribed to the Google Groups "java-serialization-benchmarking" group.
To post to this group, send email to java-serializat...@googlegroups.com.
To unsubscribe from this group, send email to java-serialization-be...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/java-serialization-benchmarking?hl=en.


Tatu Saloranta

May 3, 2012, 6:59:57 PM
to java-serializat...@googlegroups.com
On Thu, May 3, 2012 at 2:49 PM, Kannan Goundan <kan...@cakoose.com> wrote:
> That sounds like a good idea.  Since our results page already has too many
> separate rectangles, I wonder if we could combine (a) and (b) into a single
> chart (by making the results for (b) just extend off the bar for (a))?
>
> Gah, I've wanted to improve the chart rendering for a while, I just can't
> find the time...

That sounds like a good idea to me.

-+ Tatu +-

Rüdiger möller

Dec 6, 2012, 4:54:31 PM
to java-serializat...@googlegroups.com
There is an interface inside the VM where one can register to get notified in case of a GC. Since consumption of both static and dynamic memory is a 'hidden' cost of a serializer (because it increases fragmentation and full gc's over time) it should be measured.

so 
1) do a System.gc before each single benchmark
2) register at the vm management interface and count the number of minor+major GC's once a benchmark is running
3) chart that

Note: AFAIK up to 64MB heap the VM does a mark and sweep instead of generational GC, so you should give at least 256mb heap.
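
Steps 1 and 2 could be sketched roughly as follows. This is not the notification-based mechanism itself: as a simpler stand-in it polls the standard GarbageCollectorMXBean counters before and after the run. The class name GcCounter is made up, and note that System.gc() is only a request that the JVM may ignore.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper: counts collections that happen while a benchmark runs.
final class GcCounter {

    /** Sum of collection counts over all collectors (minor and major). */
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount(); // -1 if undefined for this collector
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    /** Sum of accumulated collection time in milliseconds over all collectors. */
    static long totalCollectionMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime(); // -1 if undefined for this collector
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    /** Runs the benchmark and returns {number of GCs, GC milliseconds} during it. */
    static long[] measure(Runnable benchmark) {
        System.gc(); // step 1: try to start from a clean heap
        long countBefore = totalCollections();
        long millisBefore = totalCollectionMillis();
        benchmark.run(); // step 2: counters advance while this runs
        return new long[] {
            totalCollections() - countBefore,
            totalCollectionMillis() - millisBefore
        };
    }
}
```

Each GarbageCollectorMXBean also has a getName(), so minor and major collections could be reported separately; the collector names depend on which GC the JVM is using.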

Tatu Saloranta

Dec 6, 2012, 6:12:53 PM
to java-serializat...@googlegroups.com
Very interesting, thank you for sharing this!

I would love a contribution for such change, if anyone has time?
(eventually maybe I can do it too). Didn't know 64M distinction, that
makes sense wrt some results I saw, too.

-+ Tatu +-

Rüdiger möller

Dec 6, 2012, 6:49:36 PM
to java-serializat...@googlegroups.com
I might do it when I add FST to the benchmark; however, I am on vacation currently, so it will take some time.

Tatu Saloranta

Dec 6, 2012, 7:22:51 PM
to java-serializat...@googlegroups.com
Understood, most of us have time constraints. But it is a good idea,
and thanks once again for the pointer to possible mechanism to use.

-+ Tatu +-