2017-06-02T18:46:56.010+0200: 10178.264: [GC pause (G1 Evacuation Pause) (young)
Desired survivor size 7902068736 bytes, new threshold 15 (max 15)
- age   1:    3862296 bytes,    3862296 total
- age   2:     459784 bytes,    4322080 total
- age   3:     587472 bytes,    4909552 total
- age   4:     383864 bytes,    5293416 total
- age   5:     399064 bytes,    5692480 total
- age   6:     363896 bytes,    6056376 total
- age   7:     323632 bytes,    6380008 total
- age   8:     357584 bytes,    6737592 total
- age   9:     323312 bytes,    7060904 total
- age  10:     328672 bytes,    7389576 total
- age  11:     986184 bytes,    8375760 total
- age  12:     222176 bytes,    8597936 total
- age  13:     188536 bytes,    8786472 total
- age  14:     207232 bytes,    8993704 total
- age  15:     120168 bytes,    9113872 total
 10178.275: [G1Ergonomics (CSet Construction) start choosing CSet, _pending_cards: 834913, predicted base time: 89.27 ms, remaining time: 110.73 ms, target pause time: 200.00 ms]
 10178.275: [G1Ergonomics (CSet Construction) add young regions to CSet, eden: 3762 regions, survivors: 1 regions, predicted young region time: 8.28 ms]
 10178.275: [G1Ergonomics (CSet Construction) finish choosing CSet, eden: 3762 regions, survivors: 1 regions, old: 0 regions, predicted pause time: 97.55 ms, target pause time: 200.00 ms]
2017-06-02T18:46:56.121+0200: 10178.375: [SoftReference, 0 refs, 0.0011474 secs]
2017-06-02T18:46:56.122+0200: 10178.376: [WeakReference, 62 refs, 0.0005960 secs]
2017-06-02T18:46:56.123+0200: 10178.377: [FinalReference, 536 refs, 0.0023264 secs]
2017-06-02T18:46:56.125+0200: 10178.379: [PhantomReference, 0 refs, 0 refs, 0.0014977 secs]
2017-06-02T18:46:56.127+0200: 10178.381: [JNI Weak Reference, 0.0000471 secs], 0.1337364 secs]
   [Parallel Time: 97.7 ms, GC Workers: 28]
      [GC Worker Start (ms): Min: 10178276.8, Avg: 10178277.6, Max: 10178278.2, Diff: 1.4]
      [Ext Root Scanning (ms): Min: 0.0, Avg: 0.8, Max: 6.4, Diff: 6.4, Sum: 23.4]
      [Update RS (ms): Min: 86.6, Avg: 91.4, Max: 91.7, Diff: 5.2, Sum: 2559.4]
         [Processed Buffers: Min: 104, Avg: 118.4, Max: 143, Diff: 39, Sum: 3315]
      [Scan RS (ms): Min: 0.7, Avg: 1.0, Max: 1.1, Diff: 0.3, Sum: 26.9]
      [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.3, Diff: 0.3, Sum: 0.7]
      [Object Copy (ms): Min: 2.3, Avg: 2.5, Max: 3.0, Diff: 0.6, Sum: 71.2]
      [Termination (ms): Min: 0.0, Avg: 0.4, Max: 0.6, Diff: 0.6, Sum: 12.1]
         [Termination Attempts: Min: 1, Avg: 1.0, Max: 1, Diff: 0, Sum: 28]
      [GC Worker Other (ms): Min: 0.0, Avg: 0.3, Max: 0.5, Diff: 0.5, Sum: 7.6]
      [GC Worker Total (ms): Min: 95.7, Avg: 96.5, Max: 97.3, Diff: 1.6, Sum: 2701.4]
      [GC Worker End (ms): Min: 10178373.9, Avg: 10178374.1, Max: 10178374.3, Diff: 0.4]
   [Code Root Fixup: 0.1 ms]
   [Code Root Purge: 0.0 ms]
   [Clear CT: 9.0 ms]
   [Other: 26.9 ms]
      [Choose CSet: 0.0 ms]
      [Ref Proc: 6.1 ms]
      [Ref Enq: 0.2 ms]
      [Redirty Cards: 0.3 ms]
      [Humongous Register: 1.7 ms]
      [Humongous Reclaim: 0.7 ms]
      [Free CSet: 6.5 ms]
   [Eden: 117.6G(117.6G)->0.0B(117.5G) Survivors: 32.0M->96.0M Heap: 138.2G(196.0G)->20.6G(196.0G)]
 [Times: user=2.99 sys=0.02, real=0.13 secs]
2017-06-02T18:46:56.144+0200: 10178.398: Total time for which application threads were stopped: 0.2457828 seconds, Stopping threads took: 0.1110667 seconds
2017-06-02T18:46:56.145+0200: 10178.399: Total time for which application threads were stopped: 0.0006996 seconds, Stopping threads took: 0.0001565 seconds
2017-06-02T18:46:56.145+0200: 10178.399: Total time for which application threads were stopped: 0.0003017 seconds, Stopping threads took: 0.0000817 seconds
2017-06-02T18:46:56.146+0200: 10178.400: Total time for which application threads were stopped: 0.0002389 seconds, Stopping threads took: 0.0000372 seconds
2017-06-02T18:46:56.146+0200: 10178.400: Total time for which application threads were stopped: 0.0002567 seconds, Stopping threads took: 0.0000476 seconds
2017-06-02T18:46:59.820+0200: 10182.074: Total time for which application threads were stopped: 0.0006631 seconds, Stopping threads took: 0.0001533 seconds
2017-06-02T18:46:59.820+0200: 10182.074: Total time for which application threads were stopped: 0.0002674 seconds, Stopping threads took: 0.0000527 seconds
2017-06-02T18:47:06.354+0200: 10188.608: [GC pause (G1 Evacuation Pause) (young)
Desired survivor size 671088640 bytes, new threshold 15 (max 15)
- age   1:   49132824 bytes,   49132824 total
- age   2:    1745176 bytes,   50878000 total
- age   3:     417632 bytes,   51295632 total
- age   4:     411672 bytes,   51707304 total
- age   5:     371712 bytes,   52079016 total
- age   6:     377456 bytes,   52456472 total
- age   7:     344840 bytes,   52801312 total
- age   8:     317216 bytes,   53118528 total
- age   9:     331216 bytes,   53449744 total
- age  10:     302656 bytes,   53752400 total
- age  11:     282424 bytes,   54034824 total
- age  12:     733032 bytes,   54767856 total
- age  13:     221808 bytes,   54989664 total
- age  14:     170104 bytes,   55159768 total
- age  15:     179872 bytes,   55339640 total
 10188.608: [G1Ergonomics (CSet Construction) start choosing CSet, _pending_cards: 898670, predicted base time: 105.76 ms, remaining time: 94.24 ms, target pause time: 200.00 ms]
 10188.608: [G1Ergonomics (CSet Construction) add young regions to CSet, eden: 310 regions, survivors: 3 regions, predicted young region time: 2434.15 ms]
 10188.608: [G1Ergonomics (CSet Construction) finish choosing CSet, eden: 310 regions, survivors: 3 regions, old: 0 regions, predicted pause time: 2539.91 ms, target pause time: 200.00 ms]
2017-06-02T18:47:08.146+0200: 10190.400: [SoftReference, 0 refs, 0.0010616 secs]
2017-06-02T18:47:08.147+0200: 10190.401: [WeakReference, 24 refs, 0.0005880 secs]
2017-06-02T18:47:08.148+0200: 10190.402: [FinalReference, 49 refs, 0.0006509 secs]
2017-06-02T18:47:08.149+0200: 10190.402: [PhantomReference, 0 refs, 1 refs, 0.0010624 secs]
2017-06-02T18:47:08.150+0200: 10190.404: [JNI Weak Reference, 0.0000308 secs], 1.8028711 secs]
   [Parallel Time: 1790.9 ms, GC Workers: 28]
      [GC Worker Start (ms): Min: 10188608.9, Avg: 10188609.3, Max: 10188609.6, Diff: 0.7]
      [Ext Root Scanning (ms): Min: 0.3, Avg: 0.7, Max: 2.6, Diff: 2.3, Sum: 19.7]
      [Update RS (ms): Min: 127.3, Avg: 128.9, Max: 129.7, Diff: 2.4, Sum: 3608.7]
         [Processed Buffers: Min: 89, Avg: 127.7, Max: 180, Diff: 91, Sum: 3576]
      [Scan RS (ms): Min: 582.3, Avg: 582.9, Max: 583.1, Diff: 0.9, Sum: 16322.1]
      [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.4, Diff: 0.4, Sum: 0.5]
      [Object Copy (ms): Min: 1073.9, Avg: 1074.5, Max: 1077.6, Diff: 3.7, Sum: 30085.2]
      [Termination (ms): Min: 0.0, Avg: 3.2, Max: 3.7, Diff: 3.7, Sum: 90.0]
         [Termination Attempts: Min: 1, Avg: 780.6, Max: 959, Diff: 958, Sum: 21856]
      [GC Worker Other (ms): Min: 0.0, Avg: 0.1, Max: 0.2, Diff: 0.1, Sum: 2.6]
      [GC Worker Total (ms): Min: 1789.9, Avg: 1790.3, Max: 1790.7, Diff: 0.7, Sum: 50128.8]
      [GC Worker End (ms): Min: 10190399.5, Avg: 10190399.6, Max: 10190399.7, Diff: 0.2]
   [Code Root Fixup: 0.1 ms]
   [Code Root Purge: 0.0 ms]
   [Clear CT: 0.7 ms]
   [Other: 11.2 ms]
      [Choose CSet: 0.0 ms]
      [Ref Proc: 3.7 ms]
      [Ref Enq: 0.2 ms]
      [Redirty Cards: 4.2 ms]
      [Humongous Register: 0.4 ms]
      [Humongous Reclaim: 0.1 ms]
      [Free CSet: 0.8 ms]
   [Eden: 9920.0M(9920.0M)->0.0B(8736.0M) Survivors: 96.0M->1280.0M Heap: 33.2G(196.0G)->24.9G(196.0G)]
 [Times: user=35.11 sys=3.42, real=1.80 secs]
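For readers who want to pull the headline numbers out of logs like the two pauses above, here is a small Python sketch (mine, not from the thread) that extracts the wall-clock pause durations from the `[Times: ...]` lines that `-XX:+PrintGCDetails` appends to each pause in JDK 8-era G1 logs:

```python
import re

# Matches the "[Times: user=... sys=..., real=... secs]" line printed
# at the end of every pause in JDK 8-era -XX:+PrintGCDetails output.
TIMES_RE = re.compile(r"\[Times: user=([\d.]+) sys=([\d.]+), real=([\d.]+) secs\]")

def real_pause_seconds(log_text):
    """Return the wall-clock ('real') duration of each pause, in seconds."""
    return [float(m.group(3)) for m in TIMES_RE.finditer(log_text)]

# The two pauses from the log excerpt above:
sample = (
    "[Times: user=2.99 sys=0.02, real=0.13 secs]\n"
    "[Times: user=35.11 sys=3.42, real=1.80 secs]\n"
)
print(real_pause_seconds(sample))  # -> [0.13, 1.8]
```

Run against the full log, this immediately surfaces the jump from a 0.13 s pause to a 1.80 s pause that the rest of the thread discusses.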
On Jun 4, 2017, at 12:31 AM, Gil Tene <g...@azul.com> wrote:

I think that the cause and effect are backwards here... The application caused this behavior directly, and what it does is "normal" and "typical". Given the stop-the-world nature of G1's newgen collector, you actually "got off easy", and saw a pause of "only" 1.8 seconds…
The significant thing about the first collection with the lower region count [the 2nd collection in your post] is the fact that a large amount of recently allocated objects is alive at the time of collection. Specifically, ~1421MB [9920MB - (33.2GB - 24.9GB)] out of the 9920MB allocated in the ~10 seconds preceding the 1.8 second pause, of which 1280MB was relocated to survivors, and the rest (~140MB) was presumably promoted. This is dramatically different behavior compared to the previous collection cycles (where only ~64MB out of ~117GB were alive at the time of collection). You can see this application behavior continue through the next ~130 seconds (looking at the log file link): The next cycle after the "different" behavior started covered an allocation period of ~30 seconds, during which 8760MB were newly allocated, and of which ~1158MB stayed alive [8760MB - (33.4GB - 26.0GB)]. The next cycle after that covered an allocation period of ~100 seconds, during which 8736MB were newly allocated, and of which ~646MB stayed alive [8736MB - (34.5GB - 26.6GB)].
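The bracketed arithmetic above can be reproduced directly. A minimal sketch (mine), using the formula from the text: survived = newly allocated - (heap before - heap after):

```python
def survived_mb(allocated_mb, heap_before_gb, heap_after_gb):
    """MB of newly allocated data still alive at collection time:
    what was allocated minus what the collection reclaimed."""
    reclaimed_mb = (heap_before_gb - heap_after_gb) * 1024
    return allocated_mb - reclaimed_mb

# The 1.8 s pause: 9920 MB allocated, heap went 33.2 GB -> 24.9 GB.
print(round(survived_mb(9920, 33.2, 24.9)))  # -> 1421
# A later cycle: 8736 MB allocated, heap went 34.5 GB -> 26.6 GB.
print(round(survived_mb(8736, 34.5, 26.6)))  # -> 646
```

(Small differences from the figures quoted in the text come from the log's GB values being rounded to one decimal place.)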
We are looking at a period of time in the application execution that spans ~140 seconds, during which a total of ~27416MB were newly allocated, and of which ~3225MB stuck around long enough to be copied around and promoted. This ~3.15GB of new [non-ephemerally-dying] state is very real, it's just not "typical" of other times in the log for the same application.

The explanation for this is usually quite simple: it suggests a phase change in application behavior, with a large amount of new state that is being kept for a while. This is not "wrong" in any way. In fact, it is extremely common and typical to see such phase-change behaviors across many application types. E.g. a refresh of a cache, a reloading of a product catalog, a reindexing, a loading of data for an analytics computation, a rejoining of a cluster, the reaction to SSTable compaction, or nodes coming and going in Cassandra, Elastic, ZooKeeper, etc. etc.
The earlier GC trigger with the smaller region count may very well be a reaction to a dramatic change in allocation rate, rather than the cause of the large pause. You actually "got lucky" here. In cases where such a phase change occurs but the allocation rate change is not "violent enough" to heuristically trigger an early newgen collection, the newgen pause on the phase change would have been much bigger [this is a common problem with large STW newgens, since some phase changes only affect data/state but may not change allocation rate dramatically]. In fact, the allocation rate going into the 2nd and 3rd collections in this ~140 second period is much more relaxed than the one leading into the first collection, and may not have been high enough to trigger more aggressive collection on its own. If the collector had waited for the whole 140 seconds before triggering the collection in this case, the pause may have taken >2x longer, because it would have had to deal with moving 3.15GB around in a single pause rather than "just" 1.4GB.

At this point I often get questions from people that go something like this: "Isn't G1 supposed to break up these large pauses into smaller 'increment' pauses?". While G1 does incremental STW compaction in *oldgen* (spreading the pause across increments), its newgen collector is effectively monolithic: the whole of newgen must be collected in a single STW pause. Newgen pauses are shorter primarily (only?) because they do not have to trace all live objects in the heap, and focus on tracing only the newgen objects and all roots to them. But this also means that the collector has no way to determine the amount of live matter in the newgen until after a monolithic cycle has completed, and to complete that cycle, the set of regions in the newgen must be selected without that knowledge. If the "guess" is right (e.g. "they are mostly dead" and "last time only X MB stayed alive"), you get a very efficient newgen with a relatively small pause. But if the "guess" was wrong (1421MB is found alive in 9.9GB of allocation, even though last time only 64MB was found alive in 117GB of allocation, as was the case here), it's too late. You're in for a big pause.

The problem/dilemma that you see here is inherent in all monolithic stop-the-world relocating newgen collectors. For all such collectors [at least currently], the only way to reduce the newgen pause in the presence of a high amount of newly allocated surviving objects is to significantly reduce the newgen size (e.g. trigger at ~1GB of allocation rather than after 10GB, in the hope that only 1/10th of the stuff would need to be relocated in a single pause). Unfortunately the collector has no idea how much newgen material will survive before it starts the collection, and its guesses based on prior collections are dramatically and inherently wrong at a phase change. It is stuck with a choice between running very inefficient collections (keep newgen at a few 100 MB, even though you have a 196GB heap, just in case a large amount of relatively-long-lasting state is allocated during a phase change) and accepting a huge pause on a phase change. The only ways out of this dilemma are to change how the collector deals with newgen: either break the newgen into several (more than 1 or 2) incremental-STW sub-parts (something no JVM has done to date, AFAIK), or use a concurrent relocating newgen (which by now is a well-proven technique).

Note that even during this different phase of operation the weak generational hypothesis is holding (85%+ of newly allocated objects have died). It's just that outside of such phase shifts, most applications demonstrate the weak generational hypothesis in a much more dramatic way (with 99%+ death rates, and with the rate growing with larger and larger newgens).
So the newgen efficiency is still there (compared to dealing with the same workload with a non-generational or oldgen collection), but now it comes with huge newgen pauses...
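The sizing dilemma described above can be made concrete with a toy model (my own illustration, not from the thread; the ~14% surviving fraction and ~1.3 s per copied GB are assumptions loosely based on this log): a smaller newgen caps the worst-case pause, but multiplies the number of collections needed to chew through the same allocation.

```python
# Toy model of the monolithic-STW-newgen dilemma. Assumed, illustrative
# inputs: during a phase change ~14% of newly allocated bytes survive,
# and relocating survivors costs ~1.3 s per GB copied.
SURVIVING_FRACTION = 0.14
COPY_SECONDS_PER_GB = 1.3

def worst_case_pause_s(newgen_gb):
    # A monolithic newgen must copy all of its survivors in one STW pause.
    return newgen_gb * SURVIVING_FRACTION * COPY_SECONDS_PER_GB

def pauses_per_phase(newgen_gb, phase_allocation_gb=27.0):
    # Smaller newgen => more collections over the same allocated volume.
    return phase_allocation_gb / newgen_gb

for size_gb in (10.0, 1.0, 0.1):
    print(f"newgen {size_gb:5.1f} GB: worst-case pause "
          f"~{worst_case_pause_s(size_gb):.2f} s, "
          f"~{pauses_per_phase(size_gb):.0f} pauses over the phase")
```

Under these assumed numbers, a 10 GB newgen gives a ~1.8 s worst-case pause (matching the log), while capping pauses below ~0.2 s would require a ~1 GB newgen and roughly 10x as many collections, each with its fixed root-scanning and remembered-set overhead.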
To summarize your situation: Basically, the large amount of copying (either to survivors within newgen or via promotion to oldgen) is a result of program behavior. The collector is just reacting to that behavior. And unfortunately for all monolithic STW relocating newgen collectors, there isn't much you can do about that. Since several newly-allocated-but-surviving GBs need to be relocated in a relatively short period of time (10s of seconds), you are going to see pause times that amount to ~1-2 sec per relocated GB over that period of time. This means that you will see significant pause artifacts in your run no matter what (as long as you are using a STW newgen).
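The ~1-2 sec per relocated GB figure can be sanity-checked against the 1.8 s pause in the log above (my arithmetic; the log reports ~1280 MB moved to survivors, ~140 MB presumably promoted, and real=1.80 s):

```python
# Check the "~1-2 sec per relocated GB" rule of thumb against the log's
# 1.8 s pause: ~1280 MB went to survivors and ~140 MB was promoted.
relocated_gb = (1280 + 140) / 1024  # ~1.39 GB copied in one STW pause
pause_s = 1.80                      # 'real=' time reported for that pause
secs_per_gb = pause_s / relocated_gb
print(f"{secs_per_gb:.2f} s per relocated GB")  # -> 1.30
```

So this particular pause comes out at ~1.3 s/GB, squarely inside the quoted range.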
The actual bottom line is very simple: You need a concurrent newgen to handle this workload, with these phase changes, without any of those "hiccups". We can discuss that in other posts if you want ;-).
On Jun 5, 2017, at 3:58 PM, Alexandr Nikitin <nikitin.a...@gmail.com> wrote:

Gil, Kirk, thank you for the great and thorough explanation! It is an awesome read! Almost everything fell into place.

Indeed, Gil is right about the application pattern change. It's a high-load API. There's a background thread that changes an internal state. The state is ~3.5G and requires a similar amount of memory to build itself. It happens in one background thread with lower priority. The change doesn't happen often, once per hour or so. The G1 is new to me and I couldn't intuitively understand why it reacts to the allocation pattern change like that.
There's the main allocation/collection pattern that lasts for hours. And it needs just one background low-priority thread (which allocates less memory than the worker threads) to change it :) That region number change leads to more copying and promotions, and longer pauses as a consequence (it changes the max age threshold to 1 and happens more often).

> And the actual bottom line is very simple: You need a concurrent newgen to handle this workload, with these phase changes, without any of those "hiccups". We can discuss that in other posts if you want ;-).

Do you mean Azul Zing by that?
On Jun 5, 2017, at 3:58 PM, Alexandr Nikitin <nikitin.a...@gmail.com> wrote:
> [...] The G1 is new to me and I couldn't intuitively understand why it reacts to the allocation pattern change like that.

All collectors in the JVM will react in a similar way. I’ve just spent 2 hours with a client explaining to them that in their case GC wasn’t at fault. It was simply shouting very loudly that something bad was happening in the application that needed to be investigated.
On Jun 5, 2017, at 9:02 PM, Gil Tene <g...@azul.com> wrote:
On Monday, June 5, 2017 at 9:47:56 PM UTC+3, Kirk Pepperdine wrote:
> All collectors in the JVM will react in a similar way. [...]

All collectors in the *HotSpot* JVM (Oracle or OpenJDK) will react in a similar way. All collectors in the Zing JVM (we only have one) won't. It takes a lot more than a few GB of promotion and a few GB/sec of allocation rate to make the C4 collector lose its temper and complain that the programmer is doing it wrong ;-).
> [...] Do you mean Azul Zing by that?

Of course he does :-) And he’s not wrong.

— Kirk

PS: Gil, I’ll collect my commission on this lead once you close it. ;-)