V8: Getting Garbage Collection for Free


Jimmy Jia

Aug 10, 2015, 10:04:10 PM
to mechanical-sympathy
I thought this piece might look interesting to some people here:


Of course, 16.6 ms is an eternity and then some to most of us, but it's a cool technique. It's quite fascinating to me to see what techniques are available in "adjacent" fields, if you will.

Vitaly Davidovich

Aug 10, 2015, 10:57:37 PM
to mechanical-sympathy

I saw this earlier too.  It's a cute trick when you have discrete latency-sensitive events with a fixed SLA (e.g. the animation running at 60 fps in the article).  But when the arrival rate of latency-sensitive tasks is random (or "constant"), you really don't know a priori whether you have idle time or not.  Moreover, 16 millis is an eternity for some server workloads, and it's unlikely smaller time slices would suffice to get any meaningful GC done on a typical server heap.

Fundamentally, if you must ensure no jitter due to GC then you need to take control away from it.  Otherwise, you're always subject to its heuristics/implementation details biting you.

sent from my phone

--
You received this message because you are subscribed to the Google Groups "mechanical-sympathy" group.
To unsubscribe from this group and stop receiving emails from it, send an email to mechanical-symp...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Kirk Pepperdine

Aug 11, 2015, 2:21:42 AM
to mechanica...@googlegroups.com
On Aug 11, 2015, at 4:57 AM, Vitaly Davidovich <vit...@gmail.com> wrote:

I saw this earlier too.  It's a cute trick when you have discrete latency sensitive events of a fixed SLA (e.g. animation running at 60 fps in the article).  But, when the arrival rate of latency sensitive tasks is random (or "constant"), you really don't know a priori whether you have idle time or not.  Moreover, 16 millis is an eternity for some server workloads, and it's unlikely smaller time slices would suffice to get any meaningful GC done on a typical server heap.


Indeed, it’s easy to tune to a fixed goal such as 16 ms. You know how long it takes to process a frame, and that tells you how long a pause you can tolerate. With a fixed arrival rate and a fixed rate of garbage creation, you can schedule GC in the dead time between the end of frame processing and the vsync. As soon as you hit a web-facing application with variable allocation rates, scheduling becomes more probabilistic and less deterministic.
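In Java terms, that fixed-budget case might be sketched roughly like this. It's only a toy loop: `System.gc()` requests a full collection and is merely a hint the JVM may ignore, quite unlike the incremental idle-time work V8 does, and the budget numbers are illustrative.

```java
class FrameLoop {
    static final long FRAME_BUDGET_NANOS = 16_666_667L;   // ~1/60th of a second
    static final long MIN_IDLE_FOR_GC_NANOS = 5_000_000L; // illustrative threshold

    // Dead time left in the frame after the work is done.
    static long idleNanos(long workNanos) {
        return Math.max(0L, FRAME_BUDGET_NANOS - workNanos);
    }

    static void runFrame(Runnable work) {
        long start = System.nanoTime();
        work.run(); // process the frame
        if (idleNanos(System.nanoTime() - start) > MIN_IDLE_FOR_GC_NANOS) {
            System.gc(); // hint: collect now, in the dead time before vsync
        }
        // ... block until vsync ...
    }
}
```

With variable arrival rates, the `idleNanos` estimate is exactly the quantity you can no longer predict.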

Fundamentally, if you must ensure no jitter due to GC then you need to take control away from it.  Otherwise, you're always subject to its heuristics/implementation details biting you.


I like Cliff Click’s comment that the JVM gives you the illusion of an infinite amount of memory. You don’t have that, and if, in the long term, your reclaim rate < allocation rate, you can have all the control you want over your collector: your application is still going to come to a grinding halt, because that illusion is a leaky abstraction.

Regards,
Kirk

ymo

Aug 12, 2015, 9:39:27 AM
to mechanical-sympathy
It would be nice if you could tell the JVM to enter a particular method in a timely and deterministic fashion, something similar to real-time OS capabilities. If the method completes before the deadline, then the JVM can go and do its garbage collection or anything else it fancies. I remember there was a ton of talk about real-time Java a while ago, but that seems to have died down. This would be great for latency-sensitive apps, instead of fiddling with GC. At the very least, it guarantees that when the system is under load you still don't exceed your maximum expected load, and you apply back pressure on the stuff waiting in line.



Vitaly Davidovich

Aug 12, 2015, 9:52:47 AM
to mechanical-sympathy

Interestingly, .NET recently added something like that: https://msdn.microsoft.com/en-us/library/system.gc.trystartnogcregion(v=vs.110).aspx

sent from my phone


Kirk Pepperdine

Aug 12, 2015, 9:53:25 AM
to mechanica...@googlegroups.com
On Aug 12, 2015, at 3:39 PM, ymo <ymol...@gmail.com> wrote:

It would be nice if you could tell the jvm to enter a particular method in a timely and deterministic fashion. Something similar to realtime os capabilities. If the method completes before the deadline then the jvm can go and do its garbage collection or anything else it fancy to do. I remember there was ton of talks about real-tme java a while ago but that seems like it died down. This would be great for latency sensisitive apps instead of fiddling with gc. At the very least what this does is guaranties that when the system is under load you would still not exceed your maximum expected load and perform back pressure on the the stuff waiting in line.

People seem to think that real time is fast. Real time is like Swiss time: everything is metered out so that it arrives “on time”. It’s an artificial schedule, and it’s anything but fast. Second, if you exhaust *any* resource because you can’t recycle it fast enough, you’re going to have to stop and focus on recycling, and in the meantime miss your deadlines. Unfortunately there isn’t any magic here. That said, I think we can do better, much better, with schemes to recycle reusable shared resources like memory.

— Kirk



Jimmy Jia

Aug 12, 2015, 2:30:38 PM
to mechanical-sympathy
I think the lesson to learn is that GC intrusiveness can be reduced when there's some awareness of the problem domain at hand.

In some sense, the JVM does the worst possible thing in initiating GC on allocation, because that's exactly when you know for sure that you aren't idle!

If you write everything GC-free, it's not a big deal - this is more for cases where you are mostly idle. Even if events arrive randomly, it seems like you ought to be able to reduce the probability of GC happening when you care: something like calling System.gc() yourself, but perhaps with a bit more control (i.e. I've made the calculation that if I give the JVM 20 ms to GC now, in expectation that speeds up future request handling).
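"GC-free" in practice usually means recycling objects instead of allocating. A minimal pool sketch (the names are illustrative, and this single-threaded version ignores the concurrency and state-reset discipline a real server would need):

```java
import java.util.ArrayDeque;
import java.util.function.Supplier;

// Reuse instances so the steady state allocates nothing,
// leaving the garbage collector with nothing to do.
final class Pool<T> {
    private final ArrayDeque<T> free = new ArrayDeque<>();
    private final Supplier<T> factory;

    Pool(Supplier<T> factory) { this.factory = factory; }

    T acquire() {
        T t = free.poll();
        return t != null ? t : factory.get(); // allocate only on a cold start
    }

    void release(T t) { free.push(t); }       // caller must reset state first
}
```

For example, a `Pool<StringBuilder>` whose buffers are cleared before each `release`.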

Kirk Pepperdine

Aug 12, 2015, 3:02:20 PM
to mechanica...@googlegroups.com
On Aug 12, 2015, at 8:30 PM, Jimmy Jia <tes...@gmail.com> wrote:

I think the lesson to learn is that GC intrusiveness can be reduced when there's some awareness of the problem domain at hand.

Indeed, and that is why we have different collectors, each of which has its own characteristics. If you follow the guidelines on when to use which one, you will have a good chance of picking the wrong one. If you follow conventional tuning guidelines, there is a reasonable chance that you’ll not be giving the collector the best chance to perform well for your application. I don’t believe that we currently have the best collectors that we could have in Java. C4 proves this point.


In some sense, the JVM does the worst possible thing in initiating GC on allocation, because that's exactly when you know for sure that you aren't idle!

Servers are rarely idle, and when they are, they certainly aren’t consuming memory. And the concurrent collectors start on a threshold. OK, you will only violate the threshold on an allocation, but even then the collection will not start immediately.
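For reference, on 2015-era HotSpot that threshold is tunable for the concurrent collectors, e.g. with CMS (flags as of JDK 8; `app.jar` stands in for your application):

```shell
# Start a concurrent cycle once the old generation reaches 70% occupancy,
# and honor only that value instead of HotSpot's adaptive estimate.
java -XX:+UseConcMarkSweepGC \
     -XX:CMSInitiatingOccupancyFraction=70 \
     -XX:+UseCMSInitiatingOccupancyOnly \
     -jar app.jar
```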


If you write everything GC-free, it's not a big deal - this is more for cases where you are mostly idle; even if events arrive randomly, it seems like you ought to be able to reduce the probability of GC happening when you care. Something like calling System.gc() yourself, but perhaps with a bit more control (i.e. I've made the calculation that if I give the JVM 20 ms to GC now, in expectation that speeds up future request handling).

I’ve been able to get GC to run when the system is about to be idle without calling System.gc(). It’s not easy, and it can’t work for every app out there, because it requires specific conditions in order to work without risk of other bad things happening, but it can be done. On the other hand, I’ve seen the disastrous results of developers making speculative calls to have the collector run. Better to have the collector running concurrently and continuously than to make speculative calls.

Regards,
Kirk


ymo

Aug 12, 2015, 4:46:38 PM
to mechanical-sympathy
I never said fast; I said "in a timely and deterministic fashion", which is miles away from saying fast! Moreover, you should not be in a situation where you can "exhaust *any* resource because you can’t recycle". That is just impossible if you are testing your stuff correctly. You should determine, during development and testing, exactly what your upper limits are in terms of garbage generation/collection, and design around that. Once you know your upper limit, you can stop taking more calls when you reach it and apply back pressure on the stuff coming upstream. All real time gives you is a deterministic way to reason about things, but yes, being faster is not guaranteed.

Another way of saying it would be: "If you can't handle a particular load with your current configuration, then you probably should not try to handle it in the first place"! In hardware, people have been dealing with hard upper-limit requirements and have learned to design with those limitations in mind. On the software/server side, all I see is "best effort"... no determinism or predictability whatsoever. A nice parallel is the frog that tried to be bigger than the cow: it blows up in your face at the end!
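That "known upper limit plus back pressure" idea can be sketched in Java with a semaphore; the capacity here is an assumed number you would get from load testing, not anything prescribed:

```java
import java.util.concurrent.Semaphore;

// Admit at most MAX_IN_FLIGHT requests at once; shed the rest so the
// system can never be pushed past the load it was validated against.
class BoundedIntake {
    static final int MAX_IN_FLIGHT = 64; // assumed, from load testing
    private final Semaphore permits = new Semaphore(MAX_IN_FLIGHT);

    // Returns false when at capacity: the caller should push back upstream.
    boolean tryHandle(Runnable request) {
        if (!permits.tryAcquire()) return false;
        try {
            request.run();
            return true;
        } finally {
            permits.release();
        }
    }
}
```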

Kirk Pepperdine

Aug 12, 2015, 5:21:47 PM
to mechanica...@googlegroups.com

> On Aug 12, 2015, at 10:46 PM, ymo <ymol...@gmail.com> wrote:
>
> i never said fast i said "in a timely and deterministic fashion" that is miles away from saying fast ! Moreover, you should not be in a situation where you can "exhaust *any* resource because you can’t recycle" . This is just impossible if you are testing your stuff correctly.

I completely agree. However, it’s not what I commonly run into.

> You should determine at development and testing exactly what are your upper limits in terms of garbage generation/collection and design around that. Once you know your upper limit you can stop taking more calls when you reach it and do back pressure on the stuff coming upstream. All real time gives you is a deterministic approach to reason about things but yes being faster it is not guaranteed.

Upper bounding assumes the ability to scale down, which isn’t always the case, so it’s not quite that simple. This is why I was hoping G1 would work better than it currently does, because it’s very adaptive... when it works right...
>
> Another way of saying would be "If you cant handle a particular load with your current configuration then you probably should not try to handle it in the first place "! In the hardware people have been dealing with hard upper limits requirements and learned to design with these limitations in mind. On the software/server side all i see is "best efforts" ... no determinism or predictability whatsoever. A nice parallel is the frog that tried to be bigger than a cow .. it blows at your face at the end !

Well, I’d agree again. Software generally has no regard for limitations, which is why you often find software that oversubscribes the hardware.

Regards,
Kirk


Gil Tene

Aug 16, 2015, 1:28:25 PM
to mechanica...@googlegroups.com
Depending on the JVM, there is already a (non-guaranteed, implementation-dependent) way to suppress GC activity during a critical execution section, using GetPrimitiveArrayCritical() and ReleasePrimitiveArrayCritical() from JNI. HotSpot variants (including the Oracle JVM, OpenJDK, and Zing) implement GetPrimitiveArrayCritical() by suppressing GC until the matching ReleasePrimitiveArrayCritical() is called.

Note that this is not guaranteed (spec'ed) behavior. GetPrimitiveArrayCritical() MAY be implemented with object pinning (which would just disable relocation of the specific array, but not block GC) or with copying (producing an off-heap copy of the array for JNI to operate on when GetPrimitiveArrayCritical() is called, and copying its contents back to the on-heap array when ReleasePrimitiveArrayCritical() is called). The fact that HotSpot variants implement GetPrimitiveArrayCritical() by delaying GC until the release is an implementation detail that cannot be relied on in the long term...

Also note the strong limitations on using this GetCritical/ReleaseCritical method. The critical section is assumed to (A) be very short-lived, and (B) not depend on the allocation of Java objects to proceed. (A) is obvious. (B) is needed to avoid deadlocks (the critical code waits for something that allocates a Java object, Java object allocation waits for GC to free memory, and GC waits for the critical code to complete).

And last, preventing GC from proceeding is not the same as guaranteeing no interruptions or stalls in the executed code. GC is just one of many things that cause a modern JVM to stall, so delaying GC will not guarantee anything close to deterministic execution. For an overview of *some* of the many things that can make a JVM pause, you can view John Cuthbertson's excellent presentation "What else makes a JVM pause" at https://www.youtube.com/watch?v=Y39kllzX1P8


Gil Tene

Aug 16, 2015, 4:51:06 PM
to mechanical-sympathy
The "do some stop-the-world GC work now" (and even some of the "do some concurrent GC work now") approach is VERY useful when you have:

1) A critical workload that is known to execute with a certain period.

2) A reduced need (compared to the critical workload's need) to deal with asynchronous (unpredictable arrival time) events in a timely manner.

These two qualities describe the world of video playback and gaming fairly well, where "judder" is very perceptible when you miss the 1/24th, 1/30th, or 1/60th of a second refresh-rate deadlines, and where it is acceptable for the response to asynchronous input to take a "larger" amount of time than the period of the critical refresh rate.

However, these qualities do not typically appear in server applications, most real-time applications, or most low-latency applications. The ability to predict a near-term future in which no work needs to be done is limited to a fairly narrow (but very useful) application space, the vast majority of which tends to be in the single-user client domain.

Vitaly Davidovich

Aug 16, 2015, 6:47:43 PM
to mechanical-sympathy

Yeah, there are other pause sources in the JVM.  The GC ones, however, are the ones people mostly tend to notice (and thus try to avoid/shorten), even outside the low-latency space.  The chief reason, I suspect, is that GC pauses don't scale well with increasing heap sizes, whereas the other ones (monitor deflation, biased-lock revocation, nmethod sweeping, nmethod hotness marking, compiler deoptimizations) tend to stay relatively constant (an increasing thread count could bump some of those as well).  So although there are no hard real-time guarantees (besides the JVM, the OS can induce pauses anyway), folks tend to try to minimize the sources inflicting the most noticeable performance penalties.
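On 2015-era HotSpot, those non-GC safepoint pauses can be surfaced with logging flags (JDK 8 flags; `app.jar` stands in for your application):

```shell
# Log every stop-the-world pause, not just GC, plus per-safepoint statistics
# showing which VM operation triggered each one.
java -XX:+PrintGCApplicationStoppedTime \
     -XX:+PrintSafepointStatistics \
     -XX:PrintSafepointStatisticsCount=1 \
     -jar app.jar
```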

As for CLR, it doesn't have some of the safepoint tasks that are performed by the JVM (e.g. no nmethod sweeping/hotness marking, no biased lock revocation, no compiler deopt) so GC tends to be the only runtime induced pause.

sent from my phone

On Aug 16, 2015 1:28 PM, "Gil Tene" <g...@azulsystems.com> wrote:
Depending on JVM, there is already a (non-guaranteed, implementation dependent) way to suppress GC activity during a critical execution section using GetPrimitiveArrayCritical() and ReleasePrimitiveArrayCritical() from JNI. HotSpot variants (including Oracle JVM, OpenJDK, and Zing) implement GetPrimitiveArrayCritical() by suppressing GC until the matching ReleasePrimitiveArrayCritical() is called.

Note that is not guaranteed (spec'ed) behavior. GetPrimitiveArrayCritical() MAY be implemented with object pinning (which would just disable the relocation of the specific array, but not block GC) or with copying (producing an off-heap copy of the array for JNI to operate on when GetPrimitiveArrayCritical() is called, and copying it's contents back to the on-heap array when ReleasePrimitiveArrayCritical() is called). The fact that HotSpot variants implement GetPrimitiveArrayCritical() by delaying GC until the release is an implementation detail that cannot be relied on in the long term... 

Also note the strong limitation on using this GetCrtical/ReleaseCritical method. It is assumed to (A) be very short lived, and (B) not depend on allocation of Java objects to proceed. (A) is obvious. (B) is needed to avoid deadlocks (critical code wits for something to allocate a java object, java object allocation waits for GC to free memory, GC waits for critical code to complete).

And last, preventing GC from proceeding is not the same as guaranteeing no interruption or stalls in the executed code. GC is just one of many things that causes a modern JVM to stall, so delaying GC will not guarantee anything close to determinsitic execution. For an overview of *some* of the many things that can make a JVM pause, you can view John Cuthbertson's excellent presentation on "What els makes a JVM pause" at https://www.youtube.com/watch?v=Y39kllzX1P8

On Wednesday, August 12, 2015 at 3:52:47 AM UTC-10, Vitaly Davidovich wrote:

Interestingly, .NET recently added something like that: https://msdn.microsoft.com/en-us/library/system.gc.trystartnogcregion(v=vs.110).aspx

sent from my phone

On Aug 12, 2015 9:39 AM, "ymo" <ymol...@gmail.com> wrote:
It would be nice if you could tell the jvm to enter a particular method in a timely and deterministic fashion. Something similar to realtime os capabilities. If the method completes before the deadline then the jvm can go and do its garbage collection or anything else it fancy to do. I remember there was ton of talks about real-tme java a while ago but that seems like it died down. This would be great for latency sensisitive apps instead of fiddling with gc. At the very least what this does is guaranties that when the system is under load you would still not exceed your maximum expected load and perform back pressure on the the stuff waiting in line.

On Tuesday, August 11, 2015 at 2:21:42 AM UTC-4, Kirk Pepperdine wrote:

On Aug 11, 2015, at 4:57 AM, Vitaly Davidovich <vit...@gmail.com> wrote:

I saw this earlier too.  It's a cute trick when you have discrete latency sensitive events of a fixed SLA (e.g. animation running at 60 fps in the article).  But, when the arrival rate of latency sensitive tasks is random (or "constant"), you really don't know a priori whether you have idle time or not.  Moreover, 16 millis is an eternity for some server workloads, and it's unlikely smaller time slices would suffice to get any meaningful GC done on a typical server heap.


Indeed, it’s easy to tune to a fixed goal such as 16ms. You know how long it takes to process a frame. That tells you how long a pause you can tolerate. With fixed rates of arrival and fixed rates of garbage being created means you can schedule GC in the dead time after processing the frame completes and the vsync. As soon as you hit a web facing application with variable allocation rates scheduling becomes more probabilistic and deterministic.

Fundamentally, if you must ensure no jitter due to GC then you need to take control away from it.  Otherwise, you're always subject to its heuristics/implementation details biting you.


I like Cliff Click’s comment that the JVM gives you the illusion that you have an infinite amount of memory. You don’t, and if, in the long term, your reclaim rate < allocation rate, then you can have all the control you want over your collector — your application is still going to come to a grinding halt, because that illusion is a leaky abstraction.

Regards,
Kirk

sent from my phone

On Aug 10, 2015 10:04 PM, "Jimmy Jia" <tes...@gmail.com> wrote:
I thought this piece might look interesting to some people here:


Of course, 16.6 ms is an eternity and then some to most of us, but it's a cool technique. It's quite fascinating to me to see what techniques are available in "adjacent" fields, if you will.


Gil Tene

unread,
Aug 16, 2015, 7:01:24 PM8/16/15
to <mechanical-sympathy@googlegroups.com>


Sent from Gil's iPhone

On Aug 16, 2015, at 12:47 PM, Vitaly Davidovich <vit...@gmail.com> wrote:

Yeah, there are other pause sources in the JVM.  The GC ones, however, are the ones people mostly tend to notice (and thus try to avoid/shorten) even outside the low latency space.  The chief reason, I suspect, is because the GC pauses don't scale well with increasing heap sizes whereas the other ones (monitor deflation, biased lock revocation, nmethod sweeping, nmethod hotness marking, compiler deoptimizations) tend to stay relatively constant (increasing thread count could bump some of those as well).  So although there's no hard realtime guarantees (besides the JVM, the OS can induce some anyway), folks tend to try minimizing the ones inflicting the most noticeable performance penalties.


While GC pauses certainly tend to dominate awareness (and for good reason with most collectors), any pause that requires a safepoint is equally susceptible to TTSP (time to safepoint) when it comes to pause length. And TTSP can get very large, e.g. with counted loops, ranging into the 100s of msec and beyond. 
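The counted-loop TTSP hazard Gil mentions can be illustrated with a small sketch. Whether the safepoint poll is actually elided depends on the JVM version and flags (for example, `-XX:+UseCountedLoopSafepoints` changes the behavior), so treat this as an illustration of the shape of the problem, not a guaranteed reproduction.

```java
/**
 * Illustration of the time-to-safepoint (TTSP) hazard: HotSpot's JIT
 * has historically omitted safepoint polls inside int-counted loops,
 * so a thread stuck in such a loop delays every safepoint operation
 * (GC, deopt, etc.) until the loop exits.
 */
public class CountedLoopTtsp {
    static volatile long blackhole; // keep the loop from being eliminated

    public static void main(String[] args) {
        long sum = 0;
        // int-counted loop: may be compiled without a safepoint poll,
        // so TTSP grows with the loop's remaining iterations.
        for (int i = 0; i < 1_000_000_000; i++) {
            sum += i;
        }
        blackhole = sum;
        System.out.println("loop finished");
    }
}
```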

As for CLR, it doesn't have some of the safepoint tasks that are performed by the JVM (e.g. no nmethod sweeping/hotness marking, no biased lock revocation, no compiler deopt) so GC tends to be the only runtime induced pause.

I don't know much about the CLR's JIT capabilities, but given how similar C# is to Java in some stylistic aspects, I suspect that the CLR shares some deopt capabilities with the JVM (specifically, I'd be surprised if it doesn't do CHA-based devirtualization and/or guarded inlining)...

Vitaly Davidovich

unread,
Aug 16, 2015, 10:29:49 PM8/16/15
to mechanical-sympathy

The CLR JIT is nowhere near as sophisticated as, say, HotSpot; the closest comparison is probably the C1 compiler.  There's no interpreter and no tiered compilation, and hence no real profile information and not enough time to do sophisticated analyses.  There's nothing to base speculations on.  This isn't quite as dire as it would be in Java, since methods are non-virtual by default and generics are reified.  I actually think that if the CLR had the same aggressive deopt capabilities as HotSpot, it would be a significantly faster platform given the rest of its type system and capabilities/flexibility.  But the pertinent bit for this discussion is that the JIT is not a source of stalls other than the first invocation of a method, whereas it is in HotSpot.
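The value of profile-driven deopt on the JVM can be sketched with a minimal example. The comments describe what HotSpot is generally understood to do at a call site like this; the exact optimization applied depends on the JVM version and compilation tier.

```java
/**
 * Sketch of why speculative devirtualization + deopt matters on the JVM:
 * Java calls are virtual by default, but with only one implementer of an
 * interface loaded, HotSpot can inline the target directly and install a
 * dependency that triggers deoptimization if another class is loaded later.
 */
interface Shape {
    double area();
}

final class Circle implements Shape {
    final double r;
    Circle(double r) { this.r = r; }
    public double area() { return Math.PI * r * r; }
}

public class DevirtDemo {
    static double total(Shape[] shapes) {
        double sum = 0;
        for (Shape s : shapes) {
            // Monomorphic call site: with only Circle loaded, CHA lets
            // HotSpot inline area(); loading a second Shape implementation
            // later would invalidate the compiled code here (deopt).
            sum += s.area();
        }
        return sum;
    }

    public static void main(String[] args) {
        Shape[] shapes = { new Circle(1), new Circle(2) };
        System.out.println(total(shapes) > 15.0); // 5 * pi ~= 15.7
    }
}
```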

sent from my phone

Sanjoy Das

unread,
Aug 16, 2015, 10:45:31 PM8/16/15
to mechanica...@googlegroups.com
> I actually think that if the CLR had the
> same aggressive deopt capabilities as Hotspot it would be a significantly
> faster platform given the rest of the type system and
> capabilities/flexibility.

That's a two-edged sword -- keeping around enough state to deoptimize a
method can be fairly expensive, and if you give up on deoptimizing you do not
have that cost.

-- Sanjoy

Vitaly Davidovich

unread,
Aug 16, 2015, 10:54:42 PM8/16/15
to mechanical-sympathy

Yes, of course, there are many costs to carrying a more sophisticated profiling JIT, memory being one.  However, the deopt ability is what brings substantial gains to JITs in languages where the static compiler does nothing in terms of optimizations, dynamic class loading is possible, and virtual dispatch is heavy.

sent from my phone

Jean-Philippe BEMPEL

unread,
Aug 17, 2015, 2:56:13 AM8/17/15
to mechanical-sympathy
Good article on .NET JIT optimizations:


Vitaly Davidovich

unread,
Aug 17, 2015, 8:01:10 AM8/17/15
to mechanical-sympathy

If anyone is interested about the iface dispatch technique this article mentions, it's described here: https://github.com/dotnet/coreclr/blob/master/Documentation/botr/virtual-stub-dispatch.md

Note that this is a dispatch optimization for interfaces only, to avoid the itable lookup when the callsite is monomorphic; it's not guarded inlining (PIC) or CHA like HotSpot.

sent from my phone
