Jason, JNI doesn't have this problem. Unsafe does. JNI carries a per-call cost in both directions, but it does not serialize global safepoints.
The JVM serializes and sees a high TTSP only with threads that go for a prolonged period of time without crossing a safepoint opportunity (a point where the JVM can have the thread stall safely). JNI doesn't have this problem, as each JNI call is "one big safepoint opportunity".
When I said "with regular JNI calls (in which your code runs in a safepoint)" below, I was referring to the fact that the entire JNI code execution is a safepoint from the thread's perspective: the JVM can safely look at the thread's stack and machine state anywhere during the JNI execution. In fact, JNI code keeps freely executing even during a global JVM safepoint (you just can't execute past it). In most JVMs, entering JNI releases the thread's "JVM lock", allowing the JVM to grab it at will and preventing the thread from proceeding out of the JNI call (by either returning to the calling Java code or by calling into a JNI C API function that interacts with heap state).
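A toy model may make the lock handoff concrete. This is my own sketch, not real JVM internals (the class and method names are illustrative): one permit stands in for the thread's "JVM lock"; running Java code holds it, entering JNI releases it, and leaving JNI must win it back.

```java
import java.util.concurrent.Semaphore;

// Toy model (not real JVM code) of the per-thread "JVM lock" handoff.
// While running Java code a thread holds its lock; entering JNI releases
// it, so the VM can take it at will; the thread re-acquires it on the way
// back into Java, blocking there if a safepoint is in progress.
public class JvmLockModel {
    // One permit = the thread's "JVM lock". A Semaphore (unlike a
    // ReentrantLock) lets a "VM-side" acquire fail while Java code runs.
    private final Semaphore jvmLock = new Semaphore(1);

    void enterJava() { jvmLock.acquireUninterruptibly(); }
    void leaveJava() { jvmLock.release(); }

    // JNI entry hands the lock back; JNI exit must re-acquire it, which
    // is where a thread stalls if a global safepoint is underway.
    void enterJni()  { jvmLock.release(); }
    void leaveJni()  { jvmLock.acquireUninterruptibly(); }

    // The VM brings this thread to a safepoint by taking its lock. For a
    // thread off in JNI the lock is free, so this succeeds immediately;
    // for a thread running Java code it must wait for the thread to poll.
    boolean tryBeginSafepoint() { return jvmLock.tryAcquire(); }
    void endSafepoint()         { jvmLock.release(); }
}
```

The asymmetry falls out directly: a safepoint attempt against a thread "in JNI" succeeds at once, while one against a thread "in Java" has to wait.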
In contrast to JNI code, threads running normal/regular Java code and other runtime-but-not-JNI code hold onto their JVM lock and don't let it go until asked to do so via some sort of "please come to a safepoint" request. When a thread notices the request (which only happens as it crosses a safepoint opportunity), it hands its JVM lock to the JVM and waits to be released.
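The request/poll protocol can be sketched like this (again a simplified model of my own, not HotSpot or Zing code): the worker checks a flag at each loop back-edge, which is its "safepoint opportunity"; a long stretch of work with no poll in it would never notice the request.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Toy model of cooperative safepoint polling. The VM side sets
// safepointRequested; the worker only notices it at a poll point.
public class SafepointPoll {
    final AtomicBoolean safepointRequested = new AtomicBoolean(false);
    volatile long atSafepointCount = 0;

    void workLoop(int iterations) {
        for (int i = 0; i < iterations; i++) {
            doWork();
            if (safepointRequested.get()) {   // the poll (loop back-edge)
                atSafepointCount++;           // reached a safepoint; a real
                                              // JVM would block here until
                                              // the VM releases the thread
                safepointRequested.set(false);
            }
        }
    }

    void doWork() { /* application work between polls */ }
}
```

TTSP is then simply the time from setting the flag to the worker's next poll, which is why work that runs long between polls is the enemy.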
It's code that goes for a long period of time without crossing a safepoint opportunity (or without already being at one, as is the case with JNI and pretty much all blocking calls) that is problematic. That's where long TTSPs come from. There are plenty of examples of normal Java things that can "accidentally" cause high TTSPs on JVMs that don't specifically work to avoid them. Classic examples are array copies and large object allocations.
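To show the shape of the array-copy case and its standard mitigation (a sketch in user code; some JVMs chunk the arraycopy intrinsic internally, and the chunk size here is arbitrary): one giant copy offers no safepoint opportunity until it finishes, while a chunked copy restores an opportunity at each loop back-edge.

```java
import java.util.Arrays;

// A single bulk System.arraycopy of a huge array can run to completion
// with no safepoint poll, contributing directly to TTSP on JVMs that
// don't chunk the intrinsic. Copying in chunks bounds the poll-free
// stretch to one chunk's copy time.
public class ChunkedCopy {
    static final int CHUNK = 64 * 1024; // arbitrary illustrative size

    // No safepoint opportunity until the whole copy is done.
    static void bulkCopy(byte[] src, byte[] dst) {
        System.arraycopy(src, 0, dst, 0, src.length);
    }

    // The loop back-edge between chunks is a safepoint opportunity.
    static void chunkedCopy(byte[] src, byte[] dst) {
        for (int off = 0; off < src.length; off += CHUNK) {
            int len = Math.min(CHUNK, src.length - off);
            System.arraycopy(src, off, dst, off, len);
        }
    }
}
```

The same chunking idea applies to large allocations and other long-running, poll-free intrinsics.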
Many long TTSPs in most JVMs are "the JVM's fault", while in JVMs that really care about latency consistency, a lot of engineering tends to be invested under the hood to minimize those long-TTSP paths.
In Zing, we have a built-in TTSP profiler that lets us hunt down long TTSP paths, and both Azul and our latency-sensitive customers make frequent use of it to work out TTSP kinks. Using this profiling, the Zing JVM has had years of TTSP reduction work done to minimize these paths, but there are examples where a multi-millisecond TTSP is a result of user code semantics, is not "the JVM's fault", and is something you can affect (in both good and bad ways). A classic example is memory access into a mapped file whose contents are not locked in memory. Such an access can stall the accessing thread (not at a safepoint) for many milliseconds as the buffer is brought into memory. Our customers often find this sort of thing with our TTSP profiler, without waiting for the Russian roulette to roll around to the unlucky slot in production.
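A common mitigation for that mapped-file case, sketched below under my own illustrative names: pre-touch the mapping at startup so the page faults happen there rather than mid-flight in latency-critical code. Note that MappedByteBuffer.load() is only a hint to the OS; truly pinning the pages needs something like mlock outside pure Java.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.*;

// Map a file and ask the OS to fault all of its pages in up front, so a
// later access from a hot path doesn't stall the thread (outside a
// safepoint) waiting on disk. load() is best-effort, not a page lock.
public class PreTouchMapping {
    static MappedByteBuffer mapAndPreTouch(Path file) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer buf =
                ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
            buf.load(); // pay the page-fault cost now, at startup
            return buf; // the mapping stays valid after the channel closes
        }
    }
}
```

The trade-off is deliberate: you move the milliseconds of fault latency to initialization, where nobody is measuring TTSP.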
Sent from my iPad