Hey smart folks, I have a conundrum and Google is failing me.
As you may know, we have been maintaining the Java Native Runtime libraries for providing FFI from Java pre-Panama. These libraries handle loading and binding C functions to Java endpoints automagically.
Unfortunately, jffi -- the base library of the JNR stack -- has a few off-heap structures it allocates to support FFI calls. Those structures are generally held in static fields and cleaned up via finalization.
This seems to be a somewhat fatal design flaw in situations where the classloader that started up jffi might go away long before the JVM shuts down.
I've got a segfault, and all signs point toward it being a case of trying to call the JNI C code in jffi *after* the classloader has finalized and unloaded the library. The si_addr of the SIGSEGV and the top frame of the stack are the same address, which tells me that the segfault was caused by trying to call the JNI C code, which in this case is custom code to clean up those off-heap resources.
I have found no easy answer to this problem. You can't tell when your classloader unloads, and as far as I can tell you can't tell that the JNI library has gone away. And of course you can't guarantee finalization order. Sometimes, it works fine. But eventually, it fails. My logging of classloader finalization versus data freeing ends like this:
I have not come up with any solution. These off-heap structures are tied to the lifecycle of the JNI backend, but there's no obvious way to clean them up just before the JNI backend gets unloaded.
1. Does it seem like I'm on the right track?
2. Anyone have ideas for dealing with this? My best idea right now is to add a bunch of smarts to JNI_OnUnload that tidies everything up, rather than allowing finalization to do it at some indeterminate time in the future.
3. Why does JNI + classloading suck so bad?
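
To make the idea in question 2 a bit more concrete, here is a rough sketch of the bookkeeping I have in mind, written in Java only so it can be run standalone. Everything in it is hypothetical: NativeResourceRegistry, register, release and releaseAll are made-up names, not jffi API, and the LongConsumer stands in for the real native free call. In the actual library the equivalent table would presumably have to live on the C side so that JNI_OnUnload can walk it, and this does nothing to solve the problem of knowing when the unload happens.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.function.LongConsumer;

/** Hypothetical registry of off-heap addresses owned by one native backend. */
final class NativeResourceRegistry {
    private final Set<Long> live = new HashSet<>();
    private final LongConsumer nativeFree; // stand-in for the real native free call
    private boolean unloaded = false;

    NativeResourceRegistry(LongConsumer nativeFree) {
        this.nativeFree = nativeFree;
    }

    /** Track a new allocation; refused once the backend has been torn down. */
    synchronized boolean register(long address) {
        if (unloaded) {
            return false;
        }
        return live.add(address);
    }

    /** Finalizer-style release; a no-op after teardown or for unknown addresses. */
    synchronized boolean release(long address) {
        if (unloaded || !live.remove(address)) {
            return false;
        }
        nativeFree.accept(address);
        return true;
    }

    /** Teardown hook: free everything still live, then block all later frees. */
    synchronized int releaseAll() {
        if (unloaded) {
            return 0;
        }
        unloaded = true;
        int freed = 0;
        for (long address : live) {
            nativeFree.accept(address);
            freed++;
        }
        live.clear();
        return freed;
    }
}
```

The point of the sketch is only the ordering guarantee: once releaseAll has run, a late finalizer calling release never reaches the native free call, which is the call that appears to be segfaulting for me.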