Hi,
I’ve been looking to use Groovy on an upcoming project, but Groovy objects not getting GCed (and the resulting classes not being cleared out of PermGen) is proving to be a showstopper.
The following code sufficiently demonstrates the problem:
import groovy.lang.GroovyShell;

public class DemoGroovy {
    public static void main(String[] args) throws Exception {
        GroovyShell gs = new GroovyShell();
        Object result = gs.evaluate("println 'Hello, World';");
        Thread.sleep(5000L);
        gs = null;
        result = null;
        System.gc();
        System.gc();
        Thread.sleep(600000L);
    }
}
Running the above with these options (on Linux-x64, Java 1.7.0_15):
java -verbose:gc -XX:+UnlockDiagnosticVMOptions -XX:+UnlockExperimentalVMOptions -XX:+CMSClassUnloadingEnabled -XX:+ExplicitGCInvokesConcurrentAndUnloadsClasses -XX:+TraceClassUnloading -XX:+CMSPermGenSweepingEnabled -cp .:groovy-2.1.2.jar:asm-4.0.jar:antlr-2.7.7.jar DemoGroovy
shows zero classes being removed from PermGen. A heap dump of the process in VisualVM shows many Groovy objects still live in the VM. These apparently cannot be garbage collected, which means all of the corresponding classes have to remain in PermGen. (I’ve tried many variations of the above, including writing my own classloader and running the GroovyShell evaluation in its own thread that then dies, and verifying that my own classloader and thread are properly disposed of. The problem still occurs. A separate test thread that uses a couple of test classes/objects shows that those test classes/objects are GCed and unloaded from PermGen just fine.)
The code that would normally be running the Groovy script exists as a loadable module in a Java application that runs under Tomcat. This application loads our module with an isolated classloader. Our extension runs fine and runs Groovy scripts as expected, until it comes time to unload the extension on Windows systems: since all the Groovy objects are still on the heap, and the classes are still in PermGen, the Windows JVM ends up holding a file lock on the Groovy JAR file. Trying to remove the module from disk fails because the Groovy JAR is locked. I’ve thought about using OSGi and deploying Groovy as a bundle, but that is not an option for me (for a variety of reasons).
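For readers unfamiliar with how a JAR handle is normally released, here is a minimal sketch of the Java 7 URLClassLoader.close() call, which closes the JAR files a loader has opened so they can be deleted. It is only an illustration under assumptions: the class name CloseLoaderDemo, the throwaway JAR, and the hello.txt entry are all hypothetical, and nothing in this thread confirms that the host application exposes its isolated classloader or that closing it would resolve the lock described above.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.util.jar.JarEntry;
import java.util.jar.JarOutputStream;

// Hypothetical demo class; it does not use Groovy or Tomcat.
final class CloseLoaderDemo {

    // Builds a throwaway JAR with one text entry (a stand-in for a module JAR).
    static File createSampleJar() throws IOException {
        File jar = File.createTempFile("module", ".jar");
        try (JarOutputStream out = new JarOutputStream(new FileOutputStream(jar))) {
            out.putNextEntry(new JarEntry("hello.txt"));
            out.write("hello".getBytes(StandardCharsets.UTF_8));
            out.closeEntry();
        }
        return jar;
    }

    // Reads a resource through an isolated loader, closes the loader,
    // then tries to delete the JAR and reports whether deletion succeeded.
    static boolean loadCloseDelete() {
        try {
            File jar = createSampleJar();
            URL[] urls = { jar.toURI().toURL() };
            try (URLClassLoader loader = new URLClassLoader(urls, null);
                 InputStream in = loader.getResourceAsStream("hello.txt")) {
                if (in == null) {
                    throw new IOException("resource not found in sample JAR");
                }
                in.read(); // touch the JAR so the loader really opens it
            } // close() releases the JAR files opened by this loader
            return jar.delete();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("JAR deleted after close(): " + loadCloseDelete());
    }
}
```

On Linux the delete succeeds whether or not the loader is closed, so the difference only shows up on Windows. Handles opened by other code paths are not covered by this sketch.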
I’d be terribly disappointed not to be able to embed Groovy, since the DSL/builder features are perfect for a feature I am trying to create, but the JAR locking issue is a complete showstopper. If anyone has any ideas, I’d really appreciate them.
Thanks,
-Bob
long id = 0
while (true) {
    GroovyShell gs = new GroovyShell();
    def result = gs.evaluate("class Foo$id{ }; new Foo$id()");
    id++
}
-- Cédric Champeau SpringSource - A Division Of VMware http://www.springsource.com/ http://twitter.com/CedricChampeau
Actually, that’s incorrect:
-XX:+CMSClassUnloadingEnabled -XX:+ExplicitGCInvokesConcurrentAndUnloadsClasses pretty much guarantees (with OpenJDK 7) that unused classes are GCed from PermGen. (I have code that demonstrates that exact behavior and have verified it using -XX:+TraceClassUnloading.)
Regardless of whether they are or not, that’s not the fundamental problem: Groovy appears to be allocating objects on the heap that cannot be garbage collected, so the underlying class definitions can’t be swept out of PermGen, anyway.
I would expect that if I embed Groovy in my Java application via (new GroovyShell()).evaluate(), then once the allocated GroovyShell object has been disposed of, all of the related Groovy objects would be disposed of as well.
To summarize: my use case is “I need all of these allocated Groovy objects to go away after evaluation,” not “Groovy doesn’t have a memory leak from repeated calls to GroovyShell.evaluate().” So, is there a method call or set of method calls in Groovy where I can make that happen?
Thanks,
-Bob
$ cat > DemoGroovy.java << DELIM
import groovy.lang.GroovyShell;

public class DemoGroovy {
    public static void main(String[] args) throws Exception {
        GroovyShell gs;
        Object result;
        for (int i = 0; i < 10000000; i++) {
            gs = new GroovyShell();
            result = gs.evaluate(" 'Hello, World';");
        }
        System.out.println("Sleeping before gc");
        Thread.sleep(5000L);
        gs = null;
        result = null;
        System.gc();
        System.gc();
        Thread.sleep(5000L);
    }
}
DELIM
$ javac -cp groovy-all.jar DemoGroovy.java
$ /opt/jdk1.7.0_17/bin/java -cp .:groovy-all.jar -verbose:class \
    -XX:+UnlockDiagnosticVMOptions -XX:+UnlockExperimentalVMOptions \
    -XX:+CMSClassUnloadingEnabled \
    -XX:+ExplicitGCInvokesConcurrentAndUnloadsClasses \
    -XX:+TraceClassUnloading -XX:MaxPermSize=16M \
    -XX:+UseConcMarkSweepGC -XX:+ExplicitGCInvokesConcurrent \
    -XX:+PrintGCDetails -XX:-UseParNewGC DemoGroovy
[Loaded Script1 from file:/groovy/shell]
[Loaded Script1 from file:/groovy/shell]
[Loaded Script1 from file:/groovy/shell]
[Loaded Script1 from file:/groovy/shell]
[Loaded Script1 from file:/groovy/shell]
[Loaded Script1 from file:/groovy/shell]
[Loaded Script1 from file:/groovy/shell]
// snip
[Unloading class Script1]
[Unloading class Script1]
[Unloading class Script1]
[Unloading class Script1]
[Unloading class Script1]
[Unloading class Script1]
[Unloading class Script1]
[Unloading class Script1]
[Unloading class Script1]
[Unloading class Script1]
// snip
Your test case below does not test my use case. I don’t think you’re understanding what I’m actually saying here:
Step 1: Allocate a GroovyShell
Step 2: Execute something with GroovyShell.evaluate()
Step 3: Dispose of the allocated GroovyShell
Step 4: Force a GC (with VisualVM or System.gc())
After Step 4, with any normal system, all of the objects allocated starting at the top of the tree with GroovyShell should be gone from the heap – entirely. At this point, I should be able to reclaim PermGen space because there are no more Groovy object instances on the heap.
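The four steps above can be turned into a small reachability check. This is a sketch, not anything from Groovy itself: the class ReachabilityProbe, its method, and the pinned field are hypothetical names, and it uses Java 8 lambdas for brevity. It keeps only a WeakReference to the object a factory produces, requests GC, and reports whether the reference was cleared. With Groovy on the classpath, the factory could create a GroovyShell, call evaluate(), and return the shell.

```java
import java.lang.ref.WeakReference;
import java.util.function.Supplier;

// Hypothetical helper: checks whether an object becomes unreachable once
// the caller has dropped its only strong reference to it.
final class ReachabilityProbe {

    // Stands in for a static cache that outlives the object's creator.
    static Object pinned;

    // Steps 1-4: allocate and use (inside the factory), drop the strong
    // reference, then request GC until the weak reference is cleared or
    // the attempts run out.
    static boolean isCollectedAfterUse(Supplier<Object> factory) {
        WeakReference<Object> ref = new WeakReference<>(factory.get());
        for (int attempt = 0; attempt < 50 && ref.get() != null; attempt++) {
            System.gc(); // only a request; the JVM is free to ignore it
            try {
                Thread.sleep(20L);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return ref.get() == null;
    }

    public static void main(String[] args) {
        // Nothing else refers to this array, so it should be collectable.
        boolean plain = isCollectedAfterUse(() -> new byte[1024]);

        // This array is also stored in a static field, so it stays reachable.
        boolean held = isCollectedAfterUse(() -> {
            pinned = new byte[1024];
            return pinned;
        });

        System.out.println("unreferenced object collected: " + plain);
        System.out.println("statically pinned object collected: " + held);
    }
}
```

A weak reference is used here because it is cleared as soon as the object is no longer strongly or softly reachable, which makes it a convenient probe for "is anything still holding this?".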
What *you* are saying is that there is no PermGen/heap memory leak, because you can run a million iterations and the memory usage stays fixed. The problem with that test is that Groovy classes/objects occupy a fixed amount of PermGen/heap space on the first allocation, and that allocation never changes: there is no *accumulating* leak, but there is an *initial* leak.
In any case, this is trivial to test: run a demo that is guaranteed to throw away the GroovyShell and all of its classes. Set the PermGen size so that the demo has just enough space to allocate all of its classes and no more. Once you have the PermGen size set correctly, attach VisualVM with the flags I originally outlined. VisualVM will force the JVM to load RMI classes, so if what you think is true, the Groovy classes will get forced out – after all, if there’s no PermGen leak, all of the Groovy classes should be unloaded, right?
I can tell you what you’ll see, though: you might see a single Groovy class unloaded, but then you’ll see errors from the demo program because the JVM can’t allocate the RMI classes, because the Groovy classes are stuck in PermGen.
“Whatever the option ExplicitGCInvokesConcurrentAndUnloadsClass says (btw, it's not well documented), it doesn't work as expected.”
It works exactly as expected. I have code that demonstrates that it works exactly as expected, *on code that doesn’t load the Groovy JAR and run GroovyShell.evaluate()*. If you’re running it on a demo that runs GroovyShell.evaluate(), it doesn’t work at all, for the reasons I’ve mentioned above.
Look, you can try to demonstrate to me that I’m not seeing Unloaded class notifications until you’re blue in the face, but the fact is that I see them. I can easily trigger all unused classes getting swept out of PermGen, with the corresponding Unloaded notification for each class, using OpenJDK 7 on Linux-x64.
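Besides reading the -XX:+TraceClassUnloading output, the JVM-wide class counters can be read programmatically through the standard java.lang.management API. The following is a minimal sketch; the class name ClassUnloadReport and its methods are hypothetical, and the memory pool names it prints are JVM-specific (a PermGen pool is only expected on HotSpot versions that still have one).

```java
import java.lang.management.ClassLoadingMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;

// Hypothetical reporting helper built on the standard management beans.
final class ClassUnloadReport {

    // Number of classes unloaded since the JVM started.
    static long unloadedClasses() {
        return ManagementFactory.getClassLoadingMXBean().getUnloadedClassCount();
    }

    // One line of class counters followed by one line per memory pool.
    static String describe() {
        ClassLoadingMXBean cl = ManagementFactory.getClassLoadingMXBean();
        StringBuilder sb = new StringBuilder();
        sb.append("loaded=").append(cl.getLoadedClassCount())
          .append(" totalLoaded=").append(cl.getTotalLoadedClassCount())
          .append(" unloaded=").append(cl.getUnloadedClassCount())
          .append('\n');
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage(); // may be null for an invalid pool
            if (usage != null) {
                sb.append(pool.getName()).append(" used=")
                  .append(usage.getUsed()).append('\n');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        long before = unloadedClasses();
        System.out.print(describe());
        System.gc(); // a request only; unloading also depends on the GC flags in use
        System.out.println("unloaded since first snapshot: " + (unloadedClasses() - before));
    }
}
```

Calling describe() before and after disposing of a module gives a quick numeric cross-check against the trace output.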
Now, read this carefully: the classes locked in PermGen are not the fundamental problem. (Yes, classes locked in PermGen cause the JAR to be locked on Windows because the JVM is keeping the JAR open for the class definitions. However, this is the *symptom* we see, and I am debugging the problem by addressing the symptom - trying to get classes swept out of PermGen, because if I can get classes swept out of PermGen, that means the classes are unused, the JAR can be closed by the JVM and Windows should unlock it). The fundamental problem is that Groovy spews a static set of objects all over the heap, which are not GCed when the root Groovy object is disposed of.
Further, I actually found out what the problem in Groovy is: a series of SoftReference objects that are keeping the instances (and underlying classes) loaded. Forcing all the SoftReference objects to be GCed after disposing of GroovyShell has resulted in every single Groovy instance being destroyed and all of the underlying classes being forced out of PermGen, with the correct ClassLoader/Thread code and the correct JVM options set.
Unfortunately, there is no good way of clearing the SoftReferences that belong to a single loaded package hierarchy. SoftReferences are generally cleared only when memory pressure is very high (such as trying to allocate more memory than is available in the JVM heap), and this results in ALL SoftReferences being cleared. This is undesirable in a system that loads/unloads modules, as it will result in performance degradation of the core system or of other modules that are also using SoftReferences.
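The difference between soft and weak references under an explicit GC can be seen with a small sketch like the one below. The class name SoftVsWeak and its method are hypothetical. The only behavior guaranteed by the java.lang.ref documentation is that softly reachable objects are cleared before the JVM throws OutOfMemoryError; when they are cleared before that point is up to the JVM, so the soft result is printed rather than assumed.

```java
import java.lang.ref.SoftReference;
import java.lang.ref.WeakReference;

// Hypothetical demo: compares how a weak and a soft reference respond
// to explicit GC requests when the heap is not under pressure.
final class SoftVsWeak {

    // Returns { weakCleared, softCleared } after requesting GC until the
    // weak reference is cleared or the attempts run out.
    static boolean[] afterExplicitGc() {
        WeakReference<Object> weak = new WeakReference<>(new Object());
        SoftReference<Object> soft = new SoftReference<>(new Object());
        for (int attempt = 0; attempt < 50 && weak.get() != null; attempt++) {
            System.gc(); // only a request; the JVM is free to ignore it
            try {
                Thread.sleep(20L);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return new boolean[] { weak.get() == null, soft.get() == null };
    }

    public static void main(String[] args) {
        boolean[] cleared = afterExplicitGc();
        System.out.println("weak reference cleared: " + cleared[0]);
        System.out.println("soft reference cleared: " + cleared[1]);
    }
}
```

On HotSpot, the -XX:SoftRefLRUPolicyMSPerMB option influences how long softly reachable objects are retained. Like memory pressure, it applies to the whole JVM rather than to one module, so it has the same drawback described above.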
I’ll probably be filing a JIRA issue about this, but first I’m going to verify that the JAR locking issue really is resolved by forcing the JVM to evict all of the SoftReferences that Groovy is creating (and is not caused by something fundamentally broken, like ClassLoader.getResource().openStream() getting called somewhere in Groovy).
If there’s any frustration there, it’s with Cedric for refusing to believe the observations I had made. (I don’t mind being told I’m wrong with the reasons why I’m wrong, but I do take some offense to being called a liar.) It’s doubly frustrating when I get responses like that when I’m trying to promote the use of a great tool – Groovy – in a corporate environment, because now I have to temper that enthusiasm with the extra time cost it might take to discuss possible issues with Groovy on the ML.
In any case, I apologize if that frustration came through in the email.
I agree with your experience on GC issues, but I feel like I’ve got a pretty solid case and evidence here. (The VM options for probing GC issues in Java 7 help a lot.)
-Bob
So my understanding here is that our custom subclass of SoftReference causes unexpected behaviour in the JVM. Even if the reference itself is no longer used, the fact that it is a subclass of SoftReference causes it to be held by the JVM (probably because SoftReference is used internally by the GC and so has special behaviour). It's held because it has been loaded once (which is the meaning of <loader> of SoftRef). This is indeed problematic, and totally unexpected.