Ability to force class unloading in JDK 6 Update 4?


Charles Oliver Nutter

Jan 14, 2008, 2:49:40 AM
to jvm-la...@googlegroups.com
I stumbled across a blog post mentioning a new feature in an
"experimental VM"...but perhaps it's now in JDK 6 Update 4?

-XX:+ExplicitGCInvokesConcurrentAndUnloadsClasses

Anyone know about it?

http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6541037

I would expect that it doesn't change the fact that ClassLoader holds a
hard reference to all classes it loads, but perhaps it helps force
dereferenced classes in dereferenced classloaders to get collected more
quickly?

- Charlie

Attila Szegedi

Jan 14, 2008, 3:42:24 AM
to jvm-la...@googlegroups.com
Sounds like that's exactly what it does - includes a permgen sweep on
explicit System.gc() invocation.

I have very mixed feelings about this. I mean, in a properly
implemented (ideal) JVM, why would you ever need this (or any other
explicit GC, for that matter)?

OTOH, leaving ideal worlds aside, it can be a blessing if you'd ever
have a programmatic need to nudge (a real world, non-ideal) VM into
freeing up unused code.

But still, the implementation bothers me. Applications shouldn't have
explicit System.gc() invocations, and even if I wanted to suggest to the
JVM that it'd be a good time to do a permgen sweep, I think it's a bad
idea to overload System.gc() as the API for that purpose (and then
operationally rely on the JVM being launched with this new command-line
flag). Adding a method to either ClassLoadingMXBean or
GarbageCollectorMXBean in java.lang.management would've been a much
cleaner approach.
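
For comparison, java.lang.management can already report class unloading, even though it offers no way to request it. A minimal sketch, assuming a HotSpot-style JVM; UnloadProbe is a made-up name, and whether System.gc() unloads anything depends on the collector and on launch flags such as the one discussed above:

```java
import java.lang.management.ClassLoadingMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper: reads the class-unloading counter around an explicit GC request.
class UnloadProbe {
    // Returns {unloadedBefore, unloadedAfter} as reported by the platform MXBean.
    static long[] probe() {
        ClassLoadingMXBean bean = ManagementFactory.getClassLoadingMXBean();
        long before = bean.getUnloadedClassCount();
        System.gc(); // only a request; the JVM may ignore it or skip class unloading
        long after = bean.getUnloadedClassCount();
        return new long[] { before, after };
    }

    public static void main(String[] args) {
        long[] counts = probe();
        System.out.println("unloaded before=" + counts[0] + ", after=" + counts[1]);
    }
}
```

The counter is cumulative, so the interesting number is the difference between the two readings.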

Attila.

Kresten Krab Thorup

Jan 14, 2008, 8:06:41 AM
to JVM Languages
I think that the best way to improve this would be to add a new kind
of class loader that would permit unloading classes loaded by it.
That would relieve us from creating a new class loader for every class
that needs to be independently unloaded (such as compiled methods).

I have experimented in different VMs trying to remove the hard-link
from the class loader to the specific classes by means of reflection,
but it doesn't really work consistently and often kills the VM.

Kresten

Matthias Ernst

Jan 14, 2008, 8:26:13 AM
to jvm-la...@googlegroups.com
On Jan 14, 2008 2:06 PM, Kresten Krab Thorup <kr...@trifork.com> wrote:
>
> I think that the best way to improve this would be to add a new kind
> of class loader that would permit unloading classes loaded by it.
> That would relieve us from creating a new class loader for every class
> that needs to be independently unloaded (such as compiled methods).

Is the overhead substantial for individual class loaders apart from
the Java heap? Do they live in the permgen? Do they complicate the
verification algorithm? Has anyone investigated that?

Matthias

Attila Szegedi

Jan 14, 2008, 8:50:58 AM
to jvm-la...@googlegroups.com

No, they don't really complicate anything as far as I know. They don't
live in permgen. They're just ordinary Java objects usually referenced
from Class objects and Thread objects, and can be GCed (except for
system class loader).

They do have a bit of a memory footprint -- in addition to their own
fields, each does create one 16-element hashset, one 16-element
hashmap, one 11-element hashmap, and one 10-element vector.

Rhino has used the classloader-per-function scheme forever and it's
never been a big deal. I do agree that it'd be nice if there existed
an atomically loadable/unloadable unit of code in JVM that's lighter
than the currently only possible "Method-in-a-Class-in-a-ClassLoader",
even just because it'd make this rather baroque construct no longer
necessary, but even the current situation itself is, well, tolerable.

Attila.

--
home: http://www.szegedi.org
weblog: http://constc.blogspot.com


Charles Oliver Nutter

Jan 14, 2008, 4:25:32 PM
to jvm-la...@googlegroups.com
Attila Szegedi wrote:
>
> On 2008.01.14., at 14:26, Matthias Ernst wrote:
>
>> On Jan 14, 2008 2:06 PM, Kresten Krab Thorup <kr...@trifork.com> wrote:
>>> I think that the best way to improve this would be to add a new kind
>>> of class loader that would permit unloading classes loaded by it.
>>> That would relieve us from creating a new class loader for every
>>> class
>>> that needs to be independently unloaded (such as compiled methods).
>> Is the overhead substantial for individual class loaders apart from
>> the Java heap? Do they
>> live in the permgen? Do they complicate the verification algorithm?
>> Has anyone investigated that?
>
> No, they don't really complicate anything as far as I know. They don't
> live in permgen. They're just ordinary Java objects usually referenced
> from Class objects and Thread objects, and can be GCed (except for
> system class loader).
>
> They do have a bit of a memory footprint -- in addition to their own
> fields, each does create one 16-element hashset, one 16-element
> hashmap, one 11-element hashmap, and one 10-element vector.

In really large applications, this footprint can become a real pain
though. Should an application that compiles 10k methods need 10k
classloaders taking up heap space?

Also, it's just plain gross...I should be able to specify that "this
class is transient, and I don't care if it gets GCed when all instances
of it go away" rather than having to hack around this.

Plus then there's the fact that this "sticky" behavior is on-purpose and
not overridable, because of this line from ClassLoader.java:

// The classes loaded by this class loader. The only purpose of this table
// is to keep the classes from being GC'ed until the loader is GC'ed.
private Vector classes = new Vector();

Talk about irritating. So now not only do I have to have a classloader
per "method class", I have to pay the cost of an additional Vector. Suck.

I've honestly considered hacking around this in JRuby because it's
really irritating and increases the memory load substantially.

> Rhino uses the classloader-per-function scheme since forever and it's
> never been a big deal. I do agree that it'd be nice if there existed
> an atomically loadable/unloadable unit of code in JVM that's lighter
> than the currently only possible "Method-in-a-Class-in-a-ClassLoader",
> even just because it'd make this rather baroque construct no longer
> necessary, but even the current situation itself is, well, tolerable.

JRuby absolutely needs a lighter-weight construct like John Rose's
autonomous methods, because at the end of the day we're only interested
in getting the bytecode callable. JRuby (and other language impls) have
to hack around the fact that each compiled method has to live in a class
with its own metadata (in permgen), symbol tables (in permgen), and
symbolic name (parts in permgen and parts in classloader, for bits of
code we don't really need unique names for). If I could pick one feature
I'd like to have for JRuby, this would be it.

To further solidify my point...JRuby 1.1RC2 will include a new config
property to limit the total number of JITed methods at any time, so it
doesn't try to JIT everything and use up more permgen than it should.
Granted, there should be a limit either way, or an LRU cache of some
kind. But having to do it because every method takes up too much permgen
is pretty disgusting.
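
A bounded LRU cache of the kind mentioned above could look roughly like this. It is only a sketch built on java.util.LinkedHashMap; JitCache and its keys are made-up names and not JRuby's actual implementation or configuration property:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical bounded cache of compiled methods; names are illustrative only.
class JitCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    JitCache(int maxEntries) {
        super(16, 0.75f, true); // accessOrder=true keeps least-recently-used entries first
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Evict once the cap is exceeded, dropping the cache's reference to the compiled code.
        return size() > maxEntries;
    }

    public static void main(String[] args) {
        JitCache<String, Runnable> cache = new JitCache<>(2);
        cache.put("foo", () -> {});
        cache.put("bar", () -> {});
        cache.get("foo");                   // touch "foo" so "bar" becomes the eldest
        cache.put("baz", () -> {});         // exceeds the cap and evicts "bar"
        System.out.println(cache.keySet()); // prints [foo, baz]
    }
}
```

Evicting an entry only drops a reference. The permgen is reclaimed only once the evicted method's class and its classloader are otherwise unreachable.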

- Charlie

Matthias Ernst

Jan 14, 2008, 4:53:01 PM
to jvm-la...@googlegroups.com
On Jan 14, 2008 10:25 PM, Charles Oliver Nutter <charles...@sun.com> wrote:

> In really large applications, this footprint can become a real pain
> though. Should an application that compiles 10k methods need 10k
> classloaders taking up heap space?

That is about 6MB according to my book (I can allocate around
110000 classloaders in a 64MB heap). When you size all maps/vectors/sets
in ClassLoader to 1, you can go to 180000, i.e. around 3.5MB overhead
for 10000. I agree it isn't great, but a real pain?
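
The arithmetic behind those figures, as a rough sketch. It assumes the whole 64MB heap is filled with nothing but classloaders, so the per-loader numbers are upper bounds; LoaderFootprint is a made-up name:

```java
// Back-of-the-envelope estimate using only the figures quoted above.
class LoaderFootprint {
    // Approximate bytes per loader if `loaders` instances fill `heapBytes`.
    static long bytesPerLoader(long heapBytes, long loaders) {
        return heapBytes / loaders;
    }

    public static void main(String[] args) {
        long heap = 64L * 1024 * 1024;                   // 64MB heap
        long standard = bytesPerLoader(heap, 110_000);   // default collection sizes
        long trimmed = bytesPerLoader(heap, 180_000);    // all collections sized to 1
        System.out.println(standard + " bytes each, " + standard * 10_000 + " bytes per 10000");
        // prints 610 bytes each, 6100000 bytes per 10000
        System.out.println(trimmed + " bytes each, " + trimmed * 10_000 + " bytes per 10000");
        // prints 372 bytes each, 3720000 bytes per 10000
    }
}
```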

Matthias

Ted Neward

Jan 15, 2008, 12:30:53 AM
to jvm-la...@googlegroups.com
Hate to say it, Kresten, but I don't think you'll ever see this--the
semantics of unloading classes, except by GC, would need to be worked out
and codified someplace before this could happen.

Assume for the moment that we go with a simpler implementation,
System.classGC() that takes a Class object and unloads it. (Ignore the
complications of ClassLoaders for now.) Suppose...
(*) ... there is an instance of that Class still reachable in the system?
What should happen?
(*) ... this is an abstract class that is inherited by other classes in the
system (which may or may not have instances still reachable)?
(*) ... you're in the middle of a static method execution call on that class
on another Thread?
(*) ... another thread is *about* (say, five seconds from now) to make a
static method execution call on that class?

Reification of classes is its own brand of craziness, but it's actually a
simpler topic to consider than outright unloading of classes, IMHO.

Ted Neward
Java, .NET, XML Services
Consulting, Teaching, Speaking, Writing
http://www.tedneward.com

> -----Original Message-----
> From: jvm-la...@googlegroups.com [mailto:jvm-
> lang...@googlegroups.com] On Behalf Of Kresten Krab Thorup
> Sent: Monday, January 14, 2008 5:07 AM
> To: JVM Languages
> Subject: [jvm-l] Re: Ability to force class unloading in JDK 6 Update
> 4?
>
>

Ted Neward

Jan 15, 2008, 12:30:53 AM
to jvm-la...@googlegroups.com
+1.

But adding this to the GCMXBean would mean changing that interface, and that
gets into all sorts of versioning problems. Perhaps a better way would be to
create a new GCMXBean interface (let's call it GarbageCollectorMXBean2 just
for cruelty's sake) and offer it through the JMX server as well.

Ted Neward
Java, .NET, XML Services
Consulting, Teaching, Speaking, Writing
http://www.tedneward.com

Ted Neward

Jan 15, 2008, 12:30:53 AM
to jvm-la...@googlegroups.com
Don't forget, too, that in the face of JSR 277 the guys at Sun are going to
be very reluctant to change anything ClassLoader-related at this point.
Maybe for Java 8, but don't hold your breath.

Ted Neward
Java, .NET, XML Services
Consulting, Teaching, Speaking, Writing
http://www.tedneward.com

> -----Original Message-----
> From: jvm-la...@googlegroups.com [mailto:jvm-
> lang...@googlegroups.com] On Behalf Of Attila Szegedi
> Sent: Monday, January 14, 2008 5:51 AM
> To: jvm-la...@googlegroups.com

> Subject: [jvm-l] Re: Ability to force class unloading in JDK 6 Update
> 4?
>
>
>

Attila Szegedi

Jan 15, 2008, 3:52:15 AM
to jvm-la...@googlegroups.com

On 2008.01.15., at 6:30, Ted Neward wrote:

>
> +1.
>
> But adding this to the GCMXBean would mean changing that interface,
> and that
> gets into all sorts of versioning problems.

Well, the implementation itself is usually provided by the JVM, so it
shouldn't be too big a problem. Still, I see your point.

> Perhaps a better way would be to
> create a new GCMXBean interface (let's call it
> GarbageCollectorMXBean2 just
> for cruelty's sake)

You just reminded me of good ol' COM programming days: IClassFactory,
IClassFactory2. I think some DirectX interfaces actually reached into
threes...

Attila.

Charles Oliver Nutter

Jan 15, 2008, 3:57:08 AM
to jvm-la...@googlegroups.com
Ted Neward wrote:
> Hate to say it, Kresten, but I don't think you'll ever see this--the
> semantics of unloading classes, except by GC, would need to be worked out
> and codified someplace before this could happen.

I think the thing Kresten is looking for is an ability to unload classes
(or make them eligible for GC) when they are known to be unused. Or if
he's not, that's what I want.

> Assume for the moment that we go with a simpler implementation,
> System.classGC() that takes a Class object and unloads it. (Ignore the
> complications of ClassLoaders for now.) Suppose...
> (*) ... there is an instance of that Class still reachable in the system?
> What should happen?

It's a hard reference; there's nothing new here. An object references
its class.

> (*) ... this is an abstract class that is inherited by other classes in the
> system (which may or may not have instances still reachable)?

A child class has a hard reference to a parent class. No problem.

> (*) ... you're in the middle of a static method execution call on that class
> on another Thread?

Then there's a hard reference to that class from that thread or from
the frame executing it.

> (*) ... another thread is *about* (say, five seconds from now) to make a
> static method execution call on that class?

Can't predict the future; but if the caller already has a reference to
the class it wouldn't be able to go away.

In general, if all references to a class go away, there should be a way
to make it eligible for immediate GC.

The larger issues, as I understand them, relate to static initializers.
In order to guarantee the static initializers don't fire again under a
given classloader, there must be guarantees about classes staying alive.
If they could GC when there are no more references to them, their static
initializers could fire again.

However, there's a lot of uses for classes that don't require such
guarantees, such as generating anonymous classes to hold code compiled
at runtime. I don't care if the classes I compile in JRuby's JIT get
unloaded or reloaded, because I'll manage my own references to them. But
because of the Vector in ClassLoader...they tend to stick around too
long without ClassLoader tricks.

- Charlie

Charles Oliver Nutter

Jan 15, 2008, 4:00:33 AM
to jvm-la...@googlegroups.com

Well, let's think about it another way...

Rails apps, as an example, require a separate JRuby instance per
concurrent request, because of some issues with the way Rails is
designed (certain aspects aren't thread-safe). So if you want 10 apps
that can handle 10 concurrent requests each, you may have as many as 100
JRuby instances at a time in a given JVM. So now we're looking at 600MB
of ClassLoader.

Let's assume you can fix Rails to be thread-safe. Then you still have
60MB for ten apps on top of everything else. It's less of a pain, but
it's still 60MB of waste. Waste that has tended to give Java/JVM folks
like us a bad name.

- Charlie

Rémi Forax

Jan 15, 2008, 4:50:08 AM
to jvm-la...@googlegroups.com
Attila Szegedi wrote:

> On 2008.01.15., at 6:30, Ted Neward wrote:
>
>
>> +1.
>>
>> But adding this to the GCMXBean would mean changing that interface,
>> and that
>> gets into all sorts of versioning problems.
>>
>
> Well, the implementation itself is usually provided by the JVM, so it
> shouldn't be a too big a problem. Still, I see your point.
>
>
>> Perhaps a better way would be to
>> create a new GCMXBean interface (let's call it
>> GarbageCollectorMXBean2 just
>> for cruelty's sake)
>>
>
> You just reminded me of good ol' COM programming days: IClassFactory,
> IClassFactory2. I think some DirectX interfaces actually reached into
> threes...
>
COM is so old school; some Eclipse interfaces are numbered seven or eight
(and some in the middle are deprecated).
Interface versioning is actually a real pain and there is no real solution.
> Attila.
>
Rémi

Adam Bouhenguel

Jan 15, 2008, 5:02:55 AM
to jvm-la...@googlegroups.com
Why not allow class unloading only for classes without static
fields, static methods, or static initializers? That side-steps the
initialization issues and means that you need an instance in order to
use it (no static elements + no references = no problem unloading). Am
I correct in my line of thinking?
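
As a rough illustration of the check this proposal implies, here is a reflection-based sketch. UnloadEligibility and its nested classes are made-up names, and reflection cannot see static initializer blocks, so this covers only the fields and methods part of the rule:

```java
import java.lang.reflect.Field;
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

// Hypothetical eligibility check: true if the class declares no static fields or methods.
class UnloadEligibility {
    static boolean hasNoStatics(Class<?> c) {
        for (Field f : c.getDeclaredFields()) {
            if (Modifier.isStatic(f.getModifiers())) return false;
        }
        for (Method m : c.getDeclaredMethods()) {
            if (Modifier.isStatic(m.getModifiers())) return false;
        }
        return true; // static initializers are invisible to reflection and not checked
    }

    // Instance state only: would qualify under the proposed rule.
    static class Plain {
        int x;
        int get() { return x; }
    }

    // Has a static field: would not qualify.
    static class WithStatic {
        static int counter;
    }

    public static void main(String[] args) {
        System.out.println(hasNoStatics(Plain.class));      // prints true
        System.out.println(hasNoStatics(WithStatic.class)); // prints false
    }
}
```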

-Adam

John Wilson

Jan 15, 2008, 5:05:51 AM
to jvm-la...@googlegroups.com
If we could dynamically add/remove methods on classes would this not
provide an acceptable alternative?

e.g. when I compile a Lambda/Closure I add a private static method to
some handy class. I then instantiate my standard Lambda/Closure class
which delegates the call to the static method via reflection.

John Wilson

Ted Neward

Jan 17, 2008, 4:26:25 AM
to jvm-la...@googlegroups.com
All of the scenarios you describe below suggest, then, that so long as a
strong reference is held to the class, the class is not eligible for
unloading. That means, of course, that once all strong references have been
dropped, the class is now eligible for unloading.

Which is exactly how--in theory--it works today. Which means we're really
not changing anything. Which means I suspect you're not really talking about
just allowing GC to happen, but want to "force" it somehow, which means
going above and beyond the conservative collection behavior the JVM has
historically espoused. (I am trying to summarize and paraphrase here, to
make sure I understand the general thrust of what you're looking for--if I'm
putting words in your mouth, by all means, loudly and emphatically tell me.
;-) )

Dropping the Vector out of the ClassLoader base class I think breaks a bunch
of other stuff, but I don't remember why. (It's been a while since I tried
to track that guy down.) I vaguely recall it being added in 1.2 for a
particular reason. IIRC, there's another collection of hard references
buried inside the native code, from what I understand, called the Loaded
Class Cache (LCC), and those references would likely keep the class alive
even if you could yank it out of that Vector.

I also believe that Class needs to hard reference the ClassLoader in order
to support the getResource() behavior on Class, which means we can't get rid
of those strong references, either.

I dunno who at Sun owns the ClassLoader facilities in the JVM these days--I
got my LCC info from Peter Kessler (sp?) at Sun, IIRC. He/she would be the
right one to tell us why that Vector exists, or if what you want is even
remotely feasible under the current JVM architecture. Anybody know who that
might be? (I'd love to know, just so I can ask some related but off-topic
questions, too. ;-) )

Ted Neward
Java, .NET, XML Services
Consulting, Teaching, Speaking, Writing
http://www.tedneward.com

> -----Original Message-----
> From: jvm-la...@googlegroups.com [mailto:jvm-
> lang...@googlegroups.com] On Behalf Of Charles Oliver Nutter
> Sent: Tuesday, January 15, 2008 12:57 AM
> To: jvm-la...@googlegroups.com
> Subject: [jvm-l] Re: Ability to force class unloading in JDK 6 Update
> 4?
>
>

Ted Neward

Jan 17, 2008, 4:27:24 AM
to jvm-la...@googlegroups.com
Interface versioning was always solvable, it just required rather more
discipline on the part of the programmer than was really feasible.

If you wanted to see pain, you should've tried SOM. *shudder*

Ted Neward
Java, .NET, XML Services
Consulting, Teaching, Speaking, Writing
http://www.tedneward.com

> -----Original Message-----
> From: jvm-la...@googlegroups.com [mailto:jvm-
> lang...@googlegroups.com] On Behalf Of Rémi Forax
> Sent: Tuesday, January 15, 2008 1:50 AM
> To: jvm-la...@googlegroups.com
> Subject: [jvm-l] Re: Ability to force class unloading in JDK 6 Update
> 4?
>
>

Attila Szegedi

Jan 17, 2008, 8:19:58 AM
to jvm-la...@googlegroups.com

On 2008.01.17., at 10:26, Ted Neward wrote:

> All of the scenarios you describe below suggest, then, that so long
> as a
> strong reference is held to the class, the class is not eligible for
> unloading. That means, of course, that once all strong references
> have been
> dropped, the class is now eligible for unloading.

Right, but the fact that the ClassLoader holds a strong reference to
all Class objects it loaded means that a class is not eligible for GC
until all classes loaded through the same class loader themselves
become eligible.

Functions being first-class objects in most of the languages we discuss
here, their usual implementation is to generate an on-the-fly class,
i.e. "class 0a9cee3478 implements Function { ... }" and instantiate
exactly one instance of it to pass around as said first-class object.

If we want our functions to be garbage collectible with
single-function granularity, then the class for each one of them needs
to be loaded in its own class loader that only loads that class and
nothing else. And that's what feels unnecessarily convoluted (having
one classloader per class) and raises the requirement for some
"lightweight method objects".

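The one-loader-per-function arrangement described above can be sketched as follows. This is an illustration under assumptions: java.lang.reflect.Proxy stands in for a real bytecode generator (a language runtime would call defineClass on its own bytes), and LoaderPerFunction and newFunction are made-up names:

```java
import java.lang.ref.WeakReference;
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;

class LoaderPerFunction {
    // Creates a function object whose generated class lives in a fresh, private loader.
    static Runnable newFunction(String message) {
        // A loader whose only job is to own this one generated class.
        ClassLoader privateLoader = new ClassLoader(LoaderPerFunction.class.getClassLoader()) {};
        InvocationHandler body = (proxy, method, args) -> {
            switch (method.getName()) {
                case "run":      System.out.println(message); return null;
                case "hashCode": return System.identityHashCode(proxy);
                case "equals":   return proxy == args[0];
                default:         return "function(" + message + ")"; // toString
            }
        };
        return (Runnable) Proxy.newProxyInstance(
                privateLoader, new Class<?>[] { Runnable.class }, body);
    }

    public static void main(String[] args) {
        Runnable f = newFunction("hello");
        f.run();
        WeakReference<Class<?>> cls = new WeakReference<>(f.getClass());
        f = null;    // drop the only strong path to the instance, its class, and its loader
        System.gc(); // a hint only; collection and class unloading are not guaranteed
        System.out.println("class collected: " + (cls.get() == null));
    }
}
```

Each call yields a distinct class in a distinct loader, which is what makes a single function collectible on its own and also what costs one ClassLoader per function.
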
> [...]


>
> Dropping the Vector out of the ClassLoader base class I think breaks
> a bunch
> of other stuff, but I don't remember why. (It's been a while since I
> tried
> to track that guy down.) I vaguely recall it being added in 1.2 for a
> particular reason. IIRC, there's another collection of hard references
> buried inside the native code, from what I understand, called the
> Loaded
> Class Cache (LCC), and those references would likely keep the class
> alive
> even if you could yank it out of that Vector.

I'm not 100% sure about the limitations here myself, or whether they
could be partially lifted. I certainly hope they could :-) It's
embarrassing, but I think I once knew the reason, but can't seem to
remember it anymore.

On the other hand, the invariant that classes unload in bulk (when
their loader becomes unreachable) is maybe taken advantage of in the
JVM guts somehow -- e.g. JIT call site optimization/deoptimization can
be tied to a ClassLoader going away instead of checked on a per-class
basis, and the like. That might cause some further resistance to
relaxing the rules from people who maintain JVMs.

> I also believe that Class needs to hard reference the ClassLoader in
> order
> to support the getResource() behavior on Class, which means we can't
> get rid
> of those strong references, either.

Nope, but that's not being debated :-)

> I dunno who at Sun owns the ClassLoader facilities in the JVM these
> days--I
> got my LCC info from Peter Kessler (sp?) at Sun, IIRC. He/she would
> be the
> right one to tell us why that Vector exists, or if what you want is
> even
> remotely feasible under the current JVM architecture.

That'd indeed be great.

Attila.

John Wilson

Jan 17, 2008, 9:17:58 AM
to jvm-la...@googlegroups.com
> > Dropping the Vector out of the ClassLoader base class I think breaks
> > a bunch
> > of other stuff, but I don't remember why. (It's been a while since I
> > tried
> > to track that guy down.) I vaguely recall it being added in 1.2 for a
> > particular reason. IIRC, there's another collection of hard references
> > buried inside the native code, from what I understand, called the
> > Loaded
> > Class Cache (LCC), and those references would likely keep the class
> > alive
> > even if you could yank it out of that Vector.
>
> I'm not 100% sure about the limitations here myself, or whether they
> could be partially lifted. I certainly hope they could :-) It's
> embarrassing, but I think I once knew the reason, but can't seem to
> remember it anymore.

Wasn't there a problem with Singletons getting GCd?

Like you I think I once knew and have now forgotten. Getting old....

John Wilson

Yardena

Jan 17, 2008, 9:29:08 AM
to JVM Languages
Hi,

I'm a bit hesitant to step in the discussion among such knowledgeable
gentlemen, so forgive me if the suggestion is too naive :-)

The class loader doesn't unload individual classes because they can
hold static members or have static initializers, and unloading the
class may break the program semantics, right? However the classes
generated for functions do not have any members, and certainly no
static ones, so that's not an issue. What if for every classloader in
the system we'd create one dedicated child classloader for all these
function objects, which would not hold hard references to the classes
it loads making them eligible for GC. Then some sort of factory can
load "function objects" using that special loader.

This would only double the current number of classloaders in the
system, which isn't too bad, I think. There's still plenty of issues,
I'm sure... Dunno, just an idea.

Yardena.

John Wilson

Jan 17, 2008, 9:30:02 AM
to jvm-la...@googlegroups.com
On Jan 17, 2008 1:19 PM, Attila Szegedi <szeg...@gmail.com> wrote:

> Functions being first-class objects in most of languages we discuss
> here, their usual implementation is to generate an on-the-fly class,
> i.e. "class 0a9cee3478 implements Function { ... }" and instantiate
> exactly one instance of it to pass around as said first-class object.
>
> If we want to have our functions to be garbage collectibe with a
> single function granularity, then the class for each one of them needs
> to be loaded in its own class loader that only loads that class and
> nothing else . And that's what feels unnecessarily convoluted (having
> one classloader per class) and raises the requirement for some
> "lightweight method objects".


One thing I have considered is aggregating all these
closures/lambda/etc in a compilation unit as static methods on a
single class. This means that you amortise the cost of a classloader
over several methods. Instances of the function object delegate to the
static method. The cost is, obviously, delayed GC of the "mothership"
class. Of course, if you can dynamically create new functions this
won't help.

John Wilson

Attila Szegedi

Jan 17, 2008, 10:33:33 AM
to jvm-la...@googlegroups.com
Your idea wouldn't be bad as such, except for a few devils in the details:
- The code of the java.lang.ClassLoader class (at least in Sun's
implementation) is engineered so that you just can't override the
behaviour where it strongly references each Class object it creates.
- And then there's also the problem that if you try to get around this
(say, using a custom ClassLoader code using a different boot class
path, or resorting to reflection (with access checks off) to get to
that Vector object and remove Class objects manually from it), the JVM
typically becomes very unstable very quickly (crashes in native
code)... Been there :-)

Attila.

Rémi Forax

Jan 17, 2008, 11:27:12 AM
to jvm-la...@googlegroups.com
Attila Szegedi wrote:

> Your idea wouldn't be bad as such, except for few devils in details:
> - The code of the java.lang.ClassLoader class (at least in Sun's
> implementation) is engineered so that you just can't override the
> behaviour where it strongly references each Class object it creates.
> - And then there's also the problem that if you try to get around this
> (say, using a custom ClassLoader code using a different boot class
> path, or resorting to reflection (with access checks off) to get to
> that Vector object and remove Class objects manually from it), the JVM
> typically becomes very unstable very quickly (crashes in native
> code)... Been there :-)
>
> Attila.
>
>
Attila, the VM doesn't become unstable,
it crashes reproducibly during garbage collection.
Perhaps because the object layout is lost.

Rémi

Per Bothner

Jan 17, 2008, 11:51:41 AM
to jvm-la...@googlegroups.com
John Wilson wrote:
> One thing I have considered is aggregating all these
> closures/lambda/etc in a compilation unit as static methods on a
> single class. This means that you amortise the cost of a classloader
> over several methods. Instances of the function object delegate to the
> static method. The cost is, obviously, delayed GC of the "mothership"
> class.

Kawa does a variant of this:
http://www.gnu.org/software/kawa/internals/procedures.html
--
--Per Bothner
p...@bothner.com http://per.bothner.com/

John Wilson

Jan 17, 2008, 12:03:46 PM
to jvm-la...@googlegroups.com
On Jan 17, 2008 4:51 PM, Per Bothner <p...@bothner.com> wrote:
>
> John Wilson wrote:
> > One thing I have considered is aggregating all these
> > closures/lambda/etc in a compilation unit as static methods on a
> > single class. This means that you amortise the cost of a classloader
> > over several methods. Instances of the function object delegate to the
> > static method. The cost is, obviously, delayed GC of the "mothership"
> > class.
>
> Kawa does a variant of this:
> http://www.gnu.org/software/kawa/internals/procedures.html

Thanks for the reference. It's nice to know that somebody has
implemented this before. It makes me more likely to pursue the idea
further (possibly synthetic static methods on the class in which the
function object is declared).

John Wilson

John Cowan

Jan 17, 2008, 12:48:08 PM
to jvm-la...@googlegroups.com
On Jan 17, 2008 6:30 AM, John Wilson <tugw...@gmail.com> wrote:

> One thing I have considered is aggregating all these
> closures/lambda/etc in a compilation unit as static methods on a
> single class. This means that you amortise the cost of a classloader
> over several methods. Instances of the function object delegate to the
> static method. The cost is, obviously, delayed GC of the "mothership"
> class. Of course, if you can dynamically create new functions this
> won't help.

That's what I'm doing. Function is an abstract class rather than a
marker interface, and it provides a field named _index. If Foo is a
subclass of Function that has five static methods, then there are also
five static fields initialized to five instances of Foo with _index
values from 0 to 4; these represent at source-language level the five
possible functions (in general, there may be more than one Foo per
source-code module, but one is typical).

So if you invoke the _invoke method (the only instance method of these
classes) on a particular Foo object, it will run a switch statement
that validates the number of passed arguments and calls the
appropriate static function. This is only done, of course, when the
function to be invoked is not known at compile time; otherwise, the
static method is called directly.
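
In Java terms, the scheme just described might look something like this. Function, Foo, _index and _invoke follow the description; the two sample functions, the Object-varargs calling convention and the checkArity helper are assumptions made for the sketch:

```java
// Base class for all function objects; _index selects the static method to run.
abstract class Function {
    protected final int _index;

    protected Function(int index) { _index = index; }

    abstract Object _invoke(Object... args);

    public static void main(String[] args) {
        System.out.println(Foo.ADD._invoke(2, 3)); // dynamic call through the switch, prints 5
        System.out.println(Foo.negate(5));         // direct call when the target is known, prints -5
    }
}

// One class per source module: static methods hold the code, static fields hold the instances.
final class Foo extends Function {
    static final Foo ADD = new Foo(0);
    static final Foo NEGATE = new Foo(1);

    private Foo(int index) { super(index); }

    static Object add(Object a, Object b) { return (Integer) a + (Integer) b; }

    static Object negate(Object a) { return -(Integer) a; }

    @Override
    Object _invoke(Object... args) {
        switch (_index) {
            case 0: checkArity(args, 2); return add(args[0], args[1]);
            case 1: checkArity(args, 1); return negate(args[0]);
            default: throw new IllegalStateException("unknown function index " + _index);
        }
    }

    // Validates the number of passed arguments before dispatching.
    private static void checkArity(Object[] args, int expected) {
        if (args.length != expected) {
            throw new IllegalArgumentException(
                    "expected " + expected + " arguments, got " + args.length);
        }
    }
}
```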

Currently this is only done for top-level functions. Nested functions
(which are all anonymous) use Java anonymous classes (subclasses of
Function that implement _invoke), but I may change this to do
lambda-lifting instead. Does anyone have any insights about the pros
and cons of lambda-lifting on the JVM? Can I do substantially better
than the Java compiler? (Note that I am generating Java, not
bytecode, and that all mutable local variables have already been
converted to point to boxes.)

--
GMail doesn't have rotating .sigs, but you can see mine at
http://www.ccil.org/~cowan/signatures

John Wilson

Jan 17, 2008, 2:05:09 PM
to jvm-la...@googlegroups.com
On Jan 17, 2008 5:48 PM, John Cowan <johnw...@gmail.com> wrote:
>
> On Jan 17, 2008 6:30 AM, John Wilson <tugw...@gmail.com> wrote:
>
> > One thing I have considered is aggregating all these
> > closures/lambda/etc in a compilation unit as static methods on a
> > single class. This means that you amortise the cost of a classloader
> > over several methods. Instances of the function object delegate to the
> > static method. The cost is, obviously, delayed GC of the "mothership"
> > class. Of course, if you can dynamically create new functions this
> > won't help.
>
> That's what I'm doing. Function is an abstract class rather than a
> marker interface, and it provides a field named _index. If Foo is a
> subclass of Function that has five static methods, then there are also
> five static fields initialized to five instances of Foo with _index
> values from 0 to 4; these represent at source-language level the five
> possible functions (in general, there may be more than one Foo per
> source-code module, but one is typical).
>
> So if you invoke the _invoke method (the only instance method of these
> classes) on a particular Foo object, it will run a switch statement
> that validates the number of passed arguments and calls the
> appropriate static function. This is only done, of course, when the
> function to be invoked is not known at compile time; otherwise, the
> static method is called directly.

I plan to have a single, concrete class for all functions which are
objects (Lambdas and Closures, in my case). It contains a reference to
a java.lang.reflect.Method (which allows it to call the static method)
and some optional state. The calls from Ng all go via the MetaClass of
this wrapper class and just cause the static method to be called via
reflection (passing the state with the parameters). There is an
invoke() method for calls from Java. This means I can add an arbitrary
number of method bodies to a class and treat them all as separate
functions.
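
A minimal sketch of that wrapper idea, for readers following along. All names here (ReflectiveFunction, Bodies, add, greet) are hypothetical and not taken from Ng; the MetaClass dispatch path is left out, and only the Java-facing invoke() is shown. It assumes the captured state is passed as the first parameter of the static method.

```java
import java.lang.reflect.Method;

// Hypothetical wrapper: one concrete class for every function object.
final class ReflectiveFunction {
    private final Method target;  // static method that holds the function body
    private final Object state;   // optional captured state, may be null

    ReflectiveFunction(Method target, Object state) {
        this.target = target;
        this.state = state;
    }

    // Entry point for Java callers: the state goes first, then the arguments.
    Object invoke(Object... args) throws Exception {
        Object[] full = new Object[args.length + 1];
        full[0] = state;
        System.arraycopy(args, 0, full, 1, args.length);
        return target.invoke(null, full);  // null receiver because target is static
    }
}

// Hypothetical "mothership" class: any number of bodies can live here.
final class Bodies {
    static Object add(Object state, Object a) {
        return (Integer) state + (Integer) a;
    }

    static Object greet(Object state) {
        return "hello " + state;
    }
}

class ReflectiveFunctionDemo {
    public static void main(String[] args) throws Exception {
        ReflectiveFunction addTen = new ReflectiveFunction(
                Bodies.class.getDeclaredMethod("add", Object.class, Object.class), 10);
        ReflectiveFunction greeter = new ReflectiveFunction(
                Bodies.class.getDeclaredMethod("greet", Object.class), "world");
        System.out.println(addTen.invoke(5));
        System.out.println(greeter.invoke());
    }
}
```

Only Bodies needs to be loaded for both functions, so the classloader cost is paid once per batch; the price is a reflective call on every invocation.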

John Wilson

Charles Oliver Nutter

Jan 17, 2008, 3:13:20 PM
to jvm-la...@googlegroups.com
Adam Bouhenguel wrote:
> Why not allow class unloading only for classes without static
> fields, static methods, or static initializers? That side-steps the
> initialization issues and means that you need an instance in order to
> use it (no static elements + no references = no problem unloading). Am
> I correct in my line of thinking?

Seems perfectly reasonable to me :) And as far as I know, there's no
negative side to this.

- Charlie

Charles Oliver Nutter

Jan 17, 2008, 3:16:06 PM
to jvm-la...@googlegroups.com
John Wilson wrote:
> wasn't there a problem with Singletons getting GCd?
>
> Like you I think I once knew and have now forgotten. Getting old....

If so this would be solved by the "no statics" requirement. It's
certainly a possible reason why the hard reference exists.

- Charlie

Charles Oliver Nutter

Jan 17, 2008, 4:06:34 PM
to jvm-la...@googlegroups.com
John Cowan wrote:
> On Jan 17, 2008 6:30 AM, John Wilson <tugw...@gmail.com> wrote:
>
>> One thing I have considered is aggregating all these
>> closures/lambda/etc in a compilation unit as static methods on a
>> single class. This means that you amortise the cost of a classloader
>> over several methods. Instances of the function object delegate to the
>> static method. The cost is, obviously, delayed GC of the "mothership"
>> class. Of course, if you can dynamically create new functions this
>> won't help.
>
> That's what I'm doing. Function is an abstract class rather than a
> marker interface, and it provides a field named _index. If Foo is a
> subclass of Function that has five static methods, then there are also
> five static fields initialized to five instances of Foo with _index
> values from 0 to 4; these represent at source-language level the five
> possible functions (in general, there may be more than one Foo per
> source-code module, but one is typical).

JRuby has tried this approach and several others, and the fastest one
turns out to be the hardest on permgen.

* First, JRuby used reflection everywhere. It was simple, but it's
always slower than non-reflective options.
* Then we hand-wrote method objects as anonymous classes and used those
as bindings. But it doesn't scale, and doesn't work at all for code
loaded at runtime.
* Then we used a hand-written indexed method handle like you describe.
Again, it worked (albeit a bit slower than individual methods), but it
was too much effort to implement by hand and wouldn't work for generated
code.
* Then we started generating small method handle classes for all
methods. This allowed us to generate everything, so runtime code could
use the same model. The handles were wrapped in a DynamicMethod object.
This was faster than any of the above options because the invocation
code was monomorphic. But the DynamicMethod wrapper was generic and
megamorphic.
* Then we started generating DynamicMethod implementations, eliminating
that megamorphic call site from the picture. This is how it works today
in JRuby, with the call handle being only a single hop to the actual
method being invoked. But it's a load on permgen...each handle is a
class, in some cases with their own classloaders.
* JRuby also supports generating indexed DynamicMethods, but there's a
noticeable performance hit adding the extra decision.

The current compiler has its own strategy:

* One .rb script is compiled into exactly one .class file. Every body of
code is a method on that class. The methods are instance methods,
because there are various cache fields on the class, and we want to be
able to reuse the same compiled class across JRuby instances. So at
runtime, the bodies of code in the script are either executed directly
(main script, class bodies) or bound to methods using any of the
techniques above. But there's only ever one .class file for one .rb
file, which I believe is far cleaner than Groovy, Scala, XRuby, and
others generating dozens of .class files.
* In JIT mode, a single-method class is generated per compiled
method...each with its own classloader, so they can GC. In this mode,
the existing interpreted DynamicMethod object aggregates an instance of
the newly jitted method, and invokes against that instead of the
interpreter.
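
To make the "own classloader, so they can GC" point concrete, here is a rough sketch of the one-class-per-loader pattern. It is not JRuby's code: SingleClassLoader, emptyClass and the class name "Jitted" are made up for illustration, and the hand-written bytes of an empty class stand in for real compiler output. Whether and when the JVM actually unloads such a class is still up to the collector; the sketch only shows that each class gets a loader of its own.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;

class OneClassPerLoader {
    // Hypothetical loader that is meant to define exactly one class. Once the
    // class, its instances and this loader are all unreachable, the class
    // becomes a candidate for unloading.
    static final class SingleClassLoader extends ClassLoader {
        SingleClassLoader(ClassLoader parent) {
            super(parent);
        }

        Class<?> define(String name, byte[] bytecode) {
            return defineClass(name, bytecode, 0, bytecode.length);
        }
    }

    // Emits the class file of an empty public class in the default package.
    // A real compiler would emit method bodies here instead.
    static byte[] emptyClass(String name) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        out.writeInt(0xCAFEBABE);                            // magic
        out.writeShort(0);                                   // minor version
        out.writeShort(49);                                  // major version (Java 5 format)
        out.writeShort(5);                                   // constant pool count (entries 1..4)
        out.writeByte(7); out.writeShort(2);                 // #1 Class, name at #2
        out.writeByte(1); out.writeUTF(name);                // #2 Utf8, this class name
        out.writeByte(7); out.writeShort(4);                 // #3 Class, name at #4
        out.writeByte(1); out.writeUTF("java/lang/Object");  // #4 Utf8, superclass name
        out.writeShort(0x0021);                              // ACC_PUBLIC | ACC_SUPER
        out.writeShort(1);                                   // this_class
        out.writeShort(3);                                   // super_class
        out.writeShort(0);                                   // interfaces_count
        out.writeShort(0);                                   // fields_count
        out.writeShort(0);                                   // methods_count
        out.writeShort(0);                                   // attributes_count
        return bytes.toByteArray();
    }

    // Every call creates a fresh loader, so equally named classes never collide.
    static Class<?> load(String name) throws IOException {
        ClassLoader parent = OneClassPerLoader.class.getClassLoader();
        return new SingleClassLoader(parent).define(name, emptyClass(name));
    }

    public static void main(String[] args) throws IOException {
        Class<?> first = load("Jitted");
        Class<?> second = load("Jitted");
        System.out.println(first.getName() + " defined twice, distinct classes: "
                + (first != second));
    }
}
```

The cost discussed in this thread is visible in the shape of the code: one ClassLoader object plus one class's worth of metadata for every compiled method.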

Perhaps this helps show why we need to reduce the cost of generating
single-method classes. The next trick we want to explore was suggested
by John Rose:

1. Define a set of numbered "invoker" interfaces with numbered methods
2. Define a parallel set of method handle instances that invoke exactly
one of those interfaces
3. Generate method handles or jitted methods in n-sized batches, with
the compiled object implementing the n invoker interfaces for the n
methods contained therein.
4. Bind each of the n methods using the specific method handle necessary.

The idea behind this is that while you have many implementations of the
method handle interface, callers to them are still fairly monomorphic.
At the same time you've eliminated the indexed switch, so from call site
to target method is a straight-through affair.
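
Here is how I read the four steps, as a small hand-written sketch with n = 2. The names (Invoker0, Handle0, Batch, Handle) are invented for the example and are not JRuby classes; in practice the Batch class would be generated, not written by hand.

```java
// Step 1: numbered invoker interfaces, each with one numbered method.
interface Invoker0 { Object invoke0(Object[] args); }
interface Invoker1 { Object invoke1(Object[] args); }

// What call sites see; the name Handle is hypothetical.
interface Handle { Object call(Object... args); }

// Step 2: one handle class per invoker interface. Each handle calls exactly
// one interface method, so there is no switch on an index.
final class Handle0 implements Handle {
    private final Invoker0 target;
    Handle0(Invoker0 target) { this.target = target; }
    public Object call(Object... args) { return target.invoke0(args); }
}

final class Handle1 implements Handle {
    private final Invoker1 target;
    Handle1(Invoker1 target) { this.target = target; }
    public Object call(Object... args) { return target.invoke1(args); }
}

// Step 3: a batch of n = 2 compiled bodies in a single class that implements
// the n invoker interfaces.
final class Batch implements Invoker0, Invoker1 {
    public Object invoke0(Object[] args) { return "foo"; }
    public Object invoke1(Object[] args) { return ((String) args[0]).length(); }
}

class InvokerBatchDemo {
    public static void main(String[] args) {
        Batch batch = new Batch();
        // Step 4: bind each body through the handle for its slot.
        Handle foo = new Handle0(batch);
        Handle length = new Handle1(batch);
        System.out.println(foo.call());
        System.out.println(length.call("abc"));
    }
}
```

Two bodies cost one generated class here; the Handle0 and Handle1 classes are fixed and shared by every batch.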

We'll probably implement this after 1.1, since we've managed to reduce
our permgen load in other ways.

- Charlie

John Cowan

Jan 18, 2008, 2:55:07 AM
to jvm-la...@googlegroups.com
On Jan 17, 2008 1:06 PM, Charles Oliver Nutter <charles...@sun.com> wrote:

> * Then we used a hand-written indexed method handle like you describe.
> Again, it worked (albeit a bit slower than individual methods), but it
> was too much effort to implement by hand and wouldn't work for generated
> code.

I don't understand the force of this objection. In source code, I
have something like this:

public class Foo extends Function {
    public Foo(int n) { _index = n; }

    // Must be final: case labels have to be compile-time constants.
    private static final int BAR = 1;
    private static final int BAZ = 2;
    private static final int QUUX = 3;

    public static Foo BarFunction = new Foo(BAR);
    public static Foo BazFunction = new Foo(BAZ);
    public static Foo QuuxFunction = new Foo(QUUX);

    // Dispatch on _index to the matching static method.
    public Object _invoke(Object... args) {
        switch (_index) {
            case BAR: return bar(args[0], args[1]);
            case BAZ: return baz(args[0]);
            case QUUX: return quux();
            default: throw new SomeError(...);
        }
    }

    public static Object bar(Object a, Object b) { ... }
    public static Object baz(Object a) { ... }
    public static Object quux() { ... }
}

That pattern is very amenable to both hand-coding and code generation,
and the only limit to it is how much code a class can hold. The
private static constants are just for documentation in hand-written
code, and aren't used in generated code. There is obviously a speed
problem resulting from the call-switch-call, but the great majority of
all calls bypass this path completely to invoke the static method
directly (namely, those which are direct in the source language).

> * One .rb script is compiled into exactly one .class file.

So Ruby classes don't correspond to JVM classes?

I create one Java class for every source-code class, plus (currently)
one for each embedded anonymous procedure, plus several more, one for
each distinct namespace in the source language (functions, lexical
variables, dynamic variables).

Kresten Krab Thorup

Jan 18, 2008, 5:32:10 AM
to JVM Languages
I think you misunderstand me here, Ted. My intention would be that
classes loaded by a specific kind of class loader are eligible for GC
when there are no references to the class (which usually means that
there are no instances of said class). Such semantics are not possible
today because a class loader holds a strong link to the classes it
loads.

So what I suggest is to have a new special kind of class loader
(TransientClassLoader maybe) which doesn't have the strong link to
classes loaded by it. If this was a new class, then no existing code
would be broken by it; and so the new semantics for when a class is
eligible for class unloading would only apply to classes loaded by
transient class loader (and subclasses thereof).

Kresten

Charles Oliver Nutter

Jan 18, 2008, 10:03:42 PM
to jvm-la...@googlegroups.com
John Cowan wrote:
> On Jan 17, 2008 1:06 PM, Charles Oliver Nutter <charles...@sun.com> wrote:
>
>> * Then we used a hand-written indexed method handle like you describe.
>> Again, it worked (albeit a bit slower than individual methods), but it
>> was too much effort to implement by hand and wouldn't work for generated
>> code.
> That pattern is very amenable to both hand-coding and code generation,
> and the only limit to it is how much code a class can hold. The
> private static constants are just for documentation in hand-written
> code, and aren't used in generated code. There is obviously a speed
> problem resulting from the call-switch-call, but the great majority of
> all calls bypass this path completely to invoke the static method
> directly (namely, those which are direct in the source language).

It is the "hand coded" part that did not scale for us. Of course the
pattern itself scales fine when code-generating, and JRuby has an
optional flag that uses exactly this method for generating indexed
methods. But the performance hit is real for a language that uses
dynamic dispatch exclusively (about 30% in steady state in the runs
below):

fib(30) with non-indexed (direct) method handles:

~/NetBeansProjects/jruby $ bin/jruby -J-server
test/bench/bench_fib_recursive.rb
0.952000 0.000000 0.952000 ( 0.952000)
0.647000 0.000000 0.647000 ( 0.648000)
0.636000 0.000000 0.636000 ( 0.636000)
0.651000 0.000000 0.651000 ( 0.651000)
0.634000 0.000000 0.634000 ( 0.635000)

fib(30) with indexed method handles:

~/NetBeansProjects/jruby $ bin/jruby -J-server
-J-Djruby.indexed.methods=true test/bench/bench_fib_recursive.rb
2.005000 0.000000 2.005000 ( 2.006000)
0.835000 0.000000 0.835000 ( 0.835000)
0.847000 0.000000 0.847000 ( 0.848000)
0.838000 0.000000 0.838000 ( 0.839000)
0.823000 0.000000 0.823000 ( 0.823000)

>> * One .rb script is compiled into exactly one .class file.
>
> So Ruby classes don't correspond to JVM classes?

They do not; Ruby's classes must be reified into first-class data
structures since they can have methods and instance variables added and
removed at runtime. An upcoming compiler extension for JRuby will allow
generating a static type + methods for a specific set of Ruby methods,
which will provide a more "Java-like" type and set of signatures.

> I create one Java class for every source-code class, plus (currently)
> one for each embedded anonymous procedure, plus several more, one for
> each distinct namespace in the source language (functions, lexical
> variables, dynamic variables).

I create a single class for all of those and bind them at runtime. It
was a key requirement I wanted for the compiler when I started.

- Charlie

Charles Oliver Nutter

Jan 18, 2008, 10:05:43 PM
to jvm-la...@googlegroups.com
Kresten Krab Thorup wrote:
> So what I suggest is to have a new special kind of class loader
> (TransientClassLoader maybe) which doesn't have the strong link to
> classes loaded by it. If this was a new class, then no existing code
> would be broken by it; and so the new semantics for when a class is
> eligible for class unloading would only apply to classes loaded by
> transient class loader (and subclasses thereof).

This is *exactly* what I want. Unfortunately the only way to get it
right now is to hack the JDK or create your own version that calls out
to JNI to define the underlying class. But such an addition would be a
trivial piece of code to add to JDK.

- Charlie

Michael Neale

Jan 27, 2008, 5:55:07 PM
to JVM Languages
I thought I would throw my $0.02 in on this (at the risk of sounding
like an ignoramus !).

I work on Drools, and we have also had the function per class per
classloader issue (it is typical to have 1000's of rules) - this is
quite an unpleasant amount of overhead (we have ways around it - but
in a sense we are kind of not using the VM bytecode to its full
potential). The aggregation of functions into statics in some "unit"
of compilation is sometimes doable (but for dynamic reasons not the
general solution).

I think Attila hit the nail on the head:
> > Functions being first-class objects in most of languages we discuss
> > here, their usual implementation is to generate an on-the-fly class,
> > i.e. "class 0a9cee3

Unless I am mistaken, this ties in with Charles Nutter's request/
support for "free floating" method bytecode (not attached to a class -
excuse my ignorance in terminology!).

I think dynamic invocation is important, but this other idea of
functions in bytecode would probably be the other bit that makes
everything just rosy for most languages, given that each language
implements OO its own way (if it's OO at all, which it may not even
be!) and often does not map neatly to the Java class design (no
surprise - OO is always open to interpretation).

So this could serve as a nice building block. I am totally ignorant of
the other ramifications in terms of loading/unloading, scope and
security - but at first look, being able to generate, and dynamically
invoke a chunk of code which is not attached to a class (and therefore
can be created "lite" and quickly discarded if not needed) sounds
fantastic and an almost general solution. Without this, we tend to
find ourselves building a VM within a VM (not always a great idea !),
rather than leaning on the superb engineering put into the JVM
itself.


Michael.



Yardena

Jan 28, 2008, 4:40:55 AM
to JVM Languages
I agree that building this into the JVM would probably be the best thing.

In the meantime, however, I'll take another chance here: the Cojen project
has a solution of sorts for "transient classes"; it looks like it creates
a new classloader for every 100 "injections" -
http://cojen.sourceforge.net/xref/org/cojen/util/ClassInjector.html

Also one more thought - could we create a pool of classes that each have
a single method, something like invoke(Class[], Object[]), then track the
references ourselves and, instead of creating a new class, modify the
contents of the method in an existing but unused class?

Yardena.


Attila Szegedi

Jan 30, 2008, 7:18:21 AM
to jvm-la...@googlegroups.com
I don't see this being possible in a current JVM. You can not modify
the contents of a method. The only way to bring new executable content
into a running JVM today is through defineClass(). Once defined, you
can't modify it.

John Rose has a good proposal for solving this with anonymous classes
in the Da Vinci VM.

Attila.

Marcelo Fukushima

Jan 30, 2008, 7:49:18 AM
to jvm-la...@googlegroups.com
I think you can, if you use JVM agents like Eclipse does to modify
running bytecode, but there are lots of restrictions.


--
[]'s
Marcelo Takeshi Fukushima

Attila Szegedi

Jan 30, 2008, 8:10:34 AM
to jvm-la...@googlegroups.com
That's an entirely different aspect -- there is the JVMTI (JVM Tool
Interface) that allows you to do these kinds of operations on a JVM from
a native (C-linked) code module loaded into the JVM process itself -
debugging, profiling, hot code replacement, whatnot. It is generally
not available as an implementation-independent facility to programs
running within the JVM itself. There are some "standard" native
modules that'll then provide an out-of-process bridge for, say,
debugging (i.e. the low-level JDWP protocol upon which the higher-level
JDI is built, both being part of JPDA).

But JVMTI/JDI/JPDA can't be considered to be standard facilities
available to code running within the JVM, so it's generally a bad
idea to base your code logic on them :-)

Attila.

Rémi Forax

Jan 30, 2008, 8:31:45 AM
to jvm-la...@googlegroups.com
Attila Szegedi wrote:

> That's an entirely different aspect -- there is the JVMTI (JVM Tool
> Interface) that allows you to do these kinds of operations on a JVM from
> a native (C-linked) code module loaded into the JVM process itself -
> debugging, profiling, hot code replacement, whatnot. It is generally
> not available as an implementation-independent facility to programs
> running within the JVM itself. There are some "standard" native
> modules that'll then provide an out-of-process bridge for, say,
> debugging (i.e. the low-level JDWP protocol upon which the higher-level
> JDI is built, both being part of JPDA).
>
> But JVMTI/JDI/JPDA can't be considered to be standard facilities
> available to code running within the JVM, so it's generally a bad
> idea to base your code logic on them :-)
>
> Attila.
>
Hmm, you forget java.lang.instrument.

Rémi

Attila Szegedi

Jan 30, 2008, 10:43:21 AM
to jvm-la...@googlegroups.com
I'll admit to have limited knowledge of j.l.instrument (and right now
not even enough time to read through the API docs), but doesn't that
allow hooking into the class definition process only? It still
wouldn't allow you to hot-replace code of an already defined class
AFAIK (I might be wrong), which I believe is the "Eclipse capability"
the poster I replied to talked about.

Attila.

Rémi Forax

Jan 30, 2008, 10:58:39 AM
to jvm-la...@googlegroups.com
Attila Szegedi wrote:

> I'll admit to have limited knowledge of j.l.instrument (and right now
> not even enough time to read through the API docs), but doesn't that
> allow hooking into the class definition process only?
not only

> It still
> wouldn't allow you to hot-replace code of an already defined class
> AFAIK (I might be wrong),
you are wrong :)

> which I believe is the "Eclipse capability"
> the poster I replied to talked about.
>
i think so.
> Attila.
>
it's a 1.6 feature:
http://download.java.net/jdk7/docs/api/java/lang/instrument/Instrumentation.html#retransformClasses(java.lang.Class...)

Rémi

Attila Szegedi

Jan 31, 2008, 5:57:34 AM
to jvm-la...@googlegroups.com
Wow - thanks for pointing this out; just learned something new. That's
indeed a quite amazing improvement...

Attila.

Patrick Wright

Jan 31, 2008, 6:04:19 AM
to jvm-la...@googlegroups.com

I think I recall there are some limitations--you either have to
specify an agent on the command line, or else "An implementation may
provide a mechanism to start agents sometime after the the VM has
started. The details as to how this is initiated are implementation
specific but typically the application has already started and its
main method has already been invoked. "
(http://download.java.net/jdk7/docs/api/java/lang/instrument/package-summary.html).


Patrick

Ted Neward

Feb 3, 2008, 12:35:25 AM
to jvm-la...@googlegroups.com
j.l.instrument does require a load-time agent specified on the command line
(figuratively speaking; if you wrote your own custom launcher, you could
provide the necessary option directly to the JNI Invocation code).
The agent jar is named with -javaagent, and its manifest names the agent
class (Premain-Class). That class's premain method is handed an
Instrumentation instance, and the ClassFileTransformer it registers there
is then called as classes are loaded (after a core bootstrap set is in
place--Object, String, Class, etc).
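
For anyone who wants to try the retransformation route mentioned earlier in the thread, a bare-bones agent might look roughly like this. SketchAgent and LoggingTransformer are made-up names, and the transformer deliberately changes nothing. It assumes the jar is started with -javaagent:agent.jar and that its manifest carries "Premain-Class: SketchAgent" and "Can-Retransform-Classes: true".

```java
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.lang.instrument.UnmodifiableClassException;
import java.security.ProtectionDomain;

// Hypothetical transformer: reports retransformations and leaves bytes alone.
final class LoggingTransformer implements ClassFileTransformer {
    @Override
    public byte[] transform(ClassLoader loader, String className,
                            Class<?> classBeingRedefined,
                            ProtectionDomain protectionDomain,
                            byte[] classfileBuffer) {
        // classBeingRedefined is non-null when an already loaded class is
        // being redefined or retransformed, and null on first load.
        if (classBeingRedefined != null) {
            System.err.println("retransforming " + className);
        }
        return null;  // null tells the JVM that the class is unchanged
    }
}

// Hypothetical agent class, named by Premain-Class in the agent jar's manifest.
class SketchAgent {
    // Called by the JVM before the application's main method.
    public static void premain(String agentArgs, Instrumentation inst) {
        // The second argument marks the transformer as retransformation-capable.
        inst.addTransformer(new LoggingTransformer(), true);
    }

    // Asks the JVM to run the registered transformers again over a loaded class.
    static void retransform(Instrumentation inst, Class<?> target)
            throws UnmodifiableClassException {
        if (inst.isRetransformClassesSupported() && inst.isModifiableClass(target)) {
            inst.retransformClasses(target);
        }
    }
}
```

A real transformer would return new bytes instead of null.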

Ted Neward
Java, .NET, XML Services
Consulting, Teaching, Speaking, Writing
http://www.tedneward.com
