A bit about JRuby's design

2 views

Skip to first unread message

Charles Oliver Nutter

unread,

May 28, 2007, 1:41:07 AM5/28/07

to jvm-la...@googlegroups.com

JRuby's design has evolved over time, but this is a brain dump I did a
few weeks ago. I'm open for any and all questions.

****

A quick writeup for those interested. This is from memory, I'm missing
some details for sure. Disclaimer: Tom's the parser guy, so direct
questions in that area to him.

This is based on what we plan to ship for JRuby 1.0...

Parsing/lexing:

* JRuby has a YACC/BISON-based parser, basically Ruby's parser
ported to Java and using the Jay parser generated.
* The lexer is hand-written, as in Ruby.
* The resulting AST has positioning information for both start/end
lines and start/end offsets. This is to support IDE use.
* We also save comments as they're parsed, to allow round-tripping
the AST.
* We're considering an experimental migration to XRuby's
ANTLR-based parser post 1.0. I don't guess it would take more than a week.

Runtime layout:

* The "Ruby" class is the entry point for the runtime. All objects
created have a reference to the runtime, and do not / will not migrate
across runtimes without marshalling.
* Every thread using a given runtime is assigned a ThreadContext
object, stored in a threadlocal. We minimize threadlocal lookup by
frequently passing the ThreadContext on the call stack as a parameter.

More on threading:

* Ruby threads are mapped 1:1 to JRuby threads, except in thread
pooling mode (experimental) where a pool of threads is maintained from
which to fuel Ruby threads. The pooling was added to offset the overhead
of APIs that spin up many threads, assuming they're green/lightweight.
* Java threads entering the system from "outside" the runtime are
assigned a ThreadContext and "RubyThread" object. In our terminology,
they are "adopted" by the runtime.
* Ruby's unsafe thread operations (criticalization of a single
thread, Thread#kill, Thread#raise) are supported, but we make no
deterministic guarantees about them. Criticalization does not guarantee
other threads have stopped before proceeding, and kill/raise require
that the target thread eventually reach a checkpoint. Kill may wait
forever if the target thread never reaches the "die yourself" checkpoint.

Interpreter:

* The current interpreter engine is a "big switch", with each AST
node having a switch value. Each case calls out to a separate method;
this layout appeared to work best for HotSpot (presumably because the
switch gets inlined into the cases, rather than vice-versa being too
complicated to optimize).

Methods and dynamic dispatch:

* Java based methods are generally bound to names using
bytecode-generated adapter classes that directly invoke. This showed a
substantial gain over e.g. reflection. We tried a number of other
binding mechanisms, and this was easiest to maintain and gave the best
performance.
* There is a per-class method cache with a global flushing
mechanism. In experiments, adding an inline cache has not shown any
gains over a per-class cache, and since we control the class data
structure we've had no motivation to use an inline cache.
* We also support a fast dispatch mechanism based on Selector Table
Indexing. All core classes are assigned an index, and all STIable method
names are assigned an index. We then maintain an in-memory table mapping
class index and method index to a switch value. Core classes then have a
fast dispatch path based on that switch value, otherwise falling back to
hash-lookup and method caches. STI shows a major performance boost for
core method implementations. Reassigning/redefining any STI method zeros
that position in the table, reverting to slow dispatch.

Compiler:

* The compiler's AOT mode compiles one .rb file to one class. The
resulting class is primarily a "bag of functions", with all methods
static. When the methods are actually bound to specific names, the same
adapter-generation happens to bind them.
* Blocks and class bodies are compiled in the same way; one method
in the resulting class per closure.
* The compiler's JIT mode watches for interpreted calls to exceed a
simple threshold; 20 calls seemed to be a good cutoff. It then attempts
to compile it, permanently using the compiled code if it succeeds or
reverting to interpreted if it fails.
* The compiler is not optimized in any major way. It's almost
entirely a duplication of the interpreter logic.
* The resulting bytecode does not decompile to anything very
Java-like. This is primarily because I only use Ruby local variables (in
a local variable data structure) and just juggle items on the stack for
the rest. The complexity of managing Java local variables across a
visitor-like compiler design outweighed any gains.
* Perhaps 50% or more of the interpreter code is now represented in
the compiler. The remaining 50% is harder constructs I just haven't
tackled yet.

Compatibility:

* JRuby 1.0 is very close to 100% compatible with the Ruby
language. The core libraries we *can* support are close to 100%
compatible. There are libraries we can't support, generally in the areas
of POSIX functions not supported by Java or platform-specific features
like fork and symlinks.
* JRuby on Rails is very close to passing all tests 100% now. By
most accounts JRoR runs extremely well.
* A number of C-based extensions have been ported to JRuby, with
more on the way.

Pain:

* When classes are generated at runtime, they are each wrapped with
a classloader to allow them to be garbage collected. Super gross, but no
other way to safely generate arbitrary numbers of classes without
exceeding permgen. We need a "dynamic method" or "anonymous method"
concept, or even a super lightweight class/classloader that would
support the same semantics.
* In order to support Ruby's wilder features, we need to be able to
decorate Java's call stack with additional information. The only way to
do this is to construct our own stack frames per-call and either keep
them on ThreadContext (threadlocal) or pass them on the stack. This is
pain point #1 for performance.
* The other big performance pain is representing scope as an
external data structure rather than as local variables. This is
primarily to support closures and eval, which need to capture variables
in scope. I'm not fond of Groovy's mechanism for this, which copies
variables in and out. But our mechanism does mean more object overhead
per-call.

Jochen Theodorou

unread,

May 28, 2007, 6:10:38 AM5/28/07

to jvm-la...@googlegroups.com

Charles Oliver Nutter schrieb:

> JRuby's design has evolved over time, but this is a brain dump I did a
> few weeks ago. I'm open for any and all questions.

I have some questions ;)

[...]

> Methods and dynamic dispatch:
>
> * Java based methods are generally bound to names using
> bytecode-generated adapter classes that directly invoke. This showed a
> substantial gain over e.g. reflection. We tried a number of other
> binding mechanisms, and this was easiest to maintain and gave the best
> performance.

isn't that already some kind of inline cache? I mean in the wider sense
of the word maybe

> * There is a per-class method cache with a global flushing
> mechanism. In experiments, adding an inline cache has not shown any
> gains over a per-class cache, and since we control the class data
> structure we've had no motivation to use an inline cache.

have you done experiments with polymorphic inline caches too?

[...]

> Pain:
>
> * When classes are generated at runtime, they are each wrapped with
> a classloader to allow them to be garbage collected. Super gross, but no
> other way to safely generate arbitrary numbers of classes without
> exceeding permgen. We need a "dynamic method" or "anonymous method"
> concept, or even a super lightweight class/classloader that would
> support the same semantics.

how about making one classloader for your generated classes per
classloader of the class you create from. For example if class A and B
are from the same class loader, and you have generated one class for A,
then reuse the same classloader for B. Groovy is doing that for the
Reflector.

[...]

> * The other big performance pain is representing scope as an
> external data structure rather than as local variables. This is
> primarily to support closures and eval, which need to capture variables
> in scope. I'm not fond of Groovy's mechanism for this, which copies
> variables in and out. But our mechanism does mean more object overhead
> per-call.

Groovy does not copy in and out... maybe it can be best explained with
an example in mostly Java:

int i = 0;
Runable r = new Runable() {
public void run() {
i++;
}
}
r.run();
System.out.println(i);

of course this is illegal in Java and so you would do for example:

final int[] i = new int[]{0};
Runable r = new Runable() {
public void run() {
i[0]++;
}
}
r.run()
System.out.println(i[0]);

which is legal again. Groovy is using something comparable to this. We
don't use arrays, but we use a special class. We can do this in Groovy
because local variables are lexically scoped. So we always now for what
variables we have to do something like this and keep that to a minimum.

bye blackdrag

--
Jochen "blackdrag" Theodorou
Groovy Tech Lead (http://groovy.codehaus.org)
http://blackdragsview.blogspot.com/

Charles Oliver Nutter

unread,

May 28, 2007, 3:54:07 PM5/28/07

to jvm-la...@googlegroups.com

Jochen Theodorou wrote:
> isn't that already some kind of inline cache? I mean in the wider sense
> of the word maybe

It's not an inline cache, because it doesn't happen at the call
site...it is a per-class method cache. In general, this obviates the
need for an inline cache, but of course it only works in cases where you
can design classes to do the caching (and it is potentially less
memory-efficient than a good inline cache).

> have you done experiments with polymorphic inline caches too?

I have, and they also did not show significant gains, largely because
the existing cache is already per-class (and therefore polymorphic).

> how about making one classloader for your generated classes per
> classloader of the class you create from. For example if class A and B
> are from the same class loader, and you have generated one class for A,
> then reuse the same classloader for B. Groovy is doing that for the
> Reflector.

Well, the problem with that is that you can add or redefine methods on a
given class endlessly, so if there's a single classloader for the whole
class it would leak. See the following code (contrived, but illustrative):

while true
def hello; end
end

If each redefinition of hello gets compiled and loaded into the same
classloader, it would eventually use up the permgen space. So each such
redefinition must be done in its own classloader.

Can Groovy redefine methods in this way? If so, I'd expect it to run
into the same limitation.

> Groovy does not copy in and out... maybe it can be best explained with
> an example in mostly Java:

Ok, I must have misunderstood you in the past. JRuby uses a similar
structure to represent variable scopes.

Now I'm a bit confused about your statement that Groovy is lexically
scoped. As far as I understand it, Groovy only has two scopes, right? So
for example, I could do this:

doSomething {
a = 1
doSomething {
b = 2
}
println b
}

...since all such variables live in a binding scope (or something...I'm
still not entirely clear on the scoping mechanisms or terminology for
groovy.

- Charlie

Jochen Theodorou

unread,

May 28, 2007, 5:55:17 PM5/28/07

to jvm-la...@googlegroups.com

Charles Oliver Nutter schrieb:

> Jochen Theodorou wrote:
>> isn't that already some kind of inline cache? I mean in the wider sense
>> of the word maybe
>
> It's not an inline cache, because it doesn't happen at the call
> site...it is a per-class method cache. In general, this obviates the
> need for an inline cache, but of course it only works in cases where you
> can design classes to do the caching (and it is potentially less
> memory-efficient than a good inline cache).

ah ok, I misread something

[...]

>> how about making one classloader for your generated classes per
>> classloader of the class you create from. For example if class A and B
>> are from the same class loader, and you have generated one class for A,
>> then reuse the same classloader for B. Groovy is doing that for the
>> Reflector.
>
> Well, the problem with that is that you can add or redefine methods on a
> given class endlessly, so if there's a single classloader for the whole
> class it would leak. See the following code (contrived, but illustrative):
>
> while true
> def hello; end
> end
>
> If each redefinition of hello gets compiled and loaded into the same
> classloader, it would eventually use up the permgen space. So each such
> redefinition must be done in its own classloader.

I see... but you could still group such classes together.. I think it
must be tested, but 10-100 classes should be possible

> Can Groovy redefine methods in this way? If so, I'd expect it to run
> into the same limitation.

that's not so easy to answer. See the Reflector like a shadow, then we
define one Reflector per class, but only on the class we make a method
call on. So defining a big number of classes does not mean to get a big
number of Reflectors. The only way to replace a method is by defining a
new class through the user and we don't try to prevent a permgen problem
here, since the user is responsible for for using a certain class loader
to define that class. All in all, it is no question that does really fit
into the Groovy runtime system...

>> Groovy does not copy in and out... maybe it can be best explained with
>> an example in mostly Java:
>
> Ok, I must have misunderstood you in the past. JRuby uses a similar
> structure to represent variable scopes.
>
> Now I'm a bit confused about your statement that Groovy is lexically
> scoped.

nono, local variables are. Groovy is not in general lexcially scoped.

> A As far as I understand it, Groovy only has two scopes, right? So

> for example, I could do this:
>
> doSomething {
> a = 1
> doSomething {
> b = 2
> }
> println b
> }

yes, but you are not using local variables here. Change it to

doSomething {
def a = 1
doSomething {
def b = 2
}
println b
}

and a,b are becoming local variables, which means the program does still
compile, but println b would not refer to the locally variable b, but to
to a (possibly unknown) property b. And this:

doSomething {
def a = 1
doSomething {
def a = 2
def b = 2
}
println b
}

won't compile, while here:

doSomething {
def a = 1
doSomething {
def b = 2
}
def b = 42
println b
}

println b would refer to the b declared the line before. Just think it
is Java and you don't see what members your class does have.

> ...since all such variables live in a binding scope (or something...I'm
> still not entirely clear on the scoping mechanisms or terminology for
> groovy.

everything that is no local variable is more or less in the binding
scope, only that part of that scope is defined by the class with fields
and properties. Maybe you could say there is a lexical scope for local
variables and a runtime scope for fields/properties and anything else.
Not sure what the correct name for such a scope is.

Reply all

Reply to author

Forward

0 new messages