JRuby and Debugging Support

Rocky Bernstein

Dec 22, 2010, 9:57:33 AM

to Ruby Debugger, Charles Oliver Nutter

The following information, which I am grateful for, is from Charles Oliver Nutter in private email.

Since I think it is way too valuable to keep private, I am posting it here. I'll probably also add it to a wiki somewhere. I've edited it slightly to try to make it flow as one continuous piece of information. (I don't think I was completely successful, so I apologize in advance for any mistakes and misrepresentations I've made in the process.)

Originally some of this was in back-and-forth comments.

set_trace_func is available in JRuby, but you have to turn on a flag (--debug) for it to receive all events.

The main problem I have with set_trace_func is that it passes a binding along. The *VAST* majority of overhead from using set_trace_func is because of that binding, and it's because of the binding (and to a lesser extent the constant pinging for an installed event hook) that we require a flag to enable full tracing.
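For reference, here is a minimal sketch of the set_trace_func callback being discussed. The method name `greet` and the `events` array are made up for illustration. Note that the binding is handed to the callback on every event whether or not the callback uses it. On JRuby, the script would need to be started with the --debug flag mentioned above (for example `jruby --debug script.rb`) to see all events.

```ruby
# Collect trace events instead of printing inside the hook.
events = []

# The callback receives six arguments; the fifth is the binding that
# makes this API expensive. This tracer never looks at it.
tracer = proc do |event, file, line, id, binding, klass|
  events << [event, line, id]
end

# Hypothetical method, present only so there is something to trace.
def greet(name)
  "hello, #{name}"
end

set_trace_func(tracer)
greet("world")
set_trace_func(nil) # passing nil removes the trace function

events.each do |event, line, id|
  puts "#{event.ljust(8)} line #{line} #{id}"
end
```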

It would probably be possible to reduce static-only tracing to the point where it could be on all the time, but being able to pull off arbitrary bindings for set_trace_func is almost a nonstarter.

Me: How about a kind of set_trace_func that passes no parameters, not even the static ones like the class of the method that the frame is in? If someone wants information, they issue a call to get it. If no binding is needed, none is created. This also, to some extent, pushes the burden of the slowdown back where it belongs: on the programmer/program that is requesting such stuff.

If you look at the examples in the Pickaxe book, I think you'll find that it contains no use of binding. For static POSIX set -x -like tracing you don't need a binding. There are lots of profiling and tracing use cases where you don't need a binding created. In light of this, it is a pity that right now they should have to take a hit for information they don't want or need. And even though it may not be prohibitive to get static information like position information, why bother if it's not wanted?
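As an illustration only: the TracePoint API, which was added to Ruby in version 2.0 (after this thread was written), works roughly along the lines proposed here. The hook receives a single object and asks it for whatever it needs. The sketch below does `set -x`-style line tracing and never asks for a binding. The method `add` and the `lines` array are made up for the example.

```ruby
# Record "file:line" for each executed line; no binding is requested.
lines = []

trace = TracePoint.new(:line) do |tp|
  # Only static position information is pulled from the trace point.
  lines << "#{File.basename(tp.path)}:#{tp.lineno}"
end

# Hypothetical method, present only so there is something to trace.
def add(a, b)
  sum = a + b
  sum
end

# Tracing is active only for the duration of the block.
trace.enable { add(1, 2) }

lines.each { |entry| puts "+ #{entry}" }
```

A hook that does want local variables can call `tp.binding` inside the block, so the cost falls on the code that asks for it.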

The JVM allows debugging a previously "non-debug" JVM instance. When a debugger attaches to a running process, loaded code is rewritten to include debugging logic throughout. Detaching returns code to the original non-debug footprint.

The rewriting generally happens at a JVM bytecode level, allowing for the code to still JIT compile, and so you can still run "optimized" code, albeit hindered by debugging overhead.

JRuby is a mixed-mode environment, like Rubinius, which means we have an interpreted phase before we "JIT" code into JVM bytecode. Code that goes to JVM bytecode can in theory be debugged exactly like any other JVM code, since it will get rewritten and be step-debuggable using the same mechanisms. Code that is still interpreted, however, can't really enlist in that process because the JVM knows nothing about our execution model.

Interpreted code is where set_trace_func (and the lower-overhead native "EventHook" logic which we duplicated in JRuby) makes more sense. When tracing is enabled, we emit trace function calls for both interpreted and JIT'ed code, allowing you to debug an application that's still JIT'ing and optimizing as it did before. The ruby-debug we ship is actually a "native" (i.e. Java-based) JRuby extension that mimics what the C version does, event hooks and all.

There may also be a possible marriage of the two worlds, where we present a Java debugging API that knows about both interpreted and compiled Ruby code.

The debugging support for the JVM is better than on most native runtimes, and other runtimes might do well to copy it.

Using standard JVM debuggers, I'm sure it would be possible to debug JITed JRuby code, since it's all just JVM bytecode then. Getting lower than that would require native code debuggers, and then you have the same problems we have in JRuby: combining debug logic for both unoptimized and optimized code.

When we compile JRuby, we don't keep track of optimizations performed in the current compiler. But in general we don't throw away the AST, since that's our master blueprint and we may want to deoptimize and return to interpreting for a while. There are obviously even more challenges for debugging optimized code if it may suddenly branch back into interpreted code. The event hook handles this case nicely, since it just requires that executing code periodically send pings, which it can do whether optimized or interpreted.

Charles Oliver Nutter

Dec 22, 2010, 5:46:26 PM

to Rocky Bernstein, Ruby Debugger

On Wed, Dec 22, 2010 at 8:57 AM, Rocky Bernstein
<rocky.b...@gmail.com> wrote:
> Me: How about a kind of set_trace_func that passes no parameters, not even
> the static ones like the class of the method that the frame is in? If
> someone wants information, they issue a call to get it. If no binding is
> needed, none is created. This also, to some extent, pushes the burden of
> the slowdown back where it belongs: on the programmer/program that is
> requesting such stuff.

That's largely how the internal "event hook" stuff in MRI works today.
set_trace_func passes your function a binding, but that's done only
for set_trace_func's API; if you implement a "native" hook, as in
ruby-debug and ruby-prof, your hook function only gets the mostly
static bits. At that point, you can make additional calls to retrieve
a full binding (if one is available).

My primary concern in any debugging API going forward is that it
remains mindful of different implementations' in-memory representations,
code lifecycle, and optimization goals. It's not possible to do all
optimizations all the time AND have full debugging support, which is
why JRuby requires passing a startup flag in most cases to get full
"event hook" and set_trace_func behaviors (since constantly pinging
for a hook adds overhead).

There's nothing that any other impl does we can't do in JRuby, as far
as debugging support goes; but if it's too invasive, we won't support
it during normal execution.

> If you look at the examples in the Pickaxe book, I think you'll find that it
> contains no use of binding. For static POSIX set -x -like tracing you don't
> need a binding. There are lots of profiling and tracing use cases where you
> don't need a binding created. In light of this, it is a pity that right now
> they should have to take a hit for information they don't want or need. And
> even though it may not be prohibitive to get static information like
> position information, why bother if it's not wanted?

JRuby recently added built-in profiling support that simply wraps the
method lookup chain with "profiling" method lookup logic. Basically,
all method lookups now return that method wrapped with a profiling
aware wrapper. This makes it theoretically possible for us to turn on
profiling at runtime (if we flush all call sites in the system), but
more interestingly it points out the perils of expecting profiling or
debugging to work against optimized code. Any optimizing runtime will
produce different optimized code when profiling or debugging is
present, since it necessarily becomes part of the application's
execution profile.
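A rough Ruby-level sketch of the wrapping idea described above follows. It is not JRuby's actual implementation, which lives inside the runtime; `ProfilingWrapper`, `CALLS` and `Greeter` are hypothetical names. It replaces a method with a wrapper that counts and times calls, then delegates to the original.

```ruby
# Hypothetical helper; an analogy for a "profiling-aware wrapper",
# not JRuby's internal code.
module ProfilingWrapper
  # Per-method statistics, keyed by [class, method name].
  CALLS = Hash.new { |hash, key| hash[key] = { count: 0, seconds: 0.0 } }

  def self.wrap(klass, name)
    original = klass.instance_method(name)
    klass.send(:define_method, name) do |*args, &block|
      started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
      begin
        # Delegate to the unwrapped method on the same receiver.
        original.bind(self).call(*args, &block)
      ensure
        stats = CALLS[[klass, name]]
        stats[:count] += 1
        stats[:seconds] += Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
      end
    end
  end
end

class Greeter
  def greet(name)
    "hello, #{name}"
  end
end

ProfilingWrapper.wrap(Greeter, :greet)
3.times { Greeter.new.greet("world") }

stats = ProfilingWrapper::CALLS[[Greeter, :greet]]
puts "Greeter#greet called #{stats[:count]} times"
```

Even in this toy version every call now goes through an extra block invocation and two clock reads, which is the sense in which the profiler becomes part of the application's execution profile.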

- Charlie

Rocky Bernstein

Dec 22, 2010, 11:01:28 PM

to Ruby Debugger

Comments in line.

On Wed, Dec 22, 2010 at 5:46 PM, Charles Oliver Nutter <hea...@headius.com> wrote:

> On Wed, Dec 22, 2010 at 8:57 AM, Rocky Bernstein
> <rocky.b...@gmail.com> wrote:
> > Me: How about a kind of set_trace_func that passes no parameters, not even
> > the static ones like the class of the method that the frame is in? If
> > someone wants information, they issue a call to get it. If no binding is
> > needed, none is created. This also, to some extent, pushes the burden of
> > the slowdown back where it belongs: on the programmer/program that is
> > requesting such stuff.
>
> That's largely how the internal "event hook" stuff in MRI works today.
> set_trace_func passes your function a binding, but that's done only
> for set_trace_func's API; if you implement a "native" hook, as in
> ruby-debug and ruby-prof, your hook function only gets the mostly
> static bits. At that point, you can make additional calls to retrieve
> a full binding (if one is available).

Perhaps I'm repeating the above. ruby-debug for MRI Ruby 1.8 and for YARV 1.9 doesn't call set_trace_func but instead registers itself with rb_add_event_hook(). A callback registered this way isn't passed a binding object, a file name, or a line number, as a set_trace_func callback is.

But it does pass back unneeded parameters, namely the last two: "mid", which is some sort of Ruby id, presumably for a method, and "klass", the object that the method is in, if that's meaningful. These can be derived from other parameters when needed, and for callbacks from some kinds of events they are irrelevant. So this is already a little more work than necessary. And from an API standpoint, having slots for these unused parameters in "raise", "end" and "return" events is ugly. Right now, YARV passes 0 in these slots for these event types.

And this might facilitate Heisenbugs. (But so does having to use --debug.) There are no silver bullets and universal solutions. That's why I prefer to give programmers options, even at the expense of complexity.

I realize, though, that I may hold the minority opinion; the majority may not be prepared to handle the additional complexity and will therefore give up or look for other ways to figure out what's wrong.