Making use of RoboVM's LLVM Java frontend

Nate Deisinger

Sep 20, 2013, 7:04:03 PM

to rob...@googlegroups.com

A quick introduction - I'm an undergrad at the University of Wisconsin-Madison, working with the programming languages group there. One of the things we've been developing is a lightweight method for collecting data to aid in debugging after a crash, by storing information about call coverage and path tracing during execution (see Ohmann, 2013). This data is gathered by transforming LLVM IR through an LLVM opt plugin and then compiling and running the newly annotated program.

I'm currently investigating how we can apply this work to languages beyond C and C++, and as such I've been searching for a good Java-to-LLVM frontend. I've downloaded the RoboVM source and played around with it, and I am very impressed with its ability to create LLVM bitcode from Java bytecode. Right now I am looking into whether it is possible to integrate our custom pass into RoboVM's compilation pipeline alongside the other LLVM passes performed on the code (optimizations, etc.). I'm still familiarizing myself with the code, so if any obvious roadblocks I'm likely to hit come to mind, any advice is appreciated.

There is one feature that is missing from RoboVM that we need for our plugin to work correctly, and that is debug information. We don't need full symbols and information about local variables, but rather just source lines and function names. I see there's already discussion on debugging going on, but I just wanted to say that it's a feature that would really help us. (I'll bring this up in the debug thread as well.)

This is really great work you've done, though, and it's been immensely helpful to find a good Java-to-LLVM frontend.

Edu García

Sep 20, 2013, 7:09:08 PM

to rob...@googlegroups.com

Hello Nate, welcome to the forums.

I'm sure Niklas will chime in soon, but in the meantime, if you're going inside the guts of RoboVM, I recommend taking a look at the Soot intermediate representation that RoboVM uses; it might be helpful: http://www.sable.mcgill.ca/soot/

Niklas Therning

Sep 30, 2013, 9:22:45 AM

to Edu García, rob...@googlegroups.com

Hi Nate,

Sounds like a cool tool you're working on! Adding the kind of basic debug info you need should be close to trivial. If you want to give it a try, I'll do what I can to help you get started.

--
You received this message because you are subscribed to the Google Groups "RoboVM" group.
To unsubscribe from this group and stop receiving emails from it, send an email to robovm+un...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Nate Deisinger

Sep 30, 2013, 3:58:22 PM

to rob...@googlegroups.com, Edu García

That would be excellent - I've not had much chance to hack around with RoboVM since my last message due to other commitments, but it's encouraging to hear that it shouldn't be difficult.

I know there's some ongoing work on debugging, but as that thread seems to have halted for the moment, I take it you think it'd be simpler to implement this support on its own, separate from the main debug work? In that case, any ideas you have on how best to implement it would be appreciated. I see that Soot already has good support for maintaining line numbers and method names, so I'm guessing it's a question of mapping Soot's IR to LLVM debug information as part of constructing our LLVM bitcode?

Thanks,
Nate

Nate Deisinger

Oct 7, 2013, 3:48:39 PM

to rob...@googlegroups.com, Edu García

As an update - I've been making some adjustments to RoboVM to add very basic debug support. If you are familiar with Clang's -gline-tables-only flag, the debug output I'm hoping to export is similar to that, but without dealing with lexical blocks. In other words, the debug info that is output (in standard LLVM/DWARF3 format) will consist of the main debug block, the list of subprograms (functions), blocks for each function, and blocks for line information which point back to the function blocks.
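The layout described above (one main debug block, a subprogram per function, and line entries that point back to their function) can be sketched as a small Java model. This is only an illustration under several assumptions: the class `LineTableSketch` and its methods are hypothetical, the output uses the newer textual metadata syntax (`!DIFile`, `!DISubprogram`, `!DILocation`) rather than the numeric tuples LLVM used at the time of this thread, and the field lists are abbreviated, so the result is not meant to pass LLVM's verifier.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative model of line-tables-only debug info; not RoboVM or LLVM API. */
class LineTableSketch {
    private final List<String> nodes = new ArrayList<>();
    private final String fileRef;
    private final String unitRef;

    /** Creates the file node and the compile unit (the "main debug block"). */
    LineTableSketch(String fileName, String directory) {
        fileRef = add("!DIFile(filename: \"" + fileName + "\", directory: \"" + directory + "\")");
        unitRef = add("distinct !DICompileUnit(language: DW_LANG_Java, file: " + fileRef
                + ", emissionKind: LineTablesOnly)");
    }

    /** Appends a metadata node and returns its reference, e.g. "!2". */
    private String add(String body) {
        nodes.add(body);
        return "!" + (nodes.size() - 1);
    }

    /** One subprogram node per function; returns the reference used as a scope. */
    String subprogram(String name, int line) {
        return add("distinct !DISubprogram(name: \"" + name + "\", file: " + fileRef
                + ", line: " + line + ", unit: " + unitRef + ")");
    }

    /** One location node per source line, pointing back at its function. */
    String location(int line, String scopeRef) {
        return add("!DILocation(line: " + line + ", scope: " + scopeRef + ")");
    }

    /** Renders all nodes as numbered metadata definitions, one per line. */
    String render() {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < nodes.size(); i++) {
            sb.append('!').append(i).append(" = ").append(nodes.get(i)).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        LineTableSketch info = new LineTableSketch("Hello.java", ".");
        String mainFn = info.subprogram("main", 3);
        info.location(4, mainFn);
        info.location(5, mainFn);
        System.out.print(info.render());
    }
}
```

Note that there are no lexical-block nodes: every location uses the function itself as its scope, which matches the simplification described in the post.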

One thing I've noticed is that having RoboVM print to console during compilation (for debug purposes) seems to cause the run to be flagged as failing, and partway through the process I get kicked back to the command line and the usage information is printed at me. Is there any way to suppress this behavior?

Nate Deisinger

Oct 7, 2013, 4:38:13 PM

to rob...@googlegroups.com, Edu García

...And never mind, I was missing an exception that was being thrown and had forgotten to turn on the -verbose flag, haha.

While I'm not certain how applicable my work will be to full debug support, as I'm only putting together this very limited subset, I will make it available once it's up and running.

Mario Zechner

Oct 8, 2013, 11:36:28 AM

to rob...@googlegroups.com

I'd be very interested in checking out your fork. Maybe I can contribute a few little things.

Niklas Therning

Oct 14, 2013, 2:25:49 AM

to Mario Zechner, rob...@googlegroups.com

Nate, do you have a fork for this?

Nate Deisinger

Oct 15, 2013, 4:15:52 PM

to rob...@googlegroups.com, Mario Zechner

Hi Niklas,

I had actually just been working locally on my machine without setting up a dedicated fork, but I've now set one up on GitHub.

I am close to having debug support for the features I mentioned above implemented; I just need to do some refactoring of my code and bug fixes. Once I have that done (should be by EOD tomorrow) I'll post my fork here along with a summary of what is and isn't working and why.

-Nate

Nate Deisinger

Oct 16, 2013, 11:48:22 PM

to rob...@googlegroups.com, Mario Zechner

A quick update - other obligations kept me from getting everything cleaned up enough, but the debug information is up and running. An executable compiled in my fork, when run in GDB, can give source line information (and if you put the source in the directory alongside it, GDB can look it up appropriately). I will try to get my code cleaned up and post my fork with annotations tomorrow; if I'm not able to do that, I'll just post the fork with what information I can provide, but it'll probably look a bit messy. Sorry again for the delay.

Nate Deisinger

Oct 17, 2013, 6:28:34 PM

to rob...@googlegroups.com, Mario Zechner

The fork is available at https://github.com/ndeisinger/robovm. Some things to note:

This debug information is only designed to provide line number/source file information, not variables or other in-depth debug references. This information is gathered by turning on the -keep-line-number flag in Soot.
One major issue relating to the above: Soot does not seem to save the absolute path of a Java source, because the Java class file itself doesn't save it, even when compiling with -g. This will need more investigation. For now we set the directory to just be the local directory (hardcoded as Linux's "."), so when testing an executable in GDB, put the source files alongside the executable.
On a more minor note, Soot doesn't seem to save the line on which a function is defined, which the LLVM debug info expects. I fake it by getting the line number of the first statement in the function, but that's not quite right either.
At present, it generates line information for the standard Java libraries as well. In the future we might want to remove this.
Debug information is overall managed by a DebugManager class, which statically provides LLVM debug references.
For each class we compile, there is a corresponding DebugClass which contains appropriate FunctionDebugStatements and LineDebugStatements. This is also managed through DebugManager.
The fork is set up so debug information is always on. There is a toggle in DebugManager which would prevent it from returning debug references, but it will need a little cleanup and refactoring to become a viable on/off switch. For now it's only disabled when linking classes, so that we don't try to generate debug info for the linkage functions we make.
I forgot to get set up with the proper LLVM formatting templates before I started coding, so the code is missing licenses/etc. Once it's been improved on I'll go back and fix that and other formatting issues.
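A rough sketch of how the bookkeeping in the notes above might fit together. Only the names DebugManager and DebugClass come from the post; every field, method, and signature below is a guess made for illustration and is not the fork's actual code. It shows a static registry of per-class records, a toggle that suppresses recording (as during linking), and the first-statement fallback used in place of a function's declaration line.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical reconstruction; only the class names come from the post. */
class DebugManagerSketch {
    /** Per-class record: maps each function name to its approximate line. */
    static final class DebugClass {
        final String className;
        final Map<String, Integer> functionLines = new LinkedHashMap<>();

        DebugClass(String className) {
            this.className = className;
        }
    }

    private static final Map<String, DebugClass> CLASSES = new LinkedHashMap<>();
    private static boolean enabled = true;

    /** The on/off toggle; assumed to be a simple static flag. */
    static void setEnabled(boolean on) {
        enabled = on;
    }

    /** Returns the per-class record, or null while debug output is off. */
    static DebugClass forClass(String className) {
        if (!enabled) {
            return null;
        }
        return CLASSES.computeIfAbsent(className, DebugClass::new);
    }

    /** Stand-in for the missing declaration line: first statement with a known line. */
    static int approximateDeclarationLine(List<Integer> statementLines) {
        for (int line : statementLines) {
            if (line > 0) {
                return line;
            }
        }
        return 0; // no line information available
    }

    /** Records a function unless debug output is currently switched off. */
    static void recordFunction(String className, String function, List<Integer> statementLines) {
        DebugClass debugClass = forClass(className);
        if (debugClass != null) {
            debugClass.functionLines.put(function, approximateDeclarationLine(statementLines));
        }
    }

    public static void main(String[] args) {
        recordFunction("Hello", "main", Arrays.asList(-1, 4, 5));
        setEnabled(false); // e.g. while emitting linkage functions
        recordFunction("Hello", "linkerStub", Arrays.asList(9));
        setEnabled(true);
        System.out.println(forClass("Hello").functionLines); // prints {main=4}
    }
}
```

Because the fallback takes the first statement's line, the reported line is generally later than the line where the method is actually declared, which is the inaccuracy the post mentions.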

It isn't a perfect solution - while stepping in GDB you'll probably find yourself frequently seeing the end bracket of a function for some reason - but the meat of it seems to work. Let me know if you have any questions.

-Nate

Nate Deisinger

Dec 2, 2013, 4:41:27 PM

to rob...@googlegroups.com, Mario Zechner

Pinging to ask if anyone has taken a look at this. In the end my group decided on a different method for handling Java, but hopefully the fork described above will still be useful.

Shane Clark

Oct 10, 2014, 11:41:35 AM

to rob...@googlegroups.com, badlog...@gmail.com

I assume that lack of response means that no one picked this up, but I thought I'd give it one more try.

I have not looked at the details yet, but I am interested in doing something similar. I want to apply tools like KLEE that operate on LLVM IR to programs I receive as Java bytecode. It looks like RoboVM already implements all of the necessary conversion steps for this.

Niklas Therning

Oct 28, 2014, 2:30:27 AM

to Shane Clark, rob...@googlegroups.com, Mario

Hi Shane,

It's not designed to be used this way so you will have to hack it up yourself. The point where we take the bitcode we generate and feed it into LLVM is in ClassCompiler.compile(). That's a good place to start.
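One low-tech way to experiment from that hook point, sketched under assumptions the thread does not confirm: suppose the bitcode produced there has been written to a file, and an external LLVM `opt` binary is used to run a plugin pass over it. The class `OptInvocationSketch`, the plugin path, the pass name, and the file names are all hypothetical. The `opt -load <plugin> -<pass>` form is the legacy pass-manager syntax from LLVM releases of that era; newer releases use different flags.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper; not part of RoboVM. */
class OptInvocationSketch {
    /** Builds: opt -load <plugin> -<pass> <input> -o <output> */
    static List<String> optCommand(String plugin, String passName, String input, String output) {
        return new ArrayList<>(Arrays.asList(
                "opt", "-load", plugin, "-" + passName, input, "-o", output));
    }

    public static void main(String[] args) {
        // All four arguments are placeholder names for illustration.
        List<String> cmd = optCommand("./libCoverage.so", "coverage-pass",
                "Hello.bc", "Hello.inst.bc");
        System.out.println(String.join(" ", cmd));
        // To actually run it (needs opt on PATH and a real plugin), one could use:
        // new ProcessBuilder(cmd).inheritIO().start().waitFor();
    }
}
```

Keeping the command construction separate from process execution makes it easy to log or inspect the invocation before anything is run.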

Shane Clark

Nov 3, 2014, 10:20:05 AM

to Niklas Therning, rob...@googlegroups.com, Mario

Hi Niklas,

Thank you very much for the pointer. I understand that this is not the intended use, but I don't want to reinvent anything unnecessarily.
