Bytecode Gen Suggestions

Matt Fowles

Apr 7, 2008, 6:02:51 PM
to jvm-la...@googlegroups.com
All~

I am trying to find a reasonable path towards direct bytecode
generation. My current setup has a hand rolled java dom that I then
serialize to java code which I compile using javac. I have found that
I can replace my java dom with the jdt.core one and still do the
serialize/compile trick. But, I would really like to be able to skip
the serialize step and go directly from some AST to a .class file.

This approach is fairly nice in that it allows me to inspect the
generated code easily and find bugs that way. Thus I would like to
maintain the ability to generate the java code, even if I only use it
for debugging.
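
A minimal sketch of the "hand-rolled DOM serialized to Java source" idea described above. Every name here (JavaDomSketch, Expr, Binary, ClassDecl, and so on) is hypothetical and chosen for illustration; it is not the poster's actual DOM, and it only handles static int methods with a single returned expression.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical names throughout; an illustration, not the poster's real DOM.
class JavaDomSketch {
    /** Every expression node can print itself as Java source. */
    interface Expr { String toJava(); }

    static final class IntLit implements Expr {
        final int value;
        IntLit(int value) { this.value = value; }
        public String toJava() { return Integer.toString(value); }
    }

    static final class Name implements Expr {
        final String id;
        Name(String id) { this.id = id; }
        public String toJava() { return id; }
    }

    static final class Binary implements Expr {
        final String op; final Expr left; final Expr right;
        Binary(String op, Expr left, Expr right) {
            this.op = op; this.left = left; this.right = right;
        }
        // Always parenthesize so precedence never has to be tracked.
        public String toJava() {
            return "(" + left.toJava() + " " + op + " " + right.toJava() + ")";
        }
    }

    /** A static int method whose body is a single returned expression. */
    static final class Method {
        final String name; final List<String> params; final Expr body;
        Method(String name, List<String> params, Expr body) {
            this.name = name; this.params = params; this.body = body;
        }
        String toJava() {
            StringBuilder sb = new StringBuilder();
            sb.append("    public static int ").append(name).append("(");
            for (int i = 0; i < params.size(); i++) {
                if (i > 0) sb.append(", ");
                sb.append("int ").append(params.get(i));
            }
            sb.append(") {\n        return ").append(body.toJava()).append(";\n    }\n");
            return sb.toString();
        }
    }

    static final class ClassDecl {
        final String name; final List<Method> methods = new ArrayList<>();
        ClassDecl(String name) { this.name = name; }
        String toJava() {
            StringBuilder sb = new StringBuilder("public class " + name + " {\n");
            for (Method m : methods) sb.append(m.toJava());
            return sb.append("}\n").toString();
        }
    }

    /** Builds a one-method class computing x + 10 and serializes it. */
    static String example() {
        ClassDecl cls = new ClassDecl("Generated");
        cls.methods.add(new Method("foo", Arrays.asList("x"),
                new Binary("+", new Name("x"), new IntLit(10))));
        return cls.toJava();
    }

    public static void main(String[] args) {
        System.out.print(example());
    }
}
```

The string returned by example() is what would be handed to javac in the serialize/compile setup, and it doubles as the human-readable view used for debugging.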

I looked into jdt.core.dom which is the AST that eclipse uses when
refactoring code, but couldn't figure out a way to make it emit
bytecode. Does anyone know of any such targetable ASTs? Alternatively,
how stable is the internal AST used by ECJ or javac?

Thanks,
Matt

Norris Boyd

Apr 7, 2008, 8:38:42 PM
to JVM Languages
The Rhino bytecode generator can be seen at
http://lxr.mozilla.org/mozilla/source/js/rhino/src/org/mozilla/classfile/.
It doesn't support writing out Java source, however, and is fairly low
level.
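
To give a feel for what "fairly low level" means for any direct class-file writer, here is a sketch that emits a complete class file by hand using only the JDK. It does not use Rhino's org.mozilla.classfile API; the class name HandRolledClassFile, the generated class Answer, and its method answer() are all made up for illustration. The layout follows the class file format in the JVM specification, with class file version 49 (Java 5) chosen as an assumption so that no StackMapTable is involved.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical example; not Rhino's API. Emits the equivalent of:
//   public class Answer { public static int answer() { return 42; } }
class HandRolledClassFile {
    static byte[] emit() {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        try {
            out.writeInt(0xCAFEBABE);          // magic
            out.writeShort(0);                 // minor version
            out.writeShort(49);                // major version (Java 5)
            out.writeShort(8);                 // constant pool count = entries + 1
            out.writeByte(1); out.writeUTF("Answer");            // #1 Utf8
            out.writeByte(7); out.writeShort(1);                 // #2 Class #1
            out.writeByte(1); out.writeUTF("java/lang/Object");  // #3 Utf8
            out.writeByte(7); out.writeShort(3);                 // #4 Class #3
            out.writeByte(1); out.writeUTF("answer");            // #5 Utf8
            out.writeByte(1); out.writeUTF("()I");               // #6 Utf8
            out.writeByte(1); out.writeUTF("Code");              // #7 Utf8
            out.writeShort(0x0021);            // ACC_PUBLIC | ACC_SUPER
            out.writeShort(2);                 // this_class
            out.writeShort(4);                 // super_class
            out.writeShort(0);                 // interfaces count
            out.writeShort(0);                 // fields count
            out.writeShort(1);                 // methods count
            out.writeShort(0x0009);            // ACC_PUBLIC | ACC_STATIC
            out.writeShort(5);                 // method name: "answer"
            out.writeShort(6);                 // descriptor: "()I"
            out.writeShort(1);                 // method attributes count
            out.writeShort(7);                 // attribute name: "Code"
            out.writeInt(15);                  // attribute length in bytes
            out.writeShort(1);                 // max_stack
            out.writeShort(0);                 // max_locals
            out.writeInt(3);                   // code length
            out.writeByte(0x10); out.writeByte(42);  // bipush 42
            out.writeByte(0xAC);                     // ireturn
            out.writeShort(0);                 // exception table length
            out.writeShort(0);                 // Code attributes count
            out.writeShort(0);                 // class attributes count
        } catch (IOException e) {
            throw new UncheckedIOException(e); // not expected for an in-memory stream
        }
        return bytes.toByteArray();
    }

    /** Exposes the protected defineClass so raw bytes can be loaded. */
    static final class ByteLoader extends ClassLoader {
        Class<?> define(String name, byte[] b) {
            return defineClass(name, b, 0, b.length);
        }
    }

    static int loadAndRun(byte[] classBytes) {
        try {
            Class<?> cls = new ByteLoader().define("Answer", classBytes);
            return (Integer) cls.getMethod("answer").invoke(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(loadAndRun(emit()));
    }
}
```

Constant pool indices, attribute lengths and stack depths all have to be tracked by hand at this level, which is the bookkeeping that bytecode libraries take over.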

--N

Charles Oliver Nutter

Apr 7, 2008, 8:50:37 PM
to jvm-la...@googlegroups.com
Matt Fowles wrote:
> All~
>
> I am trying to find a reasonable path towards direct bytecode
> generation. My current setup has a hand rolled java dom that I then
> serialize to java code which I compile using javac. I have found that
> I can replace my java dom with the jdt.core one and still do the
> serialize/compile trick. But, I would really like to be able to skip
> the serialize step and go directly from some AST to a .class file.

This, in my opinion, is one of the values of the DLR toolchain... given
an AST, properly transformed into a DLR AST, you get a compiler and such
on the other side.

I have also very much wanted a way to take an AST and shove it directly
into the standard Java compiler chain to produce output, rather than the
(really really gross, in my opinion) age-old hack of writing to a file.
My first suggestion would be your second proposed approach: to look at
javac, which is now GPL along with the rest of OpenJDK. But I've looked
into that code myself, and it's pretty crazy...so I'm not sure how easy
it would be to construct the AST yourself and feed it in. A good day
project would be to see if it's feasible, since the process for doing so
would be incredibly useful to others.

> This approach is fairly nice in that it allows me to inspect the
> generated code easily and find bugs that way. Thus I would like to
> maintain the ability to generate the java code, even if I only use it
> for debugging.

There's another benefit to the javac approach too...it's a damn good
compiler. So feeding it an AST saves you the trouble of writing a good
compiler yourself, and you don't have to serialize anything. Plus it
opens up all sorts of other possibilities like producing custom Java
ASTs by transforming other languages, allowing them to leverage javac
directly too.

FWIW, even if you did find a way to make the Eclipse compiler be used in
this way, I don't believe it's recommended as a production compiler (but
I could be wrong on this, I haven't used it in a long time).

- Charlie

Per Bothner

Apr 7, 2008, 9:25:00 PM
to jvm-la...@googlegroups.com

The gnu.expr AST used by Kawa is used for a number of different
languages: Scheme, Common Lisp, Emacs Lisp, XQuery, XSLT subset,
Nice, and possibly others. gnu.expr in turn is based on the
lower-level gnu.bytecode package. (Once it's posted, I'll add a
link to Elliott Hughes's blogging about his experience comparing
ASM and gnu.bytecode.)

http://www.gnu.org/software/kawa/internals/index.html
http://www.gnu.org/software/kawa/api/gnu/expr/package-summary.html
http://www.gnu.org/software/kawa/api/gnu/bytecode/package-summary.html

It generates very efficient code - Kawa Scheme with some judicious
type annotations may be the most efficient "scripting language"
(if defined as a language with eval and a repl) on the JVM.

It can't emit Java code, but it does have a pretty-printer
for the internal AST, which does intelligent indentation while
minimizing line-breaks:

$ kawa --debug-print-expr
#|kawa:1|# (define (foo x) (+ x 10))
[Module:atInteractiveLevel$1
(Module/atInteractiveLevel$1/1/
(Declarations: foo/15/fl:8c2::gnu.mapping.Procedure)
(Define line:1:1 /Declaration[foo/15]
(Lambda/foo/5/fl:0 line:1:9 (x/16/fl:40)
(Apply line:1:17 (Ref/6/Declaration[applyToArgs/1])
(Ref/4/Declaration[+/17])
(Ref/5/Declaration[x/16])
(Quote 10)))))]
#|kawa:2|#
--
--Per Bothner
p...@bothner.com http://per.bothner.com/

tpoi...@gmail.com

Apr 7, 2008, 9:40:19 PM
to JVM Languages


On Apr 7, 6:50 pm, Charles Oliver Nutter <charles.nut...@sun.com>
wrote:
> Matt Fowles wrote:
> > All~
>

> FWIW, even if you did find a way to make the Eclipse compiler be used in
> this way, I don't believe it's recommended as a production compiler (but
> I could be wrong on this, I haven't used it in a long time).

The Janino compiler is also worth a look for compiling Java on the fly.
It's used as the compiler for Jacl's TJC (Tcl to Java) compiler, and I
use it in a Jacl package that inlines Java code. Just pass it a string
of a class, and get back bytecodes, no messy files needed. Janino is
mostly Java 1.4 compatible.

http://www.janino.net/
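
For comparison, the same "string in, bytecode out, no files" pattern can be sketched with the JDK's own javax.tools API (available since Java 6) instead of Janino. This is a swapped-in technique, not Janino's API; InMemoryJavac, StringSource, ByteSink and the Greeter class are hypothetical names, and the sketch assumes it runs on a full JDK, where ToolProvider.getSystemJavaCompiler() returns a compiler rather than null.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.URI;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import javax.tools.FileObject;
import javax.tools.ForwardingJavaFileManager;
import javax.tools.JavaCompiler;
import javax.tools.JavaFileManager;
import javax.tools.JavaFileObject;
import javax.tools.SimpleJavaFileObject;
import javax.tools.StandardJavaFileManager;
import javax.tools.ToolProvider;

// Hypothetical helper; uses javax.tools, not Janino.
class InMemoryJavac {
    /** A source "file" whose content lives in a String. */
    static final class StringSource extends SimpleJavaFileObject {
        private final String code;
        StringSource(String className, String code) {
            super(URI.create("string:///" + className.replace('.', '/')
                    + Kind.SOURCE.extension), Kind.SOURCE);
            this.code = code;
        }
        @Override public CharSequence getCharContent(boolean ignoreEncodingErrors) {
            return code;
        }
    }

    /** A class "file" that collects the compiler's output in memory. */
    static final class ByteSink extends SimpleJavaFileObject {
        final ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        ByteSink(String className) {
            super(URI.create("mem:///" + className.replace('.', '/')
                    + Kind.CLASS.extension), Kind.CLASS);
        }
        @Override public OutputStream openOutputStream() { return bytes; }
    }

    /** Compiles one class from a string; returns class name -> bytecode. */
    static Map<String, byte[]> compile(String className, String source) {
        JavaCompiler javac = ToolProvider.getSystemJavaCompiler();
        if (javac == null) {
            throw new IllegalStateException("no system compiler; a JDK is required");
        }
        final Map<String, ByteSink> sinks = new HashMap<>();
        try (StandardJavaFileManager std = javac.getStandardFileManager(null, null, null)) {
            // Redirect every generated .class file into a ByteSink.
            JavaFileManager fm = new ForwardingJavaFileManager<StandardJavaFileManager>(std) {
                @Override public JavaFileObject getJavaFileForOutput(
                        JavaFileManager.Location location, String name,
                        JavaFileObject.Kind kind, FileObject sibling) {
                    ByteSink sink = new ByteSink(name);
                    sinks.put(name, sink);
                    return sink;
                }
            };
            boolean ok = javac.getTask(null, fm, null, null, null,
                    Collections.singletonList(new StringSource(className, source))).call();
            if (!ok) throw new IllegalArgumentException("compilation failed");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        Map<String, byte[]> result = new HashMap<>();
        for (Map.Entry<String, ByteSink> e : sinks.entrySet()) {
            result.put(e.getKey(), e.getValue().bytes.toByteArray());
        }
        return result;
    }

    public static void main(String[] args) {
        String src = "public class Greeter { public static int twice(int x) { return 2 * x; } }";
        byte[] b = compile("Greeter", src).get("Greeter");
        System.out.println("Greeter.class: " + b.length + " bytes");
    }
}
```

The returned byte arrays can be passed to a ClassLoader's defineClass, so nothing touches the file system. This still goes through source text, so it removes the temporary files but not the serialize step.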