Draft article on embedding LLVM assembly in Mython.

Jon

Nov 4, 2009, 7:15:45 PM

to Applied Language Technologies Group

Hi all,

I just wanted to quickly mention that I was inspired by a recent
article about implementing a language in LLVM. Riffing on this theme,
I've drafted an article about embedding LLVM assembly in Python and
Mython. If you have time and/or inclination, you can find my draft
article here:

http://log.jonriehl.com/?p=43

Any feedback would be appreciated.

Thanks,
-Jon

Wonseok Chae

Nov 5, 2009, 1:06:19 PM

to Applied Language Technologies Group

Hi Jon.

I enjoyed this draft and especially liked the discussion part,
partly because I was not familiar with LLVM assembly code. My
impression is that you have again convinced us of the importance of
error detection at compile time. Recently, I wrote a small *Python*
program to convert my address book to the specific format used by my
navigation system. The address book had more than 2,000 entries.
There were several loops, each for staged compilation. Many times, I
found bugs (typos or syntax errors) in the middle of the computation
and had to rerun from the beginning, which reminded me of the
advantage of static type systems.

At the same time, I felt a sense of betrayal at the performance
results. I believe one of the many reasons to embed low-level
languages is to gain high performance. Why didn't you try to embed
some computationally intensive code such as ray tracing or matrix
manipulation? Otherwise, embedding assembly seems less practical.
However, you also pointed out that there was a case where LLVM source
code could be generated. I was thinking about a staged scenario: in
the first stage, Mython takes high-level descriptions and generates
LLVM source code; in the next stage, Mython takes the LLVM source code
and generates bitcode at compile time; and in the next stage... It
sounds interesting, but I could not find any useful application for
it. I believe you have many.

It is good to know what you are up to. Keep working hard!

Thanks,
Wonseok

p.s. Right now, I am playing with JavaScript and PHP here
(http://ttic.uchicago.edu/~wchae/SmartForm/) and hope to return to the
"Visible compiler stuff" real soon.

Jon Riehl

Nov 6, 2009, 1:36:54 PM

to al...@googlegroups.com

Hi Wonseok, all,

On Thu, Nov 5, 2009 at 12:06 PM, Wonseok Chae <wsc...@gmail.com> wrote:
> I enjoyed this draft and especially liked the discussion part,
> partly because I was not familiar with LLVM assembly code. My
> impression is that you have again convinced us of the importance of
> error detection at compile time. Recently, I wrote a small *Python*
> program to convert my address book to the specific format used by my
> navigation system. The address book had more than 2,000 entries.
> There were several loops, each for staged compilation. Many times, I
> found bugs (typos or syntax errors) in the middle of the computation
> and had to rerun from the beginning, which reminded me of the
> advantage of static type systems.

I'm glad you liked it. The dynamic-language community often considers
the class of errors you mentioned to be the cost of getting something
working quickly. It seems you could do some spot checks of generated
code by compiling it and then marshaling (serializing) the bytecode,
but this won't catch namespace errors. I'll try to come up with an
example at some point (though I do something similar in the log
article using the MyFront front end to generate an AST from a string).
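
For illustration only (this is not the example from the log article),
here is a minimal sketch of that kind of spot check in plain Python. It
uses only the built-in compile() and the standard marshal module; the
helper name spot_check and the sample source strings are hypothetical,
made up for this sketch. It shows that a syntax error is caught before
anything runs, while a misspelled name still compiles and only fails
when the code is executed.

```python
import marshal


def spot_check(source, filename="<generated>"):
    """Compile generated source without running it.

    Returns the marshaled (serialized) code object, or None if the
    source has a syntax error. spot_check is a hypothetical helper.
    """
    try:
        code = compile(source, filename, "exec")
    except SyntaxError:
        return None
    return marshal.dumps(code)


# Hypothetical generated code: one good, one with a missing colon,
# and one with a misspelled variable name.
good = "def greet(name):\n    return 'hello ' + name\n"
bad_syntax = "def greet(name)\n    return 'hello ' + name\n"
bad_name = "def greet(name):\n    return 'hello ' + nmae\n"

# The good source round-trips through marshal and runs later.
namespace = {}
exec(marshal.loads(spot_check(good)), namespace)
greeting = namespace["greet"]("Jon")

# The syntax error is caught at compile time, before any work is done.
syntax_caught = spot_check(bad_syntax) is None

# The misspelled name compiles fine; it only fails when called.
name_blob = spot_check(bad_name)
late_namespace = {}
exec(marshal.loads(name_blob), late_namespace)
try:
    late_namespace["greet"]("Jon")
    name_error_at_run_time = False
except NameError:
    name_error_at_run_time = True
```

The last case is the limitation mentioned above: compiling checks the
syntax, but names are only resolved when the code actually runs.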

> At the same time, I felt a sense of betrayal at the performance
> results. I believe one of the many reasons to embed low-level
> languages is to gain high performance. Why didn't you try to embed
> some computationally intensive code such as ray tracing or matrix
> manipulation? Otherwise, embedding assembly seems less practical.
> However, you also pointed out that there was a case where LLVM source
> code could be generated. I was thinking about a staged scenario: in
> the first stage, Mython takes high-level descriptions and generates
> LLVM source code; in the next stage, Mython takes the LLVM source code
> and generates bitcode at compile time; and in the next stage... It
> sounds interesting, but I could not find any useful application for
> it. I believe you have many.

Betrayal, huh? How dare their bitcode deserializer not be an order of
magnitude faster than the assembler! Keep in mind that they still
have to JIT the LLVM IR, so I wasn't expecting lightning speed. The
real speed-up for Python involves running the embedded LLVM code. I
was just hoping Mython could make a modest claim about the JIT
compilation being faster at run time, since it pays double for
assembly and serialization at compile time, and you would think that
the bitcode is closer to native machine code.

Also note that the test that I used was very simplistic. I don't
write much assembler code, and so it remains to find more
sophisticated stuff to embed. Of course, I was thinking of using the
LLVM bindings as a final back end, and abstracting on top of the layer
I presented. The Haskell LLVM binding authors seem to have a more
robust suite of demos that I might look at.

> p.s. Right now, I am playing with JavaScript and PHP here
> (http://ttic.uchicago.edu/~wchae/SmartForm/) and hope to return to the
> "Visible compiler stuff" real soon.

Yeah, I don't know if there is any interest in continuing on with the
PL lunch. If anyone is interested in maybe doing a weekly or
bi-monthly paper reading group, let me know. I'm pretty interested in
looking at extensibility in functional compilers, verifiable
compilers, and contracts.

Thanks,
-Jon
