LLVM backend for emscripten


Alon Zakai

Dec 19, 2013, 8:53:15 PM
to emscripte...@googlegroups.com
Hello everyone,

We've experimented in the past with an LLVM backend to replace parts of emscripten, most of which is written in JS. We hit some difficulties each time, and this was deferred. However, it looks like now everything is coming together, and we are on track. Details here:

https://github.com/kripken/emscripten/wiki/LLVM-Backend

Basically, the new compiler (codename "fastcomp" in commits) is an LLVM IR-based backend, similar to the C++ backend (and unlike most other backends, which are SDAG/tblgen based). This will replace much of the custom LLVM IR parsing and processing code in src/*, but will not replace any of:

 * libraries
 * toolchain
 * js optimizer (tools/js-optimizer.js)

The idea is that the LLVM backend will lower LLVM IR into JS, and then our existing toolchain scripts will process and optimize it. So we are replacing just one part of emscripten: several thousand lines of code, which is fairly small compared to the parts that are not changing. It should still give us two main benefits, in time:

1) faster compilation speed, no need to process IR in JS, can use LLVM IR in C++ directly
2) tighter integration with LLVM should allow us to avoid some current limitations, like overly pessimistic alignment
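
As a rough illustration of what "lowering LLVM IR into JS" means, here is a toy sketch in Python. It is not fastcomp code and does not read real LLVM IR: the tuple-based instruction format and the `lower_function` helper are hypothetical, invented only for this example. The only thing taken from asm.js itself is the output style, where integer values are coerced with `|0`.

```python
def lower_function(name, params, instructions, result):
    """Emit asm.js-style JS text for a toy function made of 32-bit integer ops.

    `instructions` is a list of (dest, op, lhs, rhs) tuples. This format is a
    hypothetical stand-in for real LLVM IR, used only for illustration.
    """
    ops = {"add": "+", "sub": "-"}
    lines = [f"function {name}({', '.join(params)}) {{"]
    # asm.js-style parameter annotations: coerce each argument to int32.
    lines += [f"  {p} = {p}|0;" for p in params]
    # Declare every temporary up front, initialised to an integer zero.
    temps = [dest for dest, _, _, _ in instructions]
    if temps:
        lines.append("  var " + ", ".join(f"{t} = 0" for t in temps) + ";")
    # Each toy instruction becomes one coerced JS assignment.
    for dest, op, lhs, rhs in instructions:
        lines.append(f"  {dest} = ({lhs} {ops[op]} {rhs})|0;")
    lines.append(f"  return {result}|0;")
    lines.append("}")
    return "\n".join(lines)


js = lower_function(
    "f",
    ["a", "b"],
    [("t0", "add", "a", "b"), ("t1", "sub", "t0", "b")],
    "t1",
)
print(js)
```

The printed text is the kind of output that the existing toolchain scripts and the JS optimizer would then pick up; those later stages are not modeled here.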

The backend is far from complete, but for all "basic" codegen it appears to work fine and passes the test suite. That includes building complex things like python, bullet, cubescript, etc., but does *not* include anything that requires special support, like C++ exceptions, setjmp/longjmp, and various compiler flags like SAFE_HEAP etc.

In time we can support all those things, although there are some features we never will - the new compiler will stay streamlined by focusing on one mode of codegen, optimized and relooped asm.js, as opposed to the old compiler which supported several other modes (non-asmjs typed arrays, and no typed arrays). Of course the old compiler will remain viable for things that need those codegen modes. Otherwise, things like C++ exceptions etc. should certainly be supported in the new compiler and are just a matter of time and how much people need them.

Note that, as mentioned in that link, the compiler is not much optimized or polished yet; the first goal is correctness. Even without any optimization for compilation speed, it already beats the old compiler on that by a large margin. The generated code will be slower, though, until we spend time optimizing it. It should eventually be faster than the current compiler's output, for the integration reasons mentioned earlier.

The link above has instructions to build and test the new compiler. If you have a codebase that does not need exceptions or setjmp, please test on the new compiler and file issues if you find any! And if you are interested in helping to finish the backend, that would be great too.

- Alon

Ehsan Akhgari

Dec 19, 2013, 9:32:39 PM
to emscripte...@googlegroups.com
This is great news!  I've been looking forward to this for so long, congratulations on the great undertaking!


Chad Austin

Dec 19, 2013, 11:55:17 PM
to emscripte...@googlegroups.com
On Thu, Dec 19, 2013 at 5:53 PM, Alon Zakai <alon...@gmail.com> wrote:
> 1) faster compilation speed, no need to process IR in JS, can use LLVM IR in C++ directly

I'm super excited about this!  In particular, it means we can avoid invoking llvm-dis during the build and avoid parsing the LLVM IR files multiple times.

> In time we can support all those things, although there are some features we never will - the new compiler will stay streamlined by focusing on one mode of codegen, optimized and relooped asm.js, as opposed to the old compiler which supported several other modes (non-asmjs typed arrays, and no typed arrays). Of course the old compiler will remain viable for things that need those codegen modes. Otherwise, things like C++ exceptions etc. should certainly be supported in the new compiler and are just a matter of time and how much people need them.

I agree 100% with dropping support for TA0 and TA1 codegen modes.  I doubt many people depended on those, especially now that even IE supports typed arrays.

Will there be a way in the new compiler to emit asm.js-like code but with runtime heap resizing?  We can't use asm.js until at least Chrome and Firefox support resizable typed arrays.  :/

Again, very excited!

Thanks,
Chad

Alon Zakai

Dec 20, 2013, 12:14:19 AM
to emscripte...@googlegroups.com
We could emit something close to asm.js but with heap resizing, but it would not be as optimizable because the heap would not be constant. Without proper browser support, our options are limited, I'm afraid.
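
A minimal sketch of why a heap that can change is harder to optimize, written as a toy Python model rather than real emscripten or browser code. The `Heap` class is hypothetical; its `HEAPU8` and `HEAP32` attributes only mirror the names of emscripten's typed-array views. The assumption being illustrated is that growing the heap means allocating a new buffer and recreating every view, so nothing that cached an old view can be trusted afterwards.

```python
class Heap:
    """Toy model of one heap buffer with typed views onto it (hypothetical)."""

    def __init__(self, size_bytes):
        self._attach(bytearray(size_bytes))

    def _attach(self, buffer):
        # Every view aliases the same underlying buffer.
        self.buffer = buffer
        self.HEAPU8 = memoryview(buffer)
        # Native int view; 4 bytes per element on common platforms.
        self.HEAP32 = self.HEAPU8.cast("i")

    def grow(self, new_size_bytes):
        # Growing allocates a new buffer, copies the old contents,
        # and recreates all views over the new buffer.
        new_buffer = bytearray(new_size_bytes)
        new_buffer[: len(self.buffer)] = self.buffer
        self._attach(new_buffer)


heap = Heap(16)
heap.HEAP32[1] = 42
cached_view = heap.HEAP32  # what code holds if it assumes the heap never changes
heap.grow(32)

cached_view[1] = 7  # lands in the old buffer, not in the live heap
print(heap.HEAP32[1], cached_view[1])
```

After `grow`, the cached view and the live heap disagree, which is the situation a fixed heap rules out: with a constant buffer, the views can be resolved once and relied on from then on.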

- Alon



Chad Austin

Dec 20, 2013, 12:36:47 AM
to emscripte...@googlegroups.com
Great.  Even if it's not as fast as validated asm.js, that would be fine.
--
Chad Austin
Technical Director, IMVU

Alon Zakai

Dec 20, 2013, 12:53:24 AM
to emscripte...@googlegroups.com
I don't think it would buy us anything over what you are currently doing, though. It would look superficially more similar, but the underlying issues would remain (the heap can change, so JS engines optimize less, and we must limit our eliminator as well).

- Alon

Chad Austin

Dec 20, 2013, 1:15:23 AM
to emscripte...@googlegroups.com
Well, once you have the LLVM backend compiler running, build times will be reduced, which would help our team quite a bit.  I would be sad if we were stuck on the JavaScript compiler just because we require a resizable heap.  :)

Alon Zakai

Dec 20, 2013, 7:29:50 PM
to emscripte...@googlegroups.com
Oh, in that sense, now I get you. We should be able to make the new compiler work with that very easily, just with the same limitations as before.

- Alon

Chad Austin

Dec 20, 2013, 7:35:31 PM
to emscripte...@googlegroups.com
Great!