I don't know if it exists already, I've heard about this "LLJVM" but I don't think it does the same thing as my idea.
What do you think?
_______________________________________________
LLVM Developers mailing list
llvm...@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
One related project (though not exactly what you want) is
https://github.com/graalvm/sulong
-- Sanjoy
David Chisnall via llvm-dev wrote:
> On 19 Jul 2016, at 04:06, Lorenzo Laneve via llvm-dev <llvm...@lists.llvm.org> wrote:
>> My idea was to create a complete backend treating Java as a normal
>> platform, to enable LLVM to compile programs to Java Bytecode (.class)
>> and Java Archive files (.jar). This could be useful in situations where
>> we need to compile a program for a platform still not natively supported
>> by LLVM.
>>
>> I don't know if it exists already, I've heard about this "LLJVM" but
>> I don't think it does the same thing as my idea.
>> What do you think?
>
> I think that it will be difficult. Java bytecode is intrinsically
> designed to be memory safe, whereas LLVM IR is not. There is no
> equivalent of inttoptr or ptrtoint in Java bytecode and the closest
> equivalent of a GEP is to retrieve a field from an object (though that’s
> only really for GEP + load/store).
>
> You could potentially do something a bit ugly and treat all of LLVM
> memory as one big ByteBuffer object, and make pointers indexes into
> this, but then you’d make it very hard for your LLVM-originating code to
> interoperate with Java-originating code and so you’d have to write a lot
> of code to take the place of the system call layer.
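
To make the single-ByteBuffer idea concrete, here is a minimal sketch in Java. Everything in it is an assumption made for illustration: the class name FlatMemory, the bump allocator, the 8-byte alignment, and the method names are hypothetical and are not taken from LLJVM, Sulong, or any existing backend. It only shows how loads, stores, and GEP-style address arithmetic might look once a pointer is just an integer offset into one buffer.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper: models LLVM's flat address space as one ByteBuffer.
final class FlatMemory {
    private final ByteBuffer heap;
    private int next = 8; // bump pointer; offset 0 is kept free to act as "null"

    FlatMemory(int sizeBytes) {
        // Little-endian is an arbitrary choice for this sketch.
        heap = ByteBuffer.allocate(sizeBytes).order(ByteOrder.LITTLE_ENDIAN);
    }

    // Stand-in for malloc: bump allocation, 8-byte aligned, never frees.
    int alloc(int bytes) {
        int addr = next;
        int aligned = (bytes + 7) & ~7;
        if (aligned < 0 || addr > heap.capacity() - aligned) {
            throw new OutOfMemoryError("flat heap exhausted");
        }
        next = addr + aligned;
        return addr;
    }

    // Roughly "store i32 value, i32* ptr": an absolute write at offset ptr.
    void storeI32(int ptr, int value) {
        heap.putInt(ptr, value);
    }

    // Roughly "load i32, i32* ptr": an absolute read at offset ptr.
    int loadI32(int ptr) {
        return heap.getInt(ptr);
    }

    // GEP into an i32 array becomes plain integer arithmetic on the offset.
    static int gepI32(int base, int index) {
        return base + 4 * index;
    }

    public static void main(String[] args) {
        FlatMemory mem = new FlatMemory(1024);
        int p = mem.alloc(64);
        mem.storeI32(gepI32(p, 12), 9001);       // same address as p + 48
        System.out.println(mem.loadI32(p + 48)); // prints 9001
    }
}
```

Because a pointer here is only an int, inttoptr and ptrtoint turn into no-ops. The cost is the one described above: nothing stored in this buffer can be a reference to a real Java object, so every interaction with Java-originating code would need hand-written glue.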
The caveat here is that Java has this "private"
but-not-really-in-practice API called sun.misc.Unsafe that can be used
to access native memory. So you can have (I'm paraphrasing, the
method names may not match):
long addr = unsafe.allocateMemory(64);
unsafe.putInt(addr + 48, 9001);
int val = unsafe.getInt(addr + 48);
etc. You may even get decent performance out of this since JIT
compilers tend to have to optimize these well (they're commonly used
in the implementation of some popular JDK classes).
But you're right that it will still be difficult to naively
inter-operate between Java and C++ objects. Which is why it will be
an interesting research project. :)
-- Sanjoy
> Oh, and I doubt that you’ll find many more platforms that have a
> fully functional JVM than are LLVM targets. Even big-endian MIPS64 is
> not well-supported by Java (JamVM - a pure interpreter - is the only
> thing that we’ve managed to find that works).
>
> David
>
>
If you're trying to bypass/replace JNI, you're in for a surprise. :)
The number of bugs I found while interacting with Java from C or C++
on different VMs (MS, Sun, OpenJDK) was astounding.
Apart from the usual C++ class layout problems (which may be better in
gcj, as David says), we had stack corruption because the VMs didn't
understand the unwind information.
I originally found the stack bug in 2002 on Windows, later checked in
2008 and it was still there. I'd be surprised if that's fixed, and
even more surprised if that's the only remaining problem.
And those were only through JNI, a relatively safe interface. If you
try to send C++ directly to Java Bytecode, you'll find a huge list of
"implementation details" that are not just undefined, but thoroughly
undocumented and different on purpose (like memory allocation,
signals, asynchronous I/O, threads, etc).
Good luck! :)
cheers,
--renato
Generally, this is a bad idea (I believe; Sanjoy is far more experienced here and can correct me if I’m wrong). Modern JVMs benefit a lot more from Java bytecode that is easy to analyse. The first thing that HotSpot does when it loads bytecode is undo a bunch of optimisations that javac does, because they make it harder to do the optimisations that the JIT performs.
> I know that it might be quite hard to treat Java or .NET as LLVM targets, but could this be a challenge? Let LLVM IR be compiled to Bytecode and CLI, which are higher level than our IR.
Going from a high-level language to a high-level IR via a low-level IR is certainly a challenge. Many things are difficult, but not all of them are worthwhile.