Considering the huge amount of C, C++, and Fortran code that only does
number crunching, well-documented platform-independent binaries would
be a great thing. They would make platform-independent scientific
software feasible in many places where it isn't currently.
Generic ELF Specification
http://www.linux-foundation.org/spec/book/ELF-generic/ELF-generic/book1.html
NestedVM doesn't support shared libraries yet, and there aren't
currently any plans for doing so. I don't think there are any
compelling reasons for supporting them.
Most users of NestedVM are only using it for a single binary. In these
cases there is absolutely no advantage to using shared libraries (in
fact, using shared libraries would actually increase the size of the
application because the linker couldn't drop dead code).
In the cases where multiple binaries are required the recommended trick
is to link them all together as a single binary and dispatch to the
correct main() via argv[0]. This gets us all the space advantages of
shared libraries at the cost of a little extra complexity.
> My reason for asking is that I have a number of legacy C-language
> libraries that I would like to access from within a Java-based
> application server (Spring Framework/ServiceMix/Hibernate).
I'd recommend that you statically link all your libraries into a single
mips binary and run NestedVM on that. If space usage isn't an issue you
could even make one binary per library (at the cost of duplicating libc
each time).
Please post any questions you might have on doing this. What you're
trying to do is 100% possible (SQLite JDBC does exactly this) but not
documented well. We'd be happy to help.
-Brian
Supporting JNI would definitely be a huge help in easing the transition
to NestedVM. This is something I've always hoped someone would get
around to implementing someday. The only reason I haven't is because
I've never actually used JNI for anything bigger than hello-worldish
stuff and I'm not too familiar with it.
However, I think (possibly with a few tweaks and some additional
boilerplate code) NestedVM's "native interface" is way better, and
easier to use than JNI once you understand it (which probably requires
begging for help on the mailing list because it isn't well documented).
Aside from that, what other problems did you run into with NestedVM?
We'd be happy to help any way we can.
-Brian
I've also given some thought to NestedVM JNI support.
How do you envision NestedVM integrating with JNI?
Build a custom NestedVM JNI library to interface with MIPS object code
transparently in a pure 100% java solution? Wouldn't NestedVM have to
support ELF MIPS shared libraries and dlopen/dlsym for this to work?
Or just fake the loading of shared MIPS libraries via some statically
compiled NestedVM/MIPS convention?
> However, I think (possibly with a few tweaks and some additional
> boilerplate code) NestedVM's "native interface" is way better, and
> easier to use than JNI once you understand it (which probably requires
> begging for help on the mailing list because it isn't well documented).
I wouldn't really call it integrating with JNI. More like
reimplementing JNI on top of the (very low level)
_call_java()/Runtime.call() interface. This would have nothing to do
with the "real" JNI, it would just have the same API.
Supporting calls into Java from C should be pretty straightforward. On
the Java side you'd use normal reflection stuff to find classes,
methods, and objects. They would have to be stored in an array so that
we could return an int "handle" back to NestedVM. You'd have a
CallJavaCB object (the Java code that sits behind _call_java()) that
implements all this. On the C side you'd have to reimplement all the
JNI functions in terms of this new interface.
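
The handle idea can be sketched roughly like this. It is only an
illustration: HandleTable and its method names are hypothetical and not
part of NestedVM. It just shows how objects found through reflection
could be handed to the C side as plain ints.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of NestedVM): maps Java objects to int
// handles that can be passed across an int-only call interface.
final class HandleTable {
    private final List<Object> slots = new ArrayList<>();

    // Slot 0 is reserved so that handle 0 can play the role of NULL.
    HandleTable() { slots.add(null); }

    // Store an object and return its handle; null always maps to 0.
    synchronized int put(Object o) {
        if (o == null) return 0;
        slots.add(o);
        return slots.size() - 1;
    }

    // Resolve a handle back to its object.
    synchronized Object get(int handle) {
        if (handle < 0 || handle >= slots.size())
            throw new IllegalArgumentException("bad handle: " + handle);
        return slots.get(handle);
    }

    // Drop the reference so the object can be garbage collected.
    synchronized void release(int handle) {
        if (handle > 0 && handle < slots.size()) slots.set(handle, null);
    }
}

class HandleTableDemo {
    public static void main(String[] args) throws Exception {
        HandleTable t = new HandleTable();
        // Roughly what a FindClass/GetMethodID pair would do on the Java side.
        int cls = t.put(Class.forName("java.lang.String"));
        int mid = t.put(((Class<?>) t.get(cls)).getMethod("length"));
        Object result = ((Method) t.get(mid)).invoke("abc");
        System.out.println(result); // prints 3
    }
}
```

A real implementation would also need some policy for freeing handles
(JNI's local/global reference distinction), which this sketch leaves
out.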
Supporting calls into C from Java ("native" methods) wouldn't be 100%
transparent though. "public native int foo(int bar);" would probably
have to be rewritten as:
public int foo(int bar) { return JNI.call("MyClass_bar",this,bar); }
where JNI.call magics up a JNIEnv and a handle for this, then does a
Runtime.call(env,this_handle,bar);
If we wanted to get really fancy we could use a custom classloader to
rewrite "native" methods in this form.
> transparently in a pure 100% java solution? Wouldn't NestedVM have to
> support ELF MIPS shared libraries and dlopen/dlsym for this to work?
The "libraries" would just be plain old statically linked binaries with
a dummy main() and a ton of dead code (the actual library). You'd never
run them as a standalone application. You only access them via
Runtime.call().
Shared libraries appear to be a more elegant way of handling this but
supporting them means we have to deal with position independent code,
runtime relocations, and everything else done by the dynamic linker in
a real system. While this is all certainly possible it doesn't seem
worth the effort.
-Brian
> Supporting calls into C from Java ("native" methods) wouldn't be 100%
> transparent though. "public native int foo(int bar);" would probably
> have to be rewritten as:
>
> public int foo(int bar) { return JNI.call("MyClass_bar",this,bar); }
>
> where JNI.call magics up a JNIEnv and a handle for this, then does a
> Runtime.call(env,this_handle,bar);
>
> If we wanted to get really fancy we could use a custom classloader to
> rewrite "native" methods in this form.
To avoid rewriting working JNI java source files, this step would be needed.
But you don't have to do this method rewriting at runtime with a classloader.
It could be done at compile time - after javac is run on the java source
files containing the native methods. A standalone java program could be
written to perform this task.
Does classgen support loading existing class files, rewriting
individual methods and dumping the entire class file out again?
Yep. We've used it for that in a few different projects.
-Brian
In the classgen ClassFile constructor I see this note:
// FEATURE: Support these
// NOTE: Until we can support them properly we HAVE to delete them,
// they'll be incorrect after we rewrite the constant pool, etc
attrs.remove("InnerClasses");
Can classgen handle reading and rewriting inner classes correctly?
Also, do you see any issue with using classgen on bytecode with java generics?
These are just InnerClasses attributes. They only serve as hints to
Java compilers (so the compiler knows to turn Foo.Inner into Foo$Inner);
they aren't needed by the VM at all. classgen will handle Foo$Inner (which
is just a plain vanilla Java class representing the inner class) just
fine.
classgen really only supports writing JDK1.1 class files (with a few
random parts of future versions that we needed for one reason or
another) and it dumps a lot of stuff it doesn't understand when reading
class files. However, the parts of the classfile specification that the
VM cares about really haven't changed all that much since JDK1.1 so
classgen works just fine as a compiler backend.
> Also, do you see any issue with using classgen on bytecode with java generics?
It won't properly handle the Signature attribute, so you'll lose all the
generics information when passing through classgen. This doesn't matter
to the VM either (everything just looks like java.lang.Object).
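
The difference between what the VM uses and what the Signature
attribute adds can be seen from plain Java. This is just a small
illustration with made-up names (ErasureDemo, names); it doesn't involve
classgen at all.

```java
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.List;

class ErasureDemo {
    List<String> names = new ArrayList<>();

    // Looks up the "names" field without leaking a checked exception.
    static Field namesField() {
        try {
            return ErasureDemo.class.getDeclaredField("names");
        } catch (NoSuchFieldException e) {
            throw new AssertionError(e);
        }
    }

    public static void main(String[] args) {
        // Both lists share one runtime class; the element type is erased.
        System.out.println(
            new ArrayList<String>().getClass() == new ArrayList<Integer>().getClass());

        Field f = namesField();
        // The erased type, taken from the field descriptor the VM uses.
        System.out.println(f.getType());
        // The generic type, recovered by reflection from the Signature attribute.
        System.out.println(f.getGenericType());
    }
}
```

If a tool drops the Signature attribute, getGenericType() just falls
back to the erased type, so only code that inspects generics through
reflection (or compiles against the rewritten class) would notice.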
-Brian
I did too. It looks pretty slick. Way nicer than JNI.
> *This* is where NestedVM could complete the picture - if NestedVM
> could support JNA as the generic alternative to finding the platform-
> specific version of the library.
I think NestedVM could pretty easily be used as a JNA "backend". It
looks like all the hard stuff that JNA does is implemented in Java and
it just has a very small native library containing a few primitives
that everything is implemented on top of. I think simply reimplementing
this native library in terms of NestedVM calls (lookupSymbol(), call(),
memRead(), memWrite(), etc) is all it would take to get JNA going.
> This would, I think, require that NestedVM to support treating each
> shared library separately, rather than packaging all of them together
> in a single executable, since in the general case we would want to use
> arbitrary combinations of libraries.
Everyone is getting too hung up on this shared library/monolithic
executable thing.
NestedVM isn't going to support shared libraries in the foreseeable
future. There is just too much baggage associated with them that
doesn't buy us anything.
This, however, doesn't stop you from accessing a library from NestedVM.
You just have to statically link it to an executable ("int
main(){_pause();return 0;}" will do) first. We then get the linker to do a ton
of work for us. You also don't have to put all your libraries in one
big executable. That is only recommended if space is an issue as libc
is replicated in every statically linked binary.
When distributing NestedVM-ified C libraries it makes sense to ship one
executable per library as you have no way of knowing what else will be
needed in the final application. The final app will just have a bunch
of little NestedVM instances floating around, one for each library.
-Brian
Exactly. You actually do execute the executable, it just immediately
returns control to java and sits around waiting for Runtime.call
requests. This needs to be done to do some initialization of libc.
> The potential problem here is that you would have one
> standard C library for each "packaged" library, plus any additional
> duplications that arise when library A uses library B. Leaving aside
> the question of memory usage, wouldn't this present problems for
> libraries that use global data (on the other hand, are there such
> things any more? On the Windows side we've been told "don't do that"
> for about twenty years ;-)
Yes. This would be a problem. You'd end up with one global data space
per executable. Not only that, but you wouldn't be able to share data
between executables. You couldn't pass a pointer from one to the other
for example (think of them as separate processes).
I think if you're doing something complicated enough that this becomes
an issue then linking everything into one executable isn't too much to
ask. You could even reuse the same Java bindings no matter how you did
it. The Java bindings would just have to be parameterized over the
Runtime class used to access the C library:
public MyJavaInterface() { this(new JustThisLibrary()); }
public MyJavaInterface(Runtime rt) { ... }
and then the guy who needs to make things complicated can do:
Runtime rt = new MyMonolithicExecutable();
foo = new MyJavaInterface(rt);
bar = new MyOtherJavaInterface(rt);
where everyone else just does "new MyJavaInterface()"
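
Spelled out a bit more, the pattern could look like the sketch below.
GuestRuntime is a stand-in for NestedVM's Runtime class so that the
example compiles on its own; its call(String, int...) signature, the
symbol names "foo_set"/"bar_get", and the fake "globals" map are all
assumptions made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Stand-in for NestedVM's Runtime (assumed signature, illustration only).
abstract class GuestRuntime {
    // Each instance has its own "global data", like a separate process.
    protected final Map<String, Integer> globals = new HashMap<>();
    abstract int call(String symbol, int... args);
}

// Pretend binary containing only the "foo" library.
class JustThisLibrary extends GuestRuntime {
    int call(String symbol, int... args) {
        if (symbol.equals("foo_set")) { globals.put("shared", args[0]); return 0; }
        throw new IllegalArgumentException("unknown symbol: " + symbol);
    }
}

// Pretend binary with both libraries linked together.
class MyMonolithicExecutable extends GuestRuntime {
    int call(String symbol, int... args) {
        switch (symbol) {
            case "foo_set": globals.put("shared", args[0]); return 0;
            case "bar_get": return globals.getOrDefault("shared", -1);
            default: throw new IllegalArgumentException("unknown symbol: " + symbol);
        }
    }
}

class MyJavaInterface {
    private final GuestRuntime rt;
    MyJavaInterface() { this(new JustThisLibrary()); }   // the simple case
    MyJavaInterface(GuestRuntime rt) { this.rt = rt; }   // caller picks the binary
    void set(int v) { rt.call("foo_set", v); }
}

class MyOtherJavaInterface {
    private final GuestRuntime rt;
    MyOtherJavaInterface(GuestRuntime rt) { this.rt = rt; }
    int get() { return rt.call("bar_get"); }
}
```

With a shared MyMonolithicExecutable, a value stored through
MyJavaInterface.set() is visible to MyOtherJavaInterface.get(), because
both bindings talk to the same instance and therefore the same global
data.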
Thanks for the example though. I hadn't thought of this before. You've
managed to convince me that it might actually be worth looking
into shared library support.
-Brian
Unless I'm mistaken, sqlitejdbc just seems to call the mips functions
directly without such a dummy main(). Although the real sqlite main()
is linked in to the mips binary, it is never called. (David, if you're
reading this please correct me.)
Perhaps the handful of system calls that sqlite uses do not require
such libc initialization?
> > The potential problem here is that you would have one
> > standard C library for each "packaged" library, plus any additional
> > duplications that arise when library A uses library B. Leaving aside
> > the question of memory usage, wouldn't this present problems for
> > libraries that use global data (on the other hand, are there such
> > things any more? On the Windows side we've been told "don't do that"
> > for about twenty years ;-)
>
> Yes. This would be a problem. You'd end up with one global data space
> per executable. Not only that, but you wouldn't be able to share data
> between executables. You couldn't pass a pointer from one to the other
> for example (think of them as separate processes).
A benefit of this approach could be greater parallelism. Each NestedVM
Runtime could run in a different thread unimpeded by other Runtimes, since
they do not (cannot) share data.
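
A minimal sketch of that idea, using a stand-in class instead of a real
NestedVM Runtime (IsolatedInstance and ParallelInstances are made-up
names, and whether real Runtime objects can be driven this way is an
assumption, not something confirmed in this thread):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Stand-in for an isolated VM instance: all state is private to the
// instance, so no locking is needed as long as one task owns it.
final class IsolatedInstance {
    private long counter; // plays the role of the guest's global data

    long run(int iterations) {
        for (int i = 0; i < iterations; i++) counter++;
        return counter;
    }
}

final class ParallelInstances {
    // Creates one fresh instance per task and runs the tasks on a pool
    // with as many threads as there are instances.
    static List<Long> runAll(int instances, int iterations) {
        ExecutorService pool = Executors.newFixedThreadPool(instances);
        try {
            List<Future<Long>> futures = new ArrayList<>();
            for (int i = 0; i < instances; i++) {
                IsolatedInstance vm = new IsolatedInstance();
                Callable<Long> task = () -> vm.run(iterations);
                futures.add(pool.submit(task));
            }
            List<Long> results = new ArrayList<>();
            for (Future<Long> f : futures) results.add(f.get());
            return results;
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```

Because no instance is ever touched by more than one task, every result
equals the iteration count, with no synchronization inside
IsolatedInstance.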
> I think if you're doing something complicated enough that this becomes
> an issue then linking everything into one executable isn't too much to
> ask. You could even reuse the same Java bindings no matter how you did
> it. The Java bindings would just have to be parameterized over the
> Runtime class used to access the C library:
>
> public MyJavaInterface() { this(new JustThisLibrary()); }
> public MyJavaInterface(Runtime rt) { ... }
>
> and then the guy who needs to make things complicated can do:
>
> Runtime rt = new MyMonolithicExecutable();
> foo = new MyJavaInterface(rt);
> bar = new MyOtherJavaInterface(rt);
>
> where everyone else just does "new MyJavaInterface()"
>
> Thanks for the example though. I hadn't thought of this before. You've
> managed to convince me that it might actually be worth looking
> into shared library support.
It would certainly be easier for the NestedVM user, although
implementing a runtime linker could add overhead to the average
NestedVM process because of the extra level of indirection needed to
reach symbols.
>
> -Brian
You're right, it does (I just checked).
> Perhaps the handful of system calls that sqlite uses do not require
> such libc initialization?
I guess there isn't as much initialization as I thought. I just double
checked and all _start (the thing that runs before main()) does is
set up the environment variable pointer and initialize any static C++
objects that exist.
So I guess you can (and sqlite obviously does) get away without doing
the run main/pause thing, but I still wouldn't call it supported.
-Brian
It just so happens that getenv() is not called in the calls made by
the sqlite API. It is called indirectly via sqlite's main(), which is
not invoked - so it does not matter in this case. A bit of luck, perhaps? ;-)
Speaking of _start, just tracking down where _init() and _fini()
implementations live is fun. I thought they were implemented in newlib,
but I can't find any trace of them there. Now I think GCC itself creates
these functions for ELF binaries. Do you happen to know where _init/_fini
are implemented?
They are implemented in crt{begin,end,i}.o, which are built with gcc. The
linker does some magic (you'll see it in
src/org/ibex/nestedvm/linker.ld) to arrange for crtbegin.o's part of
.init to appear first, everything else to appear in the middle,
followed by crtend.o's part. It basically builds a function to do all
the initialization at link time. Same for .dtors (the destructors).
To be honest I don't completely understand how it all works either but
that is the basic idea.
-Brian
Using objdump on the binary to see the various functions and sections
is pretty informative as well.
--- Brian Alliet <br...@brianweb.net> wrote: