NestedVM and Shared Libraries?

David Doty

Jan 4, 2008, 4:59:22 PM
to NestedVM
Hi, All -
Maybe this is answered in previous messages or in the
documentation, but I can't find it. Does NestedVM create Java classes
that represent shared libraries or does it create classes that
represent statically linked applications?

My reason for asking is that I have a number of legacy C-language
libraries that I would like to access from within a Java-based
application server (Spring Framework/ServiceMix/Hibernate). While I
can (and have) used SWIG to create Java (and TCL) wrappers for the
libraries, I would have to create a separate library file for each
supported platform (I have at least three - 32-bit Windows
workstations, 32-bit Solaris workstations, and 64-bit Solaris
servers). The perfect solution (I think, possibly based on ignorance)
would be to run each library through the NestedVM toolchain and save
each library's class in a JAR file. Is this something NestedVM can do
today? In principle if someone (i.e., me) puts the work in? Can't do
because ... ?

Thanks for writing a really exciting tool and making it available!

David

Curt Cox

Jan 4, 2008, 7:06:14 PM
to NestedVM
Sorry I have no answers, but I do have additional questions. Do you
currently use JNI, JNA, or something else? When I was seriously
looking at NestedVM, it was to replace a bunch of number crunching C
code that was being accessed via JNI. So little processor time was
spent in the C code that one percent of native speed probably would
have been fast enough. Ultimately, I gave up the idea due to lack of
NestedVM tool support and documentation.

Considering the huge amount of C, C++, and Fortran code that only does
number crunching, well documented platform-independent binaries would
be a great thing. It would make platform independent scientific
software feasible in many places where it isn't currently.

JNA
https://jna.dev.java.net/

Generic ELF Specification
http://www.linux-foundation.org/spec/book/ELF-generic/ELF-generic/book1.html

Brian Alliet

Jan 7, 2008, 2:31:12 AM
to David Doty, NestedVM
On Fri, Jan 04, 2008 at 01:59:22PM -0800, David Doty wrote:
> documentation, but I can't find it. Does NestedVM create Java classes
> that represent shared libraries or does it create classes that
> represent statically linked applications?

NestedVM doesn't support shared libraries yet, and there aren't
currently any plans for doing so. I don't think there are any
compelling reasons for supporting them.

Most users of NestedVM are only using it for a single binary. In these
cases there is absolutely no advantage to using shared libraries (in
fact, using shared libraries would actually increase the size of the
application because the linker couldn't drop dead code).

In the cases where multiple binaries are required the recommended trick
is to link them all together as a single binary and dispatch to the
correct main() via argv[0]. This gets us all the space advantages of
shared libraries at the cost of a little extra complexity.

> My reason for asking is that I have a number of legacy C-language
> libraries that I would like to access from within a Java-based
> application server (Spring Framework/ServiceMix/Hibernate).

I'd recommend that you statically link all your libraries into a single
mips binary and run NestedVM on that. If space usage isn't an issue you
could even make one binary per library (at the cost of duplicating libc
each time).

Please post any questions you might have on doing this. What you're
trying to do is 100% possible (SQLite JDBC does exactly this) but not
documented well. We'd be happy to help.

-Brian

Brian Alliet

Jan 7, 2008, 2:45:20 AM
to Curt Cox, NestedVM
On Fri, Jan 04, 2008 at 06:06:14PM -0600, Curt Cox wrote:
> currently use JNI, JNA, or something else? When I was seriously
> looking at NestedVM, it was to replace a bunch of number crunching C
> code that was being accessed via JNI. So little processor time was
> spent in the C code that one percent of native speed probably would
> have been fast enough. Ultimately, I gave up the idea due to lack of
> NestedVM tool support and documentation.

Supporting JNI would definitely be a huge help in easing the transition
to NestedVM. This is something I've always hoped someone would get
around to implementing someday. The only reason I haven't is because
I've never actually used JNI for anything bigger than hello-worldish
stuff and I'm not too familiar with it.

However, I think (possibly with a few tweaks and some additional
boilerplate code) NestedVM's "native interface" is way better, and
easier to use than JNI once you understand it (which probably requires
begging for help on the mailing list because it isn't well documented).

Aside from that, what other problems did you run into with NestedVM?
We'd be happy to help any way we can.

-Brian

Joe Wilson

Jan 7, 2008, 1:41:15 PM
to NestedVM
--- Brian Alliet <br...@brianweb.net> wrote:
>
> On Fri, Jan 04, 2008 at 06:06:14PM -0600, Curt Cox wrote:
> > currently use JNI, JNA, or something else? When I was seriously
> > looking at NestedVM, it was to replace a bunch of number crunching C
> > code that was being accessed via JNI. So little processor time was
> > spent in the C code that one percent of native speed probably would
> > have been fast enough. Ultimately, I gave up the idea due to lack of
> > NestedVM tool support and documentation.
>
> Supporting JNI would definitely be a huge help in easing the transition
> to NestedVM. This is something I've always hoped someone would get
> around to implementing someday. The only reason I haven't is because
> I've never actually used JNI for anything bigger than hello-worldish
> stuff and I'm not too familiar with it.

I've also given some thought to NestedVM JNI support.

How do you envision NestedVM integrating with JNI?

Build a custom NestedVM JNI library to interface with MIPS object code
transparently in a pure 100% java solution? Wouldn't NestedVM have to
support ELF MIPS shared libraries and dlopen/dlsym for this to work?
Or just fake the loading of shared MIPS libraries via some statically
compiled NestedVM/MIPS convention?

> However, I think (possibly with a few tweaks and some additional
> boilerplate code) NestedVM's "native interface" is way better, and
> easier to use than JNI once you understand it (which probably requires
> begging for help on the mailing list because it isn't well documented).



Brian Alliet

Jan 7, 2008, 10:18:06 PM
to Joe Wilson, NestedVM
On Mon, Jan 07, 2008 at 10:41:15AM -0800, Joe Wilson wrote:
> How do you envision NestedVM integrating with JNI?

I wouldn't really call it integrating with JNI. More like
reimplementing JNI on top of the (very low level)
_call_java()/Runtime.call() interface. This would have nothing to do
with the "real" JNI, it would just have the same API.

Supporting calls into Java from C should be pretty straightforward. On
the Java side you'd use normal reflection stuff to find classes,
methods, and objects. They would have to be stored in an array so that
we could return an int "handle" back to NestedVM. You'd have a
CallJavaCB object (the Java code that sits behind _call_java()) that
implements all this. On the C side you'd have to reimplement all the
JNI functions in terms of this new interface.
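
The handle idea described above can be sketched roughly as follows. This is only an illustration, not NestedVM code: HandleTable and its methods are hypothetical names, and a real implementation would sit behind the CallJavaCB object and be driven by requests from the C side.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of NestedVM): maps Java objects to int
// handles so they can be passed to C code that only deals in 32-bit ints.
final class HandleTable {
    private final List<Object> slots = new ArrayList<>();

    HandleTable() {
        slots.add(null); // reserve handle 0 to mean "null", like a NULL pointer
    }

    // Store an object and return the int handle that identifies it.
    synchronized int register(Object o) {
        if (o == null) return 0;
        slots.add(o);
        return slots.size() - 1;
    }

    // Resolve a handle back to the object it stands for.
    synchronized Object resolve(int handle) {
        if (handle < 0 || handle >= slots.size()) {
            throw new IllegalArgumentException("bad handle: " + handle);
        }
        return slots.get(handle);
    }

    // Clear the slot so the object can be garbage collected.
    // Handles are not reused in this sketch.
    synchronized void release(int handle) {
        if (handle > 0 && handle < slots.size()) slots.set(handle, null);
    }
}

class HandleTableDemo {
    public static void main(String[] args) throws Exception {
        HandleTable table = new HandleTable();
        // Find a class and a method with ordinary reflection, the way a
        // FindClass/GetMethodID style request from the C side might.
        int classHandle = table.register(Class.forName("java.lang.String"));
        Class<?> cls = (Class<?>) table.resolve(classHandle);
        int methodHandle = table.register(cls.getMethod("length"));
        int objectHandle = table.register("hello");

        Method m = (Method) table.resolve(methodHandle);
        System.out.println(m.invoke(table.resolve(objectHandle))); // prints 5
    }
}
```

Only the ints cross the boundary; the C side never sees a Java reference, which is why the table has to live on the Java side.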

Supporting calls into C from Java ("native" methods) wouldn't be 100%
transparent though. "public native int foo(int bar);" would probably
have to be rewritten as:

public int foo(int bar) { return JNI.call("MyClass_foo",this,bar); }

where JNI.call magics up a JNIEnv and a handle for this, then does a
Runtime.call(env,this_handle,bar);

If we wanted to get really fancy we could use a custom classloader to
rewrite "native" methods in this form.
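
A minimal sketch of what such a rewritten native method could look like. Everything here is an assumption made for illustration: GuestCaller stands in for the NestedVM runtime, JniShim plays the role of the proposed JNI.call, and ENV_HANDLE is a placeholder for whatever JNIEnv pointer the real thing would pass.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the NestedVM runtime: call a guest function by
// name with int arguments and get an int back.
interface GuestCaller {
    int call(String function, int[] args);
}

// Hypothetical shim playing the role of the proposed "JNI.call".
final class JniShim {
    static final int ENV_HANDLE = 1; // placeholder for a JNIEnv pointer in guest memory
    private final GuestCaller guest;
    private final List<Object> handles = new ArrayList<>();

    JniShim(GuestCaller guest) {
        this.guest = guest;
        handles.add(null); // handle 0 is reserved for null
    }

    Object resolve(int handle) { return handles.get(handle); }

    // Builds the (env, this_handle, args...) argument list and forwards it.
    // A real shim would release the handle for "self" once the call returns.
    int call(String function, Object self, int... args) {
        handles.add(self);
        int[] guestArgs = new int[args.length + 2];
        guestArgs[0] = ENV_HANDLE;
        guestArgs[1] = handles.size() - 1;
        System.arraycopy(args, 0, guestArgs, 2, args.length);
        return guest.call(function, guestArgs);
    }
}

// What a class declaring "public native int foo(int bar);" might look like
// after the rewrite.
class MyClass {
    private final JniShim jni;
    MyClass(JniShim jni) { this.jni = jni; }
    public int foo(int bar) { return jni.call("MyClass_foo", this, bar); }
}

class JniShimDemo {
    public static void main(String[] args) {
        // Fake guest: pretends the C implementation of foo doubles its argument.
        GuestCaller fakeGuest = (function, a) -> a[2] * 2;
        MyClass obj = new MyClass(new JniShim(fakeGuest));
        System.out.println(obj.foo(21)); // prints 42 with this fake guest
    }
}
```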

> transparently in a pure 100% java solution? Wouldn't NestedVM have to
> support ELF MIPS shared libraries and dlopen/dlsym for this to work?

The "libraries" would just be plain old statically linked binaries with
a dummy main() and a ton of dead code (the actual library). You'd never
run them as a standalone application. You only access them via
Runtime.call().

Shared libraries appear to be a more elegant way of handling this but
supporting them means we have to deal with position independent code,
runtime relocations, and everything else done by the dynamic linker in
a real system. While this is all certainly possible it doesn't seem
worth the effort.

-Brian

Joe Wilson

Jan 8, 2008, 3:09:39 AM
to NestedVM
--- Brian Alliet <br...@brianweb.net> wrote:
> On Mon, Jan 07, 2008 at 10:41:15AM -0800, Joe Wilson wrote:
> > How do you envision NestedVM integrating with JNI?

> Supporting calls into C from Java ("native" methods) wouldn't be 100%


> transparent though. "public native int foo(int bar);" would probably
> have to be rewritten as:
>
> public int foo(int bar) { return JNI.call("MyClass_foo",this,bar); }
>
> where JNI.call magics up a JNIEnv and a handle for this, then does a
> Runtime.call(env,this_handle,bar);
>
> If we wanted to get really fancy we could use a custom classloader to
> rewrite "native" methods in this form.

To avoid rewriting working JNI java source files, this step would be needed.
But you don't have to do this method rewriting at runtime with a classloader.
It could be done at compile time - after javac is run on the java source
files containing the native methods. A standalone java program could be
written to perform this task.

Does classgen support loading existing class files, rewriting
individual methods and dumping the entire class file out again?


Brian Alliet

Jan 8, 2008, 9:04:53 AM
to Joe Wilson, NestedVM
On Tue, Jan 08, 2008 at 12:09:39AM -0800, Joe Wilson wrote:
> Does classgen support loading existing class files, rewriting
> individual methods and dumping the entire class file out again?

Yep. We've used it for that in a few different projects.

-Brian

David Doty

Jan 8, 2008, 10:57:48 AM
to NestedVM
Hi, Brian -
Thanks for your quick reply. It is really valuable to hear what
the intended use is for a tool. To amplify some on my initial
message, here are the two things I'd like to be able to do :

o Call C-language TCL extensions from JACL (the pure-Java
version of TCL). There are a huge number of these, and
most won't be translated to Java any time soon (if ever).

o Convert a legacy C application into a shared library and
call its functions from C, TCL, or Java applications.

The first of these appears to be possible with JNA (as Curt
mentioned), so I won't address it further. The second sounds a lot
like the SQLite case.

I provide programming support for folks who work with satellite
orbits. We get satellite orbits as "two-line element sets" (TLEs).
We have an old C program that reads a TLE from standard input and
writes the "crossings" (i.e., where and when the satellite will cross
the equator) to standard output. It does this for exactly one TLE per
execution (after all, when the C code was originally written, you ran
it from the command line - how many TLEs could you type?).

We have some Java code that executes this app to calculate
crossings on demand. The usual scenario requires calculating dozens
(or, occasionally, hundreds) of crossings. At the moment, each of
these requires a separate execution of the C program, with the
attendant time taken for dozens of load-the-process-execute-the-
process-unload-the-process cycles. People grow old waiting ... ;-)
(We also have users who would like to get crossings from TCL or Python
scripts, other C/C++/Java language applications, etc. but that may be
a different issue).

While I could modify the C app to process an arbitrary number of
crossings per execution, this would make it harder to use. I suppose
I could wrap it in something and make a web service out of it, but
that seems extreme. My first thought was to turn the guts of the C
app into a shared library and then use it (either directly or via a
SWIG wrapper) for each of the programming languages that want access.
This leads to the build-it-on-all-target-platforms problem, which I'd
really rather avoid. Since there are Java-based implementations of
many major scripting languages (e.g., JACL, JRuby, Jython), I figured
that a reasonable step would be to provide the library in a form that
can be called from any Java-based platform. Hence my interest in
NestedVM.

It sounds like I should take a look at how SQLite does its JDBC as
an example of another method to make the crossings library accessible on
the Java side, right?

Thanks again!

David

David Doty

Jan 8, 2008, 11:02:51 AM
to NestedVM
Hi, Curt -

> Sorry I have no answers, but I do have additional questions.  Do you
> currently use JNI, JNA, or something else?

My thought was to use SWIG to wrap the C code for various platforms
(esp. TCL and Java). For Java, SWIG generates the JNI glue code as
well as the Java class to wrap it. I'd like to avoid having to
compile separate versions of the library for each platform, which
makes the NestedVM concept really attractive. Since there is a JVM-
based version of TCL (as well as several other important scripting
languages), JNA sounds like a real possibility - thanks for pointing
it out (I'd never heard of it).

> Considering the huge amount of C, C++, and Fortran code that only does
> number crunching, well documented platform-independent binaries would
> be a great thing.  It would make platform independent scientific
> software feasible in many places where it isn't currently.

Absolutely! Who wants to re-write literally billions of lines of
code??

David

David Doty

Jan 8, 2008, 11:18:07 AM
to NestedVM
Hi, Brian -
I spent a little time looking at JNA per Curt's post. It looks
like a really promising technology. Basically, instead of writing a
bunch of JNI code to a C (or C++) library, you describe the interfaces
exposed by that library and the JNA runtime generates the equivalent
of the JNI, allowing you to call the library. It appears to support
calling back from the C side to the Java side, which you mention
elsewhere as a big issue. The hard part is hand-crafting the Java
code to describe the C interfaces...

Fortunately, *that* problem has been solved by the SWIG folks.
SWIG can read the C header files and create a wrapper for a couple of
dozen different scripting languages and other run-times (e.g., JNI).
It can also produce an XML description of the interfaces. It seems to
me that it wouldn't be difficult to transform the XML into the
corresponding JNA API calls and - voilà - instant Java interface to
any library with a C-language API.

But ... we still have the problem of compiling, linking, and
distributing all the platform-specific variations of the library.

*This* is where NestedVM could complete the picture - if NestedVM
could support JNA as the generic alternative to finding the platform-
specific version of the library. We could now distribute any library
that

o Is compilable by the GCC suite and

o Exposes a C-language API

in a form executable on any platform that supports Java. And never
have to compile a platform-specific version again.

This would, I think, require NestedVM to support treating each
shared library separately, rather than packaging all of them together
in a single executable, since in the general case we would want to use
arbitrary combinations of libraries.

Does this seem feasible, or have I been smoking too much weed?

David

Joe Wilson

Jan 8, 2008, 1:03:24 PM
to NestedVM
--- Brian Alliet <br...@brianweb.net> wrote:
>

In the classgen ClassFile constructor I see this note:

// FEATURE: Support these
// NOTE: Until we can support them properly we HAVE to delete them,
// they'll be incorrect after we rewrite the constant pool, etc
attrs.remove("InnerClasses");

Can classgen handle reading and rewriting inner classes correctly?

Also, do you see any issue with using classgen on bytecode with java generics?

Brian Alliet

Jan 8, 2008, 8:27:37 PM
to Joe Wilson, NestedVM
On Tue, Jan 08, 2008 at 10:03:24AM -0800, Joe Wilson wrote:
> In the classgen ClassFile constructor I see this note:
>
> // FEATURE: Support these
> // NOTE: Until we can support them properly we HAVE to delete them,
> // they'll be incorrect after we rewrite the constant pool, etc
> attrs.remove("InnerClasses");

These are just InnerClasses attributes. They just serve as hints to
Java compilers (so they know to turn Foo.Inner into Foo$Inner); they
aren't needed by the VM at all. classgen will handle Foo$Inner (which
is just a plain vanilla Java class representing the inner class) just
fine.

classgen really only supports writing JDK1.1 class files (with a few
random parts of future versions that we needed for one reason or
another) and it dumps a lot of stuff it doesn't understand when reading
class files. However, the parts of the classfile specification that the
VM cares about really haven't changed all that much since JDK1.1 so
classgen works just fine as a compiler backend.

> Also, do you see any issue with using classgen on bytecode with java generics?

It won't properly handle the Signature attribute so you'll lose all the
generics information when passing through classgen. This doesn't matter
to the VM either (everything just looks like java.lang.Object).
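
To make the point about generics concrete, here is a small self-contained sketch (plain Java reflection, nothing to do with classgen itself; Box and ErasureDemo are made-up names). The erased types are what the VM works with; the generic view comes only from the Signature attribute.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

// Made-up generic class used only for this demonstration.
class Box<T> {
    private T value;
    void set(T v) { value = v; }
    T get() { return value; }
}

class ErasureDemo {
    // Returns the erased parameter type of Box.set, which is what the VM sees.
    static String erasedSetParameter() {
        try {
            Method set = Box.class.getDeclaredMethod("set", Object.class);
            return set.getParameterTypes()[0].getName();
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    // Returns the generic parameter type of Box.set, which reflection
    // reconstructs from the Signature attribute in the class file.
    static String genericSetParameter() {
        try {
            Method set = Box.class.getDeclaredMethod("set", Object.class);
            return set.getGenericParameterTypes()[0].toString();
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        List<String> strings = new ArrayList<>();
        List<Integer> ints = new ArrayList<>();
        // Both lists share one runtime class; the type arguments are gone.
        System.out.println(strings.getClass() == ints.getClass()); // prints true

        System.out.println(erasedSetParameter());  // prints java.lang.Object
        System.out.println(genericSetParameter()); // prints T
    }
}
```

If the Signature attribute were dropped from Box.class, the code would still load and run the same way; only reflective queries like the last one (and compilers reading the class file) would lose the generic view.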

-Brian

Brian Alliet

Jan 8, 2008, 9:09:09 PM
to David Doty, NestedVM
On Tue, Jan 08, 2008 at 08:18:07AM -0800, David Doty wrote:
> I spent a little time looking at JNA per Curt's post. It looks
> like a really promising technology.

I did too. It looks pretty slick. Way nicer than JNI.

> *This* is where NestedVM could complete the picture - if NestedVM
> could support JNA as the generic alternative to finding the platform-
> specific version of the library.

I think NestedVM could pretty easily be used as a JNA "backend". It
looks like all the hard stuff that JNA does is implemented in Java and
it just has a very small native library containing a few primitives
that everything is implemented on top of. I think simply reimplementing
this native library in terms of NestedVM calls (lookupSymbol(), call(),
memRead(), memWrite(), etc) is all it would take to get JNA going.

> This would, I think, require NestedVM to support treating each
> shared library separately, rather than packaging all of them together
> in a single executable, since in the general case we would want to use
> arbitrary combinations of libraries.

Everyone is getting too hung up on this shared library/monolithic
executable thing.

NestedVM isn't going to support shared libraries in the foreseeable
future. There is just too much baggage associated with them that
doesn't buy us anything.

This, however, doesn't stop you from accessing a library from NestedVM.
You just have to statically link it to an executable ("int
main(){_pause();return 0;}" will do) first. We then get the linker to do a ton
of work for us. You also don't have to put all your libraries in one
big executable. That is only recommended if space is an issue as libc
is replicated in every statically linked binary.

When distributing NestedVM-ified C libraries it makes sense to ship one
executable per library as you have no way of knowing what else will be
needed in the final application. The final app will just have a bunch
of little NestedVM instances floating around, one for each library.

-Brian

David Doty

Jan 9, 2008, 2:39:45 PM
to NestedVM
Hi, Brian -
> Everyone is getting too hung up on this shared library/monolithic executable thing.

I guess if you tell me something often enough, eventually it will
sink in (ah, yes, the old New England masterpiece "Light Dawns Over
Marblehead"... ;-)

I *had* been thinking that all the libraries one wanted to use with
NestedVM had to be linked into *one* executable. If I understand now,
you would link one library per executable (and just never execute the
executable). The potential problem here is that you would have one
standard C library for each "packaged" library, plus any additional
duplications that arise when library A uses library B. Leaving aside
the question of memory usage, wouldn't this present problems for
libraries that use global data (on the other hand, are there such
things any more? On the Windows side we've been told "don't do that"
for about twenty years ;-)

David

Brian Alliet

Jan 10, 2008, 10:08:27 AM
to David Doty, NestedVM
On Wed, Jan 09, 2008 at 11:39:45AM -0800, David Doty wrote:
> NestedVM had to be linked into *one* executable. If I understand now,
> you would link one library per executable (and just never execute the
> executable).

Exactly. You actually do execute the executable, it just immediately
returns control to java and sits around waiting for Runtime.call
requests. This needs to be done to do some initialization of libc.

> The potential problem here is that you would have one
> standard C library for each "packaged" library, plus any additional
> duplications that arise when library A uses library B. Leaving aside
> the question of memory usage, wouldn't this present problems for
> libraries that use global data (on the other hand, are there such
> things any more? On the Windows side we've been told "don't do that"
> for about twenty years ;-)

Yes. This would be a problem. You'd end up with one global data space
per executable. Not only that, but you wouldn't be able to share data
between executables. You couldn't pass a pointer from one to the other
for example (think of them as separate processes).
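
A toy model of why a pointer from one executable means nothing in another. ToyRuntime is a made-up class, not NestedVM's Runtime; it only borrows the memRead/memWrite naming, and its word-addressed memory and bump allocator are simplifications.

```java
// Made-up toy model: each instance owns a private, word-addressed memory,
// the way each NestedVM-compiled executable owns its own address space.
final class ToyRuntime {
    private final int[] memory = new int[1024];
    private int next = 16; // bump allocator; low addresses stay unused so 0 can mean NULL

    // Allocate some words and return the "pointer" (an index into this memory).
    int malloc(int words) {
        int addr = next;
        next += words;
        return addr;
    }

    void memWrite(int addr, int value) { memory[addr] = value; }

    int memRead(int addr) { return memory[addr]; }
}

class AddressSpaceDemo {
    public static void main(String[] args) {
        ToyRuntime libA = new ToyRuntime();
        ToyRuntime libB = new ToyRuntime();

        int p = libA.malloc(1);
        libA.memWrite(p, 42);

        System.out.println(libA.memRead(p)); // prints 42
        System.out.println(libB.memRead(p)); // prints 0: same address, different memory

        // Data has to be copied by value across the boundary instead.
        int q = libB.malloc(1);
        libB.memWrite(q, libA.memRead(p));
        System.out.println(libB.memRead(q)); // prints 42
    }
}
```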

I think if you're doing something complicated enough that this becomes
an issue then linking everything into one executable isn't too much to
ask. You could even reuse the same Java bindings no matter how you did
it. The Java bindings would just have to be parameterized over the
Runtime class used to access the C library:

public MyJavaInterface() { this(new JustThisLibrary()); }
public MyJavaInterface(Runtime rt) { ... }

and then the guy who needs to make things complicated can do:

Runtime rt = new MyMonolithicExecutable();
foo = new MyJavaInterface(rt);
bar = new MyOtherJavaInterface(rt);

where everyone else just does "new MyJavaInterface()"
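
Fleshed out a little, the pattern might look like the sketch below. It is self-contained so it can run without NestedVM: GuestRuntime is a hypothetical stand-in for NestedVM's Runtime class, the two runtime classes fake what the generated classes would do, and the symbol names foo_add and bar_negate are invented.

```java
// Hypothetical stand-in for NestedVM's Runtime: call a guest function by name.
interface GuestRuntime {
    int call(String function, int... args);
}

// Stand-in for the class generated from a binary containing only one library.
final class JustThisLibrary implements GuestRuntime {
    public int call(String function, int... args) {
        if (function.equals("foo_add")) return args[0] + args[1];
        throw new IllegalArgumentException("unknown symbol: " + function);
    }
}

// Stand-in for the class generated from a binary with several libraries linked in.
final class MyMonolithicExecutable implements GuestRuntime {
    public int call(String function, int... args) {
        if (function.equals("foo_add")) return args[0] + args[1];
        if (function.equals("bar_negate")) return -args[0];
        throw new IllegalArgumentException("unknown symbol: " + function);
    }
}

// Bindings for the first library, parameterized over the runtime they use.
final class MyJavaInterface {
    private final GuestRuntime rt;
    MyJavaInterface() { this(new JustThisLibrary()); } // the simple default
    MyJavaInterface(GuestRuntime rt) { this.rt = rt; }
    int add(int a, int b) { return rt.call("foo_add", a, b); }
}

// Bindings for a second library that only exists in the monolithic binary here.
final class MyOtherJavaInterface {
    private final GuestRuntime rt;
    MyOtherJavaInterface(GuestRuntime rt) { this.rt = rt; }
    int negate(int a) { return rt.call("bar_negate", a); }
}

class BindingsDemo {
    public static void main(String[] args) {
        // Most users: one private runtime per library.
        System.out.println(new MyJavaInterface().add(2, 3)); // prints 5

        // Users who need shared state: both bindings on one runtime.
        GuestRuntime rt = new MyMonolithicExecutable();
        MyJavaInterface foo = new MyJavaInterface(rt);
        MyOtherJavaInterface bar = new MyOtherJavaInterface(rt);
        System.out.println(bar.negate(foo.add(2, 3))); // prints -5
    }
}
```

The binding classes never need to know which packaging was chosen, which is what lets the same bindings ship for both cases.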

Thanks for the example though. I hadn't thought of this before. You've
managed to convince me that it might actually be worth looking
into shared library support.

-Brian

Joe Wilson

Jan 10, 2008, 2:37:46 PM
to NestedVM
--- Brian Alliet <br...@brianweb.net> wrote:
> On Wed, Jan 09, 2008 at 11:39:45AM -0800, David Doty wrote:
> > NestedVM had to be linked into *one* executable. If I understand now,
> > you would link one library per executable (and just never execute the
> > executable).
>
> Exactly. You actually do execute the executable, it just immediately
> returns control to java and sits around waiting for Runtime.call
> requests. This needs to be done to do some initialization of libc.

Unless I'm mistaken, sqlitejdbc just seems to call the mips functions
directly without such a dummy main(). Although the real sqlite main()
is linked in to the mips binary, it is never called. (David, if you're
reading this please correct me.)

Perhaps the handful of system calls that sqlite uses do not require
such libc initialization?

> > The potential problem here is that you would have one
> > standard C library for each "packaged" library, plus any additional
> > duplications that arise when library A uses library B. Leaving aside
> > the question of memory usage, wouldn't this present problems for
> > libraries that use global data (on the other hand, are there such
> > things any more? On the Windows side we've been told "don't do that"
> > for about twenty years ;-)
>
> Yes. This would be a problem. You'd end up with one global data space
> per executable. Not only that, but you wouldn't be able to share data
> between executables. You couldn't pass a pointer from one to the other
> for example (think of them as separate processes).

A benefit of this approach could be greater parallelism. Each NestedVM
Runtime could run in a different thread unimpeded by other Runtimes, since
they do not (can not) share data.
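
One way to picture this, as a sketch only: give each runtime its own single-threaded executor, so a runtime is only ever touched by one thread and no locking is needed. ToyGuest is a made-up stand-in for a NestedVM Runtime instance, and squaring a number stands in for a real guest call.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Made-up stand-in for one runtime instance. It is deliberately not
// thread-safe: its state is only safe because one thread owns it.
final class ToyGuest {
    private int calls;
    int call(int x) { calls++; return x * x; }
}

class ParallelRuntimes {
    // Spreads the inputs round-robin over `workers` independent guests,
    // each confined to its own thread, and returns results in input order.
    static int[] runAll(int[] inputs, int workers) {
        ExecutorService[] lanes = new ExecutorService[workers];
        ToyGuest[] guests = new ToyGuest[workers];
        for (int i = 0; i < workers; i++) {
            lanes[i] = Executors.newSingleThreadExecutor();
            guests[i] = new ToyGuest();
        }

        List<Future<Integer>> futures = new ArrayList<>();
        for (int i = 0; i < inputs.length; i++) {
            final int x = inputs[i];
            final ToyGuest guest = guests[i % workers];
            Callable<Integer> task = () -> guest.call(x);
            futures.add(lanes[i % workers].submit(task));
        }

        int[] out = new int[inputs.length];
        try {
            for (int i = 0; i < out.length; i++) out[i] = futures.get(i).get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            for (ExecutorService lane : lanes) lane.shutdown();
        }
        return out;
    }

    public static void main(String[] args) {
        int[] results = runAll(new int[] {1, 2, 3, 4, 5}, 2);
        System.out.println(java.util.Arrays.toString(results)); // prints [1, 4, 9, 16, 25]
    }
}
```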

> I think if you're doing something complicated enough that this becomes
> an issue then linking everything into one executable isn't too much to
> ask. You could even reuse the same Java bindings no matter how you did
> it. The Java bindings would just have to be parameterized over the
> Runtime class used to access the C library:
>
> public MyJavaInterface() { this(new JustThisLibrary()); }
> public MyJavaInterface(Runtime rt) { ... }
>
> and then the guy who needs to make things complicated can do:
>
> Runtime rt = new MyMonolithicExecutable();
> foo = new MyJavaInterface(rt);
> bar = new MyOtherJavaInterface(rt);
>
> where everyone else just does "new MyJavaInterface()"
>
> Thanks for the example though. I hadn't thought of this before. You've
> managed to convince me that it might actually be worth looking
> into shared library support.

It would certainly be easier for the NestedVM user.

Although implementing a runtime linker could add more overhead to average
nestedvm processes due to another level of indirection to get to symbols.

>
> -Brian

Brian Alliet

Jan 10, 2008, 3:04:37 PM
to Joe Wilson, NestedVM
On Thu, Jan 10, 2008 at 11:37:46AM -0800, Joe Wilson wrote:
> Unless I'm mistaken, sqlitejdbc just seems to call the mips functions
> directly without such a dummy main().

You're right, it does (I just checked).

> Perhaps the handful of system calls that sqlite uses do not require
> such libc initialization?

I guess there isn't as much initialization as I thought. I just double
checked and all _start (the thing that runs before main()) does is
set up the environment variable pointer and initialize any static C++
objects that exist.

So I guess you can (and sqlite obviously does) get away without doing
the run main/pause thing, but I still wouldn't call it supported.

-Brian

Joe Wilson

Jan 10, 2008, 3:48:37 PM
to NestedVM
--- Brian Alliet <br...@brianweb.net> wrote:
> I guess there isn't as much initialization as I thought. I just double
> checked and all _start (the thing that runs before main()) does is
> set up the environment variable pointer and initialize any static C++
> objects that exist.
>
> So I guess you can (and sqlite obviously does) get away without doing
> the run main/pause thing, but I still wouldn't call it supported.

It just so happens that getenv() is not called in the calls made by
the sqlite API. It is called indirectly via sqlite's main(), which is
not invoked - so it does not matter in this case. A bit of luck, perhaps? ;-)

Speaking of _start, just tracking down where _init() and _fini()
implementations live is fun. I thought they were implemented in newlib,
but I can't find any trace of them there. Now I think GCC itself creates
these functions for ELF binaries. Do you happen to know where _init/_fini
are implemented?

Brian Alliet

Jan 10, 2008, 4:50:50 PM
to Joe Wilson, NestedVM
On Thu, Jan 10, 2008 at 12:48:37PM -0800, Joe Wilson wrote:
> Speaking of _start, just tracking down where _init() and _fini()
> implementations live is fun.

They are implemented in crt{begin,end,i}.o, which are built with gcc. The
linker does some magic (you'll see it in
src/org/ibex/nestedvm/linker.ld) to arrange for crtbegin.o's part of
.init to appear first, everything else to appear in the middle,
followed by crtend.o's part. It basically builds a function to do all
the initialization at link time. Same for .dtors (the destructors).

To be honest I don't completely understand how it all works either but
that is the basic idea.

-Brian

Joe Wilson

Jan 10, 2008, 5:43:57 PM
to NestedVM
Thanks, Brian.

Using objdump on the binary to see the various functions and sections
is pretty informative as well.
