Running Java (pie) from host without wrapper


Waldek Kozaczuk

Apr 14, 2019, 4:21:16 PM
to OSv Development
While researching OSv's ability to run an unmodified Linux PIE as-is from the host, I did a quick experiment with Java 8.

I created an app that takes all Java 8 artifacts as-is from the host (so no wrapper):
cat usr.manifest 
/usr/lib/jvm/java-8-openjdk-amd64/jre/**: /usr/lib/jvm/java-8-openjdk-amd64/jre/**
/usr/lib/libjli.so: -> /usr/lib/jvm/java-8-openjdk-amd64/jre/lib/amd64/jli/libjli.so
/Hello.class: ${MODULE_DIR}/Hello.class
and I was able to run it, with two caveats:

1. After the Java app finished executing, OSv did not terminate. When I connected with gdb, some Java threads were stuck like this:
(gdb) bt
#0  sched::thread::switch_to (this=0xffff800000aa7040, this@entry=0xffff800001e73040) at arch/x64/arch-switch.hh:108
#1  0x00000000003fd104 in sched::cpu::reschedule_from_interrupt (this=0xffff80000096b040, 
    called_from_yield=called_from_yield@entry=false, preempt_after=..., preempt_after@entry=...) at core/sched.cc:339
#2  0x00000000003fd5fc in sched::cpu::schedule () at include/osv/sched.hh:1309
#3  0x00000000003fdd22 in sched::thread::wait (this=this@entry=0xffff800002f87040) at core/sched.cc:1214
#4  0x00000000003e096f in sched::thread::do_wait_until<sched::noninterruptible, sched::thread::dummy_lock, waiter::wait(sched::timer*) const::{lambda()#1}>(sched::thread::dummy_lock&, waiter::wait(sched::timer*) const::{lambda()#1}) (pred=..., 
    mtx=<synthetic pointer>...) at include/osv/sched.hh:938
#5  sched::thread::wait_until<waiter::wait(sched::timer*) const::{lambda()#1}>(waiter::wait(sched::timer*) const::{lambda()#1}) (pred=...) at include/osv/sched.hh:1076
#6  waiter::wait (tmr=0x0, this=0x2000105ce670) at include/osv/wait_record.hh:46
#7  condvar::wait (this=0xffffa00002c61280, user_mutex=0xffffa00002c5b140, tmr=<optimized out>) at core/condvar.cc:43
#8  0x0000100000c9842b in SharedRuntime::generate_native_wrapper(MacroAssembler*, methodHandle, int, BasicType*, VMRegPair*, BasicType) ()
#9  0x0000100000c8036d in SharedRuntime::throw_StackOverflowError(JavaThread*) ()
#10 0x0000000000000000 in ?? ()
I wonder if this is the same reason we have special logic in the wrapper (java.cc) to terminate all Java threads. Ctrl-C does not work.

2. I believe there is a bug in elf.cc::program::load_object(): I had to hack it to force it to find and load one of the shared objects that the java PIE executable depends on (all the other shared objects loaded fine). There may actually be 2 bugs:
  • there is a bug that prevents the java PIE from loading libjli.so; adding this nasty hack worked:
@@ -1149,6 +1150,9 @@ std::shared_ptr<elf::object>
 program::load_object(std::string name, std::vector<std::string> extra_path,
         std::vector<std::shared_ptr<object>> &loaded_objects)
 {
+    if( name == "libjli.so" ) {
+        name = "/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/amd64/jli/libjli.so";
+    }
     fileref f;
  • there is a bug that silently ignores the fact that a shared object has not been loaded, so OSv ends up reporting this:
/usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java: failed looking up symbol JLI_Launch

[backtrace]
0x000000000035cb8d <elf::object::symbol(unsigned int, bool)+1325>
0x000000000035cc4f <elf::object::resolve_pltgot(unsigned int)+127>
0x000000000035ce29 <elf_resolve_pltgot+57>
0x00000000003a2d1f <???+3812639>
0x00002000001ffe9f <???+2096799>
0x000000000042e97c <osv::application::run_main()+60>
0x000000000042eaae <__libc_start_main+46>
0x00001000000060c9 <???+24777>

I also ran more complicated examples, such as a Vert.x REST service, and they worked perfectly.

Waldek
 

Nadav Har'El

Apr 14, 2019, 6:03:37 PM
to Waldek Kozaczuk, OSv Development
On Sun, Apr 14, 2019 at 11:21 PM Waldek Kozaczuk <jwkoz...@gmail.com> wrote:
> 1. After executing java apps OSv did not terminate and after I connected with gdb and some Java threads were stuck like this:
> [...]
> I wonder if that is similar to why we have special logic to terminate all java threads in wrapper (java.cc). Ctrl-C does not work.

Yes. The code at the end of java.cc explains why it's needed:

    // Unfortunately, Java's jvm->DestroyJavaVM() doesn't fully clean up, and
    // leaves behind some detached threads such as GC threads and compilation
    // threads. If we return with those still existing, loader.cc will wait
    // (using application::join()) in vain for these threads to finish.
    // So let's stop these threads. This call is unsafe, in the sense we
    // assume that those renegade threads are not holding any critical
    // resources (e.g., not in the middle of I/O or memory allocation).
    while(!osv::application::unsafe_stop_and_abandon_other_threads()) {
        usleep(100000);
    }

The thing is that when loader.cc runs an application, it deliberately doesn't just wait for the main() function to return: it also waits for any other threads that the application started to finish. But in Java, there is simply no way to close these extra threads (such as Java GC threads). This is really a bug in Java, but it does make one wonder why OSv needs to be that pedantic about it. We could have a run option to just run main() and shut down once main() returns, without caring about any other threads (and, of course, without being able to run another application after this one exits).
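To make the hang concrete, here is a small standalone simulation (plain C++ threads, not OSv code) of a loader that waits for every thread an application started. The names `app_threads`, `join_for` and `demo_lingering_thread` are hypothetical and exist only for this sketch; the parked helper thread stands in for something like an idle GC thread.

```cpp
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <utility>

// Hypothetical bookkeeping for the threads an application has started.
class app_threads {
public:
    void started() { std::lock_guard<std::mutex> g(_mtx); ++_live; }
    void finished() {
        { std::lock_guard<std::mutex> g(_mtx); --_live; }
        _cv.notify_all();
    }
    // Wait until every registered thread has finished or `timeout` expires.
    // Returns true only if no registered thread is still alive.
    bool join_for(std::chrono::milliseconds timeout) {
        std::unique_lock<std::mutex> l(_mtx);
        return _cv.wait_for(l, timeout, [this] { return _live == 0; });
    }
private:
    std::mutex _mtx;
    std::condition_variable _cv;
    int _live = 0;
};

// Simulates "main() has returned, but a runtime helper thread is still parked".
// Returns {joined while the helper was parked, joined after it was released}.
std::pair<bool, bool> demo_lingering_thread()
{
    app_threads app;
    std::mutex m;
    std::condition_variable cv;
    bool release = false;

    app.started();
    std::thread helper([&] {
        std::unique_lock<std::mutex> l(m);
        cv.wait(l, [&] { return release; });  // parked, like an idle GC thread
        l.unlock();
        app.finished();
    });

    // The "loader" cannot finish joining while the helper is parked.
    bool while_parked = app.join_for(std::chrono::milliseconds(50));

    // Only once something makes the helper exit does the join complete.
    { std::lock_guard<std::mutex> g(m); release = true; }
    cv.notify_all();
    bool after_release = app.join_for(std::chrono::seconds(5));

    helper.join();
    return {while_parked, after_release};
}
```

In this simulation the first join times out and the second succeeds. With the JVM there is no equivalent of `release`, which is why the wrapper stops the leftover threads forcibly, and why a "shut down once main() returns" option would sidestep the wait altogether.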

 

> 2. I believe there is a bug in elf.cc::program::load_object() where I had to hack it to force finding and loading one of the shared object java pie executable depended on (all other so loaded fine). There may be actually 2 bugs:
>   • there is a bug that prevents loading libjli.so by java pie; adding this nasty hack worked:
> [...]

Let's see how this is supposed to work:

libjli.so is needed by jre/lib/amd64/libinstrument.so. On my host (Fedora 29),

$ ldd /usr/lib/jvm/java-1.8.0-openjdk/jre/lib/amd64/libinstrument.so
        linux-vdso.so.1 (0x00007ffec512e000)
        libz.so.1 => /lib64/libz.so.1 (0x00007fcd91ac0000)
        libjli.so => /usr/lib/jvm/java-1.8.0-openjdk/jre/lib/amd64/jli/libjli.so (0x00007fcd91aae000)
        libdl.so.2 => /lib64/libdl.so.2 (0x00007fcd91aa8000)
        libc.so.6 => /lib64/libc.so.6 (0x00007fcd918e2000)
        libpthread.so.0 => /lib64/libpthread.so.0 (0x00007fcd918c0000)
        /lib64/ld-linux-x86-64.so.2 (0x00007fcd91b28000)

How did it find the pathname of libjli.so? Well,

$ readelf -a /usr/lib/jvm/java-1.8.0-openjdk/jre/lib/amd64/libinstrument.so
...
 0x0000000000000001 (NEEDED)             Shared library: [libz.so.1]
 0x0000000000000001 (NEEDED)             Shared library: [libjli.so]
 0x0000000000000001 (NEEDED)             Shared library: [libdl.so.2]
 0x0000000000000001 (NEEDED)             Shared library: [libc.so.6]
 0x000000000000000e (SONAME)             Library soname: [libinstrument.so]
 0x000000000000000f (RPATH)              Library rpath: [$ORIGIN:$ORIGIN/jli]

This RPATH says that when loading /usr/lib/jvm/java-1.8.0-openjdk/jre/lib/amd64/libinstrument.so and searching for "libjli.so", the loader should also search for it in $ORIGIN/jli, i.e., /usr/lib/jvm/java-1.8.0-openjdk/jre/lib/amd64/jli, which is indeed where it is.
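As an illustration of that expansion, here is a minimal standalone sketch. It is not the code in core/elf.cc; `dir_of` and `expand_rpath` are hypothetical names used only here. It splits a colon-separated RPATH and replaces $ORIGIN with the directory of the object that carries the RPATH.

```cpp
#include <string>
#include <vector>

// Directory part of a path, e.g. "/a/b/c.so" -> "/a/b".
// Hypothetical helper for this sketch only.
static std::string dir_of(const std::string& path)
{
    auto slash = path.rfind('/');
    if (slash == std::string::npos) {
        return ".";
    }
    return slash == 0 ? "/" : path.substr(0, slash);
}

// Split a colon-separated RPATH and replace every "$ORIGIN" with the
// directory that contains `object_path`. Empty entries are skipped.
std::vector<std::string> expand_rpath(const std::string& rpath,
                                      const std::string& object_path)
{
    const std::string token = "$ORIGIN";
    const std::string origin = dir_of(object_path);
    std::vector<std::string> dirs;

    std::string::size_type start = 0;
    while (start <= rpath.size()) {
        auto end = rpath.find(':', start);
        if (end == std::string::npos) {
            end = rpath.size();
        }
        std::string dir = rpath.substr(start, end - start);
        // Substitute each occurrence of the token within this entry.
        for (auto pos = dir.find(token); pos != std::string::npos;
             pos = dir.find(token, pos + origin.size())) {
            dir.replace(pos, token.size(), origin);
        }
        if (!dir.empty()) {
            dirs.push_back(dir);
        }
        start = end + 1;
    }
    return dirs;
}
```

Calling `expand_rpath("$ORIGIN:$ORIGIN/jli", ...)` with the libinstrument.so path above yields the amd64 directory and its jli subdirectory, in that order. Note that the result depends entirely on which pathname is taken as the object's location.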

And we do have proper handling of DT_RPATH in core/elf.cc.

So either we have a bug in that RPATH handling that wasn't apparent to me just now when I reviewed the code (nor to Avi Kivity when he wrote it), or there's a different problem. Maybe we installed libjli.so in the wrong tree? If you add printouts to core/elf.cc's handling of DT_RPATH, you may be able to see more clearly what the bug is.

>   • there is a bug that silently ignores the fact that shared file has not been loaded and ends up OSv report this:
For historic reasons, OSv prints missing DT_NEEDED messages with debug():

            debug("could not load %s\n", lib);

This means that unless you run with --verbose, you'll never see these messages...
Hiding these messages was convenient in the past, and sometimes still is (see https://github.com/cloudius-systems/osv/issues/601) but maybe it makes sense to convert this message to a debug_always(), so it is always shown?



Waldek Kozaczuk

Apr 14, 2019, 8:16:04 PM
to OSv Development
Being able to run the JVM without a wrapper is pretty significant (I guess it has been possible for over 4 years, since support for running PIEs was added to OSv). We do not need all this related baggage anymore. Even now the wrapper has some limitations that prevent certain Java 9 and later apps from running, due to inadequate support for new options.

In my other email I also mentioned that I was able to run Python 2 as-is, without having to compile the main executable as is done in the python2x app.

On Sunday, April 14, 2019 at 6:03:37 PM UTC-4, Nadav Har'El wrote:
> [...]

Sounds like a great idea. I will try to come up with a patch at some point.
I will try to figure it out. Hopefully something simple.

Waldek Kozaczuk

May 10, 2019, 1:37:00 PM
to OSv Development
It has to do with the fact that what I am executing is a symlink to the real executable:
/java -> /usr/lib/jvm/jdk-zulu11.31.11-ca-jdk11.0.3-linux_x64-java-base/bin/java

So if I execute this:
/scripts/run.py -e '/usr/lib/jvm/jdk-zulu11.31.11-ca-jdk11.0.3-linux_x64-java-base/bin/java --version'
it works.

If I execute this:
/scripts/run.py -e '/java --version'
it fails with the error I described in my original message.

Here is the dynamic section of java:
Dynamic section at offset 0x1018 contains 30 entries:
  Tag        Type                         Name/Value
 0x0000000000000001 (NEEDED)             Shared library: [libz.so.1]
 0x0000000000000001 (NEEDED)             Shared library: [libpthread.so.0]
 0x0000000000000001 (NEEDED)             Shared library: [libjli.so]
 0x0000000000000001 (NEEDED)             Shared library: [libdl.so.2]
 0x0000000000000001 (NEEDED)             Shared library: [libc.so.6]
 0x000000000000000f (RPATH)              Library rpath: [$ORIGIN/../lib/jli:$ORIGIN/../lib]
 ...

and this part of the host file system (which is the same as on OSv FS):
/usr/lib/jvm/jdk-zulu11.31.11-ca-jdk11.0.3-linux_x64-java-base/lib/jli/libjli.so
/usr/lib/jvm/jdk-zulu11.31.11-ca-jdk11.0.3-linux_x64-java-base/bin/java
/java -> /usr/lib/jvm/jdk-zulu11.31.11-ca-jdk11.0.3-linux_x64-java-base/bin/java

It works on Linux so I guess there is a way to fix it.

Nadav Har'El

May 10, 2019, 4:25:40 PM
to Waldek Kozaczuk, OSv Development
Excellent detective work.  I opened https://github.com/cloudius-systems/osv/issues/1039 on this.
I outlined the problem you discovered, and the very simple fix:

Today, object::load_needed() does:
        boost::replace_all(rpath_str, "$ORIGIN", dirname(_pathname));

Instead of using "_pathname" here as-is, we need to first try readlink() on _pathname and, if it is indeed a link, use the result instead of _pathname.
Actually, we need to do this in a loop (we can have a link to a link to a link!), with a limit on the number of iterations so that it cannot loop forever.
Should be very easy to fix. I would be happy if you send a patch for this because you already have the setup to test it.
