New user: trouble running a simple program


De Vries

May 21, 2020, 5:46:29 AM
to OSv Development
Hi,

Sorry if this is a bit of a newbie question. I'm trying to run a pretty simple application on OSv: pxz. I'm able to run other apps like mysql for example without any problem.
I have tried this the following way. First, I compiled the pxz executable with the -fPIE flag on the host machine, then put it in a new folder at osv/apps/pxz. I then ran the following:
./scripts/manifest_from_host.sh -r ~/osv/apps/pxz/pxz > ./apps/pxz/usr.manifest
./scripts/build image=pxz

It generates the following usr.manifest
# (PIE) Position Independent Executable
/pxz: /home/user1/osv/apps/pxz/pxz
# --------------------
# Dependencies
# --------------------
/usr/lib/libgomp.so.1: /usr/lib/x86_64-linux-gnu/libgomp.so.1
/usr/lib/liblzma.so.5: /lib/x86_64-linux-gnu/liblzma.so.5
# --------------------

Running it with 
./scripts/run.py -e "pxz --version"

Results in
OSv v0.55.0-6-g557251e1
eth0: 192.168.122.15
Booted up in 407.56 ms
Cmdline: pxz --version

But it just hangs. No errors, but also no output. I have tried actually using pxz (not just --version) to compress a file but that also hangs indefinitely (while this works fine on the host machine).
Running ./scripts/run.py with the -V flag looks completely fine except maybe for the last line that is printed (after it prints Cmdline: pxz --version):
sysconf(): stubbed for parameter 0

I have also tried to run pxz the way it's done in the native-example application, but that also hangs indefinitely.
What could be the issue here?

Nadav Har'El

May 21, 2020, 6:59:07 AM
to De Vries, OSv Development

It's hard to say. It seems like you did everything right. I assume that if you run "pxz --version" on the host it works properly - prints a version number and exits - right?
During the "hang", is OSv busy-looping ("top" will show you the OSv VM taking 100% CPU), or is it waiting for something?

One thing you can do to figure out what is going on is to attach gdb to the running VM and ask it which threads are running and what they are waiting for.
It's not trivial to do, but not particularly difficult either, and it is explained well (I hope) here: https://github.com/cloudius-systems/osv/wiki/Debugging-OSv#debugging-osv-with-gdb
Note that you don't need to rebuild OSv specially for debugging to debug it this way.
 

Running ./scripts/run.py with the -V flag looks completely fine except maybe for the last line that is printed (after it prints Cmdline: pxz --version):
sysconf(): stubbed for parameter 0


Parameter 0 is _SC_ARG_MAX. sysconf() indeed does not implement it on OSv (though it would be trivial to implement), but I doubt that this is what causes the hang. I also wonder why this program would need to check _SC_ARG_MAX if it's just planning to print the version number, not exec() anything. You can look at the program's source code to see what it does with _SC_ARG_MAX.
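
To compare what sysconf(_SC_ARG_MAX) reports on the host and inside OSv, a small probe like the following might help. This is only a sketch: probe_arg_max() is a hypothetical helper name, not part of pxz or OSv, and it assumes the usual POSIX convention that a return of -1 with errno unchanged means "no definite limit".

```c
#include <errno.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: prints and returns the raw sysconf(_SC_ARG_MAX) value. */
long probe_arg_max(void)
{
    errno = 0;                          /* so we can tell an error from "no limit" */
    long v = sysconf(_SC_ARG_MAX);

    if (v == -1 && errno != 0)
        printf("sysconf(_SC_ARG_MAX) failed, errno=%d\n", errno);
    else if (v == -1)
        printf("_SC_ARG_MAX: no definite limit\n");
    else
        printf("_SC_ARG_MAX = %ld\n", v);

    return v;
}
```

Calling probe_arg_max() from a tiny test program built the same way as pxz, and running it both on the host and under OSv, should show whether the stubbed call returns something the application can safely use.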

 

--
You received this message because you are subscribed to the Google Groups "OSv Development" group.
To unsubscribe from this group and stop receiving emails from it, send an email to osv-dev+u...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/osv-dev/9ce2c259-c6e9-475d-aa73-e7e6d71cd722%40googlegroups.com.

Waldek Kozaczuk

May 21, 2020, 3:16:29 PM
to OSv Development
I connected with gdb, and here is the stack trace I got for the main app thread:

#0  sched::thread::switch_to (this=this@entry=0xffff8000001d1040) at arch/x64/arch-switch.hh:108
#1  0x000000004040dace in sched::cpu::reschedule_from_interrupt (this=0xffff80000001e040, called_from_yield=called_from_yield@entry=false, 
    preempt_after=..., preempt_after@entry=...) at core/sched.cc:339
#2  0x000000004040e800 in sched::cpu::schedule () at include/osv/sched.hh:1315
#3  0x000000004040e8e6 in sched::thread::wait (this=this@entry=0xffff800000f0a040) at core/sched.cc:1216
#4  0x000000004043ca86 in sched::thread::do_wait_for<lockfree::mutex, sched::wait_object<waitqueue> > (mtx=...) at include/osv/mutex.h:41
#5  sched::thread::wait_for<waitqueue&> (mtx=...) at include/osv/sched.hh:1225
#6  waitqueue::wait (this=this@entry=0x408fa650 <mmu::vma_list_mutex+48>, mtx=...) at core/waitqueue.cc:56
#7  0x00000000403eb27b in rwlock::reader_wait_lockable (this=<optimized out>) at core/rwlock.cc:174
#8  rwlock::rlock (this=this@entry=0x408fa620 <mmu::vma_list_mutex>) at core/rwlock.cc:29
#9  0x000000004034b88c in rwlock_for_read::lock (this=0x408fa620 <mmu::vma_list_mutex>) at include/osv/rwlock.h:113
#10 std::lock_guard<rwlock_for_read&>::lock_guard (__m=..., this=<synthetic pointer>) at /usr/include/c++/9/bits/std_mutex.h:159
#11 lock_guard_for_with_lock<rwlock_for_read&>::lock_guard_for_with_lock (lock=..., this=<synthetic pointer>) at include/osv/mutex.h:89
#12 mmu::vm_fault (addr=17592186081280, addr@entry=17592186083096, ef=ef@entry=0xffff800000f0f068) at core/mmu.cc:1333
#13 0x00000000403adf7c in page_fault (ef=0xffff800000f0f068) at arch/x64/mmu.cc:42
#14 <signal handler called>
#15 0x00000000405bf0cd in _Unwind_IteratePhdrCallback ()
#16 0x000000004047fd37 in <lambda(const elf::program::modules_list&)>::operator() (ml=..., __closure=<synthetic pointer>) at libc/dlfcn.cc:118
#17 elf::program::with_modules<dl_iterate_phdr(int (*)(dl_phdr_info*, size_t, void*), void*)::<lambda(const elf::program::modules_list&)> > (f=..., 
    this=0xffffa0000009cbb0) at include/osv/elf.hh:698
#18 dl_iterate_phdr (callback=0x405befa0 <_Unwind_IteratePhdrCallback>, data=0x200000700520) at libc/dlfcn.cc:99
#19 0x00000000405c0255 in _Unwind_Find_FDE ()
#20 0x00000000405bc693 in uw_frame_state_for ()
#21 0x00000000405be1da in _Unwind_RaiseException ()
#22 0x00000000404c4d1c in __cxa_throw ()
#23 0x0000000040205229 in mmu::find_hole (start=<optimized out>, size=<optimized out>) at include/osv/error.h:36
#24 0x000000004034ecea in mmu::allocate (v=v@entry=0xffffa00000cf2b80, start=35184372088832, start@entry=0, size=size@entry=9223372036854779904, 
    search=search@entry=true) at core/mmu.cc:1113
#25 0x000000004034fa97 in mmu::map_anon (addr=addr@entry=0x0, size=size@entry=9223372036854779904, flags=flags@entry=2, perm=perm@entry=3)
    at core/mmu.cc:1219
#26 0x00000000403f89a0 in memory::mapped_malloc_large (offset=64, size=9223372036854779904) at core/mempool.cc:919
#27 memory::malloc_large (size=9223372036854779904, alignment=16, block=true, contiguous=false) at core/mempool.cc:919
#28 0x00000000403fa272 in std_malloc (size=9223372036854775807, alignment=16) at core/mempool.cc:1795
#29 0x00000000403fa63b in malloc (size=9223372036854775807) at core/mempool.cc:2001
#30 0x00001000000075d5 in main ()
#31 0x0000000040444c11 in osv::application::run_main (this=0xffffa0007ffb4210) at /usr/include/c++/9/bits/stl_vector.h:915
#32 0x0000000040444d65 in __libc_start_main (main=0x100000007560 <main>) at core/app.cc:37
#33 0x000010000000801e in _start ()

It is trying to allocate a huge amount of memory (9223372036854775807 bytes, i.e. 2^63 - 1), and it looks like we crash in find_hole(), probably at throw make_error(ENOMEM).

I wonder whether the app (https://github.com/jnovy/pxz/blob/master/pxz.c) is really passing such a size, or whether there is a bug on our side.

(BTW osv info threads fails like this - would be nice to fix it:

(gdb) osv info threads
   1 (0xffff800000017040) reclaimer       cpu0 status::waiting condvar::wait(lockfree::mutex*, sched::timer*) at core/condvar.cc:43 vruntime  6.07461e-25
Python Exception <class 'Exception'> Class does not extend list_base_hook: sched::timer_base: 
Error occurred in Python: Class does not extend list_base_hook: sched::timer_base
)

When I examined pxz.c, I found that it eventually calls execve() (see run_xz() below), which will definitely NOT work in OSv: OSv does not support processes, so forking does not work. (There is a research fork that does support it; I recently sent a paper about it.)

void __attribute__((noreturn)) run_xz( char **argv, char **envp ) {
        execve(XZ_BINARY, argv, envp);
        error(0, errno, "execution of "XZ_BINARY" binary failed");
        exit(EXIT_FAILURE);
}

xz seems to work fine (at least --help):

./scripts/manifest_from_host.sh -w xz && ./scripts/build --append-manifest fs=rofs
./scripts/firecracker.py 
OSv v0.55.0-9-gc13529d9
Booted up in 7.42 ms
Cmdline: /xz --help 
Usage: /xz [OPTION]... [FILE]...
Compress or decompress FILEs in the .xz format.

  -z, --compress      force compression
  -d, --decompress    force decompression
  -t, --test          test compressed file integrity
  -l, --list          list information about .xz files
  -k, --keep          keep (don't delete) input files
  -f, --force         force overwrite of output file and (de)compress links
  -c, --stdout        write to standard output and don't delete input files
  -0 ... -9           compression preset; default is 6; take compressor *and*
                      decompressor memory usage into account before using 7-9!
  -e, --extreme       try to improve compression ratio by using more CPU time;
                      does not affect decompressor memory requirements
  -T, --threads=NUM   use at most NUM threads; the default is 1; set to 0
                      to use as many threads as there are processor cores
  -q, --quiet         suppress warnings; specify twice to suppress errors too
  -v, --verbose       be verbose; specify twice for even more verbose
  -h, --help          display this short help and exit
  -H, --long-help     display the long help (lists also the advanced options)
  -V, --version       display the version number and exit

With no FILE, or when FILE is -, read standard input.

Report bugs to <lasse....@tukaani.org> (in English or Finnish).
XZ Utils home page: <https://tukaani.org/xz/>

Waldek

Waldek Kozaczuk

May 21, 2020, 3:20:47 PM
to OSv Development
I think this code in the app might explain this huge malloc:

lzma_options_lzma lzma_options;
xzcmd_max = sysconf(_SC_ARG_MAX);
page_size = sysconf(_SC_PAGE_SIZE);
xzcmd = malloc(xzcmd_max);
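
One way an application could protect itself against an unexpected sysconf() result is to clamp the value before passing it to malloc(). The following is a minimal sketch, not the pxz author's code: clamp_cmd_buf_size(), alloc_cmd_buf(), and both constants are hypothetical names and values chosen only for illustration.

```c
#include <stdlib.h>
#include <unistd.h>

#define CMD_BUF_FALLBACK 4096L       /* hypothetical default when no usable limit is reported */
#define CMD_BUF_CAP      (1L << 20)  /* hypothetical 1 MiB upper bound */

/* Turn whatever sysconf() reported into a sane buffer size. */
long clamp_cmd_buf_size(long reported)
{
    if (reported <= 0)               /* error, "no limit", or a stubbed value */
        return CMD_BUF_FALLBACK;
    if (reported > CMD_BUF_CAP)      /* absurdly large values are capped */
        return CMD_BUF_CAP;
    return reported;
}

/* Allocate the command buffer; returns NULL if malloc() fails. */
char *alloc_cmd_buf(size_t *out_size)
{
    long n = clamp_cmd_buf_size(sysconf(_SC_ARG_MAX));
    char *buf = malloc((size_t)n);

    if (buf == NULL)
        return NULL;
    if (out_size != NULL)
        *out_size = (size_t)n;
    return buf;
}
```

With a guard like this, a request of 2^63 - 1 bytes such as the one in the stack trace would be reduced to the cap instead of reaching the allocator, and a failed malloc() would be reported to the caller rather than used blindly.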

f1r3fl...@gmail.com

May 21, 2020, 3:40:46 PM
to OSv Development
Hi, thanks for the prompt replies. I was just typing up a long response (and ended up suspecting similar culprits) until I noticed there were 2 new replies. Thanks for looking into it.
I'm guessing this means it is more or less impossible to use the pxz application, unless I rewrite the source such that it does not fork the xz executable?

Waldek Kozaczuk

May 21, 2020, 4:22:36 PM
to f1r3fl...@gmail.com, OSv Development
Pretty much. Use some sort of multi-threading approach. 
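
As a rough illustration of the multi-threading idea, the sketch below splits a buffer into chunks and processes each chunk in its own pthread inside a single process, instead of spawning one xz process per chunk. Everything here is an assumption made for illustration: process_chunk() is a stand-in that merely sums bytes (a real port would call into liblzma at that point), and struct chunk_job, process_parallel(), and MAX_WORKERS are hypothetical names.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_WORKERS 16               /* hypothetical upper bound for this sketch */

struct chunk_job {
    const uint8_t *data;             /* start of this worker's chunk */
    size_t len;                      /* chunk length in bytes */
    uint64_t result;                 /* filled in by the worker */
};

/* Stand-in for per-chunk work; a real port would compress the chunk here. */
static void *process_chunk(void *arg)
{
    struct chunk_job *job = arg;
    uint64_t sum = 0;

    for (size_t i = 0; i < job->len; i++)
        sum += job->data[i];
    job->result = sum;
    return NULL;
}

/* Split buf across nworkers threads; returns 0 on success, -1 if a thread
 * could not be created (total then covers only the chunks that ran). */
int process_parallel(const uint8_t *buf, size_t len, int nworkers, uint64_t *total)
{
    assert(nworkers > 0 && nworkers <= MAX_WORKERS);

    pthread_t tids[MAX_WORKERS];
    struct chunk_job jobs[MAX_WORKERS];
    size_t per = len / (size_t)nworkers;
    int started = 0;
    int rc = 0;

    for (int i = 0; i < nworkers; i++) {
        size_t off = (size_t)i * per;
        jobs[i].data = buf + off;
        /* The last worker also takes the remainder. */
        jobs[i].len = (i == nworkers - 1) ? len - off : per;
        jobs[i].result = 0;
        if (pthread_create(&tids[i], NULL, process_chunk, &jobs[i]) != 0) {
            rc = -1;
            break;
        }
        started++;
    }

    *total = 0;
    for (int i = 0; i < started; i++) {
        pthread_join(tids[i], NULL);
        *total += jobs[i].result;
    }
    return rc;
}
```

Note that the xz help output earlier in the thread lists a -T, --threads=NUM option, and liblzma provides a multithreaded stream encoder (lzma_stream_encoder_mt), so it may be possible to get parallel compression without hand-rolled chunking at all.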