Yesterday I tweaked OSv further to run a dynamically linked 'Hello World' executable via ld-linux-x86-64.so.2 instead of OSv's built-in dynamic linker. In essence, I built upon the work that got static executables running and modified core/elf.cc to handle ld-linux-x86-64.so.2 as a program interpreter that launches an executable. I also had to enhance the startup code to prepare the program stack accordingly. Here is the output:
OSv v0.57.0-41-gf27b35b9
Booted up in 4.57 ms
Cmdline: /usr/lib/ld-linux-x86-64.so.2 /hello
-> syscall: 012
-> syscall: 158
-> syscall: 257
-> syscall: 000
-> syscall: 017
-> syscall: 017
-> syscall: 009
-> syscall: 009
-> syscall: 009
-> syscall: 009
-> syscall: 003
-> syscall: 063
-> syscall: 021
-> syscall: 257
-> syscall: 257
-> syscall: 262
-> syscall: 257
-> syscall: 262
-> syscall: 257
-> syscall: 262
-> syscall: 257
-> syscall: 262
-> syscall: 257
-> syscall: 262
-> syscall: 257
-> syscall: 262
-> syscall: 257
-> syscall: 262
-> syscall: 257
-> syscall: 262
-> syscall: 257
-> syscall: 262
-> syscall: 257
-> syscall: 000
-> syscall: 017
-> syscall: 017
-> syscall: 017
-> syscall: 262
-> syscall: 017
-> syscall: 009
-> syscall: 009
-> syscall: 009
-> syscall: 009
-> syscall: 009
-> syscall: 003
-> syscall: 009
-> syscall: 158
-> syscall: 218
-> syscall: 273
-> syscall: 010
-> syscall: 010
-> syscall: 010
-> syscall: 302
-> syscall: 262
-> syscall: 016
-> syscall: 318
-> syscall: 228
-> syscall: 228
-> syscall: 012
-> syscall: 012
-> syscall: 001
Hello from C code
-> syscall: 231
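For anyone who wants to read the trace by name rather than by number, here is a minimal sketch (not part of the OSv tree). It assumes the numbers printed are decimal Linux x86-64 syscall numbers; the table covers only the numbers that appear in this trace, and decode_trace is a name made up for this example.

```python
import re

# Subset of the Linux x86-64 syscall table, limited to the numbers in the trace.
SYSCALL_NAMES = {
    0: "read", 1: "write", 3: "close", 9: "mmap", 10: "mprotect",
    12: "brk", 16: "ioctl", 17: "pread64", 21: "access", 63: "uname",
    158: "arch_prctl", 218: "set_tid_address", 228: "clock_gettime",
    231: "exit_group", 257: "openat", 262: "newfstatat",
    273: "set_robust_list", 302: "prlimit64", 318: "getrandom",
}

# Matches trace lines such as "-> syscall: 012".
TRACE_LINE = re.compile(r"^-> syscall: (\d+)$")


def decode_trace(lines):
    """Return the syscall names, in order, for every trace line found."""
    names = []
    for line in lines:
        match = TRACE_LINE.match(line.strip())
        if match is None:
            continue  # program output such as "Hello from C code"
        number = int(match.group(1))  # int() parses "012" as decimal 12
        names.append(SYSCALL_NAMES.get(number, f"unknown({number})"))
    return names


if __name__ == "__main__":
    import sys

    # Usage: python decode_trace.py < console.log
    for name in decode_trace(sys.stdin):
        print(name)
```

Fed the trace above, this shows the run starting with brk, arch_prctl and openat, and finishing with write (the "Hello from C code" line) followed by exit_group.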
And here is the memory map from gdb:
(gdb) osv mmap
0x0000000000000000 0x0000000000000000 [0.0 kB] flags=none perm=none
0x0000100000000000 0x0000100000001000 [4.0 kB] flags=fmF perm=r offset=0x00000000 path=/libvdso.so
0x0000100000001000 0x0000100000002000 [4.0 kB] flags=fmF perm=rx offset=0x00001000 path=/libvdso.so
0x0000100000002000 0x0000100000003000 [4.0 kB] flags=fmF perm=r offset=0x00002000 path=/libvdso.so
0x0000100000003000 0x0000100000004000 [4.0 kB] flags=fmF perm=r offset=0x00002000 path=/libvdso.so
0x0000100000004000 0x0000100000005000 [4.0 kB] flags=fmF perm=rw offset=0x00003000 path=/libvdso.so
0x0000100000005000 0x0000100000007000 [8.0 kB] flags=fmF perm=r offset=0x00000000 path=/usr/lib/ld-linux-x86-64.so.2
0x0000100000007000 0x000010000002d000 [152.0 kB] flags=fmF perm=rx offset=0x00002000 path=/usr/lib/ld-linux-x86-64.so.2
0x000010000002d000 0x0000100000038000 [44.0 kB] flags=fmF perm=r offset=0x00028000 path=/usr/lib/ld-linux-x86-64.so.2
0x0000100000039000 0x000010000003b000 [8.0 kB] flags=fmF perm=r offset=0x00033000 path=/usr/lib/ld-linux-x86-64.so.2
0x000010000003b000 0x000010000003d000 [8.0 kB] flags=fmF perm=rw offset=0x00035000 path=/usr/lib/ld-linux-x86-64.so.2
0x0000200000000000 0x0000200000001000 [4.0 kB] flags=p perm=none
0x0000200000001000 0x0000200000002000 [4.0 kB] flags=p perm=none
0x0000200000002000 0x0000200000101000 [1020.0 kB] flags=p perm=rw
0x0000200000101000 0x0000200000102000 [4.0 kB] flags=p perm=none
0x0000200000102000 0x0000200000201000 [1020.0 kB] flags=p perm=rw
0x0000200000201000 0x0000200000301000 [1024.0 kB] flags=none perm=rw
0x0000200000301000 0x0000200000302000 [4.0 kB] flags=mF perm=r offset=0x00000000 path=/hello
0x0000200000302000 0x0000200000303000 [4.0 kB] flags=fmF perm=rx offset=0x00001000 path=/hello
0x0000200000303000 0x0000200000304000 [4.0 kB] flags=fmF perm=r offset=0x00002000 path=/hello
0x0000200000304000 0x0000200000305000 [4.0 kB] flags=fmF perm=r offset=0x00002000 path=/hello
0x0000200000305000 0x0000200000306000 [4.0 kB] flags=fmF perm=rw offset=0x00003000 path=/hello
0x0000200000306000 0x0000200000308000 [8.0 kB] flags=none perm=rw
0x0000200000400000 0x0000200000428000 [160.0 kB] flags=mF perm=r offset=0x00000000 path=/lib64/libc.so.6
0x0000200000428000 0x000020000059c000 [1488.0 kB] flags=fmF perm=rx offset=0x00028000 path=/lib64/libc.so.6
0x000020000059c000 0x00002000005f4000 [352.0 kB] flags=fmF perm=r offset=0x0019c000 path=/lib64/libc.so.6
0x00002000005f4000 0x00002000005f8000 [16.0 kB] flags=fmF perm=r offset=0x001f3000 path=/lib64/libc.so.6
0x00002000005f8000 0x00002000005fa000 [8.0 kB] flags=fmF perm=rw offset=0x001f7000 path=/lib64/libc.so.6
0x00002000005fa000 0x0000200000607000 [52.0 kB] flags=f perm=rw
0x0000400000000000 0x0000400000000000 [0.0 kB] flags=none perm=none
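To make the map easier to digest, here is a small sketch that groups the file-backed mappings by path and reports the address range and total size of each. It assumes the line format is exactly as printed above; summarize_by_path is a name made up for this example and is not one of the OSv gdb commands.

```python
import re

# One line of "osv mmap" output; offset/path are absent for anonymous mappings.
MMAP_LINE = re.compile(
    r"^(0x[0-9a-f]+) (0x[0-9a-f]+) \[[\d.]+ kB\] flags=(\S+) perm=(\S+)"
    r"(?: offset=(0x[0-9a-f]+) path=(\S+))?$"
)


def summarize_by_path(lines):
    """Map each file path to (lowest start, highest end, total mapped bytes)."""
    summary = {}
    for line in lines:
        match = MMAP_LINE.match(line.strip())
        if match is None or match.group(6) is None:
            continue  # skip anonymous mappings and lines that do not parse
        start = int(match.group(1), 16)
        end = int(match.group(2), 16)
        path = match.group(6)
        low, high, total = summary.get(path, (start, end, 0))
        summary[path] = (min(low, start), max(high, end), total + (end - start))
    return summary


if __name__ == "__main__":
    import sys

    # Usage: python mmap_summary.py < mmap.txt
    for path, (low, high, total) in summarize_by_path(sys.stdin).items():
        print(f"{path}: {low:#x}-{high:#x}, {total // 1024} kB mapped")
```

For the five ld-linux-x86-64.so.2 lines above, this gives the range 0x100000005000 to 0x10000003d000 with 220 kB mapped. It also makes it easy to see that the interpreter sits right after libvdso.so in the 0x1000... range, while /hello and libc.so.6 land in the 0x2000... range.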
As you can see, we have to explicitly tell OSv to run ld-linux-x86-64.so.2 as a statically linked PIE, which then takes control and launches the hello executable. Going forward, we can add a new kernel option to do this. Obviously, the image has to contain libc.so and the other libraries that the OSv dynamic linker would normally provide:
/lib64/libc.so.6: /usr/lib64/libc.so.6
/usr/lib/ld-linux-x86-64.so.2: /usr/lib64/ld-linux-x86-64.so.2
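As background for the stack preparation mentioned at the top: per the System V x86-64 ABI, a program expects its initial stack to hold argc, the argv pointers, a NULL, the envp pointers, another NULL, and then the auxiliary vector as (tag, value) pairs terminated by AT_NULL. The sketch below only models that layout as 64-bit words. It is not the OSv startup code; the helper names and all pointer values are made up for illustration.

```python
import struct

# Auxiliary vector tags as defined in <elf.h>.
AT_NULL = 0
AT_PHDR = 3
AT_PAGESZ = 6
AT_ENTRY = 9


def initial_stack_words(argv_ptrs, envp_ptrs, auxv):
    """Return the 64-bit words of an initial stack, lowest address first."""
    words = [len(argv_ptrs)]            # argc
    words += list(argv_ptrs) + [0]      # argv[0..argc-1], NULL
    words += list(envp_ptrs) + [0]      # envp[...], NULL
    for tag, value in auxv:             # auxiliary vector entries
        words += [tag, value]
    words += [AT_NULL, 0]               # auxv terminator
    return words


def pack_stack(words):
    """Pack the words as little-endian unsigned 64-bit integers."""
    return struct.pack(f"<{len(words)}Q", *words)


if __name__ == "__main__":
    # Hypothetical addresses of the strings "/usr/lib/ld-linux-x86-64.so.2"
    # and "/hello"; real values depend on where the kernel copies them.
    argv = [0x200000300000, 0x200000300020]
    auxv = [(AT_PAGESZ, 4096)]
    for word in initial_stack_words(argv, [], auxv):
        print(f"{word:#018x}")
```

With the command line shown earlier, argv would have two entries: the interpreter path and /hello, which the interpreter then loads itself.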
Now one may ask why we would even want to run dynamically linked executables with the Linux program interpreter when the OSv dynamic linker has been able to do so for years. After all, as my previous email shows, local calls are almost free while syscalls cost around 100 ns.
First, this capability should make OSv even more compatible with Linux executables, because we no longer have to implement every single libc function a new app may need. We just need to handle enough syscalls.
Secondly, the cost of syscalls (still over 3 times faster than on Linux) may not matter as much. We will see once the clone syscall is supported, so that we can stress test and compare more sophisticated apps. The argument goes both ways: many Linux apps have been written to make as few syscalls as possible because they are expensive.
Thirdly, I think that many optimizations in OSv (Van Jacobson channels, memory allocators, etc.) are still going to help, as the kernel will keep using them internally (malloc called from kernel code, for example), even though apps will no longer use them directly because they interact via syscalls. At the same time, this lets apps use their own optimizations (special memory allocators, etc.), which I think may be of extra benefit even on OSv.
All in all, this just makes OSv even more flexible by offering more ways to run Linux apps. I am very excited about this new capability, and about the ability to run static executables as well.
Regards,
Waldek
PS. Hopefully, I will have some news about the clone syscall soon as well.