Hi,

When one compiles the kernel with most symbols hidden (which also completely hides the standard C++ library), tst-tls-pie.so crashes like so:

#0 0x00000000402da5b2 in processor::cli_hlt () at arch/x64/processor.hh:247
#1 arch::halt_no_interrupts () at arch/x64/arch.hh:48
#2 osv::halt () at arch/x64/power.cc:26
#3 0x00000000402211a0 in abort (fmt=fmt@entry=0x4049aef3 "Aborted\n") at runtime.cc:137
#4 0x00000000402211b2 in abort () at runtime.cc:103
#5 0x000000004039cade in osv::generate_signal (siginfo=..., ef=0xffff800000e87068) at libc/signal.cc:130
#6 0x000000004039cb9f in osv::handle_mmap_fault (addr=addr@entry=18446744073709547520, sig=sig@entry=11, ef=ef@entry=0xffff800000e87068)
at libc/signal.cc:145
#7 0x000000004028709b in mmu::vm_sigsegv (ef=0xffff800000e87068, addr=18446744073709547520) at core/mmu.cc:1334
#8 mmu::vm_fault (addr=18446744073709547520, addr@entry=18446744073709551592, ef=ef@entry=0xffff800000e87068) at core/mmu.cc:1354
#9 0x00000000402d3d90 in page_fault (ef=0xffff800000e87068) at arch/x64/mmu.cc:42
#10 <signal handler called>
#11 0x0000100000156cf7 in std::ostream::sentry::sentry (this=0x200000200df0, __os=...)
at /usr/src/debug/gcc-10.3.1-1.fc33.x86_64/obj-x86_64-redhat-linux/x86_64-redhat-linux/libstdc++-v3/include/bits/ostream.tcc:51
#12 0x000010000015743c in std::__ostream_insert<char, std::char_traits<char> > (__out=..., __s=0x40303a "PASS", __n=__n@entry=4)
at /usr/src/debug/gcc-10.3.1-1.fc33.x86_64/obj-x86_64-redhat-linux/x86_64-redhat-linux/libstdc++-v3/include/bits/ostream_insert.h:82
#13 0x0000000000401ccf in std::operator<< <std::char_traits<char> > (__s=<optimized out>, __out=...)
at /usr/include/c++/10/bits/char_traits.h:371
#14 report (ok=<optimized out>, msg="v7 in init function") at /home/wkozaczuk/projects/osv-true-master/tests/tst-tls.cc:57
#15 0x00000000004014b3 in before_main () at /home/wkozaczuk/projects/osv-true-master/tests/tst-tls.cc:127
#16 0x00000000402972c5 in elf::object::run_init_funcs (this=0xffffa0000094a600, argc=argc@entry=1, argv=argv@entry=0xffffa0000094a400)
at core/elf.cc:1178
#17 0x0000000040298a0b in elf::program::init_library (this=<optimized out>, argc=1, argv=0xffffa0000094a400) at core/elf.cc:1500
#18 0x000000004020c239 in osv::application::main (this=0xffffa0000094cc10) at core/app.cc:319
#19 0x0000000040365cb9 in operator() (app=<optimized out>, __closure=0x0) at core/app.cc:236
#20 _FUN () at core/app.cc:238
#21 0x0000000040398a66 in operator() (__closure=0xffffa00000abda00) at libc/pthread.cc:116
#22 std::__invoke_impl<void, pthread_private::pthread::pthread(void* (*)(void*), void*, sigset_t, const pthread_private::thread_attr*)::<lambda()>&> (__f=...) at /usr/include/c++/10/bits/invoke.h:60
#23 std::__invoke_r<void, pthread_private::pthread::pthread(void* (*)(void*), void*, sigset_t, const pthread_private::thread_attr*)::<lambda()>&> (__fn=...) at /usr/include/c++/10/bits/invoke.h:153
#24 std::_Function_handler<void(), pthread_private::pthread::pthread(void* (*)(void*), void*, sigset_t, const pthread_private::thread_attr*)::<lambda()> >::_M_invoke(const std::_Any_data &) (__functor=...) at /usr/include/c++/10/bits/std_function.h:291
#25 0x00000000403391ea in sched::thread::main (this=0xffff800000e82040) at core/sched.cc:1267
#26 sched::thread_main_c (t=0xffff800000e82040) at arch/x64/arch-switch.hh:325
#27 0x00000000402d3b33 in thread_main () at arch/x64/entry.S:116

I have been researching this for a bit and realized that when I change the report() function to use printf() instead of std::cout, like so:

printf("%s: %s\n", ok ? "PASS" : "FAIL", msg.c_str());

the crash goes away.

I knew something was wrong with the initialization of some objects but could not quite pin it down. Then I tried to run tst-tls-pie.so on the Linux host (it is a PIE) and it crashed with a segmentation fault. This made me think that maybe something is wrong with the test program itself.
Finally, I found this on the internet - https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94810 - where somebody tried to compile and run a program with a similar construct: std::cout used in the __constructor__ annotated function. The stack trace looks very similar as well. The bug was rejected with the explanation that the std::ios_base::Init object may not be initialized yet if used in the __constructor__ annotated function.
Interestingly, this test happens to work when the kernel exposes all symbols, including libstdc++, because the kernel's copy of std::ios_base::Init has already been initialized.
So I think tst-tls.cc needs to be changed so that either report() does not use std::cout, or before_main() uses a different report().
Waldek

PS1: Regarding the use of std::cout in various __constructor__ kernel functions like parse_madt() (which calls debug() and is called by smp_init()), I wonder if we are just lucky that std::ios_base::Init is already initialized and we do not see similar crashes.
PS2: This printout with dynamic linker info suggests that all global objects in libstdc++.so should have been initialized before before_main() is called. So why is std::ios_base::Init not initialized yet?

ELF [tid:26, mod:5, /usr/lib/libstdc++.so.6]: Executing DT_INIT function
ELF [tid:26, mod:5, /usr/lib/libstdc++.so.6]: Finished executing DT_INIT function
ELF [tid:26, mod:5, /usr/lib/libstdc++.so.6]: Executing 12 DT_INIT_ARRAYSZ functions
ELF [tid:26, mod:5, /usr/lib/libstdc++.so.6]: Finished executing 12 DT_INIT_ARRAYSZ functions
ELF [tid:26, mod:4, /tests/libtls.so]: Executing DT_INIT function
ELF [tid:26, mod:4, /tests/libtls.so]: Finished executing DT_INIT function
ELF [tid:26, mod:4, /tests/libtls.so]: Executing 1 DT_INIT_ARRAYSZ functions
ELF [tid:26, mod:4, /tests/libtls.so]: Finished executing 1 DT_INIT_ARRAYSZ functions
ELF [tid:26, mod:3, /usr/lib/libgcc_s.so.1]: Executing DT_INIT function
ELF [tid:26, mod:3, /usr/lib/libgcc_s.so.1]: Finished executing DT_INIT function
ELF [tid:26, mod:3, /usr/lib/libgcc_s.so.1]: Executing 2 DT_INIT_ARRAYSZ functions
ELF [tid:26, mod:3, /usr/lib/libgcc_s.so.1]: Finished executing 2 DT_INIT_ARRAYSZ functions
ELF [tid:26, mod:2, /tests/tst-tls-pie.so]: Executing DT_INIT function
ELF [tid:26, mod:2, /tests/tst-tls-pie.so]: Finished executing DT_INIT function
ELF [tid:26, mod:2, /tests/tst-tls-pie.so]: Executing 3 DT_INIT_ARRAYSZ functions
Aborted
[backtrace]
0x000000004039cadd <???+1077529309>
0x000000004039cb9e <???+1077529502>
0x000000004028709a <???+1076392090>
0x00000000402d3d8f <???+1076706703>
0x00000000402d2bb6 <???+1076702134>
0x00000000004050bf <???+4214975>
On Thu, Jan 6, 2022 at 6:56 AM Waldek Kozaczuk <jwkoz...@gmail.com> wrote:

Hi,

When one compiles the kernel with most symbols hidden (which also completely hides the standard C++ library), tst-tls-pie.so crashes like so:

#0 0x00000000402da5b2 in processor::cli_hlt () at arch/x64/processor.hh:247
#1 arch::halt_no_interrupts () at arch/x64/arch.hh:48
#2 osv::halt () at arch/x64/power.cc:26
#3 0x00000000402211a0 in abort (fmt=fmt@entry=0x4049aef3 "Aborted\n") at runtime.cc:137
#4 0x00000000402211b2 in abort () at runtime.cc:103
#5 0x000000004039cade in osv::generate_signal (siginfo=..., ef=0xffff800000e87068) at libc/signal.cc:130
#6 0x000000004039cb9f in osv::handle_mmap_fault (addr=addr@entry=18446744073709547520, sig=sig@entry=11, ef=ef@entry=0xffff800000e87068)
at libc/signal.cc:145
#7 0x000000004028709b in mmu::vm_sigsegv (ef=0xffff800000e87068, addr=18446744073709547520) at core/mmu.cc:1334
#8 mmu::vm_fault (addr=18446744073709547520, addr@entry=18446744073709551592, ef=ef@entry=0xffff800000e87068) at core/mmu.cc:1354
#9 0x00000000402d3d90 in page_fault (ef=0xffff800000e87068) at arch/x64/mmu.cc:42
#10 <signal handler called>
#11 0x0000100000156cf7 in std::ostream::sentry::sentry (this=0x200000200df0, __os=...)
at /usr/src/debug/gcc-10.3.1-1.fc33.x86_64/obj-x86_64-redhat-linux/x86_64-redhat-linux/libstdc++-v3/include/bits/ostream.tcc:51

Since your kernel does *not* export the C++ library, where did this symbol come from? Did you build the tests with the host's C++ library?

I'll continue reading with the assumption that you did.

#27 0x00000000402d3b33 in thread_main () at arch/x64/entry.S:116

I have been researching this for a bit and realized that when I change the report() function to use printf() instead of std::cout, like so:

printf("%s: %s\n", ok ? "PASS" : "FAIL", msg.c_str());

the crash goes away.

I knew something was wrong with the initialization of some objects but could not quite pin it down. Then I tried to run tst-tls-pie.so on the Linux host (it is a PIE) and it crashed with a segmentation fault. This made me think that maybe something is wrong with the test program itself.

Interesting. I remember tests/tst-tls.cc did work on Linux in the past, and the "pie" compilation which you added in 70547a6c64c56cbdf4c5b5d0c7eaa1e85badb0f7 should supposedly have worked too, but I don't see that you documented it. Maybe you never ran it on Linux?

Finally, I found this on the internet - https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94810 - where somebody tried to compile and run a program with a similar construct: std::cout used in the __constructor__ annotated function. The stack trace looks very similar as well. The bug was rejected with the explanation that the std::ios_base::Init object may not be initialized yet if used in the __constructor__ annotated function.

Interestingly, this test happens to work when the kernel exposes all symbols, including libstdc++, because the kernel's copy of std::ios_base::Init has already been initialized.

Very interesting, and nice detective work. Yes, I think this is exactly the same bug.

*However*, I'm not sure I understand why it shouldn't have worked: if tst-tls-pie *depends* on libstdc++.so, then unless I'm missing something, OSv should run libstdc++'s constructors *before* it runs tst-tls-pie's. So when tests/tst-tls's constructor function runs, libstdc++ should already have been fully constructed. I see below you also raised this question...

That being said, I'm probably missing something, because if I were correct, it would run correctly on Linux and https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94810 would not exist...

So I think tst-tls.cc needs to be changed so that either report() does not use std::cout, or before_main() uses a different report().

That sounds like a good workaround, but I would have liked to understand better whether this problem is expected or some sort of OSv initialization-order bug.