tst-feexcept.cc
tst-sigaltstack.so
Did something write over this memory? BTW the crash for tst-feexcept.cc looks pretty much identical.

Now, I have replaced two files in the Makefile from libc to musl and added a 3rd musl file.

diff --git a/Makefile b/Makefile
index 9d2997a1..f7d740b0 100644
--- a/Makefile
+++ b/Makefile
@@ -1398,8 +1398,12 @@ musl += process/wait.o
 musl += setjmp/$(musl_arch)/setjmp.o
 musl += setjmp/$(musl_arch)/longjmp.o
-libc += arch/$(arch)/setjmp/siglongjmp.o
-libc += arch/$(arch)/setjmp/sigsetjmp.o
+#libc += arch/$(arch)/setjmp/siglongjmp.o
+#libc += arch/$(arch)/setjmp/sigsetjmp.o
+musl += signal/$(musl_arch)/sigsetjmp.o

I am not sure if I fully understand the change, but it looks like the return address is stored in a different place. Maybe using a thread-local variable for env causes some issues? Why are we using thread-local variables in those 2 tests?

Now, these two tests also fail on aarch64 but for a different reason - we do not have TLS implemented on aarch64, but I guess if we did, those tests would fail in a similar way.

Unfortunately, I do not understand signal handling on OSv and its limitations, so I am not sure whether the changes to sigsetjmp on the musl side are actually helpful to us and whether we should upgrade to the newest version. Or maybe it would be better to keep the x64 version as is and only copy the aarch64 sigsetjmp from musl as it was before commit 583e55122e767b1586286a0d9c35e2a4027998ab, which would be sad.

Any ideas?

Waldek
To view this discussion on the web visit https://groups.google.com/d/msgid/osv-dev/bc3d65f7-f8d1-4bf6-afeb-76be861afa25n%40googlegroups.com.

On Thursday, October 1, 2020 at 2:56:09 PM UTC-4 Nadav Har'El wrote:
> I looked at the difference between the libc and musl implementation of sigsetjmp.s, and there are extensive differences I don't understand.
> One of them is a difference between a "call" and a "jmp". One of them is the use of "@PLT" in the code.
> Maybe that's an ABI (stack alignment) problem? Maybe linker?
> There's also a new bizarre __sigsetjmp_tail which I don't understand.
> Can you try switching those different functions one by one and perhaps finding which one is causing the problem?

I have narrowed down tst-feexcept.cc to have only the following two assertions on (the remaining code in main is commented out) and have also changed env to be a regular variable (see below):

    static sigjmp_buf env;
    ...
    int main(int argc, char **argv)
    {
        // Test that fegetexcept() does not return a negative number
        expect(fegetexcept() >= 0, true);
        // Test that *integer* division by zero generates, ironically, a SIGFPE
        expect(sig_check([] { printf("%d\n", 1 / zero_i()); }, SIGFPE), true);
        std::cout << "SUMMARY: " << tests << " tests, " << fails << " failures\n";
        return fails == 0 ? 0 : 1;
    }

And the test still crashes (and 2 assertions pass). Obviously, if I keep the 1st assertion only, it does not crash. So it has something to do with sig_check().

Now, I have discovered that if I comment out the invocation of f() in sig_check the test does not crash.

> I think you can get rid of the thread-local and also have a chance to run it on aarch64 then.

Getting rid of thread_local for env and making it static does not help on x64 - still the exact same crash (which is good news in a sense :-). Interestingly, the same test does not crash on aarch64 now (no thread_local, which is not supported yet) but some assertions fail.

On Mon, Oct 5, 2020 at 9:48 PM Waldek Kozaczuk <jwkoz...@gmail.com> wrote:
> On Thursday, October 1, 2020 at 2:56:09 PM UTC-4 Nadav Har'El wrote:
>> I looked at the difference between the libc and musl implementation of sigsetjmp.s, and there are extensive differences I don't understand.
>> One of them is a difference between a "call" and a "jmp". One of them is the use of "@PLT" in the code.
>> Maybe that's an ABI (stack alignment) problem? Maybe linker?
>> There's also a new bizarre __sigsetjmp_tail which I don't understand.
>> Can you try switching those different functions one by one and perhaps finding which one is causing the problem?
>
> I have narrowed down tst-feexcept.cc to have only the following two assertions on (the remaining code in main is commented out) and have also changed env to be a regular variable (see below):
>
>     static sigjmp_buf env;

I thought about it. thread_local is weird, but it should also have worked (although apparently it does not behave exactly the same). The reason is that for synchronous signals (SIGFPE, SIGSEGV - things that happen because of bad instructions, not a signal sent from another thread), OSv runs the handler in the same thread.

>     ...
>     int main(int argc, char **argv)
>     {
>         // Test that fegetexcept() does not return a negative number
>         expect(fegetexcept() >= 0, true);

I guess you can drop this from the test too, and it will fail just the same.

>         // Test that *integer* division by zero generates, ironically, a SIGFPE
>         expect(sig_check([] { printf("%d\n", 1 / zero_i()); }, SIGFPE), true);
>         std::cout << "SUMMARY: " << tests << " tests, " << fails << " failures\n";
>         return fails == 0 ? 0 : 1;
>     }
>
> And the test still crashes (and 2 assertions pass). Obviously, if I keep the 1st assertion only, it does not crash.
> So it has something to do with sig_check().
> Now, I have discovered that if I comment out the invocation of f() in sig_check the test does not crash.

The invocation of f() is what causes siglongjmp() (in the signal handler) to be called.
You can add a printout in the signal handler before siglongjmp() to see whether it was reached, and then add a printout inside the if (sigsetjmp(env, 1)) { branch - if that printout isn't shown, the siglongjmp() didn't work.

>> I think you can get rid of the thread-local and also have a chance to run it on aarch64 then.
>
> Getting rid of thread_local for env and making it static does not help on x64 - still the exact same crash (which is good news in a sense :-). Interestingly, the same test does not crash on aarch64 now (no thread_local, which is not supported yet) but some assertions fail.

Yes - it shouldn't be thread_local, but for synchronous signals (like SIGFPE and SIGSEGV) it doesn't matter - the handler runs in the same thread.
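
For reference, the two printouts suggested above could look roughly like the sketch below. It is not the actual sig_check() from tst-feexcept.cc: sig_check_sketch(), handler() and caught_signal are hypothetical names, env is a plain static as discussed, and raise() stands in for the integer division by zero so that the sketch behaves the same on any architecture.

```cpp
#include <csetjmp>
#include <csignal>
#include <cstdio>
#include <unistd.h>

// Plain static jump buffer (not thread_local): a synchronous signal is
// handled on the thread that triggered it, so one buffer is enough here.
static sigjmp_buf env;
static volatile sig_atomic_t caught_signal = 0;

// Hypothetical handler: records the signal, prints, then jumps back.
static void handler(int sig)
{
    caught_signal = sig;
    // Printout 1: was the handler reached? write() is async-signal-safe.
    static const char msg[] = "handler: about to call siglongjmp()\n";
    (void)write(STDERR_FILENO, msg, sizeof(msg) - 1);
    siglongjmp(env, 1);
}

// Hypothetical stand-in for sig_check(): runs f() and reports whether
// signal `sig` was caught and control came back through sigsetjmp().
template <typename Func>
bool sig_check_sketch(Func f, int sig)
{
    struct sigaction act = {}, old = {};
    act.sa_handler = handler;
    sigemptyset(&act.sa_mask);
    if (sigaction(sig, &act, &old) != 0) {
        return false;
    }
    caught_signal = 0;
    if (sigsetjmp(env, 1)) {
        // Printout 2: if this never shows up, siglongjmp() did not work.
        fprintf(stderr, "sig_check: returned via siglongjmp()\n");
        sigaction(sig, &old, nullptr);
        return caught_signal == sig;
    }
    f();                            // expected to trigger the signal
    sigaction(sig, &old, nullptr);  // reached only if no signal arrived
    return false;
}
```

Calling sig_check_sketch([] { raise(SIGFPE); }, SIGFPE) should show both printouts on a working sigsetjmp()/siglongjmp() pair. If only the first one appears before the crash, the jump back is the part that is broken.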