
Crash with mozjs-76 on Linux in _dl_runtime_resolve() / dl_fixup() as soon as program starts


Miles

May 19, 2020, 12:24:52 PM
Hi,

I'm trying to upgrade to mozjs-76 and have slowly been sorting out the compile errors to get a program that links. I've now done that, but when I run the executable it crashes instantly on startup. If I look at where it is crashing in gdb I see (for a release build)

#0 0x000000393100e02c in _dl_fixup () from /lib64/ld-linux-x86-64.so.2
#1 0x0000003931014725 in _dl_runtime_resolve () from /lib64/ld-linux-x86-64.so.2
#2 0x00007ffff6e2f170 in ?? () from /spidermonkey/js76_release/lib/libmozjs-76.so
#3 0x00007ffff706e9b2 in ?? () from /spidermonkey/js76_release/lib/libmozjs-76.so

If I use a debug build I get

#0 0x000000393100e02c in _dl_fixup () from /lib64/ld-linux-x86-64.so.2
#1 0x0000003931014725 in _dl_runtime_resolve () from /lib64/ld-linux-x86-64.so.2
#2 0x00007ffff570465e in js::MutexImpl::MutexImpl() ()
at /spidermonkey/tmp/mozjs-76.0.0/js/src/threading/Mutex.h:39
#3 0x00007ffff57046ec in js::Mutex::Mutex(js::MutexId const&) ()
at /spidermonkey/tmp/mozjs-76.0.0/js/src/threading/Mutex.h:62
#4 0x00007ffff670237f in _ZN2js13ExclusiveDataI12ReadLockFlagEC2IJEEERKNS_7MutexIdEDpOT_ ()
at /spidermonkey/tmp/mozjs-76.0.0/js/src/threading/ExclusiveData.h:110
#5 0x00007ffff670ea11 in __static_initialization_and_destruction_0 ()
at /spidermonkey/tmp/mozjs-76.0.0/js/src/wasm/WasmProcess.cpp:336
#6 0x00007ffff670eb10 in _GLOBAL__sub_I_Unified_cpp_js_src_wasm2.cpp ()
at /spidermonkey/tmp/mozjs-76.0.0/js/src/wasm/WasmStubs.cpp:2823
#7 0x00007ffff6997a42 in __do_global_ctors_aux () from /spidermonkey/js76_debug/lib/libmozjs-76.so
#8 0x00007fffffffd9a8 in ?? ()
#9 0x0000000000000001 in ?? ()
#10 0x00007fffffffd9a8 in ?? ()
#11 0x00007ffff567f5bb in _init () from /spidermonkey/js76_debug/lib/libmozjs-76.so


I presumed I was doing something wrong so I went back to trying to run the simplest program I could think of using the JS API

#include <cstdlib>  // exit()

#include "jsapi.h"
#include "js/Initialization.h"

int main(int argc, const char *argv[])
{
    JS_Init();      // initialise the SpiderMonkey library
    JS_ShutDown();  // and shut it down again
    exit(0);
}

and I'm compiling that with something like

g++ -std=gnu++17 -include /spidermonkey/js76_release/include/mozjs-76/js/RequiredDefines.h -I/spidermonkey/js76_release/include/mozjs-76 -O -pthread test.cpp -o test -rdynamic -lm -ldl -lpthread -lrt -lz -L/spidermonkey/js76_release/lib -lmozjs-76

and that still crashes in exactly the same way.

So I then commented out the JS_Init() and JS_ShutDown() calls (but still link the executable with -lmozjs-76)

and it *still* crashes.

If I remove -lmozjs-76 from the linker the program runs (but obviously then doesn't do anything!)

I'm compiling this on an old CentOS 6 machine so the installed g++ compilers are ancient. I compiled g++ 9.3 from source and that's what I used to build.

I successfully built mozjs-76 and the JS shell, and the test suite all ran and passed, so the JS shell clearly works with the compilers I used.

I've spent the last few days trying to understand why the shell works OK but my executable doesn't.
One difference was that the shell linked to libjs_static.a and libjsrust.a instead of libmozjs-76.so, so I tried linking statically instead of using the shared library, but I still get the same crash.

I presume I'm somehow getting some sort of incompatibility when the library is being initialised but I'm completely at a loss as to why the JS shell works but it crashes for me. There must be *something* that I'm doing wrong...
I hope I'm just being really stupid and missing something.
Can anybody help?

Many thanks

Miles

Matthew Gaudet

May 20, 2020, 8:43:37 AM
to Miles, dev-tech-...@lists.mozilla.org
Hi Miles,

So I looked at this for about five minutes (take the following with the
appropriate grain of salt):

My guess is something is funky with your newly built compilers and
pthreads: I suspect there's some weird incompatibility here. Can you build
a straight up basic pthreads program?

My reasoning is as follows:

1. If you look at the crash location, it appears to be calling into the
constructor of mozilla::detail::MutexImpl():
https://searchfox.org/mozilla-central/rev/61fceb7c0729773f544a9656f474e36cd636e5ea/js/src/threading/Mutex.h#39
2. Looking at the POSIX implementation of that, we see the mutex
implementation is built on pthreads mutexes:
https://searchfox.org/mozilla-central/rev/61fceb7c0729773f544a9656f474e36cd636e5ea/mozglue/misc/Mutex_posix.cpp#61

Unfortunately your stack sort of dies there. It's difficult to figure out
where it goes next.

In your position, what I'd suggest is trying to build some basic pthreads
code modelled on Mutex_posix.cpp, and seeing if it fails.
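
A minimal sketch of such a pthreads check, for illustration only. It loosely
follows an init / lock / unlock / destroy sequence and is not a copy of
Mutex_posix.cpp. The function name pthread_mutex_smoke_test is hypothetical,
and the choice of an error-checking mutex is an assumption made for the test.

```cpp
#include <pthread.h>

// Hypothetical helper: returns true only if every pthread call succeeds.
bool pthread_mutex_smoke_test() {
    pthread_mutexattr_t attr;
    if (pthread_mutexattr_init(&attr) != 0) {
        return false;
    }

    // Assumption: use an error-checking mutex so misuse is reported
    // as an error code rather than deadlocking.
    bool ok = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK) == 0;

    pthread_mutex_t mutex;
    ok = ok && pthread_mutex_init(&mutex, &attr) == 0;
    pthread_mutexattr_destroy(&attr);
    if (!ok) {
        return false;  // the mutex was never initialised, nothing to clean up
    }

    ok = pthread_mutex_lock(&mutex) == 0;
    ok = ok && pthread_mutex_unlock(&mutex) == 0;
    // Always destroy the initialised mutex, but keep any earlier failure.
    ok = (pthread_mutex_destroy(&mutex) == 0) && ok;
    return ok;
}
```

Calling this from main() and building it with the same g++ and -pthread
flags as the failing program should show whether plain pthreads mutexes work
with that toolchain.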

The reason this fails whenever you link in mozjs is that starting the
program with mozjs linked in runs the library's static constructors. If I
had to guess, it is this one:
https://searchfox.org/mozilla-central/source/js/src/wasm/WasmProcess.cpp#336
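
To illustrate why the crash happens even when main() calls nothing from the
library, here is a small sketch of a static constructor running before
main(). The names Tracker, g_tracker and constructed_before_main are
hypothetical and have nothing to do with the SpiderMonkey sources.

```cpp
// Counter with a constant initialiser, so it is zero before any
// constructor runs.
int g_constructed = 0;

struct Tracker {
    // Side effect that records the construction.
    Tracker() { ++g_constructed; }
};

// Namespace-scope object: its constructor runs while the program (or a
// shared library containing it) is being initialised, before main().
Tracker g_tracker;

// Reports how many Tracker objects exist by the time it is called.
int constructed_before_main() {
    return g_constructed;
}
```

If constructed_before_main() is called as the first statement of main(), it
already returns 1. In the same way, a global in libmozjs that owns a mutex
is constructed as soon as the library is loaded.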

Nathan Froyd

May 20, 2020, 9:06:31 AM
to Miles, dev-tech-js-engine
On Tue, May 19, 2020 at 12:25 PM Miles <miles.t...@arup.com> wrote:

>
> #0 0x000000393100e02c in _dl_fixup () from /lib64/ld-linux-x86-64.so.2
> #1 0x0000003931014725 in _dl_runtime_resolve () from
> /lib64/ld-linux-x86-64.so.2
> #2 0x00007ffff570465e in js::MutexImpl::MutexImpl() ()
> at /spidermonkey/tmp/mozjs-76.0.0/js/src/threading/Mutex.h:39
>

This is crashing at
https://searchfox.org/mozilla-central/rev/61fceb7c0729773f544a9656f474e36cd636e5ea/js/src/threading/Mutex.h#39
which is a function that lives in mozglue:

https://searchfox.org/mozilla-central/source/mozglue/misc/PlatformMutex.h#29

and your linker is telling you it can't find the symbol. It looks like you
didn't link mozglue in anywhere? (Not really sure how mozglue works with
standalone SpiderMonkey...)

-Nathan

Steve Fink

May 20, 2020, 11:28:16 AM
to Nathan Froyd, Miles, dev-tech-js-engine
On 5/20/20 6:06 AM, Nathan Froyd wrote:
> On Tue, May 19, 2020 at 12:25 PM Miles <miles.t...@arup.com> wrote:
>
>> #0 0x000000393100e02c in _dl_fixup () from /lib64/ld-linux-x86-64.so.2
>> #1 0x0000003931014725 in _dl_runtime_resolve () from
>> /lib64/ld-linux-x86-64.so.2
>> #2 0x00007ffff570465e in js::MutexImpl::MutexImpl() ()
>> at /spidermonkey/tmp/mozjs-76.0.0/js/src/threading/Mutex.h:39
>>
> This is crashing at
> https://searchfox.org/mozilla-central/rev/61fceb7c0729773f544a9656f474e36cd636e5ea/js/src/threading/Mutex.h#39
> which is a function that lives in mozglue:
>
> https://searchfox.org/mozilla-central/source/mozglue/misc/PlatformMutex.h#29
>
> and your linker is telling you it can't find the symbol. It looks like you
> didn't link mozglue in anywhere? (Not really sure how mozglue works with
> standalone SpiderMonkey...)

This is <https://bugzilla.mozilla.org/show_bug.cgi?id=1465038>. The
usual fix is to compile mozjs with --disable-jemalloc, which really
should be the default when building for embedding.

Miles

May 21, 2020, 10:38:46 AM
Many thanks to everyone for their replies and pointing me in the correct direction of looking at mozglue.
The problem was indeed the fact that mozglue isn't included in libmozjs-76.so.
I fixed this by comparing the input to the linker when building the JS shell with the input when building libmozjs-76.so.
For the shell it has the following inputs to g++ when linking

INPUT("Unified_cpp_js_src_shell0.o")
INPUT("../../../memory/build/Unified_cpp_memory_build0.o")
INPUT("../../../memory/mozalloc/cxxalloc.o")
INPUT("../../../memory/mozalloc/mozalloc_abort.o")
INPUT("../../../memory/mozalloc/Unified_cpp_memory_mozalloc0.o")
INPUT("../../../mozglue/misc/AutoProfilerLabel.o")
INPUT("../../../mozglue/misc/ConditionVariable_posix.o")
INPUT("../../../mozglue/misc/Mutex_posix.o")
INPUT("../../../mozglue/misc/Printf.o")
INPUT("../../../mozglue/misc/StackWalk.o")
INPUT("../../../mozglue/misc/TimeStamp.o")
INPUT("../../../mozglue/misc/TimeStamp_posix.o")
INPUT("../../../mozglue/misc/Decimal.o")
INPUT("../../../mfbt/lz4.o")
INPUT("../../../mfbt/lz4frame.o")
INPUT("../../../mfbt/lz4hc.o")
INPUT("../../../mfbt/xxhash.o")
INPUT("../../../mfbt/Compression.o")
INPUT("../../../mfbt/Unified_cpp_mfbt0.o")
INPUT("../../../mfbt/Unified_cpp_mfbt1.o")
INPUT("../editline/Unified_c_js_src_editline0.o")

I added all of the memory, mozglue and mfbt inputs to the file libmozjs-76.so.list and relinked libmozjs-76.so and my test program now works! :-)

@Steve Fink, I did the above hack before I read your reply about using --disable-jemalloc. Does adding that 'properly' do what I did above manually or is that something else/another issue that I might face?

Thanks again for the help. Now back to upgrading my embedding... :-)

Miles

Steve Fink

May 21, 2020, 11:40:06 AM
to dev-tech-...@lists.mozilla.org
I think so.

I just traced through some of our configure goop. I'm nowhere close to
being an expert on this stuff, so don't trust anything I say here. But I
think the most relevant part is:
https://searchfox.org/mozilla-central/rev/a40ef31fc9af34a99ceda6d65cdc4573d52b83d2/js/src/old-configure.in#1239-1252

--disable-jemalloc will clear out MOZ_MEMORY, which from the above
configure snippet will result in MOZ_GLUE_IN_PROGRAM being set to the
empty string for JS_STANDALONE (mozjs). MOZ_GLUE_IN_PROGRAM can be read
as "mozglue is linked into the executable *as opposed to the library*",
and you want it in the library.

The magic unicorn in the build system will then look at
MOZ_GLUE_IN_PROGRAM, and if it is clear, be chopped up and processed
into glue to compensate and will no longer be able to help you. As a
result, mozglue will not be linked into the library. (We don't give out
the unicorn glue, sorry; that stuff is valuable.)

Uh... sorry, having a lot of trouble tracking down how this affects
things. That's the best explanation I have at the moment.

I'll try harder.
https://searchfox.org/mozilla-central/rev/a40ef31fc9af34a99ceda6d65cdc4573d52b83d2/build/gecko_templates.mozbuild#39-50
seems to be where the unicorn-chopping must happen. I have no idea how
'mozglue' gets set there, but you can see that if the magic unicorn set
it to 'library' before dying and MOZ_GLUE_IN_PROGRAM is clear, then
you'll end up with USE_LIBS including 'mozglue'. I'll take it on faith
that USE_LIBS will result in the right thing happening.

GeckoBinary seems an odd thing to use for configuring building for a
*library*, but the unicorn was probably a little occupied with its
impending doom to worry about that, and
https://searchfox.org/mozilla-central/rev/a40ef31fc9af34a99ceda6d65cdc4573d52b83d2/build/docs/defining-binaries.rst#326-345
seems to confirm.
https://searchfox.org/mozilla-central/rev/a40ef31fc9af34a99ceda6d65cdc4573d52b83d2/build/gecko_templates.mozbuild#99-110
shows that GeckoSharedLibrary does indeed use it. I guess my brain
associates "binary" with "executable", but this code appears to use
"Program" for that.

And sure enough,
https://searchfox.org/mozilla-central/rev/a40ef31fc9af34a99ceda6d65cdc4573d52b83d2/js/src/build/moz.build#25-26
uses GeckoSharedLibrary (the actual library name mozjs appears to be set
here:
https://searchfox.org/mozilla-central/rev/a40ef31fc9af34a99ceda6d65cdc4573d52b83d2/js/src/old-configure.in#1589
)

Anyway: yes, --disable-jemalloc should be adequate to get the linking
correct. And our tarball sucks, in that it produces a mozjs depending on
weak symbols in a library that it does not provide, which results in
things linking happily but then trying to call null pointers when you
call any of those weak symbols. Not only should we switch the tarball's
default, we should also add a canary call (or simple null pointer test)
that gives some clue as to what's actually going wrong if you end up in
this configuration.
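
A rough sketch of the kind of null pointer test meant above, assuming GCC or
Clang on Linux. The symbol hypothetical_glue_function and both helper names
are made up for illustration and are not real mozglue or SpiderMonkey names.

```cpp
#include <cstdio>

// Hypothetical function that would normally be provided by another
// library. The weak attribute (a GCC/Clang extension) lets the link
// succeed with no definition; the address is then null at run time.
extern "C" void hypothetical_glue_function() __attribute__((weak));

// Canary: true only if some linked object actually defines the symbol.
bool glue_is_linked() {
    return hypothetical_glue_function != nullptr;
}

// Calls the function if it exists; otherwise prints a hint instead of
// jumping through a null pointer.
bool call_glue_if_present() {
    if (!glue_is_linked()) {
        std::fprintf(stderr,
                     "hypothetical_glue_function is missing: "
                     "was the glue library linked in?\n");
        return false;
    }
    hypothetical_glue_function();
    return true;
}
```

Built on its own, with nothing defining the symbol, this links without
complaint and call_glue_if_present() returns false with a message. Without
the check, the call would jump to a null address.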

