Weird linker problem with unresolved calloc and emmalloc

Floh

Sep 3, 2022, 9:03:21 AM
to emscripten-discuss
I just stumbled over a very weird problem in a new sample for my sokol headers which includes a non-trivial 3rd party C library - the spine-c runtime (https://github.com/EsotericSoftware/spine-runtimes/tree/4.1/spine-c/spine-c)

When building in release mode (but not in debug mode), the linker complains about an unresolved 'calloc' call. With detailed output enabled via LLD_REPORT_UNDEFINED=1, I get tons of:

wasm-ld: error: /Users/floh/projects/fips-sdks/emsdk/upstream/emscripten/cache/sysroot/lib/wasm32-emscripten/lto/libemmalloc.a(emmalloc.o): undefined symbol: calloc

... how does that even make sense when emmalloc is supposed to provide an implementation of calloc... anyway...

The error goes away when compiling in debug mode (the differences from release mode that come to mind are that debug mode doesn't use -flto and doesn't run the closure compiler).

The problem also goes away when not building with emmalloc.

Now the weird thing is that the entire code this sample is built from doesn't appear to call calloc anywhere... and that other samples which actually use calloc build just fine.

Has anybody seen a similar problem yet, before I jump down the rabbit hole myself? :)

The problematic command line is:

em++ -s DISABLE_EXCEPTION_CATCHING=1  -fno-exceptions -fno-rtti -fstrict-aliasing -Wall -Wno-multichar -Wextra -Wno-unknown-pragmas -Wno-ignored-qualifiers -Wno-long-long -Wno-overloaded-virtual -Wno-deprecated-writable-strings -Wno-unused-volatile-lvalue -Wno-inconsistent-missing-override -Wno-warn-absolute-paths -Wno-expansion-to-defined  -flto -O3 -DNDEBUG -s DISABLE_EXCEPTION_CATCHING=1  --memory-init-file 0 -s INITIAL_MEMORY=33554432 -s ERROR_ON_UNDEFINED_SYMBOLS=1 -s NO_EXIT_RUNTIME=1 -s LLD_REPORT_UNDEFINED=1 -s ALLOW_MEMORY_GROWTH=1 -s USE_WEBGL2=1 -s "MALLOC='emmalloc'" -s NO_FILESYSTEM=1 -s WASM=1  --shell-file /Users/floh/projects/sokol-samples/webpage/shell.html -O3  -flto  --closure 1 -s ASSERTIONS=0 sapp/CMakeFiles/spine-sapp.dir/spine-sapp.c.obj sapp/CMakeFiles/spine-sapp.dir/spine-assets.c.obj -o /Users/floh/projects/fips-deploy/sokol-samples/sapp-webgl2-wasm-vscode-release/spine-sapp.html  libs/sokol/libsokol.a  libs/spine-c/libspine-c.a  libs/stb/libstb.a  libs/util/libfileutil.a  fips-cimgui_cimgui/libcimgui.a

Floh

Sep 3, 2022, 9:16:42 AM
to emscripten-discuss
PS: this is the wasm-ld cmdline:

em++: error: '/Users/floh/projects/fips-sdks/emsdk/upstream/bin/wasm-ld -o /Users/floh/projects/fips-deploy/sokol-samples/sapp-webgl2-wasm-vscode-release/spine-sapp.wasm sapp/CMakeFiles/spine-sapp.dir/spine-sapp.c.obj sapp/CMakeFiles/spine-sapp.dir/spine-assets.c.obj libs/sokol/libsokol.a libs/spine-c/libspine-c.a libs/stb/libstb.a libs/util/libfileutil.a fips-cimgui_cimgui/libcimgui.a -L/Users/floh/projects/fips-sdks/emsdk/upstream/emscripten/cache/sysroot/lib/wasm32-emscripten/lto -lGL-webgl2 -lal -lhtml5 -lstubs -lc -lcompiler_rt -lc++-noexcept -lc++abi-noexcept -lemmalloc -lc_rt -lsockets -mllvm -combiner-global-alias-analysis=false -mllvm -enable-emscripten-sjlj -mllvm -disable-lsr --allow-undefined-file=/var/folders/dz/g9ydwg8973z9nn5bvffcwf3h0000gn/T/tmphqvycm26.undefined --strip-debug --export-if-defined=main --export-if-defined=stackSave --export-if-defined=stackRestore --export-if-defined=stackAlloc --export-if-defined=__wasm_call_ctors --export-if-defined=__errno_location --export-if-defined=malloc --export-if-defined=free --export-if-defined=__start_em_asm --export-if-defined=__stop_em_asm --export-table -z stack-size=5242880 --initial-memory=33554432 --entry=main --max-memory=2147483648 --global-base=1024' failed (returned 1)

...I'm now trying to bisect the SDK version to see if this is a regression...

Floh

Sep 3, 2022, 9:21:18 AM
to emscripten-discuss
PPS: ...looks like I can only go back to SDK version 3.0.0 on ARM Macs, and the problem already existed in that version.

Floh

Sep 3, 2022, 9:54:37 AM
to emscripten-discuss
Ok, it works if I change one thing: setting the -O flag from -O3 to -O0 (-O1 also already breaks).

...also it seems that calloc *is* actually called somewhere (the wasm crashes in the calloc stub when I build with ERROR_ON_UNDEFINED_SYMBOLS=0). But I still think it's not anywhere in my code (e.g. the Spine runtime has CALLOC macros all over the place, but these resolve to a call to a helper function with malloc+memset).


Floh

Sep 3, 2022, 10:26:07 AM
to emscripten-discuss
Another tidbit of information:

Adding a calloc() call in any of the other (very similar) samples always works without problems (e.g. I don't need to add calloc to EXPORTED_FUNCTIONS as advised by the emsdk linker - I don't use EXPORTED_FUNCTIONS anywhere in my compiler settings).

It also works if I remove any calls to that spine-c library in my own code (but mind that the spine-c library doesn't call into calloc - emsdk's llvm-nm doesn't show any references to calloc in the resulting library).

The only place where I can find a reference to calloc is as 'weak symbol' in libemmalloc.a, but I don't understand how any potential problems in emmalloc would be triggered by the presence or absence of a random third party library that doesn't even call into calloc... 

Floh

Sep 3, 2022, 10:44:41 AM
to emscripten-discuss
PPPS: building with "-sEXPORTED_FUNCTIONS=_calloc" also works, but I don't understand why I need this only for this one specific sample, for a problem triggered by a library that doesn't even call calloc. I didn't have to deal with EXPORTED_FUNCTIONS for years (instead EMSCRIPTEN_KEEPALIVE did the job), and especially never for any C runtime functions.

Any ideas about what the issue is here would be greatly appreciated. I'll put the EXPORTED_FUNCTIONS into my build options as a hack for now, but it would be nice if that weren't needed (especially since I don't understand the reason, lol).

Cheers!

Floh

Sep 3, 2022, 11:03:59 AM
to emscripten-discuss
Ok, it gets weirder:

If I add the -sEXPORTED_FUNCTIONS=_calloc, the sample links alright, but then doesn't run. There's no crash message on the JS console either, it just sits there with a black screen.

This time it's the same behaviour in release and debug mode, yet debug mode works alright without the EXPORTED_FUNCTIONS option (because it builds with -O0, which doesn't trigger the problem).

Ok, that's enough "rabbit-holeing" for today. My options for this particular sample currently seem to be: 

(1) don't use emmalloc
(2) don't build with optimizations enabled

...both options suck TBH ;)

Floh

Sep 4, 2022, 9:40:12 AM
to emscripten-discuss
Followup: could this somehow be related to function pointers to C runtime functions? E.g. spine-c's memory allocation calls all go through function pointers:


...it all works fine (with emmalloc and optimizations) if I simply bypass this entire code by wiring the spine-c allocation macros directly to malloc, calloc, realloc and free (here: https://github.com/EsotericSoftware/spine-runtimes/blob/933ccbba6244cd8aefb04dadf8324be2442eb858/spine-c/spine-c/include/spine/extension.h#L69-L91).

...which works for me ATM, but it would still be good to get to the bottom of this issue.
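
For illustration, wiring allocation macros directly to the C runtime might look roughly like the sketch below. The SKETCH_* macro names and the new_int_array helper are hypothetical stand-ins and not the actual spine-c macros from extension.h; the point is only that the macros expand to direct calls instead of going through function pointers.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical allocation macros (not the real spine-c ones) that expand
   to direct C runtime calls, with no function pointer in between. */
#define SKETCH_MALLOC(TYPE, COUNT) ((TYPE *)malloc(sizeof(TYPE) * (COUNT)))
#define SKETCH_CALLOC(TYPE, COUNT) ((TYPE *)calloc((COUNT), sizeof(TYPE)))
#define SKETCH_REALLOC(PTR, TYPE, COUNT) ((TYPE *)realloc((PTR), sizeof(TYPE) * (COUNT)))
#define SKETCH_FREE(PTR) free((void *)(PTR))

/* Small helper showing the macros in use: a zero-initialized int array. */
int *new_int_array(size_t count) {
    assert(count > 0); /* keep the sketch simple: no zero-sized requests */
    return SKETCH_CALLOC(int, count);
}
```

Because calloc is referenced explicitly here, the linker sees an ordinary direct use of the symbol.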

Floh

Sep 4, 2022, 9:54:41 AM
to emscripten-discuss
Ok, I could narrow the fix down to not calling malloc() through a function pointer (realloc and free seem to be fine). Basically, applying this diff with 3 changed lines:

diff --git a/libs/spine-c/src/spine/extension.c b/libs/spine-c/src/spine/extension.c
index 85b8d30..0e8d63c 100644
--- a/libs/spine-c/src/spine/extension.c
+++ b/libs/spine-c/src/spine/extension.c
@@ -45,10 +45,10 @@ static void (*freeFunc)(void *ptr) = free;
 static float (*randomFunc)() = _spInternalRandom;
 
 void *_spMalloc(size_t size, const char *file, int line) {
-    if (debugMallocFunc)
-        return debugMallocFunc(size, file, line);
+//    if (debugMallocFunc)
+//        return debugMallocFunc(size, file, line);
 
-    return mallocFunc(size);
+    return malloc(size);
 }
 
 void *_spCalloc(size_t num, size_t size, const char *file, int line) {

...to this file:


...makes it all work: the sample builds and runs with emmalloc and optimizations on. (In the original code, both _spMalloc and the default debugMallocFunc call malloc through a function pointer.)

I still don't understand *why* this happens though, and why a function pointer to malloc() would lead to a linker error about missing calloc() - and only when using emmalloc.

Cheers!

Sam Clegg

Sep 5, 2022, 1:28:42 PM
to emscripte...@googlegroups.com
Can you try building with `-Wl,--trace-symbol=calloc` rather than `LLD_REPORT_UNDEFINED=1`? (I guess there are still some issues with that setting.)

Does the link time error tell you why calloc is being pulled in?  (The JS compiler linker errors should do this but they often just say "top level C/C++").

Sam Clegg

Sep 5, 2022, 1:31:17 PM
to emscripte...@googlegroups.com
The other thing this could be related to is LTO. IIUC LTO itself can generate calls to builtin functions (such as malloc and calloc) that are not present in the original program. Does disabling LTO fix the problem?

Floh

Sep 6, 2022, 12:43:52 PM
to emscripten-discuss
Part of the mystery can be solved because clang replaces malloc+memset pairs with a single calloc call (only when optimization is enabled):


...which explains why the problem shows up in release mode, but not debug mode.
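
As a minimal sketch of the pattern in question (the function name alloc_zeroed_floats is made up for illustration), this is the kind of malloc+memset pair that an optimizing Clang build may fold into a single calloc call:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Allocate `count` floats and zero them with the malloc+memset idiom.
   With optimizations enabled, Clang/LLVM may fold this pair into a single
   calloc(count, sizeof(float)) call, so the compiled code can reference
   calloc even though the source never mentions it. */
float *alloc_zeroed_floats(size_t count) {
    assert(count > 0); /* keep the sketch simple: no zero-sized requests */
    float *buf = (float *)malloc(count * sizeof(float));
    if (buf != NULL) {
        memset(buf, 0, count * sizeof(float));
    }
    return buf;
}
```

Whether the fold actually happens depends on the compiler version and optimization level; inspecting the object file with llvm-nm is one way to check for a calloc reference.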

There still must be something else going on though, because I'm using the malloc+memset combo all the time together with emmalloc (and I guess the missing piece is that the malloc call is going through the function pointer).

I'll tinker around a bit more with compiler options (but since this is a library, I cannot dictate what compiler options the code is built with).

Cheers!

Floh

Sep 6, 2022, 1:19:08 PM
to emscripten-discuss
Ok, I could cobble together a surprisingly simple reproducer here:


-flto is indeed needed. So the 4 ingredients are:

1. malloc+memset being 'optimized' to a call to calloc via -O1 or better
2. ...where malloc is called through a function pointer
3. ...and emmalloc is used instead of the vanilla allocator
4. ...and -flto must be enabled

...if any of this is missing, it works :D
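
A rough sketch of the first two ingredients in code, for readers without the linked reproducer at hand. This is not the reproducer from this post: the names malloc_func and alloc_zeroed are hypothetical (loosely modelled on spine-c's extension.c), and it may or may not trigger the error on a given SDK version.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Ingredient 2: malloc is only reached through a function pointer.
   Since the pointer is static and never reassigned, an optimizer can
   resolve it back to malloc. */
static void *(*malloc_func)(size_t size) = malloc;

/* Ingredient 1: a malloc+memset pair that an optimizer may turn into a
   single calloc call at -O1 or higher. */
void *alloc_zeroed(size_t size) {
    assert(size > 0); /* keep the sketch simple: no zero-sized requests */
    void *ptr = malloc_func(size);
    if (ptr != NULL) {
        memset(ptr, 0, size);
    }
    return ptr;
}

/* Ingredients 3 and 4 are build settings rather than code: emmalloc
   selected via the MALLOC setting, and -flto passed when compiling and
   linking. */
```

A main() that calls alloc_zeroed and uses the result would be needed to keep the function from being discarded as unused.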

Should I write an Emscripten ticket too?

Sam Clegg

Sep 6, 2022, 2:30:55 PM
to emscripte...@googlegroups.com
Yes, please file a ticket.

I think the solution is going to be to set `force_object_files = True` on `libmalloc` in `tools/system_libs.py`, which means it (like compiler-rt et al) will be compiled as normal object files and not take part in LTO.

I am curious why this doesn't happen for dlmalloc too though..

cheers,
sam

Floh

Sep 7, 2022, 4:46:45 AM
to emscripten-discuss
Ok, ticket is here:


Let me know if I can help with anything else :)

Floh

Sep 17, 2022, 6:45:16 AM
to emscripten-discuss
Ok, I just confirmed that the PR https://github.com/emscripten-core/emscripten/pull/17866 fixes the original issue with the spine-c runtime. Many thanks!