Problem with EM_JS() in mixed zig/emcc project.


Floh

Jan 29, 2022, 12:12:42 PM
to emscripten-discuss
I'm currently tinkering with bringing one of my toy Zig projects to the web via
Alon's nice gist here which uses emcc only for the linker step:


...and it *nearly* works except for code that uses EM_JS() macros.

The project (https://github.com/floooh/pacman.zig) consists of some C code (my cross-platform 'sokol headers') which uses EM_JS() quite extensively (very handy for STB-style single-file libraries), and at the top, the "game code" is written in Zig.

I'm compiling all code with Zig with the wasm32-wasi target (wasm32-emscripten exists, but currently doesn't seem to be supported by the Zig compiler), and then use emcc for linking.

Long story short, it works except for the one problem that emcc cannot resolve any functions which have been defined with EM_JS(). If I compile the same library with emcc instead of Zig it works.

So my question is: does emcc also do some "EM_JS() magic" when compiling the source code which contains EM_JS macros? Maybe I'm missing some Clang command line options which emcc inserts?

The errors look like this:

error: undefined symbol: sapp_js_add_clipboard_listener (referenced by top-level compiled C/C++ code)

Followed by:

warning: _sapp_js_add_clipboard_listener may need to be added to EXPORTED_FUNCTIONS if it arrives from a system library
...there's also a single warning about malloc:

...if I compile with "-s ERROR_ON_UNDEFINED_SYMBOLS=0", then the code breaks at runtime failing to resolve those EM_JS() functions, e.g.:

"missing function: sapp_js_pointer_init"

Compiling the same static link library with emcc, it magically works.

If I look at both libraries with nm I don't see much of a difference. Here are the relevant parts from the emcc-compiled library: every EM_JS symbol has a "D __em_js..." entry and a matching "U sapp_js..." entry, e.g.:

0000185f D __em_js__sapp_js_add_beforeunload_listener
...
U sapp_js_add_beforeunload_listener
...

The Zig-compiled library has the same entries:

00001841 D __em_js__sapp_js_add_beforeunload_listener
...
U sapp_js_add_beforeunload_listener
...

...yet one library (the zig-compiled) produces linker errors for those symbols, and the other (emcc-compiled) works.

Clearly I'm missing something. I was expecting that all the EM_JS() magic is in the linker (by extracting the __em_js_* JavaScript source code strings, and then "somehow" providing the C function import). Any ideas what I'm missing?

Thanks!
-Floh.





Floh

Jan 29, 2022, 12:29:40 PM
to emscripten-discuss
PS: the .js file generated by emcc has functions like this:

/** @type {function(...*):?} */
function _sapp_js_add_beforeunload_listener(
) {
err('missing function: sapp_js_add_beforeunload_listener'); abort(-1);
}

(...ok, so that's where the JS console error is coming from...)


Alon Zakai

Jan 29, 2022, 12:58:53 PM
to emscripte...@googlegroups.com
Sam can confirm, but I would guess perhaps the emscripten triple is necessary. That is, clang and/or wasm-ld might do something for EM_JS code but only in emscripten mode.

If we can confirm that then we should definitely get a bug filed on Zig - hopefully it would be easy to add support for the emscripten triple there and open up a bunch of use cases...


Floh

Jan 29, 2022, 1:22:25 PM
to emscripten-discuss
Spot on Alon :)

It works if I hardwire just the C library (with the EM_JS functions) to the wasm32-emscripten triple.

The Zig code needs to be compiled with either wasm32-wasi or wasm32-freestanding; with wasm32-emscripten, parts of the Zig stdlib won't compile.

Also, when I tried to use wasm32-freestanding with the C code, wasm-ld complained about some missing stack-check functions (I don't have the exact symbol at hand currently).

...I think I have enough to build a little 'proof-of-concept', even though it's a bit hacky :)

Thanks!
-Floh.

Sam Clegg

Jan 29, 2022, 2:33:05 PM
to emscripte...@googlegroups.com
The undefined symbol error you are seeing here is coming from the post-linking phase. The way EM_JS works is that the function `foo` is declared as external using `__attribute__((import_name("foo")))`, and the data symbol `__em_js__foo` is defined in the data section along with `__attribute__((used, visibility("default")))`. For more details on this see https://github.com/emscripten-core/emscripten/blob/main/system/include/emscripten/em_js.h#L23-L49.
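
To make the two halves concrete, here is a minimal sketch of what such a macro boils down to. `SKETCH_EM_JS`, `SKETCH_IMPORT`, `sketch_add` and `sketch_em_js_body` are hypothetical names made up for illustration, and the `"<::>"` separator between the parameter list and the JS body is an assumption about the string layout; the real definition is in the em_js.h linked above. The `import_name` attribute is only applied on wasm targets, so the sketch also compiles natively.

```c
#include <string.h>

/* Hypothetical stand-in for EM_JS; not the real macro from em_js.h. */
#if defined(__wasm__)
#define SKETCH_IMPORT(name) __attribute__((import_name(#name)))
#else
#define SKETCH_IMPORT(name) /* import_name only exists on wasm targets */
#endif

/* Part 1: the C function is only declared, the body is expected from JS.
 * Part 2: the JS source is stored as a string in a data symbol named
 *         __em_js__<name>, kept alive via the "used" attribute.
 * The "<::>" separator is an assumption about the string layout. */
#define SKETCH_EM_JS(ret, name, params, ...)              \
    ret name params SKETCH_IMPORT(name);                  \
    __attribute__((used, visibility("default")))          \
    char __em_js__##name[] = #params "<::>" #__VA_ARGS__;

/* Declares sketch_add() and defines __em_js__sketch_add[]. */
SKETCH_EM_JS(int, sketch_add, (int a, int b), { return a + b; })

/* Returns the JS body that follows the separator, or NULL if absent. */
static const char *sketch_em_js_body(const char *entry) {
    const char *sep = strstr(entry, "<::>");
    return sep ? sep + 4 : NULL;
}
```

With this in place, `nm` on the object file would be expected to show the same pattern as in the first message: a defined `D __em_js__sketch_add` data symbol, plus an undefined `U sketch_add` once some code actually calls the function.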

I believe the problem you are seeing stems from the different meaning of `__attribute__((used))` under emscripten compared to other triples. The difference exists because we use `__attribute__((used))` to implement the EMSCRIPTEN_KEEPALIVE macro, which is defined to mean "keep this symbol alive *and* export it to JS under its symbol name".

If you use wasm-objdump to look at an object file containing EM_JS symbols you will see them marked as both "no_strip" and "exported".  For example:

```
  - 38: D <__em_js__noarg> segment=0 offset=0 size=36 [ exported no_strip binding=global vis=default ]
  - 39: D <__em_js__noarg_int> segment=0 offset=36 size=55 [ exported no_strip binding=global vis=default ]
  - 40: D <__em_js__noarg_double> segment=0 offset=91 size=61 [ exported no_strip binding=global vis=default ]
  - 41: D <__em_js__intarg> segment=0 offset=152 size=41 [ exported no_strip binding=global vis=default ]
```

If you compile the same source using a non-emscripten triple you will see them marked only as `no_strip`, which is the more traditional meaning of the `used` attribute: it simply tells the linker to keep them around in the binary, not to export them. Here is where the hack/difference is: https://github.com/llvm/llvm-project/blob/333f5019300c6e56782374627e64da0b62ffa3bc/llvm/lib/MC/WasmObjectWriter.cpp#L1773-L1777

There are two ways we can solve this issue I believe.

1. Long term solution: Stop abusing `__attribute__((used))`, and thus remove this special handling in emscripten. We should really have a separate attribute to mark a symbol as exported. I've been trying to get this done for a while but it's stalled. See https://reviews.llvm.org/D76547
2. Short term solution: Use the more explicit (but not EMSCRIPTEN_KEEPALIVE-compatible) 'export-name' attribute in em_js.h. I think this should "just work".

cheers,
sam


Sam Clegg

Jan 29, 2022, 2:47:24 PM
to emscripte...@googlegroups.com

Floh

Jan 29, 2022, 7:08:47 PM
to emscripten-discuss
Thanks for the thorough explanation Sam! Regarding this PR: https://github.com/emscripten-core/emscripten/pull/16149. As far as I have seen, only the EM_JS() macros caused trouble (with a non-emscripten triple). I haven't seen any linker warnings regarding EMSCRIPTEN_KEEPALIVE functions (which I'm using too in the same code base).

I'll try to bring the current workaround (use wasm32-emscripten just for the C code with the EM_JS macros, and wasm32-freestanding for the Zig code) into a better shape tomorrow and will then most likely write a Zig ticket. I think the Zig stdlib needs a few fixes for wasm32-emscripten (even if just some empty stubs), so that a complete project can be compiled with this triple.

Cheers!
-Floh.

Sam Clegg

Jan 29, 2022, 7:16:44 PM
to emscripte...@googlegroups.com
I'm pretty sure EMSCRIPTEN_KEEPALIVE won't have the intended behaviour of actually exporting symbols when compiled with non-emscripten triples.
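
A small sketch of the distinction, for anyone who wants to check it with wasm-objdump. The `SKETCH_EXPORT` macro and both function names are hypothetical. The fallback `EMSCRIPTEN_KEEPALIVE` definition mirrors the plain `used` attribute described earlier in the thread. Clang's `export_name` attribute for functions on wasm targets is shown as one explicit alternative; how emcc's post-link step treats such an export is an assumption that would need testing.

```c
/* Fallback mirroring the thread's description: KEEPALIVE is plain "used". */
#ifndef EMSCRIPTEN_KEEPALIVE
#define EMSCRIPTEN_KEEPALIVE __attribute__((used))
#endif

/* Hypothetical helper: request an explicit wasm export by name.
 * export_name is a wasm-only function attribute, so it is compiled
 * out on other targets. */
#if defined(__wasm__)
#define SKETCH_EXPORT(name) __attribute__((export_name(#name)))
#else
#define SKETCH_EXPORT(name)
#endif

/* With a non-emscripten triple this is expected to be kept (no_strip)
 * but not exported. */
EMSCRIPTEN_KEEPALIVE int sketch_keepalive_only(void) {
    return 1;
}

/* Explicitly asks for an export regardless of the triple. */
SKETCH_EXPORT(sketch_explicit_export) int sketch_explicit_export(void) {
    return 2;
}
```

Compiling this once per triple and comparing the symbol flags in the wasm-objdump output should show whether `exported` appears next to `no_strip` for each function.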

Floh

Jan 30, 2022, 11:56:32 AM
to emscripten-discuss
Ok, here's the result:


Build instructions:


...and a quick explanation how it works, and what workarounds are currently required (and below that is the actual build function):


...disclaimer: I'm not yet very familiar with the Zig build system which doesn't really support injecting a different linker, but that's the cleanest I came up with.

Cheers and thanks for the help :)
-Floh.

Floh

Jan 31, 2022, 4:53:51 AM
to emscripten-discuss
PS: the missing __stack_chk_* symbol errors I mentioned earlier are unrelated to the target triple, but instead happen when compiling via Zig in debug mode. In that case, Zig compiles the C code with clang's "-fstack-protector-strong", which requires two externally defined symbols: a guard value and a function which is called when the stack protection triggers. When linking with emcc those symbols can't be found, so I just provided them myself in the C stub here:
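
For readers without access to the link, a minimal sketch of what such a stub could look like (not necessarily the code in the repository). `__stack_chk_guard` and `__stack_chk_fail` are the symbol names that clang's stack protector conventionally references; the guard value here is arbitrary and chosen only for illustration.

```c
#include <stdint.h>
#include <stdlib.h>

/* Guard value that stack-protected functions copy into their frame and
 * compare against on return. The constant is arbitrary (an assumption);
 * a hardened build would want an unpredictable value. */
uintptr_t __stack_chk_guard = 0xABBABABA;

/* Called when a function detects that its copy of the guard value was
 * overwritten. Aborting is the simplest possible reaction. */
void __stack_chk_fail(void) {
    abort();
}
```

Compiling the C code in a release mode, where Zig does not pass "-fstack-protector-strong", should avoid the need for these stubs altogether.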

Alon Zakai

Jan 31, 2022, 12:51:07 PM
to emscripte...@googlegroups.com
Nice, good to see pacman working!

Btw, did you get a chance to file a Zig ticket as you said?


Floh

Feb 1, 2022, 6:44:13 AM
to emscripten-discuss
> Btw, did you get a chance to file a Zig ticket as you said?

Haven't done that yet!

Alon Zakai

Feb 17, 2022, 1:33:32 PM
to emscripte...@googlegroups.com
There is now this Zig issue for emscripten discussion, more details might make sense to add there, if any,


Floh

Feb 18, 2022, 5:27:00 AM
to emscripten-discuss
Ah thanks! I added a comment with a link to my toy project and how to reproduce the problem:
