Static library linking order with --start-group


Warren Seine

Mar 4, 2014, 3:21:09 AM
to emscripte...@googlegroups.com
Hi,

-Wl,--start-group and -Wl,--end-group are compiler options that tell the linker to also look for symbols in previous libraries, i.e. ones that appear earlier on the command line. They seem to have no effect on link order with emcc, and I'm not really surprised, because the link step differs from native builds in many ways. I'm wondering if someone has been there and found a workaround to avoid unresolved symbols when linking static libraries with circular dependencies.

I can manually reorder libraries or list them twice, but for some reason I'd prefer to use this feature (or something similar), which works fine with gcc.
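To make the ordering problem concrete, here is a minimal Python model of a single left-to-right archive scan. It is only a sketch: the archive names, object names and symbols are made up, and the model is a simplification of what a traditional linker does, not emcc's actual code.

```python
# Toy model of a single left-to-right scan over static archives.
# Each archive maps object name -> (symbols it defines, symbols it needs).
# All names below are hypothetical.
ARCHIVES = {
    "liba.a": {"a1.o": ({"a_func"}, {"b_func"}),
               "a2.o": ({"a_helper"}, set())},
    "libb.a": {"b.o": ({"b_func"}, {"a_helper"})},
}

def link_single_pass(main_needs, archive_order, archives):
    """Visit each archive once, in order; return the symbols left unresolved."""
    undefined = set(main_needs)
    defined = set()
    included = set()
    for lib in archive_order:
        progress = True
        while progress:  # keep pulling objects from this archive while it helps
            progress = False
            for obj_name, (defs, needs) in archives[lib].items():
                if obj_name in included or not (defs & undefined):
                    continue
                included.add(obj_name)
                defined |= defs
                undefined = (undefined | needs) - defined
                progress = True
    return undefined

# liba.a and libb.a depend on each other, so one pass is not enough.
print(link_single_pass({"a_func"}, ["liba.a", "libb.a"], ARCHIVES))            # {'a_helper'}
# Listing liba.a a second time resolves the leftover symbol.
print(link_single_pass({"a_func"}, ["liba.a", "libb.a", "liba.a"], ARCHIVES))  # set()
```

In this model neither order of the two archives links cleanly in one pass, which is why repeating a library (or grouping) is needed.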

Cheers,

Warren Seine

Mar 5, 2014, 3:50:54 AM
to emscripte...@googlegroups.com
@Alon, any idea? Can you describe (or point me to a page explaining) the linking step?

Alon Zakai

Mar 5, 2014, 4:56:42 PM
to emscripte...@googlegroups.com
The most relevant doc is likely https://github.com/kripken/emscripten/wiki/Building-Projects but you can just look in emcc to see what we support and do not support. I've never heard of that option, so I guess we don't support it.

- Alon



--
You received this message because you are subscribed to the Google Groups "emscripten-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to emscripten-disc...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Warren Seine

Mar 6, 2014, 3:26:18 AM
to emscripte...@googlegroups.com
Thank you. I've had a look at the llvm-link source code, and there used to be an option --start-group which was simply ignored (and has since been withdrawn), so I'm wondering if that's the default behaviour now.

Anyway, somebody mentioned that I could put together all the .a files into a single one and then only link the big one. I guess that'd work? I'll try that.
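One possible way to do the merge, sketched in Python: generate an MRI script and feed it to an archiver that accepts `-M` (GNU ar does; whether a given emar/llvm-ar version does is an assumption to verify). The archive names are placeholders, and the helper name `mri_merge_script` is made up for this example.

```python
def mri_merge_script(output, archives):
    """Return an MRI script that merges `archives` into the archive `output`."""
    lines = [f"create {output}"]
    lines += [f"addlib {a}" for a in archives]  # copy every member of each input
    lines += ["save", "end"]
    return "\n".join(lines) + "\n"

script = mri_merge_script("libcombined.a", ["liba.a", "libb.a"])
print(script, end="")

# The script would then be piped to the archiver, for example:
#   import subprocess
#   subprocess.run(["ar", "-M"], input=script, text=True, check=True)
```

Only `libcombined.a` would then be passed to the link step.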

Alon Zakai

Mar 6, 2014, 6:18:27 PM
to emscripte...@googlegroups.com
Hmm yes, .a files do work that way I believe, and we support linking them with proper semantics, so that should work.

- Alon




Ryan Sturgell

Mar 28, 2014, 5:24:25 PM
to emscripte...@googlegroups.com
Hi all, I'm also interested in --start-group/--end-group. I'm trying to build an existing project which uses gyp and ninja to build, and gyp basically passes the .a files in an arbitrary order and wraps the whole set in --start-group/--end-group.

Supporting this functionality in shared.Building.link should be possible (passing a list of archive-groups instead of a list of archives). 
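As a rough illustration of what "a list of archive-groups" could look like, the Python sketch below turns a flat token list into groups. The function name and data shape are assumptions made for this example, not the actual signature of shared.Building.link. It also rejects nested groups, which is a choice of the sketch.

```python
START = {"--start-group", "-("}
END = {"--end-group", "-)"}

def split_into_groups(tokens):
    """Turn a flat list of archives and group markers into a list of groups.

    Archives outside any group become single-element groups, so a caller
    can treat every entry the same way.
    """
    groups, current = [], None
    for tok in tokens:
        if tok in START:
            if current is not None:
                raise ValueError("nested group")
            current = []
        elif tok in END:
            if current is None:
                raise ValueError("group end without group start")
            groups.append(current)
            current = None
        elif current is not None:
            current.append(tok)
        else:
            groups.append([tok])
    if current is not None:
        raise ValueError("missing group end")
    return groups

print(split_into_groups(
    ["liba.a", "--start-group", "libb.a", "libc.a", "--end-group", "libd.a"]))
# [['liba.a'], ['libb.a', 'libc.a'], ['libd.a']]
```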

Alternatively, the particular case of gyp could be handled by supporting a new emcc-specific flag like --group-all or something which specifies that the whole set of archives should be repeatedly scanned as if they were all contained in --start-group/--end-group. This would be simpler to implement.

Does either of these options sound palatable? I could take a stab at putting a patch together.

Thanks,
Ryan

Alon Zakai

Mar 31, 2014, 11:27:24 PM
to emscripte...@googlegroups.com
From the perspective of maintainability both sound reasonable, the latter sounds much simpler though, so if that is good enough for the people that want this functionality then it is preferable.

- Alon




Warren Seine

Apr 1, 2014, 9:48:28 AM
to emscripte...@googlegroups.com
This is something I need and it suits my requirements. At the moment, I'm using the wonderful solution of adding libraries twice in the command line (which works btw...), so anything cleaner is better.

Ryan Sturgell

Apr 7, 2014, 5:07:14 PM
to emscripte...@googlegroups.com
I have a patch at https://github.com/rsturgell/emscripten/tree/rescan_libs which adds a --rescan-libs flag to emcc which behaves as if -Wl,--start-group and -Wl,--end-group surround all the inputs.

I have to say, though, I hate to add another way to specify this... I think it would be better to just properly support groups, but this will be more complex. We'd have to either handle the flags in emcc and maybe pass some kind of tree through to the link function, or propagate -Wl,* linker flags through and handle the grouping in link. But both of these interact poorly with the processing that happens to these inputs in emcc. It will be a big change to make it all work.

Another option which doesn't add complexity is to go with --rescan-libs but default it to true (!). This way emcc will just work with whatever archive order. This is exactly how msvc does it (from http://msdn.microsoft.com/en-us/library/hcce369f.aspx):

Object files on the command line are processed in the order they appear on the command line. Libraries are searched in command line order as well, with the following caveat: Symbols that are unresolved when bringing in an object file from a library are searched for in that library first, and then the following libraries from the command line and /DEFAULTLIB (Specify Default Library) directives, and then to any libraries at the beginning of the command line.
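Read literally, the quoted rule amounts to a rotated search order. A tiny Python paraphrase, where the function and parameter names are invented for illustration and nothing here is taken from the MSVC linker itself:

```python
def msvc_search_order(libs, origin, default_libs=()):
    """Order in which libraries are searched for a symbol left unresolved
    by an object pulled from libs[origin], per the quoted paragraph:
    that library first, then the following ones, then /DEFAULTLIB
    libraries, then the libraries at the beginning of the command line."""
    return list(libs[origin:]) + list(default_libs) + list(libs[:origin])

print(msvc_search_order(["a.lib", "b.lib", "c.lib"], 1, ["msvcrt.lib"]))
# ['b.lib', 'c.lib', 'msvcrt.lib', 'a.lib']
```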

Thoughts?

Thanks,
Ryan

Ben Vanik

Apr 7, 2014, 7:20:01 PM
to emscripte...@googlegroups.com
--rescan-libs is awesome Ryan. You mention a possible perf issue in the comment; have you experienced this or is it speculative? I'd vote to have something like this enabled by default if it didn't mean 2x longer builds. I've been bit by this ordering issue a few times and it's painful to work around.


Alon Zakai

Apr 7, 2014, 7:37:12 PM
to emscripte...@googlegroups.com
Very interesting about msvc. However different behavior from gcc/clang sounds worrying. But, if there is no perf downside (as Ben just asked) and no compatibility issues, might be worth considering.

If we do handle groups, we would need to do it in emcc and the linking code in tools/shared. We do linking ourselves except for calling llvm-link which is very low-level and does not support groups and such, last I checked. Hopefully it wouldn't be too complex, but I don't know offhand.

- Alon



Ryan Sturgell

Apr 7, 2014, 10:07:47 PM
to emscripte...@googlegroups.com


On Monday, April 7, 2014 4:20:01 PM UTC-7, Ben Vanik wrote:
--rescan-libs is awesome Ryan. You mention a possible perf issue in the comment; have you experienced this or is it speculative? I'd vote to have something like this enabled by default if it didn't mean 2x longer builds. I've been bit by this ordering issue a few times and it's painful to work around.

Well, I just included that warning because the --start-group/--end-group docs do and the algorithm is worst case O(archives^2). In practice it will be very little overhead. It's just a bunch of extra dictionary intersections in python. You could theoretically construct a case where you were building with thousands of archives with a linear dependency chain between them, each of which had thousands of symbols, and you passed those to emcc in the exact wrong order (from dependency to dependants), then it would do a lot of extra work (but probably STILL not expensive compared to the actual llvm calls...).
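The quadratic worst case can be seen in a small Python simulation. This is a sketch under simplifying assumptions (each archive is one unit, symbols are plain strings) and is not the code from the branch. It counts full passes and archive visits for a linear dependency chain given in the best and the worst order.

```python
def count_passes(main_needs, archives):
    """Rescan the whole archive list until a pass adds nothing.

    `archives` is an ordered list of (defines, needs) pairs of sets.
    Returns (number of passes, number of archive visits).
    """
    undefined = set(main_needs)
    defined = set()
    included = [False] * len(archives)
    passes = visits = 0
    changed = True
    while changed:
        changed = False
        passes += 1
        for i, (defs, needs) in enumerate(archives):
            visits += 1
            if included[i] or not (defs & undefined):
                continue
            included[i] = True
            defined |= defs
            undefined = (undefined | needs) - defined
            changed = True
    return passes, visits

def chain(n):
    """Archive i defines s<i> and needs s<i+1>; the last one needs nothing."""
    return [({f"s{i}"}, {f"s{i + 1}"} if i + 1 < n else set()) for i in range(n)]

print(count_passes({"s0"}, chain(4)))                   # (2, 8): dependants first
print(count_passes({"s0"}, list(reversed(chain(4)))))   # (5, 20): dependencies first
```

With n archives in the wrong order, each pass pulls in exactly one archive, so the loop makes n + 1 passes over n archives.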

In the case I was testing with there are dozens of large archives in an order that requires 4 complete passes and the additional passes took around 1ms extra.

If anyone ever hit a bad case where this slowed them down it could be improved at the cost of memory. How much memory depends on how slavishly you want to adhere to the link order (vs just making it work with any symbol when there are dups).
 
Ryan




Ryan Sturgell

Apr 8, 2014, 12:00:44 AM
to emscripte...@googlegroups.com

On Monday, April 7, 2014 4:37:12 PM UTC-7, Alon Zakai wrote:
Very interesting about msvc. However different behavior from gcc/clang sounds worrying. But, if there is no perf downside (as Ben just asked) and no compatibility issues, might be worth considering.

The compatibility issue would be that builds that fail with "symbol not found" in gcc/clang might succeed in emcc.

Well... there is one real case that could be a problem. If you implement a function which overrides something from a library_*.js (like, say, glCreateShader), for example:

emcc main.o lib1.a lib2.a

Where:
  main.o does NOT call glCreateShader
  lib1.a contains obj1.o, which implements glCreateShader (and nothing that main.o depends on, so is not included on the first pass)
  lib2.a contains obj2.o, which uses glCreateShader and implements a function that main calls (so that it IS picked up on the first pass)

With the current implementation, obj1.o is not included in the link, and glCreateShader ends up binding to the implementation in library_gl.js (however that works).

But with --rescan-libs, obj1.o would be picked up on the second scan, and the implementation there would be used.
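The scenario can be replayed in a small Python model. Everything here is a stand-in: `render` is a made-up name for the function main calls, `JS_LIBRARY` represents whatever library_gl.js provides, and each archive holds just one object.

```python
JS_LIBRARY = {"glCreateShader"}  # stand-in for symbols supplied by library_gl.js

# (archive, object, symbols defined, symbols needed), in command-line order
LIBS = [
    ("lib1.a", "obj1.o", {"glCreateShader"}, set()),
    ("lib2.a", "obj2.o", {"render"}, {"glCreateShader"}),
]

def link(main_needs, libs, rescan):
    """Return (objects pulled in, symbols that fall back to the JS library)."""
    undefined, defined, included = set(main_needs), set(), []
    changed = True
    while changed:
        changed = False
        for _lib, obj, defs, needs in libs:
            if obj in included or not (defs & undefined):
                continue
            included.append(obj)
            defined |= defs
            undefined = (undefined | needs) - defined
            changed = True
        if not rescan:
            break  # current behaviour: one pass only
    return included, undefined & JS_LIBRARY

print(link({"render"}, LIBS, rescan=False))  # (['obj2.o'], {'glCreateShader'})
print(link({"render"}, LIBS, rescan=True))   # (['obj2.o', 'obj1.o'], set())
```

In the single-pass run glCreateShader is left for the JS library; with rescanning, obj1.o's definition wins.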


If we do handle groups, we would need to do it in emcc and the linking code in tools/shared.

Right, that's where I made the changes for --rescan-libs in my branch.
 
We do linking ourselves except for calling llvm-link which is very low-level and does not support groups and such, last I checked.

Correct. llvm-ld DID handle --start-group/--end-group but llvm-ld was dropped in llvm 3.2 because it was incomplete and clang is where all that logic happens now.
 
Hopefully it wouldn't be too complex, but I don't know offhand.
 
As I mentioned above it's not _too_ bad (the changes in tools/shared.py would be very similar to what I have in my branch), the hard part is plumbing it all together since the grouping needs to propagate through from emcc to shared.py.

I guess the cleanest way would be to unwrap -Wl,--X options to --X in the option parsing in emcc, keep these around in the file list, make sure to pass them through all the transformations / filters that happen in there, and pass those through to shared.building.link.
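The unwrapping step on its own is small. A Python sketch follows (the helper name is invented, and this is not emcc's real option parser). It relies on the compiler-driver convention that `-Wl,a,b` passes `a` and `b` to the linker as separate arguments.

```python
def unwrap_linker_flags(args):
    """Expand -Wl,x,y into separate x and y tokens; pass other args through."""
    out = []
    for arg in args:
        if arg.startswith("-Wl,"):
            out.extend(part for part in arg[4:].split(",") if part)
        else:
            out.append(arg)
    return out

print(unwrap_linker_flags(
    ["main.o", "-Wl,--start-group", "liba.a", "libb.a", "-Wl,--end-group"]))
# ['main.o', '--start-group', 'liba.a', 'libb.a', '--end-group']
```

The harder part described above, keeping these markers in place through the later filtering of the input list, is not covered by this sketch.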

Ryan



Alon Zakai

Apr 10, 2014, 2:42:03 PM
to emscripte...@googlegroups.com
Regarding llvm-ld, might be worth finding out how other bitcode-using projects do linking, like PNaCl. They have to do something for this as well, I assume.

- Alon




Ryan Sturgell

Apr 10, 2014, 2:56:28 PM
to emscripte...@googlegroups.com
Good point. Pnacl has an ld replacement called "pnacl-ld.py" where they handle stuff like --start-group,--end-group:


Ryan

You received this message because you are subscribed to a topic in the Google Groups "emscripten-discuss" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/emscripten-discuss/n1qfKPPAy08/unsubscribe.
To unsubscribe from this group and all its topics, send an email to emscripten-disc...@googlegroups.com.

Alon Zakai

Apr 14, 2014, 5:26:18 PM
to emscripte...@googlegroups.com
That link doesn't work for me. But that makes sense.

Can we perhaps reuse their code? What do they do to solve this, how do they actually do the linking? llvm-link like us?

- Alon

Ryan Sturgell

Apr 14, 2014, 7:15:30 PM
to emscripte...@googlegroups.com
Hmm, yes, that file seems to have disappeared from that location! Let's try again:


But, upon closer inspection, this looks like it's just messing around with the flags (splitting bitcode and native inputs?) and then calling out to other tools. Looks like it shells out to a "real" linker. Looks like maybe they use this:


(everything is very indirect, I'm only pretty sure this is what it does).

Looks like this allows the system "gold" linker (a real linker that knows all about --start-group and friends) to interface with llvm in a way that it can do global optimizations on bitcode files (and bitcode archives) to generate - I think - a single optimized bitcode output.

I have no idea whether it's feasible to use this in emscripten!

Thanks,
Ryan

Alon Zakai

Apr 22, 2014, 1:16:45 PM
to emscripte...@googlegroups.com
Thanks for looking into this, I'll ask some pnacl people I know.

- Alon

Alon Zakai

Apr 22, 2014, 1:17:58 PM
to emscripte...@googlegroups.com
Regardless though, it would be a lot of work for us to switch to a gold plugin, both development and maintenance, so I doubt we would do it, as the only missing feature I am aware of is the groups issue being discussed here. We should fix that directly I think.

- Alon

Ryan Sturgell

Apr 22, 2014, 1:44:39 PM
to emscripte...@googlegroups.com
Agreed. I have a pretty clean change to support --start/end-group, I'll send you a pull request.

Ryan

Ryan Sturgell

May 21, 2014, 2:37:51 AM
to emscripte...@googlegroups.com
emcc (in the incoming branch) now understands -Wl,--start-group/-Wl,--end-group (and the -Wl,-( / -Wl,-) variations). Give it a try and let me know if you have any problems with it.
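For build scripts, a command line using the group flags might be assembled like this in Python. The file names and the helper are placeholders, and only the -Wl,--start-group / -Wl,--end-group spelling comes from the message above.

```python
def emcc_link_command(objects, grouped_libs, output):
    """Build an emcc link command that wraps the given archives in one group."""
    return (["emcc"] + list(objects)
            + ["-Wl,--start-group"] + list(grouped_libs) + ["-Wl,--end-group"]
            + ["-o", output])

cmd = emcc_link_command(["main.o"], ["liba.a", "libb.a"], "app.js")
print(" ".join(cmd))
# emcc main.o -Wl,--start-group liba.a libb.a -Wl,--end-group -o app.js
```

The list form can be handed directly to something like `subprocess.run(cmd, check=True)`.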

Ryan