STB_GLOBAL/STB_WEAK in symbol lookup


Fangrui Song

Aug 16, 2021, 2:49:18 AM
to Generic System V Application Binary Interface
"When resolving symbolic references, the dynamic linker examines the symbol tables with a breadth-first search. That is, it first looks at the symbol table of the executable program itself, then at the symbol tables of the DT_NEEDED entries (in order), and then at the second level DT_NEEDED entries, and so on..."
(this paragraph has not been updated in the latest snapshot)
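As a rough illustration of the search order the quoted paragraph describes, here is a minimal Python sketch. The dependency graph and the object names (`exe`, `liba.so`, and so on) are made up for the example, and a real dynamic linker does considerably more (scopes, `dlopen`, versioning).

```python
from collections import deque

def search_order(root, needed):
    """Return objects in breadth-first DT_NEEDED order, each listed once."""
    order, seen, queue = [], {root}, deque([root])
    while queue:
        obj = queue.popleft()
        order.append(obj)
        # Direct dependencies are queued in DT_NEEDED order; an object that
        # was already reached at an earlier position is not added again.
        for dep in needed.get(obj, []):
            if dep not in seen:
                seen.add(dep)
                queue.append(dep)
    return order

# Hypothetical graph: each key maps to its DT_NEEDED list.
needed = {
    "exe": ["liba.so", "libb.so"],
    "liba.so": ["libc.so"],
    "libb.so": ["libd.so", "liba.so"],
}
order = search_order("exe", needed)
print(order)
```

The executable comes first, then its direct DT_NEEDED entries, then the second-level ones.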

This paragraph says nothing about STB_WEAK.
The common(?) interpretation is that STB_WEAK/STB_GLOBAL have no differences for symbol lookup.
On Linux, glibc 2.2 made STB_WEAK/STB_GLOBAL the same. man ld.so says:

```
LD_DYNAMIC_WEAK (since glibc 2.1.91)
       By default, when searching shared libraries to resolve a symbol reference, the dynamic linker will resolve to the first definition it finds.

       Old  glibc versions (before 2.2), provided a different behavior: if the linker found a symbol that was weak, it would remember that symbol and keep searching in the remaining shared libraries.  If it subsequently
       found a strong definition of the same symbol, then it would instead use that definition.  (If no further symbol was found, then the dynamic linker would use the weak symbol that it initially found.)

       The old glibc behavior was nonstandard.  (Standard practice is that the distinction between weak and strong symbols should have effect only at static link time.)  In glibc 2.2, the dynamic linker was modified  to
       provide the current behavior (which was the behavior that was provided by most other implementations at that time).
       [...]
```

musl libc preferred STB_GLOBAL over STB_WEAK before 2017.
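To make the two policies concrete, here is a small Python sketch of a lookup over an already computed search order. The function name, the `prefer_global` flag, and the library names are hypothetical; this is only a model of the behavior the man page describes, not glibc's or musl's actual code.

```python
def lookup(name, order, symtabs, prefer_global=False):
    """Return (object, binding) for the chosen definition of name, or None.

    prefer_global=False: the first definition wins, weak or not.
    prefer_global=True:  remember the first weak definition and keep
                         searching for an STB_GLOBAL one (old behavior).
    """
    first_weak = None
    for obj in order:
        binding = symtabs.get(obj, {}).get(name)
        if binding is None:
            continue  # this object does not define the symbol
        if binding == "GLOBAL" or not prefer_global:
            return obj, binding
        if first_weak is None:
            first_weak = (obj, binding)
    return first_weak

# Hypothetical search order and dynamic symbol tables.
order = ["exe", "libweak.so", "libglobal.so"]
symtabs = {
    "libweak.so": {"foo": "WEAK"},
    "libglobal.so": {"foo": "GLOBAL"},
}
print(lookup("foo", order, symtabs))
print(lookup("foo", order, symtabs, prefer_global=True))
```

With the default policy the reference binds to libweak.so; with `prefer_global=True` it binds to libglobal.so, which is roughly what `LD_DYNAMIC_WEAK` selects on glibc.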

---

On FreeBSD, the initial code
(1998 "Import the ELF dynamic linker. This is the ElfKit version with")
treated STB_WEAK as overridable by STB_GLOBAL.
(I have a pending patch https://reviews.freebsd.org/D26352 to allow switching the behavior.)

On NetBSD, the initial code treated STB_WEAK/STB_GLOBAL the same.
(1999 "Changes from msaitoh to fix local/global symbol confusion, and to fix weak")
made STB_WEAK overridable by STB_GLOBAL.

---

I am curious whether other systems prefer a STB_GLOBAL definition to a STB_WEAK definition.

We know that the dynamic linking design principle is "dynamic linking should be similar to static linking".
However, I believe this sentence is ambiguous for this case, because I can use it to argue for either side.

On one hand, I can argue that `ld ref.o -lweak -lglobal` (when libglobal.a members are extracted) should select the definition from libglobal.a,
so libglobal.so's definition should be prioritized over libweak.so during symbol lookup.

On the other hand, I can argue that `ld ref.o -lweak -lglobal` (when libglobal.a members are not extracted) is similar to libglobal.so's definition
being ignored. After all, ELF interposition is an emulation of archive member extraction.
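Both readings can be reproduced with a toy model of archive processing. The Python sketch below is a simplification under stated assumptions: one left-to-right pass over the archives, a member is extracted only if it defines a currently undefined symbol, and duplicate STB_GLOBAL definitions are not diagnosed. All file and symbol names are hypothetical.

```python
def static_link(objects, archives):
    """Toy model of `ld objs... archives...`.

    Each object or member is a dict:
      {"name": str, "defs": {symbol: "WEAK" | "GLOBAL"}, "refs": [symbol]}
    Returns (resolved, extracted): symbol -> (file, binding), and the
    names of the archive members that were extracted, in order.
    """
    resolved, undefined, extracted = {}, set(), []

    def add(obj):
        for sym, binding in obj["defs"].items():
            cur = resolved.get(sym)
            # A definition fills an empty slot; STB_GLOBAL replaces STB_WEAK.
            if cur is None or (cur[1] == "WEAK" and binding == "GLOBAL"):
                resolved[sym] = (obj["name"], binding)
            undefined.discard(sym)
        for sym in obj["refs"]:
            if sym not in resolved:
                undefined.add(sym)

    for obj in objects:
        add(obj)
    for archive in archives:
        progress = True
        while progress:  # rescan until no member of this archive is needed
            progress = False
            for member in archive:
                needed = undefined & set(member["defs"])
                if member["name"] not in extracted and needed:
                    extracted.append(member["name"])
                    add(member)
                    progress = True
    return resolved, extracted

libweak = [{"name": "weak.o", "defs": {"foo": "WEAK"}, "refs": []}]
libglobal = [{"name": "global.o", "defs": {"foo": "GLOBAL", "bar": "GLOBAL"}, "refs": []}]

# ref.o needs only foo: weak.o satisfies it, so global.o is never extracted.
only_foo = {"name": "ref.o", "defs": {}, "refs": ["foo"]}
not_extracted = static_link([only_foo], [libweak, libglobal])

# ref.o also needs bar: global.o is extracted for bar, and its STB_GLOBAL foo
# then overrides the STB_WEAK foo from weak.o.
foo_and_bar = {"name": "ref.o", "defs": {}, "refs": ["foo", "bar"]}
pulled_in = static_link([foo_and_bar], [libweak, libglobal])

print(not_extracted[0]["foo"], not_extracted[1])
print(pulled_in[0]["foo"], pulled_in[1])
```

In the first run `foo` stays bound to the weak definition; in the second it ends up bound to the one from global.o. Which of the two a shared-library lookup should imitate is exactly the ambiguity.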

Cary Coutant

Nov 2, 2021, 1:05:57 AM
to Generic System V Application Binary Interface
> I am curious whether other systems prefer a STB_GLOBAL definition to a STB_WEAK definition.
>
> We know that the dynamic linking design principle is "dynamic linking should be similar to static linking".
> However, I believe this sentence is ambiguous for this case, because I can use it to argue for either side.

The motivation for adding weak definitions to the gABI was to support
ANSI C namespace rules. We all needed our archive C libraries to be
able to contain a definition of, say, "open" or "close", without
intruding on the user's namespace. The problem was that even though
those symbols weren't part of the system's namespace (unless the
source code included a certain header file), they were still entry
points needed by other C library functions (like "fopen" and
"fclose"). We needed to be able to provide aliases (e.g., "_open" and
"_close") that could be used by the rest of the C library, in case the
user's program defined its own versions of "open" and/or "close",
while still making the plain names available to a program using the
extended namespace.

This was a problem that existed only for archive libraries, and it was
a solution intended only for archive libraries -- in a shared library,
you can have regular definitions of "open" and "close" without causing
a multiple definition error, and as long as other functions in the
library bind to the internal entry points, there is no need for the
plain definitions in a shared library to be weak.

Weak definitions still managed to end up in shared libraries, but our
intent was that there would be no difference between weak and regular
definitions.

In particular, the gABI speaks of weak definitions only in the context
of relocatable object files, and says this:

    The behavior of weak symbols in areas not specified by this document is
    implementation defined.
    Weak symbols are intended primarily for use in system software.
    Applications using weak symbols are unreliable
    since changes in the runtime environment
    might cause the execution to fail.

The document revision history also has these notes:

    (July 6, 1999) New language has been added warning about the use
    of WEAK symbols in application programs.

    (April 24, 2001) Changed the warning about using weak to be
    stronger. [pun probably intended]

We really didn't want people misusing STB_WEAK! (Not that those
warnings did any good.)

Weak references came along for the ride. We considered the behavior
useful, but it wasn't our primary motivation for adding STB_WEAK, and
we probably left it under-specified. But again, it was considered
useful only in the context of static linking. For dynamic linking,
users were supposed to call dlsym() if a symbol was optional. Our
failure to lock this down has been a rich source of bug reports due to
applications relying on unspecified, implementation-defined behavior.

-cary

Fangrui Song

Nov 2, 2021, 2:17:02 AM
to gener...@googlegroups.com
Thanks for sharing!

> I am curious whether other systems prefer a STB_GLOBAL definition to a
> STB_WEAK definition.

Answering my own question: the ld64 linker on Mach-O has similar rules.
It goes even further: a weak dylib definition can extract an archive member as well
(`ld a.dylib b.a`).

The rule perhaps makes `ld ... a.dylib b.a` and `ld ... b.a a.dylib` more similar.


One more question: did the PDP-11 object file format invent weak symbols?

From my archaeology (https://maskray.me/blog/2021-04-25-weak-symbol),
PDP-11 MACRO-11 Language Reference Manual mentions the .WEAK directive.
Its weak directive for an externally defined symbol appears to be very
close to the semantics of ELF weak references. However, I must say I
fail to understand some of its terms like "object library".

Roland McGrath

Nov 2, 2021, 2:14:46 PM
to gener...@googlegroups.com
Before ELF came along, the name space problem that the "weak alias" pattern in ELF is used to solve was addressed in GNU a.out extensions with a different sort of semantics.  In GNU a.out, there were no weak symbols, but there were indirect symbols.  So rather than having a strong __foo and a weak foo with the same value, a library could define __foo and separately define a foo -> __foo symbol indirection (which didn't have to be in the same object file as the __foo definition).  Clearly ELF avoided this sort of complexity in favor of the weak symbol solution (which has turned out to have other nuances we didn't anticipate, I think).  I hadn't heard of the weak symbol concept before ELF introduced it, and I kind of suspect that if there were a clear precedent for it then the GNU a.out extension might have looked more like that (but I wasn't involved in designing the a.out extensions, only in using them in pre-ELF glibc and a little in maintaining the pre-BFD GNU linker that implemented them).

In the original glibc dynamic linker code, I frankly was just learning about ELF while writing it and hadn't thought much about the weak symbol issue yet.  Later it was Drepper (perhaps around the time of implementing GNU symbol versioning, I'm not sure) who clarified the understanding as Cary described it and implemented the always-intended behavior and the LD_DYNAMIC_WEAK variable to control it for (then hoped to be short-lived) bug-compatibility if needed.  I certainly concurred with the assessment that this was the correct behavior and that the dynamic linker paying attention to weakness (aside from the undefined weak case) had always been a bug in my implementation.

--
You received this message because you are subscribed to the Google Groups "Generic System V Application Binary Interface" group.
To unsubscribe from this group and stop receiving emails from it, send an email to generic-abi...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/generic-abi/20211102061657.w7qlzahbhwygbw4m%40google.com.

Cary Coutant

Nov 2, 2021, 6:12:35 PM
to Generic System V Application Binary Interface
> Before ELF came along, the name space problem that the "weak alias" pattern in ELF is used to solve was addressed in GNU a.out extensions with a different sort of semantics. In GNU a.out, there were no weak symbols, but there were indirect symbols. So rather than having a strong __foo and a weak foo with the same value, a library could define __foo and separately define a foo -> __foo symbol indirection (which didn't have to be in the same object file as the __foo definition).

For a bit more historical trivia, for pre-ELF (SOM) HP-UX, we invented
"secondary definitions," which worked kind of like weak symbols in
ELF. They were a bit more restrictive, though, as they had to be
paired with a regular definition.

The GNU a.out scheme actually seems nice. It sounds like an
optimization of the fallback/no-extra-object-file-support scheme of
providing a stub named "open" that simply branched to "_open".

-cary

Roland McGrath

Nov 2, 2021, 6:40:56 PM
to gener...@googlegroups.com
Yes, indirect symbols can be thought of as an optimization of tail-call stubs for functions.  However, they also work for variables and preserve function pointer identity for aliased functions like ELF-style "aliases" do and tail-call stubs don't.  (It's unclear anybody ever noticed or cared that &__read == &read or the like and ELF PLT rules wouldn't make that so anyway, but it was with N_INDR.  For variables this issue is crucial, and __environ/environ was in the initial set of use cases.)

Nowadays you could almost do it with input linker scripts containing `PROVIDE(foo = __foo);` but I don't think any linker has rules to draw input linker scripts out of archive libraries like you'd want for that (let alone any ranlib that could index a linker script!).

But indeed if we were starting over, I think I'd probably go for link-time indirect symbols over weak aliases.  (I'd also make the weak definition for override purpose--if supported at all--and the "optionally undefined" purpose separate features not overloaded onto one symbol table feature.)
