Hello,
On Tue, 23 Aug 2022, Florian Weimer wrote:
> >> >> I would go even further: The implementation is not required to provide
> >> >> its own symbols through ELF data structures.
> >> >
> >> > Define "the implementation".
> >>
> >> Components that provide the core run time functionality and need to be
> >> upgraded in a coordinated fashion.
> >
> > That just replaces one with another term "core run time". What's that and
> > where does it stop? Say, is libstdc++ core? libgcc_s? I'm guessing
> > you're thinking libc and ld-linux, but why should they not provide their
> > symbols via ELF mechanisms? (Note: this doesn't preclude those components
> > to use different means to do whatever they need, they merely should also
> > provide the standard ELF means, as long as they claim to implement an ELF
> > system).
>
> As far as glibc is concerned:
>
> libstdc++ is part of the implementation because it uses undocumented
> glibc functions, and (historically) calls glibc functions without proper
> symbol versioning. libgcc_s is part of the implementation because glibc
> hard-codes the name (and ABI) of the C personality routine used by the
> compiler. That's just two examples; these components are intertwined in
> other ways, too.

Yes, I'm aware. With this mailing list's hat on I would declare all of
this to be implementation artifacts/issues/suboptimalities/bugs that work
around something missing (e.g. proper reliable interfaces). I'm fully
aware that these happen (often one only finds out after the fact how
something should have looked, when it's already too late to rectify because
of backward compatibility problems; and sometimes the means that really
would fit the use case don't exist yet), but they should not be used as
an argument for why something basic and standardized is to be avoided.

If you really want an interface that's totally opaque and can't be looked
up, don't use separate symbols at all. Use something similar to _rtld, a
blob of memory that adheres to some internal undocumented layout.
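
To make the "blob" alternative concrete, here is a minimal sketch. Every
name in it (impl_private_blob, impl_blob, the lock hooks) is made up for
illustration and is not an actual glibc structure; the point is only that
a single data symbol is exported and the hooks themselves never appear as
separate symbols.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical private layout, shared only between components that are
   built and upgraded together.  Not an actual glibc structure.  */
struct impl_private_blob {
    unsigned int layout_version;      /* bumped on any layout change */
    int (*lock_acquire)(void *lock);  /* internal hook, has no symbol */
    int (*lock_release)(void *lock);
};

/* The hooks are static: nobody can look them up by name.  */
static int dummy_acquire(void *lock) { (void)lock; return 0; }
static int dummy_release(void *lock) { (void)lock; return 0; }

/* The only exported name; its contents are opaque to outsiders.  */
struct impl_private_blob impl_blob = { 1, dummy_acquire, dummy_release };

/* Consumer side: refuses to touch a layout it does not know.  */
int impl_call_acquire(void *lock)
{
    assert(impl_blob.lock_acquire != NULL);
    if (impl_blob.layout_version != 1)
        return -1;
    return impl_blob.lock_acquire(lock);
}
```

The cost is visible right there: a debugger or any other tool has no
standard way to find lock_acquire, which is exactly the trade-off.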
> >> Up to a point, but e.g. IFUNCs are already non-standard.
> >
> > IFUNCs are an addition, not a replacement, and hence orthogonal to basic
> > ELF mechanisms. That's not the same as arguing for removal of symbol
> > lookup capabilities via ELF means. And they are only non-standard until
> > standardized (e.g. in several psABI GNU extensions).
>
> I must say I disagree with that, at a technical level. The issue with
> IFUNCs in this context is that it's still possible to look up the symbol
> and get some address, but if the lookup code does not know anything
> about IFUNCs, it will use the address directly, which does not work.

Of course you have to take the symbol type into account when looking up
manually. That is something you trivially know in that case, unlike e.g.
when using dlsym, because it's right there in ST_TYPE(Elf_Sym.st_info). If
you deal optimistically with symbol types (i.e. just use the address as is
for unknown types, or don't check OSABI for some types) you get what you
asked for. No one is saying that relying on ELF guarantees is trivial; if
it were, dlsym wouldn't be necessary. But at least you can then rely on
something.
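
As a sketch of what "caring for the type" could look like for a 64-bit
object, using the definitions from glibc's <elf.h>: resolve_sym and its
parameters are names I made up, the policy of accepting IFUNCs only under
ELFOSABI_GNU is deliberately strict, and calling the resolver without
arguments is an assumption that matches x86-64 but not every psABI.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>
#include <stdint.h>

/* Turn a symbol table entry found by manual lookup into a usable address.
   load_bias is the difference between run-time and link-time addresses of
   the containing object, osabi is that object's e_ident[EI_OSABI].
   Returns NULL for everything this code does not understand.  */
void *resolve_sym(const Elf64_Sym *sym, uintptr_t load_bias,
                  unsigned char osabi)
{
    assert(sym != NULL);
    if (sym->st_shndx == SHN_UNDEF)
        return NULL;                    /* not defined in this object */

    uintptr_t addr = load_bias + (uintptr_t)sym->st_value;

    switch (ELF64_ST_TYPE(sym->st_info)) {
    case STT_NOTYPE:
    case STT_OBJECT:
    case STT_FUNC:
        return (void *)addr;            /* st_value is the address */
    case STT_GNU_IFUNC: {
        if (osabi != ELFOSABI_GNU)
            return NULL;                /* OS-specific type, wrong OSABI */
        /* st_value is the resolver, not the function itself.
           Assumption: the resolver takes no arguments (as on x86-64).  */
        void *(*resolver)(void) = (void *(*)(void))addr;
        return resolver();
    }
    default:
        return NULL;                    /* unknown type (e.g. TLS): fail */
    }
}
```

The default case is the important one: an unknown type is a failure, not
an address.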
> Furthermore, allowing direct lookup (bypassing the dynamic linker),
> breaks the e_ident[EI_ABIVERSION] handshake. In the past, we assumed
> that we could increment the number if the dynamic linker has been
> updated. But if direct lookups are permitted, this client code would
> have to check e_ident[EI_ABIVERSION] (which is somewhat difficult to get
> hold of in glibc)

That has nothing directly to do with glibc. Either the ELF header is part
of the mapped segments, or it's not (generally it's a good idea to map it).
Of course getting hold of all mapped libs is somewhat difficult if you
can't rely on _r_debug, and that's indeed libc land. But depending on the
OS you can also use different means to get at the mappings.

And then, if the value in e_ident[EI_ABIVERSION] is not supported by your
lookup code, you should indeed fail; see above about no one saying life is
easy.
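
A minimal sketch of such a check, assuming you already have a pointer to
the mapped ELF header (how you got it is the OS-dependent part). The
function name and MAX_KNOWN_ABIVERSION are hypothetical; the constants
come from glibc's <elf.h>.

```c
#include <assert.h>
#include <elf.h>
#include <string.h>

/* Highest EI_ABIVERSION this hypothetical lookup code was written for.  */
#define MAX_KNOWN_ABIVERSION 0

/* Returns 1 if manual lookup in the object whose mapped e_ident[] is
   given may proceed, 0 if the lookup code has to give up.  */
int lookup_may_proceed(const unsigned char *e_ident)
{
    assert(e_ident != NULL);
    if (memcmp(e_ident, ELFMAG, SELFMAG) != 0)
        return 0;                       /* no ELF header mapped here */
    if (e_ident[EI_VERSION] != EV_CURRENT)
        return 0;                       /* unknown ELF version */
    if (e_ident[EI_ABIVERSION] > MAX_KNOWN_ABIVERSION)
        return 0;                       /* newer ABI than we know: fail */
    return 1;
}
```

So the handshake still works with direct lookup; it just moves into the
code doing the lookup.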
> >> To give an extreme example, one might expect that
> >>
> >> printf ("Hello, world!\n");
> >>
> >> actually produces a printf symbol reference, relocation, and a PLT
> >> call. But the symbol is likely puts these days, and the PLT call might
> >> be gone as well.
> >
> > That's not an extreme example, because it falls into the implementation
> > defined category.
>
> Why isn't binding of symbols used by the implementation
> implementation-defined in the same way, at least implicitly?

First: ELF of course does define the implementation of symbol binding. So
we have the implementation-definedness readily there. What you are
arguing for is to replace this simple rule for getting at the
implementation's meaning ("look into ELF") with something more
complicated: look into ELF, except for some unspecified set of things,
where you should go look into some random source code snippet.

Second: do you consider 'printf' to be an implementation symbol because
it's also used by the implementation? To me it's quite clearly a symbol
that's supposed to be user visible and hence part of the public API and
ABI, and hence should adhere to whatever the gABI and psABI say.

A symbol used by the implementation should be one _only_ used by the
implementation. But again, where does that stop? Some interfaces are (or
were) for internal communication between GNU libc, libpthread and libdl.
You could say, "implementation symbols, can use non-ELF means". Then you
can just as well put their addresses in an undocumented shared blob,
instead of having symbols. But sooner or later you will find use cases
that really could make use of them, let's say debuggers, at which point
you go "meh". I guess I'm saying that it's short-sighted to not use
standardized means to do whatever processing is required when such a
standard exists and is already in use anyway (here 'processing' == 'symbol
lookup'). Possibly the standard needs to be extended to match a new
use case, but that's still better than inventing something completely new.
Ciao,
Michael.