Making DT_HASH optional?


Carlos O'Donell

Aug 8, 2022, 4:50:54 PM
to gener...@googlegroups.com
We recently had a case where dropping DT_HASH from the generic
glibc binaries broke an ELF consumer.

For the "real world" details see:
https://sourceware.org/pipermail/libc-alpha/2022-August/141302.html

I was surprised to see that DT_HASH was "mandatory" in the existing
published standard.

Why is it mandatory and not optional?

Should we and could we make it optional?

--
Cheers,
Carlos.

ali_e...@emvision.com

Aug 8, 2022, 6:26:32 PM
to gener...@googlegroups.com
Hi Carlos,

DT_GNU_HASH doesn't exist in the gABI, so I don't
think that the gABI has the ability, or the need, to
say that. This is an OSABI matter.

If you are building a generic object (ELFOSABI_NONE),
then you do have to follow that rule. Here though, your
objects are ELFOSABI_GNU --- not generic. The mere
presence of DT_GNU_HASH requires that. The OSABI supersedes
the gABI in a case like this. I think you can handle
this by simply amending the GNU OSABI to make the rule
you want, and say that GNU OSABI objects must include
at least one of these 2 hash types, or both.

- Ali

Roland McGrath

Aug 8, 2022, 6:49:36 PM
to gener...@googlegroups.com
It's not at all true that the mere presence of DT_GNU_HASH requires setting EI_OSABI to ELFOSABI_GNU.  That is only necessary for the symbol table features like STB_GNU_UNIQUE.  It's fine to have DT_GNU_HASH in addition to DT_HASH regardless of whether other extensions are used or not.  ELF never prohibits extra nonstandard DT_* entries being present.

It's obviously true that the generic ABI only specifies DT_HASH and not DT_GNU_HASH so it's not a place that would mandate it.  Having the GNU ELF ABI mandate DT_GNU_HASH and make DT_HASH explicitly optional is probably very reasonable at this point, but this is not the forum to discuss that.  Carlos may have intended to post on gnu-...@sourceware.org.

--
You received this message because you are subscribed to the Google Groups "Generic System V Application Binary Interface" group.
To unsubscribe from this group and stop receiving emails from it, send an email to generic-abi...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/generic-abi/b8cd6eb6-c0b6-2a32-86f1-dbe79af64ba4%40emvision.com.

Fangrui Song

Aug 8, 2022, 7:17:55 PM
to gener...@googlegroups.com
On 2022-08-08, 'Roland McGrath' via Generic System V Application Binary Interface wrote:
>It's not at all true that the mere presence of DT_GNU_HASH requires setting
>EI_OSABI to ELFOSABI_GNU. That is only necessary for the symbol table
>features like STB_GNU_UNIQUE. It's fine to have DT_GNU_HASH in addition to
>DT_HASH regardless of whether other extensions are used or not. ELF never
>prohibits extra nonstandard DT_* entries being present.
>
>It's obviously true that the generic ABI only specifies DT_HASH and not
>DT_GNU_HASH so it's not a place that would mandate it. Having the GNU ELF
>ABI mandate DT_GNU_HASH and make DT_HASH explicitly optional is probably
>very reasonable at this point, but this is not the forum to discuss that.
>Carlos may have intended to post on gnu-...@sourceware.org.

DT_HASH is specified as "mandatory" probably because it wanted a way to
count the number of dynamic symbols, and DT_HASH provides the
information (hashtab[1]). DT_GNU_HASH provides the size, although in a
slightly involved way.
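
To make the hashtab[1] remark concrete, here is a minimal C sketch of reading the symbol count from a SysV hash table. The function name is invented for this example, and it assumes 32-bit hash words, which holds on most targets but not all (some 64-bit targets use 64-bit words).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Layout of the table that DT_HASH points at (32-bit words assumed):
 *   word 0: nbucket
 *   word 1: nchain
 *   then nbucket bucket words, then nchain chain words.
 * nchain equals the number of entries in the dynamic symbol table. */
static uint32_t dynsym_count_from_sysv_hash(const uint32_t *hashtab)
{
    assert(hashtab != NULL);
    return hashtab[1];
}
```

A loader or inspection tool would call this on the address it obtained from the DT_HASH entry of the dynamic array.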

Making DT_HASH optional for ELFOSABI_NONE makes sense to me, since it
perhaps isn't the generic ABI's role to define how a dynamic loader can
obtain the size information.

The mere existence of DT_GNU_HASH "upgrading" ELFOSABI_NONE to
ELFOSABI_GNU does not make sense to me. Many OSes have adopted
DT_GNU_HASH (though many don't really care so much about OSABI anyway).
FreeBSD and DragonFlyBSD have had DT_GNU_HASH for a long time.
OpenBSD got support in 2018 and NetBSD in 2020.

Having the clarification on GNU ABI will be useful, regardless of
generic ABI's stance.

>On Mon, Aug 8, 2022 at 3:26 PM <ali_e...@emvision.com> wrote:
>
>> On 8/8/22 2:50 PM, Carlos O'Donell wrote:
>> > We recently had a case where dropping DT_HASH from the generic
>> > glibc binaries broke an ELF consumer.
>> >
>> > For the "real world" details see:
>> > https://sourceware.org/pipermail/libc-alpha/2022-August/141302.html
>> >
>> > I was surprised to see that DT_HASH was "mandatory" in the existing
>> > published standard.
>> >
>> > Why is it mandatory and not optional?
>> >
>> > Should we and could we make it optional?
>> >

I have a comment on
https://github.com/ValveSoftware/Proton/issues/6051#issuecomment-1208698263
that Easy AntiCheat's requirement of DT_HASH should be dropped.

ali_e...@emvision.com

Aug 8, 2022, 8:29:06 PM
to gener...@googlegroups.com
Hi Roland,

On 8/8/22 4:49 PM, 'Roland McGrath' via Generic System V Application Binary Interface wrote:
> It's not at all true that the mere presence of DT_GNU_HASH requires setting EI_OSABI to ELFOSABI_GNU.  That is only necessary for the symbol table features like STB_GNU_UNIQUE.  It's fine to have
> DT_GNU_HASH in addition to DT_HASH regardless of whether other extensions are used or not.  ELF never prohibits extra nonstandard DT_* entries being present.

I think I understand what you're saying --- please correct me
if not: If an object can function in a generic world, the presence
of "extras" doesn't necessarily require an OSABI to be set. A
platform could ignore those extras, run generically, and
all would be well.

I don't think this is one of those cases though. SHT_GNU_HASH
is defined in the OSABI range between SHT_LOOS and SHT_HIOS,
and an OSABI is required to properly interpret it. The same
numbers can have very different meanings on different platforms,
as is true in this case:

% grep 0x6ffffff6 /usr/include/sys/elf.h
#define SHT_SUNW_SIGNATURE 0x6ffffff6
#define SHT_GNU_HASH 0x6ffffff6 /* GNU-style hash table */

There might be cases where extra nonstandard DT_* entries
don't require an OSABI to be set, but this (DT_GNU_HASH,
which points at an SHT_GNU_HASH) seems like one where
it is. ???

And in any case, without DT_HASH, the resulting object
can't function generically, so the newer objects would
seem to want an OSABI.


>
> It's obviously true that the generic ABI only specifies DT_HASH and not DT_GNU_HASH so it's not a place that would mandate it.  Having the GNU ELF ABI mandate DT_GNU_HASH and make DT_HASH explicitly
> optional is probably very reasonable at this point, but this is not the forum to discuss that.  Carlos may have intended to post on gnu-...@sourceware.org.

That was my assumption as well.

Thanks.

- Ali

Cary Coutant

Aug 8, 2022, 8:29:32 PM
to Generic System V Application Binary Interface
This is an odd situation.

DT_HASH is mandatory because the "nchain" field in the hash table is
the only way to know the number of symbols in DT_SYMTAB (unless you
still have a section table and look for the .dynsym section, which is
not something we want to require). But when the psABI has effectively
replaced DT_HASH with something newer, it makes sense for that
requirement to transfer to the newer table. I suppose the gABI could
say something like "mandatory unless the psABI provides for a
replacement in some form," but that doesn't feel right to me. In this
case, I think it's simply up to the psABI to override the gABI and say
that DT_HASH is optional if DT_GNU_HASH is present. Unfortunately,
that means that generic tools that know nothing of ELFOSABI_GNU or
ELFOSABI_LINUX would be unable to process DT_SYMTAB if they can't find
DT_HASH.

I'd probably have preferred to have a mandatory DT_SYMTAB_COUNT or
DT_SYMTABSZ entry in the dynamic table, which would have enabled us to
make the hash table (in any form) optional. But the DT_HASH
requirement has been with us since the beginning of time.

-cary

Cary Coutant

Aug 8, 2022, 8:41:17 PM
to Generic System V Application Binary Interface
> > It's not at all true that the mere presence of DT_GNU_HASH requires setting EI_OSABI to ELFOSABI_GNU. That is only necessary for the symbol table features like STB_GNU_UNIQUE. It's fine to have
> > DT_GNU_HASH in addition to DT_HASH regardless of whether other extensions are used or not. ELF never prohibits extra nonstandard DT_* entries being present.
>
> I think I understand what you're saying --- please correct me
> if not: If an object can function in a generic world, the presence
> of "extras" doesn't necessarily require an OSABI to be set. A
> platform could ignore those extras, run generically, and
> all would be well.
>
> I don't think this is one of those cases though. SHT_GNU_HASH
> is defined in the OSABI range between SHT_LOOS and SHT_HIOS,
> and an OSABI is required to properly interpret it. The same
> numbers can have very different meanings on different platforms,
> as is true in this case:
>
> % grep 0x6ffffff6 /usr/include/sys/elf.h
> #define SHT_SUNW_SIGNATURE 0x6ffffff6
> #define SHT_GNU_HASH 0x6ffffff6 /* GNU-style hash table */

Actually, this is *outside* the OS-specific range --

Name     Value       d_un         Executable   Shared Object
DT_LOOS  0x6000000D  unspecified  unspecified  unspecified
DT_HIOS  0x6ffff000  unspecified  unspecified  unspecified

-- meaning that DT_GNU_HASH can be interpreted without knowing the
OSABI. It was effectively "grandfathered in" to a set of non-standard
generic values.
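
The range argument can be checked mechanically. Here is a small C sketch using the DT_LOOS and DT_HIOS values quoted above and the value 0x6ffffef5 that the usual ELF headers assign to DT_GNU_HASH. The helper name is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DT_LOOS     0x6000000DUL
#define DT_HIOS     0x6ffff000UL
#define DT_GNU_HASH 0x6ffffef5UL

/* Compile-time check: DT_GNU_HASH sits above the OS-specific range. */
static_assert(DT_GNU_HASH > DT_HIOS, "DT_GNU_HASH is above DT_HIOS");

/* True when a dynamic tag falls inside [DT_LOOS, DT_HIOS]. */
static bool dt_tag_is_os_specific(uint64_t tag)
{
    return tag >= DT_LOOS && tag <= DT_HIOS;
}
```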

So Roland is correct that the presence of DT_GNU_HASH does not require
setting EI_OSABI to something other than NONE. I don't think that
helps us, though, when it comes to dealing with a missing DT_HASH.

-cary

ali_e...@emvision.com

Aug 8, 2022, 9:31:23 PM
to gener...@googlegroups.com
On 8/8/22 6:41 PM, Cary Coutant wrote:
>> % grep 0x6ffffff6 /usr/include/sys/elf.h
>> #define SHT_SUNW_SIGNATURE 0x6ffffff6
>> #define SHT_GNU_HASH 0x6ffffff6 /* GNU-style hash table */
> Actually, this is *outside* the OS-specific range --
>
> DT_LOOS 0x6000000D unspecified unspecified unspecified
> DT_HIOS 0x6ffff000 unspecified unspecified unspecified
>
> -- meaning that DT_GNU_HASH can be interpreted without knowing the
> OSABI. It was effectively "grandfathered in" to a set of non-standard
> generic values.
>
> So Roland is correct that the presence of DT_GNU_HASH does not require
> setting EI_OSABI to something other than NONE. I don't think that
> helps us, though, when it comes to dealing with a missing DT_HASH.


This is getting progressively more nit-picky, so I
apologize in advance for dragging things out, but there
seems to be some interesting history here.

Roland is right about DT_GNU_HASH:

#define DT_GNU_HASH 0x6ffffef5 /* GNU-style hash table (unused) */

I listed SHT_GNU_HASH above though, not DT_GNU_HASH.
The section code is in OSABI territory:

% egrep 'SHT_(HI|LO)OS' /usr/include/sys/elf.h
#define SHT_LOOS 0x60000000 /* OS specific range */
#define SHT_HIOS 0x6fffffff

So am I right about the SHT_ value, but not the DT_?
Possibly, on a technicality, but it seems clear that
both the SHT_ and DT_ values predate the OSABI partitioning,
so it's equally possible that we should have reserved the
preexisting values, but didn't. I don't know what was
said about this back at the gABI meetings, which predate
my involvement with ELF by 3-4 years.
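
For comparison with the DT_ case, the same kind of range test for the section type can be sketched in C. The SHT_LOOS, SHT_HIOS and SHT_GNU_HASH values are the ones shown in the grep output above; the helper name is made up for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SHT_LOOS     0x60000000u
#define SHT_HIOS     0x6fffffffu
#define SHT_GNU_HASH 0x6ffffff6u

/* Compile-time check: unlike DT_GNU_HASH, the section type is inside
 * the OS-specific range. */
static_assert(SHT_GNU_HASH >= SHT_LOOS && SHT_GNU_HASH <= SHT_HIOS,
              "SHT_GNU_HASH is within [SHT_LOOS, SHT_HIOS]");

/* True when a section type falls inside [SHT_LOOS, SHT_HIOS]. */
static bool sht_type_is_os_specific(uint32_t type)
{
    return type >= SHT_LOOS && type <= SHT_HIOS;
}
```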

Anyway, it might be helpful to set it, given where we
are now, since it eliminates all doubt.

- Ali

Roland McGrath

Aug 8, 2022, 9:31:30 PM
to gener...@googlegroups.com
My point was unrelated to the conditions where omitting DT_HASH might or might not be acceptable.  It's clear that as the generic ABI does not specify an alternative, the generic ABI on its own requires DT_HASH.  Every real system's ABI is a derivative that sets its own rules, requiring both more and less than what the generic ABI says.  I think many users and maintainers of systems based on GNU tools have long ago decided that `--hash-style=gnu` was fine for their system ABIs, though many still use `--hash-style=both` for many binaries and I'm not aware of any that has removed DT_HASH support from the dynamic linker implementation.

The point is that there are many producers and consumers of ELF, many of whom use and even mandate some GNU extensions, but not all of them.  The history of EI_OSABI is that many "additive" GNU extensions were introduced and widely used, without ever setting EI_OSABI nonzero.  The ELFOSABI_GNU value was specifically introduced to indicate that a binary requires support for STB_GNU_UNIQUE and STT_GNU_IFUNC in the dynamic linker that loads it.  The widely-used GNU tools do *not* set ELFOSABI_GNU in binaries that do not use any of these features.  (Now also SHT_GNU_MBIND and SHF_GNU_RETAIN, though the latter causes ELFOSABI_GNU in ET_REL files but not in linked files.)  This is exactly the situation needed for robust support of the many implementations that support (or even require) DT_GNU_HASH and various other GNU extensions, but do not support ELFOSABI_GNU files because they do not support STB_GNU_UNIQUE or STT_GNU_IFUNC.
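
As a sketch of what a consumer sees, the OSABI is a single byte in the ELF identification bytes, so a file that uses DT_GNU_HASH but none of the symbol table extensions can still carry ELFOSABI_NONE there. The function name below is hypothetical, and the constants follow the usual elf.h values (EI_OSABI = 7, ELFOSABI_NONE = 0, ELFOSABI_GNU = 3).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define EI_NIDENT     16
#define EI_OSABI      7
#define ELFOSABI_NONE 0
#define ELFOSABI_GNU  3

/* Return the EI_OSABI byte from e_ident, or -1 if the buffer is too
 * short or does not start with the ELF magic. */
static int elf_osabi(const unsigned char *e_ident, size_t len)
{
    assert(e_ident != NULL);
    if (len < EI_NIDENT || memcmp(e_ident, "\177ELF", 4) != 0)
        return -1;
    return e_ident[EI_OSABI];
}
```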


ali_e...@emvision.com

Aug 8, 2022, 9:40:24 PM
to gener...@googlegroups.com
On 8/8/22 7:31 PM, 'Roland McGrath' via Generic System V Application Binary Interface wrote:
> This is exactly the situation needed for robust support of the many implementations that support (or even require) DT_GNU_HASH and various other GNU extensions, but do not support ELFOSABI_GNU files
> because they do not support STB_GNU_UNIQUE or STT_GNU_IFUNC.

My last message crossed paths with this one.

Thanks for this explanation. I can see that the OSABI
concept gets brittle in these cases. We could probably
continue this discussion, as I find it very interesting,
but I'll give you a break and drop it now.

Just to reaffirm, I don't see a problem with making
DT_HASH optional for GNU objects, and I don't think
the gABI needs to be changed to allow it, so please
carry on.

Thanks again.

- Ali

Mark Wielaard

Aug 9, 2022, 6:30:26 AM
to gener...@googlegroups.com, elfutil...@sourceware.org, Di Chen, Milian Wolff
Hi,

CCing the elfutils-devel list, where this issue (the missing
DT_SYMTAB_COUNT or DT_SYMTABSZ) does come up occasionally. Such an
entry is needed to enumerate all symbols if you only have the dynamic
segment, and not the section headers.
Do you know the reason for not having a DT_SYMTAB_COUNT or
DT_SYMTABSZ? It is not intuitive that one can find the number of
entries through the hash table.

I don't think DT_GNU_HASH can simply be made optional and replace
DT_HASH without adding something like DT_SYMTAB_COUNT or DT_SYMTABSZ
because the gnu hashtable doesn't come with a simple symbol count. To
find the symbol count you have to go through the whole gnu hash table
to count the number of entries (which is a non-trivial amount of
work).
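
As a rough illustration of the extra work, the sketch below derives the symbol count from a DT_GNU_HASH table by finding the highest bucket start and following that chain to its end marker. It assumes the commonly documented layout (nbuckets, symoffset, bloom_size, bloom_shift, then the bloom words, buckets and chain), trusts the table to be well formed, and uses a made-up function name. It is not taken from elfutils or any other tool.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* bloom_word_bytes is 4 for ELFCLASS32 and 8 for ELFCLASS64. */
static uint32_t dynsym_count_from_gnu_hash(const uint32_t *gnu_hash,
                                           size_t bloom_word_bytes)
{
    assert(gnu_hash != NULL);
    assert(bloom_word_bytes == 4 || bloom_word_bytes == 8);

    uint32_t nbuckets   = gnu_hash[0];
    uint32_t symoffset  = gnu_hash[1];  /* index of first hashed symbol */
    uint32_t bloom_size = gnu_hash[2];  /* gnu_hash[3] is bloom_shift */

    const uint32_t *buckets =
        gnu_hash + 4 + (size_t)bloom_size * (bloom_word_bytes / 4);
    const uint32_t *chain = buckets + nbuckets;

    /* The bucket with the highest start index holds the last symbols. */
    uint32_t last = 0;
    for (uint32_t i = 0; i < nbuckets; i++)
        if (buckets[i] > last)
            last = buckets[i];

    /* All buckets empty: only the unhashed symbols exist. */
    if (last < symoffset)
        return symoffset;

    /* Bit 0 of a chain word marks the final symbol of that chain. */
    while ((chain[last - symoffset] & 1u) == 0)
        last++;

    return last + 1;
}
```

Compared with reading nchain, this needs the ELF class, a pass over every bucket and a walk of one chain, and a malformed table can send it out of bounds unless the caller adds limits.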

Cheers,

Mark

Michael Matz

Aug 9, 2022, 8:59:07 AM
to gener...@googlegroups.com
Hello,
Not without a (then mandatory!) replacement to easily get at the number of
symbols in DT_SYMTAB. DT_HASH provides nchain, DT_GNU_HASH would need to
be completely interpreted to get the same info. (I do see the proposal of
another DT_ entry for this, which makes sense, but must come before making
DT_HASH optional)

Even then I think dynamic objects that don't have a DT_HASH will need to
be ELFOSABI_GNU.


Ciao,
Michael.

Carlos O'Donell

Aug 9, 2022, 12:14:59 PM
to gener...@googlegroups.com, Cary Coutant
On 8/8/22 20:29, Cary Coutant wrote:
>> We recently had a case where dropping DT_HASH from the generic
>> glibc binaries broke an ELF consumer.
>>
>> For the "real world" details see:
>> https://sourceware.org/pipermail/libc-alpha/2022-August/141302.html
>>
>> I was surprised to see that DT_HASH was "mandatory" in the existing
>> published standard.
>>
>> Why is it mandatory and not optional?
>>
>> Should we and could we make it optional?
>
> This is an odd situation.
>
> DT_HASH is mandatory because the "nchain" field in the hash table is
> the only way to know the number of symbols in DT_SYMTAB (unless you
> still have a section table and look for the .dynsym section, which is
> not something we want to require). But when the psABI has effectively
> replaced DT_HASH with something newer, it makes sense for that
> requirement to transfer to the newer table. I suppose the gABI could
> say something like "mandatory unless the psABI provides for a
> replacement in some form," but that doesn't feel right to me. In this
> case, I think it's simply up to the psABI to override the gABI and say
> that DT_HASH is optional if DT_GNU_HASH is present. Unfortunately,
> that means that generic tools that know nothing of ELFOSABI_GNU or
> ELFOSABI_LINUX would be unable to process DT_SYMTAB if they can't find
> DT_HASH.

I was hoping that this wasn't the case, but I see what you mean with nchain.

It really depends on what you want to process and how you want to process it.

> I'd probably have preferred to have a mandatory DT_SYMTAB_COUNT or
> DT_SYMTABSZ entry in the dynamic table, which would have enabled us to
> make the hash table (in any form) optional. But the DT_HASH
> requirement has been with us since the beginning of time.

Absolutely.

Like DT_MIPS_SYMTABNO, but standard.
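
For illustration, a consumer could pick up such a count by scanning the dynamic array for the tag. The sketch below assumes a simplified 64-bit entry layout (the real Elf64_Dyn wraps the value in a union) and uses 0x70000011 for DT_MIPS_SYMTABNO as found in common elf.h headers. The struct and function names are made up for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DT_NULL          0
#define DT_MIPS_SYMTABNO 0x70000011

/* Simplified stand-in for Elf64_Dyn. */
struct dyn64 {
    int64_t  d_tag;
    uint64_t d_val;
};

/* Scan a DT_NULL-terminated dynamic array for a tag.
 * On success store its value in *out and return true. */
static bool find_dyn_value(const struct dyn64 *dyn, int64_t tag,
                           uint64_t *out)
{
    assert(dyn != NULL && out != NULL);
    for (; dyn->d_tag != DT_NULL; dyn++) {
        if (dyn->d_tag == tag) {
            *out = dyn->d_val;
            return true;
        }
    }
    return false;
}
```

A generic DT_SYMTAB_COUNT or DT_SYMTABSZ, if it existed, could be read the same way with no hash table involved.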

--
Cheers,
Carlos.

Fangrui Song

Aug 9, 2022, 2:07:30 PM
to gener...@googlegroups.com
Hi Michael,

On 2022-08-09, Michael Matz wrote:
>Hello,
>
>On Mon, 8 Aug 2022, Carlos O'Donell wrote:
>
>> We recently had a case where dropping DT_HASH from the generic
>> glibc binaries broke an ELF consumer.
>>
>> For the "real world" details see:
>> https://sourceware.org/pipermail/libc-alpha/2022-August/141302.html
>>
>> I was surprised to see that DT_HASH was "mandatory" in the existing
>> published standard.
>>
>> Why is it mandatory and not optional?
>>
>> Should we and could we make it optional?
>
>Not without a (then mandatory!) replacement to easily get at the number of
>symbols in DT_SYMTAB. DT_HASH provides nchain, DT_GNU_HASH would need to
>be completely interpreted to get the same info. (I do see the proposal of
>another DT_ entry for this, which makes sense, but must come before making
>DT_HASH optional)

Why is the nchain information mandatory before making DT_HASH optional?
The Dynamic Linking chapter is somewhat odd in the specification, though
I do find it convenient to state something about the common ground of
ELF based operating systems.

I have tried rereading ch5.dynamic.html, but do not find anything
requiring the number of symbol table entries.

>Even then I think dynamic objects that don't have a DT_HASH will need to
>be ELFOSABI_GNU.

This will make it inconvenient for linkers. The GNU ld -m concept is
emulated in all other linkers I know. Ideally we would use a value more
specific to the OS, but then we would have to invent a number of
elf_amd64_fbsd like emulations which just prefer a particular OSABI
value.

>
>Ciao,
>Michael.

Roland McGrath

Aug 9, 2022, 2:46:20 PM
to gener...@googlegroups.com
As I've already said, changing the meaning of ELFOSABI_GNU is unacceptable. It exclusively indicates use of new dynamic symbol table features, not other GNU extensions.
If DT_HASH is omitted, it's omitted.  You don't need a flag to tell you what it not being there means.

ali_e...@emvision.com

Aug 9, 2022, 3:42:09 PM
to gener...@googlegroups.com
If I understand you, you're claiming that those
old features are part of the generic ABI, and don't
need an OSABI to be set. You only want to set ELFOSABI_GNU
when the newer features are used. I understand your desire
to draw a distinction between pre-OSABI, and post-OSABI
work.

At the same time, while those older GNU features predate the
OSABI partitioning, they're not really part of the gABI. These
are really just GNU specific features of a certain age. The
idea that they're generic is news to those of us with equally
old, or older, platforms that never had those things. In fact,
the growth of such things was the big motivation for the original
gABI meetings that led to the introduction of OSABI partitioning.

While objects using these features might not be ELFOSABI_GNU, it's
problematic to label them as ELFOSABI_NONE. Whatever the history,
can we clean things up now, by creating an OSABI for those other GNU-lite
platforms, and start setting it? Obviously we can't do anything about
existing objects, but perhaps we might do this now, and make using
one of the GNU OSABIs a prerequisite for this move to drop DT_HASH.
Until they adopt their new OSABI, they can just keep producing DT_HASH.

ELFOSABI_GNUBASE? GNUCORE? PROTOGNU?

- Ali



On 8/9/22 12:46 PM, 'Roland McGrath' via Generic System V Application Binary Interface wrote:
> As I've already said, changing the meaning of ELFOSABI_GNU is unacceptable. It exclusively indicates use of new dynamic symbol table features, not other GNU extensions.
> If DT_HASH is omitted, it's omitted.  You don't need a flag to tell you what it not being there means.
>
> On Tue, Aug 9, 2022 at 5:59 AM Michael Matz <ma...@suse.de> wrote:
>
> Hello,
>
> On Mon, 8 Aug 2022, Carlos O'Donell wrote:
>
> > We recently had a case where dropping DT_HASH from the generic
> > glibc binaries broke an ELF consumer.
> >
> > For the "real world" details see:
> > https://sourceware.org/pipermail/libc-alpha/2022-August/141302.html
> >
> > I was surprised to see that DT_HASH was "mandatory" in the existing
> > published standard.
> >
> > Why is it mandatory and not optional?
> >
> > Should we and could we make it optional?
>
> Not without a (then mandatory!) replacement to easily get at the number of
> symbols in DT_SYMTAB.  DT_HASH provides nchain, DT_GNU_HASH would need to
> be completely interpreted to get the same info.  (I do see the proposal of
> another DT_ entry for this, which makes sense, but must come before making
> DT_HASH optional)
>
> Even then I think dynamic objects that don't have a DT_HASH will need to
> be ELFOSABI_GNU.
>
>
> Ciao,
> Michael.

Michael Matz

Aug 10, 2022, 8:52:48 AM
to 'Roland McGrath' via Generic System V Application Binary Interface
Hello,

On Tue, 9 Aug 2022, 'Roland McGrath' via Generic System V Application Binary Interface wrote:

> As I've already said, changing the meaning of ELFOSABI_GNU is unacceptable.

Let's go with strong wording then: it's equally unacceptable for a file
claiming to be ELFOSABI_SYSV to not have a DT_HASH.


Ciao,
Michael.

Michael Matz

Aug 10, 2022, 9:11:34 AM
to 'Fangrui Song' via Generic System V Application Binary Interface
Hello,

On Tue, 9 Aug 2022, 'Fangrui Song' via Generic System V Application Binary Interface wrote:

> > Not without a (then mandatory!) replacement to easily get at the number of
> > symbols in DT_SYMTAB. DT_HASH provides nchain, DT_GNU_HASH would need to
> > be completely interpreted to get the same info. (I do see the proposal of
> > another DT_ entry for this, which makes sense, but must come before making
> > DT_HASH optional)
>
> Why is the nchain information mandatory before making DT_HASH optional?

Well, that boils down to making a decision of which features are
absolutely essential or not. Without nchain you have no guaranteed way,
at all, to _know_ the number of dynamic symbols. Section headers (which
would give you that) are optional, and the symbol table is not
self-delimiting. So, without the number of symbols you can only guess.
You know the start and then ... well, you try :)

For one, you then can't, for instance, check relocation symbol indices
against out-of-bounds values.
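
A minimal C sketch of that kind of check, assuming ELF64 relocations (where the symbol index is the upper 32 bits of r_info) and a symbol count obtained elsewhere, for example from nchain. The function names are invented for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Same computation as the ELF64_R_SYM macro. */
static uint32_t reloc_sym_index64(uint64_t r_info)
{
    return (uint32_t)(r_info >> 32);
}

/* Return true only if every relocation refers to a symbol index
 * below nsyms. Without a trustworthy nsyms this check is impossible. */
static bool reloc_symbols_in_bounds(const uint64_t *r_info, size_t nrel,
                                    uint32_t nsyms)
{
    assert(r_info != NULL || nrel == 0);
    for (size_t i = 0; i < nrel; i++)
        if (reloc_sym_index64(r_info[i]) >= nsyms)
            return false;
    return true;
}
```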

Try writing a symbol table dumper for dynamic objects that doesn't rely on
sections (e.g. because it works on in-memory representation of loaded
processes) and doesn't emit garbage at around the end of the symbol table
:)

So, for just loading and running well-formed ELF dynamic files, nchain is
not required. For inspecting and protection against non-well-formedness
it would be. For post-link manipulation of the symbol table it's also
required. We could of course say that, well, tough luck, let's make the
number of symbols not be mandatory, and files missing it then can't be
inspected, manipulated and checked, but I think a standard should itself
be measured against higher standards (ahem!) than "for me it works".

> > Even then I think dynamic objects that don't have a DT_HASH will need to
> > be ELFOSABI_GNU.
>
> This will make it inconvenient for linkers. The GNU ld -m concept is
> emulated in all other linkers I know. Ideally we would use a value more
> specific to the OS, but then we would have to invent a number of
> elf_amd64_fbsd like emulations which just prefer a particular OSABI
> value.

Hmm, I'm not sure how ld -m enters the picture. But what I'm not
comfortable with is an ELFOSABI_SYSV dynamic file that doesn't contain a
DT_HASH.


Ciao,
Michael.

Florian Weimer

Aug 16, 2022, 4:56:12 AM
to Carlos O'Donell, gener...@googlegroups.com
* Carlos O'Donell:
I think this question is somewhat unrelated to the original issue. I
think the real matter is this: If you have an integrated ELF run-time
environment with its own dynamic loader and a set of standard (C/C++)
symbols, must all these symbols be bound by the mechanism defined in the
ELF specification? I think the answer to that is clearly "no", and
applications cannot expect that if they look up symbols on their own by
parsing the ELF directly, they will be able to find all the symbols, or
get the same binding results as the dynamic loader would provide.

That is, even if we added DT_HASH to the implementation shared objects,
it would not necessarily be useful because direct access to the ELF is
not forward-compatible, or just compatible with existing vendor
extensions not present in the original ELF specification.

Thanks,
Florian

Carlos O'Donell

Aug 17, 2022, 9:19:21 AM
to Florian Weimer, gener...@googlegroups.com
On 8/16/22 04:56, Florian Weimer wrote:
> * Carlos O'Donell:
>
>> We recently had a case where dropping DT_HASH from the generic
>> glibc binaries broke an ELF consumer.
>>
>> For the "real world" details see:
>> https://sourceware.org/pipermail/libc-alpha/2022-August/141302.html
>>
>> I was surprised to see that DT_HASH was "mandatory" in the existing
>> published standard.
>>
>> Why is it mandatory and not optional?
>>
>> Should we and could we make it optional?
>
> I think this question is somewhat unrelated to the original issue. I
> think the real matter is this: If you have an integrated ELF run-time
> environment with its own dynamic loader and a set of standard (C/C++)
> symbols, must all these symbols be bound by the mechanism defined in the
> ELF specification? I think the answer to that is clearly "no", and
> applications cannot expect that if they look up symbols on their own by
> parsing the ELF directly, they will be able to find all the symbols, or
> get the same binding results as the dynamic loader would provide.

I agree.

The purpose of the standard is only to ease the burden on developers who
are working on these binary interfaces.

Extending the interfaces into an operating system requires many more
specific details.

> That is, even if we added DT_HASH to the implementation shared objects,
> it would not necessarily be useful because direct access to the ELF is
> not forward-compatible, or just compatible with existing vendor
> extensions not present in the original ELF specification.

I agree.

I see where you're going with this.

That the vendor in question needs DT_HASH *and* a specific set of runtime behaviours
that follow on from that dynamic tag's presence, and in the future those behaviours
may be different?

--
Cheers,
Carlos.

Florian Weimer

Aug 17, 2022, 9:27:41 AM
to Carlos O'Donell, gener...@googlegroups.com
* Carlos O'Donell:

> That the vendor in question needs DT_HASH *and* a specific set of
> runtime behaviours that follow on from that dynamic tag's presence, and
> in the future those behaviours may be different?

I would go even further: The implementation is not required to provide
its own symbols through ELF data structures. It can use something else
entirely for symbol lookups, for example to speed up symbol binding.

Thanks,
Florian

Carlos O'Donell

Aug 17, 2022, 9:39:25 AM
to Florian Weimer, gener...@googlegroups.com
I can agree with that.

Yet the spirit of the ELF standard is to extend as much uniformity as possible into the
software stack, so as to make it easier to write tooling for that stack.

We may yet get downstream requests to document and standardize an interface to these
"speedy" symbols, and then we need to ask: Should we standardize it in the
gABI or the psABI or?

So whatever we come up with may in the future end up in the standard, and so in a
kind of fait accompli we meet the standard, since this is after all a trailing
standard.

To quote the standard:
~~~~
The ELF standard is intended to streamline software development by providing developers
with a set of binary interface definitions that extend across multiple operating environments.
This should reduce the number of different interface implementations, thereby reducing the
need for recoding and recompiling code.
~~~~

I want to point out that "across multiple operating environments" also applies to the same named
operating environment but across decades of development and evolution.

You are right though, in that certain vendors may decide *never* to standardize certain
binding mechanisms. Personally I would hope we could standardize them and in a way that
their presence can be detected, and their deprecation also.

--
Cheers,
Carlos.

Michael Matz

Aug 17, 2022, 9:45:56 AM
to gener...@googlegroups.com, Carlos O'Donell
Hello,

On Wed, 17 Aug 2022, Florian Weimer wrote:

> > That the vendor in question needs DT_HASH *and* a specific set of
> > runtime behaviours that follow on from that dynamic tag's presence, and
> > in the future those behaviours may be different?
>
> I would go even further: The implementation is not required to provide
> its own symbols through ELF data structures.

Define "the implementation". I would hope we could agree that if you have
symbols at all, that there's some value in providing them via standard ELF
mechanisms (on an otherwise ELF system, but that's the context of this
mailing list).

> It can use something else entirely for symbol lookups, for example to
> speed up symbol binding.

So, if something is to be sped up, one should first investigate if that's
achievable with ELF means. Not everything that can be done should be done
:)


Ciao,
Michael.

Florian Weimer

Aug 19, 2022, 4:36:12 PM
to Michael Matz, gener...@googlegroups.com, Carlos O'Donell
* Michael Matz:

> Hello,
>
> On Wed, 17 Aug 2022, Florian Weimer wrote:
>
>> > That the vendor in question needs DT_HASH *and* a specific set of
>> > runtime behaviours that follow on from that dynamic tag's presence, and
>> > in the future those behaviours may be different?
>>
>> I would go even further: The implementation is not required to provide
>> its own symbols through ELF data structures.
>
> Define "the implementation".

Components that provide the core run time functionality and need to be
upgraded in a coordinated fashion.

> I would hope we could agree that if you have symbols at all, that
> there's some value in providing them via standard ELF mechanisms (on
> an otherwise ELF system, but that's the context of this mailing list).

Up to a point, but e.g. IFUNCs are already non-standard.

>> It can use something else entirely for symbol lookups, for example to
>> speed up symbol binding.
>
> So, if something is to be sped up, one should first investigate if that's
> achievable with ELF means. Not everything that can be done should be done
> :)

To give an extreme example, one might expect that

printf ("Hello, world!\n");

actually produces a printf symbol reference, relocation, and a PLT
call. But the symbol is likely puts these days, and the PLT call might
be gone as well.

We actually had a brief discussion about doing something like that in
the dynamic loader for the rseq area (e.g., use a TLS symbol, but
provide the symbol at run time via special lookup), but we decided
against it. We wanted to enable increasing the symbol size after a
program had been linked, and that seemed too problematic.

Thanks,
Florian

Michael Matz

Aug 22, 2022, 8:51:30 AM
to Florian Weimer, gener...@googlegroups.com, Carlos O'Donell
Hello,

On Fri, 19 Aug 2022, Florian Weimer wrote:

> >> I would go even further: The implementation is not required to provide
> >> its own symbols through ELF data structures.
> >
> > Define "the implementation".
>
> Components that provide the core run time functionality and need to be
> upgraded in a coordinated fashion.

That just replaces one term with another: "core run time". What's that and
where does it stop? Say, is libstdc++ core? libgcc_s? I'm guessing
you're thinking of libc and ld-linux, but why should they not provide their
symbols via ELF mechanisms? (Note: this doesn't preclude those components
from using different means to do whatever they need; they merely should also
provide the standard ELF means, as long as they claim to implement an ELF
system.)

> > I would hope we could agree that if you have symbols at all, that
> > there's some value in providing them via standard ELF mechanisms (on
> > an otherwise ELF system, but that's the context of this mailing list).
>
> Up to a point, but e.g. IFUNCs are already non-standard.

IFUNCs are an addition, not a replacement, and hence orthogonal to basic
ELF mechanisms. That's not the same as arguing for removal of symbol
lookup capabilities via ELF means. And they are only non-standard until
standardized (e.g. in several psABI GNU extensions).

> >> It can use something else entirely for symbol lookups, for example to
> >> speed up symbol binding.
> >
> > So, if something is to be sped up, one should first investigate if
> > that's achievable with ELF means. Not everything that can be done
> > should be done :)
>
> To give an extreme example, one might expect that
>
> printf ("Hello, world!\n");
>
> actually produces a printf symbol reference, relocation, and a PLT
> call. But the symbol is likely puts these days, and the PLT call might
> be gone as well.

That's not an extreme example, because it falls into the implementation
defined category. No one can expect anything of the above from the ELF
standard, first and foremost because the ELF standard isn't directly
concerned with C and its semantics (which gives you the printf->puts), and
because using the PLT is optional. But what I do expect from ELF is that
when seeing an object file containing:

    11: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND printf

and a shared object with:

  2922: 000000000007347e   206 FUNC    GLOBAL DEFAULT   16 printf@@GLIBC_2.2.5

and link-editing that object file with that shared object (and having
verified that no other object file and shared object provides 'printf')
that the result does indeed still contain

     3: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND printf@GLIBC_2.2.5 (3)

and that that symbol is resolved at runtime as ELF specifies (when I
observe that process; when I don't observe that process I don't care and
the runtime could use any other viable, and possibly faster, means).
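
For reference, the lookup through DT_HASH that the gABI describes can be
sketched roughly as below. This is only an illustration, not code taken
from glibc or any other loader: the names sysv_hash and sysv_lookup are
made up here, the table is assumed to consist of 32-bit words (a few
64-bit targets use wider entries), Elf64_Sym from <elf.h> is assumed, and
symbol versioning is ignored entirely.

```c
#include <assert.h>
#include <elf.h>
#include <stdint.h>
#include <string.h>

/* The hash function given in the gABI for DT_HASH, on 32-bit values. */
uint32_t sysv_hash(const char *name)
{
    uint32_t h = 0;

    assert(name != NULL);
    for (const unsigned char *p = (const unsigned char *)name; *p != '\0'; p++) {
        h = (h << 4) + *p;
        uint32_t g = h & 0xf0000000u;
        if (g != 0)
            h ^= g >> 24;
        h &= ~g;
    }
    return h;
}

/*
 * Hypothetical helper: walk one bucket chain of a DT_HASH table.
 * Layout assumed: hashtab[0] = nbucket, hashtab[1] = nchain, followed by
 * nbucket bucket words and nchain chain words.  symtab and strtab are
 * the tables that DT_SYMTAB and DT_STRTAB point to.
 */
const Elf64_Sym *sysv_lookup(const uint32_t *hashtab, const Elf64_Sym *symtab,
                             const char *strtab, const char *name)
{
    uint32_t nbucket = hashtab[0];
    uint32_t nchain = hashtab[1];
    const uint32_t *bucket = &hashtab[2];
    const uint32_t *chain = &bucket[nbucket];

    if (nbucket == 0)
        return NULL;

    /* STN_UNDEF (index 0) terminates a chain; also guard against bad indices. */
    for (uint32_t i = bucket[sysv_hash(name) % nbucket];
         i != STN_UNDEF && i < nchain;
         i = chain[i]) {
        if (strcmp(strtab + symtab[i].st_name, name) == 0)
            return &symtab[i];
    }
    return NULL;
}
```

A consumer that only knows this scheme has nothing to walk when an object
carries DT_GNU_HASH alone, which is the kind of breakage that started
this thread.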

> We actually had a brief discussion about doing something like that in
> the dynamic loader for the rseq area (e.g., use a TLS symbol, but
> provide the symbol at run time via special lookup), but we decided
> against it. We wanted to enable increasing the symbol size after a
> program had been linked, and that seemed too problematic.

I don't know restartable sequences very well, so I can't say much here.
The usual solution to unknown sizes is either indirection or providing the
address/offset plus size of whatever memory snippet is needed. Either way,
it seems the rseq setup is in the same camp as the offset of
_r_debug.r_map, i.e. nothing that would concern ELF but something purely
internal to the implementation.
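
To make the "offset plus size" variant concrete, here is a minimal
sketch. Everything in it is hypothetical (struct area_desc,
area_has_field); it is not how glibc or the kernel describe the rseq
area, just the general pattern: the runtime publishes where the area is
and how many bytes of it are valid, and a client checks that before
touching a field that may have been added later.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical descriptor published by the runtime. */
struct area_desc {
    ptrdiff_t offset;   /* where the area lives, relative to an agreed base */
    size_t size;        /* how many bytes of the area this runtime provides */
};

/*
 * Returns true if the field [field_offset, field_offset + field_size)
 * lies entirely inside the area the runtime actually provides.
 * Written so that the arithmetic cannot overflow.
 */
bool area_has_field(const struct area_desc *d, size_t field_offset,
                    size_t field_size)
{
    assert(d != NULL);
    return field_offset <= d->size && field_size <= d->size - field_offset;
}
```

With this, the area can grow after a program has been linked: an old
client simply never asks for the new fields, and a new client running on
an old runtime sees area_has_field return false and falls back.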

May I instead ask where this wish to remove DT_HASH comes from (or to
remove the dynamic symbol table from libc, as you seem to suggest)?


Ciao,
Michael.

Florian Weimer

Aug 23, 2022, 3:36:53 AM
to Michael Matz, gener...@googlegroups.com, Carlos O'Donell
* Michael Matz:

> Hello,
>
> On Fri, 19 Aug 2022, Florian Weimer wrote:
>
>> >> I would go even further: The implementation is not required to provide
>> >> its own symbols through ELF data structures.
>> >
>> > Define "the implementation".
>>
>> Components that provide the core run time functionality and need to be
>> upgraded in a coordinated fashion.
>
> That just replaces one term with another: "core run time". What's that and
> where does it stop? Say, is libstdc++ core? libgcc_s? I'm guessing
> you're thinking of libc and ld-linux, but why should they not provide their
> symbols via ELF mechanisms? (Note: this doesn't preclude those components
> from using different means to do whatever they need; they merely should also
> provide the standard ELF means, as long as they claim to implement an ELF
> system.)

As far as glibc is concerned:

libstdc++ is part of the implementation because it uses undocumented
glibc functions, and (historically) calls glibc functions without proper
symbol versioning. libgcc_s is part of the implementation because glibc
hard-codes the name (and ABI) of the C personality routine used by the
compiler. That's just two examples; these components are intertwined in
other ways, too.

>> > I would hope we could agree that if you have symbols at all, that
>> > there's some value in providing them via standard ELF mechanisms (on
>> > an otherwise ELF system, but that's the context of this mailing list).
>>
>> Up to a point, but e.g. IFUNCs are already non-standard.
>
> IFUNCs are an addition, not a replacement, and hence orthogonal to basic
> ELF mechanisms. That's not the same as arguing for removal of symbol
> lookup capabilities via ELF means. And they are only non-standard until
> standardized (e.g. in several psABI GNU extensions).

I must say I disagree with that, at a technical level. The issue with
IFUNCs in this context is that it's still possible to look up the symbol
and get some address, but if the lookup code does not know anything
about IFUNCs, it will use the address directly, which does not work.

Furthermore, allowing direct lookup (bypassing the dynamic linker)
breaks the e_ident[EI_ABIVERSION] handshake. In the past, we assumed
that we could increment the number if the dynamic linker has been
updated. But if direct lookups are permitted, this client code would
have to check e_ident[EI_ABIVERSION] (which is somewhat difficult to get
hold of in glibc) and conservatively fail if it is higher than the
supported version value (to follow the version protocol). But this
means that bumping the version is not backwards compatible.

>> To give an extreme example, one might expect that
>>
>> printf ("Hello, world!\n");
>>
>> actually produces a printf symbol reference, relocation, and a PLT
>> call. But the symbol is likely puts these days, and the PLT call might
>> be gone as well.
>
> That's not an extreme example, because it falls into the implementation
> defined category.

Why isn't binding of symbols used by the implementation
implementation-defined in the same way, at least implicitly?

Thanks,
Florian

Michael Matz

Aug 23, 2022, 8:32:08 AM
to Florian Weimer, gener...@googlegroups.com, Carlos O'Donell
Hello,

On Tue, 23 Aug 2022, Florian Weimer wrote:

> >> >> I would go even further: The implementation is not required to provide
> >> >> its own symbols through ELF data structures.
> >> >
> >> > Define "the implementation".
> >>
> >> Components that provide the core run time functionality and need to be
> >> upgraded in a coordinated fashion.
> >
> > That just replaces one term with another: "core run time". What's that and
> > where does it stop? Say, is libstdc++ core? libgcc_s? I'm guessing
> > you're thinking of libc and ld-linux, but why should they not provide their
> > symbols via ELF mechanisms? (Note: this doesn't preclude those components
> > from using different means to do whatever they need; they merely should also
> > provide the standard ELF means, as long as they claim to implement an ELF
> > system.)
>
> As far as glibc is concerned:
>
> libstdc++ is part of the implementation because it uses undocumented
> glibc functions, and (historically) calls glibc functions without proper
> symbol versioning. libgcc_s is part of the implementation because glibc
> hard-codes the name (and ABI) of the C personality routine used by the
> compiler. That's just two examples; these components are intertwined in
> other ways, too.

Yes, I'm aware. With this mailing list's hat on I would declare all of
this to be implementation artifacts/issues/suboptimalities/bugs to work
around something missing (e.g. proper reliable interfaces). I'm fully
aware that these happen (often one only finds out after the fact what
something should look like, when it's already too late to rectify because
of backward compatibility problems; and sometimes the means that really
would fit the use case don't exist yet), but they should not be used as
an argument for why something basic and standardized is to be avoided.

If you really want an interface that's totally opaque and can't be looked
up, don't use separate symbols at all. Use something similar to _rtld, a
blob of memory that adheres to some internal undocumented layout.

> >> Up to a point, but e.g. IFUNCs are already non-standard.
> >
> > IFUNCs are an addition, not a replacement, and hence orthogonal to basic
> > ELF mechanisms. That's not the same as arguing for removal of symbol
> > lookup capabilities via ELF means. And they are only non-standard until
> > standardized (e.g. in several psABI GNU extensions).
>
> I must say I disagree with that, at a technical level. The issue with
> IFUNCs in this context is that it's still possible to look up the symbol
> and get some address, but if the lookup code does not know anything
> about IFUNCs, it will use the address directly, which does not work.

Of course you will have to care about the type of the symbol when looking
it up manually, which is something you trivially know then, unlike e.g.
when using dlsym, because it's right there in ST_TYPE(Elf_Sym.st_info). If
you deal optimistically with symbol types (i.e. just use the address as is
for unknown types, or don't check OSABI for some types) you get what you
asked for. No one is saying that relying on ELF guarantees is trivial; if
it were, dlsym wouldn't be necessary. But at least you can then rely on
something.
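
As a rough sketch of what "caring about the type" could look like in
hand-written lookup code (the enum and the function name are invented for
this example, Elf64_Sym and the ELF64_ST_TYPE macro from <elf.h> are
assumed, and which types a given consumer accepts is its own decision):

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>

#ifndef STT_GNU_IFUNC
#define STT_GNU_IFUNC 10    /* GNU extension, in the OS-specific type range */
#endif

/* Hypothetical classification of what a consumer may do with st_value. */
enum sym_use {
    SYM_USE_DIRECT,         /* st_value (plus load bias) is the address */
    SYM_USE_CALL_RESOLVER,  /* st_value is a resolver that returns the address */
    SYM_USE_REJECT          /* undefined or unknown type: fail conservatively */
};

enum sym_use classify_symbol(const Elf64_Sym *sym)
{
    assert(sym != NULL);

    if (sym->st_shndx == SHN_UNDEF)
        return SYM_USE_REJECT;      /* not defined in this object */

    switch (ELF64_ST_TYPE(sym->st_info)) {
    case STT_NOTYPE:
    case STT_OBJECT:
    case STT_FUNC:
        return SYM_USE_DIRECT;
    case STT_GNU_IFUNC:
        return SYM_USE_CALL_RESOLVER;
    default:
        /* STT_TLS, OS- or processor-specific types we do not understand. */
        return SYM_USE_REJECT;
    }
}
```

The default branch is the important one: code that treats every unknown
type as a plain address is the optimistic handling described above.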

> Furthermore, allowing direct lookup (bypassing the dynamic linker)
> breaks the e_ident[EI_ABIVERSION] handshake. In the past, we assumed
> that we could increment the number if the dynamic linker has been
> updated. But if direct lookups are permitted, this client code would
> have to check e_ident[EI_ABIVERSION] (which is somewhat difficult to get
> hold of in glibc)

That has nothing directly to do with glibc. Either the ELF header is part
of the mapped segments, or it's not (generally it's a good idea to map it).
Of course, getting hold of all mapped libs is somewhat
difficult if you can't rely on _r_debug, and that's indeed libc land. But
depending on the OS you can also use different means to get at the mappings.

And then, if the value in e_ident[EI_ABIVERSION] is not supported by your
lookup code, then indeed you should fail; see above about no one saying
life is easy.
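
A conservative check of that kind might look like the sketch below,
assuming the lookup code already has a pointer to the mapped ELF header's
e_ident bytes. The function name and the max_supported parameter are
invented for the example; how a client obtains the pointer is exactly the
OS- or libc-specific part discussed above.

```c
#include <assert.h>
#include <elf.h>
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical check: refuse objects whose ABI version is newer than
 * what this lookup code was written for.  e_ident points at the first
 * EI_NIDENT bytes of a mapped ELF header.
 */
bool abi_version_ok(const unsigned char *e_ident, unsigned char max_supported)
{
    assert(e_ident != NULL);

    /* Not an ELF header at all: nothing we can say about it. */
    if (memcmp(e_ident, ELFMAG, SELFMAG) != 0)
        return false;

    /* Follow the version protocol: anything newer than we know is rejected. */
    return e_ident[EI_ABIVERSION] <= max_supported;
}
```

This also shows the cost raised in the quoted paragraph: once such
clients exist, every bump of EI_ABIVERSION makes them fail until they are
updated.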

> >> To give an extreme example, one might expect that
> >>
> >> printf ("Hello, world!\n");
> >>
> >> actually produces a printf symbol reference, relocation, and a PLT
> >> call. But the symbol is likely puts these days, and the PLT call might
> >> be gone as well.
> >
> > That's not an extreme example, because it falls into the implementation
> > defined category.
>
> Why isn't binding of symbols used by the implementation
> implementation-defined in the same way, at least implicitly?

First: ELF of course does define the implementation of symbol binding. So
we have the implementation-definedness readily there. What you are
arguing for is to replace this simple rule for getting at the
implementation's meaning ("look into ELF") with something more
complicated: look into ELF, except for some unspecified set of things,
where you should go look into some random source code snippet.

Second: do you consider 'printf' to be an implementation symbol because
it's also used by the implementation? To me it's quite clearly a symbol
that's supposed to be user visible and hence part of the public API and
ABI, and hence should adhere to whatever the gABI and psABI says.

A symbol used by the implementation should be one _only_ used by the
implementation. But again, where does that stop? Some interfaces are (or
were) for internal communication between GNU libc, libpthread and libdl.
You could say, "implementation symbols, can use non-ELF means". Then you
can just as well put their addresses in an undocumented shared blob,
instead of having symbols. But sooner or later you will find use cases
that really could make use of them, let's say debuggers, at which point
you go "meh". I guess I'm saying that it's short-sighted not to use
standardized means to do whatever processing is required when such a
standard exists and is already in use anyway (here 'processing' == 'symbol
lookup'). Possibly the standard needs to be extended to match a new
use case, but that's still better than inventing something completely new.


Ciao,
Michael.