Proposal: New DT_REQUIRED tag to identify required features


Cary Coutant

Nov 1, 2021, 3:45:32 PM
to Generic System V Application Binary Interface, Fāng-ruì Sòng, Ali Bahrami, H.J. Lu
Given HJ's concern about binaries using DT_RELR-style relocations and
the possibility of apps mysteriously crashing because old dynamic loaders
ignore the new tags without complaint, I am proposing a new dynamic
tag, DT_REQUIRED. This tag will work like DT_FLAGS, but each flag bit
will represent a feature that must be supported by the dynamic loader.
Hence, if the loader sees a DT_REQUIRED flag that it does not
understand, it can issue an appropriate error message.

I understand that this won't immediately address HJ's concern with
respect to DT_RELR, but it will at least lay the groundwork for future
extensions to the ELF spec in this area, and it will also allow
dynamic loaders who do not plan on implementing support for DT_RELR
right away to at least add support for DT_REQUIRED so that they may
refuse such binaries until the support is added.

The new tag will be defined as follows:

Name         Value  d_un   Executable  Shared Object
-----------  -----  -----  ----------  -------------
DT_REQUIRED  39     d_val  optional    optional

DT_REQUIRED
This element holds flag values specific to the object being
loaded. Each flag value will have the name DR_<required_flag>.
Defined values and their meanings are described below.
All other values are reserved. Each flag bit that is set
indicates a specific feature that the object requires
from the dynamic linker; if the dynamic linker does not
recognize the flag bit, or if it represents a feature
that is unsupported, the dynamic linker should refuse to
load the object.

Table: DT_REQUIRED values

Name              Value
----------------  ----------
DR_RELR           0x1
DR_REQUIRED_NEXT  0x80000000

DR_RELR
This flag signifies that the object requires support for
RELR-style dynamic relocations, and the DT_RELR and
associated dynamic tags.

DR_REQUIRED_NEXT
This flag is reserved to indicate that a second group
of required flags is present. The tag holding the second
group is currently undefined, but will be specified in
a future revision if more than 31 required feature bits
are needed.
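
To illustrate the check described above, here is a minimal sketch in C of how a loader might scan the dynamic array for DT_REQUIRED and collect the bits it cannot honor. The `Dyn` struct is a simplified stand-in for Elf64_Dyn, and `SUPPORTED_REQUIRED` and `unsupported_required_bits` are hypothetical names that are not part of the proposal; only the tag and flag values come from the tables above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for Elf64_Dyn (assumption; a real loader uses its own ELF headers). */
typedef struct {
    int64_t  d_tag;
    uint64_t d_val;
} Dyn;

#define DT_NULL           0
#define DT_REQUIRED       39           /* value proposed in this message */
#define DR_RELR           0x1u
#define DR_REQUIRED_NEXT  0x80000000u

/* Hypothetical: the required-feature bits this particular loader implements. */
#define SUPPORTED_REQUIRED  ((uint64_t)DR_RELR)

/* Returns the DT_REQUIRED bits the loader cannot honor; 0 means the object may be loaded. */
uint64_t unsupported_required_bits(const Dyn *dyn)
{
    uint64_t missing = 0;

    assert(dyn != NULL);
    for (; dyn->d_tag != DT_NULL; dyn++) {
        if (dyn->d_tag == DT_REQUIRED)
            missing |= dyn->d_val & ~SUPPORTED_REQUIRED;
    }
    return missing;
}
```

In this sketch DR_REQUIRED_NEXT is reported as unsupported as well, since the tag holding the second group is not yet defined and the loader could not check it.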

-cary

H.J. Lu

Nov 1, 2021, 3:49:17 PM
to Cary Coutant, Generic System V Application Binary Interface, Fāng-ruì Sòng, Ali Bahrami
On Mon, Nov 1, 2021 at 12:45 PM Cary Coutant <ccou...@gmail.com> wrote:
>
> Given HJ's concern about binaries using DT_RELR-style relocations and
> the possibility of apps mysteriously crashing because old dynamic loaders
> ignore the new tags without complaint, I am proposing a new dynamic
> tag, DT_REQUIRED. This tag will work like DT_FLAGS, but each flag bit
> will represent a feature that must be supported by the dynamic loader.
> Hence, if the loader sees a DT_REQUIRED flag that it does not
> understand, it can issue an appropriate error message.
>
> I understand that this won't immediately address HJ's concern with
> respect to DT_RELR, but it will at least lay the groundwork for future
> extensions to the ELF spec in this area, and it will also allow
> dynamic loaders who do not plan on implementing support for DT_RELR
> right away to at least add support for DT_REQUIRED so that they may
> refuse such binaries until the support is added.

While this is good for the future, the existing ld.so binaries will ignore this
and crash.
--
H.J.

Fāng-ruì Sòng

Nov 1, 2021, 5:30:52 PM
to Cary Coutant, Generic System V Application Binary Interface, Ali Bahrami, H.J. Lu
Thanks for starting the topic!

The proposal looks good to me.
It differs from https://groups.google.com/g/generic-abi/c/vdG_G4l3N-Y
in using bits instead of values.
If future critical features are assigned bits at the same time,
the flag assignment should not be too burdensome.

Ali mentioned Solaris doesn't support such a development model.
Perhaps Ali has more to say on this proposal.

I have informed

* libc-coord https://www.openwall.com/lists/libc-coord/2021/11/01/1
* FreeBSD https://reviews.freebsd.org/D32524 emaste: "DT_REQUIRED seems fine and I wouldn't object to supporting it"

---

I can imagine that ELFOSABI_GNU may add quite a few extensions which
will never be incorporated into the generic ABI. GNU probably can add
a similar DT_ dynamic tag in the DT_LOOS~DT_HIOS range.

---

For HJ's glibc concern, letting GNU ld synthesize _dl_have_relr is the
most practical solution.
But as I mentioned in
https://maskray.me/blog/2021-10-31-relative-relocations-and-relr#time-travel-compatibility ,
I will at most add a dumb alias but not synthesize a symbol for LLD.

Fāng-ruì Sòng

Nov 1, 2021, 7:12:42 PM
to Cary Coutant, Generic System V Application Binary Interface, Ali Bahrami, H.J. Lu
Update.

jrtc27 has a comment that "DT_REQUIRED" can be confused with "DT_NEEDED"
as required/needed are synonymous as English words.

joerg from NetBSD said that
"I'm building a binary on a newer system and am surprised that it crashed on an old system" is not really something they have to address.

So I have collected evidence that (Android, Chrome OS, FreeBSD, Fuchsia,
NetBSD, Solaris) are happy with poor diagnostics when new objects don't
work on old systems.

Alan Modra

Nov 1, 2021, 9:16:42 PM
to gener...@googlegroups.com, Fāng-ruì Sòng, Ali Bahrami, H.J. Lu
We already have a mechanism that identifies required features: symbols
and symbol versioning.

How is this superior to a dynamic symbol reference, apart from
DT_REQUIRED encoding more efficiently? A reference in an executable
to say, _RELR, will do just as well to say "this executable needs RELR
support to run", won't it?

--
Alan Modra
Australia Development Lab, IBM

Cary Coutant

Nov 1, 2021, 11:46:00 PM
to Fāng-ruì Sòng, Generic System V Application Binary Interface, Ali Bahrami, H.J. Lu
> jrtc27 has a comment that "DT_REQUIRED" can be confused with "DT_NEEDED"
> as required/needed are synonymous as English words.

Good point. If this goes further, I'll consider a better name.

> joerg from NetBSD said that
> "I'm building a binary on a newer system and am surprised that it crashed on an old system" is not really something they have to address.
>
> So I have collected evidence that (Android, Chrome OS, FreeBSD, Fuchsia,
> NetBSD, Solaris) are happy with poor diagnostics when new objects don't
> work on old systems.

Yes, this did seem to be the consensus in the other thread. While I
agree with this for the most part, I am sympathetic to the complaint
that ELF, where extensibility is a key feature, exhibits an
extensibility problem.

> >I can imagine that ELFOSABI_GNU may add quite a few extensions which
> >will never be incorporated into the generic ABI. GNU probably can add
> >a similar DT_ dynamic tag in the DT_LOOS~DT_HIOS range.
> >
> >---
> >
> >For HJ's glibc concern, letting GNU ld synthesize _dl_have_relr is the
> >most practical solution.
> >But as I mentioned in
> >https://maskray.me/blog/2021-10-31-relative-relocations-and-relr#time-travel-compatibility
> >, I will at most add a dumb alias but not synthesize a symbol for LLD.

This seems like an out-of-band solution to the problem, more of a
workaround than a solution.

-cary

Cary Coutant

Nov 2, 2021, 12:05:45 AM
to Generic System V Application Binary Interface, Fāng-ruì Sòng, Ali Bahrami, H.J. Lu
> We already have a mechanism that identifies required features: symbols
> and symbol versioning.

Symbol versioning is a vendor extension, not part of standard ELF.

> How is this superior to a dynamic symbol reference, apart from
> DT_REQUIRED encoding more efficiently? A reference in an executable
> to say, _RELR, will do just as well to say "this executable needs RELR
> support to run", won't it?

I feel that it's a more disciplined approach than using dummy symbols,
and similar in spirit to the SHF_OS_NONCONFORMING flag for section
types. As Fang-Rui pointed out, there was a similar proposal a year
ago, and it was that discussion that got me thinking about feature
flag bits instead. In the thread he linked to, we talked about using
EI_OSABI and EI_ABIVERSION, but I agree that those are heavy hammers
to use. I've come to prefer an extensibility model that relies on
individual features rather than versions. Dummy symbols can achieve
that model, but it's not something I'd want to put in an ELF spec.

-cary

Konstantin Belousov

Nov 2, 2021, 12:07:20 AM
to 'Fāng-ruì Sòng' via Generic System V Application Binary Interface, Cary Coutant, Ali Bahrami, H.J. Lu
For FreeBSD, there is already a mechanism to detect newer binaries running
on older systems (p_osrel checks). It is not as precise as DT_REQUIRED,
but it probably covers many cases.

That said, I do not object to DT_REQUIRED in any way, and will add it to
FreeBSD if/when agreement on it is reached.

Ali Bahrami

Nov 2, 2021, 1:50:30 AM
to gener...@googlegroups.com
On 11/1/21 9:45 PM, Cary Coutant wrote:
>> So I have collected evidence that (Android, Chrome OS, FreeBSD, Fuchsia,
>> NetBSD, Solaris) are happy with poor diagnostics when new objects don't
>> work on old systems.
> Yes, this did seem to be the consensus in the other thread. While I
> agree with this for the most part, I am sympathetic to the complaint
> that ELF, where extensibility is a key feature, exhibits an
> extensibility problem.
>

I don't think I'd say that anyone is happy with poor diagnostics.
Rather, it's that we just don't support that time traveling model,
and because we always roll out the linker support ahead of the
use, it just isn't a big enough problem to be worth complicating
things for. "Happy Enough", perhaps.

We've built our system around a backward compatibility
guarantee that says (roughly), that we will continue to make
your old objects work on new systems, but not the other way
around. As such, there's no ELF extensibility problem, other
than the fact that rolling it out takes coordination and
time.

----

If there needs to be such a feature (that's the debate in my
mind), then the one Cary proposes is as simple as one could be,
and so I won't object too loudly. I predict though that you'll
quickly run out of bits, so you might as well start numbering
the tags now (call it DT_REQUIRED_1, rather than DT_REQUIRED).
And please don't assign any bits above the first 32, so that
the same tags will work for ELFCLASS32 and 64.

That gives me an idea. We already have DT_POSFLAG_1, a tag
that adds semantics to the tag that follows:

#define DT_POSFLAG_1    0x6ffffdfd  /* flags for DT_* entries, affecting */
                                    /* the following DT_* entry. */
                                    /* See DF_P1_* definitions */

/*
 * Values for the DT_POSFLAG_1 .dynamic entry.
 * These values only affect the following DT_* entry.
 */
#define DF_P1_LAZYLOAD  0x00000001  /* following object is to be */
                                    /* lazy loaded */
#define DF_P1_GROUPPERM 0x00000002  /* following object's symbols are */
                                    /* not available for general */
                                    /* symbol bindings */
#define DF_P1_DEFERRED  0x00000004  /* following object is deferred */
#define DF_P1_EXISTING  0x00000008  /* following object should resolve */
                                    /* to existing object first */

Suppose we were to add a new bit to DT_POSFLAG_1:

#define DF_P1_REQUIRED  0x00000010  /* fatal load error if following tag */
                                    /* is unknown */

The advantage of this is that it can be applied to any tag, without having
to continually assign new "require" bits to go with new features. And those
of us who don't see a benefit can opt out of using it easily enough.

(It doesn't have to be DT_POSFLAG_1. We can invent a new, similar,
tag instead).
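
A rough sketch in C of how a loader might act on such a bit, assuming the semantics described above (a DT_POSFLAG_1 entry with DF_P1_REQUIRED marks the entry that immediately follows it as must-support). The `Dyn` struct is a simplified stand-in for Elf64_Dyn, and `tag_is_known` and `first_unsupported_required` are hypothetical helpers; DT_RELR appears only as an example of a tag this imaginary loader does not implement.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for Elf64_Dyn (assumption). */
typedef struct {
    int64_t  d_tag;
    uint64_t d_val;
} Dyn;

#define DT_NULL         0
#define DT_NEEDED       1
#define DT_RELR         36          /* example of a tag this loader lacks */
#define DT_POSFLAG_1    0x6ffffdfd
#define DF_P1_REQUIRED  0x00000010  /* bit suggested above, not assigned */

/* Hypothetical stand-in for the loader's table of tags it understands. */
int tag_is_known(int64_t tag)
{
    return tag == DT_NEEDED || tag == DT_POSFLAG_1;
}

/*
 * Returns the index of the first must-support entry whose tag the loader
 * does not understand, or -1 if the object may be loaded.
 */
long first_unsupported_required(const Dyn *dyn)
{
    assert(dyn != NULL);
    for (long i = 0; dyn[i].d_tag != DT_NULL; i++) {
        if (dyn[i].d_tag != DT_POSFLAG_1 || !(dyn[i].d_val & DF_P1_REQUIRED))
            continue;
        /* The flag applies only to the entry that immediately follows. */
        if (dyn[i + 1].d_tag != DT_NULL && !tag_is_known(dyn[i + 1].d_tag))
            return i + 1;
    }
    return -1;
}
```

Unknown tags that are not preceded by the flag are still ignored in this sketch, which is how producers that see no benefit would opt out.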

- Ali

Florian Weimer

Nov 2, 2021, 7:20:01 AM
to Ali Bahrami, gener...@googlegroups.com
* Ali Bahrami:

> I don't think I'd say that anyone is happy with poor diagnostics.
> Rather, it's that we just don't support that time traveling model,
> and because we always roll out the linker support ahead of the
> use, it just isn't a big enough problem to be worth complicating
> things for. "Happy Enough", perhaps.
>
> We've built our system around a backward compatibility
> guarantee that says (roughly), that we will continue to make
> your old objects work on new systems, but not the other way
> around. As such, there's no ELF extensibility problem, other
> than the fact that rolling it out takes coordination and
> time.

This is not about providing compatibility, but providing at least some
form of diagnostics rather than a mysterious crash. Crashes caused by
missing (not applied) relocations are rather difficult to diagnose, so
I'd really like to have something better here.

The idea of a gradual phase-in is nice, but it's not what has happened
at all on the consumer side (where it is most important).

I would really appreciate it if we could address the issue that loaders are
currently expected to ignore unknown dynamic tags and program header
tags, even in cases where it is clear due to the way the binary is
constructed that ignoring these tags will not lead to a usable process
image. DT_REQUIRED works for me if we can get flags assigned as needed
(without having this very discussion over and over again for each
individual flag). My previous proposal side-stepped the assignment
issue, but apart from that it is equivalent.

If DT_REQUIRED is adopted, link editors for GNU/Linux would likely
increase ABIVERSION if DT_REQUIRED is generated (with a non-zero value),
to get the expected failure behavior from existing dynamic loaders that
don't know about DT_REQUIRED.

Thanks,
Florian

H.J. Lu

Nov 2, 2021, 8:31:57 AM
to Generic System V Application Binary Interface, Ali Bahrami
For GNU/Linux, we need to set ELFOSABI_GNU and bump ABIVERSION
when DT_REQUIRED is generated. GNU/Linux should implement it together
with DT_RELR or before DT_RELR.
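
For readers unfamiliar with the mechanism, the following C sketch shows the kind of identification check that would make a loader reject an object with a bumped ABIVERSION. It assumes, as the messages above imply, that existing loaders already refuse an EI_ABIVERSION they do not know. `LOADER_MAX_ABIVERSION` and `abiversion_ok` are hypothetical names, and no particular new version number is implied.

```c
#include <assert.h>
#include <stddef.h>

#define EI_OSABI       7    /* index of the OS/ABI byte in e_ident */
#define EI_ABIVERSION  8    /* index of the ABI version byte in e_ident */
#define EI_NIDENT      16
#define ELFOSABI_NONE  0
#define ELFOSABI_GNU   3

/* Hypothetical: highest ABI version this loader was built to understand. */
#define LOADER_MAX_ABIVERSION 0

/* Returns 1 if the identification bytes are acceptable, 0 if the object must be refused. */
int abiversion_ok(const unsigned char e_ident[EI_NIDENT])
{
    assert(e_ident != NULL);
    if (e_ident[EI_OSABI] != ELFOSABI_NONE && e_ident[EI_OSABI] != ELFOSABI_GNU)
        return 0;
    return e_ident[EI_ABIVERSION] <= LOADER_MAX_ABIVERSION;
}
```

A link editor that emits DT_REQUIRED would then set EI_OSABI to ELFOSABI_GNU and raise EI_ABIVERSION, so that a loader built with the older limit fails this check instead of crashing later.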

--
H.J.

Fangrui Song

Nov 2, 2021, 1:46:18 PM
to gener...@googlegroups.com
[Sorry for discussing very glibc specific things in generic-abi.]

I don't see why GNU/Linux should implement the (to-be-renamed) DT_REQUIRED diagnostic before DT_RELR.


ld.lld --pack-dyn-relocs=relr can produce DT_RELR since 2018.
Such an object will work once glibc supports DT_RELR.

Current state:

* old glibc: sigsegv
* new glibc: sigsegv

Are you saying that changing the state to the following is undesirable from
glibc's point of view? I fail to follow.

* old glibc: sigsegv
* new glibc: work


--pack-dyn-relocs=relr is not a default linker option. Nobody changes
GCC/binutils/Clang default today. The user producing DT_RELR objects
with --pack-dyn-relocs=relr takes the risk that such objects don't run on
old systems. That's it. For Chrome OS, their glibc has supported DT_RELR
for 3 years. They have passed the finite window of time when "time
travel compatibility" could ever become a problem.

> Quote Ali's suggestion https://groups.google.com/g/generic-abi/c/bX460iggiKg/m/0PMCJ0hjBAAJ
>
> My free advice (worth what you paid for it) is to roll out the support,
> and then wait a bit before turning on the use widely, so that the
> support is in place before it is needed, and to not complicate things
> with a way to catch time travelers. The window of time where this can
> be a problem is finite, and once you're past it, you'll be glad to have
> a simpler system.

mips uses EI_ABIVERSION==1 to mean (AIUI) canonical PLT entries and copy
relocations can be used. A different EI_ABIVERSION would be picked.

Ali Bahrami

Nov 2, 2021, 7:03:33 PM
to gener...@googlegroups.com
On 11/2/21 5:19 AM, Florian Weimer wrote:
> This is not about providing compatibility, but providing at least some
> form of diagnostics rather than a mysterious crash. Crashes caused by
> missing (not applied) relocations are rather difficult to diagnose, so
> I'd really like to have something better here.
>
> The idea of a gradual phase-in is nice, but it's not what has happened
> at all on the consumer side (where it is most important).
>
> I would really appreciate it if we could address the issue that loaders are
> currently expected to ignore unknown dynamic tags and program header
> tags, even in cases where it is clear due to the way the binary is
> constructed that ignoring these tags will not lead to a usable process
> image. DT_REQUIRED works for me if we can get flags assigned as needed
> (without having this very discussion over and over again for each
> individual flag). My previous proposal side-stepped the assignment
> issue, but apart from that it is equivalent.
>
> If DT_REQUIRED is adopted, link editors for GNU/Linux would likely
> increase ABIVERSION if DT_REQUIRED is generated (with a non-zero value),
> to get the expected failure behavior from existing dynamic loaders that
> don't know about DT_REQUIRED.


OK. I'd also appreciate not having this discussion
over and over, so we have at least that as a common
interest. :-) And of course, a clean error message is much
easier for end users to understand than staring at a
core file. Mistakes happen, and good errors are never
a bad thing.

-----

My first question for you is about this not being about
providing compatibility. As I said, I've always found it
questionable that unknown things are ignored. Suppose
we were to simply modify the linkers to reject such
objects? I don't recall the ELF spec saying that unknown
things should be ignored and think that this is just
one of those implementation things (I could be wrong --- I
didn't look before typing this).

The biggest impact of that is that using new features will
force new linkers to be installed first, but in today's world
with package managers and online repos, and continually updated
bits waiting to be pulled down, is that as big a deal as it
used to be? It sure would be nice to solve this without
needing to extend ELF.

-----

Assuming that my previous suggestion is too rigid to
gain support, let's forge on and explore what we might
do to allow the compilers to label things that must be
understood and supported.

A weakness of DT_REQUIRED, as proposed, is that new
bits need to be assigned for it, as new features get
added. Not only is this an added layer of bureaucracy,
but as things age, the older bits become pointless. In
5-10 years, no one will need to say "RELR is required",
but that bit in DT_REQUIRED will live on, taking up space
in objects, documentation, and our heads.

Problem #2: As a gABI feature, DT_REQUIRED can only label
gABI features. I bet you'd love to be able to label OSABI
and psABI features too, wouldn't you?

Another issue occurs when an old linker sees a new bit
set in DT_REQUIRED that it doesn't understand. The presence
of that unknown bit means that we can't continue, but since
we don't know what the bit means, we're limited to an overly
generic error:

ld.so.1: foo.so: unrecognized requirement 0x2 in DT_REQUIRED

The idea of using a new flag from DT_POSFLAG_1, or something
similar to it, to mark dynamic items that can't be ignored
solves most of these issues:

- Once defined, it doesn't need ongoing ABI maintenance,
and can be applied to any dynamic tag, old/new,
generic/osabi/psabi.

- Compilers can quit setting it for tags that have
aged long enough to be ubiquitous.

- The error message that can be produced can actually
reference the thing that isn't understood:

ld.so.1: foo.so: unsupported dynamic tag: [7] 0x<tag value in hex>

That doesn't help with program headers, or section headers.
For program headers, we could create a PT_REQUIRED flag.
Same for section headers, but since section headers aren't
used by the runtime linker, perhaps it's not necessary to
label them.

I do think that a better name than REQUIRED, like MUSTSUPPORT,
might make all of this more clear. As noted, REQUIRED does
seem like something that would apply to a NEEDED dependency.

- Ali

Roland McGrath

Nov 2, 2021, 7:18:24 PM
to gener...@googlegroups.com
For phdrs, if we look at the extensions that have been added over the years, at least half are in the category with PT_GNU_EH_FRAME and PT_GNU_RELRO: extra information that either a dynamic linker doesn't examine anyway or that a dynamic linker can safely ignore with only the loss of new functionality/security and no compatibility breakage. So it's likely that a "refuse unrecognized p_type" rule would do more harm than help on its own.

It could be said that new features that can safely be overlooked by the dynamic linker should all be funneled through PT_NOTE. But that's a hassle (my one regret about Build ID is that we used a note instead of a new PT_* type for it since parsing is so much simpler).

However, phdrs have many bits to spare. So a simple but backward-compatible new rule would be to reserve a p_flag bit (e.g. the high bit) and say that any unrecognized type with PF_MUSTSUPPORT in p_flags must be a hard error. Existing things like PT_GNU_EH_FRAME or whatnot that a dynamic linker already safely ignores would be unaffected since the high bit in p_flags has never been set before. New things can be added with or without the PF_MUSTSUPPORT flag set by the linker as appropriate to the semantic requirements of the new p_type.
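
A minimal C sketch of the rule described above, under stated assumptions: the `Phdr` struct keeps only the two fields that matter here, the value 0x80000000 for PF_MUSTSUPPORT is just the "high bit" given as an example and is not an assigned value, and `phdr_type_is_known` and `first_unsupported_phdr` are hypothetical helpers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified program header: only the fields this check needs (assumption). */
typedef struct {
    uint32_t p_type;
    uint32_t p_flags;
} Phdr;

#define PT_LOAD          1
#define PT_DYNAMIC       2
#define PT_GNU_EH_FRAME  0x6474e550u
#define PF_MUSTSUPPORT   0x80000000u  /* hypothetical: "the high bit" */

/* Hypothetical stand-in for the set of p_type values the loader handles. */
int phdr_type_is_known(uint32_t type)
{
    return type == PT_LOAD || type == PT_DYNAMIC;
}

/*
 * Returns the index of the first unrecognized segment that carries
 * PF_MUSTSUPPORT, or -1 if every unrecognized segment may be ignored.
 */
long first_unsupported_phdr(const Phdr *phdr, size_t count)
{
    assert(phdr != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        if (!phdr_type_is_known(phdr[i].p_type) &&
            (phdr[i].p_flags & PF_MUSTSUPPORT))
            return (long)i;
    }
    return -1;
}
```

An existing type such as PT_GNU_EH_FRAME never has the bit set, so a loader that does not recognize it keeps ignoring it exactly as before.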
