SHF_LINK_ORDER's original semantics make upgrade difficult

Fangrui Song

Jul 16, 2020, 2:37:25 AM
to gener...@googlegroups.com, binu...@sourceware.org
For your convenience, the agreed semantics of SHF_LINK_ORDER is attached:

https://groups.google.com/d/msg/generic-abi/_CbBM6T6WeM/eGF9A0AnAAAJ

SHF_LINK_ORDER

### Original semantics
This flag adds special ordering requirements for link editors. The
requirements apply to the referenced section identified by the sh_link
field of this section's header. If this section is combined with other
sections in the output file, the section must appear in the same
relative order with respect to those sections, as the referenced
section appears with respect to sections the referenced section is
combined with.

A typical use of this flag is to build a table that references
text or data sections in address order.
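
To make the ordering rule concrete, here is a minimal Python sketch. It is not taken from any linker: the `Section` record and its `link` field (a stand-in for sh_link) are hypothetical, and the output order of the referenced sections is assumed to be known already.

```python
from dataclasses import dataclass


@dataclass
class Section:
    name: str  # illustrative label for an input section
    link: str  # stand-in for sh_link: name of the referenced section


def order_by_linked_to(metadata, referenced_output_order):
    """Order SHF_LINK_ORDER sections like the sections they reference."""
    rank = {name: i for i, name in enumerate(referenced_output_order)}
    # sorted() is stable, so sections referencing the same section
    # keep their input order.
    return sorted(metadata, key=lambda s: rank[s.link])


table = [
    Section("entry_for_c", ".text.c"),
    Section("entry_for_a", ".text.a"),
    Section("entry_for_b", ".text.b"),
]
ordered = order_by_linked_to(table, [".text.a", ".text.b", ".text.c"])
print([s.name for s in ordered])
```

The table entries end up in the same relative order as the text sections they describe, which is the "table in address order" use case.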

### Extended semantics for metadata sections
In addition to adding ordering requirements, SHF_LINK_ORDER indicates
that the section contains metadata describing the referenced section.
When performing unused section elimination, the link editor should
ensure that both the section and the referenced section are retained
or discarded together. Furthermore, relocations from this section
into the referenced section should not be taken as evidence that the
referenced section should be retained.
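
One way to read the extended semantics is as a small change to a mark-and-sweep pass. The Python sketch below is only an illustration under assumed data structures (`roots`, `relocs` and `link_order` are hypothetical dictionaries keyed by section name); it also assumes a metadata section is reached only through its linked-to section.

```python
def live_sections(roots, relocs, link_order):
    """Return the set of sections retained by a simplified --gc-sections pass.

    roots:      sections kept unconditionally
    relocs:     section -> sections it references via relocations
    link_order: SHF_LINK_ORDER section -> section named by its sh_link
    """
    # Reverse map: referenced section -> metadata sections that follow it.
    dependents = {}
    for meta, target in link_order.items():
        dependents.setdefault(target, []).append(meta)

    live, work = set(), list(roots)
    while work:
        sec = work.pop()
        if sec in live:
            continue
        live.add(sec)
        # Metadata is retained together with the section it describes.
        work.extend(dependents.get(sec, []))
        for target in relocs.get(sec, []):
            # A relocation from metadata into its own referenced section
            # is not evidence that the referenced section is needed.
            if link_order.get(sec) == target:
                continue
            work.append(target)
    return live


live = live_sections(
    roots={".text.main"},
    relocs={
        ".text.main": [".text.used"],
        ".foo.used": [".text.used"],
        ".foo.unused": [".text.unused"],
    },
    link_order={".foo.used": ".text.used", ".foo.unused": ".text.unused"},
)
print(sorted(live))
```

Here `.foo.unused` and `.text.unused` are both dropped, even though the former has a relocation pointing at the latter.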

Say we have a monolithic metadata section (.foo) associated to a text section.
The metadata section is only referenced by the associated text section. If we
want to "upgrade" the metadata section to SHF_LINK_ORDER in the future (to get
--gc-sections functionality with fragmented .foo sections), unfortunately the
original semantics make the design space narrow.

For compatibility reasons, we still have some object files with
non-SHF_LINK_ORDER monolithic .foo . New .foo have the SHF_LINK_ORDER flag.
However, mixed SHF_LINK_ORDER and non-SHF_LINK_ORDER sections have an issue with
the linked-to order requirement (Original semantics). In practice, linkers
forbid a mix.

GNU ld: .gcc_except_table has both ordered [.foo' in a.o] and unordered [.foo' in b.o] sections
LLD is more relaxed, but it does not allow non-contiguous SHF_LINK_ORDER sections: a.o:(.foo): SHF_LINK_ORDER sections in .foo are not contiguous
(https://reviews.llvm.org/D77007 )

In many cases, .foo doesn't really need original semantics and .foo can be
combined in an arbitrary order. What can we do to make such "upgrade" feasible?
The current design simply requires such metadata section to be designed with
SHF_LINK_ORDER in mind in the very beginning.

James Henderson

Jul 16, 2020, 3:27:36 AM
to Generic System V Application Binary Interface
I've run into this same problem whilst doing some prototyping work on fragmenting DWARF sections into smaller pieces. These pieces are a mixture of common pieces and function/variable-related pieces, with the latter being associated with the corresponding text/data sections via SHF_LINK_ORDER. However, in some cases, the section is laid out such that there are multiple common pieces intermixed with the function/variable-specific pieces (e.g. it might look like "common, function, function, common, function, variable, common"). I actually don't want this section to be ordered at all, since the original ordering should be preserved. I've been using a hacked LLD for this, which removes the ordering for SHF_LINK_ORDER completely, so that the sections remain in the "natural" order, whilst the associated-with semantics for --gc-sections are still preserved.

Whilst doing this, I couldn't help but feel that the associated-with and ordering semantics are somewhat orthogonal. Clearly if a section is ordered, it needs the associated-with semantics, but it seems like section association is a different thing. There are multiple sections now where the ordering is irrelevant, but which still want association in some way. Examples include LLVM's stack sizes section and debug data. The original SHF_LINK_ORDER extension discussion actually started out with discussing a SHF_ASSOCIATED flag. Maybe we should revisit that idea in some form? Thus, SHF_ASSOCIATED implies the section should be discarded when its linked section is discarded, and SHF_LINK_ORDER requires ordering. I think both for backwards-compatibility's sake and to avoid redundancy, we could say SHF_LINK_ORDER implies SHF_ASSOCIATED.

Fangrui Song

Jul 16, 2020, 11:36:20 AM
to gener...@googlegroups.com, binu...@sourceware.org
> GNU ld: .gcc_except_table has both ordered [.foo' in a.o] and unordered [.foo' in b.o] sections

Correction: .foo has both ordered [`.foo' in a.o] and unordered [`.foo' in b.o] sections

(I was experimenting with SHF_LINK_ORDER .gcc_except_table . For this
discussion I just used a metasyntactic section name .foo but forgot to
replace all occurrences of .gcc_except_table)
I am in favor of a new section flag SHF_ASSOCIATED as well. We can use
the next bit:

#define SHF_ASSOCIATED 0x1000

In the assembler, assign it a .section flag 'm', e.g.

.section __patchable_function_entries,"am",@progbits,foo
# Create a __patchable_function_entries section with the SHF_ASSOCIATED flag
# and sh_link referencing the section defining 'foo'

Unfortunately we will have a binutils release (2.35) with SHF_LINK_ORDER
syntax (.section flag 'o')
https://sourceware.org/bugzilla/show_bug.cgi?id=24526
Unless we could be really quick and fix it, we might have to let SHF_LINK_ORDER
imply SHF_ASSOCIATED permanently.
(On the LLVM side, I think we could keep LLD's SHF_LINK_ORDER implying SHF_ASSOCIATED
for one or two releases and then drop that)
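
The split between the two flags can be sketched as two predicates over sh_flags. This is illustrative Python only: SHF_LINK_ORDER (0x80) and SHF_ALLOC (0x2) are the gABI values, while SHF_ASSOCIATED = 0x1000 is just the value proposed in this message and is not standardized. The function names are hypothetical.

```python
SHF_ALLOC = 0x2            # gABI value
SHF_LINK_ORDER = 0x80      # gABI value
SHF_ASSOCIATED = 0x1000    # value proposed in this thread, NOT standardized


def requires_ordering(sh_flags):
    """Only SHF_LINK_ORDER asks the linker to sort by the linked-to section."""
    return bool(sh_flags & SHF_LINK_ORDER)


def implies_association(sh_flags):
    """Either flag ties the section's liveness to its linked-to section,
    under the 'SHF_LINK_ORDER implies SHF_ASSOCIATED' rule."""
    return bool(sh_flags & (SHF_LINK_ORDER | SHF_ASSOCIATED))


ordered_meta = SHF_ALLOC | SHF_LINK_ORDER
associated_meta = SHF_ALLOC | SHF_ASSOCIATED
print(requires_ordering(ordered_meta), implies_association(ordered_meta))
print(requires_ordering(associated_meta), implies_association(associated_meta))
```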


SHF_ASSOCIATED

SHF_ASSOCIATED indicates that the section contains metadata describing

Ali Bahrami

Jul 16, 2020, 12:55:09 PM
to gener...@googlegroups.com
On 7/16/20 12:37 AM, 'Fangrui Song' via Generic System V Application Binary Interface wrote:
> For your convenience, the agreed semantics of SHF_LINK_ORDER is attached:
...

> Say we have a monolithic metadata section (.foo) associated to a text section.
> The metadata section is only referenced by the associated text section.  If we
> want to "upgrade" the metadata section to SHF_LINK_ORDER in the future (to get
> --gc-sections functionality with fragmented .foo sections), unfortunately the
> original semantics make the design space narrow.
>
> For compatibility reasons, we still have some object files with
> non-SHF_LINK_ORDER monolithic .foo .  New .foo have the SHF_LINK_ORDER flag.
> However, mixed SHF_LINK_ORDER and non-SHF_LINK_ORDER sections have an issue with
> the linked-to order requirement (Original semantics). In practice, linkers
> forbid a mix.
>
> GNU ld: .gcc_except_table has both ordered [.foo' in a.o] and unordered [.foo' in b.o] sections
> LLD is more relaxed, but it does not allow non-contiguous SHF_LINK_ORDER sections: a.o:(.foo): SHF_LINK_ORDER sections in .foo are not contiguous
>   (https://reviews.llvm.org/D77007 )
>
> In many cases, .foo doesn't really need original semantics and .foo can be
> combined in an arbitrary order. What can we do to make such "upgrade" feasible?
> The current design simply requires such metadata section to be designed with
> SHF_LINK_ORDER in mind in the very beginning.
>



The Solaris linker puts sections without SHF_LINK_ORDER at the
end of the output section, in first-in-first-out order, and I
don't believe that's considered to be an error. Perhaps it's
worth investigating why GNU ld and LLD have these restrictions,
and whether they might be relaxed. I'm a linker specialist,
but this seems like a case where the linker should do as it's
told by the compiler, and not make assumptions about what the
compiler intends to achieve.

-----

Coming at this from another angle, I'd say that if the linkers
can't be modified to be more compliant, the most straightforward
solution is to just require all the objects to be recompiled if
--gc-sections functionality is required by any part of the object.

Yes, you could add complexity to support a mixed model, but why
cater to this odd case? Compatibility is, of course, very important,
but there are limits. Folks who insist on using old objects don't
necessarily have a right to demand new abilities. It's not a good
bargain to add permanent complexity to handle a temporary case.

- Ali

Ali Bahrami

Jul 16, 2020, 12:59:22 PM
to gener...@googlegroups.com
On 7/16/20 1:27 AM, James Henderson wrote:
> I've run into this same problem whilst doing some prototyping work on fragmenting DWARF sections into smaller pieces. These pieces are a mixture of common pieces and function/variable-related pieces, with the latter being associated with the corresponding text/data sections via SHF_LINK_ORDER. However, in some cases, the section is laid out such that there are multiple common pieces intermixed with the function/variable-specific pieces (e.g. it might look like "common, function, function, common, function, variable, common"). I actually don't want this section to be ordered at all, since the original ordering should be preserved. I've been using a hacked LLD for this, which removes the ordering for SHF_LINK_ORDER completely, so that the sections remain in the "natural" order, whilst the associated-with semantics for --gc-sections are still preserved.
>
> Whilst doing this, I couldn't help but feel that the associated-with and ordering semantics are somewhat orthogonal. Clearly if a section is ordered, it needs the associated-with semantics, but it seems like section association is a different thing. There are multiple sections now where the ordering is irrelevant, but which still want association in some way. Examples include LLVM's stack sizes section and debug data. The original SHF_LINK_ORDER extension discussion actually started out with discussing a SHF_ASSOCIATED flag. Maybe we should revisit that idea in some form? Thus, SHF_ASSOCIATED implies the section should be discarded when its linked section is discarded, and SHF_LINK_ORDER requires ordering. I think both for backwards-compatibility's sake and to avoid redundancy, we could say SHF_LINK_ORDER implies SHF_ASSOCIATED.
>



I guess I don't understand why SHT_GROUP sections aren't the general
solution to this scenario, and any others where sections need to be
associated.

https://docs.oracle.com/cd/E37838_01/html/E36783/chapter7-26.html

In your scenario, I think you'd set the flags field to 0, rather
than GRP_COMDAT, and then you'd have an associated set of sections
that are kept, or discarded, as a unit. And GROUP sections conveniently
go away as part of a "final" link that produces an executable or shared
object, as you propose above.

If the GNU linkers were modified to allow both ordered and
non-ordered sections, then I think you could use GROUP for
associating, and SHF_LINK_ORDER for actual ordering, where
that is desired. We probably don't need a third mechanism,
do we?
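
For comparison, the "kept or discarded as a unit" reading of a non-COMDAT group can be sketched in Python as follows. This is one possible interpretation, not the behaviour of any particular linker; `groups` is a hypothetical list of member lists standing in for SHT_GROUP sections whose flag word is 0.

```python
def live_with_groups(roots, relocs, groups):
    """Simplified mark phase where reaching one group member keeps them all."""
    member_of = {}
    for members in groups:
        for m in members:
            member_of[m] = members

    live, work = set(), list(roots)
    while work:
        sec = work.pop()
        if sec in live:
            continue
        live.add(sec)
        work.extend(member_of.get(sec, ()))  # the whole group follows
        work.extend(relocs.get(sec, ()))     # ordinary relocation edges
    return live


live = live_with_groups(
    roots={".text.main"},
    relocs={".text.main": [".text.f"]},
    groups=[[".text.f", ".meta.f"], [".text.g", ".meta.g"]],
)
print(sorted(live))
```

Unlike sh_link, a group is symmetric and can hold any number of members, at the price of an extra section per group.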

- Ali

Fangrui Song

Jul 16, 2020, 7:32:57 PM
to gener...@googlegroups.com, binu...@sourceware.org, James Henderson
If I understand correctly, linkers support GRP_COMDAT but not other values. So
you are right that we can make use of other values if we want to do some fancy
things.

In practice, I guess one reason people want to mess with a section flag is that
adding a .group has non-trivial cost (sizeof(Elf64_Shdr)=64) if an object file
may have many of them (especially when -ffunction-sections is enabled).

For example, I added -fpatchable-function-entry=N[,M] in clang for the Linux
kernel. Every .text section has an accompanying __patchable_function_entries
with SHF_LINK_ORDER set. If I want to express this with a section group, I need
to add a .group (SHT_GROUP) for each .text, so the cost will go up from double to
triple.
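
A back-of-the-envelope version of that cost, as a Python sketch. The 64-byte header size is the figure quoted above, and SHT_GROUP contents are an array of 32-bit words (one flag word plus one index per member); everything else (no relocation sections, no section-name strings, exactly one metadata section per function) is a simplifying assumption.

```python
ELF64_SHDR_SIZE = 64  # sizeof(Elf64_Shdr), as quoted above
GROUP_WORD_SIZE = 4   # SHT_GROUP contents are 32-bit words


def section_overhead(num_functions, use_groups):
    """Bytes spent on section headers (and group contents) per object file."""
    # .text.f + its metadata section, plus one .group each if groups are used.
    sections_per_function = 3 if use_groups else 2
    header_bytes = num_functions * sections_per_function * ELF64_SHDR_SIZE
    # Each group holds a flag word and two member indexes.
    group_bytes = num_functions * 3 * GROUP_WORD_SIZE if use_groups else 0
    return header_bytes + group_bytes


flag_cost = section_overhead(10_000, use_groups=False)
group_cost = section_overhead(10_000, use_groups=True)
print(flag_cost, group_cost)
```

For 10,000 functions this toy model gives 1,280,000 bytes with a flag bit and 2,040,000 bytes with groups.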

>>On 7/16/20 1:27 AM, James Henderson wrote:
>> Whilst doing this, I couldn't help but feel that the associated-with and ordering semantics are somewhat orthogonal.

Yeah, I am a big fan of orthogonality. However, a gABI section flag seems a
no-go now.

>If the GNU linkers were modified to allow both ordered and
>non-ordered sections, then I think you could use GROUP for
>associating, and SHF_LINK_ORDER for actual ordering, where
>that is desired. We probably don't need a third mechanism,
>do we?

This solution is acceptable. I will do this for LLD
(https://reviews.llvm.org/D72904 ). We need an assembler syntax for
(SHF_LINK_ORDER & sh_link=0).

I created https://sourceware.org/bugzilla/show_bug.cgi?id=26253 for a GNU as feature request.
Peter Collingbourne proposed

.section .meta,"ao",@progbits,0

I expect GNU as and the LLVM integrated assembler to match
(probably https://reviews.llvm.org/D72899#2157020 )

Fangrui Song

Jul 17, 2020, 12:33:30 AM
to gener...@googlegroups.com, binu...@sourceware.org, James Henderson
Forgot to mention that there are two feature requests for ld and one for as.

* ld: A SHF_LINK_ORDER section with sh_link=0 is treated like a non-SHF_LINK_ORDER section.
(sorry to send llvm links here: https://reviews.llvm.org/D72904
but I hope the interested can get a bit more context from the links)

* ld: Arbitrary mix of SHF_LINK_ORDER and non-SHF_LINK_ORDER components
within an output section is allowed.
(https://reviews.llvm.org/D84001 )

I just saw https://sourceware.org/bugzilla/show_bug.cgi?id=16833 after
I had filed https://sourceware.org/bugzilla/show_bug.cgi?id=26256

There is actually an interesting question about how we perform sorting
with input section descriptions. My feeling is that sorting should be
scoped within an input section description. See the links for more information.

* as: Support SHF_LINK_ORDER with sh_link=0
https://sourceware.org/bugzilla/show_bug.cgi?id=26253
(https://reviews.llvm.org/D72899 )

There is actually an LLVM IR design issue requiring sh_link=0.
Supporting this will help GNU ld and gold (if gold implements
SHF_LINK_ORDER) with clang -flto (which requires LLVMgold.so )

James Henderson

Jul 17, 2020, 3:20:36 AM
to Fangrui Song, gener...@googlegroups.com, binu...@sourceware.org
Fangrui more or less covered it, but to add, I have in the past experimented with non-comdat groups, but since they weren't supported in LLVM (no assembler syntax, no LLD support), it was a non-trivial amount of work to get them to work nicely at the time. Plus, as mentioned, they are a significantly higher file space (and likely link-time) overhead than just using the sh_link value. Given that one of the main concerns with the debug data fragmentation concept was file I/O overhead costs, due to there being potentially many more section headers, adding another set per function/variable (due to -ffunction-sections/-fdata-sections which is where this approach would be used) makes the matter even worse.

Of course using the SHF_LINK_ORDER/a proposed SHF_ASSOCIATED flag does not solve the question of what happens if sh_link is used for something else, but that hasn't been relevant in my situation so far, since the sections have all been SHT_PROGBITS.

ben.du...@sony.com

Jul 17, 2020, 5:13:41 AM
to Generic System V Application Binary Interface
+1 to separation of orthogonal features. IIRC Cary raised this on the original reverse edges for gc-sections discussions. It seems ok to leave LINK_ORDER implying ASSOCIATED to me, or at least I can't think of any examples where this could cause a problem; but it makes sense to add support for ASSOCIATED sections.

The first thing that needs doing when we can make edits to the gabi is clarify the wording for groups. See: https://groups.google.com/g/generic-abi/c/_CbBM6T6WeM/m/cSELn86gAQAJ and https://groups.google.com/g/generic-abi/c/PFAzdbKLWjs/m/rU-VA193EwAJ.

I still believe that, ideally, groups should be processed *early* before any gc-sections considerations. Certainly that was the original intention of the ELF authors... I concede though that the "discarded or retained together" wording is unfortunately ambiguous in the days of --gc-sections.

Ali Bahrami

Jul 17, 2020, 1:55:56 PM
to gener...@googlegroups.com
I guess my role is to be the contrarian in this, sorry.
I've gathered 3 replies together, since my answer to all
three has a common theme.

On 7/16/20 5:32 PM, 'Fangrui Song' via Generic System V Application Binary Interface wrote:
> If I understand correctly, linkers support GRP_COMDAT but not other values. So
> you are right that we can make use of other values if we want to do some fancy
> things.
>
> In practice, I guess one reason people want to mess with a section flag is that
> adding a .group has non-trivial cost (sizeof(Elf64_Shdr)=64) if an object file
> may have many of them (especially when -ffunction-sections is enabled).
>
> For example, I added -fpatchable-function-entry=N[,M] in clang for the Linux
> kernel. Every .text section has an accompanying __patchable_function_entries
> with SHF_LINK_ORDER set. If I want to express this with a section group, I need
> to add a .group (SHT_GROUP) for each .text, the cost will go up from double to
> triple.


It makes sense to me that an SHT_GROUP with a zero flag
represents a group of sections that are a unit. I'm not sure
this would even be an extension to the existing gABI. The Solaris
linker accepts these (based on a cursory look just now) without
issuing any errors, and it does not treat a value of 0 like COMDAT.
Today, that means that an SHT_GROUP with flag=0 is a bit of a
no-op, but there's nothing that would keep a link-editor from using
the grouping to make decisions. We don't need to make use of other
values for this particular case.

As such, an SHT_GROUP with a flag value of 0 is the ASSOCIATE feature
you're asking for, albeit one that costs more than a flag bit.

SHT_GROUP sections are reasonably compact. I understand that they
cost, but the cost seems in line with the benefit. I would not
like to see other new flags and features that cover the same
space added, just to shave a bit of overhead. The added complexity
to an already complex ABI seems like a problem to me, particularly
when you're asking multiple implementations to support it, at least
some of which won't ever see the benefit. We already support GROUP.



On 7/17/20 1:20 AM, James Henderson wrote:
> Fangrui more or less covered it, but to add, I have in the past experimented with non-comdat groups, but since they weren't supported in LLVM (no assembler syntax, no LLD support), it was a non-trivial amount of work to get them to work nicely at the time. Plus, as mentioned, they are a significantly higher file space (and likely link-time) overhead than just using the sh_link value. Given that one of the main concerns with the debug data fragmentation concept was file I/O overhead costs, due to there being potentially many more section headers, adding another set per function/variable (due to -ffunction-sections/-fdata-sections which is where this approach would be used) makes the matter even worse.
>
> Of course using the SHF_LINK_ORDER/a proposed SHF_ASSOCIATED flag does not solve the question of what happens if sh_link is used for something else, but that hasn't been relevant in my situation so far, since the sections have all been SHT_PROGBITS.

The problem that poses is that just because something isn't
relevant now doesn't tell us much about where we'll find
ourselves later. One thing that seems to be invariably true
in file formats is that the fewer ways there are to do something,
the less likely we are to end up with a fragile unmaintainable web.

I understand that SHT_GROUP is more work to use, and there is
overhead, and neither of those things is ideal, but I don't think
that should be the main criterion for adding redundant features to
an ABI.

A bonus is that if you use SHT_GROUP, and then you later discover
a use for 3 or more related sections, you're looking at a minor
change, rather than a rewrite.


On 7/17/20 3:13 AM, ben.du...@sony.com wrote:
> +1 to separation of orthogonal features. IIRC Cary raised this
> on the original reverse edges for gc-sections discussions.
> It seems ok to leave LINK_ORDER implying ASSOCIATED to me, or
> at least I can't think of any examples where this could cause
> a problem; but it makes sense to add support for ASSOCIATED
> sections.

Cary's observation, as I understood and agreed with, is
that LINK_ORDER naturally implies association. That association
is the basis for sorting sections. This was not a new
interpretation of LINK_ORDER, but just a recognition of
an existing fact.

In other words, Cary noted that in cases where you don't
care whether the sections are sorted or not, you can still
use LINK_ORDER for the association. This is a clever and
useful trick, but it's not necessarily a reason to add yet
another ASSOCIATE bit, given that we already have a general
association mechanism.

This discussion seems to assume that we just have this one
final ASSOCIATE problem, so it would be no big deal to just
give it a flag. I guess I'm not convinced that this isn't just
one of many similar scenarios, each of which probably needs
its own flag, all of which are already solvable with SHT_GROUP.

It's never easy, is it...

- Ali

Fangrui Song

Jul 17, 2020, 3:16:31 PM
to gener...@googlegroups.com, ben.du...@sony.com, James Henderson, pe...@pcc.me.uk
Ali, the wording about SHN_BEFORE/SHN_AFTER in the Linker and Libraries
Guide is a bit unclear to me. Does Solaris ld support something similar
to GNU ld's linker scripts?

If yes, for an output section description .foo_bar : { *(.foo) *(.bar) },
does it order sections in the following order?

--- Contribution from sections named .foo
SHF_LINK_ORDER with sh_link=SHN_BEFORE
SHF_LINK_ORDER with sh_link=1
SHF_LINK_ORDER with sh_link=2
non-SHF_LINK_ORDER
SHF_LINK_ORDER with sh_link=0
non-SHF_LINK_ORDER
SHF_LINK_ORDER with sh_link=0
SHF_LINK_ORDER with sh_link=SHN_AFTER

--- Contribution from sections named .bar
SHF_LINK_ORDER with sh_link=SHN_BEFORE
SHF_LINK_ORDER with sh_link=1
SHF_LINK_ORDER with sh_link=2
non-SHF_LINK_ORDER
SHF_LINK_ORDER with sh_link=0
non-SHF_LINK_ORDER
SHF_LINK_ORDER with sh_link=0
SHF_LINK_ORDER with sh_link=SHN_AFTER

1. A SHF_LINK_ORDER section with sh_link=0 is treated the same way as a
non-SHF_LINK_ORDER section. They are ordered with respect to the input order.
2. The special ordering of SHN_BEFORE & SHN_AFTER is scoped within
an input section description.
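
The two rules above can be modelled with a short Python sketch. It encodes only the ordering hypothesized in this message, not confirmed Solaris ld or GNU ld behaviour. `InputSection` is a hypothetical record; `sh_link` is either the string "BEFORE"/"AFTER", 0, or a section index, and the linked-to sections are assumed to appear in the output in index order.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class InputSection:
    label: str
    link_order: bool                 # is SHF_LINK_ORDER set?
    sh_link: Union[int, str] = 0     # index, 0, "BEFORE" or "AFTER"


def order_description(sections):
    """Order the sections matched by one input section description."""
    before = [s for s in sections if s.link_order and s.sh_link == "BEFORE"]
    after = [s for s in sections if s.link_order and s.sh_link == "AFTER"]
    ordered = sorted(
        (s for s in sections
         if s.link_order and isinstance(s.sh_link, int) and s.sh_link != 0),
        key=lambda s: s.sh_link,
    )
    # Rule 1: sh_link=0 behaves like a section without SHF_LINK_ORDER.
    default = [s for s in sections if not s.link_order or s.sh_link == 0]
    return before + ordered + default + after


def order_output_section(descriptions):
    """Rule 2: sorting is scoped to each input section description."""
    result = []
    for sections in descriptions:
        result.extend(order_description(sections))
    return result


foo = [
    InputSection("foo.plain", False),
    InputSection("foo.after", True, "AFTER"),
    InputSection("foo.2", True, 2),
    InputSection("foo.zero", True, 0),
    InputSection("foo.1", True, 1),
    InputSection("foo.before", True, "BEFORE"),
]
bar = [InputSection("bar.1", True, 1), InputSection("bar.before", True, "BEFORE")]
layout = [s.label for s in order_output_section([foo, bar])]
print(layout)
```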

>The problem that poses is that just because something isn't
>relevant now doesn't tell us much about where we'll find
>ourselves later. One thing that seems to be invariably true
>in file formats is that the fewer ways there are to do something,
>the less likely we are to end up with a fragile unmaintainable web.
>
>I understand that SHT_GROUP is more work to use, and there is
>overhead, and neither of those things is ideal, but I don't think
>that should be the main criterion for adding redundant features to
>an ABI.
>
>A bonus is that if you use SHT_GROUP, and then you later discover
>a use for 3 or more related sections, you're looking at a minor
>change, rather than a rewrite.

To James:

I think you need a way to keep .debug_info sections from an input file
contiguous, i.e. not reordered with sections from another input file, right?

--- Contribution from input file 0
.debug_info header (sh_link=0 ??)
.debug_info 1
.debug_info 2
.debug_info 3
.debug_info footer (sh_link=0 ??)

--- Contribution from input file 1
.debug_info header (sh_link=0 ??)
.debug_info 1
.debug_info 2
.debug_info 3
.debug_info footer (sh_link=0 ??)

If a section flag does not have an ordering requirement, the regular
section combination rule will naturally satisfy the file order.

OK, your position pretty much makes a generic ELF flag no-go now.

If my previous .debug_info example can find its merit (if header/footer
contribution can't be elegantly represented within the current
SHF_LINK_ORDER framework), I guess we can still consider SHF_LLVM_ASSOCIATED
(or SHF_GNU_ASSOCIATED if GNU people find it useful)

Ali Bahrami

Jul 17, 2020, 3:54:31 PM
to gener...@googlegroups.com
On 7/17/20 1:16 PM, 'Fangrui Song' via Generic System V Application Binary Interface wrote:
> Ali, the wording about SHN_BEFORE/SHN_AFTER in the Linker and Libraries
> Guide is a bit unclear to me.  Does Solaris ld support something similar
> to GNU ld's linker scripts?

Solaris uses a very different language than linker scripts.
We call them mapfiles. The original mapfile language was a
terrible thing that came with SVR4. I created a different,
I think better, one about 10 years ago, which I've blogged about.

http://www.linker-aliens.org/blogs/ali/entry/the_problem_s_with_solaris/
> http://www.linker-aliens.org/blogs/ali/entry/a_new_mapfile_syntax_for/
http://www.linker-aliens.org/blogs/ali/entry/much_ado_about_nothing_stub/
http://www.linker-aliens.org/blogs/ali/entry/regex_and_glob_for_mapfiles/

The documentation for the current language is here:

https://docs.oracle.com/cd/E37838_01/html/E36783/man-m.html#scrolltoc

And the horrible original language:

https://docs.oracle.com/cd/E37838_01/html/E36783/chapter7-55900.html#scrolltoc

No, I don't expect you to read most of that. Probably only the
blog post marked with '>' and the current documentation are
really relevant here.

I believe that the GNU linker scripts are based on the similar
feature from Unix SVR3, which got dropped in SVR4 in favor of
the terrible language we inherited in Solaris.

-----

All you really need to know about SHN_BEFORE and SHN_AFTER is
that we regret them, and took steps to deprecate them a few
years ago. I will include the text of a comment I wrote then,
to help me remember the details, below.

The big problem is that they're not compatible with extended
section indexes. At the same time, the only users of them were
the Sun/Oracle Studio compilers, and we noted that when those
compilers got ported to Linux a few years ago, they were easily
able to do without them. So we support them for non-extended
objects as a backward compatibility measure, but they're
otherwise defunct.

Make sure you're looking at the Solaris 11.4 version of Linker and
Libraries, and not an older version.

https://docs.oracle.com/cd/E37838_01/html/E36783/man-s.html

We now say:

SHN_BEFORE, SHN_AFTER

Provide for initial and final section ordering in
conjunction with SHF_LINK_ORDER section flags. See
Figure 18, Table 18, ELF Section Attribute Flags.
SHN_BEFORE and SHN_AFTER are incompatible with objects
that use extended section indexes. They are considered
deprecated, and their use is discouraged. See Extended
Section Header.

To give you the full scoop on output section placement, the
Solaris ld divides the output section into 4 ranges, and then
input sections are concatenated into the appropriate range in the
appropriate order:

/*
* Output sections contain lists of input sections that are assigned to them.
* These items fall into 4 categories:
*     BEFORE  - Ordered sections that specify SHN_BEFORE, in input order.
*     ORDERED - Ordered sections that are sorted using unsorted sections
*               as the sort key.
*     DEFAULT - Sections that are placed into the output section
*               in input order.
*     AFTER   - Ordered sections that specify SHN_AFTER, in input order.
*/

So today, all the action is in ORDERED and DEFAULT, which is why
I didn't mention BEFORE/AFTER in my earlier comments. I advise you
to pretend they don't exist.

If you want an example of baroque over-engineering, you need not
look farther than the Sun SHF_ORDERED, and things like BEFORE/AFTER
that came with it. :-( SHF_LINK_ORDER was a welcome chance at
a do-over. See the notes below.


> If my previous .debug_info example can find its merit (if header/footer
> contribution can't be elegantly represented within the current
> SHF_LINK_ORDER framework), I guess we can still consider SHF_LLVM_ASSOCIATED
> (or SHF_GNU_ASSOCIATED if GNU people find it useful)

I think "elegantly" is arguable, but if you'd accept "dirt cheap",
then yes, I think that's fair.

I think I would implement ASSOCIATED rather than see things
fragment, but I'm voicing my opinion that it's a debatable
idea, and that we might be better off without it. It's probably
time for others to chime in at this point.

- Ali


--------------------------------------------
My notes about BEFORE/AFTER

/*
* Section Ordering History/Background:
*
* Historically, there have been two forms of section ordering in ELF,
* SHF_ORDERED, and SHF_LINK_ORDER. SHF_ORDERED is now EOL (End Of Life)
* and is no longer supported. However, the information below describing
* it is not found anywhere else, and is retained to provide historical
* detail and context for the parts that survive.
*
* SHF_ORDERED was invented at Sun in order to support the PowerPC port
* of Solaris 2.6, which used it for sorting tag words which describe
* the state of callee saves registers for given PC ranges. It was defined
* in the OS specific ELF section flag range. Some other values were defined
* at the same time:
*     SHF_EXCLUDE - Section is to be excluded from executables or shared
*         objects, and only kept in relocatable object output.
*     SHN_BEFORE/SHN_AFTER - Sections are placed before/after all other
*         sections, in the order they are encountered by the linker.
* Although initially conceived to support the PowerPC, the functionality
* was implemented for all platforms, and was later used to manage C++
* exceptions and stack unwinding. The PowerPC port never appeared, but
* SHF_ORDERED lived on, and was extended to the other platforms.
*
* SHF_LINK_ORDER was invented later by the wider ELF community, and is
* therefore assigned a value in the generic ELF section flag range. It is
* essentially a simpler version of SHF_ORDERED that dispenses with the section
* renaming feature. The Solaris implementation of SHF_LINK_ORDER uses
* SHF_EXCLUDE, and SHN_BEFORE/SHN_AFTER as well, but these Solaris-only
* extensions are not recognized by other implementations.
*
* Extended section indexes (shnum >= SHN_LORESERVE) were added, via the gABI,
* after SHN_BEFORE/SHN_AFTER. We participated in the gABI, but missed the
* fact that if the number of sections exceeds SHN_LORESERVE, the meaning
* of the values assigned to SHN_BEFORE and SHN_AFTER becomes ambiguous,
* since you can't tell whether they are intended to refer to the special
* concept, or a section with that index. This issue was resolved with
*
* PSARC/2017/015 EOL SHF_ORDERED, Deprecate SHN_BEFORE/SHN_AFTER
*
* For backward compatibility with existing use from the C++ compilers for
* .exception_ranges sections, SHN_BEFORE/SHN_AFTER continue to be supported
* with SHF_LINK_ORDER, but only when (shnum <= SHN_LORESERVE). The compilers
* must switch away in order to support extended section indexes.
*
* -----
*
* SHF_ORDERED offered two distinct and separate abilities:
*
* (1) To specify the output section
* (2) To optionally be sorted relative to other sorted sections,
* using a non-sorted section as a sort key.
*
* To do this, it used both the sh_link, and sh_info fields:
*
* sh_link
* Specifies the output section to receive this input section.
* The sh_link field of an SHF_ORDERED section forms a linked list of
* sections, all of which must have identical section header flags
* (including SHF_ORDERED). The list is terminated by a final section
* with a sh_link that points at itself. All input sections in this list
* are assigned to the output section of the final section in the list.
* Hence, if a section points at itself, the effect is that it gets
* assigned to an output section in the usual default manner (i.e. an
* output section with the same name as the input). However, it can
* point at any arbitrary other section. This is a way to put a section
* with one name into an output section with a different name. It should
* be noted that this is of little value overall, because the link-editor
* now supports a more general feature for directing input sections
* to output sections: An input section named .text%foo will be sent to
* an output section named ".text", and this works for all sections,
* not just ordered ones.
*
* sh_info
* If sh_info is in the range (1 <= value < shnum), then this input section
* is added to the group of sorted sections. The section referenced by
* sh_info must be unsorted, and is used as the sort key.
*
* If sh_info is SHN_BEFORE or SHN_AFTER, it is put in the pre/post group,
* in the order it arrives (the before/after classes are not sorted).
*
* If sh_info is "invalid" (typically 0), then this section is added to
* the group of non-sorted sections, and goes into the output file in the
* order it arrives. This is not a valuable feature, as the same effect
* can be achieved more simply by not setting SHF_ORDERED at all.
*
* SHF_LINK_ORDER is a simplification of SHF_ORDERED. It uses sh_link to specify
* the section to use as a sort key and sh_info is set to 0. The standard
* ".text%foo" mechanism is used to direct input sections to output sections,
* and unordered sections indicate that by not setting SHF_LINK_ORDER.
*/
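
The sh_link and sh_info rules described in the comment above can be sketched as follows. This is only an illustrative model, not the Solaris link-editor's code: the function names are hypothetical, and the numeric values given for SHN_BEFORE and SHN_AFTER are assumptions for the example (check the platform headers for the real ones).

```python
SHN_LORESERVE = 0xFF00   # start of the reserved index range (gABI)
SHN_BEFORE = 0xFF00      # assumed value, for illustration only
SHN_AFTER = 0xFF01       # assumed value, for illustration only


def classify_sh_info(sh_info, shnum):
    """Return the SHF_ORDERED group an input section falls into."""
    # The special values are only unambiguous while no real section
    # can carry an index in the reserved range.
    special_ok = shnum <= SHN_LORESERVE
    if special_ok and sh_info == SHN_BEFORE:
        return "before"
    if special_ok and sh_info == SHN_AFTER:
        return "after"
    if 1 <= sh_info < shnum:
        return "sorted"      # sh_info names the unsorted sort-key section
    return "unsorted"        # "invalid" sh_info, typically 0


def resolve_output_section(index, sh_link):
    """Follow the sh_link chain until a section points at itself."""
    seen = set()
    while sh_link[index] != index:
        if index in seen:
            raise ValueError("sh_link chain never terminates")
        seen.add(index)
        index = sh_link[index]
    return index


# Sections 1 and 2 are both routed to the output section of section 3.
print(resolve_output_section(1, {1: 2, 2: 3, 3: 3}))
print(classify_sh_info(SHN_BEFORE, 10), classify_sh_info(SHN_BEFORE, 0x10000))
```

The last call shows the ambiguity the comment describes: once shnum exceeds SHN_LORESERVE, the same sh_info value can only be read as an ordinary section index.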

Fangrui Song

Jul 17, 2020, 4:01:01 PM7/17/20
to gener...@googlegroups.com, ben.du...@sony.com, James Henderson, pe...@pcc.me.uk
>To James:
>
>I think you need a way to keep .debug_info sections from an input file
>contiguous, i.e. not reordered with sections from another input file, right?
>
>--- Contribution from input file 0
>.debug_info header (sh_link=0 ??)
> .debug_info 1
> .debug_info 2
> .debug_info 3
>.debug_info footer (sh_link=0 ??)
>
>--- Contribution from input file 1
>.debug_info header (sh_link=0 ??)
> .debug_info 1
> .debug_info 2
> .debug_info 3
>.debug_info footer (sh_link=0 ??)
>
>If a section flag does not have ordering requirement, the regular
>section combination rule will naturally satisfy the file order.
>
>OK, your position pretty much makes a generic ELF flag a no-go now.
>
>If my previous .debug_info example can find its merit (if header/footer
>contribution can't be elegantly represented within the current
>SHF_LINK_ORDER framework), I guess we can still consider SHF_LLVM_ASSOCIATED
>(or SHF_GNU_ASSOCIATED if GNU people find it useful)

Let me be clearer about my take on James' needs.

Some metadata sections can benefit from being organized by file, rather than by
linked-to output section.

Fragmented .debug_info is one example. The file header/footer can apply
abbreviations to all metadata sections within the file. Things like
.stack_sizes, __patchable_function_entries theoretically can benefit from a
header as well. By scanning the content, we can look up the associated file
efficiently.

If we apply ordering requirement, the metadata sections will essentially be
ordered by linked-to output sections. The header/footer advantage will be lost.

Ali Bahrami

Jul 17, 2020, 6:08:10 PM7/17/20
to gener...@googlegroups.com
On 7/17/20 2:00 PM, 'Fangrui Song' via Generic System V Application Binary Interface wrote:
> Let me be clearer about my take of James' needs.
>
> Some metadata sections can benefit from being organized by file, rather than by
> linked-to output section.
>
> Fragmented .debug_info is one example. The file header/footer can apply
> abbreviations to all metadata sections within the file. Things like
> .stack_sizes, __patchable_function_entries theoretically can benefit from a
> header as well.  By scanning the content, we can look up the associated file
> efficiently.
>
> If we apply ordering requirement, the metadata sections will essentially be
> ordered by linked-to output sections. The header/footer advantage will be lost.


I'll throw this out as an experiment to consider. I hope I'm
understanding what you're after. If not, I'm probably wasting your
time, for which I apologize.

The scenario:

- There are input sections that get concatenated to the
output in the order in which they are seen in the input.

- There are associated metadata input sections that get
sorted on output, so that their order agrees with
the other set of sections. SHF_LINK_ORDER can do this.

- There are header and footer metadata sections that
we want placed first and last. These sections might
be based on those other sections, but they don't have
a data section to point SHF_LINK_ORDER at.

Suppose you linked these applications with a pair of CRT
objects, to deliver the header/footer. You arrange that the
link-editor always sees the first CRT before any other objects,
and that the end CRT follows them:

ld -o libfoo.so crtJamesBegin.o <arbitrary objects> crtJamesEnd.o ...

crtJamesBegin.o contains the header metadata, SHF_LINK'd to an
empty section of the other type.

crtJamesEnd.o contains the footer metadata, SHF_LINK'd to another
empty section of the other type.

The link-editor will concatenate the data sections in link order,
so the empty header and footer data sections end up at the top
and bottom, respectively. They're empty sections, so they don't
contribute anything to the output section. However, the order of
the metadata sections will track the order of these data sections,
so the header and footer metadata sections land at the top and
bottom of that section. We're sorting everything, but we've rigged
the game so that the header and footer sort into the right spots.
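
A minimal sketch of that arrangement, assuming a simplified model in which data sections are concatenated in link order and each metadata section sorts to the position of the data section it links to. The section names and the order_metadata helper are made up for the illustration.

```python
def order_metadata(data_sections, metadata):
    """Sort metadata sections by the link order of their linked-to data section.

    data_sections: data section names in the order the link-editor sees them.
    metadata: (metadata name, linked-to data section) pairs, in any order.
    """
    position = {name: i for i, name in enumerate(data_sections)}
    ordered = sorted(metadata, key=lambda pair: position[pair[1]])
    return [name for name, _ in ordered]


# crtJamesBegin.o and crtJamesEnd.o each contribute one empty data section.
data = ["begin.empty", "a.text", "b.text", "end.empty"]
meta = [
    ("meta for b", "b.text"),
    ("footer", "end.empty"),
    ("meta for a", "a.text"),
    ("header", "begin.empty"),
]
print(order_metadata(data, meta))
```

Note that this yields one header and one footer for the whole output section, not one pair per input file.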

???

- Ali

Fangrui Song

Jul 17, 2020, 7:32:22 PM7/17/20
to gener...@googlegroups.com
On 2020-07-17, Ali Bahrami wrote:
I believe James is in a European time zone, so he will not reply now :)
I hope I captured the idea sufficiently closely :)

The source program is:

inline int comdat() { return 0; }
int main() { return comdat(); }

The generated DWARF interleaved with my .section notation to mark where a section starts:

.section .debug_info,"",@progbits,unique,0 # llvm-mc and GNU as 2.35 syntax, allow multiple sections to have the same name
0x0000000c: DW_TAG_compile_unit
DW_AT_producer ("clang version 12.0.0")
DW_AT_language (DW_LANG_C_plus_plus_14)
DW_AT_name ("a.cc")
DW_AT_str_offsets_base (0x00000008)
DW_AT_stmt_list (0x00000000)
DW_AT_comp_dir ("/tmp/c")
DW_AT_low_pc (0x0000000000000000)
DW_AT_ranges (indexed (0x0) rangelist = 0x00000010
[0x0000000000000000, 0x000000000000001a)
[0x0000000000000000, 0x0000000000000008))
DW_AT_addr_base (0x00000008)
DW_AT_rnglists_base (0x0000000c)

.section .debug_info,"o",@progbits,.text.main # SHF_LINK_ORDER, linked-to .text.main
0x0000002b: DW_TAG_subprogram
DW_AT_low_pc (0x0000000000000000)
DW_AT_high_pc (0x000000000000001a)
DW_AT_frame_base (DW_OP_reg6 RBP)
DW_AT_name ("main")
DW_AT_decl_file ("/tmp/c/a.cc")
DW_AT_decl_line (3)
DW_AT_type (0x0000004a "int")
DW_AT_external (true)

.section .debug_info,"o",@progbits,.text._Z6comdatv # SHF_LINK_ORDER, linked-to .text._Z6comdatv
0x0000003a: DW_TAG_subprogram
DW_AT_low_pc (0x0000000000000000)
DW_AT_high_pc (0x0000000000000008)
DW_AT_frame_base (DW_OP_reg6 RBP)
DW_AT_linkage_name ("_Z6comdatv")
DW_AT_name ("comdat")
DW_AT_decl_file ("/tmp/c/a.cc")
DW_AT_decl_line (2)
DW_AT_type (0x0000004a "int")
DW_AT_external (true)

.section .debug_info,"",@progbits # this can be a regular unordered section
0x0000004a: DW_TAG_base_type
DW_AT_name ("int")
DW_AT_encoding (DW_ATE_signed)
DW_AT_byte_size (0x04)

.section .debug_info,"",@progbits,unique,1
# this is a footer. DW_TAG_compile_unit has the DW_CHILDREN_yes encoding
0x0000004e: NULL


For this object file, all SHF_LINK_ORDER contributions to .debug_info should sit between the header and the footer.
Moving a single .debug_info fragment into another object file's range would be an error.

This could be unreliable if .debug_info sections are suffixed and a crazy linker script does things like
.debug_info : { *(.debug_info.foo) *(.debug_info.bar) }
to intentionally move some .debug_info outside their header/footer.
All .debug_info sections have the same name, so a simple wildcard (which retains input order) will not break the use case. (Things like SORT_BY_ALIGNMENT can potentially break the ordering.)

This is where the ordering requirement makes things worse. With the ordering
requirement of SHF_LINK_ORDER, the header and footer of one object file will be
moved to either the beginning or the end, destroying the scheme.

The above is one potential use case of the hypothetical SHF_ASSOCIATED.

-----

A simpler example without a footer:

Ideal:
.meta
.meta header for file a (sh_link=0)
.meta data 1
.meta data 2
.meta header for file b (sh_link=0)
.meta data 3
.meta data 4

The "data" metadata sections can reference their header to get some
abbreviations, for example. There is no need to add a reference from .meta
data 1 to its header because the ordering implicitly assigns a header to each
data section.


With Solaris-style ordering requirement:

.meta
.meta data 1
.meta data 2
.meta data 3
.meta data 4
.meta header for file a (sh_link=0)
.meta header for file b (sh_link=0)

The abbreviation scheme is destroyed.
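
The two layouts can be reproduced with a small model. It assumes, as in the listing above, that sections with a linked-to section are sorted by its address and that sh_link=0 sections are placed after them; the dictionary layout and function names are only for illustration.

```python
def link_order(sections):
    """Sections with a linked-to address sort by it; sh_link=0 ones go last."""
    linked = sorted(
        (s for s in sections if s["link_addr"] is not None),
        key=lambda s: s["link_addr"],
    )
    unlinked = [s for s in sections if s["link_addr"] is None]
    return [s["name"] for s in linked + unlinked]


def implied_headers(order):
    """Assign each data section the nearest preceding header, if any."""
    current, result = None, {}
    for name in order:
        if name.startswith("header"):
            current = name
        else:
            result[name] = current
    return result


sections = [
    {"name": "header a", "link_addr": None},   # sh_link=0
    {"name": "data 1", "link_addr": 0x10},
    {"name": "data 2", "link_addr": 0x20},
    {"name": "header b", "link_addr": None},   # sh_link=0
    {"name": "data 3", "link_addr": 0x30},
    {"name": "data 4", "link_addr": 0x40},
]

ideal = [s["name"] for s in sections]          # plain input order
print(implied_headers(ideal))                  # every data section finds its header
print(implied_headers(link_order(sections)))   # no data section has a header
```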

Ali Bahrami

Jul 17, 2020, 7:51:27 PM7/17/20
to gener...@googlegroups.com
On 7/17/20 5:32 PM, 'Fangrui Song' via Generic System V Application Binary Interface wrote:
> For this object file, all SHF_LINK_ORDER contribution of .debug_info should be between a header and footer.
> Ordering a single .debug_info to another object file will be an error.
>
> This could be unreliable if .debug_info section are suffixed and a crazy linker script does things like
> .debug_info : { *(.debug_info.foo) *(.debug_info.bar) }
> to intentionally move some .debug_info outside their header/footer.
> All .debug_info have the same name. So simple wildcard (retaining input order) will not break the use case. (Things like SORT_BY_ALIGNMENT can potentially break the ordering)
>
> This is where the ordering requirement makes things worse.  With the ordering
> requirement of SHF_LINK_ORDER, the header and footer of one object file will be
> moved to either the beginning or the end, destroying the scheme.
>
> The above is one potential use case of the hypothetical SHF_ASSOCIATED.


I'm sorry, but I don't fully understand most of what you've
said there. I don't read "linker script" normally. I think
however, that you're saying that someone can break things by writing
a linker script. Of course, someone could blindly write a linker
script to trash a link, but surely that applies to many cases
other than this one. Isn't the answer "don't do that"?

I suggested that the header and footer should also be in
the SHF_LINK_ORDER bucket, and that you manipulate the order
of the sections they link to such that they end up at the
top and bottom respectively. If SHF_ASSOCIATED can work for
this, I don't really follow why SHF_LINK_ORDER can't also work,
since they're basically the same thing, modulo sorting.
init/fini sections use the crt approach to encapsulate those
sections, so it's a pretty standard way to approach things
like this.

I'll take your word for it though --- if it won't work, I
don't insist that it must.

- Ali

Fāng-ruì Sòng

Jul 17, 2020, 8:03:04 PM7/17/20
to Generic System V Application Binary Interface
On Fri, Jul 17, 2020 at 4:51 PM Ali Bahrami <Ali.B...@oracle.com> wrote:
>
> On 7/17/20 5:32 PM, 'Fangrui Song' via Generic System V Application Binary Interface wrote:
> > For this object file, all SHF_LINK_ORDER contribution of .debug_info should be between a header and footer.
> > Ordering a single .debug_info to another object file will be an error.
> >
> > This could be unreliable if .debug_info section are suffixed and a crazy linker script does things like
> > .debug_info : { *(.debug_info.foo) *(.debug_info.bar) }
> > to intentionally move some .debug_info outside their header/footer.
> > All .debug_info have the same name. So simple wildcard (retaining input order) will not break the use case. (Things like SORT_BY_ALIGNMENT can potentially break the ordering)
> >
> > This is where the ordering requirement makes things worse. With the ordering
> > requirement of SHF_LINK_ORDER, the header and footer of one object file will be
> > moved to either the beginning or the end, destroying the scheme.
> >
> > The above is one potential use case of the hypothetical SHF_ASSOCIATED.
>
>
> I'm sorry, but I don't fully understand most of what you've
> said there. I don't read "linker script" normally. I think
> however, that you're saying that someone can break things by writing
> a linker script. Of course, someone could blindly write a linker
> script to trash a link, but surely that applies to many cases
> other than this one. Isn't the answer "don't do that"?

It is indeed "don't do that". Sections of the same name will be a bit
more robust, though.
With GNU ld's linker script semantics, the innocent-looking
.debug_info : { *(.debug_info) *(.debug_info.*) } can sort sections in
an undesired way
(for instance, if both header and footer are named ".debug_info").

> I suggested that the header and footer should also be in
> the SHF_LINK_ORDER bucket, and that you manipulate the order
> of the sections they link to such that they end up at the
> top and bottom respectively. If SHF_ASSOCIATED can work for
> this, I don't really follow why SHF_LINK_ORDER can't also work.
> since they're basically the same thing, modulo sorting.
> init/fini sections use the crt approach to encapsulate those
> sections, so it's a pretty standard way to approach things
> like this.
>
> I'll take your word for it though --- if it won't work, I
> don't insist that it must.
>
> - Ali

Assigning SHF_LINK_ORDER to headers/footers will give us something like:

.debug_info (data 1 from input file a) # sh_link!=0
.debug_info (data 2 from input file a)
.debug_info (data 1 from input file b)
.debug_info (data 2 from input file b)
---------
.debug_info (header from input file a) # sh_link=0
.debug_info (header from input file b)
.debug_info (footer from input file a)
.debug_info (footer from input file b)

which is undesired. Such header/footer use case really needs metadata
sections coming from the same object file
contiguous in the output section:

i.e. both
.debug_info (header from input file a)
.debug_info (data 1 from input file a)
.debug_info (data 2 from input file a)
.debug_info (header from input file b)
and
.debug_info (header from input file a)
.debug_info (data 2 from input file a)
.debug_info (data 1 from input file a)
.debug_info (header from input file b)

are OK, but any other ordering is not acceptable. A lot of existing
metadata section use cases don't leverage a header for potential
.debug_abbrev style simplification,
but technically they can.

Fāng-ruì Sòng

Jul 17, 2020, 8:04:54 PM7/17/20
to Generic System V Application Binary Interface
Correction.

i.e. both
.debug_info (header from input file a)
.debug_info (data 1 from input file a)
.debug_info (data 2 from input file a)
.debug_info (footer from input file a)
and
.debug_info (header from input file a)
.debug_info (data 2 from input file a)
.debug_info (data 1 from input file a)
.debug_info (footer from input file a)
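
The acceptance rule above (each file's header, data, and footer contiguous, with the data fragments in any order) can be written down as a small checker. This is a sketch under the assumption that every file contributes exactly one header and one footer; the tuple encoding is made up for the example.

```python
def is_acceptable(layout):
    """layout: (file, kind) pairs, kind in {"header", "data", "footer"}."""
    finished = set()
    i, n = 0, len(layout)
    while i < n:
        file, kind = layout[i]
        # Each run must open with the header of a file not seen before.
        if kind != "header" or file in finished:
            return False
        i += 1
        # Any number of data fragments from the same file, in any order.
        while i < n and layout[i] == (file, "data"):
            i += 1
        # The run must close with that file's footer.
        if i == n or layout[i] != (file, "footer"):
            return False
        finished.add(file)
        i += 1
    return True


per_file = [
    ("a", "header"), ("a", "data"), ("a", "data"), ("a", "footer"),
    ("b", "header"), ("b", "data"), ("b", "footer"),
]
sorted_by_link = [
    ("a", "data"), ("a", "data"), ("b", "data"),
    ("a", "header"), ("b", "header"), ("a", "footer"), ("b", "footer"),
]
print(is_acceptable(per_file), is_acceptable(sorted_by_link))
```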

Alan Modra

Jul 17, 2020, 9:09:27 PM7/17/20
to gener...@googlegroups.com
On Fri, Jul 17, 2020 at 05:02:51PM -0700, 'Fāng-ruì Sòng' via Generic System V Application Binary Interface wrote:
> It is indeed "don't do that" Sections of the same name will be a bit
> more robust, though.
> With GNU ld's linker script semantics, the innocent looking
> .debug_info : { *(.debug_info) *(.debug_info.*) } can sort sections in
> an undesired way
> (for instance, if both header/footer are named ".debug_info")

Undesired? You'd write the script that way if you *wanted* all
.debug_info sections to be placed before sections matching
.debug_info.*

And
.debug_info : { *(.debug_info .debug_info.*) }
would place .debug_info and .debug_info.* sections in input order.
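
To make the difference between the two scripts concrete, here is a rough simulation. It assumes a simplified model of input section descriptions: each *(...) statement takes, in input order, every not-yet-placed section matching any of its patterns. It uses Python's fnmatch for the globbing and is not GNU ld's implementation.

```python
from fnmatch import fnmatchcase


def place(inputs, statements):
    """inputs: (section name, label) pairs in input order.
    statements: one list of glob patterns per *(...) statement."""
    placed, out = set(), []
    for patterns in statements:
        for idx, (name, label) in enumerate(inputs):
            if idx in placed:
                continue
            if any(fnmatchcase(name, p) for p in patterns):
                placed.add(idx)
                out.append(label)
    return out


inputs = [
    (".debug_info", "a header"),
    (".debug_info.foo", "a data"),
    (".debug_info", "a footer"),
    (".debug_info", "b header"),
    (".debug_info.bar", "b data"),
    (".debug_info", "b footer"),
]

# .debug_info : { *(.debug_info) *(.debug_info.*) }
print(place(inputs, [[".debug_info"], [".debug_info.*"]]))
# .debug_info : { *(.debug_info .debug_info.*) }
print(place(inputs, [[".debug_info", ".debug_info.*"]]))
```

The first form gathers all unsuffixed sections ahead of the suffixed ones; the second keeps input order.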

You can even place sections from certain files first or last. See GNU
ld script handling of .ctors and .dtors.

I haven't been following this discussion fully, but please don't
invent new ELF section flags in order to make random project linker
scripts work with fragmented debug info. If scripts need updating,
they need updating.

Also, SHT_GROUP without the comdat flag should work fine in GNU ld to
keep/discard sections as a group under --gc-sections.

--
Alan Modra
Australia Development Lab, IBM

James Henderson

Jul 20, 2020, 3:36:14 AM7/20/20
to gener...@googlegroups.com
I think Fangrui has more or less captured my example use-case, and highlighted more concretely the problems with ordering (there's other overhead to do with the sort cost, but I doubt it's significant). To highlight a point: the section may not all be header/footer/linked data, but the order may still need preserving. For example, DWARF .debug_info can contain "DW_TAG_namespace" instances which indicate that the contained DW_TAG_subprogram and DW_TAG_variables are inside the corresponding namespace. If the common namespace block were to be moved to a different location than expected, the DWARF would no longer be correct. Example:

# input.o:
<table header> [common]
- DW_TAG_compile_unit [common]
-- DW_TAG_variable [.data.foo]
-- DW_TAG_namespace [common]
--- DW_TAG_subprogram [.text.bar]
--- DW_TAG_variable [.data.baz]
<table footer> [common]

I'm not sure this can be resolved at all using SHF_LINK_ORDER, even with dummy sections for the common bits to refer to and preserve ordering, because the two .data-related blocks cannot be placed consecutively in the output section without breaking semantics. [Aside: A SHF_LINK_ORDER-linked dummy section would be unreferenced and therefore thrown away by --gc-sections which in turn would cause the corresponding .debug_info common block to be discarded too, but there might be a way around that].

I've been giving the SHT_GROUP idea further thought, and I think it is worth investigating at least before we start adding an extra flag, whether in LLVM/GNU/the gABI. At least with debug data, there are actually going to be multiple fragments that need grouping with a text section - .debug_info, .debug_line, .debug_ranges/rnglists, .debug_loc/.debug_loclists and .debug_aranges all could contain relevant information that could be fragmented - so the overall impact of an extra section header may not be so painful after all. If I get a chance, I will try to look further at this too.


Fangrui Song

Aug 6, 2020, 1:33:15 AM8/6/20
to gener...@googlegroups.com, Nelson Chu, binu...@sourceware.org
A short update about what some LLVM folks have done on SHF_LINK_ORDER:

* New syntax was added to support SHF_LINK_ORDER with sh_link=0 (literal 0 in the symbol position)
(https://reviews.llvm.org/D72899 )
.section linkorder,"ao",@progbits,0
GNU as feature request: https://sourceware.org/bugzilla/show_bug.cgi?id=26253
Relocatable links and LTO dead stripping are two things which can lead to sh_link=0

* LLD allows mixed sh_link!=0 and sh_link=0 components now.
(https://reviews.llvm.org/D72904 )
GNU ld feature request: https://sourceware.org/bugzilla/show_bug.cgi?id=26256

* To allow "upgrade" (add the SHF_LINK_ORDER flag to an existing
non-SHF_LINK_ORDER section, and allow linking of mixed old and new objects):
Mixed non-SHF_LINK_ORDER and SHF_LINK_ORDER components should be allowed.
Pending (promising): https://reviews.llvm.org/D84001
GNU ld feature request: https://sourceware.org/bugzilla/show_bug.cgi?id=16833
(A Solaris developer requested this in 2014)

(This can be useful to RISC-V .gcc_except_table, so I am CCing Nelson
here: https://reviews.llvm.org/D83655
GCC uses .gcc_except_table.function_name in a COMDAT group,
the 'function_name' part is however a bit tricky to implement in clang.
It also wastes .strtab space.

The "o" key, similar to the "G" key, can differentiate two sections
.section .gcc_except_table,"ao",@progbits,.Z3foov
.section .gcc_except_table,"ao",@progbits,.Z3barv
)

Dancing with overloaded semantics of SHF_LINK_ORDER is not the most elegant
thing, but in the absence of a new section flag we have to do this kind of thing
(some are probably already done by Solaris ld). Designing a new section flag
may still be considered premature currently. We might not be able to sort out
all the implications (https://reviews.llvm.org/D72904#2192452 )

Playing with GC semantics+SHF_LINK_ORDER may not be favored by some ELF purists
("Why not section group?") but the object file size factor cannot really be neglected.
Sanitizer folks will continue leveraging SHF_LINK_ORDER.