Re: [PATCH] ld: entry size and merge/strings attributes propagation

H.J. Lu

Aug 19, 2025, 1:54:14 PM
to Jan Beulich, Generic System V Application Binary Interface, Binutils, Sam James
On Tue, Aug 19, 2025 at 10:31 AM H.J. Lu <hjl....@gmail.com> wrote:
>
> On Tue, Aug 19, 2025 at 1:36 AM Jan Beulich <jbeu...@suse.com> wrote:
> >
> > PR ld/33291
> >
> > As indicated in other recent commits, the three properties can be
> > largely independent (ELF generally being the target here): Entry size
> > doesn't require either of merge/strings, and strings also doesn't
> > require merge. Commit 98e6d3f5bd4e ("gas/ELF: allow specifying entity
> > size for arbitrary sections") uncovered issues with ld's handling.
> >
> > Zap entry size when it doesn't match between input sections. In that
> > case SEC_MERGE and SEC_STRINGS also need to be removed, as their
> > underlying granularity is lost. Then deal with SEC_MERGE and
> > SEC_STRINGS separately.
> >
> > Otoh record entry size from the first input independent of SEC_MERGE.
> > ---
> > The handling of the three attributes still isn't correct when it comes
> > to data allocation statements within the section, or position changes
> > (including alignment other than at the start): These would all need to
> > clear entry size (for not coming with an entry size themselves), and
> > hence also SEC_MERGE and SEC_STRINGS.
> >
> > --- a/ld/ldlang.c
> > +++ b/ld/ldlang.c
> > @@ -2857,14 +2857,24 @@ lang_add_section (lang_statement_list_ty
> >         /* Only set SEC_READONLY flag on the first input section. */
> >         flags &= ~ SEC_READONLY;
> >
> > -       /* Keep SEC_MERGE and SEC_STRINGS only if they are the same. */
> > -       if ((output->bfd_section->flags & (SEC_MERGE | SEC_STRINGS))
> > -           != (flags & (SEC_MERGE | SEC_STRINGS))
> > -           || ((flags & SEC_MERGE) != 0
> > -               && output->bfd_section->entsize != section->entsize))
> > +       /* Keep entry size, SEC_MERGE, and SEC_STRINGS only if entry sizes are
> > +          the same. */
> > +       if (output->bfd_section->entsize != section->entsize)
> >           {
> > -           output->bfd_section->flags &= ~ (SEC_MERGE | SEC_STRINGS);
> > -           flags &= ~ (SEC_MERGE | SEC_STRINGS);
> > +           output->bfd_section->entsize = 0;
> > +           flags &= ~(SEC_MERGE | SEC_STRINGS);
> > +         }
> > +
> > +       /* Keep SEC_MERGE and SEC_STRINGS (each) only if they are the same. */
> > +       if ((output->bfd_section->flags ^ flags) & SEC_MERGE)
> > +         {
> > +           output->bfd_section->flags &= ~SEC_MERGE;
> > +           flags &= ~SEC_MERGE;
> > +         }
> > +       if ((output->bfd_section->flags ^ flags) & SEC_STRINGS)
> > +         {
> > +           output->bfd_section->flags &= ~SEC_STRINGS;
> > +           flags &= ~SEC_STRINGS;
> >           }
> >       }
> >     output->bfd_section->flags |= flags;
> > @@ -2879,8 +2889,7 @@ lang_add_section (lang_statement_list_ty
> >                                    link_info.output_bfd,
> >                                    output->bfd_section,
> >                                    &link_info);
> > -       if ((flags & SEC_MERGE) != 0)
> > -         output->bfd_section->entsize = section->entsize;
> > +       output->bfd_section->entsize = section->entsize;
> >       }
> >
> >     if ((flags & SEC_TIC54X_BLOCK) != 0
>
> A testcase is needed to verify that ld does the right thing.
>
> --
> H.J.
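
The flag handling in the patch quoted above can be modelled outside of ld. What follows is a minimal sketch, not the ld code itself: F_MERGE, F_STRINGS, struct sec_model, its has_input field, and model_add_section are hypothetical stand-ins for SEC_MERGE, SEC_STRINGS, the output section, and lang_add_section, and everything else that function does is left out.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for BFD's SEC_MERGE and SEC_STRINGS bits.  */
enum { F_MERGE = 1, F_STRINGS = 2 };

/* Hypothetical, stripped-down model of an output section.  */
struct sec_model
{
  unsigned int flags;
  unsigned int entsize;
  bool has_input;   /* Assumed marker: has any input been added yet?  */
};

/* Add one input section (given by its flags and entry size) to OUT,
   following the order of checks in the quoted patch.  */
void
model_add_section (struct sec_model *out, unsigned int flags,
                   unsigned int entsize)
{
  if (out->has_input)
    {
      /* Mismatching entry sizes: drop the entry size, and with it the
         merge/strings granularity of the incoming section.  */
      if (out->entsize != entsize)
        {
          out->entsize = 0;
          flags &= ~(unsigned int) (F_MERGE | F_STRINGS);
        }

      /* Keep each of the two flags only if both sides agree on it.  */
      if ((out->flags ^ flags) & F_MERGE)
        {
          out->flags &= ~(unsigned int) F_MERGE;
          flags &= ~(unsigned int) F_MERGE;
        }
      if ((out->flags ^ flags) & F_STRINGS)
        {
          out->flags &= ~(unsigned int) F_STRINGS;
          flags &= ~(unsigned int) F_STRINGS;
        }
    }
  out->flags |= flags;

  /* The first input records its entry size regardless of F_MERGE.  */
  if (!out->has_input)
    {
      out->has_input = true;
      out->entsize = entsize;
    }
}
```

Under this model, inputs that agree on entry size and flags keep all three properties, an input with a different entry size clears the entry size and both flags, and an entry size survives on its own when neither flag is set.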

gABI has

sh_entsize
Some sections hold a table of fixed-size entries, such as a symbol
table. For such a section, this member gives the size in bytes of each
entry. The member contains 0 if the section does not hold a table of
fixed-size entries.

SHF_MERGE
The data in the section may be merged to eliminate duplication. Unless
the SHF_STRINGS flag is also set, the data elements in the section are
of a uniform size. The size of each element is specified in the
section header's sh_entsize field. If the SHF_STRINGS flag is also
set, the data elements consist of null-terminated character strings.
The size of each character is specified in the section header's
sh_entsize field.
Each element in the section is compared against other elements in
sections with the same name, type and flags. Elements that would have
identical values at program run-time may be merged. Relocations
referencing elements of such sections must be resolved to the merged
locations of the referenced values. Note that any relocatable values,
including values that would result in run-time relocations, must be
analyzed to determine whether the run-time values would actually be
identical. An ABI-conforming object file may not depend on specific
elements being merged, and an ABI-conforming link editor may choose
not to merge specific elements.

SHF_STRINGS
The data elements in the section consist of null-terminated character
strings. The size of each character is specified in the section
header's sh_entsize field.

If sh_entsize != 0 and SHF_MERGE or SHF_STRINGS bits aren't set, ELF tools
may not work properly. I think

commit 98e6d3f5bd4e7e3cbd2718151cc54692f6740b65
Author: Jan Beulich <jbeu...@suse.com>
AuthorDate: Fri Aug 15 12:19:59 2025 +0200
Commit: Jan Beulich <jbeu...@suse.com>
CommitDate: Fri Aug 15 12:19:59 2025 +0200

gas/ELF: allow specifying entity size for arbitrary sections

The spec doesn't tie entity size to just SHF_MERGE and SHF_STRINGS
sections. Introduce a new "section letter" 'E' to allow recording (and
checking) of entity size even without 'M' or 'S'.

should be reverted. If we want to allow non-0 entity size for arbitrary
sections, it should be discussed at gABI group first:

https://groups.google.com/g/generic-abi

which is CCed.

--
H.J.
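
To make the quoted SHF_STRINGS wording concrete: sh_entsize there is the width of one character, not the size of one string. The following is a small sketch under that reading; count_strings and BAD_STRINGS are hypothetical names, and the function works on a plain byte buffer rather than on a real ELF file.

```c
#include <stddef.h>

/* Returned when the buffer cannot be a well-formed string table.  */
#define BAD_STRINGS ((size_t) -1)

/* Count the null-terminated strings in DATA (SIZE bytes), where each
   character is ENTSIZE bytes wide.  A terminator is a character whose
   bytes are all zero.  Returns BAD_STRINGS if ENTSIZE is zero, SIZE is
   not a multiple of ENTSIZE, or the last string is unterminated.  */
size_t
count_strings (const unsigned char *data, size_t size, size_t entsize)
{
  if (entsize == 0 || size % entsize != 0)
    return BAD_STRINGS;

  size_t count = 0;
  int in_string = 0;

  for (size_t off = 0; off < size; off += entsize)
    {
      int is_nul = 1;
      for (size_t i = 0; i < entsize; i++)
        if (data[off + i] != 0)
          {
            is_nul = 0;
            break;
          }

      if (is_nul)
        {
          count++;          /* A terminator closes one string.  */
          in_string = 0;
        }
      else
        in_string = 1;      /* Inside a not yet terminated string.  */
    }

  return in_string ? BAD_STRINGS : count;
}
```

With a character width of 1 this walks ordinary C strings; with a width of 2 or 4 it walks wide-character strings, which is why the same field can describe both.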

Cary Coutant

Aug 19, 2025, 9:50:38 PM
to gener...@googlegroups.com, Jan Beulich, Binutils, Sam James
> gABI has
>
> sh_entsize
> Some sections hold a table of fixed-size entries, such as a symbol
> table. For such a section, this member gives the size in bytes of each
> entry. The member contains 0 if the section does not hold a table of
> fixed-size entries.
> ...
> If sh_entsize != 0 and SHF_MERGE or SHF_STRINGS bits aren't set, ELF tools
> may not work properly. I think
> ...
> should be reverted. If we want to allow non-0 entity size for arbitrary
> sections, it should be discussed at gABI group first:

If you're thinking that the ELF spec doesn't allow non-zero sh_entsize
for anything other than merge sections, I think you've misread the
spec. In fact, the text you quoted from the spec gives a specific
example of a non-merge section that uses sh_entsize: "such as a symbol
table."

The sh_entsize field has been part of ELF from the beginning, and was
meant to provide the entry size for any section that might have
fixed-size entries, whether special sections — like symbol tables,
relocations, dynamic sections — or ordinary PROGBITS sections. Since
the section header table and program header table are not themselves
sections, they have entsize fields in the ELF header. The original
purpose for the field was to allow for extensibility — theoretically,
one could extend a structure with extra fields, and the entsize field
would in effect "version" that structure. In practice, we have never
extended an ELF structure this way.

Merge sections were added much later (ca. 1999), and the use of the
sh_entsize field to describe the fixed-size entries or (for strings)
the fixed-size character width was natural. In effect, a constant
merge section is a table of fixed-size entries. A string merge section
is more of a stretch — it's really a table of variable-sized strings,
but at a lower level, it is also kind of a table of fixed-size
characters.

For other sections, sh_entsize can certainly be used for any
structured program data, such as unwind tables or other metadata.

If any tools can't handle a non-zero sh_entsize field for non-merge
sections, they should be fixed. Unless they care about the structure
and contents of the data in those sections, they shouldn't even care
about the value of sh_entsize. If they do care about the structure and
contents, they could use that field as a versioning mechanism (or at
least a sanity check that the producer isn't out of sync with the
consumer).

-cary
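
The "sanity check" use of sh_entsize described above could look roughly like the following sketch. It assumes a consumer that owns some metadata section made of fixed-size records; struct meta_record, enum entsize_check, and check_entsize are hypothetical names invented for illustration and do not come from any existing tool.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record layout the consumer was built against.  */
struct meta_record
{
  uint32_t key;
  uint32_t value;
};

enum entsize_check
{
  ENTSIZE_OK,        /* Producer and consumer agree on the record size.  */
  ENTSIZE_UNKNOWN,   /* sh_entsize is 0: the producer made no claim.  */
  ENTSIZE_MISMATCH   /* Producer used a different record layout.  */
};

/* Compare the section's sh_entsize with the record size this consumer
   expects.  */
enum entsize_check
check_entsize (uint64_t sh_entsize)
{
  if (sh_entsize == 0)
    return ENTSIZE_UNKNOWN;
  if (sh_entsize == sizeof (struct meta_record))
    return ENTSIZE_OK;
  return ENTSIZE_MISMATCH;
}
```

A consumer written this way can refuse or downgrade gracefully on ENTSIZE_MISMATCH instead of misparsing the data, while a tool that has no interest in the section never needs to look at the field.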

H.J. Lu

Aug 19, 2025, 10:08:02 PM
to gener...@googlegroups.com, Jan Beulich, Binutils, Sam James
The section in question is .rodata. Some tools don't work with non-zero
sh_entsize on .rodata. How should a tool interpret non-zero sh_entsize
on these special sections?

--
H.J.

Michael Matz

Aug 20, 2025, 10:17:31 AM
to H.J. Lu, gener...@googlegroups.com, Jan Beulich, Binutils, Sam James
Hello,

On Tue, 19 Aug 2025, H.J. Lu wrote:

> The section in question is .rodata. Some tools don't work with non-zero
> sh_entsize on .rodata.

Define "some tools", and define "don't work". Those tools need
fixing then.

> How should a tool interpret non-zero sh_entsize on these special
> sections?

How is .rodata special? entsize!=0 doesn't have much meaning for (e.g.)
the link editor's default section handling (cat'ing them together), or in
fact for most of the usual binary tools, at least not beyond the fact that
it should divide the section's size and be compatible with alignment. But
it may have meaning for something else, maybe for the program containing
them.

As a QoI issue, if all input sections have the same entsize then it'd
be nice if the output section has that as well. Obviously if the inputs
disagree on the entsize the entsize of the output isn't known anymore and
must be set to zero (or, if the inputs were in some way special, this
could be diagnosed).

For real special sections (e.g. symbol tables), the entsize does have
meaning (and should perhaps be checked for consistency), but then whatever
tools handle them already do handle them specially.

So tools should handle entsize "naturally": try to conserve it from inputs
to outputs, and zero them out if that's not obviously possible. Any
special sections that don't have linker-cat semantics (e.g. MERGE, STRING
sections, or non SHT_PROGBITS): those may of course specify additional
constraints on entsize, which the tools then have to obey.


Ciao,
Michael.

H.J. Lu

Aug 20, 2025, 10:23:14 AM
to Michael Matz, gener...@googlegroups.com, Jan Beulich, Binutils, Sam James
On Wed, Aug 20, 2025 at 7:17 AM Michael Matz <ma...@suse.de> wrote:
>
> Hello,
>
> On Tue, 19 Aug 2025, H.J. Lu wrote:
>
> > The section in question is .rodata. Some tools don't work with non-zero
> > sh_entsize on .rodata.
>
> Define "some tools", and define "don't work". Those tools need

https://sourceware.org/bugzilla/show_bug.cgi?id=33291

> fixing then.

When someone updates binutils, all of a sudden eu-strip stops working.
Is this a good idea? I can understand changing sh_entsize on some new
sections. But changing it on existing special sections is a bad idea.

> > How should a tool interpret non-zero sh_entsize on these special
> > sections?
>
> How is .rodata special? entsize!=0 doesn't have much meaning for (e.g.)
> the link editor's default section handling (cat'ing them together), or in
> fact for most of the usual binary tools, at least not beyond the fact that
> it should divide the section's size and be compatible with alignment. But
> it may have meaning for something else, maybe for the program containing
> them.
>
> As a QoI issue, if all input sections have the same entsize then it'd
> be nice if the output section has that as well. Obviously if the inputs
> disagree on the entsize the entsize of the output isn't known anymore and
> must be set to zero (or, if the inputs were in some way special, this
> could be diagnosed).
>
> For real special sections (e.g. symbol tables), the entsize does have
> meaning (and should perhaps be checked for consistency), but then whatever
> tools handle them already do handle them specially.
>
> So tools should handle entsize "naturally": try to conserve it from inputs
> to outputs, and zero them out if that's not obviously possible. Any
> special sections that don't have linker-cat semantics (e.g. MERGE, STRING
> sections, or non SHT_PROGBITS): those may of course specify additional
> constraints on entsize, which the tools then have to obey.
>
>
> Ciao,
> Michael.



--
H.J.

H.J. Lu

Aug 20, 2025, 10:29:43 AM
to Michael Matz, gener...@googlegroups.com, Jan Beulich, Binutils, Sam James
On Wed, Aug 20, 2025 at 7:22 AM H.J. Lu <hjl....@gmail.com> wrote:
>
> On Wed, Aug 20, 2025 at 7:17 AM Michael Matz <ma...@suse.de> wrote:
> >
> > Hello,
> >
> > On Tue, 19 Aug 2025, H.J. Lu wrote:
> >
> > > The section in question is .rodata. Some tools don't work with non-zero
> > > sh_entsize on .rodata.
> >
> > Define "some tools", and define "don't work". Those tools need
>
> https://sourceware.org/bugzilla/show_bug.cgi?id=33291
>
> > fixing then.
>
> When someone updates binutils, all of a sudden eu-strip stops working.
> Is this a good idea? I can understand changing sh_entsize on some new
> sections. But changing it on existing special sections is a bad idea.

The output .rodata section has all kinds of data. Some input sections
may contain strings which have non-zero sh_entsize. It doesn't make
sense to set the output sh_entsize to some sh_entsize value in an input
string section.
--
H.J.

Michael Matz

Aug 20, 2025, 10:38:03 AM
to H.J. Lu, gener...@googlegroups.com, Jan Beulich, Binutils, Sam James
Hello,

On Wed, 20 Aug 2025, H.J. Lu wrote:

> > On Tue, 19 Aug 2025, H.J. Lu wrote:
> >
> > > The section in question is .rodata. Some tools don't work with
> > > non-zero sh_entsize on .rodata.
> >
> > Define "some tools", and define "don't work". Those tools need
>
> https://sourceware.org/bugzilla/show_bug.cgi?id=33291
>
> > fixing then.
>
> When someone updates binutils, all of a sudden eu-strip stops working.
> Is this a good idea? I can understand changing sh_entsize on some new
> sections. But changing it on existing special sections is a bad idea.

Again: what is special here? And well, yes, breaking eu-strip isn't nice,
but updating binutils always comes with minor issues here and there, and
may require updating other tools as well. Here: eu-strip should be fixed
and there's even enough time until binutils with this change is released.


Ciao,
Michael.

Michael Matz

Aug 20, 2025, 10:46:49 AM
to H.J. Lu, gener...@googlegroups.com, Jan Beulich, Binutils, Sam James
Hello,

On Wed, 20 Aug 2025, H.J. Lu wrote:

> > > > The section in question is .rodata. Some tools don't work with non-zero
> > > > sh_entsize on .rodata.
> > >
> > > Define "some tools", and define "don't work". Those tools need
> >
> > https://sourceware.org/bugzilla/show_bug.cgi?id=33291
> >
> > > fixing then.
> >
> > When someone updates binutils, all of a sudden eu-strip stops working.
> > Is this a good idea? I can understand changing sh_entsize on some new
> > sections. But changing it on existing special sections is a bad idea.
>
> The output .rodata section has all kinds of data. Some input sections
> may contain strings which have non-zero sh_entsize. It doesn't make
> sense to set the output sh_entsize to some sh_entsize value in an input
> string section.

That is just broken handling (in binutils) of input inconsistency (that
seemingly is addressed by the patches in that PR): if one
of the inputs has zero entsize, then that means "unknown", not "unset".
It can't be upgraded to some non-zero entsize (claiming the entsize now
magically is known), just because other input sections contain non-zero
entsizes.

So, a .rodata output that consists of an entsize=0 .rodata, and entsize!=0
.strings/.const.data must have entsize=0.

But that is orthogonal to the question if a .rodata (or anything else) can
or can not have entsize!=0 at all. It _can_.


Ciao,
Michael.
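
The rule stated above (zero means "unknown", and any disagreement among the inputs makes the output unknown too) can be written as a small fold over the input entry sizes. This is only a sketch of the rule as described in this message; output_entsize is a hypothetical helper, not a function in binutils.

```c
#include <stddef.h>
#include <stdint.h>

/* Compute the entry size of an output section from the entry sizes of
   its N input sections.  The result is non-zero only if every input
   carries the same non-zero value; an input with entsize 0 ("unknown")
   therefore forces the output to 0 as well.  */
uint64_t
output_entsize (const uint64_t *inputs, size_t n)
{
  if (n == 0)
    return 0;

  uint64_t result = inputs[0];
  for (size_t i = 1; i < n; i++)
    if (inputs[i] != result)
      return 0;   /* Inputs disagree: the output entsize is unknown.  */

  return result;
}
```

For the example in this message, an entsize=0 .rodata combined with entsize!=0 string or constant sections yields 0, while a set of inputs that all say 8 keeps 8.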

Cary Coutant

Aug 20, 2025, 12:18:38 PM
to gener...@googlegroups.com, Jan Beulich, Binutils, Sam James
> The section in question is .rodata. Some tools don't work with non-zero
> sh_entsize on .rodata. How should a tool interpret non-zero sh_entsize
> on these special sections?

Well, that's unusual. By setting sh_entsize on the section, the
compiler (or assembler) is claiming that the section has a table of
fixed-size records. Perhaps that's true, but it seems unlikely. Maybe
it's the case for one single contribution to the .rodata section, but
it's highly unlikely to remain true for the combined output section.
The question is who is the compiler trying to inform about the
fixed-size nature of the contents? The linker doesn't care, so it's
not really its job to validate or enforce. Perhaps there's a pre-link
or post-link tool that could make use of that information. More
likely, I wonder if someone misinterpreted the ELF spec in the other
direction to infer that if a section has a table of fixed-length
records, then it *must* set sh_entsize. That's no more true than your
assertion that it must be 0 except for merge sections.

Anyway, I suppose your question is what is the linker supposed to do
with it? If the linker encounters multiple .rodata sections, all with
the same sh_entsize, then just preserve the value. If they don't all
have the same value, you have two choices: set the output sh_entsize
to 0, or refuse to combine sections with mismatched values. I'd
suggest the former.

What other tools are not working because of this? Why would a tool
care what the sh_entsize value is on a section that has no special
significance to that tool? (Answer: it shouldn't.)

-cary

Mark Wielaard

Aug 20, 2025, 4:05:14 PM
to Sam James, Cary Coutant, gener...@googlegroups.com, Jan Beulich, Binutils
Hi Sam,

On Wed, Aug 20, 2025 at 05:49:33PM +0100, Sam James wrote:
> Cary Coutant <ccou...@gmail.com> writes:
>
> > [...]
> >
> > What other tools are not working because of this? Why would a tool
> > care what the sh_entsize value is on a section that has no special
> > significance to that tool? (Answer: it shouldn't.)
>
> I'm not (yet) aware of anything else breaking, but I can't really go
> looking for that until elfutils is fixed.

I am not sure elfutils is broken. It handles sections with non-zero
sh_entsize just fine. But it does contain a sanity check [*] that if
entsize is set then the (uncompressed) section data contains a table
of fixed-[sh_ent]size records. Specifically it checks sh_size modulo
sh_entsize is zero. But I think that is a reasonable interpretation of
the spec.

Cheers,

Mark

[*] If you really want to you can use elf_flagelf ELF_F_PERMISSIVE or
eu-strip --permissive to disable the sanity check to handle slightly
broken ELF files.
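
The check described above (sh_size modulo sh_entsize is zero) amounts to something like the following. This is an illustrative sketch only, not the elfutils source; entsize_divides_size is a hypothetical name, and treating sh_entsize == 0 as "nothing to check" is an assumption based on the gABI wording that 0 means the section holds no table of fixed-size entries.

```c
#include <stdbool.h>
#include <stdint.h>

/* Return true if a section of SH_SIZE bytes can be a table of records
   of SH_ENTSIZE bytes each.  An entry size of 0 makes no claim about
   the contents, so it always passes.  */
bool
entsize_divides_size (uint64_t sh_size, uint64_t sh_entsize)
{
  if (sh_entsize == 0)
    return true;
  return sh_size % sh_entsize == 0;
}
```

An output section that inherits a non-zero entry size from one input while also containing unrelated data from other inputs is the kind of case where this test can fail.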