Rationale for the first phase of section combination

Rafael Avila de Espindola

Jan 9, 2017, 4:41:40 PM
to gener...@googlegroups.com
Sorry if anyone is getting a duplicate. There was some issue with the
mailing list before.


The spec says

----------------------------------------------------------------
In the first phase, input sections that match in name, type and
attribute flags should be concatenated into single sections.
----------------------------------------------------------------

This is referring to unknown sections, but it is recommended for known
sections too.

It is clear that some flags have to be ignored for section merging. At
the very least SHF_GROUP and SHF_COMPRESSED have to be ignored. We
should not have two output .text sections just because one was in a
group and another was not for example.

Different linkers disagree on what other flags they ignore. LLD so far
ignores just those flags. Gold also seems to ignore SHF_WRITE and
SHF_EXECINSTR, so that a read only .foo section and a read write .foo
section are concatenated.
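
To make the difference concrete, here is a minimal C sketch of the comparison. The struct, the function, and the two mask names are hypothetical and are not taken from either linker's source; only the SHF_* values come from the gABI. It assumes the only thing that varies between the two behaviours is the set of ignored flags.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Flag values as defined by the gABI (same values as in <elf.h>). */
#define SHF_WRITE      0x1u
#define SHF_EXECINSTR  0x4u
#define SHF_GROUP      0x200u
#define SHF_COMPRESSED 0x800u

/* Hypothetical, simplified view of an input section header. */
struct input_section {
    const char *name;
    uint32_t    type;   /* sh_type */
    uint64_t    flags;  /* sh_flags */
};

/* Flags ignored when matching sections, per the two behaviours described
 * above. These masks are an illustration, not either linker's real code. */
static const uint64_t IGNORE_LLD_LIKE  = SHF_GROUP | SHF_COMPRESSED;
static const uint64_t IGNORE_GOLD_LIKE = SHF_GROUP | SHF_COMPRESSED |
                                         SHF_WRITE | SHF_EXECINSTR;

/* True if a and b would land in the same output section when the flags
 * in `ignored` are left out of the comparison. */
static bool same_output_section(const struct input_section *a,
                                const struct input_section *b,
                                uint64_t ignored)
{
    return strcmp(a->name, b->name) == 0 &&
           a->type == b->type &&
           (a->flags & ~ignored) == (b->flags & ~ignored);
}
```

With a read only .foo and a read write .foo of the same type, the first mask keeps them apart and the second one concatenates them.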

Merging sections with different flags is expected by some users. One
reason is that if one file has

int *const bar __attribute__((section(".foo"))) = (int *)0;

gcc with -fPIC will produce a read only .foo section. But if another
file has

int zed;
int *const bar __attribute__((section(".foo"))) = (int *)&zed;

gcc with -fPIC will produce a read write section.

So, what is the rationale for using the flags and type (instead of just
the name) for deciding which sections should be concatenated? Should the
spec explicitly list the flags that should be ignored?

Thanks,
Rafael

Cary Coutant

Jan 9, 2017, 8:54:22 PM
to gener...@googlegroups.com
> Sorry if anyone is getting a duplicate. There was some issue with the
> mailing list before.

I sent the following privately in an earlier reply to Rafael. He asked
me to forward my reply to this list...

-cary


---------- Forwarded message ----------
From: Cary Coutant <ccou...@gmail.com>
Date: Mon, Jan 2, 2017 at 4:24 PM
Subject: Re: Rationale for the first phase of section combination
To: Rafael Espíndola <rafael.e...@gmail.com>

[...]

> The spec says
>
> ----------------------------------------------------------------
> In the first phase, input sections that match in name, type and
> attribute flags should be concatenated into single sections.
> ----------------------------------------------------------------
>
> This is referring to unknown sections, but it is recommended for known
> sections too.

Yes, but that whole section was a late addition to the gABI back
around 1999-2000 when HP, Sun, IBM, SGI, Intel, and a few others were
all working on it together (mostly as part of the Itanium work).
Frankly, I don't think the wording ever really got the scrutiny it
deserved, and I wouldn't treat it as particularly authoritative. We
had recently added a few new section types, and were worried about how
linkers should process any section types they didn't already know
about; the only real thing to take from that section is that if the
NONCONFORMING flag is set, they should complain; otherwise, they
should just treat them as if they were PROGBITS. I don't even remember
adding the part about recommending that for known sections.
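
As a rough sketch of that rule for unknown section types (assuming the flag meant here is SHF_OS_NONCONFORMING; the enum and function names are made up for illustration):

```c
#include <stdint.h>

/* Value as defined by the gABI. */
#define SHF_OS_NONCONFORMING 0x100u

/* Hypothetical outcome for a section whose sh_type the linker does not know. */
enum unknown_action {
    TREAT_AS_PROGBITS,  /* process it like an ordinary PROGBITS section */
    REJECT              /* complain: special handling is required */
};

/* Decide what to do with a section of unknown type from its sh_flags. */
static enum unknown_action handle_unknown_section(uint64_t sh_flags)
{
    return (sh_flags & SHF_OS_NONCONFORMING) ? REJECT : TREAT_AS_PROGBITS;
}
```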

It was never the intent of the gABI to specify the details of how
linkers should combine sections; different platforms all had their
quirks, and there was no way we'd ever have reached any real
agreement.

If a linker does in fact want to combine only sections with like
flags, that's certainly fine, but there's no real value in doing that
unless the output sections are placed in separate segments --
ultimately, all sections within the same segment are going to get the
same access permissions. It would be unexpected (but not invalid) to
see an output file with a .data section in the text segment, and
another .data section in the data segment.

> It is clear that some flags have to be ignored for section merging. At
> the very least SHF_GROUP and SHF_COMPRESSED have to be ignored. We
> should not have two output .text sections just because one was in a
> group and another was not for example.
>
> Different linkers disagree on what other flags they ignore. LLD so far
> ignores just those flags. Gold also seems to ignore SHF_WRITE and
> SHF_EXECINSTR, so that a read only .foo section and a read write .foo
> section are concatenated.
>
> Merging sections with different flags is expected by some users. One
> reason is that if one file has
>
> int *const bar __attribute__((section(".foo"))) = (int *)0;
>
> gcc with -fPIC will produce a read only .foo section. But if another
> file has
>
> int zed;
> int *const bar __attribute__((section(".foo"))) = (int *)&zed;
>
> gcc with -fPIC will produce a read write section.
>
> So, what is the rationale for using the flags and type (instead of just
> the name) for deciding which sections should be concatenated? Should the
> spec explicitly list the flags that should be ignored?

At least for WRITE and EXECINSTR, there's no real reason not to merge
sections unless you're going to put them into separate segments, and
if you want them in different segments, they should have different
names. If you do merge them, the output section should have the
inclusive-or of these flags. As you suspect, part of the reason most
linkers don't follow the recommendation you quoted is most likely
inconsistencies among compilers.
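
A minimal sketch of the inclusive-or, assuming the input flags are available as a plain array (the function name is hypothetical, and only the two access flags are handled here):

```c
#include <stddef.h>
#include <stdint.h>

/* Flag values as defined by the gABI. */
#define SHF_WRITE     0x1u
#define SHF_EXECINSTR 0x4u

/* Compute the WRITE/EXECINSTR bits of an output section as the
 * inclusive-or of those bits over all of its input sections. */
static uint64_t merged_access_flags(const uint64_t *input_flags, size_t n)
{
    uint64_t out = 0;
    for (size_t i = 0; i < n; i++)
        out |= input_flags[i] & (SHF_WRITE | SHF_EXECINSTR);
    return out;
}
```

So a read only .foo combined with a read write .foo yields a writable output .foo.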

It kind of makes sense not to combine ALLOC sections with non-ALLOC
sections, but it seems like more of a hypothetical question -- in
practice, I simply don't expect to see this, and it probably wouldn't
be out of line for a linker to issue at least a warning in that case.

Multiple sections with the MERGE flag are meant to be merged, but the
result of that merge can still be combined with other like sections
that don't have the MERGE flag set. Likewise with MERGE|STRINGS
sections (which could theoretically even be combined after merging
with the result of merging plain MERGE sections). The output section
would not have the MERGE flag set.

TLS sections, obviously, must always match, and the output section
must also have the flag set.
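
One possible reading of the MERGE and TLS rules, as a sketch. The function name is hypothetical, sh_entsize is ignored for brevity, and keeping MERGE/STRINGS only when both inputs agree on them is an assumption about what "like sections" means:

```c
#include <stdbool.h>
#include <stdint.h>

/* Flag values as defined by the gABI. */
#define SHF_MERGE   0x10u
#define SHF_STRINGS 0x20u
#define SHF_TLS     0x400u

/* Combine the MERGE/STRINGS/TLS bits of two sections.
 * Returns false if they must not be combined (TLS mismatch).
 * On success, *out holds the resulting MERGE/STRINGS/TLS bits. */
static bool combine_merge_tls(uint64_t a, uint64_t b, uint64_t *out)
{
    const uint64_t ms = SHF_MERGE | SHF_STRINGS;

    if ((a & SHF_TLS) != (b & SHF_TLS))
        return false;               /* TLS must always match */

    uint64_t result = a & SHF_TLS;  /* output keeps the TLS flag */
    if ((a & ms) == (b & ms))
        result |= a & ms;           /* inputs agree: keep MERGE/STRINGS */
    /* otherwise the output carries neither MERGE nor STRINGS */

    *out = result;
    return true;
}
```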

GROUP and COMPRESSED are really just special-purpose flags telling the
linker about the input section, and have nothing to do with how the
sections should be combined. Whether you compress the output section
is orthogonal to whether the input sections are compressed.

The INFO_LINK flag is just there so that linkers can know to
"relocate" the section index in the sh_info field. Inconsistency
abounds here, and it's mostly there for the unknown section types. If
the linker knows about a section type, it should know whether sh_info
is a section index or not; if it doesn't, it should probably refuse to
combine sections whose flags don't match -- and even sections whose
sh_info fields don't refer to same-named sections.

The LINK_ORDER flag was added to make explicit a requirement of the
Tahoe unwind sections (SHT_IA64_UNWIND), which were expected to be
combined in the same respective order as the PROGBITS sections they
were associated with.

-cary