monolithic input section handling

ben.dunbobb...@gtempaccount.com

Mar 26, 2018, 1:07:39 PM

to Generic System V Application Binary Interface

Hi All,

The hardest part of my toolchain's binary tools to maintain
has always been supporting monolithic sections, especially,
.debug_*, .eh_*, etc, sections that describe the contents
of .text sections.

Currently, these are handled on a case-by-case basis
by section specific code.

One idea that we would really like to pursue is using
elf features to describe the input sections rather
than having special case code in the linker that
understands the contents of these sections. To put it
another way, it would be great if normal elf linking
rules were all that were needed to link these sections.
For example, for a given comdat function what if the
compiler were able to emit the exception handling info
in per function sections in the same group as the
associated .text and then the normal comdat rules would
be enough to remove the exception info for the comdat
if it is not the chosen one. These ideas mostly depend
on the compiler chopping up the information into per
function or per datum pieces and the format of the
output section being sufficiently simple that the linker
simply has to glue the retained pieces back together in
some order.

However, there is a problem with overhead. To give a
concrete example, our compiler currently emits stack
sizes information in a monolithic section. This stack
sizes information is a table of entries, one entry per
function, where each entry is of variable size but
is on average about 10 bytes. The overhead in terms of
input file size and fragmentation of moving to one
section per function seems rather steep here.
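
To put rough numbers on that overhead, here is a back-of-the-envelope sketch. It assumes ELF64, where a section header (Elf64_Shdr) is 64 bytes and an Elf64_Rela entry is 24 bytes. The 10-byte average entry comes from the text above; the idea that each entry carries exactly one relocation, and that each per-function section needs its own .rela section, is an illustrative assumption rather than a description of any particular toolchain.

```cpp
#include <cassert>
#include <cstddef>

// Fixed ELF64 record sizes.
constexpr std::size_t kShdrSize = 64;  // sizeof(Elf64_Shdr)
constexpr std::size_t kRelaSize = 24;  // sizeof(Elf64_Rela)

// Bookkeeping bytes per entry when all entries share one section:
// just the relocation that ties the entry to its function.
std::size_t monolithic_overhead() { return kRelaSize; }

// Bookkeeping bytes per entry when each entry has its own section:
// a header for the data section, a header for its .rela section,
// and the relocation itself (assumed layout, see lead-in).
std::size_t per_function_overhead() { return 2 * kShdrSize + kRelaSize; }

// Overhead expressed as a multiple of the payload size.
double overhead_ratio(std::size_t overhead, std::size_t payload) {
    assert(payload > 0);
    return static_cast<double>(overhead) / static_cast<double>(payload);
}
```

Under these assumptions a 10-byte entry costs 152 bytes of bookkeeping in the per-function layout against 24 in the monolithic one, before counting group membership or symbol table entries.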

What we have usually done is to keep the monolithic
input sections but to ensure that the input sections
have a structure where they can be chopped up into
per function pieces by the linker without the linker
having to actually parse the contents of the section.
Usually, this will be done by relying on a set of
relocations or symbols to describe the chunks of the
input sections.
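
As a minimal sketch of that approach, the following splits a monolithic section into pieces using nothing but symbol offsets. The names (Chunk, split_at_symbols) are hypothetical, and it assumes that one symbol marks the start of every per function piece.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// One piece of a monolithic input section, as a half-open byte range.
struct Chunk {
    std::uint64_t begin;
    std::uint64_t end;
};

// Splits a section of `section_size` bytes at the given symbol offsets.
// The section contents are never inspected; only the offsets define the
// pieces. Bytes before the first symbol are not part of any chunk.
std::vector<Chunk> split_at_symbols(std::vector<std::uint64_t> offsets,
                                    std::uint64_t section_size) {
    std::sort(offsets.begin(), offsets.end());
    offsets.erase(std::unique(offsets.begin(), offsets.end()), offsets.end());

    std::vector<Chunk> chunks;
    for (std::size_t i = 0; i < offsets.size(); ++i) {
        assert(offsets[i] < section_size);
        const std::uint64_t end =
            (i + 1 < offsets.size()) ? offsets[i + 1] : section_size;
        chunks.push_back({offsets[i], end});
    }
    return chunks;
}
```

The linker can then keep or drop each chunk according to whether the function it describes was retained.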

My question is does anyone have a good solution to this
problem? Are there any ELF features that I have missed
which would allow the description of chunks of information
within monolithic sections?

Many thanks in advance.

Cary Coutant

Mar 26, 2018, 2:09:33 PM

to gener...@googlegroups.com

On Mon, Mar 26, 2018 at 10:07 AM, ben.dunbobbin%sony.com via Generic
System V Application Binary Interface <gener...@googlegroups.com>
wrote:

> Hi All,
>
> The hardest part of my toolchain's binary tools to maintain
> has always been supporting monolithic sections, especially,
> .debug_*, .eh_*, etc, sections that describe the contents
> of .text sections.
>
> Currently, these are handled on a case-by-case basis
> by section specific code.
>
> One idea that we would really like to pursue is using
> elf features to describe the input sections rather
> than having special case code in the linker that
> understands the contents of these sections. To put it
> another way, it would be great if normal elf linking
> rules were all that were needed to link these sections.
> For example, for a given comdat function what if the
> compiler were able to emit the exception handling info
> in per function sections in the same group as the
> associated .text and then the normal comdat rules would
> be enough to remove the exception info for the comdat
> if it is not the chosen one. These ideas mostly depend
> on the compiler chopping up the information into per
> function or per datum pieces and the format of the
> output section being sufficiently simple that the linker
> simply has to glue the retained pieces back together in
> some order.

This is the whole point of comdat groups in ELF: in a given comdat
group (which isn't necessarily limited to a single function), you can
put all the related sections, text, static data, debug info, etc.,
into the group, and the linker will keep or discard it all as one
unit. The group essentially forms a "sub-object".
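
A minimal sketch of that keep-or-discard behaviour, assuming a first-one-wins policy keyed on the group signature. The types and the function name are hypothetical and do not reflect any particular linker's implementation.

```cpp
#include <cassert>
#include <string>
#include <unordered_set>
#include <vector>

// A comdat group as seen in one input file.
struct Group {
    std::string signature;              // group signature symbol name
    std::vector<std::string> sections;  // names of the member sections
};

// Returns the member sections that survive comdat resolution. The first
// group seen with a given signature is kept; later duplicates are dropped
// as a unit, without looking at their contents.
std::vector<std::string> resolve_comdats(const std::vector<Group>& groups) {
    std::unordered_set<std::string> seen;
    std::vector<std::string> kept;
    for (const Group& g : groups) {
        assert(!g.signature.empty());
        if (!seen.insert(g.signature).second) {
            continue;  // duplicate signature: discard every member
        }
        kept.insert(kept.end(), g.sections.begin(), g.sections.end());
    }
    return kept;
}
```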

> However, there is a problem with overhead. To give a
> concrete example, our compiler currently emits stack
> sizes information in a monolithic section. This stack
> sizes information is a table of entries, one entry per
> function, where each entry is of variable size but
> is on average about 10 bytes. The overhead in terms of
> input file size and fragmentation of moving to one
> section per function seems rather steep here.

This is what GCC does. Even for C code, many people choose to compile
with -ffunction-sections, which splits each function into its own text
section.

> What we have usually done is to keep the monolithic
> input sections but to ensure that the input sections
> have a structure where they can be chopped up into
> per function pieces by the linker without the linker
> having to actually parse the contents of the section.
> Usually, this will be done by relying on a set of
> relocations or symbols to describe the chunks of the
> input sections.
>
> My question is does anyone have a good solution to this
> problem? Are there any ELF features that I have missed
> which would allow the description of chunks of information
> within monolithic sections?

It's a core principle of ELF that sections are indivisible. We've
compromised that principle only for merge sections, and that brought
enough headaches (as you've discovered). I have seen ELF files with
hundreds of thousands of sections; it's not a problem.

Be sure to check out recent discussions of SHF_LINK_INFO in this
group, for some ideas on how to associate those extra annotation
sections you're using with the text sections, so that garbage
collection can work properly.

-cary

ben.dunbobb...@gtempaccount.com

Mar 27, 2018, 5:43:26 AM

to Generic System V Application Binary Interface

Thanks for the reply.

> - Cary

> This is the whole point of comdat groups in ELF: in a given comdat
> group (which isn't necessarily limited to a single function), you can
> put all the related sections, text, static data, debug info, etc.,
> into the group, and the linker will keep or discard it all as one
> unit. The group essentially forms a "sub-object".

This is great. Also, comdats are a relatively lightweight
mechanism in that if they are not chosen the linker doesn't have
to read the contents of the sections from disk.

It occurred to me recently that I am not sure that this simple use
of comdats is quite correct for metadata. The elf spec states that
"the sections are discarded or retained together". This implies to me
that the main comdat should contain only the sections whose contents
are actually required (.text, .data, .eh_frame, etc.). The metadata
sections (.debug_*, .stack_sizes, etc.) probably want to be in their
own comdat, which is linked to the main comdat by a SHF_LINK_ORDER-like
dependency, i.e.:

- if the linker discards the .text comdat then the metadata comdat
is discarded.
- if the linker discards the metadata comdat then the .text comdat
is not required to be discarded.
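
The one-way dependency in the two rules above can be sketched as follows. This is only an illustration under assumed names (Section, depends_on, discard); it models the dependency as an index into a flat section list.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Marker for "this section does not describe any other section".
constexpr std::size_t kNoDep = static_cast<std::size_t>(-1);

struct Section {
    std::size_t depends_on;  // index of the section this one describes, or kNoDep
    bool discarded;
};

// Discards section `idx` and every section that depends on it, directly
// or transitively. Nothing that `idx` itself depends on is touched, so
// dropping metadata never drops the .text it describes.
void discard(std::vector<Section>& sections, std::size_t idx) {
    assert(idx < sections.size());
    if (sections[idx].discarded) {
        return;  // already handled; also guards against dependency cycles
    }
    sections[idx].discarded = true;
    for (std::size_t i = 0; i < sections.size(); ++i) {
        if (sections[i].depends_on == idx) {
            discard(sections, i);
        }
    }
}
```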

> - Cary

> This is what GCC does. Even for C code, many people choose to compile
> with -ffunction-sections, which splits each function into its own text
> section.

Measurements on my toolchain put the overhead in terms of link time
at up to 5% (as I remember). However, this only affects text and data.
We have so far avoided using a per function/datum approach for metadata.

> - Cary

> It's a core principle of ELF that sections are indivisible. We've
> compromised that principle only for merge sections, and that brought
> enough headaches (as you've discovered). I have seen ELF files with
> hundreds of thousands of sections; it's not a problem.

I understand that this is conceptually elegant, and it does simplify
the implementation of linker features greatly. Practically, (and I
*know* that you are aware of all of this but for completeness :) )
there are three problems with this:

- overhead
- fragmentation
- some sections are not amenable

The metadata sections (and certain exception handling sections
- .eh_frame as an example) do tend to be the least amenable.

Merge sections might be trouble, but they are efficient. You could
use a merge section to represent a table of per function entries.
Unfortunately, I don't *think* that elf has a good way of putting
a single element of a merge section into a comdat.

> - Cary

> Be sure to check out recent discussions of SHF_LINK_INFO in this
> group, for some ideas on how to associate those extra annotation
> sections you're using with the text sections, so that garbage
> collection can work properly.

This is a really nice addition to the elf spec. Particularly
impressive to get agreement from a number of different
platforms. Our toolchain has the SHF_LINK_ORDER concept, but:

1. We separate the concept of an "associated section" from
the concept of an "ordered section". Although, it looks like
you proposed this in that thread but there wasn't enough
interest.

2. We have a rule that relocations from "associated sections"
to global symbols are treated as discarded if the chosen global
was not from the same object file. This is an obvious extension
as the metadata entry for a function is only applicable to that
one particular version.
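
Rule 2 can be sketched as a simple predicate. The names here (Relocation, ChosenDefinitions, relocation_is_live) are hypothetical, and treating an unresolved symbol as discarded is an assumption of this sketch rather than something stated above.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// A relocation in a metadata ("associated") section.
struct Relocation {
    int source_object;          // index of the object file containing the relocation
    std::string target_symbol;  // global symbol the relocation refers to
};

// Maps each global symbol to the index of the object file whose
// definition the linker chose.
using ChosenDefinitions = std::unordered_map<std::string, int>;

// A relocation stays live only if the chosen definition of its target
// comes from the same object file as the relocation itself.
bool relocation_is_live(const Relocation& reloc, const ChosenDefinitions& chosen) {
    assert(reloc.source_object >= 0);
    const auto it = chosen.find(reloc.target_symbol);
    // Assumption: a symbol with no chosen definition counts as discarded.
    return it != chosen.end() && it->second == reloc.source_object;
}
```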

ben.dunbobb...@gtempaccount.com

Mar 28, 2018, 5:14:12 AM

to Generic System V Application Binary Interface

> Cary
> This is the whole point of comdat groups in ELF: in a given comdat
> group (which isn't necessarily limited to a single function), you can
> put all the related sections, text, static data, debug info, etc.,
> into the group

Apologies for double posting. I did a bit of research and I couldn't
find evidence that any toolchain has ever fragmented the metadata into
per function/datum pieces and put them into the comdat. Questions:

- Was this the original intent of the comdat feature in elf?
- Was this done with previous formats like stabs?
- How did we get to the monolithic metadata sections of today?

Cary Coutant

Mar 28, 2018, 5:39:37 PM

to Generic System V Application Binary Interface

> Apologies for double posting. I did a bit of research and I couldn't
> find evidence that any toolchain has ever fragmented the metadata into
> per function/datum pieces and put them into the comdat. Questions:
>
> - Was this the original intent of the comdat feature in elf?
> - Was this done with previous formats like stabs?
> - How did we get to the monolithic metadata sections of today?

Yes, this is the intent. Unfortunately, DWARF isn't well suited for
that approach, and while I think there are some investigations going
on there, no one has yet split the DWARF into the individual comdat
groups. Stabs was already pretty obsolete by that time, and isn't
normally used with ELF.

One of my goals for DWARF-6, which we'll probably start working on
later this year, is to come up with a better solution for splitting
DWARF info into comdat groups.

-cary

ben.dunbobb...@gtempaccount.com

Apr 4, 2018, 5:17:32 PM

to Generic System V Application Binary Interface

> Cary

> One of my goals for DWARF-6, which we'll probably start working on
> later this year, is to come up with a better solution for splitting
> DWARF info into comdat groups.

No small task! We have put together some ideas on the same subject.
I will ask if I can make them available to you (if you are interested).

Even if the holy grail of per comdat dwarf is difficult to achieve
there are still relatively cheap improvements that could be made to
the format. Examples:

1. Explicit mechanism for marking "pieces" of dwarf as discarded

A nice improvement would be to allow for some explicit mechanism
for marking bits of the dwarf as discarded. Currently, we rely on
non-standard, clunky, ways of doing this because the format doesn't
have an explicit notion that this is something linkers have to do.

2. Easing the task of carving up the dwarf:

As an example, if we had the following two rules for line programs...

a) No DW_LNE_define_file
b) Sequences always start with DW_LNE_set_address opcode

...then the line program could be carved into sequences without
resorting to actually reading the opcodes.
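
A rough sketch of how that carving might look, using only relocation offsets. It assumes 8-byte target addresses, exactly one DW_LNE_set_address per sequence, and that the linker already knows where the line program ends; the function and type names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// One line-number sequence, as a half-open byte range in .debug_line.
struct Sequence {
    std::uint64_t begin;
    std::uint64_t end;
};

// With an 8-byte operand, DW_LNE_set_address is encoded as
//   0x00 (extended opcode marker), 0x09 (ULEB128 length), 0x02 (opcode),
// followed by the address. The relocation applies to the address, so the
// sequence starts 3 bytes before the relocation offset.
constexpr std::uint64_t kSetAddressPrefix = 3;

// Carves a line program into sequences from the sorted offsets of the
// relocations on its set_address operands. No opcode is ever decoded.
std::vector<Sequence> carve_sequences(const std::vector<std::uint64_t>& reloc_offsets,
                                      std::uint64_t program_end) {
    std::vector<Sequence> sequences;
    for (std::size_t i = 0; i < reloc_offsets.size(); ++i) {
        assert(reloc_offsets[i] >= kSetAddressPrefix);
        const std::uint64_t begin = reloc_offsets[i] - kSetAddressPrefix;
        const std::uint64_t end = (i + 1 < reloc_offsets.size())
                                      ? reloc_offsets[i + 1] - kSetAddressPrefix
                                      : program_end;
        assert(begin < end);
        sequences.push_back({begin, end});
    }
    return sequences;
}
```

Rule (a) matters here because DW_LNE_define_file changes the file table for everything after it, so a sequence containing it could not be dropped safely.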

I have to confess that I was rather hoping (fishing :) ) that
someone from e.g. Solaris would have a great scheme for handling
metadata in elf. I know that some formats allow rather more exciting
ideas for handling metadata (e.g. codeview and pdb's) but I don't
know of anyone doing anything similar with elf/dwarf.

Ali Bahrami

Apr 4, 2018, 5:42:01 PM

to gener...@googlegroups.com

On 04/ 4/18 03:17 PM, ben.dunbobbin%sony.com via Generic System V Application Binary Interface wrote:
> I have to confess that I was rather hoping (fishing :) ) that
> someone from e.g. Solaris would have a great scheme for handling
> metadata in elf. I know that some formats allow rather more exciting
> ideas for handling metadata (e.g. codeview and pdb's) but I don't
> know of anyone doing anything similar with elf/dwarf.

Sorry, no. We're in the same boat. :-)

DWARF has been an issue for us for a long time. Cary and Ian have both
fielded "why don't gnu use COMDAT properly for debug" questions from us
before. Our own native compilers have the same issue --- as you know, and
as I learned from talking to our dbx folks, dwarf wasn't really designed
to support this model. The idea that link-editors would do usage
analysis and throw out unused code hadn't really arrived yet.

One mitigating factor is that debug sections aren't allocable
(are not mapped into the running process), so it's not as important
as it would be otherwise. So we've mostly ignored it. Unfortunately,
we often see debug data that is 10x the size of the code it describes,
so we can't ignore it fully. We've tried putting it elsewhere:

http://www.linker-aliens.org/blogs/ali/entry/ancillary_objects_separate_debug_elf/

and we've tried to make it smaller:

http://www.linker-aliens.org/blogs/ali/entry/elf_section_compression/

but we haven't tackled the core problem. It really is the sort of thing
that wants to be standardized, outside of any one organization.

For what it's worth, we think that making dwarf play with comdat
groups is the right thing. We haven't invented a grand metadata
scheme, mainly because we don't really want that sort of complexity
in the core format.

TL;DR: Go Cary, Go!

- Ali

ben.dunbobb...@gtempaccount.com

Apr 5, 2018, 4:29:20 PM

to Generic System V Application Binary Interface

Hi Ali,

Thanks for your reply. Apologies if I am asking questions that have already
been raked over many times. I enjoyed reading your blog posts. I am also a
big fan of your? linker and libraries guide.

I think that we have relatively rare requirements for dwarf. Most toolchains
seem to be pretty happy with the situation because, as you stated, it is not
allocatable. In fact I think that it is probable that no other elf toolchain
implements the kind of comprehensive deadstripping of dwarf metadata that we
do (and we probably implement more involved handling of the other metadata
sections as well). It is very hard work, and requires extensions beyond the
elf standard, but it is important to our users.

We also have a requirement to be *very* performant in terms of link speed. As
I stated, this requirement sometimes conflicts with the binary representation
when a simpler scheme has, for example, higher IO requirements.

An effective solution to these issues would be fantastic!

> Ali

> For what it's worth, we think that making dwarf play with comdat
> groups is the right thing. We haven't invented a grand metadata
> scheme, mainly because we don't really want that sort of complexity
> in the core format.

I agree with this and Cary's statement that sections are the
atomic unit in dwarf. I wonder if it would be possible to use some
sort of delta encoding for representing a whole set of sections which
are all very similar. This way you might be able to get the benefit of
describing things with individual sections whilst eliminating some
of the overhead?

ben.dunbobb...@gtempaccount.com

Apr 5, 2018, 4:31:27 PM

to Generic System V Application Binary Interface

> Ben
> atomic unit in dwarf

Oops, elf!

Ali Bahrami

Apr 5, 2018, 11:16:15 PM

to gener...@googlegroups.com

Hi Ben,

On 04/ 5/18 02:29 PM, ben.dunbobbin%sony.com via Generic System V Application Binary Interface wrote:
> I enjoyed reading your blog posts. I am also a
> big fan of your? linker and libraries guide.

We really appreciate that. Rod and I write the linker and libraries guide,
and the manpages, so it's nice to hear.

A shameless plug: Oracle keeps moving and renaming our blog URLs, which plays
havoc with google searches, so I've recently created www.linker-aliens.org as a
place to archive all of that old stuff in a stable place. It's hopefully the
same content, wherever it may be found.

You've undoubtedly seen them, but Ian's ELF blogs are great, and
very wide ranging:

https://www.airs.com/blog/archives/38

> I think that we have relatively rare requirements for dwarf. Most toolchains
> seem to be pretty happy with the situation because, as you stated, it is not
> allocatable. In fact I think that it is probable that no other elf toolchain
> implements the kind of comprehensive deadstripping of dwarf metadata that we
> do (and we probably implement more involved handling of the other metadata
> sections as well). It is very hard work, and requires extensions beyond the
> elf standard, but it is important to our users.

I don't understand DWARF well enough to judge if the requirements are rare,
but it's clear from the above that you've lived it, so I believe you.

As I'm sure you know all too well, the big risk in what you've done
isn't the up front hard work. It's the fragility going forward, when the
world changes, and you have to reconcile your extensions with whatever
new things come down the pike. I can understand why you're looking for
standards here.

FWIW, I think dwarf bloat is a problem for everyone. This just seems to
be one of those thorny problems that people tend to look at, and then
back away from, hoping that it will become easier in the future. In our
case, the ancillary objects were created because we had customers hitting
the 4GB limit on 32-bit ELF files. That's a lot of dwarf!

> I agree with this and Cary's statement that sections are the
> atomic unit in dwarf. I wonder if it would be possible to use some
> sort of delta encoding for representing a whole set of sections which
> are all very similar. This way you might be able to get the benefit of
> describing things with individual sections whilst eliminating some
> of the overhead?

The design of comdat groups lets you assume that different instances of
a given group are interchangeable, without actually having to compare them.
So ignoring the cost of I/O for the moment, that might be part of the answer.
You can ignore instances other than the first without processing them, so they
need not necessarily be tiny to be efficient.

I guess my main concern about delta encoding is that it imposes interpretation
costs to the link-editor (it has to "inflate things", much as with compression).
You'd have to prove that it was really faster --- it might not end up measuring
that way.

I don't know how desperate you are to solve this today, but assuming you
have some time to evolve to your desired end point, it sounds to me like it
would be a good thing for you to get involved with the DWARF-6 work that
Cary mentioned will start work later this year. Your real world experience
would be helpful.

- Ali

Rafael Avila de Espindola

Apr 6, 2018, 2:04:48 PM

to gener...@googlegroups.com

> I don't know how desperate you are to solve this today, but assuming you
> have some time to evolve to your desired end point, it sounds to me like it
> would be a good thing for you to get involved with the DWARF-6 work that
> Cary mentioned will start work later this year. Your real world experience
> would be helpful.

How does one join? Just sign up for a mailing list, or is there a committee that one must formally apply to?

Thanks,
Rafael

ben.dunbobb...@gtempaccount.com

Apr 9, 2018, 7:55:18 PM

to Generic System V Application Binary Interface

> Ali

> Oracle keeps moving and renaming our blog URLs, which plays
> havoc with google searches, so I've recently created www.linker-aliens.org as a
> place to archive all of that old stuff in a stable place. It's hopefully the
> same content, wherever it may be found.

I'll add your link to our team's resources page! We ported our linker to
FreeBSD (x86-64) based mostly on the abi documents, Ian's blog pages,
and your linker and libraries guide. Aside - in another thread someone
made reference to *countless* linker textbooks - intrigued me as I only
know of Levine's :)

> Ali

> I don't understand DWARF well enough to judge if the requirements are rare,
> but it's clear from the above that you've lived it, so I believe you.

I think that I made it sound too heroic :) You can read about the toolchain:

https://llvm.org/devmtg/2013-11/slides/Robinson-PS4Toolchain.pdf
https://www.snsystems.com/technology/tech-blog/2017/06/12/quick-wins-for-speedy-links/
https://www.snsystems.com/technology/tech-blog/2017/06/12/linker-optimizations/

> Ali

> As I'm sure you know all too well, the big risk in what you've done
> isn't the up front hard work. It's the fragility going forward, when the
> world changes, and you have to reconcile your extensions with whatever
> new things come down the pike. I can understand why you're looking for
> standards here.

Indeed. You guys are solving a *much* harder problem than us as you are
supporting general purpose operating systems. Previously we had a
completely closed platform including our own toolchain. If we wanted to
change the file format that was not a problem (although we try to stick
close to the standards). This sort of thing is increasingly difficult.

> Ali

> The design of comdat groups lets you assume that different instances of
> a given group are interchangeable, without actually having to compare them.
> So ignoring the cost of I/O for the moment, that might be part of the answer.
> You can ignore instances other than the first without processing them, so they
> need not necessarily be tiny to be efficient.

Absolutely, comdat groups are a lightweight mechanism in this sense (assuming
you get a reasonable level of duplication).

> Ali

> I guess my main concern about delta encoding is that it imposes interpretation
> costs to the link-editor (it has to "inflate things", much as with compression).
> You'd have to prove that it was really faster --- it might not end up measuring
> that way.

Right. It is always best to come up with some concrete numbers. This isn't
something that I have put a lot of thought into yet. I suppose the simplest
version of what I have in mind is simply some way of expressing that the
next N section headers are all essentially the same apart from the offset and
size.
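
To make that idea a little more concrete, here is one possible shape for such an encoding. It is purely a sketch: HeaderRun is a made-up structure, not an existing ELF feature, and it assumes the N sections are laid out back to back with no alignment padding, so that each offset follows from the previous size.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// The fields of a section header that matter for this sketch.
struct SectionHeader {
    std::uint32_t type;
    std::uint64_t flags;
    std::uint64_t offset;
    std::uint64_t size;
};

// Hypothetical compact form: one shared type/flags pair, the offset of
// the first section, and one size per section in the run.
struct HeaderRun {
    std::uint32_t type;
    std::uint64_t flags;
    std::uint64_t first_offset;
    std::vector<std::uint64_t> sizes;
};

// Expands a run into the individual headers the linker would otherwise
// have read from the section header table.
std::vector<SectionHeader> expand(const HeaderRun& run) {
    assert(!run.sizes.empty());
    std::vector<SectionHeader> headers;
    std::uint64_t offset = run.first_offset;
    for (const std::uint64_t size : run.sizes) {
        headers.push_back({run.type, run.flags, offset, size});
        offset += size;  // sections are assumed contiguous
    }
    return headers;
}
```

Whether expanding such runs is cheaper than reading full headers is exactly the kind of thing that would need measuring.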

> Ali

> I don't know how desperate you are to solve this today, but assuming you
> have some time to evolve to your desired end point, it sounds to me like it
> would be a good thing for you to get involved with the DWARF-6 work that
> Cary mentioned will start work later this year. Your real world experience
> would be helpful.

Definitely don't need an immediate solution. Would be very happy to help in
any way we can with this effort. We actually have a chap on the DWARF
committee (Paul Robinson) - hopefully he can co-ordinate.

Ali Bahrami

Apr 9, 2018, 11:26:37 PM

to gener...@googlegroups.com

On 4/9/18 5:55 PM, ben.dunbobbin%sony.com via Generic System V Application Binary Interface wrote:
> Aside - in another thread someone
> made reference to *countless* linker textbooks - intrigued me as I only
> know of Levine's :)

Yeah, me too. I'm aware of lots of papers and vendor documents,
our LLM being one, but not many proper textbooks.

> You can read about the toolchain:
>
> https://llvm.org/devmtg/2013-11/slides/Robinson-PS4Toolchain.pdf
> https://www.snsystems.com/technology/tech-blog/2017/06/12/quick-wins-for-speedy-links/
> https://www.snsystems.com/technology/tech-blog/2017/06/12/linker-optimizations/

Thanks, interesting stuff!

- Ali

ben.dunbobb...@gtempaccount.com

Apr 19, 2018, 8:12:46 PM

to Generic System V Application Binary Interface

Hi All,

Just wanted to add an update as I have managed to speak
to a number of people on this subject. To my surprise several
of the chaps I spoke to said that they were aware of toolchains
that either currently or at some point in the past produced per
function dwarf and put this into the comdat group:

- ARM's toolchain used to do this.
- TI's toolchain currently does this.
- HP's toolchain did this at some point.

So it seems that I was incorrect to think that splitting the
dwarf into per function pieces would be a novel implementation!

The guys I spoke to said that doing this with the dwarf (and
other metadata) did not cause a performance issue for users.
Having said that it might be true that compile + link times
were not a top priority for users.

Unfortunately, I don't have any detailed information, e.g. what
version/s of dwarf these toolchains supported/support.

Suprateeka R Hegde

Apr 20, 2018, 12:40:46 PM

to gener...@googlegroups.com

(Sorry, I could not reply to this discussion at all. Was too tied up)

On 20-Apr-2018 05:42 AM, ben.dunbobbin%sony.com via Generic System V
Application Binary Interface wrote:
> - HP's toolchain did this at some point.

Not just at some point. We do that currently too, but in a slightly
non-standard (OS-specific) way. We have SHT_HP_COMDAT.

On 05-Apr-2018 03:11 AM, Ali Bahrami wrote:
> We've tried putting it elsewhere:
>
> http://www.linker-aliens.org/blogs/ali/entry/ancillary_objects_separate_debug_elf/
>
> and we've tried to make it smaller:
>
> http://www.linker-aliens.org/blogs/ali/entry/elf_section_compression/

We do the same. Both separation and compression at link time. In
addition to the separation, we have +objdbg option that keeps the debug
info in the .o files themselves and does not bulge the executable.

On HP-UX, I added some functionality that even supports distributing the
debug info -- some in the executable and some in a separate file -- so
that for some important (selectable) functions, the debug info need not
be fetched/downloaded (see below) from other files.

In addition, last year, I added some fancy functionality that is being
used by many users on HP-UX. One can upload the debug info onto a
configurable cloud storage. The rest of the toolchain, such as the
debugger, can then download it automatically and continue the debug
session, etc.

(Unfortunately, the official documentation is still pending from my side)

--
Supra

Suprateeka R Hegde

Apr 20, 2018, 12:44:08 PM

to gener...@googlegroups.com

On 20-Apr-2018 05:42 AM, ben.dunbobbin%sony.com via Generic System V
Application Binary Interface wrote:
> So it seems that I was incorrect to think that splitting the
> dwarf into per function pieces would be a novel implementation!

But you still can standardize that, with explicit support in DWARF 6, etc.

--
Supra

ben.dunbobb...@gtempaccount.com

Apr 23, 2018, 9:10:23 PM

to Generic System V Application Binary Interface

Hi Supra,

Thanks for supplying this information.

> Supra

> Not just at some point. We do that currently too, but in a slightly
> non-standard (OS-specific) way. We have SHT_HP_COMDAT.

SHT_HP_COMDAT intrigued me. I found a reference to it here:
http://www.staroceans.org/e-book/elf-64-hp.pdf. Seems pretty
similar to elf comdat groups... I assume that you guys were ahead
of the standard for comdats?

> Supra

> We do the same. Both separation and compression at link time. In
> addition to the separation, we have +objdbg option that keeps the debug
> info in the .o files themselves and does not bulge the executable.

Those sound like some nifty dwarf optimizations! We have a project for fixing
the whole world:

https://www.snsystems.com/technology/tech-blog/2016/11/17/demo-of-a-repository-for-statically-compiled-programs/

However, as you might expect, these sorts of projects are a long road.

Suprateeka R Hegde

Apr 24, 2018, 10:34:53 AM

to gener...@googlegroups.com

On 24-Apr-2018 06:40 AM, ben.dunbobbin%sony.com via Generic System V
Application Binary Interface wrote:
> Hi Supra,
>
> Thanks for supplying this information.
>
>> Supra
>> Not just at some point. We do that currently too, but in a slightly
>> non-standard (OS-specific) way. We have SHT_HP_COMDAT.
>
> SHT_HP_COMDAT intrigued me. I found a reference to it here:
> http://www.staroceans.org/e-book/elf-64-hp.pdf. Seems pretty
> similar to elf comdat groups... I assume that you guys were ahead
> of the standard for comdats?

That seems to be the case as per the notes I have. And I am the only one
who has any notes at all on this history.

Cary might know as he was the runtime architect then. I was still in
10th at school ;-)

>
>> Supra
>> We do the same. Both separation and compression at link time. In
>> addition to the separation, we have +objdbg option that keeps the debug
>> info in the .o files themselves and does not bulge the executable.
>
> Those sound like some nifty dwarf optimizations!

Yeah, correct.

--
Supra

Peter Smith

Apr 24, 2018, 10:41:35 AM

to Generic System V Application Binary Interface

I can confirm that Arm Compiler 5, Arm's proprietary compiler, will split up Dwarf 3 debug information and put these sections into comdat groups.

For example, an instantiation of add() from the class template below produces a comdat group.

template <class T>
class Foo {
public:
    Foo(T x, T y) : x_(x), y_(y) {}
    T add() const { return x_ + y_; }
private:
    T x_;
    T y_;
};

The group contains the debug information needed for that function.

Name        : _ZNK3FooIiE3addEv
Type        : SHT_GROUP (0x00000011)
Flags       : None (0x00000000)
Addr        : 0x00000000
File Offset : 1896 (0x768)
Size        : 28 bytes (0x1c)
Link        : Section 41 (.symtab)
Info        : Group Signature symbol _ZNK3FooIiE3addEv
Alignment   : 4
Entry Size  : 4

Group Flags
    GRP_COMDAT
Group members (section)
    t._ZNK3FooIiE3addEv (14)
    .ARM.exidx (15)
    .debug_loc (19)
    .debug_line (18)
    .debug_info (17)
    .debug_frame (16)

Arm Compiler 6 is based on clang and no modifications have been made to the debug output, so it does the same as an upstream clang.

We found that splitting up the debug into fragments works well, as it permits the linker to ensure that all the references to local symbols are to sections within the same group. This makes it easy for the linker to remove all the debug when the group isn't selected.

This approach did produce significantly more debug information than gcc did. For small microcontroller projects this wasn't a problem. For larger feature phone projects we had to put a lot of work into keeping the linker's memory usage down, as many of our customers at the time were using 32-bit Windows machines with a default maximum virtual memory of 2 GB.

Apologies for the lateness of the message, just thought I'd confirm the details.

Peter
