Reserve a section type value for DWARF64

Fangrui Song

Nov 13, 2020, 3:15:31 PM
to Generic System V Application Binary Interface
There are threads discussing support for mixed DWARF32 and DWARF64 input sections on llvm-dev[1] and binutils[2]. Let me summarize the problem:

If a .debug_* output section S can be larger than 32-bit and its section offset
is referenced by a DWARF32 input section of itself or another .debug_* output
section, the relocation may be subject to 32-bit relocation overflow
(R_X86_64_32, R_PPC64_ADDR32, ...).

In a link,

... user_object_files libs.a libstdc++.a libc.a libc_nonshared.a libgcc.a libgcc_eh.a crtn.o crtend.o

Even if the user's object files are all DWARF64, it may be impractical to ensure
that all the system libraries and object files are DWARF64 as well; other linked
libraries may not all be DWARF64 either. When a user runs into relocation
overflow, the object files and libraries under their control usually dominate the
total size. The remaining linked libraries contribute only a small portion, but
that small portion alone can cause overflows if it is not ordered before the
DWARF64 components.

For such linked libraries, shipping two copies (DWARF32 and DWARF64) seems to be
asking too much...

So, there is a need to mitigate 32-bit relocation overflows by sorting DWARF32
components before DWARF64 components within an output section. This works
for most cases because the total size of DWARF32 components (non-user controlled
part) should be representable with 32-bit absolute relocation types (if not the
user really should recompile their prebuilt libraries;-))
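
The ordering argument above can be checked with a little arithmetic. The sketch below is a simplified model, not linker code: it assumes every DWARF32 contribution carries 32-bit absolute relocations that point at offsets inside its own span of the output section, and it checks whether those offsets still fit in 32 bits. The `Contribution` type, the file names, and the sizes are made up for illustration.

```python
from dataclasses import dataclass

U32_MAX = 0xFFFFFFFF  # largest value a 32-bit absolute relocation can hold


@dataclass
class Contribution:
    name: str         # hypothetical input file name
    size: int         # bytes contributed to the output section
    is_dwarf64: bool  # True if the contribution uses the 64-bit DWARF format


def overflowing(contribs):
    """Return names of DWARF32 contributions whose offsets exceed 32 bits.

    Simplified model: a DWARF32 contribution only references offsets inside
    its own span, so it overflows when its last byte lies beyond U32_MAX.
    """
    bad, offset = [], 0
    for c in contribs:
        end = offset + c.size
        if not c.is_dwarf64 and c.size > 0 and end - 1 > U32_MAX:
            bad.append(c.name)
        offset = end
    return bad


GIB = 1 << 30
link_order = [
    Contribution("user1.o", 3 * GIB, True),
    Contribution("user2.o", 2 * GIB, True),
    Contribution("libc.a", 64 << 20, False),  # small DWARF32 system library
]

# Command-line order: the DWARF32 library starts above 4 GiB.
unsorted_bad = overflowing(link_order)

# Stable sort that places DWARF32 (False) before DWARF64 (True).
sorted_bad = overflowing(sorted(link_order, key=lambda c: c.is_dwarf64))
```

In this model the unsorted layout flags the small library, while the sorted layout flags nothing, because the DWARF32 bytes now sit at the start of the section.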

To sort DWARF32/DWARF64 components, a natural way is to assign DWARF64 sections a
new section type (DWARF32 are SHT_PROGBITS) so that linker scripts/Solaris
mapfiles(? ASSIGN_SECTION TYPE=...) can know DWARF32/DWARF64 without checking
the section contents (which is against the "smart format, dumb linker" spirit of
ELF; practically, this can cause performance issues).

So here is the request: can we (GNU binutils folks, LLVM folks, and stakeholders
of other operating systems) agree on a section type value? I should add that this
has not been decided yet on the binutils/LLVM sides (I've got an ack from a senior
contributor on the LLVM debug info side, though), but I want to check whether it
would meet objections from other operating systems - the discussions on
llvm-dev/binutils will benefit from agreement/disagreement here.

A DWARF-specific section type probably does not have much to do with the ELF spec,
but the DWARF spec does not specify ELF-specific stuff either;-) So this list is my
best hope to get things standardized. Practically, LLVM is my main concern:
LLVM has Solaris support and I don't want Linux/*BSD/Solaris to gratuitously
disagree on the section type values:)

So, what do you think of allocating a value in the low range (like 19, 20, ...)?
If you dislike it, can we pick a common value in the SHT_LOOS-SHT_HIOS range?
I'd like the section type to be named SHT_DWARF64.

As a GNU ld linker script example, once a type is agreed upon (along with a to-be-determined keyword for matching on section type), we could write

.debug_info 0 : { *(TYPE (SHT_PROGBITS) .debug_info${RELOCATING+ .gnu.linkonce.wi.*}) *(TYPE (SHT_GNU_DWARF64) .debug_info) }
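
What the script line asks of the linker can be sketched as a stable sort keyed on the section type alone. This is only an illustration under assumptions: `SHT_PROGBITS = 1` comes from the ELF gABI, but the value 19 for `SHT_DWARF64` is merely one of the numbers floated in this thread and has not been assigned, and the file names are invented.

```python
SHT_PROGBITS = 1   # value from the ELF gABI
SHT_DWARF64 = 19   # placeholder: a value floated in this thread, not assigned


def order_debug_inputs(inputs):
    """Place SHT_PROGBITS (DWARF32) inputs before SHT_DWARF64 inputs.

    `inputs` is a list of (file_name, sh_type) pairs for one output section.
    sorted() is stable, so command-line order is kept within each group.
    The decision uses only the section header, never the section contents.
    """
    rank = {SHT_PROGBITS: 0, SHT_DWARF64: 1}
    return sorted(inputs, key=lambda item: rank[item[1]])


inputs = [
    ("a.o", SHT_DWARF64),
    ("crtn.o", SHT_PROGBITS),
    ("b.o", SHT_DWARF64),
    ("libc.a(x.o)", SHT_PROGBITS),
]
ordered = [name for name, _ in order_debug_inputs(inputs)]
```

Here `ordered` lists the two SHT_PROGBITS inputs first and the two SHT_DWARF64 inputs after them, each group in its original relative order.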


Ali Bahrami

Nov 13, 2020, 4:36:23 PM
to gener...@googlegroups.com
On 11/13/20 1:15 PM, 'Fangrui Song' via Generic System V Application Binary Interface wrote:
> If a .debug_* output section S can be larger than 32-bit and its section offset
> is referenced by a DWARF32 input section of itself or another .debug_* output
> section, the relocation may be subject to 32-bit relocation overflow
> (R_X86_64_32, R_PPC64_ADDR32, ...).
>

Hi,

I don't really understand this. I'm surprised that a single
object would have both DWARF32 and DWARF64. If this is
possible, then haven't the '32' and '64' lost their meaning?
I assumed that ELF32/DWARF32, and ELF64/DWARF64 go together.

In 32-bit objects, section sizes are limited to 32-bit,
so there's no issue.

In 64-bit objects, section sizes are 64-bit, but why
would you be generating DWARF32 for 64-bit objects?

I suspect that this from 2007 might be a clue:

http://wiki.dwarfstd.org/index.php?title=Questions_about_Dwarf_32/64

> I think the Sun compiler uses it all the time when 64-bit object code is being generated, but I think that's a waste.

So folks are using dwarf32 in 64-bit objects because it
is more compact, rather than fixing dwarf64 to not be
so wasteful? And now, we want to solve the resulting confusion
by using different ELF section types to label them, rather than
by having the dwarf content be self describing?

I'm a fan of giving sections explicit types, but at first blush
this seems like solving the wrong problem. The right problem
would be one of:

1) Improve DWARF to support more compact encodings
regardless of ELFCLASS, when offsets are small
(you could make 32-bit objects smaller too).

2) Have a way to signal which encoding (32 or 64) follows
explicitly in the DWARF stream itself.

I would bet that (2) wouldn't be any harder than using
a new ELF section type, and it feels more like the right
layer to be working at.

???

- Ali

Fāng-ruì Sòng

Nov 13, 2020, 4:55:43 PM
to Generic System V Application Binary Interface
On Fri, Nov 13, 2020 at 1:36 PM Ali Bahrami <Ali.B...@oracle.com> wrote:
>
> On 11/13/20 1:15 PM, 'Fangrui Song' via Generic System V Application Binary Interface wrote:
> > If a .debug_* output section S can be larger than 32-bit and its section offset
> > is referenced by a DWARF32 input section of itself or another .debug_* output
> > section, the relocation may be subject to 32-bit relocation overflow
> > (R_X86_64_32, R_PPC64_ADDR32, ...).
> >
>
> Hi,
>
> I don't really understand this. I'm surprised that a single
> object would have both DWARF32 and DWARF64. If this is
> possible, then haven't the '32' and '64' lost their meaning?
> I assumed that ELF32/DWARF32, and ELF64/DWARF64 go together.

It is not that a single object file uses both DWARF32 and DWARF64; rather,
some object files have DWARF32 while others have DWARF64.

> In 32-bit objects, section sizes are limited to 32-bit,
> so there's no issue.

Yes, this is an ELF64 issue.

> In 64-bit objects, section sizes are 64-bit, but why
> would you be generating DWARF32 for 64-bit objects?
>
> I suspect that this from 2007 might be a clue:
>
> http://wiki.dwarfstd.org/index.php?title=Questions_about_Dwarf_32/64
>
> > I think the Sun compiler uses it all the time when 64-bit object code is being generated, but I think that's a waste.
>
> So folks are using dwarf32 in 64-bit objects because it
> is more compact, rather than fixing dwarf64 to not be
> so wasteful? And now, we want to solve the resulting confusion
> by using different ELF section types to label them, rather than
> by having the dwarf content be self describing?

DWARF v5 says: "In the 32-bit DWARF format, all values that represent
lengths of DWARF sections and offsets relative to the beginning of
DWARF sections are represented using four bytes."
There is no requirement that ELF64 must use DWARF64. DWARF64 imposes
some size overhead so not every ELF64 user wants to adopt DWARF64.
DWARF64 serves a group of users who have large binaries which suffer
from the 32-bit DWARF format limitation.

> I'm a fan of giving sections explicit types, but at first blush
> this seems like solving the wrong problem. The right problem
> would be one of:
>
> 1) Improve DWARF to support more compact encodings
> regardless of ELFCLASS, when offsets are small
> (you could make 32-bit objects smaller too).

I don't think the compactness can practically be improved in the DWARF
specification. Cross-section offsets exist which are either .long or
.quad in assembly. If you pick .quad (DWARF64), you suffer from the
size overhead.

> 2) Have a way to signal which encoding (32 or 64) follows
> explicitly in the DWARF stream itself.
>
> I would bet that (2) wouldn't be any harder than using
> a new ELF section type, and it feels more like the right
> layer to be working at.

The DWARF64 section contents already encode the format via the initial
length field. The problem is that the linker should not have to inspect
section contents to make section ordering decisions.
A section type is a way to signal the encoding.
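
To illustrate what inspecting the contents would involve, here is a minimal sketch of reading the initial length field of one DWARF unit. Per DWARF section 7.4, a first 4-byte value of 0xffffffff marks the 64-bit format and is followed by an 8-byte length, while 0xfffffff0 through 0xfffffffe are reserved. The function name and the test buffers are made up; a linker would have to do this for every unit of every input section, which is the cost being argued against.

```python
import struct


def dwarf_format(unit: bytes, little_endian: bool = True) -> int:
    """Return 32 or 64 according to the unit's initial length field."""
    prefix = "<" if little_endian else ">"
    (first,) = struct.unpack_from(prefix + "I", unit, 0)
    if first == 0xFFFFFFFF:
        # Escape value: the real length follows as an 8-byte integer.
        struct.unpack_from(prefix + "Q", unit, 4)  # raises if truncated
        return 64
    if first >= 0xFFFFFFF0:
        raise ValueError("reserved initial length value")
    return 32


# Two synthetic units with a 0x20-byte body each (contents are all zero).
unit32 = struct.pack("<I", 0x20) + bytes(0x20)
unit64 = struct.pack("<IQ", 0xFFFFFFFF, 0x20) + bytes(0x20)
```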

> ???
>
> - Ali
>

--
宋方睿

Ali Bahrami

Nov 13, 2020, 8:18:15 PM
to gener...@googlegroups.com
On 11/13/20 2:55 PM, 'Fāng-ruì Sòng' via Generic System V Application Binary Interface wrote:
> There is no requirement that ELF64 must use DWARF64.

I guess you mean that the dwarf spec doesn't explicitly
say ELF64 can't contain DWARF32? That might be true, but no spec
can explicitly list the universe of things that weren't intended
or anticipated. "No requirement" is not the same thing as "expected"
or "reasonable". In the absence of an explicit rule, why would anyone
assume that 64-bit objects can contain 32-bit content? These objects
violate expectations, if not the literal spec.

I'll stand down from that claim, and apologize, if you can point at
the place in the dwarf spec that says this is explicitly allowed.
Otherwise, it really sounds like a mistake that has solidified
into practice. I'll be interested to hear if I'm the only one who
sees it that way.

One strong sign that this wasn't really intended is that they didn't
provide any solution to the rather obvious problems you're trying
to solve now. If it was intended that ELF64 should have a choice
of using DWARF32 or DWARF64, and that it is OK to have both
(meaning that the link-editor would have to contain code to sort
them), isn't it odd that the DWARF spec says nothing about how
that was intended to work?

Since it doesn't say any of that (I think?), this seems more like
something that just happened, rather than something that was designed
to work this way.


> DWARF64 imposes
> some size overhead so not every ELF64 user wants to adopt DWARF64.
> DWARF64 serves a group of users who have large binaries which suffer
> from the 32-bit DWARF format limitation.

That says that DWARF64 has some important problems that needed
solving, which is undoubtedly true. I don't think it necessarily
says that using DWARF32, and then expecting 32 and 64 bit formats
to intermingle, was intended.

I recognize that you didn't cause this problem, and that now
you're holding the bag (a common fate for linker folks), and
trying to do something about it. If there is no better answer,
then I would go along with this section type (indeed, it's
better for Solaris that I do go along, if it happens). It seems
like a bandage on a mistake though, so I'm holding out hope
that someone can suggest a better more fundamental DWARF fix.

Personally, I'd prefer that ELF64 objects always use DWARF64,
and that these existing objects be fixed by recompilation. DWARF
bloat is a real problem, but surely the answer to that lies
in fixing DWARF, rather than having link-editors, and debuggers,
go through odd sorting and other workarounds?

- Ali

James Henderson

Nov 16, 2020, 4:46:54 AM
to gener...@googlegroups.com
The DWARF v5 standard, section 7.4 states the following:

"Attribute values and section header fields that represent addresses in the target program are not affected by these rules."

This indicates to me that the target address and DWARF32/DWARF64 nature are deliberately unrelated. Note also that various DWARF sections have the address size encoded as a field in their DWARF header, separately from the DWARF32/DWARF64 stuff. Again, this shows that address size and 32 versus 64 bit DWARF are deliberately different. (Note: this piece is in italics, indicating it is not part of the specification per se, but is intended to clarify intent.)

"A DWARF consumer that supports the 64-bit DWARF format must support executables in which some compilation units use the 32-bit format and others use the 64-bit format provided that the combination links correctly (that is, provided that there are no link-time errors due to truncation or overflow). (An implementation is not required to guarantee detection and reporting of all such errors.)"

This is part of the formal specification - DWARF32 and DWARF64 are intermixable BY DESIGN. However, the second half of that first sentence is an interesting caveat and I'm not entirely sure how to digest that. It seems to indicate that the linker would be perfectly in keeping with the DWARF standard to reject things that don't work, but at the same time, it doesn't preclude the linker doing something smart to get it to work.

"It is expected that DWARF producing compilers will not use the 64-bit format by default. In most cases, the division of even very large applications into a number of executable and shared object files will suffice to assure that the DWARF sections within each individual linked object are less than 4 GBytes in size. However, for those cases where needed, the 64-bit format allows the unusual case to be handled as well. Even in this case, it is expected that only application supplied objects will need to be compiled using the 64-bit format; separate 32-bit format versions of system supplied shared executable libraries can still be used."

More normative text here suggests that DWARF32 is anticipated to be the norm.

 
FWIW, I think if we have a section type for DWARF64, we should also have a section type for DWARF32 (presumably consecutive numbers, probably with DWARF32 first). This allows tools in the future to skip some name comparisons, if they know they aren't relevant, because they know the section contains debug data. For example, the linker's --strip-debug option or LLD's getOutputSectionName functions could be improved to not need to check names if the type is DWARF32 (a name check would still be needed for older/mixed SHT_PROGBITS DWARF sections).
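
As a rough sketch of that idea: a tool could accept a section as debug data on the type alone and fall back to a name check only for legacy SHT_PROGBITS sections. The values 19 and 20 are hypothetical (nothing has been assigned), the function is not taken from any linker, and the name test is deliberately simplified to the `.debug_` prefix.

```python
SHT_PROGBITS = 1   # value from the ELF gABI
SHT_DWARF32 = 19   # hypothetical values; nothing has been assigned
SHT_DWARF64 = 20


def is_debug_section(name: str, sh_type: int) -> bool:
    """Decide whether a section holds DWARF, e.g. for a strip-debug pass."""
    if sh_type in (SHT_DWARF32, SHT_DWARF64):
        return True  # the type alone is enough, no string comparison
    # Legacy path: older objects mark DWARF sections as SHT_PROGBITS,
    # so the name comparison cannot go away entirely.
    return sh_type == SHT_PROGBITS and name.startswith(".debug_")
```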

James

Pavel Labath

Nov 16, 2020, 6:56:49 AM
to Generic System V Application Binary Interface
On Saturday, 14 November 2020 at 02:18:15 UTC+1 Ali Bahrami wrote:
> On 11/13/20 2:55 PM, 'Fāng-ruì Sòng' via Generic System V Application Binary Interface wrote:
> > There is no requirement that ELF64 must use DWARF64.
>
> I guess you mean that the dwarf spec doesn't explicitly
> say ELF64 can't contain DWARF32? That might be true, but no spec
> can explicitly list the universe of things that weren't intended
> or anticipated. "No requirement" is not the same thing as "expected"
> or "reasonable". In the absence of an explicit rule, why would anyone
> assume that 64-bit objects can contain 32-bit content? These objects
> violate expectations, if not the literal spec.
>
> I'll stand down from that claim, and apologize, if you can point at
> the place in the dwarf spec that says this is explicitly allowed.

How about this (near the end of Section 7.4 32-Bit and 64-Bit DWARF Formats, on page 198 of DWARF v5 spec):


=====
A DWARF consumer that supports the 64-bit DWARF format must support
executables in which some compilation units use the 32-bit format and others use
the 64-bit format provided that the combination links correctly (that is, provided
that there are no link-time errors due to truncation or overflow). (An
implementation is not required to guarantee detection and reporting of all such
errors.)

It is expected that DWARF producing compilers will not use the 64-bit format by
default. In most cases, the division of even very large applications into a number of
executable and shared object files will suffice to assure that the DWARF sections within
each individual linked object are less than 4 GBytes in size. However, for those cases
where needed, the 64-bit format allows the unusual case to be handled as well. Even in
this case, it is expected that only application supplied objects will need to be compiled
using the 64-bit format; separate 32-bit format versions of system supplied shared
executable libraries can still be used.
=====

cheers,
Pavel

Ali Bahrami

Nov 16, 2020, 10:39:42 AM
to gener...@googlegroups.com
That's pretty clear, thanks. :-)

So the DWARF spec explicitly allows this, something I didn't
know, and which wasn't clear to me from the original proposal
for this new section type. I apologize for wasting time
proposing other answers --- that ship has clearly sailed.

I interpret the above to say that if the amount of DWARF32
exceeds ELF limits, one must switch entirely to DWARF64.
Fangrui mentioned the sorting of dwarf content to put the
32-bit stuff first. I don't know if the GNU linkers do that
already, but such a (horrible) fix would clearly be done to
forestall (not prevent) hitting that limit.

I now understand where this proposal is coming from. While it
would be arguably cleaner to force people to recompile to
DWARF64 when they hit the limit, that's not very realistic --- this
will certainly become a permanent FAQ unless the linkers
deal with it somehow.

I'd rather have SHT_DWARF64 than be required to implement
the sorting of 32-bit DWARF ahead of 64-bit DWARF, so I
guess I support this change. I don't have a problem with
assigning it a value in the low generic range.

- Ali

Ali Bahrami

Nov 16, 2020, 11:03:54 AM
to gener...@googlegroups.com
On 11/16/20 2:46 AM, James Henderson wrote:
...
> I'll stand down from that claim, and apologize, if you can point at
> the place in the dwarf spec that says this is explicitly allowed.
> Otherwise, it really sounds like a mistake that has solidified
> into practice. I'll be interested to hear if I'm the only one who
> sees it that way.
...
> The DWARF v5 standard, section 7.4 states the following:

I stand down and apologize (see my previous reply also).

>
> "/Attribute values and section header fields that represent addresses in the target program are not affected by these rules./"
>
> This indicates to me that the target address and DWARF32/DWARF64 nature are deliberately unrelated. Note also that various DWARF sections have the address size encoded as a field in their DWARF header, separately from the DWARF32/DWARF64 stuff. Again, this shows that address size and 32 versus 64 bit DWARF are deliberately different. (Note: this piece is in italics, indicating it is not part of the specification per se, but is intended to clarify intent.)
>
> "A DWARF consumer that supports the 64-bit DWARF format must support executables in which some compilation units use the 32-bit format and others use the 64-bit format provided that the combination links correctly (that is, provided that there are no link-time errors due to truncation or overflow). (An implementation is not required to guarantee detection and reporting of all such errors.)"
>
> This is part of the formal specification - DWARF32 and DWARF64 are intermixable BY DESIGN. However, the second half of that first sentence is an interesting caveat and I'm not entirely sure how to digest that. It seems to indicate that the linker would be perfectly in keeping with the DWARF standard to reject things that don't work, but at the same time, it doesn't preclude the linker doing something smart to get it to work.

It seems that they are intermixable by design, but that the
design is incomplete. I read this as saying "we don't care
if you mix this content, but the linkers might, and that's
not our problem". I think you're right that we could just
reject such a link, but it's also clear that the end user
won't understand these subtleties, and will simply blame
the component that issues the error. That's fine if the
number of cases is small, but at some point, it does become
a linking problem that wants solving.

I'd rather tell them "no" than add dwarf sorting to the
linkers, but I can see that this SHT_DWARF64 proposal
would help those users without putting much of a burden
on the linkers. If we're going to become part of the
solution, this is a reasonable way to do it.



>
> "It is expected that DWARF producing compilers will not use the 64-bit format by default.
...
> More normative text here suggests that DWARF32 is anticipated to be the norm.
>

Good to know. Noted.


> FWIW, I think if we have a section type for DWARF64, we should also have a section type for DWARF32 (presumably consecutive numbers, probably with DWARF32 first). This allows tools in the future to skip some name comparisons, if they know they aren't relevant, because they know the section contains debug data. For example, the linker's --strip-debug option or LLD's getOutputSectionName functions could be improved to not need to check names if the type is DWARF32 (a name check would still be needed for older/mixed SHT_PROGBITS DWARF sections).


I almost said the same thing in my previous reply, but consider that
this is all being driven by existing use. If we did have SHT_DWARF32,
we would still have to support all the PROGBITS cases as well, so
it just adds additional complexity to also have SHT_DWARF32. This
is a lot like the recent discussion of unwind sections. It makes
sense as a clean from scratch design, but maybe less so as a retrofit.

One benefit of not doing this, is that old debuggers that understand
PROGBITS debug sections, but not SHT_DWARF32, can continue working for
the vast number of existing objects that don't have these size problems.

Thanks.

- Ali

Fangrui Song

Nov 16, 2020, 6:06:49 PM
to Generic System V Application Binary Interface
Thanks to James and Pavel, who quoted the standard, and to Ali, who no longer objects to SHT_DWARF64 :)

In Figure 4-9: Section Types, sh_type, let's add a new row:

SHT_DWARF64    19

and in the descriptions below:

SHT_DWARF64
This section holds debugging information which is described by the 64-bit DWARF format, as specified by the DWARF Debugging Information Format, Version 3 or a later version.

I hope this makes it clear that use of the section type is optional.

About SHT_DWARF32, I agree that it is probably not useful. GDB folks have mentioned gdb/dwarf2/read.c checks the section by name and ignores the type. LLDB does something similar. Using a different section type mostly confuses `readelf -S` with all existing readelf versions. In any case, the `readelf -S` output does not concern me and I don't mind if in the future we add a SHT_DWARF32 for completeness.

Cary Coutant

Nov 17, 2020, 11:35:39 AM
to Generic System V Application Binary Interface
>> It seems that they are intermixable by design, but that the
>> design is incomplete. I read this as saying "we don't care
>> if you mix this content, but the linkers might, and that's
>> not our problem". I think you're right that we could just
>> reject such a link, but it's also clear that the end user
>> won't understand these subtleties, and will simply blame
>> the component that issues the error. That's fine if the
>> number of cases is small, but at some point, it does become
>> a linking problem that wants solving.

I think that the DWARF spec made a mistake in not specifying
DW_FORM_sec_offset and DW_FORM_ref_addr to be address-sized values
rather than dependent on DWARF-32 vs. DWARF-64. That would blow up all
DWARF attributes of form DW_FORM_sec_offset, but that can be mitigated
by using DW_FORM_strx, DW_FORM_loclistx, and DW_FORM_rnglistx wherever
possible. DW_FORM_ref_addr should be pretty rare.
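
The size trade-off can be made concrete with a small sketch. It models only the forms named above: under DWARF v5 the two offset forms take 4 bytes in the 32-bit format and 8 bytes in the 64-bit format, and the three index forms are ULEB128-encoded. The `proposed` switch models the address-sized alternative described in this message, which is not part of any published DWARF version; the function itself is hypothetical.

```python
def form_size(form: str, dwarf64: bool, address_size: int, proposed: bool = False):
    """Size in bytes of a fixed-size form, or None for ULEB128-encoded forms.

    proposed=False follows DWARF v5: offset forms are 4 bytes in the 32-bit
    format and 8 bytes in the 64-bit format.
    proposed=True models the address-sized alternative suggested above,
    which is not part of any published DWARF version.
    """
    if form in ("DW_FORM_sec_offset", "DW_FORM_ref_addr"):
        if proposed:
            return address_size
        return 8 if dwarf64 else 4
    if form in ("DW_FORM_strx", "DW_FORM_loclistx", "DW_FORM_rnglistx"):
        return None  # ULEB128 index, variable length
    raise ValueError(f"form not modelled: {form}")
```

With an 8-byte address size, the proposed rule would give every `DW_FORM_sec_offset` the same width that DWARF64 uses today, which is why the index forms matter as a mitigation.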

I would move to fix this in DWARF 6.

Until that's available, though, it looks like we still need a fix....

> SHT_DWARF64 19

I'd prefer not to add a new section type, but I have an alternate
suggestion: Use a new section flag, say SHF_LARGE, that would tell the
linker that these section contributions should be placed after all
other contributions. X86 already has such a flag, SHF_X86_64_LARGE,
and PA-RISC and Itanium have the opposite -- SHF_PARISC_SHORT and
SHF_IA_64_SHORT.

-cary

Ali Bahrami

Nov 17, 2020, 12:57:19 PM
to gener...@googlegroups.com
Hi Cary,

Not adding a new section type is appealing, but SHF_LARGE raises
numerous questions for me:

1) It considerably widens the scope, from a hack for a specific
dwarf issue, to a general purpose feature. What are the wider
ramifications and unintended consequences when it's applied to
code, or other sections?

2) While we're calling it "large" and doing it for 64-bit dwarf,
it really amounts to a limited 2 bucket form of section ordering,
available for any purpose. If we're going to introduce generic
section ordering, is this really the feature we want to end up with?

3) How does it interact with the existing SHF_LINK_ORDER?

4) It's not really 1:1 with SHF_X86_64_LARGE, which is fine, but
which will probably be confusing.

Nailing all of that down will be a bit of work.

I'm afraid of what other uses of SHF_LARGE might follow, after it's been
used for this dwarf purpose and we've moved on. The advantage of SHT_DWARF64
is that its purpose is very limited, making it easy to deprecate later
once the need has passed. I'm not wild about burning a section code
like that for a short term purpose, but I like that the blast radius
is limited and that the situation is relatively easy to explain, and
understand. I also like that it requires very little new code in
link-editors.

If there's a chance of fixing this in DWARF6, then can we get away
with saying "recompile your code with DWARF64" to people in the meantime?
I was under the impression that there can be no good fix for this in DWARF.
If that was too pessimistic, then can we tread water and wait for
a proper dwarf fix?

As an aside, the mention of SHF_LARGE reminds me that we already accept
that code needs to be recompiled with different options when it grows
to certain limits under x86_64, with its small, medium, and large
programming models. The situation with DWARF seems conceptually the
same. Perhaps we'd be better off in the long run if we didn't do
any of this dwarf sorting or segregation, and keep things simple,
even though it does cause some complaints in the meantime?

- Ali

Cary Coutant

Nov 17, 2020, 2:48:33 PM
to Generic System V Application Binary Interface
> Not adding a new section type is appealing, but SHF_LARGE raises
> numerous questions for me:
>
> 1) It considerably widens the scope, from a hack for a specific
> dwarf issue, to a general purpose feature. What are the wider
> ramifications and unintended consequences when it's applied to
> code, or other sections?

True, but I think the impact is less than adding a new section type.
We can narrow the scope as much as we want, though, perhaps naming it
SHF_DWARF64 or SHF_DEBUG_LARGE. Given the uses we've already seen for
small/large partitioning, I could see it as a more general feature,
though. We could document it as a hint that could be ignored except
when linking exceptionally large applications.

> 2) While we're calling it "large" and doing it for 64-bit dwarf,
> it really amounts to a limited 2 bucket form of section ordering,
> available for any purpose. If we're going to introduce generic
> section ordering, is this really the feature we want to end up with?

We've already got hacks to do priority ordering for the
.ctors/.init_array sections, where the ordering is far more than 2
buckets, but we've also got .sdata/.data and .data/.ldata situations
where it's just a binary partitioning. I don't really know, but I
think the cost/benefit of this simple binary flag is pretty
attractive.

> 3) How does it interact with the existing SHF_LINK_ORDER?

The two flags would be incompatible.

> 4) It's not really 1:1 with SHF_X86_64_LARGE, which is fine, but
> which will probably be confusing.

Right, same with SHF_{PARISC|IA_64}_SHORT. Those flags apply to the
section as a whole, without saying anything about relative ordering of
contributions within a section.

> Nailing all of that down will be a bit of work.
>
> I'm afraid of what other uses of SHF_LARGE might follow, after it's been
> used for this dwarf purpose and we've moved on. The advantage of SHT_DWARF64
> is that its purpose is very limited, making it easy to deprecate later
> once the need has passed. I'm not wild about burning a section code
> like that for a short term purpose, but I like that the blast radius
> is limited and that the situation is relatively easy to explain, and
> understand. I also like that it requires very little new code in
> link-editors.

I'm not actually that afraid of later uses for the flag, if it's got a
well-specified meaning.

But I *really* don't like the new section type, even if its blast
radius is small.

Having mentioned the special handling of .ctors/.init_array, that
suggests another approach, too, that involves no ELF change. If the
compilers could just generate the 64-bit debug sections with
".debug64_" prefixes, we could handle the section ordering in the
linker as special cases in the same way we do the .ctors/.init_array
sections, and the same way we do .text.unlikely, .text.hot, etc.
sections. Although I hate special-casing on section names, I could
live with it given the existing precedents and the short-term nature
of the treatment.
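
A minimal sketch of that name-based handling, assuming the ".debug64_" prefix from the suggestion above (a hypothetical convention, not something taken from an existing toolchain): the linker folds each prefixed input section into the matching ".debug_" output section and gives it a later sort rank.

```python
DEBUG64_PREFIX = ".debug64_"  # hypothetical prefix from the suggestion above


def place(input_name: str):
    """Map an input section name to (output section name, sort rank).

    Rank 0 sorts first. ".debug64_info" is folded into ".debug_info" with
    rank 1, so 64-bit contributions follow the 32-bit ones.
    """
    if input_name.startswith(DEBUG64_PREFIX):
        return ".debug_" + input_name[len(DEBUG64_PREFIX):], 1
    return input_name, 0
```

Sorting the inputs of each output section by the returned rank gives the same DWARF32-first layout as the section type or flag approaches, without any change to ELF.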

> If there's a chance of fixing this in DWARF6, then can we get away
> with saying "recompile your code with DWARF64" to people in the meantime?
> I was under the impression that there can be no good fix for this in DWARF.
> If that was too pessimistic, then can we tread water and wait for
> a proper dwarf fix?

That would be nice, but it sounds like the LLVM community is looking
for a fix soon.

> As an aside, the mention of SHF_LARGE reminds me that we already accept
> that code needs to be recompiled with different options when it grows
> to certain limits under x86_64, with its small, medium, and large
> programming models. The situation with DWARF seems conceptually the
> same. Perhaps we'd be better off in the long run if we didn't do
> any of this dwarf sorting or segregation, and keep things simple,
> even though it does cause some complaints in the meantime?

Perhaps. I fear that we'd end up with a combinatorial explosion of
runtimes, though. I think those that considered the idea of having to
have separate versions of the libraries were correct in their concerns
about that approach.

-cary

Ali Bahrami

Nov 17, 2020, 11:01:36 PM11/17/20
to gener...@googlegroups.com
On 11/17/20 12:48 PM, Cary Coutant wrote:
> True, but I think the impact is less than adding a new section type.
> We can narrow the scope as much as we want, though, perhaps naming it
> SHF_DWARF64 or SHF_DEBUG_LARGE. Given the uses we've already seen for
> small/large partitioning, I could see it as a more general feature,
> though. We could document it as a hint that could be ignored except
> when linking exceptionally large applications.
...
> But I *really* don't like the new section type, even if its blast
> radius is small.

I don't see how the impact is less for a flag than a type. A flag
can potentially apply to all section types, but the reverse isn't true,
so I think (flag > sec).

But maybe we can defer settling that debate for now, since I think
we can agree that both are pretty unfortunate. I really like your
idea of using section names to solve this, below, in preference to
either of these ELF changes.

>
> Having mentioned the special handling of .ctors/.init_array, that
> suggests another approach, too, that involves no ELF change. If the
> compilers could just generate the 64-bit debug sections with
> ".debug64_" prefixes, we could handle the section ordering in the
> linker as special cases in the same way we do the .ctors/.init_array
> sections, and the same way we do .text.unlikely, .text.hot, etc.
> sections. Although I hate special-casing on section names, I could
> live with it given the existing precedents and the short-term nature
> of the treatment.
>

Yeah!

I could definitely live with using a special name for a PROGBITS
section for this, both because I hope it's not forever, but also
because so many of the debug related sections are only differentiated
by name. We're already stuck playing the name game for debug sections,
so .debug64 doesn't make things appreciably worse.

I wonder if we could dispense with the section ordering entirely,
and simply allow the output objects to end up with both .debug
and .debug64 sections?

- The work for the compiler is the same in either case. Just
put DWARF64 in .debug64 sections.

- Link-editors need no changes. They'll just concatenate
the input sections and pass them through using default
ELF behavior.

- Debuggers would need to learn to read both .debug and .debug64,
to obtain all the debug content. I imagine that there's currently
code that matches .debug and calls a function to read that content.
Calling that same function for 2 sections ought to be pretty simple.

It's human nature for compiler folks to want linkers to deal with stuff,
and for linker folks to want the debuggers to deal with it. However, in
this case, I think the change to the debuggers should really be simpler
than what the linkers would have to do, because the debuggers can just
read the 2 sections and treat them as one logical item --- no sorting
required.

The merit of this is that once dwarf gets fixed, we can easily
erase our tracks without leaving behind any permanent damage to ELF,
and without having to add sorting code that we would then be stuck
with after the need for it is largely gone.

- Ali

Fangrui Song

Nov 18, 2020, 12:07:46 PM11/18/20
to gener...@googlegroups.com
I am glad that several alternative proposals are being discussed here:)
My intention was to see what other linker experts think about
this topic:)

FWIW I have considered 4 proposals now:

1) No ELF side change. The linker magically partitions DWARF32/DWARF64 sections.
Actually I have a prototype for this idea: https://reviews.llvm.org/D91404
In practice, the first relocation of a .debug_* section is a good indicator of whether it is
a DWARF64 section. You can see the patch for details.

However, .debug_str is difficult to handle with this approach because .debug_str
itself does not have relocations. The closest heuristic I can think of is: "if a
.debug_info's first relocation is of a 64-bit absolute relocation type, mark
.debug_str in the same input file as 'DWARF64'".

Unfortunately, this makes the linker behavior dependent on other sections, which
is why I feel lost: when we write .debug_str 0 : { *(.debug_str) }, we really
want the output section .debug_str to be produced using only information from
the input section descriptions, not random information from other .debug_* sections.
(--sort-section/SHF_LINK_ORDER/LLD --shuffle-sections/gold --section-ordering-file/LLD --symbol-ordering-file/SORT:
we have many ways to change the order of '*', but here we would need information from other output sections.)

Another problem is relocatable links: if we order DWARF64 before DWARF32 in a
relocatable link, we may treat the combined section as "DWARF64" even though it
is still subject to the DWARF32 relocation limitation.
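
To make the heuristic in 1) concrete, here is a rough Python sketch. It is not the code in D91404; `is_dwarf64_by_first_reloc` and `classify_object` are hypothetical names, and relocations are modelled as type-name strings purely for illustration:

```python
# 64-bit absolute relocation types; illustrative, not an exhaustive list.
ABS64_RELOCS = {"R_X86_64_64", "R_PPC64_ADDR64"}


def is_dwarf64_by_first_reloc(reloc_types):
    """Guess the DWARF format of a .debug_* input section from its first relocation."""
    return bool(reloc_types) and reloc_types[0] in ABS64_RELOCS


def classify_object(sections):
    """sections: {section_name: [relocation type names in offset order]} for one object.

    Returns {section_name: True if the section is treated as DWARF64}.
    """
    info_is_64 = is_dwarf64_by_first_reloc(sections.get(".debug_info", []))
    result = {}
    for name, relocs in sections.items():
        if name == ".debug_str":
            # .debug_str has no relocations of its own, so the verdict is
            # borrowed from .debug_info in the same object file.
            result[name] = info_is_64
        else:
            result[name] = is_dwarf64_by_first_reloc(relocs)
    return result
```

The `.debug_str` branch is exactly the cross-section dependency complained about above: the answer for one section comes from looking at another.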

2) A new section type SHT_DWARF64. The linker partitions sections with the section type.

.debug_info 0 : { *(TYPE (SHT_PROGBITS) .debug_info) *(TYPE (SHT_GNU_DWARF64) .debug_info) }

With the usual linker rule that mixing a non-SHT_PROGBITS section with an SHT_PROGBITS
section yields SHT_PROGBITS, a relocatable link output can be correctly inferred as "DWARF32".
(Conceptually, the combined section should impose the stricter restriction
when it is further combined with other sections.)
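
A small Python sketch of the two rules in 2), under stated assumptions: `SHT_DWARF64` is given an arbitrary placeholder value (no value has been reserved), and `partition_by_type` and `merged_type` are hypothetical helpers:

```python
SHT_PROGBITS = 1
SHT_DWARF64 = 0x60000444  # arbitrary placeholder; no such value has been reserved


def partition_by_type(inputs):
    """inputs: list of (name, sh_type, size) in input order.

    Stable sort that places SHT_PROGBITS (DWARF32) inputs before SHT_DWARF64 ones.
    """
    return sorted(inputs, key=lambda s: s[1] == SHT_DWARF64)


def merged_type(inputs):
    """Type of the combined output section in a relocatable link.

    The output stays SHT_DWARF64 only if every input is SHT_DWARF64;
    any SHT_PROGBITS input demotes the result to SHT_PROGBITS ("DWARF32").
    """
    if inputs and all(s[1] == SHT_DWARF64 for s in inputs):
        return SHT_DWARF64
    return SHT_PROGBITS
```

The demotion in `merged_type` is what keeps a later link from wrongly treating a mixed `-r` output as safe to place past the 4 GB mark.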

3) A new section flag. I agree with Ali on this point. This idea has a larger blast radius.
(We have 0x60000000 section types but only 8 remaining generic section flags.)
I don't find the existing SHF_*_LARGE and SHF_*_SMALL flags a convincing precedent because they actually use different output section names.
(Please correct me if I am wrong; I also think they are legacy.)
Should the output section have the flag if any of the input sections has the flag?
For many other flags this rule applies, but for SHF_*_LARGE the flag can cause a problem
with relocatable links (as I mentioned above).

4) A section prefix. To be fair, as a linker person I like it in two ways:

* It is immediately obvious whether DWARF64 is used and whether
DWARF32 is used along with DWARF64.
* In a relocatable link mixing DWARF32 and DWARF64 sections, DWARF32
and DWARF64 sections will naturally not get mixed. We don't even need another linker feature
to match input sections by type.

On the other hand,

* It is non-conforming due to different section names.
James Henderson
(https://lists.llvm.org/pipermail/llvm-dev/2020-November/146721.html): "However,
conformance is still a concern to me as we cannot really retrofit the existing
standard versions, and the section names themselves are in the standard. That
means that tools that otherwise would work might stop working when presented
with a "new" DWARFv3/4/5 output that it in theory could otherwise handle."

* Tooling support. Some commonly used consumers recognize .debug_* sections by name and would need changes:
+ gdb: gdb/dwarf2/read.c recognizes .debug_* by name and does not support multiple .debug_info sections
(confirmed with gdb maintainers)
+ objcopy --strip-debug: needs to learn the new .debug64 prefix.
James mentioned that "you could do .debug_64_info or .debug_info_64 probably safely"
+ gold --gdb-index
+ In LLVM, to give an intuitive feeling, a number of places need to account for more sections:
integrated assembler / DWARFContext / MCObjectFileInfo / llvm-dwarfdump's -debug* options.

As a refinement (James'), we can let the linker combine .debug64 or .debug_64 in object files.
Tools dealing with linked images will not need a change, but tools dealing with object files
(objcopy --strip-debug/gold --gdb-index/assembler/...) still suffer from
complexity due to the doubled number of sections.

On balance, these downsides make this no more appealing than SHT_DWARF64, which
can retrofit existing standards.

Hey, I like this quote:)

>However, in
>this case, I think the change to the debuggers should really be simpler
>than what the linkers would have to do, because the debuggers can just
>read the 2 sections and treat them as one logical item --- no sorting
>required.

That said, I hope my comments above have explained why a section prefix proposal
is not more appealing.

>The merit of this is that once dwarf gets fixed, we can easily
>erase our tracks without leaving behind any permanent damage to ELF,
>and without having to add sorting code that we would then be stuck
>with after the need for it is largely gone.
>
>- Ali

"dwarf gets fixed" - I hope Pavel and James' comments have explained this is not
a bug in the DWARF standard:)
DW_FORM_strp/DW_FORM_strp_sup/DW_FORM_line_strp/DW_FORM_sec_offset and GNU extensions
DW_FORM_GNU_ref_alt/DW_FORM_GNU_strp_alt use 32-bit offsets in the 32-bit DWARF
format. The offset encoding is the main point. I think this is intentional;
otherwise, what would be the advantage of having a separate 64-bit DWARF
format? Just its 64-bit length encoding? :)

H.J. Lu

Nov 18, 2020, 12:23:38 PM11/18/20
to Generic System V Application Binary Interface
FWIW, as a linker person, I prefer the .debug64 prefix.

--
H.J.

Eric Christopher

Nov 18, 2020, 12:27:50 PM11/18/20
to gener...@googlegroups.com
That will require, at least, a textual change to the dwarf standard and a lot of updates to tooling. That side of the discussion is probably best had on the dwarf mailing lists. 

-eric
 

--
You received this message because you are subscribed to the Google Groups "Generic System V Application Binary Interface" group.
To unsubscribe from this group and stop receiving emails from it, send an email to generic-abi...@googlegroups.com.

Cary Coutant

Nov 18, 2020, 2:19:10 PM11/18/20
to Generic System V Application Binary Interface
> However, .debug_str is difficult to handle with this approach because .debug_str
> itself does not have relocations. The closest heuristic I can think of is: "if a
> .debug_info's first relocation is of a 64-bit absolute relocation type, mark
> .debug_str in the same input file as 'DWARF64'".

.debug_str is also (typically) a string-merge section, and perhaps the
least likely section (aside from .debug_abbrev) to overflow 4 GB.

-cary

Cary Coutant

Nov 18, 2020, 2:40:49 PM11/18/20
to Generic System V Application Binary Interface
> >However, in
> >this case, I think the change to the debuggers should really be simpler
> >than what the linkers would have to do, because the debuggers can just
> >read the 2 sections and treat them as one logical item --- no sorting
> >required.
>
> That said, I hope my comments above have explained why a section prefix proposal
> is not more appealing.
>
> >The merit of this is that once dwarf gets fixed, we can easily
> >erase our tracks without leaving behind any permanent damage to ELF,
> >and without having to add sorting code that we would then be stuck
> >with after the need for it is largely gone.
> >
> >- Ali
>
> "dwarf gets fixed" - I hope Pavel and James' comments have explained this is not
> a bug in the DWARF standard:)
> DW_FORM_strp/DW_FORM_strp_sup/DW_FORM_line_strp/DW_FORM_sec_offset and GNU extensions
> DW_FORM_GNU_ref_alt/DW_FORM_GNU_strp_alt use 32-bit offsets in the 32-bit DWARF
> format. The offset encoding is the main point. I think this is intentional
> otherwise what's the advantage/differences having a separate 64-bit DWARF
> format? Just for its 64-bit length encoding? :)

You may have missed my comment where I explained why I think the DWARF
spec *does* need to be fixed. It is not an advantage if we can't mix
DWARF-32 and DWARF-64 objects freely.

Also, please see my reply on the binutils list where I asked whether
this is merely a theoretical problem, and, if not, whether you are
experiencing overflow in any section besides .debug_info.

https://sourceware.org/pipermail/binutils/2020-November/114200.html

I view any changes you're going to make for DWARF 5 here as temporary
hacks to last until the DWARF committee agrees on a proper solution.
The scope of the problem you're experiencing is a factor in what
solution may appear preferable. For example, your objection to the
section name prefix proposal is based on perceived conformance issues
such as requiring textual changes to the DWARF standard. That ship
has sailed: conformance is already out the window. Any of your
solutions are going to deviate from the DWARF and/or ELF specs. And I
doubt that we'll be able to pick a solution now that will end up
unmodified in DWARF 6.

-cary

p.s. It's unfortunate that this discussion now appears to be split
across three separate mailing lists. Can we try to keep it all in one
place?

Ali Bahrami

Nov 18, 2020, 2:42:37 PM11/18/20
to gener...@googlegroups.com
On 11/18/20 10:07 AM, 'Fangrui Song' via Generic System V Application Binary Interface wrote:
> FWIW I have considered 4 proposals now:

I appreciate it. I know you're trying to solve a problem
no one else has picked up, and that you're trying to find
the simplest path through.


> 4) A section prefix. To be fair, as a linker person I like it in two ways:
...
> * It is non-conforming due to different section names.
> James Henderson
> (https://lists.llvm.org/pipermail/llvm-dev/2020-November/146721.html): "However,
> conformance is still a concern to me as we cannot really retrofit the existing
> standard versions, and the section names themselves are in the standard. That
> means that tools that otherwise would work might stop working when presented
> with a "new" DWARFv3/4/5 output that it in theory could otherwise handle."

I hope that this is simple enough to be seriously considered.
Can we explore the possibility? How many such tools are there,
and is updating them really out of the question?


> "dwarf gets fixed" - I hope Pavel and James' comments have explained this is not
> a bug in the DWARF standard:

Please read it as "dwarf gets improved". Bugs and feature
requests both get fixes.

- Ali

Cary Coutant

Nov 18, 2020, 2:44:36 PM11/18/20
to Generic System V Application Binary Interface
> > But I *really* don't like the new section type, even if its blast
> > radius is small.
>
> I don't see how the impact is less for a flag than a type. A flag
> can potentially apply to all section types, but the reverse isn't true,
> so I think (flag > sec).

That's not the way I look at it. Yes, the flag can apply in more
cases, but that's why I prefer it. It's less specialized and could be
potentially useful in other similar situations while not causing any
undue strain on the ELF ecosystem. A section type sticks out like a
sore thumb.

-cary

Cary Coutant

Nov 18, 2020, 2:49:38 PM11/18/20
to Generic System V Application Binary Interface
Here's a copy of yesterday's response that I posted to the binutils
list. Let's keep the conversation here on this list for now...

-cary

---------- Forwarded message ---------
From: Cary Coutant <ccou...@gmail.com>
Date: Tue, Nov 17, 2020 at 3:08 PM
Subject: Re: How to sort mixed DWARF32 and DWARF64 .debug_*
To: Fangrui Song <i...@maskray.me>
Cc: Binutils <binu...@sourceware.org>


> I want to bring it to your attention that on llvm-dev some folks are discussing
> how to sort mixed DWARF32 and DWARF64 .debug_* sections https://lists.llvm.org/pipermail/llvm-dev/2020-November/146522.html
>
> There are merits placing DWARF32 .debug_info before DWARF64 .debug_info
> because otherwise the DWARF32 .debug_info could still lead to out-of-range
> relocations. This approach will however break the usual rule for linking
> unrecognized sections:
>
> "When not otherwise constrained, sections should be emitted in input order."
>
> https://lists.llvm.org/pipermail/llvm-dev/2020-November/146528.html
> has an idea that DWARF64 sections can be detected by the relocation type (i.e.
> on ppc64, check whether the first relocation is R_PPC64_ADDR64)

I went back and read through that thread, and now I get the impression
that this is a theoretical concern that no one has yet encountered in
practice. Am I wrong? If so, what sections have actually exceeded 4 GB
in size?

Looking at this analytically, the most likely section to exceed the
total size limit is .debug_info. There are only three places you could
have a relocatable reference into .debug_info:

(1) A reference from one DIE to another (external) DIE using
DW_FORM_ref_addr. This form should be rare, if present at all -- I
don't know of any mainstream use of this. It was intended to be used
in some kind of separate compilation where you have multiple
translation units for one compilation unit, or in some extreme forms
of global optimization. If this is in fact where you're encountering a
relocation overflow, you could (for the time being) simply invent your
own form DW_FORM_ref_addr8 and use it throughout your toolchain. That
could tide you over until we address this issue in DWARF 6.

(2) References from the .debug_names (or, in DWARF 4,
.debug_pubnames/pubtypes) table to a compilation unit. These
accelerated access tables are meant to be combined at link time, so
they could be generated as DWARF-64 tables when necessary.

(3) References from the .debug_aranges table to a compilation unit.
This may be the most likely failure scenario, where a .debug_aranges
section from a library is generated as DWARF-32, but references a
compilation unit whose offset within .debug_info is larger than 4 GB.
We could mitigate this by simply requiring that *all* .debug_aranges
tables be generated as DWARF-64. This would be a minimal impact, since
only the unit_length and debug_info_offset fields would grow; all the
range descriptor entries in the table would already be address-size.
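
As a back-of-the-envelope check on item (3), the following Python sketch computes the size of one .debug_aranges set header in each format. It assumes the standard header layout (unit_length, 2-byte version, debug_info_offset, 1-byte address_size, 1-byte segment_selector_size, then padding up to the tuple size) and a zero segment_selector_size; `aranges_header_size` is a hypothetical helper name:

```python
def aranges_header_size(is_dwarf64, address_size=8):
    """Bytes before the first (address, length) tuple of one .debug_aranges set."""
    # DWARF64 uses a 4-byte 0xffffffff escape followed by an 8-byte length.
    unit_length = 12 if is_dwarf64 else 4
    debug_info_offset = 8 if is_dwarf64 else 4
    # version (2) + address_size (1) + segment_selector_size (1)
    raw = unit_length + 2 + debug_info_offset + 1 + 1
    # The first tuple is aligned to the tuple size; assumes segment_selector_size == 0.
    tuple_size = 2 * address_size
    return -(-raw // tuple_size) * tuple_size  # round up
```

Under these assumptions, with 8-byte addresses the header goes from 16 to 32 bytes per set, and the range tuples themselves are unchanged.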

All other relocatable values in DWARF using DW_FORM_sec_offset,
DW_FORM_strp, DW_FORM_ref_sup[48], DW_FORM_strp_sup, or
DW_FORM_line_strp refer to other DWARF sections, and those other
sections should be at least an order of magnitude smaller than
.debug_info. I suspect this is either a theoretical problem that could
wait, or something you could solve without any special linker support.

-cary

Igor Kudrin

Nov 19, 2020, 7:27:35 AM11/19/20
to gener...@googlegroups.com, Cary Coutant
On 19.11.2020 2:49, Cary Coutant wrote:
> (3) References from the .debug_aranges table to a compilation unit.
> This may be the most likely failure scenario, where a .debug_aranges
> section from a library is generated as DWARF-32, but references a
> compilation unit whose offset within .debug_info is larger than 4 GB.
> We could mitigate this by simply requiring that *all* .debug_aranges
> tables be generated as DWARF-64. This would be a minimal impact, since
> only the unit_length and debug_info_offset fields would grow; all the
> range descriptor entries in the table would already be address-size.

Wouldn't that violate the statement from section 7.4? "The 32-bit and 64-bit DWARF format conventions must not be intermixed within a single compilation unit."

--
Best Regards,
Igor Kudrin
C++ Developer, Access Softek, Inc.

James Henderson

Nov 19, 2020, 7:31:35 AM11/19/20
to gener...@googlegroups.com, Cary Coutant
Not by my understanding - the .debug_info for that CU is still DWARF32, but could be placed after 4GB of other .debug_info from other CUs at link time. A linked object will usually contain multiple CUs worth of information.


Igor Kudrin

Nov 19, 2020, 8:36:46 AM11/19/20
to gener...@googlegroups.com, James Henderson, Cary Coutant


On 19.11.2020 19:31, James Henderson wrote:
> Not by my understanding - the .debug_info for that CU is still DWARF32, but could be placed after 4GB of other .debug_info from other CUs at link time. A linked object will usually contain multiple CUs worth of information.

I was always under the impression that this excerpt sets the requirement for all debugging information for a compilation unit spread among all sections, not only .debug_info. Otherwise, some links would be broken. For example, let's assume there is a DWARF64 .debug_rnglists table with a non-zero offset_entry_count and a DWARF32 CU which references it using DW_AT_rnglists_base. What size would the offsets in the range list have?

James Henderson

Nov 19, 2020, 8:46:42 AM11/19/20
to Igor Kudrin, gener...@googlegroups.com, Cary Coutant
I don't think we're disagreeing. The .debug_rnglists table will belong to the same CU as the DW_TAG_compile_unit referencing it, and therefore by section 7.4 both must be 32-bit DWARF or both 64-bit DWARF.

On the other hand, if you end up with two input objects, say 32.o and 64.o, they'll each have their own .debug_rnglists table, which means there is no risk of ambiguity here. When linked together, there'll be a single .debug_rnglists section with multiple contributions, which can have different DWARF32/DWARF64 state.
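
A minimal Python sketch of how a consumer can tell the contributions apart in such a linked section. It relies on the DWARF initial-length encoding (a 32-bit value of 0xffffffff means a 64-bit length follows) and assumes little-endian data with no padding between contributions; `contribution_formats` is a hypothetical helper, and the reserved values 0xfffffff0 to 0xfffffffe are not handled:

```python
import struct


def contribution_formats(data):
    """Walk concatenated DWARF units in `data` (little-endian bytes).

    Yields (offset, "DWARF32" or "DWARF64", total_size) for each contribution,
    where total_size includes the initial length field itself.
    """
    pos = 0
    while pos < len(data):
        (length32,) = struct.unpack_from("<I", data, pos)
        if length32 == 0xFFFFFFFF:
            # 64-bit DWARF: escape code followed by an 8-byte unit length
            (length,) = struct.unpack_from("<Q", data, pos + 4)
            fmt, total = "DWARF64", 12 + length
        else:
            fmt, total = "DWARF32", 4 + length32
        yield pos, fmt, total
        pos += total
```

Because each unit announces its own format, the DWARF32/DWARF64 state is per contribution and there is no ambiguity within the combined section.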

I think I misunderstood what you were commenting on. Having reread Cary's comment you were replying to: you were specifically calling out the .debug_aranges being 64-bit DWARF always, right? I thought you were commenting on the .debug_info/4 GB offsets.

Thinking about Cary's comment, whilst it would be a technical violation of the standard, making all .debug_aranges DWARF64 would not likely have any negative impact, since they're self-contained. The only issue would be code which doesn't check the state of the info contribution they're referring to.

Cary Coutant

Nov 19, 2020, 4:22:51 PM11/19/20
to jh737...@my.bristol.ac.uk, Igor Kudrin, Generic System V Application Binary Interface
> Thinking about Cary's comment, whilst it would be a technical violation of the standard, making all .debug_aranges DWARF64 would not likely have any negative impact, since they're self-contained. The only issue would be code which doesn't check the state of the info contribution they're referring to.

Yes, that's what I was intending. It's an unnecessary restriction in
this case, and could be violated with little consequence.

-cary

Cary Coutant

Nov 20, 2020, 3:44:36 PM11/20/20
to Generic System V Application Binary Interface
> I went back and read through that thread, and now I get the impression
> that this is a theoretical concern that no one has yet encountered in
> practice. Am I wrong? If so, what sections have actually exceeded 4 GB
> in size?

I'm still interested in the answer to this question. Over on the
llvm-dev thread that started this, Alex Y. introduced the problem with
this:

> ... we are looking into [DWARF-64] as one of the options for
> handling debug information over 4gigs in production environment.
> One concern is that due to mix of third party libraries and llvm
> compiled code the final library/binary will have a mix of CU that
> are DWARF32/64. This is supported by DWARF format. With this mix
> it is possible that even with DWARF64 enabled one can still
> encounter relocation overflows errors in LLD if DWARF32 sections
> happen to be processed towards the end.

That suggests that this is a theoretical problem not yet seen in the
wild. While I agree that the problem is real in theory, it would help
to know just how urgent a solution is. Can it wait for DWARF 6, for
example? (Know that a final version of DWARF 6 wouldn't be out for at
least two years, even if we reconvene the working group right away.)
Can you use -gsplit-dwarf?

At Google, link times and binary sizes were getting untenable long
before we approached the limits of DWARF-32. Split DWARF (aka
"Fission") provided much relief. I cringe at the thought of a binary
with over 4 GB of debug info in it.

-cary

Fangrui Song

Nov 20, 2020, 4:15:53 PM11/20/20
to Generic System V Application Binary Interface
> Cary: p.s. It's unfortunate that this discussion now appears to be split
> across three separate mailing lists. Can we try to keep it all in one
> place?

(A bit unfortunate, but many folks have different preferences/subscriptions
and the topic touches many parts of the toolchain. Posting to this list is my best effort to
reach folks from other ELF-based OSes.)

> You may have missed my comment where I explained why I think the DWARF
> spec *does* need to be fixed. It is not an advantage if we can't mix
> DWARF-32 and DWARF-64 objects freely.
>
> Also, please see my reply on the binutils list where I asked whether
> this is merely a theoretical problem, and, if not, are you
> experiencing overflow in any section besides .debug_info?
> https://sourceware.org/pipermail/binutils/2020-November/114200.html
>
> I view any changes you're going to make for DWARF 5 here as temporary
> hacks to last until the DWARF committee agrees on a proper solution.
> The scope of the problem you're experiencing is a factor in what
> solution may appear preferable. For example, your objection to the
> section name prefix proposal is based on perceived conformance issues
> such as requiring textual changes to the DWARF standard. That ship
> has sailed: conformance is already out the window. Any of your
> solutions are going to deviate from the DWARF and/or ELF specs. And I
> doubt that we'll be able to pick a solution now that will end up
> unmodified in DWARF 6.

Thanks for the write-up (I only noticed it after you mentioned it) and
thanks for replying on llvm-dev. Other folks probably haven't seen your
reply (and some replies on llvm-dev have lost some CCs). I'll post a reply
to bring it to their attention. I agree that many .debug_info offsets can probably be replaced
with a .quad (cost: 4 bytes per input section) to mitigate the 32-bit relocation overflow issues.

> Ali: I hope that this is simple enough to be seriously considered.
> Can we explore the possibility? How many such tools are there,
> and is updating them really out of the question?

The section name approach should definitely be seriously considered.
(A new standard is required to make new section names conformant)
I've forwarded the proposal to llvm-dev and I'd like to emphasize it again over there.

That said, I still think reserving a section type complements the proposal.
Should we still teach the tools to detect DWARF64 by DWARF64 specific section
names? (if (name.startswith(".debug64_")) ... else if (name.startswith(".debug_")) ...)

Personally I think a SHT_DWARF64 still makes sense, no matter the
decision on the section name.

  if (type == SHT_DWARF64)
    handle DWARF64
  else if (name.startswith(".debug_"))
    handle DWARF32

Note that a section type can retrofit the existing standard versions (v3, v4
and v5). As for tooling support, tools generally don't depend on the section type
of .debug_*: no issue for lldb/gdb/llvm-{objcopy,strip}/(likely) binutils
strip/objcopy.

> I'm still interested in the answer to this question. Over on the
> llvm-dev thread that started this, Alex Y. introduced the problem with
> this:
>
> > ... we are looking into [DWARF-64] as one of the options for
> > handling debug information over 4gigs in production environment.
> > One concern is that due to mix of third party libraries and llvm
> > compiled code the final library/binary will have a mix of CU that
> > are DWARF32/64. This is supported by DWARF format. With this mix
> > it is possible that even with DWARF64 enabled one can still
> > encounter relocation overflows errors in LLD if DWARF32 sections
> > happen to be processed towards the end.

> That suggests that this is a theoretical problem not yet seen in the
> wild. While I agree that the problem is real in theory, it would help
> to know just how urgent a solution is. Can it wait for DWARF 6, for
> example? (Know that a final version of DWARF 6 wouldn't be out for at
> least two years, even if we reconvene the working group right away.)
> Can you use -gsplit-dwarf?
> At Google, link times and binary sizes were getting untenable long
> before we approached the limits of DWARF-32. Split DWARF (aka
> "Fission") provided much relief. I cringe at the thought of a binary
> with over 4 GB of debug info in it.

Yes. I think the 32-bit relocation overflow issue is less urgent for us since we use split DWARF.
The problem might be more urgent for Alexander (Facebook).
Igor Kudrin likely has a need too, considering his contributions to DWARF64 generation.

With my LLVM contributor (and LLD maintainer) hat on, I still want to
address the problem in a timely manner to help other community members. Of course, we
should make sure the relevant parties have been properly informed (that was why
I posted here and on binutils; many binutils folks with opinions don't
subscribe here). Just now I saw that Alexander Yermolovich started a GCC
thread about the DWARF64 option name.


My LLD patch https://reviews.llvm.org/D91404 is ready, pending agreement on an appropriate way to fix the problem.

So, again, would the list not consider a section type, essentially requiring DWARF64 sections to keep
using SHT_PROGBITS?

Once a decision is made, I will volunteer to do the LLVM integrated assembler/binary utilities side work.
I am happy to make readelf -S look good, too. GNU as is too difficult for me to work on, though....

Alex

Nov 21, 2020, 7:42:53 PM11/21/20
to Generic System V Application Binary Interface

My apologies, I didn't realize the conversation had moved entirely to this group/thread.
Yes, we have hit .debug_info overflow in some of our builds. So it's not a theoretical problem.
Regarding .debug_str, at least in the binaries I looked at, its size ranged from 1.8GB to 2.2GB for .debug_info section sizes ranging from 3.3GB to ~4GB. So at least that part is theoretical, for now.

My 2 cents, for whatever it's worth: I don't see how any of the proposed solutions
1) section type
2) section flag
3) debug section name change

avoids affecting the spec. At that point the choice is between a future spec change and retrofitting the existing one.
Maybe it is better to leave it to the DWARF6 spec.

Thank You
Alex

Michael Eager

Nov 21, 2020, 11:46:42 PM11/21/20
to gener...@googlegroups.com
On 11/20/20 12:44 PM, Cary Coutant wrote:
> That suggests that this is a theoretical problem not yet seen in the
> wild. While I agree that the problem is real in theory, it would help
> to know just how urgent a solution is. Can it wait for DWARF 6, for
> example? (Know that a final version of DWARF 6 wouldn't be out for at
> least two years, even if we reconvene the working group right away.)
> Can you use -gsplit-dwarf?

I agree.

I'd much rather see a real-world problem driving a solution, rather than
looking at a theoretical issue. The potential for issues with DWARF64
has been around since DWARF v3, I believe.

Rather than generate DWARF64, it seems to me that splitting debug
sections would be much preferred.

> At Google, link times and binary sizes were getting untenable long
> before we approached the limits of DWARF-32. Split DWARF (aka
> "Fission") provided much relief. I cringe at the thought of a binary
> with over 4 GB of debug info in it.

Likewise.

FWIW, I'd prefer using a different section name, if that is part of a
proposed solution to a real problem, rather than a section type.

--
Michael Eager

Igor Kudrin

Nov 24, 2020, 10:44:21 AM11/24/20
to gener...@googlegroups.com, Cary Coutant
On 21.11.2020 3:44, Cary Coutant wrote:
>> I went back and read through that thread, and now I get the impression
>> that this is a theoretical concern that no one has yet encountered in
>> practice. Am I wrong? If so, what sections have actually exceeded 4 GB
>> in size?
>
> I'm still interested in the answer to this question. Over on the
> llvm-dev thread that started this, Alex Y. introduced the problem with
> this:
>
>> ... we are looking into [DWARF-64] as one of the options for
>> handling debug information over 4gigs in production environment.
>> One concern is that due to mix of third party libraries and llvm
>> compiled code the final library/binary will have a mix of CU that
>> are DWARF32/64. This is supported by DWARF format. With this mix
>> it is possible that even with DWARF64 enabled one can still
>> encounter relocation overflows errors in LLD if DWARF32 sections
>> happen to be processed towards the end.
>
> That suggests that this is a theoretical problem not yet seen in the
> wild. While I agree that the problem is real in theory, it would help
> to know just how urgent a solution is. Can it wait for DWARF 6, for
> example? (Know that a final version of DWARF 6 wouldn't be out for at
> least two years, even if we reconvene the working group right away.)
> Can you use -gsplit-dwarf?

In our case, developers create large applications, which must be represented as a single binary because of platform restrictions. Split DWARF might be an option, but there are cases where it is not very convenient. Note also that DWARF package files have a design flaw that prevents them from containing more than 4 GB of data in a single section, which makes them useless at exactly the point where you hit the DWARF32 limits.
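
The package-file limit mentioned above presumably refers to the index tables: in DWARF v5, `.debug_cu_index` and `.debug_tu_index` record each contribution's offset and size as 4-byte words, so a contribution that starts at or beyond 4 GiB cannot be described. A minimal Python sketch of that constraint follows. The helper names are made up for illustration, little-endian byte order is assumed, and this is not a real index writer.

```python
import struct

def pack_index_entry(offset, size):
    """Pack one (offset, size) pair as two little-endian 4-byte words.

    Hypothetical helper: it mimics only the width of the index fields,
    not the actual layout of a DWARF package index section.
    """
    return struct.pack("<II", offset, size)

def fits_in_index(offset, size):
    """Return True if both values can be stored in 32-bit index fields."""
    try:
        pack_index_entry(offset, size)
        return True
    except struct.error:
        # struct refuses values outside 0 .. 2**32 - 1 for the "I" format.
        return False
```

Under these assumptions, `fits_in_index(1 << 32, 16)` is false: the offset alone no longer fits, regardless of whether the units inside use DWARF32 or DWARF64.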

In my understanding, the DWARF standard only defines how the debugging information is represented in the final binary where it can be read by a debugger. It does not instruct tools like compilers and linkers on how to produce intermediate files in detail. From this standpoint, the standard does not have issues that need to be fixed and the problem of intermixing DWARF32/DWARF64 data can be resolved by adjusting only linkers and/or compilers.

If we take the route of updating tools, the solution can be delivered to users almost instantly.

If we take the route of changing the standard, I fear that users will not be able to use the feature for ages. If the solution depends on the changed standard, then all libraries, including third-party ones, will have to be updated to carry debugging information in the new format. But most producers have no incentive to do that, at least until all the tools their clients use support the new format. I guess that may take forever.

Ali Bahrami

Nov 24, 2020, 2:26:22 PM11/24/20
to gener...@googlegroups.com
On 11/24/20 8:44 AM, Igor Kudrin wrote:
> In my understanding, the DWARF standard only defines how the debugging information is represented in the final binary where it can be read by a debugger. It does not instruct tools like compilers and linkers on how to produce intermediate files in detail. From this standpoint, the standard does not have issues that need to be fixed and the problem of intermixing DWARF32/DWARF64 data can be resolved by adjusting only linkers and/or compilers.

The decisions made in the dwarf standard about this clearly
have ramifications for compilers and linkers. If the DWARF
content is a problem for compilers and linkers, then it's a
problem for DWARF too. "We did this --- you fix it", just
isn't a good model.


> If we take the way of updating tools, the solution can be delivered to the users almost instantly.

And the ugly unwanted effects of that can be erased, never.


> If we take the way of changing the standard, I dread that users will not be able to use the feature for eons. If the solution depends on the changed standard, that means that all libraries, including third-party ones, will have to be updated to contain the debugging information in the new format. But most of the producers do not have any stimulus to do that, at least until all tools their clients use will support the new format. I guess that may take forever.

It's not great that anyone has to wait, but that's life on
the leading edge. Standards exist to build a permanent
foundation. Once committed, they're rarely changeable.
Needing a quick fix from a standard is itself a problem.

With SHT_DWARF64 you're asking the ELF standard to quickly adopt
something dubious, of temporary value, which we'll be stuck with
forever, to work around the slowness of changing the DWARF standard.
It doesn't feel like the best move to me. If it were the only
possible answer, I guess I would go along with it, and did so
earlier, before more information emerged. But it's not the
only answer --- we've already seen one other suggestion that
can be implemented today without any standards change.

If there has to be a quick/dirty answer today, then creating a
.debug64 section of type PROGBITS is far more palatable than
the other suggestions, as it's the one that can most completely
disappear once this moment in time has passed. If you can't
wait, or want to use the split debug workaround, then this
is your quickest/easiest way forward, while the standards
process takes the time it needs.

- Ali

Fangrui Song

Nov 24, 2020, 2:51:48 PM11/24/20
to gener...@googlegroups.com
On 2020-11-24, Ali Bahrami wrote:
>On 11/24/20 8:44 AM, Igor Kudrin wrote:
>>In my understanding, the DWARF standard only defines how the debugging information is represented in the final binary where it can be read by a debugger. It does not instruct tools like compilers and linkers on how to produce intermediate files in detail. From this standpoint, the standard does not have issues that need to be fixed and the problem of intermixing DWARF32/DWARF64 data can be resolved by adjusting only linkers and/or compilers.
>
> The decisions made in the dwarf standard about this clearly
>have ramifications for compilers and linkers. If the DWARF
>content is a problem for compilers and linkers, then it's a
>problem for DWARF too. "We did this --- you fix it", just
>isn't a good model.
>
>
>>If we take the way of updating tools, the solution can be delivered to the users almost instantly.
>
>And the ugly unwanted effects of that can be erased, never.
>
>
>>If we take the way of changing the standard, I dread that users will not be able to use the feature for eons. If the solution depends on the changed standard, that means that all libraries, including third-party ones, will have to be updated to contain the debugging information in the new format. But most of the producers do not have any stimulus to do that, at least until all tools their clients use will support the new format. I guess that may take forever.
>
>It's not great that anyone has to wait, but that's life on
>the leading edge. Standards exist to build a permanent
>foundation. Once committed, they're rarely changeable.
>Needing a quick fix from a standard is itself a problem.
>
>With SHT_DWARF64 you're asking the ELF standard to quickly adopt
>something dubious, of temporary value, which we'll be stuck with
>forever, to work around the slowness of changing the DWARF standard.
>It doesn't feel like the best move to me. If it were the only
>possible answer, I guess I would go along with it, and did so
>earlier, before more information emerged. But it's not the
>only answer --- we've already seen one other suggestion that
>can be implemented today without any standards change.

Did this paragraph consider my previous reply that SHT_DWARF64
complements the DWARF improvements?

>> Personally I think a SHT_DWARF64 still makes sense, no matter the
>> decision on the section name.
>>
>>   if (type == SHT_DWARF64)
>>     handle DWARF64
>>   else if (name.startswith(".debug_"))
>>     handle mixed DWARF32 and DWARF64
>>
>> Note that a section type can retrofit the existing standard versions (v3, v4
>> and v5). As of tooling support, tools generally don't require the section type
>> of .debug_*: no issue for lldb/gdb/llvm-{objcopy,strip}/(likely) binutils
>> strip/objcopy.

We should do:

  if (type == SHT_DWARF64 || name.startswith(".debug_"))
    handle DWARF

instead of

  if (name.startswith(".debug_") || name.startswith(".debug64_"))
    handle DWARF
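
As a rough illustration of the two checks being compared, here is a small runnable Python sketch. It is not code from any linker: the `Section` class and function names are invented for the example, and the `SHT_DWARF64` number is a placeholder, since no value has actually been reserved.

```python
from dataclasses import dataclass

SHT_PROGBITS = 1           # real ELF section type value
SHT_DWARF64 = 0x6FFF0D64   # placeholder only: no such value has been reserved

@dataclass
class Section:
    name: str
    type: int

def is_dwarf_by_type(sec):
    """Type-based proposal: DWARF64 is marked by section type,
    everything else is still recognised by the .debug_ prefix."""
    return sec.type == SHT_DWARF64 or sec.name.startswith(".debug_")

def is_dwarf_by_name(sec):
    """Name-based alternative: DWARF64 sections get a .debug64_ prefix
    and stay SHT_PROGBITS."""
    return sec.name.startswith(".debug_") or sec.name.startswith(".debug64_")

def is_known_dwarf64(sec):
    """True when the section is marked DWARF64 under either scheme."""
    return sec.type == SHT_DWARF64 or sec.name.startswith(".debug64_")
```

With the type-based scheme the section keeps its `.debug_info` name and only the type changes. With the name-based scheme a consumer that matches on `.debug_` alone would not see a `.debug64_info` section until it learns the second prefix.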

Ali Bahrami

Nov 24, 2020, 4:41:32 PM11/24/20
to gener...@googlegroups.com
On 11/24/20 12:51 PM, 'Fangrui Song' via Generic System V Application Binary Interface wrote:
> Did this paragraph consider my previous reply that SHT_DWARF64
> complements the DWARF improvements?
>
>>> Personally I think a SHT_DWARF64 still makes sense, no matter the
>>> decision on the section name.
>>>
>>>   if (type == SHT_DWARF64)
>>>     handle DWARF64
>>>   else if (name.startswith(".debug_"))
>>>     handle mixed DWARF32 and DWARF64
>>>
>>> Note that a section type can retrofit the existing standard versions (v3, v4
>>> and v5).  As of tooling support, tools generally don't require the section type
>>> of .debug_*: no issue for lldb/gdb/llvm-{objcopy,strip}/(likely) binutils
>>> strip/objcopy.
>
> We should do:
>
>   if (type == SHT_DWARF64 || name.startswith(".debug_"))
>     handle DWARF
>
> instead of
>
>   if (name.startswith(".debug_") || name.startswith(".debug64_"))
>     handle DWARF


Hi,

Yes, I did see this, but still think the progbits
answer is better, and since others had also voiced that
opinion, I didn't think that you needed to hear more from
me about it. Sorry if it wasn't clear.

Of course, all things being equal, you're right, it is
better to compare types. Here though, it seems that the
need for SHT_DWARF64 goes away once a new improved dwarf
comes along. If that's true, then it's better overall to
avoid creating a new section type, and to simply use PROGBITS
with a new name. This is because the name can go away completely,
while a section type can at best only be deprecated, and
so, is destined to become permanent baggage. It's more
important to do the right thing for the long haul.

Less important to me in this particular case, but still
worth noting, is that in the short haul, this string compare
is just one of many, because all the debug sections are
matched by name, and that's mostly never going to change
(e.g. name.startswith(".debug_")). Hence, it's not much
of a price to pay now, and since it might even go away
eventually, it seems like something we can live with.
While I'd like to see GNU define more section types
instead of using names, this isn't the case I'd choose
to make that argument for.

- Ali

Michael Eager

Nov 27, 2020, 1:22:14 PM11/27/20
to gener...@googlegroups.com
On 11/24/20 11:51 AM, 'Fangrui Song' via Generic System V Application
Binary Interface wrote:
> We should do:
>
>   if (type == SHT_DWARF64 || name.startswith(".debug_"))
>     handle DWARF
>
> instead of
>
>   if (name.startswith(".debug_") || name.startswith(".debug64_"))
>     handle DWARF


DWARF uses section names to identify content. Since it is object file
format independent, and not all object files have section types, it
cannot use a section type.

If the difference is making a change to the ELF standard and the
resulting compatibility issues, vs. using a different section name, with
no change to the ELF standard, then it seems pretty clear to me that the
latter is preferred.


--
Michael Eager

Michael Eager

Nov 27, 2020, 1:28:32 PM11/27/20
to gener...@googlegroups.com
On 11/24/20 7:44 AM, Igor Kudrin wrote:
> In our case, developers create large applications, which must be
> represented as a single binary because of the restrictions of the
> platform. Split DWARF might be an option, but there are cases when it is
> not very convenient. Note also that the DWARF Package files have a
> design flaw that prevents them to contain more than 4Gigs of data in a
> single section, which makes them useless at the same time you hit the
> DWARF32 limits.

Whatever platform restrictions you have on a binary, it seems that this
is (or should be) independent of DWARF. How is split DWARF data, not in
the binary, inconvenient?

Have you hit the 4Gb limit in DWARF data?

> In my understanding, the DWARF standard only defines how the debugging
> information is represented in the final binary where it can be read by a
> debugger. It does not instruct tools like compilers and linkers on how
> to produce intermediate files in detail. From this standpoint, the
> standard does not have issues that need to be fixed and the problem of
> intermixing DWARF32/DWARF64 data can be resolved by adjusting only
> linkers and/or compilers.

While this may be true, one objective of the DWARF Standard is to reduce
the changes required from linkers and compilers, not to make DWARF
issues into compiler/linker issues.


--
Michael Eager

Alex

Dec 1, 2020, 6:35:57 PM12/1/20
to Generic System V Application Binary Interface
I went and found one of the builds that is failing due to relocation overflow. Stepping through LLD, the relocation that overflows is in .debug_info and points to the .debug_loc output section. Looking at the DWARFv4 spec, it looks like a DW_AT_location attribute, which is of class loclistptr, form DW_FORM_sec_offset. So, at least in this build, it is .debug_loc that grows too large.
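
To make the failure mode concrete, here is a simplified Python sketch of how input order decides whether a DWARF32 reference into a large output section overflows, and why placing DWARF32 contributions first helps. It is a toy model, not LLD's logic: the `Contribution` class, the file names, and the sizes are invented, and a DWARF32 contribution is treated as safe only if it lies entirely below 4 GiB.

```python
from dataclasses import dataclass

LIMIT_32 = 1 << 32   # first offset a 32-bit absolute relocation cannot hold
GiB = 1 << 30

@dataclass
class Contribution:
    owner: str       # input file the piece of .debug_loc came from
    size: int        # bytes contributed to the output section
    dwarf64: bool    # True if the referencing unit uses 64-bit offsets

def layout(contribs):
    """Place contributions back to back; return (contribution, offset) pairs."""
    offset = 0
    placed = []
    for c in contribs:
        placed.append((c, offset))
        offset += c.size
    return placed

def overflowing(placed):
    """Owners of DWARF32 contributions that extend past the 32-bit range."""
    return [c.owner for c, off in placed
            if not c.dwarf64 and off + c.size > LIMIT_32]

def dwarf32_first(contribs):
    """Stable sort: DWARF32 (False) before DWARF64 (True), order otherwise kept."""
    return sorted(contribs, key=lambda c: c.dwarf64)

inputs = [
    Contribution("user_a.o", 3 * GiB, dwarf64=True),
    Contribution("user_b.o", 2 * GiB, dwarf64=True),
    Contribution("libc.a(member.o)", 64 << 20, dwarf64=False),
]
```

In this model `overflowing(layout(inputs))` reports the library member, because it lands after 5 GiB of DWARF64 data, while `overflowing(layout(dwarf32_first(inputs)))` is empty.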

Edd Dawson

Dec 3, 2020, 9:15:33 AM12/3/20
to gener...@googlegroups.com, Alex
Hi Alex,

What's the producer? Colleagues of mine have been working on reducing
the size of .debug_loc, as produced by LLVM. IIUC, their work isn't in
the LLVM 11 release but you can see the results on master:

http://lnt.llvm.org/db_default/v4/nts/graph?plot.0=1366.1607468.4&highlight_run=137693

(Approx 40% reduction when building clang 3.4 as the benchmark).

Any good to you?

Thanks,
Edd

--
Edd Dawson
SN Systems - Sony Interactive Entertainment
http://www.snsystems.com

Igor Kudrin

Dec 7, 2020, 8:40:23 AM12/7/20
to gener...@googlegroups.com
On 28.11.2020 1:28, Michael Eager wrote:
> Whatever platform restrictions you have on a binary, it seems that this
> is (or should be) independent of DWARF. How is split DWARF data, not in
> the binary, inconvenient?
>
> Have you hit the 4Gb limit in DWARF data?

Yes, we have some cases of that. Can't explain much of the details, but that is why we are working on supporting DWARF64 in LLVM. There might be different ways to handle really big debugging data, and DWARF64 is one of the possible options, which we would like to provide to our customers.

>> In my understanding, the DWARF standard only defines how the debugging
>> information is represented in the final binary where it can be read by a
>> debugger. It does not instruct tools like compilers and linkers on how
>> to produce intermediate files in detail. From this standpoint, the
>> standard does not have issues that need to be fixed and the problem of
>> intermixing DWARF32/DWARF64 data can be resolved by adjusting only
>> linkers and/or compilers.
>
> While this may be true, one objective of the DWARF Standard is to reduce
> the changes required from linkers and compilers, not to make DWARF
> issues into compiler/linker issues.

Maybe I'm missing something, but I can't remember seeing any actual proposal for what exactly is going to be fixed in the DWARF standard, so it is hard to assess whether fixing the standard is really a plausible route. On the other hand, there are at least three proposals to fix the issue by changing linkers and/or compilers, and all of them are not only doable but also support the existing DWARF standards.

Alex

Jan 11, 2021, 9:04:27 PM1/11/21
to Generic System V Application Binary Interface
Hi Edd

Sorry I missed this. For production we are just moving to clang 9.0, although we do backport changes from trunk as needed.
Do you have links to the LLVM reviews for those changes?

Thank You
Alex

Edd Dawson

Jan 12, 2021, 12:50:28 PM1/12/21
to gener...@googlegroups.com