I am glad that several alternative proposals are being discussed here :)
My intention was to see what other linker experts think about
this topic :)
FWIW I have considered 4 proposals now:
1) No ELF side change. The linker magically partitions DWARF32/DWARF64 sections.
Actually I have a prototype for this idea:
https://reviews.llvm.org/D91404
In practice, the type of the first relocation of a .debug_* section is a good
indicator of whether it is a DWARF64 section. See the patch for details.
However, .debug_str is difficult to handle with this approach because .debug_str
itself does not have relocations. The closest heuristic I can think of is: "if a
.debug_info's first relocation is of a 64-bit absolute relocation type, mark
.debug_str in the same input file as 'DWARF64'".
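To make the heuristic concrete, here is a rough Python sketch. It is not the code in D91404. It assumes x86-64 relocation type numbers (R_X86_64_64 = 1, R_X86_64_32 = 10) and a made-up in-memory model where an object file is a dict from section name to its list of relocation types; the function names are hypothetical.

```python
# x86-64 psABI relocation type numbers.
R_X86_64_64 = 1   # 64-bit absolute
R_X86_64_32 = 10  # 32-bit absolute, zero-extended

ABS64_TYPES = {R_X86_64_64}


def is_dwarf64_by_first_reloc(reloc_types):
    """Guess DWARF64 from the first relocation of a .debug_* section."""
    return bool(reloc_types) and reloc_types[0] in ABS64_TYPES


def classify_debug_sections(obj):
    """obj maps section name -> list of relocation types (hypothetical model).

    Returns a dict: .debug_* section name -> "DWARF64" or "DWARF32".
    """
    info_is_64 = is_dwarf64_by_first_reloc(obj.get(".debug_info", []))
    result = {}
    for name, relocs in obj.items():
        if not name.startswith(".debug_"):
            continue
        if name == ".debug_str":
            # .debug_str has no relocations of its own, so borrow the
            # verdict from .debug_info in the same input file.
            is64 = info_is_64
        else:
            is64 = is_dwarf64_by_first_reloc(relocs)
        result[name] = "DWARF64" if is64 else "DWARF32"
    return result
```

Note how the .debug_str branch has to look at a different section, which is exactly the cross-section dependency discussed next.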
Unfortunately, this makes the linker's behavior dependent on other sections, which
is why I feel lost: when we write .debug_str 0 : { *(.debug_str) }, we really
want the output section .debug_str to be produced from the information in
the input section descriptions alone, not from arbitrary information in other .debug_* sections.
(--sort-section/SHF_LINK_ORDER/LLD --shuffle-sections/gold --section-ordering-file/LLD --symbol-ordering-file/SORT:
we have many ways to change the order of '*', but this would be the first time we need information from other output sections.)
Another problem is relocatable links: if we order DWARF64 before DWARF32 in a
relocatable link, we may treat the combined section as "DWARF64" even though it
still has the DWARF32 relocation limitation (32-bit offsets).
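A tiny Python illustration of the relocatable-link problem, under the same assumptions as before (x86-64 relocation type numbers; relocation lists as plain Python lists; hypothetical helper names):

```python
R_X86_64_64 = 1   # 64-bit absolute
R_X86_64_32 = 10  # 32-bit absolute


def looks_dwarf64(reloc_types):
    """First-relocation heuristic: only the first entry is inspected."""
    return bool(reloc_types) and reloc_types[0] == R_X86_64_64


def has_32bit_relocs(reloc_types):
    """True if any relocation still carries the 32-bit offset limitation."""
    return R_X86_64_32 in reloc_types


# ld -r output: a DWARF64 .debug_info placed before a DWARF32 one.
dwarf64_input = [R_X86_64_64, R_X86_64_64]
dwarf32_input = [R_X86_64_32, R_X86_64_32]
merged = dwarf64_input + dwarf32_input

# The heuristic says "DWARF64", yet the section still has 32-bit relocations.
misclassified = looks_dwarf64(merged) and has_32bit_relocs(merged)
```

Here `misclassified` is true: a later link would sort the combined section with the DWARF64 group although part of it can only address 32-bit offsets.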
2) A new section type SHT_DWARF64. The linker partitions sections with the section type.
.debug_info 0 : { *(TYPE (SHT_PROGBITS) .debug_info) *(TYPE (SHT_GNU_DWARF64) .debug_info) }
With the usual linker rule that mixing a non-SHT_PROGBITS section with an SHT_PROGBITS section
yields SHT_PROGBITS, the output of a relocatable link is correctly inferred as "DWARF32".
(Conceptually, the combined section should impose the more rigid restriction
when it is further combined with other sections.)
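Roughly, the type-based rule and the partitioning could look like this in Python. SHT_PROGBITS = 1 is the real ELF value; the number used for SHT_GNU_DWARF64 is only a placeholder because no value has been assigned, and the (name, type) tuples are a made-up model of input sections.

```python
SHT_PROGBITS = 1              # real ELF value
SHT_GNU_DWARF64 = 0x6FFF0064  # placeholder: no value has been assigned


def combined_type(input_types):
    """Section type of the output section in a relocatable link.

    The result is SHT_GNU_DWARF64 only if every input is; any SHT_PROGBITS
    (DWARF32) input forces the more restrictive SHT_PROGBITS.
    """
    if set(input_types) == {SHT_GNU_DWARF64}:
        return SHT_GNU_DWARF64
    return SHT_PROGBITS


def partition(sections):
    """Order (name, type) pairs: DWARF32 first, then DWARF64.

    sorted() is stable, so the relative order within each group is kept.
    """
    return sorted(sections, key=lambda s: s[1] == SHT_GNU_DWARF64)
```

For example, `partition([("a.o", SHT_GNU_DWARF64), ("b.o", SHT_PROGBITS)])` places b.o's section first, and `combined_type` of the two is SHT_PROGBITS, so a later link treats the mix as DWARF32.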
3) A new section flag. I agree with Ali on this point. This idea has a larger blast radius.
(We have 0x60000000 section types but only 8 remaining generic section flags.)
I don't find the existing SHF_*_LARGE and SHF_*_SMALL precedents convincing because they actually use different output section names.
(Please correct me if I am wrong; I also think they are legacy stuff.)
Should the output section have the flag if any of the input sections has the flag?
For many other flags this rule applies, but for SHF_*_LARGE the flag can cause a problem
with relocatable links (as I mentioned above).
4) A section prefix. To be fair, as a linker person I like it for two reasons:
* It is immediately obvious whether DWARF64 is used and whether
DWARF32 is used along with DWARF64.
* In a relocatable link mixing DWARF32 and DWARF64 sections, DWARF32
and DWARF64 sections will naturally not get mixed. We don't even need another linker feature
to match input sections by type.
On the other hand,
* It is non-conforming due to different section names.
James Henderson
(https://lists.llvm.org/pipermail/llvm-dev/2020-November/146721.html): "However,
conformance is still a concern to me as we cannot really retrofit the existing
standard versions, and the section names themselves are in the standard. That
means that tools that otherwise would work might stop working when presented
with a "new" DWARFv3/4/5 output that it in theory could otherwise handle."
* Tooling support. Some commonly used consumers recognize debug sections by name and would need changes:
+ gdb: gdb/dwarf2/read.c recognizes .debug_* by name and does not support multiple .debug_info sections
(confirmed with gdb maintainers)
+ objcopy --strip-debug: needs to learn the new .debug64 prefix.
James mentioned that "you could do .debug_64_info or .debug_info_64 probably safely"
+ gold --gdb-index
+ In LLVM, to give a feel for the scope, a number of places need to account for more sections:
integrated assembler / DWARFContext / MCObjectFileInfo / llvm-dwarfdump's -debug* options.
As a refinement (James' idea), we can let the linker combine the .debug64 or .debug_64 sections from object files into the standard .debug_* output sections.
Tools dealing with linked images will not need a change, but tools dealing with object files
(objcopy --strip-debug/gold --gdb-index/assembler/...) still suffer from
complexity due to the doubled number of sections.
On balance, the downsides mean this is not more appealing than SHT_DWARF64, which
can be retrofitted to existing standards.
Hey, I like this quote:)
>However, in
>this case, I think the change to the debuggers should really be simpler
>than what the linkers would have to do, because the debuggers can just
>read the 2 sections and treat them as one logical item --- no sorting
>required.
That said, I hope my comments above have explained why the section prefix proposal
is not more appealing.
>The merit of this is that once dwarf gets fixed, we can easily
>erase our tracks without leaving behind any permanent damage to ELF,
>and without having to add sorting code that we would then be stuck
>with after the need for it is largely gone.
>
>- Ali
"dwarf gets fixed" - I hope Pavel and James' comments have explained this is not
a bug in the DWARF standard :)
DW_FORM_strp/DW_FORM_strp_sup/DW_FORM_line_strp/DW_FORM_sec_offset and the GNU extensions
DW_FORM_GNU_ref_alt/DW_FORM_GNU_strp_alt use 32-bit offsets in the 32-bit DWARF
format. The offset encoding is the main point. I think this is intentional;
otherwise, what would be the advantage of having a separate 64-bit DWARF
format? Just its 64-bit length encoding? :)
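
For reference, the format of a unit is visible from its initial length field: in the 64-bit DWARF format the first 4 bytes are 0xffffffff, followed by an 8-byte length, and the offset-sized forms become 8 bytes wide. A small Python sketch (the function name and the returned tuple layout are my own, not from any tool):

```python
import struct

DWARF64_ESCAPE = 0xFFFFFFFF


def unit_format(data, offset=0, little_endian=True):
    """Return (format, offset_size, unit_length) for the unit at `offset`.

    offset_size is the width in bytes of forms such as DW_FORM_strp and
    DW_FORM_sec_offset inside that unit.
    """
    end = "<" if little_endian else ">"
    (first,) = struct.unpack_from(end + "I", data, offset)
    if first == DWARF64_ESCAPE:
        # 64-bit DWARF: escape value, then the real 8-byte length.
        (length,) = struct.unpack_from(end + "Q", data, offset + 4)
        return ("DWARF64", 8, length)
    if first >= 0xFFFFFFF0:
        # 0xfffffff0..0xfffffffe are reserved by the DWARF standard.
        raise ValueError("reserved initial length value")
    return ("DWARF32", 4, first)
```

So the length encoding only announces the format; what matters for the linker is the 4-byte versus 8-byte offsets that follow from it.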