RFC: Add ET_DEBUG

H.J. Lu

Jun 26, 2020, 12:16:58 PM
to Generic System V Application Binary Interface, GNU C Library, Binutils
We need a way to identify "debug" info files, which appear like they
are ELF files but if inspected are actually missing a lot of
information and can't be properly parsed without the original DSO or
executable.
We propose

#define ET_DEBUG 5 /* Debug information file */

Consumers should skip ET_DEBUG files if they don't know how to
handle them. Debuggers should process ET_DEBUG files to extract
debug info.
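
As an illustration of the skip rule, here is a minimal C sketch of a
consumer that reads e_type from the start of a file and declines to
process anything it does not recognize. ET_DEBUG uses the value 5 from
the proposal above. The helper names read_e_type and
consumer_should_skip are made up for this example, and accepting only
ET_EXEC and ET_DYN is an assumption about a loader-like consumer, not
something the proposal specifies.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* e_type values from the gABI, plus the value proposed above. */
#define ET_EXEC  2
#define ET_DYN   3
#define ET_DEBUG 5   /* proposal only; not in any published ABI */

/* Positions within the ELF header (identical for ELF32 and ELF64). */
#define EI_DATA      5    /* e_ident[] index of the byte-order flag */
#define ELFDATA2MSB  2    /* big-endian encoding */
#define E_TYPE_OFF   16   /* e_type follows the 16-byte e_ident[] */

/* Read e_type from the first bytes of a file. Returns false if the
   buffer is too short or lacks the ELF magic. */
static bool read_e_type(const unsigned char *hdr, size_t len, uint16_t *out)
{
    assert(hdr != NULL && out != NULL);
    if (len < (size_t)E_TYPE_OFF + 2 ||
        hdr[0] != 0x7f || hdr[1] != 'E' || hdr[2] != 'L' || hdr[3] != 'F')
        return false;
    if (hdr[EI_DATA] == ELFDATA2MSB)
        *out = (uint16_t)((hdr[E_TYPE_OFF] << 8) | hdr[E_TYPE_OFF + 1]);
    else
        *out = (uint16_t)(hdr[E_TYPE_OFF] | (hdr[E_TYPE_OFF + 1] << 8));
    return true;
}

/* Hypothetical consumer policy: only executables and shared objects
   are processed; ET_DEBUG and anything unrecognized are skipped. */
static bool consumer_should_skip(uint16_t e_type)
{
    return e_type != ET_EXEC && e_type != ET_DYN;
}
```

A debugger would apply the opposite policy and treat ET_DEBUG as the
signal to look for debug sections.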

--
H.J.

Ali Bahrami

Jun 26, 2020, 12:32:47 PM
to gener...@googlegroups.com
These won't be useful to many implementations. Is there
a reason not to define ET_GNU_DEBUG from the OSABI range,
for the GNU OSABI instead of seeking to make this part of
the gABI?

- Ali

H.J. Lu

Jun 26, 2020, 12:37:24 PM
to Generic System V Application Binary Interface
We were inquiring if there were common interests to identify
debug info files. We will move it to ET_GNU_DEBUG.

--
H.J.

Ali Bahrami

Jun 26, 2020, 12:50:50 PM
to gener...@googlegroups.com
Sure, that makes sense, sorry. I don't know much about them,
and the above doesn't give much detail, but it seems that they
would contain debug information related to a primary
object?

In that case, Solaris already has ET_SUNW_ANCILLARY
(http://www.linker-aliens.org/blogs/ali/entry/ancillary_objects_separate_debug_elf/),
so we wouldn't be looking for another mechanism to cover the
same space. As such, ET_GNU_DEBUG makes sense to me.

- Ali

Carlos O'Donell

Jun 26, 2020, 1:22:37 PM
to gener...@googlegroups.com, Ali Bahrami
On 6/26/20 12:50 PM, Ali Bahrami wrote:
> On 6/26/20 10:36 AM, H.J. Lu wrote:
>> On Fri, Jun 26, 2020 at 9:32 AM Ali Bahrami <Ali.B...@oracle.com> wrote:
>>>
>>> On 6/26/20 10:16 AM, H.J. Lu wrote:
>>>> We need a way to identify "debug" info files, which appear like they
>>>> are ELF files but if inspected are actually missing a lot of
>>>> information and can't be properly parsed without the original DSO or
>>>> executable.
>>>> We propose
>>>>
>>>> #define ET_DEBUG        5       /* Debug information file */
>>>>
>>>> Consumers should skip ET_DEBUG files if they don't know how to
>>>> handle them.  Debuggers should process ET_DEBUG files to extract
>>>> debug info.
>>>>
>>>
>>>
>>>      These won't be useful to many implementations. Is there
>>> a reason not to define ET_GNU_DEBUG from the OSABI range,
>>> for the GNU OSABI instead of seeking to make this part of
>>> the gABI?
>>>
>>
>> We were inquiring if there were common interests to identify
>> debug info files.  We will move it to ET_GNU_DEBUG.
>>
>
>    Sure, that makes sense, sorry. I don't know much about them,
> and the above doesn't give much detail, but it seems that they
> would contain debug information related to a primary
> object?

Correct.

The primary object is split into 2 ELF files.

1 ELF file with debug information stripped.
- A new .gnu_debuglink non-allocatable section with information
to find the second ELF file.

1 ELF file with *only* debug information present.
- Minimal ELF file.
- Looks like a "corrupted" ELF file from generic tooling perspective.
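
To make the link between the two files concrete, here is a small C
sketch that decodes the contents of a .gnu_debuglink section. The
layout assumed is the one given in the GDB manual's "Separate Debug
Files" chapter: a NUL-terminated file name, zero padding up to the next
4-byte boundary, then a 4-byte CRC of the debug file in the object's
byte order. The function name parse_debuglink and its parameters are
invented for this example, and the caller is assumed to have already
located the section bytes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Decode raw .gnu_debuglink contents. On success, *name points at the
   file name inside sec and *crc holds the stored checksum. */
static bool parse_debuglink(const unsigned char *sec, size_t size,
                            bool big_endian,
                            const char **name, uint32_t *crc)
{
    assert(sec != NULL && name != NULL && crc != NULL);

    const unsigned char *nul = memchr(sec, '\0', size);
    if (nul == NULL)
        return false;                 /* file name is not terminated */

    /* Round the offset just past the NUL up to a multiple of four. */
    size_t crc_off = ((size_t)(nul - sec) + 1 + 3) & ~(size_t)3;
    if (crc_off + 4 > size)
        return false;                 /* checksum would be truncated */

    const unsigned char *p = sec + crc_off;
    *crc = big_endian
        ? ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
          ((uint32_t)p[2] << 8)  |  (uint32_t)p[3]
        : ((uint32_t)p[3] << 24) | ((uint32_t)p[2] << 16) |
          ((uint32_t)p[1] << 8)  |  (uint32_t)p[0];
    *name = (const char *)sec;
    return true;
}
```

Note that the section is found by name rather than by a dedicated
section type, so a consumer has to walk the section header string table
before it can call something like this.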

> In that case, Solaris already has ET_SUNW_ANCILLARY
> (http://www.linker-aliens.org/blogs/ali/entry/ancillary_objects_separate_debug_elf/),
> so we wouldn't be looking for another mechanism to cover the
> same space. As such, ET_GNU_DEBUG makes sense to me.

Yes, and SHT_SUNW_ANCILLARY is equivalent to .gnu_debuglink (SHT_PROGBITS).

The Solaris design of SHF_SUNW_ABSENT is very cool.

I also like the design of SHF_SUNW_PRIMARY.

I am always impressed with your work.

Good job on that design.

I think following in the footsteps of this design
would be a good idea.

Perhaps we should propose an ET_GNU_ANCILLARY and try to
model our choices after those in Solaris?

--
Cheers,
Carlos.

Fangrui Song

Jun 26, 2020, 1:24:27 PM
to gener...@googlegroups.com
Sigh, this makes a lot of sense to me as a generic value. I believe the
concept is documented here:

https://sourceware.org/gdb/onlinedocs/gdb/Separate-Debug-Files.html

There is more information on

* https://sourceware.org/binutils/docs/binutils/objcopy.html --only-keep-debug
* llvm-objcopy implementation https://reviews.llvm.org/D67137

Ali Bahrami

Jun 26, 2020, 2:21:46 PM
to gener...@googlegroups.com
On 6/26/20 11:22 AM, Carlos O'Donell wrote:
> I think following in the footsteps of this design
> would be a good idea.
>
> Perhaps we should propose an ET_GNU_ANCILLARY and try to
> model our choices after those in Solaris?

I realize that there's an existing investment in the
ET_GNU_DEBUG model, but that would be pretty cool. For
what it's worth, I would be happy to cooperate with such an
effort. If it hewed closely enough to ET_SUNW_ANCILLARY,
we could even consider making it a generic feature.

It's more work for Cary, but given the existing documentation
from that blog entry, as well as the final version found in
the published Solaris Linker and Libraries manual, it would
be straightforward to slot that material into the gABI docs,
and I can also help with that.

Of course, I fully understand if the choice goes the other way.

- Ali

Florian Weimer

Jun 26, 2020, 5:35:29 PM
to H.J. Lu via Libc-alpha, Generic System V Application Binary Interface, Binutils, H.J. Lu
* H. J. Lu via Libc-alpha:
I would like to see a change like this.

For background: These separate debuginfo files contain the same program
headers as the original object file, but all the loadable segments are
missing from the file (including the dynamic segment). Tools compare
these program headers for consistency, so we cannot change them in the
separate debuginfo.

In the dynamic loader, we only have ready access to the loadable
segments. This means the ELF header is the only area of overlap, and
the information to tell the two apart has to be located there.

Thanks,
Florian

Michael Matz

Jun 29, 2020, 9:25:48 AM
to 'Fangrui Song' via Generic System V Application Binary Interface
Hello,

On Fri, 26 Jun 2020, 'Fangrui Song' via Generic System V Application
Binary Interface wrote:

> Sigh, this makes a lot of sense to me as a generic value.

Not without a sensible design. The Sun^WOracle design is sensible, as
usual. The GNU separate debug files were just the easiest hack that
people came up with at the time: there's no reason for them to contain
section or program headers for non-existing things (except that this made
the writer for those files easier to create), the references from primary
to debug object are via section names (not section types or dynamic
entries), and there are perhaps more un-ELF things.

So, at the current stage I'd like to not see a generic ET_DEBUG. An
ET_GNU_DEBUG would be fine of course.


Ciao,
Michael.

Mark Wielaard

Jul 15, 2020, 9:56:21 AM
to gener...@googlegroups.com
Hi,

It would be good to have this properly documented indeed.
If only so that the different implementations that implement producing
and consuming the auxiliary (ET_[GNU]_DEBUG) files do it in a similar
way (something I don't think they currently do, if I read the
llvm-objcopy patch correctly).

On Fri, 2020-06-26 at 10:24 -0700, 'Fangrui Song' via Generic System V
Application Binary Interface wrote:
> Sigh, this makes a lot of sense to me as a generic value. I believe the
> concept is documented here:
>
> https://sourceware.org/gdb/onlinedocs/gdb/Separate-Debug-Files.html
>
> There is more information on
>
> * https://sourceware.org/binutils/docs/binutils/objcopy.html --only-keep-debug
> * llvm-objcopy implementation https://reviews.llvm.org/D67137

RPM-based distros often use elfutils eu-strip through the
find-debuginfo.sh script, which does everything in one go:

    eu-strip -f elf.debug elf.file

or, keeping the original untouched:

    eu-strip -f elf.debug -o elf.stripped elf.orig

This strips the sections (and symbols), puts them into a separate
debug file that reuses the original file's ELF and program headers, and
adds a .gnu_debuglink section to the stripped file.

Note that it looks like the llvm-objcopy implementation doesn't
preserve the main file program headers like the binutils objcopy and
elfutils eu-strip implementations do. Having the exact program headers
from the original file is a feature of the format since it allows a
program to get the original memory mappings, correct any addresses that
might have changed when the main ELF file is (later) prelinked and
facilitates recombining the main and debug file (for example by using
eu-unstrip).
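
One way to picture the address correction mentioned above is as a
single bias computed from the two sets of program headers. The C sketch
below is not taken from elfutils or any other tool. It assumes that the
first PT_LOAD entry of the debug file corresponds to the first PT_LOAD
entry of the main file, and that prelinking shifted every segment by the
same amount. The struct and function names are invented for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PT_LOAD 1

/* Minimal stand-in for the program-header fields used here. */
struct seg { uint32_t p_type; uint64_t p_vaddr; uint64_t p_memsz; };

/* Virtual address of the first PT_LOAD segment, if there is one. */
static bool first_load_vaddr(const struct seg *ph, size_t n, uint64_t *out)
{
    assert(ph != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (ph[i].p_type == PT_LOAD) {
            *out = ph[i].p_vaddr;
            return true;
        }
    }
    return false;
}

/* Bias to add to an address from the debug file so that it matches the
   (possibly prelinked) main file. Unsigned wraparound means a downward
   shift works as well. */
static bool debug_to_main_bias(const struct seg *main_ph, size_t main_n,
                               const struct seg *dbg_ph, size_t dbg_n,
                               uint64_t *bias)
{
    uint64_t main_v, dbg_v;
    if (!first_load_vaddr(main_ph, main_n, &main_v) ||
        !first_load_vaddr(dbg_ph, dbg_n, &dbg_v))
        return false;
    *bias = main_v - dbg_v;
    return true;
}
```

Without the original program headers in the debug file there would be
no pre-prelink p_vaddr to compare against, which is the point being made
about preserving them.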

Cheers,

Mark

Mark Wielaard

Jul 15, 2020, 10:02:20 AM
to gener...@googlegroups.com, H.J. Lu via Libc-alpha, Binutils, H.J. Lu
Note that this is only true for separate debug files for ET_EXEC and
ET_DYN ELF files. Some debug files, like stripped ET_REL ELF files,
split-dwarf .dwo files, supplementary debug files (dwz multi files),
don't contain any program headers. Should those also be marked ET_DEBUG
or not? I think they should. And then the above description should
simply say "may contain the same program headers as the associated
(original) object file...".

Cheers,

Mark

Fangrui Song

Jul 15, 2020, 12:54:54 PM
to gener...@googlegroups.com
llvm-objcopy preserves the program headers and rewrites p_offset fields.
(layoutSegmentsForOnlyKeepDebug on https://reviews.llvm.org/D67137 )

When implementing the feature, I had another attempt which did not
retain program headers: https://reviews.llvm.org/D67090 . I abandoned
that because I learned that users have linux-perf symbolization needs without
access to the stripped binary. Honestly I still don't quite get how this
works.

>Having the exact program headers
>from the original file is a feature of the format since it allows a
>program to get the original memory mappings, correct any addresses that
>might have changed when the main ELF file is (later) prelinked and
>facilitates recombining the main and debug file (for example by using
>eu-unstrip).

Interesting, I am eager to know your thoughts on this stuff!
p_offset/p_filesz fields are rewritten. Other fields (including
p_vaddr/p_memsz) are retained. How is address correction required and
how does it work?

* If symbolization is done with p_vaddr/p_memsz => no address correction
is needed.
* If file offsets in both the stripped binary and the debug file are
needed, how does address correction help? File offsets in the two
files are unrelated.
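
The first case can be sketched in C as a segment lookup that consults
only p_vaddr and p_memsz. Because those fields are retained, the same
lookup gives the same answer on the stripped binary and on the debug
file, whatever p_offset and p_filesz were rewritten to. The struct and
function names are invented for this example, and it is not a
description of how any particular symbolizer works.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PT_LOAD 1

/* Stand-in for a program header; only some fields are consulted. */
struct phdr_view {
    uint32_t p_type;
    uint64_t p_offset, p_vaddr, p_filesz, p_memsz;
};

/* Return the index of the PT_LOAD segment whose memory image
   [p_vaddr, p_vaddr + p_memsz) contains addr, or -1 if none does.
   p_offset and p_filesz are deliberately never read. */
static int find_load_segment(const struct phdr_view *ph, size_t n,
                             uint64_t addr)
{
    assert(ph != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (ph[i].p_type == PT_LOAD &&
            addr >= ph[i].p_vaddr &&
            addr - ph[i].p_vaddr < ph[i].p_memsz)
            return (int)i;
    }
    return -1;
}
```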

Regarding eu-unstrip (I don't know its internals):
if the offsets of the sections in the debug file are insignificant (they
are SHT_NOTE or non-SHF_ALLOC sections), moving them into the stripped
binary in an arbitrary order will work, without program headers, right?