Thanks for sending this out.
My initial reaction is that this would be most useful for post-link tools. For human-readable output alone, I expect that we’d be comfortable with the existing map file output and a disassembly.
I have a small concern about upstream maintainability without the binary patching tools themselves. For example, it may be that all we have is llvm-readobj/llvm-objdump to textually dump the output. It is possible that we could make modifications, with corresponding changes to the text dumps, that break assumptions the binary patching tools are making. I think this is likely to be rare, but I couldn’t rule it out.
While I wouldn’t object, as I think the extra debug output is not likely to need a lot of maintenance, I think it would be good to get someone actively interested in binary patching or some other post-link tool to comment.
Peter
_______________________________________________
LLVM Developers mailing list
llvm...@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
Related to naming, is there a chance that other linkers might adopt this feature as well? If so, maybe we should avoid including "lld" in the name and use a more generic name like .debug_linker_got and .debug_linker_plt?
I am dubious whether people will find incremental linking useful :)
https://news.ycombinator.com/item?id=26233244 from Rui Ueyama
and
https://sourceware.org/pipermail/binutils/2021-September/117828.html
from Cary Coutant:
"Do you think you'd ever want incremental linking on powerpc? Frankly,
the effort for just the one target platform was pretty high, the
maintenance on it is burdensome, and I'm tempted to deprecate it and
rip it out at some point in the future."
The PLT header size and PLT entry size are hard coded depending on the
architecture and a few security related options like -z retpolineplt,
ibt, bti. Is a generic description scheme useful?
If the new format is meant to describe dynamic relocations in a compact way, I
am wondering whether it is over-engineered and whether it can achieve that design
goal.
A program doesn't typically have many GLOB_DAT, TLSDESC, and TLS GD/LD/IE relocations.
MIPS folks invented DT_MIPS_LOCAL_GOTNO and
DT_MIPS_SYMTABNO-DT_MIPS_GOTSYM, but the scheme rarely saves much space
and turns out to cause more problems with .gnu.hash
https://sourceware.org/pipermail/binutils/2019-December/109330.html
>The header is then followed by list of entry descriptions.
>- Each entry description is a single uleb and describes the PLT entry with
>the same index.
>- The value of the uleb gives the index of the associated GOT entry.
>- The value cannot exceed Elf_Off.
Is disassembling .plt more convenient? The linker uses a predictable way
to generate it so its content is not that hard to parse.
It can be quick because the shape of a PLT entry is known and many bytes
can be skipped.
With this in mind, the same information is easy to infer from
R_*_JUMP_SLOT relocations.
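For reference, the entry list quoted above could be decoded roughly as follows. This is only a sketch: the header layout is not modelled, the function names are made up for illustration, and the sample bytes are hypothetical rather than taken from any real output.

```python
def read_uleb128(data, pos=0):
    """Decode one unsigned LEB128 value; return (value, next_position)."""
    value = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        value |= (byte & 0x7F) << shift   # low 7 bits carry payload
        if byte & 0x80 == 0:              # high bit clear: last byte
            return value, pos
        shift += 7


def decode_plt_entries(payload):
    """Return a list where element i is the GOT index for PLT entry i.

    Assumption: `payload` starts right after the header, which this
    sketch does not parse.
    """
    got_index_for_plt = []
    pos = 0
    while pos < len(payload):
        got_index, pos = read_uleb128(payload, pos)
        got_index_for_plt.append(got_index)
    return got_index_for_plt


# Hypothetical payload: PLT[0] -> GOT[3], PLT[1] -> GOT[4], PLT[2] -> GOT[300]
payload = bytes([0x03, 0x04, 0xAC, 0x02])
print(decode_plt_entries(payload))
```

The decoding itself is small; the open question in this thread is whether the mapping it yields adds anything over what the relocations already say.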
>In addition to allowing hot-patching tools to work with the GOT and PLT the
>information in these sections is of use to any tool that needs to display
>information on the GOT and PLT sections. For example, debuggers and binary
>tools synthesize labels of the form <symbol>@plt to label the PLT sections.
>The information in these sections could be used to simplify such tasks.
How is this format more suitable than existing Elf64_Rel/Elf64_Rela for
hot-patching? The GOT and PLT information can be inferred from .rela.plt
and .rela.dyn easily. The scheme appears to be more complex than the
relocation format.
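To illustrate what inferring from .rela.plt means in practice, here is a minimal sketch that walks raw Elf64_Rela records and picks out the JUMP_SLOT ones. It assumes little-endian x86-64 and that the section bytes have already been extracted; the helper names and the sample records are invented for the example.

```python
import struct

# x86-64 relocation type numbers from the psABI.
R_X86_64_GLOB_DAT = 6
R_X86_64_JUMP_SLOT = 7

# Elf64_Rela: r_offset (u64), r_info (u64), r_addend (i64), little-endian.
RELA = struct.Struct("<QQq")


def parse_rela(section_bytes):
    """Yield (r_offset, symbol_index, reloc_type, r_addend) per record."""
    for r_offset, r_info, r_addend in RELA.iter_unpack(section_bytes):
        yield r_offset, r_info >> 32, r_info & 0xFFFFFFFF, r_addend


def plt_got_slots(rela_plt_bytes):
    """Return [(got_slot_address, symbol_index)] in .rela.plt order."""
    return [(off, sym)
            for off, sym, typ, _ in parse_rela(rela_plt_bytes)
            if typ == R_X86_64_JUMP_SLOT]


# Hypothetical .rela.plt contents: two JUMP_SLOT relocations.
sample = (RELA.pack(0x4018, (1 << 32) | R_X86_64_JUMP_SLOT, 0) +
          RELA.pack(0x4020, (2 << 32) | R_X86_64_JUMP_SLOT, 0))
for got_slot, sym_index in plt_got_slots(sample):
    print(hex(got_slot), sym_index)
```

Each r_offset is the address of a GOT slot and the symbol index points into .dynsym. In the common layout the n-th record is expected to line up with the n-th PLT entry after the header, though that is an assumption a tool should check per target.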
On 2021-09-21, bd1976 llvm via llvm-dev wrote:
>As mentioned Sony would like LLD to optionally emit sections that describe
>the GOT and PLT.
>
>The proposed binary format of these sections is as follows:
>
>.debug_lld_got
>==============
>
>The .debug_lld_got section contains a GOT description. The GOT description
>begins with a header composed of the following fields:
>
>length (uleb)
>- The length in bytes of the GOT description not including the length field
>itself.
>- This allows for padding to be added to the section, useful for purposes
>such as slop for incremental linking.
On Mon, Sep 20, 2021 at 6:29 PM Petr Hosek <pho...@google.com> wrote:
>Related to naming, is there a chance that other linkers might adopt this feature as well? If so, maybe we should avoid including "lld" in the name and use a more generic name like .debug_linker_got and .debug_linker_plt?
Yeah, mixed feelings: using lld/llvm/something ensures we don't collide with someone else's ideas, but may reduce the possibility of uptake elsewhere. I'd usually err on the side of a non-colliding name at first and generalize if there's interest, but it's possible the non-colliding name just encourages other people to go make their own thing before anyone has a chance to generalize it.
On Mon, Sep 20, 2021 at 6:22 PM David Blaikie via llvm-dev <llvm...@lists.llvm.org> wrote:
>(minor quibble: I'd probably avoid using the ".debug_*" namespace for things that seem pretty separate from/not a clear extension to DWARF - but maybe there's precedent for this? Not sure)
I am concerned that this would add significant complexity to LLD.
Except for canonical PLT entries (for a normal function, or for an
STT_GNU_IFUNC converted to STT_FUNC), PLT entries have insignificant
addresses and the linker can generate multiple instances.
For example, the PowerPC64 port PLT is coupled with range extension
thunks and there can be multiple instances.
Each architecture's PLT may have a different shape.
I am not sure how a generic format can describe a stub.
Some architectures can do micro-optimizations such as: if we know the hi
part of a pair of hi/lo values is zero, we may save one instruction.
Such a choice is easy to represent in code but difficult to describe
in a serialized format.
AArch64's BTI PLT is also interesting: some PLT entries may have a
leading `bti c` while some don't.
x86-64's IBT PLT is worse: there are two sections, .plt and .plt.sec.
How would the format describe that?
(Multiple folks were against .plt.sec; I subscribed to x86-64-abi after
this event so that I would not miss such over-engineered designs in the
future.)
Describing PLT/GOT feels to me like supporting a GNU ld --verbose style
linker script dump (https://bugs.llvm.org/show_bug.cgi?id=51309).
Yes, it can make some applications happy, but the implementation complexity
would be huge.
Perhaps what I really want to ask is whether we have run into an XY
problem (https://xyproblem.info/). What does the hot-patching feature
actually need? FWIW such a feature is also implemented in the Linux
kernel, called live-patching, which is related to dynamic ftrace.
So far we haven't heard that they need anything from the linker side.
Well, a GNU contributor added -z unique-symbol very quickly, but the
need appears to have disappeared since :)
https://bugs.llvm.org/show_bug.cgi?id=50745
I am convinced that this option is misdesigned :)
(https://maskray.me/blog/2020-11-15-explain-gnu-linker-options#z-unique-symbol)
>
>> If the new format is to describe dynamic relocations in a compact way, I
>> am wondering whether this has over-engineered and can achieve the design
>> goal.
>> A program doesn't typically have many GLOB_DAT, TLSDESC, and TLS GD/LD/IE
>> relocations.
>>
>
>The purpose is to describe the GOT/PLT in a consistent and simple manner
>for consuming tools. Over the years there have been a number of changes to
>how the GOT is optimised. GOT entries can be patched statically, patched
>with relocations that don't reference dynamic symbols, or patched with
>relocations that reference a dynamic symbol, etc. Using this section allows
>each GOT entry to be consistently described. If we can design a more
>compact format for the same information that would be great.
Does --emit-relocs help here?
Thanks for understanding :)
But sorry for some pushback below.
>I will reply to the (great) points you have raised tomorrow. The
>hot-patching feature is proprietary technology and I need to check how much
>I can disclose about it - sorry! I will also put up a prototype
>implementation so that the complexity of the implementation can be judged.
>I have not attempted to describe all GOT/PLTs only the ones that are
>structured "normally". x86-64's IBT PLT would need an extension to the
>binary format to describe. I am not convinced we need to describe every
>variation to add value. If the binary format can describe the commonly used
>GOT/PLT structures then I believe that is sufficient. We can design the
>binary format to be flexible so that it can be extended in the future if
>support for a GOT/PLT structure that cannot be described currently is
>required.
>
>Do you have an opinion on the other sections? In particular the linkmap
>section? That section is the most important information for our
>hot-patching implementation and it also has clear benefits over the current
>-Map file option.
For "A section which specifies the list of wrapped symbols.", I believe
the symbol table should be the source of truth for --wrap results. It is
simply easier to inspect the symbol table than to parse a serialized file
encoding some properties of the wrapped symbols. For --wrap=foo, we
have foo, __wrap_foo, and __real_foo; how does the serialized format describe
their properties better than the symbol table itself?
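As a reminder of how little there is to describe, the binding rule for --wrap can be modelled in a few lines. This is a toy model of the documented behaviour for undefined references, not LLD's implementation, and the function name is hypothetical.

```python
def resolve_wrapped(reference, wrapped):
    """Return the symbol an undefined reference binds to under --wrap.

    `wrapped` is the set of names passed via --wrap=<name>.
    """
    if reference in wrapped:
        # A reference to foo is redirected to __wrap_foo.
        return "__wrap_" + reference
    prefix = "__real_"
    if reference.startswith(prefix) and reference[len(prefix):] in wrapped:
        # A reference to __real_foo is redirected to the original foo.
        return reference[len(prefix):]
    return reference


wrapped = {"foo"}
for name in ("foo", "__real_foo", "__wrap_foo", "bar"):
    print(name, "->", resolve_wrapped(name, wrapped))
```

The result of applying this rule is what ends up in the symbol table, which is why inspecting the symbol table looks like the simpler route.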
For this one, I understand that the information can probably make the
proprietary technology easier to build, but it can be difficult to maintain,
and by upstreaming you may end up incurring more overhead if other
contributors need to alter the format for legitimate reasons.
For "A linkmap section that contains a subset of the information
contained in a linker -Map file.", that looks like duplicated
information. Can your users parse the -Map output and synthesize the
needed section?
If the feedback is that "parsing -Map output is fragile", then you'd
need justifying reasons for a new format.
For example, the --dependency-file feature, while potentially useful, is
not sufficiently orthogonal, so it got lots of pushback
(https://reviews.llvm.org/D82437). In the end it was accepted because
sufficient usefulness was demonstrated (it supports the venerable
make and ninja, which have very wide adoption) and, as a bonus, GNU
ld/gold added it as well. Feature parity with the GNU linkers is
certainly not a necessity, but their efforts increased accessibility and
may reach more users. By comparison, I am concerned that some of the
sections you mention may benefit only the few who are
using the proprietary technology.
Many sections really need to be discussed case by case. As
another example, I added --why-extract a few days ago. It is something
GNU ld's -Map already describes. LLD's existing features did not cover
it, and it is sufficiently useful that I think a dedicated option is
more appropriate.
GNU ld -Map has other output like "Merging program properties" which may
be less useful. But if the value of supporting them is sufficient, we can
add them as well.
On 2021-09-24, bd1976 llvm wrote:
>It is very important to resist features that add needless complexity :)