This is a proposal for defining a new section type "SHT_RELR" in the gABI. This
new section will be used to compactly encode R_*_RELATIVE relocations in shared
object files and position independent executables (PIE).
## Background
In position independent executables (PIE), R_*_RELATIVE relocations can account
for more than 99% of the entries in SHT_REL/SHT_RELA sections. This results in
more than 5% size bloat compared to non-PIE binaries. See this thread for more
details:
https://sourceware.org/ml/gnu-gabi/2017-q2/msg00000.html
Recent releases of Debian, Ubuntu, and several other distributions build
executables as PIE by default. Suprateeka Hegde posted some statistics in the
above thread on the prevalence of relative relocations in executables residing
in /usr/bin:
https://sourceware.org/ml/gnu-gabi/2017-q2/msg00013.html
This proposal is based on the 'experimental-relr' prototype from Cary Coutant
that is available at 'users/ccoutant/experimental-relr' branch in the binutils
repository, and was described in this post in the same thread on gnu-gabi:
https://sourceware.org/ml/gnu-gabi/2017-q2/msg00003.html
## Proposal
We propose defining a new section type and dynamic array tags in the
generic-abi for compactly encoding the R_*_RELATIVE relocations in a
special '.relr.dyn' section.
Defining a new section type:
Chapter 4:
http://www.sco.com/developers/gabi/latest/ch4.sheader.html
Figure 4-9: Section Types,sh_type
  Name        Value
  -----------------
  SHT_RELR    19
Description:
SHT_RELR: The section holds relative relocation entries without explicit
addends or info, such as type Elf32_Relr for the 32-bit class of
object files or type Elf64_Relr for the 64-bit class of object
files. An object file may have multiple relocation sections. See
``Relocation'' below for details.
Chapter 4:
http://www.sco.com/developers/gabi/latest/ch4.sheader.html
Figure 4-14: sh_link and sh_info Interpretation
sh_type: SHT_RELR (same as SHT_REL and SHT_RELA)
sh_link: The section header index of the associated symbol table.
sh_info: The section header index of the section to which the relocation
applies.
Chapter 4:
http://www.sco.com/developers/gabi/latest/ch4.sheader.html
Figure 4-16: Special Sections
Name: .relr<name>
Type: SHT_RELR
Attributes: (same as .rel<name> and .rela<name>)
These sections hold relocation information, as described in
``Relocation''. If the file has a loadable segment that includes
relocation, the sections' attributes will include the SHF_ALLOC bit;
otherwise, that bit will be off. Conventionally, name is supplied by the
section to which the relocations apply. Thus a relocation section for
.text normally would have the name .rel.text, .rela.text, or .relr.text.
Defining new dynamic array tags:
Chapter 5:
http://www.sco.com/developers/gabi/latest/ch5.dynamic.html
Figure 5-10: Dynamic Array Tags, d_tag
  Name         Value   d_un    Executable   Shared Object
  -------------------------------------------------------
  DT_RELRSZ    35      d_val   optional     optional
  DT_RELR      36      d_ptr   optional     optional
  DT_RELRENT   37      d_val   optional     optional
Note: The use of d_un conforms to the odd-even rule in that section: tags with
even values use d_ptr, and tags with odd values use d_val.
Description:
DT_RELR: This element is similar to DT_RELA, except its table has implicit
addends and info, such as Elf32_Relr for the 32-bit file class or
Elf64_Relr for the 64-bit file class. If this element is present,
the dynamic structure must also have DT_RELRSZ and DT_RELRENT
elements.
DT_RELRSZ: This element holds the total size, in bytes, of the DT_RELR
relocation table.
DT_RELRENT: This element holds the size, in bytes, of the DT_RELR relocation
entry.
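
As an illustration of how a consumer might pick up these tags, here is a
minimal sketch in C. The Dyn64 and RelrInfo types and the find_relr() helper
are hypothetical names introduced only for this example (Dyn64 stands in for
Elf64_Dyn so the sketch is self-contained). Treating DT_RELRSZ or DT_RELRENT
without DT_RELR as malformed is an assumption of the sketch, not something the
proposal states.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Tag values as proposed above; DT_NULL terminates the dynamic array. */
#define DT_NULL      0
#define DT_RELRSZ   35
#define DT_RELR     36
#define DT_RELRENT  37

/* Hypothetical stand-in for Elf64_Dyn. */
typedef struct {
    int64_t d_tag;
    union { uint64_t d_val; uint64_t d_ptr; } d_un;
} Dyn64;

/* Hypothetical result type describing the DT_RELR table. */
typedef struct {
    uint64_t addr;     /* DT_RELR (d_ptr) */
    uint64_t size;     /* DT_RELRSZ (d_val), in bytes */
    uint64_t entsize;  /* DT_RELRENT (d_val), in bytes */
} RelrInfo;

/* Returns 1 if all three tags are present and consistent, 0 if none is
   present, and -1 if the set is incomplete or inconsistent. */
static int find_relr(const Dyn64 *dyn, RelrInfo *out)
{
    int seen = 0;

    assert(dyn != NULL && out != NULL);
    for (; dyn->d_tag != DT_NULL; ++dyn) {
        switch (dyn->d_tag) {
        case DT_RELR:    out->addr = dyn->d_un.d_ptr;    seen |= 1; break;
        case DT_RELRSZ:  out->size = dyn->d_un.d_val;    seen |= 2; break;
        case DT_RELRENT: out->entsize = dyn->d_un.d_val; seen |= 4; break;
        default: break;  /* unrelated tags are skipped */
        }
    }
    if (seen == 0)
        return 0;
    if (seen != 7)
        return -1;
    if (out->entsize == 0 || out->size % out->entsize != 0)
        return -1;
    return 1;
}
```

The number of entries in the table is then size / entsize.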
This table will hold entries of type Elf32_Relr for the 32-bit class of object
files or type Elf64_Relr for the 64-bit class of object files:
Chapter 4:
http://www.sco.com/developers/gabi/latest/ch4.reloc.html
Figure 4-23: Relocation Entries
typedef Elf32_Word Elf32_Relr;
typedef Elf64_Xword Elf64_Relr;
Addends are stored implicitly in the location to be modified, similar to
Elf32_Rel and Elf64_Rel. Relocation type is implicit (R_*_RELATIVE). The
entries effectively store a sorted list of offsets only.
The encoding used in these entries is a simple combination of delta encoding
and a bitmap of offsets. For 64-bit entries, the high 8 bits contain the delta
since the last offset, counted in words, and the low 56 bits contain a bitmap
that selects which words, starting at the new offset, to relocate. This is best
described by showing the code for decoding the section entries:
    #define ELF64_R_JUMP(val)    ((val) >> 56)
    #define ELF64_R_BITS(val)    ((val) & 0xffffffffffffff)

    #ifdef DO_RELR
      {
        ElfW(Addr) offset = 0;
        for (; relative < end; ++relative)
          {
            /* High 8 bits: words to skip; low 56 bits: relocation bitmap.  */
            ElfW(Addr) jump = ELFW(R_JUMP) (*relative);
            ElfW(Addr) bits = ELFW(R_BITS) (*relative);
            offset += jump * sizeof(ElfW(Addr));
            if (jump == 0)
              {
                /* Escape: the next entry holds the full offset.  */
                ++relative;
                offset = *relative;
              }
            /* Bit i set means: relocate the word at offset + i words.  */
            ElfW(Addr) r_offset = offset;
            for (; bits != 0; bits >>= 1)
              {
                if ((bits&1) != 0)
                  elf_machine_relr_relative (l_addr, (void *) (l_addr+r_offset));
                r_offset += sizeof(ElfW(Addr));
              }
          }
      }
    #endif
Note that the 8-bit 'jump' encodes the number of ElfW(Addr) sized words since
the last offset. The case where jump would not fit in 8 bits is handled by
setting jump to 0 and emitting the full offset for the next relocation in the
subsequent entry.
This encoding can represent up to 56 relative relocations in a single 64-bit
entry. The above code is the entirety of the implementation for decoding and
processing SHT_RELR sections in the glibc dynamic loader.
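
For illustration, the producer side of the same encoding could look roughly
like the sketch below. This is not taken from the prototype linker: the
relr_encode() and relr_decode() helpers are hypothetical names, the encoder
assumes its input is a strictly increasing list of 8-byte-aligned offsets, and
the decoder collects offsets into an array instead of applying relocations so
that the round trip can be checked.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define WORD      8u                        /* sizeof(Elf64_Addr) */
#define BITS      56u                       /* bitmap width in a 64-bit entry */
#define R_JUMP(v) ((v) >> 56)
#define R_BITS(v) ((v) & 0xffffffffffffffULL)

/* Encode n sorted, word-aligned offsets into out[0..cap).
   Returns the number of entries written, or (size_t)-1 if out is too small. */
static size_t relr_encode(const uint64_t *offs, size_t n,
                          uint64_t *out, size_t cap)
{
    size_t used = 0, i = 0;
    uint64_t base = 0;      /* offset at which the previous entry started */

    assert(offs != NULL && out != NULL);
    while (i < n) {
        uint64_t start = offs[i];
        uint64_t jump = (start - base) / WORD;
        uint64_t bits = 0;

        /* Collect every offset that falls within 56 words of start. */
        while (i < n && (offs[i] - start) / WORD < BITS) {
            bits |= 1ULL << ((offs[i] - start) / WORD);
            ++i;
        }
        if (jump >= 1 && jump <= 255) {
            if (used + 1 > cap) return (size_t)-1;
            out[used++] = (jump << 56) | bits;
        } else {
            /* jump == 0 is the escape: the full offset follows. */
            if (used + 2 > cap) return (size_t)-1;
            out[used++] = bits;
            out[used++] = start;
        }
        base = start;
    }
    return used;
}

/* Mirror of the loader loop above, collecting offsets into offs[0..cap).
   Returns the number of offsets, or (size_t)-1 on truncation or overflow. */
static size_t relr_decode(const uint64_t *ent, size_t n,
                          uint64_t *offs, size_t cap)
{
    size_t count = 0;
    uint64_t offset = 0;

    for (size_t i = 0; i < n; ++i) {
        uint64_t jump = R_JUMP(ent[i]);
        uint64_t bits = R_BITS(ent[i]);

        if (jump == 0) {
            if (++i == n) return (size_t)-1;   /* escape without an offset */
            offset = ent[i];
        } else {
            offset += jump * WORD;
        }
        for (uint64_t r = offset; bits != 0; bits >>= 1, r += WORD) {
            if (bits & 1) {
                if (count == cap) return (size_t)-1;
                offs[count++] = r;
            }
        }
    }
    return count;
}
```

With the offsets 0x10, 0x18, 0x28, 0x400 and 0x10000, this sketch emits four
entries: one covering the first three offsets, one reaching 0x400 with a jump
of 126 words, and an escape pair carrying 0x10000 in full.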
For 32-bit targets, we use 32-bit entries: 8 bits for 'jump' and 24 bits for
the bitmap:
    #define ELF32_R_JUMP(val)    ((val) >> 24)
    #define ELF32_R_BITS(val)    ((val) & 0xffffff)
## Motivating examples
Here are three real world examples that demonstrate the savings:
1. Chrome browser (x86_64, built as PIE):
   File size (stripped): 152265064 bytes (145.21MB)
   605159 relocation entries (24 bytes each) in '.rela.dyn'
   594542 are R_X86_64_RELATIVE relocations (98.25%)
   14269008 bytes (13.61MB) used by the relative relocations in '.rela.dyn'
   109256 bytes (0.10MB) if moved to '.relr.dyn' section
   Savings: 14159752 bytes, or 9.29% of original file size.
2. Go net/http test binary (x86_64, 'go test -buildmode=pie -c net/http'):
   File size (stripped): 10238168 bytes (9.76MB)
   83810 relocation entries (24 bytes each) in '.rela.dyn'
   83804 are R_X86_64_RELATIVE relocations (99.99%)
   2011296 bytes (1.92MB) used by the relative relocations in '.rela.dyn'
   43744 bytes (0.04MB) if moved to '.relr.dyn' section
   Savings: 1967552 bytes, or 19.21% of original file size.
3. Vim binary in /usr/bin on my workstation (Ubuntu, x86_64):
   File size (stripped): 3030032 bytes (2.89MB)
   6680 relocation entries (24 bytes each) in '.rela.dyn'
   6272 are R_X86_64_RELATIVE relocations (93.89%)
   150528 bytes (0.14MB) used by the relative relocations in '.rela.dyn'
   1992 bytes (0.00MB) if moved to '.relr.dyn' section
   Savings: 148536 bytes, or 4.90% of original file size.
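
The savings figures above follow from simple arithmetic: each relative
relocation currently costs one 24-byte Elf64_Rela entry, and the saving is that
cost minus the size of the replacement '.relr.dyn' section. A small sketch, with
hypothetical helper names, that reproduces the numbers:

```c
#include <assert.h>
#include <stdint.h>

#define RELA_ENTSIZE 24u   /* sizeof(Elf64_Rela) */

/* Bytes saved by moving `relative` R_*_RELATIVE entries out of '.rela.dyn'
   into a '.relr.dyn' section that occupies `relr_bytes` bytes. */
static uint64_t relr_savings(uint64_t relative, uint64_t relr_bytes)
{
    return relative * RELA_ENTSIZE - relr_bytes;
}

/* `part` as a percentage of `whole`. */
static double percent_of(uint64_t part, uint64_t whole)
{
    assert(whole != 0);
    return 100.0 * (double)part / (double)whole;
}
```

For the Chrome example, relr_savings(594542, 109256) gives 14159752 bytes, and
percent_of(14159752, 152265064) is just under 9.3.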
Because recent releases of Debian, Ubuntu, and several other distributions
build executables as PIE by default, the third example suggests that using
SHT_RELR sections to encode relative relocations can bring savings on the order
of 5% to ordinary executables in /usr/bin across many distributions.
## Topics for discussion
* Does this really belong in the generic-abi? The R_*_RELATIVE relocations can
be specified in the SHT_REL/SHT_RELA sections. This is just an optimization.
Our opinion (for seeding the discussion):
Yes, SHT_RELR is just an optimization for storing the R_*_RELATIVE relocations.
If this proposal is rejected, the relocations can continue to reside in the
SHT_REL/SHT_RELA sections like they do today. However, from that point of view,
SHT_REL is also just an optimization over SHT_RELA, so there is precedent for
doing this in the generic-abi.
Entries of type Elf32_Rel and Elf64_Rel store an implicit addend in the
location to be modified, saving 33% memory per entry over Elf32_Rela or
Elf64_Rela.
SHT_RELR takes this same approach a few steps further: removing the r_info
field (relocation type is implicit, and there is no need for symbol table
index), and compactly encoding the remaining data in the entries, which is
simply a sorted list of offsets.
* Assuming there is consensus that adding SHT_RELR to generic-abi is a good
idea, should the ABI really detail the encoding to be used or just define
SHT_RELR/DT_RELR* and leave the encoding unspecified?
Based on the specification of Elf32_Rel, Elf32_Rela, Elf64_Rel, and Elf64_Rela
in the ABI, it probably makes sense to at least define the Elf32_Relr and
Elf64_Relr types.
* Assuming there is consensus that the ABI should specify the encoding used in
SHT_RELR sections, is the encoding described here the best one possible?
Better encodings may be possible. Discovering a more compact *and* simpler
encoding would be a very pleasant surprise indeed. Ideas welcome.
We would be happy to provide details of encoding schemes we've experimented
with and how they compared.
There are two minor optimizations of the above scheme that are worth mentioning
here: (64-bit version described here, but they're applicable to 32-bit as well)
1. The jump field in an entry will always be 0 or >=56. Values 1-55 are never
used, since the corresponding offsets would have been covered in the previous
entry. We can bias jump by 55 (jump=0 is used as a special value) to raise
the upper bound for deltas that can be encoded without requiring an extra
word.
2. The least significant bit of the bitmap is always 1. There is an explicit
relocation at every new value of offset. The bitmap can be defined to mean
the *next* 56 words after the new offset, making the relocation at the new
offset implicit. This enables each entry to encode 57 relocations instead
of 56.
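
To make the two optimizations concrete, here is a sketch of a decoder for the
modified 64-bit scheme. It is only one possible reading of the description
above: the relr_decode_biased() name and the JUMP_BIAS constant are
hypothetical, a stored jump field j in 1..255 is taken to mean a jump of j + 55
words, and bit i of the bitmap is taken to refer to the word i + 1 words after
the new offset. As in any illustration, offsets are collected into an array
rather than relocated.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define WORD       8u                       /* sizeof(Elf64_Addr) */
#define JUMP_BIAS  55u                      /* stored field 1 means 56 words */
#define BITMAP(v)  ((v) & 0xffffffffffffffULL)

/* Decode n entries into offs[0..cap).
   Returns the number of offsets, or (size_t)-1 on truncation or overflow. */
static size_t relr_decode_biased(const uint64_t *ent, size_t n,
                                 uint64_t *offs, size_t cap)
{
    size_t count = 0;
    uint64_t offset = 0;

    assert(ent != NULL && offs != NULL);
    for (size_t i = 0; i < n; ++i) {
        uint64_t field = ent[i] >> 56;
        uint64_t bits = BITMAP(ent[i]);

        if (field == 0) {
            /* Escape: the next entry holds the full offset. */
            if (++i == n) return (size_t)-1;
            offset = ent[i];
        } else {
            offset += (field + JUMP_BIAS) * WORD;
        }

        /* Optimization 2: the relocation at the new offset is implicit. */
        if (count == cap) return (size_t)-1;
        offs[count++] = offset;

        /* The bitmap describes the 56 words after the new offset. */
        for (uint64_t r = offset + WORD; bits != 0; bits >>= 1, r += WORD) {
            if (bits & 1) {
                if (count == cap) return (size_t)-1;
                offs[count++] = r;
            }
        }
    }
    return count;
}
```

Under this reading, a single entry reaches deltas of 56 to 310 words instead of
1 to 255. Note that once the relocation at the new offset is implicit, the
smallest jump that can occur is 57 words, so a bias of 56 would also work; the
sketch keeps 55 to match the text.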
* What happens when an older dynamic loader that does not recognize DT_RELR*
tags meets a shared library or executable that contains a SHT_RELR section?
Mysterious crashes!!
Unknown DT_* tags are not fatal errors at program loading time. The older
dynamic loader will simply skip the relocations in SHT_RELR section and start
the executable. This is likely to result in mysterious crashes.
Here's one proposal to handle the situation:
While unknown DT_* tags are not fatal errors at program loading time, unknown
relocations are.
We propose adding a new R_*_RELR relocation on each architecture that adds
support for SHT_RELR sections. Whenever the linker emits a SHT_RELR section in
a shared object or executable, it must also include one entry of type R_*_RELR
in the SHT_REL/SHT_RELA section of the binary. When the dynamic loader adds
support on that architecture for SHT_RELR sections, it must also add support
for the corresponding R_*_RELR relocation (by simply ignoring it instead of
aborting).
This scheme would ensure that older dynamic loaders error out with a useful
error message when trying to load newer binaries with SHT_RELR sections.
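
A rough sketch of how this guard would behave in a loader's relocation switch
is shown below. Everything here is hypothetical and for illustration only: the
R_EX_* names, the value chosen for R_EX_RELR, the RelocStatus type, and the
relr_aware flag, which simply models an older versus a newer loader in one
function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical relocation types for an example architecture. */
enum {
    R_EX_RELATIVE = 8,
    R_EX_RELR     = 100   /* placeholder value, not assigned anywhere */
};

typedef enum {
    RELOC_APPLIED,
    RELOC_IGNORED,
    RELOC_UNSUPPORTED     /* a real loader would report an error and abort */
} RelocStatus;

/* Process one Rela-style relocation. relr_aware is nonzero for a loader
   that knows about SHT_RELR sections. */
static RelocStatus process_reloc(uint32_t type, uint64_t *where,
                                 uint64_t load_bias, uint64_t addend,
                                 int relr_aware)
{
    assert(where != NULL);
    switch (type) {
    case R_EX_RELATIVE:
        *where = load_bias + addend;
        return RELOC_APPLIED;
    case R_EX_RELR:
        /* A newer loader handles the SHT_RELR section separately and only
           needs to skip this marker entry. */
        if (relr_aware)
            return RELOC_IGNORED;
        return RELOC_UNSUPPORTED;
    default:
        return RELOC_UNSUPPORTED;
    }
}
```

An older loader reaches the unsupported path on the marker entry and fails at
load time with an unknown-relocation error, instead of starting the program
with its relative relocations unapplied.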