Hi, I'd like to propose a LEB128-based relocation format to replace REL and RELA formats.
The natural size guideline for control structures presents significant drawbacks for relocations.
Since relocations are typically processed sequentially, they don't gain the same random-access advantages.
The large 24-byte Elf64_Rela structure highlights the drawback.
I've written a detailed analysis of the size problem and my solution at
https://maskray.me/blog/2024-03-09-a-compact-relocation-format-for-elf ,
and developed a LLVM/Clang/lld prototype at
https://github.com/MaskRay/llvm-project/tree/demo-relleb (clang -mrelleb, llvm-readelf -r, ld.lld)
Below, you'll find an overview of the size issue, followed by excerpts from my blog post that detail the "RELLEB relocation format" and my "RELLEB proposal for the generic ABI."
---
In contrast, the WebAssembly object file format uses LEB128 encoding for relocations and other constrol structures, offering a significant size advantage over ELF.
In my Clang+lld prototype,
The total relocation section size has decreased from 28768176 to 5318839, 18.4% of the original size.
The .o size has decreased by 20.5%.
Applying zstd (SHF_COMPRESSED) would reduce SHT_RELLEB section size by 26.4%.
The overall decrease of only 1.088% in `.o` file sizes indicates that RELLEB offers a reasonable compromise.
---
## Use cases
* Dynamic relocations
* Marker relocations (R_RISCV_RELAX, etc)
* `.llvm_addrsig`
* `.llvm.call-graph-profile`
* `.debug_names`
Refer to
https://maskray.me/blog/2024-03-09-a-compact-relocation-format-for-elf#use-cases---
## RELLEB relocation format
The 1990 ELF paper _ELF: An Object File to Mitigate Mischievous Misoneism_ says "ELF allows extension and redefinition for other control structures."
Inspired by WebAssembly, let's explore RELLEB, a new and more compact relocation format designed to replace RELA.
Our emphasis is on simplicity over absolute minimal encoding. See the end of the article for a detailed format description.
A `SHT_RELLEB` section (preferred name: `.relleb<name>`) holds compact relocation entries that decode to `Elf32_Rela` or `Elf64_Rela` depending on the object file class (32-bit or 64-bit).
Its content begins with a ULEB128-encoded relocation count, followed by entries encoding `r_offset`, `r_type`, `r_symidx`, and `r_addend`.
Here are key design choices:
*Relocation count (ULEB128)*:
This allows for efficient retrieval of the relocation count without decoding the entire section.
While a `uint32_t` (like [`SHT_HASH`](
https://www.sco.com/developers/gabi/latest/ch5.dynamic.html#hash)) could be used, ULEB128 aligns with subsequent entries, removes endianness differences, and offers a slight size advantage in most cases when the number of symbols can be encoded in one to three bytes.
*Delta encoding for `r_offset` (ULEB128)*:
Section offsets can be large, and relocations are typically ordered. Storing the difference between consecutive offsets offers compression potential.
In most cases, a single byte will suffice.
While there are exceptions (general dynamic TLS model of s390/s390x uses a local "out-of-order" pair: `R_390_PLT32DBL(offset=o) R_390_TLS_GDCALL(offset=o-2)`), we are optimizing for the common case.
Switching to SLEB128 would increase the total `.o` size by 0.1%.
For ELFCLASS32, `r_offsets` members are calculated using modular arithmetic modulo 4294967296.
*Delta encoding for `r_type` (SLEB128)*:
Some psABIs utilize relocation types greater than 128.
AArch64's static relocation types begin at 257 and dynamic relocation types begin at 1024, necessitating two bytes with ULEB128/SLEB128 encoding in the absence of delta encoding.
Delta encoding allows all but the first relocation's type to be encoded in a single byte.
An alternative design is to define a base type in the header and encode types relative to the base type, which would introduce slight complexity.
If the AArch32 psABI could be redesigned, allocating `[0,64)` for Thumb relocation types and `[64,*)` for ARM relocation types would optimize delta encoding even further.
While sharing a single type code for multiple relocations would be efficient, it would require reordering relocations.
This conflicts with order requirements imposed by several psABIs and could complicate linker implementations.
*Symbol index and addend presence (SLEB128)*:
ULEB128 optimizes for the common case when the symbol index is encodable in one or two bytes.
Using SLEB128 and delta encoding instead of ULEB128 for the symbol index field would increase the total size by 0.4%.
Potential gains for dynamic relocations with consecutive indexes are outweighed by the loss in static relocations and added complexity, hence avoiding delta encoding.
We indicate addend omission using the sign bit (see below).
*Delta encoding for addend (SLEB128)*:
Consecutive static relocations often have identical addends, especially on architectures with frequent zero addends (AArch64, PowerPC, RISC-V, etc).
Addend omission optimizes this scenario.
Additionally, it benefits architectures like AArch32 and x86, which often have identical negative addends (call instructions) for consecutive relocations.
```c
// Many R_X86_64_PLT32(g-4)
void f() { g(); g(); g(); ... }
```
Dynamic relocations (except `R_*_RELATIVE` and `R_*_IRELATIVE`) typically have zero addends, also benefiting from our scheme.
The sign bit of the previous member flags whether the addend has changed relative to the prior entry.
When the addend changes, we use an addend delta. This offers a slight size advantage (about 0.20%) and optimizes for cases like:
```asm
.quad .data + 0x78
.quad .data + 0x80
.quad .data + 0x88
...
```
Note: when the bitwise NOT code path is taken, the zero delta addend is not utilized.
### RELLEB for dynamic relocations
Redacted. My blog post presents a comparison with Android packed relocations.
## RELLEB proposal for the generic ABI
In <
https://www.sco.com/developers/gabi/latest/ch4.sheader.html>, make the following changes.
_In Figure 4-9: Section Types,sh_type, append a row_
`SHT_RELLEB` | 20
_Add text:_
SHT_RELLEB - The section holds compact relocation entries with explicit addends. An object file may have multiple relocation sections. ''Relocation'' below for details.
_In Figure 4-16: Special Sections, append a row_
`.rellebname` | `SHT_RELLEB` | see below
_Change the text below:_
.relname, .relaname, and .rellebname
These sections hold relocation information, as described in ''Relocation''. If the file has a loadable segment that includes relocation, the sections' attributes will include the SHF_ALLOC bit; otherwise, that bit will be off. Conventionally, name is supplied by the section to which the relocations apply. Thus a relocation section for .text normally would have the name .rel.text, .rela.text, or .relleb.text.
_In Figure 4-23: Relocation Entries, add:_
```c
typedef struct {
Elf32_Addr r_offset;
Elf32_Word r_type;
Elf32_Word r_symidx;
Elf32_Sxword r_addend;
} Elf32_Relleb;
typedef struct {
Elf64_Addr r_offset;
Elf64_Word r_type;
Elf64_Word r_symidx;
Elf64_Sxword r_addend;
} Elf64_Relleb;
```
_Add text above "A relocation section references two other sections":_
A `SHT_RELLEB` section holds compact relocation entries that decode to `Elf32_Relr` or `Elf64_Relr` depending on the object file class (32-bit or 64-bit).
Its content begins with a ULEB128-encoded relocation count, followed by entries encoding `r_offset`, `r_type`, `r_symidx`, and `r_addend`.
Note that the `r_info` member in traditional REL/RELA formats has been split into separate `r_type` and `r_symidx` members, allowing `uint32_t` relocation types for ELFCLASS32 as well.
* Delta offset: (ULEB128-encoded) The difference in `r_offset` relative to the previous entry. The difference is truncated to 32-bit/64-bit for ELFCLASS32/ELFCLASS64, respectively.
* Delta type: (SLEB128-encoded) The difference in relocation type relative to the previous entry.
* Symbol table index and addend presence: (SLEB128-encoded) If the `r_offset` is equal to the previous `r_addend`, the encoded value represents the symbol index; otherwise, the bitwise NOT of the encoded value indicates the symbol index.
* Delta addend: (SLEB128-encoded, only if indicated in the previous member) The difference in `r_addend` relative to the previous entry. The difference is truncated to 32-bit/64-bit for ELFCLASS32/ELFCLASS64, respectively.
Example C++ encoder for `Elf64_Relleb`:
```cpp
uint64_t offset = 0, addend = 0;
uint32_t type = 0;
encodeULEB128(relocs.size(), os);
for (const Reloc &rel : relocs) {
encodeULEB128(rel.offset - offset, os);
encodeSLEB128(static_cast<int32_t>(rel.type - type), os);
type = rel.type;
if (addend == rel.addend) {
encodeSLEB128(rel.symidx, os);
} else {
encodeSLEB128(~symidx, os);
encodeSLEB128(rel.addend - addend, os);
addend = rel.addend;
}
}
```
For the first relocation entry, the previous offset, type, and addend members are treated as zero.
_In Figure 5-10: Dynamic Array Tags, d_tag, add:_
`DT_RELLEB` | 38 | `d_ptr` | optional | optional
_Add text below:_
* `DT_RELLEB` - This element is similar to `DT_RELA`, except its table uses the RELLEB format.