New gABI/ELF Spec Available for Public Review

Cary Coutant

Sep 2, 2025, 10:31:03 PM
to Generic System V Application Binary Interface
A draft of Version 4.3 of the ELF specification is now available for
public review, in both HTML and PDF:

https://gabi.xinuos.com

The ELF specification was formerly part (chapters 4 and 5) of the SVR4
gABI document, but I have separated it from the gABI document to
better serve as a specification independent of the SVR4 ABI.

The last published gABI documents were the Fourth Edition and a draft
of Edition 4.1, both published in March 1997. The ELF portions of the
document were updated several times between 1998 and 2015, published
online as drafts at:

https://www.sco.com/developers/devspecs/

I've published the last draft from 2015 as Version 4.2, and collected
the several changes since then, along with new e_machine values, as
Version 4.3.

Version 4.2 (based on the fifteenth draft after Edition 4.1, from July 23, 2015)

- Converted to ReStructuredText.
- ELF specification is now separate from the gABI document.
- Removed empty placeholders for psABI sections.

Version 4.3 (DRAFT)

- Added extra requirements for SHF_LINKORDER flag.
- Added relative relocation table (Elf32_Relr and Elf64_Relr).
- Changed the symbol visibility attribute to use the lower 3 bits of
st_other (instead of 2 bits).
- Added DT_SYMTABSZ entry, and made DT_HASH optional if DT_SYMTABSZ is provided.
- Changed SHF_COMPRESSED to allow its use with SHF_ALLOC sections in ET_REL objects.
- Added ELFCOMPRESS_ZSTD compression algorithm.
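
As a rough illustration of the visibility change in the list above, here is a minimal sketch in C contrasting a 2-bit and a 3-bit extraction from st_other. The function names are hypothetical, and the 0x7 mask is only inferred from the "lower 3 bits" wording; the draft itself is the authority on the exact macro definition.

```c
#include <stdint.h>

/* Visibility values already defined by the gABI. */
enum {
    STV_DEFAULT   = 0,
    STV_INTERNAL  = 1,
    STV_HIDDEN    = 2,
    STV_PROTECTED = 3
};

/* Historical extraction: only the low 2 bits of st_other are visibility. */
unsigned st_visibility_2bit(uint8_t st_other)
{
    return st_other & 0x3u;
}

/* Extraction following the 4.3 draft wording: the low 3 bits of st_other.
 * The 0x7 mask is an assumption derived from "lower 3 bits". */
unsigned st_visibility_3bit(uint8_t st_other)
{
    return st_other & 0x7u;
}
```

For the four values above the two functions agree; they differ only when bit 2 of st_other is set, which the 2-bit form silently drops.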

Version 4.2, for reference, is available at:

https://gabi.xinuos.com/v42/

The source is on github:

https://github.com/xinuos/gabi

I've collected the gABI mailing list discussions for the changes
listed above for Version 4.3 as the first six issues in the issue
tracker.

-cary

H.J. Lu

Sep 2, 2025, 10:48:56 PM
to gener...@googlegroups.com
Thanks. I really appreciate it.

> -cary
>


--
H.J.

Fangrui Song

Sep 3, 2025, 2:06:11 AM
to Generic System V Application Binary Interface
On Tuesday, September 2, 2025 at 7:48:56 PM UTC-7 H.J. Lu wrote:
Thank you for creating the new website and maintaining the specification!
The August 2015 post marked the start of the unmaintained period: https://groups.google.com/g/generic-abi/c/IakWYdGABjQ
After a decade, we now have an official live specification once again!

I've updated my notes as well: https://maskray.me/blog/2024-05-26-evolution-of-elf-object-file-format

Michael Matz

Sep 3, 2025, 10:01:52 AM
to Generic System V Application Binary Interface
Heyho,

On Tue, 2 Sep 2025, Cary Coutant wrote:

> A draft of Version 4.3 of the ELF specification is now available for
> public review, in both HTML and PDF:
>
> https://gabi.xinuos.com
...
> - Converted to ReStructuredText.
...
> The source is on github:
>
> https://github.com/xinuos/gabi

Hurray! Thank you for persisting through it, Cary! I'm very happy.


Ciao,
Michael.

Mark Wielaard

Sep 3, 2025, 11:46:46 AM
to Cary Coutant, Generic System V Application Binary Interface
Hi Cary,

On Tue, 2025-09-02 at 19:30 -0700, Cary Coutant wrote:
> I've published the last draft from 2015 as Version 4.2, and collected
> the several changes since then, along with new e_machine values, as
> Version 4.3.
>
> Version 4.2 (based on the fifteenth draft after Edition 4.1, from July 23, 2015)
>
> - Converted to ReStructuredText.
> - ELF specification is now separate from the gABI document.
> - Removed empty placeholders for psABI sections.
>
> Version 4.3 (DRAFT)
>
> - Added extra requirements for SHF_LINKORDER flag.
> - Added relative relocation table (Elf32_Relr and Elf64_Relr).
> - Changed the symbol visibility attribute to use the lower 3 bits of
> st_other (instead of 2 bits).
> - Added DT_SYMTABSZ entry, and made DT_HASH optional if DT_SYMTABSZ is provided.
> - Changed SHF_COMPRESSED to allow with SHF_ALLOC sections in ET_REL objects.
> - Added ELFCOMPRESS_ZSTD compression algorithm.
>
> Version 4.2, for reference, is available at:
>
> https://gabi.xinuos.com/v42/
>
> The source is on github:
>
> https://github.com/xinuos/gabi

Very nice! Thanks so much.

Could you add a license to it so it is clear what the redistribution
terms are and under which terms the supplement psABI documents may
reuse the text?

Thanks,

Mark

Cary Coutant

Sep 3, 2025, 2:24:08 PM
to Mark Wielaard, Generic System V Application Binary Interface
> Could you add a license to it so it is clear what the redistribution
> terms are and under which terms the supplement psABI documents may
> reuse the text?

Working on it. I've proposed a more permissive license to the Xinuos
legal team, and need their approval.

-cary

Cary Coutant

Sep 3, 2025, 2:33:26 PM
to Generic System V Application Binary Interface
By the way, I'd like to thank Ali Bahrami and Carlos O'Donell for
their thorough and constructive reviews of earlier drafts of this
work.

-cary

ali_e...@emvision.com

Sep 3, 2025, 3:06:07 PM
to gener...@googlegroups.com
On 9/2/25 8:30 PM, Cary Coutant wrote:
> Version 4.2, for reference, is available at:
>
> https://gabi.xinuos.com/v42/
>
> The source is on github:
>
> https://github.com/xinuos/gabi


Even though I saw an earlier draft, and had a pretty
good idea of what to expect, I'm really impressed by how good
this is. It's quite a leap from the previous edition.

You've done a huge service for the rest of us, and one that I
doubt could have happened otherwise. We're set for years to
come now.

Thank You!

- Ali

Mark Wielaard

Sep 8, 2025, 1:56:56 PM
to Cary Coutant, Generic System V Application Binary Interface
Hi Cary,
Thanks. Could you also make clear that this is based on public
discussions by lots of people who contributed concrete wording for the
specification text? It is probably not feasible to mention every
individual that contributed. But I found the addition of just
"Copyright © 2011-2014, 2023–2025 Xinuos Inc. All rights reserved." a
little misleading. This is clearly a public work which we want to
redistribute openly without any rights reserved.

Thanks,

Mark

Cary Coutant

Oct 8, 2025, 1:27:14 PM
to Generic System V Application Binary Interface
I've been going through old email discussions on this list, and have found three more where we seemed to reach a consensus on changing the spec. I've opened new issues for these on the github gabi project:

- Issue 9: SHT_NOBITS clarification
- Issue 10: Support for more than 64K program headers
- Issue 11: Update ET_EXEC and ET_DYN

Each issue has a link to the relevant discussion on this list.

I'm continuing to look. If anyone knows of any other discussions where we as a group agreed to a change that is not yet reflected in the v4.3 draft, please point them out to me.

I'll update the 4.3 draft to incorporate these changes, unless I hear objections.

-cary

Ali Bahrami

Oct 8, 2025, 3:34:28 PM
to gener...@googlegroups.com
On 10/8/25 11:26 AM, Cary Coutant wrote:
> - Issue 9 <https://github.com/xinuos/gabi/issues/9>: SHT_NOBITS
> clarification

From that discussion:

> Because this section contains no bytes, the
> sh_offset member has no meaning and is not used.

I don't have an issue with this, but will note that most
link-editors do adhere to setting it to the offset at
which the bits would be found if the section were not SHT_NOBITS.
Is it worth nodding to that, so that anyone comparing
a dump of section headers to the spec won't be confused
in trying to make sense of it? Maybe add a follow-on like:

Note that many link-editors follow a convention of
setting sh_offset to the offset at which data would
be found if the section were not SHT_NOBITS. These
values are purely informational.

Just a thought --- I think it's good either way.

- Ali

Cary Coutant

Oct 8, 2025, 5:17:39 PM
to gener...@googlegroups.com
I never saw the point of specifying a value—I'd probably have chosen to require the field to be 0. As it is, some tools fail to set it consistently, leading to bug reports based on the current wording of the spec. "The offset at which data would be found if..." is really meaningless, isn't it? The ELF writer might have placed those zeroes anywhere if it was forced to materialize them. More accurate might be: "the offset following the last allocated space" (perhaps overly restrictive) or "an offset at which the data might have been allocated if..." (still meaningless).

The only thing I think might make sense to add here is a requirement that whatever value is put there is within the bounds of the object file, but I still don't think there's a reason to impose that requirement. It might make it easier on tools that just blindly fseek() before realizing there's nothing to read.
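
To make the fseek() point concrete, here is a minimal sketch in C of a reader-side check that never consults sh_offset for an SHT_NOBITS section, so whatever value a link editor stored there cannot cause a bad seek. The struct is a hypothetical, trimmed-down view of a section header and the function name is made up for illustration; this is not taken from any existing tool.

```c
#include <stdbool.h>
#include <stdint.h>

#define SHT_NOBITS 8u   /* section type value from the gABI */

/* Hypothetical reduced section header: only the fields this check needs. */
struct sec_hdr {
    uint32_t sh_type;
    uint64_t sh_offset;
    uint64_t sh_size;
};

/* Returns true only if the section has bytes in the file and those bytes
 * lie wholly within file_size. SHT_NOBITS sections return false before
 * sh_offset is examined at all. */
bool section_has_file_bytes(const struct sec_hdr *sh, uint64_t file_size)
{
    if (sh->sh_type == SHT_NOBITS)
        return false;                 /* occupies no file space */
    if (sh->sh_size == 0)
        return false;                 /* nothing to read */
    if (sh->sh_offset > file_size)
        return false;                 /* starts past end of file */
    /* Subtraction form avoids overflow in sh_offset + sh_size. */
    return sh->sh_size <= file_size - sh->sh_offset;
}
```

A tool written this way only seeks when the function returns true, which sidesteps the question of whether the stored offset is in bounds.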

-cary
 

Ali Bahrami

Oct 8, 2025, 5:56:08 PM
to gener...@googlegroups.com
On 10/8/25 3:17 PM, Cary Coutant wrote:
> I never saw the point of specifying a value—I'd probably have chosen to
> require the field to be 0.
...
> The only thing I think might make sense to add here is a requirement
> that whatever value is put there is within the bounds of the object
> file, but I still don't think there's a reason to impose that
> requirement. It might make it easier on tools that just blindly fseek()
> before realizing there's nothing to read.

I've no argument with that reasoning, nor do I want to
add more requirements.

At the same time, I don't want people to suddenly think that
their existing link-editors are broken. There's this existing
historical behavior from link-editors to set sh_offset to point
at the spot bits would go (if there were bits), and there's not
much reason to require that to change. Add to that decades of
existing objects, and we can expect new folks to occasionally
notice, and spend time trying to decode these "meaningless" values
that actually do seem to have an underlying meaning of some sort.

We can let them guess and puzzle it out, or we can just tell them
that it's a thing they may encounter, and not to worry about it.

Thanks.

- Ali

Cary Coutant

Oct 8, 2025, 7:33:08 PM
to gener...@googlegroups.com
> At the same time, I don't want people to suddenly think that
> their existing link-editors are broken. There's this existing
> historical behavior from link-editors to set sh_offset to point
> at the spot bits would go (if there were bits), and there's not
> much reason to require that to change. Add to that decades of
> existing objects, and we can expect new folks to occasionally
> notice, and spend time trying to decode these "meaningless" values
> that actually do seem to have an underlying meaning of some sort.

This is why I didn't say it must be 0. If we say it's meaningless and isn't used, why would anyone think their link editor is broken if it puts an arbitrary file offset there?
 
> We can let them guess and puzzle it out, or we can just tell them
> that it's a thing they may encounter, and not to worry about it.

I guess I wouldn't be opposed to a note that some tools might use the field that way, but I don't want to make it look anything like a requirement.

I also don't want to open up the possibility that some psABI might coopt the sh_offset field for another purpose, so maybe a bit more careful wording is needed. Maybe rather than replace the existing wording "... contains the conceptual file offset", we could simply add a note that the conceptual file offset could be any valid file offset within the file.

-cary

Roland McGrath

Oct 8, 2025, 7:47:44 PM
to gener...@googlegroups.com
On Wed, Oct 8, 2025 at 4:33 PM Cary Coutant <ccou...@gmail.com> wrote:
> This is why I didn't say it must be 0. If we say it's meaningless and isn't used, why would anything think their link editor is broken if it puts an arbitrary file offset there?

I think we can make this sufficiently clear with slightly different wording. Saying "... has no meaning and is not used ..." can be construed in many ways, including that it should not have a value that appears to have meaning (by being nonzero). It seems worthwhile just to be a little more verbose here, to say that it has no meaning and is not used by program loaders at runtime; static linkers may set this to zero or to another value that is convenient for them or that they choose for historical reasons.

Ali Bahrami

Oct 8, 2025, 8:23:37 PM
to gener...@googlegroups.com
On 10/8/25 5:32 PM, Cary Coutant wrote:
> This is why I didn't say it must be 0. If we say it's meaningless and
> isn't used, why would anything think their link editor is broken if it
> puts an arbitrary file offset there?

A good link-editor never writes garbage, so if I see non-zero
values, and I'm told they're meaningless, I'm going to conclude
that something is fishy. Either they're meaningful, and the spec
isn't telling me something I should know, or they're not, and the
link-editor shouldn't be giving them non-zero values, so there's
a bug to fix. One way or the other, something seems off, and I'm
going to dig.



> I guess I wouldn't be opposed to a note that some tools might use the
> field that way, but I don't want to make it look anything like a
> requirement.

Me either, which is why I called it an informational convention:

> Note that many link-editors follow a convention of
> setting sh_offset to the offset at which data would
> be found if the section were not SHT_NOBITS. These
> values are purely informational.

I accept that it didn't land as intended, but you can see
the effort to explain without requiring.

>
> I also don't want to open up the possibility that some psABI might coopt
> the sh_offset field for another purpose, so maybe a bit more careful
> wording is needed. Maybe rather than replace the existing wording "...
> contains the conceptual file offset", we could simply add a note that
> the conceptual file offset could be any valid file offset within the file.

We don't want some psABI to assign added meaning to sh_offset, whether
it's a valid offset or not though, do we? It occurs to me that one merit
of the old "offset where bits would go if there were bits" definition is
that it takes away any wiggle room for such coopting.

Rather than "any valid file offset", why not limit things to the 2
cases we actually condone? Something like:

A section of this type occupies no space in the file but
otherwise resembles SHT_PROGBITS. Because this section
contains no bytes, the sh_offset member has no meaning and
is not used. It may be set to 0, or alternatively, to the file
offset that the link editor would have assigned the section
if the type had been SHT_PROGBITS.

- Ali

Fangrui Song

Oct 9, 2025, 3:17:44 AM
to Generic System V Application Binary Interface
On Wednesday, October 8, 2025 at 5:23:37 PM UTC-7 Ali Bahrami wrote:
> On 10/8/25 5:32 PM, Cary Coutant wrote:
>> This is why I didn't say it must be 0. If we say it's meaningless and
>> isn't used, why would anything think their link editor is broken if it
>> puts an arbitrary file offset there?
>
> A good link-editor never writes garbage, so if I see non-zero
> values, and I'm told they're meaningless, I'm going to conclude
> that something is fishy. Either they're meaningful, and the spec
> isn't telling me something I should know, or they're not, and the
> link-editor shouldn't be giving them non-zero values, so there's
> a bug to fix. One way or the other, something seems off, and I'm
> going to dig.


A linker might want to maintain non-decreasing sh_offset values, even if it is not required.
Assigning a zero value to sh_offset would break this property.

Additional notes detailing sh_offset for NOBITS sections would be helpful but I am unsure how to word it.

> It may be set to 0, or alternatively, to the file offset that the link editor would have assigned the section if the type had been SHT_PROGBITS.

This should be relaxed. For a SHT_NOBITS section, a linker can assign DOT to sh_offset rather than align(DOT,sh_addralign), which is what it would assign if the type had been SHT_PROGBITS.
This approach, adopted by the LLVM Linker, simplifies the sh_offset assignment algorithm and guarantees non-decreasing sh_offset values.
(However, align(DOT,sh_addralign) should still be used if the SHT_NOBITS section is the first section of its containing PT_LOAD or PT_TLS segment.)
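
The assignment rule described above can be sketched roughly as follows. This is a simplified illustration in C, not the LLVM Linker's actual code: the struct, the first_in_segment flag, and the function names are all hypothetical, and real linkers also keep file offsets congruent with addresses modulo the page size, which is ignored here.

```c
#include <stddef.h>
#include <stdint.h>

#define SHT_PROGBITS 1u
#define SHT_NOBITS   8u

/* Hypothetical output-section record for this sketch. */
struct out_sec {
    uint32_t sh_type;
    uint64_t sh_addralign;     /* 0, 1, or a power of two */
    uint64_t sh_size;
    int      first_in_segment; /* first section of its PT_LOAD/PT_TLS */
    uint64_t sh_offset;        /* filled in by assign_offsets() */
};

static uint64_t align_up(uint64_t v, uint64_t a)
{
    return a > 1 ? (v + a - 1) & ~(a - 1) : v;
}

/* Walks sections in output order, starting at file offset dot.
 * Returns the file offset just past the last byte written. */
uint64_t assign_offsets(struct out_sec *secs, size_t n, uint64_t dot)
{
    for (size_t i = 0; i < n; i++) {
        struct out_sec *s = &secs[i];
        if (s->sh_type == SHT_NOBITS && !s->first_in_segment) {
            s->sh_offset = dot;    /* plain DOT: no alignment, no advance */
            continue;
        }
        dot = align_up(dot, s->sh_addralign);
        s->sh_offset = dot;
        if (s->sh_type != SHT_NOBITS)
            dot += s->sh_size;     /* only real bytes consume file space */
    }
    return dot;
}
```

Because dot never moves backwards and every sh_offset is taken from it, the resulting sh_offset values are non-decreasing in output order, including across SHT_NOBITS sections.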

In contrast, I vaguely recall that GNU ld might decrease sh_offset when moving from a SHT_NOBITS section to a SHT_PROGBITS section.

(
I was wondering about the significance of sh_offset on llvm-objcopy. It turns out it has none.
For non-NOBITS sections, sh_offset defines the section's position within the segment, while sh_addr is used for NOBITS sections.

However, I am unsure whether sh_offset for NOBITS sections holds any significance for GNU objcopy.
)