Use of sh_link and sh_info for SHT_NOTE sections.

connor horman

Sep 5, 2022, 11:35:36 PM
to gener...@googlegroups.com
Is it valid to use sh_link and sh_info for SHT_NOTE sections, for name/type-specific purposes?

If so, are there any restrictions to be aware of, and if not, would it be reasonable to make it valid?

ali_e...@emvision.com

Sep 6, 2022, 12:05:35 PM
to gener...@googlegroups.com
On 9/5/22 9:35 PM, connor horman wrote:
> Is it valid to use sh_link and sh_info for SHT_NOTE sections, for name/type-specific purposes?
>
> If so, are there any restrictions to be aware of, and if not, would it be reasonable to make it valid?


Not really valid, but it probably depends on the details
of what you're trying to do. This old blog might be useful
in terms of understanding the sort of issues one runs into:

http://www.linker-aliens.org/blogs/ali/entry/how_to_strip_an_elf/

In principle, you can only use sh_link to point at other
sections (a section index), while sh_info is more flexible,
but requires SHF_INFO_LINK to be set to distinguish section
indexes from other uses. The more important point, though, is
that the valid uses for sh_link/sh_info are established when
the section type is defined, and don't generally change later.
While you might start setting sh_link on an SHT_NOTE, and
comply with the rules spelled out in that blog, other
implementations, and tools, won't know why, or what needs to
be done with it.
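
To make the sh_link/sh_info rule above concrete, here is a minimal Python sketch of what a strip-like tool has to do when it removes sections and renumbers the rest. It is not taken from the blog or from any real tool: the dict-based section records and the remap_links name are hypothetical, and pointing a reference at index 0 when its target was removed is a simplifying assumption.

```python
SHF_INFO_LINK = 0x40  # gABI flag: sh_info holds a section header table index


def remap_links(sections, removed):
    """Drop the sections whose indexes are in `removed` and renumber references.

    `sections` is a list of dicts with 'flags', 'link' and 'info' keys, a
    simplified stand-in for real section headers (hypothetical layout).
    """
    # Map each surviving old index to its new index.
    new_index = {}
    for old in range(len(sections)):
        if old not in removed:
            new_index[old] = len(new_index)

    result = []
    for old, sec in enumerate(sections):
        if old in removed:
            continue
        out = dict(sec)
        # sh_link is treated as a section index; a removed target becomes 0
        # (simplifying assumption for this sketch).
        out["link"] = new_index.get(sec["link"], 0)
        # sh_info is only treated as a section index when SHF_INFO_LINK is set.
        if sec["flags"] & SHF_INFO_LINK:
            out["info"] = new_index.get(sec["info"], 0)
        result.append(out)
    return result
```

The point of the sketch is that the tool can do this mechanically only because it knows, from the section type and flags, which fields hold indexes. A non-zero sh_link on an SHT_NOTE gives it no such guarantee.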

The known valid uses of sh_link and sh_info are captured
in Table 4-14 in the gABI, and SHT_NOTE isn't listed, so
any non-zero values would be at odds with it:

http://www.sco.com/developers/gabi/latest/ch4.sheader.html
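
As an illustration of what "at odds with it" would look like to a checking tool, the following Python sketch scans a raw section header table and reports SHT_NOTE headers with a non-zero sh_link or sh_info. It assumes a little-endian ELF64 file (the Elf64_Shdr layout); the function name is hypothetical.

```python
import struct

SHT_NOTE = 7
# Elf64_Shdr, little-endian: sh_name, sh_type, sh_flags, sh_addr, sh_offset,
# sh_size, sh_link, sh_info, sh_addralign, sh_entsize.
SHDR64 = struct.Struct("<IIQQQQIIQQ")


def note_sections_with_links(shdr_table):
    """Return indexes of SHT_NOTE headers whose sh_link or sh_info is non-zero."""
    flagged = []
    for index in range(len(shdr_table) // SHDR64.size):
        fields = SHDR64.unpack_from(shdr_table, index * SHDR64.size)
        sh_type, sh_link, sh_info = fields[1], fields[6], fields[7]
        if sh_type == SHT_NOTE and (sh_link or sh_info):
            flagged.append(index)
    return flagged
```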

If you want to change the meaning of sh_link/sh_info for
SHT_NOTE, and you think your idea for them is sufficiently
general and useful to others, it would be best to propose a
change to the gABI. It's a high bar though, and given how
unstructured notes are, you might find it easier to just
encode this extra information within the note data for your
specific note, or just require the program interpreting the
note to find that other section through some other means.

Better than notes, is to define a new section type for your
purpose, probably under an OSABI, rather than the gABI, and
set whatever rules for sh_link/sh_info that make sense for it.

So it depends. If you think it's worth pursuing, then you
might want to give a brief sketch of the problem you're
trying to solve, and why it's a general problem that the gABI
should stretch to accommodate.

- Ali

connor horman

Sep 6, 2022, 4:16:03 PM
to gener...@googlegroups.com
On Tue, 6 Sept 2022 at 12:05, <ali_e...@emvision.com> wrote:
> On 9/5/22 9:35 PM, connor horman wrote:
> > Is it valid to use sh_link and sh_info for SHT_NOTE sections, for name/type-specific purposes?
> >
> > If so, are there any restrictions to be aware of, and if not, would it be reasonable to make it valid?
>
> Not really valid, but it probably depends on the details
> of what you're trying to do. This old blog might be useful
> in terms of understanding the sort of issues one runs into:
>
> http://www.linker-aliens.org/blogs/ali/entry/how_to_strip_an_elf/
>
> In principle, you can only use sh_link to point at other
> sections (a section index), while sh_info is more flexible,
> but requires SHF_INFO_LINK to be set to distinguish section
> indexes from other uses. The more important point though, is
> that the valid uses for sh_link/sh_info are established when
> the section type is defined, and don't generally change later.
> While you might start setting sh_link on an SHT_NOTE, and
> comply with the rules spelled out in that blog, other
> implementations, and tools, won't know why, or what needs to
> be done with it.

Currently, the use is to refer to a string table section, as well as a reference hash table in another note section, so in both cases this is fine. The second point is more of a problem, though.

> The known valid uses of sh_link and sh_info are captured
> in Table 4-14 in the gABI, and SHT_NOTE isn't listed, so
> any non-zero values would be at odds with it:
>
> http://www.sco.com/developers/gabi/latest/ch4.sheader.html
>
> If you want to change the meaning of sh_link/sh_info for
> SHT_NOTE, and you think your idea for them is sufficiently
> general and useful to others, it would be best to propose a
> change to the gABI. It's a high bar though, and given how
> unstructured notes are, you might find it easier to just
> encode this extra information within the note data for your
> specific note, or just require the program interpreting the
> note to find that other section through some other means.
>
> Better than notes, is to define a new section type for your
> purpose, probably under an OSABI, rather than the gABI, and
> set whatever rules for sh_link/sh_info that make sense for it.

So, the use for this is for a compiler, which uses dynamic libraries at link* time to communicate metadata, such as type information of exports (so that it can be safely referred to w/o a separate export file). In my case, this is for an implementation of the language Rust, and contains similar information to statically linked rlibs. As mentioned, the sections are a string table, used to reference various strings used in the manifest, such as crate name, item names and signatures, and the names used for the links key, and a hash table to make it more efficient to reference other sections by name from within the manifest, which is used in place of similar references that would be made to files within an archive.

I don't necessarily want to put the data as part of the note, because the current design allows the compiler to find the root manifest section, validate the name and type, preinitialize the string and reference tables, then toss the entire contents of the note right through the same manifest parser that would be used for manifests present within an archive. It's not a huge problem to read two i32s.

*The sections are read not by the link editor used to link the dynamic library, but by the language frontend when it is told to find a particular crate, and it finds an ET_DYN ELF file instead of a (possibly compressed) archive. A link editor is still used in most cases to construct the dynamic library though, and the sections are emitted in intermediate object files that are appended to the link line, along with <$rustlib>/misc/dylib_link.ld, which, among other things, instructs the link editor creating the .so to KEEP all of the relevant sections.

> So it depends. If you think it's worth pursuing, then you
> might want to give a brief sketch of the problem you're
> trying to solve, and why it's a general problem that the gABI
> should stretch to accommodate.
>
> - Ali

--
You received this message because you are subscribed to the Google Groups "Generic System V Application Binary Interface" group.
To unsubscribe from this group and stop receiving emails from it, send an email to generic-abi...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/generic-abi/ac12af5b-5ef5-425a-0d97-627b32e42392%40emvision.com.

connor horman

Sep 6, 2022, 4:18:54 PM
to gener...@googlegroups.com
I'd add that other than keeping the section numbers within the section, which isn't ideal but is ok, I don't want to start hardcoding names, since this is part of a spec building upon ELF, Mach-O, and PE (and possibly others in the future) to allow implementations of Rust that care to interoperate, and minimizing the amount that gets hardcoded is a nice goal, as it allows more implementation flexibility (some of which I might like to take advantage of).

ali_e...@emvision.com

Sep 8, 2022, 12:23:35 AM
to gener...@googlegroups.com
On 9/6/22 2:15 PM, connor horman wrote:
> So, the use for this is for a compiler, which uses dynamic libraries at link* time to communicate metadata, such as type information of exports (so that it can be safely referred to w/o a separate
> export file). In my case, this is for an implementation of the language rust, and contains similar information to statically linked rlibs. As mentioned, the sections are a string table, used to
> reference various strings used in the manifest, such as crate name, item names and signatures, and the names used for the links key, and a hash table so make it more efficient to reference other
> sections by name from within the manifest, which is used in replacement to similar references that would be made to files within an archive.
> I don't necessarily want to put the data as part of the note, because the current design allows the compiler to find the root manifest section, validate the name and type, preinitialize the string and
> reference tables, then toss the entire contents of the note right through the same manifest parser that would be used for manifests present within an archive. It's not a huge problem to read two i32s
> *The sections are read not by the link editor used to link the dynamic library, but by the language frontend when it is told to find a particular crate, and it finds an ET_DYN elf file instead of a
> (possibly compressed) archive. A link editor is still used in most cases to construct the dynamic library though, and the sections are emitted in intermediate object files that are appended to the
> link line, along with <$rustlib>/misc/dylib_link.ld, which, among other things, instructs the link editor creating the .so to KEEP all of the relevant sections.

...

> I'd add that other than keep the section numbers within the section, which isn't ideal
> but is ok, I don't want to start hardcoding names, since this is part of a spec building
> upon ELF, Mach-O, and PE (and possibly others in the future) to allow implementations
> of rust that care to interoperate, and minimizing the amount that gets hardcoded
> is a nice goal, as it allows more implementation flexibility (some of which I might
> like to take advantage of).


Hi Connor,

Notes are particularly difficult here, because their contents
are arbitrary in gABI terms. Each note creator puts whatever
they want in their note, and the standard has nothing to say
about that. Your note needs a string table and a hash table,
but other notes may need other things, and so there's a conflict
over this finite and tiny resource. Even worse, a given note
section can contain multiple unrelated notes, each potentially
with their own requirements for what sh_link/sh_info should mean.
Hence, I really think that if you do use notes for this, that you're
going to need to embed the section indexes, or their names,
in the note data, rather than the section header.
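
A rough Python sketch of the "embed the section indexes in the note data" option follows. It assumes the common note layout (namesz, descsz, type, then name and descriptor each padded to 4 bytes), little-endian encoding, and a descriptor that starts with two 32-bit indexes followed by the manifest bytes. The note name, the type value, and both function names are made up for illustration and are not part of any spec.

```python
import struct


def pad4(data):
    """Pad to a 4-byte boundary, as note name and descriptor fields are."""
    return data + b"\0" * (-len(data) % 4)


def build_note(name, note_type, strtab_index, hash_index, manifest):
    """Build one note whose descriptor starts with two 32-bit section indexes."""
    name_bytes = name.encode() + b"\0"  # namesz counts the terminating NUL
    desc = struct.pack("<II", strtab_index, hash_index) + manifest
    header = struct.pack("<III", len(name_bytes), len(desc), note_type)
    return header + pad4(name_bytes) + pad4(desc)


def parse_note(note):
    """Split a single note back into name, type, the two indexes, and manifest."""
    namesz, descsz, note_type = struct.unpack_from("<III", note, 0)
    name_start = 12
    desc_start = name_start + namesz + (-namesz % 4)
    name = note[name_start:name_start + namesz - 1].decode()
    desc = note[desc_start:desc_start + descsz]
    strtab_index, hash_index = struct.unpack_from("<II", desc, 0)
    return name, note_type, strtab_index, hash_index, desc[8:]
```

With this layout the reader strips the first 8 bytes of the descriptor and can hand the remainder to the same manifest parser it uses elsewhere. The indexes are plain data here, so nothing in the section header tells other tools that they refer to sections.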

Of course, that means that tools like strip (recall my old blog)
can't automatically fix things up if the object gets stripped, which
is a bummer. Embedding the section names instead of section indexes
ducks that, but I agree with you that it's not normally what one
wants to do.
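
For comparison, resolving a section by name at read time could look roughly like the Python sketch below. It assumes the reader already has the sh_name offsets from the section headers and the bytes of the section header string table; the function names and the example section names are hypothetical.

```python
def section_name(shstrtab, sh_name):
    """Read the NUL-terminated name starting at offset sh_name in the string table."""
    end = shstrtab.index(b"\0", sh_name)
    return shstrtab[sh_name:end].decode()


def find_section_index(name_offsets, shstrtab, wanted):
    """Return the index of the first section named `wanted`, or None.

    `name_offsets[i]` is the sh_name field of section header i.
    """
    for index, sh_name in enumerate(name_offsets):
        if section_name(shstrtab, sh_name) == wanted:
            return index
    return None
```

Because the index is recomputed from the name on every read, renumbering by strip does not invalidate it, at the cost of fixing a name somewhere.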

Using dedicated ELF section types, rather than notes, can solve
much of this, but it brings its own problems. You need to make
those sections part of the gABI, which can be a tough sell, and which
probably ties your hands on future changes, which you undoubtedly
want to avoid. Similar for adding it to an OSABI, plus you then
create cross-ELF portability problems. The gABI has generic, OS,
and platform partitions, but no language partition.

Or, you can put all the stuff (strings and hash) in their own notes,
and identify them at read time by their note names. You touched on
that above, so I know this isn't what you want, but perhaps it's worth
a second look? The upside here is that you can put whatever you
want in notes --- no standard gets to tell you differently.
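
A small Python sketch of the "identify them at read time" approach: walk every note in a note section's bytes and pick one out by its (name, type) pair, skipping unrelated notes that share the section. It assumes little-endian notes with 4-byte padding; the function names are hypothetical.

```python
import struct


def iter_notes(data):
    """Yield (name, type, descriptor) for each note in a note section's bytes."""
    offset = 0
    while offset + 12 <= len(data):
        namesz, descsz, note_type = struct.unpack_from("<III", data, offset)
        name_start = offset + 12
        desc_start = name_start + namesz + (-namesz % 4)
        name = data[name_start:name_start + namesz].rstrip(b"\0").decode()
        yield name, note_type, data[desc_start:desc_start + descsz]
        # Advance past the descriptor and its padding to the next note.
        offset = desc_start + descsz + (-descsz % 4)


def find_note(data, name, note_type):
    """Return the descriptor of the first note matching name and type, or None."""
    for found_name, found_type, desc in iter_notes(data):
        if (found_name, found_type) == (name, note_type):
            return desc
    return None
```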

Yet another idea is to create a rust specific file that captures
this stuff in a format you control directly, outside of the
ELF. Of course, there are obvious negatives to that, but it does
give you lots of flexibility and portability to non-ELF systems.

Good luck!

- Ali

connor horman

Sep 8, 2022, 7:22:38 AM
to gener...@googlegroups.com
That's fair - I'm currently contemplating these pseudo-files getting their own sections, but that isn't necessarily a hard requirement (though due to the way the spec

> Of course, that means that tools like strip (recall my old blog)
> can't automatically fix things up if the object gets stripped, which
> is a bummer. Embedding the section names instead of section indexes
> ducks that, but I agree with you that it's not normally what one
> wants to do.

Yeah, strip will cause a lot of fun here. If it was the name index, that's fine. I wanted to avoid hardcoding section names right into the format spec, but encoding them into the binary doesn't have the same problem. The problem is that this would itself require a string table. Hard-referencing the section string table may function properly, but idk how that interacts with strip as well. I definitely don't want to embed two variable-length strings right into the preamble before the actual manifest, though. This probably seems like the best option, aside from using a separate section type.

> Using dedicated ELF section types, rather than notes, can solve
> much of this, but it brings its own problems. You need to make
> those sections part of the gABI, which can be a tough sell, and which
> probably ties your hands on future changes, which you undoubtedly
> want to avoid. Similar for adding it to an OSABI, plus you then
> create cross-ELF portability problems. The gABI has generic, OS,
> and platform partitions, but no language partition.

I was thinking about using the user partition, but IDK if it is appropriate here. And yeah, allowing growth in the format is useful. I'm not sure that *how* it gets embedded will change significantly (and I would not be opposed to forgoing that part entirely), but certainly the manifest format itself will get updates.

> Or, you can put all the stuff (strings and hash) in their own notes,
> and identify them at read time by their note names. You touched on
> that above, so I know this isn't what you want, but perhaps it's worth
> a second look? The upside here is that you can put whatever you
> want in notes --- no standard gets to tell you differently.

Not name, though type might work for the section hash table (I've already reserved a specific note "name", and use types to store various files from the format spec). The string table can't really end up in a note section, though - it was supposed to be at the implementation's choice whether it ends up in an allocated section, as the mangled names that end up in this section may be reused at runtime (that was the whole point of outlining the string table from the manifest in the first place - allow the implementation to include it in an allocated section and shove symbol references in that section).

> Yet another idea is to create a rust specific file that captures
> this stuff in a format you control directly, outside of the
> ELF. Of course, there are obvious negatives to that, but it does
> give you lots of flexibility and portability to non-ELF systems.

This is something I'd rather not do - there is already such a file format, the manifest format that is getting directly embedded here, but keeping the dylib output self-contained is pretty much a requirement, especially for this particular Rust compiler, for which distributing dylibs and rlibs is viable.

> Good luck!
>
> - Ali
