Hi Cary,
On 10/28/21 7:06 PM, Cary Coutant wrote:
> I'm unconvinced that PT_MEMTAG makes sense as a standard gABI feature,
> but even as an extension, I'm worried about the direction you're
> going.
The number of architectures supporting some form of memory tagging is
still small, for sure. There are two concrete examples (ARM's MTE and
SPARC's ADI) and a few research examples, such as CUCL's CHERI
architecture and ARM's Morello, where tags are used as part of
capabilities.
I have discussed this with developers from ARM, Oracle (to some extent)
and CHERI. There seems to be agreement on using a new segment type for
dumping the tags.
However, there is less consensus on whether we should use a new segment
type for each individual variant of memory tagging system or if we
should go with a single common segment type like PT_MEMTAG.
Personally, I like the use of a common PT_MEMTAG segment, but I'm open
to changing my mind about it.
>
> Using a single type code to identify a segment that has similar
> purpose but not a common interpretation -- i.e., if the best a dumper
> can do is hex dump the contents -- doesn't make sense to me. Further,
I suppose that is a bit subjective. Having segments with a similar
purpose and semantics can be enough to get them grouped together under a
common type. It also simplifies the implementation, as the code that
handles the common type will be the same. Arch-specific code can then
dump a blob of data to the PT_MEMTAG segment, formatted/compressed in an
appropriate way.
But I see where this may not be generic enough to go into the gABI. The
contents and potentially some of the fields would be opaque with
arch-specific meaning.
> there really isn't an established mechanism for using a hierarchical
> type/subtype, and repurposing the p_flags field as a subtype is (I
> think) a bad idea: now, you have to teach the dumpers that, for this
> segment type, the p_flags field isn't actually a flags field. Using
I agree, and that's not ideal. The structure lacks flexibility to be
able to hold metadata properly, and that's why I'd like to get some
level of consensus. If (ab)using these fields is not an option, then I'd
like to discuss other ways to accommodate the needs of architectures
that have to dump memory tags.
> any of the other fields isn't any better, unless you can infer what
> you need to know from the natural uses of those fields -- for example,
> from the relationship between p_filesz and p_memsz.
When I mentioned tools that can list/dump these PT_MEMTAG segments, I
had objdump/readelf in mind, tools that provide a quick way to see the
contents of a particular ELF file.
I don't think there is value in interpreting the tags contained in a
PT_MEMTAG segment, in which case these tools wouldn't need to go
reinterpreting p_flags/p_offset etc. But developers might find it useful
if the tools show a dump and at least a readable name for the segment.
(PT_LOPROC + 0x00000001) certainly isn't a desirable name.
It is also not desirable to have to build a cross tool so it can show
the name of a particular segment properly.
On inferring the data from p_filesz/p_memsz and other fields, that might
work. But dictating those relationships removes some of the flexibility
for architectures to compress the tag dump as they see fit. Then again,
it might be a bit of overengineering.
If we can make it so all the fields have well-defined interpretations,
like inferring the tag stride from p_memsz/p_filesz, do you think it
would be acceptable?
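
To illustrate what "inferring the tag stride from p_memsz/p_filesz"
could look like, here is a minimal C sketch. The helper name is
hypothetical, and it assumes p_memsz holds the size of the tagged memory
range and p_filesz the size of the uncompressed tag data in the file.
Neither of those is settled in this thread.

```c
#include <stdint.h>

/* Hypothetical helper: infer how many bytes of memory each byte of tag
 * data covers. Assumes p_memsz is the size of the tagged memory range
 * and p_filesz is the size of the (uncompressed) tag data in the file.
 * Returns 0 when no whole-number ratio exists, in which case a reader
 * cannot infer a stride (for example, if the tags were compressed). */
static uint64_t memtag_bytes_per_tag_byte(uint64_t p_memsz, uint64_t p_filesz)
{
    if (p_filesz == 0 || p_memsz % p_filesz != 0)
        return 0;
    return p_memsz / p_filesz;
}
```

For example, MTE keeps one 4-bit tag per 16 bytes of memory. If two tags
were packed per byte (an assumption about the dump format), a 4096-byte
page would need 128 bytes of tags and the helper would return 32. The
zero return is the cost mentioned above: once an architecture compresses
its tags, the relationship no longer holds.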
>
> Based on what I understand from this thread so far, I'd suggest
> reserving a value in the LOPROC/HIPROC space that can be shared across
> all processors (it's a big space). In the processor-specific space,
> readers can then check the e_machine value to further disambiguate the
> format (essentially serving as the subtype), and you'll have the
> common PT_ value you want. From what you said earlier, it sounds like
> the 64- vs. 128-bit aspect can be inferred from the ELF file container
> size (ELFCLASS32 vs. ELFCLASS64).
While that might work for most architectures, it wouldn't work for an
architecture that supports multiple types of memory tags. You could have
ELFCLASS64 / PT_MEMTAG (say, PT_LOPROC + 1) / EM_AARCH64 and still have
to disambiguate between, say, two different types of memory tags (MTE
and capability tags, for example). It gets more complex to determine the
final memory tag dump type, since you need to check more fields. I think
it is more error prone in the end.
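
To make that concern concrete, here is a rough C sketch of what a reader
would have to do with a shared type. The PT_MEMTAG value, the
MEMTAG_KIND_* names and the extra "subtype" discriminator are all
assumptions for illustration; only PT_LOPROC, EM_SPARCV9 and EM_AARCH64
are existing ELF constants.

```c
#include <stdint.h>

#define PT_LOPROC   0x70000000u
#define PT_MEMTAG   (PT_LOPROC + 1)   /* hypothetical shared value */
#define EM_SPARCV9  43
#define EM_AARCH64  183

enum memtag_kind {
    MEMTAG_KIND_UNKNOWN,
    MEMTAG_KIND_ADI,
    MEMTAG_KIND_MTE,
    MEMTAG_KIND_CAP
};

/* 'subtype' is a hypothetical extra discriminator (0 = MTE tags,
 * 1 = capability tags); where it would live in the program header is
 * exactly the open question. */
static enum memtag_kind memtag_classify(uint16_t e_machine, uint32_t p_type,
                                        uint32_t subtype)
{
    if (p_type != PT_MEMTAG)
        return MEMTAG_KIND_UNKNOWN;

    switch (e_machine) {
    case EM_SPARCV9:
        return MEMTAG_KIND_ADI;
    case EM_AARCH64:
        /* e_machine alone is not enough here: two tag types coexist. */
        if (subtype == 0)
            return MEMTAG_KIND_MTE;
        if (subtype == 1)
            return MEMTAG_KIND_CAP;
        return MEMTAG_KIND_UNKNOWN;
    default:
        return MEMTAG_KIND_UNKNOWN;
    }
}
```

Every reader ends up checking three values instead of one, and the
AArch64 case still needs a discriminator from somewhere.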
Alternatively, we could drop the common type and start populating the
PT_LOPROC range with multiple small scope constants. As you said, it is
a big space. For example:
PT_LOPROC + 1 for ARM's MTE
PT_LOPROC + 2 for SPARC's ADI
PT_LOPROC + 3 for CHERI's capability tags
...
That means we need to update the code whenever a new constant is added
to the list. But otherwise the types are pretty clear and non-ambiguous.
And you don't need to (ab)use the program header structure's fields.
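
As a sketch of that alternative, a tool could map each constant
directly to a readable name. The PT_MEMTAG_* names are hypothetical and
the values are simply the ones from the example list above; the sketch
ignores any meaning a particular processor may already assign to those
values.

```c
#include <stddef.h>
#include <stdint.h>

#define PT_LOPROC        0x70000000u
/* Hypothetical names; values taken from the example list above. */
#define PT_MEMTAG_MTE    (PT_LOPROC + 1)
#define PT_MEMTAG_ADI    (PT_LOPROC + 2)
#define PT_MEMTAG_CHERI  (PT_LOPROC + 3)

/* Returns a printable name for a memory tag segment type, or NULL if
 * p_type is not one of the known memory tag constants. No other program
 * header field needs to be consulted. */
static const char *memtag_segment_name(uint32_t p_type)
{
    switch (p_type) {
    case PT_MEMTAG_MTE:   return "MEMTAG_MTE";
    case PT_MEMTAG_ADI:   return "MEMTAG_ADI";
    case PT_MEMTAG_CHERI: return "MEMTAG_CHERI";
    default:              return NULL;
    }
}
```

The switch is the part that has to grow with every new tagging scheme,
which is the maintenance cost noted above.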
>
> Even if you do need more than one PT_ value, I really don't see why
> it's such a big inconvenience.
I wouldn't say it is a big inconvenience, but a single value seems
cleaner to me. Maybe that is also a subjective matter.
>
> However, if you do pick a single value in the processor-specific range
> to be shared across all processor architectures, I think I'd be
> willing to list that in the ELF spec. I think it's a reasonable place
> to document inter-architecture commonalities, even if they aren't
> gABI-blessed.
That's good to know. Before we get anything upstreamed, I think it is
best to get this potentially documented in the gABI. After the constants
are picked and upstreamed, it gets more difficult to modify the design.