Option to record data (e.g. tags) in drcachesim traces


Mingle Chen

Aug 5, 2022, 10:28:18 AM
to DynamoRIO Users
Hi,
I'm working on running drcachesim with traces from CHERI (https://www.cl.cam.ac.uk/research/security/ctsrd/cheri/), which has an additional tag bit for each capability address. I am wondering what would be the best way to record the tag data in trace_entry_t, and to implement an additional level of tag cache under the current LL cache?

This might also be an interesting feature for other architectures, such as Arm MTE.

Derek Bruening

Aug 5, 2022, 1:37:28 PM
to Mingle Chen, DynamoRIO Users
Are these CHERI tags beyond the 64 address bits?  I believe MTE uses the existing top bits and so its bits would fit in stored addresses, having only the issue that offline raw entries assume they can use the top 3 bits for categorization.


Mingle Chen

Aug 5, 2022, 2:10:25 PM
to DynamoRIO Users
Yes, the tags in CHERI are stored separately. For example, on a 64-bit machine the capability would be 128 bits, plus an additional bit for the tag.

Derek Bruening

Aug 9, 2022, 11:26:30 AM
to Mingle Chen, DynamoRIO Users
Maybe providing an initial proposal with details would help to understand the tradeoffs by having something concrete to look at.  Would each load/store/fetch record the 128-bit capability plus the 1-bit tag, or just the 64-bit address plus the 1-bit tag?  For the trace_entry_t format, would you take a bit from the type field, or would you add a new field (increasing the size on disk) and have the reader interface hide the difference?  For the memref_t format presented to tools, taking up more space has fewer downsides, so another field's only cost is compatibility.  For offline raw entries, any extra space is quite costly as tracing is i/o bound.  For that, are there unused bits in existing implementations for the type and tag, or are another 4 bits needed for a 3-bit type and 1-bit tag?  Those extra 4 bits may well translate to 50% extra overhead.
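
To put rough numbers on the on-disk side of that tradeoff, here is a minimal C++ sketch. The structs are simplified, hypothetical stand-ins for a packed trace entry: a 2-byte type, a 2-byte size, and an 8-byte address are assumed, and that layout should be checked against the real trace_entry_t definition. The names tag_info and growth_percent are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-ins for a packed on-disk entry. The field layout is an
// assumption for illustration, not copied from DynamoRIO's headers.
#pragma pack(push, 1)
struct entry_baseline_t {
    uint16_t type; // record type
    uint16_t size; // access size in bytes
    uint64_t addr; // 64-bit address
};

// Option 1: add a new one-byte field to carry the tag data.
struct entry_with_field_t {
    uint16_t type;
    uint16_t size;
    uint64_t addr;
    uint8_t tag_info; // hypothetical: bit 0 = is-capability, bit 1 = tag value
};
#pragma pack(pop)

// Option 2: take bits from an existing field, so the layout is unchanged.
using entry_stolen_bits_t = entry_baseline_t;

static_assert(sizeof(entry_stolen_bits_t) == sizeof(entry_baseline_t),
              "stealing bits must not change the entry size");

// Relative growth of one entry, in percent.
inline double
growth_percent(std::size_t before, std::size_t after)
{
    assert(before > 0);
    return 100.0 * (static_cast<double>(after) - static_cast<double>(before)) /
        static_cast<double>(before);
}
```

Under these assumptions the extra one-byte field grows each entry from 12 to 13 bytes, roughly 8%, whereas taking bits from an existing field leaves the size unchanged.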

Mingle Chen

Aug 9, 2022, 3:29:38 PM
to DynamoRIO Users
The capabilities are special pointers that record the bounds each one can access, and only these are associated with tags. We'd only need one bit in the entry (to record whether the fetch is a capability and thus needs to fetch the tag bit) to implement a simple tag cache. The actual value of the tag would be needed for something more complex, as we also have a multi-level tag cache, which essentially does some compression. Our initial idea is to hack these 2 bits into the existing entries; maybe the higher bits in the type or addr field could be used?
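
As a rough illustration of the two-bit idea, the following C++ sketch packs an is-capability flag and a tag value into the top two bits of a 16-bit type field. The flag positions and the names pack_type and unpack_type are hypothetical, and the sketch assumes every real type value fits in the low 14 bits, which would need to be verified against the actual type enum.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical flag positions in a 16-bit type field (assumption: real type
// values never use bits 14 and 15).
constexpr uint16_t TYPE_FLAG_IS_CAP = 1u << 15;  // access is a capability
constexpr uint16_t TYPE_FLAG_TAG_SET = 1u << 14; // value of the tag bit
constexpr uint16_t TYPE_VALUE_MASK =
    static_cast<uint16_t>(~(TYPE_FLAG_IS_CAP | TYPE_FLAG_TAG_SET));

struct unpacked_type_t {
    uint16_t base_type;
    bool is_capability;
    bool tag_value;
};

// Writer side: fold the two flags into the type field.
inline uint16_t
pack_type(uint16_t base_type, bool is_capability, bool tag_value)
{
    // The base type must not already occupy the flag bits.
    assert((base_type & ~TYPE_VALUE_MASK) == 0);
    uint16_t packed = base_type;
    if (is_capability) {
        packed |= TYPE_FLAG_IS_CAP;
        // A tag value is only recorded for capability accesses.
        if (tag_value)
            packed |= TYPE_FLAG_TAG_SET;
    }
    return packed;
}

// Reader side: strip the flags so the original type value is recovered.
inline unpacked_type_t
unpack_type(uint16_t packed)
{
    return { static_cast<uint16_t>(packed & TYPE_VALUE_MASK),
             (packed & TYPE_FLAG_IS_CAP) != 0,
             (packed & TYPE_FLAG_TAG_SET) != 0 };
}
```

A simple tag cache would only consult is_capability to decide whether a tag fetch happens; the tag_value bit matters only for the more complex multi-level model.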

Derek Bruening

Aug 9, 2022, 5:22:28 PM
to Mingle Chen, DynamoRIO Users
As mentioned, there are 3 different structures, so the location of the new data would need to be decided separately for each: the offline raw entries (offline_entry_t, the most performance-sensitive), the on-disk offline format (trace_entry_t), and the tool interface format (memref_t, which is not persisted and so its size matters much less).

Mingle Chen

Aug 10, 2022, 11:28:12 AM
to DynamoRIO Users
I currently have a tracer in QEMU which records the traces and outputs them directly in the trace_entry_t format, which is then fed to drcachesim, so offline_entry_t can be skipped. For trace_entry_t, as I mentioned before, perhaps we can use a high bit in the size field for the tag? For memref_t, I think a separate type such as _memref_tag_t would work better?
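
A minimal C++ sketch of how that could look on the reader side, under stated assumptions: disk_entry_t is a simplified stand-in rather than the real trace_entry_t, tag_memref_t is a hypothetical tool-facing record in the spirit of a separate tag type, and the top bit of the 16-bit size field is assumed to be free (recorded sizes stay below 32768).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical: the top bit of the 16-bit size field carries the tag value.
constexpr uint16_t SIZE_TAG_BIT = 1u << 15;

// Simplified stand-in for an on-disk entry, not the real trace_entry_t.
struct disk_entry_t {
    uint16_t type;
    uint16_t size; // low 15 bits: access size; top bit: tag
    uint64_t addr;
};

// Hypothetical tool-facing record with the tag split into its own field.
struct tag_memref_t {
    uint16_t type;
    uint16_t size; // true access size, tag bit removed
    uint64_t addr;
    bool tag;
};

// Tracer side: store the tag in the high bit of the size field.
inline disk_entry_t
make_entry(uint16_t type, uint16_t size, uint64_t addr, bool tag)
{
    assert((size & SIZE_TAG_BIT) == 0); // the size must leave the top bit free
    uint16_t stored = tag ? static_cast<uint16_t>(size | SIZE_TAG_BIT) : size;
    return { type, stored, addr };
}

// Reader side: hide the encoding so tools see a clean size and a tag field.
inline tag_memref_t
to_tag_memref(const disk_entry_t &entry)
{
    tag_memref_t ref;
    ref.type = entry.type;
    ref.size = static_cast<uint16_t>(entry.size & ~SIZE_TAG_BIT);
    ref.addr = entry.addr;
    ref.tag = (entry.size & SIZE_TAG_BIT) != 0;
    return ref;
}
```

Keeping the decoding in one reader function means tools never see the borrowed bit, so the on-disk encoding could later change without touching the tool-facing record.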