Function of 0xA3 L2 special access I/O with L2 larger than 256kiB


Ioannis Galanommatis

Aug 21, 2025, 12:55:36 PM
to OpenPiton Discussion
Hello,

We have a VCU118-based SoC design with four Ariane cores, and we exchange data with an accelerator. To maintain cache coherence, we used the 0xA3 special access I/O address, as described in par. 5.4.3 of the microarchitecture manual.

All was well with a 256 KiB L2, but after we changed to 512 KiB there are indications that coherency is not always working. I see in the manual that the index selection field is 8 bits long, which was enough for 256 KiB: 4 nodes x 4 ways x 2^8 indices x 64 bytes/line = 256 KiB. With 512 KiB we need a 9-bit index, and the hw team says that the index selection path has indeed been widened accordingly. However, even after I changed the code to assume the index selection field is now [14:6], pushing the way selection field to [16:15], we still have the issue.
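The sizing arithmetic above can be checked with a short sketch. It assumes the parameters quoted in this post (4 nodes, 4 ways, 64-byte lines) and a field layout in which the index sits directly above the 6-bit line offset and the 2-bit way field sits directly above the index. That layout is the one assumed in this post for the 512 KiB case, not something confirmed by the manual, and all function names are hypothetical.

```python
def index_bits(total_bytes, nodes=4, ways=4, line_bytes=64):
    """Index bits per L2 slice for a given total L2 capacity."""
    per_set_group = nodes * ways * line_bytes
    sets = total_bytes // per_set_group
    # Require an exact, power-of-two number of sets per slice.
    if sets <= 0 or sets & (sets - 1) or sets * per_set_group != total_bytes:
        raise ValueError("capacity does not give a power-of-two set count")
    return sets.bit_length() - 1


def field_ranges(idx_bits, way_bits=2, offset_bits=6):
    """Return ((idx_hi, idx_lo), (way_hi, way_lo)), inclusive bit positions.

    Assumed layout: [way | index | line offset], packed with no gaps.
    """
    idx_lo = offset_bits
    idx_hi = idx_lo + idx_bits - 1
    way_lo = idx_hi + 1
    way_hi = way_lo + way_bits - 1
    return (idx_hi, idx_lo), (way_hi, way_lo)


def pack_index_way(index, way, idx_bits, way_bits=2, offset_bits=6):
    """Place index and way into the low address bits under the assumed layout."""
    if not 0 <= index < (1 << idx_bits):
        raise ValueError("index out of range")
    if not 0 <= way < (1 << way_bits):
        raise ValueError("way out of range")
    return (way << (offset_bits + idx_bits)) | (index << offset_bits)


if __name__ == "__main__":
    for size_kib in (256, 512):
        bits = index_bits(size_kib * 1024)
        print(size_kib, "KiB ->", bits, "index bits, fields", field_ranges(bits))
```

Under these assumptions, 256 KiB gives an 8-bit index and 512 KiB gives a 9-bit index at [14:6] with the way field at [16:15], which matches the change described above. Only the low address bits are modelled; the upper bits that select the 0xA3 access type are left out.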

My question is: can 0xA3 support systems with more than 256 KiB of L2? What happens to the index selection field then? Or is there a hardwired upper limit?

If it can, is there anything more we should have done to make it work with the larger cache? The hw team told me that increasing the cache size is straightforward, but could they be missing something regarding 0xA3?

Thank you,
-- Ioannis

Jonathan Balkind

Aug 25, 2025, 4:11:59 PM
to OpenPiton Discussion
Hi Ioannis,

The special address setup isn't really going to be flexible to different designs because it hardcodes the index and way like that. The scheme just gives an entry point of somewhere to start and the plumbing to make it work, so you can make modifications as desired. Ultimately it's a question of how you want to change the scheme for your design and whether there are enough bits to do it. If you look at the displacement flush, it uses a different encoding scheme. It's really up to you how you want to reuse the bits.

It's also the case that displacement flush can be easier to deal with because it's just address based. We tend to use it more. We're also working on supporting RISC-V CMOs, though that's a work in progress.

If ultimately you truly are short on bits (though I'm not sure you will be), another option could be reusing another part of the special address space. We used e0, e1, and e2 but not e3-ef for the on-chip tile accesses for e.g. DECADES/Stripes, CIFER, MAPLE, Cohort, etc. You might be able to allocate e8-ef, for example, and have a few more bits available for your purposes.
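As a rough illustration of the e8-ef idea: those eight values share their upper five bits (0b11101), so the low three bits of that byte could carry extra information such as additional index bits. The sketch below works on the 8-bit value only. Where that byte sits in the full address, and the helper names, are assumptions for illustration and not OpenPiton definitions.

```python
RANGE_BASE = 0xE8  # first value of the hypothetical e8-ef allocation
RANGE_MASK = 0xF8  # upper five bits that identify the range
EXTRA_MASK = 0x07  # low three bits left free for the design's own use


def encode_selector(extra):
    """Fold a 3-bit payload into a selector byte in the e8-ef range."""
    if not 0 <= extra <= EXTRA_MASK:
        raise ValueError("extra must fit in 3 bits")
    return RANGE_BASE | extra


def decode_selector(byte):
    """Return the 3-bit payload, or None if the byte is outside e8-ef."""
    if byte & RANGE_MASK != RANGE_BASE:
        return None
    return byte & EXTRA_MASK


if __name__ == "__main__":
    for payload in range(8):
        print(hex(encode_selector(payload)))
```

Values outside the range, such as 0xA3 or the already-used e0-e2, decode to None here, so existing accesses would be unaffected by this particular split.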

Thanks,
Jon
