Thank you for your explanation of CCE memory commands.
It looks like we will proceed with a hardware stream prefetcher near the L2 cache.
You mentioned that examining the stream of e_bedrock_mem_rd messages from the CCE is a good way to keep track of L1 misses.
You also mentioned that the bp_cce_fsm module is a good design example to learn from when implementing a stream prefetcher module (as a hardware FSM?).
So whenever CCE sends an access to L2, this is a signal that we have an L1 cache miss.
How do we capture this signal as an input?
Will we be writing a module like the cce_fsm module (i.e. a new .v file)?
How will our module fit in with the rest of the architecture? For example, what signal do we send to initiate a prefetch?
What steps do we need to take to test that our module works? I assume that we will have to write a test bench where a stream buffer is useful, and another where it is not.
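To make the two test cases concrete, here is a rough behavioral sketch in Python of what we have in mind. It is only a software model for discussion, not BlackParrot code: the class name, the 64-byte block size, the buffer depth of 4, and the "two consecutive block misses start a stream" rule are all our own assumptions.

```python
from collections import deque

BLOCK_BYTES = 64  # assumed cache block size, not taken from the BlackParrot config


class StreamPrefetcherModel:
    """Hypothetical software model of a single-stream prefetch buffer."""

    def __init__(self, depth=4):
        self.depth = depth      # assumed number of blocks kept ahead of the stream
        self.buffer = deque()   # block numbers that have been prefetched
        self.last_miss = None   # block number of the previous L1 miss
        self.hits = 0           # L1 misses served by the stream buffer
        self.prefetches = 0     # prefetch requests issued toward L2

    def on_l1_miss(self, addr):
        """Call once per observed L1 miss (e.g. one memory read request)."""
        block = addr // BLOCK_BYTES
        hit = block in self.buffer
        if hit:
            self.hits += 1
            # Discard entries up to and including the block that hit.
            while self.buffer and self.buffer.popleft() != block:
                pass
        sequential = self.last_miss is not None and block == self.last_miss + 1
        if hit or sequential:
            # Stream detected or continuing: top the buffer up to `depth` blocks.
            next_block = (self.buffer[-1] if self.buffer else block) + 1
            while len(self.buffer) < self.depth:
                self.buffer.append(next_block)
                self.prefetches += 1
                next_block += 1
        else:
            # No stream: drop stale prefetches.
            self.buffer.clear()
        self.last_miss = block
        return hit


def run(trace):
    """Feed a list of miss addresses to a fresh model and return its hit count."""
    model = StreamPrefetcherModel()
    for addr in trace:
        model.on_l1_miss(addr)
    return model.hits


# Case 1: sequential misses, where a stream buffer should help.
sequential_trace = [i * BLOCK_BYTES for i in range(16)]
# Case 2: misses 7 blocks apart, where this simple detector never triggers.
strided_trace = [i * 7 * BLOCK_BYTES for i in range(16)]

print("sequential hits:", run(sequential_trace))
print("strided hits:", run(strided_trace))
```

In this model the first two sequential misses only train the detector, so the sequential trace gets 14 hits out of 16 misses and the strided trace gets none. We imagine the hardware test benches would drive similar address patterns and compare the number of L2 accesses with and without the prefetcher.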
The challenge level for this task is relatively high, not because of the logic but because we are unfamiliar with the codebase and architecture. So we need a little bit of handholding.
If there are helpful documents or resources you could point us to, we'd appreciate it.