Ya, I also ran that same experiment, and that does solve the assertion errors. Granted, we're still stuck with poor performance, because the NVDLA is trying to issue many back-to-back requests with the same AXI ID, and we can only service them one at a time.
I dug into the AXI4ToTL block a bit more, and I feel that I have a good understanding of the problem. The root cause is how multiple outstanding TileLink requests interact with AXI's ordering assumptions for outstanding requests. To start with, the "Operation Ordering" section of the TileLink spec says:
> Within a TileLink network, there may be multiple outstanding operations inflight at any given time. These operations may be completed in any order.
Section 5.1 of the AXI4 spec says:
> All transactions with a given AXI ID value must remain ordered, but there is no restriction on the ordering of transactions with different ID values.
So, when the "capMaxFlight" parameter in the AXI4UserYanker is set to more than 1, AXI4ToTL will send multiple TileLink transactions for requests coming from a single AXI ID. However, AXI4ToTL has no logic to deal with out-of-order TileLink responses. For sending read requests,
this is the logic AXI4ToTL uses to generate multiple TileLink requests for AXI requests on a single AXI ID:
```scala
val r_count = RegInit(Vec.fill(numIds) { UInt(0, width = txnCountBits) })
val r_id = if (maxFlight == 1) {
  Cat(in.ar.bits.id, UInt(0, width=1))
} else {
  Cat(in.ar.bits.id, r_count(in.ar.bits.id)(logFlight-1,0), UInt(0, width=1))
}
```
So, r_count is a vector of counters, one counter for each supported AXI ID.
Here's the code for how those counters are incremented:
```scala
val r_sel = UIntToOH(in.ar.bits.id, numIds)
(r_sel.asBools zip r_count) foreach { case (s, r) =>
  when (in.ar.fire() && s) { r := r + UInt(1) }
}
```
As expected, whenever a request is sent, we increment the appropriate counter based on the AXI ID the request came from.
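To make the ID arithmetic concrete, here's the r_id computation modeled as a plain-Scala integer function (my own hypothetical model, not code from AXI4ToTL):

```scala
// r_id for reads, modeled as an integer:
// Cat(axiId, count's low logFlight bits, 0) == {axiId, count mod 2^logFlight, 0}
def rId(axiId: Int, count: Int, logFlight: Int): Int =
  ((axiId << logFlight) | (count & ((1 << logFlight) - 1))) << 1

// With one AXI ID and maxFlight = 2 (so logFlight = 1), the counter wraps:
// rId(0, 0, 1) == 0, rId(0, 1, 1) == 2, rId(0, 2, 1) == 0 again.
```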
To see the problem with this scheme, think of this setup: we support only 1 AXI ID (so numIds is 1), and we support two transactions in flight for this AXI ID (so maxFlight is 2). We receive two AXI read requests from this single AXI ID, back-to-back, and AXI4ToTL will happily issue two TileLink Get requests back-to-back, each with a unique r_id, since r_count is incremented after the first request. After sending the second request, r_count is incremented a second time, and since only the low logFlight = 1 bit of the counter goes into r_id, r_id wraps back to the value used for the first TileLink Get request. Now, if the two responses come back out-of-order (maybe request 1 is a cache miss and request 2 is a cache hit), and there's another AXI request waiting to be sent, then when we send request 3, we'll re-use the same TileLink source ID as request 1, which is still in flight. This is illegal, since all outstanding TileLink requests must be uniquely identifiable by their source IDs.
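Here's a minimal plain-Scala sketch of that sequence (my own model, not the Chisel; all names are hypothetical) that reproduces the collision:

```scala
object SourceIdCollision extends App {
  val maxFlight = 2                    // two transactions in flight per AXI ID
  var rCount   = 0                     // the per-AXI-ID counter (r_count)
  var inFlight = Set.empty[Int]        // TileLink source IDs still outstanding

  def issue(tag: String): Int = {
    val sourceId = rCount % maxFlight  // the low logFlight bits Cat()'d into r_id
    rCount += 1
    if (inFlight.contains(sourceId))
      println(s"$tag: ILLEGAL reuse of in-flight source ID $sourceId")
    else
      println(s"$tag: issued with source ID $sourceId")
    inFlight += sourceId
    sourceId
  }

  val id1 = issue("request 1")         // source ID 0, stays outstanding (cache miss)
  val id2 = issue("request 2")         // source ID 1 (cache hit)
  inFlight -= id2                      // response 2 returns first, out of order
  issue("request 3")                   // counter wraps to source ID 0 -> collision
}
```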
So that's one problem with this system: we re-use TileLink source IDs when responses come back out-of-order, and that's what triggers the assertions in the TileLink Monitors. There's one more problem, though: let's assume we fixed AXI4ToTL so that it only issued TileLink requests with source IDs that aren't already in flight. An out-of-order response is still invalid from the AXI4 master's perspective, because it expects all of the responses for a single AXI ID to come back in request order. Thus, AXI4ToTL would also need re-ordering logic for all outstanding requests within a single AXI ID, to convert the out-of-order TileLink responses into in-order AXI responses.
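To sketch what I mean by re-ordering logic, here's a rough software model (again plain Scala with made-up names, a sketch rather than a hardware proposal): responses are numbered in issue order and held back until everything ahead of them has completed.

```scala
import scala.collection.mutable

// Hypothetical per-AXI-ID reorder buffer: transactions are numbered in
// issue order, and a response is only released once every earlier
// transaction for this AXI ID has also completed.
class ReorderBuffer(maxFlight: Int) {
  private var issuePtr   = 0  // next transaction number to hand out
  private var releasePtr = 0  // next transaction number the AXI master may see
  private val pending    = mutable.Map.empty[Int, String]

  def issue(): Int = {
    require(issuePtr - releasePtr < maxFlight, "too many requests in flight")
    val t = issuePtr; issuePtr += 1; t
  }

  // Accept a TileLink response in any order; return the AXI responses
  // (possibly none, possibly several) that are now safe to send, in order.
  def complete(txn: Int, data: String): Seq[String] = {
    pending(txn) = data
    val ready = mutable.Buffer.empty[String]
    while (pending.contains(releasePtr)) {
      ready += pending.remove(releasePtr).get
      releasePtr += 1
    }
    ready.toSeq
  }
}
```

So in the two-requests scenario from before:

```scala
val rob = new ReorderBuffer(maxFlight = 2)
val t1 = rob.issue(); val t2 = rob.issue()
rob.complete(t2, "resp 2")  // returns Seq() -- held until resp 1 arrives
rob.complete(t1, "resp 1")  // returns Seq("resp 1", "resp 2")
```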
Let me know if I explained that well enough, or if you want me to take another crack at it, or if you think I have it completely wrong.
-Tynan