Guy Lemieux wrote:
>
> (4) We propose adding the following new instructions for cache
> management:
> INVAL rd,rs1,rs2
> WBACK rd,rs1,rs2
> FLUSH rd,rs1,rs2
>
> where rs1,rs2 defines a Memory Range (cache line starting with
> rs1, ending with rs2, inclusive)
>
> INVAL invalidates the data cache (dirty data discarded)
>
>
> I do not see where INVAL could ever be useful in RISC-V.
>
>
> Scenario: once a device buffer has been allocated in memory, you should
> remove its contents from the data cache. However, there is no need to
> write back any existing content that may be in the data cache, since
> the IO device will clobber it anyway, so a FLUSH or FENCE would be
> excessive overhead.
Fair enough. I had only been thinking about MP synchronization and had
forgotten about I/O.
>
> WBACK writes dirty lines in data cache, marking them clean and
> valid
> FLUSH writes dirty lines in data cache, marking them invalid
>
>
> Why distinguish these cases? Both of these write dirty cachelines
> to memory and leave those cachelines available for reallocation.
>
>
> WBACK does not erase the data from the cache footprint, so reads can
> still be accelerated.
>
> Consider the case where you must ensure all data has been written to a
> DMA buffer before a device copies it from RAM to disk. Performing a
> complete FLUSH is excessive overhead, so the WBACK would be a faster
> version.
Would a ranged I/O FENCE also be appropriate here?
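
To check my understanding of the three proposed operations, here is a
rough Python model of a write-back data cache. It is only a sketch: the
64-byte line size, the Cache class, and the "one value per line"
simplification are my assumptions for illustration, not part of the
proposal. The range is inclusive on both ends, as described above.

```python
from dataclasses import dataclass

LINE_SIZE = 64  # assumed cache line size in bytes (not from the proposal)


@dataclass
class Line:
    data: int
    dirty: bool


class Cache:
    """Toy write-back cache: one integer of data per cache line."""

    def __init__(self):
        self.lines = {}   # line base address -> Line (present means valid)
        self.memory = {}  # line base address -> data held in RAM

    def store(self, addr, value):
        # A CPU store allocates the line and marks it dirty.
        self.lines[addr - addr % LINE_SIZE] = Line(value, dirty=True)

    def _bases(self, start, end):
        # Every line touched by the inclusive byte range [start, end].
        first = start - start % LINE_SIZE
        last = end - end % LINE_SIZE
        return range(first, last + 1, LINE_SIZE)

    def inval(self, start, end):
        # INVAL: drop the lines; dirty data is discarded, RAM untouched.
        for base in self._bases(start, end):
            self.lines.pop(base, None)

    def wback(self, start, end):
        # WBACK: write dirty lines to RAM; they stay valid and become clean.
        for base in self._bases(start, end):
            line = self.lines.get(base)
            if line is not None and line.dirty:
                self.memory[base] = line.data
                line.dirty = False

    def flush(self, start, end):
        # FLUSH: write dirty lines to RAM, then invalidate them.
        self.wback(start, end)
        self.inval(start, end)
```

In this model WBACK before a DMA read from memory leaves the buffer
cached for later CPU reads, while INVAL before a DMA write to memory
avoids the write-back traffic that FLUSH would generate.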
> We also think the following instructions will be useful:
>
> FENCE rs1,rs2
> FENCE.I rs1,rs2
>
> These FENCE variants work as before, but only across a defined
> memory region. That is, they do not invalidate the entire
> cache, only cache lines that hold data within the defined
> memory region. Thus, you can use FENCE.I rs1,rs2 to
> invalidate a region of memory that was written by
> self-modifying code, without destroying the whole i-cache
> footprint. Likewise, you can FENCE rs1,rs2 to prepare a DMA
> buffer region prior to issuing a command to an IO device to do
> a DMA READ from memory.
>
>
> FENCE instructions are I-type and do not have an rs2 field.
> Changing them to S-type would break backwards compatibility.
>
>
> We are not changing the existing FENCE instructions which have rd and
> rs1 fields.
>
> We would be adding new FENCE instructions which would not be I-type.
> Perhaps I should change the name to FENCEMR and FENCEMR.I to emphasize
> their different encoding, but we have not yet determined suitable names.
We just got rid of a similar "same mnemonic produces different
instructions" case, where the assembler would produce ADDI if ADD was
given an immediate instead of a register. I would prefer not to
introduce more of those. On a side note, I just sent a proposal to the
list that calls those instructions FENCE.RD and FENCE.RI.
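
For anyone following along, the reason the existing FENCE cannot simply
grow an rs2 operand can be seen by decoding it. The sketch below uses
the standard I-type field positions and the usual encoding of
"fence iorw,iorw" (0x0ff0000f); the helper names are mine and purely
illustrative. The five bits where an R-type or S-type instruction keeps
rs2 (bits 24:20) are already occupied by the successor set and the low
bit of the predecessor set.

```python
def decode_i_type(insn):
    """Split a 32-bit instruction word into I-type fields."""
    return {
        "opcode": insn & 0x7F,
        "rd": (insn >> 7) & 0x1F,
        "funct3": (insn >> 12) & 0x7,
        "rs1": (insn >> 15) & 0x1F,
        "imm": (insn >> 20) & 0xFFF,
    }


def rs2_slot(insn):
    """Bits 24:20, where R-type and S-type instructions hold rs2."""
    return (insn >> 20) & 0x1F


FENCE_IORW_IORW = 0x0FF0000F  # "fence iorw,iorw"

fields = decode_i_type(FENCE_IORW_IORW)
pred = (fields["imm"] >> 4) & 0xF  # predecessor set, bits 27:24
succ = fields["imm"] & 0xF         # successor set, bits 23:20

print(f"opcode={fields['opcode']:#09b} funct3={fields['funct3']}")
print(f"pred={pred:#06b} succ={succ:#06b}")
print(f"bits an rs2 field would need: {rs2_slot(FENCE_IORW_IORW):#07b}")
```

So a ranged variant needs its own encoding, which is one more reason to
give it its own mnemonic.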
>
> Can we allow FENCE and FENCE.I to be interrupted? If they are
> actually flushing caches, then this could create a very
> long/nondeterministic interrupt service latency. It's unclear
> we can interrupt without starting over again after the
> interrupt is serviced.
>
>
> Interrupts are loosely-specified as I understand and
> implementations choose whether to take the interrupt before
> executing the next instruction or to discard a partially-completed
> instruction. Either way, *epc is loaded with the address of the
> instruction where execution should continue after an interrupt.
> Starting over again should be fine, as the cache will have fewer
> dirty lines after the ISR returns.
>
>
> You must be guaranteed to make forward progress or experience
> livelock. The ISR can introduce new dirty lines, which need to be
> flushed during FENCE/FENCE.I. The FENCE semantics talk about
> predecessor sets, but it isn't clear that instructions in an interrupt
> service routine would be deemed a predecessor if the FENCE is interrupted.
This depends on interrupt load, but you are correct. In my experience,
however, if a cache flush cannot complete between interrupts, the
interrupt load is probably excessive and the system design needs to be
rethought. I expect a well-designed system to take interrupts
infrequently enough that "interrupt during cache flush" is rare and
simply repeating the cache flush is the right answer.
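
A back-of-the-envelope model of the restart behaviour, in Python. All
of the parameters are hypothetical: the flush writes back one dirty line
per step, an interrupt arrives every "period" steps, and each ISR
dirties "isr_dirties" lines. It assumes a restarted flush does not
redo lines that are already clean, which matches the point above that
the cache has fewer dirty lines after the ISR returns.

```python
def flush_with_interrupts(dirty_lines, period, isr_dirties,
                          max_steps=100_000):
    """Return the steps needed to clean the cache, or None if the
    step bound is reached first (treated here as livelock)."""
    steps = 0
    while dirty_lines > 0:
        if steps >= max_steps:
            return None
        dirty_lines -= 1   # write back one dirty line
        steps += 1
        # An interrupt lands mid-flush: the ISR dirties more lines
        # and the flush restarts with whatever is dirty now.
        if dirty_lines > 0 and steps % period == 0:
            dirty_lines += isr_dirties
    return steps


# Rare interrupts: the flush finishes before the first one arrives.
print(flush_with_interrupts(100, period=1000, isr_dirties=8))
# Frequent interrupts, but each ISR dirties fewer lines than a period cleans.
print(flush_with_interrupts(100, period=10, isr_dirties=8))
# Each ISR dirties as many lines as a period cleans: no forward progress.
print(flush_with_interrupts(100, period=10, isr_dirties=10))
```

In this model the flush completes whenever it can clean more lines
between interrupts than each ISR dirties, and only the last case, where
the interrupt load matches the flush rate, fails to make progress.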
-- Jacob