Implementing precise exceptions for vector instructions

za...@bu.edu

Jul 4, 2021, 11:56:21 PM
to RISC-V HW Dev
Hi all,

Based on the RISC-V vector spec, precise exceptions are required for VM support and Linux.
What is an efficient mechanism for supporting precise exceptions for vector instructions (e.g. page faults caused by vector loads/stores) in an in-order core?

Is the only solution to block the scalar core and wait until the vector instruction is known to be non-speculative, i.e. until the vector unit signals no exceptions (a semi-decoupled vector unit)?
If not, how should the subsequent instructions in the scalar pipeline be handled in the case of a faulting vector instruction and OOO flushes?

Are there any in-order RISC-V cores with precise exception support for vectors?

Thank you,
Zara

Andrew Waterman

unread,
Jul 5, 2021, 11:39:06 PM7/5/21
to za...@bu.edu, RISC-V HW Dev
On Sun, Jul 4, 2021 at 8:56 PM za...@bu.edu <za...@bu.edu> wrote:
> Based on the RISC-V vector spec, precise exceptions are required for VM support and Linux.
> What is an efficient mechanism for supporting precise exceptions for vector instructions (e.g. page faults caused by vector loads/stores) in an in-order core?
>
> Is the only solution to block the scalar core and wait until the vector instruction is known to be non-speculative, i.e. until the vector unit signals no exceptions (a semi-decoupled vector unit)?
> If not, how should the subsequent instructions in the scalar pipeline be handled in the case of a faulting vector instruction and OOO flushes?

Most vector instructions can be proven exception-free relatively easily, making this a reasonable design point.



> Are there any in-order RISC-V cores with precise exception support for vectors?

Yes, including SiFive’s recently announced offerings in this space.


--


You received this message because you are subscribed to the Google Groups "RISC-V HW Dev" group.


To unsubscribe from this group and stop receiving emails from it, send an email to hw-dev+un...@groups.riscv.org.


To view this discussion on the web visit https://groups.google.com/a/groups.riscv.org/d/msgid/hw-dev/dc4f5da1-df14-41ad-ad66-aa9fc2e5f5b4n%40groups.riscv.org.


Daniel Petrisko

Jul 6, 2021, 2:40:47 AM
to Andrew Waterman, za...@bu.edu, RISC-V HW Dev
Hi Andrew,

The hard case here seems to be gather/scatter-type instructions, which could trigger a fault on any page they access. Is there a trick for proving these instructions will not fault, other than manually iterating over the scalar TLB? (Of course, one could also imagine other solutions, such as specialized vector TLBs that can do strided checks.)

- Dan

Andrew Waterman

Jul 6, 2021, 5:53:53 PM
to Daniel Petrisko, za...@bu.edu, RISC-V HW Dev
On Mon, Jul 5, 2021 at 11:40 PM Daniel Petrisko <petr...@cs.washington.edu> wrote:
> The hard case for this seems to be gather/scatter type instructions, which could trigger faults on any page accessed. Is there a trick for proving these instructions will not fault, other than manually iterating over the scalar TLB? (Of course, one could also imagine other solutions like specialized vector TLBs which can do strided checks).

In the general case, there is no way around checking each effective address while holding up retirement of younger instructions.  Of course, for many workloads, this level of indexed memory access performance is fine.
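
As a rough illustration of what "checking each effective address" amounts to, here is a minimal software model in Python. It is a sketch under stated assumptions, not a description of any particular core: the 4 KiB page size, the function name `first_faulting_element`, and the `mapped_pages` set (a stand-in for a TLB or page-table lookup) are all hypothetical.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages


def first_faulting_element(base, offsets, eew_bytes, mapped_pages):
    """Return the index of the first element whose access touches an
    unmapped page, or None if every element can be proven safe.

    `mapped_pages` is a hypothetical stand-in for a TLB/page-table
    lookup: a set of virtual page numbers known to be accessible.
    """
    for i, off in enumerate(offsets):
        addr = base + off
        # A misaligned element can straddle a page boundary, so check
        # the page of its first byte and the page of its last byte.
        first_page = addr // PAGE_SIZE
        last_page = (addr + eew_bytes - 1) // PAGE_SIZE
        for page in range(first_page, last_page + 1):
            if page not in mapped_pages:
                return i
    return None


# Example: pages 0x10 and 0x11 are mapped; the last offset lands in page 0x13.
fault = first_faulting_element(0x10000, [0x0, 0x800, 0x1FFC, 0x3000], 4, {0x10, 0x11})
```

The cost is one lookup per element, and younger instructions cannot retire until the loop finishes, which is why indexed accesses are the slow case. The returned index corresponds to the element at which a precise trap would be reported.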

Daniel Petrisko

Jul 7, 2021, 2:37:05 AM
to Andrew Waterman, za...@bu.edu, RISC-V HW Dev
Thanks Andrew, 

Eager to see details and compare notes with SiFive architectures. Seems like an exciting year for vectors!

- Dan

za...@bu.edu

Jul 13, 2021, 11:43:59 AM
to RISC-V HW Dev, andrew, za...@bu.edu, RISC-V HW Dev, petr...@cs.washington.edu
On Tuesday, July 6, 2021 at 5:53:53 PM UTC-4 andrew wrote:
> In the general case, there is no way around checking each effective address while holding up retirement of younger instructions. Of course, for many workloads, this level of indexed memory access performance is fine.

Thank you Andrew!

Agreed, this level of performance can be acceptable for scatter-gather memory accesses, because they are inherently heavyweight, time-consuming operations.
But what about the non-scatter-gather operations?

Can a unit-stride vector load/store span multiple pages, in which case we should stall the scalar core until the vector unit signals no exceptions (the same approach as for scatter-gather)?
Or should such accesses be page-aligned, or restricted in the number of pages they can access?

Best,
Zara

Andrew Waterman

Jul 13, 2021, 12:25:57 PM
to za...@bu.edu, RISC-V HW Dev, petr...@cs.washington.edu
Both kinds of accesses might span page boundaries, but there’s usually enough information available early in the pipeline to determine whether a given access is easy to prove exception-free or needs to be handled more conservatively like indexed accesses.
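
One way to picture that early decision, as a hedged Python sketch rather than a description of any real pipeline: for unit-stride and constant-stride accesses, the base address, vector length, element width, and stride bound the touched byte range before any element is issued. The names `touched_range` and `classify`, the `mapped_pages` set, the 4 KiB page size, and the `max_pages` threshold are all assumptions made for illustration.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages


def touched_range(base, vl, eew_bytes, stride_bytes=None):
    """Inclusive byte range [lo, hi] covered by a unit-stride or
    constant-stride access, or None when vl == 0 (nothing is accessed)."""
    if vl == 0:
        return None
    if stride_bytes is None:
        stride_bytes = eew_bytes  # unit-stride: elements are contiguous
    first = base
    last = base + (vl - 1) * stride_bytes  # stride may be negative
    lo = min(first, last)
    hi = max(first, last) + eew_bytes - 1
    return lo, hi


def classify(base, vl, eew_bytes, mapped_pages, stride_bytes=None, max_pages=2):
    """Return "safe" if the whole range lies in a few known-good pages,
    otherwise "conservative" (handle like an indexed access).

    `mapped_pages` is a hypothetical stand-in for pages the core can
    already prove accessible, e.g. current TLB hits.
    """
    rng = touched_range(base, vl, eew_bytes, stride_bytes)
    if rng is None:
        return "safe"
    lo, hi = rng
    pages = range(lo // PAGE_SIZE, hi // PAGE_SIZE + 1)
    if len(pages) <= max_pages and all(p in mapped_pages for p in pages):
        return "safe"
    return "conservative"


# A 128-byte unit-stride access that crosses from page 0x10 into page 0x11.
crossing = classify(0x10FC0, 16, 8, {0x10, 0x11})
```

Checking the whole range is deliberately pessimistic for large strides, since not every page in the range is actually touched; such accesses simply fall through to the conservative path.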


