C.addi16sp ((N +3)>>2) # i.e. addi sp sp ((N +3)>>2 <<4)
jalr t0 zero Csavew - (N<<1)
where Csavew is this bit of millicode:
Csavew-16 C.swsp t2 (+32)
Csavew-14 C.swsp t1 (+28)
Csavew-12 C.swsp a5 (+24)
Csavew-10 C.swsp a4 (+20)
Csavew -8 C.swsp a3 (+16)
Csavew -6 C.swsp a2 (+12)
Csavew -4 C.swsp a1 (+8)
Csavew -2 C.swsp a0 (+4)
Csavew: C.swsp ra (0)
jr t0
regards,
Liviu
jalr t0 zero Csavew - (N<<1)
addi sp sp (((N +1)>>1) << 3)
jalr t0 zero Csavew - (N<<1)
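To make the arithmetic in the two variants above easier to check, here is a small C sketch that evaluates the expressions exactly as posted. The function names are made up for illustration, and how N counts ra is an assumption the posts do not spell out.

```c
#include <stdint.h>

/* Hypothetical helpers; N is the register count used in the posts. */

/* Byte offset of the millicode entry point relative to the Csavew label:
   each compressed store is 2 bytes, so N extra stores start N*2 bytes earlier. */
int32_t csavew_entry_offset(uint32_t n) { return -(int32_t)(n << 1); }

/* Frame size from the first post: N words rounded up to a multiple of 16 bytes. */
uint32_t frame_bytes_16(uint32_t n) { return ((n + 3u) >> 2) << 4; }

/* Frame size from the follow-up: N words rounded up to a multiple of 8 bytes. */
uint32_t frame_bytes_8(uint32_t n) { return ((n + 1u) >> 1) << 3; }
```

For N = 8 the entry offset is -16, which lines up with the Csavew-16 label on the t2 store in the listing.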
regards,
Liviu
The problem with shadow registers is that you always run out and you still need to spill to main memory.
For an RVE implementation, which reduces the RF in half to save gates, it would be weird to double the memory now, just to implement a shadow register.
Richard
--
You received this message because you are subscribed to the Google Groups "RISC-V ISA Dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to isa-dev+u...@groups.riscv.org.
To post to this group, send email to isa...@groups.riscv.org.
Visit this group at https://groups.google.com/a/groups.riscv.org/group/isa-dev/.
To view this discussion on the web visit https://groups.google.com/a/groups.riscv.org/d/msgid/isa-dev/CAG7hfcJRmvBgfMCg2rtnAh-3xfoahRiF225Yvu9k%3DBbNjy_xDA%40mail.gmail.com.
On 15 March 2018 at 10:55:11, Richard Herveille
(richard....@roalogic.com) wrote:
> The problem with shadow registers is that you always run out and you still need to spill
> to main memory.
you run out when the interrupt nesting gets deeper than the available
register banks; in most cases, the depth is 1, rarely 2, even more
rarely 3, and so on.
spilling can be done in parallel, while starting the handler.
but this method reveals a possible latency problem: if a high priority
interrupt occurs right after a series of other interrupts, and there
are no more register banks, it must wait for a previous spill to
complete, to free a register bank, leading to a jitter on the high
priority interrupt latency.
most applications tolerate a small jitter, but for applications that
implement control loops we might need a way to disable this mechanism
and provide constant latency (even if it is slightly higher).
Cortex-M0 has such a configuration bit to prevent jitter.
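to illustrate when the spill (and hence the jitter) happens, here is a toy C model of the bank bookkeeping. it is only a sketch: the struct, the function names and the policy of spilling one older context when no bank is free are assumptions, not a description of any real core.

```c
#include <stdbool.h>

/* Hypothetical model, not a description of any real core. */
typedef struct {
    unsigned banks;   /* shadow register banks available in hardware */
    unsigned depth;   /* current interrupt nesting depth */
    unsigned spilled; /* older contexts currently spilled to memory */
} bank_model;

/* Enter an interrupt. Returns true when no bank was free, i.e. an
   older context had to be spilled first (the jitter case). */
bool bank_enter(bank_model *m) {
    bool must_spill = (m->depth - m->spilled) >= m->banks;
    if (must_spill)
        m->spilled++;
    m->depth++;
    return must_spill;
}

/* Leave the innermost interrupt and keep the spill count
   no larger than the remaining nesting depth. */
void bank_exit(bank_model *m) {
    if (m->depth == 0)
        return;
    m->depth--;
    if (m->spilled > m->depth)
        m->spilled = m->depth;
}
```

with two banks, the first two nested entries return false and the third returns true, which is the case where a high priority interrupt would have to wait.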
> For an RVE implementation, which reduces the RF in half to save gates, it would be weird
> to double the memory now, just to implement a shadow register.
yes, this mechanism is not cheap. however, as Jacob suggested, only
the ABI caller-saved registers need to be shadowed/spilled, so, with a
lighter EABI, the extra cost may be kept to a minimum.
regards,
Liviu
> On Mar 15, 2018, at 16:48 , kr...@berkeley.edu wrote:
> I'll shortly be sending out an invite to a new Foundation Task Group
> we have formed to address adding fast interrupts to RISC-V.
>
> Germane to this thread, one feature of the proposal under development
> is to standardize interrupt attribute annotations so C compilers can
> generate interrupt handlers that only save registers as needed. This
> effectively changes the calling conventions just for the handlers but
> leaves the rest of the ABI unchanged.
>
> /* Not real code, just a sketch. */
> extern volatile int *DEVICE;
> extern volatile int *COUNT;
>
> void __attribute__ ((interrupt))
> foo() {
> *DEVICE = 0;
> (*COUNT)++;
> }
>
> A rough sketch of what a generated handler looks like is:
>
> # Small ISR that pokes device to clear interrupt, and increments in-memory counter.
>
> .align 3 # Has to be 8-byte aligned.
> foo:
> addi sp, sp, -16 # Create a frame on stack.
If the ABI had included a stack "red zone" with a small reservation for interrupts,
then the two "addi sp, " instructions could have been avoided in most cases.
> sw s0, 0(sp) # Save working register.
Presumably you meant to load s0 with a global pointer?
> sw x0, DEVICE, s0 # Clear interrupt flag.
> sw s1, 4(sp) # Save working register.
> la s0, COUNT # Get counter address.
> li s1, 1
> amoadd.w x0, s1, (s0) # Increment counter in memory.
> lw s1, 4(sp) # Restore registers.
> lw s0, 0(sp)
> addi sp, sp, 16 # Free stack frame.
> mret # Return from handler using saved mepc.
Tommy
>
> This change will be useful even with existing interrupt architecture,
> but TG will be looking at a new design that supports nested
> interrupts. Our initial studies show a small core can take an interrupt,
> enter, execute, and exit the handler above in less than 20 cycles,
> while supporting preemption on any clock cycle (i.e., only a few cycles, ~3,
> to get to the first instruction).
>
> Krste
On 16 March 2018 at 20:58:13, Samuel Falvo II (sam....@gmail.com) wrote:
sure.
> On Fri, Mar 16, 2018 at 1:46 AM, Liviu Ionescu wrote:
> > yes, two-tiered interrupt processing is the ideal textbook solution,
> > but few RTOSes/applications do it.
>
> Citation needed?
the venerable eCos calls them ISRs and DSRs; the µC/OS-III calls them
direct and deferred interrupts; FreeRTOS has an optional Deferred
Interrupt Handling.
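for readers not familiar with the pattern, a minimal C sketch of the two tiers follows. all names are hypothetical and no particular RTOS API is implied; a real system would also mask interrupts or use an atomic operation around the shared counter.

```c
#include <stdint.h>

/* All names here are hypothetical; no particular RTOS API is implied. */
volatile uint32_t pending_events; /* written by the ISR, drained by the deferred part */
uint32_t processed_events;        /* touched only outside interrupt context */

/* First tier: runs in interrupt context, does the minimum and returns. */
void device_isr(void) {
    /* A real handler would acknowledge the device here. */
    pending_events++;
}

/* Second tier: runs later from a task or the main loop.
   Returns the number of events handled in this call. */
uint32_t device_deferred(void) {
    uint32_t handled = 0;
    while (pending_events != 0) {
        /* On real hardware this decrement needs interrupts masked or an atomic. */
        pending_events--;
        processed_events++;
        handled++;
    }
    return handled;
}
```

the first tier stays short, so interrupts are blocked only briefly, and the longer work is scheduled like any other task.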
any proposals that come with use cases that show they are simpler to
use, and performance estimates that show better latency or better
performance in general, are welcome.