# RISC-V proposed microcontroller system ISA			-*- org -*-

* Introduction

The general RISC-V environment ISA is designed for running POSIX operating
systems on large-scale hardware.  Some of the design decisions driven by
this requirement make the general environment ISA less than optimal for
smaller embedded applications.  This proposal provides an alternate
environment ISA and embedded ABI for microcontrollers intended for hard
real-time embedded applications.

** Key Features
   - predictable execution timing for digital process control applications
   - simplified execution model to reduce the complexity of implementations
   - flat interrupt handling with hardware support for nested interrupts
   - hardware stack limit checking associated with register x2/sp
   - support for trap handlers as C functions

** Scope

An immediate question is simply: what is a microcontroller?  Perhaps as
importantly, what is the difference between a microcontroller and an
M-mode-only implementation of the general environment ISA?

The software distinction used in this proposal is simple: a
microcontroller executes a single process from some non-volatile memory.
Use of an RTOS may enable this single process to be multi-threaded, but all
threads exist in the same process context.  There is no memory protection
in a microcontroller.  Changing the program in a microcontroller is a
maintenance task and is not routinely performed during normal operation.

This proposal also draws a line in hardware: microcontrollers have simple,
in-order execution pipelines with predictable execution timing and constant
latencies.  Microcontrollers do not have complex memory hierarchies or
out-of-order execution.  Instruction caches associated with Flash ROM are
common, but data caches are normally not present because all memory in a
microcontroller is SRAM that operates at the full core frequency.  The
memory subsystem in general is tightly coupled to the execution pipeline.

** Privilege Levels

Microcontrollers have no concept of privilege levels.  All code implicitly
executes in an equivalent of machine mode in the general environment ISA.

** Shadow Registers

To reduce trap latency, microcontrollers have an implementation-defined
number of shadow register banks that store the interrupted context's values
of all caller-saved registers and the stack pointer.  Shadow registers are
automatically spilled to the trap stack by hardware as needed.

While microcontrollers eschew privilege levels, there are two modes of
execution: thread mode and trap mode.  The difference is whether shadow
registers are currently in use and therefore which stack is active.

* Control and Status Registers

** Status Register

|-------+-------+--------------------|
| Bits  | Name  | Description        |
|-------+-------+--------------------|
| 16:15 | XS    | extension status   |
| 14:13 | FS    | FP status          |
| 12: 5 | LEVEL | trap nesting level |
|  4: 0 |       | (reserved)         |
|-------+-------+--------------------|

*** LEVEL field

The LEVEL field in the status CSR tracks the current nesting level for trap
handling.  The low-order bits of this field also select the current shadow
register bank.  This field is maintained by hardware and is read-only in the
status CSR.

If LEVEL is zero, the hart is executing in thread mode and the thread stack
is active.  If LEVEL is non-zero, the hart is executing in trap mode and
the trap stack is active.
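
The field layout above can be decoded with shifts and masks.  The following
C sketch is illustrative only: all function and macro names are
hypothetical, the status value is assumed to have already been read from
the CSR, and the number of shadow register banks is assumed to be a power
of two so that the low-order LEVEL bits can select a bank directly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Field positions taken from the status register table. */
#define STATUS_LEVEL_SHIFT 5u   /* LEVEL occupies bits 12:5  */
#define STATUS_FS_SHIFT    13u  /* FS occupies bits 14:13    */
#define STATUS_XS_SHIFT    15u  /* XS occupies bits 16:15    */

static unsigned status_level(uint32_t status)
{
    return (status >> STATUS_LEVEL_SHIFT) & 0xFFu;
}

static unsigned status_fs(uint32_t status)
{
    return (status >> STATUS_FS_SHIFT) & 0x3u;
}

static unsigned status_xs(uint32_t status)
{
    return (status >> STATUS_XS_SHIFT) & 0x3u;
}

/* LEVEL == 0 is thread mode; any other value is trap mode. */
static bool status_in_trap_mode(uint32_t status)
{
    return status_level(status) != 0;
}

/* Assumption: num_banks is a power of two (implementation-defined). */
static unsigned status_shadow_bank(uint32_t status, unsigned num_banks)
{
    assert(num_banks != 0 && (num_banks & (num_banks - 1)) == 0);
    return status_level(status) & (num_banks - 1);
}
```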

** Interrupt Assert and Enable

Each bit in the interrupt assert CSR (ia) corresponds to an interrupt
channel, of which XLEN are available.  Each channel has both a
level-sensitive external interrupt line and a software-writable latch.
When read, this CSR returns, for each channel, the logical OR of its
interrupt line and its latch.  Writes to this CSR affect only the latches.
Setting a latch requests an interrupt on the associated channel.  Each
latch is cleared by hardware when its associated interrupt trap is taken.

Each bit in the interrupt enable CSR (ie) similarly corresponds to an
interrupt channel, but masks its interrupt channel if clear.
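
The behaviour of the two CSRs can be summarized with a small software
model.  This is a sketch for XLEN = 32, not a description of any
implementation; the structure and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the ia and ie CSRs for XLEN = 32. */
struct irq_model {
    uint32_t lines;    /* level-sensitive external interrupt lines */
    uint32_t latches;  /* software-writable latches                */
    uint32_t enable;   /* ie: a clear bit masks its channel        */
};

/* Reading ia returns the OR of the external lines and the latches. */
static uint32_t ia_read(const struct irq_model *m)
{
    return m->lines | m->latches;
}

/* Writing ia affects only the latches, never the external lines. */
static void ia_write(struct irq_model *m, uint32_t value)
{
    m->latches = value;
}

/* Channels that are both asserted and enabled. */
static uint32_t irq_pending(const struct irq_model *m)
{
    return ia_read(m) & m->enable;
}

/* Taking the trap clears the latch; an external line stays asserted
 * until the peripheral releases it. */
static void irq_take(struct irq_model *m, unsigned channel)
{
    assert(channel < 32);
    m->latches &= ~(UINT32_C(1) << channel);
}
```

Note that a channel driven by an external line remains visible in ia after
its trap is taken, since only the latch is cleared.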

** Thread and Trap Stack Bases and Limits

These CSRs define the allowed ranges of memory accesses made using the ABI
stack pointer register x2.  This feature is present, despite
microcontrollers generally eschewing memory protection, because stack
errors are a very common cause of crashes in embedded applications.

Since the stack grows downwards in RISC-V, the stack base holds
the *highest* address permitted as part of a stack access, while the stack
limit holds the *lowest* address permitted as part of a stack access.

An access to memory using the x2 register outside of these bounds causes an
exception.
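
The bounds check can be sketched in C as follows.  The function name is
hypothetical, and the sketch assumes that both bounds are inclusive and
that an access faults if any of its bytes falls outside the range; the
proposal does not state how partially overlapping accesses are treated.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Returns true if a size-byte access at addr lies entirely within
 * [limit, base].  limit is the lowest permitted address and base is
 * the highest, because the stack grows downwards. */
static bool stack_access_ok(uint32_t addr, uint32_t size,
                            uint32_t limit, uint32_t base)
{
    assert(size > 0 && limit <= base);
    if (addr < limit || addr > base)
        return false;
    /* Compare against the remaining room so addr + size cannot wrap. */
    return size - 1 <= base - addr;
}
```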

** Thread Stack Pointer

This CSR provides access to the x2 register belonging to the thread
context.  This always refers to the thread stack pointer, regardless of
trap nesting level.  In thread mode, this CSR is an alias to x2.

** Background Link
** Background Millicode Link
** Background a0
** Background a1

These CSRs provide access to the x1, x5, x10, and x11 registers belonging
to the context at the top of a shadow register stack while in trap mode.
In thread mode, these CSRs alias the respective general registers.

** Trap Vector Base

This CSR contains the base address of the trap vector table and the vector
spacing.  The trap vector table must be aligned to a multiple of the vector
spacing or an 8-byte boundary, whichever is larger.

|----------+----------------+---------------------|
| Bits     | Name           | Description         |
|----------+----------------+---------------------|
| XLEN-1:4 | Base[XLEN-1:4] | trap vector base    |
| 3:0      | Size           | trap vector spacing |
|----------+----------------+---------------------|

The vector for an exception trap is determined by calculating
(Base-(Cause<<(2+Size))) while the vector for an interrupt trap is
determined by calculating (Base+(Channel<<(2+Size))).

Because Size is a 4-bit field, the shift amount (2+Size) ranges from 2 to
17.  These calculations therefore produce a minimum vector spacing of 4
bytes (one RVI opcode) and a maximum vector spacing of 2^17 or 131072 bytes
(2^15 or 32768 RVI opcodes).

This means that exception vectors are at *lower* addresses than the vector
base and that interrupt vectors are at *higher* addresses than the vector
base.  The vector located exactly at the vector base belongs to interrupt
channel zero, which is reserved for software context switch.
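
The two vector calculations can be written out in C.  This is a sketch for
XLEN = 32 with hypothetical function names; it assumes that Base is the CSR
value with the four Size bits cleared.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: Base is the CSR value with the low four bits cleared. */
static uint32_t tvec_base(uint32_t tvec)
{
    return tvec & ~UINT32_C(0xF);
}

/* Shift amount 2+Size, giving a spacing of 4 << Size bytes. */
static unsigned tvec_shift(uint32_t tvec)
{
    return 2u + (tvec & 0xFu);
}

/* Exception vectors lie below the base: Base - (Cause << (2+Size)). */
static uint32_t exception_vector(uint32_t tvec, unsigned cause)
{
    assert(cause < 32);
    return tvec_base(tvec) - ((uint32_t)cause << tvec_shift(tvec));
}

/* Interrupt vectors lie at or above it: Base + (Channel << (2+Size)). */
static uint32_t interrupt_vector(uint32_t tvec, unsigned channel)
{
    assert(channel < 32);
    return tvec_base(tvec) + ((uint32_t)channel << tvec_shift(tvec));
}
```

For example, with a base of 0x1000 and Size = 1 (8-byte spacing), exception
cause 2 vectors to 0x0FF0 and interrupt channel 3 vectors to 0x1018.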

* Embedded ABI

RISC-V microcontrollers use an alternate ABI in which nearly all registers
are callee-saved.  Only the return address, millicode link, and return
value registers are caller-saved.  The eABI is otherwise aligned with the
standard RISC-V POSIX ABI.  Code using the eABI can be safely called from
POSIX ABI code, but eABI code that calls into POSIX ABI code must do so
through special thunks that save and restore the registers that are
caller-saved in the POSIX ABI.

|------------+-------------+----------------------------+--------|
| Register   | ABI Name    | Description                | Thunk? |
|------------+-------------+----------------------------+--------|
| x0         | zero        | hard-wired zero            |        |
| x1         | ra          | return address             | no     |
| x2         | sp          | stack pointer              | no     |
| x3         | gp          | global pointer             |        |
| x4         | tp          | thread pointer             |        |
| x5         | t0          | millicode link             | no     |
| x6 -- x7   | t1 -- t2    | temporaries                | yes    |
| x8         | s0 / fp     | frame pointer              | no     |
| x9         | s1          | saved register             | no     |
| x10 -- x11 | a0 -- a1    | arguments/return values    | no     |
| x12 -- x17 | a2 -- a7    | arguments                  | yes    |
| x18 -- x27 | s2 -- s11   | saved registers            | no     |
| x28 -- x31 | t3 -- t6    | temporaries                | yes    |
|------------+-------------+----------------------------+--------|
| f0 -- f7   | ft0 -- ft7  | FP temporaries             | yes    |
| f8 -- f9   | fs0 -- fs1  | FP saved registers         | no     |
| f10 -- f11 | fa0 -- fa1  | FP arguments/return values | no     |
| f12 -- f17 | fa2 -- fa7  | FP arguments               | yes    |
| f18 -- f27 | fs2 -- fs11 | FP saved registers         | no     |
| f28 -- f31 | ft8 -- ft11 | FP temporaries             | yes    |
|------------+-------------+----------------------------+--------|

The ABI names are unchanged from the POSIX ABI to preserve compatibility
with the POSIX RISC-V tools; however, nearly all registers are callee-saved
in the eABI.  This reduces the amount of context that must be preserved in
trap entry and restored at trap exit, which is often latency-sensitive in
microcontroller applications.

The "Thunk?" column indicates registers which are caller-saved in the POSIX
ABI but callee-saved in the eABI.  Of the general registers, only x1, x5,
x10, and x11 are caller-saved in the eABI.  Of the FP registers, only f10
and f11 are caller-saved in the eABI.
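
The set of general registers that a thunk must preserve is the difference
between the two caller-saved sets.  The following C sketch derives it as a
bitmask, with bit n standing for register xn; the macro and function names
are hypothetical, and the POSIX ABI set is taken from the standard RISC-V
calling convention.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit n set means general register xn. */
#define REG(n)       (UINT32_C(1) << (n))
#define REGS(lo, hi) ((UINT32_C(0xFFFFFFFF) >> (31 - (hi))) & ~(REG(lo) - 1))

/* Caller-saved in the POSIX ABI: ra, t0-t2, a0-a7, t3-t6. */
#define POSIX_CALLER_SAVED (REG(1) | REGS(5, 7) | REGS(10, 17) | REGS(28, 31))
/* Caller-saved in the eABI: x1, x5, x10, x11. */
#define EABI_CALLER_SAVED  (REG(1) | REG(5) | REGS(10, 11))
/* Registers a thunk must save and restore around a POSIX ABI call. */
#define THUNK_SAVED        (POSIX_CALLER_SAVED & ~EABI_CALLER_SAVED)

static bool reg_needs_thunk(unsigned n)
{
    assert(n < 32);
    return (THUNK_SAVED & REG(n)) != 0;
}

/* Counts the set bits in mask by clearing the lowest one each pass. */
static unsigned count_regs(uint32_t mask)
{
    unsigned count = 0;
    for (; mask != 0; mask &= mask - 1)
        count++;
    return count;
}
```

The resulting mask covers x6-x7, x12-x17, and x28-x31, which are the twelve
general registers marked "yes" in the table.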

* Traps

Shadow registers are used to improve the performance of trap-entry and
trap-exit operations.  Trap-return is the TRET instruction, which is
encoded identically to the general environment ISA MRET instruction.

Upon entry to a trap handler, x10 contains the previous program counter
value.  A trap handler must return the address at which to resume the
previous context.
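
Because x10 is both the first argument register and the return value
register, a trap handler can be an ordinary C function that takes the
previous program counter and returns the resume address.  The handlers
below are hypothetical examples, not part of this proposal.  The first one
assumes that the previous program counter of an illegal instruction
exception is the address of the faulting instruction and that the
instruction is a 4-byte RVI opcode.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical illegal instruction handler: skip the faulting opcode.
 * Assumes previous_pc points at a 4-byte RVI instruction. */
uintptr_t illegal_instruction_handler(uintptr_t previous_pc)
{
    assert((previous_pc & 3) == 0);   /* instructions must be aligned */
    return previous_pc + 4;
}

/* Hypothetical interrupt handler: resume exactly where the interrupted
 * context left off. */
uintptr_t timer_interrupt_handler(uintptr_t previous_pc)
{
    /* ... service the peripheral here ... */
    return previous_pc;
}
```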

Taking a trap selects a new shadow register bank by incrementing
status.LEVEL and, when no free bank remains, spills the oldest set of
shadow registers onto the trap stack.  Trap handlers must not assume that
they can access any particular
saved context, since the number of shadow register banks is
implementation-defined.  Only the top of the shadow register stack is
accessible in the background link and a0/a1 CSRs.

Spilling shadow registers to memory may be accomplished during pipeline
refill latency while taking the trap or may be performed in the background
after the trap handler is entered.  Implementations may also use a wide
(256x(4*XLEN)) internal RAM to spill shadow registers, instead of using the
stack, or some combination of approaches.

Taking a trap from thread mode initializes the trap mode x2 (stack pointer)
to the trap stack base from the trap stack base CSR, or the thread mode x2
value if the trap stack base CSR is zero.
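
The stack pointer selection at trap entry can be sketched as a small C
function.  The function name is hypothetical.  The nested case is an
assumption: the proposal only specifies the trap taken from thread mode,
and this sketch assumes a nested trap continues on the current trap stack.

```c
#include <assert.h>
#include <stdint.h>

/* level:           status.LEVEL before the trap is taken
 * current_sp:      x2 of the interrupted context
 * trap_stack_base: value of the trap stack base CSR */
static uint32_t trap_entry_sp(unsigned level, uint32_t current_sp,
                              uint32_t trap_stack_base)
{
    assert(level <= 255);             /* LEVEL is an 8-bit field */
    if (level != 0)
        return current_sp;            /* assumption: stay on the trap stack */
    /* From thread mode: use the trap stack unless its base CSR is zero. */
    return trap_stack_base != 0 ? trap_stack_base : current_sp;
}
```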

** TRAP Instruction

The TRAP instruction asserts interrupt channel zero.  In thread mode this
causes an immediate trap to the context switch handler.  In trap mode, the
context switch is delayed until all higher-priority interrupt handlers have
completed.

TRAP is a pseudo-instruction corresponding to "CSRSI ia, 1".

** TRET Instruction

The TRET instruction effects a return from a trap handler in a
microcontroller.  This involves restoring shadow registers from the context
save area in the current stack frame, decrementing status.LEVEL, and
transferring control to the address in x10.

In thread mode, TRET simply transfers control to the address in x10.

Upon entry to a trap handler, hardware sets the x1 (ra)
register to the address of a TRET instruction in ROM.  The return address
given to a trap handler must appear to contain a TRET instruction if read
as data, but may be specially recognized by hardware and a JALR to that
address may be directly executed as TRET.

_Issue_ : Should TRET be identical to MRET?

** Exceptions

A RISC-V microcontroller supports a reduced set of exception causes.

|------+----------------------------------------|
| Code | Description                            |
|------+----------------------------------------|
|    0 | (reserved for context switch handler)  |
|    1 | stack fault                            |
|    2 | illegal instruction                    |
|    3 | debug breakpoint (execution of EBREAK) |
|    4 | instruction misaligned                 |
|    5 | instruction bus fault                  |
|    6 | data address misaligned                |
|    7 | data bus fault                         |
|------+----------------------------------------|

While microcontrollers eschew memory protection, hardware may still be able
to recognize accesses to addresses that simply are not mapped to any
target.  Such an access raises a bus fault.

The stack fault exception is raised when a memory access using x2 accesses
memory outside of the region defined by the stack base and limit CSRs.  A
stack fault due to overrunning the trap stack is not recoverable and is
taken at the base of the trap stack, overwriting the lowest stack frame.

Microcontroller software is expected to issue only aligned accesses.
Emulation of misaligned accesses is possible, but most applications will
not use it.  Hardware may support misaligned data access, in which case
misaligned data exceptions will never occur.

Instructions *must* be aligned, unlike data accesses.

** Interrupts

A RISC-V microcontroller core has XLEN interrupt channels, but may have
fewer than XLEN external interrupt inputs.  Interrupt channels lacking
external inputs can only produce interrupts by software.

Interrupt priority is fixed: higher-numbered channels have greater
priority and can interrupt lower-priority interrupt service routines.
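
The fixed-priority rule can be sketched as a selection function for
XLEN = 32.  The names are hypothetical, and the sketch assumes that the
channel currently being serviced is known; how hardware tracks it is not
specified here.  Exceptions are ignored.

```c
#include <assert.h>
#include <stdint.h>

#define NO_INTERRUPT (-1)

/* Returns the highest-numbered channel that is asserted in ia, enabled
 * in ie, and of greater priority than the channel being serviced, or
 * NO_INTERRUPT if there is none.  Pass NO_INTERRUPT as current when no
 * interrupt handler is running. */
static int next_interrupt(uint32_t ia, uint32_t ie, int current)
{
    assert(current >= NO_INTERRUPT && current < 32);
    uint32_t pending = ia & ie;
    for (int channel = 31; channel > current; channel--) {
        if (pending & (UINT32_C(1) << channel))
            return channel;
    }
    return NO_INTERRUPT;
}
```

Under this model, channel zero (the software context switch) is selected
only when no other enabled channel is pending and no handler is running.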

An interrupt map, external to the core and configured using MMIO, assigns
physical peripherals to interrupt channels.


* Standard Memory Map

[to be written]

** Main Program ROM
** Main RAM
** System Configuration Area
*** Descriptor ROM
*** Interrupt Map

* Appendix:  Acknowledgements

This would never have been possible without the work of the RISC-V Project,
and many parts of this proposal are adapted from the standard RISC-V
specifications.

The idea of a separate microcontroller environment ISA is largely due to
the efforts of Liviu Ionescu, although I am certain that there are many
disagreements with some details of this proposal.

This proposal itself was written by Jacob Bachmeyer.

* Appendix:  Rationale

[to be written]

* [meta]
** LocalWords
 LocalWords:  RISC POSIX ABI RTOS SRAM CSRs callee eABI ra sp gp tp millicode
 LocalWords:  fp fs CSR TRET MRET JALR XLEN SLLI RVI opcode opcodes Liviu ECALL
 LocalWords:  Ionescu breakpoint EBREAK AMO CSRSI ia ie MMIO