RISC-V calling convention - context switch


JO

Oct 19, 2023, 1:05:09 PM
to RISC-V SW Dev
Hello

I am trying to understand which registers are saved and subsequently restored upon a context switch.

I was looking at the following document (which is quite old, being dated 2015):

Is there a more up-to-date document that I can refer to?

Tks
JO

Tommy Murphy

Oct 19, 2023, 4:14:05 PM
to RISC-V SW Dev, RISC-V SW Dev
Doesn't it depend on what is being used?
E.g. bare-metal, Linux, other [RT]OS?
Is this of any use to you?

Javed Osmany

Oct 19, 2023, 4:18:01 PM
to Tommy Murphy, RISC-V SW Dev
Yes, it is of use, hence the question.
The planned CPU will be used in an automotive application and will therefore be running an RTOS.


Robert Lipe

Oct 19, 2023, 10:02:02 PM
to RISC-V SW Dev, JO, tommy_...@hotmail.com
Normally, the OS handles this exactly so your code doesn't have to.  The OS will define the ABI and decide things like Caller-Save vs. Callee-Save, for example. The OS, compiler, all libraries, and such have to know what the OS is using so they can all interact appropriately.

This is very parallel to a discussion in a related group I used to read:
https://forums.sifive.com/t/context-switch-on-risc-v/3624
The examples start out helpful, then just turn into smack talk and "helping" by distracting with what an unrelated CPU arch does.

The punchline of those examples holds. A few searches on GitHub should turn up dozens, maybe hundreds by now, of independent implementations of swtch() (the original UNIX name), setjmp/longjmp (similar functions in user space), OSes doing task/interrupt handling (which requires context switches), code calling them via swapcontext as above, etc.
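
A minimal user-space sketch of the swapcontext style of switching mentioned above, using the POSIX `<ucontext.h>` calls (`getcontext`, `makecontext`, `swapcontext`) as provided by glibc. The names `task`, `run_demo`, and `steps` are hypothetical; this only illustrates the save-one-context, resume-another pattern and is not taken from any particular RTOS.

```c
#include <assert.h>
#include <ucontext.h>

static ucontext_t main_ctx, task_ctx;   /* saved register state of each side */
static int steps;                       /* counts how far the task has run */

/* Runs on its own stack; yields once back to the caller, then finishes. */
static void task(void)
{
    steps++;
    swapcontext(&task_ctx, &main_ctx);  /* save task state, resume main */
    steps++;
    /* returning here resumes the context named in uc_link */
}

/* Returns after_first_yield * 10 + final step count. */
int run_demo(void)
{
    static char stack[64 * 1024];       /* private stack for the task */

    steps = 0;
    getcontext(&task_ctx);
    task_ctx.uc_stack.ss_sp = stack;
    task_ctx.uc_stack.ss_size = sizeof stack;
    task_ctx.uc_link = &main_ctx;       /* where to go when task() returns */
    makecontext(&task_ctx, task, 0);

    swapcontext(&main_ctx, &task_ctx);  /* run the task until it yields */
    int after_first = steps;

    swapcontext(&main_ctx, &task_ctx);  /* resume the task; it then returns */
    return after_first * 10 + steps;
}
```

Each `swapcontext` call stores the caller's registers in its first argument and loads the second, which is the same save/restore pair an OS performs in assembly on a real switch.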

If you're doing RV32E, you have fewer registers.
If you're doing FP or V, you have more registers.
If you have page mappings that aren't shared between tasks (e.g. a Real OS where applications/tasks can't scribble in each other's space) you may also have mappings to rebuild on a switch.

Again, any OS knows all of that.

This is all pretty much the heart of what an RTOS does and ALL of them have some variation of it. Some OSes are little more than figuring out which task to run and then switching to it. Figuring out how to do it 5% faster has been the basis of many student presentations, heralded industry improvements, and such. It's a decades-old "problem" because every system is doing it hundreds to thousands of times per second.

JO

Oct 20, 2023, 2:47:19 AM
to RISC-V SW Dev, rober...@gmail.com, JO, tommy_...@hotmail.com
Thank you for the answer.

I did a search on Google for "RISC-V context switch state" or similar and did not really find what I was looking for.

The rationale for looking being:
I am trying to provide HW support for context switching to minimise the switch time - important for real-time systems.
So I am trying to scope out which of the GPRs and FPRs (if used by the context) the HW could automatically save after a trigger from the SW.

Thanks
JO

Tommy Murphy

Oct 20, 2023, 4:44:44 AM
to JO, RISC-V SW Dev, rober...@gmail.com
> I am trying to provide HW support for context switching to minimise the switch time

That doesn't sound like a great idea given that it would presumably restrict the hardware to the calling convention of a specific compiler, [RT]OS etc.?

Robert Lipe

Oct 20, 2023, 5:24:53 AM
to Tommy Murphy, JO, RISC-V SW Dev
Past chips that have tried this have largely faded into obscurity for pretty much that reason.

WinChipHead has a small optimization for this in their QingKe cores. People who really care about reducing context switch time use the RV32E ABIs, which halve the register file, or just use FIFOs/write bursts/memory speeds that are fast enough to help... or use hardware coprocessors to handle anything that needs a response within a couple of clock cycles.

The advantage of the way WCH does it is that you can ignore it (intentionally or otherwise) and it acts exactly like a RISC-V part is supposed to act. :-)

Just because you HAVE 32 registers (16 for RV32E) doesn't mean you have to USE 32 registers. You can tell GCC you have 25 integers pinned in 25 registers (something like 'register int *r8 asm("r8")') and those registers will no longer get used by the compiler for anything else. If you take this to an extreme like reclaiming A0-A7, you're going to have to create your own ABI and all support code. Again, people reduced the number of registers in play for years. (And it was terrible... IMO one of the points of using something like RISC-V is that smart people have already figured these things out and provided tools like debuggers that know about calling conventions and how to handle them. But people used to build their own debuggers, too, so it can be done.)
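
A hedged sketch of that register-reservation idea using GCC's global register variable extension. On RISC-V, GCC expects the x-names or ABI names (for example `s11`, which is `x27`) rather than `r8`. The choice of `s11`, the variable name `reserved_reg`, and the two helper functions are assumptions for illustration; the `#else` branch exists only so the file also builds on a non-RISC-V host.

```c
#include <assert.h>
#include <stdint.h>

#if defined(__riscv) && defined(__GNUC__) && !defined(__clang__)
/* Pin a global to s11 so the compiler stops allocating s11 in this file. */
register uintptr_t reserved_reg __asm__("s11");
#else
/* Host fallback (assumption): an ordinary global with the same interface. */
static uintptr_t reserved_reg;
#endif

/* Hypothetical helpers: keep a pointer to the running task in the register. */
static inline void set_current_task(const void *tcb)
{
    reserved_reg = (uintptr_t)tcb;
}

static inline const void *get_current_task(void)
{
    return (const void *)reserved_reg;
}
```

For the reservation to hold, every translation unit (libraries included) would need to be compiled with the register kept out of allocation, for instance with GCC's `-ffixed-s11`; code built without it is free to use `s11` as a normal callee-saved register, which is the "create your own ABI" cost described above.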

Most of the RTOS folks save and restore all the registers and we've pointed you to source code that proves it. You can either not use all the registers and spend all your time on the memory bus handling spills or you can have a lot of registers and just be smart about when you take interrupts and system calls and task swaps and such so you don't NEED to context switch at 10 kHz. Sure, be smart about what you need to save and when, but this is all such well-covered ground these days that there's not much space for "innovation" here.

If you have remaining questions about the RISC-V-specific aspects (not general computer science that has been in compiler and OS courses for decades and in every OS supporting an architecture with more registers than the 6502), please be more specific. We've pointed you at mountains of examples.

Christoph Müllner

Oct 20, 2023, 5:28:51 AM
to JO, RISC-V SW Dev, rober...@gmail.com, tommy_...@hotmail.com
On Fri, Oct 20, 2023 at 8:47 AM JO <livi...@gmail.com> wrote:
>
> Thank you for the answer.
>
> I did a search on google for "RISC-V context switch state" or similar and did not really find what i was looking for.

An open-source example for RVA-style processors can be found in the
Linux kernel:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/riscv/kernel/entry.S#n17
An open-source example for RVM-style processors can be found in FreeRTOS:
https://github.com/FreeRTOS/FreeRTOS-Kernel/blob/main/portable/GCC/RISC-V/portASM.S#L298

The calling convention is not of relevance in case of context
switching as it only defines the contract across function calls,
not across context switches.

When you switch out a task/process/thread, then you need to save all
state that is exposed to that task/process/thread (program counter,
register contents, relevant CSR state, etc.).
Of course, there are ways to optimize that (e.g. a process that does
not use FP registers does not need to save them), but this might
come with some cost.
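
A minimal sketch of the state described above and of the FP optimisation, assuming an RV64 machine-mode RTOS. The `task_context` layout and the function names are hypothetical; the FS field position (bits 14:13 of `mstatus`) and its Off/Initial/Clean/Dirty encoding come from the privileged specification, which describes FS as a way to reduce context-save cost.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* mstatus.FS occupies bits 14:13. */
#define MSTATUS_FS_SHIFT 13
#define MSTATUS_FS_MASK  (UINT64_C(3) << MSTATUS_FS_SHIFT)

enum fs_state { FS_OFF = 0, FS_INITIAL = 1, FS_CLEAN = 2, FS_DIRTY = 3 };

/* Hypothetical per-task save area: everything the task can observe. */
struct task_context {
    uint64_t pc;        /* mepc captured at the switch */
    uint64_t x[31];     /* x1..x31; x0 is hard-wired to zero */
    uint64_t status;    /* saved mstatus, including the FS field */
    uint64_t f[32];     /* f0..f31, written only when FS was Dirty */
    uint32_t fcsr;      /* FP control/status, saved with f[] */
};

/* Extract the FS field from an mstatus value. */
static enum fs_state fs_of(uint64_t mstatus)
{
    return (enum fs_state)((mstatus & MSTATUS_FS_MASK) >> MSTATUS_FS_SHIFT);
}

/* FP state needs storing only if the task modified it since the last save. */
static bool must_save_fp(uint64_t mstatus)
{
    return fs_of(mstatus) == FS_DIRTY;
}

/* After storing f[] and fcsr, software marks the state Clean again. */
static uint64_t mark_fp_clean(uint64_t mstatus)
{
    return (mstatus & ~MSTATUS_FS_MASK) |
           ((uint64_t)FS_CLEAN << MSTATUS_FS_SHIFT);
}
```

A switch routine could call `must_save_fp()` on the outgoing task's status and skip the 32 f-register stores when it returns false; the cost mentioned above is the extra check and bookkeeping on every switch.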

> The rationale for looking being:
> I am trying to provide HW support for context switching to minimise the switch time - important for real-time systems.
> Thus trying to scope out which of the GPR and FPR (if used by the context) that HW could automatically save after a trigger from the SW.

There are several existing extensions that attempt to tackle the same
goal (at least to some degree):
* Zcmp has cm.push*/cm.pop* instructions
https://github.com/riscv/riscv-code-size-reduction/releases/latest
* XTheadInt has similar th.ipush/th.ipop instructions
https://github.com/T-head-Semi/thead-extension-spec/releases/latest

So if you have something similar in mind, it might be reasonable to
use existing extensions instead of reinventing them.

BR

Christoph

Liviu Ionescu

Oct 20, 2023, 5:33:07 AM
to Tommy Murphy, JO, RISC-V SW Dev, rober...@gmail.com


> On 20 Oct 2023, at 11:44, Tommy Murphy <tommy_...@hotmail.com> wrote:
>
> > I am trying to provide HW support for context switching to minimise the switch time
>
> That doesn't sound like a great idea given that it would presumably restrict the hardware to the calling convention of a specific compiler, [RT]OS etc.?

If I remember right, the RISC-V hardware does not save/restore registers when processing exceptions/interrupts, and the software must save/restore all registers in assembly (which is independent of the compiler/ABI/etc) at each context switch, so I doubt there is much room for optimisations at this level.

After saving the context, a C function is called, to select the next context, and here optimisations can make some difference, but this is [RT]OS specific.


BTW, given that the current RISC-V interrupt controller does not support embedded interrupts (higher priority interrupts are entered only after completing a lower priority interrupt), the context switch time might not be the biggest problem if real time characteristics are important.

Regards,

Liviu



Robert Lipe

Oct 20, 2023, 5:49:56 AM
to Liviu Ionescu, Tommy Murphy, JO, RISC-V SW Dev
There are several "current RISC-V interrupt controllers".

Most of them support prioritization. Some support preemption.

Some are in other documents that I'm not linking individually. :-)

Liviu Ionescu

Oct 20, 2023, 6:22:14 AM
to Robert Lipe, Tommy Murphy, JO, RISC-V SW Dev


> On 20 Oct 2023, at 12:49, Robert Lipe <rober...@gmail.com> wrote:
>
> There are several "current RISC-V interrupt controllers".
>
> Most of them support prioritization. Some support preemption.

You mean the CLIC? That project was created in 2018, and I was one of those who pushed for it. Now (end of 2023) it is still a proposal. Do you know of any device implementing it?

> Some are in other documents that I'm not linking individually. :-)

I know of various vendor custom implementations, but I'm not aware of any being ratified.


Regards,

Liviu

Robert Lipe

Oct 20, 2023, 6:46:55 AM
to Liviu Ionescu, Tommy Murphy, JO, RISC-V SW Dev
No, but I'll admit I've lost track of how many are in play. We have at least PLIC, CLINT, ACLINT, CLIC, and surely more vendor-specific ones that I don't recall offhand or have never met.

Those links were indeed to CLIC. This is one of the messiest areas about supporting RISC-V in the real world. You still need custom code for almost every chipset out there in the real world.

I knew your name was familiar, but I'll admit I didn't check why. Sorry to taunt you. :-)

RJL

Liviu Ionescu

Oct 20, 2023, 7:01:52 AM
to Robert Lipe, Tommy Murphy, JO, RISC-V SW Dev


> On 20 Oct 2023, at 13:46, Robert Lipe <rober...@gmail.com> wrote:
>
> This is one of the messiest areas about supporting RISC-V in the real world.

Fully agree. One of the many. The Linux RISC-V world is more or less ok, but the bare-metal RISC-V world is nowhere near.

> You still need custom code for almost every chipset out there in the real world.

This makes porting and maintaining RTOS-es on RISC-V devices a real challenge.

Try to implement your own RTOS on both Cortex-M and RISC-V and you'll see what I mean. There is nothing easier to use than the Cortex-M NVIC.


Regards,

Liviu





Robert Lipe

Oct 20, 2023, 7:41:25 AM
to Liviu Ionescu, Tommy Murphy, JO, RISC-V SW Dev
> This is one of the messiest areas about supporting RISC-V in the real world.

> Fully agree. One of the many. The Linux RISC-V world is more or less ok, but the bare-metal RISC-V world is nowhere near.

That's mostly the world I inhabit. (metal/embedded, not "real" :-)

> Try to implement your own RTOS on both Cortex-M and RISC-V and you'll see what I mean. There is nothing easier to use than the Cortex-M NVIC.

STM has no shortage of variations, but getting a timer interrupt and to main is super easy. There's less diversity in Espressif-land, but their tooling is super. The details of where the timer is, where the interrupt vectors are, where in address space you live, etc. are all just busy-work differences.

I get the vendors need/want to take advantage of freedoms, but if you work with > 1 chip, it's mostly just annoying to track down everything you need. I also suspect that most people pick one chip, build a product around it, and never touch the rest.

Is life really that good in Linux-land?  I have four of the seven boards that Ubuntu has seen a need to spin a custom distribution for. That's not a great binary compatibility story.

I don't know what PolarFire uses, and we'll leave QEMU out, but so far we have three SiFives and two C906's on that list. Some day we'll get around to dynamically patching the kernel to teach C906 to stay out of the reserved bits in the PTEs and so on, but we're not there yet. It's all still basically #ifdefed together. For core peripherals, V5 and V5R2 (JH-7100 vs. 7110) aren't that distant. The same could be said for all the D1 boards. There are a ton of cheapo (under $30) D1 boards that are probably almost capable of running the Nezha port for alternatives to Zero-class products.

As long as they're having to release and QA each product individually, I don't see them tripping over themselves to shore up MangoPi, CD1800, 0X64, and other really low end boards that can technically run Linux, but are probably best left to Zephyr or NuttX or whatever. Then again, those groups probably have even fewer people to dedicate to bespoke ports.

Adding another STM32F with yet one more UART mutant is WAY easier than figuring out scheduling issues for this round of asymmetric cores in the SoCs.




Liviu Ionescu

Oct 20, 2023, 8:09:58 AM
to Robert Lipe, Tommy Murphy, JO, RISC-V SW Dev


> On 20 Oct 2023, at 14:41, Robert Lipe <rober...@gmail.com> wrote:
>
> Is life really that good in Linux-land?

I was thinking of the specifics of context switching (the subject of this topic); the old PLIC and the current context switching code in the Linux kernel are probably ok for most devices.

> I have four of the seven boards that Ubuntu has seen a need to spin a custom distribution for. That's not a great binary compatibility story.

Yeah, not ideal, but at least the specifics of each device are handled by different kernels, maintained by a small group of gurus, and the developers are presented a more consistent user-land.

In the bare-metal world the responsibility for handling the device specifics is transferred to the developer, who has to handle all these variations; not an easy task, and the RISC-V Foundation does not seem to provide much help.


Regards,

Liviu





JO

Oct 20, 2023, 8:50:20 AM
to RISC-V SW Dev, rober...@gmail.com, JO, RISC-V SW Dev, tommy_...@hotmail.com

  >> Sure, be smart about what you need to save and when, but this is all such covered ground these days that there's not much space for "innovation" here.

That is exactly what I am trying to scope out. As mentioned in an earlier reply, the brute force method would be to save and restore all 31 GPRs and all 32 FPRs (if FPRs are implemented and used by the context).

Tks

JO

Oct 20, 2023, 8:52:13 AM
to RISC-V SW Dev, christoph...@vrull.eu, RISC-V SW Dev, rober...@gmail.com, tommy_...@hotmail.com
Thank you for the pointer.

BR

Liviu Ionescu

Oct 20, 2023, 8:53:16 AM
to JO, RISC-V SW Dev, rober...@gmail.com, tommy_...@hotmail.com


> On 20 Oct 2023, at 15:50, JO <livi...@gmail.com> wrote:
>
> ... the brute force method would be save and restore all 31 GPRs ...

Is there any other safe method?

Liviu

JO

Oct 20, 2023, 8:54:43 AM
to RISC-V SW Dev, i...@livius.net, JO, RISC-V SW Dev, tommy_...@hotmail.com
Thank you for the insight.

BR

JO

Oct 20, 2023, 9:01:14 AM
to RISC-V SW Dev, i...@livius.net, RISC-V SW Dev, rober...@gmail.com, tommy_...@hotmail.com
Trying to scope out if there is.

BR

Liviu Ionescu

Oct 20, 2023, 9:04:18 AM
to Tommy Murphy, JO, RISC-V SW Dev, rober...@gmail.com


> On 20 Oct 2023, at 12:32, Liviu Ionescu <i...@livius.net> wrote:
>
> ... embedded interrupts (higher priority interrupts are entered only after completing a lower priority interrupt)

For the sake of correctness, these are actually called 'nested interrupts' (this is also where the N in Cortex-M NVIC comes from).

Liviu



Tommy Murphy

Oct 20, 2023, 9:53:49 AM
to JO, RISC-V SW Dev, rober...@gmail.com, JO, RISC-V SW Dev
> the brute force method would be save and restore all 31 GPRs and all 32 FPRs (if FPR are implemented and used by the context).

64 FPRs if both F and D extensions are implemented?

Liviu Ionescu

Oct 20, 2023, 10:01:59 AM
to Tommy Murphy, JO, RISC-V SW Dev, rober...@gmail.com


> On 20 Oct 2023, at 16:53, Tommy Murphy <tommy_...@hotmail.com> wrote:
>
> > the brute force method would be save and restore all 31 GPRs and all 32 FPRs (if FPR are implemented and used by the context).
>
> 64 FPRs if both F and D extensions are implemented?

Ah, sure. And there is no lazy save/restore as in Cortex-M, which saves quite a lot of cycles.

And probably some CSRs, as well.

And so the poor core gets pretty busy even when not doing anything...
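
The register counting above can be turned into rough arithmetic. The sketch below estimates how many bytes a software context switch has to move for a few configurations. The struct, the function, and the named configurations are hypothetical, and it assumes only pc, the integer registers, the f registers, and fcsr are saved (a real port also saves some CSRs). Per the ISA manual, D widens the same 32 f registers to 64 bits rather than adding 32 more, so the FP byte count doubles while the register count stays at 32.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of a hart for sizing purposes. */
struct hart_config {
    unsigned xlen_bits;  /* 32 or 64 */
    bool     rv_e;       /* true for the E base: x1..x15 only */
    unsigned flen_bits;  /* 0 = no FP, 32 = F, 64 = F+D */
};

static const struct hart_config RV32E_NOFP = { 32, true,  0  };
static const struct hart_config RV32I_NOFP = { 32, false, 0  };
static const struct hart_config RV32IF     = { 32, false, 32 };
static const struct hart_config RV32IFD    = { 32, false, 64 };
static const struct hart_config RV64IFD    = { 64, false, 64 };

/* Bytes stored (and later reloaded) for one task. */
static size_t context_bytes(struct hart_config c, bool save_fp)
{
    size_t xbytes = c.xlen_bits / 8;
    size_t gprs   = c.rv_e ? 15 : 31;        /* x0 never needs saving */
    size_t bytes  = gprs * xbytes + xbytes;  /* integer registers + pc */

    if (save_fp && c.flen_bits != 0)
        bytes += 32 * (c.flen_bits / 8) + 4; /* f0..f31 + 32-bit fcsr */
    return bytes;
}
```

Under these assumptions RV32E without FP needs 64 bytes per task, RV32I needs 128, and RV64 with D needs 516 when the FP state has to be saved but only 256 when it can be skipped.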


Liviu

J O

Oct 20, 2023, 10:16:25 AM
to Liviu Ionescu, JO, RISC-V SW Dev, rober...@gmail.com, tommy_...@hotmail.com
>> Is there any other safe method?

Some other possibilities being:

  1. Shadow register banks for both GPR and FPR. Have a build option for how many shadow banks (2, 4, 8, 16, 32). At 16 or 32 shadow banks the area and power will be high, but the tradeoff is that you are able to support 2, 4, 8, 16 or 32 contexts without having to save and restore. You just need a trigger from SW for HW to switch to the relevant bank.
  2. As alluded to in one of the previous replies, have custom instructions for the save/restore.
There are probably others as well ...
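
To put a rough number on the area side of option 1, here is a small sketch that counts the register storage bits for a given bank count. The function names are hypothetical, and it assumes each bank holds 31 integer registers plus 32 FP registers and nothing else (no CSRs, no pc); it says nothing about muxing, ports, or power.

```c
#include <assert.h>
#include <stdbool.h>

/* Storage bits for `banks` complete register sets (assumed: x1..x31 + f0..f31). */
static unsigned long long bank_storage_bits(unsigned banks,
                                            unsigned xlen_bits,
                                            unsigned flen_bits)
{
    unsigned long long per_bank = 31ULL * xlen_bits + 32ULL * flen_bits;
    return banks * per_bank;
}

/* With more live contexts than banks, software save/restore is still needed. */
static bool needs_software_save(unsigned live_contexts, unsigned banks)
{
    assert(banks > 0);
    return live_contexts > banks;
}
```

For RV32 with D, one register set is 3040 bits, so 32 banks come to 97280 bits (12160 bytes) of register storage, and any 33rd task still falls back to the ordinary save/restore path.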

Tks
JO



Best Regards

J.Osmany



Liviu Ionescu

Oct 20, 2023, 11:02:40 AM
to J O, JO, RISC-V SW Dev, rober...@gmail.com, tommy_...@hotmail.com


> On 20 Oct 2023, at 17:16, J O <liv...@hotmail.co.uk> wrote:
>
> >> Is there any other safe method?
>
> Some other possibilities being:
>
> - ... Shadow register banks ...
> - ... custom instructions for the save/restore.
> There are probably others as well ...

You mean custom hardware? Good luck with this approach!

This is a good example why writing software for bare-metal RISC-V will continue to be a nightmare in the foreseeable future.


Liviu


Jan Oleksiewicz

Oct 20, 2023, 9:18:34 PM
to RISC-V SW Dev, i...@livius.net, JO, RISC-V SW Dev, rober...@gmail.com, tommy_...@hotmail.com, livid16
> That doesn't sound like a great idea given that it would presumably restrict the hardware to the calling convention of a specific compiler, [RT]OS etc.?

FYI, I recently sent a "prestacked" annotation RFC to the LLVM list (which would cover whatever HW stacker, combined with whatever ABI, anyone ever invents).

> Shadow register banks for both GPR and FPR. Have a build option for how many shadow banks (2, 4,8,16,32). @ 16, 32 shadow banks, the area and power will be high, but the tradeoff being that you are able to support (2,4,8,16,32) contexts without having to save and restore. Just need a trigger from SW for HW to switch to the relevant bank

Two quantitative questions:
- What percentage of the time is your core going to spend in context switching? (worst case, average case)
- What would be the improvement from such a blind optimization?

I'm guessing that both are negligible.
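
Both questions reduce to simple arithmetic once you have numbers. A minimal sketch follows; the function names are hypothetical and every figure plugged into it is an assumption to be replaced with measurements from the actual system.

```c
#include <assert.h>

/* Fraction of core time spent switching: rate * cost / clock. */
static double switch_overhead_fraction(double switches_per_sec,
                                       double cycles_per_switch,
                                       double core_hz)
{
    return switches_per_sec * cycles_per_switch / core_hz;
}

/* Core time recovered if hardware cuts the per-switch cost. */
static double overhead_saved(double switches_per_sec,
                             double sw_cycles, double hw_cycles,
                             double core_hz)
{
    return switch_overhead_fraction(switches_per_sec, sw_cycles, core_hz) -
           switch_overhead_fraction(switches_per_sec, hw_cycles, core_hz);
}
```

As an illustration only: at 10 kHz switching, 200 cycles per switch, and a 100 MHz core, the overhead is 2% of the core's time, and cutting the switch to 20 cycles would recover 1.8%.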

If there is a task critical enough to justify those shadow regs, then you do it inside interrupts, which can be optimized at much lower cost (like in NVIC or TEIC).