Automation of RISC-V generation from spec

Yash Jha

Oct 6, 2025, 3:00:20 AM
to v8-dev
Hey v8 devs,
The simulator, the disassembler, and many header files in V8 could benefit from an automation tool that transforms information from the spec, potentially the RISC-V Unified Database, into the format V8 uses.
Starting with the disassembler: the code contains TODOs, for example, and could be refactored around macros that are reused throughout the code. This refactoring could be generated from the RISC-V Unified Database (UDB), a proven, ongoing effort to unify the RISC-V specification.
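
To make the idea concrete, here is a minimal sketch of what "generating macros from spec data" could look like. It is not V8 code and does not use UDB's actual schema or tooling: the record layout, the macro name `RISCV_GENERATED_OPCODE_LIST`, and both helper functions are hypothetical. Only the mask/match values for `add` and `mul` are taken from the ratified base and M-extension encodings.

```python
# Hypothetical, hand-written records standing in for data that a real
# pipeline would read from a spec database. The field names are assumptions.
INSTRUCTIONS = [
    {"name": "add", "mask": 0xFE00707F, "match": 0x00000033},
    {"name": "mul", "mask": 0xFE00707F, "match": 0x02000033},
]


def emit_macro_table(instructions):
    """Return the text of a C X-macro listing (name, mask, match) per instruction."""
    lines = ["#define RISCV_GENERATED_OPCODE_LIST(V) \\"]
    last = len(instructions) - 1
    for index, inst in enumerate(instructions):
        # Every entry except the last needs a line continuation.
        cont = " \\" if index < last else ""
        lines.append(
            f"  V({inst['name']}, 0x{inst['mask']:08x}, 0x{inst['match']:08x}){cont}"
        )
    return "\n".join(lines)


def decode(word, instructions):
    """Return the mnemonic whose mask/match pair accepts the 32-bit word, else None."""
    for inst in instructions:
        if (word & inst["mask"]) == inst["match"]:
            return inst["name"]
    return None


if __name__ == "__main__":
    print(emit_macro_table(INSTRUCTIONS))
    print(decode(0x00B50533, INSTRUCTIONS))  # encoding of: add a0, a0, a1
```

The same records drive both the emitted macro text and the reference decoder, which is the property a generated disassembler would rely on: one source of truth for the encodings.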

The new sub-category feature in the unified spec also provides human-readable generation in the required control flow format.

Moreover, automated codegen has also proved its reliability in the development of various RISC-V extensions (AES, Zfa, CMOs, Scalar Crypto, etc.). The automation pipeline could also be extended to the TurboFan micro assembler/codegen to help identify optimization techniques.

What do you (the developers) think of substituting some parts of the code with an automated system? Do you also believe this can support V8's motives and its connection with the ecosystem? I can already identify countless TODOs in the disassembler that might benefit from the UDB.

I pose this discussion as a mentee in the Linux Mentorship program. I would appreciate your thoughts on this bridge.
With warm regards,
Yash

Florian Loitsch

Oct 8, 2025, 1:52:13 AM
to v8-...@googlegroups.com
Hi Yash,

I'm currently one of the maintainers of the RISC-V port, but can't speak for the other maintainers, some of whom have worked on the RISC-V port for much longer than me.
From my point of view creating the (dis)assembler sources from the UDB seems like the correct approach, but it's not clear whether changing the current sources is necessary:
- the current (dis)assembler works and requires little maintenance effort. It's very rare that we start using a new instruction.
- the build system for something that creates sources from a DB automatically becomes more complicated. Few of us like to review build-system code.
- I'm worried that just reviewing the new code is more effort than any change we will make to that part of the code in the future.

That said, if the CLs (pull requests) are small and easy to review, I'm not against it. What we definitely don't want is a huge CL in some weeks that replaces thousands of lines of code with some other thousands of lines of code.

ji qiu

Oct 8, 2025, 12:43:51 PM
to v8-dev
Hi Yash,
Additionally, I'd like to add one more point: V8's backend does not support all RISC-V standard extensions. We only implement those extension instruction sets that can correspond to the machine IR generated after JavaScript and Wasm are lowered within V8. In other words, we only implement the extension instruction sets that help accelerate V8's execution, and there is no need to implement other redundant parts. Therefore, V8's current assembler, disassembler, and simulator do not actually need to handle all extensions.
For the corresponding functional modules generated by UDB, is there the possibility of fine-grained selection of which extensions to implement and which not to? When new extensions need to be added in the future, will it require overall modifications or just incremental additions? Does UDB have different versions due to its ongoing evolution, which may produce different results? These are all factors we need to consider.

Paul Clarke

Oct 28, 2025, 1:09:28 PM
to v8-dev
Thanks for the responses! I'm one of the maintainers of the UDB project and am serving as a mentor for Yash. I've added some comments inline below.

On Wednesday, October 8, 2025 at 11:43:51 AM UTC-5 qiuji.c...@gmail.com wrote:
Hi Yash,
Additionally, I'd like to add one more point: V8's backend does not support all RISC-V standard extensions. We only implement those extension instruction sets that can correspond to the machine IR generated after JavaScript and Wasm are lowered within V8. In other words, we only implement the extension instruction sets that help accelerate V8's execution, and there is no need to implement other redundant parts. Therefore, V8's current assembler, disassembler, and simulator do not actually need to handle all extensions.
For the corresponding functional modules generated by UDB, is there the possibility of fine-grained selection of which extensions to implement and which not to?

Yes. Configurability is built in. Here's a fairly minimal "RV32" configuration as an example: https://github.com/riscv-software-src/riscv-unified-db/blob/main/cfgs/rv32.yaml
 
When new extensions need to be added in the future, will it require overall modifications or just incremental additions?

You can just add the new extension to your configuration file.
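
As a rough illustration of what "incremental" means here, the sketch below filters a tiny instruction list by a set of enabled extensions. It does not use UDB's real configuration format or tooling: the `SPEC` records, the `select_instructions` helper, and the way the configuration is represented as a Python set are all assumptions made for the example.

```python
# Hypothetical instruction records; a real pipeline would read these from the
# spec database rather than hard-coding them.
SPEC = [
    {"name": "add", "extension": "I"},
    {"name": "mul", "extension": "M"},
    {"name": "sh1add", "extension": "Zba"},
    {"name": "aes64es", "extension": "Zkne"},
]


def select_instructions(spec, enabled_extensions):
    """Return the names of instructions whose extension is enabled, in spec order."""
    return [inst["name"] for inst in spec if inst["extension"] in enabled_extensions]


# A configuration that only enables what the backend currently uses.
before = select_instructions(SPEC, {"I", "M"})

# Enabling one more extension only adds entries; existing ones are unchanged.
after = select_instructions(SPEC, {"I", "M", "Zba"})

if __name__ == "__main__":
    print(before)
    print(after)
```

Under these assumptions, everything generated for the old configuration is still present in the new output, so a reviewer would only need to look at the added entries.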

 
Does UDB have different versions due to its ongoing evolution, which may produce different results?

I'd say no. There are efforts to finalize a "1.0" of the underlying schema, but the content is always based on the ratified specification, so the results won't change. The goal is to make adoption painless -- to generate content as close to what you have now as possible.

 
These are all factors we need to consider.

On Wednesday, October 8, 2025 at 13:52:13 UTC+8, Florian Loitsch wrote:
Hi Yash,

I'm currently one of the maintainers of the RISC-V port, but can't speak for the other maintainers, some of whom have worked on the RISC-V port for much longer than me.
From my point of view creating the (dis)assembler sources from the UDB seems like the correct approach, but it's not clear whether changing the current sources is necessary:
- the current (dis)assembler works and requires little maintenance effort. It's very rare that we start using a new instruction.
- the build system for something that creates sources from a DB automatically becomes more complicated. Few of us like to review build-system code.
- I'm worried that just reviewing the new code is more effort than any change we will make to that part of the code in the future.

That said, if the CLs (pull requests) are small and easy to review, I'm not against it. What we definitely don't want is a huge CL in some weeks that replaces thousands of lines of code with some other thousands of lines of code.

I completely understand. I wouldn't want significant (or any) disruption without significant value, either.

In the absence of a "no", it's on us to produce something for your consideration. We'll take a stab at it and come back. We may have questions as we proceed, but we hope not to become a nuisance.