How to add a custom instruction to the RISC-V GCC tools?

Tommy Murphy

Aug 3, 2018, 8:41:25 AM
to RISC-V SW Dev
I know that this (or something similar) has been asked a few times before but I have never seen a clear and authoritative answer.
(If one exists and has been posted then apologies for overlooking it and please just point me at it).
Is there any clear and authoritative guide to how to add a custom RISC-V instruction to the GCC (bare metal/newlib) toolchain?
If not then are there any specific links to help along the way to achieving that?

On the other hand perhaps this is not so much a RISC-V as a general GCC tools (gcc/gas, gdb, binutils, simulators...) question in which case I may be asking in the wrong place?

Thanks a lot
Regards
Tommy

Jim Wilson

Aug 3, 2018, 4:53:53 PM
to Tommy Murphy, RISC-V SW Dev
On Fri, Aug 3, 2018 at 5:41 AM, Tommy Murphy <tommy_...@hotmail.com> wrote:
> I know that this (or something similar) has been asked a few times before
> but I have never seen a clear and authoritative answer.
> (If one exists and has been posted then apologies for overlooking it and
> please just point me at it).
> Is there any clear and authoritative guide to how to add a custom RISC-V
> instruction to the GCC (bare metal/newlib) toolchain?
> If not then are there any specific links to help along the way to achieving
> that?

It is easier to add instructions to binutils than gcc. For binutils,
we have the .insn support that Kito Cheng added that lets you specify
each instruction field individually, and can be used for custom
instructions. If you want something more programmer friendly, then
you can add a line to opcodes/riscv-opc.c to add an instruction. Just
copy a similar instruction and modify the fields as appropriate. You
can find the match/mask patterns in include/opcode/riscv-opc.h. For
the operand letters, you can look at the code in gas/config/tc-riscv.c
riscv_ip() that handles them, e.g. search for 'p' to find the p
support, and note that the first one is for compressed support 'Cp'
and the second one is the plain 'p'. Adding instructions this way
will require some understanding of how binutils works, but the
assemble/disassemble stuff is pretty easy. It is the linker stuff
that is complicated. If you have your own instruction formats, and
need new relocations and/or relaxations, then that can get very
complicated very quickly, and I'm not going to try to explain that
here. The simulator is also pretty easy, though we have two of them,
the gdb simulator which is not upstream, and the QEMU simulator which
is upstream. Both should be pretty easy to modify. Oh, and I suppose
we also have spike, but I haven't looked at that one much.
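
As a rough illustration of what specifying "each instruction field individually" amounts to, the sketch below packs the fields of a 32-bit R-type instruction the way a GNU as directive of the form `.insn r opcode, funct3, funct7, rd, rs1, rs2` would be expected to. The function name encode_r_type is made up for this example and is not part of binutils; the field positions are those of the base RISC-V R-type format.

```c
#include <stdint.h>

/* Hypothetical helper (not a binutils function): pack R-type fields
   into one 32-bit instruction word.
   Layout: funct7[31:25] rs2[24:20] rs1[19:15] funct3[14:12] rd[11:7] opcode[6:0] */
uint32_t encode_r_type(uint32_t opcode, uint32_t funct3, uint32_t funct7,
                       uint32_t rd, uint32_t rs1, uint32_t rs2)
{
    return ((funct7 & 0x7fu) << 25) |
           ((rs2    & 0x1fu) << 20) |
           ((rs1    & 0x1fu) << 15) |
           ((funct3 & 0x7u)  << 12) |
           ((rd     & 0x1fu) << 7)  |
           (opcode  & 0x7fu);
}
```

For example, encode_r_type(0x33, 0, 0, 10, 11, 12) gives the word for the standard `add a0,a1,a2`; a custom instruction would use one of the opcode values reserved for custom use instead of 0x33.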

There is a binutils tool called cgen that lets you construct an
assembler from an architecture description file. This is an easier
way to go if you want to do a lot of architecture experimenting, but
it is not how the current assembler is written, and changing to a new
assembler design at this point would likely be painful. Embecosm
incidentally has a cgen RISC-V assembler port. I don't know if they
plan to release the sources for it, and I don't know how many existing
RISC-V binutils features are supported in it.

For gcc, the first question is what do you mean by gcc support. Are
you OK using an extended asm to hand code in the instruction? That is
trivial. Do you want an intrinsic that will generate the instruction
for you? This is not very hard. Do you want the compiler optimizer
to automatically generate the instruction? This is harder. You need
to add a pattern to the gcc/config/riscv/riscv.md file to describe the
instruction. If the instruction is performing a common operation,
then just adding the instruction pattern may be enough to get it
generated. You will have to spend some time debugging the compiler
to get the pattern details right so that it gets generated when
appropriate, but this is generally not too hard. If the instruction
is performing a less common operation, then you may have to do work on
target independent and/or target dependent optimization passes to get
the instruction to be generated, and this will require a lot of gcc
internals knowledge, and possibly a lot of time.
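
The "trivial" extended asm option can be sketched roughly as below. This is only an illustration under stated assumptions: the wrapper name custom0_op, the choice of the custom-0 opcode (0x0b) with funct3 = 0 and funct7 = 0, and the "add" behaviour of the non-RISC-V fallback are all invented for the example. It is a plain inline wrapper, not a real compiler builtin, and the optimizer will never generate the instruction on its own.

```c
#include <stdint.h>

/* Hypothetical wrapper around a custom R-type instruction.
   On RISC-V targets it emits the instruction through .insn, so the
   assembler does not need to know a mnemonic for it.
   Elsewhere it falls back to a software model (assumed semantics: a + b)
   so that the surrounding code can still be built and tested on a host. */
static inline uint32_t custom0_op(uint32_t a, uint32_t b)
{
#if defined(__riscv)
    uint32_t result;
    __asm__ volatile(".insn r 0x0b, 0x0, 0x00, %0, %1, %2"
                     : "=r"(result)        /* rd  */
                     : "r"(a), "r"(b));    /* rs1, rs2 */
    return result;
#else
    return a + b; /* placeholder semantics, assumption only */
#endif
}
```

Once a mnemonic has been added to opcodes/riscv-opc.c, the `.insn` string can be replaced by the mnemonic itself, with the same operand constraints.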

There are a number of GCC internals tutorials that have been written
by various people over the years. We have a link to some of them on
the gcc web site.
https://gcc.gnu.org/wiki/GettingStarted#Tutorials.2C_HOWTOs
GCC internals is always changing as development progresses, so some of
the info in these will be out-of-date. And of course there are lots
of text books that talk about compiler design and implementation if
you need a general introduction to compilers.

For binutils, it is a much smaller development community than gcc, and
the core developers tend to stay with it longer, so there is less
tutorial type info available. There is one on the web site
https://sourceware.org/binutils/binutils-porting-guide.txt
but it appears to be a brief high level description and maybe not very
useful to you. There aren't many textbooks that cover what binutils
does, but the linker part is the only part worthy of a textbook. For
that, I would suggest "Linkers and Loaders" by John R Levine. I
haven't actually read this, but I know a number of the people
mentioned in the Acknowledgements section and have heard good things
about the book.

Both binutils and gcc have mailing lists where you can ask questions
if you are serious about getting involved in development, and need
help understanding something. Usually the best way to get started is
to just pick a bug report or enhancement request, start reading
sources, try various ways to fix or implement it until you find a
solution, then ask on the mailing lists if you have a good
solution, and iterate until it is right, learning how the sources work
along the way. Then pick another one and repeat for a few years until
you are an expert.

Jim

Tommy Murphy

Aug 6, 2018, 8:48:00 AM
to RISC-V SW Dev, tommy_...@hotmail.com
Hi Jim

Apologies for the delay in replying.
Thanks a lot for that info - very useful as usual and that gives me more than enough orientation on how to proceed.
My focus here was on the simple case of a custom instruction exercised via inline asm or raw asm.
So nothing "inferred" by the compiler per se.

Thanks again
Regards
Tommy 

Muhammad Shami

Jul 8, 2020, 6:43:03 AM
to RISC-V SW Dev, ji...@sifive.com, RISC-V SW Dev, tommy_...@hotmail.com

Thanks Jim for this information. But I am confused about how the MASK_ and MATCH_ values are generated.
I could not understand any pattern in include/opcode/riscv-opc.h. One more thing: how do I write an intrinsic for the instruction in C code?

Nelson Chu

Jul 8, 2020, 7:30:03 AM
to Muhammad Shami, RISC-V SW Dev, ji...@sifive.com, tommy_...@hotmail.com
On Wed, Jul 8, 2020 at 6:43 PM Muhammad Shami <msha...@gmail.com> wrote:
> But I am confused about how _MASK and _match could be generated ?

These macros were generated by riscv-opcodes in the past, and can be
used by lots of tools, including spike. For now, binutils doesn't use
riscv-opcodes to generate these macros; they are added manually. The
old comments in include/opcode/riscv-opc.h have been removed and
replaced with new ones in FSF binutils.

> I could not understand any pattern in include/opcode/riscv-opc.h

You can see what insn->match_func does at the following links:
https://github.com/riscv/riscv-binutils-gdb/blob/riscv-binutils-2.34/gas/config/tc-riscv.c#L1455
https://github.com/riscv/riscv-binutils-gdb/blob/riscv-binutils-2.34/opcodes/riscv-opc.c#L84

The MASK_ macros mark the fixed bits of the instruction, and the
MATCH_ macros give the values those bits must have, which is how an
instruction is identified. Consider MATCH_ADD and MASK_ADD:

#define MATCH_ADD 0x33
#define MASK_ADD 0xfe00707f

The riscv-opcodes repository is also a useful reference:
https://github.com/riscv/riscv-opcodes/blob/master/opcodes-rv32i#L34

add rd rs1 rs2 31..25=0 14..12=0 6..2=0x0C 1..0=3

So the fixed bits for ADD are 31..25, 14..12, 6..2 and 1..0, which is
why MASK_ADD is 0xfe00707f. If those fixed bits of an instruction
word equal 0x33 (MATCH_ADD), then the instruction is ADD.
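
The derivation above can be reproduced mechanically. The following is a small sketch, not the riscv-opcodes generator itself: the struct and function names are invented for this example, and the only inputs are the fixed-field ranges from the `add` line quoted above.

```c
#include <stdint.h>

/* One fixed field from a riscv-opcodes style line, e.g. "6..2=0x0C". */
struct fixed_field {
    unsigned hi;      /* highest bit of the field */
    unsigned lo;      /* lowest bit of the field  */
    uint32_t value;   /* required value of the field */
};

/* Bit mask covering bits hi..lo inclusive. */
uint32_t field_mask(unsigned hi, unsigned lo)
{
    unsigned width = hi - lo + 1;
    uint32_t ones = (width >= 32) ? 0xffffffffu : ((1u << width) - 1u);
    return ones << lo;
}

/* OR together all fixed fields: mask marks which bits are fixed,
   match holds the values those bits must have. */
void mask_and_match(const struct fixed_field *fields, unsigned count,
                    uint32_t *mask, uint32_t *match)
{
    *mask = 0;
    *match = 0;
    for (unsigned i = 0; i < count; i++) {
        uint32_t m = field_mask(fields[i].hi, fields[i].lo);
        *mask  |= m;
        *match |= (fields[i].value << fields[i].lo) & m;
    }
}

/* add rd rs1 rs2 31..25=0 14..12=0 6..2=0x0C 1..0=3 */
const struct fixed_field add_fields[] = {
    {31, 25, 0x00}, {14, 12, 0x0}, {6, 2, 0x0c}, {1, 0, 0x3},
};
```

Calling mask_and_match(add_fields, 4, &mask, &match) yields mask 0xfe00707f and match 0x33, the same values as MASK_ADD and MATCH_ADD.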

Thanks
Nelson

Vivek Singh

Jan 16, 2022, 6:37:11 AM
to RISC-V SW Dev, nelso...@sifive.com, RISC-V SW Dev, ji...@sifive.com, tommy_...@hotmail.com, msha...@gmail.com
I wrote a blog post about this recently. It might be helpful for someone who stumbles upon this thread in the future:

https://medium.com/@viveksgt/adding-custom-instructions-compilation-support-to-riscv-toolchain-78ce1b6efcf4

tian abei

Jun 29, 2022, 12:19:17 AM
to RISC-V SW Dev, vive...@gmail.com
Hi Vivek, I was lucky enough to read this article, but an error occurred while I was trying to build the toolchain.

This is done as follows.

1. Add gcd and fact entries, and some macro definitions for MATCH and MASK to the files under the following path.

riscv-gnu-toolchain/riscv-binutils/opcodes/riscv-opc.c
riscv-gnu-toolchain/riscv-binutils/include/opcode/riscv-opc.h
riscv-gnu-toolchain/riscv-gdb/opcodes/riscv-opc.c
riscv-gnu-toolchain/riscv-gdb/include/opcode/riscv-opc.h

2. Use the following command to configure and compile.

./configure --prefix=/opt/riscv/ --with-arch=rv32imc --with-abi=ilp32
make -j12

I wonder if there are still some problems that I haven't noticed? For example, is there a requirement for the position of the new entry in the array riscv_opcodes[] and the position of the macro definition?

tian abei

Jun 29, 2022, 7:23:52 AM
to RISC-V SW Dev, vive...@gmail.com
Hi Vivek, I have succeeded. The new entries do have a location restriction. Currently I put the two new entries directly after the comment (/* name, xlen, isa, operands, match, mask, match_func, pinfo. */) and it no longer reports an error.

Hager Rafaat

Jun 15, 2023, 8:28:47 PM
to RISC-V SW Dev, vive...@gmail.com

Hi,

First of all, I'd like to apologize for bothering you, but I need your help urgently. I have seen your article "Adding Custom Instructions to the RISC-V GNU-GCC toolchain" (hsandid.github.io), and I'd like to thank you a lot for this precious information. For my graduation project I have to extend the RISC-V gcc compiler with custom instructions, such as a mov that takes the value in register rs2 and saves it in rd (asm: mov rd, rs2). It differs from the standard move already in RISC-V in that it must not be a pseudo-instruction (it must not be converted to addi).

I made the following modifications. In riscv-opc.c I added this line:

{"mov", 0, INSN_CLASS_I, "d,t", MATCH_MOV, MASK_MOV, match_opcode, 0},

In riscv-opc.h I added these lines:

#define MATCH_MOV 0x1F80
#define MASK_MOV 0xFE00707F

and

DECLARE_INSN(mov, MATCH_MOV, MASK_MOV)

In riscv.md I added:

(define_insn "mov"
  [(set (match_operand:SI 0 "register_operand" "=r")
        (match_operand:SI 1 "register_operand" "r"))]
  ""
  "mov\t%0, %1"
)

But it gives me the following error:

Assembler messages:
Error: internal: bad RISC-V opcode (mask error): mov d,s
Fatal error: internal: broken assembler. No assembly attempted
make[6]: *** [Makefile:397: lib_a-chk_fail.o] Error 1
make[6]: Leaving directory '/home/eman/riscv32_toolchain/build-newlib/riscv32-unknown-elf/newlib/libc/ssp'
Making all in .
make[6]: Entering directory '/home/eman/riscv32_toolchain/build-newlib/riscv32-unknown-elf/newlib/libc'
rm -f libc.a
rm -rf tmp
mkdir tmp
cd tmp; \
for i in argz/lib.a stdlib/lib.a ctype/lib.a search/lib.a stdio/lib.a string/lib.a signal/lib.a time/lib.a locale/lib.a reent/lib.a errno/lib.a misc/lib.a ssp/lib.a syscalls/lib.a machine/lib.a ; do \
riscv32-unknown-elf-ar x ../$i; \
done; \
riscv32-unknown-elf-ar rc ../libc.a *.o
riscv32-unknown-elf-ar: ../argz/lib.a: No such file or directory
riscv32-unknown-elf-ar: ../stdlib/lib.a: No such file or directory
riscv32-unknown-elf-ar: ../ctype/lib.a: No such file or directory
riscv32-unknown-elf-ar: ../search/lib.a: No such file or directory
riscv32-unknown-elf-ar: ../stdio/lib.a: No such file or directory
riscv32-unknown-elf-ar: ../string/lib.a: No such file or directory
riscv32-unknown-elf-ar: ../signal/lib.a: No such file or directory
riscv32-unknown-elf-ar: ../time/lib.a: No such file or directory
riscv32-unknown-elf-ar: ../locale/lib.a: No such file or directory
riscv32-unknown-elf-ar: ../reent/lib.a: No such file or directory
riscv32-unknown-elf-ar: ../errno/lib.a: No such file or directory
riscv32-unknown-elf-ar: ../misc/lib.a: No such file or directory
riscv32-unknown-elf-ar: ../ssp/lib.a: No such file or directory
riscv32-unknown-elf-ar: ../syscalls/lib.a: No such file or directory
riscv32-unknown-elf-ar: ../machine/lib.a: No such file or directory
riscv32-unknown-elf-ar: *.o: No such file or directory
make[6]: *** [Makefile:1034: libc.a] Error 1
make[6]: Leaving directory '/home/eman/riscv32_toolchain/build-newlib/riscv32-unknown-elf/newlib/libc'
make[5]: *** [Makefile:683: all-recursive] Error 1
make[5]: Leaving directory '/home/eman/riscv32_toolchain/build-newlib/riscv32-unknown-elf/newlib/libc'
make[4]: *** [Makefile:641: all-recursive] Error 1
make[4]: Leaving directory '/home/eman/riscv32_toolchain/build-newlib/riscv32-unknown-elf/newlib'
make[3]: *** [Makefile:452: all] Error 2
make[3]: Leaving directory '/home/eman/riscv32_toolchain/build-newlib/riscv32-unknown-elf/newlib'
make[2]: *** [Makefile:8492: all-target-newlib] Error 2
make[2]: Leaving directory '/home/eman/riscv32_toolchain/build-newlib'
make[1]: *** [Makefile:879: all] Error 2
make[1]: Leaving directory '/home/eman/riscv32_toolchain/build-newlib'
make: *** [Makefile:606: stamps/build-newlib] Error 2

Vivek Singh

Jun 16, 2023, 8:59:10 AM
to RISC-V SW Dev, hage...@gmail.com, Vivek Singh

Hello Hager,
Question: Have you been able to successfully rebuild the cross-compiler after you made the changes above?
Since you won't be using rs1, bits 15-19 should be set to 1 in the mask, so your mask should be 0xfe0ff07f.
Second, for a 32-bit instruction the two least significant bits should be set to 11, while your match value is 0x1f80.
Choose mask and match values that are consistent with each other.
 
The rest of the changes look okay. Rebuild the cross-compiler after making these changes.
And try compiling your code again.
Hope this helps.
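
One way to sanity-check a MATCH_/MASK_ pair before rebuilding is sketched below. It rests on an assumption that is not confirmed in this thread: that the assembler's "mask error" is reported when the match value has bits set outside the mask. The function names are invented for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bits that are set in match but not covered by mask.
   Under the assumption stated above, any non-zero result means the
   table entry is inconsistent. */
uint32_t stray_match_bits(uint32_t match, uint32_t mask)
{
    return match & ~mask;
}

/* True when every bit set in match is also a fixed bit in mask. */
bool match_fits_mask(uint32_t match, uint32_t mask)
{
    return stray_match_bits(match, mask) == 0;
}
```

With the values quoted above, MATCH_ADD 0x33 fits MASK_ADD 0xfe00707f, while 0x1F80 leaves stray bits 0x0f80 under either 0xFE00707F or 0xfe0ff07f. Those stray bits are bits 7-11, the rd field, which the "d" operand fills in and which therefore cannot be part of the match.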

Hager Rafaat

Jun 16, 2023, 9:42:09 AM
to RISC-V SW Dev, vive...@gmail.com, Hager Rafaat
Hello Vivek, first I'd like to thank you for your response. I set the match (opcode) to 0x1f83 and the mask to 0xfe0ff07f, but it still gives me the same error.
Can you please help me?

Vivek Singh

Jun 21, 2023, 4:16:46 PM
to RISC-V SW Dev, hage...@gmail.com, Vivek Singh
I don't think your "match" value satisfies the following logic

(opcode ^ mask == match)

I tried with match 0x27 and I was able to build the cross compiler.
And was able to successfully compile the test code. 


Changes
riscv-opc.c 

{"mov_c", 0, INSN_CLASS_I,"d,t", MATCH_MOV_C,MASK_MOV_C,match_opcode,0},

riscv-opc.h
#define MATCH_MOV_C 0x27
#define MASK_MOV_C 0xfe0ff07f

Then I rebuilt the cross-compiler and generated an ELF for the following code:
#include <stdint.h>

int main (void)
{
asm volatile("mov_c t3, t4");
return 0;
}


And the compiler was able to generate code with the "mov_c" instruction

018c <main>:
   1018c: ff010113           add sp,sp,-16
   10190: 00812623           sw s0,12(sp)
   10194: 01010413           add s0,sp,16
   10198: 01d00e27           mov_c t3,t4
   1019c: 00000793           li a5,0
   101a0: 00078513           mv a0,a5
   101a4: 00c12403           lw s0,12(sp)
   101a8: 01010113           add sp,sp,16
   101ac: 00008067           ret

Vivek Singh

Jun 21, 2023, 5:57:09 PM
to RISC-V SW Dev, Vivek Singh, hage...@gmail.com
Correction:
In the Previous message the correct condition should be
((opcode & mask) == match)
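
As a quick check of that condition, the sketch below applies it to the 32-bit word 01d00e27 from the disassembly posted earlier. The macro values are the ones from that message; the helper names are invented for this example and are not binutils functions.

```c
#include <stdbool.h>
#include <stdint.h>

#define MATCH_MOV_C 0x27u
#define MASK_MOV_C  0xfe0ff07fu

/* The corrected condition: the fixed bits of the word equal the match value. */
bool is_mov_c(uint32_t insn)
{
    return (insn & MASK_MOV_C) == MATCH_MOV_C;
}

/* Operand fields left free by the mask: rd in bits 11..7, rs2 in bits 24..20. */
unsigned rd_of(uint32_t insn)
{
    return (insn >> 7) & 0x1fu;
}

unsigned rs2_of(uint32_t insn)
{
    return (insn >> 20) & 0x1fu;
}
```

For 0x01d00e27, is_mov_c returns true, rd_of gives 28 (t3) and rs2_of gives 29 (t4), which agrees with the `mov_c t3,t4` line in the listing.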

hameeza ahmed

Feb 1, 2024, 4:04:18 AM
to RISC-V SW Dev, vive...@gmail.com
Hello Vivek,
Thank you for such a detailed tutorial. I followed it and added a custom instruction, but I am unable to get it working with gdb. You haven't made any changes in the linker. Do we need to do that to make the debugger work?
Also, apart from binutils, could you explain how to add custom instructions to gcc by implementing patterns in the .md file? I believe the actual RISC-V instructions are implemented in core gcc.

Thank You

Tommy Murphy

Feb 1, 2024, 4:40:05 AM
to hameeza ahmed, RISC-V SW Dev, vive...@gmail.com
> But unable to work it with gdb

What exactly do you mean by this?

Did you make the relevant changes in both copies of the `binutils-gdb` sources used by the `riscv-gnu-toolchain`? See here for example:

hameeza ahmed

Feb 1, 2024, 6:15:32 AM
to RISC-V SW Dev, Jim Wilson, RISC-V SW Dev, tommy_...@hotmail.com
Hello,
I want to add an intrinsic to the gcc toolchain that will generate the assembly instruction. As you said, I need
to add a pattern to the gcc/config/riscv/riscv.md file to describe the
instruction. Can you please help me? Is there any tutorial that you can recommend for this?

Tommy Murphy

Feb 1, 2024, 6:36:06 AM
to hameeza ahmed, RISC-V SW Dev, Jim Wilson, RISC-V SW Dev
If you search the open and closed issues here you may find some useful info. 


The upstream GCC forums are also a good place to look for advice:
