riscv-opcodes - distinguishing instruction set


Michael Clark

Feb 22, 2016, 4:35:14 AM
to RISC-V ISA Specification Discussion
Hi All,

I'm writing some code that reflects on riscv-opcodes to auto-generate
headers and code. However, I'm missing some information that would let me
automate the task completely, in particular whether an instruction
belongs to RV32, RV64 or both. I want to avoid hand-coding this.

Note also that the parse-opcodes Python script
<https://github.com/riscv/riscv-opcodes/blob/master/parse-opcodes>
contains the instruction set groupings and some of the encoding type
information (not all of the information is in the opcodes files), and
that the RV128 instructions are not yet present in the opcodes files. I
suspect the latter may not be a pressing issue.

The issues I am having are:

* instruction set membership (RV32, RV64; I, M, A, F, D) cannot be
reflected
* shift instruction mask sizes can't be distinguished between RV32 and RV64
* compressed instructions have almost no encoding type information at all
* CSR listings are not available in a parsable format

Initially I am wondering about adding tags to riscv-opcodes so that
instruction set membership can be reflected (this is my immediate problem):

* rv32i
* rv64i
* rv32m
* rv64m
* rv32a
* rv64a
* rv32f
* rv64f
* rv32d
* rv64d

Instead of:

addi rd rs1 imm12 14..12=0 6..2=0x04 1..0=3
addiw rd rs1 imm12 14..12=0 6..2=0x06 1..0=3
slli rd rs1 31..26=0 shamt 14..12=1 6..2=0x04 1..0=3
@slli.rv32 rd rs1 31..25=0 shamtw 14..12=1 6..2=0x04 1..0=3

We could have:

addi rd rs1 imm12 14..12=0 6..2=0x04 1..0=3 rv32i rv64i
addiw rd rs1 imm12 14..12=0 6..2=0x06 1..0=3 rv64i
slli rd rs1 31..25=0 shamt5 14..12=1 6..2=0x04 1..0=3 rv32i
slli rd rs1 31..26=0 shamt6 14..12=1 6..2=0x04 1..0=3 rv64i
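
To make the proposal concrete, here is a minimal sketch of how a tool might consume lines carrying such tags. It assumes the trailing `rv32i`-style tags suggested above, which riscv-opcodes does not currently accept. The names `OpcodeEntry`, `is_isa_tag`, `apply_constraint`, `parse_line`, `in_isa` and `example` are hypothetical and invented for illustration. The sketch also folds the `hi..lo=value` constraints into a mask/match pair, so that differing shift encodings would show up as different masks.

```cpp
#include <cassert>
#include <cctype>
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

// One parsed line of a hypothetically tagged opcodes file.
struct OpcodeEntry {
    std::string name;                   // mnemonic, e.g. "addiw"
    std::vector<std::string> operands;  // e.g. "rd", "rs1", "imm12"
    std::vector<std::string> isa;       // membership tags, e.g. "rv64i"
    std::uint32_t mask = 0;             // bits fixed by the encoding
    std::uint32_t match = 0;            // required values of those bits
};

// True for tokens shaped like "rv32i": "rv", one or more digits, one letter.
inline bool is_isa_tag(const std::string& tok) {
    if (tok.size() < 4 || tok.compare(0, 2, "rv") != 0) return false;
    std::size_t i = 2;
    while (i < tok.size() && std::isdigit(static_cast<unsigned char>(tok[i]))) ++i;
    return i > 2 && i + 1 == tok.size() &&
           std::isalpha(static_cast<unsigned char>(tok[i])) != 0;
}

// Folds "hi..lo=value" or "bit=value" into mask/match.
// Returns false if the token is not a bit constraint.
inline bool apply_constraint(const std::string& tok, OpcodeEntry& e) {
    const std::size_t eq = tok.find('=');
    if (eq == std::string::npos) return false;
    const std::size_t dots = tok.find("..");
    const bool range = dots != std::string::npos && dots < eq;
    const int hi = std::stoi(tok.substr(0, range ? dots : eq));
    const int lo = range ? std::stoi(tok.substr(dots + 2, eq - dots - 2)) : hi;
    const int width = hi - lo + 1;
    const std::uint32_t field = width >= 32 ? 0xFFFFFFFFu : (1u << width) - 1u;
    // Base 0 lets stoul accept both decimal ("3") and hex ("0x04") values.
    const std::uint32_t value =
        static_cast<std::uint32_t>(std::stoul(tok.substr(eq + 1), nullptr, 0));
    e.mask |= field << lo;
    e.match |= (value & field) << lo;
    return true;
}

// Splits a line into mnemonic, operands, bit constraints and ISA tags.
// A blank line yields an entry with an empty name.
inline OpcodeEntry parse_line(const std::string& line) {
    OpcodeEntry e;
    std::istringstream in(line);
    std::string tok;
    if (!(in >> e.name)) return e;
    while (in >> tok) {
        if (is_isa_tag(tok)) e.isa.push_back(tok);
        else if (!apply_constraint(tok, e)) e.operands.push_back(tok);
    }
    return e;
}

inline bool in_isa(const OpcodeEntry& e, const std::string& tag) {
    for (const std::string& t : e.isa) if (t == tag) return true;
    return false;
}

// Usage sketch: addiw tagged as RV64-only.
inline void example() {
    const OpcodeEntry addiw =
        parse_line("addiw rd rs1 imm12 14..12=0 6..2=0x06 1..0=3 rv64i");
    assert(in_isa(addiw, "rv64i") && !in_isa(addiw, "rv32i"));
    assert(addiw.mask == 0x707Fu && addiw.match == 0x1Bu);
}
```

Because the tags are recognised purely by shape and stripped before anything else is interpreted, a parser that ignores them would see the same fields as before, which is one way to keep the existing output unchanged.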

Just a thought at present. Let me know if you think it is a good idea
and I can work on it in my spare time. There are also the RV128
instructions to add...

Regards,
Michael.


Samuel Falvo II

Feb 22, 2016, 2:48:57 PM
to Michael Clark, RISC-V ISA Specification Discussion
I seem to recall that there is a M-mode register which has flag bits
indicating what instruction set the processor currently supports.
`mid`, IIRC.

--
Samuel A. Falvo II

Samuel Falvo II

Feb 22, 2016, 2:52:06 PM
to RISC-V ISA Dev, michae...@mac.com, isa...@lists.riscv.org
Whoops; never mind. I thought you were looking for run-time information, but you said you were looking at compiler inputs instead. I somehow missed that on first reading. Sorry.

Colin Schmidt

Feb 22, 2016, 4:01:21 PM
to Michael Clark, RISC-V ISA Specification Discussion
Identifying ISA subsets for each instruction seems like a useful thing to add; it may even simplify some of the code in parse-opcodes.

It would be important to make sure the current output does not change based on this new feature.

Michael Clark

Feb 28, 2016, 3:17:21 AM
to Colin Schmidt, RISC-V ISA Specification Discussion, RISC-V Software Developers
Hi,

Sorry for not replying sooner.

I intend to work on a candidate patch to add "instruction set membership" to riscv-opcodes and will, as you mention, test the output of the Python programs to make sure there are no regressions. I did not get as far as I had hoped this weekend, but I will get to it eventually, as I need this additional data for a translator framework I am working on.

Presently I have been working on decoder template meta-programs with the intention of maintaining "semantic symmetry" with the RISC-V specification notation. The idea is to make updating code from the specification easier and less error-prone. This weekend I worked on a new method of immediate decoding as part of that translator framework.

Using C++11 template meta-programming (variadic templates) allowed the construction of an immediate decoder notation that is relatively similar to the specification notation and concise enough that it can be easily transcribed while also creating optimized assembly language. The meta-program is evaluated during compilation and it requires no constants or run-time support. The beauty is fast code generated from the spec. I will also be happy to contribute code to Spike when I get time.

Here is the immediate decoder meta-program:

  https://gist.github.com/michaeljclark/5b4d2c40d14c23c5d77b

Here is the assembly it produces (GCC produces odd sign extension for CJ):

  https://gist.github.com/michaeljclark/3f41c03b952fdf8b0d03

Here is an example of the immediate decoder notation using the immediate formats from page 9 of the riscv-compressed-spec-v1.9:

/*
 *      12         10  6               2
 * CB   offset[8|4:3]  offset[7:6|2:1|5]
 */
typedef imm_t<9, S<12,10, B<8>,B<4,3>>, S<6,2, B<7,6>,B<2,1>,B<5>>> CB;

/*
 *      12                          2
 * CJ   offset[11|4|9:8|10|6|7|3:1|5]
 */
typedef imm_t<12, S<12,2, B<11>,B<4>,B<9,8>,B<10>,B<6>,B<7>,B<3,1>,B<5>>> CJ;

The template meta-program expands like this:

/* CB format */
imm_t<9
	S<12,10
		B<8>	(inst >> 4) & 0b00000000000000000000000100000000
		B<4:3>	(inst >> 7) & 0b00000000000000000000000000011000
	>
	S<6,2
		B<7:6>	(inst << 1) & 0b00000000000000000000000011000000
		B<2:1>	(inst >> 2) & 0b00000000000000000000000000000110
		B<5>	(inst << 3) & 0b00000000000000000000000000100000
	>
>

/* CJ format */
imm_t<12
	S<12,2
		B<11>	(inst >> 1) & 0b00000000000000000000100000000000
		B<4>	(inst >> 7) & 0b00000000000000000000000000010000
		B<9:8>	(inst >> 1) & 0b00000000000000000000001100000000
		B<10>	(inst << 2) & 0b00000000000000000000010000000000
		B<6>	(inst >> 1) & 0b00000000000000000000000001000000
		B<7>	(inst << 1) & 0b00000000000000000000000010000000
		B<3:1>	(inst >> 2) & 0b00000000000000000000000000001110
		B<5>	(inst << 3) & 0b00000000000000000000000000100000
	>
>
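
The gists themselves are not reproduced in this thread. As a rough illustration of how notation like this can be made to compile, the following is a minimal C++11 sketch. The internals of `B`, `S` and `imm_t` shown here are an assumption reconstructed from the expansion above, not the code in the gist; only the two typedefs are taken from the post. Each `B` segment consumes source bits from the top of its `S` slice downward, and the shift and mask for each segment are computed at compile time.

```cpp
#include <cstdint>

using std::int32_t;
using std::uint32_t;

// B<hi, lo>: immediate bits hi..lo (inclusive); B<n> is the single bit n.
template <int Hi, int Lo = Hi>
struct B {
    static constexpr int lo = Lo;
    static constexpr int width = Hi - Lo + 1;
};

// S<msb, lsb, B...>: instruction bits msb..lsb, consumed from the top down
// by the listed immediate segments.
template <int Msb, int Lsb, typename... Segs>
struct S;

// No segments left: every bit of msb..lsb must have been consumed.
template <int Msb, int Lsb>
struct S<Msb, Lsb> {
    static_assert(Msb == Lsb - 1, "segments must exactly cover msb..lsb");
    static constexpr uint32_t decode(uint32_t) { return 0; }
};

template <int Msb, int Lsb, typename First, typename... Rest>
struct S<Msb, Lsb, First, Rest...> {
    static constexpr int src_lo = Msb - First::width + 1;  // lowest source bit
    static constexpr int rshift = src_lo > First::lo ? src_lo - First::lo : 0;
    static constexpr int lshift = First::lo > src_lo ? First::lo - src_lo : 0;
    static constexpr uint32_t mask = ((1u << First::width) - 1u) << First::lo;

    // Move the source bits to their immediate position, mask, then recurse.
    static constexpr uint32_t decode(uint32_t inst) {
        return (((inst >> rshift) << lshift) & mask) |
               S<Msb - First::width, Lsb, Rest...>::decode(inst);
    }
};

// imm_t<width, S...>: ORs all slices together and sign-extends from `width` bits.
template <int Width, typename... Slices>
struct imm_t {
    static constexpr uint32_t raw(uint32_t) { return 0; }
};

template <int Width, typename First, typename... Rest>
struct imm_t<Width, First, Rest...> {
    static constexpr uint32_t sign = 1u << (Width - 1);

    static constexpr uint32_t raw(uint32_t inst) {
        return First::decode(inst) | imm_t<Width, Rest...>::raw(inst);
    }
    // Xor/subtract sign extension in unsigned arithmetic. The final cast is
    // implementation-defined before C++20 but wraps on mainstream compilers.
    static constexpr int32_t decode(uint32_t inst) {
        return static_cast<int32_t>((raw(inst) ^ sign) - sign);
    }
};

// The two formats exactly as written in the post.
typedef imm_t<9, S<12,10, B<8>,B<4,3>>, S<6,2, B<7,6>,B<2,1>,B<5>>> CB;
typedef imm_t<12, S<12,2, B<11>,B<4>,B<9,8>,B<10>,B<6>,B<7>,B<3,1>,B<5>>> CJ;

static_assert(CB::decode(1u << 12) == -256, "inst[12] is offset[8], the CB sign bit");
static_assert(CJ::decode(1u << 8) == 1024, "inst[8] is offset[10]");
```

Since every function is `constexpr`, the checks at the end run during compilation, and a slice whose segments do not add up to its declared bit range fails to compile rather than decoding incorrectly.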

The idea is to be able to translate as much as possible as closely as possible from the specification using meta-programming. In a couple more weekends I should have full encode / decode completed for the entire ISA and will be able to start looking at binary translation (my interest). Just thought someone might find these templates useful. The gists are public domain.

Regards,
Michael.

Andrew Waterman

Feb 28, 2016, 3:27:02 AM
to Michael Clark, Colin Schmidt, RISC-V ISA Specification Discussion, RISC-V Software Developers
This sounds to me like the implementation of a simulator's decoder,
rather than something that belongs in riscv-opcodes, which by design
is brain-dead.

I'm quite interested to hear how the template metaprogramming approach
performs in practice for instruction decoding.

Michael Clark

Feb 28, 2016, 3:39:39 AM
to Andrew Waterman, Colin Schmidt, RISC-V ISA Specification Discussion, RISC-V Software Developers
Hi Andrew,

Yes, it's quite interesting. If we're careful to check the assembler output and benchmark it, I think we can make something that performs very well. I will share progress and ideas over time... I would like to spend some time on Spike...

Regards,
Michael
