how to handle conflicting/overlapping ISA encodings with custom instruction extensions

Guy Lemieux

Apr 11, 2018, 8:37:46 AM

to RISC-V ISA Dev

This question may seem a bit out there.... but given the flexibility
of RISC-V, should we prepare for the eventuality where there are
multiple *custom* instruction set extensions that have conflicting ISA
encodings?

Will this ever happen in practice? Well, I can imagine two scenarios...

Scenario1: As a CPU designer, I may have developed two different
custom instruction set extensions, possibly version 1 and version 2,
or possibly just type A and type B (different application domains),
and I want them both to co-exist on my future RISC-V cores. However, I
know they won't ever be used at the same time, and so to keep my ISA
encoding as compact as possible, I want a way to "switch" between
using these two extensions.

Scenario2: Extending this a bit further, perhaps a CPU vendor wants to
create as general a piece of silicon as possible, and support
multiple different custom instruction extensions by different third
parties. These may come from multiple/conflicting/different proposals
for bit manipulation, for example, such as one which comes from
BigCompanyA, while another that comes from CommunityProposalB, and a
third that comes from MedCompanyC.

It *is probably* reasonable to assume that once a process is launched,
it cannot switch between these. Or, if it does switch, the cost of
switching may be very high (complete pipeline+cache flush+more).

If someone can make a case that *rapid* change is necessary, where we
switch ISA subsets on the fly (possibly followed by a FENCE), then
this should also be considered.

Should we consider an official generic way to select among these
possibilities? or is it already part of the CSSRs? (sorry but I
haven't studied CSSRs at all)

Thanks,
Guy

Liviu Ionescu

Apr 11, 2018, 8:55:18 AM

to Guy Lemieux, RISC-V ISA Dev

On 11 April 2018 at 15:37:46, Guy Lemieux (glem...@vectorblox.com) wrote:

> or is it already part of the CSSRs? (sorry but I
> haven't studied CSSRs at all)

Do you mean custom CSRs? Please note that support for CSRs in existing
development tools is already problematic, and the more you abuse them
with custom usage, the more problems you'll have.

Regards,

Liviu

Guy Lemieux

Apr 11, 2018, 8:58:13 AM

to Liviu Ionescu, RISC-V ISA Dev

> Do you mean custom CSRs? Please note that support for CSRs in existing
> development tools is already problematic, and the more you abuse them
> with custom usage, the more problems you'll have.

I'm sure it can be done with custom CSRs. However, I was wondering if
we should standardize the approach taken so this becomes the
non-custom part, and thereby allows things like OSs to do a proper
context switch even if it isn't aware of the actual purpose of the CSR
in use.

I haven't thought this idea through too much... but wanted to ask
opinions before I totally forgot about it...

Guy

kr...@berkeley.edu

Apr 11, 2018, 10:34:21 AM

to Guy Lemieux, RISC-V ISA Dev

>>>>> On Wed, 11 Apr 2018 05:37:03 -0700, Guy Lemieux <glem...@vectorblox.com> said:
| This question may seem a bit out there.... but given the flexibility
| of RISC-V, should we prepare for the eventuality where there are
| multiple *custom* instruction set extensions that have conflicting ISA
| encodings?

| Will this ever happen in practice? Well, I can imagine two scenarios...

This could certainly happen, as the custom space is expressly designed
not to be regulated.

| Scenario1: As a CPU designer, I may have developed two different
| custom instruction set extensions, possibly version 1 and version 2,
| or possibly just type A and type B (different application domains),
| and I want them both to co-exist on my future RISC-V cores. However, I
| know they won't ever be used at the same time, and so to keep my ISA
| encoding as compact as possible, I want a way to "switch" between
| using these two extensions.

| Scenario2: Extending this a bit further, perhaps a CPU vendor wants to
| create as general of a piece of silicon as possible, and support
| multiple different custom instruction extensions by different third
| parties. These may come from multiple/conflicting/different proposals
| for bit manipulation, for example, such as one which comes from
| BigCompanyA, while another that comes from CommunityProposalB, and a
| third that comes from MedCompanyC.

| It *is probably* reasonable to assume that once a process is launched,
| it cannot switch between these. Or, if it does switch, the cost of
| switching may be very high (complete pipeline+cache flush+more).

| If someone can make a case that *rapid* change is necessary, where we
| switch ISA subsets on the fly (possibly followed by a FENCE), then
| this should also be considered.

| Should we consider an official generic way to select among these
| possibilities? or is it already part of the CSSRs? (sorry but I
| haven't studied CSSRs at all)

It is certainly possible to add a CSR that changes instruction
encoding on the fly. We already have the possibility of a writable
misa to enable/disable extensions on the fly or to change base ISA.
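
A minimal sketch of what a writable misa exposes to software, assuming the field layout from the privileged spec (MXL in the top two bits, one bit per extension letter in bits 25..0). The helper names and the sample RV64IMAC value are illustrative only, and since misa is WARL, real hardware may ignore a write that this sketch would happily compute:

```python
# Sketch: decode a misa value into base ISA width and extension letters.
# decode_misa / clear_extension are hypothetical helper names.

MXL_TO_XLEN = {1: 32, 2: 64, 3: 128}


def decode_misa(misa, xlen):
    """Return (base ISA width, extension letters) for misa read on an xlen-bit hart."""
    mxl = (misa >> (xlen - 2)) & 0b11
    letters = "".join(
        chr(ord("A") + bit) for bit in range(26) if (misa >> bit) & 1
    )
    return MXL_TO_XLEN.get(mxl, 0), letters


def clear_extension(misa, letter):
    """Value software would write back to request that one extension be disabled."""
    return misa & ~(1 << (ord(letter.upper()) - ord("A")))


# Hypothetical RV64IMAC hart.
sample = (2 << 62) | sum(1 << (ord(c) - ord("A")) for c in "IMAC")
print(decode_misa(sample, 64))
print(decode_misa(clear_extension(sample, "C"), 64))
```

Clearing the C bit here only computes the value to write; whether the extension actually turns off is up to the implementation.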

But you'd want to make sure most common use cases don't need this.
Moving the custom extensions into the standard ISA encoding space,
even if it means 48-bit or longer encodings, is probably preferable to
any kind of dynamic switching in practice. For many custom
extensions, code size is probably not the primary concern, as overall
application's code size will be dominated by all the standard code.
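
For reference, a sketch of how a decoder can tell 16/32/48/64-bit and longer encodings apart from the first 16-bit parcel, assuming the variable-length encoding scheme described in the base ISA spec. The function name is made up, and the formats beyond 32 bits were not frozen at the time, so treat those branches as provisional:

```python
# Sketch: instruction length from the low 16 bits of an instruction.
from typing import Optional


def instruction_length_bits(parcel: int) -> Optional[int]:
    """Return the instruction length in bits, or None for the reserved pattern."""
    if (parcel & 0b11) != 0b11:
        return 16                      # compressed (C) encodings
    if (parcel & 0b11100) != 0b11100:
        return 32                      # standard 32-bit encodings
    if (parcel & 0b111111) == 0b011111:
        return 48
    if (parcel & 0b1111111) == 0b0111111:
        return 64
    nnn = (parcel >> 12) & 0b111       # length field of the longer formats
    if nnn != 0b111:
        return 80 + 16 * nnn
    return None                        # reserved for 192 bits and beyond


for parcel in (0x4501, 0x0513, 0x001F, 0x003F, 0x007F):
    print(hex(parcel), instruction_length_bits(parcel))
```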

Krste

| Thanks,
| Guy

| --
| You received this message because you are subscribed to the Google Groups "RISC-V ISA Dev" group.
| To unsubscribe from this group and stop receiving emails from it, send an email to isa-dev+u...@groups.riscv.org.
| To post to this group, send email to isa...@groups.riscv.org.
| Visit this group at https://groups.google.com/a/groups.riscv.org/group/isa-dev/.
| To view this discussion on the web visit https://groups.google.com/a/groups.riscv.org/d/msgid/isa-dev/CALo5CZxocJKFVrF9vaMObm_XV%3D2jdHv97sh8R_yuGOfwP7e94g%40mail.gmail.com.

Guy Lemieux

Apr 11, 2018, 10:51:08 AM

to kr...@berkeley.edu, RISC-V ISA Dev

On Wed, Apr 11, 2018 at 7:34 AM <kr...@berkeley.edu> wrote:

> This could certainly happen, as the custom space is expressly designed
> not to be regulated.
>
> It is certainly possible to add a CSR that changes instruction
> encoding on the fly. We already have the possibility of a writable
> misa to enable/disable extensions on the fly or to change base ISA.
>
> But you'd want to make sure most common use cases don't need this.
> Moving the custom extensions into the standard ISA encoding space,
> even if it means 48-bit or longer encodings, is probably preferable to
> any kind of dynamic switching in practice. For many custom
> extensions, code size is probably not the primary concern, as overall
> application's code size will be dominated by all the standard code.

But if it is unregulated, then collisions can occur, even if using 48b or 64b encodings.

Not to mention there may be a deliberate reason for overlap, e.g., between a vendor's own version 1 and 2 of custom extensions, where there may be a mutual-exclusion incompatibility or a provable non-need to access V1 if V2 is enabled.

Plus, embedded systems will want compact code size, so things like DSP/SIMD/vector encodings should be compact.

(topic change: if we want very compact embedded systems, we also need multi-register save/restore instructions — these will be multicycle in most implementations, but it saves a lot of space in function calls)

Guy

kr...@berkeley.edu

Apr 11, 2018, 11:16:37 AM

to Guy Lemieux, kr...@berkeley.edu, RISC-V ISA Dev

>>>>> On Wed, 11 Apr 2018 14:50:56 +0000, Guy Lemieux <glem...@vectorblox.com> said:
| On Wed, Apr 11, 2018 at 7:34 AM <kr...@berkeley.edu> wrote:
| This could certainly happen, as the custom space is expressly designed
| not to be regulated.

| It is certainly possible to add a CSR that changes instruction
| encoding on the fly. We already have the possibility of a writable
| misa to enable/disable extensions on the fly or to change base ISA.

| But you'd want to make sure most common use cases don't need this.
| Moving the custom extensions into the standard ISA encoding space,
| even if it means 48-bit or longer encodings, is probably preferable to
| any kind of dynamic switching in practice. For many custom
| extensions, code size is probably not the primary concern, as overall
| application's code size will be dominated by all the standard code.

| but if it is unregulated, then collisions can occur, even if using 48b or 64b
| encodings.

standard = regulated

| not to mention there may be a deliberate reason for overlap, eg between a
| vendors own version 1 and 2 of custom extensions, where there may be a mutex
| incompatibility or a provable non-need to access V1 if V2 is enabled.

| plus, embedded systems will want compact code size, so things like dsp/simd/
| vector encodings should be compact.

If you want switchable ISA modes, there is a path through CSRs. But
making this a standard technique will involve regulating the ISA mode
encoding space.

To avoid having two spaces to regulate, probably makes more sense to
have a standard technique for compressing longer instruction encodings
into shorter encodings+ISA mode, i.e., regulate the 48/64bit encoding
space and define a standard way for ISA mode CSRs to map portions of this
encoding space into 32/16-bit encoding spaces (yes, could include
scheme to map 32-bit too into 16-bit space).
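
One way to picture this mapping, as a purely hypothetical sketch: the ISA mode selects which region of a regulated long-encoding space a short custom-0 instruction stands for. The region ids, the MODE_TO_REGION table, and the (region, payload) representation are all invented for illustration; no such scheme is defined anywhere:

```python
# Sketch: expand a short custom-slot instruction plus ISA mode into a
# mode-independent long form that tools could reason about.

CUSTOM_0 = 0x0B  # major opcode of the custom-0 block (bits 6..0)

# Hypothetical registry: ISA-mode CSR value -> region id in the regulated space.
MODE_TO_REGION = {1: 0xA001, 2: 0xB002}


def expand(short_insn, isa_mode):
    """Map a 32-bit custom-0 instruction to (region id, payload)."""
    if (short_insn & 0x7F) != CUSTOM_0:
        raise ValueError("not in the remappable custom-0 block")
    payload = short_insn >> 7  # the 25 bits above the major opcode
    return MODE_TO_REGION[isa_mode], payload


same_bits = 0x00B5050B  # identical 32-bit pattern under two different modes
print(expand(same_bits, 1))
print(expand(same_bits, 2))
```

The same short bit pattern expands to two distinct long forms, so only the long-form space would need regulating.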

Obviously, software stack gets much harder with having to interpret
ISA mode registers.

| (topic change: if we want very compact embedded systems, we also need
| multi-register save/restore instructions — these will be multicycle in most
| implementations, but it saves a lot of space in function calls)

This is a very different topic - please start a new thread.

Krste

| Guy

Guy Lemieux

Apr 11, 2018, 11:36:22 AM

to Krste Asanovic, RISC-V ISA Dev

On Wed, Apr 11, 2018 at 8:16 AM, <kr...@berkeley.edu> wrote:
>
>>>>>> On Wed, 11 Apr 2018 14:50:56 +0000, Guy Lemieux <glem...@vectorblox.com> said:
> | On Wed, Apr 11, 2018 at 7:34 AM <kr...@berkeley.edu> wrote:
> | This could certainly happen, as the custom space is expressly designed
> | not to be regulated.

> | but if it is unregulated, then collisions can occur, even if using 48b or 64b
> | encodings.
>
> standard = regulated

sorry I wasn't clear.

I was saying custom extensions are unregulated, so the likelihood of
collisions is high.

the standard part is regulated by the RISC-V Foundation, so it won't have collisions.

> If you want switchable ISA modes, there is a path through CSRs. But
> making this a standard technique will involve regulating the ISA mode
> encoding space.

yes, this is what I was imagining/proposing.

I'm not sure it's the only method, or the best method, but at least
the mechanism for switching modes needs to be standardized.

> To avoid having two spaces to regulate, probably makes more sense to
> have a standard technique for compressing longer instruction encodings
> into shorter encodings+ISA mode, i.e., regulate the 48/64bit encoding
> space and define a standard way for ISA mode CSRs to map portions of this
> encoding space into 32/16-bit encoding spaces (yes, could include
> scheme to map 32-bit too into 16-bit space).

I was thinking something like this as well.

Generalizing, we could also make the whole ISA, including "standard"
extensions, swappable.

Thus, you could even upgrade from compressed version C.1 to version
C.2 if you want 20% tighter code savings..... e.g., C.1 may be for
general-purpose code, but C.2 may be an application-specific version
that achieves better results. letting users develop their own 16b -->
32b mappings for optimal code compaction would also be interesting
(designing a compact/feasible 16b-->32b decoder would be subject of
further research, but might look like a RAM or PLA or FPGA).

Guy

Albert Cahalan

Apr 11, 2018, 2:21:30 PM

to Guy Lemieux, RISC-V ISA Dev

> This question may seem a bit out there.... but given the flexibility
> of RISC-V, should we prepare for the eventuality where there are
> multiple *custom* instruction set extensions that have conflicting ISA
> encodings?

Yes. Well, it should be blocked via legal means. Incompatibility is
a disaster for an architecture.

The viability of PowerPC was badly damaged when SPE was
introduced. This was a vector instruction set that was incompatible
with the AltiVec instruction set. Software vendors had to choose,
and typically the choice was "neither". Nobody wants to put in the
effort when there is uncertainty and a market fragmented into
small bits.

Note how Intel did not screw up. When SSE was added, MMX remained.
Software vendors could trust that instructions would be supported.
Both MMX and SSE remain today, in all shipping processors. With very
few exceptions, Intel does not ship chips with missing functionality.
There is a unified software ecosystem.

This goes beyond the instruction set. MMU functionality also matters.
You can add stuff, but then it must be implemented in every future CPU.
You can not take stuff away without harming the architecture.

Guy Lemieux

Apr 11, 2018, 4:37:23 PM

to Albert Cahalan, RISC-V ISA Dev

On Wed, Apr 11, 2018 at 11:21 AM, Albert Cahalan <acah...@gmail.com> wrote:
>> This question may seem a bit out there.... but given the flexibility
>> of RISC-V, should we prepare for the eventuality where there are
>> multiple *custom* instruction set extensions that have conflicting ISA
>> encodings?
>
> Yes. Well, it should be blocked via legal means. Incompatibility is
> a disaster for an architecture.

This isn't about incompatibility -- this is about two different CUSTOM
instruction set extensions being added into the same silicon. There
are blocks of opcode space for these instructions, but inevitably
there will be overlap due to conflict/scarcity.
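
To make the scarcity concrete, here is a small sketch assuming the four custom major opcodes listed in the base opcode map (custom-0, custom-1, custom-2/rv128, custom-3/rv128). The two vendor tables and their mnemonics are invented; the point is only that independent groups picking from so few blocks can land on the same (opcode, funct3, funct7) triple:

```python
# Sketch: find encoding collisions between two independently designed
# custom extensions. Vendor tables are hypothetical.

CUSTOM_MAJOR_OPCODES = {
    0x0B: "custom-0",
    0x2B: "custom-1",
    0x5B: "custom-2/rv128",
    0x7B: "custom-3/rv128",
}


def custom_block(insn):
    """Name of the custom block a 32-bit instruction falls in, or None."""
    return CUSTOM_MAJOR_OPCODES.get(insn & 0x7F)


def decode_key(insn):
    """(major opcode, funct3, funct7) of an R-type style instruction."""
    return (insn & 0x7F, (insn >> 12) & 0x7, (insn >> 25) & 0x7F)


VENDOR_A = {(0x0B, 0, 0): "a.bitrev"}
VENDOR_B = {(0x0B, 0, 0): "b.fftstep", (0x2B, 1, 0): "b.aesround"}

collisions = sorted(set(VENDOR_A) & set(VENDOR_B))
print(collisions)
print(custom_block(0x0000000B), custom_block(0x00000013))
```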

We can't expect the RISC-V Foundation to allocate these custom blocks
out like radio spectrum... we can expect some degree of
overlap/collisions between independent groups. We can also expect
that, to maximize affordability, silicon should be able to be used in
many different use cases, e.g., supporting these independent groups by
having an ISA-select capability.

> The viability of PowerPC was badly damaged when SPE was
> introduced. This was a vector instruction set that was incompatible
> with the AltiVec instruction set. Software vendors had to choose,
> and typically the choice was "neither". Nobody wants to put in the
> effort when there is uncertainty and a market fragmented into
> small bits.

I don't know the full story between SPE and AltiVec... but did any
silicon do both?

> Note how Intel did not screw up. When SSE was added, MMX remained.

Intel also has an incredibly complex variable-length instruction set
encoding, and they are a single entity that determines what the
encoding will be.

> Software vendors could trust that instructions would be supported.

For general purpose workstations, yes I agree.

For application-specific embedded systems, or dedicated supercomputing
centers, I disagree.

> Both MMX and SSE remain today, in all shipping processors. With very
> few exceptions, Intel does not ship chips with missing functionality.
> There is a unified software ecosystem.

MMX and SSE should die.

AVX is sort of ok, only because vector lengths are getting
interesting, but even in their 3rd generation Intel still got it wrong
-- fixed-length vectors are wrong because they don't scale.

An ISA with variable-length vectors is the only way to go, because
then you don't need to define a whole new ISA every time you want
higher performance.

> This goes beyond the instruction set. MMU functionality also matters.
> You can add stuff, but then it must be implemented in every future CPU.
> You can not take stuff away without harming the architecture.

Sure, but who is still using legacy protected mode or segment
registers with Intel CPUs? Didn't they get rid of those yet? I'm sure
nobody would miss that ....

Guy

Albert Cahalan

Apr 12, 2018, 12:10:45 AM

to Guy Lemieux, RISC-V ISA Dev

On 4/11/18, Guy Lemieux <glem...@vectorblox.com> wrote:
> On Wed, Apr 11, 2018 at 11:21 AM, Albert Cahalan <acah...@gmail.com>
> wrote:

>>> This question may seem a bit out there.... but given the flexibility
>>> of RISC-V, should we prepare for the eventuality where there are
>>> multiple *custom* instruction set extensions that have conflicting ISA
>>> encodings?
>>
>> Yes. Well, it should be blocked via legal means. Incompatibility is
>> a disaster for an architecture.
>
> This isn't about incompatibility -- this is about two different CUSTOM
> instruction set extensions being added into the same silicon. There
> are blocks of opcode space for these instructions, but inevitably
> there will be overlap due to conflict/scarcity.

That sure is incompatibility. That opcode space is asking for disaster.

> We can't expect the RISC-V Foundation to allocate these custom blocks
> out like radio spectrum...

That would partially mitigate the disaster.

>> The viability of PowerPC was badly damaged when SPE was
>> introduced. This was a vector instruction set that was incompatible
>> with the AltiVec instruction set. Software vendors had to choose,
>> and typically the choice was "neither". Nobody wants to put in the
>> effort when there is uncertainty and a market fragmented into
>> small bits.
>
> I don't know the full story between SPE and AltiVec... but did any
> silicon do both?

Due to conflicting opcodes, no silicon did both. Perhaps it would have
been possible to enable one at a time, but that wasn't implemented.

You can't even write a generic PowerPC disassembler. The user
has to specify the variant of the instruction set.

>> This goes beyond the instruction set. MMU functionality also matters.
>> You can add stuff, but then it must be implemented in every future CPU.
>> You can not take stuff away without harming the architecture.
>
> Sure, but who is still using legacy protected mode or segment
> registers with Intel CPUs? Didn't they get rid of those yet? I'm sure
> nobody would miss that ....

I've recently seen three different embedded OSes use segmentation
all over the place. (16-bit segment with 32-bit offset, making a 48-bit
pointer, and thus pushing 5 times for a call to memcpy) These OSes
even use the hardware-based task switching.

So no, Intel can't get rid of all that. Compatibility really matters.

Getting back to PowerPC, the embedded OS developers have not
enjoyed writing piles of low-level MMU code again and again. There
are several completely different MMU designs. The architecture is
hurt by this because OS vendors will not bother to support everything.

Luke Kenneth Casson Leighton

Apr 12, 2018, 12:29:07 AM

to Krste Asanovic, Guy Lemieux, RISC-V ISA Dev

---
crowd-funded eco-conscious hardware: https://www.crowdsupply.com/eoma68

On Wed, Apr 11, 2018 at 3:34 PM, <kr...@berkeley.edu> wrote:

> | Should we consider an official generic way to select among these
> | possibilities? or is it already part of the CSSRs? (sorry but I
> | haven't studied CSSRs at all)
>
> It is certainly possible to add a CSR that changes instruction
> encoding on the fly. We already have the possibility of a writable
> misa to enable/disable extensions on the fly or to change base ISA.

Ok so that's good to know, that there is precedent for that: I am
presently studying the feasibility of abstracting out the parallelism
from both RVV and RVP, such that (apart from anything) RVV may even
utilise Compressed (16-bit) opcodes as opposed to its present 64-bit
encoding.

The trade-off - which has to be thought through exxxtreeeeeemely
carefully - is to do exactly this: change base ISA depending on bits
in certain CSRs.

l.

Luke Kenneth Casson Leighton

Apr 12, 2018, 12:55:55 AM

to Krste Asanovic, Guy Lemieux, RISC-V ISA Dev

On Wed, Apr 11, 2018 at 4:16 PM, <kr...@berkeley.edu> wrote:

>>>>>> On Wed, 11 Apr 2018 14:50:56 +0000, Guy Lemieux <glem...@vectorblox.com> said:

> | but if it is unregulated, then collisions can occur, even if using 48b or 64b
> | encodings.

> [...]

> If you want switchable ISA modes, there is a path through CSRs. But
> making this a standard technique will involve regulating the ISA mode
> encoding space.

That's right. This need not be burdensome. Here is the "nightmare scenario":

* custom extremely good ISA 1 uses encoding A
* custom extremely good ISA 2 uses encoding A
* vendor lays down both in the same ASIC
* binary code is encountered with A and the engine has no idea what to do.

iiiif howeverrrrr, the situation was as follows:

* custom vendor 1 registers intent to develop custom extension with
  the RISC-V Foundation
* custom vendor 1 receives "IANA-style unique number" from RISC-V Foundation
* custom extremely good ISA 1 uses encoding A HOWEVER....
* AS PART OF THE BASE (not the custom) ISA, it is made ABSOLUTELY
  CLEAR that BEFORE this ISA is used, a special (*REGULATED* repeat
  *REGULATED*) instruction *MUST* be called:

  set-csr-iana-style-custom-unique-isa-number {insert custom vendor 1 unique number here}

compilers may now confidently generate any number of repeated
encodings A, because each usage *must* have had a call to that
iana-style mode-setting instruction with the unique CSR-mode-setting
that unambiguously defines the meaning of the instruction.

in particular it's worth noting that it is *not* necessary to call
that mode-setting instruction prior to *every* use of a custom
instruction. It would be perfectly reasonable to do this (and for
compilers to track it):

  set-csr-iana-style {insert custom vendor 1 unique number here}
  instruction A                # custom vendor 1's "meaning"
  standard RV32 instruction
  standard RV32 instruction
  instruction A'               # from the same custom vendor 1
  set-csr-iana-style {insert custom vendor 2 unique number here}
  instruction A                # custom vendor *2*'s "meaning"
  instruction A'               # from the same custom vendor *2*
  standard RV32 instruction
  ....
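
The sequence above can be modelled with a small sketch of how a disassembler or compiler pass might track the current mode. Everything here is hypothetical: the vendor numbers, the table contents, and the tuple-based instruction stream stand in for a registry and encodings that do not exist:

```python
# Sketch: resolve ambiguous custom encodings by tracking the most recent
# mode-setting instruction, as in the listing above.

# Hypothetical registry: unique vendor number -> meaning of each custom encoding.
VENDOR_TABLES = {
    1: {"A": "vendor1.op_a", "A'": "vendor1.op_a_prime"},
    2: {"A": "vendor2.op_a", "A'": "vendor2.op_a_prime"},
}


def resolve(stream):
    """Return the unambiguous meaning of each instruction in program order."""
    mode = None
    meanings = []
    for kind, value in stream:
        if kind == "set-mode":
            mode = value  # stands in for the regulated mode-setting instruction
        elif kind == "custom":
            if mode is None:
                raise ValueError("custom encoding before any mode-setting instruction")
            meanings.append(VENDOR_TABLES[mode][value])
        else:
            meanings.append(value)  # standard instructions ignore the mode
    return meanings


program = [
    ("set-mode", 1), ("custom", "A"), ("std", "rv32.std"), ("std", "rv32.std"),
    ("custom", "A'"), ("set-mode", 2), ("custom", "A"), ("custom", "A'"),
    ("std", "rv32.std"),
]
print(resolve(program))
```

This only works for straight-line code; across branches and calls a real tool would need the mode at every entry point, which is part of why the software stack gets harder.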

The nice thing about this would be that the compiler writers would
know what to expect when it comes to upcoming custom instructions.
Right now, hmmm this hadn't occurred to me until now: vendors that
create custom instructions will be FORCED into the position of
maintaining their own fork of gcc. Without the above *STANDARD*
formal method that is part of the *BASE* RV, the first vendor that
submits patches to gcc basically becomes the de-facto "dominator" of
that encoding. Total chaos and much pain ensues.

l.

Luke Kenneth Casson Leighton

Apr 12, 2018, 1:00:54 AM

to Krste Asanovic, Guy Lemieux, RISC-V ISA Dev

On Wed, Apr 11, 2018 at 4:16 PM, <kr...@berkeley.edu> wrote:

> Obviously, software stack gets much harder with having to interpret
> ISA mode registers.

yyup. more than that: Jacob pointed out that an implementation
without a particular extension will not have the CSR space in which to
store even "emulating" CSRs. thus it becomes necessary to have
some... "blank CSR space" in which to do exactly that. and B-Ext with
which to manipulate it. putting the "blank CSR space" into J-Ext
might not be enough.

l.

Andrew Waterman

Apr 12, 2018, 1:03:01 AM

to Luke Kenneth Casson Leighton, Krste Asanovic, Guy Lemieux, RISC-V ISA Dev

It might've just occurred to you, but it occurred to us a long time ago. As one of the maintainers of the RISC-V GCC and Binutils ports, I can assert quite confidently that we'll never accept a patch that contributes support for a nonstandard extension. Not from anyone. Standardization is a gateway to incorporation into the GNU tools.

Luke Kenneth Casson Leighton

Apr 12, 2018, 1:16:27 AM

to Andrew Waterman, Krste Asanovic, Guy Lemieux, RISC-V ISA Dev

On Thu, Apr 12, 2018 at 6:02 AM, Andrew Waterman
<wate...@eecs.berkeley.edu> wrote:

>> Right now, hmmm this hadn't occurred to me until now: vendors that
>> create custom instructions will be FORCED into the position of
>> maintaining their own fork of gcc. Without the above *STANDARD*
>> formal method that is part of the *BASE* RV, the first vendor that
>> submits patches to gcc basically becomes the de-facto "dominator" of
>> that encoding. Total chaos and much pain ensues.

> It might've just occurred to you, but it occurred to us a long time ago.

*gently*, Andrew. Someone coming in new can't know everything. That
I quickly and independently derived "what you already knew" is a good
sign, as it increases the probability that the decision / analysis
that you did was good, yeah?

Plus, what in effect you're saying is: this really *does* have to be
fixed (by introducing some form of registration of custom extensions),
because it is *guaranteed* that custom extension maintainers will have
to maintain their own separate fork of gcc. We know where that goes:
hell on earth for the organisation that now has to spend more time
forward-porting gcc patches than it does on actually supporting its
customers.

I once did a 1000-line patch to samba in a separate branch (this was
before git). I decided to pull in mainline patches using dirdiff
(developed by paulus). It took a WEEK. I added a few more lines....
couple hours work there. next day tried pulling in more mainline code
patches... another WEEK later...

Bottom line: it is absolutely hopelessly unrealistic to expect custom
vendors to maintain forks of gcc. Linaro was set up *specifically* by
David Rusling, who kindly spent about an hour on the phone with me to
explain why Linaro was established: to stop the situation where
vendors were forced to fork gcc and entire BSPs.

Therefore, logically, this has to be dealt with at the hardware level.

l.

Luke Kenneth Casson Leighton

Apr 12, 2018, 1:30:24 AM

to Guy Lemieux, Krste Asanovic, RISC-V ISA Dev

On Wed, Apr 11, 2018 at 4:35 PM, Guy Lemieux <glem...@vectorblox.com> wrote:
> On Wed, Apr 11, 2018 at 8:16 AM, <kr...@berkeley.edu> wrote:

>> To avoid having two spaces to regulate, probably makes more sense to
>> have a standard technique for compressing longer instruction encodings
>> into shorter encodings+ISA mode, i.e., regulate the 48/64bit encoding
>> space and define a standard way for ISA mode CSRs to map portions of this
>> encoding space into 32/16-bit encoding spaces (yes, could include
>> scheme to map 32-bit too into 16-bit space).
>
> I was thinking something like this as well.

Whilst it appears on the face of it to be sensible, the idea of
regulating the custom ISA space defeats the very freedom and
flexibility of having custom ISA spaces in the first place. It's a
drastic solution that's worse than the problem.

*However*: as a *separate* idea (separate from the "IANA-like custom
registration with RISC-V Foundation" concept) the idea of mapping down
to a smaller encoding space has strong merit... but should *not* be
confused or amalgamated with the "IANA-like" concept.

> Generalizing, we could also make the whole ISA, including "standard"
> extensions, swappable.
>
> Thus, you could even upgrade from compressed version C.1 to version
> C.2 if you want 20% tighter code savings..... e.g., C.1 may be for
> general-purpose code, but C.2 may be an application-specific version
> that achieves better results.

There was that research into RV16 which came up with a whopping 25%
better compression than C, wasn't there?

Sadly, bear in mind (reminder): C being part of the unix ABI standard
makes it completely pointless to consider doing any further research
into alternative Compression, and it has many other down-sides. This
was discussed in-depth a few months back on sw-dev.

l.

Luke Kenneth Casson Leighton

Apr 12, 2018, 1:38:37 AM

to Albert Cahalan, Guy Lemieux, RISC-V ISA Dev

On Thu, Apr 12, 2018 at 5:10 AM, Albert Cahalan <acah...@gmail.com> wrote:

>> I don't know the full story between SPE and AltiVec... but did any
>> silicon do both?
>
> Due to conflicting opcodes, no silicon did both. Perhaps it would have
> been possible to enable one at a time, but that wasn't implemented.
>
> You can't even write a generic PowerPC disassembler. The user
> has to specify the variant of the instruction set.

Thank you for sharing these insights and stories, Albert, I
appreciate hearing things like this. hmmm, has anyone done a RISC-V
disassembler to make sure that it also does not have the same kinds of
flaws? I appreciate that there's qemu but a disassembler (whose
output could be passed to gcc -S and then disassembled and then passed
to gcc -S as a way to do the same triple-checks that gcc does of
itself) would have additional formal verification uses.

l.

Andrew Waterman

Apr 12, 2018, 2:24:34 AM

to Luke Kenneth Casson Leighton, Krste Asanovic, Guy Lemieux, RISC-V ISA Dev

On Wed, Apr 11, 2018 at 10:16 PM, Luke Kenneth Casson Leighton <lk...@lkcl.net> wrote:

> On Thu, Apr 12, 2018 at 6:02 AM, Andrew Waterman
> <wate...@eecs.berkeley.edu> wrote:
>
>>> Right now, hmmm this hadn't occurred to me until now: vendors that
>>> create custom instructions will be FORCED into the position of
>>> maintaining their own fork of gcc. Without the above *STANDARD*
>>> formal method that is part of the *BASE* RV, the first vendor that
>>> submits patches to gcc basically becomes the de-facto "dominator" of
>>> that encoding. Total chaos and much pain ensues.
>
>> It might've just occurred to you, but it occurred to us a long time ago.
>
> *gently*, Andrew. Someone coming in new can't know everything. That
> I quickly and independently derived "what you already knew" is a good
> sign, as it increases the probability that the decision / analysis
> that you did was good, yeah?

The disconnect here is that this mailing list typically has a sober tone, whereas my observation is that when you see a hole (actual or perceived), you jump on it, assume the worst, and editorialize an extrapolation. Of course I can't expect you to know everything that's happened over the last several years. But I can also ask that you operate under the assumption that the other participants on this list weren't born yesterday, and perhaps some more subtlety might be appropriate.

kr...@berkeley.edu

Apr 12, 2018, 2:55:38 AM

to Albert Cahalan, Guy Lemieux, RISC-V ISA Dev

By definition, RISC-V won't have incompatible standard extensions.

The whole point of the custom instruction encoding space is that we
don't have to talk about it. All standard tools and OSs should ignore
it.

Krste

Richard Herveille

Apr 12, 2018, 3:16:26 AM

to kr...@berkeley.edu, Albert Cahalan, Guy Lemieux, RISC-V ISA Dev, Richard Herveille

On 12/04/2018, 08:55, "kr...@berkeley.edu" <kr...@berkeley.edu> wrote:

> By definition, RISC-V won't have incompatible standard extensions.
>
> The whole point of the custom instruction encoding space is that we
> don't have to talk about it. All standard tools and OSs should ignore
> it.

Agreed. Custom extensions are just that … custom for a particular vendor/device/project/implementation.

Think an FFT or AES engine that gets hooked to the CPU and can be called via a single instruction.

The actual instruction may vary from project/device to project/device.

If someone comes up with a great (set of) instruction(s) that benefits a whole community, it should be proposed as a foundation supported extension.

Richard

Jim Wilson

Apr 12, 2018, 10:48:51 AM

to Luke Kenneth Casson Leighton, Albert Cahalan, Guy Lemieux, RISC-V ISA Dev

On Wed, Apr 11, 2018 at 10:38 PM, Luke Kenneth Casson Leighton
<lk...@lkcl.net> wrote:
> Thank you for sharing these insights and stories, Albert, I
> appreciate hearing things like this. hmmm, has anyone done a RISC-V
> disassembler to make sure that it also does not have the same kinds of
> flaws? I appreciate that there's qemu but a disassembler (whose
> output could be passed to gcc -S and then disassembled and then passed
> to gcc -S as a way to do the same triple-checks that gcc does of
> itself) would have additional formal verification uses.

Objdump can disassemble of course. The objdump testsuite has
testcases to verify that assembling code and disassembling it gives
the right result.

The disassembler needs to know the base architecture, rv32e, rv32i,
rv64i, rv128i, as there are some conflicting encodings between the
base architectures. This info can be obtained from the elf class and
elf header flags. Otherwise, there are no conflicts, but we don't
support any custom extensions currently.
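
As a rough illustration of how the base architecture can be recovered from the ELF class and header flags, here is a minimal Python sketch. It is not how objdump itself is implemented; the function name is made up for this example, and it assumes the standard ELF header layout plus the EM_RISCV, EF_RISCV_RVC and EF_RISCV_RVE values from the RISC-V ELF psABI. There is no ELF class for rv128i, so that case is left out.

```python
import struct

EM_RISCV = 243          # e_machine value assigned to RISC-V
EF_RISCV_RVC = 0x0001   # e_flags bit: compressed instructions may be present
EF_RISCV_RVE = 0x0008   # e_flags bit: RV32E (16-register) ABI


def riscv_base_from_elf_header(header: bytes) -> str:
    """Guess the base ISA from the leading bytes of an ELF file.

    Hypothetical helper for illustration only; needs at least 52 bytes.
    """
    if header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    ei_class = header[4]                 # 1 = ELFCLASS32, 2 = ELFCLASS64
    endian = "<" if header[5] == 1 else ">"
    (e_machine,) = struct.unpack_from(endian + "H", header, 18)
    if e_machine != EM_RISCV:
        raise ValueError("not a RISC-V ELF file")
    # e_flags sits after three address-sized fields, so its offset
    # depends on the ELF class.
    flags_offset = 36 if ei_class == 1 else 48
    (e_flags,) = struct.unpack_from(endian + "I", header, flags_offset)
    if ei_class == 1:
        return "rv32e" if e_flags & EF_RISCV_RVE else "rv32i"
    return "rv64i"
```

A disassembler would pick its decode tables from the returned name before looking at any instruction bytes.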

Jim

lk...@lkcl.net

Apr 12, 2018, 1:36:05 PM

to RISC-V ISA Dev, lk...@lkcl.net, acah...@gmail.com, glem...@vectorblox.com

On Thursday, April 12, 2018 at 3:48:51 PM UTC+1, Jim Wilson wrote:

On Wed, Apr 11, 2018 at 10:38 PM, Luke Kenneth Casson Leighton
<lk...@lkcl.net> wrote:
> Thank you for sharing these insights and stories, Albert, I
> appreciate hearing things like this. hmmm, has anyone done a RISC-V
> disassembler to make sure that it also does not have the same kinds of
> flaws?

Objdump can disassemble of course.

oh duh :) that's a huge relief to know it's covered.

lk...@lkcl.net

Apr 12, 2018, 1:52:50 PM

to RISC-V ISA Dev, lk...@lkcl.net, kr...@berkeley.edu, glem...@vectorblox.com

On Thursday, April 12, 2018 at 7:24:34 AM UTC+1, waterman wrote:

On Wed, Apr 11, 2018 at 10:16 PM, Luke Kenneth Casson Leighton <lk...@lkcl.net> wrote:
On Thu, Apr 12, 2018 at 6:02 AM, Andrew Waterman
<wate...@eecs.berkeley.edu> wrote:

The disconnect here is that this mailing list typically has a sober tone, whereas my observation is that when you see a hole (actual or perceived), you jump on it, assume the worst, and editorialize an extrapolation. Of course I can't expect you to know everything that's happened over the last several years. But I can also ask that you operate under the assumption that the other participants on this list weren't born yesterday, and perhaps some more subtlety might be appropriate.

Appreciated the feedback, Andrew. *sigh* mindful of the wise quote "certainty is a pathological state of mind", subtlety is something I find extremely challenging. Honesty and integrity, no problem. Historically, efforts on my part to be "subtle" have clearly been so painful for the recipients to witness that people mistake them for disingenuousness, sarcasm and more. So I'm now slightly nervous and don't really know what to do (and would like to be a useful long-term contributor here, so I have to fix it). Off-list feedback (from anyone) much appreciated.

l.

lk...@lkcl.net

Apr 12, 2018, 2:25:09 PM

to RISC-V ISA Dev, kr...@berkeley.edu, acah...@gmail.com, glem...@vectorblox.com, richard....@roalogic.com

On Thursday, April 12, 2018 at 8:16:26 AM UTC+1, Richard Herveille wrote:

On 12/04/2018, 08:55, "kr...@berkeley.edu" <kr...@berkeley.edu> wrote:

>> By definition, RISC-V won't have incompatible standard extensions.
>>
>> The whole point of the custom instruction encoding space is that we
>> don't have to talk about it. All standard tools and OSs should ignore
>> it.
>
> Agreed. Custom extensions are just that … custom for a particular vendor/device/project/implementation.
> Think an FFT or AES engine that gets hooked to the CPU and can be called via a single instruction.
> The actual instruction may vary from project/device to project/device.
> If someone comes up with a great (set of) instruction(s) that benefits a whole community, it should be proposed as a foundation supported extension.

Ok, so the crypto engine is a good example to run with as it may end up being extremely common (insert any other appropriate engine if there is a belief that it is not). Let's imagine that there's some unregulated china clone of RISC-V, where they go "oo I like that custom crypto extension I found on the internet, it's BSD-licensed, I can do what I like with it, let's drop that in and make a mint!!!"

That then ends up with a *hundred million* units being sold world-wide in various smartphones and tablets as part of the SoC (the story then follows from the example of how Allwinner-based products made it from China across the world).

After various reverse-engineering efforts someone goes "hang on a minute, that's an unregulated custom extension to RISC-V straight off of github!!" At *that* point the product is so ubiquitous that the gcc maintainers, u-boot maintainers, linux kernel maintainers and so on receive such *overwhelming* numbers of requests to support the custom extension by default that they get absolutely fed up and decide to include it.

Bear in mind that those people in the software libre community are *not* under the control of the RISC-V Foundation, so that factor has to be catered for as a "likely scenario".

Repeat this at least twice and the probability of the nightmare PowerPC scenario coming true for RISC-V increases alarmingly.

So tracing back the chain, the *cause* of the clash can be traced back to the lack of control and clarity in the RISC-V ISA Specification regarding custom extensions.

Standards MUST nail EVERYTHING down if they are to BE Standards, and there must be no wavering. ALL possibility of ambiguity, conflict and confusion MUST be nailed in a backwards-compatible, forwards-compatible, extensible way. No exceptions. This is one thing that, as Albert points out, Intel gets right.

So I can tell you, *right now*, that the lesson that Albert warns about *will* hit RISC-V - RISC-V will be a FAILURE - if it is not made mandatory that custom extensions be "registered" and given an IANA-like CSR, with an associated instruction to be called to put the processor into that "mode".

Note that I did not say that the creators of a custom extension have to tell the RISC-V Foundation what they intend to *do* with it: they just have to *register* it (and have gcc emit appropriate CSR mode-setting instructions prior to use).

I do understand that there is a desire for a "hands-off" approach to custom extensions, Krste. An alternative approach would be to request that the creators of custom extensions be required, by the RISC-V Foundation, *NEVER*, under *ANY* circumstances, to publish their designs or ANY aspect of the custom extension. not the source code, not the RTL, not the tools, nothing. However I believe you would agree that that would be unworkable and impractical, and very much against the spirit of RISC-V. Not least it would have implementors view RISC-V as not to be taken seriously.

So in comparison to the alternatives, I think you'll find that an IANA-like arrangement would be warmly accepted and welcomed, once it was explained. Precedents include IANA itself, USB IDs, PCIe IDs, and linux kernel MACHINE_IDs. It's an extremely well-known solution. The RISC-V Foundation becomes the "atomic transactor" for unambiguity of the ISA under its wing :)

Writing unambiguous standards is damn hard. Being *responsible* for them is... an awful lot of work.

l.

Jacob Bachmeyer

Apr 12, 2018, 5:33:15 PM

to lk...@lkcl.net, RISC-V ISA Dev, kr...@berkeley.edu, acah...@gmail.com, glem...@vectorblox.com, richard....@roalogic.com

lk...@lkcl.net wrote:
> [...]
>
> [... snip nightmare scenario of Chinese vendors appropriating
> CUSTOM-0/CUSTOM-1 opcodes ...]
>
> [... snip plea for "custom extension" registry ...]

>
> So in comparison to the alternatives, I think you'll find that an
> IANA-like arrangement would be warmly accepted and welcomed, once it
> was explained. Precedents include IANA itself, USB IDs, PCIe IDs, and
> linux kernel MACHINE_IDs. It's an extremely well-known solution. The
> RISC-V Foundation becomes the "atomic transactor" for unambiguity of
> the ISA under its wing :)

We have this: the mvendorid and marchid CSRs. If we require that any
given implementation can have at most one set of custom opcodes,
determining the custom opcodes available requires merely examining
{mvendorid, marchid} and looking that tuple up in a table.

This could also allow for a database-driven extensible assembler, using
a --targetid=<VENDOR-ID>/<ARCH-ID> parameter to select which custom
mnemonics should be recognized. GCC support is unlikely to be needed
for custom extensions; wrapper functions or inline assembler should be
sufficient.
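
The table lookup described above could be sketched roughly as follows in Python. Everything in the table is hypothetical: the ID values, the mnemonics and the helper names are invented for illustration, and the `--targetid` format is simply the one proposed in this message, not an existing assembler option.

```python
# Hypothetical database: (mvendorid, marchid) -> custom mnemonics and the
# major opcode each one lives in. None of these IDs or names are real.
CUSTOM_OPCODE_TABLE = {
    (0xABC, 0x1): {"fft.step": 0x0B, "aes.round": 0x2B},
    (0xABC, 0x2): {"fft.step": 0x0B},
    (0xDEF, 0x1): {"bitrev": 0x0B},
}


def parse_targetid(text: str) -> tuple:
    """Parse a '<VENDOR-ID>/<ARCH-ID>' string as proposed above."""
    vendor, arch = text.split("/")
    return int(vendor, 0), int(arch, 0)


def custom_mnemonics(mvendorid: int, marchid: int) -> dict:
    """Return the custom mnemonics known for this implementation."""
    # Unknown implementations get no custom mnemonics at all.
    return CUSTOM_OPCODE_TABLE.get((mvendorid, marchid), {})
```

Note that the key identifies an implementation, so two implementations can reuse the same major opcode for different mnemonics without the table becoming ambiguous.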

I do agree, however, that RISC-V binutils should not accept patches for
proprietary exclusive extensions -- if you want the standard tools to
support your extension, even as a custom extension, you *must* permit
anyone to implement compatible support. No claim of exclusivity will be
tolerated.

-- Jacob

Luke Kenneth Casson Leighton

Apr 12, 2018, 5:52:18 PM

to Jacob Bachmeyer, RISC-V ISA Dev, Krste Asanovic, Albert Cahalan, Guy Lemieux, Richard Herveille

---
crowd-funded eco-conscious hardware: https://www.crowdsupply.com/eoma68

On Thu, Apr 12, 2018 at 10:33 PM, Jacob Bachmeyer <jcb6...@gmail.com> wrote:
> lk...@lkcl.net wrote:
>>
>> [...]
>>
>> [... snip nightmare scenario of Chinese vendors appropriating
>> CUSTOM-0/CUSTOM-1 opcodes ...]
>>
>> [... snip plea for "custom extension" registry ...]
>>
>> So in comparison to the alternatives, I think you'll find that an
>> IANA-like arrangement would be warmly accepted and welcomed, once it was
>> explained. Precedents include IANA itself, USB IDs, PCIe IDs, and linux
>> kernel MACHINE_IDs. It's an extremely well-known solution. The RISC-V
>> Foundation becomes the "atomic transactor" for unambiguity of the ISA under
>> its wing :)
>
>
> We have this: the mvendorid and marchid CSRs.

oh superb!

> If we require that any given
> implementation can have at most one set of custom opcodes, determining the
> custom opcodes available requires merely examining {mvendorid, marchid} and
> looking that tuple up in a table.

It's not precisely what I had in mind. However, as long as the
implementors ensure that a particular marchid is *absolutely*
guaranteed to be unambiguous with respect to the custom extensions in
it (the mapping is "one to one and onto", I think the phrase is), then
whilst it's not precisely what I proposed it's functionally
equivalent...

... for *identifying the custom extensions*... which just leaves one
other bit missing...

> This could also allow for a database-driven extensible assembler, using a
> --targetid=<VENDOR-ID>/<ARCH-ID> parameter to select which custom mnemonics
> should be recognized. GCC support is unlikely to be needed for custom
> extensions; wrapper functions or inline assembler should be sufficient.

indeed it could.

Ok so there is one bit that is missing, here: making sure that gcc
knows that, when two particular custom extensions with overlapping
encodings are used on the same {VENDORID/ARCHID} machine, they.... oh
damn it doesn't work, does it?

So the issue is: {VENDORID/ARCHID} identifies the *machine*, it
doesn't identify the *custom extension*. As in: one vendor may
license *another company's custom extension*.

So unfortunately it really does have to be that the *custom
extension* needs [atomic, RISC-V-Foundation-controlled] registration,
such that gcc may issue "custom extension NNN instructions about to be
used, please interpret all upcoming binary code A instructions as
being of custom extension type NNN".

Sorry I can't think of an appropriate more compact phrase to describe that.

> I do agree, however, that RISC-V binutils should not accept patches for
> proprietary exclusive extensions -- if you want the standard tools to
> support your extension, even as a custom extension, you *must* permit anyone
> to implement compatible support. No claim of exclusivity will be tolerated.

This is a slightly different issue from the (show-stopping) initial
issue that Guy identified.

l.

Guy Lemieux

Apr 12, 2018, 6:07:14 PM

to Jacob Bachmeyer, lk...@lkcl.net, RISC-V ISA Dev, Krste Asanovic, Albert Cahalan, Richard Herveille

> We have this: the mvendorid and marchid CSRs. If we require that any given
> implementation can have at most one set of custom opcodes,

Please try to keep an open mind. This is a very strong requirement.
The cost of producing silicon is very high. Hence, it is very likely
that a silicon vendor will produce a multi-capable CPU for different
fabless customers, where each fabless customer has one or more custom
extensions.

I believe we need a way to prevent conflicts among custom extensions.

In this situation, the silicon vendor must produce the mvendorid and
marchid. The question is whether these values depend upon the silicon
instance itself (single value), or whether they can take on multiple
values (depending upon which custom extension is activated).

So far, Luke's IANA-style numbering is the best way. I'm pretty sure
we can elaborate on that to make it very flexible and still keep it
simple. This will not affect vendors that do not attempt to support
conflicting extensions, so it has zero cost to them (except perhaps
returning 0s for a CSR that might control this switching of
extensions).

> This could also allow for a database-driven extensible assembler, using a
> --targetid=<VENDOR-ID>/<ARCH-ID> parameter to select which custom mnemonics
> should be recognized. GCC support is unlikely to be needed for custom
> extensions; wrapper functions or inline assembler should be sufficient.

I think you're right here. As a single datapoint: VectorBlox has only
modified the assembler.

> I do agree, however, that RISC-V binutils should not accept patches for
> proprietary exclusive extensions -- if you want the standard tools to
> support your extension, even as a custom extension, you *must* permit anyone
> to implement compatible support. No claim of exclusivity will be tolerated.

This is very GPL-like in intensity.

There is a lot of merit in keeping the baseline as streamlined and
non-proprietary as possible.

However, there is also a point where even proprietary extensions might
get adopted, if they become popular enough. Still, I think this issue
is a bit off-topic -- the question is whether the ISA needs to support
a *standard* way of switching between conflicting custom instruction
sets.

Guy

Cesar Eduardo Barros

Apr 12, 2018, 6:48:37 PM

to Guy Lemieux, RISC-V ISA Dev

Em 11-04-2018 09:37, Guy Lemieux escreveu:
> This question may seem a bit out there.... but given the flexibility
> of RISC-V, should we prepare for the eventuality where there are
> multiple *custom* instruction set extensions that have conflicting ISA
> encodings?

It will happen. There are four major opcodes reserved for custom
instructions; once a fifth group creates a custom extension using a
major opcode, the pigeonhole principle says there will be a conflicting
encoding.
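
The pigeonhole argument can be made concrete with a small Python sketch. The four opcode values are the custom-0 to custom-3 major opcodes from the base opcode map (custom-2 and custom-3 are additionally earmarked for RV128 there); the first-come-first-served allocation and the function name are assumptions made purely for illustration.

```python
# Major opcodes (instruction bits 6:0) set aside for custom use:
# custom-0, custom-1, custom-2, custom-3.
CUSTOM_MAJOR_OPCODES = (0x0B, 0x2B, 0x5B, 0x7B)


def assign_major_opcodes(groups):
    """Hand out one custom major opcode per group, first come first served.

    Returns (assignment, conflicts). Groups past the fourth wrap around
    and are reported as sharing an opcode with an earlier group.
    """
    assignment = {}
    owners = {}
    conflicts = []
    for index, group in enumerate(groups):
        opcode = CUSTOM_MAJOR_OPCODES[index % len(CUSTOM_MAJOR_OPCODES)]
        assignment[group] = opcode
        if opcode in owners:
            conflicts.append((owners[opcode], group, opcode))
        else:
            owners[opcode] = group
    return assignment, conflicts
```

With four groups the conflict list stays empty; the fifth group necessarily lands on an opcode that is already taken.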

> Will this ever happen in practise? Well, I can imagine two scenarios...
>

> Scenario1: As a CPU designer, I may have developed two different
> custom instruction set extensions, possibly version 1 and version 2,
> or possibly just type A and type B (different application domains),
> and I want them both to co-exist on my future RISC-V cores. However, I
> know they won't ever be used at the same time, and so to keep my ISA
> encoding as compact as possible, I want a way to "switch" between
> using these two extensions.

Unless these extensions need either a lot of encoding space, or a
restricted encoding space (like filling the gaps within one of the
already allocated major opcodes), it should be simple to keep them
separate. For instance, type A uses the custom-0 major opcode, while
type B uses the custom-1 major opcode.

> Scenario2: Extending this a bit further, perhaps a CPU vendor wants to
> create as general of a piece of silicon as possible, and support
> multiple different custom instruction extensions by different third
> parties. These may come from multiple/conflicting/different proposals
> for bit manipulation, for example, such as one which comes from
> BigCompanyA, while another that comes from CommunityProposalB, and a
> third that comes from MedCompanyC.

The simplest way (and one which is very likely to already be present)
would be to allow enabling and disabling each extension individually,
unless their encoding conflicts (in that case, it should allow choosing
between the alternatives).

> It *is probably* reasonable to assume that once a process is launched,
> it cannot switch between these. Or, if it does switch, the cost of
> switching may be very high (complete pipeline+cache flush+more).

If the switching affects only the decoder, the cost is not that high.
The CSR instructions have the CSR number hardcoded within the
instruction, so the instruction decoder can see it early; when decoding
an instruction which touches a decoder-switching CSR, the decoder can
insert a bubble in the pipeline (through the same mechanism it uses for
an instruction cache miss) until the instruction is retired. The cost
ends up being equivalent to only a pipeline flush.
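
To show why the decoder can spot such an instruction early, here is a minimal Python sketch that extracts the CSR number from an instruction word. The field positions follow the base ISA encoding of the CSR instructions; the address 0x7C0 for a decoder-switching CSR is purely hypothetical (it was picked from the custom machine-level CSR range), as are the function names.

```python
SYSTEM_OPCODE = 0x73
# funct3 values of CSRRW, CSRRS, CSRRC and their immediate forms.
CSR_FUNCT3 = {1, 2, 3, 5, 6, 7}
DECODER_MODE_CSR = 0x7C0   # hypothetical address, chosen for illustration


def csr_number(insn: int):
    """Return the 12-bit CSR number if insn is a CSR instruction, else None."""
    opcode = insn & 0x7F
    funct3 = (insn >> 12) & 0x7
    if opcode == SYSTEM_OPCODE and funct3 in CSR_FUNCT3:
        return (insn >> 20) & 0xFFF   # CSR number is instruction bits 31:20
    return None


def must_stall_fetch(insn: int) -> bool:
    """True if the decoder should insert a bubble until insn retires."""
    return csr_number(insn) == DECODER_MODE_CSR
```

Because the check only looks at fixed bit fields of the instruction word, it needs no register read and can be done in the decode stage itself.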

Of course, if the core caches pre-decoded instructions, it might have to
discard that cache, unless it's tagged with the instruction "mode". And
if the switch turns on/off power/clock to parts of the core, it will
obviously be much more expensive.
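
The "tagged with the instruction mode" idea can be sketched as a cache whose key includes the mode, so that a mode switch does not require discarding anything. This is a toy Python model with made-up names, not a description of any real core.

```python
class PredecodeCache:
    """Toy pre-decoded instruction cache keyed by (mode, pc)."""

    def __init__(self):
        self._entries = {}

    def lookup(self, mode, pc, insn, decode):
        """Return the decoded form of insn, decoding at most once per mode."""
        key = (mode, pc)
        if key not in self._entries:
            # Miss: decode under the current mode and remember the result.
            self._entries[key] = decode(mode, insn)
        return self._entries[key]
```

The price is a few extra tag bits per entry, and entries for modes that are not in use still occupy space.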

> If someone can make a case that *rapid* change is necessary, where we
> switch ISA subsets on the fly (possibly followed by a FENCE), then
> this should also be considered.
>

> Should we consider an official generic way to select among these
> possibilities? or is it already part of the CSRs? (sorry but I
> haven't studied CSRs at all)

We already have one for the standard extensions (the MISA CSR). Custom
extensions will probably use a similar mechanism. If you want a generic
way, it could be described in the device tree ("to toggle extension
'riscvfans-bitmanip', use CSR 1234 bit 12").
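
A rough Python sketch of both halves of this: testing an extension bit in misa (bit 0 is 'A', bit 25 is 'Z'), and toggling a custom extension from a descriptor like the device-tree wording above. The descriptor table, the dictionary standing in for the CSR file and the function names are all assumptions for illustration; "CSR 1234 bit 12" is taken literally from the example and is not a real assignment.

```python
def misa_has_extension(misa: int, letter: str) -> bool:
    """Check a standard extension bit in misa (bit 0 = 'A' ... bit 25 = 'Z')."""
    index = ord(letter.upper()) - ord("A")
    if not 0 <= index < 26:
        raise ValueError("extension letter must be A-Z")
    return bool((misa >> index) & 1)


# Hypothetical descriptor mirroring the device-tree wording above.
TOGGLES = {"riscvfans-bitmanip": {"csr": 1234, "bit": 12}}


def toggle_extension(csr_file: dict, name: str, enable: bool) -> None:
    """Set or clear the enable bit for a custom extension in a fake CSR file."""
    entry = TOGGLES[name]
    mask = 1 << entry["bit"]
    value = csr_file.get(entry["csr"], 0)
    csr_file[entry["csr"]] = (value | mask) if enable else (value & ~mask)
```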

--
Cesar Eduardo Barros
ces...@cesarb.eti.br

Luke Kenneth Casson Leighton

Apr 12, 2018, 6:50:32 PM

to Guy Lemieux, Jacob Bachmeyer, RISC-V ISA Dev, Krste Asanovic, Albert Cahalan, Richard Herveille

On Thu, Apr 12, 2018 at 11:06 PM, Guy Lemieux <glem...@vectorblox.com> wrote:
>> We have this: the mvendorid and marchid CSRs. If we require that any given
>> implementation can have at most one set of custom opcodes,
>
> Please try to keep an open mind. This is a very strong requirement.

(I hear you. I didn't want to go overboard with the urgency, though.
Relieved you picked up on it).

> The cost of producing silicon is very high. Hence, it is very likely
> that a silicon vendor will produce a multi-capable CPU for different
> fabless customers, where each fabless customer has one or more custom
> extensions.

... and cross-license the same, due to the high development and
testing costs. They're certainly not going to make modifications to
third party RTLs just because there happens to be a conflict. I can't
possibly imagine any third party vendor reacting particularly well to
a request, "hey guys we need you to move those instructions out of
binary encoding space AAAA, because some uncontrollable China vendor
dumped a hundred million units onto the market which uses that
already.... but we're only going to pay you the same $50k as anyone
else does: take it or leave it".

Given that they probably spent upwards of $2m in VC Funding to develop
the extension (*way* more if they also tested it in silicon), they'll
quite reasonably tell such a difficult potential new customer to take
a running jump. nicely, of course.

> I believe we need a way to prevent conflicts among custom extensions.
>
> In this situation, the silicon vendor must produce the mvendorid and
> marchid. The question is whether these values depend upon the silicon
> instance itself (single value), or whether they can take on multiple
> values (depending upon which custom extension is activated).
>
> So far, Luke's IANA-style numbering is the best way. I'm pretty sure
> we can elaborate on that to make it very flexible and still keep it
> simple.

I thought.. moments after hitting send of course, as you do... that
perhaps vendors could be allowed to create their own custom extensions
*without* needing to actually register each one with the RISC-V
Foundation, by utilising the same trick as USB IDs. Once they'd
registered the (unique) VENDOR_ID, it then becomes entirely their own
prerogative to use that as a prefix for custom extensions.

Form: {VENDORID}{CUSTOMEXTENSIONID}

Thus the administrative burden on the RISC-V Foundation that I
previously thought was necessary now becomes zero (delegating the
burden of responsibility onto implementors, where it belongs).
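
A minimal Python sketch of the {VENDORID}{CUSTOMEXTENSIONID} form, assuming (purely for illustration) that each half is 32 bits wide; no widths are given here, and the function names are made up.

```python
VENDOR_BITS = 32      # assumption: vendor ID treated as a 32-bit value
EXTENSION_BITS = 32   # assumption: width of the vendor-assigned number


def make_extension_id(vendor_id: int, custom_extension_id: int) -> int:
    """Concatenate {VENDORID}{CUSTOMEXTENSIONID} into one identifier."""
    if not 0 <= vendor_id < (1 << VENDOR_BITS):
        raise ValueError("vendor_id out of range")
    if not 0 <= custom_extension_id < (1 << EXTENSION_BITS):
        raise ValueError("custom_extension_id out of range")
    return (vendor_id << EXTENSION_BITS) | custom_extension_id


def split_extension_id(extension_id: int) -> tuple:
    """Recover (vendor_id, custom_extension_id) from the combined value."""
    mask = (1 << EXTENSION_BITS) - 1
    return extension_id >> EXTENSION_BITS, extension_id & mask
```

Only the vendor half needs central registration; uniqueness of the low half is each vendor's own responsibility, as with USB product IDs.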

The Devil Will Be In The Details.

In the "Parallelism Extension" I raised the same conceptual idea that
you and Krste also raised: conditional decoding of instructions based
on CSRs. It sounds like a really really good solution, and, in light
of the direction that I wanted to explore with the Parallelism
Extension, a huge relief that the same conceptual paradigm was raised.

HOWEVER

It has to be said that several people pointed out their concern as to
the potential impact on the decode phase of instructions for
CSR-dependent decoding. As in: it now becomes necessary to consult
the CSRs on *EVERY* occurrence of binary-encoding AAAA. In the case
of the Parallelism Extension, the idea under discussion was to have a
bit-per-register "hey this register is actually a vector please direct
all enquiries to the *vector* instruction decode engine and pipelines"
which gets *really* interesting and needs a heck of a lot of careful
thought.

So with the cost (increased latency) being potentially quite high at
such a critical part of any implementation, it certainly would not be
practical or desirable to compare against large numbers of bits
{VENDORID}{CUSTOMEXTENSIONID}. Consequently it might be necessary to
devise some level of indirection (implementation-specific, not
Standards-mandated) that results in a greatly-reduced number of bits
being needed to carry out the required disambiguation.
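
One possible form of that indirection, sketched in Python: the wide identifier is looked up once, when the mode CSR is written, and only a small slot number travels with instructions into decode. The class and its names are hypothetical, and a real implementation would do this in hardware rather than with a dictionary.

```python
class ExtensionSlots:
    """Map wide {VENDORID}{CUSTOMEXTENSIONID} values to small decode tags."""

    def __init__(self, supported_ids):
        # Slot numbers are assigned densely, in the order given.
        self._slot_of = {ext_id: slot
                         for slot, ext_id in enumerate(supported_ids)}
        # Number of bits the decoder has to compare per instruction.
        self.tag_bits = max(1, (len(self._slot_of) - 1).bit_length())

    def select(self, ext_id: int) -> int:
        """Done once per mode switch: wide compare, small result."""
        if ext_id not in self._slot_of:
            raise KeyError("extension not implemented on this core")
        return self._slot_of[ext_id]
```

A core supporting three conflicting extensions would then carry a 2-bit tag into decode instead of a 64-bit identifier.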

Anyway. At that point, my knowledge runs out (latency impact on
pipelined architectures). I've asked a source if they could give an
opinion, but if anyone else knows more that would be fantastic to get
a consensus.

Or if anyone else can think of an alternative? My feeling is though
that some sort of globally-unique CSR-derived injection of bits into
the instruction decode phase is inherently unavoidable.

l.

Luke Kenneth Casson Leighton

Apr 12, 2018, 7:09:36 PM

to Cesar Eduardo Barros, Guy Lemieux, RISC-V ISA Dev

On Thu, Apr 12, 2018 at 11:48 PM, Cesar Eduardo Barros
<ces...@cesarb.eti.br> wrote:
> Em 11-04-2018 09:37, Guy Lemieux escreveu:
>>
>> This question may seem a bit out there.... but given the flexibility
>> of RISC-V, should we prepare for the eventuality where there are
>> multiple *custom* instruction set extensions that have conflicting ISA
>> encodings?
>
>
> It will happen. There are four major opcodes reserved for custom
> instructions, once the fifth group creates a custom extension using a major
> opcode, the pigeonhole principle says there will be a conflicting encoding.

Off the top of my head I can think of several: video (lots), crypto
(several vendors will make their own), audio (several), dsps (several
of those), tensors, vectors, 3D GPUs - those are all *categories*
meaning that multiple vendors will at some point want to create their
own custom extension.

One I'd like to look at is a "local memory" extension which puts,
right next to every core, a *small* block of SRAM (64x32 bits) or even
Content-Addressable Memory (really REALLY useful for speech / phoneme
decoding as well as TCP/UDP routing).

Bottom line: it's *gonna* run out of space :)

> Unless these extensions need either a lot of encoding space,

vectorblox is an entire VPU, Guy can probably give some idea, it's
likely of the order of.... fifty? a hundred opcodes?

> The simplest way (and one which is very likely to already be present) would
> be to allow enabling and disabling each extension individually, unless their
> encoding conflicts (in that case, it should allow choosing between the
> alternatives).

the "nightmare scenario" is where there's conflicts, yes. AAAA has
two (or more) meanings.

>> It *is probably* reasonable to assume that once a process is launched,
>> it cannot switch between these. Or, if it does switch, the cost of
>> switching may be very high (complete pipeline+cache flush+more).
>
>
> If the switching affects only the decoder, the cost is not that high. The
> CSR instructions have the CSR number hardcoded within the instruction, so
> the instruction decoder can see it early; when decoding an instruction which
> touches a decoder-switching CSR, the decoder can insert a bubble in the
> pipeline (through the same mechanism it uses for an instruction cache miss)
> until the instruction is retired. The cost ends up being equivalent to only
> a pipeline flush.

Ok, this seems inordinately better than the idea I came up with on
the "Parallelism Extension" exploration, where an entirely new
pipeline phase was conditionally dropped in to the mix.

> Of course, if the core caches pre-decoded instructions, it might have to
> discard that cache, unless it's tagged with the instruction "mode". And if
> the switch turns on/off power/clock to parts of the core, it will obviously
> be much more expensive.

whoops :)

>> If someone can make a case that *rapid* change is necessary, where we
>> switch ISA subsets on the fly (possibly followed by a FENCE), then
>> this should also be considered.
>>
>> Should we consider an official generic way to select among these
>> possibilities? or is it already part of the CSRs? (sorry but I
>> haven't studied CSRs at all)
>
>
> We already have one for the standard extensions (the MISA CSR).

ah ha! very cool! precedent [and another area I need to investigate].

thx Cesar.

l.

Cesar Eduardo Barros

Apr 12, 2018, 7:16:49 PM

to Luke Kenneth Casson Leighton, Krste Asanovic, Guy Lemieux, RISC-V ISA Dev

There's a weak precedent for this in Intel's MMX, where the EMMS
instruction (or AMD's faster FEMMS) should be used when switching
between SIMD and floating point, and in Intel's VZEROUPPER instruction,
which should be used when switching between 256-bit AVX and legacy SSE
code.

There's also precedent for this in the x86's EFLAGS.DF bit, which shows
the pitfalls of this approach:

- It's state which must be saved and restored when entering the kernel
on an interrupt, and/or switching context to another process;
- It's state which must be saved, initialized, and restored on a Unix
signal handler (see https://lwn.net/Articles/272048/ for what happens
when you get this wrong).
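
The two pitfalls above can be modelled in a few lines of Python. This is a toy model with invented names: `decode_mode` stands in for a hypothetical "which custom extension is active" CSR, and mode 0 is assumed to be the default.

```python
class Hart:
    def __init__(self):
        self.decode_mode = 0   # hypothetical mode CSR; 0 = default


class Task:
    def __init__(self, name, decode_mode=0):
        self.name = name
        self.saved_decode_mode = decode_mode


def context_switch(hart, outgoing, incoming):
    """The mode is per-process state, so it is swapped like a register."""
    outgoing.saved_decode_mode = hart.decode_mode
    hart.decode_mode = incoming.saved_decode_mode


def run_signal_handler(hart, handler):
    """Save, initialise and restore the mode around a signal handler."""
    saved = hart.decode_mode
    hart.decode_mode = 0       # handler code was compiled for the default
    try:
        handler(hart)
    finally:
        hart.decode_mode = saved
```

Forgetting the reset in `run_signal_handler` is the analogue of the EFLAGS.DF bug: the handler would run with whatever mode the interrupted code happened to have selected.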

> in particular it's worth noting that it is *not* necessary to call
> that mode-setting instruction prior to *every* use of a custom
> instruction. It would be perfectly reasonable to do this (and for
> compilers to track it):
>
> set-csr-iana-style {insert custom vendor 1 unique number here}
> instruction A # custom vendor 1's "meaning"
> standard RV32 instruction
> standard RV32 instruction
> instruction A' # from the same custom vendor 1
> set-csr-iana-style {insert custom vendor 2 unique number here}
> instruction A # custom vendor *2*'s "meaning"
> instruction A' # from the same custom vendor *2*
> standard RV32 instruction
> ....

In practice, the compiler would insert a call to the mode-setting
instruction at the top of every function which needs the non-default mode.
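
The quoted sequence can be simulated with a short Python sketch in which the same encoding decodes differently depending on the current mode. The vendor numbers, operation names and the choice to treat custom instructions in mode 0 as illegal are all assumptions made for illustration.

```python
# Hypothetical per-mode decode tables: the same encodings "A" and "A'"
# mean different things under vendor 1 and vendor 2.
DECODE_TABLES = {
    1: {"A": "vendor1.op_a", "A'": "vendor1.op_a_prime"},
    2: {"A": "vendor2.op_a", "A'": "vendor2.op_a_prime"},
}


def run(program):
    """Return what each non-mode-setting instruction decodes to."""
    mode = 0                      # default: no custom extension selected
    trace = []
    for kind, arg in program:
        if kind == "set-mode":    # stands in for set-csr-iana-style
            mode = arg
        elif kind == "custom":
            table = DECODE_TABLES.get(mode, {})
            trace.append(table.get(arg, "illegal-instruction"))
        else:
            trace.append("standard")
    return trace
```

Running the sequence from the quoted message through `run` yields vendor 1's meanings up to the second mode switch and vendor 2's afterwards, while the standard instructions are unaffected by the mode.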

> The nice thing about this would be that the compiler writers would
> know what to expect when it comes to upcoming custom instructions.
> Right now, hmmm this hadn't occurred to me until now: vendors that
> create custom instructions will be FORCED into the position of
> maintaining their own fork of gcc. Without the above *STANDARD*
> formal method that is part of the *BASE* RV, the first vendor that
> submits patches to gcc basically becomes the de-facto "dominator" of
> that encoding. Total chaos and much pain ensues.

There seems to be some confusion here. First of all, gcc doesn't know
anything about the instruction encodings; that's the job of the
assembler, which comes from a different project (the binutils project).
Besides that, the compiler will not use these instructions by default,
only when directed to do so through its command line flags or function
attributes. Mutually incompatible instruction set extensions can be
added to gcc without any problem, as long as it knows they can't be
enabled at the same time. I see no reason why the first one to be
submitted would win.

Also, there are ways to use custom instructions without gcc knowing
about them. The gcc "inline assembly" syntax was designed for that use
case; it allows one to describe precisely which registers and/or memory
are used by the custom instruction, such that the compiler can use it as
if it were an instruction it knows (in fact, the gcc inline assembly
operand constraint syntax is directly based on gcc's internal machine
description syntax).
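
To make the division of labour concrete: the only thing that has to know the bit-level encoding is whatever produces the instruction word. A small Python sketch of R-type encoding follows, using the field layout from the base ISA and the custom-0 major opcode. The helper name is made up, and the idea of emitting the resulting word through something like a `.word` directive inside inline assembly is an assumption about how a wrapper might be written, not a recommendation from this thread.

```python
CUSTOM_0 = 0x0B   # custom-0 major opcode (instruction bits 6:0)


def encode_r_type(opcode, rd, funct3, rs1, rs2, funct7):
    """Pack R-type fields into a 32-bit instruction word."""
    fields = (
        ("opcode", opcode, 7), ("rd", rd, 5), ("funct3", funct3, 3),
        ("rs1", rs1, 5), ("rs2", rs2, 5), ("funct7", funct7, 7),
    )
    for name, value, bits in fields:
        if not 0 <= value < (1 << bits):
            raise ValueError(name + " does not fit in its field")
    return ((funct7 << 25) | (rs2 << 20) | (rs1 << 15)
            | (funct3 << 12) | (rd << 7) | opcode)
```

The compiler only needs to be told, through the operand constraints, which registers the word reads and writes; it never inspects the encoding itself.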

Luke Kenneth Casson Leighton

Apr 12, 2018, 7:39:22 PM

to Cesar Eduardo Barros, Krste Asanovic, Guy Lemieux, RISC-V ISA Dev

On Fri, Apr 13, 2018 at 12:16 AM, Cesar Eduardo Barros
<ces...@cesarb.eti.br> wrote:

> There's also precedent for this in the x86's EFLAGS.DF bit, which shows the
> pitfalls of this approach:

wheww, nice to know that new problems are actually old ones with
history and clear understanding.

>> [...]

>> set-csr-iana-style {insert custom vendor 1 unique number here}
>> instruction A # custom vendor 1's "meaning"
>> standard RV32 instruction
>

> In practice, the compiler would insert a call to the mode-setting
> instruction at the top of every function which needs the non-default mode.

i'm very relieved to learn that there's someone who knows the precise
and relevant details of what I cannot possibly hope to cover in full,
due to the extremely broad and general nature of my expertise. so...
thank you... and also for clarifying about gcc / assembly. i'm
reminded that gcc stands for "gnu compiler *collection*".

l.

Cesar Eduardo Barros

Apr 12, 2018, 8:01:27 PM

to lk...@lkcl.net, RISC-V ISA Dev, kr...@berkeley.edu, acah...@gmail.com, glem...@vectorblox.com, richard....@roalogic.com

Em 12-04-2018 15:25, lk...@lkcl.net escreveu:
>
>
> On Thursday, April 12, 2018 at 8:16:26 AM UTC+1, Richard Herveille wrote:
>

> On 12/04/2018, 08:55, "kr...@berkeley.edu"

Once you postulate a rogue manufacturer, all bets are off. But I fail to
see the issue here; gcc would add it behind a command line option,
u-boot and the kernel would add it behind a kernel config option. The
problematic scenario is on the hardware side: other manufacturers
feeling pressured to include the extension with a compatible encoding.

A crypto engine instruction, by the way, is a bad example, since it's
easy to contain it within a few well-defined and easily-replaced
software subroutines. A good example would be a pervasive instruction
set extension, like SIMD (with auto-vectorising compilers) or bit
manipulation; these tend to be used everywhere.

> So tracing back the chain, the *cause* of the clash can be traced back
> to the lack of control and clarity in the RISC-V ISA Specification
> regarding custom extensions.
>
> Standards MUST nail EVERYTHING down if they are to BE Standards, and
> there must be no wavering. ALL possibility of ambiguity, conflict and
> confusion MUST be nailed in a backwards-compatible, forwards-compatible,
> extensible way. No exceptions. This is one thing that, as Albert
> points out, Intel gets right.

There's a contradiction here; if a standard nails absolutely everything
down, there's no possibility of forward compatibility or extensibility.
There's a balance between making a standard too rigid or too soft.

> So I can tell you, *right now*, that the lesson that Albert warns
> about *will* hit RISC-V - RISC-V will be a FAILURE - if it is not made
> mandatory that custom extensions be "registered" and given an IANA-like
> CSR, with an associated instruction to be called to put the processor
> into that "mode".

There are many scenarios where it makes no sense to be able to disable
any extension, be it standard or custom. For instance, a deeply embedded
control processor with a few special-purpose instructions. Disabling and
enabling extensions is more useful for general-purpose CPUs.

> Note that I did not say that the custom extension has to tell the
> RISC-V what they intend to *do* with that custom extension: they just
> have to *register* it (and have gcc emit appropriate CSR mode-setting
> instructions prior to use)

For many scenarios, forcing the compiler to emit a CSR write on every
function is pointless. Systems where the extension can't be disabled,
systems where the extension is enabled on boot and left enabled,
programs where an initialization routine sets up the extensions, ...

> I do understand that there is a desire for a "hands-off" approach to
> custom extensions, Krste. An alternative approach would be to request
> that the creators of custom extensions be required, by the RISC-V
> Foundation, *NEVER*, under *ANY* circumstances, to publish their designs
> or ANY aspect of the custom extension. not the source code, not the
> RTL, not the tools, nothing. However I believe you would agree that
> that would be unworkable and impractical, and very much against the
> spirit of RISC-V. Not least it would have implementors view RISC-V as
> not to be taken seriously.
>
> So in comparison to the alternatives, I think you'll find that an
> IANA-like arrangement would be warmly accepted and welcomed, once it was
> explained. Precedents include IANA itself, USB IDs, PCIe IDs, and linux
> kernel MACHINE_IDs. It's an extremely well-known solution. The RISC-V
> Foundation becomes the "atomic transactor" for unambiguity of the ISA
> under its wing :)
>
> Writing unambiguous standards is damn hard. Being *responsible* for
> them is... an awful lot of work.

I see nothing ambiguous about "these opcodes are reserved for your
custom extensions". And requiring registration for custom extensions is
not a good thing. The Debian project (whose Free Software Guidelines
were later adopted as a standard definition for Open Source) has the
"Desert Island Test": if your software cannot be shared by a group of
people on a desert island who have no way of contacting the outside,
it's not free software.

In the same way, if one could not add extensions to RISC-V without
registering them with the foundation, an important part of RISC-V's
freedom would have been lost. And it's that freedom which is RISC-V's
main differentiator.

Luke Kenneth Casson Leighton

Apr 12, 2018, 8:51:37 PM4/12/18

to Cesar Eduardo Barros, RISC-V ISA Dev, Krste Asanovic, Albert Cahalan, Guy Lemieux, Richard Herveille

On Fri, Apr 13, 2018 at 1:01 AM, Cesar Eduardo Barros
<ces...@cesarb.eti.br> wrote:

> Once you postulate a rogue manufacturer, all bets are off.

precisely. one that, due to overwhelming volumes, has to be taken
into consideration. earlier in the thread Krste stated that it would
not be necessary *at all* to take into account custom extensions, on
the basis that gcc (and other tools) would not incorporate custom
extensions. i felt compelled to point out a realistic scenario that
indicated that the RISC-V Foundation is *knowingly* not in control of
decisions made by Software Libre developers, and, thus, logically, the
RISC-V Foundation really does have to take that into account, take it
very seriously and mitigate the "threat" to the RISC-V eco-system
accordingly.

> But I fail to see
> the issue here; gcc would add it behind a command line option, u-boot and
> the kernel would add it behind a kernel config option. The problematic
> scenario is on the hardware side: other manufacturers feeling pressured to
> include the extension with a compatible encoding.

and so on. there are many scenarios that lead to the same ambiguous
hell. the litmus test being: can objdump successfully decompile all
and *any* arbitrarily found-somewhere-on-the-internet binaries? if
the answer's "no" then the RISC-V Foundation has unfortunately failed
in its duty and responsibility towards the RISC-V ecosystem (in the
manner outlined by Albert in the POWERPC example that he gave).

> A crypto engine instruction, by the way, is a bad example,

oh :)

> since it's easy
> to contain it within a few well-defined and easily-replaced software
> subroutines. A good example would be a pervasive instruction set extension,
> like SIMD (with auto-vectorising compilers) or bit manipulation; these tend
> to be used everywhere.

ok. very cool. thank you for coming up with a better example.

>> So tracing back the chain, the *cause* of the clash can be traced back
>> to the lack of control and clarity in the RISC-V ISA Specification
>> regarding custom extensions.
>>
>> Standards MUST nail EVERYTHING down if they are to BE Standards, and
>> there must be no wavering. ALL possibility of ambiguity, conflict and
>> confusion MUST be nailed in a backwards-compatible, forwards-compatible,
>> extensible way. No exceptions. This is one thing that, as Albert points
>> out, Intel gets right.
>
>
> There's a contradiction here; if a standard nails absolutely everything
> down, there's no possibility of forward compatibility or extensibility.
> There's a balance between making a standard too rigid or too soft.

sorry there was a misunderstanding: the
*backwards/forwards-compatibility* needs to be absolutely nailed to
the floor. successful standards that achieve this include SMB (with
some hilarious screw-ups by various vendors over the years), SATA
(hardware-level and firmware-level speed negotiation), PCIe
(multi-lane negotiation, firmware-level speed negotiation), USB3
(*several* levels of hardware and firmware negotiation).

SD/MMC is a counter-example where they *didn't* get it right, by
trying to remove support for SPI (1-bit) mode in MMC 4... everybody
ignores that because it's harder to rip out the functionality that's
already there, than it is to leave it in. *apart* from that, SD/MMC
is an awesome example of how backwards/forwards-compatibility is
absolutely nailed to the floor. there's even voltage-switching (3.3v
down to 1.8v) and flipping over to DDR, as well as to
differential-pairs *and* backwards-compatibility right down to
25mbit/sec single-lane SPI.

> [...]

>> Writing unambiguous standards is damn hard. Being *responsible* for
>> them is... an awful lot of work.
>
>
> I see nothing ambiguous about "these opcodes are reserved for your custom
> extensions". And requiring registration for custom extensions is not a good
> thing.

in a later message (i notice you're going through the thread....)
you'll see that this was addressed. i hope.

> The Debian project (whose Free Software Guidelines were later adopted
> as a standard definition for Open Source) has the "Desert Island Test": if
> your software cannot be shared by a group of people on a desert island who
> have no way of contacting the outside, it's not free software.
>
> In the same way, if one could not add extensions to RISC-V without
> registering them with the foundation, an important part of RISC-V's freedom
> would have been lost. And it's that freedom which is RISC-V's main
> differential.

Jacob kindly pointed out that RISC-V has a similar Vendor-Mach ID to
USB and PCIe. Also, the linux kernel maintainers similarly have
"MACHINE_ID" registration, all of which *must* be done in an atomic
fashion by some "central authority".

By the "Desert Island Test" in its strictest definition, neither
RISC-V *nor the linux kernel* are software libre, due to the need for
the atomic global registration. However I do get your point.

l.

Cesar Eduardo Barros

Apr 12, 2018, 10:38:50 PM4/12/18

to Luke Kenneth Casson Leighton, RISC-V ISA Dev, Krste Asanovic, Albert Cahalan, Guy Lemieux, Richard Herveille

On 12-04-2018 21:51, Luke Kenneth Casson Leighton wrote:
> On Fri, Apr 13, 2018 at 1:01 AM, Cesar Eduardo Barros
> <ces...@cesarb.eti.br> wrote:
>
>> Once you postulate a rogue manufacturer, all bets are off.
>
> precisely. one that, due to overwhelming volumes, has to be taken
> into consideration. earlier in the thread Krste stated that it would
> not be necessary *at all* to take into account custom extensions, on
> the basis that gcc (and other tools) would not incorporate custom
> extensions. i felt compelled to point out a realistic scenario that
> indicated that the RISC-V Foundation is *knowingly* not in control of
> decisions made by Software Libre developers, and, thus, logically, the
> RISC-V Foundation really does have to take that into account, take it
> very seriously and mitigate the "threat" to the RISC-V eco-system
> accordingly.

Here's where we disagree: I don't see it as a threat, but as an
opportunity. You're worried that manufacturers might be "forced" to
implement specific custom extensions, and moreover that these very
popular custom extensions will unavoidably conflict in their encoding.
But for it to reach that point, RISC-V must already be very popular. And
these custom extensions can be used as a prototype for a standard
extension covering the same domain.

If the toolchain developers add support for these custom extensions, so
what? They'll do so even for obscure extensions, and it'll be behind an
option in any case. Support for a custom extension does not hinder
support for another custom extension, or for a future standard extension.

>> But I fail to see
>> the issue here; gcc would add it behind a command line option, u-boot and
>> the kernel would add it behind a kernel config option. The problematic
>> scenario is on the hardware side: other manufacturers feeling pressured to
>> include the extension with a compatible encoding.
>
> and so on. there are many scenarios that lead to the same ambiguous
> hell. the litmus test being: can objdump successfully decompile all
> and *any* arbitrarily found-somewhere-on-the-internet binaries? if
> the answer's "no" then the RISC-V Foundation has unfortunately failed
> in its duty and responsibility towards the RISC-V ecosystem (in the
> manner outlined by Albert in the POWERPC example that he gave).

No, it can't, not even for x86 binaries. That's too high of a bar to clear.

Let's reduce the scope a bit: can it successfully decompile once you tell
it which subarchitecture it is (x86-32, x86-64, etc)? It still can't, it
gets lost too easily in the instruction boundaries.

So, let's reduce the scope even more: can it successfully decompile when
the instruction boundary is correct? Not necessarily, the instruction
might be newer than the objdump.

So we see here the limits of objdump: it has to know the instruction,
and it has to know which subarchitecture it's looking at. In RISC-V's
case, the "subarchitecture" includes which custom instructions it has.
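
To make the point about subarchitectures concrete, here is a minimal sketch in Python. It is not how objdump works internally; the mnemonics, the subarchitecture names and the table layout are all made up for illustration. The only real constant is the custom-0 major opcode. The same instruction word disassembles differently depending on which subarchitecture the tool is told about, and falls back to a raw word when the tool knows nothing about it.

```python
# Hypothetical per-subarchitecture tables: (major opcode, funct3) -> mnemonic.
# Only CUSTOM_0 is a real value (the custom-0 major opcode); the rest is invented.
CUSTOM_0 = 0b0001011

SUBARCH_TABLES = {
    "vendor-a": {(CUSTOM_0, 0b000): "va.bitrev"},
    "vendor-b": {(CUSTOM_0, 0b000): "vb.dotprod"},
}


def disassemble(word, subarch):
    """Return a mnemonic for a 32-bit word, given the subarchitecture name."""
    opcode = word & 0x7F          # bits [6:0] hold the major opcode
    funct3 = (word >> 12) & 0x7   # bits [14:12]
    table = SUBARCH_TABLES.get(subarch, {})
    mnemonic = table.get((opcode, funct3))
    if mnemonic is None:
        # Unknown to this tool: emit the raw word instead of guessing.
        return ".word 0x%08x" % word
    return mnemonic


word = 0x0000000B  # custom-0 opcode, funct3 = 0
listing = [disassemble(word, name) for name in ("vendor-a", "vendor-b", "plain")]
```

Without the subarchitecture argument there is no way to pick between the first two results, which is the limit described above.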

(And that's before considering the cases where the documentation doesn't
match the hardware, like the recent issues with UD0/UD1 in the Linux
kernel. These should hopefully be less common with RISC-V.)

>>> So tracing back the chain, the *cause* of the clash can be traced back
>>> to the lack of control and clarity in the RISC-V ISA Specification
>>> regarding custom extensions.
>>>
>>> Standards MUST nail EVERYTHING down if they are to BE Standards, and
>>> there must be no wavering. ALL possibility of ambiguity, conflict and
>>> confusion MUST be nailed in a backwards-compatible, forwards-compatible,
>>> extensible way. No exceptions. This is one thing that, as Albert points
>>> out, Intel gets right.
>>
>>
>> There's a contradiction here; if a standard nails absolutely everything
>> down, there's no possibility of forward compatibility or extensibility.
>> There's a balance between making a standard too rigid or too soft.
>
> sorry there was a misunderstanding: the
> *backwards/forwards-compatibility* needs to be absolutely nailed to
> the floor. successful standards that achieve this include SMB (with
> some hilarious screw-ups by various vendors over the years), SATA
> (hardware-level and firmware-level speed negotiation), PCIe
> (multi-lane negotiation, firmware-level speed negotiation), USB3
> (*several* levels of hardware and firmware negotiation).

The backwards/forwards compatibility with custom instructions has been
addressed, by reserving a few of the major opcodes for custom
instruction extensions only. That is: future versions of the RISC-V
standard will not use these major opcodes, so they will not conflict
with custom extensions which used them.

The backwards/forwards compatibility within a custom extension is the
responsibility of the custom extension author.

What you're concerned with is something completely different: conflicts
between different, separately authored, custom extensions. There's no
easy answer for that. The encoding space is unavoidably limited (though
RISC-V has more free encoding space than most), hardware considerations
like the placement of fields within an instruction are important, and
every solution or workaround has a cost, which might not be acceptable
for some applications.

>> [...]
>
>>> Writing unambiguous standards is damn hard. Being *responsible* for
>>> them is... an awful lot of work.
>>
>>
>> I see nothing ambiguous about "these opcodes are reserved for your custom
>> extensions". And requiring registration for custom extensions is not a good
>> thing.
>
> in a later message (i notice you're going through the thread....)
> you'll see that this was addressed. i hope.

Personally, I'd put that information in the device tree: which
extensions are available, with their name (in device tree style, which
combines the vendor within the name) and how to enable/disable each of
them (which CSR and which bit within the CSR, if it's not permanently
enabled). This has the distinct advantage that the operating system does
not need to have a table mapping the architecture ID to the extensions
and their control registers, and therefore an old operating system can
be used with a new processor without losing access to the custom
extensions it already knows.
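
A rough sketch of that idea, in Python rather than real device tree syntax. Everything here is an assumption for illustration: the "compatible" strings, the CSR numbers and the bit positions are invented, and no existing binding defines them. The point is only that the OS needs no table keyed by architecture ID; it walks the description and enables what it recognises.

```python
# Hypothetical device-tree-like description of custom extensions.
# "csr"/"bit" say how to enable the extension; None means permanently enabled.
EXTENSIONS = [
    {"compatible": "vendora,bitmanip", "csr": 0x7C0, "bit": 0},
    {"compatible": "vendorb,simd", "csr": 0x7C0, "bit": 1},
    {"compatible": "vendorc,crypto", "csr": None, "bit": None},
]


def csr_enable_masks(extensions, known):
    """Return {csr_number: bit mask} for the extensions this OS knows about."""
    masks = {}
    for ext in extensions:
        if ext["compatible"] not in known:
            continue  # unknown extension: leave it alone
        if ext["csr"] is None:
            continue  # always on, nothing to write
        masks[ext["csr"]] = masks.get(ext["csr"], 0) | (1 << ext["bit"])
    return masks


# An older OS that only knows two of the three extensions.
old_os = csr_enable_masks(EXTENSIONS, {"vendora,bitmanip", "vendorc,crypto"})
```

An old OS keeps working on a newer processor: it just ignores the entries it does not recognise.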

>> The Debian project (whose Free Software Guidelines were later adopted
>> as a standard definition for Open Source) has the "Desert Island Test": if
>> your software cannot be shared by a group of people on a desert island who
>> have no way of contacting the outside, it's not free software.
>>
>> In the same way, if one could not add extensions to RISC-V without
>> registering them with the foundation, an important part of RISC-V's freedom
>> would have been lost. And it's that freedom which is RISC-V's main
>> differential.
>
> Jacob kindly pointed out that RISC-V has a similar Vendor-Mach ID to
> USB and PCIe. Also, the linux kernel maintainers similarly have
> "MACHINE_ID" registration, all of which *must* be done in an atomic
> fashion by some "central authority".
>
> By the "Desert Island Test" in its strictest definition, neither
> RISC-V *nor the linux kernel* are software libre, due to the need for
> the atomic global registration. However I do get your point.

The difference is that it's not legally mandated. If I'm stuck on a
desert island, I can use any machine ID I want, and won't face any legal
repercussion for doing so when I get back to civilization.

Also, AFAIK the machine ID is much less important nowadays, since the
configuration is no longer done using board files (keyed by the machine
ID), instead it's done through the device tree. The ARM machine ID is a
relic from the old days.

Jacob Bachmeyer

Apr 12, 2018, 11:01:59 PM4/12/18

to Luke Kenneth Casson Leighton, RISC-V ISA Dev, Krste Asanovic, Albert Cahalan, Guy Lemieux, Richard Herveille

Luke Kenneth Casson Leighton wrote:
> On Thu, Apr 12, 2018 at 10:33 PM, Jacob Bachmeyer <jcb6...@gmail.com> wrote:
>

>> [...]

>>
>> This could also allow for a database-driven extensible assembler, using a
>> --targetid=<VENDOR-ID>/<ARCH-ID> parameter to select which custom mnemonics
>> should be recognized. GCC support is unlikely to be needed for custom
>> extensions; wrapper functions or inline assembler should be sufficient.
>>
>
> indeed it could.
>
> Ok so there is one bit that is missing, here: making sure that gcc
> knows that, when two particular custom extensions that contain overlapping
> encodings are used on the same {VENDORID/ARCHID} machine, they.... oh
> damn it doesn't work, does it?
>
> So the issue is: {VENDORID/ARCHID} identifies the *machine*, it
> doesn't identify the *custom extension*. As in: one vendor may
> license *another company's custom extension*.
>

The database can store aliases? Or some form of "this instruction is
available on this list of {vendor-id, arch-id} tuples"? It still holds,
as long as any particular {vendor-id, arch-id} identifies an
implementation with a non-overlapping set of custom instructions. The
"same" instruction could even have different encodings on different
target machines.
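
As a sketch of what such a database could look like (Python; the mnemonics, IDs and encodings are invented, and comparing whole encodings for equality is a simplification of the mask/match patterns a real assembler would use):

```python
# Hypothetical opcode database: mnemonic -> {(vendor_id, arch_id): encoding}.
# The "same" instruction may have different encodings on different targets.
OPCODE_DB = {
    "x.bitrev": {(0x123, 0x1): 0x0000000B, (0x456, 0x7): 0x0000002B},
    "y.dotprod": {(0x456, 0x7): 0x0000100B},
}


def lookup(mnemonic, vendor_id, arch_id):
    """Return the encoding of mnemonic on the given target, or raise."""
    targets = OPCODE_DB.get(mnemonic)
    if targets is None:
        raise KeyError("unknown mnemonic: %s" % mnemonic)
    if (vendor_id, arch_id) not in targets:
        raise ValueError("%s is not available on target %#x/%#x"
                         % (mnemonic, vendor_id, arch_id))
    return targets[(vendor_id, arch_id)]


def overlapping(vendor_id, arch_id):
    """List pairs of mnemonics that share an encoding on one target."""
    seen = {}
    clashes = []
    for mnemonic, targets in sorted(OPCODE_DB.items()):
        encoding = targets.get((vendor_id, arch_id))
        if encoding is None:
            continue
        if encoding in seen:
            clashes.append((seen[encoding], mnemonic))
        else:
            seen[encoding] = mnemonic
    return clashes
```

The overlapping() check is the condition stated above: the scheme holds as long as it returns an empty list for every registered target.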

>> I do agree, however, that RISC-V binutils should not accept patches for
>> proprietary exclusive extensions -- if you want the standard tools to
>> support your extension, even as a custom extension, you *must* permit anyone
>> to implement compatible support. No claim of exclusivity will be tolerated.
>>
>
> This is a slightly different issue from the (show-stopping) initial
> issue that Guy identified.

It is still an important policy and the Foundation, as maintainers of
the official binutils port and the hypothetical custom opcode database,
is in a position to enforce it. I highly doubt that the GNU project
would object to such a policy.

-- Jacob

Jacob Bachmeyer

Apr 12, 2018, 11:20:49 PM4/12/18

to Guy Lemieux, lk...@lkcl.net, RISC-V ISA Dev, Krste Asanovic, Albert Cahalan, Richard Herveille

Guy Lemieux wrote:
>> We have this: the mvendorid and marchid CSRs. If we require that any given
>> implementation can have at most one set of custom opcodes,
>>
>
> Please try to keep an open mind. This is a very strong requirement.
> The cost of producing silicon is very high. Hence, it is very likely
> that a silicon vendor will produce a multi-capable CPU for different
> fabless customers, where each fabless customer has one or more custom
> extensions.
>
> I believe we need a way to prevent conflicts among custom extensions.
>

Such a vendor could ensure that the various extensions do not conflict,
or could arrange for each customer to get only their own extensions.

> In this situation, the silicon vendor must produce the mvendorid and
> marchid. The question is whether these values depend upon the silicon
> instance itself (single value), or whether they can take on multiple
> values (depending upon which custom extension is activated).
>

Having different "strappings" for the same nearly-identical silicon,
that activate different extensions (or different encodings ...) and are
set at fabrication-time would solve this problem, while also isolating
the extension sets (and customers) from each other.

> So far, Luke's IANA-style numbering is the best way. I'm pretty sure
> we can elaborate on that to make it very flexible and still keep it
> simple. This will not affect vendors that do not attempt to support
> conflicting extensions, so it has zero cost to them (except perhaps
> returning 0s for a CSR that might control this switching of
> extensions).
>

We already have IANA-style numbering for vendor and arch IDs.

>> This could also allow for a database-driven extensible assembler, using a
>> --targetid=<VENDOR-ID>/<ARCH-ID> parameter to select which custom mnemonics
>> should be recognized. GCC support is unlikely to be needed for custom
>> extensions; wrapper functions or inline assembler should be sufficient.
>>
>
> I think you're right here. As a single datapoint: VectorBlox has only
> modified the assembler.
>

Thank you.

>> I do agree, however, that RISC-V binutils should not accept patches for
>> proprietary exclusive extensions -- if you want the standard tools to
>> support your extension, even as a custom extension, you *must* permit anyone
>> to implement compatible support. No claim of exclusivity will be tolerated.
>>
>
> This is very GPL-like in intensity.
>
> There is a lot of merit to keep the baseline as streamlined and
> non-proprietary as possible.
>

An instruction, fundamentally, is a mathematical operation "named" by a
number (the encoding). Permitting exclusive claims to *math* is
outrageous. I suggest no requirement to publish an implementation, only
that the official RISC-V binutils adopt a policy of never supporting any
extension for which anyone claims a right to prevent someone else from
producing an independent implementation that also understands those
instructions. This is also a "ring fence" around potential bad actors
who might try to claim exclusive ownership of the CUSTOM-0/CUSTOM-1
opcodes, which are specifically intended as a sort of "RFC1918" space
for implementation-defined instructions, or any other part of the RISC-V
encoding space.

> However, there is also a point where even proprietary extensions might
> get adopted, if they become popular enough. Still, I think this issue
> is a bit off-topic -- the question is whether the ISA needs to support
> a *standard* way of switching between conflicting custom instruction
> sets.
>

For the nuance, I believe that proprietary instructions should be
allowed, but will receive no support whatsoever from the official RISC-V
binutils. (Since the current policy is no support at all for custom
instructions, I am actually advocating a more-permissive policy.) I
would suggest that waiving any claims to exclusive ownership of any part
of the RISC-V encoding space be a requirement for using the RISC-V
trademark, however.

A bad actor might be able to make such a claim under some bizarre
theory, but would then be prohibited from calling their product
"RISC-V", thus protecting RISC-V from such claims.

-- Jacob

Jacob Bachmeyer

Apr 12, 2018, 11:28:48 PM4/12/18

to Luke Kenneth Casson Leighton, Cesar Eduardo Barros, RISC-V ISA Dev, Krste Asanovic, Albert Cahalan, Guy Lemieux, Richard Herveille

Luke Kenneth Casson Leighton wrote:

> On Fri, Apr 13, 2018 at 1:01 AM, Cesar Eduardo Barros
> <ces...@cesarb.eti.br> wrote:
>

> [...]

>
>
>> The Debian project (whose Free Software Guidelines were later adopted
>> as a standard definition for Open Source) has the "Desert Island Test": if
>> your software cannot be shared by a group of people on a desert island who
>> have no way of contacting the outside, it's not free software.
>>
>> In the same way, if one could not add extensions to RISC-V without
>> registering them with the foundation, an important part of RISC-V's freedom
>> would have been lost. And it's that freedom which is RISC-V's main
>> differential.
>>
>
> Jacob kindly pointed out that RISC-V has a similar Vendor-Mach ID to
> USB and PCIe. Also, the linux kernel maintainers similarly have
> "MACHINE_ID" registration, all of which *must* be done in an atomic
> fashion by some "central authority".
>
> By the "Desert Island Test" in its strictest definition, neither
> RISC-V *nor the linux kernel* are software libre, due to the need for
> the atomic global registration. However I do get your point.
>

Assuming our intrepid band has somehow constructed a fab on their desert
island, they do have an option fully conformant to the spec: hardwire
mvendorid and marchid to zero. No communication with the outside world
is needed. (Although if they can build a fab, they can probably build
radios...)

-- Jacob

lk...@lkcl.net

Apr 12, 2018, 11:44:26 PM4/12/18

to RISC-V ISA Dev, lk...@lkcl.net, kr...@berkeley.edu, acah...@gmail.com, glem...@vectorblox.com, richard....@roalogic.com, ces...@cesarb.eti.br

so how would gcc support two custom extensions where the exact same encoding is required *in the same executable* because the vendor has licensed two custom RTLs that require the exact same encoding to represent their instructions? (it's not as non-sensical a question as it sounds).

> So we see here the limits of objdump: it has to know the instruction,
> and it has to know which subarchitecture it's looking at. In RISC-V's
> case, the "subarchitecture" includes which custom instructions it has.

ok so there are subtleties of objdump that I wasn't aware of which cloud the point I was making. Then the question perhaps should be rephrased, "can qemu be programmed to be capable of supporting all and any custom extensions" and if not ....

> The backwards/forwards compatibility with custom instructions has been
> addressed, by reserving a few of the major opcodes for custom
> instruction extensions only.

it *hasn't* been addressed, in the specific case of where a vendor wishes to license two (or more) RTLs from two (or more) different vendors, each of which utilises the exact same encoding AAAA to mean totally different things.

would you agree that it is hopelessly unrealistic to expect the providers of those custom extensions to *rewrite* them, given the cost of developing and testing them?

how is this *specific* scenario to be handled?

> What you're concerned with is something completely different: conflicts
> between different, separately authored, custom extensions.

yes.

> There's no easy answer for that.

that's the purpose of the discussion that guy raised: to creatively explore such options. various branches of the discussion ruled out several misunderstandings along the way (and raised some interesting creative possibilities), but yes, agreed: it's not easy.

> The encoding space is unavoidably limited (though
> RISC-V has more free encoding space than most), hardware considerations
> like the placement of fields within an instruction are important, and
> every solution or workaround has a cost, which might not be acceptable
> for some applications.

plus, due to the very nature of custom extensions, the complete lack of coordination and centralisation is touted as a *strength* of the freedom... but with no "prefix" - no "scoping" if we are to use c++ or other software language - we end up with the kind of chaos that you get with OpenSCAD, javascript, and other brain-dead languages that fail to provide clear and clean scoping rules.

>>> I see nothing ambiguous about "these opcodes are reserved for your custom
>>> extensions". And requiring registration for custom extensions is not a good
>>> thing.
>>
>> in a later message (i notice you're going through the thread....)
>> you'll see that this was addressed. i hope.
>
> Personally, I'd put that information in the device tree: which
> extensions are available, with their name (in device tree style, which
> combines the vendor within the name) and how to enable/disable each of
> them (which CSR and which bit within the CSR, if it's not permanently
> enabled).

which doesn't work in the case above (vendor licenses two RTLs that utilise the exact same encoding). wait... yes it might! ok! awesome!

> This has the distinct advantage that the operating system does
> not need to have a table mapping the architecture ID to the extensions
> and their control registers, and therefore an old operating system can
> be used with a new processor without losing access to the custom
> extensions it already knows.

yehyeh, no i totally get it.

ok so let's think it through. topologically it's equivalent to the idea that i suggested:

MISA |= extension1 # sets context "extension 1 encoding enabled"
AAAA # extension 1 meaning of AAAA
ADD r1, r2, r3
MISA &= ~extension1 # disable extension 1
MISA |= extension2 # sets context "extension 2 encoding enabled"
AAAA # extension 2 meaning of AAAA

yeah that would work! so, again, the assembler (see, i'm paying attention...) would generate (at the top of each function... _still_ paying attention i hope) the required enable-extension MISA codes. the *implementation* of that would switch out hardware in the "instruction decode" phase (precedent for that practice has now been set... and implemented... so that's good to know).
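
a toy model of that switching, as a sketch only (python; the class, the extension names and the encoding value for AAAA are all made up, and real hardware would be a mux in front of the decoders, not a dictionary). it assumes that selecting a different extension leaves the other extension's state untouched, which is exactly the property that would need to be specified:

```python
AAAA = 0x0000000B  # hypothetical encoding claimed by both extensions


class ToyCore:
    """Toy model: at most one custom decode table is active at a time."""

    def __init__(self, decoders):
        self.decoders = decoders   # extension name -> {word: meaning}
        self.active = None         # no custom extension selected yet
        # Per-extension state, kept separately so a switch cannot clobber it.
        self.ext_state = {name: {} for name in decoders}

    def select(self, name):
        # Stands in for the CSR write that sets the decode context.
        if name is not None and name not in self.decoders:
            raise ValueError("unknown extension: %r" % (name,))
        self.active = name

    def decode(self, word):
        if self.active is None:
            return "illegal instruction"
        return self.decoders[self.active].get(word, "illegal instruction")


core = ToyCore({
    "extension1": {AAAA: "extension 1 meaning of AAAA"},
    "extension2": {AAAA: "extension 2 meaning of AAAA"},
})
core.select("extension1")
first = core.decode(AAAA)
core.ext_state["extension1"]["mode"] = 3   # state owned by extension 1
core.select("extension2")
second = core.decode(AAAA)
```

in this model the state written under extension 1 is still there after the switch to extension 2; whether a real implementation guarantees that is the open question.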

my only main concern would be: does enabling or disabling of the extension result in any kind of unpredictable resetting or alteration of the CSRs associated with that extension? because if so that's going to be Bad.

also: beyond the actual clock-cycles is there a performance penalty for doing the switching?

also... and i saw the discussions about MISA.C... "if the PC is not aligned the unsetting of MISA.C is silently suppressed", that now becomes a problem that's potentially massively exacerbated given that some extensions may *also* be on unaligned boundaries.

so that section of the spec needs to be generalised and clarified.

>> By the "Desert Island Test" in its strictest definition, neither
>> RISC-V *nor the linux kernel* are software libre, due to the need for
>> the atomic global registration. However I do get your point.
>
> The difference is that it's not legally mandated. If I'm stuck in a
> desert island, I can use any machine ID I want, and won't face any legal
> repercussion for doing so when I get back to civilization.
>
> Also, AFAIK the machine ID is much less important nowadays, since the
> configuration is no longer done using board files (keyed by the machine
> ID), instead it's done through the device tree. The ARM machine ID is a
> relic from the old days.

i appreciate the clarification (and update): i'm sure you understand the gist of what i'm trying to communicate.

l.

Cesar Eduardo Barros

Apr 13, 2018, 8:00:17 AM4/13/18

to lk...@lkcl.net, RISC-V ISA Dev, kr...@berkeley.edu, acah...@gmail.com, glem...@vectorblox.com, richard....@roalogic.com

The same way it supports, in the same executable, functions which
require the AVX2 extension, and functions which cannot use the AVX2
extension.

> So we see here the limits of objdump: it has to know the instruction,
> and it has to know which subarchitecture it's looking at. In RISC-V's
> case, the "subarchitecture" includes which custom instructions it has.
>
>
> ok so there are subtleties of objdump that I wasn't aware of which
> cloud the point I was making. Then the question perhaps should be
> rephrased, "can qemu be programmed to be capable of supporting all and
> any custom extensions" and if not ....

If qemu can be programmed to disable some of the standard extensions
(like the floating point extensions), the same mechanism can be used to
enable or disable custom extensions.

> The backwards/forwards compatibility with custom instructions has been
> addressed, by reserving a few of the major opcodes for custom
> instruction extensions only.
>
>
> it *hasn't* been addressed, in the specific case of where a vendor
> wishes to license two (or more) RTLs from two (or more) different
> vendors, each of which utilises the exact same encoding AAAA to mean
> totally different things.

There's a piece missing here. What defines the encoding is the
instruction decoder, which is a single block; if what you licensed
includes the instruction decoder, it will only know that vendor's custom
instructions.

What's more probable is that you license the backend part of the custom
instructions, and are responsible for wiring them to the decoder yourself.

>
> would you agree that it is hopelessly unrealistic to expect the
> providers of those custom extensions to *rewrite* them, given the cost
> of developing and testing them?
>
> how is this *specific* scenario to be handled?

I believe that it's more realistic that, if it's only the custom
extension (and not the whole core), it receives the instructions already
in a decoded form, possibly even with the values already fetched from
the register file. In that case, it would be simple to "renumber" the
instructions, so there wouldn't be a conflict.

If the extension includes a decoder, it would most probably be a partial
decoder ("when the lower bits of the instruction word are the custom-0
major opcode, pass me the rest of the bits and I'll decode them for
you"). In that case, it would also be simple to change the bits used to
select whether to pass the instruction to the custom instruction decoder.

Personally, I believe that vendors of "custom RISC-V extensions" would
make them relocatable (i.e. able to use several major opcodes, or even
longer instructions), precisely because of the possibility of conflicts.
A bit like old ISA cards, where some bits of the I/O address for the
card were defined by jumpers; if you had two parallel ports, you used a
jumper to move the second one to a different I/O port.
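
A minimal sketch of the "jumper" idea, assuming each licensed backend receives the instruction bits above the major opcode and that the integrator owns the routing table. The four custom major opcode values are taken from the base opcode map; the backend names and the `make_decoder` helper are hypothetical.

```python
# Hypothetical sketch: a core decoder that routes the four "custom" major
# opcodes to extension backends through a configurable table (the "jumpers").
CUSTOM_OPCODES = {
    "custom-0": 0b0001011,
    "custom-1": 0b0101011,
    "custom-2": 0b1011011,
    "custom-3": 0b1111011,
}


def make_decoder(jumpers):
    """jumpers maps a custom slot name to a backend callable."""
    routes = {CUSTOM_OPCODES[slot]: backend for slot, backend in jumpers.items()}

    def decode(insn):
        major = insn & 0x7F          # low 7 bits select the major opcode
        backend = routes.get(major)
        if backend is None:
            return "not-custom"
        return backend(insn >> 7)    # pass the remaining 25 bits to the backend

    return decode


# Two made-up vendor backends that both shipped expecting custom-0.
def vendor_a(rest):
    return ("vendor_a", rest)


def vendor_b(rest):
    return ("vendor_b", rest)


# The integrator "moves the jumper" for vendor B to custom-1.
decode = make_decoder({"custom-0": vendor_a, "custom-1": vendor_b})
```

Neither backend changes here; only the routing table does, which is the property a relocatable extension would need.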

>
>
> What you're concerned with is something completely different: conflicts
> between different, separately authored, custom extensions.
>
>
> yes.
>
> There's no easy answer for that.
>
>
> that's the purpose of the discussion that guy raised: to creatively
> explore such options. various branches of the discussion ruled out
> several misunderstandings along the way (and raised some interesting
> creative possibilities), but yes, agreed: it's not easy.

There's a difference between exploring the options, and emphatically
affirming that RISC-V *will* fail unless the correct option is chosen.

> The encoding space is unavoidably limited (though
> RISC-V has more free encoding space than most), hardware considerations
> like the placement of fields within an instruction are important, and
> every solution or workaround has a cost, which might not be acceptable
> for some applications.
>
>
> plus, due to the very nature of custom extensions, the complete lack of
> coordination and centralisation is touted as a *strength* of the
> freedom... but with no "prefix" - no "scoping" if we are to use c++ or
> other software language - we end up with the kind of chaos that you get
> with OpenSCAD, javascript, and other brain-dead languages that fail to
> provide clear and clean scoping rules.

Yet, JavaScript is a huge success. The chaos that is the JavaScript
ecosystem has no relationship with its variable scoping rules.

The assembler cannot generate the enable/disable instructions, because
the assembler doesn't know the function boundaries. What would generate
these instructions would be the compiler. The compiler could, however,
use a "pseudo-instruction" which the assembler expands into one or more
real instructions.

>
> my only main concern would be: does enabling or disabling of the
> extension result in any kind of unpredictable resetting or alteration of
> the CSRs associated with that extension? because if so that's going to
> be Bad.

That's defined by the extension author and/or implementer.

>
> also: beyond the actual clock-cycles is there a performance penalty for
> doing the switching?
>
> also... and i saw the discussions about MISA.C... "if the PC is not
> aligned the unsetting of MISA.C is silently suppressed", that now
> becomes a problem that's potentially massively exacerbated given that
> some extensions may *also* be on unaligned boundaries.

The compiler can insert directives to instruct the assembler to align
the instructions. But that's only an issue when disabling compressed
instructions (more formally, when disabling all extensions which have
instructions with length that is not an integer multiple of 32 bits).

lk...@lkcl.net

Apr 13, 2018, 9:36:12 AM

to RISC-V ISA Dev, lk...@lkcl.net, kr...@berkeley.edu, acah...@gmail.com, glem...@vectorblox.com, richard....@roalogic.com, ces...@cesarb.eti.br

[hi cesar hope you don't mind i'm cutting some of this and focussing on the bits i feel are important, appreciate that you took the time to correct some of my misapprehensions].

On Friday, April 13, 2018 at 1:00:17 PM UTC+1, Cesar Eduardo Barros wrote:

What's more probable is that you license the backend part of the custom
instructions, and are responsible for wiring them to the decoder yourself.

wiring them to the decoder, yes. changing them to be utterly different encodings? i really *really* hope not, because... well, that in turn would mean that custom ASIC manufacturers now also need to become compiler software engineers as well, modifying gcc to include *yet another* encoding of the exact same back-end functionality. *and* modifying the supplied test vectors (or writing their own) - it gets horribly messy and costly, both short term and long term, very very quickly.

There's a difference between exploring the options, and emphatically
affirming that RISC-V *will* fail unless the correct option is chosen.

will fail (in the exact same way that Albert pointed out that POWERPC failed) unless *an* option is chosen. fortunately - thank god - i believe MISA covers it. with caveats.

> ok so let's think it through. topologically it's equivalent to the idea
> that i suggested:
>
> MISA |= extension1 # sets context "extension 1 encoding enabled"
> AAAA # extension 1 meaning of AAAA
> ADD r1, r2, r3
> MISA &= ~extension1 # disable extension 1
> MISA |= extension 2 # sets context "extension 2 encoding enabled"
> BBBB # extension 2 meaning of AAAA
>
> yeah that would work! so, again, the assembler (see, i'm paying
> attention...) would generate (at the top of each function... _still_
> paying attention i hope) the required enable-extension MISA codes. the
> *implementation* of that would switch out hardware in the "instruction
> decode" phase (precedent for that practice has now been set... and
> implemented... so that's good to know).

The assembler cannot generate the enable/disable instructions, because
the assembler doesn't know the function boundaries. What would generate
these instructions would be the compiler. The compiler could, however,
use a "pseudo-instruction" which the assembler expands into one or more
real instructions.

ok. appreciate the distinction.
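
The enable/disable sequence quoted above can be modelled in a few lines of Python. This is only a sketch under stated assumptions: the real misa register has a single "X" bit for non-standard extensions, so the per-extension bits `EXT1` and `EXT2`, the encoding `AAAA` and the `Hart` class are all hypothetical.

```python
# Hypothetical enable bits: the real misa register has a single "X" bit for
# non-standard extensions, so per-extension bits are an assumption here.
EXT1 = 1 << 0
EXT2 = 1 << 1

AAAA = 0x0000000B  # one made-up encoding claimed by both extensions


class Hart:
    def __init__(self):
        self.misa = 0

    def decode(self, insn):
        if insn != AAAA:
            return "standard"
        enabled = self.misa & (EXT1 | EXT2)
        if enabled == EXT1:
            return "ext1.op"
        if enabled == EXT2:
            return "ext2.op"
        # Neither or both enabled: the encoding is unimplemented or ambiguous.
        return "illegal"


hart = Hart()
hart.misa |= EXT1            # MISA |= extension1
first = hart.decode(AAAA)    # extension 1 meaning of AAAA
hart.misa &= ~EXT1           # MISA &= ~extension1
hart.misa |= EXT2            # MISA |= extension2
second = hart.decode(AAAA)   # extension 2 meaning of the same bits
```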

>
> my only main concern would be: does enabling or disabling of the
> extension result in any kind of unpredictable resetting or alteration of
> the CSRs associated with that extension? because if so that's going to
> be Bad.

That's defined by the extension author and/or implementer.

right. So that's a really, really important caveat / hard-fail. because if the use of MISA to enable/disable extensions cannot be *guaranteed* to leave state alone, then we do not have an "option" (cf above), and, consequently, RISC-V can be declared - *right now* - to be a complete failure, in exactly the same way that POWERPC is a failure due to two conflicting implementations of Vector ISAs occupying the exact same binary instruction encoding.

So again, the rule "nothing must be unclear or ambiguous" has to be applied if the RISC-V Standard is to be a success.

The alternative "fix" - which I trust that you can appreciate would be absolutely disastrous for performance - would be to record an extension's CSRs (somewhere - probably on the stack), *THEN* alter MISA, *then* enable the new MISA, *then* pull back in that extension's CSRs... and repeat the process again whenever needed. If there are interleaved instructions in the same function using the exact same binary encoding the results would be absolutely catastrophic for performance if there were say even 10 32-bit CSRs per extension.
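
To put a rough number on that, here is a back-of-envelope instruction count. It assumes one CSR read plus one store per outgoing CSR, one load plus one CSR write per incoming CSR, and two misa writes per switch (disable one extension, enable the other). These are illustrative assumptions, not measurements, and the function names are made up.

```python
def switch_ops(csrs_per_ext, misa_writes=2):
    """Instruction count for one switch, under the assumptions above."""
    save = 2 * csrs_per_ext      # csrr + store for each outgoing CSR
    restore = 2 * csrs_per_ext   # load + csrw for each incoming CSR
    return save + misa_writes + restore


def loop_overhead(iterations, switches_per_iteration, csrs_per_ext):
    """Total extra instructions when the switching happens inside a loop."""
    return iterations * switches_per_iteration * switch_ops(csrs_per_ext)
```

With 10 CSRs per extension this gives 42 extra instructions per switch, before counting any pipeline flush or trap cost.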

Basically the standard needs to be updated and it must be made absolutely clear that when disabled and re-enabled, CSR state *must* not be altered. Or be declared - categorically from long-term historical observation of other architectures - to be a failure. Those are the main choices available.

Thinking it through: how the heck would anything work if that wasn't the case?? Switching off and on an extension results in destruction of state into an unknown state? moo? How can anyone do formal proofs and analysis of RISC-V??

l.

lk...@lkcl.net

Apr 13, 2018, 9:53:39 AM

to RISC-V ISA Dev, glem...@vectorblox.com, lk...@lkcl.net, kr...@berkeley.edu, acah...@gmail.com, richard....@roalogic.com, jcb6...@gmail.com

On Friday, April 13, 2018 at 4:20:49 AM UTC+1, Jacob Bachmeyer wrote:

> So far, Luke's IANA-style numbering is the best way. I'm pretty sure
> we can elaborate on that to make it very flexible and still keep it
> simple. This will not affect vendors that do not attempt to support
> conflicting extensions, so it has zero cost to them (except perhaps
> returning 0s for a CSR that might control this switching of
> extensions).
>

We already have IANA-style numbering for vendor and arch IDs.

So: jumping forward a bit (and summarising), Cesar kindly pointed out that MISA-driven enabling and disabling of extensions is a... "possible option that has the advantage of being part of the existing standard".

However it has two down-sides:

(1) The RISC-V Specification currently permits extension developers to DESTROY OR ALTER CSR STATE if its associated MISA bit is cleared followed by being reset.

(2) Whilst it's a workable scheme (apart from the hard-fail caveat (1)) it does mean that gcc, objdump *and* qemu would all have to have that --marchid={vendorid}{machid} option passed to them... *and then de-reference a table which lists which extensions are provided*. which is a bit "yuk" but is doable.

Fixing (1) will require that the Standard be updated to make it a hard requirement that state NEVER be altered just because MISA is altered. There are less... palatable options.

Fixing (2) will involve coming up with an alternative scheme that boils down to --marchextensionid={vendorid|extensionid} and has associated instructions that basically require major modification of the privileged instruction set.

Oh. one other important question: what impact on performance, if any, does setting MISA &= EXTENSIONBIT have, when called from a normal (OS application-level) level? If it causes a trap then that would be a massive performance hit. I'm reading Section 3.1.1 of the privileged ISA spec, pages 13-15, and I see nothing that mentions if a trap results. Earlier sections did not make this clear either.

l.

Cesar Eduardo Barros

Apr 13, 2018, 10:50:23 AM

to lk...@lkcl.net, RISC-V ISA Dev, kr...@berkeley.edu, acah...@gmail.com, glem...@vectorblox.com, richard....@roalogic.com

Em 13-04-2018 10:36, lk...@lkcl.net escreveu:
> On Friday, April 13, 2018 at 1:00:17 PM UTC+1, Cesar Eduardo Barros wrote:
>
> What's more probable is that you license the backend part of the custom
> instructions, and are responsible for wiring them to the decoder
> yourself.
>
>
> wiring them to the decoder, yes. changing them to be utterly different
> encodings? i really *really* hope not, because... well, that in turn
> would mean that custom ASIC manufacturers now also need to become
> compiler software engineers as well, modifying gcc to include *yet
> another* encoding of the exact same back-end functionality. *and*
> modifying the supplied test vectors (or writing their own) - it gets
> horribly messy and costly, both short term and long term, very very quickly.

The instruction encoding is on the assembler (binutils), not gcc. It's
much simpler, AFAIK, to modify the instruction encoding on binutils than
to modify gcc (and in any case, even modifying gcc wouldn't be hard -
you're looking for places where it outputs "instruction1" and modifying
them to output "instruction2", with no changes to the code flow).

As for test vectors, if they are supplied as textual assembly files,
modifying the assembler is enough. Even if they are provided as binary
files, it wouldn't be hard to scan them for the corresponding
instructions and modify them, provided the data is separated from the
code (and the instruction length doesn't change).
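
A minimal sketch of such a scan, assuming the input is a little-endian code section made up purely of 32-bit instructions (no compressed instructions, no interleaved data) and that only the major opcode needs to move. The `renumber` helper is hypothetical.

```python
import struct

CUSTOM_0 = 0b0001011
CUSTOM_1 = 0b0101011


def renumber(code, old=CUSTOM_0, new=CUSTOM_1):
    """Rewrite the major opcode of every 32-bit word that uses `old`."""
    if len(code) % 4:
        raise ValueError("expected a whole number of 32-bit instructions")
    out = bytearray()
    for (word,) in struct.iter_unpack("<I", code):
        if word & 0x7F == old:
            # Keep the upper 25 bits, swap only the 7-bit major opcode.
            word = (word & ~0x7F) | new
        out += struct.pack("<I", word)
    return bytes(out)
```

If data words were mixed into the section, any of them whose low 7 bits happened to match would be corrupted, which is why the separation of code and data matters.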

> >
> > my only main concern would be: does enabling or disabling of the
> > extension result in any kind of unpredictable resetting or
> alteration of
> > the CSRs associated with that extension? because if so that's
> going to
> > be Bad.
>
> That's defined by the extension author and/or implementer.
>
>
> right. So that's a really, really important caveat / hard-fail.
> because if the use of MISA to enable/disable extensions cannot be
> *guaranteed* to leave state alone, then we do not have an "option" (cf
> above), and, consequently, RISC-V can be declared - *right now* - to be
> a complete failure, in exactly the same way that POWERPC is a failure
> due to two conflicting implementations of Vector ISAs occupying the
> exact same binary instruction encoding.

Does RISC-V have two conflicting implementations of vector extensions
occupying the exact same binary instruction encoding? You're imagining a
possible future scenario (one which probably won't happen, since there
will be a standard vector extension, which won't use the "reserved for
custom extensions" opcodes), and declaring that RISC-V has failed in the
present.

And as I mentioned elsewhere in the thread, this is only an issue for
"pervasive" instructions, which tend to be used everywhere. For more
contained instruction set extensions, it's very simple (and common) for
software to contain the calls to it to within a few places, and even
switch these at runtime.

>
> So again, the rule "nothing must be unclear or ambiguous" has to be
> applied if the RISC-V Standard is to be a success.
>
> The alternative "fix" - which I trust that you can appreciate would be
> absolutely disastrous for performance - would be to record an
> extension's CSRs (somewhere - probably on the stack), *THEN* alter MISA,
> *then* enable the new MISA, *then* pull back in that extension's CSRs...
> and repeat the process again whenever needed. If there are interleaved
> instructions in the same function using the exact same binary encoding
> the results would be absolutely catastrophic for performance if there
> were say even 10 32-bit CSRs per extension.

This is an extreme example with a long list of preconditions.

- Two extensions with conflicting encodings;
- These extensions are used "interleaved" in the same function;
- They have state, either a register file or a CSR;
- They are not defined to keep their state when disabled;
- The function which is using them needs to preserve the state of one
extension when calling the other.

For an example of the last point, consider how MIPS did multiplication:
it had a pair of registers used exclusively for the multiplier. Now
suppose this were a RISC-V extension, which lost state when disabled;
this is not a problem, since there's no need to keep state after a
routine is done with the multiplication. This is a very common situation.

You are imagining this extreme example, and assuming it will be common
enough to threaten RISC-V's chance of survival.

>
> Basically the standard needs to be updated and it must be made absolutely
> clear that when disabled and re-enabled, CSR state *must* not be altered. Or
> be declared - categorically from long-term historical observation of
> other architectures - to be a failure. Those are the main choices available.

As I mentioned above, when there's state, it's common to not have to
preserve it between routines. I've seen, though I don't recall which ISA
at the moment, a whole register file being declared as "caller saved",
meaning any function call can be assumed by the compiler to overwrite
all these registers; the compiler saves and restores these registers as
necessary, if they are still "live" across the call.

> Thinking it through: how the heck would anything work if that wasn't the
> case?? Switching off and on an extension results in destruction of state
> into an unknown state? moo? How can anyone do formal proofs and analysis
> of RISC-V??

Easy: model the write to the CSR which enables the extension as a
command to also zero the extension state. It's more subtle when the
implementation has the option of either zeroing the state or restoring
the previous state, but I believe it can be done.
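
A toy version of that model, as a sketch only: it assumes the extension is defined to zero its state whenever it is enabled, and the class and method names are invented for illustration.

```python
class ExtensionModel:
    """Toy model: enabling the extension is defined to zero its state."""

    def __init__(self, num_csrs):
        self.enabled = False
        self.csrs = [0] * num_csrs

    def write_enable(self, value):
        if value and not self.enabled:
            # The zeroing is modelled as part of the enabling CSR write itself.
            self.csrs = [0] * len(self.csrs)
        self.enabled = bool(value)

    def write_csr(self, index, value):
        if not self.enabled:
            raise RuntimeError("illegal instruction: extension disabled")
        self.csrs[index] = value
```

The "zero or restore" variant would need the model to accept either outcome after re-enabling, for example by checking that the state is one of the two allowed values rather than a single fixed one.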

Luke Kenneth Casson Leighton

Apr 13, 2018, 12:30:55 PM

to Cesar Eduardo Barros, RISC-V ISA Dev, Krste Asanovic, Albert Cahalan, Guy Lemieux, Richard Herveille

---
crowd-funded eco-conscious hardware: https://www.crowdsupply.com/eoma68

On Fri, Apr 13, 2018 at 3:50 PM, Cesar Eduardo Barros
<ces...@cesarb.eti.br> wrote:
> Em 13-04-2018 10:36, lk...@lkcl.net escreveu:
>>
>> On Friday, April 13, 2018 at 1:00:17 PM UTC+1, Cesar Eduardo Barros wrote:
>>
>> What's more probable is that you license the backend part of the
>> custom
>> instructions, and are responsible for wiring them to the decoder
>> yourself.
>>
>>
>> wiring them to the decoder, yes. changing them to be utterly different
>> encodings? i really *really* hope not, because... well, that in turn would
>> mean that custom ASIC manufacturers now also need to become compiler
>> software engineers as well, modifying gcc to include *yet another* encoding
>> of the exact same back-end functionality. *and* modifying the supplied test
>> vectors (or writing their own) - it gets horribly messy and costly, both
>> short term and long term, very very quickly.
>
>
> The instruction encoding is on the assembler (binutils), not gcc. It's much
> simpler, AFAIK, to modify the instruction encoding on binutils than to
> modify gcc (and in any case, even modifying gcc wouldn't be hard - you're
> looking for places where it outputs "instruction1" and modifying them to
> output "instruction2", with no changes to the code flow).

ok appreciated.

> As for test vectors, if they are supplied as textual assembly files,
> modifying the assembler is enough. Even if they are provided as binary
> files, it wouldn't be hard to scan them for the corresponding instructions
> and modify them, provided the data is separated from the code (and the
> instruction length doesn't change).

No: I am referring to test vectors in the hardware sense. A series,
often numbering in the tens of thousands, of vendor-supplied verilog
(or other) source code, which provides the buyer (and the Foundry)
with the absolute cast-iron guaranteed assurance that people's time
and money are not going to be wasted.

And it's a **** of a lot of money, Cesar. This isn't software.
Foundries don't like having their time wasted, and they have long
memories of who wasted their time.

When a new fabless semiconductor company comes along and is asked,
"is this a silicon proven design", because you had to modify the test
vectors the answer has to be NO. In which case they are extremely
likely to not further respond to enquiries.

> Does RISC-V have two conflicting implementations of vector extensions
> occupying the exact same binary instruction encoding? You're imagining a
> possible future scenario (one which probably won't happen, since there will
> be a standard vector extension, which won't use the "reserved for custom
> extensions" opcodes), and declaring that RISC-V has failed in the present.

Cesar, *please*, please: stop using the argument "it might never
happen" when it comes to writing Standards. Standards die even
before they're released if they have not done proper future-scenario
analysis. It's not like software where you can change it after a
release. You get *one* shot at developing a Standard. It's even
worse than developing a chip, because although it's costly you can do
a replacement chip. But Standards? you fail to take *one* possible
scenario properly into account,
and it's dead.

Not least, this entire thread is raised *by* a company - VectorBlox -
who is *developing and licensing a custom Vector Extension*.

>> The alternative "fix" - which I trust that you can appreciate would be
>> absolutely disastrous for performance - would be to record an extension's
>> CSRs (somewhere - probably on the stack), *THEN* alter MISA, *then* enable
>> the new MISA, *then* pull back in that extension's CSRs... and repeat the
>> process again whenever needed. If there are interleaved instructions in the
>> same function using the exact same binary encoding the results would be
>> absolutely catastrophic for performance if there were say even 10 32-bit
>> CSRs per extension.
>
>
> This is an extreme example with a long list of preconditions.

That's right. It was written to illustrate precisely how
unacceptable the case of letting an extension make the decision
whether to store state (or not). In the development of a Standard,
"either A or B" is a "Standard is Dead" red-flag. In some cases "A
with upwards-negotiable B" is okay, but "A *or* B" *especially* "A
mutually-exclusively B" is a clear and unequivocal sign that the
standard is likely to fail.

> - Two extensions with conflicting encodings;
> - These extensions are used "interleaved" in the same function;
> - They have state, either a register file or a CSR;
> - They are not defined to keep their state when disabled;
> - The function which is using them needs to preserve the state of one
> extension when calling the other.
>
> For an example of the last point, consider how MIPS did multiplication: it
> had a pair of registers used exclusively for the multiplier. Now suppose
> this were a RISC-V extension, which lost state when disabled; this is not a
> problem, since there's no need to keep state after a routine is done with
> the multiplication. This is a very common situation.

That's right. and it's why I specifically did not give that as an
example, because, as you say, it is not a problem. (ok, it is if
there's an interrupt that, again, doesn't properly restore the
extension's state because the ISR *resets* (clears and then sets) MISA
for that multiply function, and doesn't properly save and restore the
associated CSRs - but that's another matter)

I may not have made it clear (I forgot that I had always quoted
examples with *two* custom extensions being utilised in the exact same
function). Think it through, Cesar: what would happen if there were
*two or more* Custom Extensions utilising the *exact* same
binary-encoding, utilised in the *exact* same function? (this has
been the context all along, it's important to remember that, apologies
if I do not make that clear).

> You are imagining this extreme example,

That's just what you have to do when it comes to writing Standards.
This is how it is. The "normal" risk-mitigation strategies associated
with software development are, I suspect, causing you some
consternation and disbelief at quite how extreme the risk-assessment
"threat" level has to be dialed up to, in Standards development.
Honestly it's pretty exhausting, and teaches you extreme levels of
patience.

Please understand that this is just how it is (whilst at the same
time I must add the proviso that you are of course entirely at liberty
to make your own choices and decisions).

> and assuming it will be common
> enough to threaten RISC-V's chance of survival.

That's right. It's not assumption, it's risk-analysis (with the risk
mitigation level dialled up to 11). there is clear and unequivocal
historical precedent, countless examples that show how standards
failed because they weren't thought through properly, having been
developed by people who did not take risk levels seriously. I have
six years of experience of writing a future-proof Standard. It has
been excruciatingly tedious in the extreme, and required significant
study and comprehensive analysis of several hardware standards and
their success (or failure), discerning what made them successful or
not.

I've also worked in companies that did mission-critical software.
The software development rules were utterly different: no recursion,
no local variables, use of malloc and free were outright banned.
Those were just the ones that I remember. And before code was signed
off, there was a FIVE MAN TEAM doing a LINE BY LINE code walk-through.
This was in Vehicle Engine Controllers so one mistake could cause the
engine to seize (or worse be thrown into reverse), resulting in
potentially deadly explosions.

I'm trying to get across to you that the normal risk mitigation rules
associated with normal software development simply do not apply. I
trust that I do not have to emphasise it any more?

So. back to the critical, critical real-world scenario that was
raised. What impact would there be on performance if there were three
custom extensions utilising the exact same binary-encoding, called in
quick succession within a few instructions of each other within the
same function, and MISA masking was utilised to mutually-exclusively
enable / disable each of the three custom extensions in succession?
For emphasis we may wish to augment that question with "within a
performance-critical inner loop"?

Bear in mind that VectorBlox (the initiator of the thread) has
developed a *high-performance* Custom Vector Extension that could be
utilised in conjunction with a Video Processing Custom Extension and
perhaps even a Crypto Custom Extension (or, mindful of what you said
before, better a "high performance Parallel Bit-wise Custom
Extension"). So this is *NOT* a hypothetical scenario. A company's
future and its employees' livelihoods *DEPEND* on the RISC-V Foundation
getting this right.

l.

Luke Kenneth Casson Leighton

Apr 13, 2018, 5:26:34 PM

to Cesar Eduardo Barros, RISC-V ISA Dev, Krste Asanovic, Albert Cahalan, Guy Lemieux, Richard Herveille

Ok so I checked section 3.1.11 on Extension Context Status Register
(mentioned in another thread), and in Table 3.4 can I assume that
"Execute instruction to disable unit" would be "MISA &= ~EXTBIT" and
likewise enable to be |= on MISA?

If so, the specification seems clear: the Extension's engine is SHUT
DOWN. State is to be DESTROYED.

Thus it would appear that we may not utilise MISA after all to resolve
conflicting binary encodings.

l.

Jacob Bachmeyer

Apr 13, 2018, 7:20:16 PM

to Luke Kenneth Casson Leighton, Cesar Eduardo Barros, RISC-V ISA Dev, Krste Asanovic, Albert Cahalan, Guy Lemieux, Richard Herveille

Luke Kenneth Casson Leighton wrote:

I still suggest a simple solution: no one implementation may have
conflicting encodings. The {vendor-id, arch-id} tuple is used in a
".target riscv vendor 0xXXXXXXXX arch 0xYYYYYYYY" directive to look up
which extension mnemonics map to what encodings for that particular
vendor/arch pair. "Portable" extensions must be renumberable, which
means that any programs that are part of their test vectors must be in
assembler source form.
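
As a sketch of how an assembler might act on such a directive, assuming a database keyed by the {vendor-id, arch-id} tuple: all IDs, mnemonics and funct7 values below are made up, and `select_target` and `encode_r` are hypothetical helpers rather than anything in binutils.

```python
# Made-up database: (vendor_id, arch_id) -> {mnemonic: (major opcode, funct7)}
DATABASE = {
    (0x00000AAA, 0x00000001): {"vfoo": (0b0001011, 0x00)},
    (0x00000BBB, 0x00000007): {"vfoo": (0b0101011, 0x00),
                               "bitrev": (0b0001011, 0x01)},
}


def select_target(vendor_id, arch_id):
    """Look up the mnemonic table, as a '.target' directive might."""
    try:
        return DATABASE[(vendor_id, arch_id)]
    except KeyError:
        raise LookupError("no custom-opcode table for this vendor/arch pair")


def encode_r(table, mnemonic, rd, rs1, rs2, funct3=0):
    """Encode an R-type instruction using the selected table."""
    opcode, funct7 = table[mnemonic]
    return ((funct7 << 25) | (rs2 << 20) | (rs1 << 15)
            | (funct3 << 12) | (rd << 7) | opcode)
```

The same mnemonic, `vfoo`, assembles to a different major opcode on each target, while the operand fields stay identical. That is the sense in which a "portable" extension is renumbered.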

Policies about what should go in the "standard" extensible assembler
database are a separate issue. I argue that the database distributed
with the official RISC-V binutils should include only
freely-implementable instructions. (I can explain the ethical reasoning
for that if asked.) This does not prohibit proprietary extensions,
since no one is under any requirement to publish an extension at all.
Nor do I advocate that implementation details such as RTL be required to
be published for inclusion in the "standard" custom opcode database --
only (1) the assembler mnemonics, (2) the encoding used for those
mnemonics on a particular implementation, and (3) what those
instructions *do*. A vendor who does not wish to publish that
information, or wishes to assert some exclusive claim to implementing
their non-standard extension may do so, but will need to distribute
their own database overlay, as the official RISC-V binutils will not
include support for those instructions. [To lkcl: While GNU binutils
is not under the RISC-V Foundation's control, I strongly doubt that the
GNU project would have any objection to the described policy. It aligns
well with the GNU rhetoric about software freedom. Further, the
"standard" extensible assembler database could also be distributed
separately from the RISC-V binutils by the Foundation, somewhat akin to
the tz database.]

Similarly, I advocate that the RISC-V Foundation should make absolutely
clear that no one is permitted to lay any sort of exclusive claim to
either assembler mnemonics or binary opcode space. I believe that this
can be effectively enforced by making waiver of any such claims a
condition to use the RISC-V trademarks. This should work because anyone
making such a claim would no longer be permitted to call their product
"RISC-V" and therefore would receive an entire opcode space and
assembler mnemonic space all to themselves, disjoint from the RISC-V
mnemonic set and opcode space. :-)

-- Jacob

Guy Lemieux

Apr 13, 2018, 7:41:51 PM

to Jacob Bachmeyer, Luke Kenneth Casson Leighton, Cesar Eduardo Barros, RISC-V ISA Dev, Krste Asanovic, Albert Cahalan, Richard Herveille

Jacob, your proposal isn't a solution -- it's prohibition.

Perhaps we can try harder.

Guy

Luke Kenneth Casson Leighton

Apr 13, 2018, 7:56:18 PM

to Guy Lemieux, Jacob Bachmeyer, Cesar Eduardo Barros, RISC-V ISA Dev, Krste Asanovic, Albert Cahalan, Richard Herveille

---
crowd-funded eco-conscious hardware: https://www.crowdsupply.com/eoma68

On Sat, Apr 14, 2018 at 12:41 AM, Guy Lemieux <glem...@vectorblox.com> wrote:
> Jacob, your proposal isn't a solution -- it's prohibition.

Jacob wrote:
"Similarly, I advocate that the RISC-V Foundation should make
absolutely clear that no one is permitted to lay any sort of exclusive
claim to either assembler mnemonics or binary opcode space."

In effect what that means Jacob is that you're advocating shutting
down - entirely - the custom extensions space. *All* opcodes would
fall under the jurisdiction of the RISC-V Foundation.

The unfortunate thing is that in providing people "freedom" in the
custom opcode space, the RISC-V Foundation has forgotten that it needs
to remain the enforcer of ensuring that *conflicts* have an
unambiguous direct or indirect resolution. We have also established
that saying "it's officially unsupported by the official RISC-V
toolchains" is not an acceptable option as the RISC-V Foundation has
absolutely no control over the toolchains (upstream, distros, or
forks, published or unpublished).

Alternative schemes might include that all and any custom
instructions be explicitly prefixed - every single one of them
absolutely every time that they are used - with a "custom encoding
coming up, unique ID NNNN" instruction. I only mention that as a
possibility, not to be taken seriously, but at least it fits the
requirements.

l.

Michael Clark

Apr 13, 2018, 9:15:29 PM

to Luke Kenneth Casson Leighton, Jacob Bachmeyer, Cesar Eduardo Barros, RISC-V ISA Dev, Krste Asanovic, Albert Cahalan, Guy Lemieux, Richard Herveille

You seem to be making lots of radical assumptions, first that a CPU/SOC architect would include conflicting extensions, and secondly if they did, that they would not consider the consequences.

I don’t believe disabling misa.F or misa.D requires _destroying_ the floating point register file state. At minimum it may be implementation defined. Of course, if another extension uses the floating point register file then the results would be implementation defined.

There are also the mstatus bits (SD/FS/XS) that are designed to indicate whether extension state needs to be saved. The semantics are defined and future extensions that exceed the capability of these mechanisms will define their own state saving mechanisms, for example a mechanism will be defined to save and restore the Vector extension state. The state format potentially needs to be standardized as migrating processors between different machines supporting the V extensions is likely given the support for live migration (something that needs to be added to the RISC-V QEMU port).

I don’t share the same general feeling that the RISC-V architecture is in reckless hands and that these sorts of problems will not be appropriately addressed with considered solutions. Many of them have already been addressed and others have already been considered, but the best solutions need to be written down, modelled, implemented and proposed for standardization.

Jacob, while the RISC-V CSR space is very small, there is a loophole that would allow access to a much larger state space (read-only) due to the fact that Rd apparently carries a (false) dependency from Rs in the CSR* instructions, if my understanding of the draft memory model spec is correct. If one decouples the normal CSR semantics for a special ID or processor configuration CSR, this dependency could be used to expose processor state with a huge amount of (read-only) addressable space for version information on vendor-specific extensions.

Noting that device-tree is a device configuration standard, I would propose that we come up with an ID system that can return 2^XLEN pages of XLEN bits of read-only ID information.

CSRRW Rd, mhartcfg, Rs

where:

Rd <- mhartcfg[Rs]

The instruction would have different semantics to regular CSR instructions but would fit inside the current encoding for 256 CSRs (the upper 4 bits of the current 12-bit CSR space are reserved for permissions and mode aliasing, making the effective CSR space able to expose 256 XLEN words). This proposed mhartcfg would be able to expose 2^XLEN XLEN words of processor identification information, allowing for standard extension version information (e.g. 26 leaves) and a huge space for vendor-specific information.
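
As a rough illustration of the arithmetic above (a sketch only, not part of any specification): the privileged spec uses csr[11:10] for read/write accessibility and csr[9:8] for the lowest privilege level, which leaves 8 bits, i.e. 256 slots, per group. The names `MHARTCFG_LEAVES` and `read_mhartcfg` are hypothetical, the leaf contents are invented, and reading an unpopulated leaf as zero is an assumption rather than something proposed here.

```python
def split_csr_address(addr):
    """Split a 12-bit CSR address into (access bits, privilege bits, low 8 bits)."""
    if not 0 <= addr < (1 << 12):
        raise ValueError("CSR addresses are 12 bits wide")
    access = (addr >> 10) & 0x3     # 0b11 means read-only
    privilege = (addr >> 8) & 0x3   # 0b00 user ... 0b11 machine
    index = addr & 0xFF             # 256 slots per access/privilege group
    return access, privilege, index


# Hypothetical, sparsely populated ID space: leaf index -> XLEN-bit word.
MHARTCFG_LEAVES = {0: 0x0203, 5: 0x0202}  # made-up example values


def read_mhartcfg(rs_value, xlen=64):
    """Model of the proposed 'Rd <- mhartcfg[Rs]' indexed read.

    Unpopulated leaves read as 0 (an assumption for this sketch).
    """
    mask = (1 << xlen) - 1
    return MHARTCFG_LEAVES.get(rs_value & mask, 0) & mask
```

For example, `split_csr_address(0x301)` (misa) yields read/write, machine level, slot 1, while a single indexed CSR such as the proposed mhartcfg would occupy one of the 256 slots in its group.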

The reason I like this idea over device-tree or MMIO regions is that code can access processor ID information in any address translation mode and not depend on physical memory layout, and it can address information about the CPU features, not the attached devices in an SOC (for which device-tree, ACPI or other mechanisms are more appropriate). This ID mechanism could even be exposed to U mode to provide an equivalent to the Arm ID and x86 CPUID instructions used to discover read-only information regarding processor features at runtime (not devices). It could easily be added with trap-and-emulate and, assuming it is not used in any fast path, that would allow this mechanism to be added to feature detection libraries like Google’s cpu_features [1]. We have to consider simple user-mode CPU feature detection that exists on other CPU architectures and we also have to consider pre-boot environments or embedded environments that may not have device tree. I think an extended processor ID CSR that takes 1/256 of the CSR space might just be the right way to do this.

With respect to conflicting extensions, there are already methods for altering a (currently small) standard extension space, along with methods to switch ISA (MXL/SXL/UXL). These problems can be addressed with careful and considered evolution of the specifications as the many contributors to the ISA specifications already have many solutions in mind.

There is also “compliance” which will address whether an implementation can call itself RISC-V. There are many layers at which conflicts can be accommodated, whether or not they are “compliant” with the evolving specification base.

It is worthwhile that these issues are discussed but they should not be sensationalised into some DISASTER SCENARIO (apologies for shouting, I don’t normally do that, I guess I am mirroring).

Michael.

[1] https://github.com/google/cpu_features

Jacob Bachmeyer

Apr 13, 2018, 9:29:40 PM

to Luke Kenneth Casson Leighton, Guy Lemieux, Cesar Eduardo Barros, RISC-V ISA Dev, Krste Asanovic, Albert Cahalan, Richard Herveille

Luke Kenneth Casson Leighton wrote:

> ---
> crowd-funded eco-conscious hardware: https://www.crowdsupply.com/eoma68
>
>
> On Sat, Apr 14, 2018 at 12:41 AM, Guy Lemieux <glem...@vectorblox.com> wrote:
>
>> Jacob, your proposal isn't a solution -- it's prohibition.
>>
>
> Jacob wrote:
> "Similarly, I advocate that the RISC-V Foundation should make
> absolutely clear that no one is permitted to lay any sort of exclusive
> claim to either assembler mnemonics or binary opcode space."
>
> In effect what that means Jacob is that you're advocating shutting
> down - entirely - the custom extensions space.

No, I am saying that we should prevent someone from saying "I own the
FOOBLARTCH mnemonic and no one else can make an opcode with that
mnemonic" in the custom extension space.

The extensible assembler proposal effectively namespaces the mnemonic
and encoding spaces under {vendor-id, arch-id} tuples.
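
To make the namespacing idea concrete, here is a minimal sketch. The table layout, the vendor and architecture IDs, and the encodings are all invented for illustration and are not taken from the extensible assembler proposal itself.

```python
# (vendor_id, arch_id) -> {mnemonic: encoding}; every value here is invented.
EXTENSION_TABLES = {
    (0x0A, 0x01): {"fooblartch": 0x0000000B},
    (0x0B, 0x07): {"fooblartch": 0x0000200B},
}


def lookup(vendor_id, arch_id, mnemonic):
    """Resolve a mnemonic inside one {vendor-id, arch-id} namespace only."""
    table = EXTENSION_TABLES.get((vendor_id, arch_id))
    if table is None or mnemonic not in table:
        raise KeyError(
            f"{mnemonic!r} is not defined for ({vendor_id:#x}, {arch_id:#x})"
        )
    return table[mnemonic]
```

Two vendors can both define FOOBLARTCH without either owning the name: the same mnemonic resolves to a different encoding depending on which tuple is selected.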

-- Jacob

Jacob Bachmeyer

Apr 13, 2018, 9:47:47 PM

to Michael Clark, Luke Kenneth Casson Leighton, Cesar Eduardo Barros, RISC-V ISA Dev, Krste Asanovic, Albert Cahalan, Guy Lemieux, Richard Herveille

Michael Clark wrote:
>> On 14/04/2018, at 9:26 AM, Luke Kenneth Casson Leighton <lk...@lkcl.net> wrote:
>>
>> Ok so I checked section 3.1.11 on Extension Context Status Register
>> (mentioned in another thread), and in Table 3.4 can I assume that
>> "Execute instruction to disable unit" would be "MISA &= ~EXTBIT" and
>> likewise enable to be |= on MISA?
>>
>> If so, the specification seems clear: the Extension's engine is SHUT
>> DOWN. State is to be DESTROYED.
>>
>> Thus it would appear that we may not utilise MISA after all to resolve
>> conflicting binary encodings.
>>
>
> You seem to be making lots of radical assumptions, first that a CPU/SOC architect would include conflicting extensions, and secondly if they did, that they would not consider the consequences.
>

This is the main reason that I advocate (1) requiring "portable"
non-standard extensions to be renumberable, and (2) requiring that any
given implementation have unambiguous instruction encodings.

> I don’t believe disabling misa.F or misa.D requires _destroying_ the floating point register file state. At minimum it may be implementation defined. Of course, if another extension uses the floating point register file then the results would be implementation defined.
>
> There are also the mstatus bits (SD/FS/XS) that are designed to indicate whether extension state needs to be saved. The semantics are defined and future extensions that exceed the capability of these mechanisms will define their own state saving mechanisms, for example a mechanism will be defined to save and restore the Vector extension state. The state format potentially needs to be standardized as migrating processors between different machines supporting the V extensions is likely given the support for live migration (something that needs to be added to the RISC-V QEMU port).
>
> I don’t share the same general feeling that the RISC-V architecture is in reckless hands and that these sorts of problems will not be appropriately addressed with considered solutions. Many of them have already been addressed and others have already been considered, but the best solutions need to be written down, modelled, implemented and proposed for standardization.
>
> Jacob, while the RISC-V CSR space is very small, there is a loophole that would allow access to a much larger state space (read-only) due to the fact that Rd apparently carries a (false) dependency from Rs in the CSR* instructions, if my understanding of the draft memory model spec is correct. If one decouples the normal CSR semantics for a special ID or processor configuration CSR, this dependency could be used to expose processor state with a huge amount of (read-only) addressable space for version information on vendor-specific extensions.
>

Technically, CSRRW is defined to return the previous value of the
selected CSR. (RISC-V ISA spec section 2.7 "Control and Status Register
Instructions"; "The CSRRW instruction...")

> Noting that device-tree is a device configuration standard, I would propose that we come up with an ID system that can return 2^XLEN pages of XLEN bits of read-only ID information.
>
> CSRRW Rd, mhartcfg, Rs
>
> where:
>
> Rd <- mhartcfg[Rs]
>
> The instruction would have different semantics to regular CSR instructions but would fit inside the current encoding for 256 CSRs (the upper 4 bits of the current 12-bit CSR space are reserved for permissions and mode aliasing, making the effective CSR space able to expose 256 XLEN words). This proposed mhartcfg would be able to expose 2^XLEN XLEN words of processor identification information, allowing for standard extension version information (e.g. 26 leaves) and a huge space for vendor-specific information.
>
> The reason I like this idea over device-tree or MMIO regions is that code can access processor ID information in any address translation mode and not depend on physical memory layout, and it can address information about the CPU features, not the attached devices in an SOC (for which device-tree, ACPI or other mechanisms are more appropriate). This ID mechanism could even be exposed to U mode to provide an equivalent to the Arm ID and x86 CPUID instructions used to discover read-only information regarding processor features at runtime (not devices). It could easily be added with trap-and-emulate and, assuming it is not used in any fast path, that would allow this mechanism to be added to feature detection libraries like Google’s cpu_features [1]. We have to consider simple user-mode CPU feature detection that exists on other CPU architectures and we also have to consider pre-boot environments or embedded environments that may not have device tree. I think an extended processor ID CSR that takes 1/256 of the CSR space might just be the right way to do this.
>

This would be another good way to read the processor ID ROM that I have
been advocating.

Or how about a READID instruction using a new funct12 code in
SYSTEM/PRIV? When executed, READID would perform "Rd <- ID_ROM[Rs1]"
where ID_ROM is an XLENx??? ROM storing a processor configuration. Very
small embedded systems can either omit READID or always return zero.
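
A minimal behavioural sketch of those READID semantics follows. It models only "Rd <- ID_ROM[Rs1]"; no funct12 value is assumed, the ROM contents are invented, and reading zero beyond the end of the ROM is an assumption, since the ROM depth is still open.

```python
def make_readid(id_rom, xlen=32):
    """Return a function modelling READID: Rd <- ID_ROM[Rs1].

    Out-of-range indices read as zero here; that is an assumption for this
    sketch, because the ROM size ("XLENx???") has not been decided.
    """
    mask = (1 << xlen) - 1
    words = [w & mask for w in id_rom]

    def readid(rs1_value):
        index = rs1_value & mask
        return words[index] if index < len(words) else 0

    return readid


# A very small embedded core that "always returns zero" is just an empty ROM.
readid_tiny = make_readid([])
readid_big = make_readid([0x1234, 0xABCD])  # made-up ROM contents
```

Software probing the ROM would see the same interface on both cores; the tiny one simply reports nothing.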

> With respect to conflicting extensions, there are already methods for altering a (currently small) standard extension space, along with methods to switch ISA (MXL/SXL/UXL). These problems can be addressed with careful and considered evolution of the specifications as the many contributors to the ISA specifications already have many solutions in mind.
>
> There is also “compliance” which will address whether an implementation can call itself RISC-V. There are many layers at which conflicts can be accommodated, whether or not they are “compliant” with the evolving specification base.
>
> It is worthwhile that these issues are discussed but they should not be sensationalised into some DISASTER SCENARIO (apologies for shouting, I don’t normally do that, I guess I am mirroring).
>

Agreed that "DISASTER SCENARIO" is overblown at this point. My request
to prohibit exclusive claims to mnemonics or binary opcodes is an effort
to ensure that future TSG-alikes are "headed off at the pass" before
they can cause trouble for RISC-V.

-- Jacob

Michael Clark

Apr 13, 2018, 9:57:04 PM

to jcb6...@gmail.com, Luke Kenneth Casson Leighton, Cesar Eduardo Barros, RISC-V ISA Dev, Krste Asanovic, Albert Cahalan, Guy Lemieux, Richard Herveille

> On 14/04/2018, at 1:47 PM, Jacob Bachmeyer <jcb6...@gmail.com> wrote:
>
> Michael Clark wrote:
>>> On 14/04/2018, at 9:26 AM, Luke Kenneth Casson Leighton <lk...@lkcl.net> wrote:
>>>
>>> Ok so I checked section 3.1.11 on Extension Context Status Register
>>> (mentioned in another thread), and in Table 3.4 can I assume that
>>> "Execute instruction to disable unit" would be "MISA &= ~EXTBIT" and
>>> likewise enable to be |= on MISA?
>>>
>>> If so, the specification seems clear: the Extension's engine is SHUT
>>> DOWN. State is to be DESTROYED.
>>>
>>> Thus it would appear that we may not utilise MISA after all to resolve
>>> conflicting binary encodings.
>>>
>>
>> You seem to be making lots of radical assumptions, first that a CPU/SOC architect would include conflicting extensions, and secondly if they did, that they would not consider the consequences.
>>
>
> This is the main reason that I advocate (1) requiring "portable" non-standard extensions to be renumberable, and (2) requiring that any given implementation have unambiguous instruction encodings.
>
>> I don’t believe disabling misa.F or misa.D requires _destroying_ the floating point register file state. At minimum it may be implementation defined. Of course, if another extension uses the floating point register file then the results would be implementation defined.
>>
>> There are also the mstatus bits (SD/FS/XS) that are designed to indicate whether extension state needs to be saved. The semantics are defined and future extensions that exceed the capability of these mechanisms will define their own state saving mechanisms, for example a mechanism will be defined to save and restore the Vector extension state. The state format potentially needs to be standardized as migrating processors between different machines supporting the V extensions is likely given the support for live migration (something that needs to be added to the RISC-V QEMU port).
>>
>> I don’t share the same general feeling that the RISC-V architecture is in reckless hands and that these sorts of problems will not be appropriately addressed with considered solutions. Many of them have already been addressed and others have already been considered, but the best solutions need to be written down, modelled, implemented and proposed for standardization.
>>
>> Jacob, while the RISC-V CSR space is very small, there is a loophole that would allow access to a much larger state space (read-only) due to the fact that Rd apparently carries a (false) dependency from Rs in the CSR* instructions, if my understanding of the draft memory model spec is correct. If one decouples the normal CSR semantics for a special ID or processor configuration CSR, this dependency could be used to expose processor state with a huge amount of (read-only) addressable space for version information on vendor-specific extensions.
>>
>
> Technically, CSRRW is defined to return the previous value of the selected CSR. (RISC-V ISA spec section 2.7 “Control and Status Register Instructions"; "The CSRRW instruction...")

I understand that; nevertheless, I believe that for simplicity in OoO implementations, CSR* instructions carry a (false) dependency from Rs to Rd, if I am not mistaken. We could use 0xff in the CSR address space and mark that as reserved, given it doesn’t follow the exact semantics defined for the other CSR* instructions.

>> Noting that device-tree is a device configuration standard, I would propose that we come up with an ID system that can return 2^XLEN pages of XLEN bits of read-only ID information.
>>
>> CSRRW Rd, mhartcfg, Rs
>>
>> where:
>>
>> Rd <- mhartcfg[Rs]
>>
>> The instruction would have different semantics to regular CSR instructions but would fit inside the current encoding for 256 CSRs (the upper 4 bits of the current 12-bit CSR space are reserved for permissions and mode aliasing, making the effective CSR space able to expose 256 XLEN words). This proposed mhartcfg would be able to expose 2^XLEN XLEN words of processor identification information, allowing for standard extension version information (e.g. 26 leaves) and a huge space for vendor-specific information.
>>
>> The reason I like this idea over device-tree or MMIO regions is that code can access processor ID information in any address translation mode and not depend on physical memory layout, and it can address information about the CPU features, not the attached devices in an SOC (for which device-tree, ACPI or other mechanisms are more appropriate). This ID mechanism could even be exposed to U mode to provide an equivalent to the Arm ID and x86 CPUID instructions used to discover read-only information regarding processor features at runtime (not devices). It could easily be added with trap-and-emulate and, assuming it is not used in any fast path, that would allow this mechanism to be added to feature detection libraries like Google’s cpu_features [1]. We have to consider simple user-mode CPU feature detection that exists on other CPU architectures and we also have to consider pre-boot environments or embedded environments that may not have device tree. I think an extended processor ID CSR that takes 1/256 of the CSR space might just be the right way to do this.
>>
>
> This would be another good way to read the processor ID ROM that I have been advocating.

I concur with this.

> Or how about a READID instruction using a new funct12 code in SYSTEM/PRIV? When executed, READID would perform "Rd <- ID_ROM[Rs1]" where ID_ROM is an XLENx??? ROM storing a processor configuration. Very small embedded systems can either omit READID or always return zero.

I agree. It could be encoded with CSR id 0xff and have enables for S/U mode like the counters do, but be accessed via a pseudo-instruction.

I like READID

Michael Clark

Apr 13, 2018, 10:02:49 PM

to jcb6...@gmail.com, Luke Kenneth Casson Leighton, Cesar Eduardo Barros, RISC-V ISA Dev, Krste Asanovic, Albert Cahalan, Guy Lemieux, Richard Herveille

Noting that it’s possible to make an implementation that is not vulnerable to defeating ASLR and reveals nothing about the physical or virtual address space of the CPU. It’s an ID space for extended version information on extensions. It makes sense to start out with 26 leaves that have version numbers like 2.3, 1.11 and 0.4, etc. I’m certain it will be useful.

Jacob Bachmeyer

Apr 13, 2018, 10:42:36 PM

to Michael Clark, Luke Kenneth Casson Leighton, Cesar Eduardo Barros, RISC-V ISA Dev, Krste Asanovic, Albert Cahalan, Guy Lemieux, Richard Herveille

Michael Clark wrote: [pair of replies merged]

> On 14/04/2018, at 1:47 PM, Jacob Bachmeyer <jcb6...@gmail.com> wrote:
>
>> Michael Clark wrote:
>>

>> [...]

>>> Jacob, while the RISC-V CSR space is very small, there is a loophole that would allow access to a much larger state space (read-only) due to the fact that Rd apparently carries a (false) dependency from Rs in the CSR* instructions, if my understanding of the draft memory model spec is correct. If one decouples the normal CSR semantics for a special ID or processor configuration CSR, this dependency could be used to expose processor state with a huge amount of (read-only) addressable space for version information on vendor-specific extensions.
>>>
>> Technically, CSRRW is defined to return the previous value of the selected CSR. (RISC-V ISA spec section 2.7 “Control and Status Register Instructions"; "The CSRRW instruction...")
>>
> I understand that; nevertheless, I believe that for simplicity in OoO implementations, CSR* instructions carry a (false) dependency from Rs to Rd, if I am not mistaken. We could use 0xff in the CSR address space and mark that as reserved, given it doesn’t follow the exact semantics defined for the other CSR* instructions.
>

Or, since this is only for reading a processor ID ROM, we could avoid
wasting a CSR number and make it a separate instruction in SYSTEM/PRIV.

> [...]

>> Or how about a READID instruction using a new funct12 code in SYSTEM/PRIV? When executed, READID would perform "Rd <- ID_ROM[Rs1]" where ID_ROM is an XLENx??? ROM storing a processor configuration. Very small embedded systems can either omit READID or always return zero.
>>
> I agree. It could be encoded with CSR id 0xff and have enables for S/U mode like the counters do, but be accessed via a pseudo-instruction.
>

Why encode with a CSR number at all? Why not make READID a new opcode
in SYSTEM/PRIV, instead of wasting at least four slots in the CSR
instruction space? (Only CSRRW and CSRRWI could be useful with this
magic "CSR", so the associated CSRRS, CSRRC, CSRRSI, CSRRCI opcodes are
wasted.)
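
For reference, the slot count can be seen from the CSR instruction encoding: one CSR number is reachable through six funct3 values under the SYSTEM major opcode. The sketch below encodes them; the helper name `encode_csr` is mine, and 0xFF is only the CSR number floated earlier in this thread, not an allocated register.

```python
SYSTEM_OPCODE = 0b1110011

# funct3 values of the six CSR instructions.
CSR_FUNCT3 = {
    "csrrw": 0b001, "csrrs": 0b010, "csrrc": 0b011,
    "csrrwi": 0b101, "csrrsi": 0b110, "csrrci": 0b111,
}


def encode_csr(mnemonic, rd, csr, rs1_or_uimm):
    """Encode csr[31:20] | rs1/uimm[19:15] | funct3[14:12] | rd[11:7] | opcode[6:0]."""
    if not (0 <= rd < 32 and 0 <= rs1_or_uimm < 32 and 0 <= csr < 4096):
        raise ValueError("field out of range")
    return (
        (csr << 20)
        | (rs1_or_uimm << 15)
        | (CSR_FUNCT3[mnemonic] << 12)
        | (rd << 7)
        | SYSTEM_OPCODE
    )


# Six distinct encodings exist for the hypothetical CSR number 0xFF.
slots = {m: encode_csr(m, 5, 0xFF, 6) for m in CSR_FUNCT3}
```

If only the CSRRW and CSRRWI forms carry the indexed-read meaning, the other four entries in `slots` are the encodings that would go unused.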

> I like READID
>
>
> Noting that it’s possible to make an implementation that is not vulnerable to defeating ASLR and reveals nothing about the physical or virtual address space of the CPU. It’s an ID space for extended version information on extensions. It makes sense to start out with 26 leaves that have version numbers like 2.3, 1.11 and 0.4, etc. I’m certain it will be useful.
>

Exactly, READID accesses a processor ID ROM that is not part of the main
address space at all. Putting it in SYSTEM/PRIV also allows it to be
easily handled with trap-and-emulate for environments that need to do so.

The contents of the ID ROM can be determined later. I favor a
DeviceTree subset or something structurally equivalent but possibly
easier to scan.

-- Jacob

Michael Clark

Apr 14, 2018, 12:58:35 AM

to jcb6...@gmail.com, Luke Kenneth Casson Leighton, Cesar Eduardo Barros, RISC-V ISA Dev, Krste Asanovic, Albert Cahalan, Guy Lemieux, Richard Herveille

On 14/04/2018, at 2:42 PM, Jacob Bachmeyer <jcb6...@gmail.com> wrote:

>> I like READID
>>
>> Noting that it’s possible to make an implementation that is not vulnerable to defeating ASLR and reveals nothing about the physical or virtual address space of the CPU. It’s an ID space for extended version information on extensions. It makes sense to start out with 26 leaves that have version numbers like 2.3, 1.11 and 0.4, etc. I’m certain it will be useful.
>>
>
> Exactly, READID accesses a processor ID ROM that is not part of the main address space at all. Putting it in SYSTEM/PRIV also allows it to be easily handled with trap-and-emulate for environments that need to do so.
>
> The contents of the ID ROM can be determined later. I favor a DeviceTree subset or something structurally equivalent but possibly easier to scan.

For the use cases I am thinking about, device-tree is not appropriate. Ideally the mechanism is usable by user space code and M mode code on cores without device-tree or on server cores that use ACPI.

Perhaps encoding version numbers is not the right thing to do; the intent is only to tag extended information that can’t be represented with just V (the current 26 bits). One example may be a Vector Bit Manipulation extension that is not part of the base Vector Extension, or a future enhancement to the Vector extension that adds new instructions, or perhaps loose instructions, e.g. subsets of M: Mmul, Mdiv. ID pages, and bits within pages, would be defined to indicate the presence of a feature.

I have more of a practical and realistic view of how to handle this, matching what others have handled with ID instructions that can be masked into userspace for runtime CPU feature detection. Device tree is simply not appropriate, as the way to access it is not standardised across OSes. It’s the wrong abstraction level. Device tree is designed around memory-mapped devices attached to the core. Perhaps the limit of what one might expose in device-tree is cache topology, but that is arguable. Pre-device-tree environments, or cores without device tree at all, could represent this information in a much more compact form without resorting to heavy-duty string matching. Having used cache and core topology enumeration on other cores, I’d rather not have to resort to using device tree for anything other than what it is designed for: passing device information to the OS for devices that are not otherwise dynamically discoverable, i.e. one only exposes a PCI host in device-tree, not the devices behind it.

Selectively invoking optimised assembly routines based on cached values from a processor ID instruction is just the way this is handled these days, so that C and asm for RISC-V that uses extensions can be portable between environments. This can already be done with Arm and Intel. Feature indication is not particularly proprietary. RISC-V will obviously have its own unique set of features and way of encoding them.

That said, “imafdcsu” is all we need at the moment, but userspace will want a simpler mechanism than device tree to dynamically detect B and V. Device-tree would be the wrong way to do this. RDISA would be more appropriate and trap and emulate would match the performance requirement given feature detection is at library load time and not in the subroutine fastpath.
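
A small sketch of that detect-once-then-dispatch pattern, using the misa convention that bit 0 is 'A' and bit 25 is 'Z'. The `read_isa` callback stands in for a hypothetical RDISA (or a trap-and-emulated read); it and the helper names are assumptions for illustration, not an existing interface.

```python
def misa_from_letters(letters):
    """Build the 26-bit extension field from letters (bit 0 = 'a' ... bit 25 = 'z')."""
    value = 0
    for ch in letters.lower():
        value |= 1 << (ord(ch) - ord("a"))
    return value


def misa_extensions(misa_value):
    """Return the set of lowercase letters whose bits are set in misa[25:0]."""
    return {chr(ord("a") + bit) for bit in range(26) if (misa_value >> bit) & 1}


_cached = None


def has_extension(letter, read_isa=lambda: misa_from_letters("imafdcsu")):
    """Read the ISA word once (possibly slowly), then answer from the cache."""
    global _cached
    if _cached is None:
        _cached = misa_extensions(read_isa())
    return letter.lower() in _cached


def pick_routine():
    # Chosen at library load time, so the slow read is off the fast path.
    return "vector_kernel" if has_extension("v") else "scalar_kernel"
```

With the default stand-in, `misa_from_letters("imafdcsu")` is 0x14112D and `pick_routine()` falls back to the scalar path because V is absent.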

Luke Kenneth Casson Leighton

Apr 14, 2018, 2:20:53 PM

to Michael Clark, Jacob Bachmeyer, Cesar Eduardo Barros, RISC-V ISA Dev, Krste Asanovic, Albert Cahalan, Guy Lemieux, Richard Herveille

On Sat, Apr 14, 2018 at 2:15 AM, Michael Clark <michae...@mac.com> wrote:

> You seem to be making lots of radical assumptions,
> first that a CPU/SOC architect would include conflicting
> extensions

Guy raised the scenario as the company he works for licenses a custom
extension: they're clearly deeply concerned that their work would come
into direct conflict on any given binary encoding with other custom
extensions. So it's not a "radical" assumption at all.

Also it's not an *assumption* it's a critical-level risk-assessment.
So secondly: this is just what you have to go through when developing
standards. The risk-assessment level really does have to be dialed up
to 11 on a hair-trigger.

Why? because as Guy pointed out, the cost of developing an ASIC is so
insanely high that if there's one fatal mistake absolutely nobody and
I mean nobody is going to redesign their ASICs, by that I refer to
nobody *plural* because there will obviously be multiple vendors
implementing the same Standards-fatal design flaw.

Once out the gate (Revision 1.0) the collective NREs across all
implementors could well exceed a quarter of a billion dollars.
Reminder: NRE stands for non-recurring engineering, and in practice
those costs are NON-recoverable.

So does that give some kind of context as to how seriously even the
slightest flaw has to be taken, and why I am on a hair-trigger alert,
using emphasis, capital letters and markdown "bold" asterisk
demarcation?

> and secondly if they did, that they would not consider the consequences.

we *are* considering the consequences... on this very thread! :)

> I don’t believe disabling misa.F or misa.D requires _destroying_
> the floating point register file state.

The implication from 3.1.11 Table 3.4 is exactly that: the words are
quite clear, "state must be destroyed (Extension power set to 'off',
Extension must be returned to 'initial' state on re-enablement").

Now, that may not be the *intention*, in which case there is a lack
of clarity that must be addressed. For example: the "instruction to
switch off" may not in fact *be* "MISA.F = 0" but it's not made
explicitly clear.

> At minimum it may be implementation defined.

... which is one of the "red flags" of any standard. Ok, that's not
quite true. I wish this stuff "How to develop long-term standards"
were taught in university as a course... the only problem being that I
know with 100% certainty that I would have been asleep at the back of
the lecture theatre.

"implementation-defined" is perfectly okay... *as long as* there is
*nothing else* within the Standard that critically depends on any
state *WITHIN* that implementation. Now, previously, there was no
critical inter-dependency that had been envisioned, therefore it
didn't matter.

Anyway this is partly moot as it's been established that MISA cannot
be utilised in the general case because custom extensions are *not
exposed via MISA*, it's only for *standard* extensions.

So *sigh* we are back to square one and unfortunately still have a
"red flag" showstopper on the Standard going to stable release.

> I don’t share the same general feeling that the RISC-V architecture is in reckless hands

It's not "reckless" or "implications of recklessness", please,
*really*, do not interpret my words in any such way, *really*. It's
just that despite all the best and genuinely amazing efforts of
everyone who has contributed so far, a scenario has slipped through
the cracks that was previously completely unforeseen.

This is just how Standards are. It's painstaking
unbelievable-levels-of-patience-teaching scenario modelling that allows
everybody to spot the hidden flaws. Miss even one and the standard's
*genuinely* dead... *not*, Michael, that any one person can be
"blamed" or considered "reckless" as a result.

That having been said, now that I have made people aware of the
problem, if it *is* ignored, that *would* be reckless. I am keenly aware
that my bluntly-honest communications style tends to _drive_ people to
ignore what I say because it's so open to misinterpretation. Given
that the consequences are so severe I am giving serious consideration
to seeking outside assistance and training to deal with that. It'll
be a while though (apologies).

l.

Luke Kenneth Casson Leighton

Apr 14, 2018, 2:27:38 PM

to Jacob Bachmeyer, Guy Lemieux, Cesar Eduardo Barros, RISC-V ISA Dev, Krste Asanovic, Albert Cahalan, Richard Herveille

On Sat, Apr 14, 2018 at 2:29 AM, Jacob Bachmeyer <jcb6...@gmail.com> wrote:

>> Jacob wrote:
>> "Similarly, I advocate that the RISC-V Foundation should make
>> absolutely clear that no one is permitted to lay any sort of exclusive
>> claim to either assembler mnemonics or binary opcode space."
>>
>> In effect what that means Jacob is that you're advocating shutting
>> down - entirely - the custom extensions space.
>
>
> No, I am saying that we should prevent someone from saying "I own the
> FOOBLARTCH mnemonic and no one else can make an opcode with that mnemonic"
> in the custom extension space.

Ok unfortunately for that to work, it requires a third party of some
kind (the RISC-V Foundation being about the only acceptable logical
choice) to act as the "Atomic Arbitrator" on such proposals.

Which is slightly different but still, unfortunately, results in the
termination of the very "freedom" of custom extensions that is such a
nice part of the Specification... whilst also placing a massive
administrative and technical burden on the RISC-V Foundation in the
process.

That aside, it *is* actually a workable (last-resort) solution,
albeit one that, if I was the RISC-V Foundation, I would be strongly
inclined to reject unless absolutely and completely necessary, knowing
from my experience in developing a Certification Mark just how much of
an administrative burden it would be.

l.

Luke Kenneth Casson Leighton

Apr 14, 2018, 2:40:19 PM

to Michael Clark, Jacob Bachmeyer, Cesar Eduardo Barros, RISC-V ISA Dev, Krste Asanovic, Albert Cahalan, Guy Lemieux, Richard Herveille

On Sat, Apr 14, 2018 at 2:15 AM, Michael Clark <michae...@mac.com> wrote:

> It is worthwhile that these issues are discussed but they should
> not be sensationalised

yyeahh that would tend to imply (don't worry michael, i've seen this
before) that i was seeking attention on *myself*, whereas the actual
intention is to emphasise the seriousness of the consequences
(collective estimated quarter of a billion dollars in development and
production NREs, to put it into context).

being unable to successfully get the critical nature and importance
of a point across over an iterative series of communications i tend to
get really stressed out and it triggers me to use over-emphasis,
instead, out of frustration.

> into some DISASTER SCENARIO (apologies
> for shouting, I don’t normally do that, I guess I am mirroring).

ohnooo, please don't do that. next thing you know you'll also
develop serious long-term undiagnosable illnesses related to stress
and disrupted sleep patterns, as well. you no doubt saw the same
slashdot report last week about how people with sleep disorders
statistically tend to die earlier. he said, writing at 2:29am.

l.

Jacob Bachmeyer

Apr 14, 2018, 7:23:35 PM4/14/18

to Michael Clark, Luke Kenneth Casson Leighton, Cesar Eduardo Barros, RISC-V ISA Dev, Krste Asanovic, Albert Cahalan, Guy Lemieux, Richard Herveille

Michael Clark wrote:
> On 14/04/2018, at 2:42 PM, Jacob Bachmeyer <jcb6...@gmail.com> wrote:
>
>>> I like READID
>>>
>>> Noting that it’s possible to make an implementation that is not vulnerable to defeating ASLR and reveals nothing about the physical or virtual address space of the CPU. It’s an ID space for extended version information on extensions. It makes sense to start out with 26 leaves that have version numbers like 2.3, 1.11 and 0.4, etc. I’m certain it will be useful.
>>>
>>>
>> Exactly, READID accesses a processor ID ROM that is not part of the main address space at all. Putting it in SYSTEM/PRIV also allows it to be easily handled with trap-and-emulate for environments that need to do so.
>>
>> The contents of the ID ROM can be determined later. I favor a DeviceTree subset or something structurally equivalent but possibly easier to scan.
>>
>
> For the use cases I am thinking about, device-tree is not appropriate. Ideally the mechanism is usable by user space code and M mode code on cores without device-tree or on server cores that use ACPI.
>

Flattened DeviceTree is simply a hierarchical key-value store, expressed
in a single blob. We could also define a structurally similar format
optimized for XLEN-bit word-based access, as READID would provide. ACPI
is a disaster and one of the mistakes of the past that RISC-V seeks to
leave in the past. (Remember the reaction on isa-dev when UEFI for
RISC-V was suggested? One of the things UEFI needed was ACPI bindings.
I think that ACPI provoked almost as strong a reaction as UEFI itself.)

Example:

processor-module {
TYPE_1: core-type {
base = "RV64IMAFDCV";
RVI { version = "2.0"; };
RVM { version = "2.0"; };
RVA { version = "2.0"; extend = "RVAmlr"; RVAmlr { version = "2.0";
}; };
RVF { version = "2.0"; };
RVD { version = "2.0"; };
RVC { version = "2.0"; };
RVV { version = "1.8"; };
};
hart@1 { type = <&TYPE_1>; };
hart@2 { type = <&TYPE_1>; };
};
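
To illustrate the "hierarchical key-value store" point, here is a minimal Python sketch that models the core-type node above as nested dictionaries and walks it to list extension versions. The dictionary layout and the `extension_versions` helper are illustrative assumptions only; this is not the real FDT binary format, nor part of any READID proposal.

```python
# Hypothetical in-memory model of the core-type node from the example above.
CORE_TYPE = {
    "base": "RV64IMAFDCV",
    "RVI": {"version": "2.0"},
    "RVM": {"version": "2.0"},
    "RVA": {"version": "2.0", "extend": "RVAmlr",
            "RVAmlr": {"version": "2.0"}},
    "RVF": {"version": "2.0"},
    "RVD": {"version": "2.0"},
    "RVC": {"version": "2.0"},
    "RVV": {"version": "1.8"},
}


def extension_versions(node, prefix=""):
    """Recursively collect {path: version} for every subnode that has a version."""
    found = {}
    for key, value in node.items():
        if isinstance(value, dict):
            path = prefix + key
            if "version" in value:
                found[path] = value["version"]
            # Sub-extensions (such as RVAmlr under RVA) are nested subnodes.
            found.update(extension_versions(value, path + "/"))
    return found


versions = extension_versions(CORE_TYPE)
```

A sub-extension shows up under its parent's path (here "RVA/RVAmlr"), so software that does not know about it can simply ignore the extra subnode.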

> Perhaps encoding version numbers is not the right thing to do, only the intent is to tag extended information that can’t be represented with just V (the current 26 bits). One example may be a Vector Bit Manipulation extension that is not part of the base Vector Extension or a future enhancement to the Vector extension that adds new instructions, or perhaps loose instructions e.g. subsets of M: Mmul, Mdiv. ID pages and bits within pages that are defined to indicate the presence of a feature.
>

With Vector Bit Manipulation (if distinct from the combination RVBV),
the "RVV" subnode could be similarly extended just as the "RVA" subnode
was extended for a (still-hypothetical; I need to write that proposal)
RVAmlr multi-LR extension in the earlier example:

...
RVV { version = "1.8"; extend = "RVVbitmanip"; RVVbitmanip {
version = "0.5"; }; };
...

> I have more of a practical and realistic view of how to handle what others have handled with ID instructions that can be masked into userspace for runtime CPU feature detection. Device tree is simply not appropriate, as the way to access it is not standardised across OSes.

READID would be a standard way on RISC-V to access the processor ID
ROM. I propose that that ID ROM should contain an FDT blob (or perhaps
some equivalent better optimized for word reads) describing the
processor, and possibly a containing SoC, but not the surrounding
board. A bootloader would splice the ID ROM into the board device tree
at an appropriate point when preparing the device tree for the supervisor.

> It’s the wrong abstraction-level. Device tree is designed around memory mapped devices attached to the core. Perhaps the limit one might expose in device-tree is cache topology but it’s arguable. Pre-device tree or cores without device tree at all could represent this information in a much more compact form without resorting to heavy duty string matching, having used cache and core topology enumeration on other cores, I’d rather not have to resort to using device tree other than what it is designed for, passing device information to the OS for devices that are not otherwise dynamically discoverable i.e. one only exposes a PCI host in device-tree, not the devices behind it.
>

DeviceTree, structurally, is isomorphic to the previous config string
format. There should be no problem using config string for this
purpose, so DeviceTree is simply another encoding for the same data.

Also, the READID instruction should be trappable, allowing environments
to present modified trees if needed. Or the hardware ID ROM can be
exposed to user mode. Or hardware could support redirecting READID.
Either way works for me. :-)

> Code that has optimised assembly routines that are selectively invoked based on cached values from a processor ID Instruction is just the way this is handled these days so that C and asm for RISC-V that uses extensions can be portable between environments. This can already be done with arm and Intel. Feature indication is not particularly proprietary. RISC-V will obviously have its own unique set of features and way of encoding them.
>
> That said, “imafdcsu” is all we need at the moment, but userspace will want a simpler mechanism than device tree to dynamically detect B and V. Device-tree would be the wrong way to do this. RDISA would be more appropriate and trap and emulate would match the performance requirement given feature detection is at library load time and not in the subroutine fastpath.
>

This means that structure-parsing overhead, as a tree format would
incur, is a non-issue. The extensibility that tree formats offer is
important here.

-- Jacob

Jacob Bachmeyer

Apr 14, 2018, 7:40:38 PM4/14/18

to Luke Kenneth Casson Leighton, Guy Lemieux, Cesar Eduardo Barros, RISC-V ISA Dev, Krste Asanovic, Albert Cahalan, Richard Herveille

Luke Kenneth Casson Leighton wrote:

> On Sat, Apr 14, 2018 at 2:29 AM, Jacob Bachmeyer <jcb6...@gmail.com> wrote:
>
>>> Jacob wrote:
>>> "Similarly, I advocate that the RISC-V Foundation should make
>>> absolutely clear that no one is permitted to lay any sort of exclusive
>>> claim to either assembler mnemonics or binary opcode space."
>>>
>>> In effect what that means Jacob is that you're advocating shutting
>>> down - entirely - the custom extensions space.
>>>
>> No, I am saying that we should prevent someone from saying "I own the
>> FOOBLARTCH mnemonic and no one else can make an opcode with that mnemonic"
>> in the custom extension space.
>>
>
> Ok unfortunately for that to work, it requires a third party of some
> kind (the RISC-V Foundation being about the only acceptable logical
> choice) to act as the "Atomic Arbitrator" on such proposal.
>

No, any number of non-standard extensions can have their own meanings
(and encodings, identical or different) for the FOOBLARTCH mnemonic.
What FOOBLARTCH assembles to would depend on the currently-selected
target ({vendor-id, arch-id} tuple) and the associated records in the
extensible assembler database. The extensible assembler database can
hold multiple instructions called "FOOBLARTCH" for different targets;
the unique key would be {{vendor-id, arch-id}, mnemonic}.
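
A minimal Python sketch of such a lookup, assuming the database is a plain mapping keyed by ((vendor-id, arch-id), mnemonic). The ID values, the encodings, and the `assemble` helper are all made up for illustration; no such database exists in binutils today.

```python
# Hypothetical database: key is ((vendor_id, arch_id), mnemonic).
# All IDs and encodings below are invented for this example.
ASM_DB = {
    ((0x0A, 1), "FOOBLARTCH"): 0x0000000B,
    ((0x0B, 7), "FOOBLARTCH"): 0x0000002B,  # same mnemonic, different target
}


def assemble(target, mnemonic):
    """Look up the encoding of `mnemonic` for the selected (vendor, arch) target."""
    try:
        return ASM_DB[(target, mnemonic)]
    except KeyError:
        raise ValueError(f"{mnemonic} is not defined for target {target}") from None
```

Because the target tuple is part of the key, two vendors defining FOOBLARTCH never collide in the database; they only collide if one implementation tries to select both targets for the same code.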

> Which is slightly different but still, unfortunately, results in the
> termination of the very "freedom" of custom extensions that is such a
> nice part of the Specification... whilst also placing a massive
> administrative and technical burden on the RISC-V Foundation in the
> process.
>

How does it place any burden on the Foundation? The most this can put
on the Foundation is maintaining the extensible assembler database, but
that burden comes from *having* the extensible assembler database, not
from saying that no one can prevent someone else from making a
non-standard extension that collides with their own non-standard extension.

> That aside, it *is* actually a workable (last-resort) solution,
> albeit one that, if I were the RISC-V Foundation, I would be strongly
> inclined to reject unless absolutely and completely necessary: from my
> experience in developing a Certification Mark, I know just how much of
> an administrative burden it would be.
>

There are two distinct sub-proposals here, one of which creates work for
the Foundation and one of which does not:

(1) An extensible assembler database, storing non-standard mnemonics
and encodings indexed by {vendor-id, arch-id} tuples. Maintaining the
official reference copy of this database creates work for the
Foundation, but can be considered part of the ongoing maintenance of the
RISC-V binutils port. Policies governing inclusion in this database are
a separate issue on which I have made my position known.

(2) A policy that no one can claim exclusive ownership of any
non-standard mnemonic or encoding. This can be implemented passively,
by making waiver of any such claims a condition of using the RISC-V
trademark. The only work created for the Foundation is the unavoidable
work of protecting the RISC-V trademark, which the Foundation already
must undertake.

-- Jacob

Jacob Bachmeyer

Apr 14, 2018, 8:01:41 PM4/14/18

to Luke Kenneth Casson Leighton, Michael Clark, Cesar Eduardo Barros, RISC-V ISA Dev, Krste Asanovic, Albert Cahalan, Guy Lemieux, Richard Herveille

Luke Kenneth Casson Leighton wrote:

> On Sat, Apr 14, 2018 at 2:15 AM, Michael Clark <michae...@mac.com> wrote:
>
>> You seem to be making lots of radical assumptions,
>> first that a CPU/SOC architect would include conflicting
>> extensions
>>
>
> Guy raised the scenario as the company he works for licenses a custom
> extension: they're clearly deeply concerned that their work would come
> into direct conflict on any given binary encoding with other custom
> extensions. So it's not a "radical" assumption at all.
>
> Also it's not an *assumption* it's a critical-level risk-assessment.
> So secondly: this is just what you have to go through when developing
> standards. The risk-assessment level really does have to be dialed up
> to 11 on a hair-trigger.
>
> Why? because as Guy pointed out, the cost of developing an ASIC is so
> insanely high that if there's one fatal mistake absolutely nobody and
> I mean nobody is going to redesign their ASICs, by that I refer to
> nobody *plural* because there will obviously be multiple vendors
> implementing the same Standards-fatal design flaw.
>
> Once out the gate (Revision 1.0) the collective NREs across all
> implementors could well exceed a quarter of a billion dollars.
> Reminder of the definition of NREs: NON recoverable expenses.
>

I thought "NRE costs" were "non-recurring engineering costs". The
effect is the same: NRE is the costs to produce the first wafer that do
not recur on the second wafer. Modern IC fabrication is basically block
printing, using a photographic process. The mask is the "negative" and
the wafers are the "prints". One mask can produce thousands of wafers.
(Well, one layer on each of thousands of wafers.) The mask set (one for
each layer, modern processes can have dozens of layers) must be produced
first and is, to put it mildly, insanely expensive.

> So does that give some kind of context as to how seriously even the
> slightest flaw has to be taken, and why I am on a hair-trigger alert,
> using emphasis, capital letters and markdown "bold" asterisk
> demarcation?
>

Yes, but it still creates the "cry wolf" syndrome in people who do not
see the same risks.

>> and secondly if they did, that they would not consider the consequences.
>>
>
> we *are* considering the consequences... on this very thread! :)
>

This is why I think that the proper solution is to state that "portable"
extensions are expected to be renumberable (the spec already implies
this) and individual implementations must have unambiguous opcode sets.
This renumberability is thus a feature that licensable extensions can
offer to increase their value in the licensing market.

>> I don’t believe disabling misa.F or misa.D requires _destroying_
>> the floating point register file state.
>>
>
> The implication from 3.1.11 Table 3.4 is exactly that: the words are
> quite clear, "state must be destroyed (Extension power set to 'off',
> Extension must be returned to 'initial' state on re-enablement").
>
> Now, that may not be the *intention*, in which case there is a lack
> of clarity that must be addressed. For example: the "instruction to
> switch on" may not in fact *be* "MISA.F = 0" but it's not made
> explicitly clear.
>

The intent, as far as I can determine, was that turning off an FPU would
be an action taken to save power.

>> At minimum it may be implementation defined.
>>
>
> ... which is one of the "red flags" of any standard. Ok, that's not
> quite true. I wish this stuff "How to develop long-term standards"
> were taught in university as a course... the only problem being that I
> know with 100% certainty that I would have been asleep at the back of
> the lecture theatre.
>
> "implementation-defined" is perfectly okay... *as long as* there is
> *nothing else* within the Standard that critically depends on any
> state *WITHIN* that implementation. Now, previously, there was no
> critical inter-dependency that had been envisioned, therefore it
> didn't matter.
>
> Anyway this is partly moot as it's been established that MISA cannot
> be utilised in the general case because custom extensions are *not
> exposed via MISA*, it's only for *standard* extensions.
>

We have four remaining bits in RV32 misa; they could be mapped in the
processor ID ROM to support up to four non-standard extension toggles in
any one implementation.

> [...]

>> I don’t share the same general feeling that the RISC-V architecture is in reckless hands
>>
>
> It's not "reckless" or "implications of recklessness", please,
> *really*, do not interpret my words in any such way, *really*. It's
> just that despite all the best and genuinely amazing efforts of
> everyone who has contributed so far, a scenario has slipped through
> the cracks that was previously completely unforeseen.
>

This is also why I am just now advocating explicit prohibitions on
claims of exclusivity to non-standard mnemonics and encoding space. The
issue had previously slipped through the cracks, since only a bad actor
(think TSG) could cause problems in that way.

-- Jacob

Cesar Eduardo Barros

Apr 14, 2018, 8:27:26 PM4/14/18

to jcb6...@gmail.com, Luke Kenneth Casson Leighton, Michael Clark, RISC-V ISA Dev, Krste Asanovic, Albert Cahalan, Guy Lemieux, Richard Herveille

Em 14-04-2018 21:01, Jacob Bachmeyer escreveu:
> We have four remaining bits in RV32 misa; they could be mapped in the
> processor ID ROM to support up to four non-standard extension toggles in
> any one implementation.

We also have four major opcodes reserved for custom extensions. Hm...

Jacob Bachmeyer

Apr 14, 2018, 11:02:04 PM4/14/18

to Cesar Eduardo Barros, Luke Kenneth Casson Leighton, Michael Clark, RISC-V ISA Dev, Krste Asanovic, Albert Cahalan, Guy Lemieux, Richard Herveille

Cesar Eduardo Barros wrote:
> Em 14-04-2018 21:01, Jacob Bachmeyer escreveu:
>> We have four remaining bits in RV32 misa; they could be mapped in the
>> processor ID ROM to support up to four non-standard extension toggles
>> in any one implementation.
>
> We also have four major opcodes reserved for custom extensions. Hm...

We lose two of those (CUSTOM-2/OP-IMM-64 and CUSTOM-3/OP-64) on RV128,
but gain another two (OP-IMM-32 and OP-32) on RV32. So RV32G has room
for up to six custom major opcodes -- and only four bits available in
misa. RV64G has four custom-use major opcodes, but 36 bits available in
misa. RV128G will have only CUSTOM-0 and CUSTOM-1 -- and 100 bits
available in misa. (I specified RVG because non-RVF implementations can
also reuse the RVF major opcodes for non-standard extensions.)
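
The arithmetic behind those numbers can be written out as a short Python sketch. It assumes the misa layout of a 2-bit MXL field at the top plus 26 extension-letter bits, and takes the per-base lists of custom-use major opcodes directly from the paragraph above; the names are illustrative only.

```python
MXL_BITS = 2         # base-width field at the top of misa
EXTENSION_BITS = 26  # one bit per extension letter A..Z


def spare_misa_bits(xlen):
    """Bits of misa that are neither MXL nor one of the 26 extension bits."""
    return xlen - MXL_BITS - EXTENSION_BITS


# Custom-use major opcodes per base, as counted in the message above.
CUSTOM_MAJOR_OPCODES = {
    32: ["custom-0", "custom-1", "custom-2", "custom-3", "OP-IMM-32", "OP-32"],
    64: ["custom-0", "custom-1", "custom-2", "custom-3"],
    128: ["custom-0", "custom-1"],
}

# xlen -> (number of custom major opcodes, spare misa bits)
summary = {xlen: (len(ops), spare_misa_bits(xlen))
           for xlen, ops in CUSTOM_MAJOR_OPCODES.items()}
```

The two quantities move in opposite directions: the wider the base, the more spare misa bits but the fewer custom major opcodes.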

Nonetheless, if "portable" non-standard extensions adhere to a "small
greenfield spaces" principle, allowing some "prefix" (actually scattered
all over the instruction word because RISC-V is fun like that) to be
easily renumbered, then conflicts can, for any particular concrete
implementation, be avoided, up to the point of actually running out of
opcode space. This, of course, allows even an RV32 implementation to
have far more than four non-standard extensions.

-- Jacob

Guy Lemieux

Apr 15, 2018, 2:07:35 AM4/15/18

to jcb6...@gmail.com, Albert Cahalan, Cesar Eduardo Barros, Krste Asanovic, Luke Kenneth Casson Leighton, Michael Clark, RISC-V ISA Dev, Richard Herveille

It’s not only custom ip code blocks that need to be remapped. Every opcode block, as well as the C instruction set, should be remappable. This gives the RISC-V Foundation the ability to “upgrade” the ISA in the future, yet retain compatibility with the base 1.0 (which will always be required). Thus, C could be replaced with C+ and C++ (or C ver2, or whatever name you want to give).

As for custom opcodes being renumbered, only the major opcode is easy to change — the rest of the bit positions have specific meanings and in general they can’t just be shuffled around, eg forcing two custom extensions to share space within the same opcode block.
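
As a rough illustration of why the major opcode is the easy part, the Python sketch below rewrites only bits [6:0] of a 32-bit instruction word and leaves every other field alone. The opcode values are those of the custom-0 to custom-3 slots in the base opcode map; the `renumber` helper and the sample word are hypothetical.

```python
# Major opcode values (bits [6:0]) of the four custom slots in the base opcode map.
CUSTOM_OPCODES = {"custom-0": 0x0B, "custom-1": 0x2B,
                  "custom-2": 0x5B, "custom-3": 0x7B}

OPCODE_MASK = 0x7F  # bits [6:0] of a 32-bit instruction word


def renumber(insn, new_slot):
    """Return `insn` with only its major opcode replaced by that of `new_slot`."""
    return (insn & ~OPCODE_MASK & 0xFFFFFFFF) | CUSTOM_OPCODES[new_slot]


# A made-up R-type-shaped word in custom-0:
# funct7=1, rs2=x2, rs1=x1, funct3=0, rd=x3.
word = (1 << 25) | (2 << 20) | (1 << 15) | (0 << 12) | (3 << 7) | 0x0B
moved = renumber(word, "custom-1")
```

Moving an extension this way keeps its register and funct fields intact, which is exactly why two extensions cannot be forced to share one opcode block by the same trick.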

Whatever mechanism is used to switch extensions, it needs to be standardized. This will allow OS code to preserve the right state in context switches, and to have an API to change extensions.

I have deliberately avoided trying to propose a solution -- it seems many in this forum are keen to do so immediately. The reason for my reluctance is that I want to discuss how this capability might be used, so we can produce a set of requirements before settling upon an implementation. In particular, I don’t think anyone has discussed how quickly we need to switch extensions, eg: 1 cycle, a pipeline flush, fence, a context switch / system call, or a cache flush are all different timescales -- and this will greatly influence the implementation mechanism. Ideally, I’d like to see something fast, perhaps on the order of 1 cycle or a pipeline flush, but I can’t really justify this with a real example. On an SMT processor implementation, for example, each thread could have activated a different custom extension, so rapid interleaving of settings in the pipeline would have to be possible.

Guy

Liviu Ionescu

Apr 15, 2018, 2:30:49 AM4/15/18

to Guy Lemieux, jcb6...@gmail.com, Albert Cahalan, Luke Kenneth Casson Leighton, Cesar Eduardo Barros, Michael Clark, RISC-V ISA Dev, Richard Herveille, Krste Asanovic

On 15 April 2018 at 09:07:36, Guy Lemieux (glem...@vectorblox.com) wrote:

> It’s not only custom ip code blocks that need to be remapped.
> Every opcode block, as well as the C instruction set, should be

> remappable. ...

just curious, if everything can be remapped and changed at run-time,
how does the compiler know to generate the correct code?

Liviu

Guy Lemieux

Apr 15, 2018, 6:47:43 AM4/15/18

to Liviu Ionescu, Albert Cahalan, Cesar Eduardo Barros, Krste Asanovic, Luke Kenneth Casson Leighton, Michael Clark, RISC-V ISA Dev, Richard Herveille, jcb6...@gmail.com

again, several possible solutions, some of which have already been suggested.

at the C level, it could be explicit like:

rvf_select_custom( some_id );

which you insert to change. this would manipulate the CSR bits appropriately.

at the assembly level, it would be the explicit CSR manipulation, possibly hidden behind a macro or function call like the C API above.

alternatively, RVF could register official assembly mnemonic prefixes. a 3-character alphanumeric vendor-specific one is all that is needed, eg VB1 may be selected for the VectorBlox organization. internally, VectorBlox can allocate another say 2-character ID for individual projects/extension sets, such as AA for the first one. so the total prefix might be 5 characters. thus, the first set of custom opcodes from VectorBlox would have an assembly mnemonic prefix “VB1AA.” this solves the representation problem in binutils and disassemblers. however, it suggests changes to encodings might be possible on an instruction by instruction basis.
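
a small Python sketch of how a disassembler or assembler front end might split such a prefixed mnemonic, assuming the shape is exactly 3 vendor characters, 2 project characters, then a dot. the regular expression, the `split_mnemonic` helper and the "vadd" opcode name are hypothetical.

```python
import re

# Assumed shape: 3 alphanumeric vendor chars + 2 alphanumeric project chars + "."
PREFIXED = re.compile(r"^([A-Z0-9]{3})([A-Z0-9]{2})\.(\w+)$")


def split_mnemonic(text):
    """Split 'VB1AA.op' into (vendor, project, op); return None if unprefixed."""
    match = PREFIXED.match(text)
    return match.groups() if match else None
```

for example, `split_mnemonic("VB1AA.vadd")` yields the vendor part "VB1", the project part "AA" and the bare mnemonic "vadd", while a standard mnemonic such as "add" is left alone.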

i have been wondering if it is possible to use something like C++ namespaces to define scoping of which ISA (or which subset) is active. this suggests encodings change less frequently, at the function or compilation unit boundary.

or maybe it’s like setting IEEE rounding modes.

guy

Cesar Eduardo Barros

Apr 15, 2018, 8:34:42 AM4/15/18

to Guy Lemieux, jcb6...@gmail.com, Albert Cahalan, Krste Asanovic, Luke Kenneth Casson Leighton, Michael Clark, RISC-V ISA Dev, Richard Herveille

Em 15-04-2018 03:07, Guy Lemieux escreveu:
> It’s not only custom ip code blocks that need to be remapped. Every
> opcode block, as well as the C instruction set, should be remappable.
> This allows the RISC-V Foundation the ability to “upgrade” the ISA in
> the future, yet retain compatibility with the base 1.0 (which will
> always be required). Thus, C could be replaced with C+ and C++ (or C
> ver2, or whatever name you want to give).

I find it hard to imagine replacing the base ISA (the I "extension"),
but it's an interesting thought experiment. Replacing the C extension is
much easier to imagine, and also much more probable. As for naming: the
current C extension is already "C2" (see chapter 22 of riscv-spec-v2.2),
so its next major version would be called "C3".

> As for custom opcodes being renumbered, only the major opcode is easy to
> change — the rest of the bit positions have specific meanings and in
> general they can’t just be shuffled around, eg forcing two custom
> extensions to share space within the same opcode block.

Agreed. Also, standard extensions probably won't be renumberable at all
(only disableable), and even custom extensions might be limited on which
major opcodes they can be renumbered to. And we might also have
extensions which use minor opcodes instead (a 22-bit encoding space); on
these probably only the minor opcode would be renumberable, keeping the
same major opcode.

> Whatever mechanism is used to switch extensions, it needs to be
> standardized. This will allow OS code to preserve the right state in
> context switches, and to have an API to change extensions.
>
> I have deliberately avoided trying to propose a solution — it seems many
> in this forum are keen to do so immediately. The reason for my
> reluctance is that I want to discuss how this capability might be used,
> so we can produce a set of requirements before settling upon an
> implementation. In particular, I don’t think anyone has discussed how
> quickly we need to switch extensions, eg: 1 cycle, a pipeline flush,
> fence, a context switch / system call, or a cache flush are all
> different timescales — and this will greatly influence the
> implementation mechanism. Ideally, I’d like to see something fast,
> perhaps on the order of 1 cycle or a pipeline flush, but I can’t really
> justify this with a real example. On an SMT processor implementation,
> for example, each thread could have activated a different custom
> extension, so rapid interleaving of settings in the pipeline would have
> to be possible.

How quickly the extensions have to be switched depends on how often they
have to be switched. We have the following cases:

- Every instruction (the "interleaving" case)
- Every function
- Once in a while within the same program
- When switching threads (the "context switch" case)
- As a global setting for the whole system (set during boot)
- During board design (as a strap pin)
- During chip design

For the first case, the switch would have to be extremely fast (single
cycle), but renumbering makes it much less necessary. In fact,
renumbering conflicting extensions allows for a "zero-cycle" switch
between them, since the opcode bits select the extension. The other
cases are less demanding.

Of course, depending on renumbering then leads to the question of "how
often do we have to switch and/or renumber extensions". And there's
another issue: how many steps it takes to do the renumbering. If you
need 3 steps for a change (disable this, enable that, renumber that
other, each on a separate CSR), it will take 3 times as long, unless a
two-phase scheme is used (prepare and then commit), or somehow
everything's packed into a single CSR.
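
The two-phase idea can be sketched in Python as a toy model in which staged changes stay invisible to the decoder until a single commit step. The `DecoderConfig` class, the slot names and the extension names are all hypothetical; real hardware would use CSRs rather than dictionaries.

```python
class DecoderConfig:
    """Toy model: staged changes become visible only on commit()."""

    def __init__(self):
        self.active = {}   # slot -> extension name currently decoded
        self._staged = {}  # pending changes, invisible to the decoder

    def prepare(self, slot, extension):
        """Stage a mapping; pass None to disable whatever is in `slot`."""
        self._staged[slot] = extension

    def commit(self):
        """Apply every staged change as one step."""
        for slot, extension in self._staged.items():
            if extension is None:
                self.active.pop(slot, None)
            else:
                self.active[slot] = extension
        self._staged.clear()


cfg = DecoderConfig()
cfg.prepare("custom-0", "ext-A")
cfg.commit()

cfg.prepare("custom-0", None)      # disable this
cfg.prepare("custom-1", "ext-B")   # enable that
before_commit = dict(cfg.active)   # decoder still sees the old mapping
cfg.commit()
```

However many prepare steps are needed, the decoder only ever observes the mapping before or after the commit, never a half-applied one.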

And also, there are two levels of disabling an extension. There's "decoder
disable", which only affects the decoder but keeps the instruction
state, and "full disable", which can power down the extension circuits
but loses state. It makes sense to allow both separately and explicitly,
since from the software point of view, they have very different
characteristics.
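
A minimal Python model of the difference, assuming an extension holds a single piece of state that returns to an initial value after a power-down. The `Extension` class and its method names are illustrative, not a proposed interface.

```python
class Extension:
    INITIAL_STATE = 0

    def __init__(self):
        self.decoding = True
        self.powered = True
        self.state = self.INITIAL_STATE

    def decoder_disable(self):
        """Stop decoding the extension's opcodes; its state is retained."""
        self.decoding = False

    def full_disable(self):
        """Power the extension down; its state is lost."""
        self.decoding = False
        self.powered = False
        self.state = None

    def enable(self):
        if not self.powered:
            self.powered = True
            self.state = self.INITIAL_STATE  # comes back in its initial state
        self.decoding = True
```

Software that sets some state, calls `decoder_disable()` and later `enable()` finds the state unchanged; after `full_disable()` it finds the initial state instead, which is why the two operations need to be distinguishable.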

In particular, there's risk of a side channel through an extension's
state, if lower privilege levels can disable/enable an extension while
keeping its state, and the operating system is not careful to
save/restore the state of all "decoder disabled" extensions when context
switching.

Going back to the implementation: I imagine a "decoder disable/enable"
would need only a pipeline flush, since it only affects the decoder (but
does so in the next instruction). A "full disable/enable" could take a
bit longer (how much depends on the extension). Requiring an instruction
fence could simplify the implementation a lot, especially in the case
where multiple steps are needed, at the small software cost that before
the fence, it's undefined whether an instruction is using the old or the
new encoding.

And once we have several extensions with state, which can be
enabled/disabled individually, and which have potentially overlapping
but renumberable encodings, the complexity of the context switching code
starts being a pain. Therefore, I'd like to propose an offshoot of this
thread: a "save all/restore all" mechanism, to save and restore the
state of all extensions, even the disabled ones, without having to
enable/disable anything.
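
As a rough Python sketch of what "save all/restore all" would mean for context-switch code, assuming each extension's state can be copied as a unit regardless of whether the extension is enabled. The `save_all`/`restore_all` helpers and the per-hart dictionary layout are hypothetical.

```python
def save_all(extensions):
    """Snapshot every extension's state, enabled or not, without toggling any."""
    return {name: dict(ext["state"]) for name, ext in extensions.items()}


def restore_all(extensions, snapshot):
    """Put back every saved state; enable flags are left untouched."""
    for name, state in snapshot.items():
        extensions[name]["state"] = dict(state)


hart = {
    "ext-A": {"enabled": True, "state": {"acc": 1}},
    "ext-B": {"enabled": False, "state": {"acc": 2}},  # decoder-disabled, state kept
}
saved = save_all(hart)
hart["ext-B"]["state"]["acc"] = 99   # another context clobbers it
restore_all(hart, saved)
```

The context-switch path then never has to know which extensions exist or which are currently enabled, which also closes the side channel through "decoder disabled" state mentioned above.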

Luke Kenneth Casson Leighton

Apr 15, 2018, 10:34:28 AM4/15/18

to Jacob Bachmeyer, Michael Clark, Cesar Eduardo Barros, RISC-V ISA Dev, Krste Asanovic, Albert Cahalan, Guy Lemieux, Richard Herveille

On Sun, Apr 15, 2018 at 12:23 AM, Jacob Bachmeyer <jcb6...@gmail.com> wrote:

> READID would be a standard way on RISC-V to access the processor ID ROM. I
> propose that that ID ROM should contain an FDT blob (or perhaps some
> equivalent better optimized for word reads) describing the processor, and
> possibly a containing SoC, but not the surrounding board. A bootloader
> would splice the ID ROM into the board device tree at an appropriate point
> when preparing the device tree for the supervisor.

You would need device-tree overlays to do that, which have funnily
enough been making their way into mainline u-boot and the linux kernel
in recent years.

Similar schemes were considered three possibly four years ago when
developing the EOMA68 Standard (a Certification Mark). EOMA68
requires cooperation between hardware and software, such that
manufacturers don't end up killing people due to lithium battery fires
because the manufacturer swapped two pins and short-circuited a power
rail.

The issue is that Cards (containing a fully-functioning
credit-card-sized computer and possibly even a battery) have
absolutely no way of knowing what the GPIO in the Housing (tablet
"dock" or laptop "dock") will be for, until it's plugged in. The
solution is for the Housing to have an I2C EEPROM at a known address,
containing a VENDOR:ARCH identifier.

It was *briefly* considered to actually place device-tree overlays
into the EEPROM itself.

In the end there is no actual [topological] difference between having
a VENDOR:ARCH identifier in the processor that allows OS writers to
*select* a devicetree fragment vs actually *having* the devicetree
fragment in the processor

EXCEPT....

(1) if the manufacturer gets the devicetree fragment WRONG they're
screwed and everybody's time has been wasted

(2) there's no guarantee that devicetree fragments will remain
in-date. i realise people *claim* that they are *supposed* to

(3) not all OSes will be happy with having to pull in GPLv2 source
code or write their own.

This latter is an understatement which basically kills the idea
unfortunately. Basically everyone can and probably will just fall
back to VENDOR:ARCH.

Which means that yes it's still possible to consider proposing
device-tree fragments (I do not see precisely how they meet the
requirements yet), just bear in mind that topologically-equivalent
schemes need to be proposed for *other* systems such as *shudder*
windows, FreeBSD and many more.

l.

Luke Kenneth Casson Leighton

Apr 15, 2018, 11:45:42 AM4/15/18

to Jacob Bachmeyer, Guy Lemieux, Cesar Eduardo Barros, RISC-V ISA Dev, Krste Asanovic, Albert Cahalan, Richard Herveille

---
crowd-funded eco-conscious hardware: https://www.crowdsupply.com/eoma68

*thinks*... if the instruction is of the form (literally)
{vendor-id}{arch-id}{mnemonic} then it wouldn't. The instruction, by
virtue of having the prefix {vendor-id}{arch-id} *actually embedded in
the instruction* makes it globally world-wide unique and the v/a has
already been (Atomically) registered.

Which translates to "would it be acceptable for implementors to have
insane-length instructions" and I think you'll find that the answer to
that will be a resounding "NO".

On the *other* hand... if there is only ONE such instruction which
effectively amounts to "please set up the binary-encoding
interpretation context for this function such that encoding AAAA
(32-bit or whatever) is relative to {vendor-id}{march-id} meaning"
that *would* fulfil the requirements.

Depending on how long the proposed instruction actually was.

You *might* instead get away with some form of indirection table to
reduce the space:

ROM address 1: {thirdpartyvendor-id1}{march-id0}
ROM address 2: {thirdpartyvendor-id1}{march-id1}
ROM address 3: {thirdpartyvendor-id2}{march-id593}

function assembler:

SETARCHCONTEXT romaddr=1 # immediate reference into ROM
AAAA r1, r2 # {thirdpartyvendor-id1}{march-id0} meaning of AAAA binary-encoding
SETARCHCONTEXT romaddr=2 # immediate reference into ROM
AAAA r1, r2 # {thirdpartyvendor-id1}{march-id1} meaning of AAAA binary-encoding

the SETARCHCONTEXT instruction would only need to be a 32-bit
instruction. a CLRARCHCONTEXT would also be needed (set ROM address
= 0?)
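
A toy Python decoder for the sequence above, assuming the ROM is a small index table and that an encoding with no meaning in the current context is simply illegal. The `ROM` and `MEANINGS` tables and the `run` helper are made up for illustration; only the SETARCHCONTEXT/CLRARCHCONTEXT names come from the text.

```python
# Hypothetical ID ROM: index -> (vendor-id, march-id); no entry at index 0.
ROM = {1: ("thirdpartyvendor-id1", 0),
       2: ("thirdpartyvendor-id1", 1),
       3: ("thirdpartyvendor-id2", 593)}

# Hypothetical per-context meanings of the same binary encoding "AAAA".
MEANINGS = {(("thirdpartyvendor-id1", 0), "AAAA"): "vendor1/arch0 op",
            (("thirdpartyvendor-id1", 1), "AAAA"): "vendor1/arch1 op"}


def run(program):
    """Decode (op, arg) pairs while tracking the current arch context."""
    context, decoded = None, []
    for op, arg in program:
        if op == "SETARCHCONTEXT":
            context = ROM[arg]
        elif op == "CLRARCHCONTEXT":
            context = None
        else:
            decoded.append(MEANINGS.get((context, op), "illegal instruction"))
    return decoded


trace = run([("SETARCHCONTEXT", 1), ("AAAA", None),
             ("SETARCHCONTEXT", 2), ("AAAA", None),
             ("CLRARCHCONTEXT", None), ("AAAA", None)])
```

The same encoding decodes differently after each SETARCHCONTEXT and stops decoding at all once the context is cleared.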

CLR/SETARCHCONTEXT *will* however need to be part of RV Base.

*sigh* now that I think about it this is topologically
functionally-identical to the "Shadow MISA CSR Enable/Disable"
solution.

l.

Luke Kenneth Casson Leighton

Apr 15, 2018, 12:08:42 PM4/15/18

to Jacob Bachmeyer, Michael Clark, Cesar Eduardo Barros, RISC-V ISA Dev, Krste Asanovic, Albert Cahalan, Guy Lemieux, Richard Herveille

On Sun, Apr 15, 2018 at 1:01 AM, Jacob Bachmeyer <jcb6...@gmail.com> wrote:

>> So does that give some kind of context as to how seriously even the
>> slightest flaw has to be taken, and why I am on a hair-trigger alert,
>> using emphasis, capital letters and markdown "bold" asterisk
>> demarcation?
>>
>
>
> Yes, but it still creates the "cry wolf" syndrome in people who do not see
> the same risks.

Caveat: the following sentence can be translated into
diplomatic-speak by someone who has the skill and the desire to do so.
If they cannot perceive the same risks (or perceive them in the same
way) then they should either be prevented and prohibited from being
involved in Standards development, or, much better, be trained
properly so that they can.

I'm considering relaying the experiences at some point of what I went
through in developing the EOMA68 Standard (Certification Mark). It
wasn't fun. It cost me a hell of a lot of personal money, due to
having to revise the Standard several times, throwing away prototypes
costing USD 15 THOUSAND DOLLARS in some cases. I lost several key
strategic business relationships along the way as a result of having
to put my foot down over the years, where various people tried to
force the Standard to a 1.0 release in order to make an immediate
profit.

> This is also why I am just now advocating explicit prohibitions on claims of
> exclusivity to non-standard mnemonics and encoding space.

Yes, that would be bad. I'm only just starting to be able to
vocalise the "heuristics" involving "How To Check A Standard For
Long-Term Future Success", and the example you gave would fall into
the category of "freedom to claim exclusivity results in state that by
not being controlled has a detrimental effect on other implementors
due to uncontrolled inter-dependencies".

Freedom (implementation-defined freedom) must only be delegated to
the implementor if the internal "state" within that compartmentalised
"thing" has no external dependencies or interaction with any *other*
"state".

If we want to get all mathematical about it, when writing out the
state-diagram of ALL given implementations, any one Implementation's
"implementation-defined freedom" *MUST* be a self-contained
state-diagram that *ONLY* contains *INCOMING* state-arrows from
STANDARDS-DEFINED areas.

The moment that there are arrows between the "implementation-defined
freedom" state diagrams of any two (or more) independent
implementations, that is a *100%* indication that the Standard has
failed.
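
One way to make this rule concrete is a small graph check. The Python sketch below is only an interpretation of the paragraph above: it assumes each state is labelled with an owner (either the Standard or one named implementation) and flags any arrow that links states owned by two different implementations. The state names and the `cross_implementation_arrows` helper are hypothetical.

```python
STANDARD = "standard"


def cross_implementation_arrows(owner, arrows):
    """Return the arrows that link two different implementation-defined areas.

    owner:  dict mapping state name -> STANDARD or an implementation name
    arrows: iterable of (source_state, destination_state) pairs
    """
    bad = []
    for src, dst in arrows:
        a, b = owner[src], owner[dst]
        if a != STANDARD and b != STANDARD and a != b:
            bad.append((src, dst))
    return bad


# Invented example states.
owner = {
    "fetch": STANDARD,
    "custom-decode-A": "impl-A",
    "custom-exec-A": "impl-A",
    "custom-decode-B": "impl-B",
}
ok_arrows = [
    ("fetch", "custom-decode-A"),
    ("custom-decode-A", "custom-exec-A"),
    ("fetch", "custom-decode-B"),
]
bad_arrows = ok_arrows + [("custom-exec-A", "custom-decode-B")]
```

An empty result for a given set of arrows means no implementation's private state depends on another's; any non-empty result is the failure condition described above.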

It's frickin complex.

l.

Jacob Bachmeyer

Apr 15, 2018, 11:15:42 PM

to Luke Kenneth Casson Leighton, Guy Lemieux, Cesar Eduardo Barros, RISC-V ISA Dev, Krste Asanovic, Albert Cahalan, Richard Herveille

> [...]

>
> *sigh* now that I think about it this is topologically
> functionally-identical to the "Shadow MISA CSR Enable/Disable"
> solution.
>

And that is not what I am proposing.

Two different implementations can use the *same* binary opcode for
*different* nonstandard operations. (Or possibly for an operation from
a standard extension on one and a non-standard extension on the other.)
The use of the {vendor-id, arch-id} tuple is in the *assembler*, where
the assembler source declares its target using a ".target riscv vendor
0xXXXXXXXX arch 0xYYYYYYYY" directive. The *only* effect of that
directive is to direct the assembler to load the extended mnemonics that
apply to that *specific* target and to support them in that *specific*
source file while writing that *specific* object.

Looking at it from another perspective, assume that FOOBAR is an opcode
from the RVXfoo non-standard extension and that RVXfoo is in the
extensible assembler database. Implementation A (vendor 1 arch 5)
implements RVXfoo using the CUSTOM-0 major opcode. Implementation B
(vendor 9 arch 2) implements RVXfoo using the CUSTOM-3 major opcode
(presumably because vendor B has some other extension that uses
CUSTOM-0). The extensible assembler database has records for {1,5}
mapping FOOBAR to {..., $CUSTOM-0} and for {9,2} mapping FOOBAR to {...,
$CUSTOM-3}. They could even use completely different binary encodings;
the assembler could support different encodings for non-standard
instructions on different implementations. Encoding differences aside,
both of these are the *same* FOOBAR operation.
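
A minimal Python sketch of such a database lookup may help. It assumes FOOBAR is an R-type instruction and uses invented funct3/funct7 values, since the message elides those fields; `ASM_DB` and `assemble_rtype` are hypothetical names. The CUSTOM-0 and CUSTOM-3 major opcode values are the ones in the RISC-V base opcode map.

```python
# Major opcodes (bits 6:0) set aside for custom use in the RISC-V base encoding.
CUSTOM_0 = 0b0001011
CUSTOM_3 = 0b1111011

# Hypothetical database: (vendor-id, arch-id) -> mnemonic -> R-type fields.
# The funct3/funct7 values are placeholders, not taken from any real extension.
ASM_DB = {
    (1, 5): {"FOOBAR": {"opcode": CUSTOM_0, "funct3": 0, "funct7": 0}},
    (9, 2): {"FOOBAR": {"opcode": CUSTOM_3, "funct3": 0, "funct7": 0}},
}


def assemble_rtype(target, mnemonic, rd, rs1, rs2):
    """Encode `mnemonic rd, rs1, rs2` for the target named by the directive."""
    try:
        f = ASM_DB[target][mnemonic]
    except KeyError:
        raise ValueError(f"{mnemonic} is not defined for target {target}") from None
    return (f["funct7"] << 25 | rs2 << 20 | rs1 << 15
            | f["funct3"] << 12 | rd << 7 | f["opcode"])
```

Assembling the same `FOOBAR x1, x2, x3` for targets `(1, 5)` and `(9, 2)` yields words that differ only in the major opcode field, which is the sense in which both are the same operation.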

The resultant limited portability of binary objects is an inherent
consequence of *using* non-standard instructions in the first place.

-- Jacob

Jacob Bachmeyer

Apr 15, 2018, 11:22:48 PM

to Luke Kenneth Casson Leighton, Michael Clark, Cesar Eduardo Barros, RISC-V ISA Dev, Krste Asanovic, Albert Cahalan, Guy Lemieux, Richard Herveille

Luke Kenneth Casson Leighton wrote:

> On Sun, Apr 15, 2018 at 12:23 AM, Jacob Bachmeyer <jcb6...@gmail.com> wrote:
>
>> READID would be a standard way on RISC-V to access the processor ID ROM. I
>> propose that that ID ROM should contain an FDT blob (or perhaps some
>> equivalent better optimized for word reads) describing the processor, and
>> possibly a containing SoC, but not the surrounding board. A bootloader
>> would splice the ID ROM into the board device tree at an appropriate point
>> when preparing the device tree for the supervisor.
>>
>
> You would need device-tree overlays to do that, which have funnily
> enough been making their way into mainline u-boot and the linux kernel
> in recent years.
>

The FDT format is also designed to permit splicing nodes into an FDT
blob fairly easily. Handling the string table is the only complex part.
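
To illustrate why the string table is the fiddly part: in the flattened device tree format, each property stores its name as a byte offset into a shared block of NUL-terminated strings, so nodes copied from one blob into another need those offsets remapped. The following Python sketch shows only that remapping step. It assumes well-formed strings blocks that end in a NUL byte, does not handle offsets that point into the middle of another string, and uses hypothetical function names; it is not libfdt code.

```python
def strings_with_offsets(block):
    """Yield (offset, name) for each NUL-terminated string in a strings block."""
    pos = 0
    for name in block.split(b"\0")[:-1]:  # drop the empty piece after the last NUL
        yield pos, name
        pos += len(name) + 1


def merge_string_tables(dst, src):
    """Append to dst any names that only src has.

    Returns (merged_block, remap), where remap maps each string's byte
    offset in src to the offset of the same name in merged_block.
    """
    merged = bytearray(dst)
    known = {}
    for off, name in strings_with_offsets(dst):
        known.setdefault(name, off)
    remap = {}
    for off, name in strings_with_offsets(src):
        if name not in known:
            known[name] = len(merged)
            merged += name + b"\0"
        remap[off] = known[name]
    return bytes(merged), remap
```

A splicer would then rewrite the name offset of every copied property through `remap` before appending the copied structure tokens to the destination blob.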

> Similar schemes were considered three possibly four years ago when
> developing the EOMA68 Standard (a Certification Mark). EOMA68
> requires cooperation between hardware and software, such that
> manufacturers don't end up killing people due to lithium battery fires
> because the manufacturer swapped two pins and short-circuited a power
> rail.
>
> The issue is that Cards (containing a fully-functioning
> credit-card-sized computer and possibly even a battery) have
> absolutely no way of knowing what the GPIO in the Housing (tablet
> "dock" or laptop "dock") will be for, until it's plugged in. The
> solution is for the Housing to have an I2C EEPROM at a known address,
> containing a VENDOR:ARCH identifier.
>
> It was *briefly* considered to actually place device-tree overlays
> into the EEPROM itself.
>
> In the end there is no actual [topological] difference between having
> a VENDOR:ARCH identifier in the processor that allows OS writers to
> *select* a devicetree fragment vs actually *having* the devicetree
> fragment in the processor
>
> EXCEPT....
>
> (1) if the manufacturer gets the devicetree fragment WRONG they're
> screwed and everybody's time has been wasted
>

Mask ROM is simply part of the hardware and can be validated along with
the rest of the ASIC. EEPROM is, well, rewritable. Rewrite it with the
correct devicetree fragment.

> (2) there's no guarantee that devicetree fragments will remain
> in-date. i realise people *claim* that they are *supposed* to
>

The only way that that can occur is for (1) backwards-compatibility to
have been broken (a bug) or (2) hardware changed and the descriptor did
not, which equally affects VENDOR:ARCH schemes.

> (3) not all OSes will be happy with having to pull in GPLv2 source
> code or write their own.
>

If you are writing an OS, you are writing an OS. Writing code is part
of that job.

> This latter is an understatement which basically kills the idea
> unfortunately. Basically everyone can and probably will just fall
> back to VENDOR:ARCH.
>
> Which means that yes it's still possible to consider proposing
> device-tree fragments (I do not see precisely how they meet the
> requirements yet), just bear in mind that topologically-equivalent
> schemes need to be proposed for *other* systems such as *shudder*
> windows, FreeBSD and many more.

READID would be visible to user-space.

-- Jacob

Jacob Bachmeyer

Apr 15, 2018, 11:35:07 PM

to Luke Kenneth Casson Leighton, Michael Clark, Cesar Eduardo Barros, RISC-V ISA Dev, Krste Asanovic, Albert Cahalan, Guy Lemieux, Richard Herveille

Luke Kenneth Casson Leighton wrote:

> On Sun, Apr 15, 2018 at 1:01 AM, Jacob Bachmeyer <jcb6...@gmail.com> wrote:
>
>
>>> So does that give some kind of context as to how seriously even the
>>> slightest flaw has to be taken, and why I am on a hair-trigger alert,
>>> using emphasis, capital letters and markdown "bold" asterisk
>>> demarcation?
>>>
>> Yes, but it still creates the "cry wolf" syndrome in people who do not see
>> the same risks.
>>
>
> Caveat: the following sentence can be translated into
> diplomatic-speak by someone who has the skill and the desire to do so.
> If they cannot perceive the same risks (or perceive them in the same
> way) then they should either be prevented and prohibited from being
> involved in Standards development, or, much better, be trained
> properly so that they can.
>

It also does not help that many people have been forced to learn to
ignore emphasis, capital letters, etc. due to the abuse of same by
marketers and spammers.

> I'm considering relaying the experiences at some point of what I went
> through in developing the EOMA68 Standard (Certification Mark). It
> wasn't fun. It cost me a hell of a lot of personal money, due to
> having to revise the Standard several times, throwing away prototypes
> costing USD 15 THOUSAND DOLLARS in some cases. I lost several key
> strategic business relationships along the way as a result of having
> to put my foot down over the years, where various people tried to
> force the Standard to a 1.0 release in order to make an immediate
> profit.
>

Those experiences would probably be of value to the community, yes.

>> This is also why I am just now advocating explicit prohibitions on claims of
>> exclusivity to non-standard mnemonics and encoding space.
>>
>
> Yes, that would be bad. I'm only just starting to be able to
> vocalise the "heuristics" involving "How To Check A Standard For
> Long-Term Future Success", and the example you gave would fall into
> the category of "freedom to claim exclusivity results in state that by
> not being controlled has a detrimental effect on other implementors
> due to uncontrolled inter-dependencies".
>
> Freedom (implementation-defined freedom) must only be delegated to
> the implementor if the internal "state" within that compartmentalised
> "thing" has no external dependencies or interaction with any *other*
> "state".
>
> If we want to get all mathematical about it, when writing out the
> state-diagram of ALL given implementations, any one Implementation's
> "implementation-defined freedom" *MUST* be a self-contained
> state-diagram that *ONLY* contains *INCOMING* state-arrows from
> STANDARDS-DEFINED areas.
>
> The moment that there are arrows between the "implementation-defined
> freedom" state diagrams of any two (or more) independent
> implementations, that is a *100%* indication that the Standard has
> failed.
>
> It's frickin complex.
>

I think that you have explained the problems that the extensible
assembler database (combined with those policies) is intended to solve
better than I could.

-- Jacob

Luke Kenneth Casson Leighton

Apr 16, 2018, 12:00:33 AM

to Jacob Bachmeyer, Guy Lemieux, Cesar Eduardo Barros, RISC-V ISA Dev, Krste Asanovic, Albert Cahalan, Richard Herveille

On Mon, Apr 16, 2018 at 4:15 AM, Jacob Bachmeyer <jcb6...@gmail.com> wrote:

>> *sigh* now that I think about it this is topologically
>> functionally-identical to the "Shadow MISA CSR Enable/Disable"
>> solution.
>
> And that is not what I am proposing.

ah, sorry. It's still not clear to me. We're slowly getting there
by eliminating what it *isn't* :)

> Two different implementations can use the *same* binary opcode for
> *different* nonstandard operations.

yes. that's a statement of the problem, rather than a statement of
the requirement which in effect boils down to "a SINGLE unified source
(upstream) qemu (and other simulators) MUST, with a single
command-line option, be able to correctly run all and any custom
extensions even in cases of conflicts between binary encodings" with
caveats of course that the full ISA of such custom extensions would
have to be published, and implemented in gcc / binutils.

> (Or possibly for an operation from a
> standard extension on one and a non-standard extension on the other.)

yes that's an extremely good point and may even actually be needed in
future, to fix backwards-compatibility problems or updates to the
RISC-V Specification. It's not just custom extensions that may need
to have different encodings / meanings of the exact same binary
encoding.

> The use of the {vendor-id, arch-id} tuple is in the *assembler*, where the
> assembler source declares its target using a ".target riscv vendor
> 0xXXXXXXXX arch 0xYYYYYYYY" directive.

ok. with you so far...

> The *only* effect of that directive
> is to direct the assembler to load the extended mnemonics that apply to that
> *specific* target and to support them in that *specific* source file while
> writing that *specific* object.

i *think* i'm starting to see what you're saying. gcc could
incorporate support (patches) for all of the custom extensions because
they're uniquely identified in the assembler (by unique
vendor-march-customext prefix). binutils could likewise incorporate
patches from custom extension developers because those *too* would be
uniquely identified....

ah. i think i see what the problem is. as there's cross-over
(obviously), allow me to continue, and if i've got the above wrong,
just ignore the continuation ok?

> Looking at it from another perspective, assume that FOOBAR is an opcode from
> the RVXfoo non-standard extension and that RVXfoo is in the extensible
> assembler database. Implementation A (vendor 1 arch 5) implements RVXfoo
> using the CUSTOM-0 major opcode. Implementation B (vendor 9 arch 2)
> implements RVXfoo using the CUSTOM-3 major opcode (presumably because vendor
> B has some other extension that uses CUSTOM-0). The extensible assembler
> database has records for {1,5} mapping FOOBAR to {..., $CUSTOM-0} and for
> {9,2} mapping FOOBAR to {..., $CUSTOM-3}. They could even use completely
> different binary encodings; the assembler could support different encodings
> for non-standard instructions on different implementations. Encoding
> differences aside, both of these are the *same* FOOBAR operation.

ok, so the counter-example that shows what the problem is, is as
follows (i'm going to use the suggested assembly proposal, even though
you could skip it entirely and go directly to the "binary encoding").

* vendor 1 implements custom extension A and picks binary encoding AAAA
* vendor 1 submits upstream patches (accepted) for gcc and binutils
with {vendor-1:NOARCH:A} prefixes. they can't submit an arch
registration (can they?) because they haven't actually released an
actual chip
* vendor 2 implements custom extension B and picks binary encoding AAAA
* vendor 2 submits upstream patches (accepted) for gcc and binutils
with {vendor-2:NOARCH:B} prefixes. they can't submit an arch
registration (can they?) because they haven't actually released an
actual chip.
* Fabless Semi company 1 licenses BOTH extensions and (for reasons of
cost already outlined earlier in the thread, does *NOT* repeat *NOT*
modify or desire to modify EITHER vendor 1 OR vendor 2's custom
extensions).
* Fabless Semi company 1 registers {fabless-1:arch-1} (which doesn't
help us but hey they still have to do it)
* developers compile code with gcc and binutils for
{fabless-1:arch-1} and it FAILS TO RUN BECAUSE THE EXACT SAME ENCODING
AAAA HAS TWO MEANINGS.

Actually the Fabless Semi company could not even RELEASE the hardware
in the first place because the RTL (and assembly test vectors) during
simulation would pick up on the fact that the exact same encoding AAAA
is required to be routed to *TWO SEPARATE DECODERS*. we hope.
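
The check that one hopes simulation would catch can be sketched in a few lines of Python. This is an illustration only: it assumes each licensed extension is described simply as a set of claimed 32-bit encodings, and the value used for AAAA, the second encoding, and the `find_conflicts` helper are all invented.

```python
def find_conflicts(extensions):
    """Return {encoding: [extension names]} for encodings claimed more than once.

    extensions: dict mapping extension name -> set of claimed encodings
    """
    claims = {}
    for name, encodings in extensions.items():
        for enc in encodings:
            claims.setdefault(enc, []).append(name)
    return {enc: names for enc, names in claims.items() if len(names) > 1}


AAAA = 0x0000000B  # placeholder for the shared encoding

# What the fabless company licensed, unmodified, from the two vendors.
licensed = {
    "vendor-1:A": {AAAA},
    "vendor-2:B": {AAAA, 0x0000002B},
}
```

Here `find_conflicts(licensed)` reports that `AAAA` is claimed by both extensions, which is exactly the case where a single decoder cannot route the instruction.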

Now, that's not to say that what you propose isn't a damn good idea:
it's an extremely good idea that, if deployed, could help solve the
"maintenance problem" associated with custom extension vendors having
to maintain disparate forks of gcc and binutils.

The thing is that the (globally-world-wide-unique identifying) state
information which you quite rightly propose should be carried across
gcc and binutils is LOST AT THE BINARY LEVEL.

And it's the BINARY level where it really, really matters.

Thus, the scheme that you propose *only* works if there is a central
authority that prevents and prohibits BINARY encodings from being
duplicated, which is why I mentioned it (a couple of messages ago).
Without actually fully understanding precisely what it was that you
were proposing. oopsie.

So. Did I get it right? Has what you're saying finally gone in to my
pudding-grade-mush-for-a-brain?

l.

Luke Kenneth Casson Leighton

Apr 16, 2018, 12:28:30 AM

to Jacob Bachmeyer, Michael Clark, Cesar Eduardo Barros, RISC-V ISA Dev, Krste Asanovic, Albert Cahalan, Guy Lemieux, Richard Herveille

On Mon, Apr 16, 2018 at 4:35 AM, Jacob Bachmeyer <jcb6...@gmail.com> wrote:

> It also does not help that many people have been forced to learn to ignore
> emphasis, capital letters, etc. due to the abuse of same by marketers and
> spammers.

*rueful* truuue.... if email supported markdown I'd be fine...

>> I'm considering relaying the experiences at some point of what I went
>> through in developing the EOMA68 Standard (Certification Mark). It
>> wasn't fun. It cost me a hell of a lot of personal money, due to
>

> Those experiences would probably be of value to the community, yes.

There's an indirect reason for considering it: I do have to be quite
careful about how I tell the story though. The key lesson that I
learned was that profit-maximised "Corporate" decision-making
conflicts very very badly with Standards development (and protection
of the same). Corporate interests *pathologically* require that they
maximise profits [1]. By Law! Directors can be *sued and struck
off*, and never permitted to be a Director of a Corporation ever
again, if the Shareholders decide that they've not properly maximised
profits as outlined in the Articles of Incorporation.

Which means that, sadly, even if any one of the Corporate members of
the RISC-V Foundation *spotted* that there was a problem with the
RISC-V Standard, they may actually make the decision *not to point out
the problem* because of the adverse effects doing so might have on
that Corporation's profits!

This is one of the primary reasons why a Certification Mark Holder is
*not* permitted to actually compete with its licensees. That the
RISC-V Foundation has been advised to take out a *Service* Mark (by a
Legal Firm that doesn't have a *single person* publicly listed
*anywhere in the world* as being an expert in Certification Mark Law)
seriously undermines the extent to which the RISC-V Foundation can be
trusted to not do that. They can *state* that they have no intention
to release actual hardware competing with its licensees but there is
nothing - legally - which actually prevents and prohibits them from
doing so.

Very very few people have actually studied this stuff. Whilst
everyone here has some absolutely amazing expertise to contribute, far
more than I ever could possibly learn in a dozen lifetimes, I'm one of
the few people in the world who has both broad technical knowledge,
and knowledge of Certification Marks and Standards Development, *and*
is not under any form of Corporate control, University affiliation and
not even a member of a related Software Libre Team.

l.

[1] and it's a damn good thing that they do, otherwise they'd be out
of business! Profit-maximisation results in hugely-efficient
cost-effective volume manufacturing decision-making. See "The Other
Side of Innovation" https://www.amazon.co.uk/dp/1422166961 which
explains it very well.

Luke Kenneth Casson Leighton

Apr 17, 2018, 12:14:39 AM

to Jacob Bachmeyer, Michael Clark, Cesar Eduardo Barros, RISC-V ISA Dev, Krste Asanovic, Albert Cahalan, Guy Lemieux, Richard Herveille

On Mon, Apr 16, 2018 at 4:35 AM, Jacob Bachmeyer <jcb6...@gmail.com> wrote:

> It also does not help that many people have been forced to learn to ignore
> emphasis, capital letters, etc. due to the abuse of same by marketers and
> spammers.

Apologies for a follow-up: RFCs explicitly use "MUST", "MAY",
"SHOULD" as very very deliberate words with special and
clearly-defined standardised meanings.

>> It's frickin complex.

> I think that you have explained the problems that the extensible assembler
> database (combined with those policies) is intended to solve better than I
> could.

Likewise, here, I am reminded of the Systemic Laws outlined in
"Invisible Dynamics", one of which states, "All contributions by all
contributors, and all contributors themselves, past and present, must
be recognised and acknowledged as valuable".

l.

Jacob Bachmeyer

Apr 17, 2018, 11:25:04 PM

to Luke Kenneth Casson Leighton, Guy Lemieux, Cesar Eduardo Barros, RISC-V ISA Dev, Krste Asanovic, Albert Cahalan, Richard Herveille

Luke Kenneth Casson Leighton wrote:

> On Mon, Apr 16, 2018 at 4:15 AM, Jacob Bachmeyer <jcb6...@gmail.com> wrote:
>
>>> *sigh* now that I think about it this is topologically
>>> functionally-identical to the "Shadow MISA CSR Enable/Disable"
>>> solution.
>>>
>> And that is not what I am proposing.
>>
>
> ah, sorry. It's still not clear to me. We're slowly getting there
> by eliminating what it *isn't* :)
>

If nothing else, we will get it pinned down eventually. :-)

>> Two different implementations can use the *same* binary opcode for
>> *different* nonstandard operations.
>>
>
> yes. that's a statement of the problem, rather than a statement of
> the requirement which in effect boils down to "a SINGLE unified source
> (upstream) qemu (and other simulators) MUST, with a single
> command-line option, be able to correctly run all and any custom
> extensions even in cases of conflicts between binary encodings" with
> caveats of course that the full ISA of such custom extensions would
> have to be published, and implemented in gcc / binutils.
>

For QEMU, assuming non-standard extension RVXfoo is supported and QEMU
supports a particular target A ({vendor-id, arch-id} tuple) that decodes
RVXfoo for some particular encoding, QEMU would decode RVXfoo
instructions with target A's encoding when directed to emulate target
A. Assuming QEMU also supports target B that uses a different encoding
for RVXfoo, QEMU would decode RVXfoo instructions with target B's
encoding when directed to emulate target B.

The requirement is that *no* *single* implementation (target) can have
conflicting encodings. This is accomplished by renumbering extensions,
which produces the same final result as independent development.
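
A small Python sketch of per-target decoding under that requirement follows. It is not QEMU code and does not use any QEMU interface; the target names, the `RVXbar` extension, and both helper functions are assumptions. Decoding is keyed on the major opcode only, to keep the example short.

```python
CUSTOM_0 = 0b0001011
CUSTOM_3 = 0b1111011


def build_decode_table(assignments):
    """Build {major_opcode: extension}, rejecting conflicts within one target."""
    table = {}
    for opcode, ext in assignments:
        if opcode in table and table[opcode] != ext:
            raise ValueError(
                f"opcode {opcode:#09b} claimed by {table[opcode]} and {ext}")
        table[opcode] = ext
    return table


# Target A puts RVXfoo on CUSTOM-0; target B renumbers it to CUSTOM-3
# because a different (hypothetical) extension already occupies CUSTOM-0.
TARGETS = {
    "A": build_decode_table([(CUSTOM_0, "RVXfoo")]),
    "B": build_decode_table([(CUSTOM_0, "RVXbar"), (CUSTOM_3, "RVXfoo")]),
}


def decode(target, insn):
    """Decode by major opcode using the table of the emulated target."""
    return TARGETS[target].get(insn & 0x7F, "not a custom instruction on this target")
```

The same word decodes as `RVXfoo` on target A and `RVXbar` on target B, while each table on its own stays unambiguous.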

>> (Or possibly for an operation from a
>> standard extension on one and a non-standard extension on the other.)
>>
>
> yes that's an extremely good point and may even actually be needed in
> future, to fix backwards-compatibility problems or updates to the
> RISC-V Specification. It's not just custom extensions that may need
> to have different encodings / meanings of the exact same binary
> encoding.
>

The constraint that I propose is that any *specific* implementation
*must* have unambiguous instruction encoding, *but* across *all*
implementations, ambiguous instruction encodings are no problem.

>> The use of the {vendor-id, arch-id} tuple is in the *assembler*, where the
>> assembler source declares its target using a ".target riscv vendor
>> 0xXXXXXXXX arch 0xYYYYYYYY" directive.
>>
>
> ok. with you so far...
>
>
>> The *only* effect of that directive
>> is to direct the assembler to load the extended mnemonics that apply to that
>> *specific* target and to support them in that *specific* source file while
>> writing that *specific* object.
>>
>
> i *think* i'm starting to see what you're saying. gcc could
> incorporate support (patches) for all of the custom extensions because
> they're uniquely identified in the assembler (by unique
> vendor-march-customext prefix).

Well, GCC likely would not support any of them beyond accepting __asm__
blocks containing the required ".target riscv" directive and the custom
instructions themselves. A program that uses non-standard extensions is
inherently non-portable.

> binutils could likewise incorporate
> patches from custom extension developers because those *too* would be
> uniquely identified....
>

The extensible assembler database would essentially be a set of
mnemonic->opcode tables for various target {vendor-id, arch-id} tuples.
Different targets can map the same mnemonics to different encodings.

From the draft privileged ISA spec: "Commercial architecture IDs are
allocated by each commercial vendor independently, ..."

On the other hand, if these are "abstract modular extensions" (perhaps
using the Rocket coprocessor interface), then vendors 1 and 2 have only
defined their extensions and selected *preferred* binary encodings. The
extensible assembler database only stores non-standard extensions for
actual implementations.

> * Fabless Semi company 1 licenses BOTH extensions and (for reasons of
> cost already outlined earlier in the thread, does *NOT* repeat *NOT*
> modify or desire to modify EITHER vendor 1 OR vendor 2's custom
> extensions).
> * Fabless Semi company 1 registers {fabless-1:arch-1} (which doesn't
> help us but hey they still have to do it)
> * developers compile code with gcc and binutils for
> {fabless-1:arch-1} and it FAILS TO RUN BECAUSE THE EXACT SAME ENCODING
> AAAA HAS TWO MEANINGS.
>
> Actually the Fabless Semi company could not even RELEASE the hardware
> in the first place because the RTL (and assembly test vectors) during
> simulation would pick up on the fact that the exact same encoding AAAA
> is required to be routed to *TWO SEPARATE DECODERS*. we hope.
>

This is where Fabless Semi *simply* *cannot* *do* *that*. Combining
*both* extensions in a single hart simply entails the cost of
renumbering one or both. There are ways to reduce those costs, which
vendors 1 and 2 would presumably use to increase the values of their
extensions in the market. Alternately, Fabless Semi could build a
heterogeneous multiprocessor, where one hart is {fabless-1:arch-1} and
the other {fabless-1:arch-2}; one supports the vendor 1 extension and
the other supports the vendor 2 extension, with neither extension
renumbered.
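
For a sense of what renumbering on a single hart amounts to, here is a minimal Python sketch. It assumes each extension occupies one whole custom major opcode and assigns slots first come, first served; real renumbering would work at a finer granularity and involve hardware changes this does not model. The `renumber` helper and the extension names are hypothetical.

```python
# The four custom major opcodes of the 32-bit RISC-V base opcode map.
CUSTOM_OPCODES = {
    "CUSTOM-0": 0b0001011,
    "CUSTOM-1": 0b0101011,
    "CUSTOM-2": 0b1011011,
    "CUSTOM-3": 0b1111011,
}


def renumber(preferred):
    """Assign one custom slot per extension on a single hart.

    preferred: dict mapping extension name -> preferred slot name.
    An extension keeps its preferred slot if it is still free; otherwise
    it is moved to the first free slot.
    """
    assigned, used = {}, set()
    for ext, slot in preferred.items():
        if slot in used:
            free = [s for s in CUSTOM_OPCODES if s not in used]
            if not free:
                raise ValueError(f"no free custom major opcode for {ext}")
            slot = free[0]
        assigned[ext] = slot
        used.add(slot)
    return assigned
```

With both vendors preferring `CUSTOM-0`, the first keeps it and the second is moved to `CUSTOM-1`; the heterogeneous alternative corresponds to calling `renumber` separately for each hart so that neither extension moves.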

The solution I propose accepts this fundamental reality and assists the
broader community in handling the resultant renumbering and related
complexities, including laying down ground rules to promote this
development.

> Now, that's not to say that what you propose isn't a damn good idea:
> it's an extremely good idea that, if deployed, could help solve the
> "maintenance problem" associated with custom extension vendors having
> to maintain disparate forks of gcc and binutils.
>
> The thing is that the (globally-world-wide-unique identifying) state
> information which you quite rightly propose should be carried across
> gcc and binutils is LOST AT THE BINARY LEVEL.
>

That information is unneeded at the binary level in this case: the
program is adapted to the specific hardware for which it is intended,
which is unavoidable when non-standard extensions are used.

An ELF note or other "target tag" could be added to allow the linker to
generate a warning if modules with different non-standard instruction
sets are combined (but not an error -- a heterogeneous multiprocessor
might actually be able to run such a program) and to allow loaders to
reject programs that use non-standard instructions not matching any
processor in the system.
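
The two checks described here can be sketched as follows. This Python fragment assumes a "target tag" is just a (vendor-id, arch-id) pair, or `None` for a module that uses only standard instructions; it does not model any real ELF note layout, and both function names are hypothetical.

```python
import warnings


def link_check(module_tags):
    """Warn, but do not fail, when modules carry different target tags.

    Returns the set of distinct non-standard targets found.
    """
    distinct = {t for t in module_tags if t is not None}
    if len(distinct) > 1:
        warnings.warn(f"modules built for different targets: {sorted(distinct)}")
    return distinct


def loader_accepts(program_tags, hart_targets):
    """Accept a program only if every tag it carries matches some hart."""
    return all(tag in hart_targets for tag in program_tags)
```

On a heterogeneous system with harts `(1, 5)` and `(9, 2)`, a program mixing both tags links with a warning and loads; on a system with only `(1, 5)` the loader rejects it.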

> And it's the BINARY level where it really, really matters.
>
> Thus, the scheme that you propose *only* works if there is a central
> authority that prevents and prohibits BINARY encodings from being
> duplicated, which is why I mentioned it (a couple of messages ago).
> Without actually fully understanding precisely what it was that you
> were proposing. oopsie.
>
> So. Did I get it right? Has what you're saying finally gone in to my
> pudding-grade-mush-for-a-brain?
>

Part way. You still seem to be thinking that there is some feasible way
to coordinate unique opcode numbering without creating "land rush"
scenarios. Even adding an "extension-select" CSR is simply expanding
the opcode with the contents of that added CSR.

The important point that I am trying to make is that globally-unique
opcodes for non-standard extensions are simply *wrong* for RISC-V. Only
standard extensions can have globally-unique encodings -- and (to make
this more complex) implementations *can* implement a standard extension
with a non-standard encoding. (Possible example: RVL is expected to
require a 48-bit standard encoding, but an implementation could renumber
RVL into the CUSTOM-0/CUSTOM-1/other-standard major opcodes as a
non-standard extension and have (non-standard) decimal floating-point
without needing to support 48-bit instructions. In this case, the
appropriate ".target riscv ..." directive would result in the standard
RVL mnemonics producing the non-standard encoding. The implementation
in this case does *not* implement RVL, because the encoding used is
non-standard, but should be able to advertise some level of source
compatibility -- a program using RVL *can* be specially compiled for
this implementation and will run correctly if so compiled.)

-- Jacob

Luke Kenneth Casson Leighton

Apr 17, 2018, 11:58:50 PM

to Jacob Bachmeyer, Guy Lemieux, Cesar Eduardo Barros, RISC-V ISA Dev, Krste Asanovic, Albert Cahalan, Richard Herveille

Hi Jacob, apologies for being brief: I'm short on time and have other
matters to attend to. Please don't think I'm being curt; I'm just
using shorter sentences.

On Wed, Apr 18, 2018 at 4:24 AM, Jacob Bachmeyer <jcb6...@gmail.com> wrote:

> The requirement is that *no* *single* implementation (target) can have
> conflicting encodings.

Unfortunately, that's not what's being discussed. The problem being
discussed *is* when there are multiple conflicting simultaneous
encodings. So, discussing the situations where there are no such
conflicts is, unfortunately, out of scope (because it's not actually a
problem).

> This is where Fabless Semi *simply* *cannot* *do* *that*.

... and the reason why it cannot do that is because the RISC-V ISA
Standard has failed to take into account the possibility where
conflicts may occur.

> Combining *both*
> extensions in a single hart simply entails the cost of renumbering one or
> both.

As previously explained that is unfortunately totally unacceptable,
both for the vendors and also for the Fabless Semiconductor company
licensing the two (conflicting) extensions. The Fabless Semi company
wants *silicon-proven* extensions, and as we know that costs $1.5
million for masks in 28nm and a development budget of between five and
ten times that, to get to that point.

> There are ways to reduce those costs, which vendors 1 and 2 would
> presumably use to increase the values of their extensions in the market.

If this situation is not resolved there won't *be* a market because,
on reading this thread and seeing that RISC-V has failed to take into
account the conflict-scenario, no company considering developing
RISC-V custom extensions will be able to take RISC-V seriously: it's
simply too risky.

> Alternately, Fabless Semi could build a heterogeneous multiprocessor, where
> one hart is {fabless-1:arch-1} and the other {fabless-1:arch-2}; one
> supports the vendor 1 extension and the other supports the vendor 2
> extension, with neither extension renumbered.

not realistic :(

> The solution I propose accepts this fundamental reality and assists the
> broader community in handling the resultant renumbering and related
> complexities, including laying down ground rules to promote this
> development.

Unfortunately it also still doesn't solve the problem because it
assumes that there will be zero conflicts of any given encoding AAAA.
That in turn implies that someone, somewhere, has to have an "atomic
transaction database", and the RISC-V Foundation is extremely unlikely
to want to do that. It defeats the purpose of the exercise of
"freedom within custom ISAs"

>> Now, that's not to say that what you propose isn't a damn good idea:
>> it's an extremely good idea that, if deployed, could help solve the
>> "maintenance problem" associated with custom extension vendors having
>> to maintain disparate forks of gcc and binutils.
>>
>> The thing is that the (globally-world-wide-unique identifying) state
>> information which you quite rightly propose should be carried across
>> gcc and binutils is LOST AT THE BINARY LEVEL.
>>
>
>
> That information is unneeded at the binary level in this case: the program
> is adapted to the specific hardware for which it is intended, which is
> unavoidable when non-standard extensions are used.

it _is_ avoidable... if an indirect "context" instruction is added to
the ISA which allows multiple encodings AAAA to be uniquely
identified.

>> So. Did I get it right? Has what you're saying finally gone in to my
>> pudding-grade-mush-for-a-brain?
>>
>
>
> Part way. You still seem to be thinking that there is some feasible way to
> coordinate unique opcode numbering

there is.

> without creating "land rush" scenarios.

no such scenarios occur if the indirection-table scheme i proposed is used.

> Even adding an "extension-select" CSR is simply expanding the opcode with
> the contents of that added CSR.

in a unique fashion that ensures there are no conflicts... yes.
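
To make the "opcode expanded by a CSR" idea concrete, here is a minimal sketch in Python. It assumes a hypothetical per-hart select value; the names `ext_select`, `DECODE_TABLE` and the vendor labels are invented for illustration and are not part of any RISC-V specification. Only the custom-0 major opcode value (0x0B) comes from the base ISA.

```python
# Toy model: the same 32-bit custom encoding is decoded differently
# depending on a hypothetical "extension-select" value held per hart.
# Nothing here is defined by the RISC-V specification.

CUSTOM_0 = 0x0B  # major opcode custom-0 (bits 6:0 of the instruction)

# (ext_select value, major opcode) -> owner of that encoding.
# The vendor names and select values are made up for illustration.
DECODE_TABLE = {
    (1, CUSTOM_0): "vendor1.op",
    (2, CUSTOM_0): "vendor2.op",
}


def decode(ext_select: int, insn: int) -> str:
    """Return which extension owns `insn` under the given select value."""
    major = insn & 0x7F  # low 7 bits hold the major opcode
    return DECODE_TABLE.get((ext_select, major), "illegal-instruction")


insn = 0x0000000B  # an instruction word in the custom-0 space
print(decode(1, insn))  # vendor1.op
print(decode(2, insn))  # vendor2.op
print(decode(3, insn))  # illegal-instruction
```

The identical instruction word resolves to a different owner for each select value, which is the sense in which the CSR contents act as extra opcode bits.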

ok running out of time, apologies. can we establish for future
exchanges that the context of the discussion is *specifically* the
case where conflicting encodings occur, as opposed to those which
*avoid* conflicting encodings (by any means)? if that's not agreeable
can we please choose one and only one at a time to discuss as both are
completely different and it's extremely difficult to have a discussion
where the context is not properly established and understood. happy
to mutually-exclusively discuss one at a time.

l.

Allen Baum

Apr 18, 2018, 2:03:02 AM

to jcb6...@gmail.com, Luke Kenneth Casson Leighton, Guy Lemieux, Cesar Eduardo Barros, RISC-V ISA Dev, Krste Asanovic, Albert Cahalan, Richard Herveille

Jacob- I think you have explained it succinctly. There is a problem here, but it’s not a problem that needs to be solved, it is a problem that simply needs to be avoided.
Market forces by themselves will be sufficient to ensure this.

Custom extension vendors will be required to do whatever it takes to make their extensions work in the RISC-V ecosystem, which includes compiler tools and even instruction encoding relocation tools (both SW and HW) should that become necessary. If they don’t, they’ll go out of business, thus solving the problem in another way.

Vendors selling custom extensions presumably have (performance/area/power) advantages and will need to provide the libraries that make use of them anyway, so this isn’t a stretch.

The vendor/architecture/implementationID provides a completely unique signature regardless of whether the actual CSRs are implemented.

-Allen

> --
> You received this message because you are subscribed to the Google Groups "RISC-V ISA Dev" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to isa-dev+u...@groups.riscv.org.
> To post to this group, send email to isa...@groups.riscv.org.
> Visit this group at https://groups.google.com/a/groups.riscv.org/group/isa-dev/.
> To view this discussion on the web visit https://groups.google.com/a/groups.riscv.org/d/msgid/isa-dev/5AD6BA8B.6020209%40gmail.com.

Guy Lemieux

Apr 18, 2018, 2:21:28 AM

to Jacob Bachmeyer, Luke Kenneth Casson Leighton, Cesar Eduardo Barros, RISC-V ISA Dev, Krste Asanovic, Albert Cahalan, Richard Herveille

> The requirement is that *no* *single* implementation (target) can have
> conflicting encodings. This is accomplished by renumbering extensions,
> which produces the same final result as independent development.

yes, you can solve the problem by legislating it away.

however, you then abandon flexibility and other benefits.

again, consider there being two *completely* different C instruction
encodings. The current one (C2), plus a brand new one that someone
else has created called C3.

now, a silicon vendor wants to support both C2 and C3, but obviously
not at the same time. instead, they want to dynamically select between
C2 and C3 on a per-process level. thus, the OS must save the current
state of whether C2 or C3 is being used on a context switch. this
mechanism demands standardization.

now, it may also be helpful to sometimes use C2 and sometimes use C3
within the same process, in order to get the maximum compression.
consider that C3 may be very domain-specific, e.g. focussed on
arithmetic codes (or vector codes), and C2 is designed for
general-purpose. at that point, it would be beneficial to change
extensions a bit more quickly. how fast this switching should be done
depends upon the envisioned use cases.

finally, we must recognize the limited 32b encoding space, which has
only 2 major custom opcodes available. if a silicon vendor wants to
implement *4* different major opcodes, then it is not possible, and
things must go to 48b or 64b encodings. however, the vendors that
provide these custom extensions may only support 32b encodings (for
various reasons, including it being difficult to dynamically
regenerate library code when instructions can grow or shrink), and the
silicon vendor's core itself may not support 48b or 64b instruction
fetch. for this reason, we need to support the ability to select
which 2 of the 4 major opcodes are activated at the same time, just
like selecting between C2 and C3.
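
A rough sketch of what "selecting which 2 of the 4" could look like, written in Python as a model rather than hardware. The extension names and the `select_bank` and `owner` helpers are hypothetical; only the custom-0 (0x0B) and custom-1 (0x2B) major opcode values are taken from the base ISA.

```python
# Hypothetical "bank switch": four custom extensions exist in silicon,
# but only two 32-bit major custom opcodes (custom-0, custom-1) are
# available, so a select setting maps each slot to one extension.

CUSTOM_SLOTS = {0x0B: 0, 0x2B: 1}  # custom-0 and custom-1 major opcodes
EXTENSIONS = ("extA", "extB", "extC", "extD")  # made-up names


def select_bank(slot0: str, slot1: str) -> tuple:
    """Pick which two of the four extensions are active at once."""
    for name in (slot0, slot1):
        if name not in EXTENSIONS:
            raise ValueError(f"unknown extension {name!r}")
    return (slot0, slot1)


def owner(bank: tuple, insn: int):
    """Return the active extension for `insn`, or None if not custom."""
    slot = CUSTOM_SLOTS.get(insn & 0x7F)
    return None if slot is None else bank[slot]


bank = select_bank("extA", "extC")
print(owner(bank, 0x0B))  # extA
print(owner(bank, 0x2B))  # extC
bank = select_bank("extD", "extB")
print(owner(bank, 0x0B))  # extD
```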

the point is, with this type of "bank switching" of the opcode space,
RISC-V becomes significantly more flexible, and we don't run out of
opcodes as quickly. however, at minimum, we must define a way to save
& restore the "current opcode space setting" in the base spec, so that
all OSes will do the right thing on a context switch.
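
The save/restore requirement can be sketched as follows. This is a Python model under stated assumptions: the field `opcode_space` stands in for whatever register would hold the "current opcode space setting", and the selector values for C2 and C3 are invented.

```python
from dataclasses import dataclass


@dataclass
class Hart:
    opcode_space: int = 0  # hypothetical "current opcode space setting"


@dataclass
class Process:
    name: str
    saved_opcode_space: int = 0


def context_switch(hart: Hart, prev: Process, nxt: Process) -> None:
    """Save the outgoing setting, then restore the incoming one."""
    prev.saved_opcode_space = hart.opcode_space
    hart.opcode_space = nxt.saved_opcode_space


C2, C3 = 2, 3  # made-up selector values for the two compressed encodings
hart = Hart(opcode_space=C2)
p1 = Process("uses-C2", C2)
p2 = Process("uses-C3", C3)

context_switch(hart, p1, p2)
print(hart.opcode_space)  # 3
hart.opcode_space = C2  # suppose p2 switches encodings while running
context_switch(hart, p2, p1)
print(hart.opcode_space, p2.saved_opcode_space)  # 2 2
```

The second switch shows why the OS has to read the setting back from the hart rather than assume it is fixed at process launch: a process that changes encodings mid-run must get its latest choice back when it is rescheduled.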

note that some custom extensions may have additional state that also
needs to be saved on a context switch -- I'll leave that out of the
discussion for now, but perhaps there should be something defined at
the system ABI level (?) for user-provided context switch stubs. note
that switching between C2 and C3 does not require this type of state
switching.

guy

Bruce Hoult

Apr 18, 2018, 3:33:51 AM

to Guy Lemieux, Jacob Bachmeyer, Luke Kenneth Casson Leighton, Cesar Eduardo Barros, RISC-V ISA Dev, Krste Asanovic, Albert Cahalan, Richard Herveille

Just a quick comment. I don't regard C2 as being all that "general purpose". It's been designed to perform as well as possible across SPEC, both INT and FP.

Typical computer workloads do not have anywhere near as much FP in them as SPEC and so C2 is likely to be quite far from the best compressed encoding for typical operating system, compiler, internet, and business code (which SPECINT is probably not a terrible proxy for).

Similarly you could probably improve on C2 for SPECFP although my guess is there would be little benefit in doing so -- I think it would mostly materialise as a (static and dynamic) code size benefit rather than a benefit in decreasing L1 instruction cache misses. FP code tends to spend most of its time executing relatively small bits of code on relatively large chunks of data.

I would be very tempted to just rip all the FP instructions out of C and fall back to 32 bit instructions, at least as an experiment.

Andrew Waterman

Apr 18, 2018, 3:58:18 AM

to Bruce Hoult, Guy Lemieux, Jacob Bachmeyer, Luke Kenneth Casson Leighton, Cesar Eduardo Barros, RISC-V ISA Dev, Krste Asanovic, Albert Cahalan, Richard Herveille

On Wed, Apr 18, 2018 at 12:33 AM, Bruce Hoult <br...@hoult.org> wrote:

> Just a quick comment. I don't regard C2 as being all that "general purpose". It's been designed to perform as well as possible across SPEC, both INT and FP.

That's a mischaracterization of our methodology. SPEC was one of the workloads we tuned for. We actually got better compression on the Linux kernel (a much more important workload than most of SPEC) than all but one SPEC benchmark.

> Typical computer workloads do not have anywhere near as much FP in them as SPEC and so C2 is likely to be quite far from the best compressed encoding for typical operating system, compiler, internet, and business code (which SPECINT is probably not a terrible proxy for).

The only floating-point instructions in C are loads and stores, and they are actually a big help to SPECint. Even a little bit of FP means a fair number of static loads and stores (callee-saved registers, etc.). Had we tuned only for SPECint, the FP loads and stores still would've made the cut.

It would've been satisfying for C to reflect only the base ISA, but the quantitative approach didn't support that decision.

Bruce Hoult

Apr 18, 2018, 4:41:42 AM

to Andrew Waterman, Guy Lemieux, Jacob Bachmeyer, Luke Kenneth Casson Leighton, Cesar Eduardo Barros, RISC-V ISA Dev, Krste Asanovic, Albert Cahalan, Richard Herveille

Thanks for the clarification.

Luke Kenneth Casson Leighton

Apr 18, 2018, 8:52:03 AM

to Guy Lemieux, Jacob Bachmeyer, Cesar Eduardo Barros, RISC-V ISA Dev, Krste Asanovic, Albert Cahalan, Richard Herveille

On Wed, Apr 18, 2018 at 7:20 AM, Guy Lemieux <glem...@vectorblox.com> wrote:

>> The requirement is that *no* *single* implementation (target) can have
>> conflicting encodings. This is accomplished by renumbering extensions,
>> which produces the same final result as independent development.
>
> yes, you can solve the problem by legislating it away.
>
> however, you then abandon flexibility and other benefits.

ok, so as the instigator of this thread, and also one of the
companies whose profits and business model will in fact be adversely
and immediately affected if this is gotten wrong, how would your CEO
(and Sales Team) feel about the following conversation:

Client: "so we have another 3rd party extension that conflicts with yours"
VB CEO: "ok."
Client: "and we're happy with the {insert 5-figure? sum} for vectorblox"
VB CEO: "good to hear."
Client: "and you understand our requirement that we need
silicon-proven RTL for both extensions"
VB CEO: "yes... which we have. 28nm test chips were successful, last year"
Client: "but the conflicts require modifying one of the extensions"
VB CEO: "yes, my technical team explained to your technical team that
there's a problem with the RISC-V Specification which we're not in
control of".
Client: "that doesn't fill me with a lot of confidence, either in
VectorBlox *or* RISC-V as an architecture. Can we play Prisoner's
Dilemma with you and the other 3rd Party Extension Company over who
gets to pay the cost of adjusting their extension to meet our needs?"
VB CEO: "no, sorry. we can give you a quote for adjusting it, and
doing the new round of silicon. we will include new customised Test
Vectors for you. Estimates for 28nm will be somewhere around USD $2
million including the Mask Charges. That's assuming we get it right
first time. No guarantee, mind".
Client: "that's INSANE!!!! We're not paying custom development costs
with NO GUARANTEE OF SUCCESS!"
VB CEO: "ok errr have a nice day"

Client (now off the phone): "Somebody get me the telephone number of
the local ARM Rep. These RISC-V people are just not serious
contenders. At least with ARM it's a known silicon-proven quantity
with hundreds of silicon-proven licensees with no complaints. We just
can't take the risk and we certainly aren't going to pay custom
development NREs".

How would VectorBlox feel about having to go through that kind of
easily-predictable conversation, based on how the RISC-V ISA stands
right now?

> finally, we must recognize the limited 32b encoding space, which has
> only 2 major custom opcodes available. if a silicon vendor wants to
> implement *4* different major opcodes, then it is not possible, and
> things must go to 48b or 64b encodings. however, the vendors that
> provide these custom extensions may only support 32b encodings (for
> various reasons, including it being difficult to dynamically
> regenerate library code when instructions can grow or shrink),

which in turn is a huge (custom!) NRE on the binutils software
development. and subsequent maintenance costs to also take into
account (by the Fabless Semi Company).

> and the
> silicon vendor's core itself may not support 48b or 64b instruction
> fetch.

oh. ah. right. that's actually much more serious than even I
imagined, initially. the significance of only 2 custom opcodes being
left in the 32-bit space hadn't occurred to me.

so if this is not tackled, not only are the third party custom
extension vendors playing russian roulette for a really *really*
limited number of "empty chamber slots", pretty much *guaranteeing*
that there will be clashes, but the companies developing (or
licensing) cores are adversely affected *as well*.

fortunately a solution exists: it means adding one extra instruction
to the privileged spec.

l.

Allen Baum

Apr 18, 2018, 9:38:33 AM

to Luke Kenneth Casson Leighton, Guy Lemieux, Jacob Bachmeyer, Cesar Eduardo Barros, RISC-V ISA Dev, Krste Asanovic, Albert Cahalan, Richard Herveille

This strawman is so stupid it is amusing.

First: IP companies do not necessarily guarantee working silicon of exactly what you specify.

(e.g. NOC IP where you get to specify a configuration, and they'll generate RTL, and perhaps test vectors - but every customer's implementation is unique).

Yes, this is an existing, proven, successful business model.

Second: there is even a business model providing processor+customer's custom extension - and that is also an existing successful business model. And they don't provide working silicon before you pay them.

Thirdly - going to call ARM to solve your problem?

Do you seriously expect them to implement a custom extension?

They will, however, allow you to do it - if you fork over $10^7+ for the privilege of an architectural license

- but won't guarantee anything about functionality.

Oh, and you'll have to do all the work of integration and layout yourself.

There's probably a point 4 & 5, but I have better uses of my time.

As I said earlier - this is a non-problem.

A customer that falls into the trap of requiring two separate custom extensions from one or more vendors that have conflicting encodings, and requiring working silicon for both extensions that do not conflict--- deserves to go out of business.

Neither IP vendor will go out of business (for those reasons, anyway).

Guy Lemieux

Apr 18, 2018, 9:48:04 AM

to Allen Baum, Albert Cahalan, Cesar Eduardo Barros, Jacob Bachmeyer, Krste Asanovic, Luke Kenneth Casson Leighton, RISC-V ISA Dev, Richard Herveille

On Wed, Apr 18, 2018 at 6:38 AM Allen Baum <allen...@esperantotech.com> wrote:

> This strawman is so stupid it is amusing.

ok

> As I said earlier - this is a non-problem.
> A customer that falls into the trap of requiring two separate custom extensions from one or more vendors that have conflicting encodings, and requiring working silicon for both extensions that do not conflict--- deserves to go out of business.
>
> Neither IP vendor will go out of business (for those reasons, anyway).

Yeah, 640kB (or 2 major opcode slots) ought to be enough anyway.

guy

lk...@lkcl.net

Apr 18, 2018, 4:38:38 PM

to RISC-V ISA Dev, lk...@lkcl.net, glem...@vectorblox.com, jcb6...@gmail.com, ces...@cesarb.eti.br, kr...@berkeley.edu, acah...@gmail.com, richard....@roalogic.com

On Wednesday, April 18, 2018 at 2:38:33 PM UTC+1, Allen Baum wrote:

> This strawman is so stupid it is amusing.

*gently*, Allen. The language that you've used invites conflict and division if viewed incorrectly or misinterpreted. Fortunately, from reading "Invisible Dynamics", I'm aware of the Systemic Law that states "every contribution by every contributor must be respected and valued by all", and that includes your words.

so, if the scenario is not perfect, then in the interests of testing the RISC-V Specification so that, for the benefit of everyone including EsperantoTech, we may work together to ensure that the Standard is viable, would you be interested to help refine the scenario so that it is more realistic?

And, if so, can I invite you to do so in a way that is easier on you (and others)? Amazing - me using those words when I'm normally the one throwing in ludicrousness... :)

So, to summarise what you're saying, *in your view*, Fabless Semi companies have had to put up huge NREs and customise both hardware and software: that's the way that it's always been done in the past.

Are you therefore also saying that we should not, as a group, attempt to find creative ways to reduce NREs for potential implementors interested in using RISC-V? I apologise if that was not the intended implication, did you mean to imply this?

> Thirdly - going to call ARM to solve your problem?
> Do you seriously expect them to implement a custom extension?

No, Allen: I would expect the hypothetical customer to go for an entirely off-the-shelf solution (both the main processor and the extensions) with a better clearly-organised pedigree, a proven track record, and consequently massively-reduced risk and NREs.

If the example given does not "work" with ARM, we may choose an architecture to substitute until it does. MIPS, ARC, Xtensa, or anything but RISC-V, where the probability of working with that (chosen) architecture increases the probability of third party RTLs having an off-the-shelf solution designed to meet the hypothetical Client's needs.

There's another possibility however (see below) which in the interests of FRAND I'm compelled to raise.

> As I said earlier - this is a non-problem.

Did you mean to say, "in my view, I *believe* this to be a non-problem, would you agree with that assessment?"

Would you agree that a declaration "this is a non-problem" is a way to invite conflict?

> A customer that falls into the trap of requiring two separate custom extensions from one or more vendors that have conflicting encodings, and requiring working silicon for both extensions that do not conflict--- deserves to go out of business.

I believe we have established that the number of remaining custom opcodes in the 32-bit space is sufficiently small (2) such that the probability of this scenario "businesses using RISC-V deserving to go out of business" - is extremely high. In an earlier message in this thread approximately six different *classes* of high-value potential custom extensions were listed.

Would you agree that the probability of conflicting encodings is very high?
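
As a rough sanity check on that question, here is a small calculation in Python. It assumes, purely for illustration, that each vendor picks one of the available major custom opcodes independently and uniformly at random; real vendors may coordinate or cluster, so the numbers are only indicative. The function name `clash_probability` is made up.

```python
from math import perm


def clash_probability(slots: int, vendors: int) -> float:
    """P(at least two vendors pick the same slot), uniform and independent."""
    if vendors > slots:
        return 1.0  # pigeonhole: a clash is certain
    # perm(slots, vendors) counts the clash-free assignments.
    return 1.0 - perm(slots, vendors) / slots ** vendors


print(clash_probability(2, 2))  # 0.5
print(clash_probability(2, 6))  # 1.0
```

Under that assumption, any two extensions picked at random clash half the time, and once there are more extensions than slots a clash somewhere in the ecosystem is certain.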

Anyway: there is one potential "solution" that comes out of your reaction, Allen (see? every contribution is valuable!) and that's the possibility of simply laying down one core with one custom extension, and another totally separate core with another custom extension.

Now, whether that's acceptable to a Fabless Semi Company is an entirely different matter. It would mean having to have separate SMP cores, such that the performance may be adversely affected if the algorithm specifically requires the close cooperation between two custom encodings, such that the overhead of dropping down to L2 cache and back to another core is simply not a way to achieve the desired performance.

l.

lk...@lkcl.net

Apr 18, 2018, 4:54:01 PM

to RISC-V ISA Dev, allen...@esperantotech.com, acah...@gmail.com, ces...@cesarb.eti.br, jcb6...@gmail.com, kr...@berkeley.edu, lk...@lkcl.net, richard....@roalogic.com

On Wednesday, April 18, 2018 at 2:48:04 PM UTC+1, glemieux wrote:

> Yeah, 640kB (or 2 major opcode slots) ought to be enough anyway.

love the reference to Bill Gates's retrospectively hilarious words.

In terminology that I have seen used a lot, Allen's been "triggered by" (reacted to) a couple of separate discussions, now, which, far from being "his fault" or anything like that, gives us an indicator of quite how extreeeemely patient - and respectful - all of us have to be. We're in this for the long-term.

As I mentioned earlier in this thread, Standards development is just... genuinely this detailed. Scenarios that in a software environment have serious show-stopping ramifications can always be rectified. In Hardware you simply can't. That in turn means, quite simply, that the "risk assessment level" has to be dialed up to 11.

That's of course if it is expected that the Standard be a success. There is always the option - we always have the choice - to *knowingly* develop a Standard that is *known* to be a failure.

l.

Michael Clark

Apr 18, 2018, 6:47:40 PM

to Jacob Bachmeyer, Luke Kenneth Casson Leighton, Cesar Eduardo Barros, RISC-V ISA Dev, Krste Asanovic, Albert Cahalan, Guy Lemieux, Richard Herveille

> On 15/04/2018, at 11:23 AM, Jacob Bachmeyer <jcb6...@gmail.com> wrote:
>
> Michael Clark wrote:
>> On 14/04/2018, at 2:42 PM, Jacob Bachmeyer <jcb6...@gmail.com> wrote:
>>
>>>> I like READID
>>>> Noting that it’s possible to make an implementation that is not vulnerable to defeating ASLR and reveals nothing about the physical or virtual address space of the CPU. It’s an ID space for extended version information on extensions. It makes sense to start out with 26 leaves that have version numbers like 2.3, 1.11 and 0.4, etc. I’m certain it will be useful.
>>>>
>>> Exactly, READID accesses a processor ID ROM that is not part of the main address space at all. Putting it in SYSTEM/PRIV also allows it to be easily handled with trap-and-emulate for environments that need to do so.
>>>
>>> The contents of the ID ROM can be determined later. I favor a DeviceTree subset or something structurally equivalent but possibly easier to scan.
>>>
>>
>> For the use cases I am thinking about, device-tree is not appropriate. Ideally the mechanism is usable by user space code and M mode code on cores without device-tree or on server cores that use ACPI.
>>
>
> Flattened DeviceTree is simply a hierarchical key-value store, expressed in a single blob. We could also define a structurally similar format optimized for XLEN-bit word-based access, as READID would provide. ACPI is a disaster and one of the mistakes of the past that RISC-V seeks to leave in the past. (Remember the reaction on isa-dev when UEFI for RISC-V was suggested? One of the things UEFI needed was ACPI bindings. I think that ACPI provoked almost as strong a reaction as UEFI itself.)
>
> Example:
>
> processor-module {
> TYPE_1: core-type {
> base = "RV64IMAFDCV";
> RVI { version = "2.0"; };
> RVM { version = "2.0"; };
> RVA { version = "2.0"; extend = "RVAmlr"; RVAmlr { version = "2.0"; }; };
> RVF { version = "2.0"; };
> RVD { version = "2.0"; };
> RVC { version = "2.0"; };
> RVV { version = "1.8"; };
> };
> hart@1 { type = <&TYPE_1>; };
> hart@2 { type = <&TYPE_1>; };
> };

I think you are conflating the core and its periphery.

Device-tree is not appropriate to solve the core level ISA extension information problem.

The problem has already been solved on other platforms in a much simpler manner.

It needs to be simple for portable C with inline RISC-V asm or RISC-V asm to be able to test for presence of instructions. Instructions have nothing to do with device-tree, which is a tree of devices and their memory map.

There is no way that a developer would want to use some heavy weight OS specific mechanism (parsing /proc/device-tree) to solve a problem that on other major platforms can be solved by a bit test on a ‘per hart’ cpu id word.

The approach you are proposing will discourage developers from writing code that uses standard or custom extensions requiring information beyond the current 26 bits in misa. It means that cpu_features would need to pull in a heavy weight device tree parser. It means that cpu_features won’t work on embedded systems without device-tree.

I propose that we would add XLEN-bit pages for each extension. e.g. 0x100 + ext = extended extension info, and then in a registry define bits for extended extension information. Of course these bits would have mnemonics and descriptions in a specification, but the code to read them would simply test for a bit in a word. We would start with one word per extension with several bits for version and bits for any extension distinguishers, such as indicators of optional instructions within a particular extension.

Use case: testing for B extension optional BEXT/BDEP (bit scatter gather) vs B base that may not contain BEXT/BDEP

has_bit_scatter_gather:
LI t1, 0x2 # misa.B = B base (bit 1, ‘B’ - ‘A’)
READISA t0 # pseudo for CSRRS rd, isa, x0; user mode accessible ISA CSR e.g. misa mapped into U mode
AND a0, t0, t1
BNEZ a0, bit_scatter_gather_test
RET

bit_scatter_gather_test:
LI t1, 0x101 # processor ID page 0x100 + ext = extended extension info, leaf 1 (‘B’ - ‘A’) = B extension
READID t0, t1 # pseudo for CSRRS rd, hartcfg, rs; mhartcfg
ANDI a0, t0, 0x2 # bit 1 = B_EXTDEP, i.e. B extension optional extract/deposit
RET
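
The same two-step check can be modelled in Python to show the intended logic. READID and the ID page layout are proposals from this thread, not ratified instructions, so `read_id`, the dictionary standing in for the ID ROM, and the bit position chosen for B_EXTDEP are all assumptions. The misa bit for B (bit 1, i.e. 'B' - 'A') does follow the existing misa letter encoding.

```python
# Model of the two-step check: test misa.B, then read the B leaf of a
# hypothetical processor ID ROM and test the B_EXTDEP bit.

MISA_B = 1 << (ord("B") - ord("A"))            # bit 1 of misa
ID_PAGE_EXT = 0x100                            # proposed page base
B_LEAF = ID_PAGE_EXT + (ord("B") - ord("A"))   # 0x101
B_EXTDEP = 1 << 1                              # assumed bit position


def read_id(id_rom: dict, leaf: int) -> int:
    """Stand-in for READID: unknown leaves read as zero."""
    return id_rom.get(leaf, 0)


def has_bit_scatter_gather(misa: int, id_rom: dict) -> bool:
    if not misa & MISA_B:
        return False  # no B base, so no optional B instructions either
    return bool(read_id(id_rom, B_LEAF) & B_EXTDEP)


print(has_bit_scatter_gather(MISA_B, {B_LEAF: B_EXTDEP}))  # True
print(has_bit_scatter_gather(MISA_B, {}))                  # False
print(has_bit_scatter_gather(0, {B_LEAF: B_EXTDEP}))       # False
```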

Device-tree is appropriate for describing the periphery, but not for identifying instruction set extensions. This is already a well-solved problem on ‘many’ other architectures.

It would be sad if we created some heavy weight solution that makes simple libraries like cpu_features depend on OS specific interfaces to a system designed for describing memory map and interrupt routing for ‘devices / periphery’. e.g. int fd = open("/proc/device-tree"); error handling; parse_device_tree(fd); lookup("string-identifier"); close(fd); etc, etc. This is too heavy weight to include in an embedded system, and despite that, it makes assembly optimized routines in “portable RISC-V code” depend on non-portable OS interfaces to access the device tree passed at boot.

The existing mechanisms on many other processors for testing whether an optional instruction exists are typically solved via a bit test on a word returned from an ID instruction.

The way to handle this is via registries:

- https://github.com/riscv/riscv-opcodes
- https://github.com/michaeljclark/riscv-meta

We can then start assigning word ranges and bits within words for sub-extension mnemonics (extended extension information). e.g. B_EXTDEP for Bit Extract and Deposit, if for example it is decided that scatter gather instructions end up as optional instructions in the B extension (purely for example’s sake).

While I don’t like the idea of negative presence in general, it makes sense in cases where we are trimming an instruction from a base extension, to use a negative identifier, as is the case where we might remove division from M, F and D, or remove square root from F or D. In this case I have prefixed the sub-extension with an N.

I
E
M
M_NDIV # no div,divu,rem,remu for rv32/rv64/rv128 and no divw,divuw,remw,remuw for rv64/rv128, and no divd,divud,remd,remud for rv128
A
A_TSO # indicates implementation of total store order (RVTSO)
A_DWLRSC # double word lr/sc
F
F_NDIV # no fdiv.s
F_NSQRT # no fsqrt.s
F_INVSQRT # fast inverse square root: finvsqrt.s
F_TD # fast transcendentals: fsin.s, fcos.s, ftan.s
D
D_NDIV # no fdiv.d
D_NSQRT # no fsqrt.d
D_INVSQRT # fast inverse square root: finvsqrt.d
D_TD # fast transcendentals: fsin.d, fcos.d, ftan.d
Q
Q_NDIV # no fdiv.q
Q_NSQRT # no fsqrt.q
C
C_DICTV2 # Version 2 of RVC dictionary
L
B
B_EXTDEP # Optional bit gather scatter
J
T
P
V
V_BEXTDEP # Optional vector bit gather scatter
N
S
S_SV32 # safer for S-mode code than probing, as S-mode can’t change VM without side-effects
S_SV39
S_SV48
S_SV56
U
H
H_T1 # reserved for type 1 hypervisors that have an H-mode
H_T2 # type 2 hypervisors e.g. HS mode
X_PMP # currently there is no way to detect PMP besides illegal instruction traps on pmpcfg. It can’t be put on S or U and M is used for muldiv

etc…
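
A registry of this kind could be consumed as sketched below (Python, as a model). The bit assignments inside each leaf are invented for illustration, as are the helper names `leaf`, `has_flag` and `has_integer_divide`; the sketch only shows how a negative identifier such as M_NDIV combines with the existing misa bit.

```python
# Sketch of a registry: each sub-extension mnemonic maps to a
# (leaf, bit) pair. Bit assignments below are invented for illustration.

ID_PAGE_EXT = 0x100  # proposed base: leaf = 0x100 + extension letter index


def leaf(letter: str) -> int:
    return ID_PAGE_EXT + ord(letter) - ord("A")


REGISTRY = {
    "M_NDIV": (leaf("M"), 0),
    "F_NDIV": (leaf("F"), 0),
    "F_NSQRT": (leaf("F"), 1),
    "B_EXTDEP": (leaf("B"), 1),
}


def has_flag(id_rom: dict, mnemonic: str) -> bool:
    page, bit = REGISTRY[mnemonic]
    return bool((id_rom.get(page, 0) >> bit) & 1)


def has_integer_divide(misa: int, id_rom: dict) -> bool:
    """M present and not trimmed by the negative M_NDIV flag."""
    has_m = bool((misa >> (ord("M") - ord("A"))) & 1)
    return has_m and not has_flag(id_rom, "M_NDIV")


misa_m = 1 << (ord("M") - ord("A"))
print(has_integer_divide(misa_m, {}))              # True
print(has_integer_divide(misa_m, {leaf("M"): 1}))  # False
```

A zero leaf means "nothing trimmed", so cores that predate the registry keep their current meaning of the M bit; that is the practical argument for expressing removals as negative flags.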

READISA rd would be a pseudo for CSRRS rd, isa, x0 and isa would be a U accessible version of misa. Easy to implement with trap and emulate and easy to implement in hardware.

READID rd, rs would be a special type of CSR (perhaps 0xff) that carries a direct dependency from rs to rd, allowing access to XLEN words of state, that is outside of the regular address space. i.e. it works whichever virtual addressing mode is currently active and is not subject to leaking anything about physical address space to U-mode. Easy to implement with trap and emulate and easy to implement in hardware. ROMs are 1T vs SRAM which are 6T or 8T, if it were implemented as a ROM lookup. That said implementors may wish to have early firmware configure a tiny CAM so that the words can be configured from EEPROM, but still have the instruction execute quickly.

While the CSR space is only effectively 256 words (due to the mode aliasing convention), the READID instruction could potentially return a larger number of words, and a convention on partitioning could be designed for standard and vendor specific identification.
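
One possible shape for such a partitioning convention, sketched in Python. Every boundary here apart from the 0x100 + ext base proposed above is an assumption made up for the example, including the `VENDOR_BASE` value and the `classify_leaf` helper.

```python
# One possible partitioning of the READID leaf space. Every boundary
# below, other than the 0x100 base, is an assumption for illustration.

STANDARD_EXT_BASE = 0x100   # from the proposal: 0x100 + ext
STANDARD_EXT_COUNT = 26     # one leaf per misa letter
VENDOR_BASE = 0x8000        # invented start of a vendor-specific range


def classify_leaf(index: int) -> str:
    """Say which part of the ID space a leaf index falls into."""
    if STANDARD_EXT_BASE <= index < STANDARD_EXT_BASE + STANDARD_EXT_COUNT:
        return "standard:" + chr(ord("A") + index - STANDARD_EXT_BASE)
    if index >= VENDOR_BASE:
        return "vendor"
    return "reserved"


print(classify_leaf(0x101))   # standard:B
print(classify_leaf(0x8004))  # vendor
print(classify_leaf(0x040))   # reserved
```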

Indeed, if we had this earlier, we could move vendor,arch,impl into the ID ROM to save precious CSR space.

>> Perhaps encoding version numbers is not the right thing to do; the intent is only to tag extended information that can’t be represented with just V (the current 26 bits). One example may be a Vector Bit Manipulation extension that is not part of the base Vector Extension, or a future enhancement to the Vector extension that adds new instructions, or perhaps loose instructions, e.g. subsets of M: Mmul, Mdiv. ID pages, and bits within pages, would be defined to indicate the presence of a feature.
>>
>
> With Vector Bit Manipulation (if distinct from the combination RVBV), the "RVV" subnode could be similarly extended just as the "RVA" subnode was extended for a (still-hypothetical; I need to write that proposal) RVAmlr multi-LR extension in the earlier example:
>
> ...
> RVV { version = "1.8"; extend = "RVVbitmanip"; RVVbitmanip { version = "0.5"; }; };
> ...
>
>> I have a more practical and realistic view of how to handle what others have handled with ID instructions that can be masked into userspace for runtime CPU feature detection. Device tree is simply not appropriate, as the way to access it is not standardised across OSes.

>
> READID would be a standard way on RISC-V to access the processor ID ROM. I propose that that ID ROM should contain an FDT blob (or perhaps some equivalent better optimized for word reads) describing the processor, and possibly a containing SoC, but not the surrounding board. A bootloader would splice the ID ROM into the board device tree at an appropriate point when preparing the device tree for the supervisor.
>

>> It’s the wrong abstraction level. Device tree is designed around memory-mapped devices attached to the core. Perhaps the limit of what one might expose in device tree is cache topology, but even that is arguable. Pre-device-tree systems, or cores without device tree at all, could represent this information in a much more compact form without resorting to heavy-duty string matching. Having used cache and core topology enumeration on other cores, I’d rather not resort to device tree for anything other than what it is designed for: passing device information to the OS for devices that are not otherwise dynamically discoverable, i.e. one only exposes a PCI host in device tree, not the devices behind it.
>>
>
> DeviceTree, structurally, is isomorphic to the previous config string format. There should be no problem using config string for this purpose, so DeviceTree is simply another encoding for the same data.
>
> Also, the READID instruction should be trappable, allowing environments to present modified trees if needed. Or the hardware ID ROM can be exposed to user mode. Or hardware could support redirecting READID. Either way works for me. :-)
>
>> Code that has optimised assembly routines that are selectively invoked based on cached values from a processor ID instruction is just the way this is handled these days, so that C and asm for RISC-V that uses extensions can be portable between environments. This can already be done with ARM and Intel. Feature indication is not particularly proprietary. RISC-V will obviously have its own unique set of features and way of encoding them.
>>
>> That said, “imafdcsu” is all we need at the moment, but userspace will want a simpler mechanism than device tree to dynamically detect B and V. Device-tree would be the wrong way to do this. RDISA would be more appropriate and trap and emulate would match the performance requirement given feature detection is at library load time and not in the subroutine fastpath.
>>
>
> This means that structure-parsing overhead, as a tree format would incur, is a non-issue. The extensibility that tree formats offer is important here.
>
>
> -- Jacob

Bruce Hoult

Apr 18, 2018, 7:42:51 PM

to Luke Kenneth Casson Leighton, Guy Lemieux, Jacob Bachmeyer, Cesar Eduardo Barros, RISC-V ISA Dev, Krste Asanovic, Albert Cahalan, Richard Herveille

With ARM you're not going to be able to have one custom ISA extension, let alone two!!

Jacob Bachmeyer

Apr 18, 2018, 10:52:02 PM

to Michael Clark, Luke Kenneth Casson Leighton, Cesar Eduardo Barros, RISC-V ISA Dev, Krste Asanovic, Albert Cahalan, Guy Lemieux, Richard Herveille

I am *not* suggesting reliance on DeviceTree, in the sense of the
hardware map passed from the bootloader to the supervisor. I am
suggesting reusing the Flattened DeviceTree blob format for the
processor ID ROM.

> The problem has already been solved on other platforms in a much simpler manner.
>
> It needs to be simple for portable C with inline RISC-V asm or RISC-V asm to be able to test for presence of instructions. Instructions have nothing to do with device-tree, which is a tree of devices and their memory map.
>
> There is no way that a developer would want to use some heavy weight OS specific mechanism (parsing /proc/device-tree) to solve a problem that on other major platforms can be solved by a bit test on a ‘per hart’ cpu id word.
>

That is *not* what I propose. All that is needed is a tree walk to find
the relevant word, followed by a bit-test, or reading some numeric or
string value.

> The approach you are proposing will discourage developers from writing code that uses standard or custom extensions requiring information beyond the current 26 bits in misa. It means that cpu_features would need to pull in a heavy weight device tree parser. It means that cpu_features won’t work on embedded systems without device-tree.
>

It means that cpu_features will not work on embedded systems *that* *do*
*not* *have* *an* *ID* *ROM*... which is to say, that cpu_features will
not work on embedded systems that do not report their features. This is
similar to how cpu_features would not work on an x86 that does not
implement CPUID.

> I propose that we would add XLEN-bit pages for each extension. e.g. 0x100 + ext = extended extension info, and then in a registry define bits for extended extension information. Of course these bits would have mnemonics and descriptions in a specification, but the code to read them would simply test for a bit in a word. We would start with one word per extension with several bits for version and bits for any extension distinguishers, such as indicators of optional instructions within a particular extension.
>
> Use case: testing for B extension optional BEXT/BDEP (bit scatter gather) vs B base that may not contain BEXT/BDEP
>
> has_bit_scatter_gather:
> LI t1, 0x2 # mask for misa.B (bit ‘B’ - ‘A’ = 1) = B base
> READISA t0 # pseudo for CSRRS rd, isa, x0; user mode accessible ISA CSR e.g. misa mapped into U mode
> AND a0, t0, t1
> BNEZ a0, bit_scatter_gather_test
> RET # a0 == 0 here: no B extension at all
>
> bit_scatter_gather_test:
> LI t1, 0x101 # processor ID page 0x100 + ext = extended extension info, leaf 1 (‘B’ - ‘A’) = B extension
> READID t0, t1 # pseudo for CSRRS rd, hartcfg, rs; mhartcfg
> ANDI a0, t0, 0x2 # bit 1 = B_EXTDEP, the optional extract/deposit part of the B extension
> RET
>
> Device-tree is appropriate for describing the periphery, but not for identifying instruction set extensions. This is already a well-solved problem on ‘many' other architectures.
>
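
For readers more comfortable in C, the two-step test quoted above can be mirrored roughly as follows. This is only a sketch: READISA and READID do not exist, so both are replaced by lookups into a small simulated hart description. The page number 0x100 + ext and the use of bit 1 for B_EXTDEP follow the quoted proposal; `sim_hart`, `sim_readisa`, `sim_readid`, `EXT_BIT` and `EXT_PAGE` are hypothetical names, and reading zero from an unimplemented page is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define EXT_BIT(letter)  (UINT64_C(1) << ((letter) - 'A'))         /* misa-style bit */
#define EXT_PAGE(letter) (0x100u + (unsigned)((letter) - 'A'))     /* 0x100 + ext    */
#define B_EXTDEP_BIT     (UINT64_C(1) << 1)  /* assumed: bit 1 of the B page */

struct id_page { unsigned index; uint64_t word; };

/* Simulated hart: one ISA word plus a sparse list of ID pages. */
struct sim_hart {
    uint64_t isa;
    const struct id_page *pages;
    size_t npages;
};

/* Stand-in for the proposed READISA instruction. */
static uint64_t sim_readisa(const struct sim_hart *h)
{
    assert(h != NULL);
    return h->isa;
}

/* Stand-in for the proposed READID instruction. */
static uint64_t sim_readid(const struct sim_hart *h, unsigned index)
{
    assert(h != NULL);
    for (size_t i = 0; i < h->npages; i++)
        if (h->pages[i].index == index)
            return h->pages[i].word;
    return 0; /* assumed: unimplemented pages read as zero */
}

static bool has_bit_scatter_gather(const struct sim_hart *h)
{
    if ((sim_readisa(h) & EXT_BIT('B')) == 0)
        return false;                       /* no B extension at all */
    return (sim_readid(h, EXT_PAGE('B')) & B_EXTDEP_BIT) != 0;
}
```

The point of the sketch is that the whole check is two word reads and two bit tests, with the bit positions coming from an external registry rather than from the hardware itself.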

The problem is that now the ID ROM requires cross-referencing with some
external registry. The tree structures I propose are self-contained,
and with well-chosen keys, *self-describing*.

> It would be sad if we created some heavy weight solution that makes simple libraries like cpu_features depend on OS specific interfaces to a system designed for describing memory map and interrupt routing for ‘devices / periphery’. e.g. int fd = open(“/proc/device-tree”); error handling ; parse_device_tree(fd) ; lookup(“string-identifier”); close(fd); etc, etc. This is too heavy weight to include in an embedded system, and despite that, it makes assembly optimized routines in “portable RISC-V code” depend on non-portable OS interfaces to access the device tree passed at boot.
>

No. No, no, no, nonononononono! That is *not* what I propose. Access
would be "riscv_IDROM_lookup("string-identifier");" and
riscv_IDROM_lookup() would *not* make any system calls. READID
*directly* exposes a tree structure to user mode. There is *NO*
OS-specific interface involved. (A supervisor may trap READID, but the
user program *still* *uses* READID.)
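
A minimal C sketch of what such a lookup could look like, with no system calls involved. Several things here are assumptions made for illustration: the ID ROM is modelled as static C structures rather than an FDT blob read word by word through READID, the tree contents are copied from the RVV example earlier in the thread, and the '/'-separated path syntax and the exact signature of `riscv_IDROM_lookup` are invented.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct id_prop { const char *name; const char *value; };

struct id_node {
    const char *name;
    const struct id_prop *props;
    size_t nprops;
    const struct id_node *children;
    size_t nchildren;
};

/* Simulated ID ROM contents, modelled on the RVV example in the thread. */
static const struct id_prop bitmanip_props[] = { { "version", "0.5" } };
static const struct id_node rvv_children[] = {
    { "RVVbitmanip", bitmanip_props, 1, NULL, 0 },
};
static const struct id_prop rvv_props[] = {
    { "version", "1.8" }, { "extend", "RVVbitmanip" },
};
static const struct id_node root_children[] = {
    { "RVV", rvv_props, 2, rvv_children, 1 },
};
static const struct id_node id_root = { "", NULL, 0, root_children, 1 };

/* Walk a path such as "RVV/RVVbitmanip/version". Every component but the
 * last names a child node; the last names a property of the node reached.
 * Returns the property value, or NULL if any step is missing. */
static const char *riscv_IDROM_lookup(const char *path)
{
    assert(path != NULL);
    const struct id_node *node = &id_root;
    for (;;) {
        const char *slash = strchr(path, '/');
        size_t len = slash ? (size_t)(slash - path) : strlen(path);
        if (slash == NULL) {
            /* Final component: search the properties of the current node. */
            for (size_t i = 0; i < node->nprops; i++)
                if (strlen(node->props[i].name) == len &&
                    memcmp(node->props[i].name, path, len) == 0)
                    return node->props[i].value;
            return NULL;
        }
        /* Intermediate component: descend into the matching child. */
        const struct id_node *next = NULL;
        for (size_t i = 0; i < node->nchildren; i++)
            if (strlen(node->children[i].name) == len &&
                memcmp(node->children[i].name, path, len) == 0)
                next = &node->children[i];
        if (next == NULL)
            return NULL;
        node = next;
        path = slash + 1;
    }
}
```

On real hardware each field access would become one or more READID reads, and a program would normally do the walk once at load time and cache the result.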

> The existing mechanisms on many other processors for testing whether an optional instruction exists are typically solved via a bit test on a word returned from an ID instruction.
>

Which could be a bit test on a word retrieved at the end of a tree walk.

> [...]

>
> READISA rd would be a pseudo-instruction for CSRRS rd, isa, x0, where isa would be a U-mode-accessible version of misa. It is easy to implement with trap and emulate and easy to implement in hardware.
>
> READID rs, rd would be a special type of CSR (perhaps 0xff) that carries a direct dependency from rs to rd, allowing access to XLEN-bit words of state that lie outside the regular address space, i.e. it works whichever virtual addressing mode is currently active and does not leak anything about the physical address space to U-mode. It is easy to implement with trap and emulate and easy to implement in hardware. If it were implemented as a ROM lookup, the cost is small: ROM cells are 1T, versus 6T or 8T for SRAM. That said, implementors may wish to have early firmware configure a tiny CAM so that the words can be loaded from EEPROM while the instruction still executes quickly.
>
> While the CSR space is only effectively 256 words (due to the mode aliasing convention), the READID instruction could potentially return a larger number of words, and a convention on partitioning could be designed for standard and vendor specific identification.
>

Or, with a tree-structured ID ROM, simply define a convention for
vendor-specific nodes and the problem solves itself.

> Indeed, if we had this earlier, we could move vendor,arch,impl into the ID ROM to save precious CSR space.
>

We could also make READID its own instruction instead of using up a CSR
slot.

-- Jacob

Jacob Bachmeyer

Apr 18, 2018, 11:13:50 PM

to lk...@lkcl.net, RISC-V ISA Dev, glem...@vectorblox.com, ces...@cesarb.eti.br, kr...@berkeley.edu, acah...@gmail.com, richard....@roalogic.com

lk...@lkcl.net wrote:
> On Wednesday, April 18, 2018 at 2:38:33 PM UTC+1, Allen Baum wrote:
>

> [...]

>
>
> A customer that falls into the trap of requiring two separate
> custom extensions from one or more vendors that have conflicting
> encodings, and requiring working silicon for both extensions that
> do not conflict--- deserves to go out of business.
>
>
> I believe we have established that the number of remaining custom
> opcodes in the 32-bit space is sufficiently small (2) such that the
> probability of this scenario "businesses using RISC-V deserving to go
> out of business" - is extremely high. In an earlier message in this
> thread approximately six different *classes* of high-value potential
> custom extensions were listed.

What luck! For RV32, there are exactly *six* major opcodes that can be
used in this manner: CUSTOM-0, CUSTOM-1, OP-IMM-64/CUSTOM-2,
OP-64/CUSTOM-3, OP-IMM-32, OP-32. :-)
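
To make those six major opcodes concrete, here is a small C sketch that classifies a 32-bit instruction word by its 7-bit major opcode field. The numeric values are taken from the base opcode map in the RISC-V user-level ISA specification, which labels the last two "custom-2/rv128" and "custom-3/rv128"; the enum and function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum major_opcode {
    OPC_CUSTOM_0  = 0x0B, /* 0001011 */
    OPC_OP_IMM_32 = 0x1B, /* 0011011, used by RV64 (e.g. ADDIW) */
    OPC_CUSTOM_1  = 0x2B, /* 0101011 */
    OPC_OP_32     = 0x3B, /* 0111011, used by RV64 (e.g. ADDW) */
    OPC_CUSTOM_2  = 0x5B, /* 1011011, custom-2/rv128 */
    OPC_CUSTOM_3  = 0x7B  /* 1111011, custom-3/rv128 */
};

/* Extract bits [6:0] of a 32-bit-wide instruction encoding. */
static unsigned major_opcode_of(uint32_t insn)
{
    assert((insn & 0x3u) == 0x3u); /* low bits 11: not a compressed encoding */
    return insn & 0x7Fu;
}

/* True if the instruction sits in one of the six major opcodes that a
 * 32-bit-only (RV32) design could repurpose for custom extensions. */
static bool usable_for_custom_on_rv32(uint32_t insn)
{
    switch (major_opcode_of(insn)) {
    case OPC_CUSTOM_0:
    case OPC_CUSTOM_1:
    case OPC_CUSTOM_2:
    case OPC_CUSTOM_3:
    case OPC_OP_IMM_32:
    case OPC_OP_32:
        return true;
    default:
        return false;
    }
}
```

For example, the RV64 encoding of ADDIW x1, x2, 1 (0x0011009B) lands in OP-IMM-32 and is reported as usable, while ADDI x1, x2, 1 (0x00110093) lands in OP-IMM and is not.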

On a more serious note, the issue here is the inflexibility of wanting
extensions with incompatible encodings in a single processor and wanting
*pre-existing* working silicon for what, put simply, would be a *new*
product -- a processor that combines those extensions for the first
time. I hope that all can see the temporal contradiction in that scenario.

> Would you agree that the probability of conflicting encodings is very
> high?

I consider it a certainty (at the scale of "all of RISC-V") and propose
that we work to manage the consequences of that inevitability. Consider
this: how to ensure that the Chinese no-name gone-tomorrow
new-name-next-day manufacturers follow whatever process is developed for
coordinating unique extension encodings and do not just decide that that
process is "too complex" and drop their extensions right on top of
whatever happens to be in the way?

> Anyway: there is one potential "solution" that comes out of your
> reaction, Allen (see? every contribution is valuable!) and that's the
> possibility of simply laying down one core with one custom extension,
> and another totally separate core with another custom extension.

Entirely separate cores are not needed, as each hart can have its own
{vendor-id, arch-id} tuple.
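
As a sketch of how software might use that, assume each hart reports its own (vendor-id, arch-id) pair and that code built for a given custom extension knows which pair it needs. The structure, the function name and the ID values used in the example are invented for illustration; only the idea of a per-hart tuple comes from the text above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-hart identity, as it might be read from each hart's
 * vendor-id and arch-id registers. */
struct hart_id {
    uint32_t vendor_id;
    uint64_t arch_id;
};

/* Returns a bit mask (bit n = hart n) of the harts whose tuple matches
 * `want`, i.e. the harts that decode the custom encodings this code uses. */
static uint64_t harts_matching(const struct hart_id *harts, size_t nharts,
                               struct hart_id want)
{
    assert(harts != NULL && nharts <= 64);
    uint64_t mask = 0;
    for (size_t i = 0; i < nharts; i++)
        if (harts[i].vendor_id == want.vendor_id &&
            harts[i].arch_id == want.arch_id)
            mask |= UINT64_C(1) << i;
    return mask;
}
```

An OS or runtime could treat the returned mask as an affinity set, keeping threads that use one extension on the harts that implement it, while code using only the standard ISA runs anywhere.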

> Now, whether that's acceptable to a Fabless Semi Company is an
> entirely different matter. It would mean having to have separate SMP
> cores, such that the performance may be adversely affected if the
> algorithm specifically requires the close cooperation between two
> custom encodings, such that the overhead of dropping down to L2 cache
> and back to another core is simply not a way to achieve the desired
> performance.

Not entirely SMP, only separate harts (HARdware Threads) within the same
core. Custom inter-hart synchronization instructions (directed-yield?
semaphores?) could even be added to assist in aligning the cooperating
harts.

I expect that at least one vendor will develop such a product if there
is a market for it.

-- Jacob

Allen Baum

Apr 19, 2018, 2:10:26 AM

to jcb6...@gmail.com, lk...@lkcl.net, RISC-V ISA Dev, glem...@vectorblox.com, ces...@cesarb.eti.br, kr...@berkeley.edu, acah...@gmail.com, richard....@roalogic.com

Exactly.
Consider the Scenarios
1: one IP vendor supplying only the custom extensions. Customer must integrate into their core
2: one IP vendor supplying core+ custom extensions
3: two or more IP vendors supplying only custom extensions. Customer must integrate into their core

1: Either the customer relocates (one of) the extensions or the vendor provides tools to do this (since the integration is custom, this will be unavoidable)
2: The vendor will relocate. It’s their core and their responsibility
3: same as (1).

Note existence proofs, both for custom extensions and other IP.

Imagine if you wanted to buy a DRAM controller that required you to fix (not configure) exactly where in address space the controller registers were (or where DRAM was), and it overlapped with some other IO device.

Same problem, same solutions.

So,
I repeat: this is not a problem that takes a lot of thought.
IP vendors will implement their extensions in a manner that makes this seamless or go out of business.
That’s their job. Not ours.

-Allen

lk...@lkcl.net

Apr 19, 2018, 2:49:57 AM

to RISC-V ISA Dev, jcb6...@gmail.com, lk...@lkcl.net, glem...@vectorblox.com, ces...@cesarb.eti.br, kr...@berkeley.edu, acah...@gmail.com, richard....@roalogic.com

On Thursday, April 19, 2018 at 7:10:26 AM UTC+1, Allen Baum wrote:

Exactly.
Consider the Scenarios
1: one IP vendor supplying only the custom extensions. Customer must integrate into their core

if the two custom extensions (likely supplied by the same vendor) are not conflicting, the work required by the fabless company is minimal... with the proviso that guy pointed out

2: one IP vendor supplying core+ custom extensions

very *very* good for the fabless company. *all* the work has been done.

3: two or more IP vendors supplying only custom extensions. Customer must integrate into their core

same as 1, above.

in scenarios 1 and 3, therefore, there are instances where the costs and risks are enormous. Remember also that those costs also include custom software development (of gcc and binutils as the starting point), as well as the vendors being forced to recompile certain software libre libraries (customised to that specific non-standard extension), as well as forcing them to be the sole distribution and maintenance point for that custom software.

Also if the resultant hardware is commodity hardware (sold in tens to hundreds of millions of units world-wide), there is an additional burden forced onto the developers of gcc, binutils, and also the Android, Debian, Fedora, SUSE, Linux Kernel mainline developers and other Software Libre Communities through peer-pressure and end-user pressure to support the custom hardware as *mainline*.

At that point it becomes the very nightmare that RISC-V is supposed to have been designed to avoid.

Question, Allen: what is the reason why you believe it is ok for Fabless Semi Conductors, Software Libre Developers, End-users and many many more people, to have to have such costs?

Would you consider it reasonable for the Standard to be adjusted - with the addition of one single simple instruction that really is no significant burden for implementors - so that such costs (for everyone) may be entirely eliminated?

If you do not consider an addition to be reasonable, could you explain why you do not consider it to be reasonable?

l.

lk...@lkcl.net

Apr 19, 2018, 3:25:44 AM

to RISC-V ISA Dev, lk...@lkcl.net, glem...@vectorblox.com, ces...@cesarb.eti.br, kr...@berkeley.edu, acah...@gmail.com, richard....@roalogic.com, jcb6...@gmail.com

On Thursday, April 19, 2018 at 4:13:50 AM UTC+1, Jacob Bachmeyer wrote:

lk...@lkcl.net wrote:
> On Wednesday, April 18, 2018 at 2:38:33 PM UTC+1, Allen Baum wrote:
>
> [...]
>
>
> A customer that falls into the trap of requiring two separate
> custom extensions from one or more vendors that have conflicting
> encodings, and requiring working silicon for both extensions that
> do not conflict--- deserves to go out of business.
>
>
> I believe we have established that the number of remaining custom
> opcodes in the 32-bit space is sufficiently small (2) such that the
> probability of this scenario "businesses using RISC-V deserving to go
> out of business" - is extremely high. In an earlier message in this
> thread approximately six different *classes* of high-value potential
> custom extensions were listed.

What luck! For RV32, there are exactly *six* major opcodes that can be
used in this manner: CUSTOM-0, CUSTOM-1, OP-IMM-64/CUSTOM-2,
OP-64/CUSTOM-3, OP-IMM-32, OP-32. :-)

:) _classes_ of high-value potential custom extensions. that means several implementors in each class. which... oh, I get your point: it'll result in 100% guaranteed conflicts in absolutely every single one of the six major custom opcode areas.

On a more serious note, the issue here is the inflexibility of wanting
extensions with incompatible encodings in a single processor and wanting
*pre-existing* working silicon for what, put simply, would be a *new*
product -- a processor that combines those extensions for the first
time. I hope that all can see the temporal contradiction in that scenario.

I am guessing that it basically boils down to the unusual (unique?) approach and opportunity that RISC-V represents. ARC (bought by Synopsys) is about the only other "major custom extender" that I personally am aware of: they have thousands of opcodes, specialising in video and 3D and DSP and and and and. It's managed *entirely* by Synopsys, so the problem that we're discussing simply doesn't arise (for them).

At the beginning of this thread Albert mentioned POWERPC being extended with two Custom Vector Extensions, both using the exact same opcodes. Result: absolutely NOBODY wants to bother with POWERPC.

MIPS is also well-known for being extended: Ingenic added X-Burst (a Vector FP pipeline), for which, hilariously and ingeniously, their "compiler" is a series of awk and sed scripts that look for regular patterns in C code and replace them with assembler.

Celestial Semiconductors did their own MIPS-based Video Processor and were one of the first companies to ever manufacture a 400MHz ARM9 core (!!!!) that could do 1080p60 video decode. Full custom extensions to a MIPS instruction set.

However none of these examples have ever tried what the RISC-V Foundation is doing: having a base core where anyone may extend it by following a documented Standard that is *not* exclusively controlled by one Corporate Entity.

So these are the "Industry-standard" approaches:

* A Fabless Semi Company might license several cores with their own (embedded) CPU - even if that's an entire 3D GPU such as Vivante. They'd license a Video "core", and a 3D "core" and a crypto "core" *and* a "main CPU core", all of them separate.

Their job becomes one of *SIMPLE* integration (at the AXI / Tilelink level). Decide L2 and L1 cache sizes. Even the I/O is "yet another building block".

* A Fabless Semi Company chooses some of those functions to be INTEGRATED into ONE vendor's ISA / Core. This is where ARC (Synopsys) comes into play. Fabless Company pays Synopsys for an ARC core *plus* the Video Extensions *plus* the 3D extensions, and goes with a separate crypto "core" and a separate "main CPU core". Also gets free pre-integrated compilers that JUST WORK with the ARC core: no need for them to do any special work.

Again, the job is one of SIMPLE integration.

* A Fabless Semi Company evaluates RISC-V, choosing to put down several RISC-V Cores with wholly different custom extensions in each. Has a bit of a "wtf??" moment at the fact that they have to have two totally disparate sets of gcc compilers and binutils toolchains, but they "roll with it".

Again, the job is one of SIMPLE integration. Even the totally disparate toolchains they consider quirky but otherwise go "ehh what the heck". No Problem.

* A Fabless Semi Company evaluates RISC-V, finds that they have a scenario where it is ESSENTIAL that they have two custom extensions IN THE SAME CORE, find that the task is so insanely complex and costly, and burdensome to maintain, that they go "you know what? fuckit don't bother".

And if they *don't* decide that this is a nightmare scenario of total lunacy compared to other options which are far, far less costly and risky for them (both in NREs and ongoing maintenance), then a couple of years down the line they end up forcing the free software community to shoulder the cost of maintaining their insanity, in the instance where their insane botch-job becomes commodity hardware.

> Would you agree that the probability of conflicting encodings is very
> high?

I consider it a certainty (at the scale of "all of RISC-V") and propose
that we work to manage the consequences of that inevitability. Consider
this: how to ensure that the Chinese no-name gone-tomorrow
new-name-next-day manufacturers follow whatever process is developed for
coordinating unique extension encodings and do not just decide that that
process is "too complex" and drop their extensions right on top of
whatever happens to be in the way?

*sigh* yeah this is extremely likely to happen at some point in the next 4-10 years, as the software ecosystem stabilises. Once there's an Android port for RISC-V I *guarantee* it'll happen.

> Anyway: there is one potential "solution" that comes out of your
> reaction, Allen (see? every contribution is valuable!) and that's the
> possibility of simply laying down one core with one custom extension,
> and another totally separate core with another custom extension.

Entirely separate cores are not needed, as each hart can have its own
{vendor-id, arch-id} tuple.

In order to avoid the... oh, you're saying that you could have an integrated (not-quite-SMP) system? i.e. 4 cores supporting one custom extension, and 4 supporting an entirely different custom extension? It's a bit weird but might actually work. And they'd share the same L2 cache. And would be good to do 8-way SMP multithreading for tasks that did not require either custom extension. I like it! Doesn't cover the case below, though (tight integration required between custom extensions).

> Now, whether that's acceptable to a Fabless Semi Company is an
> entirely different matter. It would mean having to have separate SMP
> cores, such that the performance may be adversely affected if the
> algorithm specifically requires the close cooperation between two
> custom encodings, such that the overhead of dropping down to L2 cache
> and back to another core is simply not a way to achieve the desired
> performance.

Not entirely SMP, only separate harts (HARdware Threads) within the same
core. Custom inter-hart synchronization instructions (directed-yield?
semaphores?) could even be added to assist in aligning the cooperating
harts.

It'd fly... (however I *am* currently watching Angry Birds with my daughter...)

It might just work, Jacob: I don't know the exact implications (and my initial thoughts would be to be wary of forcing implementors to have to rewrite software around a non-standard computing paradigm).

I do have to say though that it's a hell of a lot of effort to go to just to avoid adding one single simple instruction to the privileged spec.

l.

lk...@lkcl.net

Apr 19, 2018, 3:26:53 AM

to RISC-V ISA Dev, lk...@lkcl.net, glem...@vectorblox.com, jcb6...@gmail.com, ces...@cesarb.eti.br, kr...@berkeley.edu, acah...@gmail.com, richard....@roalogic.com, br...@hoult.org

On Thursday, April 19, 2018 at 12:42:51 AM UTC+1, Bruce Hoult wrote:

With ARM you're not going to be able to have one custom ISA extension, let alone two!!

yes: appreciated. Allen kindly highlighted that mistake as well, and I believe I adjusted to take it into account?

l.

Luke Kenneth Casson Leighton

Apr 19, 2018, 3:35:10 AM

to Jacob Bachmeyer, RISC-V ISA Dev, Guy Lemieux, Cesar Eduardo Barros, Krste Asanovic, Albert Cahalan, Richard Herveille

---
crowd-funded eco-conscious hardware: https://www.crowdsupply.com/eoma68

On Thu, Apr 19, 2018 at 4:13 AM, Jacob Bachmeyer <jcb6...@gmail.com> wrote:

> On a more serious note, the issue here is the inflexibility of wanting
> extensions with incompatible encodings in a single processor and wanting
> *pre-existing* working silicon for what, put simply, would be a *new*
> product -- a processor that combines those extensions for the first time. I
> hope that all can see the temporal contradiction in that scenario.

yyyeahhh it's not a bed of roses by any means: one custom extension
being written in BlueSpec, another in Verilog, another in Chisel,
another in VHDL, and the core (or cores) being the same. which would
actually point to trying to *minimise* the amount of modification(s)
made to each. not least, having to hire separate teams with at least
two sets of RTL language skills?

modifications means risk and cost. minimise the modifications,
reduce the risk and cost.

l.

Allen Baum

Apr 19, 2018, 7:35:25 AM

to lk...@lkcl.net, RISC-V ISA Dev, Jacob Bachmeyer, Guy Lemieux, Cesar Eduardo Barros, Krste Asanovic, Albert Cahalan, Richard Herveille

First: I don't think you can claim precisely what RISC-V is designed to avoid.

Your interpretation certainly doesn't match mine, and basing conclusions on your specific interpretation pretty much makes your whole premise invalid.

Second: I don't see any *additional* costs, except in the rare cases where a customer requires 2 specific extensions that conflict.

I have no problem imposing costs for that rare customer - if it makes that much difference, it will be worth paying for.

That is what is called the cost of business.

If it enables them to sell millions/billions of parts, that cost is unnoticeable.

if they won't sell enough parts to amortize the costs - why are they wasting their time?

And I don't actually believe it will be a cost to the customer. It will be a cost to the IP vendor.

They have to provide all those libraries in any case, and parameterizing it for different encodings is not rocket science.

It's a .h file.

That is what is called the cost of business, and IP vendors already deal with that kind of configurability.
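
A sketch of what "a .h file" could amount to in practice: the extension's major opcode becomes a compile-time parameter with a default, and every encoding is derived from it. The R-type field layout is the standard one from the base ISA; `MYEXT_MAJOR_OPCODE` and `myext_encode_rtype` are hypothetical names for an imaginary extension.

```c
#include <assert.h>
#include <stdint.h>

/* Relocatable major opcode: override at build time, for example with
 * -DMYEXT_MAJOR_OPCODE=0x2Bu to move the extension to custom-1. */
#ifndef MYEXT_MAJOR_OPCODE
#define MYEXT_MAJOR_OPCODE 0x0Bu /* default: custom-0 */
#endif

/* Encode an R-type instruction of the imaginary extension:
 * funct7[31:25] rs2[24:20] rs1[19:15] funct3[14:12] rd[11:7] opcode[6:0]. */
static uint32_t myext_encode_rtype(unsigned funct7, unsigned rs2, unsigned rs1,
                                   unsigned funct3, unsigned rd)
{
    assert(funct7 < 128 && rs2 < 32 && rs1 < 32 && funct3 < 8 && rd < 32);
    return ((uint32_t)funct7 << 25) | ((uint32_t)rs2 << 20) |
           ((uint32_t)rs1 << 15) | ((uint32_t)funct3 << 12) |
           ((uint32_t)rd << 7) | MYEXT_MAJOR_OPCODE;
}
```

This only helps whoever can rebuild the software against the right header; a binary that has already been shipped carries whichever opcode it was compiled with.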

Luke Kenneth Casson Leighton

Apr 19, 2018, 9:25:02 AM

to Allen Baum, RISC-V ISA Dev, Jacob Bachmeyer, Guy Lemieux, Cesar Eduardo Barros, Krste Asanovic, Albert Cahalan, Richard Herveille

On Thu, Apr 19, 2018 at 12:35 PM, Allen Baum
<allen...@esperantotech.com> wrote:

> First: I don't think you can claim precisely what RISC-V is designed to
> avoid.
> Your interpretation certainly doesn't match mine, and basing conclusions
> based on your specific interpretations pretty much make your whole premise
> invalid.

Allen: I have known difficulties communicating (cross-wiring) and
with recall, which makes it hard for me to say exactly what I mean.
A number of times people have been kind enough to compensate for that
by providing additional information (and corrections) that, after a
number of iterations, help demonstrate more clearly what I am trying
to get across. In particularly complex analysis I can often tell
immediately that there's something wrong, but actually finding the
words or the *precise* reasons why can often take days, weeks, and
sometimes even months.

The past three or four communications with you, one of them I'm not
sure if you're aware of it: you actually asked me to cease and desist
from contributing to RISC-V development.

I also noticed that you didn't answer the questions that I asked.

Now, as you've probably noticed I am comfortable with challenging
conversations (both ways), and I am quite happy to give people the
time and space to open up and explore complex topics. What I *cannot*
deal with is the scenario where people are closed-minded, unwilling to
listen, and seek to *undermine* an analysis (with a view to shutting
down the conversation) rather than help spot mistakes, and refine and
understand the space.

I'm going to leave it at that: I'm not in any way going to tell you
"what to do". The absolute most that I can do is present people with
truth, perception and awareness.

If there is anyone else who can take over dealing with Allen on this
thread, that would be great.

l.

Guy Lemieux

Apr 19, 2018, 9:52:51 AM

to Allen Baum, lk...@lkcl.net, RISC-V ISA Dev, Jacob Bachmeyer, Cesar Eduardo Barros, Krste Asanovic, Albert Cahalan, Richard Herveille

On Thu, Apr 19, 2018 at 4:35 AM, Allen Baum
<allen...@esperantotech.com> wrote:
> If it enables them to sell millions/billions of parts, that cost is
> unnoticeable.
> If they won't sell enough parts to amortize the costs - why are they wasting
> their time?
>
> And, I actually don't believe it will be a cost to the customer. It
> will be a cost to the IP vendor.

Out of all the parties involved, the IP vendor actually makes the
least amount of money and bears the greatest risk of going bankrupt.
(ARM excepted.)

You can ask this of any IP vendor and you'll get the same answer. IP
sales aren't enough -- they augment their sales with design services
just to pay the bills.

> They have to provide all those libraries in any case, and parameterizing it
> for different encodings is not rocket science.
>
> It's a .h file.
> That is what is called the cost of doing business, and IP vendors already deal
> with that kind of configurability.

I fully agree -- and AFAICT nobody has argued about this. Simply
reassigning a major opcode is a piece of cake for someone who is
compiling their own software.
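
To make the "reassign a major opcode" point concrete, here is a minimal
sketch (in Python rather than as a C header, purely for brevity). The
four custom major opcode values are the ones in the base opcode map of
the RISC-V spec; the helper names and the example instruction (funct7=0,
funct3=0, operating on a0/a1/a2) are hypothetical and not taken from any
vendor's toolchain.

```python
# Major opcodes (instruction bits [6:0]) of the custom slots in the
# 32-bit encoding, as listed in the RISC-V base opcode map.
CUSTOM_0 = 0b0001011  # 0x0B
CUSTOM_1 = 0b0101011  # 0x2B
CUSTOM_2 = 0b1011011  # 0x5B
CUSTOM_3 = 0b1111011  # 0x7B


def encode_r_type(opcode, rd, funct3, rs1, rs2, funct7):
    """Pack standard R-type fields into a 32-bit instruction word."""
    for value, bits in ((opcode, 7), (rd, 5), (funct3, 3),
                        (rs1, 5), (rs2, 5), (funct7, 7)):
        if not 0 <= value < (1 << bits):
            raise ValueError("field value out of range")
    return ((funct7 << 25) | (rs2 << 20) | (rs1 << 15) |
            (funct3 << 12) | (rd << 7) | opcode)


def reassign_major_opcode(insn, new_opcode):
    """Move an already-encoded instruction to a different major opcode."""
    return (insn & ~0x7F) | new_opcode


# Hypothetical custom instruction: rd=a0 (x10), rs1=a1 (x11), rs2=a2 (x12).
word_c0 = encode_r_type(CUSTOM_0, rd=10, funct3=0, rs1=11, rs2=12, funct7=0)
word_c1 = reassign_major_opcode(word_c0, CUSTOM_1)
```

Only the low seven bits differ between the two words, which is why the
change is trivial for anyone rebuilding from source and painful for
anyone shipping binaries.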

However, when we run out of custom opcode space in the 32b encoding
world, then we need a mechanism to "stuff more in". In particular, the
most compelling reason which I've mentioned before, is to upgrade the
C extension with new "domain-specific encodings" which can be used in
libraries. This has the capability to provide even more compression,
e.g. some new encodings may find ways to use 16b to represent 2 or
more instructions (e.g., one instruction encoding to "save all
caller-saved registers on the stack"). Now think about shrink-wrapped
software trying to use custom extensions when available, and needing
to include all possible variations of each custom extension as
implemented on each custom silicon chip -- can you see the
combinatoric explosion happening yet?

For a single custom extension from an IP vendor, however, using different
major opcodes in every different silicon vendor's implementation (just
because of what combinations of extensions those silicon vendors
choose to include) is not viable. The IP vendor likely has to release
a LIBRARY of software around their custom extensions. Do they now have
to release a combinatoric explosion of different versions of this
library? So much for binary compatibility! Now try to excuse the
situation when a user downloads the WRONG library but doesn't know it,
and software breaks.
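
As a rough illustration of how quickly the number of library builds
grows, the sketch below enumerates every way of placing a set of
extensions into distinct major opcode slots. The extension names and
the assumption that each extension occupies exactly one whole major
opcode are hypothetical simplifications, not a description of any real
product.

```python
from itertools import permutations


def library_variants(extensions, opcode_slots):
    """List every assignment of each extension to a distinct opcode slot.

    Each assignment corresponds to one binary build that shrink-wrapped
    software would have to ship if opcodes were reassigned per chip.
    """
    return [dict(zip(extensions, slots))
            for slots in permutations(opcode_slots, len(extensions))]


two_slots = ["custom-0", "custom-1"]
six_slots = ["custom-0", "custom-1", "custom-2", "custom-3",
             "OP-32", "OP-IMM-32"]

# Hypothetical extension names, for illustration only.
small = library_variants(["ext_a", "ext_b"], two_slots)
large = library_variants(["ext_a", "ext_b", "ext_c"], six_slots)
```

With n extensions and k slots the count is k!/(k-n)!, so even the
optimistic six-slot view needs 120 builds for three extensions.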

When we run out, and we still want more, it may not be viable to use
48b or 64b encoding spaces. Rewriting software so that some custom
instructions are 32b, others are 48b or 64b, is a true nightmare.

I have come to the inevitable conclusion that IP vendors should NOT
have to reassign their opcodes into a different space, as this is a
software maintenance nightmare.

Instead, RISC-V must have a standard mechanism for SWITCHING which
custom extension is currently in use.
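
A toy model of what such switching could look like, assuming a single
selector that decides which extension decodes the custom-0 major
opcode. Everything here (the ToyCore class, the select() call, the two
decoders) is hypothetical; no such mechanism is defined by the RISC-V
spec, and a real design would also have to settle privilege levels,
context switching, and the flush/FENCE cost.

```python
CUSTOM_0 = 0b0001011  # custom-0 major opcode, instruction bits [6:0]


def ext_a(insn):
    """Hypothetical extension A: dispatches on funct7 (bits [31:25])."""
    return ("ext_a", (insn >> 25) & 0x7F)


def ext_b(insn):
    """Hypothetical extension B: dispatches on funct3 (bits [14:12])."""
    return ("ext_b", (insn >> 12) & 0x7)


class ToyCore:
    """Model of a core where several extensions share one major opcode."""

    def __init__(self, extensions):
        self.extensions = dict(extensions)  # name -> decoder function
        self.active = None                  # nothing selected at reset

    def select(self, name):
        """Switch which extension owns custom-0 (stands in for a CSR write)."""
        if name not in self.extensions:
            raise KeyError(name)
        self.active = name

    def execute(self, insn):
        if insn & 0x7F != CUSTOM_0:
            raise ValueError("not a custom-0 instruction")
        if self.active is None:
            raise RuntimeError("illegal instruction: no extension selected")
        return self.extensions[self.active](insn)


core = ToyCore({"ext_a": ext_a, "ext_b": ext_b})
word = (5 << 25) | (2 << 12) | CUSTOM_0  # same bits, two possible meanings
```

The same 32-bit word is decoded differently depending on the current
selection, so each vendor's library can keep its original encodings.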

Going further, we could also consider a standard query mechanism to
find out which ones are available, but there are so many varying
opinions here (READISA, or DevTree, etc) I'm not participating in that
discussion. However, the NEED FOR SWITCHING and the mechanism for
SWITCHING need to be nailed down first.

Obviously there are some people who will always think 2 major custom
opcodes is enough (*).

Guy

(*) Jacob pointed out there are 6 major opcode spaces available
including custom-2, custom-3, OP-32 and OP-IMM-32, but all of these
already have some other defined use and are reserved for use by the
foundation. Only custom-0 and custom-1 are guaranteed to be available
in the 32-bit encoding space. In the 48-bit and beyond encoding
spaces, AFAIK the RISC-V Foundation has not yet released *any* of the
opcode space so the entire future extension space should be considered
"reserved" at this point. However, once (if) they release some of the
opcode space, collisions will still be possible.