Deprecate JALR with low bit set

David Horner

Feb 9, 2021, 12:06:38 AM
to RISC-V ISA Dev

Toward the end of the RVI ratification, and following it, the meaning of "reserved" subtly changed.
During the early RVI development, prior to RISCV.org and during its early years, "reserved" consistently meant that the instruction encoding was unavailable to implementations. The implementation was required to trap with an illegal instruction exception if the encoding was detected in the input stream.

Now that requirement is substantially relaxed.
Trapping is not mandated.
Behaviour is undefined for such an encoding.
Specifically allowed is behaviour that might be expected if the reserved encoding followed the patterning of the other instructions, e.g. C.LI explicitly does not include x0 in its encoding; however, when rd is x0 it is considered a hint.

Had this more permissive definition been in place from the outset, I believe the JALR instruction would have been defined differently. The formulation with the low immediate bit set [that is, a 1 in the immediate bit that lines up with bit zero of the target address] would have been reserved for future use, rather than having a fully defined, implementation-specific formulation. Implementations would then have been free to a) trap on this variant, b) clear the bit before [or not include it in] the target address calculation, or c) add the bit to the register and clear the low bit of the resultant target address [the prescribed obtuse behaviour].
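
To make the three options concrete, here is a minimal Python sketch, assuming RV32 (XLEN = 32) and modelling only the target-address calculation. The function names and the `IllegalInstruction` exception are made up for illustration; option (b) is assumed to still clear bit 0 of the final sum, as the ratified behaviour (c) does.

```python
MASK32 = 0xFFFFFFFF  # assumption: RV32, so addresses wrap at 32 bits


class IllegalInstruction(Exception):
    """Hypothetical stand-in for an illegal-instruction trap."""


def sext12(imm12: int) -> int:
    """Sign-extend the 12-bit I-type immediate field."""
    imm12 &= 0xFFF
    return imm12 - 0x1000 if imm12 & 0x800 else imm12


def jalr_target_trap(rs1_val: int, imm12: int) -> int:
    # (a) treat imm[0] = 1 as a reserved encoding and trap
    if imm12 & 1:
        raise IllegalInstruction("JALR with imm[0] set")
    return ((rs1_val + sext12(imm12)) & MASK32) & ~1


def jalr_target_ignore_bit(rs1_val: int, imm12: int) -> int:
    # (b) drop imm[0] before the add
    return ((rs1_val + sext12(imm12 & 0xFFE)) & MASK32) & ~1


def jalr_target_current(rs1_val: int, imm12: int) -> int:
    # (c) ratified behaviour: add the full immediate, then clear bit 0
    return ((rs1_val + sext12(imm12)) & MASK32) & ~1
```

Options (b) and (c) only differ when both rs1 and the immediate are odd: with rs1 = 0x1001 and imm = 1, (b) yields 0x1000 while (c) carries into bit 1 and yields 0x1002.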

Therefore Issue #625 https://github.com/riscv/riscv-isa-manual/issues/625 requests that this formulation of JALR be defined under the current "reserved" meaning. This in effect deprecates the stipulated behaviour and frees the 2K encoding points for future use.

 

MitchAlsup

Feb 23, 2021, 4:35:36 PM
to RISC-V ISA Dev, David Horner
On Monday, February 8, 2021 at 11:06:38 PM UTC-6 David Horner wrote:

Toward the end of the RVI ratification, and following it, the meaning of "reserved" subtly changed.
During the early RVI development, prior to RISCV.org and during its early years, "reserved" consistently meant that the instruction encoding was unavailable to implementations. The implementation was required to trap with an illegal instruction exception if the encoding was detected in the input stream.

BTW, this is the only sane thing to do--even in light of RISC-V's desire for individual enhancements of the ISA for a given application goal.

Now that requirement is substantially relaxed.
Trapping is not mandated.
Behaviour is undefined for such an encoding.
Specifically allowed is behaviour that might be expected if the reserved encoding followed the patterning of the other instructions, e.g. C.LI explicitly does not include x0 in its encoding; however, when rd is x0 it is considered a hint.

I predict this will come back to bite at least some of you. 

Allen Baum

Feb 23, 2021, 6:42:01 PM
to MitchAlsup, RISC-V ISA Dev, David Horner
I'm unclear which bit of the above you predict will come back to bite:
 - defining destination=X0 to be a hint (which is a bit different than saying it's a reserved encoding, oh, and it's a hint)
 - defining reserved encodings as unspecified and that trapping is not mandated

I'm a little unclear about what
      "Specifically allowed is behaviour that might be expected if the reserved encoding followed the patterning of the other instructions"
means beyond the hint definition when rd=x0; those are specifically called out in the spec for each base opcode (and C-extension opcode) to which they apply, so they aren't RESERVED as such.

@DHorner: did you have some other examples in mind?

Specifically for the rd=x0 case: they are defined to have no architectural effect (a no-op), but may (not shall) have microarchitectural effects.
The specific microarchitectural effect may (not shall) be defined and ratified at a later time.





--
You received this message because you are subscribed to the Google Groups "RISC-V ISA Dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to isa-dev+u...@groups.riscv.org.
To view this discussion on the web visit https://groups.google.com/a/groups.riscv.org/d/msgid/isa-dev/86272c02-8424-4a69-8434-7d452a893088n%40groups.riscv.org.

David Horner

Feb 23, 2021, 7:15:53 PM
to Allen Baum, MitchAlsup, RISC-V ISA Dev


On 2021-02-23 6:41 p.m., Allen Baum wrote:
I'm unclear which bit of the above you predict will come back to bite:
 - defining destination=X0 to be a hint (which is a bit different than saying it's a reserved encoding, oh, and it's a hint)
 - defining reserved encodings as unspecified and that trapping is not mandated

I'm a little unclear about what
      "Specifically allowed is behaviour that might be expected if the reserved encoding followed the patterning of the other instructions"
means beyond the hint definition when rd=x0; those are specifically called out in the spec for each base opcode (and C-extension opcode) to which they apply, so they aren't RESERVED as such.

Correct. The idea is that C.LI X0,n is not the same hint as ADDI X0, X0, n.

So the encoding means something different than in the "normal" [expansion] case.


@DHorner: did you have some other examples in mind?

Perhaps RV32 C.SRLI [and other shifts] is a better example.

When nzuimm[5]=1 the instruction is designated NSE, the original term, meaning Non-Standard Extension.

But I think it is now understood to be "reserved".

It is allowed to expand to SRLI with the high shift bit set, and thus on RV32 to trap [or not] as an SRLI does [or not].

Or it could map to any other 32-bit instruction; however, it is not a custom extension but an NSE.
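
As a rough illustration of which code points are being discussed, the following Python sketch picks out the C.SRLI slot (quadrant 01, funct3 = 100, bits [11:10] = 00) and flags the RV32 case where shift-amount bit 5 (instruction bit 12) is set. The function names are hypothetical, and the sketch only identifies the encoding; what an implementation does with it is exactly the open question above.

```python
def is_c_srli(insn16: int) -> bool:
    """True if the 16-bit parcel sits in the C.SRLI encoding slot."""
    op = insn16 & 0b11                 # quadrant, bits [1:0]
    funct3 = (insn16 >> 13) & 0b111    # bits [15:13]
    funct2 = (insn16 >> 10) & 0b11     # bits [11:10]
    return op == 0b01 and funct3 == 0b100 and funct2 == 0b00


def c_srli_shamt(insn16: int) -> int:
    """Reassemble the 6-bit shift amount: bit 12 is [5], bits [6:2] are [4:0]."""
    return (((insn16 >> 12) & 1) << 5) | ((insn16 >> 2) & 0x1F)


def rv32_c_srli_is_nse(insn16: int) -> bool:
    """True for the RV32 code points the RVC listing marks as NSE."""
    return is_c_srli(insn16) and bool(c_srli_shamt(insn16) & 0x20)
```

For instance, 0x8005 (c.srli x8, 1) is an ordinary shift, while 0x9005 is the same parcel with bit 12 set, giving a shift amount of 33 that RV32 cannot honour.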

I will raise an issue in github riscv-isa-manual to suggest NSE be removed here as well and that we standardize on "reserved".

Allen Baum

Feb 23, 2021, 7:45:10 PM
to David Horner, MitchAlsup, RISC-V ISA Dev
NSE and reserved aren't quite the same thing.
A Non-Standard Extension is saying something stronger than simply reserved: it guarantees that no standard ratified instruction will ever use that encoding.
It is reserved, indeed, but for a custom extension, never a ratified one.

It can expand to a ratified operation at a different encoding, or have totally custom functionality.

David HORNER

Feb 23, 2021, 7:50:31 PM
to Allen Baum, MitchAlsup, RISC-V ISA Dev

In section 16.8, RVC Instruction Set Listing, NSE is used to indicate Custom Encoding, a subset of Non-Standard Extension/Encoding.

I opened this issue to fix it up some: https://github.com/riscv/riscv-isa-manual/issues/629
Change NSE designation in RVC section 16.8 to VSNE Vendor-Specific Non-standard Extension, or like.... #629

So, the better example is:

C.ADDI4SPN (RES, nzuimm=0)

Reserved when nzuimm=0, that is when it would [otherwise] translate to the "hint" ADDI X2, X2, 0.

A compliant implementation can make that mapping explicit.

However, it would not be compliant with a future version of the standard that uses that encoding for something other than a hint.
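
A small Python sketch of how a decoder might separate these cases, assuming the RVC layout in which C.ADDI4SPN occupies quadrant 00 with funct3 = 000 and carries nzuimm in bits [12:5]. The function name and the returned labels are illustrative only.

```python
def classify_c_addi4spn(insn16: int) -> str:
    """Classify a 16-bit parcel against the C.ADDI4SPN encoding slot."""
    op = insn16 & 0b11                 # quadrant, bits [1:0]
    funct3 = (insn16 >> 13) & 0b111    # bits [15:13]
    if op != 0b00 or funct3 != 0b000:
        return "not C.ADDI4SPN"
    if insn16 == 0:
        # The all-zero parcel is the defined illegal instruction.
        return "illegal"
    nzuimm_field = (insn16 >> 5) & 0xFF   # scrambled nzuimm, bits [12:5]
    if nzuimm_field == 0:
        # nzuimm = 0 with a non-zero rd' field: the RES code points.
        return "reserved"
    return "C.ADDI4SPN"
```

Here 0x0040 (c.addi4spn x8, sp, 4) is a normal instruction, 0x0004 (rd' = x9, nzuimm = 0) lands in the reserved group, and 0x0000 is the defined illegal instruction.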

MitchAlsup

Feb 23, 2021, 9:14:38 PM
to RISC-V ISA Dev, Allen Baum, RISC-V ISA Dev, David Horner, MitchAlsup
On Tuesday, February 23, 2021 at 5:42:01 PM UTC-6 Allen Baum wrote:
I'm unclear which bit of the above you predict will come back to bite:
 - defining destination=X0 to be a hint (which is a bit different than saying it's a reserved encoding, oh, and it's a hint)
 - defining reserved encodings as unspecified and that trapping is not mandated
 
This is where it has come back to bite me more than once. Any field that is not mandated to hold a set pattern, and that software can generate unbeknownst to you, is a lurking danger. Thus, all patterns need complete definitions--even if the definition is that this pattern, right now, is undefined.

Burroughs got around this problem by only letting their compilers create executable files--but that ship has sailed.

Just remember, the charge of the ISA group is to define and specify a base ISA that should live for 50 years, minimum. 50 years from now, you will be glad that you prevented someone from utilizing this bit in that instruction--because you finally found a better use for that bit.

The instruction decoder should be able to distinguish all 2^32 encodings into the set {Legal, Exception}. And every instruction encoding that falls into the Legal set should have a specific defined meaning {for example a SW subroutine that performs the defined work--or modifies the requisite defined state}.
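
A minimal sketch of that {Legal, Exception} partition, restricted to a single major opcode for brevity: the LOAD opcode (0000011) on RV64I, where funct3 values 000 through 110 are defined and 111 is not. The `IllegalInstruction` exception and the function name are hypothetical, and a real decoder would of course cover every opcode.

```python
class IllegalInstruction(Exception):
    """Hypothetical stand-in for an illegal-instruction trap."""


LOAD_OPCODE = 0b0000011

# RV64I loads, keyed by funct3 (bits [14:12]).
RV64I_LOADS = {
    0b000: "LB", 0b001: "LH", 0b010: "LW", 0b011: "LD",
    0b100: "LBU", 0b101: "LHU", 0b110: "LWU",
}


def decode_load_strict(insn: int) -> str:
    """Return the mnemonic for a LOAD-opcode word, or trap; nothing in between."""
    if insn & 0x7F != LOAD_OPCODE:
        raise ValueError("sketch only handles the LOAD major opcode")
    funct3 = (insn >> 12) & 0b111
    if funct3 not in RV64I_LOADS:
        # Every pattern the table does not name lands in the Exception set.
        raise IllegalInstruction(f"LOAD funct3={funct3:03b} is not defined")
    return RV64I_LOADS[funct3]
```

With this policy 0x00013083 (ld x1, 0(x2)) decodes to LD, while the same word with funct3 = 111 (0x00017083) always traps rather than doing whatever the datapath happens to do.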

For example: I have been caught where the specification clearly stated that this bit "must be zero (0)", but somebody figured out that the hardware decoder did not decode on that bit and they used it, preventing future uses of that bit. This played out not only in the minor opcode space, but also in the branch-on-condition space, where the condition multiplexer in hardware did not actually decode that bit into the take/no-take result. And somebody used that bit--preventing an ISA extension from being able to capitalize on that decode pattern.

The 68000 was caught with a conditional branch instruction that sampled one of the odd bits in the condition code register. Several logical and shift instructions were defined such that that bit could be anything the HW wanted it to be. The 68020 actually opened the latch to capture the next setting of the bit and left it open until an instruction set the bit. Along comes a branch instruction sampling an open latch. The only players that saw this effect were those running a checker processor 1 cycle behind the master processor. The fix was that the checking and master processors had to be from the same mask set !!

{My general take is: a) no undefined or undecoded patterns of a field; b) no undefined calculation semantics. For example, say an instruction set has an ABS instruction to calculate the absolute value of signed integer, single precision, and double precision operands. What is the "proper" definition if this instruction is handed an unsigned ?? Answer: you must choose. If you choose that this is a legal instruction, then the result is identical to a MOV instruction; otherwise your only choice is that the instruction is malformed and you should raise a "malformed instruction" exception (which in RISC-V can be mapped to Operation or Operand depending on your druthers). It would be entirely incorrect to map it to "do whatever the first generation HW does".}

I did not develop this thick skin because this happened once or twice, but literally every time the entire encoding space is not nailed down--completely--100%, there are going to be entities roaming that space for nefarious patterns and purposes. Don't let them in. Also, empower your verification team to fully explore the entire encoding space, leaving no stone unturned. You will thank me in 5 years or so....

Samuel Falvo II

Feb 23, 2021, 9:32:34 PM
to MitchAlsup, RISC-V ISA Dev, Allen Baum, David Horner
This happened much earlier than the 68000, too.  Remember the original MOS 6502 had a ton of undefined opcodes, some of which were claimed by the 65C02 processor for instructions like PHX/PLX, PHY/PLY, and STZ, which did not originally exist in the NMOS variant.  And, with the 65816 processor, literally all formerly undefined opcodes now had a legal interpretation.

Needless to say, many games, and in some cases productivity applications, written for the Apple II or Commodore 64 broke when run on an Apple IIc, IIgs, or a SuperCPU-equipped Commodore computer.



--
Samuel A. Falvo II

Allen Baum

Feb 24, 2021, 1:13:16 AM
to MitchAlsup, RISC-V ISA Dev, David Horner
The instruction decoder should be able to distinguish all 2^32 encodings into the set {Legal, Exception}
Talking about ships that have sailed: that one certainly has.

RISC-V can (I think) define encodings as
 - implemented (based on an ISA string),
 - un-implemented (but defined in the architecture),
 - reserved for RISC-V (therefore undefined), and
 - reserved for custom use (undefined).

There may yet be further categories, e.g. there may be some encodings guaranteed to trap if unimplemented, but nothing comes to mind.
 - Un-implemented but defined encodings are not required to trap; they can be repurposed, at the risk of incompatibilities with that implementation.
 - Reserved for RISC-V: Ditto. Those encodings may be defined someday, and anyone using them has no recourse if they suddenly get defined.
 - Reserved for custom use: Anything goes. They can trap or not. They can alias an existing op. They can halt and catch fire.
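
The four categories can be sketched as a lookup keyed on the 7-bit major opcode. This is a deliberately tiny and simplified Python illustration: the table holds only five entries from the base opcode map as it stood at the time of this thread, it assigns a whole major opcode to one extension (in reality OP also hosts M-extension instructions), and the `Category` enum and `categorize` function are made-up names.

```python
from enum import Enum


class Category(Enum):
    IMPLEMENTED = "implemented"
    UNIMPLEMENTED = "defined but un-implemented"
    RESERVED_STANDARD = "reserved for RISC-V"
    RESERVED_CUSTOM = "reserved for custom use"


# Hypothetical, heavily abridged table: major opcode -> owner.
MAJOR_OPCODES = {
    0b0110011: "I",         # OP
    0b1010011: "F",         # OP-FP
    0b0001011: "custom",    # custom-0
    0b0101011: "custom",    # custom-1
    0b1101011: "reserved",  # reserved in the base opcode map
}


def categorize(major_opcode: int, implemented: set) -> Category:
    """Map a major opcode to a category, given the implemented extension letters."""
    owner = MAJOR_OPCODES[major_opcode]  # KeyError for opcodes outside this sketch
    if owner == "custom":
        return Category.RESERVED_CUSTOM
    if owner == "reserved":
        return Category.RESERVED_STANDARD
    if owner in implemented:
        return Category.IMPLEMENTED
    return Category.UNIMPLEMENTED
```

On a core described by {"I"}, OP-FP comes back as defined but un-implemented; add "F" and the same opcode becomes implemented. None of the categories says anything about trapping, which is the point being made above.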

Effectively, we are allowing people to shoot themselves in the foot ...
     ... if they're pretty sure their aim is bad, but good enough to hit the target they were shooting for.

This is the loosest possible architectural definition that enables application SW written to a standard (platform profile) to interoperate, and that is the goal.
This is different from the historical norm, and is partially a result of
 -  the architectural modularity, 
 -  the desire for lots of people to come up with their own custom secret sauce, 
 -  the fact that there will be many, many implementations.
That last one is critical: no one can find some weird anomaly on someone's implementation and depend on it working for any other.
That is guaranteed non-portability - and anyone who wants to sell SW knows that, and knows there is no market for SW that will run on only a single vendor's chip.
Someone who is selling an embedded core, or an appliance running a specific workload can get away with that though.

A platform and profile definition could go further and require that all undefined encodings trap, or that they be treated as no-ops, or simply that they not be used.
The model is that SW is written for a platform profile, and as long as you meet its standards, your SW will interoperate.
I would expect that any core that cares a lot about security will need to be very, very careful about what they leave undefined, myself.
I also expect that there will be few applications that don't care about security.
 (of course, an embedded application which doesn't run arbitrary downloaded code, but only runs code from ROM or flash won't need to worry about that specific aspect so much)

Enough soapbox. My sympathies certainly run with yours, but this is where we are, and I think why we got here.
This is entirely my interpretation, and I encourage corrections to anything I got wrong here, or to my conclusions from them.



Samuel Falvo II

Feb 24, 2021, 1:40:02 AM
to Allen Baum, MitchAlsup, RISC-V ISA Dev, David Horner
So, in my next 64-bit design, I no longer have to add logic to trap on a LDU instruction encoding?  :-)

David HORNER

Feb 24, 2021, 6:14:36 AM
to Allen Baum, MitchAlsup, RISC-V ISA Dev
I agree with your assessment and apparent rationale.
It is the reason underlying the post-version-2.2 revision of the unprivileged spec:
"Defined instruction-set categories: standard, reserved, custom, non-standard, and non-conforming".

However, the philosophy predated this concept in the privileged spec [Volume II],
which has specifically stated [since version 1.0] in the introduction:
We briefly note that the entire privileged-level design described in this document could be replaced
with an entirely different privileged-level design without changing the unprivileged ISA, and possibly
without even changing the ABI. In particular, this privileged specification was designed to run
existing popular operating systems, and so embodies the conventional level-based protection
model. Alternate privileged specifications could embody other more flexible protection-domain
models. For simplicity of expression, the text is written as if this was the only possible privileged
architecture.

The same justification of least intrusion into vendor options applies.

I suggest your assessment of the flexibility inherent in the "Defined instruction-set categories"
would make invaluable Commentary for the Volume I introduction,
to help prepare/acclimatize the appropriate mindset.

"Give me a [computer professional] at an impressionable  [stage] and I will make [them] mine for life"
               - with apologies to Muriel Spark  The Prime of Miss Jean Brodie 1961

Chapter 26 provides some further discussion about rationalization,
but focuses on various ways of extending the RISC-V ISA,
not the philosophy per se.

Allen Baum

Feb 24, 2021, 1:18:55 PM
to Samuel Falvo II, MitchAlsup, RISC-V ISA Dev, David Horner
I don't believe you ever had to trap on LDU if you're implementing an RV64I architecture.
That's an RV128I opcode. It's not defined for RV64I, so it is reserved and the actions taken are UNSPECIFIED.
A platform spec could require it, but the architecture does not.