Proposal from ISA Formal Spec Technical Group: Behavior on clearing MISA.C


Rishiyur Nikhil

Feb 15, 2018, 10:02:38 AM
to RISC-V ISA Dev
BACKGROUND:
Consider a RISC-V implementation that supports the 'C'
extension (Compressed Instructions), and can clear the MISA.C bit at
runtime (thereby switching off C support dynamically).  What should be
the behavior if PC is not currently 32-bit aligned?  What if return
addresses and trap vectors are not 32-bit aligned?

This is a proposal to specify the behavior precisely.

PROPOSAL:
When MISA.C is cleared,

- If PC is not 32-bit aligned, just continue executing, consuming 32b
    instructions that are 16-bit aligned (since the CPU is already
    capable of doing this to support C).

- If a trap occurs and MTVEC contains an address that is
    16-bit aligned and not 32-bit aligned, just continue executing,
    consuming 32b instructions that are 16-bit aligned (since the CPU
    is already capable of doing this to support C).

- All jump or branch instructions that attempt to jump to an address
    that is not 32-bit aligned will cause an Instruction address
    misaligned exception, per the current spec (exception is taken at
    the jump, not at the target).  The return address is saved as is,
    even if it is not 32-bit aligned.

- CSR* instructions that write to [MSU]EPC will clear the two LSBs of
    the written address, i.e. EPC will be 32-bit aligned.

- When PC is copied to [MSU]EPC as part of trap handling, it is copied
    as-is even if it points to an address that is not 32-bit aligned.
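
The rules above can be sketched as a toy model. This is only an illustration under stated assumptions: the `Hart` class, its method names, and the addresses used are made up for this example and come from neither the spec nor any simulator. The jump/branch rule is left out here.

```python
# Toy model of the PC, MTVEC and EPC rules of the proposal. All names are
# illustrative assumptions, not part of any RISC-V specification.

class Hart:
    def __init__(self, pc, mtvec, misa_c=True):
        self.pc = pc
        self.mtvec = mtvec
        self.misa_c = misa_c
        self.mepc = 0

    def clear_misa_c(self):
        # Clearing MISA.C touches neither PC nor MEPC.
        self.misa_c = False

    def step_32bit(self):
        # Sequential fetch does no alignment check, so a 32-bit
        # instruction at a 16-bit-aligned PC just advances PC by 4.
        self.pc += 4

    def take_trap(self):
        # PC is copied into MEPC as-is, and execution continues at
        # whatever address MTVEC holds.
        self.mepc = self.pc
        self.pc = self.mtvec

    def csr_write_mepc(self, value):
        # An explicit CSR write clears the two LSBs when C is off
        # (only the LSB when C is on).
        low_bits = 0b1 if self.misa_c else 0b11
        self.mepc = value & ~low_bits
```

For example, with `hart = Hart(pc=0x1002, mtvec=0x100)`, calling `clear_misa_c()`, `step_32bit()` and `take_trap()` leaves `mepc` at 0x1006, while a later `csr_write_mepc(0x1006)` stores 0x1004.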



Please let us know if you see any problems with this.

[Thanks to Clifford Wolf for articulating this first.]

Rishiyur Nikhil (Chair) and the ISA Formal Spec Technical Group

Christopher Celio

Feb 15, 2018, 11:10:36 AM
to Rishiyur Nikhil, RISC-V ISA Dev
Hi guys,

What is the rationale for allowing a grace period for relaxing 4B alignment requirements? Can we not reliably force alignment of "CSRW MISA"? (or .option norvc on functions that write MISA?).

My immediate reaction is that this appears very messy from the point-of-view of the core. I can no longer rely on !MISA.C && PC(1) to signify an error without throwing in a state machine? What if I'm doing crazy optimizations specific to RV64G mode that rely on alignment?

-Chris


--
You received this message because you are subscribed to the Google Groups "RISC-V ISA Dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to isa-dev+u...@groups.riscv.org.
To post to this group, send email to isa...@groups.riscv.org.
Visit this group at https://groups.google.com/a/groups.riscv.org/group/isa-dev/.
To view this discussion on the web visit https://groups.google.com/a/groups.riscv.org/d/msgid/isa-dev/CAAVo%2BPmhXr_94XDKEc87%2B1Jsg-8Qo%3Dvd2CTFvB0C8-ShknxyCA%40mail.gmail.com.

Samuel Falvo II

Feb 15, 2018, 11:52:16 AM
to Rishiyur Nikhil, RISC-V ISA Dev
On Thu, Feb 15, 2018 at 7:02 AM, Rishiyur Nikhil <nik...@bluespec.com> wrote:
> PROPOSAL:
> When MISA.C is cleared,
>
> - If PC is not 32-bit aligned; just continue executing, consuming 32b
> instructions that are 16-bit aligned (since the CPU is already
> capable of doing this to support C).

I think this is too complicated, as it is at odds with the third
requirement that all branches (conditional or otherwise) be
word-aligned. I would say that, after clearing the C bit, if the next
instruction fetch is not word aligned, it must raise an exception just
like any instruction fetch after a branch would.

> - If a trap occurs and MTVEC contains an address that is
> 16-bit aligned and not 32-bit aligned, just continue executing,
> consuming 32b instructions that are 16-bit aligned (since the CPU
> is already capable of doing this to support C).

This, by definition, is an unconditional branch, and should be treated
as an unconditional branch -- that is, if *tvec is not word-aligned
AND C-bit is clear, then you'll raise an exception here as well. Of
course, you run the risk of an infinite trap loop; this can be stopped
by enforcing the requirement that mtvec be word-aligned (EVEN with
C-bit set!), while all other *tvec registers have relaxed
requirements.

> - CSR* instructions that write to [MSU]EPC will clear the two LSBs of
> the written address, i.e. EPC will be 32-bit aligned.

Ouch. No. No black magic, please. The spec states that the EPC bits
are read/written as-is, and as well, this patently prevents software
delegation to exception handlers written with the C-extension.

> - When PC is copied to [MSU]EPC as part of trap handling, it is copied
> as-is even if it points to an address that is not 32-bit aligned.

That has to be the case anyway.

--
Samuel A. Falvo II

Clifford Wolf

Feb 15, 2018, 12:08:44 PM
to Christopher Celio, Rishiyur Nikhil, RISC-V ISA Dev
Hi Chris,

On Thu, Feb 15, 2018 at 08:10:32AM -0800, Christopher Celio wrote:
> What is the rationale for allowing a grace period for relaxing 4B
> alignment requirements? Can we not reliably force alignment of "CSRW
> MISA"? (or .norvc on functions that write MISA?).

Some history on that:

Rocket currently does extremely strange things in those cases. You can get
it to execute an unaligned LB instruction following the MISA write as an
LW instruction, but also jump to the trap handler with an MEPC that
suggests the load instruction was never executed. (The reason is that the
LB gets scheduled, then a trap is raised and the LB is not properly killed,
causing it to write to its destination register when the memory returns the
data, but the register write is handled like it would be for a LW
instruction as a side-effect of the improper instruction kill.)

Apparently Andrew Waterman doesn't want to fix this because it would add
additional hardware. (Please correct me if I'm misrepresenting you here
Andrew. This would not be my intention.) Instead he wants this to be
undefined behavior.

However, we from the formal spec working group don't think that this should
be undefined behavior and therefore propose a behavior that should be
implementable with minimal to zero hardware overhead. I would assume that
the behavior we propose would actually require less hardware than what
Rocket currently does.

> My immediate reaction is that this appears very messy from the
> point-of-view of the core. I can no longer rely on !MISA.C && PC(1) to
> signify an error without throwing in a state machine? What if I'm doing
> crazy optimizations specific to RV64G mode that rely on alignment?

You cannot use "!MISA.C && PC(1)" anyway because it's not the unaligned
jump target that traps, it is the instruction that tries to jump to the
unaligned address. (In which case the jump is not executed and PC still
points to the (probably aligned) jump/branch instruction.)

Unless you mean something like "!MISA.C && NEXT_PC(1)" which is exactly
what you should be doing now for all jump/branch instructions, and it is
also the behavior we are proposing below.
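
A minimal sketch of that check, assuming a hypothetical `execute_jump` helper and a made-up trap vector address. It only illustrates that the test is applied to the jump's target, and that the recorded exception PC is the jump's own address:

```python
TRAP_VECTOR = 0x100  # made-up address standing in for the trap vector
MCAUSE_INSTRUCTION_ADDRESS_MISALIGNED = 0

def execute_jump(pc, target, misa_c):
    """Return (next_pc, mepc, mcause); mepc and mcause are None if no trap."""
    next_pc_bit1 = (target >> 1) & 1
    if not misa_c and next_pc_bit1:      # "!MISA.C && NEXT_PC(1)"
        # The jump is not executed; the trap records the jump's address.
        return TRAP_VECTOR, pc, MCAUSE_INSTRUCTION_ADDRESS_MISALIGNED
    return target, None, None
```

Note that `execute_jump(0x2002, 0x3000, False)` does not trap in this sketch: the alignment of the jump instruction itself is never examined, only that of its target.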

>> PROPOSAL:
>> When MISA.C is cleared,
>>
>> (1) If PC is not 32-bit aligned; just continue executing, consuming 32b
>> instructions that are 16-bit aligned (since the CPU is already
>> capable of doing this to support C).
>>
>> (2) If a trap occurs and MTVEC contains an address that is
>> 16-bit aligned and not 32-bit aligned, just continue executing,
>> consuming 32b instructions that are 16-bit aligned (since the CPU
>> is already capable of doing this to support C).
>>
>> (3) All jump or branch instructions that attempt to jump to an address
>> that is not 32-bit aligned will cause an Instruction address
>> misaligned exception, per the current spec (exception is taken at
>> the jump, not at the target). The return address is saved as is,
>> even if it is not 32-bit aligned.
>>
>> (4) CSR* instructions that write to [MSU]EPC will clear the two LSBs of
>> the written address, i.e. EPC will be 32-bit aligned.
>>
>> (5) When PC is copied to [MSU]EPC as part of trap handling, it is copied
>> as-is even if it points to an address that is not 32-bit aligned.

Note that (3) and (4) are exactly what every RISC-V processor should already
do when !MISA.C.

Also note that (1), (2), and (5) in fact remove complexity. In these points
we say that a processor should *not* add extra complexity in areas where
some might assume that the spec requires them to add complexity. (But in
fact the current spec doesn't say anything about what to do in those
situations.)

regards,
- clifford

--
there's no place like 127.0.0.1
until we found ::1 -- which is even bigger

Clifford Wolf

Feb 15, 2018, 12:19:17 PM
to Samuel Falvo II, Rishiyur Nikhil, RISC-V ISA Dev
Hi,

On Thu, Feb 15, 2018 at 08:52:13AM -0800, Samuel Falvo II wrote:
> > - If PC is not 32-bit aligned; just continue executing, consuming 32b
> > instructions that are 16-bit aligned (since the CPU is already
> > capable of doing this to support C).
>
> I think this is too complicated, as it is at odds with the third
> requirement that all branches (conditional or otherwise) be
> word-aligned.

Aaehm.. No. There is no conflict between those clauses.

> I would say that, after clearing the C bit, if the next
> instruction fetch is not word aligned, it must raise an exception just
> like any instruction fetch after a branch would.

This is not how the exception in response to a branch to an unaligned
instruction works.

It is not the new (unaligned) instruction that traps, it's the branch
instruction that tries to make the program counter unaligned that traps.

> > - If a trap occurs and MTVEC contains an address that is
> > 16-bit aligned and not 32-bit aligned, just continue executing,
> > consuming 32b instructions that are 16-bit aligned (since the CPU
> > is already capable of doing this to support C).
>
> This, by definition, is an unconditional branch, and should be treated
> as an unconditional branch -- that is, if *tvec is not word-aligned
> AND C-bit is clear, then you'll raise an exception here as well. Of
> course, you run the risk of an infinite trap loop; this can be stopped
> by enforcing the requirement that mtvec be word-aligned (EVEN with
> C-bit set!), while all other *tvec registers have relaxed
> requirements.

I don't know what you are talking about. *tvec are always word aligned.
Please read the spec.

> > - CSR* instructions that write to [MSU]EPC will clear the two LSBs of
> > the written address, i.e. EPC will be 32-bit aligned.
>
> Ouch. No. No black magic, please. The spec states that the EPC bits
> are read/written as-is, and as well, this patently prevents software
> delegation to exception handlers written with the C-extension.

What? Please read the spec!
This is the current behavior, not "black magic"!

> > - When PC is copied to [MSU]EPC as part of trap handling, it is copied
> > as-is even if it points to an address that is not 32-bit aligned.
>
> That has to be the case anyway.

If you read the spec and put it side by side with your proposal then you will
see that this is the case for most of it. It is the entire point to have
a proposal that is more a clarification regarding some small spec holes
rather than a major spec change.

regards,
- clifford

--
Oh, boy, virtual memory! Now I'm gonna make myself a really *big* RAMdisk!

Jose Renau

Feb 15, 2018, 12:22:35 PM
to Rishiyur Nikhil, RISC-V ISA Dev

This looks very complicated/involved when we start to have wide fetch.
Clearing the MISA.C should be an infrequent event, so adding a nop
overhead to align the instruction correctly should not be a problem.
 
Doing the state machine indicated below is "ok" for a single scalar,
but as we fetch wider and uOP the RISC-V instructions, we have an
explosion of options just for something very infrequent/weird.
 
What about the following simpler solution:
 
- If MISA.C is cleared by a compressed instruction, the instruction
following the MISA.C clear must be 32-bit aligned, or an illegal instruction
exception is raised.
 
Much easier to implement/debug and cleaner to verify.

Samuel Falvo II

Feb 15, 2018, 12:32:01 PM
to Clifford Wolf, Rishiyur Nikhil, RISC-V ISA Dev
On Thu, Feb 15, 2018 at 9:19 AM, Clifford Wolf <clif...@clifford.at> wrote:
> It is not the new (unaligned) instruction that traps, it's the branch
> instruction that tries to make the program counter unaligned that traps.

What is the exception? Is it an illegal instruction exception? OK,
I'll concede that.

If it is an unaligned instruction address exception, then I'm going to
vociferously disagree. I am not going to add unaligned address checks
for every single instruction capable of altering PC, when instead I
can just add the hardware *once* in the instruction fetch logic to
detect this condition. That just doesn't make any sense. Unless you
know some trick to this that I don't?

> I don't know what you are talking about. *tvec are always word aligned.
> Please read the spec.

Then the whole bullet point is nonsensical by definition, for it can
*NEVER* happen. As you apparently are fond of saying, Please read the
spec.

Andrew Waterman

Feb 15, 2018, 12:42:26 PM
to Jose Renau, RISC-V ISA Dev, Rishiyur Nikhil
I still don’t think this needs to be well defined, but if we disagree on that point, I prefer Jose’s/Chris’ proposal for simpler semantics.

Clifford Wolf

Feb 15, 2018, 12:57:58 PM
to Samuel Falvo II, Rishiyur Nikhil, RISC-V ISA Dev
Hi,

On Thu, Feb 15, 2018 at 09:31:59AM -0800, Samuel Falvo II wrote:
> On Thu, Feb 15, 2018 at 9:19 AM, Clifford Wolf <clif...@clifford.at> wrote:
> > It is not the new (unaligned) instruction that traps, it's the branch
> > instruction that tries to make the program counter unaligned that traps.
>
> What is the exception? Is it an illegal instruction exception? OK,
> I'll concede that.
>
> If it is an unaligned instruction address exception, then I'm going to
> vociferously disagree. I am not going to add unaligned address checks
> for every single instruction capable of altering PC, when instead I
> can just add the hardware *once* in the instruction fetch logic to
> detect this condition. That just doesn't make any sense. Unless you
> know some trick to this that I don't?

I'm not sure what you are arguing here.

The "Instruction address misaligned" (mcause=0) exception currently works
the way that it traps on the instruction that tries to make the PC
unaligned. The branch/jump instruction in question is never executed and
in the trap handler mepc points to the branch/jump, not to the instruction
the branch/jump would have tried to jump to.

Our proposal does not change this in any way.

If you think this behavior is bad and you want to discuss it then maybe
create a new thread for that, but please don't sidetrack this thread.

> > I don't know what you are talking about. *tvec are always word aligned.
> > Please read the spec.
>
> Then the whole bullet point is nonsensical by definition, for it can
> *NEVER* happen. As you apparently are fond of saying, Please read the
> spec.

Yes, it is tautological.

It was not part of my original wording. Nikhil must have added it when he
edited my original 3 point proposal. But the overall semantics should be
the same. A tautological statement doesn't really change that.

Here is my original wording for reference. Maybe that makes it easier to
understand what this is about:

--snip--

A processor with C support and writable misa.C should behave as follows
when misa.C is cleared:

- If not stated otherwise below, behave the same as if misa.C would be set.
Specifically, if PC points to an address that isn't aligned to 32 bits
the core should just execute unaligned code.

- All jump or branch instructions that attempt to jump to an address that
is not aligned to 32 bits will cause an Instruction address misaligned
exception.

- Writes to [msu]epc will clear the two LSBs of the written address. This
is only the case for the CSR* instructions that explicitly write to
[msu]epc. When PC is copied to [msu]epc as part of trap handling, then
it is copied as-is even when it points to an address not aligned to
32 bits.

--snap--

regards,
- clifford

Clifford Wolf

Feb 15, 2018, 1:05:23 PM
to Jose Renau, Rishiyur Nikhil, RISC-V ISA Dev
Hi,

On Thu, Feb 15, 2018 at 05:22:32PM +0000, Jose Renau wrote:
> This looks very complicated/involved when we start to have wide fetch.
> Clearing the MISA.C should be an infrequent event, so adding a nop
> overhead to align the instruction correctly should not be a problem.

You don't understand. This is not an optimization. This is to avoid
undefined behavior in the spec.

> Doing the state machine indicated below is "ok" for a single scalar,
> but as we fetch wider and uOP the RISC-V instructions, we have an
> explosion of options just for something very infrequent/weird.
>
> What about the following simpler solution:
>
> -If the MISA.C is cleared by a compressed instruction. The instruction
> following the MISA.C clear must be 32bit aligned or an illegal
> instruction exception is raised.

(1) Do you really mean "compressed instruction"? Because you can't modify
MISA.C with a compressed instruction. Only 32b opcodes exist for that. So
I'm assuming you meant unaligned instruction.

(2) This is only addressing the trivial part of the problem. The hard one
is what happens if one clears MISA.C while [MSU]EPC points to an address
that is not aligned to 32b. What happens if the next instruction is
[msu]ret?

regards,
- clifford

Bruce Hoult

Feb 15, 2018, 1:06:55 PM
to Jose Renau, Rishiyur Nikhil, RISC-V ISA Dev
It doesn't matter whether MISA.C is cleared by a compressed instruction or not. If it's cleared by a 16 bit-aligned 32 bit instruction then that should trap also.

Also: both this and the other thread about unaligned 32 bit instructions crossing a page boundary should explicitly consider the behaviour with 48 bit, 64 bit and longer instructions. The generalisation is pretty obvious, and to make it clear for the future I think these proposals should not be talking about C extension instructions, but about ANY instruction consisting of an odd number of 16 bit packets.


Clifford Wolf

Feb 15, 2018, 1:13:10 PM
to Andrew Waterman, Jose Renau, RISC-V ISA Dev, Rishiyur Nikhil
Hi,

On Thu, Feb 15, 2018 at 05:42:12PM +0000, Andrew Waterman wrote:
> I still don’t think this needs to be well defined, but if we disagree on
> that point, I prefer Jose’s/Chris’ proposal for simpler semantics.

I'm confused. You pointed out to me in the original mails about this last
year that this solution would be insufficient because it does not address
the [MSU]EPC issue and you rejected anything along the lines of those
solutions that would address the [MSU]EPC issue because you said it would
add too much hardware overhead.

So what solution do you propose that solves the [MSU]EPC issue and stays
within what you would see as acceptable hardware overhead? Because afaict
neither Jose nor Chris addressed those issues in their mails.

regards,
- clifford

Clifford Wolf

Feb 15, 2018, 1:21:09 PM
to Bruce Hoult, Jose Renau, Rishiyur Nikhil, RISC-V ISA Dev
Hi,

On Thu, Feb 15, 2018 at 09:06:52PM +0300, Bruce Hoult wrote:
> Also: both this and the other thread about unaligned 32 bit instructions
> crossing a page boundary should explicitly consider the behaviour with 48
> bit, 64 bit and longer instructions. The generalisation is pretty obvious,
> and to make it clear for the future I think these proposals should not be
> talking about C extension instructions, but about ANY instruction
> consisting of an odd number of 16 bit packets.

Absolutely. However, in the current spec only toggling MISA.C will toggle
the support for unaligned instructions, because no other extension adds
support for instructions that are not a multiple of 32b in length.

One of my previous proposals was to decouple the C extension and unaligned
instructions completely and let MISA.C only toggle decoding of compressed
instructions, but still allow unaligned code in MISA.C capable processors
when MISA.C is disabled. This proposal was shut down because the idea is
that one should be able to use MISA.C to completely emulate a non-C core
with all its limitations, including the inability to load unaligned
instructions.

So when we talk about MISA.C in this proposal, we actually are talking
about the OR'ed value of all feature bits that control extensions that
would enable support for executing unaligned instructions.

Christopher Celio

Feb 15, 2018, 1:27:39 PM
to Clifford Wolf, Andrew Waterman, Jose Renau, RISC-V ISA Dev, Rishiyur Nikhil
Thank you Clifford for providing clarification on this issue. Let me summarize this as I understand it and suggest a proposal or two:

* In current RISC-V behavior, misalignment in RVG is detected by monitoring ONLY the branch and jump instructions.
* This proposal keeps the "hardware simple" by maintaining this behavior.
* EPC is forced to 4-byte alignment during a RVG register restore to maintain this behavior too.


An alternative proposal, that better matches intuition would be:

* Misalignment in RVG is detected by monitoring branches, jumps, the instruction following "CSRW MISA.C"*, and xRET (use of EPC).

A sub-alternative is to force xRET to align EPC based on RVG or RVC (that is, on the USE of EPC and not the WRITE of EPC).


Do I understand the situation properly?
Chris

Clifford Wolf

Feb 15, 2018, 1:57:18 PM
to Christopher Celio, Andrew Waterman, Jose Renau, RISC-V ISA Dev, Rishiyur Nikhil
Hi,

On Thu, Feb 15, 2018 at 10:27:35AM -0800, Christopher Celio wrote:
> Thank you Clifford for providing clarification on this issue. Let me summarize this as I understand it and suggest a proposal or two:
>
> * In current RISC-V behavior, misalignment in RVG is detected by monitoring ONLY the branch and jump instructions.
> * This proposal keeps the "hardware simple" by maintaining this behavior.
> * EPC is forced to 4-byte alignment during a RVG register restore to maintain this behavior too.
>
> Do I understand the situation properly?

Yes.

I'm not sure about the 3rd point about "register restore". Do you mean transfers
between the EPC CSR and general purpose registers? If so then yes: Writes
to the CSR clear the LSB or the two LSBs depending on the value of
MISA.C (that's the current spec). But reads will return the potentially
16-bit aligned address in the CSR if it was written before clearing the
MISA.C (that's the proposal, this situation is undefined in the current
spec).

> An alternative proposal, that better matches intuition would be:
> * Misalignment in RVG is detected by monitoring branches, jumps, the instruction following "CSRW MISA.C"*, and xRET (use of EPC).
> * A sub-alternative is to force xRET to align EPC based on RVG or RVC (that is, on the USE of EPC and not the WRITE of EPC).

Yes, these are the kinds of things I proposed in October last year.

The objections, if I remember correctly, were:

In the first case we could have xRET trap based on the value of xEPC. But
when xRET traps then xEPC will be overwritten with the address of the xRET
and the original xEPC value that caused the trap will be lost. The spec has
been crafted carefully to avoid this situation by making sure xEPC will never
be able to hold a value that would make xRET trap. Our proposal preserves
this guarantee.
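
To illustrate the objection to this first case, here is a small sketch of a hypothetical xRET that traps on a misaligned xEPC. The function name and the handler address are assumptions made for this example only:

```python
TRAP_HANDLER = 0x100  # made-up handler address

def xret_that_traps(xret_pc, xepc, misa_c):
    """Hypothetical xRET variant that traps on a misaligned xEPC.

    Returns (next_pc, new_xepc).
    """
    if not misa_c and xepc & 0b10:
        # Taking the trap stores the xRET's own address in xEPC, so the
        # misaligned value that caused the trap is overwritten.
        return TRAP_HANDLER, xret_pc
    return xepc, xepc
```

With `xret_that_traps(0x2000, 0x3002, misa_c=False)` the handler sees xEPC = 0x2000, and the original value 0x3002 is no longer available anywhere.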

In the second case we would effectively execute random code. (Whatever the
interpretation of the code is when offset by 2 bytes.) This has been deemed
less desirable than all other options by pretty much everyone I spoke to.
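
A small sketch of what "offset by 2 bytes" means in this second case. The memory image and the `fetch32` helper are assumptions made for illustration: a 16-bit instruction at offset 0 is followed by a 32-bit instruction at offset 2, and xEPC = 2 is force-aligned down to 0 on return.

```python
import struct

# Little-endian code image (illustrative):
#   offset 0: 0x0001      (c.nop, a 16-bit instruction)
#   offset 2: 0x02A00513  (addi a0, x0, 42, stored as two halfwords)
code = struct.pack("<HHH", 0x0001, 0x0513, 0x02A0)

def fetch32(mem, addr):
    # Read a 32-bit little-endian word at any byte offset.
    return struct.unpack_from("<I", mem, addr)[0]

xepc = 2
intended = fetch32(code, xepc)           # the instruction xEPC points at
forced = fetch32(code, xepc & ~0b11)     # what a force-aligned xRET fetches

print(hex(intended))  # 0x2a00513
print(hex(forced))    # 0x5130001
```

The force-aligned fetch returns a word built from the 16-bit instruction and the lower half of the intended one, which is unrelated to what the interrupted code meant to execute.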

We also discussed other options, such as checking all xEPC regs when
clearing MISA.C and triggering an illegal instruction exception when any one
of them (or the PC) points to an unaligned address. But this was rejected
because of the extra hardware overhead. From a user perspective I think
this would have been the most intuitive option.
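
For comparison, that rejected option could look roughly like the following sketch. The helper name, the exception class, and the dictionary of xEPC registers are assumptions for illustration only; the hardware cost mentioned above comes from having to inspect every one of these registers at the moment of the write.

```python
class IllegalInstruction(Exception):
    """Stands in for an illegal instruction exception."""

def write_misa_c(old_c, new_c, pc, xepcs):
    """Return the new MISA.C value, or raise if clearing C is unsafe.

    xepcs maps register names (e.g. "mepc") to their current values.
    """
    if old_c and not new_c:
        for name, addr in {"pc": pc, **xepcs}.items():
            if addr & 0b10:
                raise IllegalInstruction(
                    f"{name} = {addr:#x} is not 32-bit aligned")
    return new_c
```

Here `write_misa_c(True, False, 0x1000, {"mepc": 0x2002})` would raise, while the same call with `mepc` at 0x2000 would simply return `False`.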

regards,
- clifford

Jose Renau

Feb 15, 2018, 2:10:59 PM
to Clifford Wolf, Christopher Celio, Andrew Waterman, RISC-V ISA Dev, Rishiyur Nikhil
Just curious, what is the problem with nuking the pipeline when misa.c is changed? Then there is no need to check all the inflight instructions.

Clifford Wolf

Feb 15, 2018, 2:15:29 PM
to Jose Renau, Christopher Celio, Andrew Waterman, RISC-V ISA Dev, Rishiyur Nikhil
On Thu, Feb 15, 2018 at 11:10:56AM -0800, Jose Renau wrote:
> Just curious, what is the problem of nuking the pipeline when the misa.c is
> changed. Then, no need to check all the inflight instructions.

That would be a way of implementing a defined behavior, but it does not
answer the question what the defined behaviour should be.

Jose Renau

Feb 15, 2018, 2:22:22 PM
to Clifford Wolf, Christopher Celio, Andrew Waterman, RISC-V ISA Dev, Rishiyur Nikhil
It was to avoid checking all the instructions, and to not allow misalignment of any 32-bit instruction after misa.c is cleared.

After the change, clear the pipe, and all future instructions are either 32-bit aligned or an illegal instruction exception is raised.

Clifford Wolf

Feb 15, 2018, 2:28:36 PM
to Jose Renau, Christopher Celio, Andrew Waterman, RISC-V ISA Dev, Rishiyur Nikhil
On Thu, Feb 15, 2018 at 11:22:19AM -0800, Jose Renau wrote:
> It was to avoid checking all the instructions, and now allow miss aligned
> after misa.c is cleared for any 32bit instruction.
>
> After change, clear pipe and future instructions are all aligned 32 or
> illegal instruction raised.

So the behavior you are proposing is to raise an illegal instruction
exception for an instruction that clears MISA.C when this instruction is
unaligned, correct?

How does this address the issue with xEPC potentially pointing to a
non-aligned address?

Jose Renau

Feb 15, 2018, 2:36:34 PM
to Clifford Wolf, Christopher Celio, Andrew Waterman, RISC-V ISA Dev, Rishiyur Nikhil
After misa.c is cleared, any misaligned instruction fetch also raises an illegal instruction exception. The xEPC case would fall into that class.

Clifford Wolf

Feb 15, 2018, 3:07:28 PM
to Jose Renau, Christopher Celio, Andrew Waterman, RISC-V ISA Dev, Rishiyur Nikhil
On Thu, Feb 15, 2018 at 11:36:31AM -0800, Jose Renau wrote:
> After misa.c is being cleared, if any instruction is miss aligned fetched
> it also raises an illegal instruction exception. The xEPC would fall in
> that class

That was one of the possible solutions I proposed in October. See previous
mail in this thread (Message-ID: <20180215185...@clifford.at>) for
my memory of the objections from back then.

Jose Renau

Feb 15, 2018, 3:16:51 PM
to Clifford Wolf, Christopher Celio, Andrew Waterman, RISC-V ISA Dev, Rishiyur Nikhil
I like it because it is simple and clean for both in order and wide OoO cores.

Nothing implementation-dependent or difficult to explain.

On Feb 15, 2018 12:07 PM, "Clifford Wolf" <clif...@clifford.at> wrote:
On Thu, Feb 15, 2018 at 11:36:31AM -0800, Jose Renau wrote:
> After misa.c is being cleared, if any instruction is miss aligned fetched
> it also raises an illegal instruction exception. The xEPC would fall in
> that class

That was one of the possible solution I proposed in october. See previous
mail in this thread (Message-ID: <20180215185716.GE19634@clifford.at>) for

Christopher Celio

Feb 15, 2018, 5:50:12 PM
to Clifford Wolf, Jose Renau, Andrew Waterman, RISC-V ISA Dev, Rishiyur Nikhil
> The spec has been crafted carefully to avoid this situation by making sure xEPC will never
> be able to hold a value that would make xRET trap. Our proposal preserves
> this guarantee.

I don't understand the full ramifications of this scenario. Yes, this is a bad situation, but at least it's verifiable and understandable. Does this cause us to infinite loop? How recoverable does this situation need to be?

> In the second case we would effectively execute random code. (Whatever the
> interpretation of the code is when offset by 2 bytes.) This has been deemed
> less desirable than all other options by pretty much everyone I spoke to.

Do we not find ourselves effectively executing random code anyways? The original proposal is that CSR* writes to xEPC clear the two LSBs. Is that different from ignoring the two LSBs during xRET?

Do we not find ourselves executing random code when an xRET or trap/MTVEC takes us to unaligned 4-byte instructions? What are we hoping happens while we gallivant through a misaligned RV64G segment? A jump to an aligned instruction means we've done something VERY ODD and no error is ever thrown. If a jump is performed to a misaligned instruction, then the jump is marked as if this were somehow its fault, when it is actually the xRET's or MTVEC's fault. I'm not yet convinced that forcing alignment on xRET is worse than anything else proposed.

I'm trying to imagine a security hole where a misaligned (and malicious) xRET takes us to a unaligned 4 byte jump gadget (that serendipitously was created out of the two halves of adjacent and aligned 4-byte instructions) which takes us to an attack vector that's correctly 4 byte aligned. The fact that the xRET was misaligned is never discovered. Or would an attacker never get access to xEPC so this hypothetical can never be realized?


I'm sorry I haven't spent more than a few hours thinking about this topic. I don't know the correct solution off the top of my head to the xRET/MTVEC problem, but I'm concerned that if the purpose of turning off RVC for RVC processors is to emulate RVG processors, having a co-simulation mismatch on xRET seems really problematic.

RVG should be about all 4 byte instructions being aligned. Allowing corner case exceptions to this is worrisome.


-Chris

Cesar Eduardo Barros

Feb 15, 2018, 6:31:56 PM
to Rishiyur Nikhil, RISC-V ISA Dev
Em 15-02-2018 13:02, Rishiyur Nikhil escreveu:
> BACKGROUND:
> Consider a RISC-V implementation that supports the 'C'
> extension (Compressed Instructions), and can clear the MISA.C bit at
> runtime (thereby switching off C support dynamically).  What should be
> the behavior if PC is not currently 32-bit aligned?  What if return
> addresses and trap vectors are not 32-bit aligned?

That sounds like a "half-updated" state, like x86's "unreal mode". I'm
not a fan of it, since it has the potential of confusing software which
does not expect this "misaligned" mode. However, since it can only be
entered by M-mode, and M-mode can defend itself against it (by just not
doing crazy things in the first place), it's not that bad, only odd (and
I agree that it should be precisely specified, to reduce the chance of
unexpected effects).

Since it's a "half-updated" state, it's actually simple to specify:

- If something was previously set to a misaligned value, it keeps its
misaligned value;
- If you attempt to set something to a misaligned value, it either
becomes aligned or traps (depending on what you're trying to do), as
would normally happen on non-RVC.

From the point of view of these two rules, here's my opinion on your
five rules:

> This is a proposal to specify the behavior precisely.
>
> PROPOSAL:
> When MISA.C is cleared,
>
> - If PC is not 32-bit aligned; just continue executing, consuming 32b
>     instructions that are 16-bit aligned (since the CPU is already
>     capable of doing this to support C).

The PC was already misaligned, so it stays misaligned (incrementing by
the instruction length every time).

> - If a trap occurs and MTVEC contains an address that is
>     16-bit aligned and not 32-bit aligned, just continue executing,
>     consuming 32b instructions that are 16-bit aligned (since the CPU
>     is already capable of doing this to support C).

If the MTVEC was already misaligned, it stays misaligned and is copied
as is to the PC (which then becomes misaligned). Any attempt to write a
misaligned address to MTVEC, however, will not write a misaligned
address to MTVEC. (Are misaligned MTVEC addresses even possible?)

> - All jump or branch instructions that attempt to jump to an address
>     that is not 32-bit aligned will cause an Instruction address
>     misaligned exception, per the current spec (exception is taken at
>     the jump, not at the target).  The return address is saved as is,
>     even if it is not 32-bit aligned.

That's my second rule: attempting to set the PC to something misaligned
won't work. The return address was already misaligned, so it stays
misaligned (and is copied as is).

> - CSR* instructions that write to [MSU]EPC will clear the two LSBs of
>     the written address, i.e. EPC will be 32-bit aligned.
>
> - When PC is copied to [MSU]EPC as part of trap handling, it is copied
>     as-is even if it points to an address that is not 32-bit aligned.

These two are also correct: the PC was already misaligned and stays
misaligned (copied to EPC), and on return also stays misaligned (copied
from EPC - that would be a sixth rule you forgot), but software can't
write something misaligned to EPC (it can read something misaligned from
EPC, however).

But here is the problem. What if the trap handler saves and restores the
EPC (for instance, to allow for nested traps)? It will fail to restore
the EPC correctly. That is, the "half-updated" state is fragile: it's
easy to accidentally slip out of it.
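
The save/restore failure described above can be sketched with a toy model. This is only an illustration, not spec text: the function names are hypothetical, and it assumes the proposal's rules that CSR writes to xEPC clear the two low bits while misa.C is clear, whereas the trap-time copy from PC is left as-is.

```python
# Toy model of xEPC handling with misa.C clear (hypothetical names).

def trap_copy_pc_to_epc(pc):
    """Trap entry: PC is copied to xEPC unchanged, even if misaligned."""
    return pc

def csr_write_epc(value, misa_c):
    """CSR* write to xEPC: bit 0 is always cleared; bit 1 too when misa.C is clear."""
    return value & (~0b01 if misa_c else ~0b11)

pc = 0x80000002                        # 16-bit aligned, not 32-bit aligned
epc = trap_copy_pc_to_epc(pc)          # trap taken in the half-updated state
saved = epc                            # handler reads xEPC so it can nest
restored = csr_write_epc(saved, misa_c=False)  # handler writes it back

print(hex(epc), hex(restored))
```

The value written back differs from the value that was saved, so the handler returns to a different address than the one that trapped.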

Which is why this section of the specification should be prefaced with
"this is how it works, but it's odd and can easily break, so save
yourself the headache and align the PC before clearing MISA.C, or asking
M-mode to clear MISA.C for you". Also, whoever defines the ABI for
"asking M-mode to clear MISA.C" should define it to fail whenever the PC
of the caller (copied into EPC) is misaligned, to prevent the behavior
from differing between M-mode implementations.

--
Cesar Eduardo Barros
ces...@cesarb.eti.br

Clifford Wolf

Feb 16, 2018, 7:20:05 AM
to Christopher Celio, Jose Renau, Andrew Waterman, RISC-V ISA Dev, Rishiyur Nikhil
Hi,

sorry, long mail. If it's too long please skip to the end and start reading
at "This is not about security".

On Thu, Feb 15, 2018 at 02:50:07PM -0800, Christopher Celio wrote:
> > The spec has been crafted carefully to avoid this situation by making sure xEPC will never
> > be able to hold a value that would make xRET trap. Our proposal preserves
> > this guarantee.
>
> I don't understand the full ramifications of this scenario. Yes, this is
> a bad situation, but at least it's verifiable and understandable. Does
> this cause us to infinite loop? How recoverable does this situation need
> to be?

I'm just repeating the arguments I heard in October for why this suggestion
is not acceptable.

I took all the "we can't do that because ..." arguments and crafted a
solution that requires no extra hardware (because this was the main
argument against most suggestions) and also avoids the other things that
have been brought up.

I personally don't care much as long as we can avoid completely
undefined behavior.

As I've said in my previous mail, rocket currently can retire a killed LB
instruction as LW as a result of clearing MISA.C with an unaligned PC and
this is deemed okay because clearing MISA.C with an unaligned PC is
undefined behavior in the current spec. (It wasn't when I found the
behavior, but when I reported it it was added to the spec instead of fixing
rocket.)

--8<--
Quick sidenote: What is undefined behavior? Usually it means the core can
do whatever it wants whenever it wants, even before the unaligned MISA.C
clear hits as long as the core is destined to hit it (undefined behavior is
not causal). This is completely unacceptable for formal verification. We
would need to solve the halting problem in order to determine if the core
is destined to hit the unaligned MISA.C clear.

In one mail in October Andrew said undefined behavior in this sense is
causal (good!). But even causal undefined behavior screws up formal.
Suddenly all properties depend on if the processor hit this one obscure
state in its past. That drastically complicates everything.

I've also been told that "undefined behavior" in the sense of the RISC-V
spec is causal and guarantees that it "won't violate protections and won't
hang". But this is a very problematic definition because it is not a
statement in terms of CPU state but a statement in terms of emergent
behavior.

For example: What if the core suddenly starts interpreting every
instruction as "j 0", i.e. endless loop. It still keeps executing
instructions, so does it hang?

You might think that it is crazy to even consider this correct behavior,
but rocket currently can be made to retire a LW instruction that is not
there as response to clearing MISA.C, and apparently that is okay according
to the definition.

I think we should be able to get away in the formal spec with unspecified
values and situations where an implementation may choose one of multiple
concrete options (aka "unpredictable behavior"). For example, instead of
saying that clearing MISA.C with xEPC pointing to an unaligned address is
undefined behavior we could say that clearing MISA.C will leave undefined
values in all xEPC registers.
--8<--

We in the formal spec WG don't think that clearing MISA.C justifies undefined
behavior. We even think it is possible to define it in a way so that it has
no unspecified or unpredictable aspects and adds no extra hardware.

> > In the second case we would effectively execute random code. (Whatever the
> > interpretation of the code is when offset by 2 bytes.) This has been deemed
> > less desirable than all other options by pretty much everyone I spoke to.
>
> Do we not find ourselves effectively executing random code anyways? The
> original proposal is that CSR* writes to xEPC clear the two LSBs. Is that
> different from ignoring the two LSBs during xRET? [..]

Yes, that's very different.

Because this is only for CSR* writes to xEPC, not for when the PC gets
copied into xEPC as a result of a trap. Clearing the low bits in xEPC on
write is spec behavior now (3.1.19 of Privileged Architectures V1.10), not
something we have added.

So with our proposal you can still take a trap in unaligned code and return
correctly. We only get a problem when people try to set xEPC by copying a
register value into it. (Which is already the case right now.)

Note that our proposal completely stays within the realm of what is
currently completely unspecified. So all guarantees that the current spec
gives to software still hold.

> I'm not yet convinced that forcing alignment on xRET is worse than
> anything else proposed.

I don't say it is. I'm saying the suggestion was already shut down in
October because it adds extra hardware complexity.

Our proposal doesn't add hardware complexity. It's just a bit harder to get
your head around but it is actually simpler than what (at least) rocket is
doing right now and it only clarifies some cases that are currently
unspecified because the spec is for the most part written in a way that
assumes that MISA.C is read-only.

Let me explain what I mean using my original 3 clause proposal (that should
be semantically identical to the 5 clause proposal Nikhil posted):

--snip--

A processor with C support and writable misa.C should behave as follows
when misa.C is cleared:

(1) If not stated otherwise below, behave the same as if misa.C would be set.
Specifically, if PC points to an address that isn't aligned to 32 bits
the core should just execute unaligned code.

(2) All jump or branch instructions that attempt to jump to an address that
is not aligned to 32 bits will cause an Instruction address misaligned
exception.

(3) Writes to [msu]epc will clear the two LSB of the written address. This
is only the case for the CSR* instructions that explicitly write to
[msu]epc. When PC is copied to [msu]epc as part of trap handling then
it is copied as-is even when it points to an address not aligned to
32 bits.

--snap--
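
As a rough illustration of the three clauses above, here is a small executable model. It is a sketch under stated assumptions, not a formal spec: the class and method names are hypothetical, only 4-byte instructions and mepc are modeled, and jump targets are assumed to be 2-byte aligned already.

```python
class InstructionAddressMisaligned(Exception):
    """Raised by the jump itself, not at the target."""

class Core:
    def __init__(self, pc):
        self.pc = pc
        self.misa_c = True
        self.mepc = 0

    def step(self):
        # Clause (1): a 32-bit instruction retires and PC advances by 4,
        # whatever the alignment of PC. misa.C is not consulted.
        self.pc += 4

    def jump(self, target):
        # Clause (2): with misa.C clear, bit 1 of the target must be zero.
        if not self.misa_c and target & 0b10:
            raise InstructionAddressMisaligned(hex(target))
        self.pc = target

    def csr_write_mepc(self, value):
        # Clause (3), first half: explicit CSR writes clear the low bits.
        self.mepc = value & (~0b01 if self.misa_c else ~0b11)

    def trap(self, handler):
        # Clause (3), second half: PC is copied to mepc as-is.
        self.mepc = self.pc
        self.pc = handler

core = Core(pc=0x1002)
core.misa_c = False       # clear misa.C while PC is not 32-bit aligned
core.step()               # keeps executing unaligned code
core.trap(0x2000)         # mepc now holds the unaligned PC
print(hex(core.pc), hex(core.mepc))
```

In this model misa.C is read in exactly two places, jump() and csr_write_mepc(), which is the point made about clause (1) below.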

Clause (1) says that with the exception of (2) and (3) there is no need to
use MISA.C anywhere in the core (with the exception of the instruction
decoder that should forget about C encodings of course).

This is important! It means that if you have anything else in your core
right now that uses misa.C then you can remove that logic. It is not
needed.

Clause (2) is something that a core should already do. It is specified in
2.5 of RISC-V User-Level ISA V2.2.

Clause (3) is two parts. First the part about clearing bits during CSR*
writes to [msu]epc. This is something a core should already do. It is
specified in 3.1.19 of Privileged Architectures V1.10. And second the
copying from PC to [msu]epc. This is essentially a clarification that
no special action is required if PC is not aligned to 32b.

Your proposal is to mask the LSB bits of xEPC on xRET to fix the address.
This would be an additional action on top of the clauses above. It would
not remove hardware anywhere else from the system but it would require you
to add a few extra gates to the xRET logic.

> I'm trying to imagine a security hole where a misaligned (and malicious)
> xRET takes us to an unaligned 4 byte jump gadget (that serendipitously was
> created out of the two halves of adjacent and aligned 4-byte
> instructions) which takes us to an attack vector that's correctly 4 byte
> aligned. The fact that the xRET was misaligned is never discovered. Or
> would an attacker never get access to xEPC so this hypothetical can never
> be realized?

This is not about security.

A proper OS would only clear misa.C when in aligned code and with xEPC pointing to
aligned addresses.

If misa.C is cleared and PC and xEPC point to 32b aligned addresses then
you can never have an unaligned address in PC and xEPC without first
setting misa.C again. That's the whole point: Emulating RVG on RVC.
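
The closure argument above can be checked on a toy state machine. This is a hedged sketch, not a proof: the transition function, the operation names and the TVEC value are hypothetical, and only PC and a single xEPC are modeled, with misa.C held clear throughout.

```python
import random

TVEC = 0x100  # hypothetical trap vector, 32-bit aligned

def next_state(pc, epc, op, arg):
    """One transition with misa.C clear; returns the new (pc, epc)."""
    if op == "step":
        return pc + 4, epc
    if op == "jump":
        if arg & 0b11:               # misaligned target: the jump traps
            return TVEC, pc          # xEPC gets the jump's own PC, as-is
        return arg, epc
    if op == "csrw_epc":
        return pc, arg & ~0b11       # CSR write clears the two LSBs
    if op == "xret":
        return epc, epc
    raise ValueError(op)

def stays_aligned(steps, seed=0):
    rng = random.Random(seed)
    pc, epc = 0x1000, 0x2000         # start in a state valid for RVG
    for _ in range(steps):
        op = rng.choice(["step", "jump", "csrw_epc", "xret"])
        pc, epc = next_state(pc, epc, op, rng.randrange(0, 1 << 16))
        if (pc | epc) & 0b11:
            return False
    return True

print(stays_aligned(10_000))
```

Every transition maps an aligned (pc, epc) pair to an aligned pair, so no sequence of these operations can leave the aligned subset.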

This is about avoiding unnecessary undefined behavior so that we can have a
clean formal specification. Our proposal does this without adding any
additional complexity to the hardware. In fact, I believe it simplifies
implementations.

> I'm sorry I haven't spent more than a few hours thinking about this
> topic. I don't know the correct solution off the top of my head to the
> xRET/MTVEC problem, but I'm concerned if the purpose of turning off RVC
> for RVC processors is to emulate RVG processors, having a co-simulation
> mismatch on xRET seems really problematic.

How would you have a co-simulation mismatch on xRET if you only clear
misa.C when in aligned code and xEPC pointing to aligned addresses?

The whole point of our proposal is to guarantee complete compatibility with
RVG if we start in a state that is valid for an RVG processor. I don't see
how you could run into any co-simulation mismatches if you start with a
valid state.

> RVG should be about all 4 byte instructions being aligned. Allowing
> corner case exceptions to this is worrisome.

But how would you ever get into those corner cases? Could you provide an
example? The only way to do that is to start with a state an RVG system can
never get into without first setting it up in RVC mode and then clearing
misa.C. And then you are talking about RVC system behavior, not RVG,
because in order to get into that state you had to be in C mode for at
least a few instructions.

I don't know.. maybe we are misunderstanding each other here. Or maybe
there is a bug in my proposal. In the latter case I would appreciate an
example that demonstrates that you could get a co-simulation mismatch with
a core that follows my proposal.

regards,
- clifford

--
Tell people there is an invisible man in the sky who created the universe, and
the vast majority will believe you. Tell them the paint is wet, and they have
to touch it to be sure.

Clifford Wolf

Feb 16, 2018, 7:33:13 AM
to Cesar Eduardo Barros, Rishiyur Nikhil, RISC-V ISA Dev
Hi,

On Thu, Feb 15, 2018 at 09:31:46PM -0200, Cesar Eduardo Barros wrote:
> >- If a trap occurs and MTVEC contains an address that is
> >     16-bit aligned and not 32-bit aligned, just continue executing,
> >     consuming 32b instructions that are 16-bit aligned (since the CPU
> >     is already capable of doing this to support C).
>
> If the MTVEC was already misaligned, it stays misaligned and is
> copied as is to the PC (which then becomes misaligned). Any attempt
> to write a misaligned address to MTVEC, however, will not write a
> misaligned address to MTVEC. (Are misaligned MTVEC addresses even
> possible?)

No, it is not possible. You can ignore this clause.

> But here is the problem. What if the trap handler saves and restores
> the EPC (for instance, to allow for nested traps)? It will fail to
> restore the EPC correctly. That is, the "half-updated" state is
> fragile: it's easy to accidentally slip out of it.

Please note that this is only about having a clean formal spec with nice
deterministic behavior.

Currently the priv spec says this:

When clearing the "C" bit in misa, software must ensure that the current pc
is 4-byte aligned and that all xepc registers contain 4-byte-aligned values.

Leaving it completely undefined what happens when software does not follow
this rule. (And as I've said before in this thread, at least rocket can do
really weird things if you clear MISA.C with an unaligned PC.)

Our goal here is to make this defined behavior for the sake of a clean and
formally verifiable specification.

Actual software would of course still ensure that PC and xEPC are all
4-byte aligned. You are not emulating RVG until you are in a state where
PC and xEPC are all 4-byte aligned.

As long as we have defined behavior we are good. This is not a state the
processor is actually going to be operated in by software. So it does not
matter if the trap handler can't restore xEPC from a register.

There are no requirements other than: (a) the behavior should be well
defined and (b) it should be implementable with minimal (zero) extra
hardware cost.

regards,
- clifford

Paul Miranda

Feb 16, 2018, 9:45:26 AM
to RISC-V ISA Dev
I like the proposal since it allows some flexibility in implementation while providing a more complete specification.
However, I am curious what use cases there are for clearing the C bit? (Other than the case celio mentioned of emulating RVG on RVC hardware.)

Cesar Eduardo Barros

Feb 16, 2018, 5:07:03 PM
to Paul Miranda, RISC-V ISA Dev
Em 16-02-2018 12:45, Paul Miranda escreveu:
> I like the proposal since it allows some flexibility in implementation
> while providing a more complete specification.
> However, I am curious what use cases there are for clearing the C bit?
> (Other than the case celio mentioned of emulating RVG on RVC hardware.)

Given that all clearing the C bit does is "emulating RVG on RVC
hardware", if you exclude that you won't find any use cases. A better
question would be "why would someone want to disable the C extension",
and I can see at least six use cases for that:

1. Software development: you want to make sure your software works even
on cores without the C extension, but all you have available for testing
are cores with the C extension.

2. Virtual machine migration: you have a mixed farm of machines, some of
which have the C extension, and some of which don't. You want to make
sure a virtual machine can be migrated between all these physical
machines, without crashing or becoming slower, so you run the virtual
machines with the C extension disabled.

3. PNaCl-style sandboxing: base non-C RISC-V has the property that all
instructions are 4 bytes wide and aligned to 4 bytes, that is, jumping
into the middle of an instruction isn't possible. This makes it easier
for a validator to confirm that a sequence of instructions follows a set
of rules for a sandbox.

4. Extensions using the RVC quadrants: disabling RVC frees 3/4 of the
encoding space for extensions, either emulated or decoded in hardware.
This is particularly useful together with the first use case (software
development): suppose you have a special-purpose micro-controller which
uses the RVC encoding space for a custom extension; disabling RVC allows
one to use a generic RISC-V core to test software compiled for that
micro-controller (by trapping and emulating the custom extension).

5. Hardware evolution: if you started with a tiny core which didn't have
the C extension, and later you needed more compute power but all the
cores you could find had the C extension, disabling the C extension
would help make sure your software keeps working unchanged.

6. Alternative compressed instructions: suppose you disagree with the
design of RVC, and have what you think is a better design. Disabling RVC
allows experimenting (through emulation) with your "obviously superior"
design.

This last use case is special in that, unlike all the others, it wants
to allow misaligned 32-bit instructions. With Clifford Wolf's proposal,
there's an easy trick for that: when necessary, the emulator could
temporarily enable MISA.C and misalign the EPC. Alternative proposals
that enforce alignment when MISA.C is clear wouldn't allow for this
niche use case without emulating everything whenever the PC misaligns.

Christopher Celio

Feb 16, 2018, 7:42:31 PM
to Jose Renau, Clifford Wolf, Andrew Waterman, RISC-V ISA Dev, Rishiyur Nikhil

- If PC is not 32-bit aligned; just continue executing, consuming 32b
    instructions that are 16-bit aligned (since the CPU is already
    capable of doing this to support C).

I believe I have convinced myself that the parenthetical that underpins this proposal is not always true. If MISA.C is disabled, then the processor may no longer be capable of executing misaligned instructions. I am in particular concerned about superscalar processors where power becomes a concern.

One particular design may choose to work as follows: 

A 16 Byte fetch front-end brings in four 4-byte instructions and up to eight 2-byte instructions. 8 expanders/RVC-decoders are used to generate micro-ops, and a mux is required to choose between the potential 4-byte or expanded 2-byte instruction that may start at any of the 8 locations within the 16B packet.

If MISA.C is disabled, then I can turn off the RVC decoders/expanders, and I can turn off the flops and muxing for the extra 4 uops. To power down the clock is not instantaneous; I can fold it in with the pipeline flush, but I can't magically turn it on when a misaligned xRET is sprung on the frontend. 

If the flops are off, there's no way for the core to execute unaligned 4-byte instructions as demanded by this proposal.
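
A minimal sketch of the front-end described above, to make the slot counting concrete. The names are hypothetical, and the assumption that the four halfword-offset start slots are gated off when MISA.C is clear is the design choice being argued here, not something any spec requires.

```python
FETCH_BYTES = 16  # fetch packet size from the design described above

def powered_start_slots(misa_c):
    """Byte offsets within a packet at which an instruction may start."""
    stride = 2 if misa_c else 4      # assumption: halfword-offset slots are gated off
    return list(range(0, FETCH_BYTES, stride))

def can_decode_at(pc, misa_c):
    """True if the slot that pc falls on is still powered."""
    return pc % FETCH_BYTES in powered_start_slots(misa_c)

print(len(powered_start_slots(True)), len(powered_start_slots(False)))
print(can_decode_at(0x1002, misa_c=True), can_decode_at(0x1002, misa_c=False))
```

With MISA.C clear, an instruction starting at offset 2, 6, 10 or 14 lands on a slot this design has powered down.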


For this reason, and for sanity, and security (even if only "security by depth"), I feel that RVG should strongly mandate that all instructions are 4-bytes and always aligned.


In particular:

I believe that clearing MISA.C should flush the pipeline and fetch PC+4 (the switch to uncompressed-only execution is immediate from the ISA view, and misalignment is forbidden after the MISA.C write).

I believe that ALL instructions should be checked for misalignment.

The instruction after a misaligned MISA.C should throw a misaligned exception.

Likewise, the instruction at xEPC after an xRET should throw a misaligned exception. 


Checking every fetch_pc is a 1 (or 2-bit) check and can piggy-back on the existing exception signals that come from the front-end. You only have to check PC of the first instruction in a packet.
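
The check described above can be sketched as follows. The function names are hypothetical, and the packet helper assumes only 4-byte instructions are present, which is the MISA.C-clear case.

```python
def fetch_misaligned(fetch_pc, misa_c):
    """True if the packet's first PC violates the current alignment rule."""
    return (fetch_pc & (0b01 if misa_c else 0b11)) != 0

def packet_pcs(fetch_pc, count=4):
    """PCs of consecutive 4-byte instructions in one packet (MISA.C clear)."""
    return [fetch_pc + 4 * i for i in range(count)]

# Every instruction in the packet shares the first PC's low two bits,
# so checking the first one covers the rest.
for first in (0x1000, 0x1002):
    flags = {fetch_misaligned(pc, misa_c=False) for pc in packet_pcs(first)}
    print(hex(first), flags)
```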


I'm willing to accept we can't throw exception on MISA.C, because it's not exceptional. An unaligned 4 byte is legal. The state change has to be preserved and only afterwards do we enter exceptional behavior.

I would like to better understand why xRET can't throw an exception (since I'm not in the Formal Group I haven't gotten to hear all of the push-back against this option), but off the top of my head it seems that there would be additional complexity of an environment return conflicting with an exception within the CSR File. There are a lot of side effects of both, which work in opposition to one another, so unless you can detect an excepting xRET on an earlier cycle, this seems problematic. Or maybe checking xEPC(1) && dec_is_xret is trivial for all designs?



TL;DR: This proposal makes me uneasy, especially when we have such a small number of RVC designs to analyze.


-Chris



Dan Hopper

Feb 16, 2018, 9:27:04 PM
to Christopher Celio, Jose Renau, Clifford Wolf, Andrew Waterman, RISC-V ISA Dev, Rishiyur Nikhil
Hi folks, 

(send attempt #2, sorry for the churn)

FWIW, my recommendation would be to define the mode switching behaviors rather than leave them undefined, and to do so in a way that doesn't impact hardware too much, but nevertheless makes the mode-switching a step function in the hardware.  Even if you have to flush the pipe to do so. (Not that pipe flushing on a MISA.C mode switch would be burdensome in terms of either gates or performance.  It's actually the normal procedure for many types of mode switching.)  So, I agree with Chris in that regard.

If the mode switch is a slow, gradual change of behavior, dependent upon the instruction stream, the hardware becomes more burdensome and harder to verify.

Besides, what's the point of having a mode bit to disable an optional feature, if the core with a feature disabled doesn't at least mostly behave like a core that doesn't have the feature in hardware? 

Cesar provided excellent examples in his email this afternoon.  I'd highlight #1 & #2 as particularly important.  Or a multitasking OS running a variety of programs that support a variety of optional processor modes. Sure, some mode bits like MISA.C might run a "legacy" program fine with the feature enabled, but there will be features where that is not the case, where perhaps there are orthogonal features or orthogonal behaviors that require (rather than merely allow) a "legacy" program to run with a particular feature disabled. 

Wrt the xEPC reg values and such, I don't think the hardware required to force bit 1 to zero upon MISA.C=0 mode switch is large or burdensome.  Mode bits such as these will already be available in the front and back-ends, so either the flop can be clocked and written to zero upon MISA.C=0 transition, or alternatively the bit 1 flop value can be masked by the value of MISA.C=0. 


One comparison that could be drawn here is to the defined behavior when writing MISA.MXL=1 (32-bit) on a RV64 core: "all operations must ignore source operand register bits above the configured XLEN, and must sign-extend results to fill the entire widest supported XLEN in the destination register. ... We require that operations always fill the entire underlying hardware registers with defined values to avoid implementation-defined behavior."

I would suggest this MISA.MXL mode-switching behavior spec, when translated to a MISA.C mode-switch, could be interpreted to mean that xEPC bits 1:0 must be ignored and read as zero when MISA.C=0, and bit 0 must be ignored and read as zero when MISA.C=1.   (But I'd also be fine with actually writing the bit to zero, if that ends up being the way it is defined.)
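
One way to picture the read-masking variant suggested above is the sketch below. It is an illustration only: the class and method names are hypothetical, and it models the "ignored and read as zero" option rather than actually writing the bit to zero.

```python
class Epc:
    def __init__(self):
        self.raw = 0                 # underlying flops, bit 1 included

    def hw_write(self, pc):
        self.raw = pc                # trap entry stores PC unmodified

    def read(self, misa_c):
        # Read-side mask: bit 0 always reads zero, bit 1 too when MISA.C = 0.
        return self.raw & (~0b01 if misa_c else ~0b11)

mepc = Epc()
mepc.hw_write(0x1002)                # trap taken at a halfword-aligned PC
print(hex(mepc.read(misa_c=False)), hex(mepc.read(misa_c=True)))
```

Because only the read path is masked, the stored bit 1 becomes visible again if MISA.C is set later; writing the flop to zero on the transition would instead lose it.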

Regards,
Dan

Clifford Wolf

Feb 17, 2018, 7:36:26 AM
to Christopher Celio, Jose Renau, Andrew Waterman, RISC-V ISA Dev, Rishiyur Nikhil
Hi,

On Fri, Feb 16, 2018 at 04:42:29PM -0800, Christopher Celio wrote:
> > - If PC is not 32-bit aligned; just continue executing, consuming 32b
> > instructions that are 16-bit aligned (since the CPU is already
> > capable of doing this to support C).
>
> I believe I have convinced myself that the parenthetical that underpins
> this proposal is not always true. If MISA.C is disabled, then the processor
> may no longer be capable of executing misaligned instructions. I am in
> particular concerned about superscalar processors where power becomes a
> concern. [...]

This is an interesting point that I have not considered. Thanks.

> I believe that MISA.C should flush the pipeline and fetch PC+4. (the
> effects of uncompressed is immediate from the ISA view and misalignment is
> forbidden after the MISA.C).
>
> I believe that ALL instructions should be checked for misalignment.

How could they become misaligned other than via MISA.C clear and xRET?

Branches/jumps should check the new PC before committing the instruction, so
PC never becomes misaligned.

And the xVEC registers don't allow addresses that aren't 32b aligned.

> The instruction after a misaligned MISA.C should throw a misaligned
> exception.

Shouldn't the instruction that clears MISA.C throw that exception?
(We are talking about the mcause=0 exception, right?)

> Likewise, the instruction at xEPC after an xRET should throw a misaligned
> exception.

Likewise, shouldn't the xRET throw that exception?

> I would like to better understand why xRET can't throw an exception (since
> I'm not in the Formal Group I haven't gotten to hear all of the push-back
> against this option),

There is no push-back against this from the formal group afaik. I'd be
perfectly fine with this solution.

I'm just paraphrasing the pushback I got from Andrew (mostly
communicated through Jacob) back in October.


Alternatively the MISA.C clear could throw an illegal instruction exception
when it is unaligned or any of the xEPC CSRs point to an unaligned
address. That proposal has also been rejected.


Or we can throw an exception for an unaligned MISA.C and just say changing
MISA.C will leave an unspecified value in the xEPC registers (with xEPC[0]
always clear and xEPC[1] clear if !MISA.C). I particularly like that one
because it allows an implementation to do a few things: It could just mask
xEPC[1] when MISA.C is cleared (that mask would stop when MISA.C is set
again, thus a *change* to MISA.C will mess with xEPC, not just clearing
it). Or it could also just reset all xEPC regs to their reset values. This
proposal of course was also rejected.


In both cases the reason given was that the hardware cost is too high.


> but off the top of my head it seems that there would
> be additional complexity of a environment return conflicting with an
> exception within the CSR File. There is a lot of side-effects of both,
> which work in opposition to one another, so unless you can detect an
> excepting xRET on an earlier cycle, this seems problematic. Or maybe
> checking xEPC(1) && dec_is_xret is trivial for all designs?
>
> TL, DR: This proposal makes me uneasy, especially when we have such a small
> number of RVC designs to analyze.

I think it is even worse because most of them don't have a writeable
MISA.C. :)

regards,
- clifford

Stef O'Rear

Feb 17, 2018, 7:53:10 AM
to Rishiyur Nikhil, RISC-V ISA Dev
On Thu, Feb 15, 2018 at 7:02 AM, Rishiyur Nikhil <nik...@bluespec.com> wrote:
> BACKGROUND:
> Consider a RISC-V implementation that supports the 'C'
> extension (Compressed Instructions), and can clear the MISA.C bit at
> runtime (thereby switching off C support dynamically). What should be
> the behavior if PC is not currently 32-bit aligned? What if return
> addresses and trap vectors are not 32-bit aligned?
>
> This is a proposal to specify the behavior precisely.
>
> PROPOSAL:
> When MISA.C is cleared,
>
> - If PC is not 32-bit aligned; just continue executing, consuming 32b
> instructions that are 16-bit aligned (since the CPU is already
> capable of doing this to support C).
>
> - If a trap occurs and MTVEC contains an address that is
> 16-bit aligned and not 32-bit aligned, just continue executing,
> consuming 32b instructions that are 16-bit aligned (since the CPU
> is already capable of doing this to support C).
>
> - All jump or branch instructions that attempt to jump to an address
> that is not 32-bit aligned will cause an Instruction address
> misaligned exception, per the current spec (exception is taken at
> the jump, not at the target). The return address is saved as is,
> even if it is not 32-bit aligned.
>
> - CSR* instructions that write to [MSU]EPC will clear the two LSBs of
> the written address, i.e. EPC will be 32-bit aligned.
>
> - When PC is copied to [MSU]EPC as part of trap handling, it is copied
> as-is even if it points to an address that is not 32-bit aligned.
>
>
>
> Please let us know if you see any problems with this.
>
> [Thanks to Clifford Wolf for articulating this first.]
>
> Rishiyur Nikhil (Chair) and the ISA Formal Spec Technical Group

I have no objections to this proposal. It has interactions with the
SBI VM World Switch call, but the latter has not been specced yet, so
that is OK.

-s

Allen J. Baum

Feb 17, 2018, 6:26:03 PM
to Clifford Wolf, Christopher Celio, Jose Renau, Andrew Waterman, RISC-V ISA Dev, Rishiyur Nikhil
At 1:36 PM +0100 2/17/18, Clifford Wolf wrote:
>
>> The instruction after a misaligned MISA.C should throw a misaligned
>> exception.
>
>Shouldn't the instruction that clears MISA.C throw that exception?
>(We are talking about the mcause=0 exception, right?)

If we could trap on the CSRW instruction, it would be ideal.
The problem is that condition isn't easily detected until the time that the instruction is retiring, making it a bit awkward. If we're going to do that, then perhaps we should also allow traps on integer overflow - pretty much the same problem (a bit of a false equivalency - you could detect it a pipestage earlier in this particular case, but close enough).
The instruction following would work also, though implementing that in an OOO superscalar machine gets ugly fast, according to Chris.

> > Likewise, the instruction at xEPC after an xRET should throw a misaligned
>> exception.
>
>Likewise, shouldn't the xRET throw that exception?

Maybe; that decision (should xRET throw the exception or the instruction at the return point) will have to be made regardless since the condition leading to that case can occur whether or not we allow executing unaligned code with MISA.C=0.

But stepping back - the real issue is whether we allow execution of non aligned code in a 32bit aligned environment, even for a little while.
I think most would agree it's ugly, but don't agree that the cost of fixing it is worthwhile. I'd like to see a much more thorough discussion of the costs (which may have happened & I just never knew about it, in which case point me to the thread).

Of course, that will just lead to a further discussion of "what is too much cost".




--
**************************************************
* Allen Baum tel. (908)BIT-BAUM *
* 248-2286 *
**************************************************

Jacob Bachmeyer

Feb 17, 2018, 6:49:37 PM
to Clifford Wolf, Christopher Celio, Rishiyur Nikhil, RISC-V ISA Dev
Clifford Wolf wrote:
>>> PROPOSAL:
>>> When MISA.C is cleared,
>>>
>>> (1) If PC is not 32-bit aligned; just continue executing, consuming 32b
>>> instructions that are 16-bit aligned (since the CPU is already
>>> capable of doing this to support C).
>>>
>>> (2) If a trap occurs and MTVEC contains an address that is
>>> 16-bit aligned and not 32-bit aligned, just continue executing,
>>> consuming 32b instructions that are 16-bit aligned (since the CPU
>>> is already capable of doing this to support C).
>>>
>>> (3) All jump or branch instructions that attempt to jump to an address
>>> that is not 32-bit aligned will cause an Instruction address
>>> misaligned exception, per the current spec (exception is taken at
>>> the jump, not at the target). The return address is saved as is,
>>> even if it is not 32-bit aligned.
>>>
>>> (4) CSR* instructions that write to [MSU]EPC will clear the two LSBs of
>>> the written address, i.e. EPC will be 32-bit aligned.
>>>
>>> (5) When PC is copied to [MSU]EPC as part of trap handling, it is copied
>>> as-is even if it points to an address that is not 32-bit aligned.
>>>
>
> Note that (3) and (4) are exactly what every RISC-V processor should already
> do when !MISA.C.
>
> Also note that (1), (2), and (5) in fact remove complexity. In these points
> we say that a processor should *not* add extra complexity in areas where
> some might assume that the spec requires them to add complexity. (But in
> fact the current spec doesn't say anything about what to do in those
> situations.)

Perhaps I am mistaken, but (1) and (2) seem to preclude an optimization
for aligned fetch that might otherwise be possible when RVC is disabled.

Could we instead require *tvec to be 32-bit aligned in all modes? (Even
RVC trap handlers must *begin* on a 32-bit word boundary?) This would
eliminate the need to even implement the low two bits of *tvec (or
permit them to be reserved as future flags in all cases). If *tvec must
be 32-bit aligned, then (1) can be simplified to "take instruction
misaligned trap".

Having an extra "ghost RVC mode" just feels too much like some of the
crazy magic on x86 for my taste.


-- Jacob

Stef O'Rear

Feb 17, 2018, 6:51:59 PM2/17/18
to Rishiyur Nikhil, RISC-V ISA Dev
On Thu, Feb 15, 2018 at 7:02 AM, Rishiyur Nikhil <nik...@bluespec.com> wrote:
> - If a trap occurs and MTVEC contains an address that is
> 16-bit aligned and not 32-bit aligned, just continue executing,
> consuming 32b instructions that are 16-bit aligned (since the CPU
> is already capable of doing this to support C).

According to https://cdn.rawgit.com/riscv/riscv-isa-manual/master/release/riscv-privileged-v1.10.pdf#page=36
mtvec and stvec already cannot hold addresses that are not 32-bit
aligned, so this section is moot I think.
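
To make that concrete: in privileged spec v1.10 the low two bits of mtvec hold the MODE field rather than address bits, so the trap base is always a multiple of 4. Below is a minimal Python sketch of that decoding, not a definitive model; the function names are invented for illustration, and delegation and the reserved MODE values are ignored.

```python
MTVEC_MODE_MASK = 0b11  # mtvec[1:0] is the MODE field (0 = Direct, 1 = Vectored)


def mtvec_base(mtvec: int) -> int:
    """Trap base address: mtvec with the MODE bits stripped (always 4-byte aligned)."""
    return mtvec & ~MTVEC_MODE_MASK


def trap_target(mtvec: int, cause: int, is_interrupt: bool) -> int:
    """PC after a trap into M-mode (hypothetical helper, simplified)."""
    base = mtvec_base(mtvec)
    vectored = (mtvec & MTVEC_MODE_MASK) == 1
    if vectored and is_interrupt:
        # Vectored mode: interrupts jump to BASE + 4 * cause.
        return base + 4 * cause
    # Direct mode, and all synchronous exceptions: jump to BASE.
    return base
```

Whatever value software writes, the target computed this way has bits 1:0 clear, which is why case (2) of the proposal cannot arise for mtvec or stvec.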

-s

Jacob Bachmeyer

Feb 17, 2018, 7:01:55 PM2/17/18
to Jose Renau, Rishiyur Nikhil, RISC-V ISA Dev
Jose Renau wrote:
> -If the MISA.C is cleared by a compressed instruction. The instruction
> following the MISA.C clear must be 32bit aligned or an illegal instruction
> exception is raised.

This is not possible: misa is a CSR and there are no CSR access
instructions in RVC. The C bit can only be adjusted by a 32-bit
instruction. Do you mean to require that that instruction be 32-bit
aligned?

On the other hand, might we all be missing the forest for the trees
here? The misa CSR is only accessible in M-mode. A monitor that
permits clearing misa.C cannot use RVC itself, otherwise the monitor
would risk illegal instruction exceptions if a part that uses RVC is
executed with misa.C clear. Therefore, mtvec must always be aligned to
a word boundary, execution at the point where misa.C is cleared can only
be 32-bit instructions on 32-bit alignment, and any possible return to
RVC code must cross privilege levels. MRET can trap for misaligned
instruction if mepc is not 32-bit aligned and misa.C is clear.


-- Jacob

Clifford Wolf

Feb 17, 2018, 7:27:07 PM2/17/18
to Allen J. Baum, Christopher Celio, Jose Renau, Andrew Waterman, RISC-V ISA Dev, Rishiyur Nikhil
Hi,

On Sat, Feb 17, 2018 at 03:25:54PM -0800, Allen J. Baum wrote:
> >> The instruction after a misaligned MISA.C should throw a misaligned
> >> exception.
> >
> >Shouldn't the instruction that clears MISA.C throw that exception?
> >(We are talking about the mcause=0 exception, right?)
>
> If we could trap on the CSRW instruction, it would be ideal.
> The problem is that condition isn't easily detected until the time that
> the instruction is retiring, making it a bit awkward. If we're going to
> do that, then perhaps we should also allow traps on integer overflow -
> pretty much the same problem (a bit of a false equivalency - you could
> detect it a pipestage earlier in this particular case, but close enough).

I think it would be ok for an MISA.C write to flush the pipeline. So it
should be far less problematic than something like an exception for
division by zero.

> The instruction following would work also, though implementing that in an
> OOO superscalar machine gets ugly fast, according to Chris.

Right now there is no exception for "this instruction is unaligned" afaiu,
only for "this instruction is trying to make PC unaligned" (mcause=0). So
it would require adding an extra exception just for this circumstance if
the exception should be thrown for the instruction following.

Btw, yet another possible solution would be to make MISA.C read-only when
PC[1] or any xEPC[1] bit is set. This would turn an attempt to clear MISA.C
into a NOP when it's not safe to disable MISA.C. (Which would match the
behavior of architectures that do not allow clearing MISA.C at all.)
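
A rough sketch of this "read-only while unsafe" idea, in Python. This is only an illustration under simplifying assumptions: misa is treated as a plain integer with no other WARL behavior, `epcs` is a hypothetical list of the current xEPC values, and `write_misa` is an invented name.

```python
MISA_C = 1 << 2  # 'C' is the third letter, so bit 2 of the misa Extensions field


def write_misa(misa: int, new_value: int, pc: int, epcs: list) -> int:
    """Return the misa value after a CSR write, freezing C while any bit 1 is set."""
    unsafe = bool(pc & 2) or any(epc & 2 for epc in epcs)
    if unsafe:
        # C behaves as read-only: keep the current C bit, accept the other bits.
        new_value = (new_value & ~MISA_C) | (misa & MISA_C)
    return new_value
```

With this rule a CSRW at a 32-bit aligned PC and with all xEPC[1] bits clear can clear C, while the same write from a misaligned PC leaves C set.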

regards,
- clifford

Jacob Bachmeyer

Feb 17, 2018, 7:46:53 PM2/17/18
to Clifford Wolf, Christopher Celio, Jose Renau, Andrew Waterman, RISC-V ISA Dev, Rishiyur Nikhil
Clifford Wolf wrote:
> Hi,
>
> On Fri, Feb 16, 2018 at 04:42:29PM -0800, Christopher Celio wrote:
>> The instruction after a misaligned MISA.C should throw a misaligned
>> exception.
>>
>
> Shouldn't the instruction that clears MISA.C throw that exception?
> (We are talking about the mcause=0 exception, right?)
>
>
>> Likewise, the instruction at xEPC after an xRET should throw a misaligned
>> exception.
>>
>
> Likewise, shouldn't the xRET throw that exception?
>
>
>> I would like to better understand why xRET can't throw an exception (since
>> I'm not in the Formal Group I haven't gotten to hear all of the push-back
>> against this option),
>>

I actually suggested having MRET raise an exception just earlier and
then read a message that reminded me why that cannot work: xRET cannot
raise an exception, because an exception at xRET destroys the state that
the xRET needs to operate. MRET jumps to <mepc> in <mstatus.MPP> mode.
If MRET traps instead, mepc will be overwritten with the address of MRET
and MPP set to M. Now the trap handler has to untangle the mess and
reconstruct the original mepc and mstatus values (OK, so the monitor
should have stashed these on the stack somewhere that is still (barely)
valid) before it can resume, although if misa.C is cleared and the
higher levels are relying on RVC support, resuming execution is not
actually possible -- a trap must be delivered (by software delegation)
instead.
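
The state loss can be shown with a small Python model. This is a sketch under assumptions: only pc, the privilege mode, mepc, mstatus.MPP and mcause are modelled, the mtvec value is arbitrary, and `mret_trapping_on_misaligned` is a hypothetical MRET variant, not anything the spec defines.

```python
from dataclasses import dataclass

M, S, U = 3, 1, 0  # privilege mode encodings


@dataclass
class Hart:
    pc: int
    priv: int
    mepc: int
    mpp: int
    mcause: int = -1
    mtvec: int = 0x100  # arbitrary handler address for the example


def take_trap(h: Hart, cause: int) -> None:
    """Trap into M-mode: mepc and MPP are overwritten with the current state."""
    h.mepc, h.mpp, h.mcause = h.pc, h.priv, cause
    h.priv, h.pc = M, h.mtvec


def mret_trapping_on_misaligned(h: Hart, misa_c: bool) -> None:
    """Hypothetical MRET that itself traps when mepc is misaligned and C is clear."""
    if not misa_c and (h.mepc & 2):
        take_trap(h, cause=0)  # mcause 0: instruction address misaligned
        return
    # Normal return (MIE/MPIE handling and the MPP reset are omitted).
    h.pc, h.priv = h.mepc, h.mpp
```

Starting from `Hart(pc=0x200, priv=M, mepc=0x8002, mpp=S)` with misa.C clear, the trapping MRET leaves mepc = 0x200 and MPP = M, so the original return address 0x8002 and mode S survive only if software saved them elsewhere.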

> I'm just paraphrasing the pushback I got from Andrew (mostly
> communicated through Jacob) back in October.
>

There must be another Jacob; I have checked my past emails and the only
pushback I gave you last October was concerns about possible patents on
bitwise parallel extract/deposit.


-- Jacob

Allen J. Baum

Feb 17, 2018, 8:28:15 PM2/17/18
to jcb6...@gmail.com, Clifford Wolf, Christopher Celio, Jose Renau, Andrew Waterman, RISC-V ISA Dev, Rishiyur Nikhil
At 6:46 PM -0600 2/17/18, Jacob Bachmeyer wrote:
>
>I actually suggested having MRET raise an exception just earlier and then read a message that reminded me why that cannot work: xRET cannot raise an exception, because an exception at xRET destroys the state that the xRET needs to operate. MRET jumps to <mepc> in <mstatus.MPP> mode. If MRET traps instead, mepc will be overwritten with the address of MRET and MPP set to M. Now the trap handler has to untangle the mess and reconstruct the original mepc and mstatus values (OK, so the monitor should have stashed these on the stack somewhere that is still (barely) valid) before it can resume, although if misa.C is cleared and the higher levels are relying on RVC support, resuming execution is not actually possible -- a trap must be delivered (by software delegation) instead.

Actually, I'm OK if this rare case takes effort to untangle or is even impossible to untangle.
I'm also OK with it causing a fatal error (e.g. NMI) or getting into an infinite loop.
"If it hurts when you do that, don't do that", and this is code whose alignment M-mode should ensure is correct to begin with.

I do want formal verification to be happy, and if either of those options makes them happy, I'm happy (well, for this particular case).

Clifford wrote:
>
>Btw, yet another possible solution would be to make MISA.C read-only when
>PC[1] or any xEPC[1] bit is set. This would turn an attempt to clear MISA.C
>into a NOP when it's not safe to disable MISA.C. (Which would match the
>behavior of architectures that do not allow clearing MISA.C at all.)

I think that is a really elegant solution.

Obeying platform restrictions (the CSRW that clears MISA.C must be aligned) would then simply work. Buggy code would do something weird but predictable.
And it's pretty cheap.
Very (very) slightly cheaper is if it only allows setting .C, but not clearing it (by a couple of gates).

Clifford Wolf

Feb 18, 2018, 6:04:29 AM2/18/18
to Jacob Bachmeyer, Andrew Waterman, RISC-V ISA Dev
On Sat, Feb 17, 2018 at 06:46:48PM -0600, Jacob Bachmeyer wrote:
> >I'm just paraphrasing the pushback I got from Andrew (mostly
> >communicated through Jacob) back in October.
>
> There must be another Jacob; I have checked my past emails and the
> only pushback I gave you last October was concerns about possible
> patents on bitwise parallel extract/deposit.

Yes. That was Jacob Chang from SiFive. ;)

Clifford Wolf

Feb 18, 2018, 6:07:18 AM2/18/18
to Allen J. Baum, jcb6...@gmail.com, Christopher Celio, Jose Renau, Andrew Waterman, RISC-V ISA Dev, Rishiyur Nikhil
On Sat, Feb 17, 2018 at 05:28:10PM -0800, Allen J. Baum wrote:
> Very (very) slightly cheaper is if it only allows setting .C, but not clearing it (by a couple of gates).

Unfortunately that's not possible: The spec says that the init value of
MISA must be the max feature set. So MISA.C must be already set on reset.

Bruce Hoult

Feb 18, 2018, 6:40:28 AM2/18/18
to Clifford Wolf, Allen J. Baum, Jacob Bachmeyer, Christopher Celio, Jose Renau, Andrew Waterman, RISC-V ISA Dev, Rishiyur Nikhil
Is that even possible, logically?

What if you have mutually-exclusive features? For example, maybe your processor supports C, but also supports another feature (perhaps private) that uses the 16-bit instruction space for something different -- perhaps vectors or GPU or ML or whatever.

Clifford Wolf

Feb 18, 2018, 7:18:49 AM2/18/18
to Bruce Hoult, Allen J. Baum, Jacob Bachmeyer, Christopher Celio, Jose Renau, Andrew Waterman, RISC-V ISA Dev, Rishiyur Nikhil
Hi,

On Sun, Feb 18, 2018 at 02:40:24PM +0300, Bruce Hoult wrote:
>>> Very (very) slightly cheaper is if it only allows setting .C, but not clearing
>>> it (by a couple of gates).
>>
>> Unfortunately that's not possible: The spec says that the init value of
>> MISA must be the max feature set. So MISA.C must be already set on reset.
>
> Is that even possible, logically?
>
> What if you have mutually-exclusive features? For example, maybe your
> processor supports C, but also supports another feature (perhaps private)
> that uses the 16-bit instruction space for something different -- perhaps
> vectors or GPU or ML or whatever.

Exactly my first thought when I read this in the spec.. :)

The exact quote from the spec:

At reset, the Extension field should contain the maximal set of supported
extensions, and I should be selected over E if both are available.

Afaiu I and E are the only exclusive standard extensions, so this case is
explicitly covered by the spec.

All that remains are conflicts between standard extensions and non-standard
extensions (that would not be represented by flags in MISA). My reading of
the spec is that in these cases the standard extensions should be enabled
and the conflicting non-standard extensions disabled on reset.

regards,
- clifford

Allen J. Baum

Feb 19, 2018, 2:02:23 AM2/19/18
to Clifford Wolf, jcb6...@gmail.com, Christopher Celio, Jose Renau, Andrew Waterman, RISC-V ISA Dev, Rishiyur Nikhil
I'm missing something here.

You suggested making MISA.C read-only when PC[1] or any xEPC[1] bit is set.
I suggested disallowing clearing it when PC[1] or any xEPC[1] bit is set,
as a (very) small saving of a couple of gates.
What does this have to do with the initial value at reset?
My assumption here is that the reset value is whatever the reset value is, and the only other changes can occur when a CSRW to MISA is executed.

In both cases (after reset with MISA.C=1) clearing MISA.C works if the CSRW instruction is 32bit aligned. The difference only shows up if you try to set MISA.C when the CSRW op is not aligned: in your proposal you can't, in my proposal (though I hesitate to even call it a proposal, more like an observation) you can.

So, are you saying even your suggestion violates the spec, or just mine?
And, in either case: what does that have to do with the reset value?

Clifford Wolf

Feb 19, 2018, 6:10:23 AM2/19/18
to Allen J. Baum, jcb6...@gmail.com, Christopher Celio, Jose Renau, Andrew Waterman, RISC-V ISA Dev, Rishiyur Nikhil
Hi,

On Sun, Feb 18, 2018 at 11:02:16PM -0800, Allen J. Baum wrote:
> At 12:07 PM +0100 2/18/18, Clifford Wolf wrote:
> >On Sat, Feb 17, 2018 at 05:28:10PM -0800, Allen J. Baum wrote:
> >> Very (very) slightly cheaper is if it only allows setting .C, but not clearing it (by a couple of gates).
> >
> >Unfortunately that's not possible: The spec says that the init value of
> >MISA must be the max feature set. So MISA.C must be already set on reset.
>
> I'm missing something here.
>
> You suggested making MISA.C read-only when PC[1] or any xEPC[1] bit is set.
> I suggested disallowing clearing it when PC[1] or any xEPC[1] bit is set,

Then I misunderstood you. I thought you suggested disallowing clearing it
at all, so one could switch from RVG to RVC but never back. (And that would
not work because MISA.C must be already active at reset when the processor
supports it.)

> In both cases (after reset with MISA.C=1) clearing MISA.C works if the
> CSRW instruction is 32bit aligned. The difference only shows up if you
> try to set MISA.C when the CSRW op is not aligned: in your proposal you
> can't, in my proposal (though I hesitate to even call it a proposal, more
> like an observation) you can.

I think our suggestions are identical because in this solution PC[1] or
xEPC[1] can only be set when MISA.C is already active. So it doesn't matter
if we say it's read-only or disallow clearing, as clearing would be the
only possible write operation anyway.
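
The equivalence can be checked by brute force. The Python sketch below enumerates every combination of the current C bit, PC[1], a single stand-in xEPC[1] bit, and the written C value. The function names are invented, and the `reachable` invariant (C clear implies PC[1] and xEPC[1] are 0) is an assumption taken from the argument above, not from the spec.

```python
from itertools import product


def read_only_when_unsafe(c, pc1, epc1, new_c):
    """Wording 1: misa.C is read-only while PC[1] or xEPC[1] is set."""
    return c if (pc1 or epc1) else new_c


def no_clear_when_unsafe(c, pc1, epc1, new_c):
    """Wording 2: only clearing misa.C is blocked while PC[1] or xEPC[1] is set."""
    if (pc1 or epc1) and new_c == 0:
        return c
    return new_c


def reachable(c, pc1, epc1):
    """Assumed invariant: with misa.C clear, PC[1] and xEPC[1] are always 0."""
    return c == 1 or (pc1 == 0 and epc1 == 0)


# Every (c, pc1, epc1, new_c) tuple where the two wordings give different results.
differences = [
    state for state in product((0, 1), repeat=4)
    if read_only_when_unsafe(*state) != no_clear_when_unsafe(*state)
]
# The subset of those that can actually occur under the invariant.
reachable_differences = [s for s in differences if reachable(*s[:3])]
```

The two rules differ only when C is currently clear, some bit 1 is set, and the write tries to set C. None of those states is reachable under the invariant, so `reachable_differences` is empty.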

regards,
- clifford

Allen J. Baum

Feb 19, 2018, 2:33:09 PM2/19/18
to Clifford Wolf, jcb6...@gmail.com, Christopher Celio, Jose Renau, Andrew Waterman, RISC-V ISA Dev, Rishiyur Nikhil
Good - I was hoping we were agreeing.

Andrew Waterman

Feb 20, 2018, 2:58:04 PM2/20/18
to Jose Renau, Clifford Wolf, Christopher Celio, RISC-V ISA Dev, Rishiyur Nikhil
This also seems like the cleanest solution to me: any attempt to execute an instruction with a misaligned PC raises a misaligned-instruction exception.  This is easier to explain and simpler to verify than the other proposed alternatives.  I withdraw my objection about increased cost from a few months ago, as it really only is a couple gates, and implementations that are pinching pennies can simply hard-wire misa.C.

On Thu, Feb 15, 2018 at 12:16 PM, Jose Renau <re...@ucsc.edu> wrote:

Bruce Hoult

Feb 20, 2018, 3:12:49 PM2/20/18
to Andrew Waterman, Jose Renau, Clifford Wolf, Christopher Celio, RISC-V ISA Dev, Rishiyur Nikhil
That's clean and simple .. at the moment.

I see two objections:

1) one of the proposed uses for disabling C was to trap and emulate 16 bit instructions (possibly with different semantics). That will be pretty big overhead if 16 bit instructions are common, but if misaligned 32 bit instructions trap too then it will be even worse!

2) This definition makes no sense in some future CPU that supports 48 bit instructions but not the C extension.

I'd suggest that the notion of "defined behaviour" would allow a solution such as:

On any given CPU, misaligned 32 bit instructions while MISA.C is cleared must EITHER:
a) always trap with a misaligned instruction exception, OR
b) always work correctly

A situation where an in-flight LB instruction ends up executing as a LW (or whatever it is that Rocket is doing) is clearly unacceptable.

--
You received this message because you are subscribed to the Google Groups "RISC-V ISA Dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to isa-dev+unsubscribe@groups.riscv.org.

To post to this group, send email to isa...@groups.riscv.org.
Visit this group at https://groups.google.com/a/groups.riscv.org/group/isa-dev/.

Stef O'Rear

Feb 20, 2018, 3:14:34 PM2/20/18
to Andrew Waterman, Jose Renau, Clifford Wolf, Christopher Celio, RISC-V ISA Dev, Rishiyur Nikhil
On Tue, Feb 20, 2018 at 11:57 AM, Andrew Waterman <and...@sifive.com> wrote:
> This also seems like the cleanest solution to me: any attempt to execute an
> instruction with a misaligned PC raises a misaligned-instruction exception.
> This is easier to explain and simpler to verify than the other proposed
> alternatives. I withdraw my objection about increased cost from a few
> months ago, as it really only is a couple gates, and implementations that
> are pinching pennies can simply hard-wire misa.C.

It seems quite inconsistent to me that we would have two possibilities
for misaligned instruction exceptions:

1. After a JAL or JALR; mepc points to the JAL or JALR

2. After a write to MISA; mepc points *one after* the MISA write

Unless you meant that the trap would be taken with mepc pointing *to*
the MISA write?

This also doesn't address what happens to the *EPC registers if bit 1
of any of them is nonzero.

My preferences, ranked:

1. csrw misa raises a misalignment (or illegal) exception if any of
pc[1], mepc[1], sepc[1], bsepc[1], or uepc[1] is 1

2. csrw misa is a no-op if any of the above are 1

3. the original proposal from Rishiyur Nikhil

-s

Stef O'Rear

Feb 20, 2018, 3:18:11 PM2/20/18
to Bruce Hoult, Andrew Waterman, Jose Renau, Clifford Wolf, Christopher Celio, RISC-V ISA Dev, Rishiyur Nikhil
On Tue, Feb 20, 2018 at 12:12 PM, Bruce Hoult <br...@hoult.org> wrote:
> That's clean and simple .. at the moment.
>
> I see two objections:
>
> 1) one of the proposed uses for disabling C was to trap and emulate 16 bit
> instructions (possibly with different semantics). That will be pretty big
> overhead if 16 bit instructions are common, but if misaligned 32 bit
> instructions trap too then it will be even worse!

Unfortunately we defined the mepc and sepc registers so that bit[1] is
hardwired to 0 if C is not supported. If you want to change that, it
needs to be in a new thread, because it is out of scope for this one.
With that behavior, misaligned 32 bit instructions cannot execute
normally if they include traps and trap returns.

> 2) This definition makes no sense in some future CPU that supports 48 bit
> instructions but not the C extension.

What I intend to propose if the matter comes up is that on hardware
without C, 48-bit instructions must be followed by a C.NOP and the
pair is treated as a unit.

-s

Andrew Waterman

Feb 20, 2018, 3:28:17 PM2/20/18
to Stef O'Rear, Jose Renau, Clifford Wolf, Christopher Celio, RISC-V ISA Dev, Rishiyur Nikhil
On Tue, Feb 20, 2018 at 12:14 PM, Stef O'Rear <sor...@gmail.com> wrote:
> On Tue, Feb 20, 2018 at 11:57 AM, Andrew Waterman <and...@sifive.com> wrote:
>> This also seems like the cleanest solution to me: any attempt to execute an
>> instruction with a misaligned PC raises a misaligned-instruction exception.
>> This is easier to explain and simpler to verify than the other proposed
>> alternatives.  I withdraw my objection about increased cost from a few
>> months ago, as it really only is a couple gates, and implementations that
>> are pinching pennies can simply hard-wire misa.C.
>
> It seems quite inconsistent to me that we would have two possibilities
> for misaligned instruction exceptions:
>
> 1. After a JAL or JALR; mepc points to the JAL or JALR
>
> 2. After a write to MISA; mepc points *one after* the MISA write
>
> Unless you meant that the trap would be taken with mepc pointing *to*
> the MISA write?

It is inconsistent but not actually problematic.

> This also doesn't address what happens to the *EPC registers if bit 1
> of any of them is nonzero.

No, it does address that case.  After executing xRET, you'll take a misaligned-instruction exception on the address that was stored in xEPC.

> My preferences, ranked:
>
> 1. csrw misa raises a misalignment (or illegal) exception if any of
> pc[1], mepc[1], sepc[1], bsepc[1], or uepc[1] is 1

Introducing a data-dependent trap on CSR writes is a new hazard, so this is not preferable.

It also makes it harder to store the xEPC registers in a RAM, as you'd like to do for very cheap implementations, since you need to access all the epc[1] bits at the same time.

> 2. csrw misa is a no-op if any of the above are 1

I think this option is preferable to your option 1, but it still has the drawback of needing access to all the epc[1] bits at the same time.

Stef O'Rear

Feb 20, 2018, 3:31:25 PM2/20/18
to Andrew Waterman, Jose Renau, Clifford Wolf, Christopher Celio, RISC-V ISA Dev, Rishiyur Nikhil
On Tue, Feb 20, 2018 at 12:27 PM, Andrew Waterman <and...@sifive.com> wrote:
> No, it does address that case. After executing xRET, you'll take a
> misaligned-instruction exception on the address that was stored in xEPC.

Does this mean that a supervisor can probe for "C present but possibly
disabled" by writing and then attempting to read back a misaligned
value from sepc?

-s

Christopher Celio

Feb 20, 2018, 3:33:59 PM2/20/18
to Stef O'Rear, Andrew Waterman, Jose Renau, Clifford Wolf, RISC-V ISA Dev, Rishiyur Nikhil
The proposal (as Jose and Andrew have iterated):

If RVC is disabled, an exception must be thrown on any misaligned 4-byte (or 8-byte) instruction. Clearly, if RVC is enabled or a 48b-extension is enabled, the alignment restriction of 4-byte/8-byte instructions is relaxed.

Allowing a "delay slot" for alignment requirements is not possible. If RVC is disabled, the RVC-expansion hardware is disabled and the processor will NOT work. Full stop.

Detecting misalignment is trivial in hardware. No extra flops, and like 10 gates of hardware.
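
As a software illustration only (it says nothing about the actual gate count), the fetch-time check reduces to looking at pc[1] when RVC is off. The function name is invented, and the sketch assumes pc[0] is always 0, as it is for any PC produced by RISC-V control-flow instructions.

```python
def instruction_address_misaligned(pc: int, rvc_enabled: bool) -> bool:
    """True when a fetch at pc should raise the misaligned exception (mcause = 0)."""
    # With RVC enabled any 2-byte-aligned pc is acceptable; with RVC disabled
    # pc[1] must also be zero, i.e. pc must be 4-byte aligned.
    return (not rvc_enabled) and bool(pc & 0b10)
```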


> Unless you meant that the trap would be taken with mepc pointing *to*
> the MISA write?

No, that's too much hardware. It is not feasible to throw the exception on MISA-write. The instruction to reset MISA.C is a CSRW that writes to address MISA register with a data-dependent mask that dictates it wants to clear the C-bit in MISA (so both the CSR address and the mask is data-dependent!). Likewise, the xRET would be another data-dependent exception. And in both instances, you must broadcast the xEPC bits to the decode/exception-catcher unit.


-Chris

Andrew Waterman

Feb 20, 2018, 3:38:24 PM2/20/18
to Stef O'Rear, Jose Renau, Clifford Wolf, Christopher Celio, RISC-V ISA Dev, Rishiyur Nikhil
"C or some other (n*32 + 16)-bit extension present but possibly disabled."

Whether sepc[1] should be writable when all such extensions are disabled is a related but separate question.

Cesar Eduardo Barros

Feb 20, 2018, 6:30:05 PM2/20/18
to Andrew Waterman, Jose Renau, Clifford Wolf, Christopher Celio, RISC-V ISA Dev, Rishiyur Nikhil
Em 20-02-2018 16:57, Andrew Waterman escreveu:
> This also seems like the cleanest solution to me: any attempt to execute
> an instruction with a misaligned PC raises a misaligned-instruction
> exception.  This is easier to explain and simpler to verify than the
> other proposed alternatives.  I withdraw my objection about increased
> cost from a few months ago, as it really only is a couple gates, and
> implementations that are pinching pennies can simply hard-wire misa.C.

But please, make sure to specify that, on xRET to a misaligned xEPC, the
misaligned instruction trap is taken AFTER the privilege level change!
Let's avoid repeating Intel's mistake
(https://blog.xenproject.org/2012/06/13/the-intel-sysret-privilege-escalation/).

That is, the privilege level change should always come before the
assignment from xEPC to PC.

--
Cesar Eduardo Barros
ces...@cesarb.eti.br

Andrew Waterman

Feb 20, 2018, 6:34:59 PM2/20/18
to Cesar Eduardo Barros, Jose Renau, Clifford Wolf, Christopher Celio, RISC-V ISA Dev, Rishiyur Nikhil
Yeah, the xRET itself retires, so the privilege mode is changed.  The exception occurs afterwards.
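
A small Python sketch of that ordering, for MRET only. It is an illustration under assumptions: the trap is not delegated, only pc, the privilege mode, mepc, mstatus.MPP and mcause are modelled, and the `Hart`, `mret` and `fetch` names are invented.

```python
from dataclasses import dataclass

M, S, U = 3, 1, 0  # privilege mode encodings


@dataclass
class Hart:
    pc: int
    priv: int
    mepc: int
    mpp: int
    mcause: int = -1
    mtvec: int = 0x100  # arbitrary handler address for the example


def mret(h: Hart) -> None:
    """MRET retires unconditionally: privilege and PC are updated first."""
    h.priv, h.pc = h.mpp, h.mepc


def fetch(h: Hart, misa_c: bool) -> bool:
    """Fetch-time check; a trap here is taken from the mode MRET restored."""
    if not misa_c and (h.pc & 2):
        h.mepc, h.mpp, h.mcause = h.pc, h.priv, 0  # mcause 0: misaligned
        h.priv, h.pc = M, h.mtvec
        return False
    return True
```

For `Hart(pc=0x200, priv=M, mepc=0x8002, mpp=U)` with misa.C clear, `mret` drops to U-mode at 0x8002 and the following `fetch` traps with mepc = 0x8002 and MPP = U, so the handler sees the fault as coming from the lower privilege mode.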

Cesar Eduardo Barros

Feb 20, 2018, 6:48:06 PM2/20/18
to Stef O'Rear, Bruce Hoult, Andrew Waterman, Jose Renau, Clifford Wolf, Christopher Celio, RISC-V ISA Dev, Rishiyur Nikhil
I would suggest the opposite: whenever 48-bit instructions (or any
instruction size that's not a multiple of 32 bits) are enabled, the
hardware must allow 16-bit instruction alignment, even when MISA.C is off.

Also, having C.NOP available without the C extension enabled is strange
and inconsistent, and hinders alternative 16-bit instruction sets which
might want to use that encoding for something else, or 32-bit
instruction sets using the RVC quadrants for custom extensions.

However, since realigning the instruction stream in the presence of
48-bit instructions can be useful, even when RVC is not present, I
propose a "CNOP" extension, which is a subset of RVC with only the
following three instructions: C.NOP, C.EBREAK, and C.ILLEGAL (all-zero).
Of course, it should be possible to disable it. (I wonder how useful
such an extension would be in practice).

Allen J. Baum

Feb 20, 2018, 11:47:57 PM2/20/18
to Christopher Celio, Stef O'Rear, Andrew Waterman, Jose Renau, Clifford Wolf, RISC-V ISA Dev, Rishiyur Nikhil
At 12:33 PM -0800 2/20/18, Christopher Celio wrote:
>
>No, that's too much hardware. It is not feasible to throw the exception on MISA-write. The instruction to reset MISA.C is a CSRW that writes to address MISA register with a data-dependent mask that dictates it wants to clear the C-bit in MISA (so both the CSR address and the mask is data-dependent!).

Quibble - the CSR address is a constant in the instruction, so not data dependent any more than an opcode is.

Richard Herveille

Feb 21, 2018, 5:01:16 AM2/21/18
to Stef O'Rear, Bruce Hoult, Andrew Waterman, Jose Renau, Clifford Wolf, Christopher Celio, RISC-V ISA Dev, Rishiyur Nikhil, Richard Herveille

 

>> 2) This definition makes no sense in some future CPU that supports 48 bit
>> instructions but not the C extension.
>
> What I intend to propose if the matter comes up is that on hardware
> without C, 48-bit instructions must be followed by a C.NOP and the
> pair is treated as a unit.

[rih] This makes no sense, on a CPU without C-extensions you want a mandatory C-extension instruction?

Also are you suggesting that all 48bit instructions must be followed by a C.NOP? That would effectively turn these into 64bit instructions. Then why not just use 64bit instructions?

What’s the advantage of 48bit instructions over 64bit instructions? Code density .. therefore if a system supports 48bit instructions, then that implies that instructions end up on 16bit boundaries. With similar issues as for the C-extensions.

Any effort to somehow avoid that seems odd.

Anyways, just my 2 cents…

Richard

 

John Hauser

Mar 7, 2018, 11:40:56 AM3/7/18
to RISC-V ISA Dev
I'm sorry I didn't find the time to weigh in on this earlier, but I see
that the latest draft ISA document has been updated, and I believe the
way this issue was resolved was a mistake.

For maximal consistency, any time IALIGN is 32, the hardware should
be required to act the same as an implementation that doesn't support
IALIGN = 16 at all.  The only exception to this rule should be the
possibility of changing misa to cause IALIGN to be 16.  Hence, any
time IALIGN is 32, the hardware should be required to treat bit 1 of
each of the *epc CSRs as hardwired to 0.  As a consequence, it should
under no circumstance be possible for an *RET instruction to cause an
instruction-address-misaligned trap.

A write to misa that causes IALIGN to change from 16 to 32 should have
the apparent effect of forcing bit 1 of each of the *epc CSRs to zero.
Hardware can force bit 1 of all the *epc CSRs simultaneously to zero
using an AND gate on that bit position that is applied when an *epc is
read for any reason.  (Actually, for many implementations, there will
already be an AND gate or its logical equivalent for all bits of all
CSRs, but I don't want to get into such details.)
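
A minimal Python sketch of that read-side mask follows. The `EpcFile` class and its fields are hypothetical; the sketch assumes the stored bit 1 is retained and merely hidden while IALIGN is 32, which is one way to get the "apparent effect" described above.

```python
class EpcFile:
    """Toy model of the *epc CSRs with bit 1 masked on reads when IALIGN = 32."""

    def __init__(self):
        self.raw = {"mepc": 0, "sepc": 0, "uepc": 0}
        self.ialign = 16  # 16 while C is enabled, 32 once it is disabled

    def write(self, name: str, value: int) -> None:
        # Bit 0 of an *epc register is always zero.
        self.raw[name] = value & ~0b1

    def read(self, name: str) -> int:
        # The "AND gate": one mask applied on every read, explicit or by *RET.
        mask = ~0b11 if self.ialign == 32 else ~0b1
        return self.raw[name] & mask
```

Because the mask depends only on IALIGN, switching IALIGN from 16 to 32 makes bit 1 of every *epc read as zero at once, without touching the individual registers.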

The hardware need not be concerned about the possibility that M-mode
software will write misa to change IALIGN from 16 to 32 and then execute
an MRET to return to code that requires IALIGN = 16.  Of course that
could happen, but it's only one of almost uncountable ways that M-mode
software can corrupt the system if it's being stupid, and we don't try
to catch the vast majority of those cases.

If I'm not mistaken, the only remaining question is how to treat a write
to misa that changes IALIGN from 16 to 32 when the CSR instruction
itself is not 32-bit aligned.  In my opinion, the answer most consistent
with other ISA requirements is to raise an instruction-address-
misaligned trap for the CSR instruction itself, while IALIGN is still
16, not for the instruction that follows it when IALIGN = 32.  This
choice, I believe, is most consistent with the way instruction-address-
misaligned traps are already handled for branch instructions, where the
trap is taken on the branch that would cause the misalignment, not the
subsequent misaligned instruction.

However, if others are convinced that this response is sometimes too
costly to implement, I'll be fine with requiring almost any other
behavior here, even to the point of causing the machine to lock up in
this circumstance, because it's once again something that should happen
only if M-mode software is being stupid.

Regards,

    - John Hauser

Andrew Waterman

Mar 7, 2018, 1:48:15 PM3/7/18
to John Hauser, RISC-V ISA Dev
On Wed, Mar 7, 2018 at 8:40 AM, John Hauser <jhause...@gmail.com> wrote:
> I'm sorry I didn't find the time to weigh in on this earlier, but I see
> that the latest draft ISA document has been updated, and I believe the
> way this issue was resolved was a mistake.
>
> For maximal consistency, any time IALIGN is 32, the hardware should
> be required to act the same as an implementation that doesn't support
> IALIGN = 16 at all. The only exception to this rule should be the
> possibility of changing misa to cause IALIGN to be 16. Hence, any
> time IALIGN is 32, the hardware should be required to treat bit 1 of
> each of the *epc CSRs as hardwired to 0. As a consequence, it should
> under no circumstance be possible for an *RET instruction to cause an
> instruction-address-misaligned trap.
>
> A write to misa that causes IALIGN to change from 16 to 32 should have
> the apparent effect of forcing bit 1 of each of the *epc CSRs to zero.
> Hardware can force bit 1 of all the *epc CSRs simultaneously to zero
> using an AND gate on that bit position that is applied when an *epc is
> read for any reason. (Actually, for many implementations, there will
> already be an AND gate or its logical equivalent for all bits of all
> CSRs, but I don't want to get into such details.)

This is one valid implementation under the proposed specification.


John Hauser

Mar 7, 2018, 2:27:30 PM
to RISC-V ISA Dev
I wrote:
> For maximal consistency, any time IALIGN is 32, the hardware should
> be required to act the same as an implementation that doesn't support
> IALIGN = 16 at all.  The only exception to this rule should be the
> possibility of changing misa to cause IALIGN to be 16.  Hence, any
> time IALIGN is 32, the hardware should be required to treat bit 1 of
> each of the *epc CSRs as hardwired to 0.  As a consequence, it should
> under no circumstance be possible for an *RET instruction to cause an
> instruction-address-misaligned trap.
> [...]

Andrew:
> This is one valid implementation under the proposed specification.

Is it?  I'll look again, but I'm not sure I would've realized that from
the current text.

Regardless, I'm saying it needs to be a requirement for _all_
implementations that have changeable IALIGN, not just an option for
some.  In fact, I strongly believe the Foundation must adopt a general
rule that, if a feature is disabled in misa, all of the rest of the
hardware (other than misa itself) acts as though that feature isn't
supported.  If the ISA spec says that an implementation without
support for feature X must hardwire certain register bits to 0, then
if feature X is disabled in misa, those register bits darn well better
appear to be hardwired to 0.

Why?  Because, by intention or by accident, there will be software
that depends on those bits being hardwired to 0 when feature X isn't
supported.  The ISA document guarantees this to be the case, so you
can't then cast blame at the programmers for taking you at your word.
Subsequently, you won't be doing implementers or programmers any favors
by giving implementations the freedom to only "sort-of" follow the same
rules when feature X is disabled using misa.

You know from experience that I'm not averse to having some things be
implementation-defined.  But it's another ball of wax to try to have
things be _conditionally_ implementation-defined, where the condition
isn't testable (or won't be tested) by the software that depends on it. 
That's not really going to work.

Regards,

    - John Hauser

Allen Baum

Mar 7, 2018, 3:57:23 PM
to John Hauser, RISC-V ISA Dev
Note that there is a difference between having an "and" gate on a bit, and clearing the bit (the difference being that if the feature is disabled, then re-enabled, the old value reappears).
But (wearing my compliance hat, and possibly a security hat),
     "if a feature is disabled in misa, all of the rest of the hardware (other than misa itself) acts as though that feature isn't supported"
sounds - superficially, at least - like a good idea. It probably only affects implementations where that feature can be enabled or disabled (MISA.C is the poster child for this).

This requires proper M-mode support; that is, M-mode code won't turn off .C if the CSRW is unaligned.
It could even check that the return address won't cause a problem before .C is turned off, and take action,
     e.g. faking an unaligned address fault as if it came from the return.
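
The difference between an AND gate on reads and actually clearing the bit can be sketched in a few lines of Python. This is an illustrative model only: `read_masked` and `clear_on_disable` are hypothetical names, and the epc value is made up.

```python
def read_masked(stored_epc: int, c_enabled: bool) -> int:
    """AND-gate design: the stored value is untouched, reads hide bit 1."""
    low_mask = 0b01 if c_enabled else 0b11
    return stored_epc & ~low_mask


def clear_on_disable(stored_epc: int) -> int:
    """Clearing design: bit 1 is physically zeroed when C is turned off."""
    return stored_epc & ~0b11


stored = 0x8000_0002  # written while C was still enabled

# AND gate: hidden while C is off, but the old value reappears afterwards.
hidden = read_masked(stored, c_enabled=False)      # 0x8000_0000
reappears = read_masked(stored, c_enabled=True)    # 0x8000_0002

# Clearing: once C has been disabled, the old bit 1 is gone for good.
gone = read_masked(clear_on_disable(stored), c_enabled=True)  # 0x8000_0000
```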


John Hauser

Mar 7, 2018, 6:53:50 PM
to RISC-V ISA Dev
Allen Baum wrote:
> Note that there is a difference between having an "and" gate on a bit, and clearing the bit (the difference being that if the feature is disabled, then re-enabled, the old value reappears).

You're absolutely right, and I tried to gloss over that point with
careful wording, to avoid getting sidetracked.  (Your Honor, I call to
the court's attention that I never actually said the register bits would
get written with zeros, only that this would be the "apparent effect"
when IALIGN = 32.)

The issue you raise concerns what software can expect to see in those
bits after changing IALIGN from 32 to 16.  That's a related question to
be pinned down, possibly as "unspecified" in this case.  But the answer
to that question won't impact how the hardware must behave when IALIGN
is 32, unless the ISA specification is going to be changed to make it
always explicitly implementation-dependent whether bit 1 of each *epc
register is writable when IALIGN = 32.

Regards,

    - John Hauser

Jacob Bachmeyer

Mar 7, 2018, 10:34:06 PM
to Allen Baum, John Hauser, RISC-V ISA Dev
Allen Baum wrote:
> This requires proper M-mode support, that is: it won't turn off .C if
> the CSRW is unaligned,
> It could even ensure that the return address won't cause a problem
> even before it is turned off and take action,
> e.g. faking an unaligned address fault as if it came from the return.

Again, a monitor that permits RVC to be disabled must not itself use or
depend on RVC, so *no* instructions in the monitor can be on 16-bit
alignment -- *every* instruction in the monitor must be 32-bit aligned.
The monitor *cannot* use RVC itself if it permits RVC to be disabled.
Perhaps introducing an sisa CSR that only affects U-mode might be a
better option here?


-- Jacob

Michael Clark

Mar 8, 2018, 2:58:39 AM
to jcb6...@gmail.com, Allen Baum, John Hauser, RISC-V ISA Dev


> On 8/03/2018, at 4:34 PM, Jacob Bachmeyer <jcb6...@gmail.com> wrote:
>
> Perhaps introducing an sisa CSR that only affects U-mode might be a better option here?

I like the sisa idea, combined with making misa readonly (without renumbering misa for backwards compatibility).

This way misa always holds the complete set of extensions and there is no necessity for a hidden register or logic containing the full misa mask. That way one only has to worry about sepc when sisa.C is disabled.

The rationale would be that the ability to disable C would only be desired in the Application processor platform. i.e. if one wanted sandboxes that avoided half word instruction embeddings.

It’s logical that a processor that supports C will have a monitor using the C extension and the utility is for Supervisor sandboxes perhaps on a per process basis with the Supervisor setting and clearing sisa.C on context switches.

Michael

Allen J. Baum

Mar 8, 2018, 6:08:48 PM
to Michael Clark, jcb6...@gmail.com, John Hauser, RISC-V ISA Dev
Why not just trap MISA and return the value that you want? It's not like you're going to be executing it a lot, so speed doesn't matter.

Michael Clark

Mar 8, 2018, 6:45:07 PM
to Allen J. Baum, jcb6...@gmail.com, John Hauser, RISC-V ISA Dev


> On 9/03/2018, at 12:08 PM, Allen J. Baum <allen...@esperantotech.com> wrote:
>
> Why not just trap MISA and return the value that you want? It's not like you're going to be executing it a lot, so speed doesn't matter.

That doesn’t help because the idea is to switch to a mode that actually enforces the 4-byte alignment and disables RVC decoding, so returning the value you want doesn’t mean the no-RVC constraints are enforced.

What Jacob has pointed out is salient, very much like SXL/UXL. The monitor on these systems is likely going to be compiled with RVC.

Sure, sisa could be implemented in the monitor on hardware with writable misa, or there could be an SBI interface that could check constraints much like microcode.

sisa could be the interface that adds the additional constraints on sepc et al, leaving misa the leeway to operate in an ‘unreal’ mode until it hits aligned code, or a branch, on hardware that allows that behaviour. sisa being implemented in a monitor with trap and emulate could verify the registers in RISC-V monitor code.

In any case, I like the sisa idea, but it would need hardware to relax the alignment in M mode and enforce the alignment in S mode, i.e. it’s not an easy fix; it’s actually an ask to make the misa flags specific to privilege modes.

The question one has to ask is when and why does someone want to disable RVC? Is it necessary in the MCU case in M mode? or is it something someone would want to do when colocating and sandboxing multiple processes on an application processor? Or both.

sisa is worth considering. We need to think about feature detection in the platform profiles anyhow, as there is no way for code to detect whether a system with misa.S set has an MMU. There is no requirement to have an MMU in order to implement S mode, however unlikely such a configuration would be.

We’d actually have to specify somewhere that processors implementing S mode must implement at least one non-zero satp mode. Until then, it’s ambiguous.

misa/sisa/uisa would allow Supervisor and User code to substitute code at runtime in much the same way that CPUID is allowed to be executed in unprivileged code on other architectures. Orthogonal to the misa.C issue is runtime feature detection for User code (which obviously can’t call SBI as that is a Supervisor interface).
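
For reference, a small Python sketch of what a misa-style register can and cannot tell software. It assumes the standard layout in which bit 0 of the Extensions field is 'A' and bit 25 is 'Z'; `misa_extensions` is a hypothetical helper, and sisa/uisa do not exist today, so they are not modeled.

```python
import string


def misa_extensions(misa: int) -> str:
    """Return the letters whose bits are set in the Extensions field (bits 25:0)."""
    return "".join(
        letter
        for bit, letter in enumerate(string.ascii_uppercase)
        if misa & (1 << bit)
    )


# Example value with A, C, D, F, I, M, S and U set.
letters = misa_extensions(0x14112D)  # "ACDFIMSU"
```

Nothing in the decoded letters says whether an MMU is present behind the 'S' bit, which is the detection gap described above.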

It should be easy. It doesn’t have to be too quick on S mode processors that implement something like sisa or uisa via trap and emulate.

The beauty of CSRs (like MSRs on other architectures, and the User-accessible performance monitor counters for things like ‘perf’) is that they are consistent no matter which OS environment you are running, i.e. the programmer has one architecture-specific method versus several OS-specific methods to probe for processor features.


Andrew Waterman

Mar 13, 2018, 8:41:13 PM
to John Hauser, RISC-V ISA Dev
I agree with this sentiment, which is why under this proposal, there
is no longer a statement that mepc[1] is guaranteed to be 0 when
IALIGN=32.

In any case, it was already implementation-defined which bits of mepc
were writable. Systems with limited address spaces are permitted, but
not required, to hardwire some of these bits. Under this proposal,
mepc[1] is no different than the high-order bits.


John Hauser

Mar 16, 2018, 11:14:26 PM
to RISC-V ISA Dev
I wrote:
> In fact, I strongly believe the Foundation must adopt a general
> rule that, if a feature is disabled in misa, all of the rest of the
> hardware (other than misa itself) acts as though that feature isn't
> supported.  If the ISA spec says that an implementation without
> support for feature X must hardwire certain register bits to 0, then
> if feature X is disabled in misa, those register bits darn well better
> appear to be hardwired to 0.

Andrew:

> I agree with this sentiment, which is why under this proposal, there
> is no longer a statement that mepc[1] is guaranteed to be 0 when
> IALIGN=32.

Forgive me if I overlooked that concurrent change.  In that case, you're
right; there isn't the overt inconsistency I claimed.

Nevertheless, I will still argue that allowing bit 1 of the *epc CSRs to
be nonzero when IALIGN = 32 was unwise.  This change ought to be undone,
so that *RET instructions once again cannot cause an instruction address
misaligned trap.


> In any case, it was already implementation-defined which bits of mepc
> were writable.  Systems with limited address spaces are permitted, but
> not required, to hardwire some of these bits.  Under this proposal,
> mepc[1] is no different than the high-order bits.

Now that just can't be correct.  The spec previously said:

    On implementations that do not support instruction-set extensions
    with 16-bit instruction alignment, the two low bits (mepc[1:0]) are
    always zero.

If, despite those words, it's true that implementations were previously
free _not_ to make the two low bits zero when IALIGN = 32, then I can
only conclude that the document doesn't always mean what it appears to
say.  You're suggesting that implementations are allowed to contravene
the clear words of the specification in certain ways, yet without the
document giving sufficient guidance for readers to know exactly what
deviations are allowed and which are not.

Regards,

    - John Hauser

Andrew Waterman

Mar 17, 2018, 1:45:16 AM
to John Hauser, RISC-V ISA Dev
I’m not trying to be opaque here. I’m saying that our proposal revises this part of the spec to remove that guarantee.





John Hauser

Mar 17, 2018, 2:54:05 AM
to RISC-V ISA Dev
> I’m not trying to be opaque here. I’m saying that our proposal revises
> this part of the spec to remove that guarantee.

Hold on, Andrew, you said more than that.  You wrote:

> In any case, it was already implementation-defined which bits of mepc
> were writable.

Me:

> Now that just can't be correct.  The spec previously said:
>
>     On implementations that do not support instruction-set extensions
>     with 16-bit instruction alignment, the two low bits (mepc[1:0]) are
>     always zero.

Trying to give you the benefit of the doubt, I suppose you meant that,
for _some_ of the bits of mepc, it was already implementation-defined
that they might be writable.  And now you've simply expanded that set
of implementation-defined bits.  Is that what you meant?

Despite this detour, I remain convinced it's a mistake to allow bit 1
of the *epc CSRs to be nonzero when IALIGN = 32.  Whatever convenience
this is supposed to provide, it's overwhelmed by the need to allow for
the possibility that the *RET instructions could cause an instruction
address misaligned trap.

Regards,

    - John Hauser

Andrew Waterman

Mar 17, 2018, 3:05:25 AM
to John Hauser, RISC-V ISA Dev
xRET doesn’t trap under the new proposal. As discussed earlier in this thread, that’s a very bad idea. The trap is raised on the subsequent instruction’s fetch, i.e., after the instruction has retired.





Andrew Waterman

Mar 17, 2018, 3:07:45 AM
to John Hauser, RISC-V ISA Dev
On Sat, Mar 17, 2018 at 12:05 AM Andrew Waterman <wate...@eecs.berkeley.edu> wrote:
> On Fri, Mar 16, 2018 at 11:54 PM John Hauser <jhause...@gmail.com> wrote:
> > > I’m not trying to be opaque here. I’m saying that our proposal revises
> > > this part of the spec to remove that guarantee.
> >
> > Hold on, Andrew, you said more than that.  You wrote:
> >
> > > In any case, it was already implementation-defined which bits of mepc
> > > were writable.
> >
> > Me:
> >
> > > Now that just can't be correct.  The spec previously said:
> > >
> > >     On implementations that do not support instruction-set extensions
> > >     with 16-bit instruction alignment, the two low bits (mepc[1:0]) are
> > >     always zero.
> >
> > Trying to give you the benefit of the doubt, I suppose you meant that,
> > for _some_ of the bits of mepc, it was already implementation-defined
> > that they might be writable.  And now you've simply expanded that set
> > of implementation-defined bits.  Is that what you meant?
>
> Rather than giving me the benefit of the doubt, why don’t you take a look at the repo?
>
> > Despite this detour, I remain convinced it's a mistake to allow bit 1
> > of the *epc CSRs to be nonzero when IALIGN = 32.  Whatever convenience
> > this is supposed to provide, it's overwhelmed by the need to allow for
> > the possibility that the *RET instructions could cause an instruction
> > address misaligned trap.
>
> xRET doesn’t trap under the new proposal. As discussed earlier in this thread, that’s a very bad idea. The trap is raised on the subsequent instruction’s fetch, i.e., after the instruction has retired.

That last sentence is missing a word; it should end with “after the xRET instruction has retired.”

lkcl .

Mar 18, 2018, 7:15:04 AM
to John Hauser, RISC-V ISA Dev
On Wed, Mar 7, 2018 at 7:27 PM, John Hauser <jhause...@gmail.com> wrote:

> You know from experience that I'm not averse to having some things be
> implementation-defined. But it's another ball of wax to try to have
> things be _conditionally_ implementation-defined, where the condition
> isn't testable (or won't be tested) by the software that depends on it.
> That's not really going to work.

it's abbbsolutely critical that a standard have nothing that's
"optional". if the word "or" is either implicit or explicit it is a
critical mistake. the only way for a standard to have "options" is if
they are [forever and in perpetuity] backwards-compatible negotiated
[at the hardware and/or software level]. that means planning ahead
with the ability to extend (reserved).

good examples of backwards-compatibility include SD/MMC which can do
1, 2, 4 or 8 data lines and the controller can auto-negotiate / detect
the capabilities of the card. PCIe, USB, SATA: they all have
auto-negotiation of capabilities because you have no idea what's going
to be plugged in.

an example of a failed opportunity for a standard is X25, where they
allowed a hardware control line *OR* a software control "escape" code.
given that one end had absolutely no way of knowing if the other end
had the hardware control line connected or not, they *always* had to
implement software-defined escape sequencing... making the hardware
control line totally redundant... and thus they could have put in a
transmit clock line in its place [X25 required a $15 external clock
box instead of a passive $1 cable like RS232].

please for goodness sake treat implicit *and* explicit "or" (any kind
of ambiguity or any kind of double or greater options) in the RISC-V
ISA standard as a really serious mistake that prevents the standard
from being ratified until it's corrected.

l.

John Hauser

Mar 21, 2018, 8:24:38 PM
to RISC-V ISA Dev
After further private discussions with Andrew Waterman and Krste
Asanovic, the three of us have agreed to back a different way to
handle changes to IALIGN.  Andrew recently committed an update to the
privileged spec; the diffs can be seen here:
https://github.com/riscv/riscv-isa-manual/commit/0472bcdd166f45712492829a250e228bb45fa5e7

In summary:

There are no longer any new instruction-address-misaligned traps.  Only
jumps/branches can cause such a trap, as originally.

If a CSR instruction attempts to write to misa in such a way that IALIGN
would change from 16 to 32, yet the following instruction isn't 32-bit-
aligned, then the write to misa is suppressed entirely.  misa is thus
unchanged and IALIGN remains 16.  (This is analogous to how writes to
satp are ignored when they would result in an unsupported MODE setting.)
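
A minimal Python sketch of the suppressed-write rule as summarized above. `write_misa` is a hypothetical name, only the C bit is modeled as affecting IALIGN, and the instruction length defaults to 4 bytes because CSR instructions have no compressed encoding.

```python
MISA_C = 1 << 2  # 'C' is bit 2 of the misa Extensions field


def write_misa(misa: int, new_misa: int, pc: int, insn_len: int = 4) -> int:
    """Return the value misa holds after a CSR write executed at `pc`."""
    next_pc = pc + insn_len
    wants_ialign32 = bool(misa & MISA_C) and not (new_misa & MISA_C)
    if wants_ialign32 and next_pc % 4 != 0:
        # The whole write is dropped, not just the C bit; no trap is raised.
        return misa
    return new_misa


MISA_M = 1 << 12  # 'M' bit, used only to show that other fields are kept too
unchanged = write_misa(MISA_C | MISA_M, 0, pc=0x1002)   # stays MISA_C | MISA_M
accepted = write_misa(MISA_C | MISA_M, MISA_M, pc=0x1000)  # becomes MISA_M
```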

When IALIGN = 32, the bottom two bits of each *epc register always
appear as zeros.  If an implementation supports both IALIGN = 16 and
IALIGN = 32 (determined by misa), then bit 1 of each *epc register is
writable, but, when IALIGN = 32, the bit is masked on reads so as to
appear to be zero.  Thus *epc bit 1 is updated on writes but is hidden
on reads until IALIGN is 16 again.

These choices were made to provide behavior that is both sufficient and
maximally consistent across different implementations, with the least
demands on hardware.

Regards,

    - John Hauser

Brian Case

Mar 22, 2018, 5:47:33 PM
to RISC-V ISA Dev
So, I'm new here, and I haven't read the whole thread, so ignore me if I'm being ignorant or stupid, but I think this is slightly, uh, less than desirable.

I don't think this behavior as currently proposed will cause the sun to go supernova prematurely or RISCV to fail in any way, but:

A program with a change to MISA.C that would leave the PC misaligned (from the 32-bit point of view) is indicative of malformed, dysfunctional code, and it's in M-mode no less (yikes!), if I understand correctly.

Such code, being broken, should cause an orderly "machine check" exception or whatever you want to call it.

Yes, I realize that the code containing this errant change to MISA.C (and the disobeyance of its semantics) will eventually fall off the cliff somehow (most likely with a jump/branch that will trigger the misaligned exception). It just seems unhelpful at a minimum to wait until this bad branch happens when it's the MISA.C change instruction that was at fault (and the programmer at the root). Probably, the code between the MISA.C change and the bad branch will do little damage, but for sure?

I'm *still* unconvinced that code would ever need to change the MISA.C bit on anything other than a trap/exception enter or return. At a minimum, we can make it this way by decree (unless I miss some detail that makes this undesirable).

Do we really want to encourage an M-mode hyper-hypervisor to be programmed in both C and non-C modes? (Sorry if the answer is already understood to be a resounding "yes;" maybe someone can edumacate me. :)

My first $0.02,

-bcase

John Hauser

Mar 22, 2018, 8:26:27 PM
to RISC-V ISA Dev
Brian Case wrote:
> A program with a change to MISA.C that would leave the PC misaligned (from the 32-bit point of view) is indicative of malformed, dysfunctional code, and it's in M-mode no less (yikes!), if I understand correctly.
>
> Such code, being broken, should cause an orderly "machine check" exception or whatever you want to call it.

Such code is broken, yes, and the hardware could in principle catch this
case, but it does not follow that the hardware should be built to do
so.  RISC principles dictate that you don't just throw in all desirable
hardware features; rather, you're supposed to evaluate possible features
for whether their benefits outweigh their costs.

It's always nice if the hardware can help catch bugs, but it should
not be forgotten that for every bug condition that might be tested and
trapped by the hardware there are literally thousands if not millions
of bug causes that the hardware cannot possibly detect for you.  The
particular kind of bug at issue here (a write to misa that increases
IALIGN when the instruction doing the write isn't sufficiently aligned)
would be extremely rare among all bugs, and also, as you noted, has
a high probability of being detected during testing, whether the
machine traps or not.  Weighing the possible advantage of an almost
infinitesimally small improvement to bug detection against the known
physical cost to every RISC-V implementation that supports changeable
IALIGN, for myself I can say without question that the trap isn't
justified.

Regards,

    - John Hauser

Allen Baum

Mar 22, 2018, 10:49:22 PM
to John Hauser, RISC-V ISA Dev
My primary concern is security: is there some way that this feature can be used to break security?
The first part of this change is that clearing MISA.C from an unaligned PC will be silently ignored.
M-mode code can easily use a macro to do this that enforces the correct behavior.

It should also probably terminate any process to which it will return that has xEPC[1] set and MISA.C clear.
That's a little trickier, because it can't simply read xEPC[1] unless it is in compressed mode.
So, it has to recognize that case, possibly on any return
(but maybe on just a few: when it is returning directly or indirectly from code that asked for a change;
 in that context switch case, you could terminate before switching).

There are other scenarios where xEPC is being saved / restored from memory - those must be done in compressed mode.
There can be a macro to do that also.

So there's some work to be done to ensure M-mode code doesn't have bugs.
These particular scenarios are simple enough and likely occur infrequently enough in the code that formal verification might be used to ensure it isn't broken.
I'm not generally happy when code required for security gets more complicated, but this one has rules that are clear enough that I'm OK with it.
Not perfect, but low cost and workable.






Clifford Wolf

Mar 24, 2018, 9:30:18 AM
to John Hauser, RISC-V ISA Dev
Hi,

On Wed, Mar 21, 2018 at 05:24:38PM -0700, John Hauser wrote:
> When IALIGN = 32, the bottom two bits of each *epc register always
> appear as zeros. If an implementation supports both IALIGN = 16 and
> IALIGN = 32 (determined by misa), then bit 1 of each *epc register is
> writable, but, when IALIGN = 32, the bit is masked on reads so as to
> appear to be zero. Thus *epc bit 1 is updated on writes but is hidden
> on reads until IALIGN is 16 again.

I have a problem with this because it adds hidden state that can be
modified but not observed in IALIGN=32.

Imo the new value of *epc after setting misa.c should be completely
predictable from the state observable before performing the misa write.

I think *epc[1] should either be cleared when switching to IALIGN=32 and
then bit 1 should not be writeable until IALIGN=16 again.

Or alternatively *epc[1] should be ignored by *RET but still be observable
when reading the CSR using the explicit CSR read instructions.

But instructions that modify a non-observable state bit, as proposed here,
is a verification nightmare imho.
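
The concern can be shown with a short Python sketch of the quoted rule (bit 1 always written, masked on reads while IALIGN = 32). The class `Epc` and its methods are hypothetical, purely for illustration, and the addresses are made up.

```python
class Epc:
    """Model of the quoted rule: bit 1 is stored on writes, hidden on reads."""

    def __init__(self) -> None:
        self._stored = 0

    def write(self, value: int) -> None:
        self._stored = value & ~0b1  # bit 0 is always zero

    def read(self, ialign: int) -> int:
        mask = 0b11 if ialign == 32 else 0b01
        return self._stored & ~mask


a, b = Epc(), Epc()
a.write(0x1000)  # both writes happen while IALIGN = 32
b.write(0x1002)

same_at_32 = a.read(32) == b.read(32)    # True: no read can tell them apart
differ_at_16 = a.read(16) != b.read(16)  # True: the hidden bit resurfaces
```

Two machines that look identical through every CSR read at IALIGN = 32 end up with different *epc values once IALIGN returns to 16, which is the unobservable state being objected to.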

> These choices were made to provide behavior that is both sufficient and
> maximally consistent across different implementations, with the least
> demands on hardware.

Ignoring *epc[1] in *RET instructions when IALIGN=16, and masking *epc[1] in
writes to the CSR when IALIGN=16 but reading it back set when it is already
set, would also be sufficient and equally maximally consistent and would place
the same or even fewer demands on the hardware. But it would not suffer from
the hidden state bit problem.

regards,
- clifford

Dan Hopper

Mar 24, 2018, 11:04:45 PM
to Clifford Wolf, John Hauser, RISC-V ISA Dev
Hi folks,

I guess I have two problems with the current proposed changes. I agree with Clifford that *epc[1] should not be modifiable when IALIGN=32-bits. If *epc[1] is hard-wired to zero on an IALIGN=32-only CPU, then a more capable CPU in IALIGN=32 mode should have *exactly* the same read/write behavior for that register (or any other similar arch state).  This is no different than not being able to modify bits 63:32 of r1-31 of a 64-bit core running in 32-bit mode. This does mean the old value will magically re-appear if IALIGN is later changed back to 16-bits. This is not problematic behavior.

I also have a problem with the proposal that the alignment of instruction N+1 will reach backwards in time and conditionally affect how instruction N behaves, if N is a particular CSR write.  That's a little bizarre, at least in wording.  Yeah, I know the endbyte address of instr. N will tell you what you need to know, since the CSR write isn't a control transfer. I guess you could re-state it as a constraint on the alignment of endbyte+1 and it wouldn't sound as wacky, and wouldn't imply that I-fetch needs to do anything at all with respect to instr N+1 itself.

But the way it's worded aside, are there any other cases where decode needs to check the endbyte+1 alignment when cracking an instruction?  If there are not (and I haven't come across any yet), then IMO we shouldn't introduce one here, either, because that's more decode hardware - detecting the endbyte alignment error and suppressing a lot of decoded fields so that the CSR write turns into a NOP.

Thanks,
Dan Hopper
On 03/21/2018 07:24 PM, John Hauser wrote:
After further private discussions with Andrew Waterman and Krste 
Asanovic, the three of us have agreed to back a different way to 
handle changes to IALIGN.  Andrew recently committed an update to the
privileged spec; the diffs can be seen here:
https://github.com/riscv/riscv-isa-manual/commit/0472bcdd166f45712492829a250e228bb45fa5e7

In summary:

There are no longer any new instruction-address-misaligned traps.  Only 
jumps/branches can cause such a trap, as originally.

If a CSR instruction attempts to write to misa in such a way that IALIGN 
would change from 16 to 32, yet the following instruction isn't 32-bit-
aligned, then the write to misa is suppressed entirely.  misa is thus
unchanged and IALIGN remains 16.  (This is analogous to how writes to 
satp are ignored when they would result in an unsupported MODE setting.)

When IALIGN = 32, the bottom two bits of each *epc register always 
appear as zeros.  If an implementation supports both IALIGN = 16 and
IALIGN = 32 (determined by misa), then bit 1 of each *epc register is
writable, but, when IALIGN = 32, the bit is masked on reads so as to 
appear to be zero.  Thus *epc bit 1 is updated on writes but is hidden
on reads until IALIGN is 16 again.

These choices were made to provide behavior that is both sufficient and 
maximally consistent across different implementations, with the least
demands on hardware.

Regards,

    - John Hauser


Alex Elsayed

Mar 25, 2018, 2:56:41 AM3/25/18
to RISC-V ISA Dev
On Sat, Mar 24, 2018, 20:04 Dan Hopper <dho...@tesla.com> wrote:
Hi folks,


I also have a problem with the proposal that the alignment of instruction N+1 will reach backwards in time and conditionally affect how instruction N behaves, if N is a particular CSR write.  That's a little bizarre, at least in wording.  Yeah, I know the endbyte address of instr. N will tell you what you need to know, since the CSR write isn't a control transfer. I guess you could re-state it as a constraint on the alignment of endbyte+1 and it wouldn't sound as wacky, and wouldn't imply that I-fetch needs to do anything at all with respect to instr N+1 itself.

It's worth noting that RVC does not add any 16-bit CSR manipulation instructions, and as such, if the CSR write itself is aligned its succeeding instruction will be as well.

This avoids any time-travel or endbyte+1 complexity.


Clifford Wolf

Mar 25, 2018, 8:19:21 AM3/25/18
to Dan Hopper, John Hauser, RISC-V ISA Dev
Hi,

On Sat, Mar 24, 2018 at 10:04:33PM -0500, Dan Hopper wrote:
> [..] This does mean the old value will magically re-appear if IALIGN is
> later changed back to 16-bits. This is not problematic behavior.

In my opinion it _is_ problematic.

If the value of *epc after setting misa.C is not predictable from the state
observable before setting misa.C then it becomes much harder to verify that
an implementation is implementing setting of misa.C correctly, especially
if you want to do it formally from a symbolic starting state.
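
To make the concern concrete, here is a minimal sketch in Python. It assumes only that bit 1 of *epc is retained in hardware but reads as zero while IALIGN=32, which is true of both the masked-read rule and the "old value re-appears" variant. The function name and the sample addresses are made up for illustration.

```python
def observable_epc(stored: int, ialign: int) -> int:
    """Value a CSR read of *epc returns, given the physically stored bits."""
    # Bit 0 is always zero; bit 1 is additionally hidden while IALIGN = 32.
    hidden_bits = 0b11 if ialign == 32 else 0b01
    return stored & ~hidden_bits

# Two hypothetical cores that differ only in the hidden bit 1.
stored_a = 0x1000
stored_b = 0x1002

# While IALIGN = 32, no CSR read can tell the two cores apart ...
assert observable_epc(stored_a, 32) == observable_epc(stored_b, 32)

# ... yet after switching back to IALIGN = 16 they disagree, so the new
# value was not predictable from what was observable before the switch.
assert observable_epc(stored_a, 16) != observable_epc(stored_b, 16)
```

A checker that starts from a symbolic state in IALIGN=32 has to carry that extra bit along even though nothing it can read exposes it.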

> I also have a problem with the proposal that the alignment of
> instruction N+1 will reach backwards in time and conditionally affect
> how instruction N behaves, [...]

As Alex Elsayed correctly pointed out, currently IALIGN can only be changed
using non-branching instructions that are 32 bit wide. So the next
instruction is not aligned to 32 bits if and only if the current
instruction is not aligned to 32 bits. So this is not an issue imo.
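
The equivalence is plain arithmetic and can be checked in a few lines. A minimal sketch in Python, assuming the instruction that writes misa is 4 bytes long and does not branch; the helper name is made up for illustration.

```python
def is_32bit_aligned(addr: int) -> bool:
    """True when addr is a multiple of 4, i.e. bits 1:0 are both zero."""
    return addr & 0b11 == 0

# Assumption: the misa write is a 32-bit, non-branching instruction,
# so the next instruction sits at pc + 4.
for pc in range(0, 64, 2):  # every 16-bit-aligned address in a small window
    next_pc = pc + 4
    # Adding 4 never changes bits 1:0, so both addresses agree on alignment.
    assert is_32bit_aligned(pc) == is_32bit_aligned(next_pc)
```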

regards,
- clifford

--
The mind is not a vessel that needs filling, but wood that needs igniting.
-- Plutarch

Cesar Eduardo Barros

Mar 25, 2018, 8:28:11 AM3/25/18
to Dan Hopper, Clifford Wolf, John Hauser, RISC-V ISA Dev
On 25-03-2018 00:04, Dan Hopper wrote:
> Hi folks,
>
> I guess I have two problems with the current proposed changes. I agree
> with Clifford that *epc[1] should not be modifiable when IALIGN=32-bits.
> If *epc[1] is hard-wired to zero on a IALIGN=32-only CPU, then a more
> capable CPU in IALIGN=32 mode should have *exactly* the same read/write
> behavior for that register (or any other similar arch state).  This is
> no different than not being able to modify bits 63:32 of r1-31 of a
> 64-bit core running in 32-bit mode. This does mean the old value will
> magically re-appear if IALIGN is later changed back to 16-bits. This is
> not problematic behavior.

From a software point of view, what matters is not the value of epc[1],
it's the value of epc as a whole. The proposed rules are consistent from
that point of view: (1) if a value is written to xEPC while IALIGN=32,
the same value will still be in xEPC when IALIGN is changed to 16; (2)
if a value is written to xEPC while IALIGN=16, then IALIGN is changed to
32 and back to 16, the same value will still be in xEPC.
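
A minimal sketch of the two properties in Python, assuming the masked-read rule from the committed spec text (bit 1 stays writable, but reads as zero while IALIGN=32). The class and attribute names are made up for illustration; this is a toy model, not anyone's implementation.

```python
class EpcModel:
    """Toy model of one *epc register on a core supporting IALIGN 16 and 32."""

    def __init__(self) -> None:
        self.ialign = 16
        self._stored = 0  # physical contents; bit 0 is always zero

    def write(self, value: int) -> None:
        # Bit 1 is writable in both modes; only bit 0 is hard-wired to zero.
        self._stored = value & ~0b1

    def read(self) -> int:
        # While IALIGN = 32, bit 1 is masked on reads so it appears as zero.
        mask = 0b11 if self.ialign == 32 else 0b01
        return self._stored & ~mask


epc = EpcModel()

# Property (1): written while IALIGN = 32, still there once IALIGN = 16.
epc.ialign = 32
epc.write(0x1006)
assert epc.read() == 0x1004  # bit 1 hidden for now
epc.ialign = 16
assert epc.read() == 0x1006

# Property (2): written while IALIGN = 16, survives a 16 -> 32 -> 16 round trip.
epc.write(0x2002)
epc.ialign = 32
assert epc.read() == 0x2000
epc.ialign = 16
assert epc.read() == 0x2002
```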

Having an old value for epc[1] appear when the core is switched to
IALIGN=16, even when software (or an interrupt, or a trap) had changed
the EPC to point elsewhere, is asking for hard-to-debug bugs.

> I also have a problem with the proposal that the alignment of
> instruction N+1 will reach backwards in time and conditionally affect
> how instruction N behaves, if N is a particular CSR write. That's a
> little bizarre, at least in wording.  Yeah, I know the endbyte address
> of instr. N will tell you what you need to know, since the CSR write
> isn't a control transfer. I guess you could re-state it as a constraint
> on the alignment of endbyte+1 and it wouldn't sound as wacky, and
> wouldn't imply that I-fetch needs to do anything at all with respect to
> instr N+1 itself.

Since IALIGN can only be 16 or 32, and CSR writes are always 32-bit
instructions (at least with the standard C extension), the alignment of
instruction N+1 and instruction N will always be the same when N is a
CSR write. IMHO, there should be a footnote mentioning that.

There's precedent for the address of instruction N+1 affecting the
behavior of instruction N. In particular, JAL/JALR stores the address of
instruction N+1 in a register. This also means that hardware already has
a means of passing the address of instruction N+1 to the units which
will execute the instruction, which explains why the cost of doing so
won't be too high.

Conceptually, that CSR write alters the behavior of the whole core; you
could think of it as stopping the core, exchanging a piece of it, and
then resuming. On that model what's the value of pc while the core is
stopped? The address of instruction N+1. It makes sense if you think of
it as "switching the core is invalid if pc is not aligned; if it's
invalid, do nothing".

> But the way it's worded aside, are there any other cases where decode
> needs to check the endbyte+1 alignment when cracking an instruction?  If
> there are not (and I haven't come across any yet), then IMO we shouldn't
> introduce one here, either, because that's more decode hardware -
> detecting the endbyte alignment error and suppressing a lot of decoded
> fields so that the CSR write turns into a NOP.

It's not the decode that's checking the alignment; it's the execution
unit, a few steps later in the pipeline. There are already several cases
(conditional branches) where the execution unit conditionally ignores
the instruction. It should be a simple test in the MISA case: if the CSR
register is MISA, it's a write, it would clear the C bit, and the second
bit of the pc+4 (which the decoder sent this way together with the
instruction) is set, then ignore the instruction.
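
That test can be sketched as follows in Python. It assumes every misa bit is freely writable and that the CSR instruction is 4 bytes wide; the function name and the sample values are made up for illustration. The only fact taken from the ISA is that 'C' is bit 2 of misa.

```python
MISA_C = 1 << 2  # extension bits are lettered from bit 0 = 'A', so 'C' is bit 2


def apply_misa_write(old_misa: int, new_misa: int, pc: int) -> int:
    """Return the misa value left behind by a CSR write executed at pc."""
    clears_c = bool(old_misa & MISA_C) and not (new_misa & MISA_C)
    next_pc = pc + 4  # CSR instructions are 32 bits wide
    if clears_c and (next_pc & 0b10):
        # The following instruction would not be 32-bit aligned:
        # ignore the write entirely, so misa is unchanged and IALIGN stays 16.
        return old_misa
    return new_misa


with_c = MISA_C | (1 << 8)   # 'C' and 'I' set (illustrative value)
without_c = 1 << 8           # only 'I' set

assert apply_misa_write(with_c, without_c, pc=0x1000) == without_c  # takes effect
assert apply_misa_write(with_c, without_c, pc=0x1002) == with_c     # suppressed
```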
--
Cesar Eduardo Barros
ces...@cesarb.eti.br

Dan Hopper

Mar 25, 2018, 8:56:25 PM3/25/18
to Cesar Eduardo Barros, Clifford Wolf, John Hauser, RISC-V ISA Dev
Hi Clifford and Cesar,

They are interesting counterpoints. From my point of view, though, there should be consistency in how state associated with various operating modes is/is not available:
  • 64-bit-only state should, generally, only be visible and modifiable in >= 64-bit mode.
  • Extension-specific state should, generally, only be visible and modifiable when the extension's mode is enabled.

It's not that there can't be exceptions to such consistency where it makes sense, but otherwise it seems most undesirable from design/verif/software standpoints to have weird in-betweens like the current proposal includes. (A bit that must be zero in non-C mode can be modified when C-mode is disabled on a C-capable processor - i.e. not matching the behavior of a non-C-mode-capable processor.)

We should have a consistent, general strategy for whether or not extension- or mode-specific state should be writable and/or readable when that mode is disabled. I'm suggesting it be "no, to modify mode-specific state you must have that mode enabled." 

I would expect that, among other disciplines, the virtualization crowd would also be in favor of this approach.

I don't see why it's a burden to require software to initialize 64-bit-only processor state when entering 64-bit mode. Or to initialize C-mode state when entering C-mode.  Etc. Why is *epc[1] state any different from any other mode-specific state that we have lying around? Either software keeps track of the fact that it's previously initialized it before disabling that mode, or it has not.  If it has not, then it should go set it to known values.  And I certainly don't see how it's ok for code running with a mode disabled to go scribbling on mode-specific values.   Well in my view anyway :)

---
Somewhat orthogonal to the general consistency of state accessibility side of the argument (above) - I hear the point about CSR writes not having a 16-bit encoding, but what if the next rev of the spec decides to make a 16-bit CSR write instruction?  Or some other extension adds an instruction for a yet-to-be-defined CSR that has a non-32-bit instruction length?  Should they be saddled with the extra logic that was not envisioned here?

A couple of specific responses are in-line, below.

Thanks for your time,
Dan


On 03/25/2018 07:19 AM, Clifford Wolf wrote:
Hi,

On Sat, Mar 24, 2018 at 10:04:33PM -0500, Dan Hopper wrote:
[..] This does mean the old value will magically re-appear if IALIGN is
later changed back to 16-bits. This is not problematic behavior.
In my opinion it _is_ problematic.

If the value of *epc after setting misa.C is not predictable from the state
observable before setting misa.C then it becomes much harder to verify that
an implementation is implementing setting of misa.C correctly, especially
if you want to do it formally from a symbolic starting state.

I also have a problem with the proposal that the alignment of
instruction N+1 will reach backwards in time and conditionally affect
how instruction N behaves, [...]
As Alex Elsayed correctly pointed out, currently IALIGN can only be changed
using non-branching instructions that are 32 bit wide. So the next
instruction is not aligned to 32 bits if and only if the current
instruction is not aligned to 32 bits. So this is not an issue imo.

regards,
 - clifford


On 03/25/2018 07:28 AM, Cesar Eduardo Barros wrote:
On 25-03-2018 00:04, Dan Hopper wrote:
Hi folks,

I guess I have two problems with the current proposed changes. I agree with Clifford that *epc[1] should not be modifiable when IALIGN=32. If *epc[1] is hard-wired to zero on an IALIGN=32-only CPU, then a more capable CPU in IALIGN=32 mode should have *exactly* the same read/write behavior for that register (or any other similar arch state).  This is no different from not being able to modify bits 63:32 of r1-31 of a 64-bit core running in 32-bit mode. This does mean the old value will magically re-appear if IALIGN is later changed back to 16. This is not problematic behavior.

From a software point of view, what matters is not the value of epc[1], it's the value of epc as a whole. The proposed rules are consistent from that point of view: (1) if a value is written to xEPC while IALIGN=32, the same value will still be in xEPC when IALIGN is changed to 16; (2) if a value is written to xEPC while IALIGN=16, then IALIGN is changed to 32 and back to 16, the same value will still be in xEPC.

Having an old value for epc[1] appear when the core is switched to IALIGN=16, even when software (or an interrupt, or a trap) had changed the EPC to point elsewhere, is asking for hard-to-debug bugs.

I also have a problem with the proposal that the alignment of instruction N+1 will reach backwards in time and conditionally affect how instruction N behaves, if N is a particular CSR write. That's a little bizarre, at least in wording.  Yeah, I know the endbyte address of instr. N will tell you what you need to know, since the CSR write isn't a control transfer. I guess you could re-state it as a constraint on the alignment of endbyte+1 and it wouldn't sound as wacky, and wouldn't imply that I-fetch needs to do anything at all with respect to instr N+1 itself.

Since IALIGN can only be 16 or 32, and CSR writes are always 32-bit instructions (at least with the standard C extension), the alignment of instruction N+1 and instruction N will always be the same when N is a CSR write. IMHO, there should be a footnote mentioning that.

There's precedent for the address of instruction N+1 affecting the behavior of instruction N. In particular, JAL/JALR stores the address of instruction N+1 in a register. This also means that hardware already has a means of passing the address of instruction N+1 to the units which will execute the instruction, which explains why the cost of doing so won't be too high.

Well that's a value that's often generated in an ALU or branch resolution unit at execution time, and written into an arch or physical register. It's not necessarily available in the decode pipe stage(s).  (I'm not saying that information can't be created in decode - just saying it's not necessarily available for the leveraging case you're suggesting.)

Conceptually, that CSR write alters the behavior of the whole core; you could think of it as stopping the core, exchanging a piece of it, and then resuming. On that model what's the value of pc while the core is stopped? The address of instruction N+1. It makes sense if you think of it as "switching the core is invalid if pc is not aligned; if it's invalid, do nothing".

I admit that some CSR writes might require serialization or stronger pipe controls to produce correct behavior.  Switching modes, for example, like the misa write we're talking about here, might be handled with a full pipeline flush and re-fetch. But hey, if we're really talking about requiring rather than just allowing implementations to take the full-flush approach, then doesn't it make more sense for the detection and handling of instr. N+1 being misaligned to be taken on N+1?  I say absolutely!:

  1. The CSR write is fetched, decoded, executed, and retired.
  2. CSR write propagates new mode state throughout core (exactly how is implementation-specific).
  3. Pipe flush of front and back-ends. Redirect to PC of N+1.
  4. N+1 is re-fetched, using new mode state. Oops, misaligned. Takes an exception. Piece of cake, we already had to have that fetch misaligned logic anyway. 
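
The four steps above can be sketched in Python as below. This models the alternative being argued for here (the write always commits and the trap is taken on N+1), not the behavior in the committed spec text. It assumes a 4-byte CSR instruction; the function name and the cause string are made up for illustration.

```python
from typing import Optional, Tuple


def fetch_after_misa_write(csr_pc: int, new_ialign: int) -> Tuple[int, Optional[str]]:
    """Model steps 1-4: commit the write, flush, then re-fetch N+1.

    Returns the address of N+1 and the trap cause raised by its fetch,
    or None if the fetch is legal under the new IALIGN.
    """
    # Steps 1-2: the CSR write retires and the new mode takes effect.
    # Step 3: the pipeline is flushed and redirected to N+1.
    next_pc = csr_pc + 4  # assumption: the CSR write is a 32-bit instruction
    # Step 4: N+1 is fetched under the new alignment requirement.
    misaligned = next_pc % (new_ialign // 8) != 0
    return next_pc, ("instruction-address-misaligned" if misaligned else None)


# Aligned case: the switch to IALIGN = 32 goes through without a trap.
assert fetch_after_misa_write(0x1000, 32) == (0x1004, None)

# Misaligned case: the write has already committed, and N+1 takes the trap.
assert fetch_after_misa_write(0x1002, 32) == (0x1006, "instruction-address-misaligned")
```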

My two cents, anyway :)

Clifford Wolf

Apr 1, 2018, 9:12:38 AM4/1/18
to Dan Hopper, Cesar Eduardo Barros, John Hauser, RISC-V ISA Dev
On Sun, Mar 25, 2018 at 07:55:57PM -0500, Dan Hopper wrote:
> I don't see why it's a burden to require software to initialize
> 64-bit-only processor state when entering 64-bit mode. Or to initialize
> C-mode state when entering C-mode.

The issue here is _not_ burden to software. The issue is formal verification.

Persistent state that is invisible (or very hard to observe) is hard to
formally verify.

lk...@lkcl.net

Apr 3, 2018, 2:26:58 PM4/3/18
to RISC-V ISA Dev, dho...@tesla.com, ces...@cesarb.eti.br, jhause...@gmail.com


On Sunday, April 1, 2018 at 2:12:38 PM UTC+1, clifford wrote:
On Sun, Mar 25, 2018 at 07:55:57PM -0500, Dan Hopper wrote:
> I don't see why it's a burden to require software to initialize
> 64-bit-only processor state when entering 64-bit mode. Or to initialize
> C-mode state when entering C-mode.

The issue here is _not_ burden to software. The issue is formal verification.

and formal verification means you *know* if it's correct (and secure).  dan, clifford has been developing a tool that finds extremely obscure bugs thanks to the formal representation system he's developed.  one of the examples is *precisely* in this area, MISA.C: https://github.com/cliffordwolf/riscv-formal/blob/master/docs/examplebugs.md
 

Persistent state that is invisible (or very hard to observe) is hard to
formally verify.


it's the intelligence community "nightmare scenario".  if you know something's insecure, that's (bizarrely) quite fine because you can do something about it and/or avoid the problem.  it's also (obviously) fine if you know it's secure.  it's when you DON'T know whether it's secure or insecure that you have the nightmare scenario.

and if state is hidden, there's no possible way to capture it in any kind of way that allows you to say if it's making the processor secure *or* insecure, does the job *or* doesn't do the job.

in my view, if clifford has kept track of the RISC-V specification and can actually tell us precisely where security and design flaws are in the specification, we should listen to him and make his work easier to do, not harder, even if it makes software writers' jobs more difficult.

l.