Proposal from ISA Formal Spec Technical Group: Behavior on clearing MISA.C

Rishiyur Nikhil

Feb 15, 2018, 10:02:38 AM
to RISC-V ISA Dev
BACKGROUND:
Consider a RISC-V implementation that supports the 'C'
extension (Compressed Instructions), and can clear the MISA.C bit at
runtime (thereby switching off C support dynamically).  What should be
the behavior if PC is not currently 32-bit aligned?  What if return
addresses and trap vectors are not 32-bit aligned?

This is a proposal to specify the behavior precisely.

PROPOSAL:
When MISA.C is cleared,

- If PC is not 32-bit aligned, just continue executing, consuming 32b
    instructions that are 16-bit aligned (since the CPU is already
    capable of doing this to support C).

- If a trap occurs and MTVEC contains an address that is
    16-bit aligned and not 32-bit aligned, just continue executing,
    consuming 32b instructions that are 16-bit aligned (since the CPU
    is already capable of doing this to support C).

- All jump or branch instructions that attempt to jump to an address
    that is not 32-bit aligned will cause an Instruction address
    misaligned exception, per the current spec (exception is taken at
    the jump, not at the target).  The return address is saved as is,
    even if it is not 32-bit aligned.

- CSR* instructions that write to [MSU]EPC will clear the two LSBs of
    the written address, i.e. EPC will be 32-bit aligned.

- When PC is copied to [MSU]EPC as part of trap handling, it is copied
    as-is even if it points to an address that is not 32-bit aligned.
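
A minimal sketch of the first and last bullets, assuming a toy model in which every instruction is 32 bits wide once MISA.C is clear. The names `step_pc` and `take_trap` are hypothetical and exist only for illustration; they do not come from the spec.

```python
# Toy model (hypothetical names): straight-line execution after MISA.C is
# cleared while PC is only 16-bit aligned.

def step_pc(pc: int) -> int:
    """Advance past one 32-bit instruction; no realignment is performed."""
    return pc + 4


def take_trap(pc: int) -> int:
    """Return the value copied into xEPC on a trap: the PC, unmodified."""
    return pc


pc = 0x8000_0002        # 16-bit aligned, not 32-bit aligned
pc = step_pc(pc)        # PC stays 16-bit aligned after each instruction
epc = take_trap(pc)     # trap entry copies the PC as-is
```

In this model the PC keeps its 2-byte offset until a jump, branch, or trap changes it.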



Please let us know if you see any problems with this.

[Thanks to Clifford Wolf for articulating this first.]

Rishiyur Nikhil (Chair) and the ISA Formal Spec Technical Group

Christopher Celio

Feb 15, 2018, 11:10:36 AM
to Rishiyur Nikhil, RISC-V ISA Dev
Hi guys,

What is the rationale for allowing a grace period for relaxing 4B alignment requirements? Can we not reliably force alignment of "CSRW MISA"? (or .norvc on functions that write MISA?).

My immediate reaction is that this appears very messy from the point-of-view of the core. I can no longer rely on !MISA.C && PC(1) to signify an error without throwing in a state machine? What if I'm doing crazy optimizations specific to RV64G mode that rely on alignment?

-Chris


--
You received this message because you are subscribed to the Google Groups "RISC-V ISA Dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to isa-dev+u...@groups.riscv.org.
To post to this group, send email to isa...@groups.riscv.org.
Visit this group at https://groups.google.com/a/groups.riscv.org/group/isa-dev/.
To view this discussion on the web visit https://groups.google.com/a/groups.riscv.org/d/msgid/isa-dev/CAAVo%2BPmhXr_94XDKEc87%2B1Jsg-8Qo%3Dvd2CTFvB0C8-ShknxyCA%40mail.gmail.com.

Samuel Falvo II

Feb 15, 2018, 11:52:16 AM
to Rishiyur Nikhil, RISC-V ISA Dev
On Thu, Feb 15, 2018 at 7:02 AM, Rishiyur Nikhil <nik...@bluespec.com> wrote:
> PROPOSAL:
> When MISA.C is cleared,
>
> - If PC is not 32-bit aligned; just continue executing, consuming 32b
> instructions that are 16-bit aligned (since the CPU is already
> capable of doing this to support C).

I think this is too complicated, as it is at odds with the third
requirement that all branches (conditional or otherwise) be
word-aligned. I would say that, after clearing the C bit, if the next
instruction fetch is not word aligned, it must raise an exception just
like any instruction fetch after a branch would.

> - If a trap occurs and MTVEC contains an address that is
> 16-bit aligned and not 32-bit aligned, just continue executing,
> consuming 32b instructions that are 16-bit aligned (since the CPU
> is already capable of doing this to support C).

This, by definition, is an unconditional branch, and should be treated
as an unconditional branch -- that is, if *tvec is not word-aligned
AND C-bit is clear, then you'll raise an exception here as well. Of
course, you run the risk of an infinite trap loop; this can be stopped
by enforcing the requirement that mtvec be word-aligned (EVEN with
C-bit set!), while all other *tvec registers have relaxed
requirements.

> - CSR* instructions that write to [MSU]EPC will clear the two LSBs of
> the written address, i.e. EPC will be 32-bit aligned.

Ouch. No. No black magic, please. The spec states that the EPC bits
are read/written as-is, and as well, this patently prevents software
delegation to exception handlers written with the C-extension.

> - When PC is copied to [MSU]EPC as part of trap handling, it is copied
> as-is even if it points to an address that is not 32-bit aligned.

That has to be the case anyway.

--
Samuel A. Falvo II

Clifford Wolf

Feb 15, 2018, 12:08:44 PM
to Christopher Celio, Rishiyur Nikhil, RISC-V ISA Dev
Hi Chris,

On Thu, Feb 15, 2018 at 08:10:32AM -0800, Christopher Celio wrote:
> What is the rationale for allowing a grace period for relaxing 4B
> alignment requirements? Can we not reliably force alignment of "CSRW
> MISA"? (or .norvc on functions that write MISA?).

Some history on that:

Rocket currently does extremely strange things in those cases. You can get
it to execute an unaligned LB instruction following the MISA write as an
LW instruction, but also jump to the trap handler with an MEPC that
suggests the load instruction was never executed. (The reason is that the
LB gets scheduled, then a trap is raised and the LB is not properly killed,
causing it to write to its destination register when the memory returns the
data, but the register write is handled like it would be for a LW
instruction as a side-effect of the improper instruction kill.)

Apparently Andrew Waterman doesn't want to fix this because it would add
additional hardware. (Please correct me if I'm misrepresenting you here,
Andrew. This would not be my intention.) Instead he wants this to be
undefined behavior.

However, we from the formal spec working group don't think that this should
be undefined behavior and therefore propose a behavior that should be
implementable with minimal to zero hardware overhead. I would assume that
the behavior we propose would actually require less hardware than what
Rocket currently does.

> My immediate reaction is that this appears very messy from the
> point-of-view of the core. I can no longer rely on !MISA.C && PC(1) to
> signify an error without throwing in a state machine? What if I'm doing
> crazy optimizations specific to RV64G mode that rely on alignment?

You can not use "!MISA.C && PC(1)" anyways because it's not the unaligned
jump target that traps, it is the instruction that tries to jump to the
unaligned address. (In which case the jump is not executed and PC still
points to the (probably aligned) jump/branch instruction.)

Unless you mean something like "!MISA.C && NEXT_PC(1)" which is exactly
what you should be doing now for all jump/branch instructions, and it is
also the behavior we are proposing below.

>> PROPOSAL:
>> When MISA.C is cleared,
>>
>> (1) If PC is not 32-bit aligned; just continue executing, consuming 32b
>> instructions that are 16-bit aligned (since the CPU is already
>> capable of doing this to support C).
>>
>> (2) If a trap occurs and MTVEC contains an address that is
>> 16-bit aligned and not 32-bit aligned, just continue executing,
>> consuming 32b instructions that are 16-bit aligned (since the CPU
>> is already capable of doing this to support C).
>>
>> (3) All jump or branch instructions that attempt to jump to an address
>> that is not 32-bit aligned will cause an Instruction address
>> misaligned exception, per the current spec (exception is taken at
>> the jump, not at the target). The return address is saved as is,
>> even if it is not 32-bit aligned.
>>
>> (4) CSR* instructions that write to [MSU]EPC will clear the two LSBs of
>> the written address, i.e. EPC will be 32-bit aligned.
>>
>> (5) When PC is copied to [MSU]EPC as part of trap handling, it is copied
>> as-is even if it points to an address that is not 32-bit aligned.

Note that (3) and (4) are exactly what every RISC-V processor should already
do when !MISA.C.

Also note that (1), (2), and (5) in fact remove complexity. In these points
we say that a processor should *not* add extra complexity in areas where
some might assume that the spec requires them to add complexity. (But in
fact the current spec doesn't say anything about what to do in those
situations.)

regards,
- clifford

--
there's no place like 127.0.0.1
until we found ::1 -- which is even bigger

Clifford Wolf

Feb 15, 2018, 12:19:17 PM
to Samuel Falvo II, Rishiyur Nikhil, RISC-V ISA Dev
Hi,

On Thu, Feb 15, 2018 at 08:52:13AM -0800, Samuel Falvo II wrote:
> > - If PC is not 32-bit aligned; just continue executing, consuming 32b
> > instructions that are 16-bit aligned (since the CPU is already
> > capable of doing this to support C).
>
> I think this is too complicated, as it is at odds with the third
> requirement that all branches (conditional or otherwise) be
> word-aligned.

Aaehm.. No. There is no conflict between those clauses.

> I would say that, after clearing the C bit, if the next
> instruction fetch is not word aligned, it must raise an exception just
> like any instruction fetch after a branch would.

This is not how the exception in response to a branch to an unaligned
instruction works.

It is not the new (unaligned) instruction that traps, it's the branch
instruction that tries to make the program counter unaligned that traps.

> > - If a trap occurs and MTVEC contains an address that is
> > 16-bit aligned and not 32-bit aligned, just continue executing,
> > consuming 32b instructions that are 16-bit aligned (since the CPU
> > is already capable of doing this to support C).
>
> This, by definition, is an unconditional branch, and should be treated
> as an unconditional branch -- that is, if *tvec is not word-aligned
> AND C-bit is clear, then you'll raise an exception here as well. Of
> course, you run the risk of an infinite trap loop; this can be stopped
> by enforcing the requirement that mtvec be word-aligned (EVEN with
> C-bit set!), while all other *tvec registers have relaxed
> requirements.

I don't know what you are talking about. *tvec are always word aligned.
Please read the spec.

> > - CSR* instructions that write to [MSU]EPC will clear the two LSBs of
> > the written address, i.e. EPC will be 32-bit aligned.
>
> Ouch. No. No black magic, please. The spec states that the EPC bits
> are read/written as-is, and as well, this patently prevents software
> delegation to exception handlers written with the C-extension.

What? Please read the spec!
This is the current behavior, not "black magic"!

> > - When PC is copied to [MSU]EPC as part of trap handling, it is copied
> > as-is even if it points to an address that is not 32-bit aligned.
>
> That has to be the case anyway.

If you read the spec and put it side by side with your proposal then you will
see that this is the case for most of it. It is the entire point to have
a proposal that is more a clarification regarding some small spec holes
rather than a major spec change.

regards,
- clifford

--
Oh, boy, virtual memory! Now I'm gonna make myself a really *big* RAMdisk!

Jose Renau

Feb 15, 2018, 12:22:35 PM
to Rishiyur Nikhil, RISC-V ISA Dev

This looks very complicated/involved when we start to have wide fetch.
Clearing MISA.C should be an infrequent event, so adding a nop
overhead to align the instruction correctly should not be a problem.

Doing the state machine indicated below is "ok" for a single scalar,
but as we fetch wider and break the RISC-V instructions into uOPs, we have an
explosion of options just for something very infrequent/weird.

What about the following simpler solution:

- If MISA.C is cleared by a compressed instruction, the instruction
following the MISA.C clear must be 32-bit aligned or an illegal instruction
exception is raised.

Much easier to implement/debug and cleaner to verify.

Samuel Falvo II

Feb 15, 2018, 12:32:01 PM
to Clifford Wolf, Rishiyur Nikhil, RISC-V ISA Dev
On Thu, Feb 15, 2018 at 9:19 AM, Clifford Wolf <clif...@clifford.at> wrote:
> It is not the new (unaligned) instruction that traps, it's the branch
> instruction that tries to make the program counter unaligned that traps.

What is the exception? Is it an illegal instruction exception? OK,
I'll concede that.

If it is an unaligned instruction address exception, then I'm going to
vociferously disagree. I am not going to add unaligned address checks
for every single instruction capable of altering PC, when instead I
can just add the hardware *once* in the instruction fetch logic to
detect this condition. That just doesn't make any sense. Unless you
know some trick to this that I don't?

> I don't know what you are talking about. *tvec are always word aligned.
> Please read the spec.

Then the whole bullet point is nonsensical by definition, for it can
*NEVER* happen. As you apparently are fond of saying, Please read the
spec.

Andrew Waterman

Feb 15, 2018, 12:42:26 PM
to Jose Renau, RISC-V ISA Dev, Rishiyur Nikhil
I still don’t think this needs to be well defined, but if we disagree on that point, I prefer Jose’s/Chris’ proposal for simpler semantics.

Clifford Wolf

Feb 15, 2018, 12:57:58 PM
to Samuel Falvo II, Rishiyur Nikhil, RISC-V ISA Dev
Hi,

On Thu, Feb 15, 2018 at 09:31:59AM -0800, Samuel Falvo II wrote:
> On Thu, Feb 15, 2018 at 9:19 AM, Clifford Wolf <clif...@clifford.at> wrote:
> > It is not the new (unaligned) instruction that traps, it's the branch
> > instruction that tries to make the program counter unaligned that traps.
>
> What is the exception? Is it an illegal instruction exception? OK,
> I'll concede that.
>
> If it is an unaligned instruction address exception, then I'm going to
> vociferously disagree. I am not going to add unaligned address checks
> for every single instruction capable of altering PC, when instead I
> can just add the hardware *once* in the instruction fetch logic to
> detect this condition. That just doesn't make any sense. Unless you
> know some trick to this that I don't?

I'm not sure what you are arguing here.

The "Instruction address misaligned" (mcause=0) exception currently works
the way that it traps on the instruction that tries to make the PC
unaligned. The branch/jump instruction in question is never executed and
in the trap handler mepc points to the branch/jump, not to the instruction
the branch/jump would have tried to jump to.

Our proposal does not change this in any way.
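
The trap-at-the-jump behavior described above can be sketched roughly as follows. This is a toy model, not an implementation: the `Hart` class, its fields, and the `jump` helper are hypothetical names, and the check is the `!MISA.C && NEXT_PC(1)` condition mentioned earlier in the thread.

```python
from dataclasses import dataclass

CAUSE_INSN_ADDR_MISALIGNED = 0  # "Instruction address misaligned" (mcause=0)


@dataclass
class Hart:
    """Hypothetical toy hart state, just enough to show the trap."""
    pc: int
    misa_c: bool
    mepc: int = 0
    mcause: int = -1      # -1 means "no trap taken yet" in this toy model
    mtvec: int = 0x100


def jump(hart: Hart, target: int) -> None:
    """Model a taken jump or branch from hart.pc to `target`."""
    next_pc_bit1 = (target >> 1) & 1
    if not hart.misa_c and next_pc_bit1:
        # The jump itself traps: mepc holds the address of the jump,
        # not the misaligned target it tried to reach.
        hart.mepc = hart.pc
        hart.mcause = CAUSE_INSN_ADDR_MISALIGNED
        hart.pc = hart.mtvec
    else:
        hart.pc = target
```

With `misa_c=False`, calling `jump(hart, 0x2002)` from `pc=0x1000` leaves `mepc` at `0x1000`. The same call with `misa_c=True` simply sets `pc` to `0x2002`.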

If you think this behavior is bad and you want to discuss it then maybe
create a new thread for that, but please don't sidetrack this thread.

> > I don't know what you are talking about. *tvec are always word aligned.
> > Please read the spec.
>
> Then the whole bullet point is nonsensical by definition, for it can
> *NEVER* happen. As you apparently are fond of saying, Please read the
> spec.

Yes, it is tautological.

It was not part of my original wording. Nikhil must have added it when he
edited my original 3 point proposal. But the overall semantics should be
the same. A tautological statement doesn't really change that.

Here is my original wording for reference. Maybe that makes it easier to
understand what this is about:

--snip--

A processor with C support and writable misa.C should behave as follows
when misa.C is cleared:

- If not stated otherwise below, behave the same as if misa.C were set.
Specifically, if PC points to an address that isn't aligned to 32 bits
the core should just execute unaligned code.

- All jump or branch instructions that attempt to jump to an address that
is not aligned to 32 bits will cause an Instruction address misaligned
exception.

- Writes to [msu]epc will clear the two LSBs of the written address. This
is only the case for the CSR* instructions that explicitly write to
[msu]epc. When PC is copied to [msu]epc as part of trap handling then
it is copied as-is even when it points to an address not aligned to
32 bits.

--snap--

regards,
- clifford

Clifford Wolf

Feb 15, 2018, 1:05:23 PM
to Jose Renau, Rishiyur Nikhil, RISC-V ISA Dev
Hi,

On Thu, Feb 15, 2018 at 05:22:32PM +0000, Jose Renau wrote:
> This looks very complicated/involved when we start to have wide fetch.
> Clearing MISA.C should be an infrequent event, so adding a nop
> overhead to align the instruction correctly should not be a problem.

You don't understand. This is not an optimization. This is to avoid
undefined behavior in the spec.

> Doing the state machine indicated below is "ok" for a single scalar,
> but as we fetch wider and break the RISC-V instructions into uOPs, we have an
> explosion of options just for something very infrequent/weird.
>
> What about the following simpler solution:
>
> - If MISA.C is cleared by a compressed instruction, the instruction
> following the MISA.C clear must be 32-bit aligned or an illegal
> instruction exception is raised.

(1) Do you really mean "compressed instruction"? Because you can't modify
MISA.C with a compressed instruction. Only 32b opcodes exist for that. So
I'm assuming you meant unaligned instruction.

(2) This is only addressing the trivial part of the problem. The hard one
is what happens if one clears MISA.C while [MSU]EPC points to an address
that is not aligned to 32b. What happens if the next instruction is
[msu]ret?

regards,
- clifford

Bruce Hoult

Feb 15, 2018, 1:06:55 PM
to Jose Renau, Rishiyur Nikhil, RISC-V ISA Dev
It doesn't matter whether MISA.C is cleared by a compressed instruction or not. If it's cleared by a 16 bit-aligned 32 bit instruction then that should trap also.

Also: both this and the other thread about unaligned 32 bit instructions crossing a page boundary should explicitly consider the behaviour with 48 bit, 64 bit and longer instructions. The generalisation is pretty obvious, and to make it clear for the future I think these proposals should not be talking about C extension instructions, but about ANY instruction consisting of an odd number of 16 bit packets.


Clifford Wolf

Feb 15, 2018, 1:13:10 PM
to Andrew Waterman, Jose Renau, RISC-V ISA Dev, Rishiyur Nikhil
Hi,

On Thu, Feb 15, 2018 at 05:42:12PM +0000, Andrew Waterman wrote:
> I still don’t think this needs to be well defined, but if we disagree on
> that point, I prefer Jose’s/Chris’ proposal for simpler semantics.

I'm confused. You pointed out to me in the original mails about this last
year that this solution would be insufficient because it does not address
the [MSU]EPC issue and you rejected anything along the lines of those
solutions that would address the [MSU]EPC issue because you said it would
add too much hardware overhead.

So what solution do you propose that solves the [MSU]EPC issue and stays
within what you would see as acceptable hardware overhead? Because afaict
neither Jose nor Chris addressed those issues in their mails.

regards,
- clifford

Clifford Wolf

Feb 15, 2018, 1:21:09 PM
to Bruce Hoult, Jose Renau, Rishiyur Nikhil, RISC-V ISA Dev
Hi,

On Thu, Feb 15, 2018 at 09:06:52PM +0300, Bruce Hoult wrote:
> Also: both this and the other thread about unaligned 32 bit instructions
> crossing a page boundary should explicitly consider the behaviour with 48
> bit, 64 bit and longer instructions. The generalisation is pretty obvious,
> and to make it clear for the future I think these proposals should not be
> talking about C extension instructions, but about ANY instruction
> consisting of an odd number of 16 bit packets.

absolutely. however, in the current spec only toggling MISA.C will toggle
the support for unaligned instructions, because no other extension adds
support for instructions that are not a multiple of 32b in length.

one of my previous proposals was to decouple the C extension and unaligned
instructions completely and let MISA.C only toggle decoding of compressed
instructions, but still allow unaligned code in MISA.C capable processors
when MISA.C is disabled. This proposal was shut down because the idea is
that one should be able to use MISA.C to completely emulate a non-C core
with all its limitations, including the inability to load unaligned
instructions.

So when we talk about MISA.C in this proposal, we actually are talking
about the OR'ed value of all feature bits that control extensions that
would enable support for executing unaligned instructions.

Christopher Celio

Feb 15, 2018, 1:27:39 PM
to Clifford Wolf, Andrew Waterman, Jose Renau, RISC-V ISA Dev, Rishiyur Nikhil
Thank you Clifford for providing clarification on this issue. Let me summarize this as I understand it and suggest a proposal or two:

* In current RISC-V behavior, misalignment in RVG is detected by monitoring ONLY the branch and jump instructions.
* This proposal keeps the "hardware simple" by maintaining this behavior.
* EPC is forced to 4-byte alignment during a RVG register restore to maintain this behavior too.


An alternative proposal, that better matches intuition would be:

* Misalignment in RVG is detected by monitoring branches, jumps, the instruction following "CSRW MISA.C"*, and xRET (use of EPC).

A sub-alternative is to force xRET to align EPC based on RVG or RVC (that is, on the USE of EPC and not the WRITE of EPC).


Do I understand the situation properly?
Chris

Clifford Wolf

Feb 15, 2018, 1:57:18 PM
to Christopher Celio, Andrew Waterman, Jose Renau, RISC-V ISA Dev, Rishiyur Nikhil
Hi,

On Thu, Feb 15, 2018 at 10:27:35AM -0800, Christopher Celio wrote:
> Thank you Clifford for providing clarification on this issue. Let me summarize this as I understand it and suggest a proposal or two:
>
> * In current RISC-V behavior, misalignment in RVG is detected by monitoring ONLY the branch and jump instructions.
> * This proposal keeps the "hardware simple" by maintaining this behavior.
> * EPC is forced to 4-byte alignment during a RVG register restore to maintain this behavior too.
>
> Do I understand the situation properly?

Yes.

I'm unsure about the 3rd point about "register restore". Do you mean transfers
between the EPC CSR and general purpose registers? If so then yes: Writes
to the CSR clear the LSB bit or two LSB bits depending on the value of
MISA.C (that's the current spec). But reads will return the potentially
16-bit aligned address in the CSR if it was written before clearing the
MISA.C (that's the proposal, this situation is undefined in the current
spec).
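
As a rough sketch of that write/read asymmetry, assuming hypothetical helpers `write_epc` and `read_epc` that model only the stored address bits:

```python
def write_epc(value: int, misa_c: bool) -> int:
    """Value stored by an explicit CSR write to xEPC.

    Clears one LSB when MISA.C is set and two LSBs when it is clear.
    """
    mask = 0b01 if misa_c else 0b11
    return value & ~mask


def read_epc(stored: int) -> int:
    """Under the proposal, a read returns the stored value unchanged,
    even if MISA.C was cleared after the value was written."""
    return stored


stored = write_epc(0x8000_0006, misa_c=True)       # written while C is on
seen_after_clear = read_epc(stored)                # still 16-bit aligned
rewritten = write_epc(0x8000_0006, misa_c=False)   # written while C is off
```

Here `seen_after_clear` keeps the 2-byte offset because alignment is enforced only at write time, while `rewritten` is forced down to a 4-byte boundary.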

> An alternative proposal, that better matches intuition would be:
> * Misalignment in RVG is detected by monitoring branches, jumps, the instruction following "CSRW MISA.C"*, and xRET (use of EPC).
> * A sub-alternative is to force xRET to align EPC based on RVG or RVC (that is, on the USE of EPC and not the WRITE of EPC).

Yes, these are the kind of things I proposed in October last year.

The objections, if I remember correctly, were:

In the first case we could have xRET trap based on the value of xEPC. But
when xRET traps then xEPC will be overwritten with the address of the xRET
and the original xEPC value that caused the trap will be lost. The spec has
been crafted carefully to avoid this situation by making sure xEPC will never
be able to hold a value that would make xRET trap. Our proposal preserves
this guarantee.
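
To make the objection concrete, here is a small sketch of the rejected alternative (not the proposal), assuming a hypothetical `xret_that_traps` helper that returns whatever is left in xEPC after the xRET executes:

```python
def xret_that_traps(xepc: int, xret_pc: int) -> int:
    """Rejected design: xRET traps if xEPC is not 32-bit aligned.

    Returns the xEPC value the trap handler would see afterwards.
    """
    if xepc & 0b10:
        # Trap entry saves the address of the trapping instruction
        # (the xRET) into xEPC, replacing the value that caused the trap.
        return xret_pc
    return xepc
```

For example, `xret_that_traps(0x8000_0006, 0x400)` yields `0x400`, so the handler can no longer see the original `0x8000_0006`.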

In the second case we would effectively execute random code. (Whatever the
interpretation of the code is when offset by 2 bytes.) This has been deemed
less desirable than all other options by pretty much everyone I spoke to.

We also discussed other options, such as checking all xEPC regs when
clearing MISA.C and trigger an illegal instruction exception when any one
of them (or the PC) points to an unaligned address. But this was rejected
because of the extra hardware overhead. From a user perspective I think
this would have been the most intuitive option.

regards,
- clifford

Jose Renau

Feb 15, 2018, 2:10:59 PM
to Clifford Wolf, Christopher Celio, Andrew Waterman, RISC-V ISA Dev, Rishiyur Nikhil
Just curious, what is the problem of nuking the pipeline when the misa.c is changed. Then, no need to check all the inflight instructions.

Clifford Wolf

Feb 15, 2018, 2:15:29 PM
to Jose Renau, Christopher Celio, Andrew Waterman, RISC-V ISA Dev, Rishiyur Nikhil
On Thu, Feb 15, 2018 at 11:10:56AM -0800, Jose Renau wrote:
> Just curious, what is the problem of nuking the pipeline when the misa.c is
> changed. Then, no need to check all the inflight instructions.

That would be a way of implementing a defined behavior, but it does not
answer the question what the defined behaviour should be.

Jose Renau

Feb 15, 2018, 2:22:22 PM
to Clifford Wolf, Christopher Celio, Andrew Waterman, RISC-V ISA Dev, Rishiyur Nikhil
It was to avoid checking all the instructions, and to not allow misalignment after misa.c is cleared for any 32-bit instruction.

After the change, clear the pipe; future instructions are all 32-bit aligned or an illegal instruction exception is raised.

Clifford Wolf

Feb 15, 2018, 2:28:36 PM
to Jose Renau, Christopher Celio, Andrew Waterman, RISC-V ISA Dev, Rishiyur Nikhil
On Thu, Feb 15, 2018 at 11:22:19AM -0800, Jose Renau wrote:
> It was to avoid checking all the instructions, and to not allow misalignment
> after misa.c is cleared for any 32-bit instruction.
>
> After the change, clear the pipe; future instructions are all 32-bit aligned
> or an illegal instruction exception is raised.

So the behavior you are proposing is to raise an illegal instruction
exception for an instruction that clears MISA.C when this instruction is
unaligned, correct?

How does this address the issue with xEPC potentially pointing to a
non-aligned address?

Jose Renau

Feb 15, 2018, 2:36:34 PM
to Clifford Wolf, Christopher Celio, Andrew Waterman, RISC-V ISA Dev, Rishiyur Nikhil
After misa.c is cleared, any instruction fetched from a misaligned address also raises an illegal instruction exception. The xEPC case would fall in that class.

Clifford Wolf

Feb 15, 2018, 3:07:28 PM
to Jose Renau, Christopher Celio, Andrew Waterman, RISC-V ISA Dev, Rishiyur Nikhil
On Thu, Feb 15, 2018 at 11:36:31AM -0800, Jose Renau wrote:
> After misa.c is cleared, any instruction fetched from a misaligned address
> also raises an illegal instruction exception. The xEPC case would fall in
> that class.

That was one of the possible solutions I proposed in October. See previous
mail in this thread (Message-ID: <20180215185...@clifford.at>) for
my memory of the objections from back then.

Jose Renau

Feb 15, 2018, 3:16:51 PM
to Clifford Wolf, Christopher Celio, Andrew Waterman, RISC-V ISA Dev, Rishiyur Nikhil
I like it because it is simple and clean for both in-order and wide OoO cores.

Nothing implementation-dependent or difficult to explain.


Christopher Celio

Feb 15, 2018, 5:50:12 PM
to Clifford Wolf, Jose Renau, Andrew Waterman, RISC-V ISA Dev, Rishiyur Nikhil
> The spec has been crafted carefully to avoid this situation by making sure xEPC will never
> be able to hold a value that would make xRET trap. Our proposal preserves
> this guarantee.

I don't understand the full ramifications of this scenario. Yes, this is a bad situation, but at least it's verifiable and understandable. Does this cause us to infinite loop? How recoverable does this situation need to be?

> In the second case we would effectively execute random code. (Whatever the
> interpretation of the code is when offset by 2 bytes.) This has been deemed
> less desirable than all other options by pretty much everyone I spoke to.

Do we not find ourselves effectively executing random code anyways? The original proposal is that CSR* writes to xEPC clear the two LSBs. Is that different from ignoring the two LSBs during xRET?

Do we not find ourselves executing random code when an xRET or trap/MTVEC takes us to unaligned 4-byte instructions? What are we hoping happens while we gallivant through a misaligned RV64G segment? A jump to an aligned instruction means we've done something VERY ODD and no error is ever thrown. If a jump is performed to a misaligned instruction, then the jump is marked as if this were somehow its fault, when it is actually the xRET's or MTVEC's fault. I'm not yet convinced that forcing alignment on xRET is worse than anything else proposed.

I'm trying to imagine a security hole where a misaligned (and malicious) xRET takes us to an unaligned 4-byte jump gadget (that serendipitously was created out of the two halves of adjacent and aligned 4-byte instructions) which takes us to an attack vector that's correctly 4-byte aligned. The fact that the xRET was misaligned is never discovered. Or would an attacker never get access to xEPC so this hypothetical can never be realized?


I'm sorry I haven't spent more than a few hours thinking about this topic. I don't know the correct solution off the top of my head to the xRET/MTVEC problem, but I'm concerned that if the purpose of turning off RVC for RVC processors is to emulate RVG processors, having a co-simulation mismatch on xRET seems really problematic.

RVG should be about all 4 byte instructions being aligned. Allowing corner case exceptions to this is worrisome.


-Chris

Cesar Eduardo Barros

Feb 15, 2018, 6:31:56 PM
to Rishiyur Nikhil, RISC-V ISA Dev
Em 15-02-2018 13:02, Rishiyur Nikhil escreveu:
> BACKGROUND:
> Consider a RISC-V implementation that supports the 'C'
> extension (Compressed Instructions), and can clear the MISA.C bit at
> runtime (thereby switching off C support dynamically).  What should be
> the behavior if PC is not currently 32-bit aligned?  What if return
> addresses and trap vectors are not 32-bit aligned?

That sounds like a "half-updated" state, like x86's "unreal mode". I'm
not a fan of it, since it has the potential of confusing software which
does not expect this "misaligned" mode. However, since it can only be
entered by M-mode, and M-mode can defend itself against it (by just not
doing crazy things in the first place), it's not that bad, only odd (and
I agree that it should be precisely specified, to reduce the chance of
unexpected effects).

Since it's a "half-updated" state, it's actually simple to specify:

- If something was previously set to a misaligned value, it keeps its
misaligned value;
- If you attempt to set something to a misaligned value, it either
becomes aligned or traps (depending on what you're trying to do), as
would normally happen on non-RVC.

From the point of view of these two rules, here's my opinion on your
five rules:

> This is a proposal to specify the behavior precisely.
>
> PROPOSAL:
> When MISA.C is cleared,
>
> - If PC is not 32-bit aligned; just continue executing, consuming 32b
>     instructions that are 16-bit aligned (since the CPU is already
>     capable of doing this to support C).

The PC was already misaligned, so it stays misaligned (incrementing by
the instruction length every time).

> - If a trap occurs and MTVEC contains an address that is
>     16-bit aligned and not 32-bit aligned, just continue executing,
>     consuming 32b instructions that are 16-bit aligned (since the CPU
>     is already capable of doing this to support C).

If the MTVEC was already misaligned, it stays misaligned and is copied
as is to the PC (which then becomes misaligned). Any attempt to write a
misaligned address to MTVEC, however, will not write a misaligned
address to MTVEC. (Are misaligned MTVEC addresses even possible?)

> - All jump or branch instructions that attempt to jump to an address
>     that is not 32-bit aligned will cause an Instruction address
>     misaligned exception, per the current spec (exception is taken at
>     the jump, not at the target).  The return address is saved as is,
>     even if it is not 32-bit aligned.

That's my second rule: attempting to set the PC to something misaligned
won't work. The return address was already misaligned, so it stays
misaligned (and is copied as is).

> - CSR* instructions that write to [MSU]EPC will clear the two LSBs of
>     the written address, i.e. EPC will be 32-bit aligned.
>
> - When PC is copied to [MSU]EPC as part of trap handling, it is copied
>     as-is even if it points to an address that is not 32-bit aligned.

These two are also correct: the PC was already misaligned and stays
misaligned (copied to EPC), and on return also stays misaligned (copied
from EPC - that would be a sixth rule you forgot), but software can't
write something misaligned to EPC (it can read something misaligned from
EPC, however).

But here is the problem. What if the trap handler saves and restores the
EPC (for instance, to allow for nested traps)? It will fail to restore
the EPC correctly. That is, the "half-updated" state is fragile: it's
easy to accidentally slip out of it.
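
The fragility can be shown with a small model. This is only a sketch of the rules as quoted above, not taken from any implementation; the names `Hart`, `take_trap`, `csr_read_epc` and `csr_write_epc` are made up for illustration.

```python
class Hart:
    """Toy model of the EPC rules under discussion (hypothetical names)."""

    def __init__(self, pc, misa_c):
        self.pc = pc
        self.misa_c = misa_c
        self.epc = 0

    def take_trap(self):
        # Trap entry copies PC to EPC as-is, even if not 32-bit aligned.
        self.epc = self.pc

    def csr_read_epc(self):
        # Reads return whatever the register holds.
        return self.epc

    def csr_write_epc(self, value):
        # CSR writes always clear bit 0, and also bit 1 when C is off.
        mask = ~0b1 if self.misa_c else ~0b11
        self.epc = value & mask


hart = Hart(pc=0x1002, misa_c=False)  # misaligned PC, MISA.C already cleared
hart.take_trap()
saved = hart.csr_read_epc()           # handler saves EPC (e.g. for nesting)
hart.csr_write_epc(saved)             # ... and later restores it
print(hex(saved), hex(hart.epc))      # prints 0x1002 0x1000
```

The restored EPC differs from the saved one, so the return lands two bytes before the trapped instruction.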

Which is why this section of the specification should be prefaced with
"this is how it works, but it's odd and can easily break, so save
yourself the headache and align the PC before clearing MISA.C, or asking
M-mode to clear MISA.C for you". Also, whoever defines the ABI for
"asking M-mode to clear MISA.C" should define it to fail whenever the PC
of the caller (copied into EPC) is misaligned, to prevent the behavior
from differing between M-mode implementations.

--
Cesar Eduardo Barros
ces...@cesarb.eti.br

Clifford Wolf

Feb 16, 2018, 7:20:05 AM
to Christopher Celio, Jose Renau, Andrew Waterman, RISC-V ISA Dev, Rishiyur Nikhil
Hi,

sorry, long mail. If it's too long please skip to the end and start reading
at "This is not about security".

On Thu, Feb 15, 2018 at 02:50:07PM -0800, Christopher Celio wrote:
> > The spec has been crafted carefully to avoid this situation by making sure xEPC will never
> > be able to hold a value that would make xRET trap. Our proposal preserves
> > this guarantee.
>
> I don't understand the full ramifications of this scenario. Yes, this is
> a bad situation, but at least it's verifiable and understandable. Does
> this cause us to infinite loop? How recoverable does this situation need
> to be?

I'm just repeating the arguments I heard in October for why this suggestion
is not acceptable.

I took all the "we can't do that because ..." arguments and crafted a
solution that requires no extra hardware (because this was the main
argument against most suggestions) and also avoids the other things that
have been brought up.

I personally don't care much as long as we can avoid completely
undefined behavior.

As I've said in my previous mail, rocket currently can retire a killed LB
instruction as LW as a result of clearing MISA.C with an unaligned PC and
this is deemed okay because clearing MISA.C with an unaligned PC is
undefined behavior in the current spec. (It wasn't when I found the
behavior, but when I reported it it was added to the spec instead of fixing
rocket.)

--8<--
Quick sidenote: What is undefined behavior? Usually it means the core can
do whatever it wants whenever it wants, even before the unaligned MISA.C
clear hits as long as the core is destined to hit it (undefined behavior is
not causal). This is completely unacceptable for formal verification. We
would need to solve the halting problem in order to determine if the core
is destined to hit the unaligned MISA.C clear.

In one mail in October Andrew said undefined behavior in this sense is
causal (good!). But even causal undefined behavior screws up formal.
Suddenly all properties depend on if the processor hit this one obscure
state in its past. That drastically complicates everything.

I've also been told that "undefined behavior" in the sense of the RISC-V
spec is causal and guarantees that it "won't violate protections and won't
hang". But this is a very problematic definition because it is not a
statement in terms of CPU state but a statement in terms of emergent
behavior.

For example: What if the core suddenly starts interpreting every
instruction as "j 0", i.e. endless loop. It still keeps executing
instructions, so does it hang?

You might think that it is crazy to even consider this correct behavior,
but rocket currently can be made to retire a LW instruction that is not
there as response to clearing MISA.C, and apparently that is okay according
to the definition.

I think we should be able to get away in the formal spec with unspecified
values and situations where an implementation may choose one of multiple
concrete options (aka "unpredictable behavior"). For example, instead of
saying that clearing MISA.C with xEPC pointing to an unaligned address is
undefined behavior we could say that clearing MISA.C will leave undefined
values in all xEPC registers.
--8<--

We in the formal spec WG don't think that clearing MISA.C justifies undefined
behavior. We even think it is possible to define it in a way so that it has
no unspecified or unpredictable aspects and adds no extra hardware.

> > In the second case we would effectively execute random code. (Whatever the
> > interpretation of the code is when offset by 2 bytes.) This has been deemed
> > less desirable than all other options by pretty much everyone I spoke to.
>
> Do we not find ourselves effectively executing random code anyways? The
> original proposal is that CSR* writes to xEPC clear the two LSBs. Is that
> different from ignoring the two LSBs during xRET? [..]

Yes, that's very different.

Because this is only for CSR* writes to xEPC, not for when the PC gets
copied into xEPC as a result of a trap. Clearing the low bits in xEPC on
write is spec behavior now (3.1.19 of Privileged Architectures V1.10), not
something we have added.

So with our proposal you can still take a trap in unaligned code and return
correctly. We only get a problem when people try to set xEPC by copying a
register value into it. (Which is already the case right now.)

Note that our proposal completely stays within the realm of what is
currently completely unspecified. So all guarantees that the current spec
gives to software still hold.

> I'm not yet convinced that forcing alignment on xRET is worse than
> anything else proposed.

I don't say it is. I'm saying the suggestion was already shot down in
October because it adds extra hardware complexity.

Our proposal doesn't add hardware complexity. It's just a bit harder to get
your head around but it is actually simpler than what (at least) rocket is
doing right now and it only clarifies some cases that are currently
unspecified because the spec is for the most part written in a way that
assumes that MISA.C is read-only.

Let me explain what I mean using my original 3 clause proposal (that should
be semantically identical to the 5 clause proposal Nikhil posted):

--snip--

A processor with C support and writable misa.C should behave as follows
when misa.C is cleared:

(1) If not stated otherwise below, behave the same as if misa.C would be set.
Specifically, if PC points to an address that isn't aligned to 32 bits
the core should just execute unaligned code.

(2) All jump or branch instructions that attempt to jump to an address that
is not aligned to 32 bits will cause an Instruction address misaligned
exception.

(3) Writes to [msu]epc will clear the two LSB of the written address. This
is only the case for the CSR* instructions that explicitly write to
[msu]epc. When PC is copied to [msu]epc as part of trap handling then
it is copied as-is even when it points to an address not aligned to
32 bits.

--snap--

Clause (1) says that with the exception of (2) and (3) there is no need to
use MISA.C anywhere in the core (with the exception of the instruction
decoder that should forget about C encodings of course).

This is important! It means that if you have anything else in your core
right now that uses misa.C then you can remove that logic. It is not
needed.

Clause (2) is something that a core should already do. It is specified in
2.5 of RISC-V User-Level ISA V2.2.

Clause (3) is two parts. First the part about clearing bits during CSR*
writes to [msu]epc. This is something a core should already do. It is
specified in 3.1.19 of Privileged Architectures V1.10. And second the
copying from PC to [msu]epc. This is essentially a clarification that
no special action is required if PC is not aligned to 32b.
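
Clauses (1) and (2) can be sketched as a next-PC function. This is a minimal model under the wording of the clauses above, not any core's actual logic; `next_pc_sequential`, `next_pc_jump` and the exception class are hypothetical names, and only 16- and 32-bit instruction lengths are considered.

```python
class InstructionAddressMisaligned(Exception):
    """Raised by the jump itself; the target is never fetched."""


def next_pc_sequential(pc):
    # Clause (1): with misa.C clear every instruction is 4 bytes long.
    # A PC that is only 16-bit aligned simply stays that way.
    return pc + 4


def next_pc_jump(target, misa_c):
    # Clause (2): the alignment required of a jump/branch target
    # depends on misa.C; the check happens at the jump.
    required = 2 if misa_c else 4
    if target % required != 0:
        raise InstructionAddressMisaligned(hex(target))
    return target


pc = 0x2002                      # misaligned PC left over from C mode
pc = next_pc_sequential(pc)      # keeps executing: 0x2006
try:
    next_pc_jump(0x3002, misa_c=False)
    trapped = False
except InstructionAddressMisaligned:
    trapped = True
print(hex(pc), trapped)          # prints 0x2006 True
```

Note that nothing in `next_pc_sequential` consults misa.C, which is the point of clause (1).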

Your proposal is to mask the LSB bits of xEPC on xRET to fix the address.
This would be an additional action on top of the clauses above. It would
not remove hardware anywhere else from the system but it would require you
to add a few extra gates to the xRET logic.

> I'm trying to imagine a security hole where a misaligned (and malicious)
> xRET takes us to a unaligned 4 byte jump gadget (that serendipitously was
> created out of the two halves of adjacent and aligned 4-byte
> instructions) which takes us to an attack vector that's correctly 4 byte
> aligned. The fact that the xRET was misaligned is never discovered. Or
> would an attacker never get access to xEPC so this hypothetical can never
> be realized?

This is not about security.

A proper OS would only clear misa.C when in aligned code and with xEPC pointing to
aligned addresses.

If misa.C is cleared and PC and xEPC point to 32b aligned addresses then
you can never have an unaligned address in PC and xEPC without first
setting misa.C again. That's the whole point: Emulating RVG on RVC.
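
This closure property can be checked on a toy transition system. The following is a sketch only: it models a single xEPC, treats every misaligned jump as a trap, and uses a made-up trap vector `TVEC`; the operation names are hypothetical and misa.C is assumed clear throughout.

```python
import random

TVEC = 0x8000_0000  # hypothetical trap vector, 4-byte aligned


def step(state, op, arg):
    """One transition of (pc, epc) with misa.C clear."""
    pc, epc = state
    if op == "seq":                 # ordinary 4-byte instruction
        return pc + 4, epc
    if op == "jump":                # misaligned target traps at the jump
        if arg % 4 != 0:
            return TVEC, pc
        return arg, epc
    if op == "trap":                # PC copied to EPC as-is
        return TVEC, pc
    if op == "csrw_epc":            # CSR write clears the two LSBs
        return pc + 4, arg & ~0b11
    if op == "xret":                # EPC copied back to PC as-is
        return epc, epc
    raise ValueError(op)


def check(steps=10_000, seed=0):
    """Start aligned, apply random operations, test the invariant."""
    rng = random.Random(seed)
    state = (0x1000, 0x2000)
    ops = ["seq", "jump", "trap", "csrw_epc", "xret"]
    for _ in range(steps):
        state = step(state, rng.choice(ops), rng.randrange(1 << 16))
        if state[0] % 4 or state[1] % 4:
            return False
    return True


print(check())  # prints True
```

Random testing is of course no proof; a formal tool would establish the same invariant by induction over `step`.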

This is about avoiding unnecessary undefined behavior so that we can have a
clean formal specification. Our proposal does this without adding any
additional complexity to the hardware. In fact, I believe it simplifies
implementations.

> I'm sorry I haven't spent more than a few hours thinking about this
> topic. I don't know the correct solution off the top of my head to the
> xRET/MTVEC problem, but I'm concerned if the purpose of turning off RVC
> for RVC processors is to emulate RVG processors, having a co-simulation
> mismatch on xRET seems really problematic.

How would you have a co-simulation mismatch on xRET if you only clear
misa.C when in aligned code and xEPC pointing to aligned addresses?

The whole point of our proposal is to guarantee complete compatibility with
RVG if we start in a state that is valid for an RVG processor. I don't see
how you could run into any co-simulation mismatches if you start with a
valid state.

> RVG should be about all 4 byte instructions being aligned. Allowing
> corner case exceptions to this is worrisome.

But how would you ever get into those corner cases? Could you provide an
example? The only way to do that is to start with a state a RVG system can
never get into without first setting it up in RVC mode and then clearing
misa.C. And then you are talking about RVC system behavior, not RVG,
because in order to get into that state you had to be in C mode for at
least a few instructions.

I don't know.. maybe we are misunderstanding each other here. Or maybe
there is a bug in my proposal. In the latter case I would appreciate an
example that demonstrates that you could get a co-simulation mismatch with
a core that follows my proposal.

regards,
- clifford

--
Tell people there is an invisible man in the sky who created the universe, and
the vast majority will believe you. Tell them the paint is wet, and they have
to touch it to be sure.

Clifford Wolf

Feb 16, 2018, 7:33:13 AM
to Cesar Eduardo Barros, Rishiyur Nikhil, RISC-V ISA Dev
Hi,

On Thu, Feb 15, 2018 at 09:31:46PM -0200, Cesar Eduardo Barros wrote:
> >- If a trap occurs and MTVEC contains an address that is
> >     16-bit aligned and not 32-bit aligned, just continue executing,
> >     consuming 32b instructions that are 16-bit aligned (since the CPU
> >     is already capable of doing this to support C).
>
> If the MTVEC was already misaligned, it stays misaligned and is
> copied as is to the PC (which then becomes misaligned). Any attempt
> to write a misaligned address to MTVEC, however, will not write a
> misaligned address to MTVEC. (Are misaligned MTVEC addresses even
> possible?)

No, it is not possible. You can ignore this clause.

> But here is the problem. What if the trap handler saves and restores
> the EPC (for instance, to allow for nested traps)? It will fail to
> restore the EPC correctly. That is, the "half-updated" state is
> fragile: it's easy to accidentally slip out of it.

Please note that this is only about having a clean formal spec with nice
deterministic behavior.

Currently the priv spec says this:

When clearing the "C" bit in misa, software must ensure that the current pc
is 4-byte aligned and that all xepc registers contain 4-byte-aligned values.

Leaving it completely undefined what happens when software does not follow
this rule. (And as I've said before in this thread, at least rocket can do
really weird things if you clear MISA.C with an unaligned PC.)

Our goal here is to make this defined behavior for the sake of a clean and
formally verifiable specification.

Actual software would of course still ensure that PC and xEPC are all
4-byte aligned. You are not emulating RVG until you are in a state where
PC and xEPC are all 4-byte aligned.

As long as we have defined behavior we are good. This is not a state the
processor is actually going to be operated in by software. So it does not
matter if the trap handler can't restore xEPC from a register.

There are no requirements other than: (a) the behavior should be well
defined and (b) it should be implementable with minimal (zero) extra
hardware cost.

regards,
- clifford

Paul Miranda

Feb 16, 2018, 9:45:26 AM
to RISC-V ISA Dev
I like the proposal since it allows some flexibility in implementation while providing a more complete specification.
However, I am curious what use cases there are for clearing the C bit? (Other than the case celio mentioned of emulating RVG on RVC hardware.)

Cesar Eduardo Barros

Feb 16, 2018, 5:07:03 PM
to Paul Miranda, RISC-V ISA Dev
Em 16-02-2018 12:45, Paul Miranda escreveu:
> I like the proposal since it allows some flexibility in implementation
> while providing a more complete specification.
> However, I am curious what use cases there are for clearing the C bit?
> (Other than the case celio mentioned of emulating RVG on RVC hardware.)

Given that all clearing the C bit does is "emulating RVG on RVC
hardware", if you exclude that you won't find any use cases. A better
question would be "why would someone want to disable the C extension",
and I can see at least six use cases for that:

1. Software development: you want to make sure your software works even
on cores without the C extension, but all you have available for testing
are cores with the C extension.

2. Virtual machine migration: you have a mixed farm of machines, some of
which have the C extension, and some of which don't. You want to make
sure a virtual machine can be migrated between all these physical
machines, without crashing or becoming slower, so you run the virtual
machines with the C extension disabled.

3. PNaCl-style sandboxing: base non-C RISC-V has the property that all
instructions are 4 bytes wide and aligned to 4 bytes, that is, jumping
into the middle of an instruction isn't possible. This makes it easier
for a validator to confirm that a sequence of instructions follows a set
of rules for a sandbox.

4. Extensions using the RVC quadrants: disabling RVC frees 3/4 of the
encoding space for extensions, either emulated or decoded in hardware.
This is particularly useful together with the first use case (software
development): suppose you have a special-purpose micro-controller which
uses the RVC encoding space for a custom extension; disabling RVC allows
one to use a generic RISC-V core to test software compiled for that
micro-controller (by trapping and emulating the custom extension).

5. Hardware evolution: if you started with a tiny core which didn't have
the C extension, and later you needed more compute power but all the
cores you could find had the C extension, disabling the C extension
would help make sure your software keeps working unchanged.

6. Alternative compressed instructions: suppose you disagree with the
design of RVC, and have what you think is a better design. Disabling RVC
allows experimenting (through emulation) with your "obviously superior"
design.

This last use case is special in that, unlike all the others, it wants
to allow misaligned 32-bit instructions. With Clifford Wolf's proposal,
there's an easy trick for that: when necessary, the emulator could
temporarily enable MISA.C and misalign the EPC. Alternative proposals
that enforce alignment when MISA.C is clear wouldn't allow for this
niche use case without emulating everything whenever the PC misaligns.

Christopher Celio

Feb 16, 2018, 7:42:31 PM
to Jose Renau, Clifford Wolf, Andrew Waterman, RISC-V ISA Dev, Rishiyur Nikhil

- If PC is not 32-bit aligned; just continue executing, consuming 32b
    instructions that are 16-bit aligned (since the CPU is already
    capable of doing this to support C).

I believe I have convinced myself that the parenthetical that underpins this proposal is not always true. If MISA.C is disabled, then the processor may no longer be capable of executing misaligned instructions. I am in particular concerned about superscalar processors where power becomes a concern.

One particular design may choose to work as follows: 

A 16 Byte fetch front-end brings in four 4-byte instructions and up to eight 2-byte instructions. 8 expanders/RVC-decoders are used to generate micro-ops, and a mux is required to choose between the potential 4-byte or expanded 2-byte instruction that may start at any of the 8 locations within the 16B packet.

If MISA.C is disabled, then I can turn off the RVC decoders/expanders, and I can turn off the flops and muxing for the extra 4 uops. To power down the clock is not instantaneous; I can fold it in with the pipeline flush, but I can't magically turn it on when a misaligned xRET is sprung on the frontend. 

If the flops are off, there's no way for the core to execute unaligned 4-byte instructions as demanded by this proposal.
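
To make the cost concrete, here is a rough sketch of where instructions can start inside one 16-byte fetch packet. It relies only on the base encoding rule that a halfword whose two low bits are 0b11 begins a 32-bit instruction; the function name and the sample packet are made up for illustration, and longer encodings are ignored.

```python
def instruction_starts(halfwords, misa_c, first_slot=0):
    """Return the halfword slots at which an instruction begins.

    halfwords: the eight 16-bit parcels of a 16-byte packet, lowest
    address first. Only 16- and 32-bit encodings are modelled.
    """
    starts = []
    slot = first_slot
    while slot < len(halfwords):
        starts.append(slot)
        is_32bit = (halfwords[slot] & 0b11) == 0b11
        # With C off, every instruction occupies two halfword slots.
        slot += 2 if (is_32bit or not misa_c) else 1
    return starts


# c.nop (0x0001) mixed with "addi a0, x0, 0" (0x00000513, low half first)
packet = [0x0001, 0x0513, 0x0000, 0x0001, 0x0001, 0x0513, 0x0000, 0x0001]

print(instruction_starts(packet, misa_c=True))                  # [0, 1, 3, 4, 5, 7]
print(instruction_starts(packet, misa_c=False))                 # [0, 2, 4, 6]
print(instruction_starts(packet, misa_c=False, first_slot=1))   # [1, 3, 5, 7]
```

The last call is the case at issue: with MISA.C clear but a misaligned PC, instructions begin in the odd slots, so the muxing for those slots has to stay powered, and the instruction in slot 7 spills into the next packet.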


For this reason, and for sanity, and security (even if only "security by depth"), I feel that RVG should strongly mandate that all instructions are 4-bytes and always aligned.


In particular:

I believe that MISA.C should flush the pipeline and fetch PC+4. (the effects of uncompressed is immediate from the ISA view and misalignment is forbidden after the MISA.C).

I believe that ALL instructions should be checked for misalignment.

The instruction after a misaligned MISA.C should throw a misaligned exception.

Likewise, the instruction at xEPC after an xRET should throw a misaligned exception. 


Checking every fetch_pc is a 1 (or 2-bit) check and can piggy-back on the existing exception signals that come from the front-end. You only have to check PC of the first instruction in a packet.
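
As a sketch, the check amounts to the following (the function name is hypothetical, and `fetch_pc` is assumed to be the address of the first instruction in the packet):

```python
def fetch_misaligned(fetch_pc, misa_c):
    """True if fetch_pc violates the alignment required by the current MISA.C."""
    # With C enabled only bit 0 must be clear; with C disabled bits 1:0 must be.
    low_bits = 0b01 if misa_c else 0b11
    return (fetch_pc & low_bits) != 0


print(fetch_misaligned(0x1002, misa_c=True))    # prints False
print(fetch_misaligned(0x1002, misa_c=False))   # prints True
```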


I'm willing to accept we can't throw exception on MISA.C, because it's not exceptional. An unaligned 4 byte is legal. The state change has to be preserved and only afterwards do we enter exceptional behavior.

I would like to better understand why xRET can't throw an exception (since I'm not in the Formal Group I haven't gotten to hear all of the push-back against this option), but off the top of my head it seems that there would be additional complexity of an environment return conflicting with an exception within the CSR File. There are a lot of side effects of both, which work in opposition to one another, so unless you can detect an excepting xRET on an earlier cycle, this seems problematic. Or maybe checking xEPC(1) && dec_is_xret is trivial for all designs?



TL;DR: This proposal makes me uneasy, especially when we have such a small number of RVC designs to analyze. 


-Chris



Dan Hopper

Feb 16, 2018, 9:27:04 PM
to Christopher Celio, Jose Renau, Clifford Wolf, Andrew Waterman, RISC-V ISA Dev, Rishiyur Nikhil
Hi folks, 

(send attempt #2, sorry for the churn)

FWIW, my recommendation would be to define the mode switching behaviors rather than leave them undefined, and to do so in a way that doesn't impact hardware too much, but nevertheless makes the mode-switching a step function in the hardware.  Even if you have to flush the pipe to do so. (Not that pipe flushing on a MISA.C mode switch would be burdensome in terms of either gates or performance.  It's actually the normal procedure for many types of mode switching.)  So, I agree with Chris in that regard.

If the mode switch is a slow, gradual change of behavior, dependent upon the instruction stream, the hardware becomes more burdensome and harder to verify.

Besides, what's the point of having a mode bit to disable an optional feature, if the core with a feature disabled doesn't at least mostly behave like a core that doesn't have the feature in hardware? 

Cesar provided excellent examples in his email this afternoon.  I'd highlight #1 & #2 as particularly important.  Or a multitasking OS running a variety of programs that support a variety of optional processor modes. Sure, some mode bits like MISA.C might run a "legacy" program fine with the feature enabled, but there will be features where that is not the case, where perhaps there are orthogonal features or orthogonal behaviors that require (rather than merely allow) a "legacy" program to run with a particular feature disabled. 

Wrt the xEPC reg values and such, I don't think the hardware required to force bit 1 to zero upon MISA.C=0 mode switch is large or burdensome.  Mode bits such as these will already be available in the front and back-ends, so either the flop can be clocked and written to zero upon MISA.C=0 transition, or alternatively the bit 1 flop value can be masked by the value of MISA.C=0. 


One comparison that could be drawn here is to the defined behavior when writing MISA.MXL=1 (32-bit) on a RV64 core: "all operations must ignore source operand register bits above the configured XLEN, and must sign-extend results to fill the entire widest supported XLEN in the destination register. ... We require that operations always fill the entire underlying hardware registers with defined values to avoid implementation-defined behavior."

I would suggest this MISA.MXL mode-switching behavior spec, when translated to a MISA.C mode-switch, could be interpreted to mean that xEPC bits 1:0 must be ignored and read as zero when MISA.C=0, and bit 0 must be ignored and read as zero when MISA.C=1.   (But I'd also be fine with actually writing the bit to zero, if that ends up being the way it is defined.)
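
The read-masking variant could look roughly like this. It is a sketch of the suggestion only, with a hypothetical function name; `raw` stands for the value held in the xEPC flops.

```python
def read_xepc(raw, misa_c):
    """Value of xEPC as seen by CSR reads and by xRET."""
    # Bit 0 always reads as zero; bit 1 also reads as zero when MISA.C=0.
    ignored = 0b01 if misa_c else 0b11
    return raw & ~ignored


raw = 0x1002                              # flop contents, bit 1 set
print(hex(read_xepc(raw, misa_c=False)))  # prints 0x1000
print(hex(read_xepc(raw, misa_c=True)))   # prints 0x1002
```

Under this variant the stored bit 1 is untouched and becomes visible again if MISA.C is set later; actually writing the flop to zero on the mode switch would lose it.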

Regards,
Dan

Clifford Wolf

Feb 17, 2018, 7:36:26 AM
to Christopher Celio, Jose Renau, Andrew Waterman, RISC-V ISA Dev, Rishiyur Nikhil
Hi,

On Fri, Feb 16, 2018 at 04:42:29PM -0800, Christopher Celio wrote:
> > - If PC is not 32-bit aligned; just continue executing, consuming 32b
> > instructions that are 16-bit aligned (since the CPU is already
> > capable of doing this to support C).
>
> I believe I have convinced myself that the parenthetical that underpins
> this proposal is not always true. If MISA.C is disabled, then the processor
> may no longer be capable of executing misaligned instructions. I am in
> particular concerned about superscalar processors where power becomes a
> concern. [...]

This is an interesting point that I have not considered. Thanks.

> I believe that MISA.C should flush the pipeline and fetch PC+4. (the
> effects of uncompressed is immediate from the ISA view and misalignment is
> forbidden after the MISA.C).
>
> I believe that ALL instructions should be checked for misalignment.

How could they become misaligned other than via MISA.C clear and xRET?

Branches/jumps should check the new PC before committing the instruction, so
PC never becomes misaligned.

And the xVEC registers don't allow addresses that aren't 32b aligned.

> The instruction after a misaligned MISA.C should throw a misaligned
> exception.

Shouldn't the instruction that clears MISA.C throw that exception?
(We are talking about the mcause=0 exception, right?)

> Likewise, the instruction at xEPC after an xRET should throw a misaligned
> exception.

Likewise, shouldn't the xRET throw that exception?

> I would like to better understand why xRET can't throw an exception (since
> I'm not in the Formal Group I haven't gotten to hear all of the push-back
> against this option),

There is no push-back against this from the formal group afaik. I'd be
perfectly fine with this solution.

I'm just paraphrasing the pushback I got from Andrew (mostly
communicated through Jacob) back in October.


Alternatively the MISA.C clear could throw an illegal instruction exception
when it is unaligned or any of the xEPC CSRs point to an unaligned
address. That proposal has also been rejected.


Or we can throw an exception for an unaligned MISA.C and just say changing
MISA.C will leave an unspecified value in the xEPC registers (with xEPC[0]
always clear and xEPC[1] clear if !MISA.C). I particularly like that one
because it allows an implementation to do a few things: It could just mask
xEPC[1] when MISA.C is cleared (that mask would stop when MISA.C is set
again, thus a *change* to MISA.C will mess with xEPC, not just clearing
it). Or it could also just reset all xEPC regs to their reset values. This
proposal of course was also rejected.


In both cases the reason given was that the hardware cost is too high.


> but off the top of my head it seems that there would
> be additional complexity of a environment return conflicting with an
> exception within the CSR File. There is a lot of side-effects of both,
> which work in opposition to one another, so unless you can detect an
> excepting xRET on an earlier cycle, this seems problematic. Or maybe
> checking xEPC(1) && dec_is_xret is trivial for all designs?
>
> TL, DR: This proposal makes me uneasy, especially when we have such a small
> number of RVC designs to analyze.

I think it is even worse because most of them don't have a writeable
MISA.C. :)

regards,
- clifford

Stef O'Rear

Feb 17, 2018, 7:53:10 AM
to Rishiyur Nikhil, RISC-V ISA Dev
On Thu, Feb 15, 2018 at 7:02 AM, Rishiyur Nikhil <nik...@bluespec.com> wrote:
> BACKGROUND:
> Consider a RISC-V implementation that supports the 'C'
> extension (Compressed Instructions), and can clear the MISA.C bit at
> runtime (thereby switching off C support dynamically). What should be
> the behavior if PC is not currently 32-bit aligned? What if return
> addresses and trap vectors are not 32-bit aligned?
>
> This is a proposal to specify the behavior precisely.
>
> PROPOSAL:
> When MISA.C is cleared,
>
> - If PC is not 32-bit aligned; just continue executing, consuming 32b
> instructions that are 16-bit aligned (since the CPU is already
> capable of doing this to support C).
>
> - If a trap occurs and MTVEC contains an address that is
> 16-bit aligned and not 32-bit aligned, just continue executing,
> consuming 32b instructions that are 16-bit aligned (since the CPU
> is already capable of doing this to support C).
>
> - All jump or branch instructions that attempt to jump to an address
> that is not 32-bit aligned will cause an Instruction address
> misaligned exception, per the current spec (exception is taken at
> the jump, not at the target). The return address is saved as is,
> even if it is not 32-bit aligned.
>
> - CSR* instructions that write to [MSU]EPC will clear the two LSBs of
> the written address, i.e. EPC will be 32-bit aligned.
>
> - When PC is copied to [MSU]EPC as part of trap handling, it is copied
> as-is even if it points to an address that is not 32-bit aligned.
>
>
>
> Please let us know if you see any problems with this.
>
> [Thanks to Clifford Wolf for articulating this first.]
>
> Rishiyur Nikhil (Chair) and the ISA Formal Spec Technical Group

I have no objections to this proposal. It has interactions with the
SBI VM World Switch call, but the latter has not been specced yet, so
that is OK.

-s

Allen J. Baum

Feb 17, 2018, 6:26:03 PM
to Clifford Wolf, Christopher Celio, Jose Renau, Andrew Waterman, RISC-V ISA Dev, Rishiyur Nikhil
At 1:36 PM +0100 2/17/18, Clifford Wolf wrote:
>
>> The instruction after a misaligned MISA.C should throw a misaligned
>> exception.
>
>Shouldn't the instruction that clears MISA.C throw that exception?
>(We are talking about the mcause=0 exception, right?)

If we could trap on the CSRW instruction, it would be ideal.
The problem is that condition isn't easily detected until the time that the instruction is retiring, making it a bit awkward. If we're going to do that, then perhaps we should also allow traps on integer overflow - pretty much the same problem (a bit of a false equivalency - you could detect it a pipestage earlier in this particular case, but close enough).
The instruction following would work also, though implementing that in an OOO superscalar machine gets ugly fast, according to Chris.

> > Likewise, the instruction at xEPC after an xRET should throw a misaligned
>> exception.
>
>Likewise, shouldn't the xRET throw that exception?

Maybe; that decision (should xRET throw the exception or the instruction at the return point) will have to be made regardless since the condition leading to that case can occur whether or not we allow executing unaligned code with MISA.C=0.

But stepping back - the real issue is whether we allow execution of non-aligned code in a 32-bit aligned environment, even for a little while.
I think most would agree it's ugly, but don't agree that the cost of fixing it is worthwhile. I'd like to see a much more thorough discussion of the costs (which may have happened & I just never knew about it, in which case point me to the thread).

Of course, that will just lead to a further discussion of "what is too much cost".




--
**************************************************
* Allen Baum tel. (908)BIT-BAUM *
* 248-2286 *
**************************************************

Jacob Bachmeyer

Feb 17, 2018, 6:49:37 PM
to Clifford Wolf, Christopher Celio, Rishiyur Nikhil, RISC-V ISA Dev
Clifford Wolf wrote:
>>> PROPOSAL:
>>> When MISA.C is cleared,
>>>
>>> (1) If PC is not 32-bit aligned; just continue executing, consuming 32b
>>> instructions that are 16-bit aligned (since the CPU is already
>>> capable of doing this to support C).
>>>
>>> (2) If a trap occurs and MTVEC contains an address that is
>>> 16-bit aligned and not 32-bit aligned, just continue executing,
>>> consuming 32b instructions that are 16-bit aligned (since the CPU
>>> is already capable of doing this to support C).
>>>
>>> (3) All jump or branch instructions that attempt to jump to an address
>>> that is not 32-bit aligned will cause an Instruction address
>>> misaligned exception, per the current spec (exception is taken at
>>> the jump, not at the target). The return address is saved as is,
>>> even if it is not 32-bit aligned.
>>>
>>> (4) CSR* instructions that write to [MSU]EPC will clear the two LSBs of
>>> the written address, i.e. EPC will be 32-bit aligned.
>>>
>>> (5) When PC is copied to [MSU]EPC as part of trap handling, it is copied
>>> as-is even if it points to an address that is not 32-bit aligned.
>>>
>
> Note that (3) and (4) are exactly what every RISC-V processor should already
> do when !MISA.C.
>
> Also note that (1), (2), and (5) in fact remove complexity. In these points
> we say that a processor should *not* add extra complexity in areas where
> some might assume that the spec requires them to add complexity. (But in
> fact the current spec doesn't say anything about what to do in those
> situations.)

Perhaps I am mistaken, but (1) and (2) seem to preclude an optimization
for aligned fetch that might otherwise be possible when RVC is disabled.

Could we instead require *tvec to be 32-bit aligned in all modes? (Even
RVC trap handlers must *begin* on a 32-bit word boundary?) This would
eliminate the need to even implement the low two bits of *tvec (or
permit them to be reserved as future flags in all cases). If *tvec must
be 32-bit aligned, then (1) can be simplified to "take instruction
misaligned trap".

Having an extra "ghost RVC mode" just feels too much like some of the
crazy magic on x86 for my taste.


-- Jacob

Stef O'Rear

Feb 17, 2018, 6:51:59 PM
to Rishiyur Nikhil, RISC-V ISA Dev
On Thu, Feb 15, 2018 at 7:02 AM, Rishiyur Nikhil <nik...@bluespec.com> wrote:
> - If a trap occurs and MTVEC contains an address that is
> 16-bit aligned and not 32-bit aligned, just continue executing,
> consuming 32b instructions that are 16-bit aligned (since the CPU
> is already capable of doing this to support C).

According to https://cdn.rawgit.com/riscv/riscv-isa-manual/master/release/riscv-privileged-v1.10.pdf#page=36
mtvec and stvec already cannot hold addresses that are not 32-bit
aligned, so this section is moot I think.
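
A minimal Python sketch of why the trap entry point cannot be misaligned, assuming the privileged v1.10 mtvec layout (BASE in the upper bits, a two-bit MODE field in bits 1:0). The function name `trap_target` is hypothetical, and treating the reserved MODE values as direct mode is a simplifying assumption of this sketch:

```python
MODE_DIRECT = 0
MODE_VECTORED = 1


def trap_target(mtvec: int, cause: int, is_interrupt: bool) -> int:
    """Return the PC a trap jumps to for a given mtvec value (v1.10 layout)."""
    base = mtvec & ~0b11   # BASE: mtvec with the two MODE bits forced to zero
    mode = mtvec & 0b11    # MODE: the low two bits
    if mode == MODE_VECTORED and is_interrupt:
        return base + 4 * cause   # vectored interrupts enter at BASE + 4*cause
    # Direct mode; reserved MODE values are treated as direct here (assumption).
    return base
```

Whatever value is written to mtvec, the low two bits never reach the PC, so every entry point this sketch can produce is a multiple of 4.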

-s

Jacob Bachmeyer

Feb 17, 2018, 7:01:55 PM
to Jose Renau, Rishiyur Nikhil, RISC-V ISA Dev
Jose Renau wrote:
> -If the MISA.C is cleared by a compressed instruction. The instruction
> following the MISA.C clear must be 32bit aligned or an illegal instruction
> exception is raised.

This is not possible: misa is a CSR and there are no CSR access
instructions in RVC. The C bit can only be adjusted by a 32-bit
instruction. Do you mean to require that that instruction be 32-bit
aligned?

On the other hand, might we all be missing the forest for the trees
here? The misa CSR is only accessible in M-mode. A monitor that
permits clearing misa.C cannot use RVC itself, otherwise the monitor
would risk illegal instruction exceptions if a part that uses RVC is
executed with misa.C clear. Therefore, mtvec must always be aligned to
a word boundary, execution at the point where misa.C is cleared can only
be 32-bit instructions on 32-bit alignment, and any possible return to
RVC code must cross privilege levels. MRET can trap for misaligned
instruction if mepc is not 32-bit aligned and misa.C is clear.


-- Jacob

Clifford Wolf

Feb 17, 2018, 7:27:07 PM
to Allen J. Baum, Christopher Celio, Jose Renau, Andrew Waterman, RISC-V ISA Dev, Rishiyur Nikhil
Hi,

On Sat, Feb 17, 2018 at 03:25:54PM -0800, Allen J. Baum wrote:
> >> The instruction after a misaligned MISA.C should throw a misaligned
> >> exception.
> >
> >Shouldn't the instruction that clears MISA.C throw that exception?
> >(We are talking about the mcause=0 exception, right?)
>
> If we could trap on the CSRW instruction, it would be ideal.
> The problem is that condition isn't easily detected until the time that
> the instruction is retiring, making it a bit awkward. If we're going to
> do that, then perhaps we should also allow traps on integer overflow -
> pretty much the same problem (a bit of a false equivalency - you could
> detect it a pipestage earlier in this particular case, but close enough).

I think it would be ok for an MISA.C write to flush the pipeline. So it
should be far less problematic than something like an exception for
division by zero.

> The instruction following would work also, though implementing that in an
> OOO superscalar machine gets ugly fast, according to Chris.

Right now there is no exception for "this instruction is unaligned" afaiu,
only for "this instruction is trying to make PC unaligned" (mcause=0). So
it would require adding an extra exception just for this circumstance if
the exception should be thrown for the instruction following.

Btw, yet another possible solution would be to make MISA.C read-only when
PC[1] or any xEPC[1] bit is set. This would turn an attempt to clear MISA.C
into a NOP when it's not safe to disable MISA.C. (Which would match the
behavior of architectures that do not allow clearing MISA.C at all.)
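
A minimal Python sketch of this read-only rule. It assumes that all other misa bits simply take the written value (real WARL behaviour is ignored), and the function name `write_misa` and the `epcs` list are hypothetical:

```python
MISA_C = 1 << 2  # 'C' is the third letter, so bit 2 of the misa Extensions field


def write_misa(misa: int, new_value: int, pc: int, epcs: list) -> int:
    """Return misa after a CSR write, with C read-only while any PC/xEPC bit 1 is set."""
    unsafe = bool(pc & 2) or any(epc & 2 for epc in epcs)
    if unsafe:
        # C keeps its current value; the other bits take the written value
        # (a simplification of this sketch, not something the thread specifies).
        return (new_value & ~MISA_C) | (misa & MISA_C)
    return new_value
```

With an aligned PC and aligned xEPC values, `write_misa(MISA_C, 0, 0x1000, [0x2000])` clears C; with `pc=0x1002` the same write leaves C set, so the attempt behaves like a no-op for that bit.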

regards,
- clifford

Jacob Bachmeyer

Feb 17, 2018, 7:46:53 PM
to Clifford Wolf, Christopher Celio, Jose Renau, Andrew Waterman, RISC-V ISA Dev, Rishiyur Nikhil
Clifford Wolf wrote:
> Hi,
>
> On Fri, Feb 16, 2018 at 04:42:29PM -0800, Christopher Celio wrote:
>> The instruction after a misaligned MISA.C should throw a misaligned
>> exception.
>>
>
> Shouldn't the instruction that clears MISA.C throw that exception?
> (We are talking about the mcause=0 exception, right?)
>
>
>> Likewise, the instruction at xEPC after an xRET should throw a misaligned
>> exception.
>>
>
> Likewise, shouldn't the xRET throw that exception?
>
>
>> I would like to better understand why xRET can't throw an exception (since
>> I'm not in the Formal Group I haven't gotten to hear all of the push-back
>> against this option),
>>

I actually suggested having MRET raise an exception just earlier and
then read a message that reminded me why that cannot work: xRET cannot
raise an exception, because an exception at xRET destroys the state that
the xRET needs to operate. MRET jumps to <mepc> in <mstatus.MPP> mode.
If MRET traps instead, mepc will be overwritten with the address of MRET
and MPP set to M. Now the trap handler has to untangle the mess and
reconstruct the original mepc and mstatus values (OK, so the monitor
should have stashed these on the stack somewhere that is still (barely)
valid) before it can resume, although if misa.C is cleared and the
higher levels are relying on RVC support, resuming execution is not
actually possible -- a trap must be delivered (by software delegation)
instead.
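
The state loss described above can be sketched in Python. This is a toy model under stated assumptions: only pc, the current mode, mepc and mstatus.MPP are tracked, MRET's other mstatus updates are left out, and the names `TrapState`, `take_trap_to_m` and `mret` are hypothetical:

```python
from dataclasses import dataclass, replace

M_MODE = 3  # privilege-level encoding for machine mode


@dataclass(frozen=True)
class TrapState:
    pc: int
    mode: int
    mepc: int
    mpp: int


def take_trap_to_m(s: TrapState, handler: int) -> TrapState:
    # A trap into M-mode saves the faulting pc in mepc and the old mode in MPP.
    return replace(s, mepc=s.pc, mpp=s.mode, pc=handler, mode=M_MODE)


def mret(s: TrapState, misa_c: bool, handler: int, trap_on_misaligned: bool) -> TrapState:
    if trap_on_misaligned and not misa_c and (s.mepc & 2):
        # Hypothetical variant in which MRET itself raises the exception.
        return take_trap_to_m(s, handler)
    # Normal MRET: jump to mepc in the mode held in MPP (MPP is left unchanged
    # here, which is a simplification of this sketch).
    return replace(s, pc=s.mepc, mode=s.mpp)
```

Starting from `TrapState(pc=0x80000100, mode=M_MODE, mepc=0x4002, mpp=0)` with C disabled and the trapping variant enabled, the result has `mepc == 0x80000100` and `mpp == M_MODE`: the original return address `0x4002` and the original target mode are no longer in any CSR.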

> I'm just paraphrasing the pushback I got from Andrew (mostly
> communicated through Jacob) back in october.
>

There must be another Jacob; I have checked my past emails and the only
pushback I gave you last October was concerns about possible patents on
bitwise parallel extract/deposit.


-- Jacob

Allen J. Baum

Feb 17, 2018, 8:28:15 PM
to jcb6...@gmail.com, Clifford Wolf, Christopher Celio, Jose Renau, Andrew Waterman, RISC-V ISA Dev, Rishiyur Nikhil
At 6:46 PM -0600 2/17/18, Jacob Bachmeyer wrote:
>
>I actually suggested having MRET raise an exception just earlier and then read a message that reminded me why that cannot work: xRET cannot raise an exception, because an exception at xRET destroys the state that the xRET needs to operate. MRET jumps to <mepc> in <mstatus.MPP> mode. If MRET traps instead, mepc will be overwritten with the address of MRET and MPP set to M. Now the trap handler has to untangle the mess and reconstruct the original mepc and mstatus values (OK, so the monitor should have stashed these on the stack somewhere that is still (barely) valid) before it can resume, although if misa.C is cleared and the higher levels are relying on RVC support, resuming execution is not actually possible -- a trap must be delivered (by software delegation) instead.

Actually, I'm OK if this rare case takes effort to untangle or is even impossible to untangle.
I'm also OK with it causing a fatal error (e.g. NMI) or getting into an infinite loop.
"If it hurts when you do that, don't do that", and this is in code that M-mode should ensure has the correct alignment to begin with.

I do want formal verification to be happy, and if either of those options makes them happy, I'm happy (well, for this particular case).

Clifford wrote:
>
>Btw, yet another possible solution would be to make MISA.C read-only when
>PC[1] or any xEPC[1] bit is set. This would turn an attempt to clear MISA.C
>into a NOP when it's not safe to disable MISA.C. (Which would match the
>behavior of architectures that do not allow clearing MISA.C at all.)

I think that is a really elegant solution.

Obeying platform restrictions (the CSRW that clears MISA.C must be aligned) would then simply work. Buggy code would do something weird but predictable.
And it's pretty cheap.
Very (very) slightly cheaper is if it only allows setting .C, but not clearing it (by a couple of gates).

Clifford Wolf

Feb 18, 2018, 6:04:29 AM
to Jacob Bachmeyer, Andrew Waterman, RISC-V ISA Dev
On Sat, Feb 17, 2018 at 06:46:48PM -0600, Jacob Bachmeyer wrote:
> >I'm just paraphrasing the pushback I got from Andrew (mostly
> >communicated through Jacob) back in october.
>
> There must be another Jacob; I have checked my past emails and the
> only pushback I gave you last October was concerns about possible
> patents on bitwise parallel extract/deposit.

Yes. That was Jacob Chang from SiFive. ;)

Clifford Wolf

Feb 18, 2018, 6:07:18 AM
to Allen J. Baum, jcb6...@gmail.com, Christopher Celio, Jose Renau, Andrew Waterman, RISC-V ISA Dev, Rishiyur Nikhil
On Sat, Feb 17, 2018 at 05:28:10PM -0800, Allen J. Baum wrote:
> Very (very) slightly cheaper is if it only allows setting .C, but not clearing it (by a couple of gates).

Unfortunately that's not possible: The spec says that the init value of
MISA must be the max feature set. So MISA.C must be already set on reset.

Bruce Hoult

Feb 18, 2018, 6:40:28 AM
to Clifford Wolf, Allen J. Baum, Jacob Bachmeyer, Christopher Celio, Jose Renau, Andrew Waterman, RISC-V ISA Dev, Rishiyur Nikhil
Is that even possible, logically?

What if you have mutually-exclusive features? For example, maybe your processor supports C, but also supports another feature (perhaps private) that uses the 16-bit instruction space for something different -- perhaps vectors or GPU or ML or whatever.

Clifford Wolf

Feb 18, 2018, 7:18:49 AM
to Bruce Hoult, Allen J. Baum, Jacob Bachmeyer, Christopher Celio, Jose Renau, Andrew Waterman, RISC-V ISA Dev, Rishiyur Nikhil
Hi,

On Sun, Feb 18, 2018 at 02:40:24PM +0300, Bruce Hoult wrote:
>>> Very (very) slightly cheaper is if it only allows setting .C, but not
>>> clearing it (by a couple of gates).
>>
>> Unfortunately that's not possible: The spec says that the init value of
>> MISA must be the max feature set. So MISA.C must be already set on reset.
>
> Is that even possible, logically?
>
> What if you have mutually-exclusive features? For example, maybe your
> processor supports C, but also supports another feature (perhaps private)
> that uses the 16-bit instruction space for something different -- perhaps
> vectors or GPU or ML or whatever.

Exactly my first thought when I read this in the spec.. :)

The exact quote from the spec:

At reset, the Extension field should contain the maximal set of supported
extensions, and I should be selected over E if both are available.

Afaiu I and E are the only exclusive standard extensions, so this case is
explicitly covered by the spec.

All that remains are conflicts between standard extensions and non-standard
extensions (that would not be represented by flags in MISA). My reading of
the spec is that in these cases the standard extensions should be enabled
and the conflicting non-standard extensions disabled on reset.
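
As a rough Python sketch of the quoted reset rule: the Extensions field uses one bit per letter (bit 0 for A through bit 25 for Z), the reset value contains every supported extension, and I wins over E when both are available. The helper names `ext_bit` and `reset_extensions` are hypothetical, and non-standard extensions are outside this model:

```python
def ext_bit(letter: str) -> int:
    """Bit mask of a single-letter extension in the misa Extensions field."""
    return 1 << (ord(letter.upper()) - ord("A"))


def reset_extensions(supported: str) -> int:
    """Reset value of the Extensions field for a string of supported letters."""
    letters = set(supported.upper())
    if "I" in letters and "E" in letters:
        letters.discard("E")  # I is selected over E if both are available
    bits = 0
    for letter in letters:
        bits |= ext_bit(letter)
    return bits
```

For a core that supports C, `reset_extensions("IMAC") & ext_bit("C")` is nonzero, which is why MISA.C is already set when the core leaves reset.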

regards,
- clifford

Allen J. Baum

Feb 19, 2018, 2:02:23 AM
to Clifford Wolf, jcb6...@gmail.com, Christopher Celio, Jose Renau, Andrew Waterman, RISC-V ISA Dev, Rishiyur Nikhil
I'm missing something here.

You suggested making MISA.C read-only when PC[1] or any xEPC[1] bit is set.
I suggested disallowing clearing it when PC[1] or any xEPC[1] bit is set,
as a (very) small saving of a couple of gates.
What does this have to do with the initial value at reset?
My assumption here is that the reset value is whatever the reset value is, and the only other changes can occur when a CSRW to MISA is executed.

In both cases (after reset with MISA.C=1) clearing MISA.C works if the CSRW instruction is 32bit aligned. The difference only shows up if you try to set MISA.C when the CSRW op is not aligned: in your proposal you can't, in my proposal (though I hesitate to even call it a proposal, more like an observation) you can.

So, are you saying even your suggestion violates the spec, or just mine?
And, in either case: what does that have to do with the reset value?

Clifford Wolf

Feb 19, 2018, 6:10:23 AM
to Allen J. Baum, jcb6...@gmail.com, Christopher Celio, Jose Renau, Andrew Waterman, RISC-V ISA Dev, Rishiyur Nikhil
Hi,

On Sun, Feb 18, 2018 at 11:02:16PM -0800, Allen J. Baum wrote:
> At 12:07 PM +0100 2/18/18, Clifford Wolf wrote:
> >On Sat, Feb 17, 2018 at 05:28:10PM -0800, Allen J. Baum wrote:
> >> Very (very) slightly cheaper is if it only allows setting .C, but not clearing it (by a couple of gates).
> >
> >Unfortunately that's not possible: The spec says that the init value of
> >MISA must be the max feature set. So MISA.C must be already set on reset.
>
> I'm missing something here.
>
> You suggested making MISA.C read-only when PC[1] or any xEPC[1] bit is set.
> I suggested disallowing clearing it when PC[1] or any xEPC[1] bit is set,

Then I misunderstood you. I thought you suggested disallowing clearing it
at all, so one could switch from RVG to RVC but never back. (And that would
not work because MISA.C must be already active at reset when the processor
supports it.)

> In both cases (after reset with MISA.C=1) clearing MISA.C works if the
> CSRW instruction is 32bit aligned. The difference only shows up if you
> try to set MISA.C when the CSRW op is not aligned: in your proposal you
> can't, in my proposal (though I hesitate to even call it a proposal, more
> like an observation) you can.

I think our suggestions are identical because in this solution PC[1] or
xEPC[1] can only be set when MISA.C is already active. So it doesn't matter
if we say it's read-only or disallow clearing, as clearing would be the
only possible write operation anyways.
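
The equivalence argument can be checked exhaustively with a small Python sketch. It models only the C bit, a single "unsafe" flag standing for "PC[1] or some xEPC[1] is set", and the written value; all function names are hypothetical:

```python
from itertools import product


def read_only_when_unsafe(c: int, unsafe: int, w: int) -> int:
    # Rule 1: while unsafe, any write to C is ignored.
    return c if unsafe else w


def no_clear_when_unsafe(c: int, unsafe: int, w: int) -> int:
    # Rule 2: while unsafe, only writes that would clear C are ignored.
    return c if (unsafe and w == 0) else w


def differing_states(require_invariant: bool) -> list:
    """List every (c, unsafe, w) combination on which the two rules disagree."""
    diffs = []
    for c, unsafe, w in product((0, 1), repeat=3):
        if require_invariant and unsafe and c == 0:
            continue  # invariant: bit 1 of PC/xEPC can only be set while C is enabled
        if read_only_when_unsafe(c, unsafe, w) != no_clear_when_unsafe(c, unsafe, w):
            diffs.append((c, unsafe, w))
    return diffs
```

With the invariant enforced, `differing_states(True)` is empty, so the two rules coincide. Without it, the only disagreement is `(0, 1, 1)`: setting C from a misaligned state, which is the case mentioned earlier in the exchange.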

regards,
- clifford

Allen J. Baum

Feb 19, 2018, 2:33:09 PM
to Clifford Wolf, jcb6...@gmail.com, Christopher Celio, Jose Renau, Andrew Waterman, RISC-V ISA Dev, Rishiyur Nikhil
Good - I was hoping we were agreeing.

Andrew Waterman

Feb 20, 2018, 2:58:04 PM
to Jose Renau, Clifford Wolf, Christopher Celio, RISC-V ISA Dev, Rishiyur Nikhil
This also seems like the cleanest solution to me: any attempt to execute an instruction with a misaligned PC raises a misaligned-instruction exception.  This is easier to explain and simpler to verify than the other proposed alternatives.  I withdraw my objection about increased cost from a few months ago, as it really only is a couple gates, and implementations that are pinching pennies can simply hard-wire misa.C.

On Thu, Feb 15, 2018 at 12:16 PM, Jose Renau <re...@ucsc.edu> wrote:

Bruce Hoult

Feb 20, 2018, 3:12:49 PM
to Andrew Waterman, Jose Renau, Clifford Wolf, Christopher Celio, RISC-V ISA Dev, Rishiyur Nikhil
That's clean and simple .. at the moment.

I see two objections:

1) one of the proposed uses for disabling C was to trap and emulate 16 bit instructions (possibly with different semantics). That will be pretty big overhead if 16 bit instructions are common, but if misaligned 32 bit instructions trap too then it will be even worse!

2) This definition makes no sense in some future CPU that supports 48 bit instructions but not the C extension.

I'd suggest that the notion of "defined behaviour" would allow a solution such as:

On any given CPU, misaligned 32 bit instructions while MISA.C is cleared must EITHER:
a) always trap with a misaligned instruction exception, OR
b) always work correctly

A situation where an in-flight LB instruction ends up executing as a LW (or whatever it is that Rocket is doing) is clearly unacceptable.


Stef O'Rear

Feb 20, 2018, 3:14:34 PM
to Andrew Waterman, Jose Renau, Clifford Wolf, Christopher Celio, RISC-V ISA Dev, Rishiyur Nikhil
On Tue, Feb 20, 2018 at 11:57 AM, Andrew Waterman <and...@sifive.com> wrote:
> This also seems like the cleanest solution to me: any attempt to execute an
> instruction with a misaligned PC raises a misaligned-instruction exception.
> This is easier to explain and simpler to verify than the other proposed
> alternatives. I withdraw my objection about increased cost from a few
> months ago, as it really only is a couple gates, and implementations that
> are pinching pennies can simply hard-wire misa.C.

It seems quite inconsistent to me that we would have two possibilities
for misaligned instruction exceptions:

1. After a JAL or JALR; mepc points to the JAL or JALR

2. After a write to MISA; mepc points *one after* the MISA write

Unless you meant that the trap would be taken with mepc pointing *to*
the MISA write?

This also doesn't address what happens to the *EPC registers if bit 1
of any of them is nonzero.

My preferences, ranked:

1. csrw misa raises a misalignment (or illegal) exception if any of
pc[1], mepc[1], sepc[1], bsepc[1], or uepc[1] is 1

2. csrw misa is a no-op if any of the above are 1

3. the original proposal from Rishiyur Nikhil

-s

Stef O'Rear

Feb 20, 2018, 3:18:11 PM
to Bruce Hoult, Andrew Waterman, Jose Renau, Clifford Wolf, Christopher Celio, RISC-V ISA Dev, Rishiyur Nikhil
On Tue, Feb 20, 2018 at 12:12 PM, Bruce Hoult <br...@hoult.org> wrote:
> That's clean and simple .. at the moment.
>
> I see two objections:
>
> 1) one of the proposed uses for disabling C was to trap and emulate 16 bit
> instructions (possibly with different semantics). That will be pretty big
> overhead if 16 bit instructions are common, but if misaligned 32 bit
> instructions trap too then it will be even worse!

Unfortunately we defined the mepc and sepc registers so that bit[1] is
hardwired to 0 if C is not supported. If you want to change that, it
needs to be in a new thread, because it is out of scope for this one.
With that behavior, misaligned 32 bit instructions cannot execute
normally if they include traps and trap returns.
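
A minimal Python sketch of that xEPC masking, assuming the rule as described here (bit 0 always zero, bit 1 also zero when C is not supported). The function name `write_epc` is hypothetical:

```python
def write_epc(value: int, c_supported: bool) -> int:
    """Value actually held in mepc/sepc after a write or a trap saves `value`."""
    mask = ~0b1 if c_supported else ~0b11
    return value & mask
```

On hardware without C, a trap taken at a hypothetical misaligned PC of `0x1002` would record `write_epc(0x1002, False) == 0x1000`, so the trap return would resume two bytes early. With C supported, `0x1002` is preserved.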

> 2) This definition makes no sense in some future CPU that supports 48 bit
> instructions but not the C extension.

What I intend to propose if the matter comes up is that on hardware
without C, 48-bit instructions must be followed by a C.NOP and the
pair is treated as a unit.

-s

Andrew Waterman

Feb 20, 2018, 3:28:17 PM
to Stef O'Rear, Jose Renau, Clifford Wolf, Christopher Celio, RISC-V ISA Dev, Rishiyur Nikhil
On Tue, Feb 20, 2018 at 12:14 PM, Stef O'Rear <sor...@gmail.com> wrote:
> On Tue, Feb 20, 2018 at 11:57 AM, Andrew Waterman <and...@sifive.com> wrote:
> > This also seems like the cleanest solution to me: any attempt to execute an
> > instruction with a misaligned PC raises a misaligned-instruction exception.
> > This is easier to explain and simpler to verify than the other proposed
> > alternatives. I withdraw my objection about increased cost from a few
> > months ago, as it really only is a couple gates, and implementations that
> > are pinching pennies can simply hard-wire misa.C.
>
> It seems quite inconsistent to me that we would have two possibilities
> for misaligned instruction exceptions:
>
> 1. After a JAL or JALR; mepc points to the JAL or JALR
>
> 2. After a write to MISA; mepc points *one after* the MISA write
>
> Unless you meant that the trap would be taken with mepc pointing *to*
> the MISA write?

It is inconsistent but not actually problematic.

> This also doesn't address what happens to the *EPC registers if bit 1
> of any of them is nonzero.

No, it does address that case. After executing xRET, you'll take a misaligned-instruction exception on the address that was stored in xEPC.

> My preferences, ranked:
>
> 1. csrw misa raises a misalignment (or illegal) exception if any of
> pc[1], mepc[1], sepc[1], bsepc[1], or uepc[1] is 1

Introducing a data-dependent trap on CSR writes is a new hazard, so this is not preferable.

It also makes it harder to store the xEPC registers in a RAM, as you'd like to do for very cheap implementations, since you need to access all the epc[1] bits at the same time.

> 2. csrw misa is a no-op if any of the above are 1

I think this option is preferable to your option 1, but it still has the drawback of needing access to all the epc[1] bits at the same time.

Stef O'Rear

Feb 20, 2018, 3:31:25 PM
to Andrew Waterman, Jose Renau, Clifford Wolf, Christopher Celio, RISC-V ISA Dev, Rishiyur Nikhil
On Tue, Feb 20, 2018 at 12:27 PM, Andrew Waterman <and...@sifive.com> wrote:
> No, it does address that case. After executing xRET, you'll take a
> misaligned-instruction exception on the address that was stored in xEPC.

Does this mean that a supervisor can probe for "C present but possibly
disabled" by writing and then attempting to read back a misaligned
value from sepc?

-s

Christopher Celio

Feb 20, 2018, 3:33:59 PM
to Stef O'Rear, Andrew Waterman, Jose Renau, Clifford Wolf, RISC-V ISA Dev, Rishiyur Nikhil
The proposal (as Jose and Andrew have iterated):

If RVC is disabled, an exception must be thrown on any misaligned 4-byte (or 8-byte) instruction. Clearly, if RVC is enabled or a 48b-extension is enabled, the alignment restriction of 4-byte/8-byte instructions is relaxed.

Allowing a "delay slot" for alignment requirements is not possible. If RVC is disabled, the RVC-expansion hardware is disabled and the processor will NOT work. Full stop.

Detecting misalignment is trivial in hardware. No extra flops, and like 10 gates of hardware.
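
As a rough illustration in Python, the check amounts to one bit of the PC gated by the C enable. This sketch assumes that C is the only extension relaxing 4-byte alignment and that PC bit 0 is always zero; the function name `fetch_misaligned` is hypothetical:

```python
def fetch_misaligned(pc: int, misa_c: bool) -> bool:
    """True if fetching at `pc` should raise a misaligned-instruction exception."""
    return (not misa_c) and bool(pc & 0b10)
```

For example, `fetch_misaligned(0x1002, misa_c=False)` is true, while the same PC with `misa_c=True` is fine.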


> Unless you meant that the trap would be taken with mepc pointing *to*
> the MISA write?

No, that's too much hardware. It is not feasible to throw the exception on the MISA write. The instruction that clears MISA.C is a CSR write that targets the MISA address with a data-dependent mask that happens to clear the C bit (so both the CSR address and the mask are data-dependent!). Likewise, the xRET would be another data-dependent exception. And in both instances, you must broadcast the xEPC bits to the decode/exception-catcher unit.
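
To make the data dependence concrete, here is a small Python sketch of the decision an implementation would have to make before it could trap on the write itself. The misa CSR address (0x301) and the C bit position (bit 2) come from the privileged spec; the function names and the string opcodes are hypothetical, and the immediate forms of the CSR instructions are omitted:

```python
MISA_ADDR = 0x301   # CSR address of misa
MISA_C = 1 << 2     # C bit in the Extensions field


def csr_result(op: str, old: int, operand: int) -> int:
    """New CSR value for the three register-form CSR instructions."""
    if op == "csrrw":
        return operand           # write
    if op == "csrrs":
        return old | operand     # set bits
    if op == "csrrc":
        return old & ~operand    # clear bits
    raise ValueError(f"unknown CSR op: {op}")


def clears_misa_c(csr_addr: int, op: str, old: int, operand: int) -> bool:
    """True only if this CSR instruction turns a set C bit into a clear one."""
    if csr_addr != MISA_ADDR:
        return False
    new = csr_result(op, old, operand)
    return bool(old & MISA_C) and not (new & MISA_C)
```

The same `csrrc` to misa clears C with operand `0x4` but not with operand `0x100`, so the outcome is only known once the register operand has been read.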


-Chris

Andrew Waterman

Feb 20, 2018, 3:38:24 PM
to Stef O'Rear, Jose Renau, Clifford Wolf, Christopher Celio, RISC-V ISA Dev, Rishiyur Nikhil
"C or some other (n*32 + 16)-bit extension present but possibly disabled."

Whether sepc[1] should be writable when all such extensions are disabled is a related but separate question.

Cesar Eduardo Barros

Feb 20, 2018, 6:30:05 PM