Hi,
sorry, long mail. If it's too long please skip to the and start reading
at "This is not about security".
On Thu, Feb 15, 2018 at 02:50:07PM -0800, Christopher Celio wrote:
> > The spec has been crafted carefully avoid this situation by making sure xEPC will never
> > be able to hold a value that would make xRET trap. Our proposal preserves
> > this guarantee.
>
> I don't understand the full ramifications of this scenario. Yes, this is
> a bad situation, but at least it's verifiable and understandable. Does
> this cause us to infinite loop? How recoverable does this situation need
> to be?
I'm just repeating the arguments I heard in october for why this suggestion
is not acceptable.
I took all the "we can't do that because ..." argument and crafted a
solution that requires no extra hardware (because this was the main
argument against most suggestions) and avoid also the other things that
have been brought up.
I personally don't care much as long as we have can avoid complete
undefined behavior.
As I've said in my previous mail, rocket currently can retire a killed LB
instruction as LW as a result of clearing MISA.C with an unaligned PC and
this is deemed okay because clearing MISA.C with an unaligned PC is
undefined behavior in the current spec. (It wasn't when I found the
behavior, but when I reported it it was added to the spec instead of fixing
rocket.)
--8<--
Quick sidenote: What is undefined behavior? Usually it means the core can
do whatever it wants whenever it wants, even before the unaligned MISA.C
clear hits as long as the core is destined to hit it (undefined behavior is
not causal). This is completely unacceptable for formal verification. We
would need to solve the halting problem in order to determine if the core
is destined to hit the unaligned MISA.C clear.
In one mail in october Andrew said undefined behavior in this sense is
causal (good!). But even causal undefined behavior screws up formal.
Suddenly all properties depend on if the processor hit this one obscure
state in its past. That drastically complicates everything.
I've also been told that "undefined behavior" in the sense of the RISC-V
spec is causal and gurantees that it "won't violate protections and won't
hang". But this is a very problematic definition because it is not a
statement in terms of CPU state but a statement in terms of emergent
behavior.
For example: What if the core suddenly starts interpreting every
instruction as "j 0", i.e. endless loop. It still keeps executiong
instructions, so does it hang?
You might think that it is crazy to even consider this correct behavior,
but rocket currently can be made to retire a LW instruction that is not
there as response to clearing MISA.C, and apparently that is okay according
to the definition.
I think we should be able to get away in the formal spec with unspecified
values and situations where an implementation may chose one of multiple
concrete options (aka "unpredictable behavior"). For example, instead of
saying that clearing MISA.C with xEPC pointing to an unaligned address is
undefined behavior we could say that clearing MISA.C will leave undefined
values in all xEPC registers.
--8<--
We in the formal spec WG don't think that clearing MISA.C justifies undefined
behavior. We even think it is possible to define it in a way so that it has
no unspecified or unpredictable aspects and adds no extra hardware.
> > In the second case we would effectively execute random code. (Whatever the
> > interpretation of the code is when offset by 2 bytes.) This has been deemed
> > less desireable than all other options by pretty much everyone I spoke to.
>
> Do we not find ourselves effectively executing random code anyways? The
> original proposal is that CSR* writes to xEPC clear the two LSBs. Is that
> different from ignoring the two LSBs during xRET? [..]
Yes, that's very different.
Because this is only for CSR* writes to xEPC, not for when the PC gets
copied into xEPC as a result of a trap. Clearing the low bits in xEPC on
write is spec behavior now (3.1.19 of Privileged Architectures V1.10), not
something we have added.
So with our proposal you can still take a trap in unaligned code and return
correctly. We only get a problem when people try to set xEPC by copying a
register value into it. (Which is already the case right now.)
Note that our proposal completely stays within the realm of what is
currently completely unspecified. So all guarantees that the current spec
gives to software still hold.
> I'm not yet convinced that forcing alignment on xRET is worse than
> anything else proposed.
I don't say it is. I'm saying it the suggestion was already shut down in
october because it adds extra hardware complexity.
Our proposal doesn't add hardware complexity. It's just a bit harder to get
your head around but it is actually simpler than what (at least) rocket is
doing right now and it only clarifies some cases that are currently
unspecified because the spec is for the most part written in a way that
assumes that MISA.C is read-only.
Let me explain what I mean using my original 3 clause proposal (that should
be semantically identicall to the 5 clause proposal nikhil posted):
--snip--
A processor with C support and writable misa.C should behave as follows
when misa.C is cleared:
(1) If not stated otherwise below, behave the same as if misa.C would be set.
Specifically, if PC points to an address that isn't aligned to 32 bits
the core should just execute unaligned code.
(2) All jump or branch instructions that attempt to jump to an address that
is not aligned to 32 bits will cause an Instruction address misaligned
exception.
(3) Writes to [msu]epc will clear the two LSB of the written address. This
is only the case for the CSR* instructions that explicitly write to
[msu]epc. When PC is copied to [msu]epc as part of trap handling than
it is copied as-is even when it points to an address not aligned to
32 bits.
--snap--
Clause (1) says that with the exception of (2) and (3) there is no need to
use MISA.C anywhere in the core (with the exception of the instruction
decoder that should forget about C encodings of course).
This is important! It means that if you have anything else in your core
right now that uses misa.C then you can remove that logic. It is not
needed.
Clause (2) is something that a core should already do. It is specified in
2.5 of RISC-V User-Level ISA V2.2.
Clause (3) is two parts. First the part about clearing bits during CSR*
writes to [msu]epc. This is something a core should already do. It is
specified in 3.1.19 of Privileged Architectures V1.10. And second the
copying from PC to [msu]epc. This is essentially a clarification that
no special action is required if PC is not aligned to 32b.
Your proposal is to mask the LSB bits of xEPC on xRET to fix the address.
This would be an additional action ontop of the clauses above. It would
not remove hardware anywhere else from the system but it would require you
to add a few extra gates to the xRET logic.
> I'm trying to imagine a security hole where a misaligned (and malicious)
> xRET takes us to a unaligned 4 byte jump gadget (that serendipitously was
> created out of the two halves of adjacent and aligned 4-byte
> instructions) which takes us to an attack vector that's correctly 4 byte
> aligned. The fact that the xRET was misaligned is never discovered. Or
> would an attacker never get access to xEPC so this hypothetical can never
> be realized?
This is not about security.
Proper OS would only clear misa.C when in aligned code and xEPC pointing to
aligned addresses.
If misa.C is cleared and PC and xEPC point to 32b aligned addresses then
you can never have an unaligned address in PC and xEPC without first
setting misa.C again. That's the whole point: Emulating RVG on RVC.
This is about avoiding unnecessary undefined behavior so that we can have a
clean formal specification. Our proposal does this without adding any
additional complexity to the hardware. In fact, I believe it simplifies
implementations.
> I'm sorry I haven't spend more than a few hours thinking about this
> topic. I don't know the correct solution off the top of my head to the
> xRET/MTVEC problem, but I'm concerned if the purpose of turning off RVC
> for RVC processors is to emulate RVG processors, having a co-simulation
> mismatch on xRET seems really problematic.
How would you have a co-simulation mismatch on xRET if you only clear
misa.C when in aligned code and xEPC pointing to aligned addresses?
The whole point of our proposal is to guarantee complete compatibility with
RVG if we start in a state that is valid for a RVG processors. I don't see
how you could run into any co-simulation mismatches if you start with a
valid state.
> RVG should be about all 4 byte instructions being aligned. Allowing
> corner case exceptions to this is worrisome.
But how would you ever get into those corner cases? Could you provide an
example? The only way to do that is to start with a state a RVG system can
never get into without first setting it up in RVC mode and then clearing
misa.C. And then you are talking about RVC system behavior, not RVG,
because in order to get into that state you had to be in C mode for at
least a few instructions.
I don't know.. maybe we are misunderstanding each other here. Or maybe
there is a bug in my proposal. In the latter case I would appriciate an
example that demonstrates that you could get a co-simulation mismatch with
a core that follows my proposal.
regards,
- clifford
--
Tell people there is an invisible man in the sky who created the universe, and
the vast majority will believe you. Tell them the paint is wet, and they have
to touch it to be sure.