PMP update and the content of TLB


chuanhua.chang

unread,
May 23, 2018, 10:25:03 PM5/23/18
to RISC-V ISA Dev
Since PMP checks are applied to page-table accesses during virtual-address translation, is there any architectural requirement, after the PMP is updated, to ensure that the contents of the TLB are still valid to use?

Does the HW have to invalidate all TLB contents, or does SW have to use sfence.vma to invalidate them?

This needs clarification. Thanks.


Chuanhua

Allen Baum

unread,
May 23, 2018, 11:10:48 PM5/23/18
to chuanhua.chang, RISC-V ISA Dev
Tricky case.
Nice.
This is a fetch of the page table entries themselves, correct?
Translation doesn’t require a TLB at all, and of course there is no guarantee that any particular translation will ever be inserted into one, but if there is a TLB, only the final translation could be put into it.

So are you asking whether, if there is a TLB miss and the final translation is disallowed by the PMP, that translation will be inserted into the TLB or not?
 
That sounds implementation dependent.

Note that this could occur before the final translation, in which case it couldn’t/shouldn’t be put into the TLB.

It wouldn’t hurt to clarify.

-Allen
--
You received this message because you are subscribed to the Google Groups "RISC-V ISA Dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to isa-dev+u...@groups.riscv.org.
To post to this group, send email to isa...@groups.riscv.org.
Visit this group at https://groups.google.com/a/groups.riscv.org/group/isa-dev/.
To view this discussion on the web visit https://groups.google.com/a/groups.riscv.org/d/msgid/isa-dev/b95fdfd8-672e-4adc-a92f-249bdc36242d%40groups.riscv.org.

Jacob Bachmeyer

unread,
May 23, 2018, 11:37:16 PM5/23/18
to Allen Baum, chuanhua.chang, RISC-V ISA Dev
Allen Baum wrote:
> Tricky case.
> Nice.
> This is a fetch of the page table entries themselves, correct?
> Translation doesn’t require a TLB at all, or that any particular
> translation will ever be guaranteed inserted into the TLB or not of
> course, but if there is one, only the final translation could be put
> into the TLB.
>
> So are you asking that if there is a TLB miss, and the final
> translation is disallowed by the PMP, will that translation be
> inserted into TLB or not?
>
> That sounds implementation dependent.
>
> Note that this could occur before the final translation, in which case
> it couldn’t/shouldn’t be put into the TLB.
>
> It wouldn’t hurt to clarify.

I think that the question is what happens to TLB entries that were read
from regions that are now inaccessible. In other words, PMP restricts
the hardware page-table-walker. What happens to TLB entries that *were*
loaded, but from locations that are *now* inaccessible after a PMP change?


-- Jacob

chuanhua.chang

unread,
May 23, 2018, 11:38:00 PM5/23/18
to RISC-V ISA Dev, chuanhu...@gmail.com
> This is a fetch of the page table entries themselves, correct?
Yes.


> So are you asking that if there is a TLB miss, and the final translation is disallowed by the PMP, will that translation be inserted into TLB or not?
No. For the case you describe, the spec clearly states that the final translation should not be inserted into the TLB and that the CPU should generate an access fault. What I am asking is as follows:

1. The PMP is updated such that the page table entries are not allowed to be read.
2. What should happen to the page table entries already cached in the CPU? Is it SW's responsibility to invalidate the cached entries, or HW's?

It feels like this should be a SW responsibility, since SW knows better whether a PMP update will cause this situation or not.

Chuanhua

Michael Clark

unread,
May 24, 2018, 12:55:07 AM5/24/18
to chuanhua.chang, RISC-V ISA Dev


> On 24/05/2018, at 3:38 PM, chuanhua.chang <chuanhu...@gmail.com> wrote:
>
> > This is a fetch of the page table entries themselves, correct?
> Yes.
>
> > So are you asking that if there is a TLB miss, and the final translation is disallowed by the PMP, will that translation be inserted into TLB or not?
> No. What you described is clearly stated in the spec that the final translation should not be inserted into TLB and the CPU should generate an access fault. What I am asking is as follows:
>
> 1. The PMP is updated such that the page table entries are not allowed to be read.
> 2. What should happen to the page table entries already cached in the CPU? Is this a SW responsibility to invalidate the cached entries or is this a HW responsibility to invalidate the cached entries?
>
> It feels that this should be a SW responsibility since SW knows better if a PMP update will cause this situation or not.

It should be up to the software.

(Implementation issues aside, i.e. implementations that accelerate PMP using the TLB to avoid a cycle for PMP lookups after the TLB lookup; that implementation approach must ensure consistency, so it likely needs to shoot down TLB entries when PMP entries are updated; this is indeed a micro-architectural detail and not specific to the ISA.)

From the ISA’s perspective the PMP entries should take effect when they are changed.

There are only hardware responsibilities if the TLB is used as an acceleration structure and that’s a micro-architectural optimisation.

Michael Clark

unread,
May 24, 2018, 1:00:02 AM5/24/18
to chuanhua.chang, RISC-V ISA Dev
So indeed there could be a hardware responsibility for TLB consistency given particular micro-architectural choices.

Obviously it’s going to be slower to do a PMP lookup after the TLB lookup, so it makes sense to save the result in the TLB, and one must bear the consequences of the consistency constraints this implies: changing a PMP while (stale) PMP acceleration state sits in the TLB. This scenario is definitely the hardware's responsibility. Unless I’m mistaken, sfence.vma only applies to VM, not PMP.

Andrew Waterman

unread,
May 24, 2018, 2:20:00 AM5/24/18
to Allen Baum, chuanhua.chang, RISC-V ISA Dev
We've discussed this case before but apparently forgot to write down
our conclusion, which is that a full SFENCE.VMA should be executed
after swapping the PMPs. In other words, translation caches are free
to cache the PMPs.

Some reasons include:

- If we defined it the other way, then in conventional
implementations, every pmpaddr write, and most pmpcfg writes, would
need to flush the TLBs. (I doubt anyone would implement the snoop
paths to avoid the flushes.)
- Executing an SFENCE.VMA in this case is consistent with expectations
for multi-level paging, where the hypervisor needs to execute an
SFENCE.VMA after changing the level-1 page table for the effects to
trickle down to level-2 paging.
- The obvious one: existing HW does not flush the TLBs in this case,
so requiring the SFENCE.VMA is compatible.
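Andrew's conclusion above can be illustrated with a toy model (Python, illustration only; the class and method names are hypothetical, not from the spec): a TLB that caches translations which passed the PMP check skips the check on a hit, so a PMP write alone leaves stale permissions visible until software issues the full SFENCE.VMA.

```python
class ToyMachine:
    """Toy model: the TLB caches translations that passed the PMP check,
    and a TLB hit skips the PMP check entirely (hypothetical names)."""

    def __init__(self):
        self.pmp_readable = set()   # physical pages currently readable under PMP
        self.page_table = {}        # vpn -> ppn
        self.tlb = {}               # vpn -> ppn, only for checks that passed

    def load(self, vpn):
        if vpn in self.tlb:
            return self.tlb[vpn]    # hit: PMP is not re-checked
        ppn = self.page_table[vpn]  # miss: walk the page table
        if ppn not in self.pmp_readable:
            raise PermissionError("access fault")   # failed check is not cached
        self.tlb[vpn] = ppn         # passed check is cached, PMP result and all
        return ppn

    def write_pmp(self, readable):
        self.pmp_readable = set(readable)   # pmpcfg/pmpaddr write: no TLB flush

    def sfence_vma(self):
        self.tlb.clear()            # full SFENCE.VMA: drop cached translations

m = ToyMachine()
m.page_table[0x10] = 0x80
m.write_pmp([0x80])
assert m.load(0x10) == 0x80   # miss: walk + PMP check pass, entry cached
m.write_pmp([])               # revoke access; the TLB entry is now stale
assert m.load(0x10) == 0x80   # still succeeds from the stale entry
m.sfence_vma()                # the software responsibility stated above
try:
    m.load(0x10)
    assert False, "expected an access fault"
except PermissionError:
    pass                      # revocation now takes effect
```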

Luke Kenneth Casson Leighton

unread,
May 24, 2018, 6:24:49 AM5/24/18
to Andrew Waterman, Allen Baum, chuanhua.chang, RISC-V ISA Dev
On Thu, May 24, 2018 at 7:19 AM, Andrew Waterman
<wate...@eecs.berkeley.edu> wrote:

> We've discussed this case before but apparently forgot to write down
> our conclusion, which is that a full SFENCE.VMA should be executed
> after swapping the PMPs. In other words, translation caches are free
> to cache the PMPs.

hmmm... so there's implementation notes/advice being discussed which
is being lost/forgotten because it's on a mailing list / forum, which
from bitter experience we know is never a good place to store
"structured information". although i'm struggling to keep up after
actually cut/pasting various bits of conversations into wiki pages for
SV, i am doing it, because from 20 years experience of dealing with
software libre project "not-management" i *know* that if you don't do
that you lose track sometimes within minutes of seeing the
mailing-list message fly by.

i'm certainly not going to say to anyone on this list "you should
consider changing working practices" (i'm saying that just in case
anyone thinks that i am even doing anything like *implying* that
people on this list *should* change working practices: i am NOT saying
OR implying that. at all. so please, no replies "go away don't tell
us what to do" ok?). i am simply making people *aware* that there
exist working practices that help ensure that critical information is
not lost. honestly: they're a pain to adopt. they irritate me
immensely. but not as much as losing the damn information :)

l.

Michael Clark

unread,
May 25, 2018, 1:31:54 AM5/25/18
to Andrew Waterman, Allen Baum, RISC-V ISA Dev, chuanhua.chang
On Thu, 24 May 2018 at 6:19 PM, Andrew Waterman <wate...@eecs.berkeley.edu> wrote:
We've discussed this case before but apparently forgot to write down
our conclusion, which is that a full SFENCE.VMA should be executed
after swapping the PMPs.  In other words, translation caches are free
to cache the PMPs.

Some reasons include:

- If we defined it the other way, then in conventional
implementations, every pmpaddr write, and most pmpcfg writes, would
need to flush the TLBs.  (I doubt anyone would implement the snoop
paths to avoid the flushes.)
- Executing an SFENCE.VMA in this case is consistent with expectations
for multi-level paging, where the hypervisor needs to execute an
SFENCE.VMA after changing the level-1 page table for the effects to
trickle down to level-2 paging.
- The obvious one: existing HW does not flush the TLBs in this case,
so requiring the SFENCE.VMA is compatible.

Interesting. Apologies for jumping to a presumed conclusion. PMP and VM are logically distinct, but it’s fair enough to document that SFENCE.VMA is necessary to ensure PMP consistency with VM, if present. That allows the microarchitectural optimisation for the VM <-> PMP interaction, as stated, for microarchitectures that use the TLB as an acceleration structure for PMP.

Q. Is this still required if satp.vm = bare, for implementations where satp.vm can be set to non-zero values? And for implementations without VM? Or is the SFENCE.VMA just for VM <-> PMP consistency when a translation mode is in effect?

BTW - I think QEMU’s software implementation requires an SFENCE.VMA, as well as the recent granularity updates, since PMP is enforced at page-size granularity and the entries are cached in the TLB with page-size protection granularity. It would be too slow to check PMP in the softmmu load/store fast path.

My first instinct was of course based on seeing the PMP as a base building block with VM layered on top, such that microarchitectural optimisations like this would be hidden (and then afterwards thinking about the implementation in QEMU).

An M-mode only implementation of course is a distinct case.

Andrew Waterman

unread,
May 25, 2018, 1:44:32 AM5/25/18
to Michael Clark, Allen Baum, RISC-V ISA Dev, chuanhua.chang
On Thu, May 24, 2018 at 10:31 PM Michael Clark <m...@sifive.com> wrote:
On Thu, 24 May 2018 at 6:19 PM, Andrew Waterman <wate...@eecs.berkeley.edu> wrote:
We've discussed this case before but apparently forgot to write down
our conclusion, which is that a full SFENCE.VMA should be executed
after swapping the PMPs.  In other words, translation caches are free
to cache the PMPs.

Some reasons include:

- If we defined it the other way, then in conventional
implementations, every pmpaddr write, and most pmpcfg writes, would
need to flush the TLBs.  (I doubt anyone would implement the snoop
paths to avoid the flushes.)
- Executing an SFENCE.VMA in this case is consistent with expectations
for multi-level paging, where the hypervisor needs to execute an
SFENCE.VMA after changing the level-1 page table for the effects to
trickle down to level-2 paging.
- The obvious one: existing HW does not flush the TLBs in this case,
so requiring the SFENCE.VMA is compatible.

Interesting. Apologies for jumping to a presumed conclusion. PMP and VM are logically distinct but it’s fair enough to document that SFENCE.VMA is necessary to ensure PMP consistency with VM if present. That allows the microarchitectual optimisation for the VM <-> PMP interaction, as stated, for microarchitectures that use the TLB as an acceleration structure for PMP.

Q. Is this still required if satp.vm = bare? for implementations where satp.vm can be set to non zero values? and for implementations without VM? or is the SFENCE.VMA just for VM <-> PMP consistency if a translation mode is in effect?

No need for SFENCE in bare case (or no-VM case): it’s just for address translation caching.

Michael Clark

unread,
May 25, 2018, 1:46:24 AM5/25/18
to Luke Kenneth Casson Leighton, Andrew Waterman, Allen Baum, chuanhua.chang, RISC-V ISA Dev


Paul Miranda

unread,
May 25, 2018, 8:25:09 AM5/25/18
to RISC-V ISA Dev, wate...@eecs.berkeley.edu, allen...@esperantotech.com, chuanhu...@gmail.com
I was just looking at PMP the other day thinking that top-of-range addressing mode is going to be quite slow to implement relative to power-of-two chunks. I wouldn't mind if TOR went away completely... is that a possibility?

I suppose an implementation can always choose to have TOR perform badly, and then software will "learn" not to use it; that is how obscure parts of the x86 ISA have effectively died off over the years. But it's sad to see a clean-sheet ISA already saddled with a burden like this. PMP in general seems like a bit of an afterthought that could have been more extensible, so that PMA didn't have to be a hand-wave of "implementation-specific". I assume most high-performance RV implementations will cache PMP and PMA in the TLB, but all in different ways, which seems unfortunate.

Andrew Waterman

unread,
May 25, 2018, 2:16:58 PM5/25/18
to Paul Miranda, RISC-V ISA Dev, Allen Baum, chuanhua.chang
On Fri, May 25, 2018 at 5:25 AM, Paul Miranda <paulcm...@gmail.com> wrote:
> I was just looking at PMP the other day thinking that top-of-range
> addressing mode is going to be quite slow to implement relative to
> power-of-two chunks. I wouldn't mind if TOR went away completely... is that
> a possibility?

PMP configuration registers are WARL, so there's nothing wrong with
implementations that support only naturally aligned regions.
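To make the trade-off under discussion concrete, here is a small sketch (Python, illustration only; helper names are mine) of the two address-matching modes, following my reading of the privileged-spec encodings: NAPOT recovers the region size from the trailing one bits of pmpaddr, so hardware can match with a single mask-and-compare, while TOR needs two magnitude comparisons against adjacent pmpaddr registers, which is the cost Paul is pointing at.

```python
def napot_encode(base, size):
    """Encode a naturally aligned power-of-two region (size >= 8 bytes)
    as a pmpaddr value: pmpaddr holds the address shifted right by 2,
    with the region size encoded in the trailing one bits."""
    assert size >= 8 and size & (size - 1) == 0 and base % size == 0
    return (base >> 2) | ((size >> 3) - 1)

def napot_match(pmpaddr, addr):
    """NAPOT match: k trailing ones encode a 2^(k+3)-byte region;
    in hardware this is a single mask-and-compare."""
    k = 0
    while (pmpaddr >> k) & 1:
        k += 1
    size = 1 << (k + 3)
    base = (pmpaddr & ~((1 << (k + 1)) - 1)) << 2
    return base <= addr < base + size

def tor_match(pmpaddr_prev, pmpaddr_cur, addr):
    """TOR match: the region is [pmpaddr_prev<<2, pmpaddr_cur<<2),
    requiring two magnitude comparators rather than a mask."""
    return (pmpaddr_prev << 2) <= addr < (pmpaddr_cur << 2)

pa = napot_encode(0x8000_0000, 0x1000)          # 4 KiB region at 0x8000_0000
assert napot_match(pa, 0x8000_0ABC)
assert not napot_match(pa, 0x8000_1000)
assert tor_match(0x8000_0000 >> 2, 0x8000_1000 >> 2, 0x8000_0ABC)
```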

kr...@berkeley.edu

unread,
May 26, 2018, 3:11:03 AM5/26/18
to Paul Miranda, RISC-V ISA Dev, wate...@eecs.berkeley.edu, allen...@esperantotech.com, chuanhu...@gmail.com

>>>>> On Fri, 25 May 2018 05:25:09 -0700 (PDT), Paul Miranda <paulcm...@gmail.com> said:
| I was just looking at PMP the other day thinking that top-of-range addressing
| mode is going to be quite slow to implement relative to power-of-two chunks. I
| wouldn't mind if TOR went away completely... is that a possibility?

It'll be up to a platform to mandate the PMPs it requires.

| I suppose an implementation can always choose to have TOR perform badly and
| then software will "learn" not to use it, and that is how obscure parts of the
| x86 ISA have effectively died off over the years, but it's sad to see a
| clean-sheet ISA already saddled with a burden like this.

I will note that some systems that previously had only power-of-2
boundaries in their memory protection units have added more flexible
limits recently. I think the trend is in the opposite direction, with
better/more-complex protection becoming common.

| PMP in general seems
| like a bit of an afterthought that could have been more extensible so PMA
| didn't have to be a hand-wave of "implementation-specific".

PMP (dynamic software permissions) and PMA (static hardware abilities)
are very different things, and we purposefully separated them.

| I assume most high
| performance RV implementations will cache PMP and PMA in the TLB, but all in
| different ways, which seems unfortunate.

Why is it unfortunate to allow multiple caching strategies provided
they have the same behavior?

Krste
