Hi Greg,

The sfence.vma was performed prior to the jalr that triggered the speculative prefetch, and only on the target address of the jalr; there was no sfence.vma for the addresses in the following (unmapped) page.

From "Svnapot" Standard Extension for NAPOT Translation Contiguity, Version 1.0:

"It is the responsibility of the OS and/or hypervisor to configure the page tables in such a way that there are no inconsistencies between NAPOT PTEs and other NAPOT or non-NAPOT PTEs that overlap the same address range."

In this case, by pure bad (good?) luck, the PTE following the properly mapped PTE happened to decode with V=1, X=1 and N=1. There was a definite inconsistency in the leaf page address between the two PTEs, thus violating the NAPOT requirement.

Does it follow that the RISCV ISA requires that software mark all unused PTEs in a page directory as NOT valid so this case does not trigger?
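
For anyone following along, here is a minimal sketch of how a 64-bit PTE decodes into the fields mentioned above. The bit positions follow the Sv39/Sv48 PTE layout (V is bit 0, X is bit 3, the PPN is bits 53:10, and N is bit 63 with Svnapot). The helper name `decode_pte` and the sample value `garbage` are made up for illustration and are not taken from the failing test.

```python
# Hypothetical helper; bit positions per the Sv39/Sv48 PTE format.
PTE_V = 1 << 0    # valid
PTE_R = 1 << 1    # readable
PTE_W = 1 << 2    # writable
PTE_X = 1 << 3    # executable
PTE_N = 1 << 63   # NAPOT bit (Svnapot)
PPN_SHIFT = 10
PPN_MASK = (1 << 44) - 1  # PPN occupies bits 53:10


def decode_pte(pte):
    """Return the fields of a 64-bit PTE that matter for this discussion."""
    ppn = (pte >> PPN_SHIFT) & PPN_MASK
    n = bool(pte & PTE_N)
    return {
        "V": bool(pte & PTE_V),
        "R": bool(pte & PTE_R),
        "W": bool(pte & PTE_W),
        "X": bool(pte & PTE_X),
        "N": n,
        "ppn": ppn,
        # 64 KiB NAPOT encoding: N=1 and the low four PPN bits are 0b1000.
        "napot_64k": n and (ppn & 0xF) == 0x8,
    }


# Illustrative "garbage" value that happens to look like a valid,
# executable 64 KiB NAPOT leaf (assumed value, not from the real test).
garbage = PTE_N | (0x12348 << PPN_SHIFT) | PTE_X | PTE_V
fields = decode_pte(garbage)
print(fields["V"], fields["X"], fields["N"], fields["napot_64k"])
```

An all-zero entry decodes with V=0, so a walker would treat it as invalid regardless of the other bits.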
Hi Greg,

Your conclusion above is clear, and the impetus for starting this thread in the first place.

The only way I see of avoiding an erroneous caching of an implicit read from a speculative prefetch is to have a RISCV ISA requirement that software MUST mark all unused PTEs in a page directory as NOT valid.

Is this a RISCV ISA requirement?
If not, how can the implementation avoid the bad instruction access fault for the original jalr?
On Mon, Mar 3, 2025 at 9:48 AM Adnan Hamid <adnan....@gmail.com> wrote:
>
> Here is another tape-out gating customer question that is in need of a definitive ruling:
>
> May a compliant RISCV implementation assume that software MUST mark unused PTEs in a page directory not valid ?
>
> A situation arose where the Breker RISCV test generator created a scenario where it jumped (`jalr`) to a virtual address that was the second to last byte of a 4K page. The bytes at that address decoded to a return instruction that would allow software to continue normal execution.
Is the return instruction 2B or 4B long?
If it is 4B long, then a second page translation is required.
Is the problem arising because the second translation is being launched
speculatively for a subsequent 4K page, but the instruction turns out
to be 2B, so that speculative translation was never required? (And that
speculative translation happens to find an invalid 64K PTE, which I
would argue is invalid and should not be consulted.)
> an aggressive code prefetch engine speculatively read the following PTE with garbage bytes which happened to decode to a 64K page
Just wanted to make sure I understand:
-- the following PTE is for a subsequent page, and it happens to be
64K which also overlaps with the initial PTE?
-- the 64K PTE is not initialized by the OS properly, so it contains
garbage info? I believe the OS *must* ensure these are at least marked
invalid (V=0), and that the behaviour of an implementation would be
undefined if V=1 but the rest of the entry was "garbage".
Guy
On Mon, Mar 3, 2025 at 10:56 AM Adnan Hamid <adnan....@gmail.com> wrote:
> Hi Greg,
> Your conclusion above is clear, and the impetus for starting this thread in the first place.
>
> The only way I see of avoiding an erroneous caching of an implicit read from a speculative prefetch is to have a RISCV ISA requirement that software MUST mark all unused PTEs in a page directory as NOT valid.
>
> Is this a RISCV ISA requirement ?

No.

Also note that even if there was a requirement, software can still not conform to that requirement and one is again left with the same possibilities.

In practice, if the spec required consistency, then it would probably also specify that the behavior is UNSPECIFIED if the requirement is not satisfied.

> If not, how can the implementation avoid the bad instruction access fault for the original jalr ?

Don't have inconsistent PTEs within a NAPOT group, which is a software matter. Otherwise hardware would probably have to jump through a variety of hoops to ensure some specific chosen implementation behavior in the face of inconsistent PTEs.
Greg
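
To make "inconsistent PTEs within a NAPOT group" concrete, here is a rough software-side check, written as a sketch under simplifying assumptions: it looks only at one 64 KiB-aligned group of sixteen level-0 PTEs, and it treats the group as consistent if either no entry is a valid NAPOT PTE or all sixteen entries are identical once the A and D bits are ignored. The function name `napot_group_consistent` and the sample values are hypothetical; the Svnapot text is the authority on what counts as an inconsistency.

```python
# Sketch only: a simplified consistency rule, not the spec's wording.
PTE_V = 1 << 0
PTE_A = 1 << 6
PTE_D = 1 << 7
PTE_N = 1 << 63
GROUP_SIZE = 16  # 64 KiB region / 4 KiB base pages


def napot_group_consistent(ptes):
    """ptes: the 16 level-0 PTEs covering one 64 KiB-aligned region."""
    if len(ptes) != GROUP_SIZE:
        raise ValueError("expected exactly 16 PTEs")
    ignore = PTE_A | PTE_D  # assumption: A/D may legitimately differ
    napot = [p for p in ptes if (p & PTE_V) and (p & PTE_N)]
    if not napot:
        return True  # no valid NAPOT entry, nothing to conflict with
    ref = napot[0] & ~ignore
    return all((p & ~ignore) == ref for p in ptes)


# Illustrative values: V|R|X set, arbitrary PPNs.
plain_4k = (0x100 << 10) | 0xB
napot_64k = PTE_N | (0x208 << 10) | 0xB

print(napot_group_consistent([plain_4k] * 16))                 # all 4K pages
print(napot_group_consistent([napot_64k] * 16))                # uniform NAPOT
print(napot_group_consistent([plain_4k] * 15 + [napot_64k]))   # mixed group
```

The mixed group is the situation described in this thread: one properly mapped 4K PTE alongside an entry that happens to decode as a valid NAPOT PTE.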
In a conventional TLB design, it is possible for multiple entries to match a single
address if, for example, a page is upgraded to a superpage without first clearing the
original non-leaf PTE’s valid bit and executing an SFENCE.VMA with rs1=x0. In this
case, a similar remark applies: it is unpredictable whether the old non-leaf PTE or the
new leaf PTE is used, but the behavior is otherwise well defined.
Rather than think about speculation, let's work backwards from the other direction. It is legal to have a 0-entry TLB in RISC-V. This means that hardware would do a page-table walk for every memory access. As per the spec quote above, the page-table walk would be allowed to return either mapping arbitrarily. So this software problem exists without any form of speculation.
Best,
-Dan
--
You received this message because you are subscribed to the Google Groups "RISC-V ISA Dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to isa-dev+u...@groups.riscv.org.
To view this discussion visit https://groups.google.com/a/groups.riscv.org/d/msgid/isa-dev/fbca732a-bfc0-4f28-a308-f5f0f44ca1dbn%40groups.riscv.org.
Adnan Hamid wrote:
> The inconsistent PTEs within a NAPOT group occurred because random bytes
> were left in the unused PTE.
It's not clear what is meant by "is not mapped/used". A PTE is either
valid or invalid. There is no concept of "not in use".
On Mon, Mar 3, 2025 at 11:24 AM Adnan Hamid <adnan....@gmail.com> wrote:
>
>
>
> On Monday, March 3, 2025 at 11:03:01 AM UTC-8 Greg Favor wrote:
>
> On Mon, Mar 3, 2025 at 10:56 AM Adnan Hamid <adnan....@gmail.com> wrote:
>
> Hi Greg,
> Your conclusion above is clear, and the impetus for starting this thread in the first place.
>
> The only way I see of avoiding an erroneous caching of an implicit read from a speculative prefetch is to have a RISCV ISA requirement that software MUST mark all unused PTEs in a page directory as NOT valid.
>
> Is this a RISCV ISA requirement ?
>
> No.
>
> No ? I was all ready to put this thread to bed before you said No.
Greg is correct -- this is not an ISA requirement.
For example, the OS can mark unused PTEs as valid. In this case, the
PTE may be valid, but its properties may be set in a way that still
causes an exception upon access (eg, by setting RWX=000, or by setting
A=0 or D=0 and expecting that the hardware PTW cannot change those
bits so an exception is required).
>> The only way I see of avoiding an erroneous caching of an implicit read from a speculative prefetch is to have a RISCV ISA requirement that software MUST mark all unused PTEs in a page directory as NOT valid.
>>
>> Is this a RISCV ISA requirement ?
>
> No.

No ? I was all ready to put this thread to bed before you said No.

> Also note that even if there was a requirement, software can still not conform to that requirement and one is again left with the same possibilities.

Hmm? Software did not foresee the speculative fetch to the next page which was NOT required because the ret instruction occupied only 2B at the end of the target page.

I am arguing that software is responsible for setting PTE.V=0 if the PTE is not mapped, otherwise behavior is UNSPECIFIED.

The inconsistent PTEs within a NAPOT group occurred because random bytes were left in the unused PTE.

Repeating my proposal, to avoid this software MUST guarantee PTE.V=0 if the PTE is not mapped/used, otherwise behavior is UNSPECIFIED.
Earl Killian pointed out (to me) that there are 32 x 16 PTE entries pointed to by the next higher page table level, and you can't leave any of the other 496 entries uninitialized either.
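
The arithmetic behind those numbers can be sketched as follows, assuming a 4 KiB page-table page with 8-byte PTEs (as in Sv39/Sv48) and 64 KiB NAPOT groups of 16 entries. The names here are illustrative only. The sketch also shows that a zero-filled table leaves every entry with V=0, which is one simple way for software to avoid stray valid-looking entries.

```python
# Sketch under stated assumptions: 4 KiB table page, 8-byte PTEs.
PAGE_SIZE = 4096
PTE_SIZE = 8
NAPOT_GROUP = 16          # PTEs per 64 KiB NAPOT group
PTE_V = 1 << 0

ENTRIES = PAGE_SIZE // PTE_SIZE        # entries per page-table page
GROUPS = ENTRIES // NAPOT_GROUP        # NAPOT groups per page-table page
OTHERS = ENTRIES - NAPOT_GROUP         # entries outside any one group


def new_page_table():
    """Return a zero-filled table: every entry has V=0."""
    return [0] * ENTRIES


def stray_valid_entries(table, used_indices):
    """Indices of entries that have V=1 but are not intentionally mapped."""
    used = set(used_indices)
    return [i for i, pte in enumerate(table)
            if (pte & PTE_V) and i not in used]


table = new_page_table()
table[7] = (0x100 << 10) | 0xB         # one intentional 4K mapping
print(ENTRIES, GROUPS, OTHERS)
print(stray_valid_entries(table, [7]))
```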
Here is another tape-out gating customer question that is in need of a definitive ruling:

May a compliant RISCV implementation assume that software MUST mark unused PTEs in a page directory not valid ?

A situation arose where the Breker RISCV test generator created a scenario where it jumped (`jalr`) to a virtual address that was the second to last byte of a 4K page. The bytes at that address decoded to a return instruction that would allow software to continue normal execution.

Meanwhile an aggressive code prefetch engine speculatively read the following PTE with garbage bytes which happened to decode to a 64K page that overlapped with the 4K page being accessed. The 64K page overwrote the TLB mapping, with the side effect that the `jalr` took an instruction access fault because the original 4K mapping was overwritten.
-adnan