Connection between TLB and PTW

Aaron Dietz

Feb 5, 2019, 9:36:58 AM
to RISC-V HW Dev
Hello everybody,

I'm currently trying to understand how the translation lookaside buffer (TLB) and the page table walker (PTW) are connected to each other in the rocket-chip project, and how virtual addresses are resolved to physical addresses.
I already have some guesses, and it would be great if someone could say a word or two about them :).

As can be seen here, https://www.lowrisc.org/docs/tagged-memory-v0.1/rocket-core/, the instruction cache contains a TLB block which can trigger a refill.
1. Is this refill operation performed by the PTW?

The file src/main/scala/rocket/PTW.scala in the rocket-chip repo defines the following:
val io = new Bundle {
  val requestor = Vec(n, new TLBPTWIO).flip
  val mem = new HellaCacheIO
  val dpath = new DatapathPTWIO
}

2. Does this mean the parts are connected like this?
rocket core -> L1 instruction cache -> TLB (one in every icache) -> PTW

3. Is there another TLB element in the L2 cache?

4. Do the L1 and/or L2 data caches have the same kind of TLB element? (Does it work the same way?)

5. Does the PTW itself have some kind of cache?

Greetings from Munich,

Aaron Dietz

Andrew Waterman

Feb 5, 2019, 11:13:40 AM
to Aaron Dietz, RISC-V HW Dev
On Tue, Feb 5, 2019 at 6:37 AM Aaron Dietz <diet...@gmail.com> wrote:
Hello everybody,

I'm currently trying to understand how the translation lookaside buffer (TLB) and the page table walker (PTW) are connected to each other in the rocket-chip project, and how virtual addresses are resolved to physical addresses.
I already have some guesses, and it would be great if someone could say a word or two about them :).

As can be seen here, https://www.lowrisc.org/docs/tagged-memory-v0.1/rocket-core/, the instruction cache contains a TLB block which can trigger a refill.
1. Is this refill operation performed by the PTW?

Yes.


The file src/main/scala/rocket/PTW.scala in the rocket-chip repo defines the following:
val io = new Bundle {
  val requestor = Vec(n, new TLBPTWIO).flip
  val mem = new HellaCacheIO
  val dpath = new DatapathPTWIO
}

2. Does this mean the parts are connected like this?
rocket core -> L1 instruction cache -> TLB (one in every icache) -> PTW

Yes, though the PTW is also connected directly to the core, to get the page-table base register.
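For intuition, here is a rough behavioral sketch of that flow in plain Scala (not the actual Chisel in rocket-chip; all names are invented for illustration): the core hands a virtual address to the TLB inside the L1 cache, a miss goes out over a TLBPTWIO-style port to the shared PTW, and the PTW walks the page table in memory and refills the TLB.

// Behavioral sketch only, not rocket-chip code; names are illustrative.
object TlbPtwFlow extends App {
  val pageShift = 12                                  // 4 KiB pages
  // A trivial in-memory "page table": VPN -> PPN
  val pageTable = Map[Long, Long](0x10L -> 0x80L, 0x11L -> 0x81L)
  // The L1 TLB living inside the I-cache (or D-cache): VPN -> PPN
  val tlb = scala.collection.mutable.Map[Long, Long]()

  // The PTW, shared by all requestors, walks the page table in memory.
  def ptwWalk(vpn: Long): Option[Long] = pageTable.get(vpn)

  def translate(vaddr: Long): Option[Long] = {
    val vpn = vaddr >>> pageShift
    val ppn = tlb.get(vpn).orElse {
      val walked = ptwWalk(vpn)                       // TLB miss: ask the PTW
      walked.foreach(p => tlb.update(vpn, p))         // refill the TLB on success
      walked
    }
    ppn.map(p => (p << pageShift) | (vaddr & ((1L << pageShift) - 1)))
  }

  println(translate(0x10123L).map(_.toHexString))     // miss, walk, refill -> Some(80123)
  println(translate(0x10456L).map(_.toHexString))     // now a hit in the L1 TLB
}

In the real design the walker also needs the page-table base register mentioned above, which is what the dpath (DatapathPTWIO) connection in the PTW bundle is for.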


3. Is there another TLB element in the L2 cache?

No, everything is physical beyond the L1s.


4. Do the L1 and/or L2 data caches have the same kind of TLB element? (Does it work the same way?)

5. Does the PTW itself have some kind of cache?

Yes, it has a small cache of the nonleaf nodes, and a larger L2 TLB.
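To make that concrete, here is a toy model (plain Scala, invented names, not the actual PTW.scala structures) of the two things living next to the walker: a small cache of non-leaf PTEs keyed by the PTE's physical address, so a later walk can skip the upper levels, and a larger L2 TLB holding completed leaf translations.

// Toy model of the PTW-side structures; not rocket-chip code.
// A "PTE" here is just the next-level base (or the final PPN), no permission bits.
object PtwCachesSketch extends App {
  val pgLevelBits = 9
  val pgLevels = 3                                           // Sv39-style walk
  val mem = scala.collection.mutable.Map[Long, Long]()       // PTE address -> PTE payload
  val pteCache = scala.collection.mutable.Map[Long, Long]()  // non-leaf PTEs only
  val l2Tlb = scala.collection.mutable.Map[Long, Long]()     // VPN -> PPN, 4 KiB granularity

  // Walk the table; intermediate levels may be served from pteCache.
  def walk(rootBase: Long, vpn: Long): Long = {
    var base = rootBase
    for (level <- (pgLevels - 1) to 0 by -1) {
      val idx = (vpn >> (level * pgLevelBits)) & ((1L << pgLevelBits) - 1)
      val pteAddr = base + idx * 8
      base =
        if (level > 0) pteCache.getOrElseUpdate(pteAddr, mem(pteAddr)) // cache non-leaf PTEs
        else mem(pteAddr)                                              // leaf PTEs are not kept here
    }
    l2Tlb(vpn) = base                                        // completed translation -> L2 TLB
    base
  }
}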


Greetings from Munich,

Aaron Dietz


Aaron Dietz

Feb 11, 2019, 9:33:20 AM
to RISC-V HW Dev, diet...@gmail.com
Thanks Andrew. That helps me a lot :)

Jose Martins

Mar 21, 2020, 12:55:38 PM
to RISC-V HW Dev, diet...@gmail.com
I'm also looking at how these two modules interact and have a couple of questions regarding superpage management.

I see the L1 TLB module used inside the caches has specific entries to hold superpages. When looking at the L2 TLB embedded in the PTW module, I can't find any differentiation between superpage and leaf-page entries. Am I right in concluding that all translations are cached in the L2 TLB as 4K pages?

Also, it seems the TLB module uses the level signal in the PTWResp field of the TLBPTWIO bundle to figure out whether the translation resulted in a superpage or not. Is this correct? What I'm failing to grasp is, if my previous assumption regarding the granularity of the L2 TLB entries is correct, how the superpage information is correctly forwarded to the L1 TLB when there is an L2 hit.

Hope the questions are not too confusing, and I apologize in advance if I'm saying something that does not make sense (I'm more of a software guy).

Jose

Andrew Waterman

Mar 21, 2020, 6:21:32 PM
to Jose Martins, RISC-V HW Dev, Aaron Dietz
On Sat, Mar 21, 2020 at 9:55 AM Jose Martins <josema...@gmail.com> wrote:
I'm also looking at how these two modules interact and have a couple of questions regarding superpage management.

I see the L1 TLB module used inside the caches has specific entries to hold superpages. When looking at the L2 TLB embedded in the PTW module, I can't find any differentiation between superpage and leaf-page entries. Am I right in concluding that all translations are cached in the L2 TLB as 4K pages?

It's more rigid than that; only 4K-page entries are cached in the L2 TLBs.
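(Illustrative sketch of that policy, with invented names rather than the actual TLB.scala/PTW.scala code: the L2 TLB is only refilled when the walk ends at the last level, i.e. a 4 KiB leaf, so anything it returns on a hit can always be treated by the L1 TLB as an ordinary 4 KiB translation.)

// Sketch of the refill policy only; names invented, not rocket-chip code.
object L2TlbPolicy {
  val pgLevels = 3                                   // Sv39
  final case class PtwResp(ppn: Long, level: Int)    // level reached by the walker
  val l2Tlb = scala.collection.mutable.Map[Long, PtwResp]()

  def refill(vpn: Long, resp: PtwResp): Unit =
    if (resp.level == pgLevels - 1)                  // only 4 KiB leaves get cached
      l2Tlb(vpn) = resp

  def lookup(vpn: Long): Option[PtwResp] =
    l2Tlb.get(vpn)                                   // a hit is always a 4 KiB entry
}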
 

Also, it seems the TLB module uses the level signal in the PTWResp field of the TLBPTWIO bundle to figure out whether the translation resulted in a superpage or not. Is this correct? What I'm failing to grasp is, if my previous assumption regarding the granularity of the L2 TLB entries is correct, how the superpage information is correctly forwarded to the L1 TLB when there is an L2 hit.

I think my previous answer answers this one, too.

Jose Martins

Mar 22, 2020, 7:12:39 AM
to RISC-V HW Dev, josema...@gmail.com, diet...@gmail.com
Thanks, Andrew.

Another related question: I can't find the logic for ASID matching in either the L1 or L2 TLBs... Does rocket-chip support ASIDs?

Andrew Waterman

Mar 22, 2020, 4:46:51 PM
to Jose Martins, RISC-V HW Dev, diet...@gmail.com
Rocket currently doesn’t support ASIDs. (Or, more precisely, Rocket only supports one ASID, 0.)

Down the road, when we add hypervisor support, we’ll likely address this limitation, too.


Sean Halle

May 1, 2020, 10:34:00 AM
to RISC-V HW Dev

Hello,

We're putting out a high-performance, low-power chip for enterprise software workloads. It's based on RocketChip, but with custom CPU cores and SiFive's L2/L3 cache. In this setting, the size of the physical address space is important. 512 GB is just barely large enough.

In this context, this commit to PTW.scala from Oct 2017 (aswaterman) is relevant. It contains a require that the physical address bits must be strictly less than the virtual address bits. However, it's not clear to me/us why v cannot equal paddrBits:

val v = pgIdxBits + pgLevels * pgLevelBits
require(v == xLen || xLen > v && v > paddrBits)

https://github.com/chipsalliance/rocket-chip/commit/986cbfb6b1266c7c293b1f9968096b65e09d5d0a

This is important because a 3-level table gives 39 bits of virtual address. 39 bits translates into 512 GB, which is the minimum needed, but if the number of physical address bits must be strictly less than the number of virtual address bits, then the physical address space is limited to a 256 GB DRAM maximum. The alternative is going to a 4-level page table; either option would make us unhappy :-)
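Spelling the numbers out (just the arithmetic implied by the require above, nothing else taken from the repo):

// Sv39-style parameters, as used in the require above.
object AddrBitsCheck extends App {
  val pgIdxBits   = 12                               // 4 KiB pages
  val pgLevels    = 3                                // 3-level table
  val pgLevelBits = 9                                // 512 PTEs per level
  val v = pgIdxBits + pgLevels * pgLevelBits         // 12 + 3*9 = 39
  println(s"virtual bits = $v, VA space = ${BigInt(1) << v} bytes")          // 512 GiB
  // With v < xLen, the require forces paddrBits < v, so at most v - 1 = 38
  // physical address bits: 2^38 = 256 GiB of addressable physical space.
  println(s"max PA space under the require = ${BigInt(1) << (v - 1)} bytes")
}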

Can anyone shed some light on this?

Thank you,

Sean



Andrew Waterman

May 1, 2020, 4:57:14 PM
to Sean Halle, RISC-V HW Dev
Physical addresses are zero-extended and virtual addresses are sign-extended. The strict inequality makes this easier to manage - the physical address is first zero-extended to the virtual address size, then that whole thing is sign-extended.
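A small numerical illustration of that (my reading of the argument, not code from the repo): when paddrBits < vaddrBits, every legal physical address has the top virtual-address bit clear after zero-extension, so the final sign extension to XLEN changes nothing; if the two widths were equal, a physical address with its top bit set would be sign-extended into a different value.

// Illustration only: interpreting an n-bit pattern as a sign-extended value.
object ExtendExample extends App {
  def signExtend(x: BigInt, fromBits: Int): BigInt =
    if (x.testBit(fromBits - 1)) x - (BigInt(1) << fromBits) else x

  val vaddrBits = 39
  val okPa  = (BigInt(1) << 38) - 1    // fits in 38 bits (paddrBits < vaddrBits)
  val badPa = BigInt(1) << 38          // needs all 39 bits (paddrBits == vaddrBits)

  println(signExtend(okPa,  vaddrBits))   // unchanged: 274877906943
  println(signExtend(badPa, vaddrBits))   // becomes -274877906944: the value is mangled
}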

It’s possible to rectify this in another way that doesn’t impose this constraint, but we didn’t think it was worth the effort—especially since Linux wants to map all of physical memory into its virtual address space anyway, so having excess physical address bits wouldn’t be useful in practice. In other words, if you want more DRAM, you might as well go for the deeper page table, anyway.
