MMU Page Size too Small?


Michael Chapman

Jun 16, 2016, 1:46:35 PM
to RISC-V ISA Dev
The current privileged specification defines the page size to be 4KBytes in all variants.
Intel used a 4KByte page size in the 80386 processor, introduced in 1986 or thereabouts.

The inconvenience of such a small page size is twofold:

1) It limits the size of a cache set in a physically tagged, virtually indexed cache, because the cache index bits must be unchanged by the MMU lookup.

This can be circumvented by the OS ensuring that it never maps a page in a way that changes the next few address bits above the page offset. That means maintaining several free lists of pages and always selecting a page from the right free list when setting up a PTE (sometimes referred to as page coloring).
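To make the bookkeeping concrete, here is a rough sketch in C (the constants and names are purely illustrative; assume a cache whose index covers 16KB per way, so the two address bits above the 4KB page offset act as the color):

#include <stdint.h>

/* Illustrative parameters: 16KB indexed per way with 4KB pages gives
 * 2 color bits (address bits 13:12). */
#define PAGE_SHIFT      12
#define CACHE_SET_SHIFT 14
#define NUM_COLORS      (1u << (CACHE_SET_SHIFT - PAGE_SHIFT))
#define COLOR_MASK      (NUM_COLORS - 1)

struct free_page {
    struct free_page *next;
    uintptr_t         paddr;   /* physical address of the frame */
};

/* One free list per color. */
static struct free_page *free_list[NUM_COLORS];

static unsigned color_of(uintptr_t addr)
{
    return (unsigned)((addr >> PAGE_SHIFT) & COLOR_MASK);
}

void free_frame(struct free_page *f)
{
    unsigned c = color_of(f->paddr);
    f->next = free_list[c];
    free_list[c] = f;
}

/* Hand out a frame whose index bits match the virtual address it will
 * back, so virtual and physical indexing agree in the cache. */
struct free_page *alloc_frame_for(uintptr_t vaddr)
{
    unsigned c = color_of(vaddr);
    struct free_page *f = free_list[c];
    if (f)
        free_list[c] = f->next;
    return f;   /* NULL if that color's list is exhausted */
}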

A 4KByte cache set size is pretty small by today's standards. A typical two-way set-associative cache would be limited to 8KB capacity if virtually indexed and physically tagged. With current mainstream, non-exotic process technology we can easily build bigger single-cycle MMUs and caches than that which run at full processor ALU clock speed.
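To spell out the arithmetic behind that 8KB figure (my own illustration): the set-index and line-offset bits must all come from the untranslated page-offset bits, so a virtually indexed, physically tagged cache is limited to the page size times the number of ways.

#include <stdio.h>

/* Upper bound on a VIPT cache without page coloring:
 *   capacity <= page size * number of ways */
static unsigned max_vipt_capacity(unsigned page_size, unsigned ways)
{
    return page_size * ways;
}

int main(void)
{
    printf("4KB pages,  2-way: %u KB\n", max_vipt_capacity(4096, 2) / 1024);   /* 8 KB   */
    printf("64KB pages, 2-way: %u KB\n", max_vipt_capacity(65536, 2) / 1024);  /* 128 KB */
    return 0;
}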

Increasing the number of ways of the cache is possible, but for a fast cache lookup we want to access all the ways at the same time and then select the one with the matching tag. This makes for memories with far-from-ideal aspect ratios.

Why do we want a physically tagged but virtually indexed cache? To overlap the cache access with the MMU lookup, i.e. to avoid adding a cycle of latency for the MMU operation.

2) It reduces the amount of memory covered by the TLB. This is an efficiency question: working-set sizes have grown enormously since Intel introduced the 4KB page with the 80386, but the typical number of TLB entries has not grown in the same proportion.
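To put rough numbers on that (a hypothetical 64-entry TLB, purely for scale):

#include <stdio.h>

int main(void)
{
    /* TLB reach = number of entries * page size. */
    const unsigned entries = 64;
    printf("4KB pages:  %u KB of reach\n", entries * 4);          /* 256 KB */
    printf("64KB pages: %u MB of reach\n", entries * 64 / 1024);  /*   4 MB */
    return 0;
}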

I would suggest increasing the page size to 64KBytes and keeping the current scheme for megapages.

(I will just go and get my fire-retardant flame suit now ;-)

Andrew Waterman

Jun 16, 2016, 1:57:34 PM
to Michael Chapman, RISC-V ISA Dev
Four counterpoints:

- Larger pages incur more internal fragmentation, particularly in OS small-file caches
- They are incompatible with some (bad) low-level software that assumes 4K pages, increasing porting effort
- Transparent superpage support in modern OSes actually works to some extent
- Microarchitectural techniques can cope with 4K pages to a great extent

In the end, there are many good arguments both for and against the humble 4K page.

Luis Cargnini

Jun 16, 2016, 2:00:17 PM
to Andrew Waterman, Michael Chapman, RISC-V ISA Dev

Just my $0.02

Could the architecture support a configurable page size, e.g. 1K, 4K, or 16K?

Embedded systems have gravitated towards 1K, general-purpose systems still live with 4K, and high-end systems for high-performance computing have been able to support >=16K.

Would that be possible?

Best Regards,

 

Luis Vitorio Cargnini, Ph.D.

Research Technologist Engineer
Western Digital Corporation


Samuel Falvo II

Jun 16, 2016, 2:34:43 PM
to Michael Chapman, RISC-V ISA Dev
On Thu, Jun 16, 2016 at 1:46 PM, Michael Chapman
<michael.c...@gmail.com> wrote:
> 2) It reduces the amount of memory covered by the TLB. This is an
> efficiency question: working-set sizes have grown enormously since Intel
> introduced the 4KB page with the 80386, but the typical number of TLB
> entries has not grown in the same proportion.

Couldn't you just use a page table entry of type >=2 higher up in the
hierarchy to get super-pages?
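To spell out what I mean with a sketch (pseudo-C; the struct below is a stand-in, not the spec's actual PTE bit layout, and read_pte is a hypothetical accessor): the walker already examines one entry per level, so a leaf found at a higher level simply maps a correspondingly larger, naturally aligned region (2MB at level 1, 1GB at level 2 under Sv39) and occupies a single TLB entry.

#include <stdbool.h>
#include <stdint.h>

/* Illustrative Sv39-style walk: 3 levels, 9 VPN bits per level,
 * 12-bit page offset. */
#define LEVELS     3
#define VPN_BITS   9
#define PAGE_SHIFT 12

struct pte {
    bool     valid;
    bool     leaf;    /* a leaf type ("type >= 2" in the draft spec's terms) */
    uint64_t target;  /* PPN of the mapped page or of the next-level table   */
};

/* Hypothetical accessor into page-table memory. */
extern struct pte read_pte(uint64_t table_ppn, unsigned index);

/* Returns the translated physical address, or -1 on a fault.
 * A leaf found at level 1 covers 2MB; at level 2 it covers 1GB. */
int64_t translate(uint64_t root_ppn, uint64_t vaddr)
{
    uint64_t table = root_ppn;
    for (int level = LEVELS - 1; level >= 0; level--) {
        unsigned   shift = PAGE_SHIFT + (unsigned)level * VPN_BITS;
        unsigned   idx   = (unsigned)(vaddr >> shift) & ((1u << VPN_BITS) - 1);
        struct pte pte   = read_pte(table, idx);

        if (!pte.valid)
            return -1;                          /* page fault        */
        if (pte.leaf) {
            uint64_t region = 1ull << shift;    /* 4KB, 2MB, or 1GB  */
            uint64_t base   = pte.target << PAGE_SHIFT;
            return (int64_t)(base | (vaddr & (region - 1)));
        }
        table = pte.target;                     /* descend one level */
    }
    return -1;                                  /* non-leaf at level 0 */
}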

--
Samuel A. Falvo II

Michael Chapman

Jun 16, 2016, 2:47:51 PM
to Samuel Falvo II, RISC-V ISA Dev
Yes, of course.

But why implement the lowest level of the hierarchy at all if we end up not using it, either because it is too inefficient or because supporting the page coloring algorithm in the OS is too complex?
Incidentally, page coloring has various harmful side effects as well.

Small pages mean that we pay an extra cycle of latency even on an ideal, fully cached (TLB hit, cache hit) access, since the cache cannot be virtually indexed beyond the page size per way.
They also increase the TLB miss rate by a very large amount, which again lowers average IPC.

If we keep the specification as is, I recommend that the Linux implementers, on systems with more than 1GByte of memory, use only the 2MByte superpages and not use the 4KByte pages at all.
Also, from the HW perspective, I recommend a virtually indexed cache to improve memory access latency, and just let the OS figure out how to make it work, perhaps with an option to reduce the effective cache size by using only the first 4KBytes of each cache set, to keep compatibility with OSes which insist on using the 4KByte pages and don't know how to color pages.