Can 64-bit Linux run on RISC-V with 512 MB of RAM, considering the Sv39 page table size?

Daniel Lu

Mar 13, 2018, 11:08:32 PM
to RISC-V SW Dev
Hi all,

I want to run 64-bit Linux on RISC-V with 512 MB of RAM, but a simple calculation shows that the Sv39 page table is larger than 512 MB!
Rough page table calculation: the total size of the third level = 2^9 x 2^9 x 2^9 x 8 bytes = 1 GB.
Does that mean it can't fit in 512 MB of RAM? Am I right?

I noticed there is a config item in the latest kernel, Maximum Physical Memory size (2G or 128G). Does it help?

Bruce Hoult

Mar 14, 2018, 12:42:10 AM
to Daniel Lu, RISC-V SW Dev
That's correct, but you'd be crazy to let a program use a fully populated 512 GB address space if you only have 512 MB of RAM. Even without counting the space needed for the page tables, you'd be swapping like crazy.

On such a system you might only want to give each process, say, a 32-bit address space. In that case I believe you could populate only four entries in the 4 KB root page table. Maybe the first four. Maybe the first two and last two. Whatever. The other 508 entries can be filled with zeroes (or at least have the LSB zero) to mark them invalid, so that any attempt to access an address in that range causes a page fault exception.

The four used page table entries (which map 1 GB each) could then point to four 4 KB second-level page tables, which would in turn point to 2048 third-level page tables.

So you'll have a total of 2053 page table blocks, taking just over 8 MB of RAM.
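
The count above can be checked with a few lines of arithmetic. This is only a minimal sketch in Python, assuming 4 KiB page-table pages holding 512 eight-byte entries and a contiguous, fully populated address space mapped with 4 KiB pages; the helper name `sv39_table_pages` is made up for illustration.

```python
PAGE_SIZE = 4096   # bytes per page-table page (and per base page)
ENTRIES = 512      # 8-byte entries per page-table page (2^9)

def sv39_table_pages(address_space_bytes):
    """Hypothetical helper: page-table pages needed per level to map a
    contiguous, fully populated address space with 4 KiB pages."""
    leaf_pages = address_space_bytes // PAGE_SIZE   # 4 KiB pages to map
    third_level = -(-leaf_pages // ENTRIES)         # ceiling division
    second_level = -(-third_level // ENTRIES)
    return 1, second_level, third_level             # exactly one root table

root, second, third = sv39_table_pages(4 << 30)     # 32-bit (4 GiB) space
total = root + second + third
print(root, second, third, total)                   # 1 4 2048 2053
print(total * PAGE_SIZE / (1 << 20), "MiB")         # 8.01953125 MiB
```

Calling the same helper with `512 << 30` gives 262144 third-level tables, which is the 1 GB figure from the original question.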

--
You received this message because you are subscribed to the Google Groups "RISC-V SW Dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sw-dev+unsubscribe@groups.riscv.org.
To post to this group, send email to sw-...@groups.riscv.org.
Visit this group at https://groups.google.com/a/groups.riscv.org/group/sw-dev/.
To view this discussion on the web visit https://groups.google.com/a/groups.riscv.org/d/msgid/sw-dev/38a4725e-cff5-4598-8c2d-a4ae9e65ca60%40groups.riscv.org.

Palmer Dabbelt

Mar 14, 2018, 1:02:10 AM
to br...@hoult.org, rop...@gmail.com, sw-...@groups.riscv.org
You can boot a RISC-V kernel down to 32MiB of RAM without any tricks specific
to low-memory systems:

$ git diff
diff --git a/Makefile b/Makefile
index 408a4247fe64..4913e768f410 100644
--- a/Makefile
+++ b/Makefile
@@ -213,4 +213,5 @@ sim: $(spike) $(bbl)
 qemu: $(qemu) $(bbl) $(rootfs)
     $(qemu) -nographic -machine virt -kernel $(bbl) \
         -drive file=$(rootfs),format=raw,id=hd0 -device virtio-blk-device,drive=hd0 \
-        -netdev user,id=net0 -device virtio-net-device,netdev=net0
+        -netdev user,id=net0 -device virtio-net-device,netdev=net0 \
+        -m 32
$ cat work/linux/.config | grep -i CONFIG_MAXPHYSMEM
# CONFIG_MAXPHYSMEM_2GB is not set
CONFIG_MAXPHYSMEM_128GB=y

$ make qemu
...
Starting logging: OK
Starting mdev...
sort: /sys/devices/platform/Fixed: No such file or directory
modprobe: can't change directory to '/lib/modules': No such file or directory
Initializing random number generator... done.
Starting network...
udhcpc (v1.24.2) started
Sending discover...
Sending select for 10.0.2.15...
Lease of 10.0.2.15 obtained, lease time 86400
deleting routers
adding dns 10.0.2.3
Starting dropbear sshd: OK

Welcome to Buildroot
buildroot login: root
Password:
# free -m
             total       used       free     shared    buffers     cached
Mem:            24         19          4          8          0          8
-/+ buffers/cache:         11         12
Swap:            0          0          0

You don't need to map every virtual address in order to have a valid userspace,
just the ones that are actually used. Thus there's no reason to bother
restricting the user's virtual address space to a particular bit width -- just
let the user allocate whatever it wants.

CONFIG_MAXPHYSMEM is a small-scale optimization: if the kernel knows it can
only access 2GiB of physical memory then it can be compiled to use LUI-based
addressing for all symbols, which allows the compiler to generate better code
in some cases. I'd be very surprised if you could find any workload where this
was a 1% performance benefit.

Daniel Lu

Mar 14, 2018, 4:50:47 AM
to RISC-V SW Dev, br...@hoult.org, rop...@gmail.com
Hi Bruce & Palmer,

I'm currently running an FPGA-based 64-bit Rocket Chip core, and the board is limited to 512 MB of RAM, so QEMU may not help here.

Bruce's suggestion of tweaking the kernel VMM code sounds like a challenge, but it also sounds like the only solution for my low-cost board?! I became interested in this after carefully reading the S spec. On re-reading it I noticed the Sv39 superpage scheme, and the 2 MB megapages sound like a good fit for low memory. Imagine running full 64-bit Linux on RV64 with only a few hundred MB of RAM, which would be a great feature. The 2 MB scheme uses only two levels of page table, so the total RAM taken by the second-level page tables is 2^9 x 2^9 x 8 = 2 MB, with each entry pointing to a 2 MB page. Do you know whether the current kernel supports this scheme? As for the 1 GB superpage scheme, I think the page swap overhead is too big, isn't it, especially for a low-memory board? I'm not interested in it. But again, all of these spec-supported schemes are nice-to-have features.

So if I understand correctly, CONFIG_MAXPHYSMEM_2GB implies the -mcmodel=medany flag when compiling the kernel, I guess?


On Wednesday, March 14, 2018 at 1:02:10 PM UTC+8, Palmer Dabbelt wrote:

Daniel Lu

Mar 14, 2018, 5:16:23 AM
to RISC-V SW Dev, br...@hoult.org, rop...@gmail.com
Sorry, a correction to my last sentence: medlow instead of medany.


On Wednesday, March 14, 2018 at 4:50:47 PM UTC+8, Daniel Lu wrote:

Cesar Eduardo Barros

Mar 14, 2018, 8:10:25 AM
to Daniel Lu, RISC-V SW Dev
On 14-03-2018 00:08, Daniel Lu wrote:
> Hi all,
>
> I want to run 64-bit Linux on RISC-V with 512 MB of RAM, but a simple
> calculation shows that the Sv39 page table is larger than 512 MB!
> Rough page table calculation: the total size of the third level =
> 2^9 x 2^9 x 2^9 x 8 bytes = 1 GB.
> Does that mean it can't fit in 512 MB of RAM? Am I right?

The page table is a sparse data structure (this is not exclusive to
RISC-V, most ISAs with virtual memory do it the same way). It doesn't
have to be fully populated, and most of the time it won't.

This is why it's represented as a tree, instead of an array covering the
whole address space. Branches of the page table tree that are empty can
be pruned. As an extreme example, if your process only has memory in the
first megabyte of its address space, only the first entry of the
top-level page table will be filled, and only the first entry of the
next-level page table will be filled too. The other entries will not be
present.
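
The pruning described above can be modelled with nested dictionaries, where a missing key stands for an invalid (not present) entry. This is a rough sketch in Python, not kernel code; the function names `vpns`, `map_page` and `count_tables` are invented for illustration, and the 9/9/9/12 bit split is the Sv39 layout.

```python
def vpns(va):
    """Split a 39-bit virtual address into its three 9-bit table indices."""
    return (va >> 30) & 0x1FF, (va >> 21) & 0x1FF, (va >> 12) & 0x1FF

def map_page(root, va):
    """Map one 4 KiB page, creating lower-level tables only on demand."""
    top, mid, low = vpns(va)
    mid_table = root.setdefault(top, {})        # allocated only if missing
    leaf_table = mid_table.setdefault(mid, {})  # allocated only if missing
    leaf_table[low] = "leaf"

def count_tables(root):
    """Page-table pages in use: the root plus every table it reaches."""
    return 1 + len(root) + sum(len(mid) for mid in root.values())

root = {}
for va in range(0, 1 << 20, 4096):   # map the first megabyte, page by page
    map_page(root, va)

tables = count_tables(root)
print(tables, tables * 4096)         # 3 tables, 12288 bytes
```

Mapping the first megabyte touches one entry in the root table and one entry in the next-level table, so three 4 KiB tables are enough.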

Also, branches of the page table can be shared. For instance, it's very
common (at least before the recent Meltdown vulnerabilities) to map the
operating system kernel at the same place in every process; the parts of
the page table for these mappings are the same for every process, so the
top-level page table for every process can point to a single copy of the
page tables for these mappings.

--
Cesar Eduardo Barros
ces...@cesarb.eti.br

Michael Clark

Mar 14, 2018, 8:24:13 AM
to Daniel Lu, RISC-V SW Dev, br...@hoult.org


On 14/03/2018, at 1:50 AM, Daniel Lu <rop...@gmail.com> wrote:

> Hi Bruce & Palmer,
>
> I'm currently running an FPGA-based 64-bit Rocket Chip core, and the board is limited to 512 MB of RAM, so QEMU may not help here.
>
> Bruce's suggestion of tweaking the kernel VMM code sounds like a challenge, but it also sounds like the only solution for my low-cost board?! I became interested in this after carefully reading the S spec. On re-reading it I noticed the Sv39 superpage scheme, and the 2 MB megapages sound like a good fit for low memory. Imagine running full 64-bit Linux on RV64 with only a few hundred MB of RAM, which would be a great feature. The 2 MB scheme uses only two levels of page table, so the total RAM taken by the second-level page tables is 2^9 x 2^9 x 8 = 2 MB, with each entry pointing to a 2 MB page. Do you know whether the current kernel supports this scheme? As for the 1 GB superpage scheme, I think the page swap overhead is too big, isn't it, especially for a low-memory board? I'm not interested in it. But again, all of these spec-supported schemes are nice-to-have features.

I think your page table math is wrong. With 2 MiB pages, a single 4 KiB second-level page-table page can map 1 GiB. You only need one 4 KiB root page-table page and a half-populated 4 KiB leaf page-table page to map 512 MiB with 2 MiB pages: 8 KiB in total.

Each 4 KiB page-table page has 512 entries, so you only need half of the second-level page-table page to map 512 MiB (256 * 2 MiB).
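
The 8 KiB figure can be reproduced with a short calculation. This is a sketch in Python under the assumptions stated in the message (2 MiB megapages as leaves in the second-level table, 512 entries per 4 KiB table); the variable names are illustrative only.

```python
MEGAPAGE = 2 << 20    # bytes mapped by one second-level leaf entry
TABLE_SIZE = 4096     # bytes per page-table page
ENTRIES = 512         # entries per page-table page

ram = 512 << 20                                  # 512 MiB to map
entries_needed = ram // MEGAPAGE                 # megapage entries in use
second_level = -(-entries_needed // ENTRIES)     # ceiling division
tables = 1 + second_level                        # root + second-level tables

print(entries_needed)          # 256, half of one second-level table
print(tables * TABLE_SIZE)     # 8192 bytes, i.e. 8 KiB
```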

The page tables are sparsely populated. I think that's what you're missing.

Sv39 = 9 + 9 + 9 + 12. Each page-table page is 4 KiB and holds 512 entries (2^9), and the scheme can map up to 512 GiB. You can map 512 GiB with one root page table (4 KiB) containing 512 gigapage entries.
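
To make the 9 + 9 + 9 + 12 split concrete, here is a small Python sketch that breaks a virtual address into its three table indices and page offset, and checks how much one entry covers at each level. The function name `split_sv39` and the sample address are arbitrary choices for illustration.

```python
def split_sv39(va):
    """Return (VPN[2], VPN[1], VPN[0], offset) for a 39-bit virtual address."""
    offset = va & 0xFFF           # low 12 bits: byte within the 4 KiB page
    vpn0 = (va >> 12) & 0x1FF     # 9 bits: index into the last-level table
    vpn1 = (va >> 21) & 0x1FF     # 9 bits: index into the middle table
    vpn2 = (va >> 30) & 0x1FF     # 9 bits: index into the root table
    return vpn2, vpn1, vpn0, offset

# Bytes covered by a single entry at each level: 4 KiB, 2 MiB, 1 GiB.
coverage = [1 << 12, 1 << 21, 1 << 30]

print(split_sv39(0x40201234))             # (1, 1, 1, 564)
print(512 * coverage[2] == 1 << 39)       # True: 512 gigapages span 512 GiB
```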



You are not going to have to worry about page table space, as the tables are a small fraction of memory, unless you are, for example, running 10,000 processes or something. Each process has its own 4 KiB root page table and likely a small number of second- and third-level page-table pages if the heap and text are sparse: maybe 32 KiB per process, depending on how much address space it maps.

Samuel Falvo II

Mar 14, 2018, 12:58:41 PM
to Bruce Hoult, Daniel Lu, RISC-V SW Dev
On Tue, Mar 13, 2018 at 9:42 PM, Bruce Hoult <br...@hoult.org> wrote:
> That's correct, but you'd be crazy to let a program use a fully populated 512
> GB address space if you only have 512 MB of RAM. Even not counting the space
> needed for the page tables, you'd be swapping like crazy.

The kernel won't even let you allocate that much memory on such a
small amount of RAM anyway; your process will be OOM-killed before it
gets that far.

--
Samuel A. Falvo II

Daniel Lu

Mar 14, 2018, 9:27:21 PM
to RISC-V SW Dev, br...@hoult.org, rop...@gmail.com
Thanks to everyone who responded to my question. I'm now confident that I can run 64-bit Linux on my 512 MB of RAM without problems, and I will go ahead. More swapping is to be expected, but I'm OK with that, as this is a kind of SW+HW PoC. Thanks, everyone.