On 2015-10-29 at 03:35 "'Davide Libenzi' via Akaros" <aka...@googlegroups.com> wrote:
> Given that I am awfully early 😑 , let me explain better.
> If I understood correctly, we need to have 8KB pages, and the host
> will use, say, the lower 4KB while the VMM will use the upper.
> Right?
> This because you want to have a fast mapping from host page to EPT
> page, right?
There's a little more to it than that.
The main thing is that there is an invariant that the address space
exposed to a process is the same whether it is in Ring 3 or in "Ring
V" (Ring 0 of VMX root, guest VM OS, etc).
In x86, this invariant translates to: the normal page table (called the
KPT, for kernel page table) and the EPT have identical mappings for
(almost) all of the user's address space. The KPT and EPT are just
different windows into the same Process Virtual -> Host Physical
mapping. In the case of the EPT, "Process Virtual == Guest Physical".
That is the essence of how page tables and virtual machines work in
Akaros.
The x86-specific bit is that there is an EPT at all. Architectures
that are designed for virtualization should use the same format for the
KPT and the EPT, such that only the KPT is needed. x86 isn't like
that, but the arch-independent parts of Akaros won't cater to x86.
> My 50% was based on the observation that in order for a pair to be
> available, the host should never allocate the 4KB part which is
> reserved for the VMM.
In general, we just do an "order 1" allocation for a page table on
x86 (2 contiguous pages). That's a little more stress on the page
allocator. So yes, every page table page for every process in x86
costs 8KB instead of 4KB. We're okay with that.
> The 8KB forced page size is a bit weird, but I think, if I understood
> correctly what you are trying to achieve, we can get there w/out 8KB
> pages constraint.
> Akaros (like pretty much every VM implementation I am aware of), has
> something like:
>
> struct page {
>     ...
> };
> struct page pages[MAX_PHYS_PAGES];
>
> Now we can have:
>
> struct page {
>     ...
> #ifdef CONFIG_VMX
>     uintptr_t paired_pfn; // Or: struct page *paired_page
> #endif
As a side note, that would need to be CONFIG_X86, since the need for an
EPT is an x86 thing. Additionally, we won't have CONFIG_VMX, since we
want our VMM support to be always on for Akaros, compared to the "bag
on the side" approach.
> };
>
> So say you walked your host page table and you found the host PFN
> (Page Frame Number); to get the EPT PFN you would simply:
>
> pages[PFN].paired_pfn
That would work, but it's a lot less convenient than simply adding a
constant to get from a KPTE to an EPTE:
static inline epte_t *kpte_to_epte(kpte_t *kpte)
{
	return (epte_t*)(((uintptr_t)kpte) + PGSIZE);
}
No dereferences or anything, and it's super simple.
Also, I'm reluctant to add things to the page struct. Adding 8 bytes
for the uintptr_t is a tax of 8 bytes / page. Adding an extra page to
a page table is 4KB per page table. The ratio in cost there is 512:1.
What's the ratio of page table pages to regular pages? A fully
populated PML1 has 512 entries, but there are the intermediate
PML4-2s. Then again, there are the jumbo PTEs for the KERNBASE
mapping. Anyway, the memory savings aren't clear.
> void alloc_page_pair(struct page **pages)
> {
>     kpage_alloc(&pages[0]);
>     kpage_alloc(&pages[1]);
> }
Agreed that the nice thing about this is you don't need contiguous
pages.
Barret