Palacios on a microkernel

Lukas Humbel

Dec 7, 2012, 4:41:24 AM
to v3vee-de...@googlegroups.com
Hi all,

I'm a student at ETH Zurich, currently writing my master's thesis. We have a simple VMM in Barrelfish (http://www.barrelfish.org/), which I am supposed to extend. I am now considering porting Palacios instead of extending our own work, because Palacios seems much more mature and has almost all the features I am aiming for, especially guest SMP support.
I looked at the technical report on Palacios. What is currently holding me back is the OS interface we would have to support, which looks better suited to a monolithic kernel. Barrelfish is a microkernel, and I'd really like to keep as little code as possible in privileged mode.
I found some hints that Palacios has been run on several microkernels (Minix, L4Re), but couldn't find any code.

Do you know anything about these ports and how they are implemented? Does only a minimal part run in the kernel?

How would you estimate the effort to modify the OS interface of Palacios to support such operation? Is it planned/desired to integrate such an interface into the Palacios mainline?

Cheers,
Lukas Humbel

Peter Dinda

Dec 7, 2012, 9:38:48 AM
to v3vee-de...@googlegroups.com
Erik van der Kouwe ported Palacios to Minix 3 and contributed back a set of changes to Palacios to help support his port. Palacios also runs on Sandia's Kitten LWK, but, while tiny, Kitten is monolithic. Still, you can find the Kitten interfaces on Kitten's site (Kitten is GPLed). I'm not aware of an L4 port.

The structure of the basic OS interface (OS hooks) and the optional interfaces was designed to make it feasible to function in something small, like Kitten. At this point, it's unlikely to get much smaller, since a lot of it is necessitated by the SVM / VT virtualization model itself. Palacios itself needs fully privileged operation, since it needs to maintain the shadow or nested page tables, control interrupt dispatch, and initiate IPIs. The optional interfaces, on the other hand, seem pretty amenable to low-privilege implementations.

Hope that helps.   

Peter 

Lukas Humbel

Dec 14, 2012, 10:48:30 AM
to v3vee-de...@googlegroups.com
Hi Peter

Thanks for your reply. I agree, the interface can't get much smaller. Just some brainstorming from my side: I wasn't thinking of a smaller interface but rather of an alternative one, maybe providing an additional hook that calls one specific (small) function which does all the privileged operations. A user-space implementation could set up a hook that performs a syscall and executes the function in kernel space, while in the monolithic case the hook would just point directly to the function.
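
A minimal sketch of this hook idea in C, for illustration only. Every name here (`priv_hooks`, `privileged_vm_run`, `simulated_syscall`, `SYS_VM_RUN`) is hypothetical and is not part of the Palacios os_hooks structure; the "syscall" is simulated by a plain function call so the example stays self-contained.

```c
#include <assert.h>
#include <stddef.h>

/* All names below are hypothetical; this is not the Palacios os_hooks API. */

struct guest_core {
    int entries;    /* number of times the guest was entered */
};

#define EXIT_CODE_STUB 42   /* placeholder for a real hardware exit reason */
#define SYS_VM_RUN     1    /* made-up syscall number */

/* The one small routine that would need privilege: enter the guest and
 * return when the hardware exits. Here it only records the entry. */
static int privileged_vm_run(struct guest_core *core)
{
    core->entries++;
    return EXIT_CODE_STUB;
}

/* Stand-in for the kernel's syscall dispatcher. On a real microkernel this
 * would be a trap into privileged mode, not a function call. */
static int simulated_syscall(int nr, void *arg)
{
    if (nr == SYS_VM_RUN)
        return privileged_vm_run((struct guest_core *)arg);
    return -1;
}

/* User-space stub: forwards the request across the user/kernel boundary. */
static int user_space_run(struct guest_core *core)
{
    return simulated_syscall(SYS_VM_RUN, core);
}

/* The additional hook: the VMM calls run_guest() without knowing whether
 * it executes in place or behind a syscall. */
struct priv_hooks {
    int (*run_guest)(struct guest_core *core);
};

/* Monolithic host: the hook points directly at the privileged function. */
static const struct priv_hooks monolithic_hooks  = { privileged_vm_run };
/* Microkernel host: the hook points at the syscall stub. */
static const struct priv_hooks microkernel_hooks = { user_space_run };

int vmm_run_guest(const struct priv_hooks *hooks, struct guest_core *core)
{
    assert(hooks != NULL && hooks->run_guest != NULL);
    return hooks->run_guest(core);
}
```

Calling `vmm_run_guest(&monolithic_hooks, &core)` and `vmm_run_guest(&microkernel_hooks, &core)` takes the same path from the VMM's point of view; only the hook table differs between the two hosts.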

Apart from executing the VM directly, I don't see what else needs to run in kernel space. Page tables can be manipulated in user space as long as they are mapped somewhere, and interrupts can be routed to user space (of course, the OS has to support this, but no change to Palacios is needed).

Is interrupt reception used for anything other than receiving IPIs (ignoring device passthrough for now)? If not, then a messaging primitive from the OS should be sufficient, and no new kernel routine is needed.

I still couldn't find the Minix implementation (well, I found at least some switches in the Palacios source, but I'm more interested in the specific implementation of the os_hooks structure). I'll send Erik van der Kouwe an email.

Cheers,
Lukas

Jack Lange

Dec 14, 2012, 12:22:27 PM
to v3vee-de...@googlegroups.com
Hey Lukas,

There was an anonymous svn repository for a while, but it seems to
have disappeared.

It used to be here:
> svn checkout --username anonymous https://gforge.cs.vu.nl/svn/minix/branches/src.r6135.buildsystem.palacios

Taking the approach you outlined (privileged syscall) should work well
enough on AMD, because almost all of the privileged assembly
instructions are contained in the VM launch code. Though I don't
really remember where the user/kernel split was placed for MINIX, I am
pretty sure it was AMD exclusive. However, with Intel things will
probably be a bit more complicated as it requires privileged
instructions to be used more liberally.

Interrupt reception should be replaceable with an underlying channel
of some kind without significant changes.
Passthrough is a bit more complicated, as it requires coordination
with the host OS (especially if the OS is actively using PCI devices).
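
One way to picture "an underlying channel of some kind" is a small queue of interrupt vectors that the host fills and the VMM drains. The sketch below is an assumption-laden illustration, not Palacios or Minix code: `irq_channel`, `chan_post`, and `chan_poll` are made-up names, the queue is single-threaded with no locking or atomics, and a real microkernel would use its own IPC primitive instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical channel carrying interrupt vectors from host to VMM. */

#define CHAN_SLOTS 8   /* arbitrary; a power of two keeps indexing valid
                          when the unsigned counters wrap around */

struct irq_channel {
    uint8_t  vec[CHAN_SLOTS];
    unsigned head;   /* count of messages consumed so far */
    unsigned tail;   /* count of messages produced so far */
};

/* Host side: report that an interrupt vector fired.
 * Returns false if the channel is full and the message was dropped. */
bool chan_post(struct irq_channel *ch, uint8_t vector)
{
    if (ch->tail - ch->head == CHAN_SLOTS)
        return false;
    ch->vec[ch->tail % CHAN_SLOTS] = vector;
    ch->tail++;
    return true;
}

/* VMM side: fetch the oldest pending vector, if any. */
bool chan_poll(struct irq_channel *ch, uint8_t *vector)
{
    assert(vector != NULL);
    if (ch->head == ch->tail)
        return false;            /* nothing pending */
    *vector = ch->vec[ch->head % CHAN_SLOTS];
    ch->head++;
    return true;
}
```

With a zero-initialized `struct irq_channel`, the host side would call `chan_post(&ch, vector)` where it would otherwise invoke an interrupt handler, and the VMM would call `chan_poll` at the points where it currently expects interrupts to have been delivered.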

--Jack
> --
> You received this message because you are subscribed to the Google Groups
> "V3VEE Development" group.
> To view this discussion on the web visit
> https://groups.google.com/d/msg/v3vee-development/-/FwxyfcLENlEJ.
>
> To post to this group, send email to v3vee-de...@googlegroups.com.
> To unsubscribe from this group, send email to
> v3vee-developm...@googlegroups.com.
> For more options, visit this group at
> http://groups.google.com/group/v3vee-development?hl=en.

Peter Dinda

Dec 17, 2012, 2:09:50 PM
to v3vee-de...@googlegroups.com
On Fri, Dec 14, 2012 at 9:48 AM, Lukas Humbel <lukas....@gmail.com> wrote:
> Hi Peter
>
> Thanks for your reply. I agree, the interface can't get much smaller. Just
> some brainstorming from my side: I wasn't thinking about a smaller interface
> but more an alternative, maybe providing an additional hook which calls one
> specific (small) function which does all the privileged operations. A user
> space implementation could set up a hook which performs a syscall and
> executes the function in kernel space, while in the monolithic case, the
> hook would just directly point to the function.
>

True.

> Apart from executing the VM directly, I don't see what else needs to run
> in kernel space. Page tables can be manipulated in user space as long as
> they are mapped somewhere, and interrupts can be routed to user space (of
> course, the OS has to support this, but no change to Palacios is needed).
>
> Is interrupt reception used for anything other than receiving IPIs
> (ignoring device passthrough for now)? If not, then a messaging
> primitive from the OS should be sufficient, and no new kernel routine is
> needed.
>

Ignoring passthrough, the interrupt model has two things that I think
would require careful design and thought in a microkernel model.
First, the VM execution loop disables interrupts on the physical core
and sets interrupt exiting on the underlying machine. This means that
from just before an entry to sometime after exit, the core is running
with interrupts disabled. When an interrupt occurs, the hardware
exits and returns control to us. We then have the first chance to
respond to the interrupt. In the normal case, we just turn interrupts
back on and the interrupt is dispatched in the normal way for the host
OS. In essence, we can delay interrupts. The second thing to note
is that we use IPIs to make it possible to force an exit on a
particular core. More generally, the host must expose some mechanism
that allows us to force the hardware to do an exit on a remote core.
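
The two points above can be sketched as a small simulation in C. This is only a model under stated assumptions: `sim_core`, `run_once`, and `kick_core` are hypothetical names, the hardware is represented by plain fields in a struct, and the guest "runs" for zero time, so a pending interrupt causes an immediate exit.

```c
#include <assert.h>
#include <stdbool.h>

/* Simulated physical core; not real hardware state or Palacios code. */
struct sim_core {
    bool irqs_enabled;       /* host interrupt flag */
    bool irq_pending;        /* a physical interrupt (e.g. an IPI) is waiting */
    int  host_irqs_handled;  /* interrupts dispatched to the host OS */
    int  guest_entries;      /* number of guest entries */
};

enum exit_reason { EXIT_EXTERNAL_IRQ, EXIT_OTHER };

/* With interrupt exiting set, a pending interrupt forces an exit
 * instead of being delivered while the guest runs. */
static enum exit_reason sim_enter_guest(struct sim_core *c)
{
    assert(!c->irqs_enabled);   /* entry happens with interrupts off */
    c->guest_entries++;
    return c->irq_pending ? EXIT_EXTERNAL_IRQ : EXIT_OTHER;
}

/* Re-enabling interrupts lets the host dispatch whatever was delayed. */
static void sim_enable_irqs(struct sim_core *c)
{
    c->irqs_enabled = true;
    if (c->irq_pending) {
        c->irq_pending = false;
        c->host_irqs_handled++;
    }
}

/* One pass of the execution loop described above. */
enum exit_reason run_once(struct sim_core *c)
{
    c->irqs_enabled = false;                   /* disable before entry */
    enum exit_reason r = sim_enter_guest(c);   /* returns on exit */
    /* The VMM gets the first chance to react to r here, before the host. */
    sim_enable_irqs(c);                        /* normal case: hand off to host */
    return r;
}

/* Forcing an exit on a remote core, modeled as posting an IPI to it. */
void kick_core(struct sim_core *target)
{
    target->irq_pending = true;
}
```

In this model, calling `kick_core` on a core and then `run_once` yields `EXIT_EXTERNAL_IRQ`, and the interrupt is handed to the host only after the VMM has seen the exit. That delay is the behaviour a microkernel host would need to preserve or replace.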

--
Dr. Peter A. Dinda
Professor
Head, Computer Engineering and Systems Division
Department of Electrical Engineering and Computer Science
Northwestern University
2145 Sheridan Avenue
Evanston, IL 60208
847-467-7859 (voice)
http://www.eecs.northwestern.edu/~pdinda
pdi...@northwestern.edu

Scott Levy

Dec 18, 2012, 11:01:00 PM
to v3vee-de...@googlegroups.com
I recently updated my work area (it's been sitting idle for several months) and now when I boot my
guest, the guest boots *very* slowly. When I look at the output of dmesg, I see:

[ 572.393129] palacios (pcore 0): palacios/src/devices/8259a.c(509): Interrupt pending after EOI

over and over again (approximately 20 per second). Can anyone explain what's gone wrong?

=s=