Why Not seL4?


William ML Leslie

Aug 26, 2025, 8:49:41 PM
to Design
MarkM asks, "to what degree could [the capability mafia] consolidate our interests in the KeyKOS-lineage OSes (KeyKOS, CapROS, EROS, Coyotos) into efforts to improve seL4 as needed so we're not losing any of the *significant* differential value of the KeyKOS line?".  It's a great question, and since I continue to work on a Coyotos derivative rather than seL4, I have lots of thoughts.  Let's explore them.

My summary description of seL4 is that Gerwin read the Coyotos Microkernel Spec and made two primary changes: removing any persistence features, and not abstracting the concept of an address space.  There are some differences in the system as a whole, such as the license (GPLv2-only instead of GPLv2+) and the quality of the tools.  I find the (Coyotos) mkimage tool for building the initial configuration of the system infinitely more powerful and comprehensible than CAmkES, but nothing prevents us from replicating the mkimage program in seL4.  I also find CMake to be entirely magic beyond my comprehension, whereas Coyotos uses the much friendlier GNU make.

On removal of persistence features: I know that this matters to some people.  Coyotos is designed as a disk operating system, in the manner of OS/400 and IBM i, where RAM is just a cache for all of your objects.  seL4, on the other hand, lets you control exactly what is in RAM.  I don't have a preference for one style or the other, and I think modern systems like Midori went with the "no paging" style to great effect.  I have heard through Andrew Warkentin that trying to do orthogonal persistence on top of seL4 is a non-starter, but I haven't explored it myself.

On not abstracting address spaces: Coyotos had both Guarded Page Tables (GPTs), which are manipulated by programs to build address spaces, and Mappings, which are platform-specific and silently built as-needed.  seL4 only has the platform-specific object and a platform-specific API to build them.  Again, this might not be a bad thing.  You still need to figure out some way to represent address spaces on-disk, but having to "own" the mappings means you have better control over the performance of running a program that has gone tepid.

In terms of memory management, I think of the Untyped mechanism more like the Coyotos Range cap than a SpaceBank.  Untyped allows you to turn a page of RAM into some kernel object: an Endpoint, a Thread Control Block (Coyotos: Process), a Page.  Would we have much trouble building a SpaceBank on top of this mechanism?
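To make the comparison concrete, here is a toy model in plain Python of how a SpaceBank-style quota might sit on top of an Untyped-style allocator.  This is explicitly not seL4 code: the real kernel interface is seL4_Untyped_Retype, and every class, name, and size here is invented for the sketch.

```python
# Toy model (plain Python, NOT seL4 code) of retyping untyped memory
# into kernel objects, with a SpaceBank-style quota layered on top.

class Untyped:
    """A region of raw memory that can be carved into kernel objects."""
    def __init__(self, size_bits):
        self.size = 1 << size_bits
        self.watermark = 0          # seL4 untypeds allocate monotonically

    def retype(self, obj_size_bits, count):
        needed = (1 << obj_size_bits) * count
        if self.watermark + needed > self.size:
            raise MemoryError("untyped exhausted")
        self.watermark += needed
        return [object() for _ in range(count)]

class SpaceBank:
    """KeyKOS-style bank: a per-client quota over a shared untyped."""
    def __init__(self, untyped, limit):
        self.untyped, self.limit, self.used = untyped, limit, 0

    def alloc_page(self):
        if self.used + 4096 > self.limit:
            raise MemoryError("bank quota exceeded")
        page = self.untyped.retype(12, 1)[0]   # 2^12 = one 4 KiB page
        self.used += 4096
        return page

ut = Untyped(size_bits=16)          # 64 KiB of untyped memory
bank = SpaceBank(ut, limit=8192)    # this client may hold two pages
bank.alloc_page()
bank.alloc_page()                   # a third call would be refused
```

The point of the sketch is that the bank only needs to interpose on allocation, which suggests the answer to the question above may be "not much trouble", though revocation and destruction are where the real work would be.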

I'm not sure if you could build a Factory (Coyotos: Constructor) that can verify confinement on top of seL4.  Might be fun to try.

Given that none of these differences are that material, why am I not currently building on seL4?  Some of the reason comes down to familiarity, but I have a deeper problem with the KeyKOS family that I want to explore, and there'd be a lot of toes getting stepped on if I tried to do this in seL4.  It's really unpleasant to be told "No" when you're doing exploratory work, and seL4 also happens to be a moving target.

Anyway! I'm not the only person to recognise this problem.  Here's a talk given by Matthew Brecknell from Kry10 describing the seL4 objects that must be set up to build a "Data Diode" that allows data to travel in only one direction.  The relevant time code is 20 minutes, 16 seconds.


In this family of systems, there's a level of trust you require in order to call any component.  You trust the component not to crash and discard its reply capability, or, equivalently, not to simply never send the reply.  In order to receive a message, you need to be waiting on that endpoint when the response arrives; your process can't be doing anything else.  So in order to communicate with a component without becoming subject to it, you need a mechanism like the one described in the talk: you add your request to a buffer, then trigger an asynchronous notification, letting the receiver know that you've got a request for it.  You can then carry on with other requests.
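For illustration, here is a plain-Python sketch of that buffered, notification-based pattern.  It models the protocol, not the seL4 API: the classes, capacity, and message names are all my own invention.

```python
# Sketch of the non-blocking request pattern described above: the sender
# deposits a request in a shared buffer and raises a notification, so it
# never has to sit waiting on the receiver's endpoint.
from collections import deque

class Notification:
    """Models a notification: a sticky binary signal, no payload."""
    def __init__(self):
        self.signalled = False
    def signal(self):
        self.signalled = True
    def poll(self):
        was, self.signalled = self.signalled, False
        return was

class Channel:
    """One direction of a shared-memory channel between two components."""
    def __init__(self, capacity):
        self.buf = deque()
        self.capacity = capacity
        self.note = Notification()

    def send(self, msg):
        if len(self.buf) >= self.capacity:
            return False            # buffer full; caller carries on anyway
        self.buf.append(msg)
        self.note.signal()          # tell the receiver there is work
        return True

    def drain(self):
        if not self.note.poll():
            return []               # no signal, nothing to do
        msgs, self.buf = list(self.buf), deque()
        return msgs

ch = Channel(capacity=4)
ch.send("req-1")
ch.send("req-2")
```

Note that neither side ever blocks on the other: the sender's worst case is a full buffer, and the receiver's worst case is an empty drain.  That property is exactly what makes the construction safe, and exactly what makes it expensive, as the next paragraphs discuss.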

This set-up works well enough for systems like Kry10 that are globally configured: there is a known set of components that are going to be wired up.  But I am a soy-dev aspiring E programmer, and I want to create and discard objects at will!  In that case, this sort of construction doesn't really scale.  If you have N components, you could end up with N squared connections between them, each with a send buffer for each communication direction, maybe doubled again if you want to share capabilities (who doesn't?).

To put the problem in perspective, people used to regularly hit the open file limit on POSIX systems, and there are plenty of examples where people have more sockets open than would fit on a single IP address.  If you had one object per port, with the minimum possible buffer sizes, then for 65,536 open ports you would require 65,536 * 4 KiB * 4 = 1 GiB of space just for IPC buffers!
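Those numbers check out.  A quick sanity check in Python (the component count of 1,000 in the second half is just an illustrative figure of mine, not from the scenario above):

```python
# Back-of-envelope check: 65,536 ports, a 4 KiB buffer each, times four
# (one buffer per direction, doubled again for capability sharing).
ports = 65_536
page = 4 * 1024
buffers_per_port = 4

total = ports * page * buffers_per_port
print(total // (1024 ** 3), "GiB")        # exactly 1 GiB of IPC buffers

# And the quadratic growth: pairwise channels among N components.
n = 1_000
print(n * (n - 1) // 2, "channels")       # 499,500 for a thousand objects
```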

No.  The primary responsibility of the kernel is to enable safe IPC between mutually suspicious components.  So I'm exploring mechanisms to do just that.  It has been taking its sweet time; I'm terribly sorry.  I've worked 50-60 hour weeks on cloud telephony systems for over 5 years and haven't had a lot of capacity, and I find C to be slow-going.  Nevertheless, this is why seL4 is not for me (yet).


--
William ML Leslie

Mark S. Miller

Aug 27, 2025, 2:16:08 AM
to fr...@googlegroups.com
William, this is all extremely valuable, thanks!!


--
You received this message because you are subscribed to the Google Groups "friam" group.
To unsubscribe from this group and stop receiving emails from it, send an email to friam+un...@googlegroups.com.
To view this discussion visit https://groups.google.com/d/msgid/friam/CAHgd1hGDc1hQLGxrbzkseGtgnMW-r0H%2BXzd1w%3DxBYBSo_J8z8Q%40mail.gmail.com.


--
  Cheers,
  --MarkM

William ML Leslie

Aug 28, 2025, 12:58:20 AM
to Design
I had some feedback from the seL4 community that I thought would be good to incorporate.  It doesn't greatly change the overall message, but I'm adding it so that nobody gets the wrong idea.

If it's not clear, when I say that I think of seL4 as Gerwin copying the Coyotos Microkernel Spec and trimming some fat, I'm being entirely facetious.  The (Coyotos) spec itself mentions that many of the concepts that distinguish Coyotos from its predecessor came out of conversations at TUD.  Gerwin even attributes the design of seL4 primarily to Kevin Elphinstone and Philip Derrin before 2004.  But unless I've misunderstood the terms in Shap's main point of contention at TUD, seL4 appears to have gone with the Coyotos approach here:

"There was some discussion of merging the two kernels, but no agreement could be reached on the future of L4's map and unmap operation." (Coyotos Microkernel Specification, 0.6+)

The end result is that if a runtime concept exists in both the Coyotos Spec and seL4, it maps close to one-to-one.

I also mentioned that seL4 would not be the right vehicle for exploratory work and that seL4 was a moving target.  Please know that this didn't come from anyone in the seL4 community.  When I decided to avoid working directly in seL4, it was 2019, and I had just come off a lengthy term with a project that had gradually gone from a research endeavour to a serious product, and one core developer had started deliberately trying to derail my work.  At the same time, the level of formal verification work going on within seL4 was such that getting any semantic changes to the kernel upstream would have been, I sensed, beyond my limited capacity.  This was very much a "me" thing, not a "them" thing.

Though, to be fair to my assessment, I have seen people discouraged from contributing ports of seL4 to less popular CPU architectures by core seL4 developers.  I don't know how serious the effort to port was; in fact, the pushback was probably good advice.  But it did reinforce my decision not to bring any kernel change until I had a solid approach that I could comfortably defend.  Verification has been The Hard Part, of course, and any sort of change needs to be carefully considered with reverification in mind.

Gernot also pointed out that Microkit, designed after real-world CAmkES use, is not prescriptive about which build system you use, and is a lot friendlier.  It's still much closer to the CapROS imgmap format than the Coyotos mkimage format.  He also mentioned that he'd like to see someone design a persistent system on seL4.

--
William ML Leslie

William ML Leslie

Aug 28, 2025, 1:06:30 AM
to Design
On Thu, 28 Aug 2025 at 14:58, William ML Leslie <william.l...@gmail.com> wrote:
> The (Coyotos) spec itself mentions that many of the concepts that distinguish Coyotos from its predecessor came out of conversations at TUD.

Sorry, that's an overstatement.  It doesn't say this.  It does say that the endpoint concept predates the meeting, and that is maybe the key difference between the newer architectures and EROS.

--
William ML Leslie

Matt Rice

Aug 28, 2025, 2:34:17 AM
to fr...@googlegroups.com
On Thu, Aug 28, 2025 at 4:58 AM William ML Leslie
<william.l...@gmail.com> wrote:
>
> I also mentioned that seL4 would not be the right vehicle for exploratory work and that seL4 was a moving target.

I'd also mention one other thing which I feel is relevant to the exploratory-work aspect: seL4 is also used in teaching, and as such they limit the bootstrap environment to being buildable from Ubuntu long-term support (LTS) releases.  For me this has ended up stalling some projects for a good 10 years, because some features barely missed the prior Ubuntu LTS release window.  It is entirely understandable, but also unfortunate...

William ML Leslie

Aug 28, 2025, 3:04:18 AM
to fr...@googlegroups.com
When I started the work of building and running Coyotos and CapROS, I did so from within GNU Guix.  I still think that was a really good decision.  It meant that I could gradually walk tools like autotools, gcc, and binutils up in version, fixing build failures, until I had something that built cleanly.  I don't know that mandating Guix or Nix (and supporting nothing else) would be smart, but it does mean that if the system successfully built once, it will always build.

You know, I never did get the crosstools packaged for Ubuntu, as it's hard to know exactly what to capture in the source build.

--
William ML Leslie

Mark S. Miller

Aug 28, 2025, 5:43:59 PM
to fr...@googlegroups.com
What's "TUD"?


--

Matt Rice

Aug 28, 2025, 5:53:32 PM
to fr...@googlegroups.com

Matt Rice

Aug 28, 2025, 8:49:22 PM
to fr...@googlegroups.com
On Thu, Aug 28, 2025 at 7:04 AM William ML Leslie
FWIW, the bootstrap I was working on (or perhaps have been taking a break from) tries to avoid all this POSIX-native tooling as much as possible.  Instead of running cross tools built for and running on a POSIX-native machine architecture, my goal was essentially to build up the toolchain on an emulated capability userspace: instead of running POSIX tools under emulation on a cap-OS as the "native tools", we would run them under capability-userspace emulation on POSIX systems.

The genode project IIRC did a similar thing, building their IPC system on Linux, and I was partially inspired by that, but the feeling was that it was worthwhile to attempt the bootstrap utilizing WebAssembly, because its VM bytecode format allows for architecture independence; in theory we could run the same bytecode under POSIX for initial bring-up and under the bootstrapped system.

The hoot project from spritely is also a scheme compiler project targeting wasm...
https://spritely.institute/hoot/

I find that all the POSIX toolchains are basically infested with ambient authority and designed completely inside out from the way one would design them on a capability system (e.g. "a.out")...  In the GNU tools much of the design seems to be driven by open-file-descriptor limits; as such, the BFD library opens and closes file descriptors at will, and so file names get kept around, e.g. in a stack.

I had hoped to avoid making a system too specific to any one capability system, so my system is actually limited to what MarkM has called transformational programs (those that transform input to some output), not relying on sending messages to other processes.  There is kind of just a big bang followed by an autopsy for each process.  I don't have any intuition on which way (emulating a more traditional capability userspace, or this batch-processing-like one) would be preferred in practice.  That is what I was going for, bootstrap-wise, anyway.