CapROS status

Charlie Landau

unread,

Nov 20, 2022, 4:18:59 PM11/20/22

to cap-talk

The following was written for Mark Miller, because of his deep knowledge of KeyKOS and its successors, and because he has encouraged me to continue work on CapROS. However I have not been successful in emailing him directly. (Perhaps my email server was marked as a spammer?) So I am posting it here for any feedback that the group and/or Mark may care to offer.

-------- Forwarded Message --------
Subject:
CapROS status
Date:
Sat, 22 Oct 2022 12:08:44 -0700
From:
Charlie Landau
To:
Mark S. Miller
Markm,

I am writing to you because you are very knowledgeable about work in the capability space, and I value your advice.

After reading about new work in systems like Fuchsia, KataOS, seL4, etc., I have mixed feelings. One the one hand I am happy that so much progress is being made in secure operating systems using capabilities. On the other hand I am discouraged because it still seems to me that these are partial solutions compared to the KeyKOS/EROS/CapROS model.

I might be suffering from a lack of information.

Do any of these systems use checkpoint/restart? If not, how do they solve the problem of preserving relationships that are represented in capabilities? Does this no longer matter in a world where every device has a battery?

Do these systems support tight control of resources such as storage space and CPU time, as KeyKOS did with space banks and meters? Or is it fair to consider resources unlimited nowadays?

Do these systems support validating the extent to which a service can spread your data, as the factory does? Or was this never really viable in practice?

Do these systems write their own drivers for all devices? CapROS could run Linux device drivers in a protected domain with little modification.

My work on CapROS stopped in 2013 when I had to take a full-time job. The last running CapROS system stopped in 2015 when lightning struck the hardware, and it could not be repaired or replaced. Since then CapROS has atrophied, and it no longer builds on currently-supported compilers. It no longer runs on any hardware you can buy. There is virtually no documentation, and the EROS documentation has been lost. It was never written in a reliable language like Rust, and has never been proven correct like seL4.

When I retired a year ago I hoped to correct some of those issues, but I want to enjoy retirement and not just have a full-time unpaid job. There is a lot to do. William ML Leslie has helped some, but he seems to have moved on to coyotos, and no one else is working on CapROS.

Meanwhile, Google and SEG are moving forward on their systems. I cannot compete with them. I briefly considered seeking a job working with those systems, but I don't really want to come out of retirement, even if anyone would hire me at age 72.

I am considering just abandoning CapROS. I believe there are some useful ideas in the system, but so far no one seems to have known or cared about them.

I would appreciate your thoughts on this.

--
Charlie Landau
he/him/his (Why Pronouns Matter)

William ML Leslie

unread,

Nov 21, 2022, 7:54:03 AM11/21/22

to cap-...@googlegroups.com

My summary, since I try to follow all of these very closely. I wrote a really long winded version of this, but I really don't like my writing.

On Mon, 21 Nov 2022 at 07:19, Charlie Landau <cha...@charlielandau.com> wrote:

Do any of these systems use checkpoint/restart? If not, how do they solve the problem of preserving relationships that are represented in capabilities? Does this no longer matter in a world where every device has a battery?

AFAICT they have all punted on persisting capabilities. The seL4-based systems are small static images. Fuchsia has capabilities in its app storage but not, as far as I can tell, in minfs, which is its dynamic filesystem.

Do these systems support tight control of resources such as storage space and CPU time, as KeyKOS did with space banks and meters? Or is it fair to consider resources unlimited nowadays?

The seL4 systems inherited these concepts from the KeyKOS family. Fuchsia does not concern itself with resource management, for the most part, it loves an overcommit. On the flip side, Fuchsia does have async IPC, so you can safely invoke a capability without worrying whether it would ever respond.

Do these systems support validating the extent to which a service can spread your data, as the factory does? Or was this never really viable in practice?

Not as far as I know. The seL4 image system (CAmkES) only supports very static systems, there is no explicit support for construction. I don't think they will move beyond that for some years.

Not sure about Fuchsia.

Do these systems write their own drivers for all devices? CapROS could run Linux device drivers in a protected domain with little modification.

Fuchsia has its own drivers for the most part, at least, as far as I can tell.

Many seL4 systems run a Linux VM and pass hardware straight through to it. There are some drivers specific to the platform, Lucy Parker has recently implemented a high-performance network driver and set the interface for seL4 driver structure.

--

William ML Leslie

Matt Rice

unread,

Nov 21, 2022, 8:22:11 AM11/21/22

to cap-...@googlegroups.com

On Mon, Nov 21, 2022 at 12:54 PM William ML Leslie
<william.l...@gmail.com> wrote:
>
> My summary, since I try to follow all of these very closely. I wrote a really long winded version of this, but I really don't like my writing.
>

Ditto, but I would also like to apologize for not helping out, free
time to work on it hasn't been copious,
and I will admit that some of the commits to c++ize things scared me
off somewhat, as it just isn't a language
which I want to spend much time in.

> On Mon, 21 Nov 2022 at 07:19, Charlie Landau <cha...@charlielandau.com> wrote:
>>
>> Do any of these systems use checkpoint/restart? If not, how do they solve the problem of preserving relationships that are represented in capabilities? Does this no longer matter in a world where every device has a battery?
>
>
> AFAICT they have all punted on persisting capabilities. The seL4-based systems are small static images. Fuchsia has capabilities in its app storage but not, as far as I can tell, in minfs, which is its dynamic filesystem.
>

The only thing I would add here is, robigalia which is based on seL4
tried to do persistence in userspace; since it is not a part of the
kernel.
But the seL4 kernel doesn't expose things about capabilities needed to
serialize them in any way which could be efficiently done,
It has been a bit of time since I looked at the details regarding
this, but some hints of the scale of problems in
https://robigalia.gitlab.io/book/rosme.html#_capability_equality

It was viewed I believe as a non-starter exposing the tag bits, etc
from the kernel since it leaks information about capabilities, nor was
adding persistence in the kernel proper.

that said I'm uncertain exactly how robigalia was trying it, e.g. full
system persistence via wrapping capability invocations/creation in
proxies a membrane sort of thing,
or otherwise if it was a kind of in-process library, which I'm not
sure would make sense but when posed with the alternative of proxying
might seem tempting?

Anyhow, when I looked at it the conclusion seemed to be that it
doesn't have it, and it seemed like a stretch to imagine it being
feasible without adding some kind of powerful
capability for the persistence layer to hold to allow some kind of
reflection on capabilities.

>>
>>
>> Do these systems support tight control of resources such as storage space and CPU time, as KeyKOS did with space banks and meters? Or is it fair to consider resources unlimited nowadays?
>
>
> The seL4 systems inherited these concepts from the KeyKOS family. Fuchsia does not concern itself with resource management, for the most part, it loves an overcommit. On the flip side, Fuchsia does have async IPC, so you can safely invoke a capability without worrying whether it would ever respond.
>
>>
>>
>> Do these systems support validating the extent to which a service can spread your data, as the factory does? Or was this never really viable in practice?
>
>
> Not as far as I know. The seL4 image system (CAmkES) only supports very static systems, there is no explicit support for construction. I don't think they will move beyond that for some years.
>
> Not sure about Fuchsia.
>
>>
>>
>> Do these systems write their own drivers for all devices? CapROS could run Linux device drivers in a protected domain with little modification.
>
>
> Fuchsia has its own drivers for the most part, at least, as far as I can tell.
>
> Many seL4 systems run a Linux VM and pass hardware straight through to it. There are some drivers specific to the platform, Lucy Parker has recently implemented a high-performance network driver and set the interface for seL4 driver structure.
>
>
> --
> William ML Leslie
>

> --
> You received this message because you are subscribed to the Google Groups "cap-talk" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to cap-talk+u...@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/cap-talk/CAHgd1hGX_OR0Cu8_nwnL7RxNMsTKfQewfU_zm-0KenyTEmw%2BaA%40mail.gmail.com.

William ML Leslie

unread,

Nov 21, 2022, 8:41:30 AM11/21/22

to cap-...@googlegroups.com

On Mon, 21 Nov 2022 at 23:22, Matt Rice <rat...@gmail.com> wrote:

On Mon, Nov 21, 2022 at 12:54 PM William ML Leslie
<william.l...@gmail.com> wrote:
>
> My summary, since I try to follow all of these very closely. I wrote a really long winded version of this, but I really don't like my writing.
>

Ditto, but I would also like to apologize for not helping out, free
time to work on it hasn't been copious,
and I will admit that some of the commits to c++ize things scared me
off somewhat, as it just isn't a language
which I want to spend much time in.

Nah, we always appreciate your input.

There are two C++ programs - CapIDL and MkImage. They are all small and simple programs for the most part. I fixed them in place because I was trying to keep a small footprint, but I'd like to replace at least MkImage with a MontE program at some point. The mki language isn't a capability language, and it really should be.

--

William ML Leslie

Charlie Landau

unread,

Nov 21, 2022, 5:10:58 PM11/21/22

to cap-...@googlegroups.com

On 11/21/22 5:21 AM, Matt Rice wrote:

I will admit that some of the commits to c++ize things scared me
off somewhat, as it just isn't a language
which I want to spend much time in.

CapIDL is a program of unknown authorship, that no one understands, that parses an undocumented language to produce C stubs to invoke capabilities. Some of the stubs no longer compiled with recent compilers, because the stubs have errors, as well as being suboptimal. I converted it to C++ in order to make some simplifications that made the program slightly more understandable. There are still errors, but I moved on to other work once the stubs compiled without error. You are right to be scared of CapIDL, but C++ is the least of its problems.

Matt Rice

unread,

Nov 21, 2022, 5:35:58 PM11/21/22

to cap-...@googlegroups.com

Thanks, I had wrongfully it seems assumed that it was more than *just* CapIDL.

> --
> Charlie Landau
> he/him/his (Why Pronouns Matter)
>

> --
> You received this message because you are subscribed to the Google Groups "cap-talk" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to cap-talk+u...@googlegroups.com.

> To view this discussion on the web visit https://groups.google.com/d/msgid/cap-talk/0b71e9fe-7d70-45ab-366e-2551a123e77b%40charlielandau.com.

William ML Leslie

unread,

Nov 21, 2022, 11:55:52 PM11/21/22

to cap-...@googlegroups.com

On Mon, 21 Nov 2022 at 22:53, William ML Leslie <william.l...@gmail.com> wrote:

My summary, since I try to follow all of these very closely. I wrote a really long winded version of this, but I really don't like my writing.

Though if I might indulge a little:

Calling KataOS an operating system at this point is a bit premature. Right now, it's the 5th iteration of the Rust SDK for seL4, making it easy to get started using seL4 for embedded IoT; it is highly commendable work. It's also completely open source, as is Fuchsia. The news cycle has blown up KataOS, but it appears to have completely missed the reality of what it is.

Compare how quiet the internet is on Kry10, a proprietary operating system running Erlang on seL4 with liquid smooth developer experience, built by some very knowledgeable systems people.

--

William ML Leslie

Jonathan S. Shapiro

unread,

Jan 14, 2023, 12:03:41 PM1/14/23

to cap-...@googlegroups.com

I'm not sure this email is useful beyond serving as proof of life, but I can at least clear one or two things up about CapIDL.

CapIDL is a program of known authorship: I wrote it. More than once. The language it parses is definitely not documented, and it should be. I'm not sure which of the many editions of CapIDL Charlie is discussing. but it's true that the early versions had a number of stub generation errors, and probably true that not all of these have ever been corrected, mainly because EROS ceased to be a target platform for my efforts, and I got pulled away from Coyotos before CapIDL became crucial. Additionally, the tool was not central to the work I was doing on EROS, and my time was prioritized toward other things. So yes, CapIDL is a bit of an orphaned child in the pantheon of the EROS/Coyotos code.

The statement that "no one understands" it isn't particularly constructive. Having written several marshaling tools, and read many more, I'd observe that marshaling is a structurally dirty business that leads to structurally dirty code. It is by nature ad hoc and inelegant, partly because it interacts with concrete aspects of the underlying architecture (usually more than one simultaneously) that we would all - correctly - prefer to forget. Including ad hoc decisions about registerization that are driven empirically rather than by any obvious principle. It's terrible code, and I'll make no serious attempt to defend it. Before you hate on it too hard, go have a look at almost any other IDL compiler, or at the X10 or X11 marshaling code. It'll put CapIDL right in perspective for you.:-)

The problem that CapIDL attempts to solve is three-fold:

Automating the generation of stubs. Normal mortals don't do register level programming these days (and didn't then), and of course that type of programming is totally non-portable.
Canonicalizing the [de]marshaling of arguments, which assists in protocol-level debugging and some forms of intermediation.
Integrating interface documentation into the same tool that implements the interface. The object reference manual is built from XML files that are generated by CapIDL.

In EROS, the main value was actually in the generated docs, because the protocols involved just weren't that complicated yet. In Coyotos, where the messaging layer is more complex, marshaling support was much more helpful.

Regarding C++, there are two decisions I made in standing up EROS that I came to see as really bad ideas. The worst idea, by far, was use of C++ (the other was XML). At the time, C++ was a familiar tool and a fast way to get moving, but it is a pretty bad language. The opacity and obscurity of code generation stinks, and the efficiency of generated code suffers accordingly. Also the cache density of the code. Removing C++ from the Coyotos tool chain entirely was a massive improvement, decreasing the size of the generated kernel code by nearly 30% simply because unused exception paths were eliminated. Note that none of that code was ever once touched in execution, because it was code that managed exceptions that would never be thrown. When I started, exceptions were not yet part of the language. When they arrived, the need to disable exceptions in kernel compilation meant that it isn't actually written in C++ at all - it just looks like C++.

C++ being the work of the devil, I do not favor conversion of anything to C++. Only somewhat facetiously, I'd say that C++ has the dubious distinction of being the only programming language that is not an improvement on Ada or COBOL. :-)

This project is no longer mine, and some of the priorities I was pursuing are probably not relevant right now. The one thing I'd note in regard to implementation languages is that the CapIDL behavior and generated code are among the things one would eventually like to formally verify. Moving to C++ means that the first step to success will be rewriting it.

For completeness: C is not much better. As an exercise in total futility, I invite anyone who thinks they deeply understand the language to give a correct specification of C sequence points...

Jonathan

Jonathan S. Shapiro

unread,

Jan 14, 2023, 12:18:53 PM1/14/23

to cap-...@googlegroups.com

On Mon, Nov 21, 2022 at 5:41 AM William ML Leslie <william.l...@gmail.com> wrote:

I'd like to replace at least MkImage with a MontE program at some point. The mki language isn't a capability language, and it really should be.

I'm not sure it's worth the bother.

mkimage is strictly a program for composing system images, and to export necessary constants that are consumed elsewhere in the build process. It is not something that's ever going to be extended. The earliest form was entirely declarative. Later versions became more general because of the desire to eliminate repeated initialization code.

The move to add functions was unfortunate, because it means that analyzing the content and structure of a generated image is exceptionally hard. From a verification and robustness perspective, it would be much better if mkimage could somehow be restored to its purely declarative form. When I introduced the functions, my thought was that mkimage might be revised to emit a log of capability store operations, which could then be interpreted in straight-line form to produce the system image, and could be analyzed more readily than programs written in mkimage itself.

mkimage and capidl were tools that I had in mind to publish a USENIX paper about. I'm not sure where things stand today, but at the time they just weren't interesting enough to warrant a publication in SOSP or OSDI. During the years when I was simulating an academic, focusing on the top tier venues as perceived by academia was a career requirement.

Jonathan

Reply all

Reply to author

Forward