Languages and new directions in operating systems

Dan Cross

May 1, 2023, 9:30:56 PM

For decades the operating system development landscape has been
dominated by C; specifically in the kernel space. In so many
ways, this makes sense, as C was created to build an operating
system, but it's also become an increasingly hostile language
for its original purpose (e.g., https://arxiv.org/abs/2201.07845
and https://queue.acm.org/detail.cfm?id=3212479; others).

This begs the question: what other languages are suitable for
building kernels? Rust seems like an obvious choice, and there
is prior art, such as https://www.redox-os.org/ and my own
https://github.com/dancrossnyc/rxv64 (this is one of several
kernels I've written in Rust).

It's not perfect; memory management at the level of page tables
etc is still a bit of a slog, and we really need a purely
fallible memory allocation story, but I feel comfortable at this
point saying that it is the best available language for the
domain.

Some folks have mentioned Zig as a possible alternative; time
will tell.

- Dan C.

Luke A. Guest

May 2, 2023, 9:12:14 AM

On 02/05/2023 02:30, Dan Cross wrote:
> For decades the operating system development landscape has been
> dominated by C; specifically in the kernel space. In so many
> ways, this makes sense, as C was created to build an operating
> system, but it's also become an increasingly hostile language
> for its original purpose (e.g., https://arxiv.org/abs/2201.07845
> and https://queue.acm.org/detail.cfm?id=3212479; others).
>

It was always a hostile language.

> This begs the question: what other languages are suitable for
> building kernels? Rust seems like an obvious choice, and there

I wrote the bare bones kernel in Ada, osdev.org.

Dan Cross

May 2, 2023, 9:40:10 AM

In article <u2r27b$p0bl$1...@dont-email.me>,
Luke A. Guest <lag...@archeia.com> wrote:
>On 02/05/2023 02:30, Dan Cross wrote:
>> For decades the operating system development landscape has been
>> dominated by C; specifically in the kernel space. In so many
>> ways, this makes sense, as C was created to build an operating
>> system, but it's also become an increasingly hostile language
>> for its original purpose (e.g., https://arxiv.org/abs/2201.07845
>> and https://queue.acm.org/detail.cfm?id=3212479; others).
>
>It was always a hostile language.

I can see why people say this, but what I mean is that compiler
writers have become somewhat hostile to OS developers by really
stretching what "Undefined Behavior" allows them to do. I get
that on some level, but on another, it means that one cannot
treat C as a portable macro assembler. Indeed, this has been
the case for decades.

>> This begs the question: what other languages are suitable for
>> building kernels? Rust seems like an obvious choice, and there
>
>I wrote the bare bones kernel in Ada, osdev.org.

Very nice.

- Dan C.

Scott Lurndal

May 2, 2023, 10:07:00 AM

cr...@spitfire.i.gajendra.net (Dan Cross) writes:
>In article <u2r27b$p0bl$1...@dont-email.me>,
>Luke A. Guest <lag...@archeia.com> wrote:
>>On 02/05/2023 02:30, Dan Cross wrote:
>>> For decades the operating system development landscape has been
>>> dominated by C; specifically in the kernel space. In so many
>>> ways, this makes sense, as C was created to build an operating
>>> system, but it's also become an increasingly hostile language
>>> for its original purpose (e.g., https://arxiv.org/abs/2201.07845
>>> and https://queue.acm.org/detail.cfm?id=3212479; others).
>>
>>It was always a hostile language.
>
>I can see why people say this, but what I mean is that compiler
>writers have become somewhat hostile to OS developers by really
>stretching what "Undefined Behavior" allows them to do. I get
>that on some level, but on another, it means that one cannot
>treat C as a portable macro assembler. Indeed, this has been
>the case for decades.

The C compilers generally used for OS development have flags
to disable the aggressive optimizations. GCC is, after all,
still used to build Linux and several other operating systems
and hypervisors.

Dan Cross

May 2, 2023, 12:51:38 PM

In article <4O84M.509173$cKvc....@fx42.iad>,
This is true. However, once one goes that route, one finds that
one is no longer writing in C, but in a dialect of C specific to
some project. Granted, that dialect shares syntax and _most_ of
the semantics of C, but it's a dialect nonetheless.

Thus Linux, as one example, is not written in C so much as
"Linux C", which is the dialect created by the various flags
that they pass to GCC and Clang to make whatever undefined or
implementation-defined behaviors they rely on well-defined.
As Ted Ts'o has said, no one really loses sleep over this, and
that's fine: Linux is too important for it to ever really
change.

But personally, I'd rather program in a language that requires
well-defined behavior by default, coupled with semantics that
make it aggressively optimizable. In Rust, for example, the
compiler simply prohibits UB outside of `unsafe` blocks: it is a
compile-time error. Moreover, `unsafe` blocks aren't permitted
to contain UB; rather, the compiler simply gives the programmer
a little more leeway to perform actions that it cannot tell are
safe, and shifts the onus onto the programmer to ensure that
the program is well-defined.
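
A minimal sketch of that division of labor (the function names are illustrative and not taken from any of the kernels mentioned in this thread): safe code gets a checked lookup with fully defined behavior, while the `unsafe` block skips the check and leaves it to the programmer to justify why the access is still in bounds.

```rust
/// Safe lookup: an out-of-range index yields `None`; safe code has no way
/// to read past the end of the slice.
pub fn element(xs: &[u32], i: usize) -> Option<u32> {
    xs.get(i).copied()
}

/// Unchecked lookup: the compiler cannot verify the index here, so the
/// programmer has to state why the access is in bounds.
pub fn first_or_zero(xs: &[u32]) -> u32 {
    if xs.is_empty() {
        return 0;
    }
    // SAFETY: the slice was just checked to be non-empty, so index 0 is valid.
    unsafe { *xs.get_unchecked(0) }
}
```

Here `element(&[1, 2, 3], 3)` is simply `None`, and `first_or_zero` stays well-defined only because the emptiness check comes before the unchecked read.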

- Dan C.

Scott Lurndal

May 2, 2023, 3:28:18 PM

I would argue that there has never been a single dialect of
C - even today one cannot write a useful and well performing
application using just the facilities of standard C (toy
programs, educational exercises, yes, real programs, not so
much). So we have two primary dialects, POSIX and Windows.

When I started getting paid to write operating systems, they
were primarily assembler (e.g. VMS, with some in BLISS-32);
Burroughs was the exception by using a dialect of Algol
as a systems programming language. HP did something similar
on the HP-3000 with SPL. In both cases, the languages were
designed to be close to the hardware.

In the early 1980s Burroughs developed a language called
SPRITE for the BCD medium systems line - we rewrote the
1960's assembler MCP in the modula-like SPRITE language,
but even there facilities needed to be provided for low-level
machine access (privileged instructions, multithread
synchronization mechanisms, access to special hardware
registers (e.g. the base of the translation tables), etc).

Granted, in all the above cases, the compiler team
sat across the hall from the OS team and there were no
aggressive optimizations of UB as there was no
UB.


>But personally, I'd rather program in a language that requires
>well-defined behavior by default, coupled with semantics that
>make it aggressively optimizable. In Rust, for example, the
>compiler simply prohibits UB outside of `unsafe` blocks: it is a
>compile-time error. Moreover, `unsafe` blocks aren't permitted
>to contain UB; rather, the compiler simply gives the programmer
>a little more leeway to perform actions that it cannot tell are
>safe, and shifts the onus onto the programmer to ensure that
>the program is well-defined.

We'll see if Rust is the answer in the long run. As long as it
has the facilities noted above, it should suffice.

Luke A. Guest

May 2, 2023, 4:05:28 PM

On 02/05/2023 20:28, Scott Lurndal wrote:

>> This is true. However, once one goes that route, one finds that
>> one is no longer writing in C, but in a dialect of C specific to
>> some project. Granted, that dialect shares syntax and _most_ of
>> the semantics of C, but it's a dialect nonetheless.
>
> I would argue that there has never been a single dialect of
> C - even today one cannot write a useful and well performing
> application using just the facilities of standard C (toy
> programs, educational exercises, yes, real programs, not so
> much). So we have two primary dialects, POSIX and Windows.

Wirth wrote Oberon in Oberon.

Dan Cross

May 3, 2023, 9:52:30 PM

In article <kvd4M.567604$5S78....@fx48.iad>,
I'm not sure I agree with that first statement. One could write
`grep`, for instance, using pretty much only pure ISO C. But
yes, as soon as you move out of the domain of simple filters,
the landscape is far more complex.

I think this all reinforces my thesis, by the way: there are now
reasonable alternatives to C. Why then, would one prefer C to
another language? I assert that, for many application domains
where it has been the de facto default, it is no longer the
best choice.

>When I started getting paid to write operating systems, they
>were primarily assembler (e.g. VMS, with some in BLISS-32);
>Burroughs was the exception by using a dialect of Algol
>as a systems programming language. HP did something similar
>on the HP-3000 with SPL. In both cases, the languages were
>designed to be close to the hardware.
>
>In the early 1980s Burroughs developed a language called
>SPRITE for the BCD medium systems line - we rewrote the
>1960's assembler MCP in the modula-like SPRITE language,
>but even there facilities needed to be provided for low-level
>machine access (privileged instructions, multithread
>synchronization mechanisms, access to special hardware
>registers (e.g. the base of the translation tables), etc).
>
>Granted, in all the above cases, the compiler team
>sat across the hall from the OS team and there were no
>aggressive optimizations of UB as there was no
>UB.

C certainly started in this vein: pre-typesetter C for writing
Unix on the PDP-11 was, in a lot of ways, very close to the
machine. Where it wasn't (direct access to machine state like
registers, or invoking privileged instructions), it had easy
interoperability with assembler. This is still largely true.

>>But personally, I'd rather program in a language that requires
>>well-defined behavior by default, coupled with semantics that
>>make it aggressively optimizable. In Rust, for example, the
>>compiler simply prohibits UB outside of `unsafe` blocks: it is a
>>compile-time error. Moreover, `unsafe` blocks aren't permitted
>>to contain UB; rather, the compiler simply gives the programmer
>>a little more leeway to perform actions that it cannot tell are
>>safe, and shifts the onus onto the programmer to ensure that
>>the program is well-defined.
>
>We'll see if Rust is the answer in the long run. As long as it
>has the facilities noted above, it should suffice.

It does. The rxv64 kernel and C library are all in Rust, with a
bit of assembler thrown in. The user programs are still in C,
though.
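
As a rough illustration of the "special hardware registers" item, here is a sketch of a volatile register wrapper. `Reg32` and its methods are hypothetical names, not rxv64's API, and a real kernel would point it at a memory-mapped device address rather than an ordinary variable.

```rust
use std::ptr;

/// A 32-bit register accessed only through volatile loads and stores,
/// so the compiler may not elide, merge, or reorder the accesses.
pub struct Reg32 {
    addr: *mut u32,
}

impl Reg32 {
    /// # Safety
    /// `addr` must be non-null, aligned, and valid for reads and writes
    /// for as long as the returned value is in use.
    pub unsafe fn new(addr: *mut u32) -> Reg32 {
        Reg32 { addr }
    }

    pub fn read(&self) -> u32 {
        // SAFETY: validity of `addr` was promised by the caller of `new`.
        unsafe { ptr::read_volatile(self.addr) }
    }

    pub fn write(&mut self, value: u32) {
        // SAFETY: as above.
        unsafe { ptr::write_volatile(self.addr, value) }
    }

    /// Read-modify-write that sets the bits in `mask`.
    pub fn set_bits(&mut self, mask: u32) {
        let current = self.read();
        self.write(current | mask);
    }
}
```

The only `unsafe` obligation sits in `new`; once a `Reg32` exists, callers use `read`, `write`, and `set_bits` from safe code.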

- Dan C.

Scott Lurndal

May 4, 2023, 12:08:17 PM

I dunno. We switched to C++ for operating system
work in the late 80's, as did Chorus Systemes for
their distributed microkernel. The OS for Unisys' OPUS was
a mix of C++, C and assembler (first and second
stage boot loaders).


Also used C++ for a bare-metal
hypervisor in the first decade of this millennium,
and currently use it for CPU simulation along
with SystemC. I suspect that had Rust been
available when these projects started, it may have been a viable
alternative, although being able to leverage
and include existing C or C++ code was a valuable
consideration in choice of implementation language
in all the above projects.

Alexei A. Frounze

May 6, 2023, 1:42:44 PM

On Monday, May 1, 2023 at 6:30:56 PM UTC-7, Dan Cross wrote:
> For decades the operating system development landscape has been
> dominated by C; specifically in the kernel space. In so many
> ways, this makes sense, as C was created to build an operating
> system, but it's also become an increasingly hostile language
> for its original purpose (e.g., https://arxiv.org/abs/2201.07845
> and https://queue.acm.org/detail.cfm?id=3212479; others).

Yep, the UB creep is bad.

> This begs the question: what other languages are suitable for
> building kernels? Rust seems like an obvious choice, and there
> is prior art, such as https://www.redox-os.org/ and my own
> https://github.com/dancrossnyc/rxv64 (this is one of several
> kernels I've written in Rust).
>
> It's not perfect; memory management at the level of page tables
> etc is still a bit of a slog, and we really need a purely
> fallible memory allocation story, but I feel comfortable at this
> point saying that it is the best available language for the
> domain.

I'm seeing repeated stories on Linux and Windows supporting
rust or moving towards C/C++ to rust conversion. I'm interested
in rust too, but I got stuck somewhere in the borrower chapters
in the documentation. Do you think the official docs are the best
description of those lifetime annotations and such? Or is there
something better, with gentler introduction?

Alex

Dan Cross

May 6, 2023, 8:56:12 PM

In article <NJQ4M.2535753$iS99.2...@fx16.iad>,
Oh, I should apologize: I'm being lazily imprecise in that
I mentally lump C++ in with C for this problem domain, but I
never clarified that. My bad! Certainly C++ has been used in
the same general area for a while. While in some respects an
improvement over C in terms of providing higher-level
abstractions, it is also a very idiosyncratic language and
notoriously complex.

>I suspect that had Rust been
>available when these projects started, it may have been a viable
>alternative, although being able to leverage
>and include existing C or C++ code was a valuable
>consideration in choice of implementation language
>in all the above projects.

Rust has good FFI interop with at least C (and, by extension,
assembler). It is difficult to have completely faithful C++
interop without supporting the full semantics of C++ (SFINAE
and all of that), though there's been some good work on
increasing the set of common functionality: https://cxx.rs/
and https://crates.io/crates/cpp, for example.

I think Zig has a very good story here, perhaps better than Rust
in the way they integrate with a C compiler in addition to
supporting the local ABI. E.g.,
https://ziglang.org/learn/overview/#zig-is-also-a-c-compiler

Back to Rust, though, I think there are still some gaps.
Support for writing C-style variadic functions is not yet
stable: a program can consume a va_list, but not create
one. When I wrote `printf` for rxv64, I wrote the base stub in
assembler, which immediately delegates to a Rust function:
https://github.com/dancrossnyc/rxv64/blob/main/ulib/src/sysx86_64.rs#L109
At least robust support for a specified ABI gives us enough to
roll this.
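
To make the "specified ABI" point concrete, here is a small sketch of a non-variadic Rust function that uses the C calling convention and is then called through a C-ABI function pointer. The names `count_bytes` and `count_via_c_abi` are made up for illustration; this is not the rxv64 `printf` path. To export such a function to a C linker under a fixed symbol name, one would also add the `no_mangle` attribute.

```rust
use std::ffi::{CStr, CString};
use std::os::raw::c_char;

/// Uses the platform's C calling convention: counts the bytes in a
/// NUL-terminated string, much like C's `strlen`.
///
/// # Safety
/// `s` must be null or point to a valid NUL-terminated string.
pub unsafe extern "C" fn count_bytes(s: *const c_char) -> usize {
    if s.is_null() {
        return 0;
    }
    // SAFETY: `s` is non-null, and the caller promised it is NUL-terminated.
    let text = unsafe { CStr::from_ptr(s) };
    text.to_bytes().len()
}

/// Calls `count_bytes` through a C-ABI function pointer, as a C caller
/// (or an assembler stub) would. Returns `None` if `text` has an interior NUL.
pub fn count_via_c_abi(text: &str) -> Option<usize> {
    let c_text = CString::new(text).ok()?;
    let f: unsafe extern "C" fn(*const c_char) -> usize = count_bytes;
    // SAFETY: `c_text` is a valid NUL-terminated string that outlives the call.
    Some(unsafe { f(c_text.as_ptr()) })
}
```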

- Dan C.

Dan Cross

May 6, 2023, 9:01:23 PM

In article <92ae547c-e14a-48e1...@googlegroups.com>,
I'm not sure about a gentler introduction. I first learned via
the O'Reilly book, personally, and I liked it very much.

It's a bit of a joke that all new Rust programmers go through
a phase of, "fighting the borrow-checker." (Followed shortly by
the "traits are AWESOME, let's use them EVERYWHERE!" phase).

Personally, I like just cracking it open and playing around with
a language to learn it, but that's just me.

- Dan C.

Alexei A. Frounze

May 7, 2023, 3:50:41 AM

On Saturday, May 6, 2023 at 6:01:23 PM UTC-7, Dan Cross wrote:
> In article <92ae547c-e14a-48e1...@googlegroups.com>,
> Alexei A. Frounze <alexf...@gmail.com> wrote:
...
> >I'm seeing repeated stories on Linux and Windows supporting
> >rust or moving towards C/C++ to rust conversion. I'm interested
> >in rust too, but I got stuck somewhere in the borrower chapters
> >in the documentation. Do you think the official docs are the best
> >description of those lifetime annotations and such? Or is there
> >something better, with gentler introduction?
> I'm not sure about a gentler introduction. I first learned via
> the O'Reilly book, personally, and I liked it very much.
>
> It's a bit of a joke that all new Rust programmers go through
> a phase of, "fighting the borrow-checker." (Followed shortly by
> the "traits are AWESOME, let's use them EVERYWHERE!" phase).

:) I'm not fighting it (yet). I think I get the basic idea (or one of the
few) that the allocations/lifetimes are nested (entirely or largely,
unless we're using the unsafe escapes), but I got confused with
some details and put it aside. Should work out those details at
some point.

> Personally, I like just cracking it open and playing around with
> a language to learn it, but that's just me.

I was hoping, I'd just read the relevant chapters, but I might
end up doing just the same, writing code as aid. I expect my code
compiling and working being more of a proof of the soundness
of the code with rust than with C/C++. Though I think rust
still got some things wrong (e.g. different overflow handling in
debug vs release), but overall it seems a step in the right direction.
Perhaps, a few steps, not just one. :)

Alex

Dan Cross

May 7, 2023, 8:20:48 AM

In article <4afb8338-5540-4549...@googlegroups.com>,
Alexei A. Frounze <alexf...@gmail.com> wrote:
>On Saturday, May 6, 2023 at 6:01:23 PM UTC-7, Dan Cross wrote:
>>[snip]
>> It's a bit of a joke that all new Rust programmers go through
>> a phase of, "fighting the borrow-checker." (Followed shortly by
>> the "traits are AWESOME, let's use them EVERYWHERE!" phase).
>
>:) I'm not fighting it (yet). I think I get the basic idea (or one of the
>few) that the allocations/lifetimes are nested (entirely or largely,
>unless we're using the unsafe escapes), but I got confused with
>some details and put it aside. Should work out those details at
>some point.

Conceptually, it's fairly simple:

1. Every non-trivial[*] object has a single owner.
2. Assignment is logically a "move" operation, not a copy, and
   transfers ownership of the object.
3. Objects are immutable by default.
4. You can either borrow a single mutable reference to a mutable
   object, or one or more immutable references to any object,
   but these are mutually exclusive:
   a. if a mutable reference to an object exists, you cannot
      borrow another reference of any kind to the same object;
   b. if any number of immutable references to an object exist,
      you cannot borrow a mutable reference to that object.
   Note that (a) and (b) together imply that you can create as
   many immutable references to an object as you like.
5. A reference may not outlive the object it is borrowed from,
   and lifetimes are a first-class property in the language,
   with explicit language support.

The interesting thing is that, in safe Rust code, all of this is
statically enforced by the compiler, and the combination has
some interesting properties. Safe Rust code is, by definition,
free of data-races; further, the existence of a mutable
reference is a compiler-enforced proof of the right to mutually
exclusive access to a valid object (references are defined to
be non-nullable). Undefined behavior in safe Rust is a
compile-time error.
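
The rules above can be sketched in a few lines. The function names are invented for illustration, and the commented-out line shows the kind of code the compiler rejects.

```rust
/// Takes ownership of the vector: passing it in is a move, not a copy.
pub fn consume(v: Vec<i32>) -> usize {
    v.len()
}

/// Shared (immutable) borrow: any number of these may coexist.
pub fn sum(v: &[i32]) -> i32 {
    v.iter().sum()
}

/// Exclusive (mutable) borrow: no other reference may exist while it is live.
pub fn push_twice(v: &mut Vec<i32>, x: i32) {
    v.push(x);
    v.push(x);
}

/// Explicit lifetime: the result may not outlive either argument.
pub fn longer<'a>(a: &'a str, b: &'a str) -> &'a str {
    if b.len() > a.len() { b } else { a }
}

pub fn demo() -> (i32, usize) {
    let mut v = vec![1, 2, 3]; // mutability is opt-in
    let total = {
        let r1 = &v;
        let r2 = &v; // two shared borrows at once are fine
        sum(r1) + sum(r2)
    };
    push_twice(&mut v, 4); // allowed: the shared borrows have ended
    let n = consume(v); // ownership moves into `consume`
    // v.len(); // would not compile: `v` was moved on the line above
    (total, n)
}
```

`demo()` returns `(12, 5)`: the two shared borrows each sum to 6, and the vector holds five elements by the time it is moved.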

Unsafe Rust lets you do a handful of other things (like
dereference raw pointers), but the programmer must still obey
all of the above rules. Effectively, `unsafe` says that the
compiler cannot check the rules statically, so the programmer is
asserting that they have all been followed using external
knowledge that the compiler is not privy to.
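
A sketch of what "asserting the rules were followed" can look like in practice: handing out two mutable slices into the same buffer, which the borrow checker cannot prove disjoint on its own. `split_halves` is a hypothetical helper written for this example; the standard library's `split_at_mut` provides the same kind of safe interface.

```rust
use std::slice;

/// Splits a buffer into two non-overlapping mutable halves.
pub fn split_halves(xs: &mut [u8]) -> (&mut [u8], &mut [u8]) {
    let len = xs.len();
    let mid = len / 2;
    let ptr = xs.as_mut_ptr();
    // SAFETY: [0, mid) and [mid, len) both lie inside the original slice
    // and do not overlap, so each element is reachable through exactly one
    // of the two returned mutable references.
    unsafe {
        (
            slice::from_raw_parts_mut(ptr, mid),
            slice::from_raw_parts_mut(ptr.add(mid), len - mid),
        )
    }
}
```

The function's signature is entirely safe; the `unsafe` block is the one place a reviewer needs to check the disjointness argument.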

>> Personally, I like just cracking it open and playing around with
>> a language to learn it, but that's just me.
>
>I was hoping, I'd just read the relevant chapters, but I might
>end up doing just the same, writing code as aid. I expect my code
>compiling and working being more of a proof of the soundness
>of the code with rust than with C/C++. Though I think rust
>still got some things wrong (e.g. different overflow handling in
>debug vs release), but overall it seems a step in the right direction.
>Perhaps, a few steps, not just one. :)

Yes; safe Rust code that successfully compiles is essentially
statically free of entire classes of memory safety problems, and
more or less cannot segfault.

An observation when we started programming in it at work was
that it took us longer to get our programs to compile, but once
we did, they tended not to have memory-safety issues. That
doesn't mean that they didn't have bugs, but the defects were
more often logic issues rather than bad pointers and so on.
Moreover, where we used `unsafe` was where we really needed to
go look and see if we were messing something up in our reviews.

I tend to agree with you about e.g. overflow handling, though we
also found that because we were thinking about it for the debug
case, we used the explicit versions if we really wanted wrapping
semantics and so on, and where we wanted the checks even in
release mode we could use `checked_add` and so on. Still, there
is an argument that it would have been better if checked were
the default, and the unchecked kind had to be explicitly
requested. One nice thing in this area is that generally, Rust
is explicit in places where other languages are implicit:
conversion between types, for example.
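
For reference, the explicit forms look like this. The wrapper functions are only there to give the example names; the methods themselves are the standard integer methods. A plain `a + b` that overflows panics in a default debug build and wraps in a default release build.

```rust
/// `None` on overflow, in every build profile.
pub fn add_checked(a: u8, b: u8) -> Option<u8> {
    a.checked_add(b)
}

/// Two's-complement wraparound, requested explicitly.
pub fn add_wrapping(a: u8, b: u8) -> u8 {
    a.wrapping_add(b)
}

/// Clamps to the type's maximum instead of wrapping.
pub fn add_saturating(a: u8, b: u8) -> u8 {
    a.saturating_add(b)
}

/// Returns the wrapped result plus a flag saying whether overflow occurred.
pub fn add_overflowing(a: u8, b: u8) -> (u8, bool) {
    a.overflowing_add(b)
}
```

With `a = 250` and `b = 10`, these give `None`, `4`, `255`, and `(4, true)` respectively.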

I think they got a few things wrong in the memory model,
frankly, and the way that allocators work can be challenging if
you neither need nor want infallible semantics for allocation.
It's not perfect, but right now, it's pretty good; I've enjoyed
programming bare-metal at a higher level of abstraction than one
gets out of C, for example.
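
On the allocation point, a small sketch of the fallible style that stable Rust does offer today. It uses the standard library's `Vec::try_reserve_exact`; the helper name `try_zeroed` is invented for this example, and a kernel would normally be `no_std` with its own allocator rather than using `std` as here.

```rust
use std::collections::TryReserveError;

/// Builds a zero-filled buffer, reporting allocation failure as an error
/// instead of aborting the program.
pub fn try_zeroed(len: usize) -> Result<Vec<u8>, TryReserveError> {
    let mut buf = Vec::new();
    // The fallible step: returns `Err` rather than aborting if the
    // request cannot be satisfied.
    buf.try_reserve_exact(len)?;
    // Capacity is already sufficient, so this does not allocate again.
    buf.resize(len, 0);
    Ok(buf)
}
```

The caller decides what to do on failure, e.g. `try_zeroed(usize::MAX)` comes back as an `Err` rather than taking the process down. Many other collection operations (a plain `push`, `Box::new`) still have no stable fallible counterpart.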

- Dan C.
