Greetings

Vincent Van Laak

Oct 12, 2025, 8:09:35 PM
to inferno-os

Greetings!

I am a weirdo who has, since 2005 and without knowing Inferno OS existed, been conducting what I will summarize as a thought experiment.  As far as I can tell (I am just beginning to learn about Inferno), it overlaps with your work by a good 70% or more.  I kept my thinking private for a long time, and have recently been trying to come out of that shell.

I am reluctant to either ramble on about my ideas apropos of nothing, or blindly link others to the small blog where I have written about my project.  That blog, written in ignorance of all but my own thinking, may seem a bit… “Golly gee whiz” about some things.  Ideally, I would share places where I believe my thought experiment has provided insights that may be useful to you.  Perhaps I will after a bit more reading, if it is welcome.  I'd be happy to talk theory about a lot of different parts of the system.

That said… I mean, it's not as though Inferno is in active development, and I understand that.  Even if I offered suggestions… abstract ideas aren't helpful without the work to back them up.  And I… I don't think I'll be able to do that work myself, especially not alone.  I have a Bachelor's in CS, but… my programming skills are not impressive, and I've never done kernel work.  And I may be losing my mind, a bit.

That having been said, I have otherwise been unable to find like-minded people to talk to, and I hope some people here will be willing to indulge me.  I say with some conviction that a distributed operating system like Inferno could become the future of computing, if done correctly.

Of course, I literally call my project MAD, so take that with an appropriately sized grain of salt.

  - Vincent

Vincent Van Laak

Oct 13, 2025, 2:02:21 PM
to inferno-os

So here's an example of what I'm talking about, of the kind of theoretical nonsense I spend entirely too much time thinking about:

Remote procedure calls.

Fundamentally, in any distributed system, we will be doing things that fall into the category of remote procedure calls very frequently, so it's important that those mechanisms are well thought through.  If people can't have confidence in your RPC system, your distributed system is on shaky ground.  Without having dived into the details of it, I don't find Styx a confidence-inspiring mechanism.

Now, that's unfair, but only kind of.  Styx is either a peer of, or a step ahead of, the Internet Protocol as an RPC mechanism.  (See also this blog post deriding IP as an RPC mechanism.)  The weak part of both is that they are fundamentally an application entry point that accepts strings (or data streams) as arguments, and that's been how you call applications since applications were first developed.  We have grown used to the fact that that's how it works.  I completely understand using a mature metaphor to minimize the new technology being introduced or developed, and applying it to files to create multiple entry points to control your remote application isn't a bad addition.  (In fact, reusing the filesystem metaphor to map a distributed system, and adding more entry points to a program, are both points on which we agree.)  As a procedure call mechanism, though, it's… limited.

Now, my proposed solution is probably on the entire other side of the map, too complicated instead of too simple, requiring dev work I don't want to do… so instead of talking about that, let's just talk about the high-level theory.

My guiding philosophy is that an operating system should not force programmers to implement or reimplement core systems, even if it permits it; wheel-reinvention should be kept to a minimum, a hobby rather than a job.  In that context, typeless data streams are… basically the worst.  They oblige programmers to either invent a complicated mechanism to handle various valid and invalid data coming through the stream channel, or naively hope and pray that what comes in follows the plan.  What do you do when you start getting bad data in a stream?  Kill it and start over?  What if it's in the middle of a complex client-server interaction?  Networking stacks have gotten good at minimizing events where the network itself corrupts data channels, but there are other potential sources of errors, including active malice.  It is still required that programmers invent ways to detect and handle these errors, or pull in a third-party library that does the same.

Message-based data streams are a bit better, in that there are defined intervals in which the sender and receiver's expectations are reset to a normalized state.  Instead of potentially seeing an unending bitstream through a channel, due to a malfunctioning or malicious sender, you can only see a potentially unending stream of individual packets, and these packets could be mixed in with legitimate traffic that may still be handled correctly.  Common message streams, however, still contain typeless data or text, which brings us to our next problem, or rather, my next bit of guiding philosophy:
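To make the message-boundary point concrete, here is a minimal sketch in Python of length-prefixed framing.  The names frame, unframe, and MAX_MESSAGE, the 4-byte big-endian prefix, and the 64 KiB cap are all assumptions made for illustration; this is not meant to mirror how Styx or any other real protocol frames its messages.

```python
import struct

MAX_MESSAGE = 64 * 1024  # arbitrary cap, chosen only for illustration


def frame(payload: bytes) -> bytes:
    """Prefix a payload with its length as a 4-byte big-endian integer."""
    if len(payload) > MAX_MESSAGE:
        raise ValueError("payload too large")
    return struct.pack(">I", len(payload)) + payload


def unframe(buffer: bytes):
    """Split a buffer into complete messages plus any leftover bytes.

    Raises ValueError when a length prefix exceeds MAX_MESSAGE, so a
    garbage or hostile prefix is caught at the message boundary.
    """
    messages = []
    offset = 0
    while len(buffer) - offset >= 4:
        (length,) = struct.unpack_from(">I", buffer, offset)
        if length > MAX_MESSAGE:
            raise ValueError("declared length exceeds limit")
        if len(buffer) - offset - 4 < length:
            break  # incomplete message; wait for more bytes
        start = offset + 4
        messages.append(buffer[start:start + length])
        offset = start + length
    return messages, buffer[offset:]
```

The receiver gets a known point at which to reset its expectations, but each payload is still just bytes, which is the limitation described above.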

If a feature is common in programming languages, that feature should probably exist for RPCs, data channels, application entry points, and any middleware that ties them together (such as the OS shell).  Most directly, I'm looking at type checking (in particular advanced objects, but also the basics), though you could argue that preconditions and postconditions, or something similar, fit the same bill.  On the one hand, formalized type checking and/or custom condition checking at data ingress means the programmer doesn't need to reinvent that particular wheel.  This is helpful, among other reasons, because programmers have proven over and over that they will do a terrible job reinventing that wheel.  We have all half-assed an “Is this what I think it is” check or two or ten or many many more.  When we aren't half-assing those checks, we're overdoing them because we don't trust any part of the code we aren't looking at right now, so who knows, maybe something weird slipped through out of nowhere, in the absolute middle of a workflow that honestly should be bulletproof, so why do these errors keep coming up.
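As a rough sketch of what "checking at data ingress" could look like if the system did it for you, here is a small Python decorator.  Everything in it (the name checked, the precondition argument, the read_lines example) is hypothetical and invented for this post; it only handles plain classes as annotations, and a real system would need far more than this.

```python
import functools
import inspect


def checked(precondition=None):
    """Check argument types against annotations, then an optional precondition."""
    def wrap(func):
        sig = inspect.signature(func)

        @functools.wraps(func)
        def entry(*args, **kwargs):
            # bind() itself raises TypeError on missing or extra arguments.
            bound = sig.bind(*args, **kwargs)
            for name, value in bound.arguments.items():
                expected = sig.parameters[name].annotation
                if expected is inspect.Parameter.empty:
                    continue  # unannotated parameters are not checked
                if not isinstance(value, expected):
                    raise TypeError(
                        f"{name}: expected {expected.__name__}, "
                        f"got {type(value).__name__}"
                    )
            if precondition is not None and not precondition(**bound.arguments):
                raise ValueError("precondition failed")
            return func(*args, **kwargs)
        return entry
    return wrap


# Hypothetical entry point: the body never sees a bad path or count.
@checked(precondition=lambda path, count: count > 0)
def read_lines(path: str, count: int) -> list:
    return [f"{path}:{i}" for i in range(count)]
```

With this in place, read_lines("a", "2") fails with a TypeError and read_lines("a", 0) fails with a ValueError before the function body runs, so the body doesn't need its own "is this what I think it is" checks.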

On the other hand, formalized typing for entry points and remote procedures is, if not self-documenting, a world ahead of “shout strings down a pipe” syntax.  In that context, I dislike Inferno/Styx's bare Unix file access methodology.  The fact that control files are named is helpful, but that's essentially sanity checking only the first argument to a function, the one that selects which internal function the rest of your arguments go to.  Without access to the documentation, the files which act as data sources and control functions can seem just as arbitrary and opaque as any other undocumented mass of files.  It is therefore still incumbent upon the programmer to offer feedback (of the --help type) so that someone who is exploring the system or debugging it doesn't have to constantly refer back to a separate window to look up the interface details.  (This is still the case with formalized types, but the fact of type checking means that type information is stored in the executable, meaning some extra machine-generated documentation is plausible; likewise, if you had explicit pre/postcondition blocks, you might machine-generate some basic documentation from that, or just output the condition block's source for the user to peruse.)
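Here is a small Python sketch of the machine-generated documentation idea, using the type information a function already carries.  The helper describe and the example entry point set_volume are made up for illustration, and the sketch assumes annotations are plain classes.

```python
import inspect


def describe(func) -> str:
    """Build a short usage summary from a function's signature and docstring."""
    sig = inspect.signature(func)
    parts = []
    for name, param in sig.parameters.items():
        if param.annotation is inspect.Parameter.empty:
            parts.append(name)
        else:
            parts.append(f"{name}: {param.annotation.__name__}")
    ret = sig.return_annotation
    result = "" if ret is inspect.Signature.empty else f" -> {ret.__name__}"
    doc = inspect.getdoc(func) or "(no description)"
    return f"{func.__name__}({', '.join(parts)}){result}\n    {doc}"


# Hypothetical typed entry point, standing in for a control file.
def set_volume(level: int, channel: str) -> bool:
    """Set the volume of one channel; returns True on success."""
    return 0 <= level <= 100 and channel in ("left", "right")
```

Calling describe(set_volume) yields the signature line "set_volume(level: int, channel: str) -> bool" followed by the indented docstring, which is roughly the --help output the programmer would otherwise have to write by hand.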

All of that said, the idea of type checking in a distributed system creates loads of headaches immediately and persistently.  It requires that the system and its core tools (shell, etc.) be aware of types and typed objects.  You need something more like a Python or JavaScript REPL for your shells, which is capable of implicitly and explicitly capturing objects and passing them as arguments to functions.  Any type that gets passed in or out of a public function needs to be universal (e.g. JSON/pythonic data objects) and/or public.  Implementing all that would not be fast, easy, or cheap.  But also, programs need to be working off of compatible implementations of standard data types, or else your public type infrastructure breaks down immediately.

In that context, my theoretical Project MAD has a global type directory, where you ask for an API and get an implementation object that can be dynamically linked into your executable.  But… that's a complex topic for another time, or you could read this.  Either way, this seems like a suitable stopping point for this topic.

In short: What you really want from a distributed system is for a programmer to treat a function on a remote machine as though it were a part of their own application.  That means just passing arguments back and forth, using a system that's just as powerful as a full programming language, complete with type checking and similar features.

My argument that Inferno/Styx doesn't go far enough is not about Inferno at all.  IP, sockets, Unix IPC, and the like are all doing what I will characterize as "the wrong thing".  Granted, doing what I consider “the right thing” would introduce a generational change in how programs operate, and like… I'm just some guy with an idea.  I'm not actually so arrogant as to think I haven't missed anything.  At absolute best, it needs more thinking about, and it's very likely that I'm being too naive.

But that's why I enjoy thinking about the high-level theory.  I'm not setting myself or anyone else up for heartbreak by spending, or inducing the spending of, millions of dollars and thousands of man-hours on an implementation that may end up being wrong.  If someone in this group, or someone discovering my blog, finds a logical hole and blows my theory out of the water, well, nothing is lost.  And if someone takes some of my ideas and makes good use of them in ways I don't expect and can't take credit for… well, I'll still count that as a win.  More broadly… it's just fun, holding an entire design in my head and turning it around like a puzzle to see how it works and what shapes I can bend it into.

I don't plan to keep rambling on here, absent an actual conversation.  Goodness knows I have my blog for that.  But I thought it'd be worth offering an example of what's in my head.

-V

da...@boddie.org.uk

Oct 13, 2025, 6:12:01 PM
to inferno-os
Perhaps look at the way that Limbo can send typed data over channels.
If 9P/Styx doesn't provide the structure you need, it's up to you to impose that. Think of it as a building block instead of something that should be ready to use out of the box.
After all, if it provided some kind of RPC mechanism, maybe it wouldn't be exactly what you wanted, and then you would have to reinvent it anyway.

Vincent Van Laak

Oct 14, 2025, 1:24:51 PM
to inferno-os
So I'm trying to find a metaphor that works the way I want it to, and the best I can come up with is the difference between wiring standards and what an electrician can do with wire and sockets and breakers and such.  On the one hand, yes: there will always be a call for electricians to do what the client says, what the situation calls for, even when it's not to code.  But having an electrical wiring code is important.  A wiring code implicitly or explicitly says what's important: wire sizes, insulation, breakers, ground fault interrupters, conduits... I don't want to torture the metaphor.  Some electricians read the electrical code early in their career and get an understanding of why they should do things a certain way, while others use the wiring code to turn their brains off and just do what they're told.  Both are important.  Being given wires and tools, and instructions on how to use them, is not the same as understanding what the ideal state of house wiring should look like.

Stepping back from the metaphor, as far as I can tell from the Limbo doc (which is on the site but not linked in the resources; I found the link off Wikipedia) and from checking around the git repo, the method for passing typed data through channels is not much more than syntactic sugar, making it simpler to pass data and catch programmer errors.  It may do compile-time type checking, but it doesn't look like it does any runtime type checking.  This is important because a distributed system is almost certain to also be accessible over a network.  In any but the most tightly controlled circumstances, it's inevitable that malicious users could start poking at ports.  Even if that doesn't happen, if Inferno became popular, inevitably admins would start writing third-party scripts to interact with other applications.  In either circumstance, compile-time type checking is useless, because the hacker (good or bad) isn't going to use those tools.  Any app that you expect to touch a larger world needs to handle weird and bad data.  The "wiring code" of a distributed application requires runtime checks on incoming data, no different from any internet-facing service written in any other language and programming style.
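For what it's worth, the kind of runtime check I mean can be sketched in a few lines of Python.  The OpenRequest type, its fields, the decode helper, and the choice of JSON as a wire format are all hypothetical, picked only to keep the example short; none of this is taken from Limbo or Styx.

```python
import json
import typing
from dataclasses import dataclass


@dataclass(frozen=True)
class OpenRequest:
    # Hypothetical request type, invented for illustration.
    path: str
    mode: int


def decode(raw: bytes, cls):
    """Parse untrusted bytes into cls, rejecting anything off-spec."""
    try:
        obj = json.loads(raw.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError) as err:
        raise ValueError(f"not valid JSON: {err}") from None
    if not isinstance(obj, dict):
        raise ValueError("expected a JSON object")
    expected = typing.get_type_hints(cls)
    if set(obj) != set(expected):
        raise ValueError("unexpected or missing fields")
    for name, typ in expected.items():
        # Exact type match, so that a JSON true cannot pass as an int.
        if type(obj[name]) is not typ:
            raise ValueError(f"{name}: expected {typ.__name__}")
    return cls(**obj)
```

The sender's compiler never enters into it: a script or a port scanner that sends b'{"path": 1, "mode": 0}' or plain garbage gets a ValueError at the boundary, and only a well-formed OpenRequest reaches the rest of the program.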

(Aside: Yes, I am being paranoid.  In fact under project MAD, the nominal computer uses a separate network device and protocol for OS traffic, distinct from external network traffic, and is fairly well controlled, but I still worry about what would happen if bad actors sneak into it.  I end up throwing my hands in the air and saying that some things come down to trust, but always try to verify.  Arguing about the same in Inferno is comparatively less paranoid.)

And yes, ultimately, if there is a mechanism that doesn't completely cover your use case, you build your own.  What that leaves out is that more and better tools mean it happens less often.  Really good tools mean it hardly has to happen at all, and/or the process of building your own becomes a lot simpler.
