* Thomas Bushnell, BSG | You really have no clue how GCC works if you think it's more trouble. | Really, GCC is totally equipped to do cross-compilation (as are all | the other parts of the toolchain).
I have helped port GCC in the past, like in 1988. I am quite sure the design has improved since then. You imply it has in the most unuseful way, so I guess you think it is quite useless to be specific and useful.
/// -- In a fight against something, the fight has value, victory has none. In a fight for something, the fight is a loss, victory merely relief.
* Thomas Bushnell, BSG | So it should be pointed out that one of the reasons I'm interested in | this question is that I'm interested in lisp systems running on bare | metal.
That would be an operating system for and in (Common) Lisp, would it not?
Erik Naggum <e...@naggum.net> writes: > * Thomas Bushnell, BSG > | You really have no clue how GCC works if you think it's more trouble. > | Really, GCC is totally equipped to do cross-compilation (as are all > | the other parts of the toolchain).
> I have helped port GCC in the past, like in 1988. I am quite sure the > design has improved since then. You imply it has in the most unuseful > way, so I guess you think it is quite useless to be specific and useful.
GCC has improved rather a lot in the intervening 14 years. It is now much easier to simply directly build the compiler for a new target (using cross compilation) than it is to port some other compiler first. It's so easy, that it's generally the preferable option.
Erik Naggum <e...@naggum.net> writes: > * Thomas Bushnell, BSG > | So it should be pointed out that one of the reasons I'm interested in > | this question is that I'm interested in lisp systems running on bare > | metal.
> That would be an operating system for and in (Common) Lisp, would it not?
Or Scheme, or other things. But the general problem is of more interest than merely a Lisp-OS strategy.
* Erik Naggum wrote: > This is probably doable, but in my experience with cross-compilation, you > do not just generate code, you effectively generate a module that works > with a much larger system. To make this _really_ work, you have to have > intimate knowledge of the target system. Since the compiler is often the > first thing you build on a new system in order to build the other tools > you want to use there, my thinking is that you save a lot of time using a > pre-existing compiler and like tool, particularly to ensure that you get > the linking information right for that particular environment, what with > all the shared library dependencies and whatnot.
I think this is obviously what you do when you have the choice - when I got gcc working on our Suns I did it by compiling with Sun's cc and using a large number of other pre-provided tools (I think I still use the Sun linker), and so did everyone else in those days (nowadays you get a binary of gcc to bootstrap because Sun don't ship a C compiler for free any more...).
But in what I suspect (without real knowledge) is a large proportion of ports nowadays the target machine is something which does not yet have anything on it, because it's a little embedded processor for which you (the maker of the processor) are going to provide the gnu toolchain (this seems to be pretty much standard for embedded systems now). In these cases you probably have seriously minimal support on the target since you're aiming to get something like an OS up on it from the bare iron.
> Duane Rettig <du...@franz.com> writes: >> Similar truths apply to a garbage-collector. It might be >> perfectly acceptable for a gc function to call cons, but it >> had better be prepared to deal with the case where there is >> no more room for a new cons cell, which would thus cause a >> recursive call to the garbage-collector (presumably an infinite >> recursion, since the reason for the initial gc call might have >> been for lack of space). > So the *point* of my question is, in part, just this problem. Now, > if there *isn't* a solution, then you have to subset the language, > omit cons, and then code your GC. > But why do that if there is a convenient solution? > One strategy: suppose to GC an arena of N bytes takes N/10 bytes of > memory to hold dynamically allocated GC data structures. > One strategy is to just save that space always, so it's there. Or, > if one is using stop-and-copy, then it's even easier to find space. > If one is in a multi-threaded world, and each thread gets its own > allocation arena for normal allocation, and you are using a > stop-the-world approach to GC, then you can't reliably assume > (perhaps) that all the threads have left their arena in an ideal > state. That means that the GC will probably have to allocate out of > a totally separate arena from what other programs use. When it's > done, a quick GC pass (allocating from the main heap) can be run to > clean the special GC arena, and copy anything remaining there onto > the main heap.
I think you need to take a step back and try to see _precisely_ what it is you're trying to implement.
It seems (from what I see) that your goal is to build a "not far off bare iron" Lisp environment.
A hopefully not-too-wild guess would be that you might take one of the Unix-like kernels, and rather than running init on top of that, run Lisp instead. (I could be wrong, but that would surprise me a bit...)
I think your goal is likely to be to: - Interface to the kernel's devices; - Interface to the kernel's memory manager; and then build a usable environment on top of that.
Let's suppose that this environment is to be an implementation of Common Lisp. (Changing to Scheme wouldn't change the fundamentals.)
You clearly need to have the following things:
a) A code generator that generates and stores object code that the kernel knows how to load and run.
b) A memory management system that requests memory from the kernel, when needed, which doles it out when the CL wants to CONS a bit, which can reclaim garbage, and perhaps even return it to the kernel.
Those are things which fall out of the scope of things that a CL implementation "trivially provides."
It would certainly be a reasonable idea to create, in portable Common Lisp (or, for that matter, Scheme):
a) Some functions that know how to take object code in some sort of buffer, and write it out to storage;
b) Some functions that implement the "code buffer" described in a), and which allow storing machine language instructions in that buffer by making use of some sort of "assembler";
c) Some set of Lisp functions that implement a memory manager, using the "assembler" described in b).
Given that, you can then proceed to implement:
d) A library of string and number functions using c) and b);
e) An interface, in assembly language, to the OS kernel, which uses d) and b) to provide an API that can access OS functionality.
At this point you have a memory manager, a program loader and "saver," an assembler, and access to OS services.
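Items a) and b) above, the code buffer and the routine that writes it to storage, can be sketched in a few lines. This is only an illustration in Python rather than Lisp: the opcode table, the 5-byte instruction format, and the names `CodeBuffer`, `assemble`, and `toy_object.bin` are all hypothetical, and a real port would use the target CPU's actual encodings and an object format the kernel can load.

```python
import struct
import tempfile
from pathlib import Path

# Hypothetical opcodes for an imaginary target machine (an assumption
# for this sketch, not any real instruction set).
OPCODES = {"load": 0x01, "add": 0x02, "ret": 0x03}


class CodeBuffer:
    """Growable buffer that accumulates encoded instructions (item b)."""

    def __init__(self):
        self.data = bytearray()

    def emit(self, mnemonic, operand=0):
        # One opcode byte followed by a 32-bit little-endian operand.
        self.data += struct.pack("<BI", OPCODES[mnemonic], operand)

    def save(self, path):
        # Item a): write the accumulated object code out to storage.
        Path(path).write_bytes(bytes(self.data))


def assemble(program):
    """Assemble a list of (mnemonic, operand) pairs into a CodeBuffer."""
    buf = CodeBuffer()
    for mnemonic, operand in program:
        buf.emit(mnemonic, operand)
    return buf


buf = assemble([("load", 40), ("add", 2), ("ret", 0)])
out = Path(tempfile.gettempdir()) / "toy_object.bin"
buf.save(out)
```

Everything later in the list (the memory manager, the string and number library, the kernel interface) would then be expressed as calls that emit into such a buffer.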
You can start assembling functions to implement basic pieces of a Lisp environment, and then assemble many of those together to implement further functions and macros.
There's going to be a bunch of code that is written in pretty much raw assembler, guaranteed. I'd think a)-e) are all examples of that.
The proper goal of the exercise will doubtless be to implement in the f)-and-following phases the pieces of the Lisp environment needed to host a)-e).
Thus, it would be perfectly acceptable to make extensive use of macros and CLOS throughout, perhaps even the MOP, with the underlying expectation that f)-and-following includes an implementation of macros, CLOS, and MOP. In effect, the project involves implementing an Extremely Sophisticated Assembler in Lisp, and that's quite a reasonable idea.
You could reasonably assume that since there exists the CLOS implementation "PCL," that you don't forcibly have to create CLOS from scratch, but could just implement enough of CL to allow loading in PCL.
What you _can't_ assume is that simply by the sheer existence of CLISP, CMU/CL, and ACL, you can avoid a)-e) or f)-and-following.
Your big tasks are going to be to design: a) How you'll save and load binary code; b) How you'll implement memory management; c) How you'll store Lisp code in memory; d) How you'll store Lisp data in memory.
That's certainly where the project starts. If the approaches for those four things are done well, the system may be good. If the approaches suck, well, great suckage lies there... -- (reverse (concatenate 'string "moc.adanac@" "enworbbc")) http://www.ntlug.org/~cbbrowne/oses.html "Windows: The ``Big O'' of operating systems."
Tim Bradshaw wrote: > I didn't quite mean it quite so literally. Imagine I get a blob of > code, how do I know that it doesn't fake things? The only way I can > see to do this is a completely trusted compiler, which can sign its > output, so you're still dynamically checking, you just do it once, > when the program starts (isn't this what MS push with ActiveX?). Or I > guess you can do some kind of proof on the program before running it > (Java?).
> Given the negligible cost of checks, I'd kind of rather the OS just > did them though.
Running untrusted code, i.e. an opaque binary handed to you by a potentially malicious stranger, is a problem that requires somewhat elaborate solutions. For example, you can run it in a sandbox that restricts its access to sensitive resources (Java VM), you can do the same thing but with hardware support to speed it up (Unix processes with CPU-supported memory protection), or you can require the person giving you the code to supply a machine-checkable proof that it is harmless (proof-carrying code), or you can just give up and ask the user "do you trust this guy?" (digital signatures).
Note how even the low-tech alternative of sandboxing does not work too well; numerous flaws were pointed out in the Java security scheme, and even though e.g. FreeBSD provides a "jail" system call, you probably would only want to use it to provide an extra level of protection for daemons that already provide their own checks. Relying completely on the OS provisions for this would feel risky.
The primary reason that these runtime checks are universally included is that they solve a *different* problem: when I make a mistake and introduce a bug in my program, how can I prevent it from clobbering all information on my computer and setting me back a week's worth of work? Somewhere on Dennis Ritchie's homepage there is a description of what it was like working on their multi-user machine before the memory protection unit was delivered. Before you ran your newly compiled program you shouted out "a.out!" and waited until your coworkers had had time to save their files. Quite often the computer would stop echoing keystrokes after that.
But the crucial point is that this problem has another solution, by writing your program in a type-safe language like Lisp, ML, Java instead of in assembly or C. No Lisp or Java program will ever modify memory that wasn't properly allocated to it. And this is not because of run-time checking - there are some checks, like array bounds, but primarily this is because these languages have _no way to describe_ the act of peeking and poking at data that does not belong to you (kind of like Orwell's newspeak). So there is no need to monitor an ML program with runtime checks - the very fact that it was written in ML gives us all the confidence we need.
There is some loss of clarity because the same word "protection" is used to describe both the concepts above (and the also unrelated idea of enforcing good software engineering practice by "encapsulating" parts of programs in modules).
But if you do your program development in a type-safe language using a standard OS, then you are paying for a feature you don't need, namely the sandboxing of programs into processes that cannot hurt each other. (Whereas if you program in some more assembly-style language, this is definitely a feature you would want.) And these checks do not have negligible cost. Every time you want your processes to communicate with each other, or with the world, or you just want to switch to another process in a multi-tasking system, you need to tell the memory-protection unit about it, set up a new stack and virtual address translation table, etc. These context switches take time: OS performance is assessed partly on the basis of how long they take, buffering I/O libraries are used to avoid them, etc.
It would be nice to have an operating system based on a type-safe language instead of C. It could then dispense with the concept of processes altogether, and there would be no distinction between user and system code. (The "kernel" would dissolve into a set of libraries, with a thread scheduler somewhere.)
Such an OS could of course still sandbox or proof-check or verify the signatures of untrusted code, just as you run a Java VM in Windows/Unix.
-Vilhelm
(The ideas above come mainly from Vapour [http://vapour.sourceforge.net/], a pet project of an IRC acquaintance. He seems to have given up developing it, though).
Vilhelm Sjoberg <va...@cam.ac.uk> writes: > It would be nice with a operating system based on a type-safe language > instead of C. It could then dispense with the concept of processes all > together, and there would be no distinction between user or system > code. (the "kernel" would dissolve into a set of libraries, with a > thread scheduler somewhere).
Which goal is indeed part of the background behind the question of mine that started this thread.
Vilhelm Sjoberg <va...@cam.ac.uk> writes: > But if you do your program development in a type-safe language using a > standard OS, then you are paying for a feature you don't need, namely > the sandboxing of programs into processes that cannot hurt eachother.
As long as your type-safe language has "escape hatches" for bypassing safety (e.g. unchecked conversion, for V'Address use ..., calls for foreign functions, etc.) then OS protection features are still necessary.
Even if your language has no escape hatches, you are still putting a lot of trust in the security and quality of your runtime environment (which ultimately is not implemented in the safe language).
It is far far better to have both safety features (language safety and OS-protections).
-- Cheers, The Rhythm is around me, The Rhythm has control. Ray Blaak The Rhythm is inside me, bl...@telus.net The Rhythm has my soul.
Ray Blaak <bl...@telus.net> writes: > As long as your type-safe language has "escape hatches" for > bypassing safety (e.g. unchecked conversion, for V'Address use ..., > calls for foreign functions, etc.) then OS protection features are > still necessary.
If the escape hatches are only available to privileged code, you still don't need OS protection features.
> Even if your language has no escape hatches, you are still putting a > lot of trust in the security and quality of your runtime environment > (which ultimately is not implemented in the safe language).
On Fri, 08 Mar 2002 00:30:27 -0500, Ray Blaak wrote: > Vilhelm Sjoberg <va...@cam.ac.uk> writes: >> But if you do your program development in a type-safe language using a >> standard OS, then you are paying for a feature you don't need, namely >> the sandboxing of programs into processes that cannot hurt eachother.
> As long as your type-safe language has "escape hatches" for bypassing > safety (e.g. unchecked conversion, for V'Address use ..., calls for > foreign functions, etc.) then OS protection features are still > necessary.
But in the kind of language we are talking about, it is statically known whether "escape hatches" have been used in a particular program. Only programs that do use unsafe features need OS protection. In practice, those should be the vast minority.
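To make "statically known" concrete, here is a rough sketch of a loader-side check that inspects a program's source, without running it, for use of unsafe facilities. Python stands in for the safe language only for illustration: the `UNSAFE_MODULES` list and the function name `uses_escape_hatch` are assumptions, and because Python can import modules dynamically this check is not sound for Python itself. In a language designed for the purpose, the equivalent test would be part of the compiler or type checker.

```python
import ast

# Modules treated as "escape hatches" in this sketch. The list is an
# assumption for illustration, not a complete inventory.
UNSAFE_MODULES = {"ctypes", "mmap"}


def uses_escape_hatch(source):
    """Return True if the source text imports any module in UNSAFE_MODULES."""
    tree = ast.parse(source)  # parses only; the program is never executed
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            names = [node.module or ""]
        else:
            continue
        # Compare the top-level package name of each import.
        if any(name.split(".")[0] in UNSAFE_MODULES for name in names):
            return True
    return False


safe_program = "def add(a, b):\n    return a + b\n"
unsafe_program = "import ctypes\nctypes.string_at(0)\n"

needs_os_protection = [uses_escape_hatch(p) for p in (safe_program, unsafe_program)]
```

Under this scheme only the second program would be placed in its own protected address space; the first could share one with everything else.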
> Even if your language has no escape hatches, you are still putting a lot > of trust in the security and quality of your runtime environment (which > ultimately is not implemented in the safe language).
Right. Just as much trust as I put now into my runtime environment -- the millions of lines of kernel code, the code for all those setuid programs on my system, ...
I would have an easier time "trusting" my environment if I knew its safety relies on proofs rather than on mere hope that not one of a bunch of a few thousand programmers whom I personally don't know has screwed up somewhere.
> It is far far better to have both safety features (language safety and > OS-protections).
I see no reason to pay for OS protection if I provably won't need it.
mdanish> Context-switches are expensive, remember. An OS/compiler mdanish> that removed as many layers as possible between program and mdanish> underlying hardware would be much faster;
Isn't this what exokernels are all about? As I remember/understood it, the exokernel "movement" takes the microkernel idea to the logical extreme, expecting the kernel to do very little other than multiplexing access to hardware.
------------------------+-------------------------------------------------- --- Christian Lynbech | Ericsson Telebit, Skanderborgvej 232, DK-8260 Viby J Phone: +45 8938 5244 | email: christian.lynb...@ted.ericsson.se Fax: +45 8938 5101 | web: www.ericsson.com ------------------------+-------------------------------------------------- --- Hit the philistines three times over the head with the Elisp reference manual. - peto...@hal.com (Michael A. Petonic)
* Frode Vatvedt Fjeld <fro...@acm.org> | The old lisp machines lacked OS protection mechanisms like address | spaces, didn't they? Did they suffer substantially from this design?
They would today, since all kinds of creep can and do get access to computers. Besides, if anyone would implement a system that invites intrusion and viruses and malicious abuse and is completely helpless in the face of such abuse, they would be violating a number of Microsoft patents on how _not_ to design software.
> I would have an easier time "trusting" my environment if I knew its > safety relies on proofs rather than on mere hope that not one of a buch of > a few thousand programmers who I personally don't know haven't screwed up > somewhere.
Well, that's good. Perhaps you should start writing a provably correct shared-memory multiprocessor OS that scales to, say, a 100 processor machine reasonably well and supports all the features of, say, Solaris on such HW, and performs as well.
* Frode Vatvedt Fjeld <fro...@acm.org> | The old lisp machines lacked OS protection mechanisms like address | spaces, didn't they? Did they suffer substantially from this design?
Erik Naggum <e...@naggum.net> writes: > They would today, since all kinds of creep can and do get access to > computers. Besides, if anyone would implement a system that invites > intrusion and viruses and malicious abuse and is completely helpless > in the face of such abuse, they would be violating a number of > Microsoft patents on how _not_ to design software.
But recent Microsoft OSes do implement address spaces, and enforce them as do unix systems. The fundamental problem of macro viruses is not solved by address spaces (even if they arguably can help somewhat once the damage is done), nor are buffer/stack overruns, or similar attacks. It is also generally understood that the unix (and, I suppose, windows) protection mechanisms do not seriously protect against anyone with physical access to the machine.
I'd say if you run malicious code, outside some specially rigged jail environment, you are in big trouble regardless. Java's virtual machine is one such rigged environment, but to my knowledge this approach also doesn't rely on strictly enforced address spaces.
Address spaces do allow multiple users (to a limited degree) not to interfere with one another, but I believe most PC-users today get to have one personal computer each for typical work, and demanding applications get at least one devoted machine.
Tim Bradshaw <t...@cley.com> writes: > Not then, but the thought of such a system with a MIME mailer which > could cause random Lisp code to be executed fills me with horror.
But wouldn't any mailer that causes any kind of random code to be executed equally fill you with horror?
> I'd say if you run malicious code, outside some specially rigged jail > environment, you are in big trouble regardless. Java's virtual machine > is one such rigged environment, but to my knowledge this approach also > doesn't rely on strictly enforced address spaces. > Address spaces do allow multiple users (to a limited degree) not to > interfere with one another, but I believe most PC-users today get to > have one personal computer each for typical work, and demanding > applications get at least one devoted machine.
I think you're missing the distinction between necessary and sufficient. No one is claiming that memory protection mechanisms are *sufficient* but for native code applications I think I would like to claim they are *necessary* to run untrusted native code.
> PreScheme gives you exactly that, which is what the GC of Scheme 48 is > written in. Not that it buys you much there except for performance, > as the (or this particular) GC primarily deals with words and bits.
N.B. PreScheme is really just ML with Scheme syntax.. :)
I combine existing type systems with several standard type-based compilation techniques to write strongly typed programs that include a function that acts as a tracing garbage collector for the program. Since the garbage collector is an explicit function, there is no need to provide a trusted garbage collector as a runtime service to manage memory. Since the language is strongly typed, the standard type soundness guarantee ``Well typed programs do not go wrong'' is extended to include the collector, making the garbage collector an untrusted piece of code. This is a desirable property for both Java and proof-carrying code systems.
...
It extends the work I presented at POPL 2001. I won't make any claims of practicality until I bite the bullet and build a large scale system. However, the underlying design is sound. If you're not concerned about type safety then the work still offers you a recipe of what minimal abstractions you need to do a GC in your favorite untyped language.
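For readers who want a feel for the untyped case, the following is a minimal sketch of a tracing collector written as an ordinary function of the program rather than as a runtime service. It is a Cheney-style copying collector over a toy heap of two-slot cons cells, written in Python. It is not the typed construction described above and makes no safety claims; the heap layout and the names `Ptr`, `Forward`, and `collect` are assumptions made for the example.

```python
from collections import namedtuple

Ptr = namedtuple("Ptr", "addr")          # reference to a cons cell (car at addr, cdr at addr+1)
Forward = namedtuple("Forward", "addr")  # forwarding marker left behind in from-space


def collect(from_space, roots, size):
    """Copy everything reachable from roots; return (to_space, new_roots, free)."""
    to_space = [None] * size
    free = 0  # next unused slot in to-space

    def copy(value):
        nonlocal free
        if not isinstance(value, Ptr):
            return value                      # immediates (ints, None) pass through
        head = from_space[value.addr]
        if isinstance(head, Forward):
            return Ptr(head.addr)             # cell was already moved
        new = free
        to_space[new] = head                  # copy car
        to_space[new + 1] = from_space[value.addr + 1]  # copy cdr
        free += 2
        from_space[value.addr] = Forward(new)
        return Ptr(new)

    new_roots = [copy(r) for r in roots]
    scan = 0
    while scan < free:                        # to-space doubles as the work queue
        to_space[scan] = copy(to_space[scan])
        scan += 1
    return to_space, new_roots, free


# Heap: a garbage cell at 0, the list (1 2) at 2 and 4, a garbage cell at 6.
heap = [99, None, 1, Ptr(4), 2, None, 7, Ptr(0)]
to_space, new_roots, free = collect(heap, [Ptr(2)], size=8)
```

Because the scan pointer chases the free pointer through to-space, the collector needs no auxiliary stack or queue, which is why a copying scheme makes it easy to find working space for the collector itself.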
On Fri, 08 Mar 2002 10:06:25 -0500, Tim Bradshaw wrote: >> I would have an easier time "trusting" my environment if I knew its >> safety relies on proofs rather than on mere hope that not one of a buch >> of a few thousand programmers who I personally don't know haven't >> screwed up somewhere.
> Well, that's good. Perhaps you should start writing a provably correct > shared-memory multiprocessor OS that scales to, say, a 100 processor > machine reasonably well and supports all the features of, say, Solaris > on such HW, and performs as well.
I might, but I need someone to pay me for doing so. Plus, there is no guarantee that the result would be widely employed. Look at what we are stuck with now (Windows, Windows, Windows, with perhaps an odd Linux here and there in the mix): Technical superiority is not at all a guarantee for success in this marketplace. (Neither Windows nor Linux would be so popular otherwise.)
By the way, what I am thinking of would never support "all the features of, say, Solaris", at least not to the point of low-level (API-) compatibility. This is because we are talking complete redesign of *everything*, and this must start with exactly those interfaces. Of course, the resulting incompatibility with existing software would make it even harder to be accepted. Basically, we are stuck with what we have now. I am quite confident that in, say, 100 years the world of OS design will look quite different, but it will take a major revolution or two to get there.
Tim Bradshaw <t...@cley.com> writes: > I think you're missing the distinction between necessary and > sufficient. No one is claiming that memory protection mechanisms > are *sufficient* but for native code applications I think I would > like to claim they are *necessary* to run untrusted native code.
But why would you want to run untrusted native code? I believe Microsoft is setting up an entire trust infrastructure for binaries, with cryptographic signing of applications and drivers and whatnot, and that's probably for a reason.
Are you willing to take a (potentially) big performance hit just to be able to support the jail system call? Maybe sometimes that's necessary, but not always.
Erik Naggum <e...@naggum.net> writes: > They would today, since all kinds of creep can and do get access to > computers. Besides, if anyone would implement a system that invites > intrusion and viruses and malicious abuse and is completely helpless in > the face of such abuse, they would be violating a number of Microsoft > patents on how _not_ to design software.
And since the popular operating systems *do* have those memory protection features, it's not possible for a creep to get in and destroy a system, right?