
Multics Concepts For the Contemporary Computing World


John Ahlstrom

Jun 27, 2003, 3:07:47 PM

In alt.os.multics
Tom Van Vleck wrote:
>
> I don't have time now to do this topic justice, but here are a
> few remarks, from the point of view of someone who left the
> OS design team in 1981.
>
--snip snip
>
> 7. Discussion on the "kernel" point has missed a key aspect.
> Microkernel systems such as Mach work by message passing. Multics
> had a notion of a "kernel" and there was a design project to
> separate ring 0 into kernel and non-kernel, and multiple projects
> to move stuff out of ring 0, mostly never shipped. But these two
> are not the same thing: there was never any proposal to introduce
> message-passing calls into the Multics architecture. So this is
> a big choice, to be made at the very beginning. Message passing
> architectures like Mach's are great for structure, but there's a
> heavy performance penalty you pay up front, in argument
> marshaling and so on. I worked on Tandem systems, and because
> they were fundamentally message passing, they were able to expand
> to multiprocessors and clusters with ease.
-snip snip

What about architectural support for message passing?
IIRC the GEC 4080 had such support.
From: http://www.cucumber.demon.co.uk/geccl/4000series/4080sales.html

> 4000 NUCLEUS TIMES
> (microseconds, typical)
[JKA: machine had 550nsec memory cycle]
> Semaphore operations
> no program change - 4.95
> program change - 35
> Segment load
> - 7.5
> Inter-chapter branch
> no segment change - 4.6
> segment change - 9.1
> Start input/output
> - 20.0
> Interrupt
> no program change- 8.7
> program change - 42
> Inter-process message
> no program change - 35
> program change - 55
>


Does the 4080 have any successors?
Any similar support in other architectures?
Can such support change the performance penalty enough to
make message passing cost-effective?

--
I don't think average programmers would get along very well
with languages that force them to think about their design
decisions before they plunge into coding.
Brian Inglis

Stephen Fuld

Jun 27, 2003, 3:40:50 PM

"John Ahlstrom" <jahl...@cisco.com> wrote in message
news:3EFC9603...@cisco.com...

>
> In alt.os.multics
> Tom Van Vleck wrote:
> >
> > I don't have time now to do this topic justice, but here are a
> > few remarks, from the point of view of someone who left the
> > OS design team in 1981.
> >
> --snip snip
> >
> > 7. Discussion on the "kernel" point has missed a key aspect.
> > Microkernel systems such as Mach work by message passing. Multics
> > had a notion of a "kernel" and there was a design project to
> > separate ring 0 into kernel and non-kernel, and multiple projects
> > to move stuff out of ring 0, mostly never shipped. But these two
> > are not the same thing: there was never any proposal to introduce
> > message-passing calls into the Multics architecture. So this is
> > a big choice, to be made at the very beginning. Message passing
> > architectures like Mach's are great for structure, but there's a
> > heavy performance penalty you pay up front, in argument
> > marshaling and so on. I worked on Tandem systems, and because
> > they were fundamentally message passing, they were able to expand
> > to multiprocessors and clusters with ease.
> -snip snip
>
> What about architectural support for message passing?

Didn't the Elxsi "mini-super" computer have such support?

--
- Stephen Fuld
e-mail address disguised to prevent spam


Russell Williams

Jun 30, 2003, 7:01:45 PM
"Stephen Fuld" <s.f...@PleaseRemove.att.net> wrote in message
news:651La.24517$3o3.1...@bgtnsc05-news.ops.worldnet.att.net...

Elxsi implemented message passing in hardware and microcode, and
used the control of link (port in Mach terms) ownership as the
fundamental system security mechanism (along with the fact that
only the memory manager process had the hardware page tables
in its virtual address space). I/O completions showed up as messages
from controllers. It was basically a multi-server (GNU Hurd-like)
system. Message passing was at least an order of magnitude
slower than a function call (more if you sent much data by value).
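
As a rough illustration of the link-ownership idea (an invented C sketch,
not the Elxsi interface; link_table and link_send are made-up names), the
point is that every send is checked against who owns the link:

    /* Hypothetical sketch: ownership of the link is the security check. */
    #include <errno.h>
    #include <stddef.h>

    #define MAX_LINKS 64

    struct link {
        int owner_pid;      /* only this process may send on the link   */
        int receiver_pid;   /* process whose queue the message lands on */
        int in_use;
    };

    static struct link link_table[MAX_LINKS];

    /* Returns 0 on success, -EPERM if the caller does not own the link. */
    int link_send(int caller_pid, int link_id, const void *msg, size_t len)
    {
        if (link_id < 0 || link_id >= MAX_LINKS || !link_table[link_id].in_use)
            return -EINVAL;
        if (link_table[link_id].owner_pid != caller_pid)
            return -EPERM;              /* ownership is the security check */
        /* delivery (copying or mapping msg into the receiver's queue) is
           omitted -- that is the part Elxsi did in hardware and microcode */
        (void)msg; (void)len;
        return 0;
    }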

1-2 orders of magnitude is well within the bounds where reasonable
partitioning of the OS would make the cost of message passing
insignificant. (On the other hand, a couple of bad partitioning
decisions were made that made those costs painful; refactoring had
to occur). The benefit was that we got excellent scaling from 1-12
processors, including the first (AFAIK) observations of
super-linear speedup (because adding processors added cache).

The machine was strange by today's standards in other ways: 64-bit
registers and integers, but only 32-bit virtual addresses. Cobol screamed
because you could do decimal arithmetic in registers. It had the first
fast implementations of full IEEE SP/DP floating point.

The hardware based messages and multi-server structure made for
some strange effects: on a machine with lots of RAM, you could be
using the source debugger on the memory manager while other users
continued their work without pause. We had a Unix server that
accepted "system call" messages from Posix processes (again, a good
partitioning got us lots of parallelism by farming out work to other
servers without too much time spent in message passing).

A technically interesting and successful design, but both its technical and
marketing niches were closed by the advance of the killer micros. Our
big competition was high-end VAXes, at a time when VAX software was
already entrenched, and the market for that class of hardware was being
supplanted by RISC workstations.

Russell Williams
not speaking for Adobe Systems


Stephen Fuld

Jun 30, 2003, 11:32:10 PM

"Russell Williams" <williams...@adobe.com> wrote in message
news:tj3Ma.2789$Ry3.1...@monger.newsread.com...

> Stephen Fuld" <s.f...@PleaseRemove.att.net> wrote in message
> news:651La.24517$3o3.1...@bgtnsc05-news.ops.worldnet.att.net...
> >
> > "John Ahlstrom" <jahl...@cisco.com> wrote in message
> > news:3EFC9603...@cisco.com...
> > >
> > > In alt.os.multics
> > > Tom Van Vleck wrote:
> > > What about architectural support for message passing?
> >
> > Didn't the Elxsi "mini-super" computer have such support?
>
> Elxsi implemented message passing in hardware and microcode,

Rest of very good explanation snipped

Thanks. I am glad I remembered correctly, and your explanation of both the
technical and business issues was well done. I remember that it used huge
boards with ECL circuitry and big fans, and was thus unsuitable for what we
were looking for at the time, but I remember being impressed with the
thought that went into its design.

So, the obvious question is then, is there something that makes sense from
that idea to adapt into current microprocessor designs in order to give the
advantages of low cost message passing, and ease the development of more
modular software that would use it?

Cliff Sojourner

Jul 1, 2003, 12:21:24 AM
> So, the obvious question is then, is there something that makes sense from
> that idea to adapt into current microprocessor designs in order to give the
> advantages of low cost message passing, and ease the development of more
> modular software that would use it?

if it were easy to get the benefits of message passing OS then it would have
happened a long time ago.

programming a Tandem, for example, requires a very different mindset than
programming any *NIX system. by "programming" I mean "doing it properly".

also, as was pointed out earlier in this thread, not all applications can or
should pay the huge cost of message passing for the relatively minor gains
of scalability, atomicity, fault tolerance, manageability, reliability, etc.

but you're on the right track - how can we make message passing systems
attractive to "regular" applications?

tough question!


Rupert Pigott

Jul 1, 2003, 4:41:57 AM
"Cliff Sojourner" <c...@employees.org> wrote in message
news:8%7Ma.4154$Xm3.1087@sccrnsc02...

> > So, the obvious question is then, is there something that makes sense from
> > that idea to adapt into current microprocessor designs in order to give the
> > advantages of low cost message passing, and ease the development of more
> > modular software that would use it?
>
> if it were easy to get the benefits of message passing OS then it would have
> happened a long time ago.
>
> programming a Tandem, for example, requires a very different mindset than
> programming any *NIX system. by "programming" I mean "doing it properly".
>
> also, as was pointed out earlier in this thread, not all applications can or
> should pay the huge cost of message passing for the relatively minor gains
> of scalability, atomicity, fault tolerance, manageability, reliability, etc.

By 'cost' do you mean that it takes longer to communicate
via message passing than shared memory ?

I don't see why this should be so. In a NUMA system or a
message passing system for a message to get from CPU A to
CPU B it will still have to travel along a very similar
signal path. So it can't be the plumbing that slows it down

If you are talking about a locally delivered message then
perhaps it could be slower, simply because you are eating
bandwidth to make a copy (and pranging the cache to boot).

The trick seems to be to make messages cheap in the
hardware, this has been done many many times. From my
point of view keeping it simple is the way to go here,
so you start by throwing your Ethernet gear into a skip. :)

I've seen a few people smugly compare TCP over Ethernet
with custom built shared-memory pipes down the years and
make the bogus leap of intuition that message passing
is slow... I suppose to counter that they would really
need to see how shared memory performs across TCP +
Ethernet. My guess would be : considerably more shite
than message passing.

Cheers,
Rupert


Sander Vesik

Jul 1, 2003, 8:37:06 AM

Couledn't you add something like that onto a "conventional" processor?

--
Sander

+++ Out of cheese error +++

Anne & Lynn Wheeler

Jul 1, 2003, 10:47:10 AM

"Rupert Pigott" <r...@dark-try-removing-this-boong.demon.co.uk> writes:
> By 'cost' do you mean that it takes longer to communicate
> via message passing than shared memory ?
>
> I don't see why this should be so. In a NUMA system or a
> message passing system for a message to get from CPU A to
> CPU B it will still have to travel along a very similar
> signal path. So it can't be the plumbing that slows it down
>
> If you are talking about a locally delivered message then
> perhaps it could be slower, simply because you are eating
> bandwidth to make a copy (and pranging the cache to boot).
>
> The trick seems to be to make messages cheap in the
> hardware, this has been done many many times. From my
> point of view keeping it simple is the way to go here,
> so you start by throwing your Ethernet gear into a skip. :)
>
> I've seen a few people smugly compare TCP over Ethernet
> with custom built shared-memory pipes down the years and
> make the bogus leap of intuition that message passing
> is slow... I suppose to counter that they would really
> need to see how shared memory performs across TCP +
> Ethernet. My guess would be : considerably more shite
> than message passing.

SCI was an attempt to be all things to all people (mappings for
cache memory, disk protocol, etc):
http://www.scizzl.com/
http://hsi.web.cern.ch/HSI/sci/sci.html
http://www.computer.org/proceedings/lcn/1591/15910691abs.htm
http://lists.insecure.org/linux-kernel/2001/Jul/1421.html

i just ran across a hardware announcement about targeting IPv6 in OC192
environments and doing 266 million searches/second.
http://www.commsdesign.com/story/OEG20030630S0053

--
Anne & Lynn Wheeler | http://www.garlic.com/~lynn/
Internet trivia 20th anv http://www.garlic.com/~lynn/rfcietff.htm

Peter da Silva

Jul 1, 2003, 1:38:36 PM
In article <8%7Ma.4154$Xm3.1087@sccrnsc02>,

Cliff Sojourner <c...@employees.org> wrote:
> if it were easy to get the benefits of message passing OS then it would have
> happened a long time ago.

If the message passing is cheap enough (significantly less than system
call overhead in a "traditional" OS) then the message-passing system
can be faster than the traditional one. The problem with message
passing systems isn't the message passing overhead, it's that you have
to do a lot of work trying to avoid any service becoming a bottleneck.
Even on the Amiga, where it was four instructions to put a message on a
queue, the bottlenecks in the file system became a problem.
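
For what it's worth, the reason an enqueue can be that cheap is that the
message memory belongs to the caller and nothing is copied. A generic C
sketch (illustrative only, not the Amiga exec API):

    /* Intrusive message queue: enqueueing touches two pointers and
       copies nothing, which is why it can be a handful of instructions. */
    #include <stddef.h>

    struct message {
        struct message *next;   /* link field lives inside the message */
        void           *payload;
    };

    struct msgport {
        struct message *head;
        struct message *tail;
    };

    void put_msg(struct msgport *port, struct message *m)
    {
        m->next = NULL;
        if (port->tail)
            port->tail->next = m;
        else
            port->head = m;
        port->tail = m;
        /* a real port would also signal the task that owns it */
    }

    struct message *get_msg(struct msgport *port)
    {
        struct message *m = port->head;
        if (m) {
            port->head = m->next;
            if (!port->head)
                port->tail = NULL;
        }
        return m;
    }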

In a monolithic UNIX kernel this kind of thing comes for free: since
each system call automagically gets its own process context to handle
the whole operation from start to finish, you never end up blocked on
a read because some server somewhere was blocked on someone else's
request.

But now that I've mentioned the Amiga, I have to say that it did happen
a long time ago. There are unfortunately non-technical reasons why one
system or another becomes dominant or fails (for example, getting used
as a pawn in a war between Jack Tramiel and his former employers didn't
do the Amiga any good).

> but you're on the right track - how can we make message passing systems
> attractive to "regular" applications?

Message passing systems are a natural for GUI applications, and may
turn out still to be what they need. God knows there needs to be SOME
kind of fundamental paradigm shift in that environment.

--
#!/usr/bin/perl
$/="%\n";chomp(@_=<>);print$_[rand$.]

Peter da Silva, just another Perl poseur.

Geoff Lane

Jul 1, 2003, 1:49:33 PM
In alt.folklore.computers Peter da Silva <pe...@abbnm.com> wrote:
> If the message passing is cheap enough (significantly less than system
> call overhead in a "traditional" OS) then the message-passing system
> can be faster than the traditional one.

Message passing also has another advantage - it defines interfaces that
cannot be subverted. Monolithic kernels allow poor programmers to bypass
defined interfaces in the interests of "efficiency".

--
Geoff Lane

Barry Margolin

Jul 1, 2003, 1:57:05 PM
In article <3f01c9ad$0$56600$bed6...@pubnews.gradwell.net>,

On the other hand, it also traps you into using those interfaces. If you
don't get the design right, it can be difficult to work around it. Ideally
this shouldn't be a problem, but in a practical sense it often is.

--
Barry Margolin, barry.m...@level3.com
Level(3), Woburn, MA
*** DON'T SEND TECHNICAL QUESTIONS DIRECTLY TO ME, post them to newsgroups.
Please DON'T copy followups to me -- I'll assume it wasn't posted to the group.

Tom Van Vleck

Jul 1, 2003, 2:54:04 PM
"Rupert Pigott" wrote:
> I don't see why this should be so. In a NUMA system or a
> message passing system for a message to get from CPU A to
> CPU B it will still have to travel along a very similar
> signal path. So it can't be the plumbing that slows it down
>
> If you are talking about a locally delivered message then
> perhaps it could be slower, simply because you are eating
> bandwidth to make a copy (and pranging the cache to boot).
>
> The trick seems to be to make messages cheap in the
> hardware, this has been done many many times.

One cost of message based systems is making copies of
things. To make a message passing call, one has to at
minimum determine the size of the arguments, allocate a
message object, marshal the arguments into it, queue and
dequeue the message, and free the message object. If the
calling site and the called site do not share memory, than
additional copying and buffering is necessary. The storage
for the copies is either preallocated and mostly idle, or
is allocated and freed from a pool of storage, at the cost
of additional complexity; in either case it adds to memory
pressure.

Another cost is synchronization. Each allocation, freeing,
queueing, or dequeueing operation needs atomicity; whether
hidden in the hardware or done explicitly in software, this
synchronization requires some cost even if there is never a
conflict that causes one thread to delay.

My experience with message passing systems is that they
start out penalized by a factor of about two compared to
direct call systems, and that by employing many clever
strategies, can make up about half the deficit after years
of improvement. Sometimes the elegance, uniformity, and
protection provided by the message passing design is worth
it.
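
For the record, the per-call work being described looks roughly like this
in C (an illustrative sketch, not any particular system's code); every
line of it is something a direct procedure call simply does not do:

    #include <pthread.h>
    #include <stdlib.h>
    #include <string.h>

    struct msg {
        struct msg   *next;
        size_t        len;
        unsigned char body[];          /* marshalled arguments follow */
    };

    struct queue {
        pthread_mutex_t lock;
        struct msg     *head, *tail;
    };

    int send_copying(struct queue *q, const void *args, size_t len)
    {
        struct msg *m = malloc(sizeof *m + len);    /* allocation      */
        if (!m)
            return -1;
        m->next = NULL;
        m->len  = len;
        memcpy(m->body, args, len);                 /* marshalling     */

        pthread_mutex_lock(&q->lock);               /* synchronization */
        if (q->tail)
            q->tail->next = m;
        else
            q->head = m;
        q->tail = m;
        pthread_mutex_unlock(&q->lock);
        return 0;    /* the receiver dequeues, unmarshals, and frees m */
    }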

Stephen Fuld

Jul 1, 2003, 3:01:34 PM

"Sander Vesik" <san...@haldjas.folklore.ee> wrote in message
news:10570630...@haldjas.folklore.ee...

> In comp.arch Stephen Fuld <s.f...@pleaseremove.att.net> wrote:

snip

> > So, the obvious question is then, is there something that makes sense from
> > that idea to adapt into current microprocessor designs in order to give the
> > advantages of low cost message passing, and ease the development of more
> > modular software that would use it?
> >
>
> Couledn't you add something like that onto a "conventional" processor?

I think you essentially restated part of my question. In the Elxsi, Russell
pointed out that it needed both hardware and microcode. Now microcode is
passe on most current "conventional" processors, so you have to figure
something else out. In order to cross domains, you probably have to do some
fiddling with page tables or something. You want to avoid the overhead of a
full system call if possible. ISTM that there are some issues here to
resolve that may make it not worth while. Hence my question, and the second
part, which is, assuming that you had cheap message passing, what would it
take for much software to take advantage of it?

Pete Fenelon

Jul 1, 2003, 3:08:26 PM
In alt.folklore.computers Rupert Pigott <r...@dark-try-removing-this-boong.demon.co.uk> wrote:
>
> The trick seems to be to make messages cheap in the
> hardware, this has been done many many times. From my
> point of view keeping it simple is the way to go here,
> so you start by throwing your Ethernet gear into a skip. :)
>


Back when I was a youngster slaving away for the military-industrial
complex I did a fair bit of work on Multibus 2 crates for sonar
systems that had quite an amusing little Message Passing Coprocessor
on them (the Intel 82389 rings a bell). It made inter-processor comms on
Multibus about as easy as DMA between a CPU and a peripheral. Nice, even
though Multibus 2 wasn't particularly elegant itself.

pete
--
pe...@fenelon.com "there's no room for enigmas in built-up areas" HMHB

Peter da Silva

Jul 1, 2003, 2:57:40 PM
In article <3f01c9ad$0$56600$bed6...@pubnews.gradwell.net>,
Geoff Lane <zza...@buffy.sighup.org.uk> wrote:

Not that any OS has ever moved a component into the kernel to do the same
thing. :)

Peter da Silva

Jul 1, 2003, 2:59:44 PM
In article <RXjMa.11$5P2...@paloalto-snr1.gtei.net>,

Barry Margolin <barry.m...@level3.com> wrote:
> In article <3f01c9ad$0$56600$bed6...@pubnews.gradwell.net>,
> Geoff Lane <zza...@buffy.sighup.org.uk> wrote:
> >In alt.folklore.computers Peter da Silva <pe...@abbnm.com> wrote:
> >> If the message passing is cheap enough (significantly less than system
> >> call overhead in a "traditional" OS) then the message-passing system
> >> can be faster than the traditional one.

> >Message passing also has another advantage - it defines interfaces that
> >cannot be subverted. Monolithic kernels allow poor programmers to bypass
> >defined interfaces in the interests of "effiency"

> On the other hand, it also traps you into using those interfaces. If you
> don't get the design right, it can be difficult to work around it. Ideally
> this shouldn't be a problem, but in a practical sense it often is.

No more than any other formalised interface does. If you need to redesign
to get rid of a poorly chosen interface, then it's probably best to be faced
with it up front than to have a new interface grow organically as components
start bypassing it.

Peter da Silva

Jul 1, 2003, 3:15:55 PM
In article <thvv-7F26CE.1...@news.comcast.giganews.com>,

Tom Van Vleck <th...@multicians.org> wrote:
> One cost of message based systems is making copies of
> things.

You can use techniques similar to the ones used to cut down or even
eliminate copies in network stacks. All objects, all objects over a certain
size, or all objects designated as "fast copy" are mapped rather than
copied... and may even be allocated out of a shared memory area to cut
down on the amount of page table rearrangement needed. You just need
to agree that the sending component doesn't access the object after
it's sent.
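
In other words, the "send" becomes an ownership transfer. A toy C sketch of
the convention (malloc stands in for buffers carved out of a shared mapping;
all names are made up):

    #include <stdlib.h>
    #include <string.h>

    struct buf {
        size_t len;
        char   data[4096];
    };

    /* in a real system this would come from a shared/mapped region */
    struct buf *buf_alloc(void) { return malloc(sizeof(struct buf)); }

    /* "send" passes only the pointer; the sender must not touch the
       buffer afterwards -- that is the whole agreement */
    void send_by_handoff(struct buf **receiver_slot, struct buf *b)
    {
        *receiver_slot = b;
    }

    int main(void)
    {
        struct buf *inbox = NULL;
        struct buf *b = buf_alloc();
        if (!b) return 1;
        b->len = 5;
        memcpy(b->data, "hello", b->len);
        send_by_handoff(&inbox, b);
        /* b now belongs to the receiver, which frees it when done */
        free(inbox);
        return 0;
    }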

> My experience with message passing systems is that they
> start out by penalized a factor of about two compared to
> direct call systems, and that by employing many clever
> strategies, can make up about half the deficit after years
> of improvement.

My experience is with one particular system where message passing was only a few
times slower than a subroutine call. Also, all messages were queued, so
rather than making a system call (which meant a context switch), and
then another, and another, a program sends multiple messages and only
then enters a wait, and only then do you hit the context switch.

This is similar to what X11 does in bundling multiple operations in one
message, but it applies to all the concurrent operations performed by
one component... so after initialization (which tends to be serialised)
it may be making more "system calls" but only a fraction of them actually
involve a context switch.

It ran into serialization problems, mostly due to components that didn't
keep multiple messages in flight but instead ran each to completion before
attending to the next.
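
The batching pattern, reduced to a toy C example (send_async and
wait_replies are stand-ins for whatever the real transport provides):

    #include <stdio.h>

    static int pending;

    static void send_async(int req)     /* queues locally, does not block */
    {
        pending++;
        printf("queued request %d\n", req);
    }

    static void wait_replies(void)      /* the single blocking point */
    {
        printf("blocked once for %d replies\n", pending);
        pending = 0;
    }

    int main(void)
    {
        /* several requests, one context switch, instead of one per call */
        for (int req = 0; req < 4; req++)
            send_async(req);
        wait_replies();
        return 0;
    }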

Rupert Pigott

Jul 1, 2003, 5:27:21 PM

"Tom Van Vleck" <th...@multicians.org> wrote in message
news:thvv-7F26CE.1...@news.comcast.giganews.com...

> "Rupert Pigott" wrote:
> > I don't see why this should be so. In a NUMA system or a
> > message passing system for a message to get from CPU A to
> > CPU B it will still have to travel along a very similar
> > signal path. So it can't be the plumbing that slows it down
> >
> > If you are talking about a locally delivered message then
> > perhaps it could be slower, simply because you are eating
> > bandwidth to make a copy (and pranging the cache to boot).
> >
> > The trick seems to be to make messages cheap in the
> > hardware, this has been done many many times.
>
> One cost of message based systems is making copies of
> things. To make a message passing call, one has to at
> minimum determine the size of the arguments, allocate a
> message object, marshal the arguments into it, queue and
> dequeue the message, and free the message object. If the

That's not quite true. Allocate a receiving buffer (once)
for each "channel", not for each "message". In OCCAM this
was frequently done statically (no run time cost).

Also in a NUMA or SMP machine with cache-coherency the
copying and locking in effect perform the same kind of
interactions that message passing does. On a more
dogmatic day I'd assert that it is *precisely* the same
interaction but with different semantics presented to
the code.

> calling site and the called site do not share memory, than
> additional copying and buffering is necessary. The storage
> for the copies is either preallocated and mostly idle, or
> is allocated and freed from a pool of storage, at the cost
> of additional complexity; in either case it adds to memory
> pressure.

Not a huge concern in recent years judging by the
bloatage that applications have exhibited.

> Another cost is synchronization. Each allocation, freeing,
> queueing, or dequeueing operation needs atomicity; whether
> hidden in the hardware or done explicitly in software, this
> synchronization requires some cost even if there is never a
> conflict that causes one thread to delay.

This is required for Shared Memory too. The minimum
synchronisation and data transfer requirements will remain
the same at the application level in a modern SMP with
local cache for each CPU (unless the shared memory is
accessed directly word by word - slow).

I come from a CSP background which is more like ADA's
Rendezvous* mechanisms than this queued nonsense sitting
on top of heavyweight transports... If I wanted queueing
I implemented it in a tiny little process that sat
between the clients and the server.

> My experience with message passing systems is that they
> start out by penalized a factor of about two compared to
> direct call systems, and that by employing many clever
> strategies, can make up about half the deficit after years
> of improvement. Sometimes the elegance, uniformity, and
> protection provided by the message passing design is worth
> it.

In which case I'd guess that you were using relatively
heavyweight message passing compared to the form I am
most familiar with (and would advocate as the One True
Programming Paradigm...).

* = Please tell me ADA's Rendezvous doesn't do queueing...

Cheers,
Rupert


Rupert Pigott

Jul 1, 2003, 5:31:25 PM

"Peter da Silva" <pe...@abbnm.com> wrote in message
news:bdsln0$292k$6...@jeeves.eng.abbnm.com...

Or you could do what many kludgers do : Add another interface and
botch the internals to fit.

Cheers,
Rupert


Rupert Pigott

Jul 1, 2003, 5:32:36 PM

"Peter da Silva" <pe...@abbnm.com> wrote in message
news:bdslj4$292k$5...@jeeves.eng.abbnm.com...

> In article <3f01c9ad$0$56600$bed6...@pubnews.gradwell.net>,
> Geoff Lane <zza...@buffy.sighup.org.uk> wrote:
> > In alt.folklore.computers Peter da Silva <pe...@abbnm.com> wrote:
> > > If the message passing is cheap enough (significantly less than system
> > > call overhead in a "traditional" OS) then the message-passing system
> > > can be faster than the traditional one.
>
> > Message passing also has another advantage - it defines interfaces that
> > cannot be subverted. Monolithic kernels allow poor programmers to bypass
> > defined interfaces in the interests of "effiency"
>
> Not that any OS has ever moved a component into the kernel to do the same
> thing. :)

God forbid that you put NFS servers, HTTP servers, and GUIs into
kernel ! That would be lunacy ! Who would do such a thing ? :)

Cheers,
Rupert


Chris Hedley

Jul 1, 2003, 5:42:40 PM
According to Rupert Pigott <r...@dark-try-removing-this-boong.demon.co.uk>:

> > Not that any OS has ever moved a component into the kernel to do the same
> > thing. :)
>
> God forbid that you put NFS servers, HTTP servers, and GUIs into
> kernel ! That would be lunacy ! Who would do such a thing ? :)

Some people could jump to the conclusion that MVT's memory scheme
is still state of the art...

Chris.
--
"If the world was an orange it would be like much too small, y'know?" Neil, '84
Currently playing: random early '80s radio stuff
http://www.chrishedley.com - assorted stuff, inc my genealogy. Gan canny!

Nick Maclaren

Jul 1, 2003, 6:12:46 PM
In article <10570948...@saucer.planet.gong>,

Rupert Pigott <r...@dark-try-removing-this-boong.demon.co.uk> wrote:
>
>"Tom Van Vleck" <th...@multicians.org> wrote in message
>news:thvv-7F26CE.1...@news.comcast.giganews.com...
>> "Rupert Pigott" wrote:
>> > I don't see why this should be so. In a NUMA system or a
>> > message passing system for a message to get from CPU A to
>> > CPU B it will still have to travel along a very similar
>> > signal path. So it can't be the plumbing that slows it down
>> >
>> > If you are talking about a locally delivered message then
>> > perhaps it could be slower, simply because you are eating
>> > bandwidth to make a copy (and pranging the cache to boot).
>> >
>> > The trick seems to be to make messages cheap in the
>> > hardware, this has been done many many times.
>>
>> One cost of message based systems is making copies of
>> things. To make a message passing call, one has to at
>> minimum determine the size of the arguments, allocate a
>> message object, marshal the arguments into it, queue and
>> dequeue the message, and free the message object. If the
>
>That's not quite true. Allocate a receiving buffer (once)
>for each "channel", not for each "message". In OCCAM this
>was frequently done statically (no run time cost).

It's not even remotely true. Both Rupert Pigott and you are talking
about techniques that were widely known a long time back, and which
work well.

>Also in a NUMA or SMP machine with cache-coherency the
>copying and locking in effect perform the same kind of
>interactions that message passing does. On a more
>dogmatic day I'd assert that it is *precisely* the same
>interaction but with different semantics presented to
>the code.

I think that I agree with you. The only case I know of where shared
memory 'scores' is when transferring data from the middle of (say)
one stack to the middle of another. Shared memory can do that with
one copy; message passing sometimes (but not always) needs two.

>> calling site and the called site do not share memory, than
>> additional copying and buffering is necessary. The storage
>> for the copies is either preallocated and mostly idle, or
>> is allocated and freed from a pool of storage, at the cost
>> of additional complexity; in either case it adds to memory
>> pressure.
>
>Not a huge concern in recent years judging by the
>bloatage that applications have exhibited.

If the buffer management is competent, it isn't a major problem
except for massive processor counts. And only SGI produces a shared
memory system with above c. 100 CPUs.

>> Another cost is synchronization. Each allocation, freeing,
>> queueing, or dequeueing operation needs atomicity; whether
>> hidden in the hardware or done explicitly in software, this
>> synchronization requires some cost even if there is never a
>> conflict that causes one thread to delay.
>
>This is required for Shared Memory too. The minimum
>synchronisation and data transfer requirements will remain
>the same at the application level in a modern SMP with
>local cache for each CPU (unless the shared memory is
>accessed directly word by word - slow).

What is often worse is that shared memory interfaces often don't
provide decent synchronised transfers, and so you have to use
inappropriate ones (e.g. barriers).

>I come from a CSP background which is more like ADA's
>Rendezvous* mechanisms than this queued nonsense sitting
>on top of heavyweight transports... If I wanted queueing
>I implemented it in a tiny little process that sat
>between the clients and the server.

Heavyweight AND MISDESIGNED transports. N copies, each stage
done synchronously and no decent diagnostics.

>> My experience with message passing systems is that they
>> start out by penalized a factor of about two compared to
>> direct call systems, and that by employing many clever
>> strategies, can make up about half the deficit after years
>> of improvement. Sometimes the elegance, uniformity, and
>> protection provided by the message passing design is worth
>> it.
>
>In which case I'd guess that you were using relatively
>heavyweight message passing compared to the form I am
>most familiar with (and would advocate as the One True
>Programming Paradigm...).

Now, there I differ .... It is One of The True Programming Paradigms,
but I am a heretic (from whatever viewpoint) :-)


Regards,
Nick Maclaren.

Sander Vesik

Jul 1, 2003, 6:10:37 PM
In comp.arch Stephen Fuld <s.f...@pleaseremove.att.net> wrote:
>
> "Sander Vesik" <san...@haldjas.folklore.ee> wrote in message
> news:10570630...@haldjas.folklore.ee...
>> In comp.arch Stephen Fuld <s.f...@pleaseremove.att.net> wrote:
>
> snip
>
>> > So, the obvious question is then, is there something that makes sense from
>> > that idea to adapt into current microprocessor designs in order to give the
>> > advantages of low cost message passing, and ease the development of more
>> > modular software that would use it?
>> >
>>
>> Couledn't you add something like that onto a "conventional" processor?
>
> I think you essentially restated part of my question. In the Elxsi, Russell
> pointed out that it needed both hardware and microcode. Now microcode is

yes - in a shorter (and i'm afraid, infinititely worse spelled) version. By the
time I reached the end I had forgotten all about text before the description.

> passe on most current "conventional" processors, so you have to figure
> something else out. In order to cross domains, you probably have to do some
> fiddling with page tables or something. You want to avoid the overhead of a

Instead of microcode, one might use a special operating mode / exception level
and support instructions. such a mode could use alternate regs, have access to
data using more than one asid and so on. with some input checking in hardware
it could be both fast and RISCy.

> full system call if possible. ISTM that there are some issues here to
> resolve that may make it not worth while. Hence my question, and the second
> part, which is, assuming that you had cheap message passing, what would it
> take for much software to take advantage of it?
>

Hmmm... dependning on how ingrained their present message passing interfaces
and implementations are, mach or some of the newer microkernels might be portable
to such? Couldn't you as a first step eliminate some of their present inefficency
and then extend to achieve more performance?

Stephen Fuld

Jul 2, 2003, 1:01:43 AM

"Sander Vesik" <san...@haldjas.folklore.ee> wrote in message
news:10570974...@haldjas.folklore.ee...

> In comp.arch Stephen Fuld <s.f...@pleaseremove.att.net> wrote:

snip

> Instead of microcode, one might use a special operating mode / exception level
> and support instructions. such a mode could use alternate regs, have access to
> data using more than one asid and so on. with some input checking in hardware
> it could be both fast and RISCy.

Yes, I think you could use something like that. I guess I was looking for a
variety of potential solutions with some analysis of what fits the best, is
most efficient, is easiest to use, etc. You have indeed provided the
outline for one such method. Would the lower numbered rings (but still >0)
be sufficient, or do we need another mode?

> > full system call if possible. ISTM that there are some issues here to
> > resolve that may make it not worth while. Hence my question, and the second
> > part, which is, assuming that you had cheap message passing, what would it
> > take for much software to take advantage of it?
> >
>
> Hmmm... dependning on how ingrained their present message passing interfaces
> and implementations are, mach or some of the newer microkernels might be portable
> to such? Couldn't you as a first step eliminate some of their present inefficency
> and then extend to achieve more performance?

I think so. And you could have a compatibility "trap" routine that took
what are now kernel calls and turned them into the appropriate messages.
Eventually, code could migrate toward the native interfaces for
increased performance and perhaps functionality.
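
Something like this, say (a hand-waving C sketch; msg_send, msg_wait and
FILE_SERVER are invented stand-ins for the cheap primitives, not a real API):

    #include <stddef.h>
    #include <string.h>

    enum op { OP_READ = 1 };

    struct request { enum op op; int fd; size_t len; };
    struct reply   { long result; };

    #define FILE_SERVER 1

    /* stand-ins for the cheap hardware-assisted primitives */
    static void msg_send(int server, const void *req, size_t len)
    { (void)server; (void)req; (void)len; }
    static void msg_wait(int server, void *rep, size_t len)
    { (void)server; memset(rep, 0, len); }

    /* the compatibility shim: an old-style kernel call re-expressed as
       a request message plus a blocking wait for the reply */
    long compat_read(int fd, void *buf, size_t len)
    {
        struct request rq = { OP_READ, fd, len };
        struct reply   rp;

        msg_send(FILE_SERVER, &rq, sizeof rq);
        msg_wait(FILE_SERVER, &rp, sizeof rp);
        /* the data itself would arrive via a mapped buffer or a
           second message; omitted here */
        (void)buf;
        return rp.result;
    }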

James Cownie

Jul 2, 2003, 4:08:49 AM
Rupert Pigott wrote:

>
> That's not quite true. Allocate a receiving buffer (once)
> for each "channel", not for each "message". In OCCAM this
> was frequently done statically (no run time cost).
>

I don't think so. In Occam all communication is synchronised,
so there is no need for _any_ receive buffer. Data can _always_
be transferred directly into the user's target variable.

Similarly on send, data can always be transferred directly
from the user's variable. (Or constant or ?, of course).
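
A rendezvous channel of that kind is easy to sketch in C with pthreads
(illustrative only, one sender and one receiver per channel assumed, and
obviously not the Occam/transputer implementation): the receiver advertises
its target variable, the sender copies straight into it, and neither side
needs a channel-owned buffer.

    #include <pthread.h>
    #include <string.h>

    struct channel {
        /* initialise lock/cv with PTHREAD_MUTEX_INITIALIZER and
           PTHREAD_COND_INITIALIZER (or pthread_*_init) */
        pthread_mutex_t lock;
        pthread_cond_t  cv;
        void  *dest;        /* receiver's target variable, when waiting */
        size_t dest_len;
        int    done;
    };

    void chan_recv(struct channel *c, void *dst, size_t len)
    {
        pthread_mutex_lock(&c->lock);
        c->dest = dst;                  /* advertise where the data goes */
        c->dest_len = len;
        c->done = 0;
        pthread_cond_signal(&c->cv);    /* wake a waiting sender, if any */
        while (!c->done)
            pthread_cond_wait(&c->cv, &c->lock);
        c->dest = NULL;
        pthread_mutex_unlock(&c->lock);
    }

    void chan_send(struct channel *c, const void *src, size_t len)
    {
        pthread_mutex_lock(&c->lock);
        while (c->dest == NULL)         /* rendezvous: wait for receiver */
            pthread_cond_wait(&c->cv, &c->lock);
        memcpy(c->dest, src, len < c->dest_len ? len : c->dest_len);
        c->done = 1;
        pthread_cond_signal(&c->cv);
        pthread_mutex_unlock(&c->lock);
    }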

--
-- Jim

James Cownie <jco...@etnus.com>
Etnus, LLC. +44 117 9071438
http://www.etnus.com

Pete Fenelon

Jul 2, 2003, 5:10:29 AM
In alt.folklore.computers Rupert Pigott <r...@dark-try-removing-this-boong.demon.co.uk> wrote:
> God forbid that you put NFS servers, HTTP servers, and GUIs into
> kernel ! That would be lunacy ! Who would do such a thing ? :)
>

Thinking of no open-source OS in particular.... the script kiddies who
hack the Linux kernel have managed 2 out of 3 ;) Fortunately they're
optional ;)

I don't think I've seen an in-kernel GUI on any Unix system since
Whitechapel MG1s, but I'm sure someone could prove me wrong ;)

Morten Reistad

Jul 2, 2003, 6:41:24 AM
In article <vg58c5l...@corp.supernews.com>,

Pete Fenelon <pe...@fenelon.com> wrote:
>In alt.folklore.computers Rupert Pigott <r...@dark-try-removing-this-boong.demon.co.uk> wrote:
>> God forbid that you put NFS servers, HTTP servers, and GUIs into
>> kernel ! That would be lunacy ! Who would do such a thing ? :)
>>
>
>Thinking of no open-source OS in particular.... the script kiddies who
>hack the Linux kernel have managed 2 out of 3 ;) Fortunately they're
>optional ;)

The Linux people have the nfs server still in user mode last I saw.
The BSD has had the nfs server tightly connected to the rest of the fs
code, and even if it is a separate process, it still is executing
kernel code at a high privilege level.

<rant>
Why do the file systems have to be so tightly integrated in the "ring0"
core? This is one subsystem that screams for standard callouts and
"ring1" level.
</rant off>


>I don't think I've seen an in-kernel GUI on any Unix system since
>Whitechapel MG1s, but I'm sure someone could prove me wrong ;)

GUI's, no; unless you count the fancy tty screen drivers.

Pete Fenelon

Jul 2, 2003, 8:07:08 AM
In alt.folklore.computers Morten Reistad <m...@reistad.priv.no> wrote:
> In article <vg58c5l...@corp.supernews.com>,
> Pete Fenelon <pe...@fenelon.com> wrote:
>>In alt.folklore.computers Rupert Pigott <r...@dark-try-removing-this-boong.demon.co.uk> wrote:
>>> God forbid that you put NFS servers, HTTP servers, and GUIs into
>>> kernel ! That would be lunacy ! Who would do such a thing ? :)
>>>
>>
>>Thinking of no open-source OS in particular.... the script kiddies who
>>hack the Linux kernel have managed 2 out of 3 ;) Fortunately they're
>>optional ;)
>
> The Linux people have the nfs server still in user mode last I saw.
> The BSD has had the nfs server tightly connected to the rest of the fs
> code, and even if it is a separate process, it still is executing
> kernel code in a high privilige level.


AFAIR, acting as an NFS server under Linux doesn't need kernel
support (but can use optional kernel-side support). Acting as an
NFS client requires kernel support (to wire NFS into the supported
set of filesystems).

>
> <rant>
> Why do the file systems have to be so tightly integrated in the "ring0"
> core? This is one subsystem that screams for standard callouts and
> "ring1" level.
> </rant off>

Agreed.

>
>
>>I don't think I've seen an in-kernel GUI on any Unix system since
>>Whitechapel MG1s, but I'm sure someone could prove me wrong ;)
>
> GUI's, no; unless you count the fancy tty screen drivers.

--

Holger Veit

Jul 2, 2003, 8:28:28 AM
Pete Fenelon <pe...@fenelon.com> wrote:
> In alt.folklore.computers Morten Reistad <m...@reistad.priv.no> wrote:
[...]

>
> AFAIR, Acting as an NFS server under Linux doesn't need kernel
> support (but can use optional kernel-side support). Acting as an
> NFS client requires kernel support (to wire NFS into the supported
> set of filesystems.)
>
>>
>> <rant>
>> Why do the file systems have to be so tightly integrated in the "ring0"
>> core? This is one subsystem that screams for standard callouts and
>> "ring1" level.
>> </rant off>
>
> Agreed.

Seconded. The problem is that the old VAX days are meanwhile gone - the days
that introduced the several privilege rings which the 386 copied rather
precisely. With the exception of OS/2 and older WinNT, it seems no
modern OS has actually used the feature of multiple privilege levels
at all, beyond the common distinction of "supervisor" (or "kernel") and
"user" modes. Earlier (like the 68000) and later processors (PPC, MIPS, etc.)
only have those two levels anyway. I.e. the knowledge of layered privileges
seems to be gone and lost - it is now just an "everything" or "nothing"
difference, which makes such systems rather vulnerable. Ring 1 for file
systems or hicore parts of drivers would be appropriate - but then, as M$
demonstrated by destroying the rather clean concept of WinNT, there are
performance issues due to lousy application code that "forces" the OS
writers to circumvent such clean ring callouts by throwing the whole
garbage into ring 0. Or, as in Linux, there is no explicit driver API
at all (like NT's HAL or OS/2's DevHlp); every driver can mess up everything
else in kernel mode without being prevented. Ideas like microkernels have
been beaten to death by Mach crap that didn't build the kernel up from the
ground up, but made the nonsense attempt to strip a monolithic kernel and
move parts into user mode without first defining where the ring border
is supposed to be. Needless to say, the result was catastrophic.

Holger

Barry Margolin

Jul 2, 2003, 11:02:13 AM
In article <kscudb.krn1.ln@acer>, Morten Reistad <m...@reistad.priv.no> wrote:
>In article <vg58c5l...@corp.supernews.com>,
>Pete Fenelon <pe...@fenelon.com> wrote:
>>In alt.folklore.computers Rupert Pigott
><r...@dark-try-removing-this-boong.demon.co.uk> wrote:
>>> God forbid that you put NFS servers, HTTP servers, and GUIs into
>>> kernel ! That would be lunacy ! Who would do such a thing ? :)
>>>
>>
>>Thinking of no open-source OS in particular.... the script kiddies who
>>hack the Linux kernel have managed 2 out of 3 ;) Fortunately they're
>>optional ;)
>
>The Linux people have the nfs server still in user mode last I saw.

And I frequently hear complaints about how poor Linux's NFS support is.
Coincidence?

Unfortunately, the design of NFS practically screams for kernel
implementation. Most file system APIs implement an
open/do-lots-of-operations/close model of file access. NFS doesn't have
open or close operations; each request identifies the file using an opaque
"handle", and the file handle maps most naturally into Unix's inode model;
when implementing NFS servers on other operating systems, it's often
necessary to design kludges to support its file handles. Since the
standard API only deals with accessing files by name, not inode, it's
necessary to put the server in the kernel to get past the name requirement
(user-mode servers typically have to have the same kinds of kludges as
non-Unix implementations, or you need to add system calls that allow
by-inode access).
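
The usual kludge in a user-mode server looks something like the following
(a schematic C sketch, not any real nfsd): keep a private table that maps
the opaque handle back to something the name-based API can use.

    #include <string.h>

    #define MAX_FILES  256
    #define HANDLE_LEN  32

    struct handle_map {
        unsigned char handle[HANDLE_LEN]; /* opaque cookie given to clients */
        char          path[1024];         /* what a name-based API needs    */
        int           used;
    };

    static struct handle_map table[MAX_FILES];

    /* resolve a handle back to a pathname so the request can be served
       with ordinary open()/read(); NULL means a stale handle (ESTALE) */
    const char *path_for_handle(const unsigned char *h)
    {
        for (int i = 0; i < MAX_FILES; i++)
            if (table[i].used && memcmp(table[i].handle, h, HANDLE_LEN) == 0)
                return table[i].path;
        return NULL;
    }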

Shmuel (Seymour J.) Metz

Jul 2, 2003, 10:22:32 AM
In <g8vsdb...@teabag.cbhnet>, on 07/01/2003

at 10:42 PM, c...@ieya.co.REMOVE_THIS.uk (Chris Hedley) said:

>Some people could jump to the conclusion that MVT's memory scheme is
>still state of the art...

Even MVT had storage protection; Supervisor didn't automatically give
you key 0. So moving graphics into the kernel on an IA-32 is even
worse than what MVT had. A company that would do a thing like that
would be capable of anything, even allowing users to include code in
e-mail that the company's software would automatically execute on
receipt of the e-mail. We all know that no one could be that stupid.

--
Shmuel (Seymour J.) Metz, SysProg and JOAT

Any unsolicited bulk E-mail will be subject to legal action. I reserve the
right to publicly post or ridicule any abusive E-mail.

Reply to domain Patriot dot net user shmuel+news to contact me. Do not reply
to spam...@library.lspace.org


Rupert Pigott

Jul 2, 2003, 11:46:13 AM
"James Cownie" <jco...@etnus.com> wrote in message
news:lqwMa.866$Wp.61...@news-text.cableinet.net...

> Rupert Pigott wrote:
>
> >
> > That's not quite true. Allocate a receiving buffer (once)
> > for each "channel", not for each "message". In OCCAM this
> > was frequently done statically (no run time cost).
> >
>
> I don't think so. In Occam all communication is synchronised,
> so there is no need for _any_ receive buffer. Data can _always_
> be transferred directly into the user's target variable.
>
> Similarly on send, data can always be transferred directly
> from the user's variable. (Or constant or ?, of course).

Buffer == user variable.

The confusion is arising because I'm trying to be
language agnostic here, just looking at the basics
of sending bits across systems and synchronisation. :)

Cheers,
Rupert


Peter Ibbotson

Jul 2, 2003, 11:52:49 AM
"Pete Fenelon" <pe...@fenelon.com> wrote in message
news:vg58c5l...@corp.supernews.com...

> In alt.folklore.computers Rupert Pigott
<r...@dark-try-removing-this-boong.demon.co.uk> wrote:
> > God forbid that you put NFS servers, HTTP servers, and GUIs into
> > kernel ! That would be lunacy ! Who would do such a thing ? :)
> >
>
> Thinking of no open-source OS in particular.... the script kiddies who
> hack the Linux kernel have managed 2 out of 3 ;) Fortunately they're
> optional ;)
>
> I don't think I've seen an in-kernel GUI on any Unix system since
> Whitechapel MG1s, but I'm sure someone could prove me wrong ;)


I learned C on the MG1; I don't remember the GUI as being in the kernel, but
then at the time I'm not sure I'd have spotted the distinction. I always
liked the idea of a separate mouse co-processor, and of all windows having
their contents stored in a raster rather than having to repaint when they
overlapped. Are there technical documents on the web anywhere?

--
Work pet...@lakeview.co.uk.plugh.org | remove magic word .org to reply
Home pe...@ibbotson.co.uk.plugh.org | I own the domain but theres no MX


George Coulouris

Jul 2, 2003, 1:29:21 PM
In article <10570948...@saucer.planet.gong>, Rupert Pigott wrote:
[snip]

> Also in a NUMA or SMP machine with cache-coherency the
> copying and locking in effect perform the same kind of
> interactions that message passing does. On a more
> dogmatic day I'd assert that it is *precisely* the same
> interaction but with different semantics presented to
> the code.

This reminds me of field theories vs. gluon exchange..

--
george coulouris
not speaking for ncbi
remove 's' from my email address to reply

Thomas

Jul 2, 2003, 3:45:30 PM
Morten Reistad wrote:

> The Linux people have the nfs server still in user mode last I saw.

Debian offers a choice: user or kernel mode NFS server.

User mode has a real nfsd running handling requests.

Kernel mode also starts an nfsd that in turn starts kernel threads, which are
part of the kernel, just like the scheduler and interrupt handlers are part of
the kernel.


Thomas

Eric Lee Green

Jul 2, 2003, 4:40:50 PM
Morten Reistad wrote:
> In article <vg58c5l...@corp.supernews.com>,
> Pete Fenelon <pe...@fenelon.com> wrote:
>>In alt.folklore.computers Rupert Pigott
>><r...@dark-try-removing-this-boong.demon.co.uk> wrote:
>>> God forbid that you put NFS servers, HTTP servers, and GUIs into
>>> kernel ! That would be lunacy ! Who would do such a thing ? :)
>>>
>>
>>Thinking of no open-source OS in particular.... the script kiddies who
>>hack the Linux kernel have managed 2 out of 3 ;) Fortunately they're
>>optional ;)
>
> The Linux people have the nfs server still in user mode last I saw.

Which apparently was over two years ago. The standard NFS server in Linux has
been the kernel one since the release of the Linux 2.4 operating system
kernel. The user mode NFS server is still available, but is unsupported and
only implements NFS V2, whereas the kernel mode NFS server implements NFS V3.
The NFS V4 reference implementation for Linux is also a kernel-mode server.

> <rant>
> Why do the file systems have to be so tightly integrated in the "ring0"
> core? This is one subsystem that screams for standard callouts and
> "ring1" level.
> </rant off>

Performance. Device drivers reading straight into filesystem buffers is
difficult to achieve in userland. You end up having to modify your VM to be
able to lock pages in memory, then have to go through the overhead of managing
said locking before you do I/O. Much easier to just have both operating in
kernal-land using the normal kernel memory page allocation mechanisms of the
OS in question.

That said, on my 2.4 GHz laptop, I could do reads through LUFS (Linux Userland
FileSystem) at 100MByte/sec, much faster than the hard drive can spin, so
performance is not as big an issue as it was "back in the day". I was using
over 20% of the CPU to do the reads, though, as vs. under 5% of the CPU for
the native kernel-mode filesystem. The main reason for the high CPU usage was
all the data copying between usermode and kernel-land needed. For example,
read() turns into: kernel call, VFS layer (may have cached data), pop back up
to userland, kernel call to device driver, copy result back up to userland,
copy result back down to kernel-land, pop back up to the user with a copy of
the data. Ouch. I can think of ways to speed this up, but none that will make
it as fast as the tightly-integrated kernel-land implementation.

> GUI's, no; unless you count the fancy tty screen drivers.

Hmm, the Linux 'fbcon' screen drivers ALMOST count as a native GUI, so I guess
the Linux geeks have achieved 2.5 out of 3 on the scale of atrocities :-).

--
Eric Lee Green mailto:er...@badtux.org
Unix/Linux/Storage Software Engineer needs job --
see http://badtux.org for resume



Chris Hedley

Jul 2, 2003, 4:38:01 PM
According to Shmuel (Seymour J.) Metz <spam...@library.lspace.org.invalid>:

> Even MVT had storage protection; Supervisor didn't automatically give
> you key 0.

I had a feeling that my comparison was probably unfair to MVT!

> So moving graphics into the kernel on an IA-32 is even
> worse than what MVT had. A company that would do a thing like that
> would be capable of anything, even allowing users to include code in
> e-mail that the company's software would automatically execute on
> receipt of the e-mail. We all know that no one could be that stupid.

No, it'll never happen. :/ (I could rant on and on, but I think
it's already been done by people who're better at it than me!)

Linus Torvalds

Jul 2, 2003, 4:46:45 PM
In article <kscudb.krn1.ln@acer>, Morten Reistad <m...@reistad.priv.no> wrote:
>
>The Linux people have the nfs server still in user mode last I saw.

Nope. That was five years ago. Nobody uses the user-space server for
serious NFS serving any more, even though it _is_ useful for
experimenting with user-space filesystems (ie "ftp filesystem" or
"source control filesystem").

><rant>
>Why do the file systems have to be so tightly integrated in the "ring0"
>core? This is one subsystem that screams for standard callouts and
>"ring1" level.
></rant off>

Because only naive people think you can do it efficiently any other way.

Face it, microkernels and message passing on that level died a long time
ago, and that's a GOOD THING.

Most of the serious processing happens outside the filesystem (ie the
VFS layer keeps track of name caches, stat caches, content caches etc),
and all of those data structures are totally filesystem-independent (in
a well-designed system) and are used heavily by things like memory
management. Think mmap - the content caches are exposed to user space
etc. But that's not the only thing - the name cache is used extensively
to allow people to see where their data comes from (think "pwd", but on
steroids), and none of this is anything that the low-level filesystem
should ever care about.

At the same time, all those (ring0 - core) filesystem data structures
HAVE TO BE MADE AVAILABLE to the low-level filesystem for any kind of
efficient processing. If you think we're going to copy file contents
around, you're just crazy. In other words, the filesystem has to be
able to directly access the name cache, and the content caches. Which in
turn means that it has to be ring0 (core) too.
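
Schematically (invented names, not the actual Linux structures), the
low-level filesystem is a table of functions that the VFS calls directly,
and the caches are handed to it by reference rather than copied:

    #include <stddef.h>

    struct page   { char data[4096]; };                /* cached file contents */
    struct dentry { const char *name; void *inode; };  /* cached name lookup   */

    struct fs_ops {
        /* fill a VFS-owned cache page straight from the device */
        int (*readpage)(void *inode, struct page *pg, unsigned long index);
        /* look a name up, filling in the VFS's dentry in place */
        int (*lookup)(void *dir_inode, struct dentry *d);
    };

    /* VFS side: a cache miss is one indirect call into the filesystem,
       which writes directly into the page the VFS already owns -- no
       copy-in/copy-out and no domain crossing */
    int vfs_read_miss(const struct fs_ops *ops, void *inode,
                      struct page *cache_page, unsigned long index)
    {
        return ops->readpage(inode, cache_page, index);
    }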

If you don't care about performance, you can add call-outs and copy-in
and copy-out etc crap. I'm telling you that you would be crazy to do it,
but judging from some of the people in academic OS research, you
wouldn't be alone in your own delusional world of crap.

Sorry to burst your bubble.

Linus

Tom Van Vleck

Jul 2, 2003, 5:38:30 PM
Eric Lee Green wrote:

> Morten Reistad wrote:
> > <rant>
> > Why do the file systems have to be so tightly integrated in
> > the "ring0" core? This is one subsystem that screams for
> > standard callouts and"ring1" level.
> > </rant off>
>
> Performance. Device drivers reading straight into filesystem
> buffers is difficult to achieve in userland. You end up having
> to modify your VM to be able to lock pages in memory, then have
> to go through the overhead of managing said locking before you
> do I/O. Much easier to just have both operating in kernal-land
> using the normal kernel memory page allocation mechanisms of
> the OS in question.

Multics had a facility called the I/O interfacer, ioi_.
Its purpose was to allow the user to safely write a device
driver that ran in the user ring of a process, obtaining
page-locked I/O buffers and wiring and unwiring them efficiently.
The various tape DIMs and the printer DIM used ioi_.

One of the major efficiencies of this scheme was, again, that we
could avoid making multiple extra copies of the data. This saved
us complicated alloc, free, and synchronization operations and
the related memory pressure on every record.

It worked great. As I remember, when the printer DIM was changed
to use ioi_, its load on the system decreased by more than half,
and we got to remove a bunch of device specific code from ring 0.

Charlie Gibbs

Jul 2, 2003, 5:19:31 PM
In article <bduv4l$p08$1$8300...@news.demon.co.uk>
spa...@ibbotson.demon.co.uk (Peter Ibbotson) writes:

>I learned C on the MG1, I don't remember the GUI as being in kernel,
>but then at the time I'm not sure I'd have spotted the distinction.
>I always liked the idea of a seperate mouse co-processor, and all
>windows having their contents stored in a raster rather than having
>to repaint when they overlapped.

You mean like the Amiga's SMART_REFRESH windows? When I started
writing Windows programs, I was quite disgusted to discover that
the contents of my window would be erased if another window opened
over top of it, and that I was responsible for restoring it.
Oddly enough, my old Amiga 1000, with its paltry 68000, a couple
of support chips, and 512K of RAM, didn't seem to have much trouble
with the overhead that modern-day programmers with a Pentium 4 will
still claim is intolerably high...

--
/~\ cgi...@kltpzyxm.invalid (Charlie Gibbs)
\ / I'm really at ac.dekanfrus if you read it the right way.
X Top-posted messages will probably be ignored. See RFC1855.
/ \ HTML will DEFINITELY be ignored. Join the ASCII ribbon campaign!

Jeff Kenton

Jul 2, 2003, 7:57:21 PM

Charlie Gibbs wrote:
...

>
> You mean like the Amiga's SMART_REFRESH windows? When I started
> writing Windows programs, I was quite disgusted to discover that
> the contents of my window would be erased if another window opened
> over top of it, and that I was responsible for restoring it.
> Oddly enough, my old Amiga 1000, with its paltry 68000, a couple
> of support chips, and 512K of RAM, didn't seem to have much trouble
> with the overhead that modern-day programmers with a Pentium 4 will
> still claim is intolerably high...

Agreed, but you're cheating a little here. The support chips included
state-of-the-art graphics processing without loading down the 68000.

jeff (one-time authorized Amiga reseller)


--

-------------------------------------------------------------------------
= Jeff Kenton Consulting and software development =
= http://home.comcast.net/~jeffrey.kenton =
-------------------------------------------------------------------------

Christopher Browne

Jul 2, 2003, 9:36:31 PM
After takin a swig o' Arrakan spice grog, Jeff Kenton <Jeffrey...@comcast.net> belched out...:

> Charlie Gibbs wrote:
> ...
> >
>> You mean like the Amiga's SMART_REFRESH windows? When I started
>> writing Windows programs, I was quite disgusted to discover that
>> the contents of my window would be erased if another window opened
>> over top of it, and that I was responsible for restoring it.
>> Oddly enough, my old Amiga 1000, with its paltry 68000, a couple
>> of support chips, and 512K of RAM, didn't seem to have much trouble
>> with the overhead that modern-day programmers with a Pentium 4 will
>> still claim is intolerably high...
>
> Agreed, but you're cheating a little here. The support chips included
> state-of-the-art graphics processing without loading down the 68000.

.. And with some DSP chip that does a gazillion polygons per second
and 128MB of _graphics memory_, this is expected to be troublesome?
--
wm(X,Y):-write(X),write('@'),write(Y). wm('cbbrowne','ntlug.org').
http://cbbrowne.com/info/spreadsheets.html
All bleeding stops...eventually.

Eric Lee Green

unread,
Jul 3, 2003, 12:10:55 AM7/3/03
to
Tom Van Vleck wrote:
> Eric Lee Green wrote:
>> Morten Reistad wrote:
>> > <rant>
>> > Why do the file systems have to be so tightly integrated in
>> > the "ring0" core? This is one subsystem that screams for
>> > standard callouts and"ring1" level.
>> > </rant off>
>>
>> Performance. Device drivers reading straight into filesystem
>> buffers is difficult to achieve in userland. You end up having
>> to modify your VM to be able to lock pages in memory, then have
>> to go through the overhead of managing said locking before you
>> do I/O. Much easier to just have both operating in kernal-land
>> using the normal kernel memory page allocation mechanisms of
>> the OS in question.
>
> Multics had a facility called the I/O interfacer, ioi_.
> Its purpose was to allow the user to safely write a device
> driver that ran in the user ring of a process, obtaining
> page-locked I/O buffers and wiring and unwiring them efficiently.
> The various tape DIMs and the printer DIM used ioi_.

This solves only part of the problem. With filesystems, you not only want to
move data between user and device, you also want to cache it, in a
preferential manner (e.g. directory nodes might get cached preferential to
data nodes, etc.). Multics was somewhat unique in its ability to memory-map
files between different processes (after all, they were just segments), which
used the page cache as a file cache, but memory mapping isn't particularly
efficient as a method for caching, especially with current processors. RMS is
having to re-write major parts of the GNU Hurd to move away from memory
mapping back to traditional I/O for numerous reasons (one reason being that it
imposed major restrictions on the size of devices and files with today's
32-bit processors).

> It worked great. As I remember, when the printer DIM was changed
> to use ioi_, its load on the system decreased by more than half,
> and we got to remove a bunch of device specific code from ring 0.

The deal with Multics was that it had wonderfully fast I/O, but a horrendously
slow context switch time and processors that were five years obsolete (and
slower than molasses) on the day they were first released. I'm still astounded
that a $6,000,000 Level/68 with three processors couldn't manage much over 100
users without staggering to a halt, in the same era where an equivalent IBM
370 installation could handle 500 users. (I was a user of USL Multics... I
swiftly learned to do my assignments at 3AM, when there were
only 40 or 50 people logged in... this probably explains my poor GPA during
the Multics era at USL too, since I was falling asleep in class :-). Anyhow,
an amount of copying that would have brought Multics to a crawl would not
cause a modern processor to even flinch. Copying across multiple contexts, not
copying in general, is the major issue, and with the filesystem in kernel-land
there's only one copying across a context (the copy from kernel-land to
user-land or vice-versa).
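
For illustration, here is a minimal POSIX sketch (not Multics, and the
file name is hypothetical) of the two approaches being contrasted:
memory-mapped access, where the page cache doubles as the file cache,
versus traditional read()-based I/O, which copies into a user buffer.

/* Minimal POSIX sketch: memory-mapped access vs. read()-based I/O.
 * Illustrative only; "data.bin" is a hypothetical file and error
 * handling is abbreviated. */
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

int main(void)
{
    int fd = open("data.bin", O_RDONLY);
    if (fd < 0) { perror("open"); return 1; }

    struct stat st;
    if (fstat(fd, &st) < 0) { perror("fstat"); return 1; }

    /* Memory-mapped route: pages are faulted in on demand and the
     * kernel's page cache is the file cache. */
    char *map = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    if (map == MAP_FAILED) { perror("mmap"); return 1; }
    long sum1 = 0;
    for (off_t i = 0; i < st.st_size; i++)
        sum1 += map[i];
    munmap(map, st.st_size);

    /* Traditional route: each read() copies from the kernel's cache
     * into a user-space buffer. */
    char buf[4096];
    long sum2 = 0;
    ssize_t n;
    lseek(fd, 0, SEEK_SET);
    while ((n = read(fd, buf, sizeof buf)) > 0)
        for (ssize_t i = 0; i < n; i++)
            sum2 += buf[i];

    printf("mmap sum=%ld  read sum=%ld\n", sum1, sum2);
    close(fd);
    return 0;
}

Note that on a 32-bit machine the mmap() route runs into exactly the
address-space limit mentioned above, while the read() route does not.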

john jakson

unread,
Jul 3, 2003, 3:04:34 AM7/3/03
to
"Rupert Pigott" <r...@dark-try-removing-this-boong.demon.co.uk> wrote in message news:<10571607...@saucer.planet.gong>...

OT
For the Occam heads or ex Inmos folks!

In reviewing Occam more thoroughly recently it jumps out to me that
Occam is a weird kind of Hardware Description Language or HDL, the
kind of HDL that only a CSP mathematician or SW type would create with
some basic knowledge of HW but probably none on HW simulation event
wheels. I know Tony Hoare's 1st life was in HW, but the time gap to
CSP circa 68 was maybe too long.

Now HandelC binds Occam semantics with C syntax into a C-based HDL
which allows the user to describe the algorithm's parallel behaviour
directly rather than relying on some almighty parallelizing compiler
to find it for them. HandelC can be used by C SW types to design so-so
HW esp suitable for FPGAs. C-based HDLs are not going to be popular
with HW folks for many reasons I leave alone here.

So HDLs use modules, always running processes, wires, hierarchy and
some C like syntax for functions, assignments etc.

Occam uses procedures, maybe always running processes, channels, !,?,
hierarchy and some non C like syntax to hang it all together.

Inside the HDL simulator, processes obey different rules: write before
read, not anonymous, static P list (HW doesn't just appear &
disappear).

What would the world think if a small piece of the Verilog language
(or any good C like HDL with world class synthesis) was joined with C
& Occam semantics to allow a single language to support not just seq &
par programming but also HW style of par coding? The run time
scheduler needed to support Occam channels & messages is not so very
different from the simulation event wheel of any good HDL engine.

The really big payoff is that ordinary C seq code can be used &
mixed with Occam par code for SW types, AND HW types can, with extra
effort, also code up HW event-driven logic. Given the kind of
compute intensive apps that would be written in Occam as distributed
processes possibly on more than 1 cpu, it is painfully obvious that
some of those problems can just as well be described with HW event
semantics. This opens the door to synthesizing such HW code portions
into accelerated HW engines or leaving it as end user HDL code. The
runtime for such a language could be either a static library, or
perhaps a new implementation of the Transputer with enhanced
scheduler. The HW applications running on this runtime could be truly
regarded as simulations of HW rather than the way we think of running
SW processes.

I could go on, but I have seen no other reasonable attempt to join HW
& SW languages into 1 language without the thing becoming an expensive
monster language available only to ASIC-FPGA guys (SuperLog,
SystemVerilog etc) some time in the distant future. ADA & VHDL might
have been a HW-SW candidate too, but gee they are both monster
languages & not exactly popular choices.

Besides compute intensive apps, I can also think of embedded systems
where HW & SW are usually designed by 2 distinct groups and then
cobbled together with some magic C glue & SDK. These systems have to
be rigidly defined across a hard boundary of physical and soft
entities long before either side can start. This would not have to be
the case with a mixed HW-SW language & runtime since some modules
might be real HW or simulated HW as performance allows and can be
changed by resynthesizing different parts as HW. Communication between
HW & SW uses shared link HW at the interface and channels on the SW
side.

I am working on such a mixed language & compiler, but the question is
how much of Verilog I want to keep; most of it is already a mix of
HW and SW language features that duplicates C (pretty badly
I'd say). I would like to include only the synthesizable part and let
regular C be used to support it.

Any thoughts?

Regards
JJ

James Cownie

unread,
Jul 3, 2003, 4:36:25 AM7/3/03
to

But then your original statement is still wrong, since there
is no buffer allocated statically for each channel.

Consider
in ? a
in ? b
the variables a and b are allocated (on the stack), but there is
no store allocated as a buffer for the channel in.
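
For illustration, here is a rough C sketch (POSIX threads, invented
names, single sender and single receiver) of the same rendezvous
semantics: the channel owns no data buffer of its own, only a pointer
to the receiver's target variable while the two sides are synchronised.

/* Sketch of Occam-style synchronised (unbuffered) channel I/O using
 * POSIX threads.  The channel holds no data; it only remembers where
 * the receiver wants the value written.  All identifiers are invented. */
#include <pthread.h>
#include <stdio.h>

typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    int *dest;          /* receiver's target variable, or NULL */
    int  done;          /* set once the value has been delivered */
} chan;

static void chan_init(chan *c) {
    pthread_mutex_init(&c->lock, NULL);
    pthread_cond_init(&c->cond, NULL);
    c->dest = NULL;
    c->done = 0;
}

/* in ? x : block until a sender arrives, value lands directly in *x */
static void chan_recv(chan *c, int *x) {
    pthread_mutex_lock(&c->lock);
    c->dest = x;
    c->done = 0;
    pthread_cond_broadcast(&c->cond);   /* tell a waiting sender where to write */
    while (!c->done)
        pthread_cond_wait(&c->cond, &c->lock);
    c->dest = NULL;                     /* rendezvous complete */
    pthread_mutex_unlock(&c->lock);
}

/* out ! v : block until a receiver is waiting, then write straight to it */
static void chan_send(chan *c, int v) {
    pthread_mutex_lock(&c->lock);
    while (c->dest == NULL || c->done)  /* wait for an unserved receiver */
        pthread_cond_wait(&c->cond, &c->lock);
    *c->dest = v;                       /* direct transfer, no extra copy */
    c->done = 1;
    pthread_cond_broadcast(&c->cond);
    pthread_mutex_unlock(&c->lock);
}

static chan in;

static void *producer(void *arg) {
    (void)arg;
    chan_send(&in, 1);
    chan_send(&in, 2);
    return NULL;
}

int main(void) {
    pthread_t t;
    int a, b;
    chan_init(&in);
    pthread_create(&t, NULL, producer, NULL);
    chan_recv(&in, &a);                 /* in ? a */
    chan_recv(&in, &b);                 /* in ? b */
    pthread_join(t, NULL);
    printf("a=%d b=%d\n", a, b);
    return 0;
}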

Jan C. Vorbrüggen

unread,
Jul 3, 2003, 5:31:29 AM7/3/03
to
> This reminds me of field theories vs. gluon exchange..

...and both field theories and ccNUMA have "action at a distance",
while particle exchange and message passing make the delay more
explicit.

Jan

Ian G Batten

unread,
Jul 3, 2003, 6:06:19 AM7/3/03
to
In article <VtCMa.13$VI....@paloalto-snr1.gtei.net>,

Barry Margolin <barry.m...@level3.com> wrote:
> >The Linux people have the nfs server still in user mode last I saw.
>
> And I frequently hear complaints about how poor Linux's NFS support is.
> Coincidence?

It's in kernel space now, and has been for some time.

ian


Ian G Batten

unread,
Jul 3, 2003, 6:21:49 AM7/3/03
to
In article <vg58c5l...@corp.supernews.com>,
Pete Fenelon <pe...@fenelon.com> wrote:
> I don't think I've seen an in-kernel GUI on any Unix system since
> Whitechapel MG1s, but I'm sure someone could prove me wrong ;)

Suntools/SunWindows, present up until SunOS3.4?

ian

Pete Fenelon

unread,
Jul 3, 2003, 7:25:24 AM7/3/03
to

Predated the MG1 by a few years, I think, but definitely had a lot of
the pixrect stuff in-kernel.

Rupert Pigott

unread,
Jul 3, 2003, 10:52:17 AM7/3/03
to
"john jakson" <johnj...@yahoo.com> wrote in message
news:adb3971c.03070...@posting.google.com...

[SNIP]

> What would the world think if a small piece of the Verilog language
> (or any good C like HDL with world class synthesis) was joined with C
> & Occam semantics to allow a single language to support not just seq &
> par programming but also support HW style of par coding. The run time

This particular (and insignificant) part of the world would think that
you are quite insane, but don't let that stop you! Personally I really
can't see what C would give you over and above OCCAM for that kind of
gig. The two key things that I missed in OCCAM were pointers and being
able to randomly change the type of stuff... The latter I grew to
appreciate as I matured. :)

As for Verilog I can't comment, never touched it, but I suspect that it
would give something to OCCAM. A few of the folks I hung out with at
INMOS *did* use OCCAM for small simulations (can't remember how the big
ones were done), and a couple of the hardware bods semi-seriously
suggested doing a VHDL compiler for Transputers - from what they were
saying it seemed like a very snug fit (as you observed in your post).

Cheers,
Rupert


Rupert Pigott

unread,
Jul 3, 2003, 10:54:41 AM7/3/03
to
"James Cownie" <jco...@etnus.com> wrote in message
news:dWRMa.1853$gN2.14...@news-text.cableinet.net...

> Rupert Pigott wrote:
> > "James Cownie" <jco...@etnus.com> wrote in message
> > news:lqwMa.866$Wp.61...@news-text.cableinet.net...
> >
> >>Rupert Pigott wrote:
> >>
> >>
> >>>That's not quite true. Allocate a receiving buffer (once)
> >>>for each "channel", not for each "message". In OCCAM this
> >>>was frequently done statically (no run time cost).
> >>>
> >>
> >>I don't think so. In Occam all communication is synchronised,
> >>so there is no need for _any_ receive buffer. Data can _always_
> >>be transferred directly into the user's target variable.
> >>
> >>Similarly on send, data can always be transferred directly
> >>from the user's variable. (Or constant or ?, of course).
> >
> >
> > Buffer == user variable.
> >
> > The confusion is arising because I'm trying to be
> > language agnostic here, just looking at the basics
> > of sending bits across systems and synchronisation. :)
>
> But then your original statement is still wrong, since there
> is no buffer allocated statically for each channel.

Oops. Yes, of course. :)

> Consider
> in ? a
> in ? b
> the variables a and b are allocated (on the stack), but there is
> no store allocated as a buffer for the channel in.

The static thing is complete crap. Wrong braincell fired,
I guess. But in essence the chunks of memory that the
channel I/O operates on are determined statically. They're
not dynamically allocated from a heap.

Cheers,
Rupert


Tim Shoppa

unread,
Jul 3, 2003, 1:18:51 PM7/3/03
to
Tom Van Vleck <th...@multicians.org> wrote in message news:<20030702173830...@multicians.org>...

> Eric Lee Green wrote:
>
> > Morten Reistad wrote:
> > > <rant>
> > > Why do the file systems have to be so tightly integrated in
> > > the "ring0" core? This is one subsystem that screams for
> > > standard callouts and"ring1" level.
> > > </rant off>
> >
> > Performance. Device drivers reading straight into filesystem
> > buffers is difficult to achieve in userland. You end up having
> > to modify your VM to be able to lock pages in memory, then have
> > to go through the overhead of managing said locking before you
> > do I/O. Much easier to just have both operating in kernal-land
> > using the normal kernel memory page allocation mechanisms of
> > the OS in question.
>
> Multics had a facility called the I/O interfacer, ioi_.
> Its purpose was to allow the user to safely write a device
> driver that ran in the user ring of a process, obtaining
> page-locked I/O buffers and wiring and unwiring them efficiently.
> The various tape DIMs and the printer DIM used ioi_.
>
> One of the major efficiencies of this scheme was, again, that we
> could avoid making multiple extra copies of the data. This saved
> us complicated alloc, free, and snynchronization operations and
> the related memory pressure on every record.

What details of the page-locked I/O buffers did ioi_ handle? In particular,
for DM devices it must have put the buffer address/length into the peripheral
controller for the user... how much consistency/inconsistency was there
between the various devices that ioi_ handled?

Right now I'm thinking of comparison with IBM mainframe-style channel
I/O (which also could avoid all those extra bounce buffers).

Tim.

Tom Van Vleck

unread,
Jul 3, 2003, 2:11:01 PM7/3/03
to
sho...@trailing-edge.com (Tim Shoppa) wrote:

> What details of the page-locked I/O buffers did ioi_ handle?
> In particular, for DM devices it must have put the buffer
> address/length into the peripheral controller for the user...
> how much consistency/inconsistency was there between the
> various devices that ioi_ handled?
>
> Right now I'm thinking of comparison with IBM mainframe-style
> channel I/O (which also could avoid all those extra bounce
> buffers).

It has been a long time.. as I remember, ioi_ knew how to give
you a wired, contiguous buffer in low enough memory that the I/O
controller could use it, and understood enough about the channel
protocol to keep it wired only when necessary. ioi_ also set up
the channel address base and bounds in the I/O controller
correctly.

As I remember, ioi_ knew very little about the devices or the
peripheral controllers; it just managed access to the buffers.

For more information, there is a Project MAC TR:
Clark, D. D., An input-output architecture for virtual memory
computer systems, MAC-TR-117 (Ph.D. thesis), January 1974.

Abstract:
"In many large systems today, input/output is not performed
directly by the user, but is done interpretively by the system
for him, which causes additional overhead and also restricts the
user to whatever algorithms the system has implemented. Many
causes contribute to this involvement of the system in user
input/output, including the need to enforce protection
requirements, the inability to provide adequate response to
control signals from devices, and the difficulty of running
devices in a virtual environment, especially a virtual memory.
The goal of this thesis was the creation of an input/output
system which allows the user the freedom of direct access to the
device, and which allows the user to build input/output control
programs in a simple and understandable manner. This thesis
presents a design for an input/output subsystem architecture
which, in the context of a segmented, paged, time-shared computer
system, allows the user direct access to input/output devices.
This thesis proposes a particular architecture, to be used as an
example of a class of suitable designs, with the intention that
this example serve as a tool in understanding the large number
preferable form."

Benny Amorsen

unread,
Jul 3, 2003, 3:03:19 PM7/3/03
to
>>>>> "PI" == Peter Ibbotson <spa...@ibbotson.demon.co.uk> writes:

PI> I learned C on the MG1, I don't remember the GUI as being in
PI> kernel, but then at the time I'm not sure I'd have spotted the
PI> distinction. I always liked the idea of a seperate mouse
PI> co-processor, and all windows having their contents stored in a
PI> raster rather than having to repaint when they overlapped. Are
PI> there technical documents on the web anywhere?

With today's hardware, windows could simply be textures on polygons.
That way the GPU would handle all the repainting. A mouse coprocessor
is probably overkill, 50 interrupts a second is nothing for a modern
CPU and few people need more than 50 frames per second.


/Benny

Peter da Silva

unread,
Jul 3, 2003, 6:25:03 PM7/3/03
to
In article <3f02eaa8$7$fuzhry+tra$mr2...@news.patriot.net>,

Shmuel (Seymour J.) Metz <spam...@library.lspace.org.invalid> wrote:
> worse than what MVT had. A company that would do a thing like that
> would be capable of anything, even allowing users to include code in
> e-mail that the company's software would automatically execute on
> receipt of the e-mail. We all know that no one could be that stupid.

Bah, even the mythical "GOOD TIMES" virus required that you *open* it!

--
#!/usr/bin/perl
$/="%\n";chomp(@_=<>);print$_[rand$.]

Peter da Silva, just another Perl poseur.

Peter da Silva

unread,
Jul 3, 2003, 6:34:29 PM7/3/03
to
In article <BjKMa.89516$R73.9817@sccrnsc04>,
Jeff Kenton <Jeffrey...@comcast.net> wrote:

> Charlie Gibbs wrote:
> > You mean like the Amiga's SMART_REFRESH windows? When I started
> > writing Windows programs, I was quite disgusted to discover that
> > the contents of my window would be erased if another window opened
> > over top of it, and that I was responsible for restoring it.
> > Oddly enough, my old Amiga 1000, with its paltry 68000, a couple
> > of support chips, and 512K of RAM, didn't seem to have much trouble
> > with the overhead that modern-day programmers with a Pentium 4 will
> > still claim is intolerably high...

> Agreed, but you're cheating a little here. The support chips included
> state-of-the-art graphics processing without loading down the 68000.

Yep, though that was a fairly short-term advantage: the 68020 often beat
the graphics chips, and on the 68030 in my Amiga 3000 I got a big speedup
from a program that took the graphics chips out of the loop for text
rendering.

So... replace that with a 16 MHz 68030 and 2M of RAM, and you still have
to wonder why a Pentium 4 can't manage it.

In Mac OS X not only is that refresh handled by the OS, but it does it
for window scrolling and pre-renders hidden portions of subwindows in
the background. So the lesson wasn't completely lost.

john jakson

unread,
Jul 4, 2003, 12:24:49 AM7/4/03
to
"Rupert Pigott" <r...@dark-try-removing-this-boong.demon.co.uk> wrote in message news:<10572439...@saucer.planet.gong>...

You are probably right about the insanity. Maybe I should move to Area
51 or just sell transparent alumin[i]um. We were all pretty crazy back
then, although I had no appreciation for what the architecture people
were really doing until a few years ago. I could see HW thinking all
through it if I squinted enough.

Well I'm halfway into the Verilog compiler anyway, so once the
runtime event wheel is up & running, adding != & ?= synchronized put &
get on chan vars is no big deal and they fit right into the fork &
join substitute for par blocks that Verilog already has. The C syntax
is important as Verilog is supposedly C like as well, only the
expressions are really C'ish.

Can't say I ever will meet any Occam users over here (MA); it's a shame
more HW & SW people don't know that some of the other guys got
something similar they could understand.

Inmos mostly used their own HDL (far ahead of Verilog even now in some
ways) until IIRC they replaced it with Verilog & other commercial EDA
tools. I remember well Clive D's work on running Spice & maybe HDL
sims on a small 5 T network but the Transputer was not quite right for
it. Using Occam directly as a HDL would have been just as inefficient
as C on seq only cpus, but a new event time wheel scheduler could have
done the trick, but that was out of the question.

Regards
JJ

Scott Schwartz

unread,
Jul 4, 2003, 1:18:40 AM7/4/03
to
Just for the record, there are at least two modern systems, Plan 9 and
Inferno, that use CSP for lots of interesting things. Modern languages
based on CSP include Limbo (from Bell Labs) and Erlang (from
Ericsson). All of these have been used in commercial products,
not intended as academic research. They strike a practical balance.

Russ Cox wrote a nice essay on the topic:

"Resources about threaded programming in the Bell Labs CSP style"
http://plan9.bell-labs.com/who/rsc/thread/index.html

Nick Maclaren

unread,
Jul 4, 2003, 3:21:59 AM7/4/03
to

In article <be2afv$13so$5...@jeeves.eng.abbnm.com>, pe...@abbnm.com (Peter da Silva) writes:
|> In article <3f02eaa8$7$fuzhry+tra$mr2...@news.patriot.net>,
|> Shmuel (Seymour J.) Metz <spam...@library.lspace.org.invalid> wrote:
|> > worse than what MVT had. A company that would do a thing like that
|> > would be capable of anything, even allowing users to include code in
|> > e-mail that the company's software would automatically execute on
|> > receipt of the e-mail. We all know that no one could be that stupid.
|>
|> Bah, even the mythical "GOOD TIMES" virus required that you *open* it!

Hmm. I remember reading that Microsoft did just that. Of course,
they CALLED it a beta test, but charged almost normally for it.
And they did have the sense to disable that trap before release.


Regards,
Nick Maclaren.

Jan C. Vorbrüggen

unread,
Jul 4, 2003, 3:26:26 AM7/4/03
to
> > worse than what MVT had. A company that would do a thing like that
> > would be capable of anything, even allowing users to include code in
> > e-mail that the company's software would automatically execute on
> > receipt of the e-mail. We all know that no one could be that stupid.
> Bah, even the mythical "GOOD TIMES" virus required that you *open* it!

Yes, it amazes me again and again what said company's programmers' imagination
manages to put out into the wild.

Jan

Bengt Larsson

unread,
Jul 4, 2003, 8:29:52 AM7/4/03
to
In comp.arch, Scott Schwartz <"schwartz+@usenet "@bio.cse.psu.edu>
wrote:

And here is the rationale for basing (multi-)tasking in Ada on CSP:
http://archive.adaic.com/standards/83rat/html/ratl-13-03.html

Douglas H. Quebbeman

unread,
Jul 4, 2003, 8:58:01 AM7/4/03
to
Linus Torvalds <torv...@penguin.transmeta.com> wrote in article
<10571788...@palladium.transmeta.com>...

[..snip..]

> If you don't care about performance, you can add call-outs and copy-in
> and copy-out etc crap. I'm telling you that you would be crazy to do it,
> but judging from some of the people in academic OS research, you
> wouldn't be alone in your own delusional world of crap.
>
> Sorry to burst your bubble.

I think most of us care about performance.

But I always thought one of the main reasons for chasing the
ever-increasing hardware performance curve was to make it possible
to write code in a high-level manner, and have it run fast enough
to be useful. Getting away from the bit-twiddling and relying on
higher-level constructs makes it possible for us to capitalize
on our past efforts more efficiently. Each new layer brings new
metaphors that permit programmers to get their tasks done faster.

I always hoped these benefits would be available not only to
applications programmers, but to us systems programmers as well.

Regards,
-doug q

Edward Rice

unread,
Jul 4, 2003, 12:11:09 PM7/4/03
to
In article <01c3422b$dec204f0$b2ecffcc@Shadow>,

"Douglas H. Quebbeman" <Do...@IgLou.com> wrote:

> I think most of us care about performance.
>
> But I always thought one of the main reasons for chasing the
> ever-increasing hardware performance curve was to make it possible
> to write code in a high-level manner, and have it run fast enough
> to be useful. Getting away from the bit-twiddling and relying on
> higher-level constructs makes it possible for us to capitalize

> on our past efforts more efficiently...

To bring up the obvious counter-example, Doug, the weather people and the
nuclear people and the geology people, the imaging people and demented
mathematicians like myself (and probably some other people, too) want new
machines to pour out raw speed because the basic codes they want to run are
pretty simple, and it's much easier to re-code for an ultra-fast
architecture running on ultra-fast hardware than it is to somehow wring
another percent or two from code that's been optimized for five years.

However, if I were still in the business of /producing/ code, I'd just nod
my head and dismiss those idjits as outliers on the curve and be in almost
full agreement with you. I look at what "productivity" meant on Multics
boxes, the time it took to do a compile and the limitations we had on
producible tools because some things just took too long to develop or were
likely to take too long to run... and then I think of what joy some of
those projects would be on boxes of today's speed. YOW! I can think of
one particular analysis task that would have turned into a nothing project
(rather than being abandoned as too costly) if we'd had today's horsepower
to throw against it.

ehr


Shmuel (Seymour J.) Metz

unread,
Jul 4, 2003, 10:40:56 AM7/4/03
to
In <bec993c8.03070...@posting.google.com>, on 07/03/2003

at 10:18 AM, sho...@trailing-edge.com (Tim Shoppa) said:

>Right now I'm thinking of comparison with IBM mainframe-style channel
>I/O (which also could avoid all those extra bounce buffers).

What kind? On what platform? OS/360 had no paging, so I assume that
you mean OS/VS. EXCP had a substantial overhead for CCW translation
and IDAL construction. EXCPVR was semi-privileged and STARTIO was
available only for Supervisor state. Similar issues existed for every
other operating system except DOS/VSE when running with ECPS:VSE
enabled.

--
Shmuel (Seymour J.) Metz, SysProg and JOAT

Any unsolicited bulk E-mail will be subject to legal action. I reserve the
right to publicly post or ridicule any abusive E-mail.

Reply to domain Patriot dot net user shmuel+news to contact me. Do not reply
to spam...@library.lspace.org


Chris Jones

unread,
Jul 4, 2003, 1:47:16 PM7/4/03
to
Tom Van Vleck <th...@multicians.org> writes:

> sho...@trailing-edge.com (Tim Shoppa) wrote:
>
> > What details of the page-locked I/O buffers did ioi_ handle?
> > In particular, for DM devices it must have put the buffer
> > address/length into the peripheral controller for the user...
> > how much consistency/inconsistency was there between the
> > various devices that ioi_ handled?
> >
> > Right now I'm thinking of comparison with IBM mainframe-style
> > channel I/O (which also could avoid all those extra bounce
> > buffers).
>
> It has been a long time.. as I remember, ioi_ knew how to give
> you a wired, contiguous buffer in low enough memory that the I/O
> controller could use it, and understood enough about the channel
> protocol to keep it wired only when necessary. ioi_ also set up
> the channel address base and bounds in the I/O controller
> correctly.
>
> As I remember, ioi_ knew very little about the devices or the
> peripheral controllers; it just managed access to the buffers.

This is basically correct. ioi_ was more of an API; it knew how to
manage memory and talk to the IOM (I/O Multiplexor, through which all
devices were connected). It eventually evolved so that the buffer,
while wired, did not have to be contiguous (using what was called
"paged-mode" of the IOM). There was a restriction that the pages had to
be in "low memory", but that meant they had to be connected to the
lowest numbered memory controller (a/k/a SCU 0), as SCU 0 was the only
one which could not be dynamically deconfigured.

Peter da Silva

unread,
Jul 4, 2003, 11:49:13 AM7/4/03
to
In article <be39un$p9s$1...@pegasus.csx.cam.ac.uk>,

Um, no, there have been multiple Outlook exploits that found ways to
execute code before the user actually opened the message, so they couldn't
select and delete it without triggering the virus.

I was making a joke, you see. Back in the '80s there was this joke going
around about an email virus that would activate if you just opened a message
without running any attachments. We used to joke about the "GOOD TIMES" virus
and come up with more and more incredible things it would do, safe in the
knowledge that nobody would ever be stupid enough to write a mail program
that would run untrusted code when you opened a message, especially after
the Christmas Tree and WANK attacks, and then the Internet Worm.

Not only has Microsoft done exactly that, they spent five years in a fight
with the justice department to keep the underlying design flaw that almost
all of these exploits depended on in Windows. Yes, there's been a few prosaic
buffer overflow attacks that many other programs are subject to, but things
like what they call "cross frame scripting" would not be an issue if they
hadn't integrated the core of Internet Explorer directly into the standard
APIs of the OS.

Linus Torvalds

unread,
Jul 4, 2003, 3:55:40 PM7/4/03
to
Douglas H. Quebbeman wrote:

> Linus Torvalds <torv...@penguin.transmeta.com> wrote:
>>
>> If you don't care about performance, you can add call-outs and copy-in
>> and copy-out etc crap. I'm telling you that you would be crazy to do it,
>> but judging from some of the people in academic OS research, you
>> wouldn't be alone in your own delusional world of crap.
>
> I think most of us care about performance.
>
> But I always thought one of the main reasons for chasing the
> ever-increasing hardware performance curve was to make it possible
> to write code in a high-level manner, and have it run fast enough
> to be useful.

Why would you ever do that for an operating system, though?

When you make your own application slower, that's _your_ problem, and
the rest of the world doesn't really mind. If they find your application
useful enough, they'll use it. And if it's too slow, they might not. Not
everybody can just buy faster hardware.

However, when you make your OS slow, you make _everything_ slow.

Also, what's the point of writing low-level code in a high-level manner?
High-level code hides the details and does a lot of things automatically
for the programmer, but in an OS you are constrained by hardware and
security issues, and a lot of the time you absolutely MUST NOT hide the
details.

> Getting away from the bit-twiddling and relying on
> higher-level constructs makes it possible for us to capitalize
> on our past efforts more efficiently. Each new layer brings new
> metaphors that permit programmers to get their tasks done faster.

But the OS _is_ one of those layers. The whole point of layering is
you use independent concepts on top of each other to make the higher
levels more pleasant to use.

The important part here is "independent". The concepts should be
clearly above or below each other, not smushed together into an
unholy mess of 'every single abstraction you can think of'.

Leave the OS be. Put your abstractions on top of the solid ground
of an OS that performs well.

> I always hoped these benefits would be available not only to
> applications programmers, but to us systems programmers as well.

If you want OS protection, you work outside the OS. It's that simple.

You can do a lot of "system programming" outside the OS. Look at X,
or look at any number of server applications (apache etc). But don't
make the mistake of thinking that because a lot of services _should_
be done outside the kernel, that means that you should do all of them.

Filesystems are just about _the_ most performance critical thing in
an operating system. They tie intimately with pretty much everything.

In particular, filesystems are a hell of a lot more important than
message passing - you're better off implementing message passing on
top of a filesystem than you are the other way around.
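
As a rough illustration of that last point (not anyone's actual design,
and the path below is hypothetical), a named pipe is an ordinary
filesystem object that already behaves like a message channel:

/* Message passing built on top of the filesystem: a POSIX FIFO is the
 * channel.  Illustrative sketch; error handling is abbreviated. */
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    const char *path = "/tmp/demo_fifo";
    mkfifo(path, 0600);                 /* the "channel" is a file */

    if (fork() == 0) {                  /* child: the sender */
        int fd = open(path, O_WRONLY);  /* blocks until a reader opens */
        const char msg[] = "hello from the filesystem";
        write(fd, msg, sizeof msg);
        close(fd);
        _exit(0);
    }

    /* parent: the receiver */
    int fd = open(path, O_RDONLY);      /* blocks until a writer opens */
    char buf[128];
    if (read(fd, buf, sizeof buf) > 0)
        printf("received: %s\n", buf);
    close(fd);
    wait(NULL);
    unlink(path);
    return 0;
}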

In short: when I get a faster machine, I'd rather use that extra speed
to make the machine more useful to me, than waste it on stuff where it
doesn't help and where it makes no sense.

Linus

Nick Maclaren

unread,
Jul 4, 2003, 4:11:52 PM7/4/03
to
In article <10573485...@palladium.transmeta.com>,
Linus Torvalds <torv...@osdl.org> wrote:

>Douglas H. Quebbeman wrote:
>>>
>>> If you don't care about performance, you can add call-outs and copy-in
>>> and copy-out etc crap. I'm telling you that you would be crazy to do it,
>>> but judging from some of the people in academic OS research, you
>>> wouldn't be alone in your own delusional world of crap.
>>
>> I think most of us care about performance.
>>
>> But I always thought one of the main reasons for chasing the
>> ever-increasing hardware performance curve was to make it possible
>> to write code in a high-level manner, and have it run fast enough
>> to be useful.
>
>Why would you ever do that for an operating system, though?

To increase its extensibility by a factor of 10, its debuggability
by a factor of 100, its reliability by a factor of 1,000 and its
security by a factor of 10,000. No, I am NOT joking - those numbers
are achievable.

Just for the record, Linux is at least as good as most commercial
systems in those respects, so I am talking generically.

Nor am I denying your points about performance. An operating system
that is perfect in all respects, but takes a month to deliver what
is needed in a week, isn't useful.

>In short: when I get a faster machine, I'd rather use that extra speed
>to make the machine more useful to me, than waste it on stuff where it
>doesn't help and where it makes no sense.

Quite. But there are more criteria than performance. Nowadays,
the critical needs in many uses of operating systems are not for
more performance but for the aspects I mentioned above.


Regards,
Nick Maclaren.

Walter Bushell

unread,
Jul 4, 2003, 5:14:12 PM7/4/03
to
Peter da Silva <pe...@abbnm.com> wrote:

> In article <BjKMa.89516$R73.9817@sccrnsc04>,
> Jeff Kenton <Jeffrey...@comcast.net> wrote:
> > Charlie Gibbs wrote:
> > > You mean like the Amiga's SMART_REFRESH windows? When I started
> > > writing Windows programs, I was quite disgusted to discover that
> > > the contents of my window would be erased if another window opened
> > > over top of it, and that I was responsible for restoring it.
> > > Oddly enough, my old Amiga 1000, with its paltry 68000, a couple
> > > of support chips, and 512K of RAM, didn't seem to have much trouble
> > > with the overhead that modern-day programmers with a Pentium 4 will
> > > still claim is intolerably high...
>
> > Agreed, but you're cheating a little here. The support chips included
> > state-of-the-art graphics processing without loading down the 68000.
>
> Yep, though that was a fairly short term advantage, the 68020 often beat
> the graphics chips and on the 68030 on my Amiga 3000 I got a big speedup
> from a program that took the graphics chips out of the loop for text
> rendering.

Yes, I remember that Apple's reply to the superior graphics of the Amiga
was that they used special chips and that limits your future. Apple did
have DSP sound chips but absorbed their function early in the PowerPC
era. If one uses specialized chips, one has to do it in a
programmer-transparent way.



> So... replace that with a 16 MHz 68030 and 2M of RAM, and you still have
> to wonder why a Pentium 4 can't manage it.

Four hysterical raisins. Once you start going down that route, the old
programs do their own windowing, so the new ones have to be compatible,
and it gets into the culture.


>
> In Mac OS X not only is that refresh handled by the OS, but it does it
> for window scrolling and pre-renders hidden portions of subwindows in
> the background. So the lesson wasn't completely lost.


--
Walter It is difficult to get a man to understand something," wrote
Upton Sinclair, "when his salary depends upon his not understanding it."
Walter

Peter da Silva

unread,
Jul 4, 2003, 5:15:43 PM7/4/03
to
In article <01c3422b$dec204f0$b2ecffcc@Shadow>,
Douglas H. Quebbeman <Do...@IgLou.com> wrote:
> But I always thought one of the main reasons for chasing the
> ever-increasing hardware performance curve was to make it possible
> to write code in a high-level manner, and have it run fast enough
> to be useful. Getting away from the bit-twiddling and relying on
> higher-level constructs makes it possible for us to capitalize
> on our past efforts more efficiently. Each new layer brings new
> metaphors that permit programmers to get their tasks done faster.

I know where I want my hardware performance to go.

I want my user interface to have real-time ray-tracing with a
radiosity illumination model and realistic physics.

None of this wimpy OpenGL phong-shaded z-buffering approximation.

Quartz Extreme? Not near extreme enough for me, Apple, but thanks
for the translucent Terminal windows, it's a start.

Dave Leigh

unread,
Jul 4, 2003, 6:21:12 PM7/4/03
to
Peter da Silva wrote on Friday 04 July 2003 16:15 in message
<be4qpv$1cg$1...@jeeves.eng.abbnm.com>:

> In article <01c3422b$dec204f0$b2ecffcc@Shadow>,

...

> I know where I want my hardware performance to go.
>
> I want my user interface to have real-time ray-tracing with a
> radiosity illumination model and realistic physics.

To each his own, so don't take this as criticism, but.... why?

--
Dave Leigh, Consulting Systems Analyst
Cratchit.org

Peter da Silva

unread,
Jul 4, 2003, 5:32:30 PM7/4/03
to
Cool, it's been almost ten years since the last time we did this!

In article <10573485...@palladium.transmeta.com>,
Linus Torvalds <torv...@osdl.org> wrote:

> > But I always thought one of the main reasons for chasing the
> > ever-increasing hardware performance curve was to make it possible
> > to write code in a high-level manner, and have it run fast enough
> > to be useful.

> Why would you ever do that for an operating system, though?

Linus, you're already doing it for your operating system. How much do you
think you could speed up Linux by getting rid of memory protection and
multiuser protection? Your system call overhead could go down from
microseconds to nanoseconds; the Amiga "system call" was four instructions
long.

You put up with the performance hit because you get a corresponding
benefit. I wouldn't go back to the Amiga now, I like it when a wild pointer
only takes out one process instead of the whole system. But I don't
pretend that there isn't a cost to go along with this benefit.

The same thing can be said for all the different layering mechanisms
used inside the kernel. We had 10 users running on a single 12 MHz 286 with
1M of RAM under Xenix... but we only had one file system, one network stack,
and a hung tape drive locked up the whole system and you had to reboot. If
you wanted to repartition a drive, you had to rebuild the kernel because
all the partition tables were hardcoded.

So the question isn't "why would you do that for an operating system", the
answer to that is easy: "because it makes the OS more stable, and makes driver
development easier, and allows you to expand what a non-root user is allowed
to do". The question is, "how much overhead are we talking about" or "what
are the tradeoffs" or "where would you start?"...

> Also, what's the point of writing low-level code in a high-level manner?

What's the point of writing the OS in C instead of assembly?

> High-level code hides the details and does a lot of things automatically
> for the programmer, but in an OS you are constrained by hardware and
> security issues, and a lot of the time you absolutely MUST NOT hide the
> details.

Every internal API inside the kernel "hides the details". Do you want to have
to duplicate all the code that interacts with a file system anywhere in the
kernel because you want to know the details?

> Leave the OS be. Put your abstractions on top of the solid ground
> of an OS that performs well.

Loadable drivers.
File systems.
Pseudodevices.
Layered file systems.

On the Amiga, where "inside the kernel" you had high level abstractions for
all the interfaces (a file system, for example, was just a program that
registered itself as a file system and accepted file system messages), people
were producing some amazingly cool and *clean* system-level objects... and
they didn't even have OS source available!

> > I always hoped these benefits would be available not only to
> > applications programmers, but to us systems programmers as well.

> If you want OS protection, you work outside the OS. It's that simple.

It's not that simple. What's "the OS" anyway?

> In short: when I get a faster machine, I'd rather use that extra speed
> to make the machine more useful to me, than waste it on stuff where it
> doesn't help and where it makes no sense.

That's why I'd like to have better abstractions for doing OS-level stuff...
because it does make sense.

Peter da Silva

unread,
Jul 4, 2003, 5:41:59 PM7/4/03
to
In article <1fxkjma.ms73qb1r8c0nqN%pr...@panix.com>,

Walter Bushell <pr...@panix.com> wrote:
> > So... replace that with a 16 MHz 68030 and 2M of RAM, and you still have
> > to wonder why a Pentium 4 can't manage it.

> Four hysterical raisins. Once you start going down that route, the old
> programs do their own windowing, so the new ones have to be compatible,
> and it gets into the culture.

I'm not sure I understand what this refers to. The Amiga window system had
a much cleaner and clearer distinction between what the windowing system
was responsible for and what the application was responsible for than just
about any windowing system I've seen anywhere else, except maybe NeWS.

It didn't get its performance by making programs do their own windowing at
all. And the special chips weren't the limiting factor that Apple thought:
most applications didn't use them directly, and (as the text trick I mentioned
illustrates) didn't even know whether they were or not.

The real problem the Amiga had was that they didn't make the hooks they put
in to support virtual memory and protected memory clear, so most programs
neglected to (for example) allocate local data from PRIVATE memory and
shared data like message buffers from PUBLIC memory... they typically
used all one or all the other, so they were never able to implement private
protected (and swappable) memory.

Which would have forced them to go through an API change in the '90s even if
they hadn't been killed by lawyers.

jonah thomas

unread,
Jul 4, 2003, 5:57:13 PM7/4/03
to
Linus Torvalds wrote:
> Douglas H. Quebbeman wrote:

>>But I always thought one of the main reasons for chasing the
>>ever-increasing hardware performance curve was to make it possible
>>to write code in a high-level manner, and have it run fast enough
>>to be useful.

> Why would you ever do that for an operating system, though?

> When you make your own application slower, that's _your_ problem, and
> the rest of the world doesn't really mind. If they find your application
> useful enough, they'll use it. And if it's too slow, they might not. Not
> everybody can just buy faster hardware.

> However, when you make your OS slow, you make _everything_ slow.

> Also, what's the point of writing low-level code in a high-level manner?
> High-level code hides the details and does a lot of things automatically
> for the programmer, but in an OS you are constrained by hardware and
> security issues, and a lot of the time you absolutely MUST NOT hide the
> details.

There's an OS called MenuetOS that I think would be right up your alley.
Written entirely in assembly. Very fast, simple....

Oh wait. You're *Linus Torvalds*. You already have an OS. Never mind.

Peter da Silva

unread,
Jul 4, 2003, 6:03:58 PM7/4/03
to
In article <vgbsi0o...@corp.supernews.com>,

Dave Leigh <dave....@cratchit.org> wrote:
> Peter da Silva wrote on Friday 04 July 2003 16:15 in message
> <be4qpv$1cg$1...@jeeves.eng.abbnm.com>:
> > I know where I want my hardware performance to go.

> > I want my user interface to have real-time ray-tracing with a
> > radiosity illumination model and realistic physics.

> To each his own, so don't take this as criticism, but.... why?

Because I think in three dimensions, and I want to take advantage of three
dimensions to organise my computer environment, and I want the user
interface to manage all these relationships so the application doesn't
need to know more than, perhaps, how high a level of detail it needs to
build its input to the rendering engine at... so that you get a uniform user
interface as well as a three-dimensional one.

That's why I want my user interface to have real-time 3-d rendering.

On top of that: I am a mammal, a primate, a predator, and I have a complex
visual system that can extract information from subtle cues. That means I
want the user interface to be rendered as realistically as possible, so those
cues mean "there's something you want to look at here" and not "the raycasting
code doesn't quite handle this shadow properly, so there's this weird flicker
when that log window and that mail reader are too close to each other".

Sander Vesik

unread,
Jul 4, 2003, 6:56:52 PM7/4/03
to
In comp.arch Peter da Silva <pe...@abbnm.com> wrote:
> In article <01c3422b$dec204f0$b2ecffcc@Shadow>,
> Douglas H. Quebbeman <Do...@IgLou.com> wrote:
>> But I always thought one of the main reasons for chasing the
>> ever-increasing hardware performance curve was to make it possible
>> to write code in a high-level manner, and have it run fast enough
>> to be useful. Getting away from the bit-twiddling and relying on
>> higher-level constructs makes it possible for us to capitalize
>> on our past efforts more efficiently. Each new layer brings new
>> metaphors that permit programmers to get their tasks done faster.
>
> I know where I want my hardware performance to go.
>
> I want my user interface to have real-time ray-tracing with a
> radiosity illumination model and realistic physics.

Me too. Or rather, I want customisable physics. If I want to float,
I should be able to.

>
> None of this wimpy OpenGL phong-shaded z-buffering approximation.

Well, actually, it has taken us horribly long to get to the point
where mainstream hardware does / can do phong shading...

>
> Quartz Extreme? Not near extreme enough for me, Apple, but thanks
> for the translucent Terminal windows, it's a start.
>

--
Sander

+++ Out of cheese error +++

Dave Leigh

unread,
Jul 4, 2003, 8:10:11 PM7/4/03
to
Peter da Silva wrote on Friday 04 July 2003 17:03 in message
<be4tke$2pv$6...@jeeves.eng.abbnm.com>:

OK, now I see where you're coming from. I'll note in passing that while I'm
also a mammal, a primate, and acutely visual, I'm an omnivore, so perhaps
some of our differences of opinion can be chalked up to my more varied
diet.

I once worked with a fellow who was working on an interface to a virtual
art gallery. It sounded cool in theory; you'd be able to walk through the
gallery and see the sculptures and paintings on the walls, just as you
would through a real gallery. What he forgot is that the value of an
interface is the degree to which it can _improve_ upon reality, not the
degree to which it can _mimic_ it. In actual use the idea sucked (even
ignoring such things as lack of raytracing and limited resolution), and
improvements like 'teleportation' were at best marginal.

A computer desktop is another example. If someone were to provide a virtual
interface that attempted to mimic my real desktop or work area I'd very
likely not use it no matter how pretty raytracing was. My real desk is
an unorganized mess. On the other hand, my computer desktop is always well
organized in large part due to its limitations. Personally, I think there
are times when even the limited 3D model of "windows on top of windows" is
overly complicated to use. So, I tend to use virtual desktops to spread
things out. (Apple recently addressed the same problem in the Panther
release of OS X with the Exposé feature.)

Again, I'm not criticizing the _idea_ of a 3D environment. The point of
this post is, while I see that there are situations where such an interface
would indeed be valuable, I'd ask you to remember as you conceptualize and
design that just because pigs exist in the real world there is no reason to
adopt one as your interface. Improve it.

Justin R. Bendich

unread,
Jul 4, 2003, 8:23:34 PM7/4/03
to
"Peter da Silva" <pe...@abbnm.com> wrote in message
news:be4qpv$1cg$1...@jeeves.eng.abbnm.com...

> --
> #!/usr/bin/perl
> $/="%\n";chomp(@_=<>);print$_[rand$.]
>
> Peter da Silva, just another Perl poseur.

Did you used to have an ASCII cat in your .sig?
I miss it, even if it wasn't yours...

Justin


Linus Torvalds

unread,
Jul 4, 2003, 10:30:16 PM7/4/03
to
Nick Maclaren wrote:
>>
>>Why would you ever do that for an operating system, though?
>
> To increase its extensibility by a factor of 10, its debuggability
> by a factor of 100, its reliability by a factor of 1,000 and its
> security by a factor of 10,000. No, I am NOT joking - those numbers
> are achievable.

If you're not joking, you have your work cut out for you. The proof is
in the pudding, and I hereby claim that you're seriously naïve if you
truly believe that.

But hey, prove me wrong. I'm pragmatic: I'd happily be proven wrong
by somebody actually showing something. It sounds like so much hot
air to me so far, though.

> Just for the record, Linux is at least as good as most commercial
> systems in those respects, so I am talking generically.

Completely ignoring all performance issues, I will tell you why you're
wrong on all counts: it's a hell of a lot more difficult to write, debug
and validate communications than it is to do the same for a monolithic
program.

In short, you're making the _classic_ microkernel mistake. You are
comparing apples to oranges. Your argument goes like this:

- by making the filesystem an independent entity, it becomes simpler
than the full OS would become, and as such it is more easily written
and debugged.

This is a stupid and completely illogical comparison. You compare one
subsystem to the whole, and you think the comparison is valid. Yes, the
filesystem itself might be easier to write THAN THE WHOLE OS, and it
might be easier to debug THAN THE WHOLE OS. But do you see the fallacy?

The system as a whole actually got _harder_ to debug, because when you
split the thing up, you added a layer of communication and you _removed_
the possibility to trivially debug the two parts together as one.

You hamstrung the parts by not allowing them to share data and thus you
introduced the problem of data coherency that didn't even exist in the
original design because the original design didn't need them. The original
design could share locks and data freely.

To be blunt: have you tried debugging deadlocks and race conditions in
threaded applications that use multiple address spaces?

Put it another way: have you tried debugging asynchronous systems
with complex interactions? It's not as simple as debugging one part
on its own. A lot of the problems only show up as emergent behaviour.

For example, one of the most interesting parts of filesystem behaviour
is the handling of low-memory situations and the shrinking of the caches
involved. You can avoid it by never caching anything in the filesystem,
but that tends to be a bad idea.

My favourite analogy is one of the brain. The complexity of the brain
is _bigger_ than the sum of the complexity of each neuron. Because the
_real_ complexity is not in the neurons themselves (and no, they aren't
exactly trivial, but some people think they understand them reasonably
well), but in the patterns of interactions that happen: the feedback
cycles that keeps the activity in balance.

This is the kind of complexity you really really don't want to debug.
It's wonderful when it works, but part of why it's so wonderful is the
fact that it's almost impossible to understand _how_ it really works,
and when things go wrong you have a really hard time fixing them.

So when you say that message passing increases debuggability by two
orders of magnitude, I laugh derisively in your direction.

I claim that you want to start communicating between independent modules
no sooner than you absolutely HAVE to, and that you should avoid splitting
things up until you really need to, because that communication complexity
often swamps the complexity of the actual pieces involved in it.

Linus

Linus Torvalds

unread,
Jul 4, 2003, 11:19:49 PM7/4/03
to
Peter da Silva wrote:
>
> Linus, you're already doing it for your operating system. How much do you
> think you could speed up Linux by getting rid of memory protection and
> multiuser protection? Your system call overhead could go down from
> microseconds to nanoseconds, the Amiga "system call" was four instruction
> long.

No. That's a user-visible API thing, and as such unacceptable. If the
OS doesn't give protection to the user programs, it is not in my opinion
in the least interesting.

So what you are suggesting doesn't make sense. It's like getting the
wrong answer - it doesn't make for a faster system, simply because
the system is now no longer doing the right thing.

However, _within_ the OS, protection doesn't buy you much. We have
some internal debugging facilities that slow things down enormously
if you enable them, but they are literally only meant for debugging,
and they play second fiddle to the design.

And NOT having protection within the kernel itself is actually a
huge win. And while performance is important, the big win is that
without protection you can simply solve a lot of problems in a much
more straightforward manner.

> You put up with the performance hit because you get a corresponding
> benefit.

Clearly any engineering problem always ends up being a cost/benefit
analysis. That's what makes it engineering, not science.

So in that sense what you're saying is a tautology.

At the same time, I claim that you're just wrong. Because this is one of
the areas where it is NOT a question of cost vs benefit, but one of the
few areas where it is a question of simple user requirements. Not having
protection between programs and users is simply not an option.

> The same thing can be said for all the different layering mechanisms
> used inside the kernel.

Actually, in almost all cases the benefit of layering in the kernel is
- better performance
- less duplication

Really. Layering done right makes it _easier_ to write good code, and
layering should never be a performance issue unless it is badly
designed.

STL is an example of good layering design (in C++): it's very much
designed so that the compiler at least in theory - and reasonably often
in practice too - can do the right thing with little or no performance
downside.

Similarly, the examples of layering that Linux uses extensively (starting
from the use of C over assembly, to having various ground rules on how to
write architecture-neutral code, to having things like a VFS layer that
handles most of the common code in filesystems) literally do improve
performance. If they didn't, they'd be badly designed.

For example, the VFS layer gives filesystems a generic notion of a page
cache for maintaining caches of the file (and directory) contents. Yes,
the low-level filesystems could do it themselves, but not only does this
layering avoid duplicated work, it actually improves performance, because
it means that we have a _global_ cache replacement policy that very much
outperforms something that could only work on one filesystem at a time.
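
As a rough sketch of that kind of layering (invented names, not the
actual Linux VFS), the upper layer reaches each concrete filesystem
through a fixed table of function pointers, so the abstraction costs
one indirect call and nothing more:

/* Layering via an operations table: common code lives above the table,
 * each "filesystem" fills in the entries.  Names are invented. */
#include <stdio.h>
#include <string.h>

struct fs_ops {
    long (*read)(const char *name, char *buf, long len);
    long (*write)(const char *name, const char *buf, long len);
};

/* One trivial "filesystem": every file reads back a fixed string. */
static long demo_read(const char *name, char *buf, long len)
{
    (void)name;
    const char data[] = "contents";
    long n = (long)sizeof data < len ? (long)sizeof data : len;
    memcpy(buf, data, (size_t)n);
    return n;
}

static long demo_write(const char *name, const char *buf, long len)
{
    printf("write %ld bytes to %s: %.*s\n", len, name, (int)len, buf);
    return len;
}

static const struct fs_ops demo_fs = { demo_read, demo_write };

/* The "VFS" layer: argument checking, caching policy and other common
 * code would live here, identical for every filesystem underneath. */
static long vfs_read(const struct fs_ops *ops, const char *name,
                     char *buf, long len)
{
    if (len <= 0)
        return 0;
    return ops->read(name, buf, len);   /* one indirect call */
}

int main(void)
{
    char buf[32];
    long n = vfs_read(&demo_fs, "example", buf, sizeof buf);
    printf("read %ld bytes: %s\n", n, buf);
    demo_fs.write("example", "hi", 2);
    return 0;
}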

Similarly, when I rewrote the basic VM subsystem for the first Linux port
to the alpha and had to virtualize the page tables, I actually ended up
improving performance even on x86. Why? Because the layering itself didn't
add any overhead (trivial macros and inline functions used to hide the
differences), but by being done right, it made the code easier to follow,
and actually made some bad decisions in the original code clear.

The same goes for the choice of a C compiler over hand-written assembly.
Nobody sane these days claims that handwritten assembly will outperform
a good compiler, especially if the code is allowed to use inline asms
for stuff that the compiler can't handle well natively.

See? It's a total and incorrect MYTH that layering should be bad for
performance. Only _bad_ layering is bad for performance.

This is why I rage against silly people who try to impose bad layering.
It's easy to recognize bad layering: when it results in a clear performance
problem, it is immediately clear that the layering was misdesigned.

And in the end, that is my beef with microkernels. They clearly
_are_ badly designed, since the design ends up not only having
fundamental performance problems, but actually makes a lot of things
more complex as opposed to less. QED.

> So the question isn't "why would you do that for an operating system", the
> answer to that is easy: "because it makes the OS more stable, and makes
> driver development easier, and allows you to expand what a non-root user
> is allowed to do". The question is, "how much overhead are we talking
> about" or "what are the tradeoffs" or "where would you start?"...

No. I claim that trying to split the filesystems out of the low-level
kernel is a layering violation, because it is clearly a case of bad
layering.

And performance is part of it, but so is (very much) the fact that it
also makes it much harder to maintain coherent data structures and graceful
caches.

>> Also, what's the point of writing low-level code in a high-level manner?
>
> What's the point of writing the OS in C instead of assembly?

Your argument is nonsense.

I very much point to the use of C instead of assembly as a performance
IMPROVING thing, and thus clearly good layering. I claim that a good
C compiler will outperform a human on any reasonably sized project,
and that the advantages are obvious both from a performance and a
maintenance standpoint.

>> Leave the OS be. Put your abstractions on top of the solid ground
>> of an OS that performs well.
>
> Loadable drivers.
> File systems.
> Pseudodevices.
> Layered file systems.

None of these are even visible in a performance analysis.

I don't much like loadable drivers myself, and I don't use them, but nobody
has ever actually shown the theoretical downside to matter in practice
(because they are loaded into the virtual kernel address space rather than
the 1:1 mapping that the main kernel uses, they have somewhat worse ITLB
behaviour in theory).

The others do not have any performance problems, and as mentioned, things
like the FS virtualization actually have performance benefits.

> On the Amiga, where "inside the kernel" you had high level abstractions
> for all the interfaces (a file system, for example, was just a program
> that registered itself as a file system and accepted file system
> messages), people were producing some amazingly cool and *clean*
> system-level objects... and they didn't even have OS source available!

You're not arguing against my point at all.

Good abstraction has zero performance penalty, and can make the system
much more pleasant to work with. And I claim that this is the DEFINITION
of good abstraction.

Microkernels and message passing fail this definition very badly. (The
Amiga had a "sort of" message passing, but since in real life it was
nothing but a function call with a pointer, that was message passing on
the _abstract_ level, with no performance downside. That's fine as an
abstraction, but it obviously only works if there is no memory protection.)
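
The shape of that kind of "message passing" - not the actual Amiga API,
just an illustration - is nothing more than handing a struct pointer to a
function:

/* Not the Amiga API - just the shape of the idea.  "Passing a message"
 * is handing a pointer to a struct to the receiver; nothing is copied,
 * so the abstraction costs nothing - and also protects nothing. */
struct msg {
        int   type;
        void *payload;
};

static int fs_handle(struct msg *m)       /* the "filesystem" receiver */
{
        /* operates directly on the sender's data via the pointer */
        return m->type;
}

static int send_msg(struct msg *m)        /* the "message port" */
{
        return fs_handle(m);               /* really just a call */
}

int main(void)
{
        struct msg m = { 42, 0 };
        return send_msg(&m) == 42 ? 0 : 1;
}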

Linus

Peter da Silva

unread,
Jul 5, 2003, 1:34:17 AM7/5/03
to
In article <vgc2ue9...@corp.supernews.com>,

Dave Leigh <dave....@cratchit.org> wrote:
> OK, now I see where you're coming from. I'll note in passing that while I'm
> also a mammal, a primate, and acutely visual, I'm an omnivore, so perhaps
> some of our differences of opinion can be chalked up to my more varied
> diet.

An omnivore is still a predator.

> I once worked with a fellow who was working on an interface to a virtual
> art gallery. It sounded cool in theory; you'd be able to walk through the
> gallery and see the sculptures and paintings on the walls, just as you
> would through a real gallery.

There are two big problems with this:

1. Performance needs to improve by a few orders of magnitude
first.

2. Metaphors are intended to cast light on a problem, but the
metaphor should never substitute for the problem.

I see you're going to spend quite a lot of words pointing this latter
point out to me good and hard. Rather than address your message point
by point I'll just nod my head for a while, OK? I already know that
bit. I'm not talking about mimicking anything exactly in this 3d
environment. I don't have more than the vaguest idea what the best 3d
metaphors for events going on in the computer will be. I do, however,
think we need to find out.

My computer desk is only well organised because I can tell the computer
to take care of the small stuff. I can do all kinds of things on my
computer desktop I can't do on my real desktop, like telling the
computer to automatically stack and lay out windows for me. I can't
wait to see what it will do when I have it apply the same operations to
files and directories in 3d.

On Mac OS X, I can't do that: Apple's window manager sucks by
comparison with the one I use on FreeBSD (and which is, ironically,
based on the look and feel of an earlier version of what became Mac OS
X). And I don't care much for the Exposé family of operations. Instead,
give me tools to group windows into projects. In 2d I use virtual
desktops. In 3d I will be able to watch the "virtual desktops" I'm not
using, because they'll be off in the distance. The "rotating cube"
effect they have in Panther? That will just be one of the things you
can do with your own sets of 3d windows, or components, or whatever.

So, no, I don't want a full bore literalist VR interface. I just think
the same kinds of tools you'd build to create that will be useful for
less literal general user interfaces... again, because we've evolved to
be able to use that kind of thing effectively.

It's like I'm talking "you know, it'd be really nice if we could build
something like a carriage that didn't need a horse" and you're telling
me that a horse is a bad design for a mechanical carriage-pulling
engine and that I probably shouldn't use reins for steering. Well, yeh,
that doesn't mean a "horseless carriage" that was a little less literal
a device wouldn't be useful.

Peter da Silva

unread,
Jul 5, 2003, 1:37:17 AM7/5/03
to
In article <aUoNa.45298$xg5....@twister.austin.rr.com>,

Justin R. Bendich <bend...@austin.rr.com> wrote:
> Did you used to have an ASCII cat in your .sig?

Not exactly.

--
I've seen things you people can't imagine. Chimneysweeps on fire over the roofs
of London. I've watched kite-strings glitter in the sun at Hyde Park Gate. All
these things will be lost in time, like chalk-paintings in the rain. `-_-'
Time for your nap. | Peter da Silva | Har du kramat din varg, idag? 'U`

Steve O'Hara-Smith

unread,
Jul 5, 2003, 3:07:04 AM7/5/03
to
On Fri, 4 Jul 2003 21:15:43 +0000 (UTC)
pe...@abbnm.com (Peter da Silva) wrote:

PDS> I want my user interface to have real-time ray-tracing with a
PDS> radiosity illumination model and realistic physics.

Wimp! Real time colour animated holography please.

--
C:>WIN | Directable Mirrors
The computer obeys and wins. |A Better Way To Focus The Sun
You lose and Bill collects. | licenses available - see:
| http://www.sohara.org/

jmfb...@aol.com

unread,
Jul 5, 2003, 5:51:11 AM7/5/03
to
In article <BB2B1F5D...@0.0.0.0>, ehr...@his.com (Edward Rice) wrote:
>In article <01c3422b$dec204f0$b2ecffcc@Shadow>,
>"Douglas H. Quebbeman" <Do...@IgLou.com> wrote:
>
> > I think most of us care about performance.
> >
> > But I always thought one of the main reasons for chasing the
> > ever-increasing hardware performance curve was to make it possible
> > to write code in a high-level manner, and have it run fast enough
> > to be useful. Getting away from the bit-twiddling and relying on
> > higher-level constructs makes it possible for us to capitalize
> > on our past efforts more efficiently...
>
>To bring up the obvious counter-example, Doug, the weather people and the
>nuclear people and the geology people, the imaging people and demented
>mathematicians like myself (and probably some other people, too)

Chemists.

> .. want new
>machines to pour out raw speed because the basic codes they want to run are
>pretty simple, and it's much easier to re-code for an ultra-fast
>architecture running on ultra-fast hardware than it is to somehow wring
>another percent or two from code that's been optimized for five years.
>
>However, if I were still in the business of /producing/ code, I'd just nod
>my head and dismiss those idjits as outliers on the curve and be in almost
>full agreement with you. I look at what "productivity" meant on Multics
>boxes, the time it took to do a compile and the limitations we had on
>produceable tools because some things just took too long to develop or were
>likely to take too long to run... and then I think of what joy some of
>those projects would be on boxes of today's speed. YOW! I can think of
>one particular analysis task that would have turned into a nothing project
>(rather than being abandoned as too costly) if we'd had today's horsepower
>to throw against it.

One of the reasons I decided to continue to work on TOPS-10 was
so more people had more CPU cycles available to them at all times.
I envisioned that a lot more things would be accomplished in the
sciences if calculations could be off-loaded onto a computer,
especially if the computing could be completed in CPU-seconds
rather than CPU-months...or years.

I heard about a program session run by chemists, stand-alone, that was
in its 20th day of a 30-day run when a very bad, power-stopping
storm visited the area. The calculations done these days wouldn't
have been considered back in the 70s or 80s.

You needed Cray power, and the waiting line was long and fraught with
many strainers that a user had to overcome before getting the computes.

/BAH

Subtract a hundred and four for e-mail.

Nick Maclaren

unread,
Jul 5, 2003, 9:15:09 AM7/5/03
to

Linus Torvalds <torv...@osdl.org> wrote:
> Nick Maclaren wrote:
> >>
> >>Why would you ever do that for an operating system, though?
> >
> > To increase its extensibility by a factor of 10, its debuggability
> > by a factor of 100, its reliability by a factor of 1,000 and its
> > security by a factor of 10,000. No, I am NOT joking - those numbers
> > are achievable.
>
> If you're not joking, you have your work cut out for you. The proof is
> in the pudding, and I hereby claim that you're seriously naive if you
> truly believe that.
>
> But hey, prove me wrong. I'm pragmatic: I'd happily be proven wrong
> by somebody actually showing something. It sounds like so much hot
> air to me so far, though.

Well, from your well-known viewpoint and this posting, I may have a job
convincing you, but I stand by the above. Here are references to
examples:

Extensibility by a factor of 10 has often been done, especially in the
better research systems of the 1980s, which arguably achieved factors
of 100 or more.

Debuggability by a factor of 100 has rarely been delivered in operating
systems, but factors of 10 have been achieved without trying hard. And evidence from
complex applications of the 1960s and 1970s is that current systems
could easily be improved by a factor of more than 100 - I could explain
how, if you are interested.

Reliability by a factor of 1,000 has been done in the 'embedded' area,
sometimes for very large systems.

Security by a factor of 10,000 is harder to prove, but look at the
capability systems of the 1980s. My assertion (based on how often I
trip across security exposures by accident) is that, if current systems
were attacked by hackers of the calibre of the 1970s and 1980s, they
would last less than an hour. Well, my record is 15 minutes, but I
don't claim to have major skills in that area :-)

> > Just for the record, Linux is at least as good as most commercial
> > systems in those respects, so I am talking generically.
>
> Completely ignoring all performance issues, I will tell you why you're
> wrong on all counts: it's a hell of a lot more difficult to write, debug
> and validate communications than it is to do the same for a monolithic
> program.

That is true, but YOU are making the classic mistake of confusing the
difficulty of building a half-working system with that of an extremely
reliable one. Take a look back at the sections of my posting that you
snipped, and notice that my OBJECTIVES are different from yours. In
particular, I want systems with MASSIVE improvements in the areas I
mentioned - I get the impression you regard those as secondary to
performance.

Neither viewpoint is wrong, out of context. I assert that my viewpoint
is a little more realistic, but doubtless you differ :-)

> In short, you're making the _classic_ microkernel mistake. You are
> comparing apples to oranges. Your argument goes like this:
>
> - by making the filesystem an independent entity, it becomes simpler
> than the full OS would become, and as such it is more easily written
> and debugged.

No, I did not say that. Nor did I mean it. Merely dividing up code
does nothing useful.

> This is a stupid and completely illogical comparison. You compare one
> subsystem to the whole, and you think the comparison is valid. Yes, the
> filesystem itself might be easier to write THAN THE WHOLE OS, and it
> might be easier to debug THAN THE WHOLE OS. But do you see the fallacy?

Of course, and I did back around 1970 when I was flamed by the
'structured programming' people for saying the same thing as you are
saying today.

> The system as a whole actually got _harder_ to debug, because when you
> split the thing up, you added a layer of communication and you _removed_
> the possibility to trivially debug the two parts together as one.

That is true IF all that you do is to split them apart. I am talking
about starting by designing the interfaces, and treating the interfaces
as primary. You can THEN prove or debug all three parts in isolation
(yes, three, because the interfaces are one part).

> Put it another way: have you tried debugging asynchronous systems
> with complex interactions? It's not as simple as debugging one part
> on its own. A lot of the problems only show up as emergent behaviour.

Yes. I have spent many decades doing it, often without the benefit of
either source or diagnostic tools. My success rate is not high, but it
is often comparable to the people who do it with access to the source.

> So when you say that message passing increases debuggability by two
> orders of magnitude, I laugh derisively in your direction.

Since I never said that, and never meant it, I suggest that you are
tilting at windmills. There are NO advantages to message passing AS
SUCH, but there are some with what it gives you. But that is all
irrelevant, as I am not talking about message passing anyway!

What I am talking about is carefully designed, structured interfaces,
which may (or may not) involve the passing of messages. With any
large program, requiring the use of "front doors" for all communication
costs in performance but, IF DONE RIGHT, it can deliver all of the
advantages I mention above. If done wrong, of course, it merely costs
performance :-)

> I claim that you want to start communicating between independent modules
> no sooner than you absolutely HAVE to, and that you should avoid splitting
> things up until you really need to, because that communication complexity
> often swamps the complexity of the actual pieces involved in it.

OK. So why can't I get access to the kernel data structures from my
program? I really would like to be able to place my data structures
for best effect, control the TLBs and so on. I could get MUCH better
performance under many circumstances that way!


Regards,
Nick Maclaren.

Chris Hedley

unread,
Jul 5, 2003, 9:35:51 AM7/5/03
to
According to Peter da Silva <pe...@abbnm.com>:

> In article <aUoNa.45298$xg5....@twister.austin.rr.com>,
> Justin R. Bendich <bend...@austin.rr.com> wrote:
> > Did you used to have an ASCII cat in your .sig?
>
> Not exactly.

The Motörhead mascot, then?

Chris.
--
"If the world was an orange it would be like much too small, y'know?" Neil, '84
Currently playing: random early '80s radio stuff
http://www.chrishedley.com - assorted stuff, inc my genealogy. Gan canny!

Anne & Lynn Wheeler

unread,
Jul 5, 2003, 10:02:47 AM7/5/03
to
jmfb...@aol.com writes:
> One of the reasons I decided to continue to work on TOPS-10 was
> so more people had more CPU cycles available to them at all times.
> I envisioned that a lot more things would be accomplished in the
> sciences if calculations could be off-loaded onto a computer,
> especially if the computing could be completed in CPU-seconds
> rather than CPU-months...or years.
>
> I heard about a program session that run by chemists, stand-alone,
> and was in its 20th day of the 30-day run when a very bad, power-stopping
> storm visited the area. The calculations done these days wouldn't
> have been considered back in the 70s or 80s.
>
> You needed Cray power and waiting line was long and fraught with
> many strainers that a user had to overcome before getting the computes.

pasc had a program that would be in the queue for sjr's 195 for 3
months before being run. they found that they could schedule it in the
background on pasc's 370/145 ... where it wouldn't get a lot of 1st
shift cpu, but could catch up on 2nd, 3rd and 4th (weekends) ... and
complete in little less than 3 months. of course they did regular
checkpoints.

sjr's 195 also ran some amount of air bearing simulation for disk
floating head technology (again weeks long queue). when we bullet proofed
input/output supervisor for the disk engineering (bldg 14) and disk
product product test lab (bldg 15)
http://www.garlic.com/~lynn/subtopic.html#disk

a lot of the disk test cell testing got converted from stand alone
testing to running in an operating system environment, and so could have
half a dozen or a dozen test cells working simultaneously (instead
of one-at-a-time dedicated time). the other benefit was that even with
multiple test cell load ... the cpu load was rarely more than a couple
percent ... so in addition to improving test cell work by several
hundred percent ... it opened up significant cpu processing to disk
engineering.

The product test lab's 3033 had about the same cpu thruput as sjr's
195 for most workloads .... 195 really took off when you could get
instruction loops within 63 instructions. opening up several of the
processors required for dedicated disk I/O testing to selected CPU
intensive use, significantly aided things like the air bearing work for
floating disk heads.

--
Anne & Lynn Wheeler | ly...@garlic.com - http://www.garlic.com/~lynn/
Internet trivia, 20th anniv: http://www.garlic.com/~lynn/rfcietff.htm

Tom Van Vleck

unread,
Jul 5, 2003, 10:10:11 AM7/5/03
to
Linus Torvalds <torv...@osdl.org> wrote:

> In particular, filesystems are a hell of a lot more important
> than message passing - you're better off implementing message
> passing on top of a filesystem than you are the other way
> around.

That was the model I was used to after working on CTSS and
Multics. File system inside the supervisor, close to the
hardware.

Then in 1981 I joined Tandem Computers, which had a very
different architecture. The Tandem kernel was minimal, and
provided little more than processes and messages. Many
traditional operating system functions were in a per-CPU monitor
process. And the file system was in its own process, called the
"disk process." Performance for transaction processing was good
enough to win benchmarks, and extensibility and fault tolerance
were excellent.

Joel Bartlett's very fine paper describing the Tandem kernel is
online:
http://www.hpl.hp.com/techreports/tandem/TR-81.4.html

(Over time this elegant architecture got strained a bit. The disk
process contained not only raw disk management but also a tree
structured file system, access methods such as indexed sequential
and entry sequenced, file and record locking, transaction
protection including before/after image journaling, and a
complete SQL implementation; packing all that complexity into one
process made it hard to evolve and improve.)

What I concluded from experience with the Tandem systems is that
there is more than one good way to build operating systems,
depending on what you're trying to optimize.

Robert Myers

unread,
Jul 5, 2003, 10:11:21 AM7/5/03
to
On 5 Jul 2003 13:15:09 GMT, nm...@cus.cam.ac.uk (Nick Maclaren) wrote:
>
>
>OK. So why can't I get access to the kernel data structures from my
>program? I really would like to be able to place my data structures
>for best effect, control the TLBs and so on. I could get MUCH better
>performance under many circumstances that way!
>
This dialogue provoked the google search

linux "kernel hooks"

I think I'll go play.

RM

Peter da Silva

unread,
Jul 5, 2003, 11:19:38 AM7/5/03
to
In article <20030705101011...@multicians.org>,

Tom Van Vleck <th...@multicians.org> wrote:
> What I concluded from experience with the Tandem systems is that
> there is more than one good way to build operating systems,
> depending on what you're trying to optimize.

Definitely. I'm frustrated by the difficulty of finding easily available
and useful implementations of operating systems that aren't based one way
or the other on the same kernel described in the Lions book.

I had some hope that something would come out of the Amiga-QNX exchange,
but unfortunately the curse on the Amiga is still working.

Peter da Silva

unread,
Jul 5, 2003, 11:21:14 AM7/5/03
to
In article <20030705090704....@eircom.net>,

Steve O'Hara-Smith <ste...@eircom.net> wrote:
> On Fri, 4 Jul 2003 21:15:43 +0000 (UTC)
> pe...@abbnm.com (Peter da Silva) wrote:
> PDS> I want my user interface to have real-time ray-tracing with a
> PDS> radiosity illumination model and realistic physics.

> Wimp! Real time colour animated holography please.

I suspect that would require discovering new fundamental physical
principles.

Tom Van Vleck

unread,
Jul 5, 2003, 11:53:44 AM7/5/03
to
nm...@cus.cam.ac.uk (Nick Maclaren) wrote:

> OK. So why can't I get access to the kernel data structures
> from my program? I really would like to be able to place my
> data structures for best effect, control the TLBs and so on. I
> could get MUCH better performance under many circumstances that
> way!

By doing so, of course, you entangle the complexity of the Linux
kernel into the complexity of your program. If you view and
manipulate the kernel data structures directly, your application
becomes non-portable to any other platform. Your program becomes
an outboard part of the kernel, recompiled when it changes.. or
you introduce an interface on both sides, push more complexity
into the kernel, and still have the semantics of Linux implicit
in your use of the interface.

If we look at an application, there is some complexity that is
because of the problem we're trying to solve -- the quantum
calculation or whatever. Then the typical application has a
bunch of complexity devoted to concerns not delivered as results:
numerical stability, tolerating hardware faults, performance,
fitting into available hardware, power management, evolvability,
bug prevention, security. Ordering and trading off these
concerns can take more effort than solving the main problem, so
we often try to amortize this activity over multiple problems.

We tried not to let kernel data structures in Multics be visible
to user applications, so that they wouldn't grow dependencies on
them and prevent us from later redesign.

Nick Maclaren

unread,
Jul 5, 2003, 11:57:28 AM7/5/03
to
In article <20030705115344...@multicians.org>,

Tom Van Vleck <th...@multicians.org> wrote:
>nm...@cus.cam.ac.uk (Nick Maclaren) wrote:
>
>> OK. So why can't I get access to the kernel data structures
>> from my program? I really would like to be able to place my
>> data structures for best effect, control the TLBs and so on. I
>> could get MUCH better performance under many circumstances that
>> way!
>
>By doing so, of course, you entangle the complexity of the Linux
>kernel into the complexity of your program. If you view and
>manipulate the kernel data structures directly, your application
>becomes non-portable to any other platform. Your program becomes
>an outboard part of the kernel, recompiled when it changes.. or
>you introduce an interface on both sides, push more complexity
>into the kernel, and still have the semantics of Linux implicit
>in your use of the interface.

Absolutely. I do regard it as a good idea to provide information
calls, for advanced performance analysis, but that is all.

The paragraph above was a reductio ad absurdum on Linus's previous
ones, incidentally, and was and is not my viewpoint.


Regards,
Nick Maclaren.

Nick Maclaren

unread,
Jul 5, 2003, 12:33:31 PM7/5/03
to
In article <20030705101011...@multicians.org>,

Tom Van Vleck <th...@multicians.org> wrote:
>
>Then in 1981 I joined Tandem Computers, which had a very
>different architecture. The Tandem kernel was minimal, and
>provided little more than processes and messages. Many
>traditional operating system functions were in a per-CPU monitor
>process. And the file system was in its own process, called the
>"disk process." Performance for transaction processing was good
>enough to win benchmarks, and extensibility and fault tolerance
>were excellent.

That is very much the sort of model I was thinking of, and is close
to those used by the capability systems. There is absolutely no
reason for transfers (rather than connexions) not to go straight
from the application to the filing system process. The fact that
most current hardware and operating systems don't have appropriate
primitives is not fundamental, and they could easily be provided.
But just not starting from here ....

One thing that makes me despair of the industry is that almost all
modern systems have SERIOUSLY bad I/O performance. Unix was and is
optimised for very small, atomic transactions on a single, local
disk. Even the better ones make a complete dog's dinner of high
performance parallel I/O (especially remote I/O).

Yet I have been told by many, many vendors and many, many experts
that no fundamental change is possible because it would impact
performance! Bizarre.


Regards,
Nick Maclaren.

Linus Torvalds

unread,
Jul 5, 2003, 12:50:32 PM7/5/03
to
Nick Maclaren wrote:
>
> OK. So why can't I get access to the kernel data structures from my
> program? I really would be able to like to place my data structures
> for best effect, control the TLBs and so on. I could get MUCH better
> performance under many circumstances that way!

Bad example.

Because you _can_ get access to the TLB entries (well, the page tables,
or on processors with fixed entries like PPC, the BAT registers) from
your program. The interfaces are there exactly because some programs
literally do care that much, and care about things like specific physical
pages.

It's a pain to do, and very few programs care enough to use it. But
big databases want to control their TLB behaviour, so they have access
to what the kernel calls "hugetlb" - basically you can bypass a large
part of the VM by mapping a hugetlb area. It won't swap for you, and
it will have various architecture-defined limitations (ie on x86 the
area will have to be either 2M or 4M aligned/sized).
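
Using it from user space looks roughly like this (a sketch only - it
assumes a hugetlbfs mount at /mnt/huge and huge pages already reserved by
the admin; the path and sizes are made up):

/* Sketch: map a hugetlb area by mmap()ing a file on a hugetlbfs mount.
 * Path, filename and page size are illustrative only. */
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

#define HUGE_PAGE (4UL * 1024 * 1024)   /* 4M on non-PAE x86; 2M with PAE */

int main(void)
{
        int fd = open("/mnt/huge/scratch", O_CREAT | O_RDWR, 0600);
        if (fd < 0) { perror("open"); return 1; }

        /* length must be a multiple of the huge page size */
        char *p = mmap(NULL, HUGE_PAGE, PROT_READ | PROT_WRITE,
                       MAP_SHARED, fd, 0);
        if (p == MAP_FAILED) { perror("mmap"); return 1; }

        p[0] = 1;               /* touch it: backed by one huge TLB entry */
        munmap(p, HUGE_PAGE);
        close(fd);
        unlink("/mnt/huge/scratch");
        return 0;
}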

And other system programs need access to raw physical pages, for AGP
mapping and starting DMA from user space. The interfaces exist, it's
just that they tend to not be used very widely because they are clearly
not portable, and they are a pain to use and administer (all of these
have security issues).

The rule is: provide an abstraction, but if you really need to, be
willing to _break_ the abstraction.

Linus

Charles Richmond

unread,
Jul 5, 2003, 1:26:45 PM7/5/03
to
Edward Rice wrote:
>
> [snip...] [snip...] [snip...]

>
> However, if I were still in the business of /producing/ code, I'd just nod
> my head and dismiss those idjits as outliers on the curve and be in almost
> full agreement with you. I look at what "productivity" meant on Multics
> boxes, the time it took to do a compile and the limitations we had on
> produceable tools because some things just took too long to develop or were
> likely to take too long to run... and then I think of what joy some of
> those projects would be on boxes of today's speed. YOW! I can think of
> one particular analysis task that would have turned into a nothing project
> (rather than being abandoned as too costly) if we'd had today's horsepower
> to throw against it.
>
It is the business of today (and the future) to make the past
look ridiculous. But without all that hard work done in the
past, we would *not* be where we are today hardware-wise
and software-wise. Think of the past as technology's vestigial
tail...

--
+----------------------------------------------------------------+
| Charles and Francis Richmond richmond at plano dot net |
+----------------------------------------------------------------+

Linus Torvalds

unread,
Jul 5, 2003, 1:37:47 PM7/5/03
to
Linus Torvalds wrote:

> Bad example.
>
> Because you _can_ get access to the TLB entries (well, the page tables,
> or on processors with fixed entries like PPC, the BAT registers) from

> your program. [ ... ]

Actually, thinking about it, it's not a bad example.

Yes, you can control the TLB from user space to some degree, because we
can make the interfaces available for it. But no, you can't control the
placement of the rest of the kernel data structures to make sure that
you don't get any interaction between the kernel TLB accesses and your
own user TLB accesses.

That's a problem with separation in general: some things that might be
trivial if they weren't separate end up being very hard to do if you split
them up, because there are no sane interfaces for them. When you can't
just directly interface to the internal data structures, you sometimes
end up being screwed.

So the kernel ends up exporting _some_ functionality that is obvious
enough, but at some point you can't get any more because the interfaces
to export internals get too hairy and pointless.

Somebody (sorry, forget who) mentioned a Tandem "filesystem outside the
OS" example. But I bet that ended up putting the block IO layer inside the
filesystem, and didn't support very many alternate filesystems - once you
split it up that way, it ends up often being very hard to do things that
others find trivial, exactly because the split makes it hard to access state
on the "other side".

So without knowing the Tandem OS, I will just make a wild stab at guessing
what the split resulted in: the filesystem process maintained all caches
internally to itself, and the system probably had some interface to set
aside <n>% of memory for that filesystem process for caching. That's not
the only way to do things, but it's a fairly obvious approach, and the
alternatives would probably tend to get rather complicated.

This is why you don't want to split too early. At the same time you _do_
want to split at some point, and the point should be the one that has the
simplest requirements for the interfaces you provide.

I maintain that if you want to make something that looks like UNIX, the
split has to be at the system call interface level and nowhere else. Using
message passing at a lower level ends up being just plain stupid.

If you want to make a system that doesn't care about compatibility, you
may have more freedom. Of course, with that freedom comes the fact that
nobody actually wants to use your system, which gives you even _more_
freedom, since then you don't have to worry about those pesky users and
their real-life problems.

Linus

Peter Flass

unread,
Jul 5, 2003, 4:02:06 PM7/5/03
to
Linus Torvalds wrote:
>
> Douglas H. Quebbeman wrote:
> > But I always thought one of the main reasons for chasing the
> > ever-increasing hardware performance curve was to make it possible
> > to write code in a high-level manner, and have it run fast enough
> > to be useful.
>
> Why would you ever do that for an operating system, though?
[snip]

>
> However, when you make your OS slow, you make _everything_ slow.
>
> Also, what's the point of writing low-level code in a high-level manner?

Or in a high-level language, if you carry this argument far enough?
> The important part here is "independent". The concepts should be
> clearly above or below each other, not smushed together into a
> unholy mess of 'every single abstraction you can think of'.


>
> Leave the OS be. Put your abstractions on top of the solid ground
> of an OS that performs well.
>

> > I always hoped these benefits would be available not only to
> > applications programmers, but to us systems programmers as well.
>
> If you want OS protection, you work outside the OS. It's that simple.
>

Yes, but there's OS code and then there's OS code. Are device drivers
part of the OS? There has certainly been debate about where they
belong. There are parts of the OS that are beaten to death - scheduler,
memory manager, etc. These need to be as efficient and close to the
hardware as possible. Then there are parts of the OS kernel that aren't
heavily used - in general, why not just let them page out and not
worry, within reason, about squeezing the last byte and cycle out of
them. It's the old 80-20 rule.

Peter da Silva

unread,
Jul 5, 2003, 4:12:54 PM7/5/03
to
In article <10574266...@palladium.transmeta.com>,

Linus Torvalds <torv...@osdl.org> wrote:
> I maintain that if you want to make something that looks like UNIX, the
> split has to be at the system call interface level and nowhere else. Using
> message passing at a lower level ends up being just plain stupid.

You've got a couple of big assumptions here that don't necessarily follow:

1. That everyone wants the OS to look like UNIX.

This isn't necessarily so. For compatibility, we want to be
able to run UNIX software without excessive overhead, but
that doesn't mean the native API is going to be "PRM section
2". Counterexamples: BeOS, Windows NT with Interix, QNX.

2. That even in UNIX the split is static over time. It's not,
and never has been. System calls migrate between section 2
and section 3 all the time, and there have been some impressive
results produced by doing things like implementing read()
and write() using the logical equivalent of mmap(). It's not
at all clear that all the interfaces in the standard UNIX
layering *are* the most efficient ones.
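
The rough idea behind that last point, as a toy sketch (not how any real
libc or kernel does it - it ignores partial pages and assumes the offset
is page-aligned):

/* Toy sketch only: a read()-like call expressed as mmap plus memcpy.
 * Real implementations handle alignment, EOF and error cases; this just
 * shows the equivalence being talked about. */
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

ssize_t read_via_mmap(int fd, void *buf, size_t len, off_t off)
{
        /* off must be page-aligned for mmap */
        void *p = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, off);
        if (p == MAP_FAILED)
                return -1;
        memcpy(buf, p, len);
        munmap(p, len);
        return (ssize_t)len;
}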

And this can all be done without compromising compatibility. One of the
reasons that the UNIX layering is a good one is that, apart from one
or two calls, it can be implemented efficiently on top of a wide variety
of operating systems... though not necessarily with a 1:1 system-call
mapping.

Christopher Browne

unread,
Jul 5, 2003, 6:29:41 PM7/5/03
to
A long time ago, in a galaxy far, far away, Tom Van Vleck <th...@multicians.org> wrote:
> nm...@cus.cam.ac.uk (Nick Maclaren) wrote:
>> OK. So why can't I get access to the kernel data structures from
>> my program? I really would like to be able to place my data
>> structures for best effect, control the TLBs and so on. I could
>> get MUCH better performance under many circumstances that way!

> By doing so, of course, you entangle the complexity of the Linux
> kernel into the complexity of your program. If you view and
> manipulate the kernel data structures directly, your application
> becomes non-portable to any other platform. Your program becomes an
> outboard part of the kernel, recompiled when it changes.. or you
> introduce an interface on both sides, push more complexity into the
> kernel, and still have the semantics of Linux implicit in your use
> of the interface.

Ah, but I don't think you evade that by having some "nicer interface."

The userspace NFS daemon for Linux was pretty Linux-specific; if
something were designed that knew how to use intelligent interfaces to
Linux VFS so as to allow it to be efficient despite being in user
space, it would _still_ have the semantics of Linux implicit in the
use of the interfaces.
--
wm(X,Y):-write(X),write('@'),write(Y). wm('cbbrowne','cbbrowne.com').
http://www.ntlug.org/~cbbrowne/internet.html
"Support your local medical examiner - die strangely."
-- Blake Bowers
