what's next for the linux kernel?


Luke Kenneth Casson Leighton

Oct 2, 2005, 4:50:06 PM
hi,

as i love to put my oar in where it's unlikely that people
will listen, and as i have little to gain or lose by doing
so, i figured you can decide for yourselves whether to be
selectively deaf or not:

here's my take on where i believe the linux kernel needs to go,
in order to keep up.

what prompted me to send this message now was a recent report
that linus' no. 1 patcher - andrew morton, i think - is believed
to be close to overload, and in that report it was said that he
believed the linux kernel development rate to be slowing down,
because the kernel is nearing completion.

i think it safe to say that a project only nears completion
when it fulfils its requirements and, given that i believe that
there is going to be a critical shift in the requirements, it
logically follows that the linux kernel should not be believed
to be nearing completion.

with me so far? :)

okay, so what's the bit that's missing that mr really irritating,
oh-so-right and oh-so-killfile-ignorable luke kenneth casson
kipper lozenge has spotted that nobody else has, and what's
the fuss about?

well... to answer that, i need to outline a bit about processor
manufacturing: if you are familiar with processor design please
forgive me, this is for the benefit of those who might not be.

the basic premise: 90 nanometres is basically... well...
price/performance-wise, it's hit a brick wall at about 2.5GHz, and
both intel and amd know it: they just haven't told anyone.

anyone (big) else has a _really_ hard time getting above 2GHz,
because the amount of pipelining required is just... insane
(see the recent ibm power5 coverage on slashdot - what speed does
it do? surprise: 2.1GHz when everyone was hoping it would be
2.4-2.5GHz).

a _small_ chip design company (not an IBM, intel or amd)
will be lucky to see this side of 1GHz, at 90nm.

also, the cost of mask charges at 90nm is insane: somewhere
around $2 million - and that's never going to go away.

the costs for 65nm are going to be far far greater than that,
and 45nm i don't even want to imagine what they're going to be.

plus, there's a problem of quantum mechanics, heat dissipation
and current drain that makes, with current manufacturing
techniques, the production of 65nm and 45nm chips really
problematic.

with present manufacturing techniques, the current drain and heat
dissipation associated with 45nm means that you have to cut the number
of gates down to ONE MILLION, otherwise the chip destroys itself.

(brighter readers might now have an inkling of where i'm going
with this - bear with me :)

compare that one million gates with the present number of gates in an
amd or intel x86 chip - some, oh, what, 20 million?

now you get it?

for the present insane uniprocessor architectures at least
(and certainly for the x86 design), 90nm is _it_ - and yet,
people demand ever faster processing, and no amount of trying
on the part of engineers can get round the laws of physics.

so, what's the solution?

well... it's to go back to parallel processing techniques, of course.

and, surprise surprise, what do we have intel pushing?

apart from, of course, the performance-per-watt metric (which,
if you read back a few paragraphs, you'll realise why they have to
sell both their customers and their engineers on the idea that
performance/watt is suddenly important: they have to carve out a
path for a while, getting the current usage down, in order for the
65nm chips to become palatable - assuming they can be made at all
at a realistic yield - read: price bracket)

well - intel is pushing "hyperthreading".

and surprise, surprise, what is amd pushing? dual-core chips.

and what is in the X-Box 360? a PowerPC _triple_ core, _dual_
hyper-threaded processor!!

i believe that the X-Box 360 processor is the way things
are going to be moving - quad-core quad-threaded processors;
16 and 32 core ultra-RISC processors: medium to massive parallel
processors, but this time single-chip, unlike the past decade(s)
where multi-processor was hip and cool and... expensive.

i believe the future to contain stacks of single-chip multiprocessing
designs in several forms - including intel's fun-and-games VLIW stuff.

remember: intel recently bought the company that has spent
15 years working on that DEC/Alpha just-in-time x86-to-alpha
assembly converter product (remember DEC/Alphas running NT 3.51,
anyone, and still being able to run x86 programs?)

and, what is the linux kernel?

it's a daft, monolithic design that is suitable and faster on
single-processor systems, and that design is going to look _really_
outdated, really soon.

fortunately, there is a research project that has already
done a significant amount of work in breaking away from the
monolithic design: the l4linux project.

last time i checked, a few months ago, they were keeping thoroughly
up-to-date and had either 2.6.11 or 2.6.12 ported, can't recall which.

the l4linux project puts the linux kernel on top of L4-compliant
microkernels (yes, just like the gnu hurd :) and there are several
such L4-compliant microkernels - named after nuts: pistachio, etc.

one of those l4-compliant microkernels is a parallel processor
based one - it's SMP compliant, it even has support for virtual
machines, whoopee, ain't that fun.

i remember now. university of new south wales, and university
of karlsruhe.


in short, basically, if you follow and agree with the logic, the
linux kernel - as maintained by linus - is far from complete.

i therefore invite you to consider the following strategy:

1) that the linux kernel should merge with the oskit project, or that
the linux kernel should split into two projects: a) the 30-40k lines
of code comprising kernel/* plus headers and ports; b) the device
drivers - i.e. duh, the oskit project.

2) that the linux kernel should merge and maintain the efforts
of the l4linux project, mainlined, not sidelined.

3) that serious efforts be diverted into the l4 microkernels to make
them portable and able to work on parallel-processor, hyperthreaded,
SMP and other systems (such as ACPI, which has had to be #ifdef'd out
even in XEN).

4) other.

yes, i know this flies in the face of linus' distaste for
message-based kernels - a distaste that exists because message-passing
slows things down... but it slows things down _only_ on uniprocessor
kernel designs, and uniprocessors are going to be blowing
goats / bubbles / insert-as-appropriate in the not-too-distant
future. there have _already_ been high-profile parallel
processor designs announced, released, and put into service
(e.g. dual-core AMD64, triple-core dual-hyperthreaded PowerPC in
the X-Box 360).

yes, i may have got things wrong.

yes, it is up to _you_ to point them out.

yes, it is up to _you_ to decide what to do, not me.

good luck.

l.

p.s. XEN is even getting lovely encouraging noises from intel
to support hyperthreading, isn't that nice boys and girls?

--
http://lkcl.net
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majo...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

Rik van Riel

Oct 2, 2005, 5:10:08 PM
On Sun, 2 Oct 2005, Luke Kenneth Casson Leighton wrote:

> and, what is the linux kernel?
>
> it's a daft, monolithic design that is suitable and faster on
> single-processor systems, and that design is going to look _really_
> outdated, really soon.

Linux already has a number of scalable SMP synchronisation
mechanisms. The main scalability effort nowadays is about
the avoidance of so-called "cache line bouncing".

http://wiki.kernelnewbies.org/wiki/SMPSynchronisation

--
All Rights Reversed

Robert Hancock

Oct 2, 2005, 6:50:08 PM
Luke Kenneth Casson Leighton wrote:
> and, what is the linux kernel?
>
> it's a daft, monolithic design that is suitable and faster on
> single-processor systems, and that design is going to look _really_
> outdated, really soon.

Well, it sounds like it works pretty well on such things as 512 CPU
Altix systems, so it sounds like the suggestion that Linux is designed
solely for single-processor systems and isn't suitable for multicore,
hyperthreaded CPUs doesn't hold much water.

--
Robert Hancock Saskatoon, SK, Canada
To email, remove "nospam" from hanc...@nospamshaw.ca
Home Page: http://www.roberthancock.com/

Christoph Hellwig

Oct 2, 2005, 7:00:10 PM
Let's hope these posts will stop when the UK starts to allow serving
drinks after 23:00. Posts from half-drunk people who need to get a
life don't really help a lot.

Luke Kenneth Casson Leighton

Oct 2, 2005, 7:10:03 PM
On Sun, Oct 02, 2005 at 05:05:42PM -0400, Rik van Riel wrote:
> On Sun, 2 Oct 2005, Luke Kenneth Casson Leighton wrote:
>
> > and, what is the linux kernel?
> >
> > it's a daft, monolithic design that is suitable and faster on
> > single-processor systems, and that design is going to look _really_
> > outdated, really soon.
>
> Linux already has a number of scalable SMP synchronisation
> mechanisms.

... and you are tied in to the decisions made by the linux kernel
developers.

whereas, if you allow something like a message-passing design (such as
in the port of the linux kernel to l4), you have the option to try out
different underlying structures - _without_ having to totally redesign
the infrastructure.

several people involved with the l4linux project have already done
that: as i mentioned in the original post, there are about three or
four different and separate l4 microkernels available for download
(GPL) and one of them is ported to stacks of different architectures,
and one of them is SMP capable and even includes a virtual machine
environment.

and they're only approx 30-40,000 lines each, btw.


also, what about architectures that have features over-and-above SMP?

in the original design of SMP it was assumed that if you have
N processors that you have N-way access to memory.

what if, therefore, someone comes up with an architecture that is
better than or improves greatly upon SMP?

they will need to make _significant_ inroads into the linux kernel
code, whereas if, say, you oh i dunno provide hardware-accelerated
parallel support for a nanokernel (such as l4) which just _happens_
to be better than SMP then running anything which is l4 compliant gets
the benefit.


the reason i mention this is because arguments about saying "SMP is it,
SMP is great, SMP is everything, we're improving our SMP design" don't
entirely cut it, because SMP has limitations that don't scale properly
to say 64 or 128 processors: sooner or later someone's going to come up
with something better than SMP and all the efforts focussed on making
SMP better in the linux kernel are going to look lame.

l.

p.s. yes i do know of a company that has improved on SMP.

Rik van Riel

Oct 2, 2005, 7:30:12 PM
On Mon, 3 Oct 2005, Luke Kenneth Casson Leighton wrote:
> On Sun, Oct 02, 2005 at 05:05:42PM -0400, Rik van Riel wrote:

> > Linux already has a number of scalable SMP synchronisation
> > mechanisms.
>
> ... and you are tied in to the decisions made by the linux kernel
> developers.
>
> whereas, if you allow something like a message-passing design (such as
> in the port of the linux kernel to l4), you have the option to try out
> different underlying structures - _without_ having to totally redesign
> the infrastructure.

Infrastructure is not what matters when it comes to SMP
scalability on modern systems, since lock contention is
not the primary SMP scalability problem.

Due to the large latency ratio between L1/L2 cache and
RAM, the biggest scalability problem is cache invalidation
and cache bounces.

Those are not solvable by using another underlying
infrastructure - they require a reorganization of the
data structures on top, the data structures in Linux.

Note that message passing is by definition less efficient
than SMP synchronisation mechanisms that do not require
data to be exchanged between CPUs, eg. RCU or the use of
cpu-local data structures.

> p.s. yes i do know of a company that has improved on SMP.

SGI? IBM?

--
All Rights Reversed

Luke Kenneth Casson Leighton

Oct 2, 2005, 7:30:16 PM
On Sun, Oct 02, 2005 at 11:49:57PM +0100, Christoph Hellwig wrote:
> Let's hope these posts will stop when the UK starts to allow serving
> drinks after 23:00. Posts from half-drunk people who need to get a
> life don't really help a lot.

hi, christoph,

i assume that your global world-wide distribution of this message
was a mistake on your part. but, seeing as it _has_ gone out to
literally thousands of extremely busy people, i can only apologise
to them on your behalf for the mistake of wasting their valuable
time.

let's also hope that people who believe that comments such as the
one you have made are useful and productive will think about the
consequences of making them, bear in mind that internet archives
are forever, and check whether the person they are criticising
drinks at _all_.

personally, my average consumption of alcohol can be measured
as approx 1 bottle per decade. and i'm not talking meths.

if you don't like what i have to say, and don't want to listen,
even with a pinch of salt for my rambling, learn how to set
up a killfile, and use it. and think more before hitting the
reply-to-all button. key. whatever.

l.

--
http://lkcl.net

Vadim Lobanov

Oct 2, 2005, 7:40:03 PM

> what if, therefore, someone comes up with an architecture that is
> better than or improves greatly upon SMP?

Like NUMA?

> they will need to make _significant_ inroads into the linux kernel
> code, whereas if, say, you oh i dunno provide hardware-accelerated
> parallel support for a nanokernel (such as l4) which just _happens_
> to be better than SMP then running anything which is l4 compliant gets
> the benefit.
>
>
> the reason i mention this is because arguments about saying "SMP is it,
> SMP is great, SMP is everything, we're improving our SMP design" don't
> entirely cut it, because SMP has limitations that don't scale properly
> to say 64 or 128 processors: sooner or later someone's going to come up
> with something better than SMP and all the efforts focussed on making
> SMP better in the linux kernel are going to look lame.
>
> l.
>
> p.s. yes i do know of a company that has improved on SMP.
>
> -

-Vadim Lobanov

Gene Heskett

Oct 2, 2005, 7:40:06 PM
On Sunday 02 October 2005 18:43, Robert Hancock wrote:
>Luke Kenneth Casson Leighton wrote:
>> and, what is the linux kernel?
>>
>> it's a daft, monolithic design that is suitable and faster on
>> single-processor systems, and that design is going to look _really_
>> outdated, really soon.
>
>Well, it sounds like it works pretty well on such things as 512 CPU
>Altix systems, so it sounds like the suggestion that Linux is designed
>solely for single-processor systems and isn't suitable for multicore,
>hyperthreaded CPUs doesn't hold much water..

Ahh, yes and no, Robert. The unanswered question, for that
512 processor Altix system, would be "but does it run things 512
times faster?" Methinks not, by a very wide margin. Yes, it can do
a lot of unrelated things fast, maybe, but render a 30 megabyte page
with ghostscript in 10 milliseconds? Never happen, IMO.

And Christoph, in the next msg, calls him 1/2 drunk. He doesn't come
across to me as being more than 1 beer drunk. And he does make some
interesting points, so if they aren't valid, let's use provable logic
to shoot them down, not name calling and pointless rhetoric.

--
Cheers, Gene
"There are four boxes to be used in defense of liberty:
soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
99.35% setiathome rank, not too shabby for a WV hillbilly
Yahoo.com and AOL/TW attorneys please note, additions to the above
message by Gene Heskett are:
Copyright 2005 by Maurice Eugene Heskett, all rights reserved.

Vadim Lobanov

Oct 2, 2005, 7:50:06 PM
On Sun, 2 Oct 2005, Gene Heskett wrote:

> On Sunday 02 October 2005 18:43, Robert Hancock wrote:
> >Luke Kenneth Casson Leighton wrote:
> >> and, what is the linux kernel?
> >>
> >> it's a daft, monolithic design that is suitable and faster on
> >> single-processor systems, and that design is going to look _really_
> >> outdated, really soon.
> >
> >Well, it sounds like it works pretty well on such things as 512 CPU
> >Altix systems, so it sounds like the suggestion that Linux is designed
> >solely for single-processor systems and isn't suitable for multicore,
> >hyperthreaded CPUs doesn't hold much water..
>
> Ahh, yes and no, Robert. The unanswered question, for that
> 512 processor Altix system, would be "but does it run things 512
> times faster?" Methinks not, by a very wide margin. Yes, it can do
> a lot of unrelated things fast, maybe, but render a 30 megabyte page
> with ghostscript in 10 milliseconds? Never happen, IMO.

This is only true for workloads that are parallelizable. I don't think
any kernel is quite good enough to divine what a single-threaded
userland application is doing and make its work parallel.

That is to say, if we are going to look at examples (and so we should),
then we need to pick an example that is actually expected to benefit
from many-processor machines.

> And Christoph, in the next msg, calls him 1/2 drunk. He doesn't come
> across to me as being more than 1 beer drunk. And he does make some
> interesting points, so if they aren't valid, let's use provable logic
> to shoot them down, not name calling and pointless rhetoric.

-Vadim Lobanov

Rik van Riel

Oct 2, 2005, 8:00:13 PM
On Sun, 2 Oct 2005, Gene Heskett wrote:

> Ahh, yes and no, Robert. The unanswered question, for that
> 512 processor Altix system, would be "but does it run things 512
> times faster?" Methinks not, by a very wide margin. Yes, it can do
> a lot of unrelated things fast, maybe, but render a 30 megabyte page
> with ghostscript in 10 milliseconds? Never happen, IMO.

You haven't explained to us why you think your proposal
would allow Linux to circumvent Amdahl's law...

--
All Rights Reversed

Martin J. Bligh

Oct 2, 2005, 8:10:08 PM
--Luke Kenneth Casson Leighton <lk...@lkcl.net> wrote (on Monday, October 03, 2005 00:05:45 +0100):

> On Sun, Oct 02, 2005 at 05:05:42PM -0400, Rik van Riel wrote:
>> On Sun, 2 Oct 2005, Luke Kenneth Casson Leighton wrote:
>>
>> > and, what is the linux kernel?
>> >
>> > it's a daft, monolithic design that is suitable and faster on
>> > single-processor systems, and that design is going to look _really_
>> > outdated, really soon.
>>
>> Linux already has a number of scalable SMP synchronisation
>> mechanisms.
>
> ... and you are tied in to the decisions made by the linux kernel
> developers.

Yes. As are the rest of us. So if you want to implement something
different, that's your prerogative. So feel free to go do it
somewhere else, and quit whining on this list.

We are not your implementation bitches. If you think it's such a great
idea, do it yourself.

M.

Randy.Dunlap

Oct 2, 2005, 8:20:11 PM
On Sun, 02 Oct 2005 17:04:51 -0700 Martin J. Bligh wrote:

> --Luke Kenneth Casson Leighton <lk...@lkcl.net> wrote (on Monday, October 03, 2005 00:05:45 +0100):
>
> > On Sun, Oct 02, 2005 at 05:05:42PM -0400, Rik van Riel wrote:
> >> On Sun, 2 Oct 2005, Luke Kenneth Casson Leighton wrote:
> >>
> >> > and, what is the linux kernel?
> >> >
> >> > it's a daft, monolithic design that is suitable and faster on
> >> > single-processor systems, and that design is going to look _really_
> >> > outdated, really soon.
> >>
> >> Linux already has a number of scalable SMP synchronisation
> >> mechanisms.
> >
> > ... and you are tied in to the decisions made by the linux kernel
> > developers.
>
> Yes. As are the rest of us. So if you want to implement something
> different, that's your prerogative. So feel free to go do it
> somewhere else, and quit whining on this list.
>
> We are not your implementation bitches. If you think it's such a great
> idea, do it yourself.

IOW, -ENOPATCH. where's your patch?

---
~Randy
You can't do anything without having to do something else first.
-- Belefant's Law

Kurt Wall

Oct 2, 2005, 8:40:09 PM

[...]

> with me so far? :)

Yes, and getting more annoyed at your condescending tone with each
paragraph.



> and, what is the linux kernel?
>
> it's a daft, monolithic design that is suitable and faster on
> single-processor systems, and that design is going to look _really_
> outdated, really soon.

Andrew Tanenbaum said the same thing in the early 1990s. That we're
here still having this discussion >10 years later is telling. Dr.
Tanenbaum might have been academically and theoretically correct,
but, with a nod to OS X, the Linux kernel has proven itself by
implementation and has proven to be remarkably adaptable.

Check back in ten years. Or just come back when you've completed
implementing all the beauteous features you're selling.

> in short, basically, if you follow and agree with the logic, the
> linux kernel - as maintained by linus - is far from complete.
>
> i therefore invite you to consider the following strategy:

And the e.e. cummings affectation is even *more* annoying than your
condescension.

Kurt
--
Blood flows down one leg and up the other.

Kurt Wall

Oct 2, 2005, 8:40:12 PM
On Sun, Oct 02, 2005 at 11:49:57PM +0100, Christoph Hellwig took 8 lines to write:
> Let's hope these posts will stop when the UK starts to allow serving
> drinks after 23:00. Post from half-drunk people that need to get a life
> don't really help a lot.

As if posts from fully drunk people would help, either - albeit they
might be more entertaining.

Kurt
--
Violence is the last refuge of the incompetent.
-- Salvor Hardin

Luke Kenneth Casson Leighton

Oct 2, 2005, 8:50:05 PM
On Sun, Oct 02, 2005 at 05:14:57PM -0700, Randy.Dunlap wrote:

> IOW, -ENOPATCH. where's your patch?

most of the relevant work has already been done (and not by
me): i invite you to consider searching with google for l4ka,
l4linux and oskit, or simply going to the web site l4linux.org
and l4ka.org.

the code for oskit has been available for some years now,
and is regularly maintained. the l4linux people have had to
make some significant modifications to it (oskit), and also
to grub, and libstdc++, and pretty much everything else under
the sun - and it's all there, for the approx 100mb download.

l.

David Leimbach

Oct 2, 2005, 8:50:08 PM
> > it's a daft, monolithic design that is suitable and faster on
> > single-processor systems, and that design is going to look _really_
> > outdated, really soon.
>
> Andrew Tanenbaum said the same thing in the early 1990s. That we're
> here still having this discussion >10 years later is telling. Dr.
> Tanenbaum might have been academically and theoretically correct,
> but, with a nod to OS X, the Linux kernel has proven itself by
> implementation and has proven to be remarkably adaptable.

Why are you nodding to OS X? It's not a real microkernel either. It
just happens to have all the foobage of a microkernel in a rather
monolithic design. The reason that the bsd personality is in the same
address space as the mach bits is that they didn't want to deal
with the overheads of message passing from kernel to userspace.

The L4 people figured out how to get a lot of those inefficiencies to
disappear and L4Linux is quite "performant". In some cases, L4Linux
can be used to provide a device driver for other L4 threads that would
normally have to write their own [in user space and even with
respectable performance
http://www.ertos.nicta.com.au/Research/ULDD/Performance.pml]

That's an interesting re-use and combination of several philosophies
if you ask me.

There is a lot of "what's next for linux" going on behind the scenes
and the current path of linux is apparently good enough for
accomplishing it.

- Dave

Luke Kenneth Casson Leighton

Oct 2, 2005, 9:00:09 PM
On Sun, Oct 02, 2005 at 04:37:52PM -0700, Vadim Lobanov wrote:

> > what if, therefore, someone comes up with an architecture that is
> > better than or improves greatly upon SMP?
>
> Like NUMA?

yes, like numa, and there is more.

i had the honour to work with someone who came up with a radical
enhancement even to _that_.

basically the company has implemented, in hardware (a
nanokernel), some operating system primitives, such as message
passing (based on a derivative by thompson of the "alice"
project from plessey, imperial and manchester university
in the mid-80s), hardware cache line lookups (which means
instead of linked list searching, the hardware does it for
you in a single cycle), stuff like that.

the message passing system is designed as a parallel message bus -
completely separate from the SMP and NUMA memory architecture, and as
such it is perfect for use in microkernel OSes.

(these sorts of things are unlikely to make it into the linux kernel, no
matter how much persuasion and how many patches they would write).

_however_, a much _better_ target would be to create an L4 microkernel
on top of their hardware kernel.

this company's hardware is kinda a bit difficult for most people to get
their heads round: it's basically parallelised hardware-acceleration for
operating systems, and very few people see the point in that.

however, as i pointed out, 90nm and approx 2GHz is pretty much _it_,
and to get any faster you _have_ to go parallel.

and the drive for "faster", "better", "more sales" means more and more
parallelism.

it's _happening_ - and SMP ain't gonna cut it (which is why
these multi-core chips are coming out and why hyperthreading
is coming out).

so.

this is a heads-up.

what you choose to do with this analysis is up to you.

l.

Luke Kenneth Casson Leighton

Oct 2, 2005, 9:20:06 PM
On Sun, Oct 02, 2005 at 05:04:51PM -0700, Martin J. Bligh wrote:
> --Luke Kenneth Casson Leighton <lk...@lkcl.net> wrote (on Monday, October 03, 2005 00:05:45 +0100):
>
> > On Sun, Oct 02, 2005 at 05:05:42PM -0400, Rik van Riel wrote:
> >> On Sun, 2 Oct 2005, Luke Kenneth Casson Leighton wrote:
> >>
> >> > and, what is the linux kernel?
> >> >
> >> > it's a daft, monolithic design that is suitable and faster on
> >> > single-processor systems, and that design is going to look _really_
> >> > outdated, really soon.
> >>
> >> Linux already has a number of scalable SMP synchronisation
> >> mechanisms.
> >
> > ... and you are tied in to the decisions made by the linux kernel
> > developers.
>
> Yes. As are the rest of us. So if you want to implement something
> different, that's your prerogative. So feel free to go do it
> somewhere else, and quit whining on this list.
>
> We are not your implementation bitches. If you think it's such a great
> idea, do it yourself.

martin, i'm going to take a leaf from the great rusty russell's book,
because i was very impressed with the professional way in which he
dealt with someone who posted such immature and out-of-line comments:
he rewrote them in a much more non-hostile manner and then replied to
that.

so, here goes: i'm copying the above few [relevant] paragraphs
below, then rewriting them, here:

> >
> > ... and you are tied in to the decisions made by the linux kernel
> > developers.
>

> Yes, this is very true: we are all somewhat at the mercy of their
> decisions. However, fortunately, they had the foresight to work
> with free software, so any of us can try something different, if
> we wish.
>
> i am slightly confused by your message, however: forgive me for
> asking this but you are not expecting us to implement such a radical
> redesign, are you?

martin, hi, thank you for responding.

well... actually, as it turns out, the l4linux and l4ka people have
already done most of the work!!

i believe you may have missed part of my message (it was a bit long, i
admit) and i thank you for the opportunity, that your message presents,
to reiterate this: l4linux _exists_ - last time i checked (some months
ago) it had a port of 2.6.11 to the L4 microkernel.

so, in more ways than one, no i am of course not expecting people to
just take orders from someone as mad as myself :)

i really should reiterate this: i _invite_ people to _consider_ the
direction that processor designs - not just any "off-the-wall"
processor designs but _mainstream_ x86-compatible processor designs -
are likely to take. and they are becoming more and more parallel.

the kinds of questions that the experienced linux kernel
maintainers and developers really need to ask is: can the
present linux kernel design _cope_ with such parallelism?

is there an easier way?

that's mainly why i wished you "good luck" :)

l.

p.s. martin. _don't_ do that again. i don't care who you are:
internet archives are forever and your rudeness will be noted
by google-users and other search-users - long after you are dead.

Rik van Riel

Oct 2, 2005, 9:20:08 PM
On Mon, 3 Oct 2005, Luke Kenneth Casson Leighton wrote:

> well... actually, as it turns out, the l4linux and l4ka people have
> already done most of the work!!

And I am sure they have reasons for not submitting their
changes to the linux-kernel mailing list. They probably
know something we (including you) don't know.

Switching out the low level infrastructure does NOT help
with scalability. The only way to make the kernel more
parallelizable is by changing the high level code, ie.
Linux itself.

Adding a microkernel under Linux is not going to help
with anything you mentioned so far.

--
All Rights Reversed

Chase Venters

Oct 2, 2005, 9:30:08 PM
I'd venture to say that Linux scalability is fantastic. This also sounds like
a repeat of a debate that happened ten years ago.

I too was intrigued by Andrew's comment about 'finishing the kernel', though
I'm guessing (albeit without ever having spoken to Andrew personally) that it
was partially in jest. What it does suggest, though, is a point that KDE
desktop developer Aaron Seigo has made recently about the focus moving up the
stack.

If we are admirably tackling the problems of hardware compatibility,
stability, scalability and we've implemented most of the important features
that belong in the kernel, then a lot of the development fire for a so-called
complete Linux system is going to have to move up the stack - into the
userland.

Indeed, adding 100 cores to my Pentium 4 isn't going to do me a damned bit of
good when Akregator goes to query some 40 RSS feeds and Kontact blocks,
refusing to process GUI events. It's also not going to make compiling a
single .c file any faster.

I have no doubt that the bright minds here on LKML will continue to find
places to improve Linux's scalability, but that certainly doesn't require
rebuilding the kernel - we're already doing remarkably well in the
scalability department.

The bottom line is that the application developers need to start being clever
with threads. I think I remember some interesting rumors about Perl 6, for
example, including 'autothreading' support - the idea that your optimizer
could be smart enough to identify certain work that can go parallel.

As dual cores and HT become more popular, the onus is going to be on the
applications, not the OS, to speed up.

Regards,
Chase Venters

On Sunday 02 October 2005 08:10 pm, Luke Kenneth Casson Leighton wrote:
> ... words ...

Luke Kenneth Casson Leighton
Oct 2, 2005, 9:30:12 PM
On Sun, Oct 02, 2005 at 07:26:21PM -0400, Rik van Riel wrote:
> On Mon, 3 Oct 2005, Luke Kenneth Casson Leighton wrote:
> > On Sun, Oct 02, 2005 at 05:05:42PM -0400, Rik van Riel wrote:
>
> > > Linux already has a number of scalable SMP synchronisation
> > > mechanisms.
> >
> > ... and you are tied in to the decisions made by the linux kernel
> > developers.
> >
> > whereas, if you allow something like a message-passing design (such as
> > in the port of the linux kernel to l4), you have the option to try out
> > different underlying structures - _without_ having to totally redesign
> > the infrastructure.
>
> Infrastructure is not what matters when it comes to SMP
> scalability on modern systems, since lock contention is
> not the primary SMP scalability problem.
>
> Due to the large latency ratio between L1/L2 cache and
> RAM, the biggest scalability problem is cache invalidation
> and cache bounces.
>
> Those are not solvable by using another underlying
> infrastructure - they require a reorganization of the
> datastructures on top, the data structures in Linux.

... ah, but what about in hardware? what if you had hardware
support for that?

_plus_ what if you had some other OS primitives implemented
in hardware, the use of which allowed you to avoid or minimise
cache invalidation problems?

not entirely, of course, but enough to make up for SMP's deficiencies.


> Note that message passing is by definition less efficient
> than SMP synchronisation mechanisms that do not require
> data to be exchanged between CPUs, eg. RCU or the use of
> cpu-local data structures.

how about message passing by reference - a la c++?

i.e. using an "out-of-band" parallel message bus, you pass
the address of a NUMA or SMP area of memory that is granted
to a specific processor, which says to another processor
something like "you now have access to this memory: by the time
you get this message i will have already cleared the cache so
you can get it immediately".

that sort of thing.

_and_ you use the parallel message bus to communicate memory
allocation, locking, etc.

_and_ you use the parallel message bus to implement semaphores and
mutexes.

_and_ if the message is small enough, you just pass the message across
without going via external memory.


... but i digress - enough, i hope, to demonstrate that
this isn't some "pie-in-the-sky" thing: it's one hint at a
solution to a problem that a lot of hardware designers haven't
been able to solve, and up until now haven't even had to
_consider_.

and they've avoided the problem by going "multi-core" and going
"hyperthreading".

but, at some point, hyperthreading isn't going to cut it, and at some
point multi-core isn't going to cut it.

and people are _still_ going to expect to see the monster
parallelism (32, 64, 128 parallel hardware threads) as
"one processor".

the question is - and i reiterate it: can the present
linux kernel design _cope_ with such monster parallelism?

answer, at present, as maintained as-it-is, not a chance.

question _that_ raises: do you _want_ to [make it cope with such
monster parallelism]?

and if the answer to that is "no, definitely not", then the
responsibility can be offloaded onto a microkernel, e.g. the L4
microkernel, and it _just_ so happens that the linux kernel has already
been ported to L4.

i raise this _one_ route - there are surely going to be others.

i invite you to consider discussing them.

LIKE FRIGGIN ADULTS, unlike the very spiteful comments i've
received indicate that some people would like to do (no i
don't count you in that number, rik, just in case you thought
i was because i'm replying direct to you!).


> > p.s. yes i do know of a company that has improved on SMP.
>
> SGI ? IBM ?

no, they're a startup.

--
http://lkcl.net

Vadim Lobanov
Oct 2, 2005, 9:30:12 PM
On Mon, 3 Oct 2005, Luke Kenneth Casson Leighton wrote:

> On Sun, Oct 02, 2005 at 04:37:52PM -0700, Vadim Lobanov wrote:
>
> > > what if, therefore, someone comes up with an architecture that is
> > > better than or improves greatly upon SMP?
> >
> > Like NUMA?
>
> yes, like numa, and there is more.

The beauty of capitalization is that it makes it easier for others to
read what you have to say.

> i had the honour to work with someone who came up with a radical
> enhancement even to _that_.
>
> basically the company has implemented, in hardware (a
> nanokernel), some operating system primitives, such as message
> passing (based on a derivative by thompson of the "alice"
> project from plessey, imperial and manchester university
> in the mid-80s), hardware cache line lookups (which means
> instead of linked list searching, the hardware does it for
> you in a single cycle), stuff like that.

That sounds awesome, but I have something better -- a quantum computer.
And it's about as parallel as you're going to get anytime in the
foreseeable future!

...

Moral of the story: There are thousands of hardware doodads all around.
People only start to become interested when they have actual "metal"
freely available on the market, that they can play with and code to.

> the message passing system is designed as a parallel message bus -
> completely separate from the SMP and NUMA memory architecture, and as
> such it is perfect for use in microkernel OSes.

You're making an implicit assumption here that it will benefit _only_
microkernel designs. That is not at all immediate or obvious to me (or,
I suspect, others also) -- where's the proof?

> (these sorts of things are unlikely to make it into the linux kernel, no
> matter how much persuasion and how many patches they would write).

No, the kernel hackers are actually very sensible people. When they push
back, there's usually a darn good reason for it. See above point
regarding availability of hardware.

> _however_, a much _better_ target would be to create an L4 microkernel
> on top of their hardware kernel.

Perfect. You can do that, and benefit from the oodles of fame that
follow. Others might be less-than-convinced.

> this company's hardware is kinda a bit difficult for most people to get
> their heads round: it's basically parallelised hardware-acceleration for
> operating systems, and very few people see the point in that.

That just sounds condescending.

> however, as i pointed out, 90nm and approx-2Ghz is pretty much _it_,
> and to get any faster you _have_ to go parallel.

Sure, it's going to stop somewhere, but you have to be a heck of a
visionary to predict that it will stop _there_. People have been
surprised before on such matters, so don't go around yelling about the
impending doom quite yet.

> and the drive for "faster", "better", "more sales" means more and more
> parallelism.
>
> it's _happening_ - and SMP ain't gonna cut it (which is why
> these multi-core chips are coming out and why hyperthreading
> is coming out).

"Rah, rah, parallelism is great!" -- That's a great slogan, except...

Users, who also happen to be the target of those sales, care about
_userland_ applications. And the bitter truth is that the _vast_
majority of userland apps are single-threaded. Why? Two reasons --
first, it's harder to write a multithreaded application, and second,
some workloads simply can't be expressed "in parallel". Your kernel
might (might, not will) run like a speed-demon, but the userland stuff
will still be lackluster in comparison.

And that's when your slogan hits a wall, and the marketing hype dies.
The reality is that parallelism is something to be desired, but is not
always achievable.

> so.
>
> this is a heads-up.
>
> what you choose to do with this analysis is up to you.

I choose to wait for actual, concrete details and proofs of your design,
instead of the ambiguous "visionary" hand-waving so far. As has already
been said, -ENOPATCH.

> l.
>

-Vadim Lobanov

Al Viro
Oct 2, 2005, 9:50:09 PM
On Sun, Oct 02, 2005 at 06:20:38PM -0700, Vadim Lobanov wrote:
> I choose to wait for actual, concrete details and proofs of your design,
> instead of the ambiguous "visionary" hand-waving so far. As has already
> been said, -ENOPATCH.

Speaking of which, IIRC, somebody used to maintain a list of words, acronyms,
etc. useful to know if you want to read l-k. May I submit an addition to
that list?

visionary [n]: onanist with strong exhibitionist tendencies; from
"visions", the source of inspiration they refer to when it becomes
obvious that they have lost both sight and capacity for rational
thought.

Vadim Lobanov
Oct 2, 2005, 10:00:12 PM
On Mon, 3 Oct 2005, Al Viro wrote:

> On Sun, Oct 02, 2005 at 06:20:38PM -0700, Vadim Lobanov wrote:
> > I choose to wait for actual, concrete details and proofs of your design,
> > instead of the ambiguous "visionary" hand-waving so far. As has already
> > been said, -ENOPATCH.
>
> Speaking of which, IIRC, somebody used to maintain a list of words, acronyms,
> etc. useful to know if you want to read l-k. May I submit an addition to
> that list?
>
> visionary [n]: onanist with strong exhibitionist tendencies; from
> "visions", the source of inspiration they refer to when it becomes
> obvious that they have lost both sight and capacity for rational
> thought.
>

Nice. :-)

Just from idle curiosity, you wouldn't know where that list currently
resides, would you?

-Vadim Lobanov

Luke Kenneth Casson Leighton
Oct 2, 2005, 10:00:13 PM
On Sun, Oct 02, 2005 at 06:20:38PM -0700, Vadim Lobanov wrote:

> On Mon, 3 Oct 2005, Luke Kenneth Casson Leighton wrote:
>
> > On Sun, Oct 02, 2005 at 04:37:52PM -0700, Vadim Lobanov wrote:
> >
> > > > what if, therefore, someone comes up with an architecture that is
> > > > better than or improves greatly upon SMP?
> > >
> > > Like NUMA?
> >
> > yes, like numa, and there is more.
>
> The beauty of capitalization is that it makes it easier for others to
> read what you have to say.

sorry, vadim: haven't touched a shift key in over 20 years.

> > basically the company has implemented, in hardware (a
> > nanokernel), some operating system primitives, such as message
> > passing (based on a derivative by thompson of the "alice"
> > project from plessey, imperial and manchester university
> > in the mid-80s), hardware cache line lookups (which means
> > instead of linked list searching, the hardware does it for
> > you in a single cycle), stuff like that.
>
> That sounds awesome, but I have something better -- a quantum computer.
> And it's about as parallel as you're going to get anytime in the
> foreseeable future!

:)

*sigh* - i _so_ hope we don't need degrees in physics to program
them...

> > the message passing system is designed as a parallel message bus -
> > completely separate from the SMP and NUMA memory architecture, and as
> > such it is perfect for use in microkernel OSes.
>
> You're making an implicit assumption here that it will benefit _only_
> microkernel designs.

ah, i'm not: i just left out mentioning it :)

the message passing needs to be communicated down to manage
threads, and also to provide a means to manage semaphores and
mutexes: ultimately, support for such an architecture would
work its way down to libc.


and yes, if you _really_ didn't want a kernel in the way at all, you
could go embedded and just... do everything yourself.

or port reactos, the free software reimplementation of nt,
to it, or something :)

*shrug*.

> > this company's hardware is kinda a bit difficult for most people to get
> > their heads round: it's basically parallelised hardware-acceleration for
> > operating systems, and very few people see the point in that.
>
> That just sounds condescending.

i'm very sorry about that, it wasn't deliberate and ... re-reading
my comment, i should say that my comment isn't actually entirely true!

a correction/qualification: before the startup company was put
in touch with me, they had found that everybody they had
previously talked to simply _did_ not get it: presumably because
the people they were seeking funding from were not technically
up to the job of understanding the concept.

i didn't mean to imply that _everyone_ - or more specifically the
people reading this list - would not get it.

sorry.

> > however, as i pointed out, 90nm and approx-2Ghz is pretty much _it_,
> > and to get any faster you _have_ to go parallel.
>
> Sure, it's going to stop somewhere, but you have to be a heck of a
> visionary to predict that it will stop _there_.

okay, i admit it: you caught me out - i'm a mad visionary.

but seriously.

it won't stop - but the price of 90nm mask charges, at approx
$2m, is already far too high, and the number of large chips
being designed is plummeting like a stone as a result - from
like 15,000 per year a few years ago down to ... damn, can't remember -
less than a hundred (i think! don't quote me on that!)

when 90 nm was introduced, some mad fabs wanted to make 9
metre lenses, dude!!! until carl zeiss were called in and
managed to get it down to 3 metres.

and that lens is produced on a PER CHIP basis.

basically, it's about cost.

the costs of producing faster and faster uniprocessors is
getting out of control.

i'm not explaining things very well, but i'm trying. too many words,
not concise enough, too much to explain without people misunderstanding
or skipping things and getting the wrong end of the stick.

argh.


> > and the drive for "faster", "better", "more sales" means more and more
> > parallelism.
> >
> > it's _happening_ - and SMP ain't gonna cut it (which is why
> > these multi-core chips are coming out and why hyperthreading
> > is coming out).
>
> "Rah, rah, parallelism is great!" -- That's a great slogan, except...
>
> Users, who also happen to be the target of those sales, care about
> _userland_ applications. And the bitter truth is that the _vast_
> majority of userland apps are single-threaded. Why? Two reasons --
> first, it's harder to write a multithreaded application, and second,
> some workloads simply can't be expressed "in parallel". Your kernel
> might (might, not will) run like a speed-demon, but the userland stuff
> will still be lackluster in comparison.
>
> And that's when your slogan hits a wall, and the marketing hype dies.
> The reality is that parallelism is something to be desired, but is not
> always achievable.

okay: i will catch up on this bit, another time, because it is late
enough for me to be getting dizzy and appearing to be drunk.

this is one answer (there are others i will write up another time -
hint: automated code analysis tools, auto-parallelising tools, both
offline and realtime):

watch what intel and amd do: they will support _anything_ - clutch at
straws - to make parallelism palatable. why? because in order to be
competitive - and realistically priced - they don't have any choice.

plus, i am expecting the chips to be thrown out there (like
the X-Box 360 which has SIX hardware threads remember) and
the software people to quite literally _have_ to deal with it.

i expect the hardware people to go: this is the limit, this is what we
can do, realistically price-performance-wise: lump it, deal with it.

when intel and amd start doing that, everyone _will_ lump it.
and deal with it.

... why do you think intel is hyping support for and backing
hyperthreads support in XEN/Linux so much?

l.

Rik van Riel
Oct 2, 2005, 10:00:16 PM
On Mon, 3 Oct 2005, Luke Kenneth Casson Leighton wrote:

> how about message passing by reference - a la c++?

> _and_ you use the parallel message bus to communicate memory
> allocation, locking, etc.

Then you lose. It's the act of passing itself that causes
scalability problems and a loss of performance.

The best way to get SMP scalability is to avoid message
passing altogether, using things like per-cpu data
structures and RCU.

Not having to pass a message is faster than any message
passing mechanism.

> ... but i digress - but enough to demonstrate, i hope, that
> this isn't some "pie-in-the-sky" thing,

You've made a lot of wild claims so far, most of which I'm
not ready to believe without some proof to back them up.

> it's one hint at a solution to the problem that a lot of hardware
> designers haven't been able to solve, and up until now they haven't
> had to even _consider_ it.

The main problem is that communication with bits of silicon
four inches away is a lot slower, or takes much more power,
than communication with bits of silicon half a millimeter away.

This makes cross-core communication, and even cross-thread
communication in SMT/HT, slower than not having to have such
communication at all.

> the question is - and i iterate it again: can the present
> linux kernel design _cope_ with such monster parallelism?

The SGI and IBM people seem fairly happy with current 128 CPU
performance, and appear to be making serious progress towards
512 CPUs and more.

> question _that_ raises: do you _want_ to [make it cope with such
> monster parallelism]?
>
> and if the answer to that is "no, definitely not", then the
> responsibility can be offloaded onto a microkernel,

No, that cannot be done, for all the reasons I mentioned
earlier in the thread.

Think about something like the directory entry cache (dcache),
all the CPUs need to see that cache consistently, and you cannot
avoid locking overhead by having the locking done by a microkernel.

The only way to avoid locking overhead is by changing the data
structure to something that doesn't need locking.

No matter how low your locking overhead - once you have 1024
CPUs it's probably too high.

--
All Rights Reversed

Al Viro
Oct 2, 2005, 10:00:18 PM
On Sun, Oct 02, 2005 at 06:50:46PM -0700, Vadim Lobanov wrote:
> > visionary [n]: onanist with strong exhibitionist tendencies; from
> > "visions", the source of inspiration they refer to when it becomes
> > obvious that they have lost both sight and capacity for rational
> > thought.
> >
>
> Nice. :-)
>
> Just from idle curiosity, you wouldn't know where that list currently
> resides, would you?

No idea... Probably somebody from kernelnewbies.org crowd would know
the current location...

Luke Kenneth Casson Leighton
Oct 2, 2005, 10:10:15 PM
On Mon, Oct 03, 2005 at 02:53:00AM +0100, Al Viro wrote:
> On Sun, Oct 02, 2005 at 06:50:46PM -0700, Vadim Lobanov wrote:
> > > visionary [n]: onanist with strong exhibitionist tendencies; from
> > > "visions", the source of inspiration they refer to when it becomes
> > > obvious that they have lost both sight and capacity for rational
> > > thought.

oo, nice pretty flowers, wheeee :)

Vadim Lobanov
Oct 2, 2005, 10:40:06 PM
On Mon, 3 Oct 2005, Luke Kenneth Casson Leighton wrote:

> On Sun, Oct 02, 2005 at 06:20:38PM -0700, Vadim Lobanov wrote:
>
> > On Mon, 3 Oct 2005, Luke Kenneth Casson Leighton wrote:
> >
> > > On Sun, Oct 02, 2005 at 04:37:52PM -0700, Vadim Lobanov wrote:
> > >
> > > > > what if, therefore, someone comes up with an architecture that is
> > > > > better than or improves greatly upon SMP?
> > > >
> > > > Like NUMA?
> > >
> > > yes, like numa, and there is more.
> >
> > The beauty of capitalization is that it makes it easier for others to
> > read what you have to say.
>
> sorry, vadim: haven't touched a shift key in over 20 years.

It's not going to bite you. I promise.

No, for reliability and performance reasons, I very much want a kernel
in the way. After all, kernel code is orders of magnitude better tuned
than almost all userland code.

The point I was making here is that, from what I can see, the current
Linux architecture is quite alright in anticipation of the hardware that
you're describing. It _could_ be better tuned for such hardware, sure,
but so far there is no need for such work at this particular moment.

I can guarantee one thing here -- the cost, as is, is absolutely
bearable. These companies make more money doing this than they spend in
doing it, otherwise they wouldn't be in business. From an economics
perspective, this industry is very much alive and well, proven by the
fact that these companies haven't bailed out of it yet.

We don't need hints. We need actual performance statistics --
verifiable numbers that we can point to and say "Oh crap, we're losing."
or "Hah, we kick butt.", as the case may be.

> watch what intel and amd do: they will support _anything_ - clutch at
> straws - to make parallelism palable, why? because in order to be
> competitive - and realistically priced - they don't have any choice.

As stated earlier, I doubt they're in such dire straits as you predict.
Ultimately, the only reason why they need to advance their designs is to
be able to market it better. This means that truly innovative designs
may not be pursued because the up-front cost is too high.

There's a saying: "Let your competitor do your R&D for you."

> plus, i am expecting the chips to be thrown out there (like
> the X-Box 360 which has SIX hardware threads remember) and
> the software people to quite literally _have_ to deal with it.
>
> i expect the hardware people to go: this is the limit, this is what we
> can do, realistically price-performance-wise: lump it, deal with it.
>
> when intel and amd start doing that, everyone _will_ lump it.
> and deal with it.

Hardware without software is just as useless as software without
hardware. Any argument from any side that goes along the lines of "deal
with it" can be countered in kind.

What this boils down to is that hardware people try to make their
products appealing to program to, from _both_ a speed and a usability
perspective. That's how they get mindshare.

> ... why do you think intel is hyping support for and backing
> hyperthreads support in XEN/Linux so much?

At the risk of stepping on some toes, I believe that hyperthreading is
going out of style, in favor of multi-core processors.

> l.
>

In conclusion, you made claims that Linux is lagging behind. However,
such claims are rather useless without data and/or technical discussions
to back them up.

-Vadim Lobanov

Valdis.K...@vt.edu
Oct 2, 2005, 11:00:13 PM
On Mon, 03 Oct 2005 01:54:00 BST, Luke Kenneth Casson Leighton said:

> in the mid-80s), hardware cache line lookups (which means
> instead of linked list searching, the hardware does it for
> you in a single cycle), stuff like that.

OK.. I'll bite. How do you find the 5th or 6th entry in the linked list,
when only the first entry is in cache, in a single cycle, when a cache line
miss is more than a single cycle penalty, and you have several "These are not
the droids you're looking for" checks and go on to the next entry - and do it
in one clock cycle?

Now, it's really easy to imagine an execution unit that will execute this
as a single opcode, and stall until complete. Of course, this only really helps
if you have multiple execution units - which is what hyperthreading and
multi-core and all that is about. And guess what - it's not news...

The HP2114 and DEC KL10/20 were able to dereference a chain of indirect bits
back in the 70's (complete with warnings that hardware wedges could occur if
an indirect reference formed a loop or pointed at itself). Whoops. :)

And all the way back in 1964, IBM disk controllers were able to do some rather
sophisticated offloading of "channel control words" (amazing what you could do
with 'Search ID Equal', 'Transfer In-Channel' (really a misnamed branch
instruction), and self-modifying CCWs). But even then, they understood that
it was only a win if you could go do other stuff when you waited....

D. Hazelton
Oct 2, 2005, 11:20:11 PM
On Monday 03 October 2005 02:31, Vadim Lobanov wrote:
> On Mon, 3 Oct 2005, Luke Kenneth Casson Leighton wrote:
> > On Sun, Oct 02, 2005 at 06:20:38PM -0700, Vadim Lobanov wrote:
> > > On Mon, 3 Oct 2005, Luke Kenneth Casson Leighton wrote:
> > > > On Sun, Oct 02, 2005 at 04:37:52PM -0700, Vadim Lobanov wrote:
> > > > > > what if, therefore, someone comes up with an
> > > > > > architecture that is better than or improves greatly upon
> > > > > > SMP?
> > > > >
> > > > > Like NUMA?
> > > >
> > > > yes, like numa, and there is more.
> > >
> > > The beauty of capitalization is that it makes it easier for
> > > others to read what you have to say.
> >
> > sorry, vadim: haven't touched a shift key in over 20 years.
>
> It's not going to bite you. I promise.

You never know - someone might've rigged his keyboard to shock him
every time the shift key was pressed :)

<snip>


> > > > the message passing system is designed as a parallel message
> > > > bus - completely separate from the SMP and NUMA memory
> > > > architecture, and as such it is perfect for use in
> > > > microkernel OSes.
> > >
> > > You're making an implicit assumption here that it will benefit
> > > _only_ microkernel designs.
> >
> > ah, i'm not: i just left out mentioning it :)
> >
> > the message passing needs to be communicated down to manage
> > threads, and also to provide a means to manage semaphores and
> > mutexes: ultimately, support for such an architecture would
> > work its way down to libc.
> >
> >
> > and yes, if you _really_ didn't want a kernel in the way at all,
> > you could go embedded and just... do everything yourself.
> >
> > or port reactos, the free software reimplementation of nt,
> > to it, or something :)
> >
> > *shrug*.
>
> No, for reliability and performance reasons, I very much want a
> kernel in the way. After all, kernel code is orders of magnitude
> better tuned than almost all userland code.
>
> The point I was making here is that, from what I can see, the
> current Linux architecture is quite alright in anticipation of the
> hardware that you're describing. It _could_ be better tuned for
> such hardware, sure, but so far there is no need for such work at
> this particular moment.

Wholly agreed. The arguments over the benefits of running a
microkernel aren't ever really clear. Beyond that, I personally feel
that the whole micro vs. mono argument is a catfight between
academics. I'd rather have a system that works and is proven than a
system that is bleeding edge and never truly stable. To me this
means a monolithic kernel - microkernels are picky at best, and can
be highly insecure (and that means "unstable" in my book too).

<snip>


> > > > however, as i pointed out, 90nm and approx-2Ghz is pretty
> > > > much _it_, and to get any faster you _have_ to go parallel.
> > >
> > > Sure, it's going to stop somewhere, but you have to be a heck
> > > of a visionary to predict that it will stop _there_.
> >
> > okay, i admit it: you caught me out - i'm a mad visionary.
> >
> > but seriously.
> >
> > it won't stop - but the price of 90nm mask charges, at approx
> > $2m, is already far too high, and the number of large chips
> > being designed is plummetting like a stone as a result - from
> > like 15,000 per year a few years ago down to ... damn, can't
> > remember - less than a hundred (i think! don't quote me on
> > that!)
> >
> > when 90 nm was introduced, some mad fabs wanted to make 9
> > metre lenses, dude!!! until carl zeiss were called in and
> > managed to get it down to 3 metres.
> >
> > and that lens is produced on a PER CHIP basis.
> >
> > basically, it's about cost.
>
> I can guarantee one thing here -- the cost, as is, is absolutely
> bearable. These companies make more money doing this than they
> spend in doing it, otherwise they wouldn't be in business. From an
> economics perspective, this industry is very much alive and well,
> proven by the fact that these companies haven't bailed out of it
> yet.

I have to agree. And he is also completely ignoring the fact that both
Intel and AMD are either in the process of moving to (or have moved
to) a 65nm fab process - last news I saw about this said both
facilities were running into the multi-billion dollar cost range.
Companies worried about $2m for a mask charge wouldn't be investing
multiple billions of dollars in new plants and a new, smaller fab
process.

<snip>

Hear, hear! I'm still working my way through the source tree and
learning the general layout and functionality of the various bits,
but in just a pair of months of being on this list I can attest to
the fact that one thing all developers seem to ask for is statistics.

<snip>


> At the risk of stepping on some toes, I believe that hyperthreading
> is going out of style, in favor of multi-core processors.

Agreed. And multi-core processors aren't really new technology - there
have been multi-core designs out for a while, but those were usually
low production "research" chips.

DRH


Rik van Riel
Oct 2, 2005, 11:30:13 PM
On Sun, 2 Oct 2005, Valdis.K...@vt.edu wrote:

> OK.. I'll bite. How do you find the 5th or 6th entry in the linked
> list, when only the first entry is in cache, in a single cycle, when a
> cache line miss is more than a single cycle penalty, and you have
> several "These are not the droids you're looking for" checks and go on
> to the next entry - and do it in one clock cycle?

A nice saying from the last decade comes to mind:

"If you can do all that in one cycle, your cycles are too long."

--
All Rights Reversed

Gene Heskett
Oct 3, 2005, 12:00:07 AM
On Sunday 02 October 2005 19:48, Rik van Riel wrote:
>On Sun, 2 Oct 2005, Gene Heskett wrote:
>> Ahh, yes and no, Robert. The un-answered question, for that
>> 512 processor Altix system, would be "but does it run things 512
>> times faster?" Methinks not, by a very wide margin. Yes, do a lot
>> of unrelated things fast maybe, but render a 30 megabyte page with
>> ghostscript in 10 milliseconds? Never happen IMO.
>
>You haven't explained us why you think your proposal
>would allow Linux to circumvent Amdahl's law...

Amdahl's Law?

Thats a reference I don't believe I've been made aware of. Can you
elaborate?

Besides, it isn't my proposal, just a question in that I chose a
scenario (ghostscripts rendering of a page of text) that in fact only
runs maybe 10x faster on an XP-2800 Athlon with a gig of dram than it
did on my old 25 MHz 68040-equipped amiga with 64 megs of dram - so it
wasn't nearly as memory bound doing that as most of the Amigas were.

--
Cheers, Gene
"There are four boxes to be used in defense of liberty:
soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
99.35% setiathome rank, not too shabby for a WV hillbilly
Yahoo.com and AOL/TW attorneys please note, additions to the above
message by Gene Heskett are:
Copyright 2005 by Maurice Eugene Heskett, all rights reserved.

-

Willy Tarreau
Oct 3, 2005, 12:20:09 AM
On Mon, Oct 03, 2005 at 12:24:16AM +0100, Luke Kenneth Casson Leighton wrote:

> On Sun, Oct 02, 2005 at 11:49:57PM +0100, Christoph Hellwig wrote:
> > Let's hope these posts will stop when the UK starts to allow serving
> > drinks after 23:00. Post from half-drunk people that need to get a life
> > don't really help a lot.
>
> hi, christoph,
>
> i assume that your global world-wide distribution of this message
> was a mistake on your part. but, seeing as it _has_ gone out to
> literally thousands of extremely busy people, i can only apologise
> to them on your behalf for the mistake of wasting their valuable
> time.

If you think this, you don't know Christoph then. We all know him
for his "warm words" and his frankness. Sometimes he may be quite
a bit excessive, but here I tend to agree with him. He simply meant
that these long threads are often totally useless and consume a lot
of time to real developers. Generally speaking, calling developers
to tell them "hey, you work the wrong way" is not productive. If you
tell them that you can improve their work and show some code, it's
often more appreciated. Yes, I know you provided some links to sites,
but without proposing any real guideline or project. You'd have done
better to post something like "announce: merging of l4linux into
mainline" and to tell people that you will start a slow, non-intrusive
merge of one of your favourite projects. You'd have got lots
of opposition, but this would have been more productive than just
talking philosophy.

> let's also hope that people who believe that comments such as the one
> that you have made are useful and productive also think about the
> consequences of doing so, bear in mind that internet archives are
> forever, and also that they check whether the person that they are
> criticising drinks at _all_.

[ cut all the uninteresting info about your drinking habits that you
sent to the whole world and which will be archived forever ]

Regards,
Willy

Sonny Rao
Oct 3, 2005, 1:10:15 AM

On Mon, Oct 03, 2005 at 01:54:00AM +0100, Luke Kenneth Casson Leighton wrote:
<snip>

> this company's hardware is kinda a bit difficult for most people to get
> their heads round: it's basically parallelised hardware-acceleration for
> operating systems, and very few people see the point in that.

Obviously, we are all clueless morons.

<snip>

> so.
>
> this is a heads-up.
>
> what you choose to do with this analysis is up to you.
>
> l.

Roll around on the floor while violently laughing for a while?

Nick Piggin
Oct 3, 2005, 1:50:15 AM

Allow me to apply Rusty's technique, if you will.

Luke Kenneth Casson Leighton wrote:

> Hi,
>
> Can all you great kernel hackers, who only know a little bit
> less than me and have only built a slightly less successful
> kernel than I have, stop what you are doing and do it my way
> instead?
>

Hi Luke,

Thanks for your concise and non-rambling letter that is actually
readable - a true rarity on lkml these days.

To answer your question: I think we would all be happy to examine
your ideas when you can provide some real numbers and comparisons
and actual technical arguments as to why they are better than the
current scheme we have in Linux.

Nick

PS. I am disappointed not to have seen any references to XML in
your proposal. May I suggest you adopt some kind of XML format
for your message protocol?

Send instant messages to your online friends http://au.messenger.yahoo.com

Meelis Roos
Oct 3, 2005, 4:00:16 AM

LKCL> the code for oskit has been available for some years, now,
LKCL> and is regularly maintained. the l4linux people have had to

My experience with oskit (trying to let students use it for OS course
homework) is quite ... underwhelming. It works as long as you try to use
it exactly like the developers did and breaks on the slightest sidestep
from that road. And there's not much documentation so it's hard to learn
where that road might be.

We switched to Linux/BSD code hacking with the students; that code
actually works.

YMMV.

--
Meelis Roos

Erik Mouw
Oct 3, 2005, 5:40:20 AM

On Mon, Oct 03, 2005 at 02:53:00AM +0100, Al Viro wrote:
> On Sun, Oct 02, 2005 at 06:50:46PM -0700, Vadim Lobanov wrote:
> > > visionary [n]: onanist with strong exhibitionist tendencies; from
> > > "visions", the source of inspiration they refer to when it becomes
> > > obvious that they have lost both sight and capacity for rational
> > > thought.
> > >
> >
> > Nice. :-)
> >
> > Just from idle curiosity, you wouldn't know where that list currently
> > resides, would you?
>
> No idea... Probably somebody from kernelnewbies.org crowd would know
> the current location...

There's not really a list of words on kernelnewbies.org, but a fortunes
file:

http://www.kernelnewbies.org/kernelnewbies-fortunes.tar.gz

(and I just updated it with your description of a visionary)


Erik

--
+-- Erik Mouw -- www.harddisk-recovery.nl -- 0800 220 20 20 --
| Eigen lab: Delftechpark 26, 2628 XH, Delft, Nederland
| Files foetsie, bestanden kwijt, alle data weg?!
| Blijf kalm en neem contact op met Harddisk-recovery.nl!

Jesper Juhl
Oct 3, 2005, 5:50:19 AM

On 10/3/05, Gene Heskett <gene.h...@verizon.net> wrote:
> On Sunday 02 October 2005 19:48, Rik van Riel wrote:
> >On Sun, 2 Oct 2005, Gene Heskett wrote:
> >> Ahh, yes and no, Robert. The un-answered question, for that
> >> 512 processor Altix system, would be "but does it run things 512
> >> times faster?" Methinks not, by a very wide margin. Yes, do a lot
> >> of unrelated things fast maybe, but render a 30 megabyte page with
> >> ghostscript in 10 milliseconds? Never happen IMO.
> >
> >You haven't explained to us why you think your proposal
> >would allow Linux to circumvent Amdahl's law...
>
> Amdahl's Law?
>
http://en.wikipedia.org/wiki/Amdahl's_law
http://home.wlu.edu/~whaleyt/classes/parallel/topics/amdahl.html

And google has even more. Wonderful thing those search engines...
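For anyone who follows those links, the law itself fits in a few lines. Here is a minimal sketch (the function name and the 10% serial fraction are illustrative choices, not figures from the thread):

```python
# Amdahl's law: with a fraction s of the work inherently serial,
# N processors give a speedup of 1 / (s + (1 - s) / N).
def amdahl_speedup(serial_fraction, n_procs):
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / n_procs)

# A job that is only 10% serial tops out below 10x -- even on a
# 512-processor Altix like the one mentioned earlier in the thread:
for n in (2, 8, 512):
    print(n, round(amdahl_speedup(0.10, n), 2))
```

The point Rik is making drops straight out of the formula: once the serial fraction dominates 1/N, adding processors stops helping.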


--
Jesper Juhl <jespe...@gmail.com>
Don't top-post http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please http://www.expita.com/nomime.html

Giuseppe Bilotta
Oct 3, 2005, 6:50:22 AM

On Mon, 3 Oct 2005 02:53:02 +0100, Luke Kenneth Casson Leighton wrote:

> On Sun, Oct 02, 2005 at 06:20:38PM -0700, Vadim Lobanov wrote:
>> The beauty of capitalization is that it makes it easier for others to
>> read what you have to say.
>
> sorry, vadim: haven't touched a shift key in over 20 years.
>

[snip]

> *sigh* - i _so_ hope we don't need degrees in physics to program
> them...

[snip]

> ah, i'm not: i just left out mentioning it :)

I'd *love* a keyboard layout where * _ : ) are accessible without
shift! Can you send me yours?

--
Giuseppe "Oblomov" Bilotta

Hic manebimus optime

Horst von Brand
Oct 3, 2005, 9:40:22 AM

Luke Kenneth Casson Leighton <lk...@lkcl.net> wrote:
> On Sun, Oct 02, 2005 at 04:37:52PM -0700, Vadim Lobanov wrote:
> > > what if, therefore, someone comes up with an architecture that is
> > > better than or improves greatly upon SMP?

> > Like NUMA?

> yes, like numa, and there is more.
>

> i had the honour to work with someone who came up with a radical
> enhancement even to _that_.

Any papers to look at?

> basically the company has implemented, in hardware (a nanokernel),

A nanokernel is a piece of software in my book?

> some
> operating system primitives, such as message passing (based on a
> derivative by thompson of the "alice" project from plessey, imperial and
> manchester university in the mid-80s), hardware cache line lookups
> (which means instead of linked list searching, the hardware does it for
> you in a single cycle), stuff like that.

Single CPU cycle for searching data in memory? Impossible.

> the message passing system is designed as a parallel message bus -
> completely separate from the SMP and NUMA memory architecture, and as
> such it is perfect for use in microkernel OSes.

Something must shuffle the data from "regular memory" into "message
memory", so I bet that soon becomes the bottleneck. And the duplicate data
paths add to the cost, money that could be spent on making memory access
faster, so...

> (these sorts of things are unlikely to make it into the linux kernel, no
> matter how much persuasion and how many patches they would write).

Your head would spin at how fast this would get into Linux if there
were such machines around and it were worth it.

> _however_, a much _better_ target would be to create an L4 microkernel
> on top of their hardware kernel.

Not yet another baroque CISC design, this time around with 1/3 of an OS in
it!

> this company's hardware is kinda a bit difficult for most people to get
> their heads round: it's basically parallelised hardware-acceleration for
> operating systems, and very few people see the point in that.

Perhaps most people that don't see the point do have a point?

> however, as i pointed out, 90nm and approx-2Ghz is pretty much _it_,
> and to get any faster you _have_ to go parallel.

Sorry, all this has been doomsayed (with different numbers) from 1965 or
so.

> and the drive for "faster", "better", "more sales" means more and more
> parallelism.

Right.

> it's _happening_ - and SMP ain't gonna cut it (which is why
> these multi-core chips are coming out and why hyperthreading
> is coming out).

Hyperthreading and multi-core /are/ SMP, just done a bit differently.
--
Dr. Horst H. von Brand User #22616 counter.li.org
Departamento de Informatica Fono: +56 32 654431
Universidad Tecnica Federico Santa Maria +56 32 654239
Casilla 110-V, Valparaiso, Chile Fax: +56 32 797513

Jon Masters
Oct 3, 2005, 10:30:37 AM

On 10/2/05, Luke Kenneth Casson Leighton <lk...@lkcl.net> wrote:

> as i love to put my oar in where it's unlikely that people
> will listen, and as i have little to gain or lose by doing
> so, i figured you can decide for yourselves whether to be
> selectively deaf or not:

Hi Luke,

Haven't seen you since I believe you gave a somewhat interesting talk
on FUSE at an OxLUG a year or more back. I don't think anyone here is
selectively deaf, but some might just ignore you for such comments :-)

> what prompted me to send this message now was a recent report
> where linus' no1 patcher is believed to be close to overload,
> and in that report, i think it was andrew morton, it was
> said that he believed the linux kernel development rate to be
> slowing down, because it is nearing completion.

There was some general bollocks about Andrew being burned out, but
that wasn't the point as I saw it: it was more about how things could
be better streamlined than about a sudden panic moment.

> i think it safe to say that a project only nears completion
> when it fulfils its requirements and, given that i believe that
> there is going to be a critical shift in the requirements, it
> logically follows that the linux kernel should not be believed
> to be nearing completion.

Whoever said it was?

> with me so far? :)

I don't think anyone with a moderate grasp of the English language
will have failed to understand what you wrote above. They might not
understand why you said it, but that's another issue.

> the basic premise: 90 nanometres is basically... well...
> price/performance-wise, it's hit a brick wall at about 2.5Ghz, and
> both intel and amd know it: they just haven't told anyone.

But you /know/ this because you're a microprocessor designer as well
as a contributor to the FUSE project?

> anyone (big) else has a _really_ hard time getting above 2Ghz,
> because the amount of pipelining required is just... insane
> (see recent ibm powerpc5 see slashdot - what speed does it do?
> surprise: 2.1Ghz when everyone was hoping it would be 2.4-2.5ghz).

I think there are many possible reasons for that and I doubt slashdot
will reveal any of those reasons. The main issue (as I understand it)
is that SMT/SMP is taking off for many applications and manufacturers
want to cater for them while reducing heat output - so they care less
about MHz than about potential real world performance.

> so, what's the solution?

> well.... it's to back to parallel processing techniques, of course.

Yes. Wow! Of course! Whoda thunk it? I mean, parallel processing!
Let's get that right into the kern...oh wait, didn't Alan and a bunch
of others already do that years ago? Then again, we might have missed
all of the stuff which went into 2.2, 2.4 and then 2.6?

> well - intel is pushing "hyperthreading".

Wow! Really? I seem to have missed /all/ of those annoying ads. But
please tell me some more about it!

> and, what is the linux kernel?

> it's a daft, monolithic design that is suitable and faster on
> single-processor systems, and that design is going to look _really_
> outdated, really soon.

Why? I happen to think Microkernels are really sexy in a Computer
Science masturbatory kind of way, but Linux seems to do the job just
fine in real life. Do we need to have this whole
Microkernel/Monolithic conversation simply because you misunderstood
something about the kind of performance now possible in 2.6 kernels as
compared with adding a whole pointless level of message passing
underneath?

Jon.

Miklos Szeredi
Oct 3, 2005, 12:10:15 PM

> But you /know/ this because you're a microprocessor designer as well
> as a contributor to the FUSE project?

AFAIK Luke never contributed to the FUSE project. Hopefully that
answers your question.

FUSE and microkernels are sometimes mentioned together, but I believe
there's a very important philosophical difference:

FUSE was created to ease the development and use of a very _special_
group of filesystems. It was never meant to replace (and never will)
the fantastically efficient and flexible internal filesystem
interfaces in Linux and other monolithic kernels.

On the other hand, the microkernel approach is to restrict _all_
filesystems to the more secure, but less efficient and less flexible
interface. Which is stupid IMO.

Miklos

Valdis.K...@vt.edu
Oct 3, 2005, 12:40:09 PM

On Sun, 02 Oct 2005 22:12:38 EDT, Horst von Brand said:

> > some
> > operating system primitives, such as message passing (based on a
> > derivative by thompson of the "alice" project from plessey, imperial and
> > manchester university in the mid-80s), hardware cache line lookups
> > (which means instead of linked list searching, the hardware does it for
> > you in a single cycle), stuff like that.
>
> Single CPU cycle for searching data in memory? Impossible.

Well... if it was content-addressable RAM, similar to what's already used
for the hardware TLBs and the like - just that it's one thing to make a
32- or 256-entry content-addressable RAM, and totally another to have
multiple megabytes of the stuff. :)

Joe Bob Spamtest
Oct 3, 2005, 2:00:11 PM

Luke Kenneth Casson Leighton wrote:
> p.s. martin. _don't_ do that again. i don't care who you are:
> internet archives are forever and your rudeness will be noted
> by google-users and other search-users - long after you are dead.

and who are you, the thought police? Get off your high horse.

I'm sure he's well aware of the consequences of posting to this list, as
I'm sure we all are. Hell, even *I* know all my mails to this list are
going to be archived for eternity.

Look, if you want to be a productive member of our community, stop
bitching about the way things *should* be, and submit some patches like
everyone else. Code talks. Bullshit ... well, it doesn't do much but sit
around stinking the place up.

Lennart Sorensen
Oct 3, 2005, 2:10:12 PM

On Mon, Oct 03, 2005 at 10:50:00AM +0300, Meelis Roos wrote:
> LKCL> the code for oskit has been available for some years, now,
> LKCL> and is regularly maintained. the l4linux people have had to
>
> My experience with oskit (trying to let students use it for OS course
> homework) is quite ... underwhelming. It works as long as you try to use
> it exactly like the developers did and breaks on a slightest sidestep
> from that road. And there's not much documentation so it's hard to learn
> where that road might be.
>
> Switched to Linux/BSD code hacking with students, the code that actually
> works.

Can oskit be worse than nachos, where the OS ran outside the memory space
and CPU, with only applications being inside the emulated MIPS processor?
That made some things much too easy to do, and other things much too hard
(like converting an address from user space to kernel space and accessing
it, which should be easy, but was hard).

I suspect most 'simple' OS teaching tools are awful. Of course, writing
a complete OS from scratch is a serious pain and makes debugging much
harder than if you can do your work on top of a working OS that can
print debug messages.

Len Sorensen

Lennart Sorensen
Oct 3, 2005, 2:30:22 PM

On Mon, Oct 03, 2005 at 02:53:02AM +0100, Luke Kenneth Casson Leighton wrote:
> sorry, vadim: haven't touched a shift key in over 20 years.

Except to type that ':' I suspect.

How about learning to use that shift key, so that everyone else who
reads your writing doesn't have to spend so much time working it out
when proper syntax would have made it simpler? It may save you 1% of
your time while typing, but it costs thousands of people much more as
a result: a net loss for the world as a whole.

> ah, i'm not: i just left out mentioning it :)
>
> the message passing needs to be communicated down to manage
> threads, and also to provide a means to manage semaphores and
> mutexes: ultimately, support for such an architecture would
> work its way down to libc.
>
> and yes, if you _really_ didn't want a kernel in the way at all, you
> could go embedded and just... do everything yourself.
>
> or port reactos, the free software reimplementation of nt,
> to it, or something :)

Microkernel, message passing, blah blah blah. Does GNU Hurd actually
run fast yet? I think it exists finally and works (at least mostly) but
how does the performance compare?

Most of your arguments seem like a repeat of older academic theories,
most of which have not been used in a real system where performance
running average software was important - at least none that I have
heard of. Not that that necessarily means much, other than that they
can't have gained much popularity.

> it won't stop - but the price of 90nm mask charges, at approx
> $2m, is already far too high, and the number of large chips
> being designed is plummetting like a stone as a result - from
> like 15,000 per year a few years ago down to ... damn, can't remember -
> less than a hundred (i think! don't quote me on that!)

Hmm, so if we guess it might take 10 mask sets per processor type over
its lifetime as they change features and such, that's still less than
1% of the cost of the fab in the first place. I agree with the person
who said Intel/AMD/whoever probably don't care much, as long as their
engineers make really darn sure that the mask is correct when they go
to make one.

> okay: i will catch up on this bit, another time, because it is late
> enough for me to be getting dizzy and appearing to be drunk.
>
> this is one answer (and there are others i will write another time.
> hint: automated code analysis tools, auto-parallelising tools, both
> offline and realtime):
>
> watch what intel and amd do: they will support _anything_ - clutch at
> straws - to make parallelism palatable, why? because in order to be
> competitive - and realistically priced - they don't have any choice.
>
> plus, i am expecting the chips to be thrown out there (like
> the X-Box 360 which has SIX hardware threads remember) and
> the software people to quite literally _have_ to deal with it.

Hey, I like parallel processing. I think it's neat, and I have often
made some of my own tools multithreaded just because I found it could
be done for the task; I often find multithreading simpler to code (I
seem to be a bit of a weirdo that way) for certain tasks.

> i expect the hardware people to go: this is the limit, this is what we
> can do, realistically price-performance-wise: lump it, deal with it.
>
> when intel and amd start doing that, everyone _will_ lump it.
> and deal with it.
>
> ... why do you think intel is hyping support for and backing
> hyperthreads support in XEN/Linux so much?

Ehm, because intel has it and their P4 desperately needs help to gain
any performance it can until they get the Pentium-M based desktop chips
finished with multiple cores, and of course because AMD doesn't have it.
Seem like good reasons for intel to try and push it.

Len Sorensen

linux-os (Dick Johnson)
Oct 3, 2005, 2:40:16 PM


On Mon, 3 Oct 2005, Lennart Sorensen wrote:

> On Mon, Oct 03, 2005 at 10:50:00AM +0300, Meelis Roos wrote:
>> LKCL> the code for oskit has been available for some years, now,
>> LKCL> and is regularly maintained. the l4linux people have had to
>>
>> My experience with oskit (trying to let students use it for OS course
>> homework) is quite ... underwhelming. It works as long as you try to use
>> it exactly like the developers did and breaks on a slightest sidestep
>> from that road. And there's not much documentation so it's hard to learn
>> where that road might be.
>>
>> Switched to Linux/BSD code hacking with students, the code that actually
>> works.
>
> Can oskit be worse than nachos, where the OS ran outside the memory space
> and CPU, with only applications being inside the emulated MIPS processor?
> That made some things much too easy to do, and other things much too hard
> (like converting an address from user space to kernel space and accessing
> it, which should be easy, but was hard).
>
> I suspect most 'simple' OS teaching tools are awful. Of course, writing
> a complete OS from scratch is a serious pain and makes debugging much
> harder than if you can do your work on top of a working OS that can
> print debug messages.
>
> Len Sorensen
> -

But the first thing you must do in a 'roll-your-own' OS is to make
provisions to write text to (sometimes a temporary) output device
and get some input from same. Writing such basic stuff is getting
harder because many embedded systems don't have UARTS, screen-cards,
keyboards, or any useful method of doing I/O. This is where an
existing OS (Like Linux) can help you get some I/O running, perhaps
through a USB bus. You debug and make it work as a Linux
Driver, then you link the working stuff into your headless CPU
board.

Cheers,
Dick Johnson
Penguin : Linux version 2.6.13 on an i686 machine (5589.55 BogoMips).
Warning : 98.36% of all statistics are fiction.

****************************************************************
The information transmitted in this message is confidential and may be privileged. Any review, retransmission, dissemination, or other use of this information by persons or entities other than the intended recipient is prohibited. If you are not the intended recipient, please notify Analogic Corporation immediately - by replying to this message or by sending an email to Deliver...@analogic.com - and destroy all copies of this information, including any attachments, without reading or disclosing them.

Thank you.

Alan Cox
Oct 3, 2005, 2:50:09 PM

On Sul, 2005-10-02 at 22:55 -0400, Valdis.K...@vt.edu wrote:
> The HP2114 and DEC KL10/20 were able to dereference a chain of indirect bits
> back in the 70's (complete with warnings that hardware wedges could occur if
> an indirect reference formed a loop or pointed at itself).

The KL10 has an 8 way limit. The PDP-6 didn't but then it also lacked an
MMU.

Alan Cox
Oct 3, 2005, 3:00:07 PM

On Llu, 2005-10-03 at 01:54 +0100, Luke Kenneth Casson Leighton wrote:
> the message passing system is designed as a parallel message bus -
> completely separate from the SMP and NUMA memory architecture, and as
> such it is perfect for use in microkernel OSes.

I've got one of those. It has the memory attached. Makes a fantastic
message bus and has a really long queue. Also features shortcuts for
messages travelling between processors in short order cache to cache.
Made by AMD and Intel.

> however, as i pointed out, 90nm and approx-2Ghz is pretty much _it_,
> and to get any faster you _have_ to go parallel.

We do 512 processors passably now. That's a lot of cores and more than
the commodity computing people can wire to memory subsystems at a price
people will pay.

Besides which, you need to take it up with the desktop people really.
It's their apps that use most of the processor power and will benefit most
from parallelising and efficiency work.

Alan

Luke Kenneth Casson Leighton
Oct 3, 2005, 3:00:15 PM

On Mon, Oct 03, 2005 at 02:08:58PM -0400, Lennart Sorensen wrote:
> On Mon, Oct 03, 2005 at 10:50:00AM +0300, Meelis Roos wrote:
> > LKCL> the code for oskit has been available for some years, now,
> > LKCL> and is regularly maintained. the l4linux people have had to
> >
> > My experience with oskit (trying to let students use it for OS course
> > homework) is quite ... underwhelming. It works as long as you try to use
> > it exactly like the developers did and breaks on a slightest sidestep
> > from that road. And there's not much documentation so it's hard to learn
> > where that road might be.

analysis, verification, debugging and adoption of oskit by
the linux kernel maintainers would help enormously there,
i believe, which is why i invited the kernel maintainers to
give it some thought.

there are other reasons: not least is that oskit _is_ the
linux kernel source code - with the kernel/* bits removed and
the device drivers and support infrastructure remaining.


so the developers who split the linux source code out into oskit did
not, in your opinion and experience, meelis, do a very good job: so
educate them and tell them how to do it better.

l.

Luke Kenneth Casson Leighton
Oct 3, 2005, 3:10:13 PM