
what's next for the linux kernel?


Luke Kenneth Casson Leighton

Oct 2, 2005, 4:50:06 PM
hi,

as i love to put my oar in where it's unlikely that people
will listen, and as i have little to gain or lose by doing
so, i figured you can decide for yourselves whether to be
selectively deaf or not:

here's my take on where i believe the linux kernel needs to go,
in order to keep up.

what prompted me to send this message now was a recent report
that linus' no.1 patcher - andrew morton, i believe - is close
to overload, and in that report he was said to believe that the
rate of linux kernel development is slowing down because the
kernel is nearing completion.

i think it safe to say that a project only nears completion
when it fulfils its requirements and, given that i believe that
there is going to be a critical shift in the requirements, it
logically follows that the linux kernel should not be believed
to be nearing completion.

with me so far? :)

okay, so what's the bit that's missing that mr really irritating,
oh-so-right and oh-so-killfile-ignorable luke kenneth casson
kipper lozenge has spotted that nobody else has, and what's
the fuss about?

well... to answer that, i need to outline a bit about processor
manufacturing: if you are familiar with processor design please
forgive me, this is for the benefit of those who might not be.

the basic premise: 90 nanometres is basically... well...
price/performance-wise, it's hit a brick wall at about 2.5GHz, and
both intel and amd know it: they just haven't told anyone.

anyone (big) else has a _really_ hard time getting above 2GHz,
because the amount of pipelining required is just... insane
(see the recent slashdot coverage of ibm's powerpc5 - what speed
does it do? surprise: 2.1GHz, when everyone was hoping it would
be 2.4-2.5GHz).

a _small_ chip design company (not an IBM, intel or amd)
will be lucky to see this side of 1Ghz, at 90nm.

also, the cost of mask charges for 90nm is insane: somewhere
around $2million and that's never going to go away.

the costs for 65nm are going to be far far greater than that,
and 45nm i don't even want to imagine what they're going to be.

plus, there's a problem of quantum mechanics, heat dissipation
and current drain that makes, with current manufacturing
techniques, the production of 65nm and 45nm chips really
problematic.

with present manufacturing techniques, the current drain and heat
dissipation associated with 45nm means that you have to cut the number
of gates down to ONE MILLION, otherwise the chip destroys itself.

(brighter readers might now have an inkling of where i'm going
with this - bear with me :)

compare that one million gates with the present number of gates in an
AMD or intel x86 chip - some oh, what, 20 million?

now you get it?

for the present insane uniprocessor architectures at least
(and certainly for the x86 design), 90nm is _it_ - and yet,
people demand ever faster processing, and no amount of trying
on the part of engineers can get round the laws of physics.

so, what's the solution?

well.... it's to go back to parallel processing techniques, of course.

and, surprise surprise, what do we have intel pushing?

apart, of course, from the performance-per-watt metric (if you
read a few paragraphs back, you'll realise why they have to
convince both their customers and their engineers that
performance/watt is suddenly important: they have to carve out
a path for a while, getting the current usage down, in order
for the 65nm chips to become palatable - assuming they can be
made at all at a realistic yield - read: price bracket).

well - intel is pushing "hyperthreading".

and surprise, surprise, what is amd pushing? dual-core chips.

and what is in the X-Box 360? a PowerPC _triple_ core, _dual_
hyper-threaded processor!!

i believe that the X-Box 360 processor is the way things
are going to be moving - quad-core quad-threaded processors;
16 and 32 core ultra-RISC processors: medium to massive parallel
processors, but this time single-chip, unlike the past decade(s)
when multiprocessing was hip and cool and... expensive.

i believe the future to contain stacks of single-chip multiprocessing
designs in several forms - including intel's fun-and-games VLIW stuff.

remember: intel recently bought the company that has spent
15 years working on that DEC/Alpha just-in-time x86-to-alpha
assembly converter product (remember DEC/Alphas running NT 3.51,
anyone, and still being able to run x86 programs?)

and, what is the linux kernel?

it's a daft, monolithic design that is suitable and faster on
single-processor systems, and that design is going to look _really_
outdated, really soon.

fortunately, there is a research project that has already
done a significant amount of work in breaking away from the
monolithic design: the l4linux project.

last time i checked, a few months ago, they were keeping thoroughly
up-to-date and had either 2.6.11 or 2.6.12 ported, can't recall which.

the l4linux project puts the linux kernel on top of L4-compliant
microkernels (yes, just like the gnu hurd :) and there are several such
L4-compliant microkernels - named after nuts. pistachio, etc.

one of those l4-compliant microkernels is a parallel processor
based one - it's SMP capable, it even has support for virtual
machines, whoopee, ain't that fun.

i remember now. university of south australia, and university
of karlsruhe. i probably spelled that wrong.


in short, basically, if you follow and agree with the logic, the
linux kernel - as maintained by linus - is far from complete.

i therefore invite you to consider the following strategy:

1) that the linux kernel should merge with the oskit project, or that
the linux kernel should split into two projects: a) 30-40k lines of
code comprising the code in kernel/*, plus headers and ports;
b) device drivers - i.e., duh, the oskit project.

2) that the linux kernel should merge and maintain the efforts
of the l4linux project - mainlined, not sidelined.

3) that serious efforts be diverted into the l4 microkernels to make
them portable and to make them work with parallel-processor systems,
hyperthreading, SMP and more (such as ACPI, which has had to be
#ifdef'd out even in XEN).

4) other.

yes, i know this flies in the face of linus' distaste for
message-based kernels, because message-passing slows things
down... but it slows things down _only_ on uniprocessor
kernel designs, and uniprocessors are going to be blowing
goats / bubbles / insert-as-appropriate in the not-too-distant
future. there have _already_ been high-profile parallel
processor designs announced, released, and put into service
(e.g. dual-core AMD64, triple-core dual-hyperthreaded PowerPC in
the X-Box 360).

yes, i may have got things wrong.

yes, it is up to _you_ to point them out.

yes, it is up to _you_ to decide what to do, not me.

good luck.

l.

p.s. XEN is even getting lovely encouraging noises from intel
to support hyperthreading, isn't that nice boys and girls?

--
http://lkcl.net
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majo...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

Rik van Riel

Oct 2, 2005, 5:10:08 PM
On Sun, 2 Oct 2005, Luke Kenneth Casson Leighton wrote:

> and, what is the linux kernel?
>
> it's a daft, monolithic design that is suitable and faster on
> single-processor systems, and that design is going to look _really_
> outdated, really soon.

Linux already has a number of scalable SMP synchronisation
mechanisms. The main scalability effort nowadays is about
the avoidance of so-called "cache line bouncing".

http://wiki.kernelnewbies.org/wiki/SMPSynchronisation

--
All Rights Reversed

Robert Hancock

Oct 2, 2005, 6:50:08 PM
Luke Kenneth Casson Leighton wrote:
> and, what is the linux kernel?
>
> it's a daft, monolithic design that is suitable and faster on
> single-processor systems, and that design is going to look _really_
> outdated, really soon.

Well, it sounds like it works pretty well on such things as 512 CPU
Altix systems, so it sounds like the suggestion that Linux is designed
solely for single-processor systems and isn't suitable for multicore,
hyperthreaded CPUs doesn't hold much water..

--
Robert Hancock Saskatoon, SK, Canada
To email, remove "nospam" from hanc...@nospamshaw.ca
Home Page: http://www.roberthancock.com/

Christoph Hellwig

Oct 2, 2005, 7:00:10 PM
Let's hope these posts will stop when the UK starts to allow serving
drinks after 23:00. Posts from half-drunk people who need to get a life
don't really help a lot.

Luke Kenneth Casson Leighton

Oct 2, 2005, 7:10:03 PM
On Sun, Oct 02, 2005 at 05:05:42PM -0400, Rik van Riel wrote:
> On Sun, 2 Oct 2005, Luke Kenneth Casson Leighton wrote:
>
> > and, what is the linux kernel?
> >
> > it's a daft, monolithic design that is suitable and faster on
> > single-processor systems, and that design is going to look _really_
> > outdated, really soon.
>
> Linux already has a number of scalable SMP synchronisation
> mechanisms.

... and you are tied in to the decisions made by the linux kernel
developers.

whereas, if you allow something like a message-passing design (such as
in the port of the linux kernel to l4), you have the option to try out
different underlying structures - _without_ having to totally redesign
the infrastructure.

several people involved with the l4linux project have already done
that: as i mentioned in the original post, there are about three or
four different and separate l4 microkernels available for download
(GPL) and one of them is ported to stacks of different architectures,
and one of them is SMP capable and even includes a virtual machine
environment.

and they're only approx 30-40,000 lines each, btw.


also, what about architectures that have features over-and-above SMP?

in the original design of SMP it was assumed that if you have
N processors that you have N-way access to memory.

what if, therefore, someone comes up with an architecture that is
better than or improves greatly upon SMP?

they will need to make _significant_ inroads into the linux kernel
code, whereas if, say, you oh i dunno provide hardware-accelerated
parallel support for a nanokernel (such as l4) which just _happens_
to be better than SMP then running anything which is l4 compliant gets
the benefit.


the reason i mention this is because arguments about saying "SMP is it,
SMP is great, SMP is everything, we're improving our SMP design" don't
entirely cut it, because SMP has limitations that don't scale properly
to say 64 or 128 processors: sooner or later someone's going to come up
with something better than SMP and all the efforts focussed on making
SMP better in the linux kernel are going to look lame.

l.

p.s. yes i do know of a company that has improved on SMP.

Rik van Riel

Oct 2, 2005, 7:30:12 PM
On Mon, 3 Oct 2005, Luke Kenneth Casson Leighton wrote:
> On Sun, Oct 02, 2005 at 05:05:42PM -0400, Rik van Riel wrote:

> > Linux already has a number of scalable SMP synchronisation
> > mechanisms.
>
> ... and you are tied in to the decisions made by the linux kernel
> developers.
>
> whereas, if you allow something like a message-passing design (such as
> in the port of the linux kernel to l4), you have the option to try out
> different underlying structures - _without_ having to totally redesign
> the infrastructure.

Infrastructure is not what matters when it comes to SMP
scalability on modern systems, since lock contention is
not the primary SMP scalability problem.

Due to the large latency ratio between L1/L2 cache and
RAM, the biggest scalability problem is cache invalidation
and cache bounces.

Those are not solvable by using another underlying
infrastructure - they require a reorganization of the
datastructures on top, the data structures in Linux.

Note that message passing is by definition less efficient
than SMP synchronisation mechanisms that do not require
data to be exchanged between CPUs, eg. RCU or the use of
cpu-local data structures.

> p.s. yes i do know of a company that has improved on SMP.

SGI ? IBM ?

--
All Rights Reversed

Luke Kenneth Casson Leighton

Oct 2, 2005, 7:30:16 PM
On Sun, Oct 02, 2005 at 11:49:57PM +0100, Christoph Hellwig wrote:
> Let's hope these posts will stop when the UK starts to allow serving
> drinks after 23:00. Posts from half-drunk people who need to get a life
> don't really help a lot.

hi, christoph,

i assume that your global world-wide distribution of this message
was a mistake on your part. but, seeing as it _has_ gone out to
literally thousands of extremely busy people, i can only apologise
to them on your behalf for the mistake of wasting their valuable
time.

let's also hope that people who believe that comments such as the one
that you have made are useful and productive also think about the
consequences of doing so, bear in mind that internet archives are
forever, and check whether the person that they are criticising
drinks at _all_.

personally, my average consumption of alcohol can be measured
as approx 1 bottle per decade. and i'm not talking meths.

if you don't like what i have to say, and don't want to listen,
even with a pinch of salt for my rambling, learn how to set
up a killfile, and use it. and think more before hitting the
reply-to-all button. key. whatever.

l.

--
http://lkcl.net

Vadim Lobanov

Oct 2, 2005, 7:40:03 PM

> what if, therefore, someone comes up with an architecture that is
> better than or improves greatly upon SMP?

Like NUMA?

> they will need to make _significant_ inroads into the linux kernel
> code, whereas if, say, you oh i dunno provide hardware-accelerated
> parallel support for a nanokernel (such as l4) which just _happens_
> to be better than SMP then running anything which is l4 compliant gets
> the benefit.
>
>
> the reason i mention this is because arguments about saying "SMP is it,
> SMP is great, SMP is everything, we're improving our SMP design" don't
> entirely cut it, because SMP has limitations that don't scale properly
> to say 64 or 128 processors: sooner or later someone's going to come up
> with something better than SMP and all the efforts focussed on making
> SMP better in the linux kernel are going to look lame.
>
> l.
>
> p.s. yes i do know of a company that has improved on SMP.
>
> -

-Vadim Lobanov

Gene Heskett

Oct 2, 2005, 7:40:06 PM
On Sunday 02 October 2005 18:43, Robert Hancock wrote:
>Luke Kenneth Casson Leighton wrote:
>> and, what is the linux kernel?
>>
>> it's a daft, monolithic design that is suitable and faster on
>> single-processor systems, and that design is going to look _really_
>> outdated, really soon.
>
>Well, it sounds like it works pretty well on such things as 512 CPU
>Altix systems, so it sounds like the suggestion that Linux is designed
>solely for single-processor systems and isn't suitable for multicore,
>hyperthreaded CPUs doesn't hold much water..

Ahh, yes and no, Robert. The un-answered question, for that
512 processor Altix system, would be "but does it run things 512
times faster?" Methinks not, by a very wide margin. Yes, do a lot
of unrelated things fast maybe, but render a 30 megabyte page with
ghostscript in 10 milliseconds? Never happen IMO.

And Christoph in the next msg, calls him 1/2 drunk. He doesn't come
across to me as being more than 1 beer drunk. And he does make some
interesting points, so if they aren't valid, let's use provable logic
to shoot them down, not name calling and pointless rhetoric.

--
Cheers, Gene
"There are four boxes to be used in defense of liberty:
soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
99.35% setiathome rank, not too shabby for a WV hillbilly
Yahoo.com and AOL/TW attorneys please note, additions to the above
message by Gene Heskett are:
Copyright 2005 by Maurice Eugene Heskett, all rights reserved.

Vadim Lobanov

Oct 2, 2005, 7:50:06 PM
On Sun, 2 Oct 2005, Gene Heskett wrote:

> On Sunday 02 October 2005 18:43, Robert Hancock wrote:
> >Luke Kenneth Casson Leighton wrote:
> >> and, what is the linux kernel?
> >>
> >> it's a daft, monolithic design that is suitable and faster on
> >> single-processor systems, and that design is going to look _really_
> >> outdated, really soon.
> >
> >Well, it sounds like it works pretty well on such things as 512 CPU
> >Altix systems, so it sounds like the suggestion that Linux is designed
> >solely for single-processor systems and isn't suitable for multicore,
> >hyperthreaded CPUs doesn't hold much water..
>
> Ahh, yes and no, Robert. The un-answered question, for that
> 512 processor Altix system, would be "but does it run things 512
> times faster?" Methinks not, by a very wide margin. Yes, do a lot
> of unrelated things fast maybe, but render a 30 megabyte page with
> ghostscript in 10 milliseconds? Never happen IMO.

This is only true for workloads that are parallelizable. I don't think
any kernel is quite good enough to divine what a single-threaded
userland application is doing and make its work parallel.

That is to say, if we are going to look at examples (and so we should),
then we need to pick an example that is actually expected to benefit
from many-processor machines.

> And Christoph in the next msg, calls him 1/2 drunk. He doesn't come
> across to me as being more than 1 beer drunk. And he does make some
> interesting points, so if they aren't valid, let's use provable logic
> to shoot them down, not name calling and pointless rhetoric.
>
> --
> Cheers, Gene
> "There are four boxes to be used in defense of liberty:
> soap, ballot, jury, and ammo. Please use in that order."
> -Ed Howdershelt (Author)
> 99.35% setiathome rank, not too shabby for a WV hillbilly
> Yahoo.com and AOL/TW attorneys please note, additions to the above
> message by Gene Heskett are:
> Copyright 2005 by Maurice Eugene Heskett, all rights reserved.
>
>
> -

-Vadim Lobanov

Rik van Riel

Oct 2, 2005, 8:00:13 PM
On Sun, 2 Oct 2005, Gene Heskett wrote:

> Ahh, yes and no, Robert. The un-answered question, for that
> 512 processor Altix system, would be "but does it run things 512
> times faster?" Methinks not, by a very wide margin. Yes, do a lot
> of unrelated things fast maybe, but render a 30 megabyte page with
> ghostscript in 10 milliseconds? Never happen IMO.

You haven't explained to us why you think your proposal
would allow Linux to circumvent Amdahl's law...

--
All Rights Reversed

Martin J. Bligh

Oct 2, 2005, 8:10:08 PM
--Luke Kenneth Casson Leighton <lk...@lkcl.net> wrote (on Monday, October 03, 2005 00:05:45 +0100):

> On Sun, Oct 02, 2005 at 05:05:42PM -0400, Rik van Riel wrote:
>> On Sun, 2 Oct 2005, Luke Kenneth Casson Leighton wrote:
>>
>> > and, what is the linux kernel?
>> >
>> > it's a daft, monolithic design that is suitable and faster on
>> > single-processor systems, and that design is going to look _really_
>> > outdated, really soon.
>>
>> Linux already has a number of scalable SMP synchronisation
>> mechanisms.
>
> ... and you are tied in to the decisions made by the linux kernel
> developers.

Yes. As are the rest of us. So if you want to implement something
different, that's your prerogative. So feel free to go do it
somewhere else, and quit whining on this list.

We are not your implementation bitches. If you think it's such a great
idea, do it yourself.

M.

Randy.Dunlap

Oct 2, 2005, 8:20:11 PM
On Sun, 02 Oct 2005 17:04:51 -0700 Martin J. Bligh wrote:

> --Luke Kenneth Casson Leighton <lk...@lkcl.net> wrote (on Monday, October 03, 2005 00:05:45 +0100):
>
> > On Sun, Oct 02, 2005 at 05:05:42PM -0400, Rik van Riel wrote:
> >> On Sun, 2 Oct 2005, Luke Kenneth Casson Leighton wrote:
> >>
> >> > and, what is the linux kernel?
> >> >
> >> > it's a daft, monolithic design that is suitable and faster on
> >> > single-processor systems, and that design is going to look _really_
> >> > outdated, really soon.
> >>
> >> Linux already has a number of scalable SMP synchronisation
> >> mechanisms.
> >
> > ... and you are tied in to the decisions made by the linux kernel
> > developers.
>
> Yes. As are the rest of us. So if you want to implement something
> different, that's your prerogative. So feel free to go do it
> somewhere else, and quit whining on this list.
>
> We are not your implementation bitches. If you think it's such a great
> idea, do it yourself.

IOW, -ENOPATCH. where's your patch?

---
~Randy
You can't do anything without having to do something else first.
-- Belefant's Law

Kurt Wall

Oct 2, 2005, 8:40:09 PM

[...]

> with me so far? :)

Yes, and getting more annoyed at your condescending tone with each
paragraph.



> and, what is the linux kernel?
>
> it's a daft, monolithic design that is suitable and faster on
> single-processor systems, and that design is going to look _really_
> outdated, really soon.

Andrew Tanenbaum said the same thing in the early 1990s. That we're
still here having this discussion >10 years later is telling. Dr.
Tanenbaum might have been academically and theoretically correct,
but, with a nod to OS X, the Linux kernel has proven itself by
implementation and has proven to be remarkably adaptable.

Check back in ten years. Or just come back when you've completed
implementing all the beauteous features you're selling.

> in short, basically, if you follow and agree with the logic, the
> linux kernel - as maintained by linus - is far from complete.
>
> i therefore invite you to consider the following strategy:

And the e.e. cummings affectation is even *more* annoying than your
condescension.

Kurt
--
Blood flows down one leg and up the other.

Kurt Wall

Oct 2, 2005, 8:40:12 PM
On Sun, Oct 02, 2005 at 11:49:57PM +0100, Christoph Hellwig took 8 lines to write:
> Let's hope these posts will stop when the UK starts to allow serving
> drinks after 23:00. Posts from half-drunk people who need to get a life
> don't really help a lot.

As if posts from fully drunk people will help, either - albeit they
might be more entertaining.

Kurt
--
Violence is the last refuge of the incompetent.
-- Salvor Hardin

Luke Kenneth Casson Leighton

Oct 2, 2005, 8:50:05 PM
On Sun, Oct 02, 2005 at 05:14:57PM -0700, Randy.Dunlap wrote:

> IOW, -ENOPATCH. where's your patch?

most of the relevant work has already been done (and not by
me): i invite you to consider searching with google for l4ka,
l4linux and oskit, or simply going to the web sites l4linux.org
and l4ka.org.

the code for oskit has been available for some years now,
and is regularly maintained. the l4linux people have had to
make some significant modifications to it (oskit), and also
to grub, and libstdc++, and pretty much everything else under
the sun - and it's all there, for the approx 100mb download.

l.

David Leimbach

Oct 2, 2005, 8:50:08 PM
> > it's a daft, monolithic design that is suitable and faster on
> > single-processor systems, and that design is going to look _really_
> > outdated, really soon.
>
> Andrew Tanenbaum said the same thing in the early 1990s. That we're
> still here having this discussion >10 years later is telling. Dr.
> Tanenbaum might have been academically and theoretically correct,
> but, with a nod to OS X, the Linux kernel has proven itself by
> implementation and has proven to be remarkably adaptable.

Why are you nodding to OS X? It's not a real microkernel either. It
just happens to have all the foobage of a microkernel in a rather
monolithic design. The reason that the bsd personality is in the same
address space as the mach bits is because they didn't want to deal
with the overheads of the message passing from kernel to userspace.

The L4 people figured out how to get a lot of those inefficiencies to
disappear and L4Linux is quite "performant". In some cases, L4Linux
can be used to provide a device driver for other L4 threads that would
normally have to write their own [in user space and even with
respectable performance
http://www.ertos.nicta.com.au/Research/ULDD/Performance.pml]

That's an interesting re-use and combination of several philosophies
if you ask me.

There is a lot of "what's next for linux" going on behind the scenes
and the current path of linux is apparently good enough for
accomplishing it.

- Dave

Luke Kenneth Casson Leighton

Oct 2, 2005, 9:00:09 PM
On Sun, Oct 02, 2005 at 04:37:52PM -0700, Vadim Lobanov wrote:

> > what if, therefore, someone comes up with an architecture that is
> > better than or improves greatly upon SMP?
>
> Like NUMA?

yes, like numa, and there is more.

i had the honour to work with someone who came up with a radical
enhancement even to _that_.

basically the company has implemented, in hardware (a
nanokernel), some operating system primitives, such as message
passing (based on a derivative by thompson of the "alice"
project from plessey, imperial and manchester university
in the mid-80s), hardware cache line lookups (which means
instead of linked list searching, the hardware does it for
you in a single cycle), stuff like that.

the message passing system is designed as a parallel message bus -
completely separate from the SMP and NUMA memory architecture, and as
such it is perfect for use in microkernel OSes.

(these sorts of things are unlikely to make it into the linux kernel, no
matter how much persuasion and how many patches they would write).

_however_, a much _better_ target would be to create an L4 microkernel
on top of their hardware kernel.

this company's hardware is kinda a bit difficult for most people to get
their heads round: it's basically parallelised hardware-acceleration for
operating systems, and very few people see the point in that.

however, as i pointed out, 90nm and approx-2Ghz is pretty much _it_,
and to get any faster you _have_ to go parallel.

and the drive for "faster", "better", "more sales" means more and more
parallelism.

it's _happening_ - and SMP ain't gonna cut it (which is why
these multi-core chips are coming out and why hyperthreading
is coming out).

so.

this is a heads-up.

what you choose to do with this analysis is up to you.

l.

Luke Kenneth Casson Leighton

Oct 2, 2005, 9:20:06 PM
On Sun, Oct 02, 2005 at 05:04:51PM -0700, Martin J. Bligh wrote:
> --Luke Kenneth Casson Leighton <lk...@lkcl.net> wrote (on Monday, October 03, 2005 00:05:45 +0100):
>
> > On Sun, Oct 02, 2005 at 05:05:42PM -0400, Rik van Riel wrote:
> >> On Sun, 2 Oct 2005, Luke Kenneth Casson Leighton wrote:
> >>
> >> > and, what is the linux kernel?
> >> >
> >> > it's a daft, monolithic design that is suitable and faster on
> >> > single-processor systems, and that design is going to look _really_
> >> > outdated, really soon.
> >>
> >> Linux already has a number of scalable SMP synchronisation
> >> mechanisms.
> >
> > ... and you are tied in to the decisions made by the linux kernel
> > developers.
>
> Yes. As are the rest of us. So if you want to implement something
> different, that's your prerogative. So feel free to go do it
> somewhere else, and quit whining on this list.
>
> We are not your implementation bitches. If you think it's such a great
> idea, do it yourself.

martin, i'm going to take a leaf from the great rusty russell's book,
because i was very impressed with the professional way in which he
dealt with someone who posted such immature and out-of-line comments:
he rewrote them in a much more non-hostile manner and then replied to
that.

so, here goes: i'm copying the above few [relevant] paragraphs
below, then rewriting them, here:

> >
> > ... and you are tied in to the decisions made by the linux kernel
> > developers.
>

> Yes, this is very true: we are all somewhat at the mercy of their
> decisions. However, fortunately, they had the foresight to work
> with free software, so any of us can try something different, if
> we wish.
>
> i am slightly confused by your message, however: forgive me for
> asking this but you are not expecting us to implement such a radical
> redesign, are you?

martin, hi, thank you for responding.

well... actually, as it turns out, the l4linux and l4ka people have
already done most of the work!!

i believe you may have missed part of my message (it was a bit long, i
admit) and i thank you for the opportunity, that your message presents,
to reiterate this: l4linux _exists_ - last time i checked (some months
ago) it had a port of 2.6.11 to the L4 microkernel.

so, in more ways than one, no i am of course not expecting people to
just take orders from someone as mad as myself :)

i really should reiterate this: i _invite_ people to _consider_ the
direction that processor designs - not just any "off-the-wall"
processor designs but _mainstream_ x86-compatible processor designs -
are likely to take. and they are becoming more and more parallel.

the kind of question that the experienced linux kernel
maintainers and developers really need to ask is: can the
present linux kernel design _cope_ with such parallelism?

is there an easier way?

that's mainly why i wished you "good luck" :)

l.

p.s. martin. _don't_ do that again. i don't care who you are:
internet archives are forever and your rudeness will be noted
by google-users and other search-users - long after you are dead.

Rik van Riel

Oct 2, 2005, 9:20:08 PM
On Mon, 3 Oct 2005, Luke Kenneth Casson Leighton wrote:

> well... actually, as it turns out, the l4linux and l4ka people have
> already done most of the work!!

And I am sure they have reasons for not submitting their
changes to the linux-kernel mailing list. They probably
know something we (including you) don't know.

Switching out the low level infrastructure does NOT help
with scalability. The only way to make the kernel more
parallelizable is by changing the high level code, ie.
Linux itself.

Adding a microkernel under Linux is not going to help
with anything you mentioned so far.

--
All Rights Reversed

Chase Venters

Oct 2, 2005, 9:30:08 PM
I'd venture to say that Linux scalability is fantastic. This also sounds like
a repeat of a debate that happened ten years ago.

I too was intrigued by Andrew's comment about 'finishing the kernel', though
I'm guessing (albeit without ever having spoken to Andrew personally) that it
was partially in jest. What it does suggest, though, is a point that KDE
desktop developer Aaron Seigo has made recently about the focus moving up the
stack.

If we are admirably tackling the problems of hardware compatibility,
stability, scalability and we've implemented most of the important features
that belong in the kernel, then a lot of the development fire for a so-called
complete Linux system is going to have to move up the stack - into the
userland.

Indeed, adding 100 cores to my Pentium 4 isn't going to do me a damned bit of
good when Akregator goes to query some 40 RSS feeds and Kontact blocks,
refusing to process GUI events. It's also not going to make compiling a
single .c file any faster.

I have no doubt that the bright minds here on LKML will continue to find
places to improve Linux's scalability, but that certainly doesn't require
rebuilding the kernel - we're already doing remarkably well in the
scalability department.

The bottom line is that the application developers need to start being clever
with threads. I think I remember some interesting rumors about Perl 6, for
example, including 'autothreading' support - the idea that your optimizer
could be smart enough to identify certain work that can go parallel.

As dual cores and HT become more popular, the onus is going to be on the
applications, not the OS, to speed up.

Regards,
Chase Venters

On Sunday 02 October 2005 08:10 pm, Luke Kenneth Casson Leighton wrote:
> ... words ...

Luke Kenneth Casson Leighton
Oct 2, 2005, 9:30:12 PM

On Sun, Oct 02, 2005 at 07:26:21PM -0400, Rik van Riel wrote:
> On Mon, 3 Oct 2005, Luke Kenneth Casson Leighton wrote:
> > On Sun, Oct 02, 2005 at 05:05:42PM -0400, Rik van Riel wrote:
>
> > > Linux already has a number of scalable SMP synchronisation
> > > mechanisms.
> >
> > ... and you are tied in to the decisions made by the linux kernel
> > developers.
> >
> > whereas, if you allow something like a message-passing design (such as
> > in the port of the linux kernel to l4), you have the option to try out
> > different underlying structures - _without_ having to totally redesign
> > the infrastructure.
>
> Infrastructure is not what matters when it comes to SMP
> scalability on modern systems, since lock contention is
> not the primary SMP scalability problem.
>
> Due to the large latency ratio between L1/L2 cache and
> RAM, the biggest scalability problem is cache invalidation
> and cache bounces.
>
> Those are not solvable by using another underlying
> infrastructure - they require a reorganization of the
> datastructures on top, the data structures in Linux.

... ah, but what about in hardware? what if you had hardware support
for

_plus_ what if you had some other OS primitives implemented
in hardware, the use of which allowed you to avoid or minimise
cache invalidation problems?

not entirely, of course, but enough to make up for SMP's deficiencies.


> Note that message passing is by definition less efficient
> than SMP synchronisation mechanisms that do not require
> data to be exchanged between CPUs, eg. RCU or the use of
> cpu-local data structures.

how about message passing by reference - a la c++?

i.e. using an "out-of-band" parallel message bus, you pass
the address of a NUMA or SMP area of memory that is granted
to a specific processor, which says to another processor
something like "you now have access to this memory: by the time
you get this message i will have already cleared the cache so
you can get it immediately".

that sort of thing.

_and_ you use the parallel message bus to communicate memory
allocation, locking, etc.

_and_ you use the parallel message bus to implement semaphores and
mutexes.

_and_ if the message is small enough, you just pass the message across
without going via external memory.
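For concreteness, here is roughly what such an ownership-transfer primitive could look like from software. Everything below is invented (no such bus exists); a one-slot atomic mailbox merely simulates what the hypothetical hardware would do with its cache protocol and doorbell:

```c
#include <stdatomic.h>
#include <stddef.h>

/* one-slot stand-in for the hypothetical out-of-band message bus */
static _Atomic(void *) mailbox;

/* hand buf over by reference: after this call the sender must not
 * touch buf again.  the imagined hardware would also write back /
 * invalidate the sender's cache lines for buf before raising the
 * doorbell on the receiving cpu */
static void bus_send(void *buf)
{
	atomic_store_explicit(&mailbox, buf, memory_order_release);
}

/* take ownership of whatever was handed over (NULL if nothing) */
static void *bus_recv(void)
{
	return atomic_exchange_explicit(&mailbox, NULL,
					memory_order_acquire);
}
```

The point of the sketch: only a pointer crosses the bus, not the data, which is the "by reference" part of the proposal.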


... but i digress - but enough to demonstrate, i hope, that
this isn't some "pie-in-the-sky" thing, it's one hint at a
solution to the problem that a lot of hardware designers haven't
been able to solve, and up until now they haven't had to even
_consider_ it.

and they've avoided the problem by going "multi-core" and going
"hyperthreading".

but, at some point, hyperthreading isn't going to cut it, and at some
point multi-core isn't going to cut it.

and people are _still_ going to expect to see the monster
parallelism (32, 64, 128 parallel hardware threads) as
"one processor".

the question is - and i reiterate it: can the present
linux kernel design _cope_ with such monster parallelism?

answer, as the kernel is maintained at present: not a chance.

question _that_ raises: do you _want_ to [make it cope with such
monster parallelism]?

and if the answer to that is "no, definitely not", then the
responsibility can be offloaded onto a microkernel, e.g. the L4
microkernel, and it _just_ so happens that the linux kernel has already
been ported to L4.

i raise this _one_ route - there are surely going to be others.

i invite you to consider discussing them.

LIKE FRIGGIN ADULTS - unlike the very spiteful comments i've
received, which indicate that some people would rather not (no, i
don't count you in that number, rik, just in case you thought
i was, because i'm replying direct to you!).


> > p.s. yes i do know of a company that has improved on SMP.
>
> SGI ? IBM ?

no, they're a startup.

--
http://lkcl.net

Vadim Lobanov
Oct 2, 2005, 9:30:12 PM

On Mon, 3 Oct 2005, Luke Kenneth Casson Leighton wrote:

> On Sun, Oct 02, 2005 at 04:37:52PM -0700, Vadim Lobanov wrote:
>
> > > what if, therefore, someone comes up with an architecture that is
> > > better than or improves greatly upon SMP?
> >
> > Like NUMA?
>
> yes, like numa, and there is more.

The beauty of capitalization is that it makes it easier for others to
read what you have to say.

> i had the honour to work with someone who came up with a radical
> enhancement even to _that_.
>
> basically the company has implemented, in hardware (a
> nanokernel), some operating system primitives, such as message
> passing (based on a derivative by thompson of the "alice"
> project from plessey, imperial and manchester university
> in the mid-80s), hardware cache line lookups (which means
> instead of linked list searching, the hardware does it for
> you in a single cycle), stuff like that.

That sounds awesome, but I have something better -- a quantum computer.
And it's about as parallel as you're going to get anytime in the
foreseeable future!

...

Moral of the story: There are thousands of hardware doodads all around.
People only start to become interested when they have actual "metal"
freely available on the market, that they can play with and code to.

> the message passing system is designed as a parallel message bus -
> completely separate from the SMP and NUMA memory architecture, and as
> such it is perfect for use in microkernel OSes.

You're making an implicit assumption here that it will benefit _only_
microkernel designs. That is not at all immediate or obvious to me (or,
I suspect, others also) -- where's the proof?

> (these sorts of things are unlikely to make it into the linux kernel, no
> matter how much persuasion and how many patches they would write).

No, the kernel hackers are actually very sensible people. When they push
back, there's usually a darn good reason for it. See above point
regarding availability of hardware.

> _however_, a much _better_ target would be to create an L4 microkernel
> on top of their hardware kernel.

Perfect. You can do that, and benefit from the oodles of fame that
follow. Others might be less-than-convinced.

> this company's hardware is kinda a bit difficult for most people to get
> their heads round: it's basically parallelised hardware-acceleration for
> operating systems, and very few people see the point in that.

That just sounds condescending.

> however, as i pointed out, 90nm and approx-2Ghz is pretty much _it_,
> and to get any faster you _have_ to go parallel.

Sure, it's going to stop somewhere, but you have to be a heck of a
visionary to predict that it will stop _there_. People have been
surprised before on such matters, so don't go around yelling about the
impending doom quite yet.

> and the drive for "faster", "better", "more sales" means more and more
> parallelism.
>
> it's _happening_ - and SMP ain't gonna cut it (which is why
> these multi-core chips are coming out and why hyperthreading
> is coming out).

"Rah, rah, parallelism is great!" -- That's a great slogan, except...

Users, who also happen to be the target of those sales, care about
_userland_ applications. And the bitter truth is that the _vast_
majority of userland apps are single-threaded. Why? Two reasons --
first, it's harder to write a multithreaded application, and second,
some workloads simply can't be expressed "in parallel". Your kernel
might (might, not will) run like a speed-demon, but the userland stuff
will still be lackluster in comparison.

And that's when your slogan hits a wall, and the marketing hype dies.
The reality is that parallelism is something to be desired, but is not
always achievable.
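Vadim's second reason can be made precise: whether a loop parallelizes depends on its dependence chain. Two tiny loops illustrate it (as written; parallel-prefix algorithms do exist, but they require reformulating the second loop entirely):

```c
/* each iteration is independent of the others: trivially
 * splittable across as many cores as you like */
void square_all(const int *in, int *out, int n)
{
	int i;

	for (i = 0; i < n; i++)
		out[i] = in[i] * in[i];
}

/* each iteration needs the previous iteration's result: the
 * dependence chain is as long as the loop itself, so this
 * formulation gains nothing from extra cores, however many */
void running_total(const int *in, int *out, int n)
{
	int acc = 0;
	int i;

	for (i = 0; i < n; i++) {
		acc += in[i];
		out[i] = acc;
	}
}
```

Most real applications are a mix of the two, which is exactly why "just add cores" only buys a bounded amount.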

> so.
>
> this is a heads-up.
>
> what you choose to do with this analysis is up to you.

I choose to wait for actual, concrete details and proofs of your design,
instead of the ambiguous "visionary" hand-waving so far. As has already
been said, -ENOPATCH.

> l.
>

-Vadim Lobanov

Al Viro
Oct 2, 2005, 9:50:09 PM

On Sun, Oct 02, 2005 at 06:20:38PM -0700, Vadim Lobanov wrote:
> I choose to wait for actual, concrete details and proofs of your design,
> instead of the ambiguous "visionary" hand-waving so far. As has already
> been said, -ENOPATCH.

Speaking of which, IIRC, somebody used to maintain a list of words, acronyms,
etc. useful to know if you want to read l-k. May I submit an addition to
that list?

visionary [n]: onanist with strong exhibitionist tendencies; from
"visions", the source of inspiration they refer to when it becomes
obvious that they have lost both sight and capacity for rational
thought.

Vadim Lobanov
Oct 2, 2005, 10:00:12 PM

On Mon, 3 Oct 2005, Al Viro wrote:

> On Sun, Oct 02, 2005 at 06:20:38PM -0700, Vadim Lobanov wrote:
> > I choose to wait for actual, concrete details and proofs of your design,
> > instead of the ambiguous "visionary" hand-waving so far. As has already
> > been said, -ENOPATCH.
>
> Speaking of which, IIRC, somebody used to maintain a list of words, acronyms,
> etc. useful to know if you want to read l-k. May I submit an addition to
> that list?
>
> visionary [n]: onanist with strong exhibitionist tendencies; from
> "visions", the source of inspiration they refer to when it becomes
> obvious that they have lost both sight and capacity for rational
> thought.
>

Nice. :-)

Just from idle curiosity, you wouldn't know where that list currently
resides, would you?

-Vadim Lobanov

Luke Kenneth Casson Leighton
Oct 2, 2005, 10:00:13 PM

On Sun, Oct 02, 2005 at 06:20:38PM -0700, Vadim Lobanov wrote:

> On Mon, 3 Oct 2005, Luke Kenneth Casson Leighton wrote:
>
> > On Sun, Oct 02, 2005 at 04:37:52PM -0700, Vadim Lobanov wrote:
> >
> > > > what if, therefore, someone comes up with an architecture that is
> > > > better than or improves greatly upon SMP?
> > >
> > > Like NUMA?
> >
> > yes, like numa, and there is more.
>
> The beauty of capitalization is that it makes it easier for others to
> read what you have to say.

sorry, vadim: haven't touched a shift key in over 20 years.

> > basically the company has implemented, in hardware (a
> > nanokernel), some operating system primitives, such as message
> > passing (based on a derivative by thompson of the "alice"
> > project from plessey, imperial and manchester university
> > in the mid-80s), hardware cache line lookups (which means
> > instead of linked list searching, the hardware does it for
> > you in a single cycle), stuff like that.
>
> That sounds awesome, but I have something better -- a quantum computer.
> And it's about as parallel as you're going to get anytime in the
> foreseeable future!

:)

*sigh* - i _so_ hope we don't need degrees in physics to program
them...

> > the message passing system is designed as a parallel message bus -
> > completely separate from the SMP and NUMA memory architecture, and as
> > such it is perfect for use in microkernel OSes.
>
> You're making an implicit assumption here that it will benefit _only_
> microkernel designs.

ah, i'm not: i just left out mentioning it :)

the message passing needs to be communicated down to manage
threads, and also to provide a means to manage semaphores and
mutexes: ultimately, support for such an architecture would
work its way down to libc.


and yes, if you _really_ didn't want a kernel in the way at all, you
could go embedded and just... do everything yourself.

or port reactos, the free software reimplementation of nt,
to it, or something :)

*shrug*.

> > this company's hardware is kinda a bit difficult for most people to get
> > their heads round: it's basically parallelised hardware-acceleration for
> > operating systems, and very few people see the point in that.
>
> That just sounds condescending.

i'm very sorry about that, it wasn't deliberate and ... re-reading
my comment, i should say that my comment isn't actually entirely true!

a correction/qualification: the startup company had found that
everybody they talked to before they were put in touch with me
simply _did_ not get it: presumably because the people they were
seeking funding from were not technically up to the job of
understanding the concept.

i didn't mean to imply that _everyone_ - or more specifically the
people reading this list - would not get it.

sorry.

> > however, as i pointed out, 90nm and approx-2Ghz is pretty much _it_,
> > and to get any faster you _have_ to go parallel.
>
> Sure, it's going to stop somewhere, but you have to be a heck of a
> visionary to predict that it will stop _there_.

okay, i admit it: you caught me out - i'm a mad visionary.

but seriously.

it won't stop - but the price of 90nm mask charges, at approx
$2m, is already far too high, and the number of large chips
being designed is plummeting like a stone as a result - from
like 15,000 per year a few years ago down to ... damn, can't remember -
less than a hundred (i think! don't quote me on that!)

when 90 nm was introduced, some mad fabs wanted to make 9
metre lenses, dude!!! until carl zeiss were called in and
managed to get it down to 3 metres.

and that lens is produced on a PER CHIP basis.

basically, it's about cost.

the costs of producing faster and faster uniprocessors is
getting out of control.

i'm not explaining things very well, but i'm trying. too many words,
not concise enough, too much to explain without people misunderstanding
or skipping things and getting the wrong end of the stick.

argh.


> > and the drive for "faster", "better", "more sales" means more and more
> > parallelism.
> >
> > it's _happening_ - and SMP ain't gonna cut it (which is why
> > these multi-core chips are coming out and why hyperthreading
> > is coming out).
>
> "Rah, rah, parallelism is great!" -- That's a great slogan, except...
>
> Users, who also happen to be the target of those sales, care about
> _userland_ applications. And the bitter truth is that the _vast_
> majority of userland apps are single-threaded. Why? Two reasons --
> first, it's harder to write a multithreaded application, and second,
> some workloads simply can't be expressed "in parallel". Your kernel
> might (might, not will) run like a speed-demon, but the userland stuff
> will still be lackluster in comparison.
>
> And that's when your slogan hits a wall, and the marketing hype dies.
> The reality is that parallelism is something to be desired, but is not
> always achievable.

okay: i will catch up on this bit, another time, because it is late
enough for me to be getting dizzy and appearing to be drunk.

this is one answer (and there are others i will write another time.
hint: automated code analysis tools, auto-parallelising tools, both
offline and realtime):

watch what intel and amd do: they will support _anything_ - clutch at
straws - to make parallelism palatable. why? because in order to be
competitive - and realistically priced - they don't have any choice.

plus, i am expecting the chips to be thrown out there (like
the X-Box 360 which has SIX hardware threads remember) and
the software people to quite literally _have_ to deal with it.

i expect the hardware people to go: this is the limit, this is what we
can do, realistically price-performance-wise: lump it, deal with it.

when intel and amd start doing that, everyone _will_ lump it.
and deal with it.

... why do you think intel is hyping support for and backing
hyperthreads support in XEN/Linux so much?

l.

Rik van Riel
Oct 2, 2005, 10:00:16 PM

On Mon, 3 Oct 2005, Luke Kenneth Casson Leighton wrote:

> how about message passing by reference - a la c++?

> _and_ you use the parallel message bus to communicate memory
> allocation, locking, etc.

Then you lose. It's the act of passing itself that causes
scalability problems and a loss of performance.

The best way to get SMP scalability is to avoid message
passing altogether, using things like per-cpu data
structures and RCU.

Not having to pass a message is faster than any message
passing mechanism.
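A userspace caricature of the per-cpu idea Rik describes (the real kernel uses per-cpu sections and preemption control, not a bare array like this): each CPU increments its own cache-line-padded slot, so the hot path writes no shared line at all, and only the rare read side walks them all.

```c
#define NCPU 4

/* pad each slot out to a cache line so two cpus' counters never
 * share - and therefore never bounce - a line */
struct percpu_counter {
	long v;
	char pad[64 - sizeof(long)];
};

static struct percpu_counter events[NCPU];

/* hot path: no lock, no message, no shared cacheline write */
static void count_event(int cpu)
{
	events[cpu].v++;
}

/* slow path: sum all the slots when someone wants the total */
static long count_total(void)
{
	long sum = 0;
	int i;

	for (i = 0; i < NCPU; i++)
		sum += events[i].v;
	return sum;
}
```

No message bus, however fast, can beat the increment above, because nothing leaves the local CPU at all.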

> ... but i digress - but enough to demonstrate, i hope, that
> this isn't some "pie-in-the-sky" thing,

You've made a lot of wild claims so far, most of which I'm
not ready to believe without some proof to back them up.

> it's one hint at a solution to the problem that a lot of hardware
> designers haven't been able to solve, and up until now they haven't
> had to even _consider_ it.

The main problem is that communication with bits of silicon
four inches away is a lot slower, or takes much more power,
than communication with bits of silicon half a millimeter away.

This makes cross-core communication, and even cross-thread
communication in SMT/HT, slower than not having to have such
communication at all.

> the question is - and i iterate it again: can the present
> linux kernel design _cope_ with such monster parallelism?

The SGI and IBM people seem fairly happy with current 128 CPU
performance, and appear to be making serious progress towards
512 CPUs and more.

> question _that_ raises: do you _want_ to [make it cope with such
> monster parallelism]?
>
> and if the answer to that is "no, definitely not", then the
> responsibility can be offloaded onto a microkernel,

No, that cannot be done, for all the reasons I mentioned
earlier in the thread.

Think about something like the directory entry cache (dcache),
all the CPUs need to see that cache consistently, and you cannot
avoid locking overhead by having the locking done by a microkernel.

The only way to avoid locking overhead is by changing the data
structure to something that doesn't need locking.

No matter how low your locking overhead - once you have 1024
CPUs it's probably too high.
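The flavour of that data-structure reorganization, in miniature (this is only the publish/read half of the RCU idea; real RCU also defers freeing the old copy until a grace period, which is elided here - this sketch simply leaks it):

```c
#include <stdatomic.h>
#include <stdlib.h>

struct dent {
	long ino;
};

static _Atomic(struct dent *) current_dent;

/* read side: one acquire load - no lock, no shared write, so
 * 1024 cpus can all do this at once without bouncing anything */
static long lookup_ino(void)
{
	struct dent *d = atomic_load_explicit(&current_dent,
					      memory_order_acquire);
	return d ? d->ino : -1;
}

/* update side: build a new copy, then publish it with one store.
 * real RCU frees the old copy after a grace period; here it is
 * leaked to keep the sketch short */
static void publish_ino(long ino)
{
	struct dent *n = malloc(sizeof(*n));

	n->ino = ino;
	atomic_store_explicit(&current_dent, n, memory_order_release);
}
```

The scalability comes from what the readers *don't* do: no lock acquisition means no cacheline ping-pongs between CPUs on the lookup path.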

--
All Rights Reversed

Al Viro
Oct 2, 2005, 10:00:18 PM

On Sun, Oct 02, 2005 at 06:50:46PM -0700, Vadim Lobanov wrote:
> > visionary [n]: onanist with strong exhibitionist tendencies; from
> > "visions", the source of inspiration they refer to when it becomes
> > obvious that they have lost both sight and capacity for rational
> > thought.
> >
>
> Nice. :-)
>
> Just from idle curiosity, you wouldn't know where that list currently
> resides, would you?

No idea... Probably somebody from kernelnewbies.org crowd would know
the current location...

Luke Kenneth Casson Leighton
Oct 2, 2005, 10:10:15 PM

On Mon, Oct 03, 2005 at 02:53:00AM +0100, Al Viro wrote:
> On Sun, Oct 02, 2005 at 06:50:46PM -0700, Vadim Lobanov wrote:
> > > visionary [n]: onanist with strong exhibitionist tendencies; from
> > > "visions", the source of inspiration they refer to when it becomes
> > > obvious that they have lost both sight and capacity for rational
> > > thought.

oo, nice pretty flowers, wheeee :)

Vadim Lobanov
Oct 2, 2005, 10:40:06 PM

On Mon, 3 Oct 2005, Luke Kenneth Casson Leighton wrote:

> On Sun, Oct 02, 2005 at 06:20:38PM -0700, Vadim Lobanov wrote:
>
> > On Mon, 3 Oct 2005, Luke Kenneth Casson Leighton wrote:
> >
> > > On Sun, Oct 02, 2005 at 04:37:52PM -0700, Vadim Lobanov wrote:
> > >
> > > > > what if, therefore, someone comes up with an architecture that is
> > > > > better than or improves greatly upon SMP?
> > > >
> > > > Like NUMA?
> > >
> > > yes, like numa, and there is more.
> >
> > The beauty of capitalization is that it makes it easier for others to
> > read what you have to say.
>
> sorry, vadim: haven't touched a shift key in over 20 years.

It's not going to bite you. I promise.

No, for reliability and performance reasons, I very much want a kernel
in the way. After all, kernel code is orders of magnitude better tuned
than almost all userland code.

The point I was making here is that, from what I can see, the current
Linux architecture is quite alright in anticipation of the hardware that
you're describing. It _could_ be better tuned for such hardware, sure,
but so far there is no need for such work at this particular moment.

I can guarantee one thing here -- the cost, as is, is absolutely
bearable. These companies make more money doing this than they spend in
doing it, otherwise they wouldn't be in business. From an economics
perspective, this industry is very much alive and well, proven by the
fact that these companies haven't bailed out of it yet.

We don't need hints. We need actual performance statistics --
verifiable numbers that we can point to and say "Oh crap, we're losing."
or "Hah, we kick butt.", as the case may be.

> watch what intel and amd do: they will support _anything_ - clutch at
> straws - to make parallelism palatable. why? because in order to be
> competitive - and realistically priced - they don't have any choice.

As stated earlier, I doubt they're in such dire straits as you predict.
Ultimately, the only reason why they need to advance their designs is to
be able to market them better. This means that truly innovative designs
may not be pursued because the up-front cost is too high.

There's a saying: "Let your competitor do your R&D for you."

> plus, i am expecting the chips to be thrown out there (like
> the X-Box 360 which has SIX hardware threads remember) and
> the software people to quite literally _have_ to deal with it.
>
> i expect the hardware people to go: this is the limit, this is what we
> can do, realistically price-performance-wise: lump it, deal with it.
>
> when intel and amd start doing that, everyone _will_ lump it.
> and deal with it.

Hardware without software is just as useless as software without
hardware. Any argument from any side that goes along the lines of "deal
with it" can be countered in kind.

What this boils down to is that hardware people try to make their
products appealing to program to, from _both_ a speed and a usability
perspective. That's how they get mindshare.

> ... why do you think intel is hyping support for and backing
> hyperthreads support in XEN/Linux so much?

At the risk of stepping on some toes, I believe that hyperthreading is
going out of style, in favor of multi-core processors.

> l.
>

In conclusion, you made claims that Linux is lagging behind. However,
such claims are rather useless without data and/or technical discussions
to back them up.

-Vadim Lobanov

Valdis.K...@vt.edu
Oct 2, 2005, 11:00:13 PM

On Mon, 03 Oct 2005 01:54:00 BST, Luke Kenneth Casson Leighton said:

> in the mid-80s), hardware cache line lookups (which means
> instead of linked list searching, the hardware does it for
> you in a single cycle), stuff like that.

OK.. I'll bite. How do you find the 5th or 6th entry in the linked list,
when only the first entry is in cache, in a single cycle, when a cache line
miss is more than a single cycle penalty, and you have several "These are not
the droids you're looking for" checks and go on to the next entry - and do it
in one clock cycle?
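The crux of Valdis's question is the dependence between the loads. In the canonical traversal, the *address* of each load is the *value* of the previous one, so the misses serialize no matter what widget sits in front of the cache:

```c
struct node {
	int key;
	struct node *next;
};

/* the load of n->next cannot even be issued until the load of n
 * has completed: with 5 nodes and a couple hundred cycles per
 * miss, that is on the order of a thousand cycles of unavoidable
 * serial latency - not one */
static struct node *list_find(struct node *head, int key)
{
	struct node *n;

	for (n = head; n; n = n->next)
		if (n->key == key)
			return n;
	return (struct node *)0;
}
```

Hardware can hide this only by finding *other* work to do while waiting, which is Valdis's point about hyperthreading and multiple execution units.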

Now, it's really easy to imagine an execution unit that will execute this
as a single opcode, and stall until complete. Of course, this only really helps
if you have multiple execution units - which is what hyperthreading and
multi-core and all that is about. And guess what - it's not news...

The HP2114 and DEC KL10/20 were able to dereference a chain of indirect bits
back in the 70's (complete with warnings that hardware wedges could occur if
an indirect reference formed a loop or pointed at itself). Whoops. :)

And all the way back in 1964, IBM disk controllers were able to do some rather
sophisticated offloading of "channel control words" (amazing what you could do
with 'Search ID Equal', 'Transfer In-Channel' (really a misnamed branch
instruction), and self-modifying CCWs). But even then, they understood that
it was only a win if you could go do other stuff when you waited....

D. Hazelton
Oct 2, 2005, 11:20:11 PM

On Monday 03 October 2005 02:31, Vadim Lobanov wrote:
> On Mon, 3 Oct 2005, Luke Kenneth Casson Leighton wrote:
> > On Sun, Oct 02, 2005 at 06:20:38PM -0700, Vadim Lobanov wrote:
> > > On Mon, 3 Oct 2005, Luke Kenneth Casson Leighton wrote:
> > > > On Sun, Oct 02, 2005 at 04:37:52PM -0700, Vadim Lobanov wrote:
> > > > > > what if, therefore, someone comes up with an
> > > > > > architecture that is better than or improves greatly upon
> > > > > > SMP?
> > > > >
> > > > > Like NUMA?
> > > >
> > > > yes, like numa, and there is more.
> > >
> > > The beauty of capitalization is that it makes it easier for
> > > others to read what you have to say.
> >
> > sorry, vadim: haven't touched a shift key in over 20 years.
>
> It's not going to bite you. I promise.

You never know - someone might've rigged his keyboard to shock him
every time the shift key was pressed :)

<snip>


> > > > the message passing system is designed as a parallel message
> > > > bus - completely separate from the SMP and NUMA memory
> > > > architecture, and as such it is perfect for use in
> > > > microkernel OSes.
> > >
> > > You're making an implicit assumption here that it will benefit
> > > _only_ microkernel designs.
> >
> > ah, i'm not: i just left out mentioning it :)
> >
> > the message passing needs to be communicated down to manage
> > threads, and also to provide a means to manage semaphores and
> > mutexes: ultimately, support for such an architecture would
> > work its way down to libc.
> >
> >
> > and yes, if you _really_ didn't want a kernel in the way at all,
> > you could go embedded and just... do everything yourself.
> >
> > or port reactos, the free software reimplementation of nt,
> > to it, or something :)
> >
> > *shrug*.
>
> No, for reliability and performance reasons, I very much want a
> kernel in the way. After all, kernel code is orders of magnitude
> better tuned than almost all userland code.
>
> The point I was making here is that, from what I can see, the
> current Linux architecture is quite alright in anticipation of the
> hardware that you're describing. It _could_ be better tuned for
> such hardware, sure, but so far there is no need for such work at
> this particular moment.

Wholly agreed. The arguments over the benefits of running a
microkernel aren't ever really clear. Beyond that, I personally feel
that the whole micro vs. mono argument is a catfight between
academics. I'd rather have a system that works and is proven than a
system that is bleeding edge and never truly stable. To me this
means a monolithic kernel - microkernels are picky at best, and can
be highly insecure (and that means "unstable" in my book too).

<snip>


> > > > however, as i pointed out, 90nm and approx-2Ghz is pretty
> > > > much _it_, and to get any faster you _have_ to go parallel.
> > >
> > > Sure, it's going to stop somewhere, but you have to be a heck
> > > of a visionary to predict that it will stop _there_.
> >
> > okay, i admit it: you caught me out - i'm a mad visionary.
> >
> > but seriously.
> >
> > it won't stop - but the price of 90nm mask charges, at approx
> > $2m, is already far too high, and the number of large chips
> > being designed is plummeting like a stone as a result - from
> > like 15,000 per year a few years ago down to ... damn, can't
> > remember - less than a hundred (i think! don't quote me on
> > that!)
> >
> > when 90 nm was introduced, some mad fabs wanted to make 9
> > metre lenses, dude!!! until carl zeiss were called in and
> > managed to get it down to 3 metres.
> >
> > and that lens is produced on a PER CHIP basis.
> >
> > basically, it's about cost.
>
> I can guarantee one thing here -- the cost, as is, is absolutely
> bearable. These companies make more money doing this than they
> spend in doing it, otherwise they wouldn't be in business. From an
> economics perspective, this industry is very much alive and well,
> proven by the fact that these companies haven't bailed out of it
> yet.

I have to agree. And he is also completely ignoring the fact that both
Intel and AMD are either in the process of moving to (or have moved
to) a 65nm fab process - last news I saw about this said both
facilities were running into the multi-billion dollar cost range.
Companies worried about $2m for a mask charge wouldn't be investing
multiple billions of dollars in new plants and a new, smaller fab
process.

<snip>

Hear, hear! I'm still working my way through the source tree and
learning the general layout and functionality of the various bits,
but in just a couple of months of being on this list I can attest to
the fact that one thing all developers seem to ask for is statistics.

<snip>


> At the risk of stepping on some toes, I believe that hyperthreading
> is going out of style, in favor of multi-core processors.

Agreed. And multi-core processors aren't really new technology - there
have been multi-core designs out for a while, but those were usually
low production "research" chips.

DRH


Rik van Riel
Oct 2, 2005, 11:30:13 PM

On Sun, 2 Oct 2005, Valdis.K...@vt.edu wrote:

> OK.. I'll bite. How do you find the 5th or 6th entry in the linked
> list, when only the first entry is in cache, in a single cycle, when a
> cache line miss is more than a single cycle penalty, and you have
> several "These are not the droids you're looking for" checks and go on
> to the next entry - and do it in one clock cycle?

A nice saying from the last decade comes to mind:

"If you can do all that in one cycle, your cycles are too long."

--
All Rights Reversed

Gene Heskett
Oct 3, 2005, 12:00:07 AM

On Sunday 02 October 2005 19:48, Rik van Riel wrote:
>On Sun, 2 Oct 2005, Gene Heskett wrote:
>> Ahh, yes and no, Robert. The un-answered question, for that
>> 512 processor Altix system, would be "but does it run things 512
>> times faster?" Methinks not, by a very wide margin. Yes, do a lot
>> of unrelated things fast maybe, but render a 30 megabyte page with
>> ghostscript in 10 milliseconds? Never happen IMO.
>
>You haven't explained us why you think your proposal
>would allow Linux to circumvent Amdahl's law...

Amdahl's Law?

That's a reference I don't believe I've been made aware of. Can you
elaborate?

Besides, it isn't my proposal, just a question: I chose a
scenario (ghostscript's rendering of a page of text) that in fact only
runs maybe 10x faster on an XP-2800 Athlon with a gig of DRAM than it
did on my old 25 MHz 68040-equipped Amiga. With 64 megs of DRAM,
that machine wasn't nearly as memory-bound doing that as most of
the Amigas were.
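Amdahl's law, for anyone else who hasn't run into it: if a fraction s of a job is inherently serial, then N processors give at most 1/(s + (1-s)/N) overall speedup. A one-liner makes the bound concrete - a job that is even 10% serial tops out below 10x, regardless of the CPU count, which is the shape of the ghostscript numbers above:

```c
/* upper bound on parallel speedup with serial fraction s on n
 * cpus: as n grows the bound approaches 1/s, so 10% serial work
 * caps even a 512-way machine below 10x */
static double amdahl_speedup(double s, int n)
{
	return 1.0 / (s + (1.0 - s) / n);
}
```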

--
Cheers, Gene
"There are four boxes to be used in defense of liberty:
soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
99.35% setiathome rank, not too shabby for a WV hillbilly
Yahoo.com and AOL/TW attorneys please note, additions to the above
message by Gene Heskett are:
Copyright 2005 by Maurice Eugene Heskett, all rights reserved.

-

Willy Tarreau
Oct 3, 2005, 12:20:09 AM

On Mon, Oct 03, 2005 at 12:24:16AM +0100, Luke Kenneth Casson Leighton wrote:

> On Sun, Oct 02, 2005 at 11:49:57PM +0100, Christoph Hellwig wrote:
> > Let's hope these posts will stop when the UK starts to allow serving
> > drinks after 23:00. Post from half-drunk people that need to get a life
> > don't really help a lot.
>
> hi, christoph,
>
> i assume that your global world-wide distribution of this message
> was a mistake on your part. but, seeing as it _has_ gone out to
> literally thousands of extremely busy people, i can only apologise
> to them on your behalf for the mistake of wasting their valuable
> time.

If you think this, you don't know Christoph. We all know him
for his "warm words" and his frankness. Sometimes he can be quite
a bit excessive, but here I tend to agree with him. He simply meant
that these long threads are often totally useless and consume a lot
of real developers' time. Generally speaking, calling on developers
to tell them "hey, you work the wrong way" is not productive. If you
tell them that you can improve their work and show some code, it's
usually better appreciated. Yes, I know you provided some links to
sites, but without proposing one real guideline or project. You would
have done better to post something like "announce: merging of l4linux
into mainline" and to tell people that you will start slow,
non-intrusive merging work on one of your favorite projects. You'd
have got lots of opposition, but that would have been more productive
than just talking philosophy.

> let's also hope that people who believe that comments such as the one
> that you have made are useful and productive also think about the
> consequences of doing so, bear in mind that internet archives are
> forever, and also that they check whether the person that they are
> criticising drinks at _all_.

[ cut all the uninteresting info about your drinking habits that you
sent to the whole world and which will be archived forever ]

Regards,
Willy

Sonny Rao
Oct 3, 2005, 1:10:15 AM
On Mon, Oct 03, 2005 at 01:54:00AM +0100, Luke Kenneth Casson Leighton wrote:
<snip>

> this company's hardware is kinda a bit difficult for most people to get
> their heads round: it's basically parallelised hardware-acceleration for
> operating systems, and very few people see the point in that.

Obviously, we are all clueless morons.

<snip>

> so.
>
> this is a heads-up.
>
> what you choose to do with this analysis is up to you.
>
> l.

Roll around on the floor while violently laughing for a while?

Nick Piggin
Oct 3, 2005, 1:50:15 AM
Allow me to apply Rusty's technique, if you will.

Luke Kenneth Casson Leighton wrote:

> Hi,
>
> Can all you great kernel hackers, who only know a little bit
> less than me and have only built a slightly less successful
> kernel than I have, stop what you are doing and do it my way
> instead?
>

Hi Luke,

Thanks for your concise and non-rambling letter that is actually
readable - a true rarity on lkml these days.

To answer your question: I think we would all be happy to examine
your ideas when you can provide some real numbers and comparisons,
and actual technical arguments as to why they are better than the
current scheme we have in Linux.

Nick

PS. I am disappointed not to have seen any references to XML in
your proposal. May I suggest you adopt some kind of XML format
for your message protocol?


Meelis Roos
Oct 3, 2005, 4:00:16 AM
LKCL> the code for oskit has been available for some years, now,
LKCL> and is regularly maintained. the l4linux people have had to

My experience with oskit (trying to let students use it for OS course
homework) is quite ... underwhelming. It works as long as you use it
exactly like the developers did, and breaks at the slightest sidestep
from that road. And there's not much documentation, so it's hard to
learn where that road might be.

We switched to hacking on Linux/BSD code with the students: code that
actually works.

YMMV.

--
Meelis Roos

Erik Mouw
Oct 3, 2005, 5:40:20 AM
On Mon, Oct 03, 2005 at 02:53:00AM +0100, Al Viro wrote:
> On Sun, Oct 02, 2005 at 06:50:46PM -0700, Vadim Lobanov wrote:
> > > visionary [n]: onanist with strong exhibitionist tendencies; from
> > > "visions", the source of inspiration they refer to when it becomes
> > > obvious that they have lost both sight and capacity for rational
> > > thought.
> > >
> >
> > Nice. :-)
> >
> > Just from idle curiosity, you wouldn't know where that list currently
> > resides, would you?
>
> No idea... Probably somebody from kernelnewbies.org crowd would know
> the current location...

There's not really a list of words on kernelnewbies.org, but a fortunes
file:

http://www.kernelnewbies.org/kernelnewbies-fortunes.tar.gz

(and I just updated it with your description of a visionary)


Erik

--
+-- Erik Mouw -- www.harddisk-recovery.nl -- 0800 220 20 20 --
| Eigen lab: Delftechpark 26, 2628 XH, Delft, Nederland
| Files foetsie, bestanden kwijt, alle data weg?!
| Blijf kalm en neem contact op met Harddisk-recovery.nl!

Jesper Juhl
Oct 3, 2005, 5:50:19 AM
On 10/3/05, Gene Heskett <gene.h...@verizon.net> wrote:
> On Sunday 02 October 2005 19:48, Rik van Riel wrote:
> >On Sun, 2 Oct 2005, Gene Heskett wrote:
> >> Ahh, yes and no, Robert. The un-answered question, for that
> >> 512 processor Altix system, would be "but does it run things 512
> >> times faster?" Methinks not, by a very wide margin. Yes, do a lot
> >> of unrelated things fast maybe, but render a 30 megabyte page with
> >> ghostscript in 10 milliseconds? Never happen IMO.
> >
> >You haven't explained to us why you think your proposal
> >would allow Linux to circumvent Amdahl's law...
>
> Amdahl's Law?
>
http://en.wikipedia.org/wiki/Amdahl's_law
http://home.wlu.edu/~whaleyt/classes/parallel/topics/amdahl.html

And google has even more. Wonderful thing those search engines...
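To make the reference concrete: Amdahl's law caps the overall speedup of a job that is only partly parallelisable. A quick sketch with made-up numbers (the 95% figure is an illustrative guess, not a Ghostscript profile):

```python
def amdahl_speedup(parallel_fraction, n_processors):
    """Upper bound on speedup when only part of a job can run in parallel.

    parallel_fraction: share of the single-CPU runtime that benefits
    from extra processors; the rest stays serial.
    """
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n_processors)

# Even if 95% of the rendering were parallelisable (an illustrative
# guess), 512 processors would top out far below 512x:
print(round(amdahl_speedup(0.95, 512), 1))   # -> 19.3
```

Which is why "512 processors" never means "512 times faster" for a job with any serial component at all.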


--
Jesper Juhl <jespe...@gmail.com>
Don't top-post http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please http://www.expita.com/nomime.html

Giuseppe Bilotta
Oct 3, 2005, 6:50:22 AM
On Mon, 3 Oct 2005 02:53:02 +0100, Luke Kenneth Casson Leighton wrote:

> On Sun, Oct 02, 2005 at 06:20:38PM -0700, Vadim Lobanov wrote:
>> The beauty of capitalization is that it makes it easier for others to
>> read what you have to say.
>
> sorry, vadim: haven't touched a shift key in over 20 years.
>

[snip]

> *sigh* - i _so_ hope we don't need degrees in physics to program
> them...

[snip]

> ah, i'm not: i just left out mentioning it :)

I'd *love* a keyboard layout where * _ : ) are accessible without
shift! Can you send me yours?

--
Giuseppe "Oblomov" Bilotta

Hic manebimus optime

Horst von Brand
Oct 3, 2005, 9:40:22 AM
Luke Kenneth Casson Leighton <lk...@lkcl.net> wrote:
> On Sun, Oct 02, 2005 at 04:37:52PM -0700, Vadim Lobanov wrote:
> > > what if, therefore, someone comes up with an architecture that is
> > > better than or improves greatly upon SMP?

> > Like NUMA?

> yes, like numa, and there is more.
>

> i had the honour to work with someone who came up with a radical
> enhancement even to _that_.

Any papers to look at?

> basically the company has implemented, in hardware (a nanokernel),

A nanokernel is a piece of software in my book?

> some
> operating system primitives, such as message passing (based on a
> derivative by thompson of the "alice" project from plessey, imperial and
> manchester university in the mid-80s), hardware cache line lookups
> (which means instead of linked list searching, the hardware does it for
> you in a single cycle), stuff like that.

Single CPU cycle for searching data in memory? Impossible.

> the message passing system is designed as a parallel message bus -
> completely separate from the SMP and NUMA memory architecture, and as
> such it is perfect for use in microkernel OSes.

Something must shuffle the data from "regular memory" into "message
memory", so I bet that soon becomes the bottleneck. And the duplicate data
paths add to the cost, money that could be spent on making memory access
faster, so...

> (these sorts of things are unlikely to make it into the linux kernel, no
> matter how much persuasion and how many patches they would write).

Your head would spin at how fast this would get into Linux if such
machines were around and it were worth it.

> _however_, a much _better_ target would be to create an L4 microkernel
> on top of their hardware kernel.

Not yet another baroque CISC design, this time around with 1/3 of an OS in
it!

> this company's hardware is kinda a bit difficult for most people to get
> their heads round: it's basically parallelised hardware-acceleration for
> operating systems, and very few people see the point in that.

Perhaps most people that don't see the point do have a point?

> however, as i pointed out, 90nm and approx-2Ghz is pretty much _it_,
> and to get any faster you _have_ to go parallel.

Sorry, all this has been doomsayed (with different numbers) from 1965 or
so.

> and the drive for "faster", "better", "more sales" means more and more
> parallelism.

Right.

> it's _happening_ - and SMP ain't gonna cut it (which is why
> these multi-core chips are coming out and why hyperthreading
> is coming out).

Hyperthreading and multi-core /are/ SMP, just done a bit differently.
--
Dr. Horst H. von Brand User #22616 counter.li.org
Departamento de Informatica Fono: +56 32 654431
Universidad Tecnica Federico Santa Maria +56 32 654239
Casilla 110-V, Valparaiso, Chile Fax: +56 32 797513

Jon Masters
Oct 3, 2005, 10:30:37 AM
On 10/2/05, Luke Kenneth Casson Leighton <lk...@lkcl.net> wrote:

> as i love to put my oar in where it's unlikely that people
> will listen, and as i have little to gain or lose by doing
> so, i figured you can decide for yourselves whether to be
> selectively deaf or not:

Hi Luke,

Haven't seen you since I believe you gave a somewhat interesting talk
on FUSE at an OxLUG a year or more back. I don't think anyone here is
selectively deaf, but some might just ignore you for such comments :-)

> what prompted me to send this message now was a recent report
> where linus' no1 patcher is believed to be close to overload,
> and in that report, i think it was andrew morton, it was
> said that he believed the linux kernel development rate to be
> slowing down, because it is nearing completion.

There was some general bollocks about Andrew being burned out, but
that wasn't the point as far as I saw it - it was more about how
things could be better streamlined than about a sudden panic moment.

> i think it safe to say that a project only nears completion
> when it fulfils its requirements and, given that i believe that
> there is going to be a critical shift in the requirements, it
> logically follows that the linux kernel should not be believed
> to be nearing completion.

Whoever said it was?

> with me so far? :)

I don't think anyone with a moderate grasp of the English language will
have failed to understand what you wrote above. They might not
understand why you said it, but that's another issue.

> the basic premise: 90 nanometres is basically... well...
> price/performance-wise, it's hit a brick wall at about 2.5Ghz, and
> both intel and amd know it: they just haven't told anyone.

But you /know/ this because you're a microprocessor designer as well
as a contributor to the FUSE project?

> anyone (big) else has a _really_ hard time getting above 2Ghz,
> because the amount of pipelining required is just... insane
> (see recent ibm powerpc5 see slashdot - what speed does it do?
> surprise: 2.1Ghz when everyone was hoping it would be 2.4-2.5ghz).

I think there are many possible reasons for that and I doubt slashdot
will reveal any of those reasons. The main issue (as I understand it)
is that SMT/SMP is taking off for many applications and manufacturers
want to cater for them while reducing heat output - so they care less
about MHz than about potential real world performance.

> so, what's the solution?

> well.... it's to go back to parallel processing techniques, of course.

Yes. Wow! Of course! Whoda thunk it? I mean, parallel processing!
Let's get that right into the kern...oh wait, didn't Alan and a bunch
of others already do that years ago? Then again, we might have missed
all of the stuff which went into 2.2, 2.4 and then 2.6?

> well - intel is pushing "hyperthreading".

Wow! Really? I seem to have missed /all/ of those annoying ads. But
please tell me some more about it!

> and, what is the linux kernel?

> it's a daft, monolithic design that is suitable and faster on
> single-processor systems, and that design is going to look _really_
> outdated, really soon.

Why? I happen to think Microkernels are really sexy in a Computer
Science masturbatory kind of way, but Linux seems to do the job just
fine in real life. Do we need to have this whole
Microkernel/Monolithic conversation simply because you misunderstood
something about the kind of performance now possible in 2.6 kernels as
compared with adding a whole pointless level of message passing
underneath?

Jon.

Miklos Szeredi
Oct 3, 2005, 12:10:15 PM
> But you /know/ this because you're a microprocessor designer as well
> as a contributor to the FUSE project?

AFAIK Luke never contributed to the FUSE project. Hopefully that
answers your question.

FUSE and microkernels are sometimes mentioned together, but I believe
there's a very important philosophical difference:

FUSE was created to ease the development and use of a very _special_
group of filesystems. It was never meant to replace (and never will)
the fantastically efficient and flexible internal filesystem
interfaces in Linux and other monolithic kernels.

On the other hand, the microkernel approach is to restrict _all_
filesystems to the more secure, but less efficient and less flexible
interface. Which is stupid IMO.

Miklos

Valdis.K...@vt.edu
Oct 3, 2005, 12:40:09 PM
On Sun, 02 Oct 2005 22:12:38 EDT, Horst von Brand said:

> > some
> > operating system primitives, such as message passing (based on a
> > derivative by thompson of the "alice" project from plessey, imperial and
> > manchester university in the mid-80s), hardware cache line lookups
> > (which means instead of linked list searching, the hardware does it for
> > you in a single cycle), stuff like that.
>
> Single CPU cycle for searching data in memory? Impossible.

Well... if it were content-addressable RAM similar to what's already used
for the hardware TLBs and the like, that could work. But it's one thing
to make a 32- or 256-entry content-addressable RAM, and totally another
to have multiple megabytes of the stuff. :)
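The TLB comparison can be made concrete with a toy model. In this sketch (class name and sizes invented for illustration), the CAM's defining trick is that a lookup compares every stored tag at once, which hardware does in one cycle and which is modelled here as a single dictionary probe:

```python
class TinyCAM:
    """Toy model of a content-addressable memory: lookup returns the
    slot holding a matching tag, the way TLB hardware matches page tags.
    In silicon all comparators fire in parallel (one cycle); here that
    parallel compare is modelled by a reverse index built at write time."""

    def __init__(self, size):
        self.entries = [None] * size
        self.index = {}          # tag -> slot: the "parallel" match

    def write(self, slot, tag):
        old = self.entries[slot]
        if old is not None:
            del self.index[old]  # a slot holds one tag at a time
        self.entries[slot] = tag
        self.index[tag] = slot

    def lookup(self, tag):
        # Conceptually a single cycle: every stored tag compared at once.
        return self.index.get(tag)   # None on a miss

cam = TinyCAM(32)
cam.write(3, 0xdeadbeef)
print(cam.lookup(0xdeadbeef))   # -> 3
```

The scaling objection above is exactly why this stays small in hardware: every extra entry is another full-width comparator, which is affordable at TLB sizes and ruinous at megabytes.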

Joe Bob Spamtest
Oct 3, 2005, 2:00:11 PM
Luke Kenneth Casson Leighton wrote:
> p.s. martin. _don't_ do that again. i don't care who you are:
> internet archives are forever and your rudeness will be noted
> by google-users and other search-users - long after you are dead.

and who are you, the thought police? Get off your high horse.

I'm sure he's well aware of the consequences of posting to this list, as
I'm sure we all are. Hell, even *I* know all my mails to this list are
going to be archived for eternity.

Look, if you want to be a productive member of our community, stop
bitching about the way things *should* be, and submit some patches like
everyone else. Code talks. Bullshit ... well, it doesn't do much but sit
around stinking the place up.

Lennart Sorensen
Oct 3, 2005, 2:10:12 PM
On Mon, Oct 03, 2005 at 10:50:00AM +0300, Meelis Roos wrote:
> LKCL> the code for oskit has been available for some years, now,
> LKCL> and is regularly maintained. the l4linux people have had to
>
> My experience with oskit (trying to let students use it for OS course
> homework) is quite ... underwhelming. It works as long as you try to use
> it exactly like the developers did and breaks on a slightest sidestep
> from that road. And there's not much documentation so it's hard to learn
> where that road might be.
>
> Switched to Linux/BSD code hacking with students, the code that actually
> works.

Can oskit be worse than Nachos, where the OS ran outside the memory space
and CPU, with only applications being inside the emulated MIPS processor?
That made some things much too easy to do, and other things much too hard
(like converting an address from user space to kernel space and accessing
it, which should be easy, but was hard).

I suspect most 'simple' OS teaching tools are awful. Of course, writing
a complete OS from scratch is a serious pain and makes debugging much
harder than if you can do your work on top of a working OS that can
print debug messages.

Len Sorensen

Lennart Sorensen
Oct 3, 2005, 2:30:22 PM
On Mon, Oct 03, 2005 at 02:53:02AM +0100, Luke Kenneth Casson Leighton wrote:
> sorry, vadim: haven't touched a shift key in over 20 years.

Except to type that ':' I suspect.

How about learning to use that shift key, so that everyone else who
reads your writing doesn't have to spend so much time working it out
when proper capitalisation would have made it simpler? It may save you
1% of your typing time, but it costs thousands of readers much more as
a result: a net loss for the world as a whole.

> ah, i'm not: i just left out mentioning it :)
>
> the message passing needs to be communicated down to manage
> threads, and also to provide a means to manage semaphores and
> mutexes: ultimately, support for such an architecture would
> work its way down to libc.
>
> and yes, if you _really_ didn't want a kernel in the way at all, you
> could go embedded and just... do everything yourself.
>
> or port reactos, the free software reimplementation of nt,
> to it, or something :)

Microkernel, message passing, blah blah blah. Does GNU Hurd actually
run fast yet? I think it finally exists and works (at least mostly), but
how does the performance compare?

Most of your arguments seem like a repeat of academic theories, most of
which have never been used in a real system where performance running
average software was important; at least, I never heard of them. Not
that that necessarily means much, other than that they can't have gained
much popularity.

> it won't stop - but the price of 90nm mask charges, at approx
> $2m, is already far too high, and the number of large chips
> being designed is plummetting like a stone as a result - from
> like 15,000 per year a few years ago down to ... damn, can't remember -
> less than a hundred (i think! don't quote me on that!)

Hmm, so if we guess it might take 10 masks per processor type over its
lifetime as they change features and such, that's still less than 1% of
the cost of the fab in the first place. I agree with the person who
said Intel/AMD/whoever probably don't care much, as long as their
engineers make really darn sure that the mask is correct when they go to
make one.

> okay: i will catch up on this bit, another time, because it is late
> enough for me to be getting dizzy and appearing to be drunk.
>
> this is one answer (and there are others i will write another time.
> hint: automated code analysis tools, auto-parallelising tools, both
> offline and realtime):
>
> watch what intel and amd do: they will support _anything_ - clutch at
> straws - to make parallelism palatable, why? because in order to be
> competitive - and realistically priced - they don't have any choice.
>
> plus, i am expecting the chips to be thrown out there (like
> the X-Box 360 which has SIX hardware threads remember) and
> the software people to quite literally _have_ to deal with it.

Hey, I like parallel processing. I think it's neat, and I have often
made some of my own tools multithreaded just because I found it could be
done for the task. I often find multithreaded code simpler to write (I
seem to be a bit of a weirdo that way) for certain tasks.

> i expect the hardware people to go: this is the limit, this is what we
> can do, realistically price-performance-wise: lump it, deal with it.
>
> when intel and amd start doing that, everyone _will_ lump it.
> and deal with it.
>
> ... why do you think intel is hyping support for and backing
> hyperthreads support in XEN/Linux so much?

Ehm, because Intel has it, and their P4 desperately needs all the help
it can get until the Pentium-M based desktop chips with multiple cores
are finished; and of course because AMD doesn't have it. Those seem
like good reasons for Intel to try to push it.

Len Sorensen

linux-os (Dick Johnson)
Oct 3, 2005, 2:40:16 PM

On Mon, 3 Oct 2005, Lennart Sorensen wrote:

> On Mon, Oct 03, 2005 at 10:50:00AM +0300, Meelis Roos wrote:
>> LKCL> the code for oskit has been available for some years, now,
>> LKCL> and is regularly maintained. the l4linux people have had to
>>
>> My experience with oskit (trying to let students use it for OS course
>> homework) is quite ... underwhelming. It works as long as you try to use
>> it exactly like the developers did and breaks on a slightest sidestep
>> from that road. And there's not much documentation so it's hard to learn
>> where that road might be.
>>
>> Switched to Linux/BSD code hacking with students, the code that actually
>> works.
>
> Can oskit be worse than nachos where the OS ran outside the memory space
> and cpu with only applications being inside the emulated mips processor?
> Made some things much too easy to do, and other things much to hard
> (like converting an address from user space to kernel space an accessing
> it, which should be easy, but was hard).
>
> I suspect most 'simple' OS teaching tools are awful. Of course writing
> a complete OS from scratch is a serious pain and makes debuging much
> harder than if you can do your work on top of a working OS that can
> print debug messages.
>
> Len Sorensen
> -

But the first thing you must do in a 'roll-your-own' OS is to make
provisions to write text to (sometimes a temporary) output device
and get some input from same. Writing such basic stuff is getting
harder because many embedded systems don't have UARTs, screen cards,
keyboards, or any useful method of doing I/O. This is where an
existing OS (Like Linux) can help you get some I/O running, perhaps
through a USB bus. You debug and make it work as a Linux
Driver, then you link the working stuff into your headless CPU
board.

Cheers,
Dick Johnson
Penguin : Linux version 2.6.13 on an i686 machine (5589.55 BogoMips).
Warning : 98.36% of all statistics are fiction.


Alan Cox
Oct 3, 2005, 2:50:09 PM
On Sul, 2005-10-02 at 22:55 -0400, Valdis.K...@vt.edu wrote:
> The HP2114 and DEC KL10/20 were able to dereference a chain of indirect bits
> back in the 70's (complete with warnings that hardware wedges could occur if
> an indirect reference formed a loop or pointed at itself).

The KL10 has an 8-way limit. The PDP-6 didn't, but then it also lacked
an MMU.

Alan Cox
Oct 3, 2005, 3:00:07 PM
On Llu, 2005-10-03 at 01:54 +0100, Luke Kenneth Casson Leighton wrote:
> the message passing system is designed as a parallel message bus -
> completely separate from the SMP and NUMA memory architecture, and as
> such it is perfect for use in microkernel OSes.

I've got one of those. It has the memory attached. Makes a fantastic
message bus and has a really long queue. It also features shortcuts for
messages travelling between processors, going cache to cache. Made by
AMD and Intel.
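Alan's point, that ordinary cache-coherent shared memory already serves as the message bus, is easy to illustrate in user space. A minimal sketch (the queue size, message format, and thread roles are arbitrary choices for illustration, not anything AMD or Intel specify):

```python
import queue
import threading

# Two threads exchanging messages through ordinary memory: the coherent
# cache/memory hierarchy is the only "message bus" involved.
mailbox = queue.Queue(maxsize=64)

def producer():
    for i in range(4):
        mailbox.put(("ping", i))   # enqueue: a store into shared memory
    mailbox.put(None)              # sentinel marks end of stream

def consumer(received):
    while True:
        msg = mailbox.get()        # dequeue: a load from shared memory
        if msg is None:
            break
        received.append(msg)

received = []
t1 = threading.Thread(target=producer)
t2 = threading.Thread(target=consumer, args=(received,))
t1.start(); t2.start()
t1.join(); t2.join()
print(received)   # -> [('ping', 0), ('ping', 1), ('ping', 2), ('ping', 3)]
```

The only "bus" the two threads use is memory kept coherent by the hardware; when both run on one package the transfer never even leaves the caches, which is exactly the shortcut described above.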

> however, as i pointed out, 90nm and approx-2Ghz is pretty much _it_,
> and to get any faster you _have_ to go parallel.

We do 512 processors passably now. That's a lot of cores, and more than
the commodity computing people can wire to memory subsystems at a price
people will pay.

Besides which, you really need to take it up with the desktop people.
It's their apps that use most of the processor power and will benefit
most from parallelising and efficiency work.

Alan

Luke Kenneth Casson Leighton
Oct 3, 2005, 3:00:15 PM
On Mon, Oct 03, 2005 at 02:08:58PM -0400, Lennart Sorensen wrote:
> On Mon, Oct 03, 2005 at 10:50:00AM +0300, Meelis Roos wrote:
> > LKCL> the code for oskit has been available for some years, now,
> > LKCL> and is regularly maintained. the l4linux people have had to
> >
> > My experience with oskit (trying to let students use it for OS course
> > homework) is quite ... underwhelming. It works as long as you try to use
> > it exactly like the developers did and breaks on a slightest sidestep
> > from that road. And there's not much documentation so it's hard to learn
> > where that road might be.

analysis, verification, debugging and adoption of oskit by
the linux kernel maintainers would help enormously there,
i believe, which is why i invited the kernel maintainers to
give it some thought.

there are other reasons: not least is that oskit _is_ the
linux kernel source code - with the kernel/* bits removed and
the device drivers and support infrastructure remaining.


so the developers who split the linux source code out into oskit did
not, in your opinion and experience, meelis, do a very good job: so
educate them and tell them how to do it better.

l.

Luke Kenneth Casson Leighton
Oct 3, 2005, 3:10:13 PM

aspex microelectronics 4096 2-bit massively parallel SIMD
processor (does 1 terabit-ops / sec @ 250mhz which sounds a
lot until you try to do FPU emulation on it).

each 2-bit processor has 256 bits of content-addressable memory,
which can be 8-bit, 16-bit or 32-bit addressed (to make 4096 parallel
memory searches - in a single cycle).

absolutely friggin blindingly fast for certain jobs (video
processing, certain kinds of audio processing - e.g. FFTs,
XML and HTTP parsing), and hopeless at others, such as
floating point arithmetic.

but anyway: that's a side issue. thanks for reminding me about CAM,
valdis.

l.

--
<a href="http://lkcl.net">http://lkcl.net</a>

Luke Kenneth Casson Leighton
Oct 3, 2005, 3:20:13 PM
On Mon, Oct 03, 2005 at 06:00:56PM +0200, Miklos Szeredi wrote:
> > But you /know/ this because you're a microprocessor designer as well
> > as a contributor to the FUSE project?
>
> AFAIK Luke never contributed to the FUSE project. Hopefully that
> answers your question.

wrong.

i added xattr support to fuse, for use in selinux. it's a long story.

http://www.ussg.iu.edu/hypermail/linux/kernel/0409.2/1441.html

and yes, for the record, i am just as comfortable with hardware
designs as with software: i designed a massively parallel
encryption algorithm capable of handling block sizes up to about
16384 bits with key sizes up to around 8192 bits (which
unfortunately wasn't very fast in software - you can't have
everything); came up with some significant improvements to
the plessey/imperial-uni/man-uni ALICE parallel transputer
network as a third-year project; and also provided aspex,
the massively-parallel SIMD processor company, with enough
new material and ideas in four months for them to have to
register six new patents.

... why are you people bothering to attempt to go "oh, this
guy must not know anything therefore we'll waste the list's
time with our opinions on whether he cannot do anything",
such that i have to refute you, and look like a complete
egg-head jumped-up i'm-better-than-you horn-blowing tosser?

stop it!

everyone has their level and areas of expertise: instead of
turning this into a pissing contest, be glad and humbled for
an opportunity to learn from each other.

l.

Miklos Szeredi
Oct 3, 2005, 3:40:08 PM
> > > But you /know/ this because you're a microprocessor designer as well
> > > as a contributor to the FUSE project?
> >
> > AFAIK Luke never contributed to the FUSE project. Hopefully that
> > answers your question.
>
> wrong.
>
> i added xattr support to fuse, for use in selinux. it's a long story.
>
> http://www.ussg.iu.edu/hypermail/linux/kernel/0409.2/1441.html
>

Well, adding is a different thing from contributing.

Contributing is when you add something and also share it with the
maintainer of said software.

I can't remember you doing that, which is why I said what I said.

Miklos

Jon Masters
Oct 3, 2005, 4:10:14 PM
On 10/3/05, linux-os (Dick Johnson) <linu...@analogic.com> wrote:
>
> On Mon, 3 Oct 2005, Lennart Sorensen wrote:

> > I suspect most 'simple' OS teaching tools are awful. Of course writing
> > a complete OS from scratch is a serious pain and makes debuging much
> > harder than if you can do your work on top of a working OS that can
> > print debug messages.

> But the first thing you must do in a 'roll-your-own' OS is to make
> provisions to write text to (sometimes a temporary) output device
> and get some input from same.

Indeed. I started work on a microkernel for a final-year university
project. In the end I didn't get far beyond minimal memory management
and a vague, hand-wavy concept of a process, as it's easy to come
unstuck figuring out random black-box hardware. It makes you respect
the people who really figured it out and got it working.

> Writing such basic stuff is getting harder because many embedded
> systems don't have UARTS, screen-cards, keyboards, or any useful
> method of doing I/O.

It's easier now that we have a growing number of cheaper ARM/PPC
boards on the market. But in order to do much of this, you really need
a hardware debugger. In my case, I tried to do this on an Apple
PowerBook, but once you've broken the BAT/page mapping for your
framebuffer you rapidly run out of ways of debugging, e.g. a
VM. It's difficult enough even with a UART, or an LED, or whatever.

> This is where an existing OS (Like Linux) can help you get some I/O
> running, perhaps through a USB bus. You debug and make it work
> as a Linux Driver, then you link the working stuff into your headless
> CPU board.

A lot of people end up doing that - I've heard of some interesting
stories which I'm sure aren't widespread. One case, the guy had
basically bolted a small realtime module on to Linux (not really quite
like RTLinux) but had been able to do a lot of testing through
existing APIs. Another trick is to write as much as you can to sit
right atop the existing firmware - OpenFirmware, U-Boot, whatever and
perhaps even forgo trying to handle exceptions/VM for yourself in the
beginning.

Jon.

Luke Kenneth Casson Leighton
Oct 3, 2005, 4:30:15 PM
On Mon, Oct 03, 2005 at 03:20:46PM +0100, Jon Masters wrote:
> On 10/2/05, Luke Kenneth Casson Leighton <lk...@lkcl.net> wrote:
>
> > as i love to put my oar in where it's unlikely that people
> > will listen, and as i have little to gain or lose by doing
> > so, i figured you can decide for yourselves whether to be
> > selectively deaf or not:
>
> Hi Luke,

hellooo jon.

> Haven't seen you since I believe you gave a somewhat interesting talk
> on FUSE at an OxLUG a year or more back.

good grief, that long ago?

> I don't think anyone here is
> selectively deaf, but some might just ignore you for such comments :-)

pardon? oh - yes, i'm counting on it, for a good signal/noise
ratio. sad to recount, the strategy ain't workin too well,
oh well. i've received about 10 hate mails so far, i _must_
be doing _something_ right.


> > i think it safe to say that a project only nears completion
> > when it fulfils its requirements and, given that i believe that
> > there is going to be a critical shift in the requirements, it
> > logically follows that the linux kernel should not be believed
> > to be nearing completion.
>
> Whoever said it was?

istrc it was in andrew morton's interview / comments :)

> > the basic premise: 90 nanometres is basically... well...
> > price/performance-wise, it's hit a brick wall at about 2.5Ghz, and
> > both intel and amd know it: they just haven't told anyone.
>
> But you /know/ this because you're a microprocessor designer as well
> as a contributor to the FUSE project?

i have been speaking on a regular basis with someone who
has been dealing for nearly twenty years now with processor
designs (from a business perspective, for assessing high-tech
companies for investment and recruitment purposes). i have
been fortunate enough to have the benefit of their experience
in assessing the viability of chip designs.

i haven't created silicon, but i've studied processor designs until
they were coming out of my ears.

... you are mistaken on one point, though: my work on fuse proved
unsuccessful because i was a) running out of time b) running out of
reasons to continue (the deal fell through) c) i ran into an error
message in selinux: the "please try later" one, which flummoxed me but
i now believe to be due to a crash in the userspace stuff. maybe.
urk. it's been a year.

anyway, we digress.

> > anyone (big) else has a _really_ hard time getting above 2Ghz,
> > because the amount of pipelining required is just... insane
> > (see recent ibm powerpc5 see slashdot - what speed does it do?
> > surprise: 2.1Ghz when everyone was hoping it would be 2.4-2.5ghz).
>
> I think there are many possible reasons for that and I doubt slashdot
> will reveal any of those reasons.

probably :)

> The main issue (as I understand it)
> is that SMT/SMP is taking off for many applications and manufacturers
> want to cater for them while reducing heat output - so they care less
> about MHz than about potential real world performance.

pipelining. pipelining. latency between blocks.

halving the microns should quadruple the speed: the distance is halved
so light has half the distance to travel and ... darn, can't remember
the other reason for the other factor-of-two.

if the latency between sub-blocks is large (or becomes
relevant), then it doesn't matter _what_ microns you attempt to
run in, your design will asymptote towards an upper speed limit.

if you're having to pipeline down to the level of 2-bit adders with 16
stages in order to do 32-bit adds at oh say 4ghz, you _know_ you're in
trouble.


> > so, what's the solution?
>
> > well.... it's to back to parallel processing techniques, of course.
>
> Yes. Wow! Of course! Whoda thunk it? I mean, parallel processing!
> Let's get that right into the kern...oh wait, didn't Alan and a bunch
> of others already do that years ago? Then again, we might have missed
> all of the stuff which went into 2.2, 2.4 and then 2.6?

jon, jon *sigh* :) i meant _hardware_ parallel processing - i wasn't
referring to anything led or initiated by the linux kernel, but instead
to the simple conclusion that if hardware is running out of steam in
uniprocessor (monster-pipelined; awful-prediction;
let's-put-five-separately-designed-algorithms-for-divide-into-the-chip-and-take-the-answer-of-the-first-unit-that-replies sort of design)
then chip designers are forced to parallelise.



> > well - intel is pushing "hyperthreading".
>
> Wow! Really? I seem to have missed /all/ of those annoying ads. But
> please tell me some more about it!

make me. nyer :)

> > and, what is the linux kernel?
>
> > it's a daft, monolithic design that is suitable and faster on
> > single-processor systems, and that design is going to look _really_
> > outdated, really soon.
>
> Why? I happen to think Microkernels are really sexy in a Computer
> Science masturbatory kind of way, but Linux seems to do the job just
> fine in real life. Do we need to have this whole
> Microkernel/Monolithic conversation simply because you misunderstood
> something about the kind of performance now possible in 2.6 kernels as
> compared with adding a whole pointless level of message passing
> underneath?

i'll answer rik's point later when i've thought about it some
more: in short, rik has concluded that because i advocated
message passing that somehow his SMP improvement work
(isolating data structures) is irrelevant - far from it: such
improvements would prove, i believe, to be a significant additional
augmentation, unburdening both a monolithic _and_ a microkernel'd linux
kernel from some of the cru-joze nastiness of SMP.

if there are better ways, _great_.

love to hear them.

l.

Luke Kenneth Casson Leighton

Oct 3, 2005, 4:40:10 PM
On Mon, Oct 03, 2005 at 12:36:20PM -0700, Joe Bob Spamtest wrote:
> The point being: If and when the industry switches its focus to highly
> parallel systems, Linux will shortly follow.

joe: hi, thanks for responding. i believe this to be a very
sound strategy, and given the technical expertise of the kernel
developers i have confidence in their abilities to pull that off.

personally i find that i like a bit of a run-up and/or advance notice
of major paradigm shifts. on the basis that other people might also
want to know, i initiated this discussion yesterday and it seems like
forever already! :)

l.

oh, and joe? my wife is the one with the high horse, not me.
she qualified for the national british dressage championships which
was last month, and came 17th in the country, at elementary
level, on her beautiful pony, blue. i am very proud of her.

http://www.bdchampionships.co.uk

Luke Kenneth Casson Leighton

Oct 3, 2005, 5:10:16 PM
On Mon, Oct 03, 2005 at 08:18:40PM +0100, Alan Cox wrote:
> On Llu, 2005-10-03 at 01:54 +0100, Luke Kenneth Casson Leighton wrote:
> > the message passing system is designed as a parallel message bus -
> > completely separate from the SMP and NUMA memory architecture, and as
> > such it is perfect for use in microkernel OSes.
>
> I've got one of those. It has the memory attached. Makes a fantastic
> message bus and has a really long queue. Also features shortcuts for
> messages travelling between processors in short order cache to cache.
> Made by AMD and Intel.

made? _cool_. actual hardware. new knowledge for me. do you know
of any online references, papers or stuff? [btw just to clarify:
you're saying you have a NUMA bus or you're saying you have an
augmented SMP+NUMA+separate-parallel-message-passing-bus er .. thing]


> > however, as i pointed out, 90nm and approx-2Ghz is pretty much _it_,
> > and to get any faster you _have_ to go parallel.
>
> We do 512 processors passably now.

wild.

> Thats a lot of cores and more than
> the commodity computing people can wire to memory subsystems at a price
> people will pay.

oops.

whereas, would you see it more reasonable for a commodity-level
chip to be something like 32- or even 64- ultra-RISC cores of
between 5000 and 10,000 gates each, resulting in a processor
of about 50% cache memory and 50% processing plus associated
parallel bus architecture at around 1 million gates?

running at oh say 1ghz or with careful design effort focussed on the
RISC cores maybe even 2ghz, resulting in 128 total GigaOps if you
go for 64 cpus @ 2ghz. that's a friggin lot of processing power
for a 1m gates processor!!

(hey, see, i can learn to use the shift key to highlight the keyword)

such a chip, in 90nm, would be approx $USD 20 in mass production.

small, good heat distribution, probably too many pins, probably
need some DRAM memory stamped upside down on top of the die,
instead of off-chip [putting DRAM and transistors on the same die
is a frequent and costly mistake: the yields are terrible].

putting DRAM upside down on top of a die and then hitting it
with a hammer (literally) is a frequently used technique to
avoid the problem.


> Besides which you need to take it up with the desktop people really.

i'm sort-of drafting a reply to rik's point in my head (no it's not
reiterations of things already said) and yes it involves legacy apps,
badly written apps, desktop focus etc.

server-side, yep, fine, 32-way, use it all, got it, heck,
even my apache server running on a P200 w/64mb RAM runs more
processes than that.

l.
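whether 64 slower cores beat one fast one depends on how much of the
workload actually parallelises, which is just Amdahl's law. a rough
illustrative sketch follows; the parallel fractions are made-up
example values, not measurements of any real workload:

```python
# Amdahl's law: speedup from n cores when only a fraction p of the
# work can run in parallel. Illustrates why "128 total GigaOps" from
# 64 small cores need not translate into 64x real-world performance.

def amdahl_speedup(p: float, n: int) -> float:
    """Speedup over one core, given parallel fraction p and n cores."""
    return 1.0 / ((1.0 - p) + p / n)

if __name__ == "__main__":
    for p in (0.50, 0.90, 0.99):
        print(f"p={p:.2f}: 64 cores -> {amdahl_speedup(p, 64):.1f}x speedup")
```

even at 99% parallel code, 64 cores give under 40x, and at 50% they
give barely 2x, which is the desktop-workload worry raised below.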

Luke Kenneth Casson Leighton

Oct 3, 2005, 5:20:14 PM
On Mon, Oct 03, 2005 at 01:03:48AM -0400, Sonny Rao wrote:
> Roll around on the floor while violently laughing for a while?

_excellent_! can we watch? where's the mp4 :)

Luke Kenneth Casson Leighton

Oct 3, 2005, 5:30:10 PM
On Sun, Oct 02, 2005 at 10:55:49PM -0400, Valdis.K...@vt.edu wrote:
> On Mon, 03 Oct 2005 01:54:00 BST, Luke Kenneth Casson Leighton said:
>
> > in the mid-80s), hardware cache line lookups (which means
> > instead of linked list searching, the hardware does it for
> > you in a single cycle), stuff like that.
>
> OK.. I'll bite. How do you find the 5th or 6th entry in the linked list,
> when only the first entry is in cache, in a single cycle, when a cache line
> miss is more than a single cycle penalty, and you have several "These are not
> the droids you're looking for" checks and go on to the next entry - and do it
> in one clock cycle?

i was not privy to the design discussions: unfortunately i was only
given brief conclusions and hints by the designer.

my guess is that yes, as the later messages in this thread
hint at, CAM is probably the key: 256 blocks of 32-bit CAM,
something like that.

CAM is known to help dramatically decrease execution time
by orders of magnitude in linked list algorithms such as
searching and sorting, esp. where each CAM cell has built-in
processing, like the aspex.net massively-deep SIMD architecture has.

l.
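for anyone unfamiliar with CAM: the point is that every stored entry
is compared against the search key at once, so the list walk
disappears. a toy software model follows; the dict lookup merely
stands in for the parallel comparators, and the 256x32-bit geometry
mentioned above is not modelled:

```python
# Toy model of a content-addressable memory (CAM): all stored entries
# are matched against the key "in one cycle", versus walking a linked
# list one node at a time. In software the parallel compare can only
# be simulated; in hardware each CAM cell has its own comparator.

class ToyCAM:
    def __init__(self):
        self._cells = {}              # tag -> data, stands in for CAM cells

    def store(self, tag: int, data: str) -> None:
        self._cells[tag] = data

    def match(self, tag: int):
        # One "cycle": conceptually, every cell compares tag at once.
        return self._cells.get(tag)

def linked_list_search(nodes, tag):
    # The alternative the CAM removes: O(n) pointer chasing, with a
    # potential cache miss at every step.
    steps = 0
    for node_tag, data in nodes:
        steps += 1
        if node_tag == tag:
            return data, steps
    return None, steps
```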

Nix

Oct 3, 2005, 5:40:07 PM
On 3 Oct 2005, Giuseppe Bilotta prattled cheerily:

> I'd *love* a keyboard layout where * _ : ) are accessible without
> shift! Can you send me yours?

<http://www.maltron.co.uk/images/press/maltron-ergonomic-english-trackball-tq-hr1.jpg>

(downside: cost. Upside: feels lovely.)

--
`Next: FEMA neglects to take into account the possibility of
fire in Old Balsawood Town (currently in its fifth year of drought
and home of the General Grant Home for Compulsive Arsonists).'
--- James Nicoll

Alan Cox

Oct 3, 2005, 5:40:11 PM
On Llu, 2005-10-03 at 22:07 +0100, Luke Kenneth Casson Leighton wrote:
> made? _cool_. actual hardware. new knowledge for me. do you know
> of any online references, papers or stuff? [btw just to clarify:
> you're saying you have a NUMA bus or you're saying you have an
> augmented SMP+NUMA+separate-parallel-message-passing-bus er .. thing]

Its a standard current Intel feature. See "mwait" in the processor
manual. The CPUs are also smart enough to do cache to cache transfers.
No special hardware no magic.

And unless I want my messages to cause interrupts and wake events (in
which case the APIC does it nicely) then any locked operation on memory
will do the job just fine. I don't need funky hardware on a system. The
first point I need funky hardware is between boards and that isn't
consumer any more.

Alan
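Alan's point, sketched in software: nothing fancier than shared
memory plus a locked operation already makes a perfectly good message
bus. in the sketch below the condition variable stands in for the
locked memory operation and the wake event; it is an analogy, not a
model of mwait or the APIC:

```python
# Two threads passing messages through ordinary shared memory: a deque
# guarded by a condition variable. The "bus" is just memory plus a
# locked operation, as Alan describes.

import threading
from collections import deque

class MemoryBus:
    def __init__(self):
        self._queue = deque()
        self._cond = threading.Condition()

    def send(self, msg):
        with self._cond:
            self._queue.append(msg)
            self._cond.notify()          # the wake event

    def receive(self):
        with self._cond:
            while not self._queue:
                self._cond.wait()        # sleep until a message lands
            return self._queue.popleft()

def demo():
    bus = MemoryBus()
    received = []
    consumer = threading.Thread(target=lambda: received.append(bus.receive()))
    consumer.start()
    bus.send("hello")
    consumer.join()
    return received
```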

Jon Masters

Oct 3, 2005, 6:00:19 PM
On Mon, Oct 03, 2005 at 09:22:39PM +0100, Luke Kenneth Casson Leighton wrote:

> On Mon, Oct 03, 2005 at 03:20:46PM +0100, Jon Masters wrote:

> > On 10/2/05, Luke Kenneth Casson Leighton <lk...@lkcl.net> wrote:

> hellooo jon.
>
> > Haven't seen you since I believe you gave a somewhat interesting talk
> > on FUSE at an OxLUG a year or more back.
>
> good grief, that long ago?

Indeed. Time flies.

> > I don't think anyone here is
> > selectively deaf, but some might just ignore you for such comments :-)
>
> pardon? oh - yes, i'm counting on it, for a good signal/noise
> ratio. sad to recount, the strategy ain't workin too well,
> oh well. i've received about 10 hate mails so far, i _must_
> be doing _something_ right.

I hate to sound like "one of them" but nothing you've said is revolutionary -
really it's not - but you're saying it as if you just dreamed it up today and
/that's/ what got people annoyed. One of those pseudo-intellectual Starbucks
moments, if you will.

> > > the basic premise: 90 nanometres is basically... well...
> > > price/performance-wise, it's hit a brick wall at about 2.5Ghz, and
> > > both intel and amd know it: they just haven't told anyone.
> >
> > But you /know/ this because you're a microprocessor designer as well
> > as a contributor to the FUSE project?
>
> i have been speaking on a regular basis with someone who
> has been dealing for nearly twenty years now with processor
> designs (from a business perspective, for assessing high-tech
> companies for investment and recruitment purposes). i have
> been fortunate enough to have the benefit of their experience
> in assessing the viability of chip designs.

In other words, no. I'm not a processor designer either (very few people
are) but I do have a lot of experience with Xilinx FPGAs and I'll add
something of relevance to the end of this message. The point really is
that some people here know a great deal more than you do about this (and
I'm not saying I'm one of them), they're going to rightfully feel a
little annoyed if you start preaching to the choir.

> > The main issue (as I understand it)
> > is that SMT/SMP is taking off for many applications and manufacturers
> > want to cater for them while reducing heat output - so they care less
> > about MHz than about potential real world performance.
>
> pipelining. pipelining. latency between blocks.

Would you mind learning to use capitali[sz]ation in your mail? It's
really not very easy to read what you write. Was the above intended to
be an actual sentence (in which case I can't see any clauses in the
above) or just random words which - if said together - will summon some
mystical power upon us?

> > > so, what's the solution?
> >
> > > well.... it's to back to parallel processing techniques, of course.
> >
> > Yes. Wow! Of course! Whoda thunk it? I mean, parallel processing!
> > Let's get that right into the kern...oh wait, didn't Alan and a bunch
> > of others already do that years ago? Then again, we might have missed
> > all of the stuff which went into 2.2, 2.4 and then 2.6?
>
> jon, jon *sigh* :)

My point was that some much cleverer people have worked on this
for a very long time already. Alan did a lot of cool stuff in the
beginning, what Ingo does now just scares me, etc.

> i meant _hardware_ parallel processing

Old hat. It's been worked on for over a decade and some of it is working
out now in the form of concept processors that mix FPGAs with CPUs.

> - i wasn't referring to anything led or initiated by the linux kernel,
> but instead to the simple conclusion that if hardware is running out of
> steam in uniprocessor (monster-pipelined; awful-prediction;
> let's-put-five-separately-designed-algorithms-for-divide-into-the-chip-and-take-the-answer-of-the-first-unit-that-replies sort of design)
> then chip designers are forced to parallelise.

So simple in fact that it's already done in most common hardware. Why
else would we have any offload chips at all?

Please help me to understand the value of your original message. I've
read it a few times, but it continues to elude me.

Jon.

Sonny Rao

Oct 3, 2005, 8:00:14 PM
On Mon, Oct 03, 2005 at 10:12:26PM +0100, Luke Kenneth Casson Leighton wrote:
> On Mon, Oct 03, 2005 at 01:03:48AM -0400, Sonny Rao wrote:
> > Roll around on the floor while violently laughing for a while?
>
> _excellent_! can we watch? where's the mp4 :)

It'll be out there shortly... ;-P

But realistically, how can I take you seriously?

You come onto the Linux kernel mailing list, talk about hardware
implementations that are not generally known or available while
insulting the populace by claiming they would not understand
anyway, then insult the kernel developers by proclaiming their hard work
is "daft" and that the design is going to be "_really_ outdated,
really soon" without much explanation, go on to make wildly
generalized comments about processor frequency scaling with little
real insight or explanation of the issues, and apparently like making
long meandering posts even more difficult to read by not using capital
letters.

I also notice you suddenly decided to act morally superior to Martin
by taking "a leaf from the great rusty russell's book " and dealing
with "immature and out-of-line comments" in a "professional way."
Way to go...

Anyway, others have posted far more mature responses than my obvious
flamebait... apologies for the extra noise.

Sonny

Jason Stubbs

Oct 3, 2005, 9:40:10 PM
Luke Kenneth Casson Leighton wrote:
> halving the microns should quadruple the speed: the distance is halved
> so light has half the distance to travel and ... darn, can't remember
> the other reason for the other factor-of-two.

2 dimensions?

--
Jason Stubbs

Valdis.K...@vt.edu

Oct 4, 2005, 12:00:14 AM
On Mon, 03 Oct 2005 22:07:22 BST, Luke Kenneth Casson Leighton said:

> whereas, would you see it more reasonable for a commodity-level
> chip to be something like 32- or even 64- ultra-RISC cores of
> between 5000 and 10,000 gates each, resulting in a processor
> of about 50% cache memory and 50% processing plus associated
> parallel bus architecture at around 1 million gates?

Read your history - especially IBM's 801 chipset, which became the RT,
and why they then replaced that with the Power architecture...

> running at oh say 1ghz or with careful design effort focussed on the
> RISC cores maybe even 2ghz, resulting in 128 total GigaOps if you
> go for 64 cpus @ 2ghz. that's a friggin lot of processing power
> for a 1m gates processor!!

Good. Were you planning to run the uClinux branch on this, or include all
the pieces needed to support virtual memory? And do it inside that 10K gate
budget, too (hint - how many gates will you burn just doing a TLB big enough
to get the performance of mapping a virtual->real address to be good enough?)

You might want to read up on all the fun that IBM went through in designing
memory subsystems that can keep even the Power4 and Power5 chipsets fed too,
or the interesting stuff that SGI has to do to allow 64/128/512 processors
to beat up on a memory system - I'm sure there's some pretty high gate count
involved there..

If you're doing 64 10K-gate cores, you've blown 64% of your 1M gate budget.
You've got only 360K gates left to build cache *and* virtual memory support to
make that 1M gate budget. And yes, you need a cache, as IBM found out on
their RT processor.....

Martin Fouts

Oct 4, 2005, 12:20:13 AM

> -----Original Message-----
> From: linux-ker...@vger.kernel.org
> [mailto:linux-ker...@vger.kernel.org] On Behalf Of Luke
> Kenneth Casson Leighton
> Sent: Monday, October 03, 2005 1:31 PM
> To: Joe Bob Spamtest
> Cc: linux-kernel
> Subject: Re: what's next for the linux kernel?
>
> On Mon, Oct 03, 2005 at 12:36:20PM -0700, Joe Bob Spamtest wrote:
> > The point being: If and when the industry switches its focus to
> > highly parallel systems, Linux will shortly follow.
>
>

> personally i find that i like a bit of a run-up and/or
> advance notice of major paradigm shifts. on the basis that
> other people might also want to know, i initiated this
> discussion yesterday and it seems like forever already! :)

There's no real need to hurry. The industry isn't going to shift its
focus to highly parallel systems in the near future. Highly parallel
systems have been around since the 60s, (when 'highly parallel' meant a
lot less than it does today, but was still parallel compared to typical
von neumann machines of the period.)

It's been fifteen years since I last played with them, and there's not
much chance that they'll get interesting in the near future.

They'll never go completely away, because there'll always be niches
where they make sense, but they'll never break out into the mainstream,
because those niches tend to get smaller, not larger, over time.

Luke Kenneth Casson Leighton

Oct 4, 2005, 8:31:18 AM
On Tue, Oct 04, 2005 at 10:33:16AM +0900, Jason Stubbs wrote:
> Luke Kenneth Casson Leighton wrote:
> > halving the microns should quadruple the speed: the distance is halved
> > so light has half the distance to travel and ... darn, can't remember
> > the other reason for the other factor-of-two.
>
> 2 dimensions?

Voltage-squared. capacitance. when you go down the microns, your
capacitance drops and the voltage squared goes down, too.

65nm is 1.2v

45nm is aiming for 0.9 volts.

silicon germanium is going to hit a limit real soon.
you can't go below 0.8 volts, that's the gate "off" threshold.

l.

--
--
<a href="http://lkcl.net">http://lkcl.net</a>
--
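the factor being groped for above is the usual dynamic-power relation,
roughly P = C * V^2 * f: a process shrink cuts capacitance and allows
a lower supply voltage, and the voltage enters squared. a
back-of-envelope sketch; the capacitance figure is made up, and the
voltages just echo the round numbers used in this thread:

```python
# Dynamic switching power in CMOS: P ~ C * V^2 * f.
# The two factor-of-two gains from a shrink: capacitance C drops with
# feature size, and supply voltage V drops too - and V enters squared.
# Numbers below are illustrative, not datasheet values.

def dynamic_power(c_farads: float, v_volts: float, f_hertz: float) -> float:
    return c_farads * v_volts ** 2 * f_hertz

# Same 2 GHz clock, half the capacitance, 1.65 V dropped to 1.1 V:
old = dynamic_power(1.0e-9, 1.65, 2.0e9)
new = dynamic_power(0.5e-9, 1.10, 2.0e9)
print(f"power ratio after shrink: {new / old:.2f}")
```

note this only covers switching power; the leakage problem raised
later in the thread is a separate term that this relation ignores.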

Luke Kenneth Casson Leighton

Oct 4, 2005, 9:00:29 AM
> Hmm, so if we guess it might take 10 masks per processor type over its
> lifetime as they change features and such, that's still less than 1% of
> the cost of the FAB in the first place. I agree with the person that
> said intel/AMD/company probably don't care much, as long as their
> engineers make really darn sure that the mask is correct when they go to
> make one.

you elaborate, therefore, on my point.

anyone else, therefore, cannot hope to compete or even enter the
market, at 90nm.

which is why the first VIA eden processors maxed out at 800mhz (i'm
guessing they were a 0.13micron and therefore 2.5 volts)

> > ... why do you think intel is hyping support for and backing
> > hyperthreads support in XEN/Linux so much?
>
> Ehm, because intel has it and their P4 desperately needs help to gain
> any performance it can until they get the Pentium-M based desktop chips
> finished with multiple cores, and of course because AMD doesn't have it.
> Seem like good reasons for intel to try and push it.

you lend weight to my earlier points: the push is to
drive the engineers towards less gates on the excuse of
cart-before-horsing the market with their "performance / watt"
metrics, such that if 65nm comes off it's less painful
and not too much of a jump, and they aim for more parallel
processing (multiple cores).

current : 200 million gates with 90nm at 1.65 volt
estimated: 40 million gates with 65nm at 1.1 volt
estimated: 1 million gates with 45nm at 0.9 volt.

the "off" voltage of a silicon germanium transistor is 0.8 volts.

at 45nm the current leakage is so insane that the heat
dissipation, through the oxide layer which covers the chip,
ends up blowing the chip up.

trouble.

l.

Luke Kenneth Casson Leighton

Oct 4, 2005, 9:10:14 AM
On Sun, Oct 02, 2005 at 08:27:45PM -0500, Chase Venters wrote:

> The bottom line is that the application developers need to start being clever
> with threads.

yep! ah. but. see this:

http://lists.samba.org/archive/samba-technical/2004-December/038300.html

and think what would happen if glibc had hardware-support for
semaphores and mutexes.

> I think I remember some interesting rumors about Perl 6, for
> example, including 'autothreading' support - the idea that your optimizer
> could be smart enough to identify certain work that can go parallel.

http://www.ics.ele.tue.nl/~sander/publications.php
http://portal.acm.org/citation.cfm?id=582068
http://csdl.computer.org/comp/proceedings/acsd/2003/1887/00/18870237.pdf

to get the above references, put in "holland parallel code
analysis tools" into google.com.

put in "parallel code analysis tools" into google.com for a different
set.
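for concreteness, the shape of the hardware-assisted locking being
wished for here: one atomic operation in user space when the lock is
free, a kernel sleep only under contention. python has no raw
compare-and-swap, so the guard lock below merely models it; this is a
sketch of the idea, not of glibc or the futex syscall:

```python
# Futex-style lock: the uncontended acquire is a single atomic
# operation with no kernel entry; only a contended acquire blocks.
# The internal guard lock models the atomic compare-and-swap.

import threading

class FutexishLock:
    def __init__(self):
        self._state = 0                     # 0 = free, 1 = held
        self._guard = threading.Lock()      # models the atomic CAS
        self._cond = threading.Condition(self._guard)
        self.fast_acquires = 0              # CAS succeeded immediately
        self.slow_acquires = 0              # had to block ("syscall")

    def acquire(self):
        with self._guard:
            if self._state == 0:            # fast path: lock was free
                self._state = 1
                self.fast_acquires += 1
                return
            self.slow_acquires += 1         # slow path: sleep in "kernel"
            while self._state != 0:
                self._cond.wait()
            self._state = 1

    def release(self):
        with self._guard:
            self._state = 0
            self._cond.notify()
```

this is also why the rebuttal later in the thread has teeth: the
uncontended path is already one cheap atomic op, and the expensive
contended path ends in the scheduler whatever the hardware does.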

linux-os (Dick Johnson)

Oct 4, 2005, 9:20:20 AM

Since the voltage goes down as the speed increases, an
industry joke has been that at 0 volts you can get the
speed that you want. Unfortunately, the required infinite
current is a bit hard to manage. This brings up the
configuration change that has been in the works for
some time, current-mode logic. Just like TTL gave way
to ECL to obtain two orders of magnitude increase in
random-logic speed, there will be a similar increase once
current rather than voltage is used for logic levels.
And, it's not absolute current, either but delta-current
which will define a logic state.

The problems with reducing capacitance end up being exacerbated
when logic states are stored in the very capacitance that is
being reduced. Eventually there is little difference between
"logic" and "noise". This brings up the new science of
"statistical logic" which has yet to make its way into
microprocessors, but soon will once the quantization noise
becomes a factor.

Cheers,
Dick Johnson
Penguin : Linux version 2.6.13 on an i686 machine (5589.55 BogoMips).
Warning : 98.36% of all statistics are fiction.


Lennart Sorensen

Oct 4, 2005, 9:51:05 AM
On Tue, Oct 04, 2005 at 01:53:54PM +0100, Luke Kenneth Casson Leighton wrote:
> you elaborate, therefore, on my point.
>
> anyone else, therefore, cannot hope to compete or even enter the
> market, at 90nm.
>
> which is why the first VIA eden processors maxed out at 800mhz (i'm
> guessing they were a 0.13micron and therefore 2.5 volts)

Sticking with 0.13 also seems to make for a better embedded processor
since it seems when you go to 0.09 or smaller it becomes nearly
impossible to run at high and low temperatures which is what some
embedded applications want (like -40 to +85C). We had to change compact
flash suppliers when our supplier went to 90nm since they said their new
process couldn't work reliably at industrial temperature anymore. If
VIA wants the eden to run at a wide temperature range, it appears they
are better off sticking to a larger process and keeping the cpu speed
down to something reasonable. I imagine at some point someone will
manage to make a 90nm chip that does handle bigger temperature ranges
but I haven't seen one yet myself.

> you lend weight to my earlier points: the push is to
> drive the engineers towards less gates on the excuse of
> cart-before-horsing the market with their "performance / watt"
> metrics, such that if 65nm comes off it's less painful
> and not too much of a jump, and they aim for more parallel
> processing (multiple cores).

If that was the goal the x86 architecture should be dumped. It spends
too many gates converting x86 junk into something reasonable to execute.
People don't appear very eager to dump the x86 unfortunately. Something
to do with backwards compatibility and such.

> current : 200 million gates with 90nm at 1.65 volt
> estimated: 40 million gates with 65nm at 1.1 volt
> estimated: 1 million gates with 45nm at 0.9 volt.

A 1 million gate chip at 45nm would be rather tiny. The yield would
probably be amazingly good. Of course if it does give off a lot of heat
you still have the problem of how to get rid of the heat given it is
focused in a very very small space. Of course just reducing the size of
the cache on intel's chips to something sane would reduce the gate count
enormously. But that won't happen until they make a more efficient
chip.

> the "off" voltage of a silicon germanium transistor is 0.8 volts.
>
> at 45nm the current leakage is so insane that the heat
> dissipation, through the oxide layer which covers the chip,
> ends up blowing the chip up.

Unless someone finds a way to reduce the leakage. It is worth a lot of
money to some companies to solve that problem after all.

Len Sorensen

Andi Kleen

Oct 4, 2005, 10:10:14 AM
Alan Cox <al...@lxorguk.ukuu.org.uk> writes:

> On Llu, 2005-10-03 at 22:07 +0100, Luke Kenneth Casson Leighton wrote:
> > made? _cool_. actual hardware. new knowledge for me. do you know
> > of any online references, papers or stuff? [btw just to clarify:
> > you're saying you have a NUMA bus or you're saying you have an
> > augmented SMP+NUMA+separate-parallel-message-passing-bus er .. thing]
>
> Its a standard current Intel feature. See "mwait" in the processor
> manual. The CPUs are also smart enough to do cache to cache transfers.
> No special hardware no magic.

It's unfortunately useless for anything but kernels right now because
Intel has disabled it in ring 3 (and AMD doesn't support it yet)
And the only good use the kernel found for it so far is fast wakeup
from the idle loop.

> And unless I want my messages to cause interrupts and wake events (in
> which case the APIC does it nicely) then any locked operation on memory
> will do the job just fine. I don't need funky hardware on a system. The
> first point I need funky hardware is between boards and that isn't
> consumer any more.

Firewire + CLFLUSH should do the job.

-Andi

Tushar Adeshara

Oct 4, 2005, 11:10:07 AM
Hi all,
I am equally interested to see how all this will affect the embedded world.
While processors will have to go parallel, I agree, there will be more
than one processor (or maybe a uniprocessor) embedded in devices like
cellphones, MP3 players etc., as against the one on our desk and in
server rooms. Linux has to work on these devices too.


--
Regards,
Tushar
--------------------
It's not a problem, it's an opportunity for improvement. Lets improve.

Nikita Danilov

Oct 4, 2005, 11:10:19 AM
Luke Kenneth Casson Leighton writes:
> On Sun, Oct 02, 2005 at 08:27:45PM -0500, Chase Venters wrote:
>
> > The bottom line is that the application developers need to start being clever
> > with threads.
>
> yep! ah. but. see this:
>
> http://lists.samba.org/archive/samba-technical/2004-December/038300.html
>
> and think what would happen if glibc had hardware-support for
> semaphores and mutexes.

Let me guess... nothing? Overhead of locking depends on data-structures
used by application/library and their access patterns: one thread has to
wait for another to finish with the shared resource. Implementing
locking in hardware is going to change nothing here (barring really
stupid implementations of locking primitives). Especially as we are
talking about blocking primitives, like pthread semaphore or mutex: an
entry into the scheduler will by far outweigh any advantages of
raw-metal synchronization.

>
> > I think I remember some interesting rumors about Perl 6, for
> > example, including 'autothreading' support - the idea that your optimizer
> > could be smart enough to identify certain work that can go parallel.

Fortran people have been automatically parallelizing loops for a _long_ time.

PS: I wonder why Luke Kenneth Casson Leighton, Esq., while failing to
spell the Grandeur of his Appellative with the full Capitalization in
not a single From header humble readers of this Thread have a rare Honor
to witness, insists on referring to his interlocutors in minuscule only?

Does this correlate with an abnormally frequent usage of word
"condescending" in this discussion?

Nikita.
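Nikita's Fortran remark in miniature: an auto-parallelising compiler
rewrites a loop whose iterations are independent into a parallel map.
in the sketch below the transformation is written out by hand with a
thread pool, and `expensive` is a made-up stand-in workload:

```python
# The idea behind "autothreading": iterations that share no state can
# be farmed out to workers mechanically. Fortran compilers have done
# this transformation automatically for decades; here it is by hand.

from concurrent.futures import ThreadPoolExecutor

def expensive(x: int) -> int:
    return x * x          # stand-in for a real per-iteration workload

def serial_loop(data):
    return [expensive(x) for x in data]

def parallel_loop(data, workers: int = 4):
    # Legal only because iterations are independent - exactly the
    # property an auto-parallelising compiler must prove before it
    # may apply this rewrite.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(expensive, data))
```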

Luke Kenneth Casson Leighton

Oct 4, 2005, 12:00:53 PM
On Tue, Oct 04, 2005 at 07:04:35PM +0400, Nikita Danilov wrote:

dude.

chill.

out.

--
--
<a href="http://lkcl.net">http://lkcl.net</a>
--

Luke Kenneth Casson Leighton

Oct 4, 2005, 12:20:20 PM
On Tue, Oct 04, 2005 at 07:04:35PM +0400, Nikita Danilov wrote:
> Luke Kenneth Casson Leighton writes:
> > On Sun, Oct 02, 2005 at 08:27:45PM -0500, Chase Venters wrote:
> >
> > > The bottom line is that the application developers need to start being clever
> > > with threads.
> >
> > yep! ah. but. see this:
> >
> > http://lists.samba.org/archive/samba-technical/2004-December/038300.html
> >
> > and think what would happen if glibc had hardware-support for
> > semaphores and mutexes.
>
> Let me guess... nothing?

interesting.

> Overhead of locking depends on data-structures
> used by application/library and their access patterns: one thread has to
> wait for another to finish with the shared resource.

yes.

> locking in hardware is going to change nothing here (barring really
> stupid implementations of locking primitives). Especially as we are
> talking about blocking primitives, like pthread semaphore or mutex: an
> entry into the scheduler will by far outweigh any advantages of
> raw-metal synchronization.

so what would, in your opinion, be a good optimisation?

the references i found (just below) are to tool chains or research
projects for code or linker-level analysis and parallelisation tools.

what would, in your opinion, be a good way for hardware to assist
thread optimisation, at this level (glibc)?

assuming that you have an intelligent programmer (or some really good
and working parallelisation tools) who really knows his threads?
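for reference, the scheme NPTL already settled on - the futex - keeps the
uncontended path entirely in userspace: one CAS, no kernel entry. only
contention goes through the scheduler, which is exactly why the scheduler
entry, not the atomic op, dominates. a hedged sketch (my own, with
futex_wait()/futex_wake() as illustrative no-op stand-ins for the real
sys_futex calls):

```c
#include <stdatomic.h>

/* States of the sketched mutex word. */
enum { UNLOCKED = 0, LOCKED = 1, CONTENDED = 2 };

/* Stand-ins for the real sys_futex(FUTEX_WAIT/FUTEX_WAKE) calls;
 * no-ops here so the sketch stays self-contained. */
static void futex_wait(_Atomic int *addr, int val) { (void)addr; (void)val; }
static void futex_wake(_Atomic int *addr) { (void)addr; }

void mutex_lock(_Atomic int *m)
{
    int expected = UNLOCKED;

    /* fast path: one user-space CAS, no kernel entry at all */
    if (atomic_compare_exchange_strong(m, &expected, LOCKED))
        return;

    /* slow path: mark the lock contended and sleep in the kernel;
     * this is where the scheduler-entry cost lives */
    while (atomic_exchange(m, CONTENDED) != UNLOCKED)
        futex_wait(m, CONTENDED);
}

void mutex_unlock(_Atomic int *m)
{
    /* only enter the kernel if someone is actually asleep */
    if (atomic_exchange(m, UNLOCKED) == CONTENDED)
        futex_wake(m);
}
```

since the fast path never leaves userspace, "hardware support for
semaphores and mutexes" can only ever shave the cheap half.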

> > http://www.ics.ele.tue.nl/~sander/publications.php
> > http://portal.acm.org/citation.cfm?id=582068
> > http://csdl.computer.org/comp/proceedings/acsd/2003/1887/00/18870237.pdf
> >
> > to get the above references, put in "holland parallel code
> > analysis tools" into google.com.
>
> PS: I wonder why Luke Kenneth Casson Leighton, Esq., while failing to

can i invite you, when replying to these lists, instead of treating
them as a location where you can piss over anyone that you do not
believe to be in any way your equal or in fact the equal of anyone,
to consider the following template for your replies:

okay, right.
* i do/don't get what this guy is saying.
* i do/don't have an alternative idea (here it is / sorry)
* here's what's wrong / right with what he's saying.
* here's where it can/can't be done better.

the bits that are missing from your reply are:

* "you do/don't get where i'm going with this"
* you haven't specified an alternative idea
* you've outlined what's wrong but not what's right
* you haven't specified how it can be done better.

i therefore conclude that you are a bully. a snob.

i _really_ detest bullying - and that's what you are doing.

intellectual bullying.

stop it.

so i sent some messages saying "i think the kernel developers could be
wrong in their design strategy" so FRIGGIN what?

prove me right or prove me wrong.

or shut up, or add my email address to your killfile.

_don't_ be an intellectual snob.

l.

Gene Heskett

Oct 4, 2005, 12:30:22 PM
to
On Tuesday 04 October 2005 08:53, Luke Kenneth Casson Leighton wrote:
[...]

> the "off" voltage of a silicon germanium transistor is 0.8 volts.

Isn't that a bit contradictory, Luke? Or is there a new process that
mixes the two basic materials for making semiconductors from?

--
Cheers, Gene
"There are four boxes to be used in defense of liberty:
soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
99.35% setiathome rank, not too shabby for a WV hillbilly
Yahoo.com and AOL/TW attorneys please note, additions to the above
message by Gene Heskett are:
Copyright 2005 by Maurice Eugene Heskett, all rights reserved.

Nikita Danilov

Oct 4, 2005, 1:20:11 PM
to
Luke Kenneth Casson Leighton writes:

[...]

>
> assuming that you have an intelligent programmer (or some really good
> and working parallelisation tools) who really knows his threads?

Well, I'd like to have a hardware with CAS-n operation for one
thing. But what would this buy us? Having different kernel algorithms
for x86 and mythical cas-n-able hardware is not viable.

[...]

>
> can i invite you to consider, when replying to these lists, to consider
> instead of treating it as a location where you can piss over anyone
> that you do not believe to be in any way your equal or in fact the equal
> of anyone,

As this self-contradictory (unless equivalence is not reflexive in your
world) description is not the way I treat "these lists", I'll --instead
of pointing out to errors in your proposal-- follow (with improvements)
your advice, and re-interpret it the way I want:

> * i do/don't get what this guy is saying.

Indeed. But maybe this is at least partially because "this guy" fails to
abide by the elementary rules of grammar?

One who really wants to communicate with other people follows the common
conventions that exist specifically for the purpose of making communication
possible. If you want to engage into a discussion with other people, you
should put aside considerations of your own convenience or habit, and
stick to the common protocol. It's that simple.

[...]

>
> so i sent some messages saying "i think the kernel developers could be
> wrong in their design strategy" so FRIGGIN what?

Pray, do tell me you typed this without a shift key. I'd very much
like to not destroy the very essence of 20 years of accumulated
experience.

>
> or shut up, or add my email address to your killfile.

Please do this with my address in exchange: suddenly, I am
overwhelmed with a glimmering presentiment of sharing this location with
a lot of worthy people.

Nikita.

Bill Davidsen

Oct 4, 2005, 1:20:18 PM
to

That's one way, the other is to find a way to cool such a chip. I see
references to diamond substrate from time to time, good thermal
conductor. So are other carbon forms, fullerenes, etc.

Clearly reducing leakage is the optimal solution, "deal with the heat"
is the other.

Luke Kenneth Casson Leighton

Oct 4, 2005, 1:30:31 PM
to
On Tue, Oct 04, 2005 at 09:15:57PM +0400, Nikita Danilov wrote:
> Luke Kenneth Casson Leighton writes:
>
> [...]
>
> >
> > assuming that you have an intelligent programmer (or some really good
> > and working parallelisation tools) who really knows his threads?
>
> Well, I'd like to have a hardware with CAS-n operation for one
> thing.

CAS - compare and swap - by CAS-n i presume that you mean effectively a
SIMD CAS instruction?

> But what would this buy us?

you do not say :) i am genuinely interested to hear what it would buy.

> Having different kernel algorithms
> for x86 and mythical cas-n-able hardware is not viable.

if i can get an NPTL .deb package for glibc for x86 only it would tend
to imply that that isn't a valid conclusion: am i missing something?

cheers,

l.

Rik van Riel

Oct 4, 2005, 1:40:07 PM
to
On Tue, 4 Oct 2005, Luke Kenneth Casson Leighton wrote:

> okay, right.
> * i do/don't get what this guy is saying.
> * i do/don't have an alternative idea (here it is / sorry)
> * here's what's wrong / right with what he's saying.
> * here's where it can/can't be done better.

It would help if you added something from the third and
fourth bullet points to your posts, instead of sticking
with just the first two.

--
All Rights Reversed

Nikita Danilov

Oct 4, 2005, 1:50:15 PM
to
Luke Kenneth Casson Leighton writes:
> On Tue, Oct 04, 2005 at 09:15:57PM +0400, Nikita Danilov wrote:
> > Luke Kenneth Casson Leighton writes:
> >
> > [...]
> >
> > >
> > > assuming that you have an intelligent programmer (or some really good
> > > and working parallelisation tools) who really knows his threads?
> >
> > Well, I'd like to have a hardware with CAS-n operation for one
> > thing.
>
> CAS - compare and swap - by CAS-n i presume that you mean effectively a
> SIMD CAS instruction?

An instruction that atomically compares and swaps n independent memory
locations with n given values. cas-1 (traditional compare-and-swap) is
enough to implement a lock-less queue; cas-2 is enough to implement
doubly-linked lists, and was used by the Synthesis lock-free kernel
(http://citeseer.ist.psu.edu/massalin91lockfree.html).

To be precise, cas-1 is theoretically enough to implement doubly-linked
lists too, but the resulting algorithms are not pretty at all.
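As a concrete illustration of the cas-1 case (a sketch of my own, not from
the Synthesis paper): a Treiber stack needs nothing beyond a single-word
compare-and-swap, written here with C11 atomics standing in for the
hardware primitive. A production version must also handle the ABA problem
and safe memory reclamation, which this sketch deliberately ignores.

```c
#include <stdatomic.h>
#include <stddef.h>

struct node {
    struct node *next;
    int value;
};

/* the shared head pointer is only ever updated through CAS */
static _Atomic(struct node *) head;

void push(struct node *n)
{
    struct node *old = atomic_load(&head);
    do {
        n->next = old;      /* old is refreshed by each failed CAS */
    } while (!atomic_compare_exchange_weak(&head, &old, n));
}

struct node *pop(void)
{
    struct node *old = atomic_load(&head);
    while (old != NULL &&
           !atomic_compare_exchange_weak(&head, &old, old->next))
        ;                   /* another thread won the race; retry */
    return old;             /* NULL when the stack is empty */
}
```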

>
> > But what would this buy us?
>
> you do not say :) i am genuinely interested to hear what it would buy.

Nothing. That was an instance of a "rhetorical question"; sorry that I
did not make this clear enough.

>
> > Having different kernel algorithms
> > for x86 and mythical cas-n-able hardware is not viable.
>
> if i can get an NPTL .deb package for glibc for x86 only it would tend
> to imply that that isn't a valid conclusion: am i missing something?

Yes: this is Linux _Kernel_ mailing list, and I was talking about kernel
code and kernel algorithms.

>
> cheers,
>
> l.
>

Nikita.

Marc Perkel

Oct 4, 2005, 3:50:13 PM
to
I think it's time for some innovative thinking and for people to step
outside the Linux box and look around at other operating systems for
some good ideas. I'll run through a few ideas here.

Reiser 4 - The idea of building a file system on top of a database is
the right way to go. Reiser is onto something here and this is a
technology that needs to be built upon. Its current condition is a
little on the weak side - no ACLs, for example - but the underlying
concept is sound.

Novell Netware type permissions. ACLs are a step in the right direction,
but Linux isn't anywhere near where Novell was back in 1990. Linux lets
you - for example - delete files that you have no read or write access
rights to. Netware, on the other hand, prevents you from deleting files
that you can't write to, and if you have no rights it is as if the file
isn't there. You can't even see it in the directory. Netware also has
inherited permissions, as Windows and Samba do, and this is doing it
right. File systems and individual directories should be able to be
flagged as case-sensitive/insensitive. Permissions need to be
fine-grained and easy to use. Netware is a good example to emulate.

The bootup sequence of Linux is pathetic. What an ungodly mess. The
FSTAB file needs to go and a smarter system needs to be developed. I
know this isn't entirely a kernel issue but it is somewhat related.

I think development needs to be done to make the kernel cleaner and
smarter rather than just bigger and faster. It's time to look at what
users need and try to make Linux somewhat more Windows-like in being
able to smartly recover from problems. Perhaps better error messages
than your traditional kernel panic or hex-dump screen of death.

The big challenge for Linux is to be able to put it in the hands of
people who don't want to dedicate their entire life to understanding all
the little quirks that we have become used to. The slogan should be
"this just works" and is intuitive.

Anyhow - before I piss off too many people who are religiously attached
to Linux worshiping - I'll quit now. ;)

Marc Perkel
Linux Visionary

Luke Kenneth Casson Leighton

Oct 4, 2005, 5:20:11 PM
to
On Tue, Oct 04, 2005 at 12:47:25PM -0700, Marc Perkel wrote:

> The bootup sequence of Linux is pathetic. What an ungodly mess. The
> FSTAB file needs to go and a smarter system needs to be developed. I
> know this isn't entirely a kernel issue but it is somewhat related.

depinit. written by richard lightman. easily located with google.

on relatively inexpensive amd 2100 hardware, depinit results
in a startup time to console login of 5 seconds, and x-windows
in a further 3.


this is probably as good a time as any to mention this:

depinit on a 2.6 kernel has had to have a small script added which does
a sleep 3; kill -HUP <itself> - i.e. "kill -HUP 1".

if this is not done, then any signal that a child program sends to
process 1 is NOT SEEN.

richard believes the problem to be actually in the 2.6 kernel.

whilst /sbin/init only catches one signal, depinit catches quite
literally _all_ of them.

i'm relaying this from memory, so some of the above may be inaccurate.


> I think development needs to be done to make the kernel cleaner and
> smarter rather than just bigger and faster.

actually, on embedded systems the linux 2.6 kernel is bigger
and slower, which has prompted a large number of embedded systems
designers to stick with the [by now abandoned] 2.4 series.

> Marc Perkel
> Linux Visionary
^^^^^ ^^^^^^^^^
wha-heeeeey !


my main concern, btw, is that by the time linux kernel developers
"receive hardware to play with", it's already too late.

the hardware decisions have already been made.

you - worthy as you are and the work you are doing is -
are treated as second class citizens by the companies
manufacturing hardware.

time to put the horse before the cart.

l.

--
--
<a href="http://lkcl.net">http://lkcl.net</a>
--

Bodo Eggert

Oct 4, 2005, 6:10:08 PM
to
Marc Perkel <ma...@perkel.com> wrote:

[...]


> I'll run through a few ideas here.

> Novell Netware type permissions. ACLs are a step in the right direction


> but Linux isn't any where near where Novell was back in 1990. Linux lets
> you - for example - to delete files that you have no read or write
> access rights to.

It lets you unlink them. That's different from deleting, since the owner
may have his/her private link to that file.

Unlinking is changing the contents of a directory, and it's controlled by
the write permission of the containing directory.
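This is easy to demonstrate from C (a sketch assuming a POSIX system and
a non-root user; the scratch file name is arbitrary):

```c
#include <fcntl.h>
#include <unistd.h>

/* Create a file we can neither read nor write (mode 0000), then
 * remove it anyway: the unlink() permission check is against the
 * containing directory, not the file. Returns 0 on success. */
int unlink_demo(const char *path)
{
    unlink(path);                   /* ignore leftovers from earlier runs */

    int fd = open(path, O_WRONLY | O_CREAT | O_EXCL, 0000);
    if (fd < 0)
        return -1;
    close(fd);

    /* we hold no read or write permission on the file itself,
     * yet removing its name needs only write access to the directory */
    return unlink(path);
}
```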

> Netware on the other hand prevents you from deleting
> files that you can't write to and if you have no right it is as if the
> file isn't there.

Imagine a /tmp directory (writable by world) with user "a" creating a file
"foo", umask=077 and user "b" trying to do the same. User "b" will get
'file exists' if he tries to create it, and 'file does not exist' if he
tries to list it. He will go nuts.

BTW: YANI: What about a tmpfs where all-numerical entries can only be
created by the corresponding UID? This would provide a secure, private
tmp directory to each user without the possibility of races and
denial-of-service attacks. Maybe it should be controlled by a mount flag.

> You can't even see it in the directory. Netware also
> has inherited permissions like Windows and Samba has and this is doing
> it right.

You can't do that if you have hardlinks. However, I missed the feature of
overruling file permissions in some special directories, e.g. anything
put under /pub should ignore umask and be a+rX.

> File systems and individual directories should be able to be
> flagged as casesensitive/insensitive.

IMHO not needed.

> Permissions need to be fine
> grained and easy to use. Netware is a good example to emulate.

ACK.

> The bootup sequence of Linux is pathetic. What an ungodly mess.

Which one? The bsd-style, sysv-suse-style, the sysv-debian-style,
the sysv-gentoo-style, the supervise-style, ...?

> The
> FSTAB file needs to go and a smarter system needs to be developed.

Smarter than recognizing the partitions by GUID?

> I
> know this isn't entirely a kernel issue but it is somewhat related.
>

> I think development needs to be done to make the kernel cleaner and

> smarter rather than just bigger and faster. It's time to look at what
> users need and try to make Linux somewhat more windows like in being
> able to smartly recover from problems.

Using "windows" and "smartly recover from problems" in the same sentence
is strange.

> Perhaps better error messages

And it becomes even more strange. Decent error messages from windows are
as common as snowballs in hell.

> that your traditional kernel panic or hex dump screen of death.

«Some error occurred. Press "OK".»

And if there is a help button, it won't help.

> The big challenge for Linux is to be able to put it in the hands of
> people who don't want to dedicate their entire life to understanding all
> the little quirks that we have become used to. The slogan should be
> "this just works" and is intuitive.

"Just working" isn't easy if you have zillions of dependencies and even
more possibilities to choose from. You can e.g. make linux use a raid0
of a network block device and a loop-mounted file accessed over a ssh
session as its root device (just in case you got bored of drilling
holes into your knees, pouring milk into them and raising fish in them.)

--
I thank GMX for sabotaging the use of my addresses by means of lies
spread via SPF.

Chase Venters

Oct 4, 2005, 7:50:10 PM
to
On Tuesday 04 October 2005 02:47 pm, Marc Perkel wrote:
> The bootup sequence of Linux is pathetic. What an ungodly mess. The
> FSTAB file needs to go and a smarter system needs to be developed. I
> know this isn't entirely a kernel issue but it is somewhat related.
>
> I think development needs to be done to make the kernel cleaner and
> smarter rather than just bigger and faster. It's time to look at what
> users need and try to make Linux somewhat more windows like in being
> able to smartly recover from problems. Perhaps better error messages
> that your traditional kernel panic or hex dump screen of death.
>
> The big challenge for Linux is to be able to put it in the hands of
> people who don't want to dedicate their entire life to understanding all
> the little quirks that we have become used to. The slogan should be
> "this just works" and is intuitive.

I agree with the basic sentiment here.

fstab is pretty traditional - and you're right, it really isn't a kernel thing
per se. Let's not forget the value of simplicity though. fstab works and does
its job well. I do think it would be interesting for a distribution to
experiment with different approaches, though -- something with the emerging
hardware abstraction layer perhaps.

As for error messages... the equivalent of the Linux kernel panic is basically
the Windows BSOD. Neither one of them should appear in the day to day use of
the system as they indicate bugs. Linux is actually the clear winner here, I
think, because a Windows BSOD gives you a single hex code and no indication
of what happened, except for very vague codes like
"PAGE_FAULT_IN_NON_PAGED_AREA". I'd much rather have a backtrace :) In any
case, I'm watching the work on kdump with a keen interest.

Really, I think the whole issue of usability isn't tied directly to the
kernel. The kernel has been making leaps and bounds in making this easy for
userspace to deal with (where the approaches to solve the problem belong).
Sysfs is an obvious great example.

Work on dbus and HAL should give us good improvements in these areas. One
remaining challenge I see is system configuration - each daemon tends to
adopt its own syntax for configuration, which means that providing a GUI for
novice users to manage these systems means attacking each problem separately
and in full. Now I certainly wouldn't advocate a Windows-style registry,
because I think it's full of obvious problems. Nevertheless, it would be nice
to have some kind of configuration editor abstraction library that had some
sort of syntax definition database to allow for some interesting work on
GUIs.

In any case, I think pretty much all of this work lives outside the kernel.
There is one side note I'd make about booting - my own boot process has to
wait forever for my Adaptec SCSI controller to wake up. It would be
interesting if bootup initialization tasks could be organized into dependency
levels and run in parallel, though as I'm a beginner to the workings of the
kernel I'm not entirely sure how possible this would be.

Cheers,
Chase Venters

D. Hazelton

Oct 5, 2005, 1:20:07 AM
to
On Tuesday 04 October 2005 19:47, Marc Perkel wrote:
> I think it's time for some innovative thinking and for people to
> step outside the Linux box and look around at other operating
> systems for some good ideas. I'll run through a few ideas here.
>
> Reiser 4 - The idea of building a file system on top of a database
> is the right way to go. Reiser is onto something here and this is a
> technology that needs to be built upon. Its current condition is a
> little on the weak side - no ACLs for example - but the underlying
> concept is sound.

A filesystem built on top of a database? Isn't that introducing
complexity into something that should be as simple as possible so
that the number of possible errors is reduced?

But then, I do not have experience with filesystem design and
implementation, so I cannot make a suggestion on this front.

> Novell Netware type permissions. ACLs are a step in the right
> direction but Linux isn't any where near where Novell was back in
> 1990. Linux lets you - for example - to delete files that you have
> no read or write access rights to.

As someone else pointed out, this is because unlinking is related to
your access permissions on the parent directory and not the file.

> Netware on the other hand
> prevents you from deleting files that you can't write to and if you
> have no right it is as if the file isn't there. You can't even see
> it in the directory.


This is just adding a layer of security through hiding data. I
personally don't like this idea for a number of reasons, not the
least of which is that it is the access permissions of the directory
that control whether or not you can see a file. This and the previous
comment about unlinking of files is, IIRC, actually part of the POSIX
standard.

> Netware also has inherited permissions like
> Windows and Samba has and this is doing it right.

This would only be a bonus with ACLs. It makes no real sense to
propagate a directory's permissions down to the files in it,
since a directory whose contents you can list has at least one
execute bit set.

> File systems and
> individual directories should be able to be flagged as
> casesensitive/insensitive.

This would be a rarely used feature and would break many tools. Having
an extra bit would also require modifying the kernel and I doubt
anyone wants to tackle such a job, as it would also break all extant
filesystems.

> Permissions need to be fine grained and
> easy to use. Netware is a good example to emulate.

I do agree with this, but have to point out that this is already
allowed for under POSIX and a number of filesystems support this.
It's called "POSIX ACLs" in the kernel configuration system. I
don't use them on my home system (I see no need, since it's just me
and whatever hacker has managed to penetrate the system - to date: 0),
but I do use ACLs (POSIX and otherwise) on all the systems I maintain
for my various clients (providing the OS supports them).

> The bootup sequence of Linux is pathetic. What an ungodly mess. The
> FSTAB file needs to go and a smarter system needs to be developed.
> I know this isn't entirely a kernel issue but it is somewhat
> related.

I'll have to disagree about FSTAB - this is something that is at the
peak of its usefulness and changing or removing it would require the
people that maintain the core utilities to rewrite mount(8) almost
entirely.

However, when it comes to the boot sequence as controlled by init(8) I
have to agree. I'm personally working on an entirely new set of
init-scripts for my system and have thought about seeing if anyone
has ever released an init(8) that is more functional than the basic
GNU/FSF version. If there was an init(8) that could run the scripts
in parallel I'd be using it as soon as I had a set of scripts lined
up that were designed to be run in parallel.

> I think development needs to be done to make the kernel cleaner and
> smarter rather than just bigger and faster. It's time to look at
> what users need and try to make Linux somewhat more windows like in
> being able to smartly recover from problems. Perhaps better error
> messages that your traditional kernel panic or hex dump screen of
> death.

lol. Nope. The kernel panic could be refined to contain even more
information and be even more user-friendly, but it is definitely
light-years ahead of a Windows BSOD. Now if you're talking about the
errors as seen by users of applications, that's not a kernel issue and
is the purview of the application developers.

> The big challenge for Linux is to be able to put it in the hands of
> people who don't want to dedicate their entire life to
> understanding all the little quirks that we have become used to.
> The slogan should be "this just works" and is intuitive.

Yep. I do agree with that. However, until the rest of the big
companies catch up to the ones that already support Linux this will
never happen. Several of my non-business maintenance clients have
inquired abotu Linux and I've had to tell them to just stick with
Windows because they rely on a rather large number of Windows only
programs (that do _not_ run under Wine) and/or are not technically
enough inclined to be able to handle the learning curve involved in
moving to a non-MS operating system.

The fact that I have gotten those inquiries means that the news about
how stable Linux is is getting to the "mainstream" population. Only
problem is that MS and the home-PC boom has landed the PC and the
internet in the hands of too many people who are barely computer
literate enough to use a mouse. (I'm speaking from experience. A
large number of my private clients fit this description to a T.
Although they are good as a continuing source of income :) And with
Windows being as "User Friendly" as it is, and with at least 75% of
the major software firms still not supporting Linux, there is no way
Linux can make any real inroads into the desktop market.

OTOH several of my clients have inquired about Linux because of its
security - it doesn't take a genius to see that the same reason
Firefox made such a big dent in MS's hold on the browser market could
work for Linux as well. And I have had more than one client ask if
there was an alternative to Windows that ran well - I've always
answered the same way: "Yes, but you would need to learn an all-new
way of doing things." Every last one of them dropped it on hearing
those words.

> Anyhow - before I piss off too many people who are religiously
> attached to Linux worshiping - I'll quit not. ;)

heh. keep it up. you've managed to turn a pointless argument of micro
versus mono into something productive. (even if you didn't mean to)

DRH


Valdis.K...@vt.edu

Oct 5, 2005, 1:40:07 AM
to
On Tue, 04 Oct 2005 18:40:33 CDT, Chase Venters said:
> Work on dbus and HAL should give us good improvements in these areas. One
> remaining challenge I see is system configuration - each daemon tends to
> adopt its own syntax for configuration, which means that providing a GUI for
> novice users to manage these systems means attacking each problem separately
> and in full. Now I certainly wouldn't advocate a Windows-style registry,
> because I think it's full of obvious problems. Nevertheless, it would be nice
> to have some kind of configuration editor abstraction library that had some
> sort of syntax definition database to allow for some interesting work on
> GUIs.

Anybody who tries to do this without at least understanding the design choices
made by AIX's SMIT tool deserves to re-invent it, poorly.

> In any case, I think pretty much all of this work lives outside the kernel.

Amen to that - although the whole hotplug/udev/sysfs aggregation has at least
made a semi-sane way to find out from userspace what the kernel thinks is going on...

Are there any drivers out there that don't play nice with sysfs? If so, should
a mention of them be added to http://kerneljanitors.org/TODO ?

Marc Perkel

Oct 5, 2005, 1:50:07 AM
to

D. Hazelton wrote:

>
>
>>Novell Netware type permissions. ACLs are a step in the right
>>direction but Linux isn't any where near where Novell was back in
>>1990. Linux lets you - for example - to delete files that you have
>>no read or write access rights to.
>>
>>
>
>As someone else pointed out, this is because unlinking is related to
>your access permissions on the parent directory and not the file.
>
>
>

Right - that's Unix "inside the box" thinking. The idea is to make the
operating system smarter so that the user doesn't have to deal with
what's computer friendly - but rather what makes sense to the user.
From a user's perspective, if you have no rights to access a file then
why should you be allowed to delete it?

Now - the idea is to create choice. If you need to emulate Unix behavior
for compatibility that's fine. But I would migrate away from that into a
permissions paradigm that worked like Netware.

I started with Netware and I'm spoiled. They had it right 15 years ago
and Linux isn't anywhere near what I had with Netware and DOS in 1990.
Once you've had this kind of permission power Linux is a real big step down.

So - the thread is about the future so I say - time to fix Unix.

Valdis.K...@vt.edu

Oct 5, 2005, 2:10:05 AM
to
On Tue, 04 Oct 2005 22:49:03 PDT, Marc Perkel said:
> From a user's perspective, if you have no rights to access a file then
> why should you be allowed to delete it?

Because it's your directory, dammit, and nobody else should be allowed to
clutter it with files you can't even read. :)

What's so hard to understand about that viewpoint? Want to try to explain
the converse to a Windows/Netware user? "But it's *MY* folder, why can't I
get rid of this file I can't read and have no use for?"

Trying to make it "make sense to the user" without expecting the user to learn
at least a bit about what's going on is futile, as Alan Perlis understood:

When someone says "I want a programming language in which I need only
say what I wish done," give him a lollipop.

Steven Rostedt

Oct 5, 2005, 3:00:16 AM
to

On Tue, 4 Oct 2005, Chase Venters wrote:

> As for error messages... the equivalent of the Linux kernel panic is basically
> the Windows BSOD. Neither one of them should appear in the day to day use of
> the system as they indicate bugs. Linux is actually the clear winner here, I
> think, because a Windows BSOD gives you a single hex code and no indication
> of what happened, except for very vague codes like
> "PAGE_FAULT_IN_NON_PAGED_AREA". I'd much rather have a backtrace :) In any
> case, I'm watching the work on kdump with a keen interest.
>

And what about kexec? To be able to boot into another kernel on a kernel
bug and still have access to all the memory and the system state of the
bug. That's pretty cool. It would be like Windows going straight to
Safe Mode on a BSOD without a reboot.

> In any case, I think pretty much all of this work lives outside the kernel.
> There is one side note I'd make about booting - my own boot process has to
> wait forever for my Adaptec SCSI controller to wake up. It would be
> interesting if bootup initialization tasks could be organized into dependency
> levels and run in parallel, though as I'm a beginner to the workings of the
> kernel I'm not entirely sure how possible this would be.
>

I've been thinking of at least trying to see what would happen if I
threaded the do_initcalls in main.c but lately I haven't had the time.
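The level-by-level idea can be sketched in user-space terms (illustrative
only - the names, the initcall signature and the demo initcalls here are
made up for the sketch, not the kernel's):

```c
#include <pthread.h>
#include <stdatomic.h>

/* Run all the "initcalls" of one dependency level concurrently,
 * then join them all before the next level may start. */
typedef void *(*initcall_t)(void *);

static _Atomic int initialised;     /* how many demo initcalls have run */

/* two stand-in initcalls that could safely run in parallel */
static void *init_scsi(void *unused)  { (void)unused; initialised++; return NULL; }
static void *init_ether(void *unused) { (void)unused; initialised++; return NULL; }

static void run_level(initcall_t *calls, int n)
{
    pthread_t tid[n];

    for (int i = 0; i < n; i++)
        pthread_create(&tid[i], NULL, calls[i], NULL);
    for (int i = 0; i < n; i++)     /* joins act as the inter-level barrier */
        pthread_join(tid[i], NULL);
}
```

A slow driver (like the Adaptec probe mentioned above) would then only
delay the initcalls in later levels that actually depend on it.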

-- Steve

Nikita Danilov

Oct 5, 2005, 5:31:09 AM
to
Marc Perkel writes:

[...]

> Right - that's Unix "inside the box" thinking. The idea is to make the
> operating system smarter so that the user doesn't have to deal with
> what's computer friendly - but rather what makes sense to the user.
> From a user's perspective, if you have no rights to access a file then
> why should you be allowed to delete it?

Because in Unix a name is not an attribute of a file.

Files are objects that you read, write and truncate. They are
represented by inodes.

Separately from that, there is an indexing structure: directory
tree. Directories map symbolic names to inodes. Obviously, adding a
reference to an index, or removing it from one, requires access
permission to the _index_ rather than to the object being referenced.

That two-level model of files and indexing on top of them is essential
to Unix due to the flexibility and conceptual economy it provides.
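The two-level model is directly observable from userspace; a small
sketch (the scratch file names are arbitrary):

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Give one inode a second name with link(), check that both names
 * resolve to the same inode, then drop the first name: the file
 * lives on under the second. Returns the link count seen through
 * the surviving name, or -1 on error. */
long two_names_one_inode(void)
{
    struct stat s1, s2;

    unlink("name-a.tmp");           /* ignore leftovers from earlier runs */
    unlink("name-b.tmp");

    int fd = open("name-a.tmp", O_WRONLY | O_CREAT | O_EXCL, 0644);
    if (fd < 0)
        return -1;
    close(fd);

    if (link("name-a.tmp", "name-b.tmp") != 0)   /* second name, same inode */
        return -1;
    if (stat("name-a.tmp", &s1) != 0 || stat("name-b.tmp", &s2) != 0)
        return -1;
    if (s1.st_ino != s2.st_ino)     /* both directory entries index one object */
        return -1;

    unlink("name-a.tmp");           /* removes a name, not the file */
    if (stat("name-b.tmp", &s2) != 0)
        return -1;

    long links = (long)s2.st_nlink; /* the file survived losing a name */
    unlink("name-b.tmp");           /* clean up */
    return links;
}
```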

>
> Now - the idea is to create choice. If you need to emulate Unix nehavior
> for compatibility that's fine. But I would migrate away from that into a
> permissions paradygme that worked like Netware.

And there are people believing that ITS (or VMS, or <insert your first
passion here>...) set the standard to follow. :-)

[...]

>
> So - the thread is about the future so I say - time to fix Unix.

One thing is clear: it's too late to fix Netware. Why should Unix
emulate its lethal defects?

Nikita.

Luke Kenneth Casson Leighton

Oct 5, 2005, 6:00:33 AM
to
On Wed, Oct 05, 2005 at 01:24:12PM +0400, Nikita Danilov wrote:

> Marc Perkel writes:
>
> [...]
>
> > Right - that's Unix "inside the box" thinking. The idea is to make the
> > operating system smarter so that the user doesn't have to deal with
> > what's computer friendly - but rather what makes sense to the user.
> > From a user's perspective, if you have no rights to access a file then
> > why should you be allowed to delete it?
>
> Because in Unix a name is not an attribute of a file.

there is no excuse.

selinux has already provided an alternative that is similar to NW
file permissions.

so there is no excuse.

think about this.

in what way is it possible for linux to fully support the NTFS
filesystem?

bearing in mind that you will need to communicate a little bit more
than (but including the direct equivalent of) unix uid, gid and
secondary groups from userspace into kernelspace - just like reading
/etc/passwd, but reading from a userspace daemon instead.

l.

Luke Kenneth Casson Leighton

unread,
Oct 5, 2005, 6:10:16 AM10/5/05
to
> > In any case, I think pretty much all of this work lives outside the kernel.
> > There is one side note I'd make about booting - my own boot process has to
> > wait forever for my Adaptec SCSI controller to wake up. It would be
> > interesting if bootup initialization tasks could be organized into dependency
> > levels and run in parallel, though as I'm a beginner to the workings of the
> > kernel I'm not entirely sure how possible this would be.

again - depinit.

the only thing that richard lightman didn't spend time on was the
splitting out of hotplug / hotplug scripts such that depinit could be
told "okay the ethernet's up now, all eth0 dependencies can now proceed"
or "okay the scsi controller is up now, all fstab entries depending
on this controller can now proceed".

richard's example scripts split out "var", "usr", "home" and "boot" as
separate dependencies (actually they're symlinks to the "mount"
pseudo-dependency); pretty much all services that need to do syslogging
depend on the "var" dependency - you get the idea.
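the dependency-ordered startup that depinit does can be sketched with a
topological sort - the service names and dependency table below are
invented for illustration and are not depinit's actual syntax:

```python
from graphlib import TopologicalSorter  # stdlib, Python 3.9+

# Hypothetical dependency table in the spirit of the depinit examples:
# each service lists the pseudo-dependencies it needs before starting.
deps = {
    "mount":   set(),
    "var":     {"mount"},            # the symlink-style alias described above
    "usr":     {"mount"},
    "syslogd": {"var"},              # anything that logs needs /var first
    "sshd":    {"usr", "syslogd"},
    "eth0":    set(),
    "nfs":     {"eth0", "mount"},    # fstab entry gated on the controller/net
}

ts = TopologicalSorter(deps)
ts.prepare()
while ts.is_active():
    ready = list(ts.get_ready())     # everything whose deps are satisfied
    # a real init would start these in parallel, then mark them done
    for svc in ready:
        ts.done(svc)
    print("start in parallel:", sorted(ready))
```

each batch printed is a set of services that can be brought up
concurrently - which is exactly why "okay, eth0 is up now" notifications
matter: they mark a node done and release the next batch.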

so as long as it's hotplug that gets the notifications, great.

if it's _not_ hotplug that's receiving the notifications, such that
device driver initialisation cannot be delayed because you're looking at
calling some boot-rom initialisation stuff, then, sorry, nope - you're
gonna have to just wait for that SCSI controller to get its act
together :)

but again, this is off-topic for the original discussion and is again
userspace, not kernel. oh well, in for a penny, in for a pound.

l.

Luke Kenneth Casson Leighton

unread,
Oct 5, 2005, 6:10:20 AM10/5/05
to
On Wed, Oct 05, 2005 at 01:35:57AM -0400, Valdis.K...@vt.edu wrote:
> On Tue, 04 Oct 2005 18:40:33 CDT, Chase Venters said:
> > Work on dbus and HAL should give us good improvements in these areas. One

HAL is great. dbus should have been shot at birth and the people who
initiated it without looking at freedce should have been fired.

l.

Luke Kenneth Casson Leighton

unread,
Oct 5, 2005, 6:20:11 AM10/5/05
to
On Wed, Oct 05, 2005 at 01:22:33AM +0000, D. Hazelton wrote:
> On Tuesday 04 October 2005 19:47, Marc Perkel wrote:

> As someone else pointed out, this is because unlinking is related to
> your access permissions on the parent directory and not the file.

that's POSIX.

i trust that POSIX has not been hard-coded into the entire design of
the linux kernel filesystem architecture _just_ because it's ... POSIX.

l.

Luke Kenneth Casson Leighton

unread,
Oct 5, 2005, 6:30:44 AM10/5/05
to
On Tue, Oct 04, 2005 at 06:40:33PM -0500, Chase Venters wrote:

> Work on dbus and HAL should give us good improvements in these areas. One

dbus. a total waste of several man-years of effort that could have been
better spent on e.g. removing freedce's dependency on posix draft 4
threads (which i finally did last month) and e.g. adding an ultra-fast
shared-memory transport plugin.

hal. _excellent_. look forward to it being ported to win32 so that
it's useful for the kde-win32 port etc. etc.

> remaining challenge I see is system configuration - each daemon tends to
> adopt its own syntax for configuration, which means that providing a GUI for
> novice users to manage these systems means attacking each problem separately
> and in full.

that's quite straightforward to deal with but it _does_ mean using a
unified approach to writing APIs.

NT solved this problem by writing graphical tools in c and then
adopting dce/rpc as the means to administer the services both locally
_and_ remotely.

wholesale. utterly. everything. right from the simplest things like
rebooting the machine, through checking the MAC addresses of cards
(NetTransportEnum) all the way up to DNS administration - yes,
dnsadmin.exe will write out DNS zone files in proper bind format.

it's quite a brave choice to take.

> Now I certainly wouldn't advocate a Windows-style registry,
> because I think it's full of obvious problems.

such as? :)

they're not obvious to me. at the risk of in-for-penny, in-for-pound
_radically_ off-topic discussion encouragement here, and also for
completeness should someone come back to the archives in some years or
months and go "what obvious problems", could you kindly elaborate?

one i can think of is "eek, my system's broken, eek, i can't even use
vi to edit the configs".

and having described the problem, then.. .. well... actually...
it's simply dealt with:

http://www.bindview.com/Services/RAZOR/Utilities/Unix_Linux/ntreg_readme.cfm

todd sabin wrote a linux filesystem driver which is read-only, so at
least half the work's done.

(and the reactos people have written a complete implementation
of a registry, btw).

> Nevertheless, it would be nice
> to have some kind of configuration editor abstraction library that had some
> sort of syntax definition database to allow for some interesting work on
> GUIs.

i have to say this: it's almost too radical, dude :)

he he.
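a "configuration editor abstraction" of the kind suggested above might
look like this minimal sketch - a per-daemon syntax definition (here
just a key-to-type map) that a generic GUI could drive. the schema and
option names are invented for illustration; no real daemon uses exactly
this:

```python
# Hypothetical syntax definition: option name -> coercion function.
SCHEMA = {
    "port":     int,
    "loglevel": str,
    "enabled":  lambda s: s.lower() in ("yes", "true", "1"),
}

def parse_config(text, schema):
    """Parse simple key=value lines, coercing values via the schema."""
    conf = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue                       # skip blanks and comments
        key, _, value = line.partition("=")
        key, value = key.strip(), value.strip()
        if key not in schema:
            raise KeyError("unknown option: %s" % key)
        conf[key] = schema[key](value)     # schema knows the type
    return conf

print(parse_config("port = 22\nenabled = yes\n# a comment", SCHEMA))
```

a GUI could walk the schema to render the right widget per option
(spinbox for int, checkbox for bool) without knowing each daemon's
syntax separately - which is the point of the abstraction.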


> In any case, I think pretty much all of this work lives outside the kernel.

ACK!

... well... not entirely.

a "registry" - god help us - would need to be stored on a filesystem.
and then, ideally, made accessible - a la todd sabin's ntreg driver - via
a POSIX interface (because the linux kernel doesn't _do_ anything other
than POSIX filesystems *sigh*). and that makes it convenient to access
from kernelspace, too.

hey, you know what? if linux got a registry, it would be possible for
the kernel to access - and store, and communicate - persistent
information.

conveniently.

hurrah.

Valdis.K...@vt.edu

unread,
Oct 5, 2005, 6:31:13 AM10/5/05
to
On Wed, 05 Oct 2005 11:09:42 BST, Luke Kenneth Casson Leighton said:
> On Wed, Oct 05, 2005 at 01:22:33AM +0000, D. Hazelton wrote:
> > On Tuesday 04 October 2005 19:47, Marc Perkel wrote:
>
> > As someone else pointed out, this is because unlinking is related to
> > your access permissions on the parent directory and not the file.
>
> that's POSIX.
>
> i trust that POSIX has not been hard-coded into the entire design of
> the linux kernel filesystem architecture _just_ because it's ... POSIX.

No, what got hard-coded were the concepts of inodes as the actual description
of filesystem objects, directories as lists of name-inode pairs, and the whole
user/group/other permission thing. "unlink depends on the directory
permissions not the object unlinked" has been the semantic that people depended
on ever since some code at Bell Labs started supporting tree-structured
directories and multiple hardlinks. POSIX merely codified existing behavior in
this case.
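the semantic Valdis describes - unlink checks the *directory's*
permissions, not the file's own mode bits - is easy to observe (a
sketch, run as an ordinary user in a temporary directory):

```python
import os
import tempfile

# A file with no access rights at all can still be deleted,
# because unlink() is an operation on the writable directory.
with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "secret")
    open(path, "w").close()
    os.chmod(path, 0)        # no read, write or execute for anyone
    os.unlink(path)          # succeeds anyway: the directory is writable
    assert not os.path.exists(path)
    print("unlinked a mode-000 file: directory permissions governed it")
```

this is the Bell Labs behavior that POSIX later codified, and the
exact case Marc Perkel objects to earlier in the thread.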
