What's left over.

Matt D. Robinson

Oct 31, 2002, 7:20:14 AM

to Linus Torvalds, Rusty Russell, linux-...@vger.kernel.org, lkcd-g...@lists.sourceforge.net, lkcd-...@lists.sourceforge.net

Linus Torvalds wrote:
> > Crash Dumping (LKCD)
>
> This is definitely a vendor-driven thing. I don't believe it has any
> relevance unless vendors actively support it.

There are people within IBM in Germany, India and England, as well as
a number of companies (Intel, NEC, Hitachi, Fujitsu), as well as SGI
that are PAID to support this. In addition, Global Services at IBM
uses this as a front-line method for resolving customer problems.
If you're looking for names of people to sign up to support it
(both vendors and non-vendors), I can make that list up for you.

There are a number of us (developers, support staff, and other
interested parties) who bend over backwards, day in and day out
to make sure this stuff works and helps people, even if it isn't
kernel developers (directly -- indirectly, you get bug reports that
are sane and useful).

It's not sexy kernel stuff, but it is very important, and if you'd
like, I can have representatives from at least 10 major corporations
(Fortune 500 companies) contact you to request that this go in.

We're generating 2.5.45 patches now, and we ask that you include
the patches when they are posted.

I don't know what else to say except that people really want this
stuff and all of us in the LKCD community work really hard together
to make this project useful for everyone.

Please include this in your next snapshot.

--Matt

P.S. Copying some of the users and developers.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majo...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

Dax Kelson

Oct 31, 2002, 7:22:10 AM

to Linus Torvalds, Rusty Russell, linux-...@vger.kernel.org

On Wed, 2002-10-30 at 19:31, Linus Torvalds wrote:
>
> > ext2/ext3 ACLs and Extended Attributes
>
> I don't know why people still want ACL's. There were noises about them for
> samba, but I've not heard anything since. Are vendors using this?
>

I teach Linux classes to corporate IT guys (~300 or so this year) and
many of them are migrating from Solaris or deploying Linux alongside
Solaris.

Solaris has had ACLs since 2.5.1 (1996), and EAs since 2.9 (May 2002).

Having ACLs in Linux is a VERY COMMON REQUEST that I hear from the
students.

FWIW.

Dax Kelson
Guru Labs

Dax Kelson

Oct 31, 2002, 7:22:44 AM

to Alexander Viro, Chris Wedgwood, Rik van Riel, Linus Torvalds, Rusty Russell, linux-...@vger.kernel.org

On Thu, 2002-10-31 at 00:10, Alexander Viro wrote:
>
>
> On 30 Oct 2002, Dax Kelson wrote:
>
> > Without ACLs, if Sally, Joe and Bill need rw access to a file/dir, just
> > create another group with just those three people in. Over time, of
>
> If Sally, Joe and Bill need rw access to a directory, and Joe and Bill
> are using existing userland (any OS I'd seen), then Sally can easily
> fuck them into the next month and not in a good way.

I think the normal intent is to let Sally, Joe, and Bill have their own
private directory protected from THE REST OF THE USERS.

If a member of your trusted circle goes rogue, then, yup you are screwed
for the moment. It shouldn't last a whole month though.

That is what backups and employment termination are for.

Dax

Patrick Finnegan

Oct 31, 2002, 7:23:37 AM

to linux-...@vger.kernel.org

On Thu, 31 Oct 2002, Christoph Hellwig wrote:

> On Wed, Oct 30, 2002 at 11:20:42PM -0500, Patrick Finnegan wrote:
> > Specifically, the interoperation with IBM's JFS LVM and MS's LVM will be
>
> JFS has no LVM, it just sits on any block device. The support for Windows
> dynamic disks actually layers on top of the MD driver.

To be more specific, I'm talking about AIX's JFS, not linux's JFS...

--
Purdue University ITAP/RCS
Information Technology at Purdue
Research Computing and Storage
http://www-rcd.cc.purdue.edu

http://dilbert.com/comics/dilbert/archive/images/dilbert2040637020924.gif

Christoph Hellwig

Oct 31, 2002, 7:23:39 AM

to Patrick Finnegan, linux-...@vger.kernel.org, Rusty Russell

On Wed, Oct 30, 2002 at 11:20:42PM -0500, Patrick Finnegan wrote:
> Specifically, the interoperation with IBM's JFS LVM and MS's LVM will be

JFS has no LVM, it just sits on any block device. The support for Windows
dynamic disks actually layers on top of the MD driver.

Patrick Finnegan

Oct 31, 2002, 7:23:48 AM

to linux-...@vger.kernel.org, Rusty Russell

I'm kind of new here, but I'll present my case in hope that someone
listens to me.

On Wed, 30 Oct 2002, Linus Torvalds wrote:

> On Thu, 31 Oct 2002, Rusty Russell wrote:
>
> > Crash Dumping (LKCD)
>
> This is definitely a vendor-driven thing. I don't believe it has any
> relevance unless vendors actively support it.

This is something that we're just starting to use in my department in
Purdue - we work with clustering, and LKCD will let us determine why our
nodes decide to kernel panic since it's generally not worthwhile to
connect a head to each machine.

I see LKCD as having a big impact by allowing kernels to be debugged after
they have panic'd (and thus don't send out a message to syslog). It can
especially be useful in compute farms, or other scenarios where it's
difficult or cost prohibitive to connect a console (or console server) to
each individual machine.

> > EVMS
>
> Not for the feature freeze, there are some noises that imply that SuSE may
> push it in their kernels.

I think that the integration between RAID and LVM is a good thing, and
EVMS's 'plug-in module' architecture will help tremendously to bring
interoperation with other systems' volume management subsystems.

Specifically, the interoperation with IBM's JFS LVM and MS's LVM will be
helpful for people trying to migrate their servers over from those OS's to
GNU/Linux.

-- Pat

Purdue University ITAP/RCS
Information Technology at Purdue
Research Computing and Storage
http://www-rcd.cc.purdue.edu

Andreas Dilger

Oct 31, 2002, 7:24:13 AM

to Linus Torvalds, Rusty Russell, linux-...@vger.kernel.org

On Oct 30, 2002 18:31 -0800, Linus Torvalds wrote:
> On Thu, 31 Oct 2002, Rusty Russell wrote:

> > ext2/ext3 ACLs and Extended Attributes
>
> I don't know why people still want ACL's. There were noises about them for
> samba, but I've not heard anything since. Are vendors using this?

I don't really care about ACLs so much one way or the other, but we
DEFINITELY use EAs with Lustre, so at the minimum if we could have
that part of the changes I'd be happy.

Cheers, Andreas
--
Andreas Dilger
http://www-mddsp.enel.ucalgary.ca/People/adilger/
http://sourceforge.net/projects/ext2resize/

tri...@samba.org

Oct 31, 2002, 7:24:45 AM

to h...@infradead.org, ru...@rustcorp.com.au, linux-...@vger.kernel.org, ge...@linux-m68k.org, r...@arm.linux.org.uk, pe...@chubb.wattle.id.au, ty...@mit.edu

> XFS doesn't have ACLs either in plain 2.5.

The existing NAS boxes that use Linux and XFS tend to base their
kernels on the 2.4-xfs tree from cvs on sgi.com. It works well and the
SGI guys have been very good about fixing problems when they crop up.

I think that the biggest beneficiary of adding extended attributes and
ACLs into ext3 for 2.6 would be more casual users (home, small office
etc) as they will then be able to use ACLs in Samba without the pain
of switching to a different kernel.

Cheers, Tridge

--
http://samba.org/~tridge/

Stephen Lord

Oct 31, 2002, 7:24:52 AM

to Linus Torvalds, Rusty Russell, Linux Kernel Mailing List

On Wed, 2002-10-30 at 20:31, Linus Torvalds wrote:
>
> On Thu, 31 Oct 2002, Rusty Russell wrote:
> >
> > Here is the list of features which are being actively
> > pushed, not NAK'ed, and are not in 2.5.45. There are 13 of them, as
> > appropriate for Halloween.
>
> I'm unlikely to be able to merge everything by tomorrow, so I will
> consider tomorrow a submission deadline to me, rather than a merge
> deadline. That said, I merged everything I'm sure I want to merge today,
> and the rest I simply haven't had time to look at very much.
>
> > ext2/ext3 ACLs and Extended Attributes
>
> I don't know why people still want ACL's. There were noises about them for
> samba, but I've not heard anything since. Are vendors using this?
>

There are a fair number of NAS vendors who do linux boxes with Samba
and XFS because of the ACL support, Quantum being the one Tridge now
works for by the way. The reason they want it is so they can support
the features NT folks are used to having in their file servers.
Now, we could just let the NT folks use NT servers instead....

Even getting XFS ACLs running in 2.5 requires part of this patch set.

Steve

Christoph Hellwig

Oct 31, 2002, 7:26:10 AM

to Rusty Russell, linux-...@vger.kernel.org, Geert Uytterhoeven, Russell King, Peter Chubb, tri...@samba.org, ty...@mit.edu

On Thu, Oct 31, 2002 at 02:00:31PM +1100, Rusty Russell wrote:
> > I don't know why people still want ACL's. There were noises about them for
> > samba, but I've not heard anything since. Are vendors using this?
>
> SAMBA needs them, which is why serious Samba boxes use XFS. Tridge,
> Ted?

XFS doesn't have ACLs either in plain 2.5.

> > Not for the feature freeze, there are some noises that imply that SuSE may
> > push it in their kernels.
>
> They have, IIRC. Interestingly, it was less invasive (existing source
> touched) than the LVM2/DM patch you merged.

But that's only because dm added stuff to the generic code where we
told it. It's a lot more code than dm and it adds new discovery
code at the same time we start moving stuff _out_ of the kernel
to initramfs.

If you can SuSE has merged it any IBM patch posted here should get
in, coming from big blue seems to be a basic merge criteria in
Nuernberg :)

tri...@samba.org

Oct 31, 2002, 7:26:13 AM

to torv...@transmeta.com, ru...@rustcorp.com.au, linux-...@vger.kernel.org, ge...@linux-m68k.org, r...@arm.linux.org.uk, pe...@chubb.wattle.id.au, ty...@mit.edu

> > > ext2/ext3 ACLs and Extended Attributes
> >
> > I don't know why people still want ACL's. There were noises about them for
> > samba, but I've not heard anything since. Are vendors using this?
>
> SAMBA needs them, which is why serious Samba boxes use XFS. Tridge,
> Ted?

oh yes, all the Linux based storage appliances use ACLs. Posix ACLs
aren't ideal for Samba, but they are *much* better than having no ACLs
at all. The Posix ACL code has been in Samba for a long time (getting
close to 3 years now?).

Eventually I'd like to see a combination of LSM with a new ACL system
give the ability to support full NT ACLs on Linux (which is also
needed for full nfsv4 support), but that is way too much to do for
the 2.6 kernel.

For the majority of windows users the mapping Samba does internally
between Posix ACLs and NT ACLs is sufficient for now.

I think that it would be a very good thing for Posix ACLs to be
included in the 2.6 kernel, especially in ext3.

Extended attributes are also important as they give a place to store
all the extra DOS info that has no other logical place in a posix
filesystem. For example, we can put the 'read only', 'archive', 'hidden'
and 'system' attributes there. If we don't have extended attributes
then we need to use a nasty kludge where these map to various unix
permission bits, but the mapping is terrible and doesn't give the
correct semantics (especially for things like read only on
directories).
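
As a rough illustration of the idea above, the four DOS bits fit in one byte that can be kept in an extended attribute. This is only a sketch: the attribute name `user.dosattr` and the helper names are made up for the example and are not taken from Samba. The calls themselves are `setxattr(2)` and `getxattr(2)` as exposed by current glibc in `<sys/xattr.h>`.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/xattr.h>

/* DOS attribute bits, using the standard DOS/SMB values. */
enum {
    DOS_READONLY = 0x01,
    DOS_HIDDEN   = 0x02,
    DOS_SYSTEM   = 0x04,
    DOS_ARCHIVE  = 0x20
};

/* Hypothetical attribute name. The "user." namespace is the one that
 * ordinary processes are allowed to write to. */
#define DOS_ATTR_XATTR "user.dosattr"

/* Store the DOS bits as a single byte on the file.
 * Returns 0 on success, -1 on error (for example when the filesystem
 * has no extended attribute support). */
int dos_attr_store(const char *path, unsigned char bits)
{
    assert(path != NULL);
    return setxattr(path, DOS_ATTR_XATTR, &bits, sizeof bits, 0);
}

/* Load the DOS bits from the file.
 * Returns 0 and fills *bits on success, -1 on error or when the
 * attribute is missing. */
int dos_attr_load(const char *path, unsigned char *bits)
{
    assert(path != NULL && bits != NULL);
    ssize_t n = getxattr(path, DOS_ATTR_XATTR, bits, sizeof *bits);
    return n == (ssize_t)sizeof *bits ? 0 : -1;
}
```

A caller would do something like `dos_attr_store("report.doc", DOS_READONLY | DOS_ARCHIVE)` and treat a failure as "fall back to the permission-bit mapping". Because the bits live beside the file rather than in its mode, a read-only directory can be recorded without touching its unix permissions.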

My main concern with using extended attributes in this way is
performance. My experience with XFS is that as soon as you start
adding extended attributes then the performance drops a lot, but I
haven't tested performance with the ext3 extended attributes so maybe
they don't have the same problem.

Cheers, Tridge

--
http://samba.org/~tridge/

Stephen Frost

Oct 31, 2002, 7:26:24 AM

to Rik van Riel, Linus Torvalds, Rusty Russell, linux-...@vger.kernel.org

* Rik van Riel (ri...@conectiva.com.br) wrote:

> On Wed, 30 Oct 2002, Linus Torvalds wrote:
> > On Thu, 31 Oct 2002, Rusty Russell wrote:
>
> > > ext2/ext3 ACLs and Extended Attributes
> >
> > I don't know why people still want ACL's. There were noises about them for
> > samba, but I've not heard anything since. Are vendors using this?
>
> Yes, people use it. Not quite sure why though, I guess ACLs
> buy some flexibility over the user/group/other model but if
> the "unlimited groups" patch goes in (is in?) I'm happy ;)
>
> Personally I do think either the unlimited groups patch or
> ACLs are needed in order to sanely run a large anoncvs setup.

The feeling I got on this was the ability to let users define their own
groups. Perhaps I'm not following it closely enough but that was the
impression I got in terms of "what this does for us"; I'm probably
missing other things. Just that ability would be nice in my view
though. Isn't it something that's been in AFS for a long time too?
I've got a few friends who've played with AFS before (at CMU and the
like) and really enjoyed the ACLs there.

Just my thoughts,

Stephen

Karim Yaghmour

Oct 31, 2002, 7:27:24 AM

to Linus Torvalds, Rusty Russell, linux-...@vger.kernel.org, LTT-Dev

Linus Torvalds wrote:
> > Linux Trace Toolkit (LTT)
>
> I don't know what this buys us.

How about being able to:
- Debug synchronization problems among processes (there is no other
tool to do this, not gdb, not strace, not printf, ...)
- Measure exact time spent waiting for kernel and which other
processes a process had to wait for.
- Measure exact time it takes for an interrupt's effects to propagate
throughout the entire system.
- Understand exactly how the system responds to input (what is
the exact sequence of processes that run when I press a key?).
- Identify sporadic problems in very saturated systems. (thousands
of servers and one of them is doing weird stuff).
- etc.

Providing system tracing is a necessity for any sort of complex
application development and system monitoring. Some people simply
can't use Linux without this sort of tool and I am at pains to
explain to them why they actually have to patch their kernel to
be able to debug their inter-process synchronization problems.

Users don't have to patch their kernel to use gdb and I don't
see why they should need to patch their kernel to understand how
their various processes interact with the kernel and vice-versa.

Karim

===================================================
Karim Yaghmour
ka...@opersys.com
Embedded and Real-Time Linux Expert
===================================================

Rik van Riel

Oct 31, 2002, 7:27:39 AM

to Linus Torvalds, Rusty Russell, linux-...@vger.kernel.org

On Wed, 30 Oct 2002, Linus Torvalds wrote:
> On Thu, 31 Oct 2002, Rusty Russell wrote:

> > ext2/ext3 ACLs and Extended Attributes
>
> I don't know why people still want ACL's. There were noises about them for
> samba, but I've not heard anything since. Are vendors using this?

Yes, people use it. Not quite sure why though, I guess ACLs
buy some flexibility over the user/group/other model but if
the "unlimited groups" patch goes in (is in?) I'm happy ;)

Personally I do think either the unlimited groups patch or
ACLs are needed in order to sanely run a large anoncvs setup.

regards,

Rik
--
Bravely reimplemented by the knights who say "NIH".
http://www.surriel.com/ http://distro.conectiva.com/
Current spamtrap: <a href=mailto:"oct...@surriel.com">oct...@surriel.com</a>

Rusty Russell

Oct 31, 2002, 7:27:49 AM

to Linus Torvalds, linux-...@vger.kernel.org, Geert Uytterhoeven, Russell King, Peter Chubb, tri...@samba.org, ty...@mit.edu

In message <Pine.LNX.4.44.02103...@home.transmeta.com> you write:
>
> On Thu, 31 Oct 2002, Rusty Russell wrote:
> >
> > Here is the list of features which are being actively
> > pushed, not NAK'ed, and are not in 2.5.45. There are 13 of them, as
> > appropriate for Halloween.
>
> I'm unlikely to be able to merge everything by tomorrow, so I will
> consider tomorrow a submission deadline to me, rather than a merge
> deadline. That said, I merged everything I'm sure I want to merge today,
> and the rest I simply haven't had time to look at very much.
>

> > In-kernel Module Loader and Unified parameter support
>
> This apparently breaks things like DRI, which I'm fairly unhappy about,
> since I think 3D is important.

Yes, the patch stubs out inter_module_*, in favor of get_symbol() &
put_symbol().

This breaks the three users: one in drivers/mtd/ and two in
drivers/char/drm/. I have a patch which fixes them (untested), or I
can simply put the inter_module_* code back in.

> > Fbdev Rewrite
>
> This one is just huge, and I have little personal judgement on it.

It's been around for a while. Geert, Russell?

> > Linux Trace Toolkit (LTT)
>
> I don't know what this buys us.

Haven't looked at it.

> > statfs64
>
> I haven't even seen it.

It's fairly old, but Peter Chubb said there was some vendor interest
for v. large devices. Peter?

> > ext2/ext3 ACLs and Extended Attributes
>
> I don't know why people still want ACL's. There were noises about them for
> samba, but I've not heard anything since. Are vendors using this?

SAMBA needs them, which is why serious Samba boxes use XFS. Tridge,
Ted?

> > Hotplug CPU Removal Support
>
> No objections, but very little visibility into it either.

The controls are in driverfs etc, and that's always been in flux. 8(

The rest is v. small, basically extending ksoftirqd, workqueues and
migration threads to disable them. Then it's all arch-specific.

> > Hires Timers
>
> This one is likely another "vendor push" thing.
>
> > EVMS
>
> Not for the feature freeze, there are some noises that imply that SuSE may
> push it in their kernels.

They have, IIRC. Interestingly, it was less invasive (existing source
touched) than the LVM2/DM patch you merged.

> > initramfs
>
> I want this.

Good. The big payoff is moving stuff out of the kernel, which can't
really be done in a stable series.

> > Kernel Probes
>
> Probably.

Sent.

Rusty.
--
Anyone who quotes me in their sig is an idiot. -- Rusty Russell.

Alexander Viro

Oct 31, 2002, 7:28:19 AM

to Linus Torvalds, Rusty Russell, linux-...@vger.kernel.org

On Wed, 30 Oct 2002, Linus Torvalds wrote:

> > ext2/ext3 ACLs and Extended Attributes
>
> I don't know why people still want ACL's. There were noises about them for
> samba, but I've not heard anything since. Are vendors using this?

Because People Are Stupid(tm). Because it's cheaper to put "ACL support: yes"
in the feature list under "Security" than to make sure that userland can cope
with anything more complex than "Me Og. Og see directory. Directory Og's.
Nobody change it". Cf. snake oil, P.T.Barnum and esp. LSM users.

Linus Torvalds

Oct 31, 2002, 7:28:30 AM

to Rusty Russell, linux-...@vger.kernel.org

On Thu, 31 Oct 2002, Rusty Russell wrote:
>
> Here is the list of features which are being actively
> pushed, not NAK'ed, and are not in 2.5.45. There are 13 of them, as
> appropriate for Halloween.

I'm unlikely to be able to merge everything by tomorrow, so I will
consider tomorrow a submission deadline to me, rather than a merge
deadline. That said, I merged everything I'm sure I want to merge today,
and the rest I simply haven't had time to look at very much.

> In-kernel Module Loader and Unified parameter support

This apparently breaks things like DRI, which I'm fairly unhappy about,
since I think 3D is important.

> Fbdev Rewrite

This one is just huge, and I have little personal judgement on it.

> Linux Trace Toolkit (LTT)

I don't know what this buys us.

> statfs64

I haven't even seen it.

> ext2/ext3 ACLs and Extended Attributes

I don't know why people still want ACL's. There were noises about them for
samba, but I've not heard anything since. Are vendors using this?

> ucLinux Patch (MMU-less support)

I've seen this, it looks pretty ok.

> Crash Dumping (LKCD)

This is definitely a vendor-driven thing. I don't believe it has any
relevance unless vendors actively support it.

> POSIX Timer API

I think I'll do at least the API, but there were some questions about the
config options here, I think.

> Hotplug CPU Removal Support

No objections, but very little visibility into it either.

> Hires Timers

This one is likely another "vendor push" thing.

> EVMS

Not for the feature freeze, there are some noises that imply that SuSE may
push it in their kernels.

> initramfs

I want this.

> Kernel Probes

Probably.

Linus

Rusty Russell

Oct 31, 2002, 7:29:13 AM

to torv...@transmeta.com, linux-...@vger.kernel.org

Hi Linus,

Here is the list of features which are being actively
pushed, not NAK'ed, and are not in 2.5.45. There are 13 of them, as
appropriate for Halloween.

Most were submitted repeatedly *well* before the freeze. It'd
be nice for you to give feedback, and decide which ones (if any) are
still up for review.

Rusty.
--
Anyone who quotes me in their sig is an idiot. -- Rusty Russell.

From: http://www.kernel.org/pub/linux/kernel/people/rusty/2.6-not-in-yet/

Rusty's Remarkably Unreliable List of Pending 2.6 Features
[aka. Rusty's Snowball List]

A: Author
M: lkml posting describing patch
D: Download URL
S: Size of patch, number of files altered (source/config), number of new files.
X: Impact summary (only parts of patch which alter existing source files, not config/make files)
T: Diffstat of whole patch
N: Random notes

In rough order of invasiveness (number of altered source files):

In-kernel Module Loader and Unified parameter support

A: Rusty Russell
D: http://www.kernel.org/pub/linux/kernel/people/rusty/patches/Module/
S: 841 kbytes, 302/36 files altered, 22 new
T: Diffstat
X: Summary patch (598k)
N: Requires new modutils

Fbdev Rewrite
A: James Simmons
M: http://www.uwsg.iu.edu/hypermail/linux/kernel/0111.3/1267.html
D: http://phoenix.infradead.org/~jsimmons/fbdev.diff.gz
S: 4852 kbytes, 168/29 files altered, 124 new
T: Diffstat
X: Summary patch (182k)

Linux Trace Toolkit (LTT)
A: Karim Yaghmour
M: http://www.uwsg.iu.edu/hypermail/linux/kernel/0204.1/0832.html
M: http://marc.theaimsgroup.com/?l=linux-kernel&m=103491640202541&w=2
M: http://marc.theaimsgroup.com/?l=linux-kernel&m=103423004321305&w=2
M: http://marc.theaimsgroup.com/?l=linux-kernel&m=103247532007850&w=2
D: http://opersys.com/ftp/pub/LTT/ExtraPatches/patch-ltt-linux-2.5.44-vanilla-021026-2.2.bz2
S: 257 kbytes, 67/4 files altered, 9 new
T: Diffstat
X: Summary patch (90k)

statfs64
A: Peter Chubb
M: http://marc.theaimsgroup.com/?l=linux-kernel&m=103490436228016&w=2
D: http://marc.theaimsgroup.com/?l=linux-kernel&m=103490436228016&w=2
S: 42 kbytes, 53/0 files altered, 1 new
T: Diffstat
X: Summary patch (32k)

ext2/ext3 ACLs and Extended Attributes

A: Ted Ts'o
M: http://lists.insecure.org/lists/linux-kernel/2002/Oct/6787.html
B: bk://extfs.bkbits.net/extfs-2.5-update
D: http://thunk.org/tytso/linux/extfs-2.5/
S: 497 kbytes, 96/34 files altered, 34 new
T: Diffstat
X: Summary patch (167k)

ucLinux Patch (MMU-less support)
A: Greg Ungerer
M: http://lwn.net/Articles/11016/
D: http://www.uclinux.org/pub/uClinux/uClinux-2.5.x/linux-2.5.44uc3.patch.gz
S: 2218 kbytes, 25/34 files altered, 429 new
T: Diffstat
X: Summary patch (40k)

Crash Dumping (LKCD)
A: Matt Robinson, LKCD team
M: http://lists.insecure.org/lists/linux-kernel/2002/Oct/8552.html
D: http://lkcd.sourceforge.net/download/latest/
S: 18479 kbytes, 18/10 files altered, 10 new
T: Diffstat
X: Summary patch (18k)

POSIX Timer API
A: George Anzinger
M: http://marc.theaimsgroup.com/?l=linux-kernel&m=103553654329827&w=2
D: http://unc.dl.sourceforge.net/sourceforge/high-res-timers/hrtimers-posix-2.5.44-1.0.patch
S: 66 kbytes, 18/2 files altered, 4 new
T: Diffstat
X: Summary patch (21k)

Hotplug CPU Removal Support
A: Rusty Russell
D: http://www.kernel.org/pub/linux/kernel/people/rusty/patches/Hotcpu/hotcpu-cpudown.patch.gz
S: 32 kbytes, 16/0 files altered, 0 new
T: Diffstat
X: Summary patch (29k)

Hires Timers
A: George Anzinger
M: http://marc.theaimsgroup.com/?l=linux-kernel&m=103557676007653&w=2
M: http://marc.theaimsgroup.com/?l=linux-kernel&m=103557677207693&w=2
M: http://marc.theaimsgroup.com/?l=linux-kernel&m=103558349714128&w=2
D: http://unc.dl.sourceforge.net/sourceforge/high-res-timers/hrtimers-core-2.5.44-1.0.patch
D: http://unc.dl.sourceforge.net/sourceforge/high-res-timers/hrtimers-i386-2.5.44-1.0.patch
D: http://unc.dl.sourceforge.net/sourceforge/high-res-timers/hrtimers-hrposix-2.5.44-1.1.patch
S: 132 kbytes, 15/4 files altered, 10 new
T: Diffstat
X: Summary patch (44k)
N: Requires POSIX Timer API patch

EVMS
A: EVMS Team
M: http://www.uwsg.iu.edu/hypermail/linux/kernel/0208.0/0109.html
D: http://evms.sourceforge.net/patches/2.5.44/
S: 1101 kbytes, 7/10 files altered, 44 new
T: Diffstat
X: Summary patch (4k)

initramfs
A: Al Viro
M: http://www.cs.helsinki.fi/linux/linux-kernel/2001-30/0110.html
D: ftp://ftp.math.psu.edu/pub/viro/N0-initramfs-C21
S: 16 kbytes, 5/1 files altered, 2 new
T: Diffstat
X: Summary patch (5k)

Kernel Probes
A: Vamsi Krishna S
M: lists.insecure.org/linux-kernel/2002/Aug/1299.html
D: http://www.kernel.org/pub/linux/kernel/people/rusty/patches/Misc/kprobes.patch.gz
S: 18 kbytes, 4/2 files altered, 4 new
T: Diffstat
X: Summary patch (5k)

Alexander Viro

Oct 31, 2002, 7:30:56 AM

to Dax Kelson, Chris Wedgwood, Rik van Riel, Linus Torvalds, Rusty Russell, linux-...@vger.kernel.org

On 30 Oct 2002, Dax Kelson wrote:

> Without ACLs, if Sally, Joe and Bill need rw access to a file/dir, just
> create another group with just those three people in. Over time, of

If Sally, Joe and Bill need rw access to a directory, and Joe and Bill
are using existing userland (any OS I'd seen), then Sally can easily
fuck them into the next month and not in a good way.

_That_ is the real problem. Until that is solved (i.e. until all
userland is written up to the standards allegedly followed in writing
suid-root programs wrt hostile filesystem modifications) NO mechanism
will help you. ACLs, huge groups, whatever - setups with that sort
of access allowed are NOT SUSTAINABLE with the current userland(s).
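
One common defensive pattern for the kind of care described above, sketched in C. It is an assumption about what "coping" might look like for a single open, not a complete answer to the problem raised here: `open_untrusted` is a made-up name, and `O_NOFOLLOW` only protects the last path component, so a hostile co-owner can still swap a parent directory.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/stat.h>
#include <unistd.h>

/* Open a file that lives in a directory other, possibly hostile, users
 * can write to. Returns a read-only descriptor, or -1 if the path is a
 * symlink, is not a regular file, or has extra hard links. */
int open_untrusted(const char *path)
{
    struct stat st;

    assert(path != NULL);

    /* O_NOFOLLOW: refuse a symlink planted as the final component.
     * O_NONBLOCK: do not hang if someone swapped in a FIFO. */
    int fd = open(path, O_RDONLY | O_NOFOLLOW | O_NONBLOCK);
    if (fd < 0)
        return -1;

    /* Check the object we actually opened, not the name, so the answer
     * cannot change between the check and the use. */
    if (fstat(fd, &st) < 0 || !S_ISREG(st.st_mode) || st.st_nlink != 1) {
        close(fd);
        return -1;
    }
    return fd;
}
```

The point of checking through the descriptor is that a later rename or replace of the path by another group member no longer matters for this one file. Every program touching the shared directory would need the same discipline, which is the cost being argued about.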

Dax Kelson

Oct 31, 2002, 7:31:29 AM

to Chris Wedgwood, Rik van Riel, Linus Torvalds, Rusty Russell, linux-...@vger.kernel.org

On Wed, 2002-10-30 at 23:22, Chris Wedgwood wrote:

> On Thu, Oct 31, 2002 at 01:06:54AM -0200, Rik van Riel wrote:
>
> > Personally I do think either the unlimited groups patch or ACLs are
> > needed in order to sanely run a large anoncvs setup.
>

> Processes need to be a member of 20+ groups to make anoncvs work?
> Sounds like anoncvs is broken then.

Technically speaking you can achieve ACL-like permissions/behavior using
the historical UNIX security model by creating a group EACH time you run
into a unique permission scenario.

Without ACLs, if Sally, Joe and Bill need rw access to a file/dir, just
create another group with just those three people in. Over time, of
course, this leads to massive group proliferation. Without Tim Hockin's
patch, 32 is the maximum number of groups a user can be a member of.
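
To put rough numbers on the proliferation, here is the worst-case arithmetic as a small C sketch. The function names are made up for illustration, and "worst case" assumes every possible combination of people ends up sharing something.

```c
#include <assert.h>
#include <stdint.h>

/* Number of distinct sets of two or more users out of n: all 2^n
 * subsets, minus the empty set and the n single-user sets. Each such
 * set needs its own group in the worst case. */
uint64_t shared_sets(unsigned n)
{
    assert(n < 64);                  /* keep the shift within 64 bits */
    return (UINT64_C(1) << n) - n - 1;
}

/* Number of those sets that contain one particular user: every subset
 * of the other n-1 users, except the empty one. This is the count that
 * runs into a per-user membership limit. */
uint64_t groups_per_user(unsigned n)
{
    assert(n >= 1 && n < 64);
    return (UINT64_C(1) << (n - 1)) - 1;
}
```

For the three-person example that is 4 possible groups in total and 3 per person. With seven people sharing in every combination, one user could need 63 memberships, well past a limit of 32.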

Dax

Alexander Viro

Oct 31, 2002, 7:43:28 AM

to Dax Kelson, Chris Wedgwood, Rik van Riel, Linus Torvalds, Rusty Russell, linux-...@vger.kernel.org

On 31 Oct 2002, Dax Kelson wrote:

> I think the normal intent is to let Sally, Joe, and Bill have their own
> private directory protected from THE REST OF THE USERS.
>
> If a member of your trusted circle goes rogue, then, yup you are screwed
> for the moment. It shouldn't last a whole month though.
>
> That is what backups and employment termination are for.

Then give them all the same account and be done with that. Effect will
be the same.

Ville Herva

Oct 31, 2002, 7:47:18 AM

to Linus Torvalds, linux-...@vger.kernel.org

On Wed, Oct 30, 2002 at 06:31:36PM -0800, you [Linus Torvalds] wrote:
>
> > Crash Dumping (LKCD)
>
> This is definitely a vendor-driven thing. I don't believe it has any
> relevance unless vendors actively support it.

I don't think this is just a vendor thing. Currently, linux doesn't have any
way of saving the crash dump when the box crashes. So if it crashes, the
user needs to write the oops down by hand (error prone, the interesting part
has often scrolled off screen), or attach a serial console (then he needs to
reproduce it - not always possible, and actually the majority of people (home
users) don't have a second box and the cable. Nor the motivation.)

So, imho some way of semi-automatically saving the dumps is needed. If
vendors even support it - great - but it has value to mainline kernel as
well, as people can submit more accurate error reports. Besides, if it goes
in mainline, I believe vendors are likely to support it. (Why wouldn't they?
Currently there just isn't a standard way of doing this.)

There are a bunch of patches for this sort of thing (Willy Tarreau's
kmsgdump for dumping to floppy, Ingo's netconsole, Rusty's oopser for
dumping to ide device...), but lkcd is a more general framework, and can
support different ways of dumping.

I know you are not keen on kernel debuggers, but I can't see what's
fundamentally wrong with being able to save the crucial info when a crash
happens...

-- v --

v...@iki.fi

Geert Uytterhoeven

Oct 31, 2002, 9:25:59 AM

to Ville Herva, Linus Torvalds, Linux Kernel Development

On Thu, 31 Oct 2002, Ville Herva wrote:
> On Wed, Oct 30, 2002 at 06:31:36PM -0800, you [Linus Torvalds] wrote:
> > > Crash Dumping (LKCD)
> >
> > This is definitely a vendor-driven thing. I don't believe it has any
> > relevance unless vendors actively support it.
>
> I don't think this is just a vendor thing. Currently, linux doesn't have any
> way of saving the crash dump when the box crashes. So if it crashes, the
> user needs to write the oops down by hand (error prone, the interesting part
> has often scrolled off screen), or attach a serial console (then he needs to
> reproduce it - not always possible, and actually the majority of people (home
> users) don't have a second box and the cable. Nor the motivation.)

Except on m68k, where for ages we've had a feature to store all kernel
messages in an unused portion of memory (e.g. some Chip RAM on Amiga) and
recover them after reboot.

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds

Ville Herva

Oct 31, 2002, 9:40:40 AM

to Geert Uytterhoeven, Linux Kernel Development

On Thu, Oct 31, 2002 at 10:23:32AM +0100, you [Geert Uytterhoeven] wrote:
>
> Except on m68k, where for ages we've had a feature to store all kernel
> messages in an unused portion of memory (e.g. some Chip RAM on Amiga) and
> recover them after reboot.

There was a similar thing for x86 as well:

http://www.tux.org/hypermail/linux-kernel/1999week27/0782.html

Of course it never went to mainline (and I don't know how well it worked.)
From what I understand, lkcd can support such a method easily.

-- v --

v...@iki.fi

Lech Szychowski

Oct 31, 2002, 9:45:52 AM

to Rik van Riel, linux-...@vger.kernel.org

> Yes, people use it. Not quite sure why though, I guess ACLs
> buy some flexibility over the user/group/other model but if
> the "unlimited groups" patch goes in (is in?) I'm happy ;)

Correct me if I'm wrong but I believe a process has to be
restarted to have its group membership list changed?

That's a huge difference from ACL behavior, which allows changes to
file access rights without the need to restart the accessing process.

--
Leszek.

-- le...@pse.pl 2:480/33.7 -- REAL programmers use INTEGERS --
-- speaking just for myself...

Trever L. Adams

Oct 31, 2002, 10:17:04 AM

to Linus Torvalds, Rusty Russell, Linux Kernel Mailing List

On Wed, 2002-10-30 at 21:31, Linus Torvalds wrote:

> > ext2/ext3 ACLs and Extended Attributes
>
> I don't know why people still want ACL's. There were noises about them for
> samba, but I've not heard anything since. Are vendors using this?
>

I am sure I don't count (not being a vendor), but Intermezzo offers
support for this (they are waiting on feature freeze to redo it to 2.5
according to an email I have). I want this stuff. Yes, u+g+w is nice,
but good ACLs are even better. Please, if this is technically correct
in implementation, do put it in.

Thank you,
Trever

Joe Thornber

Oct 31, 2002, 10:17:34 AM

to Rusty Russell, Linus Torvalds, linux-...@vger.kernel.org, Geert Uytterhoeven, Russell King, Peter Chubb, tri...@samba.org, ty...@mit.edu

On Thu, Oct 31, 2002 at 02:00:31PM +1100, Rusty Russell wrote:

> > > EVMS
> >
> > Not for the feature freeze, there are some noises that imply that SuSE may
> > push it in their kernels.
>
> They have, IIRC. Interestingly, it was less invasive (existing source
> touched) than the LVM2/DM patch you merged.

FUD. I added to three areas of existing code:

i) Every man and his dog uses mempools in conjunction with slabs, so
rather than having everyone redefining their own alloc/free
functions I added the following huge functions to mempool.c. In no
way were they mandatory.

/*
 * A commonly used alloc and free fn.
 */
void *mempool_alloc_slab(int gfp_mask, void *pool_data)
{
	kmem_cache_t *mem = (kmem_cache_t *) pool_data;
	return kmem_cache_alloc(mem, gfp_mask);
}

void mempool_free_slab(void *element, void *pool_data)
{
	kmem_cache_t *mem = (kmem_cache_t *) pool_data;
	kmem_cache_free(mem, element);
}

ii) vcalloc, this *didn't* get merged, and will probably end up getting
moved into dm.h.

iii) ioctl32 support: people have argued against an ioctl interface,
and I'm inclined to agree with them, which is why I'm going to
publish an fs interface shortly. However, given that we are
currently using an ioctl interface how do we avoid adding support for
32bit userland/64 kernel space ? If EVMS isn't touching these
files does that mean they're not supporting these architectures ?

arch/mips64/kernel/ioctl32.c
arch/ppc64/kernel/ioctl32.c
arch/s390x/kernel/ioctl32.c
arch/sparc64/kernel/ioctl32.c

So given that (ii) didn't get merged, which of (i) and (iii) were you
objecting to ?

- Joe

Geert Uytterhoeven

Oct 31, 2002, 11:04:59 AM

to Rusty Russell, James Simmons, Linus Torvalds, Linux Kernel Development, Russell King, Peter Chubb, tri...@samba.org, Theodore Ts'o

On Thu, 31 Oct 2002, Rusty Russell wrote:
> In message <Pine.LNX.4.44.02103...@home.transmeta.com> you write:
> > On Thu, 31 Oct 2002, Rusty Russell wrote:
> > > Fbdev Rewrite
> >
> > This one is just huge, and I have little personal judgement on it.
>
> It's been around for a while. Geert, Russell?

It's huge because it moves a lot of files around:
1. drivers/char/agp/ -> drivers/video/agp/
2. drivers/char/drm/ -> drivers/video/drm/
3. console related files in drivers/video/ -> drivers/video/console/

(1) and (2) should be reverted, but apparently they aren't reverted in the
patch at http://phoenix.infradead.org/~jsimmons/fbdev.diff.gz yet. The patch
also seems to remove some drivers. Haven't checked the bk repo yet.

James, can you please fix that (and the .Config files)?

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds

Chris Friesen

Oct 31, 2002, 2:23:04 PM

to Linus Torvalds, linux-...@vger.kernel.org

Linus Torvalds wrote:
>>Linux Trace Toolkit (LTT)

> I don't know what this buys us.

I'd like to add a request for this to be in mainstream. The benefits
have already been stated in this thread, and it has been used here to
good effect.

>>Crash Dumping (LKCD)

> This is definitely a vendor-driven thing. I don't believe it has any
> relevance unless vendors actively support it.

I'd like to see this too. The more debug information the better as far
as I'm concerned.

>>Hires Timer

> This one is likely another "vendor push" thing.

It doesn't hurt performance when turned off, and allows for
finer-grained timing when turned on. What's not to like? I can't
comment on the actual code, but I really like the idea.

Chris

--
Chris Friesen | MailStop: 043/33/F10
Nortel Networks | work: (613) 765-0557
3500 Carling Avenue | fax: (613) 765-2986
Nepean, ON K2H 8E9 Canada | email: cfri...@nortelnetworks.com

Jeff Garzik

Oct 31, 2002, 2:28:10 PM

to Joe Thornber, Rusty Russell, Linus Torvalds, linux-...@vger.kernel.org, Geert Uytterhoeven, Russell King, Peter Chubb, tri...@samba.org, ty...@mit.edu

Joe Thornber wrote:

>ii) vcalloc, this *didn't* get merged, and will probably end up getting
> moved into dm.h.
>

Yeah, historically we have avoided things like this.

kcalloc gets proposed every year or so too.

>iii) ioctl32 support: people have argued against an ioctl interface,
> and I'm inclined to agree with them, which is why I'm going to
> publish an fs interface shortly. However, given that we are
> currently using an ioctl interface how do we avoid adding support for
> 32bit userland/64 kernel space ? If EVMS isn't touching these
> files does that mean they're not supporting these architectures ?
>
> arch/mips64/kernel/ioctl32.c
> arch/ppc64/kernel/ioctl32.c
> arch/s390x/kernel/ioctl32.c
> arch/sparc64/kernel/ioctl32.c
>
>

Well, I'll note that ALSA compartmentalizes their ioctl32 handling
within their own subsystem, which seems like a decent solution.

That said, [maybe I'm biased <g>], using an fs interface allows one to
completely eliminate an ioctl32 interface. That would be the direction
I would greatly prefer by the time 2.5.x hits the code freeze.

Best regards, and congrats for getting it merged,

Jeff

Jeff Garzik

Oct 31, 2002, 2:33:14 PM

to Chris Wedgwood, Dax Kelson, Rik van Riel, Linus Torvalds, Rusty Russell, linux-...@vger.kernel.org

Chris Wedgwood wrote:

>problems most people don't have? What next, some kind of misdesigned
>in-kernel CryptoAPI?
>
>

Ok, I'll allow myself to be trolled.

What's wrong with our current 2.5.45 crypto api?

Alan Cox

Oct 31, 2002, 2:37:39 PM

to Jeff Garzik, Joe Thornber, Rusty Russell, Linus Torvalds, Linux Kernel Mailing List, Geert Uytterhoeven, Russell King, Peter Chubb, tri...@samba.org, ty...@mit.edu

On Thu, 2002-10-31 at 14:26, Jeff Garzik wrote:
> Yeah, historically we have avoided things like this.
> kcalloc gets proposed every year or so too.

I would like to see both of these in because tons of kernel fixing that
has been done through audits has been about

get_user(a, ...)
kmalloc(a * sizeof(b), ..)

We end up with loads of ugly "> MAXINT/sizeof(foo)" if checks in the code
that ought to be in one place.
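
A minimal userspace sketch of what such a single checked helper could look like. The name checked_array_alloc is hypothetical (it is not an API proposed in this thread), and it uses malloc rather than kmalloc so that it can be compiled outside the kernel:

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: allocate a zeroed array of n elements of the
 * given size, refusing any request whose total byte count would
 * overflow size_t. Callers no longer need their own overflow check.
 */
void *checked_array_alloc(size_t n, size_t size)
{
	void *p;

	/* n * size overflows exactly when n > SIZE_MAX / size. */
	if (size != 0 && n > SIZE_MAX / size)
		return NULL;

	p = malloc(n * size);
	if (p)
		memset(p, 0, n * size);
	return p;
}
```

With something like this, the user-supplied count from get_user() goes straight into the helper, and the multiplication is checked in exactly one place.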

Suparna Bhattacharya

Oct 31, 2002, 2:51:24 PM

to Linus Torvalds, Rusty Russell, linux-...@vger.kernel.org, lkcd-...@lists.sourceforge.net, lkcd-g...@lists.sourceforge.net

On Thu, Oct 31, 2002 at 02:39:23AM +0000, Linus Torvalds wrote:
>
> On Thu, 31 Oct 2002, Rusty Russell wrote:
> >
> > Here is the list of features which are being actively
> > pushed, not NAK'ed, and are not in 2.5.45. There are 13 of them, as
> > appropriate for Halloween.
>
> I'm unlikely to be able to merge everything by tomorrow, so I will
> consider tomorrow a submission deadline to me, rather than a merge
> deadline. That said, I merged everything I'm sure I want to merge today,
> and the rest I simply haven't had time to look at very much.
>
>

> > Crash Dumping (LKCD)
>
> This is definitely a vendor-driven thing. I don't believe it has any
> relevance unless vendors actively support it.
>

Linus,

I wish you could have made it to the OLS RAS BOF and seen this for
yourself - the vendor support, the need and the drive towards a
unified and flexible dumping framework.

The problem with dump has not been lack of vendor interest. There
wouldn't have been multiple dump type implementations floating around
if there wasn't a need -- LKCD, Mission Critical dump, Ingo's
network dump, kmsgdump, Rusty's oops dumper to cite some. The difficulty
has been technical and hence the diversity of approaches that different
projects came up with to tackle the problem (arising from slightly
different priorities and environments in each case). A second difficulty has
been related to preferences in the kind of user level analysis tools.

And the LKCD project has been evolving to address these very
problems to bring the best of these worlds together and also allow
flexibility on the choice of analysis tools !

The Mission Critical Linux project code base, for example, is now being
maintained as part of the LKCD project. Either lcrash or Mission
Critical Linux crash can be used for analysing LKCD dumps.

And on the kernel side of things:

(a) The dump driver interface in LKCD has been specifically
    designed to enable different kinds of dumping mechanisms and
    targets to be supported -- generic block, network dump,
    polled-IDE (Rusty style) etc., even alternate dump target failover
    and multiple dump devices in the future if required. We are also
    experimenting with a memory dump driver to save dump to memory
    and dump after a memory preserving soft-boot, reusing the mission
    critical mcore technique.
(b) Selective dumping, for different levels of dump data - one
    option that was added recently would dump all kernel pages
    and is likely to be commonly used (gzip compressed dump). It's
    pretty easy to extend to more selectivity or different levels
    and the dump also occurs in passes from more critical data to
    less critical.
    (The page in use flag was added to help with this)
(c) The core pieces which touch the kernel as such just add basic
    infrastructure that is needed in the kernel for any dumping
    facility. Includes:
    - Enabling IPI to collect CPU state on all processors in the
      system right when dump is triggered (may not be a normal
      situation, so NMIs where supported are the best option)
    - Ability to quiesce (silence) the system before dumping
      (and if in non-disruptive mode, then restore it back)
    - Calls into dump from kernel paths (panic, oops, sysrq
      etc).
    - Exports of symbols to help with physical memory
      traversal and verification

As Matt has said, there is an active development community behind
LKCD and a lot of the drive for that has come from companies who use it
and are really hoping hard that it becomes part of the mainline.

BTW, the code has also been scrutinised and reviewed over
lkml as well and undergone iterations of releases following
that. Anything else there that you think needs to be fixed please
do let us know.

Regards
Suparna

Richard J Moore

Oct 31, 2002, 2:52:57 PM

to Linus Torvalds, linux-...@vger.kernel.org, n2...@ltc-eth1000.torolab.ibm.com, Rusty Russell

>> Linux Trace Toolkit (LTT)

>
>I don't know what this buys us.

If you consider developer productivity useful then LTT has definite
benefits especially when combined with kprobes. With the two it is possible
to implant tracepoints without having to code up specific printks: kprobes
can be used to implant a probe, the probe handler can call LTT to record
the event.

Why call LTT instead of having a printk in the probe handler? - for
performance reasons, for latency reasons, because kprobes can implant
probes absolutely anywhere in the system, for analysis reasons - LTT trace
data can be post processed and massaged in a number of ways using the
visualizer tools. Yes you can do some of this using printk directly, but
you can be into a whole heap more work and it will certainly take longer to
implant a temporary tracepoint, recompile, run, remove, recompile than using
the dynamic trace technique of LTT+kprobes.

Richard

Richard J Moore

Oct 31, 2002, 3:02:47 PM

to Linus Torvalds, linux-...@vger.kernel.org, n2...@ltc-eth1000.torolab.ibm.com, Rusty Russell

>> Crash Dumping (LKCD)
>
>This is definitely a vendor-driven thing. I don't believe it has any
>relevance unless vendors actively support it.

I can't argue with the fact you want to view lkcd this way. However as a
developer I have found a crash dump facility indispensable for certain
problems, particularly those that involve multiple processors where to use
more invasive techniques such as an interactive debugger can make the
problem unreproducible. It's also worth pointing out that each of the
serviceability tools (dump, trace, probes) complements each other. They are
ever so much more powerful when used as a set: lkcd can capture a trace
buffer, whose contents would otherwise be lost; kprobes enables LTT to
implant tracepoints dynamically; kprobes + lkcd allows a crash dump to be
triggered for complex and specific conditions that are difficult to
reproduce. Without such tools, data gathering for complex problems becomes
a problem in itself. A problem doesn't necessarily have to be reproducible
to make it necessary to solve.

Linus Torvalds

Oct 31, 2002, 3:47:14 PM

to Matt D. Robinson, Rusty Russell, linux-...@vger.kernel.org, lkcd-g...@lists.sourceforge.net, lkcd-...@lists.sourceforge.net

On Wed, 30 Oct 2002, Matt D. Robinson wrote:

> Linus Torvalds wrote:
> > > Crash Dumping (LKCD)
> >
> > This is definitely a vendor-driven thing. I don't believe it has any
> > relevance unless vendors actively support it.
>

> There are people within IBM in Germany, India and England, as well as
> a number of companies (Intel, NEC, Hitachi, Fujitsu), as well as SGI
> that are PAID to support this.

That's fine. And since they are paid to support it, they can apply the
patches.

What I'm saying by "vendor driven" is that it has no relevance for the
standard kernel, and since it has no relevance to that, then I have no
incentives to merge it. The crash dump is only useful with people who
actively look at the dumps, and I don't know _anybody_ outside of the
specialized vendors you mention who actually do that.

I will merge it when there are real users who want it - usually as a
result of having gotten used to it through a vendor who supports it. (And
by "support" I do not mean "maintain the patches", but "actively uses it"
to work out the users' problems or whatever).

Horse before the cart and all that thing.

People have to realize that my kernel is not for random new features. The
stuff I consider important are things that people use on their own, or
stuff that is the base for other work. Quite often I want vendors to merge
patches _they_ care about long long before I will merge them (examples of
this are quite common, things like reiserfs and ext3 etc).

THAT is what I mean by vendor-driven. If vendors decide they really want
the patches, and I actually start seeing noises on linux-kernel or getting
requests for it being merged from _users_ rather than developers, then
that means that the vendor is on to something.

Linus

Jamie Lokier

Oct 31, 2002, 3:51:16 PM

to Richard J Moore, Linus Torvalds, linux-...@vger.kernel.org, n2...@ltc-eth1000.torolab.ibm.com, Rusty Russell

Richard J Moore wrote:
> With the two it is possible to implant tracepoints without having to
> code up specific printks: kprobes can be used to implant a probe,
> the probe handler can call LTT to record the event.

Hey, that _is_ useful. Me like. Me spent many times wondering what
gets called when, and hunting heisenbugs masked by printk slowness.

-- Jamie

bob

Oct 31, 2002, 4:01:30 PM

to Linus Torvalds, ka...@opersys.com, Rusty Russell, linux-...@vger.kernel.org, okr...@watson.ibm.com, okr...@us.ibm.com, fra...@us.ibm.com, LTT-Dev

Linus,

LTT is one step in allowing Linux to continue to move towards being a
viable alternative for more than just hackers. It is part of a larger
effort to provide reliability and serviceability. Concretely it allows
application/subsystem programmers to understand the performance of their
applications and the system. I should note, it also allows people to
improve kernel behavior as well. As we have communicated in the past, the
ability to gather and analyze this data is vital. From my correspondence
with Ingo:

"If you care about performance you will want to trace. On two previous
kernels I have worked on I've heard this comment ["we don't need tracing"].
Once the infrastructure was in it was used and appreciated." There were
world-class programmers involved in these projects that did not see the
value of such infrastructure until they were able to use it.

I think Karim provided a list of possible uses, there are countless
applications of this - I'll list some more:
  seeing where unexplained idle time is occurring
  understanding where interrupt processing time is going
  understanding interactions between applications - which is running when
  etc etc etc

If you look around the kernel, subsystems, and applications, you will find
growing numbers of one-off-ways of gathering this information. Providing a
unified way for different developers to communicate about performance will
significantly improve the ability to performance debug different
applications, drivers, system/application interaction, etc.

LTT has existed for a long time now and recent additions have been well
motivated: For a while now I have been working with the RAS team at IBM and
with Karim Yaghmour to streamline LTT and make it perform well on MPs. We
have addressed all the concerns raised by yourself, Ingo, and others from
previous postings. If there remains concern, it is also possible for one
to disable tracing. Some of the features we put into LTT came from ideas
we prototyped in K42 (www.research.ibm.com/K42) which in turn was developed
based on my experience writing a tracing infrastructure for IRIX while
working for SGI, and other's experiences with AIX's tracing facilities.

LTT is a valuable aspect in allowing developers using Linux to understand
their application's and the system's behavior. It serves to strengthen
Linux's RAS capabilities and would be great to get included into 2.5.
Thanks.

Thank you.

Robert Wisniewski
The K42 MP OS Project
Advanced Operating Systems
Scalable Parallel Systems
IBM T.J. Watson Research Center
914-945-3181
http://www.research.ibm.com/K42/
b...@watson.ibm.com

Larry McVoy

Oct 31, 2002, 4:20:55 PM

to bob, Linus Torvalds, ka...@opersys.com, Rusty Russell, linux-...@vger.kernel.org, okr...@watson.ibm.com, okr...@us.ibm.com, fra...@us.ibm.com, LTT-Dev

I don't mean to pick on LTT, I haven't used it, it may be the best thing
since sliced bread.

I can tell you how to present this and any other feature similar to this
in a way which would make me a lot more willing to accept it, which
presupposes I'm doing Linus' job which of course I am not. However,
it's likely that Linus has similar views but he gets to chime in and
speak for himself.

All of these tools/features/whatever add some cost. The cost can be
measured in lots of different ways:

- lines of code
- lines of code which can't be configed out
- call depth increases
- stack size increases
- cache foot print increases
- parallelism (think preempt)
- interface changes

I suspect there are other metrics and it would be very cool if others would
chime in with their pet peeves.

What would be cool is if there was some way to quantify as much as possible
of the accepted set of costs so that that could be balanced against the
value of the change, right?

The one that always gets me is

"I've added feature XYZ, I benchmarked it with <whatever, usually
LMbench> and it didn't make a difference"

That is almost certainly misleading. The real thing you want to do
is quantify the actual costs because there can be non-zero costs that
do not show up in benchmarks. For example, suppose that the benchmark
neatly fits in the onchip caches and it only uses 1/2 of those caches.
Your change could increase the cache foot print to just fill the caches,
the benchmark says no difference, you declare success and move on.
The problem is that almost all changes are good enough that they match
this description. Measuring them in isolation doesn't tell us enough.
If I combine two changes, both of which use up 1/2 the cache, there is
no longer any room for anything else in the cache.

I'd love to see a trend where patch requests for any non-trivial patch
included before/after data for the above metrics (and any others that
people see as useful). I'd love to see some people taking just one of
the above and making a tool which measures that metric. Then we combine
the tools into a "patch measurement suite" and start prefixing patches
with

Code changes:
+1234 -5678 = -4444 (all code)
+123 -567 = -444 (all code subject to CONFIG_XYZ)

Call depth:
+2 for read()
+2 for write()
no change for all other system calls

Stack size:
+2099 bytes for read()/write() path

Cache misses:
No change for benchmark1, 2, 3
12,000 data read misses for lat_ctx ....

Etc.

What does the list think of this?
--
---
Larry McVoy lm at bitmover.com http://www.bitmover.com/lm

Stephen Wille Padnos

Oct 31, 2002, 4:26:37 PM

to Alexander Viro, Dax Kelson, Chris Wedgwood, Rik van Riel, Linus Torvalds, Rusty Russell, linux-...@vger.kernel.org

Alexander Viro wrote:

>On 31 Oct 2002, Dax Kelson wrote:
>
>>I think the normal intent is to let Sally, Joe, and Bill have their own
>>private directory protected from THE REST OF THE USERS.
>>
>>If a member of your trusted circle goes rogue, then, yup you are screwed
>>for the moment. It shouldn't last a whole month though.
>>
>>That is what backups, and employment termination is for.
>>
>>
>
>Then give them all the same account and be done with that. Effect will
>be the same.
>
>

Unless I'm missing something, that only works if all the users need
*exactly* the same permissions to all files, which isn't a good assumption.

Example: Sally is an accountant, Joe and Bill are engineers.

Bill and Joe are working on a project, and Sally is cost control for
that project - they all need access to the project files. Bill and Joe
do not need access to officer salary data, but Sally does. Bill and Joe
need access to other projects (not necessarily the same ones), but Sally
doesn't. Oops.

- Steve

Oliver Xymoron

Oct 31, 2002, 4:40:33 PM

to Alexander Viro, Linus Torvalds, Rusty Russell, linux-...@vger.kernel.org

On Wed, Oct 30, 2002 at 09:43:29PM -0500, Alexander Viro wrote:
>
>
> On Wed, 30 Oct 2002, Linus Torvalds wrote:
>
> > > ext2/ext3 ACLs and Extended Attributes
> >
> > I don't know why people still want ACL's. There were noises about them for
> > > samba, but I've not heard anything since. Are vendors using this?
>
> Because People Are Stupid(tm). Because it's cheaper to put "ACL support: yes"
> > in the feature list under "Security" than to make sure that userland can cope
> with anything more complex than "Me Og. Og see directory. Directory Og's.
> Nobody change it". C.f. snake oil, P.T.Barnum and esp. LSM users

It's nearly useless in a Unix-only context, true, however there's a rather
serious impedance mismatch for serving files to Windows that this
addresses. Emulating ACLs on the fly with groups to fit into the
Windows model is mostly doable but ain't pretty.

--
"Love the dolphins," she advised him. "Write by W.A.S.T.E.."

Cort Dougan

Oct 31, 2002, 4:44:59 PM

to Larry McVoy, bob, Linus Torvalds, ka...@opersys.com, Rusty Russell, linux-...@vger.kernel.org, okr...@watson.ibm.com, okr...@us.ibm.com, fra...@us.ibm.com, LTT-Dev

An excellent engineering practice but extremely difficult to do. This is
the holy-grail of software design and I don't think it would work for an
extremely loosely connected set of developers.

There is no central control of the system (or chain of accountability) and
that knocks down the practicality of this plan. It would work extremely
well in another project, though.

} What does the list think of this?

Dr. Greg Wettstein

Oct 31, 2002, 4:43:43 PM

to Linus Torvalds, Rusty Russell, linux-...@vger.kernel.org

On Oct 30, 6:31pm, Linus Torvalds wrote:
} Subject: Re: What's left over.

> > ext2/ext3 ACLs and Extended Attributes
>
> I don't know why people still want ACL's. There were noises about
> them for samba, but I've not heard anything since. Are vendors using
> this?

I can offer a perspective from someone who has been struggling to get
Linux competitive in real-life enterprise situations.

ACL's are an issue for Linux (and Samba) in order for the combination
to sustain competitiveness against Novell and NT in the desktop
fileservices domain. The harsh reality of life is that file and
document sharing is a way of life in the environments where Novell
dominates. The appearance of ACL's and desktop support for their
management in NT would tend to confirm this.

Without the granularity of ACL's it becomes too difficult to establish
the types of permission environments needed to support what most
administrative and department support personnel (ie, secretaries) seem
to desire.

The patches also begin implementing a common API framework which
multiple filesystems seem to be able to leverage. At least the rumor
appears to be that the infrastructure allows common toolsets to be
used for both ext2/3, XFS and perhaps other filesystems which want to
implement ACL's.

It's a compilation option and if set to default minimizes the impact on
people who don't need or want the infrastructure. Ted also has his
fingers in the project which probably means that it isn't going to get
neglected.

Just my 2 cents.

Best wishes for a productive weekend to everyone.

Greg

}-- End of excerpt from Linus Torvalds

As always,
Dr. G.W. Wettstein, Ph.D. Enjellic Systems Development, LLC.
4206 N. 19th Ave. Specializing in information infra-structure
Fargo, ND 58102 development.
PH: 701-281-4950 WWW: http://www.enjellic.com
FAX: 701-281-3949 EMAIL: gr...@enjellic.com
------------------------------------------------------------------------------
"Open source code is not guaranteed nor does it come with a warranty."
-- the Alexis de Tocqueville Institute

"I guess that's in contrast to proprietary software, which comes with
a money-back guarantee, and free on-site repairs if any bugs are found."
-- Rary

Alexander Viro

Oct 31, 2002, 4:47:08 PM

to Stephen Wille Padnos, Dax Kelson, Chris Wedgwood, Rik van Riel, Linus Torvalds, Rusty Russell, linux-...@vger.kernel.org

On Thu, 31 Oct 2002, Stephen Wille Padnos wrote:

> >Then give them all the same account and be done with that. Effect will
> >be the same.
> >
> >
>
> Unless I'm missing something, that only works if all the users need
> *exactly* the same permissions to all files, which isn't a good assumption.

That's the point. In practice shared writable access to a directory can be
easily elevated to full control of each other's accounts, since most
userland code is written with the implicit assumption that nothing bad happens
to the directory structure under it. And there is nothing the kernel can do
about that - the attacker does an action you had explicitly allowed and your
program goes bonkers since it can't cope with that. The mechanism used to allow
that action doesn't enter the picture - be it ACLs, groups or something else.

bob

Oct 31, 2002, 4:50:37 PM

to Larry McVoy, bob, Linus Torvalds, ka...@opersys.com, Rusty Russell, linux-...@vger.kernel.org, okr...@watson.ibm.com, okr...@us.ibm.com, fra...@us.ibm.com, LTT-Dev

Larry McVoy writes:
> I don't mean to pick on LTT, I haven't used it, it may be the best thing
> since sliced bread.

...

> The one that always gets me is
>
> "I've added feature XYZ, I benchmarked it with <whatever, usually
> LMbench> and it didn't make a difference"

Larry,
You're right - whoever wrote that useless LMbench anyway :-)

I agree it would be great to have a tool that allows us to gather
information on some of what you suggest below - but it's hard - people in
software engineering have been working on such things for a long time.
Further, what you mention below does not make sense in isolation. For
example a package could add 1000 lines of code and have almost no impact,
while another 10 lines of code could make a huge difference. So while the
below metrics are fine, without arguing about the expected impact they're
not necessarily helpful.

That's why benchmarks are still helpful as they are indicative of what
expected performance might be. If you're trying to get at maintainability
then I might (being a K42 convert) argue for a different strategy
altogether.

So what about LTT then? Well, sure enough we did run LMbench and some other
tests. We ran a kernel compile, a tar, and LMbench - and posted results to
lkml. While this hardly represents all possibilities, showing little
performance impact on these is a positive statement about impact on other
applications.

To address some of the list below:
lines of code: a lot - almost all can be configed out
call depth increase: we can analyze - complicated since while it is a
                     couple levels - other calls in the code may be to
cache footprint: how? - simulate? this is tough - qualitatively I think for
                 ltt it is small because the same code is used across all
                 trace events. And less frequent trace events won't interfere
parallelism: not quite sure what you mean here - we now have a non-blocking
             lockless scheme to address what I think the concern is here
interface changes: I argue very very positive - as in my letter to Linus
                   getting various developers to talk about performance
                   with a common mechanism would be a big win

I'm sure this doesn't fully address your concerns - but if others feel some
of the below numbers are really important we can certainly go about getting
more accurate results than my above off-the-cuff info.

Robert Wisniewski
The K42 MP OS Project
Advanced Operating Systems
Scalable Parallel Systems
IBM T.J. Watson Research Center
914-945-3181
http://www.research.ibm.com/K42/
b...@watson.ibm.com

Stephen Frost

Oct 31, 2002, 5:06:41 PM

to Oliver Xymoron, Alexander Viro, Linus Torvalds, Rusty Russell, linux-...@vger.kernel.org

* Oliver Xymoron (oxym...@waste.org) wrote:
> On Wed, Oct 30, 2002 at 09:43:29PM -0500, Alexander Viro wrote:
> > Because People Are Stupid(tm). Because it's cheaper to put "ACL support: yes"
> > > in the feature list under "Security" than to make sure that userland can cope
> > with anything more complex than "Me Og. Og see directory. Directory Og's.
> > Nobody change it". C.f. snake oil, P.T.Barnum and esp. LSM users
>
> It's nearly useless in a Unix-only context, true, however there's a rather
> serious impedance mismatch for serving files to Windows that this
> addresses. Emulating ACLs on the fly with groups to fit into the
> Windows model is mostly doable but ain't pretty.

It's only nearly useless if you have some desire as an admin to
constantly be creating groups and changing group lists for users. This
is not a feature which is useful only when serving files to Windows
machines, not even nearly. AFS, Solaris, Irix etc have support for ACLs
and have a great deal of people who use them. The simple yet common
situation of one user who wants to give even just read access to
another specific user for a given file is a pain in the ass to deal with
given the current structure.

Stephen

Patrick Finnegan

Oct 31, 2002, 5:09:55 PM

to linux-...@vger.kernel.org, lkcd-g...@lists.sourceforge.net, lkcd-...@lists.sourceforge.net

On Thu, 31 Oct 2002, Linus Torvalds wrote:

>
> On Wed, 30 Oct 2002, Matt D. Robinson wrote:
>
> > Linus Torvalds wrote:
> > > > Crash Dumping (LKCD)
> > >
> > > This is definitely a vendor-driven thing. I don't believe it has any
> > > relevance unless vendors actively support it.
> >
> > There are people within IBM in Germany, India and England, as well as
> > a number of companies (Intel, NEC, Hitachi, Fujitsu), as well as SGI
> > that are PAID to support this.

To add to that list, here at Purdue University, we actively look at crash
dumps on other architectures, such as IBM AIX, and are starting to do the
same on Linux machines, after discovery of LKCD.

> What I'm saying by "vendor driven" is that it has no relevance for the
> standard kernel, and since it has no relevance to that, then I have no
> incentives to merge it. The crash dump is only useful with people who
> actively look at the dumps, and I don't know _anybody_ outside of the
> specialized vendors you mention who actually do that.

This has much relevance for the standard kernel, as much relevance as gdb
has for people using applications. While a majority of non-techno-geek
end-users probably don't care about the patch, I'm certain that there are
plenty of organizations out there like Purdue that WANT lkcd to become a
standard part of the Linux kernel. Until then, we're forced to do our
own kernel patching every time we push out a new kernel.

> I will merge it when there are real users who want it - usually as a
> result of having gotten used to it through a vendor who supports it. (And
> by "support" I do not mean "maintain the patches", but "actively uses it"
> to work out the users problems or whatever).

We actively use it.

> People have to realize that my kernel is not for random new features. The
> stuff I consider important are things that people use on their own, or
> stuff that is the base for other work. Quite often I want vendors to merge
> patches _they_ care about long long before I will merge them (examples of
> this are quite common, things like reiserfs and ext3 etc).

LKCD isn't a 'random new feature'. It's something that is present in
nearly every other "Unix" on the market. (Yes I know Unix != Linux). It's
a feature that should have been integrated by now IMHO.

> THAT is what I mean by vendor-driven. If vendors decide they really want
> the patches, and I actually start seeing noises on linux-kernel or getting
> requests for it being merged from _users_ rather than developers, then
> that means that the vendor is on to something.

Again, we're the end-user, not the vendor, and we're pushing to
have it included. I've talked with other sysadmins in my department
here at Purdue, and have gotten a unanimous response that "It would be a
good and useful feature to have."

Pat
--
Purdue University ITAP/RCS
Information Technology at Purdue
Research Computing and Storage
http://www-rcd.cc.purdue.edu

http://dilbert.com/comics/dilbert/archive/images/dilbert2040637020924.gif

Matt D. Robinson

Oct 31, 2002, 5:11:32 PM

to Linus Torvalds, Rusty Russell, linux-...@vger.kernel.org, lkcd-g...@lists.sourceforge.net, lkcd-...@lists.sourceforge.net

On Thu, 31 Oct 2002, Linus Torvalds wrote:

|>On Wed, 30 Oct 2002, Matt D. Robinson wrote:
|>That's fine. And since they are paid to support it, they can apply the
|>patches.

Sure, but why should they have to? What technical reason is there
for not including it, Linus?

I completely don't understand your reasoning here. I use it for my
home, not for work, and that's important for me. And not everyone
can spend their evenings rolling up the next set of patches for
a distribution. Yes, vendors want it, they need it, but there are
plenty of people like me that want this in too!

We want to see this in the kernel, frankly, because it's a pain
in the butt keeping up with your kernel revisions and everything
else that goes in that changes. And I'm sure SuSE, UnitedLinux and
(hopefully) Red Hat don't want to spend their time having to roll
this stuff in each and every time you roll a new kernel.

I mean, PLEASE, Linus, what do we have to do? There are so many
interests in this stuff, and I really, truly don't get what's wrong
with putting this in the kernel?

Have you looked at it? Have you looked at how it is now structured
to be non-invasive? How it will allow other kernel developers to
generate their own dumping methods? I mean, we sent you E-mails
weeks ago, and you didn't respond to any of them with even a word
of acknowledgement of receipt.

|>What I'm saying by "vendor driven" is that it has no relevance for the
|>standard kernel, and since it has no relevance to that, then I have no
|>incentives to merge it. The crash dump is only useful with people who
|>actively look at the dumps, and I don't know _anybody_ outside of the
|>specialized vendors you mention who actually do that.

I do. Others like myself do. And not just for development
purposes. I don't like to see my system crash after installing one
of your new kernels and not be able to figure out what's wrong.
The nice thing is that LKCD is there, it works, and I can just look
at the crash report instead of wishing that my console buffer
didn't just scroll off. Oh, I know, I'll just wait for it to
happen again ... yeah, like that's real intelligent.

|>I will merge it when there are real users who want it - usually as a
|>result of having gotten used to it through a vendor who supports it. (And
|>by "support" I do not mean "maintain the patches", but "actively uses it"
|>to work out the users problems or whatever).
|>
|>Horse before the cart and all that thing.
|>
|>People have to realize that my kernel is not for random new features. The
|>stuff I consider important are things that people use on their own, or
|>stuff that is the base for other work. Quite often I want vendors to merge
|>patches _they_ care about long long before I will merge them (examples of
|>this are quite common, things like reiserfs and ext3 etc).

Other vendors have merged LKCD a long time ago and use it, and
expect it to be there. And users like myself find it valuable on
their desktops, their servers, etc. I mean, there's someone using
this at Purdue that's responded to you, just another kernel user
that likes to have this stuff there automatically.

|>THAT is what I mean by vendor-driven. If vendors decide they really want
|>the patches, and I actually start seeing noises on linux-kernel or getting
|>requests for it being merged from _users_ rather than developers, then
|>that means that the vendor is on to something.

TurboLinux, MontaVista, Veritas, SuSE, and UnitedLinux have LKCD.
With the most recent changes, I think Red Hat can put LKCD in now
such that it isn't invasive to their distribution.

I think SuSE has already expressed a desire to have this in. If
you want to hear from others, I'll ask them to respond to you.

|> Linus

--Matt

Stephen Frost

Oct 31, 2002, 5:13:52 PM

to Alexander Viro, Stephen Wille Padnos, Dax Kelson, Chris Wedgwood, Rik van Riel, Linus Torvalds, Rusty Russell, linux-...@vger.kernel.org

* Alexander Viro (vi...@math.psu.edu) wrote:
> On Thu, 31 Oct 2002, Stephen Wille Padnos wrote:
> > Unless I'm missing something, that only works if all the users need
> > *exactly* the same permissions to all files, which isn't a good assumption.
>
> That's the point. In practice shared writable access to a directory can be
> easily elevated to full control of each others' accounts, since most of
> userland code is written under the implicit assumption that nothing bad
> happens with the directory structure under it. And there is nothing the
> kernel can do about that - the attacker does an action you had explicitly
> allowed and your program goes bonkers since it can't cope with that. The
> mechanism used to allow that action doesn't enter the picture - be it ACLs,
> groups or something else.

So you're not really arguing against ACLs, you're complaining that
userspace is broken when there's shared write access. That's fine,
userspace should be fixed, inclusion of ACLs into the kernel shouldn't
be denied because of this. ACLs should be optional, of course, and if
you want them, add some really noisy warnings about the problems of
shared writeable areas with current userspace tools. Of course, that same
warning should probably be included in 'groupadd'.

Stephen

Michael Shuey

Oct 31, 2002, 5:16:37 PM

to Linus Torvalds, Matt D. Robinson, Rusty Russell, linux-...@vger.kernel.org, lkcd-g...@lists.sourceforge.net, lkcd-...@lists.sourceforge.net

I'm a user, and I request that LKCD get merged into the kernel. :-)

On Thu, Oct 31, 2002 at 07:46:08AM -0800, Linus Torvalds wrote:
> What I'm saying by "vendor driven" is that it has no relevance for the
> standard kernel, and since it has no relevance to that, then I have no
> incentives to merge it. The crash dump is only useful with people who
> actively look at the dumps, and I don't know _anybody_ outside of the
> specialized vendors you mention who actually do that.

I actively look at LKCD dumps. I have no affiliation with SGI, IBM, or any
of the previously mentioned companies. I'm not aware of any vendors providing
pre-patched kernels with LKCD; right now my only option for reasonable crash
data is to patch and build my own kernel.

> I will merge it when there are real users who want it - usually as a
> result of having gotten used to it through a vendor who supports it. (And
> by "support" I do not mean "maintain the patches", but "actively uses it"
> to work out the users problems or whatever).

Here at Purdue University we're building several Linux clusters. LKCD is
most useful to help find in-kernel problems. Most of the time our crashes
are due to a flakey stick of RAM or a dying disk (or controller), but LKCD
dumps are still useful. With a crash dump I can analyze the cause of the
crash after the fact, but without a dump my only option to get _any_ crash
data is to leave a console plugged into each node of my clusters.

Do you feel like donating a 700-port console server? Right, so it's LKCD
for me then.

> People have to realize that my kernel is not for random new features. The
> stuff I consider important are things that people use on their own, or
> stuff that is the base for other work. Quite often I want vendors to merge
> patches _they_ care about long long before I will merge them (examples of
> this are quite common, things like reiserfs and ext3 etc).
>
> THAT is what I mean by vendor-driven. If vendors decide they really want
> the patches, and I actually start seeing noises on linux-kernel or getting
> requests for it being merged from _users_ rather than developers, then
> that means that the vendor is on to something.

I understand that Linux can't have random new features (especially going into
a feature-freeze). However, any additions that provide better debugging info
are (in my opinion, at any rate) worth it. Every other UNIX I've used (with
the possible exception of an early Ultrix) has some facility to inspect the
kernel - all have _at_least_ dumps that get written to a swap disk on a crash
and many have an in-core debugger. Running gdb on a live kernel from a
remote machine isn't unheard of, at least with other OSes. Unfortunately,
the only aid you'll get in debugging a Linux kernel is the source code. Sure,
you can add a mess of printk's all over suspect code, and yes, the console
gets a register dump on a panic, but that really isn't enough. Sometimes
it's nice to be able to walk through the kernel's data structures and figure
out just what was going on when things died. I get this with LKCD.

To that end, it'd be nice if the trace toolkit and SGI's kernel debugger were
added. No, I haven't used them, but then I don't do much kernel development
either. I'd bet that LTT and the kernel debugger would be very useful to
those who do, though.

--
Mike Shuey

Andi Kleen

Oct 31, 2002, 5:26:48 PM

to Rusty Russell, linux-...@vger.kernel.org, torv...@transmeta.com

Rusty Russell <ru...@rustcorp.com.au> writes:

> > > statfs64
> >
> > I haven't even seen it.
>
> It's fairly old, but Peter Chubb said there was some vendor interest
> for v. large devices. Peter?

statfs64 is needed when you want to access large NFS servers (>2TB is
becoming quite common for NAS) and want to have working "df" for them.

Currently it is scaled by wsize==blocksize, so it only breaks when
fileserversize/wsize > 2^31. For 1KB wsize it breaks with 2TB, with
4KB with 8TB etc. While 1KB wsize is arguably stupid (but happens sometimes
in practice), 8TB is not an unrealistic size for an NFS server these
days.

I did a hack to scale the NFS block size in stat to make sure it fits
into 31bit, but statfs64 would be the correct solution for it really.
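
The limits above follow from simple arithmetic. Here is a minimal sketch,
assuming the block count has to stay below 2^31 and that the workaround
doubles the reported block size until it does. The function names are
hypothetical and this is not the actual NFS code, just an illustration of
the scaling idea.

```python
BLOCK_COUNT_LIMIT = 2**31  # block counts must stay below this to fit in 31 bits


def max_reportable_bytes(wsize):
    """Largest filesystem size that can be reported with this block size."""
    return BLOCK_COUNT_LIMIT * wsize


def scaled_block_size(total_bytes, wsize):
    """Double the reported block size until the block count fits in 31 bits."""
    bsize = wsize
    while total_bytes // bsize >= BLOCK_COUNT_LIMIT:
        bsize *= 2
    return bsize


TB = 2**40
print(max_reportable_bytes(1024) // TB)   # limit in TB for 1KB wsize
print(max_reportable_bytes(4096) // TB)   # limit in TB for 4KB wsize
print(scaled_block_size(8 * TB, 1024))    # block size needed to report 8TB
```

With 1KB blocks the limit comes out at 2TB and with 4KB blocks at 8TB,
matching the numbers above. Scaling keeps "df" working but reports coarser
block sizes, which is why a 64-bit statfs is the cleaner fix.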

Also I would like to propose the nanosecond stat patches. They don't add
new system calls, but just use spare fields in the existing stat64
structure and close a hole in make.

-Andi

Linus Torvalds

Oct 31, 2002, 5:27:53 PM

to Matt D. Robinson, Rusty Russell, linux-...@vger.kernel.org, lkcd-g...@lists.sourceforge.net, lkcd-...@lists.sourceforge.net

[ Ok, this is a really serious email. If you don't get it, don't bother
emailing me. Instead, think about it for an hour, and if you still don't
get it, ask somebody you know to explain it to you. ]

On Thu, 31 Oct 2002, Matt D. Robinson wrote:
>
> Sure, but why should they have to? What technical reason is there
> for not including it, Linus?

There are many:

- bloat kills:

My job is saying "NO!"

In other words: the question is never EVER "Why shouldn't it be
accepted?", but it is always "Why do we really not want to live
without this?"

- included features kill off (potentially better) projects.

There's a big "inertia" to features. It's often better to keep
features _off_ the standard kernel if they may end up being
further developed in totally new directions.

In particular when it comes to this project, I'm told about
"netdump", which doesn't try to dump to a disk, but over the net.
And quite frankly, my immediate reaction is to say "Hell, I
_never_ want the dump touching my disk, but over the network
sounds like a great idea".

To me this says "LKCD is stupid". Which means that I'm not going to apply
it, and I'm going to need some real reason to do so - ie being proven
wrong in the field.

(And don't get me wrong - I don't mind getting proven wrong. I change my
opinions the way some people change underwear. And I think that's ok).

> I completely don't understand your reasoning here.

Tough. That's YOUR problem.

Linus

Alexander Viro

Oct 31, 2002, 5:32:38 PM

to Stephen Frost, Stephen Wille Padnos, Dax Kelson, Chris Wedgwood, Rik van Riel, Linus Torvalds, Rusty Russell, linux-...@vger.kernel.org

On Thu, 31 Oct 2002, Stephen Frost wrote:

> So you're not really arguing against ACLs, you're complaining that
> userspace is broken when there's shared write access. That's fine,
> userspace should be fixed, inclusion of ACLs into the kernel shouldn't
> be denied because of this. ACLs should be optional, of course, and if
> you want them, add some really noisy warnings about the problems of
> shared writeable areas with current userspace tools. Of course, that same
> warning should probably be included in 'groupadd'.

No. I'm saying that ACLs do not have a point until at least basic
userland gets ready for setups people want ACLs for. Adding features that
can't be used until $BIG_WORK is done is idiocy in the best case and
danger in the worst. Especially since $BIG_WORK does not depend on these
features.

Karim Yaghmour

Oct 31, 2002, 5:33:58 PM

to Larry McVoy, bob, Linus Torvalds, Rusty Russell, linux-...@vger.kernel.org, okr...@watson.ibm.com, okr...@us.ibm.com, fra...@us.ibm.com, LTT-Dev

Hello Larry,

First, thanks for your feedback.

I understand and share your concern about the use of micro-benchmarks
to qualify/quantify the impact of additional code on the kernel. This is
precisely the reason why I chose not to use micro-benchmarks in the
Usenix article I presented about LTT at the 2000 annual technical
conference. I was surprised to see some of the selection committee
members actually come up to me and say: "I'm so glad to see a paper
that doesn't use micro-benchmarks."

That's why we elected to create 2 separate sets of benchmarks, one
using real-life applications (kernel build, bzip2, etc.) and one
using LMbench. Personally, I would have been satisfied with just the
real-life applications, but I know that many folks on the LKML want
to see LMbench numbers, so we included those too. That said, I find
it very positive that you keep a healthy dose of self-criticism towards
your own tool, this is exactly the kind of stuff that makes LMbench so
good. So too is it with LTT. I've always been on the lookout for
reducing costs here and there while achieving maximal functionality.

Fortunately, repeated testing and analysis on LTT by many parties
using many tools have confirmed that the current LTT has very low
impact on many fronts, including static code modifications.

So, for example, we had one example run of LMbench where we ran kernel
compiles in the background (i.e. a script restarted the kernel
compile every time it ended). To make it as simple as possible, here's
the elapsed time taken to run LMbench on a 4x SMP system in the various
configurations:
---------------------------------------------------------------------
vanilla                                          14:27
vanilla+ltt+ltt off                              14:26
vanilla+ltt+ltt on                               14:31
vanilla+ltt+ltt on+daemon on                     14:32

vanilla+ltt+ltt on+kernel compile                15:03
vanilla+ltt+ltt on+kernel compiles+daemon on     15:13
---------------------------------------------------------------------

As you can see, the differences in percentages are all within the 2%
range we mentioned earlier.
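
The elapsed times above can be turned into percentages with a few lines
of arithmetic. This is only a sketch: the helper names are made up for the
illustration, and since there is no vanilla run with background compiles,
the loaded case is compared against the loaded run without the daemon.

```python
def seconds(mmss):
    """Convert an elapsed time written as MM:SS into seconds."""
    minutes, secs = mmss.split(":")
    return int(minutes) * 60 + int(secs)


def overhead_pct(baseline, measured):
    """Relative change of measured over baseline, in percent."""
    base = seconds(baseline)
    return 100.0 * (seconds(measured) - base) / base


# Idle system: compare each configuration against the vanilla run.
idle_runs = {"ltt off": "14:26", "ltt on": "14:31", "ltt on+daemon on": "14:32"}
for label, elapsed in idle_runs.items():
    print(f"{label:20s} {overhead_pct('14:27', elapsed):+.2f}%")

# Background kernel compiles: daemon on versus daemon off.
print(f"{'compiles+daemon on':20s} {overhead_pct('15:03', '15:13'):+.2f}%")
```

The largest idle-system difference is about 0.6% and the daemon adds about
1.1% under load, both inside the 2% range.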

To address the specific metrics you mentioned:

> Code changes:
We've posted diffstats with every patch we published on the LKML.

> Call depth:
We're talking 3 for syscalls and 2 for all other events in order to
reach the core tracing function proper (this could easily be reduced
by 1 if it's really a problem). Add 1 for the locking scheme and 3 for
the non-locking scheme. I'm not counting the calls we make to kernel
services, which somewhat goes to show that this is a flawed measure
because I've never seen any thorough analysis of call depths for
kernel services. Can't say that it wouldn't be an interesting
research project to see someone do that for the entire kernel, we
may find some interesting results.

> Stack size:
This really depends on the quantity of data being passed to the tracer,
which varies greatly from one event to the other. I can say this, however:
in all the testing I've seen done on LTT in the past, there has never
been a stack problem. This isn't an invitation for being reckless. I am
aware of stack issues and have been on the lookout for any related
problem.

> Cache misses:
Bob has said it best. I think the best that we can do about this is
to follow the known-to-be-good guidelines about cache interference.
The discussion Ingo and Bob had on this issue in relation to LTT,
for example, shows that we've thought this through.

Beyond everything I've said above, I'd invite you to download LTT and
try it out. I'm sure you'll see why this is important for Linux users.

BTW, while I'm on the subject of LMbench, I've been trying to find a
way to run it on an embedded system. The problem is that this thing
needs a compiler and that would mean having to cross-compile gcc itself
and so on, which creates storage problems etc. Are there any plans to
make a mini-LMbench?

Thanks again,

Karim

===================================================
Karim Yaghmour
ka...@opersys.com
Embedded and Real-Time Linux Expert
===================================================

Richard Gooch

Oct 31, 2002, 5:38:07 PM

to Alexander Viro, Stephen Wille Padnos, Dax Kelson, Chris Wedgwood, Rik van Riel, Linus Torvalds, Rusty Russell, linux-...@vger.kernel.org

Alexander Viro writes:
> On Thu, 31 Oct 2002, Stephen Wille Padnos wrote:
>
> > >Then give them all the same account and be done with that. Effect will
> > >be the same.
> >
> > Unless I'm missing something, that only works if all the users need
> > *exactly* the same permissions to all files, which isn't a good assumption.
>
> That's the point. In practice shared writable access to a directory
> can be easily elevated to full control of each others' accounts,

^^^^^^
While that may be true in theory, in practice it's not necessarily the
case. Many people don't have the expertise to make use of such
exploits. And before you say that they can download a pre-cooked
exploit kit, let me tell you that there are plenty of people who don't
have the time or inclination to do that.

I've seen you talk about these kinds of things before, and you always
seem to be talking about the typical nightmarish undergrad CS lab
where the kids spend all their time trying to crack each other and the
system. And I'm not saying that these don't exist: I've seen it.

But there are other environments (say a research lab with grad
students, post-docs and faculty) where the inhabitants either don't
have the skills or don't have the interest in cracking accounts.
Everyone is too busy doing their own research. Cracking the mysteries
of the universe seems to be more interesting.

So group write access and ACL's *can* lead to wanton cracking, but for
many environments it's not an issue. For many, the dangers lie outside
the firewall, not inside.

Note that I'm not specifically advocating ACL's, I'm just letting you
know that the problem you're concerned about is, for good reason, not
a problem for everyone.

I will note that one appealing aspect of ACL's is that they do not
require administrator intervention. That's good for a user who just
wants to set something up without having to wait for the sysadmin.
It's also good for the sysadmin (excepting control freaks) who doesn't
want to do things that the users can (or should) actually be doing by
themselves.

Regards,

Richard....
Permanent: rgo...@atnf.csiro.au
Current: rgo...@ras.ucalgary.ca

Linus Torvalds

Oct 31, 2002, 5:40:08 PM

to Oliver Xymoron, Alexander Viro, Rusty Russell, linux-...@vger.kernel.org

Note that as far as ACL's go, enough people have convinced me that we want
them, with clear real-life issues. So don't worry about them, I'll merge
it.

Linus

Linus Torvalds

Oct 31, 2002, 5:41:52 PM

to Alexander Viro, Stephen Frost, Stephen Wille Padnos, Dax Kelson, Chris Wedgwood, Rik van Riel, Rusty Russell, linux-...@vger.kernel.org

On Thu, 31 Oct 2002, Alexander Viro wrote:
>
> No. I'm saying that ACLs do not have a point until at least basic
> userland gets ready for setups people want ACLs for. Adding features that
> can't be used until $BIG_WORK is done is idiocy in the best case and
> danger in the worst. Especially since $BIG_WORK does not depend on these
> features.

I think samba alone counts as enough user-land usage.

And if it turns out nobody else ever wants to use them, that's fine too.

Linus

Matt D. Robinson

Oct 31, 2002, 5:48:46 PM

to Linus Torvalds, Rusty Russell, linux-...@vger.kernel.org, lkcd-g...@lists.sourceforge.net, lkcd-...@lists.sourceforge.net

On Thu, 31 Oct 2002, Linus Torvalds wrote:
|>[ Ok, this is a really serious email. If you don't get it, don't bother
|> emailing me. Instead, think about it for an hour, and if you still don't
|> get it, ask somebody you know to explain it to you. ]

Thanks for the response. I don't think I need an hour. This is
pretty simple.

This isn't bloat. If you want, it can be built as a module, and
not as part of your kernel. How can that be bloat? People who
build kernels can optionally build it in, but we're not asking
that it be turned on by default, rather, built as a module so
people can load it if they want to. We made it into a module
because 18 months ago you complained about it being bloat. We
addressed your concerns.

Some people, particularly large SSI configurations, can't live
without this. You shouldn't crash once. Crashing twice, or
more often, is inexcusable.

|> - included features kill off (potentially better) projects.
|>
|> There's a big "inertia" to features. It's often better to keep
|> features _off_ the standard kernel if they may end up being
|> further developed in totally new directions.

I can't argue against this ... to do so would mean that you don't
accept any new features for 2.5, and there are a lot of projects
like mine that need to go in, although I do understand your concerns.

|> In particular when it comes to this project, I'm told about
|> "netdump", which doesn't try to dump to a disk, but over the net.
|> And quite frankly, my immediate reaction is to say "Hell, I
|> _never_ want the dump touching my disk, but over the network
|> sounds like a great idea".

We've integrated the "netdump" capabilities as a dump method
for LKCD. It's an option for dumping, just like all the other
dump methods available to people. Want to dump to disk? Use
LKCD. Want to dump on the network? USE LKCD. What's wrong
with that?

We have a net dump method from Mohammed Abbas (modified from Ingo's
netconsole dump) that allows you to dump across the network.
It integrates into LKCD beautifully. If you want that patch with
the rest of our LKCD patches, we can include it, no problem.

|>To me this says "LKCD is stupid". Which means that I'm not going to apply
|>it, and I'm going to need some real reason to do so - ie being proven
|>wrong in the field.

Hopefully some of this changes your mind.

|>> I completely don't understand your reasoning here.
|>
|>Tough. That's YOUR problem.

It is. I lose sleep because this is my problem. I lose time on
the weekends because this is my problem.

If you've _reviewed_ the LKCD patches and still have the opinions
you've mentioned above, then I'll consider this your position and
be done with it. Otherwise, please accept the code.

We'll keep doing our best to keep up with your kernels in the
meantime.

|> Linus

--Matt

Linus Torvalds

Oct 31, 2002, 5:57:27 PM

to Matt D. Robinson, Rusty Russell, linux-...@vger.kernel.org, lkcd-g...@lists.sourceforge.net, lkcd-...@lists.sourceforge.net

On Thu, 31 Oct 2002, Matt D. Robinson wrote:
>
> This isn't bloat. If you want, it can be built as a module, and
> not as part of your kernel. How can that be bloat?

I don't care one _whit_ about the size of the binary. I don't maintain
binaries, and the binary can be gigabytes for all I care.

The only thing I care about is source code. So the "build it as a module
and it is not bloat" argument is a total nonsense thing as far as I'm
concerned.

Anyway, new code is always bloat to me, unless I see people using them.

Guys, why do you even bother trying to convince me? If you are right, you
will be able to convince other people, and that's the whole point of open
source.

Being "vendor-driven" is _not_ a bad thing. It only means that _I_ am not
personally convinced. I'm only one person.

Linus

Dave Craft

Oct 31, 2002, 5:58:54 PM

to Linus Torvalds, Matt D. Robinson, Rusty Russell, linux-...@vger.kernel.org, lkcd-g...@lists.sourceforge.net, lkcd-...@lists.sourceforge.net

On Thu, 31 Oct 2002, Linus Torvalds wrote:

> What I'm saying by "vendor driven" is that it has no relevance for the
> standard kernel, and since it has no relevance to that, then I have no
> incentives to merge it. The crash dump is only useful with people who
> actively look at the dumps, and I don't know _anybody_ outside of the
> specialized vendors you mention who actually do that.

Unfortunately the vast majority of the customers I deal with
buy a distribution and then put a kernel from kernel.org
on. I believe this comes about because of either needing fixes
or function that appear in later kernels that have not made
it to the distributions' kernels yet.

Even if the distribution included LKCD in their kernel,
I lose lots of debug ability once customers switch over to
kernel.org and no longer have the LKCD patch.

Thus we are currently left with having to maintain LKCD patches for
many arbitrary kernel.org kernels and convince customers to apply
it BEFORE they start encountering problems that we'll have to look at.
Application of patches that aren't automatically included in kernel.org
rarely happens with our customer set (before problems occur),
no matter how much we flag the issue to them up front.

I realize that while my current capacity makes me fall into
the 'vendor' support you speak of, I believe I am actually
advocating its inclusion on behalf of real live customers.

Vendors can and do actually help linux development, by screening,
researching fixes, and/or directly fixing lots of customer
problems that you never have to deal with. To do that, LKCD
is the debug weapon of choice.

I request you reconsider the inclusion of LKCD.

Regards, Dave

Mail : da...@austin.ibm.com Phone : 512-838-8248

Oliver Xymoron

Oct 31, 2002, 6:06:33 PM

to Linus Torvalds, Alexander Viro, Rusty Russell, linux-...@vger.kernel.org

On Thu, Oct 31, 2002 at 09:38:41AM -0800, Linus Torvalds wrote:
>
> Note that as far as ACL's go, enough people have convinced me that we want
> them, with clear real-life issues. So don't worry about them, I'll merge
> it.

Ok, so now let's work on a Documentation/filesystems patch pointing
out a few of the common pitfalls, as I definitely agree they invite
some grave mistakes and are best avoided in most scenarios.

- /tmp-style symlink issues on shared directories
- vast majority of software (including security tools) ACL-unaware
- much harder to check for correctness

Al, I'm sure you have more..

--
"Love the dolphins," she advised him. "Write by W.A.S.T.E.."

Nicholas Wourms

Oct 31, 2002, 6:08:34 PM

to linux-...@vger.kernel.org

Trever L. Adams wrote:

> On Wed, 2002-10-30 at 21:31, Linus Torvalds wrote:
>
>> > ext2/ext3 ACLs and Extended Attributes
>>
>> I don't know why people still want ACL's. There were noises about them
>> for samba, but I've not heard anything since. Are vendors using this?
>>
>

> I am sure I don't count (not being a vendor), but Intermezzo offers
> support for this (they are waiting on feature freeze to redo it to 2.5
> according to an email I have). I want this stuff. Yes, u+g+w is nice,
> but good ACLs are even better. Please, if this is technically correct
> in implementation, do put it in.
>

I agree, having them is far better than the standard u+g+w that's been
around for ages. I think it gives the "finer" grain of control over your
system that a lot of users may desire. Not to mention the fact that ACL's
are well supported by the recently merged XFS. If I'm not mistaken, AFS
uses them as well. I *really* don't see the overhead cost here in terms of
compiled kernel size when they are turned off. As for the size of the
source tarball, who cares? People should quit whining about the size of
the sources and get over it! Storage is cheap and broadband is in
widespread use.

Cheers,
Nicholas

Chris Friesen

Oct 31, 2002, 6:15:00 PM

to Linus Torvalds, Matt D. Robinson, Rusty Russell, linux-...@vger.kernel.org, lkcd-g...@lists.sourceforge.net, lkcd-...@lists.sourceforge.net

Linus Torvalds wrote:

> In particular when it comes to this project, I'm told about
> "netdump", which doesn't try to dump to a disk, but over the net.
> And quite frankly, my immediate reaction is to say "Hell, I
> _never_ want the dump touching my disk, but over the network
> sounds like a great idea".
>
> To me this says "LKCD is stupid". Which means that I'm not going to apply
> it, and I'm going to need some real reason to do so - ie being proven
> wrong in the field.

How do you deal with netdump when your network driver is what caused the
crash?

Ideally I would like to see a dump framework that can have a number of
possible dump targets. We should be able to dump to any combination of
network, serial, disk, flash, unused ram that isn't wiped over restarts,
etc...
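
A framework like that could be sketched roughly as follows. This is only
an illustration: every name in it (dump_target, dump_register,
dump_write_all, and the two example targets) is hypothetical and is not
taken from LKCD or netdump.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical interface: one entry per place a dump can be written. */
struct dump_target {
    const char *name;                          /* e.g. "disk", "net", "ram" */
    int (*write)(const void *buf, size_t len); /* returns 0 on success */
};

#define MAX_TARGETS 8

static struct dump_target *targets[MAX_TARGETS];
static size_t ntargets;

/* Register a target; returns 0 on success, -1 if the table is full. */
int dump_register(struct dump_target *t)
{
    assert(t != NULL && t->write != NULL);
    if (ntargets == MAX_TARGETS)
        return -1;
    targets[ntargets++] = t;
    return 0;
}

/*
 * Try every registered target and return how many accepted the dump.
 * A failing target (say, the network driver that caused the crash)
 * does not stop the remaining targets from being tried.
 */
size_t dump_write_all(const void *buf, size_t len)
{
    size_t ok = 0;
    for (size_t i = 0; i < ntargets; i++)
        if (targets[i]->write(buf, len) == 0)
            ok++;
    return ok;
}

/* Example target standing in for a network path that is down. */
int broken_net_write(const void *buf, size_t len)
{
    (void)buf;
    (void)len;
    return -1;
}

/* Example target standing in for RAM that survives a restart. */
static char ram_buf[64];

int ram_write(const void *buf, size_t len)
{
    if (len > sizeof(ram_buf))
        return -1;
    memcpy(ram_buf, buf, len);
    return 0;
}
```

With both example targets registered, dump_write_all() reports one
success: the broken network target fails, and the RAM target still gets
the data.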

Chris

--
Chris Friesen | MailStop: 043/33/F10
Nortel Networks | work: (613) 765-0557
3500 Carling Avenue | fax: (613) 765-2986
Nepean, ON K2H 8E9 Canada | email: cfri...@nortelnetworks.com

Andrew Morton

Oct 31, 2002, 6:19:52 PM

to Linus Torvalds, Matt D. Robinson, Rusty Russell, linux-...@vger.kernel.org, lkcd-g...@lists.sourceforge.net, lkcd-...@lists.sourceforge.net

Linus Torvalds wrote:
>
> [ lkcd ]
>

We'll be spending the next six months stabilising and hardening
the used-to-be-2.5 kernel. If grunts like me can get hold of a
copy of the other person's kernel image from time-of-crash, that
has a ton of value.

(Disclaimer: I've never used lkcd. I'm assuming that it's
possible to gdb around in a dump)

> In particular when it comes to this project, I'm told about
> "netdump", which doesn't try to dump to a disk, but over the net.

It could help. But like serial console, the random person whose
kernel just died often can't be bothered setting it up, or simply
doesn't have the gear, or the crash is not repeatable.

So. _If_ lkcd gives me gdb-able images from time-of-crash, I'd
like it please. And I'm the grunt who spent nearly two years
doing not much else apart from working 2.3/2.4 oops reports.

Oh, and as Rusty has pointed out, we lose a *lot* of oops reports
because users are in X and the backtrace doesn't make it to the
logs. Rusty has a little app which dumps just the oops report to
disk somewhere. Want that too.

Oliver Xymoron

Oct 31, 2002, 6:22:09 PM

to Linus Torvalds, linux-kernel

On Thu, Oct 31, 2002 at 09:25:21AM -0800, Linus Torvalds wrote:
> (And don't get me wrong - I don't mind getting proven wrong. I change my
> opinions the way some people change underwear. And I think that's ok).

As in 'sometimes not even when hundreds of people start haranguing me
about it in public forums'?

Perhaps not the best analogy.

--
"Love the dolphins," she advised him. "Write by W.A.S.T.E.."

Patrick Finnegan

Oct 31, 2002, 6:24:16 PM

to Linus Torvalds, Matt D. Robinson, Rusty Russell, linux-...@vger.kernel.org, lkcd-g...@lists.sourceforge.net, lkcd-...@lists.sourceforge.net

On Thu, 31 Oct 2002, Linus Torvalds wrote:

>
> On Thu, 31 Oct 2002, Matt D. Robinson wrote:
> >
> > This isn't bloat. If you want, it can be built as a module, and
> > not as part of your kernel. How can that be bloat?
>
> I don't care one _whit_ about the size of the binary. I don't maintain
> binaries, and the binary can be gigabytes for all I care.
>
> The only thing I care about is source code. So the "build it as a module
> and it is not bloat" argument is a total nonsense thing as far as I'm
> concerned.

So, you don't like bloat, such as having 22 different file systems (only
including the ones that can be placed on disk, not things like devfs or
smbfs...). That's more filesystems than I have dollars in my wallet at
the moment. For the amount of utility that this code provides, it's
definitely not 'bloat'.

> Anyway, new code is always bloat to me, unless I see people using them.

HEY!!! WE'RE USING IT!!!

> Guys, why do you even bother trying to convince me? If you are right, you
> will be able to convince other people, and that's the whole point of open
> source.

Now this sounds more like something I'd hear from Sun trying to get a fix
for a version of Solaris without having to buy a new one. I thought the
whole point of Free Software was sharing with the community, and doing
what's best for the community.

> Being "vendor-driven" is _not_ a bad thing. It only means that _I_ am not
> personally convinced. I'm only one person.

That's the same as claiming that George W. Bush is just one person....

So I'll plead yet again, please add LKCD!

Pat
--
Purdue University ITAP/RCS
Information Technology at Purdue
Research Computing and Storage
http://www-rcd.cc.purdue.edu

http://dilbert.com/comics/dilbert/archive/images/dilbert2040637020924.gif

Linus Torvalds

Oct 31, 2002, 6:26:31 PM

to Chris Friesen, Matt D. Robinson, Rusty Russell, linux-...@vger.kernel.org, lkcd-g...@lists.sourceforge.net, lkcd-...@lists.sourceforge.net

On Thu, 31 Oct 2002, Chris Friesen wrote:
>
> How do you deal with netdump when your network driver is what caused the
> crash?

Actually, from a driver perspective, _the_ most likely driver to crash is
the disk driver.

That's from years of experience. The network drivers are a lot simpler,
the hardware is simpler and more standardized, and doesn't do as many
things. It's just plain _easier_ to write a network driver than a disk
driver.

Ask anybody who has done both.

But that's not the real issue. The real issue is that I have no personal
incentives to try to merge the thing, and as a result I think I'm the
wrong person to do so. I've told people over and over again that I think
this is a "vendor merge", and I'm fed up with people not _getting_ it.

Don't bother to ask me to merge the thing, that only makes me get even
more fed up with the whole discussion. This is open source, guys. Anybody
can merge it. Because I don't particularly believe in it doesn't mean that
it cannot be used. It only means that I want to see users flock to it and
show my beliefs wrong.

Linus

Linus Torvalds

Oct 31, 2002, 6:30:03 PM

to Oliver Xymoron, linux-kernel

On Thu, 31 Oct 2002, Oliver Xymoron wrote:
>
> Perhaps not the best analogy.

Heh. I like my analogies bad. The best analogies should make you go
"huh!" - kind of like a pink poodle in a tutu.

Linus

Nicholas Wourms

Oct 31, 2002, 6:31:17 PM

to linux-...@vger.kernel.org

Chris Wedgwood wrote:

> On Wed, Oct 30, 2002 at 11:48:23PM -0700, Dax Kelson wrote:
>
>> Technically speaking you can achieve ACL like permissions/behavior
>> using the historical UNIX security model by creating a group EACH
>> time you run into a unique case permission scenario.
>
> I'm not arguing against this... I'm claiming POSIX ACLs are mostly
> brain-dead and almost worthless (broken by committee pressure and too
> many people making stupid concessions).
>
> If we must have ACLs, why not do it right?
>
>> Without ACLs, if Sally, Joe and Bill need rw access to a file/dir,
>> just create another group with just those three people in. Over
>> time, of course, this leads to massive group proliferation. Without
>> Tim Hockin's patch, 32 groups is maximum number of groups a user can
>> be a member of.
>
> How many people actually need this level of complexity?
>
> Why are we adding all this shit and bloat because of perceived
> problems most people don't have? What next, some kind of misdesigned
> in-kernel CryptoAPI?

Get over it! If you haven't noticed, CryptoAPI is merged already. The only
bloat ACLs cause is the size of the source tarball. If your connection is
slow or you are out of diskspace, too bad! I'm sure I'm not the only one
who is tired of hearing people whine about "bloat" wrt the sources and
demanding that features they don't use be ignored. No one (non-core)
feature will be useful to everyone, that is a given fact. The point is
that while you see no use for it, there are many others out there who do.
ACLs are something which have existed in the Solaris/BSD world for a long
time now, and people who have administered these boxen find ACLs to be quite
useful.

Cheers,
Nicholas

Alan Cox

Oct 31, 2002, 6:34:35 PM

to Chris Friesen, Linus Torvalds, Matt D. Robinson, Rusty Russell, Linux Kernel Mailing List, lkcd-g...@lists.sourceforge.net, lkcd-...@lists.sourceforge.net

On Thu, 2002-10-31 at 18:10, Chris Friesen wrote:
> > To me this says "LKCD is stupid". Which means that I'm not going to apply
> > it, and I'm going to need some real reason to do so - ie being proven
> > wrong in the field.
>
> How do you deal with netdump when your network driver is what caused the
> crash?

Netdump drives the system itself. Any dump driver has to, as it can't
assume the system is in a remotely sane state.

John Alvord

Oct 31, 2002, 6:35:38 PM

to Linus Torvalds, Matt D. Robinson, Rusty Russell, linux-...@vger.kernel.org, lkcd-g...@lists.sourceforge.net, lkcd-...@lists.sourceforge.net

On Thu, 31 Oct 2002 09:54:54 -0800 (PST), Linus Torvalds
<torv...@transmeta.com> wrote:

>Guys, why do you even bother trying to convince me? If you are right, you
>will be able to convince other people, and that's the whole point of open
>source.
>
>Being "vendor-driven" is _not_ a bad thing. It only means that _I_ am not
>personally convinced. I'm only one person.

It sounds to me like there needs to be L-K traffic when problems are
solved using LKCD.

Personally I love crash dumps... in 33 years of computing I have spent
a total of 1-2 years doing nothing but enhancing and developing
post-processing facilities. The true benefit is not just the "crashed
here, add a null check nonsense". It is the ability to examine the
whole system state. With an inboard trace table, you can even go back
in time. You can look at call stacks, locks held, state of allocated
memory, etc etc. If you save callstacks and time with allocated
memory, you can track down storage growth problems. I have spent weeks
winkling problems out of crash dumps, solving problems the developers
didn't even know existed.
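
The storage-growth technique can be illustrated with a small sketch. It
is not taken from any real dump tool: the record layout and the function
names are hypothetical, and a single caller string stands in for a full
saved call stack.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

/* One record per allocation, as it might be found in a dump. */
struct alloc_record {
    const void *ptr;    /* address that was handed out */
    size_t size;        /* bytes allocated */
    const char *caller; /* stand-in for a saved call stack */
    time_t when;        /* time of allocation */
    int live;           /* 1 until the block is freed */
};

#define MAX_RECORDS 1024

static struct alloc_record records[MAX_RECORDS];
static size_t nrecords;

/* Remember an allocation; returns 0 on success, -1 if the table is full. */
int track_alloc(const void *ptr, size_t size, const char *caller, time_t when)
{
    assert(ptr != NULL && caller != NULL);
    if (nrecords == MAX_RECORDS)
        return -1;
    records[nrecords++] = (struct alloc_record){ ptr, size, caller, when, 1 };
    return 0;
}

/* Mark the matching live allocation as freed. */
void track_free(const void *ptr)
{
    for (size_t i = 0; i < nrecords; i++) {
        if (records[i].live && records[i].ptr == ptr) {
            records[i].live = 0;
            return;
        }
    }
}

/*
 * Bytes still held by `caller` that were allocated before `cutoff`.
 * Old memory that is never freed is the signature of a growth problem.
 */
size_t live_bytes_older_than(const char *caller, time_t cutoff)
{
    size_t total = 0;
    for (size_t i = 0; i < nrecords; i++) {
        const struct alloc_record *r = &records[i];
        if (r->live && r->when < cutoff && strcmp(r->caller, caller) == 0)
            total += r->size;
    }
    return total;
}
```

Run against the records in a dump, a query like this points at the call
sites whose old allocations are still outstanding.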

With the right facility you can take crash dump snapshots and keep on
running. It is a great tool for understanding a system.

But until there is a flow of results - good quality fixes - resulting
from such analysis, I can see exactly why LT is doubtful.

john alvord

Patrick Mochel

Oct 31, 2002, 6:43:01 PM

to Dave Craft, Linus Torvalds, Matt D. Robinson, Rusty Russell, linux-...@vger.kernel.org, lkcd-g...@lists.sourceforge.net, lkcd-...@lists.sourceforge.net

On Thu, 31 Oct 2002, Dave Craft wrote:

> On Thu, 31 Oct 2002, Linus Torvalds wrote:
>
> > What I'm saying by "vendor driven" is that it has no relevance for the
> > standard kernel, and since it has no relevance to that, then I have no
> > incentives to merge it. The crash dump is only useful with people who
> > actively look at the dumps, and I don't know _anybody_ outside of the
> > specialized vendors you mention who actually do that.
>
> Unfortunately the vast majority of the customers I deal with
> buy a distribution and then put a kernel from kernel.org
> on. I believe this comes about because of either needing fixes
> or function that appear in later kernels that have not made
> it to the distributions kernels yet.
>
> Even if the distribution included LKCD in their kernel,
> I lose lots of debug ability once customers switch over to
> kernel.org and no longer have the LKCD patch.
>
> Thus we are currently left with having to maintain LKCD patches for
> many arbitrary kernel.org kernels and convince customers to apply
> it BEFORE they start encountering problems that we'll have to look at.
> Application of patches that aren't automatically included in kernel.org
> rarely happens with our customer set (before problems occur),
> no matter how much we flag the issue to them up front.

So, this is precisely where something like OSDL's Carrier Grade and Data
Center working groups can come into play, amazingly enough.

By now, nearly everyone has heard about the working groups and nearly
every developer that has, despises them. Even I resist association with
them. But, they can have some real value to the vendors and the OEMs in
exactly the way you describe.

Take for example DCL. It's a kernel tree with several base patches
intended to make Linux better in the data center. The base is not fancy,
and includes things like LKCD and kdb (I think). It's actively maintained
and updated more often than Linus makes a release (by virtue of
bitkeeper).

The intent is to later have multiple child trees that implement features
for a specific application space (e.g. databases), while maintaining the
same base set of features. People wishing to use the most recent kernel
with those features can use the DCL tree directly. Or an OEM FAE can use
the tree to build something for the vendor, or add extra features.

Note that it's not a distribution. We don't even make real releases, since
we don't create tarballs or patches (it's only in BK, which actually kinda
sucks). It's merely a means to have these features actively maintained and
kept in synch.

And really, that's what everyone wants. Linus doesn't want the features,
as don't other developers, regardless of the Buzzword or Coolness factors.
Some vendors and users do want them. The developers of the features and
distributors of features don't want to deal with the tedium and pain of
updating patches each and every release.

In the end, it comes down to the fact that Linus's tree is Linus's tree.
Other people can have their trees. I'm not going to tell you go off and
make your own if you want those features so bad, because I know what a
pain in the ass it is, and I know having someone else do it is a lot
easier.

DCL and CGL have their trees, for purposes probably very very similar to
what your customers need. I encourage you to check them out and work with
them (or talk to people in your company that are). Try and make it work,
and everyone can be happy (relatively). And, if DCL and CGL aren't
satisfying the space that you need, please speak up to OSDL and the
working groups. People are listening, and willing to take your suggestions
into consideration.

Relevant URLs:

http://osdl.org/projects/cgl/
http://osdl.org/projects/dcl/

-pat "kissing serious butt" mochel

Alan Cox

Oct 31, 2002, 6:46:59 PM

to sh...@purdue.edu, Linus Torvalds, Matt D. Robinson, Rusty Russell, Linux Kernel Mailing List, lkcd-g...@lists.sourceforge.net, lkcd-...@lists.sourceforge.net

On Thu, 2002-10-31 at 17:13, Michael Shuey wrote:
> I'm a user, and I request that LKCD get merged into the kernel. :-)

> Do you feel like donating a 700-port console server? Right, so it's LKCD
> for me then.

Wouldn't you rather they neatly tftp'd dumps to a nominated central
server which noticed the arrival, did the initial processing with a perl
script and mailed you a summary ?

Linus Torvalds

Oct 31, 2002, 6:51:41 PM

to Chris Wedgwood, Jeff Garzik, Dax Kelson, Rik van Riel, Rusty Russell, linux-...@vger.kernel.org

On Thu, 31 Oct 2002, Chris Wedgwood wrote:
>
> It's synchronous and assumes everything is synchronous. Lots of
> hardware (most) doesn't work that way.

Think of it another way: many users will likely _require_ atomic
encryption / decryption (done in softirq contexts etc), and thus a
synchronous interface. Also, it simplifies the code and makes it more
efficient.

Any hardware that needs to go off and think about how to encrypt something
sounds like it's so slow as to be unusable. I suspect that anything that
is over the PCI bus is already so slow (even if it adds no extra cycles of
its own) that you're better off using the CPU for the encryption rather
than some external hardware.

In short, from what I can tell, there is no huge actual reason to ever
allow an asynchronous interface. Such interfaces are likely fine for things
like network cards that can do encryption on their own on outgoing or
incoming packets, but that is not a general-purpose encryption engine, and
would not merit being part of an encryption library anyway.

[ Such a card is just a way to _avoid_ using the encryption library - the
same way we can avoid using the checksumming stuff for network cards
that can do their own checksums ]

We'll see. I'd rather have a simpler interface that works for all relevant
cases today, and then if external crypto chips end up being common and
sufficiently efficient, we can always re-consider. Are the DMA-over-PCI
roundtrip (and resulting cache invalidations) overheads really worth the
extra hardware?
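
As an illustration of what "synchronous" means for a caller, here is a
minimal sketch. It is not the kernel's CryptoAPI: the struct and function
names are hypothetical, and the XOR transform is a toy placeholder rather
than real encryption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical synchronous transform: encrypt() returns only once dst
 * has been filled in, so there is no callback and nothing to wait for.
 */
struct sync_cipher {
    uint8_t key;
    void (*encrypt)(const struct sync_cipher *c,
                    uint8_t *dst, const uint8_t *src, size_t len);
};

/* Toy XOR transform, a placeholder only; it offers no security. */
void xor_encrypt(const struct sync_cipher *c,
                 uint8_t *dst, const uint8_t *src, size_t len)
{
    for (size_t i = 0; i < len; i++)
        dst[i] = src[i] ^ c->key;
}

/*
 * The caller's view: one plain function call, done in place. Nothing
 * here sleeps, which is what makes the call usable from atomic contexts.
 */
void encrypt_packet(const struct sync_cipher *c, uint8_t *pkt, size_t len)
{
    assert(c != NULL && c->encrypt != NULL);
    c->encrypt(c, pkt, pkt, len);
}
```

An asynchronous design would instead have to hand back a completion
handle or take a callback, and every caller would need code to deal with
the result arriving later.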

Linus

Rik van Riel

Oct 31, 2002, 6:52:17 PM

to Linus Torvalds, Matt D. Robinson, Rusty Russell, linux-...@vger.kernel.org, lkcd-g...@lists.sourceforge.net, lkcd-...@lists.sourceforge.net

On Thu, 31 Oct 2002, Linus Torvalds wrote:

> In particular when it comes to this project, I'm told about
> "netdump", which doesn't try to dump to a disk, but over the net.

And guess what ? Netdump is one of various LKCD dump methods ...

regards,

Rik
--
A: No.
Q: Should I include quotations after my reply?

http://www.surriel.com/ http://distro.conectiva.com/

Alexander Viro

Oct 31, 2002, 7:00:12 PM

to Nicholas Wourms, linux-...@vger.kernel.org

On Thu, 31 Oct 2002, Nicholas Wourms wrote:

> slow or you are out of diskspace, too bad! I'm sure I'm not the only one
> who is tired of hearing people whine about "bloat" wrt the sources and
> demanding that features they don't use be ignored. No one (non-core)

One look at the From:
understanding has blossomed
.procmailrc grows

Alan Cox

Oct 31, 2002, 7:03:03 PM

to nwo...@netscape.net, Linux Kernel Mailing List

On Thu, 2002-10-31 at 18:28, Nicholas Wourms wrote:
> > problems most people don't have? What next, some kind of misdesigned
> > in-kernel CryptoAPI?
>
> Get over it! If you haven't noticed, CryptoAPI is merged already. The only

Chris is right that crypto api is misdesigned if we want to use hardware
cryptocards

Nicholas Wourms

Oct 31, 2002, 7:16:49 PM

to Alexander Viro, linux-...@vger.kernel.org

Alexander Viro wrote:
>
> On Thu, 31 Oct 2002, Nicholas Wourms wrote:
>
>
>>slow or you are out of diskspace, too bad! I'm sure I'm not the only one
>>who is tired of hearing people whine about "bloat" wrt the sources and
>>demanding that features they don't use be ignored. No one (non-core)
>
>
> One look at the From:
> understanding has blossomed
> .procmailrc grows
>

Your point is?

Stephen Hemminger

Oct 31, 2002, 7:18:54 PM

to Patrick Mochel, Dave Craft, Linus Torvalds, Matt D. Robinson, Rusty Russell, Kernel List, lkcd-g...@lists.sourceforge.net, lkcd-...@lists.sourceforge.net

On Thu, 2002-10-31 at 10:45, Patrick Mochel wrote:
>
> So, this is precisely where something like OSDL's Carrier Grade and Data
> Center working groups can come into play, amazingly enough.
>
> By now, nearly everyone has heard about the working groups and nearly
> every developer that has, despises them. Even I resist association with
> them. But, they can have some real value to the vendors and the OEMs in
> exactly the way you describe.
>
> Take for example DCL. It's a kernel tree with several base patches
> intended to make Linux better in the data center. The base is not fancy,
> and includes things like LKCD and kdb (I think). It's actively maintained
> and updated more often than Linus makes a release (by virtue of
> bitkeeper).

LKCD is in and I try to keep it up to date with the patch stream.
KDB is not in yet, because the current posted patches are not up to date
to apply cleanly against 2.5.44 or 2.5.45.

> The intent is to later have multiple child trees that implement features
> for a specific application space (e.g. databases), while maintaining the
> same base set of features. People wishing to use the most recent kernel
> with those features can use the DCL tree directly. Or an OEM FAE can use
> the tree to build something for the vendor, or add extra features.

CGL hasn't decided what they want to change to.
DCL is going to have one tree focused on databases.

> Note that it's not a distribution. We don't even make real releases, since
> we don't create tarballs or patches (it's only in BK, which actually kinda
> sucks). It's merely a means to have these features actively maintained and
> kept in synch.

For DCL there is both a bitkeeper tree bk://bk.osdl.org/dcl-2.5 and
regular snapshots available on sourceforge
http://osdldcl.sourceforge.net

> And really, that's what everyone wants. Linus doesn't want the features,
> as don't other developers, regardless of the Buzzword or Coolness factors.
> Some vendors and users do want them. The developers of the features and
> distributors of features don't want to deal with the tedium and pain of
> updating patches each and every release.
>
> In the end, it comes down to the fact that Linus's tree is Linus's tree.
> Other people can have their trees. I'm not going to tell you go off and
> make your own if you want those features so bad, because I know what a
> pain in the ass it is, and I know having someone else do it is a lot
> easier.
>

FYI the criteria I apply for what goes into DCL is:
* Applies to large systems and databases
* Vendor support
* Conforms to Linux standard style
* Active project and maintainer that accepts feedback
* Community rejection has been mostly positive.

> DCL and CGL have their trees, for purposes probably very very similar to
> what your customers need. I encourage you to check them out and work with
> them (or talk to people in your company that are). Try and make it work,
> and everyone can be happy (relatively). And, if DCL and CGL aren't
> satisfying the space that you need, please speak up to OSDL and the
> working groups. People are listening, and willing to take your suggestions
> into consideration.
>
> Relevant URLs:
>
> http://osdl.org/projects/cgl/
> http://osdl.org/projects/dcl/

Stephen Hemminger
Data Center Linux (DCL) Maintainer/Coordinator

Nicholas Wourms

Oct 31, 2002, 7:19:54 PM

to Linux Kernel Mailing List

Alan Cox wrote:
> On Thu, 2002-10-31 at 18:28, Nicholas Wourms wrote:
>
>>>problems most people don't have? What next, some kind of misdesigned
>>>in-kernel CryptoAPI?
>>
>>Get over it! If you haven't noticed, CryptoAPI is merged already. The only
>
>
> Chris is right that crypto api is misdesigned if we want to use hardware
> cryptocards
>

Alan,

Thanks for setting me straight, your assertion is correct,
of course. I was under the impression that the CryptoAPI
code was merged initially for IPSEC support and would be
revamped and expanded at a later date to support a wide
variety of interfaces?

Cheers,
Nicholas

Castor Fu

Oct 31, 2002, 7:35:03 PM

to Linus Torvalds, Matt D. Robinson, Rusty Russell, linux-...@vger.kernel.org, lkcd-g...@lists.sourceforge.net, lkcd-...@lists.sourceforge.net

On Thu, 31 Oct 2002, Linus Torvalds wrote:

>
> On Wed, 30 Oct 2002, Matt D. Robinson wrote:
>
> > Linus Torvalds wrote:
> > > > Crash Dumping (LKCD)
> > >
> > > This is definitely a vendor-driven thing. I don't believe it has any
> > > relevance unless vendors actively support it.
> >
> > There are people within IBM in Germany, India and England, as well as
> > a number of companies (Intel, NEC, Hitachi, Fujitsu), as well as SGI
> > that are PAID to support this.

Add 3PAR and probably a number of other small companies given the traffic
on the lists. Anyone building a new product on Linux and mucking
around inside the kernel, and having more than a handful of developers
is going to want LKCD, or Mission Critical's mcore, or netdump, or
something like it.

It's a shame that right out of the gate they'll have to spend time
figuring out which of these solutions work for them. I spent at least
a month of my life just looking at what's out there, and trying to make
each of them work with our product. It'd be nice if that time were
spent on making new "cool stuff".

Since then, we've put significant amounts of work into making LKCD
reliable on our system, and it's been incredibly useful in our
development. It's going to prove invaluable supporting our stuff in
the field.

> What I'm saying by "vendor driven" is that it has no relevance for the
> standard kernel, and since it has no relevance to that, then I have no
> incentives to merge it. The crash dump is only useful with people who
> actively look at the dumps, and I don't know _anybody_ outside of the
> specialized vendors you mention who actually do that.
>

> I will merge it when there are real users who want it - usually as a
> result of having gotten used to it through a vendor who supports it. (And
> by "support" I do not mean "maintain the patches", but "actively uses it"
> to work out the users problems or whatever).

If you asked me if 3PAR is a "vendor" or a "user" I'd have to say "yes".
As a vendor we sell our system to customers. They could not care less
that LKCD is in the linux kernel distribution. All they care about is
that we fix their problems as fast as possible. They probably have
no idea that this is the underlying technology, so you will never
hear from them about us.

However, we also use linux for desktops, build servers, database servers, etc.
When we have problems with these systems, we'd LOVE to be able to use the
same expertise and technology which we've developed for our system, but
all too often we find that someone just grabbed a Redhat 7.x disk or
standard debian distro to build the system.

So as a "user" I'm asking the distribution vendors, please make it easy
for me to use the same damn tools everywhere by providing some sort
of common crash dump mechanism. It'll make it easier for me to consider new
hardware, new software, etc. One thing that's awesome is Dave Anderson's
"crash" tool. It works with LKCD dumps, netdump dumps, etc. It's an example
of a tool which has leveraged all the different dump communities.

As a "vendor" please put LKCD or something like it into the main line
kernel. LKCD works. It has an active developer community which has
been extending it to work over networks, onto disks, developing new
analysis tools, etc. If we can settle on one such tool, we'll get
more cool stuff like lock analyzers, etc. Until then, we WILL keep
re-inventing the wheel because this is one of the first steps to
collect significant amounts of real data.

-castor

Michael Shuey

Oct 31, 2002, 7:44:13 PM

to Alan Cox, Linus Torvalds, Matt D. Robinson, Rusty Russell, Linux Kernel Mailing List, lkcd-g...@lists.sourceforge.net, lkcd-...@lists.sourceforge.net

On Thu, Oct 31, 2002 at 07:04:31PM +0000, Alan Cox wrote:
> On Thu, 2002-10-31 at 17:13, Michael Shuey wrote:
> > I'm a user, and I request that LKCD get merged into the kernel. :-)
> > Do you feel like donating a 700-port console server? Right, so it's LKCD
> > for me then.
>
> Wouldn't you rather they neatly tftp'd dumps to a nominated central
> server which noticed the arrival, did the initial processing with a perl
> script and mailed you a summary ?

Generally speaking, no.

A tftp server doesn't provide enough security (specifically authentication).
It would need to be accessible from clusters in multiple buildings and on
multiple networks (some of which must be public).

I've seen more network adapter issues than drive controller issues. In
particular, some vendors (Compaq, listen up) can't implement an eepro100 to
save their asses, especially on older hardware.

From time to time bandwidth issues and/or network splits can prevent dumps
from being reliably delivered.

Right now we use the presence of a local dump to indicate that a machine
should not join the PBS pool (and begin to run more jobs) on a reboot. I'd
rather not have the nodes check a central server to see if it's okay to run
jobs. And no, I don't want machines to stay down after a crash - many nodes
are in distant corners of campus and it's cold outside. :-) If I can fix the
problem through software I'd prefer that the problematic host be up, rather
than having to walk over to it just to hit reset and load a new kernel.
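
The boot-time check described above could look something like this
sketch. The function names are hypothetical, the dump directory is left
as a parameter, and the code only assumes the POSIX opendir/readdir
interface; it is not what our scripts actually run.

```c
#include <assert.h>
#include <dirent.h>
#include <string.h>

/*
 * Returns 1 if `dir` holds at least one entry besides "." and "..",
 * and 0 if it is empty, missing or unreadable.
 */
int dump_present(const char *dir)
{
    assert(dir != NULL);

    DIR *d = opendir(dir);
    if (d == NULL)
        return 0;

    int found = 0;
    struct dirent *e;
    while ((e = readdir(d)) != NULL) {
        if (strcmp(e->d_name, ".") != 0 && strcmp(e->d_name, "..") != 0) {
            found = 1;
            break;
        }
    }
    closedir(d);
    return found;
}

/* A node may join the batch pool only if no local dump is waiting. */
int may_join_pool(const char *dump_dir)
{
    return !dump_present(dump_dir);
}
```

Treating a missing directory as "no dump" keeps healthy nodes in the pool
by default; a stricter site might prefer to keep a node out when the
directory cannot be read at all.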

That said, it would be really nice if LKCD would log dumps to both the swap
device and to a remote server. That way if the machine crashed because of
disk failure I'd still have an uncorrupted dump image (and could then notice
all the little errors coming back out of the swap device). A tool to
automatically analyze a dump and email back summaries would be much more
useful, though. If someone were to write such a widget, that'd be swell. :-)

Right now I'm less concerned with getting dumps to exactly the right place
and a bit more concerned with getting dumps in the main kernel at all.

--
Mike Shuey

Bernhard Kaindl

Oct 31, 2002, 7:59:11 PM

to Andrew Morton, linux-...@vger.kernel.org

On Thu, 31 Oct 2002, Andrew Morton wrote:
>
> We'll be spending the next six months stabilising and hardening
> the used-to-be-2.5 kernel. If grunts like me can get hold of a
> copy of the other person's kernel image from time-of-crash, that
> has a ton of value.

Exactly, sometimes you don't even need the dump itself. The user
who has the dump just types lcrash and report -w file.txt and
lcrash writes a consolidated report with the most interesting
information from the dump to file.txt. He can send it
to you, and you have much of the information that is often
missing from problem reports.

The report consists of: time when the dump was created, time
when the report was created, the architecture, the hostname,
kernel version and compile time, the kernel (dmesg) buffer
with all the oopses logged into it, a short task list with
process address, IDs, state, flags, cpu and process name,
and finally a full CPU dump of every CPU with all registers,
current process and function and a symbolic stack backtrace
of the CPU.

Sometimes this is all you need to know and if you need to
know e.g. the stack backtrace of a not running process at
the time of the dump, you can ask the user to simply run
trace <process address> and lcrash gives you the backtrace
of this process:

lcrash> t[race] 0x1408000
================================================================
STACK TRACE FOR TASK: 0x1408000 (kjournald)

STACK:
0 schedule+894 [0x3164e]
1 interruptible_sleep_on+174 [0x31eae]
2 journal_revoke+<ERROR> [0x10889c0c]
3 kernel_thread+70 [0x18c1e]

showing the full task struct, a sub-struct or a field is also simple:

p[rint] ((struct task_struct *)0x1408000)->pending
struct sigpending {
head = (nil)
tail = 0x1598700
signal = sigset_t {
sig = {
[0] 0
}
}
}

"feels" a bit like gdb

> (Disclaimer: I've never used lkcd. I'm assuming that it's
> possible to gdb around in a dump)

I don't know if there is an lkcd->ELF core converter yet, but
it should be doable. However, lcrash is quite powerful, it comes
with sial, an integrated C interpreter that permits easy access to the
symbol and type information. Obviously, it lets you write code like this:

void
showprocs()
{
    struct proc* p;
    for(p=*(struct proc**)procs; p; p=p->p_next) {
        /* do something... */
    }
}

Looks nice... :-)

I also don't know if (k)gdb knows about tasks. lcrash at least
knows about them, and this may help when you look into a specific
task (I'm not an expert).

Of course lcrash can also disassemble code, find symbols, and so on.

> So. _If_ lkcd gives me gdb-able images from time-of-crash, I'd
> like it please. And I'm the grunt who spent nearly two years
> doing not much else apart from working 2.3/2.4 oops reports.

Maybe the lkcd people can do so, but I think they can also give
a hands-on workshop to lcrash.

You can also use lcrash on a running system to browse around,
learn and save dumps from it without interrupting it; you
just need lcrash, the System.map and the Kerntypes file from
the kernel to use it.

> Oh, and as Rusty has pointed out, we lose a *lot* of oops reports
> because users are in X and the backtrace doesn't make it to the
> logs.

Yep, I think it would be good even if Linus just accepts the
infrastructure patch of lkcd which needs to be in the kernel.
The favourite dump method module can then be downloaded, compiled,
installed and configured without much pain. I think that people
could then start using it more broadly without the hassle of
needing to patch and boot a special kernel.

Bernd

PS: lcrash is only one of the many frontends, as I've read in
this thread, there is also Dave Anderson's "crash" tool which
works with LKCD dumps, netdump dumps, etc. There is also qlcrash,
a Qt frontend for lcrash for people who like to click!

george anzinger

unread,

Oct 31, 2002, 7:59:55 PM10/31/02

to Stephen Hemminger, Patrick Mochel, Dave Craft, Linus Torvalds, Matt D. Robinson, Rusty Russell, Kernel List, lkcd-g...@lists.sourceforge.net, lkcd-...@lists.sourceforge.net

Stephen Hemminger wrote:
> FYI the criteria I apply for what goes into DCL is:
> * Applies to large systems and databases
> * Vendor support
> * Conforms to Linux standard style
> * Active project and maintainer that accepts feedback
> * Community rejection has been mostly positive.

^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Could you decode this :)

--
George Anzinger geo...@mvista.com
High-res-timers:
http://sourceforge.net/projects/high-res-timers/
Preemption patch:
http://www.kernel.org/pub/linux/kernel/people/rml

Andreas Herrmann

unread,

Oct 31, 2002, 8:24:06 PM10/31/02

to Linus Torvalds, linux-...@vger.kernel.org, lkcd-...@lists.sourceforge.net, lkcd-dev...@lists.sourceforge.net, lkcd-g...@lists.sourceforge.net, Rusty Russell, Matt D. Robinson

Linus Torvalds <torv...@transmeta.com>
Sent by: lkcd-dev...@lists.sourceforge.net
10/31/02 04:46 PM

On Wed, 30 Oct 2002, Matt D. Robinson wrote:

> People have to realize that my kernel is not for random new
> features. The stuff I consider important are things that people
> use on their own, or stuff that is the base for other work.

A dump mechanism within the kernel is a base for much easier
kernel debugging.
IMHO, analyzing a dump is much more effective than guessing
a kernel bug solely with help of an oops message.
Using lkcd/lcrash, I've debugged quite a few problems in
kernel modules that were otherwise quite hard to track down.
It is hard to understand why developers do not want the
aid of dump/dump-analysis for kernel development.

Regards,

Andreas

James Simmons

unread,

Oct 31, 2002, 8:26:00 PM10/31/02

to Geert Uytterhoeven, Rusty Russell, Linus Torvalds, Linux Kernel Development, Russell King, Peter Chubb, tri...@samba.org, Theodore Ts'o

> > > On Thu, 31 Oct 2002, Rusty Russell wrote:
> > > > Fbdev Rewrite
> > >
> > > This one is just huge, and I have little personal judgement on it.
> >
> > It's been around for a while. Geert, Russell?
>
> It's huge because it moves a lot of files around:
> 1. drivers/char/agp/ -> drivers/video/agp/
> 2. drivers/char/drm/ -> drivers/video/drm/
> 3. console related files in drivers/video/ -> drivers/video/console/
>
> (1) and (2) should be reverted, but apparently they aren't reverted in the
> patch at http://phoenix.infradead.org/~jsimmons/fbdev.diff.gz yet. The patch
> also seems to remove some drivers. Haven't checked the bk repo yet.
>
> James, can you please fix that (and the .Config files)?

Done. I have a new version of that patch at the same place. It is against
2.5.45.

http://phoenix.infradead.org/~jsimmons/fbdev.diff.gz

It's still pretty big. We can save the moving of the agp code for post
Halloween.

Linus Torvalds

unread,

Oct 31, 2002, 8:41:42 PM10/31/02

to Andreas Herrmann, linux-...@vger.kernel.org, lkcd-...@lists.sourceforge.net, lkcd-dev...@lists.sourceforge.net, lkcd-g...@lists.sourceforge.net, Rusty Russell, Matt D. Robinson

On Thu, 31 Oct 2002, Andreas Herrmann wrote:
>
> A dump mechanism within the kernel is a base for much easier
> kernel debugging.
> IMHO, analyzing a dump is much more effective than guessing
> a kernel bug solely with help of an oops message.

And imnsho, debugging the kernel on a source level is the way to do it.

Which is why it's not going to be me who merges it.

Read my emails.

Linus

Jeff Garzik

unread,

Oct 31, 2002, 8:46:54 PM10/31/02

to Alan Cox, nwo...@netscape.net, Linux Kernel Mailing List

Alan Cox wrote:

>On Thu, 2002-10-31 at 18:28, Nicholas Wourms wrote:
>
>
>>>problems most people don't have? What next, some kind of misdesigned
>>>in-kernel CryptoAPI?
>>>
>>>
>>Get over it! If you haven't noticed, CryptoAPI is merged already. The only
>>
>>
>
>Chris is right that crypto api is misdesigned if we want to use hardware
>cryptocards
>
>

I'll reserve judgement until we actually get access to some decent [made
in the past few years] hardware crypto cards, and take a hard look at
their PCI bus utilization... until then it is mostly vague handwaving...

[vendors - any takers?]

Stephen Hemminger

unread,

Oct 31, 2002, 8:49:38 PM10/31/02

to george anzinger, Patrick Mochel, Dave Craft, Linus Torvalds, Matt D. Robinson, Rusty Russell, Kernel List, lkcd-g...@lists.sourceforge.net, lkcd-...@lists.sourceforge.net

On Thu, 2002-10-31 at 11:57, george anzinger wrote:
> Stephen Hemminger wrote:
> > FYI the criteria I apply for what goes into DCL is:
> > * Applies to large systems and databases
> > * Vendor support
> > * Conforms to Linux standard style
> > * Active project and maintainer that accepts feedback
> > * Community rejection has been mostly positive.
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> Could you decode this :)

s/rejection/reaction/

Patrick Finnegan

unread,

Oct 31, 2002, 8:53:19 PM10/31/02

to Linus Torvalds, Andreas Herrmann, linux-...@vger.kernel.org, lkcd-...@lists.sourceforge.net, lkcd-dev...@lists.sourceforge.net, lkcd-g...@lists.sourceforge.net, Rusty Russell, Matt D. Robinson

On Thu, 31 Oct 2002, Linus Torvalds wrote:

> On Thu, 31 Oct 2002, Andreas Herrmann wrote:
> >
> > A dump mechanism within the kernel is a base for much easier
> > kernel debugging.
> > IMHO, analyzing a dump is much more effective than guessing
> > a kernel bug solely with help of an oops message.
>
> And imnsho, debugging the kernel on a source level is the way to do it.
>
> Which is why it's not going to be me who merges it.

But LKCD is also useful for tracing crashes back to the hardware that causes
them. It's really hard to find problems in hardware using source code,
since the source code DOESN'T have anything to do with the problems.

Pat
--
Purdue University ITAP/RCS
Information Technology at Purdue
Research Computing and Storage
http://www-rcd.cc.purdue.edu

http://dilbert.com/comics/dilbert/archive/images/dilbert2040637020924.gif

Dave Anderson

unread,

Oct 31, 2002, 9:01:42 PM10/31/02

to Linus Torvalds, Matt D. Robinson, Rusty Russell, linux-...@vger.kernel.org, lkcd-g...@lists.sourceforge.net, lkcd-...@lists.sourceforge.net

On Thu, 31 Oct 2002, Linus Torvalds wrote:

> - included features kill off (potentially better) projects.
>
> There's a big "inertia" to features. It's often better to keep
> features _off_ the standard kernel if they may end up being
> further developed in totally new directions.

>
> In particular when it comes to this project, I'm told about
> "netdump", which doesn't try to dump to a disk, but over the net.

> And quite frankly, my immediate reaction is to say "Hell, I
> _never_ want the dump touching my disk, but over the network
> sounds like a great idea".

>
> To me this says "LKCD is stupid". Which means that I'm not going to apply
> it, and I'm going to need some real reason to do so - ie being proven
> wrong in the field.
>

> (And don't get me wrong - I don't mind getting proven wrong. I change my
> opinions the way some people change underwear. And I think that's ok).

It would be most unfortunate if the existence of netdump is used as a
reason to deny LKCD's inclusion, or to simply dismiss LKCD as stupid.

On Thu, 31 Oct 2002, Matt D. Robinson wrote:

> We want to see this in the kernel, frankly, because it's a pain
> in the butt keeping up with your kernel revisions and everything
> else that goes in that changes. And I'm sure SuSE, UnitedLinux and
> (hopefully) Red Hat don't want to spend their time having to roll
> this stuff in each and every time you roll a new kernel.

While Red Hat advocates Ingo's netdump option, we have customer
requests that are requiring us to look at LKCD disk-based dumps as an
alternative, co-existing dump mechanism. Since the two methods are not mutually
exclusive, LKCD will never kill off netdump -- nor certainly vice-versa. We're
all just looking for a better means to be able to
provide support to our customers, not to mention its value as a
development aid.

Dave Anderson
Red Hat, Inc.

Jeff Garzik

unread,

Oct 31, 2002, 9:03:58 PM10/31/02

to Linus Torvalds, Matt D. Robinson, Rusty Russell, linux-...@vger.kernel.org, lkcd-g...@lists.sourceforge.net, lkcd-...@lists.sourceforge.net

Linus Torvalds wrote:

> In particular when it comes to this project, I'm told about
> "netdump", which doesn't try to dump to a disk, but over the net.
> And quite frankly, my immediate reaction is to say "Hell, I
> _never_ want the dump touching my disk, but over the network
> sounds like a great idea".
>
>

[yes, I realize the LKCD merge debate is over, bear with me :)]

I'm sort of in the middle on this issue: The existence of netdump does
not imply that disk dumps are a bad thing.

netdumps require a net dump server, and it is simply not realistic at
all to assume that users seeing crashes will always have a netdump
server set up in advance, or even have multiple machines to make that
possible. Disk dumps are valuable because their requirements are very
low, and because of all the user-support reasons that Andrew Morton
mentioned in this thread.

That said, I used to be an LKCD cheerleader until a couple people made
some good points to me: it is not nearly low-level enough to truly be
of use in crash situations. netdump can work if your interrupts are
hosed/screaming, and various mid-layers are dying. For LKCD to be of
any use, it needs to _skip_ the block layer and talk directly to
low-level drivers.

So, I think the stock kernel does need some form of disk dumping,
regardless of any presence/absence of netdump. But LKCD isn't there yet...

Jeff

Benjamin LaHaise

unread,

Oct 31, 2002, 9:10:38 PM10/31/02

to Linus Torvalds, Andreas Herrmann, linux-...@vger.kernel.org, lkcd-...@lists.sourceforge.net, lkcd-dev...@lists.sourceforge.net, lkcd-g...@lists.sourceforge.net, Rusty Russell, Matt D. Robinson

On Thu, Oct 31, 2002 at 12:40:28PM -0800, Linus Torvalds wrote:
> And imnsho, debugging the kernel on a source level is the way to do it.
>
> Which is why it's not going to be me who merges it.
>
> Read my emails.

That is one of the reasons that crash dumps are useful. Quite a few
problems that customers hit are not easy to reproduce, but when they
provide a dump file that can be loaded into gdb with the original
kernel debugging info and the backtrace command issued and various
bits of internal structures examined, usually a good hypothesis can
be made for the cause. Feed that back into a code audit and you end
up fixing problems that are decidedly challenging.

-ben
--
"Do you seek knowledge in time travel?"

john stultz

unread,

Oct 31, 2002, 9:16:38 PM10/31/02

to Stephen Frost, Rik van Riel, Linus Torvalds, Rusty Russell, lkml

On Wed, 2002-10-30 at 19:28, Stephen Frost wrote:
> The feeling I got on this was the ability to let users define their own
> groups. Perhaps I'm not following it closely enough but that was the
> impression I got in terms of "what this does for us"; I'm probably
> missing other things. Just that ability would be nice in my view
> though. Isn't it something that's been in AFS for a long time too?
> I've got a few friends who've played with AFS before (at CMU and the
> like) and really enjoyed the ACLs there.

Yea, I haven't looked at the submitted implementation, but at CMU ACLs
were critical to be able to selectively share data between a very large
set of users w/o bugging an administrator. Given multiple classes per
semester which had multiple group projects, where you may have different
groups for each project, I have no clue how anyone would be able to
handle the (unix)group management required. ACLs let the users
themselves manage what people got what access to their data.

How else can I fix my partner's bugs (or vice-versa), give the clumsy TA
read only access, and let the cheat across the hall figure it out for
himself? (There may very well be a good solution to this w/o ACLs but
I've not seen it in use.)

So yea, I'd love to see a common ACLs API.
-john

Werner Almesberger

unread,

Oct 31, 2002, 9:50:36 PM10/31/02

to john stultz, lkml

[ Cc: trimmed ]

john stultz wrote:
> groups for each project, I have no clue how anyone would be able to
> handle the (unix)group management required. ACLs let the users
> themselves manage what people got what access to their data.

Note that POSIX ACLs don't seem to solve this either: they only
let you control access in terms of existing users or groups.

IMHO, this is one of the standard pitfalls of ACLs: if they don't
let you aggregate information, you quickly end up with huge ACLs
hanging off every file, and each of those ACLs wants to be
carefully maintained. I've seen a lot of this in my VMS days.
(Unix is a bit better, because you can control access at a
directory level, while VMS needs the ACL on each file, because
you can open files directly by VMS' equivalent to an inode
number, without traversing the directory hierarchy. Of course,
many users didn't know that :-)

To make ACLs truly scalable, it would be nice to be able to
express permissions in terms of access to other filesystem
objects. E.g. "everybody who can read file ~me/acls/my_friends
can write the directory on which this ACE hangs". This should
work like a symlink, i.e. if I add new friends to my_friends, I
don't have to update all my ACLs.

- Werner

--
_________________________________________________________________________
/ Werner Almesberger, Buenos Aires, Argentina w...@almesberger.net /
/_http://www.almesberger.net/____________________________________________/
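
The "works like a symlink" idea above can be modelled in a few lines: an ACE stores a reference to another object instead of a copied list of users, so a change to that object is seen by every ACE pointing at it. A toy in-memory sketch only; all type and function names are hypothetical, and nothing here corresponds to an existing kernel or POSIX ACL interface:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_READERS 8

/* Hypothetical model of a file such as ~me/acls/my_friends:
 * all that matters here is who may read it. */
struct friends_file {
    const char *readers[MAX_READERS];
    size_t nreaders;
};

/* Hypothetical ACE that grants access by reference, not by copy. */
struct indirect_ace {
    const struct friends_file *ref;
};

/* Return 1 if `user` may read the file, 0 otherwise. */
static int can_read(const struct friends_file *f, const char *user)
{
    size_t i;

    for (i = 0; i < f->nreaders; i++)
        if (strcmp(f->readers[i], user) == 0)
            return 1;
    return 0;
}

/* Add a reader to the file; returns -1 if the table is full. */
static int add_reader(struct friends_file *f, const char *user)
{
    if (f->nreaders >= MAX_READERS)
        return -1;
    f->readers[f->nreaders++] = user;
    return 0;
}

/* Write access to the directory carrying the ACE is whatever read
 * access to the referenced file is at the time of the check. */
static int can_write_dir(const struct indirect_ace *ace, const char *user)
{
    assert(ace != NULL && ace->ref != NULL);
    return can_read(ace->ref, user);
}
```

With two directories whose ACEs both reference the same friends_file, a single add_reader() call grants the new friend write access to both, with no per-directory ACL update.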

Oliver Xymoron

unread,

Oct 31, 2002, 9:51:44 PM10/31/02

to Dave Anderson, Linus Torvalds, Matt D. Robinson, Rusty Russell, linux-...@vger.kernel.org, lkcd-g...@lists.sourceforge.net, lkcd-...@lists.sourceforge.net

On Thu, Oct 31, 2002 at 03:59:34PM -0500, Dave Anderson wrote:
>
> > To me this says "LKCD is stupid". Which means that I'm not going to apply
> > it, and I'm going to need some real reason to do so - ie being proven
> > wrong in the field.
> >
> > (And don't get me wrong - I don't mind getting proven wrong. I change my
> > opinions the way some people change underwear. And I think that's ok).
>
> It would be most unfortunate if the existence of netdump is used as a
> reason to deny LKCD's inclusion, or to simply dismiss LKCD as stupid.

What he really wants is for Andrew or Alan or someone else he trusts
to merge it, get actual field results, and declare it useful. If
people start visibly passing around crash dump results on l-k and
solving problems with them, that'll help too. Until then all he has is
his gut feel to go on.

--
"Love the dolphins," she advised him. "Write by W.A.S.T.E.."

Bernhard Kaindl

unread,

Oct 31, 2002, 10:06:00 PM10/31/02

to linux-...@vger.kernel.org, Linus Torvalds, lkcd-g...@lists.sourceforge.net

On Thu, 31 Oct 2002, Benjamin LaHaise wrote:
> On Thu, Oct 31, 2002 at 12:40:28PM -0800, Linus Torvalds wrote:
> > And imnsho, debugging the kernel on a source level is the way to do it.
> >
> > Which is why it's not going to be me who merges it.
> >
> > Read my emails.
>
> That is one of the reasons that crash dumps are useful. Quite a few
> problems that customers hit are not easy to reproduce, but when they
> provide a dump file that can be loaded into gdb with the original
> kernel debugging info and the backtrace command issued and various
> bits of internal structures examined, usually a good hypothesis can
> be made for the cause. Feed that back into a code audit and you end
> up fixing problems that are decidedly challenging.
>
> -ben

I could not have said it better. I have two examples for it: one which
really happened, and one just as an illustration to give a picture.

[ I'm not an expert, I'm just writing about my experience ]
[ in order to try to make linux even better than it is ]

About debugging at source level:

Dump analysis does not mean that you are not debugging at the source level.
With a vmlinux compiled with -g (which could be stripped before making
the image), crash analysis tools can operate at the source level (depending
on the compiler's reorderings, of course; the assumption that -O2 maps
source:binary 1:1 is not of this world).

An analogy to doctors, hospitals and patients:

dump analysis says you don't need to have a living patient
in order to cure a disease. It says you may have slept on the
other side of the world while the disease murdered your fellow
at home. But as you don't like that it happens again to another
fellow, you want to have a remote lab which gives you every info
you need to have in order to know what might have murdered him.

The dump tools are this remote lab. If you don't have it, you
may need to fly over to the site where the disease is, monitor
the patient and try to find out what's happening, and you can't
find out what's up without at least one more dead patient at
the end.

But the hospital may not like to have even one more dead
patient than necessary (ideally 0) and would choose a doctor
who has the remote lab where he can quickly check what's up
and find a cure *before* the next patient gets ill.

Back to the computer world, this would mean that an OS having
the remote lab (dump tools) would be favoured over an OS that
doesn't have it. The same goes for LTT and Dynamic Probes.

Back to crash dumps: in some environments, like laboratory or blood
bank information systems, you need to use computers in order to
efficiently process, store and distribute data, and organize
the handling of blood. In such environments, people's lives
can depend on a fast, efficient and stable organisation.

Of course you need to be able to recover and continue such
organisation even with the laboratory information system being
down for a reboot or maintenance.

But you simply cannot go there, halt all the distributed information
retrieval and automated job control with the laboratory apparatus,
block all the users (maybe thousands) to debug the kernel and
check what is going on while the whole hospital is waiting for you.

Of course you can do this, but only once, or only at a time
when every use of the system can be organized to bypass it and
use paper, in-house mail and phone to do the things the system
is normally doing. A hospital with thousands of patients cannot
wait while you debug.

> Which is why it's not going to be me who merges it.

Sure, but it would help Linux World Domination if the base
kernel would support it also.

Bernd

PS: Sorry for the extreme example, but this is an example
I know from my previous work and I've just tried to describe
it as realistically as possible.

Pavel Machek

unread,

Oct 31, 2002, 10:10:48 PM10/31/02

to Alexander Viro, Dax Kelson, Chris Wedgwood, Rik van Riel, Linus Torvalds, Rusty Russell, linux-...@vger.kernel.org

Hi!

> > Without ACLs, if Sally, Joe and Bill need rw access to a file/dir, just
> > create another group with just those three people in. Over time, of
>
> If Sally, Joe and Bill need rw access to a directory, and Joe and Bill
> are using existing userland (any OS I'd seen), then Sally can easily
> fuck them into the next month and not in a good way.

Do you mean symlink attack?

> _That_ is the real problem. Until that is solved (i.e. until all
> userland is written up to the standards allegedly followed in writing
> suid-root programs wrt hostile filesystem modifications) NO mechanism
> will help you. ACLs, huge groups, whatever - setups with that sort
> of access allowed are NOT SUSTAINABLE with the current userland(s).

So userland needs to be improved. It already needs those modifications
because of /tmp. Is there any new issue there?
Pavel
--
When do you have heart between your knees?
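
To make the "hostile filesystem modifications" point concrete: in a directory that another user can write to, a cautious program avoids following a symlink that user may have planted. A minimal sketch using standard open(2) flags on Linux; the two helper names are made up for illustration, and this covers only the final path component, not the full set of races a suid-grade program has to consider:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Open an existing file read-only, refusing to follow a symlink in the
 * final path component. On Linux the call fails (errno ELOOP) if the
 * name is a symlink. Hypothetical helper name.
 */
static int open_nofollow(const char *path)
{
    assert(path != NULL);
    return open(path, O_RDONLY | O_NOFOLLOW);
}

/*
 * Create a new file, failing if anything already exists under that
 * name, including a symlink planted by someone else. Hypothetical
 * helper name.
 */
static int create_exclusive(const char *path)
{
    assert(path != NULL);
    return open(path, O_WRONLY | O_CREAT | O_EXCL, 0600);
}
```

Both helpers return a file descriptor on success and -1 on failure, like open(2) itself. Symlinks in intermediate directories are still followed, which is part of why shared writable directories remain hard to use safely.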

Pavel Machek

unread,

Oct 31, 2002, 10:11:26 PM10/31/02

to Alexander Viro, Linus Torvalds, Rusty Russell, linux-...@vger.kernel.org

Hi!

> > > ext2/ext3 ACLs and Extended Attributes
> >
> > I don't know why people still want ACL's. There were noises about them for
> > samba, but I've not heard anything since. Are vendors using this?
>
> Because People Are Stupid(tm). Because it's cheaper to put "ACL support: yes"
> in the feature list under "Security" than to make sure that userland can cope
> with anything more complex than "Me Og. Og see directory. Directory Og's.
> Nobody change it". C.f. snake oil, P.T.Barnum and esp. LSM users

Okay... Have ~/bin/phonebook and I'd like it to be rw- to me, r-- to
jarka and mj, and --- to everyone else. How do I do that without ACLs?
Adding a group is a root-only operation.

This seems like a pretty common situation to me, and current solutions
are not nice. [I guess ~/bin/ with --x and
~/bin/my-secret-password-only-jarka-and-mj-knows/phonebook would solve
the problem, but...!]
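
For reference, the bracketed workaround above (a search-only directory plus a hard-to-guess subdirectory name) can be sketched with plain mode bits. This is only an illustration of that trick, with hypothetical helper names and caller-supplied paths; it is obscurity, not per-user access control, which is the gap ACLs are meant to fill:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Make `dir` searchable but not listable by group and others
 * (rwx--x--x): names below it can be opened only by someone who
 * already knows them. Hypothetical helper name.
 */
static int make_search_only(const char *dir)
{
    assert(dir != NULL);
    return chmod(dir, 0711);
}

/*
 * Create `secret_dir` (a name shared only with the intended readers)
 * and an empty file `file` inside it that is rw for the owner and
 * read-only for everyone else. Modes are set explicitly so the
 * result does not depend on the umask.
 */
static int publish_under_secret(const char *secret_dir, const char *file)
{
    int fd;

    if (mkdir(secret_dir, 0755) != 0)
        return -1;
    if (chmod(secret_dir, 0755) != 0)
        return -1;
    fd = open(file, O_WRONLY | O_CREAT | O_EXCL, 0644);
    if (fd < 0)
        return -1;
    if (fchmod(fd, 0644) != 0) {
        close(fd);
        return -1;
    }
    return close(fd);
}
```

Anyone who learns the secret name gets read access, and there is no way to revoke it for one person without renaming the directory for everybody.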

Shawn

unread,

Oct 31, 2002, 10:23:00 PM10/31/02

to Matt D. Robinson, Linus Torvalds, Rusty Russell, linux-...@vger.kernel.org, lkcd-g...@lists.sourceforge.net, lkcd-...@lists.sourceforge.net

On 10/31, Matt D. Robinson said something like:

> On Thu, 31 Oct 2002, Linus Torvalds wrote:

> |>On Wed, 30 Oct 2002, Matt D. Robinson wrote:
> |>That's fine. And since they are paid to support it, they can apply the
> |>patches.

>
> We want to see this in the kernel, frankly, because it's a pain
> in the butt keeping up with your kernel revisions and everything
> else that goes in that changes. And I'm sure SuSE, UnitedLinux and
> (hopefully) Red Hat don't want to spend their time having to roll
> this stuff in each and every time you roll a new kernel.

I share some of your sentiment, but honestly, think about it.

Linus has to "keep up" with all the changes coming into his inbox as
well, and the more features, the more breakage that can happen when
Linus accepts a patch.

Really, Linus wants to push some of his maintenance overhead to distros,
who get paid to do it, but also to provide sexy bullet point items for
users, so they buy "Linux" stuff.

You try to find a better balance.

--
Shawn Leas
co...@enodev.com

I installed a skylight in my apartment...
The people who live above me are furious!
-- Stephen Wright