
[patch 00/2] improve .text size on gcc 4.0 and newer compilers


Ingo Molnar

Dec 28, 2005, 6:50:19 AM

this patchset (for the 2.6.16 tree) consists of two patches:

gcc-no-forced-inlining.patch
gcc-unit-at-a-time.patch

the purpose of these patches is to reduce the kernel's .text size, in
particular if CONFIG_CC_OPTIMIZE_FOR_SIZE is specified. The effect of
the patches on x86 is:

   text    data     bss     dec     hex filename
3286166  869852  387260 4543278  45532e vmlinux-orig
3194123  955168  387260 4536551  4538e7 vmlinux-inline
3119495  884960  387748 4392203  43050b vmlinux-inline+units
 437271   77646   32192  547109   85925 vmlinux-tiny-orig
 452694   77646   32192  562532   89564 vmlinux-tiny-inline
 431891   77422   32128  541441   84301 vmlinux-tiny-inline+units

i.e. a 5.3% .text reduction (!) with a larger .config, and a 1.2% .text
reduction with a smaller .config.

i've also done test-builds with CC_OPTIMIZE_FOR_SIZE disabled:

   text    data     bss     dec     hex filename
4080998  870384  387260 5338642  517612 vmlinux-speed-orig
4084421  872024  387260 5343705  5189d9 vmlinux-speed-inline
4010957  834048  387748 5232753  4fd871 vmlinux-speed-inline+units

so the more flexible inlining did not result in many changes [which is
good, we want gcc to inline those in the optimized-for-speed case], but
unit-at-a-time optimization resulted in smaller code - very likely
meaning speed advantages as well.

unit-at-a-time still increases the kernel stack footprint somewhat (by
about 5% in the CC_OPTIMIZE_FOR_SIZE case), but not by the insane degree
gcc3 used to, which prompted the original -fno-unit-at-a-time addition.

so i think the combination of the two patches is a win both for small
and for large systems. In fact the 5.3% .text reduction for embedded
kernels is very significant.

the patches are against -git, and were test-built and test-booted on
x86, using gcc 4.0.2.

Ingo
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majo...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

Linus Torvalds

Dec 28, 2005, 2:30:12 PM

On Wed, 28 Dec 2005, Ingo Molnar wrote:
>
> this patchset (for the 2.6.16 tree) consists of two patches:
>
> gcc-no-forced-inlining.patch
> gcc-unit-at-a-time.patch

Why do you mix the two up? I'd assume they are independent, and if they
aren't, please explain why?

The forced inlining is not just a good idea. Several versions of gcc would
NOT COMPILE the kernel without it. The fact that it works with your
configurations and your particular compiler version has absolutely ZERO
relevance.

Gcc has had horrible mistakes in inlining functions. Inlining too much,
and quite often, not inlining things that absolutely _have_ to be inlined.
Trivial things that inline to an instruction or two, but that look
complicated because they have a big switch-statement that just happens to
be known at compile-time.

And not inlining them not only results in horribly bad code (dynamic tests
for something that should be static), but also results in link errors when
cases that should be statically unreachable suddenly become reachable
after all.
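
To make that failure mode concrete, here is a minimal user-space sketch of the kind of helper being described. The names load_sized(), load(), load_bad_size() and load_demo() are invented for illustration and are not kernel functions. The assumption is that kernel helpers of this shape leave the "bad size" function undefined, so that a call which survives optimization turns into a link error; it is defined here only so the sketch links at any optimization level.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical stand-in. In the pattern described above this function would
 * be declared but never defined, so a reachable call fails at link time.
 * It is defined here so that the sketch builds regardless of -O level.
 */
static void load_bad_size(void)
{
	abort();
}

/*
 * Looks complicated because of the switch, but 'size' is a compile-time
 * constant at every call site, so after inlining only one case remains.
 */
static inline __attribute__((always_inline))
uint64_t load_sized(const void *p, size_t size)
{
	switch (size) {
	case 1: { uint8_t v;  memcpy(&v, p, sizeof(v)); return v; }
	case 2: { uint16_t v; memcpy(&v, p, sizeof(v)); return v; }
	case 4: { uint32_t v; memcpy(&v, p, sizeof(v)); return v; }
	case 8: { uint64_t v; memcpy(&v, p, sizeof(v)); return v; }
	default:
		load_bad_size();	/* statically unreachable for the sizes above */
		return 0;
	}
}

/* sizeof() is a constant expression, so the switch can be resolved statically. */
#define load(ptr) load_sized((ptr), sizeof(*(ptr)))

/* Small self-check showing the intended use. */
int load_demo(void)
{
	uint16_t half = 0xbeef;
	uint32_t word = 0x12345678u;

	assert(load(&half) == 0xbeef);
	assert(load(&word) == 0x12345678u);
	return 1;
}
```

If load_sized() were compiled out of line instead, 'size' would become a run-time value: every call would pay for the dynamic switch, and the default branch (and with it the reference to the "bad size" function) would stay in the object file.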

So the fact that your gcc-4.x version happens to get things right for your
case in no way means that you can do this in general.

Also, the inlining patch apparently makes code larger in some cases, so
it's not even an unconditional win.

What's the effect of _just_ the "unit-at-a-time" thing which we can (and
you did) much more easily make gcc-version-dependent?

Linus

Arjan van de Ven

Dec 28, 2005, 2:40:16 PM

>
> The forced inlining is not just a good idea. Several versions of gcc would
> NOT COMPILE the kernel without it.

yup that's why the patch only does it for gcc4, in which the inlining
heuristics finally got rewritten to something that seems to resemble
sanity...

> Also, the inlining patch apparently makes code larger in some cases, so
> it's not even an unconditional win.

.... as long as you give the inlining algorithm enough information.
-fno-unit-at-a-time prevents gcc from having the information, and the
decisions it makes are then less optimal...

(unit-at-a-time allows gcc to look at the entire .c file, eg things like
number of callers etc etc, disabling that tells gcc to do the .c file as
single pass top-to-bottom only)
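
A small sketch of the situation described here, with made-up function names. The comments state what one would expect from the description above; they have not been checked against any particular gcc release.

```c
#include <assert.h>

/*
 * Declared first, defined last: a strict top-to-bottom pass reaches
 * scale_and_bias() before it has seen the body of triple(), so it has
 * nothing to inline at that point.
 */
static int triple(int x);

int scale_and_bias(int x)
{
	return triple(x) + 1;	/* the only call site of triple() */
}

/*
 * With the whole file in view the compiler can also tell that this static
 * function has exactly one caller, so inlining it there adds no second copy
 * of the body and the out-of-line version can be dropped.
 */
static int triple(int x)
{
	return 3 * x;
}

/* Small self-check; the result is the same whichever way gcc decides. */
void unit_demo_check(void)
{
	assert(scale_and_bias(2) == 7);
}
```

Comparing the output of `size` on the object file built with and without -fno-unit-at-a-time would be one way to see whether the compiler in use behaves like this.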

Linus Torvalds

Dec 28, 2005, 4:10:06 PM

On Wed, 28 Dec 2005, Arjan van de Ven wrote:
>
> yup that's why the patch only does it for gcc4, in which the inlining
> heuristics finally got rewritten to something that seems to resemble
> sanity...

Is that actually true of all gcc4 versions? I seem to remember gcc-4.0
being a real stinker.

> > Also, the inlining patch apparently makes code larger in some cases,
> > so it's not even an unconditional win.
>
> .... as long as you give the inlining algorithm enough information.
> -fno-unit-at-a-time prevents gcc from having the information, and the
> decisions it makes are then less optimal...
>
> (unit-at-a-time allows gcc to look at the entire .c file, eg things like
> number of callers etc etc, disabling that tells gcc to do the .c file as
> single pass top-to-bottom only)

I'd still prefer to see numbers with -funit-at-a-time only. I think it's
an independent knob, and I'd be much less worried about that, because we
do know that unit-at-a-time has been enabled on x86-64 for a long time
("forever"). So that's less of a change, I feel.

Linus

Arjan van de Ven

Dec 28, 2005, 4:20:12 PM

On Wed, 2005-12-28 at 13:02 -0800, Linus Torvalds wrote:
>
> On Wed, 28 Dec 2005, Arjan van de Ven wrote:
> >
> > yup that's why the patch only does it for gcc4, in which the inlining
> > heuristics finally got rewritten to something that seems to resemble
> > sanity...
>
> Is that actually true of all gcc4 versions? I seem to remember gcc-4.0
> being a real stinker.

it is... if you disable unit-at-a-time for sure.
But I'm not entirely sure when this got in, if it was 4.0 or 4.1

> > (unit-at-a-time allows gcc to look at the entire .c file, eg things like
> > number of callers etc etc, disabling that tells gcc to do the .c file as
> > single pass top-to-bottom only)
>
> I'd still prefer to see numbers with -funit-at-a-time only. I think it's
> an independent knob, and I'd be much less worried about that, because we
> do know that unit-at-a-time has been enabled on x86-64 for a long time
> ("forever"). So that's less of a change, I feel.

the only effect I expect is more inlining actually, since we on the one
hand tie gcc's hands via the forced inline, and on the other hand now
give it more room to inline more. But yeah it's worth looking at for
sure, even if it is to see it's getting bigger ;)

Ingo Molnar

Dec 28, 2005, 4:30:16 PM

* Linus Torvalds <torv...@osdl.org> wrote:

> On Wed, 28 Dec 2005, Arjan van de Ven wrote:
> >
> > yup that's why the patch only does it for gcc4, in which the inlining
> > heuristics finally got rewritten to something that seems to resemble
> > sanity...
>
> Is that actually true of all gcc4 versions? I seem to remember gcc-4.0
> being a real stinker.

all my tests were with gcc 4.0.2.

> > > Also, the inlining patch apparently makes code larger in some cases,
> > > so it's not even an unconditional win.
> >
> > .... as long as you give the inlining algorithm enough information.
> > -fno-unit-at-a-time prevents gcc from having the information, and the
> > decisions it makes are then less optimal...
> >
> > (unit-at-a-time allows gcc to look at the entire .c file, eg things like
> > number of callers etc etc, disabling that tells gcc to do the .c file as
> > single pass top-to-bottom only)
>
> I'd still prefer to see numbers with -funit-at-a-time only. I think
> it's an independent knob, and I'd be much less worried about that,
> because we do know that unit-at-a-time has been enabled on x86-64 for
> a long time ("forever"). So that's less of a change, I feel.

the two patches are completely independent, and the only reason i did
them together was because i was looking at .text size in general and
these were the two things that made a difference. Also, the inlining was
a loss in one of the .config's, unless combined with the wider-scope
unit-at-a-time optimization.

(there's a third thing that i was also playing with, -ffunction-sections
and -fdata-sections, but those don't seem to be reliable on the binutils
side yet.)

here are the isolated unit-at-a-time numbers as well:

   text    data     bss     dec     hex filename
3286166  869852  387260 4543278  45532e vmlinux-orig

3259928  833176  387748 4480852  445f54 vmlinux-units          -0.8%
3194123  955168  387260 4536551  4538e7 vmlinux-inline         -2.9%
3119495  884960  387748 4392203  43050b vmlinux-inline+units   -5.3%

so both inlining and unit-at-a-time are a win independently [although
inlining alone does bloat .data], but applied together they bring an
additional 1.6% of .text savings. All builds done with:

gcc version 4.0.2 20051109 (Red Hat 4.0.2-6)
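
The percentage column can be reproduced from the .text figures with a few lines of C. This is only a sketch; the helper names are invented here. Measured against the original size, the three savings work out to roughly 0.8%, 2.8% and 5.1%. Measured against each patched size they are about 0.8%, 2.9% and 5.3%, which appears to be the convention the column uses (and 5.3 - 2.9 - 0.8 gives the "additional 1.6%").

```c
#include <assert.h>
#include <stdio.h>

/* .text sizes in bytes, copied from the size output in this message. */
enum {
	TEXT_ORIG         = 3286166,
	TEXT_UNITS        = 3259928,
	TEXT_INLINE       = 3194123,
	TEXT_INLINE_UNITS = 3119495,
};

/* Saving in hundredths of a percent relative to 'base', rounded down. */
long long saving_bp(long long before, long long after, long long base)
{
	return (before - after) * 10000 / base;
}

/* Print one row with the saving expressed both ways. */
void print_saving(const char *name, long long after)
{
	long long vs_orig = saving_bp(TEXT_ORIG, after, TEXT_ORIG);
	long long vs_new  = saving_bp(TEXT_ORIG, after, after);

	printf("%-22s %lld.%02lld%% of orig, %lld.%02lld%% of patched\n", name,
	       vs_orig / 100, vs_orig % 100, vs_new / 100, vs_new % 100);
}

void print_all_savings(void)
{
	print_saving("vmlinux-units", TEXT_UNITS);
	print_saving("vmlinux-inline", TEXT_INLINE);
	print_saving("vmlinux-inline+units", TEXT_INLINE_UNITS);
}

/* Self-check of the two conventions for the combined build. */
void check_savings(void)
{
	assert(saving_bp(TEXT_ORIG, TEXT_INLINE_UNITS, TEXT_ORIG) == 507);
	assert(saving_bp(TEXT_ORIG, TEXT_INLINE_UNITS, TEXT_INLINE_UNITS) == 534);
}
```

Integer arithmetic is used on purpose so the sketch needs nothing beyond the C library's printf and assert.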

how about giving the inlining stuff some more exposure in -mm (if it's
fine with Andrew), to check for any regressions? I'd suggest the same
for the unit-at-a-time thing too, in any case.

Ingo

Ingo Molnar

Dec 28, 2005, 5:00:15 PM

* Ingo Molnar <mi...@elte.hu> wrote:

> (there's a third thing that i was also playing with, -ffunction-sections
> and -fdata-sections, but those don't seem to be reliable on the binutils
> side yet.)
>
> here are the isolated unit-at-a-time numbers as well:
>
>    text    data     bss     dec     hex filename
> 3286166  869852  387260 4543278  45532e vmlinux-orig
> 3259928  833176  387748 4480852  445f54 vmlinux-units          -0.8%
> 3194123  955168  387260 4536551  4538e7 vmlinux-inline         -2.9%
> 3119495  884960  387748 4392203  43050b vmlinux-inline+units   -5.3%
>
> so both inlining and unit-at-a-time are a win independently [although
> inlining alone does bloat .data], but applied together they bring an
> additional 1.6% of .text savings. All builds done with:
>
> gcc version 4.0.2 20051109 (Red Hat 4.0.2-6)
>
> how about giving the inlining stuff some more exposure in -mm (if it's
> fine with Andrew), to check for any regressions? I'd suggest the same
> for the unit-at-a-time thing too, in any case.

another thing: i wanted to decrease the size of -Os
(CONFIG_CC_OPTIMIZE_FOR_SIZE) kernels, which e.g. Fedora uses too (to
keep the icache footprint down).

I think gcc should arguably not be forced to inline things when doing
-Os, and it's also expected to mess up much less than when optimizing
for speed. So maybe forced inlining should be dependent on
!CONFIG_CC_OPTIMIZE_FOR_SIZE?

I.e. like the patch below?

Ingo

----------------->
Subject: allow gcc4 to control inlining

allow gcc4 compilers to decide what to inline and what not - instead
of the kernel forcing gcc to inline all the time.

Signed-off-by: Ingo Molnar <mi...@elte.hu>
Signed-off-by: Arjan van de Ven <ar...@infradead.org>
----

include/linux/compiler-gcc4.h | 13 +++++++++----
1 files changed, 9 insertions(+), 4 deletions(-)

Index: linux-gcc.q/include/linux/compiler-gcc4.h
===================================================================
--- linux-gcc.q.orig/include/linux/compiler-gcc4.h
+++ linux-gcc.q/include/linux/compiler-gcc4.h
@@ -3,14 +3,19 @@
 /* These definitions are for GCC v4.x. */
 #include <linux/compiler-gcc.h>
 
-#define inline inline __attribute__((always_inline))
-#define __inline__ __inline__ __attribute__((always_inline))
-#define __inline __inline __attribute__((always_inline))
+
+#ifndef CONFIG_CC_OPTIMIZE_FOR_SIZE
+# define inline inline __attribute__((always_inline))
+# define __inline__ __inline__ __attribute__((always_inline))
+# define __inline __inline __attribute__((always_inline))
+#endif
+
 #define __deprecated __attribute__((deprecated))
 #define __attribute_used__ __attribute__((__used__))
 #define __attribute_pure__ __attribute__((pure))
 #define __attribute_const__ __attribute__((__const__))
-#define noinline __attribute__((noinline))
+#define noinline __attribute__((noinline))
+#define __always_inline inline __attribute__((always_inline))
 #define __must_check __attribute__((warn_unused_result))
 #define __compiler_offsetof(a,b) __builtin_offsetof(a,b)
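
For readers who want to see what the patch changes from a caller's point of view, here is a user-space sketch of the three annotation levels it leaves available. The macro names mirror the patch; the config symbol is defined by hand as a stand-in for a real -Os kernel build, and the functions add_one(), twice(), slow_path() and combine() are invented for illustration.

```c
#include <assert.h>

/* Stand-in for the kernel config symbol; remove to get forced inlining back. */
#define CONFIG_CC_OPTIMIZE_FOR_SIZE 1

#ifndef CONFIG_CC_OPTIMIZE_FOR_SIZE
# define inline inline __attribute__((always_inline))
#endif

/* C library headers may already define this name, so drop theirs first. */
#undef __always_inline
#define __always_inline	inline __attribute__((always_inline))
#define noinline	__attribute__((noinline))

/* Plain hint: with the config symbol set, gcc is free to decide either way. */
static inline int add_one(int x)
{
	return x + 1;
}

/* Forced: inlined regardless of the config symbol. */
static __always_inline int twice(int x)
{
	return 2 * x;
}

/* Forced the other way: always stays out of line. */
static noinline int slow_path(int x)
{
	return x - 1;
}

int combine(int x)
{
	return slow_path(twice(add_one(x)));
}

/* The result does not depend on the inlining decisions, only the code size does. */
void annotation_check(void)
{
	assert(combine(3) == 7);
}
```

The point of the sketch is that functions which really must be inlined can still say so explicitly via __always_inline, while a bare "inline" goes back to being a hint in the size-optimized configuration.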

Krzysztof Halasa

Dec 28, 2005, 7:00:16 PM

Ingo Molnar <mi...@elte.hu> writes:

>> gcc version 4.0.2 20051109 (Red Hat 4.0.2-6)
>
> another thing: i wanted to decrease the size of -Os
> (CONFIG_CC_OPTIMIZE_FOR_SIZE) kernels, which e.g. Fedora uses too (to
> keep the icache footprint down).

Remember the above gcc miscompiles the x86-32 kernel with -Os:

https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=173764
--
Krzysztof Halasa

Rogério Brito

Dec 28, 2005, 7:40:06 PM

On Dec 28 2005, Ingo Molnar wrote:
> how about giving the inlining stuff some more exposure in -mm (if it's
> fine with Andrew), to check for any regressions? I'd suggest the same
> for the unit-at-a-time thing too, in any case.

I am willing to give the patches a try on both ia32 and ppc (which is
what I have at hand). I'm using Debian testing, but I can, perhaps, give
GCC 4.1 a shot (if I happen to get my hands on such a patched tree soon
enough).

I am interested in anything that could bring me memory reduction.
Actually, I am even considering using the -tiny patches here on my
father's computer---an old Pentium MMX 200MHz with 64MB of RAM.

Also, the PowerMac 9500 that I have here was inherited from my uncle and
it has a slow SCSI disk (only 2MB/s of transfer rates) and 192MB of RAM.
Anything that makes it avoid hitting swap is a plus, as you can imagine.


Thanks, Rogério.

--
Rogério Brito : rbr...@ime.usp.br : http://www.ime.usp.br/~rbrito
Homepage of the algorithms package : http://algorithms.berlios.de
Homepage on freshmeat: http://freshmeat.net/projects/algorithms/

Andrew Morton

Dec 28, 2005, 11:20:09 PM

Ingo Molnar <mi...@elte.hu> wrote:
>
> I think gcc should arguably not be forced to inline things when doing
> -Os, and it's also expected to mess up much less than when optimizing
> for speed. So maybe forced inlining should be dependent on
> !CONFIG_CC_OPTIMIZE_FOR_SIZE?

When it comes to inlining I just don't trust gcc as far as I can spit it.
We're putting the kernel at the mercy of future random brainfarts and bugs
from the gcc guys. It would be better and safer IMO to continue to force
`inline' to have strict and sane semantics, and to simply be vigilant about
our use of it.

IOW: I'd prefer that we be the ones who specify which functions are going
to be inlined and which ones are not.


If no-forced-inlining makes the kernel smaller then we probably have (yet
more) incorrect inlining. We should hunt those down and fix them. We did
quite a lot of this in 2.5.x/2.6.early. Didn't someone have a script which
would identify which functions are a candidate for uninlining?

Adrian Bunk

Dec 28, 2005, 11:50:13 PM

On Wed, Dec 28, 2005 at 12:46:37PM +0100, Ingo Molnar wrote:
> this patchset (for the 2.6.16 tree) consists of two patches:
>
> gcc-no-forced-inlining.patch
> gcc-unit-at-a-time.patch
>
> the purpose of these patches is to reduce the kernel's .text size, in
> particular if CONFIG_CC_OPTIMIZE_FOR_SIZE is specified. The effect of
> the patches on x86 is:
>
>    text    data     bss     dec     hex filename
> 3286166  869852  387260 4543278  45532e vmlinux-orig
> 3194123  955168  387260 4536551  4538e7 vmlinux-inline
>...

The most interesting question is:
Which object files do these savings come from?

We have two cases in the kernel:
- header files where forced inlining is required
- C files where forced inlining is nearly always wrong

The classic example is functions that were marked "inline" when they
were tiny and had one caller, but now are huge and have many callers.

An interesting number would be the space saving after doing some kind of
s/inline//g in all .c files.

> unit-at-a-time still increases the kernel stack footprint somewhat (by
> about 5% in the CC_OPTIMIZE_FOR_SIZE case), but not by the insane degree
> gcc3 used to, which prompted the original -fno-unit-at-a-time addition.

>...

Please hold off this patch.

I do already plan to look at this after the smoke has cleared after the
4k stacks issue. I want to avoid two different knobs both with negative
effects on stack usage (currently CONFIG_4KSTACKS=y, and after your
patch gcc >= 4.0) giving a low testing coverage of the worst cases.

> Ingo

cu
Adrian

--

"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed

Ingo Molnar

Dec 29, 2005, 2:40:08 AM

* Andrew Morton <ak...@osdl.org> wrote:

> Ingo Molnar <mi...@elte.hu> wrote:
> >
> > I think gcc should arguably not be forced to inline things when doing
> > -Os, and it's also expected to mess up much less than when optimizing
> > for speed. So maybe forced inlining should be dependent on
> > !CONFIG_CC_OPTIMIZE_FOR_SIZE?
>
> When it comes to inlining I just don't trust gcc as far as I can spit
> it. We're putting the kernel at the mercy of future random brainfarts
> and bugs from the gcc guys. It would be better and safer IMO to
> continue to force `inline' to have strict and sane semantics, and to
> simply be vigilant about our use of it.

i think there's quite an attitude here - we are at the mercy of "gcc
brainfarts" anyway, and users are at the mercy of "kernel brainfarts"
just as much. Should users disable swapping and trash-talk it just
because the Linux kernel used to have a poor VM? (And the gcc folks are
certainly listening - it's not like they were unwilling to fix stuff,
they simply had their own decade-old technological legacies that made
certain seemingly random problems much harder to attack. E.g. -Os has
recently been improved quite significantly in to-be-gcc-4.2.)

at least let us allow gcc to do it in the CONFIG_CC_OPTIMIZE_FOR_SIZE case,
-Os means "optimize for space" - no ifs and whens, it's a _very_ clear
and definite goal. I don't think there's much space for gcc to mess up
there, it's a mostly binary decision: either the inlining of a
particular function saves space, or not.
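
As a worked illustration of that "binary decision", here is a deliberately simplified size model in C. It is not gcc's heuristic: it ignores everything inlining enables afterwards (constant propagation, dead-code removal) and the byte counts used in the checks are made up.

```c
#include <assert.h>

/* .text bytes if the function stays out of line: one body, one call per site. */
long size_out_of_line(long body, long call, long sites)
{
	return body + call * sites;
}

/* .text bytes if the body is copied into every call site. */
long size_inlined(long body, long sites)
{
	return body * sites;
}

/* The -Os question in this toy model: does inlining make .text smaller? */
int inlining_saves_space(long body, long call, long sites)
{
	return size_inlined(body, sites) < size_out_of_line(body, call, sites);
}

void size_model_check(void)
{
	/* Tiny body, many callers: 40 bytes inlined vs 54 out of line. */
	assert(inlining_saves_space(4, 5, 10) == 1);
	/* Large body, many callers: 2000 bytes inlined vs 250 out of line. */
	assert(inlining_saves_space(200, 5, 10) == 0);
	/* Large body, single caller: 200 bytes inlined vs 205 out of line. */
	assert(inlining_saves_space(200, 5, 1) == 1);
}
```

The third case is why the number of callers matters: a large function with a single call site is still a size win when inlined, which is information the compiler only has once it can see the whole file.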

in the other case, when optimizing for speed, the decisions are a lot
less clear, and gcc has arguably a lot more leeway to mess up.

also, there's a fundamental conflict of 'size vs. speed' here,
for a certain boundary region. For the extremes, very small and very
large functions, the decision is clear, but if e.g. a CPU has tons of
cache, it might prefer more aggressive inlining than if it doesn't. So
it's not like we can do it in a fully static manner.

> If no-forced-inlining makes the kernel smaller then we probably have
> (yet more) incorrect inlining. We should hunt those down and fix them.
> We did quite a lot of this in 2.5.x/2.6.early. Didn't someone have a
> script which would identify which functions are a candidate for
> uninlining?

this is going to be a never ending battle, and it's not about peanuts
either: we are talking about 5% of .text space here, on a .config that
carries most of the important subsystems and drivers. Do we really want
to take on this battle and fight it for 30,000+ kernel functions - when
gcc today can arguably do a _better_ job than what we attempted to do
manually for years? We went to great trouble going to BK just to make
development easier - shouldn't we let a fully open-source tool like gcc
make our lives easier and not worry about details like that? Whether to
inline or not _is_ mostly thoughtless work with almost zero intellect
in it. I'd rather trust gcc to do it than some script doing the same much
worse.

Ingo

Ingo Molnar

Dec 29, 2005, 2:50:09 AM

* Krzysztof Halasa <k...@pm.waw.pl> wrote:

> Ingo Molnar <mi...@elte.hu> writes:
>
> >> gcc version 4.0.2 20051109 (Red Hat 4.0.2-6)
>
> > another thing: i wanted to decrease the size of -Os
> > (CONFIG_CC_OPTIMIZE_FOR_SIZE) kernels, which e.g. Fedora uses too (to
> > keep the icache footprint down).
>
> Remember the above gcc miscompiles the x86-32 kernel with -Os:
>
> https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=173764

i'm not sure what the point is. There was no sudden rush of -Os related
bugs when Fedora switched to it for the kernel, and the 35% code-size
savings were certainly worth it in terms of icache footprint. Yes, -Os
is a major change for how the compiler works, and the kernel is a major
piece of software.

Ingo

Arjan van de Ven

Dec 29, 2005, 3:00:21 AM

> IOW: I'd prefer that we be the ones who specify which functions are going
> to be inlined and which ones are not.

a bold statement... especially since the "and which ones are not" isn't
currently there, we still leave gcc a lot of freedom there ... but only
in one direction.

Ingo Molnar

Dec 29, 2005, 3:10:05 AM

* Adrian Bunk <bu...@stusta.de> wrote:

> > unit-at-a-time still increases the kernel stack footprint somewhat (by
> > about 5% in the CC_OPTIMIZE_FOR_SIZE case), but not by the insane degree
> > gcc3 used to, which prompted the original -fno-unit-at-a-time addition.
> >...
>
> Please hold off this patch.
>
> I do already plan to look at this after the smoke has cleared after
> the 4k stacks issue. I want to avoid two different knobs both with
> negative effects on stack usage (currently CONFIG_4KSTACKS=y, and
> after your patch gcc >= 4.0) giving a low testing coverage of the
> worst cases.

this is obviously not 2.6.15 stuff, so we've got enough time to see the
effects. [ And what does "I do plan to look at this" mean? When
precisely, and can i thus go to other topics without the issue being
dropped on the floor indefinitely? ]

also note that the inlining patch actually _reduces_ average stack
footprint by ~3-4%:
                                     orig   +inlining
  # of functions above 256 bytes:     683         660
  total stackspace, bytes:         148492      142884

it is the unit-at-a-time patch that increases stack footprint (by about
7-8%, which together with the inlining patch gives a net ~5%).

Ingo

Dave Jones

Dec 29, 2005, 3:10:09 AM

On Thu, Dec 29, 2005 at 08:41:07AM +0100, Ingo Molnar wrote:
>
> * Krzysztof Halasa <k...@pm.waw.pl> wrote:
>
> > Ingo Molnar <mi...@elte.hu> writes:
> >
> > >> gcc version 4.0.2 20051109 (Red Hat 4.0.2-6)
> >
> > > another thing: i wanted to decrease the size of -Os
> > > (CONFIG_CC_OPTIMIZE_FOR_SIZE) kernels, which e.g. Fedora uses too (to
> > > keep the icache footprint down).
> >
> > Remember the above gcc miscompiles the x86-32 kernel with -Os:
> >
> > https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=173764
>
> i'm not sure what the point is. There was no sudden rush of -Os related
> bugs when Fedora switched to it for the kernel, and the 35% code-size
> savings were certainly worth it in terms of icache footprint. Yes, -Os
> is a major change for how the compiler works, and the kernel is a major
> piece of software.

The bug referenced is also fixed in gcc 4.1

Dave

Adrian Bunk

Dec 29, 2005, 9:00:37 AM

On Thu, Dec 29, 2005 at 08:59:36AM +0100, Ingo Molnar wrote:
>
> * Adrian Bunk <bu...@stusta.de> wrote:
>
> > > unit-at-a-time still increases the kernel stack footprint somewhat (by
> > > about 5% in the CC_OPTIMIZE_FOR_SIZE case), but not by the insane degree
> > > gcc3 used to, which prompted the original -fno-unit-at-a-time addition.
> > >...
> >
> > Please hold off this patch.
> >
> > I do already plan to look at this after the smoke has cleared after
> > the 4k stacks issue. I want to avoid two different knobs both with
> > negative effects on stack usage (currently CONFIG_4KSTACKS=y, and
> > after your patch gcc >= 4.0) giving a low testing coverage of the
> > worst cases.
>
> this is obviously not 2.6.15 stuff, so we've got enough time to see the
> effects. [ And what does "I do plan to look at this" mean? When
> precisely, and can i thus go to other topics without the issue being
> dropped on the floor indefinitely? ]

It won't be dropped on the floor indefinitely.

"I do plan to look at this" means that I'd currently estimate this being
2.6.19 stuff.

Yes that's one year from now, but we need it properly analyzed and
tested before getting it into Linus' tree, and I do really want it
untangled from and therefore after 4k stacks.

> also note that the inlining patch actually _reduces_ average stack
> footprint by ~3-4%:
>                                      orig   +inlining
>   # of functions above 256 bytes:     683         660
>   total stackspace, bytes:         148492      142884
>
> it is the unit-at-a-time patch that increases stack footprint (by about
> 7-8%, which together with the inlining patch gives a net ~5%).

The problem with the stack is that average stack usage is relatively
uninteresting - what matters is the worst case stack usage. And I'd
expect the stack footprint improvements you see with less inlining to be
in different places than the deteriorations with unit-at-a-time.

> Ingo

cu
Adrian


Christoph Hellwig

Dec 29, 2005, 9:40:26 AM

> another thing: i wanted to decrease the size of -Os
> (CONFIG_CC_OPTIMIZE_FOR_SIZE) kernels, which e.g. Fedora uses too (to
> keep the icache footprint down).
>
> I think gcc should arguably not be forced to inline things when doing
> -Os, and it's also expected to mess up much less than when optimizing
> for speed. So maybe forced inlining should be dependent on
> !CONFIG_CC_OPTIMIZE_FOR_SIZE?

I don't care too much whether we put always_inline or inline on the functions
we _really_ want to inline. But all others shouldn't have any inline marker.
So instead of changing the pretty useful redefinitions we have to keep the
code a little more readable, what about getting rid of all the stupid inlines
we have all over the place? I think many things we have as static inline in
headers now should move to proper out of line functions. This is more work,
but also more useful than just flipping a bit.

Arjan van de Ven

Dec 29, 2005, 10:00:37 AM

> I don't care too much whether we put always_inline or inline at the function
> we _really_ want to inline. But all others shouldn't have any inline marker.
> So instead of changing the pretty useful redefinitions we have to keep the
> code a little more readable what about getting rid of all the stupid inlines
> we have over the place?

just in drivers/ there are well over 6400 of those. Changing most of
those is going to be a huge effort. The reality is, most driver writers
(in fact kernel code writers) tend to overestimate the gain of inline in
THEIR code, and to underestimate the cumulative cost of it. Despite what
akpm says, I think gcc can make a better judgement than most of these
authors (probably including me :). We can remove 6400 now, but a year
from now, another 1000 have been added back again I bet.

You describe a nice utopia where only the most essential functions are
inlined.. but so far that hasn't worked out all that well ;) Turning
"inline" back into the hint to the compiler that the C language makes it
is maybe a cop-out, but it's a sustainable approach at least.

> I think many things we have static inline in headers
> now should move to proper out of line functions.

I suspect the biggest gains aren't the ones in the headers; those tend
to be quite small and often mostly optimize away due to constant
arguments (there may be a few exceptions of course), and also have been
attacked by various people in the 2.5/2.6 series before. It's the local
functions that got too many "inline" hints.

Horst von Brand

Dec 29, 2005, 10:10:11 AM

Ingo Molnar <mi...@elte.hu> wrote:
> * Andrew Morton <ak...@osdl.org> wrote:
> > Ingo Molnar <mi...@elte.hu> wrote:
> > > I think gcc should arguably not be forced to inline things when doing
> > > -Os, and it's also expected to mess up much less than when optimizing
> > > for speed. So maybe forced inlining should be dependent on
> > > !CONFIG_CC_OPTIMIZE_FOR_SIZE?

> > When it comes to inlining I just don't trust gcc as far as I can spit
> > it. We're putting the kernel at the mercy of future random brainfarts
> > and bugs from the gcc guys. It would be better and safer IMO to
> > continue to force `inline' to have strict and sane semantics, and to
> > simply be vigilant about our use of it.

> i think there's quite an attitude here - we are at the mercy of "gcc
> brainfarts" anyway, and users are at the mercy of "kernel brainfarts"
> just as much. Should users disable swapping and trash-talk it just
> because the Linux kernel used to have a poor VM? (And the gcc folks are
> certainly listening - it's not like they were unwilling to fix stuff,
> they simply had their own decade-old technological legacies that made
> certain seemingly random problems much harder to attack. E.g. -Os has
> recently been improved quite significantly in to-be-gcc-4.2.)

Also, we do trust gcc not to screw up on lots of other stuff. I.e., we
trust it to use registers wisely (register anyone?), to set up sane
counting loops and related array handling (no one is using pointers to
traverse arrays "for speed" anymore), and to select the best code sequence
for the machine at hand in lots of cases, ... And not only for the kernel,
for the whole userspace too!

Sure, this is a large change, and it might be warranted to place it under
CONFIG_NEW_COMPILER_OPTIONS (Marked experimental, high explosive, etc if it
makes you too uneasy).
--
Dr. Horst H. von Brand User #22616 counter.li.org
Departamento de Informatica Fono: +56 32 654431
Universidad Tecnica Federico Santa Maria +56 32 654239
Casilla 110-V, Valparaiso, Chile Fax: +56 32 797513

Horst von Brand

Dec 29, 2005, 10:10:26 AM

Arjan van de Ven <ar...@infradead.org> wrote:
> > IOW: I'd prefer that we be the ones who specify which functions are going
> > to be inlined and which ones are not.

> a bold statement... especially since the "and which ones are not" isn't
> currently there, we still leave gcc a lot of freedom there ... but only
> in one direction.

Besides, this is currently an everywhere or nowhere switch. gcc (in
principle at least) could decide which calls to inline and for which ones
it isn't worth it. Just like the (also long to die) "register" keyword.

Adrian Bunk

Dec 29, 2005, 10:50:17 AM

On Thu, Dec 29, 2005 at 03:54:09PM +0100, Arjan van de Ven wrote:
>
> > I don't care too much whether we put always_inline or inline at the function
> > we _really_ want to inline. But all others shouldn't have any inline marker.
> > So instead of changing the pretty useful redefinitions we have to keep the
> > code a little more readable what about getting rid of all the stupid inlines
> > we have over the place?
>
> just in drivers/ there are well over 6400 of those. Changing most of
> those is going to be a huge effort. The reality is, most driver writers
> (in fact kernel code writers) tend to overestimate the gain of inline in
> THEIR code, and to underestimate the cumulative cost of it. Despite what
> akpm says, I think gcc can make a better judgement than most of these
> authors (probably including me :). We can remove 6400 now, but a year
> from now, another 1000 have been added back again I bet.

Are we that bad at reviewing code?

An "inline" in a .c file is simply nearly always wrong in the kernel,
and unless the author has a good justification for it it should be
removed.

> You describe a nice utopia where only the most essential functions are
> inlined.. but so far that hasn't worked out all that well ;) Turning
> "inline" back into the hint to the compiler that the C language makes it
> is maybe a cop-out, but it's a sustainable approach at least.

>...

But shouldn't nowadays gcc be able to know best even without an "inline"
hint?

cu
Adrian

--

"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed

Arjan van de Ven

Dec 29, 2005, 10:50:19 AM

> > You describe a nice utopia where only the most essential functions are
> > inlined.. but so far that hasn't worked out all that well ;) Turning
> > "inline" back into the hint to the compiler that the C language makes it
> > is maybe a cop-out, but it's a sustainable approach at least.
> >...
>
> But shouldn't nowadays gcc be able to know best even without an "inline"
> hint?

it will, the inline hint only affects the thresholds so it's not
entirely without effects, but I can imagine that there are cases that
truly are performance critical and can be optimized out and where you
do want to help gcc a bit (say a one line wrapper around readl or
writel). Otoh I suspect that modern gcc will be more than smart enough
and inline one liners anyway (if they're static of course).

Adrian Bunk

Dec 29, 2005, 10:50:20 AM
On Thu, Dec 29, 2005 at 08:32:59AM +0100, Ingo Molnar wrote:
>
> * Andrew Morton <ak...@osdl.org> wrote:
>
> > Ingo Molnar <mi...@elte.hu> wrote:
> > >
> > > I think gcc should arguably not be forced to inline things when doing
> > > -Os, and it's also expected to mess up much less than when optimizing
> > > for speed. So maybe forced inlining should be dependent on
> > > !CONFIG_CC_OPTIMIZE_FOR_SIZE?
> >
> > When it comes to inlining I just don't trust gcc as far as I can spit
> > it. We're putting the kernel at the mercy of future random brainfarts
> > and bugs from the gcc guys. It would be better and safer IMO to
> > > continue to force `inline' to have strict and sane semantics, and to
> > simply be vigilant about our use of it.
>
> i think there's quite an attitude here - we are at the mercy of "gcc
> brainfarts" anyway, and users are at the mercy of "kernel brainfarts"
> just as much. Should users disable swapping and trash-talk it just
> because the Linux kernel used to have a poor VM? (And the gcc folks are
> certainly listening - it's not like they were unwilling to fix stuff,
> they simply had their own decade-old technological legacies that made
> certain seemingly random problems much harder to attack. E.g. -Os has
> recently been improved quite significantly in to-be-gcc-4.2.)
>...

> also, there's a fundamental conflict of 'size vs. performance' here,
> for a certain boundary region. For the extremes, very small and very
> large functions, the decision is clear, but if e.g. a CPU has tons of
> cache, it might prefer more aggressive inlining than if it doesn't. So
> it's not like we can do it in a fully static manner.
>...

I'd formulate it the other way round from Andrew:

We should force gcc to inline code where we do know best
("static inline"s in header files) and leave the decision
to gcc in the cases where gcc should know best controlled
by some high-level knobs like -Os/-O2.

gcc simply needs to be forced to inline in some cases in which we really
need inlining, but in all other cases gcc knows best and we can trust
gcc to make the right decision.

> Ingo

cu
Adrian
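
To make the two policies in this message concrete, here is a minimal sketch
in C. The macro names FORCE_INLINE and HINT_INLINE and both helper functions
are made up for illustration and are not taken from the kernel; the only real
gcc feature used is __attribute__((always_inline)). The first function asks
gcc to inline it regardless of its size heuristics, the second leaves the
decision to gcc and to high-level knobs such as -Os/-O2.

```c
/* Hypothetical macro names, for illustration only. */
#define FORCE_INLINE static inline __attribute__((always_inline))
#define HINT_INLINE  static inline

/* Tiny accessor: the caller insists on inlining (bits must be < 32). */
FORCE_INLINE unsigned int low_bits(unsigned int value, unsigned int bits)
{
	return value & ((1u << bits) - 1u);
}

/* Larger helper: "inline" is only a hint here, gcc makes the call. */
HINT_INLINE unsigned int rolling_sum(const unsigned char *buf, unsigned int len)
{
	unsigned int sum = 0;
	unsigned int i;

	/* Rotate left by one bit, then mix in the next byte. */
	for (i = 0; i < len; i++)
		sum = ((sum << 1) | (sum >> 31)) ^ buf[i];
	return sum;
}
```

Both functions behave identically whether or not they end up inlined; only
the generated code size and call overhead differ.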

--

"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed

Jakub Jelinek

Dec 29, 2005, 10:50:20 AM
On Thu, Dec 29, 2005 at 04:35:29PM +0100, Adrian Bunk wrote:
> > You describe a nice utopia where only the most essential functions are
> > inlined.. but so far that hasn't worked out all that well ;) Turning
> > "inline" back into the hint to the compiler that the C language makes it
> > is maybe a cop-out, but it's a sustainable approach at least.
> >...
>
> But shouldn't nowadays gcc be able to know best even without an "inline"
> hint?

Only for static functions (and in -funit-at-a-time mode).
Anything else would require full IMA over the whole kernel and we aren't
there yet. So inline hints are useful. But most of the inline keywords
in the kernel really should be that, hints, because e.g. while it can be
beneficial to inline something on one arch, it may be not beneficial on
another arch, depending on cache sizes, number of general registers
available to the compiler, register pressure, speed of the call/ret
pair, calling convention and many other factors.

Jakub

Linus Torvalds

Dec 29, 2005, 12:50:07 PM

On Thu, 29 Dec 2005, Ingo Molnar wrote:
>
> * Andrew Morton <ak...@osdl.org> wrote:
> >
> > When it comes to inlining I just don't trust gcc as far as I can spit
> > it. We're putting the kernel at the mercy of future random brainfarts
> > and bugs from the gcc guys. It would be better and safer IMO to
> > continue to force `inline' to have strict and sane semantics, and to
> > simply be vigilant about our use of it.
>
> i think there's quite an attitude here - we are at the mercy of "gcc
> brainfarts" anyway, and users are at the mercy of "kernel brainfarts"
> just as much.

There's a huge difference here. The gcc people very much have a "Oh, we
changed old documented behaviour - live with it" attitude, together with
"That was a gcc extension, not part of the C language, so when we change
how gcc behaves, it's _your_ problem" approach.

At least they used to.

So yes, there's a huge attitude difference. The gcc people have a BAD
attitude. When the meaning of "inline" changed (from a "inline this" to
"hey, it's a hint"), the gcc people never EVER said "sorry". They
effectively said "screw you".

I know this is why I don't trust gcc wrt inlining. It's not so much about
any technical issues, as about the fact that the kernel tends to be a lot
heavier user of gcc features than most programs, and has correctness
issues with them, AND THE GCC PEOPLE SIMPLY DON'T CARE.

Comparing it to the kernel is ludicrous. We care about user-space
interfaces to an insane degree. We go to extreme lengths to maintain even
badly designed or unintentional interfaces. Breaking user programs simply
isn't acceptable. We're _not_ like the gcc developers. We know that
people use old binaries for years and years, and that making a new
release doesn't mean that you can just throw that out. You can trust us.

Maybe gcc development has changed. Maybe it hasn't.

THAT is what makes me worry. I don't know if this is why Andrew doesn't
trust inlining, but I suspect it has similar roots. Not trusting it
because we haven't been able to trust the people behind it. No heads-up,
no warnings, no discussions. Just a "screw you, things changed, your
usage doesn't matter, and we're not even interested in listening to you
or telling you why things changed".

There have been situations where documented gcc semantics changed, and
instead of saying "sorry", the gcc people changed the documentation. What
the hell is the point of documented semantics if you can't depend on them
anyway?

One thing we could do: I think modern gcc's at least have an option to
warn when they don't inline something. It might make sense to just enable
that warning, and see _which_ functions -Os and -funit-at-a-time say are
too large to be inlined.

Maybe the right thing to do is to just heed that warning, and remove such
functions from header files and make them no-inline? That way we get the
size fixes _regardless_ of any compiler options.

Linus
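
A rough sketch of the workflow suggested above, assuming gcc's -Winline
option (which warns when a function declared inline is not inlined). The
function names are hypothetical. Suppose count_set_bits() used to be
"static inline" in a header and was reported as too large: the fix is to
drop the keyword, here made explicit with gcc's noinline attribute, while
the trivial wrapper keeps its plain inline hint.

```c
#include <stddef.h>

/*
 * Hypothetical helper, for illustration only.  It is no longer marked
 * "inline"; the noinline attribute states the intent explicitly.
 */
static __attribute__((noinline)) size_t count_set_bits(const unsigned char *buf,
							size_t len)
{
	size_t bits = 0;
	size_t i;

	for (i = 0; i < len; i++) {
		unsigned char byte = buf[i];

		/* Count the set bits of this byte one at a time. */
		while (byte) {
			bits += byte & 1u;
			byte >>= 1;
		}
	}
	return bits;
}

/* One-line wrapper: small enough that an inline hint is harmless. */
static inline int buffer_is_clear(const unsigned char *buf, size_t len)
{
	return count_set_bits(buf, len) == 0;
}
```

Building with something like "gcc -O2 -Winline -c file.c" lists the
candidates; which functions get reported depends on the gcc version and its
inlining limits, so the list will differ between -Os and -O2.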

Arjan van de Ven

Dec 29, 2005, 2:00:25 PM

>
> One thing we could do: I think modern gcc's at least have an option to
> warn when they don't inline something. It might make sense to just enable
> that warning, and see _which_ functions -Os and -funit-at-a-time say are
> too large to be inlined.


with -Os gcc gets a bit picky and warns a LOT; with -O2... you get the
following fixes (all huge functions)


diff -purN linux-org/drivers/acpi/ec.c linux-2.6.15-rc6/drivers/acpi/ec.c
--- linux-org/drivers/acpi/ec.c 2005-10-28 02:02:08.000000000 +0200
+++ linux-2.6.15-rc6/drivers/acpi/ec.c 2005-12-29 19:21:37.000000000 +0100
@@ -153,7 +153,7 @@ static int acpi_ec_polling_mode = EC_POL
Transaction Management
-------------------------------------------------------------------------- */

-static inline u32 acpi_ec_read_status(union acpi_ec *ec)
+static u32 acpi_ec_read_status(union acpi_ec *ec)
{
u32 status = 0;

diff -purN linux-org/drivers/bluetooth/hci_bcsp.c linux-2.6.15-rc6/drivers/bluetooth/hci_bcsp.c
--- linux-org/drivers/bluetooth/hci_bcsp.c 2005-12-22 19:54:33.000000000 +0100
+++ linux-2.6.15-rc6/drivers/bluetooth/hci_bcsp.c 2005-12-29 19:23:21.000000000 +0100
@@ -494,7 +494,7 @@ static inline void bcsp_unslip_one_byte(
}
}

-static inline void bcsp_complete_rx_pkt(struct hci_uart *hu)
+static void bcsp_complete_rx_pkt(struct hci_uart *hu)
{
struct bcsp_struct *bcsp = hu->priv;
int pass_up;
diff -purN linux-org/drivers/char/drm/r128_state.c linux-2.6.15-rc6/drivers/char/drm/r128_state.c
--- linux-org/drivers/char/drm/r128_state.c 2005-12-22 19:54:33.000000000 +0100
+++ linux-2.6.15-rc6/drivers/char/drm/r128_state.c 2005-12-29 19:24:59.000000000 +0100
@@ -220,7 +220,7 @@ static __inline__ void r128_emit_tex1(dr
ADVANCE_RING();
}

-static __inline__ void r128_emit_state(drm_r128_private_t * dev_priv)
+static void r128_emit_state(drm_r128_private_t * dev_priv)
{
drm_r128_sarea_t *sarea_priv = dev_priv->sarea_priv;
unsigned int dirty = sarea_priv->dirty;
diff -purN linux-org/drivers/isdn/hisax/avm_pci.c linux-2.6.15-rc6/drivers/isdn/hisax/avm_pci.c
--- linux-org/drivers/isdn/hisax/avm_pci.c 2005-12-22 19:54:33.000000000 +0100
+++ linux-2.6.15-rc6/drivers/isdn/hisax/avm_pci.c 2005-12-29 19:29:31.000000000 +0100
@@ -358,7 +358,7 @@ hdlc_fill_fifo(struct BCState *bcs)
}
}

-static inline void
+static void
HDLC_irq(struct BCState *bcs, u_int stat) {
int len;
struct sk_buff *skb;
diff -purN linux-org/drivers/isdn/hisax/diva.c linux-2.6.15-rc6/drivers/isdn/hisax/diva.c
--- linux-org/drivers/isdn/hisax/diva.c 2005-10-28 02:02:08.000000000 +0200
+++ linux-2.6.15-rc6/drivers/isdn/hisax/diva.c 2005-12-29 19:29:42.000000000 +0100
@@ -476,7 +476,7 @@ Memhscx_fill_fifo(struct BCState *bcs)
}
}

-static inline void
+static void
Memhscx_interrupt(struct IsdnCardState *cs, u_char val, u_char hscx)
{
u_char r;
diff -purN linux-org/drivers/isdn/hisax/hscx_irq.c linux-2.6.15-rc6/drivers/isdn/hisax/hscx_irq.c
--- linux-org/drivers/isdn/hisax/hscx_irq.c 2005-10-28 02:02:08.000000000 +0200
+++ linux-2.6.15-rc6/drivers/isdn/hisax/hscx_irq.c 2005-12-29 19:30:21.000000000 +0100
@@ -119,7 +119,7 @@ hscx_fill_fifo(struct BCState *bcs)
}
}

-static inline void
+static void
hscx_interrupt(struct IsdnCardState *cs, u_char val, u_char hscx)
{
u_char r;
@@ -221,7 +221,7 @@ hscx_interrupt(struct IsdnCardState *cs,
}
}

-static inline void
+static void
hscx_int_main(struct IsdnCardState *cs, u_char val)
{

diff -purN linux-org/drivers/isdn/hisax/jade_irq.c linux-2.6.15-rc6/drivers/isdn/hisax/jade_irq.c
--- linux-org/drivers/isdn/hisax/jade_irq.c 2005-10-28 02:02:08.000000000 +0200
+++ linux-2.6.15-rc6/drivers/isdn/hisax/jade_irq.c 2005-12-29 19:30:07.000000000 +0100
@@ -110,7 +110,7 @@ jade_fill_fifo(struct BCState *bcs)
}


-static inline void
+static void
jade_interrupt(struct IsdnCardState *cs, u_char val, u_char jade)
{
u_char r;
diff -purN linux-org/drivers/md/dm-crypt.c linux-2.6.15-rc6/drivers/md/dm-crypt.c
--- linux-org/drivers/md/dm-crypt.c 2005-12-22 19:54:33.000000000 +0100
+++ linux-2.6.15-rc6/drivers/md/dm-crypt.c 2005-12-29 19:28:58.000000000 +0100
@@ -228,7 +228,7 @@ static struct crypt_iv_operations crypt_
};


-static inline int
+static int
crypt_convert_scatterlist(struct crypt_config *cc, struct scatterlist *out,
struct scatterlist *in, unsigned int length,
int write, sector_t sector)
diff -purN linux-org/drivers/media/video/cx25840/cx25840-audio.c linux-2.6.15-rc6/drivers/media/video/cx25840/cx25840-audio.c
--- linux-org/drivers/media/video/cx25840/cx25840-audio.c 2005-12-22 19:54:33.000000000 +0100
+++ linux-2.6.15-rc6/drivers/media/video/cx25840/cx25840-audio.c 2005-12-29 19:31:11.000000000 +0100
@@ -23,7 +23,7 @@

#include "cx25840.h"

-inline static int set_audclk_freq(struct i2c_client *client,
+static int set_audclk_freq(struct i2c_client *client,
enum v4l2_audio_clock_freq freq)
{
struct cx25840_state *state = i2c_get_clientdata(client);
diff -purN linux-org/drivers/media/video/tvp5150.c linux-2.6.15-rc6/drivers/media/video/tvp5150.c
--- linux-org/drivers/media/video/tvp5150.c 2005-12-22 19:54:33.000000000 +0100
+++ linux-2.6.15-rc6/drivers/media/video/tvp5150.c 2005-12-29 19:31:41.000000000 +0100
@@ -87,7 +87,7 @@ struct tvp5150 {
int sat;
};

-static inline int tvp5150_read(struct i2c_client *c, unsigned char addr)
+static int tvp5150_read(struct i2c_client *c, unsigned char addr)
{
unsigned char buffer[1];
int rc;
diff -purN linux-org/drivers/mtd/nand/diskonchip.c linux-2.6.15-rc6/drivers/mtd/nand/diskonchip.c
--- linux-org/drivers/mtd/nand/diskonchip.c 2005-12-22 19:54:34.000000000 +0100
+++ linux-2.6.15-rc6/drivers/mtd/nand/diskonchip.c 2005-12-29 19:31:26.000000000 +0100
@@ -1506,7 +1506,7 @@ static inline int __init doc2001plus_ini
return 1;
}

-static inline int __init doc_probe(unsigned long physadr)
+static int __init doc_probe(unsigned long physadr)
{
unsigned char ChipID;
struct mtd_info *mtd;
diff -purN linux-org/drivers/net/wireless/ipw2100.c linux-2.6.15-rc6/drivers/net/wireless/ipw2100.c
--- linux-org/drivers/net/wireless/ipw2100.c 2005-12-22 19:54:34.000000000 +0100
+++ linux-2.6.15-rc6/drivers/net/wireless/ipw2100.c 2005-12-29 19:33:50.000000000 +0100
@@ -2346,7 +2346,7 @@ static inline void ipw2100_corruption_de
schedule_reset(priv);
}

-static inline void isr_rx(struct ipw2100_priv *priv, int i,
+static void isr_rx(struct ipw2100_priv *priv, int i,
struct ieee80211_rx_stats *stats)
{
struct ipw2100_status *status = &priv->status_queue.drv[i];
@@ -2481,7 +2481,7 @@ static inline int ipw2100_corruption_che
* The WRITE index is cached in the variable 'priv->rx_queue.next'.
*
*/
-static inline void __ipw2100_rx_process(struct ipw2100_priv *priv)
+static void __ipw2100_rx_process(struct ipw2100_priv *priv)
{
struct ipw2100_bd_queue *rxq = &priv->rx_queue;
struct ipw2100_status_queue *sq = &priv->status_queue;
@@ -2634,7 +2634,7 @@ static inline void __ipw2100_rx_process(
* for use by future command and data packets.
*
*/
-static inline int __ipw2100_tx_process(struct ipw2100_priv *priv)
+static int __ipw2100_tx_process(struct ipw2100_priv *priv)
{
struct ipw2100_bd_queue *txq = &priv->tx_queue;
struct ipw2100_bd *tbd;
diff -purN linux-org/drivers/scsi/iscsi_tcp.c linux-2.6.15-rc6/drivers/scsi/iscsi_tcp.c
--- linux-org/drivers/scsi/iscsi_tcp.c 2005-12-22 19:54:34.000000000 +0100
+++ linux-2.6.15-rc6/drivers/scsi/iscsi_tcp.c 2005-12-29 19:32:02.000000000 +0100
@@ -1437,7 +1437,7 @@ iscsi_buf_data_digest_update(struct iscs
}
}

-static inline int
+static int
iscsi_digest_final_send(struct iscsi_conn *conn, struct iscsi_cmd_task *ctask,
struct iscsi_buf *buf, uint32_t *digest, int final)
{
diff -purN linux-org/drivers/video/matrox/matroxfb_maven.c linux-2.6.15-rc6/drivers/video/matrox/matroxfb_maven.c
--- linux-org/drivers/video/matrox/matroxfb_maven.c 2005-10-28 02:02:08.000000000 +0200
+++ linux-2.6.15-rc6/drivers/video/matrox/matroxfb_maven.c 2005-12-29 19:34:05.000000000 +0100
@@ -968,7 +968,7 @@ static inline int maven_compute_timming(
return 0;
}

-static inline int maven_program_timming(struct maven_data* md,
+static int maven_program_timming(struct maven_data* md,
const struct mavenregs* m) {
struct i2c_client* c = md->client;

diff -purN linux-org/fs/9p/conv.c linux-2.6.15-rc6/fs/9p/conv.c
--- linux-org/fs/9p/conv.c 2005-10-28 02:02:08.000000000 +0200
+++ linux-2.6.15-rc6/fs/9p/conv.c 2005-12-29 19:20:19.000000000 +0100
@@ -350,7 +350,7 @@ serialize_stat(struct v9fs_session_info
*
*/

-static inline int
+static int
deserialize_stat(struct v9fs_session_info *v9ses, struct cbuf *bufp,
struct v9fs_stat *stat, struct cbuf *dbufp)
{
diff -purN linux-org/fs/nfsd/nfsxdr.c linux-2.6.15-rc6/fs/nfsd/nfsxdr.c
--- linux-org/fs/nfsd/nfsxdr.c 2005-10-28 02:02:08.000000000 +0200
+++ linux-2.6.15-rc6/fs/nfsd/nfsxdr.c 2005-12-29 19:24:28.000000000 +0100
@@ -151,7 +151,7 @@ decode_sattr(u32 *p, struct iattr *iap)
return p;
}

-static inline u32 *
+static u32 *
encode_fattr(struct svc_rqst *rqstp, u32 *p, struct svc_fh *fhp)
{
struct vfsmount *mnt = fhp->fh_export->ex_mnt;
diff -purN linux-org/net/ieee80211/ieee80211_rx.c linux-2.6.15-rc6/net/ieee80211/ieee80211_rx.c
--- linux-org/net/ieee80211/ieee80211_rx.c 2005-12-22 19:54:36.000000000 +0100
+++ linux-2.6.15-rc6/net/ieee80211/ieee80211_rx.c 2005-12-29 19:24:05.000000000 +0100
@@ -1295,7 +1295,7 @@ static inline int is_beacon(int fc)
return (WLAN_FC_GET_STYPE(le16_to_cpu(fc)) == IEEE80211_STYPE_BEACON);
}

-static inline void ieee80211_process_probe_response(struct ieee80211_device
+static void ieee80211_process_probe_response(struct ieee80211_device
*ieee, struct
ieee80211_probe_response
*beacon, struct ieee80211_rx_stats
diff -purN linux-org/net/netfilter/nfnetlink.c linux-2.6.15-rc6/net/netfilter/nfnetlink.c
--- linux-org/net/netfilter/nfnetlink.c 2005-12-22 19:54:36.000000000 +0100
+++ linux-2.6.15-rc6/net/netfilter/nfnetlink.c 2005-12-29 19:28:08.000000000 +0100
@@ -212,7 +212,7 @@ int nfnetlink_unicast(struct sk_buff *sk
}

/* Process one complete nfnetlink message. */
-static inline int nfnetlink_rcv_msg(struct sk_buff *skb,
+static int nfnetlink_rcv_msg(struct sk_buff *skb,
struct nlmsghdr *nlh, int *errp)
{
struct nfnl_callback *nc;
diff -purN linux-org/sound/oss/esssolo1.c linux-2.6.15-rc6/sound/oss/esssolo1.c
--- linux-org/sound/oss/esssolo1.c 2005-10-28 02:02:08.000000000 +0200
+++ linux-2.6.15-rc6/sound/oss/esssolo1.c 2005-12-29 19:23:05.000000000 +0100
@@ -515,7 +515,7 @@ static inline int prog_dmabuf_adc(struct
return 0;
}

-static inline int prog_dmabuf_dac(struct solo1_state *s)
+static int prog_dmabuf_dac(struct solo1_state *s)
{
unsigned long va;
int c;

Arjan van de Ven

Dec 29, 2005, 2:00:27 PM
On Thu, 2005-12-29 at 19:42 +0100, Arjan van de Ven wrote:
> >
> > One thing we could do: I think modern gcc's at least have an option to
> > warn when they don't inline something. It might make sense to just enable
> > that warning, and see _which_ functions -Os and -funit-at-a-time say are
> > too large to be inlined.
>
>
> with -Os gcc gets a bit picky and warns a LOT; with -O2... you get the
> following fixes (all huge functions)
>


btw this caught one bug that the forced attribute was hiding: there was
a function which was "inline" and which uses a variable sized array.
normally gcc refuses to inline that (rightfully; esp relative addressing
gets really rather complex in that scenario), but the forced attribute
causes it to be inlined anyway. No idea if the result is sane in that
case...
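
For readers unfamiliar with the pattern, the kind of function described here
might look like the following sketch. The function is hypothetical and is not
the one from the kernel tree. Its scratch buffer is a C99 variable-length
array, so the size of the stack frame is only known at run time, which is
what makes inlining it into a caller awkward.

```c
#include <string.h>

/*
 * Hypothetical example of an "inline" function with a variable sized
 * array.  With a plain "inline" hint the compiler is free to keep this
 * out of line; a forced always_inline would take that choice away.
 */
static inline unsigned int sum_via_scratch(const unsigned char *src,
					   unsigned int len)
{
	unsigned int sum = 0;
	unsigned int i;

	if (len == 0)
		return 0;	/* a zero-length VLA would be undefined behaviour */

	{
		unsigned char scratch[len];	/* size known only at run time */

		memcpy(scratch, src, len);
		for (i = 0; i < len; i++)
			sum += scratch[i];
	}
	return sum;
}
```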

Adrian Bunk

Dec 29, 2005, 2:20:13 PM
On Thu, Dec 29, 2005 at 10:42:41AM -0500, Jakub Jelinek wrote:
> On Thu, Dec 29, 2005 at 04:35:29PM +0100, Adrian Bunk wrote:
> > > You describe a nice utopia where only the most essential functions are
> > > inlined.. but so far that hasn't worked out all that well ;) Turning
> > > "inline" back into the hint to the compiler that the C language makes it
> > > is maybe a cop-out, but it's a sustainable approach at least.
> > >...
> >
> > But shouldn't nowadays gcc be able to know best even without an "inline"
> > hint?
>
> Only for static functions (and in -funit-at-a-time mode).

I'm assuming -funit-at-a-time mode. Currently it's disabled on i386, but
this will change in the medium-term future.

> Anything else would require full IMA over the whole kernel and we aren't
> there yet. So inline hints are useful. But most of the inline keywords
> in the kernel really should be that, hints, because e.g. while it can be

Are there (on !alpha) any places in the kernel where a function is
inline but not static, and this is wanted?

> beneficial to inline something on one arch, it may be not beneficial on
> another arch, depending on cache sizes, number of general registers
> available to the compiler, register pressure, speed of the call/ret
> pair, calling convention and many other factors.

Does gcc really need hints when the functions are static?

> Jakub

cu
Adrian

--

"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed

Krzysztof Halasa

Dec 29, 2005, 2:50:11 PM
Ingo Molnar <mi...@elte.hu> writes:

>> Remember the above gcc miscompiles the x86-32 kernel with -Os:
>>
>> https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=173764
>
> i'm not sure what the point is.

Nothing special, just a side note.

> There was no sudden rush of -Os related
> bugs when Fedora switched to it for the kernel,

I found 'ip route add' was broken with -Os. I use FC4s but the kernel
is usually a mutated version of Linus' tree so I can't check it.

> and the 35% code-size
> savings were certainly worth it in terms of icache footprint.

Sure.

Good to hear gcc 4.1 is fixed.
--
Krzysztof Halasa

Horst von Brand

Dec 29, 2005, 3:10:13 PM
Adrian Bunk <bu...@stusta.de> wrote:
> On Thu, Dec 29, 2005 at 08:59:36AM +0100, Ingo Molnar wrote:
> > * Adrian Bunk <bu...@stusta.de> wrote:

> > > > unit-at-a-time still increases the kernel stack footprint somewhat
> > > > (by about 5% in the CC_OPTIMIZE_FOR_SIZE case), but not by the
> > > > insane degree gcc3 used to, which prompted the original
> > > > -fno-unit-at-a-time addition.
> > > >...

> > > Please hold off this patch.
> > >
> > > I do already plan to look at this after the smoke has cleared after
> > > the 4k stacks issue. I want to avoid two different knobs both with
> > > negative effects on stack usage (currently CONFIG_4KSTACKS=y, and
> > > after your patch gcc >= 4.0) giving a low testing coverage of the
> > > worst cases.

This is /one/ knob with effect on stack usage...

> > this is obviously not 2.6.15 stuff, so we've got enough time to see the
> > effects. [ And what does "I do plan to look at this" mean? When
> > precisely, and can i thus go to other topics without the issue being
> > dropped on the floor indefinitely? ]

> It won't be dropped on the floor indefinitely.
>
> "I do plan to look at this" means that I'd currently estimate this being
> 2.6.19 stuff.
>
> Yes that's one year from now, but we need it properly analyzed and
> tested before getting it into Linus' tree, and I do really want it
> untangled from and therefore after 4k stacks.

That is "indefinitely" in my book. Or nearly so. And in the meantime it will
get many hackers to patch it in by hand and forget to tell...

> > also note that the inlining patch actually _reduces_ average stack
> > footprint by ~3-4%:
> > orig +inlining
> > # of functions above 256 bytes: 683 660
> > total stackspace, bytes: 148492 142884
> >
> > it is the unit-at-a-time patch that increases stack footprint (by about
> > 7-8%, which together with the inlining patch gives a net ~5%).
>
> The problem with the stack is that average stack usage is relatively
> uninteresting - what matters is the worst case stack usage. And I'd
> expect the stack footprint improvements you see with less inlining in
> different places than the deteriorations with unit-at-a-time.

That is a red herring. The numbers are for the number of large stack users
(goes down) and cumulative stack usage (goes down too). Sure, if the
number of > 256 bytes stack users goes down while the largest stack uses go
up we are in trouble. And if the grand total goes down but stack usage by
some critical users go up we might be screwed. That could be answered by
looking at the details behind the above numbers. But there is only one way
to find out if it causes problems (and fix them)...
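
The arithmetic behind this exchange can be sketched in C. percent_change()
applied to the two quoted rows (683 -> 660 functions, 148492 -> 142884 bytes)
gives roughly -3.4% and -3.8%, matching the "~3-4%" figure. The per-function
frame sizes below are invented purely to illustrate the worst-case concern
and are not measurements from any kernel.

```c
/* Relative change from 'before' to 'after' in percent (negative = smaller). */
static double percent_change(double before, double after)
{
	return (after - before) / before * 100.0;
}

struct stack_summary {
	unsigned long total;	/* sum of all frame sizes, in bytes */
	unsigned int worst;	/* largest single frame, in bytes */
	unsigned int large;	/* number of frames above the threshold */
};

/* Summarize a list of per-function stack frame sizes. */
static struct stack_summary summarize(const unsigned int *frames,
				      unsigned int count,
				      unsigned int threshold)
{
	struct stack_summary s = { 0, 0, 0 };
	unsigned int i;

	for (i = 0; i < count; i++) {
		s.total += frames[i];
		if (frames[i] > s.worst)
			s.worst = frames[i];
		if (frames[i] > threshold)
			s.large++;
	}
	return s;
}

/* Made-up frame sizes: the totals improve while the worst case gets worse. */
static const unsigned int frames_before[] = { 512, 300, 280, 128 };
static const unsigned int frames_after[]  = { 640, 200, 200, 128 };
```

With a 256-byte threshold the invented data goes from 3 large frames and
1220 bytes in total to 1 large frame and 1168 bytes, yet the largest frame
grows from 512 to 640 bytes. That is the case where both aggregate numbers
look better and the deepest call chain can still get worse.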

I'd (tend to) buy an argument about possible instabilities in that gcc
code, but then again, it has to be tested sometime...

Make it a configuration option, under EXPERIMENTAL, VERY DANGEROUS, HIGH
EXPLOSIVE if you must.


--
Dr. Horst H. von Brand User #22616 counter.li.org
Departamento de Informatica Fono: +56 32 654431
Universidad Tecnica Federico Santa Maria +56 32 654239
Casilla 110-V, Valparaiso, Chile Fax: +56 32 797513

Ingo Molnar

Dec 29, 2005, 3:30:08 PM

* Linus Torvalds <torv...@osdl.org> wrote:

> > i think there's quite an attitude here - we are at the mercy of "gcc
> > brainfarts" anyway, and users are at the mercy of "kernel brainfarts"
> > just as much.
>
> There's a huge difference here. The gcc people very much have a "Oh,
> we changed old documented behaviour - live with it" attitude, together
> with "That was a gcc extension, not part of the C language, so when we
> change how gcc behaves, it's _your_ problem" approach.
>
> At least they used to.

yeah, i think that was definitely the case historically.

> Maybe the right thing to do is to just heed that warning, and remove
> such functions from header files and make them no-inline? That way we
> get the size fixes _regardless_ of any compiler options.

i think the eye-opener (for me at least) was that there's really a
massive 5%+ size difference here, from 2 simple patches. And meanwhile
Matt is doing truly hard size-reduction work and is mailing patches to
lkml that remove 200-300 bytes of .text, which is 0.01% of code, apiece.

Debloating is like scalability, a piece-by-piece process where we'll
only see the full effects after doing 100 independent steps, but still
we must not ignore the big effects either, nor must we get ourselves
into losing maintenance battles.

The current inline model seems to be a lost battle, the 'size noise'
caused by spurious inlines (which count in the thousands) is _far_
outpowering most of the size reduction efforts. And i think it can be
argued that at least in the -Os case gcc has a very clear directive wrt.
what to do - and much less room to mess up. Independently of how much we
trust it.

Ingo

Ingo Molnar

Dec 29, 2005, 3:30:11 PM

* Adrian Bunk <bu...@stusta.de> wrote:

> It won't be dropped on the floor indefinitely.
>
> "I do plan to look at this" means that I'd currently estimate this
> being 2.6.19 stuff.

you must be kidding ...

> Yes that's one year from now, but we need it properly analyzed and
> tested before getting it into Linus' tree, and I do really want it
> untangled from and therefore after 4k stacks.

you are really using the wrong technology for this.

look at the latency tracing patch i posted today: it includes a feature
that prints the worst-case stack footprint _as it happens_, and thus
allows the mapping of such effects in a very efficient and very
practical way. As it works on a live system, and profiles live function
traces, it goes through function pointers and irq entry nesting effects
too. We could perhaps put that into Fedora for a while and get the
worst-case footprints mapped.

in fact i've been running this feature in the -rt kernel for quite some
time, and it enabled the fixing of a couple of bad stack abusers, and it
also told us what our current worst-case stack footprint is [when 4K
stacks are enabled]: it's execve of an ELF binary.

Ingo

Dave Jones

Dec 29, 2005, 3:40:13 PM
On Thu, Dec 29, 2005 at 09:41:12AM -0800, Linus Torvalds wrote:

> Comparing it to the kernel is ludicrous. We care about user-space
> interfaces to an insane degree. We go to extreme lengths to maintain even
> badly designed or unintentional interfaces. Breaking user programs simply
> isn't acceptable. We're _not_ like the gcc developers. We know that
> people use old binaries for years and years, and that making a new
> release doesn't mean that you can just throw that out. You can trust us.

Does this mean you're holding back the 2.6.15 release until we don't
need to update udev to stop X from breaking ?
</tongue-in-cheek>

Seriously, we break things _every_ release. Sometimes in tiny
'doesn't really matter' ways, sometimes in "fuck, my system no
longer works" ways, but the days when I didn't have to tell
our userspace packagers to rev a half dozen or so packages up to the
latest upstream revisions when I've pushed a rebased kernel are
a distant memory.

Dave

Linus Torvalds

Dec 29, 2005, 4:00:16 PM

On Thu, 29 Dec 2005, Dave Jones wrote:
>
> Seriously, we break things _every_ release. Sometimes in tiny
> 'doesn't really matter' ways, sometimes in "fuck, my system no
> longer works" ways, but the days when I didn't have to tell
> our userspace packagers to rev a half dozen or so packages up to the
> latest upstream revisions when I've pushed a rebased kernel are
> a distant memory.

Umm.. Complain more. I upgrade kernels a lot more often than I upgrade
distros, and things don't break. They're not allowed to break, because I
refuse to upgrade my user programs just because I do kernel development.
But I'd only notice a small part of user space, so if people don't
complain, they break not because we don't care, but because we didn't even
know.

So if you have a user program that breaks, _complain_. It's really not
supposed to happen outside of perhaps kernel module loaders etc things
that get really really chummy with kernel internals (and even that was
fixed: the modern way of loading modules isn't that chummy any more, so
hopefully we'll not need to break even module loaders again).

If we change some /proc file thing, breakage is often totally
unintentional, and complaining is the right thing - people might not even
have realized it broke.

At least _I_ take breakage reports seriously. If there are maintainers
that don't, complain to them. I'll back you up. Breaking user space simply
isn't acceptable without years of preparation and warning.

Linus

Linus Torvalds

Dec 29, 2005, 4:30:08 PM

On Thu, 29 Dec 2005, Linus Torvalds wrote:
>
> At least _I_ take breakage reports seriously. If there are maintainers
> that don't, complain to them. I'll back you up. Breaking user space simply
> isn't acceptable without years of preparation and warning.

Btw, sometimes we knowingly change semantics that we believe that nobody
would ever be able to care about. Then we literally _depend_ on people
complaining about breakage in case we were wrong, and if you guys don't,
and just curse, and upgrade programs, we actually miss out on real
information.

And yes, occasionally we don't have much choice, and things break. It
should be extremely rare, though. Much more commonly it would be a bug or
an unintentional change that somebody didn't even realize changed
semantics subtly.

Matt Mackall

Dec 29, 2005, 5:30:08 PM
On Thu, Dec 29, 2005 at 09:19:53PM +0100, Ingo Molnar wrote:
>
> * Linus Torvalds <torv...@osdl.org> wrote:
>
> > > i think there's quite an attitude here - we are at the mercy of "gcc
> > > brainfarts" anyway, and users are at the mercy of "kernel brainfarts"
> > > just as much.
> >
> > There's a huge difference here. The gcc people very much have a "Oh,
> > we changed old documented behaviour - live with it" attitude, together
> > with "That was a gcc extension, not part of the C language, so when we
> > change how gcc behaves, it's _your_ problem" approach.
> >
> > At least they used to.
>
> yeah, i think that was definitely the case historically.
>
> > Maybe the right thing to do is to just heed that warning, and remove
> > such functions from header files and make them no-inline? That way we
> > get the size fixes _regardless_ of any compiler options.
>
> i think the eye-opener (for me at least) was that there's really a
> massive 5%+ size difference here, from 2 simple patches. And meanwhile
> Matt is doing truly hard size-reduction work and is mailing patches to
> lkml that remove 200-300 bytes of .text, which is 0.01% of code, apiece.

For the record, my cut-off for non-trivial stuff is currently about
1K. Which is more like 0.1% for minimal kernels. Unfortunately, the
impact of these patches on a stripped-down kernel is less substantial
than on a featureful one, so

> Debloating is like scalability, a piece-by-piece process where we'll
> only see the full effects after doing 100 independent steps, but still
> we must not ignore the big effects either, nor must we get ourselves
> into losing maintenance battles.
>
> The current inline model seems to be a lost battle, the 'size noise'
> caused by spurious inlines (which count in the thousands) is _far_
> outpowering most of the size reduction efforts. And i think it can be
> argued that at least in the -Os case gcc has a very clear directive wrt.
> what to do - and much less room to mess up. Independently of how much we
> trust it.

I think both these patches deserve a spin in -mm. But I can see
arguments for staging it. Hopefully we can get Andrew to take the
unit-at-a-time piece for post-2.6.15 and try out the inlining after
we've got some confidence with the first.

--
Mathematics is the supreme nostalgia of our time.

Dave Jones

Dec 29, 2005, 5:50:07 PM
On Thu, Dec 29, 2005 at 12:49:16PM -0800, Linus Torvalds wrote:

> Umm.. Complain more. I upgrade kernels a lot more often than I upgrade
> distros, and things don't break. They're not allowed to break, because I
> refuse to upgrade my user programs just because I do kernel development.
> But I'd only notice a small part of user space, so if people don't
> complain, they break not because we don't care, but because we didn't even
> know.
>
> So if you have a user program that breaks, _complain_. It's really not
> supposed to happen outside of perhaps kernel module loaders etc things
> that get really really chummy with kernel internals (and even that was
> fixed: the modern way of loading modules isn't that chummy any more, so
> hopefully we'll not need to break even module loaders again).
>
> If we change some /proc file thing, breakage is often totally
> unintentional, and complaining is the right thing - people might not even
> have realized it broke.
>
> At least _I_ take breakage reports seriously. If there are maintainers
> that don't, complain to them. I'll back you up. Breaking user space simply
> isn't acceptable without years of preparation and warning.

The udev situation I mentioned has been known about for at least a month,
probably longer. With old udev, we don't get /dev/input/event* created
with 2.6.15rc.

At some point in time it became de facto that certain things like udev, hotplug,
alsa-lib, wireless-tools and a bunch of others have to be kept in lockstep
with the kernel, and if it breaks, it's your fault for not upgrading
your userspace.

Seriously, I (and many others) have been complaining about this
for months. (Pretty much every time the "Please can we have a 2.7"
thread comes up). [note, that I actually prefer the 'new' approach
to development in 2.6, what I object to is that at the same time we
threw out the 'lets be careful about not breaking userspace' mantra.]

Just a few years ago, if someone suggested breaking a userspace
app in a kernel upgrade, they'd be crucified on linux-kernel, now
it's 'the norm'.

Dave

Ismail Donmez

Dec 29, 2005, 6:00:16 PM
On Friday 30 December 2005 at 00:41, Dave Jones wrote:
[...]

> The udev situation I mentioned has been known about for at least a month,
> probably longer. With old udev, we don't get /dev/input/event* created
> with 2.6.15rc.
>
> At some point in time it became de facto that certain things like udev,
> hotplug, alsa-lib, wireless-tools and a bunch of others have to be kept
> in lockstep with the kernel, and if it breaks, it's your fault for not
> upgrading your userspace.
>
> Seriously, I (and many others) have been complaining about this
> for months. (Pretty much every time the "Please can we have a 2.7"
> thread comes up). [note, that I actually prefer the 'new' approach
> to development in 2.6, what I object to is that at the same time we
> threw out the 'lets be careful about not breaking userspace' mantra.]
>
> Just a few years ago, if someone suggested breaking a userspace
> app in a kernel upgrade, they'd be crucified on linux-kernel, now
> it's 'the norm'.

We had two userspace wireless monitoring programs depending on
the /sys/class/net/<device name>/wireless directory being present, and now
it's gone in 2.6.15 and I can't find a single changelog line saying when or
why it went. /sys seems to be the most abused part of the kernel-userspace
relationship, with changing paths, names and now disappearing directories....

Regards,
ismail

Arjan van de Ven

Dec 29, 2005, 6:10:08 PM
Some data from an x86-64 allyesconfig build.

Below is a *rough* estimate of savings that could be achieved by
uninlining specific functions. The estimate is rough in the sense that it assumes
that no "trick" allows the uninlined version to be significantly smaller
than the inlined version, which for certain functions is not a valid
assumption (kmalloc comes to mind as an obvious one).

The saving is estimated at (count-1) * (size-6), i.e. a function call is
estimated to cost 6 bytes, and the size a function takes when inlined is
estimated to be the same as its uninlined size.


These are the top items only; a more complete list can be gotten
from http://www.fenrus.org/savings

Est saving  function name         count   uninline size
----------------------------------------------------------------------
     95940  down                  [2461]  <45>
     84392  skb_put               [1097]  <83>
     50932  kfree_skb             [1499]  <40>
     44880  init_waitqueue_head   [881]   <57>
     34840  lowmem_page_address   [537]   <71>
     25573  cfi_build_cmd         [108]   <245>
     19825  skb_push              [326]   <67>
     17992  aic_outb              [347]   <58>
     17434  module_put            [380]   <52>
     16318  ahd_outb              [399]   <47>
     16035  kmalloc               [3208]  <11>
     14040  netif_wake_queue      [361]   <45>
     13266  dev_kfree_skb_irq     [202]   <72>
     12078  signal_pending        [672]   <24>
     11979  ahc_outb              [364]   <39>
     11603  down_interruptible    [284]   <47>
     11552  ahd_inb               [305]   <44>
     11310  dst_release           [175]   <71>
     11275  netif_stop_queue      [452]   <31>
     11165  down_write            [320]   <41>
     11107  ahc_inb               [384]   <35>
     10807  usb_fill_bulk_urb     [102]   <113>
     10508  ahd_set_modes         [72]    <154>
     10266  skb_queue_head_init   [178]   <64>
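
The estimate in the table can be reproduced with a short sketch. This is
only an illustration, not the script that produced the list: the function
name estimated_saving and the call_cost parameter are made up here, while
the 6-byte call cost and the (count-1) * (size-6) formula are the ones
stated above.

```python
def estimated_saving(count, uninline_size, call_cost=6):
    """Rough .text saving, in bytes, from uninlining one function.

    Assumes every inlined copy is as large as the out-of-line body and
    that each call costs call_cost bytes. The count - 1 factor presumably
    accounts for the single out-of-line copy that has to remain.
    """
    return (count - 1) * (uninline_size - call_cost)


# (name, count, uninline size) rows copied from the table
rows = [
    ("down", 2461, 45),
    ("skb_put", 1097, 83),
    ("kmalloc", 3208, 11),
]

for name, count, size in rows:
    print(f"{estimated_saving(count, size):>8}  {name}")
```

Note how kmalloc, with by far the highest count, still lands in the middle
of the list: at 11 bytes it is barely larger than the assumed call cost.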

Linus Torvalds

Dec 29, 2005, 6:10:08 PM

On Thu, 29 Dec 2005, Dave Jones wrote:
>

> At some point in time it became de facto that certain things like udev, hotplug,
> alsa-lib, wireless-tools and a bunch of others have to be kept in lockstep
> with the kernel, and if it breaks, it's your fault for not upgrading
> your userspace.

Hmm.. Time for some re-indoctrination?

We really shouldn't allow that. I know who to blame for udev, who else
should I complain to?

> Just a few years ago, if someone suggested breaking a userspace
> app in a kernel upgrade, they'd be crucified on linux-kernel, now
> it's 'the norm'.

That really isn't acceptable. Breaking user space - even things that are
"close" to the kernel like udev scripts and alsa-lib, really is NOT a good
idea.

We're much better off wasting a bit of time on backwards compatibility,
than wasting a lot of user time and irritation (and indirectly, developer
time) on linkages to packages outside the kernel.

If you cannot upgrade a kernel without upgrading some user package, that
should be considered a real bug and a regression.

There are real technical reasons for not allowing those kinds of version
linkages: it makes it MUCH harder to blame the right thing when things go
wrong.

Now, I'm not saying that we can always support everything that goes on in
user space forever, but dammit, we can try damn hard.

(Somehow I'm not surprised about alsa. I think the whole alsa release
process has always sucked. Dang).

Linus

Jeff V. Merkey

Dec 29, 2005, 6:10:12 PM


The breakage issue is ridiculous, asinine, and unnecessary. I have been
porting dsfs to the various releases over the past month, and the
breakage of user space, usb, nfs, memory management, is beyond absurd.
Instead of constantly breaking the interfaces in Linux, why not tell
people **NO** every time they try to rewrite the memory management, etc.?

Instead of creating new capabilities and features, everyone seems
hell-bent on rewriting the same stale, mouldy, boring sections of the OS
over and over again. So how many times do the memory manager and slab
allocator need to get rewritten? Or how many times does the vfs need to
get broken over and over again?

This bullshit is killing Linux, it's too much work to keep up with
breakage, most of which is INTENTIONAL and NEEDLESS.

STOP STOP STOP!!!

I can't even apply a kdb patch between point releases of the kernel
without seeing major breakage between Kprobes and some other BS someone
decides to change for no other reason than THEY CAN DO IT. I have seen
memory corruption in the recent kernels. When a component of the OS is
finished, then leave it the hell alone.

The only NEW features in Linux are more drivers and more processors.
Most of what's been going on in the development over the past few months
has been unnecessary breakage. If you are trying to get people to stay
off Linux as a stable platform, you are all succeeding.

Jeff

Dave Jones

Dec 29, 2005, 6:10:15 PM
On Thu, Dec 29, 2005 at 02:56:16PM -0800, Linus Torvalds wrote:

> That really isn't acceptable. Breaking user space - even things that are
> "close" to the kernel like udev scripts and alsa-lib, really is NOT a good
> idea.
>

> If you cannot upgrade a kernel without upgrading some user package, that
> should be considered a real bug and a regression.

I'm glad you agree. I've decided to try something different once 2.6.16
is out. Every day, I'm going to push the -git snapshot of the day into
a testing branch for Fedora users. (Normally, only rawhide[1] users
get to test kernel-du-jour, and this always has the latest userspace, so
we don't notice problems until a kernel point release and the stable
distro gets an update).

It'll come with disclaimers up the whazoo about it possibly crashing,
eating your cat etc, but I bet some loonies will be mad enough to
try it, and report when it crashes and burns. This should at least
get us knowing about *when* we break things sooner.

During 2.6.16rc, expect more screaming.

Dave

[1] For non-Red Hat savvy, rawhide=='fedora development branch'

Willy Tarreau

Dec 29, 2005, 6:20:10 PM
On Thu, Dec 29, 2005 at 09:41:12AM -0800, Linus Torvalds wrote:

> There have been situations where documented gcc semantics changed, and
> instead of saying "sorry", the gcc people changed the documentation. What
> the hell is the point of documented semantics if you can't depend on them
> anyway?

Remember the #arg and ##arg mess in macros between gcc2 and gcc3 ?

I feel like I'm starting to understand where your hate for specifications
comes from. As much as I like to stick to specs, which is generally
OK for hardware and network protocols, I can say that with GCC, there
is clearly no rule telling you whether your program will still compile
with version N+1 or not.

Can't we elect a recommended gcc version that distro makers could
ship under the name kgcc as it has been the case for some time,
and try to stick to that version for as long as possible ? The only
real reason to upgrade it would be to support newer archs, while at
the moment, we try to support compilers which are shipped as default
*user-space* compilers.

Willy

Dmitry Torokhov

Dec 29, 2005, 6:20:13 PM
On 12/29/05, Dave Jones <da...@redhat.com> wrote:
> On Thu, Dec 29, 2005 at 12:49:16PM -0800, Linus Torvalds wrote:
>
> > Umm.. Complain more. I upgrade kernels a lot more often than I upgrade
> > distros, and things don't break. They're not allowed to break, because I
> > refuse to upgrade my user programs just because I do kernel development.
> > But I'd only notice a small part of user space, so if people don't
> > complain, they break not because we don't care, but because we didn't even
> > know.
> >
> > So if you have a user program that breaks, _complain_. It's really not
> > supposed to happen outside of perhaps kernel module loaders etc things
> > that get really really chummy with kernel internals (and even that was
> > fixed: the modern way of loading modules isn't that chummy any more, so
> > hopefully we'll not need to break even module loaders again).
> >
> > If we change some /proc file thing, breakage is often totally
> > unintentional, and complaining is the right thing - people might not even
> > have realized it broke.
> >
> > At least _I_ take breakage reports seriously. If there are maintainers
> > that don't, complain to them. I'll back you up. Breaking user space simply
> > isn't acceptable without years of preparation and warning.
>
> The udev situation I mentioned has been known about for at least a month,
> probably longer. With old udev, we don't get /dev/input/event* created
> with 2.6.15rc.
>

Once input core was converted to sysfs the breakage was unavoidable.
Because of a historical oversight, input_dev and input interfaces such
as mouseX were generating the same "input" events with different
arguments. The options were either to go with separate classes
(breakage) or to make a hierarchy within one class (breakage again). And
the sysfs conversion was needed to do hotplug over netlink...

> At some point in time it became de facto that certain things like udev, hotplug,
> alsa-lib, wireless-tools and a bunch of others have to be kept in lockstep
> with the kernel, and if it breaks, it's your fault for not upgrading
> your userspace.
>

I would say that udev and hotplug are a special kind of userspace, as they
are really an extension of the kernel. It would probably be best if udev
was packaged together with the kernel.

--
Dmitry

Adrian Bunk

Dec 29, 2005, 6:40:12 PM
On Thu, Dec 29, 2005 at 02:56:16PM -0800, Linus Torvalds wrote:
>
> On Thu, 29 Dec 2005, Dave Jones wrote:
>...

> > Just a few years ago, if someone suggested breaking a userspace
> > app in a kernel upgrade, they'd be crucified on linux-kernel, now
> > it's 'the norm'.
>
> That really isn't acceptable. Breaking user space - even things that are
> "close" to the kernel like udev scripts and alsa-lib, really is NOT a good
> idea.
>
> We're much better off wasting a bit of time on backwards compatibility,
> than wasting a lot of user time and irritation (and indirectly, developer
> time) on linkages to packages outside the kernel.
>
> If you cannot upgrade a kernel without upgrading some user package, that
> should be considered a real bug and a regression.
>
> There are real technical reasons for not allowing those kinds of version
> linkages: it makes it MUCH harder to blame the right thing when things go
> wrong.
>
> Now, I'm not saying that we can always support everything that goes on in
> user space forever, but dammit, we can try damn hard.
>...

Was it a mistake to drop support for ipfwadm and ipchains?
Was it a mistake to drop support for devfs?
Will it be a mistake to drop support for gcc < 3.2?
Will it be a mistake to remove the obsolete raw driver?
Will it be a mistake to drop the Video4Linux API 1 ioctls?
Will it be a mistake to drop support for pcmcia-cs?
...

And if any of these was or will not be a mistake, when is the right time
for the userspace breakage?

I agreed with what you express back before support for ipchains and
devfs was removed (along with many more things I do not currently
remember), but I've now simply accepted that regarding kernel
development, 6 is an odd number.

The fundamental problem is that the current development model
contains no well-defined points where breakages of the kernel-related
userspace were allowed and expected by users.

> Linus

cu
Adrian

--

"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed


Linus Torvalds

Dec 29, 2005, 6:50:15 PM

On Thu, 29 Dec 2005, Jeff V. Merkey wrote:
>
> The breakage issue is ridiculous, asinine, and unnecessary. I have been
> porting dsfs to the various releases over the past month, and the
> breakage of user space, usb, nfs, memory management, is beyond absurd.

We're not talking about internal kernel stuff. Internal kernel stuff
_does_ get changed, and we don't care about breakage of out-of-kernel
stuff. That's fundamental.

Linus

Jeff V. Merkey

Dec 29, 2005, 7:00:22 PM
Linus Torvalds wrote:

>On Thu, 29 Dec 2005, Jeff V. Merkey wrote:
>
>
>>The breakage issue is ridiculous, asinine, and unnecessary. I have been
>>porting dsfs to the various releases over the past month, and the
>>breakage of user space, usb, nfs, memory management, is beyond absurd.
>>
>>
>
>We're not talking about internal kernel stuff. Internal kernel stuff
>_does_ get changed, and we don't care about breakage of out-of-kernel
>stuff. That's fundamental.
>
>

Start caring. People spend lots of money supporting you, and what you
are doing. How about taking some responsibility for that so they don't
change their minds and move back to windows or pull their support
because it's too costly or too much of a hassle to produce something
stable from these releases. If you export functions from the kernel,
don't break them. Don't let these numbnuts keep breaking things that
shouldn't be broken, i.e. memory manager (now that's a big one). If you
replace a subsystem with a newer one, keep a mapping layer through at
least the next .<even> release, i.e. 2.4 -> 2.6 (this is reasonable and
expected -- you can drop things but you should only do it on well
understood boundaries). Don't let these people break everything every
other incremental release.

I have a family too Linus and I like to spend my evenings with them
rather than unwinding Olaf's bugs in NFS (the most recent one). Think
about "free and easy" and about making people's lives a little easier to
support code on your platform rather than expecting the rest of the
planet to clean up everyone's messes and sloppiness.

Jeff

Linus Torvalds

Dec 29, 2005, 7:10:17 PM

On Fri, 30 Dec 2005, Adrian Bunk wrote:
> >
> > Now, I'm not saying that we can always support everything that goes on in
> > user space forever, but dammit, we can try damn hard.
>

> Was it a mistake to drop support for ipfwadm and ipchains?
> Was it a mistake to drop support for devfs?
> Will it be a mistake to drop support for gcc < 3.2?

Those things at least were brewing for _years_. People had lots of
heads-up warning.

> Will it be a mistake to remove the obsolete raw driver?
> Will it be a mistake to drop the Video4Linux API 1 ioctls?
> Will it be a mistake to drop support for pcmcia-cs?

And again, this is something that we've been warning about. We have.

I'm not talking about never obsoleting bad interfaces at all. I'm talking
about the unnecessary breakage that comes from changes that simply aren't
needed, and that isn't given proper heads-up for.

We used to have a fairly clear point where we could break things, when we
had major kernel releases (ie 2.4 -> 2.6 broke the module loader. It was
documented, and it was unavoidable).

> The fundamental problem is that the current development model
> contains no well-defined points where breakages of the kernel-related
> userspace were allowed and expected by users.

The basic rule should be "never". For example, we now have three different
generations of the "stat()" system call, and yes, we wrote the code to
maintain all three interfaces. Breaking an old one for a new better one
simply wasn't an option.

Now, the more specialized the usage is, the less strict the "never"
becomes. But if something becomes a pain for distribution managers (and
from Dave, it sounds like we've hit that way too often), that definitely
means that we've broken too many things.

In short: I don't think anybody can complain about devfs-like things.
We've kept it up for a _long_ time, and there was tons of help for the
migration. But clearly DaveJ is unhappy, and that implies that we're not
doing as well as we should.

(Which is not to say that we should necessarily bend over backwards to
make sure that DaveJ is _never_ unhappy. We should just try harder).

Linus

Linus Torvalds

Dec 29, 2005, 7:20:06 PM

On Thu, 29 Dec 2005, Jeff V. Merkey wrote:
>

> Linus Torvalds wrote:
> >
> > We're not talking about internal kernel stuff. Internal kernel stuff _does_
> > get changed, and we don't care about breakage of out-of-kernel stuff. That's
> > fundamental.
>
> Start caring. People spend lots of money supporting you, and what you are

> doing. How about taking some responsibility for that [...]

Cry me a river, Jeff.

The kernel is GPL'd. That's my responsibility. Source code. Stuff that
comes to me as patches. That's my job, and that's what I get paid for. In
fact, my contract says that I _cannot_ work on anything that isn't open
source.

Stuff outside the kernel is almost always either (a) experimental stuff
that just isn't ready to be merged or (b) tries to avoid the GPL.

Neither is worth a _second_ of anybody's time trying to support, and when
you say "people spend lots of money supporting you", you're lying through
your teeth. The GPL-avoiding kind of people don't spend a dime supporting
me, they spend their money actively trying to debase and destroy what I
and thousands of others have been working our butts off for.

So don't try to make it sound like something it isn't. We support outside
projects a hell of a lot better than we'd need to, and I can tell you that
it's mostly _me_ who does that. Most of the core kernel developers argue
that I should support less of it - and yes, they are backed up by lawyers
at their (sometimes quite big) companies.

So be honest now. Are those projects you care about going to be GPL'd and
actively pushed back into the standard kernel?

And if they aren't, SHUT THE HELL UP, because they are total freeloaders,
and claiming that they "support" me is total crap.

Jeff V. Merkey

Dec 29, 2005, 7:30:16 PM
Linus Torvalds wrote:

>On Thu, 29 Dec 2005, Jeff V. Merkey wrote:
>
>
>>Linus Torvalds wrote:
>>
>>
>>>We're not talking about internal kernel stuff. Internal kernel stuff _does_
>>>get changed, and we don't care about breakage of out-of-kernel stuff. That's
>>>fundamental.
>>>
>>>
>>Start caring. People spend lots of money supporting you, and what you are
>>doing. How about taking some responsibility for that [...]
>>
>>
>
>Cry me a river, Jeff.
>
>The kernel is GPL'd. That's my responsibility. Source code. Stuff that
>comes to me as patches. That's my job, and that's what I get paid for. In
>fact, my contract says that I _cannot_ work on anything that isn't open
>source.
>
>Stuff outside the kernel is almost always either (a) experimental stuff
>that just isn't ready to be merged or (b) tries to avoid the GPL.
>
>Neither is worth a _second_ of anybody's time trying to support, and when
>you say "people spend lots of money supporting you", you're lying through
>your teeth. The GPL-avoiding kind of people don't spend a dime supporting
>me, they spend their money actively trying to debase and destroy what I
>and thousands of others have been working our butts off for.
>
>

The fact that Oracle and IBM support apps on Linux are Freeloading? Baloney!
Linux benefits by having the choice of all these applications.

(P.S. I have heard through the grapevine that IBM is putting emphasis on
AIX as their platform and is actively telling this to large customers --
can you verify this, and are you aware of it?)

>So don't try to make it sound like something it isn't. We support outside
>projects a hell of a lot better than we'd need to, and I can tell you that
>it's mostly _me_ who does that. Most of the core kernel developers argue
>that I should support less of it - and yes, they are backed up by lawyers
>at their (sometimes quite big) companies.
>
>So be honest now. Are those projects you care about going to be GPL'd and
>actively pushed back into the standard kernel?
>
>And if they aren't, SHUT THE HELL UP, because they are total freeloaders,
>and claiming that they "support" me is total crap.
>
>

Commercial application support gives Linux a "network effect" (an economic
term) and thus clout and credibility. Protect this -- it's in **OUR** interests.

Jeff

Linus Torvalds

Dec 29, 2005, 7:50:08 PM

On Thu, 29 Dec 2005, Jeff V. Merkey wrote:
>
> The fact that Oracle and IBM support apps on Linux are Freeloading? Baloney!

Jeff, give it a rest.

Oracle and IBM haven't been complaining, have they? Oracle mostly does
user-space stuff (that doesn't change), and has been pretty good about
their Oracle-fs too - they're even actively discussing "git" issues on the
git mailing list, and asking for help, and just generally being good
members of the community.

And IBM engineers are part of the people who change internal kernel
interfaces in order to make it work better for them.

Pretty much the ONLY people who ever complain about those internal kernel
interfaces changing are the free-loaders. It's hard for them, because they
don't want to play according to the rules. Tough. Watch me not care:

[ Linus sits in his chair, patently not caring ]

See?

Dave Jones

Dec 29, 2005, 7:50:12 PM
On Thu, Dec 29, 2005 at 07:38:14PM -0500, Ryan Anderson wrote:

> The biggest complaint I've seen about udev isn't the fact that you
> sometimes need to upgrade to use a new kernel, that's something that
> people like Dave can handle via package dependencies.

Sure, we *can* do that, though the problem is it introduces latency between
the time I build a kernel, and time users can test it, as they have
to sit and wait for the userspace packages to also arrive.

There's a 2.6.15rc7 kernel that some Fedora Core 4 users could download
and play with right now. I thought it'd be great to get some extra testing
over the xmas holidays. Unfortunately, due to the necessary udev upgrade,
many users are turned off from testing by the inability to run X after
installing it. It'll probably be some time in the new year when folks
like our udev packager get back from vacation before we get a test package for FC4.

The more people I'm reliant upon for having bits in place, the longer
users have to wait to be able to test, and the longer we all wait for
feedback.

> The part of the udev situation that I've heard as a complaint (though I
> haven't experienced myself) is that the new udev wasn't backwards
> compatible with certain older kernels. I think the description I've
> heard is that you need one udev for < 2.6.12, one for 2.6.12 - 2.6.14,
> and now a new one for 2.6.15, and at least, jumping from < 2.6.12 to
> 2.6.15 is pretty much guaranteed to be difficult to get right. (If the
> kernel fails, you have a udev installed that won't work on the older
> kernel correctly, apparently.)
>
> This, for what it's worth, is the same breakage that Dave seemed to be
> most frustrated with during his OLS keynote, regarding ALSA versions,
> and a few other things that caused breakage and the user space failed to
> work correctly when the kernel was reverted. (I hope I'm not putting
> words in your mouth, Dave).

Yep, That is another problem. It's not uncommon for someone to upgrade
to a new rebased kernel and its assorted userspace bits, then find out
their wireless card broke, so they go back to the old working kernel,
only to find the newer sound libraries misbehave on older kernels.
(ALSA isn't the only problem area here, but it's an easy target).

Ryan Anderson

Dec 29, 2005, 7:50:13 PM

The biggest complaint I've seen about udev isn't the fact that you
sometimes need to upgrade to use a new kernel, that's something that
people like Dave can handle via package dependencies.

The part of the udev situation that I've heard as a complaint (though I
haven't experienced myself) is that the new udev wasn't backwards
compatible with certain older kernels. I think the description I've
heard is that you need one udev for < 2.6.12, one for 2.6.12 - 2.6.14,
and now a new one for 2.6.15, and at least, jumping from < 2.6.12 to
2.6.15 is pretty much guaranteed to be difficult to get right. (If the
kernel fails, you have a udev installed that won't work on the older
kernel correctly, apparently.)

This, for what it's worth, is the same breakage that Dave seemed to be
most frustrated with during his OLS keynote, regarding ALSA versions,
and a few other things that caused breakage and the user space failed to
work correctly when the kernel was reverted. (I hope I'm not putting
words in your mouth, Dave).

That's my perspective from someone who has only dealt with the issue
from the point of view of a user, and only in the case of the Debian
udev packages flat out refusing to install while running a "too old"
kernel, which was rather, umm, annoying.

--

Ryan Anderson
sometimes Pug Majere


Linus Torvalds

Dec 29, 2005, 8:00:20 PM

On Thu, 29 Dec 2005, Ryan Anderson wrote:
>
> This, for what it's worth, is the same breakage that Dave seemed to be
> most frustrated with during his OLS keynote, regarding ALSA versions,
> and a few other things that caused breakage and the user space failed to
> work correctly when the kernel was reverted. (I hope I'm not putting
> words in your mouth, Dave).

I agree: the worst part of version dependency is that it's really hard in
general to just move one of the components backwards or forwards.
Something you want to do when breakage occurs, or just because you need to
figure out some _other_ problem (like doing a kernel bug bisection).

Which is why pretty much _every_ component needs to be backwards
compatible at least to some degree. Otherwise they'd need to be bundled
and developed together as one thing.

IOW, the same way it's wrong for the kernel to need new binaries, it's
wrong for binaries to need a new kernel. It's one reason why we seldom add
new system calls: they aren't all that useful in any kind of short
timeframe, because even programs that would _like_ to use them usually
can't do so for a long time (until they don't have to worry about people
running old kernels any more).

(Some system calls are easier to add than others - if you can easily
emulate the new system call semantics with just a slight performance hit,
you can just have a simple wrapper with a fallback. That doesn't always
work well - some things are just very hard to emulate efficiently).
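
The wrapper-with-fallback idea can be sketched roughly as follows. This is
a minimal illustration in Python, not anything from the kernel or libc:
new_call and emulate are hypothetical callables standing in for a new
system call and a slower emulation built from older calls, and the sketch
assumes an old kernel reports the missing call with ENOSYS.

```python
import errno


def call_with_fallback(new_call, emulate, *args):
    """Try the new interface first; emulate it on kernels that lack it."""
    try:
        return new_call(*args)
    except OSError as exc:
        if exc.errno != errno.ENOSYS:
            raise  # a real failure, not "kernel too old"
        return emulate(*args)


# Hypothetical stand-ins, for demonstration only.
def new_call_missing(x):
    # Behaves like a system call the running kernel does not implement.
    raise OSError(errno.ENOSYS, "Function not implemented")


def emulate_double(x):
    # Slower emulation that gives the same result using older facilities.
    return x + x


print(call_with_fallback(new_call_missing, emulate_double, 21))
```

Only ENOSYS triggers the fallback; any other error is passed through, so a
genuine failure of the new call is not silently hidden by the emulation.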

Linus

Linus Torvalds

Dec 29, 2005, 8:10:04 PM

On Thu, 29 Dec 2005, Dave Jones wrote:
>

> There's a 2.6.15rc7 kernel that some Fedora Core 4 users could download
> and play with right now. I thought it'd be great to get some extra testing
> over the xmas holidays. Unfortunately, due to the necessary udev upgrade,
> many users are turned off from testing by the inability to run X after
> installing it.

Can you actually detail this thing a bit more? I'm a FC4 user myself, and
I'm sure as hell running X too. And that's not even a special X install
like I used to have, it's bog-standard FC4 afaik.

And I'm definitely running -rc7 (well, not exactly, it's my current git
tree, so it's -rc7+patches).

So whatever breakage is there, I'd love to know more. It's not entirely
obvious.

Linus

Dave Airlie

Dec 29, 2005, 8:30:11 PM
>
> Can you actually detail this thing a bit more? I'm a FC4 user myself, and
> I'm sure as hell running X too. And that's not even a special X install
> like I used to have, it's bog-standard FC4 afaik.
>
> And I'm definitely running -rc7 (well, not exactly, it's my current git
> tree, so it's -rc7+patches).

/dev/input/event* disappear, I've just noticed this myself yesterday
working on Xegl, I thought I'd done something wrong, then I realised
udev/kernel issues and I just created them by hand..

Your X might not be using evdev....

Dave.

Dave Jones

unread,
Dec 29, 2005, 8:30:15 PM12/29/05
to
On Thu, Dec 29, 2005 at 05:05:08PM -0800, Linus Torvalds wrote:
>
>
> On Thu, 29 Dec 2005, Dave Jones wrote:
> >
> > There's a 2.6.15rc7 kernel that some Fedora Core 4 users could download
> > and play with right now. I thought it'd be great to get some extra testing
> > over the xmas holidays. Unfortunately, due to the necessary udev upgrade,
> > many users are turned off from testing by the inability to run X after
> > installing it.
>
> Can you actually detail this thing a bit more? I'm a FC4 user myself, and
> I'm sure as hell running X too. And that's not even a special X install
> like I used to have, it's bog-standard FC4 afaik.
>
> And I'm definitely running -rc7 (well, not exactly, it's my current git
> tree, so it's -rc7+patches).
>
> So whatever breakage is there, I'd love to know more. It's not entirely
> obvious.

It's X config dependent. If you have

Option "Device" "/dev/input/mice"

in your inputdevice section, all should work (I think[*]).
However, some folks seem to have somehow ended up with references
to either 'mouse0' or 'event*' in there.

With 2.6.14 on my testbox, I get this..

$ ls /dev/input/
event0 event1 mice mouse0

With 2.6.15rc

$ ls /dev/input/
mice

If I can dig out the bugzilla that reported this, I'll followup.
Something in my head is telling me it had something to do with
laptop touchpads, but that could be the post-xmas `nog talking.

Dave

[*] How much I look forward to a world where X has no config file
and just figures all this out itself.

Tim Schmielau

unread,
Dec 29, 2005, 9:10:05 PM12/29/05
to
On Thu, 29 Dec 2005, Arjan van de Ven wrote:

> Some data from an x86-64 allyesconfig build.

Thanks for the table. This certainly is a good starting point to find
valid candidates for uninlining.

> Below is a *rough* estimate of savings that could be achieved by
> uninlining specific functions. The estimate is rough in the sense that
> it assumes
> that no "trick" allows the uninlined version to be significantly smaller
> than the inlined version, which for certain functions is not a valid
> assumption (kmalloc comes to mind as an obvious one).

What about the (probably more common) case that the inlined version is
smaller because of optimizations that are not possible in the general
case?

> The saving is estimated at (count-1) * (size-6), eg the estimate for a
> function call is 6 bytes as well and the estimate for the size something
> takes as inlined is the same as the uninline size.

Maybe the estimate is a little bit too rough. All savings add up to
1780743 bytes, which seems a bit too large to me (can't compare to the
total size of an allyesconfig kernel since that gives me a 'File size
limit exceeded' when linking).
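
For reference, the quoted estimate can be written as a one-line function. This is only a sketch of the formula as stated; reading an entry such as "cfi_build_cmd [108] <245>" from the table as count 108 and size 245 is an assumption about the columns, though it does reproduce that entry's listed 25573 bytes.

```c
/*
 * Rough estimate from the quoted text: every call site but one shrinks
 * from the inlined body to a call, which is assumed to cost 6 bytes.
 */
static long uninline_savings(long count, long size)
{
	const long call_cost = 6;

	return (count - 1) * (size - call_cost);
}
```

With count = 108 and size = 245 this gives 107 * 239 = 25573.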


What about the previous suggestion to remove inline from *all* static
inline functions in .c files?
I just tried that for the fun of it. It got rid of 8806 'inline'
annotations and produced the ~2 MB (uncompressed) patch at
http://www.physik3.uni-rostock.de/tim/kernel/2.6/deinline.patch.gz
The resulting kernel actually booted (am running it right now). However,
catching just these low-hanging fruits doesn't get me anywhere near
Arjan's numbers. For my non-representative personal config I get (on
i386 with -funit-at-a-time):

> size vmlinux*
text data bss dec hex filename
2197105 386568 316840 2900513 2c4221 vmlinux
2144453 392100 316840 2853393 2b8a11 vmlinux.deinline

I just started an allyesconfig build to get some real numbers.

Tim

Dave Jones

unread,
Dec 29, 2005, 9:20:09 PM12/29/05
to
On Fri, Dec 30, 2005 at 03:10:29AM +0100, Jiri Slaby wrote:

> http://download.fedora.redhat.com/pub/fedora/linux/core/development/SRPMS/
> [maybe this is the difference?

Of course. development branch always has latest userspace.
The point here is that the old userspace breaks.

Dave

Jiri Slaby

unread,
Dec 29, 2005, 9:30:09 PM12/29/05
to
Dave Jones napsal(a):

>With 2.6.14 on my testbox, I get this..
>
>$ ls /dev/input/
>event0 event1 mice mouse0
>
>With 2.6.15rc
>
>$ ls /dev/input/
>mice
>
>

I don't know what's wrong (or different), but
$ uname -a
Linux bellona 2.6.15-rc7 #1 SMP PREEMPT Fri Dec 30 02:56:57 CET 2005
i686 i686 i386 GNU/Linux
$ cat /etc/fedora-release
Fedora Core release 4 (Stentz)
$ rpm -q udev hal
udev-077-1
hal-0.5.5.1-1
from SRPMS from
http://download.fedora.redhat.com/pub/fedora/linux/core/development/SRPMS/
[maybe this is the difference? btw. despite, rc5-mm3 sound is defunct --
sound class is under device's class, but it's mm tree]
and at last the point of this e-mail:

$ ls /dev/input/
event0 event1 mice mouse0 wacom

(udev created them, I'm sure)

all the best,

--
Jiri Slaby www.fi.muni.cz/~xslaby
\_.-^-._ jiri...@gmail.com _.-^-._/
B67499670407CE62ACC8 22A032CC55C339D47A7E

Tim Schmielau

unread,
Dec 29, 2005, 9:30:10 PM12/29/05
to
On Fri, 30 Dec 2005, Tim Schmielau wrote:

> > size vmlinux*
> text data bss dec hex filename
> 2197105 386568 316840 2900513 2c4221 vmlinux
> 2144453 392100 316840 2853393 2b8a11 vmlinux.deinline

Doh! I forgot to set -Os.
Will better go to bed now and redo the numbers tomorrow.

Jiri Slaby

unread,
Dec 29, 2005, 9:40:08 PM12/29/05
to
Dave Jones wrote:
> On Fri, Dec 30, 2005 at 03:10:29AM +0100, Jiri Slaby wrote:
>
> > http://download.fedora.redhat.com/pub/fedora/linux/core/development/SRPMS/
> > [maybe this is the difference?
>
> Of course. development branch always has latest userspace.
> The point here is that the old userspace breaks.
OK, I'm with you, but when I try (devel) -rc releases, I need devel pkgs
(in my opinion). And there is then some time to move these devel pkgs to the
release database before the next non-rc kernel comes out. Indeed, changes like
this in an -rc just before a stable release, or BIG changes in an -rc at all,
are crazy, but if somebody does that, we all have the chance to say "STOP, we
don't want it" (that's one of the reasons to read lkml), not only THE MAN
before git-commit.

regards,


--
Jiri Slaby www.fi.muni.cz/~xslaby
\_.-^-._ jiri...@gmail.com _.-^-._/
B67499670407CE62ACC8 22A032CC55C339D47A7E

Jiri Slaby

unread,
Dec 29, 2005, 9:40:05 PM12/29/05
to
[sorry for the duplicate (if any)]

Dave Jones napsal(a):
>With 2.6.14 on my testbox, I get this..
>
>$ ls /dev/input/
>event0 event1 mice mouse0
>
>With 2.6.15rc
>
>$ ls /dev/input/
>mice
>

I don't know what's wrong, but


$ uname -a
Linux bellona 2.6.15-rc7 #1 SMP PREEMPT Fri Dec 30 02:56:57 CET 2005 i686 i686 i386 GNU/Linux
$ cat /etc/fedora-release
Fedora Core release 4 (Stentz)
$ rpm -q udev hal
udev-077-1
hal-0.5.5.1-1

from SRPMS from http://download.fedora.redhat.com/pub/fedora/linux/core/development/SRPMS/
[maybe this is the difference? btw. despite, rc5-mm3 sound is defunct -- sound class is under device's class]


and at last the point of this e-mail:

$ ls /dev/input/
event0 event1 mice mouse0 wacom

(udev created them, I'm sure)

all the best,

Mark Lord

unread,
Dec 29, 2005, 11:00:16 PM12/29/05
to
>If we change some /proc file thing, breakage is often totally
>unintentional, and complaining is the right thing - people might not even
>have realized it broke.

Okay, I'm complaining: /proc/cpuinfo is no longer correct
for my Pentium-M notebook, as of 2.6.15-rc7. Now it reports
a cpu speed of approx 800Mhz for a 2.0Mhz Pentium-M.

Cheers!

Nicolas Pitre

unread,
Dec 29, 2005, 11:00:19 PM12/29/05
to
On Thu, 29 Dec 2005, Arjan van de Ven wrote:

> Some data from an x86-64 allyesconfig build.
>

> 25573 cfi_build_cmd [108] <245>

Beware this one. The CFI code is not realistically ever used with
everything set to y in real life scenarios. In fact, when only the
needed buswidth and interleave option are selected then this particular
inlined function gets reduced to a simple constant, such as 0x00700070
for example.

However if gcc wasn't forced to always inline, then in the allyesconfig
this function would benefit from being uninlined automatically.
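
A simplified sketch of the effect (this is not the actual CFI code; build_cmd and the FIXED_* macros are hypothetical): once interleave and device width are fixed at build time, an optimizing compiler can fold the inlined call to a constant.

```c
#include <stdint.h>

/*
 * Hypothetical command builder: replicate an 8-bit command once per
 * interleaved chip across the bus word.
 */
static inline uint32_t build_cmd(uint8_t cmd, int interleave, int device_bits)
{
	uint32_t word = 0;
	int i;

	for (i = 0; i < interleave; i++)
		word |= (uint32_t)cmd << (i * device_bits);
	return word;
}

/* Only one configuration selected: both parameters are compile-time constants. */
#define FIXED_INTERLEAVE	2
#define FIXED_DEVICE_BITS	16

static inline uint32_t build_cmd_fixed(uint8_t cmd)
{
	return build_cmd(cmd, FIXED_INTERLEAVE, FIXED_DEVICE_BITS);
}
```

Here build_cmd_fixed(0x70) evaluates to 0x00700070, while the general build_cmd() with run-time parameters keeps the loop.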


Nicolas

Dave Jones

unread,
Dec 29, 2005, 11:10:09 PM12/29/05
to
On Thu, Dec 29, 2005 at 10:57:40PM -0500, Mark Lord wrote:
> Mark Lord wrote:
> >
> >Okay, I'm complaining: /proc/cpuinfo is no longer correct
> >for my Pentium-M notebook, as of 2.6.15-rc7. Now it reports
> >a cpu speed of approx 800Mhz for a 2.0Mhz Pentium-M.
>
> 2.0GHz, not Mhz! (blush)
>
> Prior to -rc7, /proc/cpuinfo would scale according to the
> current speedstep of the CPU. Now it seems stuck at the
> lowest setting for some reason.

Ok, if the scaling doesn't work any more, that's a bug rather
than an intentional breakage. More details please? dmesg ?
/sys/devices/system/cpu/cpufreq contents? What were you using
to do the scaling previously? (An app, or ondemand)

Dave

Mark Lord

unread,
Dec 29, 2005, 11:20:17 PM12/29/05
to
Mark Lord wrote:
>
> The actual speedstep component ("ondemand" cpufreq) is working just
> fine, according to /sys/devices/system/cpu/cpufreq. But /proc/cpuinfo
> is no longer reflecting the current values -- stuck at 800Mhz
> regardless of /sys/devices/system/cpu/cpufreq showing other values.

Actually, the path is /sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_cur_freq.

And tonight it appears to be working again (/proc/cpuinfo showing
correct values, something it was not doing when I first checked it
after upgrading to -rc7.. something buggy there??).

Cheers

Mark Lord

unread,
Dec 29, 2005, 11:20:18 PM12/29/05
to
Dave Jones wrote:
> On Thu, Dec 29, 2005 at 10:57:40PM -0500, Mark Lord wrote:
> > Mark Lord wrote:
> > >
> > >Okay, I'm complaining: /proc/cpuinfo is no longer correct
> > >for my Pentium-M notebook, as of 2.6.15-rc7. Now it reports
> > >a cpu speed of approx 800Mhz for a 2.0Mhz Pentium-M.
> >
> > 2.0GHz, not Mhz! (blush)
> >
> > Prior to -rc7, /proc/cpuinfo would scale according to the
> > current speedstep of the CPU. Now it seems stuck at the
> > lowest setting for some reason.
>
> Ok, if the scaling doesn't work any more, that's a bug rather
> than an intentional breakage. More details please? dmesg ?
> /sys/devices/system/cpu/cpufreq contents? What were you using
> to do the scaling previously? (An app, or ondemand)

The actual speedstep component ("ondemand" cpufreq) is working just
fine, according to /sys/devices/system/cpu/cpufreq. But /proc/cpuinfo
is no longer reflecting the current values -- stuck at 800Mhz
regardless of /sys/devices/system/cpu/cpufreq showing other values.

Cheers

Mark Lord

unread,
Dec 29, 2005, 11:30:12 PM12/29/05
to
Mark Lord wrote:
>
> Okay, I'm complaining: /proc/cpuinfo is no longer correct
> for my Pentium-M notebook, as of 2.6.15-rc7. Now it reports
> a cpu speed of approx 800Mhz for a 2.0Mhz Pentium-M.

2.0GHz, not Mhz! (blush)

Prior to -rc7, /proc/cpuinfo would scale according to the
current speedstep of the CPU. Now it seems stuck at the
lowest setting for some reason.

Cheers

Dave Jones

unread,
Dec 29, 2005, 11:30:13 PM12/29/05
to
On Thu, Dec 29, 2005 at 10:47:28PM -0500, Mark Lord wrote:
> >If we change some /proc file thing, breakage is often totally
> >unintentional, and complaining is the right thing - people might not even
> >have realized it broke.
>
> Okay, I'm complaining: /proc/cpuinfo is no longer correct
> for my Pentium-M notebook, as of 2.6.15-rc7. Now it reports
> a cpu speed of approx 800Mhz for a 2.0Mhz Pentium-M.

It's reporting the 'current running speed'. You have speedstep-centrino
loaded, (and probably a governor changing the speed down when idle).

I don't see how this can be construed as breakage btw.
Which application breaks due to this changing ?

Dave

Mark Lord

unread,
Dec 29, 2005, 11:40:17 PM12/29/05
to
Mark Lord wrote:
..

> And tonight it appears to be working again (/proc/cpuinfo showing
> correct values, something it was not doing when I first checked it
> after upgrading to -rc7.. something buggy there??).

Okay, I've tried a couple of reboots, and it's working fine tonight.
Maybe it only fails when doing a public demo for Windows people?
(as when it first failed).

Leave it. If I can catch it again, I'll scream again then.

Dave Jones

unread,
Dec 30, 2005, 12:10:04 AM12/30/05
to
On Thu, Dec 29, 2005 at 11:20:12PM -0500, Mark Lord wrote:
> Mark Lord wrote:
> ..
> >And tonight it appears to be working again (/proc/cpuinfo showing
> >correct values, something it was not doing when I first checked it
> >after upgrading to -rc7.. something buggy there??).
>
> Okay, I've tried a couple of reboots, and it's working fine tonight.
> Maybe it only fails when doing a public demo for Windows people?
> (as when it first failed).
>
> Leave it. If I can catch it again, I'll scream again then.

One thing that could explain it.. SMP kernels currently don't
report scaling correctly. It'll always show the boot frequency.
There's a fix for this in the cpufreq.git repo (and -mm)
that's going to Linus once 2.6.15 is out.

Dave

Theodore Ts'o

unread,
Dec 30, 2005, 1:20:05 AM12/30/05
to
On Thu, Dec 29, 2005 at 08:21:45PM -0500, Dave Jones wrote:
> With 2.6.14 on my testbox, I get this..
>
> $ ls /dev/input/
> event0 event1 mice mouse0
>
> With 2.6.15rc
>
> $ ls /dev/input/
> mice
>
> If I can dig out the bugzilla that reported this, I'll followup.
> Something in my head is telling me it had something to do with
> laptop touchpads, but that could be the post-xmas `nog talking.

When I got bitten with this udev breakage a while ago, it wasn't a
matter of /dev/input/event? in the X configuration file, but that the
Synaptics driver searches for /dev/input/event? files itself in order
to talk directly to the raw touchpad device driver, and fails to
initialize when it finds none, thus causing the X server initialization
to fail. There may be other failure scenarios, but this was the one I
ran into with my laptop.
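
A rough sketch of the kind of probe involved (this is not the Synaptics driver's code; count_nodes is a hypothetical helper): count the device nodes matching a pattern, so the caller can tell "no event nodes" apart from a hard error and pick a fallback instead of aborting.

```c
#include <glob.h>
#include <stddef.h>

/* Count device nodes matching a pattern: 0 means none, -1 a glob error. */
static long count_nodes(const char *pattern)
{
	glob_t g;
	long n;
	int rc = glob(pattern, 0, NULL, &g);

	if (rc == 0)
		n = (long)g.gl_pathc;
	else if (rc == GLOB_NOMATCH)
		n = 0;
	else
		n = -1;
	globfree(&g);
	return n;
}
```

A caller could try count_nodes("/dev/input/event*") and fall back to /dev/input/mice when it returns 0, rather than failing the whole initialization.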

- Ted

Ingo Molnar

unread,
Dec 30, 2005, 3:00:11 AM12/30/05
to

* Tim Schmielau <t...@physik3.uni-rostock.de> wrote:

> What about the previous suggestion to remove inline from *all* static
> inline functions in .c files?

i think this is a way too static approach. Why go from one extreme to
the other, when my 3 simple patches (which arguably create a more
flexible scenario) give us savings of 7.7%?

Ingo

Arjan van de Ven

unread,
Dec 30, 2005, 3:10:09 AM12/30/05
to

> Can't we elect a recommended gcc version that distro makers could
> ship under the name kgcc as it has been the case for some time,

speaking as someone who used to work for a distro: this sucks for
distros. Shipping 2 compilers is NOT fun. Not fun at all! It's double
the maintenance, actually more since 1 of the 2 is only used in 1
package, so it gets a lot less testing.

Willy Tarreau

unread,
Dec 30, 2005, 3:20:09 AM12/30/05
to
On Fri, Dec 30, 2005 at 09:05:17AM +0100, Arjan van de Ven wrote:
>
> > Can't we elect a recommended gcc version that distro makers could
> > ship under the name kgcc as it has been the case for some time,
>
> speaking as someone who used to work for a distro: this sucks for
> distros. Shipping 2 compilers is NOT fun. Not fun at all! It's double
> the maintenance, actually more since 1 of the 2 is only used in 1
> package, so it gets a lot less testing.

I trust your experience on this, but wasn't the lack of testing
primarily due to the use of a "special" version of the compiler ?
For instance, if we put a short howto in Documentation/ explaining
how to build a kgcc toolchain describing what versions to use, there
are chances that most LKML users will use the exact same version.
Distro maintainers may want to follow the same version too. Also,
the fact that the kernel would be designed to work with *that*
compiler will limit the maintenance trouble you certainly have
encountered trying to keep the compiler up-to-date with more recent
kernel patches and updates.

Of course I may be wrong, but I think that kernel developers spend
a huge amount of time adapting the kernel to newer versions of gcc (and
fixing bugs caused by new versions too), and this time would be better
spent developing new features and fixing bugs (and of course sometimes
maintaining the kgcc toolchain when needed).

Willy

Arjan van de Ven

unread,
Dec 30, 2005, 3:20:09 AM12/30/05
to

>
> We used to have a fairly clear point where we could break things, when we
> had major kernel releases (ie 2.4 -> 2.6 broke the module loader. It was
> documented, and it was unavoidable).

maybe such a point should be added back, in the sense that the
"announced" things get batched up to, say, every 3rd kernel release.
Doing this in batches is less painful than doing it every release.

Arjan van de Ven

unread,
Dec 30, 2005, 3:30:10 AM12/30/05
to
On Fri, 2005-12-30 at 09:15 +0100, Willy Tarreau wrote:
>
>
> I trust your experience on this, but wasn't the lack of testing
> primarily due to the use of a "special" version of the compiler ?
> For instance, if we put a short howto in Documentation/ explaining
> how to build a kgcc toolchain describing what versions to use, there
> are chances that most LKML users will use the exact same version.
> Distro maintainers may want to follow the same version too. Also,
> the fact that the kernel would be designed to work with *that*
> compiler will limit the maintenance trouble you certainly have
> encountered trying to keep the compiler up-to-date with more recent
> kernel patches and updates.

it's not that easy. Simply put: the gcc people release an update every 6
months; distros "jump ahead" the bugfixes on that usually. (think of it
like -stable, where distros would ship patches accepted for -stable but
before -stable got released). Taking an older compiler from gcc.gnu.org
doesn't mean it's bug free. It just means you're not getting bugfixes.

Jesper Juhl

unread,
Dec 30, 2005, 3:40:08 AM12/30/05
to
On 12/30/05, Willy Tarreau <wi...@w.ods.org> wrote:
<!-- snip -->

>
> Can't we elect a recommended gcc version that distro makers could
> ship under the name kgcc as it has been the case for some time,
> and try to stick to that version for as long as possible ? The only
> real reason to upgrade it would be to support newer archs, while at
> the moment, we try to support compilers which are shipped as default
> *user-space* compilers.
>
As I see it, doing that would
- put extra work on distributors.
- bloat users systems with the need to have two gcc versions installed.
- decrease testing with different gcc versions, which sometimes uncover bugs.


--
Jesper Juhl <jespe...@gmail.com>
Don't top-post http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please http://www.expita.com/nomime.html

Willy Tarreau

unread,
Dec 30, 2005, 4:30:18 AM12/30/05
to
On Fri, Dec 30, 2005 at 09:24:32AM +0100, Arjan van de Ven wrote:
> On Fri, 2005-12-30 at 09:15 +0100, Willy Tarreau wrote:
> >
> >
> > I trust your experience on this, but wasn't the lack of testing
> > primarily due to the use of a "special" version of the compiler ?
> > For instance, if we put a short howto in Documentation/ explaining
> > how to build a kgcc toolchain describing what versions to use, there
> > are chances that most LKML users will use the exact same version.
> > Distro maintainers may want to follow the same version too. Also,
> > the fact that the kernel would be designed to work with *that*
> > compiler will limit the maintenance trouble you certainly have
> > encountered trying to keep the compiler up-to-date with more recent
> > kernel patches and updates.
>
> it's not that easy. Simply put: the gcc people release an update every 6
> months; distros "jump ahead" the bugfixes on that usually. (think of it
> like -stable, where distros would ship patches accepted for -stable but
> before -stable got released). Taking an older compiler from gcc.gnu.org
> doesn't mean it's bug free. It just means you're not getting bugfixes.

OK, but precisely, we don't have any bug free version of gcc anyway. The
kernel has a long history of workarounds for gcc bugs. So probably there
will be less work with a -possibly buggy- old gcc version than with a
constantly changing one. For instance, if we stick to 3.4 for 2 years,
we will of course encounter a lot of bugs. But they will be worked around
just like gcc-2.95 bugs have been, and we will be able to keep the same
compiler very long at virtually zero maintenance work.

A few years ago, I had to work on a mainframe system with gcc 1.37.
Yes, 1.37 !!! It was very limited, but I could adapt my code to it
without thinking about what would happen when they update it precisely
because it was not meant to evolve at all. It had been shipped like
this with the OS for 5 years and that was OK. With stable tools like
this, any bug becomes a feature because you don't risk someone fixing
it and breaking your workaround.

While it would be a real problem for user-space tools, I think it
is compatible with kernel needs. The kernel already has strict
requirements to be built and does not need the same level of
portability as pdksh or openssh for instance.

Willy

Andi Kleen

unread,
Dec 30, 2005, 4:30:18 AM12/30/05
to
Jakub Jelinek <ja...@redhat.com> writes:
>
> Only for static functions (and in -funit-at-a-time mode).
> Anything else would require full IMA over the whole kernel and we aren't
> there yet. So inline hints are useful. But most of the inline keywords
> in the kernel really should be that, hints, because e.g. while it can be
> beneficial to inline something on one arch, it may be not beneficial on
> another arch, depending on cache sizes, number of general registers
> available to the compiler, register pressure, speed of the call/ret
> pair, calling convention and many other factors.

There are important exceptions like:

- Code that really wants to do compile time constant resolution
(like the x86 copy_*_user) and even throws linker errors when wrong.
- Anything in a include file (otherwise it gets duplicated for
every #include which can actually increase text size a lot)
- There is some code which absolutely needs inline in the x86-64
vsyscall code.

But arguably they should be force_inline.
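
A minimal sketch of the first exception, assuming a made-up copy helper (my_copy, constant_copy and generic_copy are hypothetical names; the dispatch is modelled loosely on the i386 string.h constant/non-constant split): the constant-size variant only pays off if it is really inlined, which is why it carries the always_inline attribute.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the kernel's __always_inline. */
#define my_always_inline inline __attribute__((always_inline))

/* Size known at compile time: once inlined, the switch folds to one case. */
static my_always_inline void *constant_copy(void *to, const void *from, size_t n)
{
	switch (n) {
	case 0:
		return to;
	case 1:
		*(unsigned char *)to = *(const unsigned char *)from;
		return to;
	default:
		return memcpy(to, from, n);
	}
}

/* Size only known at run time: an ordinary out-of-line call is fine. */
static void *generic_copy(void *to, const void *from, size_t n)
{
	return memcpy(to, from, n);
}

/* Pick the variant according to whether the size is a compile-time constant. */
#define my_copy(to, from, n)				\
	(__builtin_constant_p(n) ?			\
	 constant_copy((to), (from), (n)) :		\
	 generic_copy((to), (from), (n)))
```

If the compiler were free to leave constant_copy() out of line, every caller would pay for the full switch at run time, which defeats the point of the constant variant.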

I'm not quite sure I buy Ingo's original argument also. If he's only
looking at text size then with the above fixed then he ideally
would like to not inline anything (because except these
exceptions above .text nearly always shrinks when
not inlining). But that's not necessarily best for performance.

-Andi

Willy Tarreau

unread,
Dec 30, 2005, 4:40:10 AM12/30/05
to
On Fri, Dec 30, 2005 at 09:33:14AM +0100, Jesper Juhl wrote:
> On 12/30/05, Willy Tarreau <wi...@w.ods.org> wrote:
> <!-- snip -->
> >
> > Can't we elect a recommended gcc version that distro makers could
> > ship under the name kgcc as it has been the case for some time,
> > and try to stick to that version for as long as possible ? The only
> > real reason to upgrade it would be to support newer archs, while at
> > the moment, we try to support compilers which are shipped as default
> > *user-space* compilers.
> >
> As I see it, doing that would
> - put extra work on distributors.

In the short term, yes. In the mid-term, I don't think so. Having one package
which does not need to change and another one which evolves regardless of
kernel needs is less work than ensuring that a single package is still
compatible with everyone's needs. Think about support too : "what gcc version
did you use ?" would simply become "did you build with kgcc ?"

> - bloat users systems with the need to have two gcc versions installed.

$ size /usr/lib/gcc-lib/i586-pc-linux-gnu/3.3.6/cc1
   text    data     bss     dec     hex filename
3430228    2680  746688 4179596  3fc68c /usr/lib/gcc-lib/i586-pc-linux-gnu/3.3.6/cc1

You don't even need libgcc nor c++ to build the kernel. Anyway, it should
not be an absolute requirement, but the *recommended* and *supported* version.

> - decrease testing with different gcc versions, which sometimes uncover bugs.

gcc testing should not consume kernel developers' time, but gcc users'.
How many kernel bugs have finally been attributed to a recent change in gcc ?
A lot I think. Uncovering bugs in gcc is useful but not the primary goal of
kernel developers.

> Jesper Juhl <jespe...@gmail.com>

Willy

Jesper Juhl

unread,
Dec 30, 2005, 4:40:13 AM12/30/05
to
On 12/30/05, Willy Tarreau <wi...@w.ods.org> wrote:
> On Fri, Dec 30, 2005 at 09:33:14AM +0100, Jesper Juhl wrote:
> > On 12/30/05, Willy Tarreau <wi...@w.ods.org> wrote:
> > <!-- snip -->
> > >
> > > Can't we elect a recommended gcc version that distro makers could
> > > ship under the name kgcc as it has been the case for some time,
> > > and try to stick to that version for as long as possible ? The only
> > > real reason to upgrade it would be to support newer archs, while at
> > > the moment, we try to support compilers which are shipped as default
> > > *user-space* compilers.
> > >
> > As I see it, doing that would
> > - put extra work on distributors.
>
> In the short term, yes. In the mid-term, I don't think so. Having one package
> which does not need to change and another one which evolves regardless of
> kernel needs is less work than ensuring that a single package is still
> compatible with everyone's needs. Think about support too : "what gcc version
> did you use ?" would simply become "did you build with kgcc ?"
>
> > - bloat users systems with the need to have two gcc versions installed.
>
> $ size /usr/lib/gcc-lib/i586-pc-linux-gnu/3.3.6/cc1
> text data bss dec hex filename
> 3430228 2680 746688 4179596 3fc68c /usr/lib/gcc-lib/i586-pc-linux-gnu/3.3.6/cc1
>
It's not much, agreed, but if the user's regular gcc can build the
kernel it's still unnecessary extra bloat to have two gcc's.
But you are right, the bloat issue is just a minor thing.


> You don't even need libgcc nor c++ to build the kernel. Anyway, it should
> not be an absolute requirement, but the *recommended* and *supported* version.
>
> > - decrease testing with different gcc versions, which sometimes uncover bugs.
>
> gcc testing should not consume kernel developers' time, but gcc users'.
> How many kernel bugs have finally been attributed to a recent change in gcc ?
> A lot I think. Uncovering bugs in gcc is useful but not the primary goal of
> kernel developers.
>

That's not what I meant. I meant that building the kernel with
different gcc versions sometimes uncovers bugs in the *kernel*. I was
not talking about finding bugs in gcc.

Willy Tarreau

unread,
Dec 30, 2005, 4:50:08 AM12/30/05
to

OK. But there will always be people trying to build kernels with any gcc so
I don't think we would lose this bug report channel anyway.

> Jesper Juhl <jespe...@gmail.com>

Willy

Ingo Molnar

unread,
Dec 30, 2005, 4:50:11 AM12/30/05
to

* Andi Kleen <a...@suse.de> wrote:

> There are important exceptions like:
>
> - Code that really wants to do compile time constant resolution
> (like the x86 copy_*_user) and even throws linker errors when wrong.
> - Anything in a include file (otherwise it gets duplicated for
> every #include which can actually increase text size a lot)
> - There is some code which absolutely needs inline in the x86-64
> vsyscall code.
>
> But arguably they should be force_inline.

FYI, i picked up a couple of those in the 3rd patch that i sent
yesterday (see below too). That patch marks a handful of functions
__always_inline. This improved size by another 2-3%. Not bad from a
small patch:

asm-i386/apic.h | 6 +++---
asm-i386/bitops.h | 2 +-
asm-i386/current.h | 2 +-
asm-i386/string.h | 8 ++++----
linux/buffer_head.h | 10 +++++-----
linux/byteorder/swab.h | 18 +++++++++---------
linux/mm.h | 2 +-
linux/slab.h | 2 +-
8 files changed, 25 insertions(+), 25 deletions(-)

> I'm not quite sure I buy Ingo's original argument also. If he's only
> looking at text size then with the above fixed then he ideally would
> like to not inline anything (because except these exceptions above
> .text nearly always shrinks when not inlining). But that's not
> necessarily best for performance.

well, i think the numbers talk for themselves. Here are my latest
results:

----
The effect of the patches on x86, using a generic .config is:

   text    data     bss     dec     hex filename
3286166  869852  387260 4543278  45532e vmlinux-orig
3194123  955168  387260 4536551  4538e7 vmlinux-inline
3119495  884960  387748 4392203  43050b vmlinux-inline+units
3051709  869380  387748 4308837  41bf65 vmlinux-inline+units+fixes
3049357  868928  387748 4306033  41b471 vmlinux-inline+units+fixes+capable

i.e. a 7.8% code-size reduction. Using a tiny .config gives:

   text    data     bss     dec     hex filename
 437271   77646   32192  547109   85925 vmlinux-orig
 452694   77646   32192  562532   89564 vmlinux-inline
 431891   77422   32128  541441   84301 vmlinux-inline+units
 414803   77422   32128  524353   80041 vmlinux-inline+units+fixes
 414020   77422   32128  523570   7fd32 vmlinux-inline+units+fixes+capable

or a 5.6% reduction.

i've also done test-builds with CC_OPTIMIZE_FOR_SIZE disabled:

   text    data     bss     dec     hex filename
4080998  870384  387260 5338642  517612 vmlinux-orig
4084421  872024  387260 5343705  5189d9 vmlinux-inline
4010957  834048  387748 5232753  4fd871 vmlinux-inline+units
4010039  833112  387748 5230899  4fd133 vmlinux-inline+units+fixes
4007617  833120  387748 5228485  4fc7c5 vmlinux-inline+units+fixes+capable

or a 1.8% code size reduction.
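
For anyone re-deriving these percentages from the .text columns, here is a small sketch (the function names are made up). The quoted figures appear to match the size difference taken relative to the patched kernel; taken relative to the original kernel the numbers come out slightly lower.

```c
/* .text shrink as a percentage of the patched size. */
static double shrink_pct_vs_patched(long orig, long patched)
{
	return 100.0 * (double)(orig - patched) / (double)patched;
}

/* .text shrink as a percentage of the original size. */
static double shrink_pct_vs_orig(long orig, long patched)
{
	return 100.0 * (double)(orig - patched) / (double)orig;
}
```

For the generic .config (3286166 vs 3049357 bytes) the first form gives about 7.8% and the second about 7.2%; for the tiny .config (437271 vs 414020) they give about 5.6% and 5.3%.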

Ingo

--------
Subject: mark a handful of inline functions as 'must inline'

this patch marks a number of functions as 'must inline' - so that they
get inlined even if optimizing for size. This patch gives another 2-3%
of size saved, when CONFIG_CC_OPTIMIZE_FOR_SIZE is enabled.

Signed-off-by: Ingo Molnar <mi...@elte.hu>

----

include/asm-i386/apic.h | 6 +++---
include/asm-i386/bitops.h | 2 +-
include/asm-i386/current.h | 2 +-
include/asm-i386/string.h | 8 ++++----
include/linux/buffer_head.h | 10 +++++-----
include/linux/byteorder/swab.h | 18 +++++++++---------
include/linux/mm.h | 2 +-
include/linux/slab.h | 2 +-
8 files changed, 25 insertions(+), 25 deletions(-)

Index: linux-gcc.q/include/asm-i386/apic.h
===================================================================
--- linux-gcc.q.orig/include/asm-i386/apic.h
+++ linux-gcc.q/include/asm-i386/apic.h
@@ -49,17 +49,17 @@ static inline void lapic_enable(void)
* Basic functions accessing APICs.
*/

-static __inline void apic_write(unsigned long reg, unsigned long v)
+static __always_inline void apic_write(unsigned long reg, unsigned long v)
{
*((volatile unsigned long *)(APIC_BASE+reg)) = v;
}

-static __inline void apic_write_atomic(unsigned long reg, unsigned long v)
+static __always_inline void apic_write_atomic(unsigned long reg, unsigned long v)
{
xchg((volatile unsigned long *)(APIC_BASE+reg), v);
}

-static __inline unsigned long apic_read(unsigned long reg)
+static __always_inline unsigned long apic_read(unsigned long reg)
{
return *((volatile unsigned long *)(APIC_BASE+reg));
}
Index: linux-gcc.q/include/asm-i386/bitops.h
===================================================================
--- linux-gcc.q.orig/include/asm-i386/bitops.h
+++ linux-gcc.q/include/asm-i386/bitops.h
@@ -247,7 +247,7 @@ static inline int test_and_change_bit(in
static int test_bit(int nr, const volatile void * addr);
#endif

-static inline int constant_test_bit(int nr, const volatile unsigned long *addr)
+static __always_inline int constant_test_bit(int nr, const volatile unsigned long *addr)
{
return ((1UL << (nr & 31)) & (addr[nr >> 5])) != 0;
}
Index: linux-gcc.q/include/asm-i386/current.h
===================================================================
--- linux-gcc.q.orig/include/asm-i386/current.h
+++ linux-gcc.q/include/asm-i386/current.h
@@ -5,7 +5,7 @@

struct task_struct;

-static inline struct task_struct * get_current(void)
+static __always_inline struct task_struct * get_current(void)
{
return current_thread_info()->task;
}
Index: linux-gcc.q/include/asm-i386/string.h
===================================================================
--- linux-gcc.q.orig/include/asm-i386/string.h
+++ linux-gcc.q/include/asm-i386/string.h
@@ -201,7 +201,7 @@ __asm__ __volatile__(
return __res;
}

-static inline void * __memcpy(void * to, const void * from, size_t n)
+static __always_inline void * __memcpy(void * to, const void * from, size_t n)
{
int d0, d1, d2;
__asm__ __volatile__(
@@ -223,7 +223,7 @@ return (to);
* This looks ugly, but the compiler can optimize it totally,
* as the count is constant.
*/
-static inline void * __constant_memcpy(void * to, const void * from, size_t n)
+static __always_inline void * __constant_memcpy(void * to, const void * from, size_t n)
{
long esi, edi;
if (!n) return to;
@@ -367,7 +367,7 @@ return s;
* things 32 bits at a time even when we don't know the size of the
* area at compile-time..
*/
-static inline void * __constant_c_memset(void * s, unsigned long c, size_t count)
+static __always_inline void * __constant_c_memset(void * s, unsigned long c, size_t count)
{
int d0, d1;
__asm__ __volatile__(
@@ -416,7 +416,7 @@ extern char *strstr(const char *cs, cons
* This looks horribly ugly, but the compiler can optimize it totally,
* as we by now know that both pattern and count is constant..
*/
-static inline void * __constant_c_and_count_memset(void * s, unsigned long pattern, size_t count)
+static __always_inline void * __constant_c_and_count_memset(void * s, unsigned long pattern, size_t count)
{
switch (count) {
case 0:
Index: linux-gcc.q/include/linux/buffer_head.h
===================================================================
--- linux-gcc.q.orig/include/linux/buffer_head.h
+++ linux-gcc.q/include/linux/buffer_head.h
@@ -72,15 +72,15 @@ struct buffer_head {
* and buffer_foo() functions.
*/
#define BUFFER_FNS(bit, name) \
-static inline void set_buffer_##name(struct buffer_head *bh) \
+static __always_inline void set_buffer_##name(struct buffer_head *bh) \
{ \
set_bit(BH_##bit, &(bh)->b_state); \
} \
-static inline void clear_buffer_##name(struct buffer_head *bh) \
+static __always_inline void clear_buffer_##name(struct buffer_head *bh) \
{ \
clear_bit(BH_##bit, &(bh)->b_state); \
} \
-static inline int buffer_##name(const struct buffer_head *bh) \
+static __always_inline int buffer_##name(const struct buffer_head *bh) \
{ \
return test_bit(BH_##bit, &(bh)->b_state); \
}
@@ -89,11 +89,11 @@ static inline int buffer_##name(const st
* test_set_buffer_foo() and test_clear_buffer_foo()
*/
#define TAS_BUFFER_FNS(bit, name) \
-static inline int test_set_buffer_##name(struct buffer_head *bh) \
+static __always_inline int test_set_buffer_##name(struct buffer_head *bh)\
{ \
return test_and_set_bit(BH_##bit, &(bh)->b_state); \
} \
-static inline int test_clear_buffer_##name(struct buffer_head *bh) \
+static __always_inline int test_clear_buffer_##name(struct buffer_head *bh)\
{ \
return test_and_clear_bit(BH_##bit, &(bh)->b_state); \
} \
Index: linux-gcc.q/include/linux/byteorder/swab.h
===================================================================
--- linux-gcc.q.orig/include/linux/byteorder/swab.h
+++ linux-gcc.q/include/linux/byteorder/swab.h
@@ -130,34 +130,34 @@
#endif /* OPTIMIZE */


-static __inline__ __attribute_const__ __u16 __fswab16(__u16 x)
+static __always_inline __attribute_const__ __u16 __fswab16(__u16 x)
{
return __arch__swab16(x);
}
-static __inline__ __u16 __swab16p(const __u16 *x)
+static __always_inline __u16 __swab16p(const __u16 *x)
{
return __arch__swab16p(x);
}
-static __inline__ void __swab16s(__u16 *addr)
+static __always_inline void __swab16s(__u16 *addr)
{
__arch__swab16s(addr);
}

-static __inline__ __attribute_const__ __u32 __fswab32(__u32 x)
+static __always_inline __attribute_const__ __u32 __fswab32(__u32 x)
{
return __arch__swab32(x);
}
-static __inline__ __u32 __swab32p(const __u32 *x)
+static __always_inline __u32 __swab32p(const __u32 *x)
{
return __arch__swab32p(x);
}
-static __inline__ void __swab32s(__u32 *addr)
+static __always_inline void __swab32s(__u32 *addr)
{
__arch__swab32s(addr);
}

#ifdef __BYTEORDER_HAS_U64__
-static __inline__ __attribute_const__ __u64 __fswab64(__u64 x)
+static __always_inline __attribute_const__ __u64 __fswab64(__u64 x)
{
# ifdef __SWAB_64_THRU_32__
__u32 h = x >> 32;
@@ -167,11 +167,11 @@ static __inline__ __attribute_const__ __
return __arch__swab64(x);
# endif
}
-static __inline__ __u64 __swab64p(const __u64 *x)
+static __always_inline __u64 __swab64p(const __u64 *x)
{
return __arch__swab64p(x);
}
-static __inline__ void __swab64s(__u64 *addr)
+static __always_inline void __swab64s(__u64 *addr)
{
__arch__swab64s(addr);
}
Index: linux-gcc.q/include/linux/mm.h
===================================================================
--- linux-gcc.q.orig/include/linux/mm.h
+++ linux-gcc.q/include/linux/mm.h
@@ -507,7 +507,7 @@ static inline void set_page_links(struct
extern struct page *mem_map;
#endif

-static inline void *lowmem_page_address(struct page *page)
+static __always_inline void *lowmem_page_address(struct page *page)
{
return __va(page_to_pfn(page) << PAGE_SHIFT);
}
Index: linux-gcc.q/include/linux/slab.h
===================================================================
--- linux-gcc.q.orig/include/linux/slab.h
+++ linux-gcc.q/include/linux/slab.h
@@ -76,7 +76,7 @@ struct cache_sizes {
extern struct cache_sizes malloc_sizes[];
extern void *__kmalloc(size_t, gfp_t);

-static inline void *kmalloc(size_t size, gfp_t flags)
+static __always_inline void *kmalloc(size_t size, gfp_t flags)
{
if (__builtin_constant_p(size)) {
int i = 0;

Ingo Molnar

Dec 30, 2005, 5:20:14 AM12/30/05
to

* Ingo Molnar <mi...@elte.hu> wrote:

> > I'm not quite sure I buy Ingo's original argument also. If he's only
> > looking at text size then with the above fixed then he ideally would
> > like to not inline anything (because except these exceptions above
> > .text usually near always shrinks when not inlining). But that's not
> > necessarily best for performance.
>
> well, i think the numbers talk for themselves. Here are my latest
> results:

i now have x86 allyesconfig numbers too:

    text    data     bss      dec filename
24190215 6737902 1775592 32703709 vmlinux-allyes-speed-orig
20096423 6758758 1775592 28630773 vmlinux-allyes-orig
19223511 6844002 1775656 27843169 vmlinux-allyes-inline+units+fixes+capable

i.e. enabling CONFIG_CC_OPTIMIZE_FOR_SIZE gives a 20.4% size reduction,
and adding my latest debloating-queue on top of that gives an additional
4.5% of reduction (both percentages are relative to the smaller .text
size). The queue is at:

http://redhat.com/~mingo/debloating-patches/
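
The percentages above can be re-derived from the .text column. The following is a minimal sketch in C that assumes nothing beyond the three sizes quoted in the table; the helper names pct_of, report and report_allyes are made up for illustration. It prints each shrink against both possible bases, because the choice of base changes the number:

```c
#include <stdio.h>

/* Bytes saved going from `before` to `after`, as a percentage of `base`. */
static double pct_of(long before, long after, long base)
{
	return 100.0 * (double)(before - after) / (double)base;
}

/* Print one shrink against both bases: the larger and the smaller kernel. */
static void report(const char *step, long before, long after)
{
	printf("%-22s %.1f%% of larger, %.1f%% of smaller\n",
	       step, pct_of(before, after, before), pct_of(before, after, after));
}

/* .text sizes copied from the allyesconfig table above. */
static void report_allyes(void)
{
	const long speed_orig = 24190215;	/* vmlinux-allyes-speed-orig */
	const long size_orig  = 20096423;	/* vmlinux-allyes-orig */
	const long debloated  = 19223511;	/* vmlinux-allyes-inline+units+fixes+capable */

	report("OPTIMIZE_FOR_SIZE", speed_orig, size_orig);
	report("debloating queue", size_orig, debloated);
}
```

Calling report_allyes() from a main() shows that the quoted 20.4% and 4.5% come out when the smaller kernel is taken as the base; measured against the larger kernel the same two steps are roughly 16.9% and 4.3%.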

note: my focus is still mostly on CC_OPTIMIZE_FOR_SIZE (which is only
offered if CONFIG_EMBEDDED is enabled) - if you want a larger kernel
optimized for speed, do not enable it.

Ingo

Andi Kleen

Dec 30, 2005, 5:30:14 AM12/30/05
to
On Fri, Dec 30, 2005 at 10:40:45AM +0100, Ingo Molnar wrote:
>    text   data    bss     dec    hex filename
> 4080998 870384 387260 5338642 517612 vmlinux-orig
> 4084421 872024 387260 5343705 5189d9 vmlinux-inline
> 4010957 834048 387748 5232753 4fd871 vmlinux-inline+units
> 4010039 833112 387748 5230899 4fd133 vmlinux-inline+units+fixes
> 4007617 833120 387748 5228485 4fc7c5 vmlinux-inline+units+fixes+capable
>
> or a 1.8% code size reduction.

But again, if you only look at text size you would ideally want
to never inline anything, except the cases above and functions that
are called only once. So just turn it off except when forced? That would
be the logical conclusion from your strategy. I'm not sure it's a good
one.
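
The size side of this argument can be put into a toy model. The sketch below is an assumption-laden simplification, not a description of what gcc does: the function names and all byte counts are hypothetical, and it ignores prologue/epilogue code, register pressure and follow-on optimizations. It only captures why tiny bodies and called-once functions are the cases where inlining does not cost text:

```c
/* Toy .text model; every size here is a hypothetical byte count. */

/* One shared copy of the body plus a call sequence at every call site. */
static long text_out_of_line(long body, long call, long sites)
{
	return body + sites * call;
}

/* The body is duplicated at every call site, with no call sequence. */
static long text_inlined(long body, long sites)
{
	return sites * body;
}

/* Nonzero when inlining gives the smaller .text under this model. */
static int inlining_shrinks_text(long body, long call, long sites)
{
	return text_inlined(body, sites) < text_out_of_line(body, call, sites);
}
```

With a 4-byte body and an 8-byte call sequence, inlining wins at any number of call sites; a 200-byte body wins only when there is a single call site. The model says nothing about speed, which is the part of the trade-off raised here.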

-Andi

Bernd Petrovitsch

Dec 30, 2005, 6:30:11 AM12/30/05
to
On Thu, 2005-12-29 at 15:17 -0700, Jeff V. Merkey wrote:
[...]
> Start caring. People spend lots of money supporting you, and what you
> are doing. How about taking some
> responsibility for that so they don't change their minds and move back
> to windows or pull their support because it's too
> costly or too much of a hassle to produce something stable from these
> releases. If you export functions from the kernel,

The "program a driver once, runs on every windows in the future" is
actually a myth. Talk to developers with windows drivers ....
It is just that the companies absolutely don't have a choice if MSFT
changes something ....

Bernd
--
Firmix Software GmbH http://www.firmix.at/
mobil: +43 664 4416156 fax: +43 1 7890849-55
Embedded Linux Development and Services

Bernd Petrovitsch

Dec 30, 2005, 6:40:05 AM12/30/05
to
On Thu, 2005-12-29 at 16:54 -0700, Jeff V. Merkey wrote:
[....]
> The fact that Oracle and IBM support apps on Linux are Freeloading? Baloney!
> Linux benefits by having the choice of all these applications.

Do they have binary-only kernel modules or user-space apps?

> (P.S. I have heard through the grapevine IBM is putting emphasis on AIX
> as their platform and are actively telling this to large customers --
> can you verify this and are you aware of it)

Not knowing any inner IBM things, the simple commercial explanation is:
If a customer buys AIX, he is forced to buy the hardware at IBM. And IBM
is a hardware selling (and consulting) company anyways, it never was a
"software company".

Adrian Bunk

Dec 30, 2005, 8:40:25 AM12/30/05
to
On Fri, Dec 30, 2005 at 11:14:43AM +0100, Ingo Molnar wrote:
>...

> note: my focus is still mostly on CC_OPTIMIZE_FOR_SIZE (which is only
> offered if CONFIG_EMBEDDED is enabled) - if you want a larger kernel
> optimized for speed, do not enable it.

Since 2.6.15-rc6, CC_OPTIMIZE_FOR_SIZE only depends on EXPERIMENTAL.

> Ingo

cu
Adrian

--

"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed

Adrian Bunk

Dec 30, 2005, 8:50:09 AM12/30/05
to
>...

The changes in gcc aren't _that_ big.

As an example, I tried compiling recent 2.6 kernels with gcc CVS HEAD
shortly before the 4.1 branch was created, and except for two or three
internal compiler errors (that are OK considering that I used a random
CVS snapshot) the kernel compiled fine.

Every gcc release might have its own issues, but compared to e.g. the
pains your proposal would impose on new ports, they aren't that big.
And you shouldn't forget that it's even non-trivial to find one gcc
release that works fine compiling kernels on all architectures. As an
example, gcc 3.2 is a known bad compiler on arm.

> Willy

cu
Adrian

--

"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed


Christian Trefzer

Dec 30, 2005, 9:20:06 AM12/30/05
to
Hi Ingo,

On Fri, Dec 30, 2005 at 11:14:43AM +0100, Ingo Molnar wrote:
>

> * Ingo Molnar <mi...@elte.hu> wrote:
>

> [...] The queue is at:
>
> http://redhat.com/~mingo/debloating-patches/
>

I was curious and applied among others the uninline-capable patch, with
the result that modules complain about an unknown symbol "capable". The
attached patch is a manually adapted version of yours, extended by the
EXPORT_SYMBOL_GPL required for modules to load again.

The code with the EXPORT_SYMBOL_GPL line works (I am currently running
that kernel) and the patch should apply cleanly.
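
Why the export becomes necessary: while capable() was a static inline in a header, every module carried its own copy; once it is a single out-of-line function in the kernel image, modules have to resolve it by name. The following user-space sketch illustrates that shape only. It is not the attached patch: the struct, the fake task and the stand-in EXPORT_SYMBOL_GPL macro are all assumptions made so the file builds outside the kernel tree, and the function body is only loosely modelled on the kernel's.

```c
/* Stand-ins so this builds outside the kernel tree (all hypothetical). */
#define EXPORT_SYMBOL_GPL(sym) extern __typeof__(sym) sym /* real macro adds sym to the module symbol table */
#define PF_SUPERPRIV 0x00000100	/* stand-in flag: "used privileges" */

struct task {
	unsigned int cap_effective;	/* one bit per capability */
	unsigned int flags;
};

static struct task fake_task;			/* stand-in for the running task */
static struct task *current_task = &fake_task;

/* Header side after uninlining: only a declaration remains. */
extern int capable(int cap);

/* One .c file: the single out-of-line definition, exported for modules. */
int capable(int cap)
{
	if (current_task->cap_effective & (1u << cap)) {
		current_task->flags |= PF_SUPERPRIV;
		return 1;
	}
	return 0;
}
EXPORT_SYMBOL_GPL(capable);
```

Without the export line, the kernel image itself would still link, but a module calling capable() would fail to load with the unknown-symbol complaint described above.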

Regards,
Chris

uninline-and-export-capable.patch

Jeff V. Merkey

Dec 30, 2005, 10:20:32 AM12/30/05
to
Linus Torvalds wrote:

>On Thu, 29 Dec 2005, Jeff V. Merkey wrote:
>
>
>>The fact that Oracle and IBM support apps on Linux are Freeloading? Baloney!
>>
>>
>

>Jeff, give it a rest.
>
>Oracle and IBM haven't been complaining, have they? Oracle mostly does
>user-space stuff (that doesn't change), and has been pretty good about
>their Oracle-fs too - they're even actively discussing "git" issues on the
>git mailing list, and asking for help, and just generally being good
>members of the community.
>
>And IBM engineers are part of the people who change internal kernel
>interfaces in order to make it work better for them.
>
>Pretty much the ONLY people who ever complain about those internal kernel
>interfaces changing are the free-loaders. It's hard for them, because they
>don't want to play according to the rules. Tough. Watch me not care:
>
> [ Linus sits in his chair, patently not caring ]
>
>See?
>
>

I went to New Mexico for Christmas to see my 100 year old Cherokee
Grandmother. After reading all the junk people have written on the
internet about me and this whole mess with Linux, she offered two very
wise comments:

1. "When people say or write things about you, it's a reflection on them,
and not you."

2. (looking at me intensely) "The cure for stupidity is silence."

So I have to say, if you feel those who write applications for Linux are
freeloaders (I take this to mean a freeloader is someone who doesn't give
you their IP but uses the Linux platform to sell vertical apps), then you
are saying you don't care about supporting vendors who commercialize Linux
(except those who give you back IP or money).

I think I got this right. Did I miss anything?

You don't have to respond. Remember the cure for stupidity is silence.

[Jeff sitting in chair also not caring but trying to cure his stupidity
by being silent]

Jeff :-)

> Linus

Jeff V. Merkey

Dec 30, 2005, 10:30:26 AM12/30/05
to
Bernd Petrovitsch wrote:

>On Thu, 2005-12-29 at 15:17 -0700, Jeff V. Merkey wrote:
>[...]
>
>
>>Start caring. People spend lots of money supporting you, and what you
>>are doing. How about taking some
>>responsibility for that so they don't change their minds and move back
>>to windows or pull their support because it's too
>>costly or too much of a hassle to produce something stable from these
>>releases. If you export functions from the kernel,
>>
>>
>
>The "program a driver once, runs on every windows in the future" is
>actually a myth. Talk to developers with windows drivers ....
>It is just that the companies absolutely don't have a choice if MSFT
>changes something ....
>
> Bernd
>
>

I support and write FS drivers for windows. The same driver works on
2002, 2002, 2003, and XP. Longhorn has changed two IFS functions
and that's it, and still loads the older fs drivers through a compat
interface.

Jeff

Jeff V. Merkey

Dec 30, 2005, 10:30:29 AM12/30/05
to
Bernd Petrovitsch wrote:

>On Thu, 2005-12-29 at 16:54 -0700, Jeff V. Merkey wrote:
>[....]
>
>
>>The fact that Oracle and IBM support apps on Linux are Freeloading? Baloney!
>>Linux benefits by having the choice of all these applications.
>>
>>
>
>Do they have binary-only kernel modules or user-space apps?
>
>
>
>>(P.S. I have heard through the grapevine IBM is putting emphasis on AIX
>>as their platform and are actively telling this to large customers --
>>can you verify this and are you aware of it)
>>
>>
>
>Not knowing any inner IBM things, the simple commercial explanation is:
>If a customer buys AIX, he is forced to buy the hardware at IBM. And IBM
>is a hardware selling (and consulting) company anyways, it never was a
>"software company".
>
>

The lawsuit is finally impacting their sales. They are telling folks
they are moving off Linux long term. I don't know if it's a smoke screen
due to the lawsuit or an actual technical business decision. I would
guess they are making a back door to pull out if the lawsuit goes south.
Looks like it might be based on filings on the 12/22 but I don't know
for certain.

Jeff

> Bernd

Rik van Riel

Dec 30, 2005, 11:20:28 AM12/30/05
to
On Fri, 30 Dec 2005, Jeff V. Merkey wrote:

> 1. "When people say or write things about you, it's a reflection on them,
> and not you."
>
> 2. (looking at me intensely) "The cure for stupidity is silence."

Your grandmother gave you very wise advice indeed.

You might want to consider how her comments apply to
your own site, http://www.merkeylaw.com/ ...

--
All Rights Reversed

Alan Cox

Dec 30, 2005, 11:20:37 AM12/30/05
to
On Wed, 2005-12-28 at 20:11 -0800, Andrew Morton wrote:
> If no-forced-inlining makes the kernel smaller then we probably have (yet
> more) incorrect inlining. We should hunt those down and fix them. We did
> quite a lot of this in 2.5.x/2.6.early. Didn't someone have a script which
> would identify which functions are a candidate for uninlining?

There is a tool that does this quite well. It's called "gcc" ;)

More seriously we need to separate "things Andrew thinks are good inline
candidates" and "things that *must* be inlined". That allows 'build for
size' to do the equivalent of "-Dplease_inline" and the other build to
do "-Dplease_inline=inline". Gcc's inliner isn't aware of things cross
module so isn't going to make all the decisions right, but will make the
tedious local decisions.
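
A minimal sketch of that split, assuming GCC or a compatible compiler. The macro name must_inline and both helper functions are invented for illustration; please_inline follows the name used above but is not taken from any kernel header. One detail to be aware of: with GCC a bare -Dplease_inline defines the macro as 1, so the empty definition for a size build is spelled -Dplease_inline= on the command line.

```c
/*
 * Hypothetical names throughout: must_inline and please_inline only
 * illustrate the split, they are not existing kernel macros.
 */

/* Must be inlined: force it regardless of -Os or -O2. */
#define must_inline inline __attribute__((always_inline))

/*
 * Merely a hint. A size build passes -Dplease_inline= so the hint
 * disappears and the compiler decides; other builds get plain inline.
 */
#ifndef please_inline
#define please_inline inline
#endif

/* Tiny helper that should always fold into its caller. */
static must_inline unsigned long bit_mask(int nr)
{
	return 1UL << (nr & 31);
}

/* Larger helper: whether to inline it is left to the build. */
static please_inline int count_set_bits(unsigned long word)
{
	int n = 0;

	while (word) {
		word &= word - 1;	/* clear the lowest set bit */
		n++;
	}
	return n;
}
```

The behaviour of both helpers is the same in either build; only the inlining decision for count_set_bits() moves from the source to the compiler.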

As far as bugs go - gcc -Os has also fixed bugs in the past. It doesn't
introduce bugs so much as change them. Fedora means we have good long
term data on -Os with modern gcc (not with old gcc but we just dumped <
3.2 anyway).

Nowadays the -Os code paths are also getting real hammering because many
people build desktops, even OpenOffice with -Os and see overall
performance gains for the system.

Alan

Jeff V. Merkey

Dec 30, 2005, 11:50:29 AM12/30/05
to
Rik van Riel wrote:

>On Fri, 30 Dec 2005, Jeff V. Merkey wrote:
>
>
>
>>1. "When people say or write things about you, it's a reflection on them,
>>and not you."
>>
>>2. (looking at me intensely) "The cure for stupidity is silence."
>>
>>
>
>Your grandmother gave you very wise advice indeed.
>
>You might want to consider how her comments apply to
>your own site, http://www.merkeylaw.com/ ...
>
>
>

Point well made. Dan Lyons at Forbes suggested that when you are stalked,
the best course of action is to expose them. I have noticed a complete
cessation of hate mail since it went up. Instead of getting dozens of
hate mails, now I get none. I'll take it down when it stops altogether.
I have no idea how these people got the impression I am anti-Linux. I am
as much a member of the community as anyone else, and I plan to keep
working on Linux (and other projects) for a long time to come.

Jeff

Bernd Petrovitsch

Dec 30, 2005, 12:20:32 PM12/30/05
to
On Fri, 2005-12-30 at 07:53 -0700, Jeff V. Merkey wrote:
> Bernd Petrovitsch wrote:
> >On Thu, 2005-12-29 at 15:17 -0700, Jeff V. Merkey wrote:
> >[...]
> >>Start caring. People spend lots of money supporting you, and what you
> >>are doing. How about taking some
> >>responsibility for that so they don't change their minds and move back
> >>to windows or pull their support because it's too
> >>costly or too much of a hassle to produce something stable from these
> >>releases. If you export functions from the kernel,
> >
> >The "program a driver once, runs on every windows in the future" is
> >actually a myth. Talk to developers with windows drivers ....
> >It is just that the companies absolutely don't have a choice if MSFT
> >changes something ....
> >
> I support and write FS drivers for windows. The same driver works on
> 2002, 2002, 2003, and XP. Longhorn has changed two IFS functions

Which are basically 2 stable versions (2000 & XP).
No 3.1, 95 and 98 support before?
Apart from that, there are other drivers than FS drivers too.

> and that's it, and still loads the older fs drivers through a compat
> interface.

Bernd
--
Firmix Software GmbH http://www.firmix.at/
mobil: +43 664 4416156 fax: +43 1 7890849-55
Embedded Linux Development and Services

