
CPU specific/optimized Debian builds?


Renaud Guerin

May 23, 2002, 6:40:09 AM
Hi,

I was having a look at Gentoo Linux the other night, and at their
principle of rebuilding your own packages with your chosen gcc -O &
-march flags.

I got to wondering how one should go about rebuilding a whole Debian
installation from source in a similar way. Has anybody done this
before, or are there significant hurdles that prevent automating the
process?

With the forthcoming significant updates to big parts of the system
(XFree86 4.2, KDE 3), and the simultaneous release of gcc 3.1, maybe now
would also be a good time to consider having an "x86 on steroids" arch
alongside i386, with an automated build process.

I've read that the performance improvements in Gentoo's case *really*
make a difference.
I believe Debian has an advanced enough package & build system to make
such builds relatively painless, so why not take advantage of it and
offer more performance?

I could understand the overhead and practical infeasibility of adding an
additional arch (even though it's not really a *different* arch), but it'd
be nice if at least there were a straightforward tool/procedure to generate
a custom-built CD image with optimized packages.

I suspect there might be something like this already; maybe someone is
even hosting optimized ISOs somewhere.
If so, please make yourself heard! :)

PS: please CC me on answers.



Peter Makholm

May 23, 2002, 7:10:05 AM
Renaud Guerin <rgu...@free.fr> writes:

> I was having a look at Gentoo Linux the other night, and their principle of
> rebuilding your own packages with your chosen gcc -O & -march flags.

There has just been a story about it on DebianPlanet. Try looking at the
comments on the story:
<http://debianplanet.org/article.php?sid=675&mode=nested&order=0&thold=0>

--
Peter Makholm | There are 10 kinds of people. Those who count in
pe...@makholm.net | binary and those who don't
http://hacking.dk |

George Danchev

May 23, 2002, 7:20:07 AM
On Thursday 23 May 2002 13:37, Renaud Guerin wrote:
> Hi,
>
> I was having a look at Gentoo Linux the other night, and their principle of
> rebuilding your own packages with your chosen gcc -O & -march flags.
>
> I got wondering, how one should go about rebuilding from source a whole
> debian installation in a similar way ?
> Anybody did this before, or are there any significant hurdles that prevent
> automating this process ?

There could easily be some, such as the issue that woody's build-depends
are not all satisfiable within woody; there might be broken (or missing)
Build-Depends fields in control files, and perhaps some more. Basically
you need to do something like the following
(do NOT run it; it is untested and crude, just read it):

#!/bin/sh
############# variables #############
# directory to store the sources
SRCDIR=SOURCES
# temporary file containing selected packages
SELECT=selections.tmp
#####################################


if [ -d $SRCDIR ]; then
    cd $SRCDIR
    # clean it ?
    rm -rf *
else
    mkdir $SRCDIR
    cd $SRCDIR
fi

# let's see what's new in the archive;
# make sure you have deb-src lines in your sources.list
# apt-get update

# get the local selections
dpkg --get-selections | grep -w 'install' | \
    awk '{print $1}' > ../$SELECT

# fetch them one by one from the archive and build
for I in `cat ../$SELECT`; do
    # we need to prepare the Build-Depends (from the control file)
    # to satisfy the compile conditions
    apt-get build-dep $I

    # now it's safe to build the package in question
    apt-get -b source $I

    # remove each source tree after it is built ???
    rm -rf *
done


You may also consider using auto-apt during the compile and link (see its
man page and howto for keeping its database updated). It is a really nice
tool, which will ask you to install the package(s) that provide missing
build dependencies.

Yet another issue is how to pass build options (say, the desirable
CFLAGS) to the apt-get -b or auto-apt commands. I know you can pass some
variables to dpkg, but I have never experimented with that for apt. For
example:

COLUMNS=140 dpkg -l | grep package

so the question is: can we do something like

CFLAGS="bla bla bla" apt-get -b source package

or are there some features I'm missing here?
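
For illustration, a minimal sketch of what I mean (untested, and it only
helps for packages whose debian/rules actually read CFLAGS from the
environment; "vim" is just a stand-in package name):

export CFLAGS="-O3 -march=i686"
apt-get build-dep vim
apt-get -b source vim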

--
Greets,
fr33zb1

Michael Stone

May 23, 2002, 7:40:05 AM
On Thu, May 23, 2002 at 12:37:40PM +0200, Renaud Guerin wrote:
> I've read the performance improvements in the case of Gentoo *really* make a
> difference.

Do you have actual benchmarks rather than hearsay from some leet /.
kiddie? No one has ever come to debian with a convincing set of numbers
that indicate that optimizing intel builds buys any significant
advantage.

--
Mike Stone

Junichi Uekawa

May 23, 2002, 7:40:11 AM
On Thu, 23 May 2002 12:37:40 +0200
Renaud Guerin <rgu...@free.fr> wrote:

> I got wondering, how one should go about rebuilding from source a whole
> debian installation in a similar way ?
> Anybody did this before, or are there any significant hurdles that prevent
> automating this process ?

pbuilder is designed for this purpose.

In summary: Debian does not quite build from source in woody yet, but we
are getting mostly there.
I welcome people trying to rebuild packages from source.


You should be able to use the pentium-builder package and pbuilder
together to build an athlon-optimized Debian, for example. But I have
not gotten that far yet. I'm making very slow progress...

http://www.netfort.gr.jp/~dancer/software/athlon-debian.html.en
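
The rough shape of it, from memory (untested, and getting the
environment variable through to the chroot is part of what I have not
sorted out yet; the .dsc name is just an example):

pbuilder create                  # once: create the base build chroot
export DEBIAN_BUILDARCH=athlon   # read by the pentium-builder gcc wrapper
pbuilder build foo_1.0-1.dsc     # build a source package inside the chroot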

regards,
junichi

--
dan...@debian.org http://www.netfort.gr.jp/~dancer

Roger Leigh

May 23, 2002, 8:30:20 AM
On Thu, May 23, 2002 at 08:36:02PM +0900, Junichi Uekawa wrote:
> On Thu, 23 May 2002 12:37:40 +0200
> Renaud Guerin <rgu...@free.fr> wrote:
>
> > I got wondering, how one should go about rebuilding from source a whole
> > debian installation in a similar way ?
> > Anybody did this before, or are there any significant hurdles that prevent
> > automating this process ?
>
> pbuilder is designed for this purpose.
>
> In summary: Debian does not build from source in woody, quite yet.
> But we are getting mostly there.
> I welcome people trying to rebuild packages from source.

It would be nice one day if you /could/ rebuild the entirety of Debian. I
think one way of getting there is to rebuild old packages during
autobuilder idle time. If I understand correctly, each uploaded package
is only built once for each arch. If packages were slowly rebuilt
automatically, that would catch non-buildable sources and policy
violations in packages that are old and have not been updated for years.

> You should be able to use pentium-builder package and
> pbuilder together to build athlon-optimized Debian,
> for example. But I have not gotten that far yet.
> I'm making a very slow progress...

Last November, I tried an upgrade from potato to sid entirely from source.
After two weekends spent continuously compiling and installing packages, I
got completely stuck (~250 packages in) due to:
1. unsatisfiable circular build-depends
2. non-buildable packages
3. unsatisfiable build-depends as a consequence of 1 and 2.

Unless you are autobuilding sid continuously, you will find that it is
very difficult to build the entire archive, IME.

When I get 160 GiB of new HDD space (within a few days), I'm going to try
bootstrapping an i686 arch and set up an autobuilder to build an i686
woody once woody is released. This may prove beyond my capabilities, but
should be interesting. Possibly the autobuilder will be able to build
what I could not if it can calculate all the Build-Deps.

--
Roger Leigh
** Registration Number: 151826, http://counter.li.org **
Need Epson Stylus Utilities? http://gimp-print.sourceforge.net/
GPG Public Key: 0x25BFB848 available on public keyservers

Fabio Massimo Di Nitto

May 23, 2002, 8:30:20 AM
Renaud Guerin wrote:

Hi,
We (a friend of mine and I) have actually started a small
project of building Debian for i686, and it isn't that complicated.
With "only" 4 minor patches we can easily build the whole base system,
but we are looking deeper into automating the process. Until now we
have built everything by hand, to be sure that everything works. We
expect to come out with some documentation, and possibly a mirror, by
the end of June (I am on summer holidays right now... so no PC with
me :-) ).
If you are really interested in this idea, feel free to contact me.
The main goal of our project is more to gain experience in handling
Debian at a very low level than to get a 0.2% faster system.

Fabio

Junichi Uekawa

May 23, 2002, 8:40:18 AM
On Thu, 23 May 2002 13:21:34 +0100
Roger Leigh <rl...@york.ac.uk> wrote:

> It would be nice one day if you /could/ rebuild the entirety of Debian.

I am rebuilding it.

> I
> think one way of getting this to work is rebuilding old packages during
> autobuilder idle time. If I understand correctly, for each uploaded
> package, it will only be built once for each arch. If the packages are
> slowly rebuilt automatically, it will catch non-buildable source and
> policy violations if it's old and not updated for years.

I am already doing it.

> Unless you are autobuilding sid, continuously, then you will find that it
> is very difficult to build the entire archive, IME.

I am continuously autobuilding sid.

Frank Copeland

May 23, 2002, 8:50:05 AM
On Thu, 23 May 2002 11:38:06 +0000 (UTC), Junichi Uekawa <dan...@netfort.gr.jp> wrote:

> You should be able to use pentium-builder package and
> pbuilder together to build athlon-optimized Debian,
> for example. But I have not gotten that far yet.
> I'm making a very slow progress...
>
> http://www.netfort.gr.jp/~dancer/software/athlon-debian.html.en

My K7/1700+ (1.433 real GHz) spends 99% of its time idle.

I already use a K7-optimised kernel (building it is the only time I
notice significant activity in the CPU monitor). I can see some logic
in building a K7-optimised libc. But what do I gain from building the
rest of my system with K7 optimisations? Do I really care if ls runs 5
or 10% faster (in CPU terms) when it is essentially disk-bound?

If I should discover a particular package that (heaven forfend) makes
this beast appear sluggish, why isn't it enough that I can apt-get
source it, adjust the build settings and dpkg-buildpackage a
K7-optimised version? I already do that for the kernel.
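
Concretely, that recipe is just the following, with "foo" standing in
for whichever package is actually sluggish:

apt-get source foo
cd foo-<version>
# tweak debian/rules or CFLAGS as desired, then rebuild unsigned:
dpkg-buildpackage -us -uc -b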

Frank
--
Home Page: <URL:http://thingy.apana.org.au/~fjc/>
Not the Scientology Home Page: <URL:http://xenu.apana.org.au/ntshp/>

Glenn McGrath

May 23, 2002, 8:50:09 AM

It would be very stupid to suggest it makes things slower.

We should consider it our obligation to let our users take full
advantage of their hardware. That doesn't mean we have to provide
pre-compiled binaries for every sub-arch, but we should allow users to
build their own optimised packages.

IMHO our packaging system is severely lacking in this regard; I think
dpkg-architecture needs to evolve, though it's a pretty hairy problem.


Glenn

Fabio Massimo Di Nitto

May 23, 2002, 9:00:15 AM
Glenn McGrath wrote:

> On Thu, 23 May 2002 07:31:05 -0400
> "Michael Stone" <mst...@debian.org> wrote:
>
> > On Thu, May 23, 2002 at 12:37:40PM +0200, Renaud Guerin wrote:
> >
> > > I've read the performance improvements in the case of Gentoo *really*
> > > make a difference.
> >
> > Do you have actual benchmarks rather than hearsay from some leet /.
> > kiddie? No one has ever come to debian with a convincing set of numbers
> > that indicate that optimizing intel builds buys any significant
> > advantage.
>
> It would be very stupid to suggest it makes it slower.
>
> We should consider it our obligation to allow our users to take full
> advantage of their hardware, that doesnt mean we have to provide
> pre-compiled binaries for every sub-arch, but we should allow users to
> build their own optimised packages.
>
> IMHO our packaging system is severly lacking in this regard, i think
> dpkg-architecutre needs to evolve, its a pretty hairy problem though.
>
> Glenn

As things stand, to build for i686 I had to make small changes to dpkg,
glibc and gcc. apt can be rebuilt cleanly, but you can't use it
efficiently until there is an official sub-arch mirror.

I will make my patches available later this evening, EU time. I can't
guarantee that they don't break something else, since I'm still toying
in a chroot environment, but at least it's a point from which people
can start.

Fabio

PS: --force-architecture will be your best friend :-)

Michael Stone

May 23, 2002, 9:10:09 AM
On Thu, May 23, 2002 at 10:42:02PM +1000, Glenn McGrath wrote:
> On Thu, 23 May 2002 07:31:05 -0400 "Michael Stone" <mst...@debian.org> wrote:
> > Do you have actual benchmarks rather than hearsay from some leet /.
> > kiddie? No one has ever come to debian with a convincing set of numbers
> > that indicate that optimizing intel builds buys any significant
> > advantage.
>
> It would be very stupid to suggest it makes it slower.

Would it? If you don't have numbers you have *nothing* to back that
assertion. We see hand-waving advocacy every couple of months about this
subject, but in the past several *years* no one has cared enough to do
some real research and demonstrate a benefit. "It Stands To Reason" is
frankly a ridiculous justification for change.

FYI, gcc has a long history of mediocre optimizations, and it
is well known that optimizations that make some processors faster can
slow down other (even newer) processors (that is, general optimizations,
not gcc-specific).

--
Mike Stone

Glenn McGrath

May 23, 2002, 9:40:08 AM
On Thu, 23 May 2002 08:56:57 -0400
"Michael Stone" <mst...@debian.org> wrote:

> On Thu, May 23, 2002 at 10:42:02PM +1000, Glenn McGrath wrote:
> > On Thu, 23 May 2002 07:31:05 -0400 "Michael Stone" <mst...@debian.org> wrote:
> > > Do you have actual benchmarks rather than hearsay from some leet /.
> > > kiddie? No one has ever come to debian with a convincing set of numbers
> > > that indicate that optimizing intel builds buys any significant
> > > advantage.
> >
> > It would be very stupid to suggest it makes it slower.
>
> Would it? If you don't have numbers you have *nothing* to back that
> assertion. We see hand waving advocacy every couple of months about this
> subject, but in the past several *years* no one has cared enough to do
> some real research and demonstrate a benefit. "It Stands To Reason" is
> frankly a ridiculous justification for change.
>
> FYI, gcc has a long history of mediocre optimizations, and it
> is well known that optimizations that make some processors faster can
> slow down other (even newer) processors (that is, general optimizations,
> not gcc-specific).
>

Sounds like you're trying to be a troll.

I wonder why CPU manufacturers added extensions to the i386 instruction
set?


Glenn

Michael Stone

May 23, 2002, 9:50:11 AM
On Thu, May 23, 2002 at 11:29:17PM +1000, Glenn McGrath wrote:
> Sounds like your trying to be a troll.

No. I'm asking that people have hard numbers to support their wild
claims before dragging this tired old subject out for another airing.
Is that really so much to ask? *If* you have promising data, *then*
people will be more receptive to this *old* thread. Until then you're
wasting time with groundless assertions.

> I wonder why CPU manufacturers added extensions to the i386 instruction
> set ?

The fact that instructions are added does not automatically mean that
compilers and applications are able to make good use of them. If it's as
easy as you think, run some verifiable tests, publish the results, and
come back with some real figures to discuss--I'd very much welcome such
data, if someone cared enough about this issue to do some work rather
than arguing about it.

--
Mike Stone

Will Newton

May 23, 2002, 10:00:13 AM
On Thursday 23 May 2002 2:29 pm, Glenn McGrath wrote:

> Sounds like your trying to be a troll.

Not necessarily. Most computers spend the majority of their time idle.
In this case, is it really worth making them spend a fraction more time
idle?

Performance hotspots - the kernel, Mesa, mplayer etc. - can be optimized
by hand, and CPU detection is really not too hard to do; this is the
Right Way(TM). Different arches for different Intel CPUs would be
insane; it would mean a vast increase in packages. A simple mechanism to
allow specific packages to be recompiled on a user's machine is a useful
idea, but I suspect it would only result in people realising that an
SSE2/3DNow version of vim is really not that spectacular.

Emmanuel le Chevoir

May 23, 2002, 10:10:08 AM
On Thu, 23 May 2002, Frank Copeland wrote:
> If I should discover a particular package that (heaven forfend) makes
> this beast appear sluggish, why isn't it enough that I can apt-get
> source it, adjust the build settings and dpkg-buildpackage a
> K7-optimised version? I already do that for the kernel.

Well, spend a day or two rebuilding all the KDE stuff; you should then
understand why not everybody is comfortable with this idea.

Emmanuel le Chevoir

George Danchev

May 23, 2002, 10:20:09 AM
On Thursday 23 May 2002 16:42, Michael Stone wrote:
> On Thu, May 23, 2002 at 11:29:17PM +1000, Glenn McGrath wrote:

Well, I think the emphasis here is that Debian *must* be successfully
rebuildable from source packages (even when tuning the -O level and
-march) within a distribution, so as to be ready "when the compilers and
applications are able to make good use of added instructions". The whole
story is about those Debian features, not about whether they make any
sense, because they certainly do, according to Debian users.

--
Greets,
fr33zb1

Roger Leigh

May 23, 2002, 10:40:09 AM
On Thu, May 23, 2002 at 09:34:08PM +0900, Junichi Uekawa wrote:
> On Thu, 23 May 2002 13:21:34 +0100
> Roger Leigh <rl...@york.ac.uk> wrote:
>
> > It would be nice one day if you /could/ rebuild the entirety of Debian.
>
> I am rebuilding it.
>
> > I
> > think one way of getting this to work is rebuilding old packages during
> > autobuilder idle time. If I understand correctly, for each uploaded
> > package, it will only be built once for each arch. If the packages are
> > slowly rebuilt automatically, it will catch non-buildable source and
> > policy violations if it's old and not updated for years.
>
> I am already doing it.

Great! During my time trying, I certainly found and reported quite a lot
of bugs just by hand-building the packages. This even included fairly
common shared libraries determining their shlibs files from the
_installed_ version of the library, so you had to build, install and
rebuild to get a valid package! I'm sure that your efforts doing this
will improve Debian greatly.

Glenn McGrath

May 23, 2002, 10:50:10 AM
On Thu, 23 May 2002 14:48:28 +0100
"Will Newton" <wi...@misconception.org.uk> wrote:

> On Thursday 23 May 2002 2:29 pm, Glenn McGrath wrote:
>
> > Sounds like your trying to be a troll.
>
> Not necessarily. Most computers spend the majority of their time idle.
> In this case is it really worth making it spend a fraction more time
> idle?
>
> Performance hotspots - the kernel, Mesa, mplayer etc. - can be optimized
> by hand and CPU detection is really not too hard to do, and this is the
> Right Way(TM). Different arches for different Intel CPUs is insane, it
> would mean a vast increase in packages. A simple mechanism to allow
> specific packages to be recompiled on a users machine is a useful idea,
> but I suspect it would only result in people realising that an
> SSE2/3DNow version of vim is really not that spectacular.
>

PCs are designed so that the CPU is idle most of the time, irrespective
of the power of the CPU; that has nothing to do with whether or not
binaries should be as efficient as possible.

If we do provide the ability for users to compile their own CPU-optimised
binaries, then how can that be a bad thing?

Would it be bad if their binaries were only marginally more efficient?

The only "cost" involved is that it would require us to be more organised.

Glenn

Jeff Licquia

May 23, 2002, 11:50:05 AM
On Thu, 2002-05-23 at 09:38, Glenn McGrath wrote:
> If we do provide the ability for users to compile their own CPU optimised
> binaries, then how can it be a bad thing ?

Providing our users with such an ability is, I think, useful. So, I
wholeheartedly support Junichi's project (even if it's only moral
support :-).

It would seem that we're confusing granting an ability to our users
with the (oft-repeated) call for optimizing the official Debian
packages. The latter proposal is completely inappropriate outside of the
narrow range of options we currently provide (multiple kernel packages,
an optional optimized libc, some packages with built-in CPU detection).

Andrew Suffield

May 23, 2002, 11:50:11 AM
On Thu, May 23, 2002 at 11:29:17PM +1000, Glenn McGrath wrote:
> > > It would be very stupid to suggest it makes it slower.
> >
> > Would it? If you don't have numbers you have *nothing* to back that
> > assertion. We see hand waving advocacy every couple of months about this
> > subject, but in the past several *years* no one has cared enough to do
> > some real research and demonstrate a benefit. "It Stands To Reason" is
> > frankly a ridiculous justification for change.
> >
> > FYI, gcc has a long history of mediocre optimizations, and it
> > is well known that optimizations that make some processors faster can
> > slow down other (even newer) processors (that is, general optimizations,
> > not gcc-specific).
> >
>
> Sounds like your trying to be a troll.
>
> I wonder why CPU manufacturers added extensions to the i386 instruction
> set ?

I present for your general contempt the i486(?) ENTER and LEAVE
instructions, which, at the time of their first appearance, were
slower to execute than the two or three instructions which they were
documented as being exactly equivalent to (and it's far from the only
case of such idiocy from Intel).

Not to mention the nefarious pipeline stall and scheduling issues that
crippled the i586.

--
.''`. ** Debian GNU/Linux ** | Andrew Suffield
: :' : http://www.debian.org/ | Dept. of Computing,
`. `' | Imperial College,
`- -><- | London, UK

Steve Langasek

May 23, 2002, 12:30:15 PM
On Thu, May 23, 2002 at 01:21:34PM +0100, Roger Leigh wrote:
> When I get 160 GiB of new HDD space (within a few days), I'm going to try
> bootstrapping an i686 arch and set up an autobuilder to build an i686
> woody once woody is released. This may prove beyond my capabilities, but
> should be interesting. Possibly the autobuilder will be able to build
> what I could not if it can calculate all the Build-Deps.

<lart>
That's spelled "GB".
</lart>

Steve Langasek
postmodern programmer

Joey Hess

May 23, 2002, 1:00:17 PM
Junichi Uekawa wrote:
> You should be able to use pentium-builder package and
> pbuilder together to build athlon-optimized Debian,
> for example. But I have not gotten that far yet.
> I'm making a very slow progress...

Does anyone know if gcc 3.0 has some sort of config file in /etc that
can be used to force compile options like processor and optimizations?
IIRC, some other distribution had such a thing, and it's really better
than the hackish pentium-builder to do it that way.

--
see shy jo

Daniel Burrows

May 23, 2002, 1:10:07 PM
On Thu, May 23, 2002 at 10:12:15AM -0500, Jeff Licquia <lic...@debian.org> was heard to say:

> It would seem that we're confusing granting an ability to our users for
> the (oft-repeated) call for optimizing the official Debian packages.
> The latter proposal is completely inappropriate outside of the narrow
> range of options we currently provide (multiple kernel packages,
> optional optimized libc, some packages with built-in CPU detection).

Just to nitpick, we don't provide an optimized libc:

glibc (2.2.2-3) unstable; urgency=low

  * Disable building of optimized libs for now. I did not forsee the
    problems involved with symbol skew between ld-linux.so.2 and the
    optmized libc.so.6. As of now, I can see no way around this.
  * Make libc6 conflict with the optimized libs for now, so we can get rid
    of them, closes: #90753, #90758, #90763, #90770, #90778, #90779

Daniel

--
/-------------------- Daniel Burrows <dbur...@debian.org> -------------------\
| "We've got nothing to fear but the stuff that we're |
| afraid of!" -- Fluble |
\---- Be like the kid in the movie! Play chess! -- http://www.uschess.org ---/

Junichi Uekawa

May 23, 2002, 1:10:14 PM
Roger Leigh <rl...@york.ac.uk> immo vero scripsit:

> > I am already doing it.
>
> Great! During my time trying, I certainly found and reported quite a lot
> of bugs just by hand-building the packages. This even included fairly
> common shared libraries determining their shlibs files from the
> _installed_ version of the library, so you had to build, install and
> rebuild to get a valid package! I'm sure that your efforts doing this
> will improve Debian greatly.

That is pretty interesting to know.
I believe that's caused by calling dpkg-shlibdeps without
setting the appropriate LD_LIBRARY_PATH?

Anyway, you can (sort of) see my progress at
bugs.debian.org/from:dan...@netfort.gr.jp

All my bugs about build problems start with "FTBFS:"
to make them stand out.
There are quite a lot of them, about 200 open right now,
and quite a few of them have patches.
But I haven't gotten around to sending patches for all of them yet.

There are many unresponsive maintainers; for some of them I took the
liberty of doing an NMU. But I can't just fix everything by doing an
NMU on every package...


regards,
junichi


--
dan...@debian.org : Junichi Uekawa http://www.netfort.gr.jp/~dancer
GPG Fingerprint : 17D6 120E 4455 1832 9423 7447 3059 BF92 CD37 56F4
Libpkg-guide: http://www.netfort.gr.jp/~dancer/column/libpkg-guide/

Thomas Bushnell, BSG

May 23, 2002, 1:30:12 PM
Michael Stone <mst...@debian.org> writes:

> FYI, gcc has a long history of mediocre optimizations, and it
> is well known that optimizations that make some processors faster can
> slow down other (even newer) processors (that is, general optimizations,
> not gcc-specific).

Do you have specific bug reports to file? GCC makes it very easy to
turn on or off optimizations in the default set for various
processors. So if some new 386-inspired optimization slows down the
magic-chip port, it's easy for the magic-chip port to just not use
that optimization.

So I wonder if you have submitted the relevant bug reports, or you can
give anything more specific than the vague statement above.

Thomas Bushnell, BSG

May 23, 2002, 1:30:14 PM
Michael Stone <mst...@debian.org> writes:

> No. I'm asking that people have hard numbers to support their wild
> claims before dragging this tired old subject out for another airing.
> Is that really so much to ask? *If* if you have promising data, *then*
> people will be more receptive to this *old* thread. Until then you're
> wasting time with groundless assertions.

Quite right. So can you give us some hard numbers about specific
optimizations which were added, and slowed down "new fancy chips"?
And also, while you're at it, explain why you didn't file the relevant
bug reports too.

Thomas

Thomas Bushnell, BSG

May 23, 2002, 1:30:15 PM
Will Newton <wi...@misconception.org.uk> writes:

> Not necessarily. Most computers spend the majority of their time idle. In
> this case is it really worth making it spend a fraction more time idle?

Yes. Have you ever used a laptop?

John H. Robinson, IV

May 23, 2002, 1:30:22 PM
On Thu, May 23, 2002 at 10:47:55AM -0500, Steve Langasek wrote:
> On Thu, May 23, 2002 at 01:21:34PM +0100, Roger Leigh wrote:
> > When I get 160 GiB of new HDD space (within a few days),
>
> <lart>
> That's spelled "GB".
> </lart>

maybe.

172 GB = 160 GiB

http://physics.nist.gov/cuu/Units/binary.html
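
(Indeed: 160 * 2^30 bytes = 171,798,691,840 bytes, which is roughly
172 * 10^9 bytes, hence 172 GB.)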

-john

Simon Law

May 23, 2002, 1:50:09 PM

\sart{I believe that Roger is referring to a gibibyte.}

Simon

amand Tihon

May 23, 2002, 2:00:06 PM
On Thu, 23 May 2002 10:47:55 -0500
Steve Langasek <vor...@netexpress.net> wrote:

> On Thu, May 23, 2002 at 01:21:34PM +0100, Roger Leigh wrote:
> > When I get 160 GiB of new HDD space (within a few days), I'm going

> <lart>


> That's spelled "GB".
> </lart>

I recognise it's not well known, but he was right. 1 GiB is 1 gibibyte,
i.e. 2^30 bytes.
See http://physics.nist.gov/cuu/Units/binary.html for more "binary
multiples".

--
Amand Tihon

Michael Stone

May 23, 2002, 2:00:13 PM
On Thu, May 23, 2002 at 09:33:50AM -0700, Thomas Bushnell, BSG wrote:
> Michael Stone <mst...@debian.org> writes:
> > FYI, gcc has a long history of mediocre optimizations, and it
[snip]

>
> Do you have specific bug reports to file? GCC makes it very easy to
> turn on or off optimizations in the default set for various
> processors. So if some new 386-inspired optimization slows down the
> magic-chip port, it's easy for the magic-chip port to just not use
> that optimization.
>
> So I wonder if you have submitted the relevant bug reports, or you can
> give anything more specific than the vague statement above.

What bug reports? If you compare gcs to other compilers on the same
platform, gcc typically performs worse. gcs also has a goal of
portability that other compilers do not share, so the difference in
optimization isn't a bug so much as a difference in focus. This has
nothing to do with tuning available optimizations to perform better on a
specific chip. This does, however, mean that the fact that a particular
processor introduced a particular improvement should not automatically
lead to the conclusion that gcs can take full advantage of that
improvement. The only way to know would be to benchmark the program with
different compiler options--which is exactly what I asked for.

Off the top of my head, one reference is
http://www.nersc.gov/research/FTG/pcp/performance.html
Take a look at relative performance between e.g., the compaq compiler on
alpha and the gnu compiler on alpha, or the portland compiler vs. the
gnu compiler on intel. gcs has historically been a solid, portable
compiler, but not a speed king. AFAIK, the gnu compiler people are aware
of benchmark comparisons between gnu and other compilers, so this isn't
a revelation. I'm sure you can find papers with some clever use of
google, or by checking some journals. Note that gcc 3 has performance
increases over previous versions, and the picture may change. I have not
seen comparative assessments between gcs 3 and other compilers, and would
be interested in the new numbers. Note that I am not *against* changing
debian's compilation standards, but I do think that it's premature to
suggest that before someone quantifies the benefits.

On Thu, May 23, 2002 at 09:34:44AM -0700, Thomas Bushnell, BSG wrote:
> Quite right. So can you give us some hard numbers about specific
> optimizations which were added, and slowed down "new fancy chips"?
> And also, while you're at it, explain why you didn't file the relevant
> bug reports too.

You're all about bug reports. These aren't bugs. A bug would be "2+2=8
on gnu compiler". "gnu compiler slower than compaq compiler on compaq
system" isn't a bug. But at any rate I think you're wilfully ignoring
what I said and the context in which I said it. I was not making a claim
about whether distributing a debian release with a specific set of
optimizations would affect performance. Specifically, I said that the
topic had come up before and that no one demonstrated significant gains.
I then said that no one had presented any new numbers since that time.
There have been people who *claimed* that their recompiling made a
difference, but there have been no formal, reproducible benchmarks
wherein people compared a basic debian distribution against one with
some or all parts compiled with a set of optimizations. I don't think
it's unreasonable to request such data when someone suggests that debian
should change its distribution policies. I also don't see why that
suggestion would so infuriate you.

--
Mike Stone

Will Newton

May 23, 2002, 2:10:14 PM
On Thursday 23 May 2002 5:35 pm, Thomas Bushnell, BSG wrote:

> > Not necessarily. Most computers spend the majority of their time idle. In
> > this case is it really worth making it spend a fraction more time idle?
>
> Yes. Have you ever used a laptop?

So you are suggesting that compiler optimizations have a significant effect
on battery life of laptops?

Have you ever used a laptop?

(Hint: the CPU is the least of your worries, battery-consumption-wise)

Steve Langasek

May 23, 2002, 2:20:10 PM
On Thu, May 23, 2002 at 07:27:20PM +0200, amand Tihon wrote:
> On Thu, 23 May 2002 10:47:55 -0500
> Steve Langasek <vor...@netexpress.net> wrote:

> > On Thu, May 23, 2002 at 01:21:34PM +0100, Roger Leigh wrote:
> > > When I get 160 GiB of new HDD space (within a few days), I'm going

> > <lart>
> > That's spelled "GB".
> > </lart>

> I recognise it's not well known, but he was right. 1 GiB is 1 gibibyte,
> ie 2^30 bytes.
> See http://physics.nist.gov/cuu/Units/binary.html for more "binary
> multiples".

That the people who put labels on hard drives are incapable of binary
math, and that the IEC feels they have any business meddling with a
system that worked fine for years before they decided to get involved,
does not make 'GiB' correct, or even useful. That anyone believes this
new set of prefixes will /reduce/ confusion when RAM, file sizes,
transfer speeds, and bandwidth rates (all of which have a greater direct
impact on the average computer user than the total number of bytes
available for use on a 160GB hard drive) is positively laughable.

Steve Langasek
postmodern programmer

Roger Leigh

May 23, 2002, 2:20:15 PM
"John H. Robinson, IV" <jh...@ucsd.edu> writes:

> On Thu, May 23, 2002 at 10:47:55AM -0500, Steve Langasek wrote:
> > On Thu, May 23, 2002 at 01:21:34PM +0100, Roger Leigh wrote:
> > > When I get 160 GiB of new HDD space (within a few days),
> >
> > <lart>
> > That's spelled "GB".
> > </lart>
>
> maybe.
>
> 172 GB = 160 GiB

It depends on what the HDD manufacturer means by 160. It may turn out
to only be ~149 GiB if it is really 160 GB.

--
Roger Leigh
** Registration Number: 151826, http://counter.li.org **
Need Epson Stylus Utilities? http://gimp-print.sourceforge.net/
GPG Public Key: 0x25BFB848 available on public keyservers

Michael Poole

May 23, 2002, 2:30:11 PM
amand Tihon <am...@alrj.org> writes:

> On Thu, 23 May 2002 10:47:55 -0500
> Steve Langasek <vor...@netexpress.net> wrote:
>
> > On Thu, May 23, 2002 at 01:21:34PM +0100, Roger Leigh wrote:
> > > When I get 160 GiB of new HDD space (within a few days), I'm going
>
> > <lart>
> > That's spelled "GB".
> > </lart>
>
> I recognise it's not well known, but he was right. 1 GiB is 1 gibibyte,
> ie 2^30 bytes.
> See http://physics.nist.gov/cuu/Units/binary.html for more "binary
> multiples".

Roger may be using a valid unit, but he's not using the RIGHT unit,
and Steve is. Hard drive vendors have long used strange units
(compared to the rest of the computing industry) to pad their numbers:
sometimes "1 MB" meant "1000 KB of 1024 bytes each" (1000 KiB), and "1
GB" meant 1000 of those MB. But many now settle on using base-10
powers to describe their drives, since that gives the biggest
significand (the "160" part of "160 GB").

The only hard drive vendor I know of selling "160 GB" drives is
Maxtor; their product manuals say: "Maxtor defines 1 Gigabyte (GB) as
10^9 or 1,000,000,000 bytes of data." It goes on to say that there
are a maximum of 320,173,056 addressable sectors (each 512 bytes) in
LBA mode, for a total of 163.9 GB of space. This makes 152.6 GiB.
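
(The arithmetic, for anyone checking: 320,173,056 sectors * 512 bytes =
163,928,604,672 bytes; divided by 10^9 that is ~163.9 GB, and divided by
2^30 it is ~152.66 GiB.)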

Can we be done with this rather silly thread now?

-- Michael Poole

Roger Leigh

May 23, 2002, 2:30:10 PM
Junichi Uekawa <dan...@netfort.gr.jp> writes:

> Roger Leigh <rl...@york.ac.uk> immo vero scripsit:
>
> > > I am already doing it.
> >
> > Great! During my time trying, I certainly found and reported quite a lot
> > of bugs just by hand-building the packages. This even included fairly
> > common shared libraries determining their shlibs files from the
> > _installed_ version of the library, so you had to build, install and
> > rebuild to get a valid package! I'm sure that your efforts doing this
> > will improve Debian greatly.
>
> That is pretty interesting to know.
> I believe that's caused by calling dpkg-shlibdeps without
> setting the appropriate LD_LIBRARY_PATH?

Yes, or dh_shlibdeps without `-l' (though LD_LIBRARY_PATH also works).
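
That is, something along these lines in debian/rules (the paths here are
only illustrative):

  dh_shlibdeps -l debian/tmp/usr/lib
  # or equivalently:
  LD_LIBRARY_PATH=debian/tmp/usr/lib dpkg-shlibdeps debian/tmp/usr/bin/*

so that the freshly built library, rather than the installed one, is
found.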

> Anyway, you can (sort of) see my progress at
> bugs.debian.org/from:dan...@netfort.gr.jp
>
> All my bugs with build problems start with "FTBFS:"
> to make them stand out.
> There is quite a lot of them, about 200 open right now,
> and quite many of them have patches.
> But I haven't gotten around to sending patches to all yet.

I'll make a note of this, so I won't start filing duplicates.

> There are many unresponsive maintainers, which some of
> them I took the liberty of doing an NMU. But I can't just
> fix everything by doing an NMU on every package...

No, that would be several full-time jobs!

--
Roger Leigh
** Registration Number: 151826, http://counter.li.org **
Need Epson Stylus Utilities? http://gimp-print.sourceforge.net/
GPG Public Key: 0x25BFB848 available on public keyservers

Glenn Maynard

May 23, 2002, 2:30:12 PM
On Thu, May 23, 2002 at 01:04:02PM -0500, Steve Langasek wrote:
> That the people who put labels on hard drives are incapable of binary
> math, and that the IEC feels they have any business meddling with a
> system that worked fine for years before they decided to get involved,
> does not make 'GiB' correct, or even useful. That anyone believes this
> new set of prefixes will /reduce/ confusion when RAM, file sizes,
> transfer speeds, and bandwidth rates (all of which have a greater direct
> impact on the average computer user than the total number of bytes
> available for use on a 160GB hard drive) is positively laughable.

... and I spat in disgust to see my copy of ifconfig start spewing "GiB"
at me. Ugh.

I'm half waiting to hear someone say "100 gibs" aloud in a store, and
see them pointed to the gaming section.

--
Glenn Maynard

John H. Robinson, IV

May 23, 2002, 2:30:15 PM
here we go, wandering off topic again . . .

On Thu, May 23, 2002 at 01:04:02PM -0500, Steve Langasek wrote:
>

> That the people who put labels on hard drives are incapable of binary
> math,

i believe this is false (i have no proof though, and i suspect neither
do you). however, 172GB sounds bigger than 160GiB. this is why such a
drive would cost, say, $399 instead of $400.

160GiB vs 172GB is a marketing thing, not a capacity thing.

> That anyone believes this new set of prefixes will /reduce/ confusion
> when RAM, file sizes, transfer speeds, and bandwidth rates (all of
> which have a greater direct impact on the average computer user than
> the total number of bytes available for use on a 160GB hard drive) is
> positively laughable.

of course, RAM, file sizes, transfer speeds, and bandwidth rates all
have exactly _nothing_ to do with proper unit prefixes.

-john

John H. Robinson, IV

May 23, 2002, 2:40:10 PM
On Thu, May 23, 2002 at 07:14:24PM +0100, Roger Leigh wrote:
> "John H. Robinson, IV" <jh...@ucsd.edu> writes:
>
> > 172 GB = 160 GiB
>
> It depends on what the HDD manufacturer means by 160. It may turn out
> to only be ~149 GiB if it is really 160 GB.

this is why i said ``maybe.'' someone who uses a Gi prefix, i have to
assume, knows what it means and is able to convert marketing-droid
172GB to a more useful 160GiB.

okay, enough of this ;)

-john

Thimo Neubauer

May 23, 2002, 3:00:17 PM
On Thu, May 23, 2002 at 05:05:28PM +0300, George Danchev wrote:
> On Thursday 23 May 2002 16:42, Michael Stone wrote:
> > On Thu, May 23, 2002 at 11:29:17PM +1000, Glenn McGrath wrote:
>
> Well, I think that the emphasis here is that Debian *must* be successfully
> rebuildable from source packages (even tunnig -O level and -march) within a
> distribution, so to be ready "when the compilers and applications are able to
> make good use of added instructions". The whole story is about those debian
> features, not about if they make any sense, because they certainly do,
> acording to debian users.

First of all: why must we? But if you want Debian to reach this goal,
grab a couple of packages, build them with your favourite optimizations,
find out whether a failing build is the package's or the compiler's
fault, and file bugs. Repeat as necessary.

Ah, and run your optimized packages and compare the runtime with the
non-optimized version. Post the results. Thanks.
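
Even something as crude as this would be a data point (the paths are
hypothetical; pick a CPU-bound program):

  time /usr/bin/oggenc test.wav -o /dev/null        # stock build
  time /usr/local/bin/oggenc test.wav -o /dev/null  # your optimized build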

CU
Thimo

--
Thimo Neubauer <th...@debian.org>
Debian GNU/Linux 3.0 frozen! See http://www.debian.org/ for details

Thimo Neubauer

May 23, 2002, 3:20:09 PM
On Fri, May 24, 2002 at 12:38:27AM +1000, Glenn McGrath wrote:
> If we do provide the ability for users to compile their own CPU optimised
> binaries, then how can it be a bad thing ?

Well, every user may apt-get source his favourite package and modify
debian/rules. This isn't optimal, but global CFLAGS for _all_ packages
are impossible even on a single architecture like i386. Some programs
simply won't work when optimized, due to compiler errors.



> Would it be bad if their binaries are only marginally more efficient ?

You can waste as much CPU time as you want on your own machine, but
just don't use a .d.o machine to build optimized versions. And please
imagine the waste of disk space if every package were uploaded in
(oops, I've lost track of the newest instruction sets) at least 4
different versions, as the kernel images already are (which makes
sense for that special package).



> The only "cost" involved is that it would require us to be more organised.

Please stop saying "we have to do something"; do something yourself.
Please think up a plan for how to accomplish an "Optimized Debian"
distribution. Talk to the ftp-masters and admins about your plan.
Discuss it with us. Then we'll maybe find your idea attractive.
Otherwise, please stop the thread.

George Danchev

May 23, 2002, 3:50:10 PM
On Thursday 23 May 2002 19:10, Joey Hess wrote:
> Junichi Uekawa wrote:
> > You should be able to use pentium-builder package and
> > pbuilder together to build athlon-optimized Debian,
> > for example. But I have not gotten that far yet.
> > I'm making a very slow progress...
>
> Does anyone know if gcc 3.0 has some sort of config file in /etc that
> can be used to force compile options like processor and optimizations?
> IIRC, some other distribution had such a thing, and it's really better
> than the hackish pentium-builder to do it that way.

What if we had $CFLAGS within the "build" target in the debian/rules
file? Then export CFLAGS="-O3 -mcpu=i686", and start some procedure to
rebuild all installed binary packages from their source packages?

Or change some defines in
/usr/lib/gcc-lib/i386-linux/<gcc-version>/specs ?
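
I.e. something like this, on the (big) assumption that each package's
debian/rules passes $CFLAGS through to the compiler:

export CFLAGS="-O3 -mcpu=i686"
for p in $(dpkg --get-selections | awk '$2 == "install" {print $1}'); do
    apt-get -b source "$p"
done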
--
Greets,
fr33zb1

Steve Langasek

May 23, 2002, 4:00:13 PM
On Thu, May 23, 2002 at 11:26:56AM -0700, John H. Robinson, IV wrote:
> here we go, wandering off topic again . . .

It's at least as on-topic as any of the flamewars going on, since these
idiotic prefixes are now having a discernible impact on Debian which
affects my perception of the overall quality of the distribution (cf.
comments regarding ifconfig output).

> On Thu, May 23, 2002 at 01:04:02PM -0500, Steve Langasek wrote:

> > That the people who put labels on hard drives are incapable of binary
> > math,

> i believe this is false (i have no proof though, and i suspect neither
> do you). however, 172GB sounds bigger than 160GiB. this is why such a
> drive would cost, say, 399$ instead of 400$.

> 160GiB vs 172GB is a marketting thing, not a capacity thing.

Well, I believe it's true that the people putting the labels on the hard
drives are incapable of doing binary math, by virtue of the fact that
the labels are the domain of marketeers rather than engineers, and I
won't hold my breath looking for a marketroid that can do binary math
without the benefit of numerous bifurcated appendages. But perhaps more
to the point, I would say that the people who put labels on hard drives
are incapable of doing the right thing wrt advertising their products.

> > That anyone believes this new set of prefixes will /reduce/ confusion
> > when RAM, file sizes, transfer speeds, and bandwidth rates (all of
> > which have a greater direct impact on the average computer user than
> > the total number of bytes available for use on a 160GB hard drive) is
> > positively laughable.

> of course, RAM, file sizes, transfer speeds, and bandwidth rates all
> have exactly _nothing_ to do with proper unit prefixes.

s/\) is/\) continue to use the traditional definitions is/

If that makes it any clearer. It supports the principle that within the
domain of computing, 2^10 *is* the proper unit prefix.

Steve Langasek
postmodern programmer

Glenn McGrath

May 23, 2002, 7:40:06 PM
On Thu, 23 May 2002 21:12:34 +0200
"Thimo Neubauer" <th...@debian.org> wrote:

> On Fri, May 24, 2002 at 12:38:27AM +1000, Glenn McGrath wrote:
> > If we do provide the ability for users to compile their own CPU
> > optimised binaries, then how can it be a bad thing ?
>
> Well, every user may apt-get source his favourite package and modify
> debian/rules. This isn't optimal but global CFLAGS for _all_ packages
> are impossible even on one architecture like i386. Some programs
> simply won't work when optimized due to compiler errors.
>

I don't think -march=<cpu> should ever produce a bad binary; -O<n>
optimizations are a different story, though. To handle those nicely we
would really need something in the build system that gives a range of
values known to work for a particular package.
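
To spell out the flag semantics as the gcc 3.x documentation has them:
-mcpu only changes instruction scheduling and keeps the binary runnable
on older CPUs, while -march may emit newer instructions, so the result
runs fine on the chosen CPU but not necessarily on anything older:

gcc -O2 -mcpu=i686 -c foo.c    # scheduled for i686, still runs on an i386
gcc -O2 -march=i686 -c foo.c   # may require an i686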

> > Would it be bad if their binaries are only marginally more efficient ?
>
> You can waste as much CPU time as you want on you machine, but just
> don't use a .d.o machine to build optimized versions. And please
> imagine the waste of diskspace if every package was uploaded with ---
> oops, lost the track of newest instuctions sets -- at least 4
> different versions as the kernel images already are (which makes sense
> for this special package)?
>

Did I even suggest this? I specifically said in other posts that we
should do it by allowing our users to compile their own optimized
packages.

> > The only "cost" involved is that it would require us to be more
> > organised.
>
> Please stop saying "we have to do something", do something
> yourself. Please think up a plan on how to accomplish an "Optimized
> Debian" distribution. Talk to the ftp-masters and admins about your
> plan. Discuss it with us. Then we'll maybefind your idea
> attractive. Otherwise, please stop the thread.
>

"Discuss it with us", and "stop saying we have to do something" in the
same context, good one.

To all the people who jump out and defend the status quo without reason,
please get out of the way.

If you didnt understand what i said in my previous posts, i would be glad
to explain using smaller words in private.

Glenn

Glenn McGrath

May 23, 2002, 7:50:07 PM
On Fri, 24 May 2002 00:05:10 +0900
"Junichi Uekawa" <dan...@netfort.gr.jp> wrote:

> All my bugs with build problems start with "FTBFS:"
> to make them stand out.
> There is quite a lot of them, about 200 open right now,
> and quite many of them have patches.
> But I haven't gotten around to sending patches to all yet.
>

It picked up an FTBFS in my package libtar; turns out that the autoconf
wrapper is broken. A typo: it accepts the argument --inclue but not
--include.

This bug has the potential to cause lots of other packages to FTBFS,
maybe some of the 200?

A bug was going to be filed after discussion on IRC, but I can't see it
in the BTS; I'll file one later if I don't find it.


Glenn

Colin Watson

May 23, 2002, 9:20:05 PM
On Fri, May 24, 2002 at 09:14:57AM +1000, Glenn McGrath wrote:
> It picked up a FTBFS in my package libtar, tuns out that the autoconf
> wrapper is broken, a typo, it accepts the argument --inclue but not
> --include.
>
> This bug has the potential to cause lots of other packages to FTBFS, maybe
> some of the 200 ?
>
> A bug was going to be filed after discussion on irc, but i cant see it in
> the bts, ill file one later if i dont find it.

That's #147786, fixed in autoconf2.13 2.13-44.

--
Colin Watson [cjwa...@flatline.org.uk]

Thomas Bushnell, BSG

May 23, 2002, 9:30:08 PM
Michael Stone <mst...@debian.org> writes:

> What bug reports? If you compare gcs to other compilers on the same
> platform, gcc typically performs worse.

I'm a little unsure what you mean by "gcs". Perhaps if you'd explain
that acronym.

GCC is not the best compiler in the world, but that's not what you
claimed in the original message. You said that GCC adds optimizations
which speed up code on some chips, at the expense of making it slower
on other chips.

Since GCC optimizations are (by default) carefully tuned, such that
only those which help a given chip are enabled for that chip, I'm
wondering if you can be more precise.

Thomas Bushnell, BSG

May 23, 2002, 9:30:08 PM
Will Newton <wi...@misconception.org.uk> writes:

> (HInt: The CPU is the least of your worries battery consumption wise)

CPU idling is actually an important thing with laptops, and the CPU is
a significant power consumer these days.

Regardless, many tasks on my systems are CPU-bound.

Junichi Uekawa

May 23, 2002, 11:00:07 PM
Joey Hess <jo...@debian.org> immo vero scripsit:

> Does anyone know if gcc 3.0 has some sort of config file in /etc that
> can be used to force compile options like processor and optimizations?
> IIRC, some other distribution had such a thing, and it's really better
> than the hackish pentium-builder to do it that way.

pentium-builder is really hackish, and breaks the build of gcc itself,
but it kinda works if gcc is avoided.
And it is quite easy to set up, after all.

regards,
junichi
--
dan...@debian.org : Junichi Uekawa http://www.netfort.gr.jp/~dancer
GPG Fingerprint : 17D6 120E 4455 1832 9423 7447 3059 BF92 CD37 56F4
Libpkg-guide: http://www.netfort.gr.jp/~dancer/column/libpkg-guide/

Joe Wreschnig

May 23, 2002, 11:10:04 PM
On Thu, 2002-05-23 at 19:10, Thomas Bushnell, BSG wrote:
> GCC is not the best compiler in the world, but that's not what you
> claimed in the original message. You said that GCC adds optimizations
> which speed up code on some chips, at the expense of making it slower
> on other chips.
>
> Since GCC optimizations are (by default) carefully tuned, such that
> only those which help a given chip are enabled for that chip, I'm
> wondering if you can be more precise.

I've heard this too (but I've seen about as much evidence for it as I
have for the claimed 30% speedups from optimizations, so I don't claim
it's right). Apparently it has something to do with e.g. i586
optimizations being slower than i386 ones (no optimizations except
platform-independent ones) on i686 chips, or vice versa, or Intel
optimizations slowing down AMD and vice versa.

I've also heard that turning on i486 optimizations can apparently cause
586/686 processors no end of invalid-instruction problems.

But like I said, I treat pretty much every claim in this thread as
fairly dubious.
--
- Joe Wreschnig <pi...@sacredchao.net> - http://www.sacredchao.net
"What I did was justified because I had a policy of my own... It's
okay to be different, to not conform to society."
-- Chen Kenichi, Iron Chef Chinese


Michael Stone

May 23, 2002, 11:20:06 PM
On Thu, May 23, 2002 at 05:10:31PM -0700, Thomas Bushnell, BSG wrote:
> Michael Stone <mst...@debian.org> writes:
> > What bug reports? If you compare gcs to other compilers on the same
> > platform, gcc typically performs worse.
>
> I'm a little unsure what you mean by "gcs". Perhaps if you'd explain
> that acronym.

The GNU compiler system: including, e.g., the Fortran compiler and not
just the C compiler.

> GCC is not the best compiler in the world, but that's not what you
> claimed in the original message. You said that GCC adds optimizations
> which speed up code on some chips, at the expense of making it slower
> on other chips.
>
> Since GCC optimizations are (by default) carefully tuned, such that
> only those which help a given chip are enabled for that chip, I'm
> wondering if you can be more precise.

I made two claims, that gcc isn't a particularly strong optimizing
compiler, and that there are certain optimizations (instructions or
orderings) that can improve results on one processor and hurt on
another. The latter point was not limited to the gnu compiler, and is a
general problem in optimizing. (Which is of concern if debian is to add
additional optimizations to its default compiles, or to add additional
architecture-specific packages.) The general point is, again, that any
suggestion to add architecture-specific optimizations will need to be
backed up by a broad range of benchmarks.

--
Mike Stone

Chris Cheney

May 24, 2002, 12:30:06 AM
On Thu, May 23, 2002 at 09:34:44AM -0700, Thomas Bushnell, BSG wrote:
> Quite right. So can you give us some hard numbers about specific
> optimizations which were added, and slowed down "new fancy chips"?
> And also, while you're at it, explain why you didn't file the relevant
> bug reports too.

It is pretty well known that optimizing for the original Pentium causes
slowdowns on newer chips like the P2/P4/Athlon; the quirk is in the
design of the original Pentium itself, though, not in the software
optimization. Also, using -O3 over -O2 will usually cause
slower-running programs, due to code bloat relative to the CPU cache.

Chris

Junichi Uekawa

May 24, 2002, 12:50:04 AM
On Fri, 24 May 2002 09:14:57 +1000
Glenn McGrath <bu...@optushome.com.au> wrote:

> > All my bugs with build problems start with "FTBFS:"
> > to make them stand out.
> > There is quite a lot of them, about 200 open right now,
> > and quite many of them have patches.
> > But I haven't gotten around to sending patches to all yet.
> >
>
> It picked up a FTBFS in my package libtar, tuns out that the autoconf
> wrapper is broken, a typo, it accepts the argument --inclue but not
> --include.

Such bugs are embarrassing and should be fixed ASAP :)

> This bug has the potential to cause lots of other packages to FTBFS, maybe
> some of the 200 ?

Hm... not many packages build-depend on automake/autoconf since the
last breakages automake/autoconf caused.

People use "touch" magic, or other techniques, to avoid regenerating
the autoconf/automake output.


> A bug was going to be filed after discussion on irc, but i cant see it in
> the bts, ill file one later if i dont find it.

okay.

--
dan...@debian.org http://www.netfort.gr.jp/~dancer

Thomas Bushnell, BSG

May 24, 2002, 1:40:04 AM
Joe Wreschnig <pi...@sacredchao.net> writes:

> > Since GCC optimizations are (by default) carefully tuned, such that
> > only those which help a given chip are enabled for that chip, I'm
> > wondering if you can be more precise.

> I've heard this too (but I've seen about as much evidence of this as I
> have of the 30% claimed speedups from optimizations, so I don't claim
> it's right). Apparently it has something to do with e.g. i586
> optimizations being slower than i386 (no optimization except
> platform-independant ones) on i686 chips, or vice versa, or Intel
> optimizations slowing down AMD and vice versa.

Right. So you use the i586 optimizations on an i586, and the i386
optimizations on an i386. Why is this so confusing?

Thomas Bushnell, BSG

May 24, 2002, 1:40:05 AM
Michael Stone <mst...@debian.org> writes:

> I made two claims, that gcc isn't a particularly strong optimizing
> compiler, and that there are certain optimizations (instructions or
> orderings) that can improve results on one processor and hurt on
> another.

Sure, but GCC knows which ones are best for each processor, and by
default turns on only those, depending on which target you are
compiling for.

So what's the problem?

Matt Zimmerman

unread,
May 24, 2002, 2:00:06 AM5/24/02
to
On Thu, May 23, 2002 at 09:54:28PM -0400, Michael Stone wrote:

> On Thu, May 23, 2002 at 05:10:31PM -0700, Thomas Bushnell, BSG wrote:
> > Michael Stone <mst...@debian.org> writes:
> > > What bug reports? If you compare gcs to other compilers on the same
> > > platform, gcc typically performs worse.
> >
> > I'm a little unsure what you mean by "gcs". Perhaps if you'd explain
> > that acronym.
>
> gnu compiler system. including, e.g., the fortran compiler and not just
> the c compiler.

'GCC' already refers to all of those.

(gcc.gnu.org)

In April 1999, the egcs steering committee was appointed by the FSF as the
official GNU maintainer for GCC. At that time GCC was renamed from the "GNU
C Compiler" to the "GNU Compiler Collection" and received a new mission
statement.

--
- mdz

Thomas Bushnell, BSG

unread,
May 24, 2002, 2:40:06 AM5/24/02
to
Chris Cheney <cch...@cheney.cx> writes:

> It is pretty well known that optimizing for the original Pentium causes
> slowdowns on newer chips, like the P2/P4/Athlon; however, the bug is in the
> design of the original Pentium itself, not in anything in the software
> optimization. Also, using -O3 over -O2 will usually produce slower-running
> programs due to code bloat with respect to the CPU cache.

Sure, and in any case like this, a given option will be better for
some chips and worse for others. If you turn on the option, life is
better for chip A, and worse for chip B. If you don't turn on the
option, life is worse for chip A, and better for chip B.

None of that is some kind of bug or defect in GCC, but a fact that the
chips have different performance characteristics.

Joe Wreschnig

unread,
May 24, 2002, 3:20:04 AM5/24/02
to
On Thu, 2002-05-23 at 22:26, Thomas Bushnell, BSG wrote:
> Joe Wreschnig <pi...@sacredchao.net> writes:
>
> > > Since GCC optimizations are (by default) carefully tuned, such that
> > > only those which help a given chip are enabled for that chip, I'm
> > > wondering if you can be more precise.
>
> > I've heard this too (but I've seen about as much evidence of this as I
> > have of the 30% claimed speedups from optimizations, so I don't claim
> > it's right). Apparently it has something to do with e.g. i586
> > optimizations being slower than i386 (no optimization except
> > platform-independent ones) on i686 chips, or vice versa, or Intel
> > optimizations slowing down AMD and vice versa.
>
> Right. So you use the i586 optimizations on an i586, and the i386
> optimizations on an i386. Why is this so confusing?

*shrug* Beats me. It's a problem with distributing binaries definitely,
but since I gather the original poster just wanted to automate pbuilder
and so on, my guess is a lot of people didn't bother to actually read
what he wanted to do and just jumped on him assuming he wanted Debian to
provide binaries.

Alternately, I have a theory that the number of flames on d-d is
constant over any given 8 hour interval, and so with a huge thread
finally dying down, this one (and the WineX one) had to pick up. :P


Peter Makholm

unread,
May 24, 2002, 4:30:12 AM5/24/02
to
t...@becket.net (Thomas Bushnell, BSG) writes:

> Right. So you use the i586 optimizations on an i586, and the i386
> optimizations on an i386. Why is this so confusing?

I have neither an i386 nor an i586; which package should I use?

I've got an AMD-something, and what Joe says is that it might be better for
me to use the i386 version than the i586 version of a package. But as an
ordinary user (which I consider myself when it comes to
cpu-specific optimisations) I would intuitively think that I should
choose the newest subarch that is not newer than my computer.

--
Peter Makholm | I have something to say: It's better to burn in
pe...@makholm.net | hell, than to fade away!
http://hacking.dk | -- Kurgan

Steven Fuerst

unread,
May 24, 2002, 6:20:14 AM5/24/02
to
> Hi,
>
> I was having a look at Gentoo Linux the other night, and their principle of
> rebuilding your own packages with your chosen gcc -O & -march flags.
>
> I got wondering, how one should go about rebuilding from source a whole
> debian installation in a similar way ?
> Anybody did this before, or are there any significant hurdles that prevent
> automating this process ?
>

<snip>

I'm currently making/using a program called 'apt-src' which downloads
and compiles source packages instead of using binary packages like
apt-get. With ccache and pentium-builder installed it can be changed to
optimise for a given architecture. I've been using it for a few weeks
now - with no strange side-effects.

Here is a description of some of what it does:

'apt-src install packagename' will install the package from source,
recursively compiling and installing the build-depends as needed.

'apt-src upgrade' will try to upgrade all out-of-date packages from
source, taking into account build-dependencies. It does not add or
remove any packages. (Including build-depends - so some packages may be
unbuildable.)

'apt-src dist-upgrade' will add and remove packages as well as upgrading
everything. (This is somewhat buggy - I've just found it doesn't handle
build-conflicts very well.)

It currently is a huge hack - but it works most of the time now. (In
debugging it, I'm limited by the rate at which new packages are added
into the pool.)

It would be relatively trivial to add a 'apt-src build-world' target
which would rebuild everything... but I'm still trying to debug the
simpler stuff, so that will have to wait.

Steven
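
(A minimal sketch of that setup, assuming pentium-builder reads the
DEBIAN_BUILDARCH variable used elsewhere in this thread and that ccache's
compiler wrappers live in /usr/lib/ccache as on a stock install; the
package name is a placeholder:)

export DEBIAN_BUILDARCH=athlon     # pentium-builder turns this into -march
export PATH=/usr/lib/ccache:$PATH  # reuse object files across rebuilds
apt-src install packagename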

Thomas Bushnell, BSG

unread,
May 24, 2002, 1:30:10 PM5/24/02
to
Peter Makholm <pe...@makholm.net> writes:

> I've got an AMD-something, and what Joe says is that it might be better for
> me to use the i386 version than the i586 version of a package. But as an
> ordinary user (which I consider myself when it comes to
> cpu-specific optimisations) I would intuitively think that I should
> choose the newest subarch that is not newer than my computer.

Those intuitions might well be in error, of course... so it seems that
we should be as clear as possible about which recommended packages to
use for each Arch.

Thomas Bushnell, BSG

unread,
May 24, 2002, 1:30:12 PM5/24/02
to
Steven Fuerst <s...@mssl.ucl.ac.uk> writes:

> I'm currently making/using a program called 'apt-src' which downloads
> and compiles source packages instead of using binary packages like
> apt-get. With ccache and pentium-builder installed it can be changed to
> optimise for a given architecture. I've been using it for a few weeks
> now - with no strange side-effects.

You rock!

Stephen Zander

unread,
May 24, 2002, 2:20:07 PM5/24/02
to
>>>>> "Glenn" == Glenn McGrath <bu...@optushome.com.au> writes:
Glenn> I dont think -march=<cpu> should ever produce a bad binary,

On SPARC, anything built with -march=sparcv9 or -march=ultrasparc
(they are equivalent) will fail on non-UltraSPARC boxen, so it is
possible to produce bad binaries with -march. I expect the same is
true of any 32-bit/64-bit architecture. All the world is not x86.

--
Stephen

"A duck!"

Nick Phillips

unread,
May 24, 2002, 7:50:07 PM5/24/02
to
On Fri, May 24, 2002 at 11:00:28AM -0700, Stephen Zander wrote:
> >>>>> "Glenn" == Glenn McGrath <bu...@optushome.com.au> writes:
> Glenn> I dont think -march=<cpu> should ever produce a bad binary,
>
> On SPARC, anything build with -march=sparcv9 or -march=ultrasparc
> (they are equivalent) will fail on non-UltraSPARC boxen, so it is
> possible to produce bad binaries with -march. I expect the same is
> true of any 32bit/64bit architecture. All the world is not x86.

That doesn't make it a bad binary.

I don't know what planet you're on, but I wouldn't expect something I
explicitly built for a K7 to run on a PIII. There's nothing special
about x86 architecture that magically enables older processors to
understand newer instruction sets...

[puzzled]

--
Nick Phillips -- n...@lemon-computing.com
Today is the first day of the rest of the mess.

Wilmer van der Gaast

unread,
May 25, 2002, 10:40:07 AM5/25/02
to
Frank Copeland wrote on Thu, 23 May 2002 12:39:29 +0000 (UTC):
> My K7/1700+ (1.433 real GHz) spends 99% of its time idle.
>
Wrong...

model name : AMD Athlon(tm) XP 1700+
cpu MHz : 1466.460
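
(Both lines come straight from the kernel; to check on any Linux box:)

grep -E 'model name|cpu MHz' /proc/cpuinfo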

Nice machine though..

> Do I really care if ls runs 5 or 10% faster (in CPU terms) when it is
> essentially disk-bound?
>
Not about ls.. But a faster Mozilla/KDE sounds interesting, doesn't it?

--
+-------- .''`. - -- ---+ + - -- --- ---- ----- ------+
| lintux : :' : lintux.cx | | Today's random quote: |
| at `. `~' debian.org | | i a koptimis the it ing ally do i |
+--- -- - ` ---------------+ +------ ----- ---- --- -- - +

Junichi Uekawa

unread,
May 25, 2002, 12:40:10 PM5/25/02
to
Frank Copeland <f...@thingy.apana.org.au> immo vero scripsit:

> My K7/1700+ (1.433 real GHz) spends 99% of its time idle.
>

> I already use a K7-optimised kernel (building it is the only time I
> notice significant activity in the CPU monitor). I can see some logic
> in building a K7-optimised libc. But what do I gain from building the
> rest of my system with K7 optimisations? Do I really care if ls runs 5
> or 10% faster (in CPU terms) when it is essentially disk-bound?


Go and rebuild the Debian archive. It will keep your
CPU busy for at least a week.

Then, fix bugs you find in the build process.

It will keep you busy for at least a year :P

regards,
junichi


--
dan...@debian.org : Junichi Uekawa http://www.netfort.gr.jp/~dancer
GPG Fingerprint : 17D6 120E 4455 1832 9423 7447 3059 BF92 CD37 56F4
Libpkg-guide: http://www.netfort.gr.jp/~dancer/column/libpkg-guide/

Stephen Zander

unread,
May 25, 2002, 2:30:09 PM5/25/02
to
>>>>> "Nick" == Nick Phillips <n...@nz.lemon-computing.com> writes:
Nick> That doesn't make it a bad binary.

So a binary that throws illegal-instruction traps is still a good binary?

Nick> I don't know what planet you're on, but I wouldn't expect
Nick> something I explicitly built for a K7 to run on a
Nick> PIII. There's nothing special about x86 architecture that
Nick> magically enables older processors to understand newer
Nick> instruction sets...

The difference is that using -march within the x86 instruction superset
results more in instruction-sequence reordering than in unsupported
opcodes, mainly because the more obscure opcodes don't add much from a
performance perspective. Integer multiplication (one of the most obvious
differences between SPARC V7 & SPARC V8) is a little more common.

All that said, you're right: no architecture is forward compatible.

--
Stephen
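
(The same trap-on-older-hardware effect exists within x86 too, once -march
lets gcc emit a genuinely new opcode. A minimal sketch with placeholder
file names; whether cmov is actually emitted depends on the gcc version:)

cat > pick.c <<'EOF'
int pick(int c, int a, int b) { return c ? a : b; }
int main(void) { return pick(1, 0, 1); }
EOF
gcc -O2 -march=i686 -o pick pick.c   # may compile pick() to a cmov
./pick   # cmov is i686-only: on a plain i586 this can die with SIGILL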

Nick Phillips

unread,
May 26, 2002, 2:50:06 AM5/26/02
to
On Sat, May 25, 2002 at 11:26:29AM -0700, Stephen Zander wrote:
> >>>>> "Nick" == Nick Phillips <n...@nz.lemon-computing.com> writes:
> Nick> That doesn't make it a bad binary.
>
> So a binary that throw illegal instruction traps is still a good binary?

Depends on whether you're running it on the arch it was intended for.
If not, then quite possibly yes. It's just not the *right* good binary...

*shrug*


--
Nick Phillips -- n...@lemon-computing.com

You may get an opportunity for advancement today. Watch it!

Frank Copeland

unread,
May 26, 2002, 2:50:06 AM5/26/02
to
On Sat, 25 May 2002 14:31:41 +0000 (UTC), Wilmer van der Gaast <lin...@bigfoot.com> wrote:
> Frank Copeland wrote on Thu, 23 May 2002 12:39:29 +0000 (UTC):

>> Do I really care if ls runs 5 or 10% faster (in CPU terms) when it is
>> essentially disk-bound?
>>
> Not about ls.. But a faster Mozilla/KDE sounds interesting, doesn't it?

Not especially. Galeon/Gnome runs perfectly fine on this box, and on
the much slower boxes that preceded it. They are bad examples in any
case. They are not CPU-bound applications.

I've seen suggestions that *building* KDE, or building a CPU-optimised
debian itself, are applications where a CPU-optimised debian would be an
advantage. Of course a CPU-optimised toolchain will build a
CPU-optimised debian faster, but if I don't want to build a
CPU-optimised debian then I don't need it. It's a circular argument. At
best it's an argument for building a CPU-optimised toolchain, not the
whole distribution.

Which brings me back to the point I made in my original response:

> If I should discover a particular package that (heaven forfend) makes
> this beast appear sluggish, why isn't it enough that I can apt-get
> source it, adjust the build settings and dpkg-buildpackage a
> K7-optimised version? I already do that for the kernel.

I can do it for gcc as well.

Frank
--
Home Page: <URL:http://thingy.apana.org.au/~fjc/>
Not the Scientology Home Page: <URL:http://xenu.apana.org.au/ntshp/>
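
(The rebuild-it-yourself path Frank describes, as a minimal sketch; the
package name is a placeholder, and DEBIAN_BUILDARCH assumes the
pentium-builder wrapper mentioned elsewhere in this thread:)

apt-get source bzip2
cd bzip2-*/
DEBIAN_BUILDARCH=athlon dpkg-buildpackage -us -uc -b
dpkg -i ../bzip2_*.deb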

Matt Zimmerman

unread,
May 26, 2002, 12:10:07 PM5/26/02
to
On Sun, May 26, 2002 at 06:49:13AM +0000, Frank Copeland wrote:

> Not especially. Galeon/Gnome runs perfectly fine on this box, and on the
> much slower boxes that preceded it. They are bad examples in any case.
> They are not CPU-bound applications.

You should not make such general statements based only on your own
experience. Galeon and Mozilla are network-bound on this Athlon, but on my
iMac at work, they are very much CPU-bound.

Even so, I am not in favour of spending resources on CPU-optimized binary
packages, though it would be nice if there were an elegant and reliable way
to supply extra optimization flags to the package build process without
human intervention.

--
- mdz

Ulrich Eckhardt

unread,
May 28, 2002, 7:10:19 PM5/28/02
to
On Thursday 23 May 2002 14:56, Michael Stone wrote:
> On Thu, May 23, 2002 at 10:42:02PM +1000, Glenn McGrath wrote:
> > On Thu, 23 May 2002 07:31:05 -0400 "Michael Stone"
<mst...@debian.org>wrote:
> > > Do you have actual benchmarks rather than hearsay from some leet /.
> > > kiddie? No one has ever come to debian with a convincing set of numbers
> > > that indicate that optimizing intel builds buys any significant
> > > advantage.
> >
> > It would be very stupid to suggest it makes it slower.
>
> Would it? If you don't have numbers you have *nothing* to back that
> assertion.

Using as a benchmark: compressing my cvsroot with
time tar cjf cvsroot.tar.bz2 cvsroot
on an Athlon 1400 (or 1600, dunno ...). Size of the dir is 5.3M.


Debian's debs:
around 3.6 seconds

compiled bzip2 and tar with DEBIAN_BUILDARCH=k6:
around 3.5 seconds

compiled with DEBIAN_BUILDARCH=k6 and GCCVER=3.0
around 3.3 seconds

compiled with DEBIAN_BUILDARCH=athlon and GCCVER=3.0
around 3.3 seconds

(note: arch=athlon doesn't work for gcc2.95)

Adam Heath

unread,
May 28, 2002, 7:30:11 PM5/28/02
to
On Wed, 29 May 2002, Ulrich Eckhardt wrote:

> using as a benchmark compressing my cvsroot with
> time tar cjf cvsroot.tar.bz2 cvsroot
> on an Athlon 1400 (or 1600, dunno ...). Size of the dir is 5.3M.
>
>
> Debian's debs:
> around 3.6 seconds
>
> compiled bzip2 and tar with DEBIAN_BUILDARCH=k6:
> around 3.5 seconds

Noise.

>
> compiled with DEBIAN_BUILDARCH=k6 and GCCVER=3.0
> around 3.3 seconds
>
> compiled with DEBIAN_BUILDARCH=athlon and GCCVER=3.0
> around 3.3 seconds

No change for k6/athlon. However, it appears that GCC=3.0 gives very good
numbers.

What happens if you compile with just 3.0, but no BUILDARCH?

Also, use a bigger test case. This one is rather small.

Ulrich Eckhardt

unread,
May 29, 2002, 3:50:10 AM5/29/02
to
On Wednesday 29 May 2002 01:30, Adam Heath wrote:
> On Wed, 29 May 2002, Ulrich Eckhardt wrote:
> >
> > Debian's debs:
> > around 3.6 seconds
> >
> > compiled bzip2 and tar with DEBIAN_BUILDARCH=k6:
> > around 3.5 seconds
>
> Noise.
No. Marginal but visible.

>
> > compiled with DEBIAN_BUILDARCH=k6 and GCCVER=3.0
> > around 3.3 seconds
> >
> > compiled with DEBIAN_BUILDARCH=athlon and GCCVER=3.0
> > around 3.3 seconds
>
> No change for k6/athlon. However, it appears that GCC=3.0 gives very good
> numbers.
>
> What happens if you compile with just 3.0, but no BUILDARCH?
>
> Also, use a biggest test case. This one is rather small.

Hmmm, everyone can do so on their own machines easily .... how are the
results on yours ? :)

I'll give it a more systematic try with a linux-kernel sourcetree instead of
my cvsroot, but that's later today.

uli

Anthony DeRobertis

unread,
May 29, 2002, 5:50:10 AM5/29/02
to
On Thu, 2002-05-23 at 14:21, Glenn Maynard wrote:

> ... and I spat in disgust to see my copy of ifconfig start spewing "GiB"
> at me. Ugh.

Well, let it wrap and then it will start spewing Men in Black at you. So
not all is lost.


Anthony DeRobertis

unread,
May 29, 2002, 6:10:07 AM5/29/02
to
On Thu, 2002-05-23 at 19:06, Glenn McGrath wrote:

> I dont think -march=<cpu> should ever produce a bad binary,

-march=athlon and -march=pentiumpro have made a binary of mine behave
differently.

It was my fault; I was (accidentally) depending on undefined behavior.
Interestingly, it worked on FreeBSD, PowerPC, Alpha, and i386 --- as
long as I didn't use the wrong -march... Didn't work on Darwin, though.

Everything's fine now that I fixed my undefined behavior, but it _could_
break things.

It probably isn't too often an occurrence, and would be picked up more by
people using different compiler versions, or the umpteen different
architectures.


Andrew Suffield

unread,
May 29, 2002, 6:20:10 AM5/29/02
to
On Wed, May 29, 2002 at 09:48:33AM +0200, Ulrich Eckhardt wrote:
> On Wednesday 29 May 2002 01:30, Adam Heath wrote:
> > On Wed, 29 May 2002, Ulrich Eckhardt wrote:
> > >
> > > Debian's debs:
> > > around 3.6 seconds
> > >
> > > compiled bzip2 and tar with DEBIAN_BUILDARCH=k6:
> > > around 3.5 seconds
> >
> > Noise.
> No. Marginal but visible.

No, it really is noise. At the accuracy you have given (one decimal
place), a change of 0.1 is statistically meaningless, especially since
you said "about". It can easily be attributed to aliasing effects (a
0.0001 shift about the 3.55 mark will result in a shift from 3.5 to
3.6).

The figures you have given here are not meaningful.

> > > compiled with DEBIAN_BUILDARCH=k6 and GCCVER=3.0
> > > around 3.3 seconds
> > >
> > > compiled with DEBIAN_BUILDARCH=athlon and GCCVER=3.0
> > > around 3.3 seconds
> >
> > No change for k6/athlon. However, it appears that GCC=3.0 gives very good
> > numbers.
> >
> > What happens if you compile with just 3.0, but no BUILDARCH?
> >
> > Also, use a biggest test case. This one is rather small.
>
> Hmmm, everyone can do so on their own machines easily .... how are the
> results on yours ? :)
>
> I'll give it a more systematic try with a linux-kernel sourcetree instead of
> my cvsroot, but that's later today.

That is not a deterministic test. It doesn't provide a particularly
useful benchmark; it involves a semi-random pattern of disk access
which is hopelessly skewed by environmental factors.

If you are trying to do a benchmark on processor-related performance
changes (which it seems like you are), you will have to construct a
benchmark which actually measures that (no, I don't have any
suggestions offhand).

--
.''`. ** Debian GNU/Linux ** | Andrew Suffield
: :' : http://www.debian.org/ | Dept. of Computing,
`. `' | Imperial College,
`- -><- | London, UK

James Kahn

unread,
May 29, 2002, 7:00:14 AM5/29/02
to
On Wed, 2002-05-29 at 22:16, Andrew Suffield wrote:
> On Wed, May 29, 2002 at 09:48:33AM +0200, Ulrich Eckhardt wrote:
> > Hmmm, everyone can do so on their own machines easily .... how are the
> > results on yours ? :)
> >
> > I'll give it a more systematic try with a linux-kernel sourcetree instead of
> > my cvsroot, but that's later today.
>
> That is not a deterministic test. It doesn't provide a particularly
> useful benchmark; it involves a semi-random pattern of disk access
> which is hopelessly skewed by environmental factors.

Far from being a "deterministic" test, it's not a test of CPU at all.
The CPU will be mostly idle waiting for data from the hard disk and bus.

> If you are trying to do a benchmark on processor-related performance
> changes (which it seems like you are), you will have to construct a
> benchmark which actually measures that (no, I don't have any
> suggestions offhand).

Ray tracing tends to be pretty CPU-intensive; how about using povray?
Of course, to get "real" results, you'll need it to trace a rather large
scene a few times with each buildarch, throw away any outliers and take
the mean of the rest.

James.

Oliver Kurth

unread,
May 29, 2002, 7:10:07 AM5/29/02
to
On Wed, May 29, 2002 at 10:58:41PM +1200, James Kahn wrote:
> On Wed, 2002-05-29 at 22:16, Andrew Suffield wrote:
> > On Wed, May 29, 2002 at 09:48:33AM +0200, Ulrich Eckhardt wrote:
> > > Hmmm, everyone can do so on their own machines easily .... how are the
> > > results on yours ? :)
> > >
> > > I'll give it a more systematic try with a linux-kernel sourcetree instead of
> > > my cvsroot, but that's later today.
> >
> > That is not a deterministic test. It doesn't provide a particularly
> > useful benchmark; it involves a semi-random pattern of disk access
> > which is hopelessly skewed by environmental factors.
>
> Far from being a "deterministic" test, it's not a test of CPU at all.
> The CPU will be mostly idle waiting for data from the hard disk and bus.
>
> > If you are trying to do a benchmark on processor-related performance
> > changes (which it seems like you are), you will have to construct a
> > benchmark which actually measures that (no, I don't have any
> > suggestions offhand).
>
> Ray tracing tends to be pretty CPU intensive, how about using povray?
> Of course, to get "real" results, you'll need it to trace a rather large
> scene a few times with each buildarch, throw away any outliers and draw
> the mean of the rest.

If bzip2 should be used as a benchmark, write a small program that generates
random data and pipe this to bzip2:

genran | bzip2 -c > /dev/null

This should not generate any disk IO. Of course, the data generating program
should not be CPU intensive itself.

Greetings,
Oliver
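
(A minimal stand-in for the hypothetical genran, keeping the generator's own
CPU cost out of the timed run: pre-generate the data once, small enough to
stay in the page cache, then time only the compression. File name and size
are placeholders.)

dd if=/dev/urandom of=/tmp/test.dat bs=1M count=10 2>/dev/null
cat /tmp/test.dat > /dev/null            # warm the page cache
time bzip2 -c /tmp/test.dat > /dev/null  # now essentially CPU-only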

Oohara Yuuma

unread,
May 29, 2002, 10:40:10 AM5/29/02
to
On 29 May 2002 22:58:41 +1200, James Kahn <jk...@paradise.net.nz> wrote:
> On Wed, 2002-05-29 at 22:16, Andrew Suffield wrote:
> > If you are trying to do a benchmark on processor-related performance
> > changes (which it seems like you are), you will have to construct a
> > benchmark which actually measures that (no, I don't have any
> > suggestions offhand).
> Ray tracing tends to be pretty CPU intensive, how about using povray?
> Of course, to get "real" results, you'll need it to trace a rather large
> scene a few times with each buildarch, throw away any outliers and draw
> the mean of the rest.
How about john? Here is my result:

$ echo 'exec -a john debian/tmp/usr/sbin/john-any -test' | sh

* Debian default (-O2 -fomit-frame-pointer)

Benchmarking: Standard DES [24/32 4K]... DONE
Many salts: 164531 c/s real, 166193 c/s virtual
Only one salt: 152396 c/s real, 152396 c/s virtual

Benchmarking: BSDI DES (x725) [24/32 4K]... DONE
Many salts: 5868 c/s real, 5868 c/s virtual
Only one salt: 5779 c/s real, 5779 c/s virtual

Benchmarking: FreeBSD MD5 [32/32]... DONE
Raw: 2603 c/s real, 2603 c/s virtual

Benchmarking: OpenBSD Blowfish (x32) [32/32]... DONE
Raw: 208 c/s real, 208 c/s virtual

Benchmarking: Kerberos AFS DES [24/32 4K]... DONE
Short: 148224 c/s real, 148224 c/s virtual
Long: 258252 c/s real, 258252 c/s virtual

Benchmarking: NT LM DES [24/32 4K]... DONE
Raw: 815219 c/s real, 815219 c/s virtual


* no optimization

Benchmarking: Standard DES [24/32 4K]... DONE
Many salts: 162892 c/s real, 163219 c/s virtual
Only one salt: 140697 c/s real, 140697 c/s virtual

Benchmarking: BSDI DES (x725) [24/32 4K]... DONE
Many salts: 5872 c/s real, 5872 c/s virtual
Only one salt: 5609 c/s real, 5609 c/s virtual

Benchmarking: FreeBSD MD5 [32/32]... DONE
Raw: 2459 c/s real, 2588 c/s virtual

Benchmarking: OpenBSD Blowfish (x32) [32/32]... DONE
Raw: 191 c/s real, 191 c/s virtual

Benchmarking: Kerberos AFS DES [24/32 4K]... DONE
Short: 131123 c/s real, 131123 c/s virtual
Long: 127948 c/s real, 127948 c/s virtual

Benchmarking: NT LM DES [24/32 4K]... DONE
Raw: 685875 c/s real, 685875 c/s virtual


* overkill optimization (gcc-3.1, -O3 -fomit-frame-pointer -mcpu=pentium4)

Benchmarking: Standard DES [24/32 4K]... DONE
Many salts: 166272 c/s real, 166272 c/s virtual
Only one salt: 152038 c/s real, 152648 c/s virtual

Benchmarking: BSDI DES (x725) [24/32 4K]... DONE
Many salts: 5777 c/s real, 5835 c/s virtual
Only one salt: 5777 c/s real, 5777 c/s virtual

Benchmarking: FreeBSD MD5 [32/32]... DONE
Raw: 2616 c/s real, 2616 c/s virtual

Benchmarking: OpenBSD Blowfish (x32) [32/32]... DONE
Raw: 207 c/s real, 207 c/s virtual

Benchmarking: Kerberos AFS DES [24/32 4K]... DONE
Short: 144640 c/s real, 144640 c/s virtual
Long: 256256 c/s real, 256256 c/s virtual

Benchmarking: NT LM DES [24/32 4K]... DONE
Raw: 847820 c/s real, 847820 c/s virtual


I don't think CPU-specific optimization is much better
than non-CPU-specific optimization.

--
Oohara Yuuma <ooh...@libra.interq.or.jp>
Debian developer
PGP key (key ID F464A695) http://www.interq.or.jp/libra/oohara/pub-key.txt
Key fingerprint = 6142 8D07 9C5B 159B C170 1F4A 40D6 F42E F464 A695

I always put away what I take.
--- Ryuji Akai, "Star away"
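
(One caveat about the flags above, stated as an assumption about this era's
gcc: -mcpu=pentium4 only tunes instruction scheduling while keeping
i386-compatible code, whereas -march=pentium4 would also permit the new
instructions and make the binary require that CPU. foo.c is a placeholder:)

gcc -O3 -fomit-frame-pointer -mcpu=pentium4 -c foo.c   # tuned, runs on any x86
gcc -O3 -fomit-frame-pointer -march=pentium4 -c foo.c  # may not run elsewhere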

Matthew M

unread,
May 29, 2002, 3:50:06 PM5/29/02
to

On Wednesday 29 May 2002 12:08 pm, Oliver Kurth wrote:
> If bzip2 should be used as a benchmark, write a small program that
> generates random data and pipe this to bzip2:
>
> genran | bzip2 -c > /dev/null
>
> This should not generate any disk IO. Of course, the data generating
> program should not be CPU intensive itself.

Tests use a file generated by dd if=/dev/urandom of=./test.file bs=1M count=10
Compressed with bzip2 -kc ./test.file > /dev/null
File is in cache, so no disk io.

gcc-2.95.4 (no optimisation)
19.1 seconds

gcc-2.95.4 (-O2 -fomit-frame-pointer)
15.8 seconds

gcc-2.95.4 (-O2 -fomit-frame-pointer -march=i686)
16.1 seconds

gcc-3.0 (no optimisation)
24.75 seconds

gcc-3.0 (-O2 -fomit-frame-pointer)
16.0 seconds

gcc-3.0 (-O2 -fomit-frame-pointer -march=i686)
15.9 seconds

gcc-3.1 (no optimisation)
22.1 seconds

gcc-3.1 (-O2 -fomit-frame-pointer)
15.6 seconds

gcc-3.1 (-O2 -fomit-frame-pointer -march=pentium4)
15.8 seconds

Summary:

gcc-3.1 is the fastest compiler (even faster than icc). Turning on
cpu-specific optimisation decreases performance in this case; I'm not sure if
that is noise (performance goes up for gcc-3.0), but the results were pretty
consistent.

So there it is... in this case, at least, cpu-specific optimisations don't
appear to be worth much. Of course, that might not be true for different
workloads.

*matt*
- --

Now I know someone out there is going to claim, "Well then, UNIX is intuitive,
because you only need to learn 5000 commands, and then everything else follows
from that! Har har har!"
-- Andy Bates on "intuitive interfaces", slightly defending Macs

Ulrich Eckhardt

unread,
May 29, 2002, 5:40:07 PM5/29/02
to
On Wednesday 29 May 2002 12:16, Andrew Suffield wrote:
> On Wed, May 29, 2002 at 09:48:33AM +0200, Ulrich Eckhardt wrote:
> > On Wednesday 29 May 2002 01:30, Adam Heath wrote:
> > > On Wed, 29 May 2002, Ulrich Eckhardt wrote:
> > > > Debian's debs:
> > > > around 3.6 seconds
> > > >
> > > > compiled bzip2 and tar with DEBIAN_BUILDARCH=k6:
> > > > around 3.5 seconds
> > >
> > > Noise.
> >
> > No. Marginal but visible.

You don't show me yours, so I won't show you mine!
:)

>
> No, it really is noise. At the accuracy you have given (one decimal
> place), a change of 0.1 is statistically meaningless, especially since
> you said "about". It can easily be attributed to aliasing effects (a
> 0.0001 shift about the 3.55 mark will result in a shift from 3.5 to
> 3.6).
>

You have no idea why I claimed that it is not noise, because you have
neither seen what I have really done, nor have I told you. My short 'no' was
also a response to your short 'noise', for which you didn't give your
reasons either.

However, it really is not noise. That is not a scientific opinion, but an
observed one; try it yourself and you'll see. I don't claim that the
difference is really 0.1 s, btw. It may be even smaller, but it was visible
when switching between those two versions.


> The figures you have given here are not meaningful.
>
> > > > compiled with DEBIAN_BUILDARCH=k6 and GCCVER=3.0
> > > > around 3.3 seconds
> > > >
> > > > compiled with DEBIAN_BUILDARCH=athlon and GCCVER=3.0
> > > > around 3.3 seconds
> > >
> > > No change for k6/athlon. However, it appears that GCC=3.0 gives very
> > > good numbers.
> > >
> > > What happens if you compile with just 3.0, but no BUILDARCH?
> > >
> > > Also, use a biggest test case. This one is rather small.
> >
> > Hmmm, everyone can do so on their own machines easily .... how are the
> > results on yours ? :)
> >
> > I'll give it a more systematic try with a linux-kernel sourcetree instead
> > of my cvsroot, but that's later today.
>
> That is not a deterministic test. It doesn't provide a particularly
> useful benchmark; it involves a semi-random pattern of disk access
> which is hopelessly skewed by environmental factors.
>

I picked those two because I use them now and then. I hope that the kernel
caches all disk access, so that shouldn't be an issue after the first run.

Random data has a drawback if I don't use the same set of data for the
various programs to benchmark; therefore some constant but (semi-)random data
is preferable imho.

Also, the size shouldn't exceed the memory, because then it really gets
IO-bound, so small sizes are preferable. A Linux sourcetree is too large, I
have to admit.


To add some more to this thing, here are some more results. This time I
only document the 'user' field of what time reports. System is an
Athlon XP 1700 @ 1470 MHz (256KiB cache, 2929.45 bogomips),
Epox 8KHA+ with 256MiB RAM.
Note: these times are only the lowest 'user' result of running
tar cjf cvsroot-2002-05-29.tar.bz2 cvsroot
(cvsroot being 5.3MiB) a few times. Some may claim that I should have
taken the average, but A) I am too lazy, and B) ideally it should always
take the same time, and that time cannot go beneath a certain limit: the
time the process really takes, without additional time for context
switches etc ...

All results varied in a range of ~0.1 s, with the occasional escapee.
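
(A sketch of that lowest-of-a-few-runs procedure; paths are placeholders,
and /usr/bin/time is GNU time, whose -f "%U" prints just the user time:)

rm -f /tmp/user.times
for i in 1 2 3 4 5; do
    /usr/bin/time -f "%U" tar cjf /tmp/out.tar.bz2 cvsroot 2>>/tmp/user.times
done
sort -n /tmp/user.times | head -1   # best (lowest) user time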

BUILDARCH/BUILDGCCVER

i386/2.95
2.72 s

i386/3.0
2.53 s

i486/2.95
2.60 s

i486/3.0
2.50 s

k6/2.95
2.57 s

k6/3.0
2.42 s

athlon/3.0
2.50 s


A few conclusions:
- I'll now recompile some stuff with k6/3.0 :-)
- adding archs to Debian is not really beneficial for my system; the pending
switch to gcc 3.0 will already be a big step.
- this needs to be reproduced on other systems too. If I want to, I will do
so on a pentium MMX@300 and a K6-2@500.

uli

sta...@okstate.edu

unread,
May 29, 2002, 5:50:09 PM5/29/02
to
>Far from being a "deterministic" test, it's not a test of CPU at all.
>The CPU will be mostly idle waiting for data from the hard disk and bus.

Ever run bzip2? It's a CPU-bound task; if it weren't, then it would run at
about the same speed as gzip and be usable in many more places.

James Kahn

unread,
May 30, 2002, 5:30:11 AM5/30/02
to
On Thu, 2002-05-30 at 09:44, sta...@okstate.edu wrote:
> >Far from being a "deterministic" test, it's not a test of CPU at all.
> >The CPU will be mostly idle waiting for data from the hard disk and bus.
>
> Even run bzip? It's a CPU bound task; if it weren't then it would run at
> about the same speed as gzip and be usable many more places.

Yes, I have run bzip2, as you'd expect most people here would have.
What you are forgetting is that CPUs are incredibly fast - and cache,
the bus, RAM and hard disks are many orders of magnitude slower (each
somewhat more so than the last). The problem you run into here is that
to test the CPU (rather than the rest of the system) you have to have
such a small amount of data to compress that your sample is very
vulnerable to noise. On the other hand, if you try to take a larger
sample, other environmental factors come into effect - that of each
component the data must travel through before reaching the CPU.

It is CPU bound, but when it does rest and wait for data from the hard
disk, it is not the CPU that is being stressed. If it does this at all
it is not a good test of CPU optimisation.

James.

Michael Stone

unread,
May 30, 2002, 6:20:07 AM5/30/02
to
On Thu, May 30, 2002 at 09:24:29PM +1200, James Kahn wrote:
> It is CPU bound, but when it does rest and wait for data from the hard
> disk, it is not the CPU that is being stressed. If it does this at all
> it is not a good test of CPU optimisation.

But we're not really interested in testing cpu optimization. For our
purposes, what's important is whether optimizations make a difference
for our applications, not whether the optimizations improve some
abstract test cases. Yes, it makes testing harder, but that's life.

--
Mike Stone

James Kahn

unread,
May 30, 2002, 7:30:17 AM5/30/02
to
On Thu, 2002-05-30 at 22:15, Michael Stone wrote:
> On Thu, May 30, 2002 at 09:24:29PM +1200, James Kahn wrote:
> > It is CPU bound, but when it does rest and wait for data from the hard
> > disk, it is not the CPU that is being stressed. If it does this at all
> > it is not a good test of CPU optimisation.
>
> But we're not really interested in testing cpu optimization. For our
> purposes, what's important is whether optimizations make a difference
> for our applications, not whether the optimizations improve some
> abstract test cases. Yes, it makes testing harder, but that's life.

Harder, if not impossible, to get real results. Tests would either be
subjective - "It just *feels* faster" - or have a low signal-to-noise
ratio.

Michael Stone

unread,
May 30, 2002, 11:00:07 AM5/30/02
to
On Thu, May 30, 2002 at 11:27:50PM +1200, James Kahn wrote:
> Harder, if not impossible to get real results. Tests would either be
> subjective - "It just *feels* faster" - or have a high signal to noise
> ratio.

Not really. You create a benchmark of actual applications and compare
the results to the non-optimized applications. You run the tests a
number of times and see what happens. This isn't unheard of--a lot of
windows benchmarks, for example, actually run MS office apps, or video
games, or whatever--which is much more useful for most people than
reporting how fast a given machine can do fourier transforms. If the
results are only noise, well, I guess that answers the question. :)

--
Mike Stone

Junichi Uekawa

unread,
Jun 1, 2002, 1:00:12 PM6/1/02
to
Matthew M <matthew...@btinternet.com> immo vero scripsit:


> gcc-3.1 is the fastest compiler (even faster than icc). Turning on
> cpu-specific optimisation decreses performance in this case; I'm not sure if
> that is noise (performance goes up for gcc-3.0), but the results were pretty
> consistent.
>
> So there it is... in this case, at least, cpu-specific optimisations don't
> appear to be worth much. Of course, that might not be true for different
> workloads.


I have different values for athlon, although I used
bochs as my testcase.

I benchmarked gcc-3.0 and gcc-2.95 at that time, and that
motivated me to do my rebuilds.
I have my graph somewhere ...

regards,
junichi

--
dan...@debian.org : Junichi Uekawa http://www.netfort.gr.jp/~dancer
GPG Fingerprint : 17D6 120E 4455 1832 9423 7447 3059 BF92 CD37 56F4
Libpkg-guide: http://www.netfort.gr.jp/~dancer/column/libpkg-guide/

Thomas Zimmerman

unread,
Jun 4, 2002, 11:50:06 PM6/4/02
to
On 29-May 10:58, James Kahn wrote:
> On Wed, 2002-05-29 at 22:16, Andrew Suffield wrote:
> > On Wed, May 29, 2002 at 09:48:33AM +0200, Ulrich Eckhardt wrote:
> > > Hmmm, everyone can do so on their own machines easily .... how are the
> > > results on yours ? :)
> > >
> > > I'll give it a more systematic try with a linux-kernel sourcetree instead of
> > > my cvsroot, but that's later today.
> >
> > That is not a deterministic test. It doesn't provide a particularly
> > useful benchmark; it involves a semi-random pattern of disk access
> > which is hopelessly skewed by environmental factors.
>
> Far from being a "deterministic" test, it's not a test of CPU at all.
> The CPU will be mostly idle waiting for data from the hard disk and bus.

=) I want your box. Even on my Athlon box with lots of RAM, a kernel compile
is definitely CPU-bound. I'll give you that disk access adds too much
"noise" for benchmarking. A "find . -exec cat {} >/dev/null \;" in the build
tree speeds things up greatly, but then I have 512M RAM. Building a test
framework around a kernel untar-then-build isn't that hard. (Even I can do
it... that's saying something.)

That said, this thread seems to be heading nowhere fast. Wouldn't it just
be nice if woody+1 could be rebuilt using tools? Software shouldn't need
hand-holding.

Thomas
[snip]

James Kahn

unread,
Jun 5, 2002, 6:50:10 AM6/5/02
to
On Thu, 2002-05-30 at 07:15, Thomas Zimmerman wrote:
> On 29-May 10:58, James Kahn wrote:
> > On Wed, 2002-05-29 at 22:16, Andrew Suffield wrote:
> > > On Wed, May 29, 2002 at 09:48:33AM +0200, Ulrich Eckhardt wrote:
> > > > Hmmm, everyone can do so on their own machines easily .... how are the
> > > > results on yours ? :)
> > > >
> > > > I'll give it a more systematic try with a linux-kernel sourcetree instead of
> > > > my cvsroot, but that's later today.
> > >
> > > That is not a deterministic test. It doesn't provide a particularly
> > > useful benchmark; it involves a semi-random pattern of disk access
> > > which is hopelessly skewed by environmental factors.
> >
> > Far from being a "deterministic" test, it's not a test of CPU at all.
> > The CPU will be mostly idle waiting for data from the hard disk and bus.
>
> =) I want your box. Even on my Athlon box with lots of ram, a kernel compile
> is definately cpu bound.

Oh definitely, a kernel compile seems to be the de facto standard for
benchmarking in GNU/Linux. I thought Ulrich was talking about tar +
bzip2ing a kernel source tree? Sounded like it to me.

James

Ulrich Eckhardt

unread,
Jun 7, 2002, 6:10:08 AM6/7/02
to
On Wednesday 05 June 2002 12:03, James Kahn wrote:
>
> Oh definitely, a kernel compile seems to be the de facto standard for
> benchmarking in GNU/Linux. I thought Ulrich was talking about tar +
> bzip2ing a kernel source tree? Sounded like it to me.

Yes I was, but that will really be IO-bound, as somebody pointed out, since I
don't have enough RAM. However, I redid the test with a ~5MB tree and some
more different types of compilation, but the results were already posted
here.

Charles C. Fu

unread,
Jun 19, 2002, 5:30:11 AM6/19/02
to
On Wed, 29 May 2002, Ulrich Eckhardt wrote:
> using as a benchmark compressing my cvsroot

Hmmph, everyone seems to be posting integer benchmarks when I would
actually expect the major benefit to be from improved scheduling
and use of MMX-type instructions in apps bound by floating-point
computations.

To illustrate, here are results of running at least three runs each of

(unset DISPLAY;echo 'set samples 1000000;set size ratio -1;set xrange [-100:100];plot x*x/(1+x+x*x)'|time ./gnuplot)

with the following configurations on one of my systems (dual 650MHz
Pentium III):

stock Debian gnuplot 3.7.2-4 (-O2):
5.66user 1.94system 0:07.56elapsed 100%CPU (0avgtext+0avgdata 0maxresident)k
5.77user 1.77system 0:07.53elapsed 100%CPU (0avgtext+0avgdata 0maxresident)k
5.69user 1.88system 0:07.56elapsed 100%CPU (0avgtext+0avgdata 0maxresident)k

compiled with gcc-3.1 and -O2:
5.68user 1.88system 0:07.55elapsed 100%CPU (0avgtext+0avgdata 0maxresident)k
5.85user 1.71system 0:07.56elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
5.82user 1.75system 0:07.56elapsed 100%CPU (0avgtext+0avgdata 0maxresident)k

compiled with gcc-3.1 and -O2 -march=pentium3:
4.62user 1.85system 0:06.46elapsed 100%CPU (0avgtext+0avgdata 0maxresident)k
4.59user 1.89system 0:06.47elapsed 100%CPU (0avgtext+0avgdata 0maxresident)k
4.64user 1.81system 0:06.44elapsed 100%CPU (0avgtext+0avgdata 0maxresident)k

compiled with gcc-3.1 and -O2 -march=pentium3 -fomit-frame-pointer:
4.41user 1.84system 0:06.24elapsed 100%CPU (0avgtext+0avgdata 0maxresident)k
4.39user 1.86system 0:06.25elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
4.52user 1.73system 0:06.23elapsed 100%CPU (0avgtext+0avgdata 0maxresident)k

compiled with gcc-3.1 and all of the above + -mfpmath=sse:
4.23user 1.86system 0:06.08elapsed 100%CPU (0avgtext+0avgdata 0maxresident)k
4.23user 1.87system 0:06.08elapsed 100%CPU (0avgtext+0avgdata 0maxresident)k
4.22user 1.88system 0:06.08elapsed 100%CPU (0avgtext+0avgdata 0maxresident)k
4.23user 1.87system 0:06.10elapsed 100%CPU (0avgtext+0avgdata 0maxresident)k
4.42user 1.67system 0:06.09elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
4.13user 1.94system 0:06.07elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k

As can be seen, the cpu-specific -march=pentium3 flag helps noticeably
in this simple benchmark, and the other optimizations each help a
little. (Since gnuplot mostly uses double precision internally, it is
possible -mfpmath=sse would have greater impact when compiled for and
run on a SSE2 system. I have certainly seen it have a _tremendous_
impact on one of my own programs on that PIII box when executing
tight, single-precision loops, which, of course, are uncommon in most
apps.)

Of course, I have also seen cases where the extra flags do not help or
even hurt. So, my own preference would be for the following:

- CPU detection handled upstream within the sources (ideal),
- else Debian cpu-specific packages _only_ where someone has verified
it really seems to make a significant positive difference and
doesn't cause additional bugs,

and

- ability for users to easily recompile packages from Debian source
with custom compilers and compiler flags.

-ccwf
