[Haskell-cafe] ANNOUNCE: nobench: Haskell implementation benchmarks. GHC v Hugs v Yhc v NHC v ...

Donald Bruce Stewart

Feb 19, 2007, 5:34:44 AM

to haskel...@haskell.org

Following recent discussion about a cross-implementation performance
benchmark suite, based on nofib, I've gone and combined nofib with the
great language shootout programs, and rewritten the build system to
support cross implementation measurements.

The result is:

nobench
http://www.cse.unsw.edu.au/~dons/nobench.html

The benchmark suite runs regularly, and currently reports the
speed of each program in the suite, running under each system. The
results are quite interesting. The most recent run is available:

http://www.cse.unsw.edu.au/~dons/nobench/bench.results
http://www.cse.unsw.edu.au/~dons/nobench/bench.log

The programs are a mixture of traditional nofib style Haskell, with more
performance-tuned code from the shootout. More tweaking is required to
help better support nhc and yhc (and jhc, and ...).

The entire benchmark set and framework is available via darcs:

darcs get http://www.cse.unsw.edu.au/~dons/code/nobench

Still to do: porting the rest of nofib, pretty graphs of the
results (and HTML), and memory use measurements.

Patches welcome!

Cheers,
Don
_______________________________________________
Haskell-Cafe mailing list
Haskel...@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Neil Mitchell

Feb 19, 2007, 5:55:26 AM

to Donald Bruce Stewart

Hi Dons,

> nobench
> http://www.cse.unsw.edu.au/~dons/nobench.html

Yhc is consistently half the speed of nhc, whereas in our tests it's
typically 20% faster. Can you make sure you've built Yhi with -O
(scons type=release should do it)? I opened a bug just a few days ago,
because I realised all benchmarks would get run at no optimisation
otherwise :)

If anyone wants a project, finding out which flags to build Yhi with to
get the best performance here would be nice to see :)

Why does the integrate benchmark import both System and
System.Environment? Yhc currently doesn't export getArgs from System,
only System.Environment. (And yes, we really should fix that!)
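
For reference, a minimal sketch of an argument-reading prelude that
imports getArgs from System.Environment only, so it does not depend on
the Haskell 98 System module re-exporting it. The helper parseCount and
its fallback of 1000 are hypothetical and not taken from the integrate
benchmark.

```haskell
module Main where

-- Import getArgs from the hierarchical module only; nothing here relies
-- on the Haskell 98 System module.
import System.Environment (getArgs)

-- Hypothetical helper: take the iteration count from the first argument,
-- falling back to a default when it is missing or not a number.
parseCount :: [String] -> Int
parseCount (s : _) = case reads s of
  [(n, "")] -> n
  _         -> 1000
parseCount [] = 1000

main :: IO ()
main = do
  args <- getArgs
  print (parseCount args)
```

A program written this way should build unchanged on any implementation
that ships System.Environment, whether or not its System module also
exports getArgs.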

Thanks

Neil

Ketil Malde

Feb 19, 2007, 6:39:29 AM

to haskel...@haskell.org

Donald Bruce Stewart wrote:
> Following recent discussion about a cross-implementation performance
> benchmark suite, based on nofib, I've gone and combined nofib with the
> great language shootout programs, and rewritten the build system to
> support cross implementation measurements.
>

Great work!

..but I wonder if the shootout is really the kind of code that is ideal
for a compiler benchmark. Typically (at least based on what I've seen
of the submissions) they tend to be fairly heavily tuned, using
optimizations that are a) obfuscating the code and b) tuned specifically
for GHC.

(Another potential issue that follows from this is how to resolve a
modification to a benchmark that makes one compiler faster at the
expense of another.)

Wouldn't it be better to benchmark a more idiomatically correct codebase?

-k

Dougal Stanton

Feb 19, 2007, 6:54:30 AM

to haskel...@haskell.org

Quoth Ketil Malde, nevermore,

>
> Wouldn't it be better to benchmark a more idiomatically correct codebase?

I suppose the ideal way to do it would be benchmarks for the (1) idiomatic
and (2) the highly tuned implementations. Then the compiler writers can
push 1 towards 2, while the pesky shootout implementers can move the
goalposts of 2. ;-)

In reality this may just foster a small set of horribly specialised
optimisers in the compilers, with little benefit for real-world usage. :-(
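
As a rough illustration of the two styles, here is the same series
written once as an idiomatic list pipeline and once as a hand-tuned
loop with a strict accumulator. Both functions are hypothetical examples
rather than code from nofib or the shootout, and how far apart they end
up in speed depends on the compiler and its optimisation level.

```haskell
module Main where

-- (1) Idiomatic: sum of 1/(k*(k+1)) for k = 1..n as a list pipeline.
idiomatic :: Int -> Double
idiomatic n = sum [1 / (k * (k + 1)) | k <- map fromIntegral [1 .. n]]

-- (2) Hand-tuned: an explicit loop whose accumulator is forced at every
-- step with seq, so no chain of unevaluated additions builds up.
tuned :: Int -> Double
tuned n = go 1 0
  where
    go :: Int -> Double -> Double
    go k acc
      | k > n     = acc
      | otherwise =
          let x    = fromIntegral k :: Double
              acc' = acc + 1 / (x * (x + 1))
          in acc' `seq` go (k + 1) acc'

main :: IO ()
main = do
  print (idiomatic 100000)
  print (tuned 100000)
```

Timing both versions under each implementation would show how close a
compiler gets (1) to (2) without help from the programmer.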

Cheers,

D.

--
Dougal Stanton

Felipe Almeida Lessa

Feb 19, 2007, 6:57:59 AM

to Donald Bruce Stewart

On 2/19/07, Donald Bruce Stewart <do...@cse.unsw.edu.au> wrote:
> results are quite interesting. The most recent run is available:
>
> http://www.cse.unsw.edu.au/~dons/nobench/bench.results
> http://www.cse.unsw.edu.au/~dons/nobench/bench.log

Maybe I'm missing something, but how can ghci beat ghc (on pidigits)?

BTW, nice compilation of tests =).

--
Felipe.

Donald Bruce Stewart

Feb 19, 2007, 7:56:52 AM

to Felipe Almeida Lessa

felipe.lessa:

> On 2/19/07, Donald Bruce Stewart <do...@cse.unsw.edu.au> wrote:
> >results are quite interesting. The most recent run is available:
> >
> > http://www.cse.unsw.edu.au/~dons/nobench/bench.results
> > http://www.cse.unsw.edu.au/~dons/nobench/bench.log
>
> Maybe I'm missing something, but how can ghci beat ghc (on pidigits)?
>
> BTW, nice compilation of tests =).

As far as I can see, this benchmark relies solely on how fast gmp is.
There's very little overhead other than that. More investigation
required though.
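
To illustrate why a program of this kind can be dominated by bignum
arithmetic, here is a sketch of a streaming pi spigot (Gibbons'
unbounded spigot, which is not necessarily the code nobench runs). Every
step is Integer multiplication and division, so on GHC the bulk of the
work would happen inside the Integer library rather than in compiled or
interpreted Haskell code.

```haskell
module Main where

-- Unbounded stream of the decimal digits of pi. All six state components
-- are arbitrary-precision Integers and grow as more digits are produced.
piDigits :: [Integer]
piDigits = go (1, 0, 1, 1, 3, 3)
  where
    go (q, r, t, k, n, l)
      | 4 * q + r - t < n * t =
          -- The next digit is certain: emit it and rescale the state.
          n : go (10 * q, 10 * (r - n * t), t, k,
                  (10 * (3 * q + r)) `div` t - 10 * n, l)
      | otherwise =
          -- Not yet certain: fold in one more term of the series.
          go (q * k, (2 * q + r) * l, t * l, k + 1,
              (q * (7 * k + 2) + r * l) `div` (t * l), l + 2)

main :: IO ()
main = putStrLn (concatMap show (take 30 piDigits))
```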

-- Don

Donald Bruce Stewart

Feb 19, 2007, 7:57:58 AM

to haskel...@haskell.org

ithika:

> Quoth Ketil Malde, nevermore,
> >
> > Wouldn't it be better to benchmark a more idiomatically correct codebase?
>
>
> I suppose the ideal way to do it would be benchmarks for the (1) idiomatic
> and (2) the highly tuned implementations. Then the compiler writers can
> push 1 towards 2, while the pesky shootout implementers can move the
> goalposts of 2. ;-)
>
> In reality this may just foster a small set of horribly specialised
> optimisers in the compilers, with little benefit for real-world usage. :-(
>

I think more likely, and hopefully, we'll use this to check that things
aren't getting worse from release to release.

-- Don

Donald Bruce Stewart

Feb 19, 2007, 8:04:12 AM

to Ketil Malde

Ketil.Malde:

> Donald Bruce Stewart wrote:
> >Following recent discussion about a cross-implementation performance
> >benchmark suite, based on nofib, I've gone and combined nofib with the
> >great language shootout programs, and rewritten the build system to
> >support cross implementation measurements.
> >
> Great work!
>
> ..but I wonder if the shootout is really the kind of code that is ideal
> for compiler benchmark. Typically (at least based on what I've seen
> of the submissions) they tend to be fairly heavily tuned, using
> optimizations that are a) obfuscating the code and b) tuned specifically
> for GHC.

They exercise the pointy end of things. Specifically, mutable arrays,
double precision math and bytestrings. Stuff we don't have tests for in
nofib, that has performed poorly in the past (till we noticed it on the
shootout..). This kind of code does get written in practice (and when it
is written, it is usually because it needs to be fast).

So I think the few that were added are useful.

More programs in the 'real' category could be contributed, though.

-- Don

Matthew Naylor

Feb 19, 2007, 3:20:25 PM

to haskel...@haskell.org

Hi all,

> GHC v Hugs v Yhc v NHC v ...

... Hacle & Clean!

I shoved 5 of the benchmarks that Donald used through Hacle, and
compiled the outputs using version 2.1 of the Clean compiler. Results
are below.

As for the other examples, Hacle doesn't like non-Haskell98 and
translates arbitrary-precision integers to fixed-precision ones (!)

I'm not sure how well Hacle would work with nobench because input
files must be unambiguously-typed assuming a "default ()" at the top.
So some programs may require a little tweaking to go through. Mind,
this was only a problem on 1 of the 5 programs I just tried...
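
As a sketch of what that restriction means in practice: with
"default ()" in force the compiler no longer picks Integer or Double for
an otherwise unconstrained numeric literal, so such literals need
explicit types. The function square is a hypothetical example, not one
of the nobench programs.

```haskell
module Main where

-- Switch off numeric defaulting for this module.
default ()

-- The signature fixes the type of every literal passed to square.
square :: Int -> Int
square x = x * x

main :: IO ()
main = do
  -- print (2 ^ 10) would be rejected here: without defaulting, the
  -- types of 2 and 10 are ambiguous.
  print ((2 :: Int) ^ (10 :: Int))
  print (square 12)
```

The tweaking needed is typically of this kind: adding a type signature
or annotation wherever the original program leaned on defaulting.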

Matt.

(Note: ignore the "65536" at the end of each Clean result -- my fault
for not compiling with the right options)

===================================================================
binarytrees (GHC)
===================================================================
stretch tree of depth 17 check: -1
131072 trees of depth 4 check: -131072
32768 trees of depth 6 check: -32768
8192 trees of depth 8 check: -8192
2048 trees of depth 10 check: -2048
512 trees of depth 12 check: -512
128 trees of depth 14 check: -128
32 trees of depth 16 check: -32
long lived tree of depth 16 check: -1

real 0m3.301s
user 0m3.280s
sys 0m0.016s
===================================================================
binarytrees (Clean)
===================================================================
Execution: 2.34 Garbage collection: 0.25 Total: 2.59
stretch tree of depth 17 check: -1
131072 trees of depth 4 check: -131072
32768 trees of depth 6 check: -32768
8192 trees of depth 8 check: -8192
2048 trees of depth 10 check: -2048
512 trees of depth 12 check: -512
128 trees of depth 14 check: -128
32 trees of depth 16 check: -32
long lived tree of depth 16 check: -1
65536

real 0m2.691s
user 0m2.592s
sys 0m0.100s
===================================================================
partial sums (GHC)
===================================================================
2.9999999999999987 (2/3)^k
3160.817621887086 k^-0.5
0.9999996000002026 1/k(k+1)
30.31454150956248 Flint Hills
42.99523399808393 Cookson Hills
15.30901715473893 Harmonic
1.644933666848388 Riemann Zeta
0.6931469805600944 Alternating Harmonic
0.7853980633974358 Gregory

real 0m4.887s
user 0m4.888s
sys 0m0.000s
===================================================================
partial sums (Clean)
===================================================================
Execution: 4.41 Garbage collection: 0.05 Total: 4.46
3 (2/3)^k
3160.81762188709 k^-0.5
0.999999600000203 1/k(k+1)
30.3145415095625 Flint Hills
42.9952339980839 Cookson Hills
15.3090171547389 Harmonic
1.64493366684839 Riemann Zeta
0.693146980560094 Alternating Harmonic
0.785398063397435 Gregory
65536

real 0m4.545s
user 0m4.468s
sys 0m0.076s
===================================================================
queens (GHC)
===================================================================
14200

real 0m1.990s
user 0m1.980s
sys 0m0.012s
===================================================================
queens (Clean)
===================================================================
Execution: 6.58 Garbage collection: 1.07 Total: 7.65
14200
65536

real 0m7.921s
user 0m7.656s
sys 0m0.264s
===================================================================
recursive (GHC)
===================================================================
Ack(3,9): 4093
Fib(36.0): 2.4157817e7
Tak(24,16,8): 9
Fib(3): 3
Tak(3.0,2.0,1.0): 2.0

real 0m5.232s
user 0m5.224s
sys 0m0.008s
===================================================================
recursive (Clean)
===================================================================
Execution: 2.40 Garbage collection: 0.00 Total: 2.40
Ack(3,9): 4093
Fib(36): 24157817
Tak(24,16,8): 9
Fib(3): 3
Tak(3,2,1): 2
65536

real 0m2.403s
user 0m2.400s
sys 0m0.000s
===================================================================
loop (GHC)
===================================================================
3.3333333333333335

real 0m1.039s
user 0m1.036s
sys 0m0.004s
===================================================================
loop (Clean)
===================================================================
Execution: 1.26 Garbage collection: 0.00 Total: 1.26
3.33333333333333
65536

real 0m1.325s
user 0m1.260s
sys 0m0.068s
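
To make the comparison easier to read, here is a small sketch that
turns the "real" times above into GHC/Clean ratios. The figures are
copied from the time output in this message; the names timings and
ratio are just for illustration.

```haskell
module Main where

-- (benchmark, GHC real seconds, Clean real seconds), copied from the
-- "real" lines of the time output above.
timings :: [(String, Double, Double)]
timings =
  [ ("binarytrees",  3.301, 2.691)
  , ("partial sums", 4.887, 4.545)
  , ("queens",       1.990, 7.921)
  , ("recursive",    5.232, 2.403)
  , ("loop",         1.039, 1.325)
  ]

-- A ratio above 1 means the Clean build finished sooner than the GHC one.
ratio :: (String, Double, Double) -> (String, Double)
ratio (name, ghc, clean) = (name, ghc / clean)

main :: IO ()
main = mapM_ report (map ratio timings)
  where
    report (name, r) = putStrLn (name ++ ": GHC/Clean = " ++ show r)
```

On these single runs GHC is ahead on queens and loop, and Clean is ahead
on binarytrees, partial sums and recursive.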

Stefan O'Rear

Feb 19, 2007, 3:34:39 PM

to Matthew Naylor

On Mon, Feb 19, 2007 at 08:12:14PM +0000, Matthew Naylor wrote:
> Hi all,
>
> > GHC v Hugs v Yhc v NHC v ...
>
> ... Hacle & Clean!
>
> I shoved 5 of the benchmarks that Donald used through Hacle, and
> compiled the outputs using version 2.1 of the Clean compiler. Results
> are below.

Submit a patch, it's easy! Took me <10 minutes to add YHC support
and send it in. (the reason my name isn't in darcs changes is because
dons' X crashed, killing darcs, irreparably corrupting _darcs, so he
had to rm -r _darcs ; darcs init)

Just edit header.mk and footer.mk in the obvious way.

> As for the other examples, Hacle doesn't like non-Haskell98 and
> translates arbitrary-precision integers to fixed-precision ones (!)

Don't worry, nobench is based on a testsuite and as such is prepared to diff
output. (if that doesn't happen, I'd consider it a bug)

> I'm not sure how well Hacle would work with nobench because input
> files must be unambiguously-typed assuming a "default ()" at the top.
> So some programs may require a little tweaking to go through. Mind,
> this was only a problem on 1 of the 5 programs I just tried...

Well, he was willing to make concessions for Yhc brokenness (wrt importing
System.Environment - yhc's System doesn't export getArgs like the Report
says it should (first tangible result of nofib: the Yhc team has fixed it))

And don't worry about adding dependencies - you can remove compilers you don't
have by editing the COMPILERS = line in header.mk.

Stefan

Neil Mitchell

Feb 19, 2007, 5:12:53 PM

to Stefan O'Rear

Hi

> Well, he was willing to make concessions for Yhc brokenness (wrt importing
> System.Environment - yhc's System doesn't export getArgs like the Report
> says it should (first tangible result of nofib: the Yhc team has fixed it))

The second tangible result should be that Yhc runs faster than nhc.
Our internal testing originally showed a 20% speedup over nhc -
something seems to have gone wrong to slow down Yhc, so we are working
to fix this. Hopefully in a few days Yhc will beat nhc - just in case
anyone is drawing performance ideas from the current benchmark.

Thanks

Neil

David House

Feb 19, 2007, 5:32:28 PM

to Neil Mitchell

On 19/02/07, Neil Mitchell <ndmit...@gmail.com> wrote:
> The second tangible result should be that Yhc runs faster than nhc.
> Our internal testing originally showed a 20% speedup over nhc -
> something seems to have gone wrong to slow down Yhc, so we are working
> to fix this. Hopefully in a few days Yhc will beat nhc - just in case
> anyone is drawing performance ideas from the current benchmark.

Great! Nothing like a bit of competition to spur coding into action! :)

Nice work, dons.

--
-David House, dmh...@gmail.com

Bulat Ziganshin

Feb 20, 2007, 5:17:46 AM

to Dougal Stanton

Hello Dougal,

Monday, February 19, 2007, 3:02:30 PM, you wrote:
> I suppose the ideal way to do it would be benchmarks for the (1) idiomatic
> and (2) the highly tuned implementations. Then the compiler writers can
> push 1 towards 2, while the pesky shootout implementers can move the
> goalposts of 2. ;-)

> In reality this may just foster a small set of horribly specialised
> optimisers in the compilers, with little benefit for real-world usage. :-(

I disagree. When I write a general-purpose library, I use these
optimization tricks to make the library as fast as possible, and a wide
audience of library users benefits from such low-level optimizations.

A great example of such a low-level optimized library is ByteString,
which provides close-to-C speed with a very high-level interface.

--
Best regards,
Bulat mailto:Bulat.Z...@gmail.com