Hello all,
Running "benchmark.hs" as a validation step before checking in was taking a while, so it can now do its work in parallel.
If you compile that script and run it with something like "./benchmark.run --par +RTS -N32", it will compile the benchmarks in parallel (and run them in parallel too, if it is in testing rather than benchmarking mode, which is selected with SHORTRUN=1).
On 32 threads I see it taking only ~33 seconds to compile and run both the serial and threaded versions of all the benchmarks under all 86 configurations.
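For anyone curious about the general shape of this kind of parallel driver, here is a minimal sketch using only the base library. It is not the code in benchmark.hs: the helper name `runBounded`, the limit of 4, and the stand-in jobs are all hypothetical, and a real driver would launch the compiler or a benchmark binary where the sketch just squares a number.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Concurrent.QSem (newQSem, signalQSem, waitQSem)
import Control.Exception (bracket_)

-- Run every action with at most `limit` of them in flight at once,
-- returning the results in the same order as the input list.
-- Hypothetical helper for illustration; not taken from benchmark.hs.
-- Caveat: a job that throws never fills its MVar, so the caller's
-- takeMVar would not complete normally. A real driver should catch
-- exceptions inside the forked thread.
runBounded :: Int -> [IO a] -> IO [a]
runBounded limit jobs = do
  sem <- newQSem limit
  vars <- mapM (spawn sem) jobs
  mapM takeMVar vars
  where
    spawn sem job = do
      v <- newEmptyMVar
      -- bracket_ releases the semaphore slot even if the job fails.
      _ <- forkIO (bracket_ (waitQSem sem) (signalQSem sem) job >>= putMVar v)
      return v

main :: IO ()
main = do
  -- Stand-in jobs (assumption): each one would really be a compile or a run.
  results <- runBounded 4 [ return (n * n) | n <- [1 .. 10 :: Int] ]
  print results
```

To actually use more than one core, a program like this has to be built with -threaded and started with +RTS -N<n>, the same way as the command above.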
Cheers,
-Ryan
P.S. On a related note, there is also a "clusterbench.hs" script for doing a parameter exploration (e.g. -A256K ... -A2M) over a cluster of homogeneous multicore machines. This really speeds things up. Now we just need to build the scripts for data-mining the results! And then we'll have a decent answer as to the role of GHC compile-time and runtime parameters in performance.
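As a rough illustration of what such a parameter sweep looks like, here is a small sketch that generates RTS argument lists for a range of -A values. It is not taken from clusterbench.hs: the doubling step between 256K and 2M is an assumption (the range above does not say which intermediate values are used), and the names `allocSizesK`, `allocFlag` and `rtsArgs` are made up for the example.

```haskell
-- Allocation-area sizes in kilobytes, doubling from 256K up to 2M (2048K).
-- The doubling schedule is an assumption made for this sketch.
allocSizesK :: [Int]
allocSizesK = takeWhile (<= 2048) (iterate (* 2) 256)

-- Render one size as an RTS -A flag, using M for whole megabytes.
allocFlag :: Int -> String
allocFlag k
  | k `mod` 1024 == 0 = "-A" ++ show (k `div` 1024) ++ "M"
  | otherwise         = "-A" ++ show k ++ "K"

-- RTS arguments for one run: a thread count plus one allocation-area size.
rtsArgs :: Int -> Int -> [String]
rtsArgs threads k = ["+RTS", "-N" ++ show threads, allocFlag k, "-RTS"]

main :: IO ()
main = mapM_ (putStrLn . unwords . rtsArgs 32) allocSizesK
-- prints:
--   +RTS -N32 -A256K -RTS
--   +RTS -N32 -A512K -RTS
--   +RTS -N32 -A1M -RTS
--   +RTS -N32 -A2M -RTS
```

Each line is one point in the sweep; a cluster driver would hand those argument lists out to the worker machines.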