Upcoming Shootout in Feb 2010


Shri

Feb 9, 2010, 3:44:55 PM
to Ruby Benchmark Suite
Hi Antonio,

You mentioned that you were going to do a shootout. I wanted to let
you know the IronRuby release timeline. RC1 is currently out; it was
released last November. RC2 should be coming out sometime this week.
Our dev Tomas has been recently working on improving performance as we
get closer to RTM, but his changes will not be in RC2. The changes
will be in the next release which would be in early March. Your
options are to build from the github source (we can also get you a
build), wait for the next release (preferable from our point of view
as it will be more representative of what V1 RTM will look like), or
just go with RC1 or RC2 (which is the simplest for you, but it will be
out of date for IronRuby in a month or so). You can decide what works
for you.

Also, in the thread at http://groups.google.com/group/ruby-benchmark-suite/t/423505af9b64447,
I had asked you to use the -X:NoAdaptiveCompilation command line
option. You should not need to use it if you pull in the patch at
http://github.com/shri/ruby-benchmark-suite/commit/8b29171952dc6e4c1ff3973a581490d6cf939f60
which adds a warmup phase.
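
For readers unfamiliar with the idea, a warmup phase could look roughly like the sketch below. This is not the code from the linked patch; `measure_with_warmup` and its `warmup_runs` and `timed_runs` parameters are hypothetical names chosen for illustration. It only shows the general shape: a few untimed runs first, then the timed ones.

```ruby
require "benchmark"

# Hypothetical helper (not taken from the linked patch): call the block
# a few times untimed so that adaptive or JIT compilation can settle,
# then time the remaining calls.
def measure_with_warmup(warmup_runs: 2, timed_runs: 5)
  warmup_runs.times { yield }                            # warmup, results discarded
  Array.new(timed_runs) { Benchmark.realtime { yield } } # wall-clock seconds per run
end

timings = measure_with_warmup { (1..50_000).reduce(:+) }
puts "fastest timed run: #{timings.min} s"
```

How many warmup runs are enough depends on the implementation, so that number would need to be checked per runtime rather than assumed.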

Also, in http://antoniocangiano.com/2009/08/03/performance-of-ironruby-ruby-on-windows/,
you mention you are running on a virtual machine. Do you know if the
virtual machine is using two processor cores? You can check if the
Performance pane in Task Manager shows two graphs. IronRuby is
optimized for multiple processors and tries to do compilation on a
background thread. If the virtual machine is not using two cores, it
could degrade performance a bit.

Regards
Shri

Antonio Cangiano

Feb 9, 2010, 3:57:17 PM
to ruby-bench...@googlegroups.com
On Tue, Feb 9, 2010 at 3:44 PM, Shri <Shri....@microsoft.com> wrote:
Your options are to build from the github source (we can also get you a
build), wait for the next release (preferable from our point of view
as it will be more representative of what V1 RTM will look like), or
just go with RC1 or RC2 (which is the simplest for you, but it will be
out of date for IronRuby in a month or so). You can decide what works
for you.

I can wait until March.

Also, in the thread at http://groups.google.com/group/ruby-benchmark-suite/t/423505af9b64447,
I had asked you to use the -X:NoAdaptiveCompilation command line
option. You should not need to use it if you pull in the patch at
http://github.com/shri/ruby-benchmark-suite/commit/8b29171952dc6e4c1ff3973a581490d6cf939f60
which adds a warmup phase.

Noted.
 
Also, in http://antoniocangiano.com/2009/08/03/performance-of-ironruby-ruby-on-windows/,
you mention you are running on a virtual machine. Do you know if the
virtual machine is using two processor cores?

Yes, both cores were used. However, this time around I may simply use a quad-core machine with 8 GB of RAM (i.e., no VMs).
--
http://ThinkCode.TV - Screencasts and programming video courses
http://antoniocangiano.com - Zen and the Art of Programming
http://math-blog.com - Mathematics is wonderful!
Follow me on Twitter: http://twitter.com/acangiano
Author of "Ruby on Rails for Microsoft Developers" (Wrox, 2009)

Shri Borde

Feb 9, 2010, 4:07:49 PM
to ruby-bench...@googlegroups.com
> I can wait until March.
Great!
 

Monty Williams

Feb 9, 2010, 5:19:01 PM
to ruby-bench...@googlegroups.com
Hi Antonio,

We're switching from using the pure Ruby YAML to an FFI based C implementation soon. That should get us past the YAML emitter problems we had in the RDOC benchmarks.

-- Monty
--
The GitHub project is located at http://github.com/acangiano/ruby-benchmark-suite
 
You received this message because you are subscribed to the Google
Groups "Ruby Benchmark Suite" group.
To post to this group, send email to
ruby-bench...@googlegroups.com
To unsubscribe from this group, send email to
ruby-benchmark-s...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/ruby-benchmark-suite?hl=en

Isaac Gouy

Feb 27, 2010, 2:23:03 PM
to Ruby Benchmark Suite

On Feb 9, 12:57 pm, Antonio Cangiano <acangi...@gmail.com> wrote:

> I can wait until March.

Meanwhile, I had one or two gripes from the previous shootout which
maybe could be addressed in the upcoming shootout? :-)


1) How to summarize

"a bar chart of the total time requested for the common subset of
successfully executed benchmarks"

This is the tail wagging the dog - the slower implementations time out
on the programs that do more, so then the measurements for programs
that do more are disregarded!?

The programs that do more, tell us more - so strive to keep those
measurements in the summary.


2) How to summarize

With all due credit to M. Edward (Ed) Borasky - he's right, boxplots
are the sensible way to present the measurements from this kind of
comparison.

Over the last 18 months, boxplots have been the summary presentation
shown on the benchmarks game - pretty much without confusion or
complaint.


3) How to summarize

Geometric mean - okay this is just #1 and #2 all over again :-)

You know the geometric mean can be calculated without reducing the
data to "the common subset of successfully executed benchmarks".

Any concern that outliers might unduly affect the geometric mean can
be addressed by using more robust descriptive statistics - median,
quartiles, boxplots.
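
As a rough illustration of points 1 to 3, the sketch below computes a geometric mean and quartiles per implementation while skipping only that implementation's missing measurements, instead of dropping a benchmark for everyone. The ratios are invented toy numbers, `impl_a` and `impl_b` are placeholder names, and the quartile rule (median of the lower and upper halves) is just one of several common conventions.

```ruby
# Geometric mean computed via logarithms. nil entries (a timeout or an
# error for this implementation) are skipped rather than forcing the
# benchmark to be dropped for every implementation.
def geometric_mean(values)
  xs = values.compact
  Math.exp(xs.sum { |x| Math.log(x) } / xs.size)
end

def median(values)
  xs = values.compact.sort
  mid = xs.size / 2
  xs.size.odd? ? xs[mid] : (xs[mid - 1] + xs[mid]) / 2.0
end

# Quartiles as [Q1, median, Q3], using the median of the lower and
# upper halves (the middle element is excluded when the count is odd).
def quartiles(values)
  xs = values.compact.sort
  half = xs.size / 2
  [median(xs.first(half)), median(xs), median(xs.last(half))]
end

# Invented toy data: time relative to a baseline, nil = no measurement.
ratios = {
  "impl_a" => [1.0, 2.0, 4.0, 8.0],
  "impl_b" => [0.5, nil, 2.0, 32.0]
}

ratios.each do |name, values|
  puts "#{name}: geometric mean #{geometric_mean(values).round(2)}, quartiles #{quartiles(values).inspect}"
end
```

The three quartile values, together with the minimum and maximum, are what a boxplot would draw for each implementation.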

best wishes, Isaac

Antonio Cangiano

Feb 27, 2010, 3:20:42 PM
to ruby-bench...@googlegroups.com
On Sat, Feb 27, 2010 at 2:23 PM, Isaac Gouy <igo...@yahoo.com> wrote:
This is the tail wagging the dog - the slower implementations time out
on the programs that do more, so then the measurements for programs
that do more are disregarded!?

The programs that do more, tell us more - so strive to keep those
measurements in the summary

Timeouts are already included in the total; only tests where one of the implementations raised an error are excluded. But I fully agree: the total times are misleading and virtually meaningless, and they won't be included in future shootouts.

With all due credit to M. Edward (Ed) Borasky - he's right, boxplots
are the sensible way to present the measurements from this kind of
comparison.

Over the last 18 months, boxplots have been the summary presentation
shown on the benchmarks game - pretty much without confusion or
complaint.

Boxplots are definitely the way to go, statistically speaking. We need to script this or, even better, switch the RBS to the code behind the languages game (assuming it automates running the tests and creating the report).

This allows me to introduce the much bigger elephant in the room: the current suite has a major fault that has to do with the use of Ruby. The way it currently works is by running a large number of Ruby tests in sequence. These tests are run through Rake. In the past I noticed two issues with this:
  • While the tests are running, the memory taken up by the Rake process becomes very large.
  • The tests are run in such quick succession that most Ruby implementations don't even have time to deallocate objects before the next test starts. Because of this, test B could show very poor performance simply because test A consumed a lot of memory and the implementation's GC is rather slow.
Here is what I propose we do:
  1. Switch the shootout over to the languages game system.
  2. Limit the number of benchmarks to the major ones.
  3. Ask the community to help us out with more real world examples.
  4. The repository will still have the micro tests that are used by implementers, in a legacy folder, but for the shootout they won't be used.
Thoughts?
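
One way to keep test A's garbage from affecting test B, whatever harness ends up being used, is to give every benchmark its own interpreter process. The following is only a sketch under that assumption: `time_in_fresh_process` is a hypothetical helper, and the `benchmarks/*.rb` layout is made up for illustration.

```ruby
require "rbconfig"
require "benchmark"

# Hypothetical runner: each benchmark script gets its own interpreter
# process, so memory left behind by one benchmark cannot slow the next.
# Note that the measured time includes interpreter startup.
def time_in_fresh_process(script_path, ruby: RbConfig.ruby)
  Benchmark.realtime do
    ok = system(ruby, script_path)
    raise "benchmark failed: #{script_path}" unless ok
  end
end

# Assumed directory layout; does nothing if no files match.
Dir.glob("benchmarks/*.rb").sort.each do |path|
  puts format("%-40s %.3f s", path, time_in_fresh_process(path))
end
```

Passing a different `ruby:` path would let the same loop drive another implementation's executable.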

Isaac Gouy

Feb 27, 2010, 4:57:57 PM
to Ruby Benchmark Suite

On Feb 27, 12:20 pm, Antonio Cangiano <acangi...@gmail.com> wrote:

> Here is what I propose we do:
>

>    1. Switch the shootout over to the languages game system.

That ought to reveal some bugs in the benchmarks game scripts.

From an outsider perspective I would guess there might be some push-
back at the idea of using Python scripts for Ruby benchmarking :-)


1) The "bencher" Python measurement scripts used for the benchmarks
game are packaged and available from the help page -

http://shootout.alioth.debian.org/help.php#languagex


2) The analysis and presentation for the benchmarks game website is
done on-the-fly with PHP.

I would guess that your analysis and presentation needs are somewhat
different - and the measurements could easily be munged into a form
that would allow analysis and presentation with some stats tool.

For example, multiple timings of the same program are accumulated in
compressed files, the benchmarks game grabs the fastest measurement -
but you might choose the median.
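
To make the difference concrete, here is a small sketch of the two choices applied to repeated timings of one program. `summarize` is a hypothetical helper and the timings are invented; it does not reflect how the benchmarks game scripts store or read their measurements.

```ruby
# Summarize repeated timings of one program two ways: the fastest run
# and the median run.
def summarize(timings)
  sorted = timings.sort
  mid = sorted.size / 2
  median = sorted.size.odd? ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2.0
  { fastest: sorted.first, median: median }
end

# Invented timings in seconds; one run was disturbed by something else.
p summarize([2.10, 2.05, 9.80, 2.07, 2.12])
```

The fastest run estimates the best case, while the median describes a typical run and is barely moved by a single disturbed measurement.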


Isaac Gouy

Feb 27, 2010, 6:00:04 PM
to Ruby Benchmark Suite

On Feb 27, 1:57 pm, Isaac Gouy <igo...@yahoo.com> wrote:

> 2) The analysis and presentation for the benchmarks game website is
> done on-the-fly with PHP.
>
> I would guess that your analysis and presentation needs are somewhat
> different - and the measurements could easily be munged into a form
> that would allow analysis and presentation with some stats tool.
>
> For example, multiple timings of the same program are accumulated in
> compressed files, the benchmarks game grabs the fastest measurement -
> but you might choose the median.


Now I think about it, a really lazy approach would be to run the PHP
scripts on a local webserver and grab medians, quartiles, and outliers
straight out of the table -

http://shootout.alioth.debian.org/u32/which-programming-languages-are-fastest.php

(Those PHP analysis and presentation scripts aren't packaged, they are
available from CVS https://alioth.debian.org/scm/viewvc.php/shootout/website/?root=shootout
)

I'm sure you can make some pretty charts once you have the numbers ;-)

Isaac Gouy

Feb 27, 2010, 8:40:52 PM
to Ruby Benchmark Suite

On Feb 27, 12:20 pm, Antonio Cangiano <acangi...@gmail.com> wrote:

>    1. Switch the shootout over to the languages game system.

iirc the previous objection to doing this was that the benchmarks game
measurement scripts are symlink happy linux scripts - not for windows.

Antonio Cangiano

Mar 1, 2010, 2:04:35 AM
to ruby-bench...@googlegroups.com
On Sat, Feb 27, 2010 at 8:40 PM, Isaac Gouy <igo...@yahoo.com> wrote:
iirc the previous objection to doing this was that the benchmarks game
measurement scripts are symlink happy linux scripts - not for windows.

This is, indeed, an obstacle. Windows and Mac OS X are part of the game and here to stay.

Monty Williams

Mar 1, 2010, 10:36:36 AM
to ruby-bench...@googlegroups.com
With the ubiquity of Windows it seems like someone somewhere might be working on getting these running on Windows. 

However, mentions of running on Windows in their forums are pretty ancient (2005). 

It's been years since I used Windows, but is Cygwin or mingw any help?

-- Monty




M. Edward (Ed) Borasky

Mar 1, 2010, 10:49:37 AM
to ruby-bench...@googlegroups.com, Monty Williams
On 03/01/2010 07:36 AM, Monty Williams wrote:
> With the ubiquity of Windows it seems like someone somewhere might be working on getting these running on Windows.
>
>
> However, mentions of running on Windows in their forums are pretty ancient (2005).
>
>
> It's been years since I used Windows, but is Cygwin or mingw any help?
>
>
> -- Monty

I don't even know what the status of *Ruby* is on Windows, let alone
Cygwin, Mingw, Python, etc. I came to the conclusion that if I wanted
Windows/Mac/Linux Ruby compatibility and an IDE, my options were
NetBeans / jRuby. That's one of the reasons I spend so much time these
days in ActiveState Perl and Komodo. ;-)

--
M. Edward (Ed) Borasky
http://borasky-research.net/smart-at-znmeb

"A mathematician is a device for turning coffee into theorems." ~ Paul Erdős

Isaac Gouy

Mar 1, 2010, 12:19:22 PM
to Ruby Benchmark Suite

On Mar 1, 7:36 am, Monty Williams <monty.willi...@gemstone.com> wrote:
> With the ubiquity of Windows it seems like someone somewhere might be working on getting these running on Windows.
>
> However, mentions of running on Windows in their forums are pretty ancient (2005).


Back then the measurement scripts were a re-write of Doug Bagley's
original Perl scripts ;-)

When I wrote the Python scripts from scratch, I had no intention of
making them cross-platform - so many of the language tools on MS
Windows were either commercial or had license conditions that made it
unclear if publishing benchmarks was acceptable.


Isaac Gouy

Mar 1, 2010, 12:32:36 PM
to Ruby Benchmark Suite

On Feb 28, 11:04 pm, Antonio Cangiano <acangi...@gmail.com> wrote:
> On Sat, Feb 27, 2010 at 8:40 PM, Isaac Gouy <igo...@yahoo.com> wrote:
> > iirc the previous objection to doing this was that the benchmarks game
> > measurement scripts are symlink happy linux scripts - not for windows.
>
> This is, indeed, an obstacle. Windows and Mac OS X are part of the game and
> here to stay.

Well, Mac OS X is kind of 'nix - so it's not obvious to me that the
scripts wouldn't work.

Given your "bigger elephant in the room" comment, I'd suggest there's
actually value in using the Python script to cross-check the
measurements you make on Linux.

Evan Phoenix

Mar 1, 2010, 12:43:31 PM
to ruby-bench...@googlegroups.com
Hi everyone,

Wanted to quickly chime in. I'm happy to help build up the Rubinius tiered benchmarks, and we can use them if we'd like.

My other comment is that I've recently been writing benchmarks using an iterations/sec technique. This is great because it eliminates the need to tune an iteration count in each benchmark and it allows all impls to run the benchmark in the same amount of time. That last point means it solves the timeout problem: slow impls just show up as slow.

The only downside of this technique is that it requires Thread. Thankfully, though, the thread simply needs to run. No exact timing is needed from the Thread, because the time difference is still calculated and used. The code for this is here: http://github.com/evanphx/rubinius/blob/master/benchmark/core/cps.rb and you can see an example of it being used in http://github.com/evanphx/rubinius/blob/master/benchmark/core/methods/string/aref_op.rb

As you can see, I've been using this technique to tune String, and it has proved much nicer to use than the purely iteration-bound benchmarks we've all been writing up to now.

Anyway, my point is that it seems like using the iterations/sec technique would provide better data and that RBS should consider using it.
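
For anyone who doesn't want to read the linked files, the general shape of an iterations/sec benchmark can be sketched as below. This is not the code from cps.rb; `iterations_per_second` is a hypothetical helper written only to illustrate the idea of a timer thread plus a measured elapsed time.

```ruby
# Hypothetical helper, not the cps.rb implementation: a timer thread
# flips a flag after roughly `duration` seconds while the main thread
# counts how many times the block ran.
def iterations_per_second(duration)
  should_run = true
  timer = Thread.new do
    sleep duration
    should_run = false
  end

  iterations = 0
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  while should_run
    yield
    iterations += 1
  end
  # The elapsed time is measured directly, so the timer thread does not
  # have to wake up at exactly the right moment.
  elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
  timer.join
  iterations / elapsed
end

rate = iterations_per_second(0.2) { "hello world"[3] }
puts format("%.0f iterations/sec", rate)
```

In this particular sketch the loop always finishes the iteration that is in flight, so a block that takes longer than `duration` is reported as a fractional rate.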

Thoughts?

- Evan


Isaac Gouy

Mar 1, 2010, 1:43:25 PM
to Ruby Benchmark Suite
On Mar 1, 9:43 am, Evan Phoenix <e...@fallingsnow.net> wrote:
-snip-

> My other comment is that I've recently been writing benchmarks using an iterations/sec technique.

aka throughput benchmarks

> This is great because it eliminates the need to tune an iteration count in each benchmark and it allows all impls to run the benchmark in the same amount of time. That last point means it solves the timeout problem, slow impls just show up as slow.

Ummm ... assuming that every benchmark completes within that "the same
amount of time", otherwise there would be 0 iterations/sec for some
impls ?

Evan Phoenix

Mar 1, 2010, 1:48:12 PM
to ruby-bench...@googlegroups.com

Presumably the benchmark shows up as 0 iterations/sec, yes. The code could easily be tweaked to allow for benchmarks that generally take more than 1 second per iteration, by increasing the time and changing the units:

set_units :minutes
set_time 5

def Bench.run
  i = 0
  while @should_run
    something_that_takes_a_few_seconds
    i += 1
  end

  @iterations = i
end

Fast impls again just have a much larger i/s number.

Isaac Gouy

Mar 1, 2010, 3:06:25 PM
to Ruby Benchmark Suite

On Mar 1, 10:48 am, Evan Phoenix <e...@fallingsnow.net> wrote:
-snip-

> Presumably the benchmark shows up as 0 iterations/sec, yes. The code could easily be tweaked to allow for benchmarks that generally take more than 1 second per iteration, by increasing the time and changing the units

Rather than "eliminates the need to tune an iteration count in each
benchmark" it morphs into the need to tune the time and time units in
each benchmark.

I think for some tasks throughput benchmarks seem like a natural
representation while for others they are a weird mismatch.

We know that processing 10,000 small files probably isn't going to be
like processing a 10,000 times larger file once - but with throughput
benchmarks we suggest it is.

Isaac Gouy

Mar 6, 2010, 11:32:06 PM
to Ruby Benchmark Suite

On Mar 1, 7:36 am, Monty Williams <monty.willi...@gemstone.com> wrote:

> With the ubiquity of Windows it seems like someone somewhere might be working on getting these running on Windows.
>
> However, mentions of running on Windows in their forums are pretty ancient (2005).
>
> It's been years since I used Windows, but is Cygwin or mingw any help?

"Python Win32 Extensions" helped.
"GNU Make for Windows" helped.
"GNU DiffUtils for Windows" helped.

The re-written script does now seem to give CPU and Elapsed times, and
Peak Working Set on XP and Vista.

> ----- Original Message -----
> From: "Antonio Cangiano" <acangi...@gmail.com>
> To: ruby-bench...@googlegroups.com
> Sent: Sunday, February 28, 2010 11:04:35 PM GMT -08:00 US/Canada Pacific
> Subject: Re: [RBS] Re: Upcoming Shootout in Feb 2010
>
> On Sat, Feb 27, 2010 at 8:40 PM, Isaac Gouy < igo...@yahoo.com > wrote:
>
> iirc the previous objection to doing this was that the benchmarks game
> measurement scripts are symlink happy linux scripts - not for windows.
> This is, indeed, an obstacle. Windows and Mac OS X are part of the game and here to stay.


Antonio Cangiano

Mar 7, 2010, 12:17:29 AM
to ruby-bench...@googlegroups.com
On Sat, Mar 6, 2010 at 11:32 PM, Isaac Gouy <igo...@yahoo.com> wrote:
"Python Win32 Extensions" helped.
"GNU Make for Windows" helped.
"GNU DiffUtils for Windows" helped.

The re-written script does now seem to give CPU and Elapsed times, and
Peak Working Set on XP and Vista.

Great stuff, Isaac.

Isaac Gouy

Mar 8, 2010, 9:26:50 PM
to Ruby Benchmark Suite

On Mar 6, 9:17 pm, Antonio Cangiano <acangi...@gmail.com> wrote:
> On Sat, Mar 6, 2010 at 11:32 PM, Isaac Gouy <igo...@yahoo.com> wrote:
> > "Python Win32 Extensions" helped.
> > "GNU Make for Windows" helped.
> > "GNU DiffUtils for Windows" helped.
>
> > The re-written script does now seem to give CPU and Elapsed times, and
> > Peak Working Set on XP and Vista.
>
> Great stuff, Isaac.


Well, not as good as on Linux - but maybe good enough.


So far on win32, no CPU load measurement, and an obvious problem
measuring programs that spawn multiple processes - the CPU and memory
use for the child processes is not measured - the elapsed time is
measured.


Notice the strange CPU secs measurements for the "binary-trees #6"
and "spectral-norm #5" CPython programs -

http://shootout.alioth.debian.org/demo/measurements.php?lang=python

Notice those programs use the multiprocessing module to spawn
processes -

http://shootout.alioth.debian.org/demo/program.php?test=binarytrees&lang=python&id=6

http://shootout.alioth.debian.org/demo/program.php?test=spectralnorm&lang=python&id=5


Not a problem with Java -

http://shootout.alioth.debian.org/demo/benchmark.php?test=all&lang=java&lang2=javaxint

Isaac Gouy

Mar 11, 2010, 1:31:09 AM
to Ruby Benchmark Suite

On Mar 6, 9:17 pm, Antonio Cangiano <acangi...@gmail.com> wrote:

> On Sat, Mar 6, 2010 at 11:32 PM, Isaac Gouy <igo...@yahoo.com> wrote:
> > "Python Win32 Extensions" helped.
> > "GNU Make for Windows" helped.
> > "GNU DiffUtils for Windows" helped.
>
> > The re-written script does now seem to give CPU and Elapsed times, and
> > Peak Working Set on XP and Vista.
>
> Great stuff, Isaac.


Well, improved enough over the last few days.

On win32 there is CPU load measurement, and for programs that spawn
multiple processes, CPU time and memory use measurement includes
parent and child processes.

http://shootout.alioth.debian.org/demo/program.php?test=spectralnorm&lang=python&id=5
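
For comparison, on POSIX systems Ruby itself exposes the CPU time of waited-for child processes through `Process.times`. The sketch below is unrelated to the Python measurement scripts discussed here; `child_cpu_seconds` is a hypothetical helper, and the assumption is that the platform reports child times at all (behaviour on Windows may differ).

```ruby
require "rbconfig"

# Hypothetical helper: CPU seconds (user + system) consumed by a child
# command, including any of its own children that it waited for.
# Assumes a platform where Process.times reports child times.
def child_cpu_seconds(*command)
  before = Process.times
  system(*command)
  after = Process.times
  (after.cutime - before.cutime) + (after.cstime - before.cstime)
end

cpu = child_cpu_seconds(RbConfig.ruby, "-e", "200_000.times { |i| i * i }")
puts format("child CPU time: %.3f s", cpu)
```

Elapsed time would still have to be measured separately, since CPU time summed over several processes can exceed wall-clock time.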

Roger Pack

Mar 12, 2010, 12:13:38 PM
to ruby-bench...@googlegroups.com
> While the tests are running, the memory taken up by the Rake process becomes
> very large.
> The tests are run so fast that most Ruby implementations don't even have
> time to deallocate objects before the next test is started. Because of this,
> you could have test B show a very poor performance, because test A was
> memory consuming, and the implementation at hand's GC is rather slow.

Have you gotten this with recent tests? I'm interested as to why rake
would grow in size--since all it does is call system("monitor xxx"),
so I thought we had that problem fixed. Does it go up then down for
the next test? Or are you referring to "after the first iteration,
the process size is already large which hinders the later iterations"
per test?
Thanks!
-r

Roger Pack

Mar 12, 2010, 12:16:09 PM
to ruby-bench...@googlegroups.com
>> Your options are to build from the github source (we can also get you a
>> build), wait for the next release (preferable from our point of view
>> as it will be more representative of what V1 RTM will look like), or
>> just go with RC1 or RC2 (which is the simplest for you, but it will be
>> out of date for IronRuby in a month or so). You can decide what works
>> for you.
>
> I can wait until March.

You may want to test jruby trunk (if 1.5.0 isn't out yet). Also of
interest may be using jruby with different parameters, like --fast,
--server, etc.

Of course, while you're at it benchmarking REE with different "garbage
settings" [1] might also be interesting to some.

GL!
-r
[1] http://www.rubyenterpriseedition.com/documentation.html#_garbage_collector_performance_tuning

Antonio Cangiano

Mar 12, 2010, 12:19:46 PM
to ruby-bench...@googlegroups.com
On Fri, Mar 12, 2010 at 12:16 PM, Roger Pack <roger...@gmail.com> wrote:
You may want to test jruby trunk (if 1.5.0 isn't out yet).  Also of
interest may be using jruby with different parameters, like --fast,
--server, etc.

This is always the case (i.e., the best-performing parameters are used).


Isaac Gouy

Mar 24, 2010, 12:48:55 PM
to Ruby Benchmark Suite

On Feb 27, 1:20 pm, Antonio Cangiano <acangi...@gmail.com> wrote:
-snip-


> Here is what I propose we do:
>

>    1. Switch the shootout over to the languages game system.
>    2. Limit the amount of benchmarks only to the major ones.
>    3. Ask the community to help us out with more real world examples.
>    4. The repository will still have the micro tests that are used by


>    implementers, in a legacy folder, but for the shootout they won't be used.


How are you going to select which Ruby versions to measure?

http://www.ruby-lang.org/en/downloads/ no longer seems to encourage
building 1.8.7 from source code - although I suppose you can dig
around on the ftp server to find a 1.8.7 tarball.

Assume a distro package is the appropriate thing to measure and
that'll probably be something compiled for i486.

Assume a custom build from source code is the appropriate thing to
measure and that'll probably compile for i686 - and that's before you
hack the make file to use -O3 :-)
