You mentioned that you were going to do a shootout. I wanted to let
you know the IronRuby release timeline. RC1 is currently out; it was
released last November. RC2 should be coming out sometime this week.
Our dev Tomas has been recently working on improving performance as we
get closer to RTM, but his changes will not be in RC2. The changes
will be in the next release which would be in early March. Your
options are to build from the github source (we can also get you a
build), wait for the next release (preferable from our point of view
as it will be more representative of what V1 RTM will look like), or
just go with RC1 or RC2 (which is the simplest for you, but it will be
out of date for IronRuby in a month or so). You can decide what works
for you.
Also, in the thread at http://groups.google.com/group/ruby-benchmark-suite/t/423505af9b64447,
I had asked you to use the -X:NoAdaptiveCompilation command line
option. You should not need to use it if you pull in the patch at
http://github.com/shri/ruby-benchmark-suite/commit/8b29171952dc6e4c1ff3973a581490d6cf939f60
which adds a warmup phase.
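
For anyone unfamiliar with the idea, a warmup phase can be sketched roughly as follows. This is a minimal illustration, not the code from the patch linked above; the helper name `time_with_warmup` and its parameters are assumptions made up for this example.

```ruby
require 'benchmark'

# Hypothetical helper (not taken from the linked patch): run the block a few
# times untimed so an adaptive compiler can settle, then time the later runs.
def time_with_warmup(warmup_runs: 3, timed_runs: 5)
  warmup_runs.times { yield }                            # results discarded
  Array.new(timed_runs) { Benchmark.realtime { yield } } # seconds per run
end

timings = time_with_warmup { (1..50_000).reduce(:+) }
puts "best of #{timings.size} timed runs: #{timings.min}s"
```

Only the runs after the warmup contribute to the reported numbers, which is why a switch such as -X:NoAdaptiveCompilation would no longer be needed to get steady-state figures.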
Also, in http://antoniocangiano.com/2009/08/03/performance-of-ironruby-ruby-on-windows/,
you mention you are running on a virtual machine. Do you know if the
virtual machine is using two processor cores? You can check if the
Performance pane in Task Manager shows two graphs. IronRuby is
optimized for multiple processors and tries to do compilation on a
background thread. If the virtual machine is not using two cores, it
could degrade performance a bit.
Regards
Shri
On Feb 9, 12:57 pm, Antonio Cangiano <acangi...@gmail.com> wrote:
> I can wait until March.
Meanwhile, I had one or two gripes from the previous shootout which
maybe could be addressed in the upcoming shootout? :-)
1) How to summarize
"a bar chart of the total time requested for the common subset of
successfully executed benchmarks"
This is the tail wagging the dog - the slower implementations time out
on the programs that do more, so then the measurements for programs
that do more are disregarded!?
The programs that do more, tell us more - so strive to keep those
measurements in the summary.
2) How to summarize
With all due credit to M. Edward (Ed) Borasky - he's right, boxplots
are the sensible way to present the measurements from this kind of
comparison.
Over the last 18 months, boxplots have been the summary presentation
shown on the benchmarks game - pretty much without confusion or
complaint.
3) How to summarize
Geometric mean - okay this is just #1 and #2 all over again :-)
You know the geometric mean can be calculated without reducing the
data to "the common subset of successfully executed benchmarks".
Any concern that outliers might unduly affect the geometric mean can
be addressed by using more robust descriptive statistics - median,
quartiles, boxplots.
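
As a rough sketch of that idea, the summary below is computed over whatever each implementation actually completed, rather than over the common subset. The timings, implementation names, and helper names are all hypothetical, and the quartile rule is just one of several conventions.

```ruby
# Hypothetical timings in seconds; nil marks a benchmark that timed out.
RESULTS = {
  'impl_a' => { 'small' => 1.0, 'medium' => 4.0, 'large' => 16.0 },
  'impl_b' => { 'small' => 2.0, 'medium' => 8.0, 'large' => nil }
}.freeze

def geometric_mean(values)
  Math.exp(values.sum { |v| Math.log(v) } / values.size)
end

# Linear interpolation between closest ranks; other quartile conventions exist.
def quantile(values, q)
  sorted = values.sort
  pos = q * (sorted.size - 1)
  lower = sorted[pos.floor]
  upper = sorted[pos.ceil]
  lower + (upper - lower) * (pos - pos.floor)
end

def summarize(timings)
  completed = timings.values.compact # keep everything this impl finished
  {
    completed: completed.size,
    attempted: timings.size,
    geometric_mean: geometric_mean(completed),
    q1: quantile(completed, 0.25),
    median: quantile(completed, 0.5),
    q3: quantile(completed, 0.75)
  }
end

RESULTS.each { |impl, timings| puts "#{impl}: #{summarize(timings)}" }
```

Reporting the completed and attempted counts next to the statistics keeps the timeouts visible instead of silently dropping the larger programs.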
best wishes, Isaac
On Feb 27, 12:20 pm, Antonio Cangiano <acangi...@gmail.com> wrote:
> Here is what I propose we do:
>
> 1. Switch the shootout over to the languages game system.
That ought to reveal some bugs in the benchmarks game scripts.
From an outsider perspective I would guess there might be some push-
back at the idea of using Python scripts for Ruby benchmarking :-)
1) The "bencher" Python measurement scripts used for the benchmarks
game are packaged and available from the help page -
http://shootout.alioth.debian.org/help.php#languagex
2) The analysis and presentation for the benchmarks game website is
done on-the-fly with PHP.
I would guess that your analysis and presentation needs are somewhat
different - and the measurements could easily be munged into a form
that would allow analysis and presentation with some stats tool.
For example, multiple timings of the same program are accumulated in
compressed files, the benchmarks game grabs the fastest measurement -
but you might choose the median.
On Feb 27, 1:57 pm, Isaac Gouy <igo...@yahoo.com> wrote:
> 2) The analysis and presentation for the benchmarks game website is
> done on-the-fly with PHP.
>
> I would guess that your analysis and presentation needs are somewhat
> different - and the measurements could easily be munged into a form
> that would allow analysis and presentation with some stats tool.
>
> For example, multiple timings of the same program are accumulated in
> compressed files, the benchmarks game grabs the fastest measurement -
> but you might choose the median.
Now I think about it, a really lazy approach would be to run the PHP
scripts on a local webserver and grab medians, quartiles, and outliers
straight out of the table -
http://shootout.alioth.debian.org/u32/which-programming-languages-are-fastest.php
(Those PHP analysis and presentation scripts aren't packaged, they are
available from CVS https://alioth.debian.org/scm/viewvc.php/shootout/website/?root=shootout
)
I'm sure you can make some pretty charts once you have the numbers ;-)
On Feb 27, 12:20 pm, Antonio Cangiano <acangi...@gmail.com> wrote:
> 1. Switch the shootout over to the languages game system.
iirc the previous objection to doing this was that the benchmarks game
measurement scripts are symlink happy linux scripts - not for windows.
I don't even know what the status of *Ruby* is on Windows, let alone
Cygwin, Mingw, Python, etc. I came to the conclusion that if I wanted
Windows/Mac/Linux Ruby compatibility and an IDE, my options were
NetBeans / jRuby. That's one of the reasons I spend so much time these
days in ActiveState Perl and Komodo. ;-)
--
M. Edward (Ed) Borasky
http://borasky-research.net/smart-at-znmeb
"A mathematician is a device for turning coffee into theorems." ~ Paul Erdős
On Mar 1, 7:36 am, Monty Williams <monty.willi...@gemstone.com> wrote:
> With the ubiquity of Windows it seems like someone somewhere might be working on getting these running on Windows.
>
> However, mentions of running on Windows in their forums are pretty ancient (2005).
Back then the measurement scripts were a re-write of Doug Bagley's
original Perl scripts ;-)
When I wrote the Python scripts from scratch, I had no intention of
making them cross-platform - so many of the language tools on MS
Windows were either commercial or had license conditions that made it
unclear if publishing benchmarks was acceptable.
On Feb 28, 11:04 pm, Antonio Cangiano <acangi...@gmail.com> wrote:
> On Sat, Feb 27, 2010 at 8:40 PM, Isaac Gouy <igo...@yahoo.com> wrote:
> > iirc the previous objection to doing this was that the benchmarks game
> > measurement scripts are symlink happy linux scripts - not for windows.
>
> This is, indeed, an obstacle. Windows and Mac OS X are part of the game and
> here to stay.
Well, Mac OS X is kind of 'nix - so it's not obvious to me that the
scripts wouldn't work.
Given your "bigger elephant in the room" comment, I'd suggest there's
actually value in using the Python script to cross-check the
measurements you make on Linux.
Wanted to quickly chime in. I'm happy to help build up the Rubinius tiered benchmarks and we can use them if we'd like.
My other comment is that I've recently been writing benchmarks using an iterations/sec technique. This is great because it eliminates the need to tune an iteration count in each benchmark and it allows all impls to run the benchmark in the same amount of time. That last point means it solves the timeout problem: slow impls just show up as slow.
The only downside of this technique is that it requires Thread. Thankfully though, the thread simply needs to run. There is no exact timing needed by the Thread, because the time difference is still calculated and used. The code for this is here: http://github.com/evanphx/rubinius/blob/master/benchmark/core/cps.rb and you can see an example of it being used in http://github.com/evanphx/rubinius/blob/master/benchmark/core/methods/string/aref_op.rb
As you can see, I've been using this technique to tune String and it's proved to be much nicer to use than the pure iteration-bound benchmarks we've all been writing up to now.
Anyway, my point is that it seems like using the iterations/sec technique would provide better data and that RBS should consider using it.
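
To make the technique concrete, here is a minimal standalone sketch. It is not the cps.rb code linked above; the helper name `iterations_per_second` and its argument are assumptions for illustration only. A timer thread clears a flag after the time budget, and the rate is computed from the measured elapsed time rather than from the budget.

```ruby
# Hypothetical helper, not the cps.rb API: count how many times the block
# runs before a timer thread clears the flag, then divide by measured time.
def iterations_per_second(budget_seconds)
  running = true
  timer = Thread.new do
    sleep budget_seconds
    running = false # the thread only has to fire at some point
  end

  iterations = 0
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  while running
    yield
    iterations += 1
  end
  elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
  timer.join
  iterations / elapsed
end

rate = iterations_per_second(0.2) { 'hello world'[0, 5] }
puts format('%.0f iterations/sec', rate)
```

Because the elapsed time is measured directly, it does not matter if the timer thread wakes up a little late; the loop simply runs slightly longer and the division accounts for it.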
Thoughts?
- Evan
> My other comment is that I've recently been writing benchmarks using an iterations/sec technique.
aka throughput benchmarks
> This is great because it eliminates the need to tune an iteration count in each benchmark and it allows all impls to run the benchmark in the same amount of time. That last point means it solves the timeout problem, slow impls just show up as slow.
Ummm ... assuming that every benchmark completes within that "the same
amount of time", otherwise there would be 0 iterations/sec for some
impls ?
Presumably the benchmark shows up as 0 iterations/sec, yes. The code could easily be tweaked to allow for benchmarks that generally take more than 1 second per iteration by increasing the time and changing the units:
set_units :minutes
set_time 5

def Bench.run
  i = 0
  while @should_run
    something_that_takes_a_few_seconds
    i += 1
  end
  @iterations = i
end
Fast impls again just have a much larger i/s number.
On Mar 1, 10:48 am, Evan Phoenix <e...@fallingsnow.net> wrote:
-snip-
> Presumably the benchmark shows up as 0 iterations/sec, yes. The code could easily be tweaked to allow for benchmarks that generally take more than 1 second per iteration by increasing the time and changing the units
Rather than "eliminates the need to tune an iteration count in each
benchmark" it morphs into the need to tune the time and time units in
each benchmark.
I think for some tasks throughput benchmarks seem like a natural
representation while for others they are a weird mismatch.
We know that processing 10,000 small files probably isn't going to be
like processing a 10,000 times larger file once - but with throughput
benchmarks we suggest it is.
On Mar 1, 7:36 am, Monty Williams <monty.willi...@gemstone.com> wrote:
> With the ubiquity of Windows it seems like someone somewhere might be working on getting these running on Windows.
>
> However, mentions of running on Windows in their forums are pretty ancient (2005).
>
> It's been years since I used Windows, but is Cygwin or mingw any help?
"Python Win32 Extensions" helped.
"GNU Make for Windows" helped.
"GNU DiffUtils for Windows" helped.
The re-written script does now seem to give CPU and Elapsed times, and
Peak Working Set on XP and Vista.
> ----- Original Message -----
> From: "Antonio Cangiano" <acangi...@gmail.com>
> To: ruby-bench...@googlegroups.com
> Sent: Sunday, February 28, 2010 11:04:35 PM GMT -08:00 US/Canada Pacific
> Subject: Re: [RBS] Re: Upcoming Shootout in Feb 2010
>
> On Sat, Feb 27, 2010 at 8:40 PM, Isaac Gouy < igo...@yahoo.com > wrote:
>
> iirc the previous objection to doing this was that the benchmarks game
> measurement scripts are symlink happy linux scripts - not for windows.
> This is, indeed, an obstacle. Windows and Mac OS X are part of the game and here to stay.
> --
> http://ThinkCode.TV - Programming screencasts and video courses
> http://antoniocangiano.com - Zen and the Art of Programming
> http://math-blog.com - Mathematics is wonderful!
> Follow me on Twitter: http://twitter.com/acangiano
> Author of "Ruby on Rails for Microsoft Developers" (Wrox, 2009)
>
> --
> The GitHub project is located at http://github.com/acangiano/ruby-benchmark-suite
On Mar 6, 9:17 pm, Antonio Cangiano <acangi...@gmail.com> wrote:
> On Sat, Mar 6, 2010 at 11:32 PM, Isaac Gouy <igo...@yahoo.com> wrote:
> > "Python Win32 Extensions" helped.
> > "GNU Make for Windows" helped.
> > "GNU DiffUtils for Windows" helped.
>
> > The re-written script does now seem to give CPU and Elapsed times, and
> > Peak Working Set on XP and Vista.
>
> Great stuff, Isaac.
Well, not as good as on Linux - but maybe good enough.
So far on win32, no CPU load measurement, and an obvious problem
measuring programs that spawn multiple processes - the CPU and memory
use for the child processes is not measured - the elapsed time is
measured.
Notice the strange CPU secs measurements for the "binary-trees #6"
and "spectral-norm #5" CPython programs -
http://shootout.alioth.debian.org/demo/measurements.php?lang=python
Notice those programs use the multiprocessing module to spawn
processes -
http://shootout.alioth.debian.org/demo/program.php?test=binarytrees&lang=python&id=6
http://shootout.alioth.debian.org/demo/program.php?test=spectralnorm&lang=python&id=5
Not a problem with Java -
http://shootout.alioth.debian.org/demo/benchmark.php?test=all&lang=java&lang2=javaxint
On Mar 6, 9:17 pm, Antonio Cangiano <acangi...@gmail.com> wrote:
> On Sat, Mar 6, 2010 at 11:32 PM, Isaac Gouy <igo...@yahoo.com> wrote:
> > "Python Win32 Extensions" helped.
> > "GNU Make for Windows" helped.
> > "GNU DiffUtils for Windows" helped.
>
> > The re-written script does now seem to give CPU and Elapsed times, and
> > Peak Working Set on XP and Vista.
>
> Great stuff, Isaac.
Well, improved enough over the last few days.
On win32 there is CPU load measurement, and for programs that spawn
multiple processes, CPU time and memory use measurement includes
parent and child processes.
http://shootout.alioth.debian.org/demo/program.php?test=spectralnorm&lang=python&id=5
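
For comparison, the same parent-versus-child distinction is visible from Ruby itself. The sketch below is only an illustration, not part of the benchmarks game scripts; it assumes a POSIX-like system, where `Process.times` reports the CPU time of waited-for children separately (on Windows those fields may simply stay at zero).

```ruby
require 'rbconfig'

# Spawn a child Ruby process that burns some CPU, then compare the CPU time
# charged to this process with the CPU time charged to its finished children.
before = Process.times
system(RbConfig.ruby, '-e', '300_000.times { |i| Math.sqrt(i) }')
after = Process.times

parent_cpu = (after.utime + after.stime) - (before.utime + before.stime)
child_cpu  = (after.cutime + after.cstime) - (before.cutime + before.cstime)
puts format('parent: %.3fs  children: %.3fs', parent_cpu, child_cpu)
```

A measurement that looks only at the parent's own CPU time would report almost nothing here, which is the kind of under-reporting seen for programs that spawn worker processes.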
Have you gotten this with recent tests? I'm interested as to why rake
would grow in size--since all it does is call system("monitor xxx"),
so I thought we had that problem fixed. Does it go up then down for
the next test? Or are you referring to "after the first iteration,
the process size is already large which hinders the later iterations"
per test?
Thanks!
-r
You may want to test jruby trunk (if 1.5.0 isn't out yet). Also of
interest may be using jruby with different parameters, like --fast,
--server, etc.
Of course, while you're at it, benchmarking REE with different "garbage
settings" [1] might also be interesting to some.
GL!
-r
[1] http://www.rubyenterpriseedition.com/documentation.html#_garbage_collector_performance_tuning
On Feb 27, 1:20 pm, Antonio Cangiano <acangi...@gmail.com> wrote:
-snip-
> Here is what I propose we do:
>
> 1. Switch the shootout over to the languages game system.
> 2. Limit the amount of benchmarks only to the major ones.
> 3. Ask the community to help us out with more real world examples.
> 4. The repository will still have the micro tests that are used by
> implementers, in a legacy folder, but for the shootout they won't be used.
How are you going to select which Ruby versions to measure?
http://www.ruby-lang.org/en/downloads/ no longer seems to encourage
building 1.8.7 from source code - although I suppose you can dig
around on the ftp server to find a 1.8.7 tarball.
Assume a distro package is the appropriate thing to measure and
that'll probably be something compiled for i486.
Assume a custom build from source code is the appropriate thing to
measure and that'll probably compile for i686 - and that's before you
hack the make file to use -O3 :-)