You mentioned that you were going to do a shootout, so I wanted to let you know the IronRuby release timeline. RC1 is currently out; it was released last November. RC2 should be coming out sometime this week. Our dev Tomas has recently been working on improving performance as we get closer to RTM, but his changes will not be in RC2. They will be in the next release, which should be in early March. Your options are to build from the GitHub source (we can also get you a build), wait for the next release (preferable from our point of view, as it will be more representative of what V1 RTM will look like), or just go with RC1 or RC2 (which is the simplest for you, but it will be out of date for IronRuby in a month or so). You can decide what works for you.
Also, in http://antoniocangiano.com/2009/08/03/performance-of-ironruby-ruby-on..., you mention you are running on a virtual machine. Do you know if the virtual machine is using two processor cores? You can check if the Performance pane in Task Manager shows two graphs. IronRuby is optimized for multiple processors and tries to do compilation on a background thread. If the virtual machine is not using two cores, it could degrade performance a bit.
On Tue, Feb 9, 2010 at 3:44 PM, Shri <Shri.Bo...@microsoft.com> wrote: > Your options are to build from the github source (we can also get you a > build), wait for the next release (preferable from our point of view > as it will be more representative of what V1 RTM will look like), or > just go with RC1 or RC2 (which is the simplest for you, but it will be > out of date for IronRuby in a month or so). You can decide what works > for you.
Yes, both cores were used. However, this time around I may simply use a quad-core, 8 GB RAM machine (i.e., no VMs).
--
http://ThinkCode.TV - Screencasts and programming video courses
http://antoniocangiano.com - Zen and the Art of Programming
http://math-blog.com - Mathematics is wonderful!
Follow me on Twitter: http://twitter.com/acangiano
Author of "Ruby on Rails for Microsoft Developers" (Wrox, 2009)
We're switching from using the pure Ruby YAML to an FFI based C implementation soon. That should get us past the YAML emitter problems we had in the RDOC benchmarks.
You received this message because you are subscribed to the Google Groups "Ruby Benchmark Suite" group. To post to this group, send email to ruby-benchmark-suite@googlegroups.com To unsubscribe from this group, send email to ruby-benchmark-suite+unsubscribe@googlegroups.com For more options, visit this group at http://groups.google.com/group/ruby-benchmark-suite?hl=en
On Feb 9, 12:57 pm, Antonio Cangiano <acangi...@gmail.com> wrote:
> I can wait until March.
Meanwhile, I had one or two gripes from the previous shootout which maybe could be addressed in the upcoming shootout? :-)
1) How to summarize
"a bar chart of the total time requested for the common subset of successfully executed benchmarks"
This is the tail wagging the dog - the slower implementations time out on the programs that do more, so then the measurements for programs that do more are disregarded!?
The programs that do more, tell us more - so strive to keep those measurements in the summary.
2) How to summarize
With all due credit to M. Edward (Ed) Borasky - he's right, boxplots are the sensible way to present the measurements from this kind of comparison.
Over the last 18 months, boxplots have been the summary presentation shown on the benchmarks game - pretty much without confusion or complaint.
3) How to summarize
Geometric mean - okay this is just #1 and #2 all over again :-)
You know the geometric mean can be calculated without reducing the data to "the common subset of successfully executed benchmarks".
Any concern that outliers might unduly affect the geometric mean can be addressed by using more robust descriptive statistics - median, quartiles, boxplots.
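To make points 1 to 3 concrete, here is a minimal Ruby sketch of summarizing without first cutting the data down to the common subset. Everything in it is an assumption for illustration: the benchmark and implementation names and the timings are invented, `nil` stands for a timeout or error, and normalizing each time against the fastest completed run of the same benchmark is just one reasonable choice, not necessarily what the shootout or the benchmarks game does.

```ruby
# Hypothetical timings in seconds; nil marks a timeout or error.
TIMINGS = {
  "bm_small"  => { "impl_a" => 1.0,  "impl_b" => 2.0 },
  "bm_medium" => { "impl_a" => 4.0,  "impl_b" => 8.0 },
  "bm_large"  => { "impl_a" => 16.0, "impl_b" => nil }
}

# For every benchmark, divide each completed run by the fastest completed
# run of that benchmark. Failed runs are skipped for that implementation
# only, so the benchmark still counts for everyone who finished it.
def ratios_by_impl(timings)
  ratios = Hash.new { |hash, impl| hash[impl] = [] }
  timings.each_value do |runs|
    completed = runs.reject { |_impl, secs| secs.nil? }
    next if completed.empty?
    fastest = completed.values.min
    completed.each { |impl, secs| ratios[impl] << secs / fastest }
  end
  ratios
end

# Geometric mean computed through logarithms.
def geometric_mean(values)
  Math.exp(values.sum { |v| Math.log(v) } / values.size)
end

# Median: the middle value, or the mean of the two middle values.
def median(values)
  sorted = values.sort
  mid = sorted.size / 2
  sorted.size.odd? ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2.0
end

ratios_by_impl(TIMINGS).each do |impl, ratios|
  puts format("%-8s n=%d geomean=%.2f median=%.2f",
              impl, ratios.size, geometric_mean(ratios), median(ratios))
end
```

Reporting `n` next to each summary keeps it visible that an implementation failed some programs, instead of silently dropping those programs for everyone.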
On Sat, Feb 27, 2010 at 2:23 PM, Isaac Gouy <igo...@yahoo.com> wrote: > This is the tail wagging the dog - the slower implementations time out > on the programs that do more, so then the measurements for programs > that do more are disregarded!?
> The programs that do more, tell us more - so strive to keep those > measurements in the summary
Timeouts are already included in the total; only tests where one of the implementations gave an error are excluded. But I fully agree. The total times are misleading and virtually meaningless and won't be included in future shootouts.
> With all due credit to M. Edward (Ed) Borasky - he's right, boxplots > are the sensible way to present the measurements from this kind of > comparison.
> Over the last 18 months, boxplots have been the summary presentation > shown on the benchmarks game - pretty much without confusion or > complaint.
Boxplots are definitely the way to go, statistically speaking. We need to script this or even better, switch the RBS to the code behind the languages game (assuming it automates running the tests and the report creation).
This allows me to introduce the much bigger elephant in the room: the current suite has a major fault that has to do with the use of Ruby. The way it currently works is by running a large number of Ruby tests in sequence. These tests are run through Rake. In the past I noticed two issues with this:
- While the tests are running, the memory taken up by the Rake process becomes very large.
- The tests are run so fast that most Ruby implementations don't even have time to deallocate objects before the next test is started. Because of this, you could have test B show a very poor performance, because test A was memory consuming, and the implementation at hand's GC is rather slow.
Here is what I propose we do:
1. Switch the shootout over to the languages game system.
2. Limit the benchmarks to the major ones.
3. Ask the community to help us out with more real world examples.
4. The repository will still have the micro tests that are used by implementers, in a legacy folder, but for the shootout they won't be used.
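Independently of which system ends up driving the shootout, one way to keep test A's leftover garbage from slowing test B is to give every test its own interpreter process. The following is only a sketch of that idea, not a description of how RBS or the benchmarks game scripts work. The helper name `time_in_fresh_process` is hypothetical, and the measured time includes interpreter startup.

```ruby
require "rbconfig"
require "benchmark"

# Run one benchmark file in a brand-new interpreter and return the elapsed
# wall-clock seconds, or nil if the child exits with a non-zero status.
# The heap of the previous test cannot affect this one, because every test
# starts from a fresh process.
def time_in_fresh_process(path, ruby: RbConfig.ruby)
  ok = nil
  elapsed = Benchmark.realtime do
    # Discard the benchmark's own output; only the timing matters here.
    ok = system(ruby, path, out: File::NULL)
  end
  ok ? elapsed : nil
end

# Usage: pass benchmark files on the command line.
ARGV.each do |path|
  secs = time_in_fresh_process(path)
  puts secs ? format("%s: %.3fs", path, secs) : "#{path}: failed"
end
```

Passing a different executable through the `ruby:` keyword would time the same file under another implementation.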
2) The analysis and presentation for the benchmarks game website is done on-the-fly with PHP.
I would guess that your analysis and presentation needs are somewhat different - and the measurements could easily be munged into a form that would allow analysis and presentation with some stats tool.
For example, multiple timings of the same program are accumulated in compressed files, the benchmarks game grabs the fastest measurement - but you might choose the median.
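As a small illustration of that last choice, here is a Ruby sketch that reduces repeated timings of one program either to the fastest run or to the median. The sample values are invented, and the helper names are hypothetical rather than taken from the benchmarks game scripts.

```ruby
# Hypothetical repeated timings (seconds) for one program; numbers are invented.
SAMPLES = [2.31, 2.05, 2.90, 2.11, 2.08]

# The fastest run: least disturbed by other activity on the machine.
def fastest(samples)
  samples.min
end

# The median run: closer to what a typical run looks like.
def median(samples)
  sorted = samples.sort
  mid = sorted.size / 2
  sorted.size.odd? ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2.0
end

puts format("fastest=%.2f median=%.2f", fastest(SAMPLES), median(SAMPLES))
```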
On Feb 27, 1:57 pm, Isaac Gouy <igo...@yahoo.com> wrote:
> 2) The analysis and presentation for the benchmarks game website is > done on-the-fly with PHP.
> I would guess that your analysis and presentation needs are somewhat > different - and the measurements could easily be munged into a form > that would allow analysis and presentation with some stats tool.
> For example, multiple timings of the same program are accumulated in > compressed files, the benchmarks game grabs the fastest measurement - > but you might choose the median.
Now I think about it, a really lazy approach would be to run the PHP scripts on a local webserver and grab medians, quartiles, and outliers straight out of the table.
On Sat, Feb 27, 2010 at 8:40 PM, Isaac Gouy <igo...@yahoo.com> wrote: > iirc the previous objection to doing this was that the benchmarks game > measurement scripts are symlink happy linux scripts - not for windows.
This is, indeed, an obstacle. Windows and Mac OS X are part of the game and here to stay.
> With the ubiquity of Windows it seems like someone somewhere might be working on getting these running on Windows.
> However, mentions of running on Windows in their forums are pretty ancient (2005).
> It's been years since I used Windows, but is Cygwin or mingw any help?
> -- Monty
I don't even know what the status of *Ruby* is on Windows, let alone Cygwin, MinGW, Python, etc. I came to the conclusion that if I wanted Windows/Mac/Linux Ruby compatibility and an IDE, my options were NetBeans / JRuby. That's one of the reasons I spend so much time these days in ActiveState Perl and Komodo. ;-)
On Mar 1, 7:36 am, Monty Williams <monty.willi...@gemstone.com> wrote:
> With the ubiquity of Windows it seems like someone somewhere might be working on getting these running on Windows.
> However, mentions of running on Windows in their forums are pretty ancient (2005).
Back then the measurement scripts were a re-write of Doug Bagley's original Perl scripts ;-)
When I wrote the Python scripts from scratch, I had no intention of making them cross-platform - so many of the language tools on MS Windows were either commercial or had license conditions that made it unclear if publishing benchmarks was acceptable.
On Feb 28, 11:04 pm, Antonio Cangiano <acangi...@gmail.com> wrote:
> On Sat, Feb 27, 2010 at 8:40 PM, Isaac Gouy <igo...@yahoo.com> wrote: > > iirc the previous objection to doing this was that the benchmarks game > > measurement scripts are symlink happy linux scripts - not for windows.
> This is, indeed, an obstacle. Windows and Mac OS X are part of the game and > here to stay.
Well, Mac OS X is kind of 'nix - so it's not obvious to me that the scripts wouldn't work.
Given your "bigger elephant in the room" comment, I'd suggest there's actually value in using the Python script to cross-check the measurements you make on Linux.
Wanted to quickly chime in. I'm happy to help build up the Rubinius tiered benchmarks and we can use them if we'd like.
My other comment is that I've recently been writing benchmarks using an iterations/sec technique. This is great because it eliminates the need to tune an iteration count in each benchmark and it allows all impls to run the benchmark in the same amount of time. That last point means it solves the timeout problem: slow impls just show up as slow.
As you can see, I've been using this technique to tune String, and it has proved much nicer to use than the purely iteration-bound benchmarks we've all been writing up to now.
Anyway, my point is that it seems like using the iterations/sec technique would provide better data and that RBS should consider using it.
Thoughts?
- Evan
On Feb 27, 2010, at 12:20 PM, Antonio Cangiano wrote: -snip-
On Mar 1, 9:43 am, Evan Phoenix <e...@fallingsnow.net> wrote: -snip-
> My other comment is that I've recently been writing benchmarks using a iterations/sec technique.
aka throughput benchmarks
> This is great because it eliminates the need to tune an iteration count in each benchmark and it allows all impls to run the benchmark in the same amount of time. That last point means it solves the timeout problem, slow impls just show up as slow.
Ummm ... assuming that every benchmark completes within that "the same amount of time", otherwise there would be 0 iterations/sec for some impls ?
> On Mar 1, 9:43 am, Evan Phoenix <e...@fallingsnow.net> wrote: > -snip-
>> My other comment is that I've recently been writing benchmarks using a iterations/sec technique.
> aka throughput benchmarks
>> This is great because it eliminates the need to tune an iteration count in each benchmark and it allows all impls to run the benchmark in the same amount of time. That last point means it solves the timeout problem, slow impls just show up as slow.
> Ummm ... assuming that every benchmark completes within that "the same > amount of time", otherwise there would be 0 iterations/sec for some > impls ?
Presumably the benchmark shows up as 0 iterations/sec, yes. The code could easily be tweaked to allow for benchmarks that generally take more than 1 second per iteration by increasing the time and changing the units:
set_units :minutes
set_time 5

def Bench.run
  i = 0
  while @should_run
    something_that_takes_a_few_seconds
    i += 1
  end

  @iterations = i
end
Fast impls again just have a much larger i/s number.
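For anyone who wants to try the idea without the harness used above, here is a self-contained Ruby sketch of a time-bounded loop. It is an illustration under assumptions, not the harness itself: `iterations_per_second` is a hypothetical helper, and it divides by the time actually elapsed rather than by the requested window.

```ruby
# Call the block repeatedly until `seconds` have passed, then return the
# observed throughput in iterations per second.
def iterations_per_second(seconds)
  clock = Process::CLOCK_MONOTONIC
  start = Process.clock_gettime(clock)
  deadline = start + seconds
  iterations = 0
  while Process.clock_gettime(clock) < deadline
    yield
    iterations += 1
  end
  # An iteration that is in progress at the deadline still runs to the end,
  # so divide by the real elapsed time instead of the requested window.
  elapsed = Process.clock_gettime(clock) - start
  iterations / elapsed
end

rate = iterations_per_second(0.5) { "abc" * 100 }
puts format("%.0f iterations/sec", rate)
```

Because a started iteration always finishes, a benchmark slower than the window reports a small fractional rate in this sketch rather than 0. The cost is that such a run overshoots the window by up to one iteration.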
On Mar 1, 10:48 am, Evan Phoenix <e...@fallingsnow.net> wrote: -snip-
> Presumably the benchmark shows up as 0 iterations/sec, yes. The code could easily be tweaked to allow for benchmarks that generally take more than 1 second per iteration by increasing the time and changing the units
Rather than "eliminates the need to tune an iteration count in each benchmark" it morphs into the need to tune the time and time units in each benchmark.
I think for some tasks throughput benchmarks seem like a natural representation while for others they are a weird mismatch.
We know that processing 10,000 small files probably isn't going to be like processing a 10,000 times larger file once - but with throughput benchmarks we suggest it is.
On Sat, Mar 6, 2010 at 11:32 PM, Isaac Gouy <igo...@yahoo.com> wrote: > "Python Win32 Extensions" helped. > "GNU Make for Windows" helped. > "GNU DiffUtils for Windows" helped.
> The re-written script does now seem to give CPU and Elapsed times, and > Peak Working Set on XP and Vista.
On Mar 6, 9:17 pm, Antonio Cangiano <acangi...@gmail.com> wrote:
> On Sat, Mar 6, 2010 at 11:32 PM, Isaac Gouy <igo...@yahoo.com> wrote: > > "Python Win32 Extensions" helped. > > "GNU Make for Windows" helped. > > "GNU DiffUtils for Windows" helped.
> > The re-written script does now seem to give CPU and Elapsed times, and > > Peak Working Set on XP and Vista.
> Great stuff, Isaac.
Well, not as good as on Linux - but maybe good enough.
So far on win32, no CPU load measurement, and an obvious problem measuring programs that spawn multiple processes - the CPU and memory use for the child processes is not measured - the elapsed time is measured.
Notice the strange CPU secs measurements for the "binary-trees #6" and "spectral-norm #5" CPython programs -
On Mar 6, 9:17 pm, Antonio Cangiano <acangi...@gmail.com> wrote:
> On Sat, Mar 6, 2010 at 11:32 PM, Isaac Gouy <igo...@yahoo.com> wrote: > > "Python Win32 Extensions" helped. > > "GNU Make for Windows" helped. > > "GNU DiffUtils for Windows" helped.
> > The re-written script does now seem to give CPU and Elapsed times, and > > Peak Working Set on XP and Vista.
> Great stuff, Isaac.
Well, improved enough over the last few days.
On win32 there is CPU load measurement, and for programs that spawn multiple processes, CPU time and memory use measurement includes parent and child processes.
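For comparison, this is roughly what "CPU time includes the child processes" looks like from Ruby. It is a sketch only: the measurement scripts discussed here are written in Python, `measure_command` and `child_cpu_seconds` are hypothetical helpers, and the child fields of `Process.times` are assumed to be populated, which holds on POSIX systems but may not on Windows.

```ruby
require "rbconfig"

# CPU seconds (user + system) used so far by child processes that have
# already been waited for.
def child_cpu_seconds
  times = Process.times
  times.cutime + times.cstime
end

# Run a command to completion and report elapsed time and child CPU time.
def measure_command(*command)
  clock = Process::CLOCK_MONOTONIC
  cpu_before = child_cpu_seconds
  started = Process.clock_gettime(clock)
  pid = Process.spawn(*command, out: File::NULL)
  Process.wait(pid)
  {
    elapsed: Process.clock_gettime(clock) - started,
    cpu: child_cpu_seconds - cpu_before,
    success: $?.success?
  }
end

result = measure_command(RbConfig.ruby, "-e", "100_000.times { |i| i * i }")
puts format("elapsed=%.3fs cpu=%.3fs", result[:elapsed], result[:cpu])
```

A CPU figure well above the elapsed time is the usual sign that the program used more than one core or spawned workers of its own.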
> While the tests are running, the memory taken up by the Rake process becomes > very large. > The tests are run so fast that most Ruby implementations don't even have > time to deallocate objects before the next test is started. Because of this, > you could have test B show a very poor performance, because test A was > memory consuming, and the implementation at hand's GC is rather slow.
Have you seen this with recent tests? I'm curious why rake would grow in size, since all it does is call system("monitor xxx"), so I thought we had that problem fixed. Does it go up and then back down for the next test? Or are you referring to "after the first iteration, the process size is already large, which hinders the later iterations" within each test? Thanks! -r
>> Your options are to build from the github source (we can also get you a >> build), wait for the next release (preferable from our point of view >> as it will be more representative of what V1 RTM will look like), or >> just go with RC1 or RC2 (which is the simplest for you, but it will be >> out of date for IronRuby in a month or so). You can decide what works >> for you.
> I can wait until March.
You may want to test JRuby trunk (if 1.5.0 isn't out yet). Also of interest may be using JRuby with different parameters, like --fast, --server, etc.
Of course, while you're at it benchmarking REE with different "garbage settings" [1] might also be interesting to some.
On Fri, Mar 12, 2010 at 12:16 PM, Roger Pack <rogerdp...@gmail.com> wrote: > You may want to test jruby trunk (if 1.5.0 isn't out yet). Also of > interest may be using jruby with different parameters, like --fast, > --server, etc.