[RBS] About that shootout...

Antonio Cangiano

May 3, 2010, 4:53:41 PM
to ruby-benchmark-suite
Yes, things got really busy and I never got around to running the promised shootout. Instead I decided to run three separate shootouts, the results of which I will publish this month, possibly one per week. They are going to be as follows:

       1.        The Great Ruby Shootout (Mac Edition)
       2.        The Great Ruby Shootout (Windows Edition)
       3.        The Great Ruby Shootout (Linux Edition)

The Mac edition will include MacRuby 0.6, Ruby 1.8 and Ruby 1.9.  The Windows edition will include IronRuby 1.0, Ruby 1.8, and Ruby 1.9. And lastly, the Linux edition will include Ruby 1.8, Ruby 1.9, JRuby, Rubinius, and MagLev.

I will run only a subset of the tests, favoring programs over empty loops. Unless I encounter any major issues, I will represent the data with boxplots.

Thank you for your patience,
Antonio

PS: The good news is that I didn't notice any memory leak issues (aka, the elephant in the room). They must have been fixed (by Roger?) a while ago.
--
http://thinkcode.tv - High-Quality Programming Screencasts
http://antoniocangiano.com - Zen and the Art of Programming
http://math-blog.com - Mathematics is wonderful!
Follow me on Twitter: http://twitter.com/acangiano

--
The GitHub project is located at http://github.com/acangiano/ruby-benchmark-suite
 
You received this message because you are subscribed to the Google
Groups "Ruby Benchmark Suite" group.
To post to this group, send email to
ruby-bench...@googlegroups.com
To unsubscribe from this group, send email to
ruby-benchmark-s...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/ruby-benchmark-suite?hl=en

Roger Pack

May 3, 2010, 6:29:03 PM
to ruby-bench...@googlegroups.com
Jruby for doze would be nice (MRI 1.9.2 would also be nice [trunk
snapshot], which has slightly better windows support than 1.9.1 for
certain operations like file reads) ...
-rp

Antonio Cangiano

May 3, 2010, 6:34:07 PM
to ruby-bench...@googlegroups.com
On Mon, May 3, 2010 at 6:29 PM, Roger Pack <roger...@gmail.com> wrote:
> Jruby for doze would be nice

Good point.


Monty Williams

May 4, 2010, 1:11:23 AM
to ruby-bench...@googlegroups.com
Hi Antonio,

Just curious, why not run JRuby, Rubinius, and MagLev on Mac as well as Linux?

-- Monty

Antonio Cangiano

May 4, 2010, 8:14:12 AM
to ruby-bench...@googlegroups.com
On Tue, May 4, 2010 at 1:11 AM, Monty Williams <monty.w...@gemstone.com> wrote:
> Just curious, why not run JRuby, Rubinius, and MagLev on Mac as well as Linux?

Hi Monty,

Setting up, running, and reporting on the shootout is very time-consuming, so I try to limit it to what people are really interested in. Namely, how fast is MacRuby on Mac? How fast is IronRuby on Windows? How fast are all of them on Linux (where most people deploy their apps)?

I forgot to mention that REE will be included as well (on Linux).

Monty Williams

May 4, 2010, 9:58:52 AM
to ruby-bench...@googlegroups.com
Hi Antonio,

I figured it might be the amount of effort involved. I don't think performance differs that much between OSes on equivalent HW.

If I can be of any help, let me know.

-- Monty


Isaac Gouy

May 5, 2010, 10:48:51 AM
to Ruby Benchmark Suite


On May 4, 5:14 am, Antonio Cangiano <acangi...@gmail.com> wrote:
-snip-
> setting up, running, and reporting on the shootout is very time consuming,
> so I try to limit it to what people are really interested in. Namely, how
> fast is MacRuby on Mac? How fast is IronRuby on Windows? How fast are all of
> them on Linux (where most people deploy their apps)?
>
> I forgot to mention that REE will be included as well (on Linux).


Seeing program performance measurements for "the same" language
implementations on different OS/hardware combinations can be a nice
reminder that these kinds of program measurement are not definitive :-)

Isaac Gouy

May 14, 2010, 1:25:40 PM
to Ruby Benchmark Suite


On May 3, 1:53 pm, Antonio Cangiano <acangi...@gmail.com> wrote:
> Yes, things got really busy and I never got around to running the promised
> shootout. Instead I decided to run three separate shootouts, the results of
> which I will publish this month, possibly one per week. They are going to be
> as follows:
>
>        1.        The Great Ruby Shootout (Mac Edition)
>        2.        The Great Ruby Shootout (Windows Edition)
>        3.        The Great Ruby Shootout (Linux Edition)
>
> The Mac edition will include MacRuby 0.6, Ruby 1.8 and Ruby 1.9.  The
> Windows edition will include IronRuby 1.0, Ruby 1.8, and Ruby 1.9. And
> lastly, the Linux edition will include Ruby 1.8, Ruby 1.9, JRuby, Rubinius,
> and MagLev.


Hopefully you hadn't started measuring JRuby before JRuby 1.5 was
released?

http://shootout.alioth.debian.org/u32/benchmark.php?test=all&lang=jruby&lang2=ruby

Antonio Cangiano

May 14, 2010, 1:31:59 PM
to ruby-bench...@googlegroups.com
On Fri, May 14, 2010 at 1:25 PM, Isaac Gouy <igo...@yahoo.com> wrote:
> Hopefully you hadn't started measuring JRuby before JRuby 1.5 was
> released?
>
> http://shootout.alioth.debian.org/u32/benchmark.php?test=all&lang=jruby&lang2=ruby

JRuby 1.5 will be used. :)


Charles Oliver Nutter

May 25, 2010, 2:08:39 AM
to ruby-bench...@googlegroups.com
It's come to my attention that RBS starts a new process for each
benchmark. Since many benchmarks are short and only run once, this
does not provide valid benchmark results for most of the optimizing
VMs.

Perhaps you should hold off on publishing a shootout until benchmarks
aren't spinning up a new JVM (in the case of JRuby) for every script?

Charles Oliver Nutter

May 25, 2010, 2:26:30 AM
to ruby-bench...@googlegroups.com
Another discovery...

Since many of the benchmarks include the benchmarked code directly in
a block, those benchmarks will never even JIT in JRuby. Yes, it's a
shortcoming in JRuby that blocks do not currently JIT independent of a
method that contains them, but it seems an unfortunate penalty since
the majority of user code will eventually have the method containing a
block JIT, bringing the block along with it.

So not only are many of the benchmarks running only once inside a new
JVM...many are not even compiling. I'm not sure that the suite in its
current form provides much useful information.
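
To make the block-versus-method point concrete, here is a minimal sketch. It is not code from the suite: the workload and the name sum_of_squares are made up for illustration. Style A puts the measured work directly in the block handed to Benchmark.realtime; style B moves the same work into a method that the block calls, which is the shape a JIT that compiles per method (as described above for JRuby) gets a chance to compile.

```ruby
require 'benchmark'

# Hypothetical workload, not taken from RBS.
def sum_of_squares(n)
  total = 0
  n.times { |i| total += i * i }
  total
end

# Style A: the benchmarked code sits directly in the block.
block_result = nil
block_time = Benchmark.realtime do
  total = 0
  100_000.times { |i| total += i * i }
  block_result = total
end

# Style B: the same work lives in a method; the block only calls it.
method_result = nil
method_time = Benchmark.realtime do
  method_result = sum_of_squares(100_000)
end

puts "code in block:  #{block_time} s"
puts "code in method: #{method_time} s"
```

Whether the two timings differ depends entirely on the implementation and its settings; the sketch only shows the two shapes being discussed.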

Charles Oliver Nutter

May 25, 2010, 3:01:52 AM
to ruby-bench...@googlegroups.com
Sigh...yeah, there's a real problem here. This Mitko Kostov fellow has
published some numbers running RBS in its current form. Because of the
current issues, some of his numbers are 10x or more worse than what
JRuby does without those handicaps.

http://mitkokostov.info/2010/05/22/ruby-vm-shootout.html

For example...he shows the bm_list 10k result at over 14s for JRuby.
On my system, tweaking RBS to not launch a new VM, it runs in around
2s.

bm_mandelbrot...56s to 15s.
bm_lucas_lehmer for 9941...23s to 5.6s (and the others that time out
all complete just fine for me on JRuby)

You can see these numbers are incredibly crippled...even accounting
for speed differences between his machine and my machine, they're way
off the mark.

We can talk about how to repair the suite, if you like...


Jim Deville

May 25, 2010, 11:36:08 AM
to ruby-bench...@googlegroups.com
For the fresh process part, I thought Shri had committed our changes that ran an untimed bm before starting the timer. This allows for the warmup on IronRuby (pre-compilation and all).
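
The untimed warm-up idea can be sketched roughly like this. The helper name timed_after_warmup and the workload are hypothetical and are not RBS's actual code; the only point is that the block runs a few times before the timer starts.

```ruby
require 'benchmark'

# Hypothetical helper: run the block warmup_runs times without timing it,
# then time one more run and return the elapsed seconds.
def timed_after_warmup(warmup_runs: 1)
  warmup_runs.times { yield }   # untimed warm-up runs
  Benchmark.realtime { yield }  # the run that is actually measured
end

calls = 0
elapsed = timed_after_warmup(warmup_runs: 2) do
  calls += 1
  (1..50_000).reduce(:+)
end

puts "block ran #{calls} times; timed run took #{elapsed} s"
```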

For the rest, I think that the benchmarks are still valid. If an implementation doesn't optimize for this case, then that should be part of the results. That said, I do feel that there should be some additional benchmarks that address this case to show both sides of the picture.

JD

Charles Oliver Nutter

May 25, 2010, 1:25:33 PM
to ruby-bench...@googlegroups.com
It looks like the compilation is the biggest problem here for JRuby.

Running in the same JVM keeps some of the initial results fast, but
the 5 iterations do a pretty good job of that too. But with a large
number of these benchmarks including code only in block bodies,
they're simply not getting compiled in JRuby at all. It's motivation
for me to fix that, but it's unfortunate anyone using the suite won't
see typical performance unless they tell JRuby to force the code to
compile on load, which does appear to help quite a bit.

I'm not sure I see the first result being thrown out either; it's
still there, usually still very slow, and included in at least the
mean and median calculations. I haven't dug enough to see whether it's
affecting the final result.

Isaac Gouy

May 25, 2010, 1:41:53 PM
to Ruby Benchmark Suite


On May 24, 11:08 pm, Charles Oliver Nutter <head...@headius.com>
wrote:
> It's come to my attention that RBS starts a new process for each
> benchmark. Since many benchmarks are short and only run once, this
> does not provide valid benchmark results for most of the optimizing
> VMs.


Wouldn't those benchmark results be "valid" for short programs that
were only run once on a cold VM?

Does the Ruby Benchmark Suite spell-out any expectations about how
Ruby programs are "usually" run?

Isaac Gouy

May 25, 2010, 2:34:43 PM
to Ruby Benchmark Suite


On May 25, 10:41 am, Isaac Gouy <igo...@yahoo.com> wrote:
> On May 24, 11:08 pm, Charles Oliver Nutter <head...@headius.com>
> wrote:
>
> > It's come to my attention that RBS starts a new process for each
> > benchmark. Since many benchmarks are short and only run once, this
> > does not provide valid benchmark results for most of the optimizing
> > VMs.
>
> Wouldn't those benchmark results be "valid" for short programs that
> were only run once on a cold VM?
>
> Does the Ruby Benchmark Suite spell-out any expectations about how
> Ruby programs are "usually" run?


IOW, perhaps the Ruby language implementations now make such
significantly different assumptions about the usual runtime context
that a single approach to benchmarking isn't useful for all of them.

Roger Pack

May 25, 2010, 4:57:51 PM
to ruby-bench...@googlegroups.com
> For the fresh process part, I thought Shri had committed our changes that ran an untimed bm before starting the timer. This allows for the warmup on IronRuby (pre-compilation and all).

I'd argue that an untimed bm is not good because a few rails benchmarks
can actually only run once (they cannot be re-run easily, like timing
"the time it takes to start up a rails app"--hard to repeat that in
the same run--rake test:all is also hard to re-run). A windows
limitation more than anything, but it's nice to keep it cross
platform.

Re: being slow on jruby

I had thought that the tests had each been refactored to 'each take
about 4s' though I'm sure some haven't. Plus 5 iterations..
If that's not the case then I'd say we need to help that test.

Also I wouldn't be averse to refactoring the existing tests to be
"method based" (since apparently JRuby and Rubinius (and IronRuby?)
can "only JIT code within a method").
-rp

Evan Phoenix

May 25, 2010, 5:00:11 PM
to ruby-bench...@googlegroups.com

Rubinius can JIT blocks as well. It's capable of JIT'ing any code context (method, block, eval, class body, script, etc).

- Evan

> -rp

Roger Pack

May 25, 2010, 5:27:12 PM
to ruby-bench...@googlegroups.com
> Rubinius can JIT blocks as well. It's capable of JIT'ing any code context (method, block, eval, class body, script, etc).
>
>  - Evan

Ahh. I was confusing lack of "on stack" replacement with not being
able to JIT a block. It should work ok then, since we run multiple
iterations.
-rp

Isaac Gouy

May 26, 2010, 11:25:39 AM
to Ruby Benchmark Suite
When you do publish your new measurements, would it be appropriate to
state very clearly that the implicit assumption is server-side Ruby?

Just saying - "The best times out of five iterations were reported,
and these do not include startup times or the time required to parse
and compile classes and method for the first time." - is perhaps a bit
too subtle ;-)

Antonio Cangiano

May 31, 2010, 7:21:13 PM
to ruby-bench...@googlegroups.com
> Sigh...yeah, there's a real problem here. This Mitko Kostov fellow has
> published some numbers running RBS in its current form.

BTW, his numbers are really off. I wonder if he messed up the settings
for JRuby. I always make sure that each VM runs at its best. I'm not sure
that he used the --server and other optimization flags.

> We can talk about how to repair the suite, if you like...

That's the whole purpose of the suite being open source. I can grant
access to the repository to whoever wants to contribute (but doesn't
have access now).

> I'm not sure I see the first result being thrown out either; it's
> still there, usually still very slow, and included in at least the
> mean and median calculations. I haven't dug enough to see whether it's
> affecting the final result.

Only the best time is reported, so a slow first run doesn't affect the
end results.

Also I wouldn't be averse to refactoring the existing tests to be
method based. If it makes the tests more realistic, I'm all for it.

> When you do publish your new measurements, would it be appropriate to
> state very clearly that the implicit assumption is server-side Ruby?

Yes, I can do this.

Cheers,
Antonio

Isaac Gouy

Jun 1, 2010, 11:33:22 AM
to Ruby Benchmark Suite


On May 31, 4:21 pm, Antonio Cangiano <acangi...@gmail.com> wrote:
> > Sigh...yeah, there's a real problem here. This Mitko Kostov fellow has
> > published some numbers running RBS in its current form.
>
> BTW, his numbers are really off. I wonder if he messed up the settings
> for JRuby. I always make sure that each VM runs at its best. I'm not sure
> that he used the --server and other optimization flags.

-snip-

1) Are the parameters you use in configuration files that he could
access when he grabbed the ruby-benchmark-suite? (I didn't look
properly.)

2) The visibility provided by showing the compile time and run time
command lines alongside the program source code and program output, on
the benchmarks game website, has allowed people to tell me that I
should be using some other parameter.

That approach seems less workable when publishing on a blog - and
perhaps makes it important to have them somewhere obvious on github.

Antonio Cangiano

Jun 1, 2010, 11:43:43 AM
to ruby-bench...@googlegroups.com
On Tue, Jun 1, 2010 at 11:33 AM, Isaac Gouy <igo...@yahoo.com> wrote:
> 1) Are the parameters you use in configuration files, that he could
> access when he grabbed the ruby-benchmark-suite? (I didn't look
> properly.)

No.
 
> 2) The visibility provided by showing the compile time and run time
> command lines alongside the program source code and program output, on
> the benchmarks game website, has allowed people to tell me that I
> should be using some other parameter.
>
> That approach seems less workable when publishing on a blog - and
> perhaps makes it important to have them somewhere obvious on github.

Something worth considering.

roger...@gmail.com

Jun 6, 2010, 1:05:45 AM
to Ruby Benchmark Suite
> The Mac edition will include MacRuby 0.6, Ruby 1.8 and Ruby 1.9.  The
> Windows edition will include IronRuby 1.0, Ruby 1.8, and Ruby 1.9. And
> lastly, the Linux edition will include Ruby 1.8, Ruby 1.9, JRuby, Rubinius,
> and MagLev.

Might be good to include REE for one of those, as well.
Thoughts?
-rp

Antonio Cangiano

Jun 6, 2010, 6:59:01 AM
to ruby-bench...@googlegroups.com
On Sun, Jun 6, 2010 at 1:05 AM, roger...@gmail.com <roger...@gmail.com> wrote:
> Might be good to include REE for one of those, as well.

Yeah, earlier I wrote:

> I forgot to mention that REE will be included as well (on Linux).

Isaac Gouy

Jun 16, 2010, 2:35:53 PM
to Ruby Benchmark Suite


On May 31, 4:21 pm, Antonio Cangiano <acangi...@gmail.com> wrote:

-snip-

> > When you do publish your new measurements, would it be appropriate to
> > state very clearly that the implicit assumption is server-side Ruby?
>
> Yes, I can do this.


Now that I understand that the measurements were always made by starting
a Ruby implementation once and then doing repeated measurements, I don't
understand why "The best times out of five iterations" are reported.

Reporting "the best times" is dubious even in the benchmarks game,
where the repeated startup and measurement is just intended to take
account of variations from one program run to the next.

In The Great Ruby Shootout, if I understand correctly, the repeated
measurements allow some Ruby implementations to do progressively
better optimization of the code.

I don't think there's anything at all wrong with trying to show what
could happen on a long running server, as long as the costs are
accounted for as well as the benefits.

I've noticed that for the benchmarks game Java nbody program some of
the repeated measurements are ~25% slower and I don't see why that
kind of cost should be ignored or hidden - "the best times" hides that
cost.

When a Ruby implementation is started once, and then repeated
measurements are made, and the program is progressively optimized from
one run to the next - shouldn't all the measurements contribute, and
shouldn't an average be reported?
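
A small sketch of why the choice of statistic matters. The timings below are invented for illustration (they are not measurements from any run), with a slow first iteration standing in for a cold VM. The helper names mean and median are just local definitions for this example.

```ruby
# Invented timings in seconds for five iterations in a single process.
# The first iteration is slow, as it might be before any optimization.
timings = [10.0, 2.5, 2.0, 2.0, 2.5]

def mean(xs)
  xs.sum / xs.size.to_f
end

def median(xs)
  sorted = xs.sort
  mid = sorted.size / 2
  sorted.size.odd? ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2.0
end

best = timings.min

puts "best:               #{best}"
puts "median:             #{median(timings)}"
puts "mean:               #{mean(timings)}"
puts "mean without first: #{mean(timings.drop(1))}"
```

With these made-up numbers the best time is 2.0 and the mean is 3.8, so reporting only the best run drops both the cold first iteration and the slower later ones.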


Isaac Gouy

Jun 16, 2010, 7:24:35 PM
to Ruby Benchmark Suite
And furthermore... If the context really is server-side performance
then wouldn't the obvious (simplistic?) approach be to:

- run a fixed warm-up sequence of benchmark tasks

- randomize a long sequence of benchmark tasks and have a Ruby
implementation chew through the entire sequence for hours, appending
task timing measurements to file, and collate and average the task
timings after the entire sequence has been completed.
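
The two steps above could look roughly like the following. This is a toy sketch under stated assumptions, not a proposal for the suite's real harness: the task names, the TASKS table, run_sequence, collate, and the "name seconds" log format are all hypothetical, and the sequence is a few dozen tasks rather than hours of work.

```ruby
require 'benchmark'
require 'tempfile'

# Hypothetical tasks standing in for real benchmark programs.
TASKS = {
  'sort'   => -> { (1..2_000).to_a.reverse.sort },
  'concat' => -> { s = String.new; 500.times { |i| s << i.to_s }; s }
}

# Run a fixed warm-up pass, then a shuffled sequence of tasks, appending
# one "name seconds" line per task to the log as it completes.
def run_sequence(tasks, repetitions:, warmup:, log:, rng: Random.new(42))
  warmup.times { tasks.each_value(&:call) }   # warm-up, not recorded

  sequence = (tasks.keys * repetitions).shuffle(random: rng)
  sequence.each do |name|
    elapsed = Benchmark.realtime { tasks[name].call }
    log.puts "#{name} #{elapsed}"
  end
  log.flush
end

# Read the log back and average the timings per task.
def collate(path)
  timings = Hash.new { |h, k| h[k] = [] }
  File.foreach(path) do |line|
    name, seconds = line.split
    timings[name] << seconds.to_f
  end
  timings.transform_values { |v| v.sum / v.size }
end

Tempfile.create('timings') do |log|
  run_sequence(TASKS, repetitions: 20, warmup: 2, log: log)
  collate(log.path).each { |name, avg| puts format('%-8s %.6f s', name, avg) }
end
```

Averaging only after the whole sequence has finished keeps every measurement in the result, including the slow ones.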

Brian Ford

Jun 16, 2010, 7:31:03 PM
to ruby-bench...@googlegroups.com

Sure, until one of the benchmarks hangs or segv's the executable.

Isaac Gouy

Jun 17, 2010, 1:36:14 PM
to Ruby Benchmark Suite
Seems like we should always be able to catch broken programs before
starting a long sequence of benchmark tasks, by running the programs. I
agree that leaves an open question - what to do about Ruby language
implementations that break on "good" Ruby programs.

If the benchmarks hang because that Ruby language implementation
couldn't cope with a long sequence of benchmark tasks, then the value
of timing sub-second tasks on that Ruby language implementation as a
suggestion of server-side performance seems vanishingly slight.

Note - I'm not commenting about the Ruby Benchmark Suite as-used-by-
Ruby-implementors but about Antonio's Great Ruby Shootout.

Antonio Cangiano

Jun 24, 2010, 2:45:45 PM
to ruby-bench...@googlegroups.com
On Thu, Jun 17, 2010 at 1:36 PM, Isaac Gouy <igo...@yahoo.com> wrote:
> Note - I'm not commenting about the Ruby Benchmark Suite as-used-by-
> Ruby-implementors but about Antonio's Great Ruby Shootout.

In fact, we can keep the RBS for the implementors, but from now on, the Great Ruby Shootout will only focus on selected large benchmarks from the RBS. The Windows one should be published very soon (I've had major distractions lately, including wife at the hospital and death in the family).
--
Antonio Cangiano
http://antoniocangiano.com