
For performance, write it in C


Peter Hickman

Jul 26, 2006, 4:47:13 AM
Whenever the question of performance comes up with scripting languages
such as Ruby, Perl or Python there will be people whose response can be
summarised as "Write it in C". I am one such person. Some people take
offence at this and label us trolls or heretics of the true programming
language (take your pick).

I am assuming here that when people talk about performance they really
mean speed. Some will disagree but this is what I am talking about.

In this post I want to clear some things up and provide benchmarks as to
why you should take "Write it in C" seriously. Personally I like to
write my projects in Ruby or Perl or Python and then convert them to C
to get a performance boost. How much of a boost? Well here I will give
you some hard data and source code so that you can see just how much of
a boost C can give you.

The mini project in question is to generate all the valid Latin squares.
A Latin square is a grid of numbers (let's say a 9 x 9 grid) in which
the numbers 1 to 9 each appear exactly once in every row and column.
Sudoku grids are a subset of Latin squares.
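For concreteness, the row/column constraint can be checked mechanically. Here is a minimal Ruby sketch (the helper name is mine, not from the post):

```ruby
# Hypothetical helper: returns true if `grid` (an array of rows) is a
# Latin square, i.e. every row and every column is a permutation of 1..n.
def latin_square?(grid)
  n = grid.size
  symbols = (1..n).to_a
  rows_ok = grid.all? { |row| row.sort == symbols }
  cols_ok = grid.transpose.all? { |col| col.sort == symbols }
  rows_ok && cols_ok
end

latin_square?([[1, 2], [2, 1]])   # a valid 2 x 2 Latin square
latin_square?([[1, 2], [1, 2]])   # invalid: both columns repeat a digit
```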

The approach taken is to create a list of all the permutations and then
build up a grid row by row checking that the newly added row does not
conflict with any of the previous rows. If the final row can be added
without problem the solution is printed and the search for the next one
starts. It is in essence depth first search. The first version of the
program that I wrote in Perl took 473 minutes to generate all the valid
5 x 5 Latin squares, version four of the program took 12 minutes and 51
seconds. The C version of the program took 5.5 seconds to produce
identical results. All run on the same hardware.

[Latin]$ time ./Latin1.pl 5 > x5

real 473m45.370s
user 248m59.752s
sys 2m54.598s

[Latin]$ time ./Latin4.pl 5 > x5

real 12m51.178s
user 12m14.066s
sys 0m7.101s

[Latin]$ time ./c_version.sh 5

real 0m5.519s
user 0m4.585s
sys 0m0.691s

This is what I mean when I say that coding in C will improve the
performance of your program. The improvement is not a matter of
percentages; it is orders of magnitude. I think that the effort is worth
it. If a 5 x 5 grid with 120 permutations took 12 minutes in Perl, how
long would a 6 x 6 grid with 720 permutations take? What unit of measure
would you be using for a 9 x 9 grid?

Size Permutations
==== ============
1 1
2 2
3 6
4 24
5 120
6 720
7 5040
8 40320
9 362880
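The table above is just n factorial; a quick Ruby sketch (mine, not from the post) shows both the permutation counts and why the naive search space explodes, since n! candidate rows are combined over n rows:

```ruby
# Permutation counts per board size, and the naive n!**n upper bound on
# the number of row combinations a brute-force search would consider.
def factorial(n)
  (1..n).inject(1, :*)
end

(1..9).each do |n|
  perms = factorial(n)
  printf("%d %8d permutations, naive bound %.2e row combinations\n",
         n, perms, perms.to_f**n)
end
```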

Now let's look at the first version of the code:

1 #!/usr/bin/perl -w

2 use strict;
3 use warnings;

4 use Algorithm::Permute;

5 my $width_of_board = shift;

6 my @permutations;

7 my $p = new Algorithm::Permute( [ 1 .. $width_of_board ] );

8 while ( my @res = $p->next ) {
9 push @permutations, [@res];
10 }
11 my $number_of_permutations = scalar(@permutations);

12 for ( my $x = 0 ; $x < $number_of_permutations ; $x++ ) {
13 add_a_line($x);
14 }

Lines 1 to 11 just build up a list of all the permutations using the
handy Algorithm::Permute module from CPAN. Lines 12 to 14 start on the
first row of the solution by trying out all possible permutations for
the first row.
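In Ruby the same setup needs no external module, since Array#permutation is in the core library. A sketch of the equivalent, not the poster's code:

```ruby
# Build the permutation list and the precomputed display strings, as the
# Perl version does with Algorithm::Permute and the p() helper.
width_of_board = 5
permutations = (1..width_of_board).to_a.permutation.to_a
output = permutations.map(&:join)

number_of_permutations = permutations.size   # 120 for a 5 x 5 board
```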

15 sub add_a_line {
16 my @lines = @_;

17 my $size = scalar(@lines);

18 my $ok = 1;
19 for ( my $x = 0 ; $x < $size ; $x++ ) {
20 for ( my $y = 0 ; $y < $size ; $y++ ) {
21 if ( $x != $y ) {
22 $ok = 0 unless compare( $lines[$x], $lines[$y] );
23 }
24 }
25 }

26 if ($ok) {
27 if ( $size == $width_of_board ) {
28 print join(':', map { p($_) } @lines) . "\n";
29 }
30 else {
31 for ( my $x = 0 ; $x < $number_of_permutations ; $x++ ) {
32 add_a_line( @lines, $x );
33 }
34 }
35 }
36 }

The add_a_line() function first checks that none of the lines so far
conflict (have the same digit in the same column). If that passes and
the number of lines equals the size of the board, the result is printed
and the search moves on to the next solution. Failing that, another line
is added and add_a_line() is called again.
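The whole search is small enough to sketch in Ruby. This is my own condensed version of the same depth-first idea, shown for a 4 x 4 board so it runs quickly (names are illustrative):

```ruby
# A compact Ruby sketch of the same depth-first search. PERMS holds every
# candidate row; two rows conflict when they share a digit in a column.
WIDTH = 4
PERMS = (1..WIDTH).to_a.permutation.to_a

def conflict?(a, b)
  a.zip(b).any? { |x, y| x == y }
end

def add_a_line(rows, solutions)
  if rows.size == WIDTH
    solutions << rows.map(&:join).join(':')
    return
  end
  PERMS.each do |cand|
    # only the new candidate row needs checking against the rows so far
    next if rows.any? { |r| conflict?(r, cand) }
    add_a_line(rows + [cand], solutions)
  end
end

solutions = []
add_a_line([], solutions)
solutions.size   # 576 valid 4 x 4 Latin squares
```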

Here is the function that tells whether two lines conflict.

37 sub compare {
38 my ( $a, $b ) = @_;

39 my $ok = 1;

40 my @aa = @{ $permutations[$a] };
41 my @bb = @{ $permutations[$b] };

42 for ( my $x = 0 ; $x < $width_of_board ; $x++ ) {
43 $ok = 0 if $aa[$x] == $bb[$x];
44 }

45 return $ok == 1;
46 }

The p() function is a little utility to convert a list into a string for
display.

47 sub p {
48 my ($x) = @_;

49 my @a = @{ $permutations[$x] };
50 my $y = join( '', @a );

51 return $y;
52 }

Well, I have just exposed some pretty crap code to eternal ridicule on
the internet, but there you have it. The code is crap; even non-Perl
programmers will be able to point out its deficiencies. Still, it works,
even though a 5 x 5 grid took 473 minutes to run. Let's try to salvage
some pride by showing version four and how we managed to speed things
up.

1 #!/usr/bin/perl -w

2 use strict;
3 use warnings;

4 use Algorithm::Permute;

5 my $width_of_board = shift;

6 my @permutations;
7 my @output;
8 my %compared;

9 my $p = new Algorithm::Permute( [ 1 .. $width_of_board ] );

10 while ( my @res = $p->next ) {
11 push @permutations, [@res];
12 push @output, join( '', @res );
13 }
14 my $number_of_permutations = scalar(@permutations);

Lines 1 to 14 are doing pretty much what version one was doing, except
that a new list, @output, is built up to precalculate the output
strings and remove the need for the p() function. A minor speed-up, but
a useful one.

15 for ( my $x = 0 ; $x < $number_of_permutations ; $x++ ) {
16 for ( my $y = 0 ; $y < $number_of_permutations ; $y++ ) {
17 my $ok = 1;

18 my @aa = @{ $permutations[$x] };
19 my @bb = @{ $permutations[$y] };

20 for ( my $z = 0 ; $z < $width_of_board ; $z++ ) {
21 if ( $aa[$z] == $bb[$z] ) {
22 $ok = 0;
23 last;
24 }
25 }

26 if ( $ok == 1 ) {
27 $compared{"$x:$y"} = 1;
28 }
29 }
30 }

Lines 15 to 30 introduce new code to precalculate the comparisons and
feed the results into a hash. Lines 31 to 33 start the work in the same
way as version one.

31 for ( my $x = 0 ; $x < $number_of_permutations ; $x++ ) {
32 add_a_line($x);
33 }
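The precalculation in lines 15 to 30 can be sketched in Ruby, with a Set standing in for the %compared hash (a 4 x 4 board keeps it quick; this is my own sketch, not the poster's code):

```ruby
# For every ordered pair of candidate rows, record "x:y" when the rows
# do not clash (no column holds the same digit in both rows).
require 'set'

perms = (1..4).to_a.permutation.to_a
compared = Set.new
perms.each_index do |x|
  perms.each_index do |y|
    ok = perms[x].zip(perms[y]).none? { |a, b| a == b }
    compared << "#{x}:#{y}" if ok
  end
end
```

With the table built once up front, the inner loop of the search becomes a single hash/set lookup instead of a column-by-column comparison.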

And now to the improved add_a_line() function. The code has been
improved to check only that the newly added line does not conflict with
any of the existing lines, rather than repeatedly comparing the existing
(valid) lines against each other.

34 sub add_a_line {
35 my @lines = @_;

36 my $size = scalar(@lines);

37 my $ok = 1;

38 if ( $size > 1 ) {
39 for ( my $x = 0 ; $x < ( $size - 1 ) ; $x++ ) {
40 unless ( defined $compared{ $lines[$x] . ':' . $lines[-1] } ) {
41 $ok = 0;
42 last;
43 }
44 }
45 }

46 if ($ok) {
47 if ( $size == $width_of_board ) {
48 print join( ':', map { $output[$_] } @lines ) . "\n";
49 }
50 else {
51 for ( my $x = 0 ; $x < $number_of_permutations ; $x++ ) {
52 add_a_line( @lines, $x );
53 }
54 }
55 }
56 }

These changes took us down from 473 minutes to just 12. The elimination
of unnecessary comparisons in add_a_line() helped, as did the
precalculation of those comparisons. There are lessons to be learnt
here: write decent code and cache repetitive comparisons. There are no
great tricks; bad code can cost you dearly and simple things can bring
big improvements. So, with such a massive improvement, how could we make
our code any faster?

Write it in C.

Having learnt the lessons while developing the code in Perl, I am not
going to start the whole thing over in C. Instead, starting from version
four, I used the precalculation phase of the Perl script to write out a
C header file with data structures that would be useful for the C
program.

1 #define WIDTH_OF_BOARD 5
2 #define NUMBER_OF_PERMUTATIONS 120
3 char *output_strings[] = {
4 "54321",

123 "12345",
124 };
125 bool compared[NUMBER_OF_PERMUTATIONS][NUMBER_OF_PERMUTATIONS] = {
126 {false, false, ...

245 {false, false, ...
246 };
247 int work[WIDTH_OF_BOARD];
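That generation step can be sketched as a small script that writes the header. A Ruby sketch here (the original used Perl; the file name mirrors the post's Latin.h, and the compared table is omitted for brevity):

```ruby
# Emit a C header with the lookup tables baked in, so the C program does
# no setup work at run time.
width = 5
perms = (1..width).to_a.permutation.to_a

File.open("Latin.h", "w") do |f|
  f.puts "#define WIDTH_OF_BOARD #{width}"
  f.puts "#define NUMBER_OF_PERMUTATIONS #{perms.size}"
  f.puts "char *output_strings[] = {"
  perms.each { |p| f.puts "    \"#{p.join}\"," }
  f.puts "};"
end
```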

This then leaves the C code itself. Lines 1 to 7 include a load of
useful stuff; in fact, they probably include some quite unnecessary
stuff too, as I just cut and pasted them from another project.

1 #include <stdio.h>
2 #include <stdlib.h>
3 #include <stdbool.h>
4 #include <err.h>
5 #include <string.h>
6 #include <unistd.h>
7 #include <sys/types.h>

Line 8 is the header file that Perl precalculated.

8 #include "Latin.h"

Now the meat. The code is pretty much the same as version four, just
adapted to C. No special C tricks, no weird pointer stuff; an almost
line-for-line translation of the Perl code.

9 void
10 add_a_row(int row)
11 {
12 bool is_ok;
13 int x,y;

14 if (row == WIDTH_OF_BOARD) {
15 for (x = 0; x < WIDTH_OF_BOARD; x++) {
16 if (x == 0) {
17 printf("%s", output_strings[work[x]]);
18 } else {
19 printf(":%s", output_strings[work[x]]);
20 }
21 }
22 puts("");
23 } else {
24 for (x = 0; x < NUMBER_OF_PERMUTATIONS; x++) {
25 work[row] = x;

26 is_ok = true;
27 if (row != 0) {
28 for( y = 0; y < row; y++ ) {
29 if(compared[work[row]][work[y]] == false) {
30 is_ok = false;
31 break;
32 }
33 }
34 }
35 if (is_ok == true) {
36 add_a_row(row + 1);
37 }
38 }
39 }
40 }

41 int
42 main(int argc, char *argv[])
43 {
44 add_a_row(0);
45 }

And the C version ran in 5.5 seconds. In fact, the 5.5 seconds include
the Perl program that does all the precalculation to write the Latin.h
header file, the compilation of the C source, and finally the run of
the program itself. So we have not cheated by doing any work outside
the timings.

Just think of it: 12 minutes down to 5.5 seconds, without having to
write any incomprehensible C code. Because we all know that C code is
completely incomprehensible, what with it doing all that weird pointer
stuff all the time.

Now the Perl code could be improved; there are tricks that could be
pulled out of the bag to trim something off the 12 minutes. Perhaps
another language would be faster? But how far could you close the gap
between 12 minutes and 5.5 seconds?

Just to up the ante I added -fast -mcpu=7450 to the compiler flags (gcc
optimized for speed on a G4 Macintosh) and ran it again.

[Latin]$ time ./c_version.sh 5 > x5

real 0m3.986s
user 0m2.810s
sys 0m0.540s

Another 30% performance improvement without changing any code.

Let's review the languages we have used and their advantages. C is very
fast without any stupid tricks. C will also give you better control over
the amount of memory you use (the Perl code eats up massive amounts of
memory in comparison; the 9 x 9 grid threw an out-of-memory error on my
1GB machine).

It is much easier to develop in Perl. Any error message you get is
likely to at least give you a clue as to what the problem might be. C
programmers have to put up with the likes of 'Bus error' or
'Segmentation fault', which is why C programmers are grouches. Perl also
allows you to significantly improve your code without major rewrites.
There is a module called Memoize that can wrap a function and cache its
calls, all by adding two extra lines to your code; the same is true for
most scripting languages.
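Ruby has no core Memoize module, but the same wrap-and-cache trick is a couple of lines with a Hash default block. My own sketch, with slow_square as an illustrative stand-in for expensive work:

```ruby
# Count how often the "expensive" function actually runs, to show that
# the cache only computes each value once.
$calls = 0
def slow_square(n)
  $calls += 1   # stand-in for expensive work
  n * n
end

# The Hash's default block computes and stores a value on first lookup.
memo = Hash.new { |cache, n| cache[n] = slow_square(n) }
memo[12]   # computed: slow_square runs
memo[12]   # served from the cache: slow_square does not run again
```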

So what am I recommending here: write all your programs in C? No. Write
all your programs in Perl? No. Write them in your favourite scripting
language to refine the code, and then translate them into C if the
performance falls short of your requirements. Even if you intend to
write it in C all along, hacking the code together in Perl first lets
you play with the algorithm without having to worry about memory
allocation and other such C-style housekeeping. Good code is good code
in any language.

If you really, really want that performance boost then take the
following advice very seriously: "Write it in C".


Dr Nic

Jul 26, 2006, 5:03:10 AM
Do you have any preferred tutorials on wrapping Ruby around C libraries?

--
Posted via http://www.ruby-forum.com/.

Pit Capitain

Jul 26, 2006, 5:23:08 AM
Peter Hickman wrote:
> (Example of Perl and C Code)

Peter, is there any chance you could test your program with Ruby Inline?

http://rubyforge.org/projects/rubyinline

I'm on Windows, so I can't use Ruby Inline (+1 for MinGW btw :-)

Regards,
Pit

ben...@fysh.org

Jul 26, 2006, 5:40:24 AM
Peter Hickman gave a very good article about prototyping in a scripting
language, and then re-coding in c:

*snip*

> If you really really want that performance boost then take the following
> advice very seriously - "Write it in C".

I totally agree that with the current state of the art, this is the
right approach.

Maybe it doesn't need saying, but I'm going to: in the vast majority
of applications, almost all of the run time is spent in a tiny, wee,
small fraction of the code. This is the part that I would write in C (or
C++). The vast majority of the code (and it's not there just for fun,
it's still completely important to the application) will use a
vanishingly small fraction of processor time. This is the bit that I
would probably leave in Ruby.

People talk about the 80:20 principle, but in my experience it's much
more like 99:1 for applications. 99% of the code uses 1% of the run
time. 1% of the code consumes 99% of the run time. That could be the
signal processing and graphics heavy applications that I have
experienced though.

Thanks for the comparison, it was great. And thanks for the very nice
idea of pre-generating lookup tables in Perl. Nice.

Cheers,
Benjohn

peteth...@googlemail.com

Jul 26, 2006, 5:44:19 AM
It might be interesting to see how Java fares too - another route
again.

Pete

Tomasz Wegrzanowski

Jul 26, 2006, 5:54:02 AM
On 7/26/06, ben...@fysh.org <ben...@fysh.org> wrote:
> Peter Hickman gave a very good article about prototyping in a scripting
> language, and then re-coding in c:
>
> *snip*
>
> > If you really really want that performance boost then take the following
> > advice very seriously - "Write it in C".
>
> I totally agree that with the current state of the art, this is the
> right approach.

Sorry, I just couldn't resist - but maybe you should code Java instead -
http://kano.net/javabench/ ;-)

Jay Levitt

Jul 26, 2006, 7:26:28 AM
On Wed, 26 Jul 2006 17:47:13 +0900, Peter Hickman wrote:

> In this post I want to clear some things up and provide benchmarks as to
> why you should take "Write it in C" seriously.

This is a great post, and should at least be posted to a blog somewhere so
the masses who don't know about USENET can still find it on Google!

Jay

Peter Hickman

Jul 26, 2006, 7:47:01 AM
Robert Dober wrote:
> "Hmmm, maybe you should know that this kind of performance is not
> possible in Ruby or even slightly faster interpreted languages, and
> that you
> should consider writing part of it in C, it is not so difficult, have
> a look
> here or there"
>
> as opposed to those who write
>
> "Write it in C if you want speed"
>

Tact has never been one of my strong points; your phrasing was much
nicer, I will agree. The thing that has been bugging me with this whole
performance debate is that I am sure many people do not realise just
how fast a program written in C can run. The people who seem to take
offence at being told to write their code in C seem to have no
significant experience of using C. What also seems to happen is that
after saying that performance is their number 1, absolute top priority,
they start to backpedal, saying how hard C code is to develop. Yes, it
is harder to work with than a scripting language (it is all the rope you
need to hang yourself with), but you can write some blindingly fast code
if you are prepared to put in the effort. Wasn't that what they said was
their number 1, absolute top priority?


Peter Hickman

Jul 26, 2006, 7:48:14 AM
I may well put it on my web site, along with all the source code. Google
and Yahoo hit it enough.


Leslie Viljoen

Jul 26, 2006, 8:26:05 AM
On 7/26/06, Peter Hickman <pe...@semantico.com> wrote:
> Jay Levitt wrote:
> > On Wed, 26 Jul 2006 17:47:13 +0900, Peter Hickman wrote:
> >
> >
> >> In this post I want to clear some things up and provide benchmarks as to
> >> why you should take "Write it in C" seriously.

Something else to consider is the ease with which Ruby extensions can
be written in C. The first time I tried, I had something running in 20
minutes.

Though if I were going to choose a (single) language for raw
performance, I'd try to go with Pascal or Ada.


Les

James Edward Gray II

Jul 26, 2006, 10:02:48 AM
On Jul 26, 2006, at 8:57 AM, Charles O Nutter wrote:

> - Write it in C is as valid as write it in Java (as someone else
> mentioned).
> Java is at least as fast as C for most algorithms.

I'm Java Certified and I've been hearing people say this for years,
but I just haven't experienced this myself. You guys must develop on
much faster boxes than my MacBook Pro. ;)

James Edward Gray II


Peter Hickman

Jul 26, 2006, 10:13:05 AM
Charles O Nutter wrote:
> I'll lob a couple of grenades and then duck for cover.

>
> - Write it in C is as valid as write it in Java (as someone else
> mentioned).
> Java is at least as fast as C for most algorithms.

As someone who is paid to program in Java I very seriously doubt this.
However I will write a Java version of the code and time it. It should
be interesting to say the least.

> All this said, there's truth to the idea that we shouldn't *have* to
> write
> platform-level code to get reasonable performance, and every effort
> should
> be made to improve the speed of Ruby code as near as possible to that
> of the
> underlying platform code.
We are talking about two different things here. I was talking about
performance as the number 1, absolute top priority; you are talking
about 'reasonable performance'. As far as I am concerned, for those
scripts that I don't convert to C, Perl, Ruby and Python are fast
enough. Other people think that they are not; they seem to expect the
sort of performance C gives when they write things in a scripting
language. I think that they are barking.


Kroeger, Simon (ext)

Jul 26, 2006, 10:18:47 AM
Hi Peter!

> Whenever the question of performance comes up with scripting
> languages
> such as Ruby, Perl or Python there will be people whose
> response can be
> summarised as "Write it in C". I am one such person. Some people take
> offence at this and label us trolls or heretics of the true
> programming
> language (take your pick).

The last (and only) time I called someone a troll for saying
'Write it in C', it was in response to a Rails-related question.
Further, the OP asked for configuration items and such, but maybe
that's a whole other story. (And of course you can write
C extensions for Rails... yeah, yadda, yadda :) )

..snip 52 lines Perl, some hundred lines C ...

> [Latin]$ time ./Latin1.pl 5 > x5
>
> real 473m45.370s
> user 248m59.752s
> sys 2m54.598s
>
> [Latin]$ time ./Latin4.pl 5 > x5
>
> real 12m51.178s
> user 12m14.066s
> sys 0m7.101s
>
> [Latin]$ time ./c_version.sh 5
>
> real 0m5.519s
> user 0m4.585s
> sys 0m0.691s

Just to show the beauty of ruby:
-----------------------------------------------------------
require 'rubygems'
require 'permutation'
require 'set'

$size = (ARGV.shift || 5).to_i

$perms = Permutation.new($size).map{|p| p.value}
$out = $perms.map{|p| p.map{|v| v+1}.join}
$filter = $perms.map do |p|
s = SortedSet.new
$perms.each_with_index do |o, i|
o.each_with_index {|v, j| s.add(i) if p[j] == v}
end && s.to_a
end

$latins = []
def search lines, possibs
return $latins << lines if lines.size == $size
possibs.each do |p|
search lines + [p], (possibs - $filter[p]).subtract(lines.last.to_i..p)
end
end

search [], SortedSet[*(0...$perms.size)]

$latins.each do |latin|
$perms.each do |perm|
perm.each{|p| puts $out[latin[p]]}
puts
end
end
-----------------------------------------------------------
(does someone have a nicer/even faster version?)

Would you please run that on your machine? Perhaps you have to do a
"gem install permutation" first. (No, I don't think it's faster than
your C code, but it should beat the Perl version.)

> If you really really want that performance boost then take
> the following
> advice very seriously - "Write it in C".

Agreed, 100%, for those who want speed, speed and nothing
else there is hardly a better way.

thanks

Simon


Francis Cianfrocca

Jul 26, 2006, 10:24:07 AM
Peter Hickman wrote:
> We are talking about two different things here. I was talking about
> performance as being the number 1 absolute top priority, you are talking
> about 'reasonable performance'. As far as I am concerned for those
> scripts that I don't convert to C Perl, Ruby and Python are fast enough.
> Other people think that they are not, they seem to expect the sort of
> performance C gives when they write things in a scripting language. I
> think that they are barking.

I quite agree with this. To Nutter's point, one can make one's own
choice between C and Java once the decision has been made to write
platform-level code. But most code, perhaps nearly all code, should stay
in dyntyped script, in order to optimize the development/runtime
cost-balance. I think you can get tremendous benefits from factoring
code cleverly enough to keep the native-code components as small as
possible. And this is often a nontrivial exercise because it depends on
a good understanding of where performance costs come from in any
particular program.

To the point about Java: as I mentioned upthread, working-set size is
often the key limiting factor in Ruby performance. On a large and busy
server (which is my target environment most of the time), Java can be a
very inappropriate choice for the same reason!

Ryan McGovern

Jul 26, 2006, 10:29:06 AM
Charles O Nutter wrote:
>
> Well la-dee-dah! Seriously though, there's a volume of tests and
> benchmarks
> (for what they're worth) that show this to be true, and for
> memory-intensive
> situations Java almost always wins because lazy memory management allows
> work to get done first, faster. Granted, Java's abstractions make it
> easier
> to write bad code...but that's true of every language, including C.
>
> - Charles Oliver Nutter, CERTIFIED Java Developer
>
I don't doubt that for simple applications and algorithms Java is
nearly as fast as C, if not equivalent. But for larger Java projects
such as Eclipse, I've had a horrible time with it being slow and
cumbersome on the system, while Visual Studio runs fine and is far more
responsive. I don't really know why that is; it could be as simple as
some bad code in the Java GUI layer that Eclipse is using.

pat eyler

Jul 26, 2006, 10:42:40 AM

Well, I didn't post the original post (though I did link to it). I did
post my take on it. At its core: write it in Ruby and, if it's too slow,
profile it and rewrite the slow parts (in C if need be). Rewriting the
whole app in C, when Ruby makes cohabiting with C (or C++ or
Objective-C) so easy, just seems pointless.

My post is at http://on-ruby.blogspot.com/2006/07/rubyinline-making-making-things-faster.html


>
> Jay
>
>


--
thanks,
-pate
-------------------------
http://on-ruby.blogspot.com

Sean O'Halpin

Jul 26, 2006, 11:01:11 AM
On 7/26/06, Charles O Nutter <hea...@headius.com> wrote:

> there's nothing about a
> language's design that should necessitate it being slower than any other
> language.

While I accept that you shouldn't confuse a language with its implementation,
I find that a mildly surprising statement, especially since you said
in an earlier post:

> for memory-intensive situations Java almost always wins because lazy memory
> management allows work to get done first, faster

Garbage collection seems to me to be an integral part of Java's design.

Off the top of my head, I can think of some other design aspects that
have an effect on performance: method lookup in OO languages, scoping,
continuations, closures, static vs dynamic typing, type inference,
double dispatch, consing + car + cdr in Lisp, direct vs indirect
threading in Forth, etc. These are not just matters of implementation.
Each is a language design decision with a semantic effect which incurs
or avoids a computational cost, regardless of how it's actually
implemented. For example, Ruby has real closures, Python doesn't. I
don't see how you could ever reduce the cost of Ruby having closures
to zero - the memory alone is an implied overhead. Sure you can
optimize till the cows come home but different functionalities have
different costs and you can't ever avoid that.
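For example, a real closure in Ruby captures a live binding from its defining scope, which the implementation must keep alive after the method returns; that is the kind of per-feature cost being described. A minimal sketch:

```ruby
# The lambda closes over `count`, so the local variable's binding must
# survive after make_counter returns - memory the language has to spend.
def make_counter
  count = 0
  -> { count += 1 }
end

tick = make_counter
tick.call   # 1
tick.call   # 2
```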

Regards,
Sean

Dean Wampler

Jul 26, 2006, 11:03:50 AM
On 7/26/06, ben...@fysh.org <ben...@fysh.org> wrote:
> ...

>
> People talk about the 80:20 principle, but in my experience it's much
> more like 99:1 for applications. 99% of the code uses 1% of the run
> time. 1% of the code consumes 99% of the run time. That could be the
> signal processing and graphics heavy applications that I have
> experienced though.
> ...

This is the "value proposition" of the HotSpot technology in the
Java Virtual Machine. On the fly, it looks for bytecode sections that
are executed repeatedly and compiles them to native code, thereby doing
runtime optimization. This allows many Java server processes to run at
near-native speeds. When Ruby runs on a virtual machine, planned for
version 2, Ruby will be able to do that too. The JRuby project will
effectively accomplish the same goal.

--
Dean Wampler
http://www.objectmentor.com
http://www.aspectprogramming.com
http://www.contract4j.org

Pedro Côrte-Real

Jul 26, 2006, 11:07:08 AM
On 7/26/06, Sean O'Halpin <sean.o...@gmail.com> wrote:
> Off the top of my head, I can think of some other design aspects that
> have an effect on performance: method lookup in OO languages, scoping,
> continuations, closures, static vs dynamic typing, type inference,
> double dispatch, consing + car + cdr in Lisp, direct vs indirect
> threading in Forth, etc. These are not just matters of implementation.
> Each is a language design decision with a semantic effect which incurs
> or avoids a computational cost, regardless of how it's actually
> implemented. For example, Ruby has real closures, Python doesn't. I
> don't see how you could ever reduce the cost of Ruby having closures
> to zero - the memory alone is an implied overhead. Sure you can
> optimize till the cows come home but different functionalities have
> different costs and you can't ever avoid that.

In theory, if two programs in two different languages produce exactly
the same results, the perfect compilers for each language would end up
producing the same code. In theory, practice is the same as theory; in
practice, it isn't.

Cheers,

Pedro.

Kristof Bastiaensen

Jul 26, 2006, 11:32:15 AM
On Wed, 26 Jul 2006 17:47:13 +0900, Peter Hickman wrote:

> Whenever the question of performance comes up with scripting languages
> such as Ruby, Perl or Python there will be people whose response can be
> summarised as "Write it in C". I am one such person. Some people take
> offence at this and label us trolls or heretics of the true programming
> language (take your pick).
>

> <snip>

Hi,

When reading your C code, I saw that a lot of the code is generated.
I'd be interested to see how well the C program does if it has to work
for any size of square. In this case I think the problem is well suited
to logic languages. I wrote a version in the functional logic language
Curry, which does reasonably well. It will probably not be faster than
the C version, but it should be a lot faster than a program written in
Ruby/Perl/Python.

>If you really really want that performance boost then take the following
> advice very seriously - "Write it in C".

It can be a good idea to rewrite parts in C, but I would first check
whether the algorithms are good, so that it may not even be necessary to
write any C code. And perhaps there are libraries or tools that do the
trick efficiently. I would keep writing C code as the last option.

Regards,
Kristof

-------------------- start of latin.curry ----------------------------
-- upto is a nondeterministic function that evaluates to
-- a number from 1 upto n
upto 1 = 1
upto n | n > 1 = n ? upto (n-1)

-- check if the lists r s have no element with the same value at the
-- same position
elems_diff r s = and $ zipWith (/=) r s

-- extend takes a list of columns, and extends each column with a
-- number for the next row. It checks the number against the column and
-- against the previous numbers in the row.

extend :: [[Int]] -> Int -> [[Int]]
extend cols n = addnum cols [] where
addnum [] _ = []
addnum (col:cs) prev
| x =:= upto n &
(x `elem` prev) =:= False &
(x `elem` col) =:= False = (x:col) : addnum cs (x:prev)
where x free

latin_square n = latin_square_ n
where latin_square_ 0 = replicate n [] -- initialize columns to nil
latin_square_ m | m > 0 = extend (latin_square_ (m-1)) n

square2str s = unlines $ map format_col s
where format_col col = unwords $ map show col

main = mapIO_ (putStrLn . square2str) (findall (\s -> s =:= latin_square 5))
------------------------- end latin.curry -----------------------------

vasudevram

Jul 26, 2006, 11:19:40 AM
> > On Wed, 26 Jul 2006 17:47:13 +0900, Peter Hickman wrote:
> > > In this post I want to clear some things up and provide benchmarks as to
> > > why you should take "Write it in C" seriously.

Interesting series of messages! Got to save and read them through at
leisure ...

Just adding my 2c:

[I worked on a fairly complex performance-tuning job once, involving
multiple HP-UX boxes, Informix ESQL/C batch programs, IBM MQSeries (now
called WebSphere MQ), and UNIX IPC, and got a chance to do tuning at
several different levels: SQL queries, MQ logs/partitions, the C code,
algorithms in it, etc. It was very educational ... just sharing some
insights gained from that, from reading on the subject, and from smaller
hobby projects tuning my own code ...]

[Not implying that previous posters on this thread haven't done any of
the below].

Performance tuning in general is *very* complicated. Guesses or
assumptions like "this code tweak should make it run faster" often do
not work. The only way is a (semi-)scientific approach: measure/profile,
study the profile results, make hypotheses, change the code accordingly,
then re-measure to see if the change made a difference.

Tuning can be done at any of several different levels, ranging from:

- hardware (even here, not just throwing more or faster boxes at the
problem, but things like hardware architecture; obviously only if the
skills are available and the problem is worth the effort)

- software architecture

- algorithms and data structures optimization

- plain code tuning (things like common subexpression elimination), e.g.
using C syntax, changing:

for (i = 0; i < getLength(my_collection); i++) { /* do something with my_collection[i] */ }

to

collLength = getLength(my_collection);
for (i = 0; i < collLength; i++) { /* do something with my_collection[i] */ }

/* which removes the repeated/redundant call to the function getLength() */

Jon Bentley's book "Writing Efficient Programs" is a very good book
that discusses rules, examples and war stories of tuning at almost all
of these levels, including really excellent advice on code-level
tuning, which may sometimes be the easiest kind to apply to existing
code. Though the examples are in a pseudo-Pascal dialect (easily
understandable for those who know C), and though it may be out of print
now, for anyone with a real need for tuning advice it's worth trying
to get a used copy on eBay, from a friend, whatever.

It's chock-full of code examples with the tuning results (verified by
measurement, as stated above) and when (and when not) to apply them, and
the war stories are really interesting too ...

Googling for "performance tuning" and variants thereof will help ...

There's another book (for Java, mostly server-side programming) by a
guy called Dov (something - I forget his last name and the book title;
if I remember, I'll post it here) that's *really* excellent too - he
shows (again, with actual measurements) how some of the "expected"
results were actually wrong/counter-intuitive. He worked with IBM on
the web software for one of the recent Olympics.

HTH
Vasudev
----------------------------------------------------------------------------------
Vasudev Ram
Custom utility development in UNIX/C/sh/Java/Python/Ruby
Software consulting and training
http://www.dancingbison.com
----------------------------------------------------------------------------------

Peter Hickman

unread,
Jul 26, 2006, 11:21:01 AM7/26/06
to
I will run your Ruby version and the Java version that I write and post
the results here. Give us a week or so as I have other things to be doing.


Doug H

unread,
Jul 26, 2006, 11:23:08 AM7/26/06
to
Peter Hickman wrote:
> If you really really want that performance boost then take the following
> advice very seriously - "Write it in C".

Assuming you have no choice but C/C++. That's why I like using the
jvm or clr with languages like jruby, groovy, or boo. You don't have
to use C, you can use java or C# or boo itself (since it is statically
typed with type inference): http://boo.codehaus.org/
or C/C++ as well, although it is 100x easier to interface with a C lib
from the clr than it is from jvm with jni.

vasudevram

unread,
Jul 26, 2006, 11:33:52 AM7/26/06
to

Kristof Bastiaensen wrote:
> On Wed, 26 Jul 2006 17:47:13 +0900, Peter Hickman wrote:
>
> > Whenever the question of performance comes up with scripting languages
> > such as Ruby, Perl or Python there will be people whose response can be
> > summarised as "Write it in C". I am one such person. Some people take
> > offence at this and label us trolls or heretics of the true programming
> > language (take your pick).
> >
> > <snip>
>
> Hi,
>
> When reading your C code, I saw that there is a lot of code that is
> generated. I'd be interested to see how well the C program does if
> it can work for any size of the squares. In this case I think the problem
> is well suited for logic languages. I wrote a version in the functional
> logic language Curry, which does reasonably well. It will probably not be

Interesting ... I read somewhere that the OCaml language, while
higher-level than C (and a functional one too), runs some programs at
least, as fast or faster than C ...
Not sure how true that is ...

Vasudev
http://www.dancingbison.com

Sean O'Halpin

unread,
Jul 26, 2006, 11:40:03 AM7/26/06
to
On 7/26/06, Pedro Côrte-Real <pe...@pedrocr.net> wrote:
>
> In theory if two programs in two different languages produce the same
> exact results the perfect compilers for each of the languages would
> end up producing the same code. In theory practice is the same as
> theory but in practice it isn't.
>
> Cheers,
>
> Pedro.
>
>
In theory, an infinite number of computer scientists hacking for an
infinite amount of time on a keyboard will eventually almost surely
produce a perfect compiler.

In practice, I can't wait that long ;)

Cheers,
Sean

David Pollak

unread,
Jul 26, 2006, 11:42:46 AM7/26/06
to
Writing code that runs as fast in Java as it does in C is real work,
but it's possible.

Integer (http://athena.com) is a pure Java spreadsheet. I optimized
the numeric functions and array functions (e.g., SUM(A1:G99)) such
that Integer runs as fast or faster than Excel and OpenOffice Calc on
identical hardware. However, it required a mind shift from "standard"
Java programming.

In addition, because Java has nice semantics for multithreading, I was
able to implement some very cleaver algorithms such that Integer's
recalculation speed scales nearly linearly with additional CPUs up to
a certain point (the linearity goes away at around 16 processors on a
big Sun box.) But I digress.

First, I pre-allocated a lot of workspace so that there's little
memory allocation going on during recalculation.

Second, I avoid Monitors (synchronized) as much as possible.

Third, I write "C" style Java (lots of switch statements, putting
parameters and results in buffers rather than passing them on the
stack, etc.)

Memory usage in Java is higher than in C. If Java had structs a la
C#/Mono, it'd be possible to squeeze that much more performance from
Java.

There are some applications that will never perform as well in Java
(e.g., stuff that's heavily oriented to bit manipulation). But for many
classes of applications (e.g., spreadsheets) Java can perform as well
as C.

When I care about computational performance, I go with Java or in a
rare case, C++ (C style C++, no STL or virtual methods). If I care
about developer performance, I've been going with Ruby more and more.

My 2 cents.


--
--------
David Pollak's Ruby Playground
http://dppruby.com

har...@schizopolis.net

unread,
Jul 26, 2006, 11:55:05 AM7/26/06
to

You read that correctly. The problem is that nearly every benchmark I've
seen for comparing the performance of various languages has been a
repeated mathematical operation like computing a Mandelbrot Set or running
Fibonacci Sequences that all but guarantees the edge will belong to
functional languages like Haskell and OCAML or stripped-down assembly-like
languages like C (http://shootout.alioth.debian.org/debian/ for samples),
because they are best suited for straight-up number crunching. Are there
good benchmarks for OO languages? Or dynamic languages? Are there good
benchmarks that could actually measure the types of uses I need, where I'm
building a web front end to a DB store? I don't know about you, but my job
has never involved fractals.

I used to put faith into benchmarks like this, but now I think about
developer time and maintenance time as well. That seems to be a more
intelligent approach.

Jake


Gregory Brown

unread,
Jul 26, 2006, 12:03:13 PM7/26/06
to
On 7/26/06, David Pollak <pol...@gmail.com> wrote:

> In addition, because Java has nice semantics for multithreading, I was
> able to implement some very cleaver algorithms such that Integer's
> recalculation speed scales nearly linearly with additional CPUs up to
> a certain point (the linearity goes away at around 16 processors on a
> big Sun box.) But I digress.

This must be evidence of true cutting edge development ;)

Isaac Gouy

unread,
Jul 26, 2006, 12:13:27 PM7/26/06
to

"The results I got were that Java is significantly faster than
optimized C++ in many cases... I've been accused of biasing the results
by using the -O2 option for GCC..."

"...so I took the benchmark code for C++ and Java from the now outdated
Great Computer Language Shootout and ran the tests myself"

Not so outdated
http://shootout.alioth.debian.org/gp4sandbox/benchmark.php?test=all&lang=java&lang2=gpp

Kristof Bastiaensen

unread,
Jul 26, 2006, 12:33:22 PM7/26/06
to
On Thu, 27 Jul 2006 00:55:05 +0900, harrisj wrote:

>>
>> Kristof Bastiaensen wrote:
>>>
>>> <snipped a lot>


>> Interesting ... I read somewhere that the OCaml language, while
>> higher-level than C (and a functional one too), runs some programs at
>> least, as fast or faster than C ...
>> Not sure how true that is ...
>>
>> Vasudev
>> http://www.dancingbison.com
>
> You read that correctly. The problem is that nearly every benchmark I've
> seen for comparing the performance of various languages has been a
> repeated mathematical operation like computing a Mandelbrot Set or running
> Fibonacci Sequences that all but guarantees the edge will belong to
> functional languages like Haskell and OCAML or stripped-down assembly-like
> languages like C (http://shootout.alioth.debian.org/debian/ for samples),
> because they are best suited for straight-up number crunching.

In some cases the functional version is faster because the problem can be
more easily described in a functional way. But in general code produced
by ocaml is about twice as slow as C, because the compiler doesn't do the
same extensive optimizations as for example gcc does. But that's still
pretty good.

> Are there
> good benchmarks for OO languages? Or dynamic languages? Are there good
> benchmarks that could actually measure the types of uses I need, where I'm
> building a web front end to a DB store? I don't know about you, but my job
> has never involved fractals.
>

> <snip>

True, benchmarks only measure execution speed, but they don't show if a
given programmer will be productive in them. I think that's also
largely a personal choice. Some people may be more productive in a
functional language, some people more in Ruby. And others even in perl... :)

Kristof

Tim Hoolihan

unread,
Jul 26, 2006, 12:30:42 PM7/26/06
to
Languages compiled to machine code can only make compile time
optimizations, while vm and interpreted languages have run time
optimizations available. There is an article that discusses this better
than I can:

Available here:
http://unmoldable.com/story.php?a=235
or here:
http://www.informit.com/articles/article.asp?p=486104&rl=1

I wouldn't argue that there is much faster than C or assembly currently,
but I think this article lays out a good roadmap for how HLLs can catch up.

Also, pages 14-16 of "CLR via C#" by Jeffrey Richter provide a good
summary of the advantages of run-time optimization.

-Tim

David Pollak

unread,
Jul 26, 2006, 12:28:18 PM7/26/06
to
Greg,

In spreadsheets, it is cutting edge. Name one other commercial
spreadsheet that can use more than 1 CPU?

David

Martin Ellis

unread,
Jul 26, 2006, 12:28:15 PM7/26/06
to
Sean O'Halpin wrote:
> In theory, an infinite number of computer scientists hacking for an
> infinite amount of time on a keyboard will eventually almost surely
> produce a perfect compiler.
>
> In practice, I can't wait that long ;)

You'd be waiting a long time indeed :o).

I believe our good friend Mr. Turing proved [1] that such
a compiler could never exist, some seven decades ago.

Oh well.

Martin


[1] OK. That wasn't exactly what he proved.
But only this particular corollary is relevant.

James Edward Gray II

unread,
Jul 26, 2006, 12:35:10 PM7/26/06
to
On Jul 26, 2006, at 11:28 AM, David Pollak wrote:

> Greg,
>
> In spreadsheets, it is cutting edge. Name one other commercial
> spreadsheet that can use more than 1 CPU?

I'm pretty sure Greg was funning around with the comical typo in your
post. Take a look back at how you spelled "clever." ;)

James Edward Gray II


ara.t....@noaa.gov

unread,
Jul 26, 2006, 12:39:29 PM7/26/06
to
On Wed, 26 Jul 2006, Peter Hickman wrote:

> Whenever the question of performance comes up with scripting languages such
> as Ruby, Perl or Python there will be people whose response can be
> summarised as "Write it in C". I am one such person. Some people take
> offence at this and label us trolls or heretics of the true programming
> language (take your pick).
>
> I am assuming here that when people talk about performance they really mean
> speed. Some will disagree but this is what I am talking about.
>
> In this post I want to clear some things up and provide benchmarks as to why
> you should take "Write it in C" seriously. Personally I like to write my
> projects in Ruby or Perl or Python and then convert them to C to get a
> performance boost. How much of a boost? Well here I will give you some hard
> data and source code so that you can see just how much of a boost C can give
> you.
>
> The mini project in question is to generate all the valid Latin squares. A
> Latin square is a grid of numbers (lets say a 9 x 9 grid) that has the
> numbers 1 to 9 appear only once in each row and column. Sudoku grids are a
> subset of Latin squares.
>
> The approach taken is to create a list of all the permutations and then
> build up a grid row by row checking that the newly added row does not
> conflict with any of the previous rows. If the final row can be added
> without problem the solution is printed and the search for the next one
> starts. It is in essence depth first search. The first version of the
> program that I wrote in Perl took 473 minutes to generate all the valid 5 x
> 5 Latin squares, version four of the program took 12 minutes and 51 seconds.
> The C version of the program took 5.5 seconds to produce identical results.
> All run on the same hardware.

just for fun, here's a ruby version (note that the array would actually need
to be reformed into rows, but someone else can play with that)

harp:~ > cat a.rb
require 'gsl'

n = Integer(ARGV.shift || 2)

width, height = n, n

perm = GSL::Permutation.alloc width * height

p perm.to_a until perm.next == GSL::FAILURE


it's not terribly fast to run - but it was to write!

-a
--
suffering increases your inner strength. also, the wishing for suffering
makes the suffering disappear.
- h.h. the 14th dalai lama

Isaac Gouy

unread,
Jul 26, 2006, 12:43:53 PM7/26/06
to

Once you ignore the stuff that beats up on hash-tables, character
manipulation, regex, concurrency, memory allocation, ... you will be
left with array-access and number crunching.

otoh regex-dna makes Tcl look good :-)
http://shootout.alioth.debian.org/gp4/benchmark.php?test=regexdna&lang=tcl&id=2


> Are there good benchmarks for OO languages? Or dynamic languages? Are there good
> benchmarks that could actually measure the types of uses I need, where I'm
> building a web front end to a DB store? I don't know about you, but my job
> has never involved fractals.

Never involved double math in tight loops?


> I used to put faith into benchmarks like this, but now I think about
> developer time and maintenance time as well. That seems to be a more
> intelligent approach.
>
> Jake

"your application is the ultimate benchmark"
http://shootout.alioth.debian.org/gp4/miscfile.php?file=benchmarking&title=Flawed%20Benchmarks#app

Gregory Brown

unread,
Jul 26, 2006, 12:57:14 PM7/26/06
to
On 7/26/06, James Edward Gray II <ja...@grayproductions.net> wrote:

James gets right to the point. I was just taking a slice at your
typo, not Integer. :)

David Pollak

unread,
Jul 26, 2006, 1:49:54 PM7/26/06
to
Guess I should unit test my posts... :-)

On 7/26/06, Gregory Brown <gregory...@gmail.com> wrote:

Chad Perrin

unread,
Jul 26, 2006, 2:18:52 PM7/26/06
to
On Wed, Jul 26, 2006 at 11:29:06PM +0900, Ryan McGovern wrote:
> >
> I don't doubt that for simple applications and algorithms Java is nearly as
> fast as C, if not equivalent. Though for larger Java projects such as
> Eclipse, I've had a horrible time of it being slow and cumbersome on the
> system, while Visual Studio will run fine and be far more responsive.
> I don't really know why that is; it could be as simple as some bad code in
> the Java GUI layer that Eclipse is using.

Doubtful. Java does generally produce notably faster applications than
Ruby, and there are benchmarks that show that in specific instances it
can hustle almost as well as C -- even scalably so. A more
comprehensive survey of benchmarks, on the other hand, starts to take
its toll on Java's reputation for speed. The first problem is that C
isn't object oriented and, while OOP can be great for programmer
productivity under many circumstances (particularly involving larger
projects), it introduces a bunch of interface activity between parts of
the program which begins to slow things down. Furthermore, Java's
bytecode-compilation and VM interpretation can increase execution speed
at runtime by cutting out part of the process of getting from source to
binary, but it still requires interpretation and relies on the
performance of the VM itself (which is, sad to say, not as light on its
feet as many would like).

In fact, there are cases where the Perl runtime compiler's quickness
makes Java's VM look dog-slow. If your purpose for using a language
other than whatever high-level language you prefer is maximum
performance (presumably without giving up readable source code), Java
isn't an ideal choice. If your high-level language of choice is Perl,
there's actually very little reason for Java at all, and the same is
true of some Lisp interpreters/compilers.

For those keen on functional programming syntax, Haskell is a better
choice than Java for performance: in fact, the only thing keeping
Haskell from performing as well as C, from what I understand, is the
current state of processor design. Similarly, O'Caml is one of the
fastest non-C languages available: it consistently, in a wide range of
benchmark tests and real-world anecdotal comparisons, executes "at least
half as quickly" as C, which is faster than it sounds.

The OP is right, though: if execution speed is your top priority, use C.
Java is an also-ran -- what people generally mean when they say that
Java is almost as fast as C is that a given application written in both
C and Java "also runs in under a second" in Java, or something to that
effect. While that may be true, there's a significant difference
between 0.023 seconds and 0.8 seconds (for a hypothetical example).

--
CCD CopyWrite Chad Perrin [ http://ccd.apotheon.org ]
"The ability to quote is a serviceable
substitute for wit." - W. Somerset Maugham

Chad Perrin

unread,
Jul 26, 2006, 2:27:05 PM7/26/06
to
On Thu, Jul 27, 2006 at 12:42:46AM +0900, David Pollak wrote:
> Writing code that runs as fast in Java as it does in C is real work,
> but it's possible.

. . . the problem being that putting the same effort into optimizing a C
program will net greater performance rewards as well. The only language
I've ever seen stand up to C in head-to-head optimization comparisons
with any consistency, and even outperform it, was Delphi-style Object
Pascal, and that's only anecdotal comparisons involving my father (who
knows Delphi's Object Pascal better than most lifelong C programmers
know C), so the comparisons might be somewhat skewed. My suspicion is
that the compared C code can be tweaked to outperform the Object Pascal
beyond Object Pascal's ability to be tweaked for performance -- the
problem being that eventually you have to stop tweaking your code, so
sometimes the Object Pascal might be better anyway.

Java doesn't even come close to that level of performance optimization,
alas. At least, not from what I've seen.


>
> There are some applications that will never perform as in Java (e.g.,
> stuff that's heavily oriented to bit manipulation.) But for many
> classes of applications (e.g., spreadsheets) Java can perform as well
> as C.

Is that heavily optimized Java vs. "normal" (untweaked) C?

--
CCD CopyWrite Chad Perrin [ http://ccd.apotheon.org ]

Brian K. Reid: "In computer science, we stand on each other's feet."

Chad Perrin

unread,
Jul 26, 2006, 2:43:41 PM7/26/06
to
On Wed, Jul 26, 2006 at 11:26:45PM +0900, Charles O Nutter wrote:
> On 7/26/06, Peter Hickman <pe...@semantico.com> wrote:
> >
> >Charles O Nutter wrote:
> >> I'll lob a couple of grenades and then duck for cover.
> >>
> >> - Write it in C is as valid as write it in Java (as someone else
> >> mentioned).
> >> Java is at least as fast as C for most algorithms.
> >
> >As someone who is paid to program in Java I very seriously doubt this.
> >However I will write a Java version of the code and time it. It should
> >be interesting to say the least.
>
> Doubt all you like.

Full disclaimer included:
As someone who is NOT paid to program in Java, and in fact finds Java
rather odious, and would rather write code in almost anything else that
isn't the annoyance factor equivalent of VB, I too doubt it. Aside from
not being paid to program in Java, though, I have played with Java code,
I have researched Java performance characteristics extensively in the
performance of job tasks, I've looked at hundreds of benchmarks over the
years, and I know a fair bit about programming language interpreters
and parsers in the abstract. The end result is that characterizing Java
as "at least as fast as C" in most cases and faster in many other cases
sounds like a load of (perhaps well-meaning) hooey to me.


>
> >We are talking about two different things here. I was talking about
> >performance as being the number 1 absolute top priority, you are talking
> >about 'reasonable performance'. As far as I am concerned for those
> >scripts that I don't convert to C Perl, Ruby and Python are fast enough.
> >Other people think that they are not, they seem to expect the sort of
> >performance C gives when they write things in a scripting language. I
> >think that they are barking.
>
> They may be hoping for the impossible, but that doesn't mean they shouldn't
> hope and that doesn't mean they shouldn't be frustrated when the stock
> answer is "write it in C". The fact that Ruby and other dynamic languages
> are not as fast as compiled C is not the language's fault or the user's
> fault...it's an implementation flaw. It certainly may be a flaw that can't
> be fixed, a problem impossible to solve, but there's nothing about a
> language's design that should necessitate it being slower than any other
> language. Perhaps we haven't found the right way to implement these
> languages, or perhaps some of us have and others just aren't there yet.
> Either way, it's not the language that's eventually the problem...it's
> simply the distance from what the underlying platform wants to run that's an
> issue. C is, as they put it, closer to "bare metal", but only because C is
> little more than a set of macros on top of assembly code. If the underlying
> processor ran YARV bytecodes, I doubt Ruby performance would be a concern.

I'd say "yes and no" to that. There are things about Ruby -- things
that I absolutely would not change -- that necessitate slower runtime
execution. For instance, for Ruby to work as specced, it needs dynamic
typing, which is simply slower in execution, because typing becomes a
part of execution. Static typing doesn't require types to be set at
execution: they can be set at compile-time, because they don't have to
change depending on runtime conditions. Thus, you add an extra step to
runtime execution a variable (pardon the pun) number of times. It's an
unavoidable execution-speed loss based on how the Ruby language is
supposed to work, and it's a characteristic of Ruby that I absolutely
would not throw away for better runtime performance. Because of this,
of course, it is effectively impossible to use persistent compiled
executables for Ruby to solve the runtime execution performance gap that
is introduced by dynamic typing as well. C'est la vie.

Other, similar difficulties arise as well. Ultimately, it's not the
fact that it's an interpreted language that is the problem. That can be
solved via a number of tricks (just-in-time compilation similar to Perl,
bytecode compilation, or even simply writing a compiler for it, for
example), if that's the only problem. The main problem is that, like
Perl, Python, and various Lisps, it's a dynamic language: it can be used
to write programs that change while they're running. To squeeze the
same performance out of Ruby that you get out of C, you'd have to remove
its dynamic characteristics, and once you do that you don't have Ruby
any longer.
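That dynamism is easy to demonstrate in a few lines of plain Ruby (a minimal illustrative sketch, nothing library-specific): a method is redefined while the program runs, so no compiler could have bound the call site ahead of time.

```ruby
class Greeter
  def greet
    'hello'
  end
end

g = Greeter.new
before = g.greet   # dispatch resolves to the first definition

# Redefine the method at runtime -- legal, ordinary Ruby.
class Greeter
  def greet
    'bonjour'
  end
end

after = g.greet    # the SAME object now answers differently

puts before  # => "hello"
puts after   # => "bonjour"
```

Because the answer to "which method does `g.greet` call?" can change at any moment, every call site has to be resolved (or at least re-validated) at runtime — which is exactly the extra step the paragraph above describes.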

Simon Kröger

unread,
Jul 26, 2006, 2:47:48 PM7/26/06
to
Peter Hickman wrote:
> I will run your Ruby version and the Java version that I write and post
> the results here. Give us a week or so as I have other things to be doing.

Hmm, in a week this discussion will be over (ok, it will reappear some time
soon, but nevertheless) and everybody has swallowed your points.

$ ruby -v
ruby 1.8.4 (2005-12-24) [i386-mingw32]

$ time ruby latin.rb 5 > latin.txt

real 0m4.703s
user 0m0.015s
sys 0m0.000s

(this is a 2.13GHz PentiumM, 1GB RAM, forget the user and sys timings, but
'real' is for real, this is WinXP)

My point is: If you choose the right algorithm, your program will get faster by
orders of magnitude - spending time optimizing algorithms is better than
spending the same time recoding everything in C. In a not so distant future
(when the interpreter is better optimized, or perhaps YARV sees the light of
day) my version will be even faster than yours. It will be easier to maintain
and more platform independent.

Of course you can port this enhanced version to C and it will be even faster,
but if you have a limited amount of time/money to spend on optimization i would
say: go for the algorithm.

To stop at least some of the flames: I like extensions, I like them most if
they are generally useful (and fast) like gsl, NArray, whatever. The
combination of such extensions and optimized algorithms (written in Ruby)
would be my preferred solution if I had a performance-critical problem that
I'm allowed to tackle in Ruby.

cheers

Simon

p.s.: and if my solution is only that fast because of a bug (I really hope
not), I think my conclusions still hold true.
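For reference, the row-by-row, column-conflict-pruning search described at the top of the thread fits in a few lines of plain Ruby. This is only a sketch of that general approach (Simon's actual program wasn't posted inline), run at 4 x 4 so it finishes instantly:

```ruby
# Enumerate all n x n Latin squares by depth-first search:
# place one permutation per row, rejecting any candidate row that
# repeats a symbol in some column of the rows already placed.
def latin_squares(n, rows = [], &blk)
  return blk.call(rows) if rows.size == n
  (1..n).to_a.permutation do |row|
    # prune: a clash in any column kills this whole subtree
    next if rows.any? { |prev| prev.zip(row).any? { |a, b| a == b } }
    latin_squares(n, rows + [row], &blk)
  end
end

count = 0
latin_squares(4) { count += 1 }
puts count   # => 576 valid 4 x 4 Latin squares
```

The same call with 5 enumerates the 161,280 valid 5 x 5 squares the thread benchmarks; the early column-conflict check is what keeps the search from touching most of the permutation space.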

Martin Ellis

unread,
Jul 26, 2006, 2:52:28 PM7/26/06
to
Chad Perrin wrote:
> Haskell is a better choice than Java for performance:

I suspect it depends what you're doing...

> in fact, the only thing keeping
> Haskell from performing as well as C, from what I understand, is the
> current state of processor design.

I'm interested to know more about that.
Could you elaborate? A reference would do.

Cheers
Martin

Chad Perrin

unread,
Jul 26, 2006, 3:19:38 PM7/26/06
to
On Thu, Jul 27, 2006 at 03:55:10AM +0900, Martin Ellis wrote:
> Chad Perrin wrote:
> > Haskell is a better choice than Java for performance:
>
> I suspect it depends what you're doing...

To clarify: I meant "on average" or "in general". Obviously, there will
be instances where Java will outperform Haskell or, for that matter,
even C -- just as there are times Perl can outperform C, for an
equivalent amount of invested programmer time, et cetera. I suspect the
same is true even of Ruby, despite its comparatively crappy execution
speed. That doesn't change the fact that in the majority of cases,
Haskell will outperform most other languages. It is, after all, the C
of functional programming.

>
> > in fact, the only thing keeping
> > Haskell from performing as well as C, from what I understand, is the
> > current state of processor design.
>
> I'm interested to know more about that.
> Could you elaborate? A reference would do.

I'm having difficulty finding citations for this that actually explain
anything, but the short and sloppy version is as follows:

Because imperative style programming had "won" the programming paradigm
battle back in the antediluvian days of programming, processors have
over time been oriented more and more toward efficient execution of code
written in that style. When a new processor design and a new
instruction set for a processor is shown to be more efficient in code
execution, it is more efficient because it has been better architected
for the software that will run on it, to better handle the instructions
that will be given to it with alacrity. Since almost all programs
written today are written in imperative, rather than functional, style,
this means that processors are optimized for execution of imperative
code (or, more specifically, execution of binaries that are compiled
from imperative code).

As a result, functional programming languages operate at a slight
compilation efficiency disadvantage -- a disadvantage that has been
growing for decades. There are off-hand remarks all over the web about
how functional programming languages supposedly do not compile as
efficiently as imperative programming languages, but these statements
only tell part of the story: the full tale is that functional
programming languages do not compile as efficiently on processors
optimized for imperative-style programming.

We are likely heading into an era where that will be less strictly the
case, however, and functional languages will be able to start catching
up, performance-wise. Newer programming languages are beginning to get
further from their imperative roots, incorporating more characteristics
of functional-style languages (think of Ruby's convergence on Lisp, for
instance). For now, however, O'Caml and, even moreso, Haskell suffer at
a disadvantage because their most efficient execution environment isn't
available on our computers.

--
CCD CopyWrite Chad Perrin [ http://ccd.apotheon.org ]

"Real ugliness is not harsh-looking syntax, but having to
build programs out of the wrong concepts." - Paul Graham

Chad Perrin

unread,
Jul 26, 2006, 3:24:11 PM7/26/06
to
On Wed, Jul 26, 2006 at 09:26:05PM +0900, Leslie Viljoen wrote:
>
> Something else to consider is the ease with which Ruby extensions can
> be written in C. The first time I tried I has something running in 20
> minutes.
>
> Though if I was going to choose a (single) language for raw
> performance I'd try to go with Pascal or Ada.

Pascal's sort of an iffy proposition for me, in comparison with C. I'm
simply not sure that it can be optimized as thoroughly as C, in any
current implementations. According to its spec, it can probably
outperform C if implemented well, and Borland Delphi does a reasonably
good job of that, but it has received considerably less attention from
compiler programmers over time and as such is probably lagging in
implementation performance. It's kind of a mixed bag, and I'd like to
get more data on comparative performance characteristics than I
currently have.

Ada, on the other hand -- for circumstances in which it is most commonly
employed (embedded systems, et cetera), it does indeed tend to kick C's
behind a bit. That may have more to do with compiler optimization than
language spec, though.

--
CCD CopyWrite Chad Perrin [ http://ccd.apotheon.org ]

"A script is what you give the actors. A program
is what you give the audience." - Larry Wall

Chad Perrin

unread,
Jul 26, 2006, 3:27:24 PM7/26/06
to
On Wed, Jul 26, 2006 at 08:30:11PM +0900, Jay Levitt wrote:
> On Wed, 26 Jul 2006 17:47:13 +0900, Peter Hickman wrote:
>
> > In this post I want to clear some things up and provide benchmarks as to
> > why you should take "Write it in C" seriously.
>
> This is a great post, and should at least be posted to a blog somewhere so
> the masses who don't know about USENET can still find it on Google!

This list is not only on USENET, for what it's worth.

--
CCD CopyWrite Chad Perrin [ http://ccd.apotheon.org ]

Ashley Moran

unread,
Jul 26, 2006, 3:27:57 PM7/26/06
to

On Jul 26, 2006, at 8:03 pm, Chad Perrin wrote:

> This recent mania for VMs is irksome to me. The same benefits can be
> had from a JIT compiler, without the attendant downsides of a VM (such
> as greater persistent memory usage, et cetera).

Chad,

I'm late to this conversation but I've been interested in Ruby
performance lately. I just had to write a script to process about
1-1.5GB of CSV data (no major calculations, but it involves about 20
million rows, or something in that region). The Ruby implementation
I wrote takes about 2.5 hours to run - I think memory management is
the main issue, as the manual garbage collection run I added after
each file takes several minutes for the larger sets of data. As
you can imagine, I am more than eager for YARV/Rite.
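For what it's worth, streaming the rows one at a time keeps only the current row live, which tends to keep GC pressure flat for workloads like this. A hedged sketch with the stdlib CSV class — the file layout, field names and row count here are invented purely for illustration:

```ruby
require 'csv'
require 'tempfile'

# Build a small CSV file to stand in for one of the real data files.
file = Tempfile.new(['rows', '.csv'])
file.puts 'id,amount'
1.upto(1000) { |i| file.puts "#{i},#{i * 2}" }
file.close

# Stream one row at a time: only the current row is live, so the
# garbage collector never has millions of rows to walk at once.
total = 0
CSV.foreach(file.path, headers: true) do |row|
  total += row['amount'].to_i
end

puts total   # => 1001000 (sum of 2, 4, ..., 2000)

file.unlink  # clean up the illustrative temp file
```

Whether this helps the 2.5-hour run depends on whether the current script slurps whole files; if it already streams, the bottleneck is elsewhere.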

Anyway, my question really is: I thought a VM was a prerequisite
for JIT? Is that not the case? And if the YARV VM is not the way to
go, what is?

Ashley

Chad Perrin

unread,
Jul 26, 2006, 2:59:49 PM7/26/06
to
On Thu, Jul 27, 2006 at 12:23:23AM +0900, Charles O Nutter wrote:
>
> You're mixing language semantics and implementation details here. The
> mechanics of method lookup is not a language feature; it's an implementation
> detail. On the other hand, the logic of which method gets invoked in a
> hierarchy of types is a language detail. Scoping is a language feature, but
> the means by which scope is maintained is an implementation detail.

In some ways, you're right: implementation details are being mixed up
with language definition in the preceding list of features. In the case
of scoping, however, you're not entirely right with regard to "the means
by which scope is maintained". Dynamic scoping, by definition, requires
runtime scoping. Static scoping, by definition, does not. This means
that (to use Perl as an example, since I know it better than Ruby) my(),
which is used to declare variables in lexical scope, can be managed at
compile time, while local(), which is used for dynamic scope, can only
be managed at runtime -- else it will not work as advertised. That's
more than simply implementation details: implementation is, in this
case, dictated by language features.


>
> As for the closure comment...sure, there's overhead in creating closures,
> but it's *explicit* overhead. This is no different from creating the
> equivalent of closures in languages that don't support them. The concept of
> a closure has a clear specification and certainly increases the complexity
> of a language and an underlying implementation. But that complexity may not
> in itself always represent a decrease in performance, since other means of
> accomplishing the same task may be even less performant. That's how it is
> for any language feature...you take the good with the bad, and if you don't
> use all of the good you may be less-than-optimal. If using a feature has ten
> good aspects and five bad, and you only make use of five good aspects, then
> your code is sub-optimal. If you use less than five, you're even worse off
> and perhaps should consider doing things differently. Nothing about the
> feature itself explicitly implies that performance should degrade by using
> it...it's a matter of using those features wisely and making optimal use of
> their good aspects, balanced with their bad aspects.

I think closures are kind of a bad example for this, actually. There's
nothing about closures that necessarily harms performance of the
language in implementation. In fact, closures are in some respects
merely a happy accident that arises as the result of other, positive
characteristics of a language that all can tend to contribute to better
performance of the implementation of a language (such as lexical scope,
which leads to better performance than dynamic scope). In fact, one of
the reasons Python doesn't have proper closures (lack of strict lexical
scoping) is also likely one of the reasons Python still tends to lag
behind Perl for performance purposes.

The only real overhead involved in closures, as far as I can see, is the
allocation of memory to a closure that doesn't go away until the program
exits or, in some implementations, until the program reaches a point
where it will absolutely, positively never need that closure again
(which is roughly the same thing for most uses of closures). A little
extra memory usage does not translate directly to performance loss. In
fact, in any halfway-decent system implementation, it really shouldn't
result in reduced performance unless you start having to swap because
you've overrun "physical RAM", I think.

The day may come when RAM is better managed so that performance gains
can be had for less memory usage, though, so I doubt this will always be
true.
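
A minimal Ruby sketch of the memory retention being described (names and numbers are invented for illustration) -- the closure keeps its captured binding alive after the enclosing method has returned:

```ruby
# A closure captures the local variables of its defining scope,
# so those variables stay reachable for as long as the closure does.
def make_counter
  count = 0                # local to make_counter...
  lambda { count += 1 }    # ...but captured by the lambda
end

counter = make_counter
counter.call
counter.call
puts counter.call          # prints 3 -- the binding outlived make_counter
```

That lingering `count` binding is exactly the allocation that can't be freed until the closure itself is unreachable.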

--
CCD CopyWrite Chad Perrin [ http://ccd.apotheon.org ]

Chad Perrin

unread,
Jul 26, 2006, 3:03:23 PM7/26/06
to
On Thu, Jul 27, 2006 at 12:03:50AM +0900, Dean Wampler wrote:
> On 7/26/06, ben...@fysh.org <ben...@fysh.org> wrote:
> >...
> >
> >People talk about the 80:20 principle, but in my experience it's much
> >more like 99:1 for applications. 99% of the code uses 1% of the run
> >time. 1% of the code consumes 99% of the run time. That could be the
> >signal processing and graphics heavy applications that I have
> >experienced though.
> >...
>
> This is the "value proposition" of the "Hot Spot" technology in the
> Java Virtual Machine. On the fly, it looks for byte code sections that
> get executed repeatedly and it then compiles them to object code,
> thereby doing runtime optimization. This allows many Java server
> processes to run with near-native speeds. When Ruby runs on a virtual
> machine, planned for version 2, then Ruby can do that too. The JRuby
> project will effectively accomplish the same goal.

This recent mania for VMs is irksome to me. The same benefits can be
had from a JIT compiler, without the attendant downsides of a VM (such
as greater persistent memory usage, et cetera).

--

CCD CopyWrite Chad Perrin [ http://ccd.apotheon.org ]

"The measure of a man's real character is what he would do
if he knew he would never be found out." - Thomas Macaulay

Chad Perrin

unread,
Jul 26, 2006, 3:31:43 PM7/26/06
to
On Thu, Jul 27, 2006 at 03:47:48AM +0900, Simon Kröger wrote:
> Peter Hickman wrote:
> > I will run your Ruby version and the Java version that I write and post
> > the results here. Give us a week or so as I have other things to be doing.
>
> Hmm, in a week this discussion will be over (ok, it will reappear some time
> soon, but nevertheless) and everybody has swallowed your points.
>
> $ ruby -v
> ruby 1.8.4 (2005-12-24) [i386-mingw32]
>
> $ time ruby latin.rb 5 > latin.txt
>
> real 0m4.703s
> user 0m0.015s
> sys 0m0.000s
>
> (this is a 2.13GHz PentiumM, 1GB RAM, forget the user and sys timings, but
> 'real' is for real, this is WinXP)

Holy crap, that's fast.

--
CCD CopyWrite Chad Perrin [ http://ccd.apotheon.org ]

Chad Perrin

unread,
Jul 26, 2006, 3:41:52 PM7/26/06
to
On Thu, Jul 27, 2006 at 01:20:05AM +0900, Kristof Bastiaensen wrote:
>
> True, benchmarks only measure execution speed, but they don't show if a
> given programmer will be productive in them. I think that's also
> largely a personal choice. Some people may be more productive in a
> functional language, some people more in Ruby. And others even in perl... :)

Actually, a well-defined functional syntax is a beautiful thing to
behold. UCBLogo, of all things, taught me that -- and taught me to
dearly love arithmetic prefix notation. Ruby's .method syntax is also
pretty darned nice, but I'd like to see it slightly more consistently
applied (only slightly). There's more to syntactic and semantic style
than mere personal preference: there are concrete benefits to a
functional syntax in terms of writing consistent code, for instance.

. . . and Perl is great. Let's not knock it just because Ruby is great
too. Well, maybe in jest, as you've done, but it gets a far worse
reputation than it deserves. As Paul Graham put it, ugly code is a
result of being forced to use the wrong concepts to achieve something
specific, and not of a harsh-looking syntax. Any syntax can be
harsh-looking to someone unaccustomed to it (even Ruby's).

--
CCD CopyWrite Chad Perrin [ http://ccd.apotheon.org ]

Ron M

unread,
Jul 26, 2006, 3:49:02 PM7/26/06
to
Charles O Nutter wrote:
> I'll lob a couple of grenades and then duck for cover.
>
> - Write it in C is as valid as write it in Java (as someone else
> mentioned).

Not really. In C you can quite easily use inline assembly
to use your chip's MMX/SSE/VIS/AltiVec extensions, and if
you need more, interface to your GPU if you want to use it
as a coprocessor.

I don't know of any good way of doing those in Java except
by writing native extensions in C or directly with an assembler.

Last I played with Java it didn't have a working cross-platform
mmap, and if that's still true, the awesome NArray+mmap Ruby extension
floating around is a good real-world example of this flexibility.

ara.t....@noaa.gov

unread,
Jul 26, 2006, 3:54:34 PM7/26/06
to
On Wed, 26 Jul 2006, Kroeger, Simon (ext) wrote:

> Just to show the beauty of ruby:
> -----------------------------------------------------------
> require 'rubygems'
> require 'permutation'
> require 'set'
>
> $size = (ARGV.shift || 5).to_i
>
> $perms = Permutation.new($size).map{|p| p.value}
> $out = $perms.map{|p| p.map{|v| v+1}.join}
> $filter = $perms.map do |p|
> s = SortedSet.new
> $perms.each_with_index do |o, i|
> o.each_with_index {|v, j| s.add(i) if p[j] == v}
> end && s.to_a
> end
>
> $latins = []
> def search lines, possibs
> return $latins << lines if lines.size == $size
> possibs.each do |p|
> search lines + [p], (possibs - $filter[p]).subtract(lines.last.to_i..p)
> end
> end
>
> search [], SortedSet[*(0...$perms.size)]
>
> $latins.each do |latin|
> $perms.each do |perm|
> perm.each{|p| puts $out[latin[p]]}
> puts
> end
> end
> -----------------------------------------------------------
> (does someone have a nicer/even faster version?)
>
> would you please run that on your machine?
> perhaps you have to do a "gem install permutation"
> (no I don't think it's faster than your C code, but
> it should beat the perl version)


>
>> If you really really want that performance boost then take
>> the following

>> advice very seriously - "Write it in C".
>
> Agreed, 100%, for those who want speed, speed and nothing
> else there is hardly a better way.
>
> thanks
>
> Simon

harp:~ > time ruby latin.rb 5 > 5.out
real 0m11.170s
user 0m10.840s
sys 0m0.040s

harp:~ > uname -srm
Linux 2.4.21-40.EL i686

harp:~ > cat /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 15
model : 2
model name : Intel(R) Pentium(R) 4 CPU 2.40GHz
stepping : 7
cpu MHz : 2386.575
cache size : 512 KB
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 2
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm
bogomips : 4757.91

harp:~ > ruby -v
ruby 1.8.4 (2005-12-01) [i686-linux]


not too shabby. definitely not worth the hassle for 5 seconds of c.

Francis Cianfrocca

unread,
Jul 26, 2006, 3:59:13 PM7/26/06
to
Ashley Moran wrote:
> performance lately. I just had to write a script to process about
> 1-1.5GB of CSV data (No major calculations, but it involves about 20
> million rows, or something in that region).

I've had tremendous results optimizing Ruby programs that process huge
piles of text. There is a range of "tricks" you can use to keep Ruby
from wasting memory, which is its real downfall. If it's possible, given
your application, to process your CSV text in such a way that you don't
store any transformations of the whole set in memory at once, you'll go
orders of magnitude faster. You can even try to break your computation
up into multiple stages, and stream the intermediate results out to
temporary files. As ugly as that sounds, it will be far faster.
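
A minimal sketch of that staged, streaming style (the file names and the transformation are invented for illustration) -- each stage reads one line at a time and writes intermediate results to a temp file, so no transformation of the whole set is ever held in memory:

```ruby
require 'tempfile'

# Toy input standing in for a multi-GB CSV.
File.write('input.csv', "a,1\nb,2\nc,3\n")

# Stage 1: stream the input a line at a time into a temp file;
# only the current line is ever in memory.
stage1 = Tempfile.new('stage1')
File.foreach('input.csv') do |line|
  stage1.puts line.strip.upcase    # stand-in for the real transformation
end
stage1.flush

# Stage 2: stream the intermediate results into the final output.
stage1.rewind
File.open('output.csv', 'w') do |out|
  stage1.each_line { |line| out.puts line }
end
stage1.close!
```

The same pattern chains to as many stages as needed; each stage's temp file becomes the next stage's input.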

In regard to the whole conversation on this thread: at the end of the
day, absolute performance only matters if you can put a dollar amount on
it. That makes the uncontexted language comparisons essentially
meaningless.

In regard to YARV: I get a creepy feeling about anything that is
considered by most of the world to be the prospective answer to all
their problems. And as a former language designer, I have some reasons
to believe that a VM will not be Ruby's performance panacea.

--
Posted via http://www.ruby-forum.com/.

Chad Perrin

unread,
Jul 26, 2006, 4:11:58 PM7/26/06
to
On Thu, Jul 27, 2006 at 04:27:57AM +0900, Ashley Moran wrote:
>
> I'm late to this conversation but I've been interested in Ruby
> performance lately. I just had to write a script to process about
> 1-1.5GB of CSV data (No major calculations, but it involves about 20
> million rows, or something in that region). The Ruby implementation
> I wrote takes about 2.5 hours to run - I think memory management is
> the main issue as the manual garbage collection run I added after
> each file goes into several minutes for the larger sets of data. As
> you can imagine, I am more than eager for YARV/Rite.
>
> Anyway, my question really is: I thought a VM was a prerequisite
> for JIT. Is that not the case? And if the YARV VM is not the way to
> go, what is?

The canonical example for comparison, I suppose, is the Java VM vs. the
Perl JIT compiler. In Java, the source is compiled to bytecode and
stored. In Perl, the source remains in source form, and is stored as
ASCII (or whatever). When execution happens with Java, the VM actually
interprets the bytecode. Java bytecode is compiled for a virtual
computer system (the "virtual machine"), which then runs the code as
though it were native binary compiled for this virtual machine. That
virtual machine is, from the perspective of the OS, an interpreter,
however. Thus, Java is generally half-compiled and half-interpreted,
which speeds up the interpretation process.

When execution happens in Perl 5.x, on the other hand, a compiler runs
at execution time, compiling executable binary code from the source. It
does so in stages, however, to allow for the dynamic runtime effects of
Perl to take place -- which is one reason the JIT compiler is generally
preferable to a compiler of persistent binary executables in the style
of C. Perl is, thus, technically a compiled language, and not an
interpreted language like Ruby.

Something akin to bytecode compilation could be used to improve upon the
execution speed of Perl programs without diverging from the
JIT-compilation execution it currently uses and also without giving up
any of the dynamic runtime capabilities of Perl. This would involve
running the first (couple of) pass(es) of the compiler to produce a
persistent binary compiled file with the dynamic elements still left in
an uncompiled form, to be JIT-compiled at execution time. That would
probably grant the best performance available for a dynamic language,
and would avoid the overhead of a VM implementation. It would, however,
require some pretty clever programmers to implement in a sane fashion.

I'm not entirely certain that would be appropriate for Ruby, considering
how much of the language ends up being dynamic in implementation, but it
bothers me that it doesn't even seem to be up for discussion. In fact,
Perl is heading in the direction of a VM implementation with Perl 6,
despite the performance successes of the Perl 5.x compiler. Rather than
improve upon an implementation that is working brilliantly, they seem
intent upon tossing it out and creating a different implementation
altogether that, as far as I can see, doesn't hold out much hope for
improvement. I could, of course, be wrong about that, but that's how it
looks from where I'm standing.

It just looks to me like everyone's chasing VMs. While the nontrivial
problems with Java's VM are in many cases specific to the Java VM (the
Smalltalk VMs have tended to be rather better designed, for instance),
there are still issues inherent in the VM approach as currently
envisioned, and as such it leaves sort of a bad taste in my mouth.

I think I've rambled. I'll stop now.

--
CCD CopyWrite Chad Perrin [ http://ccd.apotheon.org ]

"There comes a time in the history of any project when it becomes necessary
to shoot the engineers and begin production." - MacUser, November 1990

ara.t....@noaa.gov

unread,
Jul 26, 2006, 4:12:29 PM7/26/06
to
On Thu, 27 Jul 2006, Francis Cianfrocca wrote:

> In regard to YARV: I get a creepy feeling about anything that is
> considered by most of the world to be the prospective answer to all
> their problems. And as a former language designer, I have some reasons
> to believe that a VM will not be Ruby's performance panacea.

one of the reasons i've been pushing so hard for an msys based ruby is that
having a 'compilable' ruby on all platforms might open up developement on jit
type things like ruby inline - which is pretty dang neat.

2 cts.

Chad Perrin

unread,
Jul 26, 2006, 4:14:36 PM7/26/06
to
On Thu, Jul 27, 2006 at 04:59:13AM +0900, Francis Cianfrocca wrote:
> Ashley Moran wrote:
> > performance lately. I just had to write a script to process about
> > 1-1.5GB of CSV data (No major calculations, but it involves about 20
> > million rows, or something in that region).
>
> I've had tremendous results optimizing Ruby programs that process huge
> piles of text. There is a range of "tricks" you can use to keep Ruby
> from wasting memory, which is its real downfall. If it's possible, given
> your application, to process your CSV text in such a way that you don't
> store any transformations of the whole set in memory at once, you'll go
> orders of magnitude faster. You can even try to break your computation
> up into multiple stages, and stream the intermediate results out to
> temporary files. As ugly as that sounds, it will be far faster.

One of these days, I'll actually know enough Ruby to be sure of what
language constructs work for what purposes in terms of performance. I
rather suspect there are prettier AND better-performing options than
using temporary files to store data during computation, however.

--
CCD CopyWrite Chad Perrin [ http://ccd.apotheon.org ]

Ben Franklin: "As we enjoy great Advantages from the Inventions of
others we should be glad of an Opportunity to serve others by any
Invention of ours, and this we should do freely and generously."

Francis Cianfrocca

unread,
Jul 26, 2006, 4:15:56 PM7/26/06
to
Ron M wrote:
> Not really. In C you can quite easily use inline assembly
> to use your chip's MMX/SSE/VIS/AltiVec extensions, and if
> you need more, interface to your GPU if you want to use it
> as a coprocessor.
>
> I don't know of any good way of doing those in Java except
> by writing native extensions in C or directly with an assembler.
>
> Last I played with Java it didn't have a working cross-platform
> mmap, and if that's still true, the awesome NArray+mmap Ruby extension
> floating around is a good real-world example of this flexibility.


Your point about Java here is very well-taken. I'd add that you don't
even really need to drop into asm to get most of the benefits you're
talking about. C compilers are really very good at optimizing, and I
think you'll get nearly all of the available performance benefits from
well-written C alone. (I've written at least a million lines of
production asm code in my life, as well as a pile of commercial
compilers for various languages.) It goes back to economics again. A
very few applications will gain so much incremental value from the extra
5-10% performance boost that you get from hand-tuned asm, that it's
worth the vastly higher cost (development, maintenance, and loss of
portability) of doing the asm. A tiny number of pretty unusual apps
(graphics processing, perhaps) will get a lot more than 10% from asm.

The performance increment in going from Ruby to C is in *many* cases a
lot more than 10%, in fact it can easily be 10,000%.

kha...@enigo.com

unread,
Jul 26, 2006, 4:22:46 PM7/26/06
to
On Thu, 27 Jul 2006, Ashley Moran wrote:

> I'm late to this conversation but I've been interested in Ruby performance
> lately. I just had to write a script to process about 1-1.5GB of CSV data

Just as a sidenote to this conversation, if you are not using FasterCSV,
take a look at it. http://rubyforge.org/projects/fastercsv

Using it may dramatically speed your script.
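
For what it's worth, FasterCSV's interface (which later became Ruby's standard csv library, so the sketch below uses `require 'csv'`; the data is invented) looks like this:

```ruby
require 'csv'  # FasterCSV's API became the standard csv library in Ruby 1.9

data = "name,qty\nwidget,3\ngadget,5\n"

# Parse with headers so each row behaves like a hash keyed by column name.
total = 0
CSV.parse(data, headers: true) do |row|
  total += row['qty'].to_i
end
puts total  # => 8
```

`CSV.foreach` does the same thing against a file, streaming a row at a time rather than slurping the whole thing.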


Kirk Haines

Francis Cianfrocca

unread,
Jul 26, 2006, 4:31:38 PM7/26/06
to
Chad Perrin wrote:
> On Thu, Jul 27, 2006 at 04:59:13AM +0900, Francis Cianfrocca wrote:
>> orders of magnitude faster. You can even try to break your computation
>> up into multiple stages, and stream the intermediate results out to
>> temporary files. As ugly as that sounds, it will be far faster.
>
> One of these days, I'll actually know enough Ruby to be sure of what
> language constructs work for what purposes in terms of performance. I
> rather suspect there are prettier AND better-performing options than
> using temporary files to store data during computation, however.

Ashley was talking about 1GB+ datasets, iirc. I'd love to see an
in-memory data structure (Ruby or otherwise) that can slug a few of
those around without breathing hard. And on most machines, you're going
through the disk anyway with a dataset that large, as it thrashes your
virtual-memory. So why not take advantage of the tunings that are built
into the I/O channel?

If I'm using C, I always handle datasets that big with the kernel vm
functions- generally faster than the I/O functions. I don't know how to
do that portably in Ruby (yet).

Isaac Gouy

unread,
Jul 26, 2006, 4:46:19 PM7/26/06
to

Chad Perrin wrote:
> On Wed, Jul 26, 2006 at 11:29:06PM +0900, Ryan McGovern wrote:
-snip-

> For those keen on functional programming syntax, Haskell is a better
> choice than Java for performance: in fact, the only thing keeping
> Haskell from performing as well as C, from what I understand, is the
> current state of processor design. Similarly, O'Caml is one of the
> fastest non-C languages available: it consistently, in a wide range of
> benchmark tests and real-world anecdotal comparisons, executes "at least
> half as quickly" as C, which is faster than it sounds.

For those keen on functional programming, Clean produces small fast
executables.

> The OP is right, though: if execution speed is your top priority, use C.
> Java is an also-ran -- what people generally mean when they say that
> Java is almost as fast as C is that a given application written in both
> C and Java "also runs in under a second" in Java, or something to that
> effect. While that may be true, there's a significant difference
> between 0.023 seconds and 0.8 seconds (for a hypothetical example).

That sounds wrong to me - I hear positive comments about Java
performance for long-running programs, not for programs that run in
under a second.

Hal Fulton

unread,
Jul 26, 2006, 4:54:49 PM7/26/06
to
Isaac Gouy wrote:
>
> That sounds wrong to me - I hear positive comments about Java
> performance for long-running programs, not for programs that run in
> under a second.
>

JIT is the key to a lot of that. Performance depends greatly on
the compiler, the JVM, the algorithm, etc.

I won a bet once from a friend. We wrote comparable programs in
Java and C++ (some arbitrary math in a loop running a bazillion
times).

With defaults on both compiles, the Java was actually *faster*
than the C++. Even I didn't expect that. But as I said, this
sort of thing is highly dependent on many different factors.


Hal


Ashley Moran

unread,
Jul 26, 2006, 5:16:25 PM7/26/06
to

On Jul 26, 2006, at 9:31 pm, Francis Cianfrocca wrote:

> Ashley was talking about 1GB+ datasets, iirc. I'd love to see an
> in-memory data structure (Ruby or otherwise) that can slug a few of
> those around without breathing hard. And on most machines, you're going
> through the disk anyway with a dataset that large, as it thrashes your
> virtual-memory. So why not take advantage of the tunings that are built
> into the I/O channel?
>
> If I'm using C, I always handle datasets that big with the kernel vm
> functions- generally faster than the I/O functions. I don't know how to
> do that portably in Ruby (yet).


I think the total data size is about 1.5GB, but the individual files
are smaller, the largest being a few hundred MB. The most rows in a
single file is ~15,000,000, I think. The server I run it on has 2GB RAM
(an Athlon 3500+ running FreeBSD/amd64, so the hardware is not really
an issue)... it can get all the way through without swapping (just!)

The processing is pretty trivial, and mainly involves incrementing
some ID columns so we can merge datasets together, adding a text
column to the start of every row, and eliminating a few duplicates.
The output file is gzipped (sending the output of CSV::Writer through
GzipWriter). I could probably rewrite it so that most files are
output a line at a time, and call out to the command line gzip. Only
the small files *need* to be stored in RAM for duplicate removal,
others are guaranteed unique. At the time I didn't think using RAM
would give such a huge performance hit (lesson learnt).
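
A sketch of that line-at-a-time gzipped output (file names, the dataset label, and the rows are invented) using the stdlib Zlib::GzipWriter, with a Set handling duplicate removal for the small files only:

```ruby
require 'zlib'
require 'set'

seen = Set.new  # duplicate removal, only needed for the small files

Zlib::GzipWriter.open('merged.csv.gz') do |gz|
  ["1,foo", "2,bar", "1,foo"].each do |row|  # stand-in for streamed input rows
    next unless seen.add?(row)               # Set#add? returns nil on duplicates
    gz.puts "dataset_a,#{row}"               # prepend the text column
  end
end
```

Writing straight through GzipWriter this way keeps only the dedup set in RAM, not the rows themselves.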

I might also look into Kirk's suggestion of FasterCSV. If all this
doesn't improve things, there's always the option of going dual-core
and forking to do independent files.

However... the script can be run at night so even in its current
state it's acceptable. It will only need serious work if we start
adding many more datasets into the routine (we're using two out of a
conceivable 4 or 5, I think). In that case we could justify buying a
faster CPU if it got out of hand, rather than rewrite it in C. But
that's more a reflection of hardware prices than my wages :)

I have yet to write anything in Ruby that was less than twice as fast
to code as it would have been in bourne-sh/Java/whatever, never mind
twice as fun or maintainable. I recently rewrote an 830 line Java/
Hibernate web service client as 67 lines of Ruby, in about an hour.
With that kind of productivity, performance can go to hell!

Ashley

Ashley Moran

unread,
Jul 26, 2006, 5:33:08 PM7/26/06
to

On Jul 26, 2006, at 9:11 pm, Chad Perrin wrote:

> It just looks to me like everyone's chasing VMs. While the nontrivial
> problems with Java's VM are in many cases specific to the Java VM (the
> Smalltalk VMs have tended to be rather better designed, for instance),
> there are still issues inherent in the VM approach as currently
> envisioned, and as such it leaves sort of a bad taste in my mouth.

Chad...

Just out of curiosity (since I don't know much about this subject),
what do yo think of the approach Microsoft took with the CLR? From
what I read it's very similar to the JVM except it compiles directly
to native code, and makes linking to native libraries easier. I
assume this is closer to JVM behaviour than Perl 5 behaviour. Is
there anything to be learnt from it for Ruby?

Ashley

ara.t....@noaa.gov

unread,
Jul 26, 2006, 5:24:07 PM7/26/06
to
On Thu, 27 Jul 2006, Ashley Moran wrote:

> maintainable. I recently rewrote an 830 line Java/Hibernate web service
> client as 67 lines of Ruby, in about an hour. With that kind of
> productivity, performance can go to hell!

i process tons of big csv files and use this approach:

- parse the first line, remember cell count

- foreach line
- attempt parsing using simple split, iff that fails fall back to csv.rb
methods


something like

n_fields = nil

f.each do |line|
  fields = line.split %r/,/
  n_fields ||= fields.size

  # simple split gave the wrong cell count - fall back to the csv parser
  if fields.size != n_fields
    fields = parse_with_csv_lib line
  end

  ...
end

this obviously won't work with csv files that have cells spanning lines, but
for simple stuff it can speed up parsing in a huge way.

Francis Cianfrocca

unread,
Jul 26, 2006, 5:24:26 PM7/26/06
to
Charles O Nutter wrote:
> I would challenge the Ruby
> community at large to expect more from Ruby proper before giving up the
> dream of highly-performant Ruby code and plunging into the C.

Much depends on what is wanted from the language. My friends know me for
a person who will gladly walk a very long way to get an incremental
performance improvement in any program. But I don't dream of
highly-performant Ruby code. I dream of highly-scalable applications
that can work with many different kinds of data seamlessly and link
business people and their customers together in newer, faster, more
secure ways than have ever been imagined before. I want to be able to
turn almost any kind of data, wherever it is, into actionable
information and combine it flexibly with any other data. I want to be
able to simply drop any piece of new code into a network and
automatically have it start working with other components in the
(global) network. I want a language system that can gracefully and
powerfully model all of these new kinds of interactions without
requiring top-down analysis of impossibly large problem domains and
rigid program-by-contract regimes. Ruby has unique characteristics,
among all other languages that I know, that qualify it for a first
approach to my particular dream. Among these are the excellent
metaprogramming support, the open classes, the adaptability to tooling,
and (yes) the generally-acceptable performance.

If one's goal is to get a program that will take the least amount of
time to plow through some vector mathematics problem, then by all means
let's have the language-performance discussion. But to me, most of these
compute-intensive tasks are problems that have been being addressed by
smart people ever since Fortran came along. We don't necessarily need
Ruby to solve them.

We do need Ruby to solve a very different set of next-generation
problems, for which C and Java (and even Perl and Python) are very
poorly suited.

Chad Perrin

unread,
Jul 26, 2006, 6:13:56 PM7/26/06
to
On Thu, Jul 27, 2006 at 06:16:25AM +0900, Ashley Moran wrote:
>
> I have yet to write anything in Ruby was less than twice as fast to
> code as it would have been in bourne-sh/Java/whatever, never mind
> twice as fun or maintainable. I recently rewrote an 830 line Java/
> Hibernate web service client as 67 lines of Ruby, in about an hour.
> With that kind of productivity, performance can go to hell!

With a 92% cut in code weight, I can certainly sympathize with that
sentiment. Wow.

--
CCD CopyWrite Chad Perrin [ http://ccd.apotheon.org ]

Chad Perrin

unread,
Jul 26, 2006, 6:31:23 PM7/26/06
to
On Thu, Jul 27, 2006 at 06:24:49AM +0900, Charles O Nutter wrote:
> On 7/26/06, Chad Perrin <per...@apotheon.com> wrote:
> >
> >The canonical example for comparison, I suppose, is the Java VM vs. the
> >Perl JIT compiler. In Java, the source is compiled to bytecode and
> >stored. In Perl, the source remains in source form, and is stored as
> >ASCII (or whatever). When execution happens with Java, the VM actually
> >interprets the bytecode. Java bytecode is compiled for a virtual
> >computer system (the "virtual machine"), which then runs the code as
> >though it were native binary compiled for this virtual machine. That
> >virtual machine is, from the perspective of the OS, an interpreter,
> >however. Thus, Java is generally half-compiled and half-interpreted,
> >which speeds up the interpretation process.
>
>
> Half true. The Java VM could be called "half-compiled and half-interpreted"
> at runtime for only a short time, and only if you do not consider VM
> bytecodes to be a valid "compiled" state. However most bytecode is very
> quickly compiled into processor-native code, making those bits fully
> compiled. After a long enough runtime (not very long in actuality), all Java
> code is running as native code for the target processor (with various
> degrees of optimization and overhead).

True . . . but this results in fairly abysmal performance, all things
considered, for short runs. Also, see below regarding dynamic
programming.


>
> The difference between AOT compilation with GCC or .NET is that Java's
> compiler can make determinations based on runtime profiling about *how* to
> compile that "last mile" in the most optimal way possible. The bytecode
> compilation does, as you say, primarily speed up the interpretation process.
> However it's far from the whole story, and the runtime JITing of bytecode
> into native code is where the magic lives. To miss that is to miss the
> greatest single feature of the JVM.

This also is true, but that benefit is entirely unusable for highly
dynamic code, unfortunately -- and, in fact, even bytecode compilation
might be a bit too much to ask for too-dynamic code. I suppose it's
something for pointier heads than mine, since I'm not actually a
compiler-writer or language-designer (yet). It's also worth noting that
this isn't accomplishing anything that isn't also accomplished by the
Perl JIT compiler.


>
> When execution happens in Perl 5.x, on the other hand, a compiler runs
> >at execution time, compiling executable binary code from the source. It
> >does so in stages, however, to allow for the dynamic runtime effects of
> >Perl to take place -- which is one reason the JIT compiler is generally
> >preferable to a compiler of persistent binary executables in the style
> >of C. Perl is, thus, technically a compiled language, and not an
> >interpreted language like Ruby.
>

> I am not familiar with Perl's compiler. Does it compile to processor-native
> code or to an intermediate bytecode of some kind?

There is no intermediate bytecode step for Perl, as far as I'm aware.
It's not a question I've directly asked one of the Perl internals
maintainers, but everything I know about the Perl compiler confirms my
belief that it simply does compilation to machine code.


>
> We're also juggling terms pretty loosely here. A compiler converts
> human-readable code into machine-readable code. If the "machine" is a VM,
> then you're fully compiling. If the VM code later gets compiled into "real
> machine" code, that's another compile cycle. Compilation isn't as cut and
> dried as you make it out to be, and claiming that, for example, Java is
> "half compiled" is just plain wrong.

Let's call it "virtually compiled", then, since it's being compiled to
code that is readable by a "virtual machine" -- or, better yet, we can
call it bytecode and say that it's not fully compiled to physical
machine-readable code, which is what I was trying to explain in the
first place.


>
> >Something akin to bytecode compilation could be used to improve upon the
> >execution speed of Perl programs without diverging from the
> >JIT-compilation execution it currently uses and also without giving up
> >any of the dynamic runtime capabilities of Perl. This would involve
> >running the first (couple of) pass(es) of the compiler to produce a
> >persistent binary compiled file with the dynamic elements still left in
> >an uncompiled form, to be JIT-compiled at execution time. That would
> >probably grant the best performance available for a dynamic language,
> >and would avoid the overhead of a VM implementation. It would, however,
> >require some pretty clever programmers to implement in a sane fashion.
>

> There are a lot of clever programmers out there.

True, of course. The problem is getting them to work on a given
problem.


>
> Having worked heavily on a Ruby implementation, I can say for certain that
> 99% of Ruby code is static. There are some dynamic bits, especially within
> Rails where methods are juggled about like flaming swords, but even these
> dynamic bits eventually settle into mostly-static sections of code.

I love that imagery, with the flaming sword juggling. Thanks.


> Compilation of Ruby code into either bytecode for a fast interpreter engine
> like YARV or into bytecode for a VM like Java is therefore perfectly valid
> and very effective. Preliminary compiler results for JRuby show a boost of
> 50% performance over previous versions, and that's without optimizing many
> of the more expensive Ruby operations (call logic, block management).
> Whether a VM is present (as in JRuby) or not (as may be the case with YARV),
> eliminating the overhead of per-node interpretation is a big positive. JRuby
> will also feature a JIT compiler to allow running arbitrary .rb files
> directly, optimizing them as necessary and as seems valid based on runtime
> characteristics. I don't know if YARV will do the same, but it's a good
> idea.

I'm sure a VM or similar approach (and, frankly, I do prefer the
fast-interpreter approach over the VM approach) would provide ample
opportunity to improve upon Ruby's current performance, but that doesn't
necessarily mean it's better than other approaches to improving
performance. That's where I was aiming.


>
> The whole VM thing is such a small issue. Ruby itself is really just a VM,
> where its instructions are the elements in its AST. The definition of a VM
> is sufficiently vague to include most other interpreters in the same
> family. Perhaps you are specifically referring to VMs that provide a set of
> "processor-like" fine-grained operations, attempting to simulate some sort
> of magical imaginary hardware? That would describe the Java VM pretty well,
> though in actuality there are real processors that run Java bytecodes
> natively as well. Whether or not a language runs on top of a VM is
> irrelevant, especially considering JRuby is a mostly-compatible version of
> Ruby running on top of a VM. It matters much more that translation to
> whatever underlying machine....virtual or otherwise...is as optimal and
> clean as possible.

A dividing line between "interpreter" and "VM" has always seemed rather
more clear to me than you make it sound. Yes, I do refer to a
simulation of an "imaginary" (or, more to the point, "virtual") machine,
as opposed to a process that interprets code. Oh, wait, there's that
really, really obvious dividing line I keep seeing.

The use (or lack) of a VM does indeed matter: it's an implementation
detail, and implementation details make a rather significant difference
in performance. The ability of the parser to quickly execute what's fed
to it is important, as you indicate, but so too is the ability of the
parser to run quickly itself -- unless your program is actually compiled
to machine-native code for the hardware, in which case the lack of need
for the parser to execute at all at runtime is significant.

--
CCD CopyWrite Chad Perrin [ http://ccd.apotheon.org ]

Brian K. Reid: "In computer science, we stand on each other's feet."

Chad Perrin

Jul 26, 2006, 6:41:10 PM7/26/06
to
On Thu, Jul 27, 2006 at 06:33:08AM +0900, Ashley Moran wrote:
>
> Just out of curiosity (since I don't know much about this subject),
> what do yo think of the approach Microsoft took with the CLR? From
> what I read it's very similar to the JVM except it compiles directly
> to native code, and makes linking to native libraries easier. I
> assume this is closer to JVM behaviour than Perl 5 behaviour. Is
> there anything to be learnt from it for Ruby?

I'm not as familiar with what's going on under the hood of the CLR as
the JVM, but from what I do know it exhibits both advantages and
disadvantages in comparison with the Java VM. Thus far, the evidence
seems to be leaning in the direction of the CLR's advantages over the
JVM coming into play more often than the disadvantages, however, which
seems to indicate that the compromises that were made may have been the
"right" compromises, as far as this comparison goes.

In fact, the CLR seems in some ways to be a compromise between
Perl-style JIT compilation and Java-style bytecode compilation with
runtime VM-interpretation (there really needs to be a term for what a VM
does separate from either compilation or interpretation, since what it
does generally isn't strictly either of them). There may well be
something to learn from that for future Ruby implementations, though I'd
warn away from trying to take the "all languages compile to the same
intermediate bytecode" approach that the CLR takes -- it tries to be too
many things at once, basically, and ends up introducing some
inefficiencies in that sense. If you want to do everything CLR does,
with Ruby, then port Ruby to the CLR, but if you want to simply gain
performance benefits from studying up on the CLR, make sure you
cherry-pick the bits that are relevant to the task at hand.

I think Ruby would probably best benefit from something somewhere
between the Perl compiler's behavior and the CLR compiler.
Specifically, compile all the static algorithm behavior in your code to
something persistent, link in all the rest as uncompiled (though perhaps
parse-tree compiled, which is almost but not quite the same as bytecode
compiled) code, and let that be machine-code compiled at runtime. This
might even conceivably be broken into two separate compilers to minimize
the last-stage compiler size needed on client systems and to optimize
each part to operate as quickly as possible.

Run all this stuff past a true expert before charging off to implement
it, of course. I'm an armchair theorist.

--
CCD CopyWrite Chad Perrin [ http://ccd.apotheon.org ]

"It's just incredible that a trillion-synapse computer could actually
spend Saturday afternoon watching a football game." - Marvin Minsky

Csaba Henk

Jul 26, 2006, 7:19:39 PM7/26/06
to
On 2006-07-26, Sean O'Halpin <sean.o...@gmail.com> wrote:
> implemented. For example, Ruby has real closures, Python doesn't. I

Even if OT, just for the sake of correctness: let me remark that Python
does have closures. Local functions (ones defined within another
function's body) are scoped lexically.

It's just sort of an anti-POLA (and inconvenient, as-is) piece of
semantics that variables get reinitialized upon assignment.

Hence:

def foo():
    x = 5
    def bar():
        x = 6
        return x
    bar()
    return x, bar

x, bar = foo()
print x, bar() ==> 5 6

def foo():
    _x = [5]
    def bar():
        _x[0] = 6
        return _x[0]
    bar()
    return _x[0], bar

x, bar = foo()
print x, bar() ==> 6 6


Regards,
Csaba

Chad Perrin

Jul 26, 2006, 7:33:31 PM7/26/06
to

Is it just me, or are there no proper closures in that example code?

--
CCD CopyWrite Chad Perrin [ http://ccd.apotheon.org ]

print substr("Just another Perl hacker", 0, -2);

Sean O'Halpin

Jul 26, 2006, 7:35:47 PM7/26/06
to
On 7/26/06, Chad Perrin <per...@apotheon.com> wrote:
> On Thu, Jul 27, 2006 at 12:23:23AM +0900, Charles O Nutter wrote:
> >
> > You're mixing language semantics and implementation details here.
[snip]

> In some ways, you're right: implementation details are being mixed up
> with language definition in the preceding list of features.
[snip]

Nope - there's no mix-up. My point is that any feature of a language
that requires extra work, whether at compile time or run time, incurs
a cost. Those
features are generally there to make life easier for us programmers,
not the machine. The only way to make sure you're not paying that
price is to hand-code optimised machine code for a specific processor
and hardware context. No language translator can guarantee that it
will produce better code (regardless of the nonsense of the 'perfect
compiler').

Let me take two examples from my list. First, method lookup in OO
languages. There is no
way you can optimise this across the board to static jumps in a
language like Ruby or Smalltalk. There will always be the requirement
(imposed by the ~semantics~ of the language) to be able to find the
right method at runtime. This is part of the language design which
imposes constraints on the implementation that assembly languages (for
example) do not have to pay. There is a cost for abstraction (a cost
which I am willing to pay by the way). Of course, you can implement
virtual methods in assembly, but you don't ~have~ to. In Ruby there is
no choice. Everything is an object. (You can optimise most of it away,
but not all).

Second, closures:

Chad Perrin said:
> A little extra memory usage does not translate directly to performance loss.

Coming from a background where I had to make everything happen in 48K,
I have to disagree. And it's not always just 'a little extra memory
usage'. Careless use of closures can cripple an application. See the
problems the Rails team encountered.
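
A minimal Ruby sketch of the cost I mean (the buffer size is arbitrary):
a closure keeps its entire enclosing scope alive, so the array below
cannot be garbage-collected while the returned lambda exists, even
though the lambda never touches it.

```ruby
def make_counter
  big_buffer = Array.new(100_000, 0)  # captured along with the scope,
                                      # so pinned in memory by the lambda
  count = 0
  lambda { count += 1 }
end

counter = make_counter
counter.call  # => 1
counter.call  # => 2
```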

Charles - you say that closures are explicit - I beg to differ. By
definition, they are surely implicit. Doesn't your argument that they
can be simulated by other means contradict your statement?

As for the notion that a hardware YARV processor would make a
difference - how would that ameliorate the issues Ruby has with memory
usage? Performance isn't just about time - space also matters.

I am surprised that you think I am confusing language features with
implementation details. From my point of view, it is you who are
ignoring the fact that abstractions incur a cost.

Best regards (I'm enjoying this discussion BTW :)
Sean

Sean O'Halpin

Jul 26, 2006, 7:42:11 PM7/26/06
to

I've crossed my eyes twice and still can't see it ;)

Chad Perrin

Jul 26, 2006, 7:50:04 PM7/26/06
to
On Thu, Jul 27, 2006 at 08:35:47AM +0900, Sean O'Halpin wrote:
>
> Chad Perrin said:
> >A little extra memory usage does not translate directly to performance
> >loss.
>
> Coming from a background where I had to make everything happen in 48K,
> I have to disagree. And it's not always just 'a little extra memory
> usage'. Careless use of closures can cripple an application. See the
> problems the Rails team encountered.

Careless use of ANYTHING can cripple an application. Using an extra
kilobyte of memory on a 1GB system for a closure instance or two is not
indicative of an inescapable performance cost for the mere fact of the
existence of closures. While your point about 48K of RAM is well-taken,
it's pretty much inapplicable here: I wouldn't be writing Ruby programs
to run in 48K. Hell, my operating system won't run in 48K, nor even a
hundred times that (I'm using Debian GNU/Linux Etch/Testing, if that
matters). I'm sure as heck not going to expect all my applications to
run in that environment.

Careless use of pointers can cripple not only the application, but the
whole system. Careless use of loops can crash it. Careless use of the
power cord can destroy the hardware. In the grand scheme of things,
closures are not a good example of a language feature that hinders
performance when we're talking about high-level languages such as Ruby.

--
CCD CopyWrite Chad Perrin [ http://ccd.apotheon.org ]

Chad Perrin

Jul 26, 2006, 7:52:01 PM7/26/06
to
On Thu, Jul 27, 2006 at 08:42:11AM +0900, Sean O'Halpin wrote:
> >
> >Is it just me, or are there no proper closures in that example code?
>
> I've crossed my eyes twice and still can't see it ;)

Remember what your mother said: if you keep doing that, your eyes might
stick that way. Be careful.

--
CCD CopyWrite Chad Perrin [ http://ccd.apotheon.org ]

Keith Gaughan

Jul 26, 2006, 7:53:18 PM7/26/06
to

No, Chad. There's closures in there. What you're not seeing is
anonymous functions, but closures are not the same as anonymous
functions.

--
Keith Gaughan - kmga...@eircom.net - http://talideon.com/

Chad Perrin

Jul 26, 2006, 8:03:30 PM7/26/06
to

Maybe I'm missing something critical about Python, but I don't see the
persistent code construct being passed from the function when its
lexical scope (assuming it's truly lexical, which it might not be in
this case) closes. It's only a closure if there's something persistent
that was "closed" by the scope closing.

--
CCD CopyWrite Chad Perrin [ http://ccd.apotheon.org ]

"The first rule of magic is simple. Don't waste your time waving your
hands and hopping when a rock or a club will do." - McCloctnick the Lucid

Keith Gaughan

Jul 26, 2006, 8:09:43 PM7/26/06
to

Clarification: like Java, Python won't let you assign to variables in
the outer scope, so you have to use an array or some other object to
hack around that if you need that functionality. I know, it sucks, but
the fact it doesn't allow you to assign to an outer scope doesn't stop
Python from having closures, just that it doesn't trust the developer
not to screw things up.

Here's a better example:

def foo():
    x = [0]
    def bar():
        x[0] += 1
        print x[0]
    return bar

baz = foo()
baz() -> 1
baz() -> 2
baz() -> 3

Of course, this is better implemented as a generator.
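
For comparison, the rough Ruby analogue of that generator-style
alternative (a sketch using Enumerator's external iteration, which is
1.9-style; the 1.8 stdlib Generator class plays a similar role):

```ruby
# An enumerator that yields an ever-increasing count on demand.
counter = Enumerator.new do |y|
  n = 0
  loop { y << (n += 1) }
end

counter.next  # => 1
counter.next  # => 2
```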

Life is like a tin of sardines.
We're, all of us, looking for the key.
-- Beyond the Fringe

Thomas E Enebo

Jul 26, 2006, 8:14:49 PM7/26/06
to
On Thu, 27 Jul 2006, Chad Perrin defenestrated me:

> On Thu, Jul 27, 2006 at 06:24:49AM +0900, Charles O Nutter wrote:
> > On 7/26/06, Chad Perrin <per...@apotheon.com> wrote:
> >
> > >When execution happens in Perl 5.x, on the other hand, a compiler runs
> > >at execution time, compiling executable binary code from the source. It
> > >does so in stages, however, to allow for the dynamic runtime effects of
> > >Perl to take place -- which is one reason the JIT compiler is generally
> > >preferable to a compiler of persistent binary executables in the style
> > >of C. Perl is, thus, technically a compiled language, and not an
> > >interpreted language like Ruby.
> >
> > I am not familiar with Perl's compiler. Does it compile to processor-native
> > code or to an intermediate bytecode of some kind?
>
> There is no intermediate bytecode step for Perl, as far as I'm aware.
> It's not a question I've directly asked one of the Perl internals
> maintainers, but everything I know about the Perl compiler confirms my
> belief that it simply does compilation to machine code.

I have not been on the perl train for years, but I believe for Perl5
at least this is not true. I remember Malcolm Beattie's B module which
basically exposed the intermediate bytecodes that perl normally interprets.
That was some time ago and things may have changed?

Here is some documentation on this (it could be old but it seems to
match my memory):

http://www.faqs.org/docs/perl5int/c163.html

So it looks like Perl is somewhat similar to Java (perhaps the other
way around since Perl's interpreter pre-dates Java). An analogy of the
difference would be that Perl is CISC and Java is RISC since Perl bytecode
is higher level. Maybe they JIT pieces?
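
On the Ruby side, for comparison: YARV-based Ruby (1.9 and later)
exposes its intermediate bytecode much the way the B module exposes
Perl's op tree. A sketch:

```ruby
# Disassemble a snippet to see the YARV instructions it compiles to
# before the VM interprets them.
iseq = RubyVM::InstructionSequence.compile("a = 1 + 2")
puts iseq.disasm  # human-readable instruction listing
```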

-Tom

--
+ http://www.tc.umn.edu/~enebo +---- mailto:en...@acm.org ----+
| Thomas E Enebo, Protagonist | "Luck favors the prepared |
| | mind." -Louis Pasteur |

David Pollak

Jul 26, 2006, 8:26:45 PM7/26/06
to
On 7/26/06, Chad Perrin <per...@apotheon.com> wrote:
> On Thu, Jul 27, 2006 at 12:42:46AM +0900, David Pollak wrote:
>
>
> >
> > There are some applications that will never perform as well in Java (e.g.,
> > stuff that's heavily oriented to bit manipulation.) But for many
> > classes of applications (e.g., spreadsheets) Java can perform as well
> > as C.
>
> Is that heavily optimized Java vs. "normal" (untweaked) C?

No. That's heavily optimized Java vs. heavily optimized C. I spent a
fair amount of time chatting with the Excel team a while back. They
cared as much about performance as I did. They spent a lot more time
and money optimizing Excel than I did with Integer. They had far more
in terms of tools and access to compiler tools than I did (although
Sun was very helpful to me.)

What was at stake was not someone's desktop spreadsheet, but was the
financial trader's desk. Financial traders move millions (and
sometimes billions) of Dollars, Euros, etc. through their spreadsheets
every day. A 5 or 10 second advantage in calculating a spreadsheet
could mean a significant profit for a trading firm.

So, I am comparing apples to apples. A Java program can be optimized
to perform as well as a C program for *certain* tasks.

>
> --
> CCD CopyWrite Chad Perrin [ http://ccd.apotheon.org ]

> Brian K. Reid: "In computer science, we stand on each other's feet."
>
>


--
--------
David Pollak's Ruby Playground
http://dppruby.com

Chad Perrin

Jul 26, 2006, 9:07:00 PM7/26/06
to
On Thu, Jul 27, 2006 at 09:26:45AM +0900, David Pollak wrote:
> On 7/26/06, Chad Perrin <per...@apotheon.com> wrote:
> >On Thu, Jul 27, 2006 at 12:42:46AM +0900, David Pollak wrote:
> >
> >
> >>
> >> There are some applications that will never perform as well in Java (e.g.,
> >> stuff that's heavily oriented to bit manipulation.) But for many
> >> classes of applications (e.g., spreadsheets) Java can perform as well
> >> as C.
> >
> >Is that heavily optimized Java vs. "normal" (untweaked) C?
>
> No. That's heavily optimized Java vs. heavily optimized C. I spent a
> fair amount of time chatting with the Excel team a while back. They
> cared as much about performance as I did. They spent a lot more time
> and money optimizing Excel than I did with Integer. They had far more
> in terms of tools and access to compiler tools than I did (although
> Sun was very helpful to me.)

Excel isn't a very good point of comparison for C. For one thing, it's
not C -- it's C++. For another, it has a pretty bad case of featuritis.

Chad Perrin

Jul 26, 2006, 9:22:39 PM7/26/06
to
On Thu, Jul 27, 2006 at 09:14:49AM +0900, Thomas E Enebo wrote:
> On Thu, 27 Jul 2006, Chad Perrin defenestrated me:
> >
> > There is no intermediate bytecode step for Perl, as far as I'm aware.
> > It's not a question I've directly asked one of the Perl internals
> > maintainers, but everything I know about the Perl compiler confirms my
> > belief that it simply does compilation to machine code.
>
> I have not been on the perl train for years, but I believe for Perl5
> at least this is not true. I remember Malcolm Beatties B module which
> basically exposed the intermediate bytecodes that perl normally interprets.
> That was some time ago and things may have changed?
>
> Here is some documentation on this (it could be old but it seems to
> match my memory):
>
> http://www.faqs.org/docs/perl5int/c163.html
>
> So it looks like Perl is somewhat similiar to Java (perhaps the other
> way around since Perl's interpreter pre-dates Java). An analogy of the
> difference would be that Perl is CISC and Java is RISC since Perl bytecode
> is higher level. Maybe they JIT pieces?

I believe you are correct, with regard to an intermediate code step,
after all. I've done some research on it to refresh my memory. Whether
it continues to compile to a machine-readable executable or interprets
the intermediate code form is something I haven't been able to nail down
yet. I'll keep looking. Apparently, I was wrong somewhere along the
way.

--
CCD CopyWrite Chad Perrin [ http://ccd.apotheon.org ]

"The measure on a man's real character is what he would do
if he knew he would never be found out." - Thomas McCauley

Keith Gaughan

Jul 26, 2006, 9:35:52 PM7/26/06
to
On Thu, Jul 27, 2006 at 10:07:00AM +0900, Chad Perrin wrote:

> On Thu, Jul 27, 2006 at 09:26:45AM +0900, David Pollak wrote:
>
> > On 7/26/06, Chad Perrin <per...@apotheon.com> wrote:
> >
> > >On Thu, Jul 27, 2006 at 12:42:46AM +0900, David Pollak wrote:
> > >
> > >> There are some applications that will never perform as well in Java (e.g.,
> > >> stuff that's heavily oriented to bit manipulation.) But for many
> > >> classes of applications (e.g., spreadsheets) Java can perform as well
> > >> as C.
> > >
> > >Is that heavily optimized Java vs. "normal" (untweaked) C?
> >
> > No. That's heavily optimized Java vs. heavily optimized C. I spent a
> > fair amount of time chatting with the Excel team a while back. They
> > cared as much about performance as I did. They spent a lot more time
> > and money optimizing Excel than I did with Integer. They had far more
> > in terms of tools and access to compiler tools than I did (although
> > Sun was very helpful to me.)
>
> Excel isn't a very good point of comparison for C. For one thing, it's
> not C -- it's C++. For another, it has a pretty bad case of featuritis.

Actually, the part that counts, the calculation engine, comes in two
varieties: a slow but provably correct version, and a fast, highly
optimised version, a significant portion of which is written in
_assembly language_. MS use a battery of regression tests to ensure that
the fast one always gives the same results as the slow one.

Just because the bits that aren't performance sensitive are written in
C++ doesn't mean that the rest of it is slow and bloated.

K.

Life's too short to dance with ugly women.

Chad Perrin

Jul 26, 2006, 9:50:32 PM7/26/06
to
On Thu, Jul 27, 2006 at 10:35:52AM +0900, Keith Gaughan wrote:
>
> Actually, the part that counts, the calculation engine, comes in two
> varieties: a slow but provably correct version, and a fast, highly
> optimised version, a significant portion of which is written in
> _assembly language_. MS use a battery of regression tests to ensure that
> the fast one always gives the same results as the slow one.

That might be the "part that counts" (nice pun) for calculation, but
it's not the only part that counts. Interface rendering, interactive
operations, and so on are also fairly important performance-wise -- at
least to the user. In fact, calculation waits can be easier to overlook
as a user than waits for the application to catch up when you click on a
button.

On the other hand, if we were specifically referring to things like
column calculation speed (of which I wasn't strictly aware), then your
point is made.

--
CCD CopyWrite Chad Perrin [ http://ccd.apotheon.org ]

"Real ugliness is not harsh-looking syntax, but having to
build programs out of the wrong concepts." - Paul Graham

Keith Gaughan

Jul 26, 2006, 10:17:44 PM7/26/06
to
On Thu, Jul 27, 2006 at 10:50:32AM +0900, Chad Perrin wrote:

> On Thu, Jul 27, 2006 at 10:35:52AM +0900, Keith Gaughan wrote:
>
> > Actually, the part that counts, the calculation engine, comes in two
> > varieties: a slow but provably correct version, and a fast, highly
> > optimised version, a significant portion of which is written in
> > _assembly language_. MS use a battery of regression tests to ensure
> > that the fast one always gives the same results as the slow one.

> That might be the "part that counts" (nice pun) for calculation, but
> it's not the only part that counts.

As far as Excel goes, it is. It's the single biggest time sink in the
application.

> Interface rendering, interactive
> operations, and so on are also fairly important performance-wise

...most of which is down to Windows itself, not Excel. Excel's
contribution to that lag isn't, I believe, all that great. So in this
regard, your complaint is more to do with GDI and so on than with Excel
itself.

> least to the user. In fact, calculation waits can be easier to overlook
> as a user than waits for the application to catch up when you click on a
> button.

Two points:

1. As far as I know, Excel runs its interface on one thread and the
calculation engine on another. This helps give the appearance of
Excel being snappier than it actually is: you're able to work on the
spreadsheet while it's recalculating cells.

2. On simple spreadsheets, the lag isn't noticeable. But Excel is
designed to be able to handle big spreadsheets well. That's why so
much work is put into the calculation engine rather than having an
interface which is completely fat free: in time critical applications,
it's the calculation engine that really matters.

I use Excel a lot, and have for a few years now. Grudgingly, mind you,
because I dislike having to deal with spreadsheets. But as far as MS
applications go, I think your accusations of slowness and bloat are a
little off the mark and better targeted towards its fellow MS Office
software.

Where Excel *does* fall down in terms of speed is disc I/O. There it can
be atrociously slow.

> On the other hand, if we were specifically referring to things like
> column calculation speed (of which I wasn't strictly aware), then your
> point is made.

Recalculating a spreadsheet is something more than just calculating
columns. Excel itself is a Turing-complete dataflow machine. Getting
something like that which is both correct *and* fast is hard.

Chad Perrin

Jul 26, 2006, 10:28:20 PM7/26/06
to
On Thu, Jul 27, 2006 at 09:09:43AM +0900, Keith Gaughan wrote:
> On Thu, Jul 27, 2006 at 08:53:18AM +0900, Keith Gaughan wrote:
>
> > On Thu, Jul 27, 2006 at 08:33:31AM +0900, Chad Perrin wrote:
> >
> > > Is it just me, or are there no proper closures in that example code?
> >
> > No, Chad. There's closures in there. What you're not seeing is
> > anonymous functions, but closures are not the same as anonymous
> > functions.
>
> Clarification: like Java, Python won't let you assign to variables in
> the outer scope, so you have to use an array or some other object to
> hack around that if you need that functionality. I know, it sucks, but
> the fact it doesn't allow you to assign to an outer scope doesn't stop
> Python from having closures, just that it doesn't trust the developer
> not to screw things up.
>
> Here's a better example:
>
> def foo():
>     x = [0]
>     def bar():
>         x[0] += 1
>         print x[0]
>     return bar
>
> baz = foo()
> baz() -> 1
> baz() -> 2
> baz() -> 3

That's a bit clearer -- and it does look like a proper closure. It also
looks unnecessarily complex to implement. Thanks for the example.

--
CCD CopyWrite Chad Perrin [ http://ccd.apotheon.org ]

"The ability to quote is a serviceable
substitute for wit." - W. Somerset Maugham

Chad Perrin

Jul 26, 2006, 10:38:34 PM7/26/06
to
On Thu, Jul 27, 2006 at 11:17:44AM +0900, Keith Gaughan wrote:
> On Thu, Jul 27, 2006 at 10:50:32AM +0900, Chad Perrin wrote:
>
> > That might be the "part that counts" (nice pun) for calculation, but
> > it's not the only part that counts.
>
> As far as Excel goes, it is. It's the single biggest time sink in the
> application.

I'll put it this way: it's not the only part that counts for me, and for
other spreadsheet users with whom I've discussed Excel in the past.


>
> > Interface rendering, interactive
> > operations, and so on are also fairly important performance-wise
>

> ...most of which is down to Windows itself, not Excel. Excel's
> contribution to that lag isn't, I believe, all that great. So in this
> regard, your complaint is more to do with GDI and so on than with Excel
> itself.

Excel doesn't run so well on Linux, so I'll just leave that lying where
it is.


>
> > least to the user. In fact, calculation waits can be easier to overlook
> > as a user than waits for the application to catch up when you click on a
> > button.
>
> Two points:
>
> 1. As far as I know, Excel runs its interface on one thread and the
> calculation engine on another. This helps give the appearance of
> Excel being snappier than it actually is: you're able to work on the
> spreadsheet while it's recalculating cells.
>
> 2. On simple spreadsheets, the lag isn't noticeable. But Excel is
> designed to be able to handle big spreadsheets well. That's why so
> much work is put into the calculation engine rather than having an
> interface which is completely fat free: in time critical applications,
> it's the calculation engine that really matters.

. . . and yet, the interface being fat-free would be awfully nice.
Instead, it gets fatter with every new version.


>
> I use Excel a lot, and have for a few years now. Grudgingly, mind you,
> because I dislike having to deal with spreadsheets. But as far as MS
> applications go, I think your accusations of slowness and bloat are a
> little off the mark and better targeted towards its fellow MS Office
> software.

It's true that other MS Office applications are worse. That doesn't
make Excel perfect.


>
> Where Excel *does* fall down in terms of speed is disc I/O. There it can
> be atrociously slow.

I certainly won't disagree with that. Disk access seems to be another
problem -- though it's easier to overlook than some other issues, once a
spreadsheet is loaded and before it needs to be saved again.


>
> > On the other hand, if we were specifically referring to things like
> > column calculation speed (of which I wasn't strictly aware), then your
> > point is made.
>
> Recalculating a spreadsheet is something more than just calculating
> columns. Excel itself is a Turing-complete dataflow machine. Getting
> something like that which is both correct *and* fast is hard.

I don't particularly see how that contradicts what I said. I may have
been more flippant in my reference to calculations than you'd like, but
I didn't say anything inaccurate.

--
CCD CopyWrite Chad Perrin [ http://ccd.apotheon.org ]

Keith Gaughan

Jul 26, 2006, 11:19:24 PM7/26/06
to
On Thu, Jul 27, 2006 at 11:38:34AM +0900, Chad Perrin wrote:

> > ...most of which is down to Windows itself, not Excel. Excel's
> > contribution to that lag isn't, I believe, all that great. So in this
> > regard, your complaint is more to do with GDI and so on than with Excel
> > itself.
>
> Excel doesn't run so well on Linux, so I'll just leave that lying where
> it is.

In fairness, if you're judging its performance based on running it in
Wine, that's not really a fair comparison. :-)

K.

People need good lies. There are too many bad ones.
-- Bokonon, "Cat's Cradle" by Kurt Vonnegut, Jr.

Keith Gaughan

Jul 26, 2006, 11:25:49 PM7/26/06
to
On Thu, Jul 27, 2006 at 11:28:20AM +0900, Chad Perrin wrote:

> > Here's a better example:
> >
> > def foo():
> > x = [0]
> > def bar():
> > x[0] += 1
> > print x[0]
> > return bar
> >
> > baz = foo()
> > baz() -> 1
> > baz() -> 2
> > baz() -> 3
>
> That's a bit clearer -- and it does look like a proper closure. It also
> looks unnecessarily complex to implement. Thanks for the example.

In practice, it's not really all that bad. Most of the places where I'd
use closures in, say, Ruby or JavaScript, I'd end up using generators,
list comprehensions, &c. in Python. Having to name the inner
function's a bit of a pain, but generally I don't end up assigning to
the variables in the outer scope anyway, so that's not such a big deal
either.

Different strokes, and all that.
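For comparison, here's roughly how the same counter looks as a Ruby
closure -- a sketch I'm adding for illustration, since the posts above
only show the Python side. Ruby blocks close over locals directly, so
the one-element-list workaround isn't needed:

```ruby
# Counter closure in Ruby: `x` is captured by the lambda and can be
# assigned to directly, unlike in (pre-3.0) Python.
def foo
  x = 0
  lambda { x += 1 }
end

baz = foo
baz.call # => 1
baz.call # => 2
baz.call # => 3
```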

K.

I hate babies. They're so human.
-- H. H. Munro

Francis Cianfrocca

unread,
Jul 27, 2006, 12:23:47 AM7/27/06
to
Ashley Moran wrote:
> I think the total data size is about 1.5GB, but the individual files
> are smaller, the largest being a few hundred MB. The most rows in a
> file is ~15,000,000 I think. The server I run it on has 2GB RAM (an
> Athlon 3500+ running FreeBSD/amd64, so the hardware is not really an
> issue)... it can get all the way through without swapping (just!)

The problem isn't the raw size of the dataset. It's the number of
objects you create and the amount of garbage that has to be cleaned up.
If you're clever about how you write, you can help Ruby by not creating
so much garbage. That can give a huge benefit.


>
> The processing is pretty trivial, and mainly involves incrementing
> some ID columns so we can merge datasets together, adding a text
> column to the start of every row, and eliminating a few duplicates.

Eliminating the dupes is the only scary thing I've seen here. What's the
absolute smallest piece of data that you need to look at in order to
distinguish a dupe? (If it's the whole line, then the answer is 16
bytes- the length of the MD5 hash ;-)) That's the critical working set.
If you can't get the Ruby version fast enough, it's cheap and easy to
sort through 15,000,000 of them in C. Then one pass through the sorted
set finds your dupes. I've never found a consistently-fastest performer
among Ruby's several different ways of storing sorted sets.
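As a sketch of the digest trick -- keeping only a 16-byte working set
per distinct line, not the poster's actual code, and with a made-up
input for illustration:

```ruby
require 'digest/md5'
require 'set'

# Keep one 16-byte MD5 digest per distinct line instead of the line
# itself, so the dupe-detection working set stays small.
def count_dupes(lines)
  seen  = Set.new
  dupes = 0
  lines.each do |line|
    digest = Digest::MD5.digest(line) # 16 raw bytes
    dupes += 1 unless seen.add?(digest)
  end
  dupes
end
```

Called as `count_dupes(File.foreach("data.txt"))` it streams the file a
line at a time, so only the digests are ever held in memory.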

Make sure that your inner loop doesn't allocate any new variables,
especially arrays- declare them outside your inner loop and re-use them
with Array#clear.
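A minimal sketch of that buffer-reuse advice (the workload here is
invented):

```ruby
# Reuse one Array across iterations instead of allocating a fresh one
# each pass -- less garbage for the GC to collect in a hot loop.
buffer = []
totals = []
[[1, 2], [3, 4], [5, 6]].each do |pair|
  buffer.clear                       # same object, emptied in place
  pair.each { |n| buffer << n * 10 }
  totals << buffer.inject(0) { |s, v| s + v }
end
# totals => [30, 70, 110]
```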

> doesn't improve things, there's always the option of going dual-core
> and forking to do independent files.

Obviously I haven't seen your code or your data, but if the Ruby app is
memory-bus-bound, then this approach may make your problem worse, not
better.

Good luck. I recently got a Ruby program that aggregates several LDAP
directory-pulls with about a million entries down from a few hours to a
few seconds, without having to drop into C. It can be done, and it's
kindof fun too.

Chad Perrin

unread,
Jul 27, 2006, 12:59:37 AM7/27/06
to
On Thu, Jul 27, 2006 at 12:19:24PM +0900, Keith Gaughan wrote:
> On Thu, Jul 27, 2006 at 11:38:34AM +0900, Chad Perrin wrote:
>
> > > ...most of which is down to Windows itself, not Excel. Excel's
> > > contribution to that lag isn't, I believe, all that great. So in this
> > > regard, your complaint is more to do with GDI and so on than with Excel
> > > itself.
> >
> > Excel doesn't run so well on Linux, so I'll just leave that lying where
> > it is.
>
> In fairness, if you're judging its performance based on running it in
> Wine, that's not really a fair comparison. :-)

I'm judging it based on running it on Windows. My point is that
divorcing it from the only environment in which it runs (natively) is
less than strictly sporting of you, when trying to discuss its
performance characteristics (or lack thereof).

--
CCD CopyWrite Chad Perrin [ http://ccd.apotheon.org ]

William James

unread,
Jul 27, 2006, 1:18:25 AM7/27/06
to
Kroeger, Simon (ext) wrote:
> Hi Peter!
>
> > Whenever the question of performance comes up with scripting
> > languages
> > such as Ruby, Perl or Python there will be people whose
> > response can be
> > summarised as "Write it in C". I am one such person. Some people take
> > offence at this and label us trolls or heretics of the true
> > programming
> > language (take your pick).
>
> The last (and only) time I called someone a troll for saying
> 'Write it C' it was in response to a rails related question.
> Further the OP asked for configuration items and such, but maybe
> that's a whole other storry. (and of course you can write
> C Extensions for rails... yeah, yadda, yadda :) )
>
> ..snip 52 lines Perl, some hundred lines C ...
>
> > [Latin]$ time ./Latin1.pl 5 > x5
> >
> > real 473m45.370s
> > user 248m59.752s
> > sys 2m54.598s
> >
> > [Latin]$ time ./Latin4.pl 5 > x5
> >
> > real 12m51.178s
> > user 12m14.066s
> > sys 0m7.101s
> >
> > [Latin]$ time ./c_version.sh 5
> >
> > real 0m5.519s
> > user 0m4.585s
> > sys 0m0.691s
>
> Just to show the beauty of ruby:
> -----------------------------------------------------------
> require 'rubygems'
> require 'permutation'
> require 'set'
>
> $size = (ARGV.shift || 5).to_i
>
> $perms = Permutation.new($size).map{|p| p.value}
> $out = $perms.map{|p| p.map{|v| v+1}.join}
> $filter = $perms.map do |p|
>   s = SortedSet.new
>   $perms.each_with_index do |o, i|
>     o.each_with_index {|v, j| s.add(i) if p[j] == v}
>   end && s.to_a
> end
>
> $latins = []
> def search lines, possibs
>   return $latins << lines if lines.size == $size
>   possibs.each do |p|
>     search lines + [p], (possibs - $filter[p]).subtract(lines.last.to_i..p)
>   end
> end
>
> search [], SortedSet[*(0...$perms.size)]
>
> $latins.each do |latin|
>   $perms.each do |perm|
>     perm.each{|p| puts $out[latin[p]]}
>     puts
>   end
> end
> -----------------------------------------------------------
> (does someone have a nicer/even faster version?)

Here's a much slower version that has no 'require'.

Wd = ARGV.pop.to_i
$board = []

# Generate all possible valid rows.
Rows = (0...Wd**Wd).map{|n| n.to_s(Wd).rjust(Wd,'0')}.
  reject{|s| s=~/(.).*\1/}.map{|s| s.split(//).map{|n| n.to_i + 1} }

def check( ary, n )
  ary[0,n+1].transpose.all?{|x| x.size == x.uniq.size }
end

def add_a_row( row_num )
  if Wd == row_num
    puts $board.map{|row| row.join}.join(':')
  else
    Rows.size.times { |i|
      $board[row_num] = Rows[i]
      if check( $board, row_num )
        add_a_row( row_num + 1 )
      end
    }
  end
end

add_a_row 0

Keith Gaughan

unread,
Jul 27, 2006, 1:26:37 AM7/27/06
to
On Thu, Jul 27, 2006 at 01:59:37PM +0900, Chad Perrin wrote:
> On Thu, Jul 27, 2006 at 12:19:24PM +0900, Keith Gaughan wrote:
> > On Thu, Jul 27, 2006 at 11:38:34AM +0900, Chad Perrin wrote:
> >
> > > > ...most of which is down to Windows itself, not Excel. Excel's
> > > > contribution to that lag isn't, I believe, all that great. So in this
> > > > regard, your complaint is more to do with GDI and so on than with Excel
> > > > itself.
> > >
> > > Excel doesn't run so well on Linux, so I'll just leave that lying where
> > > it is.
> >
> > In fairness, if you're judging it's performance based on running it in
> > Wine, that's not really a fair comparison. :-)
>
> I'm judging it based on running it on Windows. My point is that
> divorcing it from the only environment in which it runs (natively) is
> less than strictly sporting of you, when trying to discuss its
> performance characteristics (or lack thereof).

Wait... I did no such thing. All I said was that what interface
sluggishness you get from Excel can't be blamed on Excel. They're
performance characteristics that *can* be divorced from Excel (because
they're Windows's own performance characteristics, not Excel's). Argue
those points, and you're arguing about the wrong software.

But Wine is an emulator, and while it does a good job approaching the
speed of Windows, it doesn't hit it, nor can it hit it. You're not
comparing like with like. Now that's far from sporting.

Your argument is disingenuous. Consider Cygwin running on Windows
compared to FreeBSD running on the same machine. I can make this
comparison because the machine I'm currently using dual-boots such a
setup. I run many of the same applications under Cygwin as I do under
FreeBSD on the same box. Those same applications running under Cygwin
are noticeably slower than the native equivalents under FreeBSD. Do I
blame the software I'm running under Cygwin for being slow? No, because
I'm well aware that it zips along in its native environment. Do I blame
Cygwin? No, because it does an awful lot of work to trick the software
running under it into thinking it's running on a *nix. Do I blame
Windows? No, because I use some of that software--gcc being an
example--natively under Windows and it performs just as well as when
it's run natively under FreeBSD. Bringing Wine in is a red herring.
Software cannot be blamed for the environment it's executed in.

Fear is the greatest salesman.
-- Robert Klein

Chad Perrin

unread,
Jul 27, 2006, 1:42:30 AM7/27/06
to
On Thu, Jul 27, 2006 at 02:26:37PM +0900, Keith Gaughan wrote:
> On Thu, Jul 27, 2006 at 01:59:37PM +0900, Chad Perrin wrote:
> >
> > I'm judging it based on running it on Windows. My point is that
> > divorcing it from the only environment in which it runs (natively) is
> > less than strictly sporting of you, when trying to discuss its
> > performance characteristics (or lack thereof).
>
> Wait... I did no such thing. All I said was that what interface
> sluggishness you get from Excel can't be blamed on Excel. They're
> performance characteristics that *can* be divorced from Excel (because
> they're Window's own performance characteristic, not Excel's). Argue
> those points, and you're arguing about the wrong software.

Design decisions that involve interfacing with sluggish interface
software are relevant to the software under discussion -- and not all of
the interface is delegated to Windows, either. No software can be
evaluated for its performance characteristics separately from its
environment, except insofar as it runs without that environment.


>
> But Wine is an emulator, and while it does a good job approaching the
> speed of Windows, it doesn't hit it, nor can it hit it. You're not
> comparing like with like. Now that's far from sporting.

Actually, no, it's not an emulator. It's a set of libraries (or a
single library -- I'm a little sketchy on the details) that provides the
same API as Windows software finds in a Windows environment. An
emulator actually creates a faux/copy version of the environment it's
emulating. Wine stands to Windows as Linux stands to Unix -- a
differing implementation of the same interfaces -- whereas Cygwin on
Windows really is closer to an emulator of Unix.

. . . and, in fact, there are things that run faster via Wine on Linux
than natively on Windows.


[ snip ]


> under FreeBSD. Bringing Wine in is a red herring. Software cannot be
> blamed for the environment it's executed in.

I didn't bring it up. You did. I made a comment about Excel not
working in Linux as a bit of a joke, attempting to make the point that
saying Excel performance can be evaluated separately from its dependence
on Windows doesn't strike me as useful.

Peter Hickman

unread,
Jul 27, 2006, 3:55:57 AM7/27/06
to
On my machine it took around 33 seconds, but I think I can improve it a
little; besides, I have to test the results first.


Ashley Moran

unread,
Jul 27, 2006, 5:13:47 AM7/27/06
to
On Thursday 27 July 2006 05:23, Francis Cianfrocca wrote:
> Eliminating the dupes is the only scary thing I've seen here. What's the
> absolute smallest piece of data that you need to look at in order to
> distinguish a dupe? (If it's the whole line, then the answer is 16
> bytes- the length of the MD5 hash ;-)) That's the critical working set.
> If you can't get the Ruby version fast enough, it's cheap and easy to
> sort through 15,000,000 of them in C. Then one pass through the sorted
> set finds your dupes. I've never found a consistently-fastest performer
> among Ruby's several different ways of storing sorted sets.
>
> Make sure that your inner loop doesn't allocate any new variables,
> especially arrays- declare them outside your inner loop and re-use them
> with Array#clear.

Nice MD5 trick! I'll remember that. Fortunately the files that need
duplicate elimination are really small, so I won't need to resort to that.
But I'll remember it for future reference.


>
> Obviously I haven't seen your code or your data, but if the Ruby app is
> memory-bus-bound, then this approach may make your problem worse, not
> better.

Hadn't thought of that, good point...


> Good luck. I recently got a Ruby program that aggregates several LDAP
> directory-pulls with about a million entries down from a few hours to a
> few seconds, without having to drop into C. It can be done, and it's
> kindof fun too.

Next time I get a morning free I might apply some of the tweaks that have been
suggested. Might be interested to see how much I can improve the
performance.

Cheers
Ashley

--
"If you do it the stupid way, you will have to do it again"
- Gregory Chudnovsky

Ashley Moran

unread,
Jul 27, 2006, 5:18:14 AM7/27/06
to
On Wednesday 26 July 2006 23:13, Chad Perrin wrote:
> > I recently rewrote an 830 line Java/
> > Hibernate web service client as 67 lines of Ruby, in about an hour.  
> > With that kind of productivity, performance can go to hell!
>
> With a 92% cut in code weight, I can certainly sympathize with that
> sentiment.  Wow.

Even the last remaining member of the Anti-Ruby Defence League in my office
admitted (reluctantly) that it was impressive!

Daniel Martin

unread,
Jul 27, 2006, 11:06:24 AM7/27/06
to
har...@schizopolis.net writes:

> Are there
> good benchmarks for OO languages? Or dynamic languages? Are there good
> benchmarks that could actually measure the types of uses I need, where I'm
> building a web front end to a DB store? I don't know about you, but my job
> has never involved fractals.

A benchmark was posted here a few weeks ago, in a thread on Rails
performance, that measured how fast a "hello world" web app could
respond under heavy load. This doesn't measure the DB back end piece,
of course, but it's a little closer to useful for you than Mandelbrot
calculations.

In fact, digging through Google desktop searches of my recently
visited web pages finds it here:

http://www.usenetbinaries.com/doc/Web_Platform_Benchmarks.html

Rails loses this contest big-time. Perl CGI scripts even beat a Rails
FastCGI setup. Rails FastCGI is about 15 times slower than plain ruby
FastCGI.

Also, it seems clear that at least for very simple web apps, PHP 4 to
PHP 5 is a distinct performance regression.

Isaac Gouy

unread,
Jul 27, 2006, 4:40:05 PM7/27/06
to

Hal Fulton wrote:
> Isaac Gouy wrote:
> >
> > That sounds wrong to me - I hear positive comments about Java
> > performance for long-running programs, not for programs that run in
> > under a second.
> >
>
> JIT is the key to a lot of that. Performance depends greatly on
> the compiler, the JVM, the algorithm, etc.
>
> I won a bet once from a friend. We wrote comparable programs in
> Java and C++ (some arbitrary math in a loop running a bazillion
> times).
>
> With defaults on both compiles, the Java was actually *faster*
> than the C++. Even I didn't expect that. But as I said, this
> sort of thing is highly dependent on many different factors.
>
>
> Hal

Sometimes when we look at different workloads we can see the
performance crossover once the relatively slow startup is overcome,
code is JIT'd, and adaptive optimisation kicks in:
http://shootout.alioth.debian.org/gp4/fulldata.php?test=nbody&p1=java-0&p2=gcc-0&p3=clean-0&p4=java-0

Tim Bray

unread,
Jul 27, 2006, 7:11:17 PM7/27/06
to
Sorry for coming late to the party.

On Jul 26, 2006, at 1:47 AM, Peter Hickman wrote:

> Whenever the question of performance comes up with scripting
> languages such as Ruby, Perl or Python there will be people whose
> response can be summarised as "Write it in C".

The conclusion is wrong in the general case. Suppose that, instead
of computing permutations, your task had been to read ten million
lines of textual log files and track statistics about certain kinds
of events coded in there. I bet a version coded in perl, making
straightforward uses of regexes and hashes, would have performance
that would be very hard to match in C or any other language. Ruby
would be a little slower I bet just because Perl's regex engine is so
highly-tuned, although it's been claimed Oniguruma is faster.

So, first gripe: C is faster than Ruby *in certain problem domains*.
In others, it's not.
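A sketch of that kind of log crunch in Ruby -- the log format and event
names below are invented for illustration, not taken from any real
system:

```ruby
# Tally event levels from log lines with a regex and a hash -- the
# idiom the scripting languages handle well.
def tally_events(lines)
  counts = Hash.new(0)
  lines.each do |line|
    counts[$1] += 1 if line =~ /\b(ERROR|WARN|INFO)\b/
  end
  counts
end
```

Called as `tally_events(File.foreach("app.log"))` it streams the file,
retaining nothing but the counts.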

Second gripe. The notion of doing a wholesale rewrite in C is almost
certainly wrong. In fact, the notion of doing any kind of serious
hacking, without doing some measuring first, is almost always wrong.
The *right* way to build software that performs well is to write a
natural, idiomatic implementation, trying to avoid stupid design
errors but not worrying too much about performance. If it's fast
enough, you're done. If it's not fast enough, don't write another
line of code till you've used a profiler and understand what the
problem is. If in fact this is the kind of a problem where C is
going to do better, chances are you only have to replace 10% of your
code to get 90% of the available speedup.
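In Ruby, that first measuring step can be as simple as the standard
Benchmark module; the workload below is a stand-in, not anything from
the thread:

```ruby
require 'benchmark'

# Time two candidate implementations of a suspected hot spot before
# deciding whether a C rewrite is even worth considering.
strings = Array.new(5_000) { |i| i.to_s }

Benchmark.bm(10) do |x|
  x.report("inject +") { strings.inject("") { |acc, s| acc + s } }
  x.report("join")     { strings.join }
end
```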

And don't forget to budget downstream maintenance time for the
memory-allocation errors and libc dependencies and so on that cause C
programs to be subject to periodic downstream crashes.

-Tim
