Java C++ shootout


Glen Low

Jun 16, 2004, 6:15:21 AM
Another Java vs C++ performance shootout:

http://kano.net/javabench/

Anyone care to comment about the appropriateness of his C++ code and
compiler settings?

Cheers,
Glen Low, Pixelglow Software
www.pixelglow.com

[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]

Hendrik Belitz

Jun 16, 2004, 11:15:37 AM
Glen Low wrote:

> Another Java vs C++ performance shootout:

Hmm, most of the C++ code looks more like C. I don't know whether this
has an impact on performance, but a Java vs C++ shootout should respect
language-specific features, which wasn't done here, at least for C++.

--
To get my real email address, remove the two onkas
--
Hendrik Belitz
- Abort, Retry, Fthagn? -

scottNOgr...@yahoo.com

Jun 16, 2004, 11:22:09 AM
In article <9215d7ac.04061...@posting.google.com>, Glen Low says...

>
>Another Java vs C++ performance shootout:
>
>http://kano.net/javabench/
>
>Anyone care to comment about the appropriateness of his C++ code and
>compiler settings?

This supposed benchmark has already been well picked apart on the Slashdot
discussion site. The C++ is poor, and the compiler used (gcc) is known for
portability, not fast executables.

As the author of the site states, he was tired of Java losing all the
performance benchmarks, so he decided to fix it!

More interesting would have been something like a contest: give a set of
problems to people who are familiar with one language or the other, have
teams code solutions, and then test the solutions. This would test, in
part, the ease of writing fast code in C++ or Java.

Also interesting, but not done, would be an attempt to see the impact, if
any, of a re-optimizing compiler. While a good C++ compiler should be able
to optimize for the normal case, if a section of code turns out at profile
time to take one particular branch, then the branch prediction could be
modified at run time. While this is presumably possible in C++ with some
sort of compiler feedback loop from instrumented code, it is not part of my
normal experience.

sdg

Thorsten Ottosen

Jun 16, 2004, 11:24:16 AM
| Another Java vs C++ performance shootout:
|
| http://kano.net/javabench/
|
| Anyone care to comment about the appropriateness of his C++ code

Sure.

1. His code is very inappropriate: a lot won't compile, and a lot is not
standard C++, nor "nice" C++.

2. He's comparing two particular implementations, not two languages.

3. He's comparing compile-time optimizations with run-time optimizations.

4. His tests are *very* simple, not really useful for what he's trying to do.

That said, it would still be interesting to see a benchmark on real
applications, if somebody has made one.

br

Thorsten

Hyman Rosen

Jun 16, 2004, 4:30:44 PM
Thorsten Ottosen wrote:
> 2. He's comparing two particular implementations, not two languages.

Well, that's a particularly meaningless remark.
Obviously, languages do not have a speed, only
implemented programs do!

> 3. He's comparing compile-time optimizations with run-time optimizations.

He's comparing how fast this one goes with how fast that
one goes. I don't understand your distinction, nor why it
would be meaningful to someone who wants an implementation
which runs quickly.

> 4. His tests are *very* simple, not really useful for what he's trying to do.

He's trying to demonstrate that equivalent Java is faster than C++,
and since that's what his tests showed, they're very useful :-)

Tom Houlder

Jun 17, 2004, 4:41:34 AM
"Hyman Rosen" <hyr...@mail.com> wrote in message
news:10874032...@master.nyc.kbcfp.com...

> Thorsten Ottosen wrote:
> > 2. He's comparing two particular implementations, not two languages.
>
> Well, that's a particularly meaningless remark.
> Obviously, languages do not have a speed, only
> implemented programs do!

Of course languages have speed. Java came in like a whirlwind and
left even faster, but C++ seems to stay here forever. So Java is a
faster language than C++.


Tom

--
Tom Houlder
thou...@houlderREMOVE.net

Chic - C++ without hassles
http://www.houlder.net

ka...@gabi-soft.fr

Jun 17, 2004, 4:44:34 AM
gle...@pixelglow.com (Glen Low) wrote in message
news:<9215d7ac.04061...@posting.google.com>...

> Another Java vs C++ performance shootout:

> http://kano.net/javabench/

> Anyone care to comment about the appropriateness of his C++ code and
> compiler settings?

When I'm benchmarking for speed, I use -O3 with g++, not -O2. That's
definitely an error on his part.

I glanced at a few of the tests. For the most part, they are pretty low
level; it's hard to say what significance they have with regards to real
code.

The hash code tests actually test how good the hashing function is. The
default algorithm used by g++ is definitely bad for large numbers of
short strings (it will cluster near the front of the table), but I've
not done enough analysis to be able to say exactly what is too large or
too short.
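To make the clustering concrete, here is a small illustrative sketch (the function names and the sum-of-characters hash are mine, chosen only to contrast with the multiply-by-31 hash that appears later in this thread; neither is claimed to be g++'s actual default):

```cpp
#include <cstddef>

// Illustrative weak hash: just sums the character codes. For short
// numeric strings the result never exceeds a few hundred, so in a
// large table every key lands in the first few hundred buckets.
std::size_t sum_hash(const char* s) {
    std::size_t h = 0;
    while (*s)
        h += static_cast<unsigned char>(*s++);
    return h;
}

// The classic multiply-by-31 string hash: each character is also mixed
// into the higher bits, spreading short keys across the whole table.
std::size_t mul31_hash(const char* s) {
    std::size_t h = 0;
    while (*s)
        h = 31 * h + static_cast<unsigned char>(*s++);
    return h;
}
```

For the key "999", sum_hash yields 171 while mul31_hash yields 56601, so with a table of a few thousand buckets only the latter reaches past the front.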

It would be interesting to see what happens with more complex data
structures, like std::vector (and push_back) vs. java.util.Vector. Java
has a decided advantage in array handling, because of the absence of
aliasing. As soon as you use any data structure other than an array of a
built-in type, however, you need to dynamically allocate each element in
the structure. Relocating garbage collection generally results in much
faster allocation than you can get with operator new in C++, but even
the fastest allocation can't compete with no allocation.

Note too that while garbage collection allocation is typically much
faster than manual allocation (perhaps an order of magnitude or so),
this is partially offset by the time you spend collecting. His
programs allocate little enough that he probably never needs to collect.
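A minimal sketch of the allocation difference described above (the types and names are illustrative, not from the benchmark; modern C++ used for brevity): storing value types contiguously needs one allocation for the whole container, while the Java-style layout pays one heap allocation per element.

```cpp
#include <memory>
#include <vector>

// A small user-defined value type.
struct Point { int x, y; };

// Contiguous storage: a single (amortized) allocation for all elements.
long long sum_values(int n) {
    std::vector<Point> pts;
    pts.reserve(n);
    for (int i = 0; i < n; ++i)
        pts.push_back(Point{i, i});
    long long s = 0;
    for (const Point& p : pts)
        s += p.x + p.y;
    return s;
}

// Java-style layout: every element is a separate heap allocation.
long long sum_pointers(int n) {
    std::vector<std::unique_ptr<Point>> pts;
    pts.reserve(n);
    for (int i = 0; i < n; ++i)
        pts.push_back(std::make_unique<Point>(Point{i, i}));
    long long s = 0;
    for (const auto& p : pts)
        s += p->x + p->y;
    return s;
}
```

Both functions compute the same sum; only the allocation pattern differs.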

The small size of the programs is a definite advantage for Java's
runtime optimizer. A run-time optimizer can't spend forever optimizing,
but it has access to actual run-time data, AND can specialize its
optimization for the specific processor. (G++, and most other C++
compilers, have flags to allow this as well.) Given the small size, a
complete analysis of the program is possible in the limited time
available, and given the other additional data, I would find it rather
worrisome if Java didn't do better. I'm more sceptical with regard to
real programs. If there really is one small critical loop,
Java should win; that's not the case for most of my programs, however.

And of course, g++ is far (very far) from the state of the art in
optimizing, especially (so I've been told) for Intel architectures.

All that said, however, for many applications, there will be little or
no difference in speed between Java and C++. Java has a number of
advantages which make life easier for the optimizer, and for most
applications, garbage collection will be faster than manual memory
management, but the fact that everything has to be dynamically allocated
can cost a lot.

In the end, of course, performance isn't necessarily the key argument.
For most of the work I do, safety is. And that is a criterion where C++
is a clear winner; it is very, very difficult to write correct programs
of any significant size in Java. (It's not really that easy in C++, but
at least the language never actively prevents you from doing the right
thing.)

--
James Kanze GABI Software
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

Hendrik Belitz

Jun 17, 2004, 4:48:15 AM
Hyman Rosen wrote:

> Thorsten Ottosen wrote:
>> 2. He's comparing two particular implementations, not two languages.
>
> Well, that's a particularly meaningless remark.
> Obviously, languages do not have a speed, only
> implemented programs do!
>
>> 3. He's comparing compile-time optimizations with run-time optimizations.
>
> He's comparing how fast this one goes with how fast that
> one goes. I don't understand your distinction, nor why it
> would be meaningful to someone who wants an implementation
> which runs quickly.
>
>> 4. His tests are *very* simple, not really useful for what he's trying to
>> do.
>
> He's trying to demonstrate that equivalent Java is faster than C++,
> and since that's what his tests showed, they're very useful :-)

Well, but without programming in proper C++ (which he doesn't), these tests
are meaningless. Maybe he should ask a C++ developer to rewrite the C++
tests, so he can get more meaningful results. I'm sure that if I wrote
both versions (Java and C++), the C++ programs would run much faster than the
Java programs. Not because my C++ is so good, but because my Java is so bad :-)

--
To get my real email address, remove the two onkas
--
Hendrik Belitz
- Abort, Retry, Fthagn? -

Thorsten Ottosen

Jun 17, 2004, 4:52:06 AM
"Hyman Rosen" <hyr...@mail.com> wrote in message news:10874032...@master.nyc.kbcfp.com...
| Thorsten Ottosen wrote:
| > 2. He's comparing two particular implementations, not two languages.
|
| Well, that's a particularly meaningless remark.
| Obviously, languages do not have a speed, only
| implemented programs do!

But if you want a fair picture, you need to consider many implementations of
both languages/libraries. Picking just two is hardly enough.

| > 3. He's comparing compile-time optimizations with run-time optimizations.
|
| He's comparing how fast this one goes with how fast that
| one goes. I don't understand your distinction, nor why it
| would be meaningful to someone who wants an implementation
| which runs quickly.

That's true.

| > 4. His tests are *very* simple, not really useful for what he's trying to do.
|
| He's trying to demonstrate that equivalent Java is faster than C++,

I guess we have different opinions about the meaning of "equivalent".

| and since that's what his tests showed, they're very useful :-)

But the tests don't show that, because they are too simple. What the tests
show is that run-time optimizations perform very well on overly simplified
code.

As an example, virtual machines can pseudo-inline virtual function calls
by detecting really fast that we're dealing with the same object. That pays
off when you're calling the same function on the same object 10^6 times.

However, it's been a long time since even parts of my programs would fit that description.

The interesting question is how well the run-time optimizer works with
normal code. And that is not answered by his tests.
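The monomorphic call pattern Thorsten describes can be sketched as follows (the Shape/Square types are hypothetical illustrations, not benchmark code): through the base reference every call is virtual, which a VM can pseudo-inline at run time once the call site proves monomorphic, while a static compiler can only do so if it proves the dynamic type.

```cpp
// Hypothetical class hierarchy for illustration.
struct Shape {
    virtual ~Shape() {}
    virtual int area() const = 0;
};

struct Square : Shape {
    int side;
    explicit Square(int s) : side(s) {}
    int area() const { return side * side; }
};

// The same object, the same function, n times: the best case for a
// VM's run-time pseudo-inlining, and the hard case for static analysis,
// since the compiler sees only a Shape&.
int total_area(const Shape& s, int n) {
    int sum = 0;
    for (int i = 0; i < n; ++i)
        sum += s.area();
    return sum;
}
```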

br

Thorsten

ka...@gabi-soft.fr

Jun 17, 2004, 10:46:35 AM
Hyman Rosen <hyr...@mail.com> wrote in message
news:<10874032...@master.nyc.kbcfp.com>...

>> 4. His tests are *very* simple, not really useful for what he's
>> trying to do.

> He's trying to demonstrate that equivalent Java is faster than C++,
> and since that's what his tests showed, they're very useful :-)

But it's an obvious truism, well known for quite a long time, that for
some specific operations, Java is faster than C++, and that for others,
C++ is faster than Java. Even with much older, less optimized Java
implementations. Speed, per se, has always been a weak point of C/C++,
due to the way they (don't) handle arrays. Speed, per se, is also a weak
point of Java, due to the way it handles objects of user defined types.
I expect that a good Ada compiler (if such exist) would leave both
languages far behind.

In the meantime, it's well known: if you want Java to win your
benchmarks, use lots of arrays of basic types. If you favor C++, use
lots of simple, value oriented user defined types; I'll bet that in any
benchmark making extensive use of complex, for example, C++ will win
hands down.

Note that for the programs that use neither, like sieve or word count,
the times are remarkably similar; the remaining gap is probably due to
differences in the quality of the optimizers, rather than any real
differences in the languages.

Another interesting test is object creation. Java looks to be about
three or four times faster than C++. However, it is important to note
that 1) the memory use pattern is optimal for a garbage collector, and
2) more importantly, Java must optimize this aspect to a maximum, in
order to be usable, since all user defined objects must be allocated
dynamically. Typically, in C++, I'd guess that at most something like
25% of all objects are dynamically allocated. (But this depends
enormously on the application domain. For numeric calculations
involving complex, for example, there may be no dynamically allocated
objects in C++. See above -- what you test makes a difference.)
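A minimal sketch of the complex-heavy, allocation-free style referred to above (the function is my illustration, not from any benchmark): every temporary std::complex<double> is a stack value, so nothing here touches the heap.

```cpp
#include <complex>

// Sum of z^k for k = 0..n-1, computed iteratively. All intermediate
// values are stack-allocated std::complex<double> objects; the loop
// performs no dynamic allocation at all.
std::complex<double> geometric_sum(std::complex<double> z, int n) {
    std::complex<double> term(1.0, 0.0);
    std::complex<double> sum(0.0, 0.0);
    for (int i = 0; i < n; ++i) {
        sum += term;
        term *= z;
    }
    return sum;
}
```

In Java, the equivalent Complex would be a full object, with each temporary heap-allocated.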

In the end, of course, speed is usually less important than
correctness. And while Java is fine for smaller things (where I
definitely prefer it to Perl), it doesn't scale; beyond a certain size,
writing a correct program in Java becomes almost impossible (whereas in
C++, it's just very difficult).

--
James Kanze GABI Software
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34


James Rogers

Jun 18, 2004, 5:47:30 AM
ka...@gabi-soft.fr wrote in
news:d6652001.0406...@posting.google.com:

> But it's an obvious truism, well known for quite a long time, that for
> some specific operations, Java is faster than C++, and that for
> others, C++ is faster than Java. Even with much older, less optimized
> Java implementations. Speed, per se, has always been a weak point of
> C/C++, due to the way they (don't) handle arrays. Speed, per se, is
> also a weak point of Java, due to the way it handles objects of user
> defined types. I expect that a good Ada compiler (if such exist) would
> leave both languages far behind.

I do not have the same system the owner of the web page has. Mine is
probably much slower. That said, I ran his Java code for the Ackerman
test with a parameter of 8 (giving a result of 2045). It took 2 seconds
to run, including the time needed to start up the JVM. An Ada example
compiled with the GNAT compiler, which uses the GNU backend, took
0.17 seconds. This was run on my 3-year-old laptop running Win98.

I do not have a C++ compiler to compare timings. I would be surprised
if any C++ compiler required anything near 2 seconds to run this test
for properly constructed C++.

Jim Rogers

SrConchiwa

Jun 18, 2004, 5:49:34 AM
> Speed, per se, has always been a weak point of C/C++,
> due to the way they (don't) handle arrays.

I was trying to find an article online to help me understand why this
statement is true, but there are way too many "Java vs. C++" flame
sites clogging my results. Would you mind posting a link
explaining this?

> ... it doesn't scale; beyond a certain size,
> writing a correct program in Java becomes almost impossible (whereas in
> C++, it's just very difficult).

Having little experience with mammoth projects, I also was curious
about this. I'm sure this is a very in-depth issue, but what language
features allow/impede large-scale development?

Thank you,
Luke

Stefan de Bruijn

Jun 18, 2004, 6:41:21 AM
Glen Low wrote:
> Another Java vs C++ performance shootout:
>
> http://kano.net/javabench/
>
> Anyone care to comment about the appropriateness of his C++ code and
> compiler settings?

I've taken the liberty of adjusting the C++ code the way I think it
"should" be (that is: using the STL properly). Of course there could still
be remarks about the choices I've made (like using std::iostream rather
than FILE* or file descriptors for the sake of performance), but I think
the results give a better idea of the differences in performance. The
tests were conducted with the Intel and GCC compilers; I would like to
test Comeau too, but unfortunately I don't have it.

The test can be obtained from http://cpp.student.utwente.nl/benchmark

Just my 2 cents: I find it strange that loops don't appear to be
properly optimized by either of the tested compilers.

Sincerely,

Stefan de Bruijn.

Catalin Pitis

Jun 19, 2004, 6:34:47 AM
I see a *GREAT* difference in the method-call benchmark.

First of all, how were these measurements made? The profiling information
has to be extracted somehow... I know that instrumenting the code is one
approach. In that case, how can method calls be measured
without altering the binaries?

Can someone give me a hint?

Catalin


"Glen Low" <gle...@pixelglow.com> wrote in message
news:9215d7ac.04061...@posting.google.com...


> Another Java vs C++ performance shootout:
>
> http://kano.net/javabench/
>
> Anyone care to comment about the appropriateness of his C++ code and
> compiler settings?


Michael Schlenger

Jun 19, 2004, 6:36:54 AM
On 17 Jun 2004 10:46:35 -0400, ka...@gabi-soft.fr wrote:

>But it's an obvious truism, well known for quite a long time, that for
>some specific operations, Java is faster than C++, and that for others,
>C++ is faster than Java. Even with much older, less optimized Java
>implementations. Speed, per se, has always been a weak point of C/C++,
>due to the way they (don't) handle arrays.

This is not so obvious to me. I always thought that array access is
fast in C++ since it only needs a simple offset calculation or pointer
arithmetic to access an array element. No range checks, nothing.


[snip]

>Another interesting test is object creation. Java looks to be about
>three or four times faster than C++.

Maybe with the general-purpose allocator that is normally used
(malloc/free). But one can often implement special allocators for
special situations, which are then much faster. E.g., well known is the
scenario where we have a container of objects, all the same size.
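A minimal sketch of the kind of fixed-size allocator Michael describes (the class name and chunk size are my choices): allocation and deallocation are O(1) free-list operations, and memory is obtained from the general-purpose heap only one chunk at a time.

```cpp
#include <cstddef>
#include <vector>

// A minimal free-list pool for fixed-size blocks. allocate() pops a
// block off the free list; deallocate() pushes it back. All chunks
// are released in bulk when the pool is destroyed.
class FixedPool {
public:
    explicit FixedPool(std::size_t block_size,
                       std::size_t blocks_per_chunk = 256)
        : block_(block_size < sizeof(void*) ? sizeof(void*) : block_size),
          chunk_blocks_(blocks_per_chunk) {}

    ~FixedPool() {
        for (std::size_t i = 0; i < chunks_.size(); ++i)
            delete[] chunks_[i];
    }

    void* allocate() {
        if (!free_list_) grow();
        void* p = free_list_;
        free_list_ = *static_cast<void**>(free_list_);  // pop head
        return p;
    }

    void deallocate(void* p) {
        *static_cast<void**>(p) = free_list_;  // push onto head
        free_list_ = p;
    }

private:
    void grow() {
        char* chunk = new char[block_ * chunk_blocks_];
        chunks_.push_back(chunk);
        // Thread the new blocks onto the free list.
        for (std::size_t i = 0; i < chunk_blocks_; ++i)
            deallocate(chunk + i * block_);
    }

    std::size_t block_, chunk_blocks_;
    void* free_list_ = nullptr;
    std::vector<char*> chunks_;
};
```

Objects of a single size would then be placement-new'ed into blocks obtained from the pool.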


------------------------------------------------
Michael Schlenger
------------------------------------------------

Greg Copeland

Jun 19, 2004, 6:47:49 AM
On Wed, 16 Jun 2004 06:15:21 -0400, Glen Low wrote:

> Another Java vs C++ performance shootout:
>
> http://kano.net/javabench/
>
> Anyone care to comment about the appropriateness of his C++ code and
> compiler settings?


Just FYI, I've already taken four of the benchmarks there and optimized
them. I'll probably look at some more as time allows. All that I've
looked at are now running faster than the Java implementations. I've
submitted the code back to the author. One such implementation was 2.2 -
2.7 times faster than the Java implementation, while his original
implementation was actually slower than Java's. That one was the hash
benchmark. I also noticed that one of the C++ implementations was not
even correct (as in, it didn't do what it was supposed to do) and was
slower than the Java implementation to boot. I corrected both of those.

Basically, all of the benchmarks appear to be absolute best corner-case
situations where HotSpot can optimize, and all appear to avoid ever paying
any type of heap costs. For the objinst benchmark, I simply used a commonly
available collector and it quickly zoomed ahead of the Java implementation.
Next, I plan on implementing an optimized heap implementation specifically
tailored for the object life cycle used in the benchmark. Which is:
highly transient, short-lived objects whereby the destructor is never used
nor required.
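One way to sketch the kind of heap Greg plans (the Arena class is my illustration, not his code): a bump-pointer arena suits exactly this life cycle, since transient objects with no destructors can all be released at once.

```cpp
#include <cstddef>

// A bump-pointer arena: allocation is a pointer increment, and the
// whole region is released at once with reset(). No per-object
// destructors are ever run, matching the "highly transient,
// short-lived objects" case.
class Arena {
public:
    explicit Arena(std::size_t capacity)
        : buf_(new char[capacity]), cap_(capacity), used_(0) {}
    ~Arena() { delete[] buf_; }

    void* allocate(std::size_t n) {
        // Round up so subsequent allocations stay maximally aligned.
        const std::size_t a = alignof(std::max_align_t);
        n = (n + a - 1) & ~(a - 1);
        if (used_ + n > cap_) return nullptr;  // out of arena space
        void* p = buf_ + used_;
        used_ += n;
        return p;
    }

    void reset() { used_ = 0; }  // frees everything in O(1)

private:
    char* buf_;
    std::size_t cap_, used_;
};
```

Trivially destructible objects would be placement-new'ed into the arena and abandoned at reset().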

For still several others, in an attempt to eliminate noise, I wanted to run
a longer benchmark. Sadly, the Java implementations were not able to run
because I didn't have enough memory. The C++ implementation would take
about 58M or something like that and completed in about 2s. The Java
implementation ran out of memory at 380M and failed to complete after
around 6+ seconds. I was unable to give more memory to the VM without
causing my system to start paging.

Oddly, the guy did mention that he considered proper C++ implementations
to be "hand optimized" rather than recognizing that most of the benchmark
code was simply horrible and improper implementations. I take it that
he believes his bad benchmark actually carries some level of validity.
At any rate, he said he'd offer an update later this summer. So, check
back to see what happens. ;)


Cheers,

Greg Copeland

Greg Copeland

Jun 19, 2004, 6:54:32 AM
On Thu, 17 Jun 2004 04:44:34 -0400, kanze wrote:

> gle...@pixelglow.com (Glen Low) wrote in message
> news:<9215d7ac.04061...@posting.google.com>...
> > Another Java vs C++ performance shootout:
>
> > http://kano.net/javabench/
>
> > Anyone care to comment about the appropriateness of his C++ code and
> > compiler settings?
>
> When I'm benchmarking for speed, I use -O3 with g++, not -O2. That's
> definitely an error on his part.

Actually, I tried several -O options here, and for the few benchmarks where
I tried different options, -O2 fared rather well. Mostly, -O3 made no
difference. I may go back and make another pass with -O3 with the new
code to see how things fare, just to make sure.

>
> I glanced at a few of the tests. For the most part, they are pretty low
> level; it's hard to say what significance they have with regards to real
> code.

Which is exactly why almost all of these represent ideal corner cases for
runtime optimization but bear little resemblance to real-world applications.
Worse, most of the C++ code is just horrible.

>
> The hash code tests actually test how good the hashing function is. The
> default algorithm used by g++ is definitely bad for large numbers of
> short strings (it will cluster near the front of the table), but I've
> not done enough analysis to be able to say exactly what is too large or
> too short.

Agreed. While using an equivalent hash implementation to what was being
used in the Java implementation, the C++ implementation is now 2.2 - 2.7
times faster than the Java implementation. The code was submitted back to
the author.

>
> It would be interesting to see what happens with more complex data
> structures, like std::vector (and push_back) vs. java.util.Vector. Java
> has a decided advantage in array handling, because of the absence of
> aliasing. As soon as you use any data structure other than an Array of a
> built-in type, however, you need to dynamically allocate each element in
> the structure. Relocating garbage collection generally results in much
> faster allocation than you can get with operator new in C++, but even
> the fastest allocation can't compete with no allocation.
>
> Note too that while garbage collection allocation is typically much
> faster than manual allocation (perhaps an order of magnitude or so),
> this is partially offset by the time you spend collecting. His
> programs allocate little enough that he probably never needs to collect.

Which is exactly true, thus helping to invalidate benchmarks such as
objinst. When I grabbed a GC for C++ and compiled against it, the C++
implementation became faster by about 30%. I plan on quickly writing my
own heap implementation for that benchmark, which I suspect will be much,
much faster yet. The GC C++ implementation was submitted back to the
author, and I plan on submitting the other implementation as well.


Cheers,

Greg Copeland

dayton

Jun 19, 2004, 6:55:24 AM
Glen Low wrote:

> Another Java vs C++ performance shootout:
>
> http://kano.net/javabench/
>
> Anyone care to comment about the appropriateness of his C++ code and
> compiler settings?

It appears he was testing the I/O system, and not actual number crunching.

For small loops on Pentium machines, it is possible for Java to be
faster than native code. This is because some CPUs have deep caches.
Some highly optimizing compilers in fact exploit this by generating
P-code for small loops and using an interpreter that runs entirely
within the cache.

On CPUs designed for multi-processor applications with shallower caches
(such as SPARCs), Java sucks. I find it ironic that Sun produced a
language that runs ideally on Intel.

I wonder why he didn't use the new gcj GNU compiler to compile the Java,
so both the C++ and Java would run natively.

dayton

David Eng

Jun 19, 2004, 9:55:08 AM
ka...@gabi-soft.fr wrote in message news:<d6652001.04061...@posting.google.com>...

> gle...@pixelglow.com (Glen Low) wrote in message
> news:<9215d7ac.04061...@posting.google.com>...

> In the end, of course, performance isn't necessarily the key argument.
> For most of the work I do, safety is. And that is a criterion where C++
> is a clear winner; it is very, very difficult to write correct programs
> of any significant size in Java. (It's not really that easy in C++, but
> at least the language never actively prevents you from doing the right
> thing.)

You mentioned in several posts that Java does not scale well, but you
did not give an example. In which cases does Java not scale well?
What prevents Java from being used to write a large program? I don't
think you have to write ONE large program to accomplish a task. Instead,
you can write SMALL programs distributed together to accomplish the same
task. In this case, does Java still have a scalability problem?

Thanks for your insight.

David Eng

Jun 19, 2004, 9:56:54 AM
gle...@pixelglow.com (Glen Low) wrote in message news:<9215d7ac.04061...@posting.google.com>...
> Another Java vs C++ performance shootout:
>
> http://kano.net/javabench/
>
> Anyone care to comment about the appropriateness of his C++ code and
> compiler settings?
>

I use both languages daily and my experience is:

1. When implementing complex algorithms, C++ is far more elegant and
easier than Java.

2. When you need access to the OS, forget about Java.

3. Java provides a large runtime environment which C++ lacks.
So, it is easier to write some code in Java than in C++.

4. In general, C++ runs faster than Java. Forget about this
benchmark test. C++ and Java are like a plane and a car. No doubt, it is
much quicker to get to a Wal-Mart from home by car than by plane.

For small and lightweight processes, Java is preferred. For
mission-critical and heavy-duty processes, C++ is the best choice.

Robert Jan Schaper

Jun 19, 2004, 10:12:39 AM
ka...@gabi-soft.fr wrote:

> But it's an obvious truism, well known for quite a long time, that for
> some specific operations, Java is faster than C++, and that for others,
> C++ is faster than Java. Even with much older, less optimized Java
> implementations.

I've looked at two of his examples, ackermann and hash, and these illustrate
your point nicely. The first example, ackermann, does indeed seem much
faster in Java. A little googling told me that this is because C and C++
compilers cannot implement tail-call elimination as well as Java can. With
n = 12 the Java implementation is more than twice as fast on my Athlon
2.5 GHz / 256MB / Linux 2.4 / Java HotSpot 1.4.2 / GCC 3.3.1.


The hash example however is less impressive:

- I cannot reproduce the results the author claims. If I run his binaries,
Java is already losing by a factor of 5 with n = 100k, in both client and
server mode.

- With a better hash function and itoa() instead of sprintf you can further
speed up the C++ version (see the example code at the end). With n = 1M the
speedup is almost 50%. A further improvement would be the removal of the
costly second strdup() call, which seems unnecessary. But for some
mysterious reason this gives faulty results at some values of n. Perhaps
someone can explain what is happening here.

- The Java version can handle about n = 600k, but after that it throws an
out-of-memory exception. C++ can handle n = 4M without swapping. Rather
embarrassing.

#include <iostream>
#include <cstring>    // strcmp, strdup
#include <cstdlib>    // atoi
#include <hash_map.h>

using namespace std;

struct eqstr {
    bool operator()(const char* s1, const char* s2) const {
        return strcmp(s1, s2) == 0;
    }
};

// Note: returns a pointer into a static buffer, hence the strdup calls below.
char* itoa(int val, int base) {
    static char buf[32] = {0};
    int i = 30;
    for (; val && i; --i, val /= base)
        buf[i] = "0123456789abcdef"[val % base];
    return &buf[i + 1];
}

// suggested on slashdot
struct hashme {
    size_t operator()(const char* s) const {
        size_t i;
        for (i = 0; *s; s++)
            i = 31 * i + *s;
        return i;
    }
};

int main(int argc, char* argv[]) {
    int n = ((argc == 2) ? atoi(argv[1]) : 1);
    int c = 0;
    hash_map<const char*, int, hashme, eqstr> X;

    for (int i = 1; i <= n; i++)
        X[strdup(itoa(i, 16))] = i;

    // should work without strdup but doesn't for all values of n
    for (int i = n; i > 0; i--)
        if (X[strdup(itoa(i, 10))]) c++;

    cout << c << endl;
}


--
Rob

ka...@gabi-soft.fr

Jun 19, 2004, 10:26:23 AM
"Thorsten Ottosen" <nes...@cs.auc.dk> wrote in message
news:<40d110f1$0$5583$afc3...@news.optusnet.com.au>...

> "Hyman Rosen" <hyr...@mail.com> wrote in message
> news:10874032...@master.nyc.kbcfp.com...
> | Thorsten Ottosen wrote:

> | > 2. He's comparing two particular implementations, not two
> | > languages.

> | Well, that's a particularly meaningless remark.
> | Obviously, languages do not have a speed, only
> | implemented programs do!

> but if you want a fair picture you need to consider many
> implementations of both languages/libraries. Picking just two is
> hardly enough.

Yes and no. If I have to evaluate which language to use for a project,
and speed is that important, then I'll restrict my evaluation to the
implementations which are available, or which I can get through
purchasing. Here, that means Sun's JDK and VC++ for Windows, Sun CC or
g++ for Solaris. The fact that better implementations exist (for both
languages) is irrelevant to me if I can't use them.

In theory, all Turing complete languages are equal. A perfect compiler
will determine the actual semantics, and generate the best code for
those semantics. In practice, of course, nobody's ever written a
perfect compiler, and nobody is going to do so anytime soon. Different
languages make different things easier or harder for the optimizer.
C/C++'s built-in array semantics, with arrays decaying to pointers, is a
classical example of a language making this excessively difficult;
Java's lack of aliasing helps optimization considerably. On the other
hand, C/C++ allow the user to specify some optimizations (like placing
objects on the stack, rather than allocating them dynamically) that are
incredibly difficult for a compiler to find (in the general case).

Obviously, if you want to show Java faster, you make extensive use of
arrays, and avoid user defined types. If you want to show C++ faster,
just make extensive use of user defined types, either in short lived
arrays (so creation and deletion are more important than access) or as
local variables.

> | > 3. He's comparing compile-time optimizations with run-time
> | > optimizations.

> | He's comparing how fast this one goes with how fast that one goes. I
> | don't understand your distinction, nor why it would be meaningful to
> | someone who wants an implementation which runs quickly.

> that's true.

> | > 4. His test are *very* simple, not really useful for what he's
> | > trying to do.

> | He's trying to demonstrate that equivalent Java is faster than C++,

> I guess we have different opinions about the meaning of "equivalent".

The few bits of code I looked at were more or less equivalent. On the
other hand, of course, what he has shown is that for some programs,
equivalent Java is faster than C++. But that's something that any
reasonable person would have expected anyway, once the technology became
mature. C/C++'s aliasing and lack of garbage collection result in known
performance problems.

I don't know whether his choices were intentional, or simply an
unintentional result of trying to make things simple. It's simpler, in
very small programs, to just use built-in types, for example. And Java
is particularly efficient with built-in types.

> | and since that's what his tests showed, they're very useful :-)

> But the tests don't show that, because they are too simple. The tests
> show that runtime optimizations perform very well on overly simplified
> code.

It depends. From his tests, I think it safe to say that if you are
developing on an Intel based Linux box, and your typical applications
are between ten and twenty lines, Java will probably outperform C++ most
of the time. There are certainly other cases where Java will outperform
C++, but his tests don't indicate what they are.

> As an example, virtual machines can pseudo-inline virtual function
> calls by detecting really fast that we're dealing with the same
> object. That pays off when you're calling the same function on the
> same object 10^6 times.

Static analysis can also do this in certain cases. But it's harder, and
I'm pretty sure that g++ doesn't do it.
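What such a transformation looks like can be sketched directly in C++. This is an illustration of the idea only, not what any particular JIT or g++ actually emits; a real JIT tests the vtable pointer rather than using dynamic_cast, and the class names here are invented:

```cpp
#include <cassert>

struct Shape {
    virtual ~Shape() {}
    virtual int area() const = 0;
};

struct Square : Shape {
    int side;
    explicit Square(int s) : side(s) {}
    int area() const { return side * side; }
};

struct Circle : Shape {
    int r;
    explicit Circle(int radius) : r(radius) {}
    int area() const { return 3 * r * r; }  // crude pi, for illustration
};

// Guarded devirtualization: if profiling shows the receiver is almost
// always a Square, test for that one case and "inline" it, keeping the
// ordinary virtual call only as a fallback for everything else.
int guarded_area(const Shape& s) {
    if (const Square* sq = dynamic_cast<const Square*>(&s))
        return sq->side * sq->side;   // monomorphic fast path
    return s.area();                  // polymorphic fallback
}
```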

> However, it's been a long time since even parts of my programs would
> fit that description.

The same is true for me, but there are applications where this is the
case.

> The interesting question is how well the runtime optimizer works with
> normal code. And that is not answered by his tests.

The interesting question is what do you consider normal code:-). I
suspect that no two people will have perfectly identical answers.

--
James Kanze GABI Software
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34


Gabriel Dos Reis

unread,
Jun 19, 2004, 6:06:34 PM6/19/04
to
ka...@gabi-soft.fr writes:

| gle...@pixelglow.com (Glen Low) wrote in message
| news:<9215d7ac.04061...@posting.google.com>...
| > Another Java vs C++ performance shootout:
|
| > http://kano.net/javabench/
|
| > Anyone care to comment about the appropriateness of his C++ code and
| > compiler settings?
|
| When I'm benchmarking for speed, I use -O3 with g++, not -O2. That's
| definitely an error on his part.

But -O3 may be counter-productive in some cases: it tries to inline as
much as it can -- even functions you did not mark inline.

--
Gabriel Dos Reis
g...@cs.tamu.edu
Texas A&M University -- Computer Science Department
301, Bright Building -- College Station, TX 77843-3112

CFG

unread,
Jun 20, 2004, 6:18:06 AM6/20/04
to
If one wants to get real number crunching speed one has to use SIMD
instructions. In my experience many functions can be SIMD optimized to run
2-5X faster than "regular" code. This can be quite easily done from within
C++ because generally it allows access to all kinds of hardware capabilities
(e.g. one can use SIMD C intrinsics for Intel CPUs). I don't believe SIMD
will be supported by Java any time soon beyond the toy-example level.

Generally, C/C++ code translates quite straightforwardly to machine
instructions, which is not the case with Java if we use an interpreter.
Java can achieve comparable speed only if it uses a Just-in-Time (JIT)
compiler, which basically translates Java byte code to machine code.
And, theoretically, a JIT compiler can do some dynamic optimizations --
by doing so it can outperform statically optimized code (in principle,
that is). But C++ code can be compiled to Java byte code as well -- and
therefore it can also take advantage of dynamic optimizations. Thus,
comparing the C++ programming language with a JIT compiler is not an
apples-to-apples comparison.

In addition, one of the inherent performance problems of the Java
programming language is that it heavily uses heap memory allocations and
RTTI, both of which are expensive. Even translation to machine code will not
remove the penalties.

Severin Ecker

unread,
Jun 20, 2004, 2:59:18 PM6/20/04
to
hi!

> Just FYI, I've already taken four of the benchmarks there and optimized
> them.

just curious,.. would it be possible to have a look at your optimized code?

thx, regards,
sev

Greg Copeland

unread,
Jun 21, 2004, 3:06:52 PM6/21/04
to
On Sun, 20 Jun 2004 14:59:18 -0400, Severin Ecker wrote:

> hi!
>
>> Just FYI, I've already taken four of the benchmarks there and optimized
>> them.
>
> just curious,.. would it be possible to have a look at your optimized code?
>
> thx, regards,
> sev
>

Feel free to improve further. Admittedly, I was unsure how much of the
code should be changed (as in complete rewrite). So, where I could, I
attempted to keep the code as close to the original as possible. Of
course, some of this was simple laziness. ;) Just the same, without much
effort, beating Java's best corner-case optimizations has not turned out
to be very hard.

Many of the benchmarks seem to stress the heap. It's well known that heap
management under g++ is not very good, especially when many, many heap
allocations are required. Another strategy would be to try other heap
management libraries and strategies. I suspect there is plenty of head
room here.

hash.cpp
More or less taken from Slashdot. It looks like a good approach and,
bluntly, no better way to improve it occurred to me. Well, if you want
to come up with a better strdup, feel free to do so.
-----
#include <stdio.h>
#include <stdlib.h>   // atoi
#include <string.h>   // strcmp, strdup
#include <iostream>
#include <ext/hash_map>

using namespace std;
using namespace __gnu_cxx;

#if 0

struct eqstr {
    bool operator()(const char* s1, const char* s2) const {
        return strcmp(s1, s2) == 0;
    }
};

#else

struct eqstr {
    bool operator()(const char* s1, const char* s2) const {
        if (s1 == s2) return true;
        if (!s1 || !s2) return false;

        while (*s1 != '\0' && *s1 == *s2)
            s1++, s2++;

        return *s1 == *s2;
    }
};

#endif

struct hashme {
    size_t operator()(const char* s) const {
        size_t i;
        for (i = 0; *s; s++)
            i = 31 * i + *s;
        return i;
    }
};

int
main(int argc, char *argv[]) {
    int n = ((argc == 2) ? atoi(argv[1]) : 1);
    char buf[16];
    typedef hash_map<const char*, int, hashme, eqstr> HM;
    HM hash1, hash2;

    for (int i=0; i<10000; i++) {
        sprintf(buf, "foo_%d", i);
        hash1[strdup(buf)] = i;
    }
    for (int i=0; i<n; i++) {
        for (HM::iterator k = hash1.begin(); k != hash1.end(); ++k) {
            hash2[(*k).first] += k->second;
        }
    }
    cout << hash1["foo_1"] << " " << hash1["foo_9999"] << " "
         << hash2["foo_1"] << " " << hash2["foo_9999"] << endl;
}
----

wc.cpp
It appears the original code didn't even do everything it was supposed
to be doing. This code is more accurate and faster than the original.
It's worth noting that the input file may also affect performance, which
is something the original benchmark seems to ignore. If you create an
input file where only words, spaces and newlines are used, I suspect
this benchmark will do slightly better than my results.
----
// -*- mode: c++ -*-
// $Id: wc.g++,v 1.6 2001/09/18 16:49:46 doug Exp $
// http://www.bagley.org/~doug/shootout/
// with help from Tom Widmer

#include <ctype.h>
#include <iostream>

using namespace std;

enum {
    OUT, /* outside a word */
    IN   /* inside a word */
};

int
main(int argc, char *argv[]) {
    char c;
    int nl, nw, nc, inword;
    char buff[8192];
    cin.rdbuf()->pubsetbuf(buff, 8192); // enable buffering

    inword = OUT;
    nl = nw = nc = 0;
    int intc;
    streambuf* sbuf = cin.rdbuf();
    while ((intc = sbuf->sbumpc()) != EOF) {
        ++nc;
        c = (char)intc;
        switch( c ) {
        case 0x0a:
            ++nl ;
            // break purposely left out here
        case 0x20:
        case 0x0b:
        case 0x0c:
        case 0x0d:
        case 0x1c:
        case 0x1d:
        case 0x1e:
        case 0x1f:
            inword = OUT ;
            break ;
        default:
            if( inword == OUT ) {
                ++nw ;
                inword = IN ;
            }
            break ;
        }
    }
    cout << nl << " " << nw << " " << nc << endl;
}

----

objinst.cpp
This is an interesting one. You'll need to obtain the GC library from
http://www.hpl.hp.com/personal/Hans_Boehm/gc/gc_source/gc6.2.tar.gz This
at least makes an effort to place the two benchmarks (Java vs C++) in
the same book (they are still not on the same page) for comparison. As I
originally stated, I still plan to write my own heap implementation to
further boost performance here. Simply using the GC makes the C++
version about 20% faster than the Java run (at least for me). I did not
play any with the GC options. Nonetheless, I believe this supports that
the overhead is clearly in heap cleanup and general heap management,
which Java is able to reduce to hardly any overhead. That, and the fact
that Java has no destructors. Notice that the code is also slightly
different, which did help shave off just tidbits of time here and there.

I believe there is room for significant improvement here. I believe I can
reduce the 200,000,000 allocations, deletions and destructions to
something in the neighborhood of 200,000,000 allocations, and two or three
deletions.

----
// -*- mode: c++ -*-
// $Id: objinst.g++,v 1.10 2001/10/09 23:52:37 doug Exp $
// http://www.bagley.org/~doug/shootout/

#include <stdlib.h>
#include <iostream>

#include "gc_allocator.h"

// Redefine global operator new and delete to get garbage collected
// memory. This isn't necessary to use gc_alloc with STL containers.
// But a program that does all of its allocation through STL containers
// currently doesn't benefit much from garbage collection anyway.
#include "gc.h"

inline void * operator new(size_t n) { return GC_malloc(n); }
inline void operator delete(void *) {}
inline void * operator new[](size_t n) { return GC_malloc(n); }
inline void operator delete[](void *) {}

#define __GC

using namespace std;

class Toggle {
public:
    Toggle(bool start_state) : state(start_state) { }

    bool value() const {
        return(state);
    }

    Toggle& activate() {
        if( state == 1 )
            state = 0 ;
        else state = 1 ;
        return(*this);
    }

protected:
    bool state;
};

class NthToggle : public Toggle {
public:
    NthToggle(bool start_state, int max_counter) :
        Toggle(start_state), count_max(max_counter), counter(0) {
    }

    Toggle& activate() {
        if (++this->counter >= this->count_max) {
            if( state == 1 )
                state = 0 ;
            else state = 1 ;
            counter = 0;
        }
        return(*this);
    }

private:
    int count_max;
    int counter;
};

int
main(int argc, char *argv[]) {
    int n = ((argc == 2) ? atoi(argv[1]) : 1);

    Toggle toggle1(true) ;
    for (int i=0; i<5; i++) {
        cout << ((toggle1.activate().value()) ? "true" : "false") << endl;
    }
    for (int i=0; i<n; i++) {
        Toggle *toggle = new Toggle(true);
    }
    cout << endl;

    NthToggle ntoggle1(true, 3);
    for (int i=0; i<8; i++) {
        cout << ((ntoggle1.activate().value()) ? "true" : "false") << endl;
    }
    for (int i=0; i<n; i++) {
        NthToggle *ntoggle = new NthToggle(true, 3);
    }
    return 0;
}
---

strcat.cpp
This code was especially naughty. It purposely tries to waste as much
time as it can in reallocating string space. Since the objective is to
measure the duration required to concat a string over and over, it was
obvious what the intent was. Sure enough, the very simple changes
resulted in code which ran 29% - 200% faster, depending on the run
length. Worth noting, it's fairly easy to make the Java benchmark run
out of memory, even after adding more to the VM. In the 200% case, the
C++ version completed in 2.99s while the Java version ran out of memory
after 6+ seconds, having been given 380M, IIRC.
---
// -*- mode: c++ -*-
// $Id: strcat.g++,v 1.4 2001/07/11 02:06:39 doug Exp $
// http://www.bagley.org/~doug/shootout/
// with help from PeterB

#include <stdlib.h>   // atoi
#include <iostream>
#include <string>

using namespace std;

int main(int argc, char *argv[])
{
    int i, n = ((argc == 2) ? atoi(argv[1]) : 1);
    string str;
#if 1
    size_t capacity = 60*n;
#else
    size_t capacity = 31;
#endif
    str.reserve(capacity); // as per C-string
    size_t newLength = 6;
    for (i = 0; i < n; i++)
    {
        if(newLength > capacity)
        {
            capacity *= 2;
            str.reserve(capacity);
        }
        str += "hello\n";
        newLength += 6;
    }
    cout << str.length() << endl;
    return 0;
}
---

Again, feel free to further crank these up in performance. I would be
interested in any improvements you make. Best of all, feel free to submit
your improvements back to the shootout author.

Cheers!

ka...@gabi-soft.fr

unread,
Jun 21, 2004, 3:09:27 PM6/21/04
to
James Rogers <jimmaure...@att.net> wrote in message
news:<Xns950BE24739BA7...@204.127.36.1>...

> > But it's an obvious truism, well known for quite a long time, that
> > for some specific operations, Java is faster than C++, and that for
> > others, C++ is faster than Java. Even with much older, less
> > optimized Java implementations. Speed, per se, has always been a
> > weak point of C/C++, due to the way they don't) handle arrays.
> > Speed, per se, is also a weak point of Java, due to the way it
> > handles objects of user defined types. I expect that a good Ada
> > compiler (if such exist) would leave both languages far behind.

> I do not have the same system the owner of the web page has. Mine is
> probably much slower. That said, I ran his Java code for the Ackerman
> test with a parameter of 8 (giving a result of 2045). It took 2
> seconds to run, including the time needed to start up the JVM. An Ada
> example compiled with the GNAT compiler, which uses the GNU backend,
> took 0.17 seconds. This was run on my 3 year old laptop using Win98.

Well, I said that Ada would probably beat either of them:-).

Seriously, Ackermann is a very special case -- it's simple recursive
function calls that are very hard to inline. On the other hand, it ends
up calling itself with the smaller parameters very, very often; I
wouldn't be surprised if a runtime optimizer ended up recognizing this,
and replacing some of the function calls with something like:

switch ( x ) {
case 0 :
    return ...
case 1 :
    return ...
case 2 :
    return ...
default :
    return ack( ... ) ;
}

IMHO, this is one case where the runtime optimizer has a definite
advantage.

It is also a test in which none of the issues I mentioned play a role.
There are no arrays, with their aliasing problems, for C/C++, and there
are no user defined types, which cannot be allocated statically, for
Java.
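For reference, the two-argument form of the Ackermann function used in these tests is simply:

```cpp
#include <cassert>

// Two-argument Ackermann function, as used in the shootout; note the
// non-tail double recursion in the last line. ack(3, n) == 2^(n+3) - 3,
// so ack(3, 8) == 2045, the value reported in the test quoted above.
int ack(int m, int n) {
    if (m == 0) return n + 1;
    if (n == 0) return ack(m - 1, 1);
    return ack(m - 1, ack(m, n - 1));
}
```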

With regards to your tests, I've almost never seen Java get below 2
seconds of start-up time; I'd certainly try the tests with higher values
(so that, say, the Java code took about five minutes to run).

> I do not have a C++ compiler to compare timings. I would be surprised
> if any C++ compiler required anything near 2 seconds to run this test
> for properly constructed C++.

I suspect that for this test, the times for C++ should be about the same
as for Ada. As I said, no arrays decaying to pointers, no possible
aliasing...

--
James Kanze GABI Software

Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

ka...@gabi-soft.fr

unread,
Jun 21, 2004, 3:13:27 PM6/21/04
to
Michi-S...@web.de (Michael Schlenger) wrote in message
news:<40d30eeb...@news.individual.de>...

> On 17 Jun 2004 10:46:35 -0400, ka...@gabi-soft.fr wrote:

> >But it's an obvious truism, well known for quite a long time, that
> >for some specific operations, Java is faster than C++, and that for
> >others, C++ is faster than Java. Even with much older, less
> >optimized Java implementations. Speed, per se, has always been a
> >weak point of C/C++, due to the way they don't) handle arrays.

> This is not so obvious to me. I always thought that array access is
> fast in C++ since it only needs a simple offset calculation or pointer
> arithmetic to access an array element. No range checks, nothing.

If the compiler does no optimization, then C++ array accesses are about
the same as those in other languages, faster even than those in
languages requiring bounds checking. Once you start optimizing,
however, aliasing comes into play. The compiler cannot keep a value in
a register unless it can prove that there are no other paths by which it
can be accessed. And every pointer is a potential path. Thus, for
example, if I write something like:

    void
    smooth( double* dest, int count, double* source )
    {
        int destSize = count - 2 ;
        for ( int i = 0 ; i < destSize ; ++ i ) {
            dest[ i ] = (source[ i ] + source[ i + 1 ] + source[ i + 2 ]) / 3.0 ;
        }
    }

A good Java or Ada compiler will save the values read in source[ i + 1 ]
and source[ i + 2 ] each time through the loop, for use in place of
source[ i ] and source[ i + 1 ] the next time; a C/C++ compiler cannot
do this, since the write to dest[ i ] might modify source[ i + 1 ] or
source[ i + 2 ].
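For completeness, g++ provides a `__restrict` extension (C99 spells it `restrict`) with which the programmer can promise that the arrays do not overlap, making the Java/Ada-style reuse legal again. A sketch of the same function under that promise:

```cpp
#include <cassert>

// With the programmer-guaranteed promise that dest and source do not
// overlap, the compiler may keep source[i+1] and source[i+2] in
// registers across iterations, as a Java or Ada compiler can by
// default. __restrict is a g++ extension; behaviour is undefined if
// the promise is broken.
void smooth(double* __restrict dest, int count,
            const double* __restrict source)
{
    int destSize = count - 2;
    for (int i = 0; i < destSize; ++i) {
        dest[i] = (source[i] + source[i + 1] + source[i + 2]) / 3.0;
    }
}
```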

> [snip]

> >Another interesting test is object creation. Java looks to be about
> >three or four times faster than C++.

> Maybe with the general purpose allocator that is normally used
> (malloc/free). But one often can implement special allocators for
> special situations which then are much faster. E.g., well known is the
> scenario when we have a container of objects all the same size.

I agree that once the profiler starts showing bottlenecks, there are
solutions to avoid them. In specific cases. It's just more work.

--
James Kanze GABI Software

Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

ka...@gabi-soft.fr

unread,
Jun 21, 2004, 3:13:49 PM6/21/04
to
Robert Jan Schaper <sh...@bacon.demon.nl> wrote in message
news:<10d5cdk...@corp.supernews.com>...
> ka...@gabi-soft.fr wrote:

> > But it's an obvious truism, well known for quite a long time, that
> > for some specific operations, Java is faster than C++, and that for
> > others, C++ is faster than Java. Even with much older, less
> > optimized Java implementations.

> I've looked at two of his examples: ackermann and hash, and these
> illustrate your point nicely. The first example, the ackermann seems
> indeed much faster in Java. A little googling told me that this is
> because C and C++ compilers cannot implement tail-call elimination as
> well as Java. With n = 12 the Java implementation is more than twice
> as fast on my Athlon 2.5 GHz/ 256MB / Linux 2.4 / Java HotSpot 1.4.2 /
> GCC 3.3.1

More generally, a runtime optimizer can generally recognize that there
are a lot of calls with very small values and special case them. This
can save a lot in Ackermann -- and in some real code, but not most of
it.

> The hash example however is less impressive:

His hash tests tested the quality of the hash function used more than
anything else. As I mentioned in another posting, the default hash
function in the g++ 2.95.2 library (I've not looked since) will show
distinctive clustering near the head of the table for certain data
sets. If the data set in his tests happened to coincide with this, then
Java will beat C++ here.

The algorithm used in Java can also show this clustering, but it
requires a lot more strings, which are a lot smaller. Theoretically, to
avoid the clustering, your multiplier would have to be greater than
UCHAR_MAX; practically, the chance of encountering it with a multiplier
of 127 seems extremely small, and with 31 (what Java uses) it should
still only occur in extreme cases, whereas with values like 5 (what g++
uses), the risk is very real, ESPECIALLY in small test cases.

As an extreme example, imagine a table with all two character strings,
with the characters ranging from '\0' to '\377'. That's a total of 65536
different strings. The algorithm used by g++ will only generate hash
codes in the range of 0-1530 for such strings, however, and that used
by Java in the range of 0-8160. If we take a more realistic example,
there are 95 printable ASCII characters, for a total of 9025 different
two character strings. Here g++ generates codes in the range 0-756, and
Java 0-4032. For 3 character ASCII strings, there are 857375 possible
strings, 3906 different hash values for g++, and 121918 for Java. If
only 1% of the possible strings are actually present, then g++ will
still show significant clustering in the first half of the table, but
Java will come out OK.

How to hash strings efficiently is now fairly well known: FNV is in
widespread use, and I've mentioned my alternative based on Mersenne
primes more than once here. Given what is known, I don't quite
understand why they continue to use less than optimal algorithms.
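For illustration, a minimal 32-bit FNV-1a sketch. The constants are the published offset basis and prime; this assumes a 32-bit unsigned int (true of the platforms under discussion) and is not the code any particular library ships:

```cpp
#include <cassert>

// FNV-1a string hash, 32-bit variant. The large multiplier spreads
// even very short strings across the whole 32-bit range, avoiding the
// clustering described above for small multipliers like 5 or 31.
unsigned int fnv1a(const char* s) {
    unsigned int h = 2166136261u;      // FNV offset basis
    for (; *s; ++s) {
        h ^= (unsigned char)*s;
        h *= 16777619u;                // FNV prime
    }
    return h;
}
```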

Anyway, a quick sort written in Java will also beat a bubble sort
written in C++, if the data sets are large enough. Or vice versa.
Comparing the execution times of two different algorithms tells you
absolutely nothing about the languages.

> - I cannot reproduce the results the author claims. If I run his
> binaries Java is already losing with a factor 5 if n = 100k in both
> client and server mode.


> - With a better hash function and itoa() instead of sprintf you can
> further speedup the cpp version (see the example code at the
> end). With n = 1M the speedup is almost 50%. A further improvement
> would be the removal of the costly second strdup() call which seems
> unnecessary. But for some mysterious reason this gives faulty results
> at some values of n. Perhaps someone can explain what is happening
> here.

> - The Java version can handle about n = 600k but after that it throws
> an out of memory exception. C++ can handle n = 4M without
> swapping. Rather embarrassing.

I don't know if it is still the case, but back when I was using it,
Sun's JDK implementation artificially limited memory use. There was an
option to tune it. A quick check with java -help on my machine shows
no such option, but I don't know whether this is because it isn't used
under Solaris (I was working under Windows), because it has been
superseded in 1.4 (I used 1.2), or because they simply don't bother to
output all possible options.

As for your code, you're still comparing apples to oranges. I'd use
std::string. But at the very least, you shouldn't leak memory. And you
should check the results of strdup -- that could explain your core
dumps. (Of course, if you're under Linux, even a successful malloc can
cause a core dump when used.)

Other than that, the hash algorithm you use is the same as that of
Java. Modulo the fact that in C++, we can implement it correctly, using
unsigned modulo arithmetic, whereas in Java, they have to fiddle with
signed integers.

--
James Kanze GABI Software

Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

ka...@gabi-soft.fr

unread,
Jun 21, 2004, 3:14:32 PM6/21/04
to
davide...@yahoo.com (David Eng) wrote in message
news:<6b74193f.04061...@posting.google.com>...

> ka...@gabi-soft.fr wrote in message
> news:<d6652001.04061...@posting.google.com>...
> > gle...@pixelglow.com (Glen Low) wrote in message
> > news:<9215d7ac.04061...@posting.google.com>...

> > In the end, of course, performance isn't necessarily the key
> > argument. For most of the work I do, safety is. And that is a
> > criteria where C++ is a clear winner; it is very, very difficult to
> > write correct programs of any significant size in Java. (It's not
> > really that easy in C++, but at least the language never actively
> > prevents you from doing the right thing.)

> You mentioned in several posts that Java does not scale well but you
> did not give an example. In which case, does Java not scale well?

The biggest problem is that you can't normally keep the interface
specification (the class definition, in C++ terms) separate from the
implementation (the function definitions, in C++ terms). One of C++'s
known weak points with regards to large applications is the fact that
you can't separate the private members of the class from the rest of the
definition; change a private data member, and all users have to
recompile. Worse, in order to change a private data member, you need
write access to the .h file -- in most large projects, a typical
developer can't just go in and edit a .h file as he sees fit. So we
develop things like the compilation firewall idiom to work around this.

Java, of course, requires not only private data members to be present in
the class definition, but that every single implementation be present
AND compilable. It makes managing (human) dependencies almost
impossible.

The other thing I've found very useful in large scale developments is
programming by contract. Again, there's no direct support either in
Java or in C++. But in C++, we've more or less developed the idiom that
all virtual functions in most application level classes are private,
called by non virtual functions in the public section, which enforce
and/or verify the contract, and instrument the code in other ways. In
Java, you can't declare a private function virtual, and if you want the
slightest bit of code in the public functions (which can be declared
final), then you can't use an interface, only an abstract class. Which
in turn means no multiple inheritance, and all sorts of work-arounds to
simulate it.
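A rough sketch of the idiom described above -- public non-virtual functions enforcing the contract around private virtuals. The class names are invented for illustration:

```cpp
#include <cassert>

// Non-virtual interface idiom: the public function checks the contract
// once, in one place; derived classes only supply the private virtual.
class Stack {
public:
    int pop() {
        assert(size() > 0);    // precondition enforced for every override
        return doPop();
    }
    virtual int size() const = 0;
    virtual ~Stack() {}
private:
    virtual int doPop() = 0;   // overridden, but never called directly
};

// Toy one-element implementation, just to exercise the contract check.
class OneStack : public Stack {
public:
    OneStack() : full(true) {}
    int size() const { return full ? 1 : 0; }
private:
    int doPop() { full = false; return 42; }
    bool full;
};
```

In Java, pop() could be declared final, but then Stack must be an abstract class rather than an interface -- which is exactly the multiple-inheritance restriction complained of above.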

When I was actively using Java, the lack of static type checking in
containers was also a problem. I've heard that they've fixed this since
then.

> What prevent Java to write a large program? I don't think you have to
> write ONE large program to accomplish a task. Instead, you can write
> SMALL programs distributed together to accomplish the same task. In
> this case, does Java still have a scalable problem?

You still need an implementation somewhere, and the human dependency
problem raises its head. And in distributed systems, the programming by
contract idiom becomes even more important.

--
James Kanze GABI Software

Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

Allan W

unread,
Jun 22, 2004, 6:11:18 AM6/22/04
to
James Rogers <jimmaure...@att.net> wrote

> I do not have the same system the owner of the web page has. Mine is
> probably much slower. That said, I ran his Java code for the Ackerman
> test with a parameter of 8 (giving a result of 2045). It took 2 seconds
> to run, including the time needed to start up the JVM. An Ada example
> compiled with the GNAT compiler, which uses the GNU backend, took
> 0.17 seconds. This was run on my 3 year old laptop using Win98.
>
> I do not have a C++ compiler to compare timings. I would be surprised
> if any C++ compiler required anything near 2 seconds to run this test
> for properly constructed C++.

VC++ 6.0 Enterprise Edition, no service patches (we'll be on SP6 next week)

It took 0.034 seconds (that is, it took 34 seconds to run it 1000 times).
The code was slightly modified to gather statistics: the maximum recursion
depth was 2047, the total number of calls was 2,785,999.

Jacking up the stack space to 4gb, I was able to run it with a parameter
of 12 (giving a result of 32765). Despite thrashing (I have only 1.5gb of
physical memory), it ran in 8.3 seconds (10 runs took 83 seconds).
The maximum recursion depth was 32767, total calls 715,664,091.

I doubt the processor had much to do with it: HP workstation xw6000,
Windows XP Pro 2002 SP1, dual Xeon 2.8ghz, 1.5Gb RAM.

I'm certain I could do better if I removed recursion...

CFG

unread,
Jun 22, 2004, 6:13:52 AM6/22/04
to

> void
> smooth( double[] dest, int count, double[] source )
> {
> int destSize = count - 2 ;
> for ( int i = 0 ; i < destSize ; ++ i ) {
> dest[ i ] = (source[ i ] + source[ i + 1 ] + source[ i + 2 ]) / 3.0 ;
> }
> }
>
> A good Java or Ada compiler will save the values read in source[ i + 1 ]
> and source[ i + 2 ] each time through the loop, for use in the place of
> source[ i ] and source[ i + 1 ] the next time ; a C/C++ compiler cannot
> do this, since the write to dest[ i ] might modify source[ i + 1 ] or
> source[ i + 2 ].

A C++ compiler can perform a runtime check here:
if (dest + count <= source || source + count <= dest)
{
    .... // do it fast way. no aliasing
}
else
{
    // do it slow way - possible aliasing detected
}

Some compilers such as Intel C++ support special keywords and/or pragmas to
circumvent this problem. For example,
void smooth( double *restrict dest, int count, double *restrict source)

Motti Lanzkron

unread,
Jun 22, 2004, 1:47:38 PM6/22/04
to
Allan W wrote:
>James Rogers <jimmaure...@att.net> wrote
> > I do not have the same system the owner of the web page has. Mine is
> > probably much slower. That said, I ran his Java code for the Ackerman
> > test with a parameter of 8 (giving a result of 2045).
...

>It took 0.034 seconds (that is, it took 34 seconds to run it 1000 times).
>The code was slightly modified to gather statistics: the maximum recursion
>depth was 2047, the total number of calls was 2,785,999.
...

>I'm certain I could do better if I removed recursion...

Good luck with that.

I seem to recall learning that Ackerman can't be computed without
recursion.

Sergey P. Derevyago

unread,
Jun 22, 2004, 2:05:16 PM6/22/04
to
ka...@gabi-soft.fr wrote:
> > You mentioned in several posts that Java does not scale well but you
> > did not give an example. In which case, does Java not scale well?
>
> The biggest problem is that you can't normally keep the interface
> specification (the class definition, in C++ terms) separate from the
> implementation (the function definitions, in C++ terms).
>
IMHO it's not a problem: Java does have interfaces.
And the compilation firewall idiom is a bad C++ solution for the
interface/implementation problem: abstract base classes together with smart
pointers do better:

// interface SomeInterface
class SomeInterface {
public:
    // exposed methods
    virtual void f()=0;
    virtual int g()=0;

    // required virtual destructor
    virtual ~SomeInterface() {}

private:
    // forbidden assignment
    SomeInterface& operator=(const SomeInterface&);
};

// object factory namespace
namespace Factory {

    // creates new object via operator new
    // and returns it using (shared) smart pointer
    sh_ptr<SomeInterface> newSomeInterface(/* parameters */);

}

void h()
{
    // create an object that implements SomeInterface
    sh_ptr<SomeInterface> a=Factory::newSomeInterface(/* your arguments */);

    // use the object
    a->f();
    int i=a->g();

    // and destructor of sh_ptr<> will delete the object on the block exit
}

> One of C++'s
> known weak points with regards to large applications is the fact that
> you can't separate the private members of the class from the rest of the
> definition; change a private data member, and all users have to
> recompile.

IMHO implementation details must be hidden behind interfaces. The private
data members that are a subject to change must not be exposed to end user.
--
With all respect, Sergey. http://ders.angen.net/
mailto : ders at skeptik.net

Thorsten Ottosen

unread,
Jun 23, 2004, 5:46:28 AM6/23/04
to
"Motti Lanzkron" <mlan...@yahoo.co.uk> wrote in message news:5p3gd0hct790rmb1e...@4ax.com...

| Allan W wrote:
| >James Rogers <jimmaure...@att.net> wrote

| >I'm certain I could do better if I removed recursion...


|
| Good luck with that.
|
| I seem to recall learning that Ackerman can't be computed without
| recursion.

You only need either recursion or iteration (plus selection and sequence) to get a Turing-complete language.

br

Thorsten

David B. Held

unread,
Jun 23, 2004, 5:55:09 AM6/23/04
to
"Motti Lanzkron" <mlan...@yahoo.co.uk> wrote in message
news:5p3gd0hct790rmb1e...@4ax.com...
> [...]

> I seem to recall learning that Ackerman can't be computed
> without recursion.

I tried to verify or refute that, and found that there are several
variations on Ackermann's function. Nonetheless, the function
has several interesting properties. Allegedly, it is the fastest
growing recursive function of two arguments that is limited to
recursive calls and the successor function. It is also one of
the simplest total recursive functions that is not primitive
recursive. Oddly enough, nobody seems to say anything about
iterative implementations of the function. However, the Wikipedia
does make this interesting statement:

"It may be noted that A(4, 2), which appears as a decimal
expansion in several web pages, cannot possibly be
computed by recursive application of the Ackermann
function."

I can only assume this means "without optimization", since it also
provides this link:

http://www.kosara.net/thoughts/ackermann42.html

which was purportedly "calculated with a simple LISP programme
on CLisp." I assume the CLisp interpreter/compiler did not
entirely convert the algorithm to an iterative form, so it seems
that it either used optimization tricks to solve it recursively, or
the author wrote the implementation iteratively. At any rate, I have
only this argument to make for an iterative definition:

Since the Ackermann function is recursive, it is computable.
Since it is computable, there exists a Turing machine, T
which computes it.
Since the operation of a Turing machine can be defined
iteratively, it follows that there is an iterative description
of T, and thus of the Ackermann function.

Note that my proof implies that any computable function can be
implemented iteratively. The interesting question is what such
an implementation would look like, and I'm afraid that I can't
provide one. There is some tail recursion in the algorithm
which could be eliminated by hand without too much difficulty.
However, it is the double-recursive nature of the function that
makes it particularly difficult to translate to an iterative form.

It seems to me that the Ackermann function is not a particularly
appropriate test for imperative languages like C++, since most
programmers will tend to eliminate the tail recursion by hand
when performance matters, and there are probably very few
real-world applications that involve the type of double-recursion
found in Ackermann.

Perhaps the more interesting question from the shootout is
how difficult it would be to implement a JIT compiler for C++,
and whether vendors should put effort into such a thing.

Dave

---
Outgoing mail is certified Virus Free.
Checked by AVG anti-virus system (http://www.grisoft.com).
Version: 6.0.701 / Virus Database: 458 - Release Date: 6/7/2004

ka...@gabi-soft.fr

unread,
Jun 23, 2004, 6:37:09 PM6/23/04
to
"CFG" <c...@fg.com> wrote in message
news:<10878506...@news.pubnix.net>...

> > void
> > smooth( double[] dest, int count, double[] source )
> > {
> > int destSize = count - 2 ;
> > for ( int i = 0 ; i < destSize ; ++ i ) {
> > dest[ i ] = (source[ i ] + source[ i + 1 ] + source[ i + 2 ])
> / 3.0 ;
> > }
> > }

> > A good Java or Ada compiler will save the values read in source[ i
> > + 1 ] and source[ i + 2 ] each time through the loop, for use in
> > the place of source[ i ] and source[ i + 1 ] the next time ; a
> > C/C++ compiler cannot do this, since the write to dest[ i ] might
> > modify source[ i + 1 ] or source[ i + 2 ].

> C++ compiler can perform runtime check here:
> if (dest + count <= source || source + count <= dest)
> {
> .... // do it fast way. no aliasing
> }
> else
> {
> // do it slow way - possible aliasing detected
> }

A C++ compiler could track pointers and determine that there was no aliasing,
too. It's a lot more work, however, and most current compilers aren't
anywhere near that level.

The present situation is that such aliasing is a problem, in practice,
and that Java doesn't have it.

> Some compilers such as Intel C++ support special keywords and/or
> pragmas to circumvent this problem. For example,
> void smooth( double *restrict dest, int count, double *restrict source)

That's C, and it introduces some other problems. With the above
declaration, for example, calling smooth with the same array for both
dest and source is undefined behavior; it's legal in Java and in C++ or
C without the restrict. And of course, using restrict is basically a
case of the author of the function making a promise to the compiler
about what the client code will do. It's yet another source of
undefined behavior.

--
James Kanze GABI Software
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

[ See http://www.gotw.ca/resources/clcm.htm for info about ]

CFG

unread,
Jun 23, 2004, 6:47:11 PM6/23/04
to
>Perhaps the more interesting question from the shootout is
>how difficult it would be to implement a JIT compiler for C++,
>and whether vendors should put effort into such a thing.

Microsoft is up to it.

The .NET platform supports what they call "managed C++". At least a subset of
C++ can already be tested with Microsoft's .NET JIT compiler.

Gerhard Menzl

unread,
Jun 24, 2004, 7:32:09 AM6/24/04
to
CFG wrote:

>>Perhaps the more interesting question from the shootout is
>>how difficult it would be to implement a JIT compiler for C++,
>>and whether vendors should put effort into such a thing.
>
>
> Microsoft is up to it.
>
> The .NET platform supports what they call "managed C++". At least a subset of
> C++ can already be tested with Microsoft's .NET JIT compiler.

I would not call Managed C++ a subset of C++. It is the result of
forcing a single, rather restrictive programming paradigm (read: Java)
on to a language with a conflicting paradigm, and it is non-conforming
to the C++ Standard to the extent that it cannot be classified as a mere
extension anymore.

It will be interesting to see whether Herb Sutter and Stan Lippman will
be able to square the circle in the much awaited Whidbey release.

--
Gerhard Menzl

Humans may reply by replacing the obviously faked part of my e-mail
address with "kapsch".

ka...@gabi-soft.fr

unread,
Jun 24, 2004, 7:34:29 PM6/24/04
to
"David B. Held" <dh...@codelogicconsulting.com> wrote in message
news:<cbahi4$1ml$1...@news.astound.net>...

> "Motti Lanzkron" <mlan...@yahoo.co.uk> wrote in message
> news:5p3gd0hct790rmb1e...@4ax.com...
> > [...]
> > I seem to recall learning that Ackerman can't be computed
> > without recursion.

[...]


> At any rate, I have
> only this argument to make for an iterative definition:

> Since the Ackermann function is recursive, it is computable.
> Since it is computable, there exists a Turing machine, T
> which computes it.
> Since the operation of a Turing machine can be defined
> iteratively, it follows that there is an iterative description
> of T, and thus of the Ackermann function.

> Note that my proof implies that any computable function can be
> implemented iteratively.

Which is trivially true. All a recursive function call does is save the
current state, and start with a reinitialized "current state"; the
original state is restored on returning from the function. Create a
data structure with the state, put it in a stack, and manage the push's
and the pop's manually, and you implement it without recursion.

> The interesting question is what such an implementation would look
> like,

It would have an explicitly managed stack.

> and I'm afraid that I can't provide one. There is some tail
> recursion in the algorithm which could be eliminated by hand without
> too much difficulty. However, it is the double-recursive nature of
> the function that makes it particularly difficult to translate to an
> iterative form.

> It seems to me that the Ackermann function is not a particularly
> appropriate test for imperative languages like C++, since most
> programmers will tend to eliminate the tail recursion by hand when
> performance matters, and there are probably very few real-world
> applications that involve the type of double-recursion found in
> Ackermann.

The reason you often see Ackermann in benchmarks is that it is
considered a good way of measuring the cost of a function call. Because
of the double recursion, it is very hard, if not impossible, for a
compiler to do any tail recursion elimination, and there isn't much
overhead in it other than the function calls.

If you're any good at assembler, write it once passing the parameters in
registers, and a second time passing them on the stack, and compare the
results:-).

Given that when performance matters and function call overhead is an
issue, most C++ programmers will inline the functions, I'm not sure that
it is a relevant benchmark for C++.

> Perhaps the more interesting question from the shootout is how
> difficult it would be to implement a JIT compiler for C++, and whether
> vendors should put effort into such a thing.

I don't know whether JIT is the only answer, but improved optimization
is always welcome. In a correct, conforming compiler, of course --
implementing the standard (including export) comes first.

--
James Kanze GABI Software
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

[ See http://www.gotw.ca/resources/clcm.htm for info about ]

CFG

unread,
Jun 24, 2004, 7:39:44 PM6/24/04
to

> [...] It is the result of

> forcing a single, rather restrictive programming paradigm (read: Java)
> on to a language with a conflicting paradigm, and it is non-conforming
> to the C++ Standard to the extent that it cannot be classified as a mere
> extension anymore.

This might be true. But still I guess that the C-style functions in this
test case, like the hash function or Ackermann's function, can be compiled by
Managed C++ (MC++) without any significant changes. It remains to be seen what
should be used as a basis for the MC++ code: the C++ version or the Java
version of the original test case.


Glen Low

unread,
Jun 25, 2004, 8:55:35 AM6/25/04
to
> > Some compilers such as Intel C++ support special keywords and/or
> > pragmas to circumvent this problem. For example,
> > void smooth( double *restrict dest, int count, double *restrict source)
>
> That's C, and it introduces some other problems. With the above
> declaration, for example, calling smooth with the same array for both
> dest and source is undefined behavior; it's legal in Java and in C++ or
> C without the restrict. And of course, using restrict is basically a
> case of the author of the function making a promiss to the compiler
> about what the client code will do. It's yet another source of
> undefined behavior.

Ay, that's the rub. There are typically 3 cases for arrays A1 and A2:

1. A1 is the same as A2.
2. A1 is different from A2, but they overlap.
3. A1 does not overlap A2 at all.

In Java, 1 and 3 apply.

In C and C++, 1, 2 and 3 apply.

If either A1 or A2 is restricted, then only 3 applies.

I would suppose a smart JVM could do a quick test for 1 or 3 and apply
the appropriate alias-friendly or alias-defensive code. Does the
standard server JVM in fact do this? If not, it faces the same problem
when dest and source are the same.

Cheers,
Glen Low, Pixelglow Software
www.pixelglow.com

CFG

unread,
Jun 25, 2004, 9:02:53 AM6/25/04
to
If it matters one can always eliminate undefined behaviour as follows:

void smoothRestricted_Fast( double *restrict dest, int count,
                            double *restrict source );
void smoothGeneric_Slow( double *dest, int count, double *source );

void smooth( double *dest, int count, double *source )
{
    if (isRestrictedApplicable( dest, count, source ))
        smoothRestricted_Fast( dest, count, source );
    else
        smoothGeneric_Slow( dest, count, source );
}

Similar runtime checks are unavoidable anyway. Think about using SIMD
instructions. SIMD might impose a stricter alignment requirement than
the one inherent to a given type. For example, using SSE requires 16-byte
alignment vs. the regular 4-byte alignment for "float".

In addition, don't forget that if you are forced to use Array objects you have
another problem: how to work with sub-regions inside the array (a.k.a.
regions of interest - ROI for short).

If you don't allow aliasing then you have to copy ROIs (since ROIs can be
overlapping), which can be expensive. I'm just trying to say that there
is no free cure for the aliasing problem.

Ben Hutchings

unread,
Jun 25, 2004, 9:18:17 PM6/25/04
to
Gerhard Menzl wrote:
> CFG wrote:
>
>>>Perhaps the more interesting question from the shootout is
>>>how difficult it would be to implement a JIT compiler for C++,
>>>and whether vendors should put effort into such a thing.
>>
>>
>> Microsoft is up to it.
>>
>> The .NET platform supports what they call "managed C++". At least a subset of
>> C++ can already be tested with Microsoft's .NET JIT compiler.
>
> I would not call Managed C++ a subset of C++. It is the result of
> forcing a single, rather restrictive programming paradigm (read: Java)
> on to a language with a conflicting paradigm, and it is non-conforming
> to the C++ Standard to the extent that it cannot be classified as a mere
> extension anymore.

As I understand it, VC++ is no less conformant to the standard when
compiling to managed code (IL) than when compiling to regular
unmanaged code. The restrictions only come into play if you want to
interoperate with other .NET components, when you need to use "managed
data" i.e. objects in a precisely garbage-collected heap.

> It will be interesting to see whether Herb Sutter and Stan Lippman will
> be able to square the circle in the much awaited Whidbey release.

AIUI the successor to Managed C++, C++/CLI, improves the syntax and
semantics of managed data declaration and use but it does not fix
standard conformance issues because they do not exist. (That's not
to say that VC++ is perfectly compliant, but that its deviations from
the standard apply equally to managed and unmanaged code.)

Gerhard Menzl

unread,
Jun 28, 2004, 8:23:05 AM6/28/04
to
Ben Hutchings wrote:

> As I understand it, VC++ is no less conformant to the standard when
> compiling to managed code (IL) than when compiling to regular
> unmanaged code. The restrictions only come into play if you want to
> interoperate with other .NET components, when you need to use "managed
> data" i.e. objects in a precisely garbage-collected heap.

I was referring to Managed C++, not the Visual C++ compiler. Of course
you don't get any of the conformance problems that come with the managed
extensions if you don't use them. But the language you end up with if
you do (and you have to if you target the .NET platform, for example)
resembles C++ remotely at best.

--
Gerhard Menzl

Humans may reply by replacing the obviously faked part of my e-mail
address with "kapsch".

[ See http://www.gotw.ca/resources/clcm.htm for info about ]

Allan W

unread,
Jun 30, 2004, 8:29:45 AM6/30/04
to
> Ben Hutchings wrote:
> > As I understand it, VC++ is no less conformant to the standard when
> > compiling to managed code (IL) than when compiling to regular
> > unmanaged code. The restrictions only come into play if you want to
> > interoperate with other .NET components, when you need to use "managed
> > data" i.e. objects in a precisely garbage-collected heap.

Gerhard Menzl <gerhar...@spambucket.net> wrote


> I was referring to Managed C++, not the Visual C++ compiler. Of course
> you don't get any of the conformance problems that come with the managed
> extensions if you don't use them. But the language you end up with if
> you do (and you have to if you target the .NET platform, for example)
> resembles C++ remotely at best.

Why is this a problem?

Target the native Intel platform instead of .NET.
Problem solved?

Gerhard Menzl

unread,
Jul 1, 2004, 8:35:56 AM7/1/04
to
Allan W wrote:

> > I was referring to Managed C++, not the Visual C++ compiler. Of course
> > you don't get any of the conformance problems that come with the managed
> > extensions if you don't use them. But the language you end up with if
> > you do (and you have to if you target the .NET platform, for example)
> > resembles C++ remotely at best.
>
> Why is this a problem?
>
> Target the native Intel platform instead of .NET.
> Problem solved?

No.

It is a personal problem because I am not in a position to choose my
target platform at whim. More importantly, I consider it a general
problem because the potential market for .NET applications is huge, and
I fear that the Java-like programming paradigm that comes with it will
become dominant and undermine the progress brought about by C++.

--
Gerhard Menzl

Humans may reply by replacing the obviously faked part of my e-mail
address with "kapsch".

[ See http://www.gotw.ca/resources/clcm.htm for info about ]

Reply all
Reply to author
Forward
0 new messages