[groovy-user] Havlak Benchmark applied to Groovy 2.1: Nice results but one tear-drop

11 views
Skip to first unread message

OlliP

unread,
Mar 10, 2013, 11:40:45 AM3/10/13
to us...@groovy.codehaus.org
Hello,

I ported the Java code from Robert Hundt's implementation of the Havlak
algorithm to Groovy 2.1: https://github.com/oplohmann/havlak-jvm-languages

Then I ran the code on my Core2 Duo CPU E8400 3.00 GHz machine to see how
Groovy 2.1.1 is doing compared to Java using JDK1.7_0.6 with @CompileStatic
and -indy and without:

Java: 52070 ms

Groovy with @CompileStatic without indy: 62309 ms
Groovy with @CompileStatic with indy: 59814 ms

dynamic Groovy without indy: 84566 ms

Now comes the strange thing:

dynamic Groovy >with< indy: >169906 ms<

Does someone have a clue what is happening here? To make sure you understand
I enabled indy corrrectly, here is what I did:

* remove the groovy*.jars from groovy-2.1.1/lib
* copy the groovy*indy.jars from groovy-2.1.1/indy to groovy-2.1.1/lib
* in groovy-2.1.1/lib rename groovy*indy.jars to groovy*.jars
* run groovy with the -indy switch

Any hints appreciated.
Regards, Oliver



--
View this message in context: http://groovy.329449.n5.nabble.com/Havlak-Benchmark-applied-to-Groovy-2-1-Nice-results-but-one-tear-drop-tp5713749.html
Sent from the groovy - user mailing list archive at Nabble.com.

---------------------------------------------------------------------
To unsubscribe from this list, please visit:

http://xircles.codehaus.org/manage_email


Ondřej Čada

unread,
Mar 10, 2013, 6:29:05 PM3/10/13
to us...@groovy.codehaus.org
Oliver,

On Mar 10, 2013, at 4:40 PM, OlliP wrote:

> Java: 52070 ms
> Groovy with @CompileStatic without indy: 62309 ms
> Groovy with @CompileStatic with indy: 59814 ms
> dynamic Groovy without indy: 84566 ms

That's *considerably* better than I would presume (even though I'm using fully-dynamic Groovy for my server applications). Truly big kudos to Jochen and all others who work on the runtime!

> Now comes the strange thing:
> dynamic Groovy >with< indy: >169906 ms<
>
> Does someone have a clue what is happening here?

I can't be sure, but quite probably the culprit would be that indy does not make the primitive optimalizations -- see Jochen's messages on this topic, e.g.,

http://groovy.markmail.org/message/q45mzzw7xftfgruk

All the best,
---
Ondra Čada
OCSoftware: o...@ocs.cz http://www.ocs.cz
private on...@ocs.cz http://www.ocs.cz/oc

Jochen Theodorou

unread,
Mar 11, 2013, 5:09:46 AM3/11/13
to us...@groovy.codehaus.org
Am 10.03.2013 16:40, schrieb OlliP:
> Hello,
>
> I ported the Java code from Robert Hundt's implementation of the Havlak
> algorithm to Groovy 2.1: https://github.com/oplohmann/havlak-jvm-languages
>
> Then I ran the code on my Core2 Duo CPU E8400 3.00 GHz machine to see how
> Groovy 2.1.1 is doing compared to Java using JDK1.7_0.6 with @CompileStatic
> and -indy and without:
>
> Java: 52070 ms
>
> Groovy with @CompileStatic without indy: 62309 ms
> Groovy with @CompileStatic with indy: 59814 ms

looks like there are still some relevant dynamic parts in the static
compiled code, or the static compiled code uses some inefficient entry
points. Anyway, I think there is some potential for the static compiler
here.

> dynamic Groovy without indy: 84566 ms
>
> Now comes the strange thing:
>
> dynamic Groovy >with< indy: >169906 ms<

hmm... the indy version is for me within the expected margin. The
question is more why the version without indy is so much faster. Let's
see what direct method calls I can find... CFG#getNumNodes,
BasicBlock#getName, BasicBlock#inEdges, BasicBlock#outEdges,
SimpleLoop#setIsRoot

I think some of them are called really often.

> Does someone have a clue what is happening here? To make sure you understand
> I enabled indy corrrectly, here is what I did:
>
> * remove the groovy*.jars from groovy-2.1.1/lib
> * copy the groovy*indy.jars from groovy-2.1.1/indy to groovy-2.1.1/lib
> * in groovy-2.1.1/lib rename groovy*indy.jars to groovy*.jars
> * run groovy with the -indy switch

I could not start it, since there is a StackOverflow in
HavlakLoopFinder#doDFS, after "Performing Loop Recognition, 1 Iteration"

Otherwise I could have had a look at the code and search for
optimization potential.

If you want to check the non primitive optimizations version, then use
"groovy --disableopt=all" It would allow us to see if it is really the
primopts that are doing that here.

bye blackdrag
--
Jochen "blackdrag" Theodorou - Groovy Project Tech Lead
blog: http://blackdragsview.blogspot.com/
german groovy discussion newsgroup: de.comp.lang.misc
For Groovy programming sources visit http://groovy-lang.org

OlliP

unread,
Mar 11, 2013, 5:21:02 AM3/11/13
to us...@groovy.codehaus.org

> hmm... the indy version is for me within the expected margin. The
> question is more why the version without indy is so much faster. Let's
> see what direct method calls I can find... CFG#getNumNodes,
> BasicBlock#getName, BasicBlock#inEdges, BasicBlock#outEdges,
> SimpleLoop#setIsRoot

SimpleLoop#setIsRoot is only used to print out the statistics at the end of
the run. Just for information.


> I could not start it, since there is a StackOverflow in
> HavlakLoopFinder#doDFS, after "Performing Loop Recognition, 1 Iteration"

Oops. Did you run LoopTesterApp.groovy with JAVA_OPTS=-server -Xss15500k
-Xmx1G? That should do it.


> If you want to check the non primitive optimizations version, then use
> "groovy --disableopt=all" It would allow us to see if it is really the
> primopts that are doing that here.

I will try that and report back. Thanks for the reply ;-).

-- Oliver



--
View this message in context: http://groovy.329449.n5.nabble.com/Havlak-Benchmark-applied-to-Groovy-2-1-Nice-results-but-one-tear-drop-tp5713749p5713755.html
Sent from the groovy - user mailing list archive at Nabble.com.

Jochen Theodorou

unread,
Mar 11, 2013, 7:10:18 AM3/11/13
to us...@groovy.codehaus.org
Am 11.03.2013 10:21, schrieb OlliP:
>
>> hmm... the indy version is for me within the expected margin. The
>> question is more why the version without indy is so much faster. Let's
>> see what direct method calls I can find... CFG#getNumNodes,
>> BasicBlock#getName, BasicBlock#inEdges, BasicBlock#outEdges,
>> SimpleLoop#setIsRoot
>
> SimpleLoop#setIsRoot is only used to print out the statistics at the end of
> the run. Just for information.

as long as at least one of the other methods is called in each iteration
it can give a hint

>> I could not start it, since there is a StackOverflow in
>> HavlakLoopFinder#doDFS, after "Performing Loop Recognition, 1 Iteration"
>
> Oops. Did you run LoopTesterApp.groovy with JAVA_OPTS=-server -Xss15500k
> -Xmx1G? That should do it.

-server is not doing anything different for my since I am on 64bit, but
the others are required it seems.. that makes it run (java build
1.8.0-ea-b73):

primopts version (-Xss15500k -Xmx1G):

> Time: 115668 ms
>
> Total Memory: 943040 KB
> # of loops: 121002 (including 1 artificial root node)
> # of BBs : 252013
> # max time: 3407
> # min time: 0

indy version (-XX:-TieredCompilation -Xss15500k -Xmx1G):

> Time: 130511 ms
>
> Total Memory: 942080 KB
> # of loops: 121002 (including 1 artificial root node)
> # of BBs : 252013
> # max time: 4575
> # min time: 0

using jdk1.7.11 (and same settings as above):

primopts:
> Time: 111940 ms
>
> Total Memory: 936128 KB
> # of loops: 121002 (including 1 artificial root node)
> # of BBs : 252013
> # max time: 3082
> # min time: 0

indy:
> Time: 196210 ms
>
> Total Memory: 937152 KB
> # of loops: 121002 (including 1 artificial root node)
> # of BBs : 252013
> # max time: 5043
> # min time: 0

that's only to show the significant influence of the JDK.

Furthermore, I see that the time is the total time. The question is what
do you want to test with this using the total time? indy arguably takes
longer to optimize, so even if the final result in indy is faster than
primopts (can happen!) it may still not show in that time.

If I use gbench on the 50 iterations loop and let it do the iterations,
I get with jdk 1.8:
indy: 1.38s per iteration cpu time
primopts: 1.92s per iteration cpu time

according to that indy is faster than the primopts version. That means
if you did let the test above run not fo 50 iterations, but let's say
for 500 (house number), indy should be faster.

bye blackdrag

--
Jochen "blackdrag" Theodorou - Groovy Project Tech Lead
blog: http://blackdragsview.blogspot.com/
german groovy discussion newsgroup: de.comp.lang.misc
For Groovy programming sources visit http://groovy-lang.org


OlliP

unread,
Mar 11, 2013, 8:39:21 AM3/11/13
to us...@groovy.codehaus.org
Thanks for reporting back. The difference in JDK is considerable. I will play
with some warm up phase and post the results. I'm not sure I understand what
you mean with primopts..?! Google did not reveal much about what primopts
is...

-- Oliver



--
View this message in context: http://groovy.329449.n5.nabble.com/Havlak-Benchmark-applied-to-Groovy-2-1-Nice-results-but-one-tear-drop-tp5713749p5713772.html
Sent from the groovy - user mailing list archive at Nabble.com.

Jochen Theodorou

unread,
Mar 11, 2013, 8:54:50 AM3/11/13
to us...@groovy.codehaus.org
Am 11.03.2013 13:39, schrieb OlliP:
> Thanks for reporting back. The difference in JDK is considerable. I will play
> with some warm up phase and post the results. I'm not sure I understand what
> you mean with primopts..?! Google did not reveal much about what primopts
> is...

primopts:
short for primitive optimizations. A optimization technique introduced
in Groovy 1.8 by generating partially static compiled code and letting
the runtime decide if the static part can be used or not. The static
compilation is limited to doing direct method calls in certain cases and
operations on primitive types.

bye blackdrag


--
Jochen "blackdrag" Theodorou - Groovy Project Tech Lead
blog: http://blackdragsview.blogspot.com/
german groovy discussion newsgroup: de.comp.lang.misc
For Groovy programming sources visit http://groovy-lang.org


OlliP

unread,
Mar 26, 2013, 8:32:29 AM3/26/13
to us...@groovy.codehaus.org
I meanwhile had a look at the Havlak Scala code, which can be found here:
http://code.google.com/p/multi-language-bench/source/browse/trunk/src/

The Scala code takes 27024 ms.

Really good result, I thought. Then I said to myself that to be fair the
Scala functional lists used in the Havlak implementation should be replaced
with ArrayList to use the same collections as Java and Groovy. Now, the run
took 47276 ms. Thereafter, I also replaced the immutable Scala Sets with
Java's HashSet. Now execution time rose to 76820 ms.

What looked really excellent at the beginning is less than mediocre in the
end. Seems like the Havlak benchmark is mostly good for measuring collection
performance. That's a bit a pitty. There was so much work in that stuff.
But, I believe the Havlak measurements are at least partially usefeull.
Comparing Java and Groovy worked very well, because the code is very
similar.

-- Oliver



--
View this message in context: http://groovy.329449.n5.nabble.com/Havlak-Benchmark-applied-to-Groovy-2-1-Nice-results-but-one-tear-drop-tp5713749p5714638.html
Sent from the groovy - user mailing list archive at Nabble.com.

Jim White

unread,
Mar 26, 2013, 1:29:17 PM3/26/13
to us...@groovy.codehaus.org
Hundt's implementation of Havlak is bad Java.  His paper even talks about the unexpected result that Scala is faster than Java, sees that it is a lot of GC due to excessive generation of garbage then assumes that is OK.

I looked at the code for less than 2 minutes and see obvious discrepancies where Scala gets an unfair advantage by getting initial size values for collections and the Java code is left to use the default.  Compare for example lines 200~201 of the Java to 375~376 of the Scala).

These (any many other) problems were quickly found after the paper was published (I guess peer-review really is worthwhile after all) but no proper correction has been published AFAIK.  There is this blog post which talks about these issues.  One of the biggest Java vs Scala differences is that the Scala code is using primitives in collections where Java is using object boxing.


I see that this Groovy version is just the Java code and not at all idiomatic Groovy.  Not that groovizing will make the code faster, it would make it shorter (potentially lot shorter).

Jim
Reply all
Reply to author
Forward
0 new messages