(clojure 1.5.1) Weird performance results when using let versus def for variable

Colin Yates

Jun 21, 2013, 8:36:47 AM
to clo...@googlegroups.com
Hi all,

I am running some (naive and trivial) performance tests before deciding whether and how to use Clojure for some performance-critical number crunching, and I would like help understanding the behaviour.

I am defining an array inside a function, setting every element to 1 and then summing the elements with areduce. (I chose 1 instead of a random number for consistency; obviously the real contents will differ, otherwise it would all reduce to (n) :).) If I 'let' the array then it is orders of magnitude faster than if I 'def' the array.


[code]
(ns inc
  (:gen-class))

(defn- inc-atom [n]
  (def x (atom 0))
  (dotimes [n n] (swap! x inc))
  @x)

(defn- array-let [n]
  (let [a (int-array n)]
    (dotimes [n n] (aset-int a n 1))
    (areduce a i ret 0
             (+ ret (aget a i)))))

(defn- array-def [n]
  (def a (int-array n))
  (dotimes [n n] (aset-int a n 1))
  (areduce a i ret 0
           (+ ret (aget a i))))

(defn- run-test [subject n]
  (time (do (def x (subject n)) (println x))))

(defn -main [& args]
  (let [n 1000000]
    (println "inc atom")
    (run-test inc-atom n)
    (println "array with let")
    (run-test array-let n)
    (println "array with def")
    (run-test array-def n)))
[/code]

Interestingly, if I refactored the body into an 'execute-on-array' function which array-let and array-def both delegated to, then they had the same performance, which seems to imply it is about scoping, but the array in array-let and array-def has exactly the same scope... Setting the autoboxing warning to true didn't point out anything either.

The output (from my VM, so a bit slow):
[code]
inc atom
1000000
"Elapsed time: 213.214118 msecs"
array with let
1000000
"Elapsed time: 75.302602 msecs"
array with def
1000000
"Elapsed time: 12868.970203 msecs"
[/code]

For comparison, the following java code:

[code]
package perf;

public class Inc {
    public static void main(String[] args) {
        int n = 1000000;
        int counter = 0;
        long start = System.currentTimeMillis();
        for (int i=0; i<n; i++) counter++;
        long end  = System.currentTimeMillis();         
        System.out.println ("Naive " + (end - start) + " ms, counter is " + counter);

        counter = 0;
        int[] arr = new int[n];
        start = System.currentTimeMillis();
        for (int i=0; i<arr.length; i++) arr[i]=1;
        for (int i=0; i<arr.length; i++) counter = counter + arr[i];
        end  = System.currentTimeMillis();         
        System.out.println ("Array " + (end - start) + " ms, counter is " + counter);     
    }
}
[/code]

produces the (as expected, much faster) results:

[code]
Naive 3 ms, counter is 1000000
Array 6 ms, counter is 1000000
[/code]

I am not surprised that the atom/inc takes much longer than 3 ms, but I don't understand why the array solution is so much more expensive in Clojure.

On a related point - can anyone provide a faster implementation of summing up the contents of an array?

A lein project can be found at https://github.com/yatesco/clojure-perf; 'lein uberjar; java -jar target/*.jar' should demonstrate the output.

Jim - FooBar();

Jun 21, 2013, 8:51:38 AM
to clo...@googlegroups.com
A start would be to set *warn-on-reflection* and *unchecked-math* to
true... I think you're not properly type-hinting your 'aget' calls.
areduce is the fastest way to sum up an array of primitives, provided
there are no reflective calls. This takes just over 19 ms on my humble
machine, and don't forget that we're counting the time it takes to
populate the array as well...

(defn- array-sum-ints [n]
  (let [^ints a (int-array n)]
    (dotimes [n n] (aset a n 1))
    (areduce a i ret 0
             (+ ret (aget a i)))))

Jim
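
To see what the two flags mentioned above report, here is a minimal sketch of a sum with no type hint at all. The function name sum-unhinted is made up for illustration; the flags, areduce and aget are standard clojure.core. With the warning flag on, the compiler should flag the array calls in this function as reflective, which is the likely cost being discussed.

```clojure
;; Switch on the compiler diagnostics for the forms that follow in this file.
(set! *warn-on-reflection* true)
(set! *unchecked-math* true)

;; Hypothetical example: `a` carries no type hint, so the compiler cannot
;; tell that it is an int[] and has to resolve alength/aget by reflection.
(defn sum-unhinted [a]
  (areduce a i ret 0
           (+ ret (aget a i))))

;; The result is still correct; the reflective path is only slower.
(sum-unhinted (int-array [1 2 3]))
```

Adding ^ints to the parameter (as in the let-bound array above) is the usual way to make those warnings go away.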
> --
> --
> You received this message because you are subscribed to the Google
> Groups "Clojure" group.
> To post to this group, send email to clo...@googlegroups.com
> Note that posts from new members are moderated - please be patient
> with your first post.
> To unsubscribe from this group, send email to
> clojure+u...@googlegroups.com
> For more options, visit this group at
> http://groups.google.com/group/clojure?hl=en
> ---
> You received this message because you are subscribed to the Google
> Groups "Clojure" group.
> To unsubscribe from this group and stop receiving emails from it, send
> an email to clojure+u...@googlegroups.com.
> For more options, visit https://groups.google.com/groups/opt_out.
>
>

David Nolen

Jun 21, 2013, 9:29:58 AM
to clojure
Using `def` like that is simply incorrect. Unlike in, say, Scheme, `def` should always be at the top level.

I would first remove all internal defs and then rerun your benchmarks.
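
A small sketch of the difference, with made-up names (with-def, with-let, leaked, local): a def inside a function body still creates or rebinds a var in the enclosing namespace, whereas let only introduces a local.

```clojure
;; `def` inside a function body interns a var in the current namespace.
(defn with-def [n]
  (def leaked (* 2 n))   ; creates or rebinds the namespace-level var #'leaked
  leaked)

;; `let` introduces a local that exists only inside the function body.
(defn with-let [n]
  (let [local (* 2 n)]
    local))

(with-def 21)   ; returns 42 and, as a side effect, sets #'leaked
(with-let 21)   ; returns 42 and leaves the namespace untouched
leaked          ; still reachable here, outside with-def
```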


Colin Yates

Jun 21, 2013, 9:34:25 AM
to clo...@googlegroups.com
Thanks Jim and David.

David, can you expand on why it is incorrect?  That is such a strong word.  Is it correct but simply non-idiomatic?

Also note that if I move the body out of the 'let' version of the array into another function passing in the array then the performance is the same as the 'def' version, so even if def is a problem it isn't the only cause.
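
A likely explanation for that observation is that the extracted function receives the array as an unhinted parameter, so the type information the 'let' version had is lost. The sketch below assumes that is the cause; the body of execute-on-array is a guess at what the refactored function looked like, since it was not posted.

```clojure
;; Assumed shape of the extracted helper. The ^ints hint tells the compiler
;; that `a` is an int[], so alength/aset/aget need no reflection.
(defn execute-on-array [^ints a]
  (dotimes [i (alength a)] (aset a i 1))
  (areduce a i ret 0
           (+ ret (aget a i))))

;; The caller only allocates the array and delegates.
(defn array-let [n]
  (execute-on-array (int-array n)))

(array-let 1000)
```

Without the ^ints hint on the parameter the same code still runs, but the array calls would be resolved at run time instead of compile time.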




Jim - FooBar();

Jun 21, 2013, 9:49:52 AM
to clo...@googlegroups.com
On 21/06/13 14:34, Colin Yates wrote:
> Is it correct but simply non-idiomatic?

no no it's actually very *dangerous*...by doing this you're essentially introducing mutable global state in your program and Clojure is a language that strives hard to minimise mutable and especially global state! I wouldn't say 'wrong' because the compiler lets you do it but it is certainly nasty code!
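
The code in the original post already shows the hazard: both inc-atom and run-test contain (def x ...), so they write to the same namespace-level var. A cut-down sketch, with the function names count-up and remember made up for illustration:

```clojure
;; Both functions below define the same namespace-level var #'x.
(defn count-up [n]
  (def x (atom 0))              ; rebinds #'x to a fresh atom on every call
  (dotimes [_ n] (swap! x inc))
  @x)

(defn remember [v]
  (def x v)                     ; rebinds the very same #'x to an arbitrary value
  x)

(count-up 3)       ; #'x now holds an atom containing 3
(remember :done)   ; #'x now holds the keyword :done
x                  ; whichever function ran last decides what x is
```

Any other code that still expected x to be an atom at that point would fail, which is what makes this style fragile.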


> Also note that if I move the body out of the 'let' version of the array into another function passing in the array then the performance is the same as the 'def' version, so even if def is a problem it isn't the only cause.

Using 'let' or passing the array as a parameter is the nice and safe approach. The general performance of Clojure when it comes to primitive arrays was discussed very recently in this thread [1], and it was concluded that Clojure does indeed match Java's performance. The specific use-case actually was summing up primitive arrays. I encourage you to read it... In a nutshell, if you're using leiningen, add this entry to your project.clj and rerun your benchmarks.

:jvm-opts ^:replace []

Jim

[1] https://groups.google.com/forum/#!topic/clojure/LTtxhPxH_ws

Michael Klishin

Jun 21, 2013, 9:51:15 AM
to clo...@googlegroups.com

2013/6/21 Colin Yates <colin...@gmail.com>

> Is it correct but simply non-idiomatic?

It's not how defs are supposed to be used. It's like using fields for everything in Java
even though you could use local variables for a lot of things, just because you can.

def produces a shared (well, namespace-local) var. You probably
want a function-local one.

On top of that, since the thread is about performance, vars have features that locals don't
and no feature comes without performance overhead.
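
One such feature is that a var's root value can be replaced at run time, so code that refers to the var generally has to look the value up rather than bake it in. A minimal sketch with made-up names (limit, under-limit?), using the standard with-redefs macro:

```clojure
(def limit 10)

(defn under-limit? [n]
  (< n limit))          ; reads the current value of #'limit on each call

(under-limit? 50)       ; false with the original root value of 10

;; with-redefs temporarily replaces the root value of the var.
(with-redefs [limit 100]
  (under-limit? 50))    ; true while the redefinition is in effect
```

A let-bound local offers no such hook, which is part of why the compiler can treat it more cheaply.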
--
MK

http://github.com/michaelklishin
http://twitter.com/michaelklishin

Michael Klishin

Jun 21, 2013, 9:54:08 AM
to clo...@googlegroups.com
2013/6/21 Jim - FooBar(); <jimpi...@gmail.com>

> If you're using leiningen, add this entry to your project.clj and rerun your benchmarks.
>
> :jvm-opts ^:replace []

The original post suggests the code is executed by building an uberjar and running java -jar target/…,
so Leiningen's default JVM options are not relevant.

Jim - FooBar();

Jun 21, 2013, 9:59:27 AM
to clo...@googlegroups.com
Did you read the entire thread?
both Jason and Leon (who originally posted) admit that this was the problem...Stuart even opened this issue:
https://github.com/technomancy/leiningen/pull/1230

the very last post reads:

I should follow up on this and clarify that core.matrix's esum is in fact as fast as Java -- I apologize for the false statement (I was unaware that new versions of leiningen disable advanced JIT optimizations by default, which lead to the numbers I reported).

Jim





Andy Fingerhut

Jun 21, 2013, 10:06:07 AM
to clo...@googlegroups.com
:jvm-opts and that ticket for Leiningen only affect the options passed to the JVM if you let Leiningen invoke the JVM for you, e.g. via "lein run ...".

Colin showed pretty clearly in his email that he was using "lein uberjar" followed by running the JVM explicitly with his own command line, so Leiningen has no way to affect the JVM command line options in that case.
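
For reference, this is roughly where the setting under discussion would live. It is a sketch only: the project name and version are placeholders, :main is assumed from the namespace in the original post, and the entry is written with the ^:replace metadata form.

```clojure
;; project.clj (sketch; project name and version are placeholders)
(defproject clojure-perf "0.1.0-SNAPSHOT"
  :dependencies [[org.clojure/clojure "1.5.1"]]
  :main inc
  ;; Read by Leiningen when it launches the JVM itself (e.g. lein run).
  ;; A jar started directly with "java -jar" never sees this entry.
  :jvm-opts ^:replace [])
```

When the uberjar is started by hand, any JVM flags have to be given on the java command line instead.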

Andy



Michael Klishin

Jun 21, 2013, 10:08:37 AM
to clo...@googlegroups.com
2013/6/21 Jim - FooBar(); <jimpi...@gmail.com>
> Did you read the entire thread?
> both Jason and Leon (who originally posted) admit that this was the problem...Stuart even opened this issue:
> https://github.com/technomancy/leiningen/pull/1230

Leiningen's defaults only apply if you, hm, run Leiningen.

If you run java -jar …, you don't run Leiningen and the JVM flags used
are exactly what you provide.

Jim - FooBar();

Jun 21, 2013, 10:12:05 AM
to clo...@googlegroups.com
On 21/06/13 15:06, Andy Fingerhut wrote:
> Colin showed pretty clearly in his email that he was using "lein
> uberjar" followed by running the JVM explicitly with his own command
> line, so Leiningen has no way to affect the JVM command line options
> in that case.

oops! I thought Michael meant the guys from Prismatic not the OP on this
thread. Yes, this doesn't apply to Colin...
my bad...I'm really sorry...

Jim

Colin Yates

Jun 21, 2013, 12:39:30 PM
to clo...@googlegroups.com
Ah OK, I didn't realise.  I thought the vars would be locally scoped, i.e. semantically equivalent to 'let'ed symbols.

Thanks everyone for contributing.