fastest aget and aset

169 views
Skip to first unread message

Timothy Pratley

unread,
Sep 27, 2009, 9:17:27 AM9/27/09
to Clojure
As far as I can tell there is currently no way to hint 2d array access
fast
(def a (make-array Double/TYPE 100 100))
(time (doseq [i (range 100), j (range 100)] (aget a (int i) (int j))))
"Elapsed time: 836.800335 msecs"
I can't find any combination of type hinting to speed it up.

However with a trivial change to core.clj [just added (int idx) in the
multi-arg case]:
> (apply aget (aget array (int idx)) idxs)))
(time (doseq [i (range 100), j (range 100)] (aget a i j)))
"Elapsed time: 40.162875 msecs"
It is fast! Yay!

That's great, almost as fast as a 1d array:
(def b (make-array Double/TYPE 10000))
(time (doseq [i (range 100), j (range 100)] (aget b (int (+ i (* j
100))))))
"Elapsed time: 27.701047 msecs"
But still it seems for maximum speed you would go with a 1d array.

Now for 1d arrays, the index also cannot be anything other than an
integer, so we could do the same thing [just added (int ~i)]:
{:inline (fn [a i] `(. clojure.lang.RT (aget ~a (int ~i))))

(time (doseq [i (range 100), j (range 100)] (aget b (+ i (* j 100)))))
"Elapsed time: 27.701047 msecs"
Great, it is just as fast as when we had to coerce to int.

But the coercion comes at a small cost when we don't need it:
(time (amap b idx ret (aset ret idx (inc (aget b idx)))))
original: "Elapsed time: 23.06673 msecs"
modified: "Elapsed time: 27.457263 msecs" <-- slower :( :( :(

So for the 1d case for maximum speed it appears better to leave it to
the user to coerce when necessary. The index must be an integer - is
there any way to have this fast in both cases? If not another
alternative would be to have an aget* and aget be a macro that
coerces, to allow both forms.

For multi-dimensional it appears the coercion must (and should) happen
in core?


Then I came across this post by cgrand:
http://clj-me.cgrand.net/2009/08/06/what-warn-on-reflection-doesnt-tell-you-about-arrays/
Indicates that type-hinting the array itself can provide a significant
speed-up, but I don't understand why, is this just more reflection or
is something else going on here?

In theory it should be possible to create a macro that creates
automatically type-hinted arrays (seeing the type is passed in) the
trick being to convert the type to a hint string. How would I do that?


And finally, it is actually faster to have an array of Objects and
store doubles in them than to use an array of doubles (due to boxing I
think). In my experimentation about 10% faster. Presumably an array of
doubles would be far less memory however.
(time (let [a #^"[Ljava.lang.Object;" (make-array Object 100)]
(dotimes [i 10000000] (aget a (int (rem i 100))))))
"Elapsed time: 3825.546843 msecs"
(time (let [a #^doubles (make-array Double/TYPE 100)] (dotimes [i
10000000] (aget a (int (rem i 100))))))
"Elapsed time: 4246.330734 msecs"


I know micro-benchmarks can be misleading, thanks for reading - please
point out any errors I've made.


Regards,
Tim.

Jonathan Smith

unread,
Sep 28, 2009, 2:45:39 AM9/28/09
to Clojure
Hello friend,

The time difference you are seeing between the 2-d version and the 1-d
version is actually due to in-lining...
If you try, for example:

(defn aget
"Returns the value at the index/indices. Works on Java arrays
of all
types."
{:inline (fn ([a i] `(. clojure.lang.RT (aget ~a ~i)))
([a i & idxs]
(reduce (fn [inner idx] `(aget ~inner (int ~idx)))
`(aget ~a (int ~i)) idxs)))
:inline-arities #{2 3 4 5 6 7} ;; actually i used a special 'all-
arities' set here
([array idx]
(clojure.lang.Reflector/prepRet (. Array (get array (int idx)))))
([array idx & idxs]
(apply aget (aget array (int idx)) idxs)))
and rebuild core...

I got results like so:

(time (doseq [i (range 100), j (range 100)] (aget a i j)))
"Elapsed time: 27.410739 msecs"

(time (doseq [i (range 100), j (range 100)] (aget b (int (+ i (* j
100))))))
"Elapsed time: 28.009345 msecs"

I don't really have any idea about the other stuff.

Jonathan Smith

unread,
Sep 28, 2009, 2:54:42 AM9/28/09
to Clojure
Thinking about it, this actually happens because of boxing and the
relation between macros and functions.

If you look at the 1-d case in the original code, it is inlined.
In the 2-d case in the original code, it is not inlined.

This means that when you coerce the indexs to int in the 2-d code, it
goes through a function call, meaning it gets boxed and would have to
be re-coerced to an int.

In the 1-d code, it is a macroexpansion, meaning that the coerced int
gets passed directly to the clojure.lang.RT aget primitive.

In the version I gave you, I added the coercion into the macro
version, because it seemed like a fairly reasonable thing to do,
however, I am no longer sure it is (it is really only required for the
function invocation, as coercing to int when the code has been inlined
actually works, and you can therefore avoid coercing twice).

I'm not sure, it is a very tricky subject, and probably not worth
worrying about for the minimal speed difference that it makes.

Interesting though,

-Jon.

Christophe Grand

unread,
Sep 28, 2009, 7:08:21 AM9/28/09
to clo...@googlegroups.com
Hi Timothy,

On Sun, Sep 27, 2009 at 3:17 PM, Timothy Pratley <timothy...@gmail.com> wrote:
Now for 1d arrays, the index also cannot be anything other than an
integer, so we could do the same thing [just added (int ~i)]:
 {:inline (fn [a i] `(. clojure.lang.RT (aget ~a (int ~i))))

(time (doseq [i (range 100), j (range 100)] (aget b (+ i (* j 100)))))
"Elapsed time: 27.701047 msecs"
Great, it is just as fast as when we had to coerce to int.

As I replied to your comment http://clj-me.cgrand.net/2009/08/06/what-warn-on-reflection-doesnt-tell-you-about-arrays/#comment-44
here, your timing is dominated by boxed arithmetic. You must also hint i and j.
 
Regarding multi-dim arrays, the best way to achieve high performance is to break access:

(def dd (make-array Double/TYPE 1000 1000))
user=> (time (dotimes [i 1000] (dotimes [j 1000] (aget dd i j))))
"Elapsed time: 455.514388 msecs"
user=> (time (dotimes [i 1000] (dotimes [j 1000] (-> dd (aget i) (aget j)))))
"Elapsed time: 257.625829 msecs"
user=> (time (dotimes [i 1000] (dotimes [j 1000] (-> #^objects dd (#^doubles aget i) (aget j)))))
"Elapsed time: 157.095175 msecs"

But since aget is inlined, the #^doubles hint is ignored so let's introduce a local:

user=> (time (dotimes [i 1000] (dotimes [j 1000] (let [a #^doubles (aget #^objects dd i)] (aget a j)))))
"Elapsed time: 158.134344 msecs"
No gain because the inlining "ate" the hint. We have to put the hint on something more stable:
user=> (time (dotimes [i 1000] (dotimes [j 1000] (let [#^doubles a (aget #^objects dd i)] (aget a j)))))
"Elapsed time: 17.412548 msecs"

It doesn't apply for random access but here we can even put the let out of the inner loop:
user=> (time (dotimes [i 1000] (let [#^doubles a (aget #^objects dd i)] (dotimes [j 1000] (aget a j)))))
"Elapsed time: 8.393386 msecs"

The following macro can help:
(defmacro deep-aget
  ([hint array idx]
    `(aget ~(vary-meta array assoc :tag hint) ~idx))
  ([hint array idx & idxs]
    `(let [a# (aget ~(vary-meta array assoc :tag 'objects) ~idx)]
       (deep-aget ~hint a# ~@idxs)))))

user=> (time (dotimes [i 1000] (dotimes [j 1000] (deep-aget doubles dd i j))))
"Elapsed time: 16.937279 msecs"

hth,

Christophe



--
Professional: http://cgrand.net/ (fr)
On Clojure: http://clj-me.blogspot.com/ (en)

Timothy Pratley

unread,
Sep 29, 2009, 12:15:03 AM9/29/09
to Clojure
Thanks Jon and Christophe for your insights! Very interesting :)
Reply all
Reply to author
Forward
0 new messages