Question regarding java array

230 views
Skip to first unread message

Ritchie Cai

unread,
Jun 10, 2015, 4:07:09 PM6/10/15
to clo...@googlegroups.com
I'm working on a java array of double with 1280000 elements. I need the max and min values of the array. So I initially tried areduce and loop, both gives runs around 20 seconds. But when try (apply max (vec array)) I get result under 90 ms.
Can anyone explain why there is such a big difference?
Also if want to iterate large java array like this to do some other operations, e.g. convolution, what's the best way to go? Is there another fast way to iterate through array or do I need to convert array into vector?

Thanks
Ritchie

Andy Fingerhut

unread,
Jun 10, 2015, 4:08:44 PM6/10/15
to clo...@googlegroups.com
Add this line at the top of your file where you try areduce and loop, and look for any reflection warnings that occur when loading the file:

(set! *warn-on-reflection* true)

If there are none, probably best to post a link to your code, or paste it in a message here if it is short enough, so others can give more precise suggestions.

Andy

--
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clo...@googlegroups.com
Note that posts from new members are moderated - please be patient with your first post.
To unsubscribe from this group, send email to
clojure+u...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
---
You received this message because you are subscribed to the Google Groups "Clojure" group.
To unsubscribe from this group and stop receiving emails from it, send an email to clojure+u...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Colin Yates

unread,
Jun 10, 2015, 4:09:41 PM6/10/15
to clo...@googlegroups.com
Will guess in the dark but would boxing come into play here?

Mikera

unread,
Jun 10, 2015, 6:01:25 PM6/10/15
to clo...@googlegroups.com
Consider using core.matrix with vectorz-clj for operations on large numerical arrays / vectors of doubles. It is a *lot* faster than using Clojure vectors for this kind of scenario, plus it has a lot of helpful array operations already defined.

(use 'clojure.core.matrix)
(def v (array :vectorz (range 1280000)))

(time (emax v))
=> 1279999.0
"Elapsed time: 1.179533 msecs"

Steven Yi

unread,
Jun 10, 2015, 9:04:00 PM6/10/15
to clo...@googlegroups.com
As mentioned by Colin and Andy, I would guess it would be some form of boxing and reflection going on.  I tried the following:

(defn array-max [^doubles arr]

  (let [len (alength arr)]

    (loop [m Double/NEGATIVE_INFINITY indx 0]

      (if (< indx len)

        (recur (max m (aget arr indx)) (unchecked-inc indx))

        m))))


user=> (let [vs (amap (double-array 1280000) idx ret (Math/random))] 

               (time (array-max vs))) 

"Elapsed time: 3.719835 msecs"


To note, if you check out the source of areduce:

user=> (source areduce)

(defmacro areduce

  "Reduces an expression across an array a, using an index named idx,

  and return value named ret, initialized to init, setting ret to the 

  evaluation of expr at each step, returning ret."

  {:added "1.0"}

  [a idx ret init expr]

  `(let [a# ~a]

     (loop  [~idx 0 ~ret ~init]

       (if (< ~idx  (alength a#))

         (recur (unchecked-inc ~idx) ~expr)

         ~ret))))


It's just a macro, and so typehinting is going to play a factor.  For example, with areduce and a type hint on the array:


(defn array-max2 [^doubles arr]

  (areduce arr idx ret Double/NEGATIVE_INFINITY (max ret (aget arr idx))))

user=> (let [vs (amap (double-array 1280000) idx ret (Math/random))] (time (array-max vs))) 

"Elapsed time: 3.314599 msecs"


But with no type hint on arr:


(defn array-max2 [arr]

  (areduce arr idx ret Double/NEGATIVE_INFINITY (max ret (aget arr idx))))


user=> (let [vs (amap (double-array 1280000) idx ret (Math/random))] (time (array-max2 vs))) 

"Elapsed time: 35612.919192 msecs"


Without a typehint on the arr argument, I also do get boxed math and reflection warnings:


Reflection warning, /private/var/folders/0k/xj_drd990xxf4q99n2bdknrc0000gn/T/form-init1595291808747030463.clj:2:3 - call to static method alength on clojure.lang.RT can't be resolved (argument types: unknown).

Boxed math warning, /private/var/folders/0k/xj_drd990xxf4q99n2bdknrc0000gn/T/form-init1595291808747030463.clj:2:3 - call: public static boolean clojure.lang.Numbers.lt(long,java.lang.Object).

Reflection warning, /private/var/folders/0k/xj_drd990xxf4q99n2bdknrc0000gn/T/form-init1595291808747030463.clj:2:58 - call to static method aget on clojure.lang.RT can't be resolved (argument types: unknown, int).

Boxed math warning, /private/var/folders/0k/xj_drd990xxf4q99n2bdknrc0000gn/T/form-init1595291808747030463.clj:2:49 - call: public static java.lang.Object clojure.lang.Numbers.max(double,java.lang.Object).

form-init1595291808747030463.clj:2 recur arg for primitive local: ret is not matching primitive, had: Object, needed: double

Auto-boxing loop arg: ret

Reflection warning, /private/var/folders/0k/xj_drd990xxf4q99n2bdknrc0000gn/T/form-init1595291808747030463.clj:2:3 - call to static method alength on clojure.lang.RT can't be resolved (argument types: unknown).

Boxed math warning, /private/var/folders/0k/xj_drd990xxf4q99n2bdknrc0000gn/T/form-init1595291808747030463.clj:2:3 - call: public static boolean clojure.lang.Numbers.lt(long,java.lang.Object).

Reflection warning, /private/var/folders/0k/xj_drd990xxf4q99n2bdknrc0000gn/T/form-init1595291808747030463.clj:2:58 - call to static method aget on clojure.lang.RT can't be resolved (argument types: unknown, int).

Boxed math warning, /private/var/folders/0k/xj_drd990xxf4q99n2bdknrc0000gn/T/form-init1595291808747030463.clj:2:49 - call: public static java.lang.Object clojure.lang.Numbers.max(java.lang.Object,java.lang.Object).


Ritchie Cai

unread,
Jun 11, 2015, 5:26:04 AM6/11/15
to clo...@googlegroups.com
Yup. As it turns out, the slow down is pretty much due to reflection. After added type hint, it's much better now.
The code can be viewed here: 

They are three different implementations, here is the performance for each one:
imaging.dicom> (def data1 (timer "total: " (load-txt-image "resources/PCT/CTP404_merged/x_11.txt")))
  ==> timing:      loading txt data 513.267672 ms
  ==> timing:                   max 61.782047 ms
  ==> timing:                   min 62.4488 ms
  ==> timing:   update pixel values 281.222481 ms
  ==> timing:               total:  952.981063 ms
#'imaging.dicom/data1
imaging.dicom> (def data3 (timer "total: " (load-txt-image_matrix "resources/PCT/CTP404_merged/x_11.txt")))
  ==> timing:      loading txt data 728.267621 ms
  ==> timing:                   max 25.00652 ms
  ==> timing:                   min 25.575979 ms
  ==> timing:   update pixel values 111.647122 ms
  ==> timing:               total:  926.00495 ms
#'imaging.dicom/data3
imaging.dicom> (def data2 (timer "total: " (load-txt-image_array "resources/PCT/CTP404_merged/x_11.txt")))
  ==> timing:      loading txt data 664.818514 ms
  ==> timing:            min max :  429.855274 ms
  ==> timing:   update pixel values 361.323422 ms
  ==> timing:               total:  1491.792197 ms
#'imaging.dicom/data2
imaging.dicom> 

The core.matrix one is the fastest, persistent vector is the second, but very close. Double array is actually the slowest one.
Lesson learned here for me is that only use java array when absolutely necessary. I always thought since it's primitive array, it should be the fastest. Apparently not!

Thanks for the hint.

Ritchie Cai

unread,
Jun 11, 2015, 5:32:02 AM6/11/15
to clo...@googlegroups.com
I tried with vectorz here: https://github.com/malloc82/imaging/blob/45475b99f564b1ac77e668e04b91cb9c01a096d7/src/imaging/dicom.clj#L130-L161 and I'm really impressed with it's performance.
The performance is shown here:

imaging.dicom> (def data3 (timer "total: " (load-txt-image_matrix "resources/PCT/CTP404_merged/x_11.txt")))
  ==> timing:      loading txt data 728.267621 ms
  ==> timing:                   max 25.00652 ms
  ==> timing:                   min 25.575979 ms
  ==> timing:   update pixel values 111.647122 ms
  ==> timing:               total:  926.00495 ms

Just wondering though, is there a faster way to load an array than this way? https://github.com/malloc82/imaging/blob/45475b99f564b1ac77e668e04b91cb9c01a096d7/src/imaging/dicom.clj#L138
the data file I'm trying to read from contains text based pixel values.

Thanks.

Ritchie Cai

unread,
Jun 11, 2015, 5:43:54 AM6/11/15
to clo...@googlegroups.com
Yup. Reflection is issue, I needed type hint. 
However, on another note, I notice that in your first test case, your evaluation takes about 3 ms, but on my machine it takes 76 ms. I'm running a Xeon CPU at 3.5 GHZ, clojure-1.7-RC1. What could cause such a huge different timing?

Thanks.

Colin Yates

unread,
Jun 11, 2015, 5:45:59 AM6/11/15
to clo...@googlegroups.com
Lesson learned here for me is that only use java array when absolutely necessary. I always thought since it's primitive array, it should be the fastest. Apparently not!
This bears repeating. I often find it hit-and-miss to know when idiomatic Clojure will be faster than turning to Java. Are there any resources (blogs, books etc) to help increase intuition as to where to go? Other than time, the REPL and criterium of course :)

Andy-

unread,
Jun 11, 2015, 8:30:07 AM6/11/15
to clo...@googlegroups.com
On Thursday, June 11, 2015 at 5:32:02 AM UTC-4, Ritchie Cai wrote:
Just wondering though, is there a faster way to load an array than this way? https://github.com/malloc82/imaging/blob/45475b99f564b1ac77e668e04b91cb9c01a096d7/src/imaging/dicom.clj#L138
the data file I'm trying to read from contains text based pixel values.

 I'm not sure about your use case but you may want to look into HDF5 data format. That's what it's made for: Super fast loading of numerical data from disk. It can even load in parallel (many hundreds MB/s). As soon as my text file takes too long to load I usually just convert it first to HDF5 and then go from there. Linux/Mac has HDF tools which can dump and import (h5import) your data from the command line.
It's also a very common data format.

HTH

Ritchie Cai

unread,
Jun 11, 2015, 10:18:59 AM6/11/15
to clo...@googlegroups.com
Unfortunately, I don't get to decide the data format. It's a dump from previous stage. Also, it's supposed to be super easy for Physics people to look at. If you every work with them, you'll know what I mean XD.

But thanks for the suggestion.

Steven Yi

unread,
Jun 11, 2015, 12:20:42 PM6/11/15
to clo...@googlegroups.com
I'm not sure why you'd see much slower results there. For reference,
I'm on a Core i7-2720-M (MacbookPro 8,1 13" early 2011), and was using
clojure-1.7-beta3.

Also, I looked at the code you posted and I'm not so sure about your
assumption that Java arrays are slower:

* in load-txt-image_array, you could probably type hint the data var
in the first let binding as ^doubles. With that, you should be able
to get rid of the type hinting throughout the rest of the function.

* In your areduce code you're using a vector to carry the result,
which requires packing and unpacking, which ends up being somewhat
like auto-boxing. Using a loop-recur would allow you to carry over the
min and max separately between steps, something like:

(let [len (alength data)]
(loop [i 0 my-min 0.0 my-max 0.0]
(if (< i len)
(let [v (aget data i)]
(recur (unchecked-inc i) (Math/min my-min v) (Math/max my-max v)))
[my-min my-max])))

(could also use min and max instead of Math/min and Math/max)

* In the "update pixel values" part of the function, you're using a
doseq with a range. That'd cause a sequence of boxed numbers of be
generated. Even though you have a ^double as a type hint, which will
get you out of the boxed math warning, there's still boxing going on
and you'll still first getting a boxed number and then have a cast to
primitive double. For example, if you use this function:

user=> (defn a [] (doseq [i (range 50)] (println (+ ^double i 1.0))))

and use no.disassemble, you'll find byte code like this:

278 checkcast java.lang.Number [131]
281 invokestatic
clojure.lang.RT.uncheckedDoubleCast(java.lang.Object) : double [135]
284 dconst_1
285 invokestatic clojure.lang.Numbers.unchecked_add(double,
double) : double [141]

I'd try using a loop-recur here as well instead of the doseq.

As a sidenote, if haven't looked, you might give Prismatic's hiphip[1]
library a try.

[1] - https://github.com/prismatic/hiphip
> --
> You received this message because you are subscribed to the Google
> Groups "Clojure" group.
> To post to this group, send email to clo...@googlegroups.com
> Note that posts from new members are moderated - please be patient with your
> first post.
> To unsubscribe from this group, send email to
> clojure+u...@googlegroups.com
> For more options, visit this group at
> http://groups.google.com/group/clojure?hl=en
> ---
> You received this message because you are subscribed to a topic in the
> Google Groups "Clojure" group.
> To unsubscribe from this topic, visit
> https://groups.google.com/d/topic/clojure/Uh64-DaPYfc/unsubscribe.
> To unsubscribe from this group and all its topics, send an email to

Ritchie Cai

unread,
Jun 15, 2015, 5:22:08 PM6/15/15
to clo...@googlegroups.com
Ha, you are right. That really make things a lot faster now. All three different implementations now pretty much runs about the same speed, no one is significantly faster or slower. Really appreciate your help.

However, what really puzzles me at this point is that  array-max call speed. On all the systems I have tried, both Linux and Mac, both clojure-1.7-beta3 and clojure-RC1, all using java 1.8. I get pretty much the same results, all around 80ms no where near 4ms. Also I'm using lein repl with cider 0.8.2 , the array-max is evaluated using cider-eval-defun-at-point (C-c C-c) function. 

Do you mind to give some more info on how you evaluated that function? I might be making some stupid mistake or something I'm not already know.

Steven Yi

unread,
Jun 15, 2015, 5:59:57 PM6/15/15
to clo...@googlegroups.com
I typed the array-max code and test in a REPL launched with "lein
repl" in a terminal. I did do that in the root of one of my projects
that had settings on to use 1.7.0 and to warn on reflection and
unchecked math. When I launched just now I have these versions
reported to the terminal:

REPL-y 0.3.5, nREPL 0.2.8
Clojure 1.7.0-beta3
Java HotSpot(TM) 64-Bit Server VM 1.8.0_45-b14

Ritchie Cai

unread,
Jun 15, 2015, 6:27:59 PM6/15/15
to clo...@googlegroups.com
My java was 1.8.0_05-b13. Upgraded it. Now it's around 9ms, close enough.

Thanks a lot. 

Jason Wolfe

unread,
Jun 15, 2015, 8:34:36 PM6/15/15
to clo...@googlegroups.com

Ritchie Cai

unread,
Jun 15, 2015, 8:38:54 PM6/15/15
to clo...@googlegroups.com
Aww ... with that it's around 3ms now. Thanks :)
Reply all
Reply to author
Forward
0 new messages