XOR two arrays into a third on Clojure

461 views
Skip to first unread message

Ignacio Corderi

unread,
Mar 13, 2014, 1:26:33 AM3/13/14
to clo...@googlegroups.com
Hey guys, here is a huge performance problem I'm trying to figure out:

;; Say you have 2 data arrays sitting out there of 1 MB

(def one-mb (byte-array (* 1024 1024))) 
(def another-mb (byte-array (* 1024 1024))) 

;; and another one that should have the byte-by-byte XOR of the previous two 

(def out-mb (byte-array (* 1024 1024)))

;; question is... how do you code this guy, so that it doesn't take forever

(defn inplace-xor [a b out]
  (def ln (count a))
  (loop [x 0]
    (if (< x ln)
      (do 
        (aset-byte out x (bit-xor (nth a x) (nth b x)))
        (recur (+ x 1))
        ))))

;; checking the time 

(time (inplace-xor one-mb another-mb out-mb))

;; takes about ~400ms which is.... well... A LOT

;; I'm happy to receive a solution that involves calling some java library...

Thanks in advance!
-Ignacio

 

Walter van der Laan

unread,
Mar 13, 2014, 4:14:11 AM3/13/14
to clo...@googlegroups.com
Try this:
(defn inplace-xor [^bytes a ^bytes b ^bytes out]
  (let [ln (count a)]
    (loop [x 0]
      (if (< x ln)
        (do 
          (aset-byte out x (bit-xor (aget a x) (aget b x)))
          (recur (inc x)))))))

Michael Gardner

unread,
Mar 13, 2014, 4:36:14 AM3/13/14
to clo...@googlegroups.com
Might be slow because of the polymorphic nature of nth. If you replace nth with aget (and turn on *warn-on-reflection*, which is a good idea when performance-tuning), you'll get reflection warnings because Clojure doesn't know what Java method to use since it doesn't know what type of objects a and b are. Once you silence those with type hints like in Walter's example, you get roughly a 2x speedup (in my tests).

BTW, I'd write it like this:

(defn inplace-xor [^bytes a ^bytes b ^bytes out]
(dotimes [i (alength a)]
(aset-byte out i
(bit-xor (aget a i) (aget b i)))))
> --
> You received this message because you are subscribed to the Google
> Groups "Clojure" group.
> To post to this group, send email to clo...@googlegroups.com
> Note that posts from new members are moderated - please be patient with your first post.
> To unsubscribe from this group, send email to
> clojure+u...@googlegroups.com
> For more options, visit this group at
> http://groups.google.com/group/clojure?hl=en
> ---
> You received this message because you are subscribed to the Google Groups "Clojure" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to clojure+u...@googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.

Leif

unread,
Mar 13, 2014, 7:42:35 AM3/13/14
to clo...@googlegroups.com
Hi, Ignacio.

Performance tuning in clojure being somewhat complicated, I would look for prior art here.  For instance, the suggestions above give me a 6x speedup, but it's still way slower than the equivalent java code.  So I used Prismatic's hiphip (array)! library on your problem (but with longs) and got close to java for-loop speed:

https://github.com/Prismatic/hiphip

They are focused on numerics, so they don't have support for bytes/shorts, but it looks like you could add that fairly easily.  Or maybe someone has already written a equivalent library for bit-twiddling.  I would look/ask around.

If you're interested, the main slowdown in the above suggestions is probably the extra work from safe typecasts clojure does.  The hiphip library uses the unsafe equivalents from clojure.lang.RT.

Cheers,
Leif

Alex Miller

unread,
Mar 13, 2014, 8:34:29 AM3/13/14
to clo...@googlegroups.com
Agreed with all the comments on this so far. I would also say that dotimes is slower than loop for stuff like this so I would also make that change.

(defn inplace-xor [^bytes a ^bytes b ^bytes out] 
  (let [len (alength a)]
    (loop [i 0]
      (when (< i len)
        (aset-byte out i (bit-xor (aget a i) (aget b i)))
        (recur (inc i))))))

In this scenario loop/recur will use a primitive long for i.  And then I would look at the bytecode to verify no boxing is occurring.

Michael Gardner

unread,
Mar 13, 2014, 8:39:01 AM3/13/14
to clo...@googlegroups.com
On Mar 13, 2014, at 07:34 , Alex Miller <al...@puredanger.com> wrote:

> Agreed with all the comments on this so far. I would also say that dotimes is slower than loop for stuff like this so I would also make that change.

The dotimes version is slightly faster on my hardware. Why would it be slower?

Jozef Wagner

unread,
Mar 13, 2014, 9:21:43 AM3/13/14
to clo...@googlegroups.com
Note that looping with primitive int is faster than with long, and native array functions accepts/returns int instead of long for their index and length.

It is very hard to eliminate boxing without dropping to java. In you example, calling bit-xor does 2 autoboxing (and 1 long to int cast as aset-byte takes int index), as there is no byte variant of clojure.lang.Numbers/xor. 

Also you cannot detect it without advanced profiler or bytecode analysis.

Best,
Jozef

Leif

unread,
Mar 13, 2014, 10:33:16 AM3/13/14
to clo...@googlegroups.com
Based on what Jozef said, I could write

(defn inplace-xor-hh [^bytes a ^bytes b ^bytes out]
  (hiphip.array/afill! Byte/TYPE [_ out x a y b] (bit-xor (long x) (long y))))

It took 2 ms on my machine, vs. 80 ms for the 'dotimes' solution.'  I think that matches java's speed, but if not, I'm guessing java is your only option.

Again, you saw how fiddly this all is: even with a library taking care of most of it, there was still tweaking to do (for the simplest possible operation).  That's why I think you should consider building on what Prismatic (or someone else with a byte-munging specific library, if that exists) has done.

--Leif

Alex Miller

unread,
Mar 13, 2014, 4:07:43 PM3/13/14
to clo...@googlegroups.com
My best guess would be that I've used the loop version in places where I had (set! *unchecked-math* true) - I see that dotimes uses unchecked-inc so that might explain it. See what happens with this version.

(defn inplace-xor [^bytes a ^bytes b ^bytes out] 
  (let [len (alength a)]
    (loop [i 0]
      (when (< i len)
        (aset-byte out i (bit-xor (aget a i) (aget b i)))
        (recur (unchecked-inc i))))))

As someone mentions later in this thread, it really pays to read the compiled bytecode for this stuff and check for boxing.

Ignacio Corderi

unread,
Mar 13, 2014, 5:53:09 PM3/13/14
to clo...@googlegroups.com
This is a lot messier than I thought it would be. 

So far the fastest code is from @Michael_Gardner with the dotimes (~100ms)
Once I add :jvm-opts ^:replace [] on my profile and (set! *unchecked-math* true) several examples drop to ~80ms 
but @Leif example using hiphip drops to ~30ms.

@Leif I can't get the ~2ms you have, I'm probably missing some extra flag to achieve the extra order of magnitude drop. 
~2ms would be acceptable and comparable to my python implementation, I'm sure java would be somewhere in the vicinity of that too. 


Thanks to everyone for the replies!
-Ignacio


Ignacio Corderi

unread,
Mar 13, 2014, 6:19:13 PM3/13/14
to clo...@googlegroups.com
Ok so, 
This is what i got of running @Lein code example using hiphip 4 times in a row,
performance is now acceptable add I'm happy about it

"Elapsed time: 9.096 msecs"
"Elapsed time: 1.707 msecs"
"Elapsed time: 1.493 msecs"
"Elapsed time: 0.839 msecs"

Turning :aot on didn't fix the first outlier, so I'm guessing something inside hiphip is not warmed up, but I can live with that.

Thanks again!
-Ignacio


Jules

unread,
Mar 15, 2014, 5:43:38 PM3/15/14
to clo...@googlegroups.com
I've only skimmed this thread quickly, so if  I am repeating the obvious just ignore me :-)

I haven't noticed anyone suggesting parallelising your problem.

Take the best sequential solution that you have found, divide the size of the array by the number of cores that you have available, then hand off a start and end point to each core and apply your solution between them. 

If the cost of set up and tear down does not outweigh the amount of work that you have to do you should see a fair performance boost.


Jules

pete windle

unread,
Mar 16, 2014, 5:25:33 AM3/16/14
to clo...@googlegroups.com
Consider using criterium or similar for benchmarking. (time) is ok for rough and ready, but by the time you've navigated the tiered JIT of the JVM it just isn't good enough to be able to make useful inferences from the results.

Pete

James Elliott

unread,
Mar 12, 2016, 3:14:34 PM3/12/16
to Clojure
Interesting! This is the first time I have had to drop out of Clojure for performance reasons. It is not too surprising, given that I am doing low-level byte bashing in a tight loop to send pixels to a display over USB at sixty frames per second. But it is surprising that nothing like this has happened before in building my Clojure environment for running light shows.

I have created a separate Java library which I am calling Wayang, to make it easy for any Java project to drive this display. Once that’s done, I may add a small Clojure convenience wrapper, or I may just have Afterglow use Wayang directly. https://github.com/brunchboy/wayang

But first I have a bunch of Java2D integration to implement!

Mikera

unread,
Mar 13, 2016, 3:04:27 AM3/13/16
to Clojure
I have a useful library for image manipulation in Clojure, you may find it useful:


New ideas / PRs gratefully received!

James Elliott

unread,
Mar 14, 2016, 10:41:53 AM3/14/16
to clo...@googlegroups.com
Thanks, that does look nice. If I end up wanting to do any image manipulation I will definitely check it out. For now, all I need to do is be able to create a graphics context into which I can draw the user interface, then convert it into bits in the proper format, and blast them over USB to the Ableton Push 2. And that is working, to much rejoicing!

Cheers,

-James

--
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clo...@googlegroups.com
Note that posts from new members are moderated - please be patient with your first post.
To unsubscribe from this group, send email to
clojure+u...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
---
You received this message because you are subscribed to a topic in the Google Groups "Clojure" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/clojure/kcx5eKdMxcs/unsubscribe.
To unsubscribe from this group and all its topics, send an email to clojure+u...@googlegroups.com.
Reply all
Reply to author
Forward
0 new messages