Python is way faster than Clojure on this task

Pepijn de Vos

unread,

Nov 4, 2010, 5:28:12 PM11/4/10

to clo...@googlegroups.com

Hi all,

I have written a Python script to analyze Minecraft levels and render a graph. Then I did the same with Clojure. It takes Python 10 seconds to analyze a map, while it takes Clojure over a minute.

After having tried different options without any significant improvement, I am lost as to why there is such a huge difference. I wouldn't mind an extra pair of eyes/brains to look at this.

I blogged about it in more detail here: http://pepijndevos.nl/clojure-versus-python
Clojure version: https://github.com/pepijndevos/Clomian/
Python version: https://github.com/l0b0/mian

Clojure spends most of its time in the freqs function, here are a couple of variations: https://gist.github.com/663096

If you want to run the code yourself, you'll need a Minecraft level and JNBT, which is not on Maven.
JNBT: http://jnbt.sourceforge.net/
The level used in the blogpost: http://dl.dropbox.com/u/10094764/World2.zip

Groeten,
Pepijn de Vos
--
Sent from my iPod Shuffle
http://pepijndevos.nl

Mike Meyer

unread,

Nov 4, 2010, 5:43:02 PM11/4/10

to clo...@googlegroups.com

Can you check GC activity in the clojure version?

I once ran into an issue where Python was running rings around an
Eiffel version (compiled down to native code - no VM need apply). This
looks similar to what you have, in that I built a large data
structure, and then started groveling over it. Turned out that Eiffel
was doing a mark-and-sweep GC, which was spending all of it's time
marking and sweeping the large static data structure, whereas python
doing a reference count GC didn't. Given that I know nothing about
Java GCs, this is just a WAG.

Come to think of it, how about trying to run the program Jython? That
should have the same GC issues. If it's some similar environmental
problem, that would show up there as well.

<mike
--
Mike Meyer <m...@mired.org> http://www.mired.org/consulting.html
Independent Network/Unix/Perforce consultant, email for more information.

O< ascii ribbon campaign - stop html mail - www.asciiribbon.org

Andrew Gwozdziewycz

unread,

Nov 4, 2010, 5:52:32 PM11/4/10

to clo...@googlegroups.com

There are many different collectors for the JVMs, too numerous to list
here, all tunable.

http://www.oracle.com/technetwork/java/javase/tech/vmoptions-jsp-140102.html

--
http://www.apgwoz.com

pepijn (aka fliebel)

unread,

Nov 5, 2010, 10:41:56 AM11/5/10

to Clojure

I don't know how to check the GC activity on my project, but I did run
Mian on Jython. It performs much like my initial Clojure version. It
consumes absurd amounts of memory and never finishes.

So I think we can safely say that Java's GC or the way it stores data
is less efficient on this type of problem than Python.

On Nov 4, 10:43 pm, Mike Meyer <mwm-keyword-googlegroups.

pepijn (aka fliebel)

unread,

Nov 5, 2010, 10:43:27 AM11/5/10

to Clojure

Can you recommend any? I tied a few of the GC options, but that didn't
help much.

> http://www.oracle.com/technetwork/java/javase/tech/vmoptions-jsp-1401...
>
> --http://www.apgwoz.com

David Nolen

unread,

Nov 5, 2010, 11:24:57 AM11/5/10

to clo...@googlegroups.com

On Fri, Nov 5, 2010 at 10:41 AM, pepijn (aka fliebel) <pepij...@gmail.com> wrote:

I don't know how to check the GC activity on my project, but I did run
Mian on Jython. It performs much like my initial Clojure version. It
consumes absurd amounts of memory and never finishes.

So I think we can safely say that Java's GC or the way it stores data
is less efficient on this type of problem than Python.

It's common that iteration heavy, mutation heavy code which is idiomatic in Python poses some challenges when translating to Clojure. Making this run faster than Python should be possible, and I would be surprised if it wasn't quite a bit faster. You should search the Google Group for the various threads on optimizing slow Clojure code.

I note that the repo does not contain the data file which your code runs against?

David

pepijn (aka fliebel)

unread,

Nov 5, 2010, 12:38:33 PM11/5/10

to Clojure

I will have a look around.

I listed the map I used in my first email, It's on my Dropbox:
http://dl.dropbox.com/u/10094764/World2.zip

Meanwhile I wrote a function that is already twice as fast as I had,
no memory problems, no threads. One tinny problem: it doesn't produce
the same result.

It's the one at the top: https://gist.github.com/663096

The other day I found out that this kind of logic will actually refer
to the same same transient. This eliminates the remainder and associng
in the areduce fn second in the list, but I'm not sure this is
reliable, and it might be the reason why some results get lost.

On Nov 5, 4:24 pm, David Nolen <dnolen.li...@gmail.com> wrote:
> On Fri, Nov 5, 2010 at 10:41 AM, pepijn (aka fliebel) <pepijnde...@gmail.com

B Smith-Mannschott

unread,

Nov 5, 2010, 2:30:24 PM11/5/10

to clo...@googlegroups.com

On Fri, Nov 5, 2010 at 17:38, pepijn (aka fliebel)
<pepij...@gmail.com> wrote:
> I will have a look around.
>
> I listed the map I used in my first email, It's on my Dropbox:
> http://dl.dropbox.com/u/10094764/World2.zip
>
> Meanwhile I wrote a function that is already twice as fast as I had,
> no memory problems, no threads. One tinny problem: it doesn't produce
> the same result.
>
> It's the one at the top: https://gist.github.com/663096
>
> The other day I found out that this kind of logic will actually refer
> to the same same transient. This eliminates the remainder and associng
> in the areduce fn second in the list, but I'm not sure this is
> reliable, and it might be the reason why some results get lost.

(defn freqs [^bytes blocks]
(loop [idx 0
ret (cycle (repeatedly 128 #(transient {})))]
(if (< idx (alength blocks))
(do
(update! (first ret) (aget blocks idx) (fnil inc 0))
(recur (inc idx) (next ret)))
(map persistent! (take 128 ret)))))

I'm not familiar with incanter, which defines update!, but the update!
call makes me suspicious. Transients are not designed to be banged on
in place. That would explain your losing results.

// ben

Greg

unread,

Nov 5, 2010, 2:31:21 PM11/5/10

to clo...@googlegroups.com

I'm very curios about this situation, please let us know if you manage to write a version that's faster than the python one (as David claims is possible). I would attempt it myself but I've only just recently had the time to dive back into Clojure. :-\

- Greg

> --
> You received this message because you are subscribed to the Google
> Groups "Clojure" group.
> To post to this group, send email to clo...@googlegroups.com
> Note that posts from new members are moderated - please be patient with your first post.
> To unsubscribe from this group, send email to
> clojure+u...@googlegroups.com
> For more options, visit this group at
> http://groups.google.com/group/clojure?hl=en

David Nolen

unread,

Nov 5, 2010, 2:37:03 PM11/5/10

to clo...@googlegroups.com

On Fri, Nov 5, 2010 at 2:31 PM, Greg <gr...@kinostudios.com> wrote:

I'm very curios about this situation, please let us know if you manage to write a version that's faster than the python one (as David claims is possible). I would attempt it myself but I've only just recently had the time to dive back into Clojure. :-\

- Greg

I'm almost 100% certain it's possible. But having already optimized other people's code several times on this list as a learning experiment, I've become bored with optimizing other people's code ;)

David

pepijn (aka fliebel)

unread,

Nov 5, 2010, 2:48:50 PM11/5/10

to Clojure

update! is of my own making, based on assoc! and update-in

pepijn (aka fliebel)

unread,

Nov 5, 2010, 3:58:14 PM11/5/10

to Clojure

Could you refer me to some of those relevant to my problem? I tried
searching for them, and most stuff I found is about killing
reflection, using buffered IO and other basics I've already covered.

On Nov 5, 7:37 pm, David Nolen <dnolen.li...@gmail.com> wrote:

mch

unread,

Nov 5, 2010, 2:58:15 PM11/5/10

to Clojure

You can use Visual VM (https://visualvm.dev.java.net/) to see how the
VM is using memory. I don't think it specifically show a log of GC
activity, but it is pretty clear from the graphs.

mch

On Nov 5, 8:41 am, "pepijn (aka fliebel)" <pepijnde...@gmail.com>
wrote:

Alan

unread,

Nov 5, 2010, 5:59:16 PM11/5/10

to Clojure

I think you missed his point. (assoc! m k v) is *allowed* to modify m,
not *guaranteed*. It returns a pointer to a transient map, which may
be m, or may be a totally distinct map, or may be a new map that
shares some pointers with m. So your (do (update! blah foo
bar) ...more stuff) is potentially (and unpredictably) throwing away
the results of the update. You need to save the return value.

On Nov 5, 11:48 am, "pepijn (aka fliebel)" <pepijnde...@gmail.com>

Benny Tsai

unread,

Nov 5, 2010, 6:57:00 PM11/5/10

to Clojure

Here's what I have so far. The code splits blocks into 128 smaller
sub-arrays, each representing a level, then calls a modified version
of frequencies (using areduce instead of reduce) on each level. On my
machine, with server mode on, it takes about 20 seconds to compute the
frequencies for an array of 99844096 blocks. I haven't tested it on
your level, because I'm too lazy to get JNBT set up, but I'm curious
to see if it returns the correct result for you.

(def num-levels 128)

(defn get-level [level-num ^bytes blocks]
(let [size (/ (count blocks) num-levels)
output (byte-array size)]
(doseq [output-idx (range size)]
(let [block-idx (+ (* output-idx num-levels) level-num)]
(aset output output-idx (aget blocks block-idx))))
output))

(defn afrequencies
[^bytes a]
(persistent!
(areduce a
idx
counts
(transient {})
(let [x (aget a idx)]
(assoc! counts x (inc (get counts x 0)))))))

(defn freqs [^bytes blocks]
(let [levels (map #(get-level % blocks) (range num-levels))]
(map afrequencies levels)))

user=> (def blocks (byte-array 99844096))
#'user/blocks
user=> (time (count (freqs blocks)))
"Elapsed time: 20160.780769 msecs"
128

Benny Tsai

unread,

Nov 5, 2010, 8:27:34 PM11/5/10

to Clojure

Oops, sorry, got my terminology wrong. The sub-arrays represent
*layers*, not levels. So the code should actually read as follows:

(def num-layers 128)

(defn get-layer [layer-num ^bytes blocks]
(let [size (/ (count blocks) num-layers)

output (byte-array size)]
(doseq [output-idx (range size)]

(let [block-idx (+ (* output-idx num-layers) layer-num)]

(aset output output-idx (aget blocks block-idx))))
output))

(defn afrequencies
[^bytes a]
(persistent!
(areduce a
idx
counts
(transient {})
(let [x (aget a idx)]
(assoc! counts x (inc (get counts x 0)))))))

(defn freqs [^bytes blocks]

(let [layers (map #(get-layer % blocks) (range num-layers))]
(map afrequencies layers)))

pepijn (aka fliebel)

unread,

Nov 6, 2010, 6:23:13 AM11/6/10

to Clojure

Awesome. You managed to reproduce my initial solution, but working
with arrays all the way. It is already quite a bit faster than what I
have, but still nowhere near Python.

I'll put it on the list for things to look at.

Peter Schuller

unread,

Nov 6, 2010, 7:32:02 AM11/6/10

to clo...@googlegroups.com

> You can use Visual VM (https://visualvm.dev.java.net/) to see how the
> VM is using memory. I don't think it specifically show a log of GC
> activity, but it is pretty clear from the graphs.

Or just use -XX:+PrintGC and maybe -XX:+PrintGCDetails and
-XX:+PrintGCTimeStamps.

I haven't checked what the code is doing, but if you suspect extremely
poor performance due to GC it may be because your application happens
to require some amount of memory that is below but fairly close to the
default maximum heap size. That may easily cause very frequent GC:s
and show up as poor performance. If this is the case, doubling the
heap size should fix it (-Xmx...).

(The JVM does throw OutOfMemoryExceptions when it decides there is
cause too, but it is a difficult heuristic to decide when that is
actually the right thing to do. So it's very possible to be in
situations that are not quite so bad in terms of time spent doing GC
that the JVM throws an exception, yet bad enough to cause very
frequent full GC:s at considerable cost in CPU time.)

--
/ Peter Schuller

pepijn (aka fliebel)

unread,

Nov 6, 2010, 9:08:14 AM11/6/10

to Clojure

[GC 524288K->103751K(2009792K), 0.1105872 secs]
[GC 628039K->105751K(2009792K), 0.0925628 secs]
[GC 630039K->109023K(2009792K), 0.0702017 secs]
[GC 633311K->115263K(2009792K), 0.0766341 secs]
[GC 639551K->117723K(2009792K), 0.0731049 secs]
[GC 642011K->120195K(1980096K), 0.0755788 secs]
[GC 614787K->122572K(1908352K), 0.1118307 secs]
[GC 617164K->118300K(1957184K), 0.0198061 secs]
[GC 559388K->124316K(1849472K), 0.0162145 secs]
[GC 565404K->130372K(1958400K), 0.0239592 secs]
[GC 556356K->136556K(1830208K), 0.0294408 secs]
[GC 562540K->142740K(1958976K), 0.0202257 secs]
[GC 565396K->148924K(1958976K), 0.0194222 secs]
[GC 571580K->155092K(1962624K), 0.0195487 secs]
[GC 582676K->170156K(1960256K), 0.0393426 secs]
[GC 597740K->215084K(1974144K), 0.1229404 secs]
[GC 661292K->221300K(1967360K), 0.1056735 secs]
[GC 667508K->227388K(1979968K), 0.0218560 secs]
[GC 688828K->233660K(1976768K), 0.0225914 secs]
[GC 695100K->240100K(1987904K), 0.0225009 secs]
[GC 716452K->246716K(1983744K), 0.0227506 secs]
[GC 723068K->253428K(1996928K), 0.0226864 secs]
[GC 747444K->305836K(1992384K), 0.1136048 secs]
[GC 799852K->312556K(1998144K), 0.1169098 secs]
[GC 809452K->318924K(1994048K), 0.0234625 secs]
[GC 815820K->325316K(2006784K), 0.0230104 secs]
[GC 839236K->331916K(2002432K), 0.1496760 secs]
[GC 845836K->338572K(2015488K), 0.0233363 secs]
[GC 869900K->345436K(2011136K), 0.0268697 secs]

For 30 second worth of calculations, this doesn't look to bad to me.

I increased the heap space a lot, but I'm just bordering on the edge
of my real memory, so it's not helping much. Enabling the -server
thing seemed to help a tinny bit though.

On Nov 6, 12:32 pm, Peter Schuller <peter.schul...@infidyne.com>
wrote:

Bob Hutchison

unread,

Nov 6, 2010, 9:39:12 AM11/6/10

to clo...@googlegroups.com

On 2010-11-06, at 9:08 AM, pepijn (aka fliebel) wrote:

> I increased the heap space a lot, but I'm just bordering on the edge
> of my real memory, so it's not helping much.

Did you try pushing the minimum heap space up. I'm usually lazy and set them to the same. I've had serious trouble caused by the way the JVM increases the heap space. Setting min to max (and max big) pretty much took care of that issue.

Cheers,
Bob

----
Bob Hutchison
Recursive Design Inc.
http://www.recursive.ca/
weblog: http://xampl.com/so

Peter Schuller

unread,

Nov 6, 2010, 10:16:24 AM11/6/10

to clo...@googlegroups.com

> For 30 second worth of calculations, this doesn't look to bad to me.

If that was for all of the 30 seconds then yeah, GC is not the issue.

--
/ Peter Schuller

Peter Schuller

unread,

Nov 6, 2010, 10:22:01 AM11/6/10

to clo...@googlegroups.com

>> I increased the heap space a lot, but I'm just bordering on the edge
>> of my real memory, so it's not helping much.
>
> Did you try pushing the minimum heap space up. I'm usually lazy and set them to the same. I've had serious trouble caused by the way the JVM increases the heap space. Setting min to max (and max big) pretty much took care of that issue.

People keep making claims like this in various situations but I don't
tend to hear details. Exactly what problems are you having that would
plausibly apply in this situation?

Not that there is no reason to set ms=mx (there are reasons), but the
need to do so tends to be over-stated in my opinion. But if I'm
missing something I'd like to know about it :)

--
/ Peter Schuller

Bob Hutchison

unread,

Nov 6, 2010, 11:45:49 AM11/6/10

to clo...@googlegroups.com

I understand your scepticism but, even applaud it, but, in my case, it comes from actually trying it and measuring the difference (again in my case you didn't need anything fancy it was huge and highly visible). It happened often enough on different projects that I just do it routinely now. Anyway, if I *ever* see something that might be a GC-like problem I first eliminate heap growth from the picture (and, this is all I was suggesting here). Perhaps a hold over from earlier versions of the JVM but I don't personally care that much with my servers -- I have a machine dedicated to the application, it's got a lot of memory, use it. Might have a different attitude for a desktop app :-)

Cheers,
Bob

>
> --
> / Peter Schuller

>
> --
> You received this message because you are subscribed to the Google
> Groups "Clojure" group.
> To post to this group, send email to clo...@googlegroups.com
> Note that posts from new members are moderated - please be patient with your first post.
> To unsubscribe from this group, send email to
> clojure+u...@googlegroups.com
> For more options, visit this group at
> http://groups.google.com/group/clojure?hl=en

----

Benny Tsai

unread,

Nov 6, 2010, 2:00:52 PM11/6/10

to Clojure

While grocery shopping this morning, it occurred to me that it would
be even faster to do a single pass over the blocks array, and update
the count in one of 128 maps depending on the current index. Of
course, when I got home and took another look at your gist page, it
turns out that's you've already done in the second most recent
iteration :) So the following is almost the same as that iteration,
except:

1. I call assoc! directly on the transient maps instead of writing my
own update-in!.
2. I use unchecked-remainder instead of rem to calculate the index of
the map to update; this shaved off 3-4 seconds.
3. The transient maps are stored in a plain old vector. Using a
transient vector didn't make things much faster for me, so I dropped
it to keep the code a bit simpler.

(def num-layers 128)

(defn freqs
[^bytes blocks]
(map persistent!
(areduce blocks
idx
all-freqs
(vec (repeatedly num-layers #(transient {})))
(let [layer (unchecked-remainder (int idx) (int num-layers))
layer-freqs (nth all-freqs layer)
block (aget blocks idx)
old-count (get layer-freqs block 0)]
(assoc! layer-freqs block (inc old-count))
all-freqs))))

On my home machine, this computes the frequencies over 99844096 blocks
in about 11 seconds. My home machine is slower than the machine I
used earlier, so this version should be about twice as fast as my
earlier code.

On Nov 6, 4:23 am, "pepijn (aka fliebel)" <pepijnde...@gmail.com>
wrote:

Justin Kramer

unread,

Nov 7, 2010, 12:30:26 PM11/7/10

to Clojure

Implementing this in straight Java might help pinpoint whether this is
a JVM issue or a Clojure issue.

Also, FYI, there is clj-glob (https://github.com/jkk/clj-glob) for
finding files based on patterns like */*/*.dat

Justin

Gijs S.

unread,

Nov 7, 2010, 11:32:24 AM11/7/10

to Clojure

The problem is not the freqs function. The implementation in Python
and in Clojure are different algorithms.

Both produce a graph with the number of blocks per layer per type, for
6 particular types. These six types are the Clay, Coal ore, Diamond
ore, Gold ore, Iron ore, Obsidian, and Redstone ore that are plotted
in the graphs in Pepijn's blogpost.

The Python approach only calculates the frequencies for these 6 types.
The Clojure approach calculates the frequency for ALL the types and
then only displays the plot of the 6 types. This is why the freqs
function take such a long time as it does too much.

The implementation in Clojure below only counts the frequencies for
the 6 types. It also only traverses the blocks array only once,
similarly to the suggestion from Benny Tsai.

Counting the frequencies takes 3 seconds on my machine, using Pepijn's
example level as input. This shows that there is no need to apologize
for using Clojure :)

Counting the frequencies of all 92 types takes 40 seconds on my
machine on the example level. I bet this is faster than the Python
algorithm for all types, because the Clojure approach traverses the
blocks array only once. The Python algorithm will traverse each of the
128 layers 92 times.

;; blocks is one big byte array of all the blocks, created by
concatenating all the files
(defn freqs [^bytes blocks]

(let [freq-layer (vec (repeatedly 128 #(transient {})))

types #{(byte 56) (byte 49) (byte 16) (byte 15) (byte 14)
(byte 73)}

size (alength blocks)]

(loop [idx (int 0)]

(when (< idx size)

(let [block (byte (aget blocks idx))]

(when (types block)

;;the work below is only done when block is one of the six
types
(let [layer (unchecked-remainder idx 128)

fl (nth freq-layer layer)]

(assoc! fl block (inc (get fl block (int 0))))))

(recur (unchecked-inc idx)))))

(doall (map persistent! freq-layer))))

The whole file: https://gist.github.com/666228

-Gijs

Benny Tsai

unread,

Nov 7, 2010, 4:58:35 PM11/7/10

to Clojure

Great catch, Gijs!

Albert Cardona

unread,

Nov 8, 2010, 3:36:35 AM11/8/10

to clo...@googlegroups.com

Gijs,

I would put the return value of the assoc! call as one of the values
of the loop.
Otherwise, as far as I understand, you may lose values associated with
the fl transient map.
(Transient maps should not be used any differently than regular maps.
It's just their internals that are different.)

Albert
--
http://albert.rierol.net

Gijs S.

unread,

Nov 8, 2010, 8:56:59 AM11/8/10

to Clojure

Indeed I was not using the transients correctly. Transients are not
designed to be bashed in-place. (http://clojure.org/transients)

The gist is updated with a version that first worked without the
transients and then had the transient operations added. (https://
gist.github.com/666228)

The times and analysis still hold.

-Gijs

Benny Tsai

unread,

Nov 8, 2010, 3:58:17 PM11/8/10

to Clojure

I played with this some more, and it seems like for accessing an
element in a vector, (v idx) is a bit faster than (nth v idx). If you
replace this line (line 52 in the gist)...

fl (nth freq-layer layer)]

with:

fl (freq-layer layer)]

...the total time needed to compute frequencies for all 92 types goes
down from 29 seconds to 21 seconds on my machine.

Gijs S.

unread,

Nov 8, 2010, 6:31:04 PM11/8/10

to Clojure

I also get consistent better times with (freq-layer layer) vs. (nth
freq-layer layer).

Gist updated: https://gist.github.com/666228

-Gijs

Meikel Brandmeyer

unread,

Nov 9, 2010, 1:46:08 AM11/9/10

to Clojure

Hi,

On 9 Nov., 00:31, "Gijs S." <gijsstuur...@gmail.com> wrote:

> I also get consistent better times with (freq-layer layer) vs. (nth
> freq-layer layer).

This is to be expected, because (freq-layer layer) acts directly on
the datastructure, while (nth freq-layer layer) goes through another
function.

Sincerely
Meikel

pepijn (aka fliebel)

unread,

Nov 10, 2010, 8:06:10 AM11/10/10

to Clojure

I almost forgot about this. I talked to mfex on IRC, and he came up
with the winning solution.
http://clojure-log.n01se.net/date/2010-11-07.html#11:32

https://gist.github.com/666228

Basically it comes down to *not* doing work, rather than doing it
fast.

What the Python version does - and this version as well - is check if
the block type is actually displayed before counting it.

Thanks for all the input!

pepijn (aka fliebel)

unread,

Nov 10, 2010, 11:37:58 AM11/10/10

to Clojure

ouch didn't noticed the second page.

On Nov 10, 2:06 pm, "pepijn (aka fliebel)" <pepijnde...@gmail.com>
wrote:

> I almost forgot about this. I talked to mfex on IRC, and he came up

> with the winning solution.http://clojure-log.n01se.net/date/2010-11-07.html#11:32

Leif Walsh

unread,

Nov 10, 2010, 11:59:58 AM11/10/10

to clo...@googlegroups.com

I am reminded of an arcane implementation...
http://lists.freebsd.org/pipermail/freebsd-current/2010-August/019310.html

> --
> You received this message because you are subscribed to the Google
> Groups "Clojure" group.
> To post to this group, send email to clo...@googlegroups.com
> Note that posts from new members are moderated - please be patient with your first post.
> To unsubscribe from this group, send email to
> clojure+u...@googlegroups.com
> For more options, visit this group at
> http://groups.google.com/group/clojure?hl=en

--
Cheers,
Leif

pepijn (aka fliebel)

unread,

Nov 12, 2010, 4:41:45 AM11/12/10

to Clojure

Me to: http://clojure-log.n01se.net/date/2010-11-07.html#11:57

On Nov 10, 5:59 pm, Leif Walsh <leif.wa...@gmail.com> wrote:
> I am reminded of an arcane implementation...http://lists.freebsd.org/pipermail/freebsd-current/2010-August/019310...

Reply all

Reply to author

Forward