Python is way faster than Clojure on this task
Pepijn de Vos  
From: Pepijn de Vos <pepijnde...@gmail.com>
Date: Thu, 4 Nov 2010 22:28:12 +0100
Local: Thurs, Nov 4 2010 5:28 pm
Subject: Python is way faster than Clojure on this task
Hi all,

I have written a Python script to analyze Minecraft levels and render a graph. Then I did the same with Clojure. It takes Python 10 seconds to analyze a map, while it takes Clojure over a minute.

After having tried different options without any significant improvement, I am lost as to why there is such a huge difference. I wouldn't mind an extra pair of eyes/brains to look at this.

I blogged about it in more detail here: http://pepijndevos.nl/clojure-versus-python
Clojure version: https://github.com/pepijndevos/Clomian/
Python version: https://github.com/l0b0/mian

Clojure spends most of its time in the freqs function, here are a couple of variations: https://gist.github.com/663096

If you want to run the code yourself, you'll need a Minecraft level and JNBT, which is not on Maven.
JNBT: http://jnbt.sourceforge.net/
The level used in the blogpost: http://dl.dropbox.com/u/10094764/World2.zip

Groeten,
Pepijn de Vos
--
Sent from my iPod Shuffle
http://pepijndevos.nl


 
Mike Meyer
From: Mike Meyer <mwm-keyword-googlegroups.620...@mired.org>
Date: Thu, 4 Nov 2010 17:43:02 -0400
Local: Thurs, Nov 4 2010 5:43 pm
Subject: Re: Python is way faster than Clojure on this task
On Thu, 4 Nov 2010 22:28:12 +0100
Pepijn de Vos <pepijnde...@gmail.com> wrote:

> Hi all,

> I have written a Python script to analyze Minecraft levels and render a graph. Then I did the same with Clojure. It takes Python 10 seconds to analyze a map, while it takes Clojure over a minute.

> After having tried different options without any significant improvement, I am lost as to why there is such a huge difference. I wouldn't mind an extra pair of eyes/brains to look at this.

> I blogged about it in more detail here: http://pepijndevos.nl/clojure-versus-python
> Clojure version: https://github.com/pepijndevos/Clomian/
> Python version: https://github.com/l0b0/mian

> Clojure spends most of its time in the freqs function, here are a couple of variations: https://gist.github.com/663096

> If you want to run the code yourself, you'll need a Minecraft level and JNBT, which is not on Maven.
> JNBT: http://jnbt.sourceforge.net/
> The level used in the blogpost: http://dl.dropbox.com/u/10094764/World2.zip

Can you check GC activity in the clojure version?

I once ran into an issue where Python was running rings around an
Eiffel version (compiled down to native code - no VM need apply). This
looks similar to what you have, in that I built a large data
structure, and then started groveling over it. Turned out that Eiffel
was doing a mark-and-sweep GC, which was spending all of its time
marking and sweeping the large static data structure, whereas Python,
doing reference-counting GC, didn't. Given that I know nothing about
Java GCs, this is just a WAG.

Come to think of it, how about trying to run the program on Jython? That
should have the same GC issues. If it's some similar environmental
problem, that would show up there as well.

     <mike
--
Mike Meyer <m...@mired.org>          http://www.mired.org/consulting.html
Independent Network/Unix/Perforce consultant, email for more information.

O< ascii ribbon campaign - stop html mail - www.asciiribbon.org


 
Andrew Gwozdziewycz
From: Andrew Gwozdziewycz <apg...@gmail.com>
Date: Thu, 4 Nov 2010 17:52:32 -0400
Local: Thurs, Nov 4 2010 5:52 pm
Subject: Re: Python is way faster than Clojure on this task
On Thu, Nov 4, 2010 at 5:43 PM, Mike Meyer

There are many different collectors for the JVMs, too numerous to list
here, all tunable.

http://www.oracle.com/technetwork/java/javase/tech/vmoptions-jsp-1401...

--
http://www.apgwoz.com


 
pepijn (aka fliebel)
From: "pepijn (aka fliebel)" <pepijnde...@gmail.com>
Date: Fri, 5 Nov 2010 07:41:56 -0700 (PDT)
Local: Fri, Nov 5 2010 10:41 am
Subject: Re: Python is way faster than Clojure on this task
I don't know how to check the GC activity on my project, but I did run
Mian on Jython. It performs much like my initial Clojure version. It
consumes absurd amounts of memory and never finishes.

So I think we can safely say that Java's GC or the way it stores data
is less efficient on this type of problem than Python.

On Nov 4, 10:43 pm, Mike Meyer <mwm-keyword-googlegroups.


 
pepijn (aka fliebel)
From: "pepijn (aka fliebel)" <pepijnde...@gmail.com>
Date: Fri, 5 Nov 2010 07:43:27 -0700 (PDT)
Local: Fri, Nov 5 2010 10:43 am
Subject: Re: Python is way faster than Clojure on this task
Can you recommend any? I tried a few of the GC options, but that didn't
help much.

On Nov 4, 10:52 pm, Andrew Gwozdziewycz <apg...@gmail.com> wrote:


 
David Nolen
From: David Nolen <dnolen.li...@gmail.com>
Date: Fri, 5 Nov 2010 11:24:57 -0400
Local: Fri, Nov 5 2010 11:24 am
Subject: Re: Python is way faster than Clojure on this task

On Fri, Nov 5, 2010 at 10:41 AM, pepijn (aka fliebel) <pepijnde...@gmail.com

> wrote:
> I don't know how to check the GC activity on my project, but I did run
> Mian on Jython. It performs much like my initial Clojure version. It
> consumes absurd amounts of memory and never finishes.

> So I think we can safely say that Java's GC or the way it stores data
> is less efficient on this type of problem than Python.

It's common for iteration-heavy, mutation-heavy code that is idiomatic in
Python to pose some challenges when translated to Clojure. Making this run
faster than Python should be possible, and I would be surprised if it wasn't
quite a bit faster. You should search the Google Group for the various
threads on optimizing slow Clojure code.

I note that the repo does not contain the data file which your code runs
against?

David


 
pepijn (aka fliebel)
From: "pepijn (aka fliebel)" <pepijnde...@gmail.com>
Date: Fri, 5 Nov 2010 09:38:33 -0700 (PDT)
Local: Fri, Nov 5 2010 12:38 pm
Subject: Re: Python is way faster than Clojure on this task
I will have a look around.

I listed the map I used in my first email; it's on my Dropbox:
http://dl.dropbox.com/u/10094764/World2.zip

Meanwhile I wrote a function that is already twice as fast as what I had,
no memory problems, no threads. One tiny problem: it doesn't produce
the same result.

It's the one at the top: https://gist.github.com/663096

The other day I found out that this kind of logic will actually refer
to the same transient. This eliminates the remainder and the assoc'ing
in the areduce fn that is second in the list, but I'm not sure this is
reliable, and it might be the reason why some results get lost.

On Nov 5, 4:24 pm, David Nolen <dnolen.li...@gmail.com> wrote:


 
B Smith-Mannschott
From: B Smith-Mannschott <bsmith.o...@gmail.com>
Date: Fri, 5 Nov 2010 19:30:24 +0100
Local: Fri, Nov 5 2010 2:30 pm
Subject: Re: Python is way faster than Clojure on this task
On Fri, Nov 5, 2010 at 17:38, pepijn (aka fliebel)

(defn freqs [^bytes blocks]
  (loop [idx 0
         ret (cycle (repeatedly 128 #(transient {})))]
    (if (< idx (alength blocks))
      (do
        (update! (first ret) (aget blocks idx) (fnil inc 0))
        (recur (inc idx) (next ret)))
      (map persistent! (take 128 ret)))))

I'm not familiar with incanter, which defines update!, but the update!
call makes me suspicious. Transients are not designed to be banged on
in place. That would explain your losing results.

// ben


 
Greg
From: Greg <g...@kinostudios.com>
Date: Fri, 5 Nov 2010 11:31:21 -0700
Local: Fri, Nov 5 2010 2:31 pm
Subject: Re: Python is way faster than Clojure on this task
I'm very curious about this situation; please let us know if you manage to write a version that's faster than the Python one (as David claims is possible). I would attempt it myself but I've only just recently had the time to dive back into Clojure. :-\

- Greg

On Nov 5, 2010, at 9:38 AM, pepijn (aka fliebel) wrote:


 
David Nolen
From: David Nolen <dnolen.li...@gmail.com>
Date: Fri, 5 Nov 2010 14:37:03 -0400
Local: Fri, Nov 5 2010 2:37 pm
Subject: Re: Python is way faster than Clojure on this task

On Fri, Nov 5, 2010 at 2:31 PM, Greg <g...@kinostudios.com> wrote:
> I'm very curious about this situation; please let us know if you manage to
> write a version that's faster than the Python one (as David claims is
> possible). I would attempt it myself but I've only just recently had the
> time to dive back into Clojure. :-\

> - Greg

I'm almost 100% certain it's possible. But having already optimized other
people's code several times on this list as a learning experiment, I've
become bored with optimizing other people's code ;)

David


 
pepijn (aka fliebel)
From: "pepijn (aka fliebel)" <pepijnde...@gmail.com>
Date: Fri, 5 Nov 2010 11:48:50 -0700 (PDT)
Local: Fri, Nov 5 2010 2:48 pm
Subject: Re: Python is way faster than Clojure on this task
update! is of my own making, based on assoc! and update-in

On Nov 5, 7:30 pm, B Smith-Mannschott <bsmith.o...@gmail.com> wrote:


 
pepijn (aka fliebel)
From: "pepijn (aka fliebel)" <pepijnde...@gmail.com>
Date: Fri, 5 Nov 2010 12:58:14 -0700 (PDT)
Local: Fri, Nov 5 2010 3:58 pm
Subject: Re: Python is way faster than Clojure on this task
Could you refer me to some of those relevant to my problem? I tried
searching for them, and most stuff I found is about killing
reflection, using buffered IO and other basics I've already covered.

On Nov 5, 7:37 pm, David Nolen <dnolen.li...@gmail.com> wrote:


 
mch
From: mch <matt.c.hug...@gmail.com>
Date: Fri, 5 Nov 2010 11:58:15 -0700 (PDT)
Local: Fri, Nov 5 2010 2:58 pm
Subject: Re: Python is way faster than Clojure on this task
You can use Visual VM (https://visualvm.dev.java.net/) to see how the
VM is using memory.  I don't think it specifically shows a log of GC
activity, but it is pretty clear from the graphs.

mch

On Nov 5, 8:41 am, "pepijn (aka fliebel)" <pepijnde...@gmail.com>
wrote:


 
Alan
From: Alan <a...@malloys.org>
Date: Fri, 5 Nov 2010 14:59:16 -0700 (PDT)
Local: Fri, Nov 5 2010 5:59 pm
Subject: Re: Python is way faster than Clojure on this task
I think you missed his point. (assoc! m k v) is *allowed* to modify m,
not *guaranteed*. It returns a pointer to a transient map, which may
be m, or may be a totally distinct map, or may be a new map that
shares some pointers with m. So your (do (update! blah foo
bar) ...more stuff) is potentially (and unpredictably) throwing away
the results of the update. You need to save the return value.
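
To make that concrete, here is a minimal sketch (not code from the gist; count-bad and count-good are names made up for illustration) contrasting a loop that throws away the result of assoc! with one that threads it through reduce:

```clojure
;; Illustrative only: count-bad and count-good are hypothetical helpers,
;; not part of Clomian or the gist.

(defn count-bad
  "Counts occurrences, but ignores what assoc! returns.
  Updates can be lost if assoc! hands back a different transient."
  [xs]
  (let [m (transient {})]
    (doseq [x xs]
      (assoc! m x (inc (get m x 0))))   ; return value discarded
    (persistent! m)))

(defn count-good
  "Counts occurrences, always continuing with the transient
  returned by the previous assoc!."
  [xs]
  (persistent!
   (reduce (fn [m x] (assoc! m x (inc (get m x 0))))
           (transient {})
           xs)))
```

Only the second form stays within what transients promise; whether the first one happens to give the right answer depends on the size and implementation of the map.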

On Nov 5, 11:48 am, "pepijn (aka fliebel)" <pepijnde...@gmail.com>
wrote:


 
Benny Tsai
From: Benny Tsai <benny.t...@gmail.com>
Date: Fri, 5 Nov 2010 15:57:00 -0700 (PDT)
Subject: Re: Python is way faster than Clojure on this task
Here's what I have so far.  The code splits blocks into 128 smaller
sub-arrays, each representing a level, then calls a modified version
of frequencies (using areduce instead of reduce) on each level.  On my
machine, with server mode on, it takes about 20 seconds to compute the
frequencies for an array of 99844096 blocks.  I haven't tested it on
your level, because I'm too lazy to get JNBT set up, but I'm curious
to see if it returns the correct result for you.

(def num-levels 128)

(defn get-level [level-num ^bytes blocks]
  (let [size (/ (count blocks) num-levels)
        output (byte-array size)]
    (doseq [output-idx (range size)]
      (let [block-idx (+ (* output-idx num-levels) level-num)]
        (aset output output-idx (aget blocks block-idx))))
    output))

(defn afrequencies
  [^bytes a]
  (persistent!
   (areduce a
            idx
            counts
            (transient {})
            (let [x (aget a idx)]
              (assoc! counts x (inc (get counts x 0)))))))

(defn freqs [^bytes blocks]
  (let [levels (map #(get-level % blocks) (range num-levels))]
    (map afrequencies levels)))

user=> (def blocks (byte-array 99844096))
#'user/blocks
user=> (time (count (freqs blocks)))
"Elapsed time: 20160.780769 msecs"
128

On Nov 4, 3:28 pm, Pepijn de Vos <pepijnde...@gmail.com> wrote:


 
Benny Tsai
From: Benny Tsai <benny.t...@gmail.com>
Date: Fri, 5 Nov 2010 17:27:34 -0700 (PDT)
Local: Fri, Nov 5 2010 8:27 pm
Subject: Re: Python is way faster than Clojure on this task
Oops, sorry, got my terminology wrong.  The sub-arrays represent
*layers*, not levels.  So the code should actually read as follows:

(def num-layers 128)

(defn get-layer [layer-num ^bytes blocks]
  (let [size (/ (count blocks) num-layers)
        output (byte-array size)]
    (doseq [output-idx (range size)]
      (let [block-idx (+ (* output-idx num-layers) layer-num)]
        (aset output output-idx (aget blocks block-idx))))
    output))

(defn afrequencies
  [^bytes a]
  (persistent!
   (areduce a
            idx
            counts
            (transient {})
            (let [x (aget a idx)]
              (assoc! counts x (inc (get counts x 0)))))))

(defn freqs [^bytes blocks]
  (let [layers (map #(get-layer % blocks) (range num-layers))]
    (map afrequencies layers)))

On Nov 5, 4:57 pm, Benny Tsai <benny.t...@gmail.com> wrote:


 
pepijn (aka fliebel)
From: "pepijn (aka fliebel)" <pepijnde...@gmail.com>
Date: Sat, 6 Nov 2010 03:23:13 -0700 (PDT)
Local: Sat, Nov 6 2010 6:23 am
Subject: Re: Python is way faster than Clojure on this task
Awesome. You managed to reproduce my initial solution, but working
with arrays all the way. It is already quite a bit faster than what I
have, but still nowhere near Python.

I'll put it on the list for things to look at.

On Nov 6, 1:27 am, Benny Tsai <benny.t...@gmail.com> wrote:


 
Peter Schuller
From: Peter Schuller <peter.schul...@infidyne.com>
Date: Sat, 6 Nov 2010 12:32:02 +0100
Local: Sat, Nov 6 2010 7:32 am
Subject: Re: Python is way faster than Clojure on this task

> You can use Visual VM (https://visualvm.dev.java.net/) to see how the
> VM is using memory.  I don't think it specifically show a log of GC
> activity, but it is pretty clear from the graphs.

Or just use -XX:+PrintGC and maybe -XX:+PrintGCDetails and
-XX:+PrintGCTimeStamps.

I haven't checked what the code is doing, but if you suspect extremely
poor performance due to GC it may be because your application happens
to require some amount of memory that is below but fairly close to the
default maximum heap size. That may easily cause very frequent GC:s
and show up as poor performance. If this is the case, doubling the
heap size should fix it (-Xmx...).

(The JVM does throw an OutOfMemoryError when it decides there is
cause to, but deciding when that is actually the right thing to do is
a difficult heuristic. So it's very possible to be in a situation that
is not quite bad enough, in terms of time spent doing GC, for the JVM
to throw, yet bad enough to cause very frequent full GC:s at
considerable cost in CPU time.)
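
If restarting with extra flags is inconvenient, roughly the same numbers can be read from inside a running REPL. A small sketch using the standard java.lang.management beans; gc-stats and max-heap-mb are names invented here, and the collector names reported will depend on the JVM and the GC in use:

```clojure
(import '(java.lang.management ManagementFactory GarbageCollectorMXBean))

(defn gc-stats
  "Returns one map per collector with its cumulative collection
  count and total collection time in milliseconds."
  []
  (vec
   (for [^GarbageCollectorMXBean gc
         (ManagementFactory/getGarbageCollectorMXBeans)]
     {:name        (.getName gc)
      :collections (.getCollectionCount gc)
      :millis      (.getCollectionTime gc)})))

(defn max-heap-mb
  "Maximum heap the JVM will attempt to use (what -Xmx controls), in MB."
  []
  (quot (.maxMemory (Runtime/getRuntime)) (* 1024 1024)))
```

Calling (gc-stats) before and after the slow function and comparing :millis gives a rough idea of how much of the wall-clock time went to collection.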

--
/ Peter Schuller


 
pepijn (aka fliebel)
From: "pepijn (aka fliebel)" <pepijnde...@gmail.com>
Date: Sat, 6 Nov 2010 06:08:14 -0700 (PDT)
Local: Sat, Nov 6 2010 9:08 am
Subject: Re: Python is way faster than Clojure on this task
[GC 524288K->103751K(2009792K), 0.1105872 secs]
[GC 628039K->105751K(2009792K), 0.0925628 secs]
[GC 630039K->109023K(2009792K), 0.0702017 secs]
[GC 633311K->115263K(2009792K), 0.0766341 secs]
[GC 639551K->117723K(2009792K), 0.0731049 secs]
[GC 642011K->120195K(1980096K), 0.0755788 secs]
[GC 614787K->122572K(1908352K), 0.1118307 secs]
[GC 617164K->118300K(1957184K), 0.0198061 secs]
[GC 559388K->124316K(1849472K), 0.0162145 secs]
[GC 565404K->130372K(1958400K), 0.0239592 secs]
[GC 556356K->136556K(1830208K), 0.0294408 secs]
[GC 562540K->142740K(1958976K), 0.0202257 secs]
[GC 565396K->148924K(1958976K), 0.0194222 secs]
[GC 571580K->155092K(1962624K), 0.0195487 secs]
[GC 582676K->170156K(1960256K), 0.0393426 secs]
[GC 597740K->215084K(1974144K), 0.1229404 secs]
[GC 661292K->221300K(1967360K), 0.1056735 secs]
[GC 667508K->227388K(1979968K), 0.0218560 secs]
[GC 688828K->233660K(1976768K), 0.0225914 secs]
[GC 695100K->240100K(1987904K), 0.0225009 secs]
[GC 716452K->246716K(1983744K), 0.0227506 secs]
[GC 723068K->253428K(1996928K), 0.0226864 secs]
[GC 747444K->305836K(1992384K), 0.1136048 secs]
[GC 799852K->312556K(1998144K), 0.1169098 secs]
[GC 809452K->318924K(1994048K), 0.0234625 secs]
[GC 815820K->325316K(2006784K), 0.0230104 secs]
[GC 839236K->331916K(2002432K), 0.1496760 secs]
[GC 845836K->338572K(2015488K), 0.0233363 secs]
[GC 869900K->345436K(2011136K), 0.0268697 secs]

For 30 seconds' worth of calculations, this doesn't look too bad to me.

I increased the heap space a lot, but I'm just bordering on the edge
of my real memory, so it's not helping much. Enabling the -server
thing seemed to help a tiny bit though.
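
For reference, the pauses in a log like the one above can be totalled with a few lines of Clojure. This is only a sketch: gc-pause-seconds is a made-up helper, and it assumes each line ends in "<number> secs]" as in the output shown. Adding up the 29 lines above comes to roughly 1.6 seconds.

```clojure
(defn gc-pause-seconds
  "Sums the pause times found in -XX:+PrintGC style output, e.g.
  \"[GC 524288K->103751K(2009792K), 0.1105872 secs]\"."
  [log-text]
  (->> (re-seq #"([0-9.]+) secs" log-text)          ; [whole-match number] pairs
       (map (fn [[_ secs]] (Double/parseDouble secs)))
       (reduce + 0.0)))
```

For example, (gc-pause-seconds (slurp "gc.log")) would total a saved log, where gc.log is a hypothetical file name.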

On Nov 6, 12:32 pm, Peter Schuller <peter.schul...@infidyne.com>
wrote:


 
Bob Hutchison
From: Bob Hutchison <hutch-li...@recursive.ca>
Date: Sat, 6 Nov 2010 09:39:12 -0400
Local: Sat, Nov 6 2010 9:39 am
Subject: Re: Python is way faster than Clojure on this task

On 2010-11-06, at 9:08 AM, pepijn (aka fliebel) wrote:

> I increased the heap space a lot, but I'm just bordering on the edge
> of my real memory, so it's not helping much.

Did you try pushing the minimum heap space up? I'm usually lazy and set them to the same value. I've had serious trouble caused by the way the JVM increases the heap space. Setting min to max (and max big) pretty much took care of that issue.

Cheers,
Bob

----
Bob Hutchison
Recursive Design Inc.
http://www.recursive.ca/
weblog: http://xampl.com/so


 
Peter Schuller
From: Peter Schuller <peter.schul...@infidyne.com>
Date: Sat, 6 Nov 2010 15:16:24 +0100
Local: Sat, Nov 6 2010 10:16 am
Subject: Re: Python is way faster than Clojure on this task

> For 30 seconds' worth of calculations, this doesn't look too bad to me.

If that was for all of the 30 seconds then yeah, GC is not the issue.

--
/ Peter Schuller


 
Peter Schuller
From: Peter Schuller <peter.schul...@infidyne.com>
Date: Sat, 6 Nov 2010 15:22:01 +0100
Local: Sat, Nov 6 2010 10:22 am
Subject: Re: Python is way faster than Clojure on this task

>> I increased the heap space a lot, but I'm just bordering on the edge
>> of my real memory, so it's not helping much.

> Did you try pushing the minimum heap space up? I'm usually lazy and set them to the same value. I've had serious trouble caused by the way the JVM increases the heap space. Setting min to max (and max big) pretty much took care of that issue.

People keep making claims like this in various situations but I don't
tend to hear details. Exactly what problems are you having that would
plausibly apply in this situation?

Not that there is no reason to set ms=mx (there are reasons), but the
need to do so tends to be over-stated in my opinion. But if I'm
missing something I'd like to know about it :)

--
/ Peter Schuller


 
Bob Hutchison
From: Bob Hutchison <hutch-li...@recursive.ca>
Date: Sat, 6 Nov 2010 11:45:49 -0400
Local: Sat, Nov 6 2010 11:45 am
Subject: Re: Python is way faster than Clojure on this task

On 2010-11-06, at 10:22 AM, Peter Schuller wrote:

>>> I increased the heap space a lot, but I'm just bordering on the edge
>>> of my real memory, so it's not helping much.

>> Did you try pushing the minimum heap space up? I'm usually lazy and set them to the same value. I've had serious trouble caused by the way the JVM increases the heap space. Setting min to max (and max big) pretty much took care of that issue.

> People keep making claims like this in various situations but I don't
> tend to hear details. Exactly what problems are you having that would
> plausibly apply in this situation?

> Not that there is no reason to set ms=mx (there are reasons), but the
> need to do so tends to be over-stated in my opinion. But if I'm
> missing something I'd like to know about it :)

I understand your scepticism, and even applaud it, but in my case it comes from actually trying it and measuring the difference (again, in my case you didn't need anything fancy; the difference was huge and highly visible). It happened often enough on different projects that I just do it routinely now. Anyway, if I *ever* see something that might be a GC-like problem, I first eliminate heap growth from the picture (and this is all I was suggesting here). Perhaps it's a holdover from earlier versions of the JVM, but I don't personally care that much with my servers -- I have a machine dedicated to the application, it's got a lot of memory, use it. I might have a different attitude for a desktop app :-)

Cheers,
Bob

> --
> / Peter Schuller

----
Bob Hutchison
Recursive Design Inc.
http://www.recursive.ca/
weblog: http://xampl.com/so

 
Benny Tsai
From: Benny Tsai <benny.t...@gmail.com>
Date: Sat, 6 Nov 2010 11:00:52 -0700 (PDT)
Local: Sat, Nov 6 2010 2:00 pm
Subject: Re: Python is way faster than Clojure on this task
While grocery shopping this morning, it occurred to me that it would
be even faster to do a single pass over the blocks array, and update
the count in one of 128 maps depending on the current index.  Of
course, when I got home and took another look at your gist page, it
turns out that's what you've already done in the second most recent
iteration :)  So the following is almost the same as that iteration,
except:

1. I call assoc! directly on the transient maps instead of writing my
own update-in!.
2. I use unchecked-remainder instead of rem to calculate the index of
the map to update; this shaved off 3-4 seconds.
3. The transient maps are stored in a plain old vector.  Using a
transient vector didn't make things much faster for me, so I dropped
it to keep the code a bit simpler.

(def num-layers 128)

(defn freqs
  [^bytes blocks]
  (map persistent!
       (areduce blocks
                idx
                all-freqs
                (vec (repeatedly num-layers #(transient {})))
                (let [layer (unchecked-remainder (int idx) (int num-layers))
                      layer-freqs (nth all-freqs layer)
                      block (aget blocks idx)
                      old-count (get layer-freqs block 0)]
                  (assoc! layer-freqs block (inc old-count))
                  all-freqs))))

On my home machine, this computes the frequencies over 99844096 blocks
in about 11 seconds.  My home machine is slower than the machine I
used earlier, so this version should be about twice as fast as my
earlier code.
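
One caveat, following Alan's point earlier in the thread: the code above ignores the value returned by assoc!, and a small transient map may hand back a different transient as it grows, so counts could go missing on layers with many distinct block types. The following is only a sketch of a variant that stores the returned transient back into the vector, plus a slow reference version to check it against. freqs-safe and freqs-reference are names made up here, rem is used instead of unchecked-remainder to keep the sketch portable across Clojure versions, and the timings quoted above do not apply to it.

```clojure
(def num-layers 128)

(defn freqs-safe
  "Per-layer block counts in a single pass, keeping whatever
  transient assoc! returns. Hypothetical variant, untimed."
  [^bytes blocks]
  (mapv persistent!
        (areduce blocks idx all-freqs
                 (vec (repeatedly num-layers #(transient {})))
                 (let [layer       (rem idx num-layers)
                       layer-freqs (nth all-freqs layer)
                       block       (aget blocks idx)]
                   ;; put the returned transient back in its slot
                   (assoc all-freqs layer
                          (assoc! layer-freqs block
                                  (inc (get layer-freqs block 0))))))))

(defn freqs-reference
  "Slow but obviously correct: group blocks by layer, then
  call frequencies on each group."
  [^bytes blocks]
  (let [by-layer (group-by first
                           (map-indexed (fn [i b] [(rem i num-layers) b])
                                        blocks))]
    (mapv #(frequencies (map second (get by-layer % [])))
          (range num-layers))))
```

Comparing (= (freqs-safe blocks) (freqs-reference blocks)) on a small array with more than a handful of distinct values per layer is a cheap way to catch lost updates. Rebuilding the persistent vector on every step costs something; a transient vector threaded the same way would be the next thing to try.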

On Nov 6, 4:23 am, "pepijn (aka fliebel)" <pepijnde...@gmail.com>
wrote:


 
Justin Kramer
From: Justin Kramer <jkkra...@gmail.com>
Date: Sun, 7 Nov 2010 09:30:26 -0800 (PST)
Local: Sun, Nov 7 2010 12:30 pm
Subject: Re: Python is way faster than Clojure on this task
Implementing this in straight Java might help pinpoint whether this is
a JVM issue or a Clojure issue.

Also, FYI, there is clj-glob (https://github.com/jkk/clj-glob) for
finding files based on patterns like */*/*.dat

Justin

On Nov 4, 4:28 pm, Pepijn de Vos <pepijnde...@gmail.com> wrote:


 