http://shootout.alioth.debian.org
The Clojure program for the "fasta" problem, with source code, AOT
compilation command, and execution command given on this web page:
http://shootout.alioth.debian.org/u32/program.php?test=fasta&lang=clojure&id=3
still takes about 6x to 8x more time than the best Java 6 -server
program here, depending upon which of the four machines it is run on:
http://shootout.alioth.debian.org/u32/program.php?test=fasta&lang=java&id=3
I'm sure the Clojure program can be made faster, e.g. by doing fewer
calls to write to the output file, with more bytes per call. Most of
the time seems to be file writing and generating random numbers in gen-
random!, at least on my systems where I've done testing and
profiling. I'm also seeing a fair amount of time spent in calls to
java.lang.Double.valueOf, according to the built-in profiler that
comes with the Hotspot JVM.
Note: The web site is Clojure 1.2 only right now, so don't expect a
tweaked-out program using things that only work in Clojure 1.3 to work
there yet. Other tips:
+ Test your program by comparing its output to the Java program with
the same command line args. The Java program isn't guaranteed to be
bug free, but it has been tested a fair amount.
+ If you decide to submit, please take some time to select a -Xmx max
heap size argument that gives a good tradeoff between low memory use
and not too much extra time spent doing GC because of it. I've got
some scripts to help automate finding such a value if you are
interested, and I'd even volunteer to run them for you if you send me
your program. Still, it is straightforward to do by hand, too -- no
rocket science involved.
You are of course free to work on any of the problems you wish, but I
just wanted to point out this one as being what I consider a bit
stubborn to squeeze down the run time.
Andy
I'll contact you off list about how this gets submitted to the web site.
Thanks!
Andy
> --
> You received this message because you are subscribed to the Google
> Groups "Clojure" group.
> To post to this group, send email to clo...@googlegroups.com
> Note that posts from new members are moderated - please be patient
> with your first post.
> To unsubscribe from this group, send email to
> clojure+u...@googlegroups.com
> For more options, visit this group at
> http://groups.google.com/group/clojure?hl=en
I spent some time doing some small tweaks.
On one machine I have, your version runs in as little as 4.43 sec of
user + system CPU time (minimum time over about half a dozen separate
runs).
My tweaked version below runs in as little as 2.55 sec.
An AOT compiled "Hello, world!" program in Clojure on this same
machine takes 1.37 sec. (Clojure 1.2.0 for all of these results).
There might be something still left to shave off, but not a lot.
Thanks,
Andy
(ns fasta
(:gen-class))
(set! *warn-on-reflection* true)
(def *width* 60)
(def *lookup-size* 222000)
(def *alu* (str "GGCCGGGCGCGGTGGCTCACGCCTGTAATCCCAGCACTTTGG"
"GAGGCCGAGGCGGGCGGATCACCTGAGGTCAGGAGTTCGAGA"
"CCAGCCTGGCCAACATGGTGAAACCCCGTCTCTACTAAAAAT"
"ACAAAAATTAGCCGGGCGTGGTGGCGCGCGCCTGTAATCCCA"
"GCTACTCGGGAGGCTGAGGCAGGAGAATCGCTTGAACCCGGG"
"AGGCGGAGGTTGCAGTGAGCCGAGATCGCGCCACTGCACTCC"
"AGCCTGGGCGACAGAGCGAGACTCCGTCTCAAAAA"))
(def *codes* "acgtBDHKMNRSVWY")
(def *iub* [0.27 0.12 0.12 0.27 0.02 0.02 0.02 0.02
0.02 0.02 0.02 0.02 0.02 0.02 0.02])
(def *homosapiens* [0.3029549426680 0.1979883004921
0.1975473066391 0.3015094502008])
(defn find-index [f coll]
(loop [i (int 0)
s (seq coll)]
(if (f (first s))
i
(recur (unchecked-inc i) (rest s)))))
(def random-seed (int-array [42]))
(let [IM (int 139968)
IA (int 3877)
IC (int 29573)
scale (double (/ *lookup-size* IM))]
(defn gen-random-fast []
(let [new-seed (unchecked-remainder
(unchecked-add (unchecked-multiply
(aget (ints random-seed) 0) IA)
IC) IM)]
(aset (ints random-seed) 0 new-seed)
(int (* new-seed scale)))))
;; Takes a vector of probabilities.
(defn make-cumulative [v]
(vec (map #(reduce + (subvec v 0 %)) (range 1 (inc (count v))))))
;; Takes a vector of cumulative probabilities.
(defn make-lookup-table [v]
(let [sz (int *lookup-size*)
lookup-scale (- sz 0.0001)
^ints a (int-array sz)]
(dotimes [i sz]
(aset a i (int (find-index #(<= (/ i lookup-scale) %) v))))
a))
(defn cycle-bytes [source source-size n
^java.io.BufferedOutputStream ostream]
(let [source-size (int source-size)
width (int *width*)
width+1 (int (inc width))
buffer-size (int (* width+1 4096))
buffer (byte-array buffer-size (byte 10))]
(loop [i (int 0)
j (int 0)
n (int n)]
(System/arraycopy source i buffer j width)
(if (> n width)
(recur (int (unchecked-remainder
(unchecked-add i width) source-size))
(int (let [j (unchecked-add j width+1)]
(if (== j buffer-size)
(do (.write ostream buffer) (int 0))
j)))
(unchecked-subtract n width))
(do
(aset buffer (+ j n) (byte 10))
(.write ostream buffer (int 0) (+ j n 1)))))))
(defn fasta-repeat [n ^java.io.BufferedOutputStream ostream]
(let [source (.getBytes (str *alu* *alu*))]
(cycle-bytes source (count *alu*) n ostream)))
(defn fasta-random [probs n ^java.io.BufferedOutputStream ostream]
(let [codes (.getBytes (str *codes*))
lookup-table (ints (make-lookup-table
(make-cumulative probs)))
width (int *width*)
buffer (byte-array 222000)
seeds (int-array 222000)
first-seed (aget (ints random-seed) 0)]
(loop [i (int 0)]
(aset seeds i (aget (ints random-seed) 0))
(aset buffer i
(aget codes
(aget lookup-table
(gen-random-fast))))
(if (== (aget (ints random-seed) 0) first-seed)
(do
(System/arraycopy buffer 0 buffer (inc i) *width*)
(cycle-bytes buffer (inc i) n ostream)
(aset (ints random-seed) 0 (aget seeds (mod n (inc i)))))
(recur (unchecked-inc i))))))
(defn write-line [s ^java.io.BufferedOutputStream stream]
(.write stream (.getBytes (str s "\n"))))
(defn -main [& args]
(let [n (Integer/parseInt (nth args 0))
ostream (java.io.BufferedOutputStream. System/out)
start-time (System/currentTimeMillis)]
(write-line ">ONE Homo sapiens alu" ostream)
(fasta-repeat (* n 2) ostream)
(write-line ">TWO IUB ambiguity codes" ostream)
(fasta-random *iub* (* n 3) ostream)
(write-line ">THREE Homo sapiens frequency" ostream)
(fasta-random *homosapiens* (* n 5) ostream)
(.flush ostream)))
Thanks,
Andy
If that was the reason it was deprecated, then it makes sense. It
would be nice if the rules for that problem made it more explicit that
this was not allowed.
I made a modified version of the program that calls the random number
generator for every separate character output (for the two sequences
that should be randomly generated), and unfortunately the run time is
back up to about the same as the current fastest program on the web
site, which unfortunately is 6x to 8x slower than the best Java program.
I've published that program at the link below. It contains comments
marking two lines in the function gen-random-fast where the Hotspot
JVM profiler tells me that the CPU is spending lots of time in
java.lang.Integer.valueOf (Clojure 1.2.0). I can't seem to get rid of
these calls to Integer.valueOf even after trying about half a dozen
different variations of type-hinting. Can anyone else see a way to
change the program to avoid those calls?
http://github.com/jafingerhut/clojure-benchmarks/blob/master/fasta/fasta.clj-10.clj
Thanks,
Andy
I don't get any reflection warnings with (gen-random-fast) on 1.2. Do you?
Nope. No reflection warnings. But the profiler does show significant
amounts of CPU time spent in java.lang.Integer.valueOf. Maybe it is
mistaken or misleading output from the profiler, but I'm inclined to
believe it so far.
Andy
Then you've got boxed primitives, but the Clojure compiler knows what
types they all are...
Here is something a little odd. I've looked at the disassembly of the
Java bytecode produced for this Clojure function:
(def random-seed (int-array [42]))
(let [scale (double (/ *lookup-size* 139968))]
(defn gen-random-fast []
(let [^ints random-seed random-seed
IM (int 139968)
IA (int 3877)
IC (int 29573)
zero (int 0)
new-seed (int (unchecked-remainder
(unchecked-add
(unchecked-multiply
(aget random-seed zero) IA) IC) IM))]
(aset random-seed zero new-seed) ;; <--- this line produces
odd Java bytecode
(int (* new-seed scale)))))
The marked line produces bytecodes for pushing the args on the stack,
then calling aset, and then it calls Integer.valueOf to take the int
returned by aset (which is just the value of new-seed assigned to the
array element) and convert it to an Integer, then it pops it off the
stack, discarding that Integer value. Strange. Apparently java -
server was not able to optimize away that wasted work, but I don't yet
know why the Clojure compiler generated the call in the first place.
The second call to Integer.valueOf() from the last line of the
function makes sense to me now: the int return value must be boxed
into an Integer before being returned as the value of function gen-
random-fast, since all Clojure 1.2 function arguments and returns
values must be Objects.
Given the wasted bytecode instructions mentioned above, on a hunch I
decided to assign the return value of aset to a symbol via let -- one
that is never used again. This actually produces faster code:
(def random-seed (int-array [42]))
(let [scale (double (/ *lookup-size* 139968))]
(defn gen-random-fast []
(let [^ints random-seed random-seed
IM (int 139968)
IA (int 3877)
IC (int 29573)
zero (int 0)
new-seed (int (unchecked-remainder
(unchecked-add
(unchecked-multiply
(aget random-seed zero) IA) IC) IM))
throwaway-val (int (aset random-seed zero new-seed))]
<-- this is the only changed line
(int (* new-seed scale)))))
Hmmmm. I hope such wasted instructions are rare.
Andy
Andy
> Yeah but it's not too hard to see why the Lisp programmer Juho
> Snellman opined on HN "the [sic program] implementations seem to have
> totally dived off the deep end of complexity".
That's why this kind of competition is not interesting to me. As it only compares the fastest programs, there's every incentive to submit horrifically complex, optimized-to-the-hilt solutions that would almost never get used in the real world.
Rather than ask "what's the fastest this can be done in language X?", we should ask "how fast are the idiomatic ways of doing this in language X" and possibly "how hard is it to do it faster, when those are not good enough?".
Possibly just including code size/complexity with the performance metric would do the trick, though measuring that cross-language is a challenge all its own.
Those are all good, but how many people actually look at anything beyond the fastest implementation for a given language? To some extent this is their problem, sure, but it's not helped by the prominence the shootout site gives to the fastest implementations in various places.
In the first place, I'm not even sure what the purpose of all that data is. What do people do with it? Is it purely for entertainment, or do people actually use it to decide what language to use for a project? If it's just for entertainment (as the word "game" seems to imply), won't people just skim the numbers for the fastest programs and walk away with wrong impressions?
In my opinion, it's simply not meaningful nor interesting to compare the fastest implementations across different languages. But by presenting those comparisons prominently (even when it also provides other more meaningful comparisons elsewhere), the shootout site does disservice to its users and the programming community by providing a convenient yet misleading focus point.
Perhaps each language could have one "best" implementation selected by the community or by a panel of judges; then the default or most prominent comparisons could use these "best" implementations rather than the fastest ones. Either way, it would be good to remove those fastest-vs-fastest comparisons altogether, or at least reduce their prominence significantly.