I can't say I've needed such a thing very often. Your own
reduce-times macro might clarify your code a bit, though.
> (defn test-shuffle-uniformity
>   "Shuffles a 3-item collection n times, mapping each result to the number of
>   times it's hit. If the distribution isn't uniform we've let bias into the
>   random numbers or the algorithm. Typical results on my machine:
>   (2 1 3) : 16611
>   (3 2 1) : 16771
>   (1 3 2) : 16707
>   (1 2 3) : 16766
>   (3 1 2) : 16555
>   (2 3 1) : 16590"
>   ([n] (test-shuffle-uniformity n shuffle))
>   ([n shuffle]
>    (let [coll [1 2 3]
>          results (reduce (fn [results _]
>                            (let [result (shuffle coll)]
>                              (assoc results result
>                                     (inc (or (get results result)
>                                              0)))))
>                          {} (range n))]
>      (doseq [r results]
>        (println (key r) ":" (val r))))))
You could simplify your reduce fn a bit by using merge-with:
(defn test-shuffle-uniformity
  "Shuffles a 3-item collection n times, mapping each result to the number of
  times it's hit. If the distribution isn't uniform we've let bias into the
  random numbers or the algorithm. Typical results on my machine:
  (2 1 3) : 16611
  (3 2 1) : 16771
  (1 3 2) : 16707
  (1 2 3) : 16766
  (3 1 2) : 16555
  (2 3 1) : 16590"
  ([n] (test-shuffle-uniformity n shuffle))
  ([n shuffle]
   (let [coll [1 2 3]
         results (reduce (fn [results _]
                           (merge-with + results {(shuffle coll) 1}))
                         {} (range n))]
     (doseq [r results]
       (println (key r) ":" (val r))))))
--Chouser
Or even:
(defn test-shuffle-uniformity
  "Shuffles a 3-item collection n times, mapping each result to the number of
  times it's hit. If the distribution isn't uniform we've let bias into the
  random numbers or the algorithm. Typical results on my machine:
  (2 1 3) : 16611
  (3 2 1) : 16771
  (1 3 2) : 16707
  (1 2 3) : 16766
  (3 1 2) : 16555
  (2 3 1) : 16590"
  ([n] (test-shuffle-uniformity n shuffle))
  ([n shuffle]
   (let [coll [1 2 3]
         results (apply merge-with + {}
                        (take n (repeatedly (fn [] {(shuffle coll) 1}))))]
     (doseq [[k v] results]
       (println k ":" v)))))
It'd be nice to use the #(...) syntax instead of (fn [] ...), but that
won't work here since #({key val}) would try to call the map with no
arguments. You'd have to say #(hash-map key val) and (fn [] {key
val}) is arguably clearer.
Perhaps "repeatedly" could be made to take an optional argument n, but
again I don't know how often that would be useful.
Also note the destructuring in doseq to save a few keystrokes.
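As an aside, the merge-with + counting idiom (fold each outcome in as {outcome 1} and let + resolve collisions) has a close Java analogue in Map.merge. A small illustrative sketch, not from the thread:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class TallyDemo {
    // Count occurrences the same way (merge-with + results {result 1}) does:
    // if the key is absent, store 1; otherwise combine old and new with Integer::sum.
    static Map<String, Integer> tally(List<String> outcomes) {
        Map<String, Integer> counts = new HashMap<>();
        for (String outcome : outcomes) {
            counts.merge(outcome, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts =
            tally(List.of("(1 2 3)", "(2 1 3)", "(1 2 3)"));
        System.out.println(counts.get("(1 2 3)")); // 2
        System.out.println(counts.get("(2 1 3)")); // 1
    }
}
```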
I was able to get almost a 20% speed boost by using a Java array for
each item instead of a vector:
(defn sort-by-mem-keys
  "Like clojure/sort-by but only calls keyfn once on each item in coll."
  ([keyfn coll]
   (sort-by-mem-keys keyfn compare coll))
  ([keyfn #^java.util.Comparator comp coll]
   (let [key-vals (map #(to-array [(keyfn %) %]) coll)
         sorted-pairs (sort (fn [x y] (.compare comp (aget x 0) (aget y 0)))
                            key-vals)]
     (map #(aget % 1) sorted-pairs))))
Or you can rely on the fact that vectors are comparable themselves.
The first member is the most significant, so I'd think this would
produce acceptable results. It's more succinct and appears to double
the speed:
(defn sort-by-mem-keys
  "Like clojure/sort-by but only calls keyfn once on each item in coll."
  ([keyfn coll]
   (let [key-vals (map (fn [x] [(keyfn x) x]) coll)]
     (map second (sort key-vals)))))
It's still no match for shuffle-java, though.
--Chouser
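The sort-by-mem-keys trick above is the classic decorate-sort-undecorate pattern: compute each key exactly once, sort the (key, item) pairs, then strip the keys. A rough Java sketch of the same idea (the names here are mine, not from the thread):

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.function.Function;

public class SortByMemKeys {
    // Sort items by keyFn, calling keyFn exactly once per item:
    // decorate each item with its key, sort the pairs by key, undecorate.
    @SuppressWarnings("unchecked")
    static <T, K extends Comparable<K>> List<T> sortByMemKeys(
            Function<T, K> keyFn, List<T> coll) {
        List<Object[]> pairs = new ArrayList<>();
        for (T item : coll) {
            pairs.add(new Object[] { keyFn.apply(item), item }); // key computed once
        }
        pairs.sort(Comparator.comparing((Object[] p) -> (K) p[0]));
        List<T> result = new ArrayList<>();
        for (Object[] p : pairs) {
            result.add((T) p[1]);
        }
        return result;
    }

    public static void main(String[] args) {
        // Sort words by length; String::length runs once per word.
        List<String> sorted = sortByMemKeys(String::length,
                List.of("pear", "fig", "banana"));
        System.out.println(sorted); // [fig, pear, banana]
    }
}
```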
While I appreciate the intellectual exercise (and would point you to
http://okmij.org/ftp/Haskell/perfect-shuffle.txt if you haven't seen
it), one of the nice things about Clojure is that you aren't trapped
without mutability when you need it, and you need it here since there
is a known O(n) in-place implementation and you can't do better than
O(n log n) otherwise. Your implementation leveraging Java's
Collections/shuffle _is_ functional: it neither alters its arg nor has
any other side effects, and is the right way to implement this, since
Collections/shuffle has seen a lot of use/testing. See Clojure's sort,
for instance. It's ok to leverage Java's speed inside a functional
interface. For example, Clojure's persistent maps and vectors couldn't
be nearly as fast as they are if they were implemented purely
functionally.
Rich
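The known O(n) in-place algorithm Rich alludes to is the Fisher-Yates shuffle, which is also what java.util.Collections.shuffle implements. A minimal Java sketch, wrapped so the caller's collection is never mutated (copy first, shuffle the copy, return the copy):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

public class FisherYates {
    // O(n) in-place Fisher-Yates: walk from the last slot down, swapping each
    // slot with a uniformly chosen slot at or below it. With a fair rng this
    // yields each of the n! permutations with equal probability.
    static <T> void shuffleInPlace(List<T> list, Random rng) {
        for (int i = list.size() - 1; i > 0; i--) {
            int j = rng.nextInt(i + 1); // uniform in [0, i]
            T tmp = list.get(i);
            list.set(i, list.get(j));
            list.set(j, tmp);
        }
    }

    // Functional wrapper: copy the input, shuffle the copy, leave the arg alone.
    static <T> List<T> shuffled(List<T> coll, Random rng) {
        List<T> copy = new ArrayList<>(coll);
        shuffleInPlace(copy, rng);
        return copy;
    }

    public static void main(String[] args) {
        List<Integer> coll = List.of(1, 2, 3);
        List<Integer> result = shuffled(coll, new Random());
        // The result is some permutation of the input; the input is untouched.
        System.out.println(result.size() == 3 && result.containsAll(coll));
        System.out.println(coll);
    }
}
```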