I started to look closer at Clojure after Rich Hickey's presentation
at QCon London in March, this is my first post. I spend my days
programming Java, but my journey as developer did actually start with
an ML programming course many years ago. It has been great fun to
re-discover the functional style of programming again! This list is a
great source of knowledge.
A colleague of mine wrote an article about how to convert a piece of
Java code to Groovy a few months ago. As an exercise I thought I
should do the same, but to Clojure of course. I'd really appreciate
some feedback on the solution. I hope you aren't too sick of these
kind of posts just yet :-)
The code orders a list of characters, first by frequency and then
alphabetically if two characters happen to have the same frequency.
A Java unit test:
private void doTestOrderByFreq(Orderer orderer) {
List<String> unordered = Arrays.asList("b", "d", "b", "b",
"a", "c", "a");
List<String> ordered = orderer.orderByFreq(unordered);
assertEquals(Arrays.asList("b", "a", "c", "d"), ordered);
}
A Java implementation:
public class JavaOrderer implements Orderer {
public List<String> orderByFreq(List<String> in) {
final Bag bag = new HashBag(in);
List<String> out = new ArrayList<String>(bag.uniqueSet());
Collections.sort(out, new Comparator<String>() {
public int compare(String a, String b) {
int freq = Integer.valueOf(
bag.getCount(b)).compareTo(bag.getCount(a)
);
return freq != 0 ? freq : a.compareTo(b);
}
});
return out;
}
}
My shot at a Clojure implementation, with inspiration from a previous
post in this group on a similar problem:
(ns step3.pnehm.clojure-orderer
(:gen-class
:name step3.pnehm.ClojureOrderer
:implements [pnehm.Orderer]
))
(defn count-words [coll]
(reduce #(merge-with + %1 {%2 1}) {} coll))
(defn cmpr [[val1 freq1] [val2 freq2]]
(let [freq (compare freq2 freq1)]
(if (not= freq 0) freq (compare val1 val2))))
(defn -orderByFreq [_ coll]
(if (empty? coll) () (keys (sort cmpr (count-words coll)))))
Orderer is a simple Java interface making it easy to call different
implementations from the JUnit test:
List<String> orderByFreq(List<String> in);
All kinds of feedback are welcome, both on style and the chosen solution.
Many thanks!
/Patrik
Nice to see you're not intimidated by gen-class. :-)
> (defn cmpr [[val1 freq1] [val2 freq2]]
> (let [freq (compare freq2 freq1)]
> (if (not= freq 0) freq (compare val1 val2))))
When testing for equality with zero, it's more idiomatic and
sometimes faster to use 'zero?', so perhaps:
(if-not (zero? freq) freq (compare val1 val2))
> (defn -orderByFreq [_ coll]
> (if (empty? coll) () (keys (sort cmpr (count-words coll)))))
Do you need to return an empty seq? If not, how about:
(when (seq coll)
(keys (sort cmpr (count-words coll))))
--Chouser
Chouser wrote:
>
>> (defn cmpr [[val1 freq1] [val2 freq2]]
>> (let [freq (compare freq2 freq1)]
>> (if (not= freq 0) freq (compare val1 val2))))
>
> When testing for equality with zero, it's more idiomatic and
> sometimes faster to use 'zero?', so perhaps:
>
> (if-not (zero? freq) freq (compare val1 val2))
>
Yes, that's better.
>> (defn -orderByFreq [_ coll]
>> (if (empty? coll) () (keys (sort cmpr (count-words coll)))))
>
> Do you need to return an empty seq? If not, how about:
>
> (when (seq coll)
> (keys (sort cmpr (count-words coll))))
I believe this is a place where Java and Clojure differs a bit. In
Java I'd expect an empty list back, but in Clojure I guess nil makes
more sense? Compare how Map.keySet() in Java returns an emply set for
an empty map, but (keys map) in Clojure returns nill for an empty map.
So, for interoperability with Java, I belive an empty seq is the way
to go, but in a Clojure-only implementation nil would be a better fit.
But in that case I guess I could just do:
(keys (sort cmpr (count-words coll)))
Benjy wrote:
>
> I've been playing along at home for awhile now, but this example hit
> close to home -- I've written variations on this code countless times
> in perl, and figured I'd take a stab at clojuring it while also taking
> advantage of clojure.contrib.seq-utils's very useful to the matter at
> hand group-by. I'd understand if the OP doesn't want to use this code
> due to dependency on non-core or if the variations on making it more
> general obscure the original intent...
It's great with an alternative solution, and the dependency to
clojure.contrib is not a problem.
[...]
> PS. To the original poster, the use of destructuring binds for the
> arguments to cmpr was rad -- it was something I overlooked when I was
> doing this in my head over dinner and I was imagining a bunch of
> inanity involving hash lookups and ifs for falling back to comparing
> the keys as you'd have to do in other languages.
Yes, it's nice. My first few iterations used (val e) and (key e) but
destruction works nice here I think.
>
>
> (use '[clojure.contrib.seq-utils :only (group-by)])
>
> ;; is there a better option for this?
> (defn false-if-zero [t] (if (= t 0) nil t))
>
> ;; the body of this fn should probably be a macro that takes
> ;; any number of comparisons and or-chain them correctly such that
> ;; ties cascade to the next comparison and obviates the need for
> ;; explicit calls to false-if-zero. Does it already exist?
> (defn cmpr [[key-a val-a] [key-b val-b]]
> (or (false-if-zero (compare val-b val-a))
> (compare key-a key-b)))
>
> (defn count-hash-vals [coll]
> ;; apply hash-map mapcat was the first thing
> ;; I found that worked here... better options ++welcome.
> (apply hash-map
> (mapcat #(let [[k v] %]
> (list k (count v)))
> coll)))
>
> (defn orderByFreq [coll]
> (or
> (keys (sort cmpr
> (count-hash-vals (group-by identity coll)) )))
> '())
> ;; I don't think this OR short circuiting is idiomatic in
> clojure, but it sure is handy for "empty list if nil" scenarios.
> ;; Of course, (empty-list-or-nil ...) reads nice enough...
Yes, as you write, it may not be super-pretty, but it is handy if you
need to guard against nil, as in this case.
Thank you all for your feedback!
/Patrik
[...]
> (defn count-hash-vals [coll]
> ;; apply hash-map mapcat was the first thing
> ;; I found that worked here... better options ++welcome.
> (apply hash-map
> (mapcat #(let [[k v] %]
> (list k (count v)))
> coll)))
>
> (defn orderByFreq [coll]
> (or
> (keys (sort cmpr
> (count-hash-vals (group-by identity coll)) ))) ;; as an
> exercise for the reader, use this call to implement (majority? ...) or
> (plurality? ...) etc.
> '()) ;; I don't think this OR short circuiting is idiomatic in
> clojure, but it sure is handy for "empty list if nil" scenarios.
> ;; Of course, (empty-list-or-nil ...) reads nice enough...
Here is a version without the apply/hash-map/mapcat that uses a plain
map instead. The function no longer returns a map, so the (keys map)
call has to be replaced as well:
(defn count-hash-vals [coll]
(map (fn [[k v]] [k (count v)]) coll))
(defn -orderByFreq [_ coll]
(or
(map first (sort cmpr
(count-hash-vals (group-by identity coll))))
'()))
/Patrik