From Java to Clojure

31 views
Skip to first unread message

Patrik Fredriksson

unread,
Jul 9, 2009, 7:31:13 AM7/9/09
to clo...@googlegroups.com
Hi!

I started to look closer at Clojure after Rich Hickey's presentation
at QCon London in March, this is my first post. I spend my days
programming Java, but my journey as developer did actually start with
an ML programming course many years ago. It has been great fun to
re-discover the functional style of programming again! This list is a
great source of knowledge.

A colleague of mine wrote an article about how to convert a piece of
Java code to Groovy a few months ago. As an exercise I thought I
should do the same, but to Clojure of course. I'd really appreciate
some feedback on the solution. I hope you aren't too sick of these
kind of posts just yet :-)

The code orders a list of characters, first by frequency and then
alphabetically if two characters happen to have the same frequency.

A Java unit test:

private void doTestOrderByFreq(Orderer orderer) {
List<String> unordered = Arrays.asList("b", "d", "b", "b",
"a", "c", "a");
List<String> ordered = orderer.orderByFreq(unordered);
assertEquals(Arrays.asList("b", "a", "c", "d"), ordered);
}

A Java implementation:

public class JavaOrderer implements Orderer {

public List<String> orderByFreq(List<String> in) {
final Bag bag = new HashBag(in);
List<String> out = new ArrayList<String>(bag.uniqueSet());

Collections.sort(out, new Comparator<String>() {
public int compare(String a, String b) {
int freq = Integer.valueOf(
bag.getCount(b)).compareTo(bag.getCount(a)
);
return freq != 0 ? freq : a.compareTo(b);
}
});

return out;
}
}

My shot at a Clojure implementation, with inspiration from a previous
post in this group on a similar problem:

(ns step3.pnehm.clojure-orderer
(:gen-class
:name step3.pnehm.ClojureOrderer
:implements [pnehm.Orderer]
))

(defn count-words [coll]
(reduce #(merge-with + %1 {%2 1}) {} coll))

(defn cmpr [[val1 freq1] [val2 freq2]]
(let [freq (compare freq2 freq1)]
(if (not= freq 0) freq (compare val1 val2))))

(defn -orderByFreq [_ coll]
(if (empty? coll) () (keys (sort cmpr (count-words coll)))))

Orderer is a simple Java interface making it easy to call different
implementations from the JUnit test:

List<String> orderByFreq(List<String> in);

All kinds of feedback are welcome, both on style and the chosen solution.

Many thanks!

/Patrik

Sean Devlin

unread,
Jul 9, 2009, 9:23:58 AM7/9/09
to Clojure
One question on design intent before feedback. Is your intent to have
this Clojure code called by Java code later?

Patrik Fredriksson

unread,
Jul 9, 2009, 10:16:17 AM7/9/09
to clo...@googlegroups.com
The idea is to gradually, over a few steps, move from a Java
implementation to a pure Clojure implementation. A this step the
Clojure implementation is called by the Java JUnit-test (see complete
test below). In the last step, the Java-references will be removed
from the Clojure implementation and the JUnit test replaced by Clojure
tests.

The complete code, including Java, Groovy and current Clojure
implementation and a micro-benchmark, can be found here:
http://code.google.com/p/pnehm-java-to-cool-language-x/source/browse

The complete JUnit test:
public class OrdererTestUsingJava extends TestCase {

private void doTestOrderByFreq(Orderer orderer) {
List<String> unordered = Arrays.asList("b", "d", "b", "b",
"a", "c", "a");
List<String> ordered = orderer.orderByFreq(unordered);
assertEquals(Arrays.asList("b", "a", "c", "d"), ordered);

unordered = Arrays.asList("b", "d", "d", "e", "b", "a", "c", "a");
ordered = orderer.orderByFreq(unordered);
assertEquals(Arrays.asList("a", "b", "d", "c", "e"), ordered);

unordered = new ArrayList<String>();
ordered = orderer.orderByFreq(unordered);
assertTrue(unordered.isEmpty());
assertTrue(ordered.isEmpty());
}

public void testJavaImpl() {
doTestOrderByFreq(new JavaOrderer());
}

public void testClojureImpl() {
doTestOrderByFreq(new step3.pnehm.ClojureOrderer());
}

public void testGroovyImpl() {
doTestOrderByFreq(new GroovyOrderer());
}

}

/Patrik

Sean Devlin

unread,
Jul 9, 2009, 11:11:03 AM7/9/09
to Clojure
Okay, since you DO call the Clojure code from Java, I like what you
did very much.

If you're running "edge" Clojure, and not 1.0, I'd recommend writing
the tests in Clojure next, using the clojure.test namespace.

Sean

Chouser

unread,
Jul 9, 2009, 1:13:43 PM7/9/09
to clo...@googlegroups.com
On Thu, Jul 9, 2009 at 7:31 AM, Patrik Fredriksson<patr...@gmail.com> wrote:
>
> My shot at a Clojure implementation, with inspiration from a previous
> post in this group on a similar problem:
>
> (ns step3.pnehm.clojure-orderer
>  (:gen-class
>    :name step3.pnehm.ClojureOrderer
>    :implements [pnehm.Orderer]
>    ))

Nice to see you're not intimidated by gen-class. :-)

> (defn cmpr [[val1 freq1] [val2 freq2]]
>  (let [freq (compare freq2 freq1)]
>    (if (not= freq 0) freq (compare val1 val2))))

When testing for equality with zero, it's more idiomatic and
sometimes faster to use 'zero?', so perhaps:

(if-not (zero? freq) freq (compare val1 val2))

> (defn -orderByFreq [_ coll]
>  (if (empty? coll) () (keys (sort cmpr (count-words coll)))))

Do you need to return an empty seq? If not, how about:

(when (seq coll)
(keys (sort cmpr (count-words coll))))

--Chouser

Benjamin Stewart

unread,
Jul 10, 2009, 9:04:13 AM7/10/09
to clo...@googlegroups.com
I've been playing along at home for awhile now, but this example hit
close to home -- I've written variations on this code countless times
in perl, and figured I'd take a stab at clojuring it while also taking
advantage of clojure.contrib.seq-utils's very useful to the matter at
hand group-by. I'd understand if the OP doesn't want to use this code
due to dependency on non-core or if the variations on making it more
general obscure the original intent...

Anyway, my style could use work and the code is littered with comments
getting in the way that ask all kinds of questions (I admit, I haven't
done my due diligence on finding out if any of these functions exist
beyond group-by), but hopefully this variation is helpful to someone
or there are answers to my various questions. I'd *love* feedback.

Thanks in advance,
--Benjy

PS. To the original poster, the use of destructuring binds for the
arguments to cmpr was rad -- it was something I overlooked when I was
doing this in my head over dinner and I was imagining a bunch of
inanity involving hash lookups and ifs for falling back to comparing
the keys as you'd have to do in other languages.


(use '[clojure.contrib.seq-utils :only (group-by)])

;; is there a better option for this?
(defn false-if-zero [t] (if (= t 0) nil t))

;; the body of this fn should probably be a macro that takes
;; any number of comparisons and or-chain them correctly such that
;; ties cascade to the next comparison and obviates the need for
;; explicit calls to false-if-zero. Does it already exist?
(defn cmpr [[key-a val-a] [key-b val-b]]
(or (false-if-zero (compare val-b val-a))
(compare key-a key-b)))

(defn count-hash-vals [coll]
;; apply hash-map mapcat was the first thing
;; I found that worked here... better options ++welcome.
(apply hash-map
(mapcat #(let [[k v] %]
(list k (count v)))
coll)))

(defn orderByFreq [coll]
(or
(keys (sort cmpr
(count-hash-vals (group-by identity coll)) ))) ;; as an
exercise for the reader, use this call to implement (majority? ...) or
(plurality? ...) etc.
'()) ;; I don't think this OR short circuiting is idiomatic in
clojure, but it sure is handy for "empty list if nil" scenarios.
;; Of course, (empty-list-or-nil ...) reads nice enough...

;; ;; If sort or compare don't do an implicit orcish maneuver
;; ;; or otherwise memoize pieces of this call, performance could
;; ;; suffer; I have not researched this yet.
;; ;; Also, not general: assumes val is countable; val may
;; ;; be comparable but not countable. The above isn't
;; ;; fully general either, of course, but it's better.

;; (defn cmpr [[key-a val-a] [key-b val-b]]
;; (or (false-if-zero (compare (count val-b) (count val-a)))
;; (compare key-a key-b)))

;; (defn orderByFreq [coll]
;; (keys (sort cmpr (group-by identity coll))))

Sean Devlin

unread,
Jul 10, 2009, 11:31:04 AM7/10/09
to Clojure
To quote Benjamin Stewart:

;; the body of this fn should probably be a macro that takes
;; any number of comparisons and or-chain them correctly such that
;; ties cascade to the next comparison and obviates the need for
;; explicit calls to false-if-zero. Does it already exist?

This could be done with a macro, but I think that it could be done
simply with closure. Here's my attempt. I'm going to define two
helper functions first

(defn <=>
"This creates a comparator that wraps the mapping fn f."
[f]
(fn [a b]
(compare (f a) (f b))))

(defn inv-<=>
"This creates an inverse comparator that wraps the mapping fn f."
[f]
(fn [a b]
(compare (f b) (f a))))

These functions take in a mapping function and return a comparator/
inverse comparator, respectively. Now, I'll define a chaining
function.

(defn chain-comp
"This takes a list of comparator functions, and
chains them together. It returns another comparator.
It behaves similar to comp, in that fns are applied
right to left."
[& comps]
(fn [a b]
(loop [remaining-comps (reverse comps)]
(let [iter-comp (first remaining-comps)
iter-result (iter-comp a b)]
(if (and (zero? iter-result) (next remaining-comps))
(recur (rest remaining-comps))
iter-result)))))

Now, to define some simple test data

(def test-tuples
[{:a 0 :b 1}
{:a 0 :b 3}
{:a 0 :b 2}
{:a 1 :b 2}])

Since the chain comp returns a comparator, we'll use sort, and not
sort-by.

user=> (sort (chain-comp (<=> :b) (<=> :a)) test-tuples)
({:a 0, :b 1} {:a 0, :b 2} {:a 0, :b 3} {:a 1, :b 2})

user=> (sort (chain-comp (inv-<=> :b) (<=> :a)) test-tuples)
({:a 0, :b 3} {:a 0, :b 2} {:a 0, :b 1} {:a 1, :b 2})

I don't quite like the use of <=> and inv-<=>, but it the best
solution I could come up with that:

* Works with a closure (frequent readers should notice a theme in my
posts...)
* Allows inverting a comparison
* Uses the minimum amount of comparisons

Thoughts & feedback welcome

Sean

Gist available here:
http://gist.github.com/144590

Patrik Fredriksson

unread,
Jul 10, 2009, 7:07:09 PM7/10/09
to clo...@googlegroups.com
Sean Devlin wrote:
> If you're running "edge" Clojure, and not 1.0, I'd recommend writing
> the tests in Clojure next, using the clojure.test namespace.
I'd like to keep it 1.0 for now, so the tests are in clojure.contrib.test-is.
But I'll make note, or add an alternative version for clojure.test.
It's nice to see the test support being added to core.

Chouser wrote:
>
>> (defn cmpr [[val1 freq1] [val2 freq2]]
>> (let [freq (compare freq2 freq1)]
>> (if (not= freq 0) freq (compare val1 val2))))
>
> When testing for equality with zero, it's more idiomatic and
> sometimes faster to use 'zero?', so perhaps:
>
> (if-not (zero? freq) freq (compare val1 val2))
>

Yes, that's better.

>> (defn -orderByFreq [_ coll]
>> (if (empty? coll) () (keys (sort cmpr (count-words coll)))))
>
> Do you need to return an empty seq? If not, how about:
>
> (when (seq coll)
> (keys (sort cmpr (count-words coll))))

I believe this is a place where Java and Clojure differs a bit. In
Java I'd expect an empty list back, but in Clojure I guess nil makes
more sense? Compare how Map.keySet() in Java returns an emply set for
an empty map, but (keys map) in Clojure returns nill for an empty map.

So, for interoperability with Java, I belive an empty seq is the way
to go, but in a Clojure-only implementation nil would be a better fit.
But in that case I guess I could just do:
(keys (sort cmpr (count-words coll)))

Benjy wrote:
>
> I've been playing along at home for awhile now, but this example hit
> close to home -- I've written variations on this code countless times
> in perl, and figured I'd take a stab at clojuring it while also taking
> advantage of clojure.contrib.seq-utils's very useful to the matter at
> hand group-by.  I'd understand if the OP doesn't want to use this code
> due to dependency on non-core or if the variations on making it more
> general obscure the original intent...

It's great with an alternative solution, and the dependency to
clojure.contrib is not a problem.

[...]

> PS.  To the original poster, the use of destructuring binds for the
> arguments to cmpr was rad -- it was something I overlooked when I was
> doing this in my head over dinner and I was imagining a bunch of
> inanity involving hash lookups and ifs for falling back to comparing
> the keys as you'd have to do in other languages.

Yes, it's nice. My first few iterations used (val e) and (key e) but
destruction works nice here I think.

>
>
> (use '[clojure.contrib.seq-utils :only (group-by)])
>
> ;; is there a better option for this?
> (defn false-if-zero [t]   (if (= t 0)  nil t))
>
> ;; the body of this fn should probably be a macro that takes
> ;; any number of comparisons and or-chain them correctly such that
> ;; ties cascade to the next comparison and obviates the need for
> ;; explicit calls to false-if-zero.  Does it already exist?
> (defn cmpr [[key-a val-a] [key-b val-b]]
>  (or (false-if-zero (compare val-b val-a))
>      (compare key-a key-b)))
>
> (defn count-hash-vals [coll]
> ;; apply hash-map mapcat was the first thing
> ;; I found that worked here... better options ++welcome.
>               (apply hash-map
>                      (mapcat #(let [[k v] %]
>                                 (list k (count v)))
>                              coll)))
>
> (defn orderByFreq [coll]
>  (or
>       (keys (sort cmpr
>              (count-hash-vals (group-by identity coll)) )))

>       '())
> ;; I don't think this OR short circuiting is idiomatic in
> clojure, but it sure is handy for "empty list if nil" scenarios.
>  ;; Of course, (empty-list-or-nil ...) reads nice enough...

Yes, as you write, it may not be super-pretty, but it is handy if you
need to guard against nil, as in this case.

Thank you all for your feedback!

/Patrik

Patrik Fredriksson

unread,
Jul 10, 2009, 7:37:46 PM7/10/09
to clo...@googlegroups.com
On Fri, Jul 10, 2009 at 3:04 PM, Benjamin Stewart<bste...@gmail.com> wrote

[...]

> (defn count-hash-vals [coll]
> ;; apply hash-map mapcat was the first thing
> ;; I found that worked here... better options ++welcome.
>               (apply hash-map
>                      (mapcat #(let [[k v] %]
>                                 (list k (count v)))
>                              coll)))
>
> (defn orderByFreq [coll]
>  (or
>       (keys (sort cmpr
>              (count-hash-vals (group-by identity coll)) ))) ;; as an
> exercise for the reader, use this call to implement (majority? ...) or
> (plurality? ...) etc.
>       '())  ;; I don't think this OR short circuiting is idiomatic in
> clojure, but it sure is handy for "empty list if nil" scenarios.
>             ;; Of course, (empty-list-or-nil ...) reads nice enough...

Here is a version without the apply/hash-map/mapcat that uses a plain
map instead. The function no longer returns a map, so the (keys map)
call has to be replaced as well:

(defn count-hash-vals [coll]
(map (fn [[k v]] [k (count v)]) coll))

(defn -orderByFreq [_ coll]
(or
(map first (sort cmpr
(count-hash-vals (group-by identity coll))))
'()))

/Patrik

Reply all
Reply to author
Forward
0 new messages