When I created
http://dev.clojure.org/jira/browse/CLJ-1049, I thought
there was a simple fix, adding a few trivial lines that had been
seemingly accidentally omitted. When I tried that, I realized it was
not so simple, as described above. I now realize it's even trickier
than I thought, and that CLJ-1049, if ever resolved, probably didn't
belong in 1.5.0 anyway. I'll create a wiki page sometime with my
notes, and plan to release a proof-of-concept library.
Above I mentioned my principle that a reduce f should always be
binary, and a reduce-kv f always ternary. I now realize a similar
principle should be enforced for reducer operators. The types of
arguments to reducer operators should not depend on whether reduce or
reduce-kv is used. For example, (r/map f coll) should always call f
with one argument, whether reduce'd or reduce-kv'd, and that argument
should have a consistent type (an example below will hopefully make
this more clear).
If this principle is not enforced, composability suffers. If a
function returns (r/map f coll), it must either 1) know whether the
result will be reduce'd or reduce-kv'd, or 2) construct an f which
handles both cases. When composing many operators, the choice of
reduce/reduce-kv can infect the entire chain. This can be cumbersome
even when the choice is made locally, since it is impossible to switch
between 'reduce modes' within a single chain.
What should (into [] (r/map identity {1 2 3 4})) be? From
clojure.core/map, one would expect [[1 2] [3 4]]. This is also the
1.5.0 behavior of r/map: the f passed to r/map is called with map entries.
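
A quick REPL check of the behavior just described may help. It uses only clojure.core and clojure.core.reducers; the second form passes key as f purely to show that f receives whole map entries.

```clojure
(require '[clojure.core.reducers :as r])

;; into uses plain reduce, so the map is traversed entry by entry
;; and f (here identity) sees each map entry.
(into [] (r/map identity {1 2 3 4}))
;; => [[1 2] [3 4]]

;; key only works on map entries, so this confirms what f receives.
(into [] (r/map key {1 2 3 4}))
;; => [1 3]
```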
Now, what should (into-kv {} (r/map identity {1 2 3 4})) be?
Naturally, one would expect {1 2 3 4}. But what is the r/map f called
with? If it's called with a map entry and returns a kv pair, this
seems to defeat the point of IKVReduce — avoiding overhead like this.
On the other hand, if it's called with the values only, then we have
an inconsistency — if reduce'd, it will be called with pairs, but if
reduce-kv'd, it will be called with just values.
One solution would be to have (into [] (r/map identity {1 2 3 4})) be
[2 4]. This is the solution I currently prefer. If you wanted to
reduce over a map's entries, you would need to seq the map and reduce
the seq. I _think_ this is what currently happens when you coll-reduce
a map — the default seq-reduce impl is used. Making this change would
mean providing a fast coll-reduce for maps, one which reduces over
only the values (by calling kvreduce and dropping the keys). This
avoids the special case where maps r/reduce with reduce-kv by default,
and makes the reduce semantics for maps match those of vectors. Of
course, this also makes r/map incompatible with clojure.core/map. But
if operators for reduce-kv are introduced, direct translation to seq
operators is lost anyway.
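
A minimal sketch of the values-only reduce described above, assuming a wrapper rather than a change to how maps themselves reduce. The name vals-reducible is hypothetical and not part of clojure.core.reducers; it only illustrates reducing through reduce-kv and dropping the keys.

```clojure
(require '[clojure.core.protocols :as p]
         '[clojure.core.reducers :as r])

;; Hypothetical helper (not a real reducers function): wraps a map so
;; that reducing it visits only the values, via reduce-kv.
(defn vals-reducible [m]
  (reify p/CollReduce
    (coll-reduce [_ f]
      ;; No init supplied: seed with (f), as the reducers library does.
      (reduce-kv (fn [acc _k v] (f acc v)) (f) m))
    (coll-reduce [_ f init]
      (reduce-kv (fn [acc _k v] (f acc v)) init m))))

;; With the wrapper in place, r/map's f sees one value at a time.
(into [] (r/map identity (vals-reducible {1 2 3 4})))
;; => [2 4]
```

Under the actual proposal no wrapper would be needed, since the map's own coll-reduce would behave this way.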
Another solution might be to just leave r/map as it is (returning
something that is only IReduce). Another operator, say r/map-v, could
be added, which acts as my proposed r/map. This would preserve
backwards compatibility.
However r/map etc. work, new operators could be introduced for
IKVReduce. E.g. (r/map-kv (fn [k v] v') kv-coll), which returns
something that is both IKVReduce and IReduce (the latter by reducing
only over the v'). Similarly for (r/map-k (fn [k] k')), r/filter-kv,
etc. These provide composable transformations for IKVReduce without
introducing inconsistency.
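
A rough sketch of what such a map-kv might look like, under some assumptions: the function here is a hypothetical standalone map-kv, not an existing reducers operator; the CollReduce protocol stands in for IReduce; and (reduce-kv assoc {} ...) stands in for the into-kv mentioned earlier.

```clojure
(require '[clojure.core.protocols :as p])

;; Hypothetical operator: f takes k and v and returns the new value v'.
;; The result can be reduced with reduce-kv (keys kept) or with plain
;; reduce (only the v' are seen).
(defn map-kv [f kv-coll]
  (reify
    p/IKVReduce
    (kv-reduce [_ rf init]
      (reduce-kv (fn [acc k v] (rf acc k (f k v))) init kv-coll))
    p/CollReduce
    (coll-reduce [this rf]
      (p/coll-reduce this rf (rf)))
    (coll-reduce [_ rf init]
      (reduce-kv (fn [acc k v] (rf acc (f k v))) init kv-coll))))

;; reduce-kv path: keys are preserved, values are transformed.
(reduce-kv assoc {} (map-kv (fn [k v] (+ k v)) {1 2 3 4}))
;; => {1 3, 3 7}

;; reduce path: only the transformed values are visited.
(into [] (map-kv (fn [k v] (+ k v)) {1 2 3 4}))
;; => [3 7]
```

In both modes f is called with the same two arguments, which is the consistency property argued for above.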
A few simple examples (using my proposed r/map):
https://www.refheap.com/paste/3dda2c4d59abbf57f31c53836