reduce-kv, reducer transforms, and reducef arity

tomoj

unread,

Sep 30, 2012, 6:30:30 PM9/30/12

to cloju...@googlegroups.com

Currently, the reducer transforms (r/map etc) do not support reduce-kv. Although rfn makes an effort to support both [ret v] and [ret k v] arities, it is wasted effort since the return values of reducer/folder do not implement IKVReduce.

I have a patch at http://dev.clojure.org/jira/browse/CLJ-1049 that fixes this by implementing IKVReduce for reducer/folder.

However, just implementing IKVReduce results in some strange behavior with r/map and r/mapcat. Because r/map calls `(f1 ret (f k v))`, f1 must be 2-ary even if reduce-kv is used. In my patch, I chose to change this to `(f1 ret k (f k v))`, since it is the only sensible way I've thought of to preserve the correct arity.

With mapcat, because the return values of the map function are reduced with r/reduce, if they are maps, then f1 must be 3-ary even if reduce (not reduce-kv) is used. In my patch, I directly use coll-reduce or kv-reduce as appropriate to avoid this problem.

When creating my patch, I used the principle that an r/reduce reducef should always be 2-ary, while a reduce-kv reducef should always be 3-ary. Auto-kv-reduce on maps is an exception, and results in the following seemingly strange behavior (with my patch):

user> (r/reduce assoc {} {:foo 1 :bar 2})
{:bar 2, :foo 1}
user> (r/reduce assoc {} (r/map str {:foo 1 :bar 2}))
;; Error! assoc called with 2 args
user> (reduce-kv assoc {} (r/map str {:foo 1 :bar 2}))
{:bar ":bar2", :foo ":foo1"}

It would seem more natural for me for r/reduce and reduce-kv to always require 2-ary and 3-ary reducefs, respectively. Why the special case for maps?

tomoj

unread,

Sep 30, 2012, 6:36:00 PM9/30/12

to cloju...@googlegroups.com

Note that the mapcat problem seems to be a bug in the current implementation (this is without my patch):

user> (clojure.core.protocols/coll-reduce (r/mapcat identity [{:foo 1 :bar 2} {:bar 3}]) assoc {})
{:bar 3, :foo 1}

An special case that r/reduce on maps means kv-reduce instead of coll-reduce seems strange to me, but not incorrect. Here, though, a 3-ary reducef is required even though coll-reduce is called directly.

tomoj

unread,

Oct 17, 2012, 2:01:55 AM10/17/12

to cloju...@googlegroups.com

Any thoughts on this? Setting aside the issue of why maps reduce with reduce-kv, are there any problems with the changes in http://dev.clojure.org/jira/browse/CLJ-1049 ? Feeling a bit nervous 1.5 will hit and reduce-kv will be hobbled (and r/mapcat broken?).

Tom Jack

unread,

Dec 10, 2012, 9:38:33 PM12/10/12

to cloju...@googlegroups.com

When I created http://dev.clojure.org/jira/browse/CLJ-1049, I thought
there was a simple fix, adding a few trivial lines that had been
seemingly accidentally omitted. When I tried that, I realized it was
not so simple, as described above. I now realize it's even trickier
than I thought, and that CLJ-1049, if ever resolved, probably didn't
belong in 1.5.0 anyway. I'll create a wiki page sometime with my
notes, and plan to release a proof-of-concept library.

Above I mentioned my principle that a reduce f should always be
binary, and a reduce-kv f always ternary. I now realize a similar
principle should be enforced for reducer operators. The types of
arguments to reducer operators should not depend on whether reduce or
reduce-kv is used. For example, (r/map f coll) should always call f
with one argument, whether reduce'd or reduce-kv'd, and that argument
should have a consistent type (an example below will hopefully make
this more clear).

If this principle is not enforced, composability suffers. If a
function returns (r/map f coll), it must either 1) know whether the
result will be reduce'd or reduce-kv'd, or 2) construct an f which
handles both cases. When composing many operators, the choice of
reduce/reduce-kv can infect the entire chain. This can be cumbersome
even when the choice is made locally, since it is impossible to switch
between 'reduce modes' within a single chain.

What should (into [] (r/map identity {1 2 3 4})) be? From
clojure.core/map, one would expect [[1 2] [3 4]]. This is also the
1.5.0 behavior of r/map. The r/map f is called with map entries.

Now, what should (into-kv {} (r/map identity {1 2 3 4})) be?
Naturally, one would expect {1 2 3 4}. But what is the r/map f called
with? If it's called with a map entry and returns a kv pair, this
seems to defeat the point of IKVReduce — avoiding overhead like this.
On the other hand, if it's called with the values only, then we have
an inconsistency — if reduce'd, it will be called with pairs, but if
reduce-kv'd, it will be called with just values.

One solution would be to have (into [] (r/map identity {1 2 3 4})) be
[2 4]. This is the solution I currently prefer. If you wanted to
reduce over a map's entries, you would need to seq the map and reduce
the seq. I _think_ this is what currently happens when you coll-reduce
a map — the default seq-reduce impl is used. Making this change would
mean providing a fast coll-reduce for maps, one which reduces over
only the values (by calling kvreduce and dropping the keys). This
avoids the special case where maps r/reduce with reduce-kv by default,
and makes the reduce semantics for maps match those of vectors. Of
course, this also makes r/map incompatible with clojure.core/map. But
if operators for reduce-kv are introduced, direct translation to seq
operators is lost anyway.

Another solution might be to just leave r/map as it is (returning
something that is only IReduce). Another operator, say r/map-v, could
be added, which acts as my proposed r/map. This would preserve
backwards compatibility.

However r/map etc work, new operators could be introduced for
IKVReduce. E.g. (r/map-kv (fn [k v] v') kv-coll), which returns
something that is both IKVReduce and IReduce (the latter by reducing
only over the v'). Similarly for (r/map-k (fn [k] k')), r/filter-kv,
etc. These provide composable transformations for IKVReduce without
introducing inconsistency.

A few simple examples (using my proposed r/map):
https://www.refheap.com/paste/3dda2c4d59abbf57f31c53836

Reply all

Reply to author

Forward