Eager (cat) in transducers

279 views

Skip to first unread message

Glen Mailer

unread,

Sep 21, 2014, 12:16:47 PM9/21/14

to clo...@googlegroups.com

While watching Rich's strangeloop talk, I noticed a slight oddity in the definition of mapcat.

I brought this up briefly in the IRC channel yesterday and the general consensus seemed to be that this is awkward, but not easily solvable:

The original lazy definition of (mapcat) uses (concat), and concat explicitly uses a lazy sequence:
https://github.com/clojure/clojure/blob/master/src/clj/clojure/core.clj#L679

The reducers defintion of (mapcat), like all reducers, explicitly uses (reduce)
https://github.com/clojure/clojure/blob/f3259f4f34a68eae7db7efc0be9d19fa5dafbd3c/src/clj/clojure/core/reducers.clj#L171

As I understand it, part of the goals of transducers is to unify the similar patterns seen in reducers, lazy-sequences and elsewhere.

However, transducers' (mapcat) is currently implemented via a new core function called (cat)
https://github.com/clojure/clojure/blob/master/src/clj/clojure/core.clj#L2648
https://github.com/clojure/clojure/blob/master/src/clj/clojure/core.clj#L7221

(cat) unfortunately contains an explicit call to (reduce), which makes the (map) part eager, here's an example of something that works with the previous mapcat, but not with the tranducers flavour:

(take 3 (mapcat repeat [1]))
; => (1 1 1)
(take 3 (sequence (mapcat repeat) [1]))
; => #<OutOfMemoryError java.lang.OutOfMemoryError: Java heap space>

Despite asking for a lazy transducer implementation, we get an eager step.

Is there some clever way to avoid this? I believe this applies generally to any transducer which calls the step function multiple times. Conceptually I think that transducer processor would have to wrap the step function in some way to make it act like a continuation to enforce laziness with the executing transducer?

Hopefully that all makes sense!

Alex Miller

unread,

Sep 22, 2014, 2:03:15 PM9/22/14

to clo...@googlegroups.com

This is an excellent question and the answer is: no, there is not a clever way (or a desire) to avoid this.

The internals of transducer functions are never lazy. Each "step" will be eagerly completed for each output element. This is a fundamental difference between the lazy sequence and transducer approaches.

Reply all

Reply to author

Forward

0 new messages