reduced-induced pain

已查看 377 次
跳至第一个未读帖子

Christophe Grand

未读,
2014年10月9日 06:33:312014/10/9
收件人 clojure-dev、Walter van der Laan
Hi,

Since it's introduction #'reduced has caused subtle bugs (I remember at least r/mapcat in clojure itself and others in custom reducers at work) because one has to think of always checking whether the return value of a reducing fn is a Reduced or not.

If you fail to do so, you may not spot the bug because it generally only appears in some compound calls because regular values need not to be unwrapped (if the return value was systematically wrapped in an Either – I'm not proposing that![1] – then forgetting about the dual nature of the return value would cause simple test cases to fail).

I haven't scrutinized all transducers, just looked at take and partition-by because Walter van der Laan put me on that scent and I found a bug related to reduced in each of them: https://gist.github.com/cgrand/5510fb70b91e69459446.

I find the reduced mechanism error-prone, and having it so central in transducers may yield to more bugs.

As a side note, in http://clj-me.cgrand.net/2014/10/08/these-arent-the-reducing-functions-you-are-looking-for/ where I implement transducers as process transformers instead of reducing fns transformers, take and partition-by doesn't suffer from those bugs despite me having derived them from their buggy counterparts.

So should I fill a ticket to fix take and partition-by (and evaluate others) or is it possible to reconsider or discuss the API of the abstraction that transducers are transforming? (eg reducing fn vs process (as I propose) vs...)

Thanks,

Christophe

--
On Clojure http://clj-me.cgrand.net/
Clojure Programming http://clojurebook.com
Training, Consulting & Contracting http://lambdanext.eu/

Alex Miller

未读,
2014年10月9日 09:34:242014/10/9
收件人 cloju...@googlegroups.com、Walter van der Laan
Thanks Christophe, this is a very good find. I logged here: http://dev.clojure.org/jira/browse/CLJ-1557

The alternative in your blog is a non-starter for the goals of transducers.

--
You received this message because you are subscribed to the Google Groups "Clojure Dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to clojure-dev...@googlegroups.com.
To post to this group, send email to cloju...@googlegroups.com.
Visit this group at http://groups.google.com/group/clojure-dev.
For more options, visit https://groups.google.com/d/optout.

Laurent PETIT

未读,
2014年10月9日 10:15:392014/10/9
收件人 cloju...@googlegroups.com、Walter van der Laan
Hello, 

If you have time, I'd be interested in knowing which goal(s) of transducers isn't reach by Christophe's proposal ?
--
Laurent Petit

Christophe Grand

未读,
2014年10月9日 10:16:552014/10/9
收件人 clojure-dev、Walter van der Laan
Thanks Alex for creating the ticket, I tweaked the decription.

However what are the goals of transducers (honest question)? I thought they were:
* context independence (using the same xform with reducibles, sequences and channels; and thus avoiding code duplication)
* efficiency (no intermediate allocation).

Reducing function as an API to stateful processes is appealing because familiar and deceptive... because familiar.
To me once you introduce state adn struct-evaluation the universality of fold[lr] seems weakened.

Christophe

On Thu, Oct 9, 2014 at 3:34 PM, Alex Miller <al...@puredanger.com> wrote:
non-starter for the goals of transducers




Alex Miller

未读,
2014年10月9日 11:46:022014/10/9
收件人 cloju...@googlegroups.com、Walter van der Laan
Efficiency by removing intermediate allocation.

Laurent PETIT

未读,
2014年10月9日 12:01:482014/10/9
收件人 cloju...@googlegroups.com、Walter van der Laan
2014-10-09 17:45 GMT+02:00 Alex Miller <al...@puredanger.com>:
Efficiency by removing intermediate allocation.


Sure, and this goal is met by Christophe's alternative API, where's there's no more intermediate allocation than with current implementation:

(defn transduce [xform f init coll]
  (let [vacc (volatile! init)
        p (fn 
            ([])
            ([x] (reduced? (vswap! vacc f x))))
        p (xform p)]
    (feed! p coll) ; transducers are now process -> process
    (p)
    (f (let [acc @vacc] (if (reduced? acc) @acc acc)))))

Where feed! is the iteration primitive – for exposition it can be implemented using reduce but it’s backwards: reduce should be implemented on top of feed!.

(defn feed!
 "Returns true if the process can't process more input."
 [p coll]
  (reduce #(and (p %2) (reduced true)) false coll))




--
Laurent Petit

Timothy Baldridge

未读,
2014年10月9日 12:08:432014/10/9
收件人 cloju...@googlegroups.com
The problem I had with Christophe's solution (although I hadn't gotten the time to reply to his blog post) is this: transducers are really only stateful some of the time. Partition and take are stateful, but filter, map, and something like "take-until" are not. So instead of being "mostly functional", Christophe's process solution ends up being "never functional". And actually it looks quite a bit like Rx. 

IMO, the concerns about reduced have merit, but I see it as a side-effect of trying to keep everything as functional as possible. On some platforms (like Python) it would be possible to signal end-of-reduction via throwing an exception since exceptions are free, but I don't think I'd want to try that on the JVM. 

Anyways, just my $0.02

Timothy
“One of the main causes of the fall of the Roman Empire was that–lacking zero–they had no way to indicate successful termination of their C programs.”
(Robert Firth)

Laurent PETIT

未读,
2014年10月9日 12:41:342014/10/9
收件人 cloju...@googlegroups.com
Christophe's map transducer implementation is not stateful. The only difference is it is not required to pass the accumulator around.
It would thus be safer, somehow, since it would then not be possible to alter the accumulator both processing-fns and transducers:

(defn map [f]
  (fn [p1]
    (fn
      ([] (p1))
      ([x] (p1 (f x))))))

Timothy Baldridge

未读,
2014年10月9日 16:34:162014/10/9
收件人 cloju...@googlegroups.com
Yes, I mis-spoke. They are not stateful, but the implementation now implies that every transducer will be side-effecting. Any function that says "takes a value and returns true if no more processing is to be done" implies that something side-effecting is done with that value. Contrast that with:

(transduce (map inc) conj (range 10)) 

Currently in Clojure 1.7, this code has no side-effects, and no stateful transducers. It's pure functional programming. True this isn't the case for take and partition,  but like I said it seems a bit odd to say that since a few things need state, everything has to be side-effecting.

Timothy

Laurent PETIT

未读,
2014年10月9日 17:01:332014/10/9
收件人 cloju...@googlegroups.com
Again, the side effect isn't in the transducer or the processing fn, but in transduce. And this could also probably be fought if we dig deeper  

There will be no additional side effect when applied with channels, for instance. 

Isn't a process stateful by nature? Why pretend to be functional there, wrap values for signaling conditions, pass accumulators down the call stack without touching them, and also do the same with seed values up the call stack ?

Maybe there's a concept of Processible waiting around the corner, like a Reducible, but without the seed and the accumulator. 
Those belong to transduce. 

Sharing my current understanding of long discussions with cgrand here, still digesting the information, sorry if I'm blatantly wrong or if this doesn't make sense. 

Interesting stuff :-)


--
Laurent Petit

Christophe Grand

未读,
2014年10月10日 03:59:102014/10/10
收件人 clojure-dev
On Thu, Oct 9, 2014 at 10:34 PM, Timothy Baldridge <tbald...@gmail.com> wrote:
(transduce (map inc) conj (range 10)) 

Currently in Clojure 1.7, this code has no side-effects, and no stateful transducers. It's pure functional programming. True this isn't the case for take and partition,  but like I said it seems a bit odd to say that since a few things need state, everything has to be side-effecting.

When you reason about something which may be stateful (namely a reducing fn) then you have to consider the widest case (stateful) not the narrowest case (pure) otherwise your code may fail to cover edge cases and will still pass your tests.

(The situation is similar with Reduced: when you reason about something which may be of two types (namely the return value of 2-arity of a reducing fn) then you have to consider the widest case (check whether you got a Reduced) not the narrowest case (plain unwrapped value) otherwise your code may fail to cover edge cases and will still pass your tests.)

Reducing fn as an api to stateful processes may be less error-prone with a linear type system.

(Transients which require linear updates too can't be used in lieu of persistents, so your code fail early. Imagine if transients used conj, assoc etc. instead of conj!, assoc! etc.)

Another thing about Reduced: you can't support primitive return values with this model.

About side effects in my proposal: they are encapsulated in the function applying xform (ie transduce, sequence, chan...) they don't leak to user code. Only transducers writers have to be aware of them, like they need to be aware (in 1.7) that the downstream reducing fn may be stateful and that its 2-arity may yield a Reduced any time.
回复全部
回复作者
转发
0 个新帖子