How is this code to create a stateful lazy sequence?

226 views
Skip to first unread message

Rob Nikander

unread,
Jul 22, 2017, 3:43:23 PM7/22/17
to Clojure
Hi,

Here's a function and a simple use of it. It works, but the code may not be good Clojure. I'm wondering how it might be better.

(defn distinct-by
  "Returns a sequence with no repeats of (f item). 
  A set is built up internally as the sequence is consumed, so don't use it on
  an infinite sequence or you will run out of memory."
  ([f coll]
    (letfn [(get-more [seen input-seq]
               (if (not (empty? input-seq))
                 (let [x (first input-seq)
                       fx (f x)]
                   (if (contains? seen fx)
                     (lazy-seq (get-more seen (rest input-seq)))
                     (lazy-seq (cons x (get-more (conj seen fx) (rest input-seq))))))))]
       (get-more #{} (seq coll)))))

(def xs (list {:n 1, :a 'a} {:n 2, :a 'b} {:n 3, :a 'a} {:n 4, :a 'c} {:n 5, :a 'b}))

(distinct-by :a xs)
=> ({:n 1, :a a} {:n 2, :a b} {:n 4, :a c})

Rob

Timothy Baldridge

unread,
Jul 22, 2017, 4:50:23 PM7/22/17
to clo...@googlegroups.com
If we think about what we're doing here is a stateful filter, then maybe we could leverage a few more core Clojure functions:


(defn distinct-by [f coll]
  (let [seen (atom #{})]
    (filter (fn [itm]
               (let [m (f itm)]
                  (when-not (@seen m)
                    (swap! seen conj m)))) 
             coll)))

Now it's true, we're using a atom and a bit of mutability where none existed before, but personally I think the intent is a bit clearer in this code. What we're saying is "filter out all results we've seen before". If this was production code I would probably replace the atom with a `volatile!` but that's just for (small) performance reasons.
                  

--
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clo...@googlegroups.com
Note that posts from new members are moderated - please be patient with your first post.
To unsubscribe from this group, send email to
clojure+unsubscribe@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
---
You received this message because you are subscribed to the Google Groups "Clojure" group.
To unsubscribe from this group and stop receiving emails from it, send an email to clojure+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.



--
“One of the main causes of the fall of the Roman Empire was that–lacking zero–they had no way to indicate successful termination of their C programs.”
(Robert Firth)

David Bürgin

unread,
Jul 22, 2017, 4:58:01 PM7/22/17
to clo...@googlegroups.com
Perhaps you want to study the implementation in Medley, those are always
very good quality:

https://github.com/weavejester/medley/blob/254989ed3de83c30ce0101d66c7ce1b6ee257b4d/src/medley/core.cljc#L173

David

Rob Nikander

unread,
Jul 23, 2017, 8:25:13 PM7/23/17
to Clojure
Yes, I find that much clearer too. Thanks!

Alex Baranosky

unread,
Aug 5, 2017, 6:58:08 AM8/5/17
to clo...@googlegroups.com
My version of distinct-by just took clojure.core/distinct and added a keyfn (and optional max-n). It's hard to claim this code is readable. I took this approach mostly because I knew it would be unlikely to be buggy, since it was a simple change or two from clojure.core/distinct

(defn distinct-by
  "Like distinct, but calls keyfn to determine distinctness, much like sort-by.
   Takes an optional max-n, indicating how many duplicates are acceptible."
  ([keyfn coll]
     (distinct-by keyfn 1 coll))
  ([keyfn max-n coll]
     (let [step (fn step [xs seen]
                  (lazy-seq
                   ((fn [[f :as xs] seen]
                      (when-let [s (seq xs)]
                        (if (>= (get seen (keyfn f) 0) max-n)
                          (recur (rest s) seen)
                          (cons f (step (rest s) (update-in seen [(keyfn f)] (fnil inc 0)))))))
                    xs seen)))]
       (step coll {}))))

Reply all
Reply to author
Forward
0 new messages