partition-starting-every : yet another partition function

74 views
Skip to first unread message

Matt Smith

unread,
Sep 10, 2010, 4:26:44 PM9/10/10
to Clojure
problem: convert a collection [1 2 0 1 2 3 0 1 2 3 0 0 1 2] into
partitions like:
((1 2) (0 1 2 3) (0 1 2 3) (0) (0 1 2))
In this case, start each partition on a 0.


I looked at the various partition functions but none of them would do
the trick without adding unnecessary complexity. Instead I wrote a
new function based on partition-by:

Solution:
(defn partition-starting-every
"Partition the sequence starting each partition when the f is true."
[f coll]
(if-let [coll (seq coll)]
(let [p (cons (first coll) (take-while (complement f) (rest
coll)))]
(lazy-seq (cons p (partition-starting-every f (drop (count p)
coll)))))))

user=>(partition-starting-every zero? [1 2 0 1 2 3 0 1 2 3 0 0 1 2])
((1 2) (0 1 2 3) (0 1 2 3) (0) (0 1 2))

Questions:
1 - Is there a simpler way to do this using existing partition
functions?
2 - If not, is this something people are interested in having
contributed?

In looking at the partition functions they are very similar. Maybe
there is a why to combine them in to a single function.

Gijs S.

unread,
Sep 15, 2010, 2:41:03 PM9/15/10
to Clojure
This answers neither question 1 nor 2, but there is a function from
the map, filter and reduce family of higher order functions that is
applicable to this problem. The function is called unfold and is the
opposite of reduce (see http://en.wikipedia.org/wiki/Anamorphism).

"An unfold builds a (potentially infinite) list from a seed value.
Typically, the unfold takes a seed value x, a one-place operation
unspool that yields a pairs of such items, and a predicate finished
which determines when to finish the list (if ever)."

I could not find unfold or an equivalent function in clojure or
clojure.contrib, so here it is:

(defn unfold [unspool finished x]
(lazy-seq
(when-not (finished x)
(let [[a y] (unspool x)]
(cons a (unfold unspool finished y))))))

;; the example from the wikipedia page:
(defn iterate-unfold [f x]
(unfold (fn [a] [a (f a)]) (fn [_] false) x))

=> (take 10 (iterate-unfold inc 0))
(0 1 2 3 4 5 6 7 8 9)

The idea is to pass an unspool function that splits a collection at
every occurrence of an item for which (pred item) is true. The unspool
function in partition-starting-every produces a vector of the
partition "cut off" of the collection and the rest of the collection
to be processed.

(defn partition-starting-every [f coll]
(unfold (fn [[head & tail]]
(let [[run more] (split-with (complement f) tail)]
[(cons head run) more])) empty? coll))

=> (partition-starting-every zero? [1 2 0 1 2 3 0 1 2 3 0 0 1 2])
((1 2) (0 1 2 3) (0 1 2 3) (0) (0 1 2))

The reason that I suggest the unfold function is because it can also
be used for other partitioning functions:

(defn partition-all-unfold [n step coll]
(unfold (fn [c]
[(take n c) (drop step c)]) empty? coll))

=> (partition-all 3 2 (range 0 10))
((0 1 2) (2 3 4) (4 5 6) (6 7 8) (8 9))
=> (partition-all-unfold 3 2 (range 0 10))
((0 1 2) (2 3 4) (4 5 6) (6 7 8) (8 9))

Hope this helps,
-Gijs

Nicolas Oury

unread,
Sep 16, 2010, 10:17:58 AM9/16/10
to clo...@googlegroups.com
I think that unfold (or co-reduce, or generate) should find its way in contrib.

I am not sure we need finished arg though. The traditional finish in
the seq family is nil.

My own version of unfold:

(defn unfold
"(unfold seed grow) use the seed and a function grow that returns an
element and another seed
to creates a lazy seq.
The seq is stopped the grow function returns nil."
[seed grow]
(if-let [[elt next-seed] (grow seed)]
(cons elt
(lazy-seq (unfold next-seed grow)))
()))

Whether the cons is in or outside the lazy-seq is debatable. (I like
to think of seq as the fixpoint of the functor (X -> Cons Object (lazy
X)))

Best,

Nicolas.

Chouser

unread,
Sep 16, 2010, 4:45:42 PM9/16/10
to clo...@googlegroups.com
On Fri, Sep 10, 2010 at 4:26 PM, Matt Smith <m0s...@gmail.com> wrote:
> problem: convert a collection [1 2 0 1 2 3 0 1 2 3 0 0 1 2] into
> partitions like:
> ((1 2) (0 1 2 3) (0 1 2 3) (0) (0 1 2))
> In this case, start each partition on a 0.
>
>
> I looked at the various partition functions but none of them would do
> the trick without adding unnecessary complexity.  Instead  I wrote a
> new function based on partition-by:
>
> Solution:
> (defn partition-starting-every
>  "Partition the sequence starting each partition when the f is true."
>  [f coll]
>  (if-let [coll (seq coll)]
>    (let [p (cons (first coll) (take-while (complement f) (rest
> coll)))]
>      (lazy-seq (cons p (partition-starting-every f (drop (count p)
> coll)))))))
>
> user=>(partition-starting-every zero? [1 2 0 1 2 3 0 1 2 3 0 0 1 2])
> ((1 2) (0 1 2 3) (0 1 2 3) (0) (0 1 2))

Iteration at a rate other than once per input seq item suggests iterate:

(defn partition-starting-every [f coll]
(->> [nil coll]
(iterate (fn [[p [x :as s1]]]
(let [[n s2] (split-with (complement f) (next s1))]
(when (seq s1)
[(cons x n) s2]))))
(map first)
next
(take-while identity)))

Ugh. Well, maybe partition-by can be used after all:

(defn partition-starting-every [f coll]
(let [pb (partition-by #(and (f %) (Object.)) coll)]
(->> (map (fn [[a :as as] [b :as bs]]
(cond
(nil? as) (when-not (f b) bs)
(and (f a) (f b)) as
(f a) (concat as bs)))
(cons nil pb) pb)
(remove nil?))))

Bleh. Looks lazy-seq is the way to go. :-)

BTW, it's generally best to have the lazy-seq outside the empty
test to make your fn as lazy as possible. And I'd use split-with:

(defn partition-starting-every
"Partition the sequence starting each partition when the f is true."
[f coll]

(lazy-seq
(when-let [[x & xs] (seq coll)]
(let [[a b] (split-with (complement f) xs)]
(cons (cons x a) (partition-starting-every f b))))))

--Chouser
http://joyofclojure.com/

Christophe Grand

unread,
Sep 17, 2010, 1:24:25 AM9/17/10
to clojure
On Thu, Sep 16, 2010 at 10:45 PM, Chouser <cho...@gmail.com> wrote:
On Fri, Sep 10, 2010 at 4:26 PM, Matt  Smith <m0s...@gmail.com> wrote:
> problem: convert a collection [1 2 0 1 2 3 0 1 2 3 0 0 1 2] into
> partitions like:
> ((1 2) (0 1 2 3) (0 1 2 3) (0) (0 1 2))
> In this case, start each partition on a 0.

Iteration at a rate other than once per input seq item suggests iterate:

(defn partition-starting-every [f coll]
 (->> [nil coll]
      (iterate (fn [[p [x :as s1]]]
                 (let [[n s2] (split-with (complement f) (next s1))]
                   (when (seq s1)
                     [(cons x n) s2]))))
      (map first)
      next
      (take-while identity)))

Ugh.  Well, maybe partition-by can be used after all:

(defn partition-starting-every [f coll]
 (let [pb (partition-by #(and (f %) (Object.)) coll)]
   (->> (map (fn [[a :as as] [b :as bs]]
               (cond
                 (nil? as) (when-not (f b) bs)
                 (and (f a) (f b)) as
                 (f a) (concat as bs)))
             (cons nil pb) pb)
        (remove nil?))))


Nice trick that your #(and (f %) (Object.)) however #(when (f %) (Object.)) would be better because it would produce only one false value (nil) instead of two (nil and false).

For comprehensiveness, let's not forget reductions :-)

(defn partition-starting-every
 "Partition the sequence starting each partition when the f is true."
 [f coll]
 (map #(map first %)
   (partition-by second
     (map vector coll
       (rest (reductions #(if (f %2) (inc %1) %1) 0 coll))))))

 Christophe


--
Professional: http://cgrand.net/ (fr)
On Clojure: http://clj-me.cgrand.net/ (en)

Thomas

unread,
Sep 17, 2010, 2:34:45 AM9/17/10
to Clojure
I agree with number 2: it would be very nice to have this in contrib.
I needed it last month and rolled my own (less clean) version.

Thomas

Gijs S.

unread,
Sep 16, 2010, 3:05:52 PM9/16/10
to Clojure
Finished is a predicate which designates when the seed is exhausted.
Because seed is not necessary a sequence, finished is not always
empty?.



For instance:



=> (unfold (fn [x] [(* x x) (inc x)]) #(> % 10) 0)

(0 1 4 9 16 25 36 49 64 81 100)



Or the zipmap (zip2) example from the wikipedia page.



Although the first example wanders back into territory where the
existing sequence functions such as iterate, take-while and for would
suffice.

-Gijs

Nicolas Oury

unread,
Sep 17, 2010, 1:49:01 PM9/17/10
to clo...@googlegroups.com
I was just saying that not returning something that is a pair, for
example nil, is good enough.

(unfold (fn [x] (when (<= x 10) [(* x x) (inc x)])) would work.

Both function can be written with each other anyway.

And they don't have the same number of args so they are compatible
with each other.

> --
> You received this message because you are subscribed to the Google
> Groups "Clojure" group.
> To post to this group, send email to clo...@googlegroups.com
> Note that posts from new members are moderated - please be patient with your first post.
> To unsubscribe from this group, send email to
> clojure+u...@googlegroups.com
> For more options, visit this group at
> http://groups.google.com/group/clojure?hl=en

--
Sent from an IBM Model M, 15 August 1989.

Nicolas Oury

unread,
Sep 17, 2010, 1:58:18 PM9/17/10
to clo...@googlegroups.com
(defn unfold
([grow seed]
(lazy-seq

(if-let [[elt next-seed] (grow seed)]
(cons elt (unfold grow next-seed)))))
([grow finished? seed]
(unfold #(when (not (finished? %)) (grow %)) seed)))

(unfold (fn [x] [(* x x) (inc x)]) #(> % 10) 0)
(0 1 4 9 16 25 36 49 64 81 100)

(unfold (fn [x] (when (<= x 10) [(* x x) (inc x)])) 0)


(0 1 4 9 16 25 36 49 64 81 100)

I think it can be proved that any sequence can be build with a call to unfold.
Which makes it useful and solves the
"why reduce is not lazy?" question.

gary ng

unread,
Sep 17, 2010, 9:12:00 PM9/17/10
to clo...@googlegroups.com
On Fri, Sep 17, 2010 at 10:49 AM, Nicolas Oury <nicola...@gmail.com> wrote:
> I was just saying that not returning something that is a pair, for
> example nil, is good enough.

The implement is equivalent, most languages I know that has unfold use
your approach(i.e. return Maybe<e,s> or None).

This link use the unfold p f g form

http://webcache.googleusercontent.com/search?q=cache:ksRX1JVmxFgJ:www.comlab.ox.ac.uk/jeremy.gibbons/publications/unfold.ps.gz+unfold+p+f+g&cd=5&hl=en&ct=clnk&gl=ca

Reply all
Reply to author
Forward
0 new messages