breaking early from a "tight loop"

544 views
Skip to first unread message

prhlava

unread,
Jun 12, 2009, 10:54:56 AM6/12/09
to Clojure

Hello,

I am writing a (crude) image classification which basically counts
pixels within a specific RGB interval and if the count is over certain
number, the image is considered
"interesting". (the app runs on a grid BTW)

The point is that once the count is reached, there is no point
processing the rest of the pixels (which saves time).

So the question is, how do I "break early" from sequence (or loop)
processing?

Only option I found so far, would be using the "while" macro - any
other (clojure) options?

Kind regards,

Vlad

CuppoJava

unread,
Jun 12, 2009, 11:56:07 AM6/12/09
to Clojure
Hi Vlad,
I would approach it like this, and make full use of Clojure's lazy
sequences:

(count (take-while belowCount (filter identity (map isInteresting
pixels))))

-Patrick

joshua-choi

unread,
Jun 12, 2009, 12:35:09 PM6/12/09
to Clojure
Is the function of the "filter identity" call to make (map
isInteresting pixels) a lazy sequence? I thought that the sequences
map returned were already lazy, but I could be mistaken.

J. McConnell

unread,
Jun 12, 2009, 12:56:21 PM6/12/09
to clo...@googlegroups.com
On Fri, Jun 12, 2009 at 12:35 PM, joshua-choi <rbys...@gmail.com> wrote:

Is the function of the "filter identity" call to make (map
isInteresting pixels) a lazy sequence? I thought that the sequences
map returned were already lazy, but I could be mistaken.

I believe the idea is that mapping isInteresting over pixels will produce a lazy sequence of true/false values, eg. (true true false ...) or, if it returns a measure of the interestingness of a pixel, (0.75 0.32 nil ...). Then, the filter on identity will strip out any* false/nil values, leaving only the interesting pixels to count.

HTH,

- J.

joshua-choi

unread,
Jun 12, 2009, 1:51:33 PM6/12/09
to Clojure
Oh! I see. Thanks for the explanation.

On Jun 12, 9:56 am, "J. McConnell" <jdo...@gmail.com> wrote:

prhlava

unread,
Jun 12, 2009, 3:20:42 PM6/12/09
to Clojure


Hello Patrick,
Interesting approach - I have not thought of this. Also, it looks that
I can get an array of pixels (as ints) from BufferedImage so it should
be fast.

Thank you for the pointer.

Kind regards,

Vlad

PS: Other option occured to me later - using loop/recur, but I like
the above...

CuppoJava

unread,
Jun 12, 2009, 6:50:27 PM6/12/09
to Clojure
I'm glad I was helpful.
You can try this approach first, and see if it meets your
requirements. If it's still too slow you can drop down to loop/recur
to eliminate the boxing overhead at the expense of some extra
verbosity.
-Patrick

James Reeves

unread,
Jun 12, 2009, 9:44:24 PM6/12/09
to Clojure
This particular piece of code doesn't look like it would work, unless
I've misunderstood what Vlad is asking. I think you'd want something
more like:

(defn interesting? [pixels c]
(>= c (count (take c (filter in-interval? pixels)))))

Where c is the number past which an image is considered "interesting".
However, if c is very large, then (take c ...) is going to result in a
large sequence. We'd probably want to define a function:

(defn count-more-than? [n xs]
(or (zero? n)
(if (seq xs)
(recur (dec n) (rest xs)))))

(defn interesting? [pixels c]
(count-more-than? c (filter in-interval? pixels)))

- James

prhlava

unread,
Jun 13, 2009, 8:22:37 AM6/13/09
to Clojure

Hello James,

Thank you for more examples.

> > (count (take-while belowCount (filter identity (map isInteresting
> > pixels))))
> This particular piece of code doesn't look like it would work, unless
> I've misunderstood what Vlad is asking. I think you'd want something
> more like:

If I understood the Patrick's code right, it works as:

1. The (map isInteresting ...) returns lazy sequence of true/false
values for each pixel (true for interesting pixel)
2. The (filter identity ...) returns a lazy sequence of only true
values from the above
3. The (take-while belowCount ...) will return items of lazy sequence
while the predicate "belowCount"
holds. This implies that the predicate would have to be stateful
(i.e. increment a counter
on each invocation and return false when this counter == count-I-
am-interested-in . So the
take-while "stops" once the count is reached
4. the result of (count ...) would be compared with count-I-am-
interested-in and if ==, the image is
"interesting"

> (defn interesting? [pixels c]
> (>= c (count (take c (filter in-interval? pixels)))))
>
> Where c is the number past which an image is considered "interesting".
> However, if c is very large, then (take c ...) is going to result in a
> large sequence.

The large sequence should not be a problem for this particular
application.

> We'd probably want to define a function:
>
> (defn count-more-than? [n xs]
> (or (zero? n)
> (if (seq xs)
> (recur (dec n) (rest xs)))))
>
> (defn interesting? [pixels c]
> (count-more-than? c (filter in-interval? pixels)))

Cheers, I will try to code different variants, see how fast they are,
+ report back at some point with the findings.

Kind regards,

Vlad

CuppoJava

unread,
Jun 13, 2009, 11:03:52 AM6/13/09
to Clojure
Hi Vlad,
I just realized that you don't need the take-while.

(take num coll) will directly retrieve up-to num elements out of coll.

Cheers
-Patrick

Wrexsoul

unread,
Jun 13, 2009, 2:02:35 PM6/13/09
to Clojure
On Jun 12, 9:44 pm, James Reeves <weavejes...@googlemail.com> wrote:
>   (defn count-more-than? [n xs]
>     (or (zero? n)
>         (if (seq xs)
>           (recur (dec n) (rest xs)))))
>
>   (defn interesting? [pixels c]
>     (count-more-than? c (filter in-interval? pixels)))

Nice, but

user=> (count-more-than? 0 ())
true

(defn count-more-than? [n xs]
(if (seq xs)
(or (zero? n)
(recur (dec n) (rest xs)))))

seems to work. :)

Emeka

unread,
Jun 13, 2009, 3:03:59 PM6/13/09
to clo...@googlegroups.com
kedu Wrexsoul


user=> (count-more-than? 0 ())
true

(defn count-more-than? [n xs]
 (if (not (seq xs))

   (or (zero? n)
     (recur (dec n) (rest xs)))))

I'm afraid your code didn't return true for me. Just look at your code, (seq '()) will always give nil, so your code will return nil and not true. To get true from your code I guess the (seq xs) should be changed to (not (seq xs)).

Or am I missing something?

Emeka

Wrexsoul

unread,
Jun 13, 2009, 3:10:25 PM6/13/09
to Clojure
On Jun 13, 3:03 pm, Emeka <emekami...@gmail.com> wrote:
> kedu Wrexsoul
>
> user=> (count-more-than? 0 ())
> true
>
> (defn count-more-than? [n xs]
>   (if (not (seq xs))
>    (or (zero? n)
>      (recur (dec n) (rest xs)))))
>
> I'm afraid your code didn't return true for me.

That's the idea. This version doesn't have the corner-case bug with
empty colls that the previous guy's did. :)

Laurent PETIT

unread,
Jun 13, 2009, 4:17:08 PM6/13/09
to clo...@googlegroups.com
Hi,

This thread is interesting because we have a solution that is
suggested : using map then filter then take then count, which will
result in creating, for at least num pixels, at worst the full number
of pixels, 2 conses, and 3 times walking the array.
What's more interesting to me is that the ultimate result is not to
get a transformation of the input preserving the structure (a
"collection" of pixels), but a reduced value.

So it really seems to me that the missing abstraction, here, is being
able to do a reduce over the list of pixels, and being able, from the
reduction function, to quit the reduction early.

If we had this function, then there would be no new cons, and just one
walk of the list, while still being very functional.

Here is how the enhanced version of reduce could work :

(defn interesting? [pixels c]
(reduce (fn [counter pixel] (if (in-interval? pixel) (inc counter) counter))
0
pixels
(partial < c)))

The new optional argument to reduce would be a function applied to the
accumulated value before each invocation of the reducing function, and
if it returns true, then reduce would quit and return the currently
accumulated value.
Another way to implement the new version of reduce would be by having
the 4th parameter be an explicit "early-return value", and let the "do
we return early ?" test be part of the reduction function ?

Do you see a value in implementing such a new function (or extension
to existing reduce) ?

Of course, if one does not want to implement the general problem
solving solution, a good old walk via 'loop seems appealing to this
exact problem (I presume walking over pixels would have to be as fast
as possible) :

(defn interesting? [pixels interest-threshold]
(loop [count 0 pixels (seq pixels)]
(cond
(>= count interest-threshold) true
pixels (recur (if (in-interval? (first pixels)) (inc count)
count) (next pixels))
:else false)))

OK, ok, this one is rather verbose :-)

Regards,

--
Laurent

2009/6/13 CuppoJava <patrick...@hotmail.com>:

CuppoJava

unread,
Jun 13, 2009, 4:34:57 PM6/13/09
to Clojure
Hi Laurent,
I like how you distilled the problem down to a reduce with an early-
exit.

I'm interested in what you said about my suggestion:

"using map then filter then take then count, which will
result in creating, for at least num pixels, at worst the full number
of pixels, 2 conses, and 3 times walking the array. "

map filter and take are all lazy. so doesn't the array only get
iterated through once? by count?

James Reeves

unread,
Jun 13, 2009, 4:39:26 PM6/13/09
to Clojure
On Jun 13, 9:17 pm, Laurent PETIT <laurent.pe...@gmail.com> wrote:
> This thread is interesting because we have a solution that is
> suggested : using map then filter then take then count, which will
> result in creating, for at least num pixels, at worst the full number
> of pixels, 2 conses, and 3 times walking the array.

Well, the map is redundant - it's sufficient to just use filter.
Second, because these are lazy sequences, we're only walking the array
once, and not storing any more data than we need to.

- James

Laurent PETIT

unread,
Jun 13, 2009, 4:39:51 PM6/13/09
to clo...@googlegroups.com
2009/6/13 CuppoJava <patrick...@hotmail.com>:

Hi,

Well, the array is iterated once by map, the new seq created by map is
iterated once by filter, and the new seq created by filter is iterated
once by count, so right, I should have written : 3 walks of seqs of
the size of the array.
While reduce would create no new seq, and one walk ?

James Reeves

unread,
Jun 13, 2009, 4:40:21 PM6/13/09
to Clojure
On Jun 13, 7:02 pm, Wrexsoul <d2387...@bsnow.net> wrote:
> Nice, but
>
> user=> (count-more-than? 0 ())
> true
>
> (defn count-more-than? [n xs]
>   (if (seq xs)
>     (or (zero? n)
>       (recur (dec n) (rest xs)))))

True enough. My code didn't take into account that edge case.

- James

James Reeves

unread,
Jun 13, 2009, 4:42:38 PM6/13/09
to Clojure
On Jun 13, 9:39 pm, Laurent PETIT <laurent.pe...@gmail.com> wrote:
> Well, the array is iterated once by map, the new seq created by map is
> iterated once by filter, and the new seq created by filter is iterated
> once by count, so right, I should have written : 3 walks of seqs of
> the size of the array.

The filter and map functions produce lazy seqs, so the sequence is
only walked once.

- James

Laurent PETIT

unread,
Jun 13, 2009, 4:57:48 PM6/13/09
to clo...@googlegroups.com
2009/6/13 James Reeves <weave...@googlemail.com>:
Well, isn't walking 3 different sequences 1 time almost equivalent (in
terms of computer work) to walking 3 times one sequence ?

And nevertheless, if I insist, it's not really because there *could*
be a performance problem (though I guess there will, but only
benchmark would tell), but rather because the essence of the problem
seems to be needing a simple reduction with early break capability,
and no more :-)


>
> - James
> >
>



--
Cordialement,

Laurent PETIT

James Reeves

unread,
Jun 13, 2009, 9:14:27 PM6/13/09
to Clojure
On Jun 13, 9:57 pm, Laurent PETIT <laurent.pe...@gmail.com> wrote:
> > The filter and map functions produce lazy seqs, so the sequence is
> > only walked once.
>
> Well, isn't walking 3 different sequences 1 time almost equivalent (in
> terms of computer work) to walking 3 times one sequence ?

Well, I guess there's the overhead of wrapping it in multiple seqs,
but the logic calculations will be the same regardless of which method
is used.

> And nevertheless, if I insist, it's not really because there *could*
> be a performance problem (though I guess there will, but only
> benchmark would tell), but rather because the essence of the problem
> seems to be needing a simple reduction with early break capability,
> and no more :-)

Well, I kinda disagree here. It seems to me that this:

(defn interesting? [pixels c]
(> c (count (take c (filter in-interval? pixels))))

Is more straightforward than:

(defn interesting? [pixels c]
(reduce (fn [counter pixel] (if (in-interval? pixel) (inc counter)
counter))
0
pixels
(partial < c)))

- James

CuppoJava

unread,
Jun 14, 2009, 1:45:28 AM6/14/09
to Clojure
I actually really do like the reduce with early exit abstraction.
Because ultimately, that's what the question is. It's a reduce with
optimization.

However, I feel that Laurence's reduce is a little too specific. The
early exit condition is very likely to *not* depend on the reduce
accumulator, but rather on some intermediate calculation in the reduce
function.

It is possible to write something like this. But some people might be
concerned with how idiomatic this is.

(my_reduce (fn [a b c] (if bla-bla (break))) initial_value collection)

where (break) exits the immediate surrounding reduce.

Laurent PETIT

unread,
Jun 14, 2009, 3:03:38 AM6/14/09
to clo...@googlegroups.com
Hi,

2009/6/14 CuppoJava <patrick...@hotmail.com>:
Yes, I suggested a similar alternative in my email also:
"Another way to implement the new version of reduce would be by having
the 4th parameter be an explicit "early-return value", and let the "do
we return early ?" test be part of the reduction function ?"

which is similar to what you say, while maybe a little bit more
general (in the sense it does not introduce a symbol), and also maybe
a little bit easier for implementing my_reduce :

(my_reduce (fn [a b c] (if bla-bla :early-break)) initial_value
collection :early-break)

Laurent PETIT

unread,
Jun 14, 2009, 3:08:34 AM6/14/09
to clo...@googlegroups.com
2009/6/14 James Reeves <weave...@googlemail.com>:
>
> On Jun 13, 9:57 pm, Laurent PETIT <laurent.pe...@gmail.com> wrote:
>> > The filter and map functions produce lazy seqs, so the sequence is
>> > only walked once.
>>
>> Well, isn't walking 3 different sequences 1 time almost equivalent (in
>> terms of computer work) to walking 3 times one sequence ?
>
> Well, I guess there's the overhead of wrapping it in multiple seqs,
> but the logic calculations will be the same regardless of which method
> is used.
>
>> And nevertheless, if I insist, it's not really because there *could*
>> be a performance problem (though I guess there will, but only
>> benchmark would tell), but rather because the essence of the problem
>> seems to be needing a simple reduction with early break capability,
>> and no more :-)
>
> Well, I kinda disagree here. It seems to me that this:
>
>  (defn interesting? [pixels c]
>    (> c (count (take c (filter in-interval? pixels))))
>

I can't but agree that it is less verbose.
But I maintain (and then will let you 'cause I'm on holidays from
right after this last email :-) that the initial problem calls for a
modified reduce :-)

Sudish Joseph

unread,
Jun 14, 2009, 4:00:33 AM6/14/09
to Clojure
On Jun 13, 4:17 pm, Laurent PETIT <laurent.pe...@gmail.com> wrote:
> So it really seems to me that the missing abstraction, here, is being
> able to do a reduce over the list of pixels, and being able, from the
> reduction function, to quit the reduction early.

A lazy right fold[1] allows short-circuiting, so here's one attempt:

(defn foldr
"Implements a lazy right fold of the function f on coll. f should be
a function of two arguments: the next element to be consumed, and a
function which returns the result of folding the rest of the list.
The reduction starts at the end of the list by calling f with the
last element and val."
[f val coll]
(if-let [s (seq coll)]
(f (first s) (fn [] (foldr f val (rest s))))
val))

(defn interesting? [pixels c]
(let [f (fn [pixel counter] (if (< pixel c) (inc (counter)) 0))]
(foldr f 0 pixels)))

(interesting? (iterate inc 0) 10)
=> 10

The need to exlictly call the lazy thunk passed as the second argument
feels a little weird, but it does give us a lazy "right reduce" that
can short-circuit the reduction when it's done.

Here's a lazy filter function implemented using foldr:

(defn foldr-filter [pred coll]
(let [f (fn [x accum]
(if (pred x)
(cons x (accum))
nil))]
(foldr f nil coll)))

(take 20 (foldr-filter (fn [x] (< x 10)) (iterate inc 0)))
=> (0 1 2 3 4 5 6 7 8 9)

Cheers,
-Sudish

[1] http://en.wikipedia.org/wiki/Fold_%28higher-order_function%29#Evaluation_order_considerations

Sudish Joseph

unread,
Jun 14, 2009, 4:32:58 AM6/14/09
to Clojure
On Jun 13, 4:39 pm, Laurent PETIT <laurent.pe...@gmail.com> wrote:
> Hi,
>
> Well, the array is iterated once by map, the new seq created by map is
> iterated once by filter, and the new seq created by filter is iterated
> once by count, so right, I should have written : 3 walks of seqs of
> the size of the array.

Hi Laurent,

I'm probably misunderstanding what you mean by size of the array here,
but since all of this is lazy map will only walk as far as take-while
lets it, right? I.e., just far enough to make the take-while happy
and not the whole array.

(defn f [x]
(print (str "x: " x "\n"))
(inc x))

user> (count (take-while #(< % 10) (map f (iterate inc 0))))
x: 0
x: 1
x: 2
x: 3
x: 4
x: 5
x: 6
x: 7
x: 8
x: 9
9

> While reduce would create no new seq, and one walk ?

I guess it's a matter of perspective as to whether there's one walk or
multiple as the combined lazy-seq gets "pulled" by count -- I tend to
think of it as an assembly line with multiple stations on it (one,
lengthy "walk" :-).

Yes, like you said, there's double the consing (or triple with a
filter as in the original example). However, isn't the JVM supposed to
excel at allocating/collecting just this kind of very short-lived data/
garbage?

The (count (take-while (filter (map)))) just feels very natural in
clojure to me.

-Sudish

Max Suica

unread,
Jun 14, 2009, 5:04:26 AM6/14/09
to Clojure
Laurent, I think this is a close variant of the early exiting reduce
function you proposed:

(defn reduce-bail [f pred? val col]
(let [s (seq col)]
(if s
(if (pred? val)
(recur f pred? (f val (first s)) (next s))
val)
val)))

;Then we can write:

(defn interesting? [pixels in-range? count]
(let [p-count (reduce-bail (fn [c p] (if (in-range? p) (inc c) c))
(partial > count) 0 pixels)]
[( = count yummy-pix-count) p-count]))

; This returns a vec showing how close to the interesting count the
pixels came in addition to showing whether the sequence was
interesting, so that we see that it works.

; Now, in completely terrible form precipitated by the fact that it's
4:20 in the morning we try:

(interesting? (let [r (take 1000 (repeatedly #(rand-int 10)))]
(println (count (filter #(= % 5) r ))) r) #(= % 5) 93)

; This bails according to the number of 5s counted, but also side
effects the printing of a complete count to show that it works.

(interesting? (repeatedly rand) #(> % 0.99) 1000)

; Here, we see that our cautious reduce does not choke on infinite
sequences if it can escape. It will not halt for uninteresting
sequences however, so this isn't useful here.

Way too tired right now to think about whether this is useful in
general (Maybe in cases where you're looking to bail at some value
more complicated than a count), but the code runs pretty neat, and I
agree with Laurent that we should explore this abstraction. Dunno if
there are hidden bugs in my code tho. Anyways.

Sleepytime
- max

Max Suica

unread,
Jun 14, 2009, 5:15:13 AM6/14/09
to Clojure
On Jun 14, 5:04 am, Max Suica <Max.Su...@gmail.com> wrote:
> (defn interesting? [pixels in-range? count]
>   (let [p-count (reduce-bail (fn [c p] (if (in-range? p) (inc c) c))
> (partial > count) 0 pixels)]
>   [( = count yummy-pix-count) p-count]))

Shoot: s/[( = count yummy-pix-count) p-count]))/[( = count p-count) p-
count]))

And wow, writing literate programs is not facilitated by this forum!
Here's the whole listing in copy/paste friendly form:

(defn reduce-bail [f pred? val col]
(let [s (seq col)]
(if s
(if (pred? val)
(recur f pred? (f val (first s)) (next s))
val)
val)))

(defn interesting? [pixels in-range? count]
(let [p-count (reduce-bail (fn [c p] (if (in-range? p) (inc c) c))
(partial > count) 0 pixels)]
[( = count p-count) p-count]))

(interesting? (let [r (take 1000 (repeatedly #(rand-int 10)))]
(println (count (filter #(= % 5) r ))) r) #(= % 5) 93)

Max Suica

unread,
Jun 14, 2009, 5:31:42 AM6/14/09
to Clojure

> A lazy right fold[1] allows short-circuiting, so here's one attempt:

Wow, that made my head explode.

Some points:

1) That's not exatly foldr, as (foldr + 0 [range 100]) ought to work
2) Foldr is not tail recursive nor can you really call an anamorphism
lazy
3) Never the less you did it, through some mind twisting continuation
passing style
4) That's fn awesome :D where did you learn it?

-max

Max Suica

unread,
Jun 14, 2009, 5:41:42 AM6/14/09
to Clojure
I'm sorry, folds are catamorphisms, while stuff like (repeatedly f) or
(repeat n x) are anamorphisms, and can certainly be lazy.

Sudish Joseph

unread,
Jun 14, 2009, 6:05:04 AM6/14/09
to Clojure
On Jun 14, 5:31 am, Max Suica <Max.Su...@gmail.com> wrote:
> > A lazy right fold[1] allows short-circuiting, so here's one attempt:
>
> Wow, that made my head explode.
>
> Some points:
>
> 1) That's not exatly foldr, as  (foldr + 0 [range 100]) ought to work

Agreed, having to write that as

(foldr (fn [x f-rest] (+ x (f-rest))) 0 (range 100))

shows that deficiency rather quickly.

> 2) Foldr is not tail recursive nor can you really call an anamorphism
> lazy

The function I posted wasn't tail recursive either. It called the
passed in f on the (delayed) result of the recursion before
returning. It's certainly lazy on the spine of the list, in that
evaluation is deferred until the caller needs a value, just like foldr
should be.

It's the inability to delay evaluation of arbitrary args (the
recursion, here) that keeps us from being able to write, say, (foldr +
0 [1..10]) and have it work.

> 3) Never the less you did it, through some mind twisting continuation
> passing style
> 4) That's fn awesome :D where did you learn it?

There're three ways to defer evaluation that I know of in Clojure:
delayed macro evaluation, force/delay, and what you see. The macro
approach doesn't really work, the force/delay looks even worse, so ...

-Sudish

Chouser

unread,
Jun 15, 2009, 9:55:59 AM6/15/09
to clo...@googlegroups.com
On Sun, Jun 14, 2009 at 4:00 AM, Sudish Joseph<sud...@gmail.com> wrote:
>
> On Jun 13, 4:17 pm, Laurent PETIT <laurent.pe...@gmail.com> wrote:
>> So it really seems to me that the missing abstraction, here, is being
>> able to do a reduce over the list of pixels, and being able, from the
>> reduction function, to quit the reduction early.
>
> A lazy right fold[1] allows short-circuiting, so here's one attempt:

Here's another way to do a sort of lazy reduce:

(use '[clojure.contrib.seq-utils :only [reductions]])

(some #(> % 10)
(reductions (fn [counter pixel] (if (< pixel 10) (inc counter) counter))
0 (iterate dec 15)))

--Chouser

Reply all
Reply to author
Forward
0 new messages