Lazy recursive walk.

270 views
Skip to first unread message

Nicolas Buduroi

unread,
Jan 15, 2010, 3:21:21 PM1/15/10
to Clojure
Hi, I'm still not familiar with laziness and I'm trying to make a
function recursively walk arbitrary data structures to perform some
action on all strings. The non-lazy version is quite easy to do:

(use
'clojure.walk
'clojure.contrib.str-utils)

(defn recursive-string-walk [f form]
(walk #(if (string? %) (f %) (recursive-string-walk f %))
identity form))

But it blow up the stack quite rapidly, I've tried to inject some
laziness inside, but only came up with a slight improvement. It can
take forms that are a little more than two time as deep.

(defn recursive-string-walk [f form]
(walk #(cond
(string? %) (f %)
(seq? %) (lazy-seq (recursive-string-walk f %))
:default (recursive-string-walk f %))
identity form))

Is there a way too make a fully lazy version of this function?

Thanks

- budu

Sean Devlin

unread,
Jan 15, 2010, 3:25:24 PM1/15/10
to Clojure
Did you try wrapping everything w/ a call to lazy-seq?

Nicolas Buduroi

unread,
Jan 15, 2010, 3:44:47 PM1/15/10
to Clojure

On Jan 15, 3:25 pm, Sean Devlin <francoisdev...@gmail.com> wrote:
> Did you try wrapping everything w/ a call to lazy-seq?

Yes, it doesn't seem change anything.

Laurent PETIT

unread,
Jan 16, 2010, 7:33:08 PM1/16/10
to clo...@googlegroups.com
For the non lazy version , maybe using clojure.zip would help not blow
up the stack ?

(using clojure.zip/zip + a loop with recur on clojure.zip/next) ?

2010/1/15 Nicolas Buduroi <nbud...@gmail.com>:

> --
> You received this message because you are subscribed to the Google
> Groups "Clojure" group.
> To post to this group, send email to clo...@googlegroups.com
> Note that posts from new members are moderated - please be patient with your first post.
> To unsubscribe from this group, send email to
> clojure+u...@googlegroups.com
> For more options, visit this group at
> http://groups.google.com/group/clojure?hl=en
>

Tom Hicks

unread,
Jan 16, 2010, 9:31:10 PM1/16/10
to Clojure

I suspect that just wrapping everything in a call to lazy-seq cannot
work
in this case. In the implementation of walk, the branch for handling
seqs
contains a 'doall', which I think will realize the entire sub-sequence
at that point
(while holding the head of the sub-sequence in memory too),
defeating your attempts to make it lazy by "wrapping" the recursion.

Just my suspicion...I'm still trying to get a grasp on lazy sequences
myself.
cheers,
-tom

Tom Hicks

unread,
Jan 16, 2010, 9:45:45 PM1/16/10
to Clojure
Sorry, I forgot to ask: how rapid is "rapidly"?
Can you provide a simple example that rapidly blows the stack
so we can experiment with lazy solutions?
-tom

On Jan 15, 1:21 pm, Nicolas Buduroi <nbudu...@gmail.com> wrote:
> ....
> But it blow up the stack quite rapidly, ...
> ...
> - budu

Tom Hicks

unread,
Jan 16, 2010, 10:06:53 PM1/16/10
to Clojure
On Jan 15, 1:21 pm, Nicolas Buduroi <nbudu...@gmail.com> wrote:
> Hi, I'm still not familiar with laziness and I'm trying to make a
> function recursively walk arbitrary data structures to perform some
> action on all strings. The non-lazy version is quite easy to do:
>
> (use
>   'clojure.walk
>   'clojure.contrib.str-utils)
>
> (defn recursive-string-walk [f form]
>   (walk #(if (string? %) (f %) (recursive-string-walk f %))
>     identity form))
>
> - budu

One more thought (sorry, I'm having too much fun with this
problem)....
I notice that since walk does not traverse anything except collective
data types,
your recursive-string-walk will not call the function f on just a
string argument:

user=> (recursive-string-walk reverse "hello")
"hello"

This, of course, may be exactly the behavior that you desire, but a
version
which handles strings might be useful for a "string walking" function:

(defn rsw [f form]
(if (string? form)
(f form)
(walk (partial rsw f) identity form)
)
)

Unfortunately, this is also non-lazy and will probably still blow your
stack. :)
-tom

Tom Hicks

unread,
Jan 16, 2010, 11:21:00 PM1/16/10
to Clojure
On Jan 15, 1:21 pm, Nicolas Buduroi <nbudu...@gmail.com> wrote:
> Hi, I'm still not familiar with laziness and I'm trying to make a
> function recursively walk arbitrary data structures to perform some
> action on all strings.
> ...

> Is there a way too make a fully lazy version of this function?
>
> - budu

I'm trying to create a lazy version of walk. Here's my first attempt:

(defn lazy-walk
"Traverses form, an arbitrary data structure. inner and outer are
functions. Applies inner to each element of form, building up a
data structure of the same type, then applies outer to the result.
Recognizes all Clojure data structures except sorted-map-by."
[inner outer form]
(lazy-seq
(cond
(list? form) (outer (apply list (map inner form)))
(seq? form) (outer (map inner form))
(vector? form) (outer (vec (map inner form)))
(map? form) (outer (into (if (sorted? form) (sorted-map) {})
(map inner form)))
(set? form) (outer (into (if (sorted? form) (sorted-set) #{})
(map inner form)))
:else (outer form) ) )
)

user=> (def lw (lazy-walk #(if (string? %) (.toUpperCase %) %)
identity (interleave (repeat
10000 88) (repeat 10000 "abc"))))
user=> (type lw)
clojure.lang.LazySeq
user=> (take 20 lw)
(88 "ABC" 88 "ABC" 88 "ABC" 88 "ABC" 88 "ABC" 88 "ABC" 88 "ABC" 88
"ABC" 88 "ABC" 88 "ABC")

I'm still not quite sure how to test lazy functions for correct lazy
behavior, though.
regards,
-tom

Nicolas Buduroi

unread,
Jan 17, 2010, 6:35:25 PM1/17/10
to Clojure
> Sorry, I forgot to ask: how rapid is "rapidly"?

Oh, I'd say I misused that word, at least it's way more than I need
for what I use this for. I created this post only to see if someone
would have an idea for a fully lazy version out of curiosity. From my
experiments, the non-recursive version blow the stack around 486
recursions on my computer.

> Can you provide a simple example that rapidly blows the stack
> so we can experiment  with lazy solutions?

Here's what I used to test the depth of recursion:

(def pit (iterate list "bottom!"))

(defn test-walk [walker shallowest deepest]
(doseq [depth (range shallowest deepest)]
(pprint (walker #(str depth " reached " %)
(last (take depth pit))))))

Here's the recursive walks functions and a new one using your first
attempt at lazy-walk:

(defn recursive-string-walk [f form]
(walk #(if (string? %) (f %) (recursive-string-walk f %))
identity form))

(defn lazy-recursive-string-walk [f form]


(walk #(cond
(string? %) (f %)

(seq? %) (lazy-seq (lazy-recursive-string-walk f %))
:default (lazy-recursive-string-walk f %))
identity form))

(defn really-lazy-recursive-string-walk [f form]
(lazy-walk #(if (string? %) (f %) (really-lazy-recursive-string-walk
f %))
identity form))

And here's the results:

* (test-walk recursive-string-walk 450 500) -> ...(...("487 reached
bottom!")...); Evaluation aborted.
* (test-walk lazy-recursive-string-walk 800 900) -> ...(...("828
reached bottom!")...); Evaluation aborted.
* (test-walk really-lazy-recursive-string-walk 800 900) -> ...(...
("829 reached bottom!")...); Evaluation aborted.

So just one more, there's still lots of room for improvement ;-). For
whatever reason the lazy version isn't able to reach 1000 while
yesterday it was able to go slightly above that and the non-lazy
version didn't changed. The more I look at the code the more I don't
get why it's not fully lazy. I'll try with zippers as Laurent
suggested.

Thanks a lot!

- budu

Nicolas Buduroi

unread,
Jan 18, 2010, 7:06:51 PM1/18/10
to Clojure
On Jan 16, 7:33 pm, Laurent PETIT <laurent.pe...@gmail.com> wrote:
> For the non lazy version , maybe using clojure.zip would help not blow
> up the stack ?
>
> (using clojure.zip/zip + a loop with recur on clojure.zip/next) ?

I've just tried it and it appears to be equivalent to the lazy walk
version, which I think is fully lazy after all (not tested, see
below).

(defn lazy-recursive-string-walk-with-zipper [f form]
(loop [loc (z/seq-zip form)]
(if (z/end? loc)
(z/root loc)
(recur (z/next
(if (string? (z/node loc))
(zip/replace loc (f (z/node loc)))
loc))))))

The thing is, I should learn to read stack traces more carefully! The
StackOverflowError was coming from my test function.

(def pit (iterate list "bottom!"))

(defn test-walk [walker shallowest deepest]
(doseq [depth (range shallowest deepest)]
(pprint (walker #(str depth " reached " %)
(last (take depth pit))))))

It's the iterate function that was throwing the error, while used
directly it can generate a 1000 lists deep nested list, but when used
in the test-walk function it only reach 828. The stack overflow make
more sense now, yet the stack trace is not easy to decipher.

No message.
[Thrown class java.lang.StackOverflowError]

Restarts:
0: [ABORT] Return to SLIME's top level.

Backtrace:
0: clojure.lang.PersistentHashMap$BitmapIndexedNode.index
(PersistentHashMap.java:467)
1: clojure.lang.PersistentHashMap$BitmapIndexedNode.assoc
(PersistentHashMap.java:616)
2: clojure.lang.PersistentHashMap$TransientHashMap.doAssoc
(PersistentHashMap.java:222)
3: clojure.lang.ATransientMap.assoc(ATransientMap.java:64)
4: clojure.lang.PersistentHashMap.create(PersistentHashMap.java:79)
5: clojure.core$hash_map__4297.doInvoke(core.clj:279)
6: clojure.lang.RestFn.invoke(RestFn.java:426)
7: clojure.core$print_sequential__6823.invoke(core_print.clj:37)
8: clojure.core$fn__6908.invoke(core_print.clj:136)
9: clojure.lang.MultiFn.invoke(MultiFn.java:161)
10: clojure.core$pr_on__5416.invoke(core.clj:2336)
11: clojure.core$print_sequential__6823.invoke(core_print.clj:54)
12: clojure.core$fn__6908.invoke(core_print.clj:136)
13: clojure.lang.MultiFn.invoke(MultiFn.java:161)
14: clojure.core$pr_on__5416.invoke(core.clj:2336)

I'll look a little deeper at this error and keep this post updated. In
the meantime, has anyone got an idea to replace iterate for creating
mock nested lists to test recursive-string-walk?

P.S.: Finally learned to use zippers and they can be very useful, like
that concept a lot. Found it intimidating at first but I've just read
an excellent write-up about them today:

http://scienceblogs.com/goodmath/2010/01/zippers_making_functional_upda.php?

Thanks

- budu

Nicolas Buduroi

unread,
Jan 18, 2010, 9:14:03 PM1/18/10
to Clojure
I've found the real problem. The overflow was coming from recursion
between pr-on and print-method. It must be that way because it's not
really useful to print such deeply nested data structure anyway.

> http://scienceblogs.com/goodmath/2010/01/zippers_making_functional_up...
>
> Thanks
>
> - budu

Nicolas Buduroi

unread,
Jan 18, 2010, 9:31:35 PM1/18/10
to Clojure
I've now changed my test method to this.

(def pit (iterate list "bottom!"))

(defn get-bottom [p]
(loop [p (first p)]
(if (seq? p) (recur (first p)) p)))

(defn test-walk [walker shallowest deepest]
(doseq [depth (range shallowest deepest)]

(println (get-bottom


(walker #(str depth " reached " %)

(lazy-seq (last (take depth pit))))))))

And now everything make lot more sense. A classic case of good code
with bad test which make you search for some imaginary bug. So always
carefully read your stack traces! Here's some benchmarking I've done
with the above versions.

user> (time (test-walk lazy-recursive-string-walk-with-zipper 100000
100001))
100000 reached bottom!
"Elapsed time: 1368.337794 msecs"

user> (time (test-walk lazy-recursive-string-walk 100000 100001))
100000 reached bottom!
"Elapsed time: 233.80888 msecs"

user> (time (test-walk really-lazy-recursive-string-walk 100000
100001))
100000 reached bottom!
"Elapsed time: 223.631872 msecs"

The one with a zipper is the slowest, zippers being much more capable
than just walking data structure, so it's normal. The first lazy
version is more than five faster and the one using Tom Hicks lazy-walk
function is slightly faster.

- budu

Tom Hicks

unread,
Jan 18, 2010, 10:55:28 PM1/18/10
to Clojure
Hey, congratulations on finding and fixing the problem!

And thanks for the great link to the zipper explanation....zippers
were
at the top of my list to learn (seriously). Did you catch this other
useful
link (in the comments of the article you provided)?

http://okmij.org/ftp/Computation/Continuations.html#zipper

This guy has a wealth of great information and references
to related topics.
regards,
-tom

Nicolas Buduroi

unread,
Jan 19, 2010, 1:32:51 AM1/19/10
to Clojure
On Jan 18, 10:55 pm, Tom Hicks <hickstoh...@gmail.com> wrote:
> Hey, congratulations on finding and fixing the problem!
>
> And thanks for the great link to the zipper explanation....zippers
> were
> at the top of my list to learn (seriously). Did you catch this other
> useful
> link (in the comments of the article you provided)?
>
> http://okmij.org/ftp/Computation/Continuations.html#zipper
>
> This guy has a wealth of great information and references
> to related topics.

Yes, I already knew about Oleg's FTP site, a timeless resource about
lots of (strange) programming concept. Didn't visited for a long time.

- budu

Reply all
Reply to author
Forward
0 new messages