Iterators with recursive algorithms

730 views
Skip to first unread message

Greg Ward

unread,
Dec 6, 2012, 12:36:08 PM12/6/12
to golan...@googlegroups.com
Hi all --

a couple of Gophers have independently hit upon an elegant pattern for
iterators in Go. Say you define a FooList type, which might be

type FooList []Foo

or

type FooList struct {
list []Foo
}

...or whatever. Your users can't use range, but you want for loops to
be easy for them. The solution (credit for independent discovery to
Adam Crossland and Steve Blenkinsop) looks like

var list FooList = ...
iter := list.Iterator()
for foo, ok := iter(); ok; foo, ok := iter() {
// do stuff with foo
}

where iter is a "func() (Foo, bool)" -- it's a closure that remembers
where it left off in its traversal and returns ok=false when you're
done. The implementation of FooList.Iterator() is straightforward for
simple data structures with simple traversal algorithms: e.g. see

http://play.golang.org/p/I6GHIOmurv

However, I'm trying to apply this iterator pattern to a depth-first
search of a directed graph, and I'm stumped. The problem is that DFS
is best implemented recursively, which makes it awfully hard for that
iterator closure to just return a current value and then resume where
it left off. The iteration state is really the recursive call stack,
not a simple array index or two.

Suggestions? After spending several hours and many pieces of scrap
paper, I think I have figured out how to do a non-recursive DFS. Or at
least, I figured out enough to convince myself that recursive DFS is a
massive win. So I'd really rather not have to go through the painful
exercise of implementing non-recursive DFS.

Thanks --

Greg

Kyle Lemons

unread,
Dec 6, 2012, 12:49:08 PM12/6/12
to Greg Ward, golang-nuts
If you know the number of items in the list, you can create a buffered channel large enough to hold (pointers to) each item and then run the recursive algorithm, pushing them all into the channel.  Then "next" is a simple matter of popping the next item off the list.  It's basically a slightly easier-to-read version of passing a queue down through the recursion and having everything push onto it.  Since the caller isn't required to exhaust them, I would make sure the channel is sufficiently buffered so that you don't leak goroutines.

The other option, of course, is to use a stack and save the stack in the iterator closure, but as you say, it's a bit involved (mostly because you'd have to push more than just the node onto the stack, you'd also have to include some state with it).



--



Glenn Brown

unread,
Dec 6, 2012, 1:01:58 PM12/6/12
to Greg Ward, Glenn Brown, golan...@googlegroups.com
Iterator() can create an unbuffered channel of Foo, launch a goroutine to fill the channel, and return a closure that reads Foo's from the channel.

Also, Iterative DFS is not hard: you simply need a stack (LIFO) to record discovered edges at each node before exploring them, and that's trivially implemented using slices. If you found this hard, you should to do it as an exercise, IMHO. Also, it will be much faster than the channel-based approach.

--Glenn

roger peppe

unread,
Dec 6, 2012, 1:16:52 PM12/6/12
to Greg Ward, golang-nuts
> However, I'm trying to apply this iterator pattern to a depth-first
> search of a directed graph, and I'm stumped.

i'd invert the control flow and have the iterator call an
iterator function that you call it (as filepath.Walk does, for example).

it probably has about the same performance characteristics,
but will be much easier to write.

if the caller wants to traverse two such iterators concurrently,
they could use channels (admittedly with significant performance cost)
> --
>
>

Jan Mercl

unread,
Dec 6, 2012, 2:26:44 PM12/6/12
to Greg Ward, golang-nuts
On Thu, Dec 6, 2012 at 6:36 PM, Greg Ward <gr...@gerg.ca> wrote:
> a couple of Gophers have independently hit upon an elegant pattern for
> iterators in Go.

I have to say the obvious: `elegant` is _totally_ subjective.

> Say you define a FooList type, which might be
>
> type FooList []Foo
>
> or
>
> type FooList struct {
> list []Foo
> }
>
> ...or whatever. Your users can't use range, but you want for loops to
> be easy for them.

In the first case users _can_ use range. In the second they've been
intentionally disallowed to do that (assuming "user" lives in another
package). Isn't it just simpler to let the user do what he needs to
do?

> http://play.golang.org/p/I6GHIOmurv

What language is that written in?

-j

Francesc Campoy Flores

unread,
Dec 6, 2012, 3:23:02 PM12/6/12
to Greg Ward, golan...@googlegroups.com
I'd rather use a channel instead, the code looks simpler to me.

Modifying your code just a little bit:


Modifying it a little bit more


Cheers,



--





--
--
Francesc

Kyle Lemons

unread,
Dec 6, 2012, 3:57:30 PM12/6/12
to Francesc Campoy Flores, Greg Ward, golan...@googlegroups.com
The problem with this approach (as has been pointed out numerous times when iterators come up on the mailing list) is that you cannot break out of that loop without leaving a goroutine hanging around forever.

On Thu, Dec 6, 2012 at 3:23 PM, Francesc Campoy Flores <cam...@golang.org> wrote:
http://play.golang.org/p/K7K22Jwy5A

si guy

unread,
Dec 6, 2012, 4:18:40 PM12/6/12
to golan...@googlegroups.com
Sure you can, or are quit chans illegal now?
http://play.golang.org/p/X4M-S-APgk

si guy

unread,
Dec 6, 2012, 4:23:49 PM12/6/12
to golan...@googlegroups.com
My bad Kyle, I just realized I mis-interpreted your post, I somehow failed to realize you were talking about magical iterators. So used to doing things the easy way at this point.

-Simon Watt

David DENG

unread,
Dec 6, 2012, 7:01:52 PM12/6/12
to golan...@googlegroups.com, gr...@gerg.ca
Two alternative ways
1) Use a slice as a stack to do DFS
2) Recursively call iterator functions

two implementations have be shown here: http://play.golang.org/p/Vn1Eak-cDt

David

Francesc Campoy Flores

unread,
Dec 6, 2012, 11:38:10 PM12/6/12
to Kyle Lemons, Greg Ward, golan...@googlegroups.com
You can, but you have to do it explicitly

--
--
Francesc

Ryan Tarpine

unread,
Dec 7, 2012, 9:29:23 AM12/7/12
to golan...@googlegroups.com
Wow that is surprising and good to know. Maybe something should be added to the FAQ or Effective Go about it. Here's the bugtracker issue, for anyone interested: http://code.google.com/p/go/issues/detail?id=296 (it's labeled WontFix).

-Ryan

Greg Ward

unread,
Dec 7, 2012, 11:51:14 AM12/7/12
to Jan Mercl, golang-nuts
On 06 December 2012, Jan Mercl said:
> On Thu, Dec 6, 2012 at 6:36 PM, Greg Ward <gr...@gerg.ca> wrote:
> > a couple of Gophers have independently hit upon an elegant pattern for
> > iterators in Go.
>
> I have to say the obvious: `elegant` is _totally_ subjective.

Fair enough.

> > Say you define a FooList type, which might be
> >
> > type FooList []Foo
> >
> > or
> >
> > type FooList struct {
> > list []Foo
> > }
> >
> > ...or whatever. Your users can't use range, but you want for loops to
> > be easy for them.
>
> In the first case users _can_ use range. In the second they've been
> intentionally disallowed to do that (assuming "user" lives in another
> package). Isn't it just simpler to let the user do what he needs to
> do?

Not all data structures are as simple as a wrapper around a slice. I
would never in a zillion years expect the "range" operator to
recognize my particular implementation of a directed graph, nor for it
to iterate over that graph in topological order. Of *course* I have to
write my own DFS that yields a topo sort. I'm just trying to expose
that functionality to the rest of my app in a nice, clean way.

Currently I'm using a callback:

// Callback function to visit nodes from DFS(). Return a non-nil error
// to abort the traversal and make DFS() return that error. DFS()
// aborted this way does not report dependency cycles.
type DFSVisitor func(id int) error

// Perform a partial depth-first search of the graph, exploring
// ancestors of all nodes in 'start'. For each node visited, call
// visit() just as we're leaving that node -- i.e. calls to visit()
// are in topological order. visit() can abort the search; see
// DFSVisitor for details.
func (self *DAG) DFS(start NodeSet, visit DFSVisitor) error {
[...]
}

That works fine, but it leads to slightly inside-out code, e.g.

visit := func(id int) error {
[...do stuff with the node specified by id...]
}
err := self.dag.DFS(targets, visit)

The closure-as-iterator pattern appeals because it lets you write
something like

iter := dag.Iterator()
for id, ok := iter(); ok; id, ok := iter() {
[...do stuff with the node specified by id...]
}

which I find slightly more readable. I thought it would be worth
playing around to see if I could figure out a way to make this work
despite having a recursive iteration algorithm.

> > http://play.golang.org/p/I6GHIOmurv
>
> What language is that written in?

Very droll. For the record, I did not write that code, but I like the
way it lets the data structure hide its inner workings from callers,
but still lets callers use a normal "for" loop. Perhaps a callback
would be more idiomatic Go, but this strikes me as a nice use of the
"comma ok" idiom.

Greg

N. Riesco - GMail

unread,
Dec 7, 2012, 12:39:24 PM12/7/12
to golang-nuts
in terms of minimising boilerplate, one could say it reads slightly
better like this:

err := dag.DFS(targets, DFSVisitor {
[...do stuff with the node specified by id...]
return nil
})

It'd be even nicer if the Go compiler could infer the type of the
closure and one could write:

err := dag.DFS(targets, {
[...do stuff with the node specified by id...]
return nil
})


Nico

Steven Blenkinsop

unread,
Dec 7, 2012, 2:21:06 PM12/7/12
to N. Riesco - GMail, golang-nuts
One approach I stumbled upon a while ago was to sort of trampoline* the iterators, like this: http://play.golang.org/p/HG3R8Tx9mW

Doing this with other traversal orders is left as an exercise to the reader ;). The advantage this has over the approach that David DENG posted is that you don't have to retraverse the recursive path each time you call the iterator.

* I didn't know there was a name for it until after I "invented" it. Bummer.




--



Greg Ward

unread,
Dec 7, 2012, 3:24:10 PM12/7/12
to David DENG, golan...@googlegroups.com
On 06 December 2012, David DENG said:
> Two alternative ways
> 1) Use a slice as a stack to do DFS
> 2) Recursively call iterator functions
>
> two implementations have be shown here: http://play.golang.org/p/Vn1Eak-cDt

Nice. However, I'm working with a directed graph, and I need a
topological sort, ie. the graph equivalent of a postorder traversal.
(I also need to detect and report cycles -- I need a DAG, but cannot
assume it.) Your code does a preorder traversal of a binary tree,
which is much simpler.

Turns out that a non-recursive postorder traversal of a binary tree is
a bit tricky. I managed to work out a very old-fashioned looking
algorithm (gotos! labels! no loops!) that works. Then I remembered
about Google and the WWW and discovered this nice exposition:

http://www.leetcode.com/2010/10/binary-tree-post-order-traversal.html

Anyways, I think I've blown enough time on this very minor refactoring
for now. Thanks for your suggestions!

Greg

Greg Ward

unread,
Dec 7, 2012, 3:56:24 PM12/7/12
to Steven Blenkinsop, golang-nuts
On 07 December 2012, Steven Blenkinsop said:
> One approach I stumbled upon a while ago was to sort of trampoline* the
> iterators, like this: http://play.golang.org/p/HG3R8Tx9mW

Ooh, that is clever. At least I think it is... I won't understand it
until I work through it on paper. *sigh*

> Doing this with other traversal orders is left as an exercise to the reader
> ;). The advantage this has over the approach that David DENG posted is that
> you don't have to retraverse the recursive path each time you call the
> iterator.

Yeah, I noticed that too. Downsides to the trampoline approach:

* two function calls for each iteration, compared to one for my
current callback
* lots of short-lived closures: I wonder how that affects GC?
* makes my brain explode

So far I'm inclined to stick with the straightforward, obvious
implementation that I already have. "Dammit, Go, why won't you let me
be subtle, sneaky, tricky, and clever?" ;-)

Greg

Steven Blenkinsop

unread,
Dec 7, 2012, 5:00:28 PM12/7/12
to Greg Ward, golang-nuts
On Fri, Dec 7, 2012 at 3:56 PM, Greg Ward <gr...@gerg.ca> wrote:

Yeah, I noticed that too. Downsides to the trampoline approach:

  * two function calls for each iteration, compared to one for my
    current callback

You can, of course, skip the Iterator and just use the Stepper directly.
 
  * lots of short-lived closures: I wonder how that affects GC?

It's basically the same as recursively making a linked list stack, except that it can be a bit lazy, and you don't have to write a separate Pop function. You could cache the first step if you wanted to, but it would be invalidated as soon as you altered the tree. Here's a version that uses an slice in place of closures, and uses it directly rather than through an iterator:

Post-order traversal:
 
  * makes my brain explode
 
Maybe my stack version makes more sense? It's essentially logically equivalent. The function literal there is just the closure equivalent of append.

Steven Blenkinsop

unread,
Dec 7, 2012, 9:41:31 PM12/7/12
to Greg Ward, golang-nuts
Okay, now I'm just playing with it. This is more of a toy than anything, but I was trying to see if I could replicate the behaviour of the closure based version, but store values in a slice rather than closing over them. I took a more object oriented approach, since Popping can now be destructive. Because the function literals don't close over anything, they shouldn't allocate any memory. I designed the stack so it only gets as big as the height of the tree. I can also now push multiple trees onto one stack and, I think interestingly, I can make them take up only one element until you start popping nodes from them, like I did with inOrder in this example.

Caution: If the previous examples I posted made your head hurt, this'll most likely be worse, though it operates on the same principle:

Kyle Lemons

unread,
Dec 7, 2012, 10:31:43 PM12/7/12
to Steven Blenkinsop, N. Riesco - GMail, golang-nuts
On Fri, Dec 7, 2012 at 2:21 PM, Steven Blenkinsop <stev...@gmail.com> wrote:
One approach I stumbled upon a while ago was to sort of trampoline* the iterators, like this: http://play.golang.org/p/HG3R8Tx9mW

That is brilliant!

For those playing along at home, the DFS (aka "pre-order") version is:
 
Doing this with other traversal orders is left as an exercise to the reader ;). The advantage this has over the approach that David DENG posted is that you don't have to retraverse the recursive path each time you call the iterator.

* I didn't know there was a name for it until after I "invented" it. Bummer.


On Fri, Dec 7, 2012 at 12:39 PM, N. Riesco - GMail <nicolas...@gmail.com> wrote:
in terms of minimising boilerplate, one could say it reads slightly better like this:

err := dag.DFS(targets, DFSVisitor {

        [...do stuff with the node specified by id...]
        return nil
})

It'd be even nicer if the Go compiler could infer the type of the closure and one could write:

err := dag.DFS(targets, {

        [...do stuff with the node specified by id...]
        return nil
})


Nico

--



--
 
 

Reply all
Reply to author
Forward
0 new messages