Iterate over map without range?

4,931 views
Skip to first unread message

Péter Szilágyi

unread,
Mar 30, 2013, 7:43:56 PM3/30/13
to golang-nuts
Hi,

  I was wondering whether there's any mechanism to iterate over a map that is capable of suspending the iteration and resuming it later.

  E.g. something along the lines:

for k, v in range myMap {
  // Break for some reason
}

// Do other stuff

for k, x in <somehow continue> {
  // Etc.
}

  The above code is just a wild example. A more appropriate and real world use case would be if a map is used as an internal data structure for an object/collection which would need iteration support (e.g. bag).

Thanks,
  Peter

Dave Cheney

unread,
Mar 30, 2013, 8:22:29 PM3/30/13
to Péter Szilágyi, golang-nuts
Here is an idea. http://play.golang.org/p/CkJoCv7HV7. I'm almost 100%
certain it's a bad idea.
> --
> You received this message because you are subscribed to the Google Groups
> "golang-nuts" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to golang-nuts...@googlegroups.com.
> For more options, visit https://groups.google.com/groups/opt_out.
>
>

Kevin Gillette

unread,
Mar 30, 2013, 8:24:37 PM3/30/13
to golan...@googlegroups.com
range loops are the only way to iterate over a map (or otherwise "find out" what keys a map has). There are a number of ways to suspend within that loop, however -- typically these will be channels or locks. Just note that while keys are always output at most once, if a map is modified during iteration, there's no guarantee that all keys will be reached in iteration.

Péter Szilágyi

unread,
Mar 30, 2013, 8:32:47 PM3/30/13
to Kevin Gillette, golang-nuts
Dave: Yes, certainly a bad idea :D. It also crossed my mind, but channel operations are on the order of microseconds, whilst what I'm looking for is tens of nanoseconds.
Kevin: I don't really want to modify the map during iteration, just provide a wrapper to hide the internals of a data structure. E.g. a bag can be implemented very easily using map[interface{}]int, yet the iteration will become quite ugly:
for k, v in range bag {
  for i := 0; i < v; i++ {
    // I've got a value 'k'
  }
}
The above code might just as well be ok deep inside a library, but there are certain points when that API will surface... and it's not pretty.


--

Maxim Khitrov

unread,
Mar 30, 2013, 9:18:09 PM3/30/13
to Péter Szilágyi, Kevin Gillette, golang-nuts
If your data structure can tolerate splitting the items between two
maps, then something like this is an option:

http://play.golang.org/p/tpALaQ2E6M

It's just using a temporary map to store all visited elements. I
haven't done any benchmarking, but I would guess that this is still
faster than using channels.

- Max

steve wang

unread,
Mar 30, 2013, 9:57:55 PM3/30/13
to golan...@googlegroups.com
A mix of map and list can help.

steve wang

unread,
Mar 30, 2013, 10:47:32 PM3/30/13
to golan...@googlegroups.com
import (
    "container/list"
)

type Iterator struct {
    e *list.Element
}

func (p *Iterator) Valid() bool {
    return p.e != nil
}

func (p *Iterator) Value() (int, int) {
    pe := p.e.Value.(*Element)
    return pe.k, pe.v
}

func (p *Iterator) Next() {
    p.e = p.e.Next()
}

type Element struct {
    k, v int
}

type ListMap struct {
    m map[int]*list.Element
    l *list.List
}

func NewListMap() *ListMap {
    return &ListMap{
        m: make(map[int]*list.Element),
        l: list.New(),
    }
}

func (p *ListMap) Set(k, v int) {
    e, ok := p.m[k]
    if ok {
        e.Value.(*Element).v = v
    } else {
        p.m[k] = p.l.PushBack(&Element{k, v})
    }
}

func (p *ListMap) Get(k int) (int, bool) {
    e, ok := p.m[k]
    if !ok {
        return 0, false
    }
    return e.Value.(*Element).v, true
}

func (p *ListMap) Delete(k int) {
    e, ok := p.m[k]
    if ok {
        delete(p.m, k)
        p.l.Remove(e)
    }
}

func (p *ListMap) Iterate() Iterator {
    return Iterator{p.l.Front()}
}

Anoop K

unread,
Apr 1, 2013, 2:05:16 AM4/1/13
to golan...@googlegroups.com
Does GO have future plans to allow user defined objects also to implement 'range' by supporting 'yield' like functionality in python OR iterator interfaces  (eg Java Iterator interface next(), hasNext()) ?

Anoop

Rémy Oudompheng

unread,
Apr 1, 2013, 3:12:45 AM4/1/13
to Péter Szilágyi, golang-nuts


On 31 Mar 2013 00:44, "Péter Szilágyi" <pet...@gmail.com> wrote:
>
> Hi,
>
>   I was wondering whether there's any mechanism to iterate over a map that is capable of suspending the iteration and resuming it later.
>
>   E.g. something along the lines:
>
> for k, v in range myMap {
>   // Break for some reason
> }
>
> // Do other stuff
>
> for k, x in <somehow continue> {
>   // Etc.
> }
>

Replace the "break" word with the contents of "do other stuff".

>   The above code is just a wild example. A more appropriate and real world use case would be if a map is used as an internal data structure for an object/collection which would need iteration support (e.g. bag).

If my answer does not solve the problem, it means that you have given the wrong pseudo-code. Please explain what you are trying to do with more details.

Also I don't see any reason why an algorithm would make a difference between two arbitrary halves of an unordered collection, can you give an example?

Rémy.

Ian Lance Taylor

unread,
Apr 1, 2013, 1:05:03 PM4/1/13
to Anoop K, golang-nuts
On Sun, Mar 31, 2013 at 11:05 PM, Anoop K <ano...@gmail.com> wrote:
> Does GO have future plans to allow user defined objects also to implement
> 'range' by supporting 'yield' like functionality in python OR iterator
> interfaces (eg Java Iterator interface next(), hasNext()) ?

There are no such plans at present. Currently the way to have a user
defined type support range is to add a method that returns a channel.

Ian

John Nagle

unread,
Apr 1, 2013, 1:18:07 PM4/1/13
to golan...@googlegroups.com
On 3/31/2013 11:05 PM, Anoop K wrote:
> Does GO have future plans to allow user defined objects also to implement '*
> range*' by supporting '*yield*' like functionality in python OR iterator
> interfaces (eg Java Iterator interface next(), hasNext()) ?
>
> Anoop

You can do generator-like things by using a goroutine. Have
a goroutine iterate over the range and write the results to
a channel. Read from the channel when you want the next item.

But do NOT modify the map while doing this. Maps are not
safe against race conditions and Go may crash.

John Nagle

Péter Szilágyi

unread,
Apr 1, 2013, 5:43:41 PM4/1/13
to John Nagle, golang-nuts
Hello,

  My use case (I made a hint at it earlier) is to use map as an internal for a new data structure. Specifically, take a bag for example (set with the possibility to add the same item multiple times). With the current map range-based iterator there is no simple way to provide an iterator for my new data struct.

  I'm afraid that using channels will incur a large performance penalty (scheduler + locks).

  Ian: Any particular reason for not allowing access to the base iterators?

Have a nice evening,
  Peter


Andrew Gerrand

unread,
Apr 1, 2013, 5:52:20 PM4/1/13
to Péter Szilágyi, golang-nuts, John Nagle

There is no language level support for custom iterators because it would add a lot of complexity for a little syntactic sugar. Take a look at the new bufio.Scanner for an example of an idiomatic iterator that does not use range.

http://tip.golang.org/pkg/bufio/#Scanner

Andrew

Péter Szilágyi

unread,
Apr 1, 2013, 6:10:11 PM4/1/13
to Andrew Gerrand, golang-nuts, John Nagle
Actually, I don't agree that it's just syntactic sugar. If you wrap a map inside a data structure, there's no simple way of providing an iterator if it's not the basic one:

E.g. use a map[interface{}]int as the internal of a Bag. The key is the element, the value is the count.
For a bag containing {"abc": 2, "xyz": 3} inside the map, I'd like an iterator to expand to "abc", "abc", "xyz", "xyz", "xyz" (in whatever order).

Yes, there are workarounds, but why should performance be sacrificed?


Rémy Oudompheng

unread,
Apr 1, 2013, 6:16:44 PM4/1/13
to Péter Szilágyi, Andrew Gerrand, golang-nuts, John Nagle
On 2013/4/2 Péter Szilágyi <pet...@gmail.com> wrote:
> Actually, I don't agree that it's just syntactic sugar. If you wrap a map
> inside a data structure, there's no simple way of providing an iterator if
> it's not the basic one:
>
> E.g. use a map[interface{}]int as the internal of a Bag. The key is the
> element, the value is the count.
> For a bag containing {"abc": 2, "xyz": 3} inside the map, I'd like an
> iterator to expand to "abc", "abc", "xyz", "xyz", "xyz" (in whatever order).
>
> Yes, there are workarounds, but why should performance be sacrificed?

func Iterate(bag map[interface{}]int, do func (v interface{}) (stop bool)) {
for v, n := range bag {
for i := 0; i < n; i++ {
stop := do(v)
if stop {
return
}
}
}
}

Rémy.

Péter Szilágyi

unread,
Apr 1, 2013, 6:22:59 PM4/1/13
to Rémy Oudompheng, Andrew Gerrand, golang-nuts, John Nagle
1. Your exposing the bag internals to the outside API.
2. You're passing in outside logic into the collection internals (do func).

type Bag struct {
  data map[interface{}]int
}

Provide me a means to iterate the above.


John Nagle

unread,
Apr 1, 2013, 6:32:46 PM4/1/13
to Péter Szilágyi, golang-nuts@googlegroups.com >> golang-nuts
On 4/1/2013 2:43 PM, P�ter Szil�gyi wrote:
> I'm afraid that using channels will incur a large performance penalty
> (scheduler + locks).

Optimization opportunity: when the compiler sees a channel/goroutine
relationship that can be implemented as a coroutine, do so.

If all channel communication to a goroutine is via channels of
length zero, and all channels communicate with the same other goroutine,
(always the case if there's only one channel involved), the goroutine
can be implemented as a coroutine.

That provides generator capability with no new syntax.

John Nagle

Steve McCoy

unread,
Apr 1, 2013, 6:38:02 PM4/1/13
to golan...@googlegroups.com
1. Make Iterate a Bag method and drop the map parameter.
2. There's nothing inherently wrong with that. Everybody's favorite example, sort, does the same thing.

Rémy Oudompheng

unread,
Apr 1, 2013, 6:38:02 PM4/1/13
to Péter Szilágyi, Andrew Gerrand, golang-nuts, John Nagle
On 2013/4/2 Péter Szilágyi <pet...@gmail.com> wrote:
> 1. Your exposing the bag internals to the outside API.

I don't understand.

> 2. You're passing in outside logic into the collection internals (do func).

I don't understand either what you mean. Combining outside logic with
collection internals is the definition of what iteration is.
My proposal does not allow the caller to mess with the collection.

> type Bag struct {
> data map[interface{}]int
> }
>
> Provide me a means to iterate the above.

func (bag *Bag) ForEach(do func(v interface{}) bool) {
for v, n := range bag.data {
for i := 0; i<n; i++ {
if do(v) { return }
}
}
}

Rémy.

Péter Szilágyi

unread,
Apr 2, 2013, 2:21:38 AM4/2/13
to Steve McCoy, golang-nuts
Hi,

  Sorting and iterating imho are worlds apart: sort is an operation on a collection, iteration is more like a data source.

  ForEach can be useful, but it is more limiting on the things you can do, and increases complexity. With the design mentioned previously you have to declare a separate 'do' function for the logic (which will break your program flow. Take a trivial example: caluclate the sum of the elements inside the collection:

With every built-in, you can do:
sum := 0
for v := range map/slice/array {
  sum += v
}

With the library list you can do:
sum := 0
for e := l.Front(); e != nil; e = e.Next() {
  sum += e.Value
}

The library ring is a bit specialized, but iteration can still be achieved in 2-3 lines. I'm just lazy to code it right now.

Your proposed solution on the other hand:
sum := 0
adder := func(val int) {
  sum += val
  return false
}
bag.ForEach(adder)

  The above is an extreme over complication of a simple addition. Now try and implement a min-flow with it, and I'm sure you'll get the picture where I'm going. The code will be completely unreadable.

  The possibility to stop an iteration, do something else and then continue can also be invaluable in memoization style algorithms where you don't calculate for the whole collection something, just maybe for a small part of it. And later if it turns out to be needed, you can just continue where you left off. This is a bit more subtle, but one crude example might be a simple graph search algorithm where you're trying to find a path from a source to dest1. Afterwards it turns out that you'd also like the path to dest2. Now you either rerun a whole lot of searching, or you can just save the state of the previous search where you broke out. That's where you need iterators that can continue.

  Btw, the internal map in Go also uses plain iterators, no other syntactic stuff like range or foreach is defined for it. It just has been decided to hide them and provide a higher level and thus more limiting API to it. If range and for-each is so nice, why does the go runtime use classic iterators exclusively?

  Note, I'm not arguing that a problem cannot be solved without the classical iterators. They obviously can be, but with unnecessarily increased code and time complexity. Steve Wang proposed a nice workaround, alas, it will still incur a huge performance hit from the gc due to constant memory allocations and deallocations.

I hope it's a bit clearer now why I find iterators that are decoupled from a loop essential.

Cheers,
  Peter

PS: Just a random counterexample to your design: iterate over two, same-length bags simultaneously.


Andrew Gerrand

unread,
Apr 2, 2013, 7:31:18 AM4/2/13
to Péter Szilágyi, Steve McCoy, golang-nuts
A more even comparison:

sum := 0
for v := range map/slice/array {
  sum += v
}

sum := 0
for e := l.Front(); e != nil; e = e.Next() {
  sum += e.Value
}

sum := 0
bag.ForEach(func(val int) bool {
  sum += val
  return false
})

I don't see the "extreme over complication" that you observe.

Andrew

Steve McCoy

unread,
Apr 2, 2013, 7:35:14 AM4/2/13
to golan...@googlegroups.com
Sorry, I wrongly thought that you were totally against that style of iterator. I agree that it's usually a better option than the loop function.

alco

unread,
Apr 2, 2013, 1:36:44 PM4/2/13
to golan...@googlegroups.com
  The possibility to stop an iteration, do something else and then continue can also be invaluable in memoization style algorithms where you don't calculate for the whole collection something, just maybe for a small part of it. And later if it turns out to be needed, you can just continue where you left off. This is a bit more subtle, but one crude example might be a simple graph search algorithm where you're trying to find a path from a source to dest1. Afterwards it turns out that you'd also like the path to dest2. Now you either rerun a whole lot of searching, or you can just save the state of the previous search where you broke out. That's where you need iterators that can continue.

PS: Just a random counterexample to your design: iterate over two, same-length bags simultaneously.

I'm a bit off topic from the builtin iteration discussion, but it looks like you want some sort of language support for lazy sequences/streams that are compiled to for-loops. Just wanted to note, that a classical lazy stream can be built using closures alone. Coroutines would also do the job of suspending iteration and returning to it later.

However, in Go, it will be more idiomatic to use goroutines. Rob did an excellent talk on lexical scanning in Go[1] where he solves the problem of suspending an iteration and returning to the suspended state later using goroutines.

The problem of iterating over two collections simultaneously also has a simple solution in Go[2], again using goroutines.

  [1]: http://www.youtube.com/watch?v=HxaD_trXwRE
  [2]: http://tour.golang.org/#68

Gustavo Niemeyer

unread,
Apr 2, 2013, 2:17:28 PM4/2/13
to Péter Szilágyi, Steve McCoy, golang-nuts
Depending on your typical size and usage pattern, I'd not be surprised
if something like this actually works well in practice:

http://play.golang.org/p/h8CyyqKSYW
--

gustavo @ http://niemeyer.net

Péter Szilágyi

unread,
Apr 3, 2013, 5:27:42 AM4/3/13
to Gustavo Niemeyer, Steve McCoy, golang-nuts
Hi all,

  Andrew: yes, your version is indeed a bit denser, I forgot about that style. Still, I'm wondering about how it would look in a more complex loop. But that's something I'll have to try out so I won't discard it just yet. But still, it's slower and it is bound to a single loop (i.e. cannot break/continue, or do simultaneous iteration with multiple collections)

  Alco: Channels/go routines are nice and idiomatic and everything, but the scheduling will kill the performance. It's too high level for implementing collections.

  Gustavo: Depending on the usage, it might work well in certain places, but as a general solution it wastes a lot of memory.

  I've wrote up a small benchmark suite with the proposed solutions, to have a bit more concrete results: http://play.golang.org/p/VWJ5O9MsCB. Running the benchmarks for 250ms on a Core2 T9400 with go tip results in:

BenchmarkBasic       10000000       41.4 ns/op
BenchmarkForeach     10000000       46.0 ns/op
BenchmarkChannel      1000000      298.0 ns/op
BenchmarkUnwound     10000000       55.1 ns/op

The above benchmarks were in order:
  • Iterating over the internal map of a bag structure
  • The same thing as above, just executing a function for each value instead of direct access.
  • Using channels simulate an iterator
  • flattening the whole map and using that as an iterator
The conclusions I would like to draw from the results:
  • Using channels for iteration is out of the question.
  • Unwinding/flattening is both slower than the rest and would also incur significant memory requirements.
  • The foreach comes closest to the map, but the extra function call costs 10% performance.
One solution is missing from the above: the list in map proposed by Steve Wong. Iterator-wise that is the fastest, but it's because of the internal list which "is" already an iterator. The insertion and deletion will have a much larger cost there and I didn't want to benchmark those too for this demo. 

I would also like to emphasize that I'm striving for a general bag collection here, not one used in a specific scenario. That is why I'm trying to squeeze every last drop of performance out of the system.

Cheers,
  Peter

Andrew Gerrand

unread,
Apr 3, 2013, 5:35:36 AM4/3/13
to Péter Szilágyi, Gustavo Niemeyer, Steve McCoy, golang-nuts
I don't mean to be blunt, but this is a case of extreme premature optimization.

Write the code that works and is clear and easy to use. If your bag is too slow, you'll see it when profiling real programs. Only then should you get bogged down by optimization.

(Note, there's no way the channel-based option will be the clearest implementation.)

Andrew

Péter Szilágyi

unread,
Apr 3, 2013, 5:45:57 AM4/3/13
to Andrew Gerrand, Gustavo Niemeyer, Steve McCoy, golang-nuts
Hi,

  Of course, you're completely right with the premature optimization thing and I fully agree :). I'm just trying to write some generic collections an squeeze everything out of the system. It's more as a learning effort than production code, which if happens to run well enough I'd be happy to release. I'm just a bit annoyed that I know a faster and clearer solution, but I cannot implement it because two methods are hidden in the runtime :P

Have a nice day,
  Peter
Reply all
Reply to author
Forward
0 new messages