Russ,
Is there really parallelism possible in the goroutine version? It seems to me that the producer and consumer goroutines are related such that one will always be blocking on the other. If the channel has size 0, no amount of scheduling can make this parallel. Even if you use a channel with capacity 100, for example, it is likely that either the producer or consumer will end up blocking or finishing first, and even if the scheduler load-balanced perfectly, the iterator could never use more than 2 processors.
Maybe the iterators could use a capacity related to the size of the container, e.g. if the channel had capacity == container size, then at least the producer would never need to block. This ameliorates the problem but makes the iterator neither very parallel nor fast, since each iteration still needs a copy into the channel and a copy out, which a sane iterator would not need.
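To make the capacity idea concrete, here is a minimal sketch. It assumes a hypothetical IntList container with an Iter method; neither name comes from this thread or from any library. The channel gets capacity equal to the container size, so the producer goroutine never blocks, but every element is still copied into the channel and back out.

```go
package main

import "fmt"

// IntList is a hypothetical container used only for illustration.
type IntList struct {
	items []int
}

// Iter returns a channel that yields every element. The channel capacity
// equals the container size, so the producer goroutine never blocks on send.
func (l *IntList) Iter() <-chan int {
	ch := make(chan int, len(l.items))
	go func() {
		defer close(ch)
		for _, v := range l.items {
			ch <- v // one copy into the channel
		}
	}()
	return ch
}

// Sum consumes the iterator with for...range.
func Sum(l *IntList) int {
	total := 0
	for v := range l.Iter() { // one copy out of the channel
		total += v
	}
	return total
}

func main() {
	l := &IntList{items: []int{1, 2, 3, 4}}
	fmt.Println(Sum(l)) // prints 10
}
```

Even in this best case at most two goroutines are involved, so the sketch cannot use more than two processors.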
So even if you eliminate the blocking problem, you still end up with a slow iterator. It never seems logical to do with two processors what you can do faster with one. And since this can't scale to more than two processors at a time, it is never a logical implementation.
I wish this weren't true, because I think the channel iterator concept is rather elegant. Please enlighten me if I'm missing some voodoo.
Ryanne
- from my phone -
On Dec 1, 2009 8:24 PM, "Russ Cox" <r...@golang.org> wrote:
> So with the current state of the scheduler, is it really wise to be
> creating standard library co...
True, but if there is significant work on the producer side then you don't really have an iterator...
Certainly we shouldn't design all iterators to be slow in general but possibly faster in the case where a different pattern would make more sense.
For example, if there was some container for which iterating was a big computation, then the programmer can use a generator and read from it in a loop. That is fine. But it doesn't follow that iterating over a list or vector should follow the same pattern just for the sake of consistency.
In most cases (I'd say all good cases) the iterator is doing something trivial, and that is why it is hidden behind the for...range syntactic sugar.
I guess I am arguing for a way to implement coroutines without being penalized for channel communication. I'd like to see the for...range syntax work for coroutines, and most iterators replaced with call-based coroutines rather than goroutines.
Perhaps the compiler could be smart and replace goroutines with coroutines magically, but I don't see the point.
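As a rough illustration of what a call-based iterator could look like, here is a sketch that uses a plain callback instead of a goroutine and a channel. IntList, Each, and SumUntil are hypothetical names chosen for this example, not an existing API. The callback runs on the caller's own stack, so there is no channel copy and no scheduling involved.

```go
package main

import "fmt"

// IntList is a hypothetical container used only for illustration.
type IntList struct {
	items []int
}

// Each calls yield for every element in order, on the caller's own stack.
// Iteration stops early if yield returns false.
func (l *IntList) Each(yield func(v int) bool) {
	for _, v := range l.items {
		if !yield(v) {
			return
		}
	}
}

// SumUntil adds elements until the running total reaches limit.
func SumUntil(l *IntList, limit int) int {
	total := 0
	l.Each(func(v int) bool {
		total += v
		return total < limit // false stops the iteration
	})
	return total
}

func main() {
	l := &IntList{items: []int{1, 2, 3, 4}}
	fmt.Println(SumUntil(l, 100)) // prints 10
	fmt.Println(SumUntil(l, 3))   // prints 3
}
```

The trade-off is that the loop body becomes a function literal rather than the body of a for...range statement.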
On Dec 3, 2009 1:07 PM, "Russ Cox" <r...@golang.org> wrote:
> Is there really parallelism possible in the goroutine version? It seems to
> me that the producer...
It might be important to understand though, that with GOMAXPROCS=1
when you're using a buffered channel, and your iterator goroutine is
pure computation (i.e. no syscalls or communications) then it won't
stop until the buffer is filled completely. Effectively, this means that if
you use capacity 100, you first put in 100 elements, then the iterator finally
sleeps and the consumer wakes up and consumes all 100 elements (if the consumer
is pure computation as well, which in benchmarks it usually is), then the
process repeats. Because there are fewer switches, there is less
overhead, but you lose granularity of iteration.
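One way to observe this is to record the order of sends and receives. The following is only a sketch: Trace is a hypothetical helper, and the exact interleaving depends on the runtime version. Later Go runtimes can preempt goroutines, so the output may not show clean batches of the buffer size.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// Trace runs a producer and a consumer over a buffered channel on a single
// processor and returns the order of events: 's' after each completed send,
// 'r' after each completed receive.
func Trace(n, capacity int) string {
	prev := runtime.GOMAXPROCS(1)
	defer runtime.GOMAXPROCS(prev) // restore the previous setting

	var mu sync.Mutex
	events := make([]byte, 0, 2*n)
	record := func(b byte) {
		mu.Lock()
		events = append(events, b)
		mu.Unlock()
	}

	ch := make(chan int, capacity)
	go func() {
		defer close(ch) // runs after the last send has been recorded
		for i := 0; i < n; i++ {
			ch <- i
			record('s')
		}
	}()
	for range ch {
		record('r')
	}
	return string(events)
}

func main() {
	// The interleaving is scheduler-dependent, so no expected output is given.
	fmt.Println(Trace(20, 5))
}
```

Long runs of s followed by long runs of r would correspond to the batching described above; a strictly alternating pattern would indicate a switch on every element.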
> It might be important to understand though, that with GOMAXPROCS=1
> when you're using a buffered channel, and your iterator goroutine is
> pure computation (i.e. no syscalls or communications) then it won't
> stop until buffer is filled completely.
I might be wrong, but I looked at the source code, and in
runtime/chan.c the functions chansend and chanrecv (the heart of the channel
machinery) show that gosched is only called when the operation cannot be
completed immediately. If GOMAXPROCS=1, then only one goroutine can run
at a time, and until some runtime operation calls gosched or
entersyscall, other goroutines will not run. That's why, as long as
a goroutine can send values to a channel, it will. And as long as
a goroutine can receive from the channel, it will as well. Only when the
channel is full (for send) or empty (for recv) will the goroutine put
itself to sleep.
> It might be important to understand though, that with GOMAXPROCS=1
> when you're using a buffered channel, and your iterator goroutine is
> pure computation (i.e. no syscalls or communications) then it won't
> stop until buffer is filled completely.
Are you sure about this? I don't know why this would be the case.