[Editorial note: I've spent quite a while on this email, and Ian's post
just came through. So, with apologies, I'm going to be lazy and post as-is.
I'll digest Ian's comments and perhaps respond in due course.]
It will certainly be easier to talk about concrete examples, but please
keep in mind the caveat Øyvind Teig points out — idiomatic usage for
specific cases wasn't the key motivation behind my post. I am more interested
in exploring a possible improvement to Go that (I hope) would entail
negligible change to the language or runtime.
Example 1: Five I/O-heavy producers (five sockets) write into a channel
that four CPU-heavy consumers (four cores) read. I wish to interpose a
special logging "channel" between the producers and consumers without
changing the implementation of either the producers or the consumers.
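A minimal sketch of how Example 1 might look in Go as it stands, assuming string messages. The names tap and logf are hypothetical and not taken from any library. The producers and consumers only ever see a plain channel, so neither needs to change when the tap is interposed.

```go
package main

import (
	"fmt"
	"sync"
)

// tap forwards every value from in to out, calling logf on each one first.
// It closes out once in has been closed and drained.
// (tap and logf are hypothetical names used only for this sketch.)
func tap(in <-chan string, out chan<- string, logf func(string)) {
	for v := range in {
		logf(v)
		out <- v
	}
	close(out)
}

func main() {
	raw := make(chan string)    // producers write here
	logged := make(chan string) // consumers read here

	go tap(raw, logged, func(s string) { fmt.Println("log:", s) })

	// Five producers; each only sees a channel it can send on.
	var producers sync.WaitGroup
	for i := 0; i < 5; i++ {
		producers.Add(1)
		go func(id int) {
			defer producers.Done()
			raw <- fmt.Sprintf("message from producer %d", id)
		}(i)
	}
	// Something outside the producers has to notice they are all done
	// before raw can be closed.
	go func() {
		producers.Wait()
		close(raw)
	}()

	// Four consumers; each only sees a channel it can receive from.
	var consumers sync.WaitGroup
	for i := 0; i < 4; i++ {
		consumers.Add(1)
		go func() {
			defer consumers.Done()
			for range logged {
				// CPU-heavy work would go here.
			}
		}()
	}
	consumers.Wait()
}
```

This sketch only handles shutdown in one direction. If the consumers stop reading, tap blocks forever on its send to out, and the producers block behind it.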
Example 2: Ten goroutines read URLs from ten files and write them into a
single channel. Three consumers read the channel and fetch the documents
corresponding to those URLs (using only three consumers ensures that you don't
completely choke your DSL modem), building a word histogram of each
document fetched then sending each word/count pair to one of eight channels
(one per core), based on hash(word) % 8. Each of the eight channels has a
consumer that accumulates incoming word/count pairs into a single
histogram. All second-stage consumers occasionally report their top 10 hits
to a third channel that a single consumer analyses. This final "judging"
consumer tracks the most popular terms across all words and stops when it
decides that the list has stabilised enough to declare the top 10 words.
When the input files run dry and the last document is fetched, I want the
judge to immediately declare the winners. Likewise, I want the file readers
and document fetchers to stop work as soon as the judge declares the
winners early due to stabilisation. Moreover, the network interface might
go down, making it impossible to fetch any more documents; at this point I
want everyone to stop work and the judge to immediately declare the winners.
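The hash(word) % 8 routing step in Example 2 could be sketched roughly as follows. The wordCount type and the shard function are hypothetical names, and the choice of FNV-1a is an assumption, since the example does not say which hash is used. The early-stop and network-failure requirements are deliberately left out.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// wordCount is a hypothetical message type for one word/count pair.
type wordCount struct {
	word  string
	count int
}

// shard picks one of n accumulators for a word.
// FNV-1a is an assumption; any stable hash would do.
func shard(word string, n int) int {
	h := fnv.New32a()
	h.Write([]byte(word))
	return int(h.Sum32() % uint32(n))
}

func main() {
	const numShards = 8 // one accumulator per core in the example

	shards := make([]chan wordCount, numShards)
	results := make(chan map[string]int)
	for i := range shards {
		shards[i] = make(chan wordCount)
		// Each accumulator owns its own histogram, so no locking is needed.
		go func(in <-chan wordCount) {
			hist := make(map[string]int)
			for wc := range in {
				hist[wc.word] += wc.count
			}
			results <- hist
		}(shards[i])
	}

	// A document fetcher would send its pairs like this.
	for _, wc := range []wordCount{{"go", 3}, {"channel", 2}, {"go", 1}} {
		shards[shard(wc.word, numShards)] <- wc
	}
	for _, ch := range shards {
		close(ch)
	}

	// Collect one histogram per accumulator and sum all the counts.
	total := 0
	for range shards {
		for _, c := range <-results {
			total += c
		}
	}
	fmt.Println(total) // prints 6
}
```

Because the same word always maps to the same shard, each accumulator sees every occurrence of the words it is responsible for.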
In the abstract: I have a multi-stage pipeline with P producers feeding C1
consumers feeding C2 second-stage consumers, etc. The cardinality of each
stage is determined by extraneous factors (the number of external input
sources, the need for limited parallelism, etc.). Most importantly, any
stage of the pipeline might decide to stop work, at which point I want the
entire pipeline to collapse and all participating goroutines and channels
to go away. I don't want to have to bake in any rules about who is allowed
to call a halt to proceedings. I also don't want to impose fixed counts
anywhere. For instance, I might decide to increase or decrease the number
of document fetchers in response to QoS data.
Note that a pipeline isn't the only topology I have in mind. Cyclic graphs
can be constructed to implement feedback loops. The node that combines the
input and feedback channels could return either when the input channel runs
dry, or when the input and feedback channels have both run dry (depending
on the problem), at which point you want the entire feedback loop to tear
down. Ditto when all consumers of the loop's output stage go away.
I should reiterate, I'm not that interested in tailored solutions for the
particular problems described above. They are merely illustrative. I am
sure that it is possible to implement anything I describe using Go as it is
today. What I am more concerned about is that all the suggestions I've read
seem to amount to either reversing the flow of control (but remaining
unidirectional) or implementing by hand the techniques I baked directly
into my microthreading library, which could probably be baked into Go with
little effort. For instance, the suggestion to use sync.WaitGroup seems
like it could solve the problem in a fairly general way, but it means that
I have to maintain a separate data structure alongside every channel to
track how many producers and consumers are attached to the channel, and I
have to remember to call Add and Done in all the right places. And even
then, there's no obvious idiomatic way to signal producers when all the
consumers are gone (it seems that causing a panic is frowned upon). So,
sure, I could probably solve all my problems that way, but it's difficult,
noisy and fragile.
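For illustration, the hand-maintained bookkeeping described above might look something like this. It is only a sketch: run, data and done are hypothetical names, and closing a separate done channel is one possible way to signal producers, not an established answer from this thread.

```go
package main

import (
	"fmt"
	"sync"
)

// run starts nProducers goroutines that each try to send perProducer values,
// and consumes at most limit of them. It returns the number consumed.
// (run and its parameters are hypothetical, for illustration only.)
func run(nProducers, perProducer, limit int) int {
	data := make(chan int)
	done := make(chan struct{}) // closed when the consumer gives up

	var producers sync.WaitGroup
	for i := 0; i < nProducers; i++ {
		producers.Add(1) // bookkeeping: one Add per producer
		go func(id int) {
			defer producers.Done() // bookkeeping: one Done per producer
			for n := 0; n < perProducer; n++ {
				// Every send has to be paired with a check on done.
				select {
				case data <- id*perProducer + n:
				case <-done:
					return
				}
			}
		}(i)
	}

	// Downstream signal: close data once every producer has finished.
	go func() {
		producers.Wait()
		close(data)
	}()

	consumed := 0
	for range data {
		consumed++
		if consumed == limit {
			close(done) // upstream signal: tell the producers to stop
			break
		}
	}
	producers.Wait() // make sure no producer goroutine is left behind
	return consumed
}

func main() {
	fmt.Println(run(5, 10, 7))  // consumer stops early: prints 7
	fmt.Println(run(5, 2, 100)) // producers run dry: prints 10
}
```

Every channel in the system needs its own WaitGroup and done channel along these lines, and every send needs the select.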
I still don't know if I've communicated my intent clearly enough, so here's
a self-contained idea that I hope captures the essence of what I'm looking
for: I want to be able to treat an arbitrarily complex subsystem with an
input and output channel as a drop-in replacement for a regular channel,
including having the entire subsystem garbage-collected if the exposed
end-points become unreachable from outside the subsystem.
Wouldn't it be nice if Go, which already (almost?) has the necessary
information, would do all this bookkeeping for us?
On Monday, July 30, 2012 5:46:40 PM UTC+10, Uriel DeLarge wrote:
> On Mon, Jul 30, 2012 at 5:51 AM, Marcelo Cantos wrote:
> > Sorry, but I don't quite understand what "endrange()" means or how
> > thinking this way solves the problems that are bothering me.
> close() pretty much exists so the 'for range' construct can work on a
> channel as an iterator over a limited set of items.
> It can be useful in other situations, but often you will do fine
> assuming close() doesn't exist; note that many previous CSP
> languages didn't have close() at all.
> In other words: CSP programming style doesn't require close().
> Uriel
> > Say I have five
> > producers that don't know about each other, all writing into a single
> > channel read by a filter, which passes transformed/filtered messages
> > through to a downstream consumer. How do the five producers let the
> > filter know when it (and, transitively, the downstream consumer) is not
> > needed any more? What is the idiomatic solution for this very common
> > scenario?
> I don't think in practice it is a common scenario at all. Goroutines
> are cheap, especially while blocked on a channel; just let them be.
> I think part of the problem is you are thinking in 'abstract' terms,
> rather than in terms of a real problem. Tell us what real problem you
> are trying to solve, and then we can find an idiomatic answer.
> Uriel