Moving this conversation from GitHub to the mailing list...
-----
Go should have a first-class cancellation pattern (in much the same way that it has a first-class error pattern). The cancellation pattern should meet the following requirements:
- The ability to observe cancellation and the ability to cause cancellation MUST be independent.
- Multiple observers can observe the same cancellation token without impacting each other.
- Multiple goroutines can observe cancellation in parallel without additional synchronization.
- Observers can poll whether cancellation has occurred. This allows long-running computations to periodically check whether they should stop running even if they never block on IO.
- Cancellation can be used in a select statement, e.g. block on a data channel value or cancellation, whichever happens first.
- Cancellation tokens can be composed in an "and" or "or" manner such that you get a new cancellation token that becomes cancelled when its inputs become cancelled.
- Cancellation carries a diagnostic message that can be used in logging or other debugging to indicate the reason a computation was asked to cancel.
Cancellation allows a tree or pipeline of computation whose output is no longer needed to be terminated. Go makes describing pipelines of computation easy. Goroutines representing each stage of such a pipeline link together via channels to form a computational pipeline whose final channel delivers the final result of the computation. Once a pipeline has produced sufficient output for the final consumer, the pipeline needs to be torn down. Sometimes this corresponds to natural boundaries of the data being produced, but sometimes it doesn't (e.g. when errors are encountered or the consumer disconnects abruptly).
I propose that cancellation be implemented like errors through an interface of the following form:
type Cancel interface {
    // Returns a channel that is closed when signalled by the cancellation function.
    // Guaranteed to return the same channel on each call
    // (i.e. can be safely cached by the caller.)
    Done() <-chan struct{}
    // Err returns the error provided to the cancellation function.
    // If not yet signalled always returns nil.
    // (i.e. can be called at any time to poll the state of cancellation.)
    Err() error
}
// CancelFunc is called to signal that cancellation should begin.
// Signalling the cancellation token is synchronous and atomic.
// CancelFunc is idempotent but only the first reason is preserved.
// If reason is nil, CancelFunc panics.
type CancelFunc func(reason error)
// NewCancel creates a new cancellation signal and a cancellation
// function to cancel it.
func NewCancel() (Cancel, CancelFunc) { ... }
// WithParent creates a new cancellation signal and a cancellation
// function to cancel it. The resulting token is cancelled when either
// the cancellation function is called or the parent becomes cancelled.
func WithParent(parent Cancel) (Cancel, CancelFunc) { ... }
// And creates a new cancellation signal that becomes cancelled when
// both of its inputs become cancelled.
func And(left Cancel, right Cancel) Cancel { ... }
// Or creates a new cancellation signal that becomes cancelled when
// either of its inputs become cancelled.
func Or(left Cancel, right Cancel) Cancel { ... }
This interface allows for implementations that meet all of the above requirements. Because the write endpoint of the Done() channel is closely held in this design, a language runtime implementation could use a cheaper channel implementation that allows boolean and hierarchical composition without actually allocating a goroutine to wait on the inputs.
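To make the cost argument concrete, here is a minimal library-level sketch of the proposed interface. It is not the runtime-level implementation argued for above: the token type, newToken, and demo are assumptions made for illustration, WithParent is expressed through Or, and And reports the left input's reason as an arbitrary choice. Note that every composition spends a goroutine, which is the overhead a runtime implementation would hope to avoid.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// Cancel and CancelFunc mirror the interface proposed above.
type Cancel interface {
	Done() <-chan struct{}
	Err() error
}

type CancelFunc func(reason error)

// token is a hypothetical library-level implementation of Cancel.
type token struct {
	done chan struct{}
	once sync.Once
	err  error // written once, before done is closed
}

func newToken() *token { return &token{done: make(chan struct{})} }

func (t *token) Done() <-chan struct{} { return t.done }

// Err returns nil until done is closed, then the first reason.
func (t *token) Err() error {
	select {
	case <-t.done:
		return t.err
	default:
		return nil
	}
}

// cancel keeps only the first reason and closes done exactly once.
func (t *token) cancel(reason error) {
	if reason == nil {
		panic("cancel: nil reason")
	}
	t.once.Do(func() {
		t.err = reason
		close(t.done)
	})
}

func NewCancel() (Cancel, CancelFunc) {
	t := newToken()
	return t, t.cancel
}

// Or is cancelled as soon as either input is cancelled.
// It spends one goroutine to watch its inputs.
func Or(left, right Cancel) Cancel {
	t := newToken()
	go func() {
		select {
		case <-left.Done():
			t.cancel(left.Err())
		case <-right.Done():
			t.cancel(right.Err())
		}
	}()
	return t
}

// And is cancelled once both inputs are cancelled.
func And(left, right Cancel) Cancel {
	t := newToken()
	go func() {
		<-left.Done()
		<-right.Done()
		t.cancel(left.Err())
	}()
	return t
}

// WithParent is cancelled by its own function or by its parent.
func WithParent(parent Cancel) (Cancel, CancelFunc) {
	own, cancel := NewCancel()
	return Or(parent, own), cancel
}

// demo cancels a parent twice while a second token stays live.
func demo() (reason string, andDone bool) {
	parent, cancelParent := NewCancel()
	child, _ := WithParent(parent)
	other, _ := NewCancel()
	either, both := Or(child, other), And(child, other)
	cancelParent(errors.New("consumer disconnected"))
	cancelParent(errors.New("ignored")) // idempotent: first reason wins
	<-either.Done()
	return either.Err().Error(), both.Err() != nil
}

func main() {
	fmt.Println(demo())
}
```

In this sketch the watcher goroutines live until one of their inputs fires, so tokens that are never cancelled leak a goroutine each; a real library would need some way to release them.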
Cancellation should be part of the language for most of the same reasons that error is. First-class treatment guarantees a level of composability that an external implementation cannot achieve.
First-class treatment also allows the core libraries to leverage the pattern.
As mentioned above, a language-runtime-level implementation can make composition significantly less expensive than would be possible with a pure library implementation. Low cost enables ubiquity, because it allows cancellation to be used in domains that simply would not permit a more expensive implementation.
- Jason.
At my last job, I implemented a system that included local file
caching. I used "chan struct{}" as my cancellation type directly, only
reading (incl. selecting) and close()ing it.
The resource library I wrote (to abstract over various transfer
protocols) implemented an interface similar to:
func Download(fromURL string, toFileName string, cancel <-chan struct{}) error
func Upload(fromURL string, toFileName string, cancel <-chan struct{}) error
That library included retries, which used cancellation like:
for i := 0; i < retries; i++ {
    if i > 0 {
        select {
        case <-time.After(retryInterval):
        case <-cancel:
            return ErrCanceled
        }
    }
    // attempt the transfer, passing along the cancel channel
}
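For readers who want to run the retry loop, here is a self-contained sketch of it. The withRetries wrapper, its attempt callback, the demo function, and the definition of ErrCanceled are assumptions added for illustration; only the loop shape comes from the snippet above.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// ErrCanceled stands in for the sentinel error used in the loop above.
var ErrCanceled = errors.New("canceled")

// withRetries is a hypothetical wrapper around the retry loop: it calls
// attempt up to retries times and waits retryInterval between tries,
// giving up early if cancel is closed during a wait.
func withRetries(retries int, retryInterval time.Duration, cancel <-chan struct{},
	attempt func(cancel <-chan struct{}) error) error {
	var err error
	for i := 0; i < retries; i++ {
		if i > 0 {
			select {
			case <-time.After(retryInterval):
			case <-cancel:
				return ErrCanceled
			}
		}
		// Attempt the transfer, passing along the cancel channel.
		if err = attempt(cancel); err == nil {
			return nil
		}
	}
	return err
}

// demo fails the first attempt and cancels before the retry wait ends.
func demo() (attempts int, canceled bool) {
	cancel := make(chan struct{})
	err := withRetries(3, time.Hour, cancel, func(<-chan struct{}) error {
		attempts++
		close(cancel) // simulate the caller giving up mid-transfer
		return errors.New("transfer failed")
	})
	return attempts, err == ErrCanceled
}

func main() {
	fmt.Println(demo())
}
```

Because the wait between attempts is itself a select, a cancelled caller never has to sit out the full retry interval.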
But more interesting was the local file cache. It had a fairly simple interface:
func (*Cache) Get(url string, cancel <-chan struct{}) (*CachedFile, error)
// and a helper to avoid having to split and merge goroutines when you want many files in parallel.
func (*Cache) GetMany(urls []string, cancel <-chan struct{}) ([]*CachedFile, error)
But the Cache had many jobs:
- Cache files in a directory to avoid redownloading
- Watch the cache size and remove the old, currently unused files
- Merge concurrent Get requests on the same URL to avoid bandwidth waste
- Handle cancellation properly (cancel the Download iff all concurrent
Gets on that URL were canceled)
The implementation of Get creates its own cancel channel for the
Download call, which would be shared between all concurrent Gets of
the same URL.
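The sharing rule (cancel the Download iff every concurrent Get on that URL was cancelled) can be sketched with a waiter count. This is a heavily reduced illustration and not the original implementation: the download struct, the fetch field standing in for Download, the run and forget helpers, and the error-only return value are all assumptions.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

var ErrCanceled = errors.New("canceled")

// download tracks one in-flight transfer shared by concurrent Gets.
type download struct {
	cancel  chan struct{} // closed once every waiter has cancelled
	done    chan struct{} // closed when the transfer finishes
	err     error         // written before done is closed
	waiters int           // guarded by Cache.mu
}

// Cache is a hypothetical, much-reduced stand-in for the cache described
// above; fetch plays the role of Download.
type Cache struct {
	mu       sync.Mutex
	inflight map[string]*download
	fetch    func(url string, cancel <-chan struct{}) error
}

// Get joins the in-flight transfer for url or starts a new one.
func (c *Cache) Get(url string, cancel <-chan struct{}) error {
	c.mu.Lock()
	d, ok := c.inflight[url]
	if !ok {
		d = &download{cancel: make(chan struct{}), done: make(chan struct{})}
		c.inflight[url] = d
		go c.run(url, d)
	}
	d.waiters++
	c.mu.Unlock()

	select {
	case <-d.done:
		return d.err
	case <-cancel:
		c.mu.Lock()
		defer c.mu.Unlock()
		if d.waiters--; d.waiters == 0 {
			close(d.cancel) // the last interested caller is gone
			c.forget(url, d)
		}
		return ErrCanceled
	}
}

// run performs the transfer with the download's own cancel channel.
func (c *Cache) run(url string, d *download) {
	d.err = c.fetch(url, d.cancel)
	c.mu.Lock()
	c.forget(url, d)
	c.mu.Unlock()
	close(d.done)
}

// forget removes d so later Gets start a fresh transfer. Callers hold mu.
func (c *Cache) forget(url string, d *download) {
	if c.inflight[url] == d {
		delete(c.inflight, url)
	}
}

// demo shares one blocking fetch between two Gets and cancels them in turn.
func demo() (calls int, aliveAfterFirst, stoppedAfterLast bool) {
	started, stopped := make(chan struct{}), make(chan struct{})
	c := &Cache{inflight: map[string]*download{}}
	c.fetch = func(url string, cancel <-chan struct{}) error {
		calls++
		close(started)
		<-cancel // pretend to transfer until told to stop
		close(stopped)
		return ErrCanceled
	}
	cancelA, cancelB := make(chan struct{}), make(chan struct{})
	resultA := make(chan error)
	go func() { resultA <- c.Get("u", cancelA) }()
	<-started      // the shared transfer is running
	close(cancelB) // B gives up immediately
	c.Get("u", cancelB)
	select {
	case <-stopped:
	default:
		aliveAfterFirst = true // A still wants the file
	}
	close(cancelA)
	<-resultA
	<-stopped
	return calls, aliveAfterFirst, true
}

func main() {
	fmt.Println(demo())
}
```

The key point is that callers never see the transfer's own cancel channel; they only influence it through the waiter count.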
The implementation of GetMany is similarly complex, where it creates a
cancel channel *just* for the GetMany call, passes that on to parallel
calls to Get, and if an error occurs on one of those Gets, it will
cancel all the other Gets it made (without cancelling the downstream
user's cancel channel.) Simultaneously, if the caller cancels, we
cancel the Get calls we made as well. (The implementation is about 80
lines long due to the complex error and cancellation handling, so I'm
not including source code for it.)
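Since the original source is not included, the following is only a rough sketch of the shape described: a private cancel channel for the parallel Gets, closed on the first error or when the caller cancels, while the caller's own channel is only ever read. The names getMany, getFunc, and demo are assumptions, strings stand in for *CachedFile, and the real version handled considerably more.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
	"sync/atomic"
)

var ErrCanceled = errors.New("canceled")

// getFunc stands in for (*Cache).Get; a string replaces *CachedFile.
type getFunc func(url string, cancel <-chan struct{}) (string, error)

// getMany is a hypothetical reduction of GetMany: it fetches all urls in
// parallel and stops the remaining fetches on the first error or when the
// caller's cancel channel closes.
func getMany(urls []string, cancel <-chan struct{}, get getFunc) ([]string, error) {
	inner := make(chan struct{}) // cancels only the Gets started here
	var once sync.Once
	var cause error
	stop := func(err error) {
		once.Do(func() {
			cause = err
			close(inner)
		})
	}

	finished := make(chan struct{})
	defer close(finished)
	go func() { // forward the caller's cancellation to inner
		select {
		case <-cancel:
			stop(ErrCanceled)
		case <-finished:
		}
	}()

	results := make([]string, len(urls))
	errs := make(chan error, len(urls))
	for i, u := range urls {
		go func(i int, u string) {
			r, err := get(u, inner)
			if err != nil {
				stop(err) // the first failure cancels the siblings
			}
			results[i] = r
			errs <- err
		}(i, u)
	}

	failed := false
	for range urls {
		if err := <-errs; err != nil {
			failed = true
		}
	}
	stop(nil) // release inner; also synchronizes the read of cause
	if failed {
		return nil, cause
	}
	return results, nil
}

// demo runs three Gets where "bad" fails and the others wait for cancel.
func demo() (errText string, siblingsCanceled int32) {
	get := func(url string, cancel <-chan struct{}) (string, error) {
		if url == "bad" {
			return "", errors.New("boom")
		}
		<-cancel
		atomic.AddInt32(&siblingsCanceled, 1)
		return "", ErrCanceled
	}
	callerCancel := make(chan struct{}) // never closed by getMany
	_, err := getMany([]string{"a", "bad", "c"}, callerCancel, get)
	return err.Error(), atomic.LoadInt32(&siblingsCanceled)
}

func main() {
	fmt.Println(demo())
}
```

Recording the cause inside the sync.Once means the caller gets the error that triggered the teardown rather than whichever sibling's ErrCanceled happened to arrive first.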
Each job running on the system would make its own cancel channel for
the job overall, and pass that into each of its calls into the *Cache,
so that jobs can exit cleanly and quickly.
The implementation of the *Cache was one of the most difficult
concurrency issues I've had to work with, but using channels for all
the communication tended to make it relatively easy to do. I needed
non-blocking reads on the cancel channel (to see if I should start the
next operation), blocking selects with other channels (e.g. retries),
and sometimes needed to split off another goroutine to cancel
operations in libraries that don't use a channel directly (like
net/http, *os.File, and various object storage libraries.)
For all the main worker code, dealing with a cancel channel directly
was very, very easy; any time you're waiting on something, select with
the cancel channel and abort if it becomes readable.
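Two of the patterns mentioned here, the non-blocking read and the extra goroutine for libraries that don't take a channel, might look like the sketch below. The helper names isCanceled, runSteps, and closeOnCancel are assumptions, and io.Pipe is used only because closing its reader is known to unblock a pending Read; whether Close interrupts a blocked call in other libraries has to be checked case by case.

```go
package main

import (
	"fmt"
	"io"
)

// isCanceled polls the channel without blocking.
func isCanceled(cancel <-chan struct{}) bool {
	select {
	case <-cancel:
		return true
	default:
		return false
	}
}

// runSteps performs up to n steps, checking for cancellation before each
// one, and returns how many steps actually ran.
func runSteps(n int, cancel <-chan struct{}, step func(i int)) int {
	for i := 0; i < n; i++ {
		if isCanceled(cancel) {
			return i
		}
		step(i)
	}
	return n
}

// closeOnCancel bridges to APIs that know nothing about channels: it
// closes c if cancel fires before done is closed.
func closeOnCancel(c io.Closer, cancel, done <-chan struct{}) {
	go func() {
		select {
		case <-cancel:
			c.Close()
		case <-done:
		}
	}()
}

// demo cancels a loop part-way through, then unblocks a pipe read.
func demo() (steps int, readFailed bool) {
	cancel := make(chan struct{})
	steps = runSteps(10, cancel, func(i int) {
		if i == 2 {
			close(cancel) // cancellation arrives during the third step
		}
	})

	pr, pw := io.Pipe()
	defer pw.Close()
	done := make(chan struct{})
	defer close(done)
	closeOnCancel(pr, cancel, done) // cancel is already closed here
	// Nothing is ever written, so Read returns only once pr is closed.
	_, err := pr.Read(make([]byte, 1))
	return steps, err != nil
}

func main() {
	fmt.Println(demo())
}
```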
Overall, I found the usage of chan struct{} directly to be pretty
clean. Every reader of a cancel channel (which is everything but the
place where it's closed) declares its type as <-chan struct{}, so it
cannot close the channel (doing so from a reader would be a coding
error, and the compiler rejects it). The syntax is a bit unintuitive
for the first few minutes, but since it's the same as all other
channel operations, it was easy to get used to. The semantics are
also easy to reason about, which is very important when implementing
complex things like the local file cache.
I tried to use the x/net/context library as well as labix's
gopkg.in/tomb.v1 for this, but found both of them to be trying to
solve more problems than cancellation and being more difficult to
think about than using channels directly.
I'm skeptical if any object-like interface will ever be able to match
the clarity of using a close-only channel of struct{} for me, aside
from one literally only exposing the read side of the channel and a
close method (and at that point, why wrap it in an object?) I don't
think standardizing the interface you mentioned would be worth the
trouble, but I *DO* think this is an important part of programming
really good Go servers that is often ignored in libraries I've seen
due to a lack of any standard at all.
The go-statement is extended (in a backward compatible way) to optionally return a cancel instance if used in an assignment:
cancel := go func() {...}()
With the idea that the cancel token gets closed when the goroutine has completed and been torn down by the runtime.
Further suppose that if the function being called in the go-statement returns one or more values and the last return value is of type error, then the cancel token will return that error value from its Err() method once it becomes resolved. If the func doesn't return an error value then Err() returns nil.
cancel := go func() error {... return err}()
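The proposed go-statement form does not exist in Go, but its completion semantics can be approximated with a library helper. In the sketch below, Go and the completion type are hypothetical names. One difference from the proposal: the token is closed just before the goroutine exits, since a library has no way to observe the runtime tearing the goroutine down.

```go
package main

import (
	"errors"
	"fmt"
)

// Cancel is the interface proposed earlier in the thread.
type Cancel interface {
	Done() <-chan struct{}
	Err() error
}

// completion is a hypothetical token that resolves when a goroutine ends.
type completion struct {
	done chan struct{}
	err  error // written before done is closed
}

func (c *completion) Done() <-chan struct{} { return c.done }

// Err returns nil while the goroutine is running, then its result.
func (c *completion) Err() error {
	select {
	case <-c.done:
		return c.err
	default:
		return nil
	}
}

// Go is a hypothetical library stand-in for `cancel := go f()`.
func Go(f func() error) Cancel {
	c := &completion{done: make(chan struct{})}
	go func() {
		defer close(c.done)
		c.err = f()
	}()
	return c
}

// demo starts one failing and one succeeding goroutine and awaits both.
func demo() (failed string, okIsNil bool) {
	bad := Go(func() error { return errors.New("some error") })
	good := Go(func() error { return nil })
	<-bad.Done()
	<-good.Done()
	return bad.Err().Error(), good.Err() == nil
}

func main() {
	fmt.Println(demo())
}
```

As in the proposal, Err() returning nil is ambiguous between "still running" and "finished without error"; callers that need to tell them apart have to check Done() first.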
This has wonderful composability! Consider this code that someone on that other thread had asked about:
That example makes me think of this problem differently and start to
see more value in using more than just a bare channel; your examples
have nothing to do with cancellation, but are instead focused on
completion. This object we're discussing really isn't about
cancellation or completion, it's a "set once" variable intended for
many readers, one writer, possibly with a rich reader API (including
combinators like your cancel.Add, compatibility with select, and a
blocking "await" (<-c.Done()).)
I think my mind was led astray by the name "cancellation". It really
is this generic useful pattern about boolean state that changes
precisely once; cancellation can be a common use case and a good
example, but in terms of what it does, "cancel" implies far too narrow
of a scope, and the other uses seem like they're stretching the
intended use (which shouldn't be the case.) I'm having trouble
thinking of an appropriate word for it that doesn't have narrow
implications...
There's a Close() method on the *CachedFile which must be called if
you get any *CachedFile from Get or GetMany (and you're guaranteed to
only get a non-nil *CachedFile/[]*CachedFile or a non-nil error from
these.) *CachedFile forces the file to never be garbage collected
while it is still open, so there's no safety race. Since that detail
was completely irrelevant to the discussion of cancellation, I skipped
it.
Also, this is used for handling *large* files (tens of gigabytes
each), and a large number of them (about a terabyte per node.)
Groupcache is not meant to handle this kind of load. It's not a
problem solvable by a memcached-like system.
All that said, I'd like to keep this thread focused on cancellation,
not caching.
func jujuVersion() {
    a := []int{1, 2, 3}
    r := juju.NewRun(len(a))
    for i := range a {
        j := i
        r.Do(func() error {
            // Do the work - what ever that is... 1/2 will fail here.
            if j > 0 {
                return errors.New("some error")
            }
            return nil
        })
    }
    err := r.Wait()
    if err != nil {
        switch errs := err.(type) {
        case juju.Errors:
            for _, e := range errs {
                fmt.Printf("%v\n", e)
            }
        default:
            fmt.Printf("%v\n", err)
        }
    }
}
func tombVersion() {
    var t tomb.Tomb
    for i := range []int{1, 2, 3} {
        j := i
        t.Go(func() error {
            // Do the work - what ever that is... 1/2 will fail here.
            if j > 0 {
                return errors.New("some error")
            }
            return nil
        })
    }
    err := t.Wait()
    if err != nil {
        fmt.Printf("%v\n", err)
    }
}
func original() {
    var wg sync.WaitGroup
    errs := make(chan error, 3)
    a := []int{1, 2, 3}
    for i := range a {
        wg.Add(1)
        go func(i int) {
            defer wg.Done()
            // Do the work - what ever that is... 1/2 will fail here.
            var err error
            if i > 0 {
                err = errors.New("some error")
            }
            // Write the outcome, nil for success, non-nil for error
            errs <- err
        }(i)
    }
    wg.Wait()
    for range a {
        if err := <-errs; err != nil {
            fmt.Printf("%v\n", err)
        }
    }
}
I had another thought based on a conversation I was having with Bryan Mills on another thread. If the cancel interface were part of the language then consider the following additional language extension:
The go-statement is extended (in a backward compatible way) to optionally return a cancel instance if used in an assignment:
cancel := go func() {...}()
With the idea that the cancel token gets closed when the goroutine has completed and been torn down by the runtime.
c <- go func(){..}()