Re: [go-nuts] Why doesn't make(chan...) return two half-duplex channels?


Jim Whitehead II

Jul 4, 2012, 9:09:18 PM
to jto...@gmail.com, golan...@googlegroups.com
On Wed, Jul 4, 2012 at 5:36 PM, <jto...@gmail.com> wrote:
> Language design question: Go seems pretty thought out, but every time I
> think about this one problem, I can't figure out why having a full-duplex
> channel is ever a win.
>
> Right now, when you make a channel, you get a full-duplex channel, and then
> you can typecast it to a half-duplex channel if you want. But imagine if
> making a channel by default gave you two half-duplex channel objects. When
> one side of the channel gets garbage collected, the other channel is
> guaranteed to no longer have any readers (or writers).

What does this gain you? I don't want dangling channel ends to be
garbage collected because they indicate an error in the style of
programming that Go is based on. A process should not need to know
anything about its environment or the details of the garbage
collector. It would seem odd to me if a process would just die when
it's no longer communicating with anyone. That's implicit behaviour,
which goes against the explicit nature of Go.

> This seems like a huge win to me. You no longer have to pass control
> channels around to deal with long-lived goroutine cleanup. Having
> full-duplex channels makes me feel like I'm almost benefiting from garbage
> collection, but I have to manually go around and clean up any goroutines
> that are blocking on any channels. It's a shame those don't just end
> automatically when I'm done sending stuff to them (or reading from them).
> Right?

I disagree. I think the system as it stands is super easy to understand.

> So, golang-nuts, why is full-duplex channeling better? Are there any plans
> on using typecast half-duplex channels to enable this sort of garbage
> collection behavior, in that you can safely know that one side of the
> channel is over? I want GC to work with my channel-bound goroutines too. :(
>
> I'd fall in love with Go if not for this.

It's just a different style of thinking. I prefer the explicit nature
of the Go model. If I am reading on a channel that no one else has
access to (whether they are full or half duplex) then this is a
programming error in the concurrency model. I should write code that
handles this case explicitly. Anything else is tantamount to implicit
exceptions which opens a (bad) can of worms, in my opinion.

Just my thoughts,

- Jim

Matt Joiner

Jul 4, 2012, 9:38:50 PM
to Jim Whitehead II, jto...@gmail.com, golan...@googlegroups.com
It seems like a very minor enhancement. The implicit close if all the
writers are garbage collected seems out of character in Go.

Paul Borman

Jul 4, 2012, 11:55:22 PM
to Jim Whitehead II, jto...@gmail.com, golan...@googlegroups.com
On Wed, Jul 4, 2012 at 8:09 PM, Jim Whitehead II <jnwh...@gmail.com> wrote:
> A process should not need to know
> anything about its environment or the details of the garbage
> collector.

No, that is not true.  You do need to know what gets garbage collected and what doesn't (and why).  Without doing this you will end up with memory leaks.  For example, you do not have to close a channel if all references to it are garbage collected; the channel will then also be garbage collected.  However, if all references to an open file are garbage collected that file is not closed.  Garbage collection is a fundamental part of Go and understanding that and what it does is important to writing Go programs.

As for the environment, I can't possibly imagine what you mean by that.  A program that has no connection to its environment has no I/O, by definition, and hence cannot report at all what it has done.  Understanding the environment it runs in is often very critical to its function.

The OP's suggestion is interesting and I can see some merit in it, however, only for garbage collecting when all writers have disappeared, as collecting is akin to closing and there have been many discussions about the dark horrors of readers closing channels.

What would this code do?

    <- make(chan struct{})

I used to use that before I learned to use select{}.

    -Paul

Rémy Oudompheng

Jul 5, 2012, 2:18:27 AM
to jto...@gmail.com, golan...@googlegroups.com
On 2012/7/4 <jto...@gmail.com> wrote:
> This seems like a huge win to me. You no longer have to pass control
> channels around to deal with long-lived goroutine cleanup. Having
> full-duplex channels makes me feel like I'm almost benefiting from garbage
> collection, but I have to manually go around and clean up any goroutines
> that are blocking on any channels. It's a shame those don't just end
> automatically when I'm done sending stuff to them (or reading from them).
> Right?

Go already has a close() builtin to signal end of transmission through
a channel. A usual pattern is to use a range loop over channel in
consumer goroutines, which automatically ends when the channel is
closed and empty.

Rémy.
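
A minimal sketch of the close-and-range pattern described above. The function name `sum` and the values sent are made up for illustration; the only language features relied on are `close` and `range` over a channel.

```go
package main

import "fmt"

// sum is a hypothetical consumer. The range loop ends on its own once the
// channel has been closed and every buffered value has been received.
func sum(in <-chan int) int {
	total := 0
	for v := range in {
		total += v
	}
	return total
}

func main() {
	ch := make(chan int)
	result := make(chan int)

	// Consumer goroutine: needs no separate control channel.
	go func() { result <- sum(ch) }()

	for i := 1; i <= 4; i++ {
		ch <- i
	}
	close(ch) // the sender signals end of transmission

	fmt.Println(<-result) // prints 10
}
```

The close is still explicit: the sender has to know it is the last one, which is the bookkeeping the original question is about.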

Kyle Lemons

Jul 5, 2012, 2:59:30 AM
to jto...@gmail.com, golan...@googlegroups.com
On Wed, Jul 4, 2012 at 9:36 AM, <jto...@gmail.com> wrote:
> Language design question: Go seems pretty thought out, but every time I think about this one problem, I can't figure out why having a full-duplex channel is ever a win.

I wouldn't say it's full-duplex anything.  It's simply a handle that refers to a (possibly a)synchronous queue.
 
> Right now, when you make a channel, you get a full-duplex channel, and then you can typecast it to a half-duplex channel if you want. But imagine if making a channel by default gave you two half-duplex channel objects. When one side of the channel gets garbage collected, the other channel is guaranteed to no longer have any readers (or writers).

Converting (Go doesn't have casts) a channel to a send/receive-only channel is exceedingly rare.  I think having make spit them out (and essentially removing unrestricted channels from the language) would complicate things unnecessarily, if for no other reason than the readability/clarity loss now that you have two variables in the namespace instead of one (I happen to be a fan of * and # in vim and use them heavily).
 
> This seems like a huge win to me. You no longer have to pass control channels around to deal with long-lived goroutine cleanup. Having full-duplex channels makes me feel like I'm almost benefiting from garbage collection, but I have to manually go around and clean up any goroutines that are blocking on any channels. It's a shame those don't just end automatically when I'm done sending stuff to them (or reading from them). Right?

Even if you have restricted channels, the "channel itself" (the queue) is not garbage collected until both sides are gone.  Being any smarter about it would require far more intelligence in the garbage collector than exists today.  By the time the garbage collector has that depth of knowledge, the compiler and run-time will also probably be smart enough to know which goroutines are sending and which are receiving from channels anyway (that leads to some fun optimization possibilities), and thus shouldn't need to be told.
 
> So, golang-nuts, why is full-duplex channeling better? Are there any plans on using typecast half-duplex channels to enable this sort of garbage collection behavior, in that you can safely know that one side of the channel is over? I want GC to work with my channel-bound goroutines too. :(

Channels are very simple and easy to understand.  They are much easier to reason about than (especially multiple levels of) locking.  I wonder if you are thinking of channels too much like network sockets when you may be better off thinking of them as multiple-producer multiple-consumer queues that are optionally asynchronous and buffered.
 
> I'd fall in love with Go if not for this.

I suggest writing a few thousand lines of Go before you decide whether this is the feature that bugs you the most, and whether whatever does bother you is a show-stopper :).

Jul 5, 2012, 3:51:35 AM
to golan...@googlegroups.com, Jim Whitehead II, jto...@gmail.com
On Thursday, July 5, 2012 5:55:22 AM UTC+2, Paul Borman wrote:
> ... if all references to an open file are garbage collected that file is not closed.

Search for SetFinalizer in http://golang.org/src/pkg/os/file_unix.go

Jul 5, 2012, 6:22:17 AM
to golan...@googlegroups.com
On Wednesday, July 4, 2012 6:36:57 PM UTC+2, JT Olds wrote:
> Language design question: Go seems pretty thought out, but every time I think about this one problem, I can't figure out why having a full-duplex channel is ever a win.
>
> Right now, when you make a channel, you get a full-duplex channel, and then you can typecast it to a half-duplex channel if you want. But imagine if making a channel by default gave you two half-duplex channel objects. When one side of the channel gets garbage collected, the other channel is guaranteed to no longer have any readers (or writers).

Go's type system does not allow conversion from (<-chan T) or (chan<- T) to chan T. This means that if you have two half-duplex channel variables pointing to the same channel C, and either one of the variables is garbage collected, then Go's run-time could (in theory) garbage collect C.

One part of the functionality you are requesting in your post (garbage collection of half-duplex channels) is possible in a Go implementation without any modifications to the language spec.
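
As a rough illustration of the point about conversions, the two half-duplex views can already be produced in user code today. The helper name `newPipe` is hypothetical; the sketch relies only on the rule that a bidirectional channel is assignable to either directional type and that the reverse conversion is rejected by the compiler.

```go
package main

import "fmt"

// newPipe is a hypothetical helper: it makes one channel and returns its
// send-only and receive-only views. Both values refer to the same queue.
func newPipe(buf int) (chan<- int, <-chan int) {
	ch := make(chan int, buf)
	return ch, ch
}

func main() {
	send, recv := newPipe(1)

	send <- 42
	fmt.Println(<-recv) // prints 42

	// Neither of these compiles, because a directional channel cannot be
	// converted back to a bidirectional one:
	//   var both chan int = recv
	//   both := (chan int)(send)

	close(send) // closing is permitted on the send-only view
	_, ok := <-recv
	fmt.Println(ok) // prints false
}
```

This gives the two variables the original post asks for, but nothing here ties channel closure to garbage collection; the close is still an explicit call.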

The other part of your post (closing the channel when all (<-chan T) are garbage collected, or when all (chan<- T) are garbage collected) is missing from the language spec. There is no rule in the spec stating that "When an object X is garbage collected, do an action Y". As far as the spec is concerned, the rule is that Go's garbage collection never triggers any action. Note that the runtime.SetFinalizer function is part of the run-time; it isn't in the spec.

> This seems like a huge win to me. You no longer have to pass control channels around to deal with long-lived goroutine cleanup. Having full-duplex channels makes me feel like I'm almost benefiting from garbage collection, but I have to manually go around and clean up any goroutines that are blocking on any channels. It's a shame those don't just end automatically when I'm done sending stuff to them (or reading from them). Right?

In my opinion, it is a good decision not to have such automation in the language. The reason why it is a good decision is that basing program behavior on the abilities and disabilities of a garbage collector implementation is a bad idea. The spec would not only need to mention that there is some garbage collection in Go programs, but it would need to specify what kind of garbage collection it is and exactly specify when an object becomes garbage.   This mandates that the garbage collector would need to be a precise garbage collector.

All Go language implementations would need to adhere to the fact that garbage collection needs to be precise. Consequently, all of the existing implementations (8g/6g/5g, gccgo, whatever) would violate the language spec, because their garbage collector is conservative.

> So, golang-nuts, why is full-duplex channeling better? Are there any plans on using typecast half-duplex channels to enable this sort of garbage collection behavior, in that you can safely know that one side of the channel is over? I want GC to work with my channel-bound goroutines too. :(
>
> I'd fall in love with Go if not for this.

You can put the channel in a struct F, allocate F on heap, and use runtime.SetFinalizer to attach a finalizer to it. All accesses of the channel in your program need to go through F. When F gets garbage collected, the finalizer will close the channel. Go has no support for generics, so the code may be a bit verbose if you are using channels of distinct types.
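
A sketch of what that wrapper might look like, under several assumptions: the names `Sender`, `NewPipe`, `Send` and `Close` are invented here, the element type is fixed to `int`, and a `sync.Once` is added so that an explicit close and the finalizer cannot both close the channel. The run-time does not promise when, or whether, a finalizer runs, so this is best-effort cleanup and not a guarantee.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// Sender is a hypothetical heap-allocated wrapper around the sending side.
// All sends are meant to go through it, so that the wrapper becoming
// unreachable means no sender is left.
type Sender struct {
	ch   chan int
	once sync.Once
}

// Send forwards a value to the wrapped channel.
func (s *Sender) Send(v int) { s.ch <- v }

// Close closes the wrapped channel at most once, whether it is called
// explicitly or by the finalizer.
func (s *Sender) Close() { s.once.Do(func() { close(s.ch) }) }

// NewPipe returns the wrapper and a receive-only view of its channel.
// The finalizer closes the channel if the wrapper is ever collected.
func NewPipe(buf int) (*Sender, <-chan int) {
	s := &Sender{ch: make(chan int, buf)}
	runtime.SetFinalizer(s, (*Sender).Close)
	return s, s.ch
}

func main() {
	s, recv := NewPipe(1)
	s.Send(1)
	fmt.Println(<-recv) // prints 1

	s.Close() // explicit close; the finalizer later becomes a no-op
	_, ok := <-recv
	fmt.Println(ok) // prints false
}
```

Note that a goroutine blocked inside `Send` still holds the wrapper, so it stays reachable and the finalizer will not fire for it.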

That said, I believe that in order for a programming language to implement finalizers efficiently there needs to be an explicit finalizer type (finalizer T, just like chan T or []T) built into the language. Allowing finalizers on any object allocated from heap is somewhat inefficient.

Jim Whitehead II

Jul 5, 2012, 8:09:49 AM
to Paul Borman, jto...@gmail.com, golan...@googlegroups.com
On Thu, Jul 5, 2012 at 4:55 AM, Paul Borman <bor...@google.com> wrote:
>
>
> On Wed, Jul 4, 2012 at 8:09 PM, Jim Whitehead II <jnwh...@gmail.com> wrote:
>>
>> A process should not need to know
>> anything about its environment or the details of the garbage
>> collector.
>
>
> No, that is not true. You do need to know what gets garbage collected and
> what doesn't (and why). Without doing this you will end up with memory
> leaks. For example, you do not have to close a channel if all references
> to it are garbage collected; the channel will then also be garbage collected.
> However, if all references to an open file are garbage collected that file
> is not closed. Garbage collection is a fundamental part of Go and
> understanding that and what it does is important to writing Go programs.

I disagree with you, but we're probably just saying one thing in two
different ways. As a goroutine running in the world, I have links to
the outside world through channels. These channels have different
conventions and meanings, such as the 'quit' channels. I need to know
what my view of that channel is and whether or not I should plan to
wait for a signal on that channel, or broadcast to it when complete. I
don't need to know anything about the other processes that are running
in the system, I just need to know about the connections that I have
to them and the protocols for using those channels.

This can be seen trivially: I need to know whether to read or write on
a channel or I may introduce deadlock. This has to be baked into any
program that is using channels and doesn't require details of the rest
of the system (environment) in order for me to write my individual
goroutine procedure. Perhaps this is an oversimplification, but it's
how I view and write my programs.

> As for the environment, I can't possibly imagine what you mean by that. A
> program that has no connection to its environment has no I/O, by definition,
> and hence cannot report at all what it has done. Understanding the
> environment it runs in is often very critical to its function.

Ah, I see you haven't read the 1978 CSP paper, where Hoare introduces
the send and receive operations purposely to indicate I/O, or
connections with an environment. I am viewing the world through that
lens and that is how I write my programs.

When you create a function, you understand what the parameters and
return values are supposed to be. You might even indicate this to a
consumer of your program using documentation, comments, or other
conventions. I see concurrency as simply being an extension of that. I
give my channels good names, and talk about how they are being used.

"Read messages from the 'in' channel, producing a response on the
'out' channel for each before proceeding to the next one. If a message
is received on the 'quit' channel, then finish processing any
outstanding requests, but stop listening for messages on 'in'."
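
One possible reading of that contract as code. The function name `serve` and the doubling "response" are placeholders chosen for illustration; the channel names follow the quoted description.

```go
package main

import "fmt"

// serve is a hypothetical goroutine body for the contract quoted above.
// It knows only its three channels, not who is on the other end of them.
func serve(in <-chan int, out chan<- int, quit <-chan struct{}) {
	for {
		select {
		case msg := <-in:
			// Respond before proceeding to the next message.
			out <- msg * 2 // placeholder response
		case <-quit:
			// Nothing is outstanding at this point: each request is
			// answered before the next select. Stop listening on 'in'.
			return
		}
	}
}

func main() {
	in, out, quit := make(chan int), make(chan int), make(chan struct{})
	done := make(chan struct{})
	go func() {
		serve(in, out, quit)
		close(done)
	}()

	in <- 21
	fmt.Println(<-out) // prints 42

	quit <- struct{}{}
	<-done
}
```

Whether one producer or several share `in` makes no difference to the body of `serve`.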

When I am designing this goroutine procedure, I can limit myself to
operating within the realms of this definition the same as if I was
writing a conventional function. I don't have to be concerned with the
fact that I'm not listening to ONE producer on a single channel, but
that I'm listening to TWO producers that share my input channel. I
just know my job and I do it.

Of course the person doing the programming needs to understand the
connections between the processes-- they're the one who puts them
together. That's a consequence of the Go model that differs from the
CSP model, since we communicate over explicit first-class channels
rather than by addressing our messages to specific process names. I
can write my programs as simple sequential programs that communicate
using the resources they are given.

> The OP's suggestion is interesting and I can see some merit in it, however,
> only for garbage collecting when all writers have disappeared as collecting
> is akin to closing and there have been many discussions about the dark
> horrors of readers closing channel.

This is where I think it falls apart. It ruins the explicit nature of
Go by introducing behaviour that winds up being dangerously close to
"exceptional circumstances". But perhaps I just need to better
understand the use-case here. If we're talking about the channel end
being close()-ed when the other end is collected, then I don't know
that I would like those semantics.

> What would this code do?
>
> <- make(chan struct{})

That code, to me, indicates a deadlocked portion of code. It's a bug.

> I used to use that before I learned to use select{}.
>
> -Paul

I just happen to believe the explicit nature of the current system
fits well with the language and gives us both power and clarity. I
understand the OP's premise, but I don't believe the cost incurred is
worth losing that clarity.

- Jim

JT Olds

Jul 5, 2012, 12:28:52 PM
to Jim Whitehead II, Paul Borman, golan...@googlegroups.com
Great discussion everyone, so, a few clarifications and a motivating example:

I am not interested in actually garbage collecting goroutines. I'm
interested in reducing the bookkeeping one has to do during clean up.
Having written a few thousand lines of Go, I often find that I get
into the pattern of creating a goroutine that is selecting on a
processing channel and a control channel. It will read from the
processing channel until it gets a message from the control channel,
at which point the select loop will end, and the goroutine will finish
up. However, that means that all of the other goroutines who are
sending jobs to this specific goroutine need to keep track of when the
close control message has happened, so they know that the channels are
closed down.

For a concrete example, let's say I have a goroutine that does some
computation, like factorial of a number x. I have a request structure
that includes the number x, and a response channel for the goroutine
to send the response on. A bunch of other goroutines have access to
this request channel and send requests and wait for responses on their
specific response channels. At some point, those other goroutines
finish up and exit. How does the factorial computing goroutine end?

If I were to use additional control channels, this is certainly
doable, but it seems like unnecessary bookkeeping. Imagine if the
request channel knew that no more requests would come on it? Then the
factorial goroutine select call on the request channel could end
(channel closed!) instead of blocking forever. It would know it would
receive no more requests and clean itself up.

So, when the sending side of the request channel is garbage collected,
a close event could in theory be sent to the receiving side, allowing
the receiving side to shut down and exit, without additional
bookkeeping for instructing the factorial goroutine to exit.
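
For reference, a sketch of the factorial service as it has to be written today, with the close made explicit. The names `Request`, `Reply`, `factorial` and `serveFactorials` are invented for this example, and a `sync.WaitGroup` stands in for "all requesters have finished".

```go
package main

import (
	"fmt"
	"sync"
)

// Request carries the input and a per-client reply channel.
type Request struct {
	X     int
	Reply chan int
}

// factorial computes n! iteratively (1 for n < 2).
func factorial(n int) int {
	result := 1
	for i := 2; i <= n; i++ {
		result *= i
	}
	return result
}

// serveFactorials answers requests until the channel is closed and
// drained, then signals that it has exited.
func serveFactorials(requests <-chan Request, done chan<- struct{}) {
	for req := range requests {
		req.Reply <- factorial(req.X)
	}
	close(done)
}

func main() {
	requests := make(chan Request)
	done := make(chan struct{})
	go serveFactorials(requests, done)

	var clients sync.WaitGroup
	for x := 3; x <= 5; x++ {
		clients.Add(1)
		go func(x int) {
			defer clients.Done()
			reply := make(chan int)
			requests <- Request{X: x, Reply: reply}
			fmt.Println(x, <-reply)
		}(x)
	}

	// This is the bookkeeping in question: someone has to notice that
	// every client has finished and close the request channel.
	clients.Wait()
	close(requests)
	<-done
}
```

The proposal in this thread amounts to having the run-time perform the `clients.Wait(); close(requests)` step once no sender can reach the channel.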

Perhaps there's some new Go pattern for dealing with this case
cleanly? But this example is perhaps a good place to center the
discussion.

-JT

Kyle Lemons

Jul 5, 2012, 10:20:01 PM
to jto...@xnet5.com, Jim Whitehead II, Paul Borman, golan...@googlegroups.com
The usual way to handle this is by arranging for the request channel to be closed when all requesters are done.  I'm not sure what else you're doing with the control channel, but from your description it's not required, so I left it out.  Consider:


import (
	"fmt"
	"sync"
)

type Request struct {
	X      int
	Result chan int
}

// Computer's zero-value is properly initialized.
type Computer struct {
	lock     sync.Mutex
	requests chan Request
	cleanup  sync.WaitGroup
}

// run serves requests until the requests channel is closed.
func (c *Computer) run() {
	for req := range c.requests {
		// Placeholder computation (a square, not the factorial from the example)
		req.Result <- req.X * req.X
	}
}

// Start won't work if the computer already stopped
// How to fix that is left as an exercise to the reader :)
func (c *Computer) Start() chan Request {
	// Keep track of clients
	c.cleanup.Add(1)

	// Protect from duplicate creation
	c.lock.Lock()
	defer c.lock.Unlock()

	// Start running if necessary
	if c.requests == nil {
		fmt.Println("Starting Computer")
		c.requests = make(chan Request, 32)
		go c.run()

		// Arrange for the requests channel to close
		// when there are no more clients
		go func() {
			c.cleanup.Wait()
			fmt.Println("Cleaning up")
			close(c.requests)
		}()
	}
	return c.requests
}

// Done is called by each client when it has finished sending requests.
func (c *Computer) Done() {
	c.cleanup.Done()
}

JT Olds

Jul 5, 2012, 11:35:21 PM
to Kyle Lemons, Jim Whitehead II, Paul Borman, golan...@googlegroups.com
Okay, so there isn't any new pattern for dealing with this, just all
the usual locks and bookkeeping.

It's a shame, because the computation-as-a-service model (a channel
that you send requests to) seems so incredibly useful. It's too bad
you essentially have to add reference counting. My argument was
half-duplex channels eliminate the need for manual reference counting
of goroutines.

si guy

Jul 6, 2012, 12:30:49 AM
to golan...@googlegroups.com
You can also hold a reference to a heap-allocated struct on the controller side that contains the input chans and use runtime.SetFinalizer to close the workers' input chans. Then have the workers return on closed input chans.
Or is this a bad idea? I've never used it.

si guy

Jul 6, 2012, 12:32:34 AM
to golan...@googlegroups.com
Whoops, this was already mentioned, ignore my last.

Kyle Lemons

Jul 6, 2012, 4:11:30 PM
to JT Olds, Jim Whitehead II, Paul Borman, golan...@googlegroups.com
On Thu, Jul 5, 2012 at 8:35 PM, JT Olds <jto...@xnet5.com> wrote:
> Okay, so there isn't any new pattern for dealing with this, just all
> the usual locks and bookkeeping.
>
> It's a shame, because the computation-as-a-service model (a channel
> that you send requests to) seems so incredibly useful. It's too bad
> you essentially have to add reference counting. My argument was
> half-duplex channels eliminate the need for manual reference counting
> of goroutines.

Actually, it rarely comes up.  Usually if you have that sort of goroutine, it lives as long as the application is running, so it doesn't ever need to know to clean up (until a global quit signal is received, in which case you can either exit or go into lame-duck mode, depending on need).
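
A small sketch of the "global quit signal" case, assuming the signal is delivered by closing a shared channel. The function name `runService` and the job counting are illustrative only; what the sketch relies on is that a receive from a closed channel never blocks, so one `close` reaches every goroutine selecting on it.

```go
package main

import (
	"fmt"
	"sync"
)

// runService is a hypothetical long-lived goroutine body. It handles jobs
// until the shared quit channel is closed and reports how many it handled.
func runService(jobs <-chan int, quit <-chan struct{}) int {
	handled := 0
	for {
		select {
		case <-jobs:
			handled++
		case <-quit:
			return handled
		}
	}
}

func main() {
	jobs := make(chan int)
	quit := make(chan struct{})

	var wg sync.WaitGroup
	for i := 0; i < 3; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			runService(jobs, quit)
		}()
	}

	jobs <- 1   // picked up by whichever service is ready
	close(quit) // a single close is seen by all three services
	wg.Wait()
	fmt.Println("all services stopped")
}
```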