work group synchronisation with buffered chans (again)


Dan Kortschak

Aug 18, 2013, 11:33:44 PM
to golang-nuts
This issue has been discussed before, both on golang-nuts and
golang-dev, but I bring it up because in a recent codereview[1] I see
two conflicting views on the interpretation of the case and I don't know
what to believe.

In the golang-nuts thread[2] there is a discussion about the use of
buffered chans to limit workgroups, this links to the golang-dev
thread[3] where there is an argument made that it is formally true that
the prevailing idiom at the time of using the buffered chan's cap as the
limit was resolvable to a no-op[4]. The idiom used was:

var limit = make(chan struct{}, max)

func concFunc1() {
	limit <- struct{}{} // A
	defer func() {
		<-limit // B
	}()
	// Work...
}


On the basis of this it was suggested that people should use a
pre-filled chan as a limiter:

var limit = make(chan struct{}, max)

func init() {
	for i := 0; i < max; i++ {
		limit <- struct{}{} // A
	}
}

func concFunc2() {
	<-limit // B
	defer func() {
		limit <- struct{}{} // A
	}()
	// Work...
}


As far as I can see the issue comes down to whether the memory model is
relevant to the determination of execution order in this case.

The only guarantee the memory model offers is that an A happens before the
corresponding B (because limit is buffered), so if the memory model
determines execution order here we cannot formally say that limit stops
concFunc1 from executing concurrently more than cap(limit) times.

Is the memory model important here? If it's not, what is?

Dan

[1]https://codereview.appspot.com/13038043/
[2]https://groups.google.com/d/topic/golang-nuts/Ug1DhZGGqTk
[3]https://groups.google.com/d/topic/golang-dev/ShqsqvCzkWg
[4]https://groups.google.com/d/msg/golang-dev/ShqsqvCzkWg/9WGj2yPK9xYJ

Dmitry Vyukov

Aug 19, 2013, 4:51:19 AM
to Dan Kortschak, golang-nuts
I've already described my thoughts on the codereview thread.

In short, if the memory model would NOT be relevant here, then the following program is allowed to print "playground\nHello, ":

func main() {
	sync := make(chan bool)
	go func() {
		print("Hello, ")
		sync <- true
	}()
	go func() {
		<-sync
		print("playground\n")
	}()
	select {}
}

The memory model says which operations happen before other operations. If you can prove that in all executions the printing of "Hello" happens before the printing of "playground", then you have proved that the program is correct. Otherwise they happen concurrently.

Dan Kortschak

Aug 19, 2013, 6:32:55 AM
to Dmitry Vyukov, golang-nuts
Yes, and I think your position is correct. I'm hoping that I can see a formal answer. The codereview thread seemed to just drop the issue without any real resolution.

Dan

Nico

Aug 19, 2013, 6:35:55 AM
to golan...@googlegroups.com
I'm not sure what the question is. Is it really whether the memory
model applies in these two cases, or how to apply the memory model to them?


On 19/08/13 04:33, Dan Kortschak wrote:
> var limit = make(chan struct{}, max)
>
> func concFunc1() {
>     limit <- struct{}{} // A
>     defer func() {
>         <-limit // B
>     }()
>     // Work...
> }

If I remember correctly, here the memory model makes no guarantees about
what happens after A or before B, and thus the compiler is allowed to
optimise A and B away.


> On the basis of this it was suggested that people should use a
> pre-filled chan as a limiter:
>
> var limit = make(chan struct{}, max)
>
> func init() {
>     for i := 0; i < max; i++ {
>         limit <- struct{}{} // A
>     }
> }
>
> func concFunc2() {
>     <-limit // B
>     defer func() {
>         limit <- struct{}{} // A
>     }()
>     // Work...
> }

Here, on the other hand, everything that happens after B is guaranteed
by the memory model to happen after the A from a different goroutine has
been completed.


Nico

PS: Shameless plug. See

https://docs.google.com/document/d/1h5YSWCELBqH3ILW3HmkVIpCt2Gpn02FBSF50kPz921s/edit?usp=sharing

Dmitry Vyukov

Aug 19, 2013, 6:40:38 AM
to Dan Kortschak, golang-nuts
I have to admit that this is mostly theoretical.
One practical implication is complaints from the race detector; however, it won't complain in most cases.
I can imagine that gccgo/llvmgo could actually break such code in the distant future, provided they do whole-program analysis and aggressive optimizations, because they will move in that direction for C/C++.

Dan Kortschak

Aug 19, 2013, 7:09:39 AM
to Dmitry Vyukov, golang-nuts
That's my concern. I'm sure it's going to be safe for gc, but I anticipate that people will want to use the other compilers for performance-critical applications when they are more developed. Knowing now means code written on that basis will remain safe.

Kyle Lemons

Aug 19, 2013, 4:00:12 PM
to Dan Kortschak, golang-nuts
Here's how I see it:
If all you care about is execution order, you will probably live forever without seeing the first mode fail, because that is specified in the spec.  However, the second way is explicit about the memory synchronization, and so you know that you can rely on your goroutine having access to the right data.  Read and write barriers that are added based on channel send/receives seem like an optimization that is likely to show up some day, so the first seems reasonably likely to fail if used improperly.
 

Dan Kortschak

Aug 19, 2013, 8:03:47 PM
to Kyle Lemons, golang-nuts
That seems entirely reasonable, but it would be nice to be able to point
to a document that explicitly states this.

Ian Lance Taylor

Aug 19, 2013, 8:15:42 PM
to Dmitry Vyukov, Dan Kortschak, golang-nuts
On Mon, Aug 19, 2013 at 3:40 AM, Dmitry Vyukov <dvy...@google.com> wrote:
> I need to admit that this is mostly theoretical.
> One practical implication is complains from race detector, however, it won't
> complain on most cases.
> I can imagine that gccgo/llvmgo can actually break such code in distant
> future provided that it does whole program analysis and agressive
> optimizations. Because they will move in that direction for C/C++.

We know that gccgo/llvmgo are not going to rearrange program execution
around channel operations, because that would be baffling. The
question is more one of how to adjust the memory model, or possibly
the language spec, to make clear that any such rearrangement is
invalid.

Ian

Dmitry Vyukov

Aug 19, 2013, 8:26:08 PM
to Ian Lance Taylor, Dan Kortschak, golang-nuts
On Tue, Aug 20, 2013 at 4:15 AM, Ian Lance Taylor <ia...@golang.org> wrote:
On Mon, Aug 19, 2013 at 3:40 AM, Dmitry Vyukov <dvy...@google.com> wrote:
> I need to admit that this is mostly theoretical.
> One practical implication is complains from race detector, however, it won't
> complain on most cases.
> I can imagine that gccgo/llvmgo can actually break such code in distant
> future provided that it does whole program analysis and agressive
> optimizations. Because they will move in that direction for C/C++.

We know that gccgo/llvmgo are not going to rearrange program execution
around channel operations, because that would be baffling.


Do you mean that it will be baffling, because we have lots of invalid programs out there :)


 
 The
question is more one of how to adjust the memory model, or possibly
the language spec, to make clear that any such rearrangement is
invalid.


The N-th receive happens before the (N+cap(c))-th send.

This is actually a generalization of the special rule for synchronous channels: if cap == 0, then the N-th receive happens before the N-th send, so that special rule can be removed.



Dan Kortschak

Aug 19, 2013, 8:33:01 PM
to Ian Lance Taylor, Dmitry Vyukov, golang-nuts
That would be the ideal situation.

Thanks.

Kyle Lemons

Aug 19, 2013, 8:50:34 PM
to Dmitry Vyukov, Ian Lance Taylor, Dan Kortschak, golang-nuts
Will this make the "incorrect" semaphore suddenly correct? 

Dmitry Vyukov

Aug 20, 2013, 6:19:29 AM
to Kyle Lemons, Ian Lance Taylor, Dan Kortschak, golang-nuts
On Tue, Aug 20, 2013 at 4:50 AM, Kyle Lemons <kev...@google.com> wrote:

Will this make the "incorrect" semaphore suddenly correct?


Yes, it does.

To reason about this, you need to draw a graph of the happens-before (HB) relations, and then count the maximum subset of the actions we are trying to limit that is not ordered by HB.
Here is what we have currently (S - send, A - action, R - receive):

S1->A1->R1
S2->A2->R2
S3->A3->R3
S4->A4->R4

Each goroutine does not necessarily receive its own message, so there can be additional edges. But this is the worst case.

It's easy to see that ALL actions are not ordered in any way.
If we also add arrows from R1 to S3 and from R2 to S4 (assuming cap(c) == 2), then the maximum subset of actions not ordered by HB has size 2 (== cap(c)).

As I said, this rule covers synchronous channels as well -- S1 happens before R1 (but we must be careful to define it so that there are no cycles in the HB graph).

Another consequence -- an "inverted" chan with cap==1 will be able to work as a mutex.

Nico

Aug 20, 2013, 6:57:04 AM
to golan...@googlegroups.com
On 20/08/13 01:26, Dmitry Vyukov wrote:
> N-th receive happens-before N+cap(c) send.

In the past there were objections to adding this rule to the memory
model. If I remember correctly:

- one objection was that it would prevent the compiler from applying
some optimisations (are there any of such optimisations currently in use?)

- it would synchronise a send and a receive that are, in principle,
unrelated (one could use the implementation of a work group using
buffered channels as a counter-example)

Are there any other reasons not to extend the memory model in this way?

Dmitry Vyukov

Aug 20, 2013, 7:15:45 AM
to Nico, golang-nuts
On Tue, Aug 20, 2013 at 2:57 PM, Nico <nicolas...@gmail.com> wrote:
On 20/08/13 01:26, Dmitry Vyukov wrote:
N-th receive happens-before N+cap(c) send.

In the past there were objections to adding this rule to the memory model. If I remember correctly:

- one objection was that it would prevent the compiler from applying some optimisations (are there any of such optimisations currently in use?)

I am pretty sure there are no such optimizations currently. They would have negligible effect on performance, so we can prohibit them.

 
- it would synchronise a send and a receive that are, in principle, unrelated (one could use the implementation of a work group using buffered channels as a counter-example)

Why is it bad?
What exactly do you mean by work group counter-example?

Dmitry Vyukov

Aug 20, 2013, 7:36:26 AM
to Nico, golang-nuts
+golang-nuts


On Tue, Aug 20, 2013 at 3:26 PM, Nico <nicolas...@gmail.com> wrote:
On 20/08/13 12:15, Dmitry Vyukov wrote:
Why is it bad?

I don't think it's bad.


Then why is it an objection?
 

What exactly do you mean by work group counter-example?


    Are there any other reason not to extend the memory model in this way?

What I mean is that the objection that the Nth receive and the (N+cap)th send are unrelated events is not valid when one tries to implement a work group using Dan's concFunc1()[1].
[1] https://groups.google.com/d/msg/golang-nuts/wSRqJLvVTzo/uDl0dmgu208J


Nico

Aug 20, 2013, 7:49:05 AM
to golang-nuts
I've just found the original thread:

https://groups.google.com/d/msg/golang-nuts/MDvnk1Ax7UQ/eQGkJJmOxc4J

One of the objections raised against rule 4 (which I think is
equivalent to the "Nth receive and the (N+cap)th send" rule) is given by
Gustavo:

https://groups.google.com/d/msg/golang-nuts/MDvnk1Ax7UQ/kXtjcFpSd0UJ

Personally, I have no objections. I want to understand the pros and cons.

From your previous response, I understand that adding this rule has
negligible effect on performance.

With my email, I was trying to prompt other people to raise any other
concerns. If there aren't any other concerns, I think Go would benefit
from adding such a rule.

Dmitry Vyukov

Aug 20, 2013, 8:21:48 AM
to Nico, golang-nuts
On Tue, Aug 20, 2013 at 3:49 PM, Nico <nicolas...@gmail.com> wrote:
I've just found the original thread:

https://groups.google.com/d/msg/golang-nuts/MDvnk1Ax7UQ/eQGkJJmOxc4J

One of the objections raised against rule 4 (that I think it's equivalent to "the Nth receive and the (N+cap)th send" rule) is given by Gustavo:

https://groups.google.com/d/msg/golang-nuts/MDvnk1Ax7UQ/kXtjcFpSd0UJ


This is a weak argument. We must put semantics ahead of memory barriers. Otherwise we must not have select.

 
Personally, I have no objections. I want to understand the pros and cons.

From your previous response, I understand that adding this rule has negligible effect on performance.

With my email, I was trying to prompt other people to raise any other concerns. If there aren't any other concerns, I think Go would benefit from adding such a rule.


There is lots of potential synchronization one can infer if one thinks in terms of the internal chan buffer and sequential consistency. E.g.:

c := make(chan bool, 1)

// goroutine 1
X = 42
c <- true

// goroutine 2
select {
case c <- true:
default:
	if X != 42 { panic("WTF") }
}

Can this panic fire?
If one uses the same naive reasoning as with the inverted-chan semaphore, then one can conclude that it must not fire. And this one can actually hurt performance in practice. There are lots more weird patterns that one can use and argue are correct.

We need to draw the line somewhere.
One possible line is POSIX-like everything-is-sequentially-consistent, then all such patterns are legal.
Another reasonable line is -- send HB the corresponding recv. This supports only pure producer-consumer patterns.
Go diverges from this by adding that weird rule that recv HB send for sync channels. But does not add the similar rule for async channels.

It boils down to what code patterns we want to support, and how much we want to confuse Go users; apparently, in the absence of full sequential consistency, lots of people are now confused. The example I provided above is confusing as well -- how can the panic possibly fire?

Dmitry Vyukov

Aug 20, 2013, 8:30:08 AM
to Nico, golang-nuts
There is a very interesting precedent in the C/C++ std lib.
Generally C/C++ provides sequential consistency for race-free programs.
But for std::mutex there are ways people are intended to use it, and ways people are not intended to use it. And you actually do care about any additional memory barrier in a mutex implementation. A mutex with a try_lock operation can be abused as follows (note that a std::thread starts running on construction, so there is no separate start call):

std::mutex m;
int X = 0;

std::thread t([&]() {
    if (!m.try_lock()) {
        if (X != 42)
            throw "WTF";
    }
});

X = 42;
m.lock();

So you use a *failing* try_lock as a notification that another thread has already done something and locked the mutex.

What they did is quite interesting -- they said that try_lock can fail spuriously sometimes. Now you can think of the implications.

Nico

Aug 20, 2013, 8:46:30 AM
to golang-nuts
On 20/08/13 13:21, Dmitry Vyukov wrote:
> c := make(chan bool, 1)
>
> // goroutine 1
> X = 42
> c <- true
>
> // goroutine 2
> select {
> case c <- true:
> default:
>     if X != 42 { panic("WTF") }
> }

Oops!

I think this is an argument against adding the "Nth receive and the
(N+cap)th send" rule to the memory model. There are no receives, and
hence it's not covered by this rule.

What's the point of fixing the work-group pattern, but not fixing this one?

Nico

Aug 20, 2013, 8:54:29 AM
to golang-nuts
Maybe the rule should be re-worded in terms of the send that makes a
buffer reach its capacity.

For example: "a send that makes a channel full happens before a send or
a receive on that full channel" (I'm sure someone else can improve my
clumsy English)

Dmitry Vyukov

Aug 20, 2013, 8:55:59 AM
to Nico, golang-nuts
Now what if you replace the select with len(c)==1?

Kyle Lemons

Aug 20, 2013, 12:09:51 PM
to Dmitry Vyukov, Ian Lance Taylor, Dan Kortschak, golang-nuts
On Tue, Aug 20, 2013 at 3:19 AM, Dmitry Vyukov <dvy...@google.com> wrote:

Will this make the "incorrect" semaphore suddenly correct?

Yes, it does.

This seems like it would require unnecessary memory barriers in the optimal, perfect-knowledge optimizer. Before (for a buffered channel), you only require read barriers when you perform a channel send and write barriers when you perform a channel receive. This makes sense, because you can think of the data as flowing along with the value you're sending. I understand it is not obvious at first, but I'm not convinced that the number of programs which will be correct is worth sacrificing the potential optimizations. In the same way that we can control the memory in our programs to control the garbage and thus the garbage-collection overhead, I think we should be able to control our channels and their buffer sizes in order to control the synchronization overhead.

Dmitry Vyukov

Aug 20, 2013, 12:36:13 PM
to Kyle Lemons, Ian Lance Taylor, Dan Kortschak, golang-nuts
I think you are right. But this is relevant only for (1) machines with
a relaxed memory model that actually require memory barriers (i.e. not
x86), and (2) a list-based chan implementation, which I doubt we will
ever have, because it's a more serious performance degradation in
itself; an array-based chan has to do that memory barrier anyway,
because consumers pass empty slots back to producers, so the
guarantee we are talking about already holds at the physical level.
So I would not assign primary importance to it (if we have other
strong arguments; select is really much more unpleasant :) ).

Kyle Lemons

Aug 20, 2013, 12:45:11 PM
to Dmitry Vyukov, Ian Lance Taylor, Dan Kortschak, golang-nuts
Okay then :).  If the optimization potential is purely theoretical, I don't really have much objection.

Nico

Aug 20, 2013, 1:53:30 PM
to golang-nuts
On 20/08/13 13:55, Dmitry Vyukov wrote:
> Now what if you replace the select with len(c)==1?

It's still the same underlying idea. Our intuition expects that each
time we gain any knowledge about the number of elements in a buffered
channel, a synchronisation point occurs:

1) if we know a channel is empty because it blocks on a receive:
one would expect synchronisation between the receive that
emptied the channel and the receive that blocked on the channel

2) if we know a channel is full because it blocks on a send:
one would expect synchronisation between the send that
made the channel full and the send that blocked on the channel

3) if we use len(c):
potentially, one would expect synchronisation between len(c) and
any sends and receives on that channel

Could the memory model be extended to guarantee synchronisation in all
the cases above without a noticeable loss of performance?

Does the current compiler provide all those synchronisation points?

Requiring synchronisation in case 3) seems a very strong requirement,
but what would be the use of len(c) in concurrent programming without
guaranteeing synchronisation?


Dmitry Vyukov

Aug 20, 2013, 2:02:48 PM
to Nico, golang-nuts
On Tue, Aug 20, 2013 at 9:53 PM, Nico <nicolas...@gmail.com> wrote:
> On 20/08/13 13:55, Dmitry Vyukov wrote:
>>
>> Now what if you replace the select with len(c)==1?
>
>
> It's still the same underlying idea. Our intuition expects that each time we
> gain any knowledge about the number of elements in a buffered channel a
> synchronisation point occurs:
>
> 1) if we know a channel is empty because it blocks on a receive:
> one would expect sychronisation between the receive that
> emptied the channel and the receive that blocked on the channel
>
> 2) if we know a channel is full because it blocks on a send
> one would expect sychronisation between the send that
> makes the channel full and the send that blocked on the channel
>
> 3) if we use len(c)
> potentially, one would expect sychronisation between len(c) and
> any sends and receives on that channel


You missed that it's possible to observe chan state with close(c).
I.e. if close(c) panics, then there was another preceding close(c) :)



> Could the memory model be extended to guarantee synchronisation at all the
> cases above without a noticeable loss of performance?

No, it can't.
For a non-blocking receive you want to do:

if (ATOMIC_LOAD_RELAXED(&c->len) == 0)
    return false;

without any memory barriers, and similarly for a non-blocking send on
a full channel.
Your semantics would require a memory barrier on ARM/PPC/etc.


> Does the current compiler provide all those synchronisation points?

Yes, it's provided by the runtime library, because each chan is
protected by a big lock.


> Requiring synchronisation in case 3) seems a very strong requirement, but
> what would be use of len(c) in concurrent programming without guaranteeing
> synchronisation?
>

Nico

Aug 20, 2013, 2:09:08 PM
to golang-nuts
Please, see the following video:

http://www.youtube.com/watch?v=218iXiKhKlg

Nico

Dmitry Vyukov

Aug 20, 2013, 2:15:18 PM
to Nico, golang-nuts
can you please transcribe it

Nico

Aug 20, 2013, 2:20:18 PM
to golang-nuts
It's a scene from "Analyze This".

It goes like this:

ROBERT DE NIRO
Oh, you're good.

BILLY CRYSTAL
No, I--

ROBERT DE NIRO
You really got a gift.

BILLY CRYSTAL
Really, I--

ROBERT DE NIRO
Yes ya do..

BILLY CRYSTAL
No, seriously, I just-

ROBERT DE NIRO
(dead serious)
YES YOU DO!

Dan Kortschak

Aug 23, 2013, 12:05:57 AM
to Ian Lance Taylor, Dmitry Vyukov, golang-nuts
Is this worth an issue logged?

Dmitry Vyukov

Aug 23, 2013, 4:56:29 AM
to Dan Kortschak, Ian Lance Taylor, golang-nuts
sure
at least so that it does not get lost while some gophers are out of office

Dan Kortschak

Aug 26, 2013, 9:14:13 PM
to Dmitry Vyukov, Ian Lance Taylor, golang-nuts