Calling sync.WaitGroup.Add() while a sync.WaitGroup.Wait() is waiting

Jon Bodner

unread,

Apr 26, 2018, 11:29:26 AM4/26/18

to golang-nuts

Hello,

We were having a discussion at work about passing around references to sync.WaitGroup. I said it was a bad idea, because if a goroutine launched a goroutine of its own and called sync.WaitGroup.Add() after a goroutine called sync.WaitGroup.Wait(), but before the count dropped to 0, your code would panic. The comments seem to support this:

"If a WaitGroup is reused to wait for several independent sets of events, new Add calls must happen after all previous Wait calls have returned."

And the WaitGroup source code for Add has lines like this:

if w != 0 && delta > 0 && v == int32(delta) {

panic("sync: WaitGroup misuse: Add called concurrently with Wait")

}

But a co-worker wrote this code, and it runs without a panic: https://play.golang.org/p/NjqK_YoqHeH

I thought that this code might only be working because the playground is single-threaded, but when this code is run locally, it also completes without panics. Running it with the race flag also works properly.

How do you trigger those panics in the WaitGroup codebase?

Thanks,

Jon

Ian Lance Taylor

unread,

Apr 26, 2018, 2:47:56 PM4/26/18

to Jon Bodner, golang-nuts

Here's one panic:

package main

import (
"sync"
"time"
)

func main() {
var wg sync.WaitGroup
wg.Add(1)
go func() {
time.Sleep(time.Millisecond)
wg.Add(-1)
wg.Add(1)
}()
wg.Wait()
}

Ian

delepl...@gmail.com

unread,

Apr 26, 2018, 2:57:13 PM4/26/18

to golang-nuts

Hello Jon

I think this is working as expected.

The relevant part of the doc is "calls with a positive delta that start when the counter is greater than zero, may happen at any time".

This is what happens in your code (stdout on the left, my comments on the right)

here wg is created, its counter is 0

here after first Add, counter is now 1

waiting

...here the first child goroutine is sleeping for 1s...

adding from top level goroutine

here after second Add, counter is now 2

done from top level goroutine

here after Done, counter is now 1

done from background goroutine

here after Done, counter is now 0

There are 3 goroutines involved, and while a few different interleaving of messages are possible, what is important here is that

- the first Add happens-before wg.Wait()

- the first Add happens-before the second Add

- the second Add happens-before the calls to Done, which implies that the second Add start when the counter is greater than zero (it is 1)

The part of the doc that you quote "If a WaitGroup is reused..." warns against a race condition between a "Wait" and an "Add positive number to a zero counter", which is not the case here.

I think it can be rephrased as "make sure all the calls to wg.Wait have properly returned (happen-before) and then you may start reusing the WaitGroup if you wish".

As a side note, you have correctly spotted that it is legal to pass around a pointer to the WaitGroup (as opposed to passing it by value, which would be illegal). But in simple cases like this one, it is even more idiomatic (and legible) to use it directly from the closures scope : https://play.golang.org/p/nF9Am77e4u-

Hope this helps :)

Don't hesitate to ask for details on this topic

Valentin

Message has been deleted

David Collier-Brown

unread,

Apr 26, 2018, 7:49:25 PM4/26/18

to golang-nuts

I also ran into a race diagnostic when I did an "Add" after my main programs started "Wait"ing: I moved the add to the main program and cured it.

I read this as the WaitGroup being careful and efficient about use of locking primitives, and not liking the expensive operation of incrementing a semaphore-like construct while someone is waiting for it to change towards zero.

Logically, it's easy to prove correctness of a counter that increments to a value and then decrements to zero, and less easy to deal with one that jumps up and down (and maybe waves its arms in time with the music (;-))

--dave

roger peppe

unread,

Apr 27, 2018, 7:29:50 AM4/27/18

to Jon Bodner, golang-nuts

On 26 April 2018 at 16:29, Jon Bodner <j...@bodnerfamily.com> wrote:
> Hello,
>
> We were having a discussion at work about passing around references to
> sync.WaitGroup. I said it was a bad idea, because if a goroutine launched a
> goroutine of its own and called sync.WaitGroup.Add() after a goroutine
> called sync.WaitGroup.Wait(), but before the count dropped to 0, your code
> would panic. The comments seem to support this:
>
> "If a WaitGroup is reused to wait for several independent sets of events,
> new Add calls must happen after all previous Wait calls have returned."
>
> And the WaitGroup source code for Add has lines like this:
>
> if w != 0 && delta > 0 && v == int32(delta) {
> panic("sync: WaitGroup misuse: Add called concurrently with Wait")
> }
>
> But a co-worker wrote this code, and it runs without a panic:
> https://play.golang.org/p/NjqK_YoqHeH

FWIW This code is fine, and was part of the original design goal of
WaitGroup. When the count is known to be non-zero (for example because
the current goroutine has previously done an Add) it's OK to call Add
to add more.

The key in the above quote is "If a WaitGroup is reused to wait for
several independent sets of events". If you're reusing a WaitGroup,
this makes sense, but in the example above, the WaitGroup is not
reused so all is good.

As for the panic, others have already explained how that might happen.

Jon Bodner

unread,

Apr 28, 2018, 12:09:00 AM4/28/18

to golang-nuts

Thanks, everyone, for your answers! I think I understand what sync.WaitGroup is doing now.

Jon

Reply all

Reply to author

Forward