Re: question about once.go

83 views

Skip to first unread message

Dmitry Vyukov

unread,

Feb 17, 2012, 12:53:06 AM2/17/12

to Jonathan Amsterdam, golang-dev

+golang-dev

On Fri, Feb 17, 2012 at 8:10 AM, Jonathan Amsterdam <j...@google.com> wrote:

Hi,
I happened to be reading once.go, specifically the Do function (pasted below). For my own understanding of these things, I'd love it if you could explain why you use a compare-and-swap to set the done field to 1, instead of an atomic set, or just an ordinary assignment (since the mutex is held at that point).

Thanks.
------------

func (o *Once) Do(f func()) {

if atomic.LoadUint32(&o.done) == 1 {
return

}
// Slow-path.

o.m.Lock()
defer o.m.Unlock()

if o.done == 0 {
f()

atomic.CompareAndSwapUint32(&o.done, 0, 1)
}

}

Hi Jonathan,

You are absolutely right that plain atomic store is enough here. Is is that way for historical reasons. As far as I remember the order of events was:

0. All sync primitives have simple mutex-based implementation. sync.atomic contains only Add and CompareAndSwap.

1. I optimized most sync primitives, including fast-pathing Once. I had to use CAS for store and Add(0) for load.

2. I added atomic.Load() functions.

3. I optimized Once fast-path to use atomic.Load().

4. I added atomic.Store() functions.

The logical next step is to change CAS to Store. However, I think it can wait until after Go 1.

Dmitry Vyukov

unread,

Feb 17, 2012, 9:23:14 AM2/17/12

to Jonathan Amsterdam, golang-dev

+golang-dev

On Fri, Feb 17, 2012 at 5:51 PM, Jonathan Amsterdam <j...@google.com> wrote:

Thanks for explaining, Dmitry. Could you educate me further by explaining the bug in using a simple assignment? Here's why I (naively) think it's ok:

First, because there is a mutex unlock that will immediately follow, other threads will observe the change.

Atomicity is not the only property we care, proper mutual ordering of memory access is no less important:

http://www.1024cores.net/home/lock-free-algorithms/so-what-is-a-memory-model-and-how-to-cook-it

Consider the following pseudo code of Once.Do:

mtx.lock()
// initialization is inlined for demonstrativeness
obj = new Obj

obj->data = 42

p = &obj

initialized = 1

mtx.unlock()

The compiler is free to reorder it as:

mtx.lock()

initialized = 1

// initialization is inlined for demonstrativeness
obj = new Obj

obj->data = 42

p = &obj

mtx.unlock()

Then another goroutine observes initialized == 1 and access p and p->data. Bang!

Compiler is not the only who can do such reoirderings, processors can do it as well (even w/o inlining).

Mutexes work iff each and every access to protected data is under the mutex. If a single memory access is outside the mutex (what we have in Once), then we are on the lock-free ground and need to take into account atomicity, ordering and visibility.

Second, even if writing a 32-bit value is not atomic, the atomic load on the fast path will either observe a 1, which is fine since f was called, or it won't, in which case no harm done, since done will equal one after the mutex is acquired and f will not be called a second time.

That is [almost] correct. However, even in such case, proper expression of intentions greatly improve code readability. If it is a concurrent access to a shared data do say so. Moreover, such "benign" races confuse verification tools. The data race detector for Go, Go ThreadSanitizer, that we are working on will report it as a data race (if you use plain non-atomic load/store); then if you have dozens of such seemingly false reports, you will miss real harmful reports.

Actually, I am confused about the 32-bit atomic load and store operations. Isn't reading or writing a 32-bit value atomic on any machine with 32-bit or higher data paths? In which case, is the only purpose of those instructions to make changes visible across threads? If so, then you would never need to use either in a mutex-protected section, even if there are non-mutex-protected atomic accesses to the same memory location. Where did I go wrong?

It is correct if you write for example in IA-32 machine codes. Go has a formal memory model and in it 32-bit memory accesses are not atomic, no implicit ordering, no automatic visibility, and on top of that every race is a bug.

Reply all

Reply to author

Forward

0 new messages