On Fri, Feb 17, 2012 at 5:51 PM, Jonathan Amsterdam
<j...@google.com> wrote:
Thanks for explaining, Dmitry. Could you educate me further by explaining the bug in using a simple assignment? Here's why I (naively) think it's ok:
First, because there is a mutex unlock that will immediately follow, other threads will observe the change.
Atomicity is not the only property we care, proper mutual ordering of memory access is no less important:
Consider the following pseudo code of Once.Do:
mtx.lock()
// initialization is inlined for demonstrativeness
obj = new Obj
obj->data = 42
p = &obj
initialized = 1
mtx.unlock()
The compiler is free to reorder it as:
mtx.lock()
initialized = 1
// initialization is inlined for demonstrativeness
obj = new Obj
obj->data = 42
p = &obj
mtx.unlock()
Then another goroutine observes initialized == 1 and access p and p->data. Bang!
Compiler is not the only who can do such reoirderings, processors can do it as well (even w/o inlining).
Mutexes work iff each and every access to protected data is under the mutex. If a single memory access is outside the mutex (what we have in Once), then we are on the lock-free ground and need to take into account atomicity, ordering and visibility.
Second, even if writing a 32-bit value is not atomic, the atomic load on the fast path will either observe a 1, which is fine since f was called, or it won't, in which case no harm done, since done will equal one after the mutex is acquired and f will not be called a second time.
That is [almost] correct. However, even in such case, proper expression of intentions greatly improve code readability. If it is a concurrent access to a shared data do say so. Moreover, such "benign" races confuse verification tools. The data race detector for Go, Go ThreadSanitizer, that we are working on will report it as a data race (if you use plain non-atomic load/store); then if you have dozens of such seemingly false reports, you will miss real harmful reports.
Actually, I am confused about the 32-bit atomic load and store operations. Isn't reading or writing a 32-bit value atomic on any machine with 32-bit or higher data paths? In which case, is the only purpose of those instructions to make changes visible across threads? If so, then you would never need to use either in a mutex-protected section, even if there are non-mutex-protected atomic accesses to the same memory location. Where did I go wrong?
It is correct if you write for example in IA-32 machine codes. Go has a formal memory model and in it 32-bit memory accesses are not atomic, no implicit ordering, no automatic visibility, and on top of that every race is a bug.