I know you were not giving a definitive treatise on how Go treats atomics across different processors...
but isn't a related aspect restricting instruction reordering by the compiler itself?
I don't know what the modern Go compiler does at this point, but I think that at least circa Go 1.5 there was a nop function that seemed to be used to keep the compiler from inlining and then reordering instructions (first snippet below). I think I've also seen you make related comments more recently (e.g., the FreeBSD atomics discussion snippet I included at the end of this post).
I haven't followed the more recent atomics-related changes (it seems that in 1.10 there might have been some work around intrinsics, such as CL 28076: "cmd/compile: intrinsify sync/atomic for amd64"?)...
And yes, on the one hand the answer is "respect the memory model and get a clean report from the race detector, etc., etc."... but sometimes the performance of the code the current compiler generates does matter, beyond mere curiosity about how the Go compiler does what it does (performance was the context in which I had looked at this more closely in the past).
Two related snippets:
====================================================
// The calls to nop are to keep these functions from being inlined.
// If they are inlined we have no guarantee that later rewrites of the
// code by optimizers will preserve the relative order of memory accesses.
//go:nosplit
func atomicload(ptr *uint32) uint32 {
	nop()
	return *ptr
}
====================================================
====================================================
> The second issue I have is translating FreeBSD atomic operations to runtime
> atomic ops.
> If I understand it correctly then atomic_load_acq_32 has weaker requirements
> compared to runtime/internal/atomic.Load.
> On x86 the FreeBSD variant is just a compiler barrier to prevent it
> reordering instructions.
The Go compiler does reorder instructions. But it doesn't reorder
instructions across a non-inlined function call. On x86 a simple
memory load suffices for atomic.Load because x86 has a fairly strict
memory order in any case. Most other processors are more lenient, and
require more work in the atomic operation.
====================================================
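For context, here is a minimal sketch of what I understand the "respect the memory model" answer to look like in user code: go through sync/atomic for the flag instead of relying on the compiler not reordering across a call. The message type, its fields, and the publish/consume/publishAndConsume names are all hypothetical, made up for illustration; nothing here is taken from the runtime.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
	"sync/atomic"
)

// message is a hypothetical payload handed from one goroutine to another.
type message struct {
	data  int    // plain field, written before ready is set
	ready uint32 // flag, accessed only through sync/atomic
}

// publish writes the payload with a plain store, then sets the flag
// with an atomic store.
func (m *message) publish(v int) {
	m.data = v
	atomic.StoreUint32(&m.ready, 1)
}

// consume waits until an atomic load observes the flag, then reads the
// payload with a plain load.
func (m *message) consume() int {
	for atomic.LoadUint32(&m.ready) == 0 {
		runtime.Gosched() // yield instead of spinning hot
	}
	return m.data
}

// publishAndConsume runs consume in a second goroutine and returns the
// value it observed.
func publishAndConsume(v int) int {
	var m message
	var wg sync.WaitGroup
	var got int

	wg.Add(1)
	go func() {
		defer wg.Done()
		got = m.consume()
	}()

	m.publish(v)
	wg.Wait() // got is safe to read after Wait returns
	return got
}

func main() {
	fmt.Println(publishAndConsume(42))
}
```

As far as I understand it, the consumer can only leave its loop after observing the atomic store, so the plain write to data is ordered before the plain read, and this should run cleanly under "go run -race". What the compiler emits for those two atomic calls on a given architecture (a plain MOV on amd64 for the load, or something heavier elsewhere) is exactly the part I'm asking about.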
--thepudds