[RFC]specify the memory order guarantee provided by atomic Load/Store


Fannie Zhang (Arm Technology China)

Jul 14, 2019, 10:31:13 PM
to golang-dev, Zheng Xu (Arm Technology China)

Hello golang-dev,

 

The issue (https://github.com/golang/go/issues/5045) on how to define the memory model for atomics was opened six years ago and still has no clear resolution. I also commented on the issue but did not get an answer, so we pushed a patch (https://go-review.googlesource.com/c/go/+/185737) for discussion and would like some advice. Ian Lance Taylor suggested that we first decide, on the issue or on the mailing list, what docs we want, so I am copying the commit message below for further discussion. Thank you.

“Current Go documentation does not specify any memory order guarantees
made by the atomic operations, but implementations of sync/atomic are
concerned with the ordering. So it is better to state the behavior of
the atomic operations with respect to the memory model to avoid wrong
assumptions being made by Go developers.

The compiler emits different instructions for atomic.Load and atomic.Store
on different architectures, as summarized below.

architecture    atomic.Load         atomic.Store
AMD64           load                XCHG
S390x           load                store+SYNC
MIPS64          SYNC+load+SYNC      SYNC+store+SYNC
PPC64           SYNC+load+ISYNC     SYNC+store
ARM64           LDAR                STLR

 

The atomic.Store is generated as XCHG on AMD64 and as "store+SYNC" on S390x
to prevent store-load reordering, and the atomic.Load is implemented as a
normal load because the other sequences of memory operations (load-load,
store-store and load-store) are already guaranteed by hardware not to be
reordered.

The behavior of atomic operations is impacted not only by the instructions
generated for each CPU, but also by the behavior of the compiler. The
compiler behavior of atomic.Load is not sequentially consistent, because
if there is no dependency between normal load IR and later atomic load IR,
the order is not guaranteed to be preserved during compilation.

Given the combined behavior of the compiler and CPU, atomic.Load and
atomic.Store are not sequentially consistent. The CPU behavior on arm64 is
"Release" semantics for atomic.Store and "Acquire" semantics for atomic.Load.
Considering the current implementation, it might be reasonable to just refer
to the most relaxed memory ordering behavior, so this CL specifies that
atomic.Load matches acquire semantics and atomic.Store matches release
semantics (like C++ sequentially consistent acquire and release memory order).”

 

Best regards

Fannie

Keith Randall

Jul 15, 2019, 10:06:33 AM
to Fannie Zhang (Arm Technology China), golang-dev, Zheng Xu (Arm Technology China)
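> The compiler behavior of atomic.Load is not sequentially consistent, because
> if there is no dependency between normal load IR and later atomic load IR,
> the order is not guaranteed to be preserved during compilation.

This part isn't right. The compiler (gc, at least; I'm not sure about gccgo) will not reorder two loads if at least one of them is atomic.
Atomic loads generate a new store value. Any loads on the previous store value must be scheduled before the atomic load,
and any loads on the subsequent store value must be scheduled after. So although there is no explicit edge in the SSA graph, ordering is respected.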
 

 


--
You received this message because you are subscribed to the Google Groups "golang-dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to golang-dev+...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/golang-dev/DB7PR08MB386843535DE7183D99F5212E94CF0%40DB7PR08MB3868.eurprd08.prod.outlook.com.
For more options, visit https://groups.google.com/d/optout.

Ian Lance Taylor

Jul 15, 2019, 5:34:13 PM
to Fannie Zhang (Arm Technology China), golang-dev, Zheng Xu (Arm Technology China)
On Sun, Jul 14, 2019 at 7:31 PM Fannie Zhang (Arm Technology China)
<Fannie...@arm.com> wrote:
>
> The behavior of the compiler and CPU combined, atomic.Load and atomic.Store
> are not sequentially consistent. And the CPU behavior on arm64 is "Release"
> semantics for atomic.Store and "Acquire" semantics for atomic.Load. Considering
> the current implementation, it might be reasonable that we just refer to the
> most relaxed memory ordering behavior, so this CL specifies that atomic.Load
> matches acquire semantics and atomic.Store matches release semantics (like
> c++ sequentially consistent acquire and release memory order).”

But this raises the question: why does the runtime/internal/atomic
package have both Load and LoadAcq, and both Store and StoreRel?

Ian

Russ Cox

Jul 15, 2019, 9:12:01 PM
to Fannie Zhang (Arm Technology China), golang-dev, Zheng Xu (Arm Technology China)
Although there's been no official resolution to the issue, I think the actual path forward is what I posted a while back: "Go's atomics guarantee sequential consistency among the atomic variables (behave like C/C++'s seqconst atomics), and that you shouldn't mix atomic and non-atomic accesses for a given memory word."

If you'd like to move things forward by writing that up, great. 

I think anything weaker is a serious mistake, especially since we have been telling people the above for a few years now.

Best,
Russ

Lynn Boger

Jul 16, 2019, 10:44:55 AM
to golang-dev
In late 2016 there was a phone meeting with @rsc and a few others from IBM regarding the Go memory model and its implementation on Power.
We were trying to understand what cases could use more relaxed loads and stores to improve performance. At the time it was understood that user
level atomics would have to maintain sequential consistency but for those cases in the runtime where it was commented that load-acquire or store-release
could be used, using lighter weight loads and stores should be acceptable.

I see that Russ made a comment to this effect in this issue https://github.com/golang/go/issues/16476. This issue discusses the situation where
performance could be helped. It is hard to measure this in a simple benchmark.

Carlos also discussed the change with Austin Clements and David Chase at GopherCon last year. As a result he did this CL:

fannie zhang

Jul 17, 2019, 2:32:10 AM
to golang-dev


On Monday, July 15, 2019 at 10:06:33 PM UTC+8, Keith Randall wrote:
Thank you for your correction. I double-checked the behavior of the compiler. There are implicit dependencies between a normal load and a later atomic load.
 

 


Fannie Zhang (Arm Technology China)

Jul 17, 2019, 2:42:15 AM
to Ian Lance Taylor, golang-dev, Zheng Xu (Arm Technology China)
Hi Ian Lance Taylor,

This is really confusing. There is a comment, https://github.com/golang/go/issues/32428#issuecomment-498719675, which says that “We explicitly use atomic.LoadAcq/atomic.StoreRel in the runtime where we've considered it worth the engineering and maintenance effort to reason about using weaker atomics.”
Below are the Load/LoadAcq/Store/StoreRel implementations on different architectures.

architecture  Load             Store            LoadAcq                    StoreRel
AMD64         load             XCHG             =Load()                    =Store()
S390X         load             store+SYNC       load                       store
MIPS64        SYNC+load+SYNC   SYNC+store+SYNC  =Load()                    =Store()
PPC64         SYNC+load+ISYNC  SYNC+store       load+ISYNC (load-acquire)  LWSYNC+store (store-release)
ARM64         LDAR             STLR             =Load()                    =Store()

On S390X and PPC64, LoadAcq and StoreRel are implemented differently from Load and Store; on the other architectures they are the same.

Thank you
Fannie

Fannie Zhang (Arm Technology China)

Jul 17, 2019, 2:45:08 AM
to Russ Cox, golang-dev, Zheng Xu (Arm Technology China)

Hi Russ,

 

Thanks for your explanation! Now I know the implementation on ARM64 is correct, though its behavior is slightly different from x86. 😊 I’d like to put your words in https://golang.org/pkg/sync/atomic/. Is that OK? Thank you.

 

Best regards

Fannie

 


fannie zhang

Jul 17, 2019, 3:17:30 AM
to golang-dev
Thank you for the explanation and providing the useful information.