
memory_order_acquire/release in terms of LoadStore/LoadLoad/...


frege

Jan 25, 2010, 11:41:14 AM
For C++0x (and for 'acquire' and 'release' terminology in general),
which is correct?:

A)
acquire == LoadLoad + LoadStore
release == StoreStore + LoadStore

or

B)
acquire == LoadLoad
release == StoreStore

I.e., I would say that a general lock needs interpretation A
(acquire on lock, release on unlock),
but a producer/consumer queue (or Singleton init vs. access)
needs only B (release on produce, acquire on consume).

If we use A for C++0x, how do I access the weaker B when that's all I
need?

Is this a c.s.c++ question?

Tony
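
As an illustration of the producer/consumer case described in the post above, here is a minimal sketch using the C++11 spelling of the C++0x atomics. The names `payload`, `ready`, `produce`, `consume`, and `run_handoff` are made up for this example and do not come from the thread.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical names: `payload` is ordinary data, `ready` publishes it.
int payload = 0;
std::atomic<bool> ready{false};

void produce(int value) {
    payload = value;                               // plain store
    // Release: earlier reads and writes may not be reordered after this store.
    ready.store(true, std::memory_order_release);
}

int consume() {
    // Acquire: later reads and writes may not be reordered before this load.
    while (!ready.load(std::memory_order_acquire)) {
        std::this_thread::yield();
    }
    return payload;                                // sees the producer's value
}

// Runs one producer and one consumer; returns what the consumer saw.
int run_handoff(int value) {
    payload = 0;
    ready.store(false, std::memory_order_relaxed);
    int seen = 0;
    std::thread consumer([&] { seen = consume(); });
    std::thread producer([&] { produce(value); });
    producer.join();
    consumer.join();
    assert(ready.load(std::memory_order_relaxed)); // the flag was published
    return seen;
}
```

Calling `run_handoff(42)` from `main` should return 42. In this pattern the producer only writes `payload` and the consumer only reads it, which is why the question of whether the LoadStore part is needed comes up at all.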

Chris Friesen

Jan 25, 2010, 1:52:04 PM
On 01/25/2010 10:41 AM, frege wrote:
> For C++0x (and for 'acquire' and 'release' terminology in general),
> which is correct?:
>
> A)
> acquire == LoadLoad + LoadStore
> release == StoreStore + LoadStore
>
> or
>
> B)
> acquire == LoadLoad
> release == StoreStore
>
> ie I would say that for a general lock, we need interpretation A
> (acquire on lock, release on unlock)
> but for a producer/consumer queue (or Singleton init vs access) we
> only need B (release on produce, acquire on consume).

I started doing some digging out of curiosity and stumbled on something
interesting. The Linux kernel (which runs on many different
architectures) basically uses the equivalent of:

read (acquire): LoadLoad
write (release): StoreStore
full barrier: LoadLoad|LoadStore|StoreStore|StoreLoad

With the above definitions, a general lock construct implies a read
barrier while taking the lock and a write barrier while releasing the lock.


On the other hand, sparc on glibc uses:

read: LoadLoad | LoadStore
write: StoreLoad | StoreStore
full barrier: LoadLoad|LoadStore|StoreStore|StoreLoad

The difference is interesting...I wonder why the kernel gets away
without LoadStore and StoreLoad.


Chris
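
To make the two sets of definitions in the post above easier to compare, here is a small bookkeeping sketch that models each ordering constraint as a bit flag. The enum, the constant names, and the `covers` helper are hypothetical; they only restate the definitions as quoted in the post and do not call any real barrier instruction.

```cpp
#include <cstdint>

// Hypothetical bookkeeping type: one bit per SPARC-style ordering constraint.
enum Barrier : std::uint8_t {
    LoadLoad   = 1u << 0,
    LoadStore  = 1u << 1,
    StoreStore = 1u << 2,
    StoreLoad  = 1u << 3,
};

// Definitions as quoted in the post (not checked against the actual sources).
constexpr std::uint8_t kernel_read       = LoadLoad;
constexpr std::uint8_t kernel_write      = StoreStore;
constexpr std::uint8_t glibc_sparc_read  = LoadLoad | LoadStore;
constexpr std::uint8_t glibc_sparc_write = StoreLoad | StoreStore;
constexpr std::uint8_t full_barrier =
    LoadLoad | LoadStore | StoreStore | StoreLoad;

// True when `have` provides every ordering constraint in `need`.
constexpr bool covers(std::uint8_t have, std::uint8_t need) {
    return (have & need) == need;
}

// The glibc-on-sparc read and write barriers together add up to a full barrier.
static_assert(covers(glibc_sparc_read | glibc_sparc_write, full_barrier),
              "glibc read + write cover all four constraints");
// The kernel-style read and write barriers together do not include LoadStore.
static_assert(!covers(kernel_read | kernel_write, LoadStore),
              "kernel read + write leave LoadStore uncovered");
```

Written this way, the difference in the post is just the two bits that `kernel_read | kernel_write` leaves unset.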

Chris Friesen

Jan 25, 2010, 2:07:13 PM
On 01/25/2010 12:52 PM, Chris Friesen wrote:
> The linux kernel (which runs on many different
> architectures) basically uses the equivalent of:
>
> read (acquire): LoadLoad
> write (release): StoreStore
> full barrier: LoadLoad|LoadStore|StoreStore|StoreLoad
>
> On the other hand, sparc on glibc uses:
>
> read: LoadLoad | LoadStore
> write: StoreLoad | StoreStore
> full barrier: LoadLoad|LoadStore|StoreStore|StoreLoad
>
> The difference is interesting...I wonder why the kernel gets away
> without LoadStore and StoreLoad.


That should of course be "glibc on sparc". In any case, it looks like
the arch-specific lock functions add in the missing LoadStore/StoreLoad
barriers.

Chris

frege

Jan 25, 2010, 2:21:40 PM
On Jan 25, 1:52 pm, Chris Friesen <cbf...@mail.usask.ca> wrote:
>
> I started doing some digging out of curiosity and stumbled on something
> interesting. The linux kernel (which runs on many different
> architectures) basically uses the equivalent of:
>
> read (acquire): LoadLoad
> write (release): StoreStore
> full barrier: LoadLoad|LoadStore|StoreStore|StoreLoad
>
> With the above definitions, a general lock construct implies a read
> barrier while taking the lock and a write barrier while releasing the lock.
>
> On the other hand, sparc on glibc uses:
>
> read: LoadLoad | LoadStore
> write: StoreLoad | StoreStore
> full barrier: LoadLoad|LoadStore|StoreStore|StoreLoad
>
> The difference is interesting...I wonder why the kernel gets away
> without LoadStore and StoreLoad.
>
> Chris

If I'm reading some of the same docs (i.e. Documentation/memory-barriers.txt),
I notice that they also, separately, use the terms 'lock' and 'unlock'
for a different set of barriers. Without rechecking the docs, I think
those barriers had the stronger semantics.

So I'm wondering if it is, to some extent, just conflicting use of
terminology:

A: lock = LoadLoad + LoadStore, acquire = LoadLoad
vs
B: acquire = lock = LoadLoad + LoadStore

and similar for unlock/release.

Regardless of what is "right", I'm tempted to stop using
"acquire/release" at all, since I've seen enough ambiguity to make the
true meaning impossible to pin down, even amongst those explaining C++0x.

I'm thinking of instead using a variation of 'A' (above):

lock/unlock - which is very clear to anyone using locks, which
hopefully is anyone doing lock-free

producer/consumer - weaker, and all that is needed for strictly
separated producer/consumer situations (i.e. where the producer only
writes and the consumer only reads), i.e. the acquire/release of 'A' above.


Or rather, I *would* switch to those terms, except that C++0x is using
'acquire/release' (for lock/unlock) AND it is using 'consume' for
data-dependency handling (ie DEC Alpha).

Tony
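
For readers unfamiliar with the 'consume' ordering mentioned at the end of the post above, here is a minimal sketch of pointer publication with `std::memory_order_consume` in C++11 syntax. The `Config` type and all function names are hypothetical. Note that compilers commonly implement consume as acquire, and later C++ standards recommend preferring acquire, so treat this as an illustration of the idea rather than a recommendation.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

struct Config { int value; };            // hypothetical payload type

std::atomic<Config*> published{nullptr}; // hypothetical publication slot

void publish(Config* c) {
    // Release: the write to c->value is visible before the pointer is.
    published.store(c, std::memory_order_release);
}

int read_through_pointer() {
    Config* c = nullptr;
    // Consume: only accesses that depend on the loaded pointer are ordered.
    while ((c = published.load(std::memory_order_consume)) == nullptr) {
        std::this_thread::yield();
    }
    assert(c != nullptr);
    return c->value;                     // data-dependent on the load above
}

// Publishes one Config from a writer thread and reads it from a reader thread.
int run_consume_demo(int value) {
    Config cfg{0};
    published.store(nullptr, std::memory_order_relaxed);
    int seen = 0;
    std::thread reader([&] { seen = read_through_pointer(); });
    std::thread writer([&] { cfg.value = value; publish(&cfg); });
    writer.join();
    reader.join();
    return seen;
}
```

The read of `c->value` is ordered only because its address is computed from the loaded pointer; an unrelated variable read after the consume load would get no such guarantee.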

Alexander Terekhov

Jan 25, 2010, 2:49:23 PM

frege wrote:
>
> For C++0x (and for 'acquire' and 'release' terminology in general),
> which is correct?:
>
> A)
> acquire == LoadLoad + LoadStore
> release == StoreStore + LoadStore
>
> or
>
> B)
> acquire == LoadLoad
> release == StoreStore
>
> ie I would say that for a general lock, we need interpretation A
> (acquire on lock, release on unlock)
> but for a producer/consumer queue (or Singleton init vs access) we
> only need B (release on produce, acquire on consume).

memory_order_acquire/release is 'A' in C++0x.

>
> If we use A for C++0x, how do I access the weaker B when that's all I
> need?

No way in C++0x IIRC.

>
> Is this a c.s.c++ question?

This is a belated (by many years) cpp-t...@decadentplace.org.uk
question.

http://www.decadentplace.org.uk/pipermail/cpp-threads/

Ah well.

http://svn.dsource.org/projects/ares/trunk/doc/ares/std/atomic.html

;-)

regards,
alexander.
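
The closest thing to a standalone barrier in the C++11 form of these atomics is `std::atomic_thread_fence`, and its acquire and release forms also have the stronger 'A' meaning. The sketch below is a minimal example under that reading; `data_word`, `flag`, and the function names are hypothetical, and the LoadLoad/LoadStore annotations in the comments are the informal SPARC-style description used in this thread, not wording from the standard.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

int data_word = 0;               // hypothetical payload
std::atomic<int> flag{0};        // hypothetical publication flag

void writer() {
    data_word = 7;
    // Release fence: roughly LoadStore + StoreStore for what precedes it.
    std::atomic_thread_fence(std::memory_order_release);
    flag.store(1, std::memory_order_relaxed);
}

int reader() {
    while (flag.load(std::memory_order_relaxed) == 0) {
        std::this_thread::yield();
    }
    // Acquire fence: roughly LoadLoad + LoadStore for what follows it.
    std::atomic_thread_fence(std::memory_order_acquire);
    int v = data_word;
    assert(v == 7);              // the fences make the payload visible
    return v;
}

// Runs one writer and one reader; returns what the reader saw.
int run_fence_handoff() {
    data_word = 0;
    flag.store(0, std::memory_order_relaxed);
    int seen = -1;
    std::thread r([&] { seen = reader(); });
    std::thread w(writer);
    w.join();
    r.join();
    return seen;
}
```

Here the atomic accesses themselves are relaxed and the fences carry all of the ordering, which is the nearest equivalent to the barrier-between-instructions picture used elsewhere in the thread.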

Chris Friesen

Jan 25, 2010, 2:47:44 PM
On 01/25/2010 01:21 PM, frege wrote:

> If I'm reading some of the same docs (ie Documentation/memory-
> barriers.txt) I notice that they also, separately, use the terms
> 'lock' and 'unlock' for a different set of barriers. Without looking
> at the docs, I think those barriers had the stronger semantics.

More accurately that document talks about LOCK and UNLOCK as families of
locking constructs which have implied barriers. These barriers are
one-way permeable, such that operations outside the critical section may
"seep" inside them. In particular, this implies that while an UNLOCK
followed by a LOCK is equivalent to a full barrier, a LOCK followed by
an UNLOCK is not.

> Regardless of what is "right", I'm tempted to stop using "acquire/
> release" at all since I've seen enough ambiguity to make true meaning
> impossible. Even amongst those explaining C++0x.

Talking about memory barriers at all is pretty much impossible except in
the context of a well defined memory model. In that specific context it
should be possible to agree on terminology.

In the larger context of this newsgroup (which is not language specific)
the challenges do become somewhat greater. However, I suspect you'll
find that the linux kernel terminology of read/write/full barriers will
be fairly well understood. If you're coding specifically for hardware
which gives finer-grained control over the barriers (sparc, for example)
then you may need to define your terms more carefully.

> lock/unlock - which is very clear to anyone using locks, which
> hopefully is anyone doing lock-free

Ah, but the barriers required by lock-free algorithms are not
necessarily exactly the same as those required by locks. See the
one-way permeability mentioned above.

Chris

frege

Jan 25, 2010, 3:22:41 PM

One-way permeability is just a consequence of locking an *object*
getting implemented as a barrier - which is *between* instructions:

a = x;
lock(mutex)
b = y;
unlock(mutex)
c = z;

a,c can move inside the lock because of where the barriers end up.
The above, translated to barriers is:

a = x;
load mutex.lock // if (CAS(...)) but focus on load aspect
--- lock_barrier = loadload|loadstore ---
b=y;
---- unlock = storestore + loadstore ---
mutex.lock = 0;
c = z;

Now, with the barriers in place, we can see that a=x can be reordered after
the load of mutex.lock, and c=z can be reordered before the store to
mutex.lock (the unlock). Yet a,c never cross the actual memory barriers.

So they don't permeate the underlying barriers, but they do permeate
the locks at the mutex level.

> Chris

Tony
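
The barrier placement in the post above matches how a simple spinlock is commonly written with C++11 atomics: an acquire operation to take the lock and a release store to drop it. The following is a minimal sketch, not the mutex from the post; `SpinLock` and `count_with_lock` are hypothetical names.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical minimal spinlock built from acquire/release operations.
class SpinLock {
    std::atomic<bool> locked_{false};
public:
    void lock() {
        // Acquire: accesses after lock() cannot move above it,
        // but earlier accesses may still sink below it.
        while (locked_.exchange(true, std::memory_order_acquire)) {
            std::this_thread::yield();
        }
    }
    void unlock() {
        // Release: accesses before unlock() cannot move below it,
        // but later accesses may still rise above it.
        locked_.store(false, std::memory_order_release);
    }
};

// Two threads increment a plain counter under the lock.
long count_with_lock(int iterations) {
    SpinLock mutex;
    long counter = 0;            // protected by `mutex`
    auto work = [&] {
        for (int i = 0; i < iterations; ++i) {
            mutex.lock();
            ++counter;
            mutex.unlock();
        }
    };
    std::thread a(work);
    std::thread b(work);
    a.join();
    b.join();
    assert(counter == 2L * iterations);
    return counter;
}
```

Calling `count_with_lock(10000)` should return 20000. Code before `lock()` or after `unlock()` is free to drift into the critical section, but nothing inside can leak out.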

Chris Friesen

Jan 25, 2010, 5:14:12 PM
On 01/25/2010 02:22 PM, frege wrote:

> One-way permeability is just a consequence of locking an *object*
> getting implemented as a barrier - which is *between* instructions:

I disagree. One-way permeability is simply part of the definition of a
mutex. It would be just as easy to define mutexes as impermeable, but
that could result in a performance penalty.

> a = x;
> lock(mutex)
> b = y;
> unlock(mutex)
> c = z;
>
> a,c can move inside the lock because of where the barriers end up.
> The above, translated to barriers is:
>
> a = x;
> load mutex.lock // if (CAS(...)) but focus on load aspect
> --- lock_barrier = loadload|loadstore ---
> b=y;
> ---- unlock = storestore + loadstore ---
> mutex.lock = 0;
> c = z;
>
> now with barriers in place, we can see that a=x can be reordered after
> mutex.lock, c=z can be reordered before mutex.unlock. Yet a,c never
> cross the actual memory barriers.
>
> So they don't permeate the underlying barriers, but they do permeate
> the locks at the mutex level.

You've used stricter barriers than the linux kernel does. If what you
call lock_barrier is loadload, and unlock is storestore, then the write
to "a" can happen after the write to b, and the read from "z" can happen
before the read from "y" (and in fact before the write to "a").

So the question you really need to ask is whether your barriers actually
provide loadstore/storeload guarantees, or just loadload/storestore.

Chris

Chris Friesen

Jan 25, 2010, 5:59:35 PM
On 01/25/2010 01:49 PM, Alexander Terekhov wrote:
>
> frege wrote:
>>
>> For C++0x (and for 'acquire' and 'release' terminology in general),
>> which is correct?:
>>
>> A)
>> acquire == LoadLoad + LoadStore
>> release == StoreStore + LoadStore
>>
>> or
>>
>> B)
>> acquire == LoadLoad
>> release == StoreStore

> memory_order_acquire/release is 'A' in C++0x.

Can you point to a definitive statement that acquire/release do both
imply LoadStore? I'm honestly asking here, not trying to be annoying.

It's a bit difficult given the different drafts, but for instance looking at

http://bartoszmilewski.wordpress.com/2008/12/01/c-atomics-and-memory-ordering/

there is no mention of LoadStore semantics in either acquire or release.

Chris

Alexander Terekhov

Jan 26, 2010, 4:59:45 AM

Chris Friesen wrote:
>
> On 01/25/2010 01:49 PM, Alexander Terekhov wrote:
> >
> > frege wrote:
> >>
> >> For C++0x (and for 'acquire' and 'release' terminology in general),
> >> which is correct?:
> >>
> >> A)
> >> acquire == LoadLoad + LoadStore
> >> release == StoreStore + LoadStore
> >>
> >> or
> >>
> >> B)
> >> acquire == LoadLoad
> >> release == StoreStore
>
> > memory_order_acquire/release is 'A' in C++0x.
>
> Can you point to a definitive statement that acquire/release do both
> imply LoadStore? I'm honestly asking here, not trying to be annoying.

http://www.decadentplace.org.uk/pipermail/cpp-threads/2008-December/001949.html

>
> It's a bit difficult given the different drafts, but for instance looking at
>
> http://bartoszmilewski.wordpress.com/2008/12/01/c-atomics-and-memory-ordering/
>
> there is no mention of LoadStore semantics in either acquire or release.

Note that he says

"As I discussed before, the x86 guarantees acquire semantics for loads
and release semantics for stores"

and his "x86 memory model" article

http://bartoszmilewski.wordpress.com/2008/11/05/who-ordered-memory-fences-on-an-x86/

does mention

"Stores are not reordered with older loads"

rule.

That's LoadStore in SPARC RMO terms.

regards,
alexander.
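
One way to see the LoadStore component in code is the "load buffering" litmus test. Under the C++11 rules, if both loads are acquire and both stores are release, the two threads cannot both observe the other's store, because that would require each load to be ordered after a store that comes later in the other thread. The harness below is a sketch with hypothetical names; a clean run only fails to find a violation, it is the language rules that actually forbid the outcome.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Runs the load-buffering litmus test once. Returns true if both threads
// read 1, an outcome that acquire/release ordering forbids.
bool load_buffering_once() {
    std::atomic<int> x{0};
    std::atomic<int> y{0};
    int r1 = 0;
    int r2 = 0;
    std::thread t1([&] {
        r1 = x.load(std::memory_order_acquire);   // load ...
        y.store(1, std::memory_order_release);    // ... then store
    });
    std::thread t2([&] {
        r2 = y.load(std::memory_order_acquire);   // load ...
        x.store(1, std::memory_order_release);    // ... then store
    });
    t1.join();
    t2.join();
    return r1 == 1 && r2 == 1;
}

// Repeats the test and counts how often the forbidden outcome appeared.
int count_forbidden(int runs) {
    assert(runs > 0);
    int forbidden = 0;
    for (int i = 0; i < runs; ++i) {
        if (load_buffering_once()) {
            ++forbidden;
        }
    }
    return forbidden;
}
```

With `std::memory_order_relaxed` in place of acquire and release, the language would permit both threads to read 1, which is the store-moving-before-an-older-load reordering that LoadStore rules out.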

Chris Friesen

Jan 26, 2010, 9:25:21 AM
On 01/26/2010 03:59 AM, Alexander Terekhov wrote:

> Note that he says
>
> "As I discussed before, the x86 guarantees acquire semantics for loads
> and release semantics for stores"
>
> and his "x86 memory model" article
>
> http://bartoszmilewski.wordpress.com/2008/11/05/who-ordered-memory-fences-on-an-x86/
>
> does mention
>
> "Stores are not reordered with older loads"
>
> rule.
>
> That's LoadStore in SPARC RMO terms.

Okay, that works.

Thanks,
Chris

frege

Jan 29, 2010, 2:26:43 AM
On Jan 25, 11:41 am, frege <gottlobfr...@gmail.com> wrote:
> For C++0x (and for 'acquire' and 'release' terminology in general),
>
> acquire == LoadLoad + LoadStore
> release == StoreStore + LoadStore
>

It looks like this is the conclusion. I've posted some of my
research/references and conclusions at
http://blog.forecode.com/2010/01/29/barriers-to-understanding-memory-barriers/
I'll probably add some more references there as I find them.

Tony
