On Tuesday 02 February 2016 21:31:23 Nemo Yu wrote:
> 1) memory_order_seq_cst is stronger than full barriers(LoadLoad,
> StoreStore, StoreLoad, LoadStore), why C++ doesn't provide a full-barrier?
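That's memory_order_acq_rel for everything except StoreLoad: acquire plus
release covers LoadLoad, LoadStore and StoreStore. If you need all four, use
std::atomic_thread_fence(std::memory_order_seq_cst).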
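To illustrate where StoreLoad matters, here is a minimal sketch of the classic
store-buffering test. It assumes two atomic<int> variables x and y that start
at 0; the helper name count_both_zero is made up for this example. With a
seq_cst fence between the store and the load in each thread, the standard
rules out both threads reading 0. With only acquire/release ordering that
outcome is allowed.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical helper (not from this thread): runs the store-buffering test
// `rounds` times and returns how many times both threads read 0.
int count_both_zero(int rounds) {
    int both_zero = 0;
    for (int i = 0; i < rounds; ++i) {
        std::atomic<int> x{0}, y{0};
        int r1 = -1, r2 = -1;

        std::thread t1([&] {
            x.store(1, std::memory_order_relaxed);
            // Full barrier: keeps the store above ahead of the load below.
            std::atomic_thread_fence(std::memory_order_seq_cst);
            r1 = y.load(std::memory_order_relaxed);
        });
        std::thread t2([&] {
            y.store(1, std::memory_order_relaxed);
            std::atomic_thread_fence(std::memory_order_seq_cst);
            r2 = x.load(std::memory_order_relaxed);
        });
        t1.join();
        t2.join();

        // Both threads ran to completion, so both results were written.
        assert(r1 != -1 && r2 != -1);
        if (r1 == 0 && r2 == 0) ++both_zero;
    }
    return both_zero;
}
```

Calling count_both_zero(2000) should return 0 on a conforming
implementation. Replacing the two fences with memory_order_acq_rel fences
removes that guarantee, even if a given CPU never shows the difference.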
> 2) I found one possible implementation to achieve a StoreLoad barrier:
>
> fetch_add(&addr, 0, memory_order_release);
>
> How does it work?
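You need to ask your CPU vendor how they implemented it, assuming the
architecture has memory orderings other than a full barrier. The C++ standard
only promises release ordering for that call; any StoreLoad effect comes from
the instruction the compiler picks (on x86, typically a LOCK-prefixed
instruction, which is a full barrier).
For example, on IA-64, the above could be implemented by a fetchadd4.rel
instruction.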
> 3) With respect to 1), when will the following case happen?
>
> - thread 1 writes: a=1; b=2;
> - thread 2 sees: a=1 then b=2;
> - thread 3 sees: b=1 then a=1;
>
> I will appreciate it if you can give a detail about specific scenario, such
> like architecture/intrinsics etc..
Please be more specific. This scenario doesn't make sense because of lack of
information. Please specify:
a) what types are a and b (I assume we're talking about atomic<int>)
b) what you meant by "write". Did you mean store-relaxed, store-release or
store-CST?
c) what the values of a and b were before all of this happened.
If I assume a = b = 0 when everything started, then the case above will never
happen because b is never assigned the value of 1. The *principle* of atomics
is that you can never see an intermediate value, so any observer of b must see
either 0 or 2, never something else.
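As a minimal sketch of that, assuming a and b are atomic<int>, both start at
0, and the writes are a relaxed store to a followed by a store-release to b
(all of these are my assumptions, as are the names Observation and observe):

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// One snapshot taken by the reader: b is loaded first, then a.
struct Observation {
    int b;
    int a;
};

// Hypothetical helper: one writer does a=1; b=2 while one reader samples
// the pair `samples` times. Returns everything the reader saw.
std::vector<Observation> observe(int samples) {
    std::atomic<int> a{0}, b{0};
    std::vector<Observation> seen;
    seen.reserve(samples);

    std::thread writer([&] {
        a.store(1, std::memory_order_relaxed);
        b.store(2, std::memory_order_release);  // publishes a=1 as well
    });
    std::thread reader([&] {
        for (int i = 0; i < samples; ++i) {
            int vb = b.load(std::memory_order_acquire);
            int va = a.load(std::memory_order_relaxed);
            seen.push_back({vb, va});
        }
    });
    writer.join();
    reader.join();

    // After both threads have finished, the final values are fixed.
    assert(a.load() == 1 && b.load() == 2);
    return seen;
}
```

Every Observation has b equal to 0 or 2, never 1. Because b is stored with
release and loaded with acquire, any snapshot with b == 2 also has a == 1.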
> 4) What's the difference in practice between Acquire/Release fences and
> Acquire/Release operations(e.g. a load with Acquire/Release)? AFAIK the
> implementation is the same with memory fences, although they are not equal
> in the standard.
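On x86, no difference for acquire and release: ordinary loads and stores to
main memory (cache-backed) already have those semantics. The LFENCE and SFENCE
instructions are not useful there. They're only needed for uncached or
write-combining memory (MMIO) and non-temporal stores, so compilers do not
need to emit them. MFENCE, or a LOCK-prefixed instruction, is only needed for
StoreLoad ordering, that is, for seq_cst (which is where GCC emits one).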
On most architectures where memory order does matter, instructions either have
an associated order by themselves or there's an extra instruction to do the
fence. Taking the example of IA-64:
* there's ldN and ldN.acq; stN and stN.rel
* there's fetchaddN.acq and fetchaddN.rel, cmpxchgN.acq and cmpxchgN.rel
* xchgN always has acquire semantics
If you want to implement any order stricter than what the instruction permits,
you insert an mf to force a full barrier.
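At the C++ level, the two styles look like this. This is a minimal sketch,
assuming a plain int payload handed over through an atomic<bool> flag; the
function names are made up for the example. The first variant attaches the
ordering to the store and the load themselves, the second uses relaxed
operations plus standalone fences.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Variant A: ordering attached to the operations (store-release, load-acquire).
int pass_with_operations() {
    int payload = 0;
    std::atomic<bool> ready{false};
    int result = -1;

    std::thread producer([&] {
        payload = 42;
        ready.store(true, std::memory_order_release);
    });
    std::thread consumer([&] {
        while (!ready.load(std::memory_order_acquire)) {
            std::this_thread::yield();
        }
        result = payload;
    });
    producer.join();
    consumer.join();
    return result;
}

// Variant B: relaxed operations with standalone release/acquire fences.
int pass_with_fences() {
    int payload = 0;
    std::atomic<bool> ready{false};
    int result = -1;

    std::thread producer([&] {
        payload = 42;
        // Orders every earlier write before every later store.
        std::atomic_thread_fence(std::memory_order_release);
        ready.store(true, std::memory_order_relaxed);
    });
    std::thread consumer([&] {
        while (!ready.load(std::memory_order_relaxed)) {
            std::this_thread::yield();
        }
        // Orders every earlier load before every later read or write.
        std::atomic_thread_fence(std::memory_order_acquire);
        result = payload;
    });
    producer.join();
    consumer.join();
    return result;
}
```

Both functions return 42. The difference in the standard is scope: a fence
orders all surrounding accesses, while an ordered operation only ties the
ordering to that one atomic access. Whether the generated code differs
depends on the architecture.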
--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
Software Architect - Intel Open Source Technology Center
PGP/GPG: 0x6EF45358; fingerprint:
E067 918B B660 DBD1 105C 966C 33F5 F005 6EF4 5358