About sequential consistency

Ramine

Nov 24, 2014, 4:08:58 PM

Hello,

I think I have correctly understood the RMO (Relaxed Memory Ordering)
model of SPARC and ARM, and the TSO (Total Store Ordering) model of
x86, but I have a question.

If you looked at my previous post about a Seqlock written in Java,
you will have noticed that it uses sfences and lfences to avoid
problems with sequential consistency on RMO (Relaxed Memory Ordering)
and TSO (Total Store Ordering). But I sincerely feel that RMO is
really dangerous. For example, what would happen to this Seqlock if
the call to Spinlock.lock() inside it did not contain an MFENCE?
That would be a serious, even fatal, bug. This is why I am somewhat
afraid of RMO: it raises the complexity, and that can easily
introduce serious and fatal bugs.

So my question is: why have we chosen RMO even though it is
dangerous?

Thank you,
Amine Moulay Ramdane.

anastasi...@gmail.com

Nov 29, 2014, 5:33:12 AM

That's an issue for the hardware folks. Relaxed memory ordering performs better because the cores in a manycore machine can avoid doing extra ordering work on memory accesses that do not need it.

Consider that if an architecture implemented a sequentially consistent memory model, you would only need to perform atomic loads and stores, with no memory barriers at all. So it is easier for you, the programmer. BUT the architecture underneath would have to enforce in hardware the equivalent of those memory fences on EVERY load and store you perform.
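
As a rough illustration in Java (the language of the Seqlock mentioned above), a `volatile` field gives you the "easy" programming model: accesses to it are totally ordered, and the JVM emits whatever fences the target CPU needs. The class, field, and method names here are made up for this sketch.

```java
class VolatileHandoff {
    static int data;               // plain field, published through 'ready'
    static volatile boolean ready; // volatile: the JVM inserts the required barriers

    // The writer publishes 'value'; the reader waits for it and reports what it saw.
    static int runOnce(int value) {
        data = 0;
        ready = false;
        int[] seen = new int[1];
        Thread reader = new Thread(() -> {
            while (!ready) {           // volatile load, re-read on every iteration
                Thread.onSpinWait();
            }
            seen[0] = data;            // ordered after the volatile load above
        });
        reader.start();
        data = value;                  // plain store
        ready = true;                  // volatile store: the store to 'data' stays before it
        try {
            reader.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println(runOnce(42)); // prints 42
    }
}
```

The programmer places no fences here; the cost is that every access to `ready` pays for the ordering, whether or not a given access needs it.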

Next, one realizes that ordering memory actions between different processors is not essential for every load and store, but only where you need it.

So the hardware guys say to you:
Well, take those cheap loads and stores you want. And if you want to order memory actions, then I give you extra memory barrier instructions (which are costly compared to other instructions), and you have to place them where they are needed.
So the hardware runs fast in the common case and CAN also run fast where synchronization is needed, provided you manage to place only the required memory fences and no more.
This certainly adds to the complexity a programmer faces when implementing a lock-free algorithm, and that's why you don't like it! :)
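
To make the trade-off concrete, here is a minimal Java sketch of "cheap loads and stores plus fences only where needed", using the static fence methods on `java.lang.invoke.VarHandle` (Java 9+). It is not the Seqlock from the earlier post; the class, field, and method names are hypothetical and chosen only for illustration.

```java
import java.lang.invoke.VarHandle;
import java.util.concurrent.atomic.AtomicInteger;

class FencedHandoff {
    static int data;                                        // plain field
    static final AtomicInteger ready = new AtomicInteger(); // used only in opaque mode (no ordering)

    // The writer publishes 'value' with one release fence; the reader uses one acquire fence.
    static int runOnce(int value) {
        data = 0;
        ready.setPlain(0);
        int[] seen = new int[1];
        Thread reader = new Thread(() -> {
            while (ready.getOpaque() == 0) { // cheap load, implies no ordering by itself
                Thread.onSpinWait();
            }
            VarHandle.acquireFence();        // loads above are not reordered with accesses below
            seen[0] = data;
        });
        reader.start();
        data = value;                        // cheap plain store
        VarHandle.releaseFence();            // accesses above are not reordered with stores below
        ready.setOpaque(1);                  // cheap store, kept after 'data' by the fence
        try {
            reader.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println(runOnce(42)); // prints 42
    }
}
```

Removing either fence makes the result unreliable on a weakly ordered machine, which is exactly the kind of bug the question worries about. On x86 (TSO) these two fences typically cost no instruction at all, while on ARM they turn into real barrier instructions.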