1024cores intrusive MPSC queue memory barriers question

381 views
Skip to first unread message

Ross Bencina

unread,
Feb 17, 2014, 3:42:52 AM2/17/14
to lock...@googlegroups.com
Hello all,

Could someone please clarify the lack of memory barriers in Dmitriy Vyukov's intrusive MPSC queue algorithm?


My question is not the same as earlier discussions that I have found about memory barriers in this algorithm (See threads with Chris Chochran  https://groups.google.com/d/msg/lock-free/i0eE2-A7eIA/ni4GUy2Kv84J and Johan Torp https://groups.google.com/d/msg/comp.programming.threads/nMx1fn_n7i0/6IZtcm1dBI0J ).


I suspect that there are at least two cases where a memory barrier might be needed. Could someone please clarify why they are not? Or are there some assumptions being made about the memory ordering model?


Q1: Why is the following memory barrier not needed?

00 void mpscq_push(mpscq_t* self, mpscq_node_t* n) 
01 { 
02    // *note: n->next==oldvalue    
03    n->next = 0;   
04    mpscq_node_t* prev = XCHG(&self->head, n);  
05    // <-- barrier here to ensure that consumer sees n->next==0, not n->next==oldvalue  
06    prev->next = n; // serialization-point wrt consumer
07 } 

"prev" and "n" are not the same node, therefore I do not see how the assignment at line 06 guarantees that the consumer will see n->next==0 and not n->next==oldvalue.

Is a strong assumption being made about ordering and visibility of writes to independent memory locations?


Q2: Aren't acquire and release barriers needed to ensure that the consumer sees a consistent node payload?

I understand (more or less) that the node linking algorithm does not need barriers to link and unlink nodes correctly (aside from Q1 above).

But I think that the following barriers are necessary to ensure that the consumer sees the latest writes from the producer, not some old stale data:

void mpscq_push(mpscq_t* self, mpscq_node_t* n) 
    // <-- release barrier needed here? or at line 05 above.
    ...(original function)...
}

mpscq_node_t* mpscq_pop(mpscq_t* self) 
    ...
    // <-- acquire barrier needed before each return of valid node?
    return n;
    ...
    // <-- acquire barrier needed before each return of valid node?
    return n;
}

Maybe the answer is the same as in my first question and an assumption being made about the order of visibility of writes to the payload and writes to the queue structure?

I'd appreciate it if you could clarify my confusion.

Thank you,

Ross.

Dmitriy V'jukov

unread,
Feb 17, 2014, 1:39:02 PM2/17/14
to lock...@googlegroups.com
Hi Ross,

The memory barriers are needed in the algorithm. I think I was just
lazy to put them and describe.
Any store that makes data available to other threads must be release,
any load that makes data available to the current thread must be
acquire.
> --
>
> ---
> You received this message because you are subscribed to the Google Groups
> "Scalable Synchronization Algorithms" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to lock-free+...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/lock-free/afce0cb3-73aa-413e-972e-cc1b23cbde0d%40googlegroups.com.
> For more options, visit https://groups.google.com/groups/opt_out.



--
Dmitry Vyukov

All about lockfree/waitfree algorithms, multicore, scalability,
parallel computing and related topics:
http://www.1024cores.net

Ross Bencina

unread,
Feb 19, 2014, 12:44:24 PM2/19/14
to lock...@googlegroups.com
Thanks very much for your answer Dmitry. It would appear that I understand.

Ross.
Reply all
Reply to author
Forward
0 new messages