Hi Dmitry,
I have a question regarding the MPMC queue on your site
http://www.1024cores.net/home/lock-free-algorithms/queues/bounded-mpmc-queueOn the enqueue() method, aren't you worried that the store on 'cell->data_' gets re-ordered and becomes visible before the 'enqueue_pos_.compare_exchange_weak()' ?
This would only be a problem if there is a speculative load on 'pos', and if this were to happen, then a dequeue() on the previous round of sequences could end up getting the new (wrong) data.
Here's a more detailed example scenario where the buffer_size is 4:
cell.sequence_ = [4 1 2 3 ]
cell.data_ = [d4 d1 d2 d3]
enqueue_pos_ = 5
dequeue_pos_ = 2Suppose there are three threads, T1, T2 and T3, where T1 and T2 are doing enqueue() operations and T3 is calling dequeue().
Consider the following sequence of events:
1. T1 calls enqueue(), does a speculative load on 'pos' and it is speculated to be 6 and (due to re-ordering) sets cell->data_ in index 2 to d6, and goes to sleep:
cell.sequence_ = [4 1 2 3 ]
cell.data_ = [d4 d1 d6 d3]
enqueue_pos_ = 5
dequeue_pos_ = 2 2. T3 calls dequeue(), increments dequeue_pos_ to 3 and returns the data at index 2, which is (incorrectly) d6:
cell.sequence_ = [4 1 2 3 ]
cell.data_ = [d4 d1 d6 d3]
enqueue_pos_ = 5
dequeue_pos_ = 33. T2 calls enqueue(), increments enqueue_pos_, sets the data and sequence in index 2:
cell.sequence_ = [4 5 2 3 ]
cell.data_ = [d4 d5 d6 d3]
enqueue_pos_ = 6
dequeue_pos_ = 34. T1 wakes up, continues its call to enqueue(), the speculative load of 'pos' succeeds now that enqueue_pos_ is 6 and it gets incremented to 7, and the sequence at index 2 is set to 6:
cell.sequence_ = [4 5 6 3 ]
cell.data_ = [d4 d5 d6 d3]
enqueue_pos_ = 7
dequeue_pos_ = 3A somewhat similar issue may occur on the dequeue() method, where the relaxed load on 'pos' becomes a speculative load, and the load on 'cell->data_' gets re-ordered above the
'dequeue_pos_.compare_exchange_weak()'. This would cause an old value of 'cell->data_' to be read, which would be incorrect.
There is a very interesting paper on these kind of issues:
http://static.googleusercontent.com/media/research.google.com/en//pubs/archive/42967.pdfI might have completely missed something obvious, so feel free to prove me wrong :)
Anyways, I think the idea for this queue is good, and even if my analysis is accurate and the current implementation is incorrect, it's just a matter of adding the right barriers at the right places and it will become correct.
Thanks,
Pedro