Hi,
I have some (beginner) questions about lockless algorithms:
1. I have seen several times that class data starts with initial padding the size of a cache line. Here, for example
I am guessing it's done to protect against false sharing. Is it done in order to guard class data against memory accesses outside of the scope of the algorithm?
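To make the question concrete, this is the kind of layout I mean. It is only a minimal sketch of my understanding: the name `PaddedQueueState`, the `int` fields, and the 64-byte cache line size are my own assumptions, not taken from the code I was reading.

```cpp
#include <atomic>
#include <cstddef>

// Assumption: 64-byte cache lines (common on x86-64, but not universal).
constexpr std::size_t kCacheLine = 64;

// Hypothetical class, for illustration only.
struct PaddedQueueState {
    // Initial padding: anything that happens to sit in memory just before
    // this object is at least one cache line away from `head`.
    char pad0[kCacheLine];

    std::atomic<int> head;  // imagined as written by producers
    char pad1[kCacheLine - sizeof(std::atomic<int>)];

    std::atomic<int> tail;  // imagined as written by the consumer
    char pad2[kCacheLine - sizeof(std::atomic<int>)];
};

// `head` starts at least one cache line into the object.
static_assert(offsetof(PaddedQueueState, head) >= kCacheLine,
              "head is not preceded by a full cache line of padding");

// `head` and `tail` are a full cache line apart, so they cannot share a line.
static_assert(offsetof(PaddedQueueState, tail) -
                  offsetof(PaddedQueueState, head) >= kCacheLine,
              "head and tail could share a cache line");
```

If I read it correctly, `pad1` handles false sharing between fields of the class itself, while `pad0` can only matter for data outside the class, which is what my question is about.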
Dmitry specifically mentions XCHG as the slowest operation on the producer side, and then says that the most problematic place is right after "mpscq_node_t* prev = XCHG(&self->head, n);".
Is XCHG equivalent to std::atomic_exchange? Is it considered a wait operation?
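This is how I currently read that line, written as a small sketch. It assumes that XCHG means "unconditionally store the new pointer and return the old one"; the `Node` type and the `xchg` wrapper are hypothetical names of mine, and the default (sequentially consistent) ordering of `std::atomic_exchange` may be stronger than what the original needs.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical node type, for illustration only.
struct Node {
    std::atomic<Node*> next{nullptr};
};

// Assumed meaning of XCHG: atomically replace *target with desired
// and return the value that was there before.
inline Node* xchg(std::atomic<Node*>* target, Node* desired) {
    assert(target != nullptr);
    return std::atomic_exchange(target, desired);
}
```

With this reading, `Node* prev = xchg(&head, n);` always succeeds in one step and never retries, unlike a compare-and-swap loop.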
When you say that the consumer is blocked if a producer is blocked at (*), do you mean that if the producer thread is preempted right after the exchange, it will block the consumer?
How can the consumer be blocked if it does not have any loops?
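To check my understanding of the window at (*), I wrote a simplified sketch that splits the push into its two steps. This is not Dmitry's code: `Queue`, `push_exchange`, `push_link` and the simplified `pop` are my own hypothetical names, and the memory orderings are my guess.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical node type, for illustration only.
struct Node {
    std::atomic<Node*> next{nullptr};
    int value{0};
};

struct Queue {
    Node stub;                       // dummy node, so the list is never empty
    std::atomic<Node*> head{&stub};  // producers exchange on this
    Node* tail{&stub};               // touched by the single consumer only

    // Step 1 of push: make n the new head and return the previous head.
    Node* push_exchange(Node* n) {
        n->next.store(nullptr, std::memory_order_relaxed);
        return head.exchange(n, std::memory_order_acq_rel);
    }

    // Step 2 of push: link the previous head to n.
    // Between step 1 and step 2 (the (*) window), n is not reachable from tail.
    static void push_link(Node* prev, Node* n) {
        assert(prev != nullptr);
        prev->next.store(n, std::memory_order_release);
    }

    void push(Node* n) { push_link(push_exchange(n), n); }

    // Simplified pop with no loop: reports "nothing available" when the
    // next link is not visible yet, even if a push is half done.
    bool pop(int& out) {
        Node* next = tail->next.load(std::memory_order_acquire);
        if (next == nullptr) return false;
        tail = next;
        out = next->value;
        return true;
    }
};
```

If I call `push_exchange`, then `pop`, then `push_link`, the `pop` in the middle returns false although a node has already been exchanged in. So my guess is that "blocked" means the consumer cannot make progress past that point (it keeps seeing an empty queue), not that it spins inside `pop`. Is that right?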
Thank you!