Hi Dmitriy and everyone !
I'd like to contribute with my 2 cents to the topic of unbounded lock-
free SPSC queues.
I wrote an unbounded SPSC queue (called uSPSC) which is used as base
run-time mechanism of the FastFlow parallel programming framework
(
http://sourceforge.net/projects/mc-fastflow ).
The C++ code of the queue can be found inside the FastFlow tarball in
the file ubuffer.hpp
(if necessary I can post a simplified pseudocode).
A full description of the queue implementation (although a bit
didactic), as well as the
correctness proof can be found in a technical report at the following
link :
http://arxiv.org/abs/1012.1824
The basic idea is to properly use a pool of bounded SPSC queues (each
one implemented
following a revised version of the well-known Lamport's circular
buffer algorithm),
which also acts as a cache-repository of the bounded queues for the
producer and
the consumer in order to minimize dynamic memory allocation/
deallocation.
From my tests and from several apps implemented using FastFlow, I
obtained quite good
performance figures.
The implementation is wait-free and does not require any CAS and
memory barrier for TSO
memory model (actually just one write-memory-barrier for more relaxed
memory model).
I would greatly appreciate your comments and opinion regarding the
algorithm and the
implementation.
Best regards,
Massimo Torquati