waitable Vyukov SPSC using futex?

174 views
Skip to first unread message

Keith H Duggar

unread,
Aug 8, 2011, 1:58:02 PM8/8/11
to Scalable Synchronization Algorithms
Consider Dmitry's lockfree nearly waitfree SPSC queue here:

http://www.1024cores.net/home/lock-free-algorithms/queues/unbounded-spsc-queue

Can that implementation be safely and efficiently made waitable
in Linux by adding the following few simple lines of futex code?

#include <unistd.h>
#include <linux/futex.h>
#include <sys/syscall.h>

inline void
futex_wait (
void * p
) {
syscall(SYS_futex,p,FUTEX_WAIT_PRIVATE,0,0,0,0) ;
}

inline void
futex_wake (
void * p
) {
syscall(SYS_futex,p,FUTEX_WAKE_PRIVATE,1,0,0,0) ;
}

class spsc_queue
{
...
public:
void wait ( ) { futex_wait(&tail_->next_) ; }
void wake ( ) { futex_wake(&tail_->next_) ; }
...
} ;

reusing tail_->next_ as the actual futex ;-) With usage as follows

// the single consumer
for(;;) {
SomeData data ;
while(!queue.deque(data)) {
queue.wait() ;
}
process(data) ;
}

// the single producer
queue.enqueue(data) ;
queue.wake() ;

Or is that simple "cheat" (reusing tail_->next_ as the futex)
too good to be true and/or broken?

Also, given that tail_ acts as a stub node (ie alloc_node
never returns tail_ so tail_->value_ is not over-written),
is it safe to have dequeue return &tail_->value_ as follows
for consumer reading instead of returning value_ by value:

T * dequeue ( )
{
node * next = load_consume(&tail_->next_) ;
if ( next ) {
store_release(&tail_,next) ;
return &next->value_ ;
} else {
return 0 ;
}
}

Your guidance would be greatly appreciated!

KHD

Nicola Bonelli

unread,
Aug 8, 2011, 3:22:15 PM8/8/11
to lock...@googlegroups.com
Hello Keith,

such a SPSC queue does not need mutex or futex at all. It's a matter
of taste the policy you use to handle the consumer dequeue() when the
queue is empty.

You are using the linux syscall futex as signaling mechanism, but it
is also ok to use a std::condition_variable or a more aggressive
policy like std::this_thread::yield().

The important point to understand is that such an event does not occur
at any dequeue, but only when the queue is empty, which should happen
with a low frequency (unless the producer is very slow, and in that
case it's much better to reschedule the consumer that polling the
empty queue).

If the consumer has a reserved core, by means of affinity, probably
it's better to yield and poll again, but yet it depends much on the
application.

Ciao,
Nicola

--
The difference between theory and practice is bigger in practice than in theory.

Keith H Duggar

unread,
Aug 8, 2011, 5:17:06 PM8/8/11
to Scalable Synchronization Algorithms
On Aug 8, 3:22 pm, Nicola Bonelli <nicola.bone...@gmail.com> wrote:
> You are using the linux syscall futex as signaling mechanism, but  it
> is also ok to use a std::condition_variable or a more aggressive
> policy like std::this_thread::yield().
>
> The important point to understand is that such an event does not occur
> at any dequeue, but only when the queue is empty, which should happen
> with a low frequency (unless the producer is very slow, and in that
> case it's much better to reschedule the consumer that polling the
> empty queue).

Hi Nicola,

Yes I had in mind slow producers and was attempting to provide an
optional blocking call via the wait() function. It just seemed to
me that the structure of Dimitry's queue is ready made for this
extremely simple (and I hope very efficient) futex solution.

The key points being that 1) it inherently maintains a convenient
and suitable futex variable, tail_->next_ 2) that variable is zero
when the queue is empty and non-zero when not empty.

So it seemed to me that we could do away with the usual event count,
or condvar, or etc and directly use the existing structure of the
queue itself effectively as a condvar. But I'm worried I'm missing
some obvious flaw with this "cheat".

Thank you!

KHD

Samy Bahra

unread,
Aug 8, 2011, 6:22:12 PM8/8/11
to lock...@googlegroups.com, Scalable Synchronization Algorithms
Note that this is in fact the M&S two-lock queue without the locks, you may want to look at their paper.

Sent from a phone

Nicola Bonelli

unread,
Aug 8, 2011, 6:51:50 PM8/8/11
to lock...@googlegroups.com
Keith, I have never used futex before.

I can tell you that for this queue the address passed to the futex
(&tail->next_) is an invariant with respect to the enqueue() member
function.

For this reason it seems that your considerations are good, even if
other pitfalls may be waiting in ambush :-)

For instance what happens if the producer pushes an element and wakes
the consumer in the middle between it finds the queue empty and waits?
Does the FUTEX_WAIT exits immediately if the initial condition is not
met ?

Ciao,
Nicola

--

Keith H Duggar

unread,
Aug 8, 2011, 7:26:20 PM8/8/11
to Scalable Synchronization Algorithms
On Aug 8, 6:51 pm, Nicola Bonelli <nicola.bone...@gmail.com> wrote:
> Keith, I have never used futex before.
>
> I can tell you that for this queue the address passed to the futex
> (&tail->next_) is an invariant with respect to the enqueue() member
> function.
>
> For this reason it seems that your considerations are good, even if
> other pitfalls may be waiting in ambush :-)
>
> For instance what happens if the producer pushes an element and wakes
> the consumer in the middle between it finds the queue empty and waits?
> Does the FUTEX_WAIT exits immediately if the initial condition is not
> met ?

Thanks, I should have have given more detail regarding the futex
call. Let me put the code here with comments for the important
syscall arguments:

inline void
futex_wait (
void * p
) {
syscall(SYS_futex
, p // address of the "futex" variable
, FUTEX_WAIT_PRIVATE // which futex sub-function to take
, 0 // cmp : compare value
, 0,0,0 ) ; // ignore these last 3 arguments
}

inline void
futex_wake (
void * p
) {
syscall(SYS_futex
, p // address of the "futex" variable
, FUTEX_WAKE_PRIVATE // which futex sub-function to take
, 1 // num : number of threads to wakeup
,0,0,0) ; // ignore these last 3 arguments
}

First, in Linux a "futex" variable is just a 32-bit integer.
On my particular Linux platform, pointers to basic types are
32-bit so a pointer variable can serve as a futex. (Note that
the system call simply takes a void * argument.)

The behavior for wait is that it compares the cmp argument to
the futex value (value at *p) and if *p == cmp the call blocks
until a subsequent futex_wake is called on that same futex.
If *p != cmp then futex_wait immediately returns. The above
is done atomically.

The behavior for wake is that it wakes up to num threads that
are waiting on the futex variable pointed to by p. I set num
to 1 in this case because this is single consumer.

If you are interested in more detail and better explanations
here is a nice summary:

http://locklessinc.com/articles/futex_cheat_sheet/

and an example of using futexes to build threading primitives
(in Linux the pthread primitives are built with futexes):

http://locklessinc.com/articles/mutex_cv_futex/

and here is one of the more detailed papers:

http://www.akkadia.org/drepper/futex.pdf

Thanks!

KHD
Reply all
Reply to author
Forward
0 new messages