What to use for a MPMC queue?

270 views
Skip to first unread message

Dan Eloff

unread,
Mar 29, 2009, 8:24:41 PM3/29/09
to lock...@googlegroups.com
Hi,

I found Dmitriy posted what looks to be almost too good to be true for
an MPMC queue:

http://groups.google.com/group/comp.programming.threads/msg/ed26d3a89a35761c

In that it seems efficient, and the code and algorithm is relatively
easy to understand (this is very important for me!)

Would you folks recommend this MPMC queue, or is there another
implementation you prefer?

I'm considering it for maintaining a freelist for a hotpath cache (the
cache being 100K to 1M buffers in size.) Multiple threads navigate the
cache (an array) finding buffers that can be reused, write them to
disk, and stick them on the freelist. Multiple threads grab a buffer
off the freelist, or failing that, join the producer threads in
hunting through the cache for a candidate buffer. I'm still trying to
figure out how to keep all those producers from stepping on each
others toes as they do that, but it seems a MPMC queue is the ideal ds
for the freelist.

Many thanks,
-Dan

Chris Cochran

unread,
Mar 30, 2009, 12:11:19 PM3/30/09
to Scalable Synchronization Algorithms


On Mar 29, 5:24 pm, Dan Eloff <dan.el...@gmail.com> wrote:
> Hi,
>
> I found Dmitriy posted what looks to be almost too good to be true for
> an MPMC queue:
>
> http://groups.google.com/group/comp.programming.threads/msg/ed26d3a89...
>
> In that it seems efficient, and the code and algorithm is relatively
> easy to understand (this is very important for me!)
>
> Would you folks recommend this MPMC queue, or is there another
> implementation you prefer?
>
> I'm considering it for maintaining a freelist for a hotpath cache (the
> cache being 100K to 1M buffers in size.) Multiple threads navigate the
> cache (an array) finding buffers that can be reused, write them to
> disk, and stick them on the freelist. Multiple threads grab a buffer
> off the freelist, or failing that, join the producer threads in
> hunting through the cache for a candidate buffer. I'm still trying to
> figure out how to keep all those producers from stepping on each
> others toes as they do that, but it seems a MPMC queue is the ideal ds
> for the freelist.
>
> Many thanks,
> -Dan

Dan,

Have you considered using per-thread freelists, with a work-stealing
process for threads that finish early? That could reduce your per-
transaction synchro costs to per-batch costs. Also, if array
navigation can start anywhere, successive starting points based on the
Golden Mean can be advantagous for MPs.

Chris

Dan Eloff

unread,
Mar 30, 2009, 4:42:55 PM3/30/09
to lock...@googlegroups.com
On Mon, Mar 30, 2009 at 11:11 AM, Chris Cochran <ch...@megabasic.com> wrote:
>
> Have you considered using per-thread freelists, with a work-stealing
> process for threads that finish early?  That could reduce your per-
> transaction synchro costs to per-batch costs.

You make a good point, that's probably the optimal architecture. (I
think I read that you made a memory allocator based on that concept?)
I'm not sure at all how to do this though, is there something you
could link me to?

> if array navigation can start anywhere, successive starting points based on the
> Golden Mean can be advantagous for MPs.
>

That's the conclusion I came to. I plan to have X staggered starting
positions in the array (X indices), and each time I increment the
index, increment it by X. Since array objects are equal to the size of
the cache line, there won't be false sharing, and producers won't get
in each others way. Just divide the threads as evenly as possible over
the X positions and let them do their thing. (Ideally X == number of
threads) Unfortunately if X != number of threads then atomic
(interlocked) increment/decrement is required. Additionally if X %
number of threads != 0 then indices will be checked very unfairly. One
idea would be to scan to the end of the array, and then go scan some
other threads indices until the entire array has been scanned, and
then only restart at the beginning of the array at the staggered
starting positions (again requires atomic inc/dec of the indices.)

Thanks,
-Dan

Reply all
Reply to author
Forward
0 new messages