Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

lock-free single-producer/multi-consumer unbounded queue...

182 views
Skip to first unread message

Chris M. Thomasson

unread,
Mar 20, 2017, 8:02:05 PM3/20/17
to
Here is an example of using a SPMC (single producer/multi consumer)
queue. It blocks using spin wait. However, this can be avoided through
the use of eventcounts, or even fast semaphores.

The test will complete, give it a minute or two, it does 10000000
iterations with 7 consumer threads.

The entire example code can be found here:

http://pastebin.com/raw/6nG6t422


Can you run this shi%?

Any thoughts, questions?

;^o

____________________________________
#include <cstdio>
#include <deque>
#include <condition_variable>
#include <mutex>
#include <memory>
#include <thread>
#include <atomic>
#include <algorithm>
#include <cassert>



#define mb_relaxed std::memory_order_relaxed
#define mb_consume std::memory_order_consume
#define mb_acquire std::memory_order_acquire
#define mb_release std::memory_order_release
#define mb_acq_rel std::memory_order_acq_rel
#define mb_seq_cst std::memory_order_seq_cst

#define mb_fence(mb) std::atomic_thread_fence(mb)


// Just a check...
static std::atomic<unsigned long> g_alloc_count(0);


// simple single producer/multi-consumer queue
struct node
{
node* m_next;
node() : m_next(nullptr) {} // tidy...
};


struct spmc_queue
{
std::atomic<node*> m_head;

spmc_queue() : m_head(nullptr) {}

// push a single node
void push(node* const n)
{
node* head = m_head.load(mb_relaxed);

for (;;)
{
n->m_next = head;
mb_fence(mb_release);

if (m_head.compare_exchange_weak(head, n, mb_relaxed))
{
break;
}
}
}

// try to flush all of our nodes
node* flush(node* const nhead = nullptr)
{
node* n = m_head.exchange(nhead, mb_relaxed);

if (n)
{
mb_fence(mb_acquire);
}

return n;
}
};

// Spin-Wait, Blocking Adapt Function
node* spmc_queue_spin_lock_flush(
spmc_queue& queue
) {
node* n = nullptr;

for (;;)
{
n = queue.flush();
if (n) break;
std::this_thread::yield();
}

return n;
}


#define CONSUMERS 7
#define N 10000000


struct user_data : public node
{
int m_foo;

user_data(int foo) : m_foo(foo) {}
};


void producer_thread(
unsigned int id,
std::mutex& std_out_mutex,
spmc_queue& queue
){
{
std::unique_lock<std::mutex> lock(std_out_mutex);
std::printf("producer(%u)::queue(%p) - Entry\n", id,
(void*)&queue);
}

for (unsigned int i = 0; i < N; ++i)
{
user_data* ud = new user_data(i + 1); // allocate memory
g_alloc_count.fetch_add(1, mb_relaxed);

queue.push(ud);

if (! ((i + 1) % 1003))
{
std::unique_lock<std::mutex> lock(std_out_mutex);
std::printf("producer(%u)::queue(%p) - produced(%u)\n", id,
(void*)&queue, i + 1);
}
}

for (unsigned int i = 0; i < CONSUMERS; ++i)
{
user_data* ud = new user_data(0); // allocate memory
g_alloc_count.fetch_add(1, mb_relaxed);

queue.push(ud);
}

{
std::unique_lock<std::mutex> lock(std_out_mutex);
std::printf("producer(%u)::queue(%p) - Exit\n", id, (void*)&queue);
}
}


void consumer_thread(
unsigned int id,
std::mutex& std_out_mutex,
spmc_queue& queue
){
{
std::unique_lock<std::mutex> lock(std_out_mutex);
std::printf("consumer(%u)::queue(%p) - Entry\n", id,
(void*)&queue);
}

{
for (unsigned long i = 0 ;; ++i)
{
// Wait for something...
user_data* ud = (user_data*)spmc_queue_spin_lock_flush(queue);
assert(ud); // make sure we have something!

int counter = 0;

while (ud)
{
node* ud_next = ud->m_next;

unsigned int foo = ud->m_foo;
delete ud; // reclaim memory
g_alloc_count.fetch_sub(1, mb_relaxed);

if (foo == 0)
{
// We have recieved a "stop" signal
counter++;
}

if (!((i + 1) % 1003))
{
std::unique_lock<std::mutex> lock(std_out_mutex);
std::printf("consumer(%u)::queue(%p) -
consumed(foo:%u)\n",
id, (void*)&queue, foo);
}

ud = (user_data*)ud_next;
}

std::this_thread::yield(); // just for spice...

while (counter > 1)
{
// Replay all of the excess stop signals
user_data* ud = new user_data(0); // allocate memory
g_alloc_count.fetch_add(1, mb_relaxed);

queue.push(ud);
--counter;

{
std::unique_lock<std::mutex> lock(std_out_mutex);
std::printf("consumer(%u)::queue(%p) - replay(%u)
*****************\n",
id, (void*)&queue, counter);
}
}

if (counter == 1)
{
// We are fin!
break;
}
}
}

{
std::unique_lock<std::mutex> lock(std_out_mutex);
std::printf("consumer(%u)::queue(%p) - Exit\n", id, (void*)&queue);
}
}



int main(void)
{
{
spmc_queue queue;

std::thread consumers[CONSUMERS];
std::mutex std_out_mutex;

for (unsigned int i = 0; i < CONSUMERS; ++i)
{
consumers[i] = std::thread(
consumer_thread,
i + 1,
std::ref(std_out_mutex),
std::ref(queue)
);
}

std::thread producer(
producer_thread,
0,
std::ref(std_out_mutex),
std::ref(queue)
);

producer.join();

for (unsigned int i = 0; i < CONSUMERS; ++i)
{
consumers[i].join();
}
}

std::printf("g_alloc_count:(%lu)\n", g_alloc_count.load(mb_relaxed));
assert(g_alloc_count.load(mb_relaxed) == 0);

std::printf("\nComplete, hit <ENTER> to exit...\n");
std::fflush(stdout);
std::getchar();

return 0;
}

____________________________________


There is it, the whole pile. ;^)

It should compile!

Chris M. Thomasson

unread,
Mar 20, 2017, 8:12:03 PM3/20/17
to
I forgot to comment here that if (nhead) is not a nullptr, the calling
code should ensure release membar semantics, Something like:

if (nhead) mb_fence(mb_release);

**** before the atomic exchange ****


Not sure if this coded precaution is actually needed right here, but the
damn comment sure as hell is!

The test does not make use of nhead != nullptr condition...


> node* n = m_head.exchange(nhead, mb_relaxed);
>
> if (n)
> {
> mb_fence(mb_acquire);
> }
>
> return n;
> }
> };
[...]


Chris M. Thomasson

unread,
Mar 21, 2017, 12:10:33 AM3/21/17
to
On 3/20/2017 5:01 PM, Chris M. Thomasson wrote:
> Here is an example of using a SPMC (single producer/multi consumer)
> queue. It blocks using spin wait. However, this can be avoided through
> the use of eventcounts, or even fast semaphores.

Funny thing is that the code is totally safe for multiple producers. The
actual test is single producer. lol. I mixed up some of my comments and
names when I was getting the code ready. Sorry.

Also, the nature of the sync is LIFO. This sync device returns all items
at once in the flush function, and they are arranged as a stack, not
queue. One point, one can reverse the singly linked list returned from
flush to get FIFO...

Again, sorry for any confusions.

Ahhh, I should have kept all of this secret and waited for somebody else
to point it out! ;^o

Fwiw, its been a little while since I have coded these sync primitives
as individual little test programs. I have an older private library of
them, still using custom asm code, but the lib is not comprised of
little example programs.

Bonita Montero

unread,
Mar 21, 2017, 3:59:04 PM3/21/17
to
When you have a producer-consumer-relation, the time to enqueue and to
dequeue an item is very short in relation to prepare and process that
item. So the likehood of a collision when accessing a queue locked by
a traditional mutex/condvar-duo is extremely low. So expecially for
producer-consumer-relationships, lockfree processing provides almost
no benefit over traditional locking.

--
http://facebook.com/bonita.montero/

Chris M. Thomasson

unread,
Mar 21, 2017, 4:20:49 PM3/21/17
to
Fwiw, have you ever compared two identical queues wrt API, where one of
them is lock-free, and the other is traditional single mutex lock?

Put them under pressure, and see what one wins wrt the contrived tests.

Track the work completed on a per worker thread basis. Run a lock-based
queue for 10 minutes under stress load, and record the amount of work
each thread has accomplished.

Then run the same test using an efficient lock-free custom and
compatible alternative, wrt API. And record its numbers.

I all of my experience, a good lock-free version allows consumers
threads to get more work completed in a stress filled environment. The
mutex version will burn up under contention, loaded with calls to the
kernel to block. Lock-free tries to avoid the kernel at all costs. Think
of an adaptive mutex, does it not try to avoid the kernel by spinning?

Keep in mind, there is a shi% load of total crap lock-free code on the
net. Total race-infested garbage.


Anyway, stressing out a sync algorithm is key, under load in unknown
server request. Imvvho, at least.

Fwiw, CAS was invented for this purpose, and is still in IBM principles
of operation to this date. CAS is the initials of the guys name that
invented it.

Spin wait aside for a moment. There is a construct called an eventcount
that works with basically any lock-free construct. Its like a condvar
for these type of "exotic" sync algorithms.

Bonita Montero

unread,
Mar 21, 2017, 4:33:06 PM3/21/17
to
Am 21.03.2017 um 21:20 schrieb Chris M. Thomasson:

> Put them under pressure, and see what one wins wrt the contrived tests.

There is no such pressure in real-wold apps because the interval
producing or consuming an item is by magnitudes longer than the
enqueuing- or dequeuing-interval.

--
http://facebook.com/bonita.montero/

Chris M. Thomasson

unread,
Mar 21, 2017, 4:39:51 PM3/21/17
to
On 3/21/2017 1:32 PM, Bonita Montero wrote:
> Am 21.03.2017 um 21:20 schrieb Chris M. Thomasson:
>
>> Put them under pressure, and see what one wins wrt the contrived tests.
>
> There is no such pressure in real-wold apps because the interval
> producing or consuming an item is by magnitudes longer than the
> enqueuing- or dequeuing-interval.
>

Why does IBM keep patenting various non-blocking techniques?

RCU?

Bonita Montero

unread,
Mar 21, 2017, 4:43:30 PM3/21/17
to
> Why does IBM keep patenting various non-blocking techniques?

They're patenting lockfree algorithms for
producer-consumer-relationships?

> RCU?

RCU is something completely different.

--
http://facebook.com/bonita.montero/

Chris M. Thomasson

unread,
Mar 21, 2017, 5:01:44 PM3/21/17
to
On 3/21/2017 1:43 PM, Bonita Montero wrote:
>> Why does IBM keep patenting various non-blocking techniques?
>
> They're patenting lockfree algorithms for
> producer-consumer-relationships?

They have the CAS instruction to create these vary things, they have
lock-free stack in principles of operation, to this date.

RCU can be easily used for a very elaborate producer/consumer
relationship. Basically, using RCU as a poor-mans targeted GC. Joe Seigh
called it proxy garbage collection where:

Reader threads are consumers of data
Writer threads mutate the data.

Yes, there are patents out there for producer/consumer relationship.

Even communications based on multiplexing a bunch of single
producer/consumer queues, with no atomic RMW ops can be helped by a RCU
like device.

Take a look at the SMR algorithm, and how its used for lock-free
exception safe multi-producer consumer queues.

Last I heard SMR was under patent application, from IBM.


>
>> RCU?
>
> RCU is something completely different.
>

Not really, RCU is flexible. It is all about maximizing lock-free
readers, this is hyper important aspect. The writer side can be mutexs,
transnational memory, lock-free, wait free, flexible.

The less intrusion the writer side has on the reader side of a RCU
algorithm, the better.

Chris M. Thomasson

unread,
Mar 21, 2017, 5:26:36 PM3/21/17
to
On 3/21/2017 2:01 PM, Chris M. Thomasson wrote:
> On 3/21/2017 1:43 PM, Bonita Montero wrote:
>>> Why does IBM keep patenting various non-blocking techniques?
>>
>> They're patenting lockfree algorithms for
>> producer-consumer-relationships?
[...]
Check this shi% out:

https://www.google.com/patents/US8543743

;^o

Chris M. Thomasson

unread,
Mar 21, 2017, 5:39:00 PM3/21/17
to
On 3/21/2017 1:43 PM, Bonita Montero wrote:
>> Why does IBM keep patenting various non-blocking techniques?
>
> They're patenting lockfree algorithms for
> producer-consumer-relationships?
>
>> RCU?
>
> RCU is something completely different.
>


Check this out, from IBM, Joe is the sole inventor:

https://www.google.com/patents/US5295262

I know he is extremely smart.

Chris M. Thomasson

unread,
Mar 21, 2017, 5:46:33 PM3/21/17
to
Funny thing is that the following bounded queue beats this patented shi%
to hell wrt performance:

https://groups.google.com/d/topic/lock-free/acjQ3-89abE/discussion

Yikes!

Bonita Montero

unread,
Mar 21, 2017, 5:46:52 PM3/21/17
to
Doesn't matter. The cases where lock-free consumer-producer-algorithms
are beneficial over traditional locking are very rare because of the
relationship I explained.

--
http://facebook.com/bonita.montero/

Bonita Montero

unread,
Mar 21, 2017, 5:48:44 PM3/21/17
to
> Funny thing is that the following bounded queue beats this patented shi%
> to hell wrt performance:

You also have the minimal experience with real-world programming like
Aminer.

--
http://facebook.com/bonita.montero/

Chris M. Thomasson

unread,
Mar 21, 2017, 5:49:21 PM3/21/17
to
On 3/21/2017 2:46 PM, Bonita Montero wrote:
> Doesn't matter. The cases where lock-free consumer-producer-algorithms
> are beneficial over traditional locking are very rare because of the
> relationship I explained.
>

Have you read the patent here?

https://www.google.com/patents/US5295262

Its a strong thread safe reference counting mechanism. Are you saying
that reference counting should probably be using mutexs?

Bonita Montero

unread,
Mar 21, 2017, 5:52:07 PM3/21/17
to
> Its a strong thread safe reference counting mechanism. Are you
> saying that reference counting should probably be using mutexs?

I was talking about producer-consumer-relationships and that there
is almost no speedup in most real-world-code with lock-free producler
-consumer algoritms.

--
http://facebook.com/bonita.montero/

Chris M. Thomasson

unread,
Mar 21, 2017, 5:55:30 PM3/21/17
to
On 3/21/2017 2:52 PM, Bonita Montero wrote:
>> Its a strong thread safe reference counting mechanism. Are you
>> saying that reference counting should probably be using mutexs?
>
> I was talking about producer-consumer-relationships and that there
> is almost no speedup in most real-world-code with lock-free producler
> -consumer algoritms.
>

Even in audio apps?

Bonita Montero

unread,
Mar 21, 2017, 5:56:51 PM3/21/17
to
> Even in audio apps?

... for which you have never written any code.

--
http://facebook.com/bonita.montero/

Chris M. Thomasson

unread,
Mar 21, 2017, 5:58:21 PM3/21/17
to
On 3/21/2017 2:56 PM, Bonita Montero wrote:
>> Even in audio apps?
>
> ... for which you have never written any code.
>

Are you trolling me?

I have had discussions with some people in the field.

They are really into the single producer/consumer aspect, the less
atomic RMW the better.

Chris M. Thomasson

unread,
Mar 21, 2017, 5:59:47 PM3/21/17
to
On 3/21/2017 2:48 PM, Bonita Montero wrote:
>> Funny thing is that the following bounded queue beats this patented shi%
>> to hell wrt performance:
>
> You also have the minimal experience with real-world programming like
> Aminer.
>

Comparing me to Aminer's behavior is pretty damn harsh.

Bonita Montero

unread,
Mar 21, 2017, 6:01:07 PM3/21/17
to
You seem very far from being a programmer like Animer.

--
http://facebook.com/bonita.montero/

Bonita Montero

unread,
Mar 21, 2017, 6:02:17 PM3/21/17
to
> Comparing me to Aminer's behavior is pretty damn harsh.

The difference to to Aminer is only the posting-frequency.

--
http://facebook.com/bonita.montero/

Chris M. Thomasson

unread,
Mar 21, 2017, 6:02:31 PM3/21/17
to
On 3/21/2017 3:00 PM, Bonita Montero wrote:
> You seem very far from being a programmer like Animer.
>

Hard-Core!

Alf P. Steinbach

unread,
Mar 21, 2017, 6:12:09 PM3/21/17
to
On 21-Mar-17 11:00 PM, Bonita Montero wrote:
> You seem very far from being a programmer like Animer.

Andrew Koenig is a cat person. You are apparently a cat person. Yet
Andrew has never, as far as I know, progressed unwarranted into personal
characterization, or for that matter, at all.


Cheers!,

- Alf

Chris M. Thomasson

unread,
Mar 21, 2017, 7:06:02 PM3/21/17
to
I feel a bit cheated right now. Almost like everything I posted might
have been in vain. Damn.

The comparison to Ramine's behavior was pretty gnarly Alf.

Alf P. Steinbach

unread,
Mar 21, 2017, 8:04:05 PM3/21/17
to
On 22-Mar-17 12:05 AM, Chris M. Thomasson wrote:
> On 3/21/2017 3:12 PM, Alf P. Steinbach wrote:
>> On 21-Mar-17 11:00 PM, Bonita Montero wrote:
>>> You seem very far from being a programmer like Animer.
>>
>> Andrew Koenig is a cat person. You are apparently a cat person. Yet
>> Andrew has never, as far as I know, progressed unwarranted into personal
>> characterization, or for that matter, at all.
>
> I feel a bit cheated right now. Almost like everything I posted might
> have been in vain. Damn.

No, investigation & exploration & development of techniques is IMO never
in vain. Even if, as might be the case for some postings, and as Bonita
claims, there is currently little practical use /in some given niche/.
However, regarding her claims about what common usage patterns are, and
her claims about implications of that for your work, I'd consult others
about it. Don't know who, though. But maybe you do. ;-)


> The comparison to Ramine's behavior was pretty gnarly Alf.

It's only poison.

Now listen to Alice Cooper singing about a girl, who, in the video, is
beautiful, or at least not the least bit unattractive. You'd never guess
that he was/is a religious nutcase, now would you? But yes, Christian
and all: that's why he thinks that this girl, or his attraction to her,
must be Evil™. But she's really not that special, she just (like many
others) behaves provocationally, without thinking, at times. And in the
end it's all just ordinary, so /very ordinary/. :)

<url: https://www.youtube.com/watch?v=Qq4j1LtCdww>


Cheers!,

- Alf (off-topic rant mode)

Chris M. Thomasson

unread,
Mar 21, 2017, 8:23:43 PM3/21/17
to
On 3/21/2017 5:03 PM, Alf P. Steinbach wrote:
> On 22-Mar-17 12:05 AM, Chris M. Thomasson wrote:
>> On 3/21/2017 3:12 PM, Alf P. Steinbach wrote:
>>> On 21-Mar-17 11:00 PM, Bonita Montero wrote:
>>>> You seem very far from being a programmer like Animer.
>>>
>>> Andrew Koenig is a cat person. You are apparently a cat person. Yet
>>> Andrew has never, as far as I know, progressed unwarranted into personal
>>> characterization, or for that matter, at all.
>>
>> I feel a bit cheated right now. Almost like everything I posted might
>> have been in vain. Damn.
>
> No, investigation & exploration & development of techniques is IMO never
> in vain. Even if, as might be the case for some postings, and as Bonita
> claims, there is currently little practical use /in some given niche/.

At least when one needs them, they are there in the toolbox. ;^o


> However, regarding her claims about what common usage patterns are, and
> her claims about implications of that for your work, I'd consult others
> about it. Don't know who, though. But maybe you do. ;-)

Another place where I have seen work in lock-free is game design wrt
highly efficient queues. Databases are another one. Take the following
talk into account:

https://youtu.be/qcD2Zj9GgI4

These exotic algorithms are important. Niche as they may be.


>
>
>> The comparison to Ramine's behavior was pretty gnarly Alf.
>
> It's only poison.
>
> Now listen to Alice Cooper singing about a girl, who, in the video, is
> beautiful, or at least not the least bit unattractive. You'd never guess
> that he was/is a religious nutcase, now would you? But yes, Christian
> and all: that's why he thinks that this girl, or his attraction to her,
> must be Evil™. But she's really not that special, she just (like many
> others) behaves provocationally, without thinking, at times. And in the
> end it's all just ordinary, so /very ordinary/. :)
>
> <url: https://www.youtube.com/watch?v=Qq4j1LtCdww>

Poison: https://youtu.be/Qq4j1LtCdww
;^)



Chris M. Thomasson

unread,
Mar 21, 2017, 8:27:20 PM3/21/17
to
On 3/21/2017 5:23 PM, Chris M. Thomasson wrote:
> On 3/21/2017 5:03 PM, Alf P. Steinbach wrote:
>> On 22-Mar-17 12:05 AM, Chris M. Thomasson wrote:
>>> On 3/21/2017 3:12 PM, Alf P. Steinbach wrote:
>>>> On 21-Mar-17 11:00 PM, Bonita Montero wrote:
[...]
>>> The comparison to Ramine's behavior was pretty gnarly Alf.
>>
>> It's only poison.
>>
>> Now listen to Alice Cooper singing about a girl, who, in the video, is
>> beautiful, or at least not the least bit unattractive. You'd never guess
>> that he was/is a religious nutcase, now would you? But yes, Christian
>> and all: that's why he thinks that this girl, or his attraction to her,
>> must be Evil™. But she's really not that special, she just (like many
>> others) behaves provocationally, without thinking, at times. And in the
>> end it's all just ordinary, so /very ordinary/. :)
>>
>> <url: https://www.youtube.com/watch?v=Qq4j1LtCdww>
>
> Poison: https://youtu.be/Qq4j1LtCdww
> ;^)

Perhaps I can say only the crazy would ever even think about lock-free!

Welcome to the nightmare.

Chris M. Thomasson

unread,
Mar 21, 2017, 8:34:41 PM3/21/17
to
Heck, even semaphores out there:

https://www.google.com/patents/US8392627

Chris M. Thomasson

unread,
Mar 23, 2017, 2:50:14 AM3/23/17
to
On 3/21/2017 5:23 PM, Chris M. Thomasson wrote:
> On 3/21/2017 5:03 PM, Alf P. Steinbach wrote:
>> On 22-Mar-17 12:05 AM, Chris M. Thomasson wrote:
>>> On 3/21/2017 3:12 PM, Alf P. Steinbach wrote:
>>>> On 21-Mar-17 11:00 PM, Bonita Montero wrote:
>>>>> You seem very far from being a programmer like Animer.
>>>>
>>>> Andrew Koenig is a cat person. You are apparently a cat person. Yet
>>>> Andrew has never, as far as I know, progressed unwarranted into
>>>> personal
>>>> characterization, or for that matter, at all.
>>>
>>> I feel a bit cheated right now. Almost like everything I posted might
>>> have been in vain. Damn.
>>
>> No, investigation & exploration & development of techniques is IMO never
>> in vain. Even if, as might be the case for some postings, and as Bonita
>> claims, there is currently little practical use /in some given niche/.
>
> At least when one needs them, they are there in the toolbox. ;^o
>
>
>> However, regarding her claims about what common usage patterns are, and
>> her claims about implications of that for your work, I'd consult others
>> about it. Don't know who, though. But maybe you do. ;-)
>
> Another place where I have seen work in lock-free is game design wrt
> highly efficient queues. Databases are another one. Take the following
> talk into account:
>
> https://youtu.be/qcD2Zj9GgI4

I forgot to mention that this is a CppCon 2016 talk from Paul E.
McKenney on sophisticated and efficient means to manipulate a binary
tree wrt the Issaquah Challenge.

Slides can be found here:

http://www.rdrop.com/users/paulmck/scalability/paper/Updates.2015.01.16b.LCA.pdf

Chris M. Thomasson

unread,
Apr 3, 2017, 6:12:48 PM4/3/17
to
On 3/20/2017 5:01 PM, Chris M. Thomasson wrote:
> Here is an example of using a SPMC (single producer/multi consumer)
> queue. It blocks using spin wait. However, this can be avoided through
> the use of eventcounts, or even fast semaphores.
>
> The test will complete, give it a minute or two, it does 10000000
> iterations with 7 consumer threads.
>
> The entire example code can be found here:
>
> http://pastebin.com/raw/6nG6t422
>
>
> Can you run this shi%?
[...]

This can be multiplexed wrt using more than one of these stacks and a
little help wrt relaxed atomic loads and stores that try to give a
little meta data on the most recently used container...
0 new messages