Concurrent data structures are often considerably slower than sequential ones, for a number of reasons. Often the algorithm itself is more complex, so, even without concurrency, each thread has to do more work. Threads may conflict on operations, and have to restart them. And threads will suffer more cache misses, and the memory bus will suffer congestion.
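To illustrate the conflict-and-restart point: a lock-free update typically sits in a compare-and-set loop, and every failed attempt is wasted work that gets redone. A minimal Java sketch (just a generic illustration, not any particular implementation from your practical):

    import java.util.concurrent.atomic.AtomicInteger;

    public class CasRetry {
        private final AtomicInteger count = new AtomicInteger(0);

        // Each failed compareAndSet means another thread got in first:
        // the work spent computing 'next' is thrown away and redone.
        // (AtomicInteger.incrementAndGet does this same loop internally.)
        public int increment() {
            while (true) {
                int current = count.get();
                int next = current + 1;
                if (count.compareAndSet(current, next)) return next;
            }
        }
    }

Under heavy contention, most iterations of that loop fail, so throughput can drop well below that of a single uncontended thread.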
Concurrent priority queues scale particularly badly. The head of the queue is inevitably a bottleneck for removeMin, as all threads will be accessing the same part of the datatype. This explains your observation that the priority queue works better when all threads simply enqueue.
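You can see the effect directly with the standard library's PriorityBlockingQueue, which guards its heap with a single lock, so every operation serializes on the same hot spot. This isn't the implementation you were testing, but the sketch below (thread count and operation count are arbitrary choices of mine) is easy to time with different mixes of enqueues and removeMins:

    import java.util.concurrent.PriorityBlockingQueue;

    public class PQContention {
        public static void main(String[] args) throws InterruptedException {
            final int NUM_THREADS = 8;
            final int OPS = 100_000;
            PriorityBlockingQueue<Integer> pq = new PriorityBlockingQueue<>();
            for (int i = 0; i < OPS; i++) pq.add(i); // pre-fill so poll() never blocks
            Thread[] workers = new Thread[NUM_THREADS];
            long start = System.nanoTime();
            for (int t = 0; t < NUM_THREADS; t++) {
                workers[t] = new Thread(() -> {
                    for (int i = 0; i < OPS; i++) {
                        pq.add(i);   // enqueue
                        pq.poll();   // removeMin: all threads fight over the head
                    }
                });
                workers[t].start();
            }
            for (Thread w : workers) w.join();
            System.out.printf("%d threads: %d ms%n", NUM_THREADS,
                              (System.nanoTime() - start) / 1_000_000);
        }
    }

Try varying NUM_THREADS: adding threads typically makes this slower, not faster, which is the scaling problem in miniature.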
The speed-ups you get from concurrent programs come mainly from the parts where threads act independently, outside shared datatypes. The shared datatypes are a necessary evil, to allow threads to interact. Do you need a pure priority queue in your application, or could you manage with an approximation, where each thread maintains its own queue, and threads sometimes exchange data?
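Here is a rough sketch of what I mean (the class, the names, and the exchange policy are all my own invention for illustration, not a library design): each thread keeps a private heap, and every so often publishes its current minimum into a small shared buffer, so removeMin returns some small element but not necessarily the global minimum.

    import java.util.PriorityQueue;
    import java.util.concurrent.ConcurrentLinkedQueue;

    // Approximate priority queue: each thread owns a local heap and only
    // occasionally touches the shared exchange buffer, so contention is low,
    // but removeMin may miss a smaller element held in another thread's heap.
    public class ApproxPriorityQueue {
        private final ConcurrentLinkedQueue<Integer> exchange = new ConcurrentLinkedQueue<>();
        private final ThreadLocal<PriorityQueue<Integer>> local =
            ThreadLocal.withInitial(PriorityQueue::new);
        private final ThreadLocal<Integer> opCount = ThreadLocal.withInitial(() -> 0);
        private static final int EXCHANGE_PERIOD = 64; // tune to taste

        public void add(int x) {
            local.get().add(x);
            maybeExchange();
        }

        // Returns some small element, not necessarily the global minimum;
        // returns null if this thread currently has nothing available.
        public Integer removeMin() {
            PriorityQueue<Integer> mine = local.get();
            Integer shared = exchange.poll(); // pick up another thread's offering
            if (shared != null) mine.add(shared);
            maybeExchange();
            return mine.poll();
        }

        // Every EXCHANGE_PERIOD operations, publish our current minimum so
        // other threads get a chance to see the small elements we are holding.
        private void maybeExchange() {
            int n = opCount.get() + 1;
            opCount.set(n);
            if (n % EXCHANGE_PERIOD == 0) {
                Integer min = local.get().poll();
                if (min != null) exchange.add(min);
            }
        }
    }

The trade-off is explicit: you give up the strict "always the minimum" guarantee in exchange for threads mostly working on their own data. Whether that is acceptable depends on your application; many uses of priority queues tolerate this kind of relaxation well.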
A Google search will find quite a lot of papers about different designs for concurrent priority queues.
Gavin