Thanks for the explanation, David. I understand your workflow, and I can see that batching may make sense when working with large data pipelines. MultithreadConcurrentQueue is a very high-performance queue intended to transfer data quickly between threads. As such, it is designed to be most efficient when it is small and typically empty. By small, I mean that it would fit in your processor's cache: 2000 entries, often less.
Also, the drain method is intended to be a very efficient mechanism for pulling everything out of the queue. The intention is that it would boil down to a memcpy-type operation. In my performance testing, adding the ‘nulling’ operation makes the drain perform essentially on par with single-item polling, so it would lose some of the benefit of the bulk operation.
I understand that this is problematic for your use case. I’d suggest greatly reducing the size of the queue and seeing whether your application works as effectively. Would you get similar performance from batching up 2000 or so elements at a time?
If your workload requires hundreds of thousands of elements, I’d consider using the LinkedTransferQueue provided in the JDK.
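As a rough sketch of what that could look like, assuming your consumer can process elements in chunks: the example below fills a LinkedTransferQueue and drains it in batches of at most 2000 using the standard drainTo(Collection, int) method from the JDK. The class name BatchDrainSketch, the method drainInBatches, and the BATCH_SIZE constant are hypothetical names made up for illustration, and the Integer payload stands in for whatever your pipeline actually carries.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.LinkedTransferQueue;

// Hypothetical illustration only; not part of any library.
final class BatchDrainSketch {

    // Assumed batch size, taken from the "2000 or so" figure mentioned above.
    static final int BATCH_SIZE = 2000;

    // Drains the queue in chunks of at most batchSize elements and
    // returns the size of each chunk that was pulled out.
    static List<Integer> drainInBatches(LinkedTransferQueue<Integer> queue, int batchSize) {
        List<Integer> batchSizes = new ArrayList<>();
        List<Integer> batch = new ArrayList<>(batchSize);

        // drainTo returns the number of elements moved; 0 means nothing was available.
        while (queue.drainTo(batch, batchSize) > 0) {
            batchSizes.add(batch.size());
            // A real consumer would process the batch here.
            batch.clear(); // reuse the same list for the next chunk
        }
        return batchSizes;
    }

    public static void main(String[] args) {
        LinkedTransferQueue<Integer> queue = new LinkedTransferQueue<>();
        for (int i = 0; i < 5000; i++) {
            queue.offer(i); // unbounded queue, so offer always succeeds
        }
        System.out.println("Batch sizes: " + drainInBatches(queue, BATCH_SIZE));
    }
}
```

Because LinkedTransferQueue is unbounded, the producer can run far ahead of the consumer without the queue filling up, while the consumer still works on cache-friendly chunks. Whether that trade-off suits your workload is something only measurement will tell.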
I sure hope this helps. Please let me know how it works out.
John