Bottleneck in using disruptor to send market data messages to algos

honey...@gmail.com

Aug 16, 2018, 5:38:31 AM

to Disruptor

Hi All/Sam,

My use case is -

There is an algo trading system which is used by traders to run semi-automatic trading strategies. A trader can start 20-30 algos.

I am planning to use the LMAX Disruptor ring buffer for this. I receive market data from a multicast socket and publish it to the ring buffer as soon as it arrives. I plan to give each algo its own ring buffer consumer.

The problem with this design is that every algo adds a new consumer, and a trader can potentially start 30-40 algos. In my tests, increasing the number of consumers degraded performance.

Can you please suggest what the ideal design would be in this case?

regards,

Honi Jain

Michael Barker

Aug 16, 2018, 5:43:24 AM

to lmax-di...@googlegroups.com

How many CPU cores do you have available? If you have fewer cores than algos, there is a good chance that you are thrashing the CPU (especially if you are using a busy-wait based strategy). You may be better off deciding how many cores you want to dedicate to running the algos, creating that number of consumers, and then distributing the algos across the consumers.
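A minimal sketch of that idea, assuming a round-robin assignment: a fixed number of handlers is created and each one fans every market data event out to the algos it owns. `Algo`, `MarketDataEvent`, `AlgoGroupHandler`, and `AlgoPartitioner` are hypothetical names used only for illustration. The `onEvent` signature mirrors Disruptor's `EventHandler.onEvent(event, sequence, endOfBatch)`, so the handler class could implement `EventHandler<MarketDataEvent>` once the library is on the classpath.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical algo callback; not part of the Disruptor API.
interface Algo {
    void onMarketData(String symbol, double price);
}

// Hypothetical mutable event that would occupy a ring buffer slot.
final class MarketDataEvent {
    String symbol;
    double price;
}

// One instance per core dedicated to algos. Every instance sees every event
// and forwards it to the algos assigned to it.
final class AlgoGroupHandler {
    // Copy-on-write so an algo can be added while the handler thread iterates.
    private final List<Algo> algos = new CopyOnWriteArrayList<>();

    void add(Algo algo) {
        algos.add(algo);
    }

    int size() {
        return algos.size();
    }

    // Same parameter list as Disruptor's EventHandler.onEvent.
    public void onEvent(MarketDataEvent event, long sequence, boolean endOfBatch) {
        for (Algo algo : algos) {
            algo.onMarketData(event.symbol, event.price);
        }
    }
}

final class AlgoPartitioner {
    // Round-robin assignment (an assumption; any balancing rule would do).
    static List<AlgoGroupHandler> partition(List<Algo> algos, int consumerCount) {
        if (consumerCount <= 0) {
            throw new IllegalArgumentException("consumerCount must be positive");
        }
        List<AlgoGroupHandler> handlers = new ArrayList<>();
        for (int i = 0; i < consumerCount; i++) {
            handlers.add(new AlgoGroupHandler());
        }
        for (int i = 0; i < algos.size(); i++) {
            handlers.get(i % consumerCount).add(algos.get(i));
        }
        return handlers;
    }
}
```

With the real library, the resulting handlers would be registered once via `disruptor.handleEventsWith(...)`, so the number of consumer threads stays fixed no matter how many algos a trader starts. A slow algo delays the others in its group, so the grouping rule may need to account for per-algo cost.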

Mike.


honey...@gmail.com

Aug 16, 2018, 5:50:17 AM

to Disruptor

Hi,

Thanks for the reply. I have a dual-processor, 16-core machine, but there are 5 traders on that one machine. So yes, there are fewer cores than algos. The market data server and the OMS server also run on the same machine.

I agree that one good alternative would be to restrict the number of consumers of ring buffer to say 10 and then distribute the algos across these consumers.

Could there be any other design for this case?

regards,

Honi Jain

inappinstore

Aug 19, 2018, 3:52:21 PM

to Disruptor

First of all, we need to know which wait strategy you are using. Second, is your workload actually elastic? That is, does it contract and expand based on demand, and can it grow dynamically in the future? If so, you need to distribute the load across more than one machine and use some sort of hashing or workload-assignment algorithm to discard multicast input on machines where it is not used. For example, market data for tickers 1-50 is ignored on machine B and tickers 51-100 are ignored on machine A, where "ignored" means not placed on the ring buffer.

To distribute the workload by algorithm, you can shard using a static or dynamic table that maps each machine to the set of algos that are running, or can be run, on it. The number of algos on a machine should never be more than (number of physical cores - 4), so that 1 core is dedicated to the operating system, 2 to network I/O, and 1 to the dispatch/multicast listener. You should also tune your JVM and algos to minimize garbage and garbage collection; if you can't, allocate cores for that duty as well.
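The two rules above can be sketched roughly as follows. `ShardPlan` and its methods are hypothetical names, and hashing the ticker string is just one possible assignment rule; a static ticker-range table like the one in the example would work equally well. The multicast listener would call `shouldPublish` before claiming a ring buffer slot.

```java
// Hypothetical helper; not part of the Disruptor library.
final class ShardPlan {
    // Cores reserved per the rule above: 1 for the OS, 2 for network I/O,
    // 1 for the dispatch/multicast listener.
    static final int RESERVED_CORES = 4;

    // Upper bound on algos for one machine: physical cores minus reserved cores.
    static int maxAlgos(int physicalCores) {
        return Math.max(0, physicalCores - RESERVED_CORES);
    }

    // Index of the machine that owns a ticker (assumed rule: hash modulo count).
    static int ownerOf(String ticker, int machineCount) {
        if (machineCount <= 0) {
            throw new IllegalArgumentException("machineCount must be positive");
        }
        // floorMod keeps the result non-negative even for negative hash codes.
        return Math.floorMod(ticker.hashCode(), machineCount);
    }

    // True if this machine should place the ticker's data on its ring buffer;
    // otherwise the multicast packet is dropped before publication.
    static boolean shouldPublish(String ticker, int machineIndex, int machineCount) {
        return ownerOf(ticker, machineCount) == machineIndex;
    }
}
```

For instance, `ShardPlan.maxAlgos(16)` gives 12 under this rule, and each ticker is published on exactly one machine, so algos must be placed on the machine that owns the tickers they subscribe to.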
