Tips for using Chronicle Queue for market data...?

Andrew M

Aug 26, 2016, 11:05:52 AM
to Chronicle
I have multiple market data sources I want to log to disk on a single server. I want to run one JVM for each data source so I can kill/restart them individually. I would like a high-resolution timestamp on each market data event so I can interleave the files together for playback later. As these files are being written, I will also have production apps that need to start up, read the files from the start of the day to get caught up on the current market state, and then continue to receive realtime events. There can't be any race conditions around reading the previously collected events and receiving new ones.
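Sketching out what I have in mind (I'm assuming the Chronicle Queue 4.x API here; the queue path, field names, and clock are just placeholders):

import net.openhft.chronicle.queue.ChronicleQueue;
import net.openhft.chronicle.queue.ExcerptAppender;
import net.openhft.chronicle.queue.ExcerptTailer;
import net.openhft.chronicle.queue.impl.single.SingleChronicleQueueBuilder;

public class MarketDataSketch {
    public static void main(String[] args) {
        // Writer JVM, one per data source: stamp each event on the way in.
        try (ChronicleQueue queue = SingleChronicleQueueBuilder.binary("marketdata/sourceA").build()) {
            ExcerptAppender appender = queue.acquireAppender();
            appender.writeDocument(w -> w
                    .write("ts").int64(System.nanoTime())   // placeholder high-res clock
                    .write("sym").text("EURUSD")
                    .write("price").float64(1.1234));
        }

        // Reader JVM: replay from the start of the day, then keep polling.
        // A failed read just means "caught up", not end-of-stream.
        try (ChronicleQueue queue = SingleChronicleQueueBuilder.binary("marketdata/sourceA").build()) {
            ExcerptTailer tailer = queue.createTailer();
            for (;;) {
                boolean read = tailer.readDocument(w -> {
                    long ts = w.read("ts").int64();
                    String sym = w.read("sym").text();
                    double price = w.read("price").float64();
                    // apply to the current market state ...
                });
                if (!read)
                    Thread.yield();   // caught up; wait for the next live event
            }
        }
    }
}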

I think this should be a pretty standard use case for Chronicle Queue.  Am I on the right track here?

Thanks
Andrew

Peter Lawrey

Aug 26, 2016, 9:01:42 PM
to java-ch...@googlegroups.com
For simplicity, I would write everything to one queue. This way there is no confusion as to the order in which the events occurred. Note: even if two messages each have a timestamp, this doesn't mean they will be completely written in that order, nor that they will be seen in that order. The only way to guarantee order is to combine them in a single queue.
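Concretely, a rough sketch against the 4.x API (the queue path and field names are only illustrative): every source appends to the one queue, and the reader takes its ordering from the queue position rather than the embedded timestamp:

import net.openhft.chronicle.queue.ChronicleQueue;
import net.openhft.chronicle.queue.ExcerptTailer;
import net.openhft.chronicle.queue.impl.single.SingleChronicleQueueBuilder;
import net.openhft.chronicle.wire.DocumentContext;

public class TotalOrderReader {
    public static void main(String[] args) {
        try (ChronicleQueue queue = SingleChronicleQueueBuilder.binary("marketdata/all").build()) {
            ExcerptTailer tailer = queue.createTailer();
            for (;;) {
                try (DocumentContext dc = tailer.readingDocument()) {
                    if (!dc.isPresent())
                        break;   // caught up to the live end
                    // The queue index is monotonic in write order across all
                    // sources; the embedded timestamp is informational only.
                    long order = dc.index();
                    long ts = dc.wire().read("ts").int64();
                    // process strictly in queue order, whatever ts says ...
                }
            }
        }
    }
}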

About half the use cases for Queue are as a large persisted buffer for market data. The busiest users of Queue take peaks of 30 million events per second. If you are concerned about restart times, you can use a Chronicle Map to persist the state up to a given point in the queue, so that on a restart you can continue from where you were.
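For example, a sketch of a restartable consumer that checkpoints the queue index in a persisted Chronicle Map (3.x API; the file and consumer names are illustrative, and note the checkpointed event is re-read once on restart, so processing should tolerate that):

import net.openhft.chronicle.map.ChronicleMap;
import net.openhft.chronicle.queue.ChronicleQueue;
import net.openhft.chronicle.queue.ExcerptTailer;
import net.openhft.chronicle.queue.impl.single.SingleChronicleQueueBuilder;
import net.openhft.chronicle.wire.DocumentContext;

import java.io.File;
import java.io.IOException;

public class CheckpointedConsumer {
    public static void main(String[] args) throws IOException {
        try (ChronicleQueue queue = SingleChronicleQueueBuilder.binary("marketdata/all").build();
             ChronicleMap<String, Long> checkpoints = ChronicleMap
                     .of(String.class, Long.class)
                     .averageKey("state-builder")
                     .entries(64)
                     .createPersistedTo(new File("checkpoints.dat"))) {

            ExcerptTailer tailer = queue.createTailer();
            Long saved = checkpoints.get("state-builder");
            if (saved != null)
                tailer.moveToIndex(saved);   // resume at the last checkpointed event

            for (;;) {
                try (DocumentContext dc = tailer.readingDocument()) {
                    if (!dc.isPresent())
                        break;   // caught up to the live end
                    // rebuild/update market state from dc.wire() ...
                    checkpoints.put("state-builder", dc.index());
                }
            }
        }
    }
}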

The other half of the use cases are in compliance and risk systems. They tend to be bigger users of Chronicle Map.

Regards,
   Peter.


Andrew M

Aug 28, 2016, 11:03:52 PM
to Chronicle
I was thinking that if I used one queue per instrument per day, then when I want to repeatedly backtest something using a small number of instruments across many days, I only have to parse the specific files related to those instruments, rather than scanning large files for the events I'm interested in. Does that seem correct?

Thanks very much

Peter Lawrey

Aug 28, 2016, 11:10:43 PM
to java-ch...@googlegroups.com
You could do that; however, if you need to know the order in which those events occurred, e.g. you are back-testing how one instrument can be used to trade another, you need total ordering.

For back-testing purposes I suggest extracting the information you need into a queue, flat files, or a matlab/vector format for repeated replay. That way only the data appropriate for your backtest is needed. Given you will be performing the replay many, many times, it can be useful to cut it down to only the data you need, in a compact form, e.g. you might use float or short+fixed-precision data.
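For example, a sketch of such an extractor (the symbol, record layout, and file names are just illustrative):

import net.openhft.chronicle.queue.ChronicleQueue;
import net.openhft.chronicle.queue.ExcerptTailer;
import net.openhft.chronicle.queue.impl.single.SingleChronicleQueueBuilder;

import java.io.BufferedOutputStream;
import java.io.DataOutputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

public class BacktestExtractor {
    public static void main(String[] args) throws IOException {
        // Scan the master queue once, keep one instrument, and write
        // compact fixed-width records for fast, repeated replay.
        try (ChronicleQueue master = SingleChronicleQueueBuilder.binary("marketdata/all").build();
             DataOutputStream out = new DataOutputStream(new BufferedOutputStream(
                     new FileOutputStream("EURUSD.bin")))) {

            ExcerptTailer tailer = master.createTailer();
            while (tailer.readDocument(w -> {
                long ts = w.read("ts").int64();
                String sym = w.read("sym").text();
                double price = w.read("price").float64();
                if ("EURUSD".equals(sym)) {
                    try {
                        out.writeLong(ts);
                        out.writeFloat((float) price); // or an int with fixed precision
                    } catch (IOException e) {
                        throw new UncheckedIOException(e);
                    }
                }
            })) {
                // keep scanning until the end of the queue
            }
        }
    }
}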

This is the sort of data store I would use for back testing: https://github.com/OpenHFT/Chronicle-TimeSeries. Chronicle Queue is better as the master original source from which you can extract what you are interested in.

Regards, Peter.
