Data Frequencies


Aaron Todd

Sep 4, 2015, 10:02:30 AM
to Zipline Python Opensource Backtester
I understand zipline only supports minute and daily frequencies out of the box.

Can anyone shed some light on the complications of supporting 1-second or faster data (possibly even tick data with bid/ask spreads)? I could wade through the code base (which I've glanced at), but it would be much easier to hear from the developers why the decision was made and how deeply ingrained it is.

My understanding is that some assumptions are made internally about the data frequency, and I would like to wrap my head around them. I've searched through Quantopian, GitHub issues, and this group but haven't quite stitched the whole story together. I know I'll have to create a new data source generator that generates the events, but from what I've gathered that isn't enough (please correct me if I'm wrong here).

What would be really helpful is a high-level walkthrough of what would need to be changed or refactored (just pointing out the relevant classes and dependencies is probably enough; I can figure out the rest).


Thanks for all the hard work!
-Aaron

Brian Bowles

Dec 1, 2015, 3:37:30 AM
to Zipline Python Opensource Backtester
Did you ever get any further with this question? I have the identical question and perspective. I am considering importing data with second/sub-second resolution via CSV to see where the code chokes, but that is a time-consuming way to make a determination. I don't need a paper written, but a few sentences with someone's informed opinion would be exceptionally useful (and appreciated). Thanks!

Joe Jevnik

Dec 7, 2015, 6:22:40 PM
to Zipline Python Opensource Backtester
Sorry for the delay in answering. Quantopian does not plan on supporting higher-frequency data in zipline. There are a lot of places in the code base that assume that the mode is either daily or minute. I am not sure that I can point you at any particular classes because this assumption is baked into most of the code that deals with the clock. We are currently in the middle of a big change to how we manage the clock and data. If you are interested in seeing how hard it would be to make the changes, I would suggest looking at the DataPortal object here: https://github.com/quantopian/zipline/pull/858.
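
Since only daily and minute modes are supported, one way to still use second or tick data is to downsample it to minute bars before it reaches the backtester. The following is a minimal sketch of that idea in plain Python. It is not taken from zipline and uses no zipline API: the function name `ticks_to_minute_bars`, the `(timestamp, price, size)` tick layout, and the choice to label each bar by the start of its minute are all assumptions made for illustration.

```python
from collections import OrderedDict
from datetime import datetime


def ticks_to_minute_bars(ticks):
    """Aggregate (timestamp, price, size) ticks into one-minute OHLCV bars.

    Hypothetical helper, not part of zipline. `ticks` must already be
    sorted by timestamp. Minutes without any tick produce no bar.
    """
    bars = OrderedDict()
    for ts, price, size in ticks:
        # Assumption: a bar is labeled by the start of the minute it covers.
        minute = ts.replace(second=0, microsecond=0)
        bar = bars.get(minute)
        if bar is None:
            # First tick of the minute sets every price field.
            bars[minute] = {
                "dt": minute,
                "open": price,
                "high": price,
                "low": price,
                "close": price,
                "volume": size,
            }
        else:
            bar["high"] = max(bar["high"], price)
            bar["low"] = min(bar["low"], price)
            bar["close"] = price  # last tick seen so far in this minute
            bar["volume"] += size
    return list(bars.values())


# Made-up sample ticks spanning two minutes.
sample_ticks = [
    (datetime(2015, 9, 4, 9, 30, 0, 250000), 100.0, 10),
    (datetime(2015, 9, 4, 9, 30, 15), 100.5, 5),
    (datetime(2015, 9, 4, 9, 30, 59), 99.75, 20),
    (datetime(2015, 9, 4, 9, 31, 2), 100.25, 7),
]
minute_bars = ticks_to_minute_bars(sample_ticks)
```

Downsampling like this throws away the intra-minute ordering and any bid/ask detail, so it only helps strategies that can live with minute resolution. Anything that genuinely needs sub-minute events would still run into the clock assumptions described above.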

Aaron Todd

Dec 7, 2015, 6:23:00 PM
to Zipline Python Opensource Backtester
Nope, to be honest I've moved on to QuantConnect. Both Zipline and Lean (QC) are great projects, but after going through the code of both I think the abstractions and architecture of Lean are cleaner (for me at least). I would still prefer Python, but the language choice is not a barrier for me. Only time will tell if the switch was the right choice.

