Minute-frequency data example


Пифтанкин Гена

Mar 9, 2017, 1:07:58 PM
to Zipline Python Opensource Backtester
Hi everyone! Could you please tell me whether the default 'quantopian-quandl' data has minute bars? What's the simplest way to check this? And what's the simplest way to run a backtest on minute market data? Thanks in advance!

Пифтанкин Гена

Mar 10, 2017, 2:34:53 AM
to Zipline Python Opensource Backtester
So code like the following does not work:

from pandas import Timestamp
from zipline import run_algorithm
from zipline.api import order, symbol


def initialize(context):
    # Called once at the start of the backtest.
    print("hello world")


def handle_data(context, data):
    # Called on every bar: buy one share of AAPL each time.
    order(symbol('AAPL'), 1)


results = run_algorithm(
    Timestamp('2013', tz='UTC'),
    Timestamp('2015', tz='UTC'),
    capital_base=10000,
    initialize=initialize,
    handle_data=handle_data,
    bundle='quantopian-quandl',
    data_frequency='minute'
)

FileNotFoundError: [Errno 2] No such file or directory: '...\\2017-03-03T09;32;07.057940\\minute_equities.bcolz\\00\\00\\000000.bcolz\\volume\\meta\\sizes'


Ed Bartosh

Mar 10, 2017, 3:15:28 AM
to Пифтанкин Гена, Zipline Python Opensource Backtester
Hi Гена,

The quantopian-quandl bundle uses the Quandl WIKI dataset: https://www.quandl.com/data/WIKI-Wiki-EOD-Stock-Prices
It contains only EOD (end-of-day) price data, so there are no minute bars.
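
One rough way to answer the "how do I check" question is to look inside the ingested bundle on disk. The sketch below assumes the layout visible in the error message earlier in this thread: a `minute_equities.bcolz` directory that holds per-asset `*.bcolz` subdirectories. The function name `has_minute_bars` is made up for illustration, and the location of the ingestion directory on your machine is something to confirm yourself.

```python
from pathlib import Path


def has_minute_bars(ingestion_dir) -> bool:
    """Return True if an ingestion directory appears to contain minute bar data.

    Assumption: minute bars live under `minute_equities.bcolz` in per-asset
    directories whose names end in `.bcolz` (inferred from the error path
    quoted in this thread, not from zipline's documentation).
    """
    minute_root = Path(ingestion_dir) / "minute_equities.bcolz"
    if not minute_root.is_dir():
        return False
    # The root directory alone proves nothing; look for per-asset tables.
    return any(p.is_dir() for p in minute_root.rglob("*.bcolz"))
```

Calling `has_minute_bars(...)` on the timestamped ingestion directory named in the error message should return `False` for an EOD-only bundle, if the layout assumption holds.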

Regards,
Ed 

--
You received this message because you are subscribed to the Google Groups "Zipline Python Opensource Backtester" group.
To unsubscribe from this group and stop receiving emails from it, send an email to zipline+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.



--
BR,
Ed

Gennady Piftankin

Mar 10, 2017, 3:28:19 AM
to Zipline Python Opensource Backtester, pift...@gmail.com
Thanks, Ed!

What about a simple example of using minute frequency data in zipline? 



Ed Bartosh

Mar 10, 2017, 4:18:52 AM
to Gennady Piftankin, Zipline Python Opensource Backtester
Hi Gennady,

Your algo is already a good example of using minute frequency. You just don't have minute price data.

Regards,
Ed


Gennady Piftankin

Mar 10, 2017, 4:23:01 AM
to Zipline Python Opensource Backtester, pift...@gmail.com
OK, but my question is: what is the simplest example of loading minute data, for instance if I have minute data in csv files? Thank you!


Ed Bartosh

Mar 10, 2017, 4:55:28 AM
to Gennady Piftankin, Zipline Python Opensource Backtester
Zipline doesn't yet support loading price data from csv files.

There have been a couple of examples on this list of how to do it, including my csv bundle: https://github.com/bartosh/zipline/blob/master/zipline/data/bundles/csvdir.py

Feel free to give it a try!

Regards,
Ed


Gennady Piftankin

Mar 10, 2017, 5:25:30 AM
to Zipline Python Opensource Backtester, pift...@gmail.com
Thanks a lot, Ed! I'm a new pythonian. Could you please expand "my" simple example with your functions? Are there other simple examples of loading minute market data into zipline, for example from the web? For now I just want to see how fast zipline backtests run on minute data.


Richard P

Mar 10, 2017, 6:55:50 AM
to zip...@googlegroups.com

Ed Bartosh

Mar 10, 2017, 9:24:30 AM
to Gennady Piftankin, Zipline Python Opensource Backtester
> Thanks a lot, Ed! I'm a new pythonian. Could you please expand "my" simple example with your functions?

Your code doesn't need to be changed. Data bundles are only for loading data into zipline.

Here is how to load minute data using the csvdir bundle:
1. Prepare a directory structure with price data. Here is an example:

$ find ./csvdir
./csvdir
./csvdir/daily/
./csvdir/daily/AAL.csv
./csvdir/daily/AAPL.csv
./csvdir/daily/ABBV.csv
./csvdir/daily/ABT.csv
./csvdir/daily/ACN.csv
./csvdir/daily/ADI.csv
./csvdir/daily/AVGO.csv
./csvdir/minute
./csvdir/minute/AAL.csv
./csvdir/minute/AAPL.csv
./csvdir/minute/ABBV.csv
./csvdir/minute/ABT.csv
./csvdir/minute/ACN.csv
./csvdir/minute/ADI.csv
./csvdir/minute/AVGO.csv
...

The csv files are expected to have this header: date,open,high,low,close,volume

Btw, if you don't have daily data, it's quite easy to prepare it by downsampling the minute data with the pandas resample API.

2. Put these lines into ~/.zipline/extension.py:

from zipline.data.bundles import register, csvdir_equities

register('csvdir', csvdir_equities(['daily', 'minute']))

3. Ingest your data using the csvdir bundle:

CSVDIR=./csvdir/ zipline ingest -b csvdir

4. Run your algo:

zipline run -b csvdir -f <path to your algo> -s <start date> -e <end date> --data-frequency minute
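
The downsampling mentioned in step 1 could look roughly like the sketch below. It assumes the minute bars are already in a pandas DataFrame with a DatetimeIndex and the columns from the csv header above. The helper name `minute_to_daily` is hypothetical, and the three input rows are borrowed from the BTC sample posted later in this thread.

```python
import pandas as pd


def minute_to_daily(minute_bars: pd.DataFrame) -> pd.DataFrame:
    """Downsample minute OHLCV bars to one bar per calendar day."""
    daily = minute_bars.resample("1D").agg(
        {
            "open": "first",   # first trade of the day
            "high": "max",
            "low": "min",
            "close": "last",   # last trade of the day
            "volume": "sum",
        }
    )
    # Calendar days without any minute bars come back as NaN rows; drop them.
    return daily.dropna(subset=["open"])


index = pd.to_datetime(
    ["2014-09-30 00:00:00", "2014-09-30 00:01:00", "2014-09-30 00:02:00"]
)
minute_bars = pd.DataFrame(
    {
        "open": [379.02, 378.59, 378.6],
        "high": [379.02, 378.59, 378.6],
        "low": [379.02, 378.59, 378.57],
        "close": [379.02, 378.59, 378.57],
        "volume": [3.35, 8.24, 1.77],
    },
    index=index,
)
minute_bars.index.name = "date"

daily = minute_to_daily(minute_bars)
print(daily)
```

Writing the result with `daily.to_csv(...)` into the `daily` directory would then give a file with the same header as the minute files. Note that this groups by calendar day in the index's own time zone, which may or may not line up with the trading sessions of your calendar.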

I hope it helps,


Regards,

Ed



Bo Zeng

Jan 27, 2018, 8:26:37 PM
to Zipline Python Opensource Backtester
Hi Ed:
Thanks a lot for your library;
I have one problem with minute data; if the prices are ordered by minutes, does it make sense to still get the "date" as primary key?

Or does it mean, we should have 60*24 data points for each date?

Could you provide an example in the example folder~ sounds to me the input csv column names are with "daily" format in mind. 

Ed Bartosh

Jan 28, 2018, 4:34:06 AM
to Bo Zeng, Zipline Python Opensource Backtester
Hi,

> if the prices are ordered by minutes, does it make sense to still get the "date" as primary key?

I'm not sure I fully understand your question. Yes, you should have datetime as a column in your csv files.
Otherwise it would not be possible to use more than one day of data I guess.

> Or does it mean, we should have 60*24 data points for each date?

No, it doesn't mean that. Your data should match the exchange calendar you're using.
For example, if you're trading the US market and use NYSE (the default zipline calendar), then your data should contain minute bars from 9:31 to 16:00 in the US/Eastern time zone.
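
To make the 9:31 to 16:00 window concrete, here is a small pandas sketch that builds the minute labels for one regular session. The helper name `expected_session_minutes` is hypothetical, and the sketch deliberately ignores holidays and early closes, which a real exchange calendar would handle.

```python
import pandas as pd


def expected_session_minutes(day: str) -> pd.DatetimeIndex:
    """Minute labels 09:31..16:00 US/Eastern for one regular (full) session."""
    return pd.date_range(
        f"{day} 09:31", f"{day} 16:00", freq="min", tz="US/Eastern"
    )


minutes = expected_session_minutes("2018-01-26")
print(len(minutes))  # 390 bars in a full session
print(minutes[0], minutes[-1])
```

So a full regular session has 390 minute bars per symbol rather than 60*24 = 1440.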

> Could you provide an example in the example folder~ sounds to me the input csv column names are with "daily" format in mind.

Here is the start of a csv file with minute cryptocurrency data (BTC, iirc):

date,open,high,low,close,volume
2014-09-30 00:00:00,379.02,379.02,379.02,379.02,3.35
2014-09-30 00:01:00,378.59,378.59,378.59,378.59,8.24
2014-09-30 00:02:00,378.6,378.6,378.57,378.57,1.77
2014-09-30 00:03:00,378.6,378.6,378.6,378.6,9.5
2014-09-30 00:04:00,378.39,378.39,378.39,378.39,3.743
2014-09-30 00:05:00,378.37,378.37,378.37,378.37,7.217
2014-09-30 00:06:00,378.52,378.52,378.12,378.12,33.561
2014-09-30 00:07:00,378.14,378.14,378.14,378.14,2.68
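
Before ingesting, it can help to sanity-check a file in this format with pandas. The sketch below only assumes the header shown above and uses the first three sample rows inline; with a real file you would pass its path to `read_csv` instead of the `StringIO` buffer.

```python
import io

import pandas as pd

SAMPLE = """date,open,high,low,close,volume
2014-09-30 00:00:00,379.02,379.02,379.02,379.02,3.35
2014-09-30 00:01:00,378.59,378.59,378.59,378.59,8.24
2014-09-30 00:02:00,378.6,378.6,378.57,378.57,1.77
"""

EXPECTED_COLUMNS = ["open", "high", "low", "close", "volume"]

# Parse the "date" column as full timestamps and use it as the index.
bars = pd.read_csv(io.StringIO(SAMPLE), index_col="date", parse_dates=True)

# The remaining columns should match the header the bundle expects.
assert list(bars.columns) == EXPECTED_COLUMNS

# Spacing between consecutive rows; for minute data this should be one minute.
gaps = bars.index.to_series().diff().dropna()
print(gaps.unique())
```

Despite its name, the "date" column carries a full timestamp per row, which is what lets one file span many days of minute bars.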



Regards,
Ed
 


Jimin Park

Aug 30, 2018, 10:36:16 AM
to Zipline Python Opensource Backtester
I prepared the folders and files as instructed here but get the following error.

TypeError: write() got an unexpected keyword argument 'symbols'


And the problem happens here.

Zipline version 1.3.0 (master checked out at commit e12e1c4)
zipline/data/bundles/csvdir.py

class CSVDIRBundle:
    ...
    def csvdir_bundle(...):
        ...
        if tframe == 'minute':
            writer = minute_bar_writer
        else:
            writer = daily_bar_writer

        writer.write(_pricing_iter(ddir, symbols, metadata,
                          divs_splits,show_progress),
                          show_progress=show_progress,
                          symbols=symbols)

The problem is that minute_bar_writer is an instance of BcolzMinuteBarWriter (zipline/data/minute_bars.py).
That class's write function does not take a symbols parameter. So it seems that, with this latest version of Zipline,
multi-symbol ingest from CSV wasn't tested?

Jimin Park

Aug 30, 2018, 10:39:12 AM
to Zipline Python Opensource Backtester
Whoops, sorry, please disregard my post. I introduced that parameter a long time ago to test something.

Chuang-Chieh Lin

Nov 27, 2018, 5:01:51 AM
to Zipline Python Opensource Backtester
Hi, 

I am dealing with minute-frequency data, too. I can now ingest the data into bundles.
However, as the output file shows, the output is still at daily scale.
I passed "--data-frequency minute" as one of the arguments.
Is there anything wrong?


Best regards, 
Joseph


Brett Elliot

Nov 28, 2018, 9:29:42 PM
to Zipline Python Opensource Backtester
I think the output file will always be in daily scale. You can see if you're really getting minute data by putting a print statement in handle_data.
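
A minimal sketch of that check, assuming the `data` object passed to `handle_data` exposes the current simulation time as `data.current_dt` (please verify against your zipline version). The counter `context.bars_seen` is a made-up name used only to keep the output short.

```python
def initialize(context):
    # Hypothetical counter so that only the first few bars are printed.
    context.bars_seen = 0


def handle_data(context, data):
    context.bars_seen += 1
    if context.bars_seen <= 5:
        # Assumption: `current_dt` holds the timestamp of the current bar.
        print(data.current_dt)
```

If the printed timestamps are one minute apart, the algorithm is being driven by minute bars, even though the results file is still reported per day.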