First questions (ingesting minute bars using RP's code). Snag after 'Now calling minute_bar_writer'

Kaveh Vakili

Dec 2, 2017, 4:45:38 AM
to Zipline Python Opensource Backtester


Hi all,

I'm just starting to play with zipline (this looks exciting!) and want to ingest my own 1-minute-frequency CSV files right off the bat.

I have been playing with Richard Prokopyshen's 1-minute data bundle ingest function, and I've hit my first snag. To fix ideas, my data looks like this:


                      open   high    low  close  volume
Timestamp                                              
2017-06-15 23:00:00  46.82  46.84  46.82  46.84   160.0
2017-06-15 23:01:00  46.84  46.85  46.82  46.84   143.0
2017-06-15 23:02:00  46.84  46.84  46.82  46.83   143.0
2017-06-15 23:03:00  46.84  46.85  46.83  46.84    61.0
2017-06-15 23:04:00  46.82  46.83  46.82  46.83    42.0


The code starts fine, but by the time it reaches the 'Now calling minute_bar_writer' print, it seems to hit a problem with my data. I've put the error message below in the hope that someone has encountered this problem before.

Thanks in advance for any hints,


When I run:

 zipline ingest -b ingester

I get:


$ zipline ingest -b ingester
entering machina.  tuSymbols= ('Q0017',)
about to return ingest function
entering ingest and creating blank dfMetadata
dfMetadata
<class 'pandas.core.frame.DataFrame'>
<bound method NDFrame.describe of   start_date   end_date auto_close_date symbol
0 1970-01-01 1970-01-01      1970-01-01   None>
S= Q0017 IFIL= /home/mony_algo/Aggregators/folder_merged_data/Q0017.csv
read_csv dfData
<class 'pandas.core.frame.DataFrame'> length 7717 2017-06-15 23:00:00
start_date
<class 'pandas.tslib.Timestamp'> 2017-06-15 23:00:00 None
end_date
<class 'pandas.tslib.Timestamp'> 2017-06-23 21:00:00 None
ac_date
<class 'pandas.tslib.Timestamp'> 2017-06-24 21:00:00 None
liData
<class 'list'> length 1
Now calling minute_bar_writer
Traceback (most recent call last):
  File "/home/mony_algo/Zippers/lib/python3.5/site-packages/pandas/indexes/base.py", line 1945, in get_loc
    return self._engine.get_loc(key)
  File "pandas/index.pyx", line 538, in pandas.index.DatetimeEngine.get_loc (pandas/index.c:11140)
  File "pandas/index.pyx", line 558, in pandas.index.DatetimeEngine.get_loc (pandas/index.c:10701)
KeyError: Timestamp('2017-06-23 21:00:00+0000', tz='UTC')

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/mony_algo/Zippers/bin/zipline", line 11, in <module>
    sys.exit(main())
  File "/home/mony_algo/Zippers/lib/python3.5/site-packages/click/core.py", line 722, in __call__
    return self.main(*args, **kwargs)
  File "/home/mony_algo/Zippers/lib/python3.5/site-packages/click/core.py", line 697, in main
    rv = self.invoke(ctx)
  File "/home/mony_algo/Zippers/lib/python3.5/site-packages/click/core.py", line 1066, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/mony_algo/Zippers/lib/python3.5/site-packages/click/core.py", line 895, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/mony_algo/Zippers/lib/python3.5/site-packages/click/core.py", line 535, in invoke
    return callback(*args, **kwargs)
  File "/home/mony_algo/Zippers/lib/python3.5/site-packages/zipline/__main__.py", line 312, in ingest
    show_progress,
  File "/home/mony_algo/Zippers/lib/python3.5/site-packages/zipline/data/bundles/core.py", line 451, in ingest
    pth.data_path([name, timestr], environ=environ),
  File "/home/mony_algo/Zippers/lib/python3.5/site-packages/zipline/data/bundles/ingester.py", line 112, in ingest
    minute_bar_writer.write(liData, show_progress=False)
  File "/home/mony_algo/Zippers/lib/python3.5/site-packages/zipline/data/minute_bars.py", line 697, in write
    write_sid(*e, invalid_data_behavior=invalid_data_behavior)
  File "/home/mony_algo/Zippers/lib/python3.5/site-packages/zipline/data/minute_bars.py", line 730, in write_sid
    self._write_cols(sid, dts, cols, invalid_data_behavior)
  File "/home/mony_algo/Zippers/lib/python3.5/site-packages/zipline/data/minute_bars.py", line 810, in _write_cols
    latest_min_count = all_minutes.get_loc(last_minute_to_write)
  File "/home/mony_algo/Zippers/lib/python3.5/site-packages/pandas/tseries/index.py", line 1422, in get_loc
    return Index.get_loc(self, key, method, tolerance)
  File "/home/mony_algo/Zippers/lib/python3.5/site-packages/pandas/indexes/base.py", line 1947, in get_loc
    return self._engine.get_loc(self._maybe_cast_indexer(key))
  File "pandas/index.pyx", line 538, in pandas.index.DatetimeEngine.get_loc (pandas/index.c:11140)
  File "pandas/index.pyx", line 558, in pandas.index.DatetimeEngine.get_loc (pandas/index.c:10701)
KeyError: Timestamp('2017-06-23 21:00:00+0000', tz='UTC')
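
For anyone reading the traceback: the last zipline frame is all_minutes.get_loc(last_minute_to_write), a pandas index lookup that raises KeyError when the timestamp is not in the index. Below is a minimal sketch of that failure mode using only pandas. The minute range is a hypothetical stand-in for the calendar's valid minutes (one session from 13:31 to 20:00 UTC is assumed purely for illustration), and is_valid_minute is a made-up helper name, not part of zipline.

```python
import pandas as pd

# Hypothetical stand-in for the calendar's valid minutes: a single session
# from 13:31 to 20:00 UTC. This is an assumption for illustration only,
# not what zipline actually builds internally.
all_minutes = pd.date_range(
    "2017-06-23 13:31", "2017-06-23 20:00", freq="min", tz="UTC"
)


def is_valid_minute(ts):
    """Return True if ts is one of the minutes in all_minutes."""
    try:
        all_minutes.get_loc(ts)  # same lookup as the failing traceback frame
        return True
    except KeyError:
        return False


inside = pd.Timestamp("2017-06-23 20:00", tz="UTC")
outside = pd.Timestamp("2017-06-23 21:00", tz="UTC")  # timestamp from the KeyError

print(is_valid_minute(inside))
print(is_valid_minute(outside))
```

Running a check like this over the data's own timestamps before calling the writer would point at the rows that fall outside the expected minutes.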


Kaveh Vakili

Dec 2, 2017, 4:52:42 AM
to Zipline Python Opensource Backtester


Sorry all,

After seeing the error messages on screen, I tried removing a couple of rows from the data. Now the code works like a charm.

So it seems the attached lines are the ones causing the mayhem.

Would anyone know what the problem with them could be?

Thanks again,

offending_lines.csv

Kaveh Vakili

Dec 2, 2017, 6:08:32 AM
to Zipline Python Opensource Backtester
Hi all,

Problem solved. I had mistakenly thought the original time zone of the data was London.

This thing (zipline) is a Ferrari!
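
To illustrate why the assumed source time zone matters, here is a small pandas sketch using two timestamps from the sample in the first post. The thread does not say what the data's real time zone was, so both candidate zones below are assumptions, and to_utc is a hypothetical helper name rather than anything from zipline or RP's code.

```python
import pandas as pd

# Naive timestamps as they appear in the CSV sample from the first post.
naive = pd.DatetimeIndex(["2017-06-15 23:00:00", "2017-06-15 23:01:00"])


def to_utc(index, source_tz):
    """Read naive timestamps as source_tz wall-clock time, then convert to UTC."""
    return index.tz_localize(source_tz).tz_convert("UTC")


# Two candidate assumptions about the source time zone (both hypothetical).
as_london = to_utc(naive, "Europe/London")  # British Summer Time in June, UTC+1
as_utc = to_utc(naive, "UTC")

print(as_london[0])  # 2017-06-15 22:00:00+00:00
print(as_utc[0])     # 2017-06-15 23:00:00+00:00
```

The same rows end up an hour apart in UTC depending on the assumption, which is enough to push bars onto minutes the writer does not expect.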