How to ingest the 1 minute data bundle?


chenw...@gmail.com

Dec 16, 2020, 2:43:46 AM
to Zipline Python Opensource Backtester
Hi, 

I am trying to ingest a minute data bundle, but it fails. Attached is my 1-minute CSV. The extension.py I use is as follows:

#### Begin ####
import pandas as pd

from zipline.data.bundles import register
from zipline.data.bundles.csvdir import csvdir_equities

start_session = pd.Timestamp('2020-12-07 09:30:00', tz='utc')
end_session = pd.Timestamp('2020-12-15 14:05:10', tz='utc')

register(
    'custom-minute-bundle',  # What to call the new bundle
    csvdir_equities(
        ['minute'],  # Are these daily or minute bars
        'K:/Python/zipline/thomas/Yahoo_api-master/data/before/minute',
    ),
    calendar_name='NYSE',  # US equities default
    start_session=start_session,
    end_session=end_session
)

#### End ####

The command I use is:

zipline ingest -b custom-minute-bundle  


But I got this error:

ValueError: Start session 2020-12-07 09:30:00+00:00 is invalid!  



SPY.csv

Arnold Stevens

Dec 16, 2020, 9:21:56 AM
to chenw...@gmail.com, Zipline Python Opensource Backtester
Try switching your start_session one day later, to 12/8. That sort of error sometimes arises when the start_session is out of bounds of the data. 


chenw...@gmail.com

Dec 16, 2020, 3:27:18 PM
to Zipline Python Opensource Backtester
Sorry, this doesn't help. I got the following error:
...
ValueError: Start session 2020-12-08 09:30:00+00:00 is invalid!

I've also changed the end date to one day earlier, '2020-12-14', and I've tried renaming the index from 'date' to 'datetime'. Neither change helps.

Here is the full traceback:
Traceback (most recent call last):
  File "C:\Users\chenw\anaconda3\envs\env_zipline_36\Scripts\zipline-script.py", line 10, in <module>
    sys.exit(main())
  File "C:\Users\chenw\anaconda3\envs\env_zipline_36\lib\site-packages\click\core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "C:\Users\chenw\anaconda3\envs\env_zipline_36\lib\site-packages\click\core.py", line 782, in main
    rv = self.invoke(ctx)
  File "C:\Users\chenw\anaconda3\envs\env_zipline_36\lib\site-packages\click\core.py", line 1259, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "C:\Users\chenw\anaconda3\envs\env_zipline_36\lib\site-packages\click\core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "C:\Users\chenw\anaconda3\envs\env_zipline_36\lib\site-packages\click\core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "C:\Users\chenw\anaconda3\envs\env_zipline_36\lib\site-packages\zipline\__main__.py", line 385, in ingest
    show_progress,
  File "C:\Users\chenw\anaconda3\envs\env_zipline_36\lib\site-packages\zipline\data\bundles\core.py", line 397, in ingest
    end_session,
  File "C:\Users\chenw\anaconda3\envs\env_zipline_36\lib\site-packages\zipline\data\bcolz_daily_bars.py", line 151, in __init__
    "Start session %s is invalid!" % start_session
ValueError: Start session 2020-12-08 09:30:00+00:00 is invalid!

Maybe I should change something in core.py?

chenw...@gmail.com

Dec 16, 2020, 3:33:46 PM
to Zipline Python Opensource Backtester
In "bcolz_daily_bars.py" I see this:
...
        if start_session != end_session:
            if not calendar.is_session(start_session):
                raise ValueError(
                    "Start session %s is invalid!" % start_session
                )
...

It seems my start_session is not recognized as a session by the calendar? What does this mean?

Arnold Stevens

Dec 16, 2020, 7:14:04 PM
to chenw...@gmail.com, Zipline Python Opensource Backtester
From my limited knowledge, it means that the "start session" that the user provides is not a "legitimate" trading datetime per the calendar you selected (here, "NYSE"). 
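
A minimal sketch of why such a check can reject a timestamp that falls on a valid trading day. It assumes, for illustration, that the calendar labels each session with its date at midnight UTC and tests for exact equality. The `SESSIONS` set and the `is_session` function here are hypothetical stand-ins written with the standard library, not Zipline's own code:

```python
from datetime import datetime, timezone

# Hypothetical stand-in for a trading calendar: every session is labeled
# by its date at midnight UTC (weekdays from 2020-12-07 to 2020-12-15).
SESSIONS = {
    datetime(2020, 12, day, tzinfo=timezone.utc)
    for day in (7, 8, 9, 10, 11, 14, 15)
}


def is_session(ts):
    """Return True only if ts exactly equals one of the session labels."""
    return ts in SESSIONS


with_time = datetime(2020, 12, 7, 9, 30, tzinfo=timezone.utc)
date_only = datetime(2020, 12, 7, tzinfo=timezone.utc)

print(is_session(with_time))  # False: 09:30 is not a midnight label
print(is_session(date_only))  # True
```

Under that assumption, a start_session carrying 09:30:00 is rejected even though 2020-12-07 itself is a trading day.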

Wei Chen

Dec 17, 2020, 5:51:30 AM
to Arnold Stevens, Zipline Python Opensource Backtester
OK, now I can ingest the minute data successfully. Simply change start_session and end_session as follows:

start_session = pd.Timestamp('2020-12-07', tz='utc')
end_session = pd.Timestamp('2020-12-15', tz='utc')  

This means the session bounds should be dates only, with no time of day, even for a minute bundle.
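
If the bounds come from timestamps that still carry a time of day, the time can be stripped before registering the bundle. The sketch below uses only the standard library, and the helper name `to_session_label` is made up for illustration. With pandas, `Timestamp.normalize()` resets the time to midnight in the same way.

```python
from datetime import datetime, timezone


def to_session_label(ts):
    """Drop the time of day so the timestamp sits at midnight."""
    return ts.replace(hour=0, minute=0, second=0, microsecond=0)


# The two timestamps from the original extension.py, time of day included.
start = to_session_label(datetime(2020, 12, 7, 9, 30, tzinfo=timezone.utc))
end = to_session_label(datetime(2020, 12, 15, 14, 5, 10, tzinfo=timezone.utc))

print(start.isoformat())  # 2020-12-07T00:00:00+00:00
print(end.isoformat())    # 2020-12-15T00:00:00+00:00
```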

Vincent Perkins

Jul 9, 2022, 11:25:16 PM
to Zipline Python Opensource Backtester
Thanks for supplying this example. In my extensions.py file I used only the date without the time:

#extensions.py
start_session = pd.Timestamp('2020-12-07', tz='utc')
end_session = pd.Timestamp('2020-12-15', tz='utc')

register(
    'custom-minute-bundle',   # What to call the new bundle
    csvdir_equities(
        ['minute'],  # Are these daily or minute bars
        'D:/AlgoDataLocal/Test',  # Directory where the formatted bar data is
    ),
    calendar_name='NYSE', # US equities default
    start_session=start_session,
    end_session=end_session
)

Also in your SPY.csv file the last row reads
2020-12-15 14:05:12 -5:00

I changed it to: 
2020-12-15 14:05:00 -5:00


I also removed the adjusted close column and changed column names to: 
date, open, high, low, close, volume

With these changes, the data worked after ingesting the bundle.
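
A rough sketch of those CSV cleanups using only the standard library. The input header names (`Datetime`, `Adj Close`, and so on) and the sample row are assumptions made up for illustration, so adjust them to match the actual file:

```python
import csv
import io
import re

# Assumed input header names; anything not listed here (such as an
# adjusted-close column) is dropped from the output.
RENAME = {"Datetime": "date", "Open": "open", "High": "high",
          "Low": "low", "Close": "close", "Volume": "volume"}
OUTPUT_COLUMNS = ["date", "open", "high", "low", "close", "volume"]


def zero_seconds(stamp):
    """Set the seconds of the first HH:MM:SS field in stamp to 00."""
    return re.sub(r"(\d{2}:\d{2}):\d{2}", r"\1:00", stamp, count=1)


def clean_minute_csv(text):
    """Rename columns, drop extras, and align timestamps to the minute."""
    reader = csv.DictReader(io.StringIO(text))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=OUTPUT_COLUMNS, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        cleaned = {RENAME[k]: v for k, v in row.items() if k in RENAME}
        cleaned["date"] = zero_seconds(cleaned["date"])
        writer.writerow(cleaned)
    return out.getvalue()


# Made-up sample row, for illustration only.
sample = (
    "Datetime,Open,High,Low,Close,Adj Close,Volume\n"
    "2020-12-15 14:05:12-05:00,368.1,368.2,368.0,368.1,368.1,1200\n"
)
print(clean_minute_csv(sample))
```

Only the first HH:MM:SS match is rewritten, so a trailing UTC offset such as -05:00 is left untouched.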

SPY.csv

Nam Mai

Aug 6, 2022, 11:07:44 AM
to Zipline Python Opensource Backtester
Thank you all very much. Great example.

At 04:25:16 UTC+1 on Sunday, July 10, 2022, vincent...@gmail.com wrote: