How to use data from a local csv file?


Albert Vonpupp

Nov 13, 2017, 7:53:12 PM
to Zipline Python Opensource Backtester
Hello,

As you might have guessed, I am new to zipline.

I am trying to read a local CSV file to run some tests. I couldn't find this in the documentation, so I googled it and found [1]. I tried pretty much what is written there, but it didn't work for me; I got the error "Period starts fall after period end". The fact that the TradingAlgorithm constructor does not use start and end leads me to think that the tutorial might be outdated.

I also searched the group and found a post that uses run_algorithm instead, so I ended up with the following code:

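# Load CSV
import pandas as pd
import pytz  # used for the timezone-aware start/end dates below
data = {}  # maps each symbol name to its DataFrame
data['btc_usdt'] = pd.read_csv('data/kraken-usd.csv', index_col=0, parse_dates=True)  # replaces deprecated from_csv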

# Run it
from datetime import datetime
from zipline.api import order, record, symbol
from zipline.algorithm import TradingAlgorithm
from zipline import run_algorithm
import strategies.true_koala as strategy

start = datetime(2015, 1, 1, 0, 0, 0, 0, pytz.utc)
end = datetime(2016, 1, 1, 0, 0, 0, 0, pytz.utc)

#algo_obj = TradingAlgorithm(initialize=strategy.initialize, handle_data=strategy.handle_data, start=start, end=end, capital_base = 100000.0)
# run_algorithm runs the backtest and returns the performance DataFrame
perf_manual = run_algorithm(
        start=start,
        end=end,
        initialize=strategy.initialize,
        handle_data=strategy.handle_data,
        # analyze = analyze,
        data=data,
        #data_frequency = 'minute',
        capital_base = 1e6 )

The initialize and handle_data functions are the following:

from zipline.api import order_target, record, symbol

def initialize(context):
    context.i = 0
    context.asset = symbol('btc_usdt')

def handle_data(context, data):
    # Skip first 21 days to get full windows
    context.i += 1
    if context.i < 21:
        return

    # Compute averages
    # data.history() with a single asset and a single field
    # returns a pandas Series, so .mean() yields a scalar.
    short_mavg = data.history(context.asset, 'price', bar_count=7, frequency="1d").mean()
    long_mavg = data.history(context.asset, 'price', bar_count=21, frequency="1d").mean()

    # Trading logic
    if short_mavg > long_mavg:
        # order_target orders as many shares as needed to
        # achieve the desired number of shares.
        order_target(context.asset, 100)
    elif short_mavg < long_mavg:
        order_target(context.asset, 0)

    # Save values for later inspection
    record(btc=data.current(context.asset, 'price'),
           short_mavg=short_mavg,
           long_mavg=long_mavg)

This is what the CSV file looks like:

Date,Open,High,Low,Close,Volume (BTC),Volume (Currency),Weighted Price
2017-11-13,5840.0,5879.8,5800.0,5848.2,13.04438313,75825.9992298,5812.92334594
2017-11-12,6330.0,6500.0,5464.0,5880.0,14495.1237039,86294051.5049,5953.31597494
2017-11-11,6567.1,6800.0,6200.0,6330.0,6931.899168,44788043.9663,6461.15052756
...
2014-01-09,825.56345,870.0,807.42084,841.86934,8.1583345,6784.24998189,831.57291257
2014-01-08,810.0,899.84281,788.0,824.98287,19.18275555,16097.3295835,839.156269366
2014-01-07,874.6704,892.06753,810.0,810.0,15.62237812,13151.4728443,841.835522304

I also read on the forum that I need to define some sort of 24/7 calendar.

With the rise of crypto I have seen several requests here related to it. Isn't there any simple minimal example to use with local CSV data of cryptocurrencies? If there isn't, it might be a good idea to include one. I can PR this simple example once it is working.

Thanks a lot.

Albert Vonpupp

Nov 13, 2017, 9:44:32 PM
to Zipline Python Opensource Backtester
I forgot to mention my current error hehe.

SymbolNotFound: Symbol 'BTC_USDT' was not found.

Could you please help me out?

Thanks a lot.

Albert Vonpupp

Nov 14, 2017, 12:10:49 PM
to Zipline Python Opensource Backtester
Hi,

I am making some progress using the csvdir bundle and reading through the forum. So far I have been able to:
- Modify the extensions
- Ingest the bundle

extensions:
from zipline.data.bundles import register
from zipline.data.bundles.csvdir import csvdir_equities

register('csvdir', csvdir_equities(['daily', 'minute']))


I am using jupyter as follows:

%%zipline -b csvdir --start 2016-1-1 --end 2017-1-1 --data-frequency daily

def initialize(context):
    context.i = 0
    context.asset = symbol('BTC_USD')

The handle_data function is the same as on my initial post. The rest of the code using run_algorithm has been removed.

Now I am getting the following error:

ValueError: SQLite file u'/home/av/.zipline/data/csvdir/2017-11-14T17;00;29.202350/assets-6.sqlite' doesn't exist.

I feel like I am a step closer, but I don't know what to do next.

Could anyone help me out please? Thanks!

Albert Vonpupp

Nov 14, 2017, 12:35:40 PM
to Zipline Python Opensource Backtester
I noticed that the ingest process had an error (which was at the very top of verbose output, so I missed it at first).

(.env) > $ CSVDIR=./csvdir/ zipline ingest -b csvdir
/home/av/.zipline/extension.py:4: UserWarning: Overwriting bundle with name 'csvdir'                                                                                               
  register('csvdir', csvdir_equities(['daily', 'minute']))                                                                                                                         
Loading custom pricing data:   [####################################]  100% | BTC_USD: sid 0                                                                                       
Merging daily equity files:  [------------------------------------]  0                                                                                                             
Traceback (most recent call last):                                                                                                                                                 
  File "/home/av/repos/ziplinetest/.env/bin/zipline", line 11, in <module>                                                                                                  
    load_entry_point('zipline==1.1.1+152.g18e4186f', 'console_scripts', 'zipline')()                                                                                               
  File "/home/av/repos/ziplinetest/.env/lib/python2.7/site-packages/click/core.py", line 722, in __call__                                                                   
    return self.main(*args, **kwargs)                                                                                                                                              
  File "/home/av/repos/ziplinetest/.env/lib/python2.7/site-packages/click/core.py", line 697, in main                                                                       
    rv = self.invoke(ctx)                                                                                                                                                          
  File "/home/av/repos/ziplinetest/.env/lib/python2.7/site-packages/click/core.py", line 1066, in invoke                                                                    
    return _process_result(sub_ctx.command.invoke(sub_ctx))                                                                                                                        
  File "/home/av/repos/ziplinetest/.env/lib/python2.7/site-packages/click/core.py", line 895, in invoke                                                                     
    return ctx.invoke(self.callback, **ctx.params)                                                                                                                                 
  File "/home/av/repos/ziplinetest/.env/lib/python2.7/site-packages/click/core.py", line 535, in invoke                                                                     
    return callback(*args, **kwargs)                                                                                                                                               
  File "/home/av/repos/ziplinetest/.env/lib/python2.7/site-packages/zipline/__main__.py", line 327, in ingest                                                               
    show_progress,                                                                                                                                                                 
  File "/home/av/repos/ziplinetest/.env/lib/python2.7/site-packages/zipline/data/bundles/core.py", line 451, in ingest                                                      
    pth.data_path([name, timestr], environ=environ),                                                                                                                               
  File "/home/av/repos/ziplinetest/.env/lib/python2.7/site-packages/zipline/data/bundles/csvdir.py", line 94, in ingest                                                     
    self.csvdir)                                                                                                                                                                   
  File "/home/av/repos/ziplinetest/.env/lib/python2.7/site-packages/zipline/data/bundles/csvdir.py", line 156, in csvdir_bundle                                             
    show_progress=show_progress)                                                                                                                                                   
  File "/home/av/repos/ziplinetest/.env/lib/python2.7/site-packages/zipline/data/us_equity_pricing.py", line 257, in write                                                  
    return self._write_internal(it, assets)                                                                                                                                        
  File "/home/av/repos/ziplinetest/.env/lib/python2.7/site-packages/zipline/data/us_equity_pricing.py", line 378, in _write_internal                                        
    ).difference(asset_sessions).tolist(),                                                                                                                                         
AssertionError: Got 1407 rows for daily bars table with first day=2014-01-07, last day=2017-11-13, expected 972 rows.                                                              
Missing sessions: []

Could it have something to do with the calendar?

I haven't yet figured out what role calendars play in zipline, nor how to create a 24/7 calendar.
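The row counts in the AssertionError above are consistent with a calendar mismatch. As a rough check (a sketch using only the standard library; the helper name count_days is hypothetical), counting every day between the first and last day in the error message gives the number of rows found, while counting only Monday to Friday gives a figure close to the expected one. The remaining gap down to 972 would presumably be exchange holidays on the default equities calendar, which is an assumption and not something stated in the error output.

```python
from datetime import date, timedelta

first_day = date(2014, 1, 7)    # first day reported in the error
last_day = date(2017, 11, 13)   # last day reported in the error


def count_days(first, last, weekdays_only=False):
    """Count the days in [first, last], optionally skipping Sat/Sun."""
    total = 0
    day = first
    while day <= last:
        if not weekdays_only or day.weekday() < 5:
            total += 1
        day += timedelta(days=1)
    return total


calendar_days = count_days(first_day, last_day)
weekdays = count_days(first_day, last_day, weekdays_only=True)

print(calendar_days)  # 1407, matching "Got 1407 rows"
print(weekdays)       # 1005, before removing any exchange holidays
```

A crypto CSV has one row per calendar day, so a calendar that only knows about weekday sessions will always expect fewer rows than the file contains.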

Vonpupp

Nov 14, 2017, 3:35:09 PM
to Zipline Python Opensource Backtester
I have progressed a bit, but I am still stuck.

This is my calendar (I think I got it from this group. There is another [1], slightly different):

class POLONIEXExchangeCalendar(TradingCalendar):
    """
    Exchange calendar for Poloniex US.

    Open Time: 12am, UTC
    Close Time: 11:59pm, UTC

    """
    @property
    def name(self):
        return "POLONIEX"

    @property
    def tz(self):
        return timezone("UTC")

    @property
    def open_time(self):
        return time(0, 0)

    @property
    def close_time(self):
        return time(23,59)

    @lazyval
    def day(self):
        return CustomBusinessDay(
            weekmask='Mon Tue Wed Thu Fri Sat Sun',
        )

This is how I register csvdir and the calendar:

register_calendar('POLONIEX',
                  POLONIEXExchangeCalendar(
                      start=Timestamp('2014-01-07', tz='UTC'),
                      end=Timestamp('2017-11-12', tz='UTC')
                      ))
register('custom-csvdir-bundle',
         csvdir_equities(["daily", "minute"],
                         '/home/av/repos/ziplinetest/csvdir'),
         calendar_name='POLONIEX')

The CSV file looks like:

Date,open,high,low,close,volume,Volume (Currency),Weighted Price
2017-11-13,5840.0,5879.8,5800.0,5848.2,13.04438313,75825.9992298,5812.92334594
...
2014-01-07,874.6704,892.06753,810.0,810.0,15.62237812,13151.4728443,841.835522304
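The header change shown above (the capitalized Kraken column names renamed to lowercase open, high, low, close, volume) can be scripted rather than edited by hand. The following is a minimal sketch using only the standard library. The helper name normalize_csv, the lowercase date column, the dropping of the two extra columns, and the oldest-first sort are all assumptions made for illustration, not requirements confirmed in this thread.

```python
import csv
import io

# Original Kraken column names mapped to the lowercase names used above.
# Columns not listed here are dropped (an assumption, see the lead-in).
RENAME = {
    "Date": "date",
    "Open": "open",
    "High": "high",
    "Low": "low",
    "Close": "close",
    "Volume (BTC)": "volume",
}


def normalize_csv(raw_text):
    """Return CSV text with renamed columns and rows sorted oldest-first."""
    reader = csv.DictReader(io.StringIO(raw_text))
    rows = [{new: row[old] for old, new in RENAME.items()} for row in reader]
    rows.sort(key=lambda r: r["date"])  # ISO dates sort correctly as strings
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=list(RENAME.values()),
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


sample = """Date,Open,High,Low,Close,Volume (BTC),Volume (Currency),Weighted Price
2017-11-13,5840.0,5879.8,5800.0,5848.2,13.04438313,75825.9992298,5812.92334594
2014-01-07,874.6704,892.06753,810.0,810.0,15.62237812,13151.4728443,841.835522304
"""

print(normalize_csv(sample))
```

For a real file, the text would be read from the downloaded CSV and the result written into the bundle directory, for example under a daily subfolder with one file per symbol (that layout is also an assumption here).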

When I try to ingest data with:

CSVDIR=./csvdir/ zipline ingest -b custom-csvdir-bundle

I get the following error:

IndexError: index 1406 is out of bounds for axis 0 with size 1406

Shouldn't this work? I believe I might be close to being able to backtest.
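One possible reading of the IndexError, which is not confirmed anywhere in this thread: the calendar registered above ends on 2017-11-12, while the newest row in the CSV is dated 2017-11-13. A quick count with the standard library (the helper name inclusive_days is hypothetical) shows that an every-day calendar over the registered range has 1406 sessions, whereas the CSV spans 1407 days, so the last row would land at zero-based index 1406 of an array of size 1406.

```python
from datetime import date


def inclusive_days(first, last):
    """Number of calendar days from first to last, counting both ends."""
    return (last - first).days + 1


# Range passed to register_calendar in the post above
calendar_sessions = inclusive_days(date(2014, 1, 7), date(2017, 11, 12))

# Range covered by the CSV rows shown above
csv_rows = inclusive_days(date(2014, 1, 7), date(2017, 11, 13))

print(calendar_sessions)  # 1406
print(csv_rows)           # 1407
```

If that reading is right, extending the calendar's end date to cover the last CSV row, or trimming the CSV to the calendar range, would make the two counts agree.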

I would also like to take the opportunity to ask a question. Let's assume I have daily CSV files for two symbols, BTC and LTC, with different start and end dates. Will I need different calendars? I didn't think calendars needed a start and end date, but apparently they do, so if the dates differ, my best guess is that I would need different calendars. I would also need different CSV folders, right?

Thanks a lot.

[1]: https://stackoverflow.com/questions/45257823/how-to-use-a-custom-calendar-in-a-custom-zipline-bundle

Vonpupp

Nov 15, 2017, 3:59:06 PM
to Zipline Python Opensource Backtester
Finally I was able to ingest the data =)

Now my problem is that when I try to run the algorithm, I get:

Error: Invalid value for "--trading-calendar": invalid choice: POLONIEX. (choose from BMF, CFE, CME, ICE, LSE, NYSE, TSX, us_futures)

I think that my calendar is not getting properly registered. To try to do so I am using:

register_calendar(
    'TWENTYFOURSEVEN',
    TwentyFourSevenCalendar(
        start=start_session,
        end=end_session
    )
)

Any help please?

Thanks a lot.

fva...@quantopian.com

Nov 24, 2017, 4:48:34 AM
to Zipline Python Opensource Backtester
Hi Vonpupp,

While I look into this, can you also try aliasing your calendar? 

def register_calendar_alias(self, alias, real_name, force=False):

And alias it as 'NYSE'?

ahmet can acar

Dec 26, 2017, 5:58:53 AM
to Zipline Python Opensource Backtester
Hey Vonpupp, I have been trying to deal with the same issue for a while, and I realized that I ran into the same problem as you. Did you solve all these problems, or are they still ongoing?

On Wednesday, November 15, 2017 at 11:59:06 PM UTC+3, Vonpupp wrote:
Message has been deleted

Ed Bartosh

Jan 30, 2018, 9:15:19 AM
to Tadukas, Zipline Python Opensource Backtester
Hi,

This looks like a calendar issue to me.
Can you show your CSV file, ~/.zipline/extension.py, and the calendar code?

Regards,
Ed


2018-01-30 13:36 GMT+00:00 Tadukas <seniu...@gmail.com>:
Hello!

Same problem here.

I'm trying to ingest some btc data using csvdir bundle and I keep getting this error:

AssertionError: Got 2330 rows for daily bars table with first day=2011-09-13, last day=2018-01-29, expected 1605 rows.

It looks like a calendar issue, so I registered POLONIEX calendar I found on this group but it didn't help.

Any ideas how to solve this? Ed Bartosh? Thanks!

--
You received this message because you are subscribed to the Google Groups "Zipline Python Opensource Backtester" group.
To unsubscribe from this group and stop receiving emails from it, send an email to zipline+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.



--
BR,
Ed

Tad

Jan 30, 2018, 9:15:37 AM
to Zipline Python Opensource Backtester
Hello!

Same problem here. I'm trying to ingest some btc data using csvdir bundle (zipline ingest -b csvdir) and I keep getting this error:

AssertionError: Got 2330 rows for daily bars table with first day=2011-09-13, last day=2018-01-29, expected 1605 rows.

Also, current version of zipline on github doesn't have extensions.py file (should it?), so I created one:
register(
    'crypto-bundle',
    csvdir_equities(
        ['daily'],
        '/Users/tadukas/dev/Zipline/csvdir',
    ),
    calendar_name='POLONIEX'
)

But it looks like it doesn't work, because when I try to ingest 'crypto-bundle' I get an error saying that there is no bundle registered with the name 'crypto-bundle'

Any ideas how to solve this? Ed Bartosh? Thanks!

Tad

Jan 30, 2018, 9:17:24 AM
to Zipline Python Opensource Backtester
Oh, I can't edit old message. CSV attached. Thanks, ED!
BTC_USD.csv

Ed Bartosh

Jan 30, 2018, 9:18:27 AM
to Tad, Zipline Python Opensource Backtester
CSV looks ok. Please provide the rest of the files I've asked for.


Tad

Jan 30, 2018, 9:22:47 AM
to Zipline Python Opensource Backtester
There is no extensions.py file on current version of zipline on github, so I created one. 
exchange_calendar_poloniex.py
extensions.py

Ed Bartosh

Jan 30, 2018, 9:56:20 AM
to Tad, Zipline Python Opensource Backtester
My observations so far:

1. Your calendar code doesn't import CustomBusinessDay, but uses it.
2. Your extension.py doesn't register POLONIEX calendar, but uses it.

Can you show me where you put extension.py and how you run ingest?
Which zipline version are you using?

Regards,
Ed



Tad

Jan 30, 2018, 10:34:37 AM
to Zipline Python Opensource Backtester
Thanks for taking a look at it.

- My calendar code actually does have CustomBusinessDay import. 
- extension.py file is in /Users/tad/.zipline/extension.py
- I'm using latest master zipline version on github.

I added this to extensions.py to register calendar:
register_calendar(
    'POLONIEX',
    POLONIEXExchangeCalendar(
        start=Timestamp('2011-01-01', tz='UTC'),
        end=Timestamp('2018-01-29', tz='UTC')
    )
)
and now I'm getting these errors:

UserWarning: Failed to load extension: '/Users/tad/.zipline/extension.py' 
name 'POLONIEXExchangeCalendar' is not defined

zipline.errors.InvalidCalendarName: The requested TradingCalendar, POLONIEX, does not exist.

Which directory should the calendar file be in? I put it in the zipline/utils/calendars directory.

Thanks again!

Ed Bartosh

Jan 30, 2018, 10:39:29 AM
to Tad, Zipline Python Opensource Backtester
> My calendar code actually does have CustomBusinessDay import.

Sorry, my bad. Overlooked it.

> extension.py file is in /Users/tad/.zipline/extension.py

Yes, it's where it should be.

> name 'POLONIEXExchangeCalendar' is not defined

Did you import it? You probably didn't if you added only the code you've shown.

Regards,
Ed



Tad

Jan 30, 2018, 10:53:14 AM
to Zipline Python Opensource Backtester
Yup, you're right - I didn't import it. Stupid mistake. I finally managed to ingest the data!

Double thanks to Ed. For creating csvdir and for being so helpful here :)
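
Putting the pieces from this thread together, an extension.py along the following lines is one way the calendar and bundle registration could look. It is a sketch, not a verified configuration: the import paths assume the zipline 1.1.x package layout and may differ in other releases, the calendar class is the one posted earlier in the thread (defined inline here so that no separate import of it is needed), and the dates and CSV path are the ones Tad posted. It needs zipline installed and is meant to live at ~/.zipline/extension.py rather than to be run on its own.

```python
# ~/.zipline/extension.py (sketch; import paths assume zipline 1.1.x)
from datetime import time

from pandas import Timestamp
from pandas.tseries.offsets import CustomBusinessDay
from pytz import timezone

from zipline.data.bundles import register
from zipline.data.bundles.csvdir import csvdir_equities
from zipline.utils.calendars import TradingCalendar, register_calendar
from zipline.utils.memoize import lazyval


class POLONIEXExchangeCalendar(TradingCalendar):
    """Calendar with a session on every day of the week, in UTC."""

    @property
    def name(self):
        return "POLONIEX"

    @property
    def tz(self):
        return timezone("UTC")

    @property
    def open_time(self):
        return time(0, 0)

    @property
    def close_time(self):
        return time(23, 59)

    @lazyval
    def day(self):
        # Treat all seven days as business days
        return CustomBusinessDay(weekmask='Mon Tue Wed Thu Fri Sat Sun')


# The calendar has to be registered before a bundle refers to it by name.
register_calendar(
    'POLONIEX',
    POLONIEXExchangeCalendar(
        start=Timestamp('2011-01-01', tz='UTC'),
        end=Timestamp('2018-01-29', tz='UTC'),
    ),
)

register(
    'crypto-bundle',
    csvdir_equities(['daily'], '/Users/tadukas/dev/Zipline/csvdir'),
    calendar_name='POLONIEX',
)
```

With a file like this in place, the bundle would be ingested by name, as in `zipline ingest -b crypto-bundle`. If the calendar class is kept in a separate module instead, it has to be imported in extension.py before register_calendar is called, which is the mistake identified above.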