Pandas DataReader start date always begins at the first data point available on yahoo despite user specified start date

s

Nov 27, 2011, 4:26:45 PM
to pystat...@googlegroups.com
from pandas.io.data import DataReader
import datetime
import matplotlib.pyplot as plt

sp500 = DataReader("SPY", "yahoo", start=datetime.datetime(2005, 1, 1))
sp500.info()
sp500["Adj Close"].plot()
plt.show()

<class 'pandas.core.frame.DataFrame'>
Index: 4744 entries, 1993-01-29 00:00:00 to 2011-11-25 00:00:00
Data columns:
Open         4744  non-null values
High         4744  non-null values
Low          4744  non-null values
Close        4744  non-null values
Volume       4744  non-null values
Adj Close    4744  non-null values
dtypes: int64(1), float64(5)

As you can see, the index begins at 1993-01-29 despite a user-specified
start date of 2005-01-01. This happens with any ticker: the start date
passed to DataReader seems to be ignored.

I welcome your thoughts. Appreciate it greatly.

Sp

Wes McKinney

Nov 27, 2011, 4:35:43 PM
to pystat...@googlegroups.com

The Yahoo API is very unreliable in my experience. The code currently
returns whatever data the query returns, so something must have
changed recently on Yahoo's end. I guess it should truncate to the
passed date range regardless of what is returned.

- Wes
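
Editor's sketch: the explicit truncation suggested above could look roughly like the following. This is only an illustration, not the code that went into pandas. The helper name truncate_to_range is hypothetical, and the frame is synthetic (business days with a dummy "Adj Close" column) so the example runs without querying Yahoo.

```python
import datetime

import pandas as pd


def truncate_to_range(frame, start=None, end=None):
    """Keep only the rows whose index label lies within [start, end].

    Hypothetical helper: it clips whatever the data source returned
    to the date range the caller actually asked for.
    """
    # truncate() needs a sorted index, so sort defensively first.
    return frame.sort_index().truncate(before=start, after=end)


# Synthetic stand-in for a response that ignores the requested start date.
index = pd.date_range("1993-01-29", "2011-11-25", freq="B")
raw = pd.DataFrame({"Adj Close": range(len(index))}, index=index)

clipped = truncate_to_range(raw, start=datetime.datetime(2005, 1, 1))
print(clipped.index[0], clipped.index[-1])
```

Passing None for either bound leaves that side of the range open, so the same helper covers a start-only or end-only request.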

s

Nov 27, 2011, 6:51:37 PM
to pystat...@googlegroups.com
Thanks Wes. So I truncated by using:

tmp = sp500.truncate(datetime.datetime(2005,1,1), datetime.datetime(2011,11,25)).copy()
tmp.info()
tmp["Adj Close"].plot()
plt.show()

<class 'pandas.core.frame.DataFrame'>
Index: 1739 entries, 2005-01-03 00:00:00 to 2011-11-25 00:00:00
Data columns:
Open         1739  non-null values
High         1739  non-null values
Low          1739  non-null values
Close        1739  non-null values
Volume       1739  non-null values
Adj Close    1739  non-null values
dtypes: int64(1), float64(5)

Although it's strange that it wasn't truncated in the initial call.

Wes McKinney

Nov 27, 2011, 7:19:07 PM
to pystat...@googlegroups.com
Like I said, the function returns exactly the results of the query to Yahoo. If you create a GitHub issue or make a pull request to insert the explicit truncation, that'd be great.

s

Nov 27, 2011, 7:25:02 PM
to pystat...@googlegroups.com
On 11/27/11 7:19 PM, Wes McKinney wrote:
> Like I said the function returns exactly the results of the query to Yahoo.
> If you create a GitHub issue or make a pull request to insert the explicit
> truncation, that'd be great.
No problem Wes. Please see issue #419:

https://github.com/wesm/pandas/issues/419
