Hi Bryan,
On 07/05/2012 20:59, Bryan Wyatt wrote:
> Hopefully this is the right contact method. If not my apologies.
The correct way is always the mailing list ;-)
> I am working with XLRD trying to get it to open up sheets from
>
> http://www.eia.gov/totalenergy/data/monthly/
>
> eg:
>
> http://www.eia.gov/totalenergy/data/monthly/query/mer_data_excel.asp?table=T01.01
>
> I keep getting the following error:
> xlrd.biffh.XLRDError: Unsupported format, or corrupt file: Expected BOF
> record; found '<html><h'
That should give you a hint: that URL is returning HTML, not an Excel file.
Reading that HTML may tell you what's gone wrong.
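
A quick way to confirm what you were sent is to look at the first few bytes before handing them to xlrd. This is only a rough sketch: sniff_payload and its return labels are names I've made up, and it assumes you already have the downloaded response in memory as bytes. The two magic-byte signatures are the standard ones for OLE2 (.xls) and ZIP (.xlsx) containers.

```python
# Signatures of the two container formats Excel files normally use.
OLE2_MAGIC = b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"  # legacy .xls (OLE2 compound file)
ZIP_MAGIC = b"PK\x03\x04"                          # .xlsx is a ZIP archive


def sniff_payload(data):
    """Guess what a downloaded payload is from its leading bytes.

    Hypothetical helper: the labels returned are for illustration only.
    """
    if data.startswith(OLE2_MAGIC):
        return "xls"
    if data.startswith(ZIP_MAGIC):
        return "xlsx-or-zip"
    # Markup may be preceded by whitespace and use any letter case.
    head = data.lstrip()[:64].lower()
    if head.startswith((b"<html", b"<!doctype", b"<table", b"<?xml")):
        return "markup"
    return "unknown"
```

If this reports "markup", printing the first couple of hundred bytes of the payload is usually enough to see whether it is an error page or actual data. "unknown" does not necessarily mean the file is bad, only that this sketch does not recognise it.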
It may also be that eia.gov use a rather crappy technique that serves up
HTML in a way that Excel opens as a spreadsheet. If that's what's going on,
you want an HTML parser such as BeautifulSoup to extract the data, not xlrd.
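
To give an idea of what that extraction looks like, here is a minimal sketch that pulls table rows out of HTML. Note that it uses the standard library's html.parser rather than BeautifulSoup, purely so it runs without installing anything. TableRows and extract_rows are names I've invented, and I haven't checked this against what eia.gov actually returns, so treat the table layout as an assumption.

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the text of each <td>/<th> cell, grouped by <tr>.

    Illustrative only: nested tables and rowspan/colspan are not handled.
    """

    def __init__(self):
        super().__init__()
        self.rows = []      # completed rows, each a list of cell strings
        self._row = None    # cells of the row being read, or None
        self._cell = None   # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            if self._cell is not None and self._row is not None:
                self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr":
            if self._row is not None:
                self.rows.append(self._row)
            self._row = None
            self._cell = None


def extract_rows(html_text):
    """Return a list of rows, each a list of stripped cell strings."""
    parser = TableRows()
    parser.feed(html_text)
    parser.close()
    return parser.rows
```

Usage would be along the lines of rows = extract_rows(payload.decode("utf-8", "replace")), after which the cell strings still need converting to numbers or dates yourself. With BeautifulSoup the same job is a couple of find_all calls and copes better with messy markup.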
cheers,
Chris
--
Simplistix - Content Management, Batch Processing & Python Consulting
           - http://www.simplistix.co.uk