XLRD Error: Unsupported format, or corrupt file

Prashanth Rao

Sep 23, 2013, 9:46:52 PM
to python...@googlegroups.com
Hi,

I'm sure this has been asked before, but I'm using xlrd 0.9.2 (that I installed using easy_install) along with Python 2.7.3. I'm trying to read a .xlsx file, and I get the error as shown in the traceback below:

Traceback (most recent call last):
  File "C:\Users\pprao\Python_files\readXL.py", line 175, in <module>
    book = xlrd.open_workbook("master_spreadsheet.xlsx")
  File "C:\Python27\lib\site-packages\xlrd\__init__.py", line 429, in open_workbook
    biff_version = bk.getbof(XL_WORKBOOK_GLOBALS)
  File "C:\Python27\lib\site-packages\xlrd\__init__.py", line 1545, in getbof
    bof_error('Expected BOF record; found %r' % self.mem[savpos:savpos+8])
  File "C:\Python27\lib\site-packages\xlrd\__init__.py", line 1539, in bof_error
    raise XLRDError('Unsupported format, or corrupt file: ' + msg)
xlrd.biffh.XLRDError: Unsupported format, or corrupt file: Expected BOF record; found 'PK\x03\x04\x14\x00\x06\x00'

The strange part is that I've managed to read this file on a Windows 7 machine with Excel 2007 with no issues whatsoever. When I try to read the same file on a different machine (Windows 8, with Excel 2010, product version attached), I get the above error. I'd really like to know more about the following:

- Why does this error occur on only one machine (with Excel 2010 installed) and not the other one (with Excel 2007 installed)? 
- Is there some fundamental difference in the .xlsx formats for Excel 2007 and 2010? Can xlrd 0.9.2 read .xlsx files from either version?
- Is there anything I can do to make this file readable on multiple machines, with different versions of Office installed?

Any ideas on the above would be great. Thanks in advance.


office_2010_version.PNG

John Machin

Sep 24, 2013, 7:25:57 AM
to python...@googlegroups.com


On Tuesday, September 24, 2013 11:46:52 AM UTC+10, Prashanth Rao wrote:
Hi,

I'm sure this has been asked before, but I'm using xlrd 0.9.2 (that I installed using easy_install) along with Python 2.7.3. I'm trying to read a .xlsx file, and I get the error as shown in the traceback below:

Traceback (most recent call last):
  File "C:\Users\pprao\Python_files\readXL.py", line 175, in <module>
    book = xlrd.open_workbook("master_spreadsheet.xlsx")
  File "C:\Python27\lib\site-packages\xlrd\__init__.py", line 429, in open_workbook
    biff_version = bk.getbof(XL_WORKBOOK_GLOBALS)
  File "C:\Python27\lib\site-packages\xlrd\__init__.py", line 1545, in getbof
    bof_error('Expected BOF record; found %r' % self.mem[savpos:savpos+8])
  File "C:\Python27\lib\site-packages\xlrd\__init__.py", line 1539, in bof_error
    raise XLRDError('Unsupported format, or corrupt file: ' + msg)
xlrd.biffh.XLRDError: Unsupported format, or corrupt file: Expected BOF record; found 'PK\x03\x04\x14\x00\x06\x00'

This is symptomatic of xlrd 0.7.x or earlier when fed an xlsx file.
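
The 'PK\x03\x04' bytes in the error message are the signature of a ZIP archive, which is the container an xlsx file uses, so the file itself looks like a normal xlsx rather than a corrupt one. As a rough sketch (the function name and the returned labels are made up for illustration and are not part of xlrd), the leading bytes of a file can be checked without xlrd at all:

```python
# Illustrative helper, not part of xlrd: guess a spreadsheet's container
# format from its first few bytes.
ZIP_MAGIC = b"PK\x03\x04"                          # ZIP archive (xlsx)
OLE2_MAGIC = b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"   # OLE2 compound file (xls)


def sniff_format(first_bytes):
    """Return a short label for the container the bytes appear to start."""
    if first_bytes.startswith(ZIP_MAGIC):
        return "zip (likely xlsx)"
    if first_bytes.startswith(OLE2_MAGIC):
        return "ole2 (likely xls)"
    return "unknown"


def sniff_file(path):
    """Read the first 8 bytes of the file at `path` and classify them."""
    with open(path, "rb") as handle:
        return sniff_format(handle.read(8))


# The 8 bytes quoted in the traceback above:
print(sniff_format(b"PK\x03\x04\x14\x00\x06\x00"))
```

Calling sniff_file("master_spreadsheet.xlsx") on both machines would show whether the two copies of the file start with the same signature.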
 

The strange part is that I've managed to read this file on a Windows 7 machine with Excel 2007 with no issues whatsoever. When I try to read the same file on a different machine (Windows 8, with Excel 2010, product version attached), I get the above error. I'd really like to know more about the following:

- Why does this error occur on only one machine (with Excel 2010 installed) and not the other one (with Excel 2007 installed)? 
- Is there some fundamental difference in the .xlsx formats for Excel 2007 and 2010? Can xlrd 0.9.2 read .xlsx files from either version?
- Is there anything I can do to make this file readable on multiple machines, with different versions of Office installed?

Any ideas on the above would be great. Thanks in advance.

xlrd doesn't care what version of Office you have installed (or none), and doesn't care what wrote the xlsx file.

There is a high probability that the failing machine has an old version of xlrd installed somewhere, and you are getting that version instead of 0.9.2.

Immediately before the line that reads

    book = xlrd.open_workbook("master_spreadsheet.xlsx")

insert this line:

    print xlrd.__VERSION__, xlrd.__file__

The second item printed is the path from which xlrd is loaded.
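
A slightly longer version of the same check, as a sketch only: it prints which xlrd the interpreter actually imports and flags a copy in the range described above (0.7.x or earlier). The helper names are hypothetical; the only xlrd attributes used are `__VERSION__` and `__file__`, and the code is written to run on Python 2.7 as well as Python 3.

```python
def parse_version(text):
    """Turn '0.9.2' into (0, 9, 2); non-numeric characters are ignored."""
    parts = []
    for piece in text.split("."):
        digits = "".join(ch for ch in piece if ch.isdigit())
        if digits:
            parts.append(int(digits))
    return tuple(parts)


def predates_xlsx_support(version_text):
    """True for 0.7.x or earlier, the range the reply above calls out."""
    return parse_version(version_text) < (0, 8)


def describe_xlrd():
    """Return (version, path) of the xlrd that gets imported, or None."""
    try:
        import xlrd
    except ImportError:
        return None
    return getattr(xlrd, "__VERSION__", None), xlrd.__file__


info = describe_xlrd()
if info is None:
    print("xlrd is not importable from this interpreter")
else:
    version, path = info
    print("xlrd %s loaded from %s" % (version, path))
    if version and predates_xlsx_support(version):
        print("this copy is 0.7.x or earlier and will fail on xlsx files")
```

If the path printed is not the one easy_install wrote to, a stale copy earlier on sys.path is shadowing 0.9.2.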

HTH,
John