[pyxl] XLRDError found d0 cf 11


Marc Fargas

May 20, 2010, 7:35:54 AM
to python-excel
Hi all,

I'm messing around with xlrd. I've got some Excel 2003 files, but xlrd
refuses to read them. (Note: at least one of them was previously read
by xlrd on Linux.)

All the files I have, when opened in a hex editor, start with "d0 cf
11 e0 a1 b1 1a e1". As far as I know this is correct, but it seems
it isn't.

When I call open_workbook() I get:

37. book = xlrd.open_workbook(file_contents=data.read(), verbosity=5)
File "...\xlrd\__init__.py" in open_workbook
429. biff_version = bk.getbof(XL_WORKBOOK_GLOBALS)
File "...\xlrd\__init__.py" in getbof
1545. bof_error('Expected BOF record; found %r' % self.mem[savpos:savpos+8])
File "...\xlrd\__init__.py" in bof_error
1539. raise XLRDError('Unsupported format, or corrupt file: ' + msg)

Exception Type: XLRDError at /.../
Exception Value: Unsupported format, or corrupt file: Expected BOF record; found '\xd0\xcf\x11\xe0\xa1\xb1'

I'm using xlrd 0.7.1 with Python 2.6 on Windows 7. The last time I tried
(and it worked) I was on Linux with Python 2.5 (maybe that's the
problem?).

Thanks,
Marc

--
You received this message because you are subscribed to the Google Groups "python-excel" group.
To post to this group, send an email to python...@googlegroups.com.
To unsubscribe from this group, send email to python-excel...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/python-excel?hl=en-GB.

Chris Withers

May 20, 2010, 8:17:37 AM
to python...@googlegroups.com
Marc Fargas wrote:
> I'm messing around with XLRD. I've got some Excel 2003 files

What application was used to save the problematic files?

Chris

--
Simplistix - Content Management, Batch Processing & Python Consulting
- http://www.simplistix.co.uk

Marc Fargas

May 20, 2010, 8:29:37 AM
to python-excel
Hi Chris,

On May 20, 2:17 pm, Chris Withers <ch...@simplistix.co.uk> wrote:
> What application was used to save the problematic files?

They were generated by a proprietary application (accounting
software). I tried loading and re-saving them from Office 2007, Office
2003 and OpenOffice 3.2, in both Excel 2003 and 5.0/95 formats.

The saved files always have the same header.

I have also tried Python 2.5 to make sure the problem was not
introduced by Python 2.6, but I get the same error.

Thanks for the quick reply!
Marc

John Machin

May 20, 2010, 8:54:38 AM
to python...@googlegroups.com
On 20/05/2010 9:35 PM, Marc Fargas wrote:
> Hi all,
>
> I'm messing around with XLRD. I've got some Excel 2003 files but xlrd
> refused to read them. (Note: atleast one of them was previously read
> on xlrd in Linux).
>
> All the files I have, when opened in an HEX editor start with: "d0 cf
> 11 e0 a1 b1 1a e1" and as far as I know this is correct, but it seems
> it isn't.

It is the correct OLE2 signature.

>
> When I call open_workbook() I get:
>
> 37. book = xlrd.open_workbook(file_contents=data.read(),
> verbosity=5)
> File "...\xlrd\__init__.py" in open_workbook
> 429. biff_version = bk.getbof(XL_WORKBOOK_GLOBALS)
> File "...\xlrd\__init__.py" in getbof
> 1545. bof_error('Expected BOF record; found %r' %
> self.mem[savpos:savpos+8])
> File "...\xlrd\__init__.py" in bof_error
> 1539. raise XLRDError('Unsupported format, or corrupt
> file: ' + msg)
>
> Exception Type: XLRDError at /.../
> Exception Value: Unsupported format, or corrupt file: Expected BOF
> record; found '\xd0\xcf\x11\xe0\xa1\xb1'

Assuming the above line is an exact copy/paste of what you have got,
then xlrd is getting only the first 6 bytes of the file. I suspect that
Python mmap is not working with Windows 7.

(1) try open_workbook() with mmap disabled.
(2) if that doesn't work, send me a copy of the file.
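
A rough sketch of both checks, in case it helps. The helper names `is_truncated` and `open_without_mmap` are made up for illustration; `use_mmap` is the `open_workbook()` keyword that turns memory-mapping off. The first helper compares the number of bytes your own code read against the file's size on disk, which would show whether the data is already cut short before xlrd sees it.

```python
import os


def is_truncated(data, path):
    """Return True if `data` holds fewer bytes than the file on disk."""
    return len(data) < os.path.getsize(path)


def open_without_mmap(path):
    """Open a workbook by filename with memory-mapping disabled."""
    import xlrd  # imported lazily so the size check works without xlrd installed
    return xlrd.open_workbook(path, use_mmap=False)
```

If `is_truncated()` reports True for the bytes you pass in, the problem is in how the file was read rather than in xlrd or mmap.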

Jon Clements

May 20, 2010, 9:15:01 AM
to python...@googlegroups.com

Possibly another option:

book = xlrd.open_workbook(file_contents=data.read(), verbosity=5)

If the data object passed via file_contents was not opened in binary mode, then yes, on Windows a .read() gives you only the first 6 bytes.

Maybe just pass the filename as a string instead of using file_contents?

Cheers,

Jon.
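
To spell out why it would be exactly 6 bytes: the seventh byte of the OLE2 signature is 0x1a (Ctrl-Z), which a text-mode read on Windows treats as end-of-file. Below is a minimal sketch under that assumption. `simulate_text_mode_read`, `read_binary` and `open_book` are illustrative names, not part of xlrd, and the simulation only mimics the Ctrl-Z cut-off (it ignores newline translation).

```python
# Magic bytes at the start of every OLE2 compound file (.xls included).
OLE2_SIGNATURE = b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"


def simulate_text_mode_read(raw):
    """Mimic a Windows text-mode read, which stops at the first Ctrl-Z (0x1a)."""
    return raw.split(b"\x1a", 1)[0]


def read_binary(path):
    """Read the whole file in binary mode so that no byte is treated specially."""
    with open(path, "rb") as f:
        return f.read()


def open_book(path):
    """Hand xlrd the complete file contents, read in binary mode."""
    import xlrd  # imported lazily; the helpers above do not need it
    return xlrd.open_workbook(file_contents=read_binary(path))
```

Passing the filename straight to `xlrd.open_workbook()` avoids the issue as well, since xlrd then opens the file itself.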
 




Marc Fargas

May 20, 2010, 9:58:34 AM
to python-excel
First of all, thanks a lot to all of you.

> Assuming the above line is an exact copy/paste of what you have got, then
> xlrd is getting only the first 6 bytes of the file. I suspect that Python
> mmap is not working with Windows 7.
>
> > (1) try open_workbook() with mmap disabled.
> > (2) if that doesn't work, send me a copy of the file.

Disabling mmap did nothing.

> book = xlrd.open_workbook(file_contents=data.read(), verbosity=5)

This was the cause: using filename=XXX instead of file_contents
worked.

FYI "data" was: data = open(filename), i.e. opened without binary mode.

I'll leave it this way.

Thanks again,