Hi Frank,
Either my understanding of what a valid OLE2 Compound Document file
should look like is wrong in a corner case, or your file is corrupt, or
the truth lies somewhere in between (your file is idiosyncratic but not
irredeemable, in which case xlrd should emit a warning and keep going).
Please answer the following questions:
* What platform?
* Python 2.5.x ... what is x?
* What software created the file?
* Is the file one of many from the same source?
* Do you have this problem with all files from that source, or only this
one file?
* Before raising AssertionError, did xlrd output any warning messages?
* What is the outcome when you try to open this file with Excel?
OpenOffice.org Calc? Gnumeric? [error message? data missing? what versions?]
If possible, e-mail me a zipped-up copy of the smallest file that has
this problem (if more than say 1 MB, make it available for downloading).
I promise not to disclose its contents. If you need a more formal
non-disclosure agreement, please supply the text. Please *don't* upload
the file to this group's file section. If you can't send me a file, I'll
need to send you an OLE dump script; consequently debugging (and testing
a fix, if one is possible) could become a slow tennis match.
Cheers,
John
Try opening the file in a text editor to check it's really binary and
not just html served with a .xls file extension.
This is an annoying trick often used by web apps where a decent library
such as xlwt is not available.
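A quick way to do that check without a text editor is to look at the
first few bytes. This is only a rough sketch: the function name and the
list of markup prefixes are my own guesses, not anything xlrd does.

```python
# The 8-byte signature found at the start of every OLE2 Compound Document.
OLE2_MAGIC = b"\xD0\xCF\x11\xE0\xA1\xB1\x1A\xE1"

# Assumption: prefixes that commonly open HTML/XML masquerading as .xls.
MARKUP_PREFIXES = (b"<html", b"<!doctype", b"<table", b"<?xml")


def sniff_xls(data):
    """Guess what a '.xls' file really is from its first bytes.

    Returns "ole2", "markup" or "unknown". Hypothetical helper.
    """
    if data.startswith(OLE2_MAGIC):
        return "ole2"
    head = data[:512]
    if head.startswith(b"\xef\xbb\xbf"):  # skip a UTF-8 byte-order mark
        head = head[3:]
    head = head.lstrip().lower()
    if head.startswith(MARKUP_PREFIXES):
        return "markup"
    return "unknown"
```

Feed it the first 512 bytes of the file, read in binary mode. A result
of "markup" suggests HTML or XML saved with a .xls extension.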
cheers,
Chris
--
Simplistix - Content Management, Zope & Python Consulting
- http://www.simplistix.co.uk
If that were the problem, it would have given quite a different message
("Unsupported format, or corrupt file: <further details>"). To get to
the point where it raised that assertion error, it has passed several
hurdles:
* first 8 bytes of file contain the OLE2 Compound Document magic cookie
* has correct little-endian flag
* sector sizes not ludicrous
* first part of master sector allocation table not noticeably stuffed
* etc
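For anyone who wants to apply the first few of those checks to their own
file, here is a rough sketch. The field offsets follow the published
compound-document header layout as I understand it; the function name
and the "ludicrous" threshold are illustrative assumptions, not xlrd's
actual code.

```python
import struct

OLE2_MAGIC = b"\xD0\xCF\x11\xE0\xA1\xB1\x1A\xE1"
MAX_SANE_SHIFT = 20  # assumption: anything above 2**20-byte sectors is suspect


def check_ole2_header(header):
    """Return a list of problems found in the first 34+ bytes of a file."""
    if len(header) < 34:
        return ["header too short"]
    problems = []
    if header[:8] != OLE2_MAGIC:
        problems.append("missing OLE2 magic cookie")
    if header[28:30] != b"\xFE\xFF":
        problems.append("missing little-endian marker")
    # Sector sizes are stored as powers of two (typically 9 and 6).
    ssz, sssz = struct.unpack_from("<HH", header, 30)
    if ssz > MAX_SANE_SHIFT:
        problems.append("sector size 2**%d looks ludicrous" % ssz)
    if sssz > ssz:
        problems.append("short sector size exceeds sector size")
    return problems
```

An empty list means the header cleared these particular hurdles; it
says nothing about the allocation tables or directory that come later.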
> On Mon, Nov 3, 2008 at 9:59 PM, John Machin wrote:
>> If possible, e-mail me a zipped-up copy of the smallest file that has
>> this problem (if more than say 1 MB, make it available for downloading).
> will do, the files are small (<100Kb), no NDA needed.
Here's an update for the group:
Problem was that the unknown creating software was writing -1 (FREESID
i.e. a free sector) instead of -2 (EOCSID i.e. an end-of-chain marker)
for the first_SID when the SSCS (short-stream container stream) was
empty. Not having EOCSID caused an assertion failure in _get_stream.
Solution: Avoid calling _get_stream in any case when the SSCS appears to
be empty (i.e. size == 0 and first_SID is negative).
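To illustrate the idea (this is a simplified sketch, not the code that
went into svn; read_chain and load_sscs are made-up names, and the
sector allocation table is modelled as a plain list):

```python
EOCSID = -2   # end-of-chain marker
FREESID = -1  # free sector


def read_chain(sat, start_sid):
    """Follow a sector chain through the allocation table `sat`."""
    chain = []
    sid = start_sid
    while sid >= 0:
        chain.append(sid)
        sid = sat[sid]
    # A well-formed chain must stop on EOCSID, nothing else.
    if sid != EOCSID:
        raise ValueError("chain ended with %d, expected EOCSID" % sid)
    return chain


def load_sscs(sat, size, first_sid):
    """Return the SSCS sector chain, tolerating an empty container."""
    if size == 0 and first_sid < 0:
        # Empty SSCS: don't follow the chain at all, whichever
        # negative marker the writing software chose.
        return []
    return read_chain(sat, first_sid)
```

With this guard, a file that records FREESID for an empty SSCS is
treated the same as one that records EOCSID, while a non-empty stream
still gets the strict end-of-chain check.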
svn updated. Frank happy. Case closed.
Cheers,
John