[xlrd] AssertionError parsing an xls file


Alejandro Recarey

May 25, 2012, 2:51:58 PM
to python...@googlegroups.com
Hi all and thanks for taking the time to read this.

I have an Excel workbook that always gives me the following error when
I try to open it with xlrd version 0.7.7:


>>> import xlrd
>>> book = xlrd.open_workbook('Amend_37058_MONILSAWLA.xls',formatting_info=True)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python2.7/dist-packages/xlrd/__init__.py", line 481, in open_workbook
    bk.get_sheets()
  File "/usr/local/lib/python2.7/dist-packages/xlrd/__init__.py", line 1067, in get_sheets
    self.get_sheet(sheetno)
  File "/usr/local/lib/python2.7/dist-packages/xlrd/__init__.py", line 1058, in get_sheet
    sh.read(self)
  File "/usr/local/lib/python2.7/dist-packages/xlrd/sheet.py", line 1082, in read
    saved_obj = self.handle_obj(data)
  File "/usr/local/lib/python2.7/dist-packages/xlrd/sheet.py", line 1872, in handle_obj
    assert pos + 4 == data_len
AssertionError
>>>

I can send the excel book if anybody wants to take a look at it. It
has joined cells and some images, but this has not been a problem
before.

Does anybody know what could be going wrong? I have been using xlrd
flawlessly for some time.

Thank you for your help.

Alex

John Machin

May 25, 2012, 7:40:15 PM
to python...@googlegroups.com
On Saturday, May 26, 2012 4:51:58 AM UTC+10, Alex wrote:

> I have an Excel workbook that always gives me the following error when
> I try to open it with xlrd version 0.7.7:
> [snip]
>   File "/usr/local/lib/python2.7/dist-packages/xlrd/sheet.py", line 1082, in read
>     saved_obj = self.handle_obj(data)
>   File "/usr/local/lib/python2.7/dist-packages/xlrd/sheet.py", line 1872, in handle_obj
>     assert pos + 4 == data_len
> AssertionError
>
> I can send the excel book if anybody wants to take a look at it. It
> has joined cells and some images, but this has not been a problem
> before.
>
> Does anybody know what could be going wrong? I have been using xlrd
> flawlessly for some time.

Please have a look at  https://github.com/python-excel/xlrd/issues/7 and if that doesn't cover your case, send me a copy of the file.

Regards,
John

Alejandro Recarey Llerena

May 25, 2012, 8:15:21 PM
to python...@googlegroups.com, python...@googlegroups.com
Thanks for the prompt reply. Will try running off master; it seems likely that there is an issue with the bitmap. Ignoring it is fine with me.

Will post back with my result. 

Thanks again!

John Machin

May 26, 2012, 4:06:30 AM
to python...@googlegroups.com
On Saturday, May 26, 2012 10:15:21 AM UTC+10, Alex wrote:
> Thanks for the prompt reply. Will try running off master; it seems likely that there is an issue with the bitmap. Ignoring it is fine with me.
> Will post back with my result.


You didn't say whether your file was created by xlwt or not. If not, it might be a good idea to send the file to me so that I can check exactly what is causing it to barf. The recent "fix" does not purport to ignore *all* invalid BIFF8 OBJECT records.

John Machin

Jun 3, 2012, 6:43:23 PM
to python...@googlegroups.com

... and the result was that Alex  sent me a file.

The problem was two extra zero bytes at the end of each OBJECT record.

I have committed (in the 'master' branch) a fix that ignores any number of extra zero bytes at the end of an OBJECT record.
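
For readers who want a feel for what "ignoring extra zero bytes" means, here is a minimal sketch. It is not the code committed to xlrd. It assumes a simplified model in which a record body is a list of sub-records (2-byte type, 2-byte length, payload, all little-endian) closed by an end marker of type 0 and length 0. The names `split_subrecords`, `FT_END` and `strict` are made up for this illustration.

```python
import struct

FT_END = 0x0000  # assumed type code of the terminating sub-record (simplified model)


def split_subrecords(data, strict=False):
    """Split a record body into (type, payload) pairs.

    Assumed layout of each sub-record: 2-byte type, 2-byte length
    (both little-endian), then `length` payload bytes.
    With strict=True the end marker must be the last 4 bytes, which is
    what the old assertion demanded. With strict=False, any number of
    zero bytes after the end marker is tolerated.
    """
    pos = 0
    subrecords = []
    while pos + 4 <= len(data):
        ft, cb = struct.unpack_from("<HH", data, pos)
        if ft == FT_END and cb == 0:
            trailing = data[pos + 4:]
            if strict and trailing:
                raise ValueError("%d unexpected bytes after end marker" % len(trailing))
            if trailing.strip(b"\x00"):
                # Non-zero junk is still treated as an error.
                raise ValueError("non-zero bytes after end marker")
            return subrecords
        payload = data[pos + 4:pos + 4 + cb]
        if len(payload) != cb:
            raise ValueError("truncated sub-record")
        subrecords.append((ft, payload))
        pos += 4 + cb
    raise ValueError("end marker not found")
```

Called with `strict=True`, a record padded with two zero bytes is rejected, much like the failing assertion; with the default it parses normally.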

Cheers,
John

Alejandro Recarey

Jun 4, 2012, 10:02:10 AM
to python...@googlegroups.com
Thanks a lot for your help, John.

I'll be running off the master branch in production, so you have
yourself an extra beta tester ;)

Regards,

Alex

Chris Withers

Jun 7, 2012, 1:24:18 PM
to python...@googlegroups.com, Alejandro Recarey
On 04/06/2012 15:02, Alejandro Recarey wrote:
> Thanks a lot for your help, John.
>
> I'll be running off the master branch in production, so you have
> yourself an extra beta tester ;)

Well, you can use the 0.7.8 release if you like :-)

(Of course, the master branch also has xlsx support, and it'd be great
if you could test that a bit more before the release in a few weeks' time...)

cheers,

Chris

--
Simplistix - Content Management, Batch Processing & Python Consulting
- http://www.simplistix.co.uk

Alejandro Recarey

Jun 21, 2012, 11:14:37 AM
to python...@googlegroups.com
Hi John, running off the master branch (currently at 0.8.0a), I am
running up against the same error. I am sending you the file that
causes it (from the same provider) in a separate email.

This is the stacktrace:

Traceback (most recent call last):
  File "xls-reformatter.py", line 467, in <module>
    crp.parse_file(changes=options.changes)
  File "xls-reformatter.py", line 295, in parse_file
    prefixes = ExcelReader.open(xls_file, prefix_sheet)
  File "xls-reformatter.py", line 38, in open
    book = xlrd.open_workbook(file_name, formatting_info=True)
  File "/usr/local/lib/python2.7/dist-packages/xlrd/__init__.py", line 432, in open_workbook
    ragged_rows=ragged_rows,
  File "/usr/local/lib/python2.7/dist-packages/xlrd/book.py", line 116, in open_workbook_xls
    bk.get_sheets()
  File "/usr/local/lib/python2.7/dist-packages/xlrd/book.py", line 702, in get_sheets
    self.get_sheet(sheetno)
  File "/usr/local/lib/python2.7/dist-packages/xlrd/book.py", line 693, in get_sheet
    sh.read(self)
  File "/usr/local/lib/python2.7/dist-packages/xlrd/sheet.py", line 1082, in read
    saved_obj = self.handle_obj(data)
  File "/usr/local/lib/python2.7/dist-packages/xlrd/sheet.py", line 1877, in handle_obj
    assert pos + 4 == data_len
AssertionError


Thanks again!

Alex

John Machin

Jun 21, 2012, 6:22:16 PM
to python-excel

On Jun 22, 1:14 am, Alejandro Recarey <a...@recarey.org> wrote:
> Hi John, running off the master branch (currently at 0.8.0a), I am
> running up against the same error. I am sending you the file that
> causes it (from the same provider) in a separate email.
>
> This is the stacktrace:
[SNIP]
>   File "/usr/local/lib/python2.7/dist-packages/xlrd/sheet.py", line 1877, in handle_obj
>     assert pos + 4 == data_len

That statement disappeared from sheet.py in the latest commit (18 days
ago):

https://github.com/python-excel/xlrd/commit/df92535dc63425d72589bd2db69cf78a2a241027

You are running an earlier version.
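
One quick way to confirm which copy of a package Python is actually importing is to print its version string and file location. This is a small sketch rather than anything from xlrd itself: `describe_module` is a name made up here, and it assumes the package exposes its version as `__VERSION__` (as xlrd does, to the best of my knowledge) or as `__version__`.

```python
import importlib


def describe_module(name):
    """Return the version and file location of an importable module, or None."""
    try:
        module = importlib.import_module(name)
    except ImportError:
        return None
    # Assumption: the version lives in __VERSION__ or, failing that, __version__.
    version = getattr(module, "__VERSION__", None) or getattr(module, "__version__", None)
    return {"version": version, "file": getattr(module, "__file__", None)}


# Prints None if xlrd is not installed in this environment.
print(describe_module("xlrd"))
```

If the reported file lives under a directory such as /usr/local/lib/python2.7/dist-packages/ and not in your git checkout, the installed release is shadowing the master branch.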