error open excel file


fyaqq

Jan 16, 2008, 10:37:25 PM
to python-excel
This is the error message from python:
Traceback (most recent call last):
  File "C:\Documents and Settings\fyaqq\python\readExcel.py", line 48, in <module>
    book = xlrd.open_workbook(fileName)
  File "C:\Python25\Lib\site-packages\xlrd\__init__.py", line 370, in open_workbook
    biff_version = bk.getbof(XL_WORKBOOK_GLOBALS)
  File "C:\Python25\Lib\site-packages\xlrd\__init__.py", line 1323, in getbof
    raise XLRDError('Expected BOF record; found 0x%04x' % opcode)
xlrd.biffh.XLRDError: Expected BOF record; found 0x683c

The error occurred with only some of my Excel files. I guess it is due to
different versions of the Excel file format?

Thank you for any help!

John Machin

Jan 17, 2008, 5:23:08 AM
to python-excel

On Jan 17, 2:37 pm, fyaqq <l...@clamc.com> wrote:
> This is the error message from python:
[snip]
> xlrd.biffh.XLRDError: Expected BOF record; found 0x683c

| >>> '\x3c\x68'
| '<h'

Your file was quite probably produced by doing "Save As Web Page" in
Excel. Try inspecting its contents with a text editor (e.g. Notepad)
or a browser.

xlrd currently reads only the binary (BIFF) format found in XLS files
and very close relatives. Note that renaming foo.htm as foo.xls does
not change the content.
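
Since the extension says nothing about the content, one way to check a
file before handing it to xlrd is to look at its first few bytes. The
following is only a sketch, written for Python 3 rather than the Python
2.5 shown in the traceback. The names sniff_spreadsheet and sniff_file
are hypothetical helpers, not part of xlrd, and the categories are an
assumption made for illustration. The 8-byte OLE2 compound-document
signature is the standard one.

```python
# Hypothetical helpers for illustration; not part of xlrd.
OLE2_SIGNATURE = b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"

def sniff_spreadsheet(head):
    """Classify the leading bytes of a file that claims to be an XLS."""
    if head.startswith(OLE2_SIGNATURE):
        return "ole2"      # OLE2 compound document, e.g. a binary XLS
    text = head.lstrip().lower()  # a byte-order mark is not handled here
    if text.startswith(b"<?xml"):
        return "xml"       # some XML-based format; which one is unknown
    if text.startswith(b"<html") or text.startswith(b"<!doctype"):
        return "html"      # e.g. the result of "Save As Web Page"
    return "unknown"       # could still be an old BIFF file, or anything else

def sniff_file(path, size=64):
    """Read the first `size` bytes of `path` and classify them."""
    with open(path, "rb") as f:
        return sniff_spreadsheet(f.read(size))
```

A result of "html" or "xml" means the file is not in the binary format
that xlrd reads, whatever it is called on disk.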

Background to the error message: xlrd has not found the magic
signature that denotes an OLE2 compound document (Excel 5.x or later),
nor has it found a valid BOF record (XLS or XLW from Excel 4.x or
earlier) -- your file starts with '<h'.
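
To see how the number in the error message maps back to the start of
the file: the opcode is read as a little-endian 16-bit integer, so
packing it the same way recovers the two bytes. This is a small sketch
for Python 3; opcode_to_bytes is a hypothetical name used only here.

```python
import struct

def opcode_to_bytes(opcode):
    """Return the two file bytes behind a little-endian 16-bit opcode."""
    return struct.pack("<H", opcode)

print(opcode_to_bytes(0x683C))  # b'<h'
```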

In response to a private message from another xlrd user with the same
problem, I have changed the error message in the current svn trunk to
show the first 8 bytes of the file:

xlrd.biffh.XLRDError: Unsupported format, or corrupt file: Expected BOF record; found '<html xm'

I shall also update the documentation to explain exactly what
varieties of Excel file are supported.

Cheers,
John

Matthew

Jan 23, 2008, 11:31:53 AM
to python-excel
I got a similar error message:

xlrd.biffh.XLRDError: Expected BOF record; found 0x3f3c

and on inspecting the Excel file with a text editor, it appears that the
file format is XML:

<?xml version="1.0"?>

Are there plans for xlrd to support this format in the future?

Thanks,
Matthew

John Machin

Jan 24, 2008, 5:32:11 AM
to python-excel
On Jan 24, 3:31 am, Matthew <tscha...@gmail.com> wrote:
> I got a similar error message:
>
> xlrd.biffh.XLRDError: Expected BOF record; found 0x3f3c
>
> and inspecting the excel file using a text editor it appears that the
> file format is XML:
>
> <?xml version="1.0"?>
>
> Are there plans for xlrd to support this format in the future?

XML is not a file format; it is a way of specifying a zillion file
formats. The file you have may be:
(a) result of Excel 2003 "save as XML Spreadsheet"
(b) result of Excel "save as XML Data"
(c) something else.

If it starts off like this:

<?xml version="1.0"?>
<?mso-application progid="Excel.Sheet"?>
<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet"
xmlns:o="urn:schemas-microsoft-com:office:office"
xmlns:x="urn:schemas-microsoft-com:office:excel"
xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet"
xmlns:html="http://www.w3.org/TR/REC-html40">

then it's "XML Spreadsheet 2003". If you're not sure what it is, send
me a copy.
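
A rough programmatic version of that check is sketched below for
Python 3, using the standard library's xml.etree.ElementTree. It tests
only that the root element is Workbook in the spreadsheet namespace
shown in the header above. The function name
looks_like_xml_spreadsheet is hypothetical, and a match does not prove
which Excel version wrote the file.

```python
import xml.etree.ElementTree as ET

SS_NAMESPACE = "urn:schemas-microsoft-com:office:spreadsheet"

def looks_like_xml_spreadsheet(xml_text):
    """True if the root element is <Workbook> in the spreadsheet namespace."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        return False  # not well-formed XML at all
    # ElementTree spells a namespaced tag as "{namespace-uri}localname".
    return root.tag == "{%s}Workbook" % SS_NAMESPACE
```

Because it looks at the root element only, it ignores processing
instructions such as the mso-application line, whether present or not.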

I'm currently reviewing the list of possible enhancements for both
xlrd and xlwt, and will be publishing it on this newsgroup some time
next week, and seeking comment on priorities, etc. "XML Spreadsheet
2003" will be on the list.

Cheers,
John

Matthew

Jan 24, 2008, 10:32:09 AM
to python-excel
It nearly matches the header you sent:


<?xml version="1.0"?>
<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet"
xmlns:o="urn:schemas-microsoft-com:office:office"
xmlns:x="urn:schemas-microsoft-com:office:excel"
xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet"
xmlns:html="http://www.w3.org/TR/REC-html40">

It is just missing the <?mso-application progid="Excel.Sheet"?>
line. (2nd line in your example.)

Needless to say, "XML Spreadsheet 2003" is my top priority :)

Thanks for your response and an otherwise great tool.

John Machin

Jan 24, 2008, 2:39:05 PM
to python...@googlegroups.com
Matthew wrote:
> It nearly matches the header you sent:
>
> <?xml version="1.0"?>
> <Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet"
> xmlns:o="urn:schemas-microsoft-com:office:office"
> xmlns:x="urn:schemas-microsoft-com:office:excel"
> xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet"
> xmlns:html="http://www.w3.org/TR/REC-html40">
>
> It is just missing the <?mso-application progid="Excel.Sheet"?>
> line. (2nd line in your example.)
>
>
Near enough isn't good enough in this game. Maybe we're talking about
"XML Spreadsheet yyyy" where contents differ with yyyy.

Is it possible that the next few lines contain something like:

<DocumentProperties xmlns="urn:schemas-microsoft-com:office:office">
[snip]
<Version>11.8132</Version>
</DocumentProperties>
<ExcelWorkbook xmlns="urn:schemas-microsoft-com:office:excel">

? Are you able to send me some sample files? If not, it is going to
make the process of discovery rather difficult.
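
If such a DocumentProperties block is present, the Version value could
be pulled out along the following lines. This is a sketch for Python 3
that assumes the element layout shown in the fragment above;
document_version is a hypothetical helper name, and it returns None
when the element is missing.

```python
import xml.etree.ElementTree as ET

OFFICE_NS = "urn:schemas-microsoft-com:office:office"

def document_version(xml_text):
    """Return the text of DocumentProperties/Version, or None if absent."""
    root = ET.fromstring(xml_text)
    # Both elements live in the "office" namespace declared on DocumentProperties.
    path = "{%s}DocumentProperties/{%s}Version" % (OFFICE_NS, OFFICE_NS)
    node = root.find(path)
    return node.text if node is not None else None
```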


> Needless to say, "XML Spreadsheet 2003" is my top priority :)
>
>

No doubt. You had better start getting your sales pitch together ;-) If
you have only the one file that you have mentioned so far and nobody
else is interested, it won't fly. How many of these files do you have?
What is the source [what software written by whom]? Do the files
continue to be created?

Please feel free to continue this discussion in private e-mail if you
prefer.

Cheers,
John


Alan

Mar 10, 2008, 9:47:49 AM
to python-excel
I am not quite good at Excel. But I think you may try a popular Excel
file recovery tool called Advanced Excel Repair to repair your Excel
file. It is a powerful tool to repair corrupt or damaged Excel files.

Detailed information about Advanced Excel Repair can be found at
http://www.datanumen.com/aer/

And you can also download a free demo version at http://www.datanumen.com/aer/aer.exe

Maybe this will be useful.

Alan

John Machin

Mar 10, 2008, 6:34:35 PM
to python-excel


On Mar 11, 12:47 am, Alan <f...@datanumen.net> wrote:
> I am not quite good at Excel.

The above statement is not a good advertisement for the software that
you are trying to sell.

> But I think you may try a popular Excel
> file recovery tool called Advanced Excel Repair to repair your Excel
> file.

Casual reading of this thread would indicate to most sentient beings
that the OP's file is *NOT* corrupt; it's merely in a format ("save as
web page") that's not handled by the software that he was trying to
use to read it ... and possibly not even handled by the software that
you are trying to sell :-)

People who have corrupt XLS files can google to find numerous repair
offerings.

This is a moderated newsgroup / mailing-list for discussion about, and
getting help on, accessing Excel files using Python. I have deleted
the other two messages that you attempted to post to this same thread.
Please don't attempt to post spam again.