Excel has error opening xlsx created by openpyxl

5,101 views
Skip to first unread message

James Parris

unread,
Apr 5, 2012, 12:26:09 PM4/5/12
to openpyxl-users
Upon opening the created file, Excel 2007 is finding garbage data in
line one, in some crazy far down column.


> Excel found unreadable content in currentDeals.xlxs,
> Repaired Part: /xl/worksheets/sheet1.xml part with XML error. Load error. Line 1, column 1293893

You can click "Yes" to the error message for auto-repair, and then the
file is just fine. But that is not really acceptable for a production
report =/
Please guys and gals, I've googled my head off about this and gotten
no where. I don't even know where to start. You're my only hope!


Here is the very simple code generating the report. So the highest
column according to get_highest_column() is 44. I also tried
garbage_collect() but makes no difference:


| #!/opt/recon/bin/python2.6
| from openpyxl import Workbook
|
| file=currentDeals.csv
| D='|'
|
| # ------ Read file to Array ------------------- :
| StoredData = []
| FileReader = csv.reader(open(file), delimiter=D)
|
| for line in FileReader:
| StoredData.append(line)
|
| # ------ Write Array to Worksheet ------------- :
|
| wb = Workbook()
| ws = wb.create_sheet()
| ws.title = file.split('.')[0]
|
| for Row in StoredData:
| ws.append(Row)
|
|
| outfile = '.'.join([file.split('.')[0],'xlsx'])
| #ws.garbage_collect()
| print ws.get_highest_column()
| wb.save(filename = outfile)


*Note, I've also tried with Workbook(optimized_write=True) and
Workbook(encoding='utf-8'). Just grasping at straws though.

Arthur Gerigk

unread,
Apr 6, 2012, 10:04:33 AM4/6/12
to openpyx...@googlegroups.com
Are you writing unicode or string types to the xlsx?

I had the same problem when writing utf-8 encoded strings but it didn't occur with unicode objects.

James Parris

unread,
Apr 11, 2012, 3:24:40 PM4/11/12
to openpyxl-users
OK, I'm still not seeing my last reply, so I hope this isn't a double
post.

You got me on the right track (I think). My system's preferred
encoding is actually latin1.

I tried to convert all the values to unicode using just:
item = unicode(item)

That didn't make any difference at all. I then tried:a
item = unicode(item.decode(locale.getpreferredencoding()))

Now that seems to have gotten me farther. Now, while in Windows I
still get the same error opening the file, at least now I can actually
_see_ the buggy worksheet. Now, if only I could get rid of it, maybe
I'll be all set:


| print wb.get_sheet_names()
| for sheet in wb.get_sheet_names():
| if sheet != ws.title:
| try:
| wb.remove_sheet(sheet)
| except:
| print "Warning: failed to delete buggy worksheet '%s'."%sheet
| pass

['Sheet', 'currentDeals']
Warning: failed to delete buggy worksheet 'Sheet'.


Rats! Is there a better way to remove that first sheet with bad
random data in it? Or am I on the wrong track entirely?

James Parris

unread,
Apr 11, 2012, 3:43:22 PM4/11/12
to openpyx...@googlegroups.com

Ehh, nevermind I was able to delete the sheet by not being stupid:


|  for sheet in wb.get_sheet_names():
|    if sheet != ws.title:
|      try:
|        wb.remove_sheet(wb.get_sheet_by_name(sheet))  #   <--- duh.


However, I am still getting the error of bad data in "Sheet1" when I try to open the file in Excel.  
Maybe still an encoding issue, but I have no idea how to troubleshoot it.  =/


James Parris

unread,
Apr 17, 2012, 4:36:25 PM4/17/12
to openpyx...@googlegroups.com
RESOLVED.  

This seems to be because the job is running from an IBM-AIX machine.  I tried the same script from a RedHat machine and it worked fine.  
For some reason, on AIX, the data type was being stored as 'inlineStr' instead of 's', when using the either the append() function or the cell(x,y).value='string' syntax.  
Kudos to the developers for including the set_value_explicit function, because it saved my bacon!!

Here was the change I made that finally got it working from our AIX servers:



|    #for Row in StoredData: # doesn't work because AIX; workaround below
|    #  ws.append(Row)
|     
|    for R in xrange(len(StoredData)):
|      for C in xrange(len(StoredData[R])):
|        col = cell.get_column_letter(C + 1)
|        ws.cell('%s%s'%(col, (R+1))).set_value_explicit(value=StoredData[R][C], data_type='s')

Steve Romanow

unread,
Apr 17, 2012, 8:17:53 PM4/17/12
to openpyx...@googlegroups.com
On my aix, the default encoding is iso-8859-1 (latin-1). That is
likely the difference.
Reply all
Reply to author
Forward
0 new messages