[pyxl] xlutils.copy() not preserving datetime formats

372 views
Skip to first unread message

Michael

unread,
May 18, 2010, 5:34:43 PM5/18/10
to python-excel
First of all just want to say thanks for provided an excellent set of
excel read/write tools. I am involved in the Cardsharp oss project <a
href="http://cardsharp.norc.org/cardsharp/wiki/Home">Cardsharp</a> and
we are using the python-excel libraries for our excel loader.

The issue I am having is that when using xlutils.copy() date,
datetime, and time formats are not preserved. I think the core issue
has to do with the xf_index being incorrectly set. When debugging the
datetime objects have an xf_index of 17 instead of 18 - 20.

I fixed this issue by implementing the below monkey-patch on
xlutils.filter.cell and xlwt.Row.set_cell_number.

Am I missing something/is there a way to preserve the datetime objects
without applying the monkey-patch?

Thanks!
Mike

def my_set_cell_number(self, colx, number,
style=xlwt.Style.default_style, is_date=False):
self._Row__adjust_height(style)
self._Row__adjust_bound_col_idx(colx)
xf_index = self._Row__parent_wb.add_style(style)
if is_date:
if 1 > number >= 0:
xf_index = 20
elif number - int(number) == 0:
xf_index = 19
else:
xf_index = 18
self.insert_cell(colx, xlwt.Cell.NumberCell(self._Row__idx, colx,
xf_index, number))

def my_cell(self,rdrowx,rdcolx,wtrowx,wtcolx):
cell = self.rdsheet.cell(rdrowx,rdcolx)
# setup column attributes if not already set
if wtcolx not in self.wtcols and rdcolx in
self.rdsheet.colinfo_map:
rdcol = self.rdsheet.colinfo_map[rdcolx]
wtcol = self.wtsheet.col(wtcolx)
wtcol.width = rdcol.width
wtcol.set_style(self.style_list[rdcol.xf_index])
wtcol.hidden = rdcol.hidden
wtcol.level = rdcol.outline_level
wtcol.collapsed = rdcol.collapsed
self.wtcols.add(wtcolx)
# copy cell
cty = cell.ctype
if cty == xlrd.XL_CELL_EMPTY:
return
if cell.xf_index is not None:
style = self.style_list[cell.xf_index]
else:
style = default_style
rdcoords2d = (rdrowx, rdcolx)
if rdcoords2d in self.merged_cell_top_left_map:
# The cell is the governing cell of a group of
# merged cells.
rlo, rhi, clo, chi = self.merged_cell_top_left_map[rdcoords2d]
assert (rlo, clo) == rdcoords2d
self.wtsheet.write_merge(
wtrowx, wtrowx + rhi - rlo - 1,
wtcolx, wtcolx + chi - clo - 1,
cell.value, style)
return
if rdcoords2d in self.merged_cell_already_set:
# The cell is in a group of merged cells.
# It has been handled by the write_merge() call above.
# We avoid writing a record again because:
# (1) It's a waste of CPU time and disk space.
# (2) xlwt does not (as at 2007-01-12) ensure that only
# the last record is written to the file.
# (3) If you write a data record for a cell
# followed by a blank record for the same cell,
# Excel will display a blank but OOo Calc and
# Gnumeric will display the data :-(
return
x = 0
wtrow = self.wtsheet.row(wtrowx)
if cty == xlrd.XL_CELL_TEXT:
wtrow.set_cell_text(wtcolx, cell.value, style)
elif cty == xlrd.XL_CELL_NUMBER:
wtrow.set_cell_number(wtcolx, cell.value, style)
elif cty == xlrd.XL_CELL_DATE:
wtrow.set_cell_number(wtcolx, cell.value, style, is_date=True)
elif cty == xlrd.XL_CELL_BLANK:
wtrow.set_cell_blank(wtcolx, style)
elif cty == xlrd.XL_CELL_BOOLEAN:
wtrow.set_cell_boolean(wtcolx, cell.value, style)
elif cty == xlrd.XL_CELL_ERROR:
wtrow.set_cell_error(wtcolx, cell.value, style)
else:
raise Exception(
"Unknown xlrd cell type %r with value %r at (shx=%r,rowx=
%r,colx=%r)" \
% (cty, value, sheetx, rowx, colx)
)


--
You received this message because you are subscribed to the Google Groups "python-excel" group.
To post to this group, send an email to python...@googlegroups.com.
To unsubscribe from this group, send email to python-excel...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/python-excel?hl=en-GB.

John Machin

unread,
May 18, 2010, 8:10:38 PM5/18/10
to python...@googlegroups.com
On 19/05/2010 7:34 AM, Michael wrote:
> First of all just want to say thanks for provided an excellent set of
> excel read/write tools. I am involved in the Cardsharp oss project <a
> href="http://cardsharp.norc.org/cardsharp/wiki/Home">Cardsharp</a> and
> we are using the python-excel libraries for our excel loader.
>
> The issue I am having is that when using xlutils.copy() date,
> datetime, and time formats are not preserved.

What is substituted in what circumstances? Are you certain that number
and currency formats are not affected?

> I think the core issue
> has to do with the xf_index being incorrectly set.
> When debugging the
> datetime objects have an xf_index of 17 instead of 18 - 20.
>
> I fixed this issue by implementing the below monkey-patch on
> xlutils.filter.cell and xlwt.Row.set_cell_number.

An xf_index > 16 has no intrinsic meaning. Hard-coding such values in
your monkey-patch is not very robust at all.

We need to understand what the basic problem is. Please supply a minimal
input file, a minimal script, and a statement of how the resultant
output file differs from what you expected.

Cheers,
John

Michael Jugovich

unread,
May 19, 2010, 4:56:41 PM5/19/10
to python...@googlegroups.com
Here is a script to reproduce the error. After copying all formatting is removed for all types.

from xlwt import *
from datetime import *
from xlutils.copy import copy

w = Workbook()
ws = w.add_sheet('Hey, Dude')

fmts = [
    'General',
    '0%',
    'M/D/YY',
    'D-MMM-YY',
    'D-MMM',
    'MMM-YY',
    'h:mm AM/PM',
    'h:mm:ss AM/PM',
    'h:mm',
    'h:mm:ss',
    'M/D/YY h:mm',
    'mm:ss',
    '[h]:mm:ss',
    'mm:ss.0',
]

i = 0
for fmt in fmts:
    ws.write(i, 0, fmt)

    style = XFStyle()
    style.num_format_str = fmt
    
    font0 = Font()
    font0.name = 'Times New Roman'
    font0.struck_out = True
    font0.bold = True
    style.font = font0
    vals = ['text', 10]
    
    ws.write(i, 4, datetime.now() if fmt not in ('General', '0%') else vals[i], style)
    i += 1

w.save('before_copy.xls')
rb = xlrd.open_workbook('before_copy.xls')
wb = copy(rb)
wb.save('after_copy.xls')

Before I started digging heavily through the xlutils/xwlt/and xlrd code I was hoping to get some guidance as to where to start my search. My thoughts were that it had to do with the xf_index being incorrectly set, but from your response it sounds like you are saying that the xf_index does not determine the style format? The reason I was thinking it did is because when I step through my debugger the following is happening:

from xlutils.filter.cell:
cell = self.rdsheet.cell(rdrowx,rdcolx) <------xf_index = None

...

and from xlrd.Sheet.cell:
def cell(self, rowx, colx):
        if self.formatting_info: <-----------------------no formatting_info
            xfx = self.cell_xf_index(rowx, colx)
        else:
            xfx = None   <-----------------------xfx is being set to None
        return Cell(
            self._cell_types[rowx][colx],
            self._cell_values[rowx][colx],
            xfx,
            )

....

from xlutils.filter.cell:
 # copy cell
        cty = cell.ctype
        if cty == xlrd.XL_CELL_EMPTY:
            return
        if cell.xf_index is not None:
            style = self.style_list[cell.xf_index]
        else:
            style = default_style <--------------default selected

Thanks!
Michael

John Machin

unread,
May 19, 2010, 6:22:31 PM5/19/10
to python...@googlegroups.com
On 20/05/2010 6:56 AM, Michael Jugovich wrote:
> Here is a script to reproduce the error. After copying all formatting is
> removed for all types.

Not at all. *BEFORE* copying all formatting is ignored for all types --
by you.

>
[snip]

> w.save('before_copy.xls')
> rb = xlrd.open_workbook('before_copy.xls')

rb = xlrd.open_workbook('before_copy.xls', formatting_info=True)

> wb = copy(rb)
> wb.save('after_copy.xls')
>
> Before I started digging heavily through the xlutils/xwlt/and xlrd code
> I was hoping to get some guidance as to where to start my search.

Please consider reading the documentation. This site
http://www.python-excel.org/ has links to the xlrd documentation and a
tutorial -- see the xlutils.copy example code on page 48 of the tutorial.


> My
> thoughts were that it had to do with the xf_index being incorrectly set,
> but from your response it sounds like you are saying that the xf_index
> does not determine the style format?

The xf_index means nothing, except that a new Excel-created workbook has
its first 16 or so XFs set up by default to the same values every time.

Michael Jugovich

unread,
May 20, 2010, 11:29:37 AM5/20/10
to python...@googlegroups.com
Thank you John for your quick responses and patience with my inexperience.
Reply all
Reply to author
Forward
0 new messages