Using xlwt in Django

510 views
Skip to first unread message

Joseph Wilhelm

unread,
Oct 23, 2008, 12:04:46 PM10/23/08
to python...@googlegroups.com

Hi all,

 

I just thought you might all be interested to know, I’ve implemented Excel output on the fly in Django, using xlwt. The snippet is posted here: http://www.djangosnippets.org/snippets/1151/

 

Any input is welcome!

Joseph Wilhelm

SMP Communications Corp.
7626 E. Greenway Rd. Suite 100

Scottsdale, AZ 85260
480.905.4100 F: 480.905.7110

1.888.796.3342

www.smpcom.com

This message is intended only for the use of the individual or entity to which it is addressed. If the reader of this message is not the intended recipient, or the employee or agent responsible for delivering the message to the intended recipient, you are hereby notified that any dissemination, distribution or copying of this message is strictly prohibited. If you have received this communication in error, please notify us immediately by replying to the sender of this E-Mail by return E-Mail or by telephone.

 

John Machin

unread,
Oct 23, 2008, 7:31:30 PM10/23/08
to python...@googlegroups.com
On 24/10/2008 03:04, Joseph Wilhelm wrote:
> Hi all,

> I just thought you might all be interested to know, I’ve implemented
> Excel output on the fly in Django, using xlwt. The snippet is posted
> here: http://www.djangosnippets.org/snippets/1151/

> Any input is welcome!

if use_xls is True:
Wah!
book = xlwt.Workbook()
sheet = book.add_sheet('Sheet 1')
for rowx, row in enumerate(data):
for colx, col in enumerate(row):
s/col/value/ ... it's not a "column"
sheet.write(rowx, colx, str(col))
Using str(col) means that you are writing all cells as text cells (i.e.
losing all information as to whether the original data was
number/text/datetime/whatever) and possibly losing about 4 digits of
precision on your numbers. Have you tried not using str()? Sheet.write()
does the "right thing" for the following Python types:
* bool
* int, long, float, decimal.Decimal
* unicode
* str (decoded as specified in the xlwt.Workbook() encoding arg (default
is 'ascii')
* None
* datetime.(datetime|date|time)

HTH,
John

Chris Withers

unread,
Oct 24, 2008, 4:11:40 AM10/24/08
to python...@googlegroups.com
John Machin wrote:
>> I just thought you might all be interested to know, I’ve implemented
>> Excel output on the fly in Django, using xlwt. The snippet is posted
>> here: http://www.djangosnippets.org/snippets/1151/
>
>> Any input is welcome!
>
> if use_xls is True:
> Wah!

I believe "Wah!" is Australian for:

You should have used:
if use_xls:

;-)

Also, why drop back to csv?
Just add a new sheet each time you it 64000 rows.

cheers,

Chris

--
Simplistix - Content Management, Zope & Python Consulting
- http://www.simplistix.co.uk

Joseph Wilhelm

unread,
Oct 24, 2008, 11:41:07 AM10/24/08
to python...@googlegroups.com
John Machin wrote:
> if use_xls is True:
> Wah!

Oops.

> book = xlwt.Workbook()
> sheet = book.add_sheet('Sheet 1')
> for rowx, row in enumerate(data):
> for colx, col in enumerate(row):
> s/col/value/ ... it's not a "column"

Ah yes, true enough. This bit of code here also brings up the question,
is there a way to write an entire row at once? Given what's being done
here, I would be a little happier writing a row at a time rather than a
cell at a time. I'm generating some spreadsheets with 15+ columns and
35k+ rows, so that's a whole lot of calls to write() which could be cut
down drastically by row writes.

> sheet.write(rowx, colx, str(col))
> Using str(col) means that you are writing all cells as text cells
(i.e.
> losing all information as to whether the original data was
> number/text/datetime/whatever) and possibly losing about 4 digits of
> precision on your numbers. Have you tried not using str()?
Sheet.write()
> does the "right thing" for the following Python types:

Honestly, I can't remember why exactly I did that... but I'm sure I had
a reason that sounded good to myself at the time!

I'll do some more poking at it without the str() call and see how it
behaves.

Chris Withers wrote:
> Also, why drop back to csv?
> Just add a new sheet each time you it 64000 rows.

True, I could do that... it feels a little dirty to me though. In my
mind, what's being passed to this is a single list, and it should be
output as such. To me, separate worksheets represent separate lists.

Thanks for the help, guys! Those semantic issues are fixed and the
others will be addressed soon as well.

--Joey

Joseph Wilhelm

unread,
Oct 24, 2008, 11:56:04 AM10/24/08
to python...@googlegroups.com
> > sheet.write(rowx, colx, str(col))
> > Using str(col) means that you are writing all cells as text cells
> (i.e.
> > losing all information as to whether the original data was
> > number/text/datetime/whatever) and possibly losing about 4 digits of
> > precision on your numbers. Have you tried not using str()?
> Sheet.write()
> > does the "right thing" for the following Python types:
>
> Honestly, I can't remember why exactly I did that... but I'm sure I
had
> a reason that sounded good to myself at the time!
>
> I'll do some more poking at it without the str() call and see how it
> behaves.

Ah ha. I pulled off the str() call and immediately remembered my
reasoning. It appears that xlwt is not, in fact, doing the "right thing"
for datetime.datetime data. An example record in my output contains the
following data:

[datetime.datetime(2005, 10, 31, 0, 0), datetime.datetime(2005, 9, 19,
15, 34), datetime.datetime(2005, 10, 1, 0, 0)]

In the resulting xls file, these are producing, respectively:
38565, 38614.65, 38626

Perhaps a lurking xlwt bug?

--Joey

John Machin

unread,
Oct 24, 2008, 2:24:28 PM10/24/08
to python...@googlegroups.com
On 25/10/2008 02:41, Joseph Wilhelm wrote:
> John Machin wrote:

>> book = xlwt.Workbook()
>> sheet = book.add_sheet('Sheet 1')
>> for rowx, row in enumerate(data):
>> for colx, col in enumerate(row):
>> s/col/value/ ... it's not a "column"
>
> Ah yes, true enough. This bit of code here also brings up the question,
> is there a way to write an entire row at once? Given what's being done
> here, I would be a little happier writing a row at a time rather than a
> cell at a time. I'm generating some spreadsheets with 15+ columns and
> 35k+ rows, so that's a whole lot of calls to write() which could be cut
> down drastically by row writes.

What do you imagine a "row write" would or could do, other than
iterating over the individual cell values as you have done?

Joseph Wilhelm

unread,
Oct 24, 2008, 3:42:09 PM10/24/08
to python...@googlegroups.com
John Machin wrote:
> What do you imagine a "row write" would or could do, other than
> iterating over the individual cell values as you have done?

That's exactly why I was asking; I wasn't certain if there would be an
easy way to add the records in a batch rather than individually. It
would be nice in theory, but judging from your reaction it sounds like
it's far from feasible in practice. :)

John Machin

unread,
Oct 24, 2008, 3:44:39 PM10/24/08
to python...@googlegroups.com
On 25/10/2008 02:56, Joseph Wilhelm wrote:
>>> sheet.write(rowx, colx, str(col))
>>> Using str(col) means that you are writing all cells as text cells
>> (i.e.
>>> losing all information as to whether the original data was
>>> number/text/datetime/whatever) and possibly losing about 4 digits of
>>> precision on your numbers. Have you tried not using str()?
>> Sheet.write()
>>> does the "right thing" for the following Python types:
>> Honestly, I can't remember why exactly I did that... but I'm sure I
> had
>> a reason that sounded good to myself at the time!
>>
>> I'll do some more poking at it without the str() call and see how it
>> behaves.
>
> Ah ha. I pulled off the str() call and immediately remembered my
> reasoning. It appears that xlwt is not, in fact, doing the "right thing"
> for datetime.datetime data.

Appearances can be deceptive. It is better when you suspect a problem to
report it so that you get a fix or a work-around or enlightenment as to
why it is not a bug, rather than belting it with a sledge-hammer till it
appears to go away.

> An example record in my output contains the
> following data:
>
> [datetime.datetime(2005, 10, 31, 0, 0), datetime.datetime(2005, 9, 19,
> 15, 34), datetime.datetime(2005, 10, 1, 0, 0)]
>
> In the resulting xls file, these are producing, respectively:
> 38565, 38614.65, 38626

If the first one is really 38565 and not mangled manual reporting of
38656, then you *have* found a bug. The second one is actually
38614.64861111.... (0.65 corresponds to 15:36, not 15:34).

> Perhaps a lurking xlwt bug?

Or perhaps not. Try this: Open your output spreadsheet in Excel or OOo
Calc or Gnumeric. Select the offending cells. Click on
Format/Cells/Number/Custom. Type yyyy-mm-dd hh:mm:ss into the text box.
Click on OK. What do you see?

For background, consider reading the section on dates up near the start
of the xlrd documentation (http://www.lexicon.net/sjmachin/xlrd.html).

You may like to try examining the type information available from your
Django query in order to set up an XFStyle object for each column. As a
first attempt, this could be xlwt.easyxf(num_format_str="yyyy-mm-dd
hh:mm:ss") for datetime objects, and xlwt.Style.default_style for all
others.

HTH,

John

Joseph Wilhelm

unread,
Oct 24, 2008, 4:00:29 PM10/24/08
to python...@googlegroups.com
> Appearances can be deceptive. It is better when you suspect a problem
to
> report it so that you get a fix or a work-around or enlightenment as
to
> why it is not a bug, rather than belting it with a sledge-hammer till
it
> appears to go away.

Agreed, this was just a sledgehammer which I threw into the mix to shove
it along and make it work, and apparently forgot about until it came
crashing down on my head.

> If the first one is really 38565 and not mangled manual reporting of
> 38656, then you *have* found a bug. The second one is actually
> 38614.64861111.... (0.65 corresponds to 15:36, not 15:34).

Yes, the first one really is 38565; checked and double checked. You are
right on the value of the second though.

> Or perhaps not. Try this: Open your output spreadsheet in Excel or OOo
> Calc or Gnumeric. Select the offending cells. Click on
> Format/Cells/Number/Custom. Type yyyy-mm-dd hh:mm:ss into the text
box.
> Click on OK. What do you see?
>
> For background, consider reading the section on dates up near the
start
> of the xlrd documentation (http://www.lexicon.net/sjmachin/xlrd.html).
>
> You may like to try examining the type information available from your
> Django query in order to set up an XFStyle object for each column. As
a
> first attempt, this could be xlwt.easyxf(num_format_str="yyyy-mm-dd
> hh:mm:ss") for datetime objects, and xlwt.Style.default_style for all
> others.

See, I was hoping you weren't going to say that. :) By the "right thing"
I was hoping you meant it would do this formatting for you. I suppose it
would make sense for my class to have this bit of style detection
though.

Thanks again for the tips!

--Joey

John Machin

unread,
Oct 24, 2008, 5:41:26 PM10/24/08
to python...@googlegroups.com
On 25/10/2008 07:00, Joseph Wilhelm wrote:

>
>> If the first one is really 38565 and not mangled manual reporting of
>> 38656, then you *have* found a bug. The second one is actually
>> 38614.64861111.... (0.65 corresponds to 15:36, not 15:34).
>
> Yes, the first one really is 38565; checked and double checked. You are
> right on the value of the second though.

I cannot reproduce this. 38565 corresponds to 2005-08-01.
8<---
import datetime
import xlwt
book = xlwt.Workbook()
sheet = book.add_sheet('dates')
# first 3 items copy/pasted from Joey's message
data =[datetime.datetime(2005, 10, 31, 0, 0), datetime.datetime(2005, 9, 19,
15, 34), datetime.datetime(2005, 10, 1, 0, 0),
# added by SJM
datetime.datetime(2005, 8, 1, 0, 0),
]
for colx, value in enumerate(data):
print colx, repr(value)
sheet.write(0, colx, value)
book.save('joey_dates.xls')
8<---
Result in the xls, formatted to 9 decimal places:
38656.000000000 38614.648611111 38626.000000000 38565.000000000

Please send me a reproducible demonstration of the problem -- a minimal
one (i.e. doesn't need Django). Also exactly which version of xlwt and
Python, and what platform you ran it on.


Alex Ezell

unread,
Oct 24, 2008, 12:08:46 PM10/24/08
to python...@googlegroups.com

Isn't this the way that Excel stores dates? These values are "a float
representing the number of days since (typically) 1899-12-31T00:00:00,
under the pretence that 1900 was a leap year" according to the xlwt
docs. Perhaps Excel needs to be told they are dates?

/alex

Joseph Wilhelm

unread,
Oct 24, 2008, 5:58:11 PM10/24/08
to python...@googlegroups.com
> I cannot reproduce this. 38565 corresponds to 2005-08-01.
> 8<---
> import datetime
> import xlwt
> book = xlwt.Workbook()
> sheet = book.add_sheet('dates')
> # first 3 items copy/pasted from Joey's message
> data =[datetime.datetime(2005, 10, 31, 0, 0), datetime.datetime(2005,
9,
> 19,
> 15, 34), datetime.datetime(2005, 10, 1, 0, 0),
> # added by SJM
> datetime.datetime(2005, 8, 1, 0, 0),
> ]
> for colx, value in enumerate(data):
> print colx, repr(value)
> sheet.write(0, colx, value)
> book.save('joey_dates.xls')
> 8<---
> Result in the xls, formatted to 9 decimal places:
> 38656.000000000 38614.648611111 38626.000000000 38565.000000000
>
> Please send me a reproducible demonstration of the problem -- a
minimal
> one (i.e. doesn't need Django). Also exactly which version of xlwt and
> Python, and what platform you ran it on.

I just ran the above example verbatim, viewed it in Excel 2003 SP3, and
came up with:
38656, 38614.6486111111, 38626, 38565

When I open it via xlrd, I get:
[number:38656.0, number:38614.648611111108, number:38626.0,
number:38565.0]

I'm running this on:
Windows XP SP2 (32-bit)
Python 2.5.2
xlwt 0.7.0
xlrd 0.6.1

John Machin

unread,
Oct 24, 2008, 6:25:31 PM10/24/08
to python...@googlegroups.com

Apart from cosmetics, this is *EXACTLY* what I got. You have *NOT*
reproduced your alleged problem. Your allegation, which you later said
you checked and double-checked, was:


"""
An example record in my output contains the
following data:

[datetime.datetime(2005, 10, 31, 0, 0), datetime.datetime(2005, 9, 19,


15, 34), datetime.datetime(2005, 10, 1, 0, 0)]

In the resulting xls file, these are producing, respectively:
38565, 38614.65, 38626
"""

i.e. that datetime.datetime(2005, 10, 31, 0, 0) produced 38565 in the
resulting file.

Joseph Wilhelm

unread,
Oct 24, 2008, 6:49:05 PM10/24/08
to python...@googlegroups.com
> Apart from cosmetics, this is *EXACTLY* what I got. You have *NOT*
> reproduced your alleged problem. Your allegation, which you later said
> you checked and double-checked, was:
> """
> An example record in my output contains the
> following data:
>
> [datetime.datetime(2005, 10, 31, 0, 0), datetime.datetime(2005, 9, 19,
> 15, 34), datetime.datetime(2005, 10, 1, 0, 0)]
>
> In the resulting xls file, these are producing, respectively:
> 38565, 38614.65, 38626
> """
> i.e. that datetime.datetime(2005, 10, 31, 0, 0) produced 38565 in the
> resulting file.
>

Again, you are correct, my apologies. I initially made a typo (38565 vs
38656) which I did check and double check; I was just checking for the
wrong thing, as I misunderstood what you were looking at. I had thought
this was a problem with the precision not being generated. (38565 vs
38656.000000000) rather than the number actually being that far off.

Sorry about that, it's been a long week! My head must not be fully
functioning today.

Either way, despite me managing to waste far too much of our time today
by being blind, I have updated the snippet for which I started this
thread to incorporate the suggestions brought up here and also add an
optional encoding parameter (defaulted to utf8, as Django attempts to
use Unicode everywhere).

--Joey

Reply all
Reply to author
Forward
0 new messages