Is it possible to get higher performance with xlwt?


Max Belkin

May 6, 2014, 10:32:28 AM
to python...@googlegroups.com
Hello everyone!

I have spent the last few days comparing the speed of creating an xls/xlsx file with different modules. xlwt is the fastest: it is 2 times faster than openpyxl and xlsxwriter (which are roughly equal to each other).

My question is in the subject: can higher performance be reached?
Maybe there is some kind of memory-usage optimization, as in the other modules, that I haven't noticed?

Regards,
Max

John Yeung

May 6, 2014, 11:46:15 AM
to python-excel
On Tue, May 6, 2014 at 10:32 AM, Max Belkin
<bernardito.l...@gmail.com> wrote:
> I have spent the last few days comparing the speed of creating an xls/xlsx
> file with different modules. xlwt is the fastest: it is 2 times faster than
> openpyxl and xlsxwriter (which are roughly equal to each other).

In my experience, XlsxWriter is not that much slower than xlwt. But
of course performance can depend on a lot of factors.

> My question is in the subject: can higher performance be reached?
> Maybe there is some kind of memory-usage optimization, as in the other
> modules, that I haven't noticed?

XlsxWriter and OpenPyXL both have optimized modes:

http://xlsxwriter.readthedocs.org/working_with_memory.html
http://pythonhosted.org/openpyxl/optimized.html#optimized-writer
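
As a rough illustration of the XlsxWriter option behind the first link, here is a minimal sketch of `constant_memory` mode. The function names, the sample data and the output path are made up for the example, and the import is guarded only so that the snippet also loads where XlsxWriter is not installed. In this mode rows have to be written in increasing row order, since each row is flushed once the next one starts.

```python
import os
import tempfile

try:
    import xlsxwriter  # third-party package; assumed to be installed
except ImportError:
    xlsxwriter = None


def write_rows_constant_memory(rows, path):
    """Write an iterable of rows to `path` using constant_memory mode."""
    workbook = xlsxwriter.Workbook(path, {"constant_memory": True})
    worksheet = workbook.add_worksheet()
    # Rows must be written in order: each row is flushed when the next starts.
    for row_idx, row in enumerate(rows):
        worksheet.write_row(row_idx, 0, row)
    workbook.close()


def demo():
    """Write a small sample file; return its path (None without XlsxWriter)."""
    if xlsxwriter is None:
        return None
    path = os.path.join(tempfile.mkdtemp(), "constant_memory_demo.xlsx")
    rows = ([row * 10 + col for col in range(5)] for row in range(100))
    write_rows_constant_memory(rows, path)
    return path
```

Because the rows are passed as a generator, the full data set never has to sit in memory at once.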

There is also a newer package called PyExcelerate (NOT to be confused
with the old, buggy, no-longer-maintained pyExcelerator!), if you
don't need all the features of XlsxWriter:

https://github.com/whitehat2k9/PyExcelerate

You could also try using PyPy (in place of regular Python).

Beyond these measures, there's really not much to be done. If you
need significantly more speed, you probably have to look at using
faster (compiled) languages. (If you wind up writing C accelerator
modules for any of these Python packages, be sure to publish them! ;)

John Y.

Max Belkin

May 7, 2014, 1:03:31 AM
to python...@googlegroups.com
Hi, John!
Thanks for your answer.



> In my experience, XlsxWriter is not that much slower than xlwt. But
> of course performance can depend on a lot of factors.


I've tried all of these on the same data: xlwt, xlsxwriter (with optimisation) and openpyxl (with optimisation). The result was that xlwt was 2 times faster.

Maybe I will try C coding.

Cheers,
Max

John McNamara

May 7, 2014, 9:31:44 AM
to python...@googlegroups.com

On Wednesday, 7 May 2014 06:03:31 UTC+1, Max Belkin wrote:

> I've tried all of these on the same data: xlwt, xlsxwriter (with optimisation) and openpyxl (with optimisation). The result was that xlwt was 2 times faster.


Hi Max,

I recently did some benchmarks with xlwt, XlsxWriter and OpenPyXL in Pandas and found that xlwt was slightly faster than XlsxWriter (in un-optimised mode) and that both were about 5 times faster than OpenPyXL (also in un-optimised mode). XlsxWriter in `constant_memory` mode should be faster again.

Could you show the benchmark you are using?

PyExcelerate is the fastest of the current Python Excel writers. See the benchmark on the GitHub page.


John




Max Belkin

May 8, 2014, 2:45:26 AM
to python...@googlegroups.com
Hi John,

Thank you for the answer!

> Could you show the benchmark you are using?


I don't use any. I just tried to write about 32k rows * 35 columns, took my stopwatch and noted the results :D

They are: 
openpyxl (with optimize_write) 62 seconds
xlsxwriter (with constant_memory) 57 seconds
xlwt (simple) 28 seconds

All of the above are for 35347 rows, 38 columns.

I am using a Core i5-3470S, 4 GB RAM, Python 2.6.6 x64, Django.
Maybe I'm doing something wrong...

Regards,
Max

John McNamara

May 8, 2014, 6:45:31 AM
to python...@googlegroups.com
On Thursday, 8 May 2014 07:45:26 UTC+1, Max Belkin wrote:

> I don't use any. I just tried to write about 32k rows * 35 columns, took my stopwatch and noted the results :D


Hi,

Here is the output of a benchmark that I wrote to test various Excel writers in Python:

Versions:
    python      : 2.7.3
    openpyxl    : 1.8.6
    pyexcelerate: 0.5.0
    xlsxwriter  : 0.5.3
    xlwt        : 0.7.5

Dimensions:
    Rows = 35347
    Cols = 38

Times:
    pyexcelerate          :  11.94
    xlwt                  :  18.82
    xlsxwriter (optimised):  21.79
    xlsxwriter            :  26.91
    openpyxl   (optimised):  55.55
    openpyxl              :  98.76


The actual times will vary from machine to machine but I'd expect the overall trends to be the same.

Here is the benchmark program:


The benchmark writes alternate rows of strings and numbers. In most cases it writes the data cell by cell. The strings are unique to exercise the shared string table.
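
The original benchmark program is not reproduced in this thread. As a stand-in, the following minimal sketch shows only the data layout described above: alternate rows of unique strings and numbers, written cell by cell. The names `make_rows`, `write_cell_by_cell` and `DictSheet` are hypothetical; `DictSheet` is a dummy in-memory sheet standing in for a real worksheet object with a `write(row, col, value)` method.

```python
import time


def make_rows(n_rows, n_cols):
    """Yield alternating rows of unique strings and numbers."""
    for row in range(n_rows):
        if row % 2 == 0:
            # Unique strings, so a shared string table cannot deduplicate them.
            yield ["Row: %d Col: %d" % (row, col) for col in range(n_cols)]
        else:
            yield [row + col for col in range(n_cols)]


class DictSheet:
    """Dummy sheet (hypothetical) exposing write(row, col, value)."""

    def __init__(self):
        self.cells = {}

    def write(self, row, col, value):
        self.cells[(row, col)] = value


def write_cell_by_cell(sheet, rows):
    """Write each value with one write() call per cell; return elapsed seconds."""
    start = time.perf_counter()
    for row_idx, row in enumerate(rows):
        for col_idx, value in enumerate(row):
            sheet.write(row_idx, col_idx, value)
    return time.perf_counter() - start


# Build the data first so that only the writing loop is timed.
data = list(make_rows(1000, 38))
sheet = DictSheet()
elapsed = write_cell_by_cell(sheet, data)
print("wrote %d cells in %.3f s" % (len(sheet.cells), elapsed))
```

To time a real library, `DictSheet()` would be swapped for that library's worksheet object, with the time spent saving the file measured as well.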

If anyone sees any issues with the benchmark let me know.


But overall these results and the results from the PyExcelerate GitHub page that I linked to earlier don't match your results. So maybe there is some other factor that is influencing your tests.

John


 

Max Belkin

May 8, 2014, 8:06:40 AM
to python...@googlegroups.com
Hi
 
> But overall these results and the results from the PyExcelerate GitHub page that I linked to earlier don't match your results. So maybe there is some other factor that is influencing your tests.

Sure there is. Sorry, I wasn't precise. The time I measured is the time between a button click on the web form and the file download pop-up (it includes a DB request and, I guess, something else).
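
Since a stopwatch on the browser also measures the database query and the HTTP overhead, one way to isolate the file-writing time is to time each phase inside the view. Here is a minimal sketch using only the standard library; `timed`, `fetch_rows` and `build_file` are hypothetical names, and the two functions are placeholders for the real query and the real spreadsheet writer.

```python
import time
from contextlib import contextmanager

timings = {}


@contextmanager
def timed(label):
    """Record the wall-clock duration of the enclosed block under `label`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[label] = time.perf_counter() - start


def fetch_rows():
    """Placeholder for the database query."""
    return [[row * col for col in range(38)] for row in range(1000)]


def build_file(rows):
    """Placeholder for the spreadsheet writer; joins the cells into text."""
    return "\n".join(",".join(str(v) for v in row) for row in rows)


with timed("db query"):
    rows = fetch_rows()
with timed("file write"):
    payload = build_file(rows)

for label, seconds in timings.items():
    print("%-10s: %.4f s" % (label, seconds))
```

Comparing the "file write" figure across libraries then leaves the query and network time out of the picture.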
 
> Times:
>     pyexcelerate          :  11.94
>     xlwt                  :  18.82
>     xlsxwriter (optimised):  21.79
>     xlsxwriter            :  26.91
>     openpyxl   (optimised):  55.55
>     openpyxl              :  98.76

But this order is the same as the one I got. From fastest: xlwt, xlsxwriter (optimised), openpyxl (optimised).

Regards,
Max

John McNamara

May 8, 2014, 10:44:50 AM
to python...@googlegroups.com


On Thursday, 8 May 2014 13:06:40 UTC+1, Max Belkin wrote:

> But this order is the same as the one I got. From fastest: xlwt, xlsxwriter (optimised), openpyxl (optimised).


Hi Max,

The order is the same as you got, but the size of the gap is not: xlwt isn't 2 times faster than xlsxwriter in that benchmark or in any other test I've run. If it were, that would represent a major regression in xlsxwriter that I would like to know about (since I wrote that module). But anyway, that is mainly of concern to me.
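
To put numbers on that, the gap relative to xlwt can be computed from the two sets of figures already posted in this thread: Max's stopwatch times and the benchmark output. This is only arithmetic on those numbers; the variable and function names are made up for the example.

```python
# Times in seconds, copied from earlier messages in the thread.
stopwatch = {
    "xlwt": 28,
    "xlsxwriter (optimised)": 57,
    "openpyxl (optimised)": 62,
}
benchmark = {
    "xlwt": 18.82,
    "xlsxwriter (optimised)": 21.79,
    "openpyxl (optimised)": 55.55,
}


def relative_to_xlwt(times):
    """Return each time divided by the xlwt time, rounded to 2 places."""
    return {name: round(t / times["xlwt"], 2) for name, t in times.items()}


stopwatch_ratios = relative_to_xlwt(stopwatch)
benchmark_ratios = relative_to_xlwt(benchmark)
for name in stopwatch:
    print("%-24s stopwatch x%.2f  benchmark x%.2f"
          % (name, stopwatch_ratios[name], benchmark_ratios[name]))
```

With these figures, xlsxwriter (optimised) takes about 2.04 times as long as xlwt on the stopwatch but only about 1.16 times as long in the benchmark.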

I guess the answer to your question is that there isn't any undocumented optimisation mode for xlwt. It is pretty fast as it is (from my point of view), and if you need something faster, try PyExcelerate or PyPy as suggested by John Y.

Regards,

John
 

John McNamara

May 8, 2014, 11:05:38 AM
to python...@googlegroups.com

Also, just in case anyone is interested, here is the same test run under PyPy 2.2.1:


Times:
    pyexcelerate          :   5.85
    xlwt                  :   6.91
    xlsxwriter (optimised):   8.19
    xlsxwriter            :   9.78
    openpyxl   (optimised):  15.63
    openpyxl              :  25.29

That is an impressive 2-4 times faster, depending on the case.
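
For reference, the per-library speed-up can be worked out from the two result tables in this thread (Python 2.7.3 versus PyPy 2.2.1). This is a small sketch that only divides the posted times; the dictionary names are made up for the example.

```python
# Times in seconds from the two benchmark runs posted in this thread.
cpython = {
    "pyexcelerate": 11.94,
    "xlwt": 18.82,
    "xlsxwriter (optimised)": 21.79,
    "xlsxwriter": 26.91,
    "openpyxl (optimised)": 55.55,
    "openpyxl": 98.76,
}
pypy = {
    "pyexcelerate": 5.85,
    "xlwt": 6.91,
    "xlsxwriter (optimised)": 8.19,
    "xlsxwriter": 9.78,
    "openpyxl (optimised)": 15.63,
    "openpyxl": 25.29,
}

# Speed-up factor = time under regular Python / time under PyPy.
speedups = {name: cpython[name] / pypy[name] for name in cpython}
for name, factor in speedups.items():
    print("%-24s %.2fx" % (name, factor))
```

The factors range from about 2.0 (pyexcelerate) to about 3.9 (openpyxl, un-optimised).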

John



Max Belkin

May 9, 2014, 8:15:14 AM
to python...@googlegroups.com
Hi John,

Have you (or anybody else) tried Cython? Does it give some acceleration?

Cheers,
Max

John Machin

May 10, 2014, 7:03:18 AM
to python...@googlegroups.com
On 9/05/2014 10:15 PM, Max Belkin wrote:
> Hi John,
>
> Have you (or anybody else) tried Cython? Does it give some acceleration?
>
Not me. Not anybody else that I've heard of.