writing on existing excel file

3,454 views
Skip to first unread message

Mauro

unread,
Jul 15, 2009, 11:51:26 AM7/15/09
to python-excel
Dear group,

I'm using xlrd|xlwt|xlutilis to write some cells in an existing excel
file.

what i basically do is the following:

import xlrd
import xlutils

filename = 'file.xls'
rb = xlrd.open_workbook(filename, on_demand=True)
ws = xlutils.copy(rb)

#pick first sheet
shw = ws.get_sheet(1)

#some processing here (writing to shw), for example:
shw.write(1, 1, 1)
# ... more writing ...

#saving file
ws.save('anotherfile.xls')

#end of code

This is perfectly working, but damn slow, in particular in copying the
xlrd.Book to the xlwt.Workbook via xlutlis.copy. (my excel file is
about 4Mb).
Any hint or ideas to do the same faster?

Thanks a lot!
Mauro


John Machin

unread,
Jul 15, 2009, 7:26:28 PM7/15/09
to python...@googlegroups.com
On 16/07/2009 1:51 AM, Mauro wrote:

> I'm using xlrd|xlwt|xlutilis to write some cells in an existing excel
> file.

I hope not. In any case what you described appears to be intended to
copy a file then write some cells to the copy -- in which case saving
back to the same filename is not a good idea.

> This is perfectly working, but damn slow, in particular in copying the
> xlrd.Book to the xlwt.Workbook via xlutlis.copy. (my excel file is
> about 4Mb).
> Any hint or ideas to do the same faster?

Please tell us the time taken for xlrd.open_workbook() and the time
taken for "xlutils.copy()" -- aren't you missing a ".copy" somewhere in
there? -- with no extra writing.

Then run it again with xlrd.open_workbook(..., on_demand=False) and
report both times as above.

If the answers are something like (("instant", "damn slow"), 2 * ("slow
but not damnably", )) then you have a partial answer.

For better diagnosis, please express the times in SI time units (e.g.
seconds), and also tell us what platform, what versions of Python, xlrd,
xlwt, xlutils you are running.

Also some more info about your file; "4Mb" doesn't tell us much ...
sum(sh.ncols * sh.nrows for sh in book.sheets()) would be a much better
indication of how much work the copy() has to do -- extreme example:
file has one sheet with only one non-blank cell (IV65336). You may want
to check out your file with xlutils.margins.

Cheers,
John

Mauro

unread,
Jul 16, 2009, 4:33:21 AM7/16/09
to python-excel
Hi John,
thanks for your answer

> > I'm using xlrd|xlwt|xlutilis to write some cells in an existing excel
> I hope not. In any case what you described appears to be intended to
> copy a file then write some cells to the copy -- in which case saving
> back to the same filename is not a good idea.



If you check my code you see that I am saving back on another file, so
that's not the case.



> Please tell us the time taken for xlrd.open_workbook() and the time
> taken for "xlutils.copy()" -- aren't you missing a ".copy" somewhere in
> there? -- with no extra writing.

> Then run it again with xlrd.open_workbook(..., on_demand=False) and
> report both times as above.
>
> If the answers are something like (("instant", "damn slow"), 2 * ("slow
> but not damnably", )) then you have a partial answer.
>
> For better diagnosis, please express the times in SI time units (e.g.
> seconds), and also tell us what platform, what versions of Python, xlrd,
> xlwt, xlutils you are running.
>
> Also some more info about your file; "4Mb" doesn't tell us much ...
> sum(sh.ncols * sh.nrows for sh in book.sheets()) would be a much better
> indication of how much work the copy() has to do -- extreme example:
> file has one sheet with only one non-blank cell (IV65336). You may want
> to check out your file with xlutils.margins.


Here are all the details:
My excel file:
9 sheets, 111 columns 8788 rows each.
Actually most of the cells are blank (let's say whe have about 3000
cells filled)
Anyway the workbook will be progressively filled, up to 2millions
cells filled.

I am running my script on windows Vista Home and on Sun Solaris 5.9

times (Vista Home)
xlutils.copy() (on_demand = True) : 42.940000 seconds
xlutils.copy() (on_demand = False) : 39.146000 seconds

times (Solaris 5.9)
xlutils.copy() (on_demand = True) : 02 mins 42.486900 seconds
xlutils.copy() (on_demand = False) : 02 mins 29.924702 seconds


versions (both vista and solaris):
Python 2.5.4
xlutils 1.3.2
xlrd: 0.7.1
xlwt 0.7.2

cheers
Mauro

John Machin

unread,
Jul 16, 2009, 9:56:06 PM7/16/09
to python...@googlegroups.com
On 16/07/2009 6:33 PM, Mauro wrote:
> Hi John,
> thanks for your answer
>
>>> I'm using xlrd|xlwt|xlutilis to write some cells in an existing excel
>> I hope not. In any case what you described appears to be intended to
>> copy a file then write some cells to the copy -- in which case saving
>> back to the same filename is not a good idea.

> If you check my code you see that I am saving back on another file, so
> that's not the case.

So don't frighten your audience by saying (twice) that you are writing
in/on an existing excel file :-)

I did check your code, and raised an exception on the
import xlutils
ws = xlutils.copy(rb)
(should be:
import xlutils.copy
ws = xlutils.copy.copy(rb)
) and didn't read any further.

More on the subject of your code: don't you need xlrd.open_workbook(...,
formatting_info=True) ?

>> Please tell us the time taken for xlrd.open_workbook()

??

>> and the time
>> taken for "xlutils.copy()" -- aren't you missing a ".copy" somewhere in
>> there?

??

Please take the time to address all issues raised.

-- with no extra writing.
>
>> Then run it again with xlrd.open_workbook(..., on_demand=False) and
>> report both times as above.
>>
>> If the answers are something like (("instant", "damn slow"), 2 * ("slow
>> but not damnably", )) then you have a partial answer.
>>
>> For better diagnosis, please express the times in SI time units (e.g.
>> seconds), and also tell us what platform, what versions of Python, xlrd,
>> xlwt, xlutils you are running.
>>
>> Also some more info about your file; "4Mb" doesn't tell us much ...
>> sum(sh.ncols * sh.nrows for sh in book.sheets()) would be a much better
>> indication of how much work the copy() has to do -- extreme example:
>> file has one sheet with only one non-blank cell (IV65336). You may want
>> to check out your file with xlutils.margins.
>
>
> Here are all the details:
> My excel file:
> 9 sheets, 111 columns 8788 rows each.
> Actually most of the cells are blank

Is that blank really blank or just empty? What do xlutils.margins and
runxlrd biff_count tell you? If really blank, and you intend to fill
them, consider avoiding the needless copying.

> (let's say whe have about 3000
> cells filled)
> Anyway the workbook will be progressively filled, up to 2millions
> cells filled.
>
> I am running my script on windows Vista Home and on Sun Solaris 5.9
>
> times (Vista Home)
> xlutils.copy() (on_demand = True) : 42.940000 seconds
> xlutils.copy() (on_demand = False) : 39.146000 seconds
>
> times (Solaris 5.9)
> xlutils.copy() (on_demand = True) : 02 mins 42.486900 seconds
> xlutils.copy() (on_demand = False) : 02 mins 29.924702 seconds

The Windows box appears to have about 3.8 x the grunt of the Solaris
box; is that typical?

Windows box: What CPU, how many GHz, how many cores? How much real
memory spare when you were running your script?

I'd like to do some profiling to see where all the time is being spent.
I don't have any files with both similar contents and similar size. Any
chance of getting a copy of your file (under an NDA if necessary)?

> versions (both vista and solaris):
> Python 2.5.4
> xlutils 1.3.2
> xlrd: 0.7.1
> xlwt 0.7.2

Great; should be no problems there ...

Cheers,
John

Sammie

unread,
Jul 17, 2009, 6:38:37 PM7/17/09
to python-excel
Dear group:

Where do I find complete documentation to the properties and methods
of
classes such as workbook, worksheet, range, cell etc??

workbook: book.nsheets, etc
worksheet: get_sheet(), sh.name, sh.nrows, sh.ncols
sheet_by_index(index), sheet_names
shw.write(x, y, z) # descriptions of x, y, z
range: etc
cell : etc

I want to find these class properties and methods in order to write
a simple Python script to copy pre-selected sheets of several
workbooks
into one target workbook.

Thanks
-Sammie

John Machin

unread,
Jul 17, 2009, 9:30:29 PM7/17/09
to python...@googlegroups.com
On 18/07/2009 8:38 AM, Sammie wrote:
> Dear group:
>
> Where do I find complete documentation to the properties and methods
> of
> classes such as workbook, worksheet, range, cell etc??
>
> workbook: book.nsheets, etc
> worksheet: get_sheet(), sh.name, sh.nrows, sh.ncols
> sheet_by_index(index), sheet_names
> shw.write(x, y, z) # descriptions of x, y, z
> range: etc
> cell : etc
>
> I want to find these class properties and methods in order to write
> a simple Python script to copy pre-selected sheets of several
> workbooks
> into one target workbook.
>
> Thanks
> -Sammie
> On 15 July, 08:51, Mauro <bianchi...@gmail.com> wrote:
>> Dear group,
>>
>> I'm using xlrd|xlwt|xlutilis to write some cells in an existing excel
>> file.
>>


Sammie,

Please don't top-post.

Please don't hijack another person's thread -- start a new thread.

You asked much the same question on 12 July; in case you didn't notice
it, here is my response on 13 July:

"""
On 12/07/2009 2:45 PM, Sammie wrote:
> > I am new to xlwt, xlrd, xlutils. I am going through some of the
*.py
> > files (examples from the installations)
> >
> > The documentation downloaded at http://www.python-excel.org/
are not
> > complete.

> > Where can I download a complete function lists and how to use them?
> > (to name a few: such as wb.add_sheet, alignment, XFStyle,
> > ws.write_merge etc)
> >
> > In short I am looking for methods and attributes of classes (like
> > Workbook and Worksheet etc.)
> > I can't find them on the web.

There is no documentation available other than what you see on that
website.

You appear to be focused on xlwt ... folk seem to do wonders with the
embryonic documentation, the examples, and the source code.

(Chris Withers|I) would be quite interested to hear what extra
documentation you expect for (xlutils|xlrd).
"""

There is no other answer. Your focus seems to have switched to xlrd ...
the attributes and methods that you mention in your latest message are
covered in the xlrd documentation
(https://secure.simplistix.co.uk/svn/xlrd/trunk/xlrd/doc/xlrd.html) ...

Have you read the tutorial (PDF document)?

I say again: I would be quite interested to hear what extra
documentation you expect for xlrd.

Cheers,
John

Reply all
Reply to author
Forward
0 new messages