Significant resources used by open_workbook:
(1) a file object used to access the physical file -- Book.__init__
    releases this as soon as it has obtained (2); look for "close" and
    "del"
(2) the str object or mmap object containing the file contents --
    Book.__init__ releases this as soon as it has obtained (3) ... unless
    (3) *is* (2)
(3) Book.mem, which is the str object or mmap object containing the
    contiguous data stream to be parsed -- this is released by
    open_workbook calling Book.release_resources just before open_workbook
    returns to its caller
(4) the actual data, which is released implicitly (and thus becomes
    available for garbage collection) when you stop referring to it
This style of operating should not cause a problem, unless the
operating system is not returning freed memory to its pool:
    import xlrd

    for i in xrange(10000):
        wb = xlrd.open_workbook('foo%d.xls' % i)
        do_something_with(wb)
        # the following is optional
        del wb
        do_some_other_work_with(something_else)
This, on the other hand, is likely to run out of memory, because
references to every Book object are being retained unnecessarily:
    wbs = [xlrd.open_workbook('foo%d.xls' % i) for i in xrange(10000)]
    for wb in wbs:
        do_something_with(wb)
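If you want to see the difference between the two patterns without any real XLS files, here is a minimal sketch. FakeBook and open_fake_workbook are hypothetical stand-ins that I made up for illustration; they are not part of xlrd. The sketch uses weak references to count how many "books" are still alive at once, and it assumes CPython's reference counting (it is written with range so that it runs on both Python 2 and 3).

```python
import gc
import weakref


class FakeBook(object):
    """Hypothetical stand-in for an xlrd Book; it just holds some bulky data."""

    def __init__(self, index):
        self.index = index
        self.data = [0] * 1000


def open_fake_workbook(index):
    # Assumption: plays the role of xlrd.open_workbook in this sketch only.
    return FakeBook(index)


def count_alive(refs):
    """Count the weak references whose target has not been collected yet."""
    gc.collect()
    return sum(1 for r in refs if r() is not None)


def peak_one_at_a_time(n):
    """Open, use and drop one book per iteration; return the peak number alive."""
    refs = []
    peak = 0
    for i in range(n):
        wb = open_fake_workbook(i)
        refs.append(weakref.ref(wb))
        peak = max(peak, count_alive(refs))
        del wb  # optional: the next iteration would rebind wb anyway
    return peak


def alive_when_listed(n):
    """Keep every book in a list; return how many are alive at the same time."""
    wbs = [open_fake_workbook(i) for i in range(n)]
    refs = [weakref.ref(wb) for wb in wbs]
    return count_alive(refs)


print(peak_one_at_a_time(100))  # 1 on CPython
print(alive_when_listed(100))   # 100
```

With the loop, at most one book is alive at any moment; with the list, all of them are, so memory use grows with the number of files.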
What are the average and maximum sizes (in MB) of your XLS files, and
how many do you open in one thread or process? Are you actually
experiencing a problem?
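In case it helps with answering the size question, a rough sketch for collecting those numbers follows. The function name xls_size_stats and the 'foo*.xls' pattern are only illustrative assumptions; adjust the pattern to wherever your files live.

```python
import glob
import os


def xls_size_stats(pattern):
    """Return (count, average_mb, max_mb) for the files matching pattern."""
    sizes = [os.path.getsize(path) for path in glob.glob(pattern)]
    if not sizes:
        return (0, 0.0, 0.0)
    mb = 1024.0 * 1024.0
    average_mb = sum(sizes) / mb / len(sizes)
    max_mb = max(sizes) / mb
    return (len(sizes), average_mb, max_mb)


if __name__ == '__main__':
    # 'foo*.xls' is a placeholder pattern matching the file names used above.
    print(xls_size_stats('foo*.xls'))
```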
Cheers,
John