Getting Information on Excel's Merged Cells in xlrd Package

5,005 views
Skip to first unread message

Lee

unread,
Mar 2, 2009, 4:25:21 AM3/2/09
to python-excel
Hi,

I am using xlrd package to parse Excel spreadsheets.
I need to know either

- Location and dimentional information of all merged cells (l_col,
r_col, hi_row, lo_row), or
- if a cell is a merged cell

There is something equivalent in Perl's excelParser (mergedArea
method),
but not in Python's xlrd, or maybe I am missing something.

I cannot find any infomation on xlrd documentation.

Any info/pointer will be greatly appreciated.

Regards,

Lea Hwang

John Machin

unread,
Mar 2, 2009, 5:57:28 AM3/2/09
to python...@googlegroups.com
On 2/03/2009 8:25 PM, Lee wrote:
> Hi,
>
> I am using xlrd package to parse Excel spreadsheets.

Hi Lee, Welcome to xlrd & the group.

> I need to know either
>
> - Location and dimentional information of all merged cells (l_col,
> r_col, hi_row, lo_row), or
> - if a cell is a merged cell

This will get you going:

http://www.lexicon.net/sjmachin/xlrd.html#xlrd.Sheet.merged_cells-attribute

>
> There is something equivalent in Perl's excelParser (mergedArea
> method),
> but not in Python's xlrd, or maybe I am missing something.

Well maybe you are. If you have been looking for "merge" in xlrd source
code, and not found the merged_cells attribute of the Sheet object, then
check the xlrd version (import xlrd; print xlrd.__VERSION__). If it's
before 0.6.1, update from http://pypi.python.org/pypi/xlrd.

> I cannot find any infomation on xlrd documentation.

If you mean that you can't find info on merged cells in the xlrd
documentation, then that's most likely because of the out-of-date xlrd
version problem.

If you mean that you can't find any information on how to find a copy of
the xlrd docs, well may be you installed xlrd using some no-source
method (egg?) method. There are THREE sources of xlrd documentation:

(1) On the web: Google("xlrd"); first hit is the latest version on pypi.
Click on the "Homepage" link. Click on the "APIs" link. You have arrived.

(2) If you have installed xlrd from a source distribution (by unpacking
the tgz or zip file and running "python setup.py install") or used a
Windows installer, doc files including the same HTML file as you see on
the web will be installed alongside the source code e.g. on Windows,
docs are in
PYTHON_DIRECTORY\lib\site-packages\xlrd\doc\*.html
and the source is PYTHON_DIRECTORY\lib\site-packages\xlrd\*.py

(3) When all else fails: Look at the source code. All the HTML docs are
generated from the source code -- a block of comments starting with ##
before a class statement, a class attribute, or a def statement is
processed into HTML.

Any suggestions you have on the accessability of xlrd docs will be taken
on board -- a new release should be going out the door Real Soon Now (tm).

Cheers,
John

Lee

unread,
Mar 2, 2009, 11:39:37 AM3/2/09
to python-excel
Hi John,

I really appreciate your quick response.
I downloaded to 0.6.1 and it now compiles with merged_cells method.
But I am still have problem using it. Here is my code.


import xlrd, string, sys

print xlrd.__VERSION__

book = xlrd.open_workbook("myfile.xls")
sh = book.sheet_by_name("Opcode Map")

for crange in sh.merged_cells:
rlo, rhi, clo, chi = crange
print rlo, rhi, clo, chi


Here is the output:

0.6.1
sheet: Opcode Map

Basically it did not recognize any merged cells in the sheet despite
the
facts that there are many.

Any suggestion/pointer will be appreciated.


Regards,


Lee




On Mar 2, 4:57 am, John Machin <sjmac...@lexicon.net> wrote:
> On 2/03/2009 8:25 PM, Lee wrote:
>
> > Hi,
>
> >    I am using xlrd package to parse Excel spreadsheets.
>
> Hi Lee, Welcome to xlrd & the group.
>
> > I need to know either
>
> > - Location and dimentional information of all merged cells (l_col,
> > r_col, hi_row, lo_row), or
> > - if a cell is a merged cell
>
> This will get you going:
>
> http://www.lexicon.net/sjmachin/xlrd.html#xlrd.Sheet.merged_cells-att...
>
>
>
> >   There is something equivalent in Perl's excelParser (mergedArea
> > method),
> > but not in Python's xlrd, or maybe I am missing something.
>
> Well maybe you are. If you have been looking for "merge" in xlrd source
> code, and not found the merged_cells attribute of the Sheet object, then
>   check the xlrd version (import xlrd; print xlrd.__VERSION__). If it's
> before 0.6.1, update fromhttp://pypi.python.org/pypi/xlrd.

John Machin

unread,
Mar 2, 2009, 4:18:11 PM3/2/09
to python...@googlegroups.com
On 3/03/2009 3:39 AM, Lee top-posted:
> Hi John,

Hello Lee,


>
> I really appreciate your quick response.
> I downloaded to 0.6.1 and it now compiles with merged_cells method.
> But I am still have problem using it. Here is my code.
>
>
> import xlrd, string, sys

Nothing to do with your problem, but are you aware that almost
everything in the string module is deprecated in favour of str methods?
What version of Python are you using?

>
> print xlrd.__VERSION__
>
> book = xlrd.open_workbook("myfile.xls")
> sh = book.sheet_by_name("Opcode Map")
>
> for crange in sh.merged_cells:
> rlo, rhi, clo, chi = crange
> print rlo, rhi, clo, chi
>
>
> Here is the output:
>
> 0.6.1
> sheet: Opcode Map

There is nothing in your code that would cause the output "sheet: Opcode
Map" to appear. It doesn't matter this time, but please get into the
habit of posting the actual code that you ran to produce the output.

> Basically it did not recognize any merged cells in the sheet despite
> the
> facts that there are many.
>
> Any suggestion/pointer will be appreciated.

Already done: """When all else fails: Look at the source code."""

Pretend your Internet connection is broken, or, more realistically, that
support is available only about 18/7 [(Google("sleep")]. Locate your
copy of the source. merged_cells is a Sheet attribute, so sheet.py looks
like a good place to start. Look for all occurrences of merged_cells.
Scrutinise the place which most looks like digging the information out
of the file. Trace the problem backwards from there.

Please feel free to suggest a change to the documentation or the code.

HTH,
John

John Machin

unread,
Mar 3, 2009, 6:29:46 PM3/3/09
to python...@googlegroups.com, Lee
On 4/03/2009 8:00 AM, Lee wrote:
> Hi John,

Hi Lee,

I presume your taking the conversation off-list was accidental ...

> 1. It seems that to use the merged_cell attribute, the attribute
> formatting_info must be
> open_workbook(...., formatting_info=True).
> See line 638 of sheet.py. This requirement is not mentioned somewhere.

This deficiency will be fixed very soon.

> 2. The co-ord returned by the merged_cells is such that
> - the rlo is one less then what its suppose to be
> - the chi is one more than what its suppose to be
> For example, if a 1-row-2-col merged cell is located at row=97 and col=
> [9,10].

I presume 97 is counting from 1, but your unexplained [9,10] notation
appears to be counting from 0 ...

> The merge cell params will give: rlo=96, rhi=97, clo=9, chi=11

which indicates that the cells covered are (in Excel notation) J97:K97
aka R97C10:R97C11.

> Is this the specification?

Here's the specification:
"""
How to deconstruct the list:

for crange in thesheet.merged_cells:


rlo, rhi, clo, chi = crange

for rowx in xrange(rlo, rhi):
for colx in xrange(clo, chi):
"""

Here's the rationale:
xlrd is intended to be called by software written in Python by Python
programmers; consequently all column and row indexes are zero-relative
and all intervals are half-open (unless documented to be otherwise).
thing_lo is the index of the first thing. thing_hi is (the index of the
last thing) + 1. See "xrange" above.

Cheers,
John

Lee

unread,
Mar 3, 2009, 6:37:15 PM3/3/09
to python-excel
Hi John,

Thanks for the clarifications.

Lee
Reply all
Reply to author
Forward
0 new messages