Merged cells in read_only mode

420 views
Skip to first unread message

Sascha Hofmann

unread,
Feb 5, 2021, 4:32:22 AM2/5/21
to openpyxl-users
Hi all,

I am aware that currently merged_cells aren't available in ReadOnlyWorkbooks. I also saw this issue and the comment:
"because it requires parsing all of the worksheet". 

I understand that because a read_only workbook is loaded lazily you can't provide all merged cells at once but would it be possible to figure out that cell was merged once it has been read and maybe even with what cell it has been merged? 

Right now,  merged cells return an <EmptyCell> in read_only mode and I was wondering whether it's possible to return a <MergedCell> instead.

Thanks,
Sascha

Charlie Clark

unread,
Feb 5, 2021, 8:28:32 AM2/5/21
to openpyxl-users
On 5 Feb 2021, at 10:32, Sascha Hofmann wrote:

> Hi all,
>
> I am aware that currently merged_cells aren't available in
> ReadOnlyWorkbooks. I also saw this issue
> <https://foss.heptapod.net/openpyxl/openpyxl/-/issues/1302> and the
> comment:
> *"because it requires parsing all of the worksheet". *
>
> I understand that because a read_only workbook is loaded lazily you
> can't
> provide all merged cells at once but would it be possible to figure
> out
> that cell was merged once it has been read and maybe even with what
> cell it
> has been merged?
>
> Right now, merged cells return an *<EmptyCell> *in read_only mode and
> I
> was wondering whether it's possible to return a *<MergedCell>
> *instead*.*

No, this isn't possible. The problem is that merged cells are defined by
some XML at the end of the worksheet file. In standard mode we can have
both the cells and the merged cells in memory so it's possible to create
MergedCells after the file is loaded. Read-only mode streams directly
from the XML so there is no persistence: you can't forward to the end of
the file, get the merged cell definitions and rewind. To do this in
read-only mode would either slow down every cell in every worksheet or
make the API significantly more complicated and thus error prone.

However, you could probably fairly easily create your own code for this
using the parser: simply subclass it to return only the merged cell
definitions and adjust the code for handling rows so that a check could
be made whether any cell is in a range.

Charlie

--
Charlie Clark
Managing Director
Clark Consulting & Research
German Office
Kronenstr. 27a
Düsseldorf
D- 40217
Tel: +49-211-600-3657
Mobile: +49-178-782-6226

Sascha Hofmann

unread,
Feb 5, 2021, 8:57:35 AM2/5/21
to openpyxl-users
Ah great, that makes! Will have a look at that. Thank you.
Reply all
Reply to author
Forward
0 new messages