Upcoming breaking change in openpyxl 3.1

438 views
Skip to first unread message

Charlie Clark

unread,
Apr 15, 2021, 10:27:46 AM4/15/21
to openpyxl-users

Hiya,

this is advanced warning of a minor breaking change in openpyxl 3.1 over the way array formula are handled.

What are array formulae? If you need to ask the question, you probably don't need to know the answer… but array formulae are an option in Excel to apply the same formula across an arrange of cells. For further information see Microsoft's documentation:
https://support.microsoft.com/en-us/office/guidelines-and-examples-of-array-formulas-7d94a64e-3ff3-4686-9372-ecfd5caa57c7#ID0EAAEAAA=Office_2010_-_Office_2019

Currently openpyxl provides undocumented support for these so that they are at least preserved in existing files. The details of this are that ws.formula_attributes is a dictionary with the various ranges with such formulae. For the example from Excel's documentation it looks like this:

{'E2': {'t': 'array', 'ref': 'E2:E11'}, 'F11': {'t': 'array', 'ref': 'F11'}}

This is great if you know this, but if you look at the cells themselves you wouldn't necessarily know and could easily break things:

ws["E2"].value
'=C2:C11*D2:D11'
ws["E3"].value
148000

NB. Excel will display such formulae for each cell in the range enclosed in curly brackets:

{=C2:C11*D2:D11} but this is in the GUI only.

This will change and, as I know, that some people have used this in the past, I'm providing advanced warning. Openpyxl 3.1 includes support for so-called "data table" formulae, which receive full support: if a worksheet contains a data table formula then the relevant cell will contain a DataTableFormula instance for inspection, modification, etc. Again, this is likely only to be relevant when such formulae already exist in those large and complicated templates that some companies like to use, because their handling is far from straightforward.

For array formulae the situation will be similar: ws.formula_attributes will be dropped and the relevant cells will contain ArrayFormula objects, which will provide a hint to client code that these should be handled with care.

Lookups will be possible:

formula_attributes = {}
for row in ws:
    for cell in row:
        if cell.data_type == "f":
             if isinstance(cell.value, ArrayFormula):
                  formula_attributes[cell.coordinate] = cell.value.ref

This could be provided as a property, though I'm not sure it's going to be required that often. Feedback welcome!

Charlie

--
Charlie Clark
Managing Director
Clark Consulting & Research
German Office
Waldlehne 23
Düsseldorf
D- 40489
Tel: +49-203-3925-0390
Mobile: +49-178-782-6226

Reply all
Reply to author
Forward
0 new messages