VLOOKUP formula using a named Table is corrupted when saved

108 views
Skip to first unread message

Marek Staroň

unread,
Sep 22, 2024, 5:29:41 AMSep 22
to openpyxl-users
I'm using openpyxl to insert a VLOOKUP formula which uses a named Table as a source of labels for key values in the neighboring column.

Formula example:
=IF(D2:D1001="","",VLOOKUP(D2:D1001,Reference_dataset_Measures_code,2,FALSE))

(the formula is dynamically generated, and this is how it looks after it has been inserted to the respective cell)

When I open the excel file though, after saving the workbook, the formula looks different, and doesn't work:

=IF(@D2:D1001="","",VLOOKUP(@D2:D1001,'Reference_dataset_Measures_code',2,FALSE))

If I manually remove the '@' and the single quotes, it works as intended.

Any idea why is this happening and how to prevent it?

Charlie Clark

unread,
Sep 22, 2024, 7:26:40 AMSep 22
to openpyxl-users
On 22 Sep 2024, at 11:29, Marek Staroň wrote:

> If I manually remove the '@' and the single quotes, it works as intended.

Excel internally uses extra symbols for particularly purposes. I'm not that familiar with what for but things like arrays, etc. It's like the translations in that it's well meant but actually infuriating.

> Any idea why is this happening and how to prevent it?

I think you need to look of the source produced by Excel and see if other attributes have been set.

Charlie

--
Charlie Clark
Managing Director
Clark Consulting & Research
German Office
Sengelsweg 34
Düsseldorf
D- 40489
Tel: +49-203-3925-0390
Mobile: +49-178-782-6226

Marek Staroň

unread,
Sep 22, 2024, 10:47:53 AMSep 22
to openpyxl-users
Thank you for the reply Charlie.

On Sunday 22 September 2024 at 13:26:40 UTC+2 charli...@clark-consulting.eu wrote:
On 22 Sep 2024, at 11:29, Marek Staroň wrote:

> If I manually remove the '@' and the single quotes, it works as intended.

Excel internally uses extra symbols for particularly purposes. I'm not that familiar with what for but things like arrays, etc. It's like the translations in that it's well meant but actually infuriating.

Yes, I remember that when I open the .xlsx first time after it was created, Excel showed a green message box proudly stating that the '@' symbols were added to the formula for some reason, which I don't remember, and that it will still work as before, which it obviously didn't.

> Any idea why is this happening and how to prevent it?

I think you need to look of the source produced by Excel and see if other attributes have been set.

Ok. I had a look at the xml of the respective sheet after it was created, and then again after I fixed the formula and saved it.

The formula related part have originally looked like this:
<c r="E2" s="1">
  <f>IF(D2:D1001="","",VLOOKUP(D2:D1001,Reference_dataset_Measures_code,2,FALSE))</f>
  <v/>
</c>

It looks exactly as entered, so it have not been corrupted by the safe.

However, after fixing, it looks like this:
<c r="E2" s="1" t="str" cm="1">
  <f t="array" ref="E2:E1001">IF(D2:D1001="","",VLOOKUP(D2:D1001,Reference_dataset_Measures_code[],2,FALSE))</f>
  <v/>
</c>


So there's an obvious discrepancy in how the formula looks and how it is actually stored.

The first thing is the addition of the '[]' after the named Table name. I have modified my python script to add it there as well, and it fixed the extra single quotes issue.

But I don't know how to get rid of the '@'s. I guess it may have something to do with the attributes in the <f> element added after the correction.
Any idea how I can add those attributes using the openpyxl?

Marek Staroň

unread,
Sep 22, 2024, 11:35:08 AMSep 22
to openpyxl-users
On Sunday 22 September 2024 at 16:47:53 UTC+2 Marek Staroň wrote:

But I don't know how to get rid of the '@'s. I guess it may have something to do with the attributes in the <f> element added after the correction.
Any idea how I can add those attributes using the openpyxl?

I think I figured it out. Kind of.

Instead of inserting the formula as a cell value, I used the ArrayFormula object to put it in instead. It got rid of the '@' and the it works as it should. 

But it's now a true array formula (surrounded by '{}') and thus it can't be easily modified.
Clipboard01.png

It is a bit strange, because in the sheet xml the formula part looks almost exactly the same as the manually corrected one from before.
 
<c r="E2" s="1" t="str">
  <f t="array" ref="E2:E1001">IF(D2:D1001="","",VLOOKUP(D2:D1001,Reference_dataset_Measures_code[],2,FALSE))</f>
  <v/>
</c>


Could that cm="1" attribute, which is now missing in the <c> element, be the reason?

Charlie Clark

unread,
Sep 23, 2024, 5:07:45 AMSep 23
to openpyxl-users
On 22 Sep 2024, at 17:35, Marek Staroň wrote:

> I think I figured it out. Kind of.
>
> Instead of inserting the formula as a cell value, I used the ArrayFormula
> object to put it in instead. It got rid of the '@' and the it works as it
> should.

Right. Excel adds these attributes automatically in the GUI for arrays and data tables. Obviously, we can't do this in Openpyxl without changing formulae from simple strings to something more complicated. And, as the overwhelming majority of formulae are simple strings, we added classes for the other objects.

> But it's now a true array formula (surrounded by '{}') and thus it can't be
> easily modified.
> [image: Clipboard01.png]
>
> It is a bit strange, because in the sheet xml the formula part looks almost
> exactly the same as the manually corrected one from before.
>
> <c r="E2" s="1" t="str">
> <f t="array"
> ref="E2:E1001">IF(D2:D1001="","",VLOOKUP(D2:D1001,Reference_dataset_Measures_code[],2,FALSE))</f>
> <v/>
> </c>
>
> Could that cm="1" attribute, which is now missing in the <c> element, be
> the reason?

Probably, that has something to do with "cell metadata" about which I know very little, but Liam has been working on adding support for this.

https://foss.heptapod.net/openpyxl/openpyxl/-/merge_requests/439

You might have luck with a checkout of the metadata-xml topic.

Francis Finnegan

unread,
Sep 25, 2024, 6:24:55 AMSep 25
to openpyx...@googlegroups.com
On Sunday 22 September 2024 at 16:47:53 UTC+2 Marek Staroň wrote:

But I don't know how to get rid of the '@'s. I guess it may have something to do with the attributes in the <f> element added after the correction.
Any idea how I can add those attributes using the openpyxl?

I think I figured it out. Kind of.

Instead of inserting the formula as a cell value, I used the ArrayFormula object to put it in instead. It got rid of the '@' and the it works as it should. 

But it's now a true array formula (surrounded by '{}') and thus it can't be easily modified.
<Clipboard01.png>


It is a bit strange, because in the sheet xml the formula part looks almost exactly the same as the manually corrected one from before.
 
<c r="E2" s="1" t="str">
  <f t="array" ref="E2:E1001">IF(D2:D1001="","",VLOOKUP(D2:D1001,Reference_dataset_Measures_code[],2,FALSE))</f>
  <v/>
</c>


Could that cm="1" attribute, which is now missing in the <c> element, be the reason?

--
You received this message because you are subscribed to the Google Groups "openpyxl-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to openpyxl-user...@googlegroups.com.
To view this discussion on the web, visit https://groups.google.com/d/msgid/openpyxl-users/b96d0feb-b1c6-4f46-b860-fbe778c7021an%40googlegroups.com.
<Clipboard01.png>

Marek Staroň

unread,
Sep 29, 2024, 6:00:36 AMSep 29
to openpyxl-users
Thank you Francis.

Unfortunately, adding the _xlfn. before the function didn't help to get rid of the '@'. 
I tried to put it in front of IF. It didn't help. 
I tried to remove the IF part and use only the VLOOKUP part. It didn't work either.

I also can't use your approach to see how the function looks when it's read back by python. 
Reading the value of the corresponding cell where I created the formula manually, doesn't give me the formula, but a reference to an ArrayFormula object instead.

<openpyxl.worksheet.formula.ArrayFormula object at 0x0000015BD00911F0>

Being quite new to python, I don't quite know how to get to that object and how to parse it to get the actual formula. But I assume that it will look quite similar to what I used in the ArrayFormula to create the formula myself. As I wrote before, using it got rid of the '@' characters, but wrapped the formula in '{}'.

Anyway, due to other issues, I had to redesign my solution using the xlswriter library. 
An observation. The library has two ways how to write array formulas.

ws.write_array_formula - formula with curly brackets
ws.write_dynamic_array_formula - no curly brackets.

Does openpyxl has a similar option, or is it something planned?
Reply all
Reply to author
Forward
0 new messages