DrawingML objects -- error parsing a:rect coordinates

pba...@blucoders.net

May 9, 2019, 5:26:08 AM
to openpyxl-users
Hi,

I've been trying to use openpyxl to insert data into an XLSX file containing images and drawings. I'm hoping to be able to open the file, modify the cell data and save it again, preserving as much of the original file as possible. I understand that the DrawingML support in openpyxl is still under development, and so losing the drawings is not a problem, but I would like the images to be preserved.

If I create the file using LibreOffice, this is indeed what happens: the images are copied across, but the drawings are lost.

However, if I create the file using Excel, then neither the images nor the drawings are preserved. If I remove the drawings from the Excel file first, the images are preserved.

Extracting the file created by Excel, I found the following lines in the xl/drawings/drawing1.xml:

<a:custGeom>
  <a:avLst/>
  <a:gdLst/>
  <a:ahLst/>
  <a:rect l="l" t="t" r="r" b="b"/>
  <a:pathLst>
    <a:path w="21600" h="21600">
      <a:moveTo>
        <a:pt x="0" y="0"/>
      </a:moveTo>
      <a:lnTo>
        <a:pt x="21600" y="21600"/>
      </a:lnTo>
    </a:path>
  </a:pathLst>
</a:custGeom>


An exception was raised when using openpyxl to parse the coordinates for the a:rect element. It seems that the attributes l, r, t and b are expected to be integers:

Traceback (most recent call last):
  File "/home/pbanks/src/openpyxl-testing/venv/lib/python3.7/site-packages/openpyxl/descriptors/base.py", line 57, in _convert
    value = expected_type(value)
ValueError: invalid literal for int() with base 10: 'l'

Removing the a:rect line from the XML file allows openpyxl to parse the file correctly. 
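
For anyone who wants to repeat that workaround without editing the archive by hand, here is a minimal sketch using only the standard library. The function names are hypothetical, and it assumes the DrawingML namespace is bound to the prefix "a" and that a:rect is self-closing, as in the excerpt above. It is a stopgap for testing, not a fix for the parser.

```python
import re
import zipfile

# Assumption: the DrawingML namespace uses the prefix "a" and the element
# is self-closing, as in the drawing1.xml excerpt quoted in this message.
RECT_PATTERN = re.compile(rb"<a:rect\b[^>]*/>")


def strip_rect(xml_bytes):
    """Return the drawing XML with every self-closing a:rect element removed."""
    return RECT_PATTERN.sub(b"", xml_bytes)


def strip_rect_from_workbook(src_path, dst_path):
    """Copy an XLSX archive, removing a:rect from each xl/drawings/*.xml part."""
    with zipfile.ZipFile(src_path) as src, \
            zipfile.ZipFile(dst_path, "w", zipfile.ZIP_DEFLATED) as dst:
        for info in src.infolist():
            data = src.read(info.filename)
            is_drawing = (info.filename.startswith("xl/drawings/")
                          and info.filename.endswith(".xml"))
            if is_drawing:
                data = strip_rect(data)
            # Reuse the original ZipInfo so names and metadata carry over.
            dst.writestr(info, data)
```

Calling strip_rect_from_workbook("input_image_bad_arrow.xlsx", "stripped.xlsx") would write a copy whose drawing parts no longer contain the a:rect line; the output filename here is only an example.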

I looked into the Office Open XML definition for the a:rect element [1, pp 2922, section 20.1.9.22]. Each of these attributes is expected to be of the type ST_AdjCoordinate (defined in [1, pp 2924, section 20.1.10.2]), which is the union of the types ST_Coordinate and ST_GeomGuideName. I don't fully understand the definitions in the document, but the schema says that the latter can be any token, so it appears that this XML file is in fact valid and that this is a bug in openpyxl's parsing code.
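
To illustrate what accepting that union might look like, here is a rough sketch of a converter that tries an integer first and otherwise keeps the value as a guide name. The function name parse_adj_coordinate is hypothetical; this is not openpyxl's code or its eventual fix, only a reading of the type as described above.

```python
def parse_adj_coordinate(value):
    """Interpret an ST_AdjCoordinate-style attribute value.

    Returns an int when the text is a plain integer coordinate; otherwise
    returns the text itself, treated as a guide name such as "l" or "r".
    """
    text = str(value).strip()
    if not text:
        raise ValueError("empty coordinate")
    try:
        return int(text)
    except ValueError:
        # Not an integer: assume it is a guide name token.
        return text


# The attributes from the a:rect element that triggered the exception.
rect_attrs = {"l": "l", "t": "t", "r": "r", "b": "b"}
parsed = {name: parse_adj_coordinate(v) for name, v in rect_attrs.items()}
```

With a converter of this shape, the guide names in the a:rect element pass through as strings, while numeric values such as "21600" still become integers.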

My knowledge of DrawingML is limited, so please excuse me if my reading of the ECMA documentation is wrong. If someone can confirm that this is an issue, I'm happy to file a bug report about it.

Let me know if you need any more information.

Best wishes,
Peter

PS: attached are XLSX files with the drawing which causes the issues, and also a copy of the file with the offending a:rect line removed, along with a short program that demonstrates the issue. Also attached is a traceback generated by adding a call to traceback.print_exc in openpyxl/reader/drawings.py on the line before the warning about DrawingML support.

input_image_bad_arrow.xlsx
input_image_good_arrow.xlsx
test_image_support.py
traceback.txt

Charlie Clark

May 9, 2019, 6:19:47 AM
to openpyx...@googlegroups.com
On .05.2019 at 11:26, pbanks via openpyxl-users
<openpyx...@googlegroups.com> wrote:

> My knowledge of DrawingML is limited, so please excuse me if my reading
> of the ECMA documentation is wrong. If someone can confirm that this is
> an issue, I'm happy to file a bug report about it.

Hi Peter,

thanks for the detailed report. Please do file a bug. DrawingML support is
incomplete largely because it is so extensive and abstract and has all
those groups that the code generator doesn't really know how to deal with.
That said, this is indeed probably pretty easy to fix.

Charlie
--
Charlie Clark
Managing Director
Clark Consulting & Research
German Office
Kronenstr. 27a
Düsseldorf
D- 40217
Tel: +49-211-600-3657
Mobile: +49-178-782-6226

pba...@blucoders.net

May 11, 2019, 6:59:54 AM
to openpyxl-users

Hi Charlie,


Best wishes,
Peter 