Adding and formatting multiple runs to title paragraph.

224 views
Skip to first unread message

jfoxrab...@gmail.com

unread,
Jun 29, 2016, 12:06:33 PM6/29/16
to openpyxl-users
Hi,

I have come across what is probably an uncommon use-case in openpyxl, and I was wondering if there is a workaround. I am trying to add a colored run of unicode characters to an axis title to simulate a simplified legend entry on a scatter plot.

```
from openpyxl import *

book = workbook.Workbook()
sheet = book.active
chart = chart.ScatterChart()
# Add some data to the chart.
sheet.add_chart(chart, 'A1')

chart.y_axis.title = 'blah'

prop = drawing.text.CharacterProperties(solidFill='FF0000')
run = drawing.text.RegularTextRun(rPr=prop, t='\u00b7\u00b7\u00b7\u00b7')
chart.y_axis.title.tx.rich.paragraphs.append(drawing.text.Paragraph(r=run))

book.save('test.xlsx')
```

There are a couple of issues here. The major one is that there is no way to append `run` above to the existing `chart.y_axis.title.tx.rich.paragraphs[0]`. This could likely be fixed by changing

```
r = Typed(expected_type=RegularTextRun, allow_none=True)
```

to

```
r = Sequence(expected_type=RegularTextRun, allow_none=True)
```

on line 532 of `drawing.text` (`Paragraph` class), but I am not 100% sure about that. The minor issue is that the line `prop = drawing.text.CharacterProperties(solidFill='00FF00')` does not actually do anything.

By the way, before I append the new paragraph, I tried doing

```
chart.y_axis.title.tx.rich.paragraphs[-1].br = None
```

to simulate multiple runs by possibly eliminating the line break between paragraphs, but that did not appear to work at all (still got the label on two lines).

I am using OpenPyXL 2.3.2 on a CentOS 6.8 machine, viewing the results on Windows 7, MS Office Excel 2013.

Here is an output of what I am getting as well as what I am trying to achieve (ignore the data part of the plot):

**Actual result**

![Actual.png](https://bitbucket.org/repo/RMdExx/images/1693739062-Actual.png)

**What I would like to see**

![Expected.png](https://bitbucket.org/repo/RMdExx/images/1172593193-Expected.png)

Regards,

    -Joe

Charlie Clark

unread,
Jun 29, 2016, 2:37:12 PM6/29/16
to openpyx...@googlegroups.com
Am .06.2016, 18:06 Uhr, schrieb <jfoxrab...@gmail.com>:

> Hi,

Hi Joe,

thanks for taking this off the issues, even if I think there probably is a
bug here.

> I have come across what is probably an uncommon use-case in openpyxl,
> and I was wondering if there is a workaround. I am trying to add a
> colored run of unicode characters to an axis title to simulate a
> simplified legend entry
> on a scatter plot.

This really is unchartered territory!

> ```
> from openpyxl import *
>
> book = workbook.Workbook()
> sheet = book.active
> chart = chart.ScatterChart()
> # Add some data to the chart.
> sheet.add_chart(chart, 'A1')
>
> chart.y_axis.title = 'blah'
>
> prop = drawing.text.CharacterProperties(solidFill='FF0000')
> run = drawing.text.RegularTextRun(rPr=prop, t='\u00b7\u00b7\u00b7\u00b7')
> chart.y_axis.title.tx.rich.paragraphs.append(drawing.text.Paragraph(r=run))
>
> book.save('test.xlsx')
> ```
>
> There are a couple of issues here. The major one is that there is no way
> to append `run` above to the existing
> `chart.y_axis.title.tx.rich.paragraphs[0]`. This could likely be fixed by
> changing
>
> ```
> r = Typed(expected_type=RegularTextRun, allow_none=True)
> ```
>
> to
>
> ```
> r = Sequence(expected_type=RegularTextRun, allow_none=True)
> ```
>
> on line 532 of `drawing.text` (`Paragraph` class), but I am not 100% sure
> about that. The minor issue is that the line `prop =
> drawing.text.CharacterProperties(solidFill='00FF00')` does not actually
> do anything.

The real problem is that very little of the drawing stuff has actually
been tested. It is a completely separate specification and is not related
to the spreadsheet stuff. openpyxl 2.3 contains just enough test coverage
to get things done and work reliably for things like titles.

The specification does indeed allow for multiple child elements but it is
the group itself that is unbounded.

<xsd:complexType name="CT_TextParagraph">
<xsd:sequence>
<xsd:element name="pPr" type="CT_TextParagraphProperties" minOccurs="0"
maxOccurs="1"/>
<xsd:group ref="EG_TextRun" minOccurs="0" maxOccurs="unbounded"/>
<xsd:element name="endParaRPr" type="CT_TextCharacterProperties"
minOccurs="0" maxOccurs="1"/>
</xsd:sequence>
</xsd:complexType>

I think text.CharacterProperties with a fill refers to the kind of thing
you can do with word art.

By the way all of these classes were auto-generated from the schema but
the code for this is not really very good and I was losing the will to
live when I started cleaning them up. Suffice it to say the documentation
here is very different.

While your proposed fix might work for text runs, it really needs to
handle all the possibilities to be accepted and I haven't got an idea on
how to do this with the current object model, which allows sequences but
not of mixed elements.

This stuff is so fiddly that code will not be accepted without 100%
statement coverage and I'm in no hurry to do it myself. I'd like to add
read support for charts in 2.5 but will probably do something similar to
the way we handle it with strings elsewhere and strip all formatting
information.

> By the way, before I append the new paragraph, I tried doing
>
> ```
> chart.y_axis.title.tx.rich.paragraphs[-1].br = None
> ```
>
> to simulate multiple runs by possibly eliminating the line break between
> paragraphs, but that did not appear to work at all (still got the label
> on two lines).

No idea. You really need to look at the generated XML to see what's
missing or needs adding

> I am using OpenPyXL 2.3.2 on a CentOS 6.8 machine, viewing the results on
> Windows 7, MS Office Excel 2013.

You should at least upgrade to 2.3.5, though that has no relevance here.
However, no changes will be accepted into the 2.3 branch, so you're best
working with a 2.4 checkout.

Charlie
--
Charlie Clark
Managing Director
Clark Consulting & Research
German Office
Kronenstr. 27a
Düsseldorf
D- 40217
Tel: +49-211-600-3657
Mobile: +49-178-782-6226

Joseph Fox-Rabinovitz

unread,
Jun 29, 2016, 4:00:43 PM6/29/16
to openpyxl-users

Charlie,

Thanks for the quick response. I was basing the statement that adding "sequence" would work on what I know about python-docx (which is very little indeed). I created a chart with the type of thing I am interested in in Excel (attached here as a doc and as a screenshot). The relevant XML (for the x-axis) seems to be here:

<c:title> <c:tx> <c:rich> <a:bodyPr rot="0" spcFirstLastPara="1" vertOverflow="ellipsis" vert="horz" wrap="square" anchor="ctr" anchorCtr="1" /> <a:lstStyle /> <a:p> <a:pPr> <a:defRPr sz="1000" b="0" i="0" u="none" strike="noStrike" kern="1200" baseline="0"> <a:solidFill> <a:schemeClr val="tx1"> <a:lumMod val="65000" /> <a:lumOff val="35000" /> </a:schemeClr> </a:solidFill> <a:latin typeface="+mn-lt" /> <a:ea typeface="+mn-ea" /> <a:cs typeface="+mn-cs" /> </a:defRPr> </a:pPr> <a:r> <a:rPr lang="en-US" /> <a:t>X-Axis Title</a:t> </a:r> <a:r> <a:rPr lang="en-US"> <a:solidFill> <a:schemeClr val="accent1" /> </a:solidFill> </a:rPr> <a:t>Different Color</a:t> </a:r> </a:p> </c:rich> </c:tx> <c:layout /> <c:overlay val="0" /> <c:spPr> <a:noFill /> <a:ln> <a:noFill /> </a:ln> <a:effectLst /> </c:spPr> <c:txPr> <a:bodyPr rot="0" spcFirstLastPara="1" vertOverflow="ellipsis" vert="horz" wrap="square" anchor="ctr" anchorCtr="1" /> <a:lstStyle /> <a:p> <a:pPr> <a:defRPr sz="1000" b="0" i="0" u="none" strike="noStrike" kern="1200" baseline="0"> <a:solidFill> <a:schemeClr val="tx1"> <a:lumMod val="65000" /> <a:lumOff val="35000" /> </a:schemeClr> </a:solidFill> <a:latin typeface="+mn-lt" /> <a:ea typeface="+mn-ea" /> <a:cs typeface="+mn-cs" /> </a:defRPr> </a:pPr> <a:endParaRPr lang="en-US" /> </a:p> </c:txPr> </c:title>
I am not sure how the XML-object mapping works in detail (although it seems pretty straightforward on the surface). It appears that `
<a:p>` can contain a sequence of `<a:r>` and that `<a:solidFill>` is how the text color gets set. Do you have any recommendations based on that?

If I am able to get this working in my checkout, what level of testing are you looking for? Is a check that the correct XML elements are generated sufficient, or do you want some sort of visual comparison? Would you accept just the small change if it was properly tested?

Thanks,

-Joe

OPX_Book1.xlsx

Joseph Fox-Rabinovitz

unread,
Jun 29, 2016, 4:03:08 PM6/29/16
to openpyxl-users

Charlie Clark

unread,
Jun 30, 2016, 8:26:00 AM6/30/16
to openpyx...@googlegroups.com
Am .06.2016, 22:00 Uhr, schrieb Joseph Fox-Rabinovitz
<jfoxrab...@gmail.com>:


> Charlie,
>
> Thanks for the quick response. I was basing the statement that adding
> "sequence" would work on what I know about python-docx (which is very
> little indeed). I created a chart with the type of thing I am interested
> in in Excel (attached here as a doc and as a screenshot). The relevant
> XML
> (for the x-axis) seems to be here:

python-docx, a bit like the related powerpoint library, relies on lxml and
manipulating the XML in memory, which keeps the code a bit simpler. This
simply isn't possible with Excel because you can have up 16,384,000,000
cells in a single sheet. openpyxl uses pure Python objects that generally
mimic the XML tree but try and simplify it where possible. This means
people generally work with attributes but also that we can add convenience
methods.

> <c:title>
> <c:tx>
> <c:rich>
> <a:bodyPr rot="0" spcFirstLastPara="1"
> vertOverflow="ellipsis" vert="horz" wrap="square" anchor="ctr"
> anchorCtr="1" />

Here we switch to a completely different part OOXML, which is indeed
shared by the different components.
No: I have no idea how it works and no recommendations.

> If I am able to get this working in my checkout, what level of testing
> are you looking for? Is a check that the correct XML elements are
> generated sufficient, or do you want some sort of visual comparison?
> Would you accept just the small change if it was properly tested?

100 % test coverage is easy to measure with coverage:

py.test --cov=openpyxl/drawing --cov-report=term-missing -rf
openpyxl/drawing

To be honest, the tests are pretty easy to write. If you look at the tests
for charts and the few in the drawing part you'll get an idea: you must be
able to create the relevant XML from code *and* create the relevant Python
objects from XML. Working with existing fragments is an excellent way to
find see if the current classes are correct. There is even a utility
function for creating tests "stubs":

python openpyxl/develop/stub.py ClassName

will spit out something in the testing style I've developed you can copy &
paste into files. It's a bit blunt and stupid but it's better than doing
it all by hand.

Once you have the primitives, well they're already there, you can spend
some time working with them – writing sample documentation is excellent
for this – to see whether the API is really suitable and whether you need
additional helpers, such as the one for chart titles.
Reply all
Reply to author
Forward
0 new messages