[reportlab-users] Parsing of <table> & ICU wordWrap

Buganini

Nov 3, 2015, 9:38:42 AM
to reportl...@lists2.reportlab.com
Hi,
Recently I have been adding <table> support to Paragraph. I'm not yet
familiar with hg, so I commit directly to my default branch:
https://bitbucket.org/buganini/reportlab/commits/all
It works pretty well for my use, unless wordWrap=='CJK', since break

I also have a hack that does wordWrap using ICU:
https://github.com/buganini/reportlab/commit/b0bb4a8fc4017cb6d671f006afe151c19b9b2df1
but it produces extra spaces. I will soon rewrite it as a breakLinesICU()
function, based on my table changes.

I may also add parsing for more <table> attributes, but all of that
will happen on the same branch because I need them all. This will make
it harder to merge back upstream (if that ever happens), so before
going further I'd like to hear your ideas.

Buganini
_______________________________________________
reportlab-users mailing list
reportl...@lists2.reportlab.com
https://pairlist2.pair.net/mailman/listinfo/reportlab-users

Robin Becker

Nov 3, 2015, 9:53:06 AM
to reportlab-users
Hi,

I'm not exactly sure what the use case is here. From a quick inspection of the
change sets it looks like you are using the cbDefn stuff to add a flowable
(table) in the middle of a paragraph like an <img>; is that correct?

I can imagine that people might want to put a table of figures into flowing text
as a float, but the inline usage kind of escapes me.
--
Robin Becker

Buganini

Nov 4, 2015, 12:53:03 AM
to reportlab-users
> Hi,
>
> I'm not exactly sure what the use case is here. From a quick inspection of
> the change sets it looks like you are using the cbDefn stuff to add a
> flowable (table) in the middle of a paragraph like an <img>; is that
> correct?

That's correct, BTW I don't know the meaning of cbDefn.

> I can imagine that people might want to put a table of figures into flowing
> text as a float, but the in line usage kind of escapes me.

Since I wrap it with the available width, it renders as a block element
(in CSS terms):

     def wrap(self, availWidth, availHeight):
         # work out widths array for breaking
         self.width = availWidth
         style = self.style
         leftIndent = style.leftIndent
         first_line_width = availWidth - (leftIndent+style.firstLineIndent) - style.rightIndent
         later_widths = availWidth - leftIndent - style.rightIndent
         self._wrapWidths = [first_line_width, later_widths]

+        for f in self.frags:
+            cb = getattr(f, 'cbDefn', None)
+            if cb and cb.kind=='flowable':
+                cb.width, cb.height = cb.flowable.wrap(availWidth, None)
+                f.width = cb.width
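
To make the arithmetic in the unchanged part of wrap() concrete, here is a small standalone sketch of the two line widths it computes. The function name wrap_widths and the sample numbers are made up for illustration; only the two formulas come from the snippet above.

```python
def wrap_widths(avail_width, left_indent, first_line_indent, right_indent):
    """Return [first_line_width, later_widths] using the formulas in wrap()."""
    first_line_width = avail_width - (left_indent + first_line_indent) - right_indent
    later_widths = avail_width - left_indent - right_indent
    return [first_line_width, later_widths]


# Hypothetical values in points: a 400pt frame, 20pt left indent,
# 12pt extra first-line indent and 10pt right indent.
widths = wrap_widths(400, 20, 12, 10)
print(widths)  # [358, 370]
```

Note that the added lines wrap the flowable against availWidth itself, which is larger than both of these values whenever an indent is non-zero.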

Robin Becker

Nov 4, 2015, 8:08:03 AM
to reportlab-users
On 04/11/2015 05:52, Buganini wrote:
.........
>> correct?
>
> That's correct, BTW I don't know the meaning of cbDefn.
>
.....
My bad; it comes from pre-history, when long variable names cost memory.
cbDefn originally stood for "call back definition", and we originally intended it
for really simple stuff like indexing, which didn't use any width etc. But it
got reused as the carrier of information for the img tag, since it was the one
non-regular thing we were already doing in the break lines and similar functions.

If you have a good use case please post some small example so we can see how it
looks. If it seems useful perhaps it can be rolled into RL.

Buganini

Nov 7, 2015, 7:33:21 AM
to reportlab-users
http://pastebin.com/JuQVrGzC

But I just found two problems:
1. the table is doubled if width < 100mm (line 55)
2. the program dies with the following message if fontName is not set (line 51)

Traceback (most recent call last):
  File "table.py", line 63, in <module>
    p.wrap(width-padding*2, height-padding*2)
  File "/usr/local/lib/python2.7/dist-packages/reportlab-3.2.9-py2.7-linux-x86_64.egg/reportlab/platypus/paragraph.py", line 1111, in wrap
    blPara = self.breakLines(self._wrapWidths)
  File "/usr/local/lib/python2.7/dist-packages/reportlab-3.2.9-py2.7-linux-x86_64.egg/reportlab/platypus/paragraph.py", line 1430, in breakLines
    spaceWidth = stringWidth(' ',fontName, fontSize)
  File "/usr/local/lib/python2.7/dist-packages/reportlab-3.2.9-py2.7-linux-x86_64.egg/reportlab/pdfbase/pdfmetrics.py", line 720, in stringWidth
    return getFont(fontName).stringWidth(text, fontSize, encoding=encoding)
  File "/usr/local/lib/python2.7/dist-packages/reportlab-3.2.9-py2.7-linux-x86_64.egg/reportlab/pdfbase/pdfmetrics.py", line 686, in getFont
    return findFontAndRegister(fontName)
  File "/usr/local/lib/python2.7/dist-packages/reportlab-3.2.9-py2.7-linux-x86_64.egg/reportlab/pdfbase/pdfmetrics.py", line 668, in findFontAndRegister
    face = getTypeFace(fontName)
  File "/usr/local/lib/python2.7/dist-packages/reportlab-3.2.9-py2.7-linux-x86_64.egg/reportlab/pdfbase/pdfmetrics.py", line 625, in getTypeFace
    return _typefaces[faceName]
KeyError: 'helvetica'

Robin Becker

Nov 7, 2015, 4:10:56 PM
to reportlab-users
The font name should be 'Helvetica', so somewhere you have a wrong
font specification; styles['Normal'] has the canvas_basefontname set
into it (imported from rl_settings.py via rl_config). There are
override mechanisms which can alter the default value.

$ python
Python 2.7.10 (default, Sep 7 2015, 13:51:49)
[GCC 5.2.0] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from reportlab.lib.styles import getSampleStyleSheet
>>> styles=getSampleStyleSheet()
>>> styles['Normal'].fontName
'Helvetica'

Buganini

Nov 9, 2015, 9:37:59 AM
to reportlab-users
Weird, I didn't change anything related to fonts. Anyway, I'll come back
to this issue later...

I found the cause of the doubling issue. It's around
    tooLong = newLineWidth>maxWidth
in _splitFragWord(): an unbreakable element (image, flowable) is not
handled properly.
The issue can be reproduced with <img /> and the official code.

For now I have changed
    if g is not f or tooLong:
to
    if g is not f or (tooLong and not hasattr(f, 'cbDefn')):
to work around this issue.
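
The idea behind that workaround can be shown with a toy line breaker. This is not reportlab's _splitFragWord(); Frag, split_word and break_lines are hypothetical names, and the fixed per-character width is an assumption made only to keep the sketch short. It shows one way to keep unbreakable items (an image or a flowable) out of the "too long, so split it" path, so that each one is placed exactly once.

```python
from collections import namedtuple

# Hypothetical fragment type for this sketch; reportlab's frags are richer.
Frag = namedtuple("Frag", "text width unbreakable")


def split_word(frag, max_width):
    """Cut a breakable word into pieces, assuming equal-width characters."""
    char_width = frag.width / len(frag.text)
    per_piece = max(1, int(max_width // char_width))
    pieces = []
    for i in range(0, len(frag.text), per_piece):
        text = frag.text[i:i + per_piece]
        pieces.append(Frag(text, char_width * len(text), False))
    return pieces


def break_lines(frags, max_width):
    """Greedy line breaking that never tries to split unbreakable frags."""
    lines, current, used = [], [], 0.0
    queue = list(frags)
    while queue:
        frag = queue.pop(0)
        too_long = frag.width > max_width
        if too_long and not frag.unbreakable and len(frag.text) > 1:
            # Only breakable text goes through the splitting path.
            queue = split_word(frag, max_width) + queue
            continue
        if current and used + frag.width > max_width:
            lines.append(current)
            current, used = [], 0.0
        # An unbreakable frag that is too wide is placed whole and overflows.
        current.append(frag.text)
        used += frag.width
    if current:
        lines.append(current)
    return lines


frags = [Frag("see", 30, False), Frag("<table>", 120, True), Frag("below", 50, False)]
print(break_lines(frags, 100))  # [['see'], ['<table>'], ['below']]
```

In this sketch the oversized table gets a line of its own and is allowed to overflow, rather than being handed to the word splitter.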

Buganini

Nov 10, 2015, 1:46:11 AM
to reportlab-users
Most of the features I need are implemented, and most of the issues I
ran into are solved and committed.

Here are my demos for these features/issues:
http://tinder.land/static/demo.zip


breaklinesICU still has a minor issue: it over-measures the width,
apparently because the widths of extra spaces are counted.

Buganini

Nov 10, 2015, 2:16:15 AM
to reportlab-users
2015-11-10 14:45 GMT+08:00 Buganini <buga...@gmail.com>:
> Most features I need and most issues I got are solved and committed.
>
> Here are my demos for these features/issues
> http://tinder.land/static/demo.zip
>
>
> breaklinesICU still has minor issue that it over measure the width
> seems because extra spaces width are counted in.
This is (probably) fixed and committed.

Another question: why is autoLeading not 'max' by default? I haven't
seen any scenario that needs an autoLeading value other than max.

Robin Becker

Nov 11, 2015, 6:13:04 AM
to reportlab-users
On 10/11/2015 07:15, Buganini wrote:
> 2015-11-10 14:45 GMT+08:00 Buganini <buga...@gmail.com>:
>> Most features I need and most issues I got are solved and committed.
>>
>> Here are my demos for these features/issues
>> http://tinder.land/static/demo.zip
>>
>>
>> breaklinesICU still has minor issue that it over measure the width
>> seems because extra spaces width are counted in.
> this is (probably) fixed and committed.
>

There is a relatively new para-measure-fix branch where I have attempted to fix
all the measuring errors caused by font size and other changes.

> My yet another question is, why autoLeading is not max by default? I
> didn't see any scenario needs autoLeading other than max.
......

In the past, whenever we have gratuitously changed an option we have received
howls of protest. By default, leading should be constant, as that gives
paragraphs a better overall look. Your application is unusual in wanting
variable line leading.
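
The difference between the two behaviours can be sketched with a simplified model. This is not reportlab's implementation: line_leading is a hypothetical helper, and both the 1.2 multiplier and the reading of 'max' as "the larger of the style leading and a multiple of the biggest font size on the line" are assumptions of this sketch.

```python
LEADING_FACTOR = 1.2  # assumed multiplier; not taken from the thread


def line_leading(style_leading, font_sizes, auto_leading="off"):
    """Return the leading for one line given the font sizes used on it."""
    if auto_leading == "max":
        biggest = max(font_sizes) if font_sizes else 0
        return max(style_leading, LEADING_FACTOR * biggest)
    return style_leading  # "off": every line gets the same leading


lines = [[10, 10], [10, 20], [10]]  # hypothetical font sizes per line
constant = [line_leading(12, sizes) for sizes in lines]
variable = [line_leading(12, sizes, "max") for sizes in lines]
print(constant)  # every line uses the style leading
print(variable)  # the middle line gets extra room for the 20pt text
```

With constant leading the baselines stay evenly spaced; with the 'max' model a single large fragment pushes its line apart from its neighbours, which is what an inline table needs but ordinary running text usually does not.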
--
Robin Becker

Buganini

Nov 15, 2015, 8:22:40 AM
to reportlab-users
It turns out the lower-case "helvetica" came from tt2ps() in fonts.py.
The problem is solved if I change 'helvetica' to 'Helvetica' for
('helvetica', 0, 0) :'Helvetica',
('helvetica', 1, 0) :'Helvetica-Bold',
('helvetica', 0, 1) :'Helvetica-Oblique',
('helvetica', 1, 1) :'Helvetica-BoldOblique',
in _tt2ps_map.

Since everything in _ps2tt_map is lowercased before it is added, I guess
the faces in _tt2ps_map should be in title case.
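
The pattern under discussion (lower-case lookup keys mapping to canonically cased face names, plus a case-sensitive typeface registry) can be modelled in a few lines. This is a toy model, not reportlab's source: TT2PS, TYPEFACES and get_typeface are stand-in names, and the lower() call in tt2ps is an assumption about how the lookup key is built.

```python
# Toy stand-ins; these are not reportlab's real tables.
TT2PS = {
    ("helvetica", 0, 0): "Helvetica",
    ("helvetica", 1, 0): "Helvetica-Bold",
    ("helvetica", 0, 1): "Helvetica-Oblique",
    ("helvetica", 1, 1): "Helvetica-BoldOblique",
}
# Case-sensitive registry keyed by the canonical face names.
TYPEFACES = {name: object() for name in TT2PS.values()}


def tt2ps(family, bold, italic):
    """Map (family, bold, italic) to a face name, ignoring case in family."""
    return TT2PS[(family.lower(), bold, italic)]


def get_typeface(face_name):
    """Look up a face by its exact name; raises KeyError for 'helvetica'."""
    return TYPEFACES[face_name]


print(tt2ps("Helvetica", 1, 0))  # Helvetica-Bold
try:
    get_typeface("helvetica")
except KeyError as exc:
    print("KeyError:", exc)  # KeyError: 'helvetica'
```

In this model a name that goes through tt2ps() always comes out in canonical case, so a lower-case 'helvetica' reaching the registry would have to be a family name passed along without being mapped. Whether that is what happens in the patched branch is not established in the thread.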

Robin Becker

Nov 16, 2015, 8:56:17 AM
to reportlab-users
Since the existing code works for everyone else, I am not going to change it. I
have just run the program that you posted originally and it runs fine with only
a minor adjustment to the font search path at line 30, 'cos I am on Windows not
Unix.

I did not see the reported error even if I comment out the assignment to
descStyle.fontName at line 51. I am using the latest reportlab with python 2.7.8.

Of course I am not running your code hacks to support the <table> tag. I imagine
that you are failing to propagate or restore something somewhere when you draw
the table. It is important to note that the descStyle.fontName has a value with
or without the assignment to descStyle.fontName so the assignment probably has
little relevance. Of course it might be that the actual value being used in the
paragraph has relevance.

My best guess is that the table drawing is messing up the canvas properties that
are assumed to be valid in the middle of the paragraph. In the absence of an
explicit setting the font used by the table cell styles is the _baseFontName ie
'Helvetica'.