We've run in to an issue with ReportLab where it can orphan a heading
under some circumstances, despite things being correctly marked with
keepWithNext.
After some digging we found that flowables._listWrapOn() does not
compute the same height as Frame._add(). So what happens is that
KeepTogether things the flowables will all fit, but once the Frame
starts adding things the last element gets punted to the next page.
I'm afraid I don't have an easy example for you, but I do have some
numbers that should allow anyone to go through the code.
We have the following elements that are all marked keepWithNext:
Width Height Space Before Space After
--------------------------------------------
0 3 0 0
470 15 12 6
470 36 6 0
The frame is at y=67. The space required for the above elements is
3+12+15+6+36=72. I.e. it won't fit. However _listWrapOn() computes a
required height of 57 here, so it thinks things will fit.
There are two bugs in play here:
a) _listWrapOn() ignores things with 0 width, however Frame._add() does
not. So the first element gets discarded, dropping 3 from the resulting
height.
b) _listWrapOn() assumes it is at the top of a frame, and should
therefore discard the first flowable's spaceBefore. This seems unsafe as
it will then often underestimate the space required. It would likely be
better to assume it is never at the top and hence overestimate. This bug
then acts on the second flowable (since the first was dropped), dropping
another 12 from the total height.
And 72 - 12 - 3 = 57, and we get the bogus height number.
For now I've worked around this by making sure that initial flowable is
1×3 instead of 0×3.
I also noticed that KeepTogether._H0 is probably incorrect since it
doesn't include spaceBefore. But that might be tied to the assumption of
always being at the top of a frame.
Regards
--
Pierre Ossman Software Development
Cendio AB https://cendio.com
Teknikringen 8 https://twitter.com/ThinLinc
583 30 Linköping https://facebook.com/ThinLinc
Phone: +46-13-214600
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?
_______________________________________________
reportlab-users mailing list
reportl...@lists2.reportlab.com
https://pairlist2.pair.net/mailman/listinfo/reportlab-users
I think you are probably right in your analysis.
1) It's probably a bug to skip things that are zero width and non-zero height, It would be helpful to know what kind of
flowable was involved there.
2) As you say it would be better to overestimate the required height, but best would be to compute correctly, but that
requires us to check for inclusion in a frame and to work out what the extra height requirement is. Actually we can find
out when we have a frame above us and inspect whether we are at the top of a frame and what if it would do at this
point if we get added.
The overall algorithm is badly designed, but it's probably far too late to adopt a better one.
--
Robin Becker
This is using rst2pdf via Sphinx. The conditions we encountered were a
bullet list followed by a heading and a paragraph. The sequence of
flowables rst2pdf generates in that case is:
* Spacer
* Table
* Spacer
* Heading
* Paragraph
The spacers have keepWithNext set to make sure that a heading before a
bullet list/table works properly. As a side effect the trailing spacer
here will be tied to the following heading. The heading also has
keepWithNext set.
> 2) As you say it would be better to overestimate the required height,
> but best would be to compute correctly, but that requires us to check
> for inclusion in a frame and to work out what the extra height
> requirement is. Actually we can find out when we have a frame above us
> and inspect whether we are at the top of a frame and what if it would
> do at this point if we get added.
>
> The overall algorithm is badly designed, but it's probably far too late
> to adopt a better one.
>
How so? The handling of keepWithNext is mostly internal to ReportLab, so
it should be possibly to modify rather heavily without affecting users?
Regards
--
Pierre Ossman Software Development
Cendio AB https://cendio.com
Teknikringen 8 https://twitter.com/ThinLinc
583 30 Linköping https://facebook.com/ThinLinc
Phone: +46-13-214600
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?
I tried fairly hard to make an example like the one you suggest, but it's too complex to really allow easy diagnosis.
Looking at the internal code I was able to check that the KeepTogether class that is used to replace keepWidthNext does
provide access to the space before and after attributes. Also at least in a really simple case it does work. In the
example below the keepWithNext group has a height of 31. The heading stays with the body until the frame height, fh,
falls below 31 ie the required height for the group. When that happens things just break. If you can supply a simple
example I can try to make it go wrong. It's not clear whether it's using spacers that causes the issue, but I would
argue that spaceBefore attributes handle that far better than fixed space as they can vanish.
############################################
from reportlab.platypus import BaseDocTemplate, PageTemplate, Flowable, FrameBreak, KeepTogether, PageBreak, Spacer, Frame
from reportlab.platypus import Paragraph, Preformatted
from reportlab.lib.styles import PropertySet, getSampleStyleSheet, ParagraphStyle
from reportlab.lib.units import inch, cm
import sys
def test4():
class Test4Template(BaseDocTemplate):
_invalidInitArgs = ('pageTemplates',)
def __init__(self, filename, **kw):
fh = 12+10+12+7+12 #53 enough for all
fh = 31 # just enough for the kwn group
if len(sys.argv)>1:
fh = float(sys.argv[1])
frame1 = Frame(inch, 5.6*inch, 6*inch, fh, showBoundary=1, id='F1',
leftPadding=0,rightPadding=0,topPadding=0,bottomPadding=0)
frame2 = Frame(inch, 1.0*inch, 6*inch, fh, showBoundary=1,id='F2',
leftPadding=0,rightPadding=0,topPadding=0,bottomPadding=0)
BaseDocTemplate.__init__(self,filename,**kw)
self.addPageTemplates(PageTemplate('normal',[frame1,frame2]))
sty = ParagraphStyle('simple')
def mpara(content,spaceBefore=0,spaceAfter=0,kwn=False):
p = Paragraph(content,sty)
p.spaceBefore = spaceBefore
p.spaceAfter = spaceAfter
p.keepWithNext = kwn
return p
story = [
mpara('top para',8,6),
mpara('header para',10,7,kwn=True),
mpara('body para',6,6,kwn=False),
]
doc = Test4Template('/home/robin/devel/reportlab/REPOS/reportlab/tests/test_keepwithnext.pdf')
doc.build(story)
if __name__=='__main__':
test4()
############################################
I tried arguments 30.999, 31, 52 & 53; the first is the real failure case.
>
> Regards
--
Robin Becker
I had a look at making a stable test case, and I disproved part of my
initial analysis. There *is* a bug handling zero width flowables. But
there *is not* a bug handling spaceBefore.
The reason spaceBefore works is slight different from how the Frame does
it though, which makes it non-obvious (but correct):
* The _ContainerSpace base class propagates requests for spaceBefore
to the first element, which is the appropriate thing to do
* The Frame looks at spaceBefore before calculating the available
height it then gives to wrap() and split(), which is also appropriate
* _listWrapOn() always ignores spaceBefore for the first element,
since it is either irrelevant, or already included in the available
height calculated by the Frame
So it's just a bit non-obvious, but it seems correct. The problem is
solely that zero width items are incorrectly discarded, which in turn
makes _listWrapOn() consider the next item to be at the top, which it is
not.
The bug is fully present even without spacing though, so here is a test
that exposes the issue as neatly as possible
from reportlab.platypus import BaseDocTemplate, PageTemplate, Flowable,
FrameBreak, KeepTogether, PageBreak, Spacer, Frame
from reportlab.platypus import Paragraph, Preformatted, XBox
from reportlab.lib.styles import PropertySet, getSampleStyleSheet,
ParagraphStyle
from reportlab.lib.units import inch, cm
import sys
def testZeroWidth():
class DocTemplate(BaseDocTemplate):
_invalidInitArgs = ('pageTemplates',)
def __init__(self, *args, **kw):
super().__init__(*args, **kw)
# Two frames, each 100pts high
frame1 = Frame(inch, 5.6*inch, 6*inch, 100,
showBoundary=1, id='F1',
leftPadding=0, rightPadding=0,
topPadding=0, bottomPadding=0)
frame2 = Frame(inch, 1.0*inch, 6*inch, 100,
showBoundary=1, id='F2',
leftPadding=0, rightPadding=0,
topPadding=0, bottomPadding=0)
self.addPageTemplates(PageTemplate('normal',[frame1,frame2]))
class KeepedXBox(XBox):
def __init__(self, width, height, text = 'A Box', *,
keepWithNext=False):
super().__init__(width, height, text)
self.keepWithNext = keepWithNext
story = [
XBox(4*inch, 50, 'top para'),
KeepedXBox(0, 20, 'spacer', keepWithNext=True),
KeepedXBox(4*inch, 20, 'heading', keepWithNext=True),
KeepedXBox(4*inch, 20, 'body para', keepWithNext=False),
]
doc = DocTemplate('test_zerowidth.pdf')
doc.build(story)
if __name__=='__main__':
testZeroWidth()
Regards
--
Pierre Ossman Software Development
Cendio AB https://cendio.com
Teknikringen 8 https://twitter.com/ThinLinc
583 30 Linköping https://facebook.com/ThinLinc
Phone: +46-13-214600
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?
...........
>
> I had a look at making a stable test case, and I disproved part of my initial analysis. There *is* a bug handling zero
> width flowables. But there *is not* a bug handling spaceBefore.
>
> The reason spaceBefore works is slight different from how the Frame does it though, which makes it non-obvious (but
> correct):
>
> * The _ContainerSpace base class propagates requests for spaceBefore to the first element, which is the appropriate
> thing to do
> * The Frame looks at spaceBefore before calculating the available height it then gives to wrap() and split(), which is
> also appropriate
> * _listWrapOn() always ignores spaceBefore for the first element, since it is either irrelevant, or already included
> in the available height calculated by the Frame
>
> So it's just a bit non-obvious, but it seems correct. The problem is solely that zero width items are incorrectly
> discarded, which in turn makes _listWrapOn() consider the next item to be at the top, which it is not.
>
> The bug is fully present even without spacing though, so here is a test that exposes the issue as neatly as possible
>
........
I will take a look at the 'bug' as exposed in your example.
I assume this is the line you want to change flowables.py:643
> if w<=_FUZZ or h<=_FUZZ: continue
I believe this line has been there for a very long time since 1.21 at least (2008 vintage). It's supposed to ignore
invisible items and is probably wrong for your purpose. It's too general. So far we never used the width for anything.
I modified some lines of your example
zw = int(os.environ.get('ZW','0')) #here
story = [
XBox(4*inch, 50, 'top para'),
KeepedXBox(zw, 20, 'spacer', keepWithNext=True), #here
KeepedXBox(4*inch, 20, 'heading', keepWithNext=True),
KeepedXBox(4*inch, 20, 'body para', keepWithNext=False),
]
doc = DocTemplate('test_pierre_ossman_%szerowidth.pdf' % ('non' if zw else '')) #here
to see what happens if the spacer is not invisible. Is there an issue that requires zero width for the spacer?
If the spacer is controlled by some other software then presumably a simple fix is just to change its class or to define
an attribute which controls the w<=FUZZ issue.
I ran a render test (401 pdfs) with this line
if (w<=_FUZZ and False) or h<=_FUZZ: continue
and saw no changes. It does not mean that others will have the same result though.
--
Robin Becker
Right. Or the corresponding behaviour in Frame. Either behaviour is fine
with me, it's the discrepancy between the two that is the problem.
> to see what happens if the spacer is not invisible. Is there an issue
> that requires zero width for the spacer?
>
Not to my knowledge, no. But this spacer comes from rst2pdf and it's not
something I'm that familiar with. We saw the issue as users of rst2pdf,
not developers of it.
> If the spacer is controlled by some other software then presumably a
> simple fix is just to change its class or to define an attribute which
> controls the w<=FUZZ issue.
>
> I ran a render test (401 pdfs) with this line
>
> if (w<=_FUZZ and False) or h<=_FUZZ: continue
>
> and saw no changes. It does not mean that others will have the same
> result though.
Our workaround for the moment is to override the spacer in rst2pdf and
makes sure it has minimum dimensions of 1x1, which seems to work fine.
But it would be nice to remove that workaround eventually, and hopefully
let others avoid this issue. :)
Regards
--
Pierre Ossman Software Development
Cendio AB https://cendio.com
Teknikringen 8 https://twitter.com/ThinLinc
583 30 Linköping https://facebook.com/ThinLinc
Phone: +46-13-214600
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?