[reportlab-users] Miscalculation when using keepWithNext

74 views
Skip to first unread message

Pierre Ossman

unread,
Nov 4, 2021, 3:22:22 AM11/4/21
to reportl...@reportlab.com
Hi,

We've run in to an issue with ReportLab where it can orphan a heading
under some circumstances, despite things being correctly marked with
keepWithNext.

After some digging we found that flowables._listWrapOn() does not
compute the same height as Frame._add(). So what happens is that
KeepTogether things the flowables will all fit, but once the Frame
starts adding things the last element gets punted to the next page.

I'm afraid I don't have an easy example for you, but I do have some
numbers that should allow anyone to go through the code.

We have the following elements that are all marked keepWithNext:

Width Height Space Before Space After
--------------------------------------------
0 3 0 0
470 15 12 6
470 36 6 0

The frame is at y=67. The space required for the above elements is
3+12+15+6+36=72. I.e. it won't fit. However _listWrapOn() computes a
required height of 57 here, so it thinks things will fit.

There are two bugs in play here:

a) _listWrapOn() ignores things with 0 width, however Frame._add() does
not. So the first element gets discarded, dropping 3 from the resulting
height.

b) _listWrapOn() assumes it is at the top of a frame, and should
therefore discard the first flowable's spaceBefore. This seems unsafe as
it will then often underestimate the space required. It would likely be
better to assume it is never at the top and hence overestimate. This bug
then acts on the second flowable (since the first was dropped), dropping
another 12 from the total height.

And 72 - 12 - 3 = 57, and we get the bogus height number.

For now I've worked around this by making sure that initial flowable is
1×3 instead of 0×3.

I also noticed that KeepTogether._H0 is probably incorrect since it
doesn't include spaceBefore. But that might be tied to the assumption of
always being at the top of a frame.

Regards
--
Pierre Ossman Software Development
Cendio AB https://cendio.com
Teknikringen 8 https://twitter.com/ThinLinc
583 30 Linköping https://facebook.com/ThinLinc
Phone: +46-13-214600

A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?
_______________________________________________
reportlab-users mailing list
reportl...@lists2.reportlab.com
https://pairlist2.pair.net/mailman/listinfo/reportlab-users

Robin Becker

unread,
Nov 4, 2021, 6:47:41 AM11/4/21
to Pierre Ossman, For users of Reportlab open source software
Hi Pierre,

I think you are probably right in your analysis.

1) It's probably a bug to skip things that are zero width and non-zero height, It would be helpful to know what kind of
flowable was involved there.

2) As you say it would be better to overestimate the required height, but best would be to compute correctly, but that
requires us to check for inclusion in a frame and to work out what the extra height requirement is. Actually we can find
out when we have a frame above us and inspect whether we are at the top of a frame and what if it would do at this
point if we get added.

The overall algorithm is badly designed, but it's probably far too late to adopt a better one.


--
Robin Becker

Pierre Ossman

unread,
Nov 4, 2021, 10:48:25 AM11/4/21
to Robin Becker, For users of Reportlab open source software
On 04/11/2021 11:47, Robin Becker wrote:
> Hi Pierre,
>
> I think you are probably right in your analysis.
>
> 1) It's probably a bug to skip things that are zero width and non-zero
> height, It would be helpful to know what kind of flowable was involved
> there.
>

This is using rst2pdf via Sphinx. The conditions we encountered were a
bullet list followed by a heading and a paragraph. The sequence of
flowables rst2pdf generates in that case is:

* Spacer
* Table
* Spacer
* Heading
* Paragraph

The spacers have keepWithNext set to make sure that a heading before a
bullet list/table works properly. As a side effect the trailing spacer
here will be tied to the following heading. The heading also has
keepWithNext set.

> 2) As you say it would be better to overestimate the required height,
> but best would be to compute correctly, but that requires us to check
> for inclusion in a frame and to work out what the extra height
> requirement is. Actually we can find out when we have a frame above us
> and inspect  whether we are at the top of a frame and what if it would
> do at this point if we get added.
>
> The overall algorithm is badly designed, but it's probably far too late
> to adopt a better one.
>

How so? The handling of keepWithNext is mostly internal to ReportLab, so
it should be possibly to modify rather heavily without affecting users?

Regards
--
Pierre Ossman Software Development
Cendio AB https://cendio.com
Teknikringen 8 https://twitter.com/ThinLinc
583 30 Linköping https://facebook.com/ThinLinc
Phone: +46-13-214600

A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?

Robin Becker

unread,
Nov 18, 2021, 4:49:10 AM11/18/21
to Pierre Ossman, For users of Reportlab open source software
........

>  * Spacer
>  * Table
>  * Spacer
>  * Heading
>  * Paragraph
>
> The spacers have keepWithNext set to make sure that a heading before a bullet list/table works properly. As a side
> effect the trailing spacer here will be tied to the following heading. The heading also has keepWithNext set.
>
>> 2) As you say it would be better to overestimate the required height, but best would be to compute correctly, but that
>> requires us to check for inclusion in a frame and to work out what the extra height requirement is. Actually we can
>> find out when we have a frame above us and inspect  whether we are at the top of a frame and what if it would do at
>> this point if we get added.
>>
>> The overall algorithm is badly designed, but it's probably far too late to adopt a better one.
>>
>
> How so? The handling of keepWithNext is mostly internal to ReportLab, so it should be possibly to modify rather heavily
> without affecting users?

I tried fairly hard to make an example like the one you suggest, but it's too complex to really allow easy diagnosis.

Looking at the internal code I was able to check that the KeepTogether class that is used to replace keepWidthNext does
provide access to the space before and after attributes. Also at least in a really simple case it does work. In the
example below the keepWithNext group has a height of 31. The heading stays with the body until the frame height, fh,
falls below 31 ie the required height for the group. When that happens things just break. If you can supply a simple
example I can try to make it go wrong. It's not clear whether it's using spacers that causes the issue, but I would
argue that spaceBefore attributes handle that far better than fixed space as they can vanish.

############################################
from reportlab.platypus import BaseDocTemplate, PageTemplate, Flowable, FrameBreak, KeepTogether, PageBreak, Spacer, Frame
from reportlab.platypus import Paragraph, Preformatted
from reportlab.lib.styles import PropertySet, getSampleStyleSheet, ParagraphStyle
from reportlab.lib.units import inch, cm
import sys
def test4():
class Test4Template(BaseDocTemplate):
_invalidInitArgs = ('pageTemplates',)
def __init__(self, filename, **kw):
fh = 12+10+12+7+12 #53 enough for all
fh = 31 # just enough for the kwn group
if len(sys.argv)>1:
fh = float(sys.argv[1])
frame1 = Frame(inch, 5.6*inch, 6*inch, fh, showBoundary=1, id='F1',
leftPadding=0,rightPadding=0,topPadding=0,bottomPadding=0)
frame2 = Frame(inch, 1.0*inch, 6*inch, fh, showBoundary=1,id='F2',
leftPadding=0,rightPadding=0,topPadding=0,bottomPadding=0)
BaseDocTemplate.__init__(self,filename,**kw)
self.addPageTemplates(PageTemplate('normal',[frame1,frame2]))

sty = ParagraphStyle('simple')
def mpara(content,spaceBefore=0,spaceAfter=0,kwn=False):
p = Paragraph(content,sty)
p.spaceBefore = spaceBefore
p.spaceAfter = spaceAfter
p.keepWithNext = kwn
return p

story = [
mpara('top para',8,6),
mpara('header para',10,7,kwn=True),
mpara('body para',6,6,kwn=False),
]

doc = Test4Template('/home/robin/devel/reportlab/REPOS/reportlab/tests/test_keepwithnext.pdf')
doc.build(story)

if __name__=='__main__':
test4()
############################################

I tried arguments 30.999, 31, 52 & 53; the first is the real failure case.

>
> Regards


--
Robin Becker

Pierre Ossman

unread,
Dec 7, 2021, 10:49:38 AM12/7/21
to Robin Becker, For users of Reportlab open source software
On 18/11/2021 10:49, Robin Becker wrote:
>
> I tried fairly hard to make an example like the one you suggest, but
> it's too complex to really allow easy diagnosis.
>
> Looking at the internal code I was able to check that the KeepTogether
> class that is used to replace keepWidthNext does provide access to the
> space before and after attributes. Also at least in a really simple case
> it does work. In the example below the keepWithNext group has a height
> of 31. The heading stays with the body until the frame height, fh, falls
> below 31 ie the required height for the group. When that happens things
> just break. If you can supply a simple example I can try to make it go
> wrong. It's not clear whether it's using spacers that causes the issue,
> but I would argue that spaceBefore attributes handle that far better
> than fixed space as they can vanish.
>

I had a look at making a stable test case, and I disproved part of my
initial analysis. There *is* a bug handling zero width flowables. But
there *is not* a bug handling spaceBefore.

The reason spaceBefore works is slight different from how the Frame does
it though, which makes it non-obvious (but correct):

* The _ContainerSpace base class propagates requests for spaceBefore
to the first element, which is the appropriate thing to do
* The Frame looks at spaceBefore before calculating the available
height it then gives to wrap() and split(), which is also appropriate
* _listWrapOn() always ignores spaceBefore for the first element,
since it is either irrelevant, or already included in the available
height calculated by the Frame

So it's just a bit non-obvious, but it seems correct. The problem is
solely that zero width items are incorrectly discarded, which in turn
makes _listWrapOn() consider the next item to be at the top, which it is
not.

The bug is fully present even without spacing though, so here is a test
that exposes the issue as neatly as possible

from reportlab.platypus import BaseDocTemplate, PageTemplate, Flowable,
FrameBreak, KeepTogether, PageBreak, Spacer, Frame

from reportlab.platypus import Paragraph, Preformatted, XBox


from reportlab.lib.styles import PropertySet, getSampleStyleSheet,
ParagraphStyle
from reportlab.lib.units import inch, cm
import sys

def testZeroWidth():
class DocTemplate(BaseDocTemplate):


_invalidInitArgs = ('pageTemplates',)

def __init__(self, *args, **kw):
super().__init__(*args, **kw)

# Two frames, each 100pts high
frame1 = Frame(inch, 5.6*inch, 6*inch, 100,
showBoundary=1, id='F1',
leftPadding=0, rightPadding=0,
topPadding=0, bottomPadding=0)
frame2 = Frame(inch, 1.0*inch, 6*inch, 100,
showBoundary=1, id='F2',
leftPadding=0, rightPadding=0,
topPadding=0, bottomPadding=0)


self.addPageTemplates(PageTemplate('normal',[frame1,frame2]))

class KeepedXBox(XBox):
def __init__(self, width, height, text = 'A Box', *,
keepWithNext=False):
super().__init__(width, height, text)
self.keepWithNext = keepWithNext

story = [
XBox(4*inch, 50, 'top para'),
KeepedXBox(0, 20, 'spacer', keepWithNext=True),
KeepedXBox(4*inch, 20, 'heading', keepWithNext=True),
KeepedXBox(4*inch, 20, 'body para', keepWithNext=False),
]

doc = DocTemplate('test_zerowidth.pdf')
doc.build(story)

if __name__=='__main__':
testZeroWidth()

Regards
--
Pierre Ossman Software Development
Cendio AB https://cendio.com
Teknikringen 8 https://twitter.com/ThinLinc
583 30 Linköping https://facebook.com/ThinLinc
Phone: +46-13-214600

A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?

Robin Becker

unread,
Dec 14, 2021, 5:17:08 AM12/14/21
to For users of Reportlab open source software, Pierre Ossman
Hi Pierre,

...........


>
> I had a look at making a stable test case, and I disproved part of my initial analysis. There *is* a bug handling zero
> width flowables. But there *is not* a bug handling spaceBefore.
>
> The reason spaceBefore works is slight different from how the Frame does it though, which makes it non-obvious (but
> correct):
>
>  * The _ContainerSpace base class propagates requests for spaceBefore to the first element, which is the appropriate
> thing to do
>  * The Frame looks at spaceBefore before calculating the available height it then gives to wrap() and split(), which is
> also appropriate
>  * _listWrapOn() always ignores spaceBefore for the first element, since it is either irrelevant, or already included
> in the available height calculated by the Frame
>
> So it's just a bit non-obvious, but it seems correct. The problem is solely that zero width items are incorrectly
> discarded, which in turn makes _listWrapOn() consider the next item to be at the top, which it is not.
>
> The bug is fully present even without spacing though, so here is a test that exposes the issue as neatly as possible
>

........

I will take a look at the 'bug' as exposed in your example.

I assume this is the line you want to change flowables.py:643

> if w<=_FUZZ or h<=_FUZZ: continue

I believe this line has been there for a very long time since 1.21 at least (2008 vintage). It's supposed to ignore
invisible items and is probably wrong for your purpose. It's too general. So far we never used the width for anything.

I modified some lines of your example

zw = int(os.environ.get('ZW','0')) #here


story = [
XBox(4*inch, 50, 'top para'),

KeepedXBox(zw, 20, 'spacer', keepWithNext=True), #here


KeepedXBox(4*inch, 20, 'heading', keepWithNext=True),
KeepedXBox(4*inch, 20, 'body para', keepWithNext=False),
]

doc = DocTemplate('test_pierre_ossman_%szerowidth.pdf' % ('non' if zw else '')) #here

to see what happens if the spacer is not invisible. Is there an issue that requires zero width for the spacer?

If the spacer is controlled by some other software then presumably a simple fix is just to change its class or to define
an attribute which controls the w<=FUZZ issue.

I ran a render test (401 pdfs) with this line

if (w<=_FUZZ and False) or h<=_FUZZ: continue

and saw no changes. It does not mean that others will have the same result though.
--
Robin Becker

Pierre Ossman

unread,
Dec 23, 2021, 8:07:27 AM12/23/21
to Robin Becker, For users of Reportlab open source software
On 14/12/2021 11:17, Robin Becker wrote:
>
> I will take a look at the 'bug' as exposed in your example.
>
> I assume this is the line you want to change flowables.py:643
>
> > if w<=_FUZZ or h<=_FUZZ: continue
>

Right. Or the corresponding behaviour in Frame. Either behaviour is fine
with me, it's the discrepancy between the two that is the problem.

> to see what happens if the spacer is not invisible. Is there an issue
> that requires zero width for the spacer?
>

Not to my knowledge, no. But this spacer comes from rst2pdf and it's not
something I'm that familiar with. We saw the issue as users of rst2pdf,
not developers of it.

> If the spacer is controlled by some other software then presumably a
> simple fix is just to change its class or to define an attribute which
> controls the w<=FUZZ issue.
>
> I ran a render test (401 pdfs) with this line
>
> if (w<=_FUZZ and False) or h<=_FUZZ: continue
>
> and saw no changes. It does not mean that others will have the same
> result though.

Our workaround for the moment is to override the spacer in rst2pdf and
makes sure it has minimum dimensions of 1x1, which seems to work fine.

But it would be nice to remove that workaround eventually, and hopefully
let others avoid this issue. :)

Regards
--
Pierre Ossman Software Development
Cendio AB https://cendio.com
Teknikringen 8 https://twitter.com/ThinLinc
583 30 Linköping https://facebook.com/ThinLinc
Phone: +46-13-214600

A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?

Ronak Shah

unread,
Jan 12, 2023, 4:07:20 AM1/12/23
to reportlab-users
Hi All,

I am also facing a similar issue that keepWithNext on flowables is ignored.
I tried that when using BalancedColumns.
Here's the stackoverflow question that I posted:  python - Reportlab balanced cols control split of flowables - Stack Overflow

Is this a simplified example of the problem that you are looking at?

Thanks,
Ronak
Sr. Software Developer

Wayne Davis

unread,
Apr 1, 2023, 10:14:57 AM4/1/23
to reportlab-users
I'm new to reportlab and facing this problem, a section heading is left at the bottom of a frame instead of being kicked to the next frame.  I've simplified my code as much as I can and still reproduce the problem, my apologies for posting too much code.  I've tried to find a workaround but just got even more confused.  I will appreciate any pointers.

Thank you.

from reportlab.platypus import BaseDocTemplate, Frame, Paragraph, Table, TableStyle, PageTemplate
from reportlab.lib.styles import getSampleStyleSheet, ParagraphStyle
from reportlab.lib.pagesizes import legal, letter, landscape, portrait
from reportlab.lib.enums import TA_JUSTIFY, TA_LEFT, TA_CENTER, TA_RIGHT
from reportlab.lib.units import inch
import  os, sys, json
   
def format_time(e):
    return(e['time'])
def format_event(e):          # format event name, address, notes
    return( f"<b>{e['name']}</b><br/>{e['location']}<br/>{e['notes']}")
def format_region(e):
    return(f"<b>{e['region']}</b>")  

def get_events(conf, events):
    styles = getSampleStyleSheet()
    styleE = ParagraphStyle('events',
                           fontName = 'Helvetica',
                           fontSize= 10,
                           parent = styles['Normal'],
                           allowWidows = 0,
                           allowOrphans = 0,
                           alignment= TA_LEFT)
                           
    styleH = ParagraphStyle('heading',
                           fontName = 'Helvetica-Bold',
                           fontSize= 12,
                           parent=styles['Normal'],
                           allowWidows = 0,
                           allowOrphans = 0,
                           alignment= TA_CENTER,
                           keepWithNext = 1)    

    sections = []                       # split events into lists by sections
    for d in range(len(conf['sections'])):
        sections.append([x for x in events if x['day'] == d])
    data = []
    for i in range(len(sections)):              # add heading row
        h = Paragraph(f"<b>{conf['sections'][i]}</b><br/>" , styleH)
        data.append([None, h, None]) 
        for e in sections[i]:         # format event    
            data.append([
                Paragraph(format_time(e), styleE),
                Paragraph(format_event(e), styleE),
                Paragraph(format_region(e), styleE)
            ])
    return(data)
   
def show_events(conf, data):
    doc = BaseDocTemplate(
                    sys.stdout.buffer,
                    pagesize=landscape(legal),
                    leftMargin = .05*inch,
                    rightMargin = .05*inch,
                    topMargin = .05*inch,
                    bottomMargin = .05*inch,
                    allowSplitting=1,
                    showBoundary=0
                    )
 
    tablestyle =  TableStyle([
            ('TOPPADDING', (0,0),(-1,-1), 0),    
            ('BOTTOMPADDING', (0,0),(-1,-1), 2),
            ('LEFTPADDING', (0,0),(-1,-1), 0),
            ('RIGHTPADDING', (0,0),(-1,-1),3),
            ('VALIGN', (0, 0), (-1, -1), 'TOP'),
            ])                  
   
    frame_count = 4
    frame_width = doc.width / frame_count
    frame_height = doc.height
   
    frames = []    
    for f in range(frame_count):  
        leftMargin = doc.leftMargin + f*frame_width
        column = Frame(leftMargin, doc.bottomMargin, frame_width, frame_height)
        frames.append(column)
         
    template = PageTemplate(id = None, frames=frames)
    doc.addPageTemplates(template)

    elements = []            
    col_widths = []
    for x in conf['col_widths']:
        col_widths.append(x*frame_width)
    t = Table(data, col_widths)  
    t.setStyle(tablestyle)
    elements.append(t)
    doc.build(elements)

import io
def main():
    conf = {
"col_widths": [0.17, 0.62, 0.21],
"sections":     ["SUNDAY",
"MONDAY",
"TUESDAY",
"WEDNESDAY",
"THURSDAY",
"FRIDAY",
"SATURDAY"
]
    }
    events = []
    for i in range(7):         # generate dummytest data
        for j in range(14):
            events.append({"day": i, "time": "7:30 am", "name": "Big Splat", "location": "US-441 near Sedona", "region": "Arizona", "notes": "witness reported was ugly"})
       
    data = get_events(conf, events)
    show_events(conf, data)
   
if __name__ == '__main__':
    main()
Reply all
Reply to author
Forward
0 new messages