Problems with EXAMPLE-LIKE LaTeX output


Michael Doob

Feb 12, 2023, 11:59:37 AM
to PreTeXt development
There seem to be significant problems in the LaTeX output of EXAMPLE-LIKE
input.  Let me give some examples (displayed in the attached pdf files)
and some suggestions of possible directions for improvement.

To start, here are some problems seen in the file cmoextracts.pdf.

First page (numbered 8): The single line at the top of the page is a
   widow. TeX will normally let you avoid these by setting \widowpenalty
   and \clubpenalty.
Second page (numbered 2): Another type of widow, this time as the
   last item in a bulleted list. Also, the two solutions to Problem 1.5
   are missing the little square box signifying the end of proof
   (also called the qedsymbol).
Third page (numbered 23): The qedsymbol sits alone at the top of the page
   separated from its solution on the previous page.
Fourth page (numbered 11): Another widow, this one from a solution on
   the previous page. Also, the qedsymbol for Figure 10 is lower than it
   should be (it would normally be on the same baseline as the Figure 10).
Fifth page (numbered 13): The qedsymbol is below the baseline of the
   displayed equation just before Problem 3.10.
Sixth page (numbered 5): The qedsymbol at the end of Solution 2 has
   the lowered qedsymbol, while Solution 1 directly above has no
   qedsymbol at all.
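
For reference, the penalties mentioned for the first page are usually set in the preamble along the following lines. This is only a sketch for ordinary LaTeX text: the value 10000 is a conventional choice rather than anything taken from the PreTeXt output, and, as discussed further on, these settings may have no effect on breaks made inside a breakable tcolorbox.

```latex
% Preamble sketch (assumed values; not from the PreTeXt-generated file).
\widowpenalty=10000   % last line of a paragraph alone at the top of a page
\clubpenalty=10000    % first line of a paragraph alone at the bottom of a page
% LaTeX restores \clubpenalty from \@clubpenalty after sectioning commands,
% so the internal copy is normally updated as well.
\makeatletter
\@clubpenalty=\clubpenalty
\makeatother
```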

The cause of these problems is clear and annotated in the LaTeX file
generated by PreTeXt:

<!-- QED Here -->
<!-- 2018-11-20: we have abandoned the amsthm "proof"              -->
<!-- environment, in favor of tcolorbox.  But this is              -->
<!--   (a) some fancy XSL                                          -->
<!--   (b) perhaps useful if the LaTeX is recycled                 -->
<!-- So elsewhere, we redefine \qedhere to do nothing              -->
<!--                                                               -->
<!-- Analyze a final "mrow" or any "me"                            -->
<!-- Strictly LaTeX/amsthm, not a MathJax feature (yet? ever?)     -->
<!--   (1) Locate enclosing proof, quit if no such thing           -->
<!--   (2) Check an mrow for being numbered, do not clobber that   -->
<!--   (3) Locate all trailing element, text nodes                 -->
<!--       strip-space: between "mrow" and "md" or "mdn"           -->
<!--       strip-space: between final "p" and "proof"              -->
<!--   (4) Form nodes interior to proof, and trailing ("remnants") -->
<!--   (5) At very end of proof                                    -->
<!--      (a) if no more nodes, or                                 -->
<!--      (b) one node, totally whitespace and punctuation         -->
<!--          (we don't differentiate whitespace policy here)      -->
<!--   (6) Having survived all this write a \qedhere               -->
<!-- TODO: \qedhere also functions at the end of a list            -->


I don't completely understand why this decision was made when the output
is LaTeX code, but I should point out some useful properties of the
amsthm package given in the accompanying qedtext.pdf file.

From the file qedtext.pdf --

First page: If a grad student wanted to set up a problem/solution
   environment using the qedsymbol, this is what I would probably
   suggest. The code below produces the output above.
Second page: Two versions of the same problem, one with the qedsymbol
   and one with it in the right place using \qedhere.
Third page: The qedsymbol for the figure is fixed, and both solutions
   have the qedsymbol.


The cause of these problems is then the use of a breakable tcolorbox for
generating EXAMPLE-LIKE environments. I've examined the documentation
of tcolorbox, but not the code behind it. Even with this caveat, I think
I have a pretty good idea of what is going on.

For those bad widows, you have to consider how LaTeX breaks pages.
The (very very simplified) idea is the following: TeX puts each paragraph
into a vbox and stacks them until the height exceeds the desired page
height. Then it takes the last paragraph (vbox) and uses the TeX primitive
\vsplit to chop off the excess and put it as the starting vbox on the
next page. If the excess has only one line (a widow) it redoes the
splitting to either push the line back to the previous page or add an
extra line to the new page. There is a similar process if the last vbox
on the page has a single line (an orphan). Of course there is lots more
to this story (I haven't mentioned floats, for example, or a whole
host of other side issues going on). But a breakable tcolorbox apparently
does the vsplit on its own and not the other things when breaking text. When
the actual LaTeX page breaking mechanism runs, it's too late.
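
To make the \vsplit step concrete, here is a small self-contained sketch. It is an illustration of the primitive only, not of what tcolorbox or the LaTeX output routine actually does; the box names and the four-line split height are made up for the example, and TeX may report an underfull or overfull \vbox for the split piece.

```latex
\documentclass{article}
\usepackage{lipsum}           % filler text only
\newsavebox{\wholepar}        % hypothetical scratch boxes for this demo
\newsavebox{\firstpart}
\begin{document}
% Typeset a paragraph into a vbox instead of directly onto the page.
\setbox\wholepar=\vbox{\lipsum[1]}
% Chop off roughly the first four lines; \wholepar keeps the remainder.
\setbox\firstpart=\vsplit\wholepar to 4\baselineskip
\noindent Split-off piece:\par
\box\firstpart
\bigskip
\noindent Remainder:\par
\box\wholepar
\end{document}
```

Once a box has been split this way, the line that ends up alone in the remainder stays there unless the splitting is redone, which is the "too late" situation described above.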

The reason for the missing qedsymbols is clear: the implementation
combines both the problem and all the solutions into one box before it
appends the qedsymbol at the end. That's too late if there are multiple
solutions. The cause of the problem of the lowered qedsymbols is the
same. A displayed equation or figure already has space below inserted before
the square is added. That is why the amsthm package uses \qedhere.
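
As a minimal sketch of the amsthm mechanism (this is not the content of qedtext.pdf; the solution environment below is a hypothetical definition built on amsthm's proof environment):

```latex
\documentclass{article}
\usepackage{amsmath}
\usepackage{amsthm}   % loaded after amsmath so \qedhere works in displays
% Hypothetical "solution" environment that reuses the proof machinery.
\newenvironment{solution}{\begin{proof}[Solution]}{\end{proof}}
\begin{document}
\begin{solution}
Adding the two equations gives
\begin{equation*}
  2x = 6, \quad\text{so}\quad x = 3. \qedhere
\end{equation*}
\end{solution}
\end{document}
```

Without \qedhere the symbol would be set on a line of its own below the display; with it, the symbol is placed on the last line of the equation.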

So the next question is what to do. We're  talking about the
pretext-latex.xsl file, so presumably the standard latex packages should
be fair game.  Maybe the amsthm package is still of some use.



---------------------------------------------------------------------

Somewhat tangential suggestion: Normally something is printed on a page
only if it serves a purpose. It would be nice to have an attribute like
qeddisplay="false" to turn off the inclusion of the qedsymbol when it is
superfluous.











 

qedtext.pdf
cmoextracts.pdf

Rob Beezer

Feb 12, 2023, 1:30:11 PM
to prete...@googlegroups.com
Dear Michael,

The open box marks the end of an EXAMPLE-LIKE, in order to indicate where an
"example" (or similar) ends and a run of disjoint paragraphs begins. So a
"solution" does not have any ending indicator, as it will only be followed by
another "solution", until the "example" ends. Nor do these boxes mark the end
of proofs; proofs are indicated with a filled-in box.

Search your intermediate LaTeX file, and/or "pretext-latex.xsl" for

\square

and/or

\blacksquare

to get to the source of these symbols.

Note that a publisher is encouraged to change these symbols if they like. We
have gone to great lengths to provide the infrastructure to make it
straightforward to reliably change the constructions in the preamble.

Section 42.6: Environments and Blocks
https://pretextbook.org/doc/guide/html/section-196.html#section-196

Rob
> --
> You received this message because you are subscribed to the Google Groups
> "PreTeXt development" group.
> To unsubscribe from this group and stop receiving emails from it, send an email
> to pretext-dev...@googlegroups.com
> <mailto:pretext-dev...@googlegroups.com>.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/pretext-dev/759fc1fa-6af5-4e14-b71a-2f2916e78f83n%40googlegroups.com <https://groups.google.com/d/msgid/pretext-dev/759fc1fa-6af5-4e14-b71a-2f2916e78f83n%40googlegroups.com?utm_medium=email&utm_source=footer>.

Michael Doob

Feb 26, 2023, 2:22:13 PM
to PreTeXt development
Some progress on the problems previously mentioned. The file examplelike.ptx is a MWE
that leaves the closing \square on the top of a page by itself. The LaTeX generated from this example
is given in examplelike.tex, which in turn gives examplelike.pdf.

A few lines are added to the TeX file to get examplelike+.tex and examplelike+.pdf.
The offending \square has been pulled back to the previous page.

Finally, the file examplediffs is the diff output for the two tex files, showing the changes
that make up the fix in this case.

As the author of tcolorbox notes in the documentation, these are rough tools for controlling
widows and orphans. The fix here may leave other problem cases unresolved.
examplelike.tex
examplelike.pdf
examplelike.ptx
examplelike+.tex
examplediffs

Sean Fitzpatrick

Feb 26, 2023, 2:38:04 PM
to PreTeXt development
I suspect this could reduce the number of times I have to insert \enlargethispage{2\baselineskip} in the .tex file for APEX....


Rob Beezer

Feb 26, 2023, 2:54:02 PM
to prete...@googlegroups.com
Dear Michael,

Perfect! By that I mean the approach. That's the sort of experiment we need.
I don't know those two "tcolorbox" options, so I'll give them a hard look in a
day or two.

Here's something further you could try. "bwminimalstyle" is used as a base
style for many of our blocks. You could try adjusting *that* in the preamble of
the LaTeX output for your project and see how it does on something big/long.
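
A rough sketch of that kind of experiment follows. The style name demobasestyle and its options are stand-ins invented for this example, not the real "bwminimalstyle" definition; in a generated file one would append to the existing style, after its definition in the preamble. The value 3 is arbitrary.

```latex
\documentclass{article}
\usepackage{tcolorbox}
\tcbuselibrary{breakable}
\usepackage{lipsum}   % filler text only
% Stand-in for a base style shared by many blocks (hypothetical definition).
\tcbset{demobasestyle/.style={breakable, size=minimal, colback=white,
                              colframe=white}}
% The experiment: add a break-control option to every box using the style.
\tcbset{demobasestyle/.append style={lines before break=3}}
\newtcolorbox{demoblock}{demobasestyle}
\begin{document}
\lipsum[1-4]
\begin{demoblock}
\lipsum[5-7]
\end{demoblock}
\lipsum[8]
\end{document}
```

Because the option is appended to the shared style, every block built on it picks up the change, which makes it easy to see the effect across a long document.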

Thanks,
Rob

Rob Beezer

Feb 27, 2023, 1:04:00 PM
to prete...@googlegroups.com
Dear Michael,

Thanks again for this. I never got an "examplelike+.pdf" so I built one with
pdflatex that looks right to me. Attached, or you can replace it if it doesn't
seem right.

Strikes me that something like

lines before break = 2

could be a wise universal addition that would automatically improve lots of
situations.

Do you think "enlargepage" will move the body into the space devoted for a
footer, or will it push a footer further down? If you are making a PDF for
print, you'd want to be careful nothing lands outside your trim size. So this
sort of adjustment feels a bit more dangerous?

We have always recognized that print will require manual editing at the very
last stage. This is partly because we do not use floats, so placement of
figures will need manual adjustments.

I wonder if allowing "hints" on a PreTeXt block (when implemented by tcolorbox)
might be a way to "fix" these unfortunate page breaks. Something easy on the
author side:

<example fix-page-break="yes">

could produce several tcolorbox options (see Section 19.4) that would apply to
just that one block. And maybe there is more than one type of "fix" that might
be needed/requested?
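
On the LaTeX side, such a per-block hint might come out as extra options passed to a single box. The following is only a sketch under that assumption: the environment name demoexample is hypothetical, and the option values are arbitrary. The two options shown are the ones named in this thread.

```latex
\documentclass{article}
\usepackage{tcolorbox}
\tcbuselibrary{breakable}
\usepackage{lipsum}   % filler text only
% Hypothetical block environment: the optional argument carries
% one-off tcolorbox options for a single instance.
\newtcolorbox{demoexample}[1][]{breakable, size=minimal, colback=white,
                                colframe=white, #1}
\begin{document}
\lipsum[1-4]
% An ordinary block, default behaviour.
\begin{demoexample}
\lipsum[5]
\end{demoexample}
% A block carrying a "fix": options apply to this box only.
\begin{demoexample}[lines before break=3, enlargepage=\baselineskip]
\lipsum[6-7]
\end{demoexample}
\end{document}
```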

Note that when authors use styles with colored backgrounds and borders the
tcolorbox documentation talks ominously about "empty closing frames", which
will look even worse.

Rob
examplelike+.pdf

Michael Doob

Apr 11, 2023, 5:30:31 PM
to PreTeXt development
I put aside a block of time to better understand the workings of tcolorbox  and how it
interacts with EXAMPLE-LIKE environments. I believe that I better understand the
situation; as usual, there is good news and bad news.  Sorry for the length of
what follows. I've tried to be as concise as possible while not assuming too much.


The problem

Certain typeset objects may not be split across page boundaries. These
include tables, figures and images. This creates a problem: if, for
example, you are typesetting a page and it is 60% full and a table of
half a page comes next, unless the order is changed it must be pushed
to the next page leaving a huge gap at the bottom. When typesetting was
done with movable type, an easy solution was to move the table to the
top of the page and move the extra material on the bottom to the next
page (these objects are called floats for a reason!). As typesetting
became more flexible, the object might be moved to the top of the page,
moved to the bottom of the page, set in between paragraphs, pushed to
the next page or be typeset on a separate float page.

Don Knuth's original output routine for plain TeX had no provision for
floats. One of the first additions to LaTeX was float control, and it
has been improved over the years. No algorithm is perfect, but LaTeX
does a very good job. An override mechanism is present in LaTeX if the
default behaves badly.

The tcolorbox package has been part of LaTeX for a decade. It builds
very sophisticated minipages and allows complicated constructions with a
user-friendly interface. It is a marvel. It is used by PreTeXt to control
the presentation in a number of different contexts (apparently all of
THEOREM-LIKE, AXIOM-LIKE, DEFINITION-LIKE, REMARK-LIKE, COMPUTATION-LIKE,
OPENPROBLEM-LIKE, and EXAMPLE-LIKE). The problem occurs when a float
appears in one of these contexts.

Attached is an illustration of the problem: the LaTeX output of the
file 2floats.ptx is given in 2floats.pdf. If you look at page 2, several
problems are evident:

1. Having two lines above a table is considered to be a widow.
2. There is almost no space above or below the table.
3. The bottom 20% of the page is empty.

What caused all of this? The table at the top is part of the solution
of Problem 1.1; the problem and solution were put into a single
tcolorbox. This tcolorbox is breakable, that is, it may be split
across pages. However, the contents cannot be reordered, so the float
cannot move. Hence the two lines were at the top of the page before the
float was processed. The same thing happens at the bottom of the page:
the float (the second table) must be pushed to the next page, leaving a
huge gap behind. As a point of comparison, I took the same input and ran
it through LaTeX directly. The file latexexample.pdf is the result. Page 2
looks much better.

To really see what is going on, it is necessary to look at the algorithm
used in tcolorbox for breakable boxes. I've included the appropriate
page from the tcolorbox manual (in the file tcb_algorithm.jpg) so
that you don't have to look it up. I'll give you the rough short form:
allowable break points are set between the lines, text is accumulated
until there is too much for one page and then the previous break point
is used. The operative word within this process is \shipout. The \shipout
command is a TeX primitive. Originally the only output file was a dvi
(device independent) file, and \shipout would take a box (hbox or vbox)
and ship it directly to the dvi file.  Importantly, there is no further
tweaking or postprocessing possible at this point. With LaTeX the \shipout
command has been overloaded, but the essential idea is the same: your
output is now written in stone (so to speak).  Now we can see the cause
of the problems listed above:

1. Having two lines pushed to the next page is usually okay. The surprise
   appearance of a float created an unforeseen problem.
2. The extra space allocated between the text and the float is handled by
   the LaTeX output routine (the default in the standard classes is to
   insert 20pt of extra space, which may stretch or shrink by a few
   points). Since \shipout was used directly, there was no chance to fix
   the vertical spacing.
3. The last allowable breakpoint left a big gap because it was followed by
   a float (which shows up on the next page).


Improving tcolorbox output

The tcolorbox package includes the implementation of floating objects.
However, a tcolorbox is usually a minipage, and these don't float. A
solution is to extract the float and put it into a floating
tcolorbox of its own. This is done in 2floats+.tex, and the resulting
2floats+.pdf is attached (I've left the floats with the original grey
background and frame for clarity). It looks much better since LaTeX
itself is doing the float control and \shipout.
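
The idea can be sketched as follows. This is not the content of 2floats+.tex; the table, the placement value t, and the grey background are assumptions made for the illustration.

```latex
\documentclass{article}
\usepackage{tcolorbox}
\tcbuselibrary{breakable}
\usepackage{lipsum}   % filler text only
\begin{document}
\lipsum[1-3]
% The problem and solution stay in a breakable box, without the table.
\begin{tcolorbox}[breakable, title=Problem and solution]
\lipsum[4-5]
\end{tcolorbox}
% The table is moved to the top level into its own floating box,
% so the LaTeX float mechanism decides where it lands.
\begin{tcolorbox}[float=t, colback=black!5]
\centering
\begin{tabular}{lr}
  Item & Value \\
  A    & 1     \\
  B    & 2
\end{tabular}
\end{tcolorbox}
\lipsum[6-8]
\end{document}
```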


Improving output more generally

I examined every use of tcolorbox in pretext-latex.xsl (forty of them).
I note that very few of the huge number of bells and whistles available
are used.  There is nothing wrong with this, but it hints that other
solutions may be easier to implement. Conclusions from reverse engineering
can be suspect, but I believe tcolorbox is being used for the following
reasons:

1. Ease of changing fonts.
2. Control of microspace adjustments.
3. Linking of counters.
4. Ease of using xparse to examine and use parameters.

All of these can be done without resorting to tcolorbox. (In passing:
xparse has migrated from the LaTeX3 project to the LaTeX2e kernel,
and so it is available directly). If I knew exactly what is desired
I might be able to suggest alternatives. This strikes me as a useful
although not altogether trivial project. It might be worthwhile since
using breakable tcolorboxes has some drawbacks:

1. Many natural applications (say, a series of problems and solutions
   as in the first example), can have a high density of breakable
   tcolorboxes. These can cause a \shipout by tcolorbox which, as we
   have seen, is not optimal for quality pages.
2. There will be a steady stream of users complaining about the layout.
3. If the user is contacted by a publisher who wants to make a printed
   version (this happened to me twice), the generated tex file will
   be useless.

On the other hand, with the current direction and timeline of the PreTeXt
project, this type of revision might not be feasible.  Perhaps Rob or
David could provide feedback.






tcb_alogrithm.jpg
latexexample.pdf
2floats.ptx
2floats+.pdf
2floats.pdf

Rob Beezer

Apr 12, 2023, 12:18:29 PM
to prete...@googlegroups.com
Dear Michael,

On phone, so brief. And have not been able to examine attachments.

Thanks for all the investigation. Improving floats would be welcome. I have never liked LaTeX's handling of floats, especially when they all pile up at the end of a division. So we have fixed placement. But my experience could be past its sell-by date.

Page breaks are a PITA. I have been writing a formatter for embossed braille with similar requirements. We expect authors to resolve bad breaks before publication. We are also very actively exploring the process of working with a commercial publisher.

Most of your speculation about the desirability of tcolorbox is incorrect.

It allows for colored boxes, borders, etc. Which you note we do not do. But style writers will easily, by design. Study Oscar Levin's work.

As a bonus, side-by-side layout and columns of exercises in exercise groups are well-behaved. And we have less ad-hoc package use.

It is highly unlikely anybody will successfully rip out tcolorbox and move to an equally capable replacement.

But if we could manage floats cooperatively with tcolorbox that could be a real improvement.

Rob



Michael Doob

Apr 13, 2023, 10:59:52 AM
to PreTeXt development
This is quite helpful for my understanding of your justification for using tcolorbox. So let's assume 
that it will stay pretty close to what it is now. What about the following in principle:

1. When a float is encountered in a tcolorbox, it is put in its own  floating tcolorbox. This could not
    be contained in the current tcolorbox since the float would certainly fail in the minipage
    (with a cryptic LaTeX error message, I suppose) environment, but needs to be placed
    at the top level. (Note that float* allows you to override the default placement.)

2. In EXAMPLE-LIKE the tcolorbox will have two parts: one for the problem and one for the
    solution in that type of circumstance. This freezes the  intervening glue; better results
    might be possible if two tcolorboxes were used. If this seems reasonable, I could do
    a few experiments.

Rob Beezer

May 1, 2023, 11:30:40 AM
to prete...@googlegroups.com
Getting back to this one. Enabling floats would be a change in philosophy. We
use LaTeX as a means to an end, primarily for good typography, especially for
mathematics. We wrestle with its features for document structure. I'd say
(loosely) that float placement falls somewhere in-between. In other words, just
because LaTeX has a feature, we are not compelled to take advantage of it.

1. Moving out floats.
We do exactly this - every footnote in a structure implemented by a tcolorbox
gets ripped out and accumulated just after the tcolorbox, at the top-level and
LaTeX renders them properly at the foot. Otherwise footnotes can appear
mid-page right where the tcolorbox ends - a very strange look. But notice that
if a tcolorbox consumes several pages, all the footnotes migrate to the last page.

Technically possible to do the same with figures, images, etc. But as with
footnotes, they will appear pages later, all together. And if I remember right,
this increases the potential for all these floats (and subsequent) to jam up at
the end of a division.
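
For readers unfamiliar with the footnote technique, the general LaTeX idiom looks roughly like the sketch below. This is the standard \footnotemark / \footnotetext pairing, shown as an illustration only; it is not a copy of what pretext-latex.xsl emits, and with several notes in one box the footnote counter needs extra care.

```latex
\documentclass{article}
\usepackage{tcolorbox}
\begin{document}
\begin{tcolorbox}
A claim inside the box that needs a note.\footnotemark
\end{tcolorbox}
% The text of the note is emitted at the top level, after the box,
% so LaTeX places it at the foot of the page as usual.
\footnotetext{The note, typeset by the ordinary footnote mechanism.}
\end{document}
```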

2. Examples in two boxes.
A good idea for a plain black text on white background situation. But if you
want your example in a blue box with a black border, now you will need to have
*two* adjacent boxes. This defeats the style options that tcolorbox was
employed to support.

Rob

Michael Doob

May 14, 2023, 3:39:12 PM
to PreTeXt development
As you suggested, Rob, I took a careful look at the pdf of the Discrete Mathematics
book distributed by Oscar Levin and took notes on anomalies.

Not surprisingly, many of the potential problems I referred to in earlier
posts do indeed show up. Among the dozens I observed, most were pretty
minor, that is, things that looked rather inelegant by most typesetting
conventions but were still legible. There were a few cases that I would
consider more serious and that really impeded the reader.

Roughly speaking, the number of problematic pages is proportional
to the density of tcolorbox, or, perhaps more properly, the minipages.

So here are a few suggestions that might ameliorate this problem:

1. For floats within a tcolorbox: add an attribute, say  topfloat,
   that would take a single float (presumably the first) and float it
   to the top of the page.  A similar attribute could be used for moving
   a float to the bottom of the page.

2. A new attribute to be used with tcolorbox to add a small amount
   of \vglue before and after the tcolorbox itself. Since \parskip is
   set to 0, some vertical glue would be really helpful for reading.

3. The use of \tcbbreak and the parameter break at <length> can control
   the allowable break points within a given tcolorbox. It would be nice
   to have an attribute whose effect is to have a single break point after
   n lines. The second (easier to implement) choice would be to have a
   breakpoint at a given distance from the top of the displayed box.
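
To make suggestions 2 and 3 concrete, here is a sketch of the tcolorbox keys involved, written directly in LaTeX rather than as PreTeXt attributes. The lengths are arbitrary values chosen for the illustration.

```latex
\documentclass{article}
\usepackage{tcolorbox}
\tcbuselibrary{breakable}
\usepackage{lipsum}   % filler text only
\begin{document}
\lipsum[1]
% Suggestion 2: vertical space before and after the box.
% Suggestion 3, second choice: break at a given distance from the top.
\begin{tcolorbox}[breakable, before skip=12pt, after skip=12pt,
                  break at=6cm]
\lipsum[2-4]
\end{tcolorbox}
\lipsum[5]
% A hand-placed break point inside a breakable box.
\begin{tcolorbox}[breakable]
First part of the example.
\tcbbreak
Second part, starting on the next page.
\end{tcolorbox}
\end{document}
```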

These suggestions are somewhat counter to the philosophy of PreTeXt:
it allows some user control of the presentation.  However, problematic
examples are not low probability special cases; rather, they will surely
occur, especially with the  repeated use of large tcolorboxes.

I should mention that my mind has changed on one thing: most (but not all)
of the problems arising in this context can be remedied by adroit editing
of the intermediate LaTeX file. I'm not sure we want to encourage this
as a solution, but it means some repair is possible.

Rob Beezer

May 14, 2023, 8:14:17 PM
to prete...@googlegroups.com
Thanks, Michael, for continuing with this one.

Comments are welcome from those who care about their PDF output. This does not
need to be (should not be?) a conversation between one author and the LaTeX maintainer.

On 5/14/23 12:39, Michael Doob wrote:
> 1. For floats within a tcolorbox: add an attribute, say  topfloat,
>    that would take a single float (presumably the first) and float it
>    to the top of the page.  A similar attribute could be used for moving
>    a float to the bottom of the page.

I'd entertain a PR that does this, from you, or an interested party.

> 2. A new attribute to be used with tcolorbox to add a small amount
>    of \vglue before and after the tcolorbox itself. Since \parskip is
>    set to 0, some vertical glue would be really helpful for reading.

If this is a good idea, then it sounds to me like it should just happen all the
time. This is the sort of thing *we* should do, and it should not be a part of
an author's source, nor concern an author at all. I'd be interested to see an
example where this helps dramatically, and if you think it is not universal,
then an example where it is obviously harmful.

> 3. The use of \tcbbreak and the parameter break at <length> can control
>    the allowable break points within a given tcolorbox. It would be nice
>    to have an attribute whose effect is to have a single break point after
>    n lines. The second (easier to implement) choice would be to have a
>    breakpoint at a given distance from the top of the displayed box.

We do have discretionary break indications for display math. There are a couple
of extreme examples in the sample article. A PreTeXt paragraph (or similar)
does not have a notion of "lines" ("md" has "mrows"). We also stay away from
physical units/distances in source. So this sounds to me to be too far removed
from our philosophy.

> I should mention that my mind has changed on one thing: most (but not all)
> of the problems arising in this context can be remedied by adroit editing
> of the intermediate LaTeX file. I'm not sure we want to encourage this
> as a solution, but it means some repair is possible.

We have always assumed that the last step, pre-publication, would be exactly
this sort of tweaking by a publisher. And it might not even be worth saving for
next year, since we are always adding new things like inserting some \vglue here
and there. So a lot of effort goes into making the LaTeX have (a) a legible
semantic body, and (b) a preamble that is editable.

I have thought we might give authors ways to record *some* tweaking. The first
would be something like:

<latex hint="do-a-hard-page-break-here"/>

It might make next year's tweaking easier, even if some hints would have to be
touched.

Thanks,
Rob

Sean Fitzpatrick

May 15, 2023, 1:14:45 PM
to prete...@googlegroups.com
Not sure if it would be helpful, or just annoying, but I could try to
document some of the tweaking I currently do for APEX and for my linear
algebra book. (Some of my tweaking is almost certainly not
Beezer-approved...)

Some of the things I usually adjust:

- It's not uncommon for me to have a section that ends on a page with
one line of text on it. That doesn't look good to me, so I usually try
to force it onto the previous page.
- I've had some tcolorboxes that didn't quite fit on the page, and LaTeX
tried to squish vertically to the point that they overlapped other text,
so I had to force some spacing.
- In my linear algebra book, for reasons I don't understand, it
sometimes happens that on the first page of a section, the text just
runs right off the bottom of the page, and I have to force a page break.

In APEX most of my work is related to the fact that we wanted figures
and asides to go into the margin, but that's a problem of my own creation.
(It is made slightly worse by tcolorbox -- a margin figure inside a
tcolorbox gets shifted horizontally, but I have a way to fix that
automatically. The other issue is that instead of being placed
vertically relative to the text where the figure is located, everything
gets pushed to the bottom of the tcolorbox, and I have to manually shift
things back up.)

Michael Doob

Jun 7, 2023, 11:23:32 AM
to PreTeXt development
Here are some examples from Oscar Levin's fine Discrete
Mathematics textbook available from the PreTeXt gallery:
https://discrete.openmathbooks.org/pdfs/dmoi3-tablet.pdf
Some have really problematic layout while others are simply 
unconventional. I'll give them in decreasing order of seriousness.

OLpage049.jpg: The most important sentence on this page is arguably
the first one in the paragraph starting with "Since ...". It is part
of the mathematical narrative of the chapter while everything else
on the page is an example. That sentence is lost on a page dominated
in the layout by examples (did you have any difficulty finding
the paragraph?). Unfortunately, I was not able to reconstruct the
dmoi3-tablet.tex file used to make the available sample, so I have
included my own sample files of this situation: minipage followed
by a sentence followed by another minipage. If you look at the files
MDpage[1-3].jpg, I've displayed a framed minipage followed by some text
followed by another framed minipage. The first is the default spacing:
almost unbearable to view. Clearly space needs to be added before
and after the minipage. The second approximates the spacing given
by PreTeXt. The third opens this up a bit more. To my eye the third
example looks the best. There is another thing to consider: if the extra
space is added, what happens when you have two consecutive minipages.
MDpage4.jpg illustrates this.
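
A test file of this general shape can be sketched as follows. This is not the source of the MDpage files; the macro name \framedbox and the \medskip spacing are assumptions chosen for the illustration.

```latex
\documentclass{article}
\usepackage{lipsum}   % filler text only
% Hypothetical helper: a framed minipage as wide as the text block.
\newcommand{\framedbox}[1]{%
  \noindent\fbox{%
    \begin{minipage}{\dimexpr\linewidth-2\fboxsep-2\fboxrule\relax}
      #1
    \end{minipage}}}
\begin{document}
\framedbox{\lipsum[1]}
\par\medskip   % remove or enlarge this space to compare layouts
A single sentence of narrative between two examples.
\par\medskip
\framedbox{\lipsum[2]}
\end{document}
```

Deleting the two \medskip lines approximates the cramped default, while replacing them with \bigskip opens the layout up further.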

OLpage308.jpg: There is a double problem at the bottom of the page:
the solution is separated from its header, and the page is not flush
bottom. Perhaps it is worth noting that adding a \newpage to the ptx
file will not fix this.

OLpage220.jpg: The solution header made it to the same page this time,
but there is blank space at the top (\vspace is defined in LaTeX
specifically to avoid this). It seems that there is always some space at
the top of continued examples, which is a bit unusual. There is no direct
fix for this.

OLpage300.jpg: Double extra blank space at the top of the page.

OLpage006.jpg: Extra space at the bottom of the page.

OLpage294.jpg: A chapter of a book starts on a recto (odd numbered)
page. If the last page of a chapter is also recto, the verso (reverse
side) is left blank with no header or footer. I thought that this was
automatic with the LaTeX twoside option. In any case, inserting something
like \clearpage\ifodd\thepage\else\leavevmode\pagestyle{empty}\newpage\fi
for the last page should fix this.
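
The same idea written out as a macro, as a sketch only. The name \blankversoifneeded is hypothetical. The sketch assumes a twoside book and substitutes \thispagestyle for \pagestyle so that only the blank page loses its header and footer.

```latex
\documentclass[twoside]{book}

% Hypothetical macro name; a sketch, not PreTeXt's own code.
\newcommand{\blankversoifneeded}{%
  \clearpage                 % finish the page and flush pending floats
  \ifodd\value{page}\else    % an even page number: the next page is a verso
    \leavevmode              % put something on the page so it is shipped out
    \thispagestyle{empty}%     no header or footer on this page only
    \newpage
  \fi
}

\begin{document}
\chapter{First}
A chapter that ends on a recto page.

\blankversoifneeded
\chapter{Second}
This chapter starts on the next recto page.
\end{document}
```

The macro is meant to be placed by hand just before a \chapter whose predecessor ends on an odd-numbered page. When the predecessor ends on an even-numbered page, it does nothing beyond the \clearpage.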

Related enough to be part of this thread:
You've probably noted the request by Karl on pretext-support for tips
on how to improve the pdf output. I would like to reply, but I am unsure
of the best way to proceed.
Here are some reasons:
1. All of the examples I've given except for the last one arise from a
   single piece of code: the use of \shipout in a breakable tcolorbox.
   This means that a fix will require an understanding of the tcolorbox
   syntax plus an understanding of manipulating minipages. That's asking
   a lot from your typical user.
2. Adjusting the presentation by altering small blocks of text is
   notoriously unstable. The following sounds like an urban legend, but
   actually happened to me: we had a paper for the Canadian Journal of
   Mathematics in which the author thanked several people at the end of
   the text just before the references. As we were close to publishing,
   the author wanted to add one more name to the list. We did so and
   the paper became one page shorter.
3. With even a modest number of changes, the post processing starts
   from square zero and needs to be redone completely. It's not clear
   to me if much of this can be automated by scripts (if authors have
   that in their skill set).
4. As the density of tcolorboxes increases, the complexity of corrections
   (sometimes drastically) increases.

Any thoughts on how we might proceed?

OLpage220.jpg
OLpage294.jpg
MDpage4.jpg
MDpage3.jpg
OLpage308.jpg
OLpage300.jpg
OLpage049.jpg
MDpage2.jpg
MDpage1.jpg
OLpage006.jpg

Mitch Keller

Jun 7, 2023, 4:13:00 PM6/7/23
to prete...@googlegroups.com
Thanks, Michael, for looking carefully at these things. I’m looking ahead to print version production projects that will likely hit in summer 2024 and summer 2025, and following with interest. I will add a few thoughts to your numbered list, since I’m one of the folks who has the most experience with page fitting for PreTeXt projects.

> Related enough to be part of this thread:
> You've probably noted the request by Karl on pretext-support for tips
> on how to improve the pdf output. I would like to reply, but I am unsure
> of the best way to proceed.
> Here are some reasons:
> 1. All of the examples I've given except for the last one arise from a
>    single piece of code: the use of \shipout in a breakable tcolorbox.
>    This means that a fix will require an understanding of the tcolorbox
>    syntax plus an understanding of manipulating minipages. That's asking
>    a lot from your typical user.

I think for the moment, we are probably still at a stage where the sort of person who is really fussing with getting things into an excellent state for print production isn’t a “typical user”. Thus, I think if we document best practices reasonably well, it’s probably not expecting an unreasonable amount from the folks who would be doing this.

> 2. Adjusting the presentation by altering small blocks of text is
>    notoriously unstable. The following sounds like an urban legend, but
>    actually happened to me: we had a paper for the Canadian Journal of
>    Mathematics in which the author thanked several people at the end of
>    the text just before the references. As we were close to publishing,
>    the author wanted to add one more name to the list. We did so and
>    the paper became one page shorter.

Oh my. Plus, fussing with small blocks of text sometimes makes other formats come out poorly, so it’s really not a great solution for something with as many output targets as PreTeXt offers.

> 3. With even a modest number of changes, the post processing starts
>    from square zero and needs to be redone completely. It's not clear
>    to me if much of this can be automated by scripts (if authors have
>    that in their skill set).

Assuming that every chapter (if not every section) starts on a new page, that can help with a lot of issues. At least you know that the impacts of your revisions should stay somewhat near to where the revision happens. There is a reasonable workflow using git rebase that can be used to apply hand edits that were made to the LaTeX file before to the latest version that comes out of PreTeXt. As Rob has indicated, PreTeXt changes enough in the span of a year (or even six months) that the version of the LaTeX file from a year ago is not helpful. Still, it is quite feasible to get LaTeX out of PreTeXt, make some changes to the LaTeX, then realize that something needs to change in the PreTeXt, regenerate the LaTeX, and apply the changes that were previously made to the LaTeX file so that you don’t have to start from scratch.

Oscar Levin

Jun 7, 2023, 5:47:11 PM6/7/23
to prete...@googlegroups.com
Thanks Michael for providing examples.  A few notes on them.

On page 49, I think the issue is more that the narrative is lost among the examples, rather than a presentation issue.  Perhaps if I had done a better job styling the example boxes this would look less offensive (maybe indent them slightly).  In any event, is there any latex that would make the original look right?

Most of the strange pagebreaks are the result of my adding \newpage or \clearpage (whatever the difference between those are anyway).  Where the solution heading is disconnected from the body of the solution, I think I just missed it.  Of course it would be great to not have to worry about this, but it seems like no matter what, I need to spend time tweaking things at this stage of the publication process, and often have to decide between the lesser of two evils: do I have a non-flush bottom or is the bottom just the header of the next section?

There are multiple places in which the tcolorbox code can be improved I think (I still don't know why an image after the first paragraph of an exercise gets no top margin....) but even if it is at a local maximum, there likely will be a need to manually fix things.  I wonder if PreTeXt could have some publisher hints in the source that say where to add manual breaks.  Or perhaps the publication file could have a list of xml:ids of blocks that should start a new page?

Oscar.

Michael Doob

Jun 7, 2023, 6:31:32 PM6/7/23
to PreTeXt development
On page 49 you could try adding a small amount of vertical glue. Something like a \vspace{2pt} 
after the second example and  before the third example would probably help. At this point I
think the right way to go is to add extra space within PreTeXt. It remains to be seen if this
looks good in other contexts. Rob or David could say something more authoritative. 

Invoking \newpage closes out the page and moves on to the next page,
while \clearpage does the same and also flushes any pending floats. So, for example, you need
to use \clearpage before testing the parity of the last page in a chapter
to be sure the next chapter starts on an odd-numbered page.
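
A small sketch of the difference, assuming nothing beyond the standard article class. The figure is a stand-in made from an \fbox and is restricted to a float page with [p] so that it is still pending when \clearpage is reached.

```latex
\documentclass{article}
\begin{document}

Text before the float.

\begin{figure}[p]
  \centering
  \fbox{A float waiting for a float page}
  \caption{A pending float}
\end{figure}

More text on the first page.

% \clearpage ends the current page and forces every pending float
% out before continuing; \newpage here would only end the page.
\clearpage

Text that starts only after the figure has been placed.

\end{document}
```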

I agree with you that there will always be some fine tuning necessary. This is
inevitable because of the very simple way that breakable tcolorboxes are split.

Cheers,
Michael

kcri...@gmail.com

Jun 8, 2023, 9:07:03 AM6/8/23
to PreTeXt development
>> 1. All of the examples I've given except for the last one arise from a
>>    single piece of code: the use of \shipout in a breakable tcolorbox.
>>    This means that a fix will require an understanding of the tcolorbox
>>    syntax plus an understanding of manipulating minipages. That's asking
>>    a lot from your typical user.

> I think for the moment, we are probably still at a stage where the sort of person who is really fussing with getting things into an excellent state for print production isn’t a “typical user”. Thus, I think if we document best practices reasonably well, it’s probably not expecting an unreasonable amount from the folks who would be doing this.

Though I was aiming at the "middle ground" of people who have used LaTeX a fair amount for math stuff, but do not necessarily know anything about tcolorbox, and know only enough about minipages to use them in a basic sense.  The ideas about \par and \newpage are more my speed :-)

>> 3. With even a modest number of changes, the post processing starts
>>    from square zero and needs to be redone completely. It's not clear
>>    to me if much of this can be automated by scripts (if authors have
>>    that in their skill set).

> Assuming that every chapter (if not every section) starts on a new page, that can help with a lot of issues. At least you know that the impacts of your revisions should stay somewhat near to where the revision happens. There is a reasonable workflow using git rebase that can be used to apply hand edits that were made to the LaTeX file before to the latest version that comes out of PreTeXt.

That's why Alex' suggestions of a script were helpful; some things won't change that much, or are predictable (even if not related to specific items).   I just made a sed script that I'll be adding to along those lines.  But yes, that workflow is also possible for more "one-off" things.

Rob Beezer

Jun 8, 2023, 2:44:26 PM6/8/23
to prete...@googlegroups.com
Dear Michael,

As always, thanks for the careful examples. Are those the *only* substandard
instances you found, or just a representative set of examples of EXAMPLE-LIKE?

Oscar has said many things I would have said. In particular, we should allow a
publisher to say each "section" begins on a new page (if we don't already). For
the changes we are discussing, I think they can be addressed in chapter-sized
(or section-sized) chunks.

PDF is a very bad choice for an electronic document. HTML for online use, EPUB
for offline use. So I only really care about PDF for print purposes. And I
think a publisher would refresh this at most annually.

We introduced tcolorbox on December 16, 2017, over five and a half years ago,
at c6735aef. It improved many deficiencies in the LaTeX conversion, and is now
tightly intertwined (such as the numbering of certain blocks). I can't think of
anyone prepared to rip it out. So it improves over time (we can help push) or
we learn to control it better. No guarantee either will happen.

> Any thoughts on how we might proceed?

Another option is a new conversion to print. This is exactly what XSL-FO is
about, and it would be very natural. "What, a PDF about mathematics that is
not typeset by LaTeX? Blasphemy!", I hear you say.

We are very good at ripping out the math and converting it to all manner of
formats: PNG (Kindle), SVG (EPUB), Nemeth (braille), speech (EPUB
accessibility). So maybe those would integrate into a workflow with XSL-FO? I
don't know. Almost everything I do know is here

https://en.wikipedia.org/wiki/XSL_Formatting_Objects

which is above-average for being informative, so worth reading. Apache FOP is
open-source (and that's about all I know there).

Doing a new conversion from scratch is entertaining. I just had a do-over with
braille. Next, Jupyter needs the same treatment (and I have a new
meta-technique to employ).

Could you imagine a world where MathJax replaces TeX/LaTeX completely? It's not
that far away, I think.

Rob

David W. Farmer

Jun 8, 2023, 2:58:17 PM6/8/23
to prete...@googlegroups.com

This is more noise than signal, but:

I don't think it is far-fetched to print the HTML (i.e., save as PDF)
and have it look pretty good, including automatically managing common
problems like making sure a heading is not at the bottom of the page.

This would actually be better for some use cases, such as printing a
section which does not start at the top of the page in the official
PDF version.

This is something I would like to do, but not at the top of my
priority list.

When I get around to fixing the 80% issue with worksheets (those print
nicely the last time I looked, but you have to change the default
scaling in your print dialog) then maybe I can do a bit of work on
non-worksheets.

David

Sean Fitzpatrick

Jun 8, 2023, 7:07:00 PM6/8/23
to PreTeXt development
For this particular problem of spacing around examples: if it turns out that it's needed for Oscar's book, but it doesn't look good in general, I believe it's possible to use your extra xsl to modify the exercise tcolorbox style to include 'before' and 'after' values, so this could be a publisher decision rather than a PreTeXt decision.
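
On the LaTeX side, the effect of such a change might look like the sketch below. The style name examplespacing, the environment name demoexample, and the 12pt values are all hypothetical, and PreTeXt's generated style names are different. The sketch uses tcolorbox's `before skip` and `after skip` keys, which set only the vertical space, rather than the more general `before` and `after` keys.

```latex
\documentclass{article}
\usepackage[most]{tcolorbox} % "most" loads the breakable library

% Hypothetical names and values, for illustration only.
\tcbset{examplespacing/.style={before skip=12pt, after skip=12pt}}
\newtcolorbox{demoexample}{breakable, examplespacing}

\begin{document}

\begin{demoexample}
First example.
\end{demoexample}
A sentence of narrative between two examples.
\begin{demoexample}
Second example.
\end{demoexample}

\end{document}
```

Keeping the two lengths in one named style means a publisher would only need to override that style to retune the spacing for a particular book.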

Rob Beezer

Jun 8, 2023, 7:36:41 PM6/8/23
to prete...@googlegroups.com
Exactly.

Rob Beezer

Jun 8, 2023, 7:40:09 PM6/8/23
to prete...@googlegroups.com
On 6/8/23 11:58, David W. Farmer wrote:
> I don't think it is far-fetched to print the HTML (i.e., save as PDF)
> and have it look pretty good, including automatically managing common
> problems like making sure a heading is not at the bottom of the page.

Yes, the "paged media" features of CSS might lead to a very acceptable result.

Rob

Michael Doob

Jun 8, 2023, 8:40:43 PM6/8/23
to PreTeXt development
In answer to your question:
I read through Oscar's book pretty quickly making notes about a page when something
there seemed off. Looking back at my notes, I see I had something to 
say for 47 of the pages. Most were of the nit-picking variety. Recalling Tom Judson's
remark: once you see the arrow in the Fedex logo, you can't unsee it. Also, since I
know something about the source of the problem (the notorious \shipout) I know where 
to expect problems to show up. In general, I am sure that (not infrequently) there will be some 
serious situations that will require a postprocessing correction.

Putting a vertical adjustment parameter in the publisher file sounds like a good idea to me.
The best space will be affected by the particular presentation of the tcolorbox.

kcri...@gmail.com

Jun 9, 2023, 9:37:52 AM6/9/23
to PreTeXt development
> PDF is a very bad choice for an electronic document. HTML for online use, EPUB
> for offline use. So I only really care about PDF for print purposes. And I
> think a publisher would refresh this at most annually.

Comments from a less-informed source, hopefully useful.

I've not been impressed (nor unimpressed) by epub as used in Apple Books.app or the various Adobe solutions, but maybe I'm reading bad Project Gutenberg books. A few things I would worry about, as author and end user:
* Syncing with print:  Students turn out to be terrible at finding "Section 3.2.3" but great at finding "page 142", so if I want students to find something in print, I use that.  I haven't seen any such syncing in the ebooks of this format I've read, but again maybe more sophisticated ones do.  We can't sync html with print, but syncing offline computer with it is a distinct advantage of pdf.
* Commenting/writing: Again, maybe Books.app is just a bad e-reader, but it only seems to support highlighting some text and writing text about that text as a "Note".  And bookmarks.  That's all well and good, but a reader might want to circle something, not just highlight it, or to write some math notation with a stylus.  I can (fairly) easily do that with pdf, despite all my technological backwardness on the hardware front, but most posts I saw about this just now suggest that epub (intentionally, I guess) doesn't support this.  But writing in your book is a very desired feature of many (especially activity-based) math books.  (I can't see typing out a minimal spanning tree practice problem.)
* Familiarity: For better or worse, the math (and allied) world is familiar with pdf coming from LaTeX.  So both on an authoring and reading ground, you may be likely to get mostly early adopters for quite some time on the epub side.
* Printing: Does epub print nicely?  I just don't know the answer to that.  It seems fairly likely that someone might be reading something and want to print out a page or a section.  Of course, pdf has the problem mentioned somewhere in this thread about sections not starting at the top of a page, so this bullet could cut both ways.

Non-epub comments, which boil down to "this user humbly suggests PTX functionality/usage should focus first on what people are familiar with, to attract a lot more users and developers":
* New print option: The familiarity issue is going to take a lot more time to resolve on anything like that.  Could be interesting for mature projects with early adopter authors to see how they like it.  Or maybe should apply for a Google Summer of Code student to do it?
* Printing HTML as PDF: David's experience with that must be far more positive than mine.  The articles from my regional newspaper that I print occasionally to pdf turn out okay, though not fantastic, perhaps that's their CSS' fault.  But a textbook is a heck of a lot more complex than a newspaper article.

Rob Beezer

Jun 9, 2023, 11:43:27 AM6/9/23
to prete...@googlegroups.com
On 6/8/23 11:44, Rob Beezer wrote:
> In particular, we should allow a
> publisher to say each "section" begins on a new page (if we don't already).  For
> the changes we are discussing, I think they can be addressed in chapter-sized
> (or section-sized) chunks.

https://github.com/PreTeXtBook/pretext/issues/2001