Re: Synthesizing a table layout algorithm with FTL

Robert O'Callahan

unread,

Mar 9, 2012, 11:42:14 AM3/9/12

to Leo Meyerovich, dev-...@lists.mozilla.org, Ras Bodik

Hi Leo! Have you written anything about how you handle line breaking/page
breaking?

Rob
--
“You have heard that it was said, ‘Love your neighbor and hate your enemy.’
But I tell you, love your enemies and pray for those who persecute you,
that you may be children of your Father in heaven. ... If you love those
who love you, what reward will you get? Are not even the tax collectors
doing that? And if you greet only your own people, what are you doing more
than others?" [Matthew 5:43-47]

Leo Meyerovich

unread,

Mar 9, 2012, 3:45:54 PM3/9/12

to dev-...@lists.mozilla.org, Ras Bodik

Hiya Rob,

The line breaking I've played with seems fine -- a one-pass greedy version is implemented via trait "Wrapping" for classes WrapBox and Cell in the tutorial I sent. LIne wrap doesn't prevent concurrency because it occurs locally (a node iterating over its words).

Perhaps you mean an explicit line break element (br)? That would require changing the intrinsic preferred width calculations to something like:

loop child:
prefWidthRollingSum[i] = child[i].prefWidth + child[i].isBreak ? 0 : prefWidthRollingSum[i - 1];
prefWidthRollingMax[i] = max(prefWidthRollingMax[i - 1], prefWidthRollingSum[i]);
prefWidthRollingSum[-1] = 0
prefWidthRollingMax[-1] = 0

The actual x positioning of a word would likewise check for "child[i].isBreak".

I'm not sure how page breaking works -- does it seem much more complicated? I didn't have it high on my TODO list but can bump it up. Right now, I'm trying to do left floats. Figuring out the semantics has been tricky because Gecko, Opera, and WebKit do different things on edge cases :) I suspect floats will reveal how speculative parallelism and maybe iterative evaluation looks.

Regards,

- Leo

Robert O'Callahan

unread,

Mar 9, 2012, 5:00:37 PM3/9/12

to Leo Meyerovich, dev-...@lists.mozilla.org, Ras Bodik

On Fri, Mar 9, 2012 at 8:46 PM, Leo Meyerovich
<lmey...@eecs.berkeley.edu>wrote:

> The line breaking I've played with seems fine -- a one-pass greedy version
> is implemented via trait "Wrapping" for classes WrapBox and Cell in the
> tutorial I sent. LIne wrap doesn't prevent concurrency because it occurs
> locally (a node iterating over its words).
>

I'm thinking of cases where, in CSS terms, a DOM node has to be split into
multiple boxes. For example in HTML, aa bb cc dd,
where a soft line break is required between bb and cc. In the example you
sent, a WrapBox doesn't have to split any child boxes.

FWIW I imagine you may be best off reformulating into an approach that
avoids splitting your internal boxes, and providing a way to recover the
actual CSS boxes somehow. But a good solution is extremely important for
obvious reasons.

I'm not sure how page breaking works -- does it seem much more complicated?
> I didn't have it high on my TODO list but can bump it up.

It's very similar to line breaking in theory.

Right now, I'm trying to do left floats. Figuring out the semantics has
> been tricky because Gecko, Opera, and WebKit do different things on edge
> cases :) I suspect floats will reveal how speculative parallelism and maybe
> iterative evaluation looks.
>

Yeah, floats are fun :-).

Another particularly delicate part of CSS layout is handling
"overflow:auto", because it involves circular dependencies. The presence of
a scrollbar affects the available width or height within the element, which
affects whether scrollbars are needed. In Gecko we effectively try all four
possible solutions and pick the "optimal" one (let's say, the one showing
the fewest scrollbars, preferring to show a vertical scrollbar over a
horizontal one), with appropriate short-circuiting and incrementalization
optimizations.

I'm loving your work BTW :-).

Leo Meyerovich

unread,

Mar 9, 2012, 6:10:13 PM3/9/12

to rob...@ocallahan.org, dev-...@lists.mozilla.org, Ras Bodik

>
> I'm thinking of cases where, in CSS terms, a DOM node has to be split into multiple boxes. For example in HTML, aa bb cc dd, where a soft line break is required between bb and cc. In the example you sent, a WrapBox doesn't have to split any child boxes.

Yeah, I'm thinking for this case, we can take advantage of a selector-like extension as described at the bottom of the table tutorial.

We want to consider "aa", "bb", "cc", and "dd" all as children of the closest ancestral block formatting context (I think that was the term), even if they happen to be grandchildren or further in the parse tree. So, I'd like to write something like:

class LineWrappingBFC {
children {
words : Node = (./Node[display==Inline || display==InlineBlock])* //tweaked further to reach but not go into floats
}
actions {
loop words:
/* code as in the previous email */

The table tutorial manually did this for a column selecting its cells as a search on the table rows, and I'm pretty sure we can code generate code to do it from a selector-like form shown above.

Further out, I'm curious about handling this code in terms of full automated verification/synthesis, not just code generation. In the table example, I added a few constraints to tell the synthesizer how to pretend it's a normal attribute grammar, but I suspect it's possible to (soundly) fully automate these hints. I should stress that this is not essential for running code nor code generation :)

>
> Another particularly delicate part of CSS layout is handling "overflow:auto", because it involves circular dependencies. The presence of a scrollbar affects the available width or height within the element, which affects whether scrollbars are needed. In Gecko we effectively try all four possible solutions and pick the "optimal" one (let's say, the one showing the fewest scrollbars, preferring to show a vertical scrollbar over a horizontal one), with appropriate short-circuiting and incrementalization optimizations.
>

That sounds like a good one to look at after floats. We have local short-circuiting for expressions, which gets us part of the way there. For example, in "x := a ? f(b) : g(c)", we don't need to compute both "f(b)" and "g(c)". However, globally, it still requires computing "a", "b", and "c". Posing a challenge, it sounds this is a non-local short-circuiting. In particular, the choice in "a" depends on what we get in "x" (a cycle), which in turn gives an iterative computation yielding different "b" and "c" in each iteration until we are happy with "x".

Such an iterative computation seems doable: we might phrase this case as a trigger to an ordered sequence incremental computations which stops on the first satisfactory attempt. For the foreseeable future, this may be in the category of code generation commands / patches that the synthesizer doesn't fully understand, which is fine.

Longer-term, it does make semantic sense to declaratively control non-local short-circuiting. We can view non-local short-circuiting as "pull" dataflow where, currently, we use "push". Currently, if you want a pocket of "pulls" within an overall "push", you'd have to handle that within external functional calls (kind of how the table code computes the grid). Combining them does make sense, sort of like the reverse of Erik Mejier's LINQ framework: there, the default is pull (~SQL), while sometimes you want push (RxLINQ).

When we write up floats, I'll try to shed a bit more light into what all this means.

> I'm loving your work BTW :-).

Thanks :) It's fun and part of an itch I've had for years ("what is CSS??", "I'd like to build tools for CSS but it's complicated...") :)

- Leo

Robert O'Callahan

unread,

Mar 9, 2012, 7:12:50 PM3/9/12

to Leo Meyerovich, dev-...@lists.mozilla.org, Ras Bodik

Flattening the inline content is a good idea but it's no panacea.
Non-flattened element structure still affects layout, for example vertical
alignment and borders of spans. And trying to treat each breakable text
unit as a full-fledged independent box in your system will probably not
scale if applied naively ... for example you'd need an independently
laid-out box for each Kanji character or character in word-wrap:break-word
text.

Flattening gets even worse if you consider vertical breaking. Imagine
laying out a table and needing to place a soft page break inside the table.

Leo Meyerovich

unread,

Mar 9, 2012, 8:15:15 PM3/9/12

to rob...@ocallahan.org, dev-...@lists.mozilla.org, Ras Bodik

Cool examples :)

On Mar 9, 2012, at 4:12 PM, Robert O'Callahan wrote:

> Flattening the inline content is a good idea but it's no panacea. Non-flattened element structure still affects layout, for example vertical alignment and borders of spans.

Right, a concern is something like nested spans with custom padding and centered vertical text alignment:

a

b
 c

d

I think we can do the recursive (non-flattened) form, but that complicates parallelization as a simple top-down/bottom-up parallel tree traversals.

Instead, last year, I played with the idea of multiple passes (ignoring floats):

tree traversal 1: top down (non-flattend): pass margin/padding considerations to nested elements, accumulating their effects

tree traversal 2: top down (flattened):
local loop 1 (forwards): compute x position, line (but not exact y), line height (for last element in the line)
local loop 2 (reverse): pass line height from last word on line to preceding and determined word position in line.

Traversal 2 might need to be split into two traversals, but I'm not sure.

> And trying to treat each breakable text unit as a full-fledged independent box in your system will probably not scale if applied naively ... for example you'd need an independently laid-out box for each Kanji character or character in word-wrap:break-word text.
>

I agree :)

Ras and I thought that a contiguous sequence of characters should be treated with function calls -- we have an example somewhere in our repo that works like that. For example, you can ask for the min and pref width of a character sequence.

Treating contiguous spans of characters (letters, words) via function calls seems to be the way to go for parallelization, incrementalization (you don't need to incrementalize within the call), and constants (keeps memory overhead etc. low). Nested spans etc. seem appropriate territory within the spec, though.

> Flattening gets even worse if you consider vertical breaking. Imagine laying out a table and needing to place a soft page break inside the table.

I'm not really familiar with page break semantics (I've never designed for @media print :)). The challenge here may be to declaratively specify it at all, rather than optimizing it once it we can: a page break sounds like a fairly powerful non-local effect. I'm guessing that, if we can specify it, the correct parallelization scheme is to use speculation and iteratively/incrementally recover.

Btw, great examples, thank you!

- Leo