On 2/15/12 5:14 AM, James Ide wrote:
> Boris,
>
> Thanks for your investigation and comments.
>
>> Can that part that's not needed be stored in a single global string,
>> since it's always the same, and inserted via innerHTML if the user
>> actually clicks on the widget?
>
> This is a good suggestion, although we'll likely need to do general client-side rendering since the markup isn't completely the same across all of the widgets. They each have different IDs, text content (e.g. # of likes), etc....
OK. It had sounded like the markup for these was actually identical...
Something I don't quite understand, in the light of morning: why is
there any cost to styling stuff in these widgets? Are the "hidden until
the user clicks" bits not in a display:none subtree?
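For the per-widget markup discussed above, a rough sketch of rendering the hidden part from one shared string only when the user clicks. It assumes the widgets differ only in a few values such as ID and like count. The template, the `{{name}}` placeholder syntax, and the names `WIDGET_TEMPLATE`, `escapeHtml` and `renderWidget` are hypothetical, not taken from the actual site code:

```javascript
// Hypothetical shared template: one global string for the part of the widget
// that is only needed after the user clicks.
const WIDGET_TEMPLATE =
  '<div class="widgetDetails" id="{{id}}"><span class="likeCount">{{likes}}</span></div>';

// Escape a value before it is spliced into markup.
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Fill the placeholders with per-widget data; unknown placeholders become "".
function renderWidget(template, data) {
  return template.replace(/\{\{(\w+)\}\}/g, (match, key) =>
    Object.prototype.hasOwnProperty.call(data, key) ? escapeHtml(data[key]) : ""
  );
}

// In a page this would run only on click, for example:
//   container.innerHTML = renderWidget(WIDGET_TEMPLATE, { id: "w42", likes: 17 });
```

The per-widget data then costs a few bytes each, and no markup for the hidden part is parsed or styled until it is actually needed.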
> This is good to know. The images are wrapped in divs--these divs have fixed dimensions and have overflow hidden to crop the images, so it seems like the images shouldn't affect the layout of the page, nor the painting outside of the div.
Hmm. Yeah, in that case I'm not sure why I saw the flicker. But note
that even in the scenario you describe there would be a relayout pass
when the image data starts coming in (though the layout of the rest of
the page may not be affected, of course), which can possibly be avoided
if the image dimensions are given up front. Sadly, it looks like we
don't optimize away the relayout in that situation.
>> 15% running script off stylesheet onload events. The time here is
>> mostly spent getting scrollLeft and offsetHeight and setting innerHTML.
>
> This is part of our BigPipe system. Once the CSS associated with a subtree has been downloaded (we use feature detection for the load event--thanks!), we insert the HTML for that subtree and potentially start executing the JS associated with that subtree as well. For the timeline posts specifically, the JS inserts their HTML into an offscreen div, queries their offsetHeights to figure out which of the two columns to put them in (that's why they're in the main DOM tree and also why the offscreen div isn't display:none), and then moves them into the main center div, floating them left or right as previously decided.
Hmm. That's actually a bit of a problem in the sense that at least in
Gecko changes to the "float" property from "none" to "left" or "right"
trigger CSS box recreation. So you're ending up creating all the boxes,
laying them out, measuring them, then destroying them all and recreating
them and having to lay them all out again.
Of course changing from "position: absolute" to "position: static" also
has to recreate the CSS boxes.
I wonder what would happen if you did the offscreen thing by just
floating the boxes left to start with but setting a very large negative
margin-left. Then when you decide which side to float them on, either
change "float" to "right" or leave it as is and remove the large
negative margin. Worth trying, at least. That approach should cause a
lot less work for the browser in Gecko, at least.
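A minimal sketch of that idea, untested against the real page. The margin value, the function names, and the "put the post in the shorter column" rule are assumptions for illustration. The posts are expected to be DOM elements, but anything with `style` and `offsetHeight` works. Clearing the margin in both cases is this sketch's choice:

```javascript
// Assumed value; anything wide enough to push the box out of view would do.
const OFFSCREEN_MARGIN = "-10000px";

// Step 1: insert the posts already floated left, but pushed offscreen.
function stageOffscreen(posts) {
  for (const post of posts) {
    post.style.cssFloat = "left";
    post.style.marginLeft = OFFSCREEN_MARGIN;
  }
}

// Step 2: read every height first, then do all the style writes, so that
// layout queries and style changes are not interleaved.
function placeInColumns(posts) {
  const heights = posts.map((post) => post.offsetHeight); // reads only

  // Assumed rule: each post goes into whichever column is currently shorter.
  let leftHeight = 0;
  let rightHeight = 0;
  const sides = heights.map((height) => {
    if (leftHeight <= rightHeight) {
      leftHeight += height;
      return "left";
    }
    rightHeight += height;
    return "right";
  });

  // Writes only: "float" never passes through "none".
  posts.forEach((post, i) => {
    if (sides[i] === "right") post.style.cssFloat = "right";
    post.style.marginLeft = ""; // drop the large negative margin
  });
  return sides;
}
```

Usage would be `stageOffscreen(posts)` right after inserting the HTML, then `placeInColumns(posts)` once the stylesheet has loaded.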
>> How hard a requirement is the "insert some more globally applicable CSS" bit?
>
> Pretty hard. We could split the page into fewer pieces, therefore inserting CSS fewer times. It would be cool to have a stylesheet apply only to a given subtree but AFAIK that's only possible with iframes.
OK. At some point scoped stylesheet support will happen, but it's going
to be at least several months before it does. In the meantime, if
your stylesheets are really meant to be scoped, you may be able to
proactively add the "scoped" attribute to them so that, hopefully, things
will just work once it's supported.
> We've eliminated most, if not all, "*" key selectors. There are still a few generic pseudo-class selectors (e.g. ":hover") that are likely as bad
:hover per se is not quite as bad, since you only have to walk up the
tree if the element is in fact hovered.
> and a bunch that just match tag names as you suggested. Most of our application-specific selectors use class names, but we'll likely have a few site-wide ones that do match all span elements for instance.
OK. To be clear, matching for these sorts of selectors was on the order
of 10% of the overall CPU time. So if there's any way you can put
classes on the relevant spans and whatnot and match on those, that might
help a good bit on this page at least in Gecko.
> This is the BigPipe system, again. We send down <script> tags one at a time with HTTP chunked encoding, and each one contains a markup subtree plus references to the CSS and JS it needs. The pure JS execution you observed is likely the requests for CSS/JS being made.
OK, I just looked at where time is actually spent here. 2/3 of the time
for this pure script execution is just parsing and bytecode-compiling
the script. The rest is almost entirely the interpreter and jit
compiler; very little time is actually spent running jitcode (presumably
because the code really doesn't run long enough to benefit from jit
compilation).
How big are these scripts, exactly? Is there any way to move as much
functionality as possible out of them to a shared script library that's
only loaded once?
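To illustrate the kind of split meant here, a hypothetical sketch: the logic lives in a library that is parsed and compiled once, and each streamed chunk shrinks to one call carrying only data. The names `defineChunkType` and `handleChunk`, and the payload shape, are made up for this example and are not the BigPipe API:

```javascript
// Shared library, loaded once: holds the behaviour every chunk needs.
const chunkHandlers = new Map();

function defineChunkType(type, handler) {
  chunkHandlers.set(type, handler);
}

// Each streamed <script> would then contain a single call to this function,
// so there is very little per-chunk code to parse and compile.
function handleChunk(payload) {
  const handler = chunkHandlers.get(payload.type);
  if (!handler) {
    throw new Error("No handler registered for chunk type: " + payload.type);
  }
  return handler(payload);
}

// Example handler, registered once in the shared library. A real one would
// insert the markup and request the resources; this one just describes the work.
defineChunkType("timelinePost", (payload) => ({
  targetId: payload.id,
  html: payload.markup,
  resources: payload.css.concat(payload.js),
}));
```

A chunk would then look like `handleChunk({ type: "timelinePost", id: "post_1", markup: "<div></div>", css: ["a.css"], js: ["a.js"] })`.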
> Our understanding of the performance issues after using various profiling tools was that parts of the rendering process are the bottlenecks. We can look into batching up the queries to offsetHeight and related functions. Currently we have little insight into why various reflows/repaints are slow, whether it's beneficial to remove <link rel=stylesheet> elements that have unused CSS, how various CSS rules affect performance (e.g. negative margins, floating; box-shadows were particularly bad) especially when they're on the same page together, how to "silo off" parts of the page so that layout is constrained to a smaller subtree, etc... Browser tools are definitely improving but a lot of the internals are still opaque.
Yes, and changing.
The answers to your questions likely differ in different browsers...
And they can be hard to answer even for someone familiar with the
browser internals. :( We definitely need better tools here, if we can
figure out how to create them.
To answer your specific questions as much as I can:
1) The slowest dynamic changes in Gecko will be ones that require CSS
box reconstruction. There's no hard guideline for what these are, but
generally anything that changes the shape of the box tree (switches
things from being in-flow to out-of-flow or changes which boxes are
containing blocks for other boxes) will trigger a frame reconstruct. So
changes to "display", changes to "position" (though at some point we may
be able to optimize changes between "static" and "relative" better),
changes of float from "none" to other values or back, that sort of
thing. These will lead to Gecko destroying the old CSS boxes, redoing
selector matching on the relevant subtree, creating new CSS boxes,
having to lay them out again, etc.
2) Removing stylesheets that have "unused" CSS is likely beneficial if
the CSS makes use of descendant combinators. CSS that doesn't use those
is much cheaper. Removing a stylesheet that's already in the DOM is
somewhat expensive compared to not putting it there in the first place.
3) The performance of various CSS properties _really_ depends on the
structure of the surrounding markup, at least in Gecko... and going
forward may depend on the state of graphics hardware acceleration on the
platform (say box-shadow).
4) Siloing off parts of the page is very hard in Gecko without using
iframes. In particular, layout happens recursively from something
called "reflow roots", which in practice means roots of subframes or
text controls. We're working on being able to make absolutely
positioned elements reflow roots. Unfortunately, in general in-flow or
floated elements can affect each other, so constraining layout to one of
them in the browser is hard. In general we aim to set dirty bits on the
things that really changed and then redo layout of as little as we
can to deal with the part that's dirty. But some things (e.g. involving
floats) are very hard to optimize this way while retaining correct
behavior...
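As a rough aid for point 1, the rules of thumb listed there can be written down as a small check to run over planned style changes. This only restates that list; it is not Gecko's actual logic, the function name is hypothetical, and properties not mentioned in the list are assumed (possibly wrongly) not to reconstruct boxes:

```javascript
// Returns true if, per the rules of thumb in point 1, changing `property`
// from `oldValue` to `newValue` is likely to force CSS box reconstruction.
function likelyReconstructsBoxes(property, oldValue, newValue) {
  if (oldValue === newValue) return false;
  switch (property) {
    case "display":
    case "position":
      // Any change to these is listed as triggering a reconstruct.
      return true;
    case "float":
      // Only changes from "none" to another value, or back, are listed.
      return oldValue === "none" || newValue === "none";
    default:
      // Not covered by the list; assumed cheap in this sketch.
      return false;
  }
}
```

For example, `likelyReconstructsBoxes("float", "none", "left")` is true, while `likelyReconstructsBoxes("float", "left", "right")` is false.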
-Boris