One of our primary goals for Blink in 2014 is to improve touch-driven interactions by making the engine run smoothly at 60 Hz [1]. This document explains an approach a number of us are pursuing to make that a reality.
At a high level, our approach is to retool the rendering engine to work in terms of animation frames. During each frame, the engine will compute style, layout, and compositing information exactly once. Additionally, we will minimize the information recomputed in each phase by minimizing the amount of information that is invalidated from the previous frame.
== Background (Slightly Apocryphal) ==
In a land before time, when dinosaurs roamed the earth, Blink (then called WebKit) rendered into a single backing store. When a script mutated the DOM, we recomputed style and layout information, and then invalidated portions of the backing store. After coalescing these invalidations, we repainted the invalidated regions and sent the backing store's bitmap to the browser process for display.
In this classical approach, Blink controlled the timing of the rendering process. When script mutated the DOM, we wanted to display the results of the mutation as soon as possible, but there was no external notion of "soon." We recalculated style and layout information asynchronously, which meant that "soon" was controlled by the style and layout timers, respectively.
At some point, we switched Blink's rendering model from this classical approach to accelerated compositing. Instead of rendering into a single backing store, we now render into a tree of backing stores, each called a GraphicsLayer. Instead of being blitted to the screen, the GraphicsLayers are composited on the GPU by cc, the Chromium Compositor. To update the GraphicsLayer tree, Blink commits new data to the compositor, which will appear on screen at the next vsync after cc activates the tree.
In this more modern view, the compositor controls the timing of the rendering process. The compositor watches the display's vsync signal and puts up a new composition of the GraphicsLayers for each vsync. Blink is throttled to producing commits only as quickly as the compositor can consume them, which maxes out at the frequency of vsync, typically 60 Hz.
== Running at 60 Hz ==
In the classical approach, Blink might recalculate style and layout information multiple times per commit, which is wasteful. For example, if a script mutates the DOM, calls requestAnimationFrame, and then mutates the DOM again inside the requestAnimationFrame callback, it's possible that the recalc style timer will fire in between the two DOM mutations, which means we'll recalc style twice (once when the style timer fires and again to generate the tree of GraphicsLayers to commit to the compositor). That both wastes power and can introduce jank because recalculating style twice might cause us to miss the deadline for the current frame.
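To make the double recalc concrete, here is a toy model of that sequence of events. It is only a sketch: the event names and the count_recalcs function are invented for illustration and do not correspond to anything in Blink.

```python
def count_recalcs(events, timer_driven):
    """Count style recalcs for one frame's worth of events.

    events is a list of strings in time order. "mutate" dirties style,
    "timer" is the style timer firing, and "commit" is the point where we
    must produce clean data for the compositor. All names are hypothetical.
    """
    dirty = False
    recalcs = 0
    for event in events:
        if event == "mutate":
            dirty = True
        elif event == "timer" and timer_driven and dirty:
            # Classical approach: the timer forces a recalc mid-frame.
            recalcs += 1
            dirty = False
        elif event == "commit" and dirty:
            # Either approach: style must be clean before we commit.
            recalcs += 1
            dirty = False
    return recalcs


# Script mutates the DOM, the style timer fires, the requestAnimationFrame
# callback mutates the DOM again, and then we commit.
EVENTS = ["mutate", "timer", "mutate", "commit"]
timer_count = count_recalcs(EVENTS, timer_driven=True)
frame_count = count_recalcs(EVENTS, timer_driven=False)
```

With the timer in play the model recalculates style twice for this frame; when the timer is ignored, it does so once.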
Instead, we should update Blink's rendering engine to account for the modern approach to composited graphics. Specifically, instead of using timers to generate internal notions of time, we should request animation frames from the compositor, and use that time signal to throttle our recalculation of style and layout information. This approach will help us recalculate style and layout information exactly once per frame.
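As a rough sketch of what "request animation frames from the compositor" means for scheduling, the model below coalesces any number of DOM mutations into one frame request and one recalculation of each kind. FakeCompositor and Engine are hypothetical stand-ins written for this message, not Blink or cc classes.

```python
class FakeCompositor:
    """Stand-in for the compositor: stores one pending frame callback."""

    def __init__(self):
        self.pending = None
        self.requests = 0

    def request_animation_frame(self, callback):
        self.requests += 1
        self.pending = callback

    def vsync(self):
        # Fire the pending callback, if any, exactly once.
        callback, self.pending = self.pending, None
        if callback is not None:
            callback()


class Engine:
    """Toy engine that recalculates style and layout only in begin_frame."""

    def __init__(self, compositor):
        self.compositor = compositor
        self.frame_requested = False
        self.needs_style = False
        self.needs_layout = False
        self.style_recalcs = 0
        self.layouts = 0

    def mutate_dom(self):
        # A mutation only marks state dirty and asks for a frame;
        # no timer is started and no work is done here.
        self.needs_style = True
        self.needs_layout = True
        if not self.frame_requested:
            self.frame_requested = True
            self.compositor.request_animation_frame(self.begin_frame)

    def begin_frame(self):
        self.frame_requested = False
        if self.needs_style:
            self.needs_style = False
            self.style_recalcs += 1
        if self.needs_layout:
            self.needs_layout = False
            self.layouts += 1


compositor = FakeCompositor()
engine = Engine(compositor)
for _ in range(3):
    engine.mutate_dom()
compositor.vsync()
```

Three mutations before the vsync produce a single frame request, one style recalc, and one layout in this model.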
We've already switched the style system from being driven off a timer to being driven off animation frames [2]. I'm working on switching the layout system over to using animation frames as well [3].
== A State Machine ==
Although we often refer to Blink as a rendering engine, we haven't been explicit about the state machine that drives the engine. Instead, we've had a collection of bools scattered among several classes that indicate whether, for example, there's a pending style recalculation or whether we're in the middle of computing layout.
As part of switching the engine to be driven off animation frames instead of timers, we're making this state machine explicit in the DocumentLifecycle class. Mutations from script will move the state machine backwards, for example invalidating layout or style information, causing the system to request an animation frame. Once the compositor signals Blink that it's time to put up a frame, we'll drive the state machine forward towards the "Clean" state, at which point we will have reified the consequences of those DOM mutations and can commit a new tree of GraphicsLayers.
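The shape of such a state machine might look roughly like the sketch below. The state names other than "Clean", the Lifecycle class, and its methods are assumptions made up for illustration; they are not the actual DocumentLifecycle interface.

```python
from enum import IntEnum


class State(IntEnum):
    # Hypothetical states, ordered from earliest phase to latest.
    NEEDS_STYLE = 0
    NEEDS_LAYOUT = 1
    NEEDS_COMPOSITING = 2
    CLEAN = 3


class Lifecycle:
    def __init__(self, request_frame):
        self.state = State.CLEAN
        self._request_frame = request_frame
        self.phases_run = []

    def invalidate(self, target):
        """Move the machine backwards to target if target is earlier."""
        was_clean = self.state == State.CLEAN
        self.state = min(self.state, target)
        if was_clean and self.state != State.CLEAN:
            # Only the first invalidation needs to request a frame.
            self._request_frame()

    def run_frame(self):
        """Drive the machine forward, one phase at a time, to CLEAN."""
        while self.state != State.CLEAN:
            self.phases_run.append(self.state.name)
            self.state = State(self.state + 1)


frame_requests = []
lifecycle = Lifecycle(lambda: frame_requests.append("frame"))
lifecycle.invalidate(State.NEEDS_LAYOUT)  # e.g. script changes a width
lifecycle.invalidate(State.NEEDS_STYLE)   # e.g. script changes a class
lifecycle.run_frame()
```

In this sketch, two invalidations request one frame, and running the frame visits each phase once, in order, before reaching CLEAN.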
The primary advantage of moving to an explicit state machine is that we can prevent earlier phases of the state machine (e.g., style or layout) from reading information that's updated in later phases of the state machine (e.g., compositing). Currently, we have a number of these "backwards" reads, which means we need to continually update compositing information when recalculating style and layout information.
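One way to picture how an explicit state machine catches a backwards read is an accessor that checks the current phase before handing out data. This is a minimal sketch under assumed names (Phase, GuardedDocument, BackwardsReadError); it is not how Blink implements the check.

```python
from enum import IntEnum


class Phase(IntEnum):
    # Hypothetical phases, ordered from earliest to latest.
    STYLE = 0
    LAYOUT = 1
    COMPOSITING = 2
    CLEAN = 3


class BackwardsReadError(AssertionError):
    """Raised when an early phase reads data owned by a later phase."""


class GuardedDocument:
    def __init__(self):
        self.phase = Phase.STYLE
        self._compositing_info = None

    def set_compositing_info(self, info):
        self._compositing_info = info

    def compositing_info(self):
        # Style and layout must not depend on compositing results.
        if self.phase < Phase.COMPOSITING:
            raise BackwardsReadError(
                "compositing info read during " + self.phase.name)
        return self._compositing_info


doc = GuardedDocument()
try:
    doc.compositing_info()
    caught = False
except BackwardsReadError:
    caught = True

doc.phase = Phase.COMPOSITING
doc.set_compositing_info({"layers": 3})
info = doc.compositing_info()
```

Once every such read goes through a guard like this, style and layout no longer need compositing information to be kept up to date while they run.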
Continually updating compositing information in this way is wasteful in the same way that recalculating style information more than once a frame is wasteful. In some common situations, we can end up updating the same compositing information multiple times for a single animation frame. If you'd like to help burn down these backwards reads, please consider fixing a bug on this list:
== Minimizing Invalidations ==
Even after removing redundant recalculation of style, layout, and compositing information, we can further reduce the amount of work the engine needs to do in order to commit the next frame by minimizing the amount of information that's invalidated by a given DOM mutation. The less information that's invalidated in a given frame, the less the engine needs to compute to drive the state machine to Clean and commit the frame.
The style and layout systems already have elaborate systems for minimizing invalidations, but there are still a number of areas for improvement (e.g., [4] and [5]). By contrast, the compositing update has only Document-level notions of invalidation. We can reduce the amount of work required to update compositing information by tracking invalidations at a finer granularity, perhaps on individual RenderLayers.
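The difference between Document-level and per-layer invalidation can be sketched with a toy updater. CompositingUpdater and its fine_grained flag are invented for this illustration, and the "work" is just a count of layers recomputed.

```python
class CompositingUpdater:
    """Toy model comparing coarse and fine-grained invalidation."""

    def __init__(self, layer_ids, fine_grained):
        self.layer_ids = list(layer_ids)
        self.fine_grained = fine_grained
        self.dirty = set()

    def invalidate(self, layer_id):
        if self.fine_grained:
            # Track only the layer that actually changed.
            self.dirty.add(layer_id)
        else:
            # Document-level invalidation: everything becomes dirty.
            self.dirty.update(self.layer_ids)

    def update(self):
        """Recompute dirty layers and return how many were recomputed."""
        count = len(self.dirty)
        self.dirty.clear()
        return count


coarse = CompositingUpdater(range(100), fine_grained=False)
coarse.invalidate(7)
coarse_work = coarse.update()

fine = CompositingUpdater(range(100), fine_grained=True)
fine.invalidate(7)
fine_work = fine.update()
```

For a document with 100 layers and one change, the coarse model recomputes all 100 layers while the fine-grained model recomputes one.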
One challenge with introducing fine-grained invalidations into the compositing update is that one of the key algorithms, the one that allocates RenderLayers to GraphicsLayers, depends on the spatial overlap between the RenderLayers. For example, if script translates one RenderLayer, we need to recompute which other RenderLayers overlap that RenderLayer, quickly leading to a complete invalidation of all compositing information.
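To see why overlap makes fine-grained invalidation hard, consider this small sketch with axis-aligned rectangles. The layer names, the (x, y, width, height) tuples, and the naive pairwise check are all assumptions for illustration; the real overlap testing in the engine is more involved.

```python
def overlaps(a, b):
    """Return True if rectangles a and b, given as (x, y, w, h), intersect."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return ax < bx + bw and bx < ax + aw and ay < by + bh and by < ay + ah


def overlapping_pairs(layers):
    """Return the set of (name, name) pairs whose rectangles intersect."""
    names = sorted(layers)
    return {
        (m, n)
        for i, m in enumerate(names)
        for n in names[i + 1:]
        if overlaps(layers[m], layers[n])
    }


def translate(rect, dx, dy):
    x, y, w, h = rect
    return (x + dx, y + dy, w, h)


layers = {
    "a": (0, 0, 100, 100),
    "b": (150, 0, 100, 100),
    "c": (300, 0, 100, 100),
}
before = overlapping_pairs(layers)

# Script translates only layer "a", yet the answer for "b" changes too.
layers["a"] = translate(layers["a"], 100, 0)
after = overlapping_pairs(layers)
```

Nothing overlaps before the translation, and afterwards "a" overlaps "b". Any decision about "b" that depended on overlap is now stale even though "b" itself never changed.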
To address this issue, we are replacing the current algorithm for allocating RenderLayers to GraphicsLayers with another algorithm, squashing [6], that does not depend on spatial overlap. This change will minimize the amount of compositing information that's invalidated when transforming a composited RenderLayer, which is a common operation to perform in response to touch input.
== Summary ==
To give web developers the tools they need to create high quality touch-driven interactions, we are retooling Blink's core rendering engine to run smoothly at 60 Hz. Specifically:
1) We are switching the style and layout subsystems to be driven by animation frames instead of by arbitrary timers, reducing synchronization issues between these subsystems and the compositor.
2) We are making the rendering engine's state machine explicit to be more disciplined about what information can be read in which state, removing the need to continually recompute compositing state.
3) We are minimizing the amount of retained state that's invalidated inside the engine in response to DOM mutations, reducing the total amount of work the engine needs to perform in order to compute the next animation frame.
Thanks for reading this (absurdly long) message. Happy hacking!
Adam