Hi blink-dev,
A little while ago I wrote about an experiment that broke style recalculation up into lots of incremental tasks and posted them back onto the main thread. A natural follow-on question is what happens when you post those tasks onto worker threads instead and recalculate styles in parallel, so I did some experimental hacking and got a prototype working well enough to run through some MotionMark benchmarks.
The results on the affected subset of MotionMark benchmarks (these are Animometer – Leaves, HTML Suite – CSS bouncing tagged images and HTML Suite – Leaves 2.0, which all heavily manipulate <img> tags) are encouraging given the limited tuning and optimization I've done so far:
On a powerful desktop system (without GPU raster):
FPS improved by up to 37%
Ramp@30FPS complexity increased by up to 39%
Ramp@60FPS complexity improved by up to 25%
I also tried it on several Arm Chromebooks:
On a Chromebook without GPU raster:
Ramp@30FPS complexity improved by roughly 8-16%
On a Chromebook with GPU raster:
Ramp@30FPS improved by 0-8% (GPU limited)
And also on a Pixel 2 Android device:
Animometer – Leaves, 30FPS complexity improved by 21%
Animometer – Leaves, at fixed complexity (702), frame-rate improved by 28%
There are some good reasons why this can't ship in its current state. I've only implemented support for <img> tags so far, and the browser crashes (a lot) on everything apart from MotionMark and Speedometer, which makes it difficult to assess how this might affect page loading etc. Performance on Speedometer was surprisingly similar (but Speedometer doesn't spend much time styling images). Most of the crashes appear to be due to memory ownership issues (e.g. objects created on a worker thread being cleaned up incorrectly), which I think are solvable with enough effort. There are also lots of test failures, and TSAN is often unhappy.
However, I think this offers an interesting preview of what the pros and cons of such an approach might be, how it might scale, and what the challenges could be with multi-threading parts of Blink. If you're interested in the idea, feel free to try out the patch, take a look at the full results so far, and ask me any questions here or via richard....@arm.com.
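To make the fan-out idea and the ownership problem a bit more concrete, here is a minimal standalone sketch in plain C++. It is not code from the patch and does not use any Blink APIs: `Element`, `Style`, `ComputeStyle` and `RecalcStylesParallel` are hypothetical stand-ins, and `std::async` is assumed in place of whatever task scheduler Blink would really use. The point it illustrates is that each worker hands owning pointers back through its future, so every object created on a worker ends up owned (and eventually destroyed) by the calling thread.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <future>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for the real Blink types.
struct Element { std::string tag; };
struct Style { std::string display; };

// A pure function of its input, so it can safely run off the main thread.
std::unique_ptr<Style> ComputeStyle(const Element& element) {
  auto style = std::make_unique<Style>();
  style->display = (element.tag == "img") ? "inline-block" : "block";
  return style;
}

// Fans out one task per chunk of elements, then joins on the calling thread.
// Results come back in the same order as the input elements.
std::vector<std::unique_ptr<Style>> RecalcStylesParallel(
    const std::vector<Element>& elements, std::size_t chunk_size) {
  assert(chunk_size > 0);
  std::vector<std::future<std::vector<std::unique_ptr<Style>>>> pending;
  for (std::size_t begin = 0; begin < elements.size(); begin += chunk_size) {
    const std::size_t end = std::min(begin + chunk_size, elements.size());
    // `elements` outlives the tasks because we join before returning.
    pending.push_back(std::async(std::launch::async, [&elements, begin, end] {
      std::vector<std::unique_ptr<Style>> chunk;
      chunk.reserve(end - begin);
      for (std::size_t i = begin; i < end; ++i)
        chunk.push_back(ComputeStyle(elements[i]));
      return chunk;
    }));
  }
  // Join: ownership of every Style moves to the caller here.
  std::vector<std::unique_ptr<Style>> styles;
  styles.reserve(elements.size());
  for (auto& task : pending) {
    for (auto& style : task.get())
      styles.push_back(std::move(style));
  }
  return styles;
}
```

Calling `RecalcStylesParallel(elements, 2)` on five elements would start three tasks and return five styles in input order. A real implementation would also have to deal with shared caches and allocator contention, which this sketch ignores.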
Thanks
Richard
--
You received this message because you are subscribed to the Google Groups "blink-dev" group.
To view this discussion on the web visit https://groups.google.com/a/chromium.org/d/msgid/blink-dev/1bddcb82-ee51-4cde-a80d-47b589e267f4%40chromium.org.
(Back from holiday). Interesting, so there are a few more things for the task list I think are necessary too:
- Migrate more of the style classes (ComputedStyle etc) to Oilpan to avoid the spinlock in the partition allocator
- Figure out the best way to get Oilpan working with fan-out multithreading (it might work by adopting newly-allocated objects back onto the main thread).
- There are currently things like the TaskSchedulerForegroundWorker thread used for V8 tasks. It would be good if Blink could use those threads too when it needs to multi-thread stuff, which may require some re-plumbing (I think those threads are currently owned by V8).
- Get the breadth-first selector logic working and well-debugged (foundational task).
If that sounds good, let me know. I guess the next stage would be a tracking bug and some sort of high-level design document?
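As a rough illustration of why a breadth-first ordering matters for fan-out, here is a small standalone sketch. It assumes "breadth-first" means visiting the tree one depth level at a time, which may not match the actual selector logic being discussed. `Node`, `BuildLevels` and `ResolveColors` are hypothetical names, and the inherited "color" is just a placeholder for any property a child takes from its parent. Nodes within one level only read results from the previous level, so each level is a batch that could in principle be handed to worker threads (the sketch itself stays single-threaded).

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical flattened tree: `parent` is an index into the node vector,
// or -1 for a root. An empty `own_color` means "inherit from the parent".
struct Node {
  int parent;
  std::string own_color;
};

// Groups node indices by depth, so level k depends only on level k - 1.
std::vector<std::vector<std::size_t>> BuildLevels(
    const std::vector<Node>& nodes) {
  std::vector<std::vector<std::size_t>> children(nodes.size());
  std::vector<std::size_t> frontier;
  for (std::size_t i = 0; i < nodes.size(); ++i) {
    if (nodes[i].parent < 0) {
      frontier.push_back(i);
    } else {
      assert(static_cast<std::size_t>(nodes[i].parent) < nodes.size());
      children[nodes[i].parent].push_back(i);
    }
  }
  std::vector<std::vector<std::size_t>> levels;
  while (!frontier.empty()) {
    std::vector<std::size_t> next;
    for (std::size_t index : frontier)
      next.insert(next.end(), children[index].begin(), children[index].end());
    levels.push_back(std::move(frontier));
    frontier = std::move(next);
  }
  return levels;
}

// Resolves the inherited value level by level. The inner loop is the part
// that could be split across workers, since its iterations are independent.
std::vector<std::string> ResolveColors(const std::vector<Node>& nodes) {
  std::vector<std::string> resolved(nodes.size());
  for (const auto& level : BuildLevels(nodes)) {
    for (std::size_t index : level) {
      const Node& node = nodes[index];
      if (!node.own_color.empty())
        resolved[index] = node.own_color;
      else if (node.parent >= 0)
        resolved[index] = resolved[node.parent];
      else
        resolved[index] = "black";  // Assumed initial value for a root.
    }
  }
  return resolved;
}
```

The design choice here is that the synchronization point sits between levels rather than between individual nodes, which keeps the per-node work free of locking.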
On Mon, Jul 30, 2018 at 2:09 PM <richard....@arm.com> wrote:
> (Back from holiday). Interesting, so there are a few more things for the task list I think are necessary too:
> - Migrate more of the style classes (ComputedStyle etc) to Oilpan to avoid the spinlock in the partition allocator
Moving ComputedStyle to Oilpan would affect the layout tree and the layout fragments in LayoutNG, as they reference ComputedStyle (currently kept alive with a ref counter). At some point it was decided not to move the layout structure to Oilpan, for performance reasons. LayoutNG objects were originally on Oilpan but were moved off it for the same reasons.