Hi,
I was directed here by a Google employee, as this is a possible place to propose using a new GPU 2D renderer in Chrome. The renderer is still in development at:
https://github.com/01org/fastuidraw
I gave a talk at the X Developer Conference this year:
https://www.x.org/wiki/Events/XDC2016/Program/rogovin_fast_ui_draw/
The renderer still requires work to support all the features needed to be a complete solution; it also needs productization.
However, I would like to explore using the renderer in Chrome, possibly with Canvas as the first place to use it.
At this point in time I am hesitant to make a SKIA backend from the renderer; the main reason is related to clipping. The FastUIDraw renderer implements clipping almost entirely on the GPU with very, very little CPU load, even for icky cases like clipping against paths, rotated rectangles and so on. However, it only supports clipIn and clipOut (which is exactly what GraphicsContext needs), whereas SKIA right now tracks the clipping region on the CPU and implements other clip combine modes. In addition, the renderer is set up strongly to avoid CPU re-computation load, which is tricky to fit into the SKIA backend interface.
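To make the clipIn/clipOut distinction concrete, here is a minimal self-contained C++ sketch (hypothetical names, not FastUIDraw's or SKIA's actual API) of a clip model that supports only those two operations, evaluated over a tiny pixel grid:

```cpp
#include <array>
#include <functional>

// Hypothetical sketch: clip state over an 8x8 pixel grid.
// clipIn  = intersect the current clip with a region.
// clipOut = subtract a region from the current clip.
// These two operations are all that GraphicsContext (and canvas) need.
struct ClipState {
  static constexpr int N = 8;
  std::array<bool, N * N> inside;

  ClipState() { inside.fill(true); }

  // region(x, y) returns true for pixels covered by the clip geometry
  // (a path, a rotated rectangle, etc. in a real renderer).
  using Region = std::function<bool(int, int)>;

  void clipIn(const Region& r) {
    for (int y = 0; y < N; ++y)
      for (int x = 0; x < N; ++x)
        inside[y * N + x] = inside[y * N + x] && r(x, y);
  }

  void clipOut(const Region& r) {
    for (int y = 0; y < N; ++y)
      for (int x = 0; x < N; ++x)
        inside[y * N + x] = inside[y * N + x] && !r(x, y);
  }
};
```

For example, clipIn against the left half and then clipOut against the top-left quadrant leaves exactly the bottom-left quadrant; no other combine modes (union, xor, replace) are expressible, which is the gap relative to SKIA's clip-region tracking.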
I made a benchmark that does lots of rotation and lots of clipping, and ported it to different canvas-style toolkits (SKIA, Qt's QPainter and Cairo). For that load, FastUIDraw was several times faster (against SKIA's GL backend it was 5 times faster). The benchmark and its ports are in the git repo linked above, under the branch with_ports_of_painter-cells.
At any rate, I am hoping to explore getting FastUIDraw, or at least the ideas within it, into Chrome. I am also happy to change and tweak its interface to make it more usable by Chrome (I am right now considering changing the brush interface to match SKIA's, where shaders can be chained together instead of a fixed set of functionality).
Looking forward to collaboration.
Best Regards,
-Kevin Rogovin
Hi Rogovin,

I am the canvas team lead. A skia back-end really would be the path of least resistance for integrating a new renderer, because SkDevice is actually designed to be an abstraction layer. When blink was forked from WebKit a few years ago, we removed the graphics abstractions that existed in WebKit. Now, all the graphics code in blink is tightly coupled with skia. Going around skia is probably not very realistic at this point unless you are willing to invest tremendous time and effort.

If your library only supports clip-in and clip-out, that is good enough for blink's uses of skia, and it is definitely sufficient for canvas, so you could get away with implementing a skia back-end that does not support other combine modes. FYI, we recently removed uses of the replace op in blink.

To get started with canvas, you'd want to modify Canvas2DLayerBridge to make it instantiate an SkSurface that uses your new backend.
Hi,
Thank you very much for the very fast replies!
To answer the question about “the renderer is set up strongly to avoid CPU re-computation load as well, which is tricky to fit into the SKIA backend interface”: it is only tangentially related to clipping. The renderer strongly distinguishes between “what” to draw and “how” to draw. The “what” to draw would be a sequence of glyphs, a path, and so on; it is embodied essentially by attribute and index data. The “how” to draw is what brush, transformation and clipping are applied; it is embodied by a few numbers (clipping is more complicated to describe). The renderer makes use of an uber-shader, and the “how” is represented as numbers copied to a buffer read by the uber-shader (via a TBO, UBO or, later when I implement the last fallback, an SSBO).

One of the things I have noticed is that regenerating attribute and index data has a non-trivial CPU cost, and the goal of FastUIDraw is to reduce the CPU load in order to draw more and more stuff. There are issues that come up from this approach as well (namely the need to do FAST culling). It is possible to choose to regenerate that data at every draw, but I think that is inefficient. The interface for path rendering (be it stroking or filling) is such that the data is generated lazily on demand, fetched from the path to stroke or fill.
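A minimal C++ sketch of that “what vs. how” split, with entirely hypothetical names (this is not FastUIDraw's real interface): the attribute/index data is generated lazily, once, and cached on the path object, while the per-draw “how” is just a handful of numbers that are cheap to rewrite every frame.

```cpp
#include <memory>
#include <vector>

// The "what": attribute and index data for the GPU.
struct AttributeData {
  std::vector<float> attributes;
  std::vector<unsigned> indices;
};

// Hypothetical path object: tessellation happens on first demand only,
// and the result is cached, so drawing the same path many times (with
// different transformations) regenerates nothing on the CPU.
struct Path {
  int tessellation_count = 0;  // how many times we actually tessellated

  const AttributeData& fillData() {
    if (!fill_data_) {  // generate lazily, on first demand
      ++tessellation_count;
      fill_data_ = std::make_unique<AttributeData>();
      // ... a real renderer would triangulate the path here ...
      fill_data_->attributes = {0.f, 0.f, 1.f, 0.f, 0.f, 1.f};
      fill_data_->indices = {0, 1, 2};
    }
    return *fill_data_;
  }

 private:
  std::unique_ptr<AttributeData> fill_data_;
};

// The per-draw "how": a few numbers copied to a buffer the uber-shader
// reads (a TBO/UBO/SSBO in a real renderer); changing these per frame
// costs the CPU almost nothing.
struct DrawHow {
  float transform[6];     // 2D affine transformation
  unsigned brush_offset;  // where the brush data lives in the buffer
  unsigned clip_offset;   // where the clip data lives in the buffer
};
```

With this shape, drawing one path a thousand times under a thousand different transformations tessellates exactly once; only the small DrawHow records are rewritten.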
Clipping is more subtle: it is done via a combination of using the depth buffer to occlude together with hardware clip planes. This makes it so that the CPU has almost nothing to compute as the clipping changes, and (for most GPUs) clipping can even improve performance(!). I am happy to give more details on it.
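A small software model of the depth-buffer half of that scheme (hypothetical names; the real renderer combines this with hardware clip planes): a clip-out region is drawn as an occluder at a nearer depth value, and every subsequent draw is depth-tested against it, so the per-pixel clipping work is entirely on the GPU.

```cpp
#include <array>
#include <functional>

// Hypothetical model of clipping via the depth buffer: occluders are
// written at a nearer depth, and ordinary draws fail the depth test
// (i.e. are clipped) wherever an occluder sits in front of them.
struct DepthClipper {
  static constexpr int N = 8;
  std::array<float, N * N> depth;  // smaller = nearer; cleared to far
  std::array<int, N * N> color;    // which draw last touched the pixel

  DepthClipper() { depth.fill(1.0f); color.fill(0); }

  using Region = std::function<bool(int, int)>;

  // clipOut: draw the occluder region into the depth buffer at depth z.
  // The CPU only issues this one draw; no clip region is tracked.
  void clipOut(const Region& r, float z) {
    for (int y = 0; y < N; ++y)
      for (int x = 0; x < N; ++x)
        if (r(x, y) && z < depth[y * N + x]) depth[y * N + x] = z;
  }

  // Ordinary draw at depth z with a standard less-than depth test:
  // pixels behind an occluder fail the test, i.e. they are clipped.
  void draw(const Region& r, float z, int id) {
    for (int y = 0; y < N; ++y)
      for (int x = 0; x < N; ++x)
        if (r(x, y) && z < depth[y * N + x]) color[y * N + x] = id;
  }
};
```

This also hints at why clipping can improve performance: the early depth test rejects clipped pixels before the (expensive) uber-shader fragment work runs for them.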
What is a VC? Virtual Conference? If so, I am happy to do so, but can we do it next week? This week it is difficult to schedule anything reliably.
Best Regards,
-Kevin Rogovin