Quickset Render

0 views
Skip to first unread message

Jaunita Rousu

unread,
Aug 3, 2024, 1:49:00 PM8/3/24
to facbartsigu

Qt Quick 2 makes use of a dedicated scene graph that is then traversed and rendered via a graphics API such as OpenGL ES, OpenGL, Vulkan, Metal, or Direct 3D. Using a scene graph for graphics rather than the traditional imperative painting systems (QPainter and similar), means the scene to be rendered can be retained between frames and the complete set of primitives to render is known before rendering starts. This opens up for a number of optimizations, such as batch rendering to minimize state changes and discarding obscured primitives.

For example, say a user-interface contains a list of ten items where each item has a background color, an icon and a text. Using the traditional drawing techniques, this would result in 30 draw calls and a similar amount of state changes. A scene graph, on the other hand, could reorganize the primitives to render such that all backgrounds are drawn in one call, then all icons, then all the text, reducing the total amount of draw calls to only 3. Batching and state change reduction like this can greatly improve performance on some hardware.

The scene graph is closely tied to Qt Quick 2.0 and can not be used stand-alone. The scene graph is managed and rendered by the QQuickWindow class and custom Item types can add their graphical primitives into the scene graph through a call to QQuickItem::updatePaintNode().

The scene graph is a graphical representation of the Item scene, an independent structure that contains enough information to render all the items. Once it has been set up, it can be manipulated and rendered independently of the state of the items. On many platforms, the scene graph will even be rendered on a dedicated render thread while the GUI thread is preparing the next frame's state.

Note: Much of the information listed on this page is specific to the built-in, default behavior of the Qt Quick Scene graph. When using an alternative scene graph adaptation, such as, the software adaptation, not all concepts may apply. For more information about the different scene graph adaptations see Scene Graph Adaptations.

The scene graph is composed of a number of predefined node types, each serving a dedicated purpose. Although we refer to it as a scene graph, a more precise definition is node tree. The tree is built from QQuickItem types in the QML scene and internally the scene is then processed by a renderer which draws the scene. The nodes themselves do not contain any active drawing code nor virtual paint() function.

Even though the node tree is mostly built internally by the existing Qt Quick QML types, it is possible for users to also add complete subtrees with their own content, including subtrees that represent 3D models.

The most important node for users is the QSGGeometryNode. It is used to define custom graphics by defining its geometry and material. The geometry is defined using QSGGeometry and describes the shape or mesh of the graphical primitive. It can be a line, a rectangle, a polygon, many disconnected rectangles, or complex 3D mesh. The material defines how the pixels in this shape are filled.

Warning: It is crucial that native graphics (OpenGL, Vulkan, Metal, etc.) operations and interaction with the scene graph happens exclusively on the render thread, primarily during the updatePaintNode() call. The rule of thumb is to only use classes with the "QSG" prefix inside the QQuickItem::updatePaintNode() function.

Nodes have a virtual QSGNode::preprocess() function, which will be called before the scene graph is rendered. Node subclasses can set the flag QSGNode::UsePreprocess and override the QSGNode::preprocess() function to do final preparation of their node. For example, dividing a bezier curve into the correct level of detail for the current scale factor or updating a section of a texture.

Ownership of the nodes is either done explicitly by the creator or by the scene graph by setting the flag QSGNode::OwnedByParent. Assigning ownership to the scene graph is often preferable as it simplifies cleanup when the scene graph lives outside the GUI thread.

The material describes how the interior of a geometry in a QSGGeometryNode is filled. It encapsulates graphics shaders for the vertex and fragment stages of the graphics pipeline and provides ample flexibility in what can be achieved, though most of the Qt Quick items themselves only use very basic materials, such as solid color and texture fills.

The scene graph API is low-level and focuses on performance rather than convenience. Writing custom geometries and materials from scratch, even the most basic ones, requires a non-trivial amount of code. For this reason, the API includes a few convenience classes to make the most common custom nodes readily available.

The rendering of the scene graph happens internally in the QQuickWindow class, and there is no public API to access it. There are, however, a few places in the rendering pipeline where the user can attach application code. This can be used to add custom scene graph content or to insert arbitrary rendering commands by directly calling the graphics API (OpenGL, Vulkan, Metal, etc.) that is in use by the scene graph. The integration points are defined by the render loop.

There are two render loop variants available: basic, and threaded. basic is single-threaded, while threaded performs scene graph rendering on a dedicated thread. Qt attempts to choose a suitable loop based on the platform and possibly the graphics drivers in use. When this is not satisfactory, or for testing purposes, the environment variable QSG_RENDER_LOOP can be used to force the usage of a given loop. To verify which render loop is in use, enable the qt.scenegraph.general logging category.

On many configurations, the scene graph rendering will happen on a dedicated render thread. This is done to increase parallelism of multi-core processors and make better use of stall times such as waiting for a blocking swap buffer call. This offers significant performance improvements, but imposes certain restrictions on where and when interaction with the scene graph can happen.

The following is a simple outline of how a frame gets rendered with the threaded render loop and OpenGL. The steps are the same with other graphics APIs as well, apart from the OpenGL context specifics.

The threaded renderer is currently used by default on Windows with Direct3D 11 and with OpenGL when using opengl32.dll, Linux excluding Mesa llvmpipe, macOS with Metal, mobile platforms, and Embedded Linux with EGLFS, and with Vulkan regardless of the platform. All this may change in future releases. It is always possible to force use of the threaded renderer by setting QSG_RENDER_LOOP=threaded in the environment.

The non-threaded render loop is currently used by default on Windows with OpenGL when not using the system's standard opengl32.dll, macOS with OpenGL, and Linux with some drivers. For the latter this is mostly a precautionary measure, as not all combinations of OpenGL drivers and windowing systems have been tested.

On macOS and OpenGL, the threaded render loop is not supported when building with XCode 10 (10.14 SDK) or later, since this opts in to layer-backed views on macOS 10.14. You can build with Xcode 9 (10.13 SDK) to opt out of layer-backing, in which case the threaded render loop is available and used by default. There is no such restriction with Metal.

By default, a Qt Quick animation (such, as a NumberAnimation) is driven by the default animation driver. This relies on basic system timers, such as QObject::startTimer(). The timer typically runs with an interval of 16 milliseconds. While this will never be fully accurate and also depends on the accuracy of timers in the underlying platform, it has the benefit of being independent of the rendering. It provides uniform results regardless of the display refresh rate and if synchronization to the display's vertical sync is active or not. This is how animations work with the basic render loop.

In order to provide more accurate results with less stutter on-screen, independent of the render loop design (be it single threaded or multiple threads) a render loop may decide to install its own custom animation driver, and take the operation of advancing it into its own hands, without relying on timers.

This is what the threaded render loop implements. In fact, it installs not one, but two animation drivers: one on the gui thread (to drive regular animations, such as NumberAnimation), and one on the render thread (to drive render thread animations, i.e. the Animator types, such as OpacityAnimator or XAnimator). Both of these are advanced during the preparation of a frame, i.e. animations are now synchronized with rendering. This makes sense due to presentation being throttled to the display's vertical sync by the underlying graphics stack.

Therefore, in the diagram for the threaded render loop above, there is an explicit Advance animations step on both threads. For the render thread, this is trivial: as the thread is being throttled to vsync, advancing animations (for Animator types) in each frame as if 16.67 milliseconds had elapsed gives more accurate results than relying on a system timer. (when throttled to the vsync timing, which is 1000/60 milliseconds with a 60 Hz refresh rate, it is fair to assume that it has been approximately that long since the same operation was done for the previous frame)

The same approach works for animations on the gui (main) thread too: due to the essential synchronization of data between the gui and render threads, the gui thread is effectively throttled to the same rate as the render thread, while still having the benefit of having less work to do, leaving more headroom for the application logic since much of the rendering preparations are now offloaded to the render thread.

While the above examples used 60 frames per second, Qt Quick is prepared for other refresh rates as well: the rate is queried from the QScreen and the platform. For example, with a 144 Hz screen the interval is 6.94 ms. At the same time this is exactly what can cause trouble if vsync-based throttling is not functioning as expected, because if what the render loop thinks is happening is not matching reality, incorrect animation pacing will occur.

c80f0f1006
Reply all
Reply to author
Forward
0 new messages