How is Vulkan CommandBuffer recording/flushing scheduled in ANGLE?


Lorenzo Rutayisire

Mar 1, 2026, 9:55:18 PM
to angleproject
Hello,
I am interested in learning how Vulkan CommandBuffer recording and flushing are implemented in ANGLE, e.g. for OpenGL and OpenCL.
In OpenGL and OpenCL, flushing commands to the GPU for execution is implicit, but it can also be triggered manually with glFlush() / clFlush().
In Vulkan, command recording and submission are explicit, and working out how many commands to record before submitting in order to get the best performance is a non-trivial problem to me.
Say you have 5 draw calls to dispatch. Vulkan would allow you to record all 5 draw calls in one CommandBuffer, put fine-grained barriers between the resources used, and issue one single submission. This would be the optimal way.
Alternatively, you could put each draw call into its own command buffer, enforce ordering through (timeline) semaphores, and submit each one separately.
There are possibly other approaches, hence my question: what's the execution model in the ANGLE Vulkan backend? Could anyone point me to the part of the source code I should look at?
Thanks, :)

Shahbaz Youssefi

Mar 1, 2026, 10:09:56 PM
to angleproject
Hi,

This is indeed a non-trivial problem. Assuming a simple case where you are recording commands and submitting to a single queue, there are two things to be balanced really:

* The number of submissions: vkQueueSubmit is not free. Too many submissions have a significant overhead
* Keeping the GPU fed: On the other hand, too few submissions have the downside that the GPU may be left idle while the CPU is accumulating lots of commands
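
To make the trade-off concrete, here is a minimal sketch of a recorder that batches commands and submits once a threshold is reached. It is only a toy model: `BatchingRecorder`, `maxPending`, and the fixed-threshold policy are hypothetical and are neither ANGLE's logic nor Vulkan API.

```cpp
#include <cstddef>
#include <vector>

// Toy model of batching commands into submissions.
// All names are hypothetical; this is not ANGLE or Vulkan code.
class BatchingRecorder {
public:
    explicit BatchingRecorder(std::size_t maxPending) : maxPending_(maxPending) {}

    // Record one command. Once the batch is full, submit automatically so the
    // (simulated) GPU is not left waiting while the CPU keeps accumulating work.
    void record() {
        ++pending_;
        if (pending_ >= maxPending_) {
            flush();
        }
    }

    // Explicit flush, analogous to glFlush() / clFlush(). Empty batches are
    // skipped because every submission carries a fixed cost.
    void flush() {
        if (pending_ == 0) {
            return;
        }
        batchSizes_.push_back(pending_);
        pending_ = 0;
    }

    // Size of each submission made so far, in order.
    const std::vector<std::size_t>& batchSizes() const { return batchSizes_; }

private:
    std::size_t maxPending_;
    std::size_t pending_ = 0;
    std::vector<std::size_t> batchSizes_;
};
```

A small `maxPending` keeps the GPU fed but pays the submission cost often; a large one does the opposite. A real policy would likely depend on more than a command count.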

While the situation is particularly complicated for ANGLE (as an OpenGL layer with no knowledge of the future commands during recording), for normal Vulkan applications it's actually not that bad. Most games for example should work perfectly fine with a single submission per frame, where the GPU is executing commands for one frame while the CPU is recording commands for the next. As a rule of thumb, you wouldn't want to make more than 2 or 3 submissions per frame, but really one should typically be enough.

One important thing to bear in mind is that splitting render passes is extremely expensive, especially on mobile devices (TBR architectures in particular), so your example of splitting the 5 draw calls in 5 submissions is doubly bad because it'll also use 5 render passes instead of 1.
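
As a rough illustration of why that is doubly bad, the sketch below counts submissions and render pass instances for a run of draws to the same framebuffer. It assumes a render pass instance cannot span submissions, so every submission that contains draws opens its own pass. `FrameStats` and `simulate` are hypothetical names for this toy model, not Vulkan or ANGLE API.

```cpp
#include <cstddef>

// Counters for one simulated run of draw calls.
struct FrameStats {
    std::size_t submissions = 0;
    std::size_t renderPasses = 0;
};

// Hypothetical model: `draws` draw calls target the same framebuffer and a
// submission is made after every `drawsPerSubmit` draws. Because a render
// pass instance has to end before its commands are submitted, each
// submission that contains draws needs its own render pass instance.
inline FrameStats simulate(std::size_t draws, std::size_t drawsPerSubmit) {
    FrameStats stats;
    if (drawsPerSubmit == 0) {
        return stats;  // Nothing meaningful to model.
    }
    std::size_t drawsInPass = 0;
    for (std::size_t i = 0; i < draws; ++i) {
        if (drawsInPass == 0) {
            ++stats.renderPasses;  // Begin a new render pass instance.
        }
        ++drawsInPass;
        if (drawsInPass == drawsPerSubmit) {
            ++stats.submissions;  // End the pass and submit.
            drawsInPass = 0;
        }
    }
    if (drawsInPass > 0) {
        ++stats.submissions;  // Submit the trailing partial batch.
    }
    return stats;
}
```

With this model, `simulate(5, 5)` gives one submission and one render pass, while `simulate(5, 1)` gives five of each.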

In terms of code pointers, the logic is spread around ContextVk.cpp in various places, like here (submit if app switches to another FBO), here (submit because of a sync object), here (submit because of a query), and more. If you're thinking of getting inspired for your Vulkan app, I would again warn that the situation for ANGLE is more complicated than for a typical Vulkan app, so I wouldn't try to mimic ANGLE.

Cheers

Lorenzo Rutayisire

Mar 2, 2026, 10:48:26 AM
to angleproject

Hi! Thanks for the prompt and complete answer. I will find time to look into the code carefully. I am not trying to mimic ANGLE as I know the framework complexity,
but I really believe I'm solving a problem which other people working on low-level graphics have already faced.

> One important thing to bear in mind is that splitting render passes is extremely expensive, especially on mobile devices (TBR architectures in particular), so your example of splitting the 5 draw calls in 5 submissions is doubly bad because it'll also use 5 render passes instead of 1.

My workload is mainly compute. I'm not doing graphics here, which is why I'm not using Vulkan the "standard way" with e.g. frames in flight. I do have a part that does graphics, but it uses dynamic rendering; as far as I understand, render passes are an old thing and dynamic rendering is widespread on modern devices (though maybe render passes are still more performant!).

Shahbaz Youssefi

Mar 2, 2026, 10:59:27 PM
to angleproject
On Monday, March 2, 2026 at 10:48:26 AM UTC-5 loryr...@gmail.com wrote:

> Hi! Thanks for the prompt and complete answer. I will find time to look into the code carefully. I am not trying to mimic ANGLE as I know the framework complexity,
> but I really believe I'm solving a problem which other people working on low-level graphics have already faced.

Good luck!
 

>> One important thing to bear in mind is that splitting render passes is extremely expensive, especially on mobile devices (TBR architectures in particular), so your example of splitting the 5 draw calls in 5 submissions is doubly bad because it'll also use 5 render passes instead of 1.

> My workload is mainly compute. I'm not doing graphics here, which is why I'm not using Vulkan the "standard way" with e.g. frames in flight. I do have a part that does graphics, but it uses dynamic rendering; as far as I understand, render passes are an old thing and dynamic rendering is widespread on modern devices (though maybe render passes are still more performant!).

Dynamic rendering still creates render passes. It doesn't use legacy render pass _objects_, but it still defines render passes. The need to minimize the number of render passes is unchanged; dynamic rendering just provides a more ergonomic API.