How is the division of work between snap() and insertRecording() in Graphite with Vulkan?


YUANPEI WU

Apr 22, 2024, 7:50:43 AM
to skia-discuss
Hi folks,
I am building a prototype that uses Graphite directly with Vulkan.
What exactly does recorder.snap() do? Does it create Vulkan command buffers directly, or does it only create Skia data structures? And what does insertRecording() do? Does it just submit the Vulkan commands?
In other words, how is the work divided between snap() and insertRecording() when using Graphite directly with Vulkan?
Thanks in advance!

Greg Daniel

Apr 22, 2024, 9:47:14 AM
to skia-d...@googlegroups.com
So there are three different commands in this sequence: snap, insertRecording, and submit.

Snap
When you snap a recording we do a lot of the work of preparing the resources needed for a draw. This can include creating VkPipelines, filling out/uploading data to VkBuffers, allocating needed VkImages/VkBuffers, etc. However, we don't yet create the VkCommandBuffer. Instead, all the recorded Skia canvas commands are written to a Graphite command buffer that is agnostic to the backend API. This stores commands very similar to what will eventually be sent to the VkCommandBuffer (e.g. copyTexture, bindPipeline, draw, etc.). The main reason we cannot go straight to backend command buffers (e.g. VkCommandBuffer) is that we don't know the order in which Recordings will be inserted into the Context, or how many times. So things like setting pipeline barriers have to be delayed.

insertRecording
When you call insertRecording we convert the Graphite command buffer abstraction to the backend-specific command buffer (e.g. VkCommandBuffer). This is meant to be a very cheap translation, as the majority of the CPU work and logic, including resource allocation, was already done during snap. Once a Recording is inserted into the Context we have a serial ordering of the commands to be sent to the GPU, so we can insert things like barriers correctly.

submit
Calling submit on the Context will send all the inserted Recordings on the Context to the GPU. This is essentially a call to vkQueueSubmit in Vulkan.
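
To make the three stages above concrete, here is a toy C++ model of the flow. It is only a sketch and uses neither Skia nor Vulkan: every name in it (AbstractCmd, ToyRecording, ToyRecorder, ToyContext) is made up for illustration, and the barrier rule (emit a barrier when a recording touches an image that an earlier insertion already wrote) is a simplifying assumption, not Graphite's actual logic.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Toy model only: none of these types are Skia or Vulkan API.

// Backend-agnostic command, standing in for what snap() records.
struct AbstractCmd {
    std::string op;   // e.g. "bindPipeline", "draw"
    int targetImage;  // hypothetical id of the image the command writes
};

// Result of snap(): commands are still abstract, no barriers yet.
struct ToyRecording {
    std::vector<AbstractCmd> cmds;
};

class ToyRecorder {
public:
    void record(std::string op, int targetImage) {
        assert(targetImage >= 0);
        pending_.push_back(AbstractCmd{std::move(op), targetImage});
    }

    // snap(): hand over the abstract commands and reset the recorder.
    ToyRecording snap() {
        ToyRecording rec{std::move(pending_)};
        pending_.clear();
        return rec;
    }

private:
    std::vector<AbstractCmd> pending_;
};

class ToyContext {
public:
    // insertRecording(): translate abstract commands into "backend"
    // commands. Only here is the global order known, so only here can
    // a barrier be placed against earlier insertions.
    void insertRecording(const ToyRecording& rec) {
        std::set<int> touched;
        for (const AbstractCmd& c : rec.cmds) touched.insert(c.targetImage);
        for (int image : touched) {
            if (written_.count(image) != 0) {
                backendCmds_.push_back("barrier(image " + std::to_string(image) + ")");
            }
        }
        for (const AbstractCmd& c : rec.cmds) {
            backendCmds_.push_back(c.op + "(image " + std::to_string(c.targetImage) + ")");
        }
        written_.insert(touched.begin(), touched.end());
    }

    // submit(): hand everything inserted so far to the "GPU" and
    // start a fresh backend command list.
    std::vector<std::string> submit() {
        std::vector<std::string> sent = std::move(backendCmds_);
        backendCmds_.clear();
        return sent;
    }

private:
    std::vector<std::string> backendCmds_;
    std::set<int> written_;  // images written by earlier insertions
};
```

In this model, snapping two recordings that each draw to image 1 and inserting them one after the other produces a draw, then a barrier, then the second draw. The barrier cannot be known at snap time because the recorder has no idea what will be inserted before it.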

Hope this helps. Let us know if you have any more questions.
Greg

--
You received this message because you are subscribed to the Google Groups "skia-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to skia-discuss...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/skia-discuss/066aeeef-aac5-4db2-a9b4-a4fc1f91e83fn%40googlegroups.com.

YUANPEI WU

Apr 23, 2024, 9:06:36 AM
to skia-discuss
Thank you very much! Your response helps a lot!

I am profiling the execution time of each step. As far as I know, I can have multiple worker threads, and each worker thread can have its own recorder on which to call snap(). However, all insertRecording() calls must be serialized on a single main thread, so I suspect insertRecording() will become a bottleneck.
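
The threading pattern described here can be sketched with plain standard-library C++. This is a toy model under stated assumptions, not Skia code: ToyRecording, recordAndSnap, and recordInParallelInsertSerially are hypothetical names, each worker's "recorder" is just a local vector, and the serial loop at the end stands in for the insertRecording() calls on the main thread.

```cpp
#include <cassert>
#include <thread>
#include <utility>
#include <vector>

// Toy stand-in for a snapped recording; not a Skia type.
struct ToyRecording {
    int workerId = 0;
    std::vector<int> cmds;
};

// Each worker builds its commands in its own local state and "snaps"
// them without touching anything shared.
ToyRecording recordAndSnap(int workerId, int numCmds) {
    std::vector<int> cmds;
    for (int i = 0; i < numCmds; ++i) {
        cmds.push_back(workerId * 1000 + i);
    }
    return ToyRecording{workerId, std::move(cmds)};
}

// Records on numWorkers threads, then inserts on the calling thread.
// Returns the flattened command stream in insertion order.
std::vector<int> recordInParallelInsertSerially(int numWorkers, int cmdsPerWorker) {
    std::vector<ToyRecording> recordings(numWorkers);
    std::vector<std::thread> workers;
    for (int w = 0; w < numWorkers; ++w) {
        // Each thread writes only its own slot, so no lock is needed.
        workers.emplace_back([&recordings, w, cmdsPerWorker] {
            recordings[w] = recordAndSnap(w, cmdsPerWorker);
        });
    }
    for (std::thread& t : workers) {
        t.join();
    }

    // Serial phase on one thread, modelling the insertRecording() calls.
    std::vector<int> inserted;
    for (const ToyRecording& r : recordings) {
        inserted.insert(inserted.end(), r.cmds.begin(), r.cmds.end());
    }
    return inserted;
}
```

The parallel phase scales with the number of workers, while the final loop is the part that stays serial, which is why its per-recording cost matters for the question above.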

I draw 75 * 75 rectangles like this:

    uint32_t color = 0xfffc6a03;
    for (int j = 0; j < 75; j++) {
        for (int i = 0; i < 75; i++) {
            SkPaint paint;
            // Option 1: the color changes from one rectangle to the next.
            // paint.setColor(color + 0xff000202 * i);
            // Option 2: the same color for every rectangle.
            // paint.setColor(color);
            SkRect rect = SkRect::MakeXYWH(i * 2, j * 2, 2, 2);
            std::shared_ptr<RectDrawCmd> rectCmd = std::make_shared<RectDrawCmd>(paint, rect);
            addDrawCmd(rectCmd);  // drawn, snapped, and inserted later
        }
    }

For option 2, where the color does not change, draw and snap take about 2 ms and insertRecording takes 0.06 ms, which looks good to me!
However, for option 1, where the color changes for every rectangle, draw and snap still take about 2 ms, but insertRecording takes about 100 ms. Why is that? What makes the difference between options 1 and 2? How should I write the code if I want ~5000 rectangles with different colors?
Because insertRecording calls need to be serialized, I don't want them to become a bottleneck.

Thank you in advance!

Greg Daniel

Apr 23, 2024, 9:38:52 AM
to skia-d...@googlegroups.com
Out of curiosity, what platform/device are you using?

But regardless, I believe this is caused by an issue in our Vulkan backend that should be getting fixed soon. Basically, when you set different paint colors they get uploaded in a uniform buffer. We can't currently batch draws together that need different uniform values. To make things worse, in our Vulkan backend we are currently using UNIFORM descriptor sets and not DYNAMIC_UNIFORM descriptor sets, just to get things up and running. That means we keep having to look up in our cache, make, and update descriptor sets for every draw right now. This is currently all happening in snap(). When we update to using DYNAMIC_UNIFORM this overhead should mostly go away and you'll see snap be significantly faster (though it will probably still be slightly slower than when it all batches together in one draw). This fix is planned to land in the next couple of weeks.
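
A rough way to see the scale of the effect is to count, for the 75 x 75 grid from the question, how many separate draws result when consecutive draws can only merge if their uniform values match. The sketch below is a toy cost model, not Skia or Vulkan code; makeGridColors, countBatches, DescriptorWork, and descriptorWork are hypothetical names. It assumes one descriptor set lookup/update per draw with plain uniform descriptors, and one reused set plus a per-draw offset with dynamic uniform descriptors, where the offset is supplied at bind time.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Colors for the 75 x 75 grid from the question. varyColor = true
// mirrors option 1, varyColor = false mirrors option 2.
std::vector<uint32_t> makeGridColors(bool varyColor) {
    const uint32_t base = 0xfffc6a03u;
    std::vector<uint32_t> colors;
    for (int j = 0; j < 75; ++j) {
        for (int i = 0; i < 75; ++i) {
            // Unsigned arithmetic wraps modulo 2^32, as in the original loop.
            colors.push_back(varyColor ? base + 0xff000202u * static_cast<uint32_t>(i)
                                       : base);
        }
    }
    return colors;
}

// Number of draws when consecutive draws merge only if their uniform
// value (here just the color) is identical.
std::size_t countBatches(const std::vector<uint32_t>& colors) {
    std::size_t batches = 0;
    for (std::size_t k = 0; k < colors.size(); ++k) {
        if (k == 0 || colors[k] != colors[k - 1]) {
            ++batches;
        }
    }
    return batches;
}

struct DescriptorWork {
    std::size_t setsTouched;     // descriptor sets looked up / updated
    std::size_t dynamicOffsets;  // offsets passed at bind time
};

// Assumed cost model: plain uniform descriptors need one set per batch
// because the buffer offset is part of the set; dynamic uniform
// descriptors reuse one set and pass one offset per batch.
DescriptorWork descriptorWork(std::size_t batches, bool dynamicUniform) {
    if (batches == 0) {
        return DescriptorWork{0, 0};
    }
    return dynamicUniform ? DescriptorWork{1, batches} : DescriptorWork{batches, 0};
}
```

Under this model option 2 collapses to a single batch, while option 1 yields 5625 batches, one per rectangle. With plain uniform descriptors that means 5625 descriptor set lookups or updates; with dynamic uniform descriptors it is one set and 5625 cheap offsets.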

The next fix that should also help things is adding support for storage buffers instead of uniform buffers. These will allow us to batch draws even across uniform value changes. Graphite already supports these at the higher levels, and they are used in our Metal backend. We just haven't added the Vulkan-specific support yet, but I don't believe it should be too much work to get them working.
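
The general idea behind batching across uniform changes with a storage buffer can be sketched as follows: all per-draw values live in one array, and each instance carries an index into it, so the binding never changes between draws. This is a generic illustration and not necessarily how Graphite lays out its data; PackedDraws, packIntoStorageBuffer, and colorForInstance are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy model of one batched draw backed by a single storage buffer.
struct PackedDraws {
    std::vector<uint32_t> storage;        // all distinct-in-sequence uniform values
    std::vector<uint32_t> instanceIndex;  // per-instance index into storage
};

// Pack per-draw colors into one array. Consecutive duplicates share a
// slot; every instance records which slot it reads.
PackedDraws packIntoStorageBuffer(const std::vector<uint32_t>& colors) {
    PackedDraws packed;
    for (std::size_t k = 0; k < colors.size(); ++k) {
        if (k == 0 || colors[k] != colors[k - 1]) {
            packed.storage.push_back(colors[k]);
        }
        packed.instanceIndex.push_back(
            static_cast<uint32_t>(packed.storage.size() - 1));
    }
    return packed;
}

// What a shader would do conceptually: read storage[index] for its instance.
uint32_t colorForInstance(const PackedDraws& packed, std::size_t instance) {
    assert(instance < packed.instanceIndex.size());
    return packed.storage[packed.instanceIndex[instance]];
}
```

Because every instance reads from the same buffer binding, a change in color no longer forces a new draw or a new descriptor set; only the per-instance index differs.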


YUANPEI WU

Apr 23, 2024, 10:18:18 AM
to skia-d...@googlegroups.com
Thanks a lot. I am using an Ubuntu 21 desktop with an i7-13700 and an RTX 3060 with the NVIDIA Vulkan driver.

