Yes, we are compiling Skia, so I'm able to patch in your CL. (Thanks!)
Now we're looking to use Graphite (Vulkan), and Recorder::snap can be very slow. Below, you can see one instance taking 37 ms, with most of that time spent in the VulkanAMDMemoryAllocator. The Skia_DrawRegion calls seem slow, too. That's a trace in our own code, but the green blocks below are all SkCanvas::drawPath. Any suggestions on how to speed up or diagnose either issue?
Regarding the allocator, maybe there is a lot of contention because we're recording from several different threads? I'm looking into creating my own VulkanMemoryAllocator subclass, which would make it easier to test and verify that. (And I understand that using Skia's VulkanAMDMemoryAllocator is an option that is being phased out, so I might as well get started.)
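For reference, here is a rough sketch of the kind of timing wrapper I have in mind for testing the contention theory. It is only an illustration: AllocatorInterface, LockedAllocator, TimingAllocator, and hammer are hypothetical names made up for this post, and the two-method interface is a stand-in, not Skia's actual VulkanMemoryAllocator (which has a different, larger set of virtual methods). The idea is to forward every call to the real allocator while recording the call count, the total time spent, and how many threads were inside the allocator at once.

```cpp
#include <atomic>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in interface. NOT Skia's VulkanMemoryAllocator.
class AllocatorInterface {
public:
    virtual ~AllocatorInterface() = default;
    virtual std::uint64_t allocate(std::size_t bytes) = 0;
    virtual void free(std::uint64_t handle) = 0;
};

// Toy allocator that serializes every call on one mutex, so that
// multiple recording threads have something to contend for.
class LockedAllocator : public AllocatorInterface {
public:
    std::uint64_t allocate(std::size_t) override {
        std::lock_guard<std::mutex> lock(fMutex);
        return ++fNextHandle;
    }
    void free(std::uint64_t) override {
        std::lock_guard<std::mutex> lock(fMutex);
        ++fFreed;
    }

private:
    std::mutex fMutex;
    std::uint64_t fNextHandle = 0;
    std::uint64_t fFreed = 0;
};

// Wrapper that forwards to an inner allocator and records statistics.
class TimingAllocator : public AllocatorInterface {
public:
    explicit TimingAllocator(AllocatorInterface& inner) : fInner(inner) {}

    std::uint64_t allocate(std::size_t bytes) override {
        Scope scope(*this);
        return fInner.allocate(bytes);
    }
    void free(std::uint64_t handle) override {
        Scope scope(*this);
        fInner.free(handle);
    }

    std::uint64_t calls() const { return fCalls.load(); }
    std::uint64_t totalNanos() const { return fNanos.load(); }
    int peakConcurrency() const { return fPeak.load(); }

private:
    // RAII helper: tracks in-flight calls and elapsed time per call.
    struct Scope {
        explicit Scope(TimingAllocator& o)
                : owner(o), start(std::chrono::steady_clock::now()) {
            int now = owner.fInFlight.fetch_add(1) + 1;
            int peak = owner.fPeak.load();
            // Raise the recorded peak if this call set a new maximum.
            while (now > peak && !owner.fPeak.compare_exchange_weak(peak, now)) {
            }
        }
        ~Scope() {
            owner.fInFlight.fetch_sub(1);
            auto elapsed = std::chrono::steady_clock::now() - start;
            auto ns = std::chrono::duration_cast<std::chrono::nanoseconds>(elapsed);
            owner.fNanos.fetch_add(static_cast<std::uint64_t>(ns.count()));
            owner.fCalls.fetch_add(1);
        }
        TimingAllocator& owner;
        std::chrono::steady_clock::time_point start;
    };

    AllocatorInterface& fInner;
    std::atomic<std::uint64_t> fCalls{0};
    std::atomic<std::uint64_t> fNanos{0};
    std::atomic<int> fInFlight{0};
    std::atomic<int> fPeak{0};
};

// Drives the allocator from several threads, roughly like several recorders.
void hammer(AllocatorInterface& alloc, int threads, int allocsPerThread) {
    std::vector<std::thread> workers;
    for (int t = 0; t < threads; ++t) {
        workers.emplace_back([&alloc, allocsPerThread] {
            for (int i = 0; i < allocsPerThread; ++i) {
                alloc.free(alloc.allocate(4096));
            }
        });
    }
    for (auto& w : workers) {
        w.join();
    }
}
```

If totalNanos() divided by calls() grows sharply as the thread count goes up, and peakConcurrency() is regularly above 1, that would point toward contention rather than the allocations themselves being expensive. A real version would wrap whatever allocator is actually handed to Skia instead of the toy LockedAllocator.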