Does anyone on the WebGL team have data on which method of blitting pixels is fastest across all platforms? These are all supposed to be VRAM-to-VRAM copies. I'll run more perf tests on the Mac, but thought someone here might already have some insight.
This would be a lot easier if WebGL had a mode that logged every GL (and WebGL) call it executes, with parameters, so we could mix those with our own event/marker logs and see whether we're ever on a slow path. Sadly we don't have RenderDoc on OSX, and even with it you have to send events/markers out to it to interleave the context properly. It would also help to be able to enable the GL debug extensions on Windows: right now an "INVALID_OPERATION from glSyncToken" somewhere inside WebGL is left as a giant mystery "aw snap" to all WebGL users.
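Until something like that exists in the browser, a rough userland approximation is to wrap the context in a Proxy that records every method call. This is only a sketch: `wrapWithCallLog`, `marker`, and `CallRecord` are names made up for illustration, and it only sees the WebGL calls the page makes, not the native GL calls the browser issues underneath.

```typescript
// Hypothetical helper types/names; nothing here is part of WebGL itself.
type CallRecord = { name: string; args: unknown[] };

// Wraps any object (e.g. a WebGL context) so that every method call is
// appended to `log` before being forwarded to the real object.
function wrapWithCallLog<T extends object>(target: T, log: CallRecord[]): T {
  return new Proxy(target, {
    get(obj, prop) {
      // Read from the real object so native getters see the right `this`.
      const value = Reflect.get(obj, prop, obj);
      if (typeof value !== "function") return value;
      return (...args: unknown[]) => {
        log.push({ name: String(prop), args });
        // Call with the real object as `this`; native methods reject a Proxy.
        return value.apply(obj, args);
      };
    },
  });
}

// Application markers go into the same log so they interleave with GL calls.
function marker(log: CallRecord[], label: string): void {
  log.push({ name: "marker", args: [label] });
}
```

In a page you would pass the result of `canvas.getContext("webgl2")` as `target` and use the returned proxy everywhere in place of the raw context.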
copyTexSubImage2D can go from FBO -> Texture, or FBO -> FBO (texture)
It can also handle depth-stencil formats, but various web searches turn up threads like "super slow on OSX".
It copies into the texture bound to the active texture unit, so you don't have to change the textureSet.
There were problems on Nvidia with IOSurface and small formats, which Chrome worked around by creating a temporary FBO.
How does this work in DX9/DX11 where you have to go through a staging texture or do slow readback to CPU?
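For reference, the copyTexSubImage2D path I'm timing looks roughly like the sketch below. `CopyGL` is a hand-written subset of the context interface (declared only so the snippet stands alone) and `copyFboToTexture` is a made-up name. It assumes the destination texture already has storage allocated at level 0.

```typescript
// Subset of the WebGL context used below, declared locally so the sketch
// compiles without DOM typings. A real context satisfies it structurally.
interface CopyGL {
  readonly FRAMEBUFFER: number;
  readonly TEXTURE_2D: number;
  bindFramebuffer(target: number, fbo: object | null): void;
  bindTexture(target: number, tex: object | null): void;
  copyTexSubImage2D(
    target: number, level: number,
    xoffset: number, yoffset: number,
    x: number, y: number, width: number, height: number,
  ): void;
}

// Hypothetical helper: copies a width x height region at (x, y) from the
// color attachment of `srcFbo` into the same region of `dstTex`.
function copyFboToTexture(
  gl: CopyGL, srcFbo: object | null, dstTex: object | null,
  x: number, y: number, width: number, height: number,
): void {
  // The source is whatever framebuffer is bound for reading. Binding
  // FRAMEBUFFER sets the read binding in both WebGL1 and WebGL2.
  gl.bindFramebuffer(gl.FRAMEBUFFER, srcFbo);
  // The destination is the texture bound on the currently active unit.
  gl.bindTexture(gl.TEXTURE_2D, dstTex);
  gl.copyTexSubImage2D(gl.TEXTURE_2D, 0, x, y, x, y, width, height);
}
```

Note that this leaves the framebuffer and texture bindings changed, so a real version would save and restore them or track them in the caller.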
The following methods require every texture to also be a render target/FBO. That's not ideal, but it may fit the DX model better.
blitFramebuffer -> FBO to FBO
This is only available in WebGL2. It honors the scissor test (inconsistently), whereas copyTexSubImage2D does not.
Closer to the DX model (CopySubresourceRegion).
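The blitFramebuffer variant, as a minimal sketch. `BlitGL` is again a hand-written subset of `WebGL2RenderingContext` and `blitFboToFbo` is a made-up name. It disables the scissor test up front so the scissor behavior stops being a variable, which is my choice for the perf test rather than anything the API requires.

```typescript
// Subset of WebGL2RenderingContext used below, declared locally so the
// sketch compiles without DOM typings.
interface BlitGL {
  readonly READ_FRAMEBUFFER: number;
  readonly DRAW_FRAMEBUFFER: number;
  readonly COLOR_BUFFER_BIT: number;
  readonly NEAREST: number;
  readonly SCISSOR_TEST: number;
  disable(cap: number): void;
  bindFramebuffer(target: number, fbo: object | null): void;
  blitFramebuffer(
    srcX0: number, srcY0: number, srcX1: number, srcY1: number,
    dstX0: number, dstY0: number, dstX1: number, dstY1: number,
    mask: number, filter: number,
  ): void;
}

// Hypothetical helper: 1:1 color copy of a width x height region from the
// origin of `srcFbo` to the origin of `dstFbo`.
function blitFboToFbo(
  gl: BlitGL, srcFbo: object | null, dstFbo: object | null,
  width: number, height: number,
): void {
  // Take scissor out of the picture; the caller must re-enable it if needed.
  gl.disable(gl.SCISSOR_TEST);
  gl.bindFramebuffer(gl.READ_FRAMEBUFFER, srcFbo);
  gl.bindFramebuffer(gl.DRAW_FRAMEBUFFER, dstFbo);
  // Same-size rectangles with NEAREST: a plain copy, no scaling or filtering.
  gl.blitFramebuffer(
    0, 0, width, height,
    0, 0, width, height,
    gl.COLOR_BUFFER_BIT, gl.NEAREST,
  );
}
```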
shader - FBO to FBO
Ultimate control, but you also have to switch out the shader, the blend/depth/raster states, and the textureSet.
Should work no matter what bugs are in the copy or blit code.
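And the shader fallback, sketched for WebGL2 only. `ShaderBlitGL` is a hand-written subset of the context interface, and `BLIT_VS`, `BLIT_FS`, and `shaderBlit` are made-up names. It assumes `program` was already compiled and linked from the two sources, and that the source texture is complete (e.g. a non-mipmap min filter is set). It shows how much state the copy has to touch, and a real version would save and restore all of it.

```typescript
// Fullscreen triangle from gl_VertexID, so no vertex buffer is needed.
const BLIT_VS = `#version 300 es
out vec2 uv;
void main() {
  vec2 p = vec2(float((gl_VertexID << 1) & 2), float(gl_VertexID & 2));
  uv = p;
  gl_Position = vec4(p * 2.0 - 1.0, 0.0, 1.0);
}`;

const BLIT_FS = `#version 300 es
precision highp float;
uniform sampler2D src;
in vec2 uv;
out vec4 color;
void main() { color = texture(src, uv); }`;

// Subset of WebGL2RenderingContext used below, declared locally so the
// sketch compiles without DOM typings.
interface ShaderBlitGL {
  readonly FRAMEBUFFER: number;
  readonly TEXTURE_2D: number;
  readonly TEXTURE0: number;
  readonly TRIANGLES: number;
  readonly BLEND: number;
  readonly DEPTH_TEST: number;
  readonly SCISSOR_TEST: number;
  readonly CULL_FACE: number;
  bindFramebuffer(target: number, fbo: object | null): void;
  viewport(x: number, y: number, width: number, height: number): void;
  disable(cap: number): void;
  useProgram(program: object | null): void;
  bindVertexArray(vao: object | null): void;
  activeTexture(unit: number): void;
  bindTexture(target: number, tex: object | null): void;
  getUniformLocation(program: object, name: string): object | null;
  uniform1i(location: object | null, value: number): void;
  drawArrays(mode: number, first: number, count: number): void;
}

// Hypothetical helper: draws `srcTex` over the whole of `dstFbo`.
function shaderBlit(
  gl: ShaderBlitGL, program: object, srcTex: object | null,
  dstFbo: object | null, width: number, height: number,
): void {
  gl.bindFramebuffer(gl.FRAMEBUFFER, dstFbo);
  gl.viewport(0, 0, width, height);
  // Turn off every fixed-function stage that could alter or discard pixels.
  gl.disable(gl.BLEND);
  gl.disable(gl.DEPTH_TEST);
  gl.disable(gl.SCISSOR_TEST);
  gl.disable(gl.CULL_FACE);
  gl.useProgram(program);
  gl.bindVertexArray(null); // default VAO: no vertex attributes are read
  gl.activeTexture(gl.TEXTURE0);
  gl.bindTexture(gl.TEXTURE_2D, srcTex);
  gl.uniform1i(gl.getUniformLocation(program, "src"), 0);
  gl.drawArrays(gl.TRIANGLES, 0, 3);
}
```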