MSAA performance drop with depth-stencil blit?


Eli Bogomolny

Feb 9, 2022, 4:58:40 PM
to WebGL Dev List
Hi all,

I've been working on adding MSAA for WebGL2 to CesiumJS, and a few of our framebuffers have depth-stencil attachments that get blitted when multisampling. We've noticed big performance drops (a loss of 50% or more in frames per second) on Windows 10 and 11, especially at bigger canvas sizes. This happens in Chrome, Edge, and Firefox alike, so it might not be a Direct3D issue. It looks like a Windows bug because we're not seeing the same performance drops on macOS or Linux, and our machines that dual boot Linux + Windows get framerate drops only in Windows.

I couldn't reproduce performance issues with other engines like Babylon.js with MSAA on, but I don't believe Babylon blits depth or stencil attachments. We were wondering if there are any known issues with depth/stencil blitting on Windows or any other details that are easy to miss.

In case it's helpful: 
- Online example running the MSAA branch that gets dropped frames on Windows.

Thanks!

Ken Russell

Feb 9, 2022, 7:35:34 PM
to WebGL Dev List, Rafael Cintron, Geoff Lang
Hi Eli,

This could still very well be a D3D-specific problem, because Chrome, Edge, and Firefox all use ANGLE's D3D11 backend for their WebGL rendering on Windows. You can confirm this in about:gpu in Chrome/Edge and about:support in Firefox.
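
As a rough programmatic complement to checking about:gpu, a page can read the unmasked renderer string through the `WEBGL_debug_renderer_info` extension. The sketch below is only a heuristic: the helper names are hypothetical, the extension may be unavailable or return a sanitized string in some browsers, and the assumption that ANGLE's D3D11 backend reports a string containing both "ANGLE" and "Direct3D11" is based on commonly observed output, not on a guaranteed format.

```javascript
// Hypothetical helper: returns the unmasked renderer string, or null when the
// WEBGL_debug_renderer_info extension is not exposed by the browser.
function getUnmaskedRenderer(gl) {
  const ext = gl.getExtension("WEBGL_debug_renderer_info");
  if (!ext) {
    return null;
  }
  return gl.getParameter(ext.UNMASKED_RENDERER_WEBGL);
}

// Heuristic (assumption): ANGLE on D3D11 typically reports something like
// "ANGLE (..., Direct3D11 vs_5_0 ps_5_0, ...)". The exact format is not guaranteed.
function looksLikeAngleD3D11(renderer) {
  return (
    typeof renderer === "string" &&
    /ANGLE/i.test(renderer) &&
    /Direct3D11/i.test(renderer)
  );
}
```

In a page this might be called as `looksLikeAngleD3D11(getUnmaskedRenderer(canvas.getContext("webgl2")))`; about:gpu and about:support remain the authoritative sources.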

Is there any way you can produce a reduced test case for this so we don't have to debug the entire Cesium app on the browser side? For example, https://github.com/KhronosGroup/WebGL/tree/main/sdk/tests/conformance2/rendering has some smaller blitFramebuffer tests that might help.

If you can, then I think anglebug.com would be the best place to report the issue. CC'ing a couple of colleagues specifically who might have comments from the D3D and ANGLE sides.

-Ken




Eli Bogomolny

Feb 14, 2022, 4:10:59 PM
to WebGL Dev List
Hi all, 

I made a small example based on the conformance2/rendering/ tests. It has an fps counter that shows the time per frame, and each frame performs one 1280x720 blit. On Windows we see a mostly constant frame rate with regular drops to <10fps when the blit bitmask is COLOR | STENCIL or STENCIL, but not when it's just COLOR. Happy to open a bug report on anglebug.com if this seems like it should go there.
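
The attached test page is not reproduced here. As a minimal sketch of the kind of per-frame timing described above, the hypothetical `FrameTimer` class below records the interval between frames and counts those slower than 10 fps (more than 100 ms). It measures frame-to-frame time as seen by the page, for instance from `requestAnimationFrame` timestamps, which is an assumption about how such a counter would be driven and not a direct measurement of GPU time.

```javascript
// Hypothetical frame-interval tracker; not the code from msaa-blit-fps.html.
class FrameTimer {
  constructor(slowThresholdMs = 100) {
    // 100 ms per frame corresponds to 10 fps.
    this.slowThresholdMs = slowThresholdMs;
    this.lastTimestamp = null;
    this.frameTimes = [];
  }

  // Call once per frame with a timestamp in milliseconds,
  // e.g. the argument passed to a requestAnimationFrame callback.
  tick(timestampMs) {
    if (this.lastTimestamp !== null) {
      this.frameTimes.push(timestampMs - this.lastTimestamp);
    }
    this.lastTimestamp = timestampMs;
  }

  // Number of recorded frames that took longer than the threshold.
  slowFrameCount() {
    return this.frameTimes.filter((dt) => dt > this.slowThresholdMs).length;
  }

  // Average frames per second over all recorded intervals (0 if none).
  averageFps() {
    if (this.frameTimes.length === 0) {
      return 0;
    }
    const totalMs = this.frameTimes.reduce((sum, dt) => sum + dt, 0);
    return (1000 * this.frameTimes.length) / totalMs;
  }
}
```

Running the same page once with a COLOR-only mask and once with STENCIL included, then comparing `slowFrameCount()`, would be one way to make the difference between the two cases visible.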

Eli

msaa-blit-fps.html

Ken Russell

Feb 14, 2022, 9:04:25 PM
to WebGL Dev List
Hi Eli,

Thanks for producing the test. Did you see Geoff Lang's reply to the list? It sounds like depth+stencil resolves are known to be slow on D3D because stencil resolves require a CPU readback. Does Cesium strictly need the stencil buffer to be resolved? Could you use a DEPTH_COMPONENT24 or DEPTH_COMPONENT32F renderbuffer attachment instead when you know you'll need to preserve the resolved depth results?
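
A minimal sketch of that suggestion, assuming a framebuffer is already bound to `gl.FRAMEBUFFER` and that the caller knows whether stencil is needed: the hypothetical helper below allocates a multisampled depth-only renderbuffer when stencil is not required and falls back to DEPTH24_STENCIL8 otherwise. The function name and return shape are illustrative and not taken from CesiumJS. Note that a depth blit requires the resolve target to use the same depth format as the source.

```javascript
// Hypothetical helper: attach a multisampled depth (or depth-stencil)
// renderbuffer to the framebuffer currently bound to gl.FRAMEBUFFER.
function createMsaaDepthRenderbuffer(gl, width, height, samples, needsStencil) {
  // Depth-only format avoids the stencil resolve entirely.
  const format = needsStencil ? gl.DEPTH24_STENCIL8 : gl.DEPTH_COMPONENT24;
  const attachment = needsStencil
    ? gl.DEPTH_STENCIL_ATTACHMENT
    : gl.DEPTH_ATTACHMENT;

  const renderbuffer = gl.createRenderbuffer();
  gl.bindRenderbuffer(gl.RENDERBUFFER, renderbuffer);
  gl.renderbufferStorageMultisample(
    gl.RENDERBUFFER,
    samples,
    format,
    width,
    height
  );
  gl.framebufferRenderbuffer(
    gl.FRAMEBUFFER,
    attachment,
    gl.RENDERBUFFER,
    renderbuffer
  );
  return { renderbuffer, format, attachment };
}
```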

-Ken



Eli Bogomolny

Feb 18, 2022, 10:54:31 AM
to WebGL Dev List
Hi Ken,

Thanks for the suggestions. I can't see Geoff Lang's reply, but we were able to work around the issue by leaving STENCIL_BUFFER_BIT out of the blit bitmask most of the time (some specific cases in CesiumJS require resolved stencil textures) while keeping the DEPTH24_STENCIL8 format.
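
The workaround can be sketched roughly as follows. This is not the CesiumJS code; `buildResolveMask` and `resolveMsaa` are hypothetical names, and the sketch assumes both framebuffers have matching DEPTH24_STENCIL8 attachments and the same dimensions. Stencil is included in the mask only when the caller explicitly asks for it.

```javascript
// Hypothetical helper: build the blitFramebuffer mask, leaving stencil out
// unless it is explicitly requested.
function buildResolveMask(gl, needsStencil) {
  let mask = gl.COLOR_BUFFER_BIT | gl.DEPTH_BUFFER_BIT;
  if (needsStencil) {
    mask |= gl.STENCIL_BUFFER_BIT;
  }
  return mask;
}

// Hypothetical helper: resolve a multisampled framebuffer into a
// single-sampled one of the same size.
function resolveMsaa(gl, msaaFramebuffer, resolveFramebuffer, width, height, needsStencil) {
  gl.bindFramebuffer(gl.READ_FRAMEBUFFER, msaaFramebuffer);
  gl.bindFramebuffer(gl.DRAW_FRAMEBUFFER, resolveFramebuffer);
  const mask = buildResolveMask(gl, needsStencil);
  // The filter must be NEAREST whenever the mask includes depth or stencil.
  gl.blitFramebuffer(
    0, 0, width, height,
    0, 0, width, height,
    mask,
    gl.NEAREST
  );
  return mask;
}
```

Passes that later read the resolved stencil would call `resolveMsaa(..., true)`; everything else would pass `false` and skip the stencil resolve.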

Eli

Ken Russell

Feb 22, 2022, 3:03:56 AM
to WebGL Dev List
Hi Eli,

Glad you were able to work around the issue.

After some searching, https://bugs.chromium.org/p/angleproject/issues/detail?id=1710 details the hidden restrictions and D3D11 documentation bugs which led to the current solution for GPU-accelerated depth resolves but CPU fallback for stencil resolves. The current D3D11 format table documentation:

indicates that none of the formats that might be used to implement WebGL 2.0's DEPTH24_STENCIL8 format support multisample resolves, and writing the stencil values in a fragment shader is not possible for reasons documented in the code:

-Ken



Geoff Lang

Feb 23, 2022, 7:58:26 PM
to Ken Russell, WebGL Dev List, Rafael Cintron
I can't say for sure what it is without a test case, but if you're trying to resolve a depth-stencil buffer I would expect it to be slow on D3D11. There is no way to resolve stencil data on the GPU, so we must do a CPU readback.

Geoff