Inconsistency in WebGL Extension Disjoint Query Timers


Yassir Solomah

Nov 14, 2023, 1:17:22 PM
to WebGL Dev List
Hi!

I've been looking into measuring GPU performance, and I've been testing the query timers (https://registry.khronos.org/webgl/extensions/EXT_disjoint_timer_query_webgl2/) across a variety of devices. When I queue up enough work to slow the GPU down, I'm noticing that the timer results are only consistent on Apple Silicon. They are inconsistent on the Intel-based Mac and the Windows laptop I tested on.
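For anyone following along, this is roughly how the extension is read back. A minimal sketch (names like `pollTimerQuery` are my own, not from any post here): the result arrives asynchronously, and the spec says to discard any result whose interval overlaps a disjoint event (GPU clock change, power event, etc.), which is one possible source of inconsistency.

```javascript
// Hypothetical helper: poll a previously ended TIME_ELAPSED_EXT query.
// `gl` is a WebGL2 context, `ext` the EXT_disjoint_timer_query_webgl2
// extension object from gl.getExtension(...).
function pollTimerQuery(gl, ext, query) {
  // A disjoint event invalidates in-flight timer measurements.
  if (gl.getParameter(ext.GPU_DISJOINT_EXT)) {
    return { status: 'disjoint' };
  }
  // Results are asynchronous; check availability before reading.
  if (!gl.getQueryParameter(query, gl.QUERY_RESULT_AVAILABLE)) {
    return { status: 'pending' };
  }
  const ns = gl.getQueryParameter(query, gl.QUERY_RESULT); // nanoseconds
  return { status: 'ok', ms: ns / 1e6 };
}
```

Reading the result on the same frame the query was ended will almost always report "pending"; polling on a later frame is the usual pattern.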

Is this expected? Would love to know if anyone has had success with these!

Thank you,
Yassir Solomah

Yassir Solomah

Dec 13, 2023, 8:27:03 PM
to WebGL Dev List
Hi!

I just wanted to give some more information here on the issue we're seeing. The sample file I'm using to benchmark each GPU is here:


In the test, I measure the time between requestAnimationFrame (RAF) callbacks and use that to determine the user-perceived FPS (first two lines in the program). I also measure the CPU time spent in RAF (wall-clock end minus wall-clock start) (third line in the program). Finally, I measure the GPU time spent in RAF using a WebGL2 query timer every 10th frame and use that to determine a GPU-based FPS (fourth line in the program). The test has an option to add CPU-only spinning to simulate a heavy CPU load versus a heavy GPU load; to increase the GPU load, I simply issue more draws. Ideally, we want to be able to determine whether a user is bottlenecked on the CPU or the GPU using these timers.
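The measurement loop described above could look roughly like this. This is a sketch under stated assumptions, not the actual benchmark file: `gl`, `ext`, and `drawScene` are placeholders for a WebGL2 context, the timer-query extension object, and the per-frame rendering, none of which come from the original post.

```javascript
function fpsFromMs(ms) {
  return 1000 / ms; // frame time in milliseconds -> frames per second
}

function startMeasuring(gl, ext, drawScene) {
  let lastRafTs = null;
  let frameCount = 0;

  function frame(rafTs) {
    // 1) Perceived FPS: delta between consecutive RAF timestamps.
    if (lastRafTs !== null) {
      const perceivedFps = fpsFromMs(rafTs - lastRafTs);
      console.log('perceived FPS:', perceivedFps);
    }
    lastRafTs = rafTs;

    // 2) GPU time: wrap every 10th frame's draws in a timer query.
    const timeThisFrame = frameCount % 10 === 0;
    let query = null;
    if (timeThisFrame) {
      query = gl.createQuery();
      gl.beginQuery(ext.TIME_ELAPSED_EXT, query);
    }

    // 3) CPU time spent in RAF: wall-clock end minus wall-clock start.
    const cpuStart = performance.now();
    drawScene();
    const cpuMs = performance.now() - cpuStart;
    console.log('CPU ms in RAF:', cpuMs);

    if (timeThisFrame) {
      gl.endQuery(ext.TIME_ELAPSED_EXT);
      // The query result becomes available asynchronously; poll it on a
      // later frame rather than reading it here.
    }

    frameCount++;
    requestAnimationFrame(frame);
  }
  requestAnimationFrame(frame);
}
```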

Testing on my Apple-silicon based Mac, I can see that the GPU query timer is generally accurate (i.e., if the GPU is the bottleneck, I can see that perceived FPS ~= GPU query timer based FPS). Therefore, using the program above, I can bucket my experience into four different categories:
1) If neither CPU nor GPU are constrained, Perceived FPS ~= Refresh Rate (60FPS) AND WebGL2 FPS >> Perceived FPS. No Bottlenecks.
2) If the CPU is heavily constrained (50 ms sleep), we can see Perceived FPS < Refresh Rate AND WebGL2 FPS >> Perceived FPS. This implies the CPU is the bottleneck.
3) If the GPU is heavily constrained (lots of draws per frame), we can see Perceived FPS < Refresh Rate AND WebGL2 FPS ~= Perceived FPS AND FPS derived just from CPU time spent in RAF >> WebGL2 FPS. This implies the GPU is the bottleneck.
4) If the CPU and GPU are equally and heavily constrained, we can see Perceived FPS < Refresh Rate AND WebGL2 FPS ~= Perceived FPS AND FPS derived just from CPU time in RAF ~= WebGL2 FPS. This implies both GPU and CPU are equally bottlenecking. This is validated by checking the Chrome profiler and seeing that the GPU is almost fully utilized.
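The four buckets above reduce to a small decision function. A sketch of that classification, with hypothetical tolerances for "~=" and ">>" (the original post doesn't specify thresholds, so these would need tuning):

```javascript
// Assumed thresholds, not from the original post.
const NEAR = 0.15; // relative tolerance for "~="
const MUCH = 2.0;  // ratio required for ">>"

function near(a, b) {
  return Math.abs(a - b) <= NEAR * Math.max(a, b);
}

function muchGreater(a, b) {
  return a >= MUCH * b;
}

// perceivedFps: from RAF deltas; cpuFps: derived from CPU time in RAF;
// gpuFps: derived from the WebGL2 timer query; refreshRate: display Hz.
function classifyBottleneck(perceivedFps, cpuFps, gpuFps, refreshRate) {
  const atRefresh = perceivedFps >= refreshRate || near(perceivedFps, refreshRate);
  if (atRefresh && muchGreater(gpuFps, perceivedFps)) return 'none'; // bucket 1
  if (!atRefresh && muchGreater(gpuFps, perceivedFps)) return 'cpu'; // bucket 2
  if (!atRefresh && near(gpuFps, perceivedFps)) {
    return muchGreater(cpuFps, gpuFps) ? 'gpu' : 'both';            // buckets 3, 4
  }
  return 'unknown'; // e.g. the conflated CPU-blocking-on-GPU case below
}
```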

The only case that's not distinguishable is when the CPU issues too much GPU work and starts blocking on GPU calls, in which case both CPU and GPU appear to be bottlenecking. This could conflate which bucket a session falls into, so I'm ignoring it for now.

What I've noticed across several devices is that the two Intel-based Windows laptops and the one Intel-based Mac laptop I've tested on do not consistently report scenario 3 (and by extension, scenario 4).

What I'm curious about is whether my implementation is incorrect, or whether there's a known reason it would not work on Intel-based Windows/Mac computers. Open to any suggestions!

Thank you!
Yassir Solomah

Theo Armour

Dec 14, 2023, 2:23:15 AM
to webgl-d...@googlegroups.com
Hi Yassir

I have almost no idea about what you are talking about.

Anyway, here is a link that runs your code:




Yassir Solomah

Dec 14, 2023, 1:12:07 PM
to WebGL Dev List
Thank you for the link preview!

Haha, perhaps I can try to explain the objective a bit more. We're trying to see if we can use the time between requestAnimationFrame callbacks, the CPU time spent in requestAnimationFrame, and the GPU time spent in requestAnimationFrame to determine whether an application is bottlenecked on the GPU versus the CPU. I hypothesize that if we can accurately measure these numbers, we can put each user session into one of the four buckets described above.

The issue I'm facing is that when I use the GPU query timers, they over- or under-estimate on non-Apple-silicon devices when the GPU is clearly the bottleneck (in which case I would expect the time between requestAnimationFrame callbacks ~= the GPU time reported by the query timer).
