WebGL shader compile flow severely cripples WebGL and WebVR

Alan Wolfe

Oct 15, 2016, 4:11:43 PM
to WebGL Dev List
Hi Guys,

Just to preface where I'm coming from, my name is Alan Wolfe and I am a professional game developer with 15 years of experience.  I'm currently a senior graphics and engine programmer working at Blizzard on StarCraft II and Heroes of the Storm.  But, of course, this message reflects my own thoughts and opinions and is not the opinion of my employer (yadda yadda) (:

I'm also a huge fan of Shadertoy and have a bunch of shaders up there under the username "demofox" (https://www.shadertoy.com/results?query=demofox), and have a graphics/gamedev blog at http://blog.demofox.org.

Anyhow, I wanted to talk about shader compiles in webgl and see if the problems I want to point out are being worked on or not.

The core problem is that there is no way to know when a shader is finished being compiled.

The common pattern for dealing with shader compiles is to compile the shader and then check its status for errors right away.  That has the known issue that the status check will block until the shader has finished compiling.

If that blocking time is too long, browsers will "time out" and say that webgl has crashed, when it hasn't, but is just doing a long compile.  You can't blame the browsers for doing this, as they are essentially faced with the halting problem :P

A good workaround to this problem is to do "something else" before checking the shader's status, as mentioned in this link: http://toji.github.io/shader-perf/

Using that solution, you can do some other work, and then later check the shader compile status for errors, at which point, hopefully the shader you are working with is finished compiling, else it will block until it's finished.
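As a rough sketch of the deferred-check workaround described above: submit every shader first, do unrelated loading work, and only then query `COMPILE_STATUS`. The names `GLCompile`, `ShaderJob`, `startCompiles`, and `collectCompileErrors` are illustrative and not part of WebGL; the `gl` methods they call are the standard ones. The late query can still block if a compile has not finished, which is exactly the limitation under discussion.

```typescript
// Minimal structural type covering only the WebGL calls used below, so the
// sketch type-checks outside a browser. A real WebGLRenderingContext fits it.
interface GLCompile<S> {
  readonly COMPILE_STATUS: number;
  createShader(type: number): S | null;
  shaderSource(shader: S, source: string): void;
  compileShader(shader: S): void;
  getShaderParameter(shader: S, pname: number): unknown;
  getShaderInfoLog(shader: S): string | null;
}

// Illustrative job description: shader type (e.g. gl.VERTEX_SHADER) plus source.
interface ShaderJob {
  type: number;
  source: string;
}

// Phase 1: submit every shader without querying anything, so the
// implementation is free to compile in the background.
function startCompiles<S>(gl: GLCompile<S>, jobs: ShaderJob[]): S[] {
  return jobs.map((job) => {
    const shader = gl.createShader(job.type);
    if (shader === null) {
      throw new Error("createShader failed");
    }
    gl.shaderSource(shader, job.source);
    gl.compileShader(shader);
    return shader;
  });
}

// Phase 2: call this as late as possible. Each status query may still block
// if that particular compile has not finished yet.
function collectCompileErrors<S>(gl: GLCompile<S>, shaders: S[]): string[] {
  const errors: string[] = [];
  for (const shader of shaders) {
    if (!gl.getShaderParameter(shader, gl.COMPILE_STATUS)) {
      errors.push(gl.getShaderInfoLog(shader) || "unknown compile error");
    }
  }
  return errors;
}
```

In a page, one would call `startCompiles` at startup, load textures and meshes, and call `collectCompileErrors` just before the first frame that needs the shaders.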

That helps the problem, but unfortunately not in any meaningful way for serious applications.

The issue is that we can do "other work" or even delay checking the status of the shader compile / using the shader, but we don't know how long we need to wait until we can do that!

That means there is no way to eliminate these false positive "webgl has crashed" issues.  You could put in extra delays before rendering, but then you are just making longer loading times than necessary for people with faster machines, while still not eliminating the problem for lower end machines.  This isn't just perceived instability, but ACTUAL instability, as users cannot use the programs if the shader compiles take too long.

Under these circumstances, any serious application using webgl really can't be made, other than by sticking to only very simple graphics and having a very low number of shaders.

What we really need is some callback mechanism to know when a shader is done being processed.  This would let me kick off my 50 shader compiles or whatever, and show a progress bar that increments 1/50th of the way whenever a shader is reported as finished.

At the end of the progress bar, assuming there were no errors, I would then know it's safe to use any of the shaders I compiled without possibly hitting a "webgl crashing" browser stall.

Is this something already planned to be "fixed" or addressed in some way?

My colleagues and I would *LOVE* to use our professional gamedev experience to make some real WebGL games and hopefully help grow the platform, but in its current state it really isn't even usable for any serious applications, due to this one very minor, and seemingly easily fixed, problem.  A lot of people are also very interested in WebVR, but unfortunately it has this same issue, so it is a non-starter.

Please also let me know if there's somewhere else I ought to bring this up!

Thanks for your time,
Alan

Alan Wolfe

Oct 15, 2016, 5:03:58 PM
to WebGL Dev List
I forgot to mention - this also manifests itself in sandbox webgl pages, like shadertoy.com.

If you make a shader that's too complex, it says "webgl has crashed" on people with machines that take longer than their browser's timeout period to compile the shader.

The makers of shadertoy actually have to police things a bit as a result, telling people who make long compiling shaders how to simplify their shader, or asking them to not make it public.  This is just so shadertoy.com isn't filled with a sea of shaders that make people's browsers report a crash.

Imagine what those people could be doing for the webgl community if they didn't have to spend time combing the shaders for long shader compile times?

Also, imagine the amazing things people could be making if shader compilation times didn't have to serve the lowest common denominator.

Runtime perf sort of does have to watch out for lowest common denominator obviously, but the two aren't always related.

Corentin Wallez

Oct 15, 2016, 5:58:37 PM
to webgl-d...@googlegroups.com
Hey Alan,

Shadertoy is pushing the limit of what shaders can do, and I don't think a game's shaders would cause such slow compiles. OpenGL doesn't expose an API to know asynchronously when the shader is finished compiling, so the best browsers can do is draw with the shader, insert a Sync in the command stream, and then periodically poll it. Sync objects are available in WebGL 2, so you might want to try that. (An alternative is for browsers to use program binaries created from another context, but that's too complex.)

- Corentin
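A minimal sketch of the fence-polling idea Corentin describes, assuming a WebGL 2 context and assuming a warm-up draw with the new program has already been issued. `GLSync`, `whenGpuIdle`, and the `schedule` parameter are illustrative names; `fenceSync`, `getSyncParameter`, `deleteSync`, and `flush` are the standard WebGL 2 calls. Whether the warm-up draw itself returns without blocking depends on the implementation, so this may reduce stalls rather than remove them.

```typescript
// Minimal structural type for the WebGL 2 sync calls used below.
// A real WebGL2RenderingContext fits it.
interface GLSync<Y> {
  readonly SYNC_GPU_COMMANDS_COMPLETE: number;
  readonly SYNC_STATUS: number;
  readonly SIGNALED: number;
  fenceSync(condition: number, flags: number): Y | null;
  getSyncParameter(sync: Y, pname: number): unknown;
  deleteSync(sync: Y | null): void;
  flush(): void;
}

// Inserts a fence after the commands issued so far and resolves once the
// fence is signaled. It polls once per scheduled tick and never spins.
function whenGpuIdle<Y>(
  gl: GLSync<Y>,
  schedule: (cb: () => void) => void, // e.g. (cb) => requestAnimationFrame(cb)
): Promise<void> {
  return new Promise<void>((resolve, reject) => {
    const sync = gl.fenceSync(gl.SYNC_GPU_COMMANDS_COMPLETE, 0);
    if (sync === null) {
      reject(new Error("fenceSync failed"));
      return;
    }
    gl.flush(); // make sure the fence is actually submitted
    const poll = (): void => {
      if (gl.getSyncParameter(sync, gl.SYNC_STATUS) === gl.SIGNALED) {
        gl.deleteSync(sync);
        resolve();
      } else {
        schedule(poll); // not done yet, look again on the next tick
      }
    };
    schedule(poll);
  });
}
```

Polling from a scheduled callback rather than a tight loop matters here, because the page has to yield to the browser between checks for the fence status to advance.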

--
You received this message because you are subscribed to the Google Groups "WebGL Dev List" group.
To unsubscribe from this group and stop receiving emails from it, send an email to webgl-dev-list+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Alan Wolfe

Oct 15, 2016, 6:13:52 PM
to WebGL Dev List
Hey Corentin,

I haven't used opengl since fixed function days, are you saying that this is a fundamental problem with opengl, and not with the webgl layer?

If so, that is disheartening!

I personally do think a game's shaders could cause compiles that slow, and even for those that don't, a real game is going to have many shaders, not just a single (or very few) pixel shaders.

You have to admit that someone having a slow machine (or the machine being under other loads) is no reason to treat a "WebGL has crashed" message as acceptable when doing a shader compile.

As it is, I really cannot rely on webgl to make a serious game, knowing that some portion of my player base is going to randomly "crash" when they try to play it.

The sync render fence idea sounds promising.  It really would be nice if webgl handled that behind the scenes without me needing to worry about it though.

It basically puts the "onus" on every person working with webgl to do the right thing, instead of the api / platform doing the right thing inherently.  Every webgl program will then either be wrong (to varying degrees, depending on how right they get the "right" solution, if they even try for it), or will be more complex.

Anyways, thanks for the reply and the info!  I'll check out webgl2 and will be impatiently waiting for it to hit the mainstream (;


Zhenyao Mo

Oct 18, 2016, 2:28:13 PM
to webgl-d...@googlegroups.com
The general design philosophy is to compile/link your shaders as early
as possible, and use them as late as possible.

That said, I feel adding a non-blocking getShaderInfo(COMPILE_STATUS)
and getProgramInfo(LINK_STATUS) could be beneficial - instead of
returning TRUE/FALSE, we can return a third state of PENDING.
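To make the proposal concrete, here is a sketch of how a loader might consume such a three-state result. Everything in it is hypothetical: WebGL as discussed in this thread has no non-blocking status query, and `CompileState`, `compileProgress`, and the `queryState` callback are invented for illustration (the existing blocking queries are `getShaderParameter` and `getProgramParameter`).

```typescript
// HYPOTHETICAL three-state result of the proposed non-blocking query.
type CompileState = "pending" | "ok" | "failed";

interface ProgressSnapshot<S> {
  fraction: number; // 0..1, suitable for driving a progress bar
  failed: S[];      // shaders that finished with an error
  allDone: boolean; // true once nothing is pending
}

// One polling pass, intended to run once per frame. `queryState` stands in
// for the proposed non-blocking status call, which does not exist today.
function compileProgress<S>(
  shaders: readonly S[],
  queryState: (shader: S) => CompileState,
): ProgressSnapshot<S> {
  let finished = 0;
  const failed: S[] = [];
  for (const shader of shaders) {
    const state = queryState(shader);
    if (state === "pending") {
      continue; // still compiling, check again next frame
    }
    finished += 1;
    if (state === "failed") {
      failed.push(shader);
    }
  }
  return {
    fraction: shaders.length === 0 ? 1 : finished / shaders.length,
    failed,
    allDone: finished === shaders.length,
  };
}
```

With 50 shaders submitted, a loading screen could call `compileProgress` each frame, advance the bar to `fraction`, and touch the shaders only once `allDone` is true, which is the flow requested earlier in the thread.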

Alan Wolfe

Oct 18, 2016, 2:51:22 PM
to webgl-d...@googlegroups.com
That would be wonderful!  What are the odds that we could get something like this in either WebGL or WebGL2?

Just having *some* way to prevent the false positive "webgl has crashed" errors really solves the stability issue, and makes the platform much more usable for serious applications.


Florian Bösch

Oct 19, 2016, 5:30:32 AM
to WebGL Dev List
On Sunday, October 16, 2016 at 12:13:52 AM UTC+2, Alan Wolfe wrote:
> As it is, I really cannot rely on webgl to make a serious game, knowing that some portion of my player base is going to randomly "crash" when they try to play it.

Native applications actually crash more often when you try to play them, unless you're using Unity or Unreal and don't monkey around with their render paths, which are carefully tuned/driver-switching to step around all the bugs.

Regardless, slow shader compiles are not a universal thing. It depends on the backend. Most of the time on desktops using the OpenGL backend, compiles are fairly fast. They also tend to be fast on Microsoft Edge. Compiles may be very slow on any version of Internet Explorer, on any other browser (such as Chrome or Firefox) using the Direct3D backend or on mobiles (which are generally underpowered).

Native applications step around this problem somewhat by precompiling the shaders. However, OpenGL was designed not to do that.

Fortunately Vulkan has switched the model for shader handling substantially and relies on an intermediary bytecode representation for shaders. Unfortunately, we're quite a long way off of getting WebVulkan as Khronos is currently concentrating on getting WebGL 2 out, and after that WebGL 2.1, and maybe after that something along the lines of WebVulkan. At present release cadences it's my estimate that the earliest we'll see any of WebVulkan is therefore in the range of 2024-2028.

Alan Wolfe

Oct 19, 2016, 10:30:20 AM
to webgl-d...@googlegroups.com
Heya Florian,

I'm a graphics / engine programmer on StarCraft II and Heroes of the Storm (both use a custom engine we develop) so I totally hear what you are saying about native applications (e.g. Optimus is awful, and there are plenty of driver and OS bugs all the time).

My issue isn't that shader compiles are slow, it's that a realistic game will have enough of them that they are going to be slow regardless of the user's machine, and the "flow" of how shader compiles work right now means that a large portion of users will hit false-positive "webgl has crashed" problems, and the API has NO WAY for a developer to deal with this.

Because of that, the platform is unusable by any serious game / shader intensive application.

I want to use it, I just can't, for these reasons.

Instead of saying "oh it's not that bad", read what I'm saying, because it is that bad.  It's a deal breaker, and frankly is probably part of why the platform isn't more successful than it is IMO!

Imagine if chrome/edge/firefox commonly and reliably crashed on launch because a machine was slower, or because the user had some other applications running at the time.

That would not be acceptable, and would be a pretty big bug. It would get fixed pretty quickly, especially due to the fact that it was easy to reproduce.

Unfortunately, that's the unavoidable problem for people using webgl with many shaders, or complex shaders.  We are stuck in that world of instability.

Can you help?? (:


Zhenyao Mo

Oct 19, 2016, 1:03:24 PM
to webgl-d...@googlegroups.com
On Wed, Oct 19, 2016 at 2:30 AM, Florian Bösch <pya...@gmail.com> wrote:
> Fortunately Vulkan has switched the model for shader handling substantially
> and relies on an intermediary bytecode representation for shaders.
> Unfortunately, we're quite a long way off of getting WebVulkan as Khronos is
> currently concentrating on getting WebGL 2 out, and after that WebGL 2.1,
> and maybe after that something along the lines of WebVulkan. At present
> release cadences it's my estimate that the earliest we'll see any of
> WebVulkan is therefore in the range of 2024-2028.

:)


Jaume Sánchez

Oct 19, 2016, 1:51:02 PM
to webgl-d...@googlegroups.com
Alan, out of curiosity: are you compiling all your shaders in a row without letting the browser take control in between?
You can try using rAF to schedule your shaders compilation and see if it makes a difference. 
Shadertoy really improved when they started initialising their shaders in the main page in a sequence of setTimeout.
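One way to read Jaume's suggestion, sketched under assumptions: run one compile task per tick and hand control back to the browser in between. `compileSpread`, `tasks`, `schedule`, and `onDone` are illustrative names; in a page, `schedule` would typically wrap `requestAnimationFrame` or `setTimeout`. This spreads the work out but does not by itself tell you when a given compile has finished.

```typescript
// Runs one task per scheduled tick so the browser regains control between
// shader submissions. Each task is expected to submit a single shader or
// program (for example a closure around compileShader or linkProgram).
function compileSpread(
  tasks: ReadonlyArray<() => void>,
  schedule: (cb: () => void) => void, // e.g. (cb) => setTimeout(cb, 0)
  onDone: () => void,
): void {
  let next = 0;
  const step = (): void => {
    const task = tasks[next];
    if (task === undefined) {
      onDone(); // every task has been submitted
      return;
    }
    next += 1;
    task();
    schedule(step); // yield before submitting the next one
  };
  schedule(step);
}
```

Passing the scheduler in as a parameter keeps the sketch independent of the browser, so either `setTimeout` or `requestAnimationFrame` can be used as the thread suggests.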


Alan Wolfe

Oct 19, 2016, 2:05:02 PM
to webgl-d...@googlegroups.com
Thanks for the suggestion!  It doesn't solve the problem unfortunately.

To quickly recap:
1) You can start shader compiles and do other stuff.
2) At some point, you are going to have to use those shaders or check on their status.
3) If they are not done compiling / linking / etc, they will block until they are done.
4) The length of this blocking period varies from machine to machine, shader complexity and other things.  If it's ever longer than the browser is willing to tolerate it says "webgl has crashed" which for all intents and purposes is an actual crash - the end user cannot progress past that point.

There is no way for us developers to know when it's safe to check on status of, or use a shader, without being exposed to the possibility of a "webgl has crashed" false positive error.

It makes the platform unusable for any serious application that uses complex shaders, or more than a few simple shaders. Having end users commonly experience crashes is not ok.

Having a non blocking way to check on the status of a shader would fix this and would really increase the health, stability and usability of the platform.


Kenneth Russell

Oct 19, 2016, 6:31:31 PM
to WebGL Dev List
Alan,

Could you please file an issue on https://github.com/KhronosGroup/WebGL and provide some easily-runnable and realistic test cases of shaders which take so long to compile in WebGL implementations that they trigger browsers' watchdog timeouts? If filing the bug against Chrome, please provide about:gpu from the machines where the compilation takes too long.
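For a test case of the kind requested here, a small timing harness may help. The sketch below measures how long a link plus its blocking status query takes; `GLLink`, `timeLink`, and the `now` parameter are illustrative names, and it assumes the program already has its compiled shaders attached. Some implementations may defer part of the work to the first draw, so the number is a lower bound rather than a full picture.

```typescript
// Minimal structural type for the link-related calls used below.
// A real WebGLRenderingContext fits it.
interface GLLink<P> {
  readonly LINK_STATUS: number;
  linkProgram(program: P): void;
  getProgramParameter(program: P, pname: number): unknown;
  getProgramInfoLog(program: P): string | null;
}

interface LinkTiming {
  ok: boolean; // LINK_STATUS result
  ms: number;  // wall-clock time spent in link plus the blocking query
  log: string; // info log, empty if none
}

// Links the program and immediately queries LINK_STATUS, which blocks until
// linking has finished, so the elapsed time approximates the stall a page
// would see in the worst case.
function timeLink<P>(
  gl: GLLink<P>,
  program: P,
  now: () => number = Date.now, // injectable clock, handy for testing
): LinkTiming {
  const start = now();
  gl.linkProgram(program);
  const ok = Boolean(gl.getProgramParameter(program, gl.LINK_STATUS));
  const ms = now() - start;
  return { ok, ms, log: gl.getProgramInfoLog(program) || "" };
}
```

Reporting `ms` alongside the about:gpu details for each machine would show how close a given shader comes to the watchdog timeout.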

As Corentin has already pointed out, Shadertoy is an extreme example, which uses deeply nested loops to perform raycasting. Most 3D applications use at least some triangles to represent their scenes -- not just two, like Shadertoy does -- and any loops in the shaders aren't as long-running or deeply nested.

There are plenty of examples of sophisticated shaders in tools and on sites like https://sketchfab.com/ , https://playcanvas.com/ , and http://www.marmoset.co/viewer , and these load quickly and run efficiently on a wide range of hardware.

We can certainly consider adding asynchronous entry points for CompileShader and LinkProgram -- it might even be possible to move these to another thread in existing implementations by just adding another link status; I'm not sure without giving it more thought. But the starting point would be a reasonable shader example from a high-end game that can't compile reasonably in current WebGL implementations.

-Ken


Alan Wolfe

Oct 19, 2016, 7:27:39 PM
to webgl-d...@googlegroups.com
I'll do that, thanks for that info on how to move forward.

I'm betting that compiling a reasonable amount of shader permutations will take long enough to hit a timeout on most or all machines.

I'm surprised how much defense there is of this behavior though.

Even for shadertoy's case, there is no technical reason why those shaders should cause a CRASH.

The knee jerk reaction may be "oh but those will run too slowly too, it's better to just crash".  That isn't the case.

For instance, here is a long shader compile which actually runs very quickly once it has compiled.

Thanks for the engagement and information, I really hope that we can get this solved for the betterment of the web!

Florian Bösch

Oct 20, 2016, 7:12:19 AM
to WebGL Dev List
It's kind of important to pick apart the degrees of what's not working, and how it fails.

There are some very severe cases where a single (relatively small) shader program will crash the GPU process. This usually happens with the Direct3D backend operated through ANGLE, where at some stage loop unrolling happens and results in a gigantic shader source (50 MB+) that gets sent to the Direct3D compiler, which then just blocks the GPU process until some watchdog decides stuff has stopped working. The incidence of these has gone down somewhat, but it's still possible to provoke that behavior. Strangely enough, Microsoft seems to never run into these with Edge anymore, so I don't know what they did, but whatever it is, they seem to be more competent at operating their own compiler than ANGLE.

There's another case which usually does not provoke a GPU process crash, where shader compiles are just unusually slow. This too is usually the case with the Direct3D backend through ANGLE, which somehow is usually 10x slower than OpenGL backend compiles. This too got better with newer versions of the Direct3D backend (it used to be really, really bad on DX9).

If the latter of these cases provokes a page tab timeout, that's a bug that browsers should fix, regardless of whether these calls "block" in JS. But that bug would be different from the GPU process crashing.

Yet another case to consider is that even when shader compiles are quick, they might be too slow for you to use. Frankly, this isn't such a big problem for a lot of the usecases of WebGL, but naturally, it is a problem for some. But this problematic usecase can be split into two categories as well: 1) shader compiles are too slow period, it'd make your user wait minutes until they're done and 2) they're "just" too slow, but if you didn't get a tab crash it would be OK.

The latter of these two cases should be addressed as a bug, because the tab should not crash because your shaders aren't done compiling. A new semantic to check up on shader compiles is not going to fix the second case, and it's not going to fix the former either. What a new semantic may do is allow you to display some kind of progress for the minute or so you'll have to show the user something else.

To ultimately fix slow shader compiles, OpenGL is no longer the appropriate tool. It's what the restructuring of how shaders are delivered to the Vulkan API is supposed to solve. If you ship OpenGL native applications you will run into the same problem. Your shaders can only be supplied to OpenGL either as source text (which takes a long time to compile), or as a platform-dependent binary (which you can't ship because it might differ from GPU to GPU).

That's why most games which don't exclusively target Direct3D have a lengthy "install process" which among other things, precompiles the shaders into the platform dependent native execution binary format.

This shortcoming of OpenGL is known, and it's nothing that WebGL can truly fix beyond some bandaids that make it less horrible for the user.

Alan Wolfe

Oct 20, 2016, 12:45:41 PM
to webgl-d...@googlegroups.com
So, the thing I'm talking about is just the tip of the iceberg, ouch!  That's really unfortunate.

Does it bum you out that hardware accelerated 3d on the web is kind of b0rked in fundamental ways, and isn't likely to be fixed within a handful of years? :P

I'm sure a lot of that has to do with all sorts of valid reasons, including having to herd many large companies into doing the same thing, while they have their own agendas and priorities.

I've chatted with some folks making higher-end webgl games and have gotten the feedback that yes, they want a non-blocking shader status query, but in practice they compile shaders before loading assets and then check shader status after those loads, so it is usually OK for the most part.  I'm not sure if they have metrics on how many players are crashing, or what their minspecs are.

All this fundamental brokenness aside, do you consider the webgl platform healthy enough to make serious applications on?

Thanks!!

Florian Bösch

Oct 20, 2016, 2:43:31 PM
to WebGL Dev List
On Thursday, October 20, 2016 at 6:45:41 PM UTC+2, Alan Wolfe wrote:
> All this fundamental brokenness aside, do you consider the webgl platform healthy enough to make serious applications on?

I don't think that you can simply sort every application onto one side or the other of a divide between "this is serious" and "this isn't". It always comes down to individual use cases, which might be a better or worse fit.

WebGL does have a great many use cases that fit it really well. Some of those even involve fairly complex shaders and might also need lots of shaders.

There is a noted lack of commercial games being deployed on WebGL, but that's not just to do with slow shader compiles. The web has many challenges if you want to deploy on it. It does not work like an app store, and it doesn't work like a paid download either. It presents you with a lot of things to solve that you need to do differently than when you'd just hand somebody a 50 gig download for some $ through some kind of store.

If you want to know some of the issues that the web still hasn't solved in my opinion:
  • input -> output latency tends to be fairly big, which is a problem for twitchy games
  • there's no standard payment/transaction solution
  • there's no standard advertising interface
  • there's no standard way to sign people up (or in)
  • there's no way to do non tcp networking (webrtc data channels are still not enabled in most UAs, and Microsoft is already rolling their own again)
  • hardware accelerated rendering features lag behind desktop APIs by about a decade
  • there's no standard way to deal with client side mass storage (sandbox file system is already deprecated again without ever being picked up by other UAs than chrome)
  • People who look for commercial games are usually hesitant to open web experiences for that purpose (for a variety of complex reasons)
Most of these you can deal with in some or other fashion, but it's not the cushy app-store life :)