Solving slow compilation (and eventual fail) in complex shaders (with patch)

Alvaro Segura

Mar 29, 2011, 4:36:24 AM
to angleproject, amo...@vicomtech.org
Hi!

We have noticed that complex shaders with relatively long loops take
a very long time to compile. An example is http://fractal.io/. As the
number of iterations increases, compile times grow very long.
Using the default values (Menger Sponge with 8 iterations and 60 max
steps) on Chrome or FF4 with Angle as backend is significantly slower
than the same example with Angle disabled (pure GLSL) [1]. Compilation
without Angle is also very fast.

Increasing iterations to 30 leads to a compilation failure (Error
1281) with Angle, but it works fine and smoothly without Angle.

Actually, the problem occurs when the shader program is linked
[gl.linkProgram()], which triggers the conversion to DirectX and the
actual compilation of the HLSL. We suspected that this final
compilation by the DX runtime is what takes so much time.

It seems DX is trying to unroll all loops, which for long loops is
apparently a lot of work. For very long loops it simply fails with the
message "unable to unroll loop". This is odd, because our hardware
(common today) is perfectly able to run real loops with an iteration
counter.

Therefore, we inspected the Angle source to find out how the HLSL
code is generated, and we found that prepending "[fastopt] [loop]" [2]
to the "for" statement in the GLSL->HLSL translation solves this
issue. We made this change, built the Angle libs (libEGL*.dll) and
replaced Firefox's copies to test.

[loop] tells the HLSL compiler that we want real loops, and prevents
unrolling. This alone clearly improved compile times, but not to the
level of pure GL. [fastopt] tells the compiler not to attempt
optimizations. Adding this as well improves compile time very
significantly, to near that of GL.
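
As a rough illustration of the two attributes described above, here is
a minimal C++ sketch of a helper that builds the opening text of an
HLSL "for" statement. The function name hlslForHeader and its two
boolean parameters are hypothetical and are not part of the Angle
source; only the emitted strings "[fastopt]", "[loop]" and "for(" come
from the patch.

```cpp
#include <string>

// Hypothetical helper (not part of Angle): returns the text that would be
// written before the loop's init expression in the generated HLSL.
//   fastOpt  -> prefix with [fastopt], asking the HLSL compiler to spend
//               less effort optimizing the loop
//   keepLoop -> prefix with [loop], asking for a real loop instead of
//               an unrolled one
std::string hlslForHeader(bool fastOpt, bool keepLoop)
{
    std::string header;
    if (fastOpt)
    {
        header += "[fastopt] ";
    }
    if (keepLoop)
    {
        header += "[loop] ";
    }
    header += "for(";
    return header;
}
```

With both flags set this yields the same prefix the patch hard-codes;
with both cleared it yields the original, unmodified "for(".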

We could open a new issue in the bug tracker if you think it is
worthwhile. The change works for this problem, but its implications
should probably be studied.

Runtime performance is very good and doesn't seem to be affected by
the lack of unrolling or optimizations. We are not yet sure whether
this would prevent code from running on simpler hardware that does not
support real loops (complex shaders will fail there anyway).

The suggestion would be either to always use "[fastopt] [loop] for
(...)" or to do so only if the number of iterations is known to be
quite high. Keep in mind that loops can be nested, so the total
iteration counts multiply.
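
The second option could be sketched roughly as follows. This is only
an illustration: the function names, the idea of passing the trip
counts of the enclosing loops as a vector, and the threshold value are
all assumptions, and nothing like this exists in the Angle source as
far as we know.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical: multiplies the trip counts of a nest of loops (outermost
// first) to estimate how many times the innermost body would be unrolled.
// Saturates at UINT64_MAX instead of overflowing.
std::uint64_t totalIterations(const std::vector<std::uint32_t>& tripCounts)
{
    std::uint64_t total = 1;
    for (std::uint32_t n : tripCounts)
    {
        if (n != 0 && total > UINT64_MAX / n)
        {
            return UINT64_MAX;
        }
        total *= n;
    }
    return total;
}

// Hypothetical policy: emit "[fastopt] [loop]" only when the estimated
// total exceeds a caller-chosen threshold (the right value is unknown
// and would need measuring).
bool shouldKeepLoop(const std::vector<std::uint32_t>& tripCounts,
                    std::uint64_t threshold)
{
    assert(threshold > 0);
    return totalIterations(tripCounts) > threshold;
}
```

For example, if the 8 iterations were nested inside the 60 steps (an
assumption about that shader), the total would be 480; at 30
iterations it would be 1800.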

We are sending the patch here, but we are not sure whether it will
break other things, so consider it a quick and specific solution for
the http://fractal.io/ site and similar long-loop shaders, although
the slow compilation issue can be found with other examples.


Index: OutputHLSL.cpp
===================================================================
--- OutputHLSL.cpp (revision 594)
+++ OutputHLSL.cpp (working copy)
@@ -1478,7 +1478,7 @@
mUnfoldSelect->traverse(node->getExpression());
}

- out << "for(";
+ out << "[fastopt] [loop] for(";

if (node->getInit())
{
@@ -1720,7 +1720,7 @@

// for(int index = initial; index < clampedLimit; index += increment)

- out << "for(int ";
+ out << "[fastopt] [loop] for(int ";
index->traverse(this);
out << " = ";
out << initial;


[1] Angle is disabled by setting webgl.prefer-native-gl to TRUE in
about:config
[2] HLSL for statement: http://msdn.microsoft.com/en-us/library/bb509602%28v=vs.85%29.aspx

Daniel Koch

Mar 30, 2011, 12:48:14 AM
to ase...@vicomtech.org, angleproject, amo...@vicomtech.org
Hi Alvaro,

Thanks for the information.  This seems quite similar to the problem listed in http://code.google.com/p/angleproject/issues/detail?id=118

What version of Chrome and ANGLE are you using?
As mentioned in Issue 118, Chrome 12.0.713.0 and ANGLE r591 and later will use the d3d10 shader compiler and have optimizations reduced by default for quicker compilation times.

Note though that if you compile ANGLE yourself (to test in FF), you'll get higher optimization settings for the d3d compiler than are used in Chrome.

Let us know if the new version of Chrome improves this for you.

Thanks,
Daniel
---
                        Daniel Koch -+- dan...@transgaming.com
Senior Graphics Architect -+- TransGaming Inc.  -+- www.transgaming.com

John Davis

Mar 30, 2011, 6:18:11 AM
to dan...@transgaming.com, ase...@vicomtech.org, angleproject, amo...@vicomtech.org
I'm still seeing the same amount of lag on 12.


I can increase the octave loop further if need be.