Thanks for the questions! Brian's explanation is accurate, but here are a couple more details. Skia itself doesn't actually do the flattening when running on the GPU backend (we still emit GLSL with loops present, for example), but we do still check what the total size would be if everything were unrolled, inlined, etc. The GLSL ES 1.0 spec is very carefully written so that any valid program can be completely flattened with no branching. This is intentional, because older GPUs worked that way. As a result, some older (or even not-so-old but low-power) GPUs will use the same execution model as our CPU backend. Vectorized execution (as is done on a GPU shader core) requires a fair bit of care once divergent branching can happen, and the logic to analyze and deal with that is much simpler if you can flatten things out (and just execute the non-taken paths with no side effects).
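To make "flatten things out and execute the non-taken paths" a bit more concrete, here's a rough Python sketch. It is not Skia's code, and the function names and the toy shader math are made up for illustration. Each list element stands in for one lane of a vector unit: the flattened version evaluates both sides of the branch for every lane and then picks a result per lane with a mask, which only works because neither side has side effects.

```python
def shade_branching(x):
    # Scalar reference: an ordinary if/else, evaluated one value at a time.
    if x < 0.5:
        return x * 2.0
    else:
        return 1.0 - x


def shade_flattened(lanes):
    # Flattened form: no control flow depends on the data.
    # Every lane computes the condition and BOTH sides of the branch...
    mask = [x < 0.5 for x in lanes]
    then_vals = [x * 2.0 for x in lanes]
    else_vals = [1.0 - x for x in lanes]
    # ...and a per-lane select keeps whichever result the mask asks for.
    return [t if m else e for m, t, e in zip(mask, then_vals, else_vals)]


if __name__ == "__main__":
    lanes = [0.0, 0.25, 0.5, 1.0]
    print(shade_flattened(lanes))
    print([shade_branching(x) for x in lanes])
```

Both versions give the same answers; the flattened one just pays for both sides of the branch on every lane, which is why total flattened size matters.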
It's absolutely possible to write bigger programs, and if we knew we were targeting a GPU that supported ES3+ (for example), we would simply skip the "how big would this program be if unrolled" check. We actually have an option to do this that we use for testing purposes. We don't expose it, because we still want to maintain the invariant that any program we accept is at least theoretically possible to execute on any GPU and on our CPU backend.
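For a feel of what a "how big would this be if unrolled" check looks like, here's a minimal sketch in Python. To be clear, this is not Skia's implementation: the node types, the cost units, the budget of 3000, and the `skip_size_check` flag are all hypothetical stand-ins. The point is just that a loop contributes its trip count times the size of its body, and a flattened branch contributes both of its sides.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Op:
    cost: int = 1  # hypothetical cost units, not real instruction counts


@dataclass
class Loop:
    trip_count: int  # must be known at compile time for unrolling to work
    body: List["Node"]


@dataclass
class If:
    then_body: List["Node"]
    else_body: List["Node"]


Node = Union[Op, Loop, If]


def flattened_size(nodes: List[Node]) -> int:
    """Size of the program if every loop were unrolled and every branch flattened."""
    total = 0
    for n in nodes:
        if isinstance(n, Op):
            total += n.cost
        elif isinstance(n, Loop):
            total += n.trip_count * flattened_size(n.body)
        else:
            # Flattened branch: both sides are emitted, plus one op for the select.
            total += 1 + flattened_size(n.then_body) + flattened_size(n.else_body)
    return total


def accept(program: List[Node], budget: int, skip_size_check: bool = False) -> bool:
    # skip_size_check is a made-up flag standing in for a testing-only override.
    return skip_size_check or flattened_size(program) <= budget


def raymarcher(max_steps: int) -> List[Node]:
    # Toy shape: a primary march plus a shadow march, each max_steps iterations.
    return [Loop(max_steps, [Op(cost=20)]), Loop(max_steps, [Op(cost=20)])]


if __name__ == "__main__":
    budget = 3000  # hypothetical budget
    for steps in (100, 50):
        prog = raymarcher(steps)
        print(steps, flattened_size(prog), accept(prog, budget))
```

With these invented numbers, halving the loop bound halves the flattened size and brings the program back under the budget, which is the same lever as turning down a step count in a real shader.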
Eventually though - yes, people are going to keep asking for ES3+ features, and we're still figuring out how to handle that from an API perspective.
Lastly: the easiest way to tune that linked shader is to just turn down MAX_STEPS. Even cutting it in half is sufficient to allow the shadows, but I turned it down quite a bit beyond that with no real visual impact.