Primary Eng:
Summary
In an effort to improve TTFB for high-priority resources, we throttled low-priority (e.g., not script or CSS) H2 and QUIC requests the same way we throttle HTTP/1.1 requests (see bug). This led to solid page-load performance wins (see design doc). The problem with the launch is that it had developer-facing changes (obvious in retrospect), so it should have gone through a public I2S process. It didn't, and it's catching developers by surprise: it adds latency in some cases and even breaks things in others. So, let's unship, see if we can find another way without breakage, and try again.
Motivation
The net stack has prioritization issues that cause high-priority requests to wait on low-priority requests. The wait can be substantial when there are many simultaneous requests. Throttling requests (to a max of 6 simultaneous low-priority requests per host) above the network stack fixes this, but goes against expected (and documented) H2 and QUIC behavior. While it significantly (2%) improves time-to-first-contentful-paint for our users, it does add latency to overall page load time. For example, if you load a large number of photos, they'll take longer overall, though the first requests will finish sooner than they would otherwise. Further, throttling H2 can even break pages (unbeknownst to me at the time). A site that plays 10 videos simultaneously won't work when H2 is throttled.
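To make the failure mode concrete, here is a minimal toy model of the throttle described above. It is only a sketch, not the actual Chromium code: the class `HostThrottle`, its methods, and the host name `video.example` are all hypothetical, and it assumes the 10 videos are low-priority, long-lived requests to a single host.

```python
from collections import defaultdict, deque

MAX_LOW_PRIORITY_PER_HOST = 6  # limit stated in the text


class HostThrottle:
    """Toy model (hypothetical) of a throttle sitting above the network stack."""

    def __init__(self, limit=MAX_LOW_PRIORITY_PER_HOST):
        self.limit = limit
        self.in_flight = defaultdict(int)  # host -> low-priority requests in flight
        self.pending = defaultdict(deque)  # host -> queued low-priority request ids

    def request(self, host, request_id, high_priority=False):
        """Return True if the request starts now, False if it is queued."""
        if high_priority:
            return True  # e.g. script or CSS: never throttled
        if self.in_flight[host] < self.limit:
            self.in_flight[host] += 1
            return True
        self.pending[host].append(request_id)
        return False

    def finish(self, host):
        """A low-priority request completed; return the id started next, if any."""
        self.in_flight[host] -= 1
        if self.pending[host]:
            self.in_flight[host] += 1
            return self.pending[host].popleft()
        return None


# Ten simultaneous video requests to one host (assumed scenario).
throttle = HostThrottle()
started = [throttle.request("video.example", i) for i in range(10)]
```

In this model the first six requests start and the last four are queued. Because video streams stay open, `finish` is never called for them, so the queued four never start.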
The throttle was considered a stop-gap while looking for a better solution. The solution (properly prioritizing requests within the net stack) may still be a ways off. We'd like to come up with a better stop-gap in the meantime. For instance, we could throttle H2+QUIC only while parsing the head of the response, and unthrottle once we start parsing the body. Or, we could batch requests and allow another 6 every 10 ms. Either way, we should try to find something better if we can, and send an I2S when the time comes to ship.
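As a rough illustration of the batching idea, the sketch below computes when each low-priority request would be released if all of them were issued at once. The function name `release_time_ms` and the "everything is requested at t=0" setup are assumptions made for the example; the batch size and interval are the figures floated above.

```python
BATCH_SIZE = 6          # requests released per batch (figure from the text)
BATCH_INTERVAL_MS = 10  # time between batches (figure from the text)


def release_time_ms(index, batch_size=BATCH_SIZE, interval_ms=BATCH_INTERVAL_MS):
    """Earliest start time of the index-th (0-based) low-priority request,
    assuming all requests are issued together at t=0 (hypothetical model)."""
    return (index // batch_size) * interval_ms


# Ten simultaneous requests: six start immediately, four start 10 ms later.
schedule = [release_time_ms(i) for i in range(10)]
```

Unlike a hard cap on in-flight requests, this only delays the later requests; none of them has to wait for an earlier one to finish.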
Compatibility Risk
Unshipping should have no compatibility risk. It brings us back in line with expected (and documented) H2 and QUIC behavior.