Contact emails
ari...@chromium.org, awi...@chromium.org, a...@google.com, mike...@chromium.org
None
Summary
This experiment evaluates the impact of changing the per-profile TCP socket pool size from 256 (the current default) to 513 while adding a per-top-level-site cap of 256 (to ensure no two tabs can exhaust the pool). The feasibility of raising the per-profile limit to 512 was already studied and did not yield negative results, and the per-top-level-site cap of 256 is equal to the current per-profile limit, so it should not have a negative impact. These new limits will be imposed independently for the WebSocket pool and the normal (HTTP) socket pool.
The intent is to roll this experiment directly into a full launch if no ill effects are seen. See the motivation section for more.
Blink component
TAG review
https://github.com/w3ctag/design-reviews/issues/1151
Motivation
Having a fixed pool of TCP sockets available to an entire profile allows attackers to infer the number of network requests made by other tabs, and to learn things about them to the extent that any given site can be profiled. For example, if a site makes X network requests when it’s logged in and Y when it’s logged out, an attacker can saturate the TCP socket pool, call window.open, and watch how the pool moves to glean the state of the other site. This sort of attack is outlined in more detail here: https://xsleaks.dev/docs/attacks/timing-attacks/connection-pool/
In order to address this sort of attack, we will cap the max sockets per-top-level-site while raising the per-profile limit. That means no single tab can max out the socket pool on its own. While this mitigation does not fully block the attack (it could still be performed by orchestrating three attacking tabs on different sites) it raises the difficulty by preventing it from being performed by just one tab. Widespread adoption of this attack is already made difficult as multiple attackers all acting at once would step on each other and prevent pool monopolization.
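The arithmetic behind "three attacking tabs" can be checked with a minimal sketch. It assumes only the two limits stated in this intent; the function name is hypothetical and is not part of Chromium.

```python
# Illustrative arithmetic only; min_sites_to_exhaust is a hypothetical helper.
PER_PROFILE_LIMIT = 513       # proposed per-profile pool size
PER_TOP_LEVEL_SITE_CAP = 256  # proposed per-top-level-site cap


def min_sites_to_exhaust(pool_limit: int, per_site_cap: int) -> int:
    """Smallest number of distinct top-level sites needed to fill the pool."""
    # Integer ceiling division: each site contributes at most per_site_cap sockets.
    return -(-pool_limit // per_site_cap)


print(min_sites_to_exhaust(PER_PROFILE_LIMIT, PER_TOP_LEVEL_SITE_CAP))  # 3
print(min_sites_to_exhaust(512, PER_TOP_LEVEL_SITE_CAP))                # 2
```

With a limit of 512, two sites at 256 sockets each would fill the pool exactly; the extra socket in 513 is what pushes the minimum to three.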
Risks
Interoperability and Compatibility
While other user agents may wish to follow the results, we only anticipate compatibility issues with local machines or remote servers when the number of available TCP sockets in the browser increases (256 -> 513) in a way Chrome did not allow before. This will be monitored carefully, and any experiment yielding significant negative impact on browsing experience will be terminated early.
Gecko: https://github.com/mozilla/standards-positions/issues/1299; current global cap of 128-900 (as allowed by OS)
WebKit: https://github.com/WebKit/standards-positions/issues/550; current global cap of 256
Debuggability
This will be gated behind the base::Feature kTcpConnectionPoolSizePerTopLevelSiteTrial, so if breakage is suspected, that flag can be turned off to check whether the feature is the cause. For how to control feature flags, see this.
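As a rough sketch of how the feature might be toggled locally for debugging, the snippet below builds a command line using Chromium's general --enable-features / --disable-features switches with the Finch feature name from this intent. The helper name and the binary path are hypothetical, and it is an assumption that forcing the feature on from the command line applies the same limits as the experiment configuration.

```python
# Hypothetical helper; the binary path passed in is whatever applies locally.
FEATURE_NAME = "TcpConnectionPoolSizePerTopLevelSiteTrial"


def chrome_args(binary: str, enable: bool) -> list[str]:
    """Build an argv list that forces the feature on or off for one run."""
    switch = "--enable-features" if enable else "--disable-features"
    return [binary, f"{switch}={FEATURE_NAME}"]


# Example: a run with the feature forced off, to compare against suspected breakage.
print(chrome_args("chrome", enable=False))
```

The resulting list can be passed to subprocess.run to launch the browser.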
Measurement
A new net log event type SOCKET_POOL_STALLED_MAX_SOCKETS_PER_TOP_LEVEL_SITE will be added to track when we hit this new limit as opposed to the existing SOCKET_POOL_STALLED_MAX_SOCKETS event.
An existing metric Net.TcpConnectAttempt.Latency.{Result} will be used to detect increases in overall connection failure rates.
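To illustrate how the two stall events would differ, here is a toy model of the two limits. It is not Chromium's socket pool code: the class, the order of the checks, and the "OK" return value are assumptions, and other limits the real pool may enforce are ignored.

```python
from collections import Counter


class SocketPoolModel:
    """Toy model of a pool with a global limit and a per-top-level-site cap."""

    def __init__(self, pool_limit: int = 513, per_site_cap: int = 256):
        self.pool_limit = pool_limit
        self.per_site_cap = per_site_cap
        self.in_use = Counter()  # sockets held, keyed by top-level site

    def try_acquire(self, top_level_site: str) -> str:
        # Assumed ordering: the per-site cap is checked before the global limit.
        if self.in_use[top_level_site] >= self.per_site_cap:
            return "SOCKET_POOL_STALLED_MAX_SOCKETS_PER_TOP_LEVEL_SITE"
        if sum(self.in_use.values()) >= self.pool_limit:
            return "SOCKET_POOL_STALLED_MAX_SOCKETS"
        self.in_use[top_level_site] += 1
        return "OK"


pool = SocketPoolModel()
for _ in range(256):
    pool.try_acquire("https://a.example")
# A single site is stopped by its own cap, long before the pool is full.
print(pool.try_acquire("https://a.example"))
```

In this model a single site only ever triggers the new per-top-level-site event; the existing event fires once sockets from at least three sites add up to 513.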
Will this feature be supported on all six Blink platforms (Windows, Mac, Linux, ChromeOS, Android, and Android WebView)?
No, not WebView. That will have to be studied independently due to the differing constraints.
Is this feature fully tested by web-platform-tests?
No. As this is a Blink networking-focused change, browser tests or unit tests are more likely to be used.
Flag name on about://flags
None
Finch feature name
TcpConnectionPoolSizePerTopLevelSiteTrial
Rollout plan
We will never test more than 5% in each group on stable, and will stay on canary/dev/beta for a while to detect issues before testing stable.
Requires code in //chrome?
No
Tracking bug
Estimated milestones
142
Link to entry on the Chrome Platform Status