[blink-dev] Intent to Experiment: Load common payloads from privacy-preserving single-keyed cache

Daisuke Enomoto

Apr 26, 2022, 7:59:23 AM
to blink-dev, Adam Rice, Nidhi Jaju

Contact emails

ri...@chromium.org, nidh...@chromium.org, deno...@chromium.org


Explainer

https://docs.google.com/document/d/1pvaMg7J5beBXD7trzHJH_MDULc_wRHLx40MFYAmjknE/edit


Specification

N/A (because there are no web-exposed changes)


Summary

This limited experiment measures how much "pervasive payloads" contribute to the performance impact of the split HTTP cache in each Chrome channel over a three-week period. Pervasive payloads are those third-party payloads included on at least 500 sites and fetched at least 10M times in a month, based on Chrome's analysis (payload list included below). This experiment further measures the impact on Core Web Vitals metrics of restoring pervasive payloads (and only pervasive payloads) to a single-keyed cache regime. The privacy benefits of the split HTTP cache are preserved.


Blink component

Blink>Network


Motivation

Browsers split HTTP caches based on the top-frame visited origin (“double-keyed” or "triple-keyed" caching) to prevent sites from tracking users via a timing attack on a cross-site client cache.


Chrome’s analysis estimates that split caching results in a 3% increase in cache misses, i.e. fetches for which a payload exists in the cache of the user's device, but is unavailable to the page because it was fetched by the user while loading a page from a different origin. This results in approximately 4% more total bytes being fetched over the network.


Our analysis further revealed that many of the redundant fetches caused by split caching were for common payloads associated with displaying user content (libraries, fonts, widgets, ads) or common payloads that assist in operating online businesses (analytics). The delayed arrival of these common payloads resulted in visible "jank" for users, impacting performance metrics like LCP, FCP and CLS. This jank has been associated with negative effects on online businesses' engagement and conversion rates. Furthermore, delayed loads of analytics and ads payloads can result in missed ad impressions and dropped analytics hits.


Initial public proposal

This experiment sends a list to Chrome of 100 <URL, hash> pairs whose payloads are considered pervasive (the "pervasive payloads list"). During the three-week experiment period, if Chrome fetches a payload that matches both the URL and its hash on the pervasive payloads list, it is inserted into a local single-keyed cache. This payload is then available for use by Chrome when loading pages on other sites that include the matching URL. All other fetches for URLs not on the pervasive payloads list are cached according to the existing split HTTP cache.


The hash covers the payload body and most response headers, except for those which change on every response.
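
The matching rule above can be sketched as follows. This is a minimal illustration, not Chrome's implementation: the in-memory caches, the function names, the use of SHA-256, and the set of excluded headers are all assumptions made for the example, since the thread does not specify the hash function or which headers are skipped.

```python
import hashlib

# Hypothetical in-memory stand-ins for the two caches.
single_keyed_cache = {}  # url -> body
split_cache = {}         # (top_frame_site, url) -> body

# Assumed examples of headers that change on every response; the real
# exclusion list is not given in this thread.
VOLATILE_HEADERS = {"date", "age", "expires"}


def payload_hash(body: bytes, headers: dict[str, str]) -> str:
    """Hash the body plus the stable response headers (illustrative scheme)."""
    digest = hashlib.sha256()
    for name in sorted(headers, key=str.lower):
        if name.lower() in VOLATILE_HEADERS:
            continue  # skip headers assumed to vary per response
        digest.update(f"{name.lower()}:{headers[name]}\n".encode())
    digest.update(body)
    return digest.hexdigest()


def store_response(pervasive, top_frame_site, url, body, headers):
    """Store a fetched response and return the name of the cache used."""
    # Both the URL and the hash must match an entry on the list.
    if pervasive.get(url) == payload_hash(body, headers):
        single_keyed_cache[url] = body
        return "single-keyed"
    # Everything else keeps the existing split (partitioned) behaviour.
    split_cache[(top_frame_site, url)] = body
    return "split"


def lookup(pervasive, top_frame_site, url):
    """Return a cached body for this page, or None on a cache miss."""
    if url in pervasive and url in single_keyed_cache:
        return single_keyed_cache[url]
    return split_cache.get((top_frame_site, url))
```

In this sketch a listed payload fetched while visiting one site is returned by `lookup` for any other top-frame site, while an unlisted URL, or a listed URL whose content no longer matches the hash, remains visible only to the site that fetched it.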


To ensure we do not degrade the privacy profile of any users during this experiment, only users with third-party cookies currently enabled will be eligible for the experiment. We will compare the experience of users in experiment and control arms according to total bytes loaded and page performance metrics like the Core Web Vitals.


The pervasive payloads list was produced by crawling the web and aggregating the most commonly referenced third-party resource URLs included in HTML content. We then used pseudonymous URL-keyed metrics from Chrome to estimate the traffic to sites and the number of impressions of third-party resources. Individually identifiable browsing or search histories were not used in the creation of the pervasive payload list (for more information about Chrome's data collection policies and privacy policies, see google.com/chrome/privacy). The resulting list was further filtered for any URLs that might contain PII (e.g. URLs with extensive or opaque query parameters). The list was also manually reviewed to ensure it included only payloads reasonably expected to be pervasive; the manual review did not result in any payloads being removed.
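
As a rough sketch of the selection criteria, the thresholds from the Summary (at least 500 sites, at least 10M fetches in a month) and the query-parameter filter could be applied like this. The `Candidate` type, the ranking by fetch count, and the `MAX_QUERY_PARAMS` cutoff are assumptions for illustration only; the thread does not say how "extensive or opaque" query parameters were detected, and this sketch only approximates "extensive".

```python
from dataclasses import dataclass
from urllib.parse import parse_qsl, urlsplit

MIN_SITES = 500                    # from the Summary
MIN_MONTHLY_FETCHES = 10_000_000   # from the Summary
MAX_QUERY_PARAMS = 2               # assumed cutoff for "extensive" queries


@dataclass(frozen=True)
class Candidate:
    url: str
    site_count: int        # sites that include the resource
    monthly_fetches: int   # estimated fetches in a month


def may_contain_pii(url: str) -> bool:
    """Flag URLs whose query string has more parameters than the cutoff."""
    query = urlsplit(url).query
    return len(parse_qsl(query, keep_blank_values=True)) > MAX_QUERY_PARAMS


def pervasive_candidates(candidates, limit=100):
    """Apply both thresholds and the PII filter, then keep the top entries."""
    eligible = [
        c for c in candidates
        if c.site_count >= MIN_SITES
        and c.monthly_fetches >= MIN_MONTHLY_FETCHES
        and not may_contain_pii(c.url)
    ]
    # Ranking by fetch count is an assumption; the thread does not state
    # how the top 100 were ordered.
    eligible.sort(key=lambda c: c.monthly_fetches, reverse=True)
    return eligible[:limit]
```

The manual review step described above has no counterpart in this sketch.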


The privacy properties of the split HTTP cache are considered essential to users and this proposal intends to maintain those properties, specifically by maintaining split HTTP caching for all payloads not on the pervasive payloads list.


API semantics are unchanged. User-facing functionality is unchanged (though we expect performance to be slightly improved).


The list of the top 100 Pervasive URLs for use in this experiment is pending internal reviews and will be shared on this thread upon approval. 

Future directions

This experiment is the first step in a path exploring improved handling of pervasive payloads in the browser cache. We outline the intended future functionality here to clarify the intentions behind the current experiment. The overview below is not complete or final and subsequent parts of the design and implementation will be presented and discussed in further Intents to Experiment and Prototype.


At a high level, a future improvement to the handling of pervasive payloads may involve:


  • Assembling a list of pervasive payloads that meets the following criteria:

    • Maintains the privacy of user browsing histories in its creation

    • Fairly represents pervasive payloads as they have been chosen by developers on the web, not payloads selected or favored by any particular library or browser vendor.

      • This experiment will initially use a static list of predefined URLs assembled as described in the 'Initial public proposal' section above

      • A future implementation will likely dynamically update the payloads list on, for example, a weekly cadence.

  • Implementing shared caching for pervasive payloads that meets the following criteria:

    • Materially improves load times and responsiveness for web users (under study in this experiment)

    • Does not create a new tracking vector based on cache timing attacks

    • Does not require users to fetch payloads before the browser knows they will need them (i.e. we don't plan to bundle payloads with browser installs or updates)

    • Does not increase local storage required by browser installs or caches


To privately and fairly assemble the list of pervasive payloads, we are exploring the use of Private Heavy Hitters. To implement a privacy-preserving shared cache after the deprecation of third-party cookies, we are exploring adding a measure of randomness to the observed presence or absence of a pervasive payload in the shared cache.


However, this work is only worthwhile if it results in materially improved load times for real users. This Intent to Experiment covers only whether or not we should attempt to measure the performance gains that might be realized if pervasive payloads were placed in a shared cache, as one data point among others that will contribute to discussions about future steps for the proposal.


TAG review

None yet.


TAG review status

N/A


Risks


Interoperability and Compatibility

Chrome's compliance with the relevant standards is unchanged. Caching behavior differs between browsers so interoperability will not be affected.


The list of popular payloads is specifically chosen to minimize compatibility risks.



Gecko: No signal


WebKit: No signal


Web developers: No signals


Other signals:


WebView application risks

Does this intent deprecate or change behavior of existing APIs, such that it has potentially high risk for Android WebView-based applications? No



Debuggability

There is no developer-exposed API for this feature, so most DevTools support is not relevant. It would be useful to indicate in the network tab whether a resource was served from the single-keyed cache; however, this will not be implemented in the initial experiment.


Security and privacy

Single-keyed caching introduces global state shared between different browsing contexts. A shared cache can introduce information leaks based on cache probing (https://xsleaks.dev/docs/attacks/cache-probing/), including XS-Search (https://xsleaks.dev/docs/attacks/xs-search/) in applications which conditionally load a single-keyed-cache eligible resource based on authenticated user state. The state of the cache, queried across different contexts, could also be used as a fingerprint, permitting user tracking; however, in this case, we believe this does not provide tracking capabilities beyond those of third-party cookies.


To protect users during this experiment, we limit the experiment population to those users with third-party cookies enabled. Recognizing that third-party cookies will eventually be switched off for most users, we are developing protections such as slightly randomizing cache hit/miss checks, disallowing eviction, or guaranteeing that attempts to read from the cache reliably populate that cache entry. These protections will be proposed and incorporated before any future optimizations are launched.


For the purposes of the current experiment, we will be using the same implementation of single-keyed caching that Chrome used before the HTTP cache was partitioned in M77 (https://chromestatus.com/feature/5730772021411840).


To summarize, the security and privacy restrictions on this experiment are as follows:


  1. We will exclude users that have third-party cookies disabled.

  2. Only a small percentage of users will be included in the experiment, reducing the likelihood and impact of any attacks abusing the single-keyed cache.

  3. We will strictly limit the duration of the experiment on each channel to 3 weeks.

  4. We will only serve pervasive resources from the single-keyed cache.

  5. We can turn off the experiment immediately (independent of browser updates) if any other threats appear.
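
Taken together, the five restrictions amount to a gate that every single-keyed cache access would have to pass. The sketch below shows one way to express that gate; the `ExperimentState` type, its field names, and the function name are hypothetical and are not taken from Chrome's code.

```python
from dataclasses import dataclass
from datetime import date, timedelta

EXPERIMENT_DURATION = timedelta(weeks=3)  # restriction 3


@dataclass(frozen=True)
class ExperimentState:
    third_party_cookies_enabled: bool  # restriction 1
    in_experiment_group: bool          # restriction 2
    channel_start: date                # start of the experiment on this channel
    kill_switch_on: bool               # restriction 5


def may_use_single_keyed_cache(state, today, url_is_pervasive):
    """Return True only if every restriction allows the single-keyed cache."""
    if state.kill_switch_on:
        return False
    if not state.third_party_cookies_enabled:
        return False
    if not state.in_experiment_group:
        return False
    window_end = state.channel_start + EXPERIMENT_DURATION
    if not (state.channel_start <= today < window_end):
        return False
    # Restriction 4: only pervasive resources are served from this cache.
    return url_is_pervasive
```

Any check that fails leaves the request on the existing split-cache path.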


Is this feature fully tested by web-platform-tests?

This behavior is specific to Chrome and not part of any standard, so it will not be tested in web platform tests.


Flag name

CacheTransparency


Requires code in //chrome?

No, but the list of popular payloads and the mechanism for distributing it to the browser will be Chrome-specific.


Tracking bug

https://bugs.chromium.org/p/chromium/issues/detail?id=1309002


Launch bug

https://bugs.chromium.org/p/chromium/issues/detail?id=1309353


Estimated milestones

M103 for off-by-default experiment


Link to entry on the Chrome Platform Status

https://chromestatus.com/feature/5768521127559168



Anne van Kesteren

Apr 26, 2022, 8:15:21 AM
to Daisuke Enomoto, blink-dev, Adam Rice, Nidhi Jaju
On Tue, Apr 26, 2022 at 1:59 PM Daisuke Enomoto <deno...@chromium.org> wrote:
> Explainer
>
> https://docs.google.com/document/d/1pvaMg7J5beBXD7trzHJH_MDULc_wRHLx40MFYAmjknE/edit

This document isn't public.

This particular technique has been discussed before, but there's a
flaw which wasn't mentioned in this email. The idea assumes that all
end users can access the same websites and also that all end users
visit similar websites. Neither of those is a given and as such end
users that for one reason or another only end up visiting one or two
websites that use a "pervasive payload" could be vulnerable to attack.

Mike Taylor

Apr 26, 2022, 9:03:11 AM
to Anne van Kesteren, blink-dev, Adam Rice, Nidhi Jaju, Daisuke Enomoto
On 4/26/22 8:14 AM, Anne van Kesteren wrote:
> On Tue, Apr 26, 2022 at 1:59 PM Daisuke Enomoto <deno...@chromium.org> wrote:
>> Explainer
>>
>> https://docs.google.com/document/d/1pvaMg7J5beBXD7trzHJH_MDULc_wRHLx40MFYAmjknE/edit
> This document isn't public.

I just checked the settings, and it should be public now.

Noam Rosenthal

Apr 26, 2022, 10:24:44 AM
to blink-dev, mike...@chromium.org, blink-dev, ri...@chromium.org, nidh...@chromium.org, deno...@chromium.org, ann...@annevk.nl
The summary says "payload list included below" - I can't find it though... is the list included in one of the links?

Joe Medley

Apr 26, 2022, 11:50:15 AM
to Daisuke Enomoto, blink-dev, Adam Rice, Nidhi Jaju
What is an 'off-by-default experiment'? Is that a dev trial flag?
Joe Medley | Technical Writer, Chrome DevRel | jme...@google.com | 816-678-7195
If an API's not documented it doesn't exist.


--
You received this message because you are subscribed to the Google Groups "blink-dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to blink-dev+...@chromium.org.
To view this discussion on the web visit https://groups.google.com/a/chromium.org/d/msgid/blink-dev/CAA5e6990s-e4aYUnYK5%2BqzQpAyFzJa42y%2B%3D_MAnL19z%3DqemnWg%40mail.gmail.com.

Vivek Sekhar

Apr 26, 2022, 4:14:05 PM
to Anne van Kesteren, Daisuke Enomoto, blink-dev, Adam Rice, Nidhi Jaju
> This particular technique has been discussed before, but there's a
> flaw which wasn't mentioned in this email. The idea assumes that all
> end users can access the same websites and also that all end users
> visit similar websites. Neither of those is a given and as such end
> users that for one reason or another only end up visiting one or two
> websites that use a "pervasive payload" could be vulnerable to attack.

Thanks for raising this. When you say "can access," are you referring to e.g. national governments or ISPs blocking access to large numbers of otherwise-popular sites? If so, would geography-specific lists of pervasive payloads mitigate this concern? If not, can you provide more details on the scenario you have in mind?


--

Vivek | Sekhar | Product Manager | vse...@google.com

Mike Taylor

Apr 26, 2022, 4:55:35 PM
to Daisuke Enomoto, Adam Rice, Nidhi Jaju, blink-dev
Hi Daisuke,

Can you clarify the timeline of the experiment? Would it begin in M103? I have concerns about interactions with the current double-key experiment we're running for Network State Partitioning in M101 and M102.

Nidhi Jaju

Apr 26, 2022, 8:56:04 PM
to Noam Rosenthal, blink-dev, mike...@chromium.org, ri...@chromium.org, deno...@chromium.org, ann...@annevk.nl
Hi Noam, 

Apologies for the confusion! The list of the top 100 pervasive payloads for use in the experiment is pending internal reviews and will be shared on this thread upon approval.

Anne van Kesteren

Apr 27, 2022, 2:50:56 AM
to Vivek Sekhar, Daisuke Enomoto, blink-dev, Adam Rice, Nidhi Jaju
On Tue, Apr 26, 2022 at 9:22 PM Vivek Sekhar <vse...@google.com> wrote:
>> This particular technique has been discussed before, but there's a
>> flaw which wasn't mentioned in this email. The idea assumes that all
>> end users can access the same websites and also that all end users
>> visit similar websites. Neither of those is a given and as such end
>> users that for one reason or another only end up visiting one or two
>> websites that use a "pervasive payload" could be vulnerable to attack.
>
> Thanks for raising this. When you say "can access," are you referring to e.g. national governments or ISPs blocking access to large numbers of otherwise-popular sites? If so, would geography-specific lists of pervasive payloads mitigate this concern? If not, can you provide more details on the scenario you have in mind?

That is part of the concern, but end users can be segmented in more
ways than that. If an end user minority in a region doesn't visit the
websites the end user majority visits, but a website they do visit
uses a "pervasive payload", you have the same risk. The last time we
discussed this in depth I don't think anyone came up with a solution
that would solve this other than with variations on bundling
"pervasive payloads". I'm rather surprised it's coming up again
without accounting for these issues.

Yoav Weiss

Apr 27, 2022, 4:46:45 AM
to Anne van Kesteren, Vivek Sekhar, Daisuke Enomoto, blink-dev, Adam Rice, Nidhi Jaju
Hey Anne! :)

I agree that the concerns you raise are definitely something we'd need to resolve before shipping this.
At the same time, this intent is for a short-lived experiment, aiming to quantify the benefits of the feature, before investing efforts in resolving those hard problems.
 


Adam Rice

Apr 27, 2022, 5:26:31 AM
to Yoav Weiss, Anne van Kesteren, Vivek Sekhar, Daisuke Enomoto, blink-dev, Nidhi Jaju
On Tue, Apr 26, 2022 at 9:22 PM Vivek Sekhar <vse...@google.com> wrote:
>> This particular technique has been discussed before, but there's a
>> flaw which wasn't mentioned in this email. The idea assumes that all
>> end users can access the same websites and also that all end users
>> visit similar websites. Neither of those is a given and as such end
>> users that for one reason or another only end up visiting one or two
>> websites that use a "pervasive payload" could be vulnerable to attack.

I may be misunderstanding your point, but the idea of the "pervasive payloads" is that they are so widely used that you learn nothing useful about the user from knowing they have downloaded them. To provide a concrete example, suppose by probing the cache you discover that the user has the JavaScript for embedded tweets. This doesn't tell you which of the thousands of sites that embed tweets that they went to.

Adam Rice

Apr 27, 2022, 5:30:09 AM
to Joe Medley, Daisuke Enomoto, blink-dev, Nidhi Jaju
> What is an 'off-by-default experiment'? Is that a dev trial flag?

Just an ordinary experiment, behind a flag which is off-by-default. So most users get the default behaviour (no single-keyed cache), except for those in the experimental group.

Yoav Weiss

Apr 27, 2022, 5:50:24 AM
to Adam Rice, Alex Russell, Joe Medley, Daisuke Enomoto, blink-dev, Nidhi Jaju
LGTM to experiment

Thanks for working on this!
I agree with Anne's concerns about fingerprinting risks with shipping this, as well as +Alex Russell's concerns about the risk of this devolving into "playing favorites". But these issues would only be worth solving if we see that the benefits of a single-keyed cache for pervasive resources are significant. So I think running a short-term experiment to measure those benefits (while acknowledging and monitoring the risks) is the right way to go about this.
 

Mike Taylor

Apr 27, 2022, 9:16:26 AM
to Daisuke Enomoto, Adam Rice, Nidhi Jaju, blink-dev
Thanks Daisuke - sounds good. I don't think we'll need to extend beyond M102 (but I probably just jinxed it...).

On 4/26/22 8:50 PM, Daisuke Enomoto wrote:
> Hi Mike,
>
> Thank you for your question! We're targeting M103 to start the experiment. So, IIUC, it would not interact with the double-key experiment running through M102 unless it's extended.


Daisuke Enomoto

Apr 27, 2022, 11:16:40 AM
to Mike Taylor, Adam Rice, Nidhi Jaju, blink-dev
Hi Mike,

Thank you for your question! We're targeting M103 to start the experiment. So, IIUC, it would not interact with the double-key experiment running through M102 unless it's extended.


On Wed, Apr 27, 2022 at 5:55 AM Mike Taylor <mike...@chromium.org> wrote:

Joe Medley

Apr 27, 2022, 2:07:10 PM
to Adam Rice, Daisuke Enomoto, blink-dev, Nidhi Jaju
Finch flag?


Adam Rice

Apr 28, 2022, 4:26:24 AM
to Joe Medley, Daisuke Enomoto, blink-dev, Nidhi Jaju
> Finch flag?

CacheTransparency (but the code is not landed yet). 

Joe Medley

Apr 28, 2022, 10:46:39 AM
to Adam Rice, Daisuke Enomoto, blink-dev, Nidhi Jaju
What is that?

I'm trying to understand what this is because I need, or may need, to explain it in writing to the external world in a few weeks. I've never heard of a CacheTransparency flag.


Daisuke Enomoto

May 1, 2022, 8:57:04 PM
to Joe Medley, Adam Rice, blink-dev, Nidhi Jaju
Hi Joe,

CacheTransparency is the feature name we use for the feature flag. As Adam mentioned, the feature is off by default. Yes, we use Finch to determine the control/experiment group on each channel (50% on canary & dev, 10% on beta, 1% on stable).
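
For readers unfamiliar with percentage rollouts, here is a generic sketch of deterministic group assignment using the per-channel percentages quoted above. It does not describe how Finch actually works: the hashing scheme, the `client_id` parameter, and the reading of each percentage as the share of clients placed in the experiment arm are all assumptions for illustration.

```python
import hashlib

# Per-channel percentages from the message above, read here (as an
# assumption) as the share of clients placed in the experiment arm.
ENROLLMENT_PERCENT = {"canary": 50, "dev": 50, "beta": 10, "stable": 1}


def in_experiment(client_id: str, channel: str,
                  study: str = "CacheTransparency") -> bool:
    """Map a client to a stable bucket in [0, 100) and compare to the quota."""
    digest = hashlib.sha256(f"{study}:{client_id}".encode()).digest()
    bucket = int.from_bytes(digest[:8], "big") % 100
    return bucket < ENROLLMENT_PERCENT[channel]
```

Because the bucket depends only on the study name and the client identifier, a given client gets the same answer on every call.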

Joe Medley

May 2, 2022, 10:35:00 AM
to Daisuke Enomoto, Adam Rice, blink-dev, Nidhi Jaju
If it's a feature flag, please put milestones in the appropriate dev trial milestone fields.

Thanks


Daisuke Enomoto

May 10, 2022, 5:03:10 AM
to Joe Medley, Adam Rice, blink-dev, Nidhi Jaju
Hi Joe,

Sorry for the confusion. The feature flag is used to control the experiment through Finch. The Cache Transparency feature does not expose any changes for developers to try. We chose to send the I2E to open a discussion with the blink-dev audience, gather feedback, and iterate through experiments to quantify the benefit of the single-keyed cache, rather than run an Origin Trial in which developers can participate.

Daisuke Enomoto

Jun 15, 2022, 3:04:57 AM
to blink-dev, Noam Rosenthal, mike...@chromium.org, Nidhi Jaju, ri...@chromium.org, ann...@annevk.nl
Hello


> Apologies for the confusion! The list of the top 100 pervasive payloads for use in the experiment is pending internal reviews and will be shared on this thread upon approval.

The list of the top 100 pervasive payloads can be found here. Please note that this list is for the purposes of the experiment only. On a side note, the experiment is currently disabled in M103 after we found crashing bugs, which we're actively investigating.
