Intent to Implement: Partition the HTTP Cache

Shivani Sharma

Jul 31, 2019, 12:25:53 PM
to blink-dev


Contact emails

shiva...@chromium.org, jka...@chromium.org 


Explainer

https://github.com/whatwg/fetch/issues/904


Design docs/spec

Threat Model

Design 


TAG review

Not yet started


Summary

The HTTP cache is currently one-per-profile, with a single namespace for all resources regardless of origin or renderer process. This opens the browser to a side-channel attack where one site can detect if another site has loaded a resource by checking if it’s in the cache.


This feature will partition the HTTP cache using top frame origin (and also possibly the subframe origin) to prevent documents from one origin from knowing if a resource from a cross-origin document load was cached or not. 
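A minimal sketch of what keying the cache by top-frame origin means, in Python. The `PartitionedCache` class and the `origin_of` helper are hypothetical names used only for illustration, not Chromium code, and the origin computation is simplified (it ignores default ports, opaque origins, and the possible subframe key).

```python
from urllib.parse import urlsplit


def origin_of(url: str) -> str:
    """Return scheme://host[:port] for a URL (simplified origin, an assumption)."""
    parts = urlsplit(url)
    return f"{parts.scheme}://{parts.netloc}"


class PartitionedCache:
    """Toy cache keyed by (top-frame origin, resource URL) instead of URL alone."""

    def __init__(self):
        self._entries = {}

    def put(self, top_frame_url, resource_url, body):
        self._entries[(origin_of(top_frame_url), resource_url)] = body

    def get(self, top_frame_url, resource_url):
        # Returns None on a miss.
        return self._entries.get((origin_of(top_frame_url), resource_url))


cache = PartitionedCache()
cache.put("https://a.example/page", "https://cdn.example/lib.js", b"lib")

# Same top-frame origin: hit. Different top-frame origin: miss.
hit_same_site = cache.get("https://a.example/other", "https://cdn.example/lib.js")
hit_other_site = cache.get("https://b.example/", "https://cdn.example/lib.js")
```

With this keying, a document under `https://b.example` cannot observe that `https://a.example` already loaded the same resource.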


From a performance perspective, preliminary experiments with partitioning using top-frame-origin show that the cache hit rate drops by about 4% but changes to first contentful paint aren’t statistically significant and the overall fraction of bytes loaded from the cache only drops from 39.1% to 37.8%. This may change as we flesh out the implementation and progress to larger populations but it’s an encouraging start.
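To put the bytes figure in context, here is a short calculation in Python. The post does not say whether the "about 4%" hit-rate drop is relative or in percentage points, so this sketch only works through the bytes-from-cache numbers quoted above.

```python
# Fraction of bytes loaded from the cache, before and after partitioning (from the post).
before_pct = 39.1
after_pct = 37.8

absolute_drop = before_pct - after_pct            # in percentage points
relative_drop = absolute_drop / before_pct * 100  # as a percentage of the original value

print(f"absolute: {absolute_drop:.1f} points, relative: {relative_drop:.1f}%")
```

The absolute drop is 1.3 percentage points, which is roughly 3.3% of the original 39.1%.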


Motivation

Cache attacks have been exploited for the following:

  • Cross-site search attack: There exist cross site search attack proofs-of-concept which exploit the fact that some popular sites load a specific image when a search result is empty. By opening a tab and performing a search and then checking for that image in the cache, an adversary can detect if an arbitrary string is in the user’s search results. 

  • Detect if a user has visited a specific site: If the cached resource is specific to a particular site or to a particular cohort of sites, an adversary can learn part of the user’s browsing history by checking if the cache has that resource.
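The cross-site search attack can be sketched as a toy simulation in Python. Everything here is hypothetical: the `SharedCache` class, the image URL, and the helper functions are illustrative assumptions, and a real attack would infer a hit from load timing rather than from a boolean return value.

```python
class SharedCache:
    """Toy single-namespace cache: keyed by URL only, with no partitioning."""

    def __init__(self):
        self._urls = set()

    def fetch(self, url):
        """Return True on a cache hit; otherwise store the URL and return False."""
        if url in self._urls:
            return True
        self._urls.add(url)
        return False


# Hypothetical image that the search site shows only for empty results.
NO_RESULTS_IMAGE = "https://search.example/img/no-results.png"


def victim_search(cache, query, user_data):
    """The search page loads the image only when nothing matches the query."""
    if not any(query in item for item in user_data):
        cache.fetch(NO_RESULTS_IMAGE)


def attacker_probe(cache, query, user_data):
    """Trigger a search in another tab, then check whether the image got cached."""
    victim_search(cache, query, user_data)
    image_was_cached = cache.fetch(NO_RESULTS_IMAGE)
    return not image_was_cached  # True means the query matched the user's data


private_data = ["invoice from bank", "lunch plans"]
leak_bank = attacker_probe(SharedCache(), "bank", private_data)
leak_salary = attacker_probe(SharedCache(), "salary", private_data)
```

Each probe uses a fresh cache here because the probe itself stores the image; a real attacker would need to evict it between guesses.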


In addition to the above cache attacks, the cache can also be used to store cross-site super-cookies as a fingerprinting mechanism. To clear them the user has to delete their entire cache (not just the entries for a particular site). Since fingerprinting is neither transparent nor under the user’s control, it results in tracking that doesn’t respect user choice.


Interoperability and Compatibility

Mozilla is interested in this, and Safari has long since implemented a variant of it by using eTLD+1 to partition the cache. This is the GitHub issue with input from various browsers.


Firefox: Public support

Edge: No public signals

Safari: Public support


Web developers: This is not a breaking change, but it will have performance considerations for some organizations, for instance those that serve large volumes of highly cacheable resources across many sites (e.g., fonts and popular scripts). This needs to be balanced against the privacy and security concerns and the fact that sites often use different versions of popular libraries, which reduces the benefits of such caching. It’s also worth reiterating that Safari has already partitioned its cache and that Firefox has signaled that it wants to as well.


Security

This is a security enhancement, as detailed in the threat model document linked above.


Will this feature be supported on all six Blink platforms (Windows, Mac, Linux, Chrome OS, Android, and Android WebView)?

Yes


Is this feature fully tested by web-platform-tests?

It is intended to be tested by web platform tests.

https://crbug.com/981970


Link to entry on the Chrome Platform Status

https://chromestatus.com/feature/5730772021411840


Matt Menke

Jul 31, 2019, 3:53:35 PM
to blink-dev
We're not planning to send an Intent to Implement for each feature we intend to split by this key, but we're also planning on splitting socket pools, the SSL session cache, reporting uploads, and HTTP server properties (like which servers support H2/QUIC - note that this does *not* include HSTS) by this key, keeping an eye out for perf regressions.

Chris Palmer

Jul 31, 2019, 6:05:35 PM
to blink-dev
I don't have LGTM power of course, but I do want to say that from a security perspective we consider this a major win. Thanks to the whole team working on this. :)

Daniel Bratell

Aug 1, 2019, 8:38:43 AM
to blink-dev, Chris Palmer
This sounds like a great feature and I hope it will simplify many other later changes to restrict data revealed between sites.

Sometimes bypassing the cache is faster than using the cache, so I'm not surprised that loading times didn't seem to change much. As an extreme, think of a slow, fragmented spinning disk compared to an ISP-located CDN.

I think you identified the largest web developer impact when you mentioned server load but I didn't expect even that impact to be large (maybe some of the CDNs can help with collecting data).

/Daniel
--
You received this message because you are subscribed to the Google Groups "blink-dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to blink-dev+...@chromium.org.
To view this discussion on the web visit https://groups.google.com/a/chromium.org/d/msgid/blink-dev/1f29f0a0-acc1-41e6-838b-90ad87baebdf%40chromium.org.



--
/* Opera Software, Linköping, Sweden: CEST (UTC+2) */

Bnaya Peretz

Aug 1, 2019, 11:54:31 AM
to Daniel Bratell, blink-dev, Chris Palmer

Was it considered to not do so if the response cache header is set to public, or to add a magic opt-in global cache header?


Josh Karlin

Aug 1, 2019, 2:20:22 PM
to Bnaya Peretz, Daniel Bratell, blink-dev, Chris Palmer
On Thu, Aug 1, 2019 at 11:54 AM Bnaya Peretz <m...@bnaya.net> wrote:

> Was it considered to not do so if the response cache header is set to public, or to add a magic opt-in global cache header?

We've certainly considered it. We'd like to avoid allowing for an opt-out of cache partitioning due to the privacy risks that remain and are sometimes difficult to reason about. 

Example 1) With opt-out you could continue to use the cache to store super-cookies.

Example 2) Let's say a CDN makes jquery-X.Y public because, hey, it's just a static copy of the latest jQuery. But two years from now only one major website uses jquery-X.Y. Now, detecting its presence in the cache reveals that this user visits this site.


 

Brandon Maslen

Aug 5, 2019, 7:29:43 PM
to blink-dev, m...@bnaya.net, bra...@opera.com, pal...@google.com
This looks like a good change to help mitigate unintentional information disclosure and the initial cache hit results do look promising. I'm curious to see if the metrics hold out when the change is exposed to a larger audience with potentially varying browsing patterns.

It looks like the thread at https://github.com/whatwg/fetch/issues/904 briefly touches on the topic of Safari partitioning on eTLD+1 vs origin as well as the unknown factor of not using origins. Was there any investigation or discussion into the feasibility of using the top-frame site (eTLD+1) as part of the double key to help balance potential cache-misses with providing adequate partitioning?

Brandon

PhistucK

Aug 6, 2019, 4:23:32 AM
to Brandon Maslen, blink-dev, Bnaya Peretz, Daniel Bratell, Chris Palmer
Are there plans to somehow partition the (service worker) cache API? If yes, this might have very observable effects and if not, this HTTP cache partitioning will only help in simple cases, right?

PhistucK



Ben Kelly

Aug 6, 2019, 9:26:12 AM
to PhistucK, Brandon Maslen, blink-dev, Bnaya Peretz, Daniel Bratell, Chris Palmer
On Tue, Aug 6, 2019 at 4:23 AM PhistucK <phis...@gmail.com> wrote:
> Are there plans to somehow partition the (service worker) cache API? If yes, this might have very observable effects and if not, this HTTP cache partitioning will only help in simple cases, right?

The service worker cache API is part of quota-managed storage, like IDB, etc. They would all need to be partitioned together. Other browsers that partition the HTTP cache also partition quota-managed storage. I don't think that is part of this Chromium effort yet, though.
 

David Benjamin

Aug 6, 2019, 12:06:04 PM
to Brandon Maslen, blink-dev, m...@bnaya.net, Daniel Bratell, Chris Palmer
The security boundary on the web is the origin, not eTLD+1. eTLD+1 relies on the public suffix list, which is rather a mess. It risks problems with domains that aren't on the PSL but should be, different clients with differently out-of-date lists, opaque origins, having to partition by URL scheme separately, etc.

I think the origin is a much better starting point here. You're right that eTLD+1 is a candidate to consider should this change be too aggressive to deploy. But let's leave that as a backup plan of last resort. In general, relying on eTLD+1 is less secure, less private, and less forward-looking.


Matt Menke

Aug 6, 2019, 12:11:49 PM
to David Benjamin, Brandon Maslen, blink-dev, m...@bnaya.net, Daniel Bratell, Chris Palmer
Another issue is that we'd need to split out HTTP vs HTTPS, so we'd actually be introducing a new concept of (eTLD+1 + scheme | origin-if-opaque).
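A rough sketch in Python of what such a (eTLD+1 + scheme | origin-if-opaque) key could look like. The `PUBLIC_SUFFIXES` set is a tiny hypothetical stand-in for the real public suffix list, and `registrable_domain`, `scheme_and_site`, and the `opaque_id` parameter are names invented for this illustration, not an existing API.

```python
from urllib.parse import urlsplit

# Hypothetical, tiny stand-in for the public suffix list (PSL).
PUBLIC_SUFFIXES = {"com", "org", "co.uk"}


def registrable_domain(host):
    """Return eTLD+1 for host, or None if no suffix matches or host is a suffix."""
    labels = host.split(".")
    # Scanning from the left finds the longest matching suffix first.
    for i in range(len(labels)):
        if ".".join(labels[i:]) in PUBLIC_SUFFIXES:
            if i == 0:
                return None  # the host is itself a public suffix
            return ".".join(labels[i - 1:])
    return None


def scheme_and_site(url, opaque_id=None):
    """Build the partition key: (scheme, eTLD+1), or a unique key if opaque."""
    if opaque_id is not None:
        return ("opaque", opaque_id)
    parts = urlsplit(url)
    host = parts.hostname or ""
    return (parts.scheme, registrable_domain(host) or host)


key_https = scheme_and_site("https://www.example.co.uk/x")
key_http = scheme_and_site("http://www.example.co.uk/x")
key_opaque = scheme_and_site("about:blank", opaque_id="nonce-1")
```

Note how the result depends entirely on the contents of the suffix list: a host whose suffix is missing from the list falls back to the full hostname, which is one of the problems with relying on the PSL raised earlier in the thread.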


Anne van Kesteren

Aug 6, 2019, 12:33:41 PM
to Matt Menke, David Benjamin, Brandon Maslen, blink-dev, m...@bnaya.net, Daniel Bratell, Chris Palmer
On Tue, Aug 6, 2019 at 6:11 PM Matt Menke <mme...@chromium.org> wrote:
> Another issue is that we'd need to split out HTTP vs HTTPS, so we'd actually be introducing a new concept of (eTLD+1 + scheme | origin-if-opaque).

To be clear, I'm extremely in favor of top-level origin here and
that's the consensus in the standards discussion as I understand it,
but I'm pretty sure we have a number of places using "scheme-and-site"
as I like to call it (e.g., agent clusters (document.domain) and CORP
come to mind).

kevinc...@gmail.com

Aug 12, 2019, 11:35:11 AM
to blink-dev
More of an observation than a concern: This would negate one of the common reasons for using JavaScript CDNs. One of the most commonly stated reasons for using one of the major JavaScript CDNs is the likelihood of users already having the JavaScript in their cache.

While some of the other benefits of CDNs would still exist (less bandwidth on the main site, possibly closer to end user than the origin server, etc), those can all be accomplished by general purpose CDNs (especially the easy to use reverse proxy CDNs), which many of these same sites are already using anyway.

It is not rare for a site to only be using the JavaScript CDN for the likelihood of the script already being in cache.

-------------

On the other hand, here is a more real concern:

What would the impact of this change be on cache storage? Would there be content-body de-duplication or anything? If not, I am concerned that this would make the cache hit its storage limit much faster and start evicting entries much sooner, which could really compound the performance impact this has.
Obviously, no matter what, this will make the cache larger, since headers and keys take up non-zero storage, but if, say, the cache ends up storing a complete copy of a large but popular woff font from Google Fonts for every page that uses it, that could be a pretty substantial impact.
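The content-body de-duplication idea can be sketched in Python as a partitioned index over a content-addressed body store. The `DedupCache` class is hypothetical and not how any browser cache is known to work; whether sharing bodies this way would open its own side channels is not addressed in this thread.

```python
import hashlib


class DedupCache:
    """Toy partitioned cache whose bodies are stored once per distinct content."""

    def __init__(self):
        self._index = {}   # (partition, url) -> content digest
        self._bodies = {}  # content digest -> body bytes

    def put(self, partition, url, body):
        digest = hashlib.sha256(body).hexdigest()
        self._bodies.setdefault(digest, body)  # store the body only once
        self._index[(partition, url)] = digest

    def get(self, partition, url):
        # Lookups still go through the per-partition index, so a partition
        # that never fetched the URL gets a miss.
        digest = self._index.get((partition, url))
        return None if digest is None else self._bodies[digest]

    def stored_body_bytes(self):
        return sum(len(body) for body in self._bodies.values())


font_url = "https://fonts.example/popular.woff"  # hypothetical URL
font = b"x" * 1000
cache = DedupCache()
for site in ("https://a.example", "https://b.example", "https://c.example"):
    cache.put(site, font_url, font)

total_bytes = cache.stored_body_bytes()
miss = cache.get("https://d.example", font_url)
```

Three partitions hold the font, but only one copy of the body is stored; the index entries (keys and digests) are still duplicated per partition.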



Josh Karlin

Aug 12, 2019, 3:41:37 PM
to kevinc...@gmail.com, blink-dev
Thanks for the feedback Kevin. We're keeping a close eye on the impact that partitioning the cache has on both page load performance and overall network usage. So far the impact is quite small in aggregate (stats are in the original post), suggesting that the cache's performance isn't severely impacted and that content deduplication is unnecessary at this point. 


Bnaya Peretz

Aug 12, 2019, 5:01:46 PM
to kevinc...@gmail.com, blink-dev
Not only that, but also Wasm compiled code and JS bytecode won't be shared.
It's another big perf and resource loss.

Kinuko Yasuda

Aug 12, 2019, 6:40:09 PM
to Bnaya Peretz, kevinc...@gmail.com, blink-dev
On Tue, Aug 13, 2019 at 6:01 AM Bnaya Peretz <m...@bnaya.net> wrote:
> Not only that, but also Wasm compiled code and JS bytecode won't be shared.
> It's another big perf and resource loss.

Wasm and JS compiled code is already partitioned and stored separately from the HTTP cache, so this shouldn't affect that.
 

fksc...@gmail.com

Aug 15, 2019, 11:19:14 AM
to blink-dev, kevinc...@gmail.com

Context: I created and operate https://pika.dev/cdn.


First off, thanks for the hard work and thoughtful considerations you all have put into this important security feature so far.


To build on Kevin's post, this suggested change would kill the cache efficiency story of cross-origin CDNs, which is troubling to me for a few reasons. The ability to cache resources across domains is a huge benefit to using cross-origin CDNs at scale. The more use that they get, the more likely cache hits are, and the faster those websites become (potentially loading instantly, or at the very least faster than if they'd served JS on their own origin). Two examples:


1. "This month, July 2019, cdnjs served almost 190 billion requests ... Lodash (4.17.11) skyrocketed to the top of the list this month with 8.7 billion requests."[1]
I imagine the cache efficiency lost due to this change for this CDN alone (jQuery, lodash, etc) will be massive.


2. "Approximately 100% of the Fortune 500 already use npm to acquire approximately 97% of their JavaScript code." [2]
Pika is creating a CDN for modern npm packages that can run in the browser. The project is only a few months old today, but with ESM it becomes feasible for sites to load their npm dependencies from our CDN (or UNPKG, or another cross-origin CDN like it) in production. Basically, cdnjs for npm. In that world, every npm package would only be loaded once across all participating sites, and would then be cached and reused on future visits. Imagine if most sites never had to load React, ReactDOM, Preact, Vue, the 100 most popular npm packages, etc.


Obviously security is a huge concern, and I completely understand and appreciate the work being done here. But I'd want to make sure that an important performance story on the web isn't accidentally destroyed in the process. 


If this proposal does continue to move forward, I'd at least want an opt-in proposal discussed, either via the existing Cache-Control header, a new header, or some other mechanism. I do not believe that either of the two concerns outlined above was reasonably serious: We're talking about a small number of CDN-related cookies, and in practice the "Detect if a user has visited a specific site" attack surface would be negligible (and again, opt-in). I'm happy to contribute / get involved if time & effort is a blocker here.


Thanks again, 

- FKS


tldr: https://twitter.com/rektide/status/1156261839623401472


Chris Palmer

Aug 15, 2019, 3:36:56 PM
to fksc...@gmail.com, blink-dev, kevinc...@gmail.com
Maybe I'm missing it, but scrolling up, I don't see anyone showing the increase in cache misses. We do have some numbers that I've seen in internal documents, and they are happily much lower than you might think. Googlers, can we publish them?


shiva...@google.com

Aug 19, 2019, 3:49:23 PM
to blink-dev, fksc...@gmail.com, kevinc...@gmail.com


On Thursday, August 15, 2019 at 3:36:56 PM UTC-4, Chris Palmer wrote:
> Maybe I'm missing it, but scrolling up, I don't see anyone showing the increase in cache misses. We do have some numbers that I've seen in internal documents, and they are happily much lower than you might think. Googlers, can we publish them?
 
From the original mail: "cache hit rate drops by about 4% but changes to first contentful paint aren’t statistically significant and the overall fraction of bytes loaded from the cache only drops from 39.1% to 37.8%." 

fksc...@gmail.com: Thanks for your inputs. We are continuing to look at the changes in cache hit rate as the experiment expands to a larger population.
 


r...@fortawesome.com

Aug 26, 2019, 6:19:42 PM
to blink-dev, fksc...@gmail.com, kevinc...@gmail.com


On Thursday, August 15, 2019 at 2:36:56 PM UTC-5, Chris Palmer wrote:
> Maybe I'm missing it, but scrolling up, I don't see anyone showing the increase in cache misses. We do have some numbers that I've seen in internal documents, and they are happily much lower than you might think. Googlers, can we publish them?

I second this request. Can the methods, population sample size, and raw results be published for review with regard to cache hit rates?

I share the same concerns that the pika.dev/cdn operator brought up. I'm part of the Font Awesome team and our assets are normally in the top 10 most requested resources from a lot of the CDNs out there (up there with jQuery, Lodash, etc).

The 4% number doesn't sound like a big deal on the surface. But how does that break down? Is that stat filtered down to only CDN usage, or is it broad (including resources that are cached directly from the top frame origin)?

As FKS mentioned the security benefits are great and I don't think anyone can argue against them. My concern is that the true impact of this may not be understood until it's too late.

I'm also curious if a solution can be created that still allows some level of shared cache without the security problems. Could I set a `Cache-Control` header that marked the content to be shared across all origins? If so the trade-off might be no cookies are allowed to be set or sent. That's an easy "yes" for us because we don't set cookies anyway.

Anne van Kesteren

Aug 27, 2019, 4:40:05 AM
to r...@fortawesome.com, blink-dev, fksc...@gmail.com, kevinc...@gmail.com
On Tue, Aug 27, 2019 at 12:19 AM rob via blink-dev
<blin...@chromium.org> wrote:
> I'm also curious if a solution can be created that still allows some level of shared cache without the security problems. Could I set a `Cache-Control` header that marked the content to be shared across all origins? If so the trade-off might be no cookies are allowed to be set or sent. That's an easy "yes" for us because we don't set cookies anyway.

That continues to have the same problem (as I think was already
mentioned upthread). It's not up to you to decide that it's okay to
reveal that a user visited a site.

Ben Kelly

Aug 27, 2019, 11:02:14 AM
to Anne van Kesteren, r...@fortawesome.com, blink-dev, fksc...@gmail.com, kevinc...@gmail.com
Is the risk reduced if we can determine the resource is linked from a statistically significant number of sites in the wild? For example, some list of hashes of "widely shared resources" based on HTTP Archive, etc.

Sorry if this was discussed previously.

Ben

Anne van Kesteren

Aug 27, 2019, 1:02:35 PM
to Ben Kelly, r...@fortawesome.com, blink-dev, fksc...@gmail.com, kevinc...@gmail.com
On Tue, Aug 27, 2019 at 5:02 PM Ben Kelly <wande...@chromium.org> wrote:
> Is the risk reduced if we can determine the resource is linked from a statistically significant number of sites in the wild? For example, some list of hashes of "widely shared resources" based on HTTP Archive, etc.

Things that come to mind:

1. Building the infrastructure to support such assertions.
2. Authorities that block N-M of those sites and are interested in who
visits the remainder.
3. Or instead of authorities, there might be other groupings of such
sites that are not immediately apparent, such that only one of them is
in a different language and likely has a different audience, etc.

At the last W3C TPAC there was some hallway discussion among
implementers on how to do cross-site caches and nobody was really able
to come up with a privacy-preserving scheme. Doesn't mean there isn't
one of course, but it seems quite tricky.

Chris Palmer

Aug 27, 2019, 1:33:09 PM
to Anne van Kesteren, Ben Kelly, r...@fortawesome.com, blink-dev, fksc...@gmail.com, kevinc...@gmail.com
Also, we here don't have all the information necessary to determine what the full risk is, what the consequences are, and how much reduction of risk is 'enough'.


fksc...@gmail.com

Aug 27, 2019, 1:38:18 PM
to blink-dev, ann...@annevk.nl, wande...@chromium.org, r...@fortawesome.com, fksc...@gmail.com, kevinc...@gmail.com
I agree, some sort of "safelist" seems tough on multiple fronts.

> That continues to have the same problem (as I think was already
> mentioned upthread). It's not up to you to decide that it's okay to
> reveal that a user visited a site.

To challenge this assertion: As a cross-site CDN owner with cross-site caching enabled (either always on or through some opt-in header), you are signaling to site owners that it has this small potential vulnerability. And then that site owner has the choice to either use that service, or not. The nice thing about public CDNs is that they are rarely (never?) the only option, and a security conscious user could host those assets themselves.

I see that as more of an agreement between the service and the site owner, making this call based on the context & information that they have. 

The real risk (as I understand it) that we're all working to mitigate is that this is currently on for everyone by default, and not that public CDNs alone are good targets for this sort of attack & worth the performance hit. 


Christopher Thompson

Aug 27, 2019, 1:50:19 PM
to fksc...@gmail.com, blink-dev, ann...@annevk.nl, wande...@chromium.org, r...@fortawesome.com, kevinc...@gmail.com
On Tue, Aug 27, 2019 at 10:38 AM <fksc...@gmail.com> wrote:
> I agree, some sort of "safelist" seems tough on multiple fronts.
>
>> That continues to have the same problem (as I think was already
>> mentioned upthread). It's not up to you to decide that it's okay to
>> reveal that a user visited a site.
>
> To challenge this assertion: As a cross-site CDN owner with cross-site caching enabled (either always on or through some opt-in header), you are signaling to site owners that it has this small potential vulnerability. And then that site owner has the choice to either use that service, or not. The nice thing about public CDNs is that they are rarely (never?) the only option, and a security conscious user could host those assets themselves.

Isn't the problem that it is the end user who would have to make this decision, not the site owner?
 


rek...@gmail.com

Aug 29, 2019, 11:32:26 AM
to blink-dev
I guess this is the proof we need that Google Chrome is not in the pocket of Google Fonts. I foresee much much much font redownloading in the future. I wonder how many extra MB or GB of space a user's cache will take up after this change.

I am distressed & saddened that the idea of a cache is being so heavily regressed for a couple of very niche security concerns that would be better served by some specific content security policy directives. This move feels like an affront to the harmonious web, a shattering & breaking up of a long valued & treasured piece. It breaks the idea of a CDN completely. This is remarkably bold, for so little, such niche tiny wins.

Dominic Farolino

Aug 29, 2019, 11:58:49 AM
to blink-dev
> I wonder how many extra MB or GB of space a user's cache will take up after this change.

I guess similar to the average cache size of Safari users, since Chrome isn't the first implementation to explore making this sort of change. Also you'll probably be interested in https://groups.google.com/forum/#!msg/mozilla.dev.platform/eFx-93iBPpU/Hs4jUZRgDgAJ.

I know a lot of people have been asking about the current metrics Chrome has collected, and for a deeper explanation as to why the team doesn't think this change will be as detrimental as some think. The current metrics that have been recorded are pretty early, so I believe the team is simply waiting for the data to become more representative of reality before publishing as much as possible to the public, to help everyone better understand the anticipated impact.

Kenji Baheux

Aug 29, 2019, 9:49:20 PM
to blink-dev
For reference, this doc, started by Mozilla, captures the discussions around privacy-preserving cross-origin caching that occurred at the last TPAC.



--
Kenji BAHEUX
Product Manager - Chrome
Google Japan

Shivani Sharma

Sep 18, 2019, 4:16:22 PM
to blink-dev
Here is the explainer for this work.

It goes into the details of the metrics in the section "Impact on metrics" for various categories like network traffic, page performance and cache misses. It also gives the metrics for specific types of resources like third-party fonts, CSS and JS files, although at this time the third-party resource metrics are fairly new, so we will watch how these numbers change over the next few weeks and update them here.


Rick Byers

Sep 18, 2019, 9:14:24 PM9/18/19
to Shivani Sharma, blink-dev
Thanks for sharing these details Shivani!

One concern I have is that double (or triple) keying caches may hurt smaller sites disproportionately more than big sites if they rely more heavily on 3p resources. If that were the case, it might be lost in our metrics, since the majority of page loads in Chrome are for the most popular sites. Would it be possible to slice the analysis in the explainer in some way (e.g. top-10k origins vs. not) to try to test and quantify this?

Thanks,
  Rick
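
For readers unfamiliar with the term, double keying can be pictured with a toy model. This is only a sketch under stated assumptions: `ToyHttpCache`, `origin` and the `.example` URLs are hypothetical names made up for illustration, and nothing here reflects Chrome's actual cache code. It shows why a 3p resource shared by two sites is fetched once with a single-keyed cache but once per top-frame origin with a double-keyed one.

```python
from urllib.parse import urlsplit


def origin(url):
    """Return scheme://host[:port] for a URL."""
    parts = urlsplit(url)
    return f"{parts.scheme}://{parts.netloc}"


class ToyHttpCache:
    """Toy cache model (hypothetical, not Chrome's implementation)."""

    def __init__(self, partitioned):
        self.partitioned = partitioned
        self.entries = set()
        self.hits = 0
        self.misses = 0

    def key(self, top_frame_url, resource_url):
        # Double keying: the top-frame origin becomes part of the cache key.
        if self.partitioned:
            return (origin(top_frame_url), resource_url)
        # Single keying: the resource URL alone identifies the entry.
        return (resource_url,)

    def load(self, top_frame_url, resource_url):
        """Return True on a cache hit, False on a miss (entry is then stored)."""
        k = self.key(top_frame_url, resource_url)
        if k in self.entries:
            self.hits += 1
            return True
        self.misses += 1
        self.entries.add(k)
        return False


FONT = "https://fonts.example/roboto.woff2"  # hypothetical shared 3p resource

for partitioned in (False, True):
    cache = ToyHttpCache(partitioned)
    cache.load("https://site-a.example/", FONT)
    cache.load("https://site-b.example/page", FONT)
    print(partitioned, cache.hits, cache.misses)
```

In this model the second site gets a hit only when the cache is not partitioned, which is the extra cost for 3p-heavy sites that the question above asks to quantify.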


fksc...@gmail.com

Sep 20, 2019, 11:19:02 AM9/20/19
to blink-dev, shiva...@chromium.org
+1, thanks for putting that document together, Shivani, to give the rest of us some insight into the data you're seeing.

The data seems to show an impact on 3rd-party cache hits/misses of anywhere between ~4-12%, depending on strategy & content type. That is significant, and I'm really surprised to see that not translate into any statistically significant change to page performance. That seems to imply that caching is not strongly related to page performance, which can't be true... right? Am I missing something obvious? Or, to echo Rick's comment, could the type of site profiled be impacting the data in some unexpected way?

- FKS


Ryan Hamilton

Sep 21, 2019, 1:17:20 AM9/21/19
to fksc...@gmail.com, blink-dev, shiva...@chromium.org
On Fri, Sep 20, 2019 at 8:19 AM <fksc...@gmail.com> wrote:
> The data seems to show an impact on 3rd-party cache hits/misses of anywhere between ~4-12%, depending on strategy & content type. That is significant, and I'm really surprised to see that not translate into any statistically significant change to page performance. That seems to imply that caching is not strongly related to page performance, which can't be true... right? Am I missing something obvious? Or, to echo Rick's comment, could the type of site profiled be impacting the data in some unexpected way?

While I agree that it seems ... "not right", I can't say that it shocks me. In QUIC, we initially persisted 0-RTT information in the disk cache to save an RTT on the first connection to a server after restart. We eventually ran an experiment where we removed this logic and saw no change in performance. Some (not amazingly thorough but nonetheless informative) testing indicated that sometimes the disk is as far away as the server.

I'd love to see an experiment where we completely disable the disk cache and see what impact that has on our metrics.

Cheers,

Ryan

Shivani Sharma

Sep 23, 2019, 2:57:42 PM9/23/19
to blink-dev

Shivani Sharma

Oct 7, 2019, 10:18:04 AM10/7/19
to Rick Byers, blink-dev
On Wed, Sep 18, 2019 at 9:14 PM Rick Byers <rby...@chromium.org> wrote:
> Would it be possible to slice the analysis in the explainer in some way (e.g. top-10k origins vs. not) to try to test and quantify this?

Thanks!
I am looking into whether some of the core metrics can be sliced between popular and other pages.

Shivani Sharma

Jun 15, 2020, 11:18:46 AM6/15/20
to Rick Byers, blink-dev
We analyzed the metrics for top 1/3rd, middle 1/3rd and tail 1/3rd sites, by usage, and for all of these subsets, the regressions in first and largest contentful paint are <1% in most quantiles and <= 0.5% at the median.

kundal...@gmail.com

Jul 10, 2020, 9:02:08 PM7/10/20
to blink-dev, kevinc...@gmail.com
To account for efficiency concerns, the answer is simple: implement a feature like localcdn, "A web browser extension that emulates Content Delivery Networks to improve your online privacy. It intercepts traffic, finds supported resources locally, and injects them into the environment." This improves both privacy and load times.
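
The "finds supported resources locally" step can be pictured roughly as a lookup from a CDN URL to a bundled file. This is only a sketch under stated assumptions: the `cdn.example` host, the URL layout, the `LOCAL_RESOURCES` table and `resolve_locally` are all hypothetical, and the extension's real matching rules are not described in this thread.

```python
import re

# Hypothetical table of bundled copies, keyed by (library, version).
LOCAL_RESOURCES = {
    ("jquery", "3.4.1"): "resources/jquery/3.4.1/jquery.min.js",
    ("lodash.js", "4.17.15"): "resources/lodash.js/4.17.15/lodash.min.js",
}

# Assumed CDN URL layout: https://cdn.example/<library>/<version>/<file>
CDN_PATTERN = re.compile(
    r"^https://cdn\.example/(?P<lib>[a-z0-9.\-]+)/(?P<version>\d+(?:\.\d+)*)/"
)


def resolve_locally(url):
    """Return the bundled path for a supported CDN URL, else None."""
    match = CDN_PATTERN.match(url)
    if match is None:
        return None  # not a recognised CDN URL: let the request go out
    key = (match.group("lib"), match.group("version"))
    return LOCAL_RESOURCES.get(key)  # None when this version is not bundled


print(resolve_locally("https://cdn.example/jquery/3.4.1/jquery.min.js"))
print(resolve_locally("https://cdn.example/jquery/9.9.9/jquery.min.js"))
```

A request that resolves to a local path never reaches the network, so it neither reveals the visit to the CDN nor depends on the HTTP cache; anything unresolved falls through to a normal fetch.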

localcdn currently emulates the following locally, including a number of versions/variations of each:

algoliasearch
algoliasearch3.33.0_algoliasearchLite_algoliasearchHelper.jsm
angucomplete-alt
angular-bootstrap-colorpicker
angular-material
angular-payments
angular-stripe-checkout
angular-ui-bootstrap
angular-ui-router
angular-ui-select
angular-ui-utils
angularjs
angularjs-slider
angularjs-toaster
animate.css
autocomplete.js
backbone.js
bootbox.js
bootstrap-3-typeahead
bootstrap-datepicker
bootstrap-daterangepicker
bootstrap-select
bootstrap-slider
bootstrap.css
bootstrap.js
chart.js
clipboard.js
d3
d3-legend
dojo
ember.js
ethjs
ext-core
fancybox
findify-bundle
flv.js
fontawesome
google-material-design-icons
hls.js
jquery
jquery-csv
jquery-jeditable
jquery-migrate
jquery-mobile
jquery-modal
jquery-tablesorter
jquery-validate
jquery.blockUI
jquery.devbridge-autocomplete
jquery.lazyload
jqueryui
js-cookie
jsdelivr-combine-jquery-hogan-algoliasearch-autocomplete.jsm
lazysizes
libphonenumber-js
lodash.js
lozad.js
materialize
mdbootstrap
mirage2
modernizr
moment.js
mootools
nvd3
oclazyload
p2p-media-loader-core
page.js
plyr
popper.js
prototype
raven.js
react
react-dom
rickshaw
rocket-loader
scriptaculous
select2
showdown
simplemde
slick-carousel
socket.io
spin.js
store.js
swfobject
swiper
tether
toastr.js
twitter-bootstrap
underscore.js
urlive
urlize
vue
webcomponentsjs
webfont
webrtc-adapter
wow