Shared Cache is Going Away

by jefftk, 1st Nov 2019

Browsers have historically had a single HTTP cache. This meant that if www.a.example and www.b.example both used cdn.example/jquery-1.2.1.js then jQuery would only be downloaded once. Since it's the same resource regardless of which site initiates the download, a single shared cache is more efficient. [1]
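As a rough illustration of the behavior described above, here is a toy model of a cache keyed by URL alone. This is a sketch, not how any browser is implemented; the class and method names are made up for the example.

```python
class SharedCache:
    """Toy model of a single browser-wide HTTP cache keyed by URL alone."""

    def __init__(self):
        self._store = {}
        self.downloads = 0  # number of times we had to go to the network

    def fetch(self, top_level_site, url):
        # top_level_site is accepted but ignored: the cache is shared across sites.
        if url not in self._store:
            self.downloads += 1
            self._store[url] = f"<body of {url}>"
        return self._store[url]


cache = SharedCache()
cache.fetch("www.a.example", "cdn.example/jquery-1.2.1.js")
cache.fetch("www.b.example", "cdn.example/jquery-1.2.1.js")
print(cache.downloads)  # 1: the second site reuses the first site's download
```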

Unfortunately, a shared cache enables a privacy leak. Summary of the simplest version:

  • I want to know if you're a moderator on www.forum.example.
  • I know that only pages under www.forum.example/moderators/private/ load www.forum.example/moderators/header.css.
  • When you visit my page I load www.forum.example/moderators/header.css and see if it came from cache.

Versions of this have been around for a while, but in March 2019 Eduardo Vela disclosed a way to make it much more powerful and reliable. Browsers are responding by partitioning the cache (Chrome, Firefox; Safari already had). [2] It's not clear to me from reading the bugs when it will launch, but it does sound soon. [3]
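To make the leak and the fix concrete, here is a minimal simulation, assuming a partitioned cache simply adds the top-level site to the cache key. Real browsers use more detailed keys, and attacker.example is a hypothetical name for "my page" in the example above.

```python
class Cache:
    """Toy cache that is keyed by URL, or by (top-level site, URL) when partitioned."""

    def __init__(self, partitioned):
        self.partitioned = partitioned
        self._keys = set()

    def _key(self, top_level_site, url):
        return (top_level_site, url) if self.partitioned else url

    def load(self, top_level_site, url):
        """Load url on behalf of top_level_site; return True on a cache hit."""
        key = self._key(top_level_site, url)
        hit = key in self._keys
        self._keys.add(key)
        return hit


HEADER = "www.forum.example/moderators/header.css"


def probe_sees_hit(partitioned):
    cache = Cache(partitioned)
    # The moderator visits the private page, which loads the stylesheet.
    cache.load("www.forum.example", HEADER)
    # Later, the attacker's page loads the same URL and checks for a hit.
    return cache.load("attacker.example", HEADER)


print(probe_sees_hit(partitioned=False))  # True: the shared cache leaks the visit
print(probe_sees_hit(partitioned=True))   # False: the attacker has its own partition
```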

What does this mean for developers? The main thing is that there's no longer any advantage to trying to use the same URLs as other sites. You won't get performance benefits from using a canonical URL over hosting on your own site (unless they're on a CDN and you're not) and you have no reason to use the same version as everyone else (but staying current is still a good idea).

I'm sad about this change from a general web performance perspective and from the perspective of someone who really likes small independent sites, but I don't see a way to get the performance benefits without the leaks.


[1] When I worked on mod_pagespeed, rewriting web pages so they would load faster, we had an opt-in feature to Canonicalize JavaScript Libraries.

[2] I was curious whether this had launched yet, so I made a pair of test pages and tried them out in WebPageTest for Chrome Canary and Firefox Nightly, but it's not out yet. I used a WPT script consisting of:

navigate https://www.trycontra.com/test/cache-partition
navigate https://www.bidadance.org/test/cache-partition

[3] Firefox's bug is marked "fixed" and "Milestone: mozilla70", but I have Firefox 70.0.1 and it doesn't seem to be enabled.

Perhaps this is just the code change and they still need a flag flip? I don't know how Firefox does this.

6 comments

"I'm sad about this change ... from the perspective of someone who really likes small independent sites"

All I know about this topic is what I just read from you... But should I regard this as a plot by Big Tech to further centralize the web in their clouds? Or is it more the reverse, meant to protect the user from evil small sites?

I don't think there's anything nefarious going on, and these are real privacy leaks. The change happens to hurt sites that people move between quickly more than sites they stay on for a long time. That's just a side effect, but it still pushes the web in a more centralized direction.

This applies at a lot of levels in networking and computing. Various types of timing attack (https://en.wikipedia.org/wiki/Timing_attack) have been used to break memory, process, and virtualization boundaries. And in some cases, even air-gapped separate systems can leak information - cache partitioning doesn't help you if the attacker is timing your page-load time to see if your visit to a page is your first one today.

"I'm sad about this change ... from the perspective of someone who really likes small independent sites"

Honestly, this is for the best. jQuery and other JS/CSS CDNs need to go away. They never (ever) made good on their promise: using them doesn't really increase the hitrate of those resources. This is true for a few reasons:

1. Fragmentation. There are so many versions of the common libraries -- and variations of those versions -- that it's unlikely that a visitor to your site has loaded your particular reference resource from another site.

2. Local cache is surprisingly ineffectual for users that don't show up to your site regularly. Browsers have gotten really good at knowing what they ought to cache based on what sites that user is going to. Local cache is also pretty small and resources get pushed out pretty quickly -- especially as sites grow in size and users visit more sites every day. Unless somebody is visiting your site often, it's likely that local cache won't last more than a few hours.

3. HTTP/2 nearly eliminates the need to host assets on separate domains. HTTP/1.x limited the number of connections it would allow per host. If your site had a lot of small resources this could be a huge bottleneck, so we moved our resources to multiple domains to increase the number of connections. H2 uses a single connection per host and can send multiple resources over it at the same time. This massively increases performance, regardless of how many resources are being requested. In fact, with H2 it's faster to consolidate your resources instead of spreading them out.
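A back-of-the-envelope sketch of point 3, under loud assumptions: each HTTP/1.x connection handles one request at a time, the browser allows 6 connections per host (a common default, but treat it as an assumption here), and every request takes the same time. It ignores bandwidth and the extra DNS and connection setup that each additional domain costs.

```python
import math


def http1_rounds(num_resources, hosts=1, connections_per_host=6):
    """Sequential 'waves' of requests needed when each connection carries one request at a time."""
    parallel = hosts * connections_per_host
    return math.ceil(num_resources / parallel)


def http2_rounds(num_resources):
    """With multiplexing, all requests can be in flight on one connection at once."""
    return 1 if num_resources > 0 else 0


print(http1_rounds(60))           # 10 waves on a single host
print(http1_rounds(60, hosts=3))  # 4 waves after sharding across three domains
print(http2_rounds(60))           # 1 wave on a single multiplexed connection
```

In this simplified model, spreading resources across domains only helps HTTP/1.x; with H2 the single connection is already not the bottleneck.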


TL;DR: Local cache isn't what it's cracked up to be. jQuery and other CDNs aren't worth much anymore. Consolidate your resources behind a single domain and CDN and get even faster.

I believe a less performance-degrading solution would be for browsers to keep using the shared cache but still send the request anyway in a random 1-2% of cases.
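A minimal sketch of what this proposal could look like, assuming the decision is made independently per request. The function name and the 2% default are hypothetical, and this is not something any browser is known to implement.

```python
import random


def should_use_cache(is_cached, bypass_probability=0.02, rng=random):
    """Return True to serve from cache, False to send the request to the network."""
    if not is_cached:
        return False  # nothing cached, so the request has to go out
    # For cached resources, randomly go to the network a small fraction of the time.
    return rng.random() >= bypass_probability


rng = random.Random(0)  # seeded only so the demo is repeatable
trials = 100_000
bypassed = sum(not should_use_cache(True, rng=rng) for _ in range(trials))
print(f"{bypassed / trials:.1%} of cached requests went to the network")
```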

It is behind a flag - if you read the commit in the bug you linked, it defaults to ‘false’ and is controlled by the ‘browser.cache.cache_isolation’ preference setting.

https://phabricator.services.mozilla.com/rMOZILLACENTRALa5e791146ef541d794bb74ac730c0fa40985e4d4