Using HTTP Caching: 2021 Guide
The fastest website is the website that is already loaded, and that’s exactly what HTTP caching gives us. HTTP caching lets web browsers reuse previously loaded resources, like pages, images, JavaScript, and CSS. It’s a powerful tool to improve your web performance, but misconfiguration can cause big performance problems. Here’s what you need to know to use HTTP caching without reading hundreds of pages of the HTTP caching spec.
HTTP caching is controlled by headers returned as part of the server response. The most important of these is the Cache-Control header, which informs the browser how and when a resource may be cached. The Cache-Control header has many, many options that control caching behavior. But to avoid writing a novel, we’ll focus on the basics of controlling the cache, and give you some recipes for common scenarios.
How to use the Browser Cache
The browser calculates “Cache Freshness” using headers in the HTTP response. Cache freshness is how long a cached asset stays valid after it was downloaded. How freshness is calculated depends on which headers are returned.
The Cache-Control header has a number of directives to control caching behavior, but the most common is max-age. The max-age directive specifies how many seconds after download the cached resource is valid. Here’s an example:
# Cache this response for 10 minutes (600 seconds).
Cache-Control: max-age=600
The Expires header contains a date and time at which the cached resource should be marked stale, but it is only used if the response doesn’t already set max-age in Cache-Control. Expires determines freshness when the response also contains a Date header recording when the response was sent. The freshness lifetime is simply the Expires time minus the Date time.
# This response can be cached for 1 hour (Expires - Date == freshness).
Expires: Tue, 09 Nov 2021 21:09:28 GMT
Date: Tue, 09 Nov 2021 20:09:28 GMT
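To make the two freshness rules concrete, here is a minimal Python sketch of how a cache might pick a freshness lifetime from response headers. The function name freshness_lifetime and the plain dict of headers are assumptions for illustration. Real browsers handle many more directives and malformed dates, which this sketch does not.

```python
import re
from datetime import timedelta
from email.utils import parsedate_to_datetime


def freshness_lifetime(headers):
    """Return how long a response stays fresh, or None if the headers don't say."""
    cache_control = headers.get("Cache-Control", "")
    match = re.search(r"max-age=(\d+)", cache_control, re.IGNORECASE)
    if match:
        # max-age takes priority over Expires when both are present.
        return timedelta(seconds=int(match.group(1)))
    if "Expires" in headers and "Date" in headers:
        expires = parsedate_to_datetime(headers["Expires"])
        date = parsedate_to_datetime(headers["Date"])
        # An Expires time in the past means the response is already stale.
        return max(expires - date, timedelta(0))
    return None


print(freshness_lifetime({"Cache-Control": "max-age=600"}))
print(freshness_lifetime({
    "Expires": "Tue, 09 Nov 2021 21:09:28 GMT",
    "Date": "Tue, 09 Nov 2021 20:09:28 GMT",
}))
```

The first call uses max-age directly, and the second falls back to Expires minus Date.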
The Browser’s Automatic Caching
Even if you don’t use the Cache-Control or Expires header, most web browsers will cache resources automatically and guess how long they will remain fresh. This guessing is referred to as “heuristic freshness”. Usually, the guess is based on the Last-Modified header included automatically by most web servers. But each browser implements this differently, so it’s dangerous to rely on it for your caching.
One method that browsers use is to assume a resource is “fresh” for 10% of the time since the resource was last modified.
# Freshness = 2 hours (20 hours since last modified)
# (Date - Last-Modified) * 10% == freshness
Last-Modified: Tue, 09 Nov 2021 02:00:00 GMT
Date: Tue, 09 Nov 2021 22:00:00 GMT
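The same arithmetic can be written as a short Python sketch. The function name heuristic_freshness is hypothetical, and the 10% figure is just the rule of thumb described above, not something every browser follows.

```python
from datetime import timedelta
from email.utils import parsedate_to_datetime


def heuristic_freshness(last_modified, date):
    """Guess a freshness lifetime as 10% of the resource's age when it was served."""
    age_when_served = parsedate_to_datetime(date) - parsedate_to_datetime(last_modified)
    # Dividing by 10 is the "10% of the time since last modified" heuristic.
    return age_when_served / 10


print(heuristic_freshness(
    "Tue, 09 Nov 2021 02:00:00 GMT",
    "Tue, 09 Nov 2021 22:00:00 GMT",
))
```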
Check how your caching is configured right now! We made a neat tool that checks your HTTP cache settings.
Handling Expired Resources
What happens when a resource “expires”? An expired resource is referred to as a “stale resource”, and the browser must re-validate it with the server. In some cases, the browser can validate the resource without downloading it again. Otherwise, the browser re-downloads the entire resource and caches the new version.
There are a couple ways this validation can happen, depending on which HTTP Validation Headers are sent with your resources.
The ETag header allows the browser to tell the server what version it currently has. The header contains a string which uniquely identifies the content, usually a checksum of the file.
When a resource that had an ETag expires, the browser will send a validation request with an If-None-Match header containing the ETag value it already has. If the resource is unchanged, the server replies with an empty 304 (Not Modified) HTTP response. Otherwise, the server sends the full resource as normal.
# In original resource response headers:
ETag: "123abc987654"
# Browser sends in the validation request headers:
If-None-Match: "123abc987654"
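On the server side, the ETag exchange might look something like this Python sketch. The helpers make_etag and respond are hypothetical, the checksum choice (truncated SHA-256) is an assumption, and weak validators (W/ prefixes) and the * wildcard are deliberately ignored to keep it short.

```python
import hashlib


def make_etag(body: bytes) -> str:
    # A quoted checksum of the content; any stable, unique string would work.
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond(body: bytes, request_headers: dict):
    """Return (status, headers, body), honoring If-None-Match if it was sent."""
    etag = make_etag(body)
    sent = request_headers.get("If-None-Match", "")
    candidates = [tag.strip() for tag in sent.split(",")]
    if etag in candidates:
        # The browser already has this version: empty 304 response.
        return 304, {"ETag": etag}, b""
    return 200, {"ETag": etag}, body


status, headers, _ = respond(b"hello", {})
print(status, headers["ETag"])
print(respond(b"hello", {"If-None-Match": headers["ETag"]})[0])
```

The first request gets the full body with a 200, and the follow-up validation request with the same ETag gets a 304.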
When an ETag is unavailable, web servers may send a Last-Modified header, with the last modified date of the source file. Similar to ETags, the browser can send that date in a validation request with the If-Modified-Since header to tell the server which version it has.
The server returns an empty 304 (Not Modified) response if the content has not changed since the date specified.
# In original resource response headers:
Last-Modified: Tue, 09 Nov 2021 20:00:00 GMT
# Browser sends in the validation request headers:
If-Modified-Since: Tue, 09 Nov 2021 20:00:00 GMT
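The date-based check described above comes down to a single comparison. Here is a minimal Python sketch, assuming a hypothetical helper is_not_modified that receives the file's last-modified date and the If-Modified-Since value as HTTP date strings.

```python
from email.utils import parsedate_to_datetime


def is_not_modified(last_modified: str, if_modified_since: str) -> bool:
    """True when the server can answer 304 instead of resending the body."""
    # The content is unchanged if it was not modified after the browser's date.
    return parsedate_to_datetime(last_modified) <= parsedate_to_datetime(if_modified_since)


print(is_not_modified("Tue, 09 Nov 2021 20:00:00 GMT", "Tue, 09 Nov 2021 20:00:00 GMT"))
print(is_not_modified("Tue, 09 Nov 2021 21:30:00 GMT", "Tue, 09 Nov 2021 20:00:00 GMT"))
```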
No Validation
If the original resource had neither ETag nor Last-Modified headers, then the browser simply requests the entire resource and uses the result.
Busting the Browser’s Cache
When something changes, such as a new image, refreshed session, or an updated release of your code, you’ll want to invalidate (or bust!) the browser cache so that your users get the new stuff. If you’ve aggressively set caching headers, this can be challenging, but there are a couple ways to solve it.
1. Changing the URL to the Resource
The most common cache busting strategy is just to change the name of your resources when they change. This could be something like including a hash, version, or date in the filename when you build your site.
scripts.e7686eaf.min.js
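A build step could generate names like this from the file contents. The following Python sketch assumes a hypothetical hashed_name helper and uses a truncated SHA-256 digest. Real build tools each have their own naming scheme, so treat this as an illustration only.

```python
import hashlib
from pathlib import PurePosixPath


def hashed_name(filename: str, content: bytes, length: int = 8) -> str:
    """Insert a content hash after the first segment of the file name."""
    digest = hashlib.sha256(content).hexdigest()[:length]
    path = PurePosixPath(filename)
    # scripts.min.js -> scripts.<digest>.min.js
    stem, _, rest = path.name.partition(".")
    new_name = f"{stem}.{digest}.{rest}" if rest else f"{stem}.{digest}"
    return str(path.with_name(new_name))


print(hashed_name("js/scripts.min.js", b"console.log('v1');"))
print(hashed_name("js/scripts.min.js", b"console.log('v2');"))
```

Because the hash comes from the content, the name only changes when the file does, so unchanged files keep their cached copies.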
2. Adding a Query Parameter
If you can’t change the name of your resources, you can add a querystring parameter with a changeable key, like a version or date. When you change your site, or a resource, updating the querystring to a new value will invalidate all browser caches.
/my/images.png?v=2021119
If you have a look at the source of our page here, you’ll see that we use this strategy, adding a date representation of the build time to all our scripts and styles.
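If you generate resource URLs in code, a small helper can append the version for you. This Python sketch assumes a hypothetical with_version function and a parameter named v. Neither is required by HTTP, and any key works as long as its value changes.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_version(url: str, version: str, key: str = "v") -> str:
    """Return url with its cache-busting query parameter set to version."""
    parts = urlsplit(url)
    # Drop any previous version parameter, keep everything else.
    query = [(k, val) for k, val in parse_qsl(parts.query, keep_blank_values=True) if k != key]
    query.append((key, version))
    return urlunsplit(parts._replace(query=urlencode(query)))


print(with_version("/my/images.png", "2021119"))
```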
3. Using the Vary Header
The Vary header can be returned in resource responses to tell the browser when a response should be cached as a unique variation of the resource. It does this by naming one or more request headers to use as part of the cache key.
The browser will never be able to use its cached responses if those header values change on every request. Vary is often omitted entirely, and should be used carefully when needed.
# Good: A common value that should not impact caching
# Caches gzip vs non-gzip responses separately
Vary: Accept-Encoding
# Bad: Probably not what you want.
# Any change to X-App-Version will invalidate your cache!
Vary: X-App-Version
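One way to picture what Vary does is as extra fields in the cache key. This Python sketch is a simplification with a hypothetical cache_key function. Real caches store and match variants in more involved ways, but the effect on hits and misses is similar.

```python
def cache_key(url: str, request_headers: dict, vary: str) -> tuple:
    """Build a lookup key from the URL plus the request headers named by Vary."""
    names = sorted(name.strip().lower() for name in vary.split(",") if name.strip())
    # Header names are case-insensitive, so compare them in lowercase.
    lowered = {k.lower(): v for k, v in request_headers.items()}
    return (url,) + tuple((name, lowered.get(name, "")) for name in names)


gzip_key = cache_key("/app.js", {"Accept-Encoding": "gzip"}, "Accept-Encoding")
br_key = cache_key("/app.js", {"Accept-Encoding": "br"}, "Accept-Encoding")
print(gzip_key == br_key)
```

Different Accept-Encoding values produce different keys, so each encoding gets its own cache entry. A header whose value changes on every request would produce a new key every time, and nothing would ever be reused.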
HTTP Caching Recipes
Different resources are cached differently. Here’s how to accomplish a few common caching scenarios.
1. Never Cache Anything!
Some resources are dynamic or time sensitive and should never be cached. The no-store directive forces the browser to re-download the resource each and every time the user loads the page:
Cache-Control: no-store
2. Cache, But Always Revalidate
Some resources are cacheable, but change often enough that they should be re-validated before use. We can accomplish this with the confusingly named no-cache directive. The browser will request an updated version of the resource, but will accept a 304 (Not Modified) response to save download bandwidth.
Cache-Control: no-cache
# no-cache is equivalent to:
Cache-Control: max-age=0, must-revalidate
3. Cache For A Day
Some resources change, but do so slowly. Setting a “just right” max-age on these allows them to be cached but updated in a timely manner when changed. Don’t depend on max-age alone if it’s critical that the browser immediately uses a new version. Use a cache buster instead!
Cache-Control: max-age=86400
4. Cache “Forever”
You probably don’t want to do this unless you are using a cache-busting strategy. There isn’t actually a “forever” cache directive, but you can get close enough by specifying a very large max-age.
# Cache this resource for a year
Cache-Control: max-age=31536000
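Putting the four recipes together, a server could choose a Cache-Control value per request. The Python sketch below is only one possible policy. The cache_control_for function, the /api/ prefix, and the rule that a name containing eight or more hex characters is fingerprinted are all assumptions for illustration, so adjust them to your own site.

```python
import re

# Assumed convention: fingerprinted files contain a hex hash segment,
# like scripts.e7686eaf.min.js.
FINGERPRINTED = re.compile(r"\.[0-9a-f]{8,}\.")


def cache_control_for(path: str) -> str:
    """Pick one of the four recipes based on the request path."""
    if path.startswith("/api/"):
        return "no-store"            # 1. Never cache anything
    if path.endswith(".html") or path.endswith("/"):
        return "no-cache"            # 2. Cache, but always revalidate
    if FINGERPRINTED.search(path):
        return "max-age=31536000"    # 4. Cache "forever" (safe with cache busting)
    return "max-age=86400"           # 3. Cache for a day


for path in ["/api/session", "/index.html", "/js/scripts.e7686eaf.min.js", "/images/logo.png"]:
    print(path, "->", cache_control_for(path))
```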
Conclusion
That’s it! You can use these headers and recipes to greatly accelerate your website and save a ton of redundant download bandwidth. Proper caching can improve the way customers perceive your site’s performance. But don’t take our word for it. Monitor your website with Request Metrics to check and improve its performance.