How to get browsers AND Cloudflare to cache a single page on your WordPress site

October 29, 2018

I maintain the Send Later Thunderbird add-on, which has something on the order of 80,000 active users (exact numbers are unknown because statistics collection on addons.thunderbird.org has been broken for months). The user guide for Send Later, which includes the release notes, is published elsewhere on this blog. Whenever I publish a new major or minor release of the add-on, Thunderbird automatically downloads and installs the new version, and the updated add-on loads the user guide in a new tab to display the release notes to the user.

Add-on updates in Thunderbird happen in the background when it is restarted, so the updates — and subsequent attempts to fetch the user guide from this blog — tend to happen in clusters surrounding the beginning of the work day in various countries.

That means that 99% of the time my poor little blog server with only 4GB of RAM is humming along just fine, and then all of a sudden when I publish a new Send Later release it has to cope with thousands of requests for the user guide all crammed together in short time windows. It does not like this. I’ve written about this before.

It just happened again, and this time it was even more annoying than the last time since there are even more users of the add-on than there were last time. Since the server that hosts my blog also hosts other web services that I and others depend on, I decided it was time to take more drastic action. I therefore transferred my DNS domain to Cloudflare and told it to start caching requests for my blog. I hoped this would be enough to calm things down, by allowing Cloudflare to handle all of the requests for static files (images, JavaScript, CSS), leaving my server to handle only the request for the actual user guide page.

Alas, it was not enough, because the requests for the static files are actually really easy for my server to handle. It’s the request for the user guide page, which is dynamically generated by WordPress, that takes most of the work and therefore generates most of the load on the server.

So this morning I set out on a mission: figure out how to convince Cloudflare to cache the Send Later user guide page and only the Send Later user guide page so my server doesn’t even see most of the requests for it. This took me a lot longer to figure out than I would have hoped, so I’m posting about it here on the off chance that someone else will benefit from my experience.

That introduction was way longer than it should be, so I’m going to jump right in and tell you exactly what I ended up doing, and then I’ll explain why I did it this way and in particular all the other things I tried that didn’t work.

What works

I run WordPress under Apache httpd. I put this in the httpd configuration file for my blog:

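# Match only GET requests whose original request line asks for exactly
# /send-later/ (no query string).
<If "%{THE_REQUEST} =~ m#^GET /send-later/ #">
  # Remove any existing caching headers and cookies from the response.
  Header unset "Pragma"
  Header unset "Cache-Control"
  Header unset "Expires"
  Header unset "Set-Cookie"
  # Let browsers and shared caches keep the page for one hour.
  Header set Cache-Control "public, max-age=3600"
</If>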

Then I reloaded my httpd configuration. With this change, browsers cache the page for an hour, but Cloudflare still doesn’t, since by default it only caches files with certain suffixes indicating static content (e.g., “.js”, “.css”, “.jpg”). So I also created a page rule in Cloudflare:

Cloudflare page rule: blog.kamens.us/sendlater*, Browser Cache TTL: an hour, Cache Level: Cache Everything

In this rule, “Cache Level: Cache Everything” tells Cloudflare that it’s allowed to cache this page in its CDN, and “Browser Cache TTL: an hour” tells browsers to cache the page for only an hour rather than the default of four hours.

With this httpd configuration change and page rule in place, at last Cloudflare successfully caches this page but no others on my blog. Evidence:

$ curl -v --silent https://blog.kamens.us/ 2>&1 | egrep cf-cache-status
$ curl -v --silent https://blog.kamens.us/send-later/ 2>&1 | egrep cf-cache-status
< cf-cache-status: HIT
$
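The same check can be scripted. Here is a minimal Python sketch, assuming only the standard library; the function names and the User-Agent string are made up for illustration. The first function just reads the cf-cache-status header out of a set of response headers, and the second fetches a URL and applies it.

```python
from urllib.request import Request, urlopen


def cf_cache_status(headers):
    """Return the cf-cache-status value (upper-cased), or None if absent.

    `headers` is any mapping of header name to value. Names are compared
    case-insensitively because HTTP header names are not case-sensitive.
    """
    for name, value in headers.items():
        if name.lower() == "cf-cache-status":
            return value.strip().upper()
    return None


def fetch_cf_cache_status(url):
    """Fetch `url` and report its cf-cache-status header (needs network)."""
    # The User-Agent value is an arbitrary placeholder.
    request = Request(url, headers={"User-Agent": "cache-check-sketch"})
    with urlopen(request) as response:
        return cf_cache_status(dict(response.headers.items()))


# Offline example with hand-written headers:
print(cf_cache_status({"CF-Cache-Status": "hit"}))
print(cf_cache_status({"Content-Type": "text/html"}))
```

Calling fetch_cf_cache_status() on the home page and on the user guide page should mirror the curl output above: nothing for the former, and a status for the latter once Cloudflare has the page cached.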

My one regret with this approach is that since my server no longer sees most requests for this page, they aren’t reflected in the Jetpack statistics for the page. I confess that I like seeing all those page hits. 😉

Please comment below or send me email if you found this useful!

What didn’t work

There are various WordPress plugins which allow you to set the caching headers. I ran into these problems with them:

  • As far as I can tell, they all work across the entire site. I only wanted the caching to apply to a single page.
  • Some of them don’t unset “Pragma: no-cache”, so they end up sending conflicting headers, i.e., a “Cache-Control” header which says to cache and a “Pragma” header which says not to.
  • Cloudflare won’t cache any page that sets a cookie, and my WordPress installation sets a “PHPSESSID” cookie on every page (probably via a plugin or theme, since WordPress core doesn’t use PHP sessions), so even if one of these plugins correctly sets the caching headers, that only allows pages to be cached in the user’s browser, not in Cloudflare’s CDN.
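The header problems in this list can be checked mechanically. The following Python sketch is only an illustration: the function names and the sample response are made up, and it checks just the issues described above rather than everything a real cache considers. It also models the header rewrite from the “What works” section to show that it removes those problems.

```python
def find_cache_blockers(headers):
    """Return reasons a response is unlikely to be cached by a shared cache.

    `headers` is a list of (name, value) pairs so that repeated headers
    such as Set-Cookie are preserved.
    """
    lowered = [(name.lower(), value.lower()) for name, value in headers]
    cache_control = ",".join(v for n, v in lowered if n == "cache-control")
    # Reduce "public, max-age=3600" to the directive names {"public", "max-age"}.
    directives = {
        d.strip().split("=")[0] for d in cache_control.split(",") if d.strip()
    }
    says_cache = bool(directives & {"public", "max-age"}) and not (
        directives & {"no-cache", "no-store", "private"}
    )

    problems = []
    if not says_cache:
        problems.append("Cache-Control does not permit shared caching")
    if says_cache and any(n == "pragma" and "no-cache" in v for n, v in lowered):
        problems.append("Pragma: no-cache conflicts with Cache-Control")
    if any(n == "set-cookie" for n, _ in lowered):
        problems.append("Set-Cookie present")
    return problems


def apply_send_later_rewrite(headers):
    """Model the httpd <If> block: drop four headers, then set Cache-Control."""
    dropped = {"pragma", "cache-control", "expires", "set-cookie"}
    kept = [(n, v) for n, v in headers if n.lower() not in dropped]
    return kept + [("Cache-Control", "public, max-age=3600")]


# Hypothetical response from a site using a caching-header plugin.
plugin_response = [
    ("Content-Type", "text/html; charset=UTF-8"),
    ("Cache-Control", "public, max-age=3600"),
    ("Pragma", "no-cache"),
    ("Set-Cookie", "PHPSESSID=abc123; path=/"),
]

print(find_cache_blockers(plugin_response))
print(find_cache_blockers(apply_send_later_rewrite(plugin_response)))
```

For the sample response, the first call reports the Pragma conflict and the cookie, and the second call reports nothing, because the rewrite strips both before setting its own Cache-Control header.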

I tried adding PHP code to my site to add a wp_headers filter to edit the caching headers, and it did not work. I tried to do this both by putting my filter function at the end of index.php along with an add_filter call to add it to wp_headers, and by inserting a PHP snippet on the page itself with the “PHP code snippets (Insert PHP)” or “Insert PHP Code Snippet” plugin. The former plugin didn’t work because for some reason it wouldn’t let me edit the snippet code on the “add snippet page.” The latter plugin correctly executed the snippet when the page was loaded — I know, because I put an error_log statement in the snippet and its output was being logged — but for some reason the filter function was never being called. The filter function that I added in index.php also was never called. I think maybe by the time the snippet is executed on the page itself, the headers have already been generated. Similarly, I think maybe if I had put my code at the top of index.php rather than at the bottom, it might have worked. Even then, however, it’s possible that other header filters and/or send_headers would have messed with the headers my code was adding, so I’m not convinced this is a viable approach.

I tried using a <Location "/send-later/"> block in my httpd configuration file to specify the header modifications instead of an <If> block. That didn’t work because on WordPress, every URL is redirected internally to /index.php, and <Location> matches against the final, redirected URL rather than the original URL sent by the browser.
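The <If> condition avoids this because %{THE_REQUEST} holds the request line exactly as the browser sent it (for example, “GET /send-later/ HTTP/1.1”). Here is a small Python sketch for checking which request lines the pattern selects, assuming Python’s re module treats this simple pattern the same way httpd’s regex engine does. The sample request lines and the function name are made up for illustration.

```python
import re

# Same pattern as in the <If> condition, written as a Python regex.
SEND_LATER_REQUEST = re.compile(r"^GET /send-later/ ")


def matches_if_condition(request_line):
    """Return True if the raw request line would satisfy the <If> test."""
    return SEND_LATER_REQUEST.search(request_line) is not None


samples = [
    "GET /send-later/ HTTP/1.1",       # the user guide page itself
    "GET /send-later/?x=1 HTTP/1.1",   # query string: no space after the slash
    "POST /send-later/ HTTP/1.1",      # not a GET
    "GET /send-later/feed/ HTTP/1.1",  # a different URL under the page
    "GET /index.php HTTP/1.1",         # what the request is rewritten to internally
]

for line in samples:
    print(matches_if_condition(line), line)
```

Only the first sample matches. The trailing space in the pattern is what keeps query strings and sub-paths from picking up the caching headers.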

 
