The performance of my site has always been a consideration: the faster your site, the better the browsing experience. I've made various changes in the past to try to boost performance, but this is perhaps the best one so far: setting up caching with Nginx.


Performance

We should all want better performance on our sites and there are a bunch of different reasons you could benefit. For me, the absolute number one is the browsing experience for visitors. If your site loads faster then users are more likely to stick around and enjoy the experience more. There's heaps of research out there to back this up, and even the tiniest of slowdowns can get visitors hitting the back button to leave. This has an impact on the conversion rate for eCommerce sites too. Better performance also helps users on mobile devices or slow connections, and it helps to keep your overheads down because you're going to consume fewer resources hosting the same site. No matter what your motivation, better performance is just better. To keep an eye on some key metrics I use New Relic to monitor my servers and applications, so I know exactly how things are performing and, crucially, whether or not things are getting better with the changes I make. There are also many other ways that you can squeeze extra performance out of your site and I did a blog on some micro-optimisations I did for fun!


Nginx

Nginx is an incredibly powerful web server, load balancer and reverse proxy. I currently use it as a reverse proxy for my site to terminate SSL and proxy requests to Node.js for my Ghost blog. This doesn't really take full advantage of what Nginx can do though, so I'm going to turn it into a caching proxy and make a few other alterations to harness some of the power of Nginx.


Serving static assets directly

One of the things that Nginx does really well is serving static files. With a default install of Ghost, Nginx will proxy all requests to Node.js to be handled, even requests for things like JS and CSS files. This introduces a completely unnecessary overhead because Nginx is proxying a request that it could quite easily handle itself much faster. The first step then is to tell Nginx to host these files itself. You probably have a location block for the root of your site that looks something like this:


location / {
    proxy_set_header Host $http_host;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header X-Forwarded-Proto $scheme;
    proxy_pass http://localhost:2368;
}

This will catch all requests to your site and proxy them to Node.js on port 2368. We're going to add some new location blocks to stop some of the requests being passed back to Node and have Nginx serve them itself.


location ^~ /assets/ {
    root /var/www/ghost/content/themes/scott;
}

The /assets/ directory is where Ghost references all of the JS, CSS and fonts for your site and would normally be passed back to Node. With this new location block we're telling Nginx that any request for an item in /assets/ can be served out of the path given with the root directive. You will need to update this path to point to your theme folder and remember to change it if you ever change your theme, but Nginx will now host these files directly and not pass the request to Node. We can also instruct Nginx to do the same for our images.


location ^~ /content/images/ {
    root /var/www/ghost;
}

The path here is a little simpler and doesn't require updating if we change our theme, because images are stored outside the theme folder, directly under Ghost's content directory. Nginx will now also answer any request for an image directly without proxying the request.
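If you want to double-check which file on disk a request will map to, it helps to remember that with the root directive Nginx appends the full request URI to the root path. Here is a minimal shell sketch of that mapping. The helper name resolve_root and the file names screen.css and logo.png are hypothetical and only there for illustration.

```shell
#!/bin/sh
# Sketch of how the root directive maps a request URI to a file path:
# the full URI (including /assets/) is appended to the root.
# resolve_root is a hypothetical helper, not part of Nginx.
resolve_root() {
    root="${1%/}"   # drop a trailing slash from the root, if any
    uri="$2"
    printf '%s%s\n' "$root" "$uri"
}

# Hypothetical file names, used only to show the resulting paths.
resolve_root /var/www/ghost/content/themes/scott /assets/css/screen.css
resolve_root /var/www/ghost /content/images/logo.png
```

If the printed path doesn't exist on your server, the root in the matching location block probably needs adjusting.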


Enable caching

One of the other things that Nginx is really good at is caching. In a default setup every request that Nginx receives will be passed to Node, which will construct the page and send it back to Nginx to serve to your visitor. The chances are that on a personal blog, a site just like this, constructing the page from scratch for every single request is a bit pointless, because the content rarely changes. It's a big heap of work to do for each request that results in the exact same output. This is a perfect candidate for caching. We're going to set up caching for Nginx so that, instead of sending every page request to Node to be built, Nginx will cache a copy of the page for a given period of time and serve that for subsequent requests, removing a massive overhead. The first change is in your nginx.conf file, where we need to define the cache we're going to use in the http context.


proxy_cache_path /home/nginx/ghostcache levels=1:2 keys_zone=ghostcache:60m max_size=300m inactive=24h;
proxy_cache_key "$scheme$request_method$host$request_uri";
proxy_cache_methods GET HEAD;

You need to create the directory for the cache path and ensure that Nginx has permissions to use it. The levels parameter tells Nginx to store cache files in a two-level directory hierarchy inside the cache folder, rather than putting every file in one directory. The keys_zone parameter specifies the name of the cache and the size in MB of the shared memory zone where cache keys are stored. In my case the cache is called ghostcache and it can use 60MB of memory to store cache keys. Then max_size defines the maximum size on disk that cached files can take up, and inactive specifies how long the cache manager should wait to remove a file from the cache after it was last accessed. I've gone for 24 hours to start with to allow for daily peaks and troughs in traffic. Now that the cache is configured, we need to tell Nginx to use it. Back to our earlier location block.
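As a quick aside before returning to the location block, here is a rough sketch of where a cached response ends up on disk with levels=1:2. Nginx names each cache file after the MD5 hash of the cache key, uses the last hex character as the first-level directory and the two characters before it as the second-level directory. The helper name cache_file_path is hypothetical, and the sketch assumes md5sum from GNU coreutils is available.

```shell
#!/bin/sh
# Sketch: where a response lands on disk with levels=1:2.
# Nginx names the cache file after the MD5 of the cache key; the last hex
# character is the first-level directory and the two before it the second.
# cache_file_path is a hypothetical helper; md5sum (GNU coreutils) is assumed.
cache_file_path() {
    cache_dir="$1"
    key="$2"
    hash=$(printf '%s' "$key" | md5sum | cut -d' ' -f1)
    level1=$(printf '%s' "$hash" | cut -c32)      # last of 32 hex characters
    level2=$(printf '%s' "$hash" | cut -c30-31)   # the two characters before it
    printf '%s/%s/%s/%s\n' "$cache_dir" "$level1" "$level2" "$hash"
}

# Key built as $scheme$request_method$host$request_uri for a GET of the homepage.
cache_file_path /home/nginx/ghostcache "httpsGETscotthelme.co.uk/"
```

This can be handy for checking that a particular page really has been written to the cache directory.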


location / {
    proxy_cache ghostcache;
    proxy_cache_valid 60m;
    proxy_cache_valid 404 1m;
    proxy_ignore_headers Set-Cookie;
    proxy_hide_header Set-Cookie;
    proxy_cache_use_stale error timeout invalid_header updating http_500 http_502 http_503 http_504;
    proxy_ignore_headers Cache-Control;
    add_header X-Cache-Status $upstream_cache_status;

    proxy_set_header Host $http_host;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header X-Forwarded-Proto $scheme;
    proxy_pass http://localhost:2368;
}

I've now added in the necessary changes to tell Nginx to use the cache and we're using ghostcache that we created earlier. The proxy_cache_valid directives set a default cache time of 60m on valid responses with an HTTP status of 200, 301 or 302 and override that for 404 responses, which are only cached for 1m. Next we need to ignore the Set-Cookie header, which Ghost has a history of setting a lot and which would otherwise stop Nginx from caching the page. We also hide the Set-Cookie header so it isn't passed on to visitors. Next up, if there is a problem talking to Node then Nginx can use stale cache items instead of returning an error from Node. We're ignoring the Cache-Control header from Node and enforcing our own cache policy, and finally adding the X-Cache-Status header to responses to show the cache status. That's it! Nginx is now set up to cache your pages and you should see some super performance improvements, but there is just one more thing we need to address.
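Before moving on, a quick way to confirm the cache is working is to look at the X-Cache-Status header that we just added. The sketch below pulls that header out of a set of response headers. The helper name cache_status is hypothetical, and example.com in the comment is a placeholder for your own site. You would typically expect MISS on the first request for a page and HIT on the ones that follow.

```shell
#!/bin/sh
# Sketch: pull the X-Cache-Status value out of raw response headers.
# cache_status is a hypothetical helper that reads headers on stdin.
cache_status() {
    # Strip carriage returns, then match the header name case-insensitively.
    tr -d '\r' | awk -F': ' 'tolower($1) == "x-cache-status" { print $2 }'
}

# Demo on canned headers, so no network access is needed:
printf 'HTTP/1.1 200 OK\r\nContent-Type: text/html\r\nX-Cache-Status: HIT\r\n\r\n' | cache_status

# Against a live site (example.com is a placeholder):
#   curl -sI https://example.com/ | cache_status
```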


Don't cache everything

Caching your pages is fantastic and they will be so much faster, but there are some pages that we don't want to cache. Specifically, we really don't want to cache the admin section of the site under /ghost/ because we need that to be live for it to work properly. We need to override the cache for this specific directory. Add a new location block to cover it.


location ^~ /ghost/ {
    proxy_set_header Host $http_host;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header X-Forwarded-Proto $scheme;
    proxy_pass http://localhost:2368;
}

This block simply tells Nginx to proxy all requests that come in to /ghost/ as it would have before. Because the block contains no proxy_cache directive, no caching is enabled for it.


Bypassing the cache

With the above configuration Nginx will not allow someone to bypass the cache. If it has a cached item, it will serve it to whoever makes a request. You can configure Nginx to allow visitors to request that it bypass the cache and fetch a fresh copy of the page from the origin, Node in this case, if you wish. Simply add the directive to allow cache bypass.


location / {
    proxy_cache ghostcache;
    proxy_cache_valid 60m;
    proxy_cache_valid 404 1m;
    proxy_cache_bypass $http_cache_control; # add this line <---
    proxy_ignore_headers Set-Cookie;
    proxy_hide_header Set-Cookie;
    proxy_cache_use_stale error timeout invalid_header updating http_500 http_502 http_503 http_504;
    proxy_ignore_headers Cache-Control;
    add_header X-Cache-Status $upstream_cache_status;

    proxy_set_header Host $http_host;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header X-Forwarded-Proto $scheme;
    proxy_pass http://localhost:2368;
}

This directive tells Nginx that if the client sends a Cache-Control header (with any value that is non-empty and not 0), it should bypass its cache and proxy the request to the origin for a fresh response. This sounds great and may well be what you need, but you also have to consider how often the browser will send this header. If the user clicks through to your site via a link then the browser won't send it and Nginx will serve from cache. If the user types in your address and hits enter then the browser will send the header and ask for a fresh page. It will also send the header if the user is on your site and hits the refresh button or does things like Shift-click the reload button. This can mean an awful lot of cache bypassing and may not be something you want, which is why I don't use this particular setting.
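To make the bypass rule concrete, here is a small sketch of the check Nginx applies to the value given to proxy_cache_bypass: the cache is skipped when the value is non-empty and not equal to "0". The helper name would_bypass is hypothetical and simply mirrors that rule in shell.

```shell
#!/bin/sh
# Sketch of the rule behind proxy_cache_bypass $http_cache_control:
# the cache is skipped when the value is non-empty and not "0".
# would_bypass is a hypothetical helper that mirrors that rule.
would_bypass() {
    if [ -n "$1" ] && [ "$1" != "0" ]; then
        echo bypass
    else
        echo cache
    fi
}

would_bypass ""           # no Cache-Control header sent
would_bypass "max-age=0"  # a value browsers often send on reload
would_bypass "no-cache"   # a value often sent on a forced refresh
```

Note that any Cache-Control value at all triggers the bypass here, not just no-cache, which is part of why the setting can lead to so many origin requests.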


Clearing the cache

There may come a time when you need to completely wipe the cache on disk and I came across one whilst writing this article. I was getting ready to switch the theme on my site and as a result pretty much any item in the cache was going to be wrong. I needed to switch the theme and then completely erase the cache. You can do this with a simple command.

sudo find /home/nginx/ghostcache -type f -delete

This will find all files in the given directory, which you need to update to your own cache directory, and delete them. You can run it without the -delete flag first to see a list of files that will be deleted as a result of running the command if you want to double check.
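If you'd rather not type the path by hand each time, the same find command can be wrapped in a small function with a guard against an empty or mistyped directory. This is only a sketch: the helper name clear_cache is hypothetical, and the demo runs against a throwaway temporary directory rather than the real cache.

```shell
#!/bin/sh
# Sketch: wipe a cache directory with a guard against an empty or wrong path.
# clear_cache is a hypothetical helper; it prints how many files it removed.
clear_cache() {
    dir="$1"
    if [ -z "$dir" ] || [ ! -d "$dir" ]; then
        echo "clear_cache: not a directory: '$dir'" >&2
        return 1
    fi
    count=$(find "$dir" -type f | wc -l | tr -d ' ')
    find "$dir" -type f -delete   # subdirectories are left in place
    echo "removed $count files"
}

# Demo on a throwaway directory rather than the real cache:
demo=$(mktemp -d)
mkdir -p "$demo/c/29"
touch "$demo/c/29/b7f54b2df7773722d382f4809d65029c"
clear_cache "$demo"
rm -r "$demo"
```

Against the real cache directory you would need to run it as a user with permission to delete Nginx's files.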


Testing the cache

Now that the cache is set up and you've loaded your new config, you can try out the cache to see how it performs. Just clicking around my site I can immediately tell that it's a lot faster. That's not very scientific though, so let's create a test. I created two scripts.


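cat with-cache.sh

#!/bin/bash
# Request the homepage 10 times, allowing Nginx to answer from its cache.
for run in {1..10}
do
    curl -s https://scotthelme.co.uk > /dev/null
done

cat without-cache.sh

#!/bin/bash
# Same loop, but ask for a fresh copy with a Cache-Control request header.
for run in {1..10}
do
    curl -s -H "Cache-Control: no-cache" https://scotthelme.co.uk > /dev/null
done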

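Each script requests my homepage 10 times in a loop, so timing the two runs shows how much the cache is improving things. Note that the Cache-Control header in the second script only forces a fresh page if the proxy_cache_bypass directive from the earlier section is enabled; without it, Nginx will serve from cache either way.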
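Timing the whole loop also includes the start-up cost of curl itself, so as a possible alternative you could have curl report the time for each request and average those figures. The sketch below shows the averaging step. The helper name average is hypothetical, and example.com in the comment is a placeholder for your own site.

```shell
#!/bin/sh
# Sketch: average a list of per-request timings, one number per line.
# average is a hypothetical helper; curl's -w '%{time_total}' prints the
# total time of a request in seconds.
average() {
    awk '{ sum += $1 } END { if (NR > 0) printf "%.3f\n", sum / NR }'
}

# Demo on canned timings, so no network access is needed:
printf '0.10\n0.20\n0.30\n' | average

# Against a live site (example.com is a placeholder):
#   for run in 1 2 3 4 5 6 7 8 9 10; do
#       curl -s -o /dev/null -w '%{time_total}\n' https://example.com/
#   done | average
```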


time ./with-cache.sh
real    0m13.066s

time ./without-cache.sh
real    0m14.346s

That's roughly a 9% reduction in the time it takes to request my homepage, and don't forget, curl isn't fetching any other assets; this is just the HTML of the page itself. The self-hosted assets on the page are now served directly by Nginx without a proxy request, so they will be faster too. Overall, that's quite a healthy boost in performance! I also ran the same tests again on my server to remove any network latency or other issues from the mix, and increased the loop count to 100.


time ./with-cache.sh
real    0m3.253s

time ./without-cache.sh
real    0m3.573s
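For anyone who wants to check the arithmetic, the reduction is (uncached - cached) / uncached * 100. A minimal sketch using the timings from both runs above follows. The helper name pct_reduction is hypothetical.

```shell
#!/bin/sh
# Worked arithmetic: percentage reduction from an uncached to a cached run.
# pct_reduction is a hypothetical helper: (before - after) / before * 100.
pct_reduction() {
    awk -v before="$1" -v after="$2" \
        'BEGIN { printf "%.1f\n", (before - after) / before * 100 }'
}

pct_reduction 14.346 13.066   # remote run, 10 requests: prints 8.9
pct_reduction 3.573 3.253     # on the server, 100 requests: prints 9.0
```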

Even running the test locally on the server I'm still seeing the same ~9% reduction in load time. The cache is definitely doing its job! Enjoy!