CACHE ME IF YOU CAN

Introduction

As almost every user experience survey will tell you, a slow-loading site is a surefire way to lose visitors. Configuration changes to your web server, or even replacing it, can only do so much. One option is to invest in a CDN with multiple points of presence (POPs). If you do not have the budget for that, you can enable caching on the server side.

Caching is a term you hear often, especially when you are having trouble with a website. The first fix-it-yourself response is “Clear your browser cache and cookies”. The browser cache is the client-side cache that speeds up page loads for a repeat visitor. How long content is stored locally is usually set by a web server directive. On popular web servers such as Apache and Nginx, this is defined in the .htaccess file or in a .conf file under /etc/nginx/sites-available, respectively. Typically, static content such as images, CSS and JavaScript files is given a 30-day expiry.
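To make the 30-day expiry concrete, here is a minimal Python sketch of the response headers such a directive produces. It is not Apache or Nginx configuration. The function name `cache_headers`, the extension list, and the choice of `no-cache` for everything else are assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime
from pathlib import PurePosixPath

# Assumed list of "static" file types; adjust to your own site.
STATIC_EXTENSIONS = {".png", ".jpg", ".jpeg", ".gif", ".css", ".js"}
STATIC_MAX_AGE = timedelta(days=30)


def cache_headers(path: str, now: datetime) -> dict[str, str]:
    """Return the caching headers a server might attach to a response.

    `now` must be a timezone-aware UTC datetime.
    """
    suffix = PurePosixPath(path).suffix.lower()
    if suffix not in STATIC_EXTENSIONS:
        # Anything else: the browser must revalidate before reusing a copy.
        return {"Cache-Control": "no-cache"}
    max_age = int(STATIC_MAX_AGE.total_seconds())
    return {
        # The browser may reuse its local copy for max_age seconds.
        "Cache-Control": f"public, max-age={max_age}",
        # Older clients read the absolute expiry date instead.
        "Expires": format_datetime(now + STATIC_MAX_AGE, usegmt=True),
    }
```

Calling `cache_headers("/assets/site.css", datetime.now(timezone.utc))` returns a `Cache-Control` header with `max-age=2592000`, which is 30 days expressed in seconds.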

What if your site has thousands of visitors from across the globe? How do you handle dynamic content, such as geography-based information? We’ll take a look at some of the popular server-side caching systems that are not specific to a particular web server.

Varnish

Varnish is an HTTP reverse proxy, or HTTP accelerator, that sits in front of your web server. It works by caching the responses to requests it has received and reusing them for subsequent requests. This means that when Varnish is hard at work, a good number of requests do not even hit the server, and cached requests can be answered in microseconds. The added advantage is that requests handled by Varnish put no load on your web server or the backend database.

Varnish can sit on a separate node and proxy requests to other application server nodes, or it can run on the same machine. It works well with popular CMSes such as WordPress and in front of any web server (Apache or Nginx, for example).
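The idea of answering repeat requests without touching the backend can be sketched in a few lines of Python. This is a toy illustration, not Varnish's implementation or its configuration language. The class name `CachingProxy`, the `backend` callable and the fixed `ttl_seconds` are assumptions made for the example.

```python
import time
from typing import Callable, Dict, Tuple


class CachingProxy:
    """Toy stand-in for a caching reverse proxy (not Varnish itself)."""

    def __init__(
        self,
        backend: Callable[[str], str],
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.backend = backend          # the "real" web server
        self.ttl_seconds = ttl_seconds  # how long a response stays fresh
        self.clock = clock
        self._store: Dict[str, Tuple[float, str]] = {}

    def get(self, url: str) -> str:
        entry = self._store.get(url)
        if entry is not None:
            expires_at, body = entry
            if self.clock() < expires_at:
                return body  # cache hit: the backend is never contacted
        # Cache miss or expired entry: ask the backend and remember the answer.
        body = self.backend(url)
        self._store[url] = (self.clock() + self.ttl_seconds, body)
        return body
```

With `proxy = CachingProxy(render_page, ttl_seconds=60)`, where `render_page` is whatever function produces your page, the second call to `proxy.get("/home")` within a minute returns the stored body and `render_page` is not called again.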

Redis, Memcached

Redis and Memcached differ from Varnish in that they do not sit in front of the web server. They sit closer to the database and reduce the number of queries it receives by keeping frequently used pieces of information in a key/value store.

For example, if your dynamic page displays a menu with various categories, a query is executed to fetch the categories from a table each time the page is loaded. By storing the result under a key such as “menu”, you can programmatically retrieve it from the cache instead of running the query. To extend this example, a single product page might be built from multiple queries across tables for, say, price, shipping methods and payment methods. By storing all of this information in an in-memory store, every retrieval of the product information can be served without running a query against the main database.

However, as you may have inferred, Redis and Memcached are not drop-in solutions. You need to change your code to support this kind of caching, because every database call now involves a call to Redis/Memcached as well. For a read, the cache is checked first, and the cached data is returned if it is present and not stale. The database query is executed only if the data is missing or too old. A write goes to both the database and the cache.
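The read and write paths described above can be sketched as follows. To stay runnable without a server, the example uses a small in-process class, `KeyValueCache`, as a stand-in for Redis/Memcached. That class, the helper names `read_through` and `write_through`, and the 300-second default are all assumptions for illustration, not part of either product's API.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


class KeyValueCache:
    """In-process stand-in for Redis/Memcached: get/set with a time-to-live."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._data: Dict[str, Tuple[float, Any]] = {}
        self._clock = clock

    def get(self, key: str) -> Optional[Any]:
        entry = self._data.get(key)
        if entry is None:
            return None  # never cached
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._data[key]  # stale: treat it as missing
            return None
        return value

    def set(self, key: str, value: Any, ttl_seconds: float) -> None:
        self._data[key] = (self._clock() + ttl_seconds, value)


def read_through(cache: KeyValueCache, key: str,
                 run_query: Callable[[], Any],
                 ttl_seconds: float = 300.0) -> Any:
    """Read: try the cache first, query the database only on a miss."""
    value = cache.get(key)
    if value is None:  # note: a query result of None is never cached
        value = run_query()
        cache.set(key, value, ttl_seconds)
    return value


def write_through(cache: KeyValueCache, key: str, value: Any,
                  save_to_db: Callable[[Any], None],
                  ttl_seconds: float = 300.0) -> None:
    """Write: update the database, then refresh the cached copy."""
    save_to_db(value)
    cache.set(key, value, ttl_seconds)
```

For the menu example, `read_through(cache, "menu", load_menu_from_db)` would run the hypothetical `load_menu_from_db` query once and serve later page loads from memory until the entry expires.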

Between Memcached and Redis, the latter is newer. It has found favor with well-known companies such as Twitter and GitHub. One advantage Redis has over Memcached is that it offers persistence by snapshotting to disk. When the service is restarted, the cache can be reloaded from the snapshot and only stale information updated, which avoids a deluge of requests to the database.
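Why a snapshot helps after a restart can be shown with a small Python sketch. It writes cache entries to a JSON file and reloads only the ones that are still fresh. This is not Redis's snapshot format or mechanism. The function names, the JSON layout (key mapped to `[expires_at, value]`) and the file path are assumptions for illustration.

```python
import json
import os
import tempfile
import time
from typing import Any, Dict, List, Optional


def save_snapshot(entries: Dict[str, List[Any]], path: str) -> None:
    """Write entries (key -> [expires_at_epoch, value]) to disk."""
    directory = os.path.dirname(os.path.abspath(path))
    # Write to a temporary file first so a crash never leaves a half-written
    # snapshot behind, then swap it into place.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump(entries, handle)
    os.replace(tmp_path, path)


def load_snapshot(path: str, now: Optional[float] = None) -> Dict[str, List[Any]]:
    """Reload a snapshot, dropping entries that expired while we were down."""
    if not os.path.exists(path):
        return {}  # first start: begin with an empty cache
    current = time.time() if now is None else now
    with open(path, encoding="utf-8") as handle:
        entries = json.load(handle)
    return {key: entry for key, entry in entries.items() if entry[0] > current}
```

After a restart, only the keys missing from the reloaded dictionary need to be fetched from the database again.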

Other Caching Options

The above applications were chosen purely based on popularity. There are various other options as well. NGINX content caching (available in both the open source version and NGINX Plus) caches content for a fixed validity period. This is similar to how Varnish works, though the open source version offers less control than NGINX Plus.

Squid, a caching proxy for HTTP, HTTPS and FTP requests, is also a good choice. Wikimedia has used Squid and has stated that it “effectively quadrupled the capacity of the Apache servers behind them”.

Final Thoughts

There is no ideal combination of caching services. The application that runs in the background also makes a difference. For example, WordPress comes with its own ecosystem of caching plugins, some of which convert all blog posts to plain HTML pages. In such cases, installing Redis/Memcached might not provide the speed boost that you expect. It is best to set up different combinations on a test server and stress test the application to find the one with the best response times.
