[Header image: a squirrel with a walnut in its mouth, perched on a concrete wall, looking for a place to stash it.]

Essential elements of high performance applications

Server side caching


Our application’s SQL database is a good place to start with performance optimization, as it doesn’t require changing our infrastructure or making major rewrites of the code. Adding indexes and rewriting queries are generally isolated measures that we can take to improve the performance of our application. However, sometimes even after optimization we’ll find performance is still worse than required. This can happen for a variety of reasons: our request volume may be so large that our database struggles to serve all the queries even after optimization, or the queries we need may be complex enough that their performance remains inadequate even once optimized.

One technique we can employ in such a situation is server side caching. In simple terms, server side caching is saving the result of an expensive query or computation and making it more quickly retrievable. The results are typically written against and retrieved using a particular ID or URL that acts as a distinct identifier for the particular data.

Server side caching flow: cache hits and misses

Employing caching in an application requires handling two cases in our data retrieval flow: the cache hit and the cache miss. A cache hit occurs when the cache is checked for a particular piece of data and it is found. By contrast, a cache miss is when the cache is checked and the piece of data is absent from the cache. Let’s look at both caching flows.

Assuming our cache is totally empty to start with, here’s what a cache miss flow looks like:

  1. Receive the request for a particular piece of data
  2. Check whether the data for that request’s ID or URL is in the cache
  3. Since the data is not in the cache, run the query or computation to retrieve the result
  4. Save the result in the cache
  5. Return the result

We can see above that the cache miss occurs in step 2, causing step 3 to result in the costly query or computation that we are aiming to avoid executing. By saving the result of this query in the cache in step 4, subsequent requests for this piece of data can result in a cache hit, which looks like the following:

  1. Receive the request for a particular piece of data
  2. Check whether the data for that request’s ID or URL is in the cache
  3. Since the data is in the cache, return this result

In this sequence, we avoid querying the database altogether and rely only on the cache. By avoiding having to read from the database, we can dramatically speed up retrieval of the data that we need.

Let’s look at an example. Suppose we’re using a piece of project management software and we look up the recent activity of one of our coworkers. The path of the coworker’s profile and activity feed might be something like https://www.example.com/users/123456/activity, and we navigate to it in our browser. On the web server, the application routes the request to an activity feed handler (a function that handles the request) that takes 123456, our coworker’s user ID, as the argument.

The handler function may look like the following:

def getUserActivityHandler(userId):
	# [1] Check the cache for activity feed data for the given user
	cachedActivityFeedItems = retrieveCachedActivityFeedForUser(key=userId)
	# [2] If activity feed data was found for this user, return it
	if cachedActivityFeedItems is not None:
		# CACHE HIT
		return cachedActivityFeedItems

	# CACHE MISS
	# [3] Otherwise, run SQL queries for each type of activity feed item
	recentPosts = findRecentPostsForUser(userId)
	recentComments = findRecentCommentsForUser(userId)
	recentCompletedTasks = findRecentCompletedTasksForUser(userId)

	# [4] Sort all retrieved items by date
	sortedActivityFeedItems = sortByDate([
		recentPosts,
		recentComments,
		recentCompletedTasks
	])
	# [5] Save the sorted activity feed items in the cache,
	#     associated with the current user id
	saveCachedActivityForUser(key=userId, value=sortedActivityFeedItems)
	# [6] In addition, return the same activity feed data
	return sortedActivityFeedItems

The above handler function will only run steps [1] and [2] if the activity feed data for the specified user ID is found in the cache. If the data isn’t present in the cache, it will run several SQL queries and sort the results (steps [3] and [4]), and then save the sorted data in the cache in step [5] before also returning the data in step [6].

When employing caching like this, it’s imperative that the key used to look up the cached data is the same as the one used to store it. In both step [1] and step [5], we’re using the userId parameter as the cache key. If the keys did not match, we would always experience a cache miss after step [2], since the data in the cache would not be retrieved using the same identifier it was stored under.
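The cache helper functions in the handler above are left undefined. As a minimal illustration of how they might work, here is a sketch using a module-level dictionary as a stand-in for a real cache store (dedicated stores are discussed later in this article):

# A module-level dictionary standing in for a dedicated cache store.
# In a real deployment this would be backed by something like Redis,
# which is discussed later in this article.
activityFeedCache = {}

def retrieveCachedActivityFeedForUser(key):
	# Return the cached feed stored under this key, or None on a cache miss
	return activityFeedCache.get(key)

def saveCachedActivityForUser(key, value):
	# Store the feed under the same key that will be used to retrieve it
	activityFeedCache[key] = value

Note that both helpers operate on the same key, which is what guarantees that a value saved in step [5] can be found again in step [1].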

Looking at the example above, you’ll notice that caching only matters in successive calls of the getUserActivityHandler function with a particular userId parameter. This means that data in the cache is persisted across requests. One way to conceptualize the cache is as a special type of database or data store that our application uses in tandem with a SQL database.

But how do we know when to update the data in the cache? What happens if the underlying data in the SQL database changes? These concerns are solved through cache invalidation.



Expunging stale data: Cache invalidation

A critical consideration of caching is the concept of cache invalidation, or removing data stored in the cache that no longer reflects what is currently present in the SQL database, which we can consider our system of record or single source of truth. Looking back to the code example above, we need to consider what happens when our coworker makes a new post or completes a task in the project management system. If we’ve looked at their activity feed recently, their activity feed data will be cached, so looking at the feed again will not show the latest post.

In order to resolve this issue, we need to invalidate the data in the cache. More specifically, we need to invalidate the activity feed cache entry for our coworker’s user ID. How we do this depends on our tolerance for stale data and the performance-related considerations of retrieving data from the SQL database. For example, we may set a 15 minute time to live (or TTL) on all activity feed cache entries, causing any piece of data to be dropped from the cache once its age reaches 15 minutes. With a 15 minute TTL, the data we see in any user’s activity feed should never be outdated by more than 15 minutes. Whether this is acceptable depends on the user experience expectations for the application.
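If our cache store doesn’t support TTLs natively (a dedicated store like Redis does, as we’ll see below), we can approximate one by recording when each entry was cached and treating entries older than the TTL as misses. Here is a rough sketch, written as a variant of the dictionary-backed helpers from earlier; the 15 minute value is just the example TTL discussed above:

import time

ACTIVITY_FEED_TTL_SECONDS = 15 * 60  # 15 minute TTL

def saveCachedActivityForUser(key, value):
	# Store the value along with the time at which it was cached
	activityFeedCache[key] = (time.time(), value)

def retrieveCachedActivityFeedForUser(key):
	entry = activityFeedCache.get(key)
	if entry is None:
		return None
	cachedAt, value = entry
	# Treat entries older than the TTL as a cache miss and drop them
	if time.time() - cachedAt > ACTIVITY_FEED_TTL_SECONDS:
		del activityFeedCache[key]
		return None
	return value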

Another cache invalidation strategy is to invalidate on creation of any of the individual items that comprise the activity feed. Using this strategy, when a coworker makes a post, leaves a comment, or completes a task, their activity feed cache entry is automatically invalidated. This way we ensure that the data on any user’s activity feed is always up to date, as a stale copy that doesn’t reflect what was last saved in the database should never stick around in the cache. One obvious downside of this approach is that it requires changing every location in our code base that saves data appearing in activity feeds. In this case, that would include the sites that save recent posts, recent comments, and recently completed tasks. An additional downside is that if users in the system are very active and frequently post, comment, or mark tasks as completed, their activity feeds will seldom be cached and our retrieval code will usually result in a cache miss.
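To illustrate the second strategy, every write path that produces an activity feed item would need to drop the author’s cached feed after saving. The function names below (createPost, savePostToDatabase, invalidateCachedActivityForUser) are hypothetical and only sketch the shape of the change:

def invalidateCachedActivityForUser(userId):
	# Remove the cached feed so the next read rebuilds it from the database
	activityFeedCache.pop(userId, None)

def createPost(userId, postContent):
	# Hypothetical write path: save the post, then invalidate the author's feed
	savePostToDatabase(userId, postContent)
	invalidateCachedActivityForUser(userId)

The same invalidation call would be added wherever comments are saved and tasks are marked as completed.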

In practice, we may want to employ a combination of these strategies across our application, with certain pages or pieces of information using only a TTL strategy, others using the on-create invalidation, and others still using a combination of the two.

Selecting a store for our cached data

Up to this point we’ve discussed caching only in general terms: how it fits into the overall flow of our application and the role it plays in relation to the SQL database. However, in order to actually implement caching in our system, we need to select a store for the data that is to be cached.

One of the most commonly used databases for caching is Redis, which has a number of characteristics that distinguish it from a SQL database and make it well suited for serving as a server side data cache:

  • Redis is an in-memory database, meaning its data is stored in the server’s RAM rather than on its disk. Commodity RAM is significantly faster to read from and write to than the commodity SSDs that are commonplace on web application servers.
  • Redis is a key-value store rather than a SQL database, meaning that instead of tables consisting of columns and rows, it associates every piece of stored data with a single key value. We can think of Redis as a large hashmap or dictionary, which is exactly the data structure we use to implement key-based caching.
  • Redis does not enforce any schema, meaning that unlike SQL, which has tables with predefined structure, we can store free-form data in each Redis key (for the purpose of caching, the data we store is usually a serialized array or JSON string).
  • Redis supports automatic data expiration, so we can set a TTL for each piece of cached data. This allows us to automatically drop data from the cache after it reaches a certain age.
  • Redis also allows us to set a data eviction policy, such as least recently used (or LRU), meaning that as the cache server reaches its limit for RAM use, it will start evicting least recently used pieces of data, even if their TTL hasn’t been hit.
  • Redis runs as a server and has a networking interface. This means that an individual Redis instance deployed on a single host can serve multiple application servers, allowing them to share a cache. Redis also supports clustering across multiple nodes, which enables horizontal scaling and stable performance even under heavy cache workloads.
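As a sketch of how the activity feed cache helpers from earlier might look when backed by Redis, here is one possible implementation using the redis-py client, JSON serialization, and a 15 minute TTL; the connection details and key prefix are assumptions made for illustration:

import json
import redis

# Assumed connection details for a Redis instance reachable by our application servers
redisClient = redis.Redis(host="localhost", port=6379)

def retrieveCachedActivityFeedForUser(key):
	cached = redisClient.get("activity-feed:" + str(key))
	if cached is None:
		# Cache miss
		return None
	return json.loads(cached)

def saveCachedActivityForUser(key, value):
	# The ex argument sets a per-key TTL in seconds, here 15 minutes
	redisClient.set("activity-feed:" + str(key), json.dumps(value), ex=900)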

There are other options available to use as a cached data store besides Redis. For example, memcached is a tool very similar to Redis when used for the purposes of caching. Its feature set is more limited than that of Redis, which supports data structures beyond the strings and integers that memcached offers. This difference in features is largely inconsequential for the purpose of caching, since as mentioned earlier, caching typically involves serializing data into a string before storage. However, if Redis is used or will likely be used for other purposes such as pub/sub, message queues, or geospatial indexes, it becomes a natural tool to reach for over memcached, as it avoids the overhead of learning, finding language libraries for, and maintaining the infrastructure of multiple tools.

A cache data store can also be even more rudimentary than the dedicated stores discussed above. For example, file caching is sometimes employed “out of the box” by web frameworks, where an application will write to and read from temporary files on its host machine. Despite the somewhat primitive implementation, there are some notable benefits to this approach:

  1. Infrastructure simplicity. Nothing beyond the application needs to be deployed.
  2. Near zero-latency cache lookups. A tool like Redis deployed as described above will incur network overhead with each cache lookup. Even a host in the same data center will likely add a couple of milliseconds of wait time, whereas a local file can be read almost instantly.

However, there are plenty of downsides as well:

  1. A local file cache cannot be shared by multiple hosts, so the same web request subsequently routed to a different server will result in the underlying query running again, even if already cached on the first server.
  2. With file caching, cache invalidation in a multi-host environment becomes impractical, which can lead to inconsistent results across servers: two different versions of the same data may be cached on different machines.
  3. No “TTL” functionality without cron jobs, some other system process, or cleanup logic in the application, so cache files may take up space on disk forever.
  4. Security implications related to having pieces of your data “at rest” outside of your databases. An attacker that gets access to the application server’s disk may be able to see pieces of other users’ data even without database access.
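For illustration, a bare-bones file cache along these lines might look like the following sketch; the cache directory, TTL value, and function names are assumptions, and a web framework’s built-in file cache would handle details like locking and cleanup for us:

import json
import os
import time

CACHE_DIR = "/tmp/app-cache"  # assumed location for cache files
CACHE_TTL_SECONDS = 15 * 60

def readFileCache(key):
	path = os.path.join(CACHE_DIR, str(key) + ".json")
	try:
		# Treat files older than the TTL as a cache miss;
		# note the stale file stays on disk (downside 3 above)
		if time.time() - os.path.getmtime(path) > CACHE_TTL_SECONDS:
			return None
		with open(path) as cacheFile:
			return json.load(cacheFile)
	except OSError:
		# A missing or unreadable file counts as a cache miss
		return None

def writeFileCache(key, value):
	os.makedirs(CACHE_DIR, exist_ok=True)
	path = os.path.join(CACHE_DIR, str(key) + ".json")
	with open(path, "w") as cacheFile:
		json.dump(value, cacheFile)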

Ultimately, most modern caching setups are likely to employ tools like Redis rather than local file system caching. When network latency is a concern, there are more sophisticated setups that can be employed, such as installing a Redis process on each application server which then broadcasts cache invalidations to other servers (see the links at the end of this article for more on this topic).
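One rough sketch of how such a setup can work, assuming redis-py and an invalidation channel name chosen for illustration: each application server keeps a small in-process cache and subscribes to a Redis pub/sub channel, and whenever a server invalidates a key it publishes that key so the other servers can drop their local copies as well.

import redis

redisClient = redis.Redis(host="localhost", port=6379)  # assumed connection details
localCache = {}  # in-process cache held by each application server, keyed by string

def invalidateEverywhere(key):
	# Drop our local copy and tell every other server to drop theirs
	localCache.pop(str(key), None)
	redisClient.publish("cache-invalidations", str(key))

def listenForInvalidations():
	# Intended to run in a background thread on each application server
	pubsub = redisClient.pubsub()
	pubsub.subscribe("cache-invalidations")
	for message in pubsub.listen():
		if message["type"] == "message":
			localCache.pop(message["data"].decode(), None)

This is only an outline of the idea rather than a production-ready design.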

Wrap up

In summary, server side caching allows us to significantly reduce the latency associated with repeatedly running intensive database queries or other computations. Most caching implementations rely on tools like Redis that are well-suited to caching data in a multi-host environment. However, caching of this sort comes with trade-offs. It introduces the infrastructure complexity inherent to running an additional caching database, it requires updating our code to actually make use of the cache, and it requires carefully considering our cache invalidation strategy and deciding on our tolerance for serving stale data.

Further reading about caching


This post is part of a series titled Essential elements of high performance applications. The full list of published posts is available below.


Christian Charukiewicz is a Partner at Foxhound Systems, where we build fast and reliable custom applications across a wide variety of industries. Looking for help with something you’re working on? Reach out to us at info@foxhound.systems.