At Redis Conf an important topic was client side caching. I mentioned this
in the keynote, and a talk by Ben Malec focused on this problem and how it
can be solved currently using just what Redis provides. This is going to be
a long email, so I'm splitting it into different parts.
~~~ Why client side caching is an important thing for Redis users
Let's start with why I think client side caching is such a big
deal for the Redis community. Redis allows you to handle certain tasks at
scale, by delivering a good amount of operations per second and a set of
mutable data structures. However it remains a networked system, like a
traditional database system.
Yet, after decades of both research and real world practice at
companies that have serious scaling problems, putting the data nearest to the
final user, that is, in the application layer, is clearly a winning solution.
The next step after a networked in-memory system is putting part of the
most accessed data directly inside the memory of the application server.
This concept is known as client side caching.
When the business problem allows it, it's nice to just grab a value, store
it in the app server memory, and say: let's expire it in 2 minutes. However
most applications cannot afford to serve stale data, so the problem of
caching on the client side is almost equivalent to the problem of having a
good way to invalidate what the client stored in its memory.
I've been thinking about this problem for some time, and there are definitely
many solutions, because the design space is huge and full of potential
tradeoffs. However, while at Redis Conf, the design proposed by Ben Malec
captured my imagination. When I returned home I played the game of "how
could this be improved if the server cooperated in the protocol?". But
before going forward, let's see what the design proposed by Ben looks like.
~~~ Ben's approach
Ben reuses the concept of hash slots from Redis Cluster, which is: hash the
key with some N-bit hash function, to split the key space into 2^N
possible buckets.
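To make this concrete, here is a minimal sketch in Python of how a key maps
to a bucket. Redis Cluster actually uses CRC16 masked to 14 bits (16384
slots); the sketch uses CRC32 from the standard library purely as a stand-in
hash function:

    import zlib

    HASH_BITS = 14                  # Redis Cluster style: 2^14 = 16384 slots
    NUM_SLOTS = 1 << HASH_BITS

    def hash_slot(key: bytes) -> int:
        # Hash the key and keep only the lowest N bits, so every key
        # falls into one of the 2^N buckets. (Redis Cluster uses CRC16;
        # CRC32 here is just what the standard library provides.)
        return zlib.crc32(key) & (NUM_SLOTS - 1)

    print(hash_slot(b"foo"), hash_slot(b"bar"))   # two values in 0..16383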
Basically every write performed in the system also publishes an
invalidation message that is broadcast to all the clients, and such an
invalidation message contains the hash slot invalidated. In this way the
clients just receive a fixed-length piece of information.
On the client side, clients keep their local in-memory cache, and evict
items as invalidation messages arrive. Actually Ben uses a lazy form of
invalidation: he just tracks the time at which a given value was cached,
and for each hash slot he tracks the last time an invalidation message was
received. This way, when a given cached item is accessed, if it was cached
before the latest invalidation message for its slot, the value is
discarded and Redis is hit again to refresh the cache. Otherwise the
cached value is used.
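A minimal sketch of this lazy scheme in Python, with hypothetical names,
just to illustrate the bookkeeping involved (the real thing lives inside
the client library):

    import time

    class LazyCache:
        def __init__(self, num_slots):
            self.data = {}                        # key -> (value, cached_at)
            self.slot_invalidated_at = [0.0] * num_slots

        def on_invalidation(self, slot):
            # An invalidation message arrived for this hash slot: just
            # record when, without touching the cached items.
            self.slot_invalidated_at[slot] = time.monotonic()

        def get(self, key, slot, fetch_from_redis):
            entry = self.data.get(key)
            if entry is not None:
                value, cached_at = entry
                # Valid only if cached *after* the last invalidation
                # received for its slot.
                if cached_at > self.slot_invalidated_at[slot]:
                    return value
                del self.data[key]                # stale: discard lazily
            value = fetch_from_redis(key)         # hit Redis again
            self.data[key] = (value, time.monotonic())
            return value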
The good thing about Ben's approach is that it allows sending small
invalidation data regardless of the key size, and that it is lazy on the
client side, something that is possible because keeping metadata *per group
of keys* is a lot simpler and more efficient than keeping metadata for each
key. Ben's client-side solution exploits this, but once you go from a
client-only solution to a server-client orchestration, it becomes even
more evident.
~~~ Next step: what if the server helps?
Ben's solution, not having any help from the server, has a few problems:
1. All the clients will receive invalidation messages about all the hash
slots, regardless of whether a given client has any key in a given hash
slot.
2. The client has to send explicit invalidation messages alongside its
writes.
With the help of the server, we could create a protocol in which, server
side, we remember which keys each client is caching, in order to send
more fine grained invalidation messages. However having the client
inform the server about each key it is caching is problematic: if the
client has to send explicit "I'm caching it" messages, we burn a lot of
bandwidth and CPU on this task; moreover there is also the problem of race
conditions to handle: what if I inform the server after the value has
already changed? Do I then have to implement some "epoch" attribute of the
key to understand whether the value is already stale? And so forth.
Instead what I propose is putting the connection in tracking mode:
CLIENT caching on
This will tell the server: for every command you process from me, assume
that the keys mentioned in that command are keys that I'm going to cache.
So if the client executes "GET foo", and the key "foo" is later modified,
the client will get an invalidation message, using the new "push" messages
introduced in RESP3 (if you think you need multiple connections, N for
commands, and one for invalidation messages, wait a bit, we'll get into it
soon).
However we don't want to track, for each client, all the keys we provided
to it: we are talking about millions of keys in a matter of minutes in
certain use cases. So let's use Ben's approach of hashing the key: we will
remember whether the client has keys in a given hash slot, and we'll
invalidate hash slots.
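On the client side, a library using this mode could keep, next to the
cached values, an index of which cached keys belong to each hash slot, so
that when a push invalidation for a slot arrives it can simply drop them.
All names below are hypothetical; it's just a sketch of one possible eager
strategy, as opposed to Ben's lazy one:

    from collections import defaultdict

    class TrackingCache:
        def __init__(self):
            self.data = {}                        # key -> cached value
            self.keys_by_slot = defaultdict(set)  # slot -> cached keys in it

        def store(self, key, slot, value):
            self.data[key] = value
            self.keys_by_slot[slot].add(key)

        def on_push_invalidation(self, slot):
            # The server says something changed in this slot: drop every
            # cached key mapping to it (an eager alternative to Ben's
            # lazy timestamps).
            for key in self.keys_by_slot.pop(slot, set()):
                self.data.pop(key, None)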
Instead of using the Cluster CRC16 mask, we probably want to use more bits.
For instance with 18 bits we have 262144 buckets, so if a client is caching
one million keys, an invalidation message will trash 3-4 keys on average
(1,000,000 / 262,144 ≈ 3.8), which sounds kinda acceptable.
So, server side, we keep what is called an Invalidation Table, which is a
linear array of 262144 pointers, each pointing to a radix tree. Every time
a client requests data about a given key, the key is hashed to its bucket,
and we store the client ID (not the client structure pointer!) in the
corresponding radix tree.
Clients will need to be reorganized so that they are also stored inside a
global radix tree of clients, mapping the client ID to the actual client
structure. This is a key point, because it means that when we free a
client, we won't have to free all its references inside the invalidation
table: we'll do it lazily.
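Here is a toy model of these server-side structures, with a Python dict of
sets standing in for the array of pointers to (lazily allocated) radix
trees. The real implementation is C, of course; this only sketches the data
flow:

    import zlib
    from collections import defaultdict

    INVALIDATION_BITS = 18
    INVALIDATION_SLOTS = 1 << INVALIDATION_BITS    # 262144 buckets

    def invalidation_slot(key: bytes) -> int:
        # Same hashing idea as before, this time masked to 18 bits.
        return zlib.crc32(key) & (INVALIDATION_SLOTS - 1)

    # Bucket -> IDs of the clients that recently read keys hashing there.
    # Each set stands in for a per-bucket radix tree of client IDs.
    invalidation_table = defaultdict(set)

    # Client ID -> client object, standing in for the global radix tree of
    # clients. Freeing a client only removes it here: its leftover IDs in
    # the invalidation table get purged lazily later.
    clients_by_id = {}

    def track_read(client_id: int, key: bytes) -> None:
        # Called for every key touched by a command on a tracking
        # connection: remember this client may be caching keys in
        # this bucket.
        invalidation_table[invalidation_slot(key)].add(client_id)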
~~~ What happens when a key is modified
Here we leverage the same code path as keyspace notifications; the Redis
core already has hooks for that in all the commands. If a key is
modified, we go check the invalidation table and send a message to each
client listed there, removing at the same time all the clients referenced
in that bucket: we can just release the whole radix tree!
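Continuing the toy model above, the write path would look more or less like
this (send_push_invalidation is just a placeholder for the RESP3 push
written to the client's connection):

    def on_key_modified(key: bytes) -> None:
        # Hooked where keyspace notifications already hook into commands.
        slot = invalidation_slot(key)
        bucket = invalidation_table.pop(slot, None)  # release the whole bucket
        if bucket is None:
            return                                   # nobody cached keys here
        for client_id in bucket:
            client = clients_by_id.get(client_id)
            if client is None:
                continue         # client already disconnected: purged lazily
            send_push_invalidation(client, slot)

    def send_push_invalidation(client, slot):
        # Placeholder: the real server would emit a RESP3 push message.
        print(f"invalidate slot {slot} -> client {client}")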
So basically, the client will no longer receive invalidation messages about
this hash slot at all. If you compare this approach to using bloom
filters, the good thing is that the per-client state cleans up
automatically: we will not notify the same client forever, nor do we need
some generational bloom filter setup.
The client will receive the invalidation and will discard the value (or
will keep a client-side data structure to invalidate lazily like Ben did?
Up to you, client authors).
So now we have a setup where:
1) We send invalidations only to clients that may have cached keys in
this bucket.
2) If the client does not ask for keys in the same bucket again, we'll not
send invalidation messages to it.
3) Client disconnections are O(1) in this regard. We do lazy removal of the
clients later.
4) The client does not need to send messages when modifying a given key.
The big disadvantage of this approach is that, basically, the number of
keys each client can cache is capped, because the output size of the hash
function is fixed. If a single message invalidates 100 keys... it's not
going to be fun. So in the range of one million keys we are fine. It's the
biggest sacrifice of this setup.
~~~ Yep but multiple connections
No problem! You can create a connection to just receive the invalidation
messages, and then create many other connections for the normal commands.
On the normal connections, instead of just sending:
CLIENT caching on
You send instead:
CLIENT caching on REDIRECT <client-id-of-the-client-receiving-messages>
So when this client requests some keys, the client ID tracked will be the
one of the connection that will receive the invalidation messages.
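In the toy model above this just amounts to resolving, for each command
connection, the client ID that should actually be recorded (again, the
names are hypothetical and this only sketches the proposal):

    # Connection client ID -> client ID of the connection receiving the
    # invalidation messages (set by CLIENT caching on REDIRECT <id>).
    redirect_to = {}

    def tracked_id(conn_id: int) -> int:
        return redirect_to.get(conn_id, conn_id)

    def track_read_redirected(conn_id: int, key: bytes) -> None:
        # Like track_read, but invalidations for keys read here will be
        # delivered to the redirected connection.
        invalidation_table[invalidation_slot(key)].add(tracked_id(conn_id))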
Moreover, when the connection we are redirecting to is destroyed, we will
receive a push message saying: your invalidation connection was closed, so
flush your cache and start again, because otherwise there is a risk of
stale reads.
~~~ What is worth caching?
On top of that, using the new auxiliary metadata that RESP3 supports, if
the client sends us some CLIENT get-key-popularity command or whatever, we
can also send it popularity info, so it knows whether or not it's worth
caching something. We may even send the info only when it's worth it, that
is, when the key has a significant popularity.
The problem is that the better client side caching works, the less the
popularity info gets updated. But clients may want to always use some form
of TTL in their caches anyway, for two reasons:
1) It helps populating the server-side caching info.
2) If you have bugs, you want to eventually recover from stale data.
This was a long email... Sorry,
Salvatore
--
Salvatore 'antirez' Sanfilippo
open source developer - Redis Labs
https://redislabs.com
"If a system is to have conceptual integrity, someone must control the
concepts."
— Fred Brooks, "The Mythical Man-Month", 1975.