Client side caching: initial design ideas


Salvatore Sanfilippo

May 8, 2018, 9:48:51 AM
to redi...@googlegroups.com
At Redis Conf an important topic was client side caching. I mentioned this
in the keynote, and a talk by Ben Malec focused on this problem and how it
can be solved currently using just what Redis provides. This is going to be
a long email, so I'm splitting it in different parts.

~~~ Why client side caching is an important thing for Redis users

Let's start with why I think client side caching is such a big
deal for the Redis community. Redis makes it possible to handle certain
tasks at scale, by delivering a good amount of operations per second over a
set of mutable data structures. However it remains a networked system, like
a traditional database.

After decades of both research and real world practice at
companies that have serious scaling problems, putting the data nearest to
the final user, which is the application layer, is clearly a winning
solution. The next step after a networked in-memory system is putting part
of the most accessed data directly inside the memory of the application
server. This concept is known as client side caching.

When the business problem allows it, it's nice to just grab a value, store
it in the app server memory, and say: let's expire it in 2 minutes. However
most applications cannot afford to serve stale data, so the problem of
caching on the client side is almost equivalent to the problem of having a
good way to invalidate what the client stored in its memory.

I have been thinking about this problem for some time, and there are
definitely many solutions, because the design space is huge and full of
potential tradeoffs. However, while at Redis Conf, the design proposed by
Ben Malec captured my imagination. When I returned home I played the game
of "how could this be improved if the server cooperated in the protocol?".
But before going forward let's see the design proposed by Ben.

~~~ Ben's approach

Ben reuses the concept of hash slots from Redis Cluster: hash the key with
some N-bit hash function, splitting the key space into 2^N possible
buckets.
Then, every write performed in the system also publishes an invalidation
message that is broadcast to all the clients, and such an invalidation
message contains the hash slot invalidated. This way the clients just
receive a fixed-length piece of information.
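A minimal sketch of the bucketing idea. Redis Cluster actually uses CRC16 modulo 16384; here Python's `zlib.crc32` with a power-of-two mask stands in as the hash function, and a plain list stands in for the pub/sub channel:

```python
import zlib

SLOT_BITS = 14                 # 2^14 = 16384 buckets, as in Redis Cluster
NUM_SLOTS = 1 << SLOT_BITS

def key_slot(key: str) -> int:
    """Map a key to a fixed-size bucket (hash slot)."""
    return zlib.crc32(key.encode()) & (NUM_SLOTS - 1)

def publish_invalidation(channel: list, key: str) -> None:
    """On every write, broadcast only the slot number: a fixed-length
    message regardless of how long the key name is."""
    channel.append(key_slot(key))

channel = []
publish_invalidation(channel, "user:1000:profile")
assert 0 <= channel[0] < NUM_SLOTS
```

The point of the fixed-length message is that the invalidation traffic does not grow with key-name size, only with write rate.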

On the client side, clients populate their local in-memory cache, and evict
items as invalidation messages arrive. Actually Ben uses a lazy form of
invalidation: he just tracks the time at which a given value was cached,
and for each hash slot he tracks the last time an invalidation message was
received. This way, when a given cached item is accessed, if its caching
time is in the past compared to the latest invalidation message, the value
is discarded and Redis is hit again to refresh the cache. Otherwise the
cached value is used.
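The lazy check can be sketched as a toy client-side cache. This is illustrative only: Python's `hash()` stands in for the real slot function, and timestamps are passed explicitly so the logic is testable:

```python
import time

class LazyInvalidatingCache:
    """Client-side cache in the spirit of Ben's design: instead of eagerly
    deleting entries, record when each value was cached and when each slot
    was last invalidated, and compare the two on access."""

    def __init__(self, num_slots=16384):
        self.values = {}                       # key -> (value, cached_at)
        self.invalidated_at = [0.0] * num_slots

    def _slot(self, key):
        return hash(key) % len(self.invalidated_at)   # stand-in hash

    def put(self, key, value, now=None):
        self.values[key] = (value, now if now is not None else time.time())

    def on_invalidation(self, slot, now=None):
        self.invalidated_at[slot] = now if now is not None else time.time()

    def get(self, key):
        entry = self.values.get(key)
        if entry is None:
            return None
        value, cached_at = entry
        if cached_at <= self.invalidated_at[self._slot(key)]:
            del self.values[key]   # stale: caller must refetch from Redis
            return None
        return value

c = LazyInvalidatingCache()
c.put("foo", "bar", now=100.0)
assert c.get("foo") == "bar"
c.on_invalidation(c._slot("foo"), now=101.0)
assert c.get("foo") is None        # stale, must hit Redis again
```

Note that invalidation costs O(1) per message here: only one timestamp per slot is written, no scan of cached keys.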

The good thing about Ben's approach is that it allows sending small
invalidation data regardless of the key size, and that it is lazy on the
client side, something that is possible because keeping metadata *per group
of keys* is a lot simpler and more efficient than keeping metadata for each
key. Ben's client-side solution already exploits this, but once you go from
a client-only solution to a server-client orchestration, this becomes even
more evident.

~~~ Next step: what if the server helps?

Ben's solution, not having any help from the server, has a few problems:

1. All the clients will receive invalidation messages about all the hash
slots, regardless of whether a given client has any key in a given hash
slot.
2. The client has to send explicit invalidation messages alongside its
writes.

With the help of the server, we could create a protocol in which, server
side, for each client we remember what keys that client is caching, in
order to send more fine-grained invalidation messages. However having the
client inform the server about each key it is caching is problematic: if
the client has to send explicit "I'm caching it" messages, we burn a lot of
bandwidth and CPU for this task. Moreover there is even the problem of race
conditions to handle: what if I inform the server after the value was
already changed? Do I then have to implement some "epoch" attribute of the
key to understand if the value is already no longer valid? And so forth.

Instead what I propose is putting the connection in tracking mode:

CLIENT caching on

This will tell the server: for every command you process from me, assume
that the keys mentioned in such command are keys that I'm going to cache.

So if the client executes "GET foo", and the key "foo" is later modified,
the client will get an invalidation message, using the new "push" messages
introduced in RESP3 (if you think you need multiple connections, N for
commands, and one for invalidation messages, wait a bit, we'll get into it
soon).

However we don't want to track, for each client, every key we provided to
it: in a few use cases we are talking about millions of keys in a matter of
minutes. So let's use Ben's approach of hashing the key: we will remember
whether the client has keys in a given hash slot, and we'll invalidate hash
slots.

Instead of using the Cluster CRC16 mask, we probably want to use more bits.
For instance with 18 bits we have 262144 buckets, so if a client is caching
one million keys, an invalidation message will trash 3 or 4 keys on
average, which sounds acceptable.
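A quick sanity check of that arithmetic:

```python
# With an 18-bit hash we get 262144 buckets, so a client caching one
# million keys loses only a handful of keys per invalidation message
# on average.
BUCKET_BITS = 18
buckets = 1 << BUCKET_BITS
assert buckets == 262144

cached_keys = 1_000_000
keys_per_bucket = cached_keys / buckets
assert 3 < keys_per_bucket < 4     # ~3.8 keys trashed per invalidation
```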

So, server side, we keep what is called an Invalidation Table, which is a
linear array of 262144 pointers, each pointing to a radix tree. Every time
a client requests data about a given key, the key is hashed to its bucket,
and we store the client ID (not the client structure pointer!) in the
corresponding radix tree.

Clients will need to be reorganized so that they are also stored inside a
global radix tree of clients, mapping the client ID to the actual client
structure. This is a key point, because it means that when we free a
client, we don't have to free all its references inside the invalidation
table. We'll do it lazily.
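A toy model of this server-side state, with a plain dict standing in for each bucket's radix tree and for the global client registry (names are illustrative, not actual Redis internals):

```python
BUCKETS = 1 << 18

invalidation_table = [None] * BUCKETS   # bucket -> set of client IDs
clients = {}                            # client ID -> client object

def track(client_id, key):
    """Called for every key a tracking-enabled client reads."""
    b = hash(key) % BUCKETS             # stand-in for the 18-bit hash
    if invalidation_table[b] is None:
        invalidation_table[b] = set()
    invalidation_table[b].add(client_id)

def disconnect(client_id):
    """O(1): only drop the registry entry; stale IDs left behind in the
    invalidation table are skipped lazily when a bucket fires."""
    clients.pop(client_id, None)

clients[7] = object()
track(7, "foo")
disconnect(7)          # no scan of the invalidation table needed
assert 7 not in clients
```

Storing IDs rather than pointers is what makes the lazy cleanup safe: a stale ID simply fails the registry lookup later, whereas a dangling pointer would crash.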

~~~ What happens when a key is modified

Here we leverage the same code path as the keyspace notifications; the
Redis core already has hooks in all the commands for that. If a key is
modified, we go check the invalidation table and send a message to each
client listed there, removing at the same time all the clients referenced
in this bucket. We can simply release the whole radix tree!

So basically, the client will no longer receive invalidation messages
about this hash slot at all. If you compare this approach to using bloom
filters, the good thing is that the client state cleans itself up
automatically: we will not notify the same client forever, nor do we need
some generational bloom filter setup.

The client will receive the invalidation, and will discard the value (or
keep a client-side data structure to invalidate lazily like Ben did? Up to
you, client authors).
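The write path can be sketched as follows. Again a toy model with illustrative names, not Redis internals; a sparse dict plays the role of the invalidation table, and each client object is just a list used as its push-message queue:

```python
BUCKETS = 1 << 18
table = {}          # bucket -> set of client IDs (sparse invalidation table)
clients = {}        # client ID -> list acting as the client's push queue

def track(client_id, key):
    table.setdefault(hash(key) % BUCKETS, set()).add(client_id)

def signal_key_modified(key):
    """Hook on the keyspace-notification code path: notify every live
    client recorded in the key's bucket, then drop the whole bucket."""
    b = hash(key) % BUCKETS
    for cid in table.pop(b, ()):       # release the bucket entirely
        queue = clients.get(cid)       # lazily skip disconnected IDs
        if queue is not None:
            queue.append(("invalidate", b))

clients[1] = []
track(1, "foo")
signal_key_modified("foo")
assert clients[1] == [("invalidate", hash("foo") % BUCKETS)]
signal_key_modified("foo")   # bucket already cleared: no second message
assert len(clients[1]) == 1
```

The second assertion is the self-cleaning property: a client only re-enters the bucket by reading a key in it again, so it never receives repeated invalidations for data it no longer caches.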

So now we have a setup where:

1) We send invalidations only to clients that may have cached keys in this
bucket.
2) If the client does not ask for keys in the same bucket again, we'll not
send invalidation messages to it.
3) Client disconnections are O(1) in this regard. We do lazy removal of the
clients later.
4) The client does not need to send messages when modifying a given key.

The big disadvantage of this approach is that, basically, the number of
keys each client can cache is capped, because the output of the hash
function is fixed in size. If a single message invalidates 100 keys...
it's not going to be fun. So in the range of one million keys we are fine.
It's the biggest sacrifice of this setup.

~~~ Yep but multiple connections

No problem! You can create a connection to just receive the invalidation
messages, and then create many other connections for the normal commands.
In the normal connections instead of just sending:

CLIENT caching on

You send instead:

CLIENT caching on REDIRECT <client-id-of-the-client-receiving-messages>

So when such a client requests some keys, the client ID actually tracked
will be the one that will receive the notifications.

Moreover, when the connection we are redirecting to is destroyed, we will
receive a push message saying: your invalidation connection closed, so
flush your cache and restart, because otherwise there is a risk of stale
reads.
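A toy model of the REDIRECT idea (illustrative names only): data connections register the keys they read under the ID of the dedicated invalidation connection, so all pushes go to one place:

```python
tracked = {}        # bucket -> set of client IDs to notify

class Connection:
    def __init__(self, client_id, redirect_to=None):
        self.client_id = client_id
        self.redirect_to = redirect_to  # CLIENT caching on REDIRECT <id>

    def effective_target(self):
        """The ID that ends up in the invalidation table."""
        if self.redirect_to is not None:
            return self.redirect_to
        return self.client_id

invalidation_conn = Connection(client_id=100)
data_conn = Connection(client_id=200, redirect_to=100)

def track(conn, key, buckets=1 << 18):
    tracked.setdefault(hash(key) % buckets, set()).add(conn.effective_target())

track(data_conn, "foo")
# the invalidation connection's ID is what gets recorded
assert 100 in tracked[hash("foo") % (1 << 18)]
```

With this shape, however many data connections a pool opens, the server tracks a single target per application instance, which also makes the "invalidation connection closed, flush everything" rule simple to implement.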

~~~ What is worth caching?

On top of that, using the new auxiliary metadata that RESP3 supports, if
the client sends us some CLIENT get-key-popularity or whatever command, we
can also send it popularity info, so it knows whether or not it's worth
caching something. We may even send the info only if it's worth it, that
is, if the key has a significant popularity.

The problem is that the better client side caching works, the less the
popularity info is updated. But clients may want to always use some form of
TTL in their caches anyway, for two reasons:

1) It helps populate server-side caching info.
2) If you have bugs, you want to eventually recover from stale data.
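A sketch of such a TTL-capped client cache (illustrative only, with explicit timestamps for testability): even with server pushes, every entry carries a hard expiry so stale data cannot survive a bug forever, and the server still sees periodic reads to feed its popularity stats:

```python
import time

class TTLCache:
    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self.data = {}                 # key -> (value, expires_at)

    def put(self, key, value, now=None):
        now = time.time() if now is None else now
        self.data[key] = (value, now + self.ttl)

    def get(self, key, now=None):
        now = time.time() if now is None else now
        entry = self.data.get(key)
        if entry is None or now >= entry[1]:
            self.data.pop(key, None)   # expired: re-fetch from Redis
            return None
        return entry[0]

c = TTLCache(ttl_seconds=120)
c.put("foo", "bar", now=0)
assert c.get("foo", now=60) == "bar"
assert c.get("foo", now=120) is None   # hard cap: back to Redis
```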

This was a long email... Sorry,
Salvatore






--
Salvatore 'antirez' Sanfilippo
open source developer - Redis Labs https://redislabs.com

"If a system is to have conceptual integrity, someone must control the
concepts."
— Fred Brooks, "The Mythical Man-Month", 1975.

Salvatore Sanfilippo

May 8, 2018, 1:26:24 PM
to redi...@googlegroups.com
As for the Twitter feedback: this may not be immediately clear, please ask
me anything.

Itamar Haber

May 8, 2018, 1:37:38 PM
to Redis DB
Lovin' it.

One point I've encountered in my recent forays into Keyspace Notification Land that needs to be kept in mind: like with BLPOP, you'll have to make sure that internal invalidation of keys happens only after a MULTI's EXEC has completed and a Lua script is done. This is currently doable only inside the server, not by modules. Just pointing this out :)
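A toy model of that ordering concern (illustrative, not Redis internals): invalidations produced inside a MULTI are buffered and only flushed at EXEC, so observers never see a partially applied transaction:

```python
class TxnInvalidationBuffer:
    def __init__(self):
        self.in_multi = False
        self.pending = []   # invalidations produced mid-transaction
        self.sent = []      # what observers actually receive

    def multi(self):
        self.in_multi = True

    def notify(self, bucket):
        if self.in_multi:
            self.pending.append(bucket)   # defer until EXEC
        else:
            self.sent.append(bucket)

    def exec(self):
        self.sent.extend(self.pending)    # flush after the transaction
        self.pending.clear()
        self.in_multi = False

buf = TxnInvalidationBuffer()
buf.multi()
buf.notify(42)
assert buf.sent == []     # nothing leaks mid-transaction
buf.exec()
assert buf.sent == [42]
```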

Also, thinking out loud here. In theory, one could wish for a local cache of the entire keyspace, without resorting to SYNC/PSYNC implementations. Alternatively, instead of actually touching all/part of the keyspace, it would be neat-o to have a `CLIENT CACHING on [KEY <key>] [PATTERN <pattern>]` kind of thing. Lastly, what if a client wants to stop "watching" a key or a pattern? Perhaps stating the obvious here, but `CLIENT CACHING off` should also support these options.

b...@malec.us

May 8, 2018, 2:36:11 PM
to redi...@googlegroups.com
I have to say I really like the improvements you made to my original design.  At first I was confused that you were removing all the clients from a bucket once you sent an invalidation message, but then it hit me that of course those clients would have to hit Redis to read the new value, which would enroll them back into the notification list.  Clever!

I'll keep trying to think of corner cases but on first pass this seems to be a really good approach.

Ben

--
You received this message because you are subscribed to the Google Groups "Redis DB" group.
To unsubscribe from this group and stop receiving emails from it, send an email to redis-db+u...@googlegroups.com.
To post to this group, send email to redi...@googlegroups.com.
Visit this group at https://groups.google.com/group/redis-db.
For more options, visit https://groups.google.com/d/optout.

b...@malec.us

May 8, 2018, 3:43:57 PM
to redi...@googlegroups.com
One question I thought of: say a client only wants to cache the top 5% most popular keys, how would it do that?

First the client would send "CLIENT caching on" to Redis.
Next it would send something like your "CLIENT get_key_popularity" to Redis, so it would receive attributes ranking each key's popularity.
But the client will only receive that popularity data when it executes a "GET my_unpopular_key", right?

So say that GET call returns the value of my_unpopular_key and an attribute saying the key's popularity is 0.000000005.  Because the popularity is so low, the client decides not to cache the value locally.  But won't the GET operation enroll the client to receive a notification if any key in that hash slot changes?

Do you see a way that the client could tell Redis, "No, I'm not going to cache this value locally, so don't send me a notification if that key is updated"?  Or, in the interest of simplicity, would it be better if the client still receives the update notification but just ignores it?

Marc Gravell

May 8, 2018, 4:38:29 PM
to redi...@googlegroups.com
Looks really interesting. A few immediate thoughts:

- what is the interplay between cache slots and redis cluster? presumably the cache slots are per server, i.e. a single server is only tracking its own keys, not the keys in other shards; in which case: is the number of total slots the {cache slot width} x {number of primary servers}? or would the cache slots share a divisor (i.e. "full crc16 plus a few bits" or similar) with the cluster slot such that the total number of cache slots is always {cache slot width}? (i.e. each cluster slot would be 8 cache slots)

- again on cluster: what happens when slots are migrated? is the cache invalidated per key? there could be lots of keys in a single cluster slot, which may have the same cache slot, or a wide range of cache slots, or a small finite number of cache slots - again depending on the same interplay of how cache slot and cluster slot are related; I'm just slightly cautious of how noisy a cluster slot migration could be in terms of cache invalidations; also, can a client that is reconnecting due to cluster-slot migration say "I don't want to have tell you all the keys, but I'm interested in these cache slots"?

- I am slightly concerned about the last point you raise, about successful caching meaning the tracking data becomes invalid; I wonder if there should be some lightweight feedback API here, essentially an INCRBY for cache usage data

- we currently use the same redis store for multiple purposes; some involve cache, some don't; I guess the view here would be "use different connections for tracked vs untracked usage", and that could indeed be a simple and pragmatic solution; I just thought I'd mention it

- if a client hasn't said it wants to participate in caching: does a change operation still invalidate the cache-slot for other observers? I can see merits in both "yes" and "no" here - "yes" means no unexpected missed changes; "no" means that clients using keys that don't play with caching can play in the same database without causing unexpected false-positive invalidations

- databases; yes, I know they're not advised and are deprecated in "cluster"; but for non-cluster: are cache slots per database, or per server?

As I say, though; looking very interesting!
--
Regards,

Marc

steve.l...@nokia.com

May 8, 2018, 4:49:54 PM
to Redis DB


> ~~~ Yep but multiple connections
>
> No problem! You can create a connection to just receive the invalidation
> messages, and then create many other connections for the normal commands.
> In the normal connections instead of just sending:
>
> CLIENT caching on
>
> You send instead:
>
> CLIENT caching on REDIRECT <client-id-of-the-client-receiving-messages>
>
> So when this client requests some keys, the actual client ID tracked will
> be the one that will receive the notifications.

Regarding this part: this sounds like one client being able to register another client for receipt of invalidation messages.
Not sure if this on-behalf-of approach is used elsewhere within Redis, but couldn't this be used as an attack vector of sorts?
A rogue, or maybe just misbehaving, client registering other, different, unsuspecting clients for these notifications.
Maybe we need to provide the option for the listening client to supply a notification auth key: the client requesting on
behalf of the other must provide the same key to do so.
Just a thought. Maybe too edge-casey, but it seems simple enough to do at inception.

Marc Gravell

May 8, 2018, 5:04:30 PM
to redi...@googlegroups.com
If you have a rogue or misbehaving client, you're already kinda screwed. There's tons more interesting ways to screw things up, including DEBUG SEGFAULT or CLIENT KILL. I'm not saying it is an invalid concern - it does feel like clients should need to opt in here; an auth key is probably overkill, but "CLIENT let-others-nominate-me" (naming is hard!) doesn't seem unreasonable. Heck, it would be nice if such a command returned the client id, because that is slightly awkward to get otherwise, so the code would already need to issue *some* command to allow it to find the client id to advertise.





--
Regards,

Marc

b...@malec.us

May 8, 2018, 5:18:56 PM
to redi...@googlegroups.com
Marc, I think I can partially answer you: the "hash slots" being used for cache item invalidation don't correspond to Redis cluster nodes; rather, it's just a way to bucket keys together so you can invalidate small groups of keys at the same time.  In my presentation at RedisConf I used the same algorithm as Redis uses for clusters since I figured it would be one Redis users would be familiar with.  But I think Salvatore is looking at 18-bit values to improve selectivity.

In any case, it functions completely independently of how the cluster partitions keys.

Ben


Gu Rui

May 9, 2018, 2:22:18 AM
to Redis DB
Hi Salvatore,

Redisson has been providing a client side caching feature for about two years now and we have received tremendous interest because of it. I am glad you are working on improving it with server side capabilities and having it ready for RESP3. This reminds me that we were at one point talking about faking a slave to try to achieve this, but that's another story for another day. Now I would like to describe how Redisson has done it from the client side and how it differs from Ben's design.

Redisson's client-side caching is implemented not for strings but for hashes. We chose hashes because they are the most used data type in Redisson and there are a lot of hash related improvements we have built, such as sharded hashes and field-expirable hashes. In addition, this design choice has the following benefits over caching strings:
  1. We don't have the noisy-neighbours issue coming from other, uninteresting keys like Ben has described. Each locally cached hash has its own invalidation channel. You will only receive information relating to this one key. This is particularly useful since you can "group" a few pieces of business related information together in one hash and keep them up to date all the time, without caring about other data from the rest of the system.
  2. By tagging the hash and the pub/sub channel together in one slot, we can achieve an atomic update-and-broadcast operation through Lua scripts. As we have learned over the years, atomicity can be a hard requirement in some use cases.
I think the most important thing about client side caching is removing a locally cached value. In Redisson, we have two scenarios where locally cached values are removed: when they are invalidated and/or when they are evicted. Invalidated means the data has ultimately been changed or removed in Redis, while evicted means the value is no longer cached locally but the data is still in Redis.

With this in mind, the Redisson client side cache was done at instance level, meaning each object instance works like an independent client. You may have a few object instances all working on the same hash, but they can have different cached contents. Redisson provides three ways to process an invalidation:
  • None: an update is not going to be published.
  • Invalidate: tells other clients that this field is now updated; other clients remove this entry from their own cache.
  • Update: tells other clients that this field is now updated, and here is the new data.
A client should also be able to decide, independently, how much data it can cache locally. When the limit has been reached, there should be ways to remove some/all of it from the local cache space. Redisson provides 5 different ways:
  • None: meaning no eviction at all.
  • LRU: remove the least recently used item. This option is used with max cache size restrictions.
  • LFU: remove the least frequently used item. This option is used with max cache size restrictions.
  • Soft: data is stored using SoftReference. Space will only be reclaimed by GC when the JVM is running out of memory.
  • Weak: data is stored using WeakReference. Space will be reclaimed by GC even when the JVM has available memory.
In addition to the above, eviction can also happen based on time; it can be specified with time-to-live and/or max-idle-time config options.
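As an illustration of one of the policies listed above, here is a minimal LRU local cache with a max-size cap. This is a generic sketch, not Redisson's actual implementation:

```python
from collections import OrderedDict

class LRULocalCache:
    """Local cache that evicts the least recently used entry once the
    configured maximum size is exceeded."""

    def __init__(self, max_size):
        self.max_size = max_size
        self.data = OrderedDict()    # insertion/usage order = recency

    def put(self, key, value):
        self.data[key] = value
        self.data.move_to_end(key)
        if len(self.data) > self.max_size:
            self.data.popitem(last=False)   # evict least recently used

    def get(self, key):
        if key not in self.data:
            return None
        self.data.move_to_end(key)          # mark as recently used
        return self.data[key]

c = LRULocalCache(max_size=2)
c.put("a", 1)
c.put("b", 2)
c.get("a")        # "a" becomes most recently used
c.put("c", 3)     # evicts "b"
assert c.get("b") is None and c.get("a") == 1 and c.get("c") == 3
```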

When a client goes offline for some time, locally cached values can still serve requests, but when it reconnects to the server once again, there are three different ways to handle the situation:
  • None: does nothing.
  • Clear: the client clears all of its locally cached values upon reconnecting to the server.
  • Load: the client finds all the items that have been changed in the last 10 minutes and removes them from local storage when reconnected. If the client was offline for longer than 10 minutes, it removes all of its locally cached values.

I wrote all this between 1am and 2am, so I'm not sure it all makes sense; I am happy to clarify anything.

jona...@findmeon.com

May 9, 2018, 2:22:18 AM
to Redis DB
What happens if the client doesn't receive the invalidation message?  Have you considered a timeout on the caching/validity or ways to query/determine if messages have been received?

Salvatore Sanfilippo

May 9, 2018, 5:24:44 AM
to redi...@googlegroups.com
Thanks Ben,

the thing about removing the clients from the invalidation table also has
the big advantage that a client is no longer notified with invalidation
messages about keys it no longer holds in memory.
This is AFAIK a key advantage of this scheme, because it means that clients
that go from caching a set of N keys to a different set of N/10 keys, for
instance, will receive proportionally fewer notifications.

Salvatore Sanfilippo

May 9, 2018, 5:31:26 AM
to redi...@googlegroups.com
I have the feeling that we could skip this problem for now, because even if
the client does a GET and decides not to cache the value, it will get a
single notification for this key: if the key is not fetched again (and
there should be a high probability of this happening) we'll not get
further notifications. Otherwise, telling the server "I'm not caching this"
has two problems: 1) It seems like the same work as receiving the
notification later, so why bother? 2) The server may not know what to do,
because you may have other keys hashing into the same bucket.

Salvatore Sanfilippo

May 9, 2018, 5:33:26 AM
to redi...@googlegroups.com
Hello Steve,

there should be no security concern, because here it is basically the
client that should receive the notifications that says "ok, give my
notifications to another client". So it cannot be used to steal info, only
to provide info.
And if you think of it as a DoS, you can just do CLIENT KILL :-D



Salvatore Sanfilippo

May 9, 2018, 5:33:48 AM
to redi...@googlegroups.com
Exactly, sorry, I replied without first seeing this reply.

Salvatore Sanfilippo

May 9, 2018, 5:35:08 AM
to redi...@googlegroups.com
Yep, basically let's call it "invalidation bucket" or whatever from now on,
and never mention Hash Slots again, otherwise we are going to have a lot of
confusion :-D

Salvatore Sanfilippo

May 9, 2018, 5:47:04 AM
to redi...@googlegroups.com
Thanks Rui,

comments inline:
On Wed, May 9, 2018 at 8:22 AM Gu Rui <jacky...@gmail.com> wrote:

> Hi Salvatore,

> Redisson has been providing client side caching feature for about two
years now and we have received tremendous interest because of it. I am glad
you are working on the improvement with server side capabilities and have
it ready for RESP3. This has reminded me that we were at one point talking
about faking a slave to try to achieve this, but that's another story for
another day. Now I would like to describe how Redisson has done it from the
client side and how it is different to Ben's design.

Thanks, yep faking slaves is a popular one :-D

> Redisson's client-side caching is implemented not for strings but for
hashes. We chosen hash because it is the most used data types in Redisson
and there are a lot of hash related improvements we have built such as
sharded hashes and field expirable hashes. In addition, this design choice
has following benefits over caching strings:

Note that the approach described is not specific to any Redis type; it
just caches a "key", whatever it contains, but caches it as a whole.

> We don't have the noisy-neighbours issue coming from other non-interest
keys like Ben has described. Each locally cached hash has its own
invalidation channel. You will only receive information relating to this
one key. This is particularly useful since you can "group" a few business
related informations together in one hash and keep up-to-date all the time
while doesn't care about other data from the rest of the system.
> By tagging the hash and pub/sub channel together in one slot, we can
achieve atomic updates-broadcast operation through Lua scripts. As we have
learned over the years, atomicity can be a hard requirement in some use
cases.

I think this is a good thing in general, but it is kinda orthogonal to the
scheme proposed. However yes, I can see how people could say: look, I don't
want each of these pieces of information in a separate key, especially
because the selectivity of Redis invalidations is limited, so let's group
info inside hashes.

> I think the most important thing about client side caching is removing a
local cached value. In Redisson, we have two scenarios where local cached
value are removed: when it is invalidated and/or when it is evicted.
Invalidated means the data has ultimately changed or removed in Redis,
while evicted means the value is no longer cache locally but the data is
still in Redis.

I agree, but in some way we can split this into a very clear separation of
concerns:

1) Cache eviction: up to the client.
2) Cache invalidation: up to the server.

This is why my description focuses so much on "2": with "1" you can do
whatever you like on the client side.
It is also important to note that, because client side caching removes in
some way the server's ability to estimate key popularity very well,
advanced clients may well implement something like that client side to
make better choices.
Even if I have the feeling that another way to reach this same goal is just
to put a hard TTL on every cached key, so that we still ask Redis often
enough to update the stats (but seldom enough to avoid load problems and
still benefit from the client side value 99.9% of the time).

> With this in mind, the Redisson client side cache was done at instance
level, meaning each object instance works like an independent client. You
may have a few object instances all working on the same hash, but they can
have different cached contents. Redisson provided three ways to process an
invalidation:

> None: An update is not going to be published.
> Invalidate: tells other clients that this field is now updated, other
clients removes this entry from their own cache
> Update: tells other client that this field is now updated and here is the
new data.

The None policy is interesting from the POV of the feature I'm trying to
put into Redis, because it could be useful to be able to say to Redis: the
next value I'm fetching I'm not going to save, or I'll just use some
eviction policy, so don't record it in the invalidation table.
It's not clear how to implement this, but it's something to keep in mind
IMHO.

> A client should also be able to decide, independently, how much data it
can cache locally. When the limit has reached, there should be ways to
remove some/all of them from the local cache space. Redisson has provided 5
different ways:

> None: Meaning no eviction at all.
> LRU: Remove a least recently used item. This option is used with max
cache size restrictions
> LFU: Remove a least frequently used item. This option is used with max
cache size restrictions
> Soft: Data are stored using SoftReference. Spaces will only be reclaimed
by GC when JVM is running out of memory.
> Weak: Data are stored using WeakReference. Spaces will be reclaimed by GC
even when the JVM has available memory.

Ok we are in the realm of client side here.

> In addition to the above, eviction can also happen based on time; it can
be specified with time-to-live and/or max-idle-time config options.

> When a client goes offline for some time, local cached values can still
serve requests, but when it is reconnected to the server once again, there
are three different ways to handle the situation:

> None: does nothing.
> Clear: the client clears all of its local cached values upon reconnecting
with the server.
> Load: the client will find all the items that have been changed in the
last 10 minutes and remove them from the local storage when reconnected. If
the client went offline for longer than 10 minutes, it will remove all of
its local cached values.

This is another good point... It makes sense for clients to perform some
incremental warm-up of the cache on disconnections. In the server-side
approach, as long as the notification connection stays alive, there is no
need to flush the cache.
But if such a connection is closed, the cache should be flushed; however,
clients still know the content of the cache, and may decide instead to
re-populate it ASAP by re-fetching the values in the background,
incrementally, or alike.
This is still client logic, but a very interesting hint nonetheless.
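The three reconnection policies can be sketched like this. The `changed_keys_since` callable is a hypothetical hook asking the server which keys changed after a given timestamp (Redisson keeps roughly 10 minutes of history; beyond that window, everything is dropped).

```python
import time

# A sketch of the None / Clear / Load reconnection policies described
# above, applied to a local cache dict when the connection comes back.
def on_reconnect(local_cache, policy, offline_since, changed_keys_since,
                 history_window=600.0):
    if policy == "none":
        return
    if policy == "clear":
        local_cache.clear()          # safest: start from scratch
        return
    if policy == "load":
        if time.time() - offline_since > history_window:
            local_cache.clear()      # offline too long: history is gone
            return
        for key in changed_keys_since(offline_since):
            local_cache.pop(key, None)  # drop only the stale entries
```

The Load policy is the incremental warm-up idea: most of the cache survives the disconnection, and only keys that actually changed are dropped.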

> Wrote all these between 1am and 2am, not sure if it all makes sense; I am
happy to clarify anything.

Everything made sense :-) Thanks, very informative overview.
I also appreciate the openness in describing what is not just an OSS
project but also a product for you.
> --
> You received this message because you are subscribed to the Google Groups
"Redis DB" group.
> To unsubscribe from this group and stop receiving emails from it, send an
email to redis-db+u...@googlegroups.com.
> To post to this group, send email to redi...@googlegroups.com.
> Visit this group at https://groups.google.com/group/redis-db.
> For more options, visit https://groups.google.com/d/optout.



Gu Rui

unread,
May 9, 2018, 6:01:04 AM5/9/18
to Redis DB
Hi Salvatore,

Here is what I think with regarding these points:

> ~~~ Yep but multiple connections 


No problem! You can create a connection to just receive the invalidation 
messages, and then create many other connections for the normal commands. 
In the normal connections instead of just sending: 

It is very important that there is a mechanism to receive not only invalidations but also new values/changes. We have had requests from users for some use cases, such as planned sales events like Black Friday and 11.11 or any other flash sales, where certain content needs to be pushed to the edge (client side) ahead of time. Just invalidating a piece of data on change would actually make the whole situation worse.


CLIENT caching on 

You send instead: 

CLIENT caching on REDIRECT <client-id-of-the-client-receiving-messages> 

So when this client requests some keys, the actual client ID tracked will 
be the one that will receive the notifications. 

Moreover when the connection we are redirecting to is destroyed, we will 
receive a push data saying, your invalidation connection closed, so flush 
your cache and restart, because otherwise there is a risk of stale reads. 

I think this should work similarly to streams, or be backed internally by streams. When a client is interested in a key, it can become a consumer of the stream of invalidations/changes; then you don't need to worry about missing updates due to a lost connection: upon reconnecting, all the missed invalidations are replayed to you. Currently, we are tracking these through the use of a zset.
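The stream-backed idea above can be sketched in pure Python by modeling the invalidation stream as an ordered log of (id, key) entries; a real implementation would use XADD/XREAD against a Redis stream (or, as described, a zset) instead of an in-process list.

```python
# A sketch of replayable invalidations: the server appends every
# invalidation to an ordered log, and a reconnecting client replays
# everything after the last ID it saw, so no invalidation is lost
# across a dropped connection.
class InvalidationLog:
    def __init__(self):
        self.entries = []            # list of (id, key), ids increasing
        self.next_id = 1

    def append(self, key):
        self.entries.append((self.next_id, key))
        self.next_id += 1

    def read_after(self, last_seen_id):
        return [(i, k) for i, k in self.entries if i > last_seen_id]

def replay(local_cache, log, last_seen_id):
    for entry_id, key in log.read_after(last_seen_id):
        local_cache.pop(key, None)   # drop every key invalidated offline
        last_seen_id = entry_id
    return last_seen_id              # client persists this for next time
```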

~~~ What is worth caching?

On top of that, using the new auxiliary metadata that RESP3 supports, if
the client sends us some CLIENT get-key-popularity or whatever command, we
can also send it popularity info, to know whether or not it's worth caching
something. We may even send the info only if it's worth it, that is, if the
key has significant popularity.

The problem is that the more client side caching works, the less the
popularity info is updated. But clients may want to always use some form of
TTL anyway in caches, for two reasons:

1) It helps populate server-side caching info.
2) If you have bugs, you want to eventually recover from stale data.

Totally agree that client-side caching suppresses the "heat" of a key. This is one of the reasons people want to use client-side caching, apart from reducing network latency. I too believe that keeping track of a key's popularity info would require some work from the client side via command(s).

I also agree that client-side TTL is an important feature; it has proved to be very useful in Redisson.
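One way the client-side work hinted at above could look: the client tracks its own hit counts and only promotes a key to the local cache once it crosses a threshold (and could periodically report those counts back to the server). This is purely a hypothetical sketch; none of these names exist in any client.

```python
from collections import Counter

# A sketch of client-side popularity tracking: count local hits per key
# and cache only the keys that are actually hot.
class PopularityTracker:
    def __init__(self, threshold=3):
        self.threshold = threshold
        self.hits = Counter()

    def record_hit(self, key):
        self.hits[key] += 1

    def worth_caching(self, key):
        return self.hits[key] >= self.threshold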

Best regards

Rui
On Tuesday, 8 May 2018 14:55:51 UTC, Salvatore Sanfilippo wrote:

Salvatore Sanfilippo

unread,
Jun 27, 2018, 1:04:40 PM6/27/18
to redi...@googlegroups.com
Hi all,

minor update on this. Today I implemented CLIENT UNBLOCK (see the other
topic on this ML) into unstable, and I believe I'll back-port it to the
5.0 RC. As a side effect of the implementation, clients are now also
stored in a radix tree, working as a dictionary mapping client IDs to
client handles. This is a small part of what we need to have the full
client caching semantics. During Redis 6 development the rest will be
added.

Cheers,
Salvatore
On Tue, May 8, 2018 at 3:48 PM Salvatore Sanfilippo <ant...@gmail.com> wrote:

Itamar Haber

unread,
Jul 4, 2019, 6:42:31 PM7/4/19
to redi...@googlegroups.com
Hi all,

major update on this :) xref: http://antirez.com/news/130

Cheers,



--

Itamar Haber
Technicalist Evangely

Phone: +972.54.567.9692

Redis Labs
