Tag-based cache invalidation in redis cluster+scripting

sqm1998

Oct 24, 2011, 4:43:19 AM
to Redis DB
What is the most efficient way to implement the following tag-based
cache invalidation mechanism within the updating redis cluster
+scripting?

Basically, redis is used to store cache entries that are associated
with one or more tags. At various times, application logic (external
to redis) will know that cache entries associated with particular
tag(s) are now 'stale' and need to be invalidated. EG:
http://code.google.com/p/memcached/wiki/ProposalTagSupport

A simple (but somewhat inefficient) algorithm would be:
1. Store each tag as a redis set containing the various associated
cache entry keys.
2. When it is time to invalidate all cache entry keys associated with
tag X, query redis with SUNION to get all the to-be-invalidated cache
entry keys, and then use a multi-key DEL to remove all the invalidated
keys.
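
A minimal Python sketch of that two-step flow, simulating the SADD/SUNION/DEL semantics with plain dicts and sets (all names here are illustrative, following the tags:*/key:* convention above):

```python
# Simulated cache: entry keys -> values. In Redis these would be real keys.
cache = {"key:Y": "y", "key:Z": "z", "key:W": "w", "key:V": "v"}

# Simulated tag index: each tag "set" holds the cache keys carrying that
# tag (the role a Redis SET built up via SADD would play).
tags = {
    "tags:A": {"key:Y", "key:Z"},
    "tags:B": {"key:Z", "key:W"},
}

def invalidate(tag_names):
    # Step 1: SUNION across the tag sets to collect every stale cache key.
    stale = set().union(*(tags.get(t, set()) for t in tag_names))
    # Step 2: multi-key DEL of the unioned keys.
    for k in stale:
        cache.pop(k, None)
    # Drop the now-consumed tag sets as well.
    for t in tag_names:
        tags.pop(t, None)
    return stale

removed = invalidate(["tags:A", "tags:B"])
# key:V carried neither tag, so it survives the invalidation.
```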

In a perfect world, would it be possible to use Lua scripting to send
a single EVAL to redis that does the SUNION to get the invalidated
keys and feeds them directly into a DEL command? EG (pseudo-code):
EVAL {DEL (SUNION tagA tagB tagC...)}...

And, within the context of redis cluster
(http://redis.io/topics/cluster-spec), it is said that "Redis Cluster
implements all the
single keys commands available in the non distributed version of
Redis. Commands performing complex multi key operations like Set type
unions or intersections are not implemented, and in general all the
operations where in theory keys are not available in the same node are
not implemented."

So, based on the above quote from the redis cluster spec, if a
scripting EVAL was issued by a client (EG: phpredis) to the redis
cluster, what would happen? Would the receiving node in the redis
cluster be able to broadcast the SUNION to all the relevant nodes,
aggregate the results, and then perform the multi-key DEL (once again
broadcasting to all the hashed nodes in the cluster)? Or would that be
impossible, so that every set would have to be retrieved separately
(which would seem very inefficient)?

Josiah Carlson

Oct 29, 2011, 3:22:46 AM
to redi...@googlegroups.com
On Mon, Oct 24, 2011 at 1:43 AM, sqm1998 <sqm...@gmail.com> wrote:
> What is the most efficient way to implement the following tag-based
> cache invalidation mechanism within the updating redis cluster
> +scripting?
>
> Basically, redis is used to store cache entries that are associated to
> 1 or more tags. At various times, application logic (external to
> redis) will know that cache entries associated to a particular tag(s)
> are now 'stale' and need to be invalidated. EG:
> http://code.google.com/p/memcached/wiki/ProposalTagSupport
>
> A simple (but somewhat inefficient) algorithm would be:
> 1. Store each tag as a redis set containing the various associated cache
> entry keys.
> 2. When it is time to invalidate all cache entry keys associated with
> tag X, query redis with SUNION to get all the to be invalidated cache
> entry keys, and then use a multi-key DEL to remove all the invalidated
> keys.

Let me try to understand this.

You have cache key Y with tags A, B, C. You perform the following operations:

SADD tags:A Y
SADD tags:B Y
SADD tags:C Y
SET key:Y ...

The second part of your explanation suggests that you are also performing:

SADD ktags:Y A
SADD ktags:Y B
SADD ktags:Y C

Then to clear your data, you perform a SUNION across ktags:* entries
based on the output of SMEMBERS tags:X.

> In a perfect world, would it be possible to use LUA scripting to send
> a single EVAL to redis, that will both do the SUNION to get the
> invalidated keys, within a DEL command? EG (pseudo-code): EVAL {DEL
> (SUNION tagA tagB tagC...)}...

You could do that, but I wouldn't. If you end up needing to invalidate
more than a few thousand tags, or unioning more than a few keys, you
will cause Redis to pause for a surprisingly long time.

I've got an idea for an incremental clearing of this kind of data
(which would be faster scripted), which you can do for some user-set
number of keys per pass, which wouldn't cause Redis to pause doing too
much work. You could call it from a worker every few seconds (or more
often, depending on how quickly you want to clear out old data), but
it wouldn't work quite right in scripted cluster (but would work fine
in regular Redis scripting).
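
To make the shape of that incremental pass concrete, here is a pure-Python stand-in (no real Redis; the batch size and the names are made up) that a worker loop could mirror:

```python
import itertools

# Stand-ins for the Redis data: live cache entries and a tag set
# queued for incremental clearing.
cache = {"key:%d" % i: i for i in range(10)}
pending = {"key:%d" % i for i in range(10)}  # members of the stale tag set

def clear_pass(batch=4):
    """Delete at most `batch` tagged entries; return how many were deleted."""
    # Grab a small chunk (like SRANDMEMBER <tag> <batch> in Redis 2.6+).
    chunk = list(itertools.islice(pending, batch))
    for k in chunk:
        cache.pop(k, None)   # DEL key
        pending.discard(k)   # SREM tag key
    return len(chunk)

# A worker would invoke this every few seconds until nothing is left,
# so no single pass holds Redis up for long.
passes = 0
while clear_pass() > 0:
    passes += 1
```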

> And, within the context of redis cluster
> (http://redis.io/topics/cluster-spec), it is said that "Redis Cluster
> implements all the
> single keys commands available in the non distributed version of
> Redis. Commands performing complex multi key operations like Set type
> unions or intersections are not implemented, and in general all the
> operations where in theory keys are not available in the same node are
> not implemented."
>
> So, based on the above quote from the redis cluster spec, if a
> scripting EVAL was issued by a client (EG: phpredis) to the redis
> cluster, what would happen? Would the node within redis cluster be
> able to then broadcast the SUNION to all the potential nodes in the
> cluster and then aggregate the results and then perform the multi-key
> DEL (once again broadcasting to all the hashed nodes in the cluster)?
> Or would that be impossible and every set have to be retrieved
> separately (that would seem to be very inefficient)?

No. Redis Cluster specifically does not broadcast commands anywhere.
The cross-node union you want would not work.

Regards,
- Josiah

sqm1998

Oct 30, 2011, 7:25:28 PM
to Redis DB
Hi Josiah,

Thanks for the feedback. With regard to the commands, only the 1st set
of commands (listed below) is used, not the 2nd set.

SADD tags:A Y
SADD tags:B Y
SADD tags:C Y
SET key:Y ...

So, when it is time for invalidation, the application would SUNION the
tags:* sets to get all the cache keys, and then do a multi-key delete.

In my current use case, the maximum number of tags:* sets that would
ever be SUNIONed is 10 or fewer, each with about 250 - 1,000 keys (so
the maximum size of the SUNION would be 2,500 - 10,000 members, which
is then fed to a multi-key DEL). At this level, do you think redis
will have a problem?


Josiah Carlson

Oct 30, 2011, 7:58:14 PM
to redi...@googlegroups.com
It sounds like you are just deleting your entire cache. Why not use a
dedicated database for this and just call FLUSHDB? That would be a lot
faster, with fewer commands, etc.

- Josiah

> --
> You received this message because you are subscribed to the Google Groups "Redis DB" group.
> To post to this group, send email to redi...@googlegroups.com.
> To unsubscribe from this group, send email to redis-db+u...@googlegroups.com.
> For more options, visit this group at http://groups.google.com/group/redis-db?hl=en.

Josiah Carlson

Mar 7, 2013, 5:57:26 PM
to redi...@googlegroups.com
That's a long time for a SUNION call. Have you considered taking
each individual SET, deleting the items from that set, then moving on
to the next? That would at least remove the initial 24-minute SUNION
operation.

If you are concerned that the contents of the SETs will change, you
could RENAME all of the SET keys, then perform SET-by-SET deletions.

Also, if it's taking so long to perform, then that also smells like
your SETs might be too large for reasonable operations. Could you
shard them and work on smaller groups at a time? If you chopped all of
your SETs into 1000 pieces each, then you could perform each sharded
SET SUNION operation in about 1.5 seconds. While that's still not
great, it's a lot more reasonable.
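
One way to sketch such sharding in Python (the shard count and the tags:A:<n> naming here are assumptions for illustration, not anything Redis provides):

```python
import hashlib

SHARDS = 8  # illustrative; the suggestion above is on the order of 1000

def shard_key(tag, member):
    """Route a member to one of SHARDS smaller sub-sets of `tag`."""
    digest = hashlib.md5(member.encode("utf-8")).hexdigest()
    return "%s:%d" % (tag, int(digest, 16) % SHARDS)

# SADD would target the member's shard instead of one giant set; clearing
# then walks tags:A:0 .. tags:A:7 and unions/deletes each small piece, so
# no single SUNION ever touches the full membership at once.
shard_names = {shard_key("tags:A", "key:%d" % i) for i in range(1000)}
```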

Another alternative is that you could perform the following in Lua
(here written as an actual script body; KEYS is the list of tag SET
keys passed to EVAL):

for _, key in ipairs(KEYS) do
    while redis.call('scard', key) > 0 do
        local members = redis.call('srandmember', key, 1000)
        -- delete the cache entries the members name
        -- (you may need to prefix them, e.g. 'key:' .. member)
        redis.call('del', unpack(members))
        -- and remove those members from the SET itself
        redis.call('srem', key, unpack(members))
    end
end

That will pull up to 1000 members from each SET at a time, delete the
keys referenced by those members, and remove the members from the SET.
You could also cap the total number of items fetched/deleted per call,
to delete your items incrementally, which should improve interactivity
(at the cost of longer total time).


If you explain your larger problem, we might be able to come up with a
better solution for you.

Regards,
- Josiah

On Thu, Mar 7, 2013 at 1:49 AM, Marcin Gil <marci...@gmail.com> wrote:
> Hi!
>
> Did you find any alternative to sUnion yet? I have exactly the same problem.
> For me, sUnion takes as long as 24 minutes.
>
> Thank you,
> Marcin

Matt Watson

Jan 12, 2015, 4:38:00 PM
to redi...@googlegroups.com
Hey guys, we just wrote a blog post about how we are doing Redis caching with tags. Check it out and maybe it could help some of you. We used some Lua scripting.
