implementing efficient timers by saving expired keys


Jacob T

Jan 9, 2015, 6:27:57 PM
to redi...@googlegroups.com
Hi Redis folks

My team would like to use redis to implement persistent and scalable timer events based on key expiration. 

The main requirement we have is that clients not lose any timer events if they become disconnected from Redis. In other words, events should not be lost if no clients are listening.

The simplest and most efficient way to implement this feature that I can think of would be to introduce an additional option that allows expired Redis keys to be RPUSH'ed onto a particular list.

The benefits of an option to store expired keys (created with certain parameters) into a list, rather than relying on key expiry notifications (http://redis.io/topics/notifications), are:

1. Events tracked by the expired keys are not lost, enabling clients to consume them when they reconnect to Redis.
2. Unlike keyspace notifications, with BLPOP only a single client consumes each expired event, which enables efficient lock-free handoff of the timer event for processing and lets timer processing scale (see the sketch after this list).
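For concreteness, the worker side could be as simple as this untested sketch in Python with redis-py, assuming some producer (the proposed option, or anything else) RPUSHes the names of expired timer keys onto a list; the list name and handler are made up:

    # Sketch only: assumes the names of expired timer keys get RPUSH'ed onto
    # the "expired:timers" list (made-up name) by some producer.
    import redis

    r = redis.Redis(decode_responses=True)

    def worker():
        while True:
            # BLPOP blocks until an item arrives and delivers each item to
            # exactly one of the competing workers.
            _source_list, timer_key = r.blpop("expired:timers")
            handle_timer(timer_key)

    def handle_timer(timer_key):
        print("processing expired timer", timer_key)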

I wanted to sound out the Redis devs about whether the above feature jibes with the current Redis philosophy and whether our implementation of such a feature has a chance of being upstreamed in some form in the future.

We have been working on implementing it ourselves and have a limited working prototype that stores expired keys into a list based on a special key syntax, e.g. <mykey>@__expire_into__<listname>.
I realize that this is far too restrictive an API, but it demonstrates the benefit clearly enough in our case.
I wanted to get some feedback on the viability of the feature before going forward with adding it properly by extending the data associated with the key expiry structure.
The syntax I would envision for the new key option would be something like:

    SET <mykey> <myvalue> PX 3000 XRPUSH <listkey>


I have been trawling the forums and GitHub for an answer to the above. Despite some mentions of storing expired keys in lists, I found no other detailed proposals for this feature.

Thank you for your time.

Jacob

Matt Stancliff

Jan 9, 2015, 6:54:38 PM
to redi...@googlegroups.com

> On Jan 9, 2015, at 18:27, Jacob T <jtr...@gmail.com> wrote:
>
> 1. Events tracked by the expired keys are not lost thereby enabling clients to consume the events upon re-connect to redis.

That’s the general drawback of the Redis pubsub system. It works great until you want non-transient events. At first, everybody thinks “yay pubsub!” Then you hit the trough of “uh, let’s just use a list instead.”

> I wanted to sound out the Redis devs about whether the above feature jibes with the current Redis philosophy and whether our implementation of such a feature has a chance of being upstreamed in some form in the future.

Probably not, but don’t let that stop you. That’s why code is open after all.

> I have been trawling the forums and the GitHub for an answer to the above. Despite some mentions of expired keys being stored in lists, there were no other detailed proposals for this feature.

I actually did this last year, but just as an experiment. I added the ability to subscribe Lua scripts to pubsub channels, so every PUBLISH invokes your script with the message as its parameters. Your script can then persist the message, re-PUBLISH it to other channels, or parse/evaluate/drop it if it’s not needed.

All the details are at https://matt.sh/advanced-redis-pubsub-scripts


-Matt

Jacob T

Jan 9, 2015, 7:07:01 PM
to redi...@googlegroups.com
Thanks for your feedback. I did review your implementation in my research into implementing scalable timers, but thought it might be significantly more wide-ranging (although also much more powerful) than the proposal I outlined above. I was hoping that the narrower scope of my feature would reduce the chance of 'unexpected side effects' from its implementation and thus stand a higher chance of inclusion. :)

The main reason I wanted to sound out the Redis folks about the viability of upstreaming the XRPUSH feature is, selfishly, that we do not want to maintain a fork of Redis for a tiny change such as this. Judging by the existence of similar requests on various forums, this feature might solve some real-world problems for people who need what we need, without significantly increasing the complexity of the Redis server.

--Jacob

Thomas Love

Jan 10, 2015, 5:33:40 AM
to redi...@googlegroups.com
Can you clarify the basic problem? Is it "clients have jobs to do on a timer"? If so, and notwithstanding especially high precision or throughput requirements, your proposal doesn't sound either simple or efficient to me.

In general I wouldn't couple Redis-internal timing, let alone Redis garbage collection, to anything at the application level. It will almost certainly surprise you in unexpected ways. 

What's wrong with a basic non-volatile job queue?

Thomas

Jacob T

Jan 12, 2015, 11:23:54 AM
to redi...@googlegroups.com
On the surface:

1. Job is scheduled with a timer
2. A timer expires
3. Client pulls the job out of the queue and executes it

I assume by "surprise you in unexpected ways" you mean that Redis data expiry should be viewed as an implementation detail from the API point of view.

Question: if relying on the behavior of data expiry in Redis as a reliable external API is not recommended, then why was the ability to listen to key expiry via keyspace notifications added?

We do not require that timers be precise or accurate to a millisecond. I need to ensure that:

1. Timers are not lost or abandoned as worker clients come and go.
2. Workers should not require a massive timer scan upon reconnecting to Redis in order to determine which timers actually need to be processed. (I know this can be addressed by complicating the timer storage structure and storing timeouts in a sorted set, but that does not scale as well and becomes more complex, since you'd need to shard the sorted set too.)
3. Timers should not need to be 'owned' by different worker clients before they are removed from Redis for execution. Ownership means an inactive client's timers have to be reassigned to someone else, the reassignment has to poll on a schedule, and now you can't seamlessly start and stop clients without introducing a big latency due to the reassignment delay.
4. Only a single worker client gets notified of each expiring timer (workers should not fight over each expiring timer, as they do with Redis keyspace expiry notifications).

In other words, I want persistent, scalable timers that do not complicate the application logic unnecessarily. I was hoping to be able to support this within Redis by chaining functionality already exposed in Redis via a different API (Keyspace notifications).

I know I can implement all the above requirements by adding various complexity to my application, but honestly it seems like the cost/benefit of having Redis perform atomic actions on data expiry, such as placing a key into a list, should weigh more heavily on the scales.

PS. On a side note, using keyspace notifications in Redis Cluster feels like it may lead to scaling issues (each keyspace notification gets broadcast throughout the cluster), but that's an issue for another day. :)

--Jacob

Thomas Love

Jan 13, 2015, 4:39:57 AM
to redi...@googlegroups.com
On 12 January 2015 at 18:23, Jacob T <jtr...@gmail.com> wrote:
> On the surface:
>
> 1. Job is scheduled with a timer
> 2. A timer expires
> 3. Client pulls the job out of the queue and executes it
>
> I assume by "surprise you in unexpected ways" you mean that Redis data expiry should be viewed as an implementation detail from the API point of view.

Yes. The bottom of http://redis.io/topics/notifications explains how that implementation (currently) interacts with notifications, but I would just summarize it as "no guarantees" regarding timing. 


> Question: if relying on the behavior of data expiry in Redis as a reliable external API is not recommended, then why was the ability to listen to key expiry via keyspace notifications added?

I don't know the history of this part of Redis. To me the keyspace notification system looks useful as an introspection / monitoring / debugging tool, and *perhaps* an optimization of last resort. I'm not in a position to say "it's not recommended", but I would be reluctant myself to write application code around it, and especially reluctant to use expiring keys as timing signals. 
 

> We do not require that timers be precise or accurate to a millisecond. I need to ensure that:
>
> 1. Timers are not lost or abandoned as worker clients come and go.

First step then is to ensure Redis is set up with persistence and a noeviction maxmemory policy. 
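In redis.conf that means enabling persistence (e.g. appendonly yes) and maxmemory-policy noeviction; the same settings can also be applied at runtime, for example with redis-py (a sketch only, adjust to your deployment):

    # Sketch: apply durability-oriented settings at runtime. Normally these
    # live in redis.conf (appendonly yes, maxmemory-policy noeviction).
    import redis

    r = redis.Redis()
    r.config_set("maxmemory-policy", "noeviction")  # never silently evict keys
    r.config_set("appendonly", "yes")               # enable AOF persistence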
 
> 2. Workers should not require a massive timer scan upon reconnecting to Redis in order to determine which timers actually need to be processed. (I know this can be addressed by complicating the timer storage structure and storing timeouts in a sorted set, but that does not scale as well and becomes more complex, since you'd need to shard the sorted set too.)

This is a fairly fundamental challenge for any timer system, and there's no free lunch. Redis itself avoids exhaustive scans and sorted sets by doing a random sampling of the keyspace. It sacrifices timer precision in exchange for performance. 

I use the same basic algorithm in application code, e.g. with sets and SRANDMEMBER, with good success. But there are other approaches; I found this paper interesting: http://www.cs.columbia.edu/~nahum/w6998/papers/sosp87-timing-wheels.pdf Note that there will always be space/time and precision/throughput trade-offs, and it's worth working out where they are and being explicit about them. That way you have knobs you can turn later to change the characteristics.
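Roughly, the sampling approach looks like this (untested sketch; key names and the sample size are made up): timer ids live in a set, each id maps to a due timestamp, and workers repeatedly sample the set and fire whatever is overdue.

    import time
    import redis

    r = redis.Redis(decode_responses=True)

    def schedule(timer_id, delay_seconds):
        r.set("timer:due:%s" % timer_id, time.time() + delay_seconds)
        r.sadd("timers:pending", timer_id)

    def sample_and_fire(sample_size=20):
        now = time.time()
        for timer_id in r.srandmember("timers:pending", sample_size):
            due = r.get("timer:due:%s" % timer_id)
            if due is not None and float(due) <= now:
                # Whoever wins the SREM owns the timer; everyone else skips it.
                if r.srem("timers:pending", timer_id):
                    r.delete("timer:due:%s" % timer_id)
                    fire(timer_id)

    def fire(timer_id):
        print("firing", timer_id)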
 
> 3. Timers should not need to be 'owned' by different worker clients before they are removed from Redis for execution. Ownership means an inactive client's timers have to be reassigned to someone else, the reassignment has to poll on a schedule, and now you can't seamlessly start and stop clients without introducing a big latency due to the reassignment delay.

I'm pretty sure that if you are running multiple clients against the timer dataset you will have to deal with some of these issues, regardless of the source of your list. Even with a single client, consider the case where the client fails after retrieving a job but before completing it. See reliable queues here: http://redis.io/commands/rpoplpush -- it's not difficult to get some reliability, and there are various ways to recover. 
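The pattern from that page, roughly (untested sketch; list names are made up): a job is moved atomically from the pending list to a per-worker processing list and only removed once it has actually completed, so a crashed worker's jobs stay visible and can be requeued by a recovery pass.

    import redis

    r = redis.Redis(decode_responses=True)

    def work(worker_id):
        processing = "jobs:processing:%s" % worker_id
        while True:
            # Blocks until a job is available, then moves it atomically.
            job = r.brpoplpush("jobs:pending", processing, timeout=0)
            try:
                run(job)
            finally:
                r.lrem(processing, 1, job)  # acknowledge completion

    def run(job):
        print("running", job)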

> 4. Only a single worker client gets notified of each expiring timer (workers should not fight over each expiring timer, as they do with Redis keyspace expiry notifications).

Redis is single-threaded and its commands are atomic. Not fighting is its natural state! RPOPLPUSH above will hand a given item to at most one polling client. 
 

> In other words, I want persistent, scalable timers that do not complicate the application logic unnecessarily. I was hoping to be able to support this within Redis by chaining functionality already exposed in Redis via a different API (Keyspace notifications).
>
> I know I can implement all the above requirements by adding various complexity to my application, but honestly it seems like the cost/benefit of having Redis perform atomic actions on data expiry, such as placing a key into a list, should weigh more heavily on the scales.

I don't think you should treat this kind of thing as "unnecessary complexity" in your application. Chances are, it *is* your application, or at least a very key auxiliary service. 


Thomas

Jacob T

Jan 13, 2015, 12:00:50 PM
to redi...@googlegroups.com
Thanks for your comments, Thomas.

Btw, I wanted to ask, what is your relation to the Redis project?

If building upon the reliability of expiry is not the direction the Redis API is heading, I am fine with that. We'll have to do something else, and build expiry reliability another way.

I do not relish implementing a whole system in the application layer that could be greatly simplified by having a limited and functional Redis feature, but such is life.

I am just pointing out that sometimes *useful* features do not fit neatly into the API buckets as originally intended; that doesn't always mean such features can't work well and be useful regardless.

--Jacob

Greg Andrews

Jan 13, 2015, 12:07:02 PM
to redi...@googlegroups.com

On Mon, Jan 12, 2015 at 8:23 AM, Jacob T <jtr...@gmail.com> wrote:
> Question: if relying on the behavior of data expiry in Redis as a reliable external API is not recommended, then why was the ability to listen to key expiry via keyspace notifications added?


I don't think anyone is asking you not to depend on Redis' behavior with respect to whether or not an expire event is fired. Redis will expire keys.

They are asking you not to depend on the accuracy of the timing of expire events. Redis will not expire keys exactly at the wall clock time (some will be removed on time, many won't).


  -Greg


Josiah Carlson

Jan 13, 2015, 12:57:37 PM
to redi...@googlegroups.com
Assuming you have a Redis server running on a capable machine, you can write bad Lua to handle reliable timer execution at a rate of 10k/second, with better Lua able to handle 100k+/second (assuming certain conveniences), without sharding. If your IO requirements are in that 10k-100k range (or lower), then your problem is more or less already solved. (This assumes your timer events aren't also carrying around substantial amounts of metadata that needs to be transferred across the network.)

If your IO requirements are beyond that 10k-100k events/second, then you will need to shard across Redis servers (unless you can do batched writes or reads, or are happy with best-effort single-execution queue semantics as opposed to reliable at-least-once execution). But sharding is easy on the execution side (run more processors with additional configurations), and adding items to execute is a matter of giving your timers ids and writing the different timer events to different shards based on those ids.
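Routing on the client side is only a few lines, e.g. (untested sketch; hosts and key names are made up):

    # Id-based shard routing: hash the timer id to pick the Redis server
    # that stores it.
    import zlib
    import redis

    SHARDS = [
        redis.Redis(host="redis-0.internal"),
        redis.Redis(host="redis-1.internal"),
        redis.Redis(host="redis-2.internal"),
    ]

    def shard_for(timer_id):
        return SHARDS[zlib.crc32(timer_id.encode()) % len(SHARDS)]

    def add_timer(timer_id, fire_at):
        # Each shard keeps its own sorted set of timers scored by due time.
        shard_for(timer_id).zadd("timers:due", {timer_id: fire_at})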

What kind of performance do you need? How are your timer events finding their way into this system? Can your clients handle batch insertion/removal? How much metadata does each event carry around with itself?

 - Josiah



Jacob T

Mar 18, 2015, 1:00:59 PM
to redi...@googlegroups.com
Are you saying that if my Redis server is not doing anything and I set a TTL on a key of 5 seconds, and I check for that key 6 seconds later, it will likely still be there? My experience contradicts this. My understanding is that Redis is being conservative about time guarantees on key expiry for the sake of performance, in case 1 billion keys end up inserted with the same millisecond TTL. I understand the limitation of placing such a load on the system and the necessity of delaying expiry under such extreme conditions, however that is NOT my use case.

My use case is to create timers, several thousand every second that will expire about 20 seconds hence. If unrelated query load spikes cause redis to expire the timers 22 seconds later or 25 seconds later, it's not a big deal for me. As long as the expiry is 'usually' guaranteed to be 'near' wall clock time, I am happy with it.

So long story short, this is not some theoretical million-timers use case that will break Redis and explode the server. It is a very well bounded problem. It is especially for this reason that I was hesitant to build complex client-side logic to handle resuming timers, when it 'should' be possible to handle this entirely by adding the trivial mechanism I proposed earlier.

Josiah Carlson

Mar 18, 2015, 2:35:05 PM
to redi...@googlegroups.com
Let me clarify what Greg said.

On Wed, Mar 18, 2015 at 10:00 AM, Jacob T <jtr...@gmail.com> wrote:
> Are you saying that if my Redis server is not doing anything and I set a TTL on a key of 5 seconds, and I check for that key 6 seconds later, it will likely still be there? My experience contradicts this. My understanding is that Redis is being conservative about time guarantees on key expiry for the sake of performance, in case 1 billion keys end up inserted with the same millisecond TTL. I understand the limitation of placing such a load on the system and the necessity of delaying expiry under such extreme conditions, however that is NOT my use case.

Your experience contradicts your *interpretation* of what he said. Redis randomly deletes expired keys based on a heuristic. Also, if an expired key is accessed, it is deleted upon access. That is why you aren't seeing expired keys sticking around after their expiration - they only exist as long as you don't look at them.
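You can see this for yourself with a few lines (untested sketch; the key name is made up):

    # The key reads as gone precisely because looking at it is what deletes it:
    # an expired key is removed when it is next accessed (or when the sampler
    # happens to pick it).
    import time
    import redis

    r = redis.Redis()
    r.set("demo:key", "value", px=5000)  # 5 second TTL
    time.sleep(6)
    print(r.exists("demo:key"))          # prints 0: the lookup sees the TTL has passed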
 
> My use case is to create timers, several thousand every second that will expire about 20 seconds hence. If unrelated query load spikes cause redis to expire the timers 22 seconds later or 25 seconds later, it's not a big deal for me. As long as the expiry is 'usually' guaranteed to be 'near' wall clock time, I am happy with it.

There are zero guarantees. It's not just that Redis expires based on an approximation: Redis randomly samples the keyspace to discover keys that need to be deleted, and when any pass finds that more than 25% of the sampled keys needed deleting, it does another pass. So it's not that it might expire a key 2 or 5 seconds after it was supposed to expire; it might expire the key 6 months after it was originally due if your keyspace is large and the RNG wasn't in your favor.

You can certainly address this explicitly to ensure that an expiration happens in a timely manner (see SCAN, which will induce deletions upon accessing keys that should be expired), but considering that Redis doesn't even have the functionality to push keys into lists as they are expired (which you were proposing at the start of this thread, 2+ months ago), the combination seems like a bad idea just from an implementation perspective. That doesn't mean that you can't or shouldn't, just that there are solutions that don't involve currently non-existing features and expectations about Redis expiration that don't match reality.
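If you did want to go that route, the sweep could look something like this (untested sketch; the key pattern is made up):

    # Periodically walk the keyspace so expired keys are touched (and
    # therefore deleted) promptly instead of waiting for the random sampler.
    import redis

    r = redis.Redis()

    def sweep(pattern="timer:*", batch=500):
        for key in r.scan_iter(match=pattern, count=batch):
            r.exists(key)  # any read of an expired key deletes it on the spot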

> So long story short, this is not some theoretical million-timers use case that will break Redis and explode the server. It is a very well bounded problem. It is especially for this reason that I was hesitant to build complex client-side logic to handle resuming timers, when it 'should' be possible to handle this entirely by adding the trivial mechanism I proposed earlier.

Modifying Redis internals is not a trivial mechanism or operation. It's not putting a man on the moon, but it's also not updating a configuration file.

Given your volume, I am about 99.9% certain you can get the results that you want with a couple Lua scripts executed inside Redis. You don't need to wait for Redis to be patched and released, you don't need to do any hacks to ensure that Redis expires properly, you just need to use a couple Lua scripts. If you want more information about that side of things, you can be a bit more specific with your data, and I'm sure several of us can offer help on writing a Lua script to do what you need.
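To give a flavor of what I mean (untested sketch; key names are made up): keep the timers in a sorted set scored by due time, and have a tiny script atomically move whatever is due onto a list that your workers BLPOP.

    import time
    import redis

    r = redis.Redis(decode_responses=True)

    # Atomically move everything that is due from the sorted set onto a ready list.
    POP_DUE = r.register_script("""
    local due = redis.call('ZRANGEBYSCORE', KEYS[1], '-inf', ARGV[1], 'LIMIT', 0, 1000)
    if #due > 0 then
      redis.call('ZREM', KEYS[1], unpack(due))
      redis.call('RPUSH', KEYS[2], unpack(due))
    end
    return #due
    """)

    def schedule(timer_id, delay_seconds):
        r.zadd("timers:due", {timer_id: time.time() + delay_seconds})

    def pump():
        # Run this in a loop or on a short interval; workers BLPOP "timers:ready".
        return POP_DUE(keys=["timers:due", "timers:ready"], args=[time.time()])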

 - Josiah


moiz....@gmail.com

May 4, 2017, 1:49:47 AM
to Redis DB
Jacob - I know this is an old thread, but I have a similar requirement (persistent and scalable timer events based on key expiration). What approach did you eventually use?

And Josiah/Greg, since this is from 2015 and we are now in 2017 - has anything changed in the latest Redis feature set that now allows using Redis for this kind of use case out of the box?

Josiah Carlson

May 4, 2017, 10:46:38 AM
to redi...@googlegroups.com
If you are willing to use the module subsystem in Redis (http://antirez.com/news/106), my understanding is that you can get whatever you want.

Otherwise it's still Lua scripts and other such things. That said, a timed queue in Redis is just a timer with extra information, and there are a bunch of timed queue implementations out there: inside Redis, in Disque, AWS SQS, etc.

 - Josiah

Salvatore Sanfilippo

May 4, 2017, 11:43:26 AM
to redi...@googlegroups.com
An alternative could be to make timers a first-class citizen in the Redis ecosystem. With the new radix tree data structure we have inside the core, it could be pretty simple to provide an API where timers have a name, so that after a disconnection you can just continue waiting for the timer. One command creates a timer given a number of milliseconds and returns a handle. Another command can be used in order to listen for a set of timers, so normally one does a CREATE + LISTEN cycle, but it is also possible to just LISTEN after a reconnection. Because that would be backed by a radix tree, there is not the problem we have with EXPIRE, where the actual expire time is not bounded (only the minimum is guaranteed): every timer would last exactly as specified, with the only limit being the Redis HZ resolution.



--
Salvatore 'antirez' Sanfilippo
open source developer - Redis Labs https://redislabs.com

"If a system is to have conceptual integrity, someone must control the
concepts."
— Fred Brooks, "The Mythical Man-Month", 1975.