Expire and Lua scripts

1,240 views
Skip to first unread message

Salvatore Sanfilippo

unread,
Feb 3, 2014, 5:05:38 AM2/3/14
to Redis DB
Hi all,

please if you have 2 spare minutes check this issue here:

https://github.com/antirez/redis/issues/1525

It documents an issue related to using Lua scripts with keys with an expire set.
There is a but in the current Redis implementation because as you can
see from the issue description, keys can expire in the middle of
scripts executions, resulting in slave desynchronization.

Now the proposed solution is that keys are not expired during the
execution of scripts, but are checked for expire before executing the
script: the KEYS[] array is checked with the usual expireIfNeeded()
function, and that's it.

However this means that keys not explicitly "declared" when using
EVAL, but hard-coded inside the script, will not expire during the
script execution even if their time is already out. Note that the key
will expire anyway at a latter time because of the incremental expire
cycle performed by Redis.

I'm going to implement this change however it is interesting to get
feedbacks. Is this change affecting any of you? Do you have other
interesting ideas?

The alternatives I see is to track key access during the script
execution and expire the key only the first time it is accessed in the
script. This should still guarantee safeness (but I need to check more
formally) and at the same time mostly retain the current
functionality. However this also has the problem of being more
computationally expensive... we would need to put keys inside an hash
table during lookup performed in the context of scripts.

Thanks in advance for any feedbacks.

Salvatore

--
Salvatore 'antirez' Sanfilippo
open source developer - GoPivotal
http://invece.org

To "attack a straw man" is to create the illusion of having refuted a
proposition by replacing it with a superficially similar yet
unequivalent proposition (the "straw man"), and to refute it
-- Wikipedia (Straw man page)

Marc Gravell

unread,
Feb 3, 2014, 7:07:29 AM2/3/14
to redi...@googlegroups.com

I don't know the complexity or implications, but I wonder if the simpler approach is for time to not progress during scripts : the time is whatever time it was started at, no matter how long it takes - then expiry can't happen "during" the script.

Marc

--
You received this message because you are subscribed to the Google Groups "Redis DB" group.
To unsubscribe from this group and stop receiving emails from it, send an email to redis-db+u...@googlegroups.com.
To post to this group, send email to redi...@googlegroups.com.
Visit this group at http://groups.google.com/group/redis-db.
For more options, visit https://groups.google.com/groups/opt_out.

Salvatore Sanfilippo

unread,
Feb 3, 2014, 7:56:12 AM2/3/14
to Redis DB

This is brillant! Your idea turns the implementation of the most sane behavious (expire at first lookup or never) into a trivial one with minimal CPU costs :-) if we are in the context of a script we can just expire against the script start time instead of the current time. I'm glad I posted this... Testing the idea and checking the implications and replying back.

Marc Gravell

unread,
Feb 3, 2014, 9:51:07 AM2/3/14
to redi...@googlegroups.com
The fun edge case is replication, of course; but I gather expiry works very differently for replicas (essentially with the delete being propagated as an explicit delete) - so then the fun bit is - if it encounters something expired *during* execution, then presumably the "del" needs to get sent *before* the "exec" to the slave. Or something like that. Don't quote me - I'm just a user.

Marc
Regards,

Marc

Salvatore Sanfilippo

unread,
Feb 3, 2014, 9:59:32 AM2/3/14
to Redis DB
Exactly Marc, this is what happens:

A script is executing on the master, but a key can be found expired
only on the *first* use (and this is trivial applying your idea, no
need to take an hash table as I suggested initially).
All he keys that are found expired, will get a DEL synthesized in the
replication channel before the EVAL is produced, so the slaves will
find the same set of keys expired too.
On the other hand, what was not expired on the first lookup will never
be expired during the execution of the same script, and since the
slave can't expire without master DELs, it will find the keys existing
as well.

Looks like it works as advertised :-)
I'm implementing it, and testing it at least with the steps I used to
reproduce the bug in the first instance.

Salvatore

Marc Gravell

unread,
Feb 3, 2014, 10:26:07 AM2/3/14
to redi...@googlegroups.com
For info, there is a discussion around Nov 2012, titled "Lua script execution time", that might be worth reviewing in case this changes anything there (especially ttl/pttl etc) - just to make sure that there aren't any unexpected side effects.

Marc

Salvatore Sanfilippo

unread,
Feb 3, 2014, 10:29:14 AM2/3/14
to Redis DB
Hey Marc! No problem everything should be fine, the change is totally
trivial, can't break anything outside it, and is surely a lot more
saner than it used to be :-)
Btw I also verified manually that the previous commit has the race,
and in the next commit I can no longer reproduce it. It is obvious
logically but it is always better to verify by hand given that writing
an unit test for this issue is going to be very problematic...

Pushing the change in a moment and then releasing 2.8.5.

Salvatore

Salvatore Sanfilippo

unread,
Feb 3, 2014, 10:31:39 AM2/3/14
to Redis DB
Btw the following is the trivial change to expireIfNeed() fixing the bug.
expireIfNeeded() simply sets "now" to mstime() or
server.lua_time_start depending on the fact we are or not in the
context of a Lua script.

int expireIfNeeded(redisDb *db, robj *key) {
- long long when = getExpire(db,key);
+ mstime_t when = getExpire(db,key);
+ mstime_t now;

if (when < 0) return 0; /* No expire for this key */

/* Don't expire anything while loading. It will be done later. */
if (server.loading) return 0;

+ now = server.lua_caller ? server.lua_time_start : mstime();
+
/* If we are running in the context of a slave, return ASAP:
* the slave key expiration is controlled by the master that will
* send us synthesized DEL operations for expired keys.
@@ -769,12 +777,10 @@ int expireIfNeeded(redisDb *db, robj *key) {
* Still we try to return the right information to the caller,
* that is, 0 if we think the key should be still valid, 1 if
* we think the key is expired at this time. */
- if (server.masterhost != NULL) {
- return mstime() > when;
- }
+ if (server.masterhost != NULL) return now > when;

/* Return when this key has not expired */
- if (mstime() <= when) return 0;
+ if (now <= when) return 0;

Marc Gravell

unread,
Feb 3, 2014, 10:52:37 AM2/3/14
to redi...@googlegroups.com
"everything should be fine, the change is totally trivial, can't break anything outside it"

Pretty sure every significant outage I've ever caused (which is more than I care to admit) has followed some variant of this line ;p

Salvatore Sanfilippo

unread,
Feb 3, 2014, 10:57:01 AM2/3/14
to Redis DB
Haha ;-) We are supposed to make errors, as long as we put our best at
avoiding them. This philosophy really served me very well in the
latest years, after all the Redis success managed to put me in great
pressure from time to time, but once you try to do your best, whatever
consequence is ok, not your fault just the result of your limitation
as human.

Have a nice day,
Salvatore
Reply all
Reply to author
Forward
0 new messages