Keys do not expire after redis-server restart (PPA/2.8.4)

Niel Smith

Jan 22, 2014, 6:11:04 AM
to redi...@googlegroups.com
Hi,

I found an intermittent problem where keys do not expire and are not deleted after a redis-server restart. The EXPIRE/TTL flag is not set when this happens, and it is causing havoc with everything from distributed locks to rate limiters, statistics, etc.

From what I can tell, the normal behavior for a redis-server restart is that keys with EXPIRE/TTL flags set are deleted.

I've implemented a workaround where I call TTL then EXPIRE (if needed).

Do I have to accept the overhead of calling TTL for all functionality that depends on EXPIRE, or is there another way around this issue?
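
The TTL-then-EXPIRE workaround described above might look roughly like the following sketch. `FakeClient` and `ensure_expire` are hypothetical names invented for illustration; the fake only mimics the `ttl`/`expire` calls of a typical Redis client and the Redis 2.8 convention that TTL returns -2 for a missing key and -1 for a key without an expire.

```python
class FakeClient:
    """Hypothetical stand-in exposing ttl()/expire() like a Redis client."""

    def __init__(self):
        self.ttls = {}  # key -> remaining seconds, or -1 for "no expire"

    def ttl(self, key):
        # Redis 2.8 convention: -2 = key missing, -1 = key has no expire.
        return self.ttls.get(key, -2)

    def expire(self, key, seconds):
        if key not in self.ttls:
            return False
        self.ttls[key] = seconds
        return True


def ensure_expire(client, key, seconds):
    """Re-arm the expire only when the key exists but has none."""
    if client.ttl(key) == -1:
        return client.expire(key, seconds)
    return False


client = FakeClient()
client.ttls = {"lock:a": -1, "lock:b": 42}
repaired = ensure_expire(client, "lock:a", 120)   # had no expire
untouched = ensure_expire(client, "lock:b", 120)  # already expiring
missing = ensure_expire(client, "lock:c", 120)    # key does not exist
```

Against a real server this costs an extra round trip per key, and the two calls are not atomic, so it only narrows the window rather than closing it.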



Salvatore Sanfilippo

Jan 22, 2014, 6:24:58 AM
to Redis DB
Hello,

Your message is not entirely clear to me, but this is what I
understand: you are reporting a bug about:

1) Redis 2.8.4
2) Restarting the server will clear the associated expire of a key in
your environment.

This is not a known bug, so could you please provide more information?
What are you using, RDB or AOF to persist to disk?
Are you able to reproduce the issue?
Does this happen on all the keys or on many keys, or is it very unlikely
to happen and involves only a few keys?
Approximately what expire time do you set?

Regarding my last question: maybe you are doing something like:

SET key foo
EXPIRE key 100

If you don't use MULTI/EXEC or Lua scripting to run the above, and you
terminate the server just after the "SET" command, you'll have a key
without an expire on restart.
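
The scenario above can be sketched with a toy in-memory model. This is not Redis itself: `ToyServer` and `run` are hypothetical helpers, and the sketch assumes that whatever was applied before the crash is what comes back after the restart. It contrasts the two-step SET plus EXPIRE with a single SET carrying the EX option (the form used later in this thread).

```python
class ToyServer:
    """Hypothetical in-memory model: values plus an optional expire per key."""

    def __init__(self):
        self.values = {}
        self.expires = {}  # key -> seconds; absent means no expire

    def set(self, key, value, ex=None):
        self.values[key] = value
        self.expires.pop(key, None)  # a plain SET clears any old expire
        if ex is not None:
            self.expires[key] = ex

    def expire(self, key, seconds):
        if key in self.values:
            self.expires[key] = seconds

    def ttl(self, key):
        if key not in self.values:
            return -2
        return self.expires.get(key, -1)


def run(commands, crash_after):
    """Apply commands in order, stopping after `crash_after` of them."""
    server = ToyServer()
    for command in commands[:crash_after]:
        command(server)
    return server


two_step = [lambda s: s.set("key", "foo"), lambda s: s.expire("key", 100)]
one_step = [lambda s: s.set("key", "foo", ex=100)]

ttl_two_step = run(two_step, crash_after=1).ttl("key")  # crash before EXPIRE
ttl_one_step = run(one_step, crash_after=1).ttl("key")  # nothing left to lose
```

In the two-step variant the key survives the simulated crash with no expire attached; in the one-step variant there is no point at which the value exists without its expire.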

Salvatore


--
Salvatore 'antirez' Sanfilippo
open source developer - GoPivotal
http://invece.org

To "attack a straw man" is to create the illusion of having refuted a
proposition by replacing it with a superficially similar yet
unequivalent proposition (the "straw man"), and to refute it
— Wikipedia (Straw man page)

Niel Smith

Jan 28, 2014, 5:51:50 AM
to redi...@googlegroups.com
Hi Salvatore,

Apologies for the late reply.

1) Redis server v=2.8.4 sha=00000000:0 malloc=jemalloc-3.2.0 bits=64 build=151580704238c1d
2) I'm using the default RDB configuration.
3) I'm using the PHP Redis binding.
4) Restarting the server (inconsistently) clears the TTL for *some* of the keys.
5) Yes, I can reproduce the problem, the following outlines the test procedure:

    for 0 to N-1
       DEL redis:tests:{N}
   
    for 0 to N-1
       (assert == -2) TTL redis:tests:{N}
    
    for 1 to 20
      for 0 to N-1
        MULTI
        SET redis:tests:{N} "0" NX EX 120
        INCR redis:tests:{N}
        EXEC

    for 0 to N-1
       (assert >= 0) TTL redis:tests:{N}
       (assert == 20) GET redis:tests:{N}

    stop redis-server, (wait a couple of minutes), start redis-server (Or reboot the OS).

    for 1 to 20 (***)
      for 0 to N-1
        MULTI
        SET redis:tests:{N} "0" NX EX 120
        INCR redis:tests:{N}
        EXEC
    
    for 0 to N-1
       (assert != -1) TTL redis:tests:{N}

(*** It would seem that the problem only occurs when I immediately start throwing things at the server after a restart)

Test #1 > 5 out of 10,000 key(s) did not expire (TTL returned -1)
Test #2 > 5 out of 10,000 key(s) did not expire (TTL returned -1)
Test #3 > 2 out of 10,000 key(s) did not expire (TTL returned -1)
Test #4 > 0 out of 10,000 key(s) did not expire (TTL returned -1)
Test #5 > 3 out of 10,000 key(s) did not expire (TTL returned -1)

Hope I'm being clear!

Regards,
Niel

Salvatore Sanfilippo

Jan 28, 2014, 6:02:51 AM
to Redis DB
On Tue, Jan 28, 2014 at 11:51 AM, Niel Smith
<daniel.al...@gmail.com> wrote:
> for 0 to N-1
> (assert >= 0) TTL redis:tests:{N}
> (assert == 20) GET redis:tests:{N}


Hello Niel, before looking more closely at the issue, I would change
the code above. After you create your data set and before you restart
the server, you are being very liberal about checking that what you
created is what you think you created.

The test above should be (assert == -1) TTL redis:tests:{N}, otherwise
it is possible that the problem is in your script, which does not
create what you think it does.

And indeed, this is entirely possible as I read:

MULTI
SET redis:tests:{N} "0" NX EX 120
INCR redis:tests:{N}
EXEC

So here SET will fail if the key exists, because of NX. If this
happens, INCR will create the key without an expire, with a value of zero.

I suggest you:

1) Fix the assert so that you verify that the dataset is actually sane
before restarting.
2) If not, fix the race condition above.
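
The interaction between NX and INCR can be illustrated with a toy model. `ToyRedis` is a hypothetical in-memory stand-in with a manually advanced clock, not the real server; it assumes expiry is checked lazily each time a command touches the key, and it lets the key expire in the gap between the two commands.

```python
class ToyRedis:
    """Hypothetical in-memory model with lazy expiry and a manual clock."""

    def __init__(self):
        self.now = 0.0
        self.data = {}      # key -> integer value
        self.deadline = {}  # key -> absolute expiry time

    def _purge(self, key):
        # Lazy expiry: a key past its deadline is dropped on access.
        if key in self.deadline and self.now >= self.deadline[key]:
            del self.data[key]
            del self.deadline[key]

    def set_nx_ex(self, key, value, seconds):
        self._purge(key)
        if key in self.data:
            return False  # NX: key exists, nothing is written
        self.data[key] = value
        self.deadline[key] = self.now + seconds
        return True

    def incr(self, key):
        self._purge(key)
        # A missing key counts as 0 and is created WITHOUT an expire.
        self.data[key] = self.data.get(key, 0) + 1
        return self.data[key]

    def ttl(self, key):
        self._purge(key)
        if key not in self.data:
            return -2
        if key not in self.deadline:
            return -1
        return self.deadline[key] - self.now


r = ToyRedis()
r.set_nx_ex("redis:tests:0", 0, 120)
r.now = 119.999                                     # key is about to expire
set_applied = r.set_nx_ex("redis:tests:0", 0, 120)  # NX fails: key still alive
r.now = 120.0                                       # key expires in the gap
value = r.incr("redis:tests:0")                     # recreated, no expire
ttl_after = r.ttl("redis:tests:0")
```

In this model the key ends up with a value of 1 and a TTL of -1, which matches the symptom reported in the test runs.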

Cheers,
Salvatore

Niel Smith

Jan 28, 2014, 8:02:09 AM
to redi...@googlegroups.com
Hi Salvatore,

The problem is the second loop after the restart.

Redis is busy expiring the keys, and every now and then a key expires right between SET redis:tests:{N} "0" NX EX 120 and INCR redis:tests:{N}.

Appears that I'm the idiot...

Thank you for the assistance.

Salvatore Sanfilippo

Jan 28, 2014, 8:43:34 AM
to Redis DB
On Tue, Jan 28, 2014 at 2:02 PM, Niel Smith
<daniel.al...@gmail.com> wrote:

> Appears that I'm the idiot...

Hello Niel,

you are not the idiot at all ;-) Race conditions with expires are
common because the behavior of groups of commands becomes
time-dependent in that case, so care should be taken.
As a rule of thumb, to avoid issues one can always terminate the
MULTI/EXEC block with an explicit EXPIRE for the key, in order to make
sure that whatever happens the key will have a limited lifespan.
Thanks for your bug report; as usual, a false positive is better
than a bug that goes unnoticed, potentially affecting many users.
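
A minimal sketch of that rule of thumb, again as a toy model rather than real Redis: `run_block` is a hypothetical function that plays out SET key 0 NX EX 120 followed by INCR key, optionally closed by EXPIRE key n, with the clock values for each step passed in explicitly so the expiry can be forced to land between the commands.

```python
def run_block(store, key, now_at_set, now_at_incr, final_expire=None):
    """Toy model of SET key 0 NX EX 120 / INCR key [/ EXPIRE key n].

    `store` maps key -> [value, deadline or None]; times are in seconds.
    Returns the key's deadline after the block (None means no expire).
    """
    def purge(now):
        entry = store.get(key)
        if entry and entry[1] is not None and now >= entry[1]:
            del store[key]

    purge(now_at_set)
    if key not in store:                      # SET ... NX EX 120
        store[key] = [0, now_at_set + 120]
    purge(now_at_incr)
    entry = store.setdefault(key, [0, None])  # INCR creates key without expire
    entry[0] += 1
    if final_expire is not None:              # closing EXPIRE, always applied
        entry[1] = now_at_incr + final_expire
    return entry[1]


# Same starting state both times: the key expires at t=120, right in the gap.
racy = {"k": [20, 120.0]}
deadline_without = run_block(racy, "k", 119.999, 120.0)

guarded = {"k": [20, 120.0]}
deadline_with = run_block(guarded, "k", 119.999, 120.0, final_expire=120)
```

Without the closing EXPIRE the recreated key has no deadline; with it the key always leaves the block with one. One consequence worth noting is that the closing EXPIRE refreshes the lifetime on every call, so the key lives for 120 seconds after the last update rather than after the first.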