Deleting large numbers of keys


Clifton Cunningham

May 9, 2012, 7:28:54 AM
to redi...@googlegroups.com
Hi, I've looked for other articles on this and seen some ideas, discussions around new features etc., but no solutions.

We are using Redis in production here for near real-time counts of activity by minute, where in any given day we can get upwards of 50M keys created.  I only want to keep data for 2 days in Redis, hence once I get to day 3 I need to delete 50M keys. 

I have tried:

1.  Expires - these don't actually delete the keys, as the keys aren't accessed once they are over 2 days old, so memory just keeps growing; eventually Redis starts swapping and the whole thing dies.  The background task that tries to delete them just can't keep up.
2.  A set that contains the keys to delete, then an SPOP / DEL loop from the client.  I can only seem to delete about 200 keys/s, mostly because I need a full network round trip for each SPOP and DEL.
3.  An EVAL script that does the SPOP and DEL on Redis; this blocks the whole thread and so Redis stops serving requests.

Is there something that we're missing?  Are we using keys with time components in the wrong way?

Thanks in advance!

Clifton

Salvatore Sanfilippo

May 9, 2012, 8:03:23 AM
to redi...@googlegroups.com
On Wed, May 9, 2012 at 1:28 PM, Clifton Cunningham
<clifton.c...@gmail.com> wrote:

> 1.  Expires - doesn't actually delete the keys as they aren't accessed once
> over 2 days old, hence memory just keeps growing, eventually redis starts
> swapping and the whole thing dies.  The background task that tries to delete
> them just can't keep up.

That sounds impossible: Redis tries to evict expired keys ten times
every second. It is not perfect if you don't access keys, but you only
have problems if you have, say, millions of keys that are not going to
expire and just a few that *are* going to expire.

The algorithm is also adaptive.

Please can you re-try this approach and give us some numbers? I'm sure
this should work as expected if the keys whose expire has already been
reached are at least 0.01% of the keys that have an expire set but
still have time to live.
Actually it should work even with < 0.01%, but the smaller the
percentage, the less ideal it becomes.
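
Roughly, the idea is something like this (illustrative pseudocode only,
nothing like the actual C code; 'expires' here just stands for the set
of keys that have a TTL):

    import random, time

    def active_expire_cycle(expires, lookups_per_cron=10):
        # runs on every server cron tick (about ten times per second);
        # 'expires' maps key -> unix time at which it should expire
        while expires:
            sample = random.sample(list(expires), min(lookups_per_cron, len(expires)))
            hits = [k for k in sample if expires[k] <= time.time()]
            for k in hits:
                del expires[k]          # the real server frees the key as well
            # keep sampling only while a good fraction of the sample was
            # expired; the real loop also has a hard time limit per tick
            if len(hits) <= lookups_per_cron / 4:
                break

When almost none of the sampled keys are actually due, each pass does
very little work, which is why the ratio above matters.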

Salvatore

--
Salvatore 'antirez' Sanfilippo
open source developer - VMware
http://invece.org

Beauty is more important in computing than anywhere else in technology
because software is so complicated. Beauty is the ultimate defence
against complexity.
       — David Gelernter

Marc Gravell

May 9, 2012, 8:07:57 AM
to redi...@googlegroups.com
Random thought: even if you *did* have to do this manually (rather than using inbuilt features), DEL is variadic, which might save a few hops, and you could use a pipelined API to do this without requiring network latency per item - i.e. you start firing off a load of SPOPs (without waiting for replies, perhaps using a semaphore to set a bound on the number of outstanding SPOPs), and then as results come back in, you fire off DELs (again, without waiting for the reply) - that should increase the throughput significantly.
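
As a sketch of that shape in Python / redis-py (the set name and batch size are made up; a pipeline stands in for the semaphore-bounded outstanding requests):

    import redis

    conn = redis.StrictRedis()

    def drain(setname, batch=500):
        # Pop keys from the tracking set in batches, then issue one variadic
        # DEL per batch, so there is no network round trip per key.
        while True:
            pipe = conn.pipeline(transaction=False)
            for _ in range(batch):
                pipe.spop(setname)                 # queued, not yet sent
            keys = [k for k in pipe.execute() if k is not None]
            if not keys:
                break
            conn.delete(*keys)                     # one DEL for the whole batch

    drain('keys:to-delete')                        # illustrative set name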

But - take Salvatore's advice over mine *every* day of the redis week.

Marc


--
Regards,

Marc

Sripathi Krishnan

May 9, 2012, 8:12:15 AM
to redi...@googlegroups.com
You can store the keys in different databases, one for each day. After two days, simply call flushdb on the appropriate database. 

flushdb is fast, and should free up memory quickly. Besides, you don't have to manually expire each of your keys.

The only thing to be aware of: multiple databases are discouraged, as they have their own overheads and they won't be supported in Redis Cluster. But with just 3 databases, I think you should be fine.
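
A minimal sketch of the rotation, assuming redis-py and three databases cycled by calendar day (the database numbers and key names are only illustrative):

    import redis, time

    DAY = 86400

    def db_for_today():
        # databases 0, 1, 2 - one per calendar day
        return int(time.time() // DAY) % 3

    def record(counter_key):
        redis.StrictRedis(db=db_for_today()).incr(counter_key)

    def drop_oldest_day():
        # the database that is now two days old gets wiped in a single call
        oldest = (db_for_today() + 1) % 3
        redis.StrictRedis(db=oldest).flushdb()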

--Sri



Salvatore Sanfilippo

May 9, 2012, 8:12:48 AM
to redi...@googlegroups.com
On Wed, May 9, 2012 at 2:07 PM, Marc Gravell <marc.g...@gmail.com> wrote:

> But - take Salvatore's advice over mine *every* day of the redis week.

I would say they were pretty complementary; I'm just worried about the
correctness of the expire algorithm, so I focused more on what Redis
should already do on its own.

Btw, I posted too early and forgot to mention that when the expiring
frequency is not enough, it's probably better to recompile with a
trivial change:

#define REDIS_EXPIRELOOKUPS_PER_CRON 10 /* lookup 10 expires per loop */

It is possible to raise this number if really needed, but that should
be a strange case: an incredibly small percentage of keys already
expired among millions of others not yet expired, which we need to
expire ASAP.

I would try setting it to 100 or 1000 in such a rare circumstance.

Cheers,

Salvatore Sanfilippo

May 9, 2012, 8:14:46 AM
to redi...@googlegroups.com
On Wed, May 9, 2012 at 2:12 PM, Sripathi Krishnan
<sripathi...@gmail.com> wrote:
> flushdb is fast, and should free up memory quickly. Besides, you don't have
> to manually expire each of your keys.

Thanks for the contribution; however, a warning about FLUSHDB: it is
fast, but blocking and O(N), so if you have a lot (millions) of keys
it can take seconds.
Incremental expiring can be better for this use case, assuming you
need good latency figures.

Marc Gravell

May 9, 2012, 8:29:54 AM
to redi...@googlegroups.com
Here's a random example of this using C# / Booksleeve (purely because that is my usual tool-set); I understand that async APIs are available on many platforms etc: http://pastie.org/3884012 

This achieves 50k deletes in about a second (scales linearly, and can be reduced slightly by increasing the semaphore size), by the simple process of not blocking waiting for replies. For example (using 100k items and 100 outstanding items):

From 100000 to 0; 1667ms

Obviously I'm not saying that this is the only or best option; I'm just saying: even in your worst case scenario you should be able to go a lot faster than 200/s.

Marc
--
Regards,

Marc

Clifton Cunningham

May 9, 2012, 8:46:26 AM
to redi...@googlegroups.com
Thanks guys, amazingly fast feedback!

Salvatore, if you are confident that the expiry algorithm should keep up then I will flip back to using expires, and monitor it over a period of time to isolate the specific point at which we run into issues.  I'll also calculate some better metrics re. the number of keys we're creating per second vs the number being expired per second.  FYI - our key structure is basically:

 view:geography:referrer:article:date:hour:minute 1

Would it be better if I used a hash and removed one of the dimensions? E.g. the number of articles per referrer can get up to 15k; would it be more efficient to have each article count as a hash field, giving far fewer keys overall? I find it better having the keys directly like this, as it makes it easier to MGET a set of them over any time period and then graph.
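
To make the hash idea concrete, something like this (Python just for brevity - we actually use node_redis - and the key/field names are purely illustrative):

    import redis

    conn = redis.StrictRedis()

    def count_view(geo, referrer, article, date, hour, minute):
        # one hash per minute slot, with the article id as the field
        key = 'view:%s:%s:%s:%s:%s' % (geo, referrer, date, hour, minute)
        pipe = conn.pipeline(transaction=False)
        pipe.hincrby(key, article, 1)
        pipe.expire(key, 2 * 86400)   # the whole hash (up to ~15k fields) expires at once
        pipe.execute()

That would collapse up to 15k per-article keys into a single key per minute slot, at the cost of losing the easy MGET across time.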

The separate DB per day may be doable, but it slightly complicates things: we are global, so I don't have a clean continuous 24-hour time period, and constructing reports across DBs feels a bit painful, so I will keep that one as my last resort!

Re: performance, I am also surprised at how slow it is, but it may be the client (I'm using the node_redis client). I'll paste the code: http://pastie.org/3884059

I'll try the expires approach again to get some harder numbers and report back.

Clifton


Clifton Cunningham

May 9, 2012, 8:53:55 AM
to redi...@googlegroups.com
Also, FYI, a flushdb with 6M keys took 31s (and stopped all other activity during that time).  We're using 2.4.8.

Josiah Carlson

May 9, 2012, 11:54:07 PM
to redi...@googlegroups.com

As Salvatore already mentioned, expiration should work for you. As an
alternative...

Take your EVAL script, and limit it to X items per call. By paying
attention to the latency of the EVAL call, you can choose your latency
by adjusting X, and run the command repeatedly until it has no more
keys to delete.
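
For example, a sketch of such a capped script (the set name and batch size below are just placeholders), driven from redis-py:

    import redis

    # Delete at most ARGV[1] keys per call, so each EVAL stays short.
    BATCHED_DELETE = """
    local n = tonumber(ARGV[1])
    local deleted = 0
    while deleted < n do
      local key = redis.call('SPOP', KEYS[1])
      if not key then break end
      redis.call('DEL', key)
      deleted = deleted + 1
    end
    return deleted
    """

    conn = redis.StrictRedis()
    # keep calling until the tracking set is drained; tune X (here 500)
    # to whatever per-call latency you can tolerate
    while conn.eval(BATCHED_DELETE, 1, 'keys:2012-05-07', 500) > 0:
        pass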

Alternatively, you can use a ZSET with scores = 0, and have your
client fetch items in blocks of your choosing with ZRANGE and
ZREMRANGEBYRANK, deleting the items in bulk with the already-mentioned
variadic DEL command.

Something like...

import time

def clear_keys(conn, zkey, latency):
    block = 100
    # use a pipeline without MULTI/EXEC
    pipe = conn.pipeline(False)
    def next():
        pipe.zrange(zkey, 0, block)
        pipe.zremrangebyrank(zkey, 0, block)
        return pipe.execute()[-2]

    # start the clock and fetch the first block
    # of items
    start = time.time()
    to_delete = next()
    while to_delete:

        # adjust the block size to try to hit
        # the desired latency
        end = time.time()
        if end - start < latency:
            block <<= 1
        else:
            block = max(3 * block >> 2, 100)

        # delete the items and fetch the next
        # block
        start = time.time()
        pipe.delete(*to_delete)
        to_delete = next()

The above could also be implemented trivially with EVAL in Lua, and
you could get even finer grained control over latency there.
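
A rough sketch of that Lua variant (same caveats as above: the zset name and block size are placeholders):

    import redis

    ZSET_CLEAR_BLOCK = """
    local keys = redis.call('ZRANGE', KEYS[1], 0, ARGV[1] - 1)
    if #keys == 0 then return 0 end
    redis.call('DEL', unpack(keys))
    redis.call('ZREMRANGEBYRANK', KEYS[1], 0, ARGV[1] - 1)
    return #keys
    """

    conn = redis.StrictRedis()
    while conn.eval(ZSET_CLEAR_BLOCK, 1, 'todelete:2012-05-07', 500) > 0:
        pass    # each call is atomic; shrink 500 if the per-call latency is too high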

Regards,
- Josiah

Clifton Cunningham

May 10, 2012, 4:39:59 AM
to redi...@googlegroups.com
Josiah,

Thanks, we have reverted to the expires model, and will track it closely over the next few days (it only takes 72 hrs to fill the 50GB of memory and fall over if we are not clearing out).  My current eval script was:

              local key 
              local members 
              members = redis.call('smembers',KEYS[1]) 
              if table.getn(members) > 0 then 
                key = table.remove(members,1) 
                while(key) do 
                  redis.call('del',key) 
                  key = table.remove(members,1) 
                end 
              end 
              redis.call('del',KEYS[1]) 
              return 1

I will quickly refactor this to delete in batches and test different batch sizes.  Is the approach above re. using a table correct, or would spop / del be faster?

Josiah Carlson

May 10, 2012, 6:51:11 PM
to redi...@googlegroups.com
SMEMBERS on a large set is expensive; don't use it if you can avoid it
(in your situation, it's only a factor of 3 better than using KEYS).

While tables in Lua are actually hash tables, I don't know if
table.remove(members, 1) is fast or slow. You could try to use
table.remove(members) instead, if you insist on using SMEMBERS.

While overall the use of SMEMBERS will be faster, you can't do
anything to control the latency during that call. Using SPOP + DEL
will be overall slower, but you can choose your latency. You can also
call SPOP 100 times or so, then use that for a single call to DEL,
which may help to reduce overhead overall.

Why do I keep mentioning the ability to choose your latency? Assuming
that you are updating counters as this bulk delete operation is going
on, if Redis is paused for seconds while it is deleting keys, then all
of your counter updates will be delayed until it is finished. But if
you incrementally delete items, using (for example) 10 milliseconds at
a time, then most of your counter operations will proceed without a
hiccup.

Regards,
- Josiah
