How to delete Redis keys stored in a Redis set?


Alexey Shuksto

Aug 4, 2014, 11:34:42 AM
to redi...@googlegroups.com
Greetings,

In our project, a server process keeps the set of Redis keys it is currently processing in a special 'backup' Redis set, so that processing can continue if the server process crashes.
After processing is finished, those keys should be removed from the Redis instance. Unfortunately, as keys must be processed in 'batches', we are unable to delete each key right after it is processed.

Of course, we could issue a 'DEL' command from the Redis client in the server process, but that would require sending all processed keys back to Redis -- and Redis already contains all the needed information as members of the 'backup' set.

I was looking for a way to delete all Redis keys that are members of the 'backup' set, and wrote the following script:

  `eval 'for _,key in ipairs(redis.call("SMEMBERS", KEYS[1])) do redis.call("DEL", key) end; redis.call("DEL", KEYS[1])' 1 backup`

But I'm still worried about Lua memory limits if the 'backup' set contains around 50 thousand members -- will Lua be able to iterate over such a table?

I've tried to rewrite this script with 'SSCAN' or 'SPOP', but neither of these two commands works in a script: the first due to its 'non-deterministic behaviour', and the second is simply 'not allowed' (I suppose due to its randomness).

Any thoughts?

Michel Martens

Aug 4, 2014, 11:48:53 AM
to redi...@googlegroups.com
On 4 August 2014 12:34, Alexey Shuksto <ale...@shuksto.name> wrote:
>
> But I'm still worried about Lua memory limits if the 'backup' set contains
> around 50 thousand members -- will Lua be able to iterate over such a table?
>
> I've tried to rewrite this script with 'SSCAN' or 'SPOP', but neither of
> these two commands works in a script: the first due to its 'non-deterministic
> behaviour', and the second is simply 'not allowed' (I suppose due to its
> randomness).

One idea:

To simulate that behavior, maybe you can first store the values in a
list (SORT key1 ALPHA STORE key2 -- ALPHA because the members are key
names, not numbers), then you can delete the original set and iterate
the list using LPOP, all of this from the script.

If the performance is bad because you have too many items, you can
first convert the set, then grab ranges from the list and return a
value that lets the caller know whether the list is empty or whether
it needs to keep calling the script to remove more elements.
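
Something along those lines, roughly (untested; KEYS[2] as the scratch
list and ARGV[1] as the batch size are placeholder choices):

if redis.call("EXISTS", KEYS[2]) == 0 and redis.call("EXISTS", KEYS[1]) == 1 then
    -- one-time conversion: copy the set members into a list, drop the set
    redis.call("SORT", KEYS[1], "ALPHA", "STORE", KEYS[2])
    redis.call("DEL", KEYS[1])
end
local deleted = 0
for i = 1, tonumber(ARGV[1]) do
    local key = redis.call("LPOP", KEYS[2])
    if not key then break end
    redis.call("DEL", key)
    deleted = deleted + 1
end
-- the caller keeps invoking the script until it returns 0
return deleted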

Josiah Carlson

Aug 6, 2014, 12:41:49 AM
to redi...@googlegroups.com
Lua is a programming language running inside a process that stores all of the data Lua will be reading and operating on, plus (possibly orders of magnitude) more data. You should not worry about a 50k element SET being loaded into Lua and processed as a native table. You should instead worry about the amount of data you are trying to delete in one go, which can be addressed by...

local limit = tonumber(ARGV[1])
local keys = redis.call("SMEMBERS", KEYS[1])
-- trim the member list down to at most 'limit' entries
while #keys > limit do
    table.remove(keys)
end
if #keys > 0 then
    -- delete the keys themselves, then drop them from the backup SET
    redis.call("DEL", unpack(keys))
    redis.call("SREM", KEYS[1], unpack(keys))
end
return #keys

Called via:
EVAL "<script>" 1 backup <count> (or EVALSHA <sha1-of-script> 1 backup <count>), repeated until the return value is 0.

Whatever count you provide will be the maximum number of keys to delete from the keyspace and to remove from the SET (so you don't repeat work). Keep the count modest: stock Lua 5.1 can only unpack() a few thousand values at once, so counts in the low thousands are a safe ceiling.

If you feel that repeated calling of SMEMBERS is slow, you are right. This next version addresses the issue by transforming the SET into a ZSET once, then using the built-in pagination over ZSETs to make it faster.

local limit = tonumber(ARGV[1])
local zkey = KEYS[1] .. ":zset"
-- one-time conversion: fold the SET's members into a ZSET we can paginate
if tonumber(redis.call("ZCARD", zkey)) == 0 then
    redis.call("ZUNIONSTORE", zkey, 2, zkey, KEYS[1])
    redis.call("DEL", KEYS[1])
end

-- grab (up to) 'limit' keys, delete them, and forget them
local keys = redis.call("ZRANGE", zkey, 0, limit-1)
if #keys > 0 then
    redis.call("DEL", unpack(keys))
    redis.call("ZREM", zkey, unpack(keys))
end
return #keys


This is called and operates very much like the first version (e.g. EVAL "<script>" 1 backup 1000, repeated until it returns 0); it's just significantly faster for small numbers of keys deleted at one time, due to the one-time SET->ZSET conversion rather than repeated SET fetching.

 - Josiah




Geoffrey Hoffman

Aug 6, 2014, 1:10:27 AM
to redi...@googlegroups.com
This example is just awesome. Thanks for all your posts and answers, Josiah. What Salvatore is to the code, you are to the learning.

Josiah Carlson

Aug 6, 2014, 3:17:50 AM
to redi...@googlegroups.com
Thank you, I really appreciate that :)

To be honest, I got my Ph.D. because I wanted to be a professor. I wanted to teach, to study, and to give back to the world through open source. But for now, I am working for a company. I don't have time to hack on Redis itself, but I do have time to respond to the odd email here and there, to help people who haven't yet gotten help from the growing Redis community.

 - Josiah

Алексей Шуксто

Aug 6, 2014, 6:19:49 AM
to redi...@googlegroups.com
Josiah, first of all -- thank you very much for your answer and the examples you provided.

Could you please elaborate further on why I should worry about the number of 'DEL' commands in one go? Currently our Redis servers are able to handle around 125-150 thousand ops per second, so I thought that even 50K deletes in one go would not stall client request processing for more than a second.

Or are there some other Redis server-side drawbacks that I'm unaware of (memory-related, etc.)?





--
With best regards, Alexey Shuksto.

Josiah Carlson

Aug 6, 2014, 12:10:51 PM
to redi...@googlegroups.com
Whether or not 50k deletes in one go will affect Redis depends on the structures being deleted. If you are deleting 50k strings, HyperLogLog structures (encoded as strings), or smaller lists, sets, hashes, or sorted sets using the ziplist or intset in-memory encodings, then it's not a big deal. But if you are deleting 50k lists, sets, or hashes, each with 100k items, you would be inducing roughly 5 billion memory-free calls (50k keys x 100k items each), which would take several seconds (or minutes) on any machine.

Whether or not it is a concern will depend on the structures at the keys that you are using. Are your structures small? Big? Are they strings/HyperLogLog, or something else?

 - Josiah


Алексей Шуксто

Aug 6, 2014, 12:40:43 PM
to redi...@googlegroups.com
I see your point. Well, all our 'to-be-deleted' structures right now are either hashes with fewer than 10 members or sorted sets with fewer than 250 members each (mostly 5-20), so I think we're pretty safe here. But, rest assured, I will test the delete procedure further, and if it turns out to be slow, we'll use your proposed solution with partial deletion. But then we'll really miss the 'fire and forget' semantics of the current script. :(

Also, I remember a message from Salvatore in this group about a future 'LAZY_DEL' for keys considered 'big', but I think it is just an idea...

Алексей Шуксто

Aug 6, 2014, 12:43:13 PM
to redi...@googlegroups.com
And right after sending the previous e-mail, I realized that instead of 'DEL' we could issue an 'EXPIRE' command with a short expiration time, like 1 second.
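
Roughly, as a sketch (untested; same shape as the original script):

for _, key in ipairs(redis.call('smembers', KEYS[1])) do
    -- hand each key to Redis expiration instead of deleting it inline
    redis.call('expire', key, 1)
end
redis.call('del', KEYS[1])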

Matt Stancliff

Aug 6, 2014, 12:47:01 PM
to redi...@googlegroups.com

On Aug 6, 2014, at 12:40 PM, Алексей Шуксто <ale...@shuksto.name> wrote:

> Also, I remember a message from Salvatore in this group about a future 'LAZY_DEL' for keys considered 'big', but I think it is just an idea...

Lazy delete/free is being tracked in https://github.com/antirez/redis/issues/1748 — feel free to add a comment saying you’re interested in the feature!


-Matt

Itamar Haber

Aug 6, 2014, 2:32:37 PM
to redi...@googlegroups.com
+1

Josiah Carlson

Aug 7, 2014, 2:35:10 PM
to redi...@googlegroups.com
That is a great idea, and might also work, but does have a caveat. If your keyspace is small (fewer than 200k keys in total), then the ratio of (expired keys)/(total keys) will be greater than 1/4. When Redis does lazy background deleting of expired keys (typically 10 times/second), it will continue to delete keys until it thinks that the ratio is lower than 1/4. If basically all of your keys are of this form, then Redis will delete all of your keys in one shot as part of background deletion.

To solve *this* issue, you can instead give each of your keys a random expiration time in the future, or even explicitly distribute their expiration times over a fixed duration in the future... like 1-60 seconds from now:

-- spread expirations over 1..60 seconds instead of freeing everything at once
for i, key in ipairs(redis.call('smembers', KEYS[1])) do
    redis.call('expire', key, (i % 60) + 1)
end
redis.call('del', KEYS[1])

That will give you a slow expiration spread over 60+ seconds (hopefully minimizing instantaneous key deletion), doesn't require LAZYDEL, and is almost* fully fire-and-forget :)

* If key names are re-used, and they are re-used within the expiration time of a key, then there might be issues. But this is easily solved with one suitably random string provided by the client and some key renames. I can describe this if you're curious.

 - Josiah

Алексей Шуксто

Aug 8, 2014, 7:05:16 AM
to redi...@googlegroups.com
Josiah, thanks again for the 'time-distributed expire' idea!

About key name re-use -- there is a little catch: while 'backed up' keys are already renamed for processing so as not to interfere with continued data gathering, there is still a small chance of backup key reuse if our server crashes and then reboots rapidly. To try to avoid this, I've added one more rename of each 'backup' key right before the expire:

for i, key in ipairs(redis.call('smembers', KEYS[1])) do
    -- rename first, so a freshly re-created key with the same name
    -- can't inherit the pending expiration
    local ren = key .. ":delete:" .. i
    redis.call('rename', key, ren)
    redis.call('expire', ren, (i % 60) + 1)
end

I think this additional 'RENAME' can cause visible overhead (via the implicit 'DEL' of an existing destination key) only if a) there is a key name clash and b) the key sits at the same index among the 'backup' set members both times.

Josiah Carlson

Aug 18, 2014, 1:43:40 AM
to redi...@googlegroups.com
Your rename works, and yes, you may end up with a delete... You can pass in a randomly-generated token from your client as an ARGV, which will fix the "problem" if you ever notice the overhead :)
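
For example (a sketch; putting the client's random token in ARGV[1] is
my assumption, and the ':delete:' suffix follows Alexey's snippet):

for i, key in ipairs(redis.call('smembers', KEYS[1])) do
    -- ARGV[1] is a random token from the client, so two runs that happen
    -- to reuse the same key name and index still rename to distinct keys
    local ren = key .. ":delete:" .. ARGV[1] .. ":" .. i
    redis.call('rename', key, ren)
    redis.call('expire', ren, (i % 60) + 1)
end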

 - Josiah

Quinlan Jung

Dec 11, 2017, 1:36:24 AM
to Redis DB
Josiah, the call `redis.call("DEL", unpack(keys))` in your Lua script may not work under Redis Cluster, because with EVAL you need to specify all keys via KEYS so that the cluster can forward the request to the proper node.

From the EVAL docs (https://redis.io/commands/eval):

All Redis commands must be analyzed before execution to determine which keys the command will operate on. In order for this to be true for EVAL, keys must be passed explicitly. This is useful in many ways, but especially to make sure Redis Cluster can forward your request to the appropriate cluster node.

Note this rule is not enforced in order to provide the user with opportunities to abuse the Redis single instance configuration, at the cost of writing scripts not compatible with Redis Cluster.
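
For illustration, a cluster-compatible call passes every key explicitly; the {job} hash tag below (an assumption, not from the thread) forces all keys into the same slot so a single node can run the script:

EVAL "for i = 1, #KEYS do redis.call('DEL', KEYS[i]) end" 3 {job}:k1 {job}:k2 {job}:k3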
