Large set enumeration in Lua: will zscans in a Lua script still block the server?


Jon

Feb 17, 2014, 12:25:32 AM
Hi,

I have a large sorted set (a few hundred thousand entries) and use a Lua script to find certain members of the set. The script iterates over the set with zrange and checks each member for a string match (it was written before 2.8 was out). I was having some performance issues, and when I looked at the slowlog I saw that the script took 1,897,721 microseconds (about 1.9 seconds) to run.
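The script is shaped roughly like this (a sketch only; the key name, search pattern, and page size are placeholders rather than the actual code):

    -- KEYS[1] = sorted set key, ARGV[1] = substring to search for (placeholders)
    local results = {}
    local offset, page = 0, 1000
    while true do
      local members = redis.call('ZRANGE', KEYS[1], offset, offset + page - 1)
      if #members == 0 then break end
      for _, member in ipairs(members) do
        if string.find(member, ARGV[1], 1, true) then
          table.insert(results, member)
        end
      end
      offset = offset + page
    end
    return results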

If I update the script to use zscan, will that change the fact that Redis keeps blocking until the script finishes? My intuition says no, but I just wanted to ask.

Thanks,
Jon

Josiah Carlson

Feb 17, 2014, 4:19:26 PM
Whether you use ZSCAN or ZRANGE, Redis will continue to block until your script completes.

However, it is not clear to me why you are using ZRANGE to search for items in your ZSET. The point of a ZSET is that you can access items by both the member and the score. Are you searching for something within the member? Something else? Because there might be a method that you can use that doesn't require scanning over your entire ZSET.
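For example, both of these are direct lookups rather than scans (key and argument names are just placeholders):

    -- KEYS[1] = zset, ARGV[1] = member, ARGV[2] = max score (all placeholders)
    local score = redis.call('ZSCORE', KEYS[1], ARGV[1])                -- lookup by member
    local due   = redis.call('ZRANGEBYSCORE', KEYS[1], '-inf', ARGV[2]) -- lookup by score range
    return {score, due}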

 - Josiah



Jonathan Hyman

Feb 17, 2014, 4:28:48 PM
Yes, I'm searching for something within the member. We're using the Ruby queuing library Sidekiq, which enqueues future jobs into a sorted set and returns a job id (the entry in the sorted set is a JSON encoding of the job payload, which contains the job id and the job arguments). The code in the library that looks up a job given its id uses zrange, so I'm looking at other options to make it faster.



Josiah Carlson

Feb 17, 2014, 5:01:30 PM
My advice: change your queuing library.

Having written a queuing library myself (RPQueue, Python-based), I can tell you the time-based queue I use is also a ZSET, but one that maps UUID -> TIMESTAMP. Arguments for the job are stored in a HASH that maps UUID -> JSON-encoded arguments. Cancelling a job in any queue is as easy as deleting the UUID from the HASH and optionally removing the UUID from the proper ZSET (multiple queues can be defined, prioritized, etc.).
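A minimal sketch of that layout as two small Lua scripts (key names and argument order here are illustrative, not RPQueue's actual code):

    -- schedule: KEYS[1] = schedule zset, KEYS[2] = payload hash,
    -- ARGV[1] = job uuid, ARGV[2] = run-at timestamp, ARGV[3] = JSON-encoded args
    redis.call('ZADD', KEYS[1], ARGV[2], ARGV[1])   -- uuid -> timestamp
    redis.call('HSET', KEYS[2], ARGV[1], ARGV[3])   -- uuid -> payload
    return ARGV[1]

    -- cancel: same KEYS, ARGV[1] = job uuid; no scan needed, just delete by uuid
    redis.call('HDEL', KEYS[2], ARGV[1])
    return redis.call('ZREM', KEYS[1], ARGV[1])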

I am sure that several of the existing Ruby-based queuing libraries do it the same way I do, even if the keys are different.

 - Josiah

Jonathan Hyman

Feb 17, 2014, 5:08:26 PM
Thanks. I've seen that pattern before (e.g., the Ruby queuing extension resque-scheduler uses it), and I agree that it's better. I'm pretty heavily invested in Sidekiq at the moment, which uses the pattern I mentioned, so for at least the short term I'll try to find something that works with it.

Josiah Carlson

Feb 17, 2014, 8:48:59 PM
Do you *need* to get the result of your query in a single call? Can you make multiple calls and paginate over the ZSET? Would it be okay if your query failed on occasion if the item was executed while you were searching?

If the answers to those questions are "no", "yes", and "yes", respectively, you can change your Lua script to not block everything else running against Redis, but it may take longer to answer your query, and it may be incorrect on occasion (because what you were searching for was already executed).

That said, I did notice at least a pair of race conditions in Sidekiq that could result in queue items being executed multiple times.

 - Josiah

Jonathan Hyman

Feb 17, 2014, 8:53:09 PM
As you guessed, the answers are no, yes, and yes. In that scenario, would I make many calls to a Lua script that each handle a group of pages? Is that how you're imagining the Lua script not blocking?

For the race conditions, can you let me know what you found, or file at https://github.com/mperham/sidekiq/issues?

Josiah Carlson

Feb 17, 2014, 9:17:12 PM
Yes. You would handle a group of results, returning a timestamp (for pagination) to continue from.
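A minimal sketch of one such page-at-a-time script (all key and argument names are illustrative; entries that share the resume score can be skipped, which is acceptable for the best-effort search we discussed):

    -- KEYS[1] = schedule zset, ARGV[1] = substring to match,
    -- ARGV[2] = score to resume from ('-inf' on the first call), ARGV[3] = page size
    local entries = redis.call('ZRANGEBYSCORE', KEYS[1], '(' .. ARGV[2], '+inf',
                               'WITHSCORES', 'LIMIT', 0, tonumber(ARGV[3]))
    local matches, last_score = {}, ARGV[2]
    for i = 1, #entries, 2 do
      if string.find(entries[i], ARGV[1], 1, true) then
        table.insert(matches, entries[i])
      end
      last_score = entries[i + 1]
    end
    -- The caller repeats the call with last_score until no entries come back.
    return {last_score, matches}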

Both add_to_queue() and retry() starting from https://github.com/mperham/sidekiq/blob/master/lib/sidekiq/api.rb#L233 are missing a conn.multi call. If multiple calls are made more or less simultaneously from two different servers running the daemon, items can be pulled multiple times by the zrangebyscore calls, which would result in duplicate Client.push() calls. It is rare, but possible.
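For what it's worth, one way to close that kind of race without MULTI is to claim due entries atomically inside a Lua script (names here are illustrative, not Sidekiq's actual code):

    -- KEYS[1] = schedule zset, ARGV[1] = current timestamp, ARGV[2] = batch size
    local due = redis.call('ZRANGEBYSCORE', KEYS[1], '-inf', ARGV[1],
                           'LIMIT', 0, tonumber(ARGV[2]))
    local claimed = {}
    for _, member in ipairs(due) do
      -- The script runs atomically, so a second daemon calling it will see
      -- these entries already removed and cannot push the same job twice.
      if redis.call('ZREM', KEYS[1], member) == 1 then
        table.insert(claimed, member)
      end
    end
    return claimed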

 - Josiah

Jonathan Hyman

Feb 17, 2014, 9:19:00 PM
Thanks for your help, Josiah.