Lua script performance involving scan


Neelesh Korade

Apr 15, 2015, 9:37:05 PM
to redi...@googlegroups.com
Hello All

We are looking to use Redis for one of our use cases, in which we need to SCAN a large number of "rows" with MATCH and return the result set. This needs to be done as efficiently as possible, as we must keep latencies really low.

As an example, we may scan up to a million rows, and potentially half of those could match the criteria.

Following is what the script looks like at a high level. I would highly appreciate any recommendations or suggestions for achieving this with the best possible performance. A few context-sensitive questions are embedded as comments in the code below; some others are given at the end.

---------LUA Start-----------
local all_keys = {}
local cursor = "0"
repeat
    local result = redis.call("SCAN", cursor, "MATCH", ARGV[1], "COUNT", ARGV[2])
    cursor = result[1]
    local keys = result[2]
    for _, key in ipairs(keys) do
        local value = redis.call("HMGET", key, "field1", "field2", "field3", "field4", "field5")
        if value[1] == "true" then
            -- Q1) Is setting a matched row in a Lua table more efficient than setting it in a redis key (for example, in a hashmap using HMSET)? Currently, the following command seems to be the main performance bottleneck.
            -- Note: matched rows are appended to the array part of the table, because
            -- only the array part of a Lua table survives conversion to a Redis reply.
            all_keys[#all_keys + 1] = {key, value}
        end
    end
-- SCAN returns the cursor as a string, so compare against "0", not 0
until cursor == "0"
-- Q2) What would be the implications of returning a huge result set? Persisting result set in redis key
return all_keys
---------LUA End-----------

Q3) Lua best practice suggests keeping scripts as small as possible. What would be an optimal breakup of this script?

I am getting back to Redis after a long gap. So my apologies for any ignorance.

Regards
Neelesh

Itamar Haber

Apr 16, 2015, 6:30:46 AM
to redi...@googlegroups.com
Hi Neelesh,

Inline.

On Thu, Apr 16, 2015 at 4:36 AM, Neelesh Korade <neelesh...@gmail.com> wrote:
Hello All

We are looking to use Redis for one of our use cases, in which we need to SCAN a large number of "rows" with MATCH and return the result set. This needs to be done as efficiently as possible, as we must keep latencies really low.


SCANning the keyspace is always going to take time, and the more keys you have the longer it will take. Keeping the latencies low, especially given the need to return a largish result set, could be very ambitious.
 
As an example, we may scan up to a million rows and potentially half of those could match the criteria. 

Following is what the script looks like at a high level. I would highly appreciate any recommendations or suggestions for achieving this with the best possible performance. A few context-sensitive questions are embedded as comments in the code below; some others are given at the end.

---------LUA Start-----------
local all_keys = {};
local t = {};
local done = false;
local cursor = "0"
local value = "";
local val = "";
repeat
local result = redis.call("SCAN", cursor, "MATCH", ARGV[1], "count", ARGV[2])
cursor = result[1];
t = result[2];
for i,key in pairs(t) do
value = redis.call("HMGET",key,"field1","field2","field3","field4","field5")
if value[1] == "true" then
-- Q1) Is setting a matched row in a LUA table more efficient than setting it in a redis key (for example, in a hashmap using HMSET)? Currently, the following command seems to be the main performance bottleneck.

Yes - setting it in a table is done inside the context of the Lua engine, whereas doing a redis.call is more expensive. Furthermore, doing HMSET (or any write operation, for that matter) is illegal after using a non-deterministic command (such as SCAN), and Redis will error politely to remind you of that.
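For illustration, a minimal sketch that trips this restriction on Redis of that era (pre-3.2, before effect-based script replication); the key name here is made up:

```lua
-- Sketch: a write after SCAN inside one script.
-- On pre-3.2 Redis this fails with an error like
-- "Write commands not allowed after non deterministic commands".
local result = redis.call("SCAN", "0", "COUNT", 10)
redis.call("HMSET", "results:tmp", "cursor", result[1]) -- rejected here
return result[1]
```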
 
all_keys[key] = value
end
end
if cursor == "0" then
done = true;
end
until done
-- Q2) What would be the implications of returning a huge result set? Persisting result set in redis key 

Excellent question - you'd be using RAM. The result set (Lua table) needs to be stored somewhere, right? If it is as large as you plan it to be, you'd better have lots and lots of spare RAM on your server (because it sounds like you're going to store 0.5M Hashes x 5 fields in it). Also note that Lua's memory allocation isn't controlled very tightly by configuration directives, so you can easily get Redis OOMed by the OS when things are tight :)
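One way to hedge against both the RAM and the latency concerns is to let the client drive the loop: each EVAL performs a single SCAN step and returns the next cursor together with that chunk's matches. A sketch, assuming the same field layout as the script above:

```lua
-- One SCAN step per invocation; the client passes the cursor back in.
-- ARGV[1] = cursor ("0" on the first call), ARGV[2] = MATCH pattern, ARGV[3] = COUNT hint
local result = redis.call("SCAN", ARGV[1], "MATCH", ARGV[2], "COUNT", ARGV[3])
local matches = {}
for _, key in ipairs(result[2]) do
    local value = redis.call("HMGET", key, "field1", "field2", "field3", "field4", "field5")
    if value[1] == "true" then
        matches[#matches + 1] = {key, value}
    end
end
-- Return the next cursor plus this chunk; a returned cursor of "0" means the scan is done.
return {result[1], matches}
```

The client keeps calling EVAL with the returned cursor until it gets "0" back, so no single call has to hold the whole result set in the script's memory.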
 
return all_keys
---------LUA End-----------

Q3) Lua best practice suggests keeping scripts as small as possible. What would be an optimal breakup of this script?

Your script is small enough - don't worry.
 

I am getting back to Redis after a long gap. So my apologies for any ignorance.


Welcome back :) I still don't quite understand a few points with what you're trying to do:
1. How low is a "really low" latency?
2. How big is your dataset? Is 1M "rows" (keys, I guess) the total database size, or are there other keys in it that don't match the pattern that's SCANned?
3. What is your app doing with a bulk of 0.5M hashes and their fields?
4. Are the key patterns totally ad-hoc or can you precompute the result set? This is key to decide because Redis, like other NoSQL databases, excels when you fetch the data according to the way that it is stored.
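To illustrate point 4 with a sketch (the key names are hypothetical): if the predicate - field1 == "true" - is known in advance, the matching keys can be indexed in a Set at write time, so the query becomes a single SMEMBERS (or SSCAN) instead of a keyspace scan:

```lua
-- Write path: update the hash and keep the "field1 is true" index in sync.
-- KEYS[1] = hash key, KEYS[2] = index set (e.g. "idx:field1:true"), ARGV[1] = new field1 value
redis.call("HSET", KEYS[1], "field1", ARGV[1])
if ARGV[1] == "true" then
    redis.call("SADD", KEYS[2], KEYS[1])
else
    redis.call("SREM", KEYS[2], KEYS[1])
end
return redis.call("SCARD", KEYS[2])
```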

Cheers,
 

--
You received this message because you are subscribed to the Google Groups "Redis DB" group.
To unsubscribe from this group and stop receiving emails from it, send an email to redis-db+u...@googlegroups.com.
To post to this group, send email to redi...@googlegroups.com.
Visit this group at http://groups.google.com/group/redis-db.
For more options, visit https://groups.google.com/d/optout.



--

Itamar Haber | Chief Developers Advocate
Redis Watch Newsletter - Curator and Janitor
Redis Labs - Enterprise-Class Redis for Developers

Mobile: +1 (415) 688 2443
Mobile (IL): +972 (54) 567 9692
Email: ita...@redislabs.com
Skype: itamar.haber

Blog  |  Twitter  |  LinkedIn


Tomek H

Aug 19, 2022, 11:34:46 AM
to Redis DB
Hi!

Is it possible to create a generator in Lua that continuously yields results while performing the scan, instead of gathering all the results first and returning them at the end?
Something like an iterator or generator in Python: https://wiki.python.org/moin/Generators

Best,
 Tomasz

Itamar Haber

Aug 19, 2022, 11:53:08 AM
to Redis DB
Hi Tomasz,

Lua provides the [coroutine](https://www.lua.org/manual/5.1/manual.html#2.11) mechanism that is similar to the Python generator pattern.

However, because Redis prevents scripts from keeping a global context (only `local` variables and functions are allowed), it isn't possible to use coroutines meaningfully, as far as I know.
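For reference, this is what the coroutine-as-generator pattern looks like in plain Lua; it runs fine in a standalone interpreter, but not usefully inside a Redis script for the reason above:

```lua
-- A plain-Lua coroutine behaving like a Python generator.
local function counter(n)
    return coroutine.wrap(function()
        for i = 1, n do
            coroutine.yield(i)
        end
    end)
end

local gen = counter(3)
print(gen()) -- 1
print(gen()) -- 2
print(gen()) -- 3
```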

Cheers,
Itamar
