Concerning "zunionstore then zrangebyscore" feed pattern

138 views
Skip to first unread message

Chan Arthur

unread,
Mar 14, 2014, 12:10:12 PM3/14/14
to redi...@googlegroups.com
Let say one use zunionstore to compute an activity feed from a bunch of sorted set(which use timestamp as score).
If I want to get the "update", i.e. entries that has score greater than certain value, one direct way woudl be to do zunionstore again and then zrangebyscore.

Now the problem is that when doing the second zunionstore, most entries involved are actually "old" and irrelevant. It seems kind of suboptimal. 
But if I do zrangebyscore for each sorted set and then do zunionstore I might be issuing hundreds of zrangebyscore... This doesn't looks good either.

 Seeing that it is a very common problem I am wondering what is the established way to deal with this. Would redis lua scripting be useful for this? 

Josiah Carlson

unread,
Mar 29, 2014, 2:23:59 AM3/29/14
to redi...@googlegroups.com
Redis Lua scripting would be *ideal* for this.


That was fun :)

 - Josiah


--
You received this message because you are subscribed to the Google Groups "Redis DB" group.
To unsubscribe from this group and stop receiving emails from it, send an email to redis-db+u...@googlegroups.com.
To post to this group, send email to redi...@googlegroups.com.
Visit this group at http://groups.google.com/group/redis-db.
For more options, visit https://groups.google.com/d/optout.

Alexandru Stanciu

unread,
Jan 18, 2015, 10:29:27 AM1/18/15
to redi...@googlegroups.com
Hey thanks again Josiah!

I was looking for something like this, is it the only option so far?
This looks like a common pattern especially when dealing with time-sorted sets.

I need to do something similar but the zrangebyscore would be applied with the same (min, max) on all the zsets, then perform the zunionstore on the results.
I'll look into the gist you provided but seems it's not tested indeed :)

Cheers,
Alex

Josiah Carlson

unread,
Jan 19, 2015, 4:22:11 PM1/19/15
to redi...@googlegroups.com
Replies inline.

On Sun, Jan 18, 2015 at 7:29 AM, Alexandru Stanciu <alexandr...@gmail.com> wrote:
Hey thanks again Josiah!

You're welcome, it's always fun to see an old post get some love :)

I was looking for something like this, is it the only option so far?

There aren't special commands in Redis to explicitly union/intersect subregions of sorted sets, so yes, this or something like it is the only option right now.
 
This looks like a common pattern especially when dealing with time-sorted sets.

It might be more or less common than you would think. Certainly in time indexes, but the times that I've needed to do this is in the context of rom[1], which uses sorted sets for all numeric indexes. With rom intending to offer an ORM-like interface to Redis for Python, use of numeric ranges is fairly common.

I need to do something similar but the zrangebyscore would be applied with the same (min, max) on all the zsets, then perform the zunionstore on the results.
I'll look into the gist you provided but seems it's not tested indeed :)

I might have run it a few times on some test inputs, but I certainly haven't run it in production. That doesn't mean it's that far from being production worthy, depending on your needs.

 - Josiah

Alexandru Stanciu

unread,
Jan 20, 2015, 6:43:34 AM1/20/15
to redi...@googlegroups.com
Ok here's my fork of the gist

After I got the idea I fixed some issues and switched to using `register_script` function. 

It works now but not as expected.
I think there's a flow in the algorithm because the scores for a member are not added but just replaced via the zadd command. Here's what I mean

```
127.0.0.1:6379> zadd tmp 1 a
(integer) 1
127.0.0.1:6379> zadd tmp 2 a
(integer) 0
127.0.0.1:6379> zadd tmp 3 a
(integer) 0
127.0.0.1:6379> zrange tmp 0 -1 withscores
1) "a"
2) "3"
```

What should be done is getting the existing zscore of the member in each iteration (for each chunk) and adding the current score. 
Trying to do that, be back shortly

Alexandru Stanciu

unread,
Jan 20, 2015, 7:19:34 AM1/20/15
to redi...@googlegroups.com
Done, it works properly now.

_union_range_score_lua = '''
-- KEYS: [key1, key2, ..., keyn, tempkey]
-- ARGV: [JSON([[min1, max1], [min2, max2], ...])]
local _ARGV = cjson.decode(ARGV[1])
local dest = KEYS[#KEYS]
for i = 1, #_ARGV do
local _a = _ARGV[i]; table.insert(_a, 'withscores')
local chunk = redis.call('ZRANGEBYSCORE', KEYS[i], unpack(_a))
if #chunk > 0 then
for j = 1, #chunk, 2 do
-- Hah, the inversion of ordering for scores and members in ZSET
-- reading vs. writing bites us in the butt.
chunk[j], chunk[j+1] = chunk[j+1], chunk[j]
-- Fixing the flaw
local _s = redis.call('ZSCORE', dest, chunk[j+1])
if _s then
chunk[j] = chunk[j] + tonumber(_s)
end
-- Done, but maybe there's a better way
end
redis.call('ZADD', dest, unpack(chunk))
end
end
local ret = redis.call('ZRANGE', dest, 0, -1, 'withscores')
redis.call('DEL', dest)
return ret
'''

Josiah Carlson

unread,
Jan 20, 2015, 12:08:56 PM1/20/15
to redi...@googlegroups.com
If you are bothering to get the score, you may as well just use ZINCRBY, saving the big ZADD call at the end plus many tonumber() calls. This will be faster (and cleaner):

-- KEYS: [key1, key2, ..., keyn, tempkey]
-- ARGV: [JSON([[min1, max1], [min2, max2], ...])]

local _ARGV = cjson.decode(ARGV[1])
local dest = KEYS[#KEYS]

for i = 1, #_ARGV do
    local _a = _ARGV[i]; table.insert(_a, 'withscores')
    local chunk = redis.call('ZRANGEBYSCORE', KEYS[i], unpack(_a))
    if #chunk > 0 then
        for j = 1, #chunk, 2 do
            chunk[j], chunk[j+1] = chunk[j+1], chunk[j]
            local _s = redis.call('ZINCRBY', dest, chunk[j], chunk[j+1])
        end
    end
end

local ret = redis.call('ZRANGE', dest, 0, -1, 'withscores')
redis.call('DEL', dest)
return ret


 - Josiah


Josiah Carlson

unread,
Jan 20, 2015, 12:11:47 PM1/20/15
to redi...@googlegroups.com
One minor improvement (don't save a result that we're not going to use):

-- KEYS: [key1, key2, ..., keyn, tempkey]
-- ARGV: [JSON([[min1, max1], [min2, max2], ...])]

local _ARGV = cjson.decode(ARGV[1])
local dest = KEYS[#KEYS]

for i = 1, #_ARGV do
    local _a = _ARGV[i]; table.insert(_a, 'withscores')
    local chunk = redis.call('ZRANGEBYSCORE', KEYS[i], unpack(_a))
    if #chunk > 0 then
        for j = 1, #chunk, 2 do
            redis.call('ZINCRBY', dest, chunk[j], chunk[j+1])
        end
    end
end

local ret = redis.call('ZRANGE', dest, 0, -1, 'withscores')
redis.call('DEL', dest)
return ret

 - Josiah

Alexandru Stanciu

unread,
Jan 20, 2015, 1:18:26 PM1/20/15
to redi...@googlegroups.com
Great idea!
You just need to switch the params of the zincrby

redis.call('ZINCRBY', dest, chunk[j+1], chunk[j])

My use case is simpler, I have the same [min, max] for all zrangebyscore calls. Basically I need a zunionstore with a single pair of [min, max] params, so I won't need cjson and the resulting key (dest) will not be deleted in the end. Nice to see there's an easy solution to this problem

Thanks a lot,
Alex

Josiah Carlson

unread,
Jan 20, 2015, 1:22:51 PM1/20/15
to redi...@googlegroups.com
You are right, in removing the chunk item swap in the second one I sent, I forgot to fix the arguments.

 - Josiah
Reply all
Reply to author
Forward
0 new messages