sorted set get rank by score feature suggestion

235 views
Skip to first unread message

Karl Seguin

unread,
May 31, 2011, 11:00:56 AM5/31/11
to redi...@googlegroups.com
Hi,

I'm in need of getting a rank from a sorted set on arbitrary scores (unsaved). Currently the API only allows for ranks from a saved member. The use case that I'm specifically interested in is to return leaderboard ranks to users when the score isn't their best score (and thus isn't saved). We only track a user's best score, but we like to show them what rank a score was - even if its worse than their best rank.

I'm curious if such a patch might be welcomed? 

I haven't programmed in C, but I really need this feature and I'm happy to maintain my own patched fork and run with it. I feel like I have an ok implementation for skiplists, but I'm struggling with the ziplist implementation. I'm thinking I'm going to have to iterate though and call something like zzlGetScore on each entry to find the rank. Any thoughts suggestion?

here's what I have so far if anyone is interested:

karl

Pieter Noordhuis

unread,
May 31, 2011, 11:15:57 AM5/31/11
to redi...@googlegroups.com
Hi Karl,

You can achieve the same thing using already existing commands. You say you want the rank of an arbitrary score. This is equal to knowing the size of the sorted set, and the number of members with a score lower or higher than that score. When you call ZCARD and ZCOUNT in a MULTI/EXEC, you get the size and the number of members from score X to +inf or whatever the range is you are interested in. To find out the rank, you subtract the return value of ZCOUNT from the size and get the relative position of a non-existing element.

Cheers,
Pieter
--
You received this message because you are subscribed to the Google Groups "Redis DB" group.
To post to this group, send email to redi...@googlegroups.com.
To unsubscribe from this group, send email to redis-db+u...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/redis-db?hl=en.

Karl Seguin

unread,
May 31, 2011, 11:19:56 AM5/31/11
to redi...@googlegroups.com
duh, makes sense...at least it made for a fun evening!!

Karl

Josiah Carlson

unread,
Feb 27, 2013, 6:30:03 PM2/27/13
to redi...@googlegroups.com
Using Lua scripting, you can perform the following operations with 1
round trip using O(log n) time:

local result = redis.call('zrangebyscore', KEYS[1], ARGV[1], ARGV[1])
if #result > 0 then
return redis.call('zrank', KEYS[1], result[1])
endif
return -1

- Josiah

On Tue, Feb 26, 2013 at 2:22 PM, Chris Broglie <cbro...@gmail.com> wrote:
> Based on the running time of ZCOUNT, this method would be O(n) where n is
> the number of elements in the set. Finding the rank of an arbitrary score
> should be able to be implemented in O(log(n)).
>
>
> On Tuesday, May 31, 2011 8:19:56 AM UTC-7, Karl Seguin wrote:
>>
>> duh, makes sense...at least it made for a fun evening!!
>>
>> Karl
>
> --
> You received this message because you are subscribed to the Google Groups
> "Redis DB" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to redis-db+u...@googlegroups.com.
>
> To post to this group, send email to redi...@googlegroups.com.
> Visit this group at http://groups.google.com/group/redis-db?hl=en.
> For more options, visit https://groups.google.com/groups/opt_out.
>
>

Josiah Carlson

unread,
Feb 27, 2013, 6:30:27 PM2/27/13
to redi...@googlegroups.com
P.S. That is completely untested, it may require changes.

Chris Broglie

unread,
Feb 27, 2013, 7:19:39 PM2/27/13
to redi...@googlegroups.com
That method requires you to make assumptions about the distribution of the data contained in the set, b/c you need know what min/max to pass into ZRANGEBYSCORE to ensure it is a big enough range to include a key, but not so large that it returns tons of data. The min and max values would have to be very different if the data values ranged from 0-1 vs. 0-1000000. 

Josiah Carlson

unread,
Feb 28, 2013, 4:23:31 PM2/28/13
to redi...@googlegroups.com
The original post that you resurrected only cared about getting a rank
of an item from the score.

In this case, we can trivially address your concern by using the
'limit' clause available to ZRANGEBYSCORE.

Also, if *you* have a particular problem you are looking to solve that
is unrelated to the original post, maybe you want to describe your
problem.

Regards,
- Josiah

Chris Broglie

unread,
Feb 28, 2013, 4:53:33 PM2/28/13
to redi...@googlegroups.com
I may be missing something, but I don't believe that the 'limit' clause addresses my concern. Let me try to rephrase: given an arbitrary score (one which may or may not be present in the set), I want to get the rank of the closest key.

The problem with using ZRANGEBYSCORE is that without intrinsic knowledge of the values in the set, there is no way to figure out which min and max values to use. Let's say for example the set contained the elements [1,5,7,8,9,10], and I wanted to get the rank for the score 6. In this case, I could naively use +/-1 for the range and pass in 5 and 7 as the min and max. ZRANGEBYSCORE would return 2 elements, and I could use the either one. Great. But what if I want the rank of score 3? Passing in 2 and 4 for the min and max would return no data.

Sure, in this example using a larger range for the min and max (say +/- 2) from the target score would work. But this is just a toy example. What if the set contained a million elements between 1 and 5? You could limit to the first x results, but that's returning the first x results, rather than the x in the middle of the range.

Does that make sense?

Josiah Carlson

unread,
Feb 28, 2013, 5:09:05 PM2/28/13
to redi...@googlegroups.com
ZRANGEBYSCORE key 6 inf limit 0 1

You can use 'inf' and '-inf' as endpoints. This particular use of
limit guarantees you are only returning 1 item.


But maybe I'm not understanding your question. Do you want the next
larger? Next smaller? Either? Both? The closest?

If you want the closest, then you can use ZRANGEBYSCORE as I
described, and also 'ZREVRANGEBYSCORE key 6 -inf limit 0 1' (which get
the next smaller if the score doesn't exist), then based on the scores
compared to the value you were actually interested in, you could then
decide which to use, what rank to pull ranges from, etc.

- Josiah

Chris Broglie

unread,
Feb 28, 2013, 5:32:38 PM2/28/13
to redi...@googlegroups.com
You are absolutely right. The concept that I was missing was using the target score as one of the bounds. Sorry that it took me so long to grasp that :)

-Chris

Josiah Carlson

unread,
Feb 28, 2013, 6:31:09 PM2/28/13
to redi...@googlegroups.com
No worries, I'm glad your problem is now solved :)

- Josiah
Reply all
Reply to author
Forward
0 new messages