set, zset, zdiff and sort

Xiangrong Fang

Dec 9, 2010, 2:48:03 AM
to redi...@googlegroups.com
Hi there,

There is no ZDIFFSTORE command for sorted sets, as there is SDIFFSTORE for sets. I found a few requests to add it, but so far it has not been added. A few questions and thoughts:

- Is it possible to add it in v2.2?

- Why are there SINTER and SINTERSTORE, but no ZINTER? There was a suggestion, which I personally think is very good, to add a STORE key option to all applicable commands. Will that suggestion be accepted and added to the future plan?

- As a zset requires more memory, is it "wise" to use a set instead, since it has all the standard operations like union/intersect/diff, and then use SORT when necessary? I.e., does the memory saved outweigh SORT's performance hit?

Thanks,
Shannon

Pieter Noordhuis

Dec 9, 2010, 5:07:38 AM
to redi...@googlegroups.com
Hi,

ZDIFFSTORE makes no sense, as discussed before on the ML (please search before posting). The intrinsic value of the scores gets lost when you simply start subtracting them. So, instead, you can use e.g. ZUNIONSTORE with WEIGHTS 1 -1 -1 ... and do a ZRANGEBYSCORE to get everything with score >= 1.
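
As a rough illustration of the WEIGHTS trick, the following Python sketch models ZUNIONSTORE with the default SUM aggregate using plain dicts (member -> score). The helpers `zunionstore` and `zrangebyscore_min` are hypothetical stand-ins for the Redis commands, not a client API, and the sample data assumes every member has score 1, which is the case where the trick behaves like a set difference.

```python
from collections import defaultdict

def zunionstore(zsets, weights):
    """Model of ZUNIONSTORE with AGGREGATE SUM: each score is multiplied by
    its set's weight, and the weighted scores of shared members are added."""
    result = defaultdict(float)
    for zset, weight in zip(zsets, weights):
        for member, score in zset.items():
            result[member] += score * weight
    return dict(result)

def zrangebyscore_min(zset, minimum):
    """Model of ZRANGEBYSCORE key <minimum> +inf: members ordered by score."""
    hits = [(score, member) for member, score in zset.items() if score >= minimum]
    return [member for _, member in sorted(hits)]

# Assumed sample data: three sorted sets in which every score is 1.
a = {"x": 1, "y": 1, "z": 1}
b = {"y": 1}
c = {"z": 1, "w": 1}

# WEIGHTS 1 -1 -1: members of a that also appear in b or c drop below 1.
combined = zunionstore([a, b, c], [1, -1, -1])
diff = zrangebyscore_min(combined, 1)
print(diff)  # ['x']
```

With arbitrary scores the threshold of 1 no longer separates the two groups cleanly, so the result depends on what the scores mean in each set.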

ZUNION/ZINTER without the STORE is discussed here:
http://code.google.com/p/redis/issues/detail?id=328

Basically, the result needs to be created before being returned, so there is no point in just returning all contained elements. Instead, the result is stored as a key and you have control over how to query it, cache it, etc.

Comparing sets + SORT with sorted sets is like comparing apples and oranges. When you run SORT against a set (and use e.g. external weights to sort them by), the sorting that takes place runs in O(N*log(N)) time complexity, whereas sorted sets are stored in their sorted form and can be queried in O(log(N)) time complexity. The added memory overhead of sorted sets over regular sets is somewhere in the 30%-50% range. You can run some benchmarks to find out what the actual overhead for your particular dataset is.
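
The difference in query cost can be sketched in plain Python. This is only an analogy and not how Redis is implemented: the sketch keeps scores in a sorted Python list and uses `bisect` for lookups. The names `scores`, `top_n_by_sort`, and `range_by_score` are made up for illustration.

```python
import bisect

# Assumed sample data: member -> score (the "external weights" for SORT).
scores = {"alice": 30, "bob": 25, "carol": 41, "dave": 25}

def top_n_by_sort(members, n):
    """Set + SORT style: every query re-sorts all N members, O(N*log(N))."""
    return sorted(members, key=lambda m: (scores[m], m))[:n]

# Sorted-set style: order the (score, member) pairs once, up front.
ordered = sorted((score, member) for member, score in scores.items())
ordered_scores = [score for score, _ in ordered]
ordered_members = [member for _, member in ordered]

def range_by_score(lo, hi):
    """Find both ends of the score range by binary search, O(log(N)),
    then slice out the members in between."""
    start = bisect.bisect_left(ordered_scores, lo)
    stop = bisect.bisect_right(ordered_scores, hi)
    return ordered_members[start:stop]

print(top_n_by_sort(set(scores), 2))  # ['bob', 'dave']
print(range_by_score(25, 30))         # ['bob', 'dave', 'alice']
```

The price of the second approach is the extra bookkeeping needed to keep the data ordered, which is the kind of overhead the 30%-50% figure above refers to.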

Cheers,
Pieter


Xiangrong Fang

Dec 9, 2010, 9:28:44 PM
to redi...@googlegroups.com
Hi Pieter,

Thank you for the detailed explanation.  I have some different ideas:

1) "intrinsic value of scores gets lost" is NOT important in most use cases I can think of.

When you do Z*STORE, you usually want the result in ANOTHER key, i.e. the original value is not destroyed anyway. If you need the score of a value in the result, simply refer to the original sets! This is similar to SORT's STORE parameter: I found (please correct me if I am wrong) that STORE always stores the result as a LIST, no matter whether the key being sorted is a set or something else. In this sense, I think it would even be very beneficial to allow set-like operations on different data types, for example SDIFF <set-a> <zset-b>, which would return a LIST.

However, I realize that this might be crazy, as the underlying data structures might not allow joint operations on different data types. Hence I suggested adding data type conversion commands AND a universal version of the "STORE <key>" parameter, which would basically solve the problem.

2) Mimicking ZDIFFSTORE with ZUNIONSTORE is not reliable, and the behavior is not at all the same as ZDIFFSTORE, because you are relying on the *meaning* of the WEIGHT in *different* zsets. Consider the following example:

We have a population with 2 parameters: "age" and "income". Two zsets store info about the people: age_zset uses age as the score and the person's ID number as the value, and income_zset uses income as the score. Now I want a list of the IDs of people whose age is registered but whose income is unknown. How would I use ZUNIONSTORE here?
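
To make the concern concrete, here is a small Python model of the age/income case. It is a sketch under assumptions: `zunionstore` is a hypothetical dict-based stand-in for the Redis command, and the IDs and scores are invented. The first query applies WEIGHTS 1 -1 with a score >= 1 filter. The second is one possible variant that was not proposed in this thread: WEIGHTS 1 0 with AGGREGATE MIN, which only works if every age is greater than 0.

```python
def zunionstore(zsets, weights, aggregate=sum):
    """Model of ZUNIONSTORE: weight each score, then combine a member's
    weighted scores with `aggregate` (sum or min in this sketch)."""
    weighted = {}
    for zset, weight in zip(zsets, weights):
        for member, score in zset.items():
            weighted.setdefault(member, []).append(score * weight)
    return {member: aggregate(values) for member, values in weighted.items()}

# Invented sample data: id2 has no income; id3's income is below their age.
age_zset = {"id1": 30, "id2": 45, "id3": 52}
income_zset = {"id1": 50000, "id3": 40}

# WEIGHTS 1 -1 (SUM), then keep score >= 1.
naive = zunionstore([age_zset, income_zset], [1, -1])
naive_result = sorted(m for m, s in naive.items() if s >= 1)
print(naive_result)  # ['id2', 'id3'] -- id3 has an income but is still listed

# WEIGHTS 1 0 with AGGREGATE MIN, then keep score > 0 (assumes all ages > 0).
by_min = zunionstore([age_zset, income_zset], [1, 0], aggregate=min)
min_result = sorted(m for m, s in by_min.items() if s > 0)
print(min_result)  # ['id2']
```

The first result depends on how the income scores compare to the age scores, which is the unreliability described above. The second avoids that by zeroing out every member that appears in income_zset, at the cost of relying on ages being positive.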

Best Regards,
Shannon  


2010/12/9 Pieter Noordhuis <pcnoo...@gmail.com>