SDIFF but no ZDIFF, why?

1,747 views
Skip to first unread message

Thomas Fritz

unread,
Apr 1, 2012, 4:43:59 AM4/1/12
to Redis
Hello, i was wondering why there is a SDIFF and SDIFFSTORE but no ZDIFF and ZDIFFSTORE?

Kind regards

Josiah Carlson

unread,
Apr 1, 2012, 12:44:22 PM4/1/12
to redi...@googlegroups.com
What is the difference of two zsets? Do you remove every item from the
first zset that was in the second? Or is it just like passing a weight
of -1 to zinterstore for the second zset? That ambiguity would lead to
confusion, so rather than confuse people, it's just not there.

Regards,
- Josiah

On Sun, Apr 1, 2012 at 1:43 AM, Thomas Fritz <frit...@gmail.com> wrote:
> Hello, i was wondering why there is a SDIFF and SDIFFSTORE but no ZDIFF and ZDIFFSTORE?
>
> Kind regards
>

> --
> You received this message because you are subscribed to the Google Groups "Redis DB" group.
> To post to this group, send email to redi...@googlegroups.com.
> To unsubscribe from this group, send email to redis-db+u...@googlegroups.com.
> For more options, visit this group at http://groups.google.com/group/redis-db?hl=en.
>

Thomas Fritz

unread,
Apr 2, 2012, 2:52:00 AM4/2/12
to redi...@googlegroups.com
I think it should exactly behave like http://redis.io/commands/sdiff 
Subtract all values from the first key found in the given keys after the first one. So just values with the original score of the first key which are not in any of the other given keys are returned.

Regards
Thomas

Josiah Carlson

unread,
Apr 2, 2012, 3:42:23 AM4/2/12
to redi...@googlegroups.com
You may want it to behave like that, but others have disagreed in the
past about what it should do. Zdiffstore doesn't exist because
Salvatore rejected it. I was trying to explain why it's not there, by
offering a question that could be reasonably answered differently by
different people.

Regards,
- Josiah

Josiah Carlson

unread,
Jul 17, 2012, 11:45:45 AM7/17/12
to redi...@googlegroups.com
On Mon, Jul 16, 2012 at 9:41 PM, Austin Alexander
<austin.a...@gmail.com> wrote:
> I'd be willing to bet that most reasonable people would not expect ZDIFF to
> behave differently than SDIFF (no pun intended). If it is called a Set
> (sorted or not), then one would assume that it would support standard set
> operations. And as Josiah pointed out, the alternative interpretation could
> already be accomplished in a single command with ZINTERSTORE and weight -1.
>
> However, if confusion remains a concern, then how about allowing SDIFF to be
> used on Sorted Sets? Since it would behave exactly the same as, well,
> SDIFF, the results would be in line with expectations. This approach would
> not retain the weights, but it would still be nice to be able to mix the two
> without doing an intermediate conversion.

Allowing only one command among a group of related commands to take a
different type of argument would violate a variety of expectations. So
what you are actually proposing is the ability for SINTER, SUNION, and
SDIFF to operate on ZSETs but discard the scores.

While I can't think of a particular situation where I'd have found it
necessary, I do like the symmetry (SETs can already become ZSETs via
the ZSET union and intersect commands), and I suspect that the code
changes would be very minimal.

Related to that, losing information in a ZSET -> SET conversion is
great when that is what you want, but I suspect that something like
the following wwould become a thing...

SDIFFSTORE zsetresult zset1 zset2
ZINTERSTORE zsetresult 2 zsetresult zset1 weights 0 1

... someone performs the difference to get the set itself, followed by
an intersection to get the scores back. My gut says that just offering
ZDIFFSTORE with the same semantics of SDIFFSTORE is cleaner and less
confusing. It also tells me that losing information via ZSET -> SET
conversion is possibly a bad idea, and that just finishing ZSET
operations out with ZDIFFSTORE may actually be the right answer. Then
again, assuming that you have non-zero scores in your ZSETs,
ZDIFFSTORE between two ZSETs can be implemented via...

ZINTERSTORE temp 2 zset1 zset2 weights 1 0
ZUNIONSTORE result 2 zset1 temp weights 1 -1
ZREMRANGEBYSCORE result 0 0
DEL temp

... Which seems to suggest that ZDIFFSTORE would be useful. The above
(with the optional ZUNION that would let this perform on arbitrary
numbers of ZSETs) could be implemented via Lua script, but that
doesn't feel quite right to me.

> On a similar topic, why no ZINTER or ZUNION without STORE? A good default
> option would be to return all of the results like with SINTER and SUNION.
> This is just my personal perspective, but it seems that Sorted Sets should
> behave as an extension of Sets, inheriting/supporting the same methods if
> possible.

The majority of use-cases involving intersection and union on ZSETs
will fetch a sub-range of a potentially large result. Omitting the
non-store version ensures that you know what you want when you fetch
it. Also, if you wanted the non-store variant, you can get the exact
same behavior with a multi/exec transaction. Perform your operations,
followed by a ZRANGE key 0 -1, followed by a DEL. The STORE variants
are strictly more powerful than their non-store counterparts, and by
not having the non-store versions, that is 2 fewer commands to
implement, document, and maintain.

If you feel like ZDIFFSTORE is something that you really want;
implement it, create a feature request on the tracker, and link to
your fork. Including tests will help. Ultimately it's up to Salvatore,
but having code can go a long way towards convincing someone that the
idea is good.

Regards,
- Josiah

Austin Alexander

unread,
Jul 18, 2012, 11:19:51 AM7/18/12
to redi...@googlegroups.com
Josiah, thank you for your thoughtful response.


On Tuesday, July 17, 2012 11:45:45 AM UTC-4, Josiah Carlson wrote:
 
Allowing only one command among a group of related commands to take a
different type of argument would violate a variety of expectations. So
what you are actually proposing is the ability for SINTER, SUNION, and
SDIFF to operate on ZSETs but discard the scores.

While I can't think of a particular situation where I'd have found it
necessary, I do like the symmetry (SETs can already become ZSETs via
the ZSET union and intersect commands), and I suspect that the code
changes would be very minimal.

Agreed.  Symmetry would be nice and there are instances when a SET is all that is needed after a certain point.

just finishing ZSET
operations out with ZDIFFSTORE may actually be the right answer. Then
again, assuming that you have non-zero scores in your ZSETs,
ZDIFFSTORE between two ZSETs can be implemented via...

ZINTERSTORE temp 2 zset1 zset2 weights 1 0
ZUNIONSTORE result 2 zset1 temp weights 1 -1
ZREMRANGEBYSCORE result 0 0
DEL temp

All of the proposed operations can be performed with combinations of the existing commands, and it has been helpful to see patterns like the above, but this is neither convenient nor particularly efficient. 


> On a similar topic, why no ZINTER or ZUNION without STORE?  A good default
> option would be to return all of the results like with SINTER and SUNION.
> This is just my personal perspective, but it seems that Sorted Sets should
> behave as an extension of Sets, inheriting/supporting the same methods if
> possible.

The majority of use-cases involving intersection and union on ZSETs
will fetch a sub-range of a potentially large result. Omitting the
non-store version ensures that you know what you want when you fetch
it. Also, if you wanted the non-store variant, you can get the exact
same behavior with a multi/exec transaction. Perform your operations,
followed by a ZRANGE key 0 -1, followed by a DEL. The STORE variants
are strictly more powerful than their non-store counterparts, and by
not having the non-store versions, that is 2 fewer commands to
implement, document, and maintain.

I agree that being able to get a sub-range is often useful, but there are other times when the full results are all that are needed and, while the same effect can be achieved as you suggested, the asymmetry with SET is bothersome.  By the same argument, the SET versions could be replaced by their STORE variants followed by an SMEMBERS command followed by a DEL.
  

If you feel like ZDIFFSTORE is something that you really want;
implement it, create a feature request on the tracker, and link to
your fork. Including tests will help. Ultimately it's up to Salvatore,
but having code can go a long way towards convincing someone that the
idea is good.

After I originally posted, I found that this has been a recurring request going back to at least 2010.  Apparently someone already did a version ZDIFFSTORE (https://github.com/zooniverse/redis/blob/zdiffstore/src/t_zset.c#L1628-1655) although it doesn't seem to have been accepted yet. 

Xiangrong Fang

unread,
Jul 18, 2012, 11:30:33 AM7/18/12
to redi...@googlegroups.com
This trick relies on NON-ZERO score, right?   In fact, I also requested this command (ZDIFFSTORE) a long time ago, and is just doing it today.  

I just upgraded to redis 2.6 and is now doing some heavy-lifting in lua script, because I cannot guarantee the score of my ZSETs are "non-zero".   Is there any simple way to "unify" scores of all members in a zset to a specific value?

Thanks,
Xiangrong

2012/7/18 Austin Alexander <austin.a...@gmail.com>

--
You received this message because you are subscribed to the Google Groups "Redis DB" group.
To view this discussion on the web visit https://groups.google.com/d/msg/redis-db/-/J764r7GgaEgJ.

To post to this group, send email to redi...@googlegroups.com.
To unsubscribe from this group, send email to redis-db+u...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/redis-db?hl=en.



--


Austin Alexander

unread,
Jul 18, 2012, 12:01:12 PM7/18/12
to redi...@googlegroups.com
Xiangrong,

That's a good point, is does rely on non-zero scores.  I can think of a few ways to still make it work with zero scores, but none that are very clean.

If you want to "unify" the scores of all of the members of a zset, then why not just use a set?  Or perhaps you had in mind a way to quickly apply something like ZINCRBY with a single value to all members of a zset, which with a large enough number could allow them to all exceed zero while still preserving the scores.
Reply all
Reply to author
Forward
0 new messages