Why are multikey operations not supported on Redis cluster?

Sidd S

Aug 11, 2017, 2:51:28 PM
to Redis DB
I was wondering whether this is by design, whether there is some limitation, or whether nobody has gotten to it yet. I would like to add some custom code to allow multikey functionality. AFAIK, a cluster of three masters is, in a way, just three separate Redis instances, each with fewer slots. Shouldn't I be able to program the proxy to figure out which keys belong to which instance and then send a multikey request to each node with the corresponding keys? I know that even within a node, a multikey request will fail if two keys are in different slots, but in single-instance Redis mode we can make multikey requests that span different hash slots, so I am wondering why the same logic can't be applied to nodes in a cluster.

Marc Gravell

Aug 11, 2017, 4:15:44 PM
to redi...@googlegroups.com
Multi-key operations *are* supported - you just need to ensure all the keys are in the same slot, which guarantees they're always on the same node. The way to do that is via *hash tags*. For example, the keys "abc/{foo}" and "{foo}/def" both use the hash tag "foo" (via curlies), so only "foo" is hashed to pick the slot.

If you want multi-key operations, you'll be using lots of curly brackets :)
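
To make the hash-tag rule concrete, here is a minimal Python sketch of how a key could be mapped to a slot. It assumes the CRC16 variant described in the Redis Cluster specification (which `binascii.crc_hqx` with an initial value of 0 should match) and 16384 slots. The names `hash_tag_part` and `key_slot` are made up for illustration and are not part of any client library.

```python
import binascii

SLOT_COUNT = 16384  # number of hash slots in Redis Cluster


def hash_tag_part(key: bytes) -> bytes:
    """Return the part of the key that gets hashed, honouring {hash tags}."""
    start = key.find(b"{")
    if start != -1:
        end = key.find(b"}", start + 1)
        # Only a non-empty tag counts; with "{}" the whole key is hashed.
        if end != -1 and end > start + 1:
            return key[start + 1:end]
    return key


def key_slot(key: str) -> int:
    """Map a key to a slot number in the range 0..16383."""
    # crc_hqx with initial value 0 is CRC-16/XMODEM (polynomial 0x1021).
    return binascii.crc_hqx(hash_tag_part(key.encode()), 0) % SLOT_COUNT
```

With this sketch, `key_slot("abc/{foo}")`, `key_slot("{foo}/def")` and `key_slot("foo")` all return the same slot, while untagged keys such as "abc" and "def" are hashed in full and will usually land in different slots.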

--
You received this message because you are subscribed to the Google Groups "Redis DB" group.
To unsubscribe from this group and stop receiving emails from it, send an email to redis-db+unsubscribe@googlegroups.com.
To post to this group, send email to redi...@googlegroups.com.
Visit this group at https://groups.google.com/group/redis-db.
For more options, visit https://groups.google.com/d/optout.

Sidd S

Aug 12, 2017, 10:46:12 AM
to Redis DB
Right! So I was aware of that, but why is that requirement even there? Why do they need to be in the same slot?

Marc Gravell

Aug 12, 2017, 11:59:19 AM
to redi...@googlegroups.com
Because they need to be on the same node in order for multi-key operations to be supported, and hash tags are the only way to guarantee single-node placement without it working purely by chance.

Single-node is the only supported scenario for Redis server-side operations currently. You could of course do the orchestration client-side, in which case: go for it! But be aware of the atomicity aspects, which are hard to get right with multiple masters involved.

Sidd S

Aug 14, 2017, 10:07:35 AM
to Redis DB
Hi Marc,

So I am realizing that the notion of hash slots does not exist in the single-instance version of Redis. Is that true? In that case, suppose I decide to use only one hash slot per node. So if I have three nodes, I would have three hash slots in total. Am I compromising the efficiency of Redis by doing this?

My one problem with using hash tags is this. Suppose I want to run the following commands:

MSET a 10 b 20
MSET a 30 c 40
MGET a b c

Obviously, the above will likely not work, because the keys would go to different hash slots, so I would need to use hash tags there. The problem is then how to hash tag these commands. There doesn't seem to be an intuitive way of doing it, because my "GET" command would have to know exactly how I hash tagged the 'a' key. The client would then have to know exactly which hash tags were used with which keys, and that is just not very elegant.

Anyway, you mention that the keys all have to be on the same node. However, it is very possible for two keys to be in different hash slots but on the same node, and multikey operations will still not work then. (For example, if 'a' hashes to 100 and 'b' hashes to 500, the command would fail, despite the keys being on the same node.) If machine 1 is responsible for slots 0-5000, then my MSET command could (in theory) check whether all of the provided keys fall within that range. Would that not be easy to implement? I am looking through the source code right now to see how it could be done (I am still somewhat new to this), but I am hoping I can get some advice before I start.

By the way, thank you for replying to my posts!
 

AlexanderB

Aug 14, 2017, 1:23:03 PM
to Redis DB
MGET / MSET and similar commands are often handled by the Redis client you are using. These are a bit of a special case. While Redis in cluster mode doesn't actually support these commands across slots, many Redis clients will transparently convert them into a series of simple GET and SET commands and then send each one to the appropriate node.

For instance, here is the mset implementation in redis-py-cluster: https://github.com/Grokzen/redis-py-cluster/blob/13503d9954456882a70580750464885ac8540b3f/rediscluster/client.py#L661. It actually only sends SET commands to the nodes, but still allows the app to be written in terms of MSET. The one big difference is that this cluster implementation is no longer guaranteed to be atomic.

I'd suggest digging through the docs for whatever client you are using to talk to Redis from your application. You'll likely find that it handles some of the multikey operations by splitting them into a number of operations for you automatically.
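
As a rough illustration of that kind of client-side splitting (this is not the actual redis-py-cluster code), the sketch below plans an MSET as individual SET commands grouped by the node that owns each key's slot. The slot ranges, node names and the helper `plan_mset` are all hypothetical; a real client would learn the topology from the cluster instead of hard-coding it.

```python
import binascii
from collections import defaultdict

SLOT_COUNT = 16384

# Hypothetical topology: (first_slot, last_slot, node_name).
SLOT_RANGES = [
    (0, 5460, "node-a"),
    (5461, 10922, "node-b"),
    (10923, 16383, "node-c"),
]


def key_slot(key: str) -> int:
    """CRC16 (XMODEM) of the key, or of its non-empty {hash tag}, mod 16384."""
    raw = key.encode()
    start = raw.find(b"{")
    if start != -1:
        end = raw.find(b"}", start + 1)
        if end > start + 1:
            raw = raw[start + 1:end]
    return binascii.crc_hqx(raw, 0) % SLOT_COUNT


def node_for_slot(slot: int) -> str:
    for first, last, node in SLOT_RANGES:
        if first <= slot <= last:
            return node
    raise ValueError(f"slot {slot} is not assigned to any node")


def plan_mset(mapping: dict) -> dict:
    """Turn one MSET into per-node lists of plain SET commands."""
    plan = defaultdict(list)
    for key, value in mapping.items():
        plan[node_for_slot(key_slot(key))].append(("SET", key, value))
    return dict(plan)
```

Each per-node list could then be pipelined to its node. Nothing here makes the overall operation atomic: one node's SETs can succeed while another's fail.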

Sidd S

Aug 14, 2017, 1:36:07 PM
to Redis DB
Thanks for jumping in, Alexander. I was actually using the C-based hiredis_vip client, which does not (at the moment) take care of multikey operations. The implementation you showed is clearly an easy solution (and it would be easy to implement in hiredis_vip), but I don't think it would be efficient (correct me if I am wrong). We would have to send multiple requests to Redis, whereas a more efficient solution would send a single request for all the information. Would it not be better if this were taken care of server side? I am still missing the answer to my original question: why is this not already taken care of server side? If it were, clients would not need a hacky way of handling MSET/MGET.

I have some solutions in mind, but I am still trying to figure out what the point of hash slots is. More specifically, am I sacrificing efficiency if I decide that, for any one node, I only want to use a single hash slot?

Marc Gravell

Aug 14, 2017, 3:03:30 PM
to redi...@googlegroups.com
If you are going to use one hash slot per node (assuming you've done the work to check that it actually shards the way you expect, and you don't get one node serving all 3), you might as well forget about cluster and just use 3 single-node instances, directing the traffic manually. Yes, using a "one slot per node" strategy would be a very poor use of cluster, giving you none of the advantages but still all of the admin / setup overhead.

Sidd S

Aug 15, 2017, 6:37:34 PM
to Redis DB
Alright, so I have come back with a much more concrete question related to this. Check out this line: https://github.com/antirez/redis/blob/202c2ebec4d47d6f8cfbb6c91dd4486dd62aebf6/src/cluster.c#L5294

Why does this line check whether the slots are exactly the same? Why doesn't it check whether the nodes are the same instead? In my version of the code, I changed this to:
if (n != server.cluster->slots[thisslot]) {

This code works perfectly for me, but it makes no sense to me why it isn't already like that. What is wrong with this logic? What have I lost by doing this? At the least, I know that I have gained a significantly higher probability of a multikey operation succeeding. For example, on a 3-node cluster, setting two untagged keys at once would traditionally have about a 1/16384 chance of succeeding. Now it has roughly a 1/3 chance. With a few more minor tweaks (either client side or server side), this probability can be raised to 1 if I figure out how to route keys to the correct node.
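
For what it's worth, here is the back-of-the-envelope arithmetic behind those two numbers as a small Python sketch. It assumes two keys that hash uniformly and independently over 16384 slots, and a hypothetical even split of the slots across three nodes (5461, 5462 and 5461 slots); the function names are made up for illustration.

```python
from fractions import Fraction

SLOT_COUNT = 16384
SLOTS_PER_NODE = [5461, 5462, 5461]  # hypothetical even 3-node split


def p_same_slot() -> Fraction:
    """Chance that two independent, uniformly hashed keys share a slot."""
    return Fraction(1, SLOT_COUNT)


def p_same_node(slots_per_node) -> Fraction:
    """Chance that both keys fall on the same node, for a given slot split."""
    return sum(Fraction(n, SLOT_COUNT) ** 2 for n in slots_per_node)
```

Under those assumptions `p_same_slot()` is exactly 1/16384 and `p_same_node(SLOTS_PER_NODE)` is very close to 1/3. The second number depends entirely on how the slots happen to be distributed at that moment.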

Itamar Haber

Aug 15, 2017, 7:11:11 PM
to Redis DB
Hello Sidd,

The whole point of using slots is so they can be moved around the nodes. Your change means that a given multi-key command may run on one cluster topology (where multiple keys in multiple different slots are on the same node) but fail on others. This type of unpredictability isn't something you want to bring into the game, hence the strict requirement for same-slottiness.

--

Itamar Haber | Chief OSS Education Officer
Redis Labs ~/redis

Marc Gravell

Aug 15, 2017, 7:38:55 PM
to redi...@googlegroups.com
To echo that: this would be working by coincidence, not design. It could easily lead you to make incorrect "ah, that works" conclusions, when it is working by pure accident of math. Then suddenly you're shocked when it doesn't work on your "real" cluster (with different numbers of nodes or distributions of slots), or even worse: it works for 12 months and then suddenly starts failing when someone does an admin operation to move some slots around (which should be perfectly legal in a shard-based system).
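
A small Python sketch of that "works by coincidence" effect: the same two keys pass a same-node check on one topology and fail it on another, while the same-slot check gives the same answer everywhere. Both topologies and all names below are hypothetical, and hash tags are ignored to keep it short.

```python
import binascii

SLOT_COUNT = 16384

# Two hypothetical topologies: (first_slot, last_slot, node_name).
TWO_NODES = [(0, 8191, "node-1"), (8192, 16383, "node-2")]
THREE_NODES = [(0, 5460, "node-a"), (5461, 10922, "node-b"),
               (10923, 16383, "node-c")]


def key_slot(key: str) -> int:
    # CRC16 (XMODEM) mod 16384; hash tags are not handled in this sketch.
    return binascii.crc_hqx(key.encode(), 0) % SLOT_COUNT


def owner(slot: int, ranges) -> str:
    for first, last, node in ranges:
        if first <= slot <= last:
            return node
    raise ValueError(f"slot {slot} is unassigned")


def same_slot(keys) -> bool:
    """The check Redis Cluster enforces: independent of topology."""
    return len({key_slot(k) for k in keys}) == 1


def same_node(keys, ranges) -> bool:
    """The relaxed check: the answer depends on where slots currently live."""
    return len({owner(key_slot(k), ranges) for k in keys}) == 1


KEYS = ["b", "c"]  # slots 3300 and 7365
```

Here `same_node(KEYS, TWO_NODES)` is true but `same_node(KEYS, THREE_NODES)` is false, so a command that relied on the relaxed check would break after resharding. `same_slot(KEYS)` is false on both.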

ma...@andyh.io

Aug 15, 2017, 9:26:13 PM
to redi...@googlegroups.com
Hash slots are designed for sharding. 

One of the biggest challenges in sharding is moving the sharded content (Redis keys) when adding or removing nodes in the cluster, since many simple sharding schemes use `key mod len(nodes in cluster)`, which remaps most keys whenever the node count changes. So people invented algorithms like consistent hashing to alleviate the problem.

Redis Cluster, however, chose another abstraction, slots, to solve the problem. All keys are put into slots using a straightforward hash function, `crc16(key) mod 16384`. Each physical node is allocated any number of slots. When you need to move keys, you only need to move slots between physical nodes. You will appreciate its simplicity when you are actually doing the operation.

However, this design pushes complexity to the client, because for every command the client needs to know which slot the key is in and then send the command to the physical node that owns that slot. For client implementations, you probably want to use multiple threads to do this, writing to the sockets at the same time and waiting for the responses. The overhead of doing this naturally slows you down, especially for large pipeline operations. I think the Python client shown earlier in this thread has a similar implementation.

In conclusion, slots are a protocol/contract that the client and server sides have to follow to make sharding easier.
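
To illustrate the difference, here is a rough Python sketch that counts how many of 1000 sample keys change owner when a fourth node is added, first with `crc16(key) mod N` and then with a slot table where only a quarter of the slots are handed to the new node. The key names, node names and the way slots are reassigned are all assumptions for the sake of the example.

```python
import binascii

SLOT_COUNT = 16384
KEYS = [f"key:{i}" for i in range(1000)]  # made-up sample keys


def crc16(key: str) -> int:
    return binascii.crc_hqx(key.encode(), 0)


def moved_with_modulo(keys, old_n: int, new_n: int) -> int:
    """Keys whose node index changes under plain hash-mod-N sharding."""
    return sum(1 for k in keys if crc16(k) % old_n != crc16(k) % new_n)


def even_slot_table(nodes):
    """Assign contiguous, roughly equal slot ranges to the given nodes."""
    return [nodes[slot * len(nodes) // SLOT_COUNT] for slot in range(SLOT_COUNT)]


def moved_with_slots(keys, before, after) -> int:
    """Keys whose slot has a different owner after resharding."""
    return sum(1 for k in keys
               if before[crc16(k) % SLOT_COUNT] != after[crc16(k) % SLOT_COUNT])


before = even_slot_table(["node-a", "node-b", "node-c"])
after = list(before)
for slot in range(0, SLOT_COUNT, 4):  # hand every 4th slot to the new node
    after[slot] = "node-d"
```

Comparing `moved_with_modulo(KEYS, 3, 4)` with `moved_with_slots(KEYS, before, after)` should show far fewer keys moving in the slot-based scheme. The keys in the reassigned slots still have to be migrated, but the key-to-slot mapping itself never changes.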

Sidd S

Aug 17, 2017, 10:05:03 AM
to Redis DB
Thank you for the responses!

So I can appreciate that the specific change I made to the code might introduce some more unpredictability. I know that for my use case, there is no elegant way of using hash tags to solve this problem, because the client would need to keep track of which hash tags were used with each key. Hash tags also don't seem effective if I want to reassign slots from one node to another (because it seems like keys would be clustered around a few hash slots). My solution (at least for now) will be to modify the server-side code to take a multikey command and split it into N separate multikey commands. I know that I risk losing atomicity by doing this, in case the command fails on one node but succeeds on another, but it should be sufficient for my use case.