Query Broadcast when query contains a subset of shard key
Avinash Vyas
Mar 29, 2017, 9:08:54 PM
to mongodb-user
Hi
I am trying to understand the behavior of the query router in a sharded environment with respect to the shard key. I have a collection with 1 million documents of the following form:
{
  _id: <string>,
  y: 1,
  data: <string>
}
where the value of _id is unique in every document but is chosen by me.
In my sharded cluster, I have used _id as my shard key. My understanding is that _id is always unique, but sh.status() shows the following:
test.mycollection
    shard key: { "_id" : 1 }
    unique: false
    balancing: true
    chunks:
        wshard1  178
        wshard2  178
        wshard3  176
        wshard4  177
I have some questions:
a) Why does it show unique: false when we know that the _id index is unique by default?
b) Does it matter whether unique is false or true? Shouldn't all the documents with a particular value for the shard key be stored on the same shard?
c) When I issue a query (or a conditional update with that query) such as { _id: "string1", y: 100 }, the query is broadcast to all the shards. Given that my shard key is _id (which is unique), a document with a specific _id can reside on only one shard, so why can't the query router identify the correct shard and send it as a targeted operation?
d) More generally, since all the documents with the shard key value x: "string1" are stored on one shard, shouldn't any query such as { x: "string1", y: 10 } be a targeted operation rather than a broadcast operation?
Thanks
Avinash
Avinash Vyas
Mar 30, 2017, 11:08:14 AM
to mongodb-user
The situation I described in my previous message is not completely correct. The behavior we see in our application cannot be explained by that example. I will update the example once we have nailed down the query that is being broadcast.
Asya Kamsky
Mar 30, 2017, 4:34:22 PM
to mongodb-user
If you look at the indexes on the collection in question via
db.mycollection.getIndexes() (in the test database), you will notice
that the _id index never shows that it is unique.
Nevertheless, it is always unique.
Your second message indicates what I suspected when reading your
description: something else must be going on. Queries and updates
that specify equality on the shard key will always be targeted to a
single shard only (the one where all documents with that particular
value of the shard key live).
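
To illustrate the targeting rule, here is a toy JavaScript model of range-based routing. It is only a sketch, not how mongos is actually implemented: the chunk boundaries, the one-chunk-per-shard layout, and the shardForKey and routeQuery helpers are all made up for illustration, and only string equality on the shard key is treated as targetable.

```javascript
// Toy model of range-based chunk routing. NOT the real mongos logic.
// Assumption: one chunk per shard with made-up boundaries on _id.
// Each chunk covers [min, max); null means unbounded on that side.
const chunks = [
  { min: null, max: "g",  shard: "wshard1" },
  { min: "g",  max: "n",  shard: "wshard2" },
  { min: "n",  max: "t",  shard: "wshard3" },
  { min: "t",  max: null, shard: "wshard4" },
];

// Hypothetical helper: find the shard owning the chunk that contains `value`.
function shardForKey(value) {
  const chunk = chunks.find(
    (c) => (c.min === null || value >= c.min) && (c.max === null || value < c.max)
  );
  return chunk.shard;
}

// Hypothetical helper: return the list of shards a query would be sent to.
// Only plain string equality on the shard key field is targeted here;
// anything else (missing field, operator expressions) is broadcast.
function routeQuery(query, shardKeyField = "_id") {
  const value = query[shardKeyField];
  if (typeof value === "string") {
    return [shardForKey(value)]; // targeted: exactly one shard
  }
  return [...new Set(chunks.map((c) => c.shard))]; // broadcast: every shard
}

console.log(routeQuery({ _id: "string1", y: 100 })); // [ 'wshard3' ]
console.log(routeQuery({ y: 100 })); // all four shards
```

In this model the extra y predicate does not affect routing: equality on _id alone decides the target shard, and a query without the shard key goes to every shard.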