Query Broadcast when query contains a subset of shard key


Avinash Vyas

Mar 29, 2017, 9:08:54 PM
to mongodb-user
Hi,
    I am trying to understand how the query router in a sharded environment behaves with respect to the shard key. I have a collection with 1 million documents of the following form:
   {
     _id: <string>,
     y: 1,
     data: <string>
   }

    where the value of _id is unique in every document but is chosen by me.

    In my sharded cluster, I have used _id as my shard key. My understanding is that _id is always unique, but sh.status() shows the following:

               test.mycollection
                        shard key: { "_id" : 1 }
                        unique: false
                        balancing: true
                        chunks:
                                wshard1 178
                                wshard2 178
                                wshard3 176
                                wshard4 177

    I have some questions:

    a) Why is unique: false shown, when we know that the _id index is unique by default?
    b) Does it matter whether unique is false or true? Shouldn't all the documents with a particular value for the shard key be stored on the same shard?
    c) When I issue a query (or a conditional update with a query) { _id: "string1", y: 100 }, the query (or the query from the conditional update) is broadcast to all the shards. Given that my shard key is _id (which is unique), a document with a specific _id can reside on only one shard, so why can't the query router identify the correct shard and send it as a targeted operation?
    d) More generally, when all the documents with shard key value x: "string1" are stored on one shard, any query such as { x: "string1", y: 10 } should be a targeted operation rather than a broadcast operation.

Thanks
Avinash


Avinash Vyas

Mar 30, 2017, 11:08:14 AM
to mongodb-user
The situation I explained in my previous message is not completely correct. The behavior we see in our application cannot be explained with this example. I will update the example once we have nailed down the query that is being broadcast.
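
One way to narrow down which query is being broadcast is to run it with explain() through mongos and look at which shards appear in the plan. The helper below is only a sketch: the names shardsContacted, isTargeted, sampleTargeted and sampleBroadcast are hypothetical, the two sample objects are hand-written and heavily abbreviated stand-ins for real explain output, and the field layout (queryPlanner.winningPlan.stage and a shards array with shardName) is assumed from 3.x-era servers and should be checked against what your own cluster returns.

```javascript
// Sketch: summarize which shards an explain() result mentions.
// Assumes the mongos explain layout queryPlanner.winningPlan.{stage, shards[]}.
function shardsContacted(explainOutput) {
  var planner = explainOutput.queryPlanner || {};
  var plan = planner.winningPlan || {};
  var shards = plan.shards || [];
  var names = [];
  for (var i = 0; i < shards.length; i++) {
    names.push(shards[i].shardName);
  }
  return { stage: plan.stage, shardNames: names };
}

// A query counts as targeted here if exactly one shard is listed.
function isTargeted(explainOutput) {
  return shardsContacted(explainOutput).shardNames.length === 1;
}

// Hand-written, abbreviated stand-ins for explain output (not real logs).
var sampleTargeted = {
  queryPlanner: {
    winningPlan: {
      stage: "SINGLE_SHARD",
      shards: [{ shardName: "wshard3" }]
    }
  }
};

var sampleBroadcast = {
  queryPlanner: {
    winningPlan: {
      stage: "SHARD_MERGE",
      shards: [
        { shardName: "wshard1" },
        { shardName: "wshard2" },
        { shardName: "wshard3" },
        { shardName: "wshard4" }
      ]
    }
  }
};
```

In the mongo shell, connected to mongos, the same helper could be applied to a live result, for example shardsContacted(db.mycollection.find({ _id: "string1", y: 100 }).explain()).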

Asya Kamsky

Mar 30, 2017, 4:34:22 PM
to mongodb-user
If you look at the indexes on the collection in question via
db.mycollection.getIndexes() (in the test database), you will notice
that the _id index never shows that it is unique.

Nevertheless it is always unique.

Your second message indicates what I suspected when reading your
description. Something else must be going on: queries and updates
which specify equality on the shard key will always be targeted to a
single shard only (the one on which all documents with this particular
value of the shard key live).
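
The targeting rule described above can be illustrated with a toy model. This is not how mongos is implemented; it only shows the idea that each chunk covers a range of shard key values and lives on one shard, so an equality match on the shard key resolves to a single chunk. The chunk boundaries ("g", "n", "t"), the four-chunk layout, and the function names owns and targetShards are invented for illustration; only the shard names come from the sh.status() output in the thread.

```javascript
// Toy model of range-based routing; NOT the actual mongos implementation.
// Chunk boundaries below are invented; null stands in for MinKey / MaxKey.
var chunks = [
  { min: null, max: "g",  shard: "wshard1" },
  { min: "g",  max: "n",  shard: "wshard2" },
  { min: "n",  max: "t",  shard: "wshard3" },
  { min: "t",  max: null, shard: "wshard4" }
];

// A chunk owns a key when min <= key < max (plain string comparison here).
function owns(chunk, key) {
  var aboveMin = chunk.min === null || key >= chunk.min;
  var belowMax = chunk.max === null || key < chunk.max;
  return aboveMin && belowMax;
}

// Return the distinct shards a query must visit in this toy model.
function targetShards(query, shardKeyField, chunkTable) {
  var value = query[shardKeyField];
  var hasEquality = typeof value === "string";
  var result = [];
  for (var i = 0; i < chunkTable.length; i++) {
    var c = chunkTable[i];
    // Without equality on the shard key, every chunk is a candidate.
    if ((!hasEquality || owns(c, value)) && result.indexOf(c.shard) === -1) {
      result.push(c.shard);
    }
  }
  return result;
}
```

With these made-up boundaries, targetShards({ _id: "string1", y: 100 }, "_id", chunks) yields a single shard, while targetShards({ y: 100 }, "_id", chunks) yields all four. The extra y condition does not widen the target set; it is simply applied on the shard that owns the key.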

Asya



--
Asya Kamsky
Lead Product Manager
MongoDB