In my mongodb collection i have 15 millions documents with following json structure. The embedded document count inside playfields array field changes for each json document. All my queries involves filtering the documents based on data in playfields array field. All queries are taking more than 2 minutes to execute. below attached sample queries I am running.
Please suggest if i am doing anything wrong in writing queries or missing any indexes or I have to move data from embed documents in single document to multiple collections?
Index created:
db.collection.ensureIndex({ "playfields.ID": 1, "playfields.value": 1 })
Queries running slow:
db.playfieldvalues.find({$and:[
{playfields: {$elemMatch:{ID:"Play.NHL.NHLHomeTeam" ,value: "BOS BOSTON BRUINS"}}},
{$or:[ {playfields: {$elemMatch:{ID:"Play.NHL.NHLAwayTeam" ,value: "NYI NEW YORK ISLANDERS"}}},{playfields: {$elemMatch:{ID:"Play.NHL.NHLAwayTeam" ,value: "T.B TAMPA BAY LIGHTNING"}}}]},
{playfields: {$elemMatch:{ID:"Play.NHL.NHLEventY" ,value: -38}}},
{playfields: {$elemMatch:{ID:"Play.NHL.NHLEventX" ,value: {$gt: 0}}}},
{playfields: {$elemMatch:{ID:"Play.NHL.NHLEventScoreDifferential" ,value: {$gt: 0}}}}
]
})
db.playfieldvalues.find({playfields: {$elemMatch:{ID:"Play.NHL.NHLHomeTeam" ,value: "BOS BOSTON BRUINS"}}})
Hi
I believe you cross-posted this question in StackOverflow as well: https://stackoverflow.com/questions/46736673/query-performance-while-filtering-embedding-large-arrays-documents-mongodb and there were some suggestions from the community regarding your schema design.
Having said that, I agree with the StackOverflow thread that your schema design is sub-optimal for your query. Without much information, generally it’s best to either:
Which approach is best would depend on your use case. Since you only posted one example document, it’s difficult to see any pattern in your data.
To optimize your queries and schema design, I would suggest you to peruse these links:
Best regards
Kevin