MongoDB Schema: Nested, Flattened, or Independent Collections?


Pranab Agarwal

Jun 18, 2015, 3:48:14 AM
to mongod...@googlegroups.com
We are writing an application that has multiple 'Projects'. Each 'Project' has multiple 'Boards', and each 'Board' has its own set of 'Comments'. What is the recommended way to structure this in MongoDB?

= Option I (nested collection)
  -Project
    |
    |----- Board
             |
             |----- Comments


= Option II (flattened collection)
  -Project
    |
    |----- Board
    |
    |----- Comment
              |-----Board_ID


= Option III (independent collections)
  -Project
    
  - Boards
      |-----Project_ID
    
  - Comments
      |-----Board_ID   
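
For readers who prefer concrete shapes, the three options can be sketched as plain Python dicts and lists standing in for MongoDB documents and collections. This is only an illustration: every field name (`name`, `title`, `text`, `board_id`, `project_id`) and every ID value is a hypothetical placeholder, not something taken from the original application.

```python
# Option I (nested): one document per project; boards are embedded in the
# project and comments are embedded in each board.
# All field names and IDs below are hypothetical.
project_nested = {
    "_id": "p1",
    "name": "Project 1",
    "boards": [
        {
            "_id": "b1",
            "title": "Board 1",
            "comments": [{"_id": "c1", "text": "First comment"}],
        },
    ],
}

# Option II (flattened): boards and comments are sibling arrays inside the
# project document; each comment points back to its board via board_id.
project_flattened = {
    "_id": "p1",
    "name": "Project 1",
    "boards": [{"_id": "b1", "title": "Board 1"}],
    "comments": [{"_id": "c1", "board_id": "b1", "text": "First comment"}],
}

# Option III (independent): three separate collections linked by ID fields.
projects = [{"_id": "p1", "name": "Project 1"}]
boards = [{"_id": "b1", "project_id": "p1", "title": "Board 1"}]
comments = [{"_id": "c1", "board_id": "b1", "text": "First comment"}]
```

In Options I and II the whole project travels as one document; in Option III each level is stored and fetched on its own.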


There are 10,000 projects. Each project has 5 boards, so there are 50,000 boards in total. Each board has 20 comments, so there are 1,000,000 comments in total. Only one project and one board can be open in the application at a time.

So, if we pick Option I, then to get the 'Comments' for a particular project/board combination, we will have to query/parse through only 20 comments. If we pick Option III, however, we will have to query/parse through 1,000,000 comments to get the 'Comments' for a given project/board combination. So, in theory, Option I sounds faster and more efficient. However, Option I uses a nested collection. Are there any disadvantages to a nested collection? Are there any reasons not to use nested collections in MongoDB, as in Option I?

Which option (I, II, III, or some other) is the recommended practice for such cases in MongoDB?

Veerapaneni Ayyappa

Jun 18, 2015, 9:54:06 AM
to mongod...@googlegroups.com
It all depends on your data access/growth patterns and read/write requirements.
Based on your description, option 1 seems efficient: you could have a collection for each project, with the 5 'Boards' as documents, and each board document holding nested documents or an array for all of its comments.

AFAIK, there is no such thing as a nested collection; I presume you meant nested documents.
In general, embedded documents are much faster to read, because you get the related information back together in one query and avoid joins.
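
To make the "avoid joins" point concrete, here is a minimal in-memory sketch, not real driver code. Plain Python lists stand in for collections, and `find_one` and `find_all` are hypothetical helpers written for this example, as are all field names. It shows that the embedded layout needs one lookup, while the referenced layout needs a second lookup to collect the comments.

```python
def find_one(collection, **criteria):
    """Return the first document matching every criterion, or None."""
    for doc in collection:
        if all(doc.get(key) == value for key, value in criteria.items()):
            return doc
    return None


def find_all(collection, **criteria):
    """Return every document matching every criterion."""
    return [
        doc for doc in collection
        if all(doc.get(key) == value for key, value in criteria.items())
    ]


# Embedded layout: each board document carries its own comments.
boards_embedded = [
    {"_id": "b1", "project_id": "p1",
     "comments": [{"text": "hello"}, {"text": "world"}]},
]

# Referenced layout: comments live in their own collection.
boards_ref = [{"_id": "b1", "project_id": "p1"}]
comments_ref = [
    {"_id": "c1", "board_id": "b1", "text": "hello"},
    {"_id": "c2", "board_id": "b1", "text": "world"},
    {"_id": "c3", "board_id": "b2", "text": "other"},
]


def comments_embedded(board_id):
    # One lookup: the comments arrive with the board.
    board = find_one(boards_embedded, _id=board_id)
    return [c["text"] for c in board["comments"]] if board else []


def comments_referenced(board_id):
    # Two lookups: the board first, then its comments (an application-side join).
    board = find_one(boards_ref, _id=board_id)
    if board is None:
        return []
    return [c["text"] for c in find_all(comments_ref, board_id=board["_id"])]
```

Both functions return the same comment texts for the same board; the difference is only in how many lookups it takes to get there.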

Allan Bazinet

Jun 18, 2015, 11:40:08 AM
to mongod...@googlegroups.com
A single document can't exceed 16MB in size.  If you anticipate that occurring, then options 1 and 2 are not for you.  Option 3 is safe in any case, and is likely to be more flexible in terms of indexing, search, and delete.
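
A rough way to check whether an embedded design could approach that limit is to build a sample document and measure it. The sketch below is only an estimate under stated assumptions: compact JSON length is used as a stand-in for the real BSON size (the two differ), the 500-character comment length is made up, and `build_project` and `estimated_size` are hypothetical helpers.

```python
import json

MAX_DOC_BYTES = 16 * 1024 * 1024  # the 16MB per-document limit


def estimated_size(doc):
    """Rough size in bytes, using compact JSON as a stand-in for BSON."""
    return len(json.dumps(doc, separators=(",", ":")).encode("utf-8"))


def build_project(boards_per_project, comments_per_board, comment_chars):
    """Build an Option I style document with placeholder comment text."""
    return {
        "_id": "p1",
        "boards": [
            {
                "_id": f"b{b}",
                "comments": [
                    {"_id": f"c{b}-{c}", "text": "x" * comment_chars}
                    for c in range(comments_per_board)
                ],
            }
            for b in range(boards_per_project)
        ],
    }


# The numbers from the question: 5 boards, 20 comments each.
typical = build_project(5, 20, 500)
# A hypothetical project whose boards grow to 10,000 comments each.
heavy = build_project(5, 10_000, 500)

typical_fits = estimated_size(typical) < MAX_DOC_BYTES
heavy_fits = estimated_size(heavy) < MAX_DOC_BYTES
```

With the figures from the question the estimate stays far below the limit; it is only when the comment count per board grows by orders of magnitude that the embedded document stops fitting.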

Bottom line, in situations where the number of elements is unbounded, I always recommend something like option 3.  If instead things are predictable and finite, then one of options 1 or 2 may prove simpler, and is likely to be marginally faster in terms of locality of reference, where 'marginal' in this case means 'unlikely to be a concern in practice, either way'.
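
On the indexing point for option 3: a lookup by board does not have to walk every comment if the linking field is indexed. The following is a toy in-memory sketch, assuming a hypothetical `board_id` field and a smaller data set than in the question; a Python dict stands in for a database index purely for illustration, and `scan` and `indexed` are made-up helper names.

```python
from collections import defaultdict

# 1,000 boards with 20 comments each (smaller than the question's data set).
comments = [
    {"_id": f"c{b}-{c}", "board_id": f"b{b}", "text": f"comment {c}"}
    for b in range(1000)
    for c in range(20)
]


def scan(board_id):
    """Unindexed lookup: examine every comment in the collection."""
    examined = 0
    hits = []
    for doc in comments:
        examined += 1
        if doc["board_id"] == board_id:
            hits.append(doc)
    return hits, examined


# Build the stand-in index once: board_id -> list of that board's comments.
index = defaultdict(list)
for doc in comments:
    index[doc["board_id"]].append(doc)


def indexed(board_id):
    """Indexed lookup: jump straight to the matching comments."""
    hits = index.get(board_id, [])
    return hits, len(hits)
```

Both lookups return the same 20 comments for a given board; the scan examines all 20,000 documents to find them, while the indexed version touches only the 20 matches.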