How can Reddit.com's comment system be modeled in mongodb?

DoomPirate

May 4, 2011, 2:28:21 PM
to mongodb-user
My website's future comment system will be comparable to Reddit's
comment system.
However, I am unsure how exactly I should model this in MongoDB.

Reddit comments have these key features:
1. Comments can be nested inside one another.
2. Comments can be upvoted, and the best comments float to the top.

Here is a link to reddit demonstrating the comment system.
http://www.reddit.com/r/politics/comments/h4068/maybe_shooting_an_unarmed_osama_was_okay_maybe_it/


If I wanted to create a near-exact copy of reddit's comment system (and
wanted to be able to scale), how should I model the data in MongoDB?

Should I make each comment someone posts a new document? (Will query
speeds be much lower than if all the comments were inside a single
document?)
Should I put all comments inside a single document?
How can I nest them, and how can I allow for upvoting?

Gates

May 4, 2011, 6:41:02 PM
to mongodb-user
The core question here is really about schema design and Embed vs.
Reference.

There are some docs regarding this, but there are clearly trade-offs
to make here:
http://www.mongodb.org/display/DOCS/Schema+Design#SchemaDesign-Embedvs.Reference

Please take a look at the General Rules section.

> Should I make each comment someone posts a new document?
> Should I make all comments inside a single document?

There is no single correct way to do this. You could go either way.

> Will query speeds be much lower than if all the comments were inside a single document?

It depends. How many comments do you have? How many are displayed?
How fast is "fast"? I just hit the "show 500" button on the reddit
page. How long does it take to load, and how long *should* it take to
load?

> How can I nest them?

JSON natively supports arrays. Nesting comments generally means having
an array of objects:

{
  comments: [
    { user: 'gates', text: 'first post',
      replies: [
        { user: 'jim', text: 'I was too slow' }
      ]
    },
    { user: 'jim', text: 'second post' }
  ]
}

The upside here is the ease of querying the entire document. One query
gets you everything you need to display.
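
As a rough illustration of that upside, here is a minimal Python sketch that walks a document shaped like the example above and produces indented display rows. The `thread` dict and the `flatten` helper are hypothetical names used only for this sketch, and the document is a plain in-memory dict rather than a real query result.

```python
# Hypothetical thread document, shaped like the nested example above.
thread = {
    "comments": [
        {
            "user": "gates",
            "text": "first post",
            "replies": [{"user": "jim", "text": "I was too slow"}],
        },
        {"user": "jim", "text": "second post"},
    ]
}


def flatten(comments, depth=0):
    """Yield (depth, user, text) for each comment, parents before replies."""
    for comment in comments:
        yield depth, comment["user"], comment["text"]
        # "replies" is optional; a leaf comment simply has none.
        yield from flatten(comment.get("replies", []), depth + 1)


rows = list(flatten(thread["comments"]))
for depth, user, text in rows:
    print("    " * depth + f"{user}: {text}")
```

Once the single document is loaded, no further lookups are needed to display the whole thread.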

The main caveat here is that modifying arrays of objects of arrays of
objects can be a little unwieldy. Also, if you have reddit's volume, you
may exceed the 16MB limit on document sizes.
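
To give a feel for why nested updates get unwieldy, here is a small Python sketch, assuming each comment is addressed by its array positions. `replies_path`, `push_update` and `add_reply` are hypothetical helpers for this sketch only. The dotted path is the kind of field name a `$push` update would need, and it grows with every level of nesting.

```python
def replies_path(indexes):
    """Build a dotted field path to the replies array of a nested comment.

    indexes lists the array positions leading to the comment:
    [0] is the first top-level comment, [0, 2] is its third reply.
    """
    parts = ["comments"]
    for i in indexes:
        parts += [str(i), "replies"]
    return ".".join(parts)


def push_update(indexes, reply):
    """Return an update document that appends reply under the given comment."""
    return {"$push": {replies_path(indexes): reply}}


def add_reply(thread, indexes, reply):
    """Apply the same change to a plain in-memory dict (no database involved)."""
    comments = thread["comments"]
    for i in indexes:
        comments = comments[i].setdefault("replies", [])
    comments.append(reply)


thread = {"comments": [{"user": "gates", "text": "first post"}]}
add_reply(thread, [0], {"user": "jim", "text": "I was too slow"})
update = push_update([0, 2], {"user": "ann", "text": "deeper reply"})
```

Note that the positions are only valid as long as nobody inserts, removes or reorders comments in between, which is part of what makes this approach awkward.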

Other methods may involve more queries, but may make voting much
easier.
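
One such method, sketched here in Python under my own assumptions, is to store each comment as its own document with a reference to its parent. The field names `parent_id` and `score` and the helpers `upvote` and `build_tree` are hypothetical, and the list of dicts stands in for a collection. Voting then touches one small document (in MongoDB this would likely be an `$inc` on the score field), and the tree is assembled in application code after fetching all comments for a thread.

```python
from collections import defaultdict

# Hypothetical flat collection: one document per comment.
comments = [
    {"_id": 1, "parent_id": None, "user": "gates", "text": "first post", "score": 3},
    {"_id": 2, "parent_id": 1, "user": "jim", "text": "I was too slow", "score": 1},
    {"_id": 3, "parent_id": None, "user": "jim", "text": "second post", "score": 5},
]


def upvote(comments, comment_id):
    """Increment one comment's score and return the new value."""
    for comment in comments:
        if comment["_id"] == comment_id:
            comment["score"] += 1
            return comment["score"]
    raise KeyError(comment_id)


def build_tree(comments):
    """Group comments by parent and nest them, highest score first."""
    children = defaultdict(list)
    for comment in comments:
        children[comment["parent_id"]].append(comment)

    def attach(parent_id):
        ranked = sorted(
            children.get(parent_id, []), key=lambda c: c["score"], reverse=True
        )
        # Copy each comment and add its own ranked replies.
        return [{**c, "replies": attach(c["_id"])} for c in ranked]

    return attach(None)


tree = build_tree(comments)
```

The trade-off is the one described above: the thread no longer arrives as one ready-made document, but a vote no longer rewrites a large nested structure.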

Again, the question you're asking involves quite a few trade-offs.
You'll need to model your important queries and weigh the trade-offs
from the various models.

- Gates
