Lloyd,
You will find this video interesting:
http://www.10gen.com/presentations/mongosv-2011/schema-design-at-scale
Essentially, store one day's worth of tweets for one person in a single
document. The reasoning:
- Queries typically filter by user and by day
Therefore, you can use the following index:
{user_id: 1, date: 1}  # date needs to be last because you will range and sort on it
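
A minimal sketch of that bucketing idea in Python, to make the document shape concrete. The field names `tweets` and `count`, the helper names, and the choice of UTC midnight as the bucket date are illustrative assumptions, not something taken from the video. The functions only build the filter and update documents; with PyMongo they could be passed to `update_one(filter, update, upsert=True)` and `find(filter)`, and the index spec to `create_index`.

```python
from datetime import datetime, timezone

# Compound index matching the one above: equality on user_id, then range/sort on date.
INDEX_SPEC = [("user_id", 1), ("date", 1)]


def bucket_key(user_id, created_at):
    """Identify the per-user, per-day document a tweet belongs to.

    Assumes created_at is a timezone-aware datetime; the bucket date is
    that instant's UTC day, truncated to midnight.
    """
    day = created_at.astimezone(timezone.utc).replace(
        hour=0, minute=0, second=0, microsecond=0
    )
    return {"user_id": user_id, "date": day}


def append_tweet_update(user_id, tweet):
    """Build (filter, update) for an upsert that appends one tweet to its bucket.

    "tweets" and "count" are hypothetical field names chosen for this sketch.
    """
    filt = bucket_key(user_id, tweet["created_at"])
    update = {"$push": {"tweets": tweet}, "$inc": {"count": 1}}
    return filt, update


def day_range_query(user_id, start, end):
    """Filter for one user's buckets with start <= date < end."""
    return {"user_id": user_id, "date": {"$gte": start, "$lt": end}}
```

Because every query fixes `user_id` and then ranges over `date`, the filter produced by `day_range_query` lines up with the index field order.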
Have fun!
Chris
MongoHQ
On Feb 17, 11:56 pm, Lloyd Cledwyn <cled...@gmail.com> wrote:
> Do I need to worry about update/insert performance if, say, I add a
> "somegroupID" element to all the tweets associated with a user, thus
> updating thousands of documents (and, of course, repeating that for each user in
> the "somegroupID")? I can appreciate that once that element is present
> in all those documents, doing a mapReduce / analysis across them
> is straightforward.
>
> On Fri, Feb 17, 2012 at 11:52 PM, Nat <nat.lu...@gmail.com> wrote:
> > Like many other NoSQL databases, MongoDB doesn't offer join operations.
> > Denormalizing can give better performance than repeatedly fetching from
> > another collection to simulate joins, especially when you only do it as a
> > one-off for analytical purposes.
> > ------------------------------
> > *From:* Lloyd Cledwyn <cled...@gmail.com>
> > *Sender:* mongod...@googlegroups.com
> > *Date:* Fri, 17 Feb 2012 23:48:59 -0600
> > *To:* <mongod...@googlegroups.com>
> > *ReplyTo:* mongod...@googlegroups.com
> > *Subject:* Re: [mongodb-user] How far to push the document nesting?
>
> > Interesting. Of course. So counterintuitive from a "normalized"
> > mindset. I may end up adding a [duplicated] data element across thousands of
> > documents, but if that helps performance where I'm hoping for it, it could just work.
>
> > On Fri, Feb 17, 2012 at 5:37 AM, Nat <nat.lu...@gmail.com> wrote:
>
> >> I would not store tweet data inside the user document. It's better to keep
> >> them separate. If you need to run analytics based on user profile fields such
> >> as age, sex, etc., you might store those fields together with the tweet data;
> >> that will make it easier to run map/reduce or aggregation on it.
> >> ------------------------------
> >> *From:* Lloyd Cledwyn <cled...@gmail.com>
> >> *Sender:* mongod...@googlegroups.com
> >> *Date:* Thu, 16 Feb 2012 23:11:06 -0800 (PST)
> >> *To:* <mongod...@googlegroups.com>
> >> *ReplyTo:* mongod...@googlegroups.com
> >> *Subject:* [mongodb-user] How far to push the document nesting?
> >> to run analysis on tweets belonging to users with similar *somegroupID*.