Prefer more objects or larger objects?
From: Daniel Harman <daniel.a.har...@gmail.com>
Date: Fri, 10 Aug 2012 10:57:05 -0700 (PDT)
Subject: Prefer more objects or larger objects?

Hi,

I'm writing a chat application and I'm considering having an object per day
with all the messages for that day embedded in a list. Alternatively, I could
just have the messages in a collection of their own.

e.g.
chatDay = { messages : [ { Id : ..., Msg : "Hello" }, { Id : ..., Msg : "Oh hi" } ] }

vs

{ Id : ..., Msg : "Hello" }
{ Id : ..., Msg : "Oh hi" }

Obviously the former means far fewer individual objects being pulled back
and forth, so I think it might be more efficient for reading and more
convenient for paging. Are there any performance considerations here?
Would things start to degrade if a 'day' container grew large due to a
huge quantity of messages? (N.B. obviously it's going to go horribly
wrong if I hit the object size limit!)
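
For readers comparing the two shapes, here is a minimal in-memory sketch in plain JavaScript. It does not talk to MongoDB; the field names (chatName, ts, id, msg) and both helper functions are hypothetical and only illustrate how the same messages look as one bucket per chat per day versus one document per message.

```javascript
// Hypothetical field names (chatName, ts, id, msg); not taken from the post.

// Shape 1: one document per chat per day, messages embedded in an array.
function toDayBuckets(messages) {
  const buckets = new Map();
  for (const m of messages) {
    const day = m.ts.toISOString().slice(0, 10); // UTC calendar day, "YYYY-MM-DD"
    const key = `${m.chatName}|${day}`;
    if (!buckets.has(key)) {
      buckets.set(key, { chatName: m.chatName, day, messages: [] });
    }
    buckets.get(key).messages.push({ id: m.id, msg: m.msg });
  }
  return [...buckets.values()];
}

// Shape 2: one document per message; a day is read back with a filter.
function messagesForDay(messages, chatName, day) {
  return messages.filter(
    (m) => m.chatName === chatName && m.ts.toISOString().slice(0, 10) === day
  );
}

const sampleMessages = [
  { id: 1, chatName: "general", ts: new Date("2012-08-10T17:57:05Z"), msg: "Hello" },
  { id: 2, chatName: "general", ts: new Date("2012-08-10T18:55:06Z"), msg: "Oh hi" },
  { id: 3, chatName: "general", ts: new Date("2012-08-11T16:52:19Z"), msg: "Morning" },
];

const dayBuckets = toDayBuckets(sampleMessages);
console.log(dayBuckets.length); // number of (chat, day) buckets
console.log(messagesForDay(sampleMessages, "general", "2012-08-10").length);
```

The bucket shape returns a whole day in one object, while the per-message shape needs a filter (and, in a real database, an index) to do the same.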

Thanks,

Dan


 
From: Octavian Covalschi <octavian.covals...@gmail.com>
Date: Fri, 10 Aug 2012 13:55:06 -0500
Subject: Re: [mongodb-user] Prefer more objects or larger objects?

I think this article may answer your question...

https://openshift.redhat.com/community/blogs/designing-mongodb-schema...



 
From: Daniel Harman <daniel.a.har...@gmail.com>
Date: Fri, 10 Aug 2012 15:53:34 -0700 (PDT)
Subject: Re: [mongodb-user] Prefer more objects or larger objects?

Thanks. It seems to confirm my intuition, which is great as I just spent an
hour implementing the bucket approach :)


 
From: Daniel Harman <daniel.a.har...@gmail.com>
Date: Fri, 10 Aug 2012 16:18:29 -0700 (PDT)
Subject: Re: [mongodb-user] Prefer more objects or larger objects?

Having said that, the article doesn't really talk about the performance
difference between modifying an existing object (with a push, for example)
and inserting a new object into a collection. Are there any general
principles to consider here?


 
From: Rob Moore <robert.allanb...@gmail.com>
Date: Sat, 11 Aug 2012 09:52:19 -0700 (PDT)
Subject: Re: [mongodb-user] Prefer more objects or larger objects?

On Friday, August 10, 2012 7:18:29 PM UTC-4, Daniel Harman wrote:

> Although having said that it doesn't really talk about performance
> difference between modifying an existing object with for example a push vs
> inserting a new object into a collection. Are there any general principles
> to consider here?

An update and an insert will probably be very comparable _if_ the document
has not grown beyond the size of its currently allocated block.  If the
document does have to be moved then the insert is going to be faster.

The other issue to consider is that with each move/delete the document
leaves a hole.  MongoDB is not very good at managing the holes created if
you are not ensuring the documents are of uniform size.  A secondary effect
is that after a while the collection of holes in the database slows down
all allocations (straight inserts and updates that move) as it scans the
growing free lists.

In 2.2, TTL collections will switch the collection to a "power of 2
allocator".  In theory that fixes the fragmentation problem at the expense
of one, potentially extra, index with a TTL of "forever" and a little wasted
space.
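
To make the trade-off concrete, here is a simplified model in plain JavaScript. It is only a sketch of the idea of rounding allocations up to a power of two, not MongoDB's actual allocator, and the function names (nextPowerOfTwo, countMoves) and byte sizes are hypothetical. It counts how often a growing document outgrows its block under an exact-fit policy versus a power-of-2 policy.

```javascript
// Simplified model only: real record allocation involves more detail
// (padding, extents, free lists) than this sketch captures.

// Smallest power of two that is >= size (size is a positive integer, in bytes).
function nextPowerOfTwo(size) {
  let block = 1;
  while (block < size) block *= 2;
  return block;
}

// Count how often a document outgrows its block while it grows by
// `step` bytes per update, for a given allocation policy.
function countMoves(initialSize, step, updates, allocate) {
  let size = initialSize;
  let block = allocate(size);
  let moves = 0;
  for (let i = 0; i < updates; i++) {
    size += step;
    if (size > block) {
      block = allocate(size); // document no longer fits, so it must be moved
      moves++;
    }
  }
  return moves;
}

// Exact-fit policy: the block is exactly as large as the document.
const exactFit = (size) => size;

console.log(countMoves(100, 50, 20, exactFit));       // moves on every update
console.log(countMoves(100, 50, 20, nextPowerOfTwo)); // moves only when crossing a power of 2
```

In this model the power-of-2 policy trades some unused space inside each block for fewer moves, and the holes left behind come in a small set of sizes that later allocations can reuse.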

For me the question is do you ever plan to delete the documents?  If not
then use a document per message and some smart indexing to group records
for faster access.  The data will be packed into memory/disk as tight as
possible.  You will still get temporal/spatial correlation since MongoDB
will always append all of the messages to the end of the extents allocated.

If you will delete records it's a toss-up based on the primary usage pattern
but you want the TTL collection's power of 2 allocator.

Rob


 
From: MKN Web Solutions <mich...@mknwebsolutions.com>
Date: Sun, 12 Aug 2012 07:16:37 -0700 (PDT)
Subject: Re: [mongodb-user] Prefer more objects or larger objects?

Can anyone from the MongoDB engineering team confirm that this approach is
ideal?  I just want to verify that this scheme is being used and works well.


 
From: Scott Hernandez <scotthernan...@gmail.com>
Date: Sun, 12 Aug 2012 07:40:34 -0700
Subject: Re: [mongodb-user] Prefer more objects or larger objects?

It depends on a lot of factors (update/move rate, deleting, ratio of
inserts to updates, queries and ordering, etc), but it is a good
approach and one that works well for very active short write loads and
mostly reads, like an activity stream or time-based logging.



 
From: Daniel Harman <daniel.a.har...@gmail.com>
Date: Sun, 12 Aug 2012 15:44:40 -0700 (PDT)
Subject: Re: [mongodb-user] Prefer more objects or larger objects?

Hi Rob,

Thanks for the in-depth answer. It suggests I've now implemented the wrong
approach and that it's better to go back to an object per message. So I
guess I could index by a date field (with no time on it) to get the effect
I have now. However, given that the messages are very small (think IRC not
email), I am left wondering if this isn't going to cause a lot of seeking
to load up messages by day? They will be temporally correlated of course,
but in a table with a whole load of different chats going on they will all
be interleaved. Is that something I should be able to ignore?

Alternatively can I force a min block size for a table? I'm not sure it
makes sense in terms of disk space consumption but worth considering.

I don't ever plan to delete documents and it's likely messages will be
cached locally on the web server anyway, so perhaps seek time is not a huge
concern.

Dan


 
From: Rob Moore <robert.allanb...@gmail.com>
Date: Sun, 12 Aug 2012 18:46:29 -0700 (PDT)
Subject: Re: [mongodb-user] Prefer more objects or larger objects?

Echoing Scott's comment about there being a lot of variables but...

If the MongoDB cluster is sized to keep the last N days (hours) in memory
then "seeking" isn't an issue except when going back beyond that horizon.  

You can index on the full timestamp and then simply do a range query.  e.g.:
    { timestamp : { $gte : ISODate("2012-08-12T00:00:00Z"),
                    $lt : ISODate("2012-08-13T00:00:00Z") } }
The B-Tree indexes that MongoDB uses are designed to efficiently answer
this type of query.
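
As a rough illustration of the day-range idea, the sketch below builds a half-open [start, end) filter for one UTC calendar day in plain JavaScript. The field name timestamp follows the post; dayRangeFilter and matches are hypothetical helpers, and matches is only an in-memory stand-in for the comparison the server would do. It uses $gte for the lower bound so a message stamped exactly at midnight is included.

```javascript
// Build a filter covering one UTC calendar day: start inclusive, end exclusive.
// `day` is a "YYYY-MM-DD" string; UTC is an assumption made for this sketch.
function dayRangeFilter(day) {
  const start = new Date(`${day}T00:00:00Z`);
  const end = new Date(start.getTime() + 24 * 60 * 60 * 1000);
  return { timestamp: { $gte: start, $lt: end } };
}

// In-memory stand-in for the server-side range comparison, for illustration only.
function matches(doc, filter) {
  const { $gte, $lt } = filter.timestamp;
  return doc.timestamp >= $gte && doc.timestamp < $lt;
}

const dayFilter = dayRangeFilter("2012-08-12");
console.log(dayFilter.timestamp.$gte.toISOString());
console.log(dayFilter.timestamp.$lt.toISOString());
console.log(matches({ timestamp: new Date("2012-08-12T14:16:37Z") }, dayFilter));
```

A half-open range avoids counting a midnight message in two adjacent days when paging day by day.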

You can also create a compound index on { timestamp : 1, chat_name : 1} and
it should speed up a query using both a range on timestamp and a range or
value for the chat_name.

The only option I know of (other than the bucketed documents) to group the
messages into chats is to use a collection per "chat" but I'd not recommend
that unless you can enumerate the chats beforehand.  I have heard of issues
with scaling the collection count into the thousands but I prefer to just
not go there.

The only mechanism I know of for controlling the allocation of blocks in
MongoDB is the TTL Collections.  I'm eagerly awaiting the 2.2.0 release so
I can take it for a spin on my current project.

Rob.


 