Performance issues
7 messages
 
Otis Zein  
From: Otis Zein <otisz...@yahoo.com>
Date: Tue, 2 Aug 2011 06:53:48 -0700 (PDT)
Local: Tues, Aug 2 2011 9:53 am
Subject: Performance issues

We are running into performance issues with our mongo cluster.

Here is what we currently have running:
Four nodes running RHEL5 64bit with 24 CPUs (HT), 96GB RAM, and RAID 10 (6 disks) with 10G nics.
Mongo 1.8.2 with no repl sets.  3 config servers and mongos running on each server.

Data:
Trying to ingest 1B rows/day.  On average, this is 11,574 rows/sec.  With our current schema, each doc
is ~512 bytes and the indexes add ~400 bytes.  So to make it easy, let's say 1K/doc.

Doing the math, I get an average disk write speed of ~12MB/sec.  Not outrageous even if I double that rate
across the 4 servers.
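
The arithmetic above can be checked with a short sketch. It assumes that "1K/doc" means 1,024 bytes and that ingest is spread evenly over the day; both are simplifications for illustration, not measurements from the cluster.

```python
# Back-of-the-envelope ingest rate, using the figures from the post.
ROWS_PER_DAY = 1_000_000_000
SECONDS_PER_DAY = 24 * 60 * 60
BYTES_PER_DOC = 1024  # assumption: "1K/doc" read as 1,024 bytes (doc plus index overhead)

rows_per_sec = ROWS_PER_DAY / SECONDS_PER_DAY
mb_per_sec = rows_per_sec * BYTES_PER_DOC / 1_000_000  # decimal megabytes, cluster-wide

print(f"{rows_per_sec:.0f} rows/sec, {mb_per_sec:.1f} MB/sec")
```

This is an average; it says nothing about bursts or about how the writes are distributed across the shards.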

What eventually happens is that the primary shard gets hot because the balancer seems to run too slow.  I have
seen big chunk imbalances where the primary shard has ~120 chunks and the other 3 shards have ~30 chunks.
If we stop ingest, the chunks eventually will balance themselves out.  But for production, we can't since data
is streaming in 24/7.

Our shard key is YYYYMMDD + 6 random digits.  The date is the current date from the data.
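
For readers who want to picture the key, here is a minimal sketch of one way to build it. It assumes the key is stored as a 14-character string with a zero-padded random suffix; the post does not say whether it is a string or a number, and `make_shard_key` is a hypothetical helper, not the actual ingest code.

```python
import random
from datetime import date

def make_shard_key(record_date: date, rng: random.Random) -> str:
    """Build a key of the form YYYYMMDD + 6 random digits (14 characters)."""
    suffix = rng.randrange(1_000_000)  # integer in 0..999999
    return f"{record_date:%Y%m%d}{suffix:06d}"

# The date comes from the data itself, as described in the post.
key = make_shard_key(date(2011, 8, 2), random.Random())
print(key)
```

Because every key for a given day shares the same 8-character prefix, one day's data occupies a single contiguous key range.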

The ingest code is written in Java and is currently configured to have 8 threads per server for a total
of 32 threads for the entire cluster.

As another test, we set up another 4 nodes to see if this would help.  Same thing.  The culprit seems to be
the balancer not moving chunks off the node fast enough to help distribute the load.

So, are we trying to load too much into mongo?  Is there any tuning we can do on the balancer?
Should we try bigger chunks?

Thoughts?


 
Nathan Ehresman  
From: Nathan Ehresman <nehres...@sentryds.com>
Date: Tue, 2 Aug 2011 12:16:38 -0400
Local: Tues, Aug 2 2011 12:16 pm
Subject: Re: [mongodb-user] Performance issues
Otis,

Your shard key lends itself to pre-splitting your chunks.  Then you would just want
to move some of your new empty chunks onto the right servers before they receive
data.  If done right, it makes the balancer much less busy.

http://www.mongodb.org/display/DOCS/Splitting+Chunks
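
As a rough illustration of the pre-splitting idea, the sketch below computes evenly spaced split keys for one day. It assumes the key is a string of the form YYYYMMDD + 6 digits, as described earlier in the thread; `split_points` is a hypothetical helper, and the sketch only computes keys without contacting a cluster.

```python
from typing import List

def split_points(day: str, num_chunks: int) -> List[str]:
    """Return num_chunks evenly spaced split keys for one day, starting at suffix 000000."""
    if not 1 <= num_chunks <= 1_000_000:
        raise ValueError("num_chunks must be between 1 and 1,000,000")
    step = 1_000_000 / num_chunks
    return [f"{day}{round(i * step):06d}" for i in range(num_chunks)]

points = split_points("20110803", 4)
print(points)
# ['20110803000000', '20110803250000', '20110803500000', '20110803750000']
```

Splitting at each returned key yields `num_chunks` chunks covering the day; the last one stays open-ended until the following day is pre-split.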

Nathan Ehresman
http://www.tebros.com/



 
vermoid  
From: vermoid <tdhaayushve...@gmail.com>
Date: Tue, 2 Aug 2011 10:00:40 -0700 (PDT)
Local: Tues, Aug 2 2011 1:00 pm
Subject: Re: Performance issues
To increase the number of inserts, have you tried doing a bulk insert?
A bulk insert through the Java driver is pretty much inserting a
List<DBObject> (a list of documents) rather than inserting just a
single DBObject.

Basically, you have a very high streaming rate. You can batch things up
and do a bulk insert.
How much you should batch depends on your application, but
I would say put a time and size limit on it.
So batch up for, say, 50 sec, but if the size reaches 100,000 documents,
then flush it.
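
A minimal sketch of the time-and-size limit described above, written in Python rather than Java for brevity. `BatchBuffer` and its parameters are hypothetical names; the `flush` callback stands in for whatever bulk insert call the driver provides, and the age limit is only checked when a document arrives, which is a simplification that suits a stream that never goes quiet.

```python
import time
from typing import Callable, List

class BatchBuffer:
    """Collect documents and flush them on a size limit or an age limit."""

    def __init__(self, flush: Callable[[List[dict]], None],
                 max_docs: int = 100_000, max_age_sec: float = 50.0,
                 clock: Callable[[], float] = time.monotonic):
        self._flush_fn = flush
        self._max_docs = max_docs
        self._max_age_sec = max_age_sec
        self._clock = clock
        self._docs: List[dict] = []
        self._started = 0.0

    def add(self, doc: dict) -> None:
        if not self._docs:
            self._started = self._clock()  # age is measured from the first doc in the batch
        self._docs.append(doc)
        too_big = len(self._docs) >= self._max_docs
        too_old = self._clock() - self._started >= self._max_age_sec
        if too_big or too_old:
            self.flush()

    def flush(self) -> None:
        if self._docs:
            batch, self._docs = self._docs, []
            self._flush_fn(batch)  # hand the whole batch to the bulk insert

# Usage: collect flushed batches in a list instead of writing to a database.
sent: List[List[dict]] = []
buf = BatchBuffer(sent.append, max_docs=2)
buf.add({"n": 1})
buf.add({"n": 2})  # reaches max_docs, so the batch is flushed here
```

A production version would also flush from a timer and on shutdown so that a partially filled batch is never stranded.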



 
otiszein@yahoo.com  
From: "otisz...@yahoo.com" <otisz...@yahoo.com>
Date: Wed, 3 Aug 2011 07:00:28 -0700 (PDT)
Local: Wed, Aug 3 2011 10:00 am
Subject: Re: Performance issues
@nathan - Yeah, that is my next step.  I didn't want to have to pre-
split.  More code to maintain.  :-(

@vermoid - We are using bulk inserts but it maxes out at 5K docs.
Once I get this pre-split going, we'll start fiddling with increasing
that number.

10gen - I assume the balancer is throttled?  I have created the 300
chunks and the balancer is still moving chunks around after 20 mins.
Is there any way to get the balancer to run more aggressively?  With
10G NICs and 64MB chunks, this should be working faster.  Would moving
the chunks myself speed things up?

I forgot to mention what our performance problem is.  Our ingest can't
keep up.  We get hopelessly backlogged.

Thanks all.



 
Greg Studer  
From: Greg Studer <g...@10gen.com>
Date: Wed, 3 Aug 2011 13:09:53 -0400
Local: Wed, Aug 3 2011 1:09 pm
Subject: Re: [mongodb-user] Re: Performance issues
The balancer isn't throttled, but it needs to perform verification
steps and continue to push new and updated data to the target shard,
which can slow things down when you continue to add lots of data.  The
mongod logs have info about migration data transfers, if you want to
know more about what's happening on your systems.

As Nathan mentioned, pre-splitting is probably the right way to go
here - it's more code, but hopefully you should only need a small
script that runs every few days.  We are looking at tracking not only
the current size but rate of size increase to make the balancer more
predictive, but for now this may be your best solution.


 
otiszein@yahoo.com  
From: "otisz...@yahoo.com" <otisz...@yahoo.com>
Date: Wed, 3 Aug 2011 14:38:00 -0700 (PDT)
Subject: Re: Performance issues
Greg

Thanks for the info.  It helps to understand how things work.  I
started running a test using pre-splits and it's working better.

Would running moveChunk myself be faster than waiting for the balancer
to do it?  I'm planning to run the presplitter before midnight in
order to create the next day's chunks.  My first pre-split took the
balancer 30 mins to move the chunks around.  I assume there is no way
to create a chunk directly on a shard.

Also, any insight into whether shoving 1B docs/day into mongo is
sane?  I have searched around a bit and haven't found any other users
with that much data.



 
Eliot Horowitz  
From: Eliot Horowitz <el...@10gen.com>
Date: Thu, 4 Aug 2011 12:10:38 -0400
Local: Thurs, Aug 4 2011 12:10 pm
Subject: Re: [mongodb-user] Re: Performance issues

Moving the chunks yourself might be faster, as you won't have any of the
balancer's pauses to make sure things are ok.
If you know they are going to be empty chunks, it might be better.
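
One way to plan such manual moves is sketched below: it spreads freshly pre-split, still-empty chunks round-robin across the shards and builds one document per chunk in the shape of the `moveChunk` admin command (`moveChunk`, `find`, `to`). The namespace `ingest.events`, the field name `key`, and the shard names are hypothetical, and the sketch only builds the documents; sending them through mongos is left to the driver.

```python
from typing import Dict, List

def move_commands(namespace: str, key_field: str,
                  split_keys: List[str], shards: List[str]) -> List[Dict]:
    """Assign the chunk starting at each split key to a shard, round-robin."""
    return [
        {
            "moveChunk": namespace,          # command name goes first in the document
            "find": {key_field: key},        # any key inside the chunk identifies it
            "to": shards[i % len(shards)],
        }
        for i, key in enumerate(split_keys)
    ]

# Hypothetical names throughout: namespace, field, split keys and shards.
cmds = move_commands(
    "ingest.events", "key",
    ["20110804000000", "20110804250000", "20110804500000", "20110804750000"],
    ["shard0000", "shard0001", "shard0002", "shard0003"],
)
for cmd in cmds:
    print(cmd)
```

A chunk that already lives on its target shard does not need to move, so a real script would check the current placement before issuing each command.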

1B docs/day is fine; planning the cluster size should just factor in working
set size and load.



 