@Suhail;
I would like to clarify the issue here, because I honestly think that
you've simply hit a capacity limit.
Server #1: 8GB of RAM:
"storageSize" : 5,028,424,960 (5.0GB)
"totalIndexSize" : 7,831,079,536 (7.8GB)
Server #2: 8GB of RAM:
"storageSize" : 4,180,220,928 (4.2GB)
"totalIndexSize" : 6,100,879,984 (6.1GB)
Problem #1: IOWait Spikes on Server #1
-----------
The index size on Server #1 (7.8GB) is essentially equal to the RAM
(8GB). This means that, at any given time, there's almost no actual
data in RAM. The RAM is mostly busy just keeping the index around.
As you get more documents, this problem is just going to get worse
because there is less and less "data" in RAM and more and more
indexes.
In this sense you have a capacity problem. You have a total of 14GB of
indexes with only 16GB of available RAM. Our general rule for good
performance is to keep all indexes in RAM.
*Solution #1*: It's time to add a new shard. The indexes on your data
are roughly the size of your total RAM, so the IO spikes are just
going to get worse.
*Solution #1a*: Shrink the indexes. What's the purpose of key_1? How
does "key" differ from "_id"?
Problem #2: Uneven query distribution / balancing problems
-----------
Shard #1:
693 queries/s
"count" : 43,504,321 (43.5M)
Shard #2:
183 queries/s
"count" : 35,162,459 (35.1M)
Ideal balance for this much data is ~39.3M documents / node
(= (43.5M + 35.1M) / 2). As Eliot stated, we're within about 20% here,
so that distribution is not bad.
Your queries, however, are *not* balanced. Look at "queries per second
per million documents":
Shard #1: 15.9 q/s/M
Shard #2: 5.2 q/s/M
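(Those numbers are just queries/s divided by millions of documents:
693 / 43.5 ~= 15.9 and 183 / 35.1 ~= 5.2. If you want to re-derive
them yourself, here's a rough sketch to run against each shard, with
"users" as a placeholder collection:)

  // Sample the query opcounter twice, 10 seconds apart.
  // Note: the shell's sleep() takes milliseconds.
  var a = db.serverStatus().opcounters.query;
  sleep(10 * 1000);
  var b = db.serverStatus().opcounters.query;
  var qps = (b - a) / 10;
  print(qps.toFixed(1) + " q/s, " +
        (qps / (db.users.count() / 1e6)).toFixed(1) + " q/s/M");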
So Shard #1 is getting 3x the number of queries even after we
normalize for the number of documents. For whatever reason, the users
that are active on Shard #1 are just *more* active. That means the
server that is most short on RAM is also the server that's fielding
the most queries.
MongoDB does *not* shard "by activity"; it shards "by key region". You
say that the load on each shard should be even, but clearly you have
more activity on one node than the other.
*Solution #2*: It's time to add a new shard. You'll need to
redistribute that load and get more RAM into the equation.
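Mechanically, adding the shard is a single command from a mongos (the
hostname below is made up):

  // Run from a mongos; "shard3.example.com" is a placeholder.
  db.adminCommand({ addshard : "shard3.example.com:27018" });
  // Confirm the new shard is registered:
  db.adminCommand({ listshards : 1 });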
*Solution #2a*: (see solution #1a)
*Solution #2b*: manual re-chunking. This is the *very* last thing that
you want to do. But you may have to manually move some of the 'hot'
zones from #1 to #2. I personally advise against this. In most cases,
adding a new server is much cheaper than trying to make this happen.
Plus, in general, you want to trust Mongo to do this correctly.
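For completeness, a manual move looks like this (the db, collection,
key value, and target shard name are all placeholders), but again: let
the balancer do its job if you possibly can.

  // Run from a mongos. Moves the chunk containing the given key
  // value to the named shard.
  db.adminCommand({
      moveChunk : "mydb.users",
      find      : { key : "some-hot-key" },
      to        : "shard0001"
  });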
Conclusion
----------
So there are solutions to both of your problems.
Based purely on the numbers, it looks like you have some capacity
issues. It also looks like the data is not evenly distributed in terms
of query load.
The easiest solution to both problems is simply to add a new shard. I
highly suggest that you add that shard during a period of low IOWait
so that you don't overload the existing servers during the chunking
process.
To reiterate, adding a shard is going to slow down everyone else
until that shard is "caught up".
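You can watch the balancer's progress from a mongos; once the chunk
counts are roughly even across shards, you're caught up. A sketch
(the shard name is a placeholder):

  // Chunk counts per shard, plus a direct count from the config db.
  db.printShardingStatus();
  db.getSiblingDB("config").chunks.count({ shard : "shard0000" });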
Regards;
Gates