On Wed, May 30, 2012 at 8:01 AM, Stephan Bösebeck <sboes...@googlemail.com> wrote:
>
> On 30.05.2012 at 13:39, Scott Hernandez wrote:
>
>> On Wed, May 30, 2012 at 7:33 AM, Stephan Bösebeck <sboes...@googlemail.com> wrote:
>>> Hi Scott,
>>>
>>> Let's just answer the questions you asked:
>>> - there are no expensive operations/queries running - the same queries work now and in 5 minutes they end in a timeout.
>> That isn't exactly what I was getting at. When you get to these slow
>> points can you please provide db.currentOp() output along with the
>> time things are happening so we can look in MMS?
You may want to enable database profiling on the primaries of your
shards to get an idea of the types of queries and their timing
historically. You can also enable this in MMS so it is easier to
diagnose during the slow times.
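
As a rough sketch of what I mean, something like the following could be pasted into the mongo shell on each shard primary. The helper names `slowOps` and `snapshot` are made up for illustration and are not part of MongoDB, and the 100 ms and 5 second thresholds are only assumptions to adjust for your workload.

```javascript
// Return in-progress operations that have been running for at least minSecs.
// `currentOp` is expected to look like the result of db.currentOp(),
// i.e. an object with an `inprog` array.
function slowOps(currentOp, minSecs) {
  return (currentOp.inprog || []).filter(function (op) {
    return op.active && (op.secs_running || 0) >= minSecs;
  });
}

// Pair the filtered operations with a timestamp so the output can be
// lined up with the MMS graphs afterwards.
function snapshot(currentOp, minSecs, now) {
  return { at: now || new Date(), ops: slowOps(currentOp, minSecs) };
}

// In the mongo shell (assumed thresholds, adjust as needed):
//   db.setProfilingLevel(1, 100);            // profile operations slower than 100 ms
//   printjson(snapshot(db.currentOp(), 5));  // operations running for 5 s or longer
//   db.system.profile.find().sort({ ts: -1 }).limit(10);  // most recent profiled ops
```

Running the `snapshot` line a few times while things are slow, and sending the output with its timestamps, would give us something to compare against MMS.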
>
>>
>>> - yes, we use MMS and Munin...
>> Okay, when does this happen and what is your group name?
> look in MMS, group name holidayinsider.com
You don't seem to be collecting hardware stats with MMS. Can you make
sure connectivity and munin are set up correctly on all your hosts?
http://mms.10gen.com/help/install.html#hardware-monitoring-with-munin-node
>>
>>> - Version 2.0.2 of MongoDB is installed on all nodes
>> You should upgrade to 2.0.5 or 2.0.6 later this week as there are some
>> important fixes related to sharding and balancing.
> We actually use 2.0.4 - we need to plan the upgrade a bit. Last time we upgraded, there was a problem with the replica set that left the cluster unable to start at all... (another issue)
>>
>>> I was thinking about dropping the sharded collection so that it gets re-created as an unsharded one, which would get rid of the balancing issue.
It seems like you have lots of unsharded databases which live on the
hi2 shard. Can you run mongotop on that shard and see where all the
traffic is going -- which collections are active? The operations and
load do not seem evenly split between the two shards, and I'm guessing
the unsharded databases are the cause, since your three sharded
collections are balanced.
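
If mongotop is awkward to run there, a similar picture can be sketched from the shell with the `top` admin command. The helper name `busiestNamespaces` is hypothetical, and the result shape noted in the comment is my assumption about the `top` output, so check it against what your version returns. Note that these are cumulative counters since startup, whereas mongotop reports per-interval figures.

```javascript
// Rank namespaces by cumulative time from a `top`-style result.
// Assumed shape (verify on your version):
//   { totals: { "<db.collection>": { total: { time, count } }, ... } }
function busiestNamespaces(topResult, limit) {
  var totals = topResult.totals || {};
  var rows = [];
  for (var ns in totals) {
    var entry = totals[ns];
    // Skip keys that are not per-namespace entries, e.g. a note string.
    if (entry && entry.total) {
      rows.push({ ns: ns, time: entry.total.time, count: entry.total.count });
    }
  }
  // Largest cumulative time first.
  rows.sort(function (a, b) { return b.time - a.time; });
  return rows.slice(0, limit);
}

// In the mongo shell, connected to the hi2 primary:
//   printjson(busiestNamespaces(db.adminCommand({ top: 1 }), 10));
```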