300 faults/sec, 100% lock, plenty of free RAM


Avishay Lavie

Sep 9, 2013, 10:00:20 AM
to mongod...@googlegroups.com
Hi,

In a sharded cluster, one of the shards exhibits exceptionally high lock percentage while bulk read/write operations are occurring, becoming 100% locked (sometimes higher, as reported by mongostat; I understand this is a combination of both db-level lock and the global lock, and thus can be higher than 100%). The other shards also get higher lock rates when bulk operations occur, but they are only around 25% or so. 
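A rough sketch of how a combined figure like that can pass 100%, assuming (as I understand it) that the reported percentage is the global write-lock time plus the busiest database's write-lock time over the sampling interval. The function name and the sample numbers are made up for illustration; they are not taken from mongostat or from my cluster.

```python
def lock_percent(global_write_us, db_write_us, elapsed_us):
    """Combined lock percentage over one sampling interval.

    Assumption: the figure is (global write-lock time + db write-lock time)
    divided by elapsed time, all in microseconds. Nothing caps the sum,
    so it can come out above 100.
    """
    if elapsed_us <= 0:
        raise ValueError("elapsed_us must be positive")
    return 100.0 * (global_write_us + db_write_us) / elapsed_us


# Hypothetical one-second sample: 150 ms under the global lock,
# 900 ms under the database-level lock.
pct = lock_percent(global_write_us=150_000, db_write_us=900_000, elapsed_us=1_000_000)
print(pct)  # 105.0
```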

The faulty shard also shows high fault rate, on the order of 200-400 faults/second (again, reported by mongostat). However, index miss rate is reported at zero and there's no shortage of memory - it's a 12GB machine and mongod uses <1GB (according to both 'top' and mongostat's 'res'). 

$ free -m
             total       used       free     shared    buffers     cached
Mem:          7872       2284       5588          0         69       1187
-/+ buffers/cache:       1027       6845
Swap:         3967          1       3966
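
For reference, the "-/+ buffers/cache" row is just the "Mem:" row with buffers and page cache counted as available rather than used. A small sketch of that arithmetic (the function name is only for illustration):

```python
def buffers_cache_adjusted(used, free, buffers, cached):
    """Return (used, free) in MB with buffers and page cache counted as available."""
    reclaimable = buffers + cached
    return used - reclaimable, free + reclaimable


# Values from the "Mem:" row above, in MB.
adj_used, adj_free = buffers_cache_adjusted(used=2284, free=5588, buffers=69, cached=1187)
print(adj_used, adj_free)  # 1028 6844
```

That is within 1 MB of the 1027 / 6845 that free prints; the difference comes from free rounding each column to whole megabytes.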

1. I assume that the high lock rate is *caused* by the high fault rate (lots of faults make operations longer, thus holding the lock for longer periods of time) -- does that make sense? 
2. If mongo has plenty of memory available to it, what could cause high fault rates? 
3. Is there any way I can isolate the bulk operations from the online operations, so they don't impact the application's performance as much?

Thanks,

Avish

Тимур Гимранов

Sep 10, 2013, 1:39:35 AM
to mongod...@googlegroups.com
It looks like your shard key is not suitable, so the faulty shard sustains more reads/writes per second than the other shards.

Please show the output of the "sh.status()" command so we can see whether my assumption is true.

2. FSyncs

3. No. You can only adjust the architecture of your database layer.

On Monday, September 9, 2013, 20:00:20 UTC+6, Avishay Lavie wrote:

Avishay Lavie

Sep 10, 2013, 2:10:55 AM
to mongodb-user

To clarify, I'm not asking "what could cause one shard to work harder than the others?", but rather "what could cause a mongod to exhibit high lock/fault rates with no index misses and plenty of free RAM?". The sharding aspect is minor, since even for a standalone server this is strange behavior.

Could you elaborate on what you mean by "FSyncs"?

Thanks,
Avish


Тимур Гимранов

Sep 10, 2013, 2:30:35 AM
to mongod...@googlegroups.com
MongoDB holds the fsync lock while it flushes data to disk.

Also, every commit to the journal takes some time.
By default, MongoDB writes the journal to disk every 100 ms, and that also generates some faults.
Every oplog write generates faults as well.

On Tuesday, September 10, 2013, 12:10:55 UTC+6, Avishay Lavie wrote: