inconsistent read speed - logs with flushing mmaps took 11205ms in the log

490 views
Skip to first unread message

Tim Haines

unread,
Nov 14, 2012, 11:46:42 PM11/14/12
to mongod...@googlegroups.com
Hi there,

I've plugged mongo into my production system today to see how it handles the write speed for a few days.  So far I only have one server with a 15k RPM spinning disk and 8GB of ram.  If I take this to production properly, I intend to shard, but hopefully to only 2 or 3 servers.

Right now I'm doing about 200 upsert findAndModify's per second, and 200 upsert updates.  The stats tells me the averageObjSize is 129 bytes for the documents receiving findAndModifys (4 indexes) and 91 bytes for the documents receiving updates (1 index).  The dataset (data + indexes) is currently around 1GB in total.

I have these entries in the mongo logs - I don't know what they mean yet, but they're concerning:

Wed Nov 14 21:49:23 [DataFileSync] flushing mmaps took 10482ms  for 11 files
Wed Nov 14 21:53:23 [DataFileSync] flushing mmaps took 10520ms  for 11 files
Wed Nov 14 21:55:07 [websvr] serverStatus was very slow: { after basic: 0, middle of mem: 1640, after mem: 1640, after connections: 1640, after extra info: 1640, after counters: 1640, after repl: 1640, after asserts: 1640, after dur: 1640, at end: 1640 }
Wed Nov 14 21:56:22 [DataFileSync] flushing mmaps took 10078ms  for 11 files
Wed Nov 14 21:57:04 [journal] old journal file will be removed: /var/lib/mongodb/journal/j._46
Wed Nov 14 21:57:22 [DataFileSync] flushing mmaps took 10008ms  for 11 files
Wed Nov 14 22:00:23 [DataFileSync] flushing mmaps took 10751ms  for 11 files
Wed Nov 14 22:01:23 [DataFileSync] flushing mmaps took 11075ms  for 11 files
Wed Nov 14 22:05:11 [journal] old journal file will be removed: /var/lib/mongodb/journal/j._47
Wed Nov 14 22:06:23 [DataFileSync] flushing mmaps took 10347ms  for 11 files
Wed Nov 14 22:07:24 [DataFileSync] flushing mmaps took 11479ms  for 11 files
Wed Nov 14 22:10:23 [DataFileSync] flushing mmaps took 10410ms  for 11 files
Wed Nov 14 22:11:23 [DataFileSync] flushing mmaps took 10340ms  for 11 files
Wed Nov 14 22:13:24 [DataFileSync] flushing mmaps took 11128ms  for 11 files
Wed Nov 14 22:13:27 [journal] old journal file will be removed: /var/lib/mongodb/journal/j._48
Wed Nov 14 22:14:24 [DataFileSync] flushing mmaps took 11807ms  for 11 files
Wed Nov 14 22:15:24 [DataFileSync] flushing mmaps took 11205ms  for 11 files

I'm not doing any reads yet, but I decided to check to see how consistent they are - if the mmaps flushing is causing read stalls.

I set up a simple query which fetches one record based on an index, and had it repeat every second.  I log the duration and timestamp if it took longer than 3ms (most were under 1ms).  This is the output I saw - some operations taking multiple seconds for a single record query based on an index.  Where all the data and indexes should still be in ram..

0.104574109 2012-11-14 22:13:54 -0600
4.74783542 2012-11-14 22:14:22 -0600
0.243556581 2012-11-14 22:14:25 -0600
0.821347295 2012-11-14 22:14:30 -0600
0.036629986 2012-11-14 22:14:56 -0600
3.851662545 2012-11-14 22:15:05 -0600
0.007991038 2012-11-14 22:15:10 -0600
0.004475775 2012-11-14 22:15:57 -0600
0.09404014 2012-11-14 22:16:00 -0600
0.004445754 2012-11-14 22:16:07 -0600
0.024827767 2012-11-14 22:16:16 -0600
0.004718648 2012-11-14 22:16:29 -0600
0.543830541 2012-11-14 22:16:48 -0600
0.081387443 2012-11-14 22:16:57 -0600
0.093996291 2012-11-14 22:17:55 -0600
0.013581206 2012-11-14 22:18:01 -0600
0.455109956 2012-11-14 22:18:17 -0600
0.543997559 2012-11-14 22:18:20 -0600
0.006581586 2012-11-14 22:18:27 -0600
0.040279709 2012-11-14 22:18:52 -0600
3.9563892 2012-11-14 22:19:22 -0600
0.08867152 2012-11-14 22:19:36 -0600
0.044744453 2012-11-14 22:19:54 -0600
0.011553574 2012-11-14 22:20:14 -0600
0.151064978 2012-11-14 22:20:50 -0600

My munin graph shows disk utilization is at 33%.  iostat often reports util at 0% - 2%, but has bursts at 100%.

Is it possible I have something misconfigured here or is there something I likely need to tune?  Am I overloading the machine?  It's my first day throwing data at mongo, so if someone more experienced could offer any insight, it would be much appreciated.  

Cheers,

Tim

Tim Haines

unread,
Nov 15, 2012, 11:34:51 AM11/15/12
to mongod...@googlegroups.com
The mmap flushing is even slower now..

Thu Nov 15 10:25:50 [journal] old journal file will be removed: /var/lib/mongodb/journal/j._79
Thu Nov 15 10:25:51 [journal] DR101 latency warning on journal file open 562ms
Thu Nov 15 10:26:40 [DataFileSync] flushing mmaps took 27696ms  for 14 files
Thu Nov 15 10:27:50 [DataFileSync] flushing mmaps took 37244ms  for 14 files
Thu Nov 15 10:28:48 [DataFileSync] flushing mmaps took 35968ms  for 14 files
Thu Nov 15 10:29:01 [journal] old journal file will be removed: /var/lib/mongodb/journal/j._80
Thu Nov 15 10:29:38 [DataFileSync] flushing mmaps took 25631ms  for 14 files
Thu Nov 15 10:30:35 [DataFileSync] flushing mmaps took 22286ms  for 14 files
Thu Nov 15 10:31:50 [DataFileSync] flushing mmaps took 37886ms  for 14 files

Tim.

Jason Lewis

unread,
Nov 15, 2012, 2:08:28 PM11/15/12
to mongod...@googlegroups.com
What OS and filesystem?
> --
> You received this message because you are subscribed to the Google
> Groups "mongodb-user" group.
> To post to this group, send email to mongod...@googlegroups.com
> To unsubscribe from this group, send email to
> mongodb-user...@googlegroups.com
> See also the IRC channel -- freenode.net#mongodb

Gevorg Hari

unread,
Nov 15, 2012, 2:22:09 PM11/15/12
to mongod...@googlegroups.com
I'm having a very similar issue. Flushing mmaps takes in general about 10000ms for 150 files and 300GB total but it sometimes jump even to 150000ms. During the flushing the db is basically not responsive. I noticed this while inserting data and creating indexes (I had to abort the operation even if in background).

I'm running MongoDB 2.2.1 for Windows 64-bit 2008R2+ on a WIndows Server 2008 R2 with NTFS File System. There about 200GB of disk space available and plenty of RAM that slowly ramps up to be 100% of utilization when this issue starts occurring.

Tim Haines

unread,
Nov 15, 2012, 2:31:20 PM11/15/12
to mongod...@googlegroups.com
I'm running Ubuntu 12.04 and ext4.

I've tried running mongostat this morning to get more insight into how performance is being affected.

It's doing about 4000 write ops per second when there's no disk activity.  Unfortunately these are only in bursts of about 10 seconds now, followed by 40 seconds of high disk io while flushing the mmaps.  While the flush is occurring, the write ops drop to 500 - 1000 per second, with occasional drops to 0.

I'm wondering if moving the journal files to another drive would alleviate this much. 

Tim.

Tim Haines

unread,
Nov 15, 2012, 4:32:22 PM11/15/12
to mongod...@googlegroups.com
Removing the 4 indexes hasn't helped either.  Flushing the mmaps is still taking 10 - 20 seconds, write performance is spikey, and single reads spike up to over 1 second.  

Gevorg Hari

unread,
Nov 16, 2012, 10:38:07 AM11/16/12
to mongod...@googlegroups.com
I downgraded to 2.0.8-rc0 and that specific problem seems to be gone. There is a similar (maybe equivalent) message: "FlushViewOfFile for ../datadb/collection_name.number" that takes way less time and seems not to freeze the db. This is while creating indexes on big collections.

In both versions (2.2.1 and 2.0.8-rc0) the RAM utilization jumps up to 100% after a while and the entire server becomes scaringly slow. This affects every other process running on the server and it doesn't let me connect to the machine (remote desktop) in an acceptable amount of time. I noticing this issue right now while inserting large amounts of data in the db (not using mongoimport for several reasons).

Tim Haines

unread,
Nov 16, 2012, 10:42:40 AM11/16/12
to mongod...@googlegroups.com
Hi Gevorg,

Do your documents contain large arrays by any chance?  A portion of mine do and grow up to 1000 - 3000 elements, but some stay at 1.  They start with 1 element and I grow them 1 element at a time with push.  I'm wondering if this is related to the problem.

Tim.

Gevorg Hari

unread,
Nov 16, 2012, 10:58:55 AM11/16/12
to mongod...@googlegroups.com
Tim,

my documents contain arrays of documents but the average length is about 20/30. Each document in the array has about 10 string fields and I have (I'm trying to have) two separate indexes on two of those. The collection contains ~50M documents. I started having serious issues when I tried to create the second index and eventually I had to abort the operation. 

I saw way less issues with the 2.0.8-rc0 version (try to create a parallel instance if you can and see if it makes any difference for you) but creating the second index (sparse, background) was still too painful. I wiped out everything, created the indexes first, and trying to reinsert the data to see if it makes any difference right now. The RAM jumped up to 100% after a few hours and the server is not doing great but I'll wait and see what happens.

Do you have indexes as well on those arrays?

Tim Haines

unread,
Nov 16, 2012, 11:10:20 AM11/16/12
to mongod...@googlegroups.com
Hi Gevorg,

I don't have indexes on the arrays.  I did have 4 index on other fields, but I tried dropping them and I still see similar messages - though with slightly smaller durations.  My dataset is still smaller than RAM (8GB) - so I'll be very curios to see what happens once it's larger than RAM.  Probably in 2 - 3 more days.  

I'll see if I can spin up a test with 2.0.8 as well, but I'm going on vacation later today, so it might take a bit of time.

Tim.

Api

unread,
Oct 24, 2013, 7:53:39 AM10/24/13
to mongod...@googlegroups.com
Dear All,
we are experiencing the exact same problem.
Flushin small amounts of data (6k small docs) takes ages (up tp 50 seconds and more).
We have a 3M doc collection, index on _id, Ubuntu 12.04, ext4 fs, latest mongodb version (2.4.7).
Any news on the issue?
Seems we are not alone.
We can provide a python script that reproduces the issue.
Reply all
Reply to author
Forward
0 new messages