Best EC2 instance type for MongoDB


Ben Wilber

Feb 23, 2012, 2:39:53 PM
to mongod...@googlegroups.com
Hello,

I'm currently running MongoDB on EC2 m2.2xlarge instances (35GB RAM) with 8x EBS disks in RAID0.  I'm wondering if anyone is running m2.4xlarge instances, and what your network disk IO performance is like.  Has anyone run on smaller instances with better results?  Currently I cannot insert more than 250-300 docs per second before locking gets so bad that my query rate drops and I start getting timeouts.

Thanks

Tyler Brock

Feb 23, 2012, 6:15:27 PM
to mongodb-user
Hey Ben,

It depends on how large those documents are and a couple of other
factors.

Usually when people need more write capacity the solution is to add
more shards.
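
For reference, a rough sketch of what enabling sharding looks like from pymongo, assuming a mongos and an already-configured cluster (the hostname, database, and collection names here are made up, and the shard key is only illustrative):

import pymongo

# Connect to a mongos router (not a mongod directly); hostname is hypothetical.
conn = pymongo.Connection("mongos-host", 27017)

# Enable sharding for the database, then shard the collection on a key.
conn.admin.command("enablesharding", "mydatabase")
conn.admin.command("shardcollection", "mydatabase.mycollection",
                   key={"_id": 1})  # illustrative; a good shard key spreads writes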

Are you doing safe writes? Are they inserts or updates? How many
clients are doing the writes?

Can you get output from mongostat and iostat -xm 2?

-Tyler

Ben Wilber

Feb 27, 2012, 11:59:46 AM
to mongod...@googlegroups.com
Yes, they are inserts, while simultaneously serving about 130 queries per second. iostat and mongostat output is attached.

Thanks
iostat.txt
mongostat.txt

Ben Wilber

Feb 27, 2012, 12:02:02 PM
to mongod...@googlegroups.com
No, they are not safe writes.  We have about 4 clients trying to backfill data from another database.  We can only get to a certain number of inserts/second (usually 250-300) before locking starts to affect normal queries.
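
For anyone following along, the safe/unsafe distinction in the pymongo API of that era looks roughly like this (a minimal sketch; the host and collection names are placeholders):

import pymongo

conn = pymongo.Connection("localhost", 27017)
coll = conn.mydatabase.mycollection

# safe=False (the default here): fire-and-forget, no server acknowledgment.
coll.insert({"x": 1})

# safe=True: each insert is followed by a getLastError round trip,
# so a single client is bound by network latency.
coll.insert({"x": 2}, safe=True)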

Daniel Hunt

Feb 27, 2012, 2:47:45 PM
to mongodb-user
I did a write test of a *very* trivial document earlier this evening, using an m1.large, and was able to hit 20k/sec writes with safe:false, and 1.6k/sec with safe:true...
Now I'm wondering whether my test was bad, or whether your documents are considerably larger and more complex.

Locally (SSD, Macbook Pro) I can hit 40k/sec with safe:false, and 9.8k/sec with safe:true, with the same document structure:
{
  _id: ObjectId(),
  count: 1
}

Count is an incrementing integer.

As I said, the document is trivial, but surely you should be able to
get faster than you are, even while reading...
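
A minimal Python sketch of this kind of insert benchmark, for anyone who wants to reproduce it (this is not Daniel's actual script, which is linked later in the thread; the count and names are placeholders):

import time
import pymongo

N = 100000
coll = pymongo.Connection("localhost").test.speedtest
coll.drop()  # start from an empty collection

start = time.time()
for i in range(N):
    coll.insert({"count": i}, safe=False)  # flip to safe=True to compare
elapsed = time.time() - start
print("%d inserts in %.2fs (%.0f/s)" % (N, elapsed, N / elapsed))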

Glenn Maynard

Feb 27, 2012, 3:23:20 PM
to mongod...@googlegroups.com
On Mon, Feb 27, 2012 at 1:47 PM, Daniel Hunt <danie...@gmail.com> wrote:
I did a write-test of a *very* trivial document earlier on this
evening, using an m1.large, and was able to hit 20k/sec writes with
safe:false, and 1.6k/sec with safe:true...

What does your test look like, out of curiosity?  I get about 5k/sec with safe:true on a t1.micro (until micro throttling kicks in) on EBS (attached, Python).

-- 
Glenn Maynard


test.py

Daniel Hunt

Feb 27, 2012, 3:44:07 PM
to mongodb-user
On Feb 27, 8:23 pm, Glenn Maynard <gl...@zewt.org> wrote:
> What does your test look like, out of curiosity?  I get about 5k/sec with
> safe:true on a t1.micro (until micro throttling kicks in) on EBS (attached,
> Python).

I can't attach a file without using my mail client, it would appear,
so I've pasted it here: http://friendpaste.com/5oxfr0SEZBm4CdRGHlnjld
I'll attach the test as soon as I get an email from this thread (I've
just enabled per-mail notification)

My results from that script, after testing locally on my SSD based
Macbook Pro, are:
Safe:true:
Time: 1330374947.7473
Start: 1330374947.7473
End: 1330375042.0205
Diff: 94.273267030716
Avg: 10,607.47/s

Safe:false:
Time: 1330374873.8339
Start: 1330374873.8339
End: 1330374897.391
Diff: 23.557147979736
Avg: 42,450.00/s


The fluctuation seems to be +/- 10%, depending on what my machine is
doing at the time, naturally.

Daniel

Daniel Hunt

Feb 27, 2012, 3:47:00 PM
to mongodb-user
On Mon, Feb 27, 2012 at 8:44 PM, Daniel Hunt <danie...@gmail.com> wrote:
> On Feb 27, 8:23 pm, Glenn Maynard <gl...@zewt.org> wrote:
>> What does your test look like, out of curiosity?  I get about 5k/sec with
>> safe:true on a t1.micro (until micro throttling kicks in) on EBS (attached,
>> Python).
>
> I can't attach a file without using my mail client, it would appear,
> so I've pasted it here: http://friendpaste.com/5oxfr0SEZBm4CdRGHlnjld
> I'll attach the test as soon as I get an email from this thread (I've
> just enabled per-mail notification)

Attached: PHP

mongospeedtest.php

Ben Wilber

Feb 27, 2012, 5:32:19 PM
to mongod...@googlegroups.com
The docs look like this:

{
  "_id": "benwilber|some_object_key",
  "stats": {
    "action": "Favorited",
    "lastCheck": "2012-02-24T11:37:38Z",
    "numChecks": 73,
    "hasReview": 1
  },
  "title": "Some Object Title",
  "userId": "benwilber",
  "objectKey": "some_object_key",
  "displayName": "Ben Wilber"
}

Here's db.mycollection.stats():

{
  "ns" : "mydatabase.mycollection",
  "count" : 3990282,
  "size" : 1077000108,
  "avgObjSize" : 269.9057630513332,
  "storageSize" : 1371238400,
  "numExtents" : 27,
  "nindexes" : 1,
  "lastExtentSize" : 231825408,
  "paddingFactor" : 1,
  "flags" : 1,
  "totalIndexSize" : 286388928,
  "indexSizes" : {
    "_id_" : 286388928
  },
  "ok" : 1
}

Normally we have an index on "objectKey_1_userId_1_stats.hasReview_1_stats.numChecks_-1", but for this testing I didn't create it since it wasn't used (all query testing was done on _id).
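
For reference, that compound index could be (re)created from pymongo like so (a sketch; the mongo shell's ensureIndex would do the same):

from pymongo import Connection, ASCENDING, DESCENDING

coll = Connection("localhost").mydatabase.mycollection
coll.ensure_index([
    ("objectKey", ASCENDING),
    ("userId", ASCENDING),
    ("stats.hasReview", ASCENDING),
    ("stats.numChecks", DESCENDING),
])  # yields the index name quoted above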

On Thursday, February 23, 2012 6:15:27 PM UTC-5, Tyler Brock wrote:

Ben Wilber

Feb 27, 2012, 5:37:54 PM
to mongod...@googlegroups.com
What is your flush delay set at?  We have to set it low (5 seconds) or it locks the DB for too long.  I can get a few thousand inserts/second if there are no concurrent reads.  The 250-300 inserts/second limit is where my reads start failing after waiting 3s+ for a lock.

On Monday, February 27, 2012 3:23:20 PM UTC-5, Glenn Maynard wrote:

Daniel Hunt

Feb 28, 2012, 7:10:48 AM
to mongodb-user
On Feb 27, 10:37 pm, Ben Wilber <benwil...@gmail.com> wrote:
> what is your flush delay set at?  we have to set it low (5 seconds) or it
> locks the DB for too long.  I can get a few thousand inserts/second if
> there are no concurrent reads.  the 250-300 inserts/second limit is when my
> reads start failing after waiting for 3s+ for a lock

I've not actually set a flush delay, so I presume it's happening every 60 seconds.

Scott Hernandez

Feb 28, 2012, 7:19:41 AM
to mongod...@googlegroups.com
This will most likely be limited by disk IO at these low numbers. Please check the network/IO performance to find the bottleneck on the server.

With safe writes you are most likely limited by the number of client processes (threads), because of the sequential request/response cycle.

By reducing the syncDelay you are essentially writing more often, but in smaller chunks of data, which brings you much closer to the limitations of EBS, which has low throughput.


Daniel Hunt

Feb 28, 2012, 7:22:55 AM
to mongodb-user
Just to clarify - I'm not experiencing slow writes. I'm simply stating
that my testing results in incredibly *fast* writes ;)

Scott Hernandez

Feb 28, 2012, 7:31:15 AM
to mongod...@googlegroups.com
Yep, that was more of a note for Ben, and in general. If you have lots
of memory and fast disks you will get high write throughput, but the
limitation is still disk/io related in most cases.

Ben Wilber

Feb 28, 2012, 10:35:22 AM
to mongod...@googlegroups.com
We had the syncDelay set to 15s originally, which worked well for a couple of months serving normal traffic (100 inserts/100 queries per second).  But when we needed to backfill a few million objects from another database system, we ran into locking issues when the disk was flushed, since there was so much data being written.  We lowered it to 5s, which helped smooth out the locking a little (though we still have problems with it).

My ultimate question is what should we do to achieve some semblance of good write performance (while still serving about 100-200 queries/second)?  Currently our backfills take days because we have to throttle the writes so much so they don't impact real traffic.  200-300 writes/s is just too low and we can't afford to spend this much time on it.

Should we move to the biggest EC2 instance available?  I hear from others that performance is better since there's little/no contention for resources.  I am also very interested in alternative hosting.  Does anyone have experience running Mongo at a different host than EC2 while still having your clients on EC2?  I can make do with slightly less memory if the disks (SSDs) are fast enough.  Should I just double/triple my EC2 costs and shard across multiple replica sets?

@10gen - I know you're partners with AWS and Rackspace, what is your suggestion?


On Tuesday, February 28, 2012 7:19:41 AM UTC-5, Scott Hernandez wrote:
This will most likely be limited by the disk io at these low numbers.
Please checkout network/io performance for the bottleneck on the
server.

With safe writes you are most likely limited by the number of client
processes (threads) because of sequential request/response cycle.

By reducing the syncDelay you are essentially writing more, but
smaller chunks of data which is much closer to the limitations of EBS
which has a low throughput.

On Tue, Feb 28, 2012 at 7:10 AM, Daniel Hunt <> wrote:
> On Feb 27, 10:37 pm, Ben Wilber <> wrote:
>> what is your flush delay set at?  we have to set it low (5 seconds) or it
>> locks the DB for too long.  I can get a few thousand inserts/second if
>> there are no concurrent reads.  the 250-300 inserts/second limit is when my
>> reads start failing after waiting for 3s+ for a lock
>
> I've not actually set a flush delay, so I presume that that's
> happening every 60 seconds.
>

Glenn Maynard

Feb 28, 2012, 11:45:00 AM
to mongod...@googlegroups.com
To clarify: 200-300 inserts per second is what you have to throttle your inserts to (to avoid starving out reads), not what your system can actually do unthrottled, right?

Aside from working around it with sharding or replica sets, I'm definitely curious -- as a user of Mongo, not a developer -- about the underlying problem.  Since you're only doing inserts and queries, it seems you shouldn't be seeing much lock contention (http://www.mongodb.org/display/DOCS/How+does+concurrency+work#Howdoesconcurrencywork-Operations).

I wonder why, in mongostat, there's a huge spike in lock% whenever a flush happens.


On Tue, Feb 28, 2012 at 9:35 AM, Ben Wilber <benw...@gmail.com> wrote:
Should I just double/triple my EC2 costs and shard across multiple replica sets?

Are you sharding already?  I can't tell from the discussion.  If you can shard your data set, can you use much smaller, less expensive instances for each shard, and just add more shards?

If you can't shard your data, maybe use a two-machine replica set, with one of the replicas on a smaller instance with priority:0.  The big instance stays the primary to handle the write load, while the smaller, cheaper instance serves reads without contending against write locks.  You'd have to pay attention to write consistency, of course, so this might not be a trivial change.
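
A sketch of what routing reads to a secondary might look like with the pymongo of that era (ReplicaSetConnection and ReadPreference arrived around pymongo 2.1; the hosts and replica set name are placeholders, and reads served this way are eventually consistent):

from pymongo import ReplicaSetConnection, ReadPreference

conn = ReplicaSetConnection("host1:27017,host2:27017",
                            replicaSet="myreplset")
conn.read_preference = ReadPreference.SECONDARY  # prefer secondaries for queries

# Reads now go to a secondary when one is available; writes still go
# to the primary.
doc = conn.mydatabase.mycollection.find_one({"_id": "benwilber|some_object_key"})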

--
Glenn Maynard

Ben Wilber

Feb 28, 2012, 12:59:41 PM
to mongod...@googlegroups.com
We're not sharding currently.  We have 2 m2.2xlarge instances in a replica set, with an arbiter on another machine.  We do a lot more reads from the secondary than the primary (we need consistent reads in some areas), but even so, the lock % is very high on the primary when doing a lot of inserts.  It seems the EBS disks are just too slow, but I'm curious how others are doing this on EC2.  Is everyone dealing with performance this slow for mixed read/write loads?

Glenn Maynard

Feb 28, 2012, 2:42:00 PM
to mongod...@googlegroups.com
On Tue, Feb 28, 2012 at 11:59 AM, Ben Wilber <benw...@gmail.com> wrote:
we're not sharding currently.  we have 2 m2.2xlarge instances in a replica set with an arbiter on another machine.  we do a lot more reads from the secondary than the primary (we need consistent reads in some areas), but even so the lock % is very high on the primary when doing a lot of inserts.  it seems the EBS disks are just too slow but I'm curious how others are doing this on EC2?  Is everyone dealing with this slow of performance for mixed read/write loads?

But why is the lock% so high, for such a (seemingly) light load?

The attached test includes a writer which continuously writes documents (using the example you provided), and a reader that randomly reads items.  I'm running this test on a c1.medium (much smaller than yours), with a single EBS volume containing only the database.  The readers, writers and Mongo itself are all running on the same instance.

mongostat sample attached.  Both the reader and writer are around 2500 queries/sec each, and lock% is negligible.  Your database is smaller than this test set (according to your collection stats), and your server instance is substantially more powerful.  Any idea how this differs from your test workload?
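
The attached reader.py/writer.py are Glenn's actual scripts; purely as an illustration, a concurrent reader/writer pair along these lines might look like this (hypothetical names; run each function in its own process):

import random
import pymongo

coll = pymongo.Connection("localhost").test.loadtest

def writer(n):
    # Continuously insert documents shaped like Ben's example.
    for i in range(n):
        coll.insert({"_id": "user%d|key%d" % (i, i),
                     "stats": {"numChecks": 0, "hasReview": 0}}, safe=False)

def reader(n, max_id):
    # Randomly read documents back by _id.
    for _ in range(n):
        i = random.randint(0, max_id)
        coll.find_one({"_id": "user%d|key%d" % (i, i)})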

-- 
Glenn Maynard


mongostat.txt
reader.py
writer.py
iostat.txt
stats.txt

Ben Wilber

Feb 28, 2012, 2:55:20 PM
to mongod...@googlegroups.com
What is your --syncDelay set at?  If you haven't changed it, then it's flushing to disk every 60s.  How long did you run your test?  I don't see any actual disk flushes happening in the mongostat/iostat output you sent, which would explain the very high concurrent reads/writes.  I can't get anywhere near that.

Ben Wilber

Feb 28, 2012, 2:56:41 PM
to mongod...@googlegroups.com
I also have a single replica.  Can replication have this kind of impact on performance?

Glenn Maynard

Feb 28, 2012, 3:45:44 PM
to mongod...@googlegroups.com
On Tue, Feb 28, 2012 at 1:55 PM, Ben Wilber <benw...@gmail.com> wrote:
what is your --syncDelay set at?  If you haven't changed it then it's flushing to disk every 60s.  How long did you run your test?  I don't see any actual disk flushes happening in the mongostat/iostat output you sent which would explain the very high concurrent reads/writes.  I can't get anywhere near that.

I didn't set one.  I've rerun the test with --syncdelay 5 (note that parameters are usually case-sensitive), as well as updating to 2.0.2; updated results attached.  (I see that the lock% values are fixed; it looks like the mongostat output in the old version I was using put the decimal point in the wrong place.)

There's one big difference between my test output and yours: whenever your system flushes (a 1 in the "flushes" column), there's a huge spike in lock%, either in that stat period or the next.  I'm not seeing that at all: there's no significant jump in lock% when a flush happens.  I wonder what the difference is.

--
Glenn Maynard


mongostat.txt
stats.txt

Ben Wilber

Feb 28, 2012, 4:09:45 PM
to mongod...@googlegroups.com
The only other thing is that I'm doing concurrent reads/writes from several clients at once.  Also, I have the network_timeout param to pymongo.Connection set to 3 seconds so I can see errors (by default I don't think there is a timeout).  I'm very curious about the difference.

I've followed all the best practices I can find regarding disk tuning (8x EBS in RAID0, xfs, noatime, etc.) -- I should be getting much better performance than I am, especially on a 2xlarge EC2 instance.
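
For reference, the pymongo timeout Ben mentions looks like this in the Connection API of that era (a sketch; the value is in seconds, and without it a blocked query can hang instead of erroring):

import pymongo
from pymongo.errors import AutoReconnect

conn = pymongo.Connection("localhost", 27017, network_timeout=3)
try:
    conn.mydatabase.mycollection.find_one({"_id": "benwilber|some_object_key"})
except AutoReconnect:
    # pymongo raises AutoReconnect when the socket read times out
    print("query timed out after 3 seconds")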

Tyler Brock

Mar 22, 2012, 4:04:11 PM
to mongod...@googlegroups.com

It might be worth noting that, even with RAID 10, there is a 2Gb/s (roughly 250MB/s) rate limit between an individual EC2 instance and the EBS service as a whole, so that is the maximum throughput you could possibly achieve on EBS from a single node, no matter how many volumes you have.

Also, you have probably already seen it but there might be some additional information here:

http://www.mongodb.org/display/DOCS/Amazon+EC2
