Using EBS as data volumes on Amazon EC2


JonW

Oct 6, 2010, 12:57:13 PM
to mongodb-user
We are considering using EBS volumes on Amazon EC2 as our data
volumes for MongoDB. Is this recommended? Are there performance
concerns? I've only seen one post about this practice in the group,
and it appeared positive. Do others have good or bad experiences
doing this?

Thanks,
Jon.

Alvin Richards

Oct 6, 2010, 1:13:54 PM
to mongodb-user
In general a striped EBS volume will improve iowait. However, in
our own tests we see irregular latency spikes during the day from any
given EBS-backed device, sometimes peaking at 600ms.

We don't have a good explanation for what is happening on EC2, but we
will continue to dig.
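
For anyone not familiar with the setup, by "striped" I mean a software
RAID0 across several EBS volumes, roughly like the sketch below (device
names, volume count and mount point are placeholders, not a
recommendation):

  # Attach the EBS volumes first (here /dev/sdh1..sdh4), then stripe
  # them with mdadm and put the dbpath on the resulting array.
  sudo mdadm --create /dev/md0 --level=0 --raid-devices=4 \
      /dev/sdh1 /dev/sdh2 /dev/sdh3 /dev/sdh4
  sudo mkfs.ext4 /dev/md0
  sudo mkdir -p /data/db
  sudo mount -o noatime /dev/md0 /data/db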

Suggest that you run a few tests on your app and post your findings
here.

-Alvin

Will Fleming

Oct 6, 2010, 11:22:43 PM
to mongod...@googlegroups.com
I've been using mongodb for the past few days on an EC2 large instance
with the db on an ext4, EBS-backed volume. I'm seeing periodic latency
on inserts, generally in the 100-200ms range but occasionally spiking
up to 700ms (doing about 20 inserts and 10 upserts per second). The
spikes are pretty unpredictable as to when they occur - sometimes every
20 seconds, sometimes with 5-10 minutes between them.

Initially I thought the problem might be with mongo, so as a test I
set up another machine and ran ~2500 inserts per second - same result.
The occasional latency remained at the same frequency, even with a
100x increase in the number of inserts per second.

This is just a personal project at the moment, but when I have time
I'll probably see what the difference is on ephemeral storage
(possibly with the db replicated on EBS), see if using XFS makes a
difference, and look at striping.
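
In case it's useful to anyone trying the same comparison, the XFS test
I have in mind is roughly the following (device name and mount point
are placeholders, and mkfs wipes whatever is on the volume):

  sudo apt-get install xfsprogs
  sudo umount /data/db                      # with mongod stopped
  sudo mkfs.xfs -f /dev/sdf
  sudo mount -t xfs -o noatime /dev/sdf /data/db
  mongod --dbpath /data/db                  # then re-run the same insert load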


I also briefly tried a 64-bit EC2 micro instance, but was seeing
latency spikes of up to 15,000ms. Not sure if that's endemic to micro
instances or if I just caught EBS on a bad day.

Alvin Richards

Oct 7, 2010, 12:05:03 AM
to mongodb-user
We have been seeing periodic spikes on EBS of 600ms. To check this, do
the following:

iostat -x 2

Or run munin or equivalent.

If you post that along with mongostat output, it should be fairly
simple to determine the cause.
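
If it helps, a quick way to line the two up is to capture both with
timestamps for a few minutes and compare where the spikes land (file
names here are arbitrary):

  iostat -t -x 2 > iostat.log &
  mongostat > mongostat.log &
  # ...let a few of the slow inserts happen, then stop both jobs...
  kill %1 %2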

-Alvin

Will Fleming

Oct 7, 2010, 1:54:46 AM
to mongod...@googlegroups.com
Thanks,

Running Ubuntu 10.04 (http://alestic.com/ 64-bit S3-based AMI), with
mongo-stable installed via apt-get from the 10gen repository.

The only non-stock software on the machine is mongodb, some JavaScript
map-reduce jobs run once an hour by cron (not running during this
sample), and now iostat.

sdf in the iostat output is the EBS volume.

There are 4 indexes on the user collection and 5 on post - but despite
one of each showing up below, about 90% of the spikes are on post
inserts. The posts collection is currently about 3 million documents,
users about 1.8 million.


db version v1.6.3, pdfile version 4.5
git hash: 278bd2ac2f2efbee556f32c13c1b6803224d1c01


Mongodb log:

dbtop (occurences|percent of elapsed)
NS total Reads Writes Queries GetMores Inserts Updates Removes
GLOBAL 50 1.4% 0 0.0% 50 1.4% 0 0.0% 0 0.0% 25 1.1% 25 0.3% 0 0.0%
site.post 25 1.1% 0 0.0% 25 1.1% 0 0.0% 0 0.0% 25 1.1% 0 0.0% 0 0.0%
site.user 25 0.3% 0 0.0% 25 0.3% 0 0.0% 0 0.0% 0 0.0% 25 0.3% 0 0.0%
write lock % time in write lock, by 4 sec periods
1 2 4 10 5 5 3 3 1 2 3 3 4 5 3 2 3 4 3 2 15 2 3 6 2 4 1 1 7 10 6 5 4 1
4 14 7 5 3 12 2 6 2 22 4 2 4 3 2 3 6 4 3 2 2 2 4 3 3 4 4 2 2 3 2 3 7 3
2 2 3 2 2 2 38 3 1 3 5 1 4 9 3 6 5 5 12 1 4 1 4 10 6 2 5 2 3 7 13
write locked now: false

05:23:18 [conn635] insert site.post 103ms
05:23:33 [conn635] update site.user query: { userId: 3634671 } 113ms

mongostat:

insert/s query/s update/s delete/s getmore/s command/s flushes/s  mapped vsize res  faults/s locked %  idx miss %  q t|r|w  conn  time
7 0 7 0 0 1 0  12235 12919 6667  1  0.7   0  0|0|0  45  05:23:08
7 0 7 0 0 1 0  12235 12919 6667  5  4.3   0  0|0|0  45  05:23:09
6 0 6 0 0 1 0  12235 12919 6667  3  3.8   0  0|0|0  45  05:23:10
5 0 5 0 0 1 0  12235 12919 6667  1  1.6   0  0|0|0  45  05:23:11
7 0 7 0 0 1 0  12235 12919 6667  3  3.5   0  0|0|0  45  05:23:12
6 0 6 0 0 1 0  12235 12919 6667  1  1.7   0  0|0|0  45  05:23:13
7 0 7 0 0 1 0  12235 12919 6667  2  2.9   0  0|0|0  45  05:23:14
9 0 9 0 0 1 0  12235 12919 6667  3  4.9   0  0|0|0  45  05:23:15
7 0 7 0 0 1 0  12235 12919 6667  5  4.1   0  0|0|0  45  05:23:16
7 0 7 0 0 1 0  12235 12919 6667  1  1.6   0  0|0|0  45  05:23:17
7 0 7 0 0 1 0  12235 12919 6667  4  16.6  0  0|0|0  45  05:23:18
8 0 8 0 0 1 0  12235 12919 6667  4  11.7  0  0|0|0  45  05:23:19
7 0 7 0 0 1 0  12235 12919 6667  2  1.3   0  0|0|0  45  05:23:20
7 0 7 0 0 1 0  12235 12919 6666  2  1.6   0  0|0|0  45  05:23:21
6 0 6 0 0 1 0  12235 12919 6666  1  1.8   0  0|0|0  45  05:23:22
6 0 6 0 0 1 0  12235 12919 6666  0  0.2   0  0|0|0  45  05:23:23
8 0 8 0 0 1 0  12235 12919 6666  2  2.7   0  0|0|0  45  05:23:24
7 0 7 0 0 1 0  12235 12919 6666  3  2.7   0  0|0|0  45  05:23:25
3 0 3 0 0 1 0  12235 12919 6666  1  0.6   0  0|0|0  45  05:23:26
6 0 6 0 0 1 0  12235 12919 6666  3  3.1   0  0|0|0  45  05:23:27
insert/s query/s update/s delete/s getmore/s command/s flushes/s  mapped vsize res  faults/s locked %  idx miss %  q t|r|w  conn  time
7 0 7 0 0 1 0  12235 12919 6666  3  3.5   0  0|0|0  45  05:23:28
7 0 7 0 0 1 0  12235 12919 6666  2  4.5   0  0|0|0  45  05:23:29
8 0 8 0 0 1 0  12235 12919 6666  5  9.8   0  0|0|0  45  05:23:30
8 0 8 0 0 1 0  12235 12919 6666  0  0.4   0  0|0|0  45  05:23:31
7 0 7 0 0 1 0  12235 12919 6666  0  0.5   0  0|0|0  45  05:23:32
4 0 4 0 0 1 0  12235 12919 6666  4  6.3   0  0|0|0  45  05:23:33
5 0 5 0 0 1 0  12235 12919 6666  2  14.9  0  0|0|0  45  05:23:34
3 0 3 0 0 1 0  12235 12919 6666  0  0.1   0  0|0|0  45  05:23:35
6 0 6 0 0 1 0  12235 12919 6666  8  7.3   0  0|0|0  45  05:23:36
6 0 6 0 0 1 0  12235 12919 6666  1  2     0  0|0|0  45  05:23:37
5 0 5 0 0 1 0  12235 12919 6667  1  0.4   0  0|0|0  45  05:23:38
7 0 7 0 0 1 0  12235 12919 6667  2  5.8   0  0|0|0  45  05:23:39


iostat -tz -x 2


10/07/2010 05:23:07 AM
avg-cpu: %user %nice %system %iowait %steal %idle
0.00 0.00 0.00 1.74 0.00 98.26

Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util
sda1 0.00 0.00 0.00 1.99 0.00 15.92 8.00 0.00 0.00 0.00 0.00
sdf 0.00 0.00 11.94 0.00 405.97 0.00 34.00 0.11 9.58 2.50 2.99

10/07/2010 05:23:09 AM
avg-cpu: %user %nice %system %iowait %steal %idle
0.00 0.00 0.00 0.74 0.00 99.26

Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util
sdf 0.00 0.00 5.97 1.99 242.79 15.92 32.50 0.01 1.88 1.87 1.49

10/07/2010 05:23:11 AM
avg-cpu: %user %nice %system %iowait %steal %idle
0.00 0.00 0.00 1.24 0.00 98.76

Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util
sda1 0.00 0.00 0.00 1.00 0.00 7.96 8.00 0.00 0.00 0.00 0.00
sdf 0.00 0.00 3.98 0.00 111.44 0.00 28.00 0.04 11.25 6.25 2.49

10/07/2010 05:23:13 AM
avg-cpu: %user %nice %system %iowait %steal %idle
0.00 0.00 0.00 1.01 0.00 98.99

Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util
sda1 0.00 0.00 0.00 1.99 0.00 15.92 8.00 0.00 0.00 0.00 0.00
sdf 0.00 0.00 5.47 0.00 147.26 0.00 26.91 0.02 4.55 4.55 2.49

10/07/2010 05:23:15 AM
avg-cpu: %user %nice %system %iowait %steal %idle
0.00 0.00 0.00 1.98 0.00 98.02

Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util
sdf 0.00 0.00 6.97 1.99 175.12 15.92 21.33 0.08 9.44 3.33 2.99

10/07/2010 05:23:17 AM
avg-cpu: %user %nice %system %iowait %steal %idle

Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util
sdf 0.00 0.00 5.47 0.00 107.46 0.00 19.64 0.08 14.55 5.45 2.99

10/07/2010 05:23:19 AM
avg-cpu: %user %nice %system %iowait %steal %idle
0.00 0.00 0.00 15.14 0.00 84.86

Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util
sda1 0.00 0.00 0.00 2.99 0.00 23.88 8.00 0.00 0.00 0.00 0.00
sdf 0.00 396.02 13.93 415.92 557.21 6495.52 16.41 32.15 74.80 0.75 32.34

10/07/2010 05:23:21 AM
avg-cpu: %user %nice %system %iowait %steal %idle
0.00 0.00 0.26 0.78 0.00 98.96

Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util
sdf 0.00 0.00 8.46 0.00 222.89 0.00 26.35 0.03 3.53 1.18 1.00

10/07/2010 05:23:23 AM
avg-cpu: %user %nice %system %iowait %steal %idle
0.00 0.00 0.00 0.56 0.00 99.44

Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util
sdf 0.00 0.00 1.99 0.00 99.50 0.00 50.00 0.04 20.00 5.00 1.00

10/07/2010 05:23:25 AM
avg-cpu: %user %nice %system %iowait %steal %idle
0.00 0.00 0.00 1.25 0.00 98.75

Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util
sda1 0.00 0.00 0.00 2.99 0.00 23.88 8.00 0.00 0.00 0.00 0.00
sdf 0.00 0.00 8.96 1.99 433.83 15.92 41.09 0.06 5.45 1.82 1.99

10/07/2010 05:23:27 AM
avg-cpu: %user %nice %system %iowait %steal %idle
0.00 0.00 0.00 0.75 0.00 99.25

Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util
sda1 0.00 0.00 0.00 1.49 0.00 11.94 8.00 0.00 0.00 0.00 0.00
sdf 0.00 0.00 7.96 0.00 234.83 0.00 29.50 0.04 5.62 2.50 1.99

10/07/2010 05:23:29 AM
avg-cpu: %user %nice %system %iowait %steal %idle
0.00 0.00 0.00 2.00 0.00 98.00

Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util
sdf 0.00 0.00 7.96 1.99 302.49 15.92 32.00 0.12 12.50 4.50 4.48

10/07/2010 05:23:31 AM
avg-cpu: %user %nice %system %iowait %steal %idle
0.00 0.00 0.00 2.49 0.00 97.51

Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util
sda1 0.00 0.00 0.00 1.99 0.00 15.92 8.00 0.00 0.00 0.00 0.00
sdf 0.00 0.00 8.96 0.00 401.99 0.00 44.89 0.20 22.22 5.56 4.98

10/07/2010 05:23:33 AM
avg-cpu: %user %nice %system %iowait %steal %idle
0.00 0.00 0.00 9.75 0.00 90.25

Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util
sdf 0.00 431.00 2.00 357.50 16.00 5712.00 15.93 29.87 68.83 0.58 21.00

10/07/2010 05:23:35 AM
avg-cpu: %user %nice %system %iowait %steal %idle
0.00 0.00 0.00 26.93 0.25 72.82

Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util
sdf 0.00 960.70 3.48 905.47 39.80 15526.37 17.13 95.35 110.51 0.62 56.72

10/07/2010 05:23:37 AM
avg-cpu: %user %nice %system %iowait %steal %idle
0.00 0.00 0.00 1.99 0.00 98.01

Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util
sda1 0.00 0.00 0.00 2.99 0.00 23.88 8.00 0.00 0.00 0.00 0.00
sdf 0.00 0.00 8.96 0.00 214.93 0.00 24.00 0.11 12.78 4.44 3.98

10/07/2010 05:23:39 AM
avg-cpu: %user %nice %system %iowait %steal %idle
0.00 0.00 0.25 3.00 0.00 96.75

Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util
sdf 0.00 61.19 1.49 76.12 39.80 1098.51 14.67 3.25 41.47 0.83 6.47


Will Fleming

Oct 7, 2010, 2:04:35 AM
to mongod...@googlegroups.com
One other note: I'm not using replication (not concerned about
potential data loss at the moment) or sharding.

Alvin Richards

Oct 7, 2010, 10:24:44 AM
to mongodb-user
Based on a bunch of perf runs using striped EBS volumes, ephemeral
storage, and ext4 versus xfs... in terms of throughput they are within
10-15% of each other (that's with up to 20 EBS volumes).

Comparing the munin charts of the runs, there is a fairly direct
correlation between the latency of the EC2 storage and throughput.
Bear in mind the default filesystem for ephemeral storage is ext4 (at
least for the Ubuntu 9.10 image I'm using).

What size instance were you running when you captured the above data?

-Alvin

Will Fleming

Oct 7, 2010, 12:19:18 PM
to mongod...@googlegroups.com
The instance size is large.

The occasional latency almost always happens on the inserts, not the
upserts. I've seen a few people suggest that writing zeros out to the
entire EBS volume first can improve performance, but I haven't seen a
definitive answer or solid numbers testing that theory.
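
For reference, the zeroing people describe is usually just a dd pass
over the raw block device before creating the filesystem, along these
lines (device name is a placeholder, and this destroys anything
already on the volume):

  # Touch every block on the EBS volume once by writing zeros across it.
  sudo dd if=/dev/zero of=/dev/sdf bs=1M
  sudo mkfs.ext4 /dev/sdf    # then mkfs and mount as usual

Presumably reading the whole device with dd would have a similar
warming effect without overwriting it, but I haven't measured either.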
