Problem with MongoDB GridFS performance


jack

Nov 22, 2010, 7:50:36 AM
to mongodb-user
Hi,
I'm building a distributed file system on top of MongoDB GridFS. Because
MongoDB memory-maps all of its data files, mongod's virtual memory grows
very fast. As long as it all fits in physical memory, performance is quite
good. But once virtual memory exceeds physical memory, performance degrades
sharply, even falling below plain disk IO.
Simply adding physical RAM is not an option, because the total size of the
data files may grow to dozens of TBs. Is there any other way to improve
this, or is it inherent in MongoDB?
Any suggestions would be appreciated.
Thanks in advance!
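
(A quick way to watch this happen, as a minimal sketch assuming the Python
driver and a mongod reachable on localhost:27017: serverStatus reports
resident, virtual and mapped memory in MB, so the mapped size can be
compared against physical RAM as GridFS data accumulates.)

    from pymongo import MongoClient

    client = MongoClient("localhost", 27017)
    # serverStatus "mem" section: values are in MB; "mapped" is reported
    # by memory-mapped storage (the engine in use here).
    mem = client.admin.command("serverStatus")["mem"]
    print("resident MB:", mem.get("resident"),
          "| virtual MB:", mem.get("virtual"),
          "| mapped MB:", mem.get("mapped"))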

Eliot Horowitz

Nov 22, 2010, 8:05:20 AM
to mongod...@googlegroups.com
Generally we haven't seen problems with large GridFS systems up to
100x physical RAM.
Can you send some more info on exactly what you're seeing, along with
mongostat and iostat output?


jack xiang

Nov 23, 2010, 10:19:35 AM
to mongod...@googlegroups.com
Hi Eliot,
Due to a shortage of resources, I only have 4 virtual machines for a
demo test. The deployment (1 mongos, 3 config servers and 2*2 shard
servers) is as follows:
vm1: mongos1, configserver1, shardserver11
vm2: configserver2, shardserver12
vm3: configserver3, shardserver21
vm4: shardserver22
Each VM has 1 GB of memory and 1 logical CPU assigned. I run my test
program on vm1. The test program simply saves 50 files (each with 10
chunks, each chunk 64 KB in size) into MongoDB and measures the total
time the writes take.
For comparison, I wrote a program that writes 500 files (each 64 KB in
size) to the local file system and measures the time that takes.
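
Roughly, the MongoDB side of the test looks like the following minimal
sketch (Python with pymongo/gridfs is assumed here only for illustration;
the database name and file naming are hypothetical, and this is not the
actual test program, which is attached later in the thread):

    import os
    import time

    import gridfs
    from pymongo import MongoClient

    client = MongoClient("localhost", 27017)   # mongos in the sharded setup
    fs = gridfs.GridFS(client["filestore"])    # "filestore" is a hypothetical db name

    FILES, CHUNKS_PER_FILE, CHUNK_BYTES = 50, 10, 64 * 1024
    payload = os.urandom(CHUNKS_PER_FILE * CHUNK_BYTES)   # 10 x 64 KB per file

    start = time.time()
    for i in range(FILES):
        # GridFS splits the payload into 64 KB chunks on insert
        fs.put(payload, filename="file-%d" % i, chunk_size=CHUNK_BYTES)
    print("wrote %d files in %.2f s" % (FILES, time.time() - start))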
At the beginning, MongoDB's performance looks quite good, even better
than writing directly to the local file system. I run the test program
again and again: the shard servers' virtual memory grows gradually and
MongoDB's performance degrades gradually too. Eventually, the shard
servers' virtual memory exceeds physical memory and MongoDB's
performance becomes unacceptable. Certainly, the VMs' unstable
performance may be a factor here, and mongos, the config server, the
shard server and the test program all compete for resources on the
same VM. But what I want to make sure of is this: apart from those
factors, is MongoDB's IO performance at least as good as plain disk IO
no matter how much virtual memory it uses? What should be kept in mind
when deploying a production environment? Can mongos, config servers,
shard servers and clients really be placed wherever you want, with no
influence on the whole system? Maybe I still need to find the best
practices.
Thank you in advance!

Eliot Horowitz

Nov 23, 2010, 3:48:22 PM
to mongod...@googlegroups.com
How much physical ram do you have?
I don't think that setup is going to give you very realistic
performance numbers.
Also, what are you using for virtualization?

jack xiang

Nov 25, 2010, 2:20:18 PM
to mongod...@googlegroups.com
Each VM has 1 GB of RAM assigned. One PC server runs the 4 VMs under
VMware Workstation. That's all.
Yes, I agree virtualization won't give realistic performance data.
I just want to compare MongoDB's IO with the file system's IO on the
same VM. From my observation, once mongod's virtual memory gets too
large, its IO is much slower than the file system's. mongos, the config
server, the shard server and the test client all run on the same VM, so
when the test client writes to MongoDB, all of them have to keep up to
finish the write. But when I run the other program writing to the file
system, only that program has to keep up. I suspect that may be the
main reason MongoDB performs poorly here.
Maybe I should write to a single mongod to make the comparison fair.

Eliot Horowitz

Nov 25, 2010, 3:10:12 PM
to mongod...@googlegroups.com
How much physical RAM?

jack xiang

Nov 29, 2010, 1:54:31 AM
to mongod...@googlegroups.com
The PC server has 8 GB of physical RAM. Each VM has 1 GB of RAM assigned.

Eliot Horowitz

Nov 29, 2010, 1:58:09 AM
to mongod...@googlegroups.com
What we've seen with many virtualized setups like this is that the VMs
aren't really well suited to this kind of workload.

Can you try running 4 mongo processes on different ports on the actual hardware?

jack xiang

Nov 29, 2010, 4:12:10 AM
to mongod...@googlegroups.com
OK. I'll do that when the hardware is ready.
Thank you!

jack xiang

Dec 1, 2010, 4:48:54 AM
to mongod...@googlegroups.com
Hi Eliot,
The hardware I applied for is not ready yet, and I don't want to wait
for it, so I made another comparison test: writing to the file system,
writing to a single mongod, and writing to an auto-sharded cluster.
On the same VM, writing 1000 files (each 512 KB in size) to the file
system takes about 80 seconds on average, and writing 1000 chunks
(each 512 KB in size) to a single mongod also takes about 80 seconds
on average.
But writing 1000 chunks (each 512 KB in size) to the auto-sharded
cluster takes more than 1000 seconds on average.
So I conclude that writing to a single mongod performs about the same
as writing to the file system, but writing to an auto-sharded cluster
is dramatically slower.
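
For illustration, a minimal sketch of this comparison, again assuming
pymongo/gridfs; the path and database name are hypothetical and the real
test program may differ:

    import os
    import time

    import gridfs
    from pymongo import MongoClient

    N, SIZE = 1000, 512 * 1024
    payload = os.urandom(SIZE)
    os.makedirs("/tmp/gridfs-bench", exist_ok=True)

    # 1) plain local filesystem baseline
    start = time.time()
    for i in range(N):
        with open("/tmp/gridfs-bench/file-%d" % i, "wb") as f:
            f.write(payload)
    print("filesystem: %.1f s" % (time.time() - start))

    # 2) GridFS (point the client at a single mongod, or at mongos for
    #    the sharded case)
    fs = gridfs.GridFS(MongoClient("localhost", 27017)["filestore"])
    start = time.time()
    for i in range(N):
        fs.put(payload, filename="bench-%d" % i)
    print("gridfs:     %.1f s" % (time.time() - start))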
Investigating further, I find that during the intensive insert phase
auto-sharding is also constantly splitting and migrating chunks. The
inserts alone are intensive enough to cause an IO bottleneck, let
alone the chunk splitting and migrating on top of them.
So is there a way to tune the auto-sharding policy more finely, so
that chunk splitting and migrating only run when the load is low? I
insert just 50 files (each with 20 chunks), and the chunks are split
13 times. For my requirements, chunk splitting and migrating happen
far too frequently.
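
One way to see how often splits have happened is to count the chunk
documents in the config database; a minimal sketch, assuming access
through mongos and that (as in this MongoDB version) chunk metadata is
keyed by an "ns" field, with a hypothetical namespace name:

    from pymongo import MongoClient

    config = MongoClient("localhost", 27017)["config"]   # connect through mongos
    ns = "filestore.fs.chunks"                            # sharded GridFS namespace
    print("chunk count for %s: %d"
          % (ns, config["chunks"].count_documents({"ns": ns})))
    # each document describes one chunk's key range and owning shard
    for doc in config["chunks"].find({"ns": ns}, {"min": 1, "max": 1, "shard": 1}):
        print(doc["shard"], doc["min"], "->", doc["max"])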
According to http://www.mongodb.org/display/DOCS/Sharding+Design:
"split - split a chunk that is growing too large into pieces. as the
two new chunks are on the same machine after the split, this is really
just a metadata update and very fast."
But from what I observe, that doesn't seem to hold. I added only one
shard, so there is no migrating, only splitting. Yet the performance
is still poor (about 800 seconds).
In fact, I don't really understand the relationship between the chunks
and the data files on disk. Under the dbpath there are 6 files for the
db, yet there are 28 chunks in the chunks collection of the config db,
and the number of chunks keeps growing as objects are inserted. I have
never seen it decrease, even when many objects are deleted. Dividing a
shard into chunks may make things complicated. Why not treat the whole
shard as one chunk and set a migration threshold on its capacity? For
example, if we take 20 GB as the maximum for a shard, then once it
reaches 16 GB, find a quiet time to migrate data until it is back down
to 10 GB. I know the load may not be proportional to the amount of
data, but it is really difficult to find one approach that meets every
kind of requirement without sacrificing other aspects.
All in all, IO performance is very important to my application. If
auto-sharding introduces performance degradation, I plan to use a
single replica set and do the sharding at the application level. On
the other hand, I still want to use mongos to keep the interface to
MongoDB simple. However, it seems to be both or neither for
auto-sharding and mongos. Is that right? If not, how can I achieve it?
If so, must the client take care of the replica set itself?

Eliot Horowitz

Dec 1, 2010, 8:30:03 AM
to mongod...@googlegroups.com
What mongo version are you running?

jack xiang

Dec 1, 2010, 8:46:36 PM
to mongod...@googlegroups.com
# ./mongod --version
db version v1.6.3, pdfile version 4.5
Thu Dec 2 08:58:22 git version: 278bd2ac2f2efbee556f32c13c1b6803224d1c01

Eliot Horowitz

Dec 1, 2010, 10:01:40 PM
to mongod...@googlegroups.com
Could you try 1.7.3?
Might be a significant improvement.

jack xiang

Dec 2, 2010, 2:12:06 AM
to mongod...@googlegroups.com
Hi Eliot,
With 1.7.3, writing 1000 chunks (each 512 KB in size) to the
auto-sharded cluster takes about 480 seconds, but occasionally more
than 1000 seconds. There seems to be no qualitative improvement.
Back to my earlier question: if I don't use auto-sharding but only a
single replica set, can I still use mongos? If yes, how do I achieve
that? If not, must the client take care of the replica set itself?

Eliot Horowitz

Dec 2, 2010, 4:32:25 AM
to mongod...@googlegroups.com
That's still strange. Can you send the mongo logs?

I'm not sure what you mean about using mongos. Do you mean for doing
replica set routing? The drivers can all handle that for you.
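
For example, with the Python driver a replica set can be addressed
directly, no mongos involved; a minimal sketch, with hypothetical host
names, set name and collection:

    from pymongo import MongoClient

    # Seed list of replica set members; the driver discovers the rest of
    # the set, routes writes to the primary, and retargets automatically
    # after a failover.
    client = MongoClient(
        "mongodb://host1:27017,host2:27017,host3:27017/?replicaSet=rs0")
    client["filestore"]["probe"].insert_one({"ok": True})   # sent to the current primary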

jack xiang

Dec 2, 2010, 9:58:15 PM
to mongod...@googlegroups.com
Hi Eliot,
I ran the test again. With 1.7.3, this time writing 1000 chunks (each
512 KB in size) to the auto-sharded cluster took 1173 seconds.
The first attachment is the shard server's log, the second the config
server's, the third the mongos log, and the last the source code of my
test program.

jack xiang

Dec 2, 2010, 10:14:15 PM
to mongod...@googlegroups.com
It seems the attachments were lost, so I'm uploading them again.

jack xiang

Dec 2, 2010, 10:24:04 PM
to mongod...@googlegroups.com
It seems I can only upload one attachment at a time, so I've packed all
the files into a single archive.

jack xiang

Dec 3, 2010, 3:56:42 AM
to mongod...@googlegroups.com
Hi Eliot,
I raised the sharding chunk size to 2 GB and then found there is no
more splitting when I insert 512 MB of data. But the performance is
the same as before, so at least I now know that chunk splitting is not
the root cause of the degradation.
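
For reference, a minimal sketch of how such a chunk-size change can be
applied, assuming it is done against the config database through mongos
(pymongo syntax here; the same setting is normally changed from the mongo
shell). The value is in MB, and a larger chunk size means fewer splits
during a bulk load at the cost of coarser balancing:

    from pymongo import MongoClient

    config = MongoClient("localhost", 27017)["config"]   # connect through mongos
    config["settings"].update_one(
        {"_id": "chunksize"},
        {"$set": {"value": 2048}},   # sharding chunk size, in MB
        upsert=True)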
The attachment is the shard server's log; you can see there is no
chunk splitting any more. Instead, allocating and filling new data
files seems to take a lot of time, more than 600 seconds in total for
the following files:
-rw------- 1 root root 67108864 2010-12-03 15:30 storage.0
-rw------- 1 root root 134217728 2010-12-03 15:15 storage.1
-rw------- 1 root root 268435456 2010-12-03 15:24 storage.2
-rw------- 1 root root 536870912 2010-12-03 15:30 storage.3
-rw------- 1 root root 536870912 2010-12-03 15:27 storage.4
I know preallocation is inherent to MongoDB. I just want to know if
there is a way to avoid preallocation during high-load periods, or if
I can make the data files even bigger than 2 GB. For a distributed
file system, 2 GB is still far from the total size.

Nat

Dec 3, 2010, 4:12:35 AM
to mongodb-user
Do you use ext4? It's much better at allocating data blocks.


jack xiang

Dec 3, 2010, 4:44:57 AM
to mongod...@googlegroups.com
I'm using ext3. I'll try ext4 later.

jack xiang

Feb 10, 2011, 1:22:24 AM
to mongod...@googlegroups.com
Using a real physical server with ext4, the performance looks quite good.
Preallocating a 2 GB data file takes less than 1 ms, and IO performance
no longer depends on the ratio of RAM to the total data size.
Thanks!