Inserts are very slow


WC

May 11, 2012, 3:00:27 PM
to mongodb-user
Hi, I am very new to MongoDB... I have always used MySQL or MSSQL, and
wanted to try this super-fast NoSQL database.

For reading it is great; it seems to perform much better... but for
writing I don't know what is going on... it is very, very slow.

I have tested my script with different configurations (safe=true,
safe=false, journaling, no journaling) and there is not much
improvement.
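For reference, this is roughly how I toggled safe mode (a minimal sketch in pymongo 2.x style, not my full script; 'docs' is just an example collection):

from pymongo import Connection

docs = Connection('127.0.0.1').test.docs
docs.insert({'key': 1}, safe=False)  # fire-and-forget: no server acknowledgement
docs.insert({'key': 2}, safe=True)   # waits for the server to acknowledge the write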

Here is just a small example comparing MySQL and MongoDB:

python2.4 mysql.py --keys 10000 --host 127.0.0.1 --gets 100 --focus 100 --do_insert
10000 documents inserted took 0.22s
100 gets (focussed on bottom 100%) took 0.15s

python2.4 bench.py --keys 10000 --host 127.0.0.1 --gets 100 --focus 100 --do_insert
10000 documents inserted took 1.20s
100 gets (focussed on bottom 100%) took 0.07s

I am running:
CentOS 5.8, 64-bit
1GB RAM
MySQL 5.0.95
MongoDB 2.0.5
Both running with their default installation configuration files.

Can someone point me in the right direction? I really want MongoDB to
be faster than MySQL!!

Thanks,

WC

WC

May 11, 2012, 3:11:39 PM
to mongodb-user
Additionally, here is the iostat output from the MySQL run:

avg-cpu:  %user  %nice  %system  %iowait  %steal  %idle
           6.96   0.35     1.26     3.13    0.00  88.31

Device:  rrqm/s  wrqm/s   r/s    w/s  rMB/s  wMB/s  avgrq-sz  avgqu-sz   await  svctm  %util
sda        0.41   56.06  1.44   2.27   0.03   0.23    145.83      0.25   67.60  12.89   4.78
sda1       0.02    0.00  0.00   0.00   0.00   0.00     20.47      0.00   24.51  24.04   0.00
sda2       0.40   56.06  1.43   2.27   0.03   0.23    145.89      0.25   67.62  12.89   4.78
dm-0       0.00    0.00  1.69  58.04   0.03   0.23      9.00     19.50  326.53   0.79   4.71
dm-1       0.00    0.00  0.14   0.30   0.00   0.00      8.00      0.06  130.92   2.21   0.10

and from the MongoDB run:

avg-cpu:  %user  %nice  %system  %iowait  %steal  %idle
          25.16   0.00     3.77     0.00    0.00  71.07

Device:  rrqm/s  wrqm/s   r/s     w/s  rMB/s  wMB/s  avgrq-sz  avgqu-sz  await  svctm  %util
sda        0.00  135.85  0.00   40.88   0.00   0.69     34.58      1.40  34.31   5.74  23.46
sda1       0.00    0.00  0.00    0.00   0.00   0.00      0.00      0.00   0.00   0.00   0.00
sda2       0.00  135.85  0.00   40.88   0.00   0.69     34.58      1.40  34.31   5.74  23.46
dm-0       0.00    0.00  0.00  176.73   0.00   0.69      8.00      2.26  12.80   1.33  23.46
dm-1       0.00    0.00  0.00    0.00   0.00   0.00      0.00      0.00   0.00   0.00   0.00

Adrien Mogenet

May 11, 2012, 4:20:48 PM
to mongod...@googlegroups.com
Please use pastebin or similar to share formatted output :)

Did you pre-allocate the MongoDB storage files? Their creation can take a while.
Did you insert documents one by one or in a bulk insert? (See the sketch below for the difference.)
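For example, in pymongo the two approaches look roughly like this (a sketch only; 'docs' stands in for your collection):

# One by one: a network round trip per document.
for i in range(1000):
    docs.insert({'key': i})

# Bulk: a single insert call with a list of documents.
docs.insert([{'key': i} for i in range(1000)])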

WC

May 11, 2012, 4:29:23 PM
to mongodb-user
Thanks for your answer.

Pastebin: http://pastebin.com/G8dDT2Nt

I really don't know how to preallocate storage files. Remember, I am
very new.
I insert the documents in bulk, with a batch size of 100.

I just changed the MySQL table engine to InnoDB, and the write results
are similar to MongoDB's... but I still think I am doing something
wrong!!! Help please...

Thanks,
WC

WC

May 14, 2012, 10:59:09 AM
to mongodb-user
Can anyone point me in the right direction?

Sam Millman

May 14, 2012, 11:09:02 AM
to mongod...@googlegroups.com
What are the indexes etc on your collection looking like?


WC

May 14, 2012, 11:19:21 AM
to mongodb-user
Here is the script:

from time import time
import datetime
from pymongo import ASCENDING

if do_insert:
    db.drop_collection('docs')
    docs = db.docs
    docs.create_index([('key', ASCENDING)])
    batch_size = 100
    i = 0
    start = time()
    while i < keys:
        to_insert = []
        for j in range(batch_size):
            to_insert.append({
                'key': i,
                'text': 'Mary had a little lamb. ',
                'author': 'Mike',
                'text2': 'Another post!',
                'tags': ['bulk', 'insert'],
                'date': datetime.datetime(2009, 11, 12, 11, 14),
            })
            i += 1  # the inner loop already advances i by batch_size per batch
        docs.insert(to_insert)
    end = time()
    print '%d documents inserted took %.2fs' % (keys, end - start)


Spencer T Brody

May 14, 2012, 4:36:17 PM
to mongod...@googlegroups.com
The most likely culprit is data file preallocation. What file system are you running the test on? We recommend running MongoDB on ext4 or xfs, as they support fast allocation of new datafiles.
If you run the test a second time (dropping the collection but not the database), does it run faster the second time?

Wilmar Campos

May 14, 2012, 4:47:05 PM
to mongod...@googlegroups.com
I am running the default CentOS install:

[root@dev etc]# more fstab 
/dev/VolGroup00/LogVol00 /                       ext3    defaults        1 1
LABEL=/boot             /boot                   ext3    defaults        1 2
tmpfs                   /dev/shm                tmpfs   defaults        0 0
devpts                  /dev/pts                devpts  gid=5,mode=620  0 0
sysfs                   /sys                    sysfs   defaults        0 0
proc                    /proc                   proc    defaults        0 0
/dev/VolGroup00/LogVol01 swap                    swap    defaults        0 0

EXT3 on a logical volume.

As you can see, I drop the collection.
With a fresh database, there is no noticeable change the first time.

Thanks for looking at this!!!


--
Wilmar Campos

Spencer T Brody

May 14, 2012, 5:12:10 PM
to mongod...@googlegroups.com
Yeah, the problem with ext3 is that when we allocate new data files, they have to be filled with zeroes that are written back to the underlying disk. On ext4 and xfs there is a fallocate syscall we use that marks the file as allocated and zeroed without actually writing anything back to the underlying storage. I would expect you to see much better insert performance on ext4 than on ext3.

One other thing you can try is adding a line to the script that runs the compact command on the docs collection, after the line that says "db.drop_collection('docs')". This should let successive runs of the script reuse the datafiles that were allocated in the previous run. I am assuming that this is the only thing running on this mongod instance; if not, please be aware that compact is a blocking operation, and the server will not be able to process any requests until the compaction is finished.
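In pymongo that line could look something like this (a sketch; 'db' is the pymongo Database object, and note that the collection must exist when the command runs):

# Sends {compact: 'docs'} to the server via the generic command helper.
# compact blocks the mongod while it works, so only do this on a test instance.
db.command('compact', 'docs')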

Wilmar Campos

May 14, 2012, 5:26:23 PM
to mongod...@googlegroups.com
I will test and let you know!!! thanks!!

Wilmar Campos

May 14, 2012, 6:00:36 PM
to mongod...@googlegroups.com
Hi, after changing the file system to ext4... the results are worse.

python2.4 mysql.py --keys 10000 --host 127.0.0.1 --gets 100 --focus 100 --do_insert
10000 documents inserted took 0.52s
100 gets (focussed on bottom 100%) took 0.33s

python2.4 bench.py --keys 10000 --host 127.0.0.1 --gets 100 --focus 100 --do_insert
10000 documents inserted took 2.12s
100 gets (focussed on bottom 100%) took 0.13s

Reading continues to be very fast... but on writing... something is off.....
--
Wilmar Campos

WC

May 14, 2012, 8:44:48 PM
to mongodb-user
I even moved the dbpath to a tmpfs filesystem (the /dev/shm tmpfs
mount). The insert now took:

python2.4 bench.py --keys 10000 --host 127.0.0.1 --gets 100 --focus 100 --do_insert
10000 documents inserted took 1.18s
100 gets (focussed on bottom 100%) took 0.05s

Before it was taking 2.12s... but still not comparable to MySQL!!

Could it be the pymongo driver?



Sam Millman

May 15, 2012, 3:16:51 AM
to mongod...@googlegroups.com
So where did you move it from? And I take it the tmpfs is some sort of other file system on another drive, right?

Timothy Hawkins

May 15, 2012, 3:25:37 AM
to mongod...@googlegroups.com
If the test algo is simple, it may be worth coding it in another language, such as PHP or JavaScript, to eliminate the language and driver as a bottleneck. You can install a PHP CLI runtime and the mongo driver on most Linux distros with little effort, and comparing the results might be instructive. JavaScript can be run directly by the mongo shell, which already knows how to talk to mongo.

Sent from my iPad

Spencer T Brody

May 15, 2012, 11:01:57 AM
to mongod...@googlegroups.com
Can you try running mongostat for the duration of your test script and attach the output (using something like pastebin)?  Can you also attach the script you're using to do the inserts to MySQL?
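Something like this, assuming mongod is on the default port (the trailing 1 is the polling interval in seconds):

mongostat --host 127.0.0.1 1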

WC

May 15, 2012, 11:49:27 AM
to mongodb-user
Spencer, here is the code for MySQL:

from time import time
import datetime

if do_insert:
    cursor.execute("TRUNCATE TABLE keyvalue")
    batch_size = 100
    i = 0
    start = time()
    sql = "INSERT INTO keyvalue(`key`, value, author, text2, tags, date) VALUES %s"
    sql %= ",".join(["(%s, %s, %s, %s, %s, %s)"] * batch_size)
    while i < keys:
        to_insert = []
        for j in range(batch_size):
            to_insert += [i, 'Mary had a little lamb. ', 'Mike',
                          'My First Blog Post!', 'mongodb, python',
                          datetime.datetime(2009, 11, 12, 11, 14)]
            i += 1  # the inner loop already advances i by batch_size per batch
        cursor.execute(sql, to_insert)
        connection.commit()
    end = time()
    print '%d documents inserted took %.2fs' % (keys, end - start)

Also, here is the mongostat output for the insert of 100,000 records:
python2.4 bench.py --keys 100000 --host 127.0.0.1 --gets 100 --focus 100 --do_insert
100000 documents inserted took 21.78s
100 gets (focussed on bottom 100%) took 0.09s

http://pastebin.com/RiR7ZsYK


Thank you guys for looking at this!!

Wilmar

WC

May 15, 2012, 11:50:45 AM
to mongodb-user
I moved it from ext3 to ext4... it got worse.
Then I used a RAM filesystem; it improved, but still was not as fast
as MySQL.
Then I used xfs and got good results, like the RAM fs.


Spencer T Brody

May 15, 2012, 12:21:38 PM
to mongod...@googlegroups.com
From the mongostat output you pasted, it seems like mongo is easily keeping up with the load - there are no queued writers and the lock% is quite low. It certainly seems like the problem is at the filesystem level, though I would expect ext4 and xfs to perform comparably.

Another thing you should check is whether you are running pymongo with the C extensions; pymongo performs much worse without them. There are instructions on installing the C extensions here: http://api.mongodb.org/python/current/installation.html. You can tell whether you have the C extensions set up by calling pymongo.has_c() from Python after importing pymongo.


Spencer T Brody

May 15, 2012, 12:43:16 PM
to mongod...@googlegroups.com
Just checked with the pymongo guys - when you check pymongo.has_c(), you should also check bson.has_c().
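For reference, a quick check for both might look like this (a sketch):

import pymongo
import bson

# Both should print True if the C extensions were built and installed.
print pymongo.has_c()
print bson.has_c()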

Wilmar Campos

May 15, 2012, 3:46:35 PM
to mongod...@googlegroups.com
now we are talking baby!!!!

[root@dev ~]# python2.4 bench.py --keys 100000 --host 127.0.0.1 --gets 100 --focus 100 --do_insert
100000 documents inserted took 4.58s
100 gets (focussed on bottom 100%) took 0.07s

[root@dev ~]# python2.4 mysql.py --keys 100000 --host 127.0.0.1 --gets 100 --focus 100 --do_insert
100000 documents inserted took 12.19s
100 gets (focussed on bottom 100%) took 4.33s

THANK YOU SPENCER FOR POINTING THAT OUT!!!

I see the difference now with the C extensions and the xfs file system. Thank you all!!!!

Thanks,

WC


--
Wilmar Campos

Spencer T Brody

May 15, 2012, 3:57:46 PM
to mongod...@googlegroups.com
Ha ha, my pleasure - glad we were able to get to the bottom of it!

Cheers,
-Spencer
