[mongodb-user] Speed pymongo vs. Ming vs. Mongokit


Andreas Jung

Apr 21, 2010, 12:09:58 PM
to mongod...@googlegroups.com

Hi there,

I made the following trivial benchmark:

100,000 insertions of a dict like

d = {
'name' : 'Andreas Jung',
'street' : 'xxxxxxxxxxxxx',
'city' : 'yyyyyyyyyy',
'zip' : 72070,
}

into MongoDB.

Average insertion time on my Linux box

Intel(R) Core(TM)2 Duo CPU E6750 @ 2.66GHz

pymongo: 9-10 seconds
Ming: 40-42 seconds
Mongokit: 80 seconds

Using MongoDB 1.4.1
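
A benchmark of this shape could be sketched roughly as below. This is not the script behind the numbers above; `time_inserts` and the plain list used as a stand-in sink are hypothetical names, chosen so the sketch runs without a database. To time a real driver or ORM, one would pass its insert or save callable instead.

```python
import time


def time_inserts(insert, n=100_000):
    """Call insert(doc) n times and return the elapsed wall-clock seconds."""
    start = time.perf_counter()
    for _ in range(n):
        # Build a fresh dict on every iteration: drivers such as pymongo add
        # an _id to the dict they are given, so reusing one dict would make
        # every insert after the first carry the same _id.
        insert({
            'name': 'Andreas Jung',
            'street': 'xxxxxxxxxxxxx',
            'city': 'yyyyyyyyyy',
            'zip': 72070,
        })
    return time.perf_counter() - start


# Stand-in sink (assumption): a list, so the loop itself can be exercised.
demo_sink = []
demo_elapsed = time_inserts(demo_sink.append, n=1000)
print('%d inserts in %.4f s' % (len(demo_sink), demo_elapsed))
```

Injecting the insert callable keeps the loop identical across the three candidates, so any difference in the result comes from the library under test rather than from the harness.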

The overhead of both ORMs seems to be unacceptable for practical usage.

Does anyone have other performance experiences with either Ming or MongoKit?

Andreas



--
ZOPYX Limited | zopyx group
Charlottenstr. 37/1 | The full-service network for Zope & Plone
D-72070 Tübingen | Produce & Publish
www.zopyx.com | www.produce-and-publish.com
------------------------------------------------------------------------
E-Publishing, Python, Zope & Plone development, Consulting



--
You received this message because you are subscribed to the Google Groups "mongodb-user" group.
To post to this group, send email to mongod...@googlegroups.com.
To unsubscribe from this group, send email to mongodb-user...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/mongodb-user?hl=en.


Andreas Jung

Apr 21, 2010, 12:21:26 PM
to mongod...@googlegroups.com

Update: using skip_validation=True with MongoKit brings the benchmark
down to 50 seconds, which is still unacceptable.

Andreas

Flaper87

Apr 21, 2010, 12:23:03 PM
to mongod...@googlegroups.com
Could you please share the code used for benchmarking?

How much RAM does your Linux box have?

Thanks


--
Flavio Percoco Premoli, A.K.A. [Flaper87]
http://www.flaper87.org
Usuario Linux registrado #436538
Geek by nature, Linux by choice, Archer of course.
Key Fingerprint: CFC0 C67D FF73 463B 7E55  CF43 25D1 E75B E2DB 15C7
The Solution to everything:
python -c "from struct import pack; print  pack('5b', (41*len('99')), pow(8,2)+20, 4900**0.5, range(78)[-1], 10)"

Michael Schurter

Apr 21, 2010, 12:24:26 PM
to mongod...@googlegroups.com
It appears both Ming and MongoKit default to safe=True on saves which
slows things down considerably. Did you always use the same value for
safe in your benchmarks?

Andreas Jung

Apr 21, 2010, 12:33:18 PM
to mongod...@googlegroups.com

http://pastie.org/928161
http://pastie.org/928162
http://pastie.org/928164

The machine has 4 GB of RAM and was almost idle. I did several runs
and checked for delays caused by the allocation of new data files.

Andreas


Andreas Jung

Apr 21, 2010, 12:37:06 PM
to mongod...@googlegroups.com

Michael Schurter wrote:
> It appears both Ming and MongoKit default to safe=True on saves which
> slows things down considerably. Did you always use the same value for
> safe in your benchmarks?
>
>

Not sure what you mean. I cannot find anything related to 'safe=True'
in the MongoKit or Ming docs.

Andreas

Michael Schurter

Apr 21, 2010, 12:40:44 PM
to mongod...@googlegroups.com
grep the source:

~/src/mongokit$ grep safe mongokit/*.py
mongokit/document.py: def save(self, uuid=False, validate=None, safe=True, *args, **kwargs):
mongokit/document.py: id = self.collection.save(self, safe=safe, *args, **kwargs)

~/src/Ming-0.2$ grep safe= ming/*.py
ming/session.py: return self._impl(cls).update(spec, fields, upsert, safe=True)
ming/session.py: dict(_id=doc._id), {'$set':values}, safe=True)
ming/session.py: result = self._impl(doc).save(data, safe=True)
ming/session.py: bson = self._impl(doc).insert(data, safe=True)
ming/session.py: safe=True)
ming/session.py: self._impl(doc).remove({'_id':doc._id}, safe=True)
ming/session.py: impl.update({'_id':doc._id}, {'$set':fields_values}, safe=True)

pymongo defaults to safe=False.

I would suspect that's the vast majority of the speed difference.
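
Why safe=True costs time can be illustrated with a toy model. In MongoDB of this era a safe write meant that the driver followed the insert with a getlasterror command and waited for the reply, whereas an unsafe write was sent without waiting. The sketch below only counts messages and blocking waits; `FakeConnection`, `insert` and `run` are hypothetical names and none of this is pymongo code.

```python
class FakeConnection:
    """Toy stand-in for a driver connection (assumption, not a real API)."""

    def __init__(self):
        self.messages_sent = 0
        self.blocking_waits = 0

    def send(self, message):
        # Fire-and-forget: hand the message over and return immediately.
        self.messages_sent += 1

    def send_and_wait(self, message):
        # Request/response: the caller blocks until the server has replied.
        self.messages_sent += 1
        self.blocking_waits += 1
        return {'ok': 1, 'err': None}


def insert(conn, doc, safe=False):
    """Send an insert; with safe=True also ask the server whether it failed."""
    conn.send(('insert', doc))
    if safe:
        reply = conn.send_and_wait(('getlasterror',))
        if reply['err'] is not None:
            raise RuntimeError(reply['err'])


def run(n, safe):
    """Insert n documents and return (messages sent, blocking waits)."""
    conn = FakeConnection()
    for i in range(n):
        insert(conn, {'i': i}, safe=safe)
    return conn.messages_sent, conn.blocking_waits


print(run(1000, safe=False))  # (1000, 0)
print(run(1000, safe=True))   # (2000, 1000)
```

In this model every safe insert adds one full round trip during which the client does nothing else, which is why a fair comparison needs the same safe setting on every candidate.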

Andreas Jung

Apr 21, 2010, 12:49:32 PM
to mongod...@googlegroups.com

Michael Schurter wrote:

> pymongo defaults to safe=False.
>
> I would suspect that's the vast majority of the speed difference.

Right. pymongo with safe=True slows the pymongo benchmark down to
15 seconds, which still leaves the ORMs with an overhead of 200%-300%.

Andreas
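
The ratios behind that overhead estimate can be worked out from the timings reported so far in the thread. The sketch below assumes a midpoint for the one range involved (41 s for Ming's 40-42 s) and takes pymongo with safe=True as the like-for-like baseline; the helper names are illustrative only.

```python
BASELINE = 15.0  # seconds: pymongo with safe=True, as reported above

# Seconds for 100,000 inserts, taken from the messages in this thread.
ORM_TIMINGS = {
    'Ming': 41.0,                             # assumed midpoint of 40-42 s
    'MongoKit (skip_validation=True)': 50.0,
    'MongoKit': 80.0,
}


def slowdown(seconds, baseline=BASELINE):
    """How many times longer than the baseline the run took."""
    return seconds / baseline


def overhead_percent(seconds, baseline=BASELINE):
    """Extra time relative to the baseline, as a percentage."""
    return (seconds - baseline) / baseline * 100.0


for label, seconds in ORM_TIMINGS.items():
    print('%-32s %.2fx slower, %.0f%% overhead'
          % (label, slowdown(seconds), overhead_percent(seconds)))
```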

Nicolas Clairon

Apr 21, 2010, 1:40:51 PM
to mongod...@googlegroups.com
Hi!

I'm the MongoKit author. Thank you for making this benchmark. I
designed MongoKit to be very light, and when you skip the validation,
the performance is pretty close to pymongo's (someone made a benchmark
in the past, but I can't find the pointer).

So, I would be very interested in looking into your benchmark script.
This will help improve MongoKit.

N.

Andreas Jung

Apr 21, 2010, 4:01:19 PM
to mongod...@googlegroups.com



-------- Original Message --------
From: Andreas Jung <li...@zopyx.com>
Subject: Re: [mongodb-user] Speed pymongo vs. Ming vs. Mongokit
Date: Wed, 21 Apr 2010 18:33:18 +0200
To: mongod...@googlegroups.com

Stephen Eley

Apr 21, 2010, 9:19:13 PM
to mongod...@googlegroups.com
On Wed, Apr 21, 2010 at 12:09 PM, Andreas Jung <li...@zopyx.com> wrote:
>
> 100.000 insertions of a dict like
> [...]
> The overhead for both ORMs seems to unacceptable for practical usage.

In practical usage, will your user base be attempting to insert
100,000 records in less than a minute?

If they're not, your criterion isn't based on practical usage. Which
is fine, but a framework isn't "unacceptable" for failing to meet an
artificial benchmark. The reason to use a framework isn't speed. And
if speed is acceptable in the framework's actual use cases, then
overhead in a brute force trial isn't a reason *not* to use one. A
tenth of a millisecond for a database insert is *probably* not going
to be the make-or-break bottleneck in your request durations.
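
The per-insert figure can be checked against the totals reported in the first message. The sketch below takes the upper end of each reported range (10 s, 42 s, 80 s) as an assumption; the helper names are illustrative.

```python
N_INSERTS = 100_000  # size of the benchmark described in the thread


def per_insert_ms(total_seconds, n=N_INSERTS):
    """Average time per insert in milliseconds."""
    return total_seconds / n * 1000.0


def inserts_per_second(total_seconds, n=N_INSERTS):
    """Sustained insert rate implied by the total run time."""
    return n / total_seconds


# Totals in seconds; upper end of each reported range (assumption).
for label, seconds in [('pymongo', 10.0), ('Ming', 42.0), ('MongoKit', 80.0)]:
    print('%-8s %.2f ms per insert, %.0f inserts/s'
          % (label, per_insert_ms(seconds), inserts_per_second(seconds)))
```

On these numbers a tenth of a millisecond corresponds to the plain pymongo run, while the ORM runs come out at roughly 0.4 ms and 0.8 ms per insert.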



--
Have Fun,
Steve Eley (sfe...@gmail.com)
ESCAPE POD - The Science Fiction Podcast Magazine
http://www.escapepod.org

Andreas Jung

Apr 21, 2010, 11:24:15 PM
to mongod...@googlegroups.com

Stephen Eley wrote:
> On Wed, Apr 21, 2010 at 12:09 PM, Andreas Jung <li...@zopyx.com> wrote:
>> 100.000 insertions of a dict like
>> [...]
>> The overhead for both ORMs seems to unacceptable for practical usage.
>
> In practical usage, will your user base be attempting to insert
> 100,000 records in less than a minute?
>
> If they're not, your criterion isn't based on practical usage.

With respect, such a comment is plain nonsense. We are working on an
application with very high read and write rates, and performance
really matters for certain parts of the application. If an ORM
limits the throughput by a factor of two or three, then this is not
acceptable (it means you have to throw more hardware at the project
in order to scale your application because of the speed of the
underlying ORM).

Andreas

Stephen Eley

Apr 22, 2010, 12:03:01 AM
to mongod...@googlegroups.com
On Wed, Apr 21, 2010 at 11:24 PM, Andreas Jung <li...@zopyx.com> wrote:
>
> With respect but such a comment is barely nonsense. We are working on an
> application with a very high write read and write rate and performance
> really matters for certain parts of the application. And if an ORM
> limits the throughput by a factor of two or three than this is not
> acceptable (which means you have to throw more hardware into the project
> in order to scale your application because of the speed of the
> underlying ORM).

So you've done profiling of your application in production or under
simulated production conditions, and determined that the overhead of
the ORM is your main bottleneck?

If so, why bother with an artificial benchmark? Why not tell us about
the conditions under which your _application_ is suffering, and which
ORM you're using for it? There may be ways to tune the ORMs'
performance, or to drop the ORM abstractions and get closer to the
metal for those particular chokepoints.

Premature optimization is the root of many problems in software.
Saying "X is many times slower than Y" isn't a reason to avoid using X
all by itself. Python is many times slower than writing in assembler,
but it offers benefits that justify the performance hit. The question
that matters is whether the speed of X is a *problem.*





Nicolas Clairon

Apr 22, 2010, 5:12:06 AM
to mongod...@googlegroups.com
While I think Stephen is right, I agree that there's room in MongoKit
for optimization. I have already found some bottlenecks; fixing them
brings MongoKit from 49s to 29s. I have planned major work on MongoKit
for next week (I have some tickets to close) and I'll make this
optimization then.

While I designed MongoKit to be light and fast, keep in mind that
nothing will be faster than a regular dict. MongoKit brings some
useful features like i18n, polymorphism, dot notation or
auto-reference. But all these features cost a little performance.
That's why I designed the API to give access to the underlying
pymongo API.

So, if at some point you need speed and don't care about all the
features, you can use pymongo's features directly, without MongoKit's.
For example:

>>> obj = connection.mydb.mycol.MyObj.find_one() # <--- mongokit's way
>>> obj = connection.mydb.mycol.find_one() # <--- pymongo way. Really fast !
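
The trade-off described above, features versus the speed of a regular dict, can be illustrated with a toy per-document type check. This is a sketch under stated assumptions, not MongoKit's validation code: `STRUCTURE`, `validate`, `save_raw` and `save_validated` are hypothetical names, and a list stands in for the collection.

```python
import timeit

# Expected field types for the benchmark document (hypothetical schema).
STRUCTURE = {'name': str, 'street': str, 'city': str, 'zip': int}

DOC = {
    'name': 'Andreas Jung',
    'street': 'xxxxxxxxxxxxx',
    'city': 'yyyyyyyyyy',
    'zip': 72070,
}


def validate(doc, structure=STRUCTURE):
    """Raise if doc has missing or unknown fields, or a value of the wrong type."""
    if doc.keys() != structure.keys():
        raise ValueError('unexpected or missing fields: %r'
                         % sorted(doc.keys() ^ structure.keys()))
    for key, expected in structure.items():
        if not isinstance(doc[key], expected):
            raise TypeError('%s must be %s' % (key, expected.__name__))


def save_raw(sink, doc):
    """Store the dict as-is, the way a bare driver call would."""
    sink.append(doc)


def save_validated(sink, doc):
    """Check the dict against the schema first, then store it."""
    validate(doc)
    sink.append(doc)


raw = timeit.timeit(lambda: save_raw([], DOC), number=10000)
checked = timeit.timeit(lambda: save_validated([], DOC), number=10000)
print('raw: %.4f s, validated: %.4f s' % (raw, checked))
```

Keeping both paths available mirrors the design choice above: pay for the checks where they are wanted, and call the lower-level path where raw throughput matters more.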

Thank you for pointing out this speed issue, Andreas. I'll take it
really seriously. If you have some optimization skills, feel free to
contact me. You can also send patches; I'll be pleased to
integrate them.

N.

Andreas Jung

Apr 23, 2010, 12:55:07 AM
to mongod...@googlegroups.com

Nicolas Clairon wrote:
> While I think Stephen is right, I agree that there's room in MongoKit
> for optimization. I already found some bottlenecks that bring MongoKit
> from 49s to 29s. I plained big work on MongoKit next week (I got some
> tickets to closed) and I'll make this optimisation.

I am happy to check the performance improvements.

I only partly agree with Stephen, since I know the throughput we have
to deal with and I know how far we can get with one of the ORMs. Yes,
throughput is very important in our case.

-aj

Nicolas Clairon

Apr 28, 2010, 6:52:03 AM
to mongod...@googlegroups.com
Hi,

Just to say that I have pushed a big optimisation for MongoKit to
Bitbucket. On my machine I gain 20s on your benchmark.

Feel free to confirm, and to raise any objections that could help make MongoKit better.

N.

Flaper87

Apr 28, 2010, 9:16:47 AM
to mongod...@googlegroups.com


2010/4/28 Nicolas Clairon <cla...@gmail.com>
> Just to say that I just pushed a big optimisation for MongoKit in
> bitbucket. In my machine I win 20s on your benchmark.
>
> Feel free to confirm and add any objection for making Mongokit better.


Great work Nicolas.

Awesome. 



Andreas Jung

Apr 28, 2010, 9:33:00 AM
to mongod...@googlegroups.com

Nicolas Clairon wrote:
> Hi,
>
> Just to say that I just pushed a big optimisation for MongoKit in
> bitbucket. In my machine I win 20s on your benchmark.
>
> Feel free to confirm and add any objection for making Mongokit better.
>
>

Very cool!

Thanks,
Andreas