MongoDB $push performance behavior?

Nikolay

Aug 8, 2011, 11:39:32 AM
to mongodb-user
Hey all,
I started playing with mongo around a week ago and it has been a
fantastic experience, mongo rocks!
I hope I can get some help with an issue. I've been struggling for the
past couple of days to understand $push's performance. I expected that
since we append something at the end of the array we would have a
constant time operation and it won't depend on the size of the array.
However it turned out that $push performance has a linear correlation
to the size of the array.
My experiment was the following:

-- I created a collection with one document, created an array
inside it, and filled it with 40 000 relatively small elements by calling
$push 40 000 times. Then I started the profiler and made another $push;
the operation finished in around 800 ms

-- I repeated the operation, but this time with 10 000 elements
instead of 40 000; the next $push finished in around 200 ms

-- In the end I tried it with 1000 elements and the next $push was
done in 20 ms
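
As a quick sanity check on the three timings above, here is a small sketch that divides each measured time by the array size. The numbers are copied from the experiment; the names `measurements` and `ms_per_element` are only illustrative.

```python
# (array size before the extra $push, measured time of that $push in ms)
measurements = [(1000, 20), (10000, 200), (40000, 800)]


def ms_per_element(samples):
    """Return the $push time divided by array size for each sample."""
    return [ms / size for size, ms in samples]


ratios = ms_per_element(measurements)
for (size, ms), ratio in zip(measurements, ratios):
    print(f"{size:>6} elements: {ms:>4} ms -> {ratio:.3f} ms per element")
```

If the cost were constant, the ratio would shrink as the array grows; here it stays the same for all three sizes.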

You see the pattern. I read about the PaddingFactor, in my case it was
1.0099999. Maybe this is where the performance issue comes from, but I
would assume that after getting 40 K $pushes into that array, mongo
would be smart enough to make some space for some more, so it doesn't
have to move the document that often.
Can you guys tell me what I am missing here?

Thanks all so much in advance!
Best Regards,
Nikolay

Chris Westin

Aug 8, 2011, 2:08:48 PM
to mongodb-user
There are two things happening here.

When you update the document, it is getting larger, so once the data
doesn't fit in place anymore, it will get moved. That padding factor
looks too small to be of help. It may be that the padding factor
doesn't change enough because you are only making the tiniest of
changes per-update (pushing one more value onto the end of the array).
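
To make the first point concrete, here is a toy model of the move behavior, not the server's actual allocator. It assumes that a document is moved whenever it outgrows its allocated space, and that the new space is the document size times the padding factor. The function name `simulate_pushes` and the sizes `elem_bytes` and `base_bytes` are made up for illustration.

```python
def simulate_pushes(n_pushes, elem_bytes=20, base_bytes=32, padding_factor=1.01):
    """Toy model: count how often a growing document has to be moved.

    Assumption (not taken from the server source): on each move the new
    allocation is int(document size * padding_factor).
    """
    size = base_bytes
    allocated = int(size * padding_factor)
    moves = 0
    bytes_copied = 0
    for _ in range(n_pushes):
        size += elem_bytes          # one $push appends one small element
        if size > allocated:        # no longer fits in place
            moves += 1
            bytes_copied += size    # the whole document is rewritten
            allocated = int(size * padding_factor)
    return moves, bytes_copied


for factor in (1.0, 1.01, 2.0):
    moves, copied = simulate_pushes(1000, padding_factor=factor)
    print(f"padding {factor}: {moves} moves, {copied} bytes copied")
```

In this model a padding factor close to 1.0 leaves almost no slack, so most pushes trigger a move and each move copies a document that keeps getting larger.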

Also, the serialized form of BSON that documents are stored in (see
http://bsonspec.org/) means that the document has to be scanned to
find the place where the new element would go. As the document gets
larger, this takes longer. This effect is probably smaller than the
need to constantly move the document, but it might show up in the
aggregate if you were just timing a loop that does repeated pushes.
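
The scanning cost can be seen in the format itself. Below is a minimal sketch that hand-encodes a BSON array of int32 values as described at bsonspec.org (an array is an embedded document whose keys are "0", "1", ...) and then walks it element by element to reach the end. It only illustrates the layout, it is not the server's code, and the helper names `encode_int32_array` and `walk_to_end` are hypothetical.

```python
import struct


def encode_int32_array(values):
    """Encode ints as a BSON array body: int32 length, elements, 0x00."""
    body = b"".join(
        b"\x10" + str(i).encode() + b"\x00" + struct.pack("<i", v)
        for i, v in enumerate(values)
    )
    # The length prefix counts itself (4 bytes) and the trailing 0x00.
    return struct.pack("<i", 4 + len(body) + 1) + body + b"\x00"


def walk_to_end(array_doc):
    """Step over every element; return (elements visited, end offset)."""
    pos = 4                                   # skip the length prefix
    steps = 0
    while array_doc[pos] != 0:                # 0x00 terminates the document
        assert array_doc[pos] == 0x10         # this sketch only handles int32
        pos = array_doc.index(b"\x00", pos + 1) + 1   # skip the key cstring
        pos += 4                              # skip the int32 value
        steps += 1
    return steps, pos


doc = encode_int32_array(range(1000))
steps, end = walk_to_end(doc)
print(steps, end, len(doc))
```

There is no index of element offsets in the format, so the number of steps grows with the number of elements already in the array.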

Chris