MongoDB large array insert issues


Ashish Shetty

Sep 5, 2014, 3:45:06 AM
to mongod...@googlegroups.com
Hi,
I have a problem in my project: I need to insert a document into MongoDB that has a large embedded array of about 100,000 elements, and other documents may have even larger arrays. This insert takes more than 5 minutes, whereas a bulk insert of 300,000 separate documents finished within a few seconds. Is this normal behavior, or is there a problem with my setup? Example: {"name":"john" ,"lastname":"doe","groups":[1,2,3,4........100,000]}. The array is just a list of numbers that I fetch all at once.

Will Berkeley

Sep 5, 2014, 11:11:11 AM
to mongod...@googlegroups.com
How are you doing the insert? Do you have a code snippet or something? I can insert an array of 100,000 numbers into MongoDB instantly on my local machine using the mongo shell. 

> big = []
> for (var i = 0; i < 100000; i++) big.push(i)
> db.bigarray.insert({ "x" : big }) // finishes almost instantly

Is the array field "groups" indexed?

In any case, I'd advise against having gigantic array fields if you want to index the entries or if the array is going to be updated frequently. With large enough arrays, you might hit the 16MB BSON document size limit (the numbers 1-100,000 in an array in a BSON document come to about 1.5MB). I can't suggest a better way to model the data without knowing more about the use case, however.

-Will
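
As a rough check on the "about 1.5MB" figure above, the BSON size of such an array can be estimated by hand. The sketch below is not from the thread: the function name bson_array_size is made up for illustration, and it assumes each element is stored as one type byte, the array index as a null-terminated string key, and a fixed-width value (8 bytes for a double, which is how the mongo shell stores plain numbers, or 4 bytes for an int32).

```python
def bson_array_size(n, value_bytes):
    """Estimate the BSON size in bytes of an array of n fixed-width numbers."""
    # An array is encoded like an embedded document:
    # 4-byte length prefix, the elements, then a 1-byte terminator.
    size = 4 + 1
    for i in range(n):
        key = str(i)  # array indexes are stored as string keys "0", "1", ...
        # 1 type byte + key bytes + 1 null terminator + the value itself
        size += 1 + len(key) + 1 + value_bytes
    return size


doubles = bson_array_size(100000, 8)  # numbers as 8-byte doubles
int32s = bson_array_size(100000, 4)   # numbers as 4-byte int32 values
print(doubles, int32s)
```

With doubles this comes to roughly 1.49MB for the array alone, which is consistent with the estimate above and well under the 16MB limit. Field names and the rest of the document add a little more.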

Masoumeh Haghpanahi

Nov 18, 2014, 11:21:44 AM
to mongod...@googlegroups.com
Hi Will,

I have the same problem with inserting large arrays into MongoDB. As you pointed out, an array of 100,000 numbers can be inserted instantly using the mongo shell, but the same array takes over a minute to insert using PyMongo! I am just using a simple test document, testdoc = {"_id":"test", "data": range(100000)}, and inserting it using the following command:
db[collection].insert(testdoc, w = 1, j = True)

Do you have any idea why it takes sooooo much longer to insert the document in Python?

Thanks,
Masoumeh

Masoumeh Haghpanahi

Nov 18, 2014, 2:25:10 PM
to mongod...@googlegroups.com
Oh, and I should also mention that I use the same write concern (w = 1, j = 1) when inserting the document using the mongo shell.

Masoumeh

Will Berkeley

Nov 18, 2014, 2:47:47 PM
to mongod...@googlegroups.com
I can't reproduce that performance with PyMongo - it's the same near-instantaneous insert of a 100,000 element array as with the shell. Is the server busy at the time you're sending the insert? Do you have a full script with connection info, etc., that you could share? Also, what versions of the driver and server are you using? I'm using the latest PyMongo with 2.6.4 (and Python 3, so technically I had to change your test code, because in Python 3 range is a special sequence type rather than a list and it doesn't BSON serialize).

Also, truly, you almost certainly do not want arrays that large in a document. What is driving the desire for such large arrays?

-Will
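
For anyone trying to narrow down where the time goes, a small timing harness can help separate building the document from the insert call itself. This is only a sketch, not code from the thread: build_test_doc and time_call are hypothetical helper names, and the commented-out usage assumes a recent PyMongo (where insert_one replaces the older insert) and a mongod running on localhost. Note the list(...) around range, which is needed on Python 3 as mentioned above.

```python
import time


def build_test_doc(n):
    """Build the test document from the thread with an n-element array."""
    # list(...) is required on Python 3, where range() is not a list.
    return {"_id": "test", "data": list(range(n))}


def time_call(func, *args, **kwargs):
    """Run func once and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    return result, time.perf_counter() - start


doc, build_secs = time_call(build_test_doc, 100000)
print("build: %.4f s, elements: %d" % (build_secs, len(doc["data"])))

# Assumed usage against a real server (requires PyMongo and a running mongod):
#   from pymongo import MongoClient
#   from pymongo.write_concern import WriteConcern
#   coll = MongoClient("localhost", 27017)["test"]["bigarray"]
#   coll = coll.with_options(write_concern=WriteConcern(w=1, j=True))
#   _, insert_secs = time_call(coll.insert_one, doc)
#   print("insert: %.4f s" % insert_secs)
```

If the insert timing is small but the whole script is still slow, the delay is more likely in connecting, in fetching the numbers, or in a busy server than in the array itself.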