Insert Document Time performance


jimg...@gmail.com

unread,
May 14, 2017, 8:44:21 PM5/14/17
to mongodb-user
I have an Excel/Access VBA app that I want to move the database from Access to MongoDB. And perhaps also the app framework from Excel to Python (or C#).

Each MongoDB document has 9 key/value pairs plus the default _id index. The fields are: 1 string, 1 date (stored as a string for now), and 7 doubles.
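For reference, a document with that shape might look like the sketch below. The field names are made up; note also that passing a real datetime.datetime lets PyMongo store the date as a native BSON date rather than a string:

```python
from datetime import datetime

def make_doc(symbol, when, values):
    """Build one document: 1 string, 1 date, and 7 doubles (field names hypothetical)."""
    assert len(values) == 7
    doc = {"symbol": symbol, "date": when}  # a datetime maps to a native BSON date
    doc.update({f"v{i}": float(v) for i, v in enumerate(values)})
    return doc

doc = make_doc("ABC", datetime(2017, 5, 14), [1, 2, 3, 4, 5, 6, 7])
```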

I have written python code using the pymongo.py driver to insert data and measure performance. 

To my surprise, when loading data with insert_one(), my Excel/Access VBA app has better performance than my MongoDB/Python app.

In my MongoDB app, 138208 documents are inserted into a collection in 40.4 seconds. This is approximately 0.292 msec per insert_one() call. Note this is only the time for the insert_one() and does not include file or web access time to get the data.
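That per-document figure checks out:

```python
# Total load time and document count from the measurement above.
total_seconds = 40.4
n_docs = 138208

# Average time per insert, in milliseconds.
ms_per_doc = total_seconds / n_docs * 1000.0
print(round(ms_per_doc, 3))  # ~0.292 ms per insert
```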

Since I am "Coming up the Curve" in both Python and MongoDB, I would appreciate guidance for a plan of attack to improve performance. My questions are:

1. Is it faster to use insertMany() with many JSON-like documents, or to iterate with one document per insertOne() call?
2. Would pre-insert BSON functions improve performance?
3. Would C#, C++, etc. be faster? Does Python fall short when speed matters?
4. Some have suggested indexing can significantly improve performance, but I have not found information on how.
5. Any other ideas?

Thank you.

Wan Bachtiar

unread,
May 29, 2017, 1:45:36 AM5/29/17
to mongodb-user

Is it faster to use insertMany() with many JSON-like documents, or to iterate with one document per insertOne() call?

Hi Jim,

The factor to consider here is the time it takes to transfer a single insert command from your application to the database server. Multiple round trips to the server can be costly, especially if the network latency is high between your application server and your database server. See also PyMongo Bulk Insert.
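As an illustration, a batched load with insert_many() might look like the sketch below. The chunking helper is generic; the connection string, database, and collection names are placeholders, and ordered=False lets the server keep processing a batch instead of stopping at the first error:

```python
def chunks(docs, size):
    """Split a list of documents into batches of at most `size`."""
    for i in range(0, len(docs), size):
        yield docs[i:i + size]

def bulk_load(docs, batch_size=1000):
    """Sketch: one round trip per batch instead of one per document.

    Connection details below are placeholders -- adjust to your deployment.
    """
    from pymongo import MongoClient  # lazy import so chunks() stays standalone
    coll = MongoClient("mongodb://localhost:27017")["mydb"]["mycoll"]
    for batch in chunks(docs, batch_size):
        coll.insert_many(batch, ordered=False)

# Batching 2500 documents at batch_size=1000 gives batches of 1000, 1000, 500.
batches = list(chunks([{"n": i} for i in range(2500)], 1000))
```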

Does your MongoDB instance live on a different server than your application? If it lives on the same server, check for server resource (CPU/RAM) contention.

Would pre-insert Bson functions improve performance?

What do you mean by pre-insert BSON functions? Do you have an example or snippet to elaborate?

The data is stored in MongoDB internally as BSON. PyMongo will handle marshalling your Python Dictionaries into BSON.

Would C#, C++, etc. be faster? Does Python fall short when speed matters?

It really depends on your use case, environment, and what you're familiar with. If you're only migrating data into MongoDB, have you looked into mongoimport?
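For a one-off migration, a mongoimport invocation could look like the following (host, database, collection, and file names are placeholders; for CSV exports, --type csv with --headerline works similarly):

```shell
# Placeholder names -- adjust host/db/collection/file to your own setup.
mongoimport --host localhost:27017 \
    --db mydb --collection mycoll \
    --file data.json --type json
```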

Some have suggested indexing can significantly improve performance, but I have not found information on how.

For information see MongoDB Indexes. Generally, indexes improve your read operations rather than your writes. If it's insert throughput you are benchmarking, adding an index may not improve performance, and can even hurt it, since every index must be maintained on each insert.
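By way of illustration, creating an index in PyMongo is a one-liner; the field names below are hypothetical, and the key spec is a list of (field, direction) pairs:

```python
def index_spec():
    """Compound index key spec: (field, direction); 1 = ascending, -1 = descending."""
    # Field names are made up -- index the fields your queries actually filter/sort on.
    return [("symbol", 1), ("date", -1)]

def ensure_index(collection):
    """Sketch: create the index (pymongo is a no-op if it already exists)."""
    return collection.create_index(index_spec())

spec = index_spec()
```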

any other ideas?

In general, you have to find out where your bottleneck is first. It may be the network, disk I/O, CPU, or application code.

You may also find the following resources useful:

Regards,

Wan.
