Is it faster to use insertMany() with many JSON-like documents, or to iterate and insert one JSON-like document at a time?
Hi Jim,
The main factor to consider here is the time it takes to transfer each insert command from your application to the database server. Multiple round trips to the server can be costly, especially if the network latency between your application server and your database server is high. See also PyMongo Bulk Insert.
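As a rough illustration of the round-trip cost (this is not PyMongo itself — `send_to_server` below is a hypothetical stand-in for one insert command travelling over the network), batching documents the way insertMany() does collapses many trips into one:

```python
# Hypothetical sketch: each call to send_to_server() stands in for one
# network round trip between the application and the database server.
round_trips = 0

def send_to_server(batch):
    """Stand-in for a single insert command sent over the network."""
    global round_trips
    round_trips += 1

docs = [{"_id": i, "value": i * 2} for i in range(1000)]

# insert_one-style: one round trip per document
round_trips = 0
for doc in docs:
    send_to_server([doc])
one_by_one = round_trips

# insert_many-style: all documents grouped into a single command
round_trips = 0
send_to_server(docs)
bulk = round_trips

print(one_by_one, bulk)  # 1000 round trips vs. 1
```

With real network latency, each of those 1000 extra trips pays the latency cost again, which is why the batched form wins.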
Does your MongoDB instance live on a different server than your application? If it lives on the same server, check for server resource (CPU/RAM) contention.
Would pre-insert Bson functions improve performance?
What do you mean by pre-insert BSON functions? Do you have an example or snippet to elaborate on this?
The data is stored internally in MongoDB as BSON. PyMongo handles marshalling your Python dictionaries into BSON for you.
Would C#, C++, etc. be faster? Does Python not compete when speed is important?
It really depends on your use case, your environment, and what you’re familiar with. If you’re only migrating data into MongoDB, have you looked into mongoimport?
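For example, a file of newline-delimited JSON can be loaded in one command (the URI, database, collection, and file names below are placeholders for your own):

```shell
# Load newline-delimited JSON into mydb.mycollection (names are placeholders)
mongoimport --uri "mongodb://localhost:27017" \
            --db mydb --collection mycollection \
            --file data.json
```

mongoimport already batches documents internally, so for a one-off migration it can be simpler than writing your own insert loop.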
Some have suggested that indexing can significantly improve performance, but I have not found info on how?
For more information, see MongoDB Indexes. Generally, indexes improve your read operations rather than your writes. Depending on how you benchmark your performance, if it’s insert operations that you’re focusing on, adding an index may not improve performance — in fact it can slow inserts down, since each index must also be updated on every write.
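A loose analogy in plain Python (this is not MongoDB’s B-tree indexing, just the underlying trade-off): an index spends extra work at write time so that reads become direct lookups instead of full scans.

```python
docs = [{"_id": i, "name": f"user{i}"} for i in range(10_000)]

# Without an index: every query scans the whole collection.
def find_by_name_scan(name):
    return next((d for d in docs if d["name"] == name), None)

# With an "index": built once at write time, then queried directly.
name_index = {d["name"]: d for d in docs}  # extra cost paid on insert/update

def find_by_name_indexed(name):
    return name_index.get(name)

# Both return the same document; only the cost per query differs.
assert find_by_name_scan("user9999") == find_by_name_indexed("user9999")
```

This is why an index helps a read-heavy benchmark but adds overhead to an insert-only one.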
any other ideas?
In general, you have to find out where your bottleneck is first: it may be network, disk I/O, CPU, or application code.
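One low-effort way to narrow the bottleneck down from the application side is to time each stage separately. A minimal sketch (the stages and names here are placeholders for your own code — the commented-out insert would run against a real connection):

```python
import time

def timed(label, fn, *args):
    """Run fn, print the elapsed wall-clock time for that stage, return the result."""
    start = time.perf_counter()
    result = fn(*args)
    elapsed = time.perf_counter() - start
    print(f"{label}: {elapsed:.4f}s")
    return result

# Separate document construction from the (hypothetical) insert call,
# so you can see which stage dominates.
docs = timed("build documents", lambda: [{"i": i} for i in range(100_000)])
# timed("insert_many", collection.insert_many, docs)  # on a real connection
```

If the build stage dominates, optimising the insert path won’t help much, and vice versa.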
You may also find the following resources useful:
Free online course at MongoDB University M201: MongoDB Performance
Regards,
Wan.