pymongo changes static data passed in to be inserted


rhea ghosh

Sep 23, 2015, 3:23:25 PM
to mongodb-user
Hi,

I maintain the MongoDB returner for SaltStack, and I've found some unexpected behavior in PyMongo.

I use the very simple function below to save my load to the database. However, I've had to add .copy() because, on insert, Mongo adds the _id to my load value. This is unexpected, and I question why Mongo is changing something that should be static data.


    def save_load(jid, load):
        '''
        Save the load for a given job id
        '''
        conn, mdb = _get_conn(ret=None)
        if float(version) > 2.3:
            job = mdb.jobs.insert_one(load.copy())
        else:
            mdb.jobs.insert(load)

Bernie Hackett

Sep 24, 2015, 10:04:40 AM
to mongodb-user
Yes, PyMongo, like all other MongoDB drivers, adds an _id field to your documents if you don't provide one yourself. If the driver didn't add it then the server would instead. Every MongoDB document is required to have an _id value. PyMongo has always modified the document passed to insert helpers to add this field and always will. To change that behavior would break the expectations of every existing application.

If you want to control the value of the _id field to keep your documents "static", you can. PyMongo only modifies the document if the _id field does not already exist.
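
A minimal sketch of that point. To keep it runnable without a server, it uses a hypothetical `fake_insert_one` stand-in rather than a real PyMongo call; the helper name, the sample field values, and the use of `uuid` for the generated id are assumptions for illustration only (the real driver generates an ObjectId). It mimics only the behavior described above: the dict gains an _id when it has none, and is left alone otherwise.

```python
import uuid


def fake_insert_one(document):
    """Hypothetical stand-in for an insert helper (not PyMongo's API).

    Mimics the behavior described in this thread: the dict passed in
    gains an '_id' key only if it does not already have one.
    """
    if '_id' not in document:
        # Assumption: a real driver would generate an ObjectId here.
        document['_id'] = uuid.uuid4().hex
    return document['_id']


# Without an _id, the caller's dict is mutated in place.
load = {'jid': '20150923152325', 'fun': 'test.ping'}
fake_insert_one(load)
print('_id' in load)  # True

# With a caller-supplied _id, the dict is left exactly as it was.
load_with_id = {'_id': '20150923152325', 'fun': 'test.ping'}
before = dict(load_with_id)
fake_insert_one(load_with_id)
print(load_with_id == before)  # True

# Passing a shallow copy also keeps the original untouched.
original = {'jid': '20150923152326', 'fun': 'test.ping'}
fake_insert_one(original.copy())
print('_id' in original)  # False
```

Note that `dict.copy()` is shallow, so it protects only the top-level keys, which is all that matters for an added _id.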

rhea ghosh

Sep 24, 2015, 11:01:03 AM
to mongodb-user
That still doesn't really make sense. I don't want to control the value of the _id field, but the data that I pass in should be immutable. Returning the ObjectId from the MongoDB server is what I would actually expect for something like this. I expect the driver to insert or update, not to modify the data structure.

Bernie Hackett

Sep 24, 2015, 11:17:54 AM
to mongod...@googlegroups.com
> ...the data that I pass in should be immutable data.

But it's not. Either the driver or the server will mutate the data to add an _id field. If you want to control that mutation yourself you can, but any document inserted into the server will be mutated to add an _id field if one does not exist.

There is also an advantage to this design. Since the document is modified on insert, if the insert fails due to a brief network issue or replication failover you can retry the insert attempt. If you get no error, then the original attempt failed and the second attempt succeeded. If the second attempt fails with DuplicateKeyError, then you know the original attempt succeeded.
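
A minimal sketch of that retry reasoning, under stated assumptions: `FlakyCollection`, `NetworkError`, and `insert_with_retry` are hypothetical stand-ins written for this illustration, and the local `DuplicateKeyError` class only imitates the driver's error of the same name. The in-memory collection stores the first insert but then loses the reply, which is the ambiguous case described above.

```python
class DuplicateKeyError(Exception):
    """Local stand-in for the driver's duplicate-key error."""


class NetworkError(Exception):
    """Hypothetical stand-in for a transient connection failure."""


class FlakyCollection:
    """In-memory stand-in: stores the first insert, then loses the reply."""

    def __init__(self):
        self.docs = {}
        self.fail_next_reply = True

    def insert_one(self, document):
        if '_id' not in document:
            # Assumption: the id is assigned client-side before sending,
            # so the caller's dict keeps it even if the reply is lost.
            document['_id'] = len(self.docs) + 1
        if document['_id'] in self.docs:
            raise DuplicateKeyError(document['_id'])
        self.docs[document['_id']] = dict(document)
        if self.fail_next_reply:
            self.fail_next_reply = False
            # The write landed, but the caller never hears about it.
            raise NetworkError('connection reset')


def insert_with_retry(collection, document):
    """Retry once; treat a duplicate key on the retry as prior success."""
    try:
        collection.insert_one(document)
        return 'first attempt succeeded'
    except NetworkError:
        pass  # Outcome unknown: retry with the same, already-stamped dict.
    try:
        collection.insert_one(document)
        return 'second attempt succeeded'
    except DuplicateKeyError:
        return 'first attempt had already succeeded'


coll = FlakyCollection()
doc = {'x': 1}
outcome = insert_with_retry(coll, doc)
print(outcome)         # first attempt had already succeeded
print(len(coll.docs))  # 1
```

The retry is only safe because the dict still carries the _id from the first attempt; had a fresh copy without an _id been sent each time, the second attempt would have stored a duplicate document.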

Regardless, this behavior has existed as long as PyMongo has existed. To change it now would break the expectations of all existing applications.


Bernie Hackett

Sep 24, 2015, 11:23:37 AM
to mongodb-user
> Returning the ObjectID from the MongoDB server is what I would actually expect for something like this.

This is another reason for the driver's behavior. The server does not return the _id of inserted documents that don't include an _id:

    >>> c.test.command('insert', 'test', documents=[{'x': 1}])
    {u'ok': 1, u'n': 1}
    >>> c.test.test.find_one()
    {u'x': 1, u'_id': ObjectId('560414fb8cd329440b9a04a5')}