Clarification on implicit updating that occurs when the save() method is called on a document


Edwardr

Feb 3, 2012, 5:39:46 AM
to MongoEngine Users
Hi all,

I have some processes that automatically go out and collect product
information every day. These products are stored as documents in a
collection, and the ID is generated by myself using attributes of the
products that don't change. In my case it's a hash of some identifying
information, like a name and a code.

There are some fields within the documents however that are not
collected by this daily process and are instead added/updated to the
documents by me on a slower schedule (say weekly).

Here's what's happening: every day I generate new documents and
populate them with information, then I call save() on them, thinking
that save() will only do atomic updates to the fields whose values
were passed to __init__ (not all fields are required in my schema).
Bear in mind that the IDs of these documents will already exist in
the collection.

However, I've just realised that the weekly information I'm adding to
these documents is being deleted, i.e., the document I'm saving is
overwriting an existing document with the same ID.

What's the right approach here to only update fields that have changed
when a document with the same ID already exists?

- Should I be trying to just do atomic updates using
Collection.objects(the_id="unique_id").update(set__my_attribute="xxx")
using upsert if the document doesn't exist?
- Should I be trying to query for the document, then making changes
using my_object.title = "new title", then calling save() when I'm
done? If I only retrieve the fields I know I want to change using
.only(), then calling save() is going to delete the other fields I
didn't bring down, right?
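
For the first option, one possible way to avoid sending unset fields is to build the set__ keyword arguments only from values that were actually collected. This is a minimal sketch, not a tested recipe: the helper name build_set_kwargs, the sample field names and the Product class in the trailing comment are all hypothetical.

```python
def build_set_kwargs(fields):
    """Turn {'title': 'x', 'weekly_score': None} into {'set__title': 'x'}.

    Fields whose value is None are skipped, so they are never sent and
    cannot overwrite values already stored in the collection.
    """
    return {
        "set__" + name: value
        for name, value in fields.items()
        if value is not None
    }


# Sample data from a hypothetical daily run; weekly_score was not collected.
daily = {"title": "Widget", "price": 9.99, "weekly_score": None}
update_kwargs = build_set_kwargs(daily)
print(update_kwargs)

# Assumed usage with a MongoEngine Document subclass named Product:
# Product.objects(id="unique_id").update_one(upsert=True, **update_kwargs)
```

Whether None is the right marker for "not collected" depends on the schema; a field that may legitimately be set to None would need a different sentinel.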

90% of the fields in the documents are required, so in approach one
I'm going to be sending most of the document each time because of the
possibility of an upsert.

I'm not quite sure what the most efficient approach is; I have to deal
with between 10 and 50 thousand of these documents a day, so I'd like
to do it as efficiently as possible.

Thanks!

Ross Lawley

Feb 6, 2012, 3:31:45 AM
to mongoeng...@googlegroups.com
Hi,

The most efficient approach would be to do updates directly.  

In versions 0.5.2+ there is dirty data tracking, so save() will convert to updates if you add or delete data. Prior to 0.5, save() updated the whole document, so if you didn't get the whole object you could lose data.
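
To illustrate the idea of dirty data tracking, here is a toy sketch in plain Python. It is not MongoEngine's implementation; the TrackedDoc class and its methods are hypothetical and only show how remembering which fields were assigned lets a save send a small $set document instead of the whole object.

```python
class TrackedDoc:
    """Toy document that remembers which fields were assigned after loading."""

    def __init__(self, doc_id, **fields):
        # Use object.__setattr__ so internal state is not recorded as a change.
        object.__setattr__(self, "_id", doc_id)
        object.__setattr__(self, "_data", dict(fields))
        object.__setattr__(self, "_changed", set())

    def __setattr__(self, name, value):
        # Every normal assignment updates the data and marks the field dirty.
        self._data[name] = value
        self._changed.add(name)

    def delta(self):
        """Only the dirty fields, shaped like a $set update document."""
        return {"$set": {name: self._data[name] for name in sorted(self._changed)}}

    def full(self):
        """The whole document, as a full-document save would send it."""
        return dict(self._data, _id=self._id)


doc = TrackedDoc("abc", title="Old", price=1.0, weekly_score=7)
doc.title = "New"
print(doc.delta())  # contains only the changed field
print(doc.full())   # contains every field held client-side
```

With a delta, fields the client never touched are not part of the write, which is why a partially loaded object no longer causes data loss.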

Hope that helps,

Ross

Edwardr

Feb 6, 2012, 5:01:21 AM
to MongoEngine Users
Hi Ross,

Thanks for the reply. Great job on Mongoengine by the way!

> In versions 0.5.2+ there is dirty data tracking, so save() will convert to
> updates if you add or delete data. Prior to 0.5, save() updated the whole
> document, so if you didn't get the whole object you could lose data.

Ah, I just realised my machine was on 0.5.1!

I can't always assume I'll be on 0.5.2, so is the following
understanding also correct?
If I instantiate a new document that uses the same ID as an existing
one, and provide values for the required fields before I save it, any
existing optional fields in the existing document with the same ID
will be lost, presumably because they are not present in the new
document I instantiated client-side.

To mitigate this I should do a get() on the id and see if a
document comes back, then update the fields directly with any new
values, then call save(). If I know I'm using 0.5.2+ I don't have to
worry about checking for an existing document first and can just
instantiate one with the same id as an existing one, and the existing
one's optional fields will not be overwritten when I call save() on
the new document.

Right?

Cheers!
Edd

Ross Lawley

Feb 6, 2012, 8:19:20 AM
to mongoeng...@googlegroups.com
Hi,

I'd have to test this out. Theoretically *it should be fine*, but if you know the id of the document but don't know if it's stored, then I would do an update and set operation.
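
As a rough illustration of why update-and-set is the safe route when the id may or may not be stored, here is an in-memory sketch. Plain dicts stand in for the collection, and the function names replace_doc and upsert_set are hypothetical; nothing here calls MongoEngine or MongoDB.

```python
def replace_doc(collection, doc):
    """Full-document save: whatever was stored under the id is discarded."""
    collection[doc["_id"]] = dict(doc)


def upsert_set(collection, doc_id, fields):
    """Mimic an update with $set and upsert: merge if present, insert if not."""
    stored = collection.setdefault(doc_id, {"_id": doc_id})
    stored.update(fields)


# One stored product that already carries a weekly field.
collection = {"abc": {"_id": "abc", "title": "Old", "weekly_score": 7}}

upsert_set(collection, "abc", {"title": "New"})    # existing id: fields merged
upsert_set(collection, "xyz", {"title": "Other"})  # unknown id: document created
after_update = dict(collection["abc"])

replace_doc(collection, {"_id": "abc", "title": "New"})
after_replace = dict(collection["abc"])

print(after_update)   # weekly_score is still present
print(after_replace)  # weekly_score is gone
```

The same call handles both the "already stored" and "not yet stored" cases, so no prior get() is needed to decide between insert and update.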

Ross