Atomicity of Update vs FindAndModify

James Crosswell

Jun 6, 2014, 7:17:04 AM
to mongodb...@googlegroups.com
I want to make atomic updates and I'm wondering what the difference is between the Update and FindAndModify methods.

The docs for Update say:

> The Update method is used to update existing documents.
>
> Use FindAndModify when you want to find a matching document and update it in one atomic operation.

Does this mean that FindAndModify will be atomic but Update will not?

eMeL

Jun 6, 2014, 8:06:04 AM
to mongodb...@googlegroups.com

> Does this mean that FindAndModify will be atomic but Update will not?

MongoDB isn't transactional.
So if several users read the same record and update it later, only the *last*
update persists (last update wins).

eMeL

James Crosswell

Jun 6, 2014, 9:05:05 AM
to mongodb...@googlegroups.com
Yeah, I understand that. However, I also understood that MongoDB supports atomic updates at the document level. What I'm wondering is whether that "atomic update" functionality applies equally to documents updated using the findAndModify operation and those updated using the update operation.

I'll show my code to demonstrate my use case better... 

I have lots of chunks of work and lots of job runners that perform that work. Each chunk should only be processed by a single job runner. So I have some code like the following to let job runners claim chunks:

        public bool ClaimChunk(JobChunkBase chunk, IdString jobRunnerId, DateTime startDate)
        {
            var claimChunk = _mongoCollection.Update(
                Query.And(
                    Query<JobChunkBase>.EQ(j => j.Id, chunk.Id),
                    Query<JobChunkBase>.EQ(j => j.JobRunnerId, IdString.EmptyId)
                ),
                Update<JobChunkBase>
                    .Set(j => j.JobRunnerId, jobRunnerId.ToString())
                    .Set(j => j.DateStarted, startDate)
            );

            // If we managed to update the chunk in the database, then the chunk has been successfully claimed
            return claimChunk.UpdatedExisting;
        }

From what I can tell though, this wouldn't necessarily work. If two job runners issue that query at very near the same time, they could both get the job returned by the query and then both issue an update (since the Query and Update are performed separately using an update statement). 

The following seems like better code in this situation:

        public bool ClaimChunk(JobChunkBase chunk, IdString jobRunnerId, DateTime startDate)
        {
            var claimChunk = _mongoCollection.FindAndModify(
                Query.And(
                    Query<JobChunkBase>.EQ(j => j.Id, chunk.Id),
                    Query<JobChunkBase>.EQ(j => j.JobRunnerId, IdString.EmptyId)
                ),
                SortBy<JobChunkBase>.Descending(x => x.Id),
                Update<JobChunkBase>
                    .Set(j => j.JobRunnerId, jobRunnerId.ToString())
                    .Set(j => j.DateStarted, startDate),
                returnNew: true
            );

            // If we managed to update the chunk in the database, then the chunk has been successfully claimed
            return claimChunk.ModifiedDocument != null;
        }

Note that this code makes use of an IdString which is simply a helper class to treat Mongo ObjectIds as strings. EmptyId is just the string equivalent of ObjectId.Empty then.

As such, I guess my initial question could be framed more formally as: "Are the two blocks of code above functionally equivalent, or am I right to suspect that only the second piece of code will do what I want, i.e. ensure that only a single job runner claims any particular chunk?"

Cheers,
James

James Crosswell

Jun 11, 2014, 9:37:15 AM
to mongodb...@googlegroups.com
So I finally found the answer to this in the docs:

> When modifying a single document, both findAndModify() and the update() method atomically update the document. See "Isolate Sequence of Operations" for more details about interactions and order of operations of these methods.

Cheers,
James 

Asya Kamsky

Aug 8, 2014, 11:53:24 AM
to mongodb...@googlegroups.com
I know this is an old thread and OP already found the answer, but I wanted to clarify something to be more explicit about it:

> If two job runners issue that query at very near the same time, they could both get the job returned by the query and then both issue an update (since the Query and Update are performed separately using an update statement). 

This is absolutely *not* the case (and if it were, then it would be a terrible bug we would want to fix immediately).

An update checks that the query condition is true under the same lock it holds while updating the document. So you cannot have two threads update the same document: by the time the second one runs, the document no longer satisfies the query condition.
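
To make the point concrete, here is a minimal in-memory sketch of the same idea. It is not MongoDB driver code: `ChunkDocument`, `TryClaim` and `ClaimDemo` are hypothetical names invented for illustration, and a plain C# `lock` stands in for whatever locking the server actually does. It only assumes what is described above, namely that the condition check and the write happen under one lock.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical in-memory stand-in for a single stored document (not driver code).
public sealed class ChunkDocument
{
    private readonly object _documentLock = new object();

    // An empty string plays the role of IdString.EmptyId, i.e. "unclaimed".
    public string JobRunnerId { get; private set; } = "";

    // Mimics a conditional update: match on JobRunnerId == "" and set the new
    // value, with the check and the write performed under the same lock.
    public bool TryClaim(string runnerId)
    {
        lock (_documentLock)
        {
            if (JobRunnerId != "")
            {
                return false; // already claimed, so the condition no longer matches
            }

            JobRunnerId = runnerId;
            return true;
        }
    }
}

public static class ClaimDemo
{
    // Races runnerCount runners for one chunk and counts the successful claims.
    public static int CountSuccessfulClaims(int runnerCount)
    {
        var chunk = new ChunkDocument();
        var attempts = Enumerable.Range(0, runnerCount)
            .Select(i => Task.Run(() => chunk.TryClaim("runner-" + i)))
            .ToArray();

        Task.WaitAll(attempts);
        return attempts.Count(t => t.Result);
    }

    public static void Main()
    {
        // However the threads interleave, exactly one claim succeeds.
        Console.WriteLine(ClaimDemo.CountSuccessfulClaims(16)); // prints 1
    }
}
```

If the check and the write were done under separate locks (or with no lock), two runners could both see an unclaimed chunk and both write, which is the race the original question was worried about.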

Asya

Gary Leong

Apr 5, 2016, 5:08:45 PM
to mongodb-csharp
I know this is an old thread, but I have been seeing two Docker containers, each running a single thread, read from MongoDB with findAndModify and both run the same job. One of them should not, since "inProg" has been set to True and the query matches only "inProg" set to False. With docker-compose scaling the containers, the pulls apparently happen at exactly the same time. The containers share the same kernel and the same system clock. I'm not sure what to do at this point. Any suggestions? I heard MongoDB 3.2.2 is supposed to use a different storage engine; I'm running 2.6.