Handling "Save with update_fields did not affect any rows"

5,380 views
Skip to first unread message

Jeroen Pulles

unread,
Nov 18, 2014, 10:44:37 AM11/18/14
to django-d...@googlegroups.com
Hi,

the model.save() arguments `force_update` and `update_fields` cause an explicit check if the update statement matched any rows.

Every once in a while one of my applications runs into a "race condition" where one process does an update (with one of the model save arguments) and another process just removed the same object. You then get a very generic DatabaseError with the message "Forced update did not affect any rows.", or "Save with update_fields did not affect any rows.". This invariably makes me scratch my head and wonder what happened.

As per ticket #21761, I don't understand why that is an error and a regular call to save() doesn't throw an error.

But since I can easily recover from this exception, I think it would be nice if these exceptions would be typed in a more specific subclass so that I can catch them and do something nice for the end-user.

As a side note, you might want to check that a single row is updated, and not "more than zero"; just in case a primary key is composed from more than one column and there's some other error...

kind regards
Jeroen Pulles


Cross-posting from django-users as I didn't get any response there

Carl Meyer

unread,
Nov 18, 2014, 11:00:06 AM11/18/14
to django-d...@googlegroups.com
Hi Jeroen,

On 11/18/2014 08:44 AM, Jeroen Pulles wrote:
> the model.save() arguments `force_update` and `update_fields` cause an
> explicit check if the update statement matched any rows.
>
> Every once in a while one of my applications runs into a "race
> condition" where one process does an update (with one of the model save
> arguments) and another process just removed the same object. You then
> get a very generic DatabaseError with the message "Forced update did not
> affect any rows.", or "Save with update_fields did not affect any
> rows.". This invariably makes me scratch my head and wonder what happened.
>
> As per ticket #21761, I don't understand why that is an error and a
> regular call to save() doesn't throw an error.

A regular call to save() will do an INSERT if no rows are updated by an
UPDATE, so neither case is an error. But if you explicitly say "I want
an update" and there is nothing to update, in many cases that is a bug
in your code, so we raise an error.

> But since I can easily recover from this exception, I think it would be
> nice if these exceptions would be typed in a more specific subclass so
> that I can catch them and do something nice for the end-user.

I think it would be reasonable to raise a sub-type of DatabaseError for
these "nothing was updated" errors, so they can be more easily caught
and handled without parsing exception messages. Could you file a ticket
(and pull request?) for that?

> As a side note, you might want to check that a single row is updated,
> and not "more than zero"; just in case a primary key is composed from
> more than one column and there's some other error...

Django models don't support compound primary keys at this point (though
there is hope that they may in the future). If you can show a situation
that is achievable through public APIs where the check for "more than
zero" gives wrong results, that would be worth filing a separate bug for.

Carl

signature.asc

Jeroen Pulles

unread,
Nov 18, 2014, 12:12:08 PM11/18/14
to django-d...@googlegroups.com
On Tuesday, November 18, 2014 5:00:06 PM UTC+1, Carl Meyer wrote:
Hi Jeroen,

On 11/18/2014 08:44 AM, Jeroen Pulles wrote:
> the model.save() arguments `force_update` and `update_fields` cause an
> explicit check if the update statement matched any rows.
>
> Every once in a while one of my applications runs into a "race
> condition" where one process does an update (with one of the model save
> arguments) and another process just removed the same object. You then
> get a very generic DatabaseError with the message "Forced update did not
> affect any rows.", or "Save with update_fields did not affect any
> rows.". This invariably makes me scratch my head and wonder what happened.
>
> As per ticket #21761, I don't understand why that is an error and a
> regular call to save() doesn't throw an error.

A regular call to save() will do an INSERT if no rows are updated by an
UPDATE, so neither case is an error. But if you explicitly say "I want
an update" and there is nothing to update, in many cases that is a bug
in your code, so we raise an error.

I'm not sure you understand my situation:

A regular .save() on an existing object with primary key will cause an UPDATE; No checks, no INSERTs following that statement. My "race condition" where another process DELETEd the object causes the same "0 matching rows" return value from the underlying update call, but with .save() there are no exceptions.

If I were to change my calling code from .save(update_fields=['foobar']) to .save() I would be a happy bunny. No exception, yay! I'll have to deal with some mayhem from no longer having the record, but I can deal with that.

So I'm still confused why the explicit arguments would be somehow different. Why would Django suddenly care about a bug in the caller's code--I have loads ;-) --? The database didn't complain, at least.

My personal preference would be to remove the four lines with the exception raising. I can't imaging anyone is actively expecting these (generic) exceptions in their calling code. If you do get the exception, my position is that it is highly unlikely that the caller made a programming error.

As far as I can see these four lines have been in base.py since Anssi's refactoring.

kind regards,
Jeroen

 

Carl Meyer

unread,
Nov 18, 2014, 10:25:14 PM11/18/14
to django-d...@googlegroups.com
Hi Jeroen,
That's not accurate. A .save() on an existing object, with primary key
set, will cause an INSERT if no rows are updated (that is, its primary
key does not exist in the database.) You can easily verify this by
trying it out.

This is why there is no error - you haven't explicitly asked for an
UPDATE or an INSERT, so Django will happily decide which to do based on
whether the row exists or not. Neither condition is unexpected or an error.

> My "race
> condition" where another process DELETEd the object causes the same "0
> matching rows" return value from the underlying update call, but with
> .save() there are no exceptions.

Correct, because an INSERT takes place instead.

> If I were to change my calling code from .save(update_fields=['foobar'])
> to .save() I would be a happy bunny. No exception, yay! I'll have to
> deal with some mayhem from no longer having the record, but I can deal
> with that.

You would have the record, because it would be re-INSERTed.

> So I'm still confused why the explicit arguments would be somehow
> different. Why would Django suddenly care about a bug in the caller's
> code--I have loads ;-) --? The database didn't complain, at least.

By calling .save() without arguments, you are telling Django "merge this
object: UPDATE if it already exists, INSERT it otherwise." Neither
condition is an error, both are within the expected range of possibilities.

By using `force_update` or `update_fields`, you are telling Django "I
expect this row to already exist, and I want you to update it." Thus,
failure to update it is an error condition, because Django was unable to
do the thing that you asked it to do.

> My personal preference would be to remove the four lines with the
> exception raising. I can't imaging anyone is actively expecting these
> (generic) exceptions in their calling code. If you do get the exception,
> my position is that it is highly unlikely that the caller made a
> programming error.

I understand that you have a case where you don't consider it a
programming error (though one could argue that you should be using a
transaction and select-for-update to prevent this race condition), but I
don't think either of us has real data on that question. There are
certainly plausible programming errors that would lead to this; one
example was already given in this thread (using `update_fields` or
`force_update` on a new, never-saved instance).

As I said before, I think making these errors a more specific sub-type
to make them easier to catch would be reasonable. But I don't think they
should be removed. They represent a case where Django was a) unable to
do what you asked it to do, and b) as a result Django is throwing away
your model data unsaved. That's something that I don't think should pass
silently.

Carl

signature.asc

Jeroen Pulles

unread,
Nov 20, 2014, 4:43:24 AM11/20/14
to django-d...@googlegroups.com
Hi Carl,

Thanks for taking the time to respond.

I'm still wondering what's best for save_table to do. Pre-1.6 Django code would end up with a new object in my situation, and no complaints. Apparently, that worked, for me. I'm inclined to think that is still what I want for the update-y variants of save().

Or maybe I should consider using manager.update.

I've put up a sample of the multiprocessing situation on github:
https://github.com/jeroenp/glowing-octo-cyril/blob/master/steps.py

kind regards
Jeroen

Carl Meyer

unread,
Nov 20, 2014, 6:11:45 AM11/20/14
to django-d...@googlegroups.com
Hi Jeroen,

On 11/20/2014 02:43 AM, Jeroen Pulles wrote:
> I'm still wondering what's best for save_table to do. Pre-1.6 Django
> code would end up with a new object in my situation, and no complaints.
> Apparently, that worked, for me. I'm inclined to think that is still
> what I want for the update-y variants of save().

I hear you -- but I already explained why I think the current behavior
(in 1.6 and 1.7) is better. I don't find "but apparently [the old
behavior] worked, for me" a convincing counter-argument :)

If we were still in the 1.6 release cycle, then "it's a regression from
1.5" alone might have been a reason to change, but now that we've had
two full releases with the new behavior, that's no longer the case.

> Or maybe I should consider using manager.update.

If you want an update that doesn't mind if nothing is updated, then a
QuerySet update is what you want, yes.

(It's ok for QuerySet.update() to update nothing if nothing matches,
unlike Model.save(), because a) you provide the filters yourself, so how
many rows the filters match is clearly in your court, and b) the number
of rows updated is returned, so it's not silently throwing away your data).

> I've put up a sample of the multiprocessing situation on github:
> https://github.com/jeroenp/glowing-octo-cyril/blob/master/steps.py

Yes, I understand the scenario you have.

Carl

signature.asc
Reply all
Reply to author
Forward
0 new messages