[Django] #26804: update_or_create() updates all columns

35 views
Skip to first unread message

Django

unread,
Jun 24, 2016, 12:26:39 PM6/24/16
to django-...@googlegroups.com
#26804: update_or_create() updates all columns
----------------------------------------------+----------------------------
Reporter: jensen-cochran | Owner: nobody
Type: Bug | Status: new
Component: Database layer (models, ORM) | Version: master
Severity: Normal | Keywords:
| update_or_create
Triage Stage: Unreviewed | Has patch: 0
Easy pickings: 0 | UI/UX: 0
----------------------------------------------+----------------------------
The update_or_create() helper function in Django is passed a 'defaults'
kwarg, which contains the fields that we expect the function to update.
When .save() is called in the function to do the update, it does not pass
the update_fields parameter to .save(), and so the SQL generated to
complete the update sets every column (except pk), even those not passed
in 'defaults'.

This creates a potential race condition. In the period of time between
when the function calls .get() to retrieve the object and when it calls
.save(), some other process may have updated a column in that row that has
nothing to do with this update. When save() is called, these changes will
be overwritten. This is not an issue for the fields passed into
'defaults' since we expected those to be updated anyway. The main issue
is that update_or_create updates fields/columns other than those passed in
'defaults', and does so in a way that could cause data loss.

Example:

{{{

class Authors(models.Model):
id = models.AutoField(primary_key=True)
name = models.CharField(max_length=128, unique=True)
newest_release = models.DateTimeField()
active = models.BooleanField(default=True)


Authors.objects.update_or_create(
name='Bob',
defaults={
'newest_release': timezone.now()
})


# Django update_or_create code from
https://github.com/django/django/blob/master/django/db/models/query.py

def update_or_create(self, defaults=None, **kwargs):
"""
Looks up an object with the given kwargs, updating one with
defaults
if it exists, otherwise creates a new one.
Returns a tuple (object, created), where created is a boolean
specifying whether an object was created.
"""
defaults = defaults or {}
lookup, params = self._extract_model_params(defaults, **kwargs)
self._for_write = True
try:
obj = self.get(**lookup)
except self.model.DoesNotExist:
obj, created = self._create_object_from_params(lookup, params)
if created:
return obj, created
for k, v in six.iteritems(defaults):
setattr(obj, k, v)
obj.save(using=self.db)
return obj, False

}}}

Assume that the author Bob already exists in the database.
update_or_create retrieves the Author object using .get().

Meanwhile, another process runs the following SQL:
{{{ UPDATE authors SET active = FALSE WHERE id = 1234; }}}

Then, update_or_create sets the values from 'defaults' on the object, then
calls .save() on the object. Generated SQL would look something like
{{{ UPDATE authors SET name='Bob', active=TRUE,
newest_release='#######'::timestamp, WHERE id = 1234" }}}

Notice that both 'name' and 'active' are set in the SQL, even though they
were not passed in defaults. Bob has been wrongly made active again, even
though the update portion of the update_or_create() call had nothing to do
with the 'active' field. Besides the data issues, it is also not
efficient to have the database update every field/column when there is no
need.

I have tested this behavior by putting a time.sleep() into
update_or_create right after the .get() and updating the database during
this sleep.
Once the sleep finished, my changes were overwritten by .save()

My initial thought to patch this was to change
{{{ obj.save(using=self.db) }}}
in update_or_create into
{{{ obj.save(using=self.db, update_fields=defaults.keys()) }}}

However, this causes a single test to fail in generic_relations, since
update_fields in .save() does not accept GenericForeignKey fields, which
may be bug in and of itself. There are other ways to fix this like
locking the row, but that would be less efficient and not fix the
underlying issue.

{{{
ERROR: test_update_or_create_defaults
(generic_relations.tests.GenericRelationsTests)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/usr/lib64/python2.7/unittest/case.py", line 365, in run
testMethod()
File "/home/jensen/sandbox/django/tests/generic_relations/tests.py",
line 540, in test_update_or_create_defaults
tag, created = TaggedItem.objects.update_or_create(tag="shiny",
defaults={'content_object': diamond})
File "/home/jensen/sandbox/django/django/db/models/manager.py", line 85,
in manager_method
return getattr(self.get_queryset(), name)(*args, **kwargs)
File "/home/jensen/sandbox/django/django/db/models/query.py", line 493,
in update_or_create
obj.save(using=self.db, update_fields=defaults.keys())
File "/home/jensen/sandbox/django/django/db/models/base.py", line 782,
in save
% ', '.join(non_model_fields))
ValueError: The following fields do not exist in this model or are m2m
fields: content_object
}}}

--
Ticket URL: <https://code.djangoproject.com/ticket/26804>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

Django

unread,
Jun 24, 2016, 12:27:10 PM6/24/16
to django-...@googlegroups.com
#26804: update_or_create() updates all columns, which could cause a race condition
-------------------------------------+-------------------------------------

Reporter: jensen-cochran | Owner: nobody
Type: Bug | Status: new
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution:

Keywords: update_or_create | Triage Stage:
| Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by jensen-cochran):

* needs_better_patch: => 0
* needs_tests: => 0
* needs_docs: => 0


--
Ticket URL: <https://code.djangoproject.com/ticket/26804#comment:1>

Django

unread,
Jun 24, 2016, 12:57:57 PM6/24/16
to django-...@googlegroups.com
#26804: update_or_create() updates all columns, which could cause a race condition
-------------------------------------+-------------------------------------
Reporter: jensen-cochran | Owner: nobody
Type: Bug | Status: new
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution:
Keywords: update_or_create | Triage Stage: Accepted

Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by shaib):

* stage: Unreviewed => Accepted


Comment:

No, actually locking the row is the only valid solution. Otherwise you
still have a race condition -- when someone else updates the record, it is
possible that your `update_or_create` should be creating a new row rather
than updating the existing one. IMO the right fix is to change the row
{{{
obj = self.get(**lookup)
}}}
into
{{{
obj = self.select_for_update().get(**lookup)
}}}

--
Ticket URL: <https://code.djangoproject.com/ticket/26804#comment:2>

Django

unread,
Jun 24, 2016, 1:17:28 PM6/24/16
to django-...@googlegroups.com
#26804: update_or_create() updates all columns, which could cause a race condition
-------------------------------------+-------------------------------------
Reporter: jensen-cochran | Owner: nobody
Type: Bug | Status: new
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution:
Keywords: update_or_create | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by jensen-cochran):

* has_patch: 0 => 1


Comment:

Pull request https://github.com/django/django/pull/6831

--
Ticket URL: <https://code.djangoproject.com/ticket/26804#comment:3>

Django

unread,
Jun 27, 2016, 10:33:36 AM6/27/16
to django-...@googlegroups.com
#26804: update_or_create() updates all columns, which could cause a race condition
-------------------------------------+-------------------------------------
Reporter: jensen-cochran | Owner: nobody
Type: Bug | Status: new
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution:
Keywords: update_or_create | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 1 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by timgraham):

* needs_tests: 0 => 1


--
Ticket URL: <https://code.djangoproject.com/ticket/26804#comment:4>

Django

unread,
Jun 27, 2016, 6:52:17 PM6/27/16
to django-...@googlegroups.com
#26804: update_or_create() updates all columns, which could cause a race condition
-------------------------------------+-------------------------------------
Reporter: jensen-cochran | Owner: nobody
Type: Bug | Status: new
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution:
Keywords: update_or_create | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by jensen-cochran):

* needs_tests: 1 => 0


--
Ticket URL: <https://code.djangoproject.com/ticket/26804#comment:5>

Django

unread,
Jun 27, 2016, 7:00:21 PM6/27/16
to django-...@googlegroups.com
#26804: update_or_create() updates all columns, which could cause a race condition
-------------------------------------+-------------------------------------
Reporter: jensen-cochran | Owner: jensen-
| cochran
Type: Bug | Status: assigned

Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution:
Keywords: update_or_create | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by jensen-cochran):

* owner: nobody => jensen-cochran
* status: new => assigned


--
Ticket URL: <https://code.djangoproject.com/ticket/26804#comment:6>

Django

unread,
Jul 8, 2016, 3:29:39 PM7/8/16
to django-...@googlegroups.com
#26804: update_or_create() updates all columns, which could cause a race condition
-------------------------------------+-------------------------------------
Reporter: jensen-cochran | Owner: jensen-
| cochran
Type: Bug | Status: assigned
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution:
Keywords: update_or_create | Triage Stage: Ready for
| checkin

Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by timgraham):

* stage: Accepted => Ready for checkin


--
Ticket URL: <https://code.djangoproject.com/ticket/26804#comment:7>

Django

unread,
Jul 12, 2016, 2:01:34 PM7/12/16
to django-...@googlegroups.com
#26804: update_or_create() updates all columns, which could cause a race condition
-------------------------------------+-------------------------------------
Reporter: jensen-cochran | Owner: jensen-
| cochran
Type: Bug | Status: assigned
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution:
Keywords: update_or_create | Triage Stage: Accepted

Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 1

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by timgraham):

* needs_better_patch: 0 => 1
* stage: Ready for checkin => Accepted


Comment:

Awaiting resolution of #26884.

--
Ticket URL: <https://code.djangoproject.com/ticket/26804#comment:8>

Django

unread,
Jul 14, 2016, 12:23:28 PM7/14/16
to django-...@googlegroups.com
#26804: update_or_create() updates all columns, which could cause a race condition
-------------------------------------+-------------------------------------
Reporter: jensen-cochran | Owner: jensen-
| cochran
Type: Bug | Status: closed

Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution: fixed

Keywords: update_or_create | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 1
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Tim Graham <timograham@…>):

* status: assigned => closed
* resolution: => fixed


Comment:

In [changeset:"d44afd889275473c97474cca19467d1509e0fcc1" d44afd88]:
{{{
#!CommitTicketReference repository=""
revision="d44afd889275473c97474cca19467d1509e0fcc1"
Fixed #26804 -- Fixed a race condition in QuerySet.update_or_create().
}}}

--
Ticket URL: <https://code.djangoproject.com/ticket/26804#comment:9>

Django

unread,
Jul 17, 2016, 8:24:13 PM7/17/16
to django-...@googlegroups.com
#26804: update_or_create() updates all columns, which could cause a race condition
-------------------------------------+-------------------------------------
Reporter: jensen-cochran | Owner: jensen-
| cochran
Type: Bug | Status: closed
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution: fixed
Keywords: update_or_create | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 1
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by cristianocca):

Would it make sense to avoid the same race condition on get_or_create ?
Although an added select_for_update on a single get query which might be
the use case on most of the cases (99% selects, 1% actually create), would
reduce the performance of those gets by quite a lot since a
select_for_update adds a lock/write to the row. So should probably be
documented or come with an optional flag.

--
Ticket URL: <https://code.djangoproject.com/ticket/26804#comment:10>

Django

unread,
Jul 17, 2016, 8:53:06 PM7/17/16
to django-...@googlegroups.com
#26804: update_or_create() updates all columns, which could cause a race condition
-------------------------------------+-------------------------------------
Reporter: jensen-cochran | Owner: jensen-
| cochran
Type: Bug | Status: closed
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution: fixed
Keywords: update_or_create | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 1
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by jensen-cochran):

The race condition in get_or_create would be: get_or_create does not find
a matching row, then another process inserts a row that get_or_create
would have matched, then get_or_create inserts it's row. Now, two rows in
the table exist with duplicate values (for the columns that were being
searched against).

If you have a unique index in your database for the columns you passed to
get_or_create and this race condition occurred, you would not have
duplicate values inserted. Instead, you would get a db exception. So it is
possible to avoid the overhead of the transaction in some cases, but only
if we know that the columns are unique/unique_together.

--
Ticket URL: <https://code.djangoproject.com/ticket/26804#comment:11>

Django

unread,
Jul 28, 2016, 9:45:43 PM7/28/16
to django-...@googlegroups.com
#26804: update_or_create() updates all columns, which could cause a race condition
-------------------------------------+-------------------------------------
Reporter: jensen-cochran | Owner: jensen-
| cochran
Type: Bug | Status: closed
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution: fixed
Keywords: update_or_create | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 1
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Tim Graham <timograham@…>):

In [changeset:"83be40760a2580db9e5673d111da04f01114aced" 83be4076]:
{{{
#!CommitTicketReference repository=""
revision="83be40760a2580db9e5673d111da04f01114aced"
Fixed #26933 -- Fixed flaky update_or_create() test from refs #26804.
}}}

--
Ticket URL: <https://code.djangoproject.com/ticket/26804#comment:12>

Reply all
Reply to author
Forward
0 new messages