[Django] #28806: Mechanism of fetching related objects violates READ COMMITTED assumption of Django ORM

7 views
Skip to first unread message

Django

unread,
Nov 16, 2017, 4:28:58 PM11/16/17
to django-...@googlegroups.com
#28806: Mechanism of fetching related objects violates READ COMMITTED assumption of
Django ORM
-------------------------------------+-------------------------------------
Reporter: Aaron Tan | Owner: nobody
Type: Bug | Status: new
Component: Database | Version: master
layer (models, ORM) | Keywords: database, read-
Severity: Normal | committed, concurrency control
Triage Stage: | Has patch: 0
Unreviewed |
Needs documentation: 0 | Needs tests: 0
Patch needs improvement: 0 | Easy pickings: 0
UI/UX: 0 |
-------------------------------------+-------------------------------------
Found here:
https://github.com/django/django/blob/master/django/db/models/fields/related_descriptors.py#L141.
{{{
def get_object(self, instance):
qs = self.get_queryset(instance=instance)
# Assuming the database enforces foreign keys, this won't fail.
return qs.get(self.field.get_reverse_related_filter(instance))
}}}


The comment in that function states: `# Assuming the database enforces
foreign keys, this won't fail.`, but it's not the case. I will illustrate
with a use case we met:

Suppose we have 2 concurrent transactions A and B talking to the same
Postgresql backend, and we have two models:
{{{
class Foo:
pass

class Bar:
fk = ForeignKey(Foo, on_delete=models.SET_NULL)
}}}
populated by objects
{{{
foo = Foo()
foo.save()
bar = Bar(fk=foo)
bar.save()
}}}
Transaction A runs
{{{
bar = Bar.objects.get(pk=<bar's pk>)
}}}
Then transaction B runs
{{{
Foo.objects.get(pk=<foo's pk>).delete()
}}}
If READ COMMITTED is assumed by Django, transaction A will see the db
snapshot with `foo` deleted. But if transaction A subsequently did
{{{
bar.fk #Access foreign key field `fk` for the first time
}}}

Because the `get_object` method linked does not catch `ObjectDoesNotExist`
exception tossed by `get`, the last `bar.fk` will raise
`ObjectDoesNotExist`, thus contradicts the assumption

> the database enforces foreign keys, this won't fail.

We currently manually catch `ObjectDoesNotExist` but that makes code ugly,
I suggest instantiate the related field to `None` in that case, instead of
raising exception.

--
Ticket URL: <https://code.djangoproject.com/ticket/28806>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

Django

unread,
Nov 21, 2017, 5:55:30 PM11/21/17
to django-...@googlegroups.com
#28806: Mechanism of fetching related objects violates READ COMMITTED assumption of
Django ORM
-------------------------------------+-------------------------------------
Reporter: Aaron Tan | Owner: nobody
Type: Bug | Status: new
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution:
Keywords: database, read- | Triage Stage:
committed, concurrency control | Unreviewed
Has patch: 0 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Tomer Chachamu):

The comment is definitely inaccurate, but I don't think the attribute
returning {{{None}}} is the right answer. Sure, the FK is set to NULL on
conflict, but maybe transaction B also updated
{{{Bar.objects.get(pk=<bar's pk>).pk}}} to a new value? In this case, A
will see the new value after a {{{bar.refresh_from_db()}}}, that is not
{{{None}}}.

The docs could probably be improved, to mention that you can use
{{{select_related}}} to prevent this {{{DoesNotExist}}} from being thrown.
(Although, it doesn't remove the fact that you will insert on save, when
you probably meant to update).

--
Ticket URL: <https://code.djangoproject.com/ticket/28806#comment:1>

Django

unread,
Nov 21, 2017, 7:27:50 PM11/21/17
to django-...@googlegroups.com
#28806: Mechanism of fetching related objects violates READ COMMITTED assumption of
Django ORM
-------------------------------------+-------------------------------------
Reporter: Aaron Tan | Owner: nobody
Type: Bug | Status: new
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution:
Keywords: database, read- | Triage Stage:
committed, concurrency control | Unreviewed
Has patch: 0 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Aaron Tan):

Replying to [comment:1 Tomer Chachamu]:


> The comment is definitely inaccurate, but I don't think the attribute
returning {{{None}}} is the right answer. Sure, the FK is set to NULL on
conflict, but maybe transaction B also updated
{{{Bar.objects.get(pk=<bar's pk>).pk}}} to a new value? In this case, A
will see the new value after a {{{bar.refresh_from_db()}}}, that is not
{{{None}}}.
>
> The docs could probably be improved, to mention that you can use
{{{select_related}}} to prevent this {{{DoesNotExist}}} from being thrown.
(Although, it doesn't remove the fact that you will insert on save, when
you probably meant to update).

I see, although manually modifying primary key seems like a kinda
infrequent operation, but that makes sense. In this case, can we at least
capture {{{ObjectDoesNotExist}}} there and make a partial
`refresh_from_database` by only refreshing and instantiating the related
field in question? Since selector works on a pretty low level of Django's
call stack, it might be possible to do that? After all, this means "any
fetching related object for the first time is theoretically vulnerable to
such concurrency issue", wrapping all of them around {{{try...except...}}}
is cumbersome and error-prone in large and convoluted codebases.

Also, using {{{select_related}}} to supress {{{DoesNotExist}}} was our
first solution, but somehow it does not work. Maybe it is worth
investigating/experimenting.

--
Ticket URL: <https://code.djangoproject.com/ticket/28806#comment:2>

Django

unread,
Dec 28, 2017, 11:10:23 AM12/28/17
to django-...@googlegroups.com
#28806: Mechanism of fetching related objects violates READ COMMITTED assumption of
Django ORM
-------------------------------------+-------------------------------------
Reporter: Aaron Tan | Owner: nobody
Type: Bug | Status: new
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution:
Keywords: database, read- | Triage Stage: Accepted
committed, concurrency control |
Has patch: 0 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Tim Graham):

* stage: Unreviewed => Accepted


Comment:

Accepting for further investigating. I haven't taken time to understand
whether or not a change should be made.

--
Ticket URL: <https://code.djangoproject.com/ticket/28806#comment:3>

Django

unread,
May 26, 2018, 5:09:00 AM5/26/18
to django-...@googlegroups.com
#28806: Mechanism of fetching related objects violates READ COMMITTED assumption of
Django ORM
-------------------------------------+-------------------------------------
Reporter: Aaron Tan | Owner: nobody
Type: Bug | Status: new
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution:
Keywords: database, read- | Triage Stage: Accepted
committed, concurrency control |
Has patch: 0 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by holvianssi):

There's nothing else to do for this case except throw an informative
error. This is what you get with concurrent database requests.

BTW there's another reason why the comment is wrong. Django uses deferred
foreign keys, so nothing prevents you from deleting the parent model
inside a transaction, the fk is only validated during commit.

--
Ticket URL: <https://code.djangoproject.com/ticket/28806#comment:4>

Django

unread,
Aug 17, 2022, 12:29:50 AM8/17/22
to django-...@googlegroups.com
#28806: Mechanism of fetching related objects violates READ COMMITTED assumption of
Django ORM
-------------------------------------+-------------------------------------
Reporter: Aaron Tan | Owner: nobody
Type: Bug | Status: new
Component: Database layer | Version: dev

(models, ORM) |
Severity: Normal | Resolution:
Keywords: database, read- | Triage Stage: Accepted
committed, concurrency control |
Has patch: 0 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Simon Charette):

I would agree that there is little Django can do here and that we should
likely close this issue as ''wontfix'' and move on. Silencing errors would
do more harm than good as previous at accessing `fk_id` could have
returned a value but then `fk` would be `None`?

We could adjust the comment but ultimately the only way to ensure the
retrieval of a graph of object that is coherent at query time is to use
`select_related` to ensure all the data is retrieved a single query. From
that time objects are not locked anyhow and the only way to ensure they
are not altered is to use `select_for_update` and friends.

I would argue that lazy fetching of related objects honours `READ
COMMITED` assumption of the ORM; if you defer relationship then fetching
at a later point in time will attempt to reconciliate your in-memory
representation of model instances with the latest committed database state
possibly resulting in exceptions.

--
Ticket URL: <https://code.djangoproject.com/ticket/28806#comment:5>

Django

unread,
Aug 17, 2022, 5:48:14 AM8/17/22
to django-...@googlegroups.com
#28806: Mechanism of fetching related objects violates READ COMMITTED assumption of
Django ORM
-------------------------------------+-------------------------------------
Reporter: Aaron Tan | Owner: nobody
Type: Bug | Status: closed

Component: Database layer | Version: dev
(models, ORM) |
Severity: Normal | Resolution: wontfix

Keywords: database, read- | Triage Stage:
committed, concurrency control | Unreviewed
Has patch: 0 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Mariusz Felisiak):

* status: new => closed
* resolution: => wontfix
* stage: Accepted => Unreviewed


--
Ticket URL: <https://code.djangoproject.com/ticket/28806#comment:6>

Reply all
Reply to author
Forward
0 new messages