[Django] #35376: Prefetched data not used when combining prefetch_related() and only()

Django

Apr 15, 2024, 10:13:44 AM
to django-...@googlegroups.com
#35376: Prefetched data not used when combining prefetch_related() and only()
-------------------------------------+-------------------------------------
Reporter: Michael Schwarz | Owner: nobody
Type: Uncategorized | Status: new
Component: Database layer (models, ORM) | Version: 4.2
Severity: Normal | Keywords:
Triage Stage: Unreviewed | Has patch: 0
Needs documentation: 0 | Needs tests: 0
Patch needs improvement: 0 | Easy pickings: 0
UI/UX: 0 |
-------------------------------------+-------------------------------------
I think I found a simple case combining `prefetch_related()` and `only()`
where prefetched data isn't used when it should be.

Consider the following models, where a `Restaurant` belongs to a
`Building` and a `Building` is part of a `City`. (Replace `Building` with
`Business` if it bugs you that a building can only contain a single
restaurant 🙃; I noticed too late.)

{{{#!python
from django.db.models import DO_NOTHING
from django.db.models import ForeignKey
from django.db.models import Model
from django.db.models import OneToOneField
from django.db.models import TextField

class City(Model):
    name = TextField()


class Building(Model):
    city = ForeignKey(City, on_delete=DO_NOTHING)

    street_name = TextField()
    street_no = TextField()


class Restaurant(Model):
    building = OneToOneField(Building, on_delete=DO_NOTHING)

    name = TextField()
}}}

I'm trying to build a query set of buildings with some of their attributes
deferred using `only()`, and with the associated restaurants and cities
prefetched. In the following parametrized `pytest` test, only the last
case exhibits the issue; the other 3 work as I would expect.

The test creates an instance of each model, runs the query (`list(...)`),
and then accesses the `restaurant` attribute, which should be prefetched
in every case. It checks that the access does not generate an additional
query, using `django_assert_num_queries()` from `pytest-django`.

{{{#!python
import pytest

from myproject.models import Building
from myproject.models import Restaurant
from myproject.models import City


@pytest.mark.parametrize('qs', [
    Building.objects.prefetch_related("restaurant", "city"),
    Building.objects.only("street_name").prefetch_related("restaurant"),
    Building.objects.only("street_name").prefetch_related("city", "restaurant"),
    # Only this last case fails.
    Building.objects.only("street_name").prefetch_related("restaurant", "city"),
])
def test_repro(db, django_assert_num_queries, qs):
    # One city, one building in it, one restaurant in that building.
    Restaurant.objects.create(
        name="",
        building=Building.objects.create(
            city=City.objects.create(name=""), street_name="", street_no=""
        ),
    )

    # Evaluate the query set, which also runs the prefetches.
    result = list(qs)

    # Accessing the prefetched relation should not hit the database.
    with django_assert_num_queries(0):
        result[0].restaurant
}}}

The first 3 cases of the test above succeed; the last one fails:

{{{
$ venv/bin/pytest test_repro.py
[...]

    @pytest.mark.parametrize('qs', [
        Building.objects.prefetch_related("restaurant", "city"),
        Building.objects.only("street_name").prefetch_related("restaurant"),
        Building.objects.only("street_name").prefetch_related("city", "restaurant"),
        Building.objects.only("street_name").prefetch_related("restaurant", "city")
    ])
    def test_repro(db, django_assert_num_queries, qs):
        Restaurant.objects.create(
            name="",
            building=Building.objects.create(
                city=City.objects.create(name=""), street_name="", street_no=""
            ),
        )

        result = list(qs)

>       with django_assert_num_queries(0):

[...]

E       Failed: Expected to perform 0 queries but 1 was done (add -v option to show queries)
}}}

I've created a [https://github.com/Feuermurmel/only_prefetch_related_repro
minimal project on GitHub] documenting the exact setup, including all
package versions.

AFAICT, in each case above, `restaurant` should be prefetched, and the
attribute access should not generate an additional query. The prefetched
data for `restaurant` goes unused only when all three conditions hold:
`only()` is used, another related model is also prefetched, and that other
model is mentioned _after_ `restaurant` in the call to
`prefetch_related()`.

Running macOS 12.7.4, Python 3.12.2 and Django 4.2.11.
--
Ticket URL: <https://code.djangoproject.com/ticket/35376>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

Django

Apr 15, 2024, 11:08:27 AM
to django-...@googlegroups.com
#35376: Prefetched data not used when combining prefetch_related() and only()
-------------------------------------+-------------------------------------
Reporter: Michael Schwarz | Owner: nobody
Type: New feature | Status: closed
Component: Database layer (models, ORM) | Version: 4.2
Severity: Normal | Resolution: duplicate
Keywords: | Triage Stage: Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Sarah Boyce):

* resolution: => duplicate
* status: new => closed
* type: Uncategorized => New feature

Comment:

Thank you for the report!
I think this is a duplicate of #33835 which is closed as 'wontfix' as the
behaviour is intentional.
--
Ticket URL: <https://code.djangoproject.com/ticket/35376#comment:1>

Django

Apr 15, 2024, 2:11:30 PM
to django-...@googlegroups.com
Changes (by Simon Charette):

* cc: Simon Charette (added)

Comment:

I think there might be more to it here, given that `Building.restaurant`
is a reverse one-to-one: there is no field stored on `Building` to
reference it and thus no field to ''defer''.

If `.city` was used in the test then it would have been a duplicate of
#33835 (and
[https://github.com/django/django/blob/47c608202a58c8120d049c98d5d27c4609551d33/tests/defer/tests.py#L315-L329
something we could warn about] like we do with `select_related` + `only`
misuse), as using `prefetch_related("city")` requires that `"city_id"` is
included in the select mask (included in `only` and excluded from
`defer`), but here it looks like something different.
--
Ticket URL: <https://code.djangoproject.com/ticket/35376#comment:2>

Django

Apr 15, 2024, 2:22:55 PM
to django-...@googlegroups.com
Comment (by Simon Charette):

Ah I think this is actually a duplicate of #35044 which is only resolved
in `main`.

What is happening here is that in the last case `restaurant` is
prefetched and stored on the instance, but then accessing `city_id` to
prefetch the right `city` requires calling `refresh_from_db` (as it's
excluded from the select mask). This triggers #35044 (which clears all
prefetched data on each `refresh_from_db` call), so the prefetched
`restaurant` instance is discarded, causing the access to the property in
the test to refetch from the database.

The only solution in 4.2 is to include `city` in your select mask (like
you'd want to do anyway to avoid an N+1) to avoid the bugged
`refresh_from_db` call.
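
To make the ordering effect described above easier to follow, here is a minimal pure-Python sketch of the mechanism. It is a toy model, not Django's implementation: `ToyBuilding`, `toy_prefetch`, `FORWARD_KEYS` and `extra_queries_on_access` are hypothetical names invented for illustration, and the only behaviour it encodes is what the comment states, namely that loading a deferred field goes through `refresh_from_db`, which discards all prefetched data.

```python
# Toy model of the mechanism described in the comment above.
# NOT Django's code: every name below is hypothetical.

# Forward relations need a local column before they can be prefetched;
# the reverse one-to-one "restaurant" only needs the primary key.
FORWARD_KEYS = {"city": "city_id"}


class ToyBuilding:
    def __init__(self, only):
        self._row = {"id": 1, "street_name": "", "street_no": "", "city_id": 7}
        # Assumption: the primary key is always loaded, whatever only() lists.
        self._loaded = {name: self._row[name] for name in {"id", *only}}
        self._prefetched = {}
        self.queries = 0

    def get_field(self, name):
        # A deferred field is loaded on first access via a refresh.
        if name not in self._loaded:
            self.refresh_from_db(name)
        return self._loaded[name]

    def refresh_from_db(self, name):
        self.queries += 1
        self._loaded[name] = self._row[name]
        # Behaviour attributed to #35044: every refresh drops all
        # prefetched data, including unrelated relations.
        self._prefetched.clear()

    def get_related(self, name):
        # Falls back to a query when nothing usable was prefetched.
        if name not in self._prefetched:
            self.queries += 1
            self._prefetched[name] = f"{name} (lazy)"
        return self._prefetched[name]


def toy_prefetch(instance, lookups):
    # Lookups are processed in the order they were given.
    for lookup in lookups:
        key = FORWARD_KEYS.get(lookup)
        if key is not None:
            instance.get_field(key)  # may trigger refresh_from_db
        instance._prefetched[lookup] = f"{lookup} (prefetched)"


def extra_queries_on_access(only, lookups):
    building = ToyBuilding(only)
    toy_prefetch(building, lookups)
    building.queries = 0  # count only queries caused by the access itself
    building.get_related("restaurant")
    return building.queries


print(extra_queries_on_access(["street_name"], ["restaurant", "city"]))  # 1
print(extra_queries_on_access(["street_name"], ["city", "restaurant"]))  # 0
print(extra_queries_on_access(["street_name", "city_id"], ["restaurant", "city"]))  # 0
```

In this sketch the failing order is the one where `restaurant` is cached before the deferred `city_id` forces a refresh. Adding the key to the select mask removes the refresh entirely; in terms of the reporter's query set, that suggestion would read `Building.objects.only("street_name", "city").prefetch_related("restaurant", "city")`.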
--
Ticket URL: <https://code.djangoproject.com/ticket/35376#comment:3>

Django

Apr 17, 2024, 5:32:26 AM
to django-...@googlegroups.com
Comment (by Michael Schwarz):

Thanks for triaging this and finding a workaround/proper usage for my use
case! 😊
--
Ticket URL: <https://code.djangoproject.com/ticket/35376#comment:4>