[Django] #31666: Combine prefetch_related queries for FK's to the same target model

6 views
Skip to first unread message

Django

unread,
Jun 5, 2020, 11:58:13 AM6/5/20
to django-...@googlegroups.com
#31666: Combine prefetch_related queries for FK's to the same target model
-------------------------------------+-------------------------------------
Reporter: Adam | Owner: nobody
(Chainz) Johnson |
Type: | Status: new
Cleanup/optimization |
Component: Database | Version: master
layer (models, ORM) |
Severity: Normal | Keywords:
Triage Stage: | Has patch: 0
Unreviewed |
Needs documentation: 0 | Needs tests: 0
Patch needs improvement: 0 | Easy pickings: 0
UI/UX: 0 |
-------------------------------------+-------------------------------------
If you have a source model with two or more FK's to the same target model,
`prefetch_related()` for those FK's could do a single query rather than
several. For models with duplicate targets in different FK's, this would
also reduce transferred data and duplicate objects in memory.

For example:

{{{
from django.db import models


class Author(models.Model):
pass


class Book(models.Model):
author1 = models.ForeignKey(Author, on_delete=models.DO_NOTHING,
related_name='+')
author2 = models.ForeignKey(Author, on_delete=models.DO_NOTHING,
related_name='+')
}}}

If I create some authors and fetch everything, the same author is
represented by two objects because they were fetched in different queries:

{{{
In [1]: from example.core.models import Author, Book

In [2]: a1 = Author.objects.create()

In [3]: a2 = Author.objects.create()

In [4]: a3 = Author.objects.create()

In [5]: b1 = Book.objects.create(author1=a1, author2=a2)

In [6]: b2 = Book.objects.create(author1=a2, author2=a3)

In [7]: bs = Book.objects.prefetch_related('author1', 'author2')

In [8]: bs[0].author1
Out[8]: <Author: Author object (1)>

In [9]: bs[0].author2
Out[9]: <Author: Author object (2)>

In [10]: bs[1].author1
Out[10]: <Author: Author object (2)>

In [11]: bs[0].author2 is bs[1].author1
Out[11]: False
}}}

--
Ticket URL: <https://code.djangoproject.com/ticket/31666>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

Django

unread,
Jun 5, 2020, 12:08:30 PM6/5/20
to django-...@googlegroups.com
#31666: Combine prefetch_related queries for FK's to the same target model
-------------------------------------+-------------------------------------
Reporter: Adam (Chainz) | Owner: nobody

Johnson |
Type: | Status: new
Cleanup/optimization |
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage:
| Unreviewed
Has patch: 0 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Gordon Wrigley):

To my mind the duplicate objects in memory is kinda a feature.
Specifically it seems desirable to me that a prefetch related clause only
changes performance not behaviour. Particularly it seems concerning that
adding a prefetch_related to this (very contrived) example would change
it's behaviour:

{{{
b = Book.objects.get()
b.author1.weight = 90
b.author2.weight = 50
print(b.author1.weight + b.author2.weight)
}}}

--
Ticket URL: <https://code.djangoproject.com/ticket/31666#comment:1>

Django

unread,
Jun 5, 2020, 1:05:53 PM6/5/20
to django-...@googlegroups.com
#31666: Combine prefetch_related queries for FK's to the same target model
-------------------------------------+-------------------------------------
Reporter: Adam (Chainz) | Owner: nobody

Johnson |
Type: | Status: new
Cleanup/optimization |
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage:
| Unreviewed
Has patch: 0 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Simon Charette):

> To my mind the duplicate objects in memory is kinda a feature.
Specifically it seems desirable to me that a prefetch related clause only

changes performance not behaviour...

Agreed, different in memory objects should be created as it would also be
slightly backward incompatible. The ORM doesn't maintain an identity map
of objects (e.g. like SQLAlchemy's session does) so changing it solely for
this feature would be confusing.

Performing a single query could be doable using a form of
`Author.objects.filter(Q(books__author1__in=author1_set) |
Q(books__author2__in=author2_set))` which should naturally result in
duplicated `Author` being returned because of the product caused by the
two `LEFT JOIN`s to `book`.

--
Ticket URL: <https://code.djangoproject.com/ticket/31666#comment:2>

Django

unread,
Jun 8, 2020, 12:37:11 AM6/8/20
to django-...@googlegroups.com
#31666: Combine prefetch_related queries for FK's to the same target model
-------------------------------------+-------------------------------------
Reporter: Adam (Chainz) | Owner: nobody
Johnson |
Type: | Status: closed

Cleanup/optimization |
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution: wontfix
Keywords: | Triage Stage:
| Unreviewed
Has patch: 0 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by felixxm):

* status: new => closed
* resolution: => wontfix


Comment:

I agree with previous comments. This optimization is backward
incompatible, quite niche, and can cause a lot of headache.

--
Ticket URL: <https://code.djangoproject.com/ticket/31666#comment:3>

Reply all
Reply to author
Forward
0 new messages