[Django] #36116: prefetch_related() selects too many objects from a related table with a composite primary key and hidden related name

20 views
Skip to first unread message

Django

unread,
Jan 19, 2025, 6:57:52 PMJan 19
to django-...@googlegroups.com
#36116: prefetch_related() selects too many objects from a related table with a
composite primary key and hidden related name
-------------------------------------+-------------------------------------
Reporter: Jacob Walls | Type:
| Cleanup/optimization
Status: new | Component: Database
| layer (models, ORM)
Version: 5.2 | Severity: Normal
Keywords: | Triage Stage:
| Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Noticed a "fixme"
[https://github.com/django/django/blob/c28f821c9067050ba0d099349a4dfea2b29faf99/django/db/models/fields/related_descriptors.py#L170-L180
comment] in the code that pointed me here. With a hidden related name,
e.g. `comments+`, we hit a branch that expects primary keys to consist of
one column only. When a composite primary key violates that assumption,
more objects than necessary will be fetched.

With this test:

{{{#!diff
diff --git a/tests/composite_pk/models/__init__.py
b/tests/composite_pk/models/__init__.py
index 5996ae33b0..40750d91ae 100644
--- a/tests/composite_pk/models/__init__.py
+++ b/tests/composite_pk/models/__init__.py
@@ -1,7 +1,16 @@
-from .tenant import Comment, Post, Tenant, TimeStamped, Token, User
+from .tenant import (
+ Comment,
+ CommentWithHiddenUserField,
+ Post,
+ Tenant,
+ TimeStamped,
+ Token,
+ User,
+)

__all__ = [
"Comment",
+ "CommentWithHiddenUserField",
"Post",
"Tenant",
"TimeStamped",
diff --git a/tests/composite_pk/models/tenant.py
b/tests/composite_pk/models/tenant.py
index 6286ed2354..66fb371d3a 100644
--- a/tests/composite_pk/models/tenant.py
+++ b/tests/composite_pk/models/tenant.py
@@ -46,6 +46,25 @@ class Comment(models.Model):
text = models.TextField(default="", blank=True)


+class CommentWithHiddenUserField(models.Model):
+ pk = models.CompositePrimaryKey("tenant", "id")
+ tenant = models.ForeignKey(
+ Tenant,
+ on_delete=models.CASCADE,
+ related_name="comments+",
+ )
+ id = models.SmallIntegerField(unique=True, db_column="comment_id")
+ user_id = models.SmallIntegerField()
+ user = models.ForeignObject(
+ User,
+ on_delete=models.CASCADE,
+ from_fields=("tenant_id", "user_id"),
+ to_fields=("tenant_id", "id"),
+ related_name="comments+",
+ )
+ text = models.TextField(default="", blank=True)
+
+
class Post(models.Model):
pk = models.CompositePrimaryKey("tenant_id", "id")
tenant = models.ForeignKey(Tenant, on_delete=models.CASCADE,
default=1)
diff --git a/tests/composite_pk/tests.py b/tests/composite_pk/tests.py
index 6b09480fb0..29da1be47b 100644
--- a/tests/composite_pk/tests.py
+++ b/tests/composite_pk/tests.py
@@ -16,8 +16,9 @@ from django.db import IntegrityError, connection
from django.db.models import CompositePrimaryKey
from django.forms import modelform_factory
from django.test import TestCase
+from django.test.utils import CaptureQueriesContext

-from .models import Comment, Post, Tenant, TimeStamped, User
+from .models import Comment, CommentWithHiddenUserField, Post, Tenant,
TimeStamped, User


class CommentForm(forms.ModelForm):
@@ -38,6 +39,9 @@ class CompositePKTests(TestCase):
email="user...@example.com",
)
cls.comment = Comment.objects.create(tenant=cls.tenant, id=1,
user=cls.user)
+ cls.comment_by_hidden_user =
CommentWithHiddenUserField.objects.create(
+ tenant=cls.tenant, id=1, user=cls.user
+ )

@staticmethod
def get_primary_key_columns(table):
@@ -157,6 +161,13 @@ class CompositePKTests(TestCase):
result = list(Comment.objects.iterator())
self.assertEqual(result, [self.comment])

+ def test_prefetch_related_hidden_related_name(self):
+ with CaptureQueriesContext(connection) as queries:
+
CommentWithHiddenUserField.objects.prefetch_related("user").get(
+ pk=self.comment_by_hidden_user.pk
+ )
+ self.assertIn('"comment_id" = 1', queries[1]["sql"])
+
def test_query(self):
users = User.objects.values_list("pk").order_by("pk")
self.assertNotIn('AS "pk"', str(users.query))
}}}

The `WHERE` is too permissive -- would expect both primary key columns to
be checked, i.e. also "comment_id".

{{{#!py
======================================================================
FAIL: test_prefetch_related_hidden_related_name
(composite_pk.tests.CompositePKTests.test_prefetch_related_hidden_related_name)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/Users/source/django/tests/composite_pk/tests.py", line 169, in
test_prefetch_related_hidden_related_name
self.assertIn('"comment_id" = 1', queries[1]["sql"])
AssertionError: '"comment_id" = 1' not found in 'SELECT
"composite_pk_user"."tenant_id", "composite_pk_user"."id",
"composite_pk_user"."email" FROM "composite_pk_user" WHERE
"composite_pk_user"."tenant_id" IN (1)'

----------------------------------------------------------------------
Ran 1 test in 0.006s

FAILED (failures=1)
}}}
--
Ticket URL: <https://code.djangoproject.com/ticket/36116>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

Django

unread,
Jan 19, 2025, 8:33:57 PMJan 19
to django-...@googlegroups.com
#36116: prefetch_related() selects too many objects from a related table with a
composite primary key and hidden related name
-------------------------------------+-------------------------------------
Reporter: Jacob Walls | Owner: (none)
Type: | Status: new
Cleanup/optimization |
Component: Database layer | Version: 5.2
(models, ORM) |
Severity: Release blocker | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Simon Charette):

* cc: Simon Charette (added)
* severity: Normal => Release blocker
* stage: Unreviewed => Accepted

Comment:

Thanks for surfacing Jacob, glad we foresaw this one in
cb83448891f2920c926f61826d8eae4051a3d8f2.

Tentatively elevating to release blocker as
[https://docs.djangoproject.com/en/5.2/topics/composite-primary-key
/#composite-primary-keys-and-relations we've opted to start documenting]
`ForeignObject`'s usage with `CompositePrimaryKey` and I would expect
users to run into this problem as well.
--
Ticket URL: <https://code.djangoproject.com/ticket/36116#comment:1>

Django

unread,
Jan 19, 2025, 9:57:29 PMJan 19
to django-...@googlegroups.com
#36116: prefetch_related() selects too many objects from a related table with a
composite primary key and hidden related name
-------------------------------------+-------------------------------------
Reporter: Jacob Walls | Owner: Simon
Type: | Charette
Cleanup/optimization | Status: assigned
Component: Database layer | Version: 5.2
(models, ORM) |
Severity: Release blocker | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Simon Charette):

* has_patch: 0 => 1
* owner: (none) => Simon Charette
* status: new => assigned

Comment:

Going to assign since I got a patch ready that doesn't require introducing
new models and we'll surely see many of these required adjustments until
5.0 gets fully released. Let me know if you want me to hold off for longer
when you report these.
--
Ticket URL: <https://code.djangoproject.com/ticket/36116#comment:2>

Django

unread,
Jan 20, 2025, 8:50:50 AMJan 20
to django-...@googlegroups.com
#36116: prefetch_related() selects too many objects from a related table with a
composite primary key and hidden related name
-------------------------------------+-------------------------------------
Reporter: Jacob Walls | Owner: Simon
Type: | Charette
Cleanup/optimization | Status: assigned
Component: Database layer | Version: 5.2
(models, ORM) |
Severity: Release blocker | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Comment (by Sarah Boyce):

> The `WHERE` is too permissive -- would expect both primary key columns
to be checked, i.e. also "comment_id".

Just to clarify, agreed the value of `comment_id` should be in the SQL but
not a reference to the column `comment_id` (e.g. `WHERE
("composite_pk_user"."tenant_id" = 1 AND "composite_pk_user"."id" = 1)`)
--
Ticket URL: <https://code.djangoproject.com/ticket/36116#comment:3>

Django

unread,
Jan 20, 2025, 10:40:53 AMJan 20
to django-...@googlegroups.com
#36116: prefetch_related() selects too many objects from a related table with a
composite primary key and hidden related name
-------------------------------------+-------------------------------------
Reporter: Jacob Walls | Owner: Simon
Type: | Charette
Cleanup/optimization | Status: assigned
Component: Database layer | Version: 5.2
(models, ORM) |
Severity: Release blocker | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 1
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Sarah Boyce):

* needs_better_patch: 0 => 1

--
Ticket URL: <https://code.djangoproject.com/ticket/36116#comment:4>

Django

unread,
Jan 20, 2025, 1:04:59 PMJan 20
to django-...@googlegroups.com
#36116: prefetch_related() selects too many objects from a related table with a
composite primary key and hidden related name
-------------------------------------+-------------------------------------
Reporter: Jacob Walls | Owner: Simon
Type: | Charette
Cleanup/optimization | Status: assigned
Component: Database layer | Version: 5.2
(models, ORM) |
Severity: Release blocker | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 1
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Comment (by Simon Charette):

Clearing ''patch needs improvement'' flag as the `prefectch_related` +
SQLite issue for large number of items is already tracked in #27833.

It just happens that number of parents records is a function of the number
of fields `1000 / len(fields)`.
--
Ticket URL: <https://code.djangoproject.com/ticket/36116#comment:5>

Django

unread,
Jan 20, 2025, 1:05:04 PMJan 20
to django-...@googlegroups.com
#36116: prefetch_related() selects too many objects from a related table with a
composite primary key and hidden related name
-------------------------------------+-------------------------------------
Reporter: Jacob Walls | Owner: Simon
Type: | Charette
Cleanup/optimization | Status: assigned
Component: Database layer | Version: 5.2
(models, ORM) |
Severity: Release blocker | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Simon Charette):

* needs_better_patch: 1 => 0

--
Ticket URL: <https://code.djangoproject.com/ticket/36116#comment:6>

Django

unread,
Jan 20, 2025, 1:08:30 PMJan 20
to django-...@googlegroups.com
#36116: prefetch_related() selects too many objects from a related table with a
composite primary key and hidden related name
-------------------------------------+-------------------------------------
Reporter: Jacob Walls | Owner: Simon
Type: | Charette
Cleanup/optimization | Status: assigned
Component: Database layer | Version: 5.2
(models, ORM) |
Severity: Release blocker | Resolution:
Keywords: | Triage Stage: Ready for
| checkin
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Jacob Walls):

* stage: Accepted => Ready for checkin

--
Ticket URL: <https://code.djangoproject.com/ticket/36116#comment:7>

Django

unread,
Jan 21, 2025, 3:09:54 AMJan 21
to django-...@googlegroups.com
#36116: prefetch_related() selects too many objects from a related table with a
composite primary key and hidden related name
-------------------------------------+-------------------------------------
Reporter: Jacob Walls | Owner: Simon
Type: | Charette
Cleanup/optimization | Status: closed
Component: Database layer | Version: 5.2
(models, ORM) |
Severity: Release blocker | Resolution: fixed
Keywords: | Triage Stage: Ready for
| checkin
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Sarah Boyce <42296566+sarahboyce@…>):

* resolution: => fixed
* status: assigned => closed

Comment:

In [changeset:"626d77e52a3f247358514bcf51c761283968099c" 626d77e5]:
{{{#!CommitTicketReference repository=""
revision="626d77e52a3f247358514bcf51c761283968099c"
Fixed #36116 -- Optimized multi-column ForwardManyToOne prefetching.

Rely on ColPairs and TupleIn which support a single column to be specified
to avoid special casing ForwardManyToOne.get_prefetch_querysets().

Thanks Jacob Walls for the report.
}}}
--
Ticket URL: <https://code.djangoproject.com/ticket/36116#comment:8>

Django

unread,
Jan 21, 2025, 3:14:00 AMJan 21
to django-...@googlegroups.com
#36116: prefetch_related() selects too many objects from a related table with a
composite primary key and hidden related name
-------------------------------------+-------------------------------------
Reporter: Jacob Walls | Owner: Simon
Type: | Charette
Cleanup/optimization | Status: closed
Component: Database layer | Version: 5.2
(models, ORM) |
Severity: Release blocker | Resolution: fixed
Keywords: | Triage Stage: Ready for
| checkin
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Comment (by Sarah Boyce <42296566+sarahboyce@…>):

In [changeset:"9861e8654701bf8b95b2b533034e78058eee71b4" 9861e86]:
{{{#!CommitTicketReference repository=""
revision="9861e8654701bf8b95b2b533034e78058eee71b4"
[5.2.x] Fixed #36116 -- Optimized multi-column ForwardManyToOne
prefetching.

Rely on ColPairs and TupleIn which support a single column to be specified
to avoid special casing ForwardManyToOne.get_prefetch_querysets().

Thanks Jacob Walls for the report.

Backport of 626d77e52a3f247358514bcf51c761283968099c from main.
}}}
--
Ticket URL: <https://code.djangoproject.com/ticket/36116#comment:9>

Django

unread,
Apr 3, 2025, 4:28:37 PMApr 3
to django-...@googlegroups.com
#36116: prefetch_related() selects too many objects from a related table with a
composite primary key and hidden related name
-------------------------------------+-------------------------------------
Reporter: Jacob Walls | Owner: Simon
Type: | Charette
Cleanup/optimization | Status: closed
Component: Database layer | Version: 5.2
(models, ORM) |
Severity: Release blocker | Resolution: fixed
Keywords: | Triage Stage: Ready for
| checkin
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Comment (by Mariusz Felisiak <felisiak.mariusz@…>):

In [changeset:"f7f38f3a0b44d8c6d14344dae66b6ce52cd77b55" f7f38f3]:
{{{#!CommitTicketReference repository=""
revision="f7f38f3a0b44d8c6d14344dae66b6ce52cd77b55"
Fixed #36290 -- Made TupleIn() lookup discard tuples containing None.

Just like the In() lookup discards of None members TupleIn() should
discard tuples containing any None as NULL != NULL in SQL and the
framework expects such queries to be elided under some circumstances.

Refs #31667, #36116.

Thanks Basptise Mispelon for bisecting the regression to 626d77e.
}}}
--
Ticket URL: <https://code.djangoproject.com/ticket/36116#comment:10>

Django

unread,
Apr 3, 2025, 4:28:57 PMApr 3
to django-...@googlegroups.com
#36116: prefetch_related() selects too many objects from a related table with a
composite primary key and hidden related name
-------------------------------------+-------------------------------------
Reporter: Jacob Walls | Owner: Simon
Type: | Charette
Cleanup/optimization | Status: closed
Component: Database layer | Version: 5.2
(models, ORM) |
Severity: Release blocker | Resolution: fixed
Keywords: | Triage Stage: Ready for
| checkin
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Comment (by Mariusz Felisiak <felisiak.mariusz@…>):

In [changeset:"8ebdd37a0b1755842baae3bd34d388156ad4bf53" 8ebdd37a]:
{{{#!CommitTicketReference repository=""
revision="8ebdd37a0b1755842baae3bd34d388156ad4bf53"
[5.2.x] Fixed #36290 -- Made TupleIn() lookup discard tuples containing
None.

Just like the In() lookup discards of None members TupleIn() should
discard tuples containing any None as NULL != NULL in SQL and the
framework expects such queries to be elided under some circumstances.

Refs #31667, #36116.

Thanks Basptise Mispelon for bisecting the regression to 626d77e.

Backport of f7f38f3a0b44d8c6d14344dae66b6ce52cd77b55 from main
}}}
--
Ticket URL: <https://code.djangoproject.com/ticket/36116#comment:11>

Django

unread,
Jun 4, 2025, 4:47:00 AMJun 4
to django-...@googlegroups.com
#36116: prefetch_related() selects too many objects from a related table with a
composite primary key and hidden related name
-------------------------------------+-------------------------------------
Reporter: Jacob Walls | Owner: Simon
Type: | Charette
Cleanup/optimization | Status: closed
Component: Database layer | Version: 5.2
(models, ORM) |
Severity: Release blocker | Resolution: fixed
Keywords: | Triage Stage: Ready for
| checkin
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Comment (by Sarah Boyce <42296566+sarahboyce@…>):

In [changeset:"08187c94ed02c45ad40a32244dedeaa7ac71ca87" 08187c94]:
{{{#!CommitTicketReference repository=""
revision="08187c94ed02c45ad40a32244dedeaa7ac71ca87"
Fixed #36432 -- Fixed a prefetch_related crash on related target subclass
queryset.

Regression in 626d77e52a3f247358514bcf51c761283968099c.

Refs #36116.

Thanks Cornelis Poppema for the excellent report.
}}}
--
Ticket URL: <https://code.djangoproject.com/ticket/36116#comment:12>

Django

unread,
Jun 4, 2025, 4:48:51 AMJun 4
to django-...@googlegroups.com
#36116: prefetch_related() selects too many objects from a related table with a
composite primary key and hidden related name
-------------------------------------+-------------------------------------
Reporter: Jacob Walls | Owner: Simon
Type: | Charette
Cleanup/optimization | Status: closed
Component: Database layer | Version: 5.2
(models, ORM) |
Severity: Release blocker | Resolution: fixed
Keywords: | Triage Stage: Ready for
| checkin
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Comment (by Sarah Boyce <42296566+sarahboyce@…>):

In [changeset:"3340d4144605bd150906d070ae3e910941fa83c9" 3340d414]:
{{{#!CommitTicketReference repository=""
revision="3340d4144605bd150906d070ae3e910941fa83c9"
[5.2.x] Fixed #36432 -- Fixed a prefetch_related crash on related target
subclass queryset.

Regression in 626d77e52a3f247358514bcf51c761283968099c.

Refs #36116.

Thanks Cornelis Poppema for the excellent report.

Backport of 08187c94ed02c45ad40a32244dedeaa7ac71ca87 from main.
}}}
--
Ticket URL: <https://code.djangoproject.com/ticket/36116#comment:13>
Reply all
Reply to author
Forward
0 new messages