#36035: Prefetch while annotating the reference of a m2m relation generates
duplicated joins
-------------------------------------+-------------------------------------
Reporter: Thiago Bellini | Type: Bug
Ribeiro | Component: Database
Status: new | layer (models, ORM)
Version: 5.1 | Severity: Normal
Keywords: | Triage Stage:
| Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
I noticed this when I started investigating this issue from a library that
I maintain:
https://github.com/strawberry-graphql/strawberry-
django/issues/650
Consider the following models:
{{{#!python
class User(models.Model):
groups = models.ManyToManyField("Group", related_name="users")
class Email(models.Model):
user = models.ForeignKey(User, related_name="emails")
class Group(models.Model):
...
}}}
If I do:
{{{#!python
User.objects.prefetch(Prefetch("emails",
Email.objects.annotate(user_name=F("user__name"))))
}}}
This gets properly resolved, and the prefetch will generate a query like:
{{{#!sql
SELECT "email"."id",
"email"."user_id",
"user"."name" AS "foo"
FROM "email"
INNER JOIN "user"
ON ("email"."product_id" = "user"."id")
WHERE "email"."product_id" IN (<ids>)
}}}
Perfect!
Now, if I do the same for groups, which is a m2m relation, like this:
{{{#!python
User.objects.prefetch(Prefetch("groups",
Group.objects.annotate(user_name=F("user__name"))))
}}}
The prefetch will get resolved like this:
{{{#!sql
SELECT ("user_group"."user_id") AS "_prefetch_related_val_user_id",
"group"."id",
"group"."name",
"user"."name" AS "foo"
FROM "group"
LEFT OUTER JOIN "user_group"
ON ("group"."id" = "user_group"."group_id")
LEFT OUTER JOIN "user"
ON ("user_group"."user_id" = "user"."id")
INNER JOIN "user_group" T4
ON ("group"."id" = T4."group_id")
WHERE T4."user_id" IN (<ids>)
}}}
This of course cause spurious results in the prefetched results. And it
also happens even if I annotate the id, which doesn't force a join with
the related table:
{{{#!sql
SELECT ("user_group"."user_id") AS "_prefetch_related_val_user_id",
"group"."id",
"group"."name",
"user_group"."user_id" AS "annotated_user_id"
FROM "group"
LEFT OUTER JOIN "user_group"
ON ("group"."id" = "user_group"."group_id")
INNER JOIN "user_group" T4
ON ("group"."id" = T4."group_id")
WHERE T4."product_id" IN (<ids>)
}}}
I tried digging into the codebase and found out that it is related to the
join in the prefetch queryset not being reused when the prefetch is
actually executed and filtered by the instances. The related line is this
one:
https://github.com/django/django/blob/fcd9d08379a2aee3b2c49eab0d0b8db6fd66d091/django/db/models/sql/query.py#L1946
In that line, it only uses `can_reuse` for a m2m, otherwise, it passes
`None`, which will allow reuse. But that `can_reuse` set is empty when
adding the join for the instances filter, which forces it to create the
`T4` alias.
If I comment that line and force `reuse = None` the issue appears to be
solved and I don't get the extra join anymore (but of course the fix is
not only that, as that code is there for some reason)
--
Ticket URL: <
https://code.djangoproject.com/ticket/36035>
Django <
https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.