[Django] #29271: Chaining Filters on a Reverse Foreign Key Produces Multiple Joins

50 views
Skip to first unread message

Django

unread,
Mar 28, 2018, 8:09:00 PM3/28/18
to django-...@googlegroups.com
#29271: Chaining Filters on a Reverse Foreign Key Produces Multiple Joins
-------------------------------------+-------------------------------------
Reporter: Michael | Owner: nobody
MacIntosh |
Type: Bug | Status: new
Component: Database | Version: 2.0
layer (models, ORM) | Keywords: filter chain
Severity: Normal | reverse foreign key
Triage Stage: | Has patch: 0
Unreviewed |
Needs documentation: 0 | Needs tests: 0
Patch needs improvement: 0 | Easy pickings: 0
UI/UX: 0 |
-------------------------------------+-------------------------------------
When you perform multiple filters on a queryset on a reverse foreign key,
it produces duplicate joins, which causes the results to multiply. If
instead of chaining the filters, you put both of the filters in one
filter, you do not get duplicate results.

You also get this behavior if you repeat the same filter.

This also applies to Q objects that are used in chain filters that
reference values across reverse foreign keys as well.

Example Models:

{{{
class Alpha(models.Model):
field = models.CharField(max_length=100)

class Beta(models.Model):
field = models.CharField(max_length=100)
data = models.CharField(max_length=100)
alpha = models.ForeignKey(Alpha, on_delete=models.CASCADE)
}}}
Example Data:

{{{
Alpha.objects.all().delete()
Beta.objects.all().delete()
alpha = Alpha(field="alpha_text")
alpha.save()
beta = Beta(field="beta_text", data="data1", alpha=alpha)
beta.save()
beta = Beta(field="beta_text", data="data2", alpha=alpha)
beta.save()
}}}

Example Output:
{{{
>>> Alpha.objects.filter(beta__field="beta_text", beta__data="data1")
<QuerySet [<Alpha: Alpha object (12)>]>
>>>
Alpha.objects.filter(beta__field="beta_text").filter(beta__data="data1")
<QuerySet [<Alpha: Alpha object (12)>, <Alpha: Alpha object (12)>]>
}}}

Formatted SQL of the first query (expected):

{{{
SELECT "test_app_alpha"."id",
"test_app_alpha"."field"
FROM "test_app_alpha"
INNER JOIN "test_app_beta"
ON ( "test_app_alpha"."id" = "test_app_beta"."alpha_id" )
WHERE ( "test_app_beta"."field" = beta_text
AND "test_app_beta"."data" = data1 )
}}}
Formatted SQL of the second query (bug)

{{{
SELECT "test_app_alpha"."id",
"test_app_alpha"."field"
FROM "test_app_alpha"
INNER JOIN "test_app_beta"
ON ( "test_app_alpha"."id" = "test_app_beta"."alpha_id" )
INNER JOIN "test_app_beta" T3
ON ( "test_app_alpha"."id" = T3."alpha_id" )
WHERE ( "test_app_beta"."field" = beta_text
AND T3."data" = data1 )
}}}

--
Ticket URL: <https://code.djangoproject.com/ticket/29271>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

Django

unread,
Mar 28, 2018, 8:40:10 PM3/28/18
to django-...@googlegroups.com
#29271: Chaining Filters on a Reverse Foreign Key Produces Multiple Joins
-------------------------------------+-------------------------------------
Reporter: Michael MacIntosh | Owner: nobody
Type: Bug | Status: new
Component: Database layer | Version: 2.0
(models, ORM) |
Severity: Normal | Resolution:
Keywords: filter chain | Triage Stage: Accepted
reverse foreign key |
Has patch: 0 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Andrew Standley):

* stage: Unreviewed => Accepted


--
Ticket URL: <https://code.djangoproject.com/ticket/29271#comment:1>

Django

unread,
Mar 28, 2018, 9:59:53 PM3/28/18
to django-...@googlegroups.com
#29271: Chaining Filters on a Reverse Foreign Key Produces Multiple Joins
-------------------------------------+-------------------------------------
Reporter: Michael MacIntosh | Owner: nobody
Type: Bug | Status: new

Component: Database layer | Version: 2.0
(models, ORM) |
Severity: Normal | Resolution:
Keywords: filter chain | Triage Stage: Accepted
reverse foreign key |
Has patch: 0 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Tim Graham):

Is this really unexpected behavior? Duplicate of #16554 (closed as
invalid)?

--
Ticket URL: <https://code.djangoproject.com/ticket/29271#comment:2>

Django

unread,
Mar 28, 2018, 10:51:33 PM3/28/18
to django-...@googlegroups.com
#29271: Chaining Filters on a Reverse Foreign Key Produces Multiple Joins
-------------------------------------+-------------------------------------
Reporter: Michael MacIntosh | Owner: nobody
Type: Bug | Status: closed

Component: Database layer | Version: 2.0
(models, ORM) |
Severity: Normal | Resolution: invalid

Keywords: filter chain | Triage Stage: Accepted
reverse foreign key |
Has patch: 0 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Simon Charette):

* status: new => closed
* resolution: => invalid


Comment:

This is expected behavior as per
[https://docs.djangoproject.com/en/dev/topics/db/queries/#spanning-multi-
valued-relationships spanning multi-valued relationships] as `beta` is a
reverse `ForeignKey` and thus multi-valued.

I'm not sure how we could give more visibility to this part of the
documentation as this keeps being reported as a bug given how unintuitive
it is. Maybe this could be mentioned in the
[https://docs.djangoproject.com/en/dev/ref/models/querysets/#filter
filter()] reference documentation?

--
Ticket URL: <https://code.djangoproject.com/ticket/29271#comment:3>

Django

unread,
Mar 29, 2018, 2:35:22 PM3/29/18
to django-...@googlegroups.com
#29271: Chaining Filters on a Reverse Foreign Key Produces Multiple Joins
-------------------------------------+-------------------------------------
Reporter: Michael MacIntosh | Owner: nobody
Type: Bug | Status: closed
Component: Database layer | Version: 2.0
(models, ORM) |
Severity: Normal | Resolution: invalid
Keywords: filter chain | Triage Stage: Accepted
reverse foreign key |
Has patch: 0 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Andrew Standley):

I'm pretty sure it's a good rule of thumb that when the behaviour of a
feature is highly unintuitive, one should question whether it was a good
design choice. I'll take this to the mailing list in the hope someone can
explain the justification for this intended behaviour because ,to my naive
self, it seems that `Q` objects would be a cleaner way to allow the type
of multi-valued queries referenced in
[https://docs.djangoproject.com/en/dev/topics/db/queries/#spanning-multi-
valued-relationships spanning multi-valued relationships].

I would argue that regardless, this is an issue with the documentation as
Simon suggests.

The [https://docs.djangoproject.com/en/2.0/ref/models/querysets/#filter
filter] documentation simply states that filters are joined via AND.
Meanwhile the [https://docs.djangoproject.com/en/2.0/topics/db/queries
/#chaining-filters chaining filters] documentation explicitly states that
successive filters act on the result of the last filter. Neither have any
link or reference to the contradictory behaviour documented in


[https://docs.djangoproject.com/en/dev/topics/db/queries/#spanning-multi-
valued-relationships spanning multi-valued relationships]

--
Ticket URL: <https://code.djangoproject.com/ticket/29271#comment:4>

Django

unread,
Mar 29, 2018, 3:39:10 PM3/29/18
to django-...@googlegroups.com
#29271: Chaining Filters on a Reverse Foreign Key Produces Multiple Joins
-------------------------------------+-------------------------------------
Reporter: Michael MacIntosh | Owner: nobody
Type: Bug | Status: closed
Component: Database layer | Version: 2.0
(models, ORM) |
Severity: Normal | Resolution: invalid
Keywords: filter chain | Triage Stage: Accepted
reverse foreign key |
Has patch: 0 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Tim Graham):

#27936 requests documentation improvements.

--
Ticket URL: <https://code.djangoproject.com/ticket/29271#comment:5>

Django

unread,
Mar 29, 2018, 6:17:47 PM3/29/18
to django-...@googlegroups.com
#29271: Chaining Filters on a Reverse Foreign Key Produces Multiple Joins
-------------------------------------+-------------------------------------
Reporter: Michael MacIntosh | Owner: nobody
Type: Bug | Status: closed
Component: Database layer | Version: 2.0
(models, ORM) |
Severity: Normal | Resolution: invalid
Keywords: filter chain | Triage Stage: Accepted
reverse foreign key |
Has patch: 0 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Andrew Standley):

Thanks for the link Tim, but isn't the scope of #27936 quite a bit
different? That ticket is regarding possible changes to the
[https://docs.djangoproject.com/en/dev/topics/db/queries/#spanning-multi-
valued-relationships spanning multi-value relationships] documentation in
order to clarify the 'intended' functionality.

The issue I see is that the
[https://docs.djangoproject.com/en/2.0/topics/db/queries/#chaining-filters
chaining-filters] documentation appears to contradict the
[https://docs.djangoproject.com/en/dev/topics/db/queries/#spanning-multi-
valued-relationships spanning multi-value relationships] documentation.

My suggested changes would be more along the lines of
1) Adding a link to
[https://docs.djangoproject.com/en/2.0/topics/db/queries/#retrieving-
specific-objects-with-filters retrieving specific objects with filters] to
the [https://docs.djangoproject.com/en/2.0/ref/models/querysets/#filter
filter] docs, so the details of filter are more visible.
2) Adding a note to
[https://docs.djangoproject.com/en/2.0/topics/db/queries/#chaining-filters
chaining-filters] that references the behaviour is handled differently for
multi-value relationships and links to
[https://docs.djangoproject.com/en/dev/topics/db/queries/#spanning-multi-
valued-relationships spanning multi-value relationships] .

Also a side note for documentation: this ticket appears to be a duplicate
of #18437 as well.

--
Ticket URL: <https://code.djangoproject.com/ticket/29271#comment:6>

Django

unread,
Jan 10, 2020, 12:28:37 PM1/10/20
to django-...@googlegroups.com
#29271: Chaining Filters on a Reverse Foreign Key Produces Multiple Joins
-------------------------------------+-------------------------------------
Reporter: Michael MacIntosh | Owner: nobody
Type: Bug | Status: new

Component: Database layer | Version: 2.0
(models, ORM) |
Severity: Normal | Resolution:
Keywords: filter chain | Triage Stage:
reverse foreign key | Unreviewed
Has patch: 0 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Timothy Hobbs):

* status: closed => new
* resolution: invalid =>
* stage: Accepted => Unreviewed


Comment:

Can this bug please be re-opened?

Simon Charette's comment appears to be clarifying that two filter's are
intended to be different from one. I get that, that IS expected behavior.
What is NOT expected behavior is the duplication of the entries in the
results and that behavior goes directly against the current documentation:

'The second filter restricts the set of blogs further'

The word 'restricts' is used multiple times in that seciton. The word
'duplicates' never occurs.

That behavior is not only unexpected and undocumented, but never useful.
Doing a `.count()` on the results gives a bad result, displaying the
duplicated data is never desirable, mapping over the duplicated data can
lead to corruption. This behavior is literally always wrong and very hard
to catch in testing.

--
Ticket URL: <https://code.djangoproject.com/ticket/29271#comment:7>

Django

unread,
Jan 13, 2020, 12:42:26 AM1/13/20
to django-...@googlegroups.com
#29271: Chaining Filters on a Reverse Foreign Key Produces Multiple Joins
-------------------------------------+-------------------------------------
Reporter: Michael MacIntosh | Owner: nobody
Type: Bug | Status: new

Component: Database layer | Version: 2.0
(models, ORM) |
Severity: Normal | Resolution:
Keywords: filter chain | Triage Stage:
reverse foreign key | Unreviewed
Has patch: 0 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Simon Charette):

Timothy,

> That behavior is not only unexpected and undocumented, but never useful.

I agree with you, so does all all the folks subscribed to #10060.

But in order to deprecate the current behavior and automatically perform a
subquery pushdown we need an alternative to point the user at when such
events are detected.

There is ungoing work to allow aggregations to be performed using as
subquery which is on my radar for review (#28296) and is something we
could direct the user at. Until this lands there's no alternative we can
point the user at during following theoretical deprecation period where we
warn about this behavior and how it would change in future versions of
Django.

In this particual case I guess we could point users at something like
`.filter(Exists(Q(multi_valued__foo="bar")))` when
`.filter(multi_valued__foo="bar")` is used if we added support for
`Exist(Q)` or `.filter(multi_valued__exists=Q(foo="bar"))` but that's
tricky because [https://en.wikipedia.org/wiki/Correlated_subquery some
database engine are really bad with correlated subqueries] (e.g. MySQL
5.6) so while the results would be more correct it could a ticking bomb
that reveals itself only in production.

In the mean time I'd restate that a mention in the documentation couldn't
hurt. Would you be able to propose an admonition Timothy?

--
Ticket URL: <https://code.djangoproject.com/ticket/29271#comment:8>

Django

unread,
Jan 14, 2020, 8:21:48 AM1/14/20
to django-...@googlegroups.com
#29271: Chaining Filters on a Reverse Foreign Key Produces Multiple Joins
-------------------------------------+-------------------------------------
Reporter: Michael MacIntosh | Owner: nobody
Type: Bug | Status: closed

Component: Database layer | Version: 2.0
(models, ORM) |
Severity: Normal | Resolution: invalid

Keywords: filter chain | Triage Stage:
reverse foreign key | Unreviewed
Has patch: 0 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by felixxm):

* status: new => closed
* resolution: => invalid


Comment:

Timothy, please feel-free to submit a patch with a
clarification/admonition to docs. We'll evaluate it even without reopening
this ticket, which IMO is still invalid.

--
Ticket URL: <https://code.djangoproject.com/ticket/29271#comment:9>

Reply all
Reply to author
Forward
0 new messages