This is the SQL that is currently generated:
{{{#!sql
SELECT COUNT(*) AS "__count"
FROM "app_foo"
INNER JOIN "app_foo_bar" ON ("app_foo"."id" = "app_foo_bar"."foo_id")
WHERE "app_foo_bar"."foo_id" = ?;
}}}
This is the SQL that should be generated:
{{{#!sql
SELECT COUNT(*) AS "__count"
FROM "app_foo_bar"
WHERE "app_foo_bar"."foo_id" = ?;
}}}
This optimization can only be applied when there are no filters applied,
because then the join is used to satisfy the filters. In the no-filters
case, only the through table needs to be consulted.
--
Ticket URL: <https://code.djangoproject.com/ticket/29725>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.
* status: new => assigned
* owner: nobody => oliver
--
Ticket URL: <https://code.djangoproject.com/ticket/29725#comment:1>
* version: 2.1 => master
* type: Uncategorized => Cleanup/optimization
* stage: Unreviewed => Accepted
Comment:
Hello Olivier,
I think you should be able to achieve this by defining a `count()` method
on the dynamic class created by `create_forward_many_to_many_manager` by
filtering `self.through._default_manager` based on `self.instance` and
return its `count()`.
--
Ticket URL: <https://code.djangoproject.com/ticket/29725#comment:2>
* cc: Simon Charette (added)
--
Ticket URL: <https://code.djangoproject.com/ticket/29725#comment:3>
Comment (by oliver):
Thanks for your advice.
I will try to correct it according to your advice soon
Replying to [comment:2 Simon Charette]:
> Hello Olivier,
>
> I think you should be able to achieve this by defining a `count()`
method on the dynamic class created by
`create_forward_many_to_many_manager` by filtering
`self.through._default_manager` based on `self.instance` and return its
`count()`.
--
Ticket URL: <https://code.djangoproject.com/ticket/29725#comment:4>
Comment (by oliver):
https://github.com/django/django/pull/10366
--
Ticket URL: <https://code.djangoproject.com/ticket/29725#comment:5>
* has_patch: 0 => 1
* needs_tests: 0 => 1
--
Ticket URL: <https://code.djangoproject.com/ticket/29725#comment:6>
Comment (by Tom Forbes):
Have we considered making this change for `exists()` as well? I'm sure
this will generate the join in the same way that `count()` does.
Also somewhat related: https://code.djangoproject.com/ticket/28477
--
Ticket URL: <https://code.djangoproject.com/ticket/29725#comment:7>
Comment (by Simon Charette):
Doing it for `exists()` as well makes sense.
--
Ticket URL: <https://code.djangoproject.com/ticket/29725#comment:8>
* needs_better_patch: 0 => 1
* needs_tests: 1 => 0
--
Ticket URL: <https://code.djangoproject.com/ticket/29725#comment:9>
* needs_better_patch: 1 => 0
--
Ticket URL: <https://code.djangoproject.com/ticket/29725#comment:10>
* status: assigned => closed
* resolution: => fixed
Comment:
In [changeset:"1299421cadc4fcf63585f2f88337078e43e660e0" 1299421]:
{{{
#!CommitTicketReference repository=""
revision="1299421cadc4fcf63585f2f88337078e43e660e0"
Fixed #29725 -- Removed unnecessary join in QuerySet.count() and exists()
on a many-to-many relation.
}}}
--
Ticket URL: <https://code.djangoproject.com/ticket/29725#comment:11>
Comment (by Mariusz Felisiak <felisiak.mariusz@…>):
In [changeset:"5f7991c42cff73b6278106d499d719b726f85ead" 5f7991c4]:
{{{
#!CommitTicketReference repository=""
revision="5f7991c42cff73b6278106d499d719b726f85ead"
Fixed #30325 -- Reverted "Fixed #29725 -- Removed unnecessary join in
QuerySet.count() and exists() on a many-to-many relation."
This reverts commit 1299421cadc4fcf63585f2f88337078e43e660e0 due to
a regression with custom managers.
}}}
--
Ticket URL: <https://code.djangoproject.com/ticket/29725#comment:12>
Comment (by Mariusz Felisiak <felisiak.mariusz@…>):
In [changeset:"e8de1cc94c9a251ba7c569468f1e94faf9a05671" e8de1cc9]:
{{{
#!CommitTicketReference repository=""
revision="e8de1cc94c9a251ba7c569468f1e94faf9a05671"
[2.2.x] Fixed #30325 -- Reverted "Fixed #29725 -- Removed unnecessary join
in QuerySet.count() and exists() on a many-to-many relation."
This reverts commit 1299421cadc4fcf63585f2f88337078e43e660e0 due to
a regression with custom managers.
Backport of 5f7991c42cff73b6278106d499d719b726f85ead from master
}}}
--
Ticket URL: <https://code.djangoproject.com/ticket/29725#comment:13>
* status: closed => new
* resolution: fixed =>
Comment:
Solution has been reverted because it caused the inconsistent behavior of
`count()` and `exists()` on a reverse many-to-many relationship with a
custom manager (see new tests in
9ac8520fcde29840a1345be19d80dbda53aa6d03).
--
Ticket URL: <https://code.djangoproject.com/ticket/29725#comment:14>
* has_patch: 1 => 0
--
Ticket URL: <https://code.djangoproject.com/ticket/29725#comment:15>
* needs_better_patch: 0 => 1
* has_patch: 0 => 1
Comment:
Switched back to ''patch needs improvement'' instead as it can probably be
adapted to skip the optimization when a custom manager is defined.
--
Ticket URL: <https://code.djangoproject.com/ticket/29725#comment:16>
Comment (by felixxm):
Ticket #30325 revealed one more issue, i.e. optimization doesn't work when
chaining `all()` after reverse M2M relations.
--
Ticket URL: <https://code.djangoproject.com/ticket/29725#comment:17>
Comment (by Mike Hansen):
Additionally, the optimization causes a query to be done if the related
objects have already been prefetched.
--
Ticket URL: <https://code.djangoproject.com/ticket/29725#comment:18>
* owner: oliver => Shiwei Chen
* status: new => assigned
Comment:
Thanks everyone for your insights. Im going to try to implement this
optimization, accounting for all the issues listed here so far.
--
Ticket URL: <https://code.djangoproject.com/ticket/29725#comment:19>
* needs_tests: 0 => 1
Comment:
Patch needs a few tweaks to properly disable the custom manager (most
notably tests). It also lacks coverage for the prefetched case as reported
by comment:18.
--
Ticket URL: <https://code.djangoproject.com/ticket/29725#comment:20>
* needs_better_patch: 1 => 0
* needs_tests: 1 => 0
Comment:
Updated the patch to properly deal with cases involving custom managers
and prefetch queries, and implemented test cases for them.
https://github.com/django/django/pull/16863
--
Ticket URL: <https://code.djangoproject.com/ticket/29725#comment:21>
* needs_better_patch: 0 => 1
* needs_tests: 0 => 1
--
Ticket URL: <https://code.djangoproject.com/ticket/29725#comment:22>
* owner: Shiwei Chen => ontowhee
--
Ticket URL: <https://code.djangoproject.com/ticket/29725#comment:23>
I'm trying to pick up on the work from PR #16863, and I am beginning to
learn about M2M relationships, RelatedManagers, and QuerySets. From
reading the discussions on this ticket and looking through the comments on
the PR, this needs to handle chaining .all() for the reverse M2M relation.
Is that the `self.article.publications`?
Questions:
- What other chaining needs to be considered?
- How many levels and combinations of chaining need to be considered? For
example, I imagine it is desirable to ensure the optimization also occurs
for .all().all().all().count() too. Another example would be
.all().filter().count(), where the filter is technically an empty filter,
but it would return a QuerySet.
- Where would be the best level to implement the optimization? The PR
handles it in ManyRelatedManager. Another possibility could be to
implement this on compiler.get_from_clause(). I found this idea from
reading the comment on another ticket
https://code.djangoproject.com/ticket/20535#comment:3.
I'm not sure if I'm asking the right questions. If anyone has any pointers
for what areas I should look into, please let me know.
--
Ticket URL: <https://code.djangoproject.com/ticket/29725#comment:24>
Thank you for working on this ticket ontowhee.
From my understanding the [https://github.com/django/django/pull/16863/
closed PR] is 95% of the way there as it builds on top of the initial work
that had to be reverted and adds extra coverage. I'd start by creating a
new PR out of it and deal with the unaddressed left by Mariusz to see how
the test suite reacts.
> How many levels and combinations of chaining need to be considered? For
example, I imagine it is desirable to ensure the optimization also occurs
for .all().all().all().count() too. Another example would be
.all().filter().count(), where the filter is technically an empty filter,
but it would return a QuerySet.
I'd focus on the many-to-many manager case discussed here only, that's
already hard enough to get right.
> Where would be the best level to implement the optimization? The PR
handles it in ManyRelatedManager. Another possibility could be to
implement this on compiler.get_from_clause(). I found this idea from
reading the comment on another ticket
https://code.djangoproject.com/ticket/20535#comment:3.
Exactly where it's defined in the PR you referenced. Skipping of
unnecessary intermediate joins as described in #20535 is a much larger
problem again that might help here in the long run but I think there are
benefits in solving this well scoped issue first.
> If it is handled in the RelatedManager level, is there a way for the
RelatedManager to look ahead to determine that the next function in the
chain will be a .count() or .exists()?
No need to as at that point it will be a `QuerySet`. Lets keep this ticket
focused on the common many-to-many case where `.count` and `.exists` are
directly called.
--
Ticket URL: <https://code.djangoproject.com/ticket/29725#comment:25>
I'm not sure if this is a viable solution. It passing the
`ManyRelatedManager` to the queryset as an instance variable
`_many_related_queryset`.
[https://github.com/django/django/pull/17811 Draft PR]
> No need to as at that point it will be a QuerySet
I briefly looked into `QuerySet._query` to see if a new optimized sql
query could be generated when the `aliasMap` lists certain table types. I
ended up passing the `ManyRelatedManager` to get something working. I'm
open to exploring a different approach.
--
Ticket URL: <https://code.djangoproject.com/ticket/29725#comment:26>
I have removed the changes involving chaining all().
[https://github.com/django/django/pull/17811 PR]
--
Ticket URL: <https://code.djangoproject.com/ticket/29725#comment:27>