[Django] #29725: Inefficient SQL generated when counting a ManyToMany

51 views
Skip to first unread message

Django

unread,
Aug 29, 2018, 2:55:08 PM8/29/18
to django-...@googlegroups.com
#29725: Inefficient SQL generated when counting a ManyToMany
-------------------------------------+-------------------------------------
Reporter: Gavin | Owner: nobody
Wahl |
Type: | Status: new
Uncategorized |
Component: Database | Version: 2.1
layer (models, ORM) |
Severity: Normal | Keywords:
Triage Stage: | Has patch: 0
Unreviewed |
Needs documentation: 0 | Needs tests: 0
Patch needs improvement: 0 | Easy pickings: 0
UI/UX: 0 |
-------------------------------------+-------------------------------------
When calling count() on an unfiltered many to many relation, a useless
join is included in the SQL that makes it much slower than it should be.
On my dataset, the difference is 1000ms to 100ms, because an index-only
scan can be used.

This is the SQL that is currently generated:

{{{#!sql
SELECT COUNT(*) AS "__count"
FROM "app_foo"
INNER JOIN "app_foo_bar" ON ("app_foo"."id" = "app_foo_bar"."foo_id")
WHERE "app_foo_bar"."foo_id" = ?;
}}}

This is the SQL that should be generated:

{{{#!sql
SELECT COUNT(*) AS "__count"
FROM "app_foo_bar"
WHERE "app_foo_bar"."foo_id" = ?;
}}}

This optimization can only be applied when there are no filters applied,
because then the join is used to satisfy the filters. In the no-filters
case, only the through table needs to be consulted.

--
Ticket URL: <https://code.djangoproject.com/ticket/29725>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

Django

unread,
Aug 30, 2018, 9:39:51 PM8/30/18
to django-...@googlegroups.com
#29725: Inefficient SQL generated when counting a ManyToMany
-------------------------------------+-------------------------------------
Reporter: Gavin Wahl | Owner: oliver
Type: Uncategorized | Status: assigned
Component: Database layer | Version: 2.1
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage:
| Unreviewed
Has patch: 0 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by oliver):

* status: new => assigned
* owner: nobody => oliver


--
Ticket URL: <https://code.djangoproject.com/ticket/29725#comment:1>

Django

unread,
Sep 1, 2018, 12:30:57 PM9/1/18
to django-...@googlegroups.com
#29725: Inefficient SQL generated when counting a ManyToMany
-------------------------------------+-------------------------------------
Reporter: Gavin Wahl | Owner: oliver
Type: | Status: assigned
Cleanup/optimization |
Component: Database layer | Version: master

(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Simon Charette):

* version: 2.1 => master
* type: Uncategorized => Cleanup/optimization
* stage: Unreviewed => Accepted


Comment:

Hello Olivier,

I think you should be able to achieve this by defining a `count()` method
on the dynamic class created by `create_forward_many_to_many_manager` by
filtering `self.through._default_manager` based on `self.instance` and
return its `count()`.

--
Ticket URL: <https://code.djangoproject.com/ticket/29725#comment:2>

Django

unread,
Sep 1, 2018, 9:36:53 PM9/1/18
to django-...@googlegroups.com
#29725: Inefficient SQL generated when counting a ManyToMany
-------------------------------------+-------------------------------------
Reporter: Gavin Wahl | Owner: oliver
Type: | Status: assigned
Cleanup/optimization |
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Simon Charette):

* cc: Simon Charette (added)


--
Ticket URL: <https://code.djangoproject.com/ticket/29725#comment:3>

Django

unread,
Sep 3, 2018, 4:18:46 AM9/3/18
to django-...@googlegroups.com
#29725: Inefficient SQL generated when counting a ManyToMany
-------------------------------------+-------------------------------------
Reporter: Gavin Wahl | Owner: oliver
Type: | Status: assigned
Cleanup/optimization |
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by oliver):

Thanks for your advice.
I will try to correct it according to your advice soon

Replying to [comment:2 Simon Charette]:


> Hello Olivier,
>
> I think you should be able to achieve this by defining a `count()`
method on the dynamic class created by
`create_forward_many_to_many_manager` by filtering
`self.through._default_manager` based on `self.instance` and return its
`count()`.

--
Ticket URL: <https://code.djangoproject.com/ticket/29725#comment:4>

Django

unread,
Sep 4, 2018, 3:48:25 AM9/4/18
to django-...@googlegroups.com
#29725: Inefficient SQL generated when counting a ManyToMany
-------------------------------------+-------------------------------------
Reporter: Gavin Wahl | Owner: oliver
Type: | Status: assigned
Cleanup/optimization |
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by oliver):

https://github.com/django/django/pull/10366

--
Ticket URL: <https://code.djangoproject.com/ticket/29725#comment:5>

Django

unread,
Sep 4, 2018, 3:48:39 AM9/4/18
to django-...@googlegroups.com
#29725: Inefficient SQL generated when counting a ManyToMany
-------------------------------------+-------------------------------------
Reporter: Gavin Wahl | Owner: oliver
Type: | Status: assigned
Cleanup/optimization |
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 1 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by oliver):

* has_patch: 0 => 1
* needs_tests: 0 => 1


--
Ticket URL: <https://code.djangoproject.com/ticket/29725#comment:6>

Django

unread,
Sep 5, 2018, 5:15:21 AM9/5/18
to django-...@googlegroups.com
#29725: Inefficient SQL generated when counting a ManyToMany
-------------------------------------+-------------------------------------
Reporter: Gavin Wahl | Owner: oliver
Type: | Status: assigned
Cleanup/optimization |
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 1 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Tom Forbes):

Have we considered making this change for `exists()` as well? I'm sure
this will generate the join in the same way that `count()` does.

Also somewhat related: https://code.djangoproject.com/ticket/28477

--
Ticket URL: <https://code.djangoproject.com/ticket/29725#comment:7>

Django

unread,
Sep 5, 2018, 9:11:15 AM9/5/18
to django-...@googlegroups.com
#29725: Inefficient SQL generated when counting a ManyToMany
-------------------------------------+-------------------------------------
Reporter: Gavin Wahl | Owner: oliver
Type: | Status: assigned
Cleanup/optimization |
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 1 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Simon Charette):

Doing it for `exists()` as well makes sense.

--
Ticket URL: <https://code.djangoproject.com/ticket/29725#comment:8>

Django

unread,
Sep 30, 2018, 3:05:24 PM9/30/18
to django-...@googlegroups.com
#29725: Inefficient SQL generated when counting a ManyToMany
-------------------------------------+-------------------------------------
Reporter: Gavin Wahl | Owner: oliver
Type: | Status: assigned
Cleanup/optimization |
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 1

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Simon Charette):

* needs_better_patch: 0 => 1
* needs_tests: 1 => 0


--
Ticket URL: <https://code.djangoproject.com/ticket/29725#comment:9>

Django

unread,
Oct 3, 2018, 4:46:50 AM10/3/18
to django-...@googlegroups.com
#29725: Inefficient SQL generated when counting a ManyToMany
-------------------------------------+-------------------------------------
Reporter: Gavin Wahl | Owner: oliver
Type: | Status: assigned
Cleanup/optimization |
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by oliver):

* needs_better_patch: 1 => 0


--
Ticket URL: <https://code.djangoproject.com/ticket/29725#comment:10>

Django

unread,
Oct 15, 2018, 11:02:33 AM10/15/18
to django-...@googlegroups.com
#29725: Inefficient SQL generated when counting a ManyToMany
-------------------------------------+-------------------------------------
Reporter: Gavin Wahl | Owner: oliver
Type: | Status: closed

Cleanup/optimization |
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution: fixed
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Tim Graham <timograham@…>):

* status: assigned => closed
* resolution: => fixed


Comment:

In [changeset:"1299421cadc4fcf63585f2f88337078e43e660e0" 1299421]:
{{{
#!CommitTicketReference repository=""
revision="1299421cadc4fcf63585f2f88337078e43e660e0"
Fixed #29725 -- Removed unnecessary join in QuerySet.count() and exists()
on a many-to-many relation.
}}}

--
Ticket URL: <https://code.djangoproject.com/ticket/29725#comment:11>

Django

unread,
Apr 15, 2019, 6:39:53 AM4/15/19
to django-...@googlegroups.com
#29725: Inefficient SQL generated when counting a ManyToMany
-------------------------------------+-------------------------------------
Reporter: Gavin Wahl | Owner: oliver
Type: | Status: closed
Cleanup/optimization |
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution: fixed
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Mariusz Felisiak <felisiak.mariusz@…>):

In [changeset:"5f7991c42cff73b6278106d499d719b726f85ead" 5f7991c4]:
{{{
#!CommitTicketReference repository=""
revision="5f7991c42cff73b6278106d499d719b726f85ead"
Fixed #30325 -- Reverted "Fixed #29725 -- Removed unnecessary join in


QuerySet.count() and exists() on a many-to-many relation."

This reverts commit 1299421cadc4fcf63585f2f88337078e43e660e0 due to
a regression with custom managers.
}}}

--
Ticket URL: <https://code.djangoproject.com/ticket/29725#comment:12>

Django

unread,
Apr 15, 2019, 6:40:54 AM4/15/19
to django-...@googlegroups.com
#29725: Inefficient SQL generated when counting a ManyToMany
-------------------------------------+-------------------------------------
Reporter: Gavin Wahl | Owner: oliver
Type: | Status: closed
Cleanup/optimization |
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution: fixed
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Mariusz Felisiak <felisiak.mariusz@…>):

In [changeset:"e8de1cc94c9a251ba7c569468f1e94faf9a05671" e8de1cc9]:
{{{
#!CommitTicketReference repository=""
revision="e8de1cc94c9a251ba7c569468f1e94faf9a05671"
[2.2.x] Fixed #30325 -- Reverted "Fixed #29725 -- Removed unnecessary join


in QuerySet.count() and exists() on a many-to-many relation."

This reverts commit 1299421cadc4fcf63585f2f88337078e43e660e0 due to
a regression with custom managers.

Backport of 5f7991c42cff73b6278106d499d719b726f85ead from master
}}}

--
Ticket URL: <https://code.djangoproject.com/ticket/29725#comment:13>

Django

unread,
Apr 15, 2019, 6:46:32 AM4/15/19
to django-...@googlegroups.com
#29725: Inefficient SQL generated when counting a ManyToMany
-------------------------------------+-------------------------------------
Reporter: Gavin Wahl | Owner: oliver
Type: | Status: new

Cleanup/optimization |
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by felixxm):

* status: closed => new
* resolution: fixed =>


Comment:

Solution has been reverted because it caused the inconsistent behavior of
`count()` and `exists()` on a reverse many-to-many relationship with a
custom manager (see new tests in
9ac8520fcde29840a1345be19d80dbda53aa6d03).

--
Ticket URL: <https://code.djangoproject.com/ticket/29725#comment:14>

Django

unread,
Apr 15, 2019, 10:54:53 AM4/15/19
to django-...@googlegroups.com
#29725: Inefficient SQL generated when counting a ManyToMany
-------------------------------------+-------------------------------------
Reporter: Gavin Wahl | Owner: oliver
Type: | Status: new
Cleanup/optimization |
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by felixxm):

* has_patch: 1 => 0


--
Ticket URL: <https://code.djangoproject.com/ticket/29725#comment:15>

Django

unread,
Apr 15, 2019, 11:21:50 AM4/15/19
to django-...@googlegroups.com
#29725: Inefficient SQL generated when counting a ManyToMany
-------------------------------------+-------------------------------------
Reporter: Gavin Wahl | Owner: oliver
Type: | Status: new
Cleanup/optimization |
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 1

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Simon Charette):

* needs_better_patch: 0 => 1

* has_patch: 0 => 1


Comment:

Switched back to ''patch needs improvement'' instead as it can probably be
adapted to skip the optimization when a custom manager is defined.

--
Ticket URL: <https://code.djangoproject.com/ticket/29725#comment:16>

Django

unread,
Apr 15, 2019, 12:47:31 PM4/15/19
to django-...@googlegroups.com
#29725: Inefficient SQL generated when counting a ManyToMany
-------------------------------------+-------------------------------------
Reporter: Gavin Wahl | Owner: oliver
Type: | Status: new
Cleanup/optimization |
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 1

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by felixxm):

Ticket #30325 revealed one more issue, i.e. optimization doesn't work when
chaining `all()` after reverse M2M relations.

--
Ticket URL: <https://code.djangoproject.com/ticket/29725#comment:17>

Django

unread,
Apr 23, 2019, 5:39:31 PM4/23/19
to django-...@googlegroups.com
#29725: Inefficient SQL generated when counting a ManyToMany
-------------------------------------+-------------------------------------
Reporter: Gavin Wahl | Owner: oliver
Type: | Status: new
Cleanup/optimization |
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 1

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Mike Hansen):

Additionally, the optimization causes a query to be done if the related
objects have already been prefetched.

--
Ticket URL: <https://code.djangoproject.com/ticket/29725#comment:18>

Django

unread,
May 16, 2023, 4:07:55 PM5/16/23
to django-...@googlegroups.com
#29725: Inefficient SQL generated when counting a ManyToMany
-------------------------------------+-------------------------------------
Reporter: Gavin Wahl | Owner: Shiwei
Type: | Chen
Cleanup/optimization | Status: assigned
Component: Database layer | Version: dev

(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 1

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Shiwei Chen):

* owner: oliver => Shiwei Chen


* status: new => assigned


Comment:

Thanks everyone for your insights. Im going to try to implement this
optimization, accounting for all the issues listed here so far.

--
Ticket URL: <https://code.djangoproject.com/ticket/29725#comment:19>

Django

unread,
May 17, 2023, 12:42:50 AM5/17/23
to django-...@googlegroups.com
#29725: Inefficient SQL generated when counting a ManyToMany
-------------------------------------+-------------------------------------
Reporter: Gavin Wahl | Owner: Shiwei
Type: | Chen
Cleanup/optimization | Status: assigned
Component: Database layer | Version: dev
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 1 | Patch needs improvement: 1

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Simon Charette):

* needs_tests: 0 => 1


Comment:

Patch needs a few tweaks to properly disable the custom manager (most
notably tests). It also lacks coverage for the prefetched case as reported
by comment:18.

--
Ticket URL: <https://code.djangoproject.com/ticket/29725#comment:20>

Django

unread,
Jun 9, 2023, 2:14:43 PM6/9/23
to django-...@googlegroups.com
#29725: Inefficient SQL generated when counting a ManyToMany
-------------------------------------+-------------------------------------
Reporter: Gavin Wahl | Owner: Shiwei
Type: | Chen
Cleanup/optimization | Status: assigned
Component: Database layer | Version: dev
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Shiwei Chen):

* needs_better_patch: 1 => 0


* needs_tests: 1 => 0


Comment:

Updated the patch to properly deal with cases involving custom managers
and prefetch queries, and implemented test cases for them.

https://github.com/django/django/pull/16863

--
Ticket URL: <https://code.djangoproject.com/ticket/29725#comment:21>

Django

unread,
Jun 21, 2023, 11:46:16 PM6/21/23
to django-...@googlegroups.com
#29725: Inefficient SQL generated when counting a ManyToMany
-------------------------------------+-------------------------------------
Reporter: Gavin Wahl | Owner: Shiwei
Type: | Chen
Cleanup/optimization | Status: assigned
Component: Database layer | Version: dev
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 1 | Patch needs improvement: 1
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Mariusz Felisiak):

* needs_better_patch: 0 => 1

* needs_tests: 0 => 1


--
Ticket URL: <https://code.djangoproject.com/ticket/29725#comment:22>

Django

unread,
Jan 23, 2024, 10:13:15 AMJan 23
to django-...@googlegroups.com
#29725: Inefficient SQL generated when counting a ManyToMany
-------------------------------------+-------------------------------------
Reporter: Gavin Wahl | Owner: ontowhee
Type: | Status: assigned
Cleanup/optimization |

Component: Database layer | Version: dev
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 1 | Patch needs improvement: 1
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by ontowhee):

* owner: Shiwei Chen => ontowhee


--
Ticket URL: <https://code.djangoproject.com/ticket/29725#comment:23>

Django

unread,
Jan 28, 2024, 11:15:24 AMJan 28
to django-...@googlegroups.com
#29725: Inefficient SQL generated when counting a ManyToMany
-------------------------------------+-------------------------------------
Reporter: Gavin Wahl | Owner: ontowhee
Type: | Status: assigned
Cleanup/optimization |
Component: Database layer | Version: dev
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 1 | Patch needs improvement: 1
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Comment (by ontowhee):

I'm trying to pick up on the work from PR #16863, and I am beginning to
learn about M2M relationships, RelatedManagers, and QuerySets. From
reading the discussions on this ticket and looking through the comments on
the PR, this needs to handle chaining .all() for the reverse M2M relation.
Is that the `self.article.publications`?

Questions:
- What other chaining needs to be considered?
- How many levels and combinations of chaining need to be considered? For
example, I imagine it is desirable to ensure the optimization also occurs
for .all().all().all().count() too. Another example would be
.all().filter().count(), where the filter is technically an empty filter,
but it would return a QuerySet.
- Where would be the best level to implement the optimization? The PR
handles it in ManyRelatedManager. Another possibility could be to
implement this on compiler.get_from_clause(). I found this idea from
reading the comment on another ticket
https://code.djangoproject.com/ticket/20535#comment:3.

I'm not sure if I'm asking the right questions. If anyone has any pointers
for what areas I should look into, please let me know.
--
Ticket URL: <https://code.djangoproject.com/ticket/29725#comment:24>

Django

unread,
Jan 28, 2024, 11:40:48 AMJan 28
to django-...@googlegroups.com
#29725: Inefficient SQL generated when counting a ManyToMany
-------------------------------------+-------------------------------------
Reporter: Gavin Wahl | Owner: ontowhee
Type: | Status: assigned
Cleanup/optimization |
Component: Database layer | Version: dev
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 1 | Patch needs improvement: 1
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Comment (by Simon Charette):

Thank you for working on this ticket ontowhee.

From my understanding the [https://github.com/django/django/pull/16863/
closed PR] is 95% of the way there as it builds on top of the initial work
that had to be reverted and adds extra coverage. I'd start by creating a
new PR out of it and deal with the unaddressed left by Mariusz to see how
the test suite reacts.

> How many levels and combinations of chaining need to be considered? For
example, I imagine it is desirable to ensure the optimization also occurs
for .all().all().all().count() too. Another example would be
.all().filter().count(), where the filter is technically an empty filter,
but it would return a QuerySet.

I'd focus on the many-to-many manager case discussed here only, that's
already hard enough to get right.

> Where would be the best level to implement the optimization? The PR
handles it in ManyRelatedManager. Another possibility could be to
implement this on compiler.get_from_clause(). I found this idea from
reading the comment on another ticket
https://code.djangoproject.com/ticket/20535#comment:3.

Exactly where it's defined in the PR you referenced. Skipping of
unnecessary intermediate joins as described in #20535 is a much larger
problem again that might help here in the long run but I think there are
benefits in solving this well scoped issue first.

> If it is handled in the RelatedManager level, is there a way for the
RelatedManager to look ahead to determine that the next function in the
chain will be a .count() or .exists()?

No need to as at that point it will be a `QuerySet`. Lets keep this ticket
focused on the common many-to-many case where `.count` and `.exists` are
directly called.
--
Ticket URL: <https://code.djangoproject.com/ticket/29725#comment:25>

Django

unread,
Feb 3, 2024, 2:39:22 AMFeb 3
to django-...@googlegroups.com
#29725: Inefficient SQL generated when counting a ManyToMany
-------------------------------------+-------------------------------------
Reporter: Gavin Wahl | Owner: ontowhee
Type: | Status: assigned
Cleanup/optimization |
Component: Database layer | Version: dev
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 1 | Patch needs improvement: 1
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Comment (by ontowhee):

I'm not sure if this is a viable solution. It passing the
`ManyRelatedManager` to the queryset as an instance variable
`_many_related_queryset`.

[https://github.com/django/django/pull/17811 Draft PR]

> No need to as at that point it will be a QuerySet

I briefly looked into `QuerySet._query` to see if a new optimized sql
query could be generated when the `aliasMap` lists certain table types. I
ended up passing the `ManyRelatedManager` to get something working. I'm
open to exploring a different approach.
--
Ticket URL: <https://code.djangoproject.com/ticket/29725#comment:26>

Django

unread,
Feb 6, 2024, 9:22:16 AMFeb 6
to django-...@googlegroups.com
#29725: Inefficient SQL generated when counting a ManyToMany
-------------------------------------+-------------------------------------
Reporter: Gavin Wahl | Owner: ontowhee
Type: | Status: assigned
Cleanup/optimization |
Component: Database layer | Version: dev
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 1 | Patch needs improvement: 1
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Comment (by ontowhee):

I have removed the changes involving chaining all().
[https://github.com/django/django/pull/17811 PR]
--
Ticket URL: <https://code.djangoproject.com/ticket/29725#comment:27>

Django

unread,
Feb 16, 2024, 2:58:53 AMFeb 16
to django-...@googlegroups.com
#29725: Inefficient SQL generated when counting a ManyToMany
-------------------------------------+-------------------------------------
Reporter: Gavin Wahl | Owner: ontowhee
Type: | Status: assigned
Cleanup/optimization |
Component: Database layer | Version: dev
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Ready for
| checkin
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Mariusz Felisiak):

* needs_better_patch: 1 => 0
* needs_tests: 1 => 0
* stage: Accepted => Ready for checkin

--
Ticket URL: <https://code.djangoproject.com/ticket/29725#comment:28>

Django

unread,
Feb 16, 2024, 3:45:54 AMFeb 16
to django-...@googlegroups.com
#29725: Inefficient SQL generated when counting a ManyToMany
-------------------------------------+-------------------------------------
Reporter: Gavin Wahl | Owner: ontowhee
Type: | Status: closed
Cleanup/optimization |
Component: Database layer | Version: dev
(models, ORM) |
Severity: Normal | Resolution: fixed
Keywords: | Triage Stage: Ready for
| checkin
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Mariusz Felisiak <felisiak.mariusz@…>):

* resolution: => fixed
* status: assigned => closed

Comment:

In [changeset:"66e47ac69a7e71cf32eee312d05668d8f1ba24bb" 66e47ac6]:
{{{#!CommitTicketReference repository=""
revision="66e47ac69a7e71cf32eee312d05668d8f1ba24bb"
Fixed #29725 -- Removed unnecessary join in QuerySet.count() and exists()
on a many to many relation.

Co-Authored-By: Shiwei Chen <april.c...@gmail.com>
}}}
--
Ticket URL: <https://code.djangoproject.com/ticket/29725#comment:29>
Reply all
Reply to author
Forward
0 new messages