Re: [Django] #34639: MySQL 8.0 hangs indefinitely when using the admin search with a Foreign Key and Annotate

7 views
Skip to first unread message

Django

unread,
Jun 7, 2023, 3:08:12 PM6/7/23
to django-...@googlegroups.com
#34639: MySQL 8.0 hangs indefinitely when using the admin search with a Foreign Key
and Annotate
--------------------------------+--------------------------------------
Reporter: Nicolas Lupien | Owner: nobody
Type: Bug | Status: new
Component: contrib.admin | Version: 4.2
Severity: Normal | Resolution:
Keywords: mysql | Triage Stage: Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
--------------------------------+--------------------------------------
Description changed by Nicolas Lupien:

Old description:

> We've moved to MySQL 8.0 in order to use Django 4.2 but our production
> system went down and we reverted to using MySQL 5.7 with Django 4.1.
> We've currently found a workaround that I'll add at the end of the bug
> report.
>
> If we use the search function of the admin on model with a foreign key
> and we override ModelAdmin.get_queryset with annotate, the search freezes
> our database. It had the same effect on Google Cloud SQL and on a local
> docker image of MySQL 8.0 and it works fine on both environment when
> using MySQL 5.7.
>
> The code:
>
> models.py
>
> {{{
> class Organization(models.Model):
> name = models.CharField(max_length=255)
>
> class Member(models.Model):
> name = models.CharField(max_length=255)
> organization = models.ForeignKey(Organization,
> on_delete=models.CASCADE, null=True)
> }}}
>
> admin.py
>
> {{{
> class OrganizationAdmin(admin.ModelAdmin):
> search_fields = ["name", "member__name"]
> list_display = ["name", "member_count"]
>
> class Meta:
> model = models.Organization
>
> def get_queryset(self, request):
> return super().get_queryset(request).annotate(Count("member"))
>
> def member_count(self, instance):
> return instance.member__count
> }}}
>
> I found that the ChangeList applies the override to get_queryset
> containing the annotate multiple times making the query extremely
> expensive. Give only 500 members it goes through 125,000,000 (500 * 500 *
> 500) rows.
>
> The workaround: If we override the ChangeList queryset, the call to
> annotate happens only once and the query is fine.
>

> {{{
> class CustomChangeList(ChangeList):
> def get_queryset(self, request):
> return
> super().get_queryset(request).annotate(Count("locker_connectors"))
>

> class OrganizationAdmin(admin.ModelAdmin):
> search_fields = ["name", "member__name"]
> list_display = ["name", "member_count"]
>
> class Meta:
> model = models.Organization
>
> def get_queryset(self, request):
> return super().get_queryset(request).annotate(Count("member"))
>
> def member_count(self, instance):
> return instance.member__count
>
> def get_changelist(self, request, **kwargs):
> return CustomChangeList
> }}}
>
> I created a repo with more details and the complete steps to reproduce
> the issue: https://github.com/betaflag/django-sqlbugdemo

New description:

We've moved to MySQL 8.0 in order to use Django 4.2 but our production
system went down and we reverted to using MySQL 5.7 with Django 4.1. We've
currently found a workaround that I'll add at the end of the bug report.

If we use the search function of the admin on model with a foreign key and
we override ModelAdmin.get_queryset with annotate, the search freezes our
database. It had the same effect on Google Cloud SQL and on a local docker
image of MySQL 8.0 and it works fine on both environment when using MySQL
5.7.

The code:

models.py

{{{
class Organization(models.Model):
name = models.CharField(max_length=255)

class Member(models.Model):
name = models.CharField(max_length=255)
organization = models.ForeignKey(Organization,
on_delete=models.CASCADE, null=True)
}}}

admin.py

{{{
class OrganizationAdmin(admin.ModelAdmin):
search_fields = ["name", "member__name"]
list_display = ["name", "member_count"]

class Meta:
model = models.Organization

def get_queryset(self, request):
return super().get_queryset(request).annotate(Count("member"))

def member_count(self, instance):
return instance.member__count
}}}

I found that the ChangeList applies the override to get_queryset
containing the annotate multiple times making the query extremely
expensive. Give only 500 members it goes through 125,000,000 (500 * 500 *
500) rows.

The workaround: If we override the ChangeList queryset, the call to
annotate happens only once and the query is fine.


{{{
class CustomChangeList(ChangeList):
def get_queryset(self, request):
return
super().get_queryset(request).annotate(Count("locker_connectors"))


class OrganizationAdmin(admin.ModelAdmin):
search_fields = ["name", "member__name"]
list_display = ["name", "member_count"]

class Meta:
model = models.Organization

def member_count(self, instance):
return instance.member__count

def get_changelist(self, request, **kwargs):
return CustomChangeList
}}}

I created a repo with more details and the complete steps to reproduce the
issue: https://github.com/betaflag/django-sqlbugdemo

--

--
Ticket URL: <https://code.djangoproject.com/ticket/34639#comment:2>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

Django

unread,
Jun 7, 2023, 3:11:06 PM6/7/23
to django-...@googlegroups.com

Old description:

New description:

The code:

models.py

admin.py

return super().get_queryset(request).annotate(Count("member"))


class OrganizationAdmin(admin.ModelAdmin):
search_fields = ["name", "member__name"]
list_display = ["name", "member_count"]

class Meta:
model = models.Organization

def member_count(self, instance):
return instance.member__count

def get_changelist(self, request, **kwargs):
return CustomChangeList
}}}

I created a repo with more details and the complete steps to reproduce the
issue: https://github.com/betaflag/django-sqlbugdemo

--

--
Ticket URL: <https://code.djangoproject.com/ticket/34639#comment:3>

Django

unread,
Jun 7, 2023, 4:07:28 PM6/7/23
to django-...@googlegroups.com
#34639: MySQL 8.0 hangs indefinitely when using the admin search with a Foreign Key
and Annotate
--------------------------------+--------------------------------------
Reporter: Nicolas Lupien | Owner: nobody
Type: Bug | Status: closed
Component: contrib.admin | Version: 4.2
Severity: Normal | Resolution: invalid

Keywords: mysql | Triage Stage: Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
--------------------------------+--------------------------------------
Changes (by Mariusz Felisiak):

* status: new => closed
* resolution: => invalid


Comment:

Thanks for the report, however, as it stands, it's a support question.
You're using very expensive query for every row, so it takes time. Please
reopen the ticket if you can debug your issue and provide details about
why and where Django is at fault. As far as I'm aware, if the generated
query is the same, then it's not an issue in Django itself. I'd start from
checking database indexes.

--
Ticket URL: <https://code.djangoproject.com/ticket/34639#comment:4>

Django

unread,
Jun 7, 2023, 4:24:21 PM6/7/23
to django-...@googlegroups.com
#34639: MySQL 8.0 hangs indefinitely when using the admin search with a Foreign Key
and Annotate
--------------------------------+--------------------------------------
Reporter: Nicolas Lupien | Owner: nobody
Type: Bug | Status: closed
Component: contrib.admin | Version: 4.2
Severity: Normal | Resolution: invalid
Keywords: mysql | Triage Stage: Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
--------------------------------+--------------------------------------

Comment (by Simon Charette):

It looks like this relates to #32682
(187118203197801c6cb72dc8b06b714b23b6dd3d).

The fact `self.root_queryset` is used without ''stripping'' unused
annotations (e.g. the `Count` one) explains why the reporter is able to
work around the issue by only annotating after the `Exists` filter is
added.

Adjusting `SQLQuery.exists` to ''strip'' used annotation (like we did with
`SQLQuery.get_aggreation` in #28477
(59bea9efd2768102fc9d3aedda469502c218e9b7)) could be a possible way to
generically optimize this problem away.

--
Ticket URL: <https://code.djangoproject.com/ticket/34639#comment:5>

Django

unread,
Jun 9, 2023, 7:29:24 AM6/9/23
to django-...@googlegroups.com
#34639: MySQL 8.0 hangs indefinitely when using the admin search with a Foreign Key
and Annotate
--------------------------------+--------------------------------------
Reporter: Nicolas Lupien | Owner: nobody
Type: Bug | Status: new

Component: contrib.admin | Version: 4.2
Severity: Normal | Resolution:
Keywords: mysql | Triage Stage: Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
--------------------------------+--------------------------------------
Changes (by Nicolas Lupien):

* status: closed => new
* resolution: invalid =>


Comment:

It's definitely a Django issue. The query before #32682 is small
instantaneous, and now, it crashes my db because it's astronomically
costly, with only 500 rows. On production with 50,000 row, it costs
1.25e+15. This is, like @simon said, only because Django is not stripping
unused annotations when using the root_queryset. The resulting query does
a lot of unnecessary calculation. You can see it with all the debug
information in my repository that I joined previously
https://github.com/betaflag/django-sqlbugdemo

--
Ticket URL: <https://code.djangoproject.com/ticket/34639#comment:6>

Django

unread,
Jun 9, 2023, 10:43:14 AM6/9/23
to django-...@googlegroups.com
#34639: MySQL 8.0 hangs indefinitely when using the admin search with a Foreign Key
and Annotate
--------------------------------+------------------------------------
Reporter: Nicolas Lupien | Owner: nobody
Type: Bug | Status: new
Component: contrib.admin | Version: 4.2
Severity: Normal | Resolution:
Keywords: mysql | Triage Stage: Accepted

Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
--------------------------------+------------------------------------
Changes (by Natalia Bidart):

* stage: Unreviewed => Accepted


Comment:

Accepting based on the comment from Simon. Not marking it as release
blocker since the potential regression was added in 3.2.1 (and afaik, 3.2
only receives security updates at this point, right?).

--
Ticket URL: <https://code.djangoproject.com/ticket/34639#comment:7>

Django

unread,
Jun 10, 2023, 1:41:15 AM6/10/23
to django-...@googlegroups.com
#34639: MySQL 8.0 hangs indefinitely when using the admin search with a Foreign Key
and Annotate
--------------------------------+------------------------------------
Reporter: Nicolas Lupien | Owner: nobody
Type: Bug | Status: new
Component: contrib.admin | Version: 4.2
Severity: Normal | Resolution:
Keywords: mysql | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
--------------------------------+------------------------------------

Comment (by Simon Charette):

Looking at the generated SQL in more details it seems like the annotation
is properly stripped but not the now unnecessary join that was required
for it

{{{#!sql
SELECT
`myapp_organization`.`id`,
`myapp_organization`.`name`,
COUNT(`myapp_member`.`id`) AS `member__count`
FROM
`myapp_organization`
LEFT OUTER JOIN `myapp_member` ON (
`myapp_organization`.`id` = `myapp_member`.`organization_id`
)
WHERE
EXISTS(
SELECT
1 AS `a`
FROM
`myapp_organization` U0
-- U1 is a completely unused alias now that COUNT(`U1`.`id`)
is stripped!
LEFT OUTER JOIN `myapp_member` U1 ON (U0.`id` =
U1.`organization_id`)
LEFT OUTER JOIN `myapp_member` U2 ON (U0.`id` =
U2.`organization_id`)
WHERE
(
(
U0.`name` LIKE '%a%'
OR U2.`name` LIKE '%a%'
)
AND U0.`id` = (`myapp_organization`.`id`)
)
-- Ditto with GROUP BY since aggregation is discarded
GROUP BY
U0.`id`
ORDER BY
NULL
LIMIT
1
)
GROUP BY
`myapp_organization`.`id`
ORDER BY
`myapp_organization`.`name` ASC,
`myapp_organization`.`id` DESC;
}}}

That's something the ORM is not good at.

Since `annotate`, `filter` and friends are ''additive only'' (there's no
API to discard of them) the ORM was never optimized to have an operation
that strips unused JOINs when the select mask changes or filters are
elided (e.g. they end up raising `EmptyQuerySet`). This was identified
when working on stripping unused annotations on aggregations in #28477
(59bea9efd2768102fc9d3aedda469502c218e9b7) and is relatively hard to
solve.

What's interesting here is that prior to
187118203197801c6cb72dc8b06b714b23b6dd3d the reporter was likely not even
getting the right results when searching because they were running into
#2361 (notice the double left join against the multi-valued `myapp_member`
table) which means their `member__count` returned the wrong values.

--
Ticket URL: <https://code.djangoproject.com/ticket/34639#comment:8>

Django

unread,
Jun 11, 2023, 11:50:49 PM6/11/23
to django-...@googlegroups.com
#34639: MySQL 8.0 hangs indefinitely when using the admin search with a Foreign Key
and Annotate
--------------------------------+------------------------------------
Reporter: Nicolas Lupien | Owner: nobody
Type: Bug | Status: new
Component: contrib.admin | Version: 4.2
Severity: Normal | Resolution:
Keywords: mysql | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
--------------------------------+------------------------------------
Changes (by Simon Charette):

* cc: Simon Charette, Mariusz Felisiak, Natalia Bidart (added)


Comment:

Not sure how actionable the ticket is in its current form so I wonder how
we should proceed here.

I can see a few paths forward:

1. Create an optimization / ORM ticket to have `SQLQuery.exists` and
`.aggregate` prune unnecessary joins solely used by pruned annotations. It
doesn't look like we have one already and this is desirable.

2. Adjust #32433 (6307c3f1a123f5975c73b231e8ac4f115fd72c0d) so the
`TypeError` is only raised when `distinct(fields)` is used (which was the
actually reported issue) so #32682
(187118203197801c6cb72dc8b06b714b23b6dd3d) can be reverted.

3. Don't attempt to remove duplicates when the queryset is meant to be
used for deletion purposes as there is no value in doing so but that's
easier said than done due with how things are currently structured.
[https://github.com/django/django/compare/main...charettes:django:ticket-34639
Here's how it could be achieved].

I think that 1. is desirable no matter what, and that we should choose a
solution between 2. and 3. I have a slight preference for 2 given it
relaxes an overly aggressive assertion instead of introducing a feature
for the sole purpose of working around it.

--
Ticket URL: <https://code.djangoproject.com/ticket/34639#comment:9>

Django

unread,
Jun 23, 2023, 10:19:36 AM6/23/23
to django-...@googlegroups.com
#34639: MySQL 8.0 hangs indefinitely when using the admin search with a Foreign Key
and Annotate
--------------------------------+------------------------------------
Reporter: Nicolas Lupien | Owner: nobody
Type: Bug | Status: new
Component: contrib.admin | Version: 4.2
Severity: Normal | Resolution:
Keywords: mysql | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
--------------------------------+------------------------------------

Comment (by Natalia Bidart):

Hello Simon,

I've been reading this ticket and the related tickets to try to provide a
sensible advice. The ORM in general is something I'm not familiar with.

Regarding doing (1), and following your advice in terms of that being
desirable and feasible, that sounds like a change we should pursue.

Then, I don't understand exactly what (2) involves: is it to remove the
`self.query.distinct or ` from the added if guard in the commit? I fail to
see how that is safe (in terms of the original report), since in the PR
Mariusz commented `Using distinct() with delete() sounds like a bad idea
to me.`. Are we confident that allowing `distinct().delete()` is safe?

Thanks!

--
Ticket URL: <https://code.djangoproject.com/ticket/34639#comment:10>

Django

unread,
Jul 4, 2023, 5:28:25 PM7/4/23
to django-...@googlegroups.com
#34639: MySQL 8.0 hangs indefinitely when using the admin search with a Foreign Key
and Annotate
--------------------------------+------------------------------------------
Reporter: Nicolas Lupien | Owner: Simon Charette
Type: Bug | Status: assigned

Component: contrib.admin | Version: 4.2
Severity: Normal | Resolution:
Keywords: mysql | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
--------------------------------+------------------------------------------
Changes (by Simon Charette):

* owner: nobody => Simon Charette
* status: new => assigned


Comment:

Hello Natalia,

In the case of (2), and as originally reported in #32433, calling
`delete()` after `distinct()` is only potentially ambiguous when `DISTINCT
ON` is used (`distinct(*fields)`) or when `DISTINCT` is against a subset
of fields that doesn't not include the primary key. That is the case
because if the subset of fields the distinct grouping is applied on
includes the primary key then it's not possible to reduce the set of
included rows to a subset (the primary key is distinct for each rows by
definition).

The `delete` method already prevents its usage after a `values()` call
([https://github.com/django/django/blob/42b4f81e6efd5c4587e1207a2ae3dd0facb1436f/django/db/models/query.py#L1140-L1141
source]) so doing `values(*subset_not_including_pk).distinct().delete()`
is not allowed. This only leaves the `distinct(*fields).delete()` case as
potentially problematic as reported in #32433.

[https://github.com/django/django/pull/13992/ The patch] for #32433 was
overly strict though and disallowed the `distinct().delete()` case which
then forced the logic introduced by
187118203197801c6cb72dc8b06b714b23b6dd3d to address #32682.

I'll work on PR to demonstrate what I mean by the above to make it crystal
clear.

--
Ticket URL: <https://code.djangoproject.com/ticket/34639#comment:11>

Django

unread,
Jul 4, 2023, 6:15:24 PM7/4/23
to django-...@googlegroups.com
#34639: MySQL 8.0 hangs indefinitely when using the admin search with a Foreign Key
and Annotate
--------------------------------+------------------------------------------
Reporter: Nicolas Lupien | Owner: Simon Charette
Type: Bug | Status: assigned
Component: contrib.admin | Version: 4.2
Severity: Normal | Resolution:
Keywords: mysql | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
--------------------------------+------------------------------------------
Changes (by Simon Charette):

* has_patch: 0 => 1


--
Ticket URL: <https://code.djangoproject.com/ticket/34639#comment:12>

Django

unread,
Jul 5, 2023, 4:49:26 AM7/5/23
to django-...@googlegroups.com
#34639: MySQL 8.0 hangs indefinitely when using the admin search with a Foreign Key
and Annotate
-------------------------------------+-------------------------------------

Reporter: Nicolas Lupien | Owner: Simon
| Charette
Type: Bug | Status: assigned
Component: contrib.admin | Version: 4.2
Severity: Normal | Resolution:
Keywords: mysql | Triage Stage: Ready for
| checkin

Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Mariusz Felisiak):

* stage: Accepted => Ready for checkin


Comment:

[https://github.com/django/django/pull/17046 PR]

--
Ticket URL: <https://code.djangoproject.com/ticket/34639#comment:13>

Django

unread,
Jul 7, 2023, 1:43:52 AM7/7/23
to django-...@googlegroups.com
#34639: MySQL 8.0 hangs indefinitely when using the admin search with a Foreign Key
and Annotate
-------------------------------------+-------------------------------------
Reporter: Nicolas Lupien | Owner: Simon
| Charette
Type: Bug | Status: assigned
Component: contrib.admin | Version: 4.2
Severity: Normal | Resolution:
Keywords: mysql | Triage Stage: Ready for
| checkin
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Mariusz Felisiak <felisiak.mariusz@…>):

In [changeset:"28e2077148f7602d29165e90965974698819cbba" 28e20771]:
{{{
#!CommitTicketReference repository=""
revision="28e2077148f7602d29165e90965974698819cbba"
Refs #32433 -- Reallowed calling QuerySet.delete() after distinct().

While values(*field_excluding_pk).distinct() and
distinct(*field_excluding_pk) can reduce the number of resulting rows
in a way that makes subsequent delete() calls ambiguous standalone
.distinct() calls cannot.

Since delete() already disallows chain usages with values() the only
case that needs to be handled, as originally reported, is when
DISTINCT ON is used via distinct(*fields).

Refs #32682 which had to resort to subqueries to prevent duplicates in
the admin and caused significant performance regressions on MySQL
(refs #34639).

This partly reverts 6307c3f1a123f5975c73b231e8ac4f115fd72c0d.
}}}

--
Ticket URL: <https://code.djangoproject.com/ticket/34639#comment:14>

Django

unread,
Jul 7, 2023, 1:43:52 AM7/7/23
to django-...@googlegroups.com
#34639: MySQL 8.0 hangs indefinitely when using the admin search with a Foreign Key
and Annotate
-------------------------------------+-------------------------------------
Reporter: Nicolas Lupien | Owner: Simon
| Charette
Type: Bug | Status: closed
Component: contrib.admin | Version: 4.2
Severity: Normal | Resolution: fixed

Keywords: mysql | Triage Stage: Ready for
| checkin
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Mariusz Felisiak <felisiak.mariusz@…>):

* status: assigned => closed
* resolution: => fixed


Comment:

In [changeset:"d569c1dcfeb26ca9ee391e5dfeadedf2b5ed4253" d569c1dc]:
{{{
#!CommitTicketReference repository=""
revision="d569c1dcfeb26ca9ee391e5dfeadedf2b5ed4253"
Fixed #34639 -- Reverted "Fixed #32682 -- Made admin changelist use
Exists() instead of distinct() for preventing duplicates."

This reverts commit 187118203197801c6cb72dc8b06b714b23b6dd3d which
moved to using Exists() instead due to an overly strict
distinct().delete() check added in #32433.
}}}

--
Ticket URL: <https://code.djangoproject.com/ticket/34639#comment:15>

Reply all
Reply to author
Forward
0 new messages