[Django] #34597: Django ORM query SQL generation changed (and is apparently a lot slower)

8 views
Skip to first unread message

Django

unread,
May 25, 2023, 11:05:29 AM5/25/23
to django-...@googlegroups.com
#34597: Django ORM query SQL generation changed (and is apparently a lot slower)
-------------------------------------+-------------------------------------
Reporter: Lorand | Owner: nobody
Varga |
Type: Bug | Status: new
Component: Database | Version: 4.2
layer (models, ORM) |
Severity: Release | Keywords:
blocker |
Triage Stage: | Has patch: 0
Unreviewed |
Needs documentation: 0 | Needs tests: 0
Patch needs improvement: 0 | Easy pickings: 0
UI/UX: 0 |
-------------------------------------+-------------------------------------
My app has 2 models (making things simple so debugging is easier):
{{{
class Blog(models.Model):
title = models.CharField(max_length=60, blank=True)
is_published = models.BooleanField(default=generate_random_bool,
db_index=True)

class Translation(models.Model):
blog = models.ForeignKey('Blog', on_delete=models.CASCADE,
related_name='translation')
}}}

There is a reverse foreign key relation between translation and Blog. Also
a blog can have multiple translations (or none).

I have this code block:
{{{
from django.db.models import Q
qs = Blog.objects.filter(Q(is_published=True) & ~Q(translation=None))
print(qs.query)
}}}

django 2.2 ORM generates this output:
{{{
SELECT
"blog"."id",
"blog"."title",
"blog"."is_published"
FROM
"blog"
WHERE
(
"blog"."is_published" = True
AND NOT (
"blog"."id" IN (
SELECT
U0."id"
FROM
"blog" U0
LEFT OUTER JOIN "translation" U1 ON (U0."id" = U1."blog_id")
WHERE
(
U1."id" IS NULL
AND U0."id" = ("blog"."id")
)
)
)
)
}}}

For the same code block the django 3.2+ ORM (tried also the latest version
of django 4.2.1) it generates:
{{{
SELECT
"blog"."id",
"blog"."title",
"blog"."is_published"
FROM
"blog"
WHERE
(
"blog"."is_published"
AND NOT (
EXISTS(
SELECT
(1) AS "a"
FROM
"blog" U0
LEFT OUTER JOIN "translation" U1 ON (U0."id" = U1."blog_id")
WHERE
(
U1."id" IS NULL
AND U0."id" = "blog"."id"
AND "blog"."id" = "blog"."id"
)
LIMIT
1
)
)
)
}}}
which is a whole lot slower (at least on the same postgresql 13.7
instance).
The django 2.2 version executes in a couple of seconds while the newer
version executes in half an hour.

I think (but am not completely sure, might be terribly wrong) that the
issue was introduced here:
https://github.com/django/django/pull/13300/files

Environment:
asgiref==3.7.1
backports.zoneinfo==0.2.1
Django==3.2.19
model-bakery==1.11.0
psycopg2-binary==2.8.6
pytz==2023.3
sqlparse==0.4.4
typing_extensions==4.6.1

Db environment:
PostgreSQL 13.7 on aarch64-unknown-linux-gnu, compiled by gcc (GCC) 7.3.1
20180712 (Red Hat 7.3.1-6), 64-bit

Bug does not manifest in django 2.2.12.
Bug manifests in django 3.2 and django 4.2.

--
Ticket URL: <https://code.djangoproject.com/ticket/34597>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

Django

unread,
May 25, 2023, 1:11:01 PM5/25/23
to django-...@googlegroups.com
#34597: Django ORM query SQL generation changed (and is apparently a lot slower)
-------------------------------------+-------------------------------------
Reporter: Lorand Varga | Owner: nobody
Type: | Status: closed
Cleanup/optimization |
Component: Database layer | Version: 3.2
(models, ORM) |
Severity: Normal | Resolution: needsinfo
Keywords: | Triage Stage:
| Unreviewed
Has patch: 0 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Simon Charette):

* status: new => closed
* severity: Release blocker => Normal
* cc: Simon Charette (added)
* type: Bug => Cleanup/optimization
* version: 4.2 => 3.2
* resolution: => needsinfo


Comment:

More likely to be related to #32143
(8593e162c9cb63a6c0b06daf045bc1c21eb4d7c1) which switched to using
`EXISTS` over `NOT IN` and was already part of Django 3.2.

AFAIK this is the only report in three years about this change making
things slower and the fact it jumped from a couple of seconds to hours
makes me think there might be something else at play here (lack of
analyze?). FWIW this change was advised [https://www.percona.com/blog/sql-
optimizations-in-postgresql-in-vs-exists-vs-any-all-vs-join/ by this
article] and examples in the wild of `NOT IN` performing poorly.

Without more details such as the Postgres execution plans (use `EXPLAIN`
on the query) it's very hard to provide any guidance here so I'll close as
''needsinfo'' for now as I cannot reproduce any form of slowdown with the
provided model. The `AND "blog"."id" = "blog"."id"` part seems definitely
fishy though so it'd be great to see if you get fast results without it as
that appear to be a bug.

In the mean time you can use
`~Q(id__in=Blog.objects.filter(translation=None, id=OuterRef("id")))`
instead of `~Q(translation=None)`.

--
Ticket URL: <https://code.djangoproject.com/ticket/34597#comment:1>

Django

unread,
May 28, 2023, 11:17:56 AM5/28/23
to django-...@googlegroups.com
#34597: Django ORM query SQL generation changed (and is apparently a lot slower)
-------------------------------------+-------------------------------------
Reporter: Lorand Varga | Owner: nobody
Type: | Status: new

Cleanup/optimization |
Component: Database layer | Version: 3.2
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage:
| Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Lorand Varga):

* status: closed => new
* resolution: needsinfo =>


Comment:

Adding more information:

Django 2.2 behavior:

{{{
EXPLAIN ANALYZE
SELECT
"test_app_blog"."id",
"test_app_blog"."created_at",
"test_app_blog"."updated_at",
"test_app_blog"."is_published",
"test_app_blog"."api_has_translation"
FROM
"test_app_blog"
WHERE
(
"test_app_blog"."is_published" = True
AND "test_app_blog"."api_has_translation" = False
AND NOT (
"test_app_blog"."id" IN (
SELECT
U0."id"
FROM
"test_app_blog" U0
LEFT OUTER JOIN "test_app_translation" U1 ON (U0."id" =


U1."blog_id")
WHERE
(
U1."id" IS NULL

AND U0."id" = ("test_app_blog"."id")
)
)
)
)
ORDER BY
"test_app_blog"."updated_at" DESC

Sort (cost=1047131.99..1047357.72 rows=90293 width=1985) (actual
time=888.530..888.533 rows=4 loops=1)
Sort Key: test_app_blog.updated_at DESC
Sort Method: quicksort Memory: 33kB
-> Index Scan using test_app_blog_is_published_4b47c652_uniq on
test_app_blog (cost=0.42..995257.82 rows=90293 width=1985) (actual
time=72.414..888.496 rows=4 loops=1)
Index Cond: (is_published = true)
Filter: ((NOT api_has_translation) AND (NOT (SubPlan 1)))
Rows Removed by Filter: 191941
SubPlan 1
-> Nested Loop Left Join (cost=0.71..8.57 rows=1 width=4)
(actual time=0.003..0.003 rows=1 loops=179902)
Join Filter: (u0.id = u1.blog_id)
Filter: (u1.id IS NULL)
Rows Removed by Filter: 0
-> Index Only Scan using test_app_blog_pkey on
test_app_blog u0 (cost=0.42..2.44 rows=1 width=4) (actual
time=0.002..0.002 rows=1 loops=179902)
Index Cond: (id = test_app_blog.id)
Heap Fetches: 27765
-> Index Scan using test_app_translation_51c6d5db on
test_app_translation u1 (cost=0.29..6.07 rows=5 width=8) (actual
time=0.001..0.001 rows=0 loops=179902)
Index Cond: (blog_id = test_app_blog.id)
Planning Time: 0.500 ms
Execution Time: 888.606 ms
}}}

Django 3.2 (and 4.2) behavior:
{{{
EXPLAIN ANALYZE
SELECT
"test_app_blog"."id",
"test_app_blog"."created_at",
"test_app_blog"."updated_at",
"test_app_blog"."is_published",
"test_app_blog"."api_has_translation"
FROM
"test_app_blog"
WHERE
(
"test_app_blog"."is_published"
AND NOT "test_app_blog"."api_has_translation"


AND NOT (
EXISTS(
SELECT
(1) AS "a"
FROM

"test_app_blog" U0
LEFT OUTER JOIN "test_app_translation" U1 ON (U0."id" =


U1."blog_id")
WHERE
(
U1."id" IS NULL

AND U0."id" = "test_app_blog"."id"
AND "test_app_blog"."id" = "test_app_blog"."id"


)
LIMIT
1
)
)
)

ORDER BY
"test_app_blog"."updated_at" DESC

Nested Loop Anti Join (cost=1001.15..131927.02 rows=180586 width=1985)
(actual time=208264.409..3034435.256 rows=4 loops=1)
Join Filter: ((test_app_blog.id = test_app_blog.id) AND (u0.id =
test_app_blog.id))
Rows Removed by Join Filter: 16853800739
-> Index Scan Backward using test_app_blog_updated_at_34e74e5b_uniq on
test_app_blog (cost=0.42..107115.43 rows=180586 width=1985) (actual
time=1.080..17454.750 rows=179902 loops=1)
Filter: (is_published AND (NOT api_has_translation))
Rows Removed by Filter: 27456
-> Materialize (cost=1000.73..21651.35 rows=1 width=4) (actual
time=0.002..9.009 rows=93684 loops=179902)
-> Gather Merge (cost=1000.73..21651.34 rows=1 width=4) (actual
time=6.024..192.937 rows=194102 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Merge Left Join (cost=0.71..20651.20 rows=1 width=4)
(actual time=0.046..87.573 rows=64701 loops=3)
Merge Cond: (u0.id = u1.blog_id)
Filter: (u1.id IS NULL)
Rows Removed by Filter: 19931
-> Parallel Index Only Scan using test_app_blog_pkey
on test_app_blog u0 (cost=0.42..13961.37 rows=86361 width=4) (actual
time=0.018..32.796 rows=69119 loops=3)
Heap Fetches: 33004
-> Index Scan using test_app_translation_51c6d5db on
test_app_translation u1 (cost=0.29..6082.91 rows=60175 width=8) (actual
time=0.025..32.500 rows=58545 loops=3)
Planning Time: 0.882 ms
Execution Time: 3034436.784 ms
}}}

As can be seen, there is a huge increase in execution time.
I can't vouch there isn't something else going on here also but currently
the only thing changed is the django version.
Anything else in the environment is left untouched. (And django2.2
generated query is performant, while django 3.2 query is awfull).
I've managed to use your advice and modify the query to force the subquery
to use IN but that leads to synthax that is more complex, while the django
2.2 synthax is a lot simpler: ~Q(translation=None)

The only synthax that manages to replicate the query correctly (and
performance wise) seems to be
{{{qs.exclude(id__in=Blog.objects.filter(translation=None,
id=OuterRef("id")))}}}}

While not terrible I do think something has been broken in django since
the ~Q(translation=None) seems a lot cleaner and more ORMish.

Thanks,
Lorand.

--
Ticket URL: <https://code.djangoproject.com/ticket/34597#comment:2>

Reply all
Reply to author
Forward
0 new messages