[Django] #33655: Interaction between exists() and group by

2 views
Skip to first unread message

Django

unread,
Apr 20, 2022, 12:54:39 PM4/20/22
to django-...@googlegroups.com
#33655: Interaction between exists() and group by
-------------------------------------+-------------------------------------
Reporter: Marc | Owner: nobody
Perrin |
Type: | Status: new
Uncategorized |
Component: Database | Version: 4.0
layer (models, ORM) |
Severity: Normal | Keywords: exists group by
Triage Stage: | Has patch: 0
Unreviewed |
Needs documentation: 0 | Needs tests: 0
Patch needs improvement: 0 | Easy pickings: 0
UI/UX: 0 |
-------------------------------------+-------------------------------------
I've got a question about the interaction between exists() and group by.

For example something like:

{{{
Manager.values('field').annotate(cnt=Count('id')).filter(cnt__gt=1).exists()
}}}

The corresp. query with PostgreSQL looks like this:

{{{
SELECT (1) AS "a"
FROM "app_model"
GROUP BY "app_model"."field", (1)
HAVING COUNT("app_model"."id") > 1
LIMIT 1
}}}

exists() (see
[https://github.com/django/django/blob/470708f50d8c13a50770893b8d7181f5218bf3ac/django/db/models/sql/query.py#L563])
clears the SELECT clause and replaces it by (if I understand correctly) a
hardcoded value 1 (as "a"), along with a limit of 1, which makes sense to
me.

But get_group_by() (see
[https://github.com/django/django/blob/6b53114dd862ec97c282fdfdc83579cbd6d1560d/django/db/models/sql/compiler.py#L79])
pushes this hardcoded value to the GROUP BY clause and we end up with the
query above.

Now, on PostgreSQL, that works, but to me it sounds like it works by
luck/lucky robustness... and certainly the same query without the , (1) in
the GROUP BY clause yields the same result.
The problem is that outside of PostgreSQL that GROUP BY clause can be
downright invalid...
Note that if I alter exists() to use {'a': 2} instead of {'a': 1}, it does
not work anymore (on PostgreSQL), because (2) in the SELECT clause means
the hardcoded number 2 while (2) in the GROUP BY clause presumably (??)
refers to the 2nd expr of the SELECT clause, but we only got one...

My feeling is that get_group_by() should avoid adding extra/raw sql (or at
least extra/raw pure values, if we can detect that...) to the group by
expressions?

NB: the existing/old ticket that seems most related to my question would
probably be: [https://code.djangoproject.com/ticket/24835]

--
Ticket URL: <https://code.djangoproject.com/ticket/33655>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

Reply all
Reply to author
Forward
0 new messages