[Django] #32155: Count() by default uses group by

5 views
Skip to first unread message

Django

unread,
Oct 28, 2020, 8:21:52 PM10/28/20
to django-...@googlegroups.com
#32155: Count() by default uses group by
-------------------------------------+-------------------------------------
Reporter: Sukhinder | Owner: nobody
Mann |
Type: | Status: new
Uncategorized |
Component: Database | Version: 2.2
layer (models, ORM) |
Severity: Normal | Keywords:
Triage Stage: | Has patch: 0
Unreviewed |
Needs documentation: 0 | Needs tests: 0
Patch needs improvement: 0 | Easy pickings: 0
UI/UX: 0 |
-------------------------------------+-------------------------------------
I am trying to use Count() with Case to find out multiple conditional
counts, and count keeps using `id` as default group by. I don't want it to
use `GROUP BY` by default.


{{{#!python
Staff.objects.filter(department__id__in=[2,3]).annotate(
number_of_managers_in_UK=Sum(
Case(
When(position="MANAGER", location="UK", then=1),
output_field=IntegerField(),
)
),
number_of_directors_in_USA=Sum(
Case(
When(position="DIRECTOR", location="USA", then=1),
output_field=IntegerField(),
)
)
)
}}}


Which creates a SQL query something like this:


{{{#!sql
SELECT "staff"."id",
"staff"."location",
"staff"."position",
"staff"."salary",
"staff"."date_of_birth",
COUNT(
CASE
WHEN "staff"."position" = MANAGER AND "staff"."location" = UK
THEN 1
ELSE NULL
END
) AS "number_of_managers_in_UK",
COUNT(
CASE
WHEN "staff"."position" = DIRECTOR AND "staff"."location" =
USA THEN 1
ELSE NULL
END
) AS "number_of_directors_in_USA"
FROM "staff"
WHERE "staff"."department_id" IN (2, 3)
GROUP BY "staff"."id"; -- This group by is unnecessary and spoils the
result set
}}}


Even though `Count` is an Aggregator function in SQL it can be used
without `GROUP BY`. It would be good if we can have either a parameter or
something to specify whether to use group by or not instead of defaulting
to `GROUP BY`.

--
Ticket URL: <https://code.djangoproject.com/ticket/32155>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

Django

unread,
Oct 28, 2020, 8:22:28 PM10/28/20
to django-...@googlegroups.com
#32155: Count() by default uses group by
-------------------------------------+-------------------------------------
Reporter: Sukhinder Mann | Owner: nobody
Type: New feature | Status: new
Component: Database layer | Version: 2.2
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage:
| Unreviewed
Has patch: 0 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Sukhinder Mann):

* type: Uncategorized => New feature


--
Ticket URL: <https://code.djangoproject.com/ticket/32155#comment:1>

Django

unread,
Oct 28, 2020, 8:25:02 PM10/28/20
to django-...@googlegroups.com
#32155: Count() by default uses group by
-------------------------------------+-------------------------------------
Reporter: Sukhinder Mann | Owner: nobody
Type: New feature | Status: new
Component: Database layer | Version: 2.2
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage:
| Unreviewed
Has patch: 0 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Description changed by Sukhinder Mann:

Old description:

New description:

I am trying to use Count() with Case to find out multiple conditional
counts, and count keeps using `id` as default group by. I don't want it to
use `GROUP BY` by default.


{{{#!python
Staff.objects.filter(department__id__in=[2,3]).annotate(
number_of_managers_in_UK=Sum(
Case(
When(position="MANAGER", location="UK", then=1),
output_field=IntegerField(),
)
),
number_of_directors_in_USA=Sum(
Case(
When(position="DIRECTOR", location="USA", then=1),
output_field=IntegerField(),
)
)
)
}}}


Which creates a SQL query something like this:


{{{#!sql
SELECT "staff"."id",
"staff"."location",
"staff"."position",
"staff"."salary",

"staff"."date_of_birth", -- I don't want all these other attributes as
well but I can't get rid of these as well, I just need the count
attributes


COUNT(
CASE
WHEN "staff"."position" = MANAGER AND "staff"."location" = UK
THEN 1
ELSE NULL
END
) AS "number_of_managers_in_UK",
COUNT(
CASE
WHEN "staff"."position" = DIRECTOR AND "staff"."location" =
USA THEN 1
ELSE NULL
END
) AS "number_of_directors_in_USA"
FROM "staff"
WHERE "staff"."department_id" IN (2, 3)
GROUP BY "staff"."id"; -- This group by is unnecessary and spoils the
result set
}}}


Even though `Count` is an Aggregator function in SQL it can be used
without `GROUP BY`. It would be good if we can have either a parameter or
something to specify whether to use group by or not instead of defaulting
to `GROUP BY`.

--

--
Ticket URL: <https://code.djangoproject.com/ticket/32155#comment:2>

Django

unread,
Oct 28, 2020, 8:36:17 PM10/28/20
to django-...@googlegroups.com
#32155: Count() by default uses group by
-------------------------------------+-------------------------------------
Reporter: Sukhinder Mann | Owner: nobody
Type: New feature | Status: closed

Component: Database layer | Version: 2.2
(models, ORM) |
Severity: Normal | Resolution: invalid
Keywords: | Triage Stage:
| Unreviewed
Has patch: 0 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Simon Charette):

* status: new => closed
* resolution: => invalid


Comment:

I think you're misunderstanding the `annotate` vs `aggregate` features of
the ORM.

Annotations of aggregate functions have no choices but to group by the
queryset's model to provide a result for each rows. If you want to perform
an aggregation over filtered rows without grouping you'll want to use
`aggregate` instead.

https://docs.djangoproject.com/en/3.1/topics/db/aggregation/

Please use support channels and consult documentation before filling a
ticket next time as we use Trac solely for tracking feature requests and
confirmed bugs.

https://code.djangoproject.com/wiki/TicketClosingReasons/UseSupportChannels

--
Ticket URL: <https://code.djangoproject.com/ticket/32155#comment:3>

Django

unread,
Oct 29, 2020, 12:48:45 PM10/29/20
to django-...@googlegroups.com
#32155: Count() by default uses group by
-------------------------------------+-------------------------------------
Reporter: Sukhinder Mann | Owner: nobody
Type: New feature | Status: closed
Component: Database layer | Version: 2.2
(models, ORM) |
Severity: Normal | Resolution: invalid
Keywords: | Triage Stage:
| Unreviewed
Has patch: 0 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Sukhinder Mann):

Replying to [comment:3 Simon Charette]:


> I think you're misunderstanding the `annotate` vs `aggregate` features
of the ORM.
>
> Annotations of aggregate functions have no choices but to group by the
queryset's model to provide a result for each rows. If you want to perform
an aggregation over filtered rows without grouping you'll want to use
`aggregate` instead.
>
> https://docs.djangoproject.com/en/3.1/topics/db/aggregation/
>
> Please use support channels and consult documentation before filling a
ticket next time as we use Trac solely for tracking feature requests and
confirmed bugs.
>
>
https://code.djangoproject.com/wiki/TicketClosingReasons/UseSupportChannels

Sorry for reporting a non-issue. This is completely my bad, you are
correct I should have used `aggregate` and not `annotate`. Appreciate your
help, thank you.

--
Ticket URL: <https://code.djangoproject.com/ticket/32155#comment:4>

Reply all
Reply to author
Forward
0 new messages