This isn't well documented and I think many of the examples in the
documentation can lead to folks writing unsafe code that works most of the
time, but can fail for empty queries.
For example (using models from
https://docs.djangoproject.com/en/4.2/topics/db/aggregation/):
{{{#!python
obscure_publisher = ...
q =
Book.objects.filter(publisher=obscure_publisher).Aggregate(Sum("pages"))
total_pages = q["pages"] # None if the publisher hasn't published any
books!
# Silly example, but will raise a TypeError
kilopages = total_pages / 1000
}}}
From what I've seen online, I think the safer approach is to use Coalesce.
{{{#!python
obscure_publisher = ...
q =
Book.objects.filter(publisher=obscure_publisher).Aggregate(Coalesce(Sum("pages"),
0)
total_pages = q["pages"] # Safe; using coalesce guarantees an int.
}}}
The only reference I've found to using Coalesce in this way appears to be
a usage example at https://docs.djangoproject.com/en/4.2/ref/models
/database-functions/#coalesce. Coalesce is not mentioned at all on the
main aggregation page.
I think the documentation could be improved in a few ways.
On https://docs.djangoproject.com/en/4.2/topics/db/aggregation/:
- The code examples could consider adding Coalesce, or at least adding a
comment at the top of the cheat sheet section mentioning that the examples
do not include error checking and that using Coalesce is best practice.
For aggregation functions mentioned in
https://docs.djangoproject.com/en/4.2/ref/models/querysets/:
- Add something to the documentation of the return type for functions used
by aggregate() that mentions the empty queryset case if the result type
differs for empty queries.
- For example, Count() safely returns an int even with empty querysets,
but Sum, Avg, Min, and Max do not.
- Might also be worth mentioning that one can use Coalesce to avoid having
to deal with None values.
--
Ticket URL: <https://code.djangoproject.com/ticket/34808>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.
* cc: Nick Pope (added)
* stage: Unreviewed => Accepted
Comment:
I guess we should augment the
[https://docs.djangoproject.com/en/4.2/topics/db/aggregation/ aggregation
topic] to more clearly highlight the usage of the `Aggregate.default`
option added in 4.0 (see #10929) which was meant to be address this exact
issue by providing [https://github.com/django/django/pull/14026/files
#diff-c7d87b6e6b973bf87e5685442a6c1da9531c24bcaa86fe158c40ea0276986f32R68
a shorthand that avoids wrapping every aggregate] in `Coalesce`
(`Aggregate(default=42)` -> `Coalesce(Aggregate(), 42)`.
Would you be interested in providing documentation adjustments Eric?
--
Ticket URL: <https://code.djangoproject.com/ticket/34808#comment:1>
Comment (by Eric Baumgartner):
Ah, thanks, I completely missed that `default` was an option. (In 4.0, at
least -- I"m still working with some 3.2 projects.)
I'm happy to take a shot at some documentation changes. I haven't
contributed to Django before, so if you have any pointers for newbies I'd
appreciate it.
Now that I've reviewed #10929 (and I know what to look for) I think
coverage is better than I'd originally thought. I think the main things
I'd address include:
- Reviewing all example code across pages that uses `.aggregate(...)` and
adding an explicit `default` param or `None` handling when appropriate. I
think this is important because many people grab the sample code and don't
read additional documentation.
- On the aggregation topic page specifically, add a new section ("Handling
empty querysets"?) that discusses `default` and explains that it uses
Coalesce under the hood.
- Add something to the documentation of `Aggregation.default` on the
querysets page mentioning that it uses Coalesce under the hood.
Does that scope make sense?
I think mentioning Coalesce is helpful because it provides a hint for
pre-4 users who can't use `default`. Is that sort of thing OK? I don't
know what the project's position is on mentioning stuff that only applies
to older versions.
Somewhat related, should the documentation for `default` have a "New in
4.0" flag? Or are those flags reserved for just changed in the last
release or two?
--
Ticket URL: <https://code.djangoproject.com/ticket/34808#comment:2>
Comment (by David Sanders):
Hi Eric,
Thanks for putting your hand up to take a shot! Writing docs isn't too
hard, basically need to check out the repo and update the relevant file
under docs/. It's all in Sphinx and each file corresponds to a page in the
docs (the file path corresponds to the URL path, conveniently enough). You
then run `make html` and point your browser to the generated html.
See https://docs.djangoproject.com/en/dev/internals/contributing/writing-
documentation/ Carlton also mentioned that there was some effort to try
aligning the docs to: https://diataxis.fr/
Once your happy with how it looks it's simply a matter of creating a pull
request and referencing this ticket. (see commit log for message format)
Re your points:
- It's not just empty querysets that produce `None`. Empty groups in a
non-empty queryset will do the same thing.
- A section on handling `None` using `default` is a good idea 👍
Explaining why it returns `None` by default is also a good idea.
- I'm not sure whether it's worth going through every example to
explicitly handle defaults, the dedicated section may be enough, Nessita &
Felix may make a call on this.
Good luck!
--
Ticket URL: <https://code.djangoproject.com/ticket/34808#comment:3>
Comment (by Lufafa Joshua):
Hey Simon, i would love to help with this ticket. some guidance along the
way will be helpful. Thanks
--
Ticket URL: <https://code.djangoproject.com/ticket/34808#comment:4>
Comment (by David Sanders):
Hi Lufafa,
It'd be best to check that Eric hasn't already started… (please read the
comments above).
--
Ticket URL: <https://code.djangoproject.com/ticket/34808#comment:5>
Comment (by Eric Baumgartner):
Replying to [comment:5 David Sanders]:
> Hi Lufafa,
>
> It'd be best to check that Eric hasn't already started… (please read the
comments above).
I haven't started on this, Lufafa you're welcome to dig in if you want.
Otherwise I'll try to take a pass on this over the weekend.
Based on David's feedback I'm backing off the idea of adding default
params to every use of .aggregate() site-wide and just focusing on the
examples on the aggregation topic page.
On the querysets page, I would mention that Aggregate.defaults uses
Coalesce. And I'm also thinking that the "Return type" documentation for
each individual function (Sum, Avg, etc) needs to mention the case where
it may return None. While there is a note about empty querysets directly
under the Aggregation functions header, I think the way documentation is
used is that a lot of people scan the list of functions in the sidebar and
click the one they're interested in. This scrolls them directly to the
function documentation (e.g. for Sum) and they'll never see the empty
queryset note.
--
Ticket URL: <https://code.djangoproject.com/ticket/34808#comment:6>
* owner: nobody => Lufafa Joshua
* status: new => assigned
--
Ticket URL: <https://code.djangoproject.com/ticket/34808#comment:7>
* has_patch: 0 => 1
Comment:
[# PR]=https://github.com/django/django/pull/17251
--
Ticket URL: <https://code.djangoproject.com/ticket/34808#comment:8>
* needs_better_patch: 0 => 1
--
Ticket URL: <https://code.djangoproject.com/ticket/34808#comment:9>
Comment (by David Sanders):
Eric, there's a PR up if you'd like to provide some feedback – since you
had a few points.
--
Ticket URL: <https://code.djangoproject.com/ticket/34808#comment:10>
Comment (by Lufafa Joshua):
new [# PR] https://github.com/django/django/pull/17253
--
Ticket URL: <https://code.djangoproject.com/ticket/34808#comment:11>
Comment (by Lufafa Joshua):
Hey guys, would it be necessary to add a possible return of None on every
aggregate function in
https://docs.djangoproject.com/en/4.2/ref/models/querysets/. There is a
notice under aggregation functions on the querysets page that already does
so, perhaps adding a link on the querysets page about the default argument
usage in https://docs.djangoproject.com/en/4.2/topics/db/aggregation/
would be helpful.
--
Ticket URL: <https://code.djangoproject.com/ticket/34808#comment:12>
Comment (by Eric Baumgartner):
>Hey guys, would it be necessary to add a possible return of None on every
aggregate function
I'm curious what others think, but I'm pretty firm that the documentation
for specific aggregation functions like **Sum** should account for
**None** as a possible return value.
I know this may seem redundant given that **default** is already
documented under Aggregation functions generally.
However, the way some people use the documentation means that they won't
see the general aggregation documentation, because that's not what they're
looking for.
Example (which is pretty close to my experience, which is what led me to
file this issue):
- I want to calculate the total of some field across the records in a
queryset.
- I'm pretty sure this will be called Sum, so I hit the documentation to
look up that function.
- On the querysets page, I scan down the Contents sidebar on the right.
- I find **Sum** in the sidebar. Great! I click it.
I'm now scrolled directly to the definition of the Sum function, which
says:
>Return type: same as input field, or output_field if supplied
Why would I not take that at face value? There's nothing in the definition
of Sum that would lead me to scroll up the page to see the general
documentation for aggregation functions.
If I'm calling Sum for an IntegerField, the return type is int. I
interpret that line as equivalent to `Sum(...) -> int`. When in reality it
is `Sum(...) -> Optional[int]`.
I think a relatively simple fix would be to change this to:
>Return type: same as input field, or output_field if supplied. Returns
**None** for empty querysets.
I would suggest adding this for all aggregation functions except `Count`.
--
Ticket URL: <https://code.djangoproject.com/ticket/34808#comment:13>
Comment (by Natalia <124304+nessita@…>):
In [changeset:"78b5c9075348aa12da2e024f6ece29d1d652dfdd" 78b5c907]:
{{{
#!CommitTicketReference repository=""
revision="78b5c9075348aa12da2e024f6ece29d1d652dfdd"
Refs #34808 -- Doc'd that aggregation functions on empty groups can return
None.
}}}
--
Ticket URL: <https://code.djangoproject.com/ticket/34808#comment:14>
Comment (by Natalia <124304+nessita@…>):
In [changeset:"fb5dd118e955e34c84effc295aaabcec12e6295b" fb5dd118]:
{{{
#!CommitTicketReference repository=""
revision="fb5dd118e955e34c84effc295aaabcec12e6295b"
[5.0.x] Refs #34808 -- Doc'd that aggregation functions on empty groups
can return None.
Backport of 78b5c9075348aa12da2e024f6ece29d1d652dfdd from main
}}}
--
Ticket URL: <https://code.djangoproject.com/ticket/34808#comment:15>
Comment (by Natalia <124304+nessita@…>):
In [changeset:"b08f53ff46238301431084b50762a40170d7869d" b08f53f]:
{{{
#!CommitTicketReference repository=""
revision="b08f53ff46238301431084b50762a40170d7869d"
[4.2.x] Refs #34808 -- Doc'd that aggregation functions on empty groups
can return None.
Backport of 78b5c9075348aa12da2e024f6ece29d1d652dfdd from main
}}}
--
Ticket URL: <https://code.djangoproject.com/ticket/34808#comment:16>
Comment (by Lufafa Joshua):
~~https://github.com/django/django/pull/17253~~
new PR https://github.com/django/django/pull/17308
--
Ticket URL: <https://code.djangoproject.com/ticket/34808#comment:17>
* status: assigned => closed
* resolution: => fixed
Comment:
In [changeset:"8adc7c86ab85ed91e512bc49056e301cbe1715d0" 8adc7c86]:
{{{
#!CommitTicketReference repository=""
revision="8adc7c86ab85ed91e512bc49056e301cbe1715d0"
Fixed #34808 -- Doc'd aggregate function's default argument.
}}}
--
Ticket URL: <https://code.djangoproject.com/ticket/34808#comment:18>
Comment (by Natalia <124304+nessita@…>):
In [changeset:"d4bbdf5337089925f2900c44d3e953a947e4ca0e" d4bbdf53]:
{{{
#!CommitTicketReference repository=""
revision="d4bbdf5337089925f2900c44d3e953a947e4ca0e"
[5.0.x] Fixed #34808 -- Doc'd aggregate function's default argument.
Backport of 8adc7c86ab85ed91e512bc49056e301cbe1715d0 from main
}}}
--
Ticket URL: <https://code.djangoproject.com/ticket/34808#comment:19>
Comment (by Natalia <124304+nessita@…>):
In [changeset:"e8fe48d3a02dbe3b57a87b334f6335b701f0306d" e8fe48d]:
{{{
#!CommitTicketReference repository=""
revision="e8fe48d3a02dbe3b57a87b334f6335b701f0306d"
[4.2.x] Fixed #34808 -- Doc'd aggregate function's default argument.
Backport of 8adc7c86ab85ed91e512bc49056e301cbe1715d0 from main
}}}
--
Ticket URL: <https://code.djangoproject.com/ticket/34808#comment:20>