{{{
Publisher.objects.annotate(Sum("books_pages"))
}}}
will give you 200 as the sum. However, adding a filter after the annotate
{{{
Publisher.objects.annotate(Sum("books_pages")).filter(books__in=Books.objects.all())
}}}
will give you 400. (apparently that's the correct sum 200 multiplied by
the number of books of that publisher)
I understand that joins in annotates can produce incorrect results, akin
to the one documented here:
[https://docs.djangoproject.com/en/4.0/topics/db/aggregation/#combining-
multiple-aggregations #combining-multiple-aggregations]. However, the docs
there say only "Combining multiple aggregations with annotate() will yield
the wrong results", and here I'm not combining multiple aggregations.
Furthermore, [https://docs.djangoproject.com/en/4.0/topics/db/aggregation
/#order-of-annotate-and-filter-clauses #order-of-annotate-and-filter-
clauses] says "When an annotate() clause is applied to a query, the
annotation is computed over the state of the query up to the point where
the annotation is requested.", which further made me believe this should
actually work.
--
Ticket URL: <https://code.djangoproject.com/ticket/33403>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.
* cc: karyon (added)
Old description:
> Publisher A has two books with each 100 pages.
>
> {{{
> Publisher.objects.annotate(Sum("books_pages"))
> }}}
>
> will give you 200 as the sum. However, adding a filter after the annotate
>
> {{{
> Publisher.objects.annotate(Sum("books_pages")).filter(books__in=Books.objects.all())
> }}}
>
> will give you 400. (apparently that's the correct sum 200 multiplied by
> the number of books of that publisher)
>
> I understand that joins in annotates can produce incorrect results, akin
> to the one documented here:
> [https://docs.djangoproject.com/en/4.0/topics/db/aggregation/#combining-
> multiple-aggregations #combining-multiple-aggregations]. However, the
> docs there say only "Combining multiple aggregations with annotate() will
> yield the wrong results", and here I'm not combining multiple
> aggregations. Furthermore,
> [https://docs.djangoproject.com/en/4.0/topics/db/aggregation/#order-of-
> annotate-and-filter-clauses #order-of-annotate-and-filter-clauses] says
> "When an annotate() clause is applied to a query, the annotation is
> computed over the state of the query up to the point where the annotation
> is requested.", which further made me believe this should actually work.
New description:
Publisher A has two books with each 100 pages.
{{{
Publisher.objects.annotate(Sum("books_pages"))
}}}
will give you 200 as the sum. However, adding a filter after the annotate
{{{
Publisher.objects.annotate(Sum("books_pages")).filter(books__in=Books.objects.all())
}}}
will give you 400. (apparently that's the correct sum 200 multiplied by
the number of books of that publisher)
I understand that joins in annotates can produce incorrect results, akin
to the one documented here:
[https://docs.djangoproject.com/en/4.0/topics/db/aggregation/#combining-
multiple-aggregations #combining-multiple-aggregations]. However, the docs
there say only "Combining multiple aggregations with annotate() will yield
the wrong results", and here I'm not combining multiple aggregations.
Furthermore, [https://docs.djangoproject.com/en/4.0/topics/db/aggregation
/#order-of-annotate-and-filter-clauses #order-of-annotate-and-filter-
clauses] says "When an annotate() clause is applied to a query, the
annotation is computed over the state of the query up to the point where
the annotation is requested.", which further made me believe this should
actually work.
It seems that Sum also has an undocumented distinct parameter. When I
applied that one, I got incorrect results as well, and I couldn't even
tell how those were computed.
--
--
Ticket URL: <https://code.djangoproject.com/ticket/33403#comment:1>
Old description:
> Publisher A has two books with each 100 pages.
>
> {{{
> Publisher.objects.annotate(Sum("books_pages"))
> }}}
>
> will give you 200 as the sum. However, adding a filter after the annotate
>
> {{{
> Publisher.objects.annotate(Sum("books_pages")).filter(books__in=Books.objects.all())
> }}}
>
> will give you 400. (apparently that's the correct sum 200 multiplied by
> the number of books of that publisher)
>
> I understand that joins in annotates can produce incorrect results, akin
> to the one documented here:
> [https://docs.djangoproject.com/en/4.0/topics/db/aggregation/#combining-
> multiple-aggregations #combining-multiple-aggregations]. However, the
> docs there say only "Combining multiple aggregations with annotate() will
> yield the wrong results", and here I'm not combining multiple
> aggregations. Furthermore,
> [https://docs.djangoproject.com/en/4.0/topics/db/aggregation/#order-of-
> annotate-and-filter-clauses #order-of-annotate-and-filter-clauses] says
> "When an annotate() clause is applied to a query, the annotation is
> computed over the state of the query up to the point where the annotation
> is requested.", which further made me believe this should actually work.
>
> It seems that Sum also has an undocumented distinct parameter. When I
> applied that one, I got incorrect results as well, and I couldn't even
> tell how those were computed.
New description:
Publisher A has two books with each 100 pages.
{{{
Publisher.objects.annotate(Sum("books_pages"))
}}}
will give you 200 as the sum. However, adding a filter after the annotate
{{{
Publisher.objects.annotate(Sum("books_pages")).filter(books__in=Books.objects.all())
}}}
will give you 400. (apparently that's the correct sum 200 multiplied by
the number of books of that publisher)
I understand that joins in annotates can produce incorrect results, akin
to the one documented here:
[https://docs.djangoproject.com/en/4.0/topics/db/aggregation/#combining-
multiple-aggregations #combining-multiple-aggregations]. However, the docs
there say only "Combining multiple aggregations with annotate() will yield
the wrong results", and here I'm not combining multiple aggregations.
Furthermore, [https://docs.djangoproject.com/en/4.0/topics/db/aggregation
/#order-of-annotate-and-filter-clauses #order-of-annotate-and-filter-
clauses] says "When an annotate() clause is applied to a query, the
annotation is computed over the state of the query up to the point where
the annotation is requested.", which further made me believe this should
actually work.
--
--
Ticket URL: <https://code.djangoproject.com/ticket/33403#comment:2>
* status: new => closed
* resolution: => invalid
Comment:
While the documentation doesn't explicit mention this particular
combination you are effectively hitting #10060 where the `book` table is
joined twice because instead of a single time and used for aggregation and
filtering.
In order to avoid this problem you can use a `FilteredRelation` alias or
simply the `condition` argument of `Sum`
{{{#!python
Publisher.objects.annotate(Sum("books__pages",
condition=Q(books__in=Book.objects.all()))
Publisher.objects.alias(
filtered_books=FilteredRelation("books",
condition=Q(books__in=Book.objects.all()))
).annotate(
Sum("filtered_books__pages"),
)
}}}
Please TicketClosingReasons/UseSupportChannels in the future for this kind
of questions
---
I kind of wish we made `FiltereRelation(relation: str, condition: Q)`
`Relation(relation: str, condition: Optional[Q])` instead and documented
its usage to circumvent this problem with the use of `alias` to ''tell''
the ORM a JOIN **must** be reused.
--
Ticket URL: <https://code.djangoproject.com/ticket/33403#comment:3>
Comment (by karyon):
Hi Simon, I'm sorry if I was unclear. I didn't intend this to be a
question, I already have a workaround in place. This was intended to be a
bug report: Two sections/sentences in the documentation are incorrect as
to which cases should or should not work, see the last paragraph of my
report.
In particular, I think the entire section "combining multiple
aggregations" should be generalized, since it's not only a second
annotation that triggers the issue, but (apparently?) any operation that
produces another join, such as filtering by an attribute in a related
model. In addition, the sentence "the annotation is computed over the
state of the query up to the point where the annotation is requested" is
misleading: A subsequent filter or additional aggregation can alter the
results of a previous one. It could help users if we added something like
"except for these cases <link to a generalized 'combining multiple
aggregations' section>".
Your suggested solution is not necessary in this case, simply filtering
before the annotation does work correctly as far as I could see:
{{{
# works as expected, sums the filtered books
Publisher.objects.filter(books__in=Books.objects.all()).annotate(Sum("books_pages"))
}}}
What does not work correctly is filtering the publishers by some property
of their books *after* the aggregation without affecting the aggregation:
{{{
# this alters the sum although the documentation suggests this should just
work
Publisher.objects.annotate(Sum("books_pages")).filter(books__in=Books.objects.all())
}}}
--
Ticket URL: <https://code.djangoproject.com/ticket/33403#comment:4>
Old description:
> Publisher A has two books with each 100 pages.
>
> {{{
> Publisher.objects.annotate(Sum("books_pages"))
> }}}
>
> will give you 200 as the sum. However, adding a filter after the annotate
>
> {{{
> Publisher.objects.annotate(Sum("books_pages")).filter(books__in=Books.objects.all())
> }}}
>
> will give you 400. (apparently that's the correct sum 200 multiplied by
> the number of books of that publisher)
>
> I understand that joins in annotates can produce incorrect results, akin
> to the one documented here:
> [https://docs.djangoproject.com/en/4.0/topics/db/aggregation/#combining-
> multiple-aggregations #combining-multiple-aggregations]. However, the
> docs there say only "Combining multiple aggregations with annotate() will
> yield the wrong results", and here I'm not combining multiple
> aggregations. Furthermore,
> [https://docs.djangoproject.com/en/4.0/topics/db/aggregation/#order-of-
> annotate-and-filter-clauses #order-of-annotate-and-filter-clauses] says
> "When an annotate() clause is applied to a query, the annotation is
> computed over the state of the query up to the point where the annotation
> is requested.", which further made me believe this should actually work.
New description:
Publisher A has two books with each 100 pages.
{{{
Publisher.objects.annotate(Sum("books__pages"))
}}}
will give you 200 as the sum. However, adding a filter after the annotate
{{{
Publisher.objects.annotate(Sum("books__pages")).filter(books__in=Books.objects.all())
}}}
will give you 400. (apparently that's the correct sum 200 multiplied by
the number of books of that publisher)
I understand that joins in annotates can produce incorrect results, akin
to the one documented here:
[https://docs.djangoproject.com/en/4.0/topics/db/aggregation/#combining-
multiple-aggregations #combining-multiple-aggregations]. However, the docs
there say only "Combining multiple aggregations with annotate() will yield
the wrong results", and here I'm not combining multiple aggregations.
Furthermore, [https://docs.djangoproject.com/en/4.0/topics/db/aggregation
/#order-of-annotate-and-filter-clauses #order-of-annotate-and-filter-
clauses] says "When an annotate() clause is applied to a query, the
annotation is computed over the state of the query up to the point where
the annotation is requested.", which further made me believe this should
actually work.
--
--
Ticket URL: <https://code.djangoproject.com/ticket/33403#comment:5>
Comment (by karyon):
I just realized there was an underscore missing in the original report: It
should be "books__pages", and not "books_pages". Sorry for the confusion!
--
Ticket URL: <https://code.djangoproject.com/ticket/33403#comment:6>
* status: closed => new
* type: Uncategorized => Cleanup/optimization
* resolution: invalid =>
* stage: Unreviewed => Accepted
Comment:
While I'm not sure how to word this properly myself I also agree that the
''the annotation is computed over the state of the query up to the point
where the annotation is requested'' statement can be misleading without
context about how multi-valued filtering behaves.
Given we've recently revised the latter's documentation
(6174814dbe04fb6668aa212a6cdbca765a8b0522) I'm re-opening this ticket to
investigate doing the same here.
Maybe a single addition to docs about how multi-valued filtering against a
previously annotated aggregation might still yield duplicate results would
be enough?
--
Ticket URL: <https://code.djangoproject.com/ticket/33403#comment:6>
* owner: nobody => Abhinav Yadav
* needs_docs: 0 => 1
* has_patch: 0 => 1
* status: new => assigned
Comment:
[https://github.com/django/django/pull/16273 PR]
--
Ticket URL: <https://code.djangoproject.com/ticket/33403#comment:7>
* needs_docs: 1 => 0
--
Ticket URL: <https://code.djangoproject.com/ticket/33403#comment:8>
* needs_better_patch: 0 => 1
--
Ticket URL: <https://code.djangoproject.com/ticket/33403#comment:9>
* needs_better_patch: 1 => 0
--
Ticket URL: <https://code.djangoproject.com/ticket/33403#comment:10>
Comment (by Simon Charette):
FWIW this seems highly related to #15049.
--
Ticket URL: <https://code.djangoproject.com/ticket/33403#comment:11>
* status: assigned => closed
* resolution: => duplicate
Comment:
Replying to [comment:11 Simon Charette]:
> FWIW this seems highly related if not a duplicate of #15049.
Agreed, let's close it as a duplicate of #15049.
--
Ticket URL: <https://code.djangoproject.com/ticket/33403#comment:12>