[Django] #33403: Annotate results change when filtering *after* the annotate

13 views
Skip to first unread message

Django

unread,
Jan 1, 2022, 12:27:10 PM1/1/22
to django-...@googlegroups.com
#33403: Annotate results change when filtering *after* the annotate
-------------------------------------+-------------------------------------
Reporter: karyon | Owner: nobody
Type: | Status: new
Uncategorized |
Component: Database | Version: 4.0
layer (models, ORM) |
Severity: Normal | Keywords:
Triage Stage: | Has patch: 0
Unreviewed |
Needs documentation: 0 | Needs tests: 0
Patch needs improvement: 0 | Easy pickings: 0
UI/UX: 0 |
-------------------------------------+-------------------------------------
Publisher A has two books with each 100 pages.


{{{
Publisher.objects.annotate(Sum("books_pages"))
}}}

will give you 200 as the sum. However, adding a filter after the annotate

{{{
Publisher.objects.annotate(Sum("books_pages")).filter(books__in=Books.objects.all())
}}}

will give you 400. (apparently that's the correct sum 200 multiplied by
the number of books of that publisher)

I understand that joins in annotates can produce incorrect results, akin
to the one documented here:
[https://docs.djangoproject.com/en/4.0/topics/db/aggregation/#combining-
multiple-aggregations #combining-multiple-aggregations]. However, the docs
there say only "Combining multiple aggregations with annotate() will yield
the wrong results", and here I'm not combining multiple aggregations.
Furthermore, [https://docs.djangoproject.com/en/4.0/topics/db/aggregation
/#order-of-annotate-and-filter-clauses #order-of-annotate-and-filter-
clauses] says "When an annotate() clause is applied to a query, the
annotation is computed over the state of the query up to the point where
the annotation is requested.", which further made me believe this should
actually work.

--
Ticket URL: <https://code.djangoproject.com/ticket/33403>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

Django

unread,
Jan 1, 2022, 1:17:53 PM1/1/22
to django-...@googlegroups.com
#33403: Annotate results change when filtering *after* the annotate
-------------------------------------+-------------------------------------
Reporter: karyon | Owner: nobody
Type: Uncategorized | Status: new
Component: Database layer | Version: 4.0
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage:
| Unreviewed
Has patch: 0 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by karyon):

* cc: karyon (added)


Old description:

> Publisher A has two books with each 100 pages.
>

> {{{
> Publisher.objects.annotate(Sum("books_pages"))
> }}}
>
> will give you 200 as the sum. However, adding a filter after the annotate
>
> {{{
> Publisher.objects.annotate(Sum("books_pages")).filter(books__in=Books.objects.all())
> }}}
>
> will give you 400. (apparently that's the correct sum 200 multiplied by
> the number of books of that publisher)
>
> I understand that joins in annotates can produce incorrect results, akin
> to the one documented here:
> [https://docs.djangoproject.com/en/4.0/topics/db/aggregation/#combining-
> multiple-aggregations #combining-multiple-aggregations]. However, the
> docs there say only "Combining multiple aggregations with annotate() will
> yield the wrong results", and here I'm not combining multiple
> aggregations. Furthermore,
> [https://docs.djangoproject.com/en/4.0/topics/db/aggregation/#order-of-

> annotate-and-filter-clauses #order-of-annotate-and-filter-clauses] says


> "When an annotate() clause is applied to a query, the annotation is
> computed over the state of the query up to the point where the annotation
> is requested.", which further made me believe this should actually work.

New description:

Publisher A has two books with each 100 pages.


{{{
Publisher.objects.annotate(Sum("books_pages"))
}}}

will give you 200 as the sum. However, adding a filter after the annotate

{{{
Publisher.objects.annotate(Sum("books_pages")).filter(books__in=Books.objects.all())
}}}

will give you 400. (apparently that's the correct sum 200 multiplied by
the number of books of that publisher)

I understand that joins in annotates can produce incorrect results, akin
to the one documented here:
[https://docs.djangoproject.com/en/4.0/topics/db/aggregation/#combining-
multiple-aggregations #combining-multiple-aggregations]. However, the docs
there say only "Combining multiple aggregations with annotate() will yield
the wrong results", and here I'm not combining multiple aggregations.
Furthermore, [https://docs.djangoproject.com/en/4.0/topics/db/aggregation
/#order-of-annotate-and-filter-clauses #order-of-annotate-and-filter-
clauses] says "When an annotate() clause is applied to a query, the
annotation is computed over the state of the query up to the point where
the annotation is requested.", which further made me believe this should
actually work.

It seems that Sum also has an undocumented distinct parameter. When I
applied that one, I got incorrect results as well, and I couldn't even
tell how those were computed.

--

--
Ticket URL: <https://code.djangoproject.com/ticket/33403#comment:1>

Django

unread,
Jan 1, 2022, 3:34:59 PM1/1/22
to django-...@googlegroups.com
#33403: Annotate results change when filtering *after* the annotate
-------------------------------------+-------------------------------------
Reporter: karyon | Owner: nobody
Type: Uncategorized | Status: new
Component: Database layer | Version: 4.0
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage:
| Unreviewed
Has patch: 0 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Description changed by karyon:

Old description:

> Publisher A has two books with each 100 pages.
>

> {{{
> Publisher.objects.annotate(Sum("books_pages"))
> }}}
>
> will give you 200 as the sum. However, adding a filter after the annotate
>
> {{{
> Publisher.objects.annotate(Sum("books_pages")).filter(books__in=Books.objects.all())
> }}}
>
> will give you 400. (apparently that's the correct sum 200 multiplied by
> the number of books of that publisher)
>
> I understand that joins in annotates can produce incorrect results, akin
> to the one documented here:
> [https://docs.djangoproject.com/en/4.0/topics/db/aggregation/#combining-
> multiple-aggregations #combining-multiple-aggregations]. However, the
> docs there say only "Combining multiple aggregations with annotate() will
> yield the wrong results", and here I'm not combining multiple
> aggregations. Furthermore,
> [https://docs.djangoproject.com/en/4.0/topics/db/aggregation/#order-of-

> annotate-and-filter-clauses #order-of-annotate-and-filter-clauses] says


> "When an annotate() clause is applied to a query, the annotation is
> computed over the state of the query up to the point where the annotation
> is requested.", which further made me believe this should actually work.
>

> It seems that Sum also has an undocumented distinct parameter. When I
> applied that one, I got incorrect results as well, and I couldn't even
> tell how those were computed.

New description:

Publisher A has two books with each 100 pages.


{{{
Publisher.objects.annotate(Sum("books_pages"))
}}}

will give you 200 as the sum. However, adding a filter after the annotate

{{{
Publisher.objects.annotate(Sum("books_pages")).filter(books__in=Books.objects.all())
}}}

will give you 400. (apparently that's the correct sum 200 multiplied by
the number of books of that publisher)

I understand that joins in annotates can produce incorrect results, akin
to the one documented here:
[https://docs.djangoproject.com/en/4.0/topics/db/aggregation/#combining-
multiple-aggregations #combining-multiple-aggregations]. However, the docs
there say only "Combining multiple aggregations with annotate() will yield
the wrong results", and here I'm not combining multiple aggregations.
Furthermore, [https://docs.djangoproject.com/en/4.0/topics/db/aggregation
/#order-of-annotate-and-filter-clauses #order-of-annotate-and-filter-
clauses] says "When an annotate() clause is applied to a query, the
annotation is computed over the state of the query up to the point where
the annotation is requested.", which further made me believe this should
actually work.

--

--
Ticket URL: <https://code.djangoproject.com/ticket/33403#comment:2>

Django

unread,
Jan 1, 2022, 7:10:49 PM1/1/22
to django-...@googlegroups.com
#33403: Annotate results change when filtering *after* the annotate
-------------------------------------+-------------------------------------
Reporter: karyon | Owner: nobody
Type: Uncategorized | Status: closed

Component: Database layer | Version: 4.0
(models, ORM) |
Severity: Normal | Resolution: invalid
Keywords: | Triage Stage:
| Unreviewed
Has patch: 0 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Simon Charette):

* status: new => closed
* resolution: => invalid


Comment:

While the documentation doesn't explicit mention this particular
combination you are effectively hitting #10060 where the `book` table is
joined twice because instead of a single time and used for aggregation and
filtering.

In order to avoid this problem you can use a `FilteredRelation` alias or
simply the `condition` argument of `Sum`

{{{#!python
Publisher.objects.annotate(Sum("books__pages",
condition=Q(books__in=Book.objects.all()))

Publisher.objects.alias(
filtered_books=FilteredRelation("books",
condition=Q(books__in=Book.objects.all()))
).annotate(
Sum("filtered_books__pages"),
)
}}}

Please TicketClosingReasons/UseSupportChannels in the future for this kind
of questions

---

I kind of wish we made `FiltereRelation(relation: str, condition: Q)`
`Relation(relation: str, condition: Optional[Q])` instead and documented
its usage to circumvent this problem with the use of `alias` to ''tell''
the ORM a JOIN **must** be reused.

--
Ticket URL: <https://code.djangoproject.com/ticket/33403#comment:3>

Django

unread,
Jan 2, 2022, 3:24:23 PM1/2/22
to django-...@googlegroups.com
#33403: Annotate results change when filtering *after* the annotate
-------------------------------------+-------------------------------------
Reporter: karyon | Owner: nobody
Type: Uncategorized | Status: closed
Component: Database layer | Version: 4.0
(models, ORM) |
Severity: Normal | Resolution: invalid
Keywords: | Triage Stage:
| Unreviewed
Has patch: 0 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by karyon):

Hi Simon, I'm sorry if I was unclear. I didn't intend this to be a
question, I already have a workaround in place. This was intended to be a
bug report: Two sections/sentences in the documentation are incorrect as
to which cases should or should not work, see the last paragraph of my
report.

In particular, I think the entire section "combining multiple
aggregations" should be generalized, since it's not only a second
annotation that triggers the issue, but (apparently?) any operation that
produces another join, such as filtering by an attribute in a related
model. In addition, the sentence "the annotation is computed over the
state of the query up to the point where the annotation is requested" is
misleading: A subsequent filter or additional aggregation can alter the
results of a previous one. It could help users if we added something like
"except for these cases <link to a generalized 'combining multiple
aggregations' section>".

Your suggested solution is not necessary in this case, simply filtering
before the annotation does work correctly as far as I could see:

{{{
# works as expected, sums the filtered books
Publisher.objects.filter(books__in=Books.objects.all()).annotate(Sum("books_pages"))
}}}

What does not work correctly is filtering the publishers by some property
of their books *after* the aggregation without affecting the aggregation:

{{{
# this alters the sum although the documentation suggests this should just
work
Publisher.objects.annotate(Sum("books_pages")).filter(books__in=Books.objects.all())
}}}

--
Ticket URL: <https://code.djangoproject.com/ticket/33403#comment:4>

Django

unread,
Jan 2, 2022, 3:26:43 PM1/2/22
to django-...@googlegroups.com
#33403: Annotate results change when filtering *after* the annotate
-------------------------------------+-------------------------------------
Reporter: karyon | Owner: nobody
Type: Uncategorized | Status: closed
Component: Database layer | Version: 4.0
(models, ORM) |
Severity: Normal | Resolution: invalid
Keywords: | Triage Stage:
| Unreviewed
Has patch: 0 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Description changed by karyon:

Old description:

> Publisher A has two books with each 100 pages.
>

> {{{
> Publisher.objects.annotate(Sum("books_pages"))
> }}}
>
> will give you 200 as the sum. However, adding a filter after the annotate
>
> {{{
> Publisher.objects.annotate(Sum("books_pages")).filter(books__in=Books.objects.all())
> }}}
>
> will give you 400. (apparently that's the correct sum 200 multiplied by
> the number of books of that publisher)
>
> I understand that joins in annotates can produce incorrect results, akin
> to the one documented here:
> [https://docs.djangoproject.com/en/4.0/topics/db/aggregation/#combining-
> multiple-aggregations #combining-multiple-aggregations]. However, the
> docs there say only "Combining multiple aggregations with annotate() will
> yield the wrong results", and here I'm not combining multiple
> aggregations. Furthermore,
> [https://docs.djangoproject.com/en/4.0/topics/db/aggregation/#order-of-

> annotate-and-filter-clauses #order-of-annotate-and-filter-clauses] says


> "When an annotate() clause is applied to a query, the annotation is
> computed over the state of the query up to the point where the annotation
> is requested.", which further made me believe this should actually work.

New description:

Publisher A has two books with each 100 pages.


{{{
Publisher.objects.annotate(Sum("books__pages"))
}}}

will give you 200 as the sum. However, adding a filter after the annotate

{{{
Publisher.objects.annotate(Sum("books__pages")).filter(books__in=Books.objects.all())
}}}

will give you 400. (apparently that's the correct sum 200 multiplied by
the number of books of that publisher)

I understand that joins in annotates can produce incorrect results, akin
to the one documented here:
[https://docs.djangoproject.com/en/4.0/topics/db/aggregation/#combining-
multiple-aggregations #combining-multiple-aggregations]. However, the docs
there say only "Combining multiple aggregations with annotate() will yield
the wrong results", and here I'm not combining multiple aggregations.
Furthermore, [https://docs.djangoproject.com/en/4.0/topics/db/aggregation
/#order-of-annotate-and-filter-clauses #order-of-annotate-and-filter-
clauses] says "When an annotate() clause is applied to a query, the
annotation is computed over the state of the query up to the point where
the annotation is requested.", which further made me believe this should
actually work.

--

--
Ticket URL: <https://code.djangoproject.com/ticket/33403#comment:5>

Django

unread,
Jan 2, 2022, 3:29:22 PM1/2/22
to django-...@googlegroups.com
#33403: Annotate results change when filtering *after* the annotate
-------------------------------------+-------------------------------------
Reporter: karyon | Owner: nobody
Type: Uncategorized | Status: closed
Component: Database layer | Version: 4.0
(models, ORM) |
Severity: Normal | Resolution: invalid
Keywords: | Triage Stage:
| Unreviewed
Has patch: 0 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by karyon):

I just realized there was an underscore missing in the original report: It
should be "books__pages", and not "books_pages". Sorry for the confusion!

--
Ticket URL: <https://code.djangoproject.com/ticket/33403#comment:6>

Django

unread,
Jan 3, 2022, 9:29:26 AM1/3/22
to django-...@googlegroups.com
#33403: Annotate results change when filtering *after* the annotate
-------------------------------------+-------------------------------------
Reporter: karyon | Owner: nobody
Type: | Status: new
Cleanup/optimization |

Component: Database layer | Version: 4.0
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Simon Charette):

* status: closed => new
* type: Uncategorized => Cleanup/optimization
* resolution: invalid =>
* stage: Unreviewed => Accepted


Comment:

While I'm not sure how to word this properly myself I also agree that the
''the annotation is computed over the state of the query up to the point
where the annotation is requested'' statement can be misleading without
context about how multi-valued filtering behaves.

Given we've recently revised the latter's documentation
(6174814dbe04fb6668aa212a6cdbca765a8b0522) I'm re-opening this ticket to
investigate doing the same here.

Maybe a single addition to docs about how multi-valued filtering against a
previously annotated aggregation might still yield duplicate results would
be enough?

Django

unread,
Nov 8, 2022, 1:39:23 PM11/8/22
to django-...@googlegroups.com
#33403: Annotate results change when filtering *after* the annotate
-------------------------------------+-------------------------------------
Reporter: karyon | Owner: Abhinav
Type: | Yadav
Cleanup/optimization | Status: assigned

Component: Database layer | Version: 4.0
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 1

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Abhinav Yadav):

* owner: nobody => Abhinav Yadav
* needs_docs: 0 => 1
* has_patch: 0 => 1
* status: new => assigned


Comment:

[https://github.com/django/django/pull/16273 PR]

--
Ticket URL: <https://code.djangoproject.com/ticket/33403#comment:7>

Django

unread,
Nov 8, 2022, 2:30:42 PM11/8/22
to django-...@googlegroups.com
#33403: Annotate results change when filtering *after* the annotate
-------------------------------------+-------------------------------------
Reporter: karyon | Owner: Abhinav
Type: | Yadav
Cleanup/optimization | Status: assigned
Component: Database layer | Version: 4.0
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Mariusz Felisiak):

* needs_docs: 1 => 0


--
Ticket URL: <https://code.djangoproject.com/ticket/33403#comment:8>

Django

unread,
Nov 9, 2022, 2:07:15 AM11/9/22
to django-...@googlegroups.com
#33403: Annotate results change when filtering *after* the annotate
-------------------------------------+-------------------------------------
Reporter: karyon | Owner: Abhinav
Type: | Yadav
Cleanup/optimization | Status: assigned
Component: Database layer | Version: 4.0
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 1

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Mariusz Felisiak):

* needs_better_patch: 0 => 1


--
Ticket URL: <https://code.djangoproject.com/ticket/33403#comment:9>

Django

unread,
Nov 14, 2022, 12:04:28 AM11/14/22
to django-...@googlegroups.com
#33403: Annotate results change when filtering *after* the annotate
-------------------------------------+-------------------------------------
Reporter: karyon | Owner: Abhinav
Type: | Yadav
Cleanup/optimization | Status: assigned
Component: Database layer | Version: 4.0
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Abhinav Yadav):

* needs_better_patch: 1 => 0


--
Ticket URL: <https://code.djangoproject.com/ticket/33403#comment:10>

Django

unread,
Nov 22, 2022, 11:50:22 PM11/22/22
to django-...@googlegroups.com
#33403: Annotate results change when filtering *after* the annotate
-------------------------------------+-------------------------------------
Reporter: karyon | Owner: Abhinav
Type: | Yadav
Cleanup/optimization | Status: assigned
Component: Database layer | Version: 4.0
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Simon Charette):

FWIW this seems highly related to #15049.

--
Ticket URL: <https://code.djangoproject.com/ticket/33403#comment:11>

Django

unread,
Nov 23, 2022, 1:58:09 AM11/23/22
to django-...@googlegroups.com
#33403: Annotate results change when filtering *after* the annotate
-------------------------------------+-------------------------------------
Reporter: karyon | Owner: Abhinav
Type: | Yadav
Cleanup/optimization | Status: closed

Component: Database layer | Version: 4.0
(models, ORM) |
Severity: Normal | Resolution: duplicate
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Mariusz Felisiak):

* status: assigned => closed
* resolution: => duplicate


Comment:

Replying to [comment:11 Simon Charette]:
> FWIW this seems highly related if not a duplicate of #15049.

Agreed, let's close it as a duplicate of #15049.

--
Ticket URL: <https://code.djangoproject.com/ticket/33403#comment:12>

Reply all
Reply to author
Forward
0 new messages