[Django] #33772: Queryset first() returns different result than queryset[0] when using ExpressionWrapper

14 views
Skip to first unread message

Django

unread,
Jun 7, 2022, 10:40:46 PM6/7/22
to django-...@googlegroups.com
#33772: Queryset first() returns different result than queryset[0] when using
ExpressionWrapper
-------------------------------------+-------------------------------------
Reporter: pablop94 | Owner: nobody
Type: | Status: new
Uncategorized |
Component: Database | Version: 3.2
layer (models, ORM) | Keywords:
Severity: Normal | ExpressionWrapper,first,queryset
Triage Stage: | Has patch: 0
Unreviewed |
Needs documentation: 0 | Needs tests: 0
Patch needs improvement: 0 | Easy pickings: 0
UI/UX: 0 |
-------------------------------------+-------------------------------------
Hi! I was creating a new app and found the following scenario, seems like
an issue to me:

I have three models:
{{{
class Transaction(models.Model):
amount = models.DecimalField(max_digits=10, decimal_places=2)
description = models.CharField(max_length=255)
created_by = models.ForeignKey(get_user_model(),
on_delete=models.CASCADE, related_name="created_transactions")
created_at = models.DateTimeField(auto_now_add=True)


class TransactionUser(models.Model):
user = models.ForeignKey(get_user_model(), on_delete=models.CASCADE,
related_name="transactions")
transaction = models.ForeignKey("bills.Transaction",
on_delete=models.CASCADE, related_name="users")
amount = models.DecimalField(max_digits=10, decimal_places=2)
transaction_type = models.ForeignKey('bills.TransactionType',
on_delete=models.PROTECT)


class TransactionType(models.Model):
name = models.CharField(max_length=50)
code = models.CharField(max_length=3)
multiplier = models.DecimalField(max_digits=5, decimal_places=2)

}}}

I create the following instances:
{{{
transaction = Transaction(amount=100)
positive = TransactionType(multiplier=1)
negative = TransactionType(multiplier=-1)

TransactionUser(transaction=transaction, user=user1, amount=50,
transaction_type=negative)
TransactionUser(transaction=transaction, user=user2, amount=50,
transaction_type=negative)
TransactionUser(transaction=transaction, user=user1, amount=10,
transaction_type=positive)

}}}
I want to obtain the balance of each transaction by doing the following
query:

{{{
TransactionUser.objects \
.filter(user_id=user_id) \
.annotate(
real_amount=ExpressionWrapper(F("amount")*F("transaction_type__multiplier"),
output_field=DecimalField()),
).values("transaction") \
.annotate(
balance=Sum("real_amount")
)
}}}
If I log the queryset returned, the value logged is:

{{{
<QuerySet [{'transaction': 1, 'balance': Decimal('-40.0000')}]>
}}}

But if I log
{{{
queryset.first()["balance"]
>>> -50
}}}
And if I log
{{{
queryset[0]["balance"]
>>> -40
}}}

It's also reproducible in 4.0

--
Ticket URL: <https://code.djangoproject.com/ticket/33772>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

Django

unread,
Jun 7, 2022, 11:48:48 PM6/7/22
to django-...@googlegroups.com
#33772: Queryset first() returns different result than queryset[0] when using
ExpressionWrapper
-------------------------------------+-------------------------------------
Reporter: Pablo Pissi | Owner: nobody
Type: Uncategorized | Status: closed
Component: Database layer | Version: 3.2
(models, ORM) |
Severity: Normal | Resolution: invalid
Keywords: | Triage Stage:
ExpressionWrapper,first,queryset | Unreviewed
Has patch: 0 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Simon Charette):

* status: new => closed
* resolution: => invalid


Comment:

This is happening because you don't explicitly define an ordering for your
queryset so [https://www.percona.com/blog/2017/04/07/non-deterministic-
order-select-limit/ the database can return whatever rows it wants] in the
case of `[0]` while `.first()` explicitly order by `pk`
[https://github.com/django/django/blob/49b470b9187b6be60e573fed08c8f4a87f133750/django/db/models/query.py#L1035
if no ordering is specified].

Please TicketClosingReasons/UseSupportChannels before filing issues in
this ticket tracker.

--
Ticket URL: <https://code.djangoproject.com/ticket/33772#comment:1>

Django

unread,
Jun 8, 2022, 6:23:17 AM6/8/22
to django-...@googlegroups.com
#33772: Queryset first() returns different result than queryset[0] when using
ExpressionWrapper
-------------------------------------+-------------------------------------
Reporter: Pablo Pissi | Owner: nobody
Type: Uncategorized | Status: new

Component: Database layer | Version: 3.2
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage:
ExpressionWrapper,first,queryset | Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Pablo Pissi):

* status: closed => new
* resolution: invalid =>


Comment:

Hi,

I've read the docs about ordering, but in this case the queryset has only
one value and the value returned by .first() is not a value that exists on
the queryset, is that correct?

It's returning values from the non-aggregated queryset

--
Ticket URL: <https://code.djangoproject.com/ticket/33772#comment:2>

Django

unread,
Jun 8, 2022, 10:08:58 AM6/8/22
to django-...@googlegroups.com
#33772: Queryset first() returns different result than queryset[0] when using
ExpressionWrapper
-------------------------------------+-------------------------------------
Reporter: Pablo Pissi | Owner: nobody
Type: Uncategorized | Status: closed

Component: Database layer | Version: 3.2
(models, ORM) |
Severity: Normal | Resolution: needsinfo

Keywords: | Triage Stage:
ExpressionWrapper,first,queryset | Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Carlton Gibson):

* status: new => closed

* resolution: => needsinfo


Comment:

> It's also reproducible …

Hi Pablo — it's looks like a usage problem, so the support channels are
the place to go, but if you can create a sample project that demonstrates
the issue, perhaps in a failing test case, I'm happy to look further.

--
Ticket URL: <https://code.djangoproject.com/ticket/33772#comment:3>

Django

unread,
Jun 8, 2022, 11:05:52 PM6/8/22
to django-...@googlegroups.com
#33772: Queryset first() returns different result than queryset[0] when using
ExpressionWrapper
-------------------------------------+-------------------------------------
Reporter: Pablo Pissi | Owner: nobody
Type: Uncategorized | Status: closed
Component: Database layer | Version: 3.2
(models, ORM) |
Severity: Normal | Resolution: needsinfo
Keywords: | Triage Stage:
ExpressionWrapper,first,queryset | Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Pablo Pissi):

Hi again!

Here is a repo with a test that shows the issue:
https://github.com/pablop94/django-poc

Let me know if you need more info.

--
Ticket URL: <https://code.djangoproject.com/ticket/33772#comment:4>

Django

unread,
Jun 9, 2022, 12:18:27 AM6/9/22
to django-...@googlegroups.com
#33772: Queryset first() returns different result than queryset[0] when using
ExpressionWrapper
-------------------------------------+-------------------------------------
Reporter: Pablo Pissi | Owner: nobody
Type: Uncategorized | Status: closed
Component: Database layer | Version: 3.2
(models, ORM) |
Severity: Normal | Resolution: needsinfo
Keywords: | Triage Stage:
ExpressionWrapper,first,queryset | Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Simon Charette):

The addition of `order_by` columns to the `GROUP BY` clause specified by
`values().annotate()`
[https://docs.djangoproject.com/en/4.0/topics/db/aggregation/#interaction-
with-order-by is documented].

The crux of the issue remain; you are asking for an element at a
particular index of a collection with an undefined ordering. Adding
`order_by("example_model")` to your test case addresses the irregularity
you are experiencing with `first()`.

`first()` could arguably be adjusted to raise an exception when attempted
to be used on an unordered collection to ''refuse the temptation to
guess'' but that would require a deprecation period. Another approach
could be to only raise an exception when `first()` is called on a
unordered queryset with an explicit `values()` that doesn't include `pk`
which is the case reported here.

Here's what the code could look like for a systematic error on attempts to
use `first()` on a undefined ordering queryset performing aggregation over
specific columns

{{{#!diff
diff --git a/django/db/models/query.py b/django/db/models/query.py
index 67fcb411eb..da3d1ff3a7 100644
--- a/django/db/models/query.py
+++ b/django/db/models/query.py
@@ -1032,7 +1032,14 @@ async def alatest(self, *fields):

def first(self):
"""Return the first object of a query or None if no match is
found."""
- for obj in (self if self.ordered else self.order_by("pk"))[:1]:
+ queryset = self
+ if not self.ordered:
+ if isinstance(self.query.group_by, tuple):
+ raise TypeError(
+ "Cannot use first() on an unordered queryset
performing aggregation"
+ )
+ queryset = self.order_by("pk")
+ for obj in queryset[:1]:
return obj

async def afirst(self):
}}}

And a somewhat more complex implementation that still allow the `pk`

{{{#!diff
diff --git a/django/db/models/query.py b/django/db/models/query.py
index 67fcb411eb..0534a2cc16 100644
--- a/django/db/models/query.py
+++ b/django/db/models/query.py
@@ -1032,7 +1032,17 @@ async def alatest(self, *fields):

def first(self):
"""Return the first object of a query or None if no match is
found."""
- for obj in (self if self.ordered else self.order_by("pk"))[:1]:
+ queryset = self
+ if not self.ordered:
+ if isinstance(self.query.group_by, tuple) and not any(
+ col.output_field is self.model._meta.pk for col in
self.query.group_by
+ ):
+ raise TypeError(
+ "Cannot use first() on an unordered queryset
performing aggregation "
+ "over a set of expressions that doesn't include the
base model's primary key"
+ )
+ queryset = self.order_by("pk")
+ for obj in queryset[:1]:
return obj

async def afirst(self):
}}}

The latter patch seems to be passing the test suite.

Thoughts on the value of such adjustment Carlton?

--
Ticket URL: <https://code.djangoproject.com/ticket/33772#comment:5>

Django

unread,
Jun 9, 2022, 2:58:29 AM6/9/22
to django-...@googlegroups.com
#33772: Queryset first() returns different result than queryset[0] when using
ExpressionWrapper
-------------------------------------+-------------------------------------
Reporter: Pablo Pissi | Owner: nobody
Type: | Status: new
Cleanup/optimization |

Component: Database layer | Version: 3.2
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
ExpressionWrapper,first,queryset |

Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Carlton Gibson):

* status: closed => new
* type: Uncategorized => Cleanup/optimization
* resolution: needsinfo =>
* stage: Unreviewed => Accepted


Comment:

Thanks both. Good.

> Thoughts on the value of such adjustment Carlton?

We do see tickets periodically running into this problem: I recall a
number of occasions seeing links to that same docs section.
I guess for every one that we see there are several/many in the wild we
don't

The `first()` docs link to the ''Interaction with order_by()'' section
there, presumably to try to forestall this confusion, but I still get the
impression it's one of the more subtle points of the ORM behaviour, so if
we can provide a better error message to point folks in the right
direction that's likely a win. 🤔

I'll provisionally accept on that basis. (Mariusz can object on review if
he doesn't like it 😃)

Pablo would you like to work on a PR based on Simon's suggestion?

--
Ticket URL: <https://code.djangoproject.com/ticket/33772#comment:6>

Django

unread,
Jun 9, 2022, 12:00:31 PM6/9/22
to django-...@googlegroups.com
#33772: Queryset first() returns different result than queryset[0] when using
ExpressionWrapper
-------------------------------------+-------------------------------------
Reporter: Pablo Pissi | Owner: nobody
Type: | Status: new
Cleanup/optimization |
Component: Database layer | Version: 3.2
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
ExpressionWrapper,first,queryset |
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Pablo Pissi):

Yes, I could work on a PR this weekend. Maybe I can add a documentation
note to the **first** function too.

This change will require a deprecation period right?

--
Ticket URL: <https://code.djangoproject.com/ticket/33772#comment:7>

Django

unread,
Jun 10, 2022, 2:32:47 AM6/10/22
to django-...@googlegroups.com
#33772: Queryset first() returns different result than queryset[0] when using
ExpressionWrapper
-------------------------------------+-------------------------------------
Reporter: Pablo Pissi | Owner: nobody
Type: | Status: new
Cleanup/optimization |
Component: Database layer | Version: 3.2
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
ExpressionWrapper,first,queryset |
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Carlton Gibson):

> This change will require a deprecation period right?

I don't think so. We're (just) adding a warning for an already incorrect
usage.

--
Ticket URL: <https://code.djangoproject.com/ticket/33772#comment:8>

Django

unread,
Jun 10, 2022, 2:33:05 AM6/10/22
to django-...@googlegroups.com
#33772: Queryset first() returns different result than queryset[0] when using
ExpressionWrapper
-------------------------------------+-------------------------------------
Reporter: Pablo Pissi | Owner: Pablo
Type: | Pissi
Cleanup/optimization | Status: assigned

Component: Database layer | Version: 3.2
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
ExpressionWrapper,first,queryset |
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Carlton Gibson):

* owner: nobody => Pablo Pissi
* status: new => assigned


--
Ticket URL: <https://code.djangoproject.com/ticket/33772#comment:9>

Django

unread,
Jun 12, 2022, 9:20:08 PM6/12/22
to django-...@googlegroups.com
#33772: Queryset first() returns different result than queryset[0] when using
ExpressionWrapper
-------------------------------------+-------------------------------------
Reporter: Pablo Pissi | Owner: Pablo
Type: | Pissi
Cleanup/optimization | Status: assigned
Component: Database layer | Version: 3.2
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
ExpressionWrapper,first,queryset |
Has patch: 1 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Pablo Pissi):

* has_patch: 0 => 1


Comment:

Hi!

I've created this PR. I've also updated the last() function:
https://github.com/django/django/pull/15769

--
Ticket URL: <https://code.djangoproject.com/ticket/33772#comment:10>

Django

unread,
Jun 14, 2022, 4:42:34 AM6/14/22
to django-...@googlegroups.com
#33772: Queryset first() returns different result than queryset[0] when using
ExpressionWrapper
-------------------------------------+-------------------------------------
Reporter: Pablo Pissi | Owner: Pablo
Type: | Pissi
Cleanup/optimization | Status: assigned
Component: Database layer | Version: 3.2
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Ready for
ExpressionWrapper,first,queryset | checkin

Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Mariusz Felisiak):

* stage: Accepted => Ready for checkin


--
Ticket URL: <https://code.djangoproject.com/ticket/33772#comment:11>

Django

unread,
Jun 14, 2022, 7:01:42 AM6/14/22
to django-...@googlegroups.com
#33772: Queryset first() returns different result than queryset[0] when using
ExpressionWrapper
-------------------------------------+-------------------------------------
Reporter: Pablo Pissi | Owner: Pablo
Type: | Pissi
Cleanup/optimization | Status: closed

Component: Database layer | Version: 3.2
(models, ORM) |
Severity: Normal | Resolution: fixed

Keywords: | Triage Stage: Ready for
ExpressionWrapper,first,queryset | checkin
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Mariusz Felisiak <felisiak.mariusz@…>):

* status: assigned => closed
* resolution: => fixed


Comment:

In [changeset:"d287294885e116c49c87f0c663068511a8402c0f" d2872948]:
{{{
#!CommitTicketReference repository=""
revision="d287294885e116c49c87f0c663068511a8402c0f"
Fixed #33772 -- Added QuerySet.first()/last() error message on unordered
queryset with aggregation.
}}}

--
Ticket URL: <https://code.djangoproject.com/ticket/33772#comment:12>

Reply all
Reply to author
Forward
0 new messages