I have three models:
{{{
class Transaction(models.Model):
amount = models.DecimalField(max_digits=10, decimal_places=2)
description = models.CharField(max_length=255)
created_by = models.ForeignKey(get_user_model(),
on_delete=models.CASCADE, related_name="created_transactions")
created_at = models.DateTimeField(auto_now_add=True)
class TransactionUser(models.Model):
user = models.ForeignKey(get_user_model(), on_delete=models.CASCADE,
related_name="transactions")
transaction = models.ForeignKey("bills.Transaction",
on_delete=models.CASCADE, related_name="users")
amount = models.DecimalField(max_digits=10, decimal_places=2)
transaction_type = models.ForeignKey('bills.TransactionType',
on_delete=models.PROTECT)
class TransactionType(models.Model):
name = models.CharField(max_length=50)
code = models.CharField(max_length=3)
multiplier = models.DecimalField(max_digits=5, decimal_places=2)
}}}
I create the following instances:
{{{
transaction = Transaction(amount=100)
positive = TransactionType(multiplier=1)
negative = TransactionType(multiplier=-1)
TransactionUser(transaction=transaction, user=user1, amount=50,
transaction_type=negative)
TransactionUser(transaction=transaction, user=user2, amount=50,
transaction_type=negative)
TransactionUser(transaction=transaction, user=user1, amount=10,
transaction_type=positive)
}}}
I want to obtain the balance of each transaction by doing the following
query:
{{{
TransactionUser.objects \
.filter(user_id=user_id) \
.annotate(
real_amount=ExpressionWrapper(F("amount")*F("transaction_type__multiplier"),
output_field=DecimalField()),
).values("transaction") \
.annotate(
balance=Sum("real_amount")
)
}}}
If I log the queryset returned, the value logged is:
{{{
<QuerySet [{'transaction': 1, 'balance': Decimal('-40.0000')}]>
}}}
But if I log
{{{
queryset.first()["balance"]
>>> -50
}}}
And if I log
{{{
queryset[0]["balance"]
>>> -40
}}}
It's also reproducible in 4.0
--
Ticket URL: <https://code.djangoproject.com/ticket/33772>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.
* status: new => closed
* resolution: => invalid
Comment:
This is happening because you don't explicitly define an ordering for your
queryset so [https://www.percona.com/blog/2017/04/07/non-deterministic-
order-select-limit/ the database can return whatever rows it wants] in the
case of `[0]` while `.first()` explicitly order by `pk`
[https://github.com/django/django/blob/49b470b9187b6be60e573fed08c8f4a87f133750/django/db/models/query.py#L1035
if no ordering is specified].
Please TicketClosingReasons/UseSupportChannels before filing issues in
this ticket tracker.
--
Ticket URL: <https://code.djangoproject.com/ticket/33772#comment:1>
* status: closed => new
* resolution: invalid =>
Comment:
Hi,
I've read the docs about ordering, but in this case the queryset has only
one value and the value returned by .first() is not a value that exists on
the queryset, is that correct?
It's returning values from the non-aggregated queryset
--
Ticket URL: <https://code.djangoproject.com/ticket/33772#comment:2>
* status: new => closed
* resolution: => needsinfo
Comment:
> It's also reproducible …
Hi Pablo — it's looks like a usage problem, so the support channels are
the place to go, but if you can create a sample project that demonstrates
the issue, perhaps in a failing test case, I'm happy to look further.
--
Ticket URL: <https://code.djangoproject.com/ticket/33772#comment:3>
Comment (by Pablo Pissi):
Hi again!
Here is a repo with a test that shows the issue:
https://github.com/pablop94/django-poc
Let me know if you need more info.
--
Ticket URL: <https://code.djangoproject.com/ticket/33772#comment:4>
Comment (by Simon Charette):
The addition of `order_by` columns to the `GROUP BY` clause specified by
`values().annotate()`
[https://docs.djangoproject.com/en/4.0/topics/db/aggregation/#interaction-
with-order-by is documented].
The crux of the issue remain; you are asking for an element at a
particular index of a collection with an undefined ordering. Adding
`order_by("example_model")` to your test case addresses the irregularity
you are experiencing with `first()`.
`first()` could arguably be adjusted to raise an exception when attempted
to be used on an unordered collection to ''refuse the temptation to
guess'' but that would require a deprecation period. Another approach
could be to only raise an exception when `first()` is called on a
unordered queryset with an explicit `values()` that doesn't include `pk`
which is the case reported here.
Here's what the code could look like for a systematic error on attempts to
use `first()` on a undefined ordering queryset performing aggregation over
specific columns
{{{#!diff
diff --git a/django/db/models/query.py b/django/db/models/query.py
index 67fcb411eb..da3d1ff3a7 100644
--- a/django/db/models/query.py
+++ b/django/db/models/query.py
@@ -1032,7 +1032,14 @@ async def alatest(self, *fields):
def first(self):
"""Return the first object of a query or None if no match is
found."""
- for obj in (self if self.ordered else self.order_by("pk"))[:1]:
+ queryset = self
+ if not self.ordered:
+ if isinstance(self.query.group_by, tuple):
+ raise TypeError(
+ "Cannot use first() on an unordered queryset
performing aggregation"
+ )
+ queryset = self.order_by("pk")
+ for obj in queryset[:1]:
return obj
async def afirst(self):
}}}
And a somewhat more complex implementation that still allow the `pk`
{{{#!diff
diff --git a/django/db/models/query.py b/django/db/models/query.py
index 67fcb411eb..0534a2cc16 100644
--- a/django/db/models/query.py
+++ b/django/db/models/query.py
@@ -1032,7 +1032,17 @@ async def alatest(self, *fields):
def first(self):
"""Return the first object of a query or None if no match is
found."""
- for obj in (self if self.ordered else self.order_by("pk"))[:1]:
+ queryset = self
+ if not self.ordered:
+ if isinstance(self.query.group_by, tuple) and not any(
+ col.output_field is self.model._meta.pk for col in
self.query.group_by
+ ):
+ raise TypeError(
+ "Cannot use first() on an unordered queryset
performing aggregation "
+ "over a set of expressions that doesn't include the
base model's primary key"
+ )
+ queryset = self.order_by("pk")
+ for obj in queryset[:1]:
return obj
async def afirst(self):
}}}
The latter patch seems to be passing the test suite.
Thoughts on the value of such adjustment Carlton?
--
Ticket URL: <https://code.djangoproject.com/ticket/33772#comment:5>
* status: closed => new
* type: Uncategorized => Cleanup/optimization
* resolution: needsinfo =>
* stage: Unreviewed => Accepted
Comment:
Thanks both. Good.
> Thoughts on the value of such adjustment Carlton?
We do see tickets periodically running into this problem: I recall a
number of occasions seeing links to that same docs section.
I guess for every one that we see there are several/many in the wild we
don't
The `first()` docs link to the ''Interaction with order_by()'' section
there, presumably to try to forestall this confusion, but I still get the
impression it's one of the more subtle points of the ORM behaviour, so if
we can provide a better error message to point folks in the right
direction that's likely a win. 🤔
I'll provisionally accept on that basis. (Mariusz can object on review if
he doesn't like it 😃)
Pablo would you like to work on a PR based on Simon's suggestion?
--
Ticket URL: <https://code.djangoproject.com/ticket/33772#comment:6>
Comment (by Pablo Pissi):
Yes, I could work on a PR this weekend. Maybe I can add a documentation
note to the **first** function too.
This change will require a deprecation period right?
--
Ticket URL: <https://code.djangoproject.com/ticket/33772#comment:7>
Comment (by Carlton Gibson):
> This change will require a deprecation period right?
I don't think so. We're (just) adding a warning for an already incorrect
usage.
--
Ticket URL: <https://code.djangoproject.com/ticket/33772#comment:8>
* owner: nobody => Pablo Pissi
* status: new => assigned
--
Ticket URL: <https://code.djangoproject.com/ticket/33772#comment:9>
* has_patch: 0 => 1
Comment:
Hi!
I've created this PR. I've also updated the last() function:
https://github.com/django/django/pull/15769
--
Ticket URL: <https://code.djangoproject.com/ticket/33772#comment:10>
* stage: Accepted => Ready for checkin
--
Ticket URL: <https://code.djangoproject.com/ticket/33772#comment:11>
* status: assigned => closed
* resolution: => fixed
Comment:
In [changeset:"d287294885e116c49c87f0c663068511a8402c0f" d2872948]:
{{{
#!CommitTicketReference repository=""
revision="d287294885e116c49c87f0c663068511a8402c0f"
Fixed #33772 -- Added QuerySet.first()/last() error message on unordered
queryset with aggregation.
}}}
--
Ticket URL: <https://code.djangoproject.com/ticket/33772#comment:12>