{{{
class Parent(models.Model):
pass
class Child(models.Model):
saved_dt = models.DateTimeField(auto_now_add=True)
parent = models.ForeignKey(Parent)
}}}
I am executing this query (for each `Parent` I want to get the `child_set`
in sorted order):
{{{
[m.child_set.order_by('saved_dt') for m in
Parent.objects.prefetch_related(Prefetch('child_set',
Child.objects.order_by('saved_dt'))).all()]
}}}
I would expect that to be two SQL queries, but in reality I see `N+2`
where `N` is the number of `Child`s:
{{{
SELECT "prefetch_parent"."id" FROM "prefetch_parent"; args=()
SELECT "prefetch_child"."id", "prefetch_child"."saved_dt",
"prefetch_child"."parent_id" FROM "prefetch_child" WHERE
"prefetch_child"."parent_id" IN (1, 2) ORDER BY
"prefetch_child"."saved_dt" ASC; args=(1, 2)
SELECT "prefetch_child"."id", "prefetch_child"."saved_dt",
"prefetch_child"."parent_id" FROM "prefetch_child" WHERE
"prefetch_child"."parent_id" = 1 ORDER BY "prefetch_child"."saved_dt" ASC
LIMIT 21; args=(1,)
SELECT "prefetch_child"."id", "prefetch_child"."saved_dt",
"prefetch_child"."parent_id" FROM "prefetch_child" WHERE
"prefetch_child"."parent_id" = 2 ORDER BY "prefetch_child"."saved_dt" ASC
LIMIT 21; args=(2,)
}}}
--
Ticket URL: <https://code.djangoproject.com/ticket/26211>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.
* needs_better_patch: => 0
* needs_tests: => 0
* needs_docs: => 0
Comment:
If I remove the `order_by` on the `m.child_set`, then I get the expected
two queries:
{{{
[m.child_set.all() for m in
Parent.objects.prefetch_related(Prefetch('child_set',
Child.objects.order_by('saved_dt'))).all()]
SELECT "prefetch_parent"."id" FROM "prefetch_parent"; args=()
SELECT "prefetch_child"."id", "prefetch_child"."saved_dt",
"prefetch_child"."parent_id" FROM "prefetch_child" WHERE
"prefetch_child"."parent_id" IN (1, 2) ORDER BY
"prefetch_child"."saved_dt" ASC; args=(1, 2)
}}}
--
Ticket URL: <https://code.djangoproject.com/ticket/26211#comment:1>
Comment (by timgraham):
Good, that's what I was going suggest. I think we can close this as
expected behavior.
--
Ticket URL: <https://code.djangoproject.com/ticket/26211#comment:2>
Comment (by cancan101):
Woah! I was not suggesting closing this! I don't think behavior is as
expected. The ORM should realize the prefetch has already been done.
--
Ticket URL: <https://code.djangoproject.com/ticket/26211#comment:3>
* type: Uncategorized => New feature
* component: Uncategorized => Database layer (models, ORM)
Comment:
I disagree -- enabling `QuerySet.order_by()` to sometimes do operations in
memory could be quite confusing. I think this has probably been proposed
before in some form before, but can't immediately find a discussion. As
[https://docs.djangoproject.com/en/stable/ref/models/querysets/ the
documentation] notes, "Each `order_by()` call will clear any previous
ordering" (it returns a new `QuerySet`). Anyway, this isn't my area of
expertise so I'll leave it open for a second opinion.
--
Ticket URL: <https://code.djangoproject.com/ticket/26211#comment:4>
* status: new => closed
* resolution: => wontfix
Comment:
I'm also inclined to close this ticket as expected behavior.
The `prefetch_related` method stores the prefetched objects on the
instance it belongs to. As soon as you call a queryset method on the
related manager holding the cached instances (`child_set.order_by()`) a
new queryset instance is returned which doesn't hold any reference to the
previously cached results.
This is same as expecting a call to a queryset method with cached results
to perform its operation in memory instead of going back to the db.
e.g.
{{{#!python
children = Child.objects.all()
list(children) # Will issue a query and cache the results on the children
objects.
list(children) # Will reuse the cached results.
children.order_by('saved_dt') # Will perform a new database query even if
children has cached results.
}}}
If we performed the operation in memory this could lead to catastrophic
performance given a large enough result set. This would be backward
incompatible and kind of a footgun API.
I think the `Queryset` documentation makes it clear that calling a method
will create and return a new instance with no cached results.
--
Ticket URL: <https://code.djangoproject.com/ticket/26211#comment:5>
Old description:
New description:
Given these models:
{{{
class Parent(models.Model):
pass
I would hope that the second call to `children.order_by('saved_dt')` can
return self since the queryset is already sorted by the desired key.
--
Comment (by cancan101):
@charettes I actually think that the example you provided indicated what I
would consider a deficiency of the ORM: coalescing multiple calls to
`order_by` with the same argument (I have modified the original queryset):
{{{
>>> children = Child.objects.order_by('saved_dt').all()
>>> list(children)
[<Child: Child object>, <Child: Child object>]
(0.001) SELECT "prefetch_child"."id", "prefetch_child"."saved_dt",
"prefetch_child"."parent_id" FROM "prefetch_child" ORDER BY
"prefetch_child"."saved_dt" ASC; args=()
>>> list(children)
[<Child: Child object>, <Child: Child object>]
>>> children.order_by('saved_dt')
[<Child: Child object>, <Child: Child object>]
(0.000) SELECT "prefetch_child"."id", "prefetch_child"."saved_dt",
"prefetch_child"."parent_id" FROM "prefetch_child" ORDER BY
"prefetch_child"."saved_dt" ASC LIMIT 21; args=()
}}}
--
Ticket URL: <https://code.djangoproject.com/ticket/26211#comment:6>
Old description:
> I would hope that the second call to `children.order_by('saved_dt')` can
> return self since the queryset is already sorted by the desired key.
New description:
Given these models:
{{{
class Parent(models.Model):
pass
--
--
Ticket URL: <https://code.djangoproject.com/ticket/26211#comment:7>
Comment (by cancan101):
Here is one more example of what I would consider a bug and it does not
involve multiple calls to `order_by`:
{{{
>>> Parent.objects.prefetch_related(Prefetch('child_set',
Child.objects.order_by('saved_dt'))).first().child_set.all().ordered
False
}}}
--
Ticket URL: <https://code.djangoproject.com/ticket/26211#comment:8>