{{{
Category.objects.prefetch_related(Prefetch(
'post_set',
queryset=Post.objects.all()[:3],
to_attr='example_posts',
))
}}}
This behavior is also mentioned in
[http://stackoverflow.com/a/37086807/262618 this StackOverflow answer]. On
the other hand it does not seem to be documented in Django Docs.
'''Why is it needed?'''
My use case seems to be a common one: I want to display a list of
categories while displaying couple of example objects from each category
next to it. If I'm not mistaken there isn't currently an efficient way of
doing this. Prefetching without slicing would prefetch all objects (and
there may be thousands of them) instead of the three examples that are
needed.
--
Ticket URL: <https://code.djangoproject.com/ticket/26780>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.
* needs_better_patch: => 0
* needs_docs: => 0
* needs_tests: => 0
--
Ticket URL: <https://code.djangoproject.com/ticket/26780#comment:1>
Comment (by ludwik):
It seems to me that #26565 ("Allow Prefetch query to use .values()") is
related.
--
Ticket URL: <https://code.djangoproject.com/ticket/26780#comment:2>
Comment (by charettes):
Hi ludwik,
#26565 is about a different issue.
The only way we could support slices would be to extract it from the
provided it queryset to make it filterable again and apply it when in-
memory joining. The prefect queryset would still have to fetch all results
so I'm not sure it's worth the trouble as accessing
`category.example_posts[:3]` would have the same effect.
--
Ticket URL: <https://code.djangoproject.com/ticket/26780#comment:3>
Comment (by ludwik):
Charettes,
Thank you for the reply.
> accessing category.example_posts[:3] would have the same effect
I'm pretty sure that is not correct. The end effect would obviously be the
same, but the way it is achieved behind the scenes and the performance
characteristics that follows would be widely different. What you propose
(prefetching without limit, adding slicing in a loop after the fact) would
make Django to perform a database query selecting '''all objects from the
database table''' and loading them into memory. This would happen when the
main queryset is evaluated (that's what prefetching is all about). Then
the slicing would be perform by Python in memory, on a queryset that was
already evaluated.
That's what I understood from
[https://docs.djangoproject.com/en/1.9/ref/models/querysets/#prefetch-
objects the documentation] and also how Django actually behaved in an
experiment I performed couple of minutes ago. What I want to avoid is
exactly this behavior - loading thousands of objects from the database to
display first three of them.
I would be happy with a sane workaround, but to my knowledge there isn't
any.
--
Ticket URL: <https://code.djangoproject.com/ticket/26780#comment:4>
* type: Bug => New feature
* version: 1.9 => master
* component: Uncategorized => Database layer (models, ORM)
* stage: Unreviewed => Accepted
Comment:
Tentatively accepting but as you've noticed it would require a large
refactor of the way prefetching is actually done.
I could see `Category.objects.prefetch_related(Prefetch('articles',
Article.object.order_by('-published_data')[0:3],
to_attr='latest_articles'))` being useful but the underlying prefetch
query would need to rely on subqueries, a feature the ORM is not good at
for now.
Maybe the addition of the [https://github.com/django/django/pull/6478
Subquery] expression could help here.
--
Ticket URL: <https://code.djangoproject.com/ticket/26780#comment:5>
Comment (by asdator):
The only way to limit this Query (and to not load all the records into
memory) would be to is to do a `top-n-per-group` query which I don't think
the `ORM` is capable of.
What I can suggest is: Instead of limiting the `QuerySet`, find a way to
filter it.
For example, get only the `Posts` in the last day, week or a month (based
on the post-frequency).
{{{
Category.objects.prefetch_related(Prefetch(
'post_set',
queryset=Post.objects.filter(
date_published__gte=datetime.date.today()-timedelta(days=7)),
to_attr='example_posts',
))
}}}
This way you won't load all the Posts into Memory.
--
Ticket URL: <https://code.djangoproject.com/ticket/26780#comment:6>
Comment (by skyjur):
If window queries are implemented
(https://code.djangoproject.com/ticket/26608) then this use case could be
possible in something similar to:
{{{
Prefetch(
'post_set',
queryset=
Post.objects
.annotate(_rank=Window(Rank(), partition_by='cateogry')
.filter(_rank__lte=3)
)
}}}
--
Ticket URL: <https://code.djangoproject.com/ticket/26780#comment:7>
Comment (by Jeff Johnson):
I tried the filter by rank suggestion using Django 2.0 but I get this
error:
{{{
django.db.utils.NotSupportedError: Window is disallowed in the filter
clause.
}}}
--
Ticket URL: <https://code.djangoproject.com/ticket/26780#comment:8>
Comment (by Haseeb Ahmad):
I think there is a workaround now to in django new version as we have
OuterRef and Subquery.
from django.db.models import OuterRef, Subquery
User.objects.all().prefetch_related('comments',
queryset=Comment.objects.filter(id__in=Subquery(
Comment.objects.filter(user_id=OuterRef('user_id')).values_list('id',
flat=True)[:5])))
--
Ticket URL: <https://code.djangoproject.com/ticket/26780#comment:9>
* cc: TZanke (added)
--
Ticket URL: <https://code.djangoproject.com/ticket/26780#comment:10>
* cc: Dan LaManna (added)
--
Ticket URL: <https://code.djangoproject.com/ticket/26780#comment:11>
* cc: chex (added)
--
Ticket URL: <https://code.djangoproject.com/ticket/26780#comment:12>
* cc: Pavel Lysak (added)
--
Ticket URL: <https://code.djangoproject.com/ticket/26780#comment:13>
* cc: Arthur (added)
--
Ticket URL: <https://code.djangoproject.com/ticket/26780#comment:14>
Comment (by Arthur):
Note that the solution above might not be portable to some MySql
databases, with error
`1235, "This version of MySQL doesn't yet support 'LIMIT & IN/ALL/ANY/SOME
subquery'"`
--
Ticket URL: <https://code.djangoproject.com/ticket/26780#comment:15>
* cc: şuayip üzülmez (added)
--
Ticket URL: <https://code.djangoproject.com/ticket/26780#comment:16>
Comment (by Simon Charette):
As pointed out by others support for filter against window functions would
allow `prefetch_related_objects` to use `Rank(partition_by)` to support
this feature.
If someone wants to give a shot at solving this particular issue before
#28333 is resolved it should be doable by using a combination of
`Queryset.raw` and `Query.compile` combinations.
--
Ticket URL: <https://code.djangoproject.com/ticket/26780#comment:17>
* cc: Simon Charette (added)
--
Ticket URL: <https://code.djangoproject.com/ticket/26780#comment:18>
* owner: nobody => Simon Charette
* status: new => assigned
* has_patch: 0 => 1
Comment:
Submitted [https://github.com/django/django/pull/15922 a patch] that works
on top of the currently under-review patch for #28333 and it was
surprisingly very straightforward.
--
Ticket URL: <https://code.djangoproject.com/ticket/26780#comment:19>
* stage: Accepted => Ready for checkin
--
Ticket URL: <https://code.djangoproject.com/ticket/26780#comment:20>
* status: assigned => closed
* resolution: => fixed
Comment:
In [changeset:"242499f2dc2bf24a9a5c855690a2e13d3303581a" 242499f2]:
{{{
#!CommitTicketReference repository=""
revision="242499f2dc2bf24a9a5c855690a2e13d3303581a"
Fixed #26780 -- Added prefetch_related() support for sliced queries.
This was made possible by window function filtering support added in
f387d024fc75569d2a4a338bfda76cc2f328f627.
}}}
--
Ticket URL: <https://code.djangoproject.com/ticket/26780#comment:21>
Comment (by GitHub <noreply@…>):
In [changeset:"69fa2e8eb25d9bdd395fa8857177d4637aaf4c55" 69fa2e8e]:
{{{
#!CommitTicketReference repository=""
revision="69fa2e8eb25d9bdd395fa8857177d4637aaf4c55"
Refs #26780 -- Made prefetch_related() don't use window expressions fo
sliced queries if not supported.
}}}
--
Ticket URL: <https://code.djangoproject.com/ticket/26780#comment:22>