To make the code a bit more readable the clone() the preparation for next
op should be splitted away to pre_next_op() method. This also allows
creating inplace querysets that work precisely like full-clone querysets
(inplace queryset doesn't do cloning at all). This again allows speeding
up certain internal operations (like model.save() and prefetch_related()),
and also allows users to optimise heavy-to-create querysets by avoiding
clone() overhead.
Patch available from
https://github.com/akaariai/django/tree/splitted_clone. The inplace
queryset used for Model.save_base() speeds up model.save() by about 20%
for a simple m = Model.objects.get(); m.save() benchmark. There are other
places where similar optimisations could be used. Prefetch speedup is
tracked in ticket #20577.
--
Ticket URL: <https://code.djangoproject.com/ticket/20880>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.
* needs_docs: => 0
* needs_better_patch: => 0
* needs_tests: => 0
* stage: Unreviewed => Accepted
Old description:
> QuerySet and Query clone() methods are doing more than just cloning, they
> are also doing set up for next operation, changing the class of the clone
> if requested, and even throwing away some attribute values.
>
> To make the code a bit more readable the clone() the preparation for next
> op should be splitted away to pre_next_op() method. This also allows
> creating inplace querysets that work precisely like full-clone querysets
> (inplace queryset doesn't do cloning at all). This again allows speeding
> up certain internal operations (like model.save() and
> prefetch_related()), and also allows users to optimise heavy-to-create
> querysets by avoiding clone() overhead.
>
> Patch available from
> https://github.com/akaariai/django/tree/splitted_clone. The inplace
> queryset used for Model.save_base() speeds up model.save() by about 20%
> for a simple m = Model.objects.get(); m.save() benchmark. There are other
> places where similar optimisations could be used. Prefetch speedup is
> tracked in ticket #20577.
New description:
QuerySet and Query clone() methods are doing more than just cloning, they
are also doing set up for next operation, changing the class of the clone
if requested, and even throwing away some attribute values.
To make the code a bit more readable the clone() method should be splitted
to clone() part and pre_next_op() part. This also allows creating inplace
querysets that work precisely like full-clone querysets. A queryset is
called inplace if it doesn't do cloning at all between operations. This
again allows speeding up certain internal operations like model.save() and
prefetch_related(), and also allows users to optimise heavy-to-create
querysets by avoiding clone() overhead.
Patch available from
https://github.com/akaariai/django/tree/splitted_clone. The inplace
queryset used for Model.save_base() speeds up model.save() by about 20%
for a simple m = Model.objects.get(); m.save() benchmark. There are other
places where similar optimisations could be used. Prefetch speedup is
tracked in ticket #20577.
--
--
Ticket URL: <https://code.djangoproject.com/ticket/20880#comment:1>
Comment (by akaariai):
I updated (force-pushed) to
https://github.com/akaariai/django/tree/splitted_clone. The patch is
pretty good (cleanup needed), and shows promising results for model saving
and some other queryset operations:
{{{
Running 'model_save_existing' benchmark ...
Min: 0.014001 -> 0.010278: 1.3622x faster
Avg: 0.014304 -> 0.010394: 1.3763x faster
Significant (t=37.226713)
Running 'model_save_new' benchmark ...
Min: 0.013848 -> 0.010196: 1.3582x faster
Avg: 0.014079 -> 0.010408: 1.3527x faster
Significant (t=20.503974)
Running 'query_get' benchmark ...
Min: 0.027801 -> 0.023326: 1.1918x faster
Avg: 0.028116 -> 0.023599: 1.1914x faster
Significant (t=31.307197)
}}}
For qs_filter_chaining (~5 .filter() calls) using inplace results in
speedup of:
{{{
Running 'qs_filter_chaining_inplace' benchmark ...
Min: 0.000892 -> 0.000553: 1.6132x faster
Avg: 0.000937 -> 0.000571: 1.6424x faster
Significant (t=16.654697)
}}}
This is by no means complex query, yet it shows promising results.
So, it is worth considering if .inplace() should be public API. If so,
then it would make sense to guard against using stale versions of the
queryset. This is doable by cloning the outer QuerySet. Skipping cloning
of the internal Query, should give most of the speedup.
Another consideration is that if the ._clone() should be renamed to
._copy() or something. The patched version ._clone() does subtly different
thing than the old version, so this might cause some grey hairs for those
who use that internal API.
--
Ticket URL: <https://code.djangoproject.com/ticket/20880#comment:2>
* owner: nobody => Tim Graham
* status: new => assigned
Comment:
I've been rebasing Anssi's patch and will post a PR when it's polished. If
I understand correctly, Josh [https://github.com/django/django/pull/7505
proposed a similar idea] to `inplace()`.
--
Ticket URL: <https://code.djangoproject.com/ticket/20880#comment:3>
* has_patch: 0 => 1
Comment:
Here's a [https://github.com/django/django/pull/8716 PR] without the
"inplace" stuff. I think we can leave that to a separate ticket/PR.
--
Ticket URL: <https://code.djangoproject.com/ticket/20880#comment:4>
Comment (by Tim Graham):
#28455 is the follow up ticket for adding "inplace" querysets.
--
Ticket URL: <https://code.djangoproject.com/ticket/20880#comment:5>
Comment (by Tim Graham <timograham@…>):
In [changeset:"66933a6619be386248ea9329c81b257d5e2e5990" 66933a66]:
{{{
#!CommitTicketReference repository=""
revision="66933a6619be386248ea9329c81b257d5e2e5990"
Refs #20880 -- Removed non-cloning logic from QuerySet._clone().
}}}
--
Ticket URL: <https://code.djangoproject.com/ticket/20880#comment:6>
Comment (by Tim Graham <timograham@…>):
In [changeset:"6155bc4a51d44afa096c4c00766cbfb9ba9d660c" 6155bc4a]:
{{{
#!CommitTicketReference repository=""
revision="6155bc4a51d44afa096c4c00766cbfb9ba9d660c"
Refs #20880 -- Removed non-cloning logic from Query.clone().
}}}
--
Ticket URL: <https://code.djangoproject.com/ticket/20880#comment:7>
* status: assigned => closed
* resolution: => fixed
--
Ticket URL: <https://code.djangoproject.com/ticket/20880#comment:8>