[Django] #20880: Split clone() to clone() and pre_next_op()

10 views
Skip to first unread message

Django

unread,
Aug 8, 2013, 9:34:26 AM8/8/13
to django-...@googlegroups.com
#20880: Split clone() to clone() and pre_next_op()
----------------------------------------------+--------------------
Reporter: akaariai | Owner: nobody
Type: Cleanup/optimization | Status: new
Component: Database layer (models, ORM) | Version: master
Severity: Normal | Keywords:
Triage Stage: Unreviewed | Has patch: 0
Easy pickings: 0 | UI/UX: 0
----------------------------------------------+--------------------
QuerySet and Query clone() methods are doing more than just cloning, they
are also doing set up for next operation, changing the class of the clone
if requested, and even throwing away some attribute values.

To make the code a bit more readable the clone() the preparation for next
op should be splitted away to pre_next_op() method. This also allows
creating inplace querysets that work precisely like full-clone querysets
(inplace queryset doesn't do cloning at all). This again allows speeding
up certain internal operations (like model.save() and prefetch_related()),
and also allows users to optimise heavy-to-create querysets by avoiding
clone() overhead.

Patch available from
https://github.com/akaariai/django/tree/splitted_clone. The inplace
queryset used for Model.save_base() speeds up model.save() by about 20%
for a simple m = Model.objects.get(); m.save() benchmark. There are other
places where similar optimisations could be used. Prefetch speedup is
tracked in ticket #20577.

--
Ticket URL: <https://code.djangoproject.com/ticket/20880>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

Django

unread,
Aug 8, 2013, 9:37:27 AM8/8/13
to django-...@googlegroups.com
#20880: Split clone() to clone() and pre_next_op()
-------------------------------------+-------------------------------------
Reporter: akaariai | Owner: nobody
Type: | Status: new
Cleanup/optimization | Version: master
Component: Database layer | Resolution:
(models, ORM) | Triage Stage: Accepted
Severity: Normal | Needs documentation: 0
Keywords: | Patch needs improvement: 0
Has patch: 0 | UI/UX: 0
Needs tests: 0 |
Easy pickings: 0 |
-------------------------------------+-------------------------------------
Changes (by akaariai):

* needs_docs: => 0
* needs_better_patch: => 0
* needs_tests: => 0
* stage: Unreviewed => Accepted


Old description:

> QuerySet and Query clone() methods are doing more than just cloning, they
> are also doing set up for next operation, changing the class of the clone
> if requested, and even throwing away some attribute values.
>
> To make the code a bit more readable the clone() the preparation for next
> op should be splitted away to pre_next_op() method. This also allows
> creating inplace querysets that work precisely like full-clone querysets
> (inplace queryset doesn't do cloning at all). This again allows speeding
> up certain internal operations (like model.save() and
> prefetch_related()), and also allows users to optimise heavy-to-create
> querysets by avoiding clone() overhead.
>
> Patch available from
> https://github.com/akaariai/django/tree/splitted_clone. The inplace
> queryset used for Model.save_base() speeds up model.save() by about 20%
> for a simple m = Model.objects.get(); m.save() benchmark. There are other
> places where similar optimisations could be used. Prefetch speedup is
> tracked in ticket #20577.

New description:

QuerySet and Query clone() methods are doing more than just cloning, they
are also doing set up for next operation, changing the class of the clone
if requested, and even throwing away some attribute values.

To make the code a bit more readable the clone() method should be splitted
to clone() part and pre_next_op() part. This also allows creating inplace
querysets that work precisely like full-clone querysets. A queryset is
called inplace if it doesn't do cloning at all between operations. This
again allows speeding up certain internal operations like model.save() and
prefetch_related(), and also allows users to optimise heavy-to-create


querysets by avoiding clone() overhead.

Patch available from
https://github.com/akaariai/django/tree/splitted_clone. The inplace
queryset used for Model.save_base() speeds up model.save() by about 20%
for a simple m = Model.objects.get(); m.save() benchmark. There are other
places where similar optimisations could be used. Prefetch speedup is
tracked in ticket #20577.

--

--
Ticket URL: <https://code.djangoproject.com/ticket/20880#comment:1>

Django

unread,
Aug 21, 2013, 6:36:12 AM8/21/13
to django-...@googlegroups.com
#20880: Split clone() to clone() and pre_next_op()
-------------------------------------+-------------------------------------
Reporter: akaariai | Owner: nobody

Type: | Status: new
Cleanup/optimization | Version: master
Component: Database layer | Resolution:
(models, ORM) | Triage Stage: Accepted
Severity: Normal | Needs documentation: 0
Keywords: | Patch needs improvement: 0
Has patch: 0 | UI/UX: 0
Needs tests: 0 |
Easy pickings: 0 |
-------------------------------------+-------------------------------------

Comment (by akaariai):

I updated (force-pushed) to
https://github.com/akaariai/django/tree/splitted_clone. The patch is
pretty good (cleanup needed), and shows promising results for model saving
and some other queryset operations:
{{{
Running 'model_save_existing' benchmark ...
Min: 0.014001 -> 0.010278: 1.3622x faster
Avg: 0.014304 -> 0.010394: 1.3763x faster
Significant (t=37.226713)


Running 'model_save_new' benchmark ...
Min: 0.013848 -> 0.010196: 1.3582x faster
Avg: 0.014079 -> 0.010408: 1.3527x faster
Significant (t=20.503974)


Running 'query_get' benchmark ...
Min: 0.027801 -> 0.023326: 1.1918x faster
Avg: 0.028116 -> 0.023599: 1.1914x faster
Significant (t=31.307197)
}}}

For qs_filter_chaining (~5 .filter() calls) using inplace results in
speedup of:
{{{
Running 'qs_filter_chaining_inplace' benchmark ...
Min: 0.000892 -> 0.000553: 1.6132x faster
Avg: 0.000937 -> 0.000571: 1.6424x faster
Significant (t=16.654697)
}}}

This is by no means complex query, yet it shows promising results.

So, it is worth considering if .inplace() should be public API. If so,
then it would make sense to guard against using stale versions of the
queryset. This is doable by cloning the outer QuerySet. Skipping cloning
of the internal Query, should give most of the speedup.

Another consideration is that if the ._clone() should be renamed to
._copy() or something. The patched version ._clone() does subtly different
thing than the old version, so this might cause some grey hairs for those
who use that internal API.

--
Ticket URL: <https://code.djangoproject.com/ticket/20880#comment:2>

Django

unread,
Nov 8, 2016, 12:22:33 PM11/8/16
to django-...@googlegroups.com
#20880: Split clone() to clone() and pre_next_op()
-------------------------------------+-------------------------------------
Reporter: Anssi Kääriäinen | Owner: Tim
Type: | Graham
Cleanup/optimization | Status: assigned
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Tim Graham):

* owner: nobody => Tim Graham
* status: new => assigned


Comment:

I've been rebasing Anssi's patch and will post a PR when it's polished. If
I understand correctly, Josh [https://github.com/django/django/pull/7505
proposed a similar idea] to `inplace()`.

--
Ticket URL: <https://code.djangoproject.com/ticket/20880#comment:3>

Django

unread,
Jul 8, 2017, 5:02:48 PM7/8/17
to django-...@googlegroups.com
#20880: Split clone() to clone() and pre_next_op()
-------------------------------------+-------------------------------------
Reporter: Anssi Kääriäinen | Owner: Tim
Type: | Graham
Cleanup/optimization | Status: assigned
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Tim Graham):

* has_patch: 0 => 1


Comment:

Here's a [https://github.com/django/django/pull/8716 PR] without the
"inplace" stuff. I think we can leave that to a separate ticket/PR.

--
Ticket URL: <https://code.djangoproject.com/ticket/20880#comment:4>

Django

unread,
Jul 31, 2017, 1:57:19 PM7/31/17
to django-...@googlegroups.com
#20880: Remove non-cloning logic from Query.clone() and QuerySet._clone()

-------------------------------------+-------------------------------------
Reporter: Anssi Kääriäinen | Owner: Tim
Type: | Graham
Cleanup/optimization | Status: assigned
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Tim Graham):

#28455 is the follow up ticket for adding "inplace" querysets.

--
Ticket URL: <https://code.djangoproject.com/ticket/20880#comment:5>

Django

unread,
Jul 31, 2017, 3:53:03 PM7/31/17
to django-...@googlegroups.com
#20880: Remove non-cloning logic from Query.clone() and QuerySet._clone()
-------------------------------------+-------------------------------------
Reporter: Anssi Kääriäinen | Owner: Tim
Type: | Graham
Cleanup/optimization | Status: assigned
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Tim Graham <timograham@…>):

In [changeset:"66933a6619be386248ea9329c81b257d5e2e5990" 66933a66]:
{{{
#!CommitTicketReference repository=""
revision="66933a6619be386248ea9329c81b257d5e2e5990"
Refs #20880 -- Removed non-cloning logic from QuerySet._clone().
}}}

--
Ticket URL: <https://code.djangoproject.com/ticket/20880#comment:6>

Django

unread,
Jul 31, 2017, 3:53:04 PM7/31/17
to django-...@googlegroups.com
#20880: Remove non-cloning logic from Query.clone() and QuerySet._clone()
-------------------------------------+-------------------------------------
Reporter: Anssi Kääriäinen | Owner: Tim
Type: | Graham
Cleanup/optimization | Status: assigned
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Tim Graham <timograham@…>):

In [changeset:"6155bc4a51d44afa096c4c00766cbfb9ba9d660c" 6155bc4a]:
{{{
#!CommitTicketReference repository=""
revision="6155bc4a51d44afa096c4c00766cbfb9ba9d660c"
Refs #20880 -- Removed non-cloning logic from Query.clone().
}}}

--
Ticket URL: <https://code.djangoproject.com/ticket/20880#comment:7>

Django

unread,
Jul 31, 2017, 3:54:53 PM7/31/17
to django-...@googlegroups.com
#20880: Remove non-cloning logic from Query.clone() and QuerySet._clone()
-------------------------------------+-------------------------------------
Reporter: Anssi Kääriäinen | Owner: Tim
Type: | Graham
Cleanup/optimization | Status: closed

Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution: fixed

Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Tim Graham):

* status: assigned => closed
* resolution: => fixed


--
Ticket URL: <https://code.djangoproject.com/ticket/20880#comment:8>

Reply all
Reply to author
Forward
0 new messages