[Django] #20950: Use OrderedDicts in ORM only when needed


Django

Aug 21, 2013, 8:54:47 AM
to django-...@googlegroups.com
#20950: Use OrderedDicts in ORM only when needed
-------------------------------------+-------------------------------------
Reporter: akaariai                       | Owner: nobody
Type: Cleanup/optimization               | Status: new
Component: Database layer (models, ORM)  | Version: master
Severity: Normal                         | Keywords:
Triage Stage: Accepted                   | Has patch: 1
Needs documentation: 0                   | Needs tests: 0
Patch needs improvement: 0               | Easy pickings: 0
UI/UX: 0                                 |
-------------------------------------+-------------------------------------
Initializing OrderedDicts seems to be really slow, at least on Python 2.7.
By instantiating OrderedDicts in Query only when needed, one can save
considerable time. For example, the model_save_existing benchmark is 1.3x
faster and qs_filter_chaining is 1.35x faster. Nearly all of the query_
benchmarks show a speedup of at least 10%.

Patch at https://github.com/akaariai/django/tree/ordered_dict_on_need.
Together with the splitted_clone branch, this gives a speedup of over 1.5x
for model_save_existing.

There might be a cleaner way to implement the "instantiate only when needed"
behavior for Query._aggregates and Query._extra. Ideas welcome.
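
One way to picture the "instantiate only when needed" idea is the sketch below. It is not the code from the patch: LazyQuery is a hypothetical stand-in for sql.Query, and the property and clone() bodies are assumptions made for illustration. Only the attribute names _aggregates and _extra come from the ticket.

```python
from collections import OrderedDict


class LazyQuery(object):
    """Hypothetical stand-in for sql.Query; not Django's actual class."""

    def __init__(self):
        # None means "no OrderedDict has been needed so far".
        self._aggregates = None
        self._extra = None

    @property
    def aggregates(self):
        # Build the OrderedDict on first access only.
        if self._aggregates is None:
            self._aggregates = OrderedDict()
        return self._aggregates

    @property
    def extra(self):
        if self._extra is None:
            self._extra = OrderedDict()
        return self._extra

    def clone(self):
        obj = LazyQuery()
        # Copy a dict only if the original ever created one, so that
        # cloning an untouched query allocates no OrderedDict at all.
        if self._aggregates is not None:
            obj._aggregates = self._aggregates.copy()
        if self._extra is not None:
            obj._extra = self._extra.copy()
        return obj
```

The cost is an extra None check on each access and a private/public attribute pair per dictionary, which is the loss of code cleanliness the ticket mentions.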

I'll accept this directly, as it seems like a good idea. It trades
code cleanliness for performance, but in this particular case I think it is
worth it.

--
Ticket URL: <https://code.djangoproject.com/ticket/20950>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

Django

Aug 29, 2013, 4:28:00 AM
to django-...@googlegroups.com
#20950: Use OrderedDicts in ORM only when needed
-------------------------------------+-------------------------------------
Reporter: akaariai                       | Owner: nobody
Type: Cleanup/optimization               | Status: new
Component: Database layer (models, ORM)  | Version: master
Severity: Normal                         | Resolution:
Keywords:                                | Triage Stage: Accepted
Has patch: 1                             | Needs documentation: 0
Needs tests: 0                           | Patch needs improvement: 0
Easy pickings: 0                         | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by akaariai):

I benchmarked the change that introduced OrderedDict (that is, commit
07876cf02b6db453ca0397c29c225668872fa96d). The change seems to introduce a
slowdown of around 15% in the model_save_existing benchmark. Initializing an
empty OrderedDict is around 50% slower than initializing Django's SortedDict
was (different algorithms, different tradeoffs).

Using Python's own ordered dictionary is the correct thing to do, and I am
pretty sure Python's OrderedDict will get an optimised implementation some
day. Until that happens, it seems like a good idea to avoid initializing
empty OrderedDicts where possible.
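
The cost of creating an empty container can be checked locally with a micro-benchmark along the following lines. This is only a sketch using the standard timeit module, not the benchmark suite used for the numbers above; the helper name time_empty_init and the iteration count are assumptions. SortedDict is left out because it no longer ships with current Django releases.

```python
import timeit
from collections import OrderedDict


def time_empty_init(factory, number=100000):
    """Return the seconds spent calling factory() `number` times."""
    return timeit.timeit(factory, number=number)


# Compare a plain dict with an OrderedDict; absolute numbers depend on
# the interpreter version and the machine, so none are quoted here.
for factory in (dict, OrderedDict):
    elapsed = time_empty_init(factory)
    print("%-12s %.4f s" % (factory.__name__, elapsed))
```

On CPython 3.5 and later, OrderedDict is implemented in C, so the gap is likely to be much smaller than what was measured on Python 2.7.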

--
Ticket URL: <https://code.djangoproject.com/ticket/20950#comment:1>

Django

Sep 14, 2013, 1:53:29 PM
to django-...@googlegroups.com
#20950: Use OrderedDicts in ORM only when needed
-------------------------------------+-------------------------------------
Reporter: akaariai                       | Owner: nobody
Type: Cleanup/optimization               | Status: closed
Component: Database layer (models, ORM)  | Version: master
Severity: Normal                         | Resolution: fixed
Keywords:                                | Triage Stage: Accepted
Has patch: 1                             | Needs documentation: 0
Needs tests: 0                           | Patch needs improvement: 0
Easy pickings: 0                         | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Anssi Kääriäinen <anssi.kaariainen@…>):

* status: new => closed
* resolution: => fixed


Comment:

In [changeset:"ff723d894d9272ea721d1996432ffc806c2b8180"]:
{{{
#!CommitTicketReference repository=""
revision="ff723d894d9272ea721d1996432ffc806c2b8180"
Fixed #20950 -- Instantiate OrderedDict() only when needed

The use of OrderedDict (even an empty one) was surprisingly slow. By
initializing OrderedDict only when needed it is possible to save a
non-trivial amount of computing time (Model.save() is around 30% faster,
for example).

This commit targeted sql.Query only; there are likely other places
which could use similar optimizations.
}}}

--
Ticket URL: <https://code.djangoproject.com/ticket/20950#comment:2>

Django

Oct 1, 2013, 3:55:39 AM
to django-...@googlegroups.com
#20950: Use OrderedDicts in ORM only when needed
-------------------------------------+-------------------------------------
Reporter: akaariai                       | Owner: nobody
Type: Cleanup/optimization               | Status: closed
Component: Database layer (models, ORM)  | Version: master
Severity: Normal                         | Resolution: fixed
Keywords:                                | Triage Stage: Accepted
Has patch: 1                             | Needs documentation: 0
Needs tests: 0                           | Patch needs improvement: 0
Easy pickings: 0                         | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Anssi Kääriäinen <akaariai@…>):

In [changeset:"d64060a73650360dcabfdb4928a9e92d090925b1"]:
{{{
#!CommitTicketReference repository=""
revision="d64060a73650360dcabfdb4928a9e92d090925b1"
OrderedDict creation avoidance for .values() queries

Avoid accessing query.extra and query.aggregates directly for .values()
queries. Refs #20950.
}}}
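
The reason direct access matters is that, with a lazily created dictionary, merely reading the public attribute allocates it. The sketch below illustrates that pitfall and one way around it. It is not taken from the changeset: LazyQuery and the extra_names helper are hypothetical, and only the idea of not touching query.extra directly comes from the commit message.

```python
from collections import OrderedDict


class LazyQuery(object):
    """Hypothetical stand-in for sql.Query; not Django's actual class."""

    def __init__(self):
        self._extra = None

    @property
    def extra(self):
        # Reading this property creates the OrderedDict as a side effect.
        if self._extra is None:
            self._extra = OrderedDict()
        return self._extra

    def extra_names(self):
        # Hypothetical read-only helper: inspects the private attribute,
        # so an unused query never allocates an OrderedDict.
        if self._extra is None:
            return []
        return list(self._extra)


careless = LazyQuery()
if careless.extra:          # allocates an empty OrderedDict just to test it
    pass

careful = LazyQuery()
names = careful.extra_names()   # leaves careful._extra as None
```

Read-only callers, such as code that only needs to know which names are present, can use the helper, while code that adds entries keeps going through the property.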

--
Ticket URL: <https://code.djangoproject.com/ticket/20950#comment:3>
