[Django] #24366: Improve migration dependency graph speed

29 views
Skip to first unread message

Django

unread,
Feb 19, 2015, 7:51:42 AM2/19/15
to django-...@googlegroups.com
#24366: Improve migration dependency graph speed
------------------------------------------------+------------------------
Reporter: MarkusH | Owner: nobody
Type: Cleanup/optimization | Status: new
Component: Migrations | Version: master
Severity: Normal | Keywords:
Triage Stage: Unreviewed | Has patch: 0
Needs documentation: 0 | Needs tests: 0
Patch needs improvement: 0 | Easy pickings: 0
UI/UX: 0 |
------------------------------------------------+------------------------
With a bout 750 migration files and a total of about 1250 dependency
definitions I end up with the following cProfile stats when calling
`executor.migrate_plan(targets)` in the migrate command:

{{{
ncalls tottime percall cumtime percall filename:lineno(function)
1 0.000 0.000 929.770 929.770 <string>:1(<module>)
38 0.001 0.000 229.709 6.045 __init__.py:41(__init__)
63854649 100.666 0.000 101.193 0.000 __init__.py:58(__setitem__)
16077 0.009 0.000 0.009 0.000 __init__.py:83(__iter__)
38 0.000 0.000 0.000 0.000
_collections_abc.py:437(keys)
38 0.000 0.000 0.000 0.000
_collections_abc.py:462(__init__)
16077 0.007 0.000 0.016 0.000
_collections_abc.py:481(__iter__)
38 103.342 2.720 229.708 6.045
_collections_abc.py:581(update)
76 0.000 0.000 0.000 0.000
_weakrefset.py:70(__contains__)
38 0.000 0.000 0.001 0.000
abc.py:178(__instancecheck__)
38 0.001 0.000 229.710 6.045
datastructures.py:13(__init__)
63854687 25.172 0.000 25.172 0.000
datastructures.py:14(<genexpr>)
38 0.000 0.000 0.001 0.000
datastructures.py:28(__iter__)
1 0.051 0.051 929.770 929.770
executor.py:23(migration_plan)
12 0.000 0.000 0.000 0.000 executor.py:49(<genexpr>)
38 0.350 0.009 0.504 0.013
graph.py:109(ensure_not_cyclic)
38 251.357 6.615 929.097 24.450 graph.py:129(dfs)
38 0.622 0.016 929.719 24.466 graph.py:58(forwards_plan)
63909407 102.207 0.000 141.411 0.000 graph.py:67(<lambda>)
1 0.000 0.000 929.770 929.770 {built-in method exec}
38 0.000 0.000 0.000 0.000 {built-in method hasattr}
38 0.000 0.000 0.001 0.000 {built-in method isinstance}
38 0.000 0.000 0.000 0.000 {built-in method iter}
114 0.000 0.000 0.000 0.000 {built-in method len}
16077 0.527 0.000 0.527 0.000 {built-in method proxy}
63854661 259.600 0.000 259.600 0.000 {built-in method sorted}
408 0.000 0.000 0.000 0.000 {method 'add' of 'set'
objects}
38 0.000 0.000 0.000 0.000 {method 'append' of
'collections.deque' objects}
26476 0.006 0.000 0.006 0.000 {method 'append' of 'list'
objects}
63854611 10.834 0.000 10.834 0.000 {method 'appendleft' of
'collections.deque' objects}
1 0.000 0.000 0.000 0.000 {method 'disable' of
'_lsprof.Profiler' objects}
63854611 26.316 0.000 26.316 0.000 {method 'extendleft' of
'collections.deque' objects}
63909419 39.204 0.000 39.204 0.000 {method 'get' of 'dict'
objects}
38 0.000 0.000 0.000 0.000 {method 'items' of 'dict'
objects}
38 0.000 0.000 0.000 0.000 {method 'keys' of 'dict'
objects}
28690 0.010 0.000 0.010 0.000 {method 'pop' of 'list'
objects}
2622 0.001 0.000 0.001 0.000 {method 'pop' of 'set'
objects}
63854611 9.471 0.000 9.471 0.000 {method 'popleft' of
'collections.deque' objects}
26068 0.014 0.000 0.014 0.000 {method 'remove' of 'set'
objects}
}}}

I think this applies to 1.7+, though I only tested it on the current
master

--
Ticket URL: <https://code.djangoproject.com/ticket/24366>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

Django

unread,
Feb 19, 2015, 9:51:46 AM2/19/15
to django-...@googlegroups.com
#24366: Improve migration dependency graph speed
-------------------------------------+-------------------------------------
Reporter: MarkusH | Owner: nobody
Type: | Status: new
Cleanup/optimization |
Component: Migrations | Version: master
Severity: Normal | Resolution:

Keywords: | Triage Stage:
| Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by knbk):

* cc: knbk (added)


--
Ticket URL: <https://code.djangoproject.com/ticket/24366#comment:1>

Django

unread,
Feb 19, 2015, 8:35:30 PM2/19/15
to django-...@googlegroups.com
#24366: Improve migration dependency graph speed
--------------------------------------+------------------------------------

Reporter: MarkusH | Owner: nobody
Type: Cleanup/optimization | Status: new
Component: Migrations | Version: master
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted

Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
--------------------------------------+------------------------------------
Changes (by timgraham):

* stage: Unreviewed => Accepted


--
Ticket URL: <https://code.djangoproject.com/ticket/24366#comment:2>

Django

unread,
Feb 19, 2015, 9:35:45 PM2/19/15
to django-...@googlegroups.com
#24366: Improve migration dependency graph speed
--------------------------------------+------------------------------------
Reporter: MarkusH | Owner: knbk
Type: Cleanup/optimization | Status: assigned
Component: Migrations | Version: master

Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
--------------------------------------+------------------------------------
Changes (by knbk):

* owner: nobody => knbk
* status: new => assigned


Comment:

Preliminary results: from 25 minutes to 2 seconds with an implementation
based on a `Node` class and a recursive, cached `get_ancestors()` method.

The order isn't the exact same, but the following holds true:

{{{
for i, node in enumerate(ancestors):
assert all(ancestors.index(p) < i for p in node.parents)
}}}

--
Ticket URL: <https://code.djangoproject.com/ticket/24366#comment:3>

Django

unread,
Feb 20, 2015, 8:06:57 AM2/20/15
to django-...@googlegroups.com
#24366: Improve migration dependency graph speed
--------------------------------------+------------------------------------
Reporter: MarkusH | Owner: knbk
Type: Cleanup/optimization | Status: assigned
Component: Migrations | Version: master

Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
--------------------------------------+------------------------------------

Comment (by MarkusH):

That sounds crazy. Want to open a pull request?

--
Ticket URL: <https://code.djangoproject.com/ticket/24366#comment:4>

Django

unread,
Feb 20, 2015, 9:31:11 AM2/20/15
to django-...@googlegroups.com
#24366: Improve migration dependency graph speed
--------------------------------------+------------------------------------
Reporter: MarkusH | Owner: knbk
Type: Cleanup/optimization | Status: assigned
Component: Migrations | Version: master

Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
--------------------------------------+------------------------------------

Comment (by knbk):

Sure. There are still a few kinks to work out (e.g. the
`CircularDependencyError` message throws another error), but I believe the
generated graphs satisfies the requirements. There are still 5 test
failures.

The biggest issue here is that `migrations.test_graph.GraphTests.test_dfs`
fails because it reaches the maximum recursion depth. This is expected and
a result of switching (back) to a recursive design. I haven't looked into
memory usage yet, I'm pretty sure memory usage is increased significantly,
but I don't know how much.

PR: https://github.com/django/django/pull/4173

--
Ticket URL: <https://code.djangoproject.com/ticket/24366#comment:5>

Django

unread,
Feb 20, 2015, 10:07:28 AM2/20/15
to django-...@googlegroups.com
#24366: Improve migration dependency graph speed
--------------------------------------+------------------------------------
Reporter: MarkusH | Owner: knbk
Type: Cleanup/optimization | Status: assigned
Component: Migrations | Version: master

Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
--------------------------------------+------------------------------------

Comment (by knbk):

The caching does wonders when performing searches in super/subtrees of
previously performed searches, but the algorithm itself is a real
downgrade from the previous one.

I'm also looking into partially ordered sets to see if that is any faster,
I'll let you know if anything comes out of it.

--
Ticket URL: <https://code.djangoproject.com/ticket/24366#comment:6>

Django

unread,
Feb 21, 2015, 11:10:09 AM2/21/15
to django-...@googlegroups.com
#24366: Improve migration dependency graph speed
--------------------------------------+------------------------------------
Reporter: MarkusH | Owner: knbk
Type: Cleanup/optimization | Status: assigned
Component: Migrations | Version: master

Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
--------------------------------------+------------------------------------
Changes (by knbk):

* has_patch: 0 => 1


Comment:

My current results: 991 migrations, 2280 dependencies, and it takes about
15 seconds to generate a full migration plan with python 2.7, about 20
seconds with python 3.4. In both cases, about 75% of that time goes to the
initialization of the internal `OrderedDict` of the `OrderedSet` in
`Node.ancestors` and `Node.descendants`.

Rich ordering based on `node.key` resulted in the exact same ordering as
the previous algorithm.

--
Ticket URL: <https://code.djangoproject.com/ticket/24366#comment:7>

Django

unread,
Feb 21, 2015, 1:16:19 PM2/21/15
to django-...@googlegroups.com
#24366: Improve migration dependency graph speed
--------------------------------------+------------------------------------
Reporter: MarkusH | Owner: knbk
Type: Cleanup/optimization | Status: assigned
Component: Migrations | Version: master

Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
--------------------------------------+------------------------------------

Comment (by MarkusH):

Thanks for your effort. This looks really promising! I assume the test
case marked as expected failure is due to the step back to an recursive
approach? Do you think it would be possible to make your approach an
iterative solution as well? If we don't loose caching we could get rid of
a couple of more seconds due to expensive function calls.

--
Ticket URL: <https://code.djangoproject.com/ticket/24366#comment:8>

Django

unread,
Feb 21, 2015, 4:39:33 PM2/21/15
to django-...@googlegroups.com
#24366: Improve migration dependency graph speed
--------------------------------------+------------------------------------
Reporter: MarkusH | Owner: knbk
Type: Cleanup/optimization | Status: assigned
Component: Migrations | Version: master

Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 1

Easy pickings: 0 | UI/UX: 0
--------------------------------------+------------------------------------
Changes (by MarkusH):

* needs_better_patch: 0 => 1


--
Ticket URL: <https://code.djangoproject.com/ticket/24366#comment:9>

Django

unread,
Feb 22, 2015, 1:41:35 PM2/22/15
to django-...@googlegroups.com
#24366: Improve migration dependency graph speed
--------------------------------------+------------------------------------
Reporter: MarkusH | Owner: knbk
Type: Cleanup/optimization | Status: assigned
Component: Migrations | Version: master

Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 1 | Patch needs improvement: 1

Easy pickings: 0 | UI/UX: 0
--------------------------------------+------------------------------------
Changes (by MarkusH):

* needs_tests: 0 => 1


--
Ticket URL: <https://code.djangoproject.com/ticket/24366#comment:10>

Django

unread,
Feb 22, 2015, 5:41:47 PM2/22/15
to django-...@googlegroups.com
#24366: Improve migration dependency graph speed
-------------------------------------+-------------------------------------
Reporter: MarkusH | Owner: knbk
Type: | Status: assigned
Cleanup/optimization |
Component: Migrations | Version: master
Severity: Normal | Resolution:
Keywords: | Triage Stage: Ready for
| checkin
Has patch: 1 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by MarkusH):

* needs_better_patch: 1 => 0
* needs_tests: 1 => 0
* stage: Accepted => Ready for checkin


--
Ticket URL: <https://code.djangoproject.com/ticket/24366#comment:11>

Django

unread,
Feb 23, 2015, 6:54:56 AM2/23/15
to django-...@googlegroups.com
#24366: Improve migration dependency graph speed
-------------------------------------+-------------------------------------
Reporter: MarkusH | Owner: knbk
Type: | Status: closed
Cleanup/optimization |
Component: Migrations | Version: master
Severity: Normal | Resolution: fixed

Keywords: | Triage Stage: Ready for
| checkin
Has patch: 1 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Markus Holtermann <info@…>):

* status: assigned => closed
* resolution: => fixed


Comment:

In [changeset:"78d43a5e1064b63db1c486516c4263ef1c4c975c"]:
{{{
#!CommitTicketReference repository=""
revision="78d43a5e1064b63db1c486516c4263ef1c4c975c"
Fixed #24366 -- Optimized traversal of large migration dependency graphs.

Switched from an adjancency list and uncached, iterative depth-first
search to a Node-based design with direct parent/child links and a
cached, recursive depth-first search. With this change, calculating
a migration plan for a large graph takes several seconds instead of
several hours.

Marked test `migrations.test_graph.GraphTests.test_dfs` as an expected
failure due to reaching the maximum recursion depth.
}}}

--
Ticket URL: <https://code.djangoproject.com/ticket/24366#comment:12>

Django

unread,
Feb 23, 2015, 7:04:27 AM2/23/15
to django-...@googlegroups.com
#24366: Improve migration dependency graph speed
-------------------------------------+-------------------------------------
Reporter: MarkusH | Owner: knbk
Type: | Status: closed
Cleanup/optimization |
Component: Migrations | Version: master

Severity: Normal | Resolution: fixed
Keywords: | Triage Stage: Ready for
| checkin
Has patch: 1 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Markus Holtermann <info@…>):

In [changeset:"980dfca7174a3e00a214ad554bb9199529139796"]:
{{{
#!CommitTicketReference repository=""
revision="980dfca7174a3e00a214ad554bb9199529139796"
[1.8.x] Fixed #24366 -- Optimized traversal of large migration dependency
graphs.

Switched from an adjancency list and uncached, iterative depth-first
search to a Node-based design with direct parent/child links and a
cached, recursive depth-first search. With this change, calculating
a migration plan for a large graph takes several seconds instead of
several hours.

Marked test `migrations.test_graph.GraphTests.test_dfs` as an expected
failure due to reaching the maximum recursion depth.

Backport of 78d43a5e1064b63db1c486516c4263ef1c4c975c from master
}}}

--
Ticket URL: <https://code.djangoproject.com/ticket/24366#comment:13>

Django

unread,
Mar 29, 2015, 10:13:45 AM3/29/15
to django-...@googlegroups.com
#24366: Improve migration dependency graph speed
-------------------------------------+-------------------------------------
Reporter: MarkusH | Owner: knbk
Type: | Status: closed
Cleanup/optimization |
Component: Migrations | Version: master

Severity: Normal | Resolution: fixed
Keywords: | Triage Stage: Ready for
| checkin
Has patch: 1 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Markus Holtermann <info@…>):

In [changeset:"75430be86f4c90b7fb8a370d2b080a8a7cc925a0" 75430be8]:
{{{
#!CommitTicketReference repository=""
revision="75430be86f4c90b7fb8a370d2b080a8a7cc925a0"
Refs #24366 -- Fixed recursion depth error in migration graph

Made MigrationGraph forwards_plan() and backwards_plan() fall back to an
iterative approach in case the recursive approach exceeds the recursion
depth limit.
}}}

--
Ticket URL: <https://code.djangoproject.com/ticket/24366#comment:15>

Django

unread,
Mar 29, 2015, 10:13:45 AM3/29/15
to django-...@googlegroups.com
#24366: Improve migration dependency graph speed
-------------------------------------+-------------------------------------
Reporter: MarkusH | Owner: knbk
Type: | Status: closed
Cleanup/optimization |
Component: Migrations | Version: master

Severity: Normal | Resolution: fixed
Keywords: | Triage Stage: Ready for
| checkin
Has patch: 1 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Markus Holtermann <info@…>):

In [changeset:"bc83add04c06e601d09a60df5492ff794baa2cbf" bc83add0]:
{{{
#!CommitTicketReference repository=""
revision="bc83add04c06e601d09a60df5492ff794baa2cbf"
Refs #24366 -- Renamed arguments in MigrationGraph, renamed tests
}}}

--
Ticket URL: <https://code.djangoproject.com/ticket/24366#comment:14>

Django

unread,
Mar 21, 2018, 11:38:10 AM3/21/18
to django-...@googlegroups.com
#24366: Improve migration dependency graph speed
-------------------------------------+-------------------------------------
Reporter: Markus Holtermann | Owner: Marten
Type: | Kenbeek
Cleanup/optimization | Status: closed
Component: Migrations | Version: master

Severity: Normal | Resolution: fixed
Keywords: | Triage Stage: Ready for
| checkin
Has patch: 1 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Krzysztof Gogolewski):

I'd like to partially reverse this in #29243.

--
Ticket URL: <https://code.djangoproject.com/ticket/24366#comment:16>

Reply all
Reply to author
Forward
0 new messages