We should investigate migration loading performance for the following:
- Requiring the migration graph to always have the minimum number of
edges. This would mean enforcing transitive reduction whenever a new edge
is added. This would be a trade off between the additional computation to
ensure the graph is minimal versus the reduced computation in traversing
the graph for ancestors/descendants/etc.
- Performing a single transitive reduction step once the graph has been
built. Again, this would be to simplify any later traversal.
Closely related to this is the `showmigrations` command, with the `--plan`
flag at verbosity 2. This adds a lists of migration dependencies in
brackets next to every migration in the plan. Currently each list of
dependencies is constructed from the original `Migration` object's
`dependencies` list, with:
- appropriate replacements made for migrations that have been replaced by
squash migrations, and
- additional dependencies added based on any `run_before` declarations.
So what this shows is the list of dependencies, somewhat rawly, as set by
the user in their migration files. If the user declares redundant
dependencies, then these bleed into the migration graph, which propagates
to the list of dependencies printed by `showmigrations`. The question is
what do we want it to show? This raw list may help the user with
debugging, but it could get cluttered if there is an abundance of
superfluous dependencies. So what we could do is print dependencies that
come from the migration graph after it has been transitively reduced. This
would improve the brevity, though whether it would be helpful to the user
is up for debate. We could also try using both reduced and unreduced
migration graphs to emphasise superfluous dependencies to the user.
TLDR:
1) Investigate migration performance with transitive reduction:
- at every edge addition
- after graph is built
2) Decide what would be most useful to list for `showmigrations --plan
-v2`:
- [current] Raw dependencies (with changes to account for replacements and
`run_before`).
- Only critical dependencies (by using reduced graph).
- Raw dependencies with superfluous edges somehow emphasised.
To be clear, I don't see a problem with the current system, but thought
this might be worth investigating/discussing.
--
Ticket URL: <https://code.djangoproject.com/ticket/26407>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.
* needs_better_patch: => 0
* needs_docs: => 0
* needs_tests: => 0
* stage: Unreviewed => Accepted
--
Ticket URL: <https://code.djangoproject.com/ticket/26407#comment:1>
Comment (by MarkusH):
The following simple implementation does not scale for more than ~ 50 to
100 nodes:
{{{#!diff
diff --git a/django/db/migrations/graph.py b/django/db/migrations/graph.py
index 1a7a703..28ed208 100644
--- a/django/db/migrations/graph.py
+++ b/django/db/migrations/graph.py
@@ -260,6 +260,26 @@ class MigrationGraph(object):
"""
[n.raise_error() for n in self.node_map.values() if isinstance(n,
DummyNode)]
+ def reduce(self):
+ for xkey, xnode in six.iteritems(self.node_map):
+ # print('xkey: %s' % (xkey,))
+ for ykey, ynode in six.iteritems(self.node_map):
+ # print('ykey: %s' % (ykey,))
+ if xkey == ykey:
+ continue
+ for zkey, znode in six.iteritems(self.node_map):
+ # print('zkey: %s' % (zkey,))
+ if xkey == zkey or ykey == zkey:
+ continue
+ if znode in xnode.children and ynode in
xnode.children and znode in ynode.children:
+ print("rm %(xkey)s -> %(zkey)s : %(xkey)s ->
%(ykey)s -> %(zkey)s" % {
+ 'xkey': xkey,
+ 'ykey': ykey,
+ 'zkey': zkey,
+ })
+ xnode.children.discard(znode)
+ znode.parents.discard(xnode)
+
def clear_cache(self):
if self.cached:
for node in self.nodes:
@@ -278,6 +298,7 @@ class MigrationGraph(object):
raise NodeNotFoundError("Node %r not a valid node" % (target,
), target)
# Use parent.key instead of parent to speed up the frequent
hashing in ensure_not_cyclic
self.ensure_not_cyclic(target, lambda x: (parent.key for parent
in self.node_map[x].parents))
+ self.reduce()
self.cached = True
node = self.node_map[target]
try:
@@ -298,6 +319,7 @@ class MigrationGraph(object):
raise NodeNotFoundError("Node %r not a valid node" % (target,
), target)
# Use child.key instead of child to speed up the frequent hashing
in ensure_not_cyclic
self.ensure_not_cyclic(target, lambda x: (child.key for child in
self.node_map[x].children))
+ self.reduce()
self.cached = True
node = self.node_map[target]
try:
}}}
--
Ticket URL: <https://code.djangoproject.com/ticket/26407#comment:2>
* cc: emorley@… (added)
--
Ticket URL: <https://code.djangoproject.com/ticket/26407#comment:3>