The following migration is the result of squashing 3 smaller migrations
together:
1. the initial migration
2. some random RunPython migration
3. changing the foreign key's column migration
The second step is only necessary to prevent squashmigrations from being
smart which avoids the issue.
The squashed migration (both optimized and unoptimized) looks as
following:
{{{#!python
# -*- coding: utf-8 -*-
from __future__ import unicode_literals
from django.db import migrations, models
# bug.migrations.0002_run_python
def do_something(apps, schema_editor):
pass # Do nothing, it's not important anyway.
class Migration(migrations.Migration):
replaces = [('bug', '0001_initial'), ('bug', '0002_run_python'),
('bug', '0003_change_db_column')]
dependencies = [
]
operations = [
migrations.CreateModel(
name='Child',
fields=[
('id', models.AutoField(auto_created=True,
serialize=False, verbose_name='ID', primary_key=True)),
('data', models.IntegerField(default=0)),
],
),
migrations.CreateModel(
name='Parent',
fields=[
('id', models.AutoField(auto_created=True,
serialize=False, verbose_name='ID', primary_key=True)),
('data', models.IntegerField(default=0)),
],
),
migrations.AddField(
model_name='child',
name='parent',
field=models.ForeignKey(db_column='my_parent',
to='bug.Parent'),
),
migrations.RunPython(
code=do_something,
),
migrations.AlterField(
model_name='child',
name='parent',
field=models.ForeignKey(to='bug.Parent'),
),
]
}}}
The models are:
{{{#!python
class Parent(models.Model):
data = models.IntegerField(default=0)
class Child(models.Model):
# Original db_column:
# parent = models.ForeignKey(Parent, db_column="my_parent")
# New db_column:
parent = models.ForeignKey(Parent)
data = models.IntegerField(default=0)
}}}
This will result in the following PostgreSQL error:
{{{
2015-10-08 15:00:27 CEST ERROR: column "my_parent" does not exist
2015-10-08 15:00:27 CEST STATEMENT: CREATE INDEX "bug_child_1e28668d" ON
"bug_child" ("my_parent")
}}}
The reason seems to be that any foreign key operations are added in SQL
format to the `schema_editor.deferred_sql` list and are then executed
after all other SQL commands. However, the third operation already renamed
the column for that index to 'parent_id'.
The deferred statements should follow any changes made during further
operations. Or, if that's not possible, perhaps some sort of 'insert
deferred statements now' operation could be added after each
'submigration' in the squashed migration?
Note that in this sample app it already crashes at the `CREATE INDEX`
statement, but the statement after that is `ALTER TABLE "bug_child" ADD
CONSTRAINT "bug_child_my_parent_3b5bc08e1603165f_fk_bug_parent_id" FOREIGN
KEY ("my_parent") REFERENCES "bug_parent" ("id") DEFERRABLE INITIALLY
DEFERRED`. I think the same issue might occur if `bug_parent(id)` would be
renamed or removed.
--
Ticket URL: <https://code.djangoproject.com/ticket/25530>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.
* cc: pdewacht@… (added)
* needs_better_patch: => 0
* needs_tests: => 0
* needs_docs: => 0
--
Ticket URL: <https://code.djangoproject.com/ticket/25530#comment:1>
Old description:
New description:
Possibly related to #25521
# bug.migrations.0002_run_python
class Migration(migrations.Migration):
dependencies = [
]
'submigration' in the squashed migration? In fact, that may even be
desired, since the RunPython submigration may depend on the foreign key's
behaviour/index for speed.
Note that in this sample app it already crashes at the `CREATE INDEX`
statement, but the statement after that is `ALTER TABLE "bug_child" ADD
CONSTRAINT "bug_child_my_parent_3b5bc08e1603165f_fk_bug_parent_id" FOREIGN
KEY ("my_parent") REFERENCES "bug_parent" ("id") DEFERRABLE INITIALLY
DEFERRED`. I think the same issue might occur if `bug_parent(id)` would be
renamed or removed.
--
--
Ticket URL: <https://code.djangoproject.com/ticket/25530#comment:2>
* stage: Unreviewed => Accepted
Comment:
I don't know how "smart" we can make squashmigrations. The solution may
have to be "don't squash migrations that contains operations that must
remain in separate migrations."
--
Ticket URL: <https://code.djangoproject.com/ticket/25530#comment:3>
Comment (by simonphilips):
I don't think squashmigrations really is to blame here. The migration
process itself gets stuck over valid input that could have been written
manually as the order of these operations makes perfect sense from a
coder's point of view. What is the reason for delaying these SQL
statements until after all operations have run? Why are they not run right
after each operation?
--
Ticket URL: <https://code.djangoproject.com/ticket/25530#comment:4>
Comment (by timgraham):
It might be related to 307acc745a4e655c35db96f96ceb4b87597dee49 (#24630),
but this is not an area of expertise for me. If you have a solution to
suggest, please do. It seems maybe you want each operation to be run in a
transaction, however, my understanding is that this won't allow the entire
migration to be atomic in case one of its operations fails.
--
Ticket URL: <https://code.djangoproject.com/ticket/25530#comment:5>
* cc: shaib (added)
Comment:
Replying to [comment:4 simonphilips]:
> What is the reason for delaying these SQL statements until after all
operations have run? Why are they not run right after each operation?
Think about a migration which adds two models to an app: a model `A`, and
a model `B` with a FK to `A`. If `A` is created first, there's no problem,
but if `B` is created first, then you can't add its foreign-key constraint
until `A` is created. So, to run the statements immediately with the
operation, you'd have to trace such dependencies and make sure models are
created in the right order; this would be doable in the migration
generator, but a bit of a chore if you write migrations by hand.
Now consider a case where there is a "loop" of FKs (`A` has a FK to `B` as
well); there is no order of creating the models that will work -- you must
separate out the constraint creation statements.
Since the "loop" case is valid, and a solution for it already handles the
cases that would be solved by tracing FK dependencies, the solution was
adopted across the board: Some statements are generated as "deferred" and
are only executed at the end of the migration.
With that explanation in mind, your suggestion in the ticket -- to add an
operation to "flush" the deferred statements, and put it in between
original migrations when squashing, makes a lot of sense. There is a
problem with that: Such an operation could seriously mess with the
optimizer. A possible solution would be to insert the operation only
before `Run{Python,SQL}` operations which block the optimizer anyway.
Adding an operation (whether implicitly or explicitly) would count as a
new feature, and either way the problem is not a regression nor a major
failure in a new feature in 1.8. If a solution can be found without new
behavior, it might be accepted into 1.9, otherwise, it can only go in
1.10.
--
Ticket URL: <https://code.djangoproject.com/ticket/25530#comment:6>
Comment (by jbzdak):
I could attempt to write code of the flush operation (I have an issue that
kind of, sort of is related but solution is the same). But this is not an
EasyPickling task so I figured to ask for permssion.
For me this issue blocks, makes squashed migrations unusable, which is ok
when I have 30 or so migrations in an app, but before django 1.10 is
released problem might get out of hands.
--
Ticket URL: <https://code.djangoproject.com/ticket/25530#comment:7>
Comment (by timgraham):
I closed #25577 as a duplicate as it seems to be the same underlying
issue.
--
Ticket URL: <https://code.djangoproject.com/ticket/25530#comment:8>
Comment (by charettes):
#26461 was a duplicate caused by an implicit table rename through a
`RenameModel` operation.
I think a possible solution would be to add state dependency tracking to
`deferred_sql`. This way `RenameModel`, `AlterField`, `RenameField` and
`AlterModelTable` operations could alter it to rename the references and
`RemoveModel`, `RemoveField` operations could make sure to remove the
unnecessary statements.
For example, the `AddField` operation above would add a tuple of the form
`('CREATE INDEX ...', [(ModelState('bug', 'child', [...]), 'child'),
(ModelState('bug_parent', 'parent', [...]), 'id)]` (as it depends on both
state and fields) and the `AlterField` operation would loop over all
`editor.deferred_sql` and make sure to replace both the SQL and the state
of items depending on the model state it's about to alter. Model level
operations (`RenameModel`, `AlterModelTable`) could simply use `None` as
field to denote they depend on the model state itself.
As for the SQL replacing part I think it would make more sense to
introduce a class encapsulating the statement format string and its
dependency than naively performing `str.replace('old_table_name',
'new_table_name')` on the constructed SQL. This wrapper could also
subclass `str` in order to maintain backward compatibility with the actual
`deferred_sql` API.
Thoughts?
--
Ticket URL: <https://code.djangoproject.com/ticket/25530#comment:9>
* has_patch: 0 => 1
* version: 1.8 => master
Comment:
Here's a [https://code.djangoproject.com/ticket/25530 POC] of what I had
in mind.
--
Ticket URL: <https://code.djangoproject.com/ticket/25530#comment:10>
* needs_better_patch: 0 => 1
* needs_tests: 0 => 1
--
Ticket URL: <https://code.djangoproject.com/ticket/25530#comment:11>
* needs_better_patch: 1 => 0
* needs_tests: 1 => 0
--
Ticket URL: <https://code.djangoproject.com/ticket/25530#comment:12>
* needs_better_patch: 0 => 1
Comment:
Left some comments for improvement.
--
Ticket URL: <https://code.djangoproject.com/ticket/25530#comment:13>
Comment (by timgraham):
I closed #27092 as a duplicate.
--
Ticket URL: <https://code.djangoproject.com/ticket/25530#comment:14>
* owner: nobody => Simon Charette
* status: new => assigned
--
Ticket URL: <https://code.djangoproject.com/ticket/25530#comment:15>
Comment (by Simon Charette):
[https://github.com/django/django/pull/6643 PR]
--
Ticket URL: <https://code.djangoproject.com/ticket/25530#comment:16>
* needs_better_patch: 1 => 0
Comment:
Adjusted the patch based on Tim's feedback and rebased on top of the
latest master.
--
Ticket URL: <https://code.djangoproject.com/ticket/25530#comment:17>
* needs_better_patch: 0 => 1
--
Ticket URL: <https://code.djangoproject.com/ticket/25530#comment:18>
* needs_better_patch: 1 => 0
--
Ticket URL: <https://code.djangoproject.com/ticket/25530#comment:19>
* stage: Accepted => Ready for checkin
--
Ticket URL: <https://code.djangoproject.com/ticket/25530#comment:20>
Comment (by Simon Charette <charette.s@…>):
In [changeset:"b1cbbe926789bcfc2e118c7d7c7a0f30fab5248a" b1cbbe92]:
{{{
#!CommitTicketReference repository=""
revision="b1cbbe926789bcfc2e118c7d7c7a0f30fab5248a"
Refs #25530 -- Deleted deferred SQL references on delete operation.
}}}
--
Ticket URL: <https://code.djangoproject.com/ticket/25530#comment:24>
Comment (by Simon Charette <charette.s@…>):
In [changeset:"ea91ad4c131816fd0ea8d5f1bfb21b7abd82b47e" ea91ad4c]:
{{{
#!CommitTicketReference repository=""
revision="ea91ad4c131816fd0ea8d5f1bfb21b7abd82b47e"
Refs #25530 -- Changed _create_index_name to take a table as first
parameter.
}}}
--
Ticket URL: <https://code.djangoproject.com/ticket/25530#comment:21>
Comment (by Simon Charette <charette.s@…>):
In [changeset:"b50815ee418b38e719476c2d5f6e2bc69f686927" b50815ee]:
{{{
#!CommitTicketReference repository=""
revision="b50815ee418b38e719476c2d5f6e2bc69f686927"
Refs #25530 -- Renamed deferred SQL references on rename operation.
}}}
--
Ticket URL: <https://code.djangoproject.com/ticket/25530#comment:23>
Comment (by Simon Charette <charette.s@…>):
In [changeset:"3b429c96736b8328c40e5d77282b0d30de563c3c" 3b429c96]:
{{{
#!CommitTicketReference repository=""
revision="3b429c96736b8328c40e5d77282b0d30de563c3c"
Refs #25530 -- Tracked references of deferred SQL statements.
}}}
--
Ticket URL: <https://code.djangoproject.com/ticket/25530#comment:22>
* status: assigned => closed
* resolution: => fixed
--
Ticket URL: <https://code.djangoproject.com/ticket/25530#comment:25>