Are migrations idempotent and atomic?

Andres Riancho

Jul 3, 2013, 8:07:27 PM
to south...@googlegroups.com
List,

    Question: Are migrations idempotent?

    Context: I'm deploying a Django application using Fabric. My deploy script runs syncdb and migrate. The deploy script might be run N times, from different servers, to spawn different EC2 instances behind a load balancer. Will migrate work every time, for all the different types of migrations?

    Also, within the same context, are migrations atomic? What happens if two different servers call migrate at the same time?

Regards,
-- 
Andrés Riancho
Project Leader at w3af - http://w3af.org/
Web Application Attack and Audit Framework
Twitter: @w3af
GPG: 0x93C344F3

Andrew Godwin

Jul 4, 2013, 6:11:30 AM
to south...@googlegroups.com
The migrations themselves aren't idempotent but the migrate command is - you can run it as many times as you like and you'll always get the same result (the schema is at the most current version).
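
To illustrate the distinction, here is a minimal sketch. It is not South's actual code: the `applied_migrations` table, the `run_migrate` helper and the two migrations are all hypothetical, and SQLite stands in for the real database. Each migration on its own is not idempotent (creating the same table twice fails), but the runner records what it has applied and skips it next time, so calling the runner repeatedly leaves the schema in the same state.

```python
import sqlite3

# Hypothetical migrations: (name, SQL). Neither statement is idempotent.
MIGRATIONS = [
    ("0001_initial", "CREATE TABLE book (id INTEGER PRIMARY KEY)"),
    ("0002_add_title", "ALTER TABLE book ADD COLUMN title TEXT"),
]


def run_migrate(conn):
    """Apply every migration not yet recorded; return the names applied."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS applied_migrations (name TEXT PRIMARY KEY)"
    )
    done = {row[0] for row in conn.execute("SELECT name FROM applied_migrations")}
    newly_applied = []
    for name, sql in MIGRATIONS:
        if name in done:
            continue  # already applied, so skip it
        conn.execute(sql)
        conn.execute("INSERT INTO applied_migrations (name) VALUES (?)", (name,))
        newly_applied.append(name)
    conn.commit()
    return newly_applied


conn = sqlite3.connect(":memory:")
first_run = run_migrate(conn)   # applies both migrations
second_run = run_migrate(conn)  # finds nothing left to do
```

Running `0001_initial` by hand a second time would raise an error, which is the sense in which the individual migrations are not idempotent.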

Migrations are also atomic if you're running on PostgreSQL (as it has transaction support for schema changes). If two servers call migrate at the same time ON THE SAME DATABASE the result is undefined, but probably bad. My usual deployment pattern is to run migrate once per database on just one of my front-end machines during a deploy.

Andrew



Andres Riancho

Jul 4, 2013, 8:19:18 AM
to south...@googlegroups.com
Andrew,

On Thu, Jul 4, 2013 at 7:11 AM, Andrew Godwin <and...@aeracode.org> wrote:
> The migrations themselves aren't idempotent but the migrate command is - you
> can run it as much as you like and you'll always get the same result (schema
> is most current version).

Hmm, so, if I run "python manage.py migrate" N times on N different
servers, things will break, right?

> Migrations are also atomic if you're running on PostgreSQL (as it has
> transaction support for schema changes). If two servers call migrate at the
> same time ON THE SAME DATABASE the result is undefined, but probably bad. My
> usual deployment pattern is to run migrate once per database on just one of
> my front-end machines during a deploy.

Understood, I suspected this, thus the question :)

Not sure if it's crazy or not, but... maybe it would be useful to have
a "south" table in the DB where south stores the state of the
migrations, and also if a migration is currently running? This would
solve both issues. On the other hand, I'm not sure if it is a real
issue or not... in my case it will just complicate my deployment
process a little bit, because I need to keep track of that piece of
information (if migrate was run) in an external location from my
servers (all my instances are created using AWS' auto-scaling).

Christophe Pettus

Jul 4, 2013, 8:21:47 AM
to south...@googlegroups.com

On Jul 4, 2013, at 2:19 PM, Andres Riancho wrote:
> maybe it would be useful to have
> a "south" table in the DB where south stores the state of the
> migrations

There is.
--
-- Christophe Pettus
x...@thebuild.com

Christophe Pettus

Jul 4, 2013, 8:25:07 AM
to south...@googlegroups.com
Hi, Andres,

(Sorry, sent the last email too soon.)

On Jul 4, 2013, at 2:19 PM, Andres Riancho wrote:

> Hmm, so, if I run "python manage.py migrate" N times on N different
> servers, things will break, right?

That works correctly; as Andrew noted, if you run "python manage.py migrate" twice in a row on the same database, the results are idempotent.

> maybe it would be useful to have
> a "south" table in the DB where south stores the state of the
> migrations, and also if a migration is currently running?

Check your database; south already has tables that track the state of migrations.

A system management process that can potentially spray multiple changes in parallel at a server and hope that they are correctly locked and idempotent on the server is probably one that is doomed to failure. You probably want to solve the problem of serializing changes properly against servers at the system management level.

Andres Riancho

Jul 4, 2013, 8:32:43 AM
to south-users
On Thu, Jul 4, 2013 at 9:25 AM, Christophe Pettus <x...@thebuild.com> wrote:
> Hi, Andres,
>
> (Sorry, sent the last email too soon.)
>
> On Jul 4, 2013, at 2:19 PM, Andres Riancho wrote:
>
>> Hmm, so, if I run "python manage.py migrate" N times on N different
>> servers, things will break, right?
>
> That works correctly; as Andrew noted, if you run "python manage.py migrate" twice in a row on the same database, the results are idempotent.

Great :)

>> maybe it would be useful to have
>> a "south" table in the DB where south stores the state of the
>> migrations, and also if a migration is currently running?
>
> Check your database; south already has tables that track the state of migrations.

Great, so why can't we have atomic migrations on all DBMS? Before
starting migration 00001, south sets the "working" bit to True in the
corresponding south table entry, and when it finishes it sets it to
False. We also keep a field which states if the migration was applied
or not, default to False. If a second instance/server wants to also
start with 00001 at any point in time, it needs to verify that both
"working" and "applied" are False.

Sorry if all this is already answered somewhere else, I'm a south newbie.

PS: Maybe atomic and idempotent questions should go to the FAQ?

> A system management process that can potentially spray multiple changes in parallel at a server and hope that they are correctly locked and idempotent on the server is probably
> one that is doomed to failure. You probably want to solve the problem of serializing changes properly against servers at the system management level.
>
> --
> -- Christophe Pettus
> x...@thebuild.com
>

Andrew Godwin

Jul 4, 2013, 8:46:03 AM
to south...@googlegroups.com
> Great, so why can't we have atomic migrations on all DBMS? Before
> starting migration 00001, south sets the "working" bit to True in the
> corresponding south table entry, and when it finishes it sets it to
> False.

You're asking that South invent its own locking system (something that is impossible on all database backends as just having a bit set/bit get command is insufficient to get around the race conditions) just so you can run the migrate command in parallel from more than one server? That sounds a bit... mad, if you'll pardon the phrase.
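
The race condition mentioned above can be sketched as follows. This is an illustration only, not anything South does: the `state` dictionary stands in for a hypothetical "working" bit in a tracking table, and the two workers are generators so that the interleaving is explicit and repeatable rather than left to thread timing.

```python
def worker(name, state, log):
    """Check-then-set 'lock', split into steps so the interleaving is explicit."""
    saw_working = state["working"]  # step 1: read the flag
    yield
    if not saw_working:
        state["working"] = True     # step 2: set the flag
        log.append(name + " starts migration 0001")
    yield


state = {"working": False}  # hypothetical "working" bit
log = []
a = worker("server-a", state, log)
b = worker("server-b", state, log)

# An unlucky but legal schedule: both servers read before either one writes.
next(a)
next(b)
next(a)
next(b)
```

Both servers end up in `log`, because each one read the flag as unset before the other had written it. A separate read followed by a separate write leaves exactly this window open.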

Running migrations once per database is most definitely the remit of your deployment system, along with things like pushing static files to CDNs or database snapshotting. Even if it were possible to run it from multiple locations, you've now made every single server's deploy that much slower, which seems foolish.

Andrew

Shai Berger

Jul 4, 2013, 8:50:02 AM
to south...@googlegroups.com
On Thursday 04 July 2013 15:32:43 Andres Riancho wrote:
>
> Great, so why can't we have atomic migrations on all DBMS?

Because to do that, you need to be able to perform schema changes within a
transaction. Of all the DBMSs supported by South, only PostgreSQL and MSSQL
support this.
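
As a rough sketch of what "schema changes within a transaction" buys you, the snippet below rolls back a `CREATE TABLE` after a simulated failure. SQLite is used here only because it ships with Python and happens to allow DDL inside a transaction; it is an assumption made for the demo, not a statement about how South treats any backend. The `author` table and the failure are hypothetical.

```python
import sqlite3

# isolation_level=None puts the connection in autocommit mode, so the
# BEGIN/ROLLBACK statements below control the transaction explicitly.
conn = sqlite3.connect(":memory:", isolation_level=None)


def table_names(conn):
    """Return the names of all tables currently in the database."""
    rows = conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
    return sorted(row[0] for row in rows)


conn.execute("BEGIN")
try:
    conn.execute("CREATE TABLE author (id INTEGER PRIMARY KEY)")
    # Simulated failure partway through the migration.
    raise RuntimeError("migration failed halfway")
except RuntimeError:
    conn.execute("ROLLBACK")  # the CREATE TABLE is undone with the transaction

tables_after_failure = table_names(conn)
```

On a database that cannot roll back schema changes, the `author` table would still exist after the failure, leaving the migration half applied.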

> Before
> starting migration 00001, south sets the "working" bit to True in the
> corresponding south table entry, and when it finishes it sets it to
> False. We also keep a field which states if the migration was applied
> or not, default to False. if a second instance/server wants to also
> start with 00001 at any point in time, it needs to verify that both
> "working" and "applied" are False.
>

This could help with the "running migrate in parallel" use case, but that's a
very rare use-case, and such implementations tend to be rather fragile anyway.

Also, you are talking not about making each migration atomic, but about making
the "migrate" command atomic. This, IMO, is a non-starter -- it is not atomic,
because if a single migration fails, nobody is going to roll back all the
migrations that ran successfully before it.
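
A small sketch of that point, under the same assumptions as any toy example: the four migrations are hypothetical, SQLite stands in for the database, and the loop is not South's runner. Each migration gets its own transaction, so a failure undoes only the migration that failed.

```python
import sqlite3

conn = sqlite3.connect(":memory:", isolation_level=None)

# Hypothetical migrations; the third one is deliberately broken.
MIGRATIONS = [
    ("0001", "CREATE TABLE a (id INTEGER)"),
    ("0002", "CREATE TABLE b (id INTEGER)"),
    ("0003", "CREATE TABLE a (id INTEGER)"),  # fails: table a already exists
    ("0004", "CREATE TABLE c (id INTEGER)"),
]

applied = []
for name, sql in MIGRATIONS:
    conn.execute("BEGIN")
    try:
        conn.execute(sql)
        conn.execute("COMMIT")
        applied.append(name)
    except sqlite3.OperationalError:
        conn.execute("ROLLBACK")  # undoes only the failing migration
        break                     # the run stops; earlier migrations stay applied

tables = sorted(
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
)
```

After the run, `0001` and `0002` remain applied and `0004` was never attempted, so the command as a whole is not all-or-nothing.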

> PS: Maybe atomic and idempotent questions should go to the FAQ?
>

The question about running "migrate" again (idempotence), maybe.

Shai.

Christophe Pettus

Jul 4, 2013, 8:56:57 AM
to south...@googlegroups.com

On Jul 4, 2013, at 2:32 PM, Andres Riancho wrote:

> Great, so why can't we have atomic migrations on all DBMS? Before
> starting migration 00001, south sets the "working" bit to True in the
> corresponding south table entry, and when it finishes it sets it to
> False.

Because it wouldn't work reliably.

If you set the bit in a separate transaction from the south migration, the odds are 100% that at some point a migration will fail leaving that bit set, requiring manual cleanup (which is far worse than the problem you are trying to solve). If you set it in the same transaction as the migration, then the bit is useless, because no one else will see it and the migrations will proceed in parallel.
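
The first failure mode can be sketched in a few lines. Everything here is hypothetical: `run_with_flag` is an invented guard, the dictionary stands in for the tracking row, and the exception stands in for a process dying mid-migration, where no cleanup code gets a chance to run.

```python
class MigrationCrashed(Exception):
    """Stands in for the process dying partway through a migration."""


def run_with_flag(state, migration):
    """Hypothetical flag-based guard, with the flag committed separately."""
    if state["working"]:
        return "skipped: another run appears to be in progress"
    state["working"] = True   # committed on its own, visible to everyone
    migration()               # if this dies, the lines below never run
    state["working"] = False
    state["applied"] = True
    return "applied"


def crashing_migration():
    raise MigrationCrashed("connection lost mid-migration")


state = {"working": False, "applied": False}
try:
    run_with_flag(state, crashing_migration)
except MigrationCrashed:
    pass

# Every later attempt now refuses to run until someone clears the flag by hand.
retry_result = run_with_flag(state, lambda: None)
```

The flag is left set with nothing applied, so every later deploy is blocked until the flag is cleared manually.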

And you'll have this problem with something else eventually, besides south. This really should not be south's job.

Andres Riancho

Jul 4, 2013, 9:03:17 AM
to south...@googlegroups.com
Thank you all for your comments, I learnt a lot :) As I said, I'm a
south newbie, so there is much to learn yet. It seems that I'll have
to work on my deploy scripts and create a better way for running
migrations.

PS: Great mailing list, one posts a comment and three knowledgeable
guys answer :)


Alex Satrapa

Jul 4, 2013, 6:31:58 PM
to south...@googlegroups.com
Why does the script to deploy new front ends run syncdb and migrate on all the new front ends?

Define one instance to be the "DBA" and all other instances to be the "clerks". Only the DBA front end runs syncdb and migrate; the clerks only ever launch and start serving requests. Thus you would launch the DBA outside the scope of the load balancer, and all the clerks would be launched under the purview of the load balancer.
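
One way to express this split in a deploy script is sketched below. The role strings and the `database_commands_for` helper are hypothetical; the two commands are simply the ones mentioned earlier in the thread, and how they get executed (Fabric or otherwise) is left out.

```python
# The two database commands named earlier in the thread.
DB_COMMANDS = ["python manage.py syncdb", "python manage.py migrate"]


def database_commands_for(role):
    """Return the database commands an instance should run at launch."""
    if role == "dba":
        return list(DB_COMMANDS)
    if role == "clerk":
        return []  # clerks only start serving requests
    raise ValueError("unknown role: %r" % (role,))


dba_steps = database_commands_for("dba")
clerk_steps = database_commands_for("clerk")
```

With this shape, auto-scaled instances can all be launched as clerks, and the single DBA run happens once per deploy.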

HTH
HAND
Alex
