Using migrations vs. something like Rails' db:schema:load


Daniel Tao

Dec 22, 2014, 11:20:33 PM
to south...@googlegroups.com
I was recently discussing migrations with some team members, and one of my teammates suggested I contact Andrew directly with my question. If this has been discussed elsewhere, please just go ahead and point me to that discussion! (I did a quick Google search but didn't have any luck with that.)

I'm actually new to Django, but I've done a lot of work in Rails. The sort of common wisdom I'd picked up from my Rails days led me to the feeling that, for the purpose of setting up a fresh database for an app (e.g., for onboarding a new team member, or for setting up an environment for automated tests), migrations are not the right tool for the job: for a sufficiently old project, there are bound to be issues, for example code dependencies that have broken, configuration changes that were made in production without being captured in code form (bad, but it always happens), etc.

My experience is that migrations are a great tool for making incremental changes to the DB, but that simply for getting set up, doing a one-time schema load is the way to go. In Rails land, the DB schema is always saved to a file (schema.rb) and kept in the code base. Then to set up a fresh DB with the proper schema, one simply runs a Rake task called db:schema:load. But it appears that Django does not have an equivalent of this.

I'm not claiming that what I've just stated is right, simply that it's what I understood prior to entering Django land. My question—since I'm sure you are at least familiar with this approach, and I'm guessing you've considered it and weighed pros and cons yourself—is: what am I missing? Do you disagree with what I've said? Do you think that keeping the DB schema itself in the code base is a bad idea?

Andrew Godwin

Dec 23, 2014, 4:35:02 AM
to south...@googlegroups.com
Hi Daniel,

First off, South is indeed lacking in this area, but the new Django migrations are a more evolved version that solves this problem, so I'll explain how that solves it.

The new Django migrations are declarative rather than procedural, and each migration action is able to modify both the database schema and an in-memory version of the app's models. This means that any set of migrations implicitly has a schema defined - you can run through the migrations in memory to get the resulting schema.
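To make the "run through the migrations in memory" idea concrete, here is a toy model in plain Python. It is only a sketch, not Django's actual code: the class names CreateModel, AddField and RemoveField loosely echo Django's operation names, but the apply_to_state method, the schema_from_migrations helper and the dict-based state are assumptions made up for illustration.

```python
from dataclasses import dataclass

# Toy stand-ins for migration operations. The names are hypothetical and
# only loosely echo Django's real operation classes.


@dataclass(frozen=True)
class CreateModel:
    name: str
    fields: tuple  # tuple of (field_name, field_type) pairs

    def apply_to_state(self, state):
        state[self.name] = dict(self.fields)


@dataclass(frozen=True)
class AddField:
    model: str
    name: str
    type: str

    def apply_to_state(self, state):
        state[self.model][self.name] = self.type


@dataclass(frozen=True)
class RemoveField:
    model: str
    name: str

    def apply_to_state(self, state):
        del state[self.model][self.name]


def schema_from_migrations(migrations):
    """Replay every operation in memory and return the resulting schema."""
    state = {}
    for operations in migrations:
        for op in operations:
            op.apply_to_state(state)
    return state


# Three "migration files", each a list of declarative operations.
migrations = [
    [CreateModel("Author", (("id", "int"), ("name", "text")))],
    [AddField("Author", "email", "text")],
    [RemoveField("Author", "name")],
]

derived_schema = schema_from_migrations(migrations)
print(derived_schema)
```

No database is touched here: because each operation describes what changes rather than how to run it, the same list can be replayed against a plain dict to recover the schema the migrations imply.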

However, of course, Django already has a definitive schema in apps' models.py files, whereas Rails takes the database as more of the source of truth. This is really what makes it possible to define a stable schema for an app.

The combination of these, though, is what stops people forgetting to add migrations for things; you can't change the models without migrations noticing, as Django can compare the concrete schema (from models.py) with the derived one (from the migrations) and work out what's changed. Django completely ignores what's actually in the database, apart from a table that tracks which migrations have been applied; this way, anything you do out of scope hopefully fails hard and early rather than persisting until you set up a new environment.
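The comparison described above can be sketched in a few lines of plain Python. This is a simplified illustration under stated assumptions, not Django's autodetector: detect_changes, the tuple-based change records and the dict-of-dicts schema format are all hypothetical, and the sketch ignores field type changes and renames.

```python
def detect_changes(derived, declared):
    """Return the changes needed to turn `derived` into `declared`.

    Both arguments map model name -> {field name: field type}.
    `derived` is what the existing migrations add up to; `declared`
    is what the models currently say.
    """
    changes = []
    for model, fields in declared.items():
        if model not in derived:
            changes.append(("create_model", model))
            continue
        for field in fields.keys() - derived[model].keys():
            changes.append(("add_field", model, field))
        for field in derived[model].keys() - fields.keys():
            changes.append(("remove_field", model, field))
    for model in derived.keys() - declared.keys():
        changes.append(("delete_model", model))
    return sorted(changes)


# Schema implied by the migrations written so far.
derived = {"Author": {"id": "int", "name": "text"}}

# Schema declared in the models after someone edited them.
declared = {
    "Author": {"id": "int", "name": "text", "email": "text"},
    "Book": {"id": "int"},
}

pending = detect_changes(derived, declared)
print(pending)
```

An empty result means the models and the migrations agree; anything else is a change that still needs a migration written for it.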

A big part of Django migrations is also reproducibility. I know that if I install the same app on two servers a few months apart, they'll have run through the exact same set of schema changes (and, more importantly, any data migrations in the migration set that set up initial data, etc.). Doing an initial schema load (like Django's syncdb) defeats this: it doesn't let you write data migrations, and instead confines you to loading data separately, in a way that isn't the same across existing and new instances.
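A minimal sketch of that reproducibility argument, using only Python's built-in sqlite3 module rather than Django itself. Everything here is an assumption for illustration: the migration names, the applied_migrations tracking table, and the migrate and snapshot helpers are hypothetical and are not Django's implementation.

```python
import sqlite3


# Hypothetical migration set: schema and data changes live in one ordered
# sequence, so every database runs through both in the same order.
def create_country(conn):
    conn.execute("CREATE TABLE country (code TEXT PRIMARY KEY, name TEXT)")


def load_initial_countries(conn):
    conn.executemany(
        "INSERT INTO country (code, name) VALUES (?, ?)",
        [("FR", "France"), ("JP", "Japan")],
    )


MIGRATIONS = [
    ("0001_create_country", create_country),
    ("0002_load_initial_countries", load_initial_countries),
]


def migrate(conn):
    """Apply, in order, every migration not yet recorded as applied."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS applied_migrations (name TEXT PRIMARY KEY)"
    )
    applied = {
        row[0] for row in conn.execute("SELECT name FROM applied_migrations")
    }
    newly_applied = []
    for name, func in MIGRATIONS:
        if name in applied:
            continue
        func(conn)
        conn.execute("INSERT INTO applied_migrations (name) VALUES (?)", (name,))
        newly_applied.append(name)
    conn.commit()
    return newly_applied


def snapshot(conn):
    """Return the contents of the country table in a stable order."""
    return conn.execute("SELECT code, name FROM country ORDER BY code").fetchall()


# Two independent "servers", set up separately from the same migration set.
server_a = sqlite3.connect(":memory:")
server_b = sqlite3.connect(":memory:")
migrate(server_a)
migrate(server_b)
print(snapshot(server_a) == snapshot(server_b))
```

Running migrate a second time on either connection applies nothing, because the tracking table already lists both migrations. A one-off schema load would create the country table but would have no equivalent place to put the initial rows.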

There's some discussion about having a just-make-it-from-the-models-py-file mode for the migration backend for test setup (it can be faster), but this is only for very large apps; generally, I like having migrations run in testing too as it means your database is going to be exactly the same as the real one.

Finally, if you end up with too many migrations, you can squash them down into a few new migrations and have Django automatically manage the switchover (it also optimises away things like add/delete field pairs).
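The add/delete optimisation can be illustrated with a toy squash function in plain Python. This is a deliberately simplified sketch and not how Django's optimiser is written: the squash function and the tuple format for operations are hypothetical, and it assumes no other operation touches a field between its add and its removal.

```python
def squash(operations):
    """Collapse a list of operations, dropping add/remove pairs for a field.

    Each operation is a (kind, model, field) tuple. A "remove_field" that
    follows an "add_field" for the same field cancels both out.
    """
    result = []
    for op in operations:
        kind, model, field = op
        added = ("add_field", model, field)
        if kind == "remove_field" and added in result:
            # The field was added earlier in this same history, so the
            # squashed history never needs to create it at all.
            result.remove(added)
            continue
        result.append(op)
    return result


history = [
    ("add_field", "Author", "name"),
    ("add_field", "Author", "nickname"),
    ("remove_field", "Author", "nickname"),
    ("add_field", "Author", "email"),
]

squashed = squash(history)
print(squashed)
```

A removal of a field that was not added within the squashed range is kept, since that field exists in the schema before the range starts.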

Django has always kept the schema in the code base - that's what models.py files _are_ - we just apply it differently to Rails.

(also, in future, django-developers is a better place for discussing migrations, as South is EOL)

Andrew
