Database migrations?


Sam Minnee

Oct 27, 2008, 11:06:42 PM
to SilverStripe Development
Currently, db/build automatically gives us 80% of the database
migrations that we need. This is great, but it leaves the remaining
20% poorly supported.

In particular, although the database schema is updated for you, the
data within the schema isn't. This makes the following changes
annoying:
* renaming a field
* moving a field up/down the class hierarchy
* moving an object's field into a separate object or table, e.g.
changing a has_many relationship into a many_many relationship.
* turning a pageless controller into a page type

I'm sure that any experienced SilverStripe developer can think of many
more examples.

I think that we should aim to have better support for database
migrations in 2.4. They would run as part of db/build, after the
current db/build code.

Some design questions:

1) Database versioning. Should we have a table in the database that
stores information about how up-to-date the database is? What should
we use to track the current version of the database? A timestamp, a
subversion revision number, or a separate database version number?
GMT timestamps seem to me to be the simplest as they are universally
available and you're not coupling yourself to subversion.

The question then becomes: how do you make it easy for people to
create migration scripts? Rails provides a rake action for scaffolding a new
migration script; we could potentially do the same thing with sake,
although so far sake has been an optional tool and it would be nice to
keep it that way.
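
To make this concrete, here is a rough sketch of timestamp-based
tracking (every name below is invented for illustration, not a
proposed API):

    // Hypothetical sketch: one table records which timestamped
    // migrations have been applied to this database.
    DB::query("CREATE TABLE IF NOT EXISTS _MigrationVersion (
        MigrationID VARCHAR(14) NOT NULL PRIMARY KEY, -- GMT, e.g. '20081027230642'
        Applied DATETIME NOT NULL
    )");

    $applied = DB::query("SELECT MigrationID FROM _MigrationVersion")->column();
    foreach($pendingMigrations as $id => $migration) { // $pendingMigrations is made up
        if(!in_array($id, $applied)) {
            $migration->run();
            DB::query("INSERT INTO _MigrationVersion VALUES ('$id', NOW())");
        }
    }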

2) Script format. Should the migration scripts consist of simple .SQL
statements, ORM calls, or some other syntax? If we're going to make a
migration syntax, then the ORM is probably a good place to start, but
we would need to have some way of getting data from no-longer-existent
tables and columns. This seems pretty achievable, since the
legacy dbfields and dataobjects are still available in the database.

The scripts themselves could probably be simple procedural scripts
that make data object get and update calls. One thing that would be
handy is to extend the ORM to have better support for batch-updates,
as they are currently very inefficient. But, strictly speaking, this
activity is separate from building the migration system.
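
For example, such a script might amount to no more than this
(DataObject::get() and write() are the existing ORM calls; the field
names are made up):

    // Copy data out of a legacy column into its replacement field.
    // Assumes the legacy column is still readable, as noted above.
    $pages = DataObject::get('Page');
    foreach($pages as $page) {
        $page->Summary = $page->LegacyTeaser; // 'LegacyTeaser' is hypothetical
        $page->write();
    }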

3) Reversibility. At one level, making migrations reversible seems
like a high-cost nice-to-have. However, we would need to ensure that
the development process for migrations can cope with people making
buggy migrations. This could be done by giving each migration step a
complementary reverse operation, or more crudely by backing the
database up prior to any migration. The backup operation, although
slower, will more accurately cover cases where the migration system
itself has bugs. It's also simpler to implement. We would probably
want the system to store a maximum number of backups.
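
As a sketch, a reversible step could simply pair each operation with
its complement (class and method names invented):

    // Hypothetical reversible migration step:
    class RenameTeaserToSummary {
        function up() {
            DB::query("UPDATE SiteTree SET Summary = Teaser");
        }
        function down() {
            DB::query("UPDATE SiteTree SET Teaser = Summary");
        }
    }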

4) Scope. Database content migrations go beyond schema migrations.
For example, when you create a new page type on a project, you will
often want to add a new instance of that page type to production when
you go live, with some particular content. I would think that the
migration system should be usable for this.

If this is the case, then potentially any content change that you make
is going to be a candidate for a migration. Perhaps this can help
with the process of creating migration scripts: some kind of interface
for tracking changes that you made since your last commit and
inquiring as to which of these should be turned into migrations. But
you would need to be careful not to list 1000s of irrelevant plumbing
commits. Such information could perhaps be drawn from a higher level
than the database: for example, list the content pages created /
edited.

In all of this, we need to identify a simple but useful initial step
to take in this direction, and iterate. This means that we need to
resist the urge to add every cool migration feature that we can think
of in the first iteration.

References:
http://api.rubyonrails.com/classes/ActiveRecord/Migration.html
http://divmod.org/trac/wiki/DivmodAxiom/Reference#Upgraders
http://www.varlena.com/varlena/GeneralBits/127.php

Hayden

Oct 28, 2008, 3:20:02 PM
to SilverStripe Development
We've encountered times in our current project that would benefit
greatly from a standard migration mechanism. At the moment, we are
just using subclasses of Controller.

1) A standard revision number is probably the simplest and least
error-prone way of doing this. I don't think that everyone would be
using SVN, and those who are would probably use svn export to generate
a copy of the code for the website.

2) In terms of being compatible with 4), surely the easiest way is to
have a DatabaseMigration class in a PHP script or something? Possibly
even have it extend task so that it's in a list on /dev/tasks?
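
Something like this, perhaps (just a sketch on top of the existing
task plumbing; the class body is made up):

    // Would show up at /dev/tasks like any other task; the migration
    // itself is ordinary PHP / ORM code inside run().
    class MigrateCommentsTask extends BuildTask {
        protected $title = 'Migrate comments to new structure';

        function run($request) {
            foreach(DataObject::get('PageComment') as $comment) {
                $comment->Moderated = true; // hypothetical data fix-up
                $comment->write();
            }
        }
    }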

3) A safe way of doing this would be pretty useful. I quite often had
subtle errors in my Rails migrations that made reversing them a bit of
a pain.

Ingo Schommer

Oct 29, 2008, 5:50:55 AM
to silverst...@googlegroups.com
A good collection of usecases and different methodologies:
http://code.djangoproject.com/wiki/SchemaEvolution

Also: http://www.martinfowler.com/articles/evodb.html

On 28.10.2008, at 04:06, Sam Minnee wrote:
>
> Some design questions:
>
> 1) Database versioning. Should we have a table in the database that
> stores information about how up-to-date the database is? What should
> use to track the current version of the database? A timestamp, a
> subversion revision number, or a separate database version number?
> GMT timestamps seem to me to be the simplest as they are universally
> available and you're not coupling yourself to subversion.

How about version numbers for each model?
If you take the effort to write a migration script, you might as well
add version numbers to the class incrementally. The currently
applied numbers could be stored for each model (not each table)
in a separate database table.
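
E.g. (purely illustrative):

    // Each model declares the schema version its code expects:
    class BlogEntry extends DataObject {
        static $schema_version = 3; // hypothetical; bumped per migration
    }
    // A separate table would map model class => last applied version,
    // and db/build would run the migrations in between.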


>
>
> 2) Script format. Should the migration scripts consist of simple .SQL
> statements, ORM calls, or some other syntax? If we're going to make a
> migration syntax, then the ORM is probably a good place to start, but
> we would need to have some way of getting data from no-longer-existent
> tables and columns. This seems pretty achievable, since the
> legacy dbfields and dataobjects are still available in the database.

SQL migrations don't work well with database abstraction,
but can be handy if you know what you're doing.
If we can't do both "flavours" of SQL- and ORM-based migrations
right away because of time constraints, I'd start with ORM-based.


>
> 4) Scope. Database content migrations go beyond schema migrations.
> For example, when you create a new page type on a project, you will
> often want to add a new instance of that page type to production when
> you go live, with some particular content. I would think that the
> migration system should be usable for this.

Yeah, not every migration would necessarily change the schema.
It's definitely handy for adding default records - currently we just
check whether the table we're about to change has any records, to
determine whether defaults like a blog page are added.

One important issue for a growing development team is the sequentiality
of migrations - you need to determine an order in which they're
executed. This means migrations in e.g. feature branches might use a
version number that's already been reserved in trunk in the meantime,
causing weird database issues.
Might fall in the "too hard" bucket for now...

Sam Minnee

Oct 29, 2008, 7:31:19 PM
to SilverStripe Development
> How about version numbers for each model?
> If you take the effort to write a migration script, you might as well
> add version numbers to the class incrementally. The currently
> applied numbers could be stored for each model (not each table)
> in a separate database table.

I don't like the idea of having multiple version numbers for a
database. A lot of data migrations span multiple models, and so your
migrations aren't necessarily tied to a single model. But this is
one example of having multiple streams of migrations, which I think is
necessary, at a minimum, for modules (see below).

> One important issue for a growing development team is the sequentiality
> of migrations - you need to determine an order in which they're
> executed. This means migrations in e.g. feature branches might use a
> version number that's already been reserved in trunk in the meantime,
> causing weird database issues.
> Might fall in the "too hard" bucket for now...

This is a good argument in favour of using timestamps as migration
#s. You could still *potentially* get issues with what amount to
conflicts in your migrations, but I haven't seen a lot of other
projects tackling that problem.

Of course, the problem with timestamps as migration #s is that they're
more annoying for developers to manually write.

Another potential way to approach this is to collate your migrations
in a single file. That way, conflicting migrations would be
registered as conflicts by SVN (since they would be changes to the
same part of the same file) and could be dealt with manually by the
merging/updating developer.

You would still need to allow for the existence of multiple migration
scripts (at a minimum, one per module), and this would ultimately
lead back to the problem of multiple database version numbers.
Perhaps you could just *state* that it is a requirement that
individual migration streams don't tread on each other's toes, but it
starts to lead into hairy territory.

Mind you, the requirement of non-conflicting streams of migrations is
going to be there, regardless.

Consider the following situation:

1) You check out ecommerce-0.5.1, with migrations 20080101, 20080102,
20080103
2) You build your site, creating migrations 20080801, 20080802,
20080803
3) You then upgrade to ecommerce-0.5.2, with migrations 20080101,
20080102, 20080103, 20080301, 20080302 (ie, 2 additional migrations
beyond the original 3)

For (3) to execute properly, the system needs to know that the
ecommerce migrations were executed up to 20080103, but that the site's
migrations were executed up to 20080803.
I have no idea how other frameworks such as Rails cope with this;
perhaps they don't deal with modular migrations?

There are two ways of dealing with this:

* Explicitly allow for separate streams of migrations.
* Allow for out-of-order application of migrations by listing every
revision that has been applied. So that at (3) in the above example,
we know that [ 20080101, 20080102, 20080103, 20080801, 20080802,
20080803 ] have all been applied, leaving [ 20080301, 20080302 ].
This would let us know when people have done icky things like inject
an older migration into the migration sequence, but there's no way of
distinguishing between legal and illegal out-of-order application.
One potential solution to this is to have some way of defining
migration pre-requisites, but that makes writing migrations a whole
lot more complex.
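
The bookkeeping for the second option is essentially a set difference
(runMigration() below is made up):

    $available = array('20080101', '20080102', '20080103', '20080301',
                       '20080302', '20080801', '20080802', '20080803');
    $applied   = array('20080101', '20080102', '20080103',
                       '20080801', '20080802', '20080803');
    $pending = array_diff($available, $applied); // => 20080301, 20080302
    sort($pending);
    foreach($pending as $id) runMigration($id);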

Jeremy Thomerson

Apr 21, 2012, 7:46:56 AM
to silverst...@googlegroups.com
What ever came of this?  Is there support for DB migrations now?

Jeremy Thomerson

Sam Minnée

Apr 21, 2012, 6:43:03 PM
to silverst...@googlegroups.com, silverst...@googlegroups.com
No, it stalled. It would be great to pick it up again though!

Aaron Cooper

Mar 24, 2014, 8:01:35 PM
to silverst...@googlegroups.com
Been a year, so I thought I'd pick up the stick and prod this again.

Has there been any further discussion on migrations at all? This is still the first result in Google.

Aaron Cooper

Mar 24, 2014, 8:02:28 PM
to silverst...@googlegroups.com
I forgot how to math. Two years.

Cam Findlay

Mar 25, 2014, 6:43:45 PM
to silverst...@googlegroups.com
A couple of options have recently emerged for packaging up and moving full databases across SilverStripe environments.

These might be worth checking out.





Nicolaas Thiemen Francken - Sunny Side Up

Mar 25, 2014, 8:51:09 PM
to silverstripe-dev
Completely off-topic, but sort of relevant ... apart from reading this mailing list, what is the discovery route for awesome tools like sspak?  It just looks like something that is super useful, but it is not listed in addons (http://addons.silverstripe.org/add-ons?search=sspak) AFAIK.  I wonder if there is a site / app / thingy that lists useful developer tools for SilverStripe sites and stays up-to-date.





--
Nicolaas Thiemen Francken
  www.sunnysideup.co.nz
  phone: +64221697577

Szabesz

Mar 26, 2014, 2:09:04 AM
to silverst...@googlegroups.com
+1
:P

Anselm Christophersen

Mar 26, 2014, 3:53:43 AM
to silverst...@googlegroups.com
Just chiming in.
I randomly discovered sspak last summer, and am now actively using it - e.g. to download db/assets from the server, and use them locally.
It’s absolutely great. But yes, I also just randomly stumbled over it…

Anselm

camfindlay

Mar 26, 2014, 9:04:50 PM
to silverst...@googlegroups.com, n...@sunnysideup.co.nz
SSPak is still marked as experimental, so I can understand why it's not been pushed into addons yet. But part of the power of our community is in building up relationships among developers so you find out about some of these unreleased gems (it also gives you a chance to drop some comments, do some testing, or make a pull request on these modules to move them more quickly into the addons site).

However, the GOVTNZ replicant tool is actually on addons, and I think this is our first NZ Government contributed module/tool - hopefully a taste of things to come as the open source sharing mindset takes hold in the public sector.

For me anyway, my discovery route outside of the addons site is through building relationships and participating in conversations, either in person or virtually.

Carlos Cordero

Mar 27, 2014, 3:48:04 PM
to silverst...@googlegroups.com, n...@sunnysideup.co.nz
I'd like to add that the NZ Gov CIO Office and Dept. of Internal Affairs are clearly communicating to all Govt. agencies the benefits of embracing open source and the CWP.

I agree with Cam's assertion that as the agencies move towards the CWP, they will continue to contribute.  

Cam Spiers

Apr 14, 2014, 12:45:40 AM
to silverst...@googlegroups.com, n...@sunnysideup.co.nz
As I understand it, the original topic was more to do with schema/content migrations and not database transfer/replication. I see these as distinct problems.

A project that I have seen floating around but haven't used is http://phinx.org/; it is for schema migrations but looks like it can also do data migrations (via queries). I am wondering if it would be worth leveraging a tool like this. I think it would be possible to write a tool that generates SilverStripe-centric schema migrations for phinx à la dev/build, and also their rollback (as a class).
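
For a flavour of it, a generated migration might look roughly like
this (table/column names are just examples, not the output of any
real generator):

    use Phinx\Migration\AbstractMigration;

    class AddSummaryToSiteTree extends AbstractMigration
    {
        public function up()
        {
            $this->table('SiteTree')
                 ->addColumn('Summary', 'text', array('null' => true))
                 ->update();
        }

        public function down()
        {
            $this->table('SiteTree')
                 ->removeColumn('Summary')
                 ->save();
        }
    }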

This generated migration would either represent the entire migration (acting exactly like dev/build), or would serve as a starting point for more details about the migration to be filled in. We could then disable dev/build, or make it so dev/build will only ever run migrations that exist in the codebase but haven't been run before. This would all be optional, and likely offered as a module not included in core.

Thoughts?

Cam

Mark Guinn

Apr 14, 2014, 9:55:29 AM
to silverst...@googlegroups.com
I like this idea a lot. It would essentially create a record of dev/build runs, allow you to add data changes to them, and also allow you to roll the changes back.

stojg

Jul 8, 2014, 6:02:59 AM
to silverst...@googlegroups.com, n...@sunnysideup.co.nz
> A project that I have seen floating around but haven't used is
> http://phinx.org/; it is for schema migrations but looks like it can
> also do data migrations (via queries). I am wondering if it would be
> worth leveraging a tool like this. I think it would be possible to
> write a tool that generates SilverStripe-centric schema migrations
> for phinx à la dev/build, and also their rollback (as a class).

I've used phinx for a non-SilverStripe project and it's awesome. There should be some way of recording the schema updates, but rollbacks would be trickier if you really want to clean stuff up instead of falling back to the _obsolete tables.

So the deploy scenario would be:
1) Deploy the code, but don't show it to website visitors yet (i.e. don't symlink it into the webserver docroot)
2) Run ./framework/sake /dev/db/migrate
3) Show the new version to visitors (maybe pre-warm the template caches)
4) Discover a horrible error and decide to roll back
5) Run ./framework/sake /dev/db/rollback
6) Show the old version of the website to visitors

This of course requires some sort of orchestration involving a specific deployment strategy, with tools that I'm not sure everyone needs.

Is it possible to agree on tools and scripts that work on a wide set of environments and that the community can benefit from?

/S

Jedateach

Jan 25, 2015, 7:34:17 PM
to silverst...@googlegroups.com, n...@sunnysideup.co.nz
Bumping this thread back into life! :)

I like the look of phinx. I certainly like the idea of being able to roll back/forward to any version of the code + corresponding schema.

I did Laravel's introduction tutorial not long ago, and saw that they have quite a similar solution. If I remember correctly, you actually work on the migration almost automatically as you make changes to the schema. Migration steps are stored in classes, like phinx, which I think are created as stubs when using their CLI tool.

+1 for rolling out DB migrations into framework (or somewhere near the ORM). It'll save the hassle of maintaining a single, increasingly complex migration script for each module. 

swaiba

Jan 26, 2015, 10:18:24 AM
to silverst...@googlegroups.com, n...@sunnysideup.co.nz
I've actually got a prototype of this ready.

In short, the code knows its version.

When a dev/build is requested, it determines if the version is going to change;
if so, it saves the current state and performs any pre operations,
then after the dev/build it performs post operations.

The operations are stored in classes named <version>_to_<version>_Upgrade (or similar).
So whilst you have to specify each task - e.g. if you change a many_many to a has_many - the majority of it can be utility functions with the structure of the UPDATE command of the database.
As you are programming and you see this change being required, you up the version and make the migration task (using the existing base function).
This allows any code to be run on version changes, but it is mainly focused on maintaining the DB more smoothly.
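
To illustrate the shape of it (all names below are invented for the
example, not the actual prototype code):

    // A per-version upgrade class, named after the versions it spans:
    class V1_2_to_V1_3_Upgrade {
        // runs before dev/build, while the old schema still exists
        function preDevBuild() {
            DB::query("UPDATE Product SET LegacyCategory = Category");
        }
        // runs after dev/build, once the new schema is in place
        function postDevBuild() {
            DB::query("UPDATE Product SET Category = '' WHERE Category IS NULL");
        }
    }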

This is only about going forward though; our update process includes making a full backup, so if the update didn't work the rollback is a simple restore of the previous code and DB.

We are almost set using Behat internally - when this is under test I can upload it, or possibly before if anyone is interested.

Ingo Schommer

Feb 12, 2015, 4:51:49 AM
to silverst...@googlegroups.com, n...@sunnysideup.co.nz
My initial reaction: Why do we need this in core, can't it just "wrap around" dev/build as a third party module?
But thinking about it a bit more, migrations are an important advanced use case, particularly when it comes to the module ecosystem.

Looking at Laravel and Rails migrations, there are quite different ways of implementing the ActiveRecord pattern to support migrations though: properties and their casting are inferred from the database table rather than the model class. So if we wanted to implement migrations for *all* schema updates, we'd either have a lot of duplicated properties ($db and migration script), or a quite deep change to the way our models work. They also rely more on code generation through CLI tasks, incidentally something SilverStripe would benefit from either way.

If we implement migrations as an occasional tool for non-standard schema changes (renames, data migration) alongside dev/build, we'd need to indicate which ones run before dev/build (renames) or after dev/build (data migration). And also which ones should partially replace responsibilities of dev/build. Sounds like a confusing API to me, but might be acceptable for the advanced target audience.

Mark Guinn

Feb 12, 2015, 5:49:18 AM
to silverst...@googlegroups.com
Having worked with both Laravel and Rails I actually prefer the SilverStripe way. I think the simplest path by far is to modify dev/build to automatically create and run migrations. Then it would be easy for module developers or anyone to create additional migrations in the Rails CLI code-generation way, to supplement basic schema changes. This seems like the best of both worlds, and it's backwards compatible.

Patrick Nelson

Feb 12, 2015, 12:36:50 PM
to silverst...@googlegroups.com
Well, I think Mark's solution isn't the simplest per se (since you could implement migrations as an augmentation only, not as a fundamental basis for modifying DB schema in dev/build) - however it's definitely the most robust if, in fact, it were the basis underlying the dev/build process. Possibly implemented using three different command line options when performing a dev/build, like:

1. On its own, without options, it builds automated migrations and then executes them.
2. An option to build migrations only.
3. An option to generate an empty boilerplate migration that you can tweak as you see fit.
4. An option to execute only.
5. An option to roll back the last migration.

Going through this thought process though, I'm thinking additional metadata might need to be retained alongside each automatically built migration to flag the current declarative schema state as defined in ::$db, like a sha1 hash comment at the top somewhere (i.e. declarative as opposed to imperative, which is what we're generating in the migrations). This is because if you happened to run options 1 or 2 again (each of which attempts to build migrations), the system needs to know *if it should* build a migration, according to the most recent metadata. That way, you can generate migrations of your own (or edit automatically built ones) and not have to worry about executing any of the options and accidentally re-generating migrations which are no longer relevant.
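
E.g. the hash might be derived like this (sketch only; the helper
name is made up):

    // Hash the declarative schema state so the generator can tell
    // whether anything changed since the last generated migration:
    function schemaStateHash($class) {
        return sha1(serialize(array(
            Config::inst()->get($class, 'db'),
            Config::inst()->get($class, 'has_one'),
        )));
    }
    // Stored as a comment at the top of the generated migration; if
    // the current hash matches, there is nothing new to generate.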

I'd say either this approach (i.e. rework dev/build itself to use migrations) or just set up migrations to be totally independent, as I proposed originally. My original proposition however doesn't necessarily account for ensuring that things are properly run in progression over time *with respect* to dev/build, but only with respect to itself, suggesting a more fundamental change to dev/build, unfortunately. That modification again is a bit more intensive, but I think Mark has a very good point: all schema modifications are 1) easily automated, 2) entirely progressive and still 3) editable.


Patrick Nelson

Feb 12, 2015, 12:37:35 PM
to silverst...@googlegroups.com
And by "three" I mean "five" sorry about that :)

Patrick Nelson

Feb 17, 2015, 7:50:13 PM
to silverst...@googlegroups.com
For anyone interested, I've gone ahead and created a new module aptly named "silverstripe-migrations" and put it here:


I just built out the functionality that I need, no more and no less.  I figured I'd get this out of the way now and set it up so that it can be easily included in other projects, if anyone finds it useful.  Note: It's intended to be very bare bones! Please feel free to fork and extend it if you need to.

swaiba

May 7, 2015, 12:44:18 PM
to silverst...@googlegroups.com
Several months have passed, and I've just been discussing this with Hamish Friedlander today; his opinion would be that most of the original points Sam raised should be handled within requireDefaultRecords() of each DataObject.
I haven't discussed it in detail, but I do see the point for a lot of simple cases.
This will get resolved for us within the next few months; I'll come back and update this as we progress.
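
For reference, the simple cases look something like this
(requireDefaultRecords() and DB::alteration_message() are existing
API; the guard and data are just an example):

    class BlogHolder extends Page {
        // called on every dev/build, after the schema has been updated
        function requireDefaultRecords() {
            parent::requireDefaultRecords();
            if(!DataObject::get_one('BlogHolder')) {
                $holder = new BlogHolder();
                $holder->Title = 'Blog';
                $holder->write();
                DB::alteration_message('Default blog holder created', 'created');
            }
        }
    }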

Barry