Migrations: single file vs. multiple files

James Reeves

May 14, 2011, 3:03:36 PM
to Lobos Library
I've been looking into writing database migrations in Clojure, and I
happened across Lobos, which in general looks excellent and a great
complement to ClojureQL. However, I'm a little uncertain about the
proposed migration syntax.

The migration syntax in the README mirrors the Rails method of putting
each migration in its own timestamped file. But this requires that the
migration files exist on the file-system; it doesn't seem like it
would work if the migrations were all bundled into a jar or war, since
there is no interface for searching through packaged resources.
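
To illustrate the concern, here is a minimal sketch (not Lobos code). A directory listing through `java.io.File` is only possible when a classpath resource resolves to a `file:` URL; inside a jar the URL uses the `jar:` protocol and `clojure.java.io/file` refuses to coerce it. The helper name `listable-as-directory?` is hypothetical.

```clojure
(require '[clojure.java.io :as io])

(defn listable-as-directory?
  "Hypothetical helper: true when the classpath resource at `path` resolves
  to a plain directory on disk, i.e. something java.io.File can list."
  [path]
  (if-let [url (io/resource path)]
    (and (= "file" (.getProtocol url))
         (.isDirectory (io/file url)))
    false))

;; clojure/core.clj normally ships inside the Clojure jar, so printing the
;; protocol of its URL shows how packaged resources are addressed.
(when-let [url (io/resource "clojure/core.clj")]
  (println (.getProtocol url)))
```

A migrations directory that passes this check during development would typically stop passing it once the project is packaged, which is the situation described above.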

Is there a reason for having one migration per file? Might it not be
more flexible if migrations were placed in a single file, and executed
in the order they are defined? e.g.

  (defmigration create-users
    (create
      (table :users
        (integer :id :primary-key)
        (varchar :name 100))))

  (defmigration add-user-name-length-check
    (alter (table :users (check :name (> (length :name) 1)))))

Or am I missing some crucial problem with this?

- James

Nicolas Buduroi

May 14, 2011, 3:43:02 PM
to lobos-...@googlegroups.com
On Saturday, May 14, 2011 3:03:36 PM UTC-4, James Reeves wrote:
> I've been looking into writing database migrations in Clojure, and I
> happened across Lobos, which in general looks excellent and a great
> complement to ClojureQL. However, I'm a little uncertain about the
> proposed migration syntax.

Good, I was really looking for feedback on that feature.

> The migration syntax in the README mirrors the Rails method of putting
> each migration in its own timestamped file. But this requires that the
> migration files exist on the file-system; it doesn't seem like it
> would work if the migrations were all bundled into a jar or war, since
> there is no interface for searching through packaged resources.

That's an issue I hadn't considered yet. It would certainly be better if the migration code were compiled like normal Clojure source code, after all! I'll look into this further; I'll probably use a macro like the 'defmigration' you used in your example. I might change the syntax to handle undoing migrations like this:

(defmigration create-users
  :do (create ...)
  :undo (drop ...))

Also, I wonder if it would be a good idea to add support for implicit rollbacks, that is, if the :undo part is missing, generate it on the fly. Or maybe it would be safer to force explicit rollbacks and throw an exception when :undo is missing. In the second case, you would have to state explicitly that a migration can be rolled back safely, like this:

(defmigration create-users
  :do (create ...)
  :undo :pass)
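
A minimal sketch of the "force explicit rollbacks" option, under some assumptions: the macro name `defmigration*` is hypothetical, each clause holds a single form, the exception is thrown when the migration is defined, and `:undo :pass` is taken to mean that rolling back needs no action. The `log` atom only stands in for real schema actions.

```clojure
(defmacro defmigration*
  "Hypothetical sketch: defines `name` as a map of :do and :undo thunks.
  Throws at macro-expansion time when the :undo clause is missing."
  [name & {:as opts}]
  (when-not (contains? opts :undo)
    (throw (IllegalArgumentException.
            (str "Migration " name " has no :undo clause"))))
  (let [undo-form (:undo opts)]
    `(def ~name
       {:name '~name
        :do   (fn [] ~(:do opts))
        ;; :pass means "safe to roll back without doing anything"
        :undo ~(if (= :pass undo-form)
                 `(fn [] nil)
                 `(fn [] ~undo-form))})))

;; Stand-in for real schema actions: record what ran.
(def log (atom []))

(defmigration* create-users
  :do   (swap! log conj :create-users)
  :undo (swap! log conj :drop-users))

(defmigration* seed-defaults
  :do   (swap! log conj :seed)
  :undo :pass)

((:do create-users))
((:undo create-users))
((:undo seed-defaults))
@log ;; => [:create-users :drop-users]
```

Leaving out the `:undo` clause, as in `(defmigration* broken :do ...)`, fails when the form is compiled rather than at rollback time.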
 
> Is there a reason for having one migration per file? Might it not be
> more flexible if migrations were placed in a single file, and executed
> in the order they are defined? e.g.
>
>   (defmigration create-users
>     (create
>       (table :users
>         (integer :id :primary-key)
>         (varchar :name 100))))
>
>   (defmigration add-user-name-length-check
>     (alter (table :users (check :name (> (length :name) 1)))))
>
> Or am I missing some crucial problem with this?

To be honest, I've done it that way only because that's what I'm used to with Rails. Putting all migrations into one file would certainly be more flexible and I don't see any drawback for now. I'll try to find out why ActiveRecord works like that.

Thanks for the feedback!

Nicolas Buduroi

May 14, 2011, 4:06:30 PM
to lobos-...@googlegroups.com
One other thing about the 'defmigration' macro: it could be made to handle docstrings and also a version metadata key. The version number (auto-generated as a timestamp) would prevent name clashes. The migration table would store both version and name, but the migrations would be executed in the order they are defined. That should be as safe as what ActiveRecord is doing, but more flexible and easier to work with.

I've started a topic on the question of using individual files on the Ruby on Rails: Talk group:

https://groups.google.com/d/topic/rubyonrails-talk/5-SvCmKo-v4/discussion

James Reeves

May 14, 2011, 4:31:12 PM
to lobos-...@googlegroups.com
On 14 May 2011 20:43, Nicolas Buduroi <nbud...@gmail.com> wrote:
> I'll look more into this, I'll probably use a macro like the 'defmigration'
> you used in your example. I might change the syntax to handle undoing
> migrations like this:
>
> (defmigration create-users
>   :do (create ...)
>   :undo (drop ...))

Good idea. I'd overlooked the fact that migrations might need to be undone.

> Also, I wonder if it would be a good idea to add support for implicit
> rollbacks, that is if the :undo part is missing generate it on the fly or
> maybe it would be safer to force explicit rollbacks and throw an exception
> then.

I'd be tempted to make the :undo part explicit, i.e. raise an
exception if a migration is created without an :undo.

>> Is there a reason for having one migration per file? Might it not be
>> more flexible if migrations were placed in a single file, and executed
>> in the order they are defined? e.g.
>

> To be honest, I've done it that way only because that's what I'm used to
> with Rails. Putting all migrations into one file would certainly be more
> flexible and I don't see any drawback for now. I'll try to find out why
> ActiveRecord works like that.

I'm not quite sure why Rails does it that way, either. Possibly to
make it easier to generate migrations using scripts?

- James

James Reeves

May 14, 2011, 4:40:24 PM
to lobos-...@googlegroups.com
On 14 May 2011 21:06, Nicolas Buduroi <nbud...@gmail.com> wrote:
> One other thing about the 'defmigration' macro, it could be made to handle
> docstrings and also a version meta-data key. The version number
> (auto-generated as a timestamp) would prevent name clashes. The migration
> table would store both version and name, but the migrations would be
> executed based in the order they are defined.

Wouldn't the auto-generated version number be different each time the
migration file was evaluated? Or am I misunderstanding?

- James

Nicolas Buduroi

May 14, 2011, 5:04:05 PM
to lobos-...@googlegroups.com
On Saturday, May 14, 2011 4:40:24 PM UTC-4, James Reeves wrote:
> Wouldn't the auto-generated version number be different each time the
> migration file was evaluated? Or am I misunderstanding?

The version number would be explicitly included in the 'defmigration' form, but wouldn't be mandatory.

args: ([name docstring? attr-map? version? & body])

Where the body would be a map with :do and/or :undo keys, e.g.:

(defmigration create-user :20110514...
  :do (create ...)
        (insert-rows ...))

The 'dump' migration command would always include the version number, and the :undo key if it can generate it completely. I would also make this command output warnings when it can't generate the :undo part or if the name or version is already being used, maybe with a prompt to ask if the user really wants to create that migration.
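
A rough sketch of how the optional arguments in that arglist could be told apart. It assumes the version is written as a keyword other than :do or :undo; the helper names `group-body` and `parse-migration-args` and the `:v1` version are made up for illustration and are not part of Lobos.

```clojure
(defn group-body
  "Groups body forms under the :do / :undo marker that precedes them.
  Forms that appear before any marker end up under a nil key."
  [forms]
  (first
   (reduce (fn [[acc k] form]
             (if (#{:do :undo} form)
               [(assoc acc form []) form]
               [(update acc k (fnil conj []) form) k]))
           [{} nil]
           forms)))

(defn parse-migration-args
  "Hypothetical helper: splits the arguments that follow the migration name
  into an optional docstring, attr-map and version, plus the grouped body."
  [args]
  (let [take-if (fn [pred xs]
                  (if (pred (first xs))
                    [(first xs) (rest xs)]
                    [nil xs]))
        [doc xs]     (take-if string? args)
        [attrs xs]   (take-if map? xs)
        [version xs] (take-if #(and (keyword? %) (not (#{:do :undo} %))) xs)]
    {:doc doc :attrs attrs :version version :body (group-body xs)}))

(parse-migration-args
 '("Creates the users table" :v1
   :do (create-table) (insert-rows)
   :undo (drop-table)))
;; => {:doc "Creates the users table", :attrs nil, :version :v1,
;;     :body {:do [(create-table) (insert-rows)], :undo [(drop-table)]}}
```

When the version is left out, the first keyword seen is :do or :undo, so `:version` comes back as nil and the body is parsed the same way.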


James Reeves

May 15, 2011, 10:27:01 AM
to lobos-...@googlegroups.com
On 14 May 2011 22:04, Nicolas Buduroi <nbud...@gmail.com> wrote:
> The version number would be explicitly included in the 'defmigration' form,
> but wouldn't be mandatory.
>
> args: ([name docstring? attr-map? version? & body])
>
> Where the body would be either a map with :do or :undo keys, ex.:
>
> (defmigration create-user :20110514...
>   :do (create ...)
>         (insert-rows ...))

Why not just rely on the migration name? In Rails, migrations have
timestamps to ensure uniqueness and order, but in Clojure symbol
bindings are unique and ordered anyway.

For example:

(defmigration create-user
  (:do [db] ...)
  (:undo [db] ...))

(defmigration add-user-creation-date
  (:do [db] ...)
  (:undo [db] ...))

(An explicit "db" argument seems more in line with how protocols and types work)

If we tried to add another "create-user" definition in the same
namespace, we'd get a warning or error message telling us a symbol
has been redefined. We also know that Clojure evaluates files
linearly, so create-user will always be evaluated before
add-user-creation-date.

So perhaps behind the scenes, a migration like this:

(defmigration create-user
  (:do [db] ...)
  (:undo [db] ...))

Might get turned into code like:

(do
  (defonce lobos-migrations (atom []))
  (def create-user
    (reify Migration
      (migrate-do [this db] ...)
      (migrate-undo [this db] ...)))
  (swap! lobos-migrations conj 'create-user))

Then we have an atom that contains all of the migrations that
*should* be applied, and we can compare this to a database table that
contains all the migrations that *are* applied.
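
For anyone who wants to try the idea at a REPL, here is a runnable sketch of the expansion above. It rests on a few assumptions: the `Migration` protocol is declared as shown, the registry lives at the top level instead of inside each expansion, and `pending` is a hypothetical helper that takes the applied names as a plain collection standing in for the database table. Re-evaluating a plain `def` rebinds the var silently, so the sketch makes registration idempotent instead of relying on a redefinition warning.

```clojure
(defprotocol Migration
  (migrate-do [this db])
  (migrate-undo [this db]))

;; Names of all defined migrations, in definition order.
(defonce lobos-migrations (atom []))

(defmacro defmigration
  "Sketch only: expands (:do [db] ...) and (:undo [db] ...) clauses into a
  reified Migration and records the name once, in definition order."
  [name [do-key do-args & do-body] [undo-key undo-args & undo-body]]
  (assert (= [:do :undo] [do-key undo-key]) "expected (:do ...) then (:undo ...)")
  `(do
     (def ~name
       (reify Migration
         (~'migrate-do [_# ~@do-args] ~@do-body)
         (~'migrate-undo [_# ~@undo-args] ~@undo-body)))
     (swap! lobos-migrations
            (fn [names#]
              (if (some #{'~name} names#) names# (conj names# '~name))))))

(defn pending
  "Migrations that should be applied but are missing from `applied`."
  [applied]
  (remove (set applied) @lobos-migrations))

;; The db here is just an atom holding a set, standing in for a real database.
(defmigration create-user
  (:do [db] (swap! db conj :users))
  (:undo [db] (swap! db disj :users)))

(defmigration add-user-creation-date
  (:do [db] (swap! db conj :created-at))
  (:undo [db] (swap! db disj :created-at)))

@lobos-migrations        ;; => [create-user add-user-creation-date]
(pending '[create-user]) ;; => (add-user-creation-date)
```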

I believe Rails applies migrations out of order if there is a merge
conflict. For example, if there are migrations A and B, and Alice
creates migration C, then she will have a database created by
migrations ABC. But if Bob creates migration D at the same time, then
he'll have a database created by migrations ABD.

Rails resolves this conflict by assuming that migrations are
independent and can be applied out of order. So Alice will wind up
with a database created by ABCD, and Bob will have a database created
by ABDC. I guess the idea is that this method preserves the data in
the development database, because there's no need to undo a migration
that might result in data loss.

However, my feeling is that I'd prefer migrations to be applied in the
exact order they will be in production, even if it means my
development database might lose some data. So personally, I'd prefer
the migration library to automatically undo D in Bob's database, and
then run migrations C and D in the correct order.
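
A small sketch of that rule, assuming both lists are known: keep the longest prefix the database shares with the defined order, undo everything applied after it (latest first), then apply the rest of the defined order. The function name `migration-plan` is made up for illustration.

```clojure
(defn migration-plan
  "Given the migrations in definition order and the ones applied to a
  database (in application order), returns what to undo, latest first, and
  what to apply so that the database ends up built in definition order."
  [defined applied]
  (let [shared (count (take-while true? (map = defined applied)))]
    {:undo  (reverse (drop shared applied))
     :apply (drop shared defined)}))

;; Bob's database was built from A, B, D; the merged order is A, B, C, D.
(migration-plan '[A B C D] '[A B D])
;; => {:undo (D), :apply (C D)}

;; Alice's database already matches a prefix, so nothing is undone.
(migration-plan '[A B C D] '[A B C])
;; => {:undo (), :apply (D)}
```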

What do you think?

- James

Nicolas Buduroi

May 15, 2011, 12:06:36 PM
to lobos-...@googlegroups.com
On Sun, May 15, 2011 at 10:27 AM, James Reeves <jre...@weavejester.com> wrote:
> Why not just rely on the migration name? In Rails, migrations have
> timestamps to ensure uniqueness and order, but in Clojure symbol
> bindings are unique and ordered anyway.

Good point!
 
> However, my feeling is that I'd prefer migrations to be applied in the
> exact order they will be in production, even if it means my
> development database might lose some data. So personally, I'd prefer
> the migration library to automatically undo D in Bob's database, and
> then run migrations C and D in the correct order.
>
> What do you think?

That's an interesting proposition. I'll consider it for sure; it shouldn't be that hard to implement.

Nicolas Buduroi

May 15, 2011, 12:22:05 PM
to lobos-...@googlegroups.com
On Sun, May 15, 2011 at 10:27 AM, James Reeves <jre...@weavejester.com> wrote:
> (defmigration add-user-creation-date
>   (:do [db] ...)
>   (:undo [db] ...))
>
> (An explicit "db" argument seems more in line with how protocols and types work)

The way I see it, the actions should simply use whatever db connection they have access to. It would be the responsibility of the user to call the migration commands in the proper context. This could enable migrations that can apply changes to any number of different databases.


> (do
>   (defonce lobos-migrations (atom []))
>   (def create-user
>     (reify Migration
>       (migrate-do [this db] ...)
>       (migrate-undo [this db] ...)))
>   (swap! lobos-migrations conj 'create-user))

That's an interesting implementation, I'll probably use it!

Nicolas Buduroi

May 15, 2011, 1:10:44 PM
to lobos-...@googlegroups.com
On Sun, May 15, 2011 at 12:22 PM, Nicolas Buduroi <nbud...@gmail.com> wrote:
> On Sun, May 15, 2011 at 10:27 AM, James Reeves <jre...@weavejester.com> wrote:
>> (defmigration add-user-creation-date
>>   (:do [db] ...)
>>   (:undo [db] ...))
>>
>> (An explicit "db" argument seems more in line with how protocols and types work)
>
> The way I see it, is to simply let the actions use whatever db connection they have access to. It would be the responsibility of the user to call the migration commands in the proper context. This could enable migrations that can apply changes to any number of different databases.

I've thought about it a little more and you have a point there! I think my (probably silly) point would better be handled by splitting actions which are applied to different databases into multiple migrations after all.

James Reeves

May 17, 2011, 6:53:56 PM
to lobos-...@googlegroups.com
On 15 May 2011 18:10, Nicolas Buduroi <nbud...@gmail.com> wrote:
> On Sun, May 15, 2011 at 12:22 PM, Nicolas Buduroi <nbud...@gmail.com>
> wrote:
>> On Sun, May 15, 2011 at 10:27 AM, James Reeves <jre...@weavejester.com>
>> wrote:
>>> (An explicit "db" argument seems more in line with how protocols and
>>> types work)
>>
>> The way I see it, is to simply let the actions use whatever db connection
>> they have access to. It would be the responsibility of the user to call the
>> migration commands in the proper context. This could enable migrations that
>> can apply changes to any number of different databases.
>
> I've thought about it a little more and you have a point there! I think my
> (probably silly) point would better be handled by splitting actions which
> are applied to different databases into multiple migrations after all.

Well, the database configuration can be passed to the migration via a
binding or an explicit argument. An explicit argument might be a
little better in this case, because you might want to use other SQL
libraries in the migration, and a binding would only work for Lobos
functions.

For instance, perhaps you have a database filled with plaintext
passwords and you want to encrypt them with a salted one-way hash. In
this case, you'd probably need to use a library like ClojureQL to
iterate through the table and encrypt all the existing passwords.
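
To make the difference concrete, here is a toy sketch with no real database involved. `*db*`, `bound-action`, `explicit-query` and `encrypt-passwords` are all hypothetical names: the first function stands in for a library call that reads its connection from a dynamic var, the second for a call into another SQL library that needs the connection handed to it.

```clojure
(def ^:dynamic *db* nil)

(defn bound-action
  "Stands in for a library function that reads its connection from *db*."
  []
  (str "lobos action on " *db*))

(defn explicit-query
  "Stands in for a function from another SQL library that knows nothing
  about *db* and must be given the connection."
  [db sql]
  (str sql " on " db))

(defn encrypt-passwords
  "Migration body that receives the connection explicitly, so it can serve
  both styles: rebinding *db* for one call and passing db to the other."
  [db]
  [(binding [*db* db] (bound-action))
   (explicit-query db "UPDATE users ...")])

(encrypt-passwords "dev-db")
;; => ["lobos action on dev-db" "UPDATE users ... on dev-db"]
```

With only a binding in place, `explicit-query` would have no way to find the connection, since nothing tells it to look at `*db*`.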

- James
