Rake tasks for development and testing


Robert Evans

Nov 29, 2007, 9:27:33 PM
to rubyonra...@googlegroups.com
I've created 3 rake tasks:

rake db:create:test
rake db:build
rake db:rebuild

The first creates your test database. The second creates your
development database, migrates it, creates the test database and then
clones the development database to the test database.

Lastly, the third drops your database (dev), creates it, migrates it,
and clones it for your test database.

For a few projects, I've gotten tired of doing: rake db:drop && rake
db:create && rake db:migrate && rake db:test:clone
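
As a rough sketch of how the two composite tasks described above could be wired together with Rake prerequisites (this is not the code from the ticket; the stub tasks and the RUN_LOG constant are stand-ins so the example runs outside a Rails app, where Rails or the patch would supply the real tasks):

```ruby
require "rake"

# Stand-in log so the sketch can run outside Rails.
RUN_LOG = []

# Stubs for the underlying tasks. In a real app these already exist
# (db:create:test being the one proposed in this post) and must not be
# redefined like this.
%w[db:drop db:create db:migrate db:create:test db:test:clone].each do |name|
  Rake::Task.define_task(name) { RUN_LOG << name }
end

namespace :db do
  desc "Create and migrate the development DB, then create and clone the test DB"
  task :build => ["db:create", "db:migrate", "db:create:test", "db:test:clone"]

  desc "Drop the development DB, then recreate, migrate and clone it to test"
  task :rebuild => ["db:drop", "db:create", "db:migrate", "db:test:clone"]
end

Rake::Task["db:rebuild"].invoke
puts RUN_LOG.join(" -> ")
```

Rake runs the prerequisites in the order listed, and runs each task at most once per invocation, which is what makes a plain prerequisite list enough here.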

link: http://dev.rubyonrails.org/ticket/10316

What do you guys think, +1's?
--
Posted via http://www.ruby-forum.com/.

Matt Aimonetti

Nov 29, 2007, 11:50:35 PM
to Ruby on Rails: Core
I recently had a long discussion with David Heinemeier Hansson. I was
a bit annoyed at the fact that rake db:reset was changed to use the
schema file.

Here is my explanation of how I use rake db:reset:

I've been using rake db:reset mainly in the testing environment. Since
I do TDD, I usually don't create a migration all at once.
I write a test, generate the migration needed for the test to pass,
add the code, migrate, and the test passes. I refactor, write another
test, modify the migration, run db:reset, write more code, and the test
passes. I also use rake db:reset to make sure migrations won't break.

David made a good point saying that:

> You're just iterating over one migration, which
> db:rollback + db:migrate would deal with. I can sympathize with that
> one. I just don't see the need to run ALL migrations again. Especially
> not on production systems that might have hundreds of migrations.
>
> I definitely get the point of verifying the current migration you're
> working on. Perhaps something like db:regrate => [ db:rollback,
> db:migrate ] would solve that case?
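
To make the difference from a full reset concrete, here is a toy model of "roll back, then migrate". It assumes nothing about the real Rails migrator: ToyMigrator, Migration and the regrate method are hypothetical names used only for illustration, and the "database" is just a list of applied version numbers.

```ruby
# Toy stand-in for the Rails migrator: each migration has a version and
# up/down blocks, and the "database" is a list of applied versions.
Migration = Struct.new(:version, :up, :down)

class ToyMigrator
  attr_reader :applied, :log

  def initialize(migrations)
    @migrations = migrations.sort_by(&:version)
    @applied = []
    @log = []
  end

  # Apply every pending migration in version order (like db:migrate).
  def migrate
    @migrations.each do |m|
      next if @applied.include?(m.version)
      m.up.call
      @applied << m.version
      @log << "up #{m.version}"
    end
  end

  # Revert the last `step` applied migrations (like db:rollback STEP=n).
  def rollback(step = 1)
    @applied.last(step).reverse_each do |version|
      @migrations.find { |m| m.version == version }.down.call
      @applied.delete(version)
      @log << "down #{version}"
    end
  end

  # The proposed "regrate": roll back, then migrate forward again.
  def regrate(step = 1)
    rollback(step)
    migrate
  end
end

noop = -> {}
migrator = ToyMigrator.new((1..3).map { |v| Migration.new(v, noop, noop) })
migrator.migrate
migrator.regrate
puts migrator.log.inspect
```

Only the most recent migration is reverted and re-applied; migrations 1 and 2 are never touched, which is the point being made about not re-running everything.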

----

db:rollback has a nice STEP option that lets you migrate 'back' a few
versions.

rake db:build is a good idea, but David made clear that we should use
the schema.rb file to bring your db to the latest state. On top of
that I don't understand the need to clone your dev env to your test
environment.

rake db:create:test is the same as rake db:create RAILS_ENV=test or
rake db:create:all

----

I'll submit a patch for rake db:regrate. If you have a suggestion for
a better name, please post here or on the Trac ticket.

-Matt



Robert Evans

Nov 30, 2007, 1:56:20 AM
to rubyonra...@googlegroups.com
I was also annoyed by the change to db:reset, which is why I then wrote
rake db:rebuild. I mainly use it when I'm in the early stages of
development and my migrations are going through some changes. I could
roll back and then migrate back up, but I just prefer the short syntax
of db:rebuild.

rake db:test:clone clones your test database from your schema (same as
rake db:reset - from schema)

I prefer to have a test database up from the get-go (which is why I have
it in rake db:build) and when I remigrate.

The reason I prefer migrations over the schema is that I usually have
some data in my migrations, e.g. an admin user, that I'd like to have
set up. I'm sure I'm not the only one who has some data in the
migrations.

The rake db:create:test task came about because I wanted rake
db:build to also set up my test database (with the current schema). You
can't do rake :build => ["rake db:create RAILS_ENV=test"]; it will fail.
And rake db:create:all creates every database defined in your YAML
file, which I didn't want either.

Those were my reasons why these came about and why I've been using them
for a while now. Am I missing something or doing something here that
shouldn't be done?


Chris Cruft

Nov 30, 2007, 9:02:23 AM
to Ruby on Rails: Core
I'm not sure I exactly follow Robert's use cases, but I sympathize
with the frustration with schema.rb versus migrations. Check out this
thread for a similar discussion:

http://groups.google.com/group/rubyonrails-core/t/d871469cb2a6589a?hl=en

The highly-specific use cases for Robert's patch will make it tough to
be accepted, but I do believe there is hope for some migration support
in test. There are a lot of people blogging (and raising tickets like
8389) about wanting to use migrations to build test. If that happens,
at least some of the infrastructure for Robert's use cases will be in
place.

Matt, your conversation with DHH sounds like it touched on migrations
versus schema dumper. I'd love your thoughts (or David's!) on the
subject of migrations for building the test DB.

-Chris


DHH

Nov 30, 2007, 10:46:15 AM
to Ruby on Rails: Core
> Matt, your conversation with DHH sounds like it touched on migrations
> versus schema dumper. I'd love your thoughts (or David's!) on the
> subject of migrations for building the test DB.

Migrations were never meant to be data seeders. They were meant to
change the schema and massage existing data to fit the new schema.
In that context, it doesn't make sense to use migrations for the test
database because the test database does not have permanent data of
importance.

But it seems that this misuse of migrations highlights something that
might be lacking: a data seeding system. People are cajoling
migrations to fit that role too even though they weren't designed as
such. So we should think about addressing that concern as a separate
function.

For me personally, fixtures fulfill the seeder role for the test
database. I'm interested in knowing when that doesn't work for others.
BTW, I agree that fixtures should not be used to seed the production
database. That's another concern that Rails doesn't really address at
the moment.

Alexey Verkhovsky

Nov 30, 2007, 11:05:23 AM
to rubyonra...@googlegroups.com
On Nov 30, 2007 8:46 AM, DHH <david.he...@gmail.com> wrote:
> In that context, it doesn't make sense to use migrations for the test
> database because the test database does not have permanent data of
> importance.

I think, it does make sense to run migrations in the continuous
integration loop (but not in the local build). Reason: you want to
test them, but you don't want to slow down the local build. A fairly
common practice is to use 001_initial_schema migration as the only
migration on the project for as long as there is no valuable
production data to preserve.

> But it seems that this misuse of migrations highlights something that might be lacking: a data seeding system.

Yup. Another common practice is db/dataload.rb, a script of
ActiveRecord operations to put some data into the database, with the
corresponding db:dataload Rake task. Using AR and domain to create
this data is much easier than doing the same thing with YAML-based
fixtures.
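
A minimal sketch of what such a data-loading script might look like. The User class below is a hand-written stand-in for an ActiveRecord model so the example runs without a database, and load_seed_data is a hypothetical name; in a real db/dataload.rb the application's own models would be used, and a db:dataload Rake task would presumably just load the file.

```ruby
# Stand-in for an ActiveRecord model so the sketch runs without a database.
# In a real db/dataload.rb you would use your own models instead.
class User
  attr_reader :login, :role
  @records = []

  class << self
    attr_reader :records

    def find_by_login(login)
      records.find { |u| u.login == login }
    end

    def create(attrs)
      new(attrs).tap { |u| records << u }
    end
  end

  def initialize(attrs)
    @login = attrs[:login]
    @role = attrs[:role]
  end
end

# The data-loading script proper: plain Ruby, safe to run more than once
# because each record is only created when it is missing.
def load_seed_data
  [
    { :login => "admin", :role => "administrator" },
    { :login => "guest", :role => "guest" }
  ].each do |attrs|
    User.find_by_login(attrs[:login]) || User.create(attrs)
  end
end

load_seed_data
load_seed_data # second run adds nothing
puts User.records.map(&:login).inspect
```

Guarding each create with a lookup is one design choice among several; it keeps the script re-runnable against a database that already has some of the data.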

--
Alexey Verkhovsky
CruiseControl.rb [http://cruisecontrolrb.thoughtworks.com]
RubyWorks [http://rubyworks.thoughtworks.com]

Rick Olson

Nov 30, 2007, 3:25:20 PM
to rubyonra...@googlegroups.com
> Yup. Another common practice is db/dataload.rb, a script of
> ActiveRecord operations to put some data into the database, with the
> corresponding db:dataload Rake task. Using AR and domain to create
> this data is much easier than doing the same thing with YAML-based
> fixtures.

Mephisto uses fixtures in a special db/bootstrap dir, and inserts them
with a db:bootstrap task. db:bootstrap includes schema:load and the
custom data. Though I agree that doing this in ruby would be
simpler...
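
For comparison, a fixture-based bootstrap could be sketched roughly as follows. This is not Mephisto's actual task: the file name, the USERS_YML content, the TABLES hash and load_bootstrap_fixture are all assumptions made up for illustration, with an in-memory hash standing in for the database.

```ruby
require "yaml"

# Hypothetical contents of a bootstrap fixture file such as
# db/bootstrap/users.yml (inlined here so the sketch is self-contained).
USERS_YML = <<~YAML
  admin:
    login: admin
    role: administrator
  guest:
    login: guest
    role: guest
YAML

# Stand-in "database": table name => array of row hashes.
TABLES = Hash.new { |hash, table| hash[table] = [] }

# Insert every named fixture from a YAML document into the given table.
def load_bootstrap_fixture(table, yaml_text)
  YAML.safe_load(yaml_text).each_value do |row|
    TABLES[table] << row
  end
end

load_bootstrap_fixture("users", USERS_YML)
puts TABLES["users"].inspect
```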

--
Rick Olson
http://lighthouseapp.com
http://weblog.techno-weenie.net
http://mephistoblog.com

DHH

Nov 30, 2007, 3:35:18 PM
to Ruby on Rails: Core
> > In that context, it doesn't make sense to use migrations for the test
> > database because the test database does not have permanent data of
> > importance.
>
> I think, it does make sense to run migrations in the continuous
> integration loop (but not in the local build). Reason: you want to
> test them, but you don't want to slow down the local build. A fairly
> common practice is to use 001_initial_schema migration as the only
> migration on the project for as long as there is no valuable
> production data to preserve.

I don't think I understand this. Why do you want or need to
continuously test the migrations? In my opinion, migrations are
transient artifacts that only serve the purpose of moving everyone on
a schema version A to schema version B. Once everyone has been moved,
the migrations are useless and could essentially be deleted.

James Adam

Nov 30, 2007, 5:36:48 PM
to rubyonra...@googlegroups.com
On Nov 30, 2007 8:35 PM, DHH <david.he...@gmail.com> wrote:
> > I think, it does make sense to run migrations in the continuous
> > integration loop (but not in the local build). Reason: you want to
> > test them
>
> I don't think I understand this. Why do you want or need to
> continuously test the migrations?

I'm not sure whether the "continuous" that Alexey was referring to was
the CI process (as in: it is always running), or a repeated run of every
migration each time the test suite is run.

The former certainly makes sense; you'd want to test that a migration
can successfully run based solely on the contents of the SVN
repository and other expected artefacts; verifying that it won't fail
because a developer has failed to commit or add a particular file
before you try and run that migration on the production system.

That is, you'd want to test that any new migrations don't cause a
system failure - not that every migration can be run, each time the CI
system runs against a new build.

--
* J *
~

Robert Evans

Nov 30, 2007, 5:42:42 PM
to rubyonra...@googlegroups.com
I really like Alexey's idea of a Ruby file for loading the seed data and
running it with a rake task. Should that be something in Rails or rather
"best practices"? Or should we have something akin to create_table
that's like create_data_for :users do... within something like
ActiveRecord::SeedData? (with associated rake tasks)

Granted that my case may be one of few. With that said, what do you
think about my proposed rake tasks, minus the db:test:clone? That way,
we can build the test db with a rake task, and can rebuild from
migrations as well? Or am I still off the mark?


Assaf Arkin

Nov 30, 2007, 6:18:35 PM
to rubyonra...@googlegroups.com

+1

I would like to see a way to test migrations in the framework,
especially those involving substantive changes to the data, and to be
able to run them against a large enough dataset, preferably a
copy of the production database. But I don't see testing migrations
as being the same thing as running test cases against the test database.

Assaf

Trevor Turk

Nov 30, 2007, 6:19:52 PM
to Ruby on Rails: Core
> Yup. Another common practice is db/dataload.rb, a script of
> ActiveRecord operations to put some data into the database, with the
> corresponding db:dataload Rake task. Using AR and domain to create
> this data is much easier than doing the same thing with YAML-based
> fixtures.

I've set up apps to detect when they have an empty database, and to
run an action which uses regular AR stuff like User.new to seed the
database. That way, the application "sets itself up" the first time
it's run - no additional rake task needed (but with the overhead of
checking to see if we've got a "clean slate").

In an ideal world, I think Rails applications that have the right info
in config/database.yml would be able to create their own database
(something like rake db:create), load their own schema (rake
db:schema:load), and seed their own data (rake db:bootstrap)
automatically when run for the "first time".

The trick would be knowing when an application was being run for the
first time, but that might be as simple as telling people to not run
rake db:schema:load (or rake:db:create) and simply starting their
application after filling out config/database.yml (if database doesn't
exist or has no tables, run some "init" action if it exists).

I'm not sure if this sort of thing is possible (or a good idea), but
it might be worth thinking about.
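
The first-run check described above boils down to a small decision, sketched here under heavy assumptions: pending_setup_steps is a hypothetical helper, and its two arguments stand in for whatever real checks an application would make against its database server.

```ruby
# Decide which setup steps a first-run check would trigger. The two
# arguments stand in for real checks against the database server.
def pending_setup_steps(database_exists, table_names)
  return [:create, :load_schema, :seed] unless database_exists
  return [:load_schema, :seed] if table_names.empty?
  []
end

puts pending_setup_steps(false, []).inspect
puts pending_setup_steps(true, []).inspect
puts pending_setup_steps(true, ["users"]).inspect
```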

- Trevor

Chris Cruft

Dec 1, 2007, 10:56:39 AM
to Ruby on Rails: Core
"...migrations are transient artifacts that only serve the purpose of
moving everyone on a schema version A to schema version B."

David,
Koz expressed almost exactly this same sentiment yesterday in another
thread
(http://groups.google.com/group/rubyonrails-core/browse_frm/thread/d871469cb2a6589a?hl=en).
You guys are consistent in the message. But there is an argument being
expressed in these threads, plugins and Trac tickets for using
migrations for more than just one-time changes.

I use migrations for building the databases FROM SCRATCH for both
development and production. And I would like to do the same in test
because it works so well for development and production.

*Development: (Before going live and before production even exists)
Occasionally I will end up with a development DB that is full of cruft
and I want to reset. So I drop the development DB and rebuild.
*Production: After months of development, I'm ready to put an app into
production, so I contract with a hosting site and build it from
scratch.

What both these scenarios have in common is that the ruby schema
dumper is inadequate (no DB-specific stuff supported) and the sql
schema dumper is also inadequate (no non-DDL available, such as seed
data loading). Migrations work beautifully to address these problems
in a very Rails-like way (no plugin required!) and using syntax I've
already invested in. I can add an Admin user, a Guest user and their
authorizations and be able to use the app after rake db:migrate.

On a related note, there seems to be a migration-versus-fixtures
debate for seed data coming over the horizon. There is no reason you
can't do a hybrid by loading fixtures within a migration. In fact,
such an approach is described in Agile Web Development with Rails
(page 271, section 16.4). It works well and capitalizes on two
well-tested and understood Rails tools.

It is unfortunate that such a great tool (migrations) can't also be
used to build the test DB. As it is now, I occasionally find my
migrations fail due to subtle DB-side changes or model changes. The
only way to keep them fresh is to manually rebuild a database from
time to time. But it sure would be nice if they could be used in the
day-to-day of building the test DB.



-Chris

DHH

Dec 1, 2007, 11:26:28 AM
to Ruby on Rails: Core
> *Development: (Before going live and before production even exists)
> Occasionally I will end up with a development DB that is full of cruft
> and I want to reset. So I drop the development DB and rebuild.
> *Production: After months of development, I'm ready to put an app into
> production, so I contract with a hosting site and build it from
> scratch.

Both of these scenarios are intended to be solved with db:schema:load.
That task isn't working for you because you're putting seed data into
migrations. In turn, you feel pain from db:schema:load because it
doesn't include your seed data. I think the problem here is seed data
in migrations, not migrations vs schema.

> What both these scenarios have in common is that the ruby schema
> dumper is inadequate (no DB-specific stuff supported) and the sql
> schema dumper is also inadequate (no non-DDL available, such as seed
> data loading).

In my mind, this is a perfect case for SQL schema dumper. You have a
db-specific schema that uses tricks not accessible by the Ruby dumper.
If you split out the concern of seed data, I think a lot of your
problems go away.

> Migrations work beautifully to address these problems
> in a very Rails-like way (no plugin required!) and using syntax I've
> already invested in. I can add an Admin user, a Guest user and their
> authorizations and be able to use the app after rake db:migrate.

Again, I think this is a mistake and it was certainly not what
migrations were designed for. They lead to all the pains and problems
you're describing with migrations.

I fully realize that people are misusing migrations in this way
because they were missing a seed system and just grabbed something
that had the same vague outline. But I think the problem then is to
consider how to best do seeding. Not to twist migrations into a seed
system.

> It is unfortunate that such a great tool (migrations) can't also be
> used to build the test DB. As it is now, I occasionally find my
> migrations fail due to subtle DB-side changes or model changes. The
> only way to keep them fresh is to manually rebuild a database from
> time to time. But it sure would be nice if they could be used in the
> day-to-day of building the test DB.

Again, this is a symptom of wanting to run migrations all the time and
thus needing to make sure they'll work for all eternity. I think
that's a waste of time and hard too. You might very well have old
migrations that depend on classes and methods that are no longer
around. I've seen some of the hoops that people jump through to keep
legacy behavior intact for migrations and it sure ain't pretty.

So in summary, what we need is a seed system as either a best
practice, plugin, or core (doubtful, it doesn't feel like a Most
People, Most of The Time concern) and stop trying to turn migrations
(or even fixtures) into a seed system.

DHH

Dec 1, 2007, 11:28:30 AM
to Ruby on Rails: Core
> In an ideal world, I think Rails applications that have the right info
> in config/database.yml would be able to create their own database
> (something like rake db:create), load their own schema (rake
> db:schema:load), and seed their own data (rake db:bootstrap)
> automatically when run for the "first time".

I think this is being way too clever. Different applications will have
different things they need to have happen before they can run. That
might be gem dependencies, that might be ensuring a certain version of
Ruby, it might be setting up seed data, it might be so many things
that it's not worth standardizing. Just create script/setup and put in
the README that people should run that when first installing the
application. Problem solved, IMO.
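
One possible shape for such a script/setup, as a sketch only: the step list is illustrative, `rake db:bootstrap` is not a stock Rails task (it is the Mephisto-style task mentioned earlier in the thread), and SETUP_STEPS and run_setup are hypothetical names.

```ruby
# Sketch of a script/setup: run each preparation step in order and stop
# at the first failure. The step list is illustrative only.
SETUP_STEPS = [
  ["create databases", "rake db:create:all"],
  ["load schema",      "rake db:schema:load"],
  ["load seed data",   "rake db:bootstrap"]
]

# `runner` is any callable taking a command string and returning true on
# success; the default shells out with Kernel#system.
def run_setup(steps, runner = ->(cmd) { system(cmd) })
  steps.each do |label, command|
    puts "== #{label}: #{command}"
    return false unless runner.call(command)
  end
  true
end

# Dry run with a fake runner so nothing is actually executed here.
executed = []
ok = run_setup(SETUP_STEPS, ->(cmd) { executed << cmd; true })
puts ok
```

Passing the runner in as an argument is only there to make the script easy to dry-run; an application-specific script could call system directly.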

John W. Long

Dec 1, 2007, 1:17:06 PM
to rubyonra...@googlegroups.com

On Dec 1, 2007, at 10:56 AM, Chris Cruft wrote:
> I use migrations for building the databases FROM SCRATCH for both
> development and production. And I would like to do the same in test
> because it works so well for development and production.


Would something like the Scenarios plugin solve your problem?

http://faithfulcode.rubyforge.org/docs/scenarios/

--
John Long
http://wiseheartdesign.com

Mike Mangino

Dec 1, 2007, 2:04:03 PM
to rubyonra...@googlegroups.com
[snip]

> Again, I think this is a mistake and it was certainly not what
> migrations were designed for. They lead to all the pains and problems
> you're describing with migrations.
>
> I fully realize that people are misusing migrations in this way
> because they were missing a seed system and just grabbed something
> that had the same vague outline. But I think the problem then is to
> consider how to best do seeding. Not to twist migrations into a seed
> system.


I'm one of the people misusing migrations in this way, and a seed
system could solve part of the problem, but not all of it.

I often use migrations for creating data that needs to be present in
every environment. For example, a new account in an accounting table.
I want to add that to an existing production application without
reloading all of the data. By including it in a migration, I'm sure
that it will exist in the database of every developer, our integration
environment, and finally production. It keeps me from having to track
down data bugs.

Of course, the fact that I also have to encode that data into my
fixtures isn't very DRY. I'd love to have some way of specifying the
data only once, I just don't have any brilliant ideas about how to do
it.

[snip]

Mike Mangino
http://www.elevatedrails.com

revans

Dec 1, 2007, 4:56:17 PM
to Ruby on Rails: Core
If we kept the integrity of migrations and moved to create something
like SeedData or an implementation of Scenarios, then we can still
keep the state that you want. We'd just have the normal migration
tasks and then tasks for seeding the data and a task for migrating and
then seeding.

IMO, I'd love to see a seed system that mimics migrations a bit and
keeps the standard AR syntax that we are used to: Person.create(...).

Perhaps something like

class SeedPeople < ActiveRecord::SeedData
  def self.up
    create_data :people do |p|
      p.create(:name => "Robert", :password => "secret")
      p.create(:name => "John", :password => "supersecret")
    end
  end
end

I find that the above syntax feels comfortable to me - what do other
people think of the above? John Long has a lot of this already done.

Are there issues for having a simple rake task for setting up just the
test db or even just the production db (without loading any data)?
e.g. rake db:create:test, rake db:create:production - Perhaps I'm just
not seeing the reason why these aren't nice conveniences?


Jack Danger Canty

Dec 1, 2007, 6:37:44 PM
to rubyonra...@googlegroups.com
> IMO, I'd love to see a seed system that mimics migrations a bit and
> keeps the standard AR syntax that we are used to: Person.create(...).


If we have a version-based data-seeding system then we've really just created a parallel set of migrations.  Same benefits, same problems.  Once the models are out of date (say you move a field from one table to another) then your older seed files will be broken.
Yet, if we don't use a version-based system then it's difficult to know what actions to perform to update a given environment.  Adding missing data is easy enough but how do you track when data was removed?

I can see now why we haven't had any kind of data-seeding mechanism and why many of us (myself included) cannibalized migrations for that purpose.

::Jack Danger

Robert Evans

Dec 5, 2007, 5:09:45 PM
to rubyonra...@googlegroups.com
I know that these tasks may not be for core, but for those that did find
some of them useful, I have a plugin that includes them and some other
tasks, including Tobias Lutke's backup task:

http://svn.robertrevans.com/plugins/data_tasks/

Student

Dec 7, 2007, 10:14:46 AM
to Ruby on Rails: Core
I cannot help but believe that matters are headed in exactly the wrong
direction. We already have a serious problem (as discussed in AWDR)
with the asynchronous relationship between code and the current
migration level. Now there is talk about creating ANOTHER
asynchronous set of tasks which depends on both code and migration?

The creation of an administrative user is very tightly bound to the
creation of the users table. Separating them looks unnatural,
difficult, and error-prone to me. I expect that there are similar
situations in other circumstances.

--

I'm confused by DHH's statements regarding the transitory nature of
migrations. Is there a best practices document someplace that
explains the matter?

Chris Cruft

Dec 8, 2007, 12:10:27 PM
to Ruby on Rails: Core
I've tried to adopt DHH's approach of using migrations just as a
change agent and relying on the schema dumper to reveal the state.
What I've found lacking in this approach is that the schema dumper
shows the current schema state AS IS, not AS INTENDED. I'm not
perfect in my practices, and perhaps there is a best practice where
the AS IS (in development) always matches the AS INTENDED (what I want
to test and eventually put into production). But until I find such a
practice, schema dumper doesn't do it for me, and that's before we
even get into the seed data issue.

Migrations definitely serve the purpose for transitory changes. But
they also do a superb job of defining a self-documenting AS-INTENDED
schema using a well-understood and flexible syntax that even supports
loading of "seed" data. Is it possible that migrations, while only
intended to do the job of transitory schema changes, have also turned
out to be the best available tool for building a database from
scratch?

-------- Extra Notes on my practices, feel free to critique ---------
1. I build my initial database (including some minimal seed data) by
creating a 001_baseline.rb migration.
2. Over time, I adjust the schema (and data) with migrations
0nn_<change>.rb
3. When the number and complexity of the migrations begins to get
unwieldy, I condense my migrations into a new baseline. For this
step, I rely heavily on the output of the schema dumper (:sql). I
think this step is essential to good documentation of the AS INTENDED
schema because the net result of a long sequence of meandering
migrations can be difficult to grasp.

In an ideal world, I would have (in priority order):

(a) support for the testing of migrations; today you need either a
plugin or many steps to test with migrations.
(b) more explicit support for using fixtures in migrations (AWDR shows
how, but it could be cleaner).
(c) support for condensing my migrations into a new baseline.
Something like turning over a new DB epoch, with an epoch marker.

Alexey Verkhovsky

Dec 8, 2007, 1:16:28 PM
to rubyonra...@googlegroups.com
On Nov 30, 2007 1:35 PM, DHH <david.he...@gmail.com> wrote:
> I don't think I understand this. Why do you want or need to continuously test the migrations?

Let me try to explain.
* there is no up-to-date development environment on the continuous
integration box
* but I do want to rebuild the database from scratch in every CI
build. From what?
* if I use db/schema.rb, I am relying on an artefact that was
automatically generated in somebody's development environment. Which
is not how it will be done in production. I also cannot expect
everybody to always pay attention when checking-in auto-generated
artefacts.
* running all migrations then looks like a better choice.

> Once everyone has been moved, the migrations are useless and could essentially be deleted.

Yeah, having many migrations floating around is awkward, too. One can
take schema dumper output and make a new baseline migration out of it,
with the same number as the DB_VERSION in the last prod release.

Chad Woolley

Dec 8, 2007, 3:29:54 PM
to rubyonra...@googlegroups.com
On Dec 8, 2007 11:16 AM, Alexey Verkhovsky <alexey.v...@gmail.com> wrote:
>
> On Nov 30, 2007 1:35 PM, DHH <david.he...@gmail.com> wrote:
> > I don't think I understand this. Why do you want or need to continuously test the migrations?
>
> Let me try to explain.
> * there is no up-to-date development environment on the continuous
> integration box
> * but I do want to rebuild the database from scratch in every CI
> build. From what?

Yes, dropping the db and migrating everything is the sensible approach in CI.

> * if I use db/schema.rb, I am relying on an artefact that was
> automatically generated in somebody's development environment. Which
> is not how it will be done in production. I also cannot expect
> everybody to always pay attention when checking-in auto-generated
> artefacts.

Right. In general, I think it's bad practice to check in generated
artifacts. It's more work for everyone to remember to check in when
they change the schema. It's error prone and people often forget to
check in, which means the CI build breaks and you waste time figuring
out what broke, who forgot to check in, and blaming them. Better to
just always run the migrations in CI and svn:ignore schema.rb.

> * running all migrations then looks like a better choice.
>
> > Once everyone has been moved, the migrations are useless and could essentially be deleted.
> Yeah, having many migrations floating around is awkward, too. One can
> take schema dumper output and make a new baseline migration out of it,
> with the same number as the DB_VERSION in the last prod release.

Yep. Speeds up CI and clean DB setups too if there are fewer
migrations. If there were something to automate the collapse of
migrations into a schema dump up to a given version, that would be
great. It's easy to do manually, but it would be a nifty rake task
for someone to publish.
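
The planning half of such a task might be sketched like this. It only decides which numbered migration files a new baseline would absorb; generating the baseline itself from the schema dumper output is not shown. partition_for_baseline and the example file names are hypothetical.

```ruby
# Split numbered migration file names (the 001_name.rb scheme used in this
# thread) into those a new baseline would absorb and those that stay.
# Names without a numeric prefix count as version 0 and are absorbed.
def partition_for_baseline(filenames, baseline_version)
  filenames.sort.partition do |name|
    name[/\A(\d+)_/, 1].to_i <= baseline_version
  end
end

files = %w[001_baseline.rb 002_add_roles.rb 003_add_accounts.rb 004_add_audit_log.rb]
absorbed, kept = partition_for_baseline(files, 3)
puts "fold into baseline: #{absorbed.inspect}"
puts "keep as-is:         #{kept.inspect}"
```

Here 3 plays the role of the DB_VERSION of the last production release mentioned above; everything at or below it is folded into the baseline, and later migrations are left alone.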

-- Chad
