If adding or changing a column fails due to being not null, make it null.


Avery Pennarun

Sep 29, 2008, 7:58:46 PM
to versap...@googlegroups.com
pmccurdy wrote:
> It's more important to be able to apply a schema change than to keep data,
> at least if the Destructive flag is set. So if a column addition or update
> fails because it's adding a non-null column to a table that already has
> data, we should just make the column be nullable if the Destructive flag
> is set. You may lose a bit of data, but, well, you asked for it.

I think this is a bit mixed up. The idea here is that if you're
allowed to be Destructive anyway, you might as well just drop/add the
table to make it *exactly* like the schema change that we requested.
If Destructive is false, then we never want to lose data, but making a
non-nullable column nullable *doesn't* lose data; it just fails to apply
part of the schema, which is the lesser of two evils.
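
Concretely (a hand-wavy sketch, assuming SQL Server-style DDL; the table
and column names are made up), relaxing the constraint rewrites no rows
at all:

    -- existing data is untouched; the column just starts accepting NULLs
    ALTER TABLE orders ALTER COLUMN shipped_date datetime NULL;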

So I think the logic of your checkin (at least according to the commit
message; I didn't read the code) isn't quite right.

Avery

Peter McCurdy

Sep 30, 2008, 2:38:52 PM
to versap...@googlegroups.com

Note that there are actually two situations where this code gets used:

1. You add a new column that doesn't allow null to a non-empty table.
2. You tried to change a column in a way the database didn't like
(e.g. a change that would cause data in a column to get truncated),
and you set the Destructive flag. In this case, Schemamatic tries to
drop and add the column, which may in turn fail if the column
doesn't allow null.
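
To make case 1 concrete (hand-wavy T-SQL; names made up), on a table
that already has rows:

    -- fails: existing rows would have no value for the new column
    ALTER TABLE orders ADD tracking_number varchar(32) NOT NULL;

The same ADD goes through fine if the column is nullable or has a DEFAULT.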

So you're right, in case 1 there's no reason to require the
Destructive flag. Oops. I was thinking more about case 2 when I was
doing that. Anyway, that's easy to fix.

As for what to do when the Destructive flag is set, IIRC you were the
one who told me that nuking the table in these (slightly
unpredictable) situations was a bit of a pain :) If you end up caring
about the "you've got Nulls!" warning, you can fix it by pushing up
your table data (which you'd have had to do anyway if we nuked the
table), and then pushing the schema again (which should be quick since
you'll only have to change the one table). Or change your schema to
specify a default, and push again, without having to reload your data.
Or if you just don't care about that column, then you don't have to
do anything and most of your data is still there. That all sounds
like a nicer workflow to me.
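
The default route, in the same hand-wavy T-SQL (again with made-up
names), is roughly:

    -- backfills existing rows with 'unknown', so NOT NULL can hold
    ALTER TABLE orders ADD status varchar(16) NOT NULL
        CONSTRAINT DF_orders_status DEFAULT 'unknown';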

Peter.

Avery Pennarun

Sep 30, 2008, 2:48:44 PM
to versap...@googlegroups.com
On Tue, Sep 30, 2008 at 2:38 PM, Peter McCurdy
<peterm...@alumni.uwaterloo.ca> wrote:
> Note that there are actually two situations where this code gets used:
>
> 1. You add a new column that doesn't allow null to a non-empty table.
> 2. You tried to change a column in a way the database didn't like
> (e.g. a change that would cause data in a column to get truncated),
> and you set the Destructive flag. In this case, Schemamatic tries to
> drop and add the column, which may in turn fail if the column
> doesn't allow null.
>
> So you're right, in case 1 there's no reason to require the
> Destructive flag. Oops. I was thinking more about case 2 when I was
> doing that. Anyway, that's easy to fix.

Okay.

> As for what to do when the Destructive flag is set, IIRC you were the
> one who told me that nuking the table in these (slightly
> unpredictable) situations was a bit of a pain :) If you end up caring
> about the "you've got Nulls!" warning, you can fix it by pushing up
> your table data (which you'd have had to do anyway if we nuked the
> table), and then pushing the schema again (which should be quick since
> you'll only have to change the one table). Or change your schema to
> specify a default, and push again, without having to reload your data.
> Or if you just don't care about that column, then you don't have to
> do anything and most of your data is still there. That all sounds
> like a nicer workflow to me.

Well, naturally anything called "Destructive" is always going to be a
bit of a pain :) We just have to make sure we handle all the most
important situations in the right order of priority.

The top two things I need to be able to do are:

A. Update a database schema so that it is *exactly* like the one in my
filesystem. Otherwise developers can't test their work.

B. Update a database schema as closely as possible to the one in my
filesystem, without destroying anything, and give enough clues for
people to sort it out manually if it's important. This is a useful
mode for production databases, or for careful developers who don't
want to lose test data, or maybe just for developers who want to test
how a production upgrade would work.

So (A) is obviously what Destructive=true is all about, and (B) is
Destructive=false.

The implication is that if Destructive=true, the database *needs* to
come out exactly like I asked it to, or developers are going to screw
up, check in broken code, and have failures that only get noticed in
the autobuilder. (Of course, even the autobuilder won't work if
schemamatic can't guarantee that the resulting database is exactly as
intended.) I know this happens because we already have a destructive
mode in the Perl schemamatic and *still* it happens, because
developers aren't quite careful enough. If the system itself refused
to do certain operations automatically, developers (think co-op
students) would certainly be too afraid/unqualified to do them by
hand.

It appears to me, from your analysis above, that you're trying to solve
problem (B) in destructive mode, which is going to lead to some
confusion. And then non-destructive mode seems to be optimized for
some other case (C) that I'm not sure about. Is there a use case I
haven't listed?

Thanks,

Avery

Peter McCurdy

Sep 30, 2008, 4:36:47 PM
to versap...@googlegroups.com

Ah. I was under the impression that this wasn't going to be used for
production databases at all. So then I was thinking Destructive=false
was mainly for detecting changes that'd require attention in upgrade
scripts, and Destructive=true was for...something. It was never
terribly clear to me.

Anyway, what you're saying makes sense, and in fact will probably make
life somewhat simpler.

Peter.

Avery Pennarun

Sep 30, 2008, 4:48:20 PM
to versap...@googlegroups.com

In the long run, of course, we'll want to allow for upgrade scripts.
But when that happens, we'll presumably want to build them on top of
the Destructive=false behaviour; the test of a good upgrade script is
that it will finish correctly, leaving no differences between the two
schemas, even with Destructive=false :)

An upgrade script would be something of the form:

add this column
then run this query
then change this column
etc.
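
For instance (hand-wavy T-SQL again, with made-up names), a script that
splits out a new column might be:

    -- 1. add the new column as nullable, so the ADD can't fail
    ALTER TABLE users ADD full_name varchar(100) NULL;
    -- 2. backfill it from the existing columns
    UPDATE users SET full_name = first_name + ' ' + last_name;
    -- 3. now every row has a value, so tightening the constraint is safe
    ALTER TABLE users ALTER COLUMN full_name varchar(100) NOT NULL;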

Hmm, perhaps an upgrade script could actually consist of *just* one
schema and a set of scripts to run after updating to that schema.
Then a full version upgrade could be a series of upgrade scripts; the
first script updates to an interim schema safely, then does something
appropriate to the data, so that the next schema is able to handle it.

Anyway, I guess I haven't thought it all the way through yet.
Probably worth investigating how Rails does it.

Have fun,

Avery
