Schema Evolution status

Xan

Sep 26, 2007, 11:02:52 AM

to Django developers

Hi,

I just want to know the status of Schema Evolution. Reading the Trac
docs and browsing the code repository, I see that there are several
implementations.

Which is the most stable implementation?
Which implementation will you probably merge into svn?

What are the big steps that we (Django users) have to wait for before
a Django version with schema evolution merged in is available?

Please answer all of the questions.
Thanks in advance,
Xan.

Yuri Baburov

Sep 26, 2007, 2:18:31 PM

to django-d...@googlegroups.com

Hi,

Waiting for Derek, as he's the official person responsible for
implementing schema-evolution functionality in Django (if we can use
"responsible" with open source) and the maintainer of the
schema-evolution branch.

What I can say now is that since Django now allows custom commands,
we will implement it as a separate, installable Python module and
release a first working milestone really soon. I am helping Derek with
it.
I don't know what to do with the branch. The core devs are very strict
about what code can be included in trunk (they don't seem to like the
.aka syntax, for example). It is also less pain for them than
maintaining schema-evolution through yet more incompatible changes in
the internals :)
The problem is that the schema-evolution branch will break again after
it is updated to trunk (because of changes to Django and the management
module), but the last version of the branch should be working.
Right now you can also look for a patch against previous trunk versions
(I have a 2007-07-19 version and a 2007-08-25 version on my local PC;
the first one was taken from a ticket, the last one from Derek's site
kered.org or from somewhere else).

There were also attempts to do schema-evolution in other ways, but no
other project has released more than limited alpha versions. I
rechecked it today.

I think I answered all your questions..
Stay tuned :)

--
Best regards, Yuri V. Baburov, ICQ# 99934676, Skype: yuri.baburov,
MSN: bu...@live.com

SmileyChris

Sep 26, 2007, 3:53:10 PM

to Django developers

On Sep 27, 6:18 am, "Yuri Baburov" <burc...@gmail.com> wrote:
> There were also attempts to do schema-evolution in other way, but no
> other project released more than limited alpha versions. I rechecked
> it today.

I'm guessing this could refer to my code.

I'm actually waiting for some code to be put up to
http://code.google.com/p/django-evolution/ - Russell (I'm pretty sure
it was him... we discussed it at the end of the sprint) has someone
who has been working on some code and I really want to see it before I
do any more work.

I like Derek's take on model comparison using introspection, but
really think that it should be based on a migration system. I know
several Django developers agree with me on this (including Russell).

Yuri Baburov

Sep 26, 2007, 4:27:00 PM

to django-d...@googlegroups.com

> I like Derek's take on model comparison using introspection, but
> really think that it should be based on a migration system. I know
> several Django developers agree with me on this (including Russell).

Rails has migrations because it has introspection already built-in.
I don't understand why you like syncdb doing introspection but
don't like evolve doing introspection. That seems weird to me.
For a migration system, you can already use the dbmigrate project, for example.

Yuri Baburov

Sep 26, 2007, 4:34:09 PM

to django-d...@googlegroups.com

> I'm actually waiting for some code to be put up to
> http://code.google.com/p/django-evolution/ - Russell (I'm pretty sure
> it was him... we discussed it at the end of the sprint) has someone
> who has been working on some code and I really want to see it before I
> do any more work.

You'd better name the project django-migration then, and leave the
django-evolution name for us ;)

> I like Derek's take on model comparison using introspection, but
> really think that it should be based on a migration system. I know
> several Django developers agree with me on this (including Russell).

Maybe you are not against introspection but against the lack of a migration system?
I just don't want to make writing migrations by hand a necessary
operation. For small projects it's a pita. It's just as dumb as
manually fixing tables with SQL. It's not DRYish.
But I vote a strong +1 for the possibility of writing them by hand
and correcting them by hand.

Derek Anderson

Sep 26, 2007, 4:30:45 PM

to django-d...@googlegroups.com

SmileyChris wrote:
> I like Derek's take on model comparison using introspection, but
> really think that it should be based on a migration system. I know
> several Django developers agree with me on this (including Russell).

i see them as two distinct, non-exclusive tasks, with automatic
introspection during development *optionally* feeding into svn-ed
migration scripts for large or widespread production deployments.

derek

Derek Anderson

Sep 26, 2007, 5:07:23 PM

to django-d...@googlegroups.com

ok, for the official status:

as mentioned earlier, yuri and i are joining forces on
introspection-driven schema evolution. and since automatic
introspection is so controversial (and obviously a poison pill for
acceptance into django-proper), we're rewriting it as an external library.

the project page is here:
http://code.google.com/p/deseb/
i expect a release in the next week or two.

the dreaded "aka" syntax will remain, but will require one additional
import statement to the top of your models.py. and management
integration will continue through the external command support recently
added into the trunk.

i agree with yuri that requiring users to write migrations in some new
django-specific SQL abstraction language is dumb, not DRYish and an
all-around pita. if you want that, that's ok, but it's not the itch
we're scratching here. (*)

as soon as we have a working implementation, i will be stopping support
for the schema-evolution branch.

thanks,
derek

(*) however that's not to say wide-scale-production-deployment
management as a whole isn't necessary - we just see it as a separate
issue. and one compatible with this project. (output from sqlevolve
would be saved into whatever system you wish to use)

Russell Keith-Magee

Sep 26, 2007, 8:17:23 PM

to django-d...@googlegroups.com

On 9/27/07, Derek Anderson <pub...@kered.org> wrote:
>
> ok, for the official status:
>
> as mentioned earlier, yuri and i are joining forces on
> introspection-driven schema evolution. and since automatic
> introspection is so controversial (and obviously a poison pill for
> acceptance into django-proper), we're rewriting it as an external library.

I'm not sure where you got the idea that automatic introspection is
the issue - the proposal I put forward included automatic
introspection. The issue has always been the aka syntax, and the
consequences of that syntax on the overall design.

However, that said, moving everything to a Google Code project is a
good move. It encourages a clean separation of concerns, and means you
don't have to keep remerging the whole tree as the trunk updates.

I encourage anyone interested to play with all the schema-evolution
proposals out there, including Derek's. If one of the alternatives
gets wide community support, we can look at merging it into trunk.

Derek - on a housekeeping issue - are you happy for us to write-lock
the schema-evolution branches and close the outstanding tickets in
Django's ticket database relating to that branch?

Yours,
Russ Magee %-)

Russell Keith-Magee

Sep 26, 2007, 8:52:36 PM

to django-d...@googlegroups.com

On 9/27/07, SmileyChris <smile...@gmail.com> wrote:
>
> On Sep 27, 6:18 am, "Yuri Baburov" <burc...@gmail.com> wrote:

> I'm actually waiting for some code to be put up to
> http://code.google.com/p/django-evolution/ - Russell (I'm pretty sure
> it was him... we discussed it at the end of the sprint) has someone
> who has been working on some code and I really want to see it before I
> do any more work.

Yup - that's me. Or rather, it's Ben, and I'm providing occasional
moral support.

> I like Derek's take on model comparison using introspection, but
> really think that it should be based on a migration system. I know
> several Django developers agree with me on this (including Russell).

I'm not anti-introspection. Introspection can be very useful,
especially as a tool that can be used to determine the migrations that
are required at any given stage. However, I don't think that
introspection is the end of the story. Any evolution system that
doesn't take into account manual SQL migrations and the sequential
nature of the migration process is, IMHO, flawed, and I haven't seen a
pure-introspection solution that has been able to take these issues
into account.

Yours,
Russ Magee %-)

Derek Anderson

Sep 26, 2007, 9:13:09 PM

to django-d...@googlegroups.com

Russell Keith-Magee wrote:
> I'm not sure where you got the idea that automatic introspection is
> the issue - the proposal I put forward included automatic
> introspection. The issue has always been the aka syntax, and the
> consequences of that syntax on the overall design.

because several of the complaints you and others have listed would
apply to any automatic introspection implementation. (most memorably
the 'round-trip-change' data issue) but if i'm mistaken i apologize.

btw, i must have missed your proposal. (unless you're referring to your
8/4/2007 email?) link?

> Derek - on a housekeeping issue - are you happy for us to write-lock
> the schema-evolution branches and close the outstanding tickets in
> Django's ticket database relating to that branch?

if you don't mind, i'd prefer you keep them open, lest as you suggest we
get sudden fame and fortune and merging re-enters the realm of distant
feasibility. :) (maintaining should be pretty easy anyway, once the
majority of it is separated. i'll make that call once we release, i'm
just not ready to commit to it yet)

derek

Russell Keith-Magee

Sep 26, 2007, 10:06:02 PM

to django-d...@googlegroups.com

On 9/27/07, Derek Anderson <pub...@kered.org> wrote:
>

> Russell Keith-Magee wrote:
> > I'm not sure where you got the idea that automatic introspection is
> > the issue - the proposal I put forward included automatic
> > introspection. The issue has always been the aka syntax, and the
> > consequences of that syntax on the overall design.
>
> because several of the complaints you've and others have listed would
> apply to any automatic introspection implementation. (most memorably
> the 'round-trip-change' data issue) but if i'm mistaken i apologize.

My objection was to the equivalence between A->B->C and A->C under
your proposal. This is mostly a consequence of the limitations of the
aka-syntax as applied to introspection. I have no problem with the
concept of introspection itself, and the proposal I floated
specifically mentions introspection as a tactic that can be used, both
for validation and for identifying required migrations.

> btw, i must have missed your proposal. (unless you're referring to your
> 8/4/2007 email?) link?

http://groups.google.com/group/django-developers/browse_thread/thread/da7831d08081d7b7/49347e67d22fa3cc?rnum=1#49347e67d22fa3cc

> > Derek - on a housekeeping issue - are you happy for us to write-lock
> > the schema-evolution branches and close the outstanding tickets in
> > Django's ticket database relating to that branch?
>
> if you don't mind, i'd prefer you keep them open, lest as you suggest we
> get sudden fame and fortune and merging re-enters the realm of distant
> feasibility. :) (maintaining should be pretty easy anyway, once the
> majority of it is separated. i'll make that call once we release, i'm
> just not ready to commit to it yet)

No problems.

Yours,
Russ Magee %-)

Derek Anderson

Sep 26, 2007, 10:57:04 PM

to django-d...@googlegroups.com

russell, i've re-read your linked email (from 8/4/07), and i'm still
totally lost as to what you're proposing. i'm reading of application
and verification of an SQL-abstraction syntax:

mutations = [
    AddColumn('Author', 'dateofbirth', models.DateField,
              initial_value=None),
    DeleteColumn('Author', 'age')
]

combined with a signature list embedded in the Meta class in the model.
(btw, you said earlier that "in_the_model==bad".... ???)

which is ok and all, but there is nothing on how to actually generate
these mutations, except for the suggestion of calling it "syncdb --hint"
(instead of "sqlevolve" i presume). is it up to the user to write them?
where is the actual introspection/generation bit?

derek

Russell Keith-Magee

Sep 27, 2007, 12:46:12 AM

to django-d...@googlegroups.com

On 9/27/07, Derek Anderson <pub...@kered.org> wrote:
>
> russell, i've re-read your linked email (from 8/4/07), and i'm still
> totally lost as to what you're proposing. i'm reading of application
> and verification of an SQL-abstraction syntax:
>
> mutations = [
>     AddColumn('Author', 'dateofbirth', models.DateField,
>               initial_value=None),
>     DeleteColumn('Author', 'age')
> ]
>
> combined with a signature list embedded in the Meta class in model.
> (btw, you said earlier that "in_the_model==bad".... ???)

It needs to be _somewhere_. Meta is one option, but there are better
options. I was suggesting Meta as a better location than where you
were/are putting the settings (i.e., on the attributes themselves).
However, out of the model entirely would be preferable (and is
entirely possible).

> which is ok and all, but there is nothing on how to actually generate
> these mutations, except for the suggestion of calling it "syncdb --hint"
> (instead of "sqlevolve" i presume). is it up to the user to write them?
> where is the actual introspection/generation bit?

Introspection isn't an essential part of what I was proposing, but it
certainly could be used in two places:

1) Validation of final state - that the final state of the table after
migration matches the model definition. The AddColumn/DeleteColumn
mutations know what modifications they will make to the model
signatures, and this can be used as a form of soft validation.
However, the ultimate validation is that the tables in the database
match expectation at the end of a migration process. This can't be
done without introspection.

2) Reconciliation of modifications that have been made outside the
Django framework - i.e., if you tweak the tables outside of Django,
the signature in the Evolution table won't match the actual database
state. As a result, there will be a need for an evolution step to
'update the Evolution table'. This absolutely requires introspection,
as the state of a model definition is irrelevant to the problem.
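
A minimal sketch of the "validation of final state" idea in (1), for
illustration only. It assumes SQLite and uses its PRAGMA table_info
statement for introspection; the helper names (table_columns,
validate_final_state) and the expected-column dictionary are
hypothetical and are not part of any proposal in this thread.

```python
import sqlite3


def table_columns(conn, table):
    """Return {column_name: declared_type} read from the live database."""
    # PRAGMA table_info yields rows of (cid, name, type, notnull, dflt_value, pk).
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    return {row[1]: row[2].upper() for row in rows}


def validate_final_state(conn, table, expected):
    """Compare introspected columns with what the model definition expects.

    Returns (missing, unexpected, mismatched) so each kind of drift can be
    reported separately.
    """
    actual = table_columns(conn, table)
    missing = sorted(set(expected) - set(actual))
    unexpected = sorted(set(actual) - set(expected))
    mismatched = sorted(
        name
        for name in set(expected) & set(actual)
        if expected[name].upper() != actual[name]
    )
    return missing, unexpected, mismatched


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT, age INTEGER)")

# Hypothetical expectation after a migration that swaps 'age' for 'dateofbirth'.
expected = {"id": "INTEGER", "name": "TEXT", "dateofbirth": "DATE"}
result = validate_final_state(conn, "author", expected)
print(result)
```

Here the migration has not been applied yet, so the check reports
'dateofbirth' as missing and 'age' as unexpected.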

Introspection could also be used as a replacement for the tag-tracking
function of the Evolution table. However, the Evolution table is a
much easier development task, and I think there is some value in
keeping the migration process documented in the database.

Yours,
Russ Magee %-)

Derek Anderson

Sep 27, 2007, 1:13:29 AM

to django-d...@googlegroups.com

ok, yes, introspection we all agree has to be used for validation (1)
and signature creation (2). we're all on the same page as that being a
good thing.

but what SoC2006, my subsequent work and what i think most people here
are talking about when we say "evolution through introspection" is about
using introspection to determine *what needs* to change, not verifying
what has already changed. in your proposal, how do you write your
"mutations" list? the developer? or an introspection-based program?

if the former, just say so.
if the latter, how?

derek

Russell Keith-Magee

Sep 27, 2007, 1:55:08 AM

to django-d...@googlegroups.com

On 9/27/07, Derek Anderson <pub...@kered.org> wrote:
>
> ok, yes, introspection we all agree has to be used for validation (1)
> and signature creation (2). we're all on the same page as that being a
> good thing.

Wait up a moment - (2) is an entirely optional part of my proposal. My
proposal primarily operates by developing signatures from the _model_
- i.e., turning models.py into a format that can be easily serialized
into the database, and compared to identify the required migrations.

> but what SoC2006, my subsequent work and what i think most people here
> are talking about when we say "evolution through introspection" is about
> using introspection to determine *what needs* to change, not verifying
> what has already changed. in your proposal, how do you write your
> "mutations" list? the developer? or an introspection-based program?
>
> if the former, just say so.
> if the latter, how?

The latter, by comparing the signature of the models.py that you have
with the signature in the Evolution table. The evolution table
contains the signature of the last model that was sync'd; if this
doesn't correspond to the current model, you need migrations to update
the database.
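
A rough sketch of that comparison, for illustration only. It assumes a
signature is a plain dictionary of model name to {field name: field
type}; that layout, the diff_signatures helper and the "ChangeColumn"
label are hypothetical. Only the AddColumn and DeleteColumn names come
from the proposal quoted earlier, and here they are just string labels.

```python
def diff_signatures(stored, current):
    """Return a list of (operation, model, field) hints.

    `stored` stands for the signature saved at the last sync and `current`
    for the signature built from models.py now.
    """
    hints = []
    for model in sorted(set(stored) | set(current)):
        old_fields = stored.get(model, {})
        new_fields = current.get(model, {})
        # Fields present now but absent from the stored signature.
        for field in sorted(set(new_fields) - set(old_fields)):
            hints.append(("AddColumn", model, field))
        # Fields in the stored signature that the model no longer has.
        for field in sorted(set(old_fields) - set(new_fields)):
            hints.append(("DeleteColumn", model, field))
        # Fields present in both whose recorded attributes differ.
        for field in sorted(set(old_fields) & set(new_fields)):
            if old_fields[field] != new_fields[field]:
                hints.append(("ChangeColumn", model, field))
    return hints


stored = {"Author": {"name": "CharField", "age": "IntegerField"}}
current = {"Author": {"name": "CharField", "dateofbirth": "DateField"}}
hints = diff_signatures(stored, current)
print(hints)
```

Note that no database introspection is involved: both inputs are data
structures derived from model definitions.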

Database introspection isn't required to get this approach to work,
_unless_ there is someone futzing with the table definitions outside
of the evolution framework (this is case (2) from my last message). If
this is happening, then introspection is required as you need to go to
the actual table definitions to find out what is in the database,
rather than relying upon 'what I put there the last time I ran
syncdb'.

Yours,
Russ Magee %-)

Derek Anderson

Sep 27, 2007, 3:01:58 AM

to django-d...@googlegroups.com

Russell Keith-Magee wrote:
> The latter, by comparing the signature of the models.py that you have
> with the signature in the Evolution table. The evolution table
> contains the signature of the last model that was sync'd; if this
> doesn't correspond to the current model, you need migrations to update
> the database.

ok, i get that. i didn't realize you meant storing the entire model
structure in the db. (might want to avoid calling this a signature for
clarity, but that's semantics)

so now you have two models to diff. how do you detect renames?

derek

SmileyChris

Sep 27, 2007, 3:38:04 AM

to Django developers

On Sep 27, 12:52 pm, "Russell Keith-Magee" <freakboy3...@gmail.com>
wrote:

> On 9/27/07, SmileyChris <smileych...@gmail.com> wrote:
>
> > On Sep 27, 6:18 am, "Yuri Baburov" <burc...@gmail.com> wrote:
> > I'm actually waiting for some code to be put up to
> > http://code.google.com/p/django-evolution/ - Russell (I'm pretty sure
> > it was him... we discussed it at the end of the sprint) has someone
> > who has been working on some code and I really want to see it before I
> > do any more work.
>
> Yup - that's me. Or rather, it's Ben, and I'm providing occasional
> moral support.
>
> > I like Derek's take on model comparison using introspection, but
> > really think that it should be based on a migration system. I know
> > several Django developers agree with me on this (including Russell).
>
> I'm not anti-introspection. Introspection can be very useful,
> especially as a tool that can be used to determine the migrations that
> are required at any given stage. However, I don't think that
> introspection is the end of the story.

Maybe I wasn't very clear, but this is exactly what I meant too.

Russell Keith-Magee

Sep 27, 2007, 4:11:45 AM

to django-d...@googlegroups.com

On 9/27/07, Derek Anderson <pub...@kered.org> wrote:
>

> Russell Keith-Magee wrote:
> > The latter, by comparing the signature of the models.py that you have
> > with the signature in the Evolution table. The evolution table
> > contains the signature of the last model that was sync'd; if this
> > doesn't correspond to the current model, you need migrations to update
> > the database.
>
> ok, i get that. i didn't realize you meant storing the entire model
> structure in the db. (might want to avoid calling this a signature for
> clarity, but that's semantics)

It's not the _model_ per se - just a rendition of the significant data
in the model.

> so now you have two models to diff. how do you detect renames?

AFAICT, you _can't_ detect a rename with 100% reliability.
Whatever scheme you propose will be subject to either:

- False positives: identifying the deletion of X and addition of Y as
an X->Y rename because the non-name attributes of X and Y are
identical, or

- False negatives: identifying an X->Y rename as a deletion of X and
an addition of Y because the attributes of X and Y aren't similar
enough.

Either outcome is undesirable for a real database. This is why the
'hint and tweak' approach is so essential.
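
Both failure modes can be reproduced with a toy heuristic. This is only
a sketch: guess_renames and the (type, max_length) attribute tuples are
hypothetical, and the heuristic simply pairs a deleted field with an
added field whose attributes are identical.

```python
def guess_renames(old_fields, new_fields):
    """Pair each deleted field with an added field of identical attributes."""
    deleted = {n: a for n, a in old_fields.items() if n not in new_fields}
    added = {n: a for n, a in new_fields.items() if n not in old_fields}
    guesses = []
    for old_name in sorted(deleted):
        for new_name in sorted(added):
            if deleted[old_name] == added[new_name]:
                guesses.append((old_name, new_name))
                del added[new_name]  # each added field is matched at most once
                break
    return guesses


# False positive: the developer dropped 'nickname' and added an unrelated
# 'city', but identical attributes make it look like a rename.
false_positive = guess_renames(
    {"nickname": ("CharField", 50)},
    {"city": ("CharField", 50)},
)

# False negative: 'name' really was renamed to 'full_name', but the column
# was widened at the same time, so the attributes no longer match.
false_negative = guess_renames(
    {"name": ("CharField", 50)},
    {"full_name": ("CharField", 100)},
)

print(false_positive, false_negative)
```

Loosening the match reduces false negatives at the cost of more false
positives, and vice versa, which is why a guess of this kind still
needs a hint from the developer.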

Yours,
Russ Magee %-)

Xan

Sep 27, 2007, 11:18:21 AM

to Django developers

Wow!!!
I think it is not yet clear _how_ things will be done: several
projects, different opinions, etc.
An ideal environment for creating a perfect (with deadlines ;-)) schema
migration

Regards to all of you,
Xan.

Derek Anderson

Sep 27, 2007, 1:07:22 PM

to django-d...@googlegroups.com

Russell Keith-Magee wrote:
> It's not the _model_ per se - just a rendition of the significant data
> in the model.

a rose by any other name.... (but yes, i assumed you meant not the
actual textual rendition, but a data structure containing all the
database-relevant attributes of the model)

>> so now you have two models to diff. how do you detect renames?
>
> AFAICT, you _can't_ detect a rename with with 100% reliability.
> Whatever scheme you propose will be subject to either:

> [...]

> Either outcome is undesirable for a real database. This is why the
> 'hint and tweak' approach is so essential.

agreed 100% that fully detecting renames without hinting is
impossible. in fact i argue that fully detecting renames is impossible
under any circumstances. (see corollary A)

==============================================================
so we have two remaining issues here:
1) how to store all database-relevant attributes of the
model in the database itself
2) how to let the user 'hint and tweak' at renames
==============================================================

it would seem we're agreed on everything else. let's see what we can do
to come together.

== issue #1 ==
originally i had planned something virtually identical to what you're
proposing now. push all relevant information into a data structure, and
store that in the database. but the key insight was as follows:

"all relevant information" by definition IS the schema (x)

and according to DRY (not to mention just general programming common
sense), if you have the information, but just in an inconvenient form,
transform it (from the most authoritative source). _don't_ re-store it.
because syncing issues are hard (which you've already acknowledged in
your "people may muck with the schema behind django's back" issue).
(note corollary B)

Q1: do you agree that IF claim(x) is true, we should use the schema?
Q2: do you agree with claim(x)?

== issue #2 ==
when do you ask for the hints? during development, in the code? or
during deployment, via user interaction with syncdb?

i argue that "via user interaction with syncdb" is bad for the following
reasons:
a) i don't want to have to leave notes for my deployment monkeys
saying "make sure you match column A with column C". this is
just asking for deployment problems.
b) i want that information stored in CVS (or SVN, etc)

so that leaves storing it in the codebase _somewhere_ as the better
option, IMHO.
Q3: do you agree with this? ^^^

::: corollary A :::
for the proof that no renaming scheme is perfect... (ie, no
false-positives & no false-negatives):
consider FRANK deletes a field, and re-adds the same field before running
"sqlevolve" or "syncdb --hint" or whatever. he commits both changes to
his VCS as v1 and v2, so it's real. because there are no globally
unique field ids, no system can detect and handle this change
consistently without peeking into his VCS. by manual db management, all
the data in the field is lost. via all automatic evolution systems, the
data is not. you are guaranteed data-inconsistency.
(however schema-consistency is attainable)

Q4: do you agree with this? ^^^
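
The delete-and-re-add case in corollary A can be played out on toy
data. This is a sketch only: rows are plain dictionaries, a model
version is just a list of field names, and both helper functions are
hypothetical stand-ins for "manual db management" and "automatic
evolution".

```python
def apply_step_by_step(rows, versions):
    """Replay every committed model version in order (manual management)."""
    for fields in versions:
        # Keep only the fields of this version; new fields start as None.
        rows = [{f: row.get(f) for f in fields} for row in rows]
    return rows


def apply_by_diff(rows, first, last):
    """Look only at the first and last versions (automatic evolution)."""
    dropped = set(first) - set(last)
    added = set(last) - set(first)
    result = []
    for row in rows:
        new_row = {k: v for k, v in row.items() if k not in dropped}
        for field in added:
            new_row[field] = None
        result.append(new_row)
    return result


v0 = ["id", "note"]   # starting point
v1 = ["id"]           # 'note' deleted and committed
v2 = ["id", "note"]   # the same field re-added and committed
rows = [{"id": 1, "note": "hello"}]

manual = apply_step_by_step(rows, [v1, v2])
automatic = apply_by_diff(rows, v0, v2)
print(manual, automatic)
```

Both results have the same columns, but only the diff-based path still
holds the original 'note' value: the schemas agree while the data does
not.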

::: corollary B :::
note that i'm not against storing historical model or schema
representations to the DB, just that for the current representation, we
should use the schema over using our own stored "we think the DB looks
like this" data structure, as the schema is the most authoritative source.

derek

p.s. long emails on complicated topics....details get murky. so i've
labeled the four major questions {Q1-4} so we can further isolate where
we do and do not see eye2eye. :)

Paul Davis

Sep 27, 2007, 2:45:43 PM

to django-d...@googlegroups.com

Hey everyone,

Hopefully I've read up enough to jump into this conversation, but if I
haven't then feel free to blast me as is appropriate. If I offend
anyone, try and remember I'm only trying to show my opinions on
different issues. And you're more than welcome to fire back. ;)

So for some reason schema evolution is causing quite a bit of
controversy among the django crowd. At first I couldn't figure out why
until I started reading about the large range of ideas for
implementation. But then, the more I read, the more confused I became
again.

As near as I can tell these are the main issues that don't seem to be resolved:

1. Balancing ease of use with power of use (Ie, Alice vs. Carol)
2. Level of versioning: Model vs. Application vs. Entire database
3. Application of versions: v1 -> v2 -> v3 =? v1 -> v3
4. How do we represent a migration from state 1 to state 2

These are ordered near as I can tell in roughly the order we have to
overcome to have something merged into the trunk.

#1 is a no-brainer. We want both. And we won't settle for less.

#2 There is a lot of talking around this issue but I haven't seen
anyone take it full on.
We have to version the entire schema as a single object. Anyone in
doubt, I refer to SVN's repository wide versioning vs CVS's file
specific versioning.

#3 There are those that think v1->v2->v3 == v1->v3. I am not one of
those people.
A database is defined (in my mind) by its schema and the data it
contains. Migrations are non-linear. Each migration script must be
run.

#4 This is minor, but I felt it necessary to say that we're obviously
going to need python scripts to go from state to state. And custom sql
would be easy to integrate.
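
To make #3 and #4 concrete, here is a small sketch of migrations
written as Python functions and run one after another. Everything in it
(the column names, the MIGRATIONS registry, the migrate and
schema_only_jump helpers) is hypothetical and only meant to show why
v1 -> v2 -> v3 can differ from v1 -> v3 once data is involved.

```python
def v1_to_v2(rows):
    """Add 'full_name', filled from the existing 'first' and 'last' columns."""
    return [dict(r, full_name=f"{r['first']} {r['last']}") for r in rows]


def v2_to_v3(rows):
    """Drop the now-redundant 'first' and 'last' columns."""
    return [{k: v for k, v in r.items() if k not in ("first", "last")} for r in rows]


# Hypothetical registry: the script stored under N upgrades version N to N + 1.
MIGRATIONS = {1: v1_to_v2, 2: v2_to_v3}


def migrate(rows, from_version, to_version):
    """Run every migration script between the two versions, in order."""
    for version in range(from_version, to_version):
        rows = MIGRATIONS[version](rows)
    return rows


def schema_only_jump(rows, target_fields):
    """Go straight to the target schema with no knowledge of the v2 step."""
    return [{f: r.get(f) for f in target_fields} for r in rows]


rows_v1 = [{"id": 1, "first": "Ada", "last": "Lovelace"}]
sequential = migrate(rows_v1, 1, 3)
direct = schema_only_jump(rows_v1, ["id", "full_name"])
print(sequential, direct)
```

Both paths end with the same columns, but only the sequential path
carries the names across, because the data transformation lives in the
v1 -> v2 script.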

=======

So those are the main points near as I can tell. Now I'll go over some
of the popular topics I saw:

1. The use of introspection:
Will introspection be used? Of course. Will it be *all* that's used? Of
course not. Introspection will be useful for testing state. It is
possible that we could also add some 'best-guesses' on how to go from
one state to the next, whether someone chooses to accept our guesses
without checking is their problem (Alice). Obviously spitting out
things like "I cannot guess what needs to be done" would be useful. I
tend to think the django crowd would go along with this sort of thing.

2. The famous 'aka' syntax. For those of you out there saying that
this is absurd and doesn't provide any information, I'm thankful. The
syntax is horrible. Remember my quip about SVN vs. CVS versioning?
This is like some ultra brain dead system that tries to version
individual lines in a file. I mean, not even CVS is that bad.

=====

In order to keep things short this is all I'm gonna say for now. This
email is already too long.

Thanks,
Paul Davis

Derek Anderson

Sep 27, 2007, 4:13:56 PM

to django-d...@googlegroups.com

Paul Davis wrote:
>
> As near as I can tell these are the main issues that don't seem to be resolved:
>
> 1. Balancing ease of use with power of use (Ie, Alice vs. Carol)
> 2. Level of versioning: Model vs. Application vs. Entire database
> 3. Application of versions: v1 -> v2 -> v3 =? v1 -> v3
> 4. How do we represent a migration from state 1 to state 2

> #1 is a nobrainer. We want both. And we won't settle for less.

agreed. this is why fingerprinting, versioning and controlled
deployment scripts are all optional in my implementation. (as i feel
they must be to support alice)

> #2 There is a lot of talking around this issue but I haven't seen
> anyone take it full on.
> We have to version the entire schema as a single object. Anyone in
> doubt, I refer to SVN's repository wide versioning vs CVS's file
> specific versioning.

i agree that at _least_ applications need to be versioned together. i
lean towards agreeing that we should version entire schemas, but w/
django not having any existing inter-app dependency mechanism (other
than they just won't work), i was reluctant to mandate it. mostly
agree, but i think the ramifications need to be better thought out
before we push it on anyone.

> #3 There are those that think v1->v2->v3 == v1->v3. I am not one of
> those people.
> A database is defined (in my mind) by its schema and the data it
> contains.

i think it depends on who you are. for alice at least, can you agree it
will be true? for the majority of apps/developers too, i'd argue. but
for dispersed deployments, i've conceded before that support needs to be
there for v1->v2->v3. (and it is supported via my controlled deployment
scripts)

> Migrations are non-linear.

thank $DEITY someone else said this. :) imho any controlled
deployment system that _only_ supports lists of sequential versions is DOA.

> #4 This is minor, but I felt it necessary to say that we're obviously
> going to need python scripts to go from state to state. And custom sql
> would be easy to integrate.

agreed.

> So those are the main points near as I can tell. Now I'll go over some
> of the popular topics I saw:
>
> 1. The use of introspection:
> Will introspection be used? Of course. Will it be *all* thats used? Of
> course not.

agreed, but i think the controversy is over using introspection to
_generate_ our guesses. (heck even syncdb uses introspection - no one's
arguing we shouldn't use any)

> 2. The famous 'aka' syntax. For those of you out there saying that
> this is absurd and doesn't provide any information, I'm thankful. The
> syntax is horrible. Remember my quip about SVN vs. CVS versioning?
> This is like some ultra brain dead system that trys to version
> individual lines in a file. I mean, not even CVS is that bad.

you're confused as to its purpose. it is certainly *not* an attempt to
version anything. it's there as a necessary hint to the guess_generation
routine. i think everyone acknowledges we need some sort of a hinting
syntax on top of the version tracking if we want to do any generation of
change scripts... the question is where?

the reasons i put it where i did are as follows:
1) easy to use / grok for everyone, even for alice.
2) all field referencing has some ambiguity, if you want to
support fields that no longer exist.. best to have only
one side ambiguous by attaching it directly to the other.

i agree it would be a brain-dead VCS implementation, but it's not. so
that being understood, is it more palatable now?

derek

Derek Anderson

Sep 27, 2007, 3:47:28 PM

to django-d...@googlegroups.com

Paul Davis wrote:
>
> As near as I can tell these are the main issues that don't seem to be resolved:
>
> 1. Balancing ease of use with power of use (Ie, Alice vs. Carol)
> 2. Level of versioning: Model vs. Application vs. Entire database
> 3. Application of versions: v1 -> v2 -> v3 =? v1 -> v3
> 4. How do we represent a migration from state 1 to state 2

> #1 is a nobrainer. We want both. And we won't settle for less.

agreed. this is why fingerprinting, versioning and controlled
deployment scripts are all optional in my implementation. (as i feel
they must be to support alice)

> #2 There is a lot of talking around this issue but I haven't seen
> anyone take it full on.
> We have to version the entire schema as a single object. Anyone in
> doubt, I refer to SVN's repository wide versioning vs CVS's file
> specific versioning.

i agree that at _least_ applications need to be versioned together. i

lean towards agreeing that we should version entire schemas, but w/
django not having any existing inter-app dependency mechanism (other
than they just won't work), i was reluctant to mandate it. mostly
agree, but i think the ramifications need to be better thought out
before we push it on anyone.

> #3 There are those that think v1->v2->v3 == v1->v3. I am not one of

> those people.
> A database is defined (in my mind) by its schema and the data it
> contains.

i think it depends on who you are. for alice at least, can you agree it

will be true? for the majority of apps/developers too, i'd argue. but
for disburse deployments, i've conceded before that support needs to be
there for v1->v2->v3. (and it is supported via my controlled deployment
scripts)

> Migrations are non-linear.

thank $DEITY someone else said this. :) imho any controlled deployment
system that _only_ supports lists of sequential versions is DOA.

> #4 This is minor, but I felt it necessary to say that we're obviously

> going to need python scripts to go from state to state. And custom sql
> would be easy to integrate.

agreed.

> So those are the main points near as I can tell. Now I'll go over some
> of the popular topics I saw:
>
> 1. The use of introspection:
> Will introspection be used? Of course. Will it be *all* thats used? Of
> course not.

agreed, but i think the controversy is over using introspection to

_generate_ our guesses. (heck even syncdb uses introspection - noone's
arguing we shouldn't use any)

> 2. The famous 'aka' syntax. For those of you out there saying that

> this is absurd and doesn't provide any information, I'm thankful. The
> syntax is horrible. Remember my quip about SVN vs. CVS versioning?
> This is like some ultra brain dead system that trys to version
> individual lines in a file. I mean, not even CVS is that bad.

you're confused as to it's purpose. it is certainly *not* an attempt to

Paul Davis

Sep 27, 2007, 6:28:09 PM

to django-d...@googlegroups.com

On 9/27/07, Derek Anderson <pub...@kered.org> wrote:
>

> Paul Davis wrote:
> >
> > As near as I can tell these are the main issues that don't seem to be
> > resolved:
> >
> > 1. Balancing ease of use with power of use (i.e., Alice vs. Carol)
> > 2. Level of versioning: Model vs. Application vs. Entire database
> > 3. Application of versions: v1 -> v2 -> v3 =? v1 -> v3
> > 4. How do we represent a migration from state 1 to state 2
>
> > #1 is a nobrainer. We want both. And we won't settle for less.
>
> agreed. this is why fingerprinting, versioning and controlled
> deployment scripts are all optional in my implementation. (as i feel
> they must be to support alice)
>

I think the main difference is how we want to provide for Alice.
You seem to want to remove the 'difficult' parts for Alice.
I'd just make those parts automagically generated. I also wonder if
making scripts optional isn't an effect of some of your other design
decisions.

> > #2 There is a lot of talking around this issue but I haven't seen
> > anyone take it full on.
> > We have to version the entire schema as a single object. Anyone in
> > doubt, I refer to SVN's repository wide versioning vs CVS's file
> > specific versioning.
>
> i agree that at _least_ applications need to be versioned together. i
> lean towards agreeing that we should version entire schemas, but w/
> django not having any existing inter-app dependency mechanism (other
> than they just won't work), i was reluctant to mandate it. mostly
> agree, but i think the ramifications need to be better thought out
> before we push it on anyone.
>

This is a pretty good point about app dependencies. Although I think
having a dependency tracker here isn't required by the migration, it
would help with errors that would invariably crop up when a migration
script ran outside the installed apps setting.

> > #3 There are those that think v1->v2->v3 == v1->v3. I am not one of
> > those people.
> > A database is defined (in my mind) by its schema and the data it
> > contains.
>
> i think it depends on who you are. for alice at least, can you agree it
> will be true? for the majority of apps/developers too, i'd argue. but
> for disburse deployments, i've conceded before that support needs to be
> there for v1->v2->v3. (and it is supported via my controlled deployment
> scripts)
>

No, I sure won't agree on this. Given two identical databases, and two
different people running the migration scripts, the resulting
databases should be equal. And if that's not true simply because of
*when* they ran the scripts, then the system would be broken.
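
A small sketch of why the order of steps can matter once data is involved. Everything here is an assumption made for illustration: rows are plain dictionaries, the migrations are plain functions, and `naive_diff` stands in for any tool that jumps straight to the target schema by comparing column names only.

```python
# Hypothetical sketch: rows are plain dicts, migrations are plain functions.

def v1_to_v2(rows):
    """Rename pub_date -> publication_date, keeping the stored values."""
    return [
        {("publication_date" if key == "pub_date" else key): value
         for key, value in row.items()}
        for row in rows
    ]

def v2_to_v3(rows):
    """Add a 'purpose' column with an empty default."""
    return [dict(row, purpose="") for row in rows]

def naive_diff(rows, target_columns):
    """Jump straight to the target schema by comparing column names only."""
    return [{col: row.get(col, "") for col in target_columns} for row in rows]


v1_rows = [{"question": "What's up?", "pub_date": "2007-09-27"}]
v3_columns = ["question", "publication_date", "purpose"]

stepwise = v2_to_v3(v1_to_v2(v1_rows))  # someone who applied every script
direct = naive_diff(v1_rows, v3_columns)  # someone who skipped v2

print(stepwise)
print(direct)
```

Both results have the v3 columns, but only the stepwise path still holds the publication date. That is the sense in which two databases with the same schema can still differ.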

> > Migrations are non-linear.
>
> thank $DEITY someone else said this. :) imho any controlled
> deployment system that _only_ supports lists of sequential versions is DOA.
>

I think this is related to who we're designing for. You seem to be
designing for Alice. I want to design for Carol. Of course I'm running
my thoughts through the "Will this prevent us from abstracting things
for Alice?" test, but she's not my primary concern.
Specifically, what's to prevent us from requiring sequential versions
that have a script at each step? Nothing. We just autogenerate the
scripts with our guesses and Alice trusts them.
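
One way to picture "sequential versions with a script at each step" is the sketch below. It is an illustration under stated assumptions, not either implementation discussed in this thread: the `schema_version` table, the `MIGRATIONS` list and the `migrate` function are hypothetical, and the standard-library `sqlite3` module stands in for the real database backend.

```python
# Hypothetical sketch of a sequential migration runner using sqlite3 (stdlib).
import sqlite3

# One script per step; index i upgrades the schema from version i to i + 1.
MIGRATIONS = [
    "CREATE TABLE poll (id INTEGER PRIMARY KEY, question TEXT)",
    "ALTER TABLE poll ADD COLUMN pub_date TEXT",
    "ALTER TABLE poll ADD COLUMN purpose TEXT",
]

def current_version(conn):
    """Read the stored schema version, creating the bookkeeping row if needed."""
    conn.execute("CREATE TABLE IF NOT EXISTS schema_version (version INTEGER)")
    row = conn.execute("SELECT version FROM schema_version").fetchone()
    if row is None:
        conn.execute("INSERT INTO schema_version VALUES (0)")
        return 0
    return row[0]

def migrate(conn, target=None):
    """Apply every pending step in order and return the resulting version."""
    target = len(MIGRATIONS) if target is None else target
    version = current_version(conn)
    for step in range(version, target):
        conn.execute(MIGRATIONS[step])
        conn.execute("UPDATE schema_version SET version = ?", (step + 1,))
    conn.commit()
    return max(version, target)

def columns(conn):
    """List the column names of the poll table."""
    return [row[1] for row in conn.execute("PRAGMA table_info(poll)")]


# One database is upgraded release by release, the other all at once.
conn_a = sqlite3.connect(":memory:")
migrate(conn_a, target=1)
migrate(conn_a)

conn_b = sqlite3.connect(":memory:")
migrate(conn_b)

print(columns(conn_a) == columns(conn_b))
```

Because both databases run the same scripts in the same order, it does not matter when each one is upgraded. Whether the scripts are hand-written or autogenerated guesses is independent of the runner.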

> > So those are the main points near as I can tell. Now I'll go over some
> > of the popular topics I saw:
> >
> > 1. The use of introspection:
> > Will introspection be used? Of course. Will it be *all* that's used? Of
> > course not.
>
> agreed, but i think the controversy is over using introspection to
> _generate_ our guesses. (heck, even syncdb uses introspection - no one's
> arguing we shouldn't use any)
>

I'm all for using introspection to generate guesses. But only in cases
where we can reasonably make that guess. (At this point I'm assuming
that trivial 'Alice' changes will generally be guessable)

> > 2. The famous 'aka' syntax. For those of you out there saying that
> > this is absurd and doesn't provide any information, I'm thankful. The
> > syntax is horrible. Remember my quip about SVN vs. CVS versioning?
> > This is like some ultra brain dead system that tries to version
> > individual lines in a file. I mean, not even CVS is that bad.
>
> you're confused as to its purpose. it is certainly *not* an attempt to
> version anything. it's there as a necessary hint to the guess_generation
> routine. i think everyone acknowledges we need some sort of a hinting
> syntax on top of the version tracking if we want to do any generation of
> change scripts... the question is where?
>
> the reasons i put it where i did are as follows:
> 1) easy to use / grok for everyone, even for alice.
> 2) all field referencing has some ambiguity, if you want to
> support fields that no longer exist.. best to have only
> one side ambiguous by attaching it directly to the other.
>
> i agree it would be a brain-dead VCS implementation, but it's not. so
> that being understood, is it more palatable now?
>

I don't agree that we need a hint syntax. If we can't guess what Alice
did, we tell her that she did something we can't figure out. Then we
tell her where to resolve this issue. Alice has used django without
migrations. Now that she has migrations, we do as much as possible for
her, but no more. But I digress...
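
As a rough illustration of "guess where we can, otherwise tell Alice", the sketch below returns operations for an unambiguous change and raises an error for an ambiguous one. The names `AmbiguousChange` and `guess_or_report` are hypothetical, and the rule used (a removed column and an added column of the same type might be a rename) is only one possible heuristic. It compares column names and types only, so a type change on an existing column is not detected.

```python
# Hypothetical sketch: guess when the change is unambiguous, otherwise report it.

class AmbiguousChange(Exception):
    """Raised when a rename cannot be told apart from a drop plus an add."""

def guess_or_report(old_columns, new_columns):
    """Both arguments map column name -> column type (plain strings)."""
    removed = {n: t for n, t in old_columns.items() if n not in new_columns}
    added = {n: t for n, t in new_columns.items() if n not in old_columns}
    for new_name, new_type in sorted(added.items()):
        candidates = sorted(n for n, t in removed.items() if t == new_type)
        if candidates:
            raise AmbiguousChange(
                "%r may be new, or a rename of %s; "
                "please edit the generated migration script by hand"
                % (new_name, ", ".join(candidates))
            )
    operations = [("add", n, t) for n, t in sorted(added.items())]
    operations += [("drop", n) for n in sorted(removed)]
    return operations


old = {"id": "integer", "question": "varchar", "pub_date": "datetime"}

# Unambiguous: a column was added and nothing was removed.
print(guess_or_report(old, dict(old, purpose="varchar")))

# Ambiguous: one datetime column disappeared and another appeared.
renamed = {"id": "integer", "question": "varchar", "publication_date": "datetime"}
try:
    guess_or_report(old, renamed)
except AmbiguousChange as exc:
    print("cannot guess:", exc)
```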

The fact that the aka tells what the name was in a previous version
kinda does mean it's version information. Throwing out the hint syntax
and allowing the possibility for imperfect guesses makes this OK.
We'll work on making our guesses pretty good and try to cover 99% of
the Alice and Ben use cases. And that, coupled with a couple of FAQs
for the corner cases that cause too much ambiguity, I think will be
good enough for them. And Carol still gets to rock and roll with a
powerful migration system.

So no, it's still not palatable. Although I think it's an issue that we
can resolve. The only reason you have really presented for keeping it is
that it makes things easier. I think the fact that it's inherently
wrong should outweigh the 'easiness' factor. (Wrong because it's
working on a per-column basis. All of our migration steps should be
done on a per-schema basis.)

That being said, I'm going to start looking through your
implementation as well as the dbmigration package to see what parts I
do like and then I'll be able to start actually making more
constructive suggestions. (Come on, I wasn't gonna stay on the outside
throwing rocks for very long)

Paul Davis

Russell Keith-Magee

Sep 27, 2007, 8:43:50 PM

to django-d...@googlegroups.com

On 9/28/07, Derek Anderson <pub...@kered.org> wrote:
>
> p.s. long emails on complicated topics....details get murky. so i've
> labeled the four major questions {Q1-4} so we can further isolate where
> we do and do not see eye2eye. :)

Rather than descend into another few weeks of multi-page emails where
we misunderstand each other's interpretation of terms like 'signature'
and 'introspection', I think the most productive thing I can do at
this point is to concentrate on helping my friend Ben get a prototype
out the door so that we can all have a discussion based upon actual
code, rather than a quick-and-dirty mailing list specification.

I've seen your code, and pointed out where I see problems. I hope to
be able to return the favour very soon. Then it will be use cases at
10 paces. :-)

Yours,
Russ Magee %-)

Derek Anderson

Sep 27, 2007, 10:25:19 PM

to django-d...@googlegroups.com

well, this i just don't understand. plenty of programming topics
considerably more challenging than this are solved via listserv every
day in the open source world. i'd rather have a public discussion
incorporating everyone's needs, ideas and concerns, not just "you and
ben" deciding between yourselves what you think is best...and then some
sort of wild-west-style publicity shootout.

to the other people who have weighed into this discussion...am i alone
thinking along these lines?

derek

Russell Keith-Magee

Sep 27, 2007, 11:09:01 PM

to django-d...@googlegroups.com

On 9/28/07, Derek Anderson <pub...@kered.org> wrote:
>
> well, this i just don't understand. plenty of programming topics
> considerably more challenging than this are solved via listserv every
> day in the open source world. i'd rather have a public discussion
> incorporating everyone's needs, ideas and concerns, not just "you and
> ben" deciding between yourselves what you think is best...and then some
> sort of wild-west-style publicity shootout.

Putting the melodrama aside, that isn't what I meant at all. My point was:

1) Working code is fundamentally less confusing than hand-wavy discussions

2) Python affords relatively simple development of prototype code

3) Writing well considered multi-page missives takes a lot of time and effort.

4) With the many proposals that have been thrown around, and the
multiple conflicting interpretations of various key terms, the
discussion waters are now very muddy.

5) The single easiest way to unmuddy the waters is to show you a
working version of what I am driving at.

6) We could avoid having yet another round of multi-page emails if we
only had some working code that allows me to say "This is what I
mean", and for you and others to say "That bit there doesn't work for
use case X"

I don't for a second expect that the prototype code that Ben is
producing will be merged as-is into trunk. However, it will contain
the minimum elements that I consider to be essential, and a "first
draft" attempt at implementing those features. As a result of mailing
list feedback, I fully expect that changes will be required. However,
it's a lot easier to provide feedback on a working design than on a
vague functional spec.

> to the other people who have weighed into this discussion...am i alone
> thinking along these lines?

Feel free to discuss whatever you want with whomever you want. I have
little else to add that I haven't already said, and all the discussion
in the world won't get code written.

If you have the patience, stamina, and time to continue the
discussion, power to you. If nothing else, it may iron out some issues
so that when I _do_ have something to show, there is some sort of
consensus.

Myself - I'm going to retreat into getting something done. You'll hear
from me again when I have something for show and tell.

Yours,
Russ Magee %-)

Marty Alchin

Sep 28, 2007, 8:58:20 AM

to django-d...@googlegroups.com

I won't get into the discussion on features or implementation yet, but
I do have to agree that working code speaks volumes compared to
descriptions. I'll have need for schema evolution in a future project,
so I've been following these discussions, and I've completely lost
track of who's working on what, and how each offering is expected to
work.

Personally, what I'd like to see is this:

* Each view is presented with working code. It doesn't have to be
feature complete, but it should do some basic jobs to illustrate how it
would work.
* Along with the code, give a list of features it provides. Again, it
doesn't have to be all the features it will ever provide, just those
that are currently implemented.
* A brief (yes, brief) description of why you chose a particular
approach. This way, if two people come up with different approaches to
the same problem, it's easier to compare the reasoning behind them.
* If possible, some sort of feature matrix comparing the states of the
various implementations. So if it looks like one group is tackling
half of the problems, while another group is tackling the other half,
maybe merging them could be easy.

It'd probably be easiest to do the descriptions on the Wiki, with each
proposal having its own article. The current SchemaEvolution article
tries to cover everybody at once, and it does a poor job of organizing
information. I would suggest that article be used for descriptions of
the problems that need to be solved by schema migration, the feature
matrix comparing the implementations, and links to individual articles
that provide more detail for each implementation.

Each article would then describe the implementation's motivations and
tactics, example usage code, and links to relevant supporting
articles, previous discussions, and working code. There's merit in
documenting future plans in each individual article, but I would like
to see the main SchemaEvolution article only cover those features that
are actually finished.

I'd be willing to set this up if there's support for it, but if you
guys don't want to maintain it, I won't bother. It's just one
suggestion, but I think it would help.

-Gul

Derek Anderson

Sep 28, 2007, 12:26:38 PM

to django-d...@googlegroups.com

hey marty,

i agree. see:
http://code.djangoproject.com/wiki/SchemaEvolutionDocumentation

maybe another day or two for the stand-alone version...

derek

Xan

Sep 28, 2007, 3:57:17 PM

to Django developers

Russell, I'm only a non-programmer user, but I have a question out of
curiosity: why do you (all of the core developers) implement schema
evolution inside the model that is evolving, and not outside the model?

Why not this? If you have:

from django.db import models

class Poll(models.Model):
    question = models.CharField(maxlength=200)
    pub_date = models.DateTimeField('date published')
    author = models.CharField(maxlength=200)

and you want to rename 'pub_date' to 'publication_date', add a 'purpose'
field, and change the DateTimeField to a DateField, why not:

class Migration(models.Migration):
    Poll.changes = [
        Poll.rename('pub_date', 'publication_date'),
        Poll.addfield(purpose=models.CharField(maxlength=100)),
        Poll.modify(publication_date=models.DateField('date published')),
    ]

Well, with a more polished syntax, but you understand me.

Migration migrates Poll in the database and writes the models file.
A desktop user then has an evolved model.
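
To show what change operations declared outside the model could look like in practice, here is a self-contained sketch. It is purely illustrative: the classes `RenameField`, `AddField` and `AlterField` are hypothetical and are not Django API. The table name `polls_poll` and the PostgreSQL-style SQL are also assumptions. Each operation knows how to render itself as one SQL statement, and the list is applied in order.

```python
# Hypothetical sketch: change operations declared outside the model and
# rendered to SQL. Class names, table name and SQL dialect are assumptions.
from dataclasses import dataclass


@dataclass
class RenameField:
    old: str
    new: str

    def sql(self, table):
        return 'ALTER TABLE "%s" RENAME COLUMN "%s" TO "%s";' % (
            table, self.old, self.new)


@dataclass
class AddField:
    name: str
    column_type: str

    def sql(self, table):
        return 'ALTER TABLE "%s" ADD COLUMN "%s" %s;' % (
            table, self.name, self.column_type)


@dataclass
class AlterField:
    name: str
    column_type: str

    def sql(self, table):
        return 'ALTER TABLE "%s" ALTER COLUMN "%s" TYPE %s;' % (
            table, self.name, self.column_type)


# The three changes from the example above, in the order they must run.
POLL_CHANGES = [
    RenameField("pub_date", "publication_date"),
    AddField("purpose", "varchar(100)"),
    AlterField("publication_date", "date"),
]


def render(table, changes):
    """Return one SQL statement per change, preserving order."""
    return [change.sql(table) for change in changes]


for statement in render("polls_poll", POLL_CHANGES):
    print(statement)
```

Keeping the operations in an ordered list matters here: the type change refers to the column by its new name, so it has to come after the rename.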

Thanks,
Xan.

On Sep 28, 2:43 am, "Russell Keith-Magee" <freakboy3...@gmail.com>
wrote:
