[GSoC] Proposal django.db schema alteration


xtrqt

Apr 6, 2011, 1:47:19 PM
to Django developers
Schema Alteration
=================
About Me
~~~~~~~~
I'm a final-year student at the Technical University of Lodz, Poland, in the
faculty of electronic engineering and computer science, and in parallel I'm
working on my second diploma in electronic engineering at Polytech de Nantes
in France. I've been using Python for 8 years, and after getting totally
frustrated with PHP and its frameworks, I decided to choose something else
for my webdev gigs. I've been with Django since version 0.96. I was always a
reader, never a committer; maybe I was too scared to develop something and
then have to convince people that my idea is right. I decided to change that
and take part in Django, the framework that made me like web development ;)
I know that it's a bit late to prove my qualities, but I believe I can
succeed in this project. I hope the research I've made for this proposal will
convince you ;)

Experience
~~~~~~~~~~
Databases: I worked for many years in a MySQL environment, developing a small
framework for my private use that generates SQL for CRUD operations
automatically. Although I wrote it a few years ago, it still runs on some
sites. I used SQLite as a backend in a school project last year, through the
raw C interface. I have also learned a bit of the Oracle SQL flavour.

Open source: As I said before, I don't have much experience as an active
member of any open source community, but I mostly use open source software
and I have always followed closely what happens in the Django project and in
other projects.

Python: I try to use Python everywhere I can, from daily sysadmin chores to
writing strange DNS-tunneling scripts, with Django app development at the
center ;)

Background
~~~~~~~~~~
For as long as I can remember, Ruby on Rails, and many other frameworks that
took their design from RoR, have supported introspected database migrations.
Django, on the other hand, despite very good schema generation, lacks any
mechanism for changing the schema after it has been generated. This makes it
hard to write an efficient application that manages database migrations.
I believe that such a mechanism should be part of the core django.db module:
it would make migration applications easier to write while keeping the
important alteration code inside Django.

Plan
~~~~
Implement database schema alteration inside the django.db module.

Rationale
~~~~~~~~~
This change would allow a higher level of management for all database
operations: utility applications like South would no longer need to maintain
their own code for preparing database changes. What is more, developing a new
migration tool or app would not immediately mean forking the code of an
existing migration app, which would otherwise create a huge amount of
unmanageable code in different repositories.
With the new database backend API, developing a backend would take more
effort, but supporting that backend in any migration utility would no longer
be a big problem.

Method
~~~~~~
At this time, all backends written in South are defined by inheriting from
the ``south.db.DatabaseOperations`` class, which defines all the important
interfaces for dealing with schema alteration:
* create_table(self, table_name, fields)
* rename_table(self, old_table_name, table_name)
* delete_table(self, table_name, cascade=True)
* clear_table(self, table_name)
* add_column(self, table_name, name, field, keep_default=True)
* alter_column(self, table_name, name, field, explicit_name=True, ignore_constraints=False)
* create_unique(self, table_name, columns)
* delete_unique(self, table_name, columns)
* foreign_key_sql(self, from_table_name, from_column_name, to_table_name, to_column_name)
* delete_foreign_key(self, table_name, column)
* create_index(self, table_name, column_names, unique=False, db_tablespace='')
* delete_index(self, table_name, column_names, db_tablespace='')
* delete_column(self, table_name, name)
* rename_column(self, table_name, old, new)
* delete_primary_key(self, table_name)
* create_primary_key(self, table_name, columns)
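
To give a rough idea of the shape of such an interface, here is a minimal
sketch of four of these operations written against SQLite with only the
standard library. It is not South's code: the class name
``SketchSchemaOperations``, the ``quote_name`` helper, the ``column_names``
inspection helper, and the use of plain (column_name, sql_type) pairs instead
of Django field instances are all assumptions made for illustration.

```python
import sqlite3


def quote_name(name):
    """Quote an identifier for SQLite, doubling any embedded double quotes."""
    return '"%s"' % name.replace('"', '""')


class SketchSchemaOperations:
    """Hypothetical, heavily simplified schema-alteration backend for SQLite."""

    def __init__(self, connection):
        self.connection = connection

    def create_table(self, table_name, fields):
        # Assumption: fields is a list of (column_name, sql_type) pairs,
        # not Django field instances as in South.
        columns = ", ".join(
            "%s %s" % (quote_name(name), sql_type) for name, sql_type in fields
        )
        self.connection.execute(
            "CREATE TABLE %s (%s)" % (quote_name(table_name), columns)
        )

    def rename_table(self, old_table_name, table_name):
        self.connection.execute(
            "ALTER TABLE %s RENAME TO %s"
            % (quote_name(old_table_name), quote_name(table_name))
        )

    def add_column(self, table_name, name, sql_type):
        self.connection.execute(
            "ALTER TABLE %s ADD COLUMN %s %s"
            % (quote_name(table_name), quote_name(name), sql_type)
        )

    def delete_table(self, table_name):
        self.connection.execute("DROP TABLE %s" % quote_name(table_name))

    def column_names(self, table_name):
        # Inspection helper added for this sketch; not in the list above.
        rows = self.connection.execute(
            "PRAGMA table_info(%s)" % quote_name(table_name)
        )
        return [row[1] for row in rows]


conn = sqlite3.connect(":memory:")
ops = SketchSchemaOperations(conn)
ops.create_table("author", [("id", "INTEGER PRIMARY KEY"), ("name", "TEXT")])
ops.add_column("author", "email", "TEXT")
ops.rename_table("author", "writer")
columns = ops.column_names("writer")
```

A caller only names tables and columns here; the SQL dialect stays inside the
backend class, which is the property a shared alteration API would rely on.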

On the other hand we have django.db.backends, which provides an API only for
creation. From here there are two ways to go. One is to define most of the
South API for alteration directly there, but that would make the API
inconsistent.

The preferred way would be to introduce a new API layer, modeled on South,
which would use the existing internal functions for generation and new
functions, still to be written, for alteration. This layer would be frozen
and made almost 'public', with well-written documentation, for use in
schema-modifying apps.

To sum up, we need to develop a new API layer that allows 3rd party
applications to create, alter, inspect and drop schema [all meta operations
on the database]. Django doesn't support alterations, and supports the drop
operation inconsistently, so these internals should also be implemented. All
cases in which Django modifies the schema internally should also be
refactored to use the new API.

For now there are not many references to the creation API in the Django code.
In fact it is used mainly in ``syncdb`` and ``testserver``. These two
management commands implement the loop that aggregates the SQL code needed
to create new model tables. We could imagine refactoring it to use the new
API. (I believe the API will remain open for discussion until the first week
of GSoC.)

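create model example
--------------------
    from django.db import transaction

    # `using` is moved before the defaulted arguments so the signature is valid
    def create_model(self, model, style, using, known_models=None,
                     commit=True):
        opts = model._meta
        # Unmanaged and proxy models have no table of their own
        if not opts.managed or opts.proxy:
            return {}

        # Assumes the class keeps a connection, as the creation classes do
        qn = self.connection.ops.quote_name
        # Proposed helper: maps model fields to column definitions and
        # collects the relations that still have to be resolved
        fields, rel = transform_model_fields_to_database(opts.local_fields)
        self.create_table(using, style.SQL_TABLE(qn(opts.db_table)), fields)

        if commit:
            transaction.commit_unless_managed(using=using)

        return rel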
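
To illustrate how a ``syncdb``-style loop could drive a call like this, here
is a small self-contained sketch. It does not use Django at all: the
``MODELS`` list is a hypothetical stand-in for ``model._meta``, and this
``create_model`` and ``sync`` are simplified stand-ins written for the
example. The point is only the control flow: each call creates a table
directly and returns the relations whose target table does not exist yet, and
the loop collects them for later.

```python
import sqlite3

# Hypothetical stand-ins for models: (table, [(column, sql_type, target_table)]).
# Real code would read this information from model._meta instead.
MODELS = [
    ("book", [("id", "INTEGER PRIMARY KEY", None),
              ("author_id", "INTEGER", "author")]),
    ("author", [("id", "INTEGER PRIMARY KEY", None),
                ("name", "TEXT", None)]),
]


def create_model(connection, table, columns, known_tables):
    """Create one table and return relations whose target is not created yet."""
    body = ", ".join('"%s" %s' % (name, sql_type)
                     for name, sql_type, _ in columns)
    connection.execute('CREATE TABLE "%s" (%s)' % (table, body))
    known_tables.add(table)
    return [(table, name, target)
            for name, _, target in columns
            if target is not None and target not in known_tables]


def sync(connection, models):
    """Create every table, then list the deferred relations that can be resolved."""
    known_tables = set()
    pending = []
    for table, columns in models:
        pending.extend(create_model(connection, table, columns, known_tables))
    # Once all tables exist, relations pointing at known tables are resolvable.
    resolved = [rel for rel in pending if rel[2] in known_tables]
    return sorted(known_tables), resolved


conn = sqlite3.connect(":memory:")
tables, resolved = sync(conn, MODELS)
```

Because ``book`` is created before ``author`` here, its ``author_id``
relation is deferred by the first call and only becomes resolvable after the
loop has finished.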




Timeline
~~~~~~~~
* 1 week  -- design the API, define tests
* 1 week  -- wrap existing creation code to form the creation part of the API
* 2 weeks -- develop the base alteration API and the low-level alteration
             routines for the MySQL backend
* 1 week  -- develop low-level alteration routines for the SQLite backend
* 1 week  -- develop low-level alteration routines for the PostgreSQL backend
* 2 weeks -- develop the base drop API and drop routines for the backends
* 1 week  -- wrap existing inspection code to form the inspection part of the API
* 2 weeks -- write tests and documentation
* 1 week  -- buffer week for everything that is unexpected; if everything goes
             smoothly, we'll try to fork South to use the new API

Goals
~~~~~
As with any good project, we need some criteria by which to measure success:
* Implementation of alteration routines for the MySQL, PostgreSQL and SQLite
  backends
* Design and implementation of the new schema management API
* Tests and documentation for the whole API

Personal goal
~~~~~~~~~~~~~
Getting this code merged into Django 1.4 ;)

Please post any questions or comments; I'll be glad to reply.

You can contact me by the standard means:
email: jan.rzepecki (at) gmail (dot) com
jabber/gtalk: same as above
irc: I've just started to idle every day on #django and #django-dev under the
nick `xtrqt` (it is also my nick on the Django tracker).

Russell Keith-Magee

Apr 6, 2011, 8:09:04 PM
to django-d...@googlegroups.com
On Thu, Apr 7, 2011 at 1:47 AM, xtrqt <jan.rz...@gmail.com> wrote:
> Schema Alteration
> =================

> I hope research I've made for this
> proposal will convince you ;)

Consider me convinced :-)

This is a solid proposal -- it's a clearly defined need, and the work
you have described here sounds like it will be a good match for the
GSoC timeline.

However, we can't let you get away without asking at least one question [1] :-)

[1] http://djangocaptions.com/post/647587573

So:

Can you clarify your approach a little bit here? Are you proposing to
completely deprecate the existing backend.creation interface in favor
of a new "metaoperations" module? Or are you going to keep the old
creation module and introduce a new module that contains the 'missing'
operations like rename and alter? Or something else entirely?

I know that backend.creation isn't formally a stable API, but it's
been around for so long that it shouldn't be changed lightly. Even
though it isn't stable, changing the creation API would have
consequences -- for example, every external database backend would be
broken for 1.4 if they needed to upgrade to use a new creation API.

I'm not opposed to deprecating/modifying the existing creation
interface if necessary -- I'd just like you to elaborate on why it is
necessary, and whether there are any mitigation strategies (such as
redirecting the old interface to the new calls) that we can use to
help ease the migration process.

Yours,
Russ Magee %-)

xtrqt

Apr 7, 2011, 3:36:13 AM
to Django developers
On Apr 7, 2:09 am, Russell Keith-Magee <russ...@keith-magee.com> wrote:
> On Thu, Apr 7, 2011 at 1:47 AM, xtrqt <jan.rzepe...@gmail.com> wrote:
> However, we can't let you get away without asking at least one question [1] :-)
>
> [1]http://djangocaptions.com/post/647587573

Nice picture ;)

> Can you clarify your approach a little bit here? Are you proposing to
> completely deprecate the existing backend.creation interface in favor
> of a new "metaoperations" module? Or are you going to keep the old
> creation module and introduce a new module that contains the 'missing'
> operations like rename and alter? Or something else entirely?

I didn't make myself clear about that. I would like to introduce the new
features as an optional, 'publicly' available API layer. The old
SQL-generating API would persist as long as we need it, but would become
`internal internal`, in comparison to the `internal but almost public` new
API.

We could think about a different scenario, like the one you suggested, in
which the old interface is redirected to the new one. The major problem is
that the new solution operates at a higher level of abstraction than the old
one (it doesn't deal in SQL at all). So to make this redirection possible, we
would need:

sql_create_model -> [model_definition] ->
create_model -> [created Model in database, not committed] ->
get_sql_for_table -> [sql for Model] ->
rollback transaction ->
[sql_code_returned]

That is quite a lot of overhead, but not impossible.
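
The chain above can be sketched with SQLite and the standard library alone.
This is only an illustration under stated assumptions: ``sql_for_create`` is
a hypothetical name, the fields are plain (column_name, sql_type) pairs, and
reading the statement back from ``sqlite_master`` stands in for the
``get_sql_for_table`` step. The table is created inside an explicit
transaction, its SQL is read back, and the transaction is rolled back so
nothing persists.

```python
import sqlite3


def sql_for_create(connection, table_name, fields):
    """Return CREATE TABLE SQL by creating the table and rolling it back."""
    columns = ", ".join('"%s" %s' % (name, sql_type)
                        for name, sql_type in fields)
    connection.execute("BEGIN")
    try:
        # "create_model" step: the table exists, but is not committed
        connection.execute('CREATE TABLE "%s" (%s)' % (table_name, columns))
        # "get_sql_for_table" step: SQLite keeps the statement in sqlite_master
        rows = connection.execute(
            "SELECT sql FROM sqlite_master WHERE type = 'table' AND name = ?",
            (table_name,),
        ).fetchall()
        return rows[0][0]
    finally:
        # "rollback transaction" step: discard the table again
        connection.execute("ROLLBACK")


# isolation_level=None disables implicit transactions, so the explicit
# BEGIN/ROLLBACK above fully control the transaction.
conn = sqlite3.connect(":memory:", isolation_level=None)
sql = sql_for_create(
    conn, "author", [("id", "INTEGER PRIMARY KEY"), ("name", "TEXT")]
)
remaining = conn.execute("SELECT count(*) FROM sqlite_master").fetchone()[0]
```

This trick depends on the database rolling back DDL. SQLite and PostgreSQL
can do that, but MySQL commits implicitly on statements like CREATE TABLE, so
there the redirection would need a different approach.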

So the old functions [creation functions] will be preserved as they are coded
now, and for some time old applications that use the old style of calls will
still work. On the other hand, we should note that this should only cover the
transition between 1.3 and 1.4. Applications written for 1.4 should take only
the new interface into account. So we can refactor [I can see my role in
that] the creation code to match the style of the alteration code just before
the 1.5 release.

And as I said before, Django uses its creation interface in only 2 places,
so it shouldn't be hard to move this code to the new calling convention.

I hope I have answered all your questions.

I'm still open for new questions ;)

best regards
xtrqt