stucco_evolution 0.4 released


Daniel Holth

Apr 19, 2012, 9:14:26 AM
to sqlal...@googlegroups.com
stucco_evolution 0.4 has been released. It is a migration tool for SQLAlchemy that attempts to deal with packaged dependencies having their own migration scripts. Reading -> as "depends on",

web application -> openid package -> users/groups package
web application -> users/groups package

When asked to upgrade the web application, stucco_evolution will topologically sort its dependencies, run all the migrations for the users/groups package, then run the migrations for the openid package, and finally run the migrations for the web application. As long as the dependency migrations are constrained in what they change, this works. Foreign key relationships can point in the direction of the -> without problems.
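A minimal sketch of that ordering step, assuming Python 3.9+ for the standard-library `graphlib` module. This is not stucco_evolution's actual code; the package names and the `upgrade_order` helper are made up for illustration.

```python
from graphlib import TopologicalSorter

# Hypothetical package names from the example above. Each key maps to the
# set of packages it depends on (the targets of its "->" arrows).
DEPENDS_ON = {
    "web_application": {"openid", "users_groups"},
    "openid": {"users_groups"},
    "users_groups": set(),
}


def upgrade_order(depends_on):
    """Return packages ordered so every dependency precedes its dependents."""
    return list(TopologicalSorter(depends_on).static_order())


order = upgrade_order(DEPENDS_ON)
print(order)
```

Migrations would then be run package by package in that order, so the users/groups tables exist before anything that holds a foreign key to them.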

Let me know if you've tried it, or know of another package that attempts to deal with non-monolithic database migration.

Michael Bayer

Apr 19, 2012, 1:43:59 PM
to sqlal...@googlegroups.com
If you've seen my recent talks you saw that I'm a little skeptical of what you're terming "non-monolithic" databases. Let's say this means a database with a set of tables maintained by entirely different packages, but with the possibility of dependencies between those tables. If I understand correctly, if we were dealing with sets of tables that didn't have any dependency, you wouldn't need a distributed migration tool; each package would handle migrations for its own set of tables independently, is that right?

I think what I need to see here is, what exactly are these packages, outside of the Django community, that actually create their own tables yet encourage dependencies between those tables and your app's own tables? I know people are working on them since I see people asking questions about those use cases, but what are they? What's the openid and user/groups package you're thinking of here?

In the development world I've always lived in, we just don't have third party libraries that bring in their own sub-schemas. Up until now the thinking has been, if it's significant enough that it is part of your datamodel, it's part of what you should own yourself, though certainly drawing upon past recipes. For me to understand the use case of stucco_evolution I think I first need to see the light on the use case of these third party apps, which so far seem vague to me.



--
You received this message because you are subscribed to the Google Groups "sqlalchemy" group.
To view this discussion on the web visit https://groups.google.com/d/msg/sqlalchemy/-/D_bztOahVBQJ.
To post to this group, send email to sqlal...@googlegroups.com.
To unsubscribe from this group, send email to sqlalchemy+...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/sqlalchemy?hl=en.

Daniel Holth

Apr 19, 2012, 2:23:47 PM
to sqlal...@googlegroups.com
On Thursday, April 19, 2012 1:43:59 PM UTC-4, Michael Bayer wrote:
> If you've seen my recent talks you saw that I'm a little skeptical of what you're terming "non-monolithic" databases. Let's say this means a database with a set of tables maintained by entirely different packages, but with the possibility of dependencies between those tables. If I understand correctly, if we were dealing with sets of tables that didn't have any dependency, you wouldn't need a distributed migration tool; each package would handle migrations for its own set of tables independently, is that right?
>
> I think what I need to see here is, what exactly are these packages, outside of the Django community, that actually create their own tables yet encourage dependencies between those tables and your app's own tables? I know people are working on them since I see people asking questions about those use cases, but what are they? What's the openid and user/groups package you're thinking of here?

I think you are right, there isn't anything outside of the Django world that does this; stucco_evolution is my attempt to bring something like that kind of re-use to my non-Django-powered world, and as far as I can tell*, I am its only user.

* Koders code search

Admittedly so far the only use case is the users/groups schema where the application attaches a separate user profile table, just like Django. The relationships always go in only one direction: the dependent schema holds a foreign key referencing the dependency schema.

It really is possible to distribute the entire user management interface as a separately maintained package, while still being able to get at user.profile in your app, but you won't be able to perform migrations that change the user table's primary key. It's probably more useful that stucco_evolution makes sure the users table is simply created first.

In the openid case, an openid package manages a users_openids table instead of adding an openid column to the users table.
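To make the one-directional relationships concrete, here is a rough sketch using the standard-library `sqlite3` module. The table and column names other than `users_openids` are assumptions for illustration, not the schema of any real package.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default

# Owned by the (hypothetical) users/groups package; must be created first.
conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT NOT NULL UNIQUE)"
)

# Owned by the (hypothetical) openid package: a separate table that points
# at users, instead of an extra column on the users table.
conn.execute(
    """
    CREATE TABLE users_openids (
        openid TEXT PRIMARY KEY,
        user_id INTEGER NOT NULL REFERENCES users (id)
    )
    """
)

# Owned by the application: a profile table that also points at users.
conn.execute(
    """
    CREATE TABLE user_profiles (
        user_id INTEGER PRIMARY KEY REFERENCES users (id),
        display_name TEXT
    )
    """
)

conn.execute("INSERT INTO users (id, username) VALUES (1, 'alice')")
conn.execute("INSERT INTO users_openids VALUES ('https://example.org/alice', 1)")


def insert_orphan_openid(connection):
    """Try to reference a user that does not exist; return True if rejected."""
    try:
        connection.execute(
            "INSERT INTO users_openids VALUES ('https://example.org/bob', 99)"
        )
    except sqlite3.IntegrityError:
        return True
    return False
```

Every foreign key lives in the dependent schema, so the users/groups package never needs to know which other packages reference it.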

So there you have it. It very well may be that there is exactly one use case for this package, but who doesn't need to keep track of users and groups? Other than that it does a passable job of applying hand-written linear database upgrades, and it is short.

Michael Bayer

Apr 19, 2012, 2:40:59 PM
to sqlal...@googlegroups.com

On Apr 19, 2012, at 2:23 PM, Daniel Holth wrote:

>
> So there you have it. It very well may be that there is exactly one use case for this package, but who doesn't need to keep track of users and groups? Other than that it does a passable job of applying hand-written linear database upgrades, and it is short.

that it is, and the surprise here is....repoze.evolution ! yikes !

so I guess with these two systems, writing the "scripts" is totally up to the developer, is that right ?

There's a lot that alembic could bring into this. There's all the Alembic ops and dialect-agnostic directives (DDL abstraction). There's migration modes that either talk directly to a database or generate a SQL Script. There's the tools to create new scripts and a more sophisticated versioning scheme too (repoze.evolution seems to use an incrementing integer).

It almost seems like Alembic could integrate with repoze.evolution though I'm not sure if that's useful. You could certainly use Alembic's DDL abstractions directly in this system with a couple of lines if nothing else.


Daniel Holth

Apr 19, 2012, 3:20:55 PM
to sqlal...@googlegroups.com

> It almost seems like Alembic could integrate with repoze.evolution though I'm not sure if that's useful. You could certainly use Alembic's DDL abstractions directly in this system with a couple of lines if nothing else.


My little project doesn't care about DDL, it just passes your script a connection. I didn't consider Alembic when I wrote stucco_evolution in 2010 but I wouldn't mind using it now. At the time I just needed something that didn't scare me. repoze.evolution is fine, it is only 98 lines of code, 17 of which I actually execute. Its design abstracts out the kind of thing that is being upgraded, so you could write another kind of EvolutionManager() to upgrade filesystems if you felt like it.
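For readers unfamiliar with this style of tool, the following is a generic sketch of hand-written linear upgrades keyed by an incrementing integer, where each script simply receives a connection. It does not use the repoze.evolution or stucco_evolution APIs; the `schema_version` table and the function names are hypothetical.

```python
import sqlite3


def evolve_1(conn):
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT)")


def evolve_2(conn):
    conn.execute("ALTER TABLE users ADD COLUMN email TEXT")


# Index i holds the script that brings the schema to version i + 1.
SCRIPTS = [evolve_1, evolve_2]


def get_version(conn):
    """Read the stored schema version, initialising it to 0 if absent."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS schema_version (version INTEGER NOT NULL)"
    )
    row = conn.execute("SELECT version FROM schema_version").fetchone()
    if row is None:
        conn.execute("INSERT INTO schema_version (version) VALUES (0)")
        return 0
    return row[0]


def upgrade(conn):
    """Run every script newer than the stored version, in order."""
    current = get_version(conn)
    for number, script in enumerate(SCRIPTS[current:], start=current + 1):
        script(conn)
        conn.execute("UPDATE schema_version SET version = ?", (number,))
    conn.commit()
    return get_version(conn)


conn = sqlite3.connect(":memory:")
```

Calling `upgrade(conn)` a second time is a no-op, because the stored version already equals the number of scripts.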

Michael Bayer

Apr 19, 2012, 3:54:38 PM
to sqlal...@googlegroups.com
My surprise at repoze.evolution wasn't related to a perception of quality, it was that I'd never heard of it before. I might have tried making Alembic work with it out of the box if I knew about it.

Basically I find it intriguing that Alembic might be further opened up to support other kinds of migrations.

Wichert Akkerman

Apr 20, 2012, 4:52:12 AM
to sqlal...@googlegroups.com, Michael Bayer
On 04/19/2012 07:43 PM, Michael Bayer wrote:
> If you've seen my recent talks you saw that I'm a little skeptical of
> what you're terming "non-monolithic" databases. Let's say this
> means, a database with a set of tables maintained by entirely
> different packages, but with the possibility of dependencies between
> those tables. If I understand correctly, if we were dealing with
> sets of tables that didn't have any dependency, you wouldn't need a
> distributed migration tool, each package would handle migrations for
> its own set of tables independently, is that right ?

That suggests that every package would have its own migration tool,
which is not very practical from a sysadmin point of view. When I am
upgrading an application I want to be able to run all necessary
migrations for all components of an application in one run. I do not
want to be required to figure out which packages an application was
running and then migrate them all separately. So I definitely see a need
for an upgrade framework that can deal with multiple packages.

> I think what I need to see here are, what exactly are these packages,
> outside of the Django community, that actually create their own tables
> yet encourage dependencies between those tables and your app's own
> tables ? I know people are working on them since I see people asking
> questions about those use cases, but what are they ? What's the
> openid and user/groups package you're thinking of here ?

s4u.image is such an example: https://github.com/2style4you/s4u.image .
That package implements an image store which supports on-demand scaling
of images. Metadata is stored in a SQL database and commonly you add
references to images to other tables. Every site we build uses s4u.image
to manage image handling. This happens to be in-house developed by us,
but for all intents and purposes it is a third-party package to our
front-end developers.

> In the development world I've always lived in, we just don't have
> third party libraries that bring in their own sub-schemas. Up until
> now the thinking has been, if it's significant enough that it is part
> of your datamodel, it's part of what you should own yourself, though
> certainly drawing upon past recipes.

I suspect a difference is that we are often building different sites
that build on shared common functionality. Our main business is building
sites that deal with online fashion, so everything we build has to deal
with things like images and clothing articles. The code to handle those
has been split out to separate packages (s4u.image is one of those) that
define core datamodels and some logic, and our sites build on those.
Sometimes we extend the base models, for example when for a particular
site we need to track extra data for clothing, and sometimes we use the
base models as-is and reference them directly via relationships and
foreign keys. That results in an ecosystem of many different packages
and sites that each evolve in their own way and require their own
migrations.

When we upgrade a site our process is pretty simple: upgrade version
pins for buildout, rerun buildout, run upgrade-script, tell mod_wsgi to
reload. The upgrade-script walks through all migrations from all
packages a site uses so we have a single interface for administrators to
upgrade everything. The upgrade framework itself is extremely minimal
(see https://github.com/2style4you/s4u.upgrade ), but works well enough
for us. Note that we deviate from stucco_evolution in three important
ways: we do not use versioning but require upgrade steps to test if an
upgrade is necessary, our upgrade framework is not tied to SQLAlchemy
but has a more generic requirements-system so you can use it for other
things (we use it for filesystem changes and SOLR configuration as well
for example), and it does not support dependencies. Personally I
consider the first two to be desirable qualities for an upgrade
framework. Dependencies are something that we will probably need to add
at some point.
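A version-less upgrade step of the kind described above might look roughly
like this. It is a sketch using the standard-library `sqlite3` module, not
the s4u.upgrade API; the `images` table, the `caption` column and both
function names are hypothetical.

```python
import sqlite3


def column_exists(conn, table, column):
    """Check the live schema rather than a stored version number."""
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    return any(row[1] == column for row in rows)  # row[1] is the column name


def add_image_caption(conn):
    """Hypothetical upgrade step: add images.caption only if it is missing.

    Returns True if the schema was changed, False if nothing was needed.
    """
    if column_exists(conn, "images", "caption"):
        return False
    conn.execute("ALTER TABLE images ADD COLUMN caption TEXT")
    return True


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE images (id INTEGER PRIMARY KEY, path TEXT)")
```

Because the step inspects the database itself, it can be run on every
upgrade without any bookkeeping table.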

Wichert.

Chris Withers

Apr 20, 2012, 1:37:05 PM
to sqlal...@googlegroups.com, Michael Bayer
On 19/04/2012 18:43, Michael Bayer wrote:
> but with the possibility of dependencies between those tables. If I
> understand correctly, if we were dealing with sets of tables that didn't
> have any dependency, you wouldn't need a distributed migration tool,
> each package would handle migrations for its own set of tables
> independently, is that right ?

This is what I came up with for mortar_rdb: each package defined a
"source" of tables and each application collected the sources it was
using into a "config". Each "source" has its own schema. I haven't hit a
situation where I needed the topological sort yet, I suspect if I did
I'd just punt and make the owner of the config (ie: the application)
specify the upgrade order manually...

> I think what I need to see here are, what exactly are these packages,

Authentication, authorization and "membership" are the obvious ones; all
three have several interchangeable solutions and I can certainly
conceive packages for each interoperating with a few select foreign
keys, username being the most obvious...

Admittedly, I haven't hit this in the "real world" yet.

I'm very keen to move mortar_rdb onto Alembic, and will be doing so as
soon as I hit the need in the real world...

Chris

--
Simplistix - Content Management, Batch Processing & Python Consulting
- http://www.simplistix.co.uk

Michael Bayer

Apr 20, 2012, 3:44:55 PM
to Wichert Akkerman, sqlal...@googlegroups.com

On Apr 20, 2012, at 4:52 AM, Wichert Akkerman wrote:

>
> That suggests that every package would have its own migration tool, which is not very practical from a sysadmin point of view. When I am upgrading an application I want to be able to run all necessary migrations for all components of an application in one run. I do not want to be required to figure out which packages an application was running and then migrate them all separately. So I definitely see a need for an upgrade framework that can deal with multiple packages.

In my view every package which deals with its own schema objects would at least have to maintain its own migration files - not its own migration "tool", but assuming an Alembic setup, each would have at least a rudimentary Alembic environment and individual migration files. You could then run each migration environment individually, or write a short coordination script within the main application that calls upon all of them. Assuming the packages either have no schema dependencies on each other, or dependencies without cycles, the correct "order" in which the sets of scripts are run could just be hardcoded within the main application.

I think when one writes an application M that makes use of libraries A, B, and C, it's not unreasonable that M would have to include some top-level configuration for A, B, and C, that is, adding each one to a list of packages in which to locate an Alembic environment and run upgrades. Or there would be some other usage contract between A, B, C and M that allows for publishing of "migration" handles.
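A coordination script along those lines could be as small as the sketch below. The `.ini` paths and the `upgrade_all` helper are hypothetical; the only Alembic calls used are `alembic.config.Config` and `alembic.command.upgrade`, imported lazily so the ordering logic can be tried without Alembic installed.

```python
from typing import Callable, List, Sequence

# Hypothetical paths: one Alembic config per package, listed by hand with
# dependencies first and the main application M last.
ALEMBIC_CONFIGS = [
    "lib_a/alembic.ini",
    "lib_b/alembic.ini",
    "lib_c/alembic.ini",
    "app_m/alembic.ini",
]


def alembic_upgrade(ini_path: str) -> None:
    """Upgrade one Alembic environment to its latest revision."""
    from alembic import command
    from alembic.config import Config

    command.upgrade(Config(ini_path), "head")


def upgrade_all(
    ini_paths: Sequence[str],
    run: Callable[[str], None] = alembic_upgrade,
) -> List[str]:
    """Run each environment's upgrade in the given order; stop on first error."""
    completed = []
    for ini_path in ini_paths:
        run(ini_path)
        completed.append(ini_path)
    return completed
```

The main application would call `upgrade_all(ALEMBIC_CONFIGS)`. When the environments share one database, each would also need its own version table so they do not overwrite each other's recorded revision.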

What I don't see is that application M has within it migration scripts specific to A, B and C. A, B and C should maintain the knowledge of their own schemas and how they need to be upgraded for new versions of A, B and C I would think.

The one thing that's needed as far as Alembic is concerned is the ability to control the name of the actual "migration" table per environment, this is a short feature add that's been sitting as an enhancement request for some time now.


Jeff Dairiki

Apr 20, 2012, 6:52:42 PM
to sqlal...@googlegroups.com, Michael Bayer
On Fri, Apr 20, 2012 at 03:44:55PM -0400, Michael Bayer wrote:
>
> The one thing that's needed as far as Alembic is concerned is the
> ability to control the name of the actual "migration" table per
> environment, this is a short feature add that's been sitting as an
> enhancement request for some time now.

Since you mention it, I posted patches to

https://bitbucket.org/zzzeek/alembic/issue/34/make-version-table-name-configurable

a while ago and was awaiting feedback on them (until I forgot about them).
The patches, in addition to supporting a configurable version table name,
also support two-column version tables which can be shared between Alembic
environments.

(If you want to veto the two-column version table idea, I can whittle
it down to just the configurable-version-table-name part pretty easily.)
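One way to picture a shared two-column version table is sketched below with
the standard-library `sqlite3` module. The table layout, names and revision
strings are guesses for illustration only and are not taken from the patches
attached to the issue.

```python
import sqlite3


def set_version(conn, environment, revision):
    """Record the current revision for one environment, replacing any old one."""
    conn.execute(
        "INSERT OR REPLACE INTO migration_versions (environment, revision) "
        "VALUES (?, ?)",
        (environment, revision),
    )


def get_version(conn, environment):
    """Return the recorded revision for an environment, or None if unknown."""
    row = conn.execute(
        "SELECT revision FROM migration_versions WHERE environment = ?",
        (environment,),
    ).fetchone()
    return row[0] if row else None


conn = sqlite3.connect(":memory:")
# One row per environment instead of one single-row table per environment.
conn.execute(
    "CREATE TABLE migration_versions ("
    "environment TEXT PRIMARY KEY, revision TEXT NOT NULL)"
)
set_version(conn, "users_groups", "3a1f")
set_version(conn, "web_application", "9c2e")
set_version(conn, "users_groups", "5b7d")  # users_groups moves forward alone
```

Each environment reads and writes only its own row, so several packages can
track their revisions in the same table without interfering.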

Cheers,
Jeff

Michael Bayer

Apr 20, 2012, 7:40:09 PM
to sqlal...@googlegroups.com

I really have to get used to bitbucket and the need to press the "follow" button on these issues, since I was totally unaware of this !

looking now (and checking other issues for missed activity)


