--
You received this message because you are subscribed to the Google Groups "sqlalchemy" group.
To view this discussion on the web visit https://groups.google.com/d/msg/sqlalchemy/-/D_bztOahVBQJ.
To post to this group, send email to sqlal...@googlegroups.com.
To unsubscribe from this group, send email to sqlalchemy+...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/sqlalchemy?hl=en.
If you've seen my recent talks you saw that I'm a little skeptical of what you're terming "non-monolithic" databases. Let's say this means, a database with a set of tables maintained by entirely different packages, but with the possibility of dependencies between those tables. If I understand correctly, if we were dealing with sets of tables that didn't have any dependency, you wouldn't need a distributed migration tool, each package would handle migrations for its own set of tables independently, is that right ?
I think what I need to see here are, what exactly are these packages, outside of the Django community, that actually create their own tables yet encourage dependencies between those tables and your app's own tables ? I know people are working on them since I see people asking questions about those use cases, but what are they ? What's the openid and user/groups package you're thinking of here ?
>
> So there you have it. It very well may be that there is exactly one use case for this package, but who doesn't need to keep track of users and groups? Other than that it does a passable job of applying hand-written linear database upgrades, and it is short.
that it is, and the surprise here is....repoze.evolution ! yikes !
so I guess with these two systems, writing the "scripts" is totally up to the developer, is that right ?
There's a lot that alembic could bring into this. There's all the Alembic ops and dialect-agnostic directives (DDL abstraction). There's migration modes that either talk directly to a database or generate a SQL Script. There's the tools to create new scripts and a more sophisticated versioning scheme too (repoze.evolution seems to use an incrementing integer).
It almost seems like Alembic could integrate with repoze.evolution though I'm not sure if that's useful. You could certainly use Alembic's DDL abstractions directly in this system with a couple of lines if nothing else.
That suggests that every package would have its own migration tool,
which is not very practical from a sysadmin point of view. When I am
upgrading an application I want to be able to run all necessary
migrations for all components of an application in one run. I do not
want to be required to figure out which packages an application was
running and then migrate them all separately. So I definitely see a need
for an upgrade framework that can deal with multiple packages.
> I think what I need to see here are, what exactly are these packages,
> outside of the Django community, that actually create their own tables
> yet encourage dependencies between those tables and your app's own
> tables ? I know people are working on them since I see people asking
> questions about those use cases, but what are they ? What's the
> openid and user/groups package you're thinking of here ?
s4u.image is such an example: https://github.com/2style4you/s4u.image .
That package implements an image store which supports on-demand scaling
of images. Metadata is stored in a SQL database and commonly you add
references to images to other tables. Every site we build uses s4u.image
to manage image handling. This happens to be in-house developed by us,
but for all intents and purposes it is a third-party package to our
front-end developers.
> In the development world I've always lived in, we just don't have
> third party libraries that bring in their own sub-schemas. Up until
> now the thinking has been, if it's significant enough that it is part
> of your datamodel, it's part of what you should own yourself, though
> certainly drawing upon past recipes.
I suspect a difference is that we are often building different sites
that build on shared common functionality. Our main business is building
sites that deal with online fashion, so everything we build has to deal
with things like images and clothing articles. The code to handle those
has been split out to separate packages (s4u.image is one of those) that
define core datamodels and some logic, and our sites build on those.
Sometimes we extend the base models, for example when for a particular
site we need to track extra data for clothing, and sometimes we use the
base models as-is and reference them directly via relationships and
foreign keys. That results in an ecosystem of many different packages
and sites that each evolve in their own way and require
their own migrations.
When we upgrade a site our process is pretty simple: upgrade version
pins for buildout, rerun buildout, run upgrade-script, tell mod_wsgi to
reload. The upgrade-script walks through all migrations from all
packages a site uses so we have a single interface for administrators to
upgrade everything. The upgrade framework itself is extremely minimal
(see https://github.com/2style4you/s4u.upgrade ), but works well enough
for us. Note that we deviate from stucco_evolution in three important
ways: we do not use versioning but require upgrade steps to test if an
upgrade is necessary, our upgrade framework is not tied to SQLAlchemy
but has a more generic requirements-system so you can use it for other
things (we use it for filesystem changes and SOLR configuration as well,
for example), and it does not support dependencies. Personally I
consider the first two to be desirable qualities for an upgrade
framework. Dependencies are something that we will probably need to add
at some point.
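To illustrate the "no versioning, each step tests whether it is needed" idea, here is a minimal sketch using only the standard library. It is not the s4u.upgrade API: the function names, the `image` table and the `width` column are all hypothetical, and SQLite stands in for whatever database is actually in use.

```python
import sqlite3


def column_exists(conn, table, column):
    """Return True if `table` already has a column named `column`."""
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    # Each row is (cid, name, type, notnull, dflt_value, pk).
    return any(row[1] == column for row in rows)


def add_image_width(conn):
    """Hypothetical upgrade step: add image.width only when it is missing."""
    if column_exists(conn, "image", "width"):
        return False  # schema is already current, nothing to do
    conn.execute("ALTER TABLE image ADD COLUMN width INTEGER")
    return True


def run_steps(conn, steps):
    """Run every step; each step decides for itself whether it applies."""
    return [step.__name__ for step in steps if step(conn)]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE image (id INTEGER PRIMARY KEY, path TEXT)")
first_run = run_steps(conn, [add_image_width])
second_run = run_steps(conn, [add_image_width])
```

Because the step inspects the schema instead of a stored version number, running the whole list twice is harmless: the second run finds nothing to do.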
Wichert.
This is what I came up with for mortar_rdb: each package defined a
"source" of tables and each application collected the sources it was
using into a "config". Each "source" has its own schema. I haven't hit a
situation where I needed the topological sort yet; I suspect if I did
I'd just punt and make the owner of the config (i.e. the application)
specify the upgrade order manually...
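For reference, the topological sort mentioned above is small if it is ever needed. The following is only a sketch, assuming Python 3.9+ for the standard `graphlib` module; the source names and their dependencies are made up for illustration and have nothing to do with mortar_rdb's actual API.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical sources: each name maps to the sources whose tables it
# references through foreign keys, so those must be upgraded first.
dependencies = {
    "users": set(),
    "images": set(),
    "articles": {"images"},
    "shop": {"users", "articles"},
}


def upgrade_order(deps):
    """Return the source names with every dependency before its dependents."""
    return list(TopologicalSorter(deps).static_order())


order = upgrade_order(dependencies)
```

If the dependencies contain a cycle, `static_order()` raises `graphlib.CycleError`, which is a reasonable point to fall back to an order specified by hand.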
> I think what I need to see here are, what exactly are these packages,
Authentication, authorization and "membership" are the obvious ones; all
three have several interchangeable solutions and I can certainly
conceive packages for each interoperating with a few select foreign
keys, username being the most obvious...
Admittedly, I haven't hit this in the "real world" yet.
I'm very keen to move mortar_rdb onto Alembic, and will be doing so as
soon as I hit the need in the real world...
Chris
--
Simplistix - Content Management, Batch Processing & Python Consulting
- http://www.simplistix.co.uk
>
> That suggests that every package would have its own migration tool, which is not very practical from a sysadmin point of view. When I am upgrading an application I want to be able to run all necessary migrations for all components of an application in one run. I do not want to be required to figure out which packages an application was running and then migrate them all separately. So I definitely see a need for an upgrade framework that can deal with multiple packages.
In my view every package which deals with its own schema objects would at least have to maintain its own migration files - not its own migration "tool", but assuming an Alembic setup, each would have at least a rudimentary Alembic environment and individual migration files. You could then run each migration environment individually, or write a short coordination script within the main application that calls upon all of them. Assuming the packages either have no schema dependencies on each other, or dependencies without cycles, the correct "order" in which each set of scripts is run could just be hardcoded within the main application.
I think when one writes an application M that makes use of libraries A, B, and C, it's not unreasonable that M would have to include some top-level configuration for A, B, and C, that is, adding each one to a list of packages in which to locate an Alembic environment and run upgrades. Or there would be some other usage contract between A, B, C and M that allows for publishing of "migration" handles.
What I don't see is that application M has within it migration scripts specific to A, B and C. A, B and C should maintain the knowledge of their own schemas and how they need to be upgraded for new versions of A, B and C, I would think.
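A coordination script of that kind could be sketched roughly as follows. The paths are hypothetical, and the sketch assumes each package ships its own `alembic.ini`; the only Alembic calls used are `alembic.config.Config` and `alembic.command.upgrade`, imported lazily so the ordering logic can be exercised without Alembic installed. It also assumes each environment can track its version separately within the shared database.

```python
# Hypothetical locations of each package's Alembic config, listed in the
# hardcoded order in which the main application wants them upgraded.
MIGRATION_ORDER = [
    "package_a/alembic.ini",
    "package_b/alembic.ini",
    "package_c/alembic.ini",
]


def alembic_upgrade(ini_path, revision="head"):
    """Upgrade one Alembic environment through Alembic's command API."""
    from alembic import command
    from alembic.config import Config

    command.upgrade(Config(ini_path), revision)


def upgrade_all(ini_paths, run=alembic_upgrade):
    """Upgrade every environment in order and return the paths handled.

    `run` is injectable so the coordination can be tried out with a stub.
    """
    done = []
    for path in ini_paths:
        run(path)
        done.append(path)
    return done
```

Application M would then call `upgrade_all(MIGRATION_ORDER)` from its own upgrade command, while the migration scripts themselves stay inside A, B and C.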
The one thing that's needed as far as Alembic is concerned is the ability to control the name of the actual "migration" table per environment. This is a short feature addition that's been sitting as an enhancement request for some time now.
Since you mention it, I posted patches to
https://bitbucket.org/zzzeek/alembic/issue/34/make-version-table-name-configurable
a while ago and was awaiting feedback on them (until I forgot about them).
The patches, in addition to supporting a configurable version table name,
also support two-column version tables which can be shared between Alembic
environments.
(If you want to veto the two-column version table idea, I can whittle
it down to just the configurable-version-table-name part pretty easily.)
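The patches themselves are not reproduced in this thread, so the following is only a rough illustration of what a shared two-column version table could look like. The table name `migration_versions`, the column names and the helper functions are all assumptions made for the example, and SQLite is used purely to keep it self-contained.

```python
import sqlite3


def ensure_version_table(conn):
    """Create the shared table: one row per package/environment."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS migration_versions ("
        "package TEXT PRIMARY KEY, version_num TEXT NOT NULL)"
    )


def set_version(conn, package, version_num):
    """Record the current revision for one package only."""
    conn.execute(
        "INSERT OR REPLACE INTO migration_versions (package, version_num) "
        "VALUES (?, ?)",
        (package, version_num),
    )


def get_version(conn, package):
    """Return the recorded revision for `package`, or None if unset."""
    row = conn.execute(
        "SELECT version_num FROM migration_versions WHERE package = ?",
        (package,),
    ).fetchone()
    return row[0] if row else None


conn = sqlite3.connect(":memory:")
ensure_version_table(conn)
set_version(conn, "package_a", "1a2b3c")
set_version(conn, "package_b", "9f8e7d")
set_version(conn, "package_a", "4d5e6f")  # upgrading A leaves B untouched
```

The extra column is what lets several environments share one table without overwriting each other's revision.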
Cheers,
Jeff
I really have to get used to bitbucket and the need to press the "follow" button on these issues, since I was totally unaware of this !
looking now (and checking other issues for missed activity)