Multi-RDBMS is in the trunk


Catherine

Mar 31, 2009, 12:09:36 PM
to sqlpython
It's working! At least, in the most basic sense.

0:jrrt@orcl> \c -a sqlite:///:memory:
1:@:memory:> CREATE TABLE foo (n NUMBER);

Executed

1:@:memory:> INSERT INTO foo VALUES (22);

Executed (1 rows)

1:@:memory:> cat foo

n
--
22

Selected Max Num rows (-1)
1:@:memory:>


What comes next is a whole lot of cleaning up Oracle-specific stuff.
For instance, note that, in sqlite, row counts are always reported as
"-1", because pysqlite doesn't populate cursor.rowcount. And none of
the metadata commands (ls, desc, refs, deps, etc.) are even thinking
about working outside Oracle yet.
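The rowcount quirk is easy to see with the standard library's sqlite3 module (pysqlite's descendant), a minimal sketch:

```python
import sqlite3

# Demonstration of the rowcount behavior described above,
# using the stdlib sqlite3 module.
conn = sqlite3.connect(':memory:')
curs = conn.cursor()
curs.execute('CREATE TABLE foo (n NUMBER)')
curs.execute('INSERT INTO foo VALUES (22)')
print(curs.rowcount)   # 1 -- DML statements do get a count
curs.execute('SELECT * FROM foo')
print(curs.rowcount)   # -1 -- no count is populated for SELECTs
```

So "always -1" applies to SELECTs; per DB-API 2.0, the module can't know the row count without fetching everything.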

So it's wide open for you to wade in and start working out the kinks!
Anyplace there are hard-coded queries against various views of the
Oracle data dictionary is screaming for help. We'll probably need a
mixture of these techniques (in order of preference).

1. Using sqlalchemy's object introspection powers
2. Querying against the information schema
3. Queries custom-written for each RDBMS's data dictionaries

Oracle's got no information schema, so we're probably stuck with #3
there, and there's some question about how usable the information
schema will be in the other databases... but go ahead and try it out.

I've (just) begun the work of separating these queries into a dict,
`metaqueries` in metadata.py. Please help me get the hard queries
into that. It's going to look like

metaqueries[purpose of query][RDBMS][target object type] = '''SELECT
blah blah blah'''

and, if the information schema works out, we can define some of it as

metaqueries[purpose of query]['ischema'][target object type] =
'''SELECT blah blah blah'''
metaqueries[purpose of query]['sqlite'][target object type] =
metaqueries[purpose of query]['ischema'][target object type]
metaqueries[purpose of query]['postgres'][target object type] =
metaqueries[purpose of query]['ischema'][target object type]
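As a toy sketch of that layout (the 'desc' purpose, object types, and SQL text here are invented for illustration, not the real contents of metadata.py):

```python
# Illustrative sketch of the proposed metaqueries layout; the key
# names and queries are hypothetical placeholders.
metaqueries = {'desc': {}}

metaqueries['desc']['oracle'] = {
    'TABLE': "SELECT column_name, data_type FROM all_tab_columns"
             " WHERE table_name = :tname"}
metaqueries['desc']['ischema'] = {
    'TABLE': "SELECT column_name, data_type"
             " FROM information_schema.columns WHERE table_name = %(tname)s"}

# RDBMSs with a usable information schema simply alias the shared entry
metaqueries['desc']['postgres'] = metaqueries['desc']['ischema']
metaqueries['desc']['sqlite'] = metaqueries['desc']['ischema']

print(metaqueries['desc']['postgres'] is metaqueries['desc']['ischema'])  # True
```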

--------------------

I intend to make all connections through sqlalchemy; that way, we'll
be able to count on having sqlalchemy's goodies available to see the
metadata, etc. Right now, though, I can't make SYS AS SYSDBA, SYS AS
SYSOPER, etc. connections to Oracle through sqlalchemy.create_engine.
For cx_Oracle.connect (the previous method), you can pass a `mode`
argument with values of 0, cx_Oracle.SYSDBA (2), or cx_Oracle.SYSOPER
(4). sqlalchemy.create_engine cries "invalid arguments" when I try to
pass it `mode=modeval`, though.

So, for the moment, I'm just doing AS SYSOPER/SYSDBA connections the
old non-sqlalchemy way, but that will be a problem once we've got some
of the metadata queries rewritten to utilize sqlalchemy.

All this happens in sqlpython.ora_connect in sqlpython.py
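For reference, the cx_Oracle side looks roughly like this. This is a hypothetical sketch, not the actual ora_connect code; the numeric values are the cx_Oracle constants (cx_Oracle.SYSDBA == 2, cx_Oracle.SYSOPER == 4), hard-coded here so the sketch runs without cx_Oracle installed:

```python
# Sketch of building cx_Oracle.connect() arguments for privileged
# logins; MODES mirrors cx_Oracle's SYSDBA/SYSOPER constants.
MODES = {None: 0, 'SYSDBA': 2, 'SYSOPER': 4}

def connect_kwargs(username, password, tnsname, privilege=None):
    """Keyword arguments for a 'SYS AS SYSDBA'-style cx_Oracle.connect()."""
    mode = MODES[privilege.upper() if privilege else None]
    return dict(user=username, password=password, dsn=tnsname, mode=mode)

print(connect_kwargs('sys', 'syspassword', 'orcl', 'sysdba'))
# {'user': 'sys', 'password': 'syspassword', 'dsn': 'orcl', 'mode': 2}
```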

Menno Smits

Mar 31, 2009, 12:41:19 PM
to sqlp...@googlegroups.com
Catherine wrote:
> It's working! At least, in the most basic sense.

Wow, you don't mess around. Great!
No time to look at this right now (at work) but I'll take a look tonight.

> I intend to make all connections through sqlalchemy; that way, we'll
> be able to count on having sqlalchemy's goodies available to see the
> metadata, etc. Right now, though, I can't make SYS AS SYSDBA, SYS AS
> SYSOPER, etc. connections to Oracle through sqlalchemy.create_engine.
> For cx_Oracle.connect (the past method), you can pass a `mode`
> argument with values of 0, cx_Oracle.SYSDBA (2), cx_Oracle.SYSOPER
> (4). sqlalchemy.get_engine cries "invalid arguments" when I try to
> pass it `mode=modeval`, though.

Have you seen this:
http://www.sqlalchemy.org/docs/05/reference/dialects/oracle.html

The docs indicate you can append "?mode=modeval" onto the end of the URL
rather than passing an extra arg to create_engine. Is this what you were
doing?

Menno

Catherine Devlin

Mar 31, 2009, 2:59:30 PM
to sqlp...@googlegroups.com
On Tue, Mar 31, 2009 at 12:41 PM, Menno Smits <me...@freshfoo.com> wrote:
Have you seen this:
http://www.sqlalchemy.org/docs/05/reference/dialects/oracle.html

The docs indicate you can append "?mode=modeval" onto the end of the URL
rather than passing an extra arg to create_engine. Is this what you were
doing?

 Oooh, that is so close it hurts.

This works:

create_engine('oracle://sys:syspassword@localhost/orcl?mode=2').connect()

But this doesn't:

create_engine('oracle://sys:syspassword@orcl?mode=2').connect()

ORA-12154: Could not resolve the connect identifier specified

... but the latter is what we need when duplicating normal Oracle login, which uses a TNS name

From the docs:

Connecting with create_engine() uses the standard URL approach of oracle://user:pass@host:port/dbname[?key=value&key=value...]. If dbname is present, the host, port, and dbname tokens are converted to a TNS name using the cx_oracle makedsn() function. Otherwise, the host token is taken directly as a TNS name.

Additional arguments which may be specified either as query string arguments on the URL, or as keyword arguments to create_engine() are:

  • mode - This is given the string value of SYSDBA or SYSOPER, or alternatively an integer value. This value is only available as a URL query string argument.

Unfortunately, it looks like they mean that ?mode=4 only works with a full-URL, not the abbreviated-with-TNS-name version... but the TNS-name version is all we have when somebody tries connecting username/password@orcl.

I don't see any reason why the TNS form shouldn't accept the ?mode, and it all traces down to def _parse_rfc1738_args(name) in sqlalchemy/engine/url.py, and I can see how it could be patched to accept the parameter in the TNS case... so maybe I'll try submitting a patch to the SQLAlchemy guys.  They're crazy-responsive, after all...

Or am I missing something?

--
- Catherine
http://catherinedevlin.blogspot.com/
*** PyCon * March 27-29, 2009 * Chicago * us.pycon.org ***

Brian Dorsey

Mar 31, 2009, 3:24:52 PM
to sqlp...@googlegroups.com
On Tue, Mar 31, 2009 at 11:59 AM, Catherine Devlin
<catherin...@gmail.com> wrote:
> I don't see any reason why the TNS form shouldn't accept the ?mode, and it
> all traces down to def _parse_rfc1738_args(name) in
> sqlalchemy/engine/url.py, and I can see how it could be patched to accept
> the parameter in the TNS case... so maybe I'll try submitting a patch to the
> SQLAlchemy guys.  They're crazy-responsive, after all...

I'd give that a try, they're still sprinting today. :)

Take care,
-Brian

Menno Smits

Mar 31, 2009, 4:28:00 PM
to sqlp...@googlegroups.com
Catherine Devlin wrote:
>
> I don't see any reason why the TNS form shouldn't accept the ?mode, and
> it all traces down to def _parse_rfc1738_args(name) in
> sqlalchemy/engine/url.py, and I can see how it could be patched to
> accept the parameter in the TNS case... so maybe I'll try submitting a
> patch to the SQLAlchemy guys. They're crazy-responsive, after all...
>
> Or am I missing something?

Looking at _parse_rfc1738_args() it seems like a URL like
"oracle://sys:syspassword@/orcl?mode=2" would work.

The difference is the extra / after the @.
Can't test it at work though...

Menno

Catherine Devlin

Mar 31, 2009, 4:52:54 PM
to sqlp...@googlegroups.com
Nope - I tried that, too.  It throws an error. The code in
_parse_rfc1738_args that strips the "?mode=2" off the url and tucks it
safely away doesn't fire when the regex doesn't parse the url as
having explicit host and database names, so it ends up looking for a
database in TNSNAMES.ORA named "sid?mode=2".
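A stripped-down illustration of that parsing behavior (a toy regex imitating the effect described, not SQLAlchemy's actual _parse_rfc1738_args):

```python
import re

# Toy imitation of an RFC-1738-style URL parser, to show why the
# TNS-name form swallows the query string.
URL_RE = re.compile(r'''
    (?P<drivername>\w+)://
    (?:(?P<username>[^:@/]+)(?::(?P<password>[^@/]*))?@)?
    (?P<host>[^/]*)
    (?:/(?P<database>.*))?
    ''', re.X)

def parse(url):
    parts = URL_RE.match(url).groupdict()
    query = None
    # The query string is only split off the *database* token -- so
    # in the TNS form, with no explicit database, '?mode=2' never comes off.
    if parts['database'] and '?' in parts['database']:
        parts['database'], qs = parts['database'].split('?', 1)
        query = dict(pair.split('=', 1) for pair in qs.split('&'))
    return parts, query

parts, query = parse('oracle://sys:pw@localhost/orcl?mode=2')
print(parts['database'], query)   # orcl {'mode': '2'}
parts, query = parse('oracle://sys:pw@orcl?mode=2')
print(parts['host'], query)       # orcl?mode=2 None  <- the bug
```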

I just filed a sqlalchemy ticket that I think will fix it:

http://www.sqlalchemy.org/trac/ticket/1361

If you want to plow ahead on the assumption that sqlalchemy will soon
apply the patch, you could grab a copy of their trunk (hg clone
http://bitbucket.org/mirror/sqlalchemy/), patch it with the attached
file, run python setup.py develop in it, and push onward.  Otherwise,
just use explicit connection urls like "connect
oracle://username:password@sid" when working with the sqlpython trunk
until it's all resolved.

Whew! OK, I spent too long on this today, but it's that
laying-the-groundwork that I wanted to do to make everybody else's
multi-DB work possible.
urlparampatch.diff

Catherine Devlin

Apr 2, 2009, 9:16:13 AM
to sqlp...@googlegroups.com
Turns out that you were right, Menno, except for the placement of the
final slash - SQLAlchemy as it stands now works fine for us, with
syntax like oracle://sys:syspasswd@dbase/?mode=2
http://www.sqlalchemy.org/trac/ticket/1361

... so I've updated sqlpython.py in the trunk accordingly. Now every
connection is handled by SQLAlchemy, so it should be possible to use
SQLA goodies to investigate our objects - I hope that ends up being
helpful.

Offhand, the only SQLA goodie I see is get_tables(), and even that
isn't so useful because our `ls` is good for all sorts of objects -
tables, views, stored procedures, indexes, etc etc etc so get_tables
alone can't replace it. But I haven't even begun to probe into what
SQLAlchemy has got.

I wonder... if we ended up having to build a bunch of
database-specific metadata access objects after all, whether SQLA
would be interested in porting them up? That would make our work even
more widely useful. Maybe we should try to write them with that
possibility in mind.

Menno Smits

Apr 2, 2009, 1:46:18 PM
to sqlp...@googlegroups.com
Catherine Devlin wrote:
> Turns out that you were right, Menno, except for the placement of the
> final slash - SQLAlchemy as it stands now works fine for us, with
> syntax like oracle://sys:syspasswd@dbase/?mode=2
> http://www.sqlalchemy.org/trac/ticket/1361

Great!

> Offhand, the only SQLA goodie I see is get_tables(), and even that
> isn't so useful because our `ls` is good for all sorts of objects -
> tables, views, stored procedures, indexes, etc etc etc so get_tables
> alone can't replace it. But I haven't even begun to probe into what
> SQLAlchemy has got.

Good point. I don't know how well it handles views and indexes and I bet
it has nothing for stored procedures. Let's see what's in there before
giving up on it completely.

The person at the back of the room in the open spaces session seemed to
think that there was a bunch of recently added metadata support. Hunting
around a bit I found reference to Randall Smith, who's supposedly
working on this:

http://www.mail-archive.com/sqlal...@googlegroups.com/msg12686.html

Looks like this gives support for schemas, tables, views, keys and
indices. Not sure if the work has made it into the SQLA trunk.

Hunting around a bit further I found Gerald:
http://www.mail-archive.com/sqlal...@googlegroups.com/msg12686.html

It has an API for inspecting the schema of Oracle, Postgres and MySQL
DBs and can handle schemas, tables, views, triggers, sequences and
stored procedures. Probably also worth a look.

> I wonder... if we ended up having to build a bunch of
> database-specific metadata access objects after all, whether SQLA
> would be interested in porting them up? That would make our work even
> more widely useful. Maybe we should try to write them with that
> possibility in mind.

DB-specific metadata interfaces would also help isolate the requirements
for specific DB API modules. Currently sqlpython requires that you have
cx_Oracle installed even if you don't intend to use Oracle, because it is
imported and used all over the place. Moving all DB-specific
functionality into separate modules/classes would mean you only need to
install the DB API modules for the DBs you actually use. The import of
cx_Oracle/psycopg/whatever could be delayed until a connection to such a
DB is requested.

Menno

Catherine Devlin

Apr 6, 2009, 4:55:46 PM
to sqlp...@googlegroups.com
I just sent a new sqlpyPlus.py up to the trunk with `ls` reworked.  Now, it selects from a subquery, and the subquery comes from the metaqueries dict.  That way, the metaqueries can mask the individual RDBMS particulars.

Those metaqueries are very, very basic and not very accurate right now - I need the help of you folks with your DB-specific knowledge.  You can also use that general concept as a way to de-Oracleize other DDL commands (refs, deps, pull, comments, ...)

So that's one way we could do metadata in non-Oracle.  As for the other ways Menno found...

On Thu, Apr 2, 2009 at 1:46 PM, Menno Smits <me...@freshfoo.com> wrote:
Hunting
around a bit I found reference to Randall Smith, who's supposedly
working on this:

http://www.mail-archive.com/sqlal...@googlegroups.com/msg12686.html

Looks like this gives support for schemas, tables, views, keys and
indices. Not sure if the work has made it into the SQLA trunk.

Hmm.  Well, it is in the sqlalchemy trunk now, but I don't think it's really enough for us.  An Inspected connection has methods like get_table_names, get_view_names, get_columns, etc., but that only gives you names.  We need more than names - we need stuff like: status (valid/invalid), last DDL date/time, foreign key information, dependencies, the creation DDL of objects, etc.  It looks like that would be hard or impossible to get with the sqlalchemy Inspector. 

get_foreign_keys() is returning empty lists for everything.  Bleh.

I'm voting -1 on sqlalchemy.engine.reflection.
 

Hunting around a bit further I found Gerald:
http://www.mail-archive.com/sqlal...@googlegroups.com/msg12686.html

It has an API for inspecting the schema of Oracle, Postgres and MySQL
DBs and can handle schemas, tables, views, triggers, sequences and
stored procedures. Probably also worth a look.

Gerald looks a lot more promising.  For one thing, you actually get objects instead of just names, and those objects seem to have most of the info we need for our DDL commands.  Most, but not all, so we'd still have to write direct DDL queries for some stuff.  (I'm unwilling to give up on any functionality, sorry; I still see sqlpython as competing with SQL*Plus, and I want to win on all counts.)

Also in Gerald's favor - it's actually released.  I want to be able to tell people, "easy_install and it will get all your dependencies".  I don't want to say, "Download the SQLAlchemy trunk with subversion..."

Big problem with Gerald: it doesn't yet support MS SQL Server, and SQLite support isn't even listed as an ambition.  I'd hate to give up on mssql, because (1) Brian needs it and (2) that's a huge community we could reach with Python power for the first time.

One option would be to go ahead with Gerald and offer Andy Todd (Gerald's maintainer) our help in building mssql and sqlite support for it.  Peeking into the code, it looks like this would actually be very similar to coding our own DDL queries - basically, we'd be writing the same queries, just wrapping them up in Gerald objects instead of our `metaqueries` dict.  This would be significantly more work, but we'd only need to do it for mssql and sqlite, and we'd be benefiting two projects instead of just one.

Or... we could just go back to doing the DDL queries ourselves, as I described at the top.  That might be the KISS solution.

I don't know!  Opinions, please!

DB-specific metadata interfaces would also help isolate the requirements
for specific DB API modules. Currently sqlpython requires that you have
cx_Oracle installed even if you don't intend to use Oracle, because it is
imported and used all over the place. Moving all DB-specific
functionality into separate modules/classes would mean you only need to
install the DB API modules for the DBs you actually use. The import of
cx_Oracle/psycopg/whatever could be delayed until a connection to such a
DB is requested.

Yes, that part should be possible no matter which way we go.

Catherine Devlin

Apr 9, 2009, 4:10:00 PM
to sqlp...@googlegroups.com
As of today, the trunk has "grep" working for multi-RDBMS, and pretty colors stolen from sqlalchemy_console.

Beyond that, I'm kind of stuck, so I'm going to think out loud some.

sqlalchemy.engine.reflection
---------------------------------------
Despite what I said before, I'm still not totally against using sqlalchemy.engine.reflection.  Even though it can only get as far as listing the names of stuff, we could then use those names to get the rest of the details on each item.  However, it's not in any released version yet - just the sqlalchemy trunk; I don't know of any schedule for it making it into a release; and I think it's going to be a while, since the current version crashes when I try to inspect the schema.

Gerald
----------
I still like Gerald, but what I hadn't noticed is that there's no code in PyPI - no egg, not even the source code.  You have to go to the Gerald homepage, download the tarball, unzip it, python setup.py install... THEN you can move on to installing sqlpython itself.  No WAY are database administrators going to go through that.  It needs to be a simple one-step process, or they just won't do it.

So the biggest current problem with both these projects is distribution.  I really don't want to depend on anything that doesn't have eggs in the cheese shop now (or soon), not "someday".

Thus, I tried forging ahead with writing our own metaqueries, but... ugh.  I bogged down about 1/4 of the way into adapting "describe"... and that's just to duplicate a command that psql, the mysql console, etc. already have.

I'm even toying with the idea of using subprocess.Popen to keep psql/mysql sessions open in the background, feeding "describe" requests to them and piping their results out.  In addition to the complexity, of course, that makes it really hard to actually enhance the output, and it wouldn't take us one step toward building the nonstandard functionality like "deps", "refs", etc.  Still, at least it would be a bulletproof way of getting all psql/mysql's functionality as a baseline.

Really, it would be *so nice* if databases would expose their interactive console's commands like "describe" (".schema" in sqlite, "\d" in psql) as functions callable through the DB-API2 modules!  I don't think any of them do, though. 
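For sqlite, at least, the DDL behind ".schema" is reachable through plain SQL against the sqlite_master catalog table - a possible workaround, sketched here:

```python
import sqlite3

def schema(conn):
    """Rough stand-in for sqlite's interactive ".schema" command:
    the creation DDL of every object lives in sqlite_master."""
    curs = conn.execute(
        'SELECT sql FROM sqlite_master WHERE sql IS NOT NULL')
    return [row[0] for row in curs]

conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE foo (n NUMBER)')
print(schema(conn))   # ['CREATE TABLE foo (n NUMBER)']
```

That only covers one backend, of course; psql's "\d" has no such SQL-level equivalent short of querying the catalogs yourself.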

I am hurting for ideas on how to go on!  Ideas?  HALP

I guess the one thing I can think of, for now, is to nag Andy Todd (the Gerald author) to make us an egg!

Catherine Devlin

Apr 13, 2009, 5:24:32 PM
to sqlp...@googlegroups.com
On Thu, Apr 9, 2009 at 4:10 PM, Catherine Devlin
<catherin...@gmail.com> wrote:
> Gerald
> ----------
> I still like Gerald, but what I hadn't noticed is that there's no code in
> PyPI - no egg, not even the source code.  You have to go to the Gerald
> homepage, download the tarball, unzip it, python setup.py install... THEN
> you can move on to installing sqlpython itself.  No WAY are database
> administrators going to go through that.  It needs to be a simple one-step
> process, or they just won't do it.
>
> So the biggest current problem with both these projects is distribution.  I
> really don't want to depend on anything that doesn't have eggs in the cheese
> shop now (or soon), not "someday".

Good news! I got hold of Andy Todd, author of Gerald, and he agreed
to begin releasing eggs on PyPI beginning with Gerald 0.2.5 in a few
days. So that takes care of the dependency issue; I think we can do
some very serious experimentation with Gerald now. If it goes well,
let's use it, and eventually contribute mssql_schema.py and
sqlite_schema.py back to Gerald.

Andy Todd

Apr 15, 2009, 6:53:00 AM
to sqlpython


On Apr 7, 6:55 am, Catherine Devlin <catherine.dev...@gmail.com>
wrote:
If you can tell me what's missing I'm more than happy to add it to
Gerald. Of course, patches are always welcome ;-)

> Also in Gerald's favor - it's actually released.  I want to be able to tell
> people, "easy_install and it will get all your dependencies".  I don't want
> to say, "Download the SQLAlchemy trunk with subversion..."
>
> Big problem with Gerald: it doesn't yet support MS SQL Server, and SQLite
> support isn't even listed as an ambition.  I'd hate to give up on mssql, because
> (1) Brian needs it and (2) that's a huge community we could reach with
> Python power for the first time.
>

MS SQL Server should be possible but I'll need help. I did look at
SQLite but it doesn't really have a data dictionary to query. That
shouldn't be a blocker, though - it can be worked around. The
major reason I haven't progressed support for it thus far is that,
regardless of what you define in your DDL, SQLite will accept any value
of any datatype in any column, and I didn't see much value in returning
detailed column definitions that weren't enforced.

I'll make a start on adding SQLite support for the next release
though.
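Andy's point about SQLite's loose typing is easy to demonstrate with the stdlib sqlite3 module:

```python
import sqlite3

conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE t (n INTEGER)')
# Regardless of the declared column type, SQLite stores whatever
# you give it, noting each value's actual storage class.
conn.execute('INSERT INTO t VALUES (1)')
conn.execute("INSERT INTO t VALUES ('not a number')")
print(conn.execute('SELECT n, typeof(n) FROM t ORDER BY rowid').fetchall())
# [(1, 'integer'), ('not a number', 'text')]
```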

> One option would be to go ahead with Gerald and offer Andy Todd (Gerald's
> maintainer) our help in building mssql and sqlite support for it.  Peeking
> into the code, it looks like this would actually be very similar to coding
> our own DDL queries - basically, we'd be writing the same queries, just
> wrapping them up in Gerald objects instead of our `metaqueries` dict.  This
> would be significantly more work, but we'd only need to do it for mssql and
> sqlite, and we'd be benefiting two projects instead of just one.
>
> Or... we could just go back to doing the DDL queries ourselves, as I
> described at the top.  That might be the KISS solution.
>
> I don't know!  Opinions, please!
>
> DB specific metadata interfaces would also help isolating requirements
>
> > for specific DB API modules. Currently sqlpython requires that you have
> > cx_Oracle installed even if you don't intend to use Oracle because it is
> > imported and used all over the place. Moving all DB specific
> > functionality separate modules/classes would mean you only need to
> > install the DB API modules for the DBs you actually use. The import of
> > cx_Oracle/psycopg/whatever could be delayed until a connection to such a
> > DB is requested.
>
> Yes, that part should be possible no matter which way we go.
>
> --
> - Catherine
> http://catherinedevlin.blogspot.com/

Catherine

Apr 27, 2009, 3:53:14 PM
to sqlpython
Andy, and everybody else,

I just had a few exhilarating days of wild success, Geraldizing the
trunk... and then a couple of very frustrating days, realizing it wasn't
going as well as I thought.

Basically, Gerald is awesome at easily getting us collections of
dependent objects... but what are those collections, exactly? Well...
it varies. A lot. For instance:

In an Oracle schema:

- a table has a dict called `columns`. A view has a *list* called
`columns`.
- a named check constraint is returned as a three-item list, with
the constraint's condition as element [2]. An unnamed check
constraint is returned as a two-item list without the condition.

In Oracle, a foreign key is a list whose [2] element is a list of the
key columns in the local table. In postgres, a foreign key is a list
whose [2] element is a list containing the constraint name.

... and on and on... there's a lot more like this. All of this makes
the prospect of a single set of code that can navigate and output
Gerald results from any RDBMS look somewhere between "nasty" and
"impossible".

Another big issue: Gerald is built to look at *your own* objects.
Connect as `catherine`, and you can inspect objects belonging to
`catherine`. However, SQL client tools are commonly used to explore
objects that you don't actually own but simply have SELECT rights
to.

Understand: that's perfectly reasonable for a tool that was not
invented to support sqlpython at all! It's a schema comparison tool,
not a SQL client infrastructure.

*** BUT ***

Gerald is still the best solution, I think. sqlalchemy's "reflection"
branch is still not in the trunk, and the trunk is not near release,
so it may be months and months and months - if ever - before it's
truly available.

*** AND ***

Gerald wouldn't actually be too hard to change to produce consistent
results - dictionaries for everything, I think, would be nice. The
code layout is very clear. It might take less time than writing this
email has taken. :)

*** SO ***

I think we could get what we need by modifying Gerald itself - much
easier than accepting Gerald output as-is and then trying to squeeze
it into a consistent shape on our side.

*** BUT ***

I don't know whether that would break Gerald for its original purpose
(of comparing schemas), and look-at-my-own-objects-only vs. look-at-
all-objects-I-can-see is certainly an incompatibility. At best, Andy
would have to open it up to some significant modifications for our
sake.

So I'm wondering whether to basically fork Gerald - copy Gerald as is
into the sqlpython source tree, use our local copy instead of
importing true Gerald, and then modify our local Gerald to give us the
output we need.

Or is forking a horrible idea? Can Gerald move forward as a dual-
purpose project?

(Micro-issue: Gerald is BSD, sqlpython is MIT license. I think we'd
have to change our license to BSD, or actually have two separate
licenses.)

Thanks, everybody,

- Catherine

Andy Todd

Apr 27, 2009, 8:03:09 PM
to sqlp...@googlegroups.com

OK, thanks for that. In general I'm happy to accept suggestions for
improvement and I don't think that anything you have suggested will
'break' Gerald so I propose to incorporate as many of your suggestions
as possible. One question at a time:

1. Internal representation.

The internal storage (and thus metadata model) is a mess. It has evolved
in a very unstructured way and I have a task in my to do list to fix it
all up. I'll happily move that up the list to be priority number 1.
Given the amount of changes that will require (and the fact that it will
break any existing code) I think that this needs to be done carefully.
So my plan is;

- Document the object model that will be used (preferably on a wiki -
any suggestions anyone?)
- Update each of the *schema.py files and their test suites
- Test, test, test
- Release

If I/we get the first part right the rest should be a relative breeze ;-)

2. My schema vs. all available objects

Conceptually this is quite a simple change for every database *apart*
from Oracle. Even then it is entirely possible. I've just never seen the
need for it. I'm happy to do this - probably with a flag of some sort to
differentiate between 'my' objects and 'public' objects. Suggestions for
how to work this into the current API are more than welcome.

3. License.

I am happy to relicense Gerald to the MIT license if that makes life
easier. To my (IANAL) mind there isn't a huge difference between them
and changing will not cause me any problems.

4. Access to the code.

I can grant anyone who asks (nicely) access to the SVN repository at
SourceForge to generate and apply patches.

Alternatively, I have been toying with the idea of moving to a DVCS and
to that end I have created an account at Bitbucket. That is as far as I
have got though. If someone would like to volunteer to help me port
between Subversion and Hg I am happy to move the project repository to
Bitbucket as that will more easily support some of the changes we are
discussing here.

Regards,
Andy
--
From the desk of Andrew J Todd esq - http://www.halfcooked.com/

Catherine Devlin

Apr 29, 2009, 10:24:57 AM
to sqlp...@googlegroups.com
THANK YOU, Andy!

In your API at
http://groups.google.com/group/sqlpython/web/gerald-api?hl=en&pli=1
you're specifying columns, indexes, etc. as tuples. That would work
fine, though I was going to suggest dicts, both to make it a little
bit self-documenting and to ease the burden of standardizing: then, if
it turns out that there's another important piece of data to include
about (say) a mssql column, mssql_schema.py can simply include that in
its column dicts and not worry about inserting a new null element into
the column elements for all the other rdbms's. (It would be the
responsibility of examining programs, like sqlpython, not to die
gruesomely if they don't find a key they expected.)
(It doesn't remove all the standardization burden, of course - there
still needs to be a standard set of key names)
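A tiny sketch of the idea (the key names here are invented for illustration, not Gerald's actual ones):

```python
# Hypothetical column dicts of the kind proposed above; the mssql
# column carries a backend-specific extra key that other backends omit.
oracle_col = {'name': 'N', 'type': 'NUMBER', 'nullable': True}
mssql_col = {'name': 'N', 'type': 'int', 'nullable': True,
             'identity': True}

def describe(col):
    # Examining code uses .get(), so an absent key isn't fatal
    suffix = ' IDENTITY' if col.get('identity') else ''
    return '%s %s%s' % (col['name'], col['type'], suffix)

print(describe(oracle_col))   # N NUMBER
print(describe(mssql_col))    # N int IDENTITY
```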

I'm attaching modified versions of oracle_schema.py and
postgres_schema.py as examples of outputting dicts - it seemed faster
than writing out what I'm thinking. (`kdiff3` or `meld` or something
vs. your current) Those are all the changes it would take for
sqlpython's metadata queries to work. (Well, and corresponding
changes in mysql_schema.py, of course.)

... but overall, I'm just happy for anything standard we can work
against! Any API is a good API once it's *the* API.

I've got sqlpython talks coming up on Friday (eek!) and the following
Thursday, so I'd better stop trying to get fully-fledged multi-RDBMS
in the next, um, 48 hours :) and instead release sqlpython 1.6.5 -
just one more new bug to fix, probably later today. It'll leave out
all the Gerald-based stuff that's currently in the trunk.

On Mon, Apr 27, 2009 at 8:03 PM, Andy Todd <and...@halfcooked.com> wrote:
> 1. Internal representation.
>
> The internal storage (and thus metadata model) is a mess. It has evolved
> in a very unstructured way and I have a task in my to do list to fix it
> all up. I'll happily move that up the list to be priority number 1.

Thank you! Though I can see where there's really been no reason for,
say, tables and views to report consistently when schema comparison
was the only purpose. "You're comparing a table against a view?
Well, of course they're different!"

> 2. My schema vs. all available objects
>
> Conceptually this is quite a simple change for every database *apart*
> from Oracle. Even then it is entirely possible. I've just never seen the
> need for it. I'm happy to do this - probably with a flag of some sort to
> differentiate between 'my' objects and 'public' objects. Suggestions for
> how to work this into the current API are more than welcome.

Hmmmmmmm

> 3. License.
>
> I am happy to relicense Gerald to the MIT license if that makes life
> easier. To my (IANAL) mind there isn't a huge difference between them
> and changing will not cause me any problems.

True, MIT and BSD are very close. But anyway, it would only be
relevant if we were going to incorporate a fork - as long as Gerald
remains an independent import, there's no need.

> 4. Access to the code.
>
> I can grant anyone who asks (nicely) access to the SVN repository at
> SourceForge to generate and apply patches.
>
> Alternatively, I have been toying with the idea of moving to a DVCS and
> to that end I have created an account at Bitbucket. That is as far as I
> have got though. If someone would like to volunteer to help me port
> between Subversion and Hg I am happy to move the project repository to
> Bitbucket as that will more easily support some of the changes we are
> discussing here.

Nothing wrong with you remaining the gateway, either, as long as you
don't resent the burden, and it doesn't sound like you do. Again,
thanks so much!

*** PyOhio * July 25-26, 2009 * pyohio.org ***

oracle_schema.py
postgres_schema.py

Catherine Devlin

Apr 29, 2009, 10:46:58 AM
to sqlp...@googlegroups.com
On Wed, Apr 29, 2009 at 10:24 AM, Catherine Devlin
<catherin...@gmail.com> wrote:
> I'm attaching modified versions of oracle_schema.py and
> postgres_schema.py as examples of outputting dicts - it seemed faster
> than writing out what I'm thinking.

Oops! I forgot that the files I sent depend on a tiny function I'd
added in schema.py:

def dict_from_row(curs, row):
return dict(zip((d[0].lower() for d in curs.description), row))
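A quick usage sketch (the fake cursor class is made up for illustration; dict_from_row only ever touches curs.description):

```python
# dict_from_row repeated from above, so the example is self-contained
def dict_from_row(curs, row):
    return dict(zip((d[0].lower() for d in curs.description), row))

class FakeCursor:
    # Only the first element of each DB-API description tuple is used
    description = [('N', None), ('NAME', None)]

print(dict_from_row(FakeCursor(), (22, 'foo')))
# {'n': 22, 'name': 'foo'}
```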

Pavol Lisy

Apr 30, 2009, 7:40:00 AM
to sqlp...@googlegroups.com


2009/4/29 Catherine Devlin <catherin...@gmail.com>

THANK YOU, Andy!

In your API at
http://groups.google.com/group/sqlpython/web/gerald-api?hl=en&pli=1
you're specifying columns, indexes, etc. as tuples.  That would work
fine, though I was going to suggest dicts,



A dict is unordered, though - does that matter?

python 2.7 (and 3.1, of course) will add OrderedDict
( http://docs.python.org/dev/whatsnew/2.7.html#pep-372-adding-an-ordered-dictionary-to-collections )

python 2.4 and later has an equivalent recipe
( http://code.activestate.com/recipes/576693/ ),
as you can see in http://docs.python.org/dev/library/collections.html?highlight=ordereddict#collections.OrderedDict
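A quick illustration of the difference Pavol is pointing at (at the time, plain dicts made no ordering promise, so column declaration order could be lost):

```python
from collections import OrderedDict

# Column definitions where declaration order matters for display
cols = OrderedDict()
cols['id'] = 'NUMBER'
cols['name'] = 'VARCHAR2(30)'
cols['created'] = 'DATE'

print(list(cols))   # ['id', 'name', 'created'] -- always insertion order
```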

P.