schema vs owner vs database vs ....

76 views
Skip to first unread message

Rick Morrison

unread,
Mar 19, 2008, 11:49:36 AM3/19/08
to sqlalche...@googlegroups.com
...I'm cross-posting this from the -user list, as if we have a discussion on this, it's probably better to happen here.


> I forgot to add in my message, that schema for Table object in column
> is setup to None. This is also a bug.

Maybe it is, and maybe it isn't.

I think Oracle treats 'schema' and 'owner' as synonyms for the same thing, much as MySQL treats 'database' and 'schema' as the same thing. MSSQL shares a similar schizophrenic attitude toward 'schema' and 'owner', and in MSSQL, those attitudes have actually changed over time.

I think we need to have a convention in SQLAlchemy about how to treat these items. Leaving aside the security aspects of the "owner" concept (since SQLAlchemy doesn't have much to say on this subject yet), the three terms 'database', 'schema' and 'owner' are basically namespaces:

All of the databases supported by SQLAlchemy support at least a two-level namespace:

    table.column

and nearly all modern SQL databases support a third namespace:

    namespace.table.column

Here's where the disagreement over what to term this third namespace starts, and right away things begin to get very messy:

 -  Oracle says "owner" and mixes in login name and access restrictions.
 -  MSSQL first agrees (pre-2005), and then (SQL2005) recants and says "schema".
 -  MySQL says "schema and database mean the same thing".
 -  MSSQL also agrees with this, and allows cross database joins.
 -  Oops, MSSQL also supports owner and schema as well. So now a FOUR-part namespace is possible in MSSQL:
     [namespace1].[namespace2]
.table.column. Usually [namespace1] is called the "database".
 -  So what is [namespace2]? Schema? Owner? MSSQL: "take your pick!"
 -  SQlite has no concept of schema at all, except a pragma that confusingly lists a table definition.
 -  When SQlite adds attached databases, suddenly the schema construct becomes useful.
 -  Postgresql first supports only a two-level store (table.column) with no cross-database joings
 -  In PG7.3, they add "schema" for the canonical three-part namespace, and a schema "search path" (IMO, the most sane and flexible scheme out there)

..add a little more confusion with ANSI SQL standards and relational theory, which refer to "schema" as a collection of relational tuple headers (table definitions) and relational constraints (this collection is called "Metadata" by SQLalchemy).

...and some more confusion with data model theory which talks about "conceptional schema", "logical schema" and "physical schema"

...and a little more confusion with named data modeling techniques such as "star schema" or "snowflake schema"

No wonder we can't agree on what the hell "schema" means. Or "owner". Or even "database", sometimes. The words are too overloaded.


Bottom line is this:

Any ORM is going to have to support a multi-level segmented namespace for various databases, and use it to construct queries that mirror the database implementation's idea of how things are namespaced. We're going to need to deal with the variations in database vendor terminology for these namespaces, and work correctly.

This is a problem that the ODBC and JDBC folks have had to deal with a long time ago. I think their tack was to substitute alternate words like 'catalog'.

Should we do the same kind of thing?
Reply all
Reply to author
Forward
0 new messages