Databases as objects

Thomas Gagne

Dec 20, 2006, 4:25:20 PM
An unexpected thing happened while debating topmind: I had an epiphany.
Instead of responding to the news group I thought about it for a short
bit (very short) and posted an article to my blog titled, "The RDB is
the biggest object in my system."

<http://blogs.in-streamco.com/anything.php?title=the_rdb_is_the_biggest_object_in_my_syst>

What I realized while trying to describe my preference to use DB
procedures as the primary (read: only) interface between my applications
and the database is that I believe my DB's physical representation of
data belongs to it alone and that customers of the DB oughtn't be
permitted to directly manipulate (change or query) its data. I realized
this is exactly what data-hiding is all about and why expert object
oriented designers and programmers emphasize the importance of
interfaces over direct data manipulation.

I thought more about this and posted a second article, Databases as
Objects: My schema is my class, which explored more similarities between
databases and objects and their classes.

<http://blogs.in-streamco.com/anything.php?title=my_schema_is_an_class>

I intend next to explore various design patterns from GoF and Smalltalk:
Best Practice Patterns to see if the similarities persist or where they
break down, and what can be learned from both about designing and
implementing OO systems with relational data bases.

If you agree there's such a thing as an object-relational impedance
mismatch, then perhaps it's because you're witnessing the negative
consequences of tightly coupling objects that shouldn't be tightly coupled.

There's a hypothesis in there somewhere.

As always, if you know of existing research on the subject I'm anxious
to read about it.

--
Visit <http://blogs.instreamfinancial.com/anything.php>
to read my rants on technology and the finance industry.

Message has been deleted

topmind

Dec 20, 2006, 7:25:29 PM
Thomas Gagne wrote:
> An unexpected thing happened while debating topmind: I had an epiphany.
> Instead of responding to the news group I thought about it for a short
> bit (very short) and posted an article to my blog titled, "The RDB is
> the biggest object in my system."
>
> <http://blogs.in-streamco.com/anything.php?title=the_rdb_is_the_biggest_object_in_my_syst>

From the link:

"Why shouldn't applications have embedded SQL? Because it's the same as
accessing the private data members of an object. It shouldn't be done.
OO programmers know the correct way to interface with an object is to
use its method interface--not attempt direct manipulation of the
object's data. OO programmer's attempts to violate that rule is what
causes so much frustration mapping the application's data graph into a
relational database's tables, rows, and columns. Those things belong to
the DB--not to the application."

(end quote)

You OO'ers keep forgetting: SQL *is* an interface. I repeat, SQL *is*
an interface. It is *not* "low level hardware". You OO'ers keep
viewing it as low-level stuff because you don't seem to like it, and
you wrap anything you don't like behind OO and call it "low level" so
that it fits your personal subjective preference and world view. OO may
fit your mind better for whatever reason, but you cannot assume your
head is God's template for every *other* individual.

BTW, Microsoft has ADO, DAO, etc. which are OO wrappers around RDBMS.
Java and other vendors do also. Whether OO is the best way to wrap RDBMS
calls is another debate. My point is they already exist.

Further, even if OO *was* the best way to access RDBMS thru an app,
that does not necessarily extrapolate to all domains. OO being good for
X does not automatically imply it is good for Y also. I have already
agreed that OO may be good for writing device drivers and
device-driver-like things; but it has not been shown useful to view
everything as a device driver. I am more interested in seeing how OO
models biz objects rather than how it wraps system services and the
like. Biz modeling has been OO's toughest evidence cookie to crack (but
perhaps not the only).

And finally, just because one *can* view everything as objects does not
necessarily mean one should. One can also view everything as Lisp or
assembler or that Brainf*ck language.

>
> What I realized while trying to describe my preference to use DB
> procedures as the primary (re: only) interface between my applications
> and the database is because I believe my DB's physical representation of
> data belongs to it alone and that customers of the DB oughtn't be
> permitted to directly manipulate (change or query) its data.

When was the last time you've seen this happen? Again, SQL is NOT
"physical representation". For that matter, neither are files even.
File systems are a hierarchical database-like thing. "POKE 462625" is
accessing the physical directly.

> I realized
> this is exactly what data-hiding is all about and why expert object
> oriented designers and programmers emphasize the importance of
> interfaces to direct data manipulation.

"Data hiding"? I am working on "OO hiding". Relational is a high-level
modeling technique which tends to use "declarative interfaces".
Declarative interfaces are not necessarily worse than "behavioral
interfaces", which OO relies on. This sounds like yet another battle
between declarative interfaces versus behavioral interfaces. Note that
one could potentially mix them in RDBMS, but so far it does not appear
very practical. And this is largely because the tight association
between data and behavior that OO likes simply does not work well in
biz apps. Thus, heavy behavioraltizing of RDBMS is not useful. I am
just pointing out it could be done and probably would be done if it
proved useful. OO forces an overly tight view of data and behavior.
Relational provides a consistency to declarative interfaces, but OO
does not provide any real structure and consistency to behavioral
interfaces. It creates shanty-town biz models.

GOF patterns are supposed to be a solution, but GOF patterns have no
clear rules about when to use what and force a kind of IS-A view on
modeling instead of HAS-A. GOF patterns are like an attempt to catalog
GO TO patterns instead of getting rid of GO TO's. Relational is comparable
to the move to structured programming from GO TO's: it provides more
consistency and factors common activities into a single interface
convention (relational operators). OO lets people re-invent their own
just like there are a jillion ways to do the equivalent of IF blocks
with GO TO's.

OO has simply failed to factor and standardize common relationship and
collection idioms!!!!
OO has simply failed to factor and standardize common relationship and
collection idioms!!!!

That is why OO is such a mess and it's hard to figure out people's OO
designs. It makes me feel like a building inspector in a shanty town:
there are no building codes and rules. Relational operators and
normalization rules rein in the "creativity" that should be reined
in. The "self-handling noun" view of OOP means that you get
self-reinventing nouns.

>
> I thought more about this and posted a second article, Databases as
> Objects: My schema is my class, which explored more similarities between
> databases and objects and their classes.
>
> <http://blogs.in-streamco.com/anything.php?title=my_schema_is_an_class>

See about ADO, DAO above

>
> I intend next to explore various design patterns from GoF and Smalltalk:
> Best Practice Patterns

How measured as "best"? Subjective internal votes?

> to see if the similarities persist or where they
> break down, and what can be learned from both about designing and
> implementing OO systems with relational data bases.
>
> If you agree there's such a thing as an object-relational impedance
> mismatch, then perhaps its because you're witnessing the negative
> consequences of tightly coupling objects that shouldn't be tightly coupled.
>
> There's a hypothesis in there somewhere.
>
> As always, if you know of existing research on the subject I'm anxious
> to read about it.
>
> --
> Visit <http://blogs.instreamfinancial.com/anything.php>
> to read my rants on technology and the finance industry.

oop.ismad.com
-T-

aloha.kakuikanu

Dec 20, 2006, 8:07:51 PM
Thomas Gagne wrote:
> If you agree there's such a thing as an object-relational impedance
> mismatch, ...

This mismatch is mostly caused by the lack of education of object
propellerheads. Witness pathetic attempts to enhance method dispatch
with predicates. Finally, some folks begin to understand predicate
importance! The problem is that if you do it on an ad hoc basis you'll
get some inconsistent messy design.

Here are a few facts, which may help you to further appreciate the power
of relations.
1. Function call is formally a relational join (followed by
projection). That is

f(a) is the same as pi_y ( `y=f(x)` |><| `x=a`)

where `y=f(x)` is a binary relation corresponding to the function, and
`x=a` is a relation that has a single tuple.

A consequence of this fact is that function calls (or arithmetic
expressions) fit naturally into the SQL select and where clause.

2. Function composition is a join (again followed by projection).

3. Predicates can be mixed with relations, and arbitrary relational
algebra expression can be transformed into a normal
'select-project-join' form. This explains why most queries fit nicely
into "select from where" SQL template.

4. The aggregate/group by construct reflects yet another important
mathematical construction: the equivalence relation. This is why it is
so easy to write queries that count things in SQL.
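
For readers who want to see facts 1 and 2 worked mechanically, here is a
minimal sketch in Python. It assumes a function with a small finite domain,
so that it can be written out as a relation, and it projects onto the result
attribute y. The helpers `row`, `join`, and `project` and the relations `F`,
`G`, and `A` are made up for illustration; they are not from any library.

```python
# A relation is modelled as a set of rows; each row is a frozenset of
# (attribute, value) pairs so that rows are hashable.

def row(**attrs):
    return frozenset(attrs.items())

def join(r, s):
    """Natural join: combine rows that agree on all shared attributes."""
    out = set()
    for a in r:
        for b in s:
            da, db = dict(a), dict(b)
            if all(da[k] == db[k] for k in da.keys() & db.keys()):
                out.add(frozenset({**da, **db}.items()))
    return out

def project(r, *names):
    """Projection: keep only the named attributes of every row."""
    return {frozenset((k, v) for k, v in t if k in names) for t in r}

# The function f(x) = x * x, tabulated over a small domain as `y=f(x)`.
F = {row(x=x, y=x * x) for x in range(5)}

# The single-tuple relation `x=a`, here with a = 3.
A = {row(x=3)}

# Fact 1: f(3) as a join followed by a projection onto y.
result = project(join(F, A), "y")

# Fact 2: composition g(f(x)) with g(y) = y + 1, again join then projection.
G = {row(y=y, z=y + 1) for y in range(17)}
composed = project(join(F, G), "x", "z")
```

Here `result` holds the single row with y = 9, and `composed` pairs each x
with g(f(x)). A real function over an infinite domain cannot be tabulated
this way, so this is only a picture of the algebra, not a claim about how
any DBMS evaluates expressions.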

This is only the beginning of the list, and I assure you that you'll
get more return on your investment not by spending your time
"brainstorming" how to fit databases into objects, but by educating
yourself about what database management really is.

topmind

Dec 21, 2006, 1:15:12 AM
> 1. Function call is formally a relational join (followed by
> projection). That is

In my opinion, it's not very useful to say "X is really a Y". A lot of
paradigms, idioms, and ideas are interchangeable such that one can be
viewed as the other and vice versa. Thus, implementing one in the
other does not carry much weight.

As far as the impedance mismatch, I do think it exists. The main reason
is that relational is heavily based on sets, but OO is based on
navigational structures (pointers). The two are very hard to reconcile.
A model based on graphs is very different to work with than one based
on sets. (They are also interchangeable, but the choice is a matter of
human convenience and thus productivity.)
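
A small sketch of the difference, in Python, with everything in it invented
for illustration (the `User` and `Account` classes, the tuples, the ids).
The same question, "which accounts belong to alice?", is answered once by
following pointers from a known object and once by stating a condition over
whole sets.

```python
from dataclasses import dataclass, field

@dataclass
class Account:
    number: str

@dataclass
class User:
    name: str
    accounts: list = field(default_factory=list)

# Navigational model: a graph of objects connected by references.
alice = User("alice", [Account("A-1"), Account("A-2")])
bob = User("bob", [Account("B-1")])

# Start at a known object and walk its pointers.
navigational = [acct.number for acct in alice.accounts]

# Set-based model: two relations tied together only by matching values.
users = {("u1", "alice"), ("u2", "bob")}
accounts = {("A-1", "u1"), ("A-2", "u1"), ("B-1", "u2")}

# No starting object; describe the rows wanted and filter the sets.
set_based = sorted(
    number
    for (number, owner) in accounts
    for (user_id, name) in users
    if owner == user_id and name == "alice"
)
```

Both give the same answer here. The practical difference shows up with the
reverse question ("who owns B-1?"): the sets answer it with the same kind of
filter, while the object graph needs a back-pointer or a search over every
user.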

-T-

Frans Bouma

Dec 21, 2006, 4:53:17 AM
topmind wrote:

> Thomas Gagne wrote:
> > An unexpected thing happened while debating topmind: I had an
> > epiphany. Instead of responding to the news group I thought about
> > it for a short bit (very short) and posted an article to my blog
> > titled, "The RDB is the biggest object in my system."
> >
> > <http://blogs.in-streamco.com/anything.php?title=the_rdb_is_the_biggest_object_in_my_syst>
>
> > From the link:
>
> "Why shouldn't applications have embedded SQL? Because it's the same
> as accessing the private data members of an object. It shouldn't be
> done. OO programmers know the correct way to interface with an
> object is to use its method interface--not attempt direct
> manipulation of the object's data. OO programmer's attempts to
> violate that rule is what causes so much frustration mapping the
> application's data graph into a relational database's tables, rows,
> and columns. Those things belong to the DB--not to the application."
>
> (end quote)
>

> You OO'ers keep forgetting: SQL is an interface. I repeat, SQL is
> an interface. It is not "low level hardware".

SQL is a set-oriented language, it's not an interface as a language
doesn't do anything without context (in this case a parser-interpreter
combi)

> You OO'ers keep
> viewing it as low-level stuff because you don't seem to like it, and
> you wrap anything you don't like behind OO and call it "low level" so
> that it fits your personal subjective preference and world view. OO
> may fit your mind better for whatever reason, but you cannot assume

> your head is God's template for every other individual.

of course it's not low level stuff, SQL is a set-oriented language and
therefore doesn't match object-oriented languages, so a 'translation'
has to be made as you can't project one onto another in a 1:1 fashion.

> BTW, Microsoft has ADO, DAO, etc. which are OO wrappers around RDBMS.

no they're not. ADO and DAO aren't OO, as they're COM based so they're
actually procedural (library interfaces implemented on a live object).
Furthermore, they provide the interface you talked about to the DB,
which is often referred to as 'the client interface' or 'provider' when
it comes to database access.

> Further, even if OO was the best way to access RDBMS thru an app,


> that does not necessarily extrapolate to all domains. OO being good
> for X does not automatically imply it is good for Y also.

you don't get the point: in an OO application, which works on data IN
the application, you want to do that in an OO fashion. To obtain the
data from the outside is initiated INSIDE the application, thus also in
an OO fashion. As an RDBMS doesn't understand OO in most cases, but it
works with SQL as it has a SQL interpreter in place to let you program
its internal relational algebra statements in a more readable way,
you've to map statements from OO to SQL and set oriented results (the
sets) from the DB back to OO objects.

> I have
> already agreed that OO may be good for writing device drivers and
> device-driver-like things; but it has not been shown useful to view
> everything as a device driver. I am more interested in seeing how OO
> models biz objects rather than how it wraps system services and the
> like. Biz modeling has been OO's toughest evidence cookie to crack
> (but perhaps not the only).

huh? walls full of books have been written about this topic and you
declare it the toughest cookie to crack...

> And finally, just because one can view everything as objects does not


> necessarily mean one should. One can also view everything as Lisp or
> assembler or that Brainf*ck language.

sure, but that doesn't mean the language necessarily fits the purpose
you want to use it for. data oriented operations on sets is best suited
with SQL, as it is designed for that. other languages are designed for
other purposes. Mixing the two is often not that successful, though
that's not a problem per se as processing data is more or less a 3 step
process:
- move data from data producer to data consumer
- process data in data consumer
- move data from original data consumer to original data producer

so you can easily chop up this process in 3 parts and implement the
parts in the language best fit for the job.

> > I realized
> > this is exactly what data-hiding is all about and why expert object
> > oriented designers and programmers emphasize the importance of
> > interfaces to direct data manipulation.
>
> "Data hiding"? I am working on "OO hiding". Relational is a high-level
> modeling technique which tends to use "declarative interfaces".
> Declarative interfaces are not necessarily worse than "behavioral
> interfaces", which OO relies on. This sounds like yet another battle
> between declarative interfaces versus behavioral interfaces.

Could you define 'interface' for me, as it gets more and more ambiguous
definitions in this post alone.

> Note that
> one could potentially mix them in RDBMS, but so far it does not appeal
> very practical. And this is largely because the tight association
> between data and behavior that OO likes simply does not work well in
> biz apps. Thus, heavy behavioraltizing of RDBMS is not useful. I am
> just pointing out it could be done and probably would be done if it
> proved useful. OO forces an overly tight view of data and behavior.
> Relational provides a consistency to declarative interfaces, but OO
> does not provide any real structure and consistency to behavioral
> interfaces. It creates shanty-town biz models.

you declare a lot of IMHO rubbish as 'truth' here. E.g.: why wouldn't
biz apps be helped with OO?

> GOF patterns are supposed to be a solution, but GOF patterns have no
> clear rules about when to use what and force a kind of IS-A view on
> modeling instead of HAS-A.

You also fall into the 'use pattern first, find problem for it
later'-antipattern.

a pattern is a (not the) solution for a well defined recognizable
problem. So if you recognize the problem in your application, you can
use the pattern which solves THAT problem to solve THAT problem in your
application. That's IT. The GoF book names a set of patterns and also
the problems they solve. If you don't have the problems they solve, you
don't need the patterns.

Btw, the GoF book discourages inheritance a lot, just read it. It says
don't use inheritance if you don't have to.

> GOF patterns are like an attempt to
> catalog GO TO patterns instead of rid GO TO's. Relational is
> comparable to the move from structured programming from GO TO's: it
> provides more consistency and factors common activities into a single
> interface convention (relational operators). OO lets people re-invent
> their own just like there are a jillion ways to do the equivalent of
> IF blocks with GO TO's.

I've read a lot of nonsense in your post, but this is one of the most
striking examples. What on earth have GO TO's to do with the topic at
hand?

> OO has simply failed to factor and standardize common relationship and
> collection idioms!!!!
> OO has simply failed to factor and standardize common relationship and
> collection idioms!!!!

take your pills, you apparently forgot them ;)

FB

--
------------------------------------------------------------------------
Lead developer of LLBLGen Pro, the productive O/R mapper for .NET
LLBLGen Pro website: http://www.llblgen.com
My .NET blog: http://weblogs.asp.net/fbouma
Microsoft MVP (C#)
------------------------------------------------------------------------

Thomas Gagne

Dec 21, 2006, 5:30:53 AM
topmind wrote:
> <snip>

>
> BTW, Microsoft has ADO, DAO, etc. which are OO wrappers around RDBMS.
> Java and other vendors do also. Whether OO is the best way wrap RDBMS
> calls is another debate. My point is they already exist.
>
> Further, even if OO *was* the best way to access RDBMS thru an app,
> <snip>
>
You're missing something. I am not advocating wrapping the RDB with OO
stuffs. I am not saying OO is the best way to access a
database--directly or through any of the frameworks mentioned above. In
fact, I'm advocating the opposite. Deal with the DB on its own terms,
but treat it as an object. I'm recommending against accessing it using
its low-level interface (SQL), but instead that a higher-level
application/schema/problem domain-specific API be constructed, most
likely using procedures, and that applications should access the DB that
way.

Dmitry A. Kazakov

Dec 21, 2006, 5:43:09 AM
On 20 Dec 2006 17:07:51 -0800, aloha.kakuikanu wrote:

> Thomas Gagne wrote:
>> If you agree there's such a thing as an object-relational impedance
>> mismatch, ...
>
> This mismatch is mostly caused by the lack of education of object
> propellerheads. Witness pathetic atempts to enhance method dispatch
> with predicates. Finally, some folks begin to understand predicate
> importance! The problem is that if you do it in ad-hock basis you'll
> get some inconsistent messy design.
>
> Here are few facts, which may help you to further appreciate the power
> of relations.
> 1. Function call is formally a relational join (followed by
> projection). That is
>
> f(a) is the same as pi_x ( `y=f(x)` |><| `x=a`)
>
> where `y=f(x)` is a binary relation coressponding to the function, and
> `x=a` is a relation that has a single tuple.

LOL! That's amusing.

When you are going to turn your TV-set on, do you do a relational join over
its diodes and resistors followed by a majestic projection, or just press
the button "ON?"

> A consequence of this fact is that function calls (or arithmetic
> expressions) fit naturally into the SQL select and where clause.
>
> 2. Function composition is a join (again followed by projection).

Ah, you mean "ON" followed by "1"! GREAT!

> 3. Predicates can be mixed with relations, and arbitrary relational
> algebra expression can be transformed into a normal
> 'select-project-join' form. This explains why most queruies fit nicely
> into "select from where" SQL template.

These should be the buttons "Vol+" and "Vol-." But, wait, how to select
next to the last diode in SQL?

> 4. The aggregate/group by construct reflects yet another important
> mathematical construction: the equivalence relation. This is why it is
> so easy to write queries that count things in SQL.

Yes, I always wondered how many diodes the damned thing has...

> This is only the beginning of the list, and I assure you that you'll
> get more return on your investment not if you spend your time
> "brainstorming" how to fit databases into objects, but educating
> yourself what database management really is.

Isn't FORMAT C: /q everything one should know about it? (:-))

--
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de

Thomas Gagne

Dec 21, 2006, 5:52:35 AM
Let me try to clear something up, and thanks to Topmind, Frans, and
Stefan for helping me get there.

In OO, objects are subclassed to make them more specific, not more
general. I consider SQL to be a low level language, as far as RDBs are
concerned, because it is application-ignorant. It's like C for
relational operations. SQL doesn't know anything about my application.

So, I subclass Model (so-to-speak) and add data that is domain-specific
to create my domain-specific database. Why access it from applications
using the same domain-ignorant language? Instead, I construct
procedures that create a domain-specific interface. Why use the
lower-level

select * from account, user where user.userId=X and account.userId =
user.userId

when instead I can use

exec getAccountsFor @userId=X

?

Besides its brevity, the procedure name clearly communicates the intent
of the operation (Smalltalk Best Practice Patterns: Intention Revealing
Message), makes obvious its parameters, and provides a layer of
indirection behind which its implementation can change without affecting
the procedure's users.

From SQL I've constructed procedures to provide a higher-level,
domain-specific, language-and-paradigm-neutral interface to a
domain-specific database.

To find all the places in my application source code that get account
information with user IDs it is much easier to find senders (callers) of
getAccountsFor than it would be to find all the SQL referencing both the
account and user tables. Could the latter be done? Sure, but when a
more efficient and accurate alternative exists why would you?
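
As a rough sketch of the shape of this, assuming SQLite through Python's
standard sqlite3 module: SQLite has no stored procedures, so a plain Python
function named `get_accounts_for` stands in for the `getAccountsFor`
procedure. The columns beyond userId, the sample rows, and the helper
`make_db` are invented for the example; in a real system the procedure would
live inside the database itself.

```python
import sqlite3

def make_db():
    """Build a throwaway in-memory database with the two tables named above."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE "user" (userId INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE account (
            accountId INTEGER PRIMARY KEY,
            userId INTEGER,
            balance INTEGER
        );
        INSERT INTO "user" VALUES (1, 'alice'), (2, 'bob');
        INSERT INTO account VALUES (10, 1, 500), (11, 1, 250), (12, 2, 75);
    """)
    return db

def get_accounts_for(db, user_id):
    """Stand-in for `exec getAccountsFor @userId=X`.

    Callers name what they want; the join between account and user is an
    implementation detail that can change without touching them.
    """
    rows = db.execute(
        'SELECT account.accountId, account.balance '
        'FROM account, "user" '
        'WHERE "user".userId = ? AND account.userId = "user".userId '
        'ORDER BY account.accountId',
        (user_id,),
    )
    return rows.fetchall()

db = make_db()
accounts = get_accounts_for(db, 1)
```

Searching the application source for `get_accounts_for` finds every caller,
which is the point made above about finding senders of getAccountsFor.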

Thomas Gagne

Dec 21, 2006, 5:55:02 AM
Stefan, what "subject" were you replying to when you wrote that?

Thomas Gagne

Dec 21, 2006, 6:15:29 AM

aloha.kakuikanu wrote:
> <snip>


>
> This is only the beginning of the list, and I assure you that you'll
> get more return on your investment not if you spend your time
> "brainstorming" how to fit databases into objects, but educating
> yourself what database management really is.
>
>

I'm going to keep saying this different ways until I finally say it, or
draw a picture, that makes it clear I am not trying to fit the database
inside an object. I am not trying to wrap it inside an object. I am
not advocating an OO framework to arbitrate all DB access.

I am saying domain-specific databases (what makes your application's DB
design different than mine) can be thought-of, and ultimately treated,
as objects. Now--don't run off and think I'm trying to wrap anything.
What I'm saying is that the rules OO designers use to decide what
methods an object should have (and not have) and the justifications for
resisting direct data manipulation can be applied to how applications
interface to a database by deciding that 1) no application should directly
access the DB's data (no SQL) and 2) applications should use the DB only
through its interface. Stored procedures are the best example of the
latter I know of.

Ultimately, I think I may need to come up with another name for a
domain-ized database. The word 'database' has too many possibilities.
It's too general. After I've applied my schema to it it no longer has
all the possibilities it once had. After my schema's applied it becomes
something different. It becomes my domain's data base. My domainabase?

Ed Kirwan

Dec 21, 2006, 7:14:20 AM
Thomas Gagne wrote:
> An unexpected thing happened while debating topmind: I had an epiphany.
> Instead of responding to the news group I thought about it for a short
> bit (very short) and posted an article to my blog titled, "The RDB is
> the biggest object in my system."
>
> <http://blogs.in-streamco.com/anything.php?title=the_rdb_is_the_biggest_object_in_my_syst>
>
>
snip ...
>

Hej, Thomas.

I know nothing about databases.

(I always feel like I've just been to Confession whenever I write that
line; it's quite liberating.)

Your (very interesting) proposal, however, seems similar(ish) to the
ideas that pop up from time to time whenever DBers and OOers meet for
a ho'down and some line-dancing. The general result of these fun
evenings is that DBers point to OO's diluting of the power of the DB in
some way (I've never been sure how). The point is: if you really want
some good DB-centric advice on your proposal, you should consider
posting to comp.databases.theory - those folks know everything there is
to know about DBs and could help you plug any leaks in your endeavours
(and perhaps save you some time in your studies).

If you do make such a post, however, do please multi-post; don't
cross-post (amazing: there is a use for multi-posting after all). The
reason for this request is that cross-posts between c.o and c.d.t tend
to deteriorate alarmingly quickly into sulking, name-calling, and
kill-files bloating to significant proportions of their hosting discs.

Despite this, their DB expertise is, as mentioned, extraordinary; so do
consider popping over there for a chat ... but don't tell them who sent you.

Just a thought.

.ed

PS On an OO note, regarding your, "My schema is a class/my DB is an
object," concept (again, very interesting). I presume here you mean the
DB as the data it contains, rather than a particular vendor's DB such as
Oracle, etc. If so, then there should be a concept of changing the data
wholesale for some other data, without affecting the users of that data
(i.e., the application). I can't really see this happening. If I have
the data for a suit-tailoring business, and applications that graze this
data, then I can't really see that the applications will remain
unchanged when this data is dropped and the data for, say, a
car-manufacturer inserted instead. Silly example, of course, but I hope
it gets the point across: how often do you have one schema with multiple
data-sets conforming to it? Do bear in mind my first sentence ...

--
www.EdmundKirwan.com - Home of The Fractal Class Composition.

Download Fractality, free Java code analyzer:
www.EdmundKirwan.com/servlet/fractal/frac-page130.html

Message has been deleted

Thomas Gagne

Dec 21, 2006, 9:21:25 AM
Ed Kirwan wrote:
> <snip>

>
> Your (very interesting) proposal, however, seems similar(ish) to the
> ideas that pop up from time to time whenever DBers and OOers meeting
> for a ho'down and some line-dancing. The general result of these fun
> evenings is that DBers point to OO's diluting of the power of the DB
> in some way (I've never been sure how). The point is: if you really
> want some good DB-centric advice on your proposal, you should consider
> posting to comp.databases.theory - those folks know everything there
> is to know about DBs and could help you plug any leaks in your
> endeavours (and perhaps save you some time in your studies).
Thank you for the recommendation. I'll post it separately to avoid the
devolving arguments. ;-)
>
> <snip>

>
> PS On an OO note, regarding your, "My schema is a class/my DB is an
> object," concept (again, very interesting). I presume here you mean
> the DB as the data it contains, rather than a particular vendor's DB
> such as Oracle, etc. If so, then there should be a concept of changing
> the data wholesale for some other data, without affecting the users of
> that data (i.e., the application).
In theory (isn't it always) if the interface was consistent between
tailors and auto manufacturers then the answer would be yes. That would
be a kind of polymorphism. Of course, it depends on the interface
staying intact. If the interface is broken it matters little what's on
either side of it--the system is broken.

The metanoia I'm advocating cares little for the DB's vendor. Once
you've created a specific database model to support your application
domain you've created a hypostasis. You started with an empty database
with infinite potential and tailored its purpose for your specific needs
and given it an identity of its own. You started with the general and
hypostatized to the specific. Your database is no longer general
purpose. Its design is intellectual property and its contents proprietary.

This is what happens when we subclass Object to create something
specific, like a Date. Object has the potential to be anything, but
Date has been modified for a specific purpose. It, too, has been
hypostatized. Date has become a species independent of its superclass
with unique data and behavior.

Mike Anderson

Dec 21, 2006, 9:24:39 AM

Stefan's point is very interesting as well, though? To use your same
example,

select * from account, user where user.userId=X and account.userId =
user.userId

could also be seen as a message send, something like:

(Table join: account and: user on: [ :a :u | a id = u id ])
select: #(#field1 #field2) where: [ :each | each user id = X ]

NB. The above is for illustrative purposes, I am not saying that SQL
should be mapped to Smalltalk in that way.

It is true that this is a low-level interaction, and
application-ignorant, but so are all the methods of String, or Integer,
or Array, Socket, SystemDictionary... that doesn't prevent them from
being OO.

Mike

Mike Anderson

Dec 21, 2006, 9:35:57 AM
On reflection, you've actually covered this off in your blog post (as I
see it, anyway); the database is a very large Facade, covering all of
the tables, which are lower-level objects. Do you agree?

Thomas Gagne

Dec 21, 2006, 10:18:02 AM
I've tried, but I've been made aware of some weaknesses in the
description. Not in the premise, but there's confusion in the words
I've chosen--particularly the words transaction and database. After
I've created a database uniquely suited to my application domain's
specific needs it is no longer a 'database' in the generic form, but
something else. After discussing it a bit with a coworker educated in
ancient Greek, we believe hypostasis may be a better term to describe a
customized database.

From <http://www.webster.com/dictionary/hypostasis>

3 a : the substance or essential nature of an individual b : something
that is hypostatized <http://www.webster.com/dictionary/hypostatized>

Once customized for its purpose, the hypostasis is "the substance or
essential nature.." of my system. Plumbers and carpenters are both
skilled tradesmen, but have unique skills peculiar to their specific
trades. We wouldn't confuse (deliberately or accidentally) a plumber's
skillbase with a carpenter's, but yet we don't have a word that
adequately differentiates the skills of either, just as we don't have a
word that differentiates an apparel manufacturer's database from an auto
dealer's database. We just use the word 'database' in different contexts
and hope our readers follow us.

Or at least, I do.

Mike Anderson

Dec 21, 2006, 10:40:50 AM

Well, let me see if I am following you :)

Before your addition of procedures to the database, you can see it as a
large object with poor data hiding. Specifically, it has no methods, so
the only way to interact with it is to send messages directly to its
instance variables (the tables within it).

Once you have added procedures to the database, you have your
hypostasis, and now you can interact with it instead of its contained
objects. In fact, now you can enforce the encapsulation by revoking
permissions on the tables and only granting them on the procedures.
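
As a rough illustration of the encapsulation described above, here is a minimal Python sketch using the standard sqlite3 module. The class name AccountsDb, the account table, and the three methods are all hypothetical, and SQLite has no GRANT or REVOKE, so the "private" connection only imitates what revoking table permissions would enforce on a real server.

```python
import sqlite3


class AccountsDb:
    """Hypothetical facade: callers use named operations, never the tables."""

    def __init__(self):
        # Private by convention only. SQLite has no GRANT/REVOKE, so this merely
        # imitates what revoking table permissions would enforce on a server.
        self._conn = sqlite3.connect(":memory:")
        self._conn.execute(
            "CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL)"
        )

    def open_account(self):
        """Create an empty account and return its id."""
        cur = self._conn.execute("INSERT INTO account (balance) VALUES (0)")
        self._conn.commit()
        return cur.lastrowid

    def deposit(self, account_id, amount):
        """Add a positive amount to the account balance."""
        if amount <= 0:
            raise ValueError("amount must be positive")
        self._conn.execute(
            "UPDATE account SET balance = balance + ? WHERE id = ?",
            (amount, account_id),
        )
        self._conn.commit()

    def balance_of(self, account_id):
        """Return the balance, or None if the account does not exist."""
        row = self._conn.execute(
            "SELECT balance FROM account WHERE id = ?", (account_id,)
        ).fetchone()
        return None if row is None else row[0]
```

A caller would write db.deposit(acct, 25) and never see the account table; the table layout can then change behind the three methods without touching any caller.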

I don't think that's too controversial, actually; many people would
regard this as a Best Practice for updates. I've seen it advocated for
selects too. However, I haven't seen anyone talking about it in OO
terms.

I find this very interesting, because it seems to me that databases
have a strong similarity to images, but whereas images are mostly an
unknown concept in mainstream programming, databases are commonplace.

topmind

Dec 21, 2006, 11:55:09 AM
Frans Bouma wrote:
> topmind wrote:
>
> > Thomas Gagne wrote:
> > > An unexpected thing happened while debating topmind: I had an
> > > epiphany. Instead of responding to the news group I thought about
> > > it for a short bit (very short) and posted an article to my blog
> > > titled, "The RDB is the biggest object in my system."
> > >
> > > <http://blogs.in-streamco.com/anything.php?title=the_rdb_is_the_bigg
> > > est_object_in_my_syst>
> >
> > > From the link:
> >
> > "Why shouldn't applications have embedded SQL? Because it's the same
> > as accessing the private data members of an object. It shouldn't be
> > done. OO programmers know the correct way to interface with an
> > object is to use its method interface--not attempt direct
> > manipulation of the object's data. OO programmer's attempts to
> > violate that rule is what causes so much frustration mapping the
> > application's data graph into a relational database's tables, rows,
> > and columns. Those things belong to the DB--not to the application."
> >
> > (end quote)
> >
> > You OO'ers keep forgetting: SQL is an interface. I repeat, SQL is
> > an interface. It is not "low level hardware".
>
> SQL is a set-oriented language, it's not an interface as a language
> doesn't do anything without context (in this case a parser-interpreter
> combi)

Perhaps we need to clear up our working semantics with regard to
"language" and "interface". Are methods interfaces or a language? I am
not sure it really matters and I don't want to get tangled in a
definition battle.

>
> > You OO'ers keep
> > viewing it as low-level stuff because you don't seem to like it, and
> > you wrap anything you don't like behind OO and call it "low level" so
> > that it fits your personal subjective preference and world view. OO
> > may fit your mind better for whatever reason, but you cannot assume
> > your head is God's template for every other individual.
>
> of course it's not low level stuff, SQL is a set-oriented language and
> therefore doesn't match object-oriented languages, so a 'translation'
> has to be made as you can't project one onto another in a 1:1 fashion.

An option is to not use OO.

>
> > BTW, Microsoft has ADO, DAO, etc. which are OO wrappers around RDBMS.
>
> no they're not. ADO and DAO aren't OO, as they're COM based so they're
> actually procedural (library interfaces implemented on a live object).

Being an OO wrapper on top of procedural calls does not necessarily
turn something into non-OO. Please clarify your labelling criteria.

> Furthermore, they provide the interface you talked about to the DB,
> which is often referred to as 'the client interface' or 'provider' when
> it comes to database access.

This does not contradict anything I've said.

>
> > Further, even if OO was the best way to access RDBMS thru an app,
> > that does not necessarily extrapolate to all domains. OO being good
> > for X does not automatically imply it is good for Y also.
>
> you don't get the point: in an OO application, which works on data IN
> the application, you want to do that in an OO fashion.

Why? Is OO proven objectively better?

> To obtain the
> data from the outside is initiated INSIDE the application, thus also in
> an OO fashion. As an RDBMS doesn't understand OO in most cases, but it
> works with SQL as it has a SQL interpreter in place to let you program
> its internal relational algebra statements in a more readable way,
> you've to map statements from OO to SQL and set oriented results (the
> sets) from the DB back to OO objects.

Are you suggesting methods such as "Add_AND_Clause(column,
comparisonOperator, Value)"?

Those are bloaty and ugly in my opinion, but let's save that value
judgement for a later debate on clause/criteria wrappers.

>
> > I have
> > already agreed that OO may be good for writing device drivers and
> > device-driver-like things; but it has not been shown useful to view
> > everything as a device driver. I am more interested in seeing how OO
> > models biz objects rather than how it wraps system services and the
> > like. Biz modeling has been OO's toughest evidence cookie to crack
> > (but perhaps not the only).
>
> huh? walls full of books have been written about this topic and you
> declare it the toughest cookie to crack...

Such as? I've seen biz examples in OOP books, but they did not show how
they were better than the alternative. Showing how to make an Employee
class does not by itself tell you why an Employee class is better than
not using OO.

If "language" suits you better, that is fine by me. My main point is
that it is not "physical implementation" to be wrapped away. The use
of "direct data manipulation" was what I was responding to.

>
> > Note that
> > one could potentially mix them in RDBMS, but so far it does not appeal
> > very practical. And this is largely because the tight association
> > between data and behavior that OO likes simply does not work well in
> > biz apps. Thus, heavy behavioraltizing of RDBMS is not useful. I am
> > just pointing out it could be done and probably would be done if it
> > proved useful. OO forces an overly tight view of data and behavior.
> > Relational provides a consistency to declarative interfaces, but OO
> > does not provide any real structure and consistency to behavioral
> > interfaces. It creates shanty-town biz models.
>
> you declare a lot of IMHO rubbish as 'truth' here. E.g.: why wouldn't
> biz apps be helped with OO?

(For the record, the use of "rubbish" is a sign of rudeness. Thus, I
did not start the rudeness between us.)

I've never seen it happen. I am not claiming it can't or doesn't, only
that there is no public objective inspectable evidence that it does.
Yet, many push it thru as if it already "passed". I don't claim that
unicorns don't exist, only that I have not seen any captured for
analysis.

>
> > GOF patterns are supposed to be a solution, but GOF patterns have no
> > clear rules about when to use what and force a kind of IS-A view on
> > modeling instead of HAS-A.
>
> You also fall into the 'use pattern first, find problem for it
> later'-antipattern.
>
> a pattern is a (not the) solution for a well defined recognizable
> problem. So if you recognize the problem in your application, you can
> use the pattern which solves THAT problem to solve THAT problem in your
> application. THat's IT. The GoF book names a set of patterns and also
> the problems they solve. If you don't have the problems they solve, you
> don't need the patterns.

Well, a look-up table is usually simpler and more inspectable than
Visitor. Thus, if usefulness is our guide, then GOF patterns are often
not the best.

>
> Btw, the GoF book discourages inheritance a lot, just read it. It says
> don't use inheritance if you don't have to.

If you take away inheritance, you get "network structures" (AKA tangled
pasta). Dr. Codd sought to escape those by applying set theory, and
network structures thankfully fell out of favor, until the OO crowd
tried to bring them back from the dead.

>
> > GOF patterns are like an attempt to
> > catalog GO TO patterns instead of rid GO TO's. Relational is
> > comparable to the move from structured programming from GO TO's: it
> > provides more consistency and factors common activities into a single
> > interface convention (relational operators). OO lets people re-invent
> > their own just like there are a jillion ways to do the equivalent of
> > IF blocks with GO TO's.
>
> I've read a lot of nonsense in your post,

No, the nonsense comes from the OO zealots. They have no proof for biz
apps. Two paradigms are equal or unknown until proven otherwise. I want
to see science, not brochures.

> but this is one of the most
> striking examples. WHat on earth have GO TO's to do with the topic at
> hand?

It is an analogy.

>
> > OO has simply failed to factor and standardize common relationship and
> > collection idioms!!!!
> > OO has simply failed to factor and standardize common relationship and
> > collection idioms!!!!
>
> take your pills, you apparently forgot them ;)

And you took your LSD: you hallucinate evidence that ain't there.

>
> FB
>

-T-
oop.ismad.com

topmind

Dec 21, 2006, 3:30:04 PM
Thomas Gagne wrote:
> Let me try to clear something up, and thanks to Topmind, Frans, and
> Stefan for helping me get there.
>
> In OO, objects are subclassed to make them more specific, not more
> general. I consider SQL to be a low level language, as far as RDBs are
> concerned, because it is application-ignorant.

So you are defining "low level" as application-ignorant? I find that a
stretch, but let's continue with it as a working/local definition.

> It's like C for
> relational operations. SQL doesn't know anything about my application.
>
> So, I subclass Model (so-to-speak) and add data that is domain-specific
> to create my domain-specific database.

In the app? Please clarify. Do you mean create the database, or an OO
*view* of the database?

> Why access it from applications
> using the same domain-ignorant language? Instead, I construct
> procedures that create a domain-specific interface. Instead of the
> lower-level
>
> select * from account, user where user.userId=X and account.userId =
> user.userId
>
> when instead I can use
>
> exec getAccountsFor @userId=X

A side note here before I continue. First, some versions of SQL can do
a "natural join" such that you don't have to explicitly declare the
common/default joins between two or more tables. Thus, there are
shorter possibilities. (SQL is hardly the pinnacle of relational
languages IMO, but it is still better than being without a RDBMS.)

Now, the advantage of embedded SQL is that one can add to or change it
as needed without having to hop around. If we need additional
criteria or columns, we just add them in ONE module. If you have a
separate place for SQL and another for app code, then you have to visit
and modify two different modules. That is more work because 2 is
greater than 1. Hopping around slows down development and
modifications.

>
> ?
>
> Besides its brevity, the procedure name clearly communicates the intent
> of the operation (stbpp pattern: intention revealing message), makes
> obvious its parameters, and provides a layer of indirection behind which
> its implementation can change without affecting the procedure's users.

Use comments. And, "getFoo" is hardly an improvement over
"select...from Foo".

>
> From SQL I've constructed procedures to provide a higher-level,
> domain-specific, language-and-paradigm-neutral interface to a
> domain-specific database.

How is it more "neutral" than SQL? A stored procedure still has its own
syntax and rules. What common scenarios are you saving us from?

>
> To find all the places in my application source code that get account
> information with user IDs it is much easier to find senders (callers) of
> getAccountsFor than it would be to find all the SQL referencing both the
> account and user tables. Could the latter be done? Sure, but when a
> more efficient and accurate alternative exists why would you?

Again, this gets back to the change effort cost and frequency analysis
that was part of the last topic. I weigh the costs of all the kinds of
changes when I decide to embed or separate SQL. Most changes that I
encounter in the field favor embedding. If your experience or shop
pattern is different, then we will just have to agree to disagree.

It again comes down to frequencies, and I disagree with your frequency
assessment. We are back to where we ended on the last topic. One should
look at the human effort and frequency involved, not just use
(disputed) labels such as "low level" etc. to shape our decision.

It is a kind of Frederick Winslow Taylor (time and motion studies)
style of decision making. In my years of experience, embedding reduces
the *net* hopping-around effort. Yes, there are times where isolation
of all the SQL would save time, but not enough to make up for the
others.

>
> --
> Visit <http://blogs.instreamfinancial.com/anything.php>
> to read my rants on technology and the finance industry.

-T-

Thomas Gagne

Dec 21, 2006, 4:02:16 PM
topmind wrote:
> Thomas Gagne wrote:
>
>> Let me try to clear something up, and thanks to Topmind, Frans, and
>> Stefan for helping me get there.
>>
>> In OO, objects are subclassed to make them more specific, not more
>> general. I consider SQL to be a low level language, as far as RDBs are
>> concerned, because it is application-ignorant.
>>
>
> So you are defining "low level" as application-ignorant? I find that a
> stretch, but let's continue with it as a working/local definition.
>
Low-level meaning further away from my business. Assembly is even
further away from my business than SQL is. Imagine if I'd created a
language that was application-specific, it would be higher-level than
SQL. For instance, if I'd created a language that understood:

purchase 10 shares of IBM into anAccount

It would be pretty darn high-level. By grafting application-aware
constructs into SQL (views and procedures) it becomes increasingly
higher-level.


>
>> It's like C for
>> relational operations. SQL doesn't know anything about my application.
>>
>> So, I subclass Model (so-to-speak) and add data that is domain-specific
>> to create my domain-specific database.
>>
>
> In the app? Please clarify. Do you mean create the database, or an OO
> *view* of the database?
>

I actually mean, "create the database" as in:
create database bookstore;
create table book (...);


>
>> Why access it from applications
>> using the same domain-ignorant language? Instead, I construct
>> procedures that create a domain-specific interface. Instead of the
>> lower-level
>>
>> select * from account, user where user.userId=X and account.userId =
>> user.userId
>>
>> when instead I can use
>>
>> exec getAccountsFor @userId=X
>>
>

> <snip>


>> ?
>>
>> Besides its brevity, the procedure name clearly communicates the intent
>> of the operation (stbpp pattern: intention revealing message), makes
>> obvious its parameters, and provides a layer of indirection behind which
>> its implementation can change without affecting the procedure's users.
>>
>
> Use comments. And, "getFoo" is hardly an improvement over
> "select...from Foo".
>

That's a strawman. I'm sure you can imagine a more complicated SELECT
statement joining 12 tables, with or without natural joins, UNION'ing to
another select. Sure, I could comment it, or I could just create a
procedure and "exec searchForAccount @accountId=..."


>
>> From SQL I've constructed procedures to provide a higher-level,
>> domain-specific, language-and-paradigm-neutral interface to a
>> domain-specific database.
>>
>
> How is it more "neutral" than SQL? A stored procedure still has its own
> syntax and rules. What common scenarios are you saving us from?
>

It's neutral in the sense it can be invoked from C, Python, Java, PHP,
and SQL scripts--all with exactly the same behavior in the same
database. Remember, I'm not wrapping anything in OO, I'm just giving
the DB an API that facilitates my domain solution across language and
paradigm boundaries.

A view does the same thing, only it's not as capable as a stored
procedure is. I can create a view to do all kinds of useful projections
that can be called from any language with an attachment to the database,
but a view can't be extended later to record that a user queried it.
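
To make the difference concrete, here is a small Python sketch using the standard sqlite3 module. SQLite has no stored procedures, so the function get_accounts_for only stands in for "exec getAccountsFor"; the audit table, the sample rows, and the column names are assumptions made for illustration.

```python
import sqlite3


def make_db():
    """Build a throwaway in-memory database with hypothetical sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE "user" (userId INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE account (accountId INTEGER PRIMARY KEY, userId INTEGER);
        CREATE TABLE audit (userId INTEGER, action TEXT);
        INSERT INTO "user" VALUES (1, 'ann'), (2, 'bo');
        INSERT INTO account VALUES (10, 1), (11, 1), (12, 2);
        """
    )
    return conn


def get_accounts_for(conn, user_id):
    """Stand-in for `exec getAccountsFor @userId=X`."""
    rows = conn.execute(
        'SELECT a.accountId FROM account a JOIN "user" u ON a.userId = u.userId '
        "WHERE u.userId = ? ORDER BY a.accountId",
        (user_id,),
    ).fetchall()
    # Added later without touching any caller: record that the query ran.
    # A plain view has no place to put this side effect.
    conn.execute("INSERT INTO audit VALUES (?, 'getAccountsFor')", (user_id,))
    conn.commit()
    return [row[0] for row in rows]
```

Callers keep writing get_accounts_for(conn, 1) before and after the audit line is added, which is the layer of indirection being argued for.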
> <snip>


> It again comes down to frequencies, and I disagree with your frequency
> assessment. We are back to where we ended on the last topic. One should
> look at the human effort and frequency involved, not just use
> (disputed) labels such as "low level" etc. to shape our decision.
>
> It is a kind of Frederick Winslow Taylor (time and motion studies)
> style of decision making. In my years of experience, embedding reduces
> the *net* hopping-around effort. Yes, there are times where isolation
> of all the SQL would save time, but not enough to make up for the
> others.
>

I have insider knowledge of two decent sized commercial finance
applications. One can count the number of lines of SQL because it's
separated into procedures and views, the other can only estimate because
the SQL is embedded. The first /knows/ that 37% of its source code is
SQL (sql, procedures, and views) making up 570 distinct requests of the
DB. The second estimates it around 33% but can not count (so quickly)
the number of distinct DB requests there may be. The designer of the
second (rightly, I think) believes counting the number of SQL lines
would be difficult since they're distributed throughout his code, and
include string concatenation for variables and discriminations spread
throughout functions.

If any SQL has to be modified in the first system, a new procedure can
be loaded into the database without affecting any of dozens of
applications. In fact, their source code needn't even be grep'ed to
find references to tables or columns. The second example would require
a grep, a fix, and a redeployment. Even in an ASP moving something to
production prudently requires a trip through some testing.

The chances of something needing fixing or enhancing in 33-40% is 1 in 3
or 2 in 5, depending on which estimate you want to take. All fixes
being equal (they are not), the first system's applications will spend
37% less time going through QA, suggesting they can move more quickly
than competitors with 37% of their source code not isolated from their
applications (as yours sounds to be).

topmind

Dec 21, 2006, 4:06:36 PM
Thomas Gagne wrote:
> topmind wrote:
> > <snip>
> >
> > BTW, Microsoft has ADO, DAO, etc. which are OO wrappers around RDBMS.
> > Java and other vendors do also. Whether OO is the best way wrap RDBMS
> > calls is another debate. My point is they already exist.
> >
> > Further, even if OO *was* the best way to access RDBMS thru an app,
> > <snip>
> >
> You're missing something. I am not advocating wrapping the RDB with OO
> stuffs. I am not saying OO is the best way to access a
> database--directly or through any of the frameworks mentioned above. In
> fact, I'm advocating the opposite. Deal with the DB on its own terms,
> but treat it as an object.

Could you be more specific on "treat it as an object"? OOP is not
consistently defined, so we have to be careful about labelling
stored procedures as an OO concept.

> I'm recommending against accessing it using
> its low-level interface (SQL),

In a sister reply, I challenged your labelling of SQL as "low level".

> but instead that a higher-level
> application/schema/problem domain-specific API be constructed, most
> likely using procedures, and that applications should access the DB that
> way.

Like I've described many times, there are some labor-intensive
drawbacks to that.

>
> --
> Visit <http://blogs.instreamfinancial.com/anything.php>
> to read my rants on technology and the finance industry.

-T-

Thomas Gagne

Dec 21, 2006, 4:13:26 PM
You've said it many times before, but perhaps you can give an example of
some SQL that's easier to fix when embedded than as a
procedure--including your normal QA procedures and promotion to production.

Neo

Dec 21, 2006, 4:22:00 PM
> If you take away inheritance, you get "network structures" (AKA tangled
> pasta). Dr. Codd sought to escape those by applying set theory, and
> network structures thankfully fell out of favor, until the OO crowd
> tried to bring them back from the dead.

Can you give an example of such tangled pasta? How did Dr Codd make
network structures fall out of favor?

topmind

Dec 21, 2006, 4:33:54 PM
Thomas Gagne wrote:
> topmind wrote:
> > Thomas Gagne wrote:
> >
> >> Let me try to clear something up, and thanks to Topmind, Frans, and
> >> Stefan for helping me get there.
> >>
> >> In OO, objects are subclassed to make them more specific, not more
> >> general. I consider SQL to be a low level language, as far as RDBs are
> >> concerned, because it is application-ignorant.
> >>
> >
> > So you are defining "low level" as application-ignorant? I find that a
> > stretch, but let's continue with it as a working/local definition.
> >
> Low-level meaning further away from my business. Assembly is even
> further away from my business than SQL is. Imagine if I'd created a
> language that was application-specific, it would be higher-level than
> SQL. For instance, if I'd created a language that understood:
>
> purchase 10 shares of IBM into anAccount
>
> It would be pretty darn high-level. By grafting application-aware
> constructs into SQL (views and procedures) it becomes increasingly
> higher-level.

Why not say "application-specific" instead of "high-level"?

Regardless of how large it is, if it has to be changed it has to be
changed. Since SQL changes also tend to mirror app changes and vice
versa, if they are in the same module, we have less hopping around to
do.

> >
> >> From SQL I've constructed procedures to provide a higher-level,
> >> domain-specific, language-and-paradigm-neutral interface to a
> >> domain-specific database.
> >>
> >
> > How is it more "neutral" than SQL? A stored procedure still has its own
> > syntax and rules. What common scenarios are you saving us from?
> >
> It's neutral in the sense it can be invoked from C, Python, Java, PHP,
> and SQL scripts--all with exactly the same behavior in the same
> database.

Same with SQL. That is not a distinguishing feature.

> Remember, I'm not wrapping anything in OO, I'm just giving
> the DB an API that facilitates my domain solution across language and
> paradigm boundaries.
>
> A view does the same thing, only it's not as capable as a stored
> procedure is. I can create a view to do all kinds of useful projections
> that can be called from any language with an attachment to the database,
> but a view can't be extended later to record that a user queried it.

Again, that is a vendor-specific limitation. It is like complaining
about OO because Java does not have multiple inheritance. The lack of
MI is a Java-specific lack, not OO specifically.

> > <snip>
> > It again comes down to frequencies, and I disagree with your frequency
> > assessment. We are back to where we ended on the last topic. One should
> > look at the human effort and frequency involved, not just use
> > (disputed) labels such as "low level" etc. to shape our decision.
> >
> > It is a kind of Frederick Winslow Taylor (time and motion studies)
> > style of decision making. In my years of experience, embedding reduces
> > the *net* hopping-around effort. Yes, there are times where isolation
> > of all the SQL would save time, but not enough to make up for the
> > others.
> >
> I have insider knowledge of two decent sized commercial finance
> applications. One can count the number of lines of SQL because it's
> separated into procedures and views, the other can only estimate because
> the SQL is embedded. The first /knows/ that 37% of its source code is
> SQL (sql, procedures, and views) making up 570 distinct requests of the
> DB. The second estimates it around 33% but can not count (so quickly)
> the number of distinct DB requests there may be. The designer of the
> second (rightly, I think) believes counting the number of SQL lines
> would be difficult since they're distributed throughout his code, and
> include string concatenation for variables and discriminations spread
> throughout functions.

That is a very minor reason to separate. Is it worth making the app 10%
to 25% more time-consuming to maintain *just* to be able to count
easier? I have to object. Perhaps you have weird managers.

One can get an approximate by sampling about 20 modules, counting the
SQL, finding the percent of source code it consumes, and then count all
the source lines and multiply by the sample percentage.
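
The sampling estimate described above can be sketched in a few lines of Python. The function name, the (total_lines, sql_lines) input format, and the fixed random seed are assumptions made for illustration; in practice the SQL lines would only be counted by hand for the sampled modules.

```python
import random


def estimate_sql_lines(modules, sample_size=20, seed=0):
    """Estimate the SQL share and total SQL lines from a sample of modules.

    `modules` is a list of (total_lines, sql_lines) pairs. Only the sampled
    modules contribute their sql_lines figure to the estimate.
    """
    rng = random.Random(seed)
    sample = rng.sample(modules, min(sample_size, len(modules)))
    sampled_total = sum(total for total, _ in sample)
    sampled_sql = sum(sql for _, sql in sample)
    share = sampled_sql / sampled_total  # SQL fraction within the sample
    all_lines = sum(total for total, _ in modules)
    return share, share * all_lines  # extrapolate to the whole code base
```

The result is only as good as the sample: if SQL is concentrated in a handful of modules, 20 modules may miss most of it.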

>
> If any SQL has to be modified in the first system, a new procedure can
> be loaded into the database without affecting any of dozens of
> applications.

It depends. Again, if the same query is used by *multiple* places in an
app, I am not against putting the SQL into a subroutine to simplify its
change.

> In fact, their source code needn't even be grep'ed to
> find references to tables or columns.

Why not? Why are SPs on the database more searchable than SQL in source
code?

> The second example would require
> a grep, a fix, and a redeployment. Even in an ASP moving something to
> production prudently requires a trip through some testing.

I am not against testing either.

>
> The chances of something needing fixing or enhancing in 33-40% is 1 in 3
> or 2 in 5, depending on which estimate you want to take.

I am not sure what you are measuring here. 33-40% of what?

> All fixes
> being equal (they are not), the first system's applications will spend
> 37% less time going through QA, suggesting they can move more quickly
> than competitors with 37% of their source code not isolated from their
> applications (as yours sounds to be).

Please clarify what you are measuring/comparing.

Again, my decision to embed most SQL is based on my experience with
various change frequencies and change scenarios. It is not a "random"
decision. If your experience differs, so be it. Just don't claim it a
universal "best practice"; otherwise I will hold you to the scientific
method.

By the way, one valid reason to separate is that the SQL "programmer"
is different from the app programmer and one does not know the other's
language.

-T-

topmind

Dec 21, 2006, 4:39:14 PM

Neo wrote:
> > If you take away inheritance, you get "network structures" (AKA tangled
> > pasta). Dr. Codd sought to escape those by applying set theory, and
> > network structures thankfully fell out of favor, until the OO crowd
> > tried to bring them back from the dead.
>
> Can you give an example of such tangled pasta?

OO Visitor pattern.

> How did Dr Codd make
> network structures fall out of favor?

By using examples and logic. And users of RDBMS found them more useful
than network DB's. Large network DB's only exist now for specialized
niches. If it was not for an OODBMS push from OO fans, nobody would
even talk about them anymore.

-T-

Neo

Dec 21, 2006, 5:43:18 PM
> > Can you give an example of such tangled pasta?
> OO Visitor pattern.

Thx, I am still trying to understand it at wikipedia.

> > How did Dr Codd make network structures fall out of favor?
>
> By using examples and logic. And users of RDBMS found them more useful
> than network DB's. Large network DB's only exist now for specialized
> niches. If it was not for an OODBMS push from OO fans, nobody would
> even talk about them anymore.

The CODASYL network data model is mostly a misnomer. It is better called
a Hierarchical/Relational Hybrid Data Model. A true network database
should allow each thing to be related to any other thing, possibly
similar to the human mind. Much data does tend to fit in table-like
structures, making RMDB an excellent tool.

Yet, network structures are everywhere. The following example represents a
network where john likes mary, john hates bob, and like is the opposite of
hate. A query finds the person with whom john's relationship is the
opposite of that with mary. Can a RMDB user post an equivalent solution
to model/query this simple network? If possible, the solution's
schema/queries should be resilient to future/unknown data requirements.

(new 'john) (new 'mary) (new 'bob)
(new 'like) (new 'hate) (new 'opposite)

(set like opposite hate)
(set hate opposite like)

(set john like mary)
(set john hate bob)

(; Get person with whom
john's relationship is opposite of that with mary)
(; Gets bob)
(get john (get (get john * mary) opposite *) *)

(; Get person with whom
john's relationship is opposite of that with bob)
(; Gets mary)
(get john (get (get john * bob) opposite *) *)
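
For readers unfamiliar with the notation above, the same network can be approximated in Python as a set of (subject, relation, object) triples. This is only an assumed illustration and not the database used in the post; the helper names match and opposite_partner are hypothetical, and None plays the role of the * wildcard.

```python
def match(triples, subject=None, relation=None, obj=None):
    """Return matching triples in sorted order; None plays the role of *."""
    pattern = (subject, relation, obj)
    return [
        t for t in sorted(triples)
        if all(want is None or want == have for want, have in zip(pattern, t))
    ]


def opposite_partner(triples, person, other):
    """Person whose relationship with `person` is the opposite of `other`'s."""
    relation = match(triples, person, None, other)[0][1]         # e.g. like
    opposite = match(triples, relation, "opposite", None)[0][2]  # e.g. hate
    return match(triples, person, opposite, None)[0][2]


NETWORK = {
    ("like", "opposite", "hate"),
    ("hate", "opposite", "like"),
    ("john", "like", "mary"),
    ("john", "hate", "bob"),
}
```

Here opposite_partner(NETWORK, "john", "mary") walks the same three nested lookups as the innermost-to-outermost get expressions above.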

topmind

Dec 21, 2006, 6:05:16 PM

I thought I already did. Anyhow, here is another:

We have a typical Employees table. After 9/11 we want to add a new
column "security clearance level". We need to add this to the Employee
input screen and the Query By Example screen.

For the input screen, we need to change the SQL from:

UPDATE emp SET ... foo=&bar& WHERE empID = &empID&

To

UPDATE emp SET ... foo=&bar&, secClrLvl = &secClrLvl& WHERE empID =
&empID&

It is in the same module that generates the screen. Thus I only have to
visit one module to add this column. I can add it both to the screen
field specification and to the SQL related to inserting and updating.

You would have to visit both the screen app module and the SP(s).
Plus, you have to add new parameters.

Often I also use techniques to generate most of the SET clause based on
data dictionaries or validation routines to kill 2 birds with one
stone. Passing such a generated string can be a PITA with stored
procedures.
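
A minimal Python sketch of that data-dictionary technique, using the standard sqlite3 module, might look like the following. The dictionary EMP_FIELDS, the name column, and both function names are hypothetical; emp, empID and secClrLvl follow the example above. Column names come from the trusted dictionary and values are bound as parameters rather than concatenated into the string.

```python
import sqlite3

# Hypothetical data dictionary: one entry per editable column, mapping the
# column name to the conversion applied to the raw form value.
EMP_FIELDS = {
    "name": str,
    "secClrLvl": int,  # the newly added column appears only here
}


def build_update(table, key_column, fields):
    """Return a parameterized UPDATE covering every column in `fields`."""
    set_clause = ", ".join(f"{col} = :{col}" for col in fields)
    return f"UPDATE {table} SET {set_clause} WHERE {key_column} = :{key_column}"


def save_employee(conn, emp_id, form_values):
    """Convert form values with the dictionary, then run the generated UPDATE."""
    params = {col: cast(form_values[col]) for col, cast in EMP_FIELDS.items()}
    params["empID"] = emp_id
    conn.execute(build_update("emp", "empID", EMP_FIELDS), params)
    conn.commit()
```

Adding another column then means adding one dictionary entry; the SET clause and the conversion both follow from it.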

-T-

topmind

Dec 21, 2006, 6:26:31 PM

Neo wrote:
> > > Can you give an example of such tangled pasta?
> > OO Visitor pattern.
>
> Thx, I am still trying to understand it at wikipedia.
>
> > > How did Dr Codd make network structures fall out of favor?
> >
> > By using examples and logic. And users of RDBMS found them more useful
> > than network DB's. Large network DB's only exist now for specialized
> > niches. If it was not for an OODBMS push from OO fans, nobody would
> > even talk about them anymore.
>
> The CODASYL network data model is mostly a misnomer. It is better called
> a Hierarchical/Relational Hybrid Data Model. A true network database
> should allow each thing to be related to any other thing, possibly
> similar to the human mind. Much data does tend to fit in table-like
> structures, making RMDB an excellent tool.
>
> Yet, network structures are everywhere. The following example represents a
> network where john likes mary, john hates bob, and like is the opposite of
> hate. A query finds the person with whom john's relationship is the
> opposite of that with mary. Can a RMDB user post an equivalent solution
> to model/query this simple network? If possible, the solution's
> schema/queries should be resilient to future/unknown data requirements.

One generally uses many-to-many tables for such. Example:

table: Likes
-------------
personRef1
personRef2

table: Hates
--------------
personRef1
personRef2

table: Opposites
------------
factorRef1
factorRef2

Or we could meta-tize it to make it more flexible:

table: PeopleRelationships
-----------
personRef1
personRef2
relationRef // Example: "Hate" ("relation" table not shown)

table: RelationRelationships
--------
relationRef1
relationRef2
relationRef // Example: "Opposite"

I will leave the query work to somebody else.
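
For what it's worth, the query against the "meta-tized" tables can be sketched as follows. This is only a sketch under assumptions: Python with sqlite3, plain names used as keys instead of references into the unshown "relation" table, and a made-up helper called opposite_of:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE PeopleRelationships (personRef1 TEXT, personRef2 TEXT, relationRef TEXT);
CREATE TABLE RelationRelationships (relationRef1 TEXT, relationRef2 TEXT, relationRef TEXT);
INSERT INTO PeopleRelationships VALUES ('john', 'mary', 'like'), ('john', 'bob', 'hate');
INSERT INTO RelationRelationships VALUES
    ('like', 'hate', 'opposite'), ('hate', 'like', 'opposite');
""")

# Start from the known relationship, look up its opposite relation,
# then find whom the same person has that opposite relation with.
QUERY = """
SELECT other.personRef2
FROM PeopleRelationships known
JOIN RelationRelationships rr
  ON rr.relationRef1 = known.relationRef AND rr.relationRef = 'opposite'
JOIN PeopleRelationships other
  ON other.personRef1 = known.personRef1 AND other.relationRef = rr.relationRef2
WHERE known.personRef1 = ? AND known.personRef2 = ?
"""


def opposite_of(person, known_person):
    """People whose relationship with `person` is opposite to that with `known_person`."""
    return [row[0] for row in conn.execute(QUERY, (person, known_person))]


with_mary = opposite_of("john", "mary")
with_bob = opposite_of("john", "bob")
```

Neither the relation names nor the people appear in the query text, so new relations only need new rows.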

But this is kind of a "toy" example, such as an AI lab. I would like to
see something more practical.

>
> (new 'john) (new 'mary) (new 'bob)
> (new 'like) (new 'hate) (new 'opposite)
>
> (set like opposite hate)
> (set hate opposite like)
>
> (set john like mary)
> (set john hate bob)
>
> (; Get person with whom
> john's relationship is opposite of that with mary)
> (; Gets bob)
> (get john (get (get john * mary) opposite *) *)
>
> (; Get person with whom
> john's relationship is opposite of that with bob)
> (; Gets mary)
> (get john (get (get john * bob) opposite *) *)

-T-

Neo

unread,
Dec 21, 2006, 10:37:27 PM12/21/06
to
> But this is kind of a "toy" example, such as an AI lab. I would like to
> see something more practical. I will leave the query work to somebody else.

If it is a "toy" example, how difficult could it be to post the query
to find the person with whom john's relationship is opposite of, that
with mary? Then I can proceed to compare how an rmdb vs a network-type
db handle additional data requirements.

topmind

unread,
Dec 22, 2006, 2:02:11 AM12/22/06
to
Neo wrote:
> > But this is kind of a "toy" example, such as an AI lab. I would like to
> > see something more practical. I will leave the query work to somebody else.
>
> If it is a "toy" example, how difficult could it be to post the query
> to find the person with whom john's relationship is opposite of, that
> with mary?

Toy examples are not necessarily trivial. The problem is their
representativeness, not simplicity level. (Whether the solution is
simple or not, I won't bother with.)

> Then I can proceed to compare how an rmdb vs a network-type
> db handle additional data requirements.

Based on past experience with the dubious utility of toy/lab examples,
I think I will elect to skip this.

-T-

Frans Bouma

unread,
Dec 22, 2006, 3:51:56 AM12/22/06
to
topmind wrote:
> Frans Bouma wrote:
> > topmind wrote:
> > > You OO'ers keep forgetting: SQL is an interface. I repeat, SQL is
> > > an interface. It is not "low level hardware".
> >
> > SQL is a set-oriented language, it's not an interface as a language
> > doesn't do anything without context (in this case a
> > parser-interpreter combi)
>
> Perhaps we need to clear up our working semantics with regard to
> "language" and "interface". Are methods interfaces or a language? I
> am not sure it really matters and I don't want to get tangled in a
> definition battle.

Methods are part of an interface written in a language. SQL is a
language, a set of stored procs is an interface.

it's not getting much simpler than that.

> > > BTW, Microsoft has ADO, DAO, etc. which are OO wrappers around
> > > RDBMS.
> >
> > no they're not. ADO and DAO aren't OO, as they're COM based so
> > they're actually procedural (library interfaces implemented on a
> > live object).
>
> Being an OO wrapper on top of procedural calls does not necessarily
> turn something into non-OO. Please clarify your labelling criteria.

ADO isn't OO, it's COM. COM isn't OO, despite the fact it lets you
believe you're working with objects. That is actually a facade: you're
not working with OOP-style objects, as there's no inheritance nor
polymorphism. You just talk to an interface implemented by an
object-esque construct in memory, which could be seen as a C struct with
function pointers.

> > > Further, even if OO was the best way to access RDBMS thru an app,
> > > that does not necessarily extrapolate to all domains. OO being
> > > good for X does not automatically imply it is good for Y also.
> >
> > you don't get the point: in an OO application, which works on data
> > IN the application, you want to do that in an OO fashion.
>
> Why? Is OO proven objectively better?

why would one WANT to use 2 paradigms, which aren't related as in one
is derived from the other, in a single application? (let's redirect the
'what's a paradigm' posts to /dev/null/ first)

> > To obtain the
> > data from the outside is initiated INSIDE the application, thus
> > also in an OO fashion. As an RDBMS doesn't understand OO in most
> > cases, but it works with SQL as it has a SQL interpreter in place
> > to let you program its internal relational algebra statements in a
> > more readable way, you've to map statements from OO to SQL and set
> > oriented results (the sets) from the DB back to OO objects.
>
> Are you suggesting methods such as "Add_AND_Clause(column,
> comparisonOperator, Value)"?

No.

> Those are bloaty and ugly in my opinion, but let's save that value
> judgement for a later debate on clause/criteria wrappers.

You can perfectly write a set of predicate classes which can be
inherited by the developer and make them more specific to the domain
the developer is working with.
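
A rough sketch of what such predicate classes might look like, assuming Python. The names Predicate, Compare, And, and OpenContract are invented for illustration and are not the API of any O/R mapper:

```python
class Predicate:
    """Base class: a predicate renders to a SQL fragment plus bound parameters."""

    def to_sql(self):
        raise NotImplementedError

    def __and__(self, other):
        return And(self, other)


class Compare(Predicate):
    def __init__(self, column, op, value):
        self.column, self.op, self.value = column, op, value

    def to_sql(self):
        return f"{self.column} {self.op} ?", [self.value]


class And(Predicate):
    def __init__(self, left, right):
        self.left, self.right = left, right

    def to_sql(self):
        left_sql, left_params = self.left.to_sql()
        right_sql, right_params = self.right.to_sql()
        return f"({left_sql} AND {right_sql})", left_params + right_params


class OpenContract(Compare):
    """Domain-specific predicate derived from the generic one (hypothetical)."""

    def __init__(self):
        super().__init__("status", "=", "open")


where, params = (OpenContract() & Compare("amount", ">", 1000)).to_sql()
```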

> > > I have
> > > already agreed that OO may be good for writing device drivers and
> > > device-driver-like things; but it has not been shown useful to
> > > view everything as a device driver. I am more interested in
> > > seeing how OO models biz objects rather than how it wraps system
> > > services and the like. Biz modeling has been OO's toughest
> > > evidence cookie to crack (but perhaps not the only).
> >
> > huh? walls full of books have been written about this topic and you
> > declare it the toughest cookie to crack...
>
> Such as? I've seen biz examples in OOP books, but they did not show
> how they were better than the alternative. Showing how to make an
> Employee class does not by itself tell you why an Employee class is
> better than not using OO.

I'm not saying everything should be OO because it's otherwise not
possible, as you can write any program in plain C. OO is often more
suitable for writing an application because the resulting application
is developed faster (code re-use) and is more maintainable. Business
apps can be very suitable for an OO language, simply because you have
data and logic operating on that data, which is IMHO the ideal
environment for an OOP approach.

> > > GOF patterns are supposed to be a solution, but GOF patterns have
> > > no clear rules about when to use what and force a kind of IS-A
> > > view on modeling instead of HAS-A.
> >
> > You also fall into the 'use pattern first, find problem for it
> > later'-antipattern.
> >
> > a pattern is a (not the) solution for a well defined recognizable
> > problem. So if you recognize the problem in your application, you
> > can use the pattern which solves THAT problem to solve THAT problem
> > in your application. That's IT. The GoF book names a set of
> > patterns and also the problems they solve. If you don't have the
> > problems they solve, you don't need the patterns.
>
> Well, a look-up table is usually simpler and more inspectable than
> Visitor. Thus, if usefulness is our guide, then GOF patterns are often
> not the best.

Visitor pattern is a pattern I don't think is very useful as the
problem it solves isn't very common.

But if your point is that OO is crap because Visitor pattern is silly
and thus all that's said in the GoF book is therefore also retarded
then we're done here.

> > > GOF patterns are like an attempt to
> > > catalog GO TO patterns instead of rid GO TO's. Relational is
> > > comparable to the move from structured programming from GO TO's:
> > > it provides more consistency and factors common activities into a
> > > single interface convention (relational operators). OO lets
> > > people re-invent their own just like there are a jillion ways to
> > > do the equivalent of IF blocks with GO TO's.
> >
> > I've read a lot of nonsense in your post,
>
> No, the nonsense comes from the OO zealots. They have no proof for biz
> apps. Two paradigms are equal or unknown until proven otherwise. I
> want to see science, not brochures.

You have no proof for your claims either. As you started the
claims, let's see them.


FB

--
------------------------------------------------------------------------
Lead developer of LLBLGen Pro, the productive O/R mapper for .NET
LLBLGen Pro website: http://www.llblgen.com
My .NET blog: http://weblogs.asp.net/fbouma
Microsoft MVP (C#)
------------------------------------------------------------------------

Mike Anderson

unread,
Dec 22, 2006, 5:02:29 AM12/22/06
to

topmind wrote:

> Neo wrote:
> > Can you give an example of such tangled pasta?
>
> OO Visitor pattern.

I was going to let it ride earlier, as you seemed to be having a good
rant, and I didn't want to spoil it, but since you've mentioned the
Visitor pattern twice, I would like to know exactly what you understand
by it. Earlier, you wrote:

"Well, a look-up table is usually simpler and more inspectable than
Visitor. Thus, if usefulness is our guide, then GOF patterns are often
not the best."

I can't think of a situation where a lookup table is interchangeable
with something you would use a Visitor pattern for. The Visitor pattern
is used to flatten arbitrary structures; often trees, but applicable to
any kind of graph.

It's also a pattern that isn't necessarily OO. If you pass a lambda to
a function that walks a tree (or other graph), that's effectively the
same thing.
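
A minimal sketch of that lambda-passing form, assuming Python and a tree encoded as (value, children) tuples. The names walk and tree are illustrative only:

```python
def walk(node, visit):
    """Depth-first walk: call `visit` on each value, parents before children."""
    value, children = node
    visit(value)
    for child in children:
        walk(child, visit)


# A small tree:  a -> (b -> d), c
tree = ("a", [("b", [("d", [])]), ("c", [])])

flat = []
walk(tree, lambda value: flat.append(value))  # the lambda plays the visitor's role
```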

Mike

Thomas Gagne

unread,
Dec 22, 2006, 6:51:24 AM12/22/06
to
> <snip decent example>
>
Had you used a procedure the amount of coding change would have been the
same. True, you visited one location rather than two.

Your example is a change to an interface. It doesn't matter what
language or problem you're working with. When an interface changes
everything that uses the interface must change.

Consider an example that doesn't change the interface.

Our system tracks the buying, selling, and payoffs of financial
contracts. Our users like to report on arbitrary time periods to see
which contracts were open. Using transaction history SQL can answer the
question for any time period--but with unsatisfactory performance.

To help historical queries run faster (fewer IOs) we created a
dailyContractBalance table (yes--it is denormalized and redundant--RDB
purists may be balking now). Simply put, it records for every day
(really) which contracts were open that day. The reports now run much
faster but we have to maintain the table. The report is presented on a
webpage from PHP. The transactions are created in the back office by a
Smalltalk application. Two procedures are involved, the one that
returns the report and the one that adds transactions.

To effect this non-trivial change we updated the procedure that added
transactions to add and remove contracts from the dailyContractBalance
table as they are purchased (insert) and paid-off (delete). We were
able to test the new procedure in isolation to prove it added and
removed contracts correctly from the dailyContractBalance table.

No modifications to the Smalltalk were necessary. No Smalltalk code had
to be shipped.

We then modified the report procedure to query the dailyContractBalance
table instead of the transaction history table. We were able to test it
in isolation to prove it returned the right answer AND that it performed
faster. The same procedure was used by users to export data into
spreadsheets. That module was unchanged as well.

No modification to the website was necessary. No PHP code had to be
shipped.

This is an example of what happens when the database is treated as
though it were an object. If the interface doesn't change then modules
that depend on the interface don't need to change. We were able to do
some significant changes inside the database (new table) AND change the
implementation of two procedures without affecting their interface
(parameters and result set).
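
The idea can be sketched roughly like this. It is not our production code: it assumes Python with sqlite3 standing in for the database, plain functions standing in for the stored procedures, integer day numbers, and a made-up LAST_DAY horizon for the daily table:

```python
import sqlite3

LAST_DAY = 10  # hypothetical horizon for the daily table

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE transactions (contract TEXT, kind TEXT, day INTEGER);
CREATE TABLE dailyContractBalance (day INTEGER, contract TEXT);
""")


def add_transaction(contract, kind, day):
    """Same parameters as before; now also maintains dailyContractBalance."""
    conn.execute("INSERT INTO transactions VALUES (?, ?, ?)", (contract, kind, day))
    if kind == "purchase":
        conn.executemany(
            "INSERT INTO dailyContractBalance VALUES (?, ?)",
            [(d, contract) for d in range(day, LAST_DAY + 1)],
        )
    elif kind == "payoff":
        conn.execute(
            "DELETE FROM dailyContractBalance WHERE contract = ? AND day >= ?",
            (contract, day),
        )


def open_contracts_v1(day):
    """Old implementation: derive the answer from transaction history."""
    rows = conn.execute(
        """SELECT contract FROM transactions WHERE kind = 'purchase' AND day <= ?
           EXCEPT
           SELECT contract FROM transactions WHERE kind = 'payoff' AND day <= ?
           ORDER BY contract""",
        (day, day),
    )
    return [r[0] for r in rows]


def open_contracts_v2(day):
    """New implementation: same parameter, same result set, different table."""
    rows = conn.execute(
        "SELECT contract FROM dailyContractBalance WHERE day = ? ORDER BY contract",
        (day,),
    )
    return [r[0] for r in rows]


add_transaction("A", "purchase", 1)
add_transaction("B", "purchase", 3)
add_transaction("A", "payoff", 5)
same = all(open_contracts_v1(d) == open_contracts_v2(d) for d in range(1, LAST_DAY + 1))
```

Callers that only know the function name and its result shape cannot tell which version they are talking to.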

Whether SQL is embedded by hand (what it sounds like you do) or is
generated by a framework, or is the result of some other kind of
OO-to-RDB mapping, the change I described would have been more painful
to implement, involved more modules, involved more programs, and
depending on your QA policy may have required a longer pass through QA
leading to a delayed production upgrade.

Neo

unread,
Dec 22, 2006, 11:09:42 AM12/22/06
to
>>>>> Topmind: Dr. Codd sought to escape those by applying set theory, and network structures thankfully fell out of favor, until the OO crowd tried to bring them back from the dead.
>>>>
>>>> Neo: How did Dr Codd make network structures fall out of favor?
>>>
>>>Topmind: By using examples and logic...
>>
>> Neo: The following example represents a network where john likes mary, john hates bob, and like is opposite of hate. A query finds the person with whom john's relationship is opposite that of with mary. Can a RMDB user post an equivalent solution to model/query this simple network? If possible, the solution's schema/queries should be resilient to future/unknown data requirements.
>
> Topmind: Based on past experience with the dubious utility of toy/lab examples, I think I will elect to skip this.

OK, I'll think of another example.

topmind

unread,
Dec 22, 2006, 11:31:51 AM12/22/06
to

For some ideas, an airline reservation system or a grades/class college
tracking system make fairly good examples. Boring, perhaps, but that is
why they reflect the real world more :-)

(P.S. I apologize for the duplicate post. My internet connection
burped.)

-T-

Neo

unread,
Dec 22, 2006, 11:42:16 AM12/22/06
to
> Based on past experience with the dubious utility of toy/lab examples, I think I will elect to skip this.

Here is another network example. Adam has children named John(male),
Jack(male) and Mary(female). Find John's sibling of opposite gender.
Below is an implementation using a network-type db. What RMDB
schema/query implements the equivalent? Note that the query does not
refer to John's father (Adam) or John's gender (male) directly.

(new 'male 'gender) (new 'female 'gender)

(new 'opposite 'verb)
(set male opposite female) (set female opposite male)

(new 'adam)
(new 'john) (set john gender male)
(new 'jack) (set jack gender male)
(new 'mary) (set mary gender female)

(set adam child john)
(set adam child jack)
(set adam child mary)

(; Get john's sibling of opposite gender
by getting the thing
whose gender is opposite
and is child of john's parent
and that person is not himself)
(; Gets mary)
(!= (and (get * gender (get (get john gender *) opposite *))
(get (get * child john) child *))
john)

Neo

unread,
Dec 22, 2006, 12:20:48 PM12/22/06
to
> For some ideas, an airline reservation system or a grades/class college tracking system make fairly good examples. Boring, perhaps, but that is why they reflect the real world more :-)

Can you describe what each example should store and be able to query?
If possible, could you give some sample data?

topmind

unread,
Dec 22, 2006, 12:33:58 PM12/22/06
to
Thomas Gagne wrote:
> topmind wrote:

> >> You've said it many times before, but perhaps you can give an example of
> >> some SQL that's easier to fix when embedded than as a
> >> procedure--including your normal QA procedures and promotion to production.
> >>
> >
> > <snip decent example>
> >
> Had you used a procedure the amount of coding change would have been the
> same. True, you visited one location rather than two.
>
> Your example is a change to an interface. It doesn't matter what
> language or problem you're working with. When an interface changes
> everything that uses the interface must change.

Agreed. But I've come to find that most changes *are* interface-related
changes. OO'ers seem to overemphasize *implementation* changes, when in
my domain (custom biz apps) most changes are requirements changes, not
implementation changes. The OO books
emphasize the wrong change kinds. Thus, I optimize my designs for
requirements changes, not implementation changes.

>
> Consider an example that doesn't change the interface.
>
> Our system tracks the buying, selling, and payoffs of financial
> contracts. Our users like to report on arbitrary time periods to see
> which contracts were open. Using transaction history SQL can answer the
> question for any time period--but with unsatisfactory performance.
>
> To help historical queries run faster (fewer IOs) we created a
> dailyContractBalance table (yes--it is denormalized and redundant--RDB
> purists may be balking now). Simply put, it records for every day
> (really) which contracts were open that day. The reports now run much
> faster but we have to maintain the table. [....]

If I am not mistaken, you presented this example a few weeks ago in the
"old" topic. I don't dispute that that particular situation may have
been helped by separation (although some of your issues seemed
vendor-specific). But again, software design is a lot like investment
management: you have to weigh your investment options (design
decisions) against estimated future probabilities of different kinds of
changes. There is rarely a free lunch; it is a matter of playing the
odds. If you overhaul the tables or DB schemas, it may indeed require
visiting a lot of embedded SQL. However, those happen maybe once every
2 years or so (in my experience) such that you spend a week or so
making the changes. However, feature changes happen just about every
week. It is more economical to save 3 hours every week for the 2 years
rather than waste 3 hours a week to save 60 hours once in those two years.

For those two years:

Separation: 3hr x 50 x 2 = 300 hours (assume 50 work weeks per year)

Embedded: 1 x 60hr = 60 hours

Based on the givens, it is clear that embedding is more economical.
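
Readers who want to plug in their own numbers can use a sketch like this (Python; the variable names are mine, the figures are the ones given above, and the break-even line is simple division on those same figures):

```python
WEEKS_PER_YEAR = 50
YEARS = 2

weekly_overhead_hours = 3   # extra time per week spent on separation
overhaul_hours = 60         # one schema overhaul with embedded SQL
overhauls = 1               # overhauls expected in the period

separation_cost = weekly_overhead_hours * WEEKS_PER_YEAR * YEARS
embedded_cost = overhauls * overhaul_hours

# Number of overhauls in the period at which the two costs would be equal.
break_even_overhauls = separation_cost / overhaul_hours
```

With these givens, separation only pays off if overhauls happen about five times in the two years.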

[...]

> This is an example of what happens when the database is treated as
> though it were an object. If the interface doesn't change then modules
> that depend on the interface don't need to change. We were able to do
> some significant changes inside the database (new table) AND change the
> implementation of two procedures without affecting their interface
> (parameters and result set).

That is a property of function/procedures, not OOP. There is no
polymorphism nor inheritance that I could see in your example since you
don't have multiple implementations active at the same time. You simply
gutted the old implementation and replaced it with a new one. This is
old-fashioned function/procedure "implementation hiding". Giving OO
credit is a big stretch.

-T-

topmind

unread,
Dec 22, 2006, 12:41:59 PM12/22/06
to

This may help some:

http://c2.com/cgi/wiki?CampusExample

-T-

Thomas Gagne

unread,
Dec 22, 2006, 12:52:04 PM12/22/06
to
It's hard to buy your estimates without supporting data. All our fixes
and enhancements are entered into a bug tracking system and our code
into CVS. I'm wondering now if I can query CVS in such a way as to
identify how often we change implementation v. interface.

As to your estimates, if the interface breaks the same amount of source
work is needed whether the change is in two places or one. Changes to
the application are the most expensive since touching one part of it
necessarily requires testing all of it as well as shipping it.

In terms of investment management, a portfolio manager wouldn't ignore
the fact that he can improve returns on 40% of his portfolio if he
separated SQL from the application. Even if they happen infrequently
he's successfully hedged against it having a negative effect on his
overall investment--especially when that insurance is free (it didn't
create more code).
> <snip>


>
> That is a property of function/procedures, not OOP. There is no
> polymorphism nor inheritance that I could see in your example since you
> don't have multiple implementations active at the same time. You simply
> gutted the old implementation and replaced it with a new one. This is
> old-fashioned function/procedure "implementation hiding". Giving OO
> credit is a big stretch.
>

First, you're inside comp.object, so using OO terms makes sense, don't
you think? Second, I'm not giving OO credit--just using OO terminology
since it's applicable. Third, I believe there's sufficient evidence
that OO designers and programmers are likely to benefit most from
database interfaces since many of them are trying (very hard) to marry
object models to DB models. I've discovered that enterprise is
unnecessary and that a solution is well within both the technical and
ideological grasp of OO's practitioners.

aloha.kakuikanu

unread,
Dec 22, 2006, 1:27:46 PM12/22/06
to
Thomas Gagne wrote:
> 1) no application should directly
> access the DB's data (no SQL) and 2) applications should use the DB only
> through its interface. Stored procedures are the best example of the
> latter I know of.

Thomas, there is a lot of nonsense in your post that I don't have any
desire to address. I just highlighted a couple of sentences.

First, when you mention "stored procedure" you dumb down the discussion
significantly, because most stored procedure usage in application
programming practice is really disgusting. What is wrapping a SQL
command like this

procedure insert_new_employee( <100 of primitive data type arguments> )
begin
#sql insert into employee( <100 of column names> ) values ( <100 of
values> );
end

supposed to achieve?

Next, it has been pointed out to you repeatedly that a query via function
call (object wrapped, or not) is inferior to a SQL query. Let me give you
an example, square root calculation. In procedural programming you
call

squareRoot(9)

Meyer made a great deal out of the so-called "design by contract" idea. He
noticed that the square root function is just one manifestation of the
square relation

y = x^2

but he doesn't go further than introducing an assertion which validates
the y = x^2 predicate between function argument and return value.

In the relational world you query the square relation. Given a known value
of x you query all the matching values of y (there happens to be only
one). Or given a known value of y you query all the matching values of
x (there happen to be two).

As you may notice, even in this trivial example you need at least 2
functions
squareRoot(int)
and
square(int)
to perform all possible queries against the database of square numbers.
The number of functions that "interface" your database explodes quickly
with increasing complexity. In the relational world you just compose your
query out of the small set of well defined operations (selection
followed by projection in this square root example).
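
To make that concrete, here is a small sketch assuming Python with sqlite3 and a finite table named square covering x from -10 to 10 (a real square relation is infinite, so the table is only a stand-in):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE square (x INTEGER, y INTEGER)")
conn.executemany(
    "INSERT INTO square VALUES (?, ?)", [(x, x * x) for x in range(-10, 11)]
)

# Selection on x, projection on y: the "square" direction.
ys = [r[0] for r in conn.execute("SELECT y FROM square WHERE x = ?", (3,))]

# Selection on y, projection on x: the "square root" direction.
xs = [r[0] for r in conn.execute("SELECT x FROM square WHERE y = ? ORDER BY x", (9,))]
```

One relation answers both questions; no second function had to be written.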

On a final note, adding objects to the picture really changes nothing.

topmind

unread,
Dec 22, 2006, 2:02:08 PM12/22/06
to

This works both ways. If your ratio is flipped for whatever reason,
then go with it. I am only describing what I observe and the reasoning
that I use based on these observations.

But I want to at least make sure we agree on impact of such changes
even if we disagree on the frequency. Readers can plug in their own
observed frequencies.

> All our fixes
> and enhancements are entered into a bug tracking system and our code
> into CVS. I'm wondering now if I can query CVS in such a way as to
> identify how often we change implementation v. interface.

It is not my fault that your CVS brand is not query-able. But, if your
management is hard-up to count every speck of dust and separating the
SQL facilitates this for your existing tools, then go ahead and
separate. Your management's priorities may be questionable, but they
call the shots and you have to please them. Speak your mind, but if
they don't want to listen, let it go. The boss is the boss; the work
place is not a democracy.

>
> As to your estimates, if the interface breaks the same amount of source
> work is needed whether the change is in two places or one. Changes to
> the application are the most expensive since touching one part of it
> necessarily requires testing all of it as well as shipping it.

You should test it *all* anyhow. One can unit test to make sure a given
SP is fine for the tests given to it, but sometimes it may be called in
ways by the app not anticipated by the unit tests. Thus, I don't see
how testing effort would be a different issue either way.

And, I am not (fully) disputing that the particular scenario you gave
would be more costly. I am just doing a frequency-cost analysis and
comparing it to other scenarios.

>
> In terms of investment management, a portfolio manager wouldn't ignore
> the fact that he can improve returns on 40% of his portfolio if he
> separated SQL from the application. Even if they happen infrequently
> he's successfully hedged against it having a negative effect on his
> overall investment--especially when that insurance is free (it didn't
> create more code).

It is not free. It increases the development time for feature changes.

> > <snip>
> >
> > That is a property of function/procedures, not OOP. There is no
> > polymorphism nor inheritance that I could see in your example since you
> > don't have multiple implementations active at the same time. You simply
> > gutted the old implementation and replaced it with a new one. This is
> > old-fashioned function/procedure "implementation hiding". Giving OO
> > credit is a big stretch.
> >
> First, you're inside comp.object, so using OO terms makes sense, don't
> you think?

No matter where it is discussed, it is not an OO-specific concept. A
rose is still a rose in Timbuktu.

> Second, I'm not giving OO credit--just using OO terminology
> since it's applicable.

How about we find another term. "Hiding implementation behind an
interface" is what I'd call it right now without a better alternative.
Some would use "encapsulation", but there is no agreement on what encap
really means.

> Third, I believe there's sufficient evidence
> that OO designers and programmers are likely to benefit most from
> database interfaces since many of them are trying (very hard) to marry
> object models to DB models.

I would suggest they focus very hard on finding the *best* solution,
not necessarily an OO one. OO is oversold and is highly questionable
for custom biz apps.

> I've discovered that enterprise is
> unnecessary and that a solution is well within both the technical and
> ideological grasp of OO's practitioners.

I don't think marrying them will be practical unless you cripple one or
the other. Set theory and graphs are fundamentally different ways to
organize and sift info. Maybe someday there will be a magic breakthru
to meld them nicely. Until then, I say chuck OO for custom biz apps.
The world is not ready for it. Now you can only embrace the RDBMS or
embrace OO and must choose one or the other or risk a bloated mess.
Given that choice, I say OO goes.

>
> --
> Visit <http://blogs.instreamfinancial.com/anything.php>
> to read my rants on technology and the finance industry.

-T-

Patrick May

unread,
Dec 22, 2006, 5:17:23 PM12/22/06
to
"topmind" <top...@technologist.com> writes:
> Neo wrote:
> > > If you take away inheritence, you get "network structures" (AKA
> > > tangled pasta). Dr. Codd sought to escape those by applying set
> > > theory, and network structures thankfully fell out of favor,
> > > until the OO crowd tried to bring them back from the dead.
> >
> > Can you give an example of such tangled pasta?
>
> OO Visitor pattern.

The Visitor pattern provides two capabilities: 1) simulation of
double dispatch and 2) non-intrusively adding new operations to
existing classes. Both of these address limitations of languages such
as C++ and Java. The existence of this pattern does not demonstrate a
general flaw in the object oriented approach.
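
As a rough illustration of both capabilities, here is a sketch in Python. Python is dynamically typed, so this only mimics the shape the pattern takes in C++ or Java, and the classes Num, Add, Evaluate, and Show are invented for the example:

```python
class Num:
    def __init__(self, value):
        self.value = value

    def accept(self, visitor):
        # First dispatch picks accept(); the call below is the second dispatch.
        return visitor.visit_num(self)


class Add:
    def __init__(self, left, right):
        self.left, self.right = left, right

    def accept(self, visitor):
        return visitor.visit_add(self)


class Evaluate:
    """One operation over the node classes."""

    def visit_num(self, node):
        return node.value

    def visit_add(self, node):
        return node.left.accept(self) + node.right.accept(self)


class Show:
    """A second operation, added without touching Num or Add."""

    def visit_num(self, node):
        return str(node.value)

    def visit_add(self, node):
        return f"({node.left.accept(self)} + {node.right.accept(self)})"


expr = Add(Num(1), Add(Num(2), Num(3)))
total = expr.accept(Evaluate())
text = expr.accept(Show())
```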

Sincerely,

Patrick

------------------------------------------------------------------------
S P Engineering, Inc. | Large scale, mission-critical, distributed OO
| systems design and implementation.
p...@spe.com | (C++, Java, Common Lisp, Jini, middleware, SOA)

Neo

unread,
Dec 22, 2006, 5:30:50 PM12/22/06
to
> > > For some ideas, an airline reservation system or a grades/class college tracking system make fairly good examples. Boring, perhaps, but that is why they reflect the real world more :-)
> >
> > Can you describe what each example should store and be able to query?
> > If possible, could you give some sample data?
>
> http://c2.com/cgi/wiki?CampusExample

Ok, do you already have data for the example or should I create it?

Chris

unread,
Dec 22, 2006, 5:41:15 PM12/22/06
to

Neo wrote:
> > Based on past experience with the dubious utility of toy/lab examples, I think I will elect to skip this.
>
> Here is another network example. Adam has children named John(male),
> Jack(male) and Mary(female). Find John's sibling of opposite gender.
> Below is an implementation using a network-type db. What RMDB
> schema/query implements the equivalent? Note that the query does not
> refer to John's father (Adam) or John's gender (male) directly.
>

Well, use a table like:
People
Name
Father
Sex
(not precise DDL, I know).

insert into People values ('John', 'Adam', 'M'), ('Jack', 'Adam', 'M'),
('Mary', 'Adam', 'F');

select other_sibling.name from people other_sibling join people john on
other_sibling.father = john.father and other_sibling.sex != john.sex
where john.name = 'John';

I think this satisfies your requirements.
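
For anyone who wants to run it, here is a sketch assuming Python with sqlite3. The column types and the primary key are my assumptions, since the table above is not precise DDL:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE People (Name TEXT PRIMARY KEY, Father TEXT, Sex TEXT)")
conn.executemany(
    "INSERT INTO People VALUES (?, ?, ?)",
    [("John", "Adam", "M"), ("Jack", "Adam", "M"), ("Mary", "Adam", "F")],
)

# Self-join: same father, different sex. The query never names Adam or 'M'.
siblings = [
    row[0]
    for row in conn.execute(
        """SELECT other_sibling.Name
           FROM People other_sibling
           JOIN People john
             ON other_sibling.Father = john.Father
            AND other_sibling.Sex != john.Sex
           WHERE john.Name = ?""",
        ("John",),
    )
]
```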

I would like to note that you should be able to satisfy most
queries/problems in most languages (unless they are sufficiently
crippled); it is just the amount of work involved in achieving the
desired results. Some issues are better resolved in one language;
other issues are better resolved in other languages. The same rule, I
think, holds true in databases as well. (This problem looks pretty
easy in Relational databases to me, for instance).

-Chris

Cesar Rabak

unread,
Dec 22, 2006, 5:48:29 PM12/22/06
to
aloha.kakuikanu escreveu:

> Thomas Gagne wrote:
>> 1) no application should directly
>> access the DB's data (no SQL) and 2) applications should use the DB only
>> through its interface. Stored procedures are the best example of the
>> latter I know of.
>
> Thomas, there is a lot of nonsense in your post that I don't have any
> desire to address. I just highlighted a couple of sentences.
>
['highlight' sentences snipped]

>
> On a final note, adding objects to the picture really changes nothing.
>

I think we're mixing oranges and apples here. Thomas is trying to show
that when you have a specific Database (Schema + data fed in) and an
application that uses it (but not for ad hoc queries), his proposition
brings value.

If we want to consider the Database as a repository of data and the need
for doing different queries on a case-by-case basis, then your statement
is correct about the explosion of methods ('functions').

HTH

--
Cesar Rabak

Neo

unread,
Dec 22, 2006, 7:10:25 PM12/22/06
to
> http://c2.com/cgi/wiki?CampusExample

Below is an initial implementation with some sample data. What can we
verify?

(new 'contact)
(new 'alias)
(new 'address)
(new 'street)
(new 'apt#)
(new 'city)
(new 'state)
(new 'zip)
(new 'country)

(new 'phn#)
(new 'cell#)
(new 'pager#)
(new 'fax#)
(new 'email)

(new)
(set address instance (it))
(set+ (it) street (set '123 'main 'st))
(set+ (it) city 'chicago)
(set+ (it) state 'illinois)
(set+ (it) zip '56789)
(set+ (it) country 'usa)

(new)
(set address instance (it))
(set+ (it) street (set '547 'elm 'rd))
(set+ (it) apt# '54A)
(set+ (it) city 'houston)
(set+ (it) state 'texas)
(set+ (it) zip '774433-7654)
(set+ (it) country 'usa)

(new 'adam 'person)
(set (it) address (get * street (. '123 'main 'st)))
(set+ (it) phn# '234-6789)
(set+ (it) cell# '435-8766)
(set+ (it) email 'adam&ibm.com)
(set+ (it) email 'adam.smith&gm.com)

(new 'eve 'person 'teacher)
(set (it) address (and (get * street (. '547 'elm 'rd))
(get * apt# 54A)))
(set+ (it) cell# '457-8779)

(new 'john 'person)
(set+ (it) alias 'jojo)
(set+ (it) pager# '568-5866)
(set (it) contact adam)
(set (it) contact eve)

(new 'mary 'person)
(set+ (it) fax# '587-8978)
(set (it) contact eve)
(set (it) contact eve)

(new 'student)
(new 'harvard 'university)
(set harvard teacher eve)
(set harvard student john)
(set harvard student mary)

(new 'credit)
(new 'category)
(new 'prerequiste)

(new 'course1 'course)
(set+ (it) credit '4)
(set+ (it) category 'science)

(new 'course2 'course)
(set+ (it) credit '6)
(set+ (it) category 'math)
(set+ (it) category 'logic)
(set (it) prerequiste course1)

(new 'semester1 'semester)
(new 'semester2 'semester)

(new 'class1 'class 'course1)
(set semester1 has class1)

(new 'class2 'class 'course2)
(set semester2 has class2)

(new 'grade)
(new 'took 'verb)

(set+ john took class1 grade '85)
(set+ mary took class2 grade '95)


(; Get harvard student
   who took a class
   whose course category is math and logic)
(; Gets mary)
(and (get harvard student *)
     (get * took (get (and (get * category math)
                           (get * category logic))
                      instance *)))

Neo

Dec 22, 2006, 7:32:26 PM
> > Here is another network example. Adam has children named John(male),
> > Jack(male) and Mary(female). Find John's sibling of opposite gender.
> > Below is an implementation using a network-type db. What RMDB
> > schema/query implements the equivalent? Note that the query does not
> > refer to John's father (Adam) or John's gender (male) directly.
> >
> table like: People (Name, Father, Sex);

> insert into People values:
> ('John', 'Adam', 'M'),
> ('Jack', 'Adam', 'M'),
> ('Mary', 'Adam', 'F');
>
> select other_sibling.name from People other_sibling join People john on
> other_sibling.father = john.father and other_sibling.sex != john.sex
> where john.name = 'John';

A very efficient and unexpected solution!

> I think this satisfies your requirements.

Almost; there is the part about the solution being resilient to future
data requirements. Now suppose we want to store Adam's age. Here is how
to do it in the network-type db and still have the original query work.

(new 'age)
(set+ adam age '30)

In the RMDB solution, how can I add Adam's age without impacting the
original query? If that is not possible, go ahead and revise the original
schema and query.

topmind

Dec 22, 2006, 8:24:49 PM
Neo wrote:
> > http://c2.com/cgi/wiki?CampusExample
>
> Below is an initial implemenation with some sample data. What can we
> verify?
>
> (new 'contact)
> (new 'alias)
> (new 'address)
[...]

I remember you. You're the guy who proposed that Lisp-like database
thingy about a year ago. We had a "query battle" back then IIRC.
Something about finding all houses with 2 refrigerators or the like.

-T-

Neo

Dec 22, 2006, 8:33:27 PM
> I remember you. You're the guy who proposed that Lisp-like database thingy about a year ago. We had a "query battle" back then IIRC. Something about finding all houses with 2 refrigerators or the like.

You have a good memory. And what about Judge Judy :)

Tonkuma

Dec 23, 2006, 5:03:43 AM
First, it is better to add Adam to the People table. Adam's father
is unknown, so I set that column to NULL:
insert into People values
('Adam', NULL, 'M');

Then add the column Age and set Adam's age:
ALTER TABLE People ADD COLUMN Age SMALLINT DEFAULT NULL;
UPDATE People SET Age = 30 WHERE Name = 'Adam';

This approach does not affect existing queries (no modification is
required), including the original sibling query.
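
A minimal sketch of that claim, assuming Python's built-in sqlite3 module
as a stand-in for whatever RDBMS is in use (the in-memory database and the
variable names are illustrative, not from the posts). It runs the sibling
query before and after the schema change and shows the same answer:

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for the real RDBMS.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE People (Name TEXT PRIMARY KEY, Father TEXT, Sex TEXT)")
conn.executemany(
    "INSERT INTO People VALUES (?, ?, ?)",
    [("John", "Adam", "M"), ("Jack", "Adam", "M"), ("Mary", "Adam", "F")],
)

# The sibling query from the thread, with the column qualified so the
# self-join is unambiguous.
SIBLING_QUERY = """
    SELECT other_sibling.Name
    FROM People other_sibling
    JOIN People john
      ON other_sibling.Father = john.Father
     AND other_sibling.Sex != john.Sex
    WHERE john.Name = 'John'
"""

before = conn.execute(SIBLING_QUERY).fetchall()

# The schema change proposed above: add Adam, then add and fill Age.
conn.execute("INSERT INTO People VALUES ('Adam', NULL, 'M')")
conn.execute("ALTER TABLE People ADD COLUMN Age SMALLINT DEFAULT NULL")
conn.execute("UPDATE People SET Age = 30 WHERE Name = 'Adam'")

after = conn.execute(SIBLING_QUERY).fetchall()
print(before, after)
```

The query names only the columns it needs, which is why the extra column
and the extra row (whose Father is NULL and so never joins) leave the
result alone.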

Dmitry A. Kazakov

Dec 23, 2006, 6:14:48 AM

Note that there is no semantic difference between

SQRT(9)

SELECT X FROM SQRT WHERE Y=9

You count SQUARE as an extra interface? But

SELECT Y FROM SQRT WHERE X=3

is as well. What you wrote is basically about syntax sugar. And I can bet
without making any poll that SQRT sugar is much sweeter.

> On the final note, adding objects into a picture changes really nothing.

Only if you don't understand what abstraction is. Objects are supposed to
bind 9 with the table SQRT and the places in the tuples. It is a higher
level theory, which allows questions like what is an integral of SQRT from
0 to 100. Care to write a SELECT for it?

Think about this a bit, and you will understand the reason why it goes in
this direction and not in one you wished. You cannot implement SQRT, it is
not decomposable in relational tables, because that would require an
uncountable number of states. What OO offers is a cut of an infinite
recursion.

--
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de

Tonkuma

Dec 23, 2006, 6:47:36 AM
Please bear with me if I'm missing your point.

Thomas Gagne wrote:
>
> select * from account, user where user.userId=X and account.userId =
> user.userId
>
> when instead I can use
>
> exec getAccountsFor @userId=X
>
I feel your comparison is not fair.
Does OO automatically supply the method getAccountsFor?
Didn't you first have to code the method?
If a different data requirement arises, you may need to code a new
method or modify the existing one.

select * from account, user where user.userId=X and account.userId
=user.userId

No pre-coding or other preparation is required.
After installing the DBMS, creating the tables, and loading the data,
you can issue that SELECT statement.

If you and your colleagues use that pattern frequently, you can create
it as a view.
If not all of the data in the tables is required, you can restrict the
view to only the required columns.
Though of course, user.userId should be included.
CREATE VIEW account_for (userId, other-required-columns-list) AS
SELECT user.userId, other-required-columns-list
from account, user where account.userId = user.userId;
Then you can use it like this:
SELECT * FROM account_for WHERE userId=X;

And if the base tables are changed (columns or constraints added or
removed, etc.), you will not need to change the VIEW or the SELECT
statement as long as the changed items are not used explicitly or
implicitly in the VIEW (explicitly specifying
other-required-columns-list makes the view more robust to change).
Conversely, if the requirement for account_for changes, you can change
the view without affecting the SELECT statement, as long as the select
list and the condition userId=X stay the same. I think this is similar
to changing a method in OO programming.
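
A minimal sketch of the view-as-interface idea, assuming Python's built-in
sqlite3 module; the accountId, balance, and currency columns and the sample
rows are hypothetical, since the thread never spells out the schema. The
caller's SELECT against account_for is unchanged by a base-table change:

```python
import sqlite3

# Assumption: in-memory SQLite; the column names below are illustrative.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user (userId INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE account (accountId INTEGER PRIMARY KEY,
                          userId INTEGER, balance REAL);
    INSERT INTO user VALUES (1, 'alice'), (2, 'bob');
    INSERT INTO account VALUES (10, 1, 50.0), (11, 1, 75.0), (12, 2, 20.0);

    -- The view lists its columns explicitly and carries no userId=X
    -- filter; the filter belongs to the caller's SELECT.
    CREATE VIEW account_for AS
        SELECT user.userId       AS userId,
               account.accountId AS accountId,
               account.balance   AS balance
        FROM account, user
        WHERE account.userId = user.userId;
""")

QUERY = ("SELECT accountId, balance FROM account_for "
         "WHERE userId = ? ORDER BY accountId")
before = conn.execute(QUERY, (1,)).fetchall()

# Change a base table; neither the view nor the caller's query is edited.
conn.execute("ALTER TABLE account ADD COLUMN currency TEXT DEFAULT 'USD'")
after = conn.execute(QUERY, (1,)).fetchall()
print(before, after)
```

In that sense the view plays the role a method signature plays in OO
code: callers depend on its column list, not on the tables behind it.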

aloha.kakuikanu

Dec 23, 2006, 1:16:47 PM
Dmitry A. Kazakov wrote:
> On 22 Dec 2006 10:27:46 -0800, aloha.kakuikanu wrote:
> > On the final note, adding objects into a picture changes really nothing.
>
> Only if you don't understand what abstraction is. Objects are supposed to
> bind 9 with the table SQRT and the places in the tuples. It is a higher
> level theory, which allows questions like what is an integral of SQRT from
> 0 to 100. Care to write a SELECT for it?

So what is object insight into what integral is?

> Think about this a bit, and you will understand the reason why it goes in
> this direction and not in one you wished. You cannot implement SQRT, it is
> not decomposable in relational tables, because that would require an
> uncountable number of states. What OO offers is a cut of an infinite
> recursion.

Function u=f(x,y,z) in general is a predicate Pf(x,y,z,u) with some
additional constraint. Given that a predicate is often an infinite
relation we indeed have obvious difficulty implementing it.

Suppose we have the relation R(x,y) where y=x^2. No function yet, just
a relation.

As it has two columns the simplest joins we can think of are:

R /\ `y=9`

R /\ `x=3`

where `y=9` and `x=3` are constant unary relations.

In the first case, we probably want to project the resulting relation
to column `x`

project_x (R /\ `y=9`)

in the second case to `y`

project_y (R /\ `x=3`)

Informally, in the first case we want to know the result of sqrt(9), in
the second case just 3^2.

How is the join evaluated in both cases? By the standard optimization
technique pioneered by System R! We enumerate all possible join
orderings and compute their costs, to pick the most efficient one.

Let's go into the second
case, because it's easier. As one relation is infinite, the only
feasible join method is nested loops. Moreover, we have to start from
the relation which is finite, that is `x=3`. Now there are two
possibilities:
1. Scan the whole R relation and find all the matching tuples. Not
feasible either!
2. Find matching tuples by some kind of index:

create index unique_x on R(x);

(pseudo SQL syntax). This index is not a conventional b-tree of course,
since all that is needed when x is known and y is not is to
calculate y by the simple formula x^2. The corollary here is that the
function x->x^2 is essentially an index.

The other case is only marginally more complex. Likewise, we quickly
arrive at the conclusion that the only feasible execution is the
indexed nested loops with the scan of the `y=9` as the leading
relation. Then, the required index is the function y->sqrt(y).

In a word, functions are to predicates what indexes are to tables
in a traditional RDBMS. In the RDBMS world there are index organized
tables. In the functions analogy we have "function organized"
predicates.
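
The "function as index" idea above can be sketched as follows. This is
an illustration only, in Python; the class and function names are
hypothetical. One assumption is labeled in the code: over the reals the
relation y=x^2 holds two tuples for each positive y, so the y lookup
returns both roots, and the sqrt of the post is the non-negative one.

```python
import math

class SquareRelation:
    """The infinite relation R(x, y) with y = x^2, never materialized."""

    def lookup_by_x(self, x):
        # "Index" on x: the function x -> x^2 yields the one matching tuple.
        return {(x, x * x)}

    def lookup_by_y(self, y):
        # "Index" on y: the function y -> sqrt(y).
        # Assumption: x ranges over the reals, so both roots qualify.
        if y < 0:
            return set()
        root = math.sqrt(y)
        return {(root, y), (-root, y)}

def join_project(relation, column, constants, project):
    """Indexed nested-loop join of R with a finite unary relation."""
    if column == "x":
        lookup = relation.lookup_by_x
    else:
        lookup = relation.lookup_by_y
    result = set()
    for constant in constants:          # outer loop: the finite relation leads
        for (x, y) in lookup(constant):  # inner probe: a call, not a scan
            result.add(x if project == "x" else y)
    return result

R = SquareRelation()
print(join_project(R, "x", {3}, "y"))  # project_y (R /\ `x=3`)
print(join_project(R, "y", {9}, "x"))  # project_x (R /\ `y=9`)
```

The infinite relation is never scanned; each probe costs one function
call, which is the sense in which the function acts as the index.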

Indexes are not something that is supposed to be exposed to the end
user, at least in theory. Indexes should be created/destroyed
automatically by the RDBMS engine. In today's imperfect world this is
done by DBAs.

This little snippet is just an unconventional perspective on Meyer's
"design-by-contract" idea. The sqrt() function interface is defined
by the assertion y=x^2, and the programmer's job is to write the
sqrt() implementation. We see that functions being implementation
details has much in common with indexes being implementation details
too.

Here is a slightly more complex example. For x^2+y^2=1, there are 3
access methods:
1. Given x and y, return {(x,y)} if they satisfy the equation, and {}
otherwise.
2. Given x, return {(x,sqrt(1-x^2)),(x,-sqrt(1-x^2))} or the empty set.
3. Given y, return {(sqrt(1-y^2),y),(-sqrt(1-y^2),y)} or the empty set.
Once again, the way to represent infinite predicates in future
relational systems is via access path specification.
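
The three access methods can be sketched as plain functions. This is a
hedged illustration in Python; the function names and the floating-point
tolerance tol are assumptions made for the sketch, not part of the post.

```python
import math

def by_x_and_y(x, y, tol=1e-12):
    """Access method 1: membership test for x^2 + y^2 = 1.

    tol is an assumed tolerance for floating-point round-off.
    """
    return {(x, y)} if abs(x * x + y * y - 1.0) <= tol else set()

def by_x(x):
    """Access method 2: all tuples with the given x, or the empty set."""
    if abs(x) > 1:
        return set()
    root = math.sqrt(1 - x * x)
    return {(x, root), (x, -root)}

def by_y(y):
    """Access method 3: all tuples with the given y, or the empty set."""
    if abs(y) > 1:
        return set()
    root = math.sqrt(1 - y * y)
    return {(root, y), (-root, y)}

print(by_x(0))
print(by_y(2))
print(by_x_and_y(0.6, 0.8))
```

Each function answers one binding pattern of the predicate; a planner
would pick among them according to which columns are already known.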

Nick Malik [Microsoft]

Dec 23, 2006, 3:53:12 PM
Hello Thomas,

"Thomas Gagne" <tga...@wide-open-west.com> wrote in message
news:DrydnSQnbsmmNxTY...@wideopenwest.com...
> An unexpected thing happened while debating topmind: I had an epiphany.

You see... it is GOOD to have 'topmind' around! I like 'topmind' for the
continuous and fervent challenge he applies to many of the decisions that we
often take for granted. By being forced to 'support' our best practices
with long involved explanations about 'why they are best,' we can better see
the times when those practices are useful and the times when they are not.
At worst, we discover that we understand the limitations of our ideas much
better. At best, we learn something.

That said, when he joins a thread, it is often not useful for people who are
simply surfing the thread to read the discussion, because it tends to drop
down to a point-by-point argument fairly quickly. For that reason, most of
the contribution that he has added to /this/ thread... I have ignored.
(Sorry, T).

> Instead of responding to the news group I thought about it for a short bit
> (very short) and posted an article to my blog titled, "The RDB is the
> biggest object in my system."
>
> <http://blogs.in-streamco.com/anything.php?title=the_rdb_is_the_biggest_object_in_my_syst>
>
> What I realized while trying to describe my preference to use DB
> procedures as the primary (re: only) interface between my applications and
> the database is because I believe my DB's physical representation of data
> belongs to it alone and that customers of the DB oughtn't be permitted to
> directly manipulate (change or query) its data. I realized this is
> exactly what data-hiding is all about and why expert object oriented
> designers and programmers emphasize the importance of interfaces to direct
> data manipulation.

While I did not read this blog entry (yet), I agree, in general, with the
statement above.

That said, an RDBMS can present MANY interfaces to your code, not all of
which have to be presented through stored procs. You could present through
views, for example, and still hide some of the details of your db design.

I would also say that the db presents the data for 'many' objects instead of
a single one. Viewing the db as a single object begs the question: what
behavior are you encapsulating in it?

>
> I thought more about this and posted a second article, Databases as
> Objects: My schema is my class, which explored more similarities between
> databases and objects and their classes.
>
> <http://blogs.in-streamco.com/anything.php?title=my_schema_is_an_class>
>
> I intend next to explore various design patterns from GoF and Smalltalk:
> Best Practice Patterns to see if the similarities persist or where they
> break down, and what can be learned from both about designing and
> implementing OO systems with relational data bases.

this blog entry, I did read, and I replied to it in the blog comments.

>
> If you agree there's such a thing as an object-relational impedance
> mismatch, then perhaps its because you're witnessing the negative
> consequences of tightly coupling objects that shouldn't be tightly
> coupled.

Nope. I see the Object Relational Impedance as a conceptual disconnect
between the 'traditional' RDBMS interface, which doesn't present a
mechanism for encapsulating both code and operations in the same object
wrapper, and the object oriented interface, which absolutely requires
it. Creating an object wrapper that DOES present these two together is
the goal of Object Relational Mapping (ORM) tools.

Note that even when you do this, you run into the impedance, and that is
because RDBMS systems are not the appropriate place to put every business
capability. (If they were, all apps would be very thin user interfaces on
very thick databases).

You can certainly place some activities in the db, including calculations,
validations, and some data translations (including XML-to-SQL and vice
versa), but I'd argue against placing too many business activities there,
especially things like complex (multi-path) workflow (because of the
difficulty with synchronization logic across a set-oriented interface like
SQL) or cross-system messaging, etc.

Most business capabilities are described as behaviors first, and data
second. The data is the operand, not the operator. This causes us to have
to recast the data into an entirely different view than the one that is used
in RDBMS systems. Key concerns in RDBMS systems, like indexing, Referential
Integrity, volume scaling, and data value ranges should be encapsulated and
'hidden' (but not in the sense of data hiding, but in the sense of 'behavior
hiding' which is an OO notion).

Recasting that data from the efficient storage mechanism presented by
Relations to the more behavior-oriented mechanism required by objects is the
responsibility of the Data layer and is the focus of the discussion in
Object Relational Impedance.

>
> There's a hypothesis in there somewhere.
>
> As always, if you know of existing research on the subject I'm anxious to
> read about it.
>

A good starting point for finding research is to go to a general article
like the following and follow the reference links off the page.
http://en.wikipedia.org/wiki/Object_Relational_Mapping


I guess one thing that stands out for me: you reached a valuable conclusion
about the application of OO design methods to RDBMS design, but you didn't
prove the initial assumption: that stored procedures should be used as the
only interface for code to access the data in the database. In this
respect, I am not convinced.

Stored procedures are VALUABLE, don't get me wrong. You get security
benefits and you get the ability to control the data manipulations against
multiple tables, but as I pointed out in my blog response, stored procedures
are not object methods and they are not tied to any objects. They are
effectively unconstrained procedural code, with access to any and every
table in the database (in many db systems, this visibility extends to tables
in other databases, both on the same server and in other servers).
Therefore, I have a very difficult time viewing stored procs as methods.

I also think you lose something valuable with Stored Procs. Excellent
efforts have been expended to consider the basic principles of RDBMS design
in objects, and to create objects that will effectively assist with Object
Relational Mapping as a first step to addressing the Impedance mismatch.
Those objects are defeated by the artificial barriers placed by stored
procedures. I'm referring to various attempts at Data Access Objects,
including the .Net Data objects in the Microsoft .Net framework.

Some would say that this reduces the value of the DAO-style objects. I
would reply that RDBMS systems are based on a mathematical simplicity, an
approach that is fairly pure and extremely versatile. Hiding that
mathematical simplicity may or may not be a valuable enterprise, but it is
clearly the effect of restricting all data access to a stored procedure
layer. In that aspect, perhaps it is the value of the stored proc that
should be questioned, and not the value of the Data objects in the OO
library.

--
--- Nick Malik [Microsoft]
MCSD, CFPS, Certified Scrummaster
http://blogs.msdn.com/nickmalik

Disclaimer: Opinions expressed in this forum are my own, and not
representative of my employer.
I do not answer questions on behalf of my employer. I'm just a
programmer helping programmers.
--


Neo

Dec 23, 2006, 3:48:57 PM
> > > > network example. Adam has children named John(male),
> > > > Jack(male) and Mary(female). Find John's sibling of opposite gender.
> > > > Below is an implementation using a network-type db. What RMDB
> > > > schema/query implements the equivalent? Note that the query does not
> > > > refer to John's father (Adam) or John's gender (male) directly.
> > > >
> > > People (Name, Father, Sex);

> > > ('John', 'Adam', 'M'),
> > > ('Jack', 'Adam', 'M'),
> > > ('Mary', 'Adam', 'F');
> > > SELECT other_sibling.name from People other_sibling join People john on

> > > other_sibling.father = john.father and other_sibling.sex != john.sex
> > > where john.name = 'John';
> >
> > Now suppose we want to store Adam's age. Here is how
> > to [add] it in the network-type db and still have the original query work.

> > (new 'age) (set+ adam age '30)
> > how can I add Adam's age [in RMDB] without impacting original query?
>
> ...add Adam into People table. add column Age ...
> This way will not impact to existing queries...

Great, so the data looks something like below:

Table People
Name  Father  Sex   Age
Adam  NULL    M     30
John  Adam    M     NULL
Jack  Adam    M     NULL
Mary  Adam    F     NULL

Now suppose we want to add another child of Adam named Francis who is
bisexual. Here is how to add it in the network-type db and still have
the original query work: (new 'francis) (set+ francis gender 'bisexual)
(set adam child francis). How can I add Francis to the above RMDB
without affecting the original query?

Please note that I am trying to create a network to "tangle"
RM/RMDBs. The names of those nodes and edges are only meant for
convenience. If those names make it appear as if a certain network is
impossible (ie a person's gender being bisexual), rename the network
elements in general terms such as node1, node2, edge1, edge2, etc.

Nick Malik [Microsoft]

Dec 23, 2006, 4:06:19 PM
"topmind" <top...@technologist.com> wrote in message
news:1166736833....@79g2000cws.googlegroups.com...
> Thomas Gagne wrote:
>> The designer of the
>> second (rightly, I think) believes counting the number of SQL lines
>> would be difficult since they're distributed throughout his code, and
>> include string concatenation for variables and discriminations spread
>> throughout functions.
>
> That is a very minor reason to separate. Is it worth making the app 10%
> to 25% more time-consuming to maintain *just* to be able to count
> easier? I have to object. Perhaps you have weird managers.


That's FUNNY, T! You are arguing with a CTO.

Dmitry A. Kazakov

Dec 24, 2006, 6:39:32 AM
On 23 Dec 2006 10:16:47 -0800, aloha.kakuikanu wrote:

> Dmitry A. Kazakov wrote:
>> On 22 Dec 2006 10:27:46 -0800, aloha.kakuikanu wrote:
>>> On the final note, adding objects into a picture changes really nothing.
>>
>> Only if you don't understand what abstraction is. Objects are supposed to
>> bind 9 with the table SQRT and the places in the tuples. It is a higher
>> level theory, which allows questions like what is an integral of SQRT from
>> 0 to 100. Care to write a SELECT for it?
>
> So what is object insight into what integral is?

9 is an object (a value of a type, say, real number). Real number is
roughly a model of mathematical analysis in the program. We can refine that
model by adding SQRT, integral, etc. to the type, and thus to the set of
its objects. You cannot do that with the relation SQRT.

>> Think about this a bit, and you will understand the reason why it goes in
>> this direction and not in one you wished. You cannot implement SQRT, it is
>> not decomposable in relational tables, because that would require an
>> uncountable number of states. What OO offers is a cut of an infinite
>> recursion.
>
> Function u=f(x,y,z) in general is a predicate Pf(x,y,z,u) with some
> additional constraint. Given that a predicate is often an infinite
> relation we indeed have obvious difficulty implementing it.
>
> Suppose we have the relation R(x,y) where y=x^2. No function yet, just
> a relation.
>
> As it has two columns the simplest joins we can think of are:
>
> R /\ `y=9`
>
> R /\ `x=3`
>
> where `y=9` and `x=3` are constant unary relations.
>
> In the first case, we probably want to project the resulting relation
> to column `x`
>
> project_x (R /\ `y=9`)
>
> in the second case to `y`
>
> project_y (R /\ `x=3`)
>
> Informally, in the first case we want to know the result of sqrt(9), in
> the second case just 3^2
>
> How the join is evaluated in both cases? By the standard optimization
> technique pioneered by System R! We enumerate all possible join
> ordering, and compute their cost, to pick up the most efficient one.

No, that's not the question. The question is how the relation SQRT is
decomposed. The function SQRT can be decomposed into {+,-,*,/} plus an
imperative control flow instruction set using, say, Newton's method. If
you want to present an alternative, you have to show a workable method
of constructing the table of SQRT from scratch. Note that * is also a
function, and so is +. The recursion stops at the hardware/axiomatic
level.
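
The decomposition mentioned above can be sketched with Newton's
iteration x' = (x + a/x)/2, written in Python with only the four
arithmetic operations, comparisons, and a loop. The function name, the
tolerance, and the iteration cap are assumptions made for the sketch.

```python
def newton_sqrt(a, tolerance=1e-12, max_iterations=100):
    """Approximate sqrt(a) using only +, -, *, / and control flow.

    tolerance and max_iterations are illustrative defaults.
    """
    if a < 0:
        raise ValueError("square root of a negative number is not real")
    if a == 0:
        return 0.0
    x = a if a >= 1 else 1.0  # starting guess at or above the true root
    for _ in range(max_iterations):
        next_x = (x + a / x) / 2  # Newton step for f(x) = x*x - a
        diff = x - next_x
        if diff < 0:
            diff = -diff
        if diff <= tolerance * next_x:  # relative change is small enough
            return next_x
        x = next_x
    return x

print(newton_sqrt(9))
print(newton_sqrt(2))
```

Nothing here consults a table of square roots; the value is constructed
on demand from operations the hardware already provides.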

(It is allowed to claim that your hardware has SQRT, but then I'll move to
exp, or to modified Bessel's function etc.)

> Let's go into the second
> case, because it's easier. As one relation is infinite, the only
> feasible join method is nested loops.

BTW, looping is not a relational concept. I mean forall x in X { do F(x) }.
do F is not decomposable into SELECTs. You have to have "do F" as a
relation in advance: "do F" : S x X -> Boolean, where S is the set of
computational states. Then you need some sort of accumulation and
serialization of states as you navigate them in forall.

[...]


> This little snippet is just an unconventional perspective to Meyer's
> "design-by-contract" idea.

I don't think it was. The idea was of Dijkstra, i.e. the one of proven
correctness of a program. The Meyer's DbC was this idea applied narrowly to
types. Note that both issues of types and of correctness are orthogonal to
relational algebra, and all three are to the point you seem trying to make,
which is IMO the old worn imperative vs. declarative.

> The sqrt() function interface is defined
> by the assertion y=x^2,

(Pedantically. No, assertion does not define SQRT. SQRT is defined by its
implementation. Assertion is there to check the correctness of.)

> and the programmer's job is to write the
> sqrt() implementation. We see that functions being implementational
> detail has much in common with indexes being implementation details
> too.

OK. However, note that all programs are implementation details of some
semantics, which is outside the computer and the programming language being
used.

> Here is little more complex example. For x^2+y^2=1, there are 3 access
> methods:
> 1. Given x, and y return {(x,y)} if they satisfy the equation, and {}
> otherwise.
> 2. Given x return {(x,sqrt(1-x^2)),(x,-sqrt(1-x^2))} or empty set.
> 3. Given y return {(sqrt(1-y^2),y),(-sqrt(1-x^2),y)} or empty set.

4. Given return {(x,y) | x^2+y^2<1}
5. Given a,b return {(x,y) | x^2+y^2=1 & a*x+b*y=0}
...
The set of "access methods" is obviously uncountable.

> Once again the way to represent infinite predicates in future
> relational systems is via access path specification.

That depends on the hardware. Our programming languages are built on the
hardware in which non-linear constraints like above cannot be efficiently
computed. In analogue computers, for example, differential equations were
not a problem. Sqrt was a problem, if I recall correctly, the module was
bigger than the rest of the computer.

Merry Christmas,

JXStern

Dec 24, 2006, 11:22:27 AM
On Thu, 21 Dec 2006 06:15:29 -0500, Thomas Gagne
<tga...@wide-open-west.com> wrote:

>Ultimately, I think I may need to come up with another name for a
>domain-ized database. The word 'database' has too many possibilities.
>It's too general. After I've applied my schema to it it no longer has
>all the possibilities it once had. After my schema's applied it becomes
>something different. It becomes my domain's data base. My domainabase?

It's your application's data model.

J.


JXStern

Dec 24, 2006, 11:34:14 AM
On 21 Dec 2006 13:22:00 -0800, "Neo" <neo5...@hotmail.com> wrote:
>How did Dr Codd make
>network structures fall out of favor?

Y'know, I've often wondered about that.

Explicit b-trees and b-trees crosslinked into "networks" can be very
efficient and straightforward, so why did they lose so completely?
The textbook answer is that they are not reusable, and a normalized
relational database is. But I think the answer is elsewhere.
Because of its independence, the relational database requires some
kind of language interface, pretty much universally some form of SQL
these days, which further hides the implementation from the
application (I agree with topmind on this).

But the bottom line is that Codd provided an alternative with
strengths and weaknesses, and the combination just seemed to make
developers and managers happier over time.

J.

H. S. Lahman

Dec 24, 2006, 12:01:03 PM
Responding to Malik...

>>An unexpected thing happened while debating topmind: I had an epiphany.
>
>

> You see... it is GOOD to have 'topmind' around! <snip>


> At best, we learn something.

Bryce is a bright guy but I don't think his motivation is to challenge
ideas. I've been observing him for a decade or so and I think he just
engages in these debates to annoy OO people for his own amusement. Note
the following observations:

If I had the task of designing a web site that would really annoy OO
people because of things like unsupported assertions and misstatements
about what the OO paradigm was about, it would be Bryce's Geocities web
site. There is no way that anyone practicing the OO paradigm would not
be outraged on reading the material. IMO, it is actually very cleverly
done because to be so universally inflammatory he has to know what
buttons to push. IOW, I think he knows a lot more about OO development
than he lets on and he uses that in his debates to pull OO people's chains.

He has been pushing the same "challenges" for a decade or more and
completely ignores any refutations. In particular, he continues making
his more outrageous misstatements about what OO development is about no
matter how many times he is corrected. Thus he continues to use
assertions like OO development being "noun-driven" that have a tiny
grain of truth (i.e., Peter Coad's primer technique for object blitz
mechanics) as an overall generalization of the entire paradigm. IOW, he
already knows he is wrong when making the statements.

He consistently employs a suite of forensic ploys that are designed to
pull the opponent down a rabbit hole of misdirection. Thus one of his
favorites is to ask for an example, which he then proceeds to tear apart
for reasons unrelated to the original point that triggered the example.
Those ploys are quite predictable once one watches a few of his
debates. Unfortunately they are so overused that it becomes very clear
that Bryce's game is the debate itself rather than the content.

Bottom line: don't feed the troll. There is nothing to be learned in a
debate with Bryce except the effectiveness of debating ploys. He is
just amusing himself by infuriating OO people.


*************
There is nothing wrong with me that could
not be cured by a capful of Drano.

H. S. Lahman
h...@pathfindermda.com
Pathfinder Solutions
http://www.pathfindermda.com
blog: http://pathfinderpeople.blogs.com/hslahman
"Model-Based Translation: The Next Step in Agile Development". Email
in...@pathfindermda.com for your copy.
Pathfinder is hiring:
http://www.pathfindermda.com/about_us/careers_pos3.php.
(888)OOA-PATH

topmind

Dec 24, 2006, 3:52:16 PM
Nick Malik [Microsoft] wrote:
> "topmind" <top...@technologist.com> wrote in message
> news:1166736833....@79g2000cws.googlegroups.com...
> > Thomas Gagne wrote:
> >> The designer of the
> >> second (rightly, I think) believes counting the number of SQL lines
> >> would be difficult since they're distributed throughout his code, and
> >> include string concatenation for variables and discriminations spread
> >> throughout functions.
> >
> > That is a very minor reason to separate. Is it worth making the app 10%
> > to 25% more time-consuming to maintain *just* to be able to count
> > easier? I have to object. Perhaps you have weird managers.
>
>
> That's FUNNY, T! You are arguing with a CTO.

As W shows, rank != right. Managers are often too distant from the
nitty-gritty of daily work such that they over-focus on "big" changes,
such as schema overhauls that need to be planned and managed days in
advance. However, everyday slowness created by a particular design
decision may not be apparent to them. I am not saying for sure this is
the case, but it is something I have witnessed.

Similarly, a cubicle-dweller's perspective may miss things that a
manager has to deal with. Nobody has a perfect perspective. We have to
listen and cooperate to get the best results.

>
> --
> --- Nick Malik [Microsoft]
> MCSD, CFPS, Certified Scrummaster
> http://blogs.msdn.com/nickmalik
>
> Disclaimer: Opinions expressed in this forum are my own, and not
> representative of my employer.
> I do not answer questions on behalf of my employer. I'm just a
> programmer helping programmers.
> --

-T-

Thomas Gagne

Dec 24, 2006, 10:48:00 PM
Nick Malik [Microsoft] wrote:
> <snip>

> That said, an RDBMS can present MANY interfaces to your code, not all of
> which have to be presented through stored procs. You could present through
> views, for example, and still hide some of the details of your db design.
>
> I would also say that the db presents the data for 'many' objects instead of
> a single one. Viewing the db as a single object begs the question: what
> behavior are you encapsulating in it?
>
Objects are often composed of many other objects. My database object is
no different. Primarily, through stored procedures the DB has methods
and projections. The projections can either be returned as collections
of tuples or I can send a lambda expression (especially helpful for
really large result sets) that evaluates one row at a time. Streams can
also be returned. This is where a facade can be helpful to make the DB's
interface more idiomatic for your favorite OOPL. Enjoy!
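
A minimal sketch of such a facade, under loud assumptions: SQLite (used here only because it ships with Python) has no stored procedures, so each facade method stands in for a procedure call, and the class name `AccountsDb`, the `account` table, and every method name are hypothetical. It shows the three return styles described above: a collection of tuples, a caller-supplied function applied one row at a time, and a stream.

```python
import sqlite3
from typing import Callable, Iterator, List, Tuple

_BALANCES_SQL = "SELECT owner, balance FROM account ORDER BY id"


class AccountsDb:
    """Hypothetical facade: callers see methods, never tables or SQL."""

    def __init__(self) -> None:
        self._conn = sqlite3.connect(":memory:")
        self._conn.execute(
            "CREATE TABLE account (id INTEGER PRIMARY KEY, owner TEXT, balance INTEGER)"
        )

    def open_account(self, owner: str, balance: int) -> int:
        """'Method': changes state, as an update procedure would."""
        with self._conn:  # commits on success, rolls back on error
            cur = self._conn.execute(
                "INSERT INTO account (owner, balance) VALUES (?, ?)", (owner, balance)
            )
        return cur.lastrowid

    def balances(self) -> List[Tuple[str, int]]:
        """'Projection' returned as a collection of tuples."""
        return self._conn.execute(_BALANCES_SQL).fetchall()

    def each_balance(self, visit: Callable[[str, int], None]) -> None:
        """'Projection' evaluated one row at a time by the caller's function."""
        for owner, balance in self._conn.execute(_BALANCES_SQL):
            visit(owner, balance)

    def stream_balances(self) -> Iterator[Tuple[str, int]]:
        """'Projection' returned as a lazy stream."""
        yield from self._conn.execute(_BALANCES_SQL)


db = AccountsDb()
db.open_account("alice", 100)
db.open_account("bob", 250)

seen = []
db.each_balance(lambda owner, balance: seen.append(balance))
print(db.balances(), seen)
```

With a DBMS that does support procedures, the SQL strings inside the methods would presumably shrink to procedure calls; the callers would not notice the difference.
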
>
> <snip>

>
>
>> If you agree there's such a thing as an object-relational impedance
>> mismatch, then perhaps it's because you're witnessing the negative
>> consequences of tightly coupling objects that shouldn't be tightly
>> coupled.
>>
>
> Nope. I'm seeing the Object Relational Impedance as a conceptual disconnect
> between the 'traditional' RDBMS interface that doesn't present a mechanism
> for encapsulating both code and operations in the same object wrapper with
> the object oriented interface which absolutely requires it. Creating an
> object wrapper that DOES present these two together is the goal of Object
> Relational Mapping (ORM) tools.
>
I agree it is the goal, but should every DB-ish object in your
application actually map to a tuple inside the DB (bean)? Why do that
when objects should communicate to each other through messages? Aren't
the OR tools distracting OO programmers from how they really ought to
talk to the DB?

> Note that even when you do this, you run into the impedance, and that is
> because RDBMS systems are not the appropriate place to put every business
> capability. (If they were, all apps would be very thin user interfaces on
> very thick databases).
>
I don't know everyone's experience, but every DB I've worked with /was/
my system. It stored the entire state of my system in neat tables and
rows with glorious relations between them to answer every question I
could possibly ask. Everything else was one of two things: automation
or cosmetics. Portfolio management, trading, banking, and
insurance--the DB recorded everything. If the system stopped the DB
knew where. When the system started the DB knew where from. In fact,
before there was a system there was a DB. It was designed, proved
correct, constraints implemented, procedures created, load tested, and
all kinds of fun unit-testing kinds of things before a single line of
application code was created.

In fact, the DB isn't only the biggest object in my system, but it was
also the first object--and an OOPL wasn't even necessary to realize it.


> You can certainly place some activities in the db, including calculations,
> validations, and some data translations (including XML-to-SQL and vice
> versa), but I'd argue against placing too many business activities there,
> especially things like complex (multi-path) workflow (because of the
> difficulty with synchronization logic across a set-oriented interface like
> SQL) or cross-system messaging, etc.
>

The database knows how to do things (load, update, remove, and query)
but it doesn't know why things are done. That's what applications and
users are for.


> Most business capabilities are described as behaviors first, and data
> second.

I disagree, only because behaviors are based on weak assumptions and
common practices. Whatever is done can either be done well or poorly.
Behaviors are based on the weakest facts--the /way/ things are done. In
fact, after analyzing the data and comparing the state of a DB before
and after some behavior, programmers often discover how behavior can be
improved.

But the DB must always be correct. Whether the behaviors are
correct or not, the DB must maintain its integrity. It must protect
its state. In fact, our DR plan is based on the premise that the
DB's integrity is the most critical--everything else is cosmetic.
<http://blogs.in-streamco.com/anything.php?title=rules_for_production>

> <snip>


>>
>
> A good starting point for finding research is to go to a general article
> like the following and follow the reference links off the page.
> http://en.wikipedia.org/wiki/Object_Relational_Mapping
>
>
> I guess one thing that stands out for me: you reached a valuable conclusion
> about the application of OO design methods to RDBMS design, but you didn't
> prove the initial assumption: that stored procedures should be used as the
> only interface for code to access the data in the database. In this
> respect, I am not convinced.
>

I don't blame you. I need to present more evidence, which I will do
through examples.
> <snip>


>
> I also think you lose something valuable with Stored Procs. Excellent
> efforts have been expended to consider the basic principles of RDBMS design
> in objects, and to create objects that will effectively assist with Object
> Relational Mapping as a first step to addressing the Impedance mismatch.
>

You're right. Some great research has been spent here--as there was in
alchemy.

Consider my situation: I have a single database which I know is correct
because it's guarded by procedures, constraints, unit and integrity
tests. I have multiple applications--some of them share common data
models but others of them do not. Which model is correct?

Consider you have 20 programs each doing specific things. Between the
20 you've discovered there are three different object models that best
reflect their dependent applications' needs and designs. Which of the
three should be mapped to the DB? Should the DB's model be massaged to
reflect any of them, or should it be designed to be perfect for the
business data?


> Those objects are defeated by the artificial barriers placed by stored
> procedures. I'm referring to various attempts at Data Access Objects,
> including the .Net Data objects in the Microsoft .Net framework.
>
> Some would say that this reduces the value of the DAO-style objects. I
> would reply that RDBMS systems are based on a mathematical simplicity, an
> approach that is fairly pure and extremely versatile. Hiding that
> mathematical simplicity may or may not be a valuable enterprise, but it is
> clearly the effect of restricting all data access to a stored procedure
> layer. In that aspect, perhaps it is the value of the stored proc that
> should be questioned, and not the value of the Data objects in the OO
> library.
>

Would you make that same argument about a Date object, or any other
object in your system?

"I would reply that Date objects are based on mathematical simplicity,
an approach that is fairly pure and extremely versatile. Hiding that
mathematical simplicity may or may not be a valuable enterprise, but it
is clearly the effect of restricting all data access to Date's interface
methods. In that aspect, perhaps it is the value of Date's interface
that should be questioned and not the value of the Date objects in the
OO library."

--

topmind

Dec 24, 2006, 11:14:22 PM

Patrick May wrote:
> "topmind" <top...@technologist.com> writes:
> > Neo wrote:
> > > > If you take away inheritance, you get "network structures" (AKA
> > > > tangled pasta). Dr. Codd sought to escape those by applying set
> > > > theory, and network structures thankfully fell out of favor,
> > > > until the OO crowd tried to bring them back from the dead.
> > >
> > > Can you give an example of such as tangled pasta?
> >
> > OO Visitor pattern.
>
> The Visitor pattern provides two capabilities: 1) simulation of
> double dispatch and 2) non-intrusively adding new operations to
> existing classes. Both of these address limitations of languages such
> as C++ and Java. The existence of this pattern does not demonstrate a
> general flaw in the object oriented approach.

How about you Visitor defenders present a somewhat practical biz
example where Visitor allegedly makes maintenance easier?

>
> Sincerely,
>
> Patrick
>

-T-

topmind

Dec 24, 2006, 11:27:12 PM
JXStern wrote:
> On 21 Dec 2006 13:22:00 -0800, "Neo" <neo5...@hotmail.com> wrote:
> >How did Dr Codd make
> >network structures fall out of favor?
>
> Y'know, I've often wondered about that.
>
> Explicit b-trees and b-trees crosslinked into "networks" can be very
> efficient and straightforward, so why did they lose so completely?

Efficient, maybe. Straightforward? No way. Relational offers more
consistency. There are fewer different ways to model the same business
in relational. It sounds like a bad thing, but relational killed the
"creativity". It is similar to how structured blocks killed GOTO
creativity. IBM's IMS is dead for a reason.


> The textbook answer is that they are not reusable, and a normalized
> relational database, is. But I think the answer is elsewhere.
> Because of its independence, the relational database requires some
> kind of language interface, pretty much universally some form of SQL
> these days, which further hides the implementation from the
> application (I agree with topmind on this).

Navigational query languages were proposed. They were ugly because
navigational is ugly.

>
> But the bottom line is that Codd provided an alternative with
> strengths and weaknesses, and the combination just seemed to make
> developers and managers happier over time.
>
> J.

-T-

topmind

Dec 24, 2006, 11:48:15 PM
Frans Bouma wrote:
> topmind wrote:
> > Frans Bouma wrote:
> > > topmind wrote:
> > > > You OO'ers keep forgetting: SQL is an interface. I repeat, SQL is
> > > > an interface. It is not "low level hardware".
> > >
> > > SQL is a set-oriented language, it's not an interface as a language
> > > doesn't do anything without context (in this case a
> > > parser-interpreter combi)
> >
> > Perhaps we need to clear up our working semantics with regard to
> > "language" and "interface". Are methods interfaces or a language? I
> > am not sure it really matters and I don't want to get tangled in a
> > definition battle.
>
> Methods are part of an interface written in a language. SQL is a
> language, a set of stored procs is an interface.

It is possible to represent one with the other if an RDBMS supports
triggers. In other words, they are technically interchangeable, sort of
a Turing Equivalency Principle at work.

>
> it's not getting much simpler than that.

Then we are in trouble. I don't think it matters much whether we call
it language or interface so far, and thus see no reason to make a big
deal out of it just yet. IF it becomes pivotal, then we can revisit
the definition.

>
> > > > BTW, Microsoft has ADO, DAO, etc. which are OO wrappers around
> > > > RDBMS.
> > >
> > > no they're not. ADO and DAO aren't OO, as they're COM based so
> > > they're actually procedural (library interfaces implemented on a
> > > live object).
> >
> > Being an OO wrapper on top of procedural calls does not necessarily
> > turn something into non-OO. Please clarify your labelling criteria.
>
> ADO isn't OO, it's COM. COM isn't OO, despite the fact it lets you
> believe you're working with objects, which is actually a facade, you're
> not working with OOP style objects, as there's no inheritance nor
> polymorphism, you just talk to an interface implemented by an
> object-esque construct in memory, which could be seen as a C struct with
> function pointers.

Some claim "structures with function pointers" is in fact OO. The
author Robert C. Martin generally uses this definition IIANM.
Polymorphism is when one puts different function pointers in structure
"cells" with the same name.

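To make the "structures with function pointers" reading concrete, here is a small sketch that uses plain Python dicts as the structures. The shipping example and all names in it are hypothetical. Both structures have a cell named `cost`, and the caller gets different behavior depending on which function was placed in that cell.

```python
# Two plain functions; nothing about them is tied to a class.
def ground_cost(method, weight):
    return method["base"] + 2 * weight


def air_cost(method, weight):
    return method["base"] + 4 * weight


# Two "structures": dicts whose cells hold data and a function reference.
ground = {"name": "ground", "base": 5, "cost": ground_cost}
air = {"name": "air", "base": 20, "cost": air_cost}

# The caller uses the same cell name on both structures; which function
# runs depends on what was stored in that cell.
costs = [method["cost"](method, 3) for method in (ground, air)]
print(costs)  # [11, 32]
```
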
>
> > > > Further, even if OO was the best way to access RDBMS thru an app,
> > > > that does not necessarily extrapolate to all domains. OO being
> > > > good for X does not automatically imply it is good for Y also.
> > >
> > > you don't get the point: in an OO application, which works on data
> > > IN the application, you want to do that in an OO fashion.
> >
> > Why? Is OO proven objectively better?
>
> why would one WANT to use 2 paradigms, which aren't related as in one
> is derived from the other, in a single application? (let's redirect the
> 'what's a paradigm' posts to /dev/null/ first)

Yin and Yang. RDBMS do attribute management far better than code (and
OO). But things like logic expressions (If shipment quantity is greater
than bin size and it is Sunday then...) are better handled in
procedural code. This Yin/Yang between procedural and RDBMS is what
makes them shine. Each does best what it does best. OO does neither best.

>
> > > To obtain the
> > > data from the outside is initiated INSIDE the application, thus
> > > also in an OO fashion. As an RDBMS doesn't understand OO in most
> > > cases, but it works with SQL as it has a SQL interpreter in place
> > > to let you program its internal relational algebra statements in a
> > > more readable way, you've to map statements from OO to SQL and set
> > > oriented results (the sets) from the DB back to OO objects.
> >
> > Are you suggesting methods such as "Add_AND_Clause(column,
> > comparisonOperator, Value)"?
>
> No.

How can you completely wrap SQL into OO without them?

>
> > Those are bloaty and ugly in my opinion, but let's save that value
> > judgement for a later debate on clause/criteria wrappers.
>
> You can perfectly write a set of predicate classes which can be
> inherited by the developer and make them more specific to the domain
> the developer is working with.

Example?
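
For readers who want something concrete to argue over, here is one possible reading of "predicate classes which can be inherited", offered as a sketch and not as how any particular O/R mapper does it. The class names, the column names, and the `to_sql` method are all hypothetical, and column names are assumed to be trusted identifiers.

```python
from dataclasses import dataclass

_ALLOWED_OPS = {"=", "<>", "<", "<=", ">", ">="}


class Predicate:
    """Base class: every predicate can render itself as (sql, params)."""

    def to_sql(self):
        raise NotImplementedError

    def __and__(self, other):
        return And(self, other)


@dataclass
class Compare(Predicate):
    column: str  # assumed to be a trusted identifier, not user input
    op: str
    value: object

    def to_sql(self):
        if self.op not in _ALLOWED_OPS:
            raise ValueError(f"unsupported operator: {self.op}")
        return f"{self.column} {self.op} ?", [self.value]


@dataclass
class And(Predicate):
    left: Predicate
    right: Predicate

    def to_sql(self):
        left_sql, left_params = self.left.to_sql()
        right_sql, right_params = self.right.to_sql()
        return f"({left_sql} AND {right_sql})", left_params + right_params


class OverdueInvoice(Compare):
    """Domain-specific predicate made by inheriting from the generic one."""

    def __init__(self, days: int):
        super().__init__("days_outstanding", ">", days)


where, params = (OverdueInvoice(30) & Compare("region", "=", "EU")).to_sql()
print(where, params)
```

Whether this is less "bloaty" than building the WHERE string directly is exactly the value judgement the two posters disagree on; the sketch only shows what the suggestion could look like.
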

>
> > > > I have
> > > > already agreed that OO may be good for writing device drivers and
> > > > device-driver-like things; but it has not been shown useful to
> > > > view everything as a device driver. I am more interested in
> > > > seeing how OO models biz objects rather than how it wraps system
> > > > services and the like. Biz modeling has been OO's toughest
> > > > evidence cookie to crack (but perhaps not the only).
> > >
> > > huh? walls full of books have been written about this topic and you
> > > declare it the toughest cookie to crack...
> >
> > Such as? I've seen biz examples in OOP books, but they did not show
> > how they were better than the alternative. Showing how to make an
> > Employee class does not by itself tell you why an Employee class is
> > better than not using OO.
>
> I'm not saying everything should be OO because it's otherwise not
> possible, as you can write any program in plain C. It's often more
> suitable for writing an application because the resulting application
> is developed faster (code re-use) and is more maintainable and business
> apps can be very suitable for using an OO language,

Please reread what you replied to. I did NOT claim that OO does not
run.

> simply because you
> have data and logic operating on that data, so IMHO the ideal
> environment for using an OOP approach.

When I see coded proof for custom biz apps, I will believe you. Until
then, I will not take your word for it.

>
> > > > GOF patterns are supposed to be a solution, but GOF patterns have
> > > > no clear rules about when to use what and force a kind of IS-A
> > > > view on modeling instead of HAS-A.
> > >
> > > You also fall into the 'use pattern first, find problem for it
> > > later'-antipattern.
> > >
> > > a pattern is a (not the) solution for a well defined recognizable
> > > problem. So if you recognize the problem in your application, you
> > > can use the pattern which solves THAT problem to solve THAT problem
> > > in your application. That's IT. The GoF book names a set of
> > > patterns and also the problems they solve. If you don't have the
> > > problems they solve, you don't need the patterns.
> >
> > Well, a look-up table is usually simpler and more inspectable than
> > Visitor. Thus, if usefulness is our guide, then GOF patterns are often
> > not the best.
>
> Visitor pattern is a pattern I don't think is very useful as the
> problem it solves isn't very common.
>
> But if your point is that OO is crap because Visitor pattern is silly
> and thus all that's said in the GoF book is therefore also retarded
> then we're done here.

The rest of GOF sucks for somewhat similar reasons.

> > > > GOF patterns are like an attempt to
> > > > catalog GO TO patterns instead of rid GO TO's. Relational is
> > > > comparable to the move from structured programming from GO TO's:
> > > > it provides more consistency and factors common activities into a
> > > > single interface convention (relational operators). OO lets
> > > > people re-invent their own just like there are a jillion ways to
> > > > do the equivalent of IF blocks with GO TO's.
> > >
> > > I've read a lot of nonsense in your post,
> >
> > No, the nonsense comes from the OO zealots. They have no proof for biz
> > apps. Two paradigms are equal or unknown until proven otherwise. I
> > want to see science, not brochures.
>
> you also have no proof for your claims either. As you started the
> claims, let's see them.


I don't claim my favorite approaches are objectively better. I am only
claiming that there is no evidence that OO is better and that one
should not wrap everything they hate behind OO classes UNTIL they prove
OO is better. It is the K.I.S.S. principle, and wrapping is not KISS if
there is no repetition being factored out.

>
>
> FB
>
> --

-T-

Neo

Dec 25, 2006, 2:15:53 AM
> Navigational query languages were proposed.
> They were ugly because navigational is ugly.

People have historically related navigational queries with supposed
"network" databases that were actually relational/hierarchical hybrid
dbs. Below are two nearly equivalent queries based on nearly equivalent
data structures. The first one is for a true network-type db. Which of
the below is navigational/ugly?

(!= (and (get person instance *)
(get * gender male)
(get (get * child john) child *))
john)

SELECT P2.*
FROM ((person INNER JOIN link ON person.ID = link.childID)
INNER JOIN link AS link2 ON link.parentID = link2.parentID)
INNER JOIN person AS P2 ON link2.childID = P2.ID
WHERE (((P2.name)<>"John")
AND ((person.name)="John")
AND ((P2.sex)="Male"));
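
To make the SQL version checkable, here is a sketch that runs it against a tiny in-memory SQLite database. The sample rows are invented. The query keeps the same joins and filters but drops the Access-generated parentheses, uses single-quoted string literals, and selects only the name to keep the output short; the `person` and `link` column layouts are assumed from the query itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE person (ID INTEGER PRIMARY KEY, name TEXT, sex TEXT);
CREATE TABLE link (parentID INTEGER, childID INTEGER);
-- Invented sample data: Mary has three children; Tom is unrelated.
INSERT INTO person VALUES (1, 'Mary', 'Female'), (2, 'John', 'Male'),
                          (3, 'Bob', 'Male'), (4, 'Sue', 'Female'),
                          (5, 'Tom', 'Male');
INSERT INTO link VALUES (1, 2), (1, 3), (1, 4);
""")

# Male persons, other than John, who share a parent with John.
brothers = conn.execute("""
SELECT P2.name
FROM person
INNER JOIN link ON person.ID = link.childID
INNER JOIN link AS link2 ON link.parentID = link2.parentID
INNER JOIN person AS P2 ON link2.childID = P2.ID
WHERE P2.name <> 'John'
  AND person.name = 'John'
  AND P2.sex = 'Male'
""").fetchall()
print(brothers)  # [('Bob',)]
```

If both parents were recorded in `link`, each brother would come back once per shared parent, so a real query would probably want DISTINCT.
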

topmind

Dec 25, 2006, 2:48:29 AM

The biggest bloater of the SQL version is the joins. Some RDBMS offer
"natural joins" to avoid having to explicitly state common joins. Thus,
the fault is SQL (or at least this version) and not relational in
general in this case. Without the join jabber, they would be quite
comparable. Your version also tends to use symbols instead of
key-words. Early lab relational languages also did this, but IBM
rejected that approach, afraid it would scare away customers. Thus,
that is also a language-specific issue.

(What is with the extra parentheses? It looks like the MS-Access-style
bastardization of SQL code.)

-T-

Frans Bouma

Dec 25, 2006, 6:43:28 AM
Thomas Gagne wrote:
> Nick Malik [Microsoft] wrote:
> > <snip>
> >> If you agree there's such a thing as an object-relational impedance
> >> mismatch, then perhaps it's because you're witnessing the negative
> >> consequences of tightly coupling objects that shouldn't be tightly
> >> coupled.
> >>
> >
> > Nope. I'm seeing the Object Relational Impedance as a conceptual
> > disconnect between the 'traditional' RDBMS interface that doesn't
> > present a mechanism for encapsulating both code and operations in
> > the same object wrapper with the object oriented interface which
> > absolutely requires it. Creating an object wrapper that DOES
> > present these two together is the goal of Object Relational Mapping
> > (ORM) tools.
> I agree it is the goal, but should every DB-ish object in your
> application actually map to a tuple inside the DB (bean)? Why do
> that when objects should communicate to each other through messages?
> Aren't the OR tools distracting OO programmers from how they ought
> really talk to the DB?

I wouldn't say 'distracting', I'd say 'providing an alternative way to
work with the same well-known relational database.'.

Example: if you model your relational model with NIAM, you can use the
model for both your domain classes and also for the database model
(generate an E/R model from it). An O/R mapper provides a way to use
the entities defined at the level of abstraction NIAM provides both in
your own code and also in the DB, i.o.w. it provides a way to utilize
the model at runtime in an OO fashion.

> > Most business capabilities are described as behaviors first, and
> > data second.
> I disagree, only because behaviors are based on weak assumptions and
> common practices. Whatever is done can either be done well or
> poorly. Behaviors are based on the weakest facts--the way things are
> done. In fact, after analyzing the data and comparing the state of a
> DB before and after some behavior, programmers often discover how
> behavior can be improved.
>
> But the DB must always be correct. Whether the behaviors are
> correct or not, the DB must maintain its integrity. It must
> protect its state. In fact, our DR plan is based on the premise
> that the DB's integrity is the most critical--everything else is
> cosmetic.
> <http://blogs.in-streamco.com/anything.php?title=rules_for_production>

Though what is 'correctness' in the DB? Isn't that close to semantical
interpretation of the data in such a way that you need to transform the
data into information first to be sure it's correct?

I mean, sure, there are referential integrity rules, but if I store a
row in the customer table with an address where city is New York and
country is The Netherlands, it's not correct, though the db doesn't
object.
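
A small sketch of that situation, using SQLite and an invented three-table schema (`city`, `country`, `customer`, all hypothetical). Both foreign keys are enforced, so a city the database has never heard of is rejected, yet the New York / The Netherlands row goes in without complaint.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked
conn.executescript("""
CREATE TABLE country (name TEXT PRIMARY KEY);
CREATE TABLE city (name TEXT PRIMARY KEY);
CREATE TABLE customer (
    id INTEGER PRIMARY KEY,
    city TEXT NOT NULL REFERENCES city(name),
    country TEXT NOT NULL REFERENCES country(name)
);
INSERT INTO country VALUES ('The Netherlands'), ('USA');
INSERT INTO city VALUES ('New York'), ('The Hague');
""")

insert = "INSERT INTO customer (city, country) VALUES (?, ?)"

# Each value satisfies its own constraint, so the row is accepted even
# though the combination is wrong.
conn.execute(insert, ("New York", "The Netherlands"))

# A city that is not in the city table does violate a constraint.
try:
    conn.execute(insert, ("Atlantis", "USA"))
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

accepted = conn.execute("SELECT COUNT(*) FROM customer").fetchone()[0]
print(accepted, rejected)  # 1 True
```
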

> > <snip>
> >
> > I also think you lose something valuable with Stored Procs.
> > Excellent efforts have been expended to consider the basic
> > principles of RDBMS design in objects, and to create objects that
> > will effectively assist with Object Relational Mapping as a first
> > step to addressing the Impedance mismatch.
> You're right. Some great research has been spent here--as there was
> in alchemy.
>
> Consider my situation, I have a single database which I know is
> correct because it's guarded by procedures, constraints, unit and
> integrity tests.

That doesn't mean anything. It still can be incorrect in some form.
There's a difference between data and information, and unless you have
your data -> information transformation code inside your db for ALL
your applications (which thus means, your applications are written
inside the db) you still can have incorrect data.

Before people start jumping up and down that that's impossible, the
keyword is 'incorrect'. You might have a relational model which of
course forces integrity rules on you and the data stored in the
physical representation of that relational model thus obeys these
rules, but that doesn't mean the data is correct.

This means that data is only correct in the semantical context the
data is used in, i.e. when it becomes information. An address with 'New
York' as city and 'The Netherlands' as country isn't violating your
model's rules, but is violating a rule for a valid address in the
Netherlands as we don't have a city called 'New York' in the
Netherlands.

So what you're saying doesn't apply solely to a db with procs; it also
applies to dbs with an external API.

> I have multiple applications--some of them share
> common data models but others of them do not. Which model is correct?
> Consider you have 20 programs each doing specific things. Between
> the 20 you've discovered there's three different object models that
> best reflect their dependent applications needs and designs. Which
> of the three should be mapped to the DB? Should the DB's model be
> massaged to reflect any of them, or should it be designed to be
> perfect for the business data?

In my opinion, there's just 1 model possible, and it comes down to a
NIAM/ORM (object role modelling) model. With that model, you can speak
about an entity and how it's used in the persistent storage and also in
your application.
See:
http://weblogs.asp.net/fbouma/archive/2006/08/23/Essay_3A00_-The-Database-Model-is-the-Domain-Model.aspx

Everyone who has created NIAM models or even E/R models knows that
there are a couple of different models possible when you look solely at
the information analysis results. So one could for example say: to ease
development, aggregated entities should be possible, e.g. a
'SalesOrder' which contains 'Order', 'Customer', order lines etc. and
is used as a single entity in a piece of code. This however comes down
to the fact that the entity 'SalesOrder' is a denormalization of a
normal NIAM model with customer, order, order lines etc.

Question now is: would it be best to store the SalesOrder directly
into the db as a single table, or do you want to keep the normalized
tables and use the denormalized SalesOrder entity in your code?

Either way, the SalesOrder entity has an advantage: you can define a
set of rules for this SalesOrder and implement them inside that
salesorder, and your own code working with that SalesOrder is then
simpler.
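
As an illustration of rules living inside the aggregated entity, here is a sketch with plain Python dataclasses. The two rules (positive quantity, credit limit) and every field name are invented for the example; how the aggregate is loaded from or saved to the normalized tables is deliberately left out.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Customer:
    name: str
    credit_limit: int


@dataclass
class OrderLine:
    product: str
    quantity: int
    unit_price: int


@dataclass
class SalesOrder:
    """Aggregated entity: customer plus order lines, used as one unit."""

    customer: Customer
    lines: List[OrderLine] = field(default_factory=list)

    def total(self) -> int:
        return sum(line.quantity * line.unit_price for line in self.lines)

    def add_line(self, line: OrderLine) -> None:
        # Hypothetical rules, enforced in one place inside the entity.
        if line.quantity <= 0:
            raise ValueError("quantity must be positive")
        if self.total() + line.quantity * line.unit_price > self.customer.credit_limit:
            raise ValueError("order would exceed the customer's credit limit")
        self.lines.append(line)


order = SalesOrder(Customer("Acme", credit_limit=100))
order.add_line(OrderLine("widget", 2, 30))

try:
    order.add_line(OrderLine("gadget", 1, 50))  # 60 + 50 > 100
    over_limit = False
except ValueError:
    over_limit = True

print(order.total(), over_limit)  # 60 True
```

Code that works with `SalesOrder` never repeats the credit check, which is the simplification being claimed.
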

Choosing this route IMHO implies a new abstraction layer above a NIAM
model. It then comes down to: at which abstraction level do you want to
talk in your code: the NIAM level or the abstraction level just above
it?

FB

--
------------------------------------------------------------------------
Lead developer of LLBLGen Pro, the productive O/R mapper for .NET
LLBLGen Pro website: http://www.llblgen.com
My .NET blog: http://weblogs.asp.net/fbouma
Microsoft MVP (C#)
------------------------------------------------------------------------

Dmitry A. Kazakov

Dec 25, 2006, 8:11:22 AM
On 25 Dec 2006 11:43:28 GMT, Frans Bouma wrote:

> Thomas Gagne wrote:

>> Consider my situation, I have a single database which I know is
>> correct because it's guarded by procedures, constraints, unit and
>> integrity tests.
>
> that doesn't mean anything. It still can be incorrect in some form.
> There's a difference between data and information, and unless you have
> your data -> information transformation code inside your db for ALL
> your applications (which thus means, your applications are written
> inside the db) you still can have incorrect data.
>
> Before people start jumping up and down that that's impossible, the
> keyword is 'incorrect'. You might have a relational model which of
> course forces integrity rules on you and the data stored in the
> physical representation of that relational model thus obeys these
> rules, but that doesn't mean the data is correct.
>
> This means that data is only correct in the semantical context the
> data is used in, i.e. when it becomes information. An address with 'New
> York' as city and 'The Netherlands' as country isn't violating your
> model's rules, but is violating a rule for a valid address in the
> Netherlands as we don't have a city called 'New York' in the
> Netherlands.

You address here one issue, that is, the RA rules applied to wrong data may
produce wrong outcome. It is important, but it is not *the* problem. Which
is that having a correct (consistent) set of rules (like RA) and some valid
input (data set), one can still produce semantically wrong outcomes. This
is universally true. Arithmetic is certainly consistent, but 1 apple + 1
orange is not 2 Ampere. Arithmetic does not define the meaning of 1 and +.
Equivalently, the operations of RA should have a meaning in the application
domain. This meaning lies outside RA, and nothing in RA can guarantee
anything about it. Same is true for any programming language. So the custom
chat about "DB correctness" is either trivial or rubbish.

Neo

Dec 25, 2006, 11:00:19 AM
> What is with the extra parentheses?
> It looks like the MS-Access-style bastardization of SQL code.

Yes, it was auto-generated by MS Access. I suppose the extra
parentheses are helpful to their parser.

> The biggest bloater of the SQL version is the joins. Some RDBMS offer
> "natural joins" to avoid having to explicitly state common joins.

Which RMDB? What might it look like in this case?

> Thus, the fault is SQL (or at least this version)
> and not relational in general in this case.

> Your version also tends to use symbols instead of key-words.
> Early lab relational languages also did this, but IBM
> rejected that approach, afraid it would scare away customers.
> Thus, that is also a language-specific issue.

IBM rejected earlier relational languages with less join jabbers and
key-words because they thought it would scare away customers? Can you
explain further?

> Without the join jabber, they would be quite comparable.

For a majority of the cases, SQL expressions are simpler. For some more
complex networks with the flexibility to meet new data requirements
with minimal impact, like the "toy" examples, network-type db's
expressions can be simpler.

Patrick May

Dec 25, 2006, 1:22:19 PM

Non-sequitur. I never claimed to be a "fan" of the pattern, nor
did I assert that it makes maintenance easier. I merely pointed out
the reasons for using it and provided an example of the context in
which it applies. My only purpose was to demonstrate that your use of
this pattern to support your contention that OO techniques result in
"tangled pasta" is without merit.

Sincerely,

Patrick

------------------------------------------------------------------------
S P Engineering, Inc. | Large scale, mission-critical, distributed OO
| systems design and implementation.
p...@spe.com | (C++, Java, Common Lisp, Jini, middleware, SOA)

topmind

Dec 25, 2006, 5:36:53 PM

Fine, pick another OO pattern and kick procedural/relational's ass with
it. I don't care how you kick its ass, just do it and show it. Put your
money where your mouth is and beat the hell out of me with OO.

>
> Sincerely,
>
> Patrick
>

-T-

topmind

Dec 25, 2006, 6:16:35 PM

Neo wrote:
> > What is with the extra parentheses?
> > It looks like the MS-Access-style bastardization of SQL code.
>
> Yes, it was auto-generated by MS Access. I suppose the extra
> parentheses are helpful to their parser.
>
> > The biggest bloater of the SQL version is the joins. Some RDBMS offer
> > "natural joins" to avoid having to explicitly state common joins.
>
> Which RMDB? What might it look like in this case?

It goes something like:

...NATURAL JOIN tableA, tableB, tableC
WHERE...

The natural join clause then uses either a data dictionary or column
names to perform the join rather than explicit matches. I have not used
it myself, so I am not an expert on it. The point is that joins can be
simplified. If a dialect of SQL itself does not provide it, then one
can use something like:

SELECT ...
#joinClause("tableA,tableB,tableC")#
WHERE ...

Our embedded function (returning a string) can then supply the
commonly-used joins and create the SQL join phrase for us.
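
Both ideas can be tried out in a few lines. The sketch below uses SQLite, which does accept NATURAL JOIN; in standard SQL the keyword sits between the table names and matches on identically named columns. The `join_clause` helper and its `JOIN_RULES` lookup are hypothetical stand-ins for the embedded function and join reference table described above, and the two tables are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, total INTEGER);
INSERT INTO customer VALUES (1, 'Ann'), (2, 'Bo');
INSERT INTO orders VALUES (10, 1, 50), (11, 2, 75);
""")

# Variant 1: NATURAL JOIN matches on the only shared column name, customer_id.
natural = conn.execute(
    "SELECT name, total FROM customer NATURAL JOIN orders ORDER BY order_id"
).fetchall()

# Variant 2: a join reference table kept in code (hypothetical), so the
# commonly used join conditions are written down once.
JOIN_RULES = {
    ("customer", "orders"): "customer.customer_id = orders.customer_id",
}


def join_clause(tables):
    """Build 'a INNER JOIN b ON ...' by chaining each adjacent pair of tables."""
    clause = tables[0]
    for left, right in zip(tables, tables[1:]):
        clause += f" INNER JOIN {right} ON {JOIN_RULES[(left, right)]}"
    return clause


explicit = conn.execute(
    f"SELECT name, total FROM {join_clause(['customer', 'orders'])} ORDER BY order_id"
).fetchall()

print(natural == explicit, natural)
```

The helper only handles tables that join in a simple chain; anything with self-joins or aliases, like the sibling query earlier in the thread, would need a richer lookup.
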

>
> > Thus, the fault is SQL (or at least this version)
> > and not relational in general in this case.
> > Your version also tends to use symbols instead of key-words.
> > Early lab relational languages also did this, but IBM
> > rejected that approach, afraid it would scare away customers.
> > Thus, that is also a language-specific issue.
>
> IBM rejected earlier relational languages with less join jabbers and
> key-words because they thought it would scare away customers? Can you
> explain further?

The early experiments used math-based symbols instead of key-words
because researchers were math savvy and because Dr. Codd presented his
query language with math-based symbols. IBM felt that it would be more
marketable if key-words were used instead. Perhaps they felt it should
be natural to COBOL programmers. The early symbols did have a natural
join if I remember correctly, but I am not positive. I don't know which
vendors supply a natural join. It seems based on column names, which
to me is a mistake: it should be based upon a data dictionary or join
reference table IMO.

>
> > Without the join jabber, they would be quite comparable.
>
> For a majority of the cases, SQL expressions are simpler. For some more
> complex networks with the flexibility to meet new data requirements
> with minimal impact, like the "toy" examples, network-type db's
> expressions can be simpler.

So network DB's are better at toy examples? I won't even bother to
challenge that :-)

Generally a problem domain tends to present common needs/patterns such
that views or custom functions (if available) can often simplify SQL
queries. I do wish there was more competition in relational query
languages. SQL has grown a bit stale.

Note that Dr. Codd and supporters used to have "query-offs" with
Bachman's group, the premier navigational/network supporter at the
time. By most accounts, Dr. Codd's group won.

-T-

Patrick May

Dec 25, 2006, 6:17:25 PM12/25/06
to

You've got it backwards. You used the Visitor pattern in support
of one of your claims in your conversation with Neo. I simply pointed
out that it does not, in fact, support your argument. The burden of
proof is still on you to provide an example of OO techniques leading
to "tangled pasta".

Alternatively, you could simply admit to Neo that you cannot
support your assertion.

topmind

Dec 25, 2006, 6:50:10 PM12/25/06
to

There are no real rules for when to use what GOF pattern, especially if
there are competing factors. The rules of relational normalization are
governed mostly by duplication removal. All else being equal,
consistency trumps inconsistency.

>
> Alternatively, you could simply admit to Neo that you cannot
> support your assertion.
>
> Sincerely,
>
> Patrick
>

-T-

Frans Bouma

Dec 26, 2006, 5:25:26 AM12/26/06
to
Dmitry A. Kazakov wrote:

> On 25 Dec 2006 11:43:28 GMT, Frans Bouma wrote:
>
> > Thomas Gagne wrote:
>
> >> Consider my situation, I have a single database which I know is
> >> correct because it's guarded by procedures, constraints, unit and
> >> integrity tests.
> >
> > that doesn't mean anything. It still can be incorrect in some form.
> > There's a difference between data and information, and unless you
> > have your data -> information transformation code inside your db
> > for ALL your applications (which thus means, your applications are
> > written inside the db) you still can have incorrect data.
> >
> > Before people start jumping up and down that thats impossible, the
> > keyword is 'incorrect'. You might have a relational model which of
> > course forces integrity rules on you and the data stored in the
> > physical representation of that relational model thus obeys these
> > rules, but that doesn't mean the data is correct.
> >
> > This means that data is only correct in the semantical context the
> > data is used in, i.e. when it becomes information. An address with
> > 'New York' as city and 'The Netherlands' as country isn't violating
> > your model's rules, but is violating a rule for a valid address in
> > the Netherlands as we don't have a city called 'New York' in the
> > netherlands.
>
> You address here one issue, that is, the RA rules applied to wrong
> data may produce wrong outcome. It is important, but it is not the
> problem. Which is that having a correct (consistent) set of rules
> (like RA) and some valid input (data set), one can still produce
> semantically wrong outcomes. This is universally true. Arithmetic is
> certainly consistent, but 1 apple + 1 orange is not 2 Ampere.
> Arithmetic does not define the meaning of 1 and +. Equivalently, the
> operations of RA should have a meaning in the application domain.
> This meaning lies outside RA, and nothing in RA can warranty anything
> about it. Same is true for any programming language. So the custom
> chat about "DB correctness" is either trivial or rubbish.

Thanks Dmitry for correctly wording this. I tried to explain it, but
you said it way better. :)

Patrick May

Dec 26, 2006, 7:47:44 AM12/26/06
to

So you can't provide an actual example. You should just come out
and say so.

>> Alternatively, you could simply admit to Neo that you cannot
>> support your assertion.

This still appears to be your only option.

topmind

Dec 26, 2006, 12:08:02 PM12/26/06
to

Patrick May wrote:
> "topmind" <top...@technologist.com> writes:
> > Patrick May wrote:
> >> You've got it backwards. You used the Visitor pattern in
> >> support of one of your claims in your conversation with Neo. I
> >> simply pointed out that it does not, in fact, support your
> >> argument. The burden of proof is still on you to provide an
> >> example of OO techniques leading to "tangled pasta".
> >
> > There are no real rules for when to use what GOF pattern, especially
> > if there are competing factors. The rules of relational
> > normalization are governed mostly by duplication removal. All else
> > being equal, consistency trumps inconsistency.
>
> So you can't provide an actual example. You should just come out
> and say so.

How exactly does one provide examples to show that there are no
consistent consensus rules for something? If I say "There is no
evidence that unicorns exist", you cannot ask for an example. It is
YOUR burden to show that unicorns exist. Now, replace unicorns with
"consistent consensus rules".

>
> >> Alternatively, you could simply admit to Neo that you cannot
> >> support your assertion.
>
> This still appears to be your only option.
>
> Sincerely,
>
> Patrick
>

-T-

Patrick May

Dec 26, 2006, 12:27:57 PM12/26/06
to
"topmind" <top...@technologist.com> writes:
> Patrick May wrote:
>> "topmind" <top...@technologist.com> writes:
>> > Patrick May wrote:
>> >> You've got it backwards. You used the Visitor pattern in
>> >> support of one of your claims in your conversation with Neo. I
>> >> simply pointed out that it does not, in fact, support your
>> >> argument. The burden of proof is still on you to provide an
>> >> example of OO techniques leading to "tangled pasta".
>> >
>> > There are no real rules for when to use what GOF pattern,
>> > especially if there are competing factors. The rules of
>> > relational normalization are governed mostly by duplication
>> > removal. All else being equal, consistency trumps inconsistency.
>>
>> So you can't provide an actual example. You should just come
>> out and say so.
>
> How exactly does one provide examples to show that there are no
> consistent consensus rules for something?

That wasn't the claim under discussion. You said, in message
1166720109....@73g2000cwn.googlegroups.com:

> If you take away inheritence, you get "network structures" (AKA
> tangled pasta).

Neo, quite reasonably, asked you for an example of such tangled
pasta. You have thus far failed to provide an example. After this
many messages, it appears that you don't have one. You should just
admit it.

S Perryman

Dec 26, 2006, 12:50:08 PM12/26/06
to
topmind wrote:

> Patrick May wrote:

>>"topmind" <top...@technologist.com> writes:

>>>Patrick May wrote:

PM> You've got it backwards. You used the Visitor pattern in
PM>support of one of your claims in your conversation with Neo. I
PM>simply pointed out that it does not, in fact, support your
PM>argument. The burden of proof is still on you to provide an
PM>example of OO techniques leading to "tangled pasta".

TM>There are no real rules for when to use what GOF pattern, especially
TM>if there are competing factors. The rules of relational
TM>normalization are governed mostly by duplication removal. All else
TM>being equal, consistency trumps inconsistency.

> So you can't provide an actual example. You should just come out
>>and say so.

> How exactly does one provide examples to show that there are no
> consistent consensus rules for something?

I contend that GoF have such rules: they are labelled "motivation" etc.


> If I say "There is no
> evidence that unicorns exist", you cannot ask for an example. It is
> YOUR burden to show that unicorns exist. Now, replace unicorns with
> "consistent consensus rules".

Counter-argument : it is easier to *disprove* something than it is to
*prove* it. ***

Proofs are universal : they must hold under all conditions.
Dis-proof is existential : only one condition has to be found to render
a proof statement invalid as it stands.

So, as far as GoF patterns go (and using your weird language) :

show us *one* "inconsistent consensus rule" .


It is *your burden* to show that one condition.

If you cannot, as Patrick May has so amusingly had you squirming over
the months on various different threads trying to avoid, state that
while you have doubts as to the veracity of something, you specifically
do not have the proof (and/or ability) to disprove the veracity.


Regards,
Steven Perryman

*** Lest you try to claim this is not how proof works etc, I will
*immediately provide a real-world example of one of the most important
advances in science of the last 100 yrs* (ie teaching you how to
actually disprove something) .

topmind

Dec 26, 2006, 1:44:02 PM12/26/06
to

S Perryman wrote:
> topmind wrote:
>
> > Patrick May wrote:
>
> >>"topmind" <top...@technologist.com> writes:
>
> >>>Patrick May wrote:
>
> PM> You've got it backwards. You used the Visitor pattern in
> PM>support of one of your claims in your conversation with Neo. I
> PM>simply pointed out that it does not, in fact, support your
> PM>argument. The burden of proof is still on you to provide an
> PM>example of OO techniques leading to "tangled pasta".
>
> TM>There are no real rules for when to use what GOF pattern, especially
> TM>if there are competing factors. The rules of relational
> TM>normalization are governed mostly by duplication removal. All else
> TM>being equal, consistency trumps inconsistency.
>
> > So you can't provide an actual example. You should just come out
> >>and say so.
>
> > How exactly does one provide examples to show that there are no
> > consistent consensus rules for something?
>
> I contend that GoF have such rules: they are labelled "motivation" etc .

They are often worded as "adding an X without having to change Y".
However, change needs often change over time. Up-front change needs are
often not a good guide to future change needs.

>
>
> > If I say "There is no
> > evidence that unicorns exist", you cannot ask for an example. It is
> > YOUR burden to show that unicorns exist. Now, replace unicorns with
> > "consistent consensus rules".
>
> Counter-argument : it is easier to *disprove* something than it is to
> *prove* it. ***
>
> Proofs are universal : they must hold under all conditions.
> Dis-proof is existential : only one condition has to be found to render
> a proof statement invalid as it stands.


Okay, I declare "tangled pasta" a subjective opinion. However, there is
no evidence of GOF OO patterns improving realistic business logic.
Until they are proven better in my domain with inspectable public
source code, I shall use procedural/relational techniques instead and
recommend others ignore them also.


>
> So, as far as GoF patterns go (and using your weird language) :
>
> show us *one* "inconsistent consensus rule" .
>
>
> It is *your burden* to show that one condition.
>
> If you cannot, as Patrick May has so amusingly had you squirming over
> the months on various different threads trying to avoid, state that
> while you have doubts as to the veracity of something, you specifically
> do not have the proof (and/or ability) to disprove the veracity.

The bottom line is that you cannot prove GOF OO better. Mr. May likes
nitty side-tracks to distract from the real issue. He is more
interested in bashing me than in defending OO. Kick the messenger all
you want, but OO is not proven better outside of systems software.
Whether I am a genius or Bozo, you still have no OO proof.

>
>
> Regards,
> Steven Perryman
>

-T-

Thomas Gagne

Dec 26, 2006, 1:47:48 PM12/26/06
to
Frans Bouma wrote:
> Thomas Gagne wrote:
>
> <snip>

>
> Though what is 'correctness' in the DB? Isn't that close to semantical
> interpretation of the data in such a way that you need to transform the
> data into information first to be sure it's correct?
>
> I mean, sure, there are referential integrity rules, but if I store a
> row in the customer table with an address where city is New York and
> Country is THe Netherlands, it's not correct, though the db doesn't
> object.
>
For DB's to be correct, more is needed than referential integrity. For
instance, in financial systems the data needs to balance. Transactions
are supposed to balance, which in theory would keep the system
"balanced", but production unit tests (doesn't everyone run those?) can
prove on a daily (or more frequent) basis that the DB's state is both
referentially correct, business rule correct, and in other respects,
correct.
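
One such integrity test might look like the sketch below. It assumes a hypothetical ledger_entry table that stores signed amounts in cents, and it uses SQLite through Python's sqlite3 module purely for illustration; the real schema and test suite are not shown in this thread. The check reports every transaction whose entries fail to sum to zero.

```python
import sqlite3


def unbalanced_transactions(conn):
    """Return the ids of transactions whose entries do not sum to zero."""
    rows = conn.execute(
        "SELECT txn_id FROM ledger_entry "
        "GROUP BY txn_id HAVING SUM(amount_cents) <> 0 "
        "ORDER BY txn_id"
    ).fetchall()
    return [txn_id for (txn_id,) in rows]


def make_sample_ledger():
    """Build an in-memory ledger with one balanced and one unbalanced transaction."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE ledger_entry ("
        "txn_id INTEGER NOT NULL, "
        "account TEXT NOT NULL, "
        "amount_cents INTEGER NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO ledger_entry VALUES (?, ?, ?)",
        [
            (1, "cash", -5000), (1, "inventory", 5000),  # balanced
            (2, "cash", 1200), (2, "revenue", -1000),    # off by 200 cents
        ],
    )
    return conn
```

Run on a schedule against production, a check like this returns an empty list when every transaction balances. It only shows that this one rule holds; other business rules would need their own checks.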

>
>>> <snip>
>>>
>>> I also think you lose something valuable with Stored Procs.
>>> Excellent efforts have been expended to consider the basic
>>> principles of RDBMS design in objects, and to create objects that
>>> will effectively assist with Object Relational Mapping as a first
>>> step to addressing the Impedence mismatch.
>>>
>> You're right. Some great research has been spent here--as there was
>> in alchemy.
>>
>> Consider my situation, I have a single database which I know is
>> correct because it's guarded by procedures, constraints, unit and
>> integrity tests.
>>
> <snip>

>
> So what you're saying doesn't imply solely on a db with procs, it also
> implies on db's with an external api.
>
Even stripped of all APIs, the database should be provably correct.
APIs provide a rampart against corruption (data errors), as does
referential integrity (structural errors). Whatever bugs may express
themselves in the code it's more important to identify and isolate them
in the database. Bad data in the database can infect multiple
applications and be the cause for bad decisions--both manual and
automated. This is one of the reasons we focus so strongly on DB integrity.
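
As a rough sketch of the two layers working together: below, a hypothetical AccountStore class plays the role of the procedure-style API (the only way callers touch the data), and a CHECK constraint plays the role of the structural guard. SQLite has no stored procedures, so a Python class stands in for them here; the table and method names are assumptions for illustration only.

```python
import sqlite3


class AccountStore:
    """Hypothetical procedure-style API: callers never touch the table directly."""

    def __init__(self):
        self._conn = sqlite3.connect(":memory:")
        # Structural guard: the database itself refuses negative balances.
        self._conn.execute(
            "CREATE TABLE account ("
            "name TEXT PRIMARY KEY, "
            "balance_cents INTEGER NOT NULL CHECK (balance_cents >= 0))"
        )

    def open_account(self, name, balance_cents):
        with self._conn:
            self._conn.execute(
                "INSERT INTO account VALUES (?, ?)", (name, balance_cents)
            )

    def transfer(self, source, target, amount_cents):
        """Move money atomically; any failure rolls back both updates."""
        if amount_cents <= 0:
            raise ValueError("amount must be positive")
        with self._conn:  # commits on success, rolls back on an exception
            credited = self._conn.execute(
                "UPDATE account SET balance_cents = balance_cents + ? "
                "WHERE name = ?",
                (amount_cents, target),
            ).rowcount
            debited = self._conn.execute(
                "UPDATE account SET balance_cents = balance_cents - ? "
                "WHERE name = ?",
                (amount_cents, source),
            ).rowcount
            if credited != 1 or debited != 1:
                raise LookupError("unknown account")

    def balance(self, name):
        row = self._conn.execute(
            "SELECT balance_cents FROM account WHERE name = ?", (name,)
        ).fetchone()
        if row is None:
            raise LookupError("unknown account")
        return row[0]
```

An overdraft attempt raises sqlite3.IntegrityError from the CHECK constraint, and the surrounding transaction undoes the credit that was already applied, so the stored balances stay consistent even when the calling code is wrong.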

>
>> I have multiple applications--some of them share
>> common data models but others of them do not. Which model is correct?
>> Consider you have 20 programs each doing specific things. Between
>> the 20 you've discovered there's three different object models that
>> best reflect their dependent applications needs and designs. Which
>> of the three should be mapped to the DB? Should the DB's model be
>> massaged to reflect any of them, or should it be designed to be
>> perfect for the business data?
>>
>
> In my opinion, there's just 1 model possible, and it comes down to a
> NIAM/ORM (object role modelling) model. With that model, you can speak
> about an entity and how it's used in the persistent storage and also in
> your application.
>
I don't agree there's a chicken-and-egg problem. Chicken and eggs
dilemmas are so because both the chicken and egg are required entities.
Object-relational design questions aren't characterized that way because
objects and object models are optional. They exist only because system
designers selected object oriented languages to program with. The
relational model exists (and persists) with or without objects or object
oriented languages.

Along the same theme, the article also errs in presenting a
bifurcation. Even if we assume both object and relational models exist
in symbiosis we do not have to map one model onto the other. We can
instead implement to the interface, which is what transactions are all
about.

In fact, I could probably prove transactions are how it should be done
using the same proof for an intersection table's requirement to
represent many-to-many relationships--but that proof is outside the
scope of this reply. It is, however, a good subject for a subsequent
article. It already proves the need for middleware, why not prove
transactions are the most efficient and correct approach to joining
applications to databases?

Thomas Gagne

Dec 26, 2006, 1:57:06 PM12/26/06
to
Frans Bouma wrote:
> Dmitry A. Kazakov wrote:
>
>
>> <snip>

>> You address here one issue, that is, the RA rules applied to wrong
>> data may produce wrong outcome. It is important, but it is not the
>> problem. Which is that having a correct (consistent) set of rules
>> (like RA) and some valid input (data set), one can still produce
>> semantically wrong outcomes. This is universally true. Arithmetic is
>> certainly consistent, but 1 apple + 1 orange is not 2 Ampere.
>> Arithmetic does not define the meaning of 1 and +. Equivalently, the
>> operations of RA should have a meaning in the application domain.
>> This meaning lies outside RA, and nothing in RA can warranty anything
>> about it. Same is true for any programming language. So the custom
>> chat about "DB correctness" is either trivial or rubbish.
>>
>
> Thanks Dmitry for correctly wording this. I tried to explain what you
> said it way better. :)
>
> FB
>
>
I must be the only one that didn't follow it.

Where did the wrong data come from? Why is it solely relational
algebra's problem to detect it?

topmind

Dec 26, 2006, 3:33:05 PM12/26/06
to

Thomas Gagne wrote:
> Frans Bouma wrote:
> > Thomas Gagne wrote:
> >
> > <snip>
> >
> > Though what is 'correctness' in the DB? Isn't that close to semantical
> > interpretation of the data in such a way that you need to transform the
> > data into information first to be sure it's correct?
> >
> > I mean, sure, there are referential integrity rules, but if I store a
> > row in the customer table with an address where city is New York and
> > Country is THe Netherlands, it's not correct, though the db doesn't
> > object.
> >
> For DB's to be correct, more is needed than referential integrity. For
> instance, in financial systems the data needs to balance. Transactions
> are supposed to balance, which in theory would keep the system
> "balanced", but production unit tests (doesn't everyone run those?) can
> prove on a daily (or more frequent) basis that the DB's state is both
> referentially correct, business rule correct, and in other respects,
> correct.

I was talking with some tech buddies of mine about this once, and we
concluded that double-entry book-keeping was archaic. One does not need
to store the same info in two different places with modern DB's. If you
want to avoid losing a record, then use some kind of incrementing
sequence number.
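
A small sketch of the sequence-number idea, assuming each record carries an integer that increases by one per insert (the function name is hypothetical). It can reveal that a record is missing, though not whether the surviving records are right.

```python
def missing_sequence_numbers(seen):
    """Return the sequence numbers absent between the smallest and largest seen."""
    if not seen:
        return []
    present = set(seen)
    return [n for n in range(min(present), max(present) + 1) if n not in present]
```

For example, records numbered 1, 2, 4 and 7 would report 3, 5 and 6 as lost. A record dropped from the very end of the sequence would go unnoticed unless the expected maximum is tracked separately.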

Of course the usual nightly backup process should be in place, and
perhaps a DB mirror system if you have the budget.

-T-

JXStern

Dec 26, 2006, 6:22:49 PM12/26/06
to
On 24 Dec 2006 20:27:12 -0800, "topmind" <top...@technologist.com>
wrote:

>> Explicit b-trees and b-trees crosslinked into "networks" can be very
>> efficient and straightforward, so why did they lose so completely?
>
>Efficient, maybe. Straitforward? no way.

OK, but the efficient can be a big deal.

>Relational offers more
>consistency. There are less different ways to model the same business
>in relational. It sounds like a bad thing, but relational killed the
>"creativity". It is similar to how structured blocks killed GOTO
>creativity. IBM's IMS is dead for a reason.

VSAM is more what I had in mind.

>> The textbook answer is that they are not reusable, and a normalized
>> relational database, is. But I think the answer is elsewhere.
>> Because of its independence, the relational database requires some
>> kind of language interface, pretty much universally some form of SQL
>> these days, which further hides the implementation from the
>> application (I agree with topmind on this).
>
>Navigational query languages were proposed. They were ugly because
>navigational is ugly.

SQL isn't very pretty. I should try using MULTIPLY/DIVIDE and other
terminology that is more precise - inner/natural join can be ambiguous
as to why you're doing it, and optimizers can assume incorrectly.

The wordiness is not a problem for me, I tend to comment extensively
anyway.

The role of compilation and optimization in relational databases is
VERY underemphasized. Maybe it's "only" about performance, but 100x
or more is enough to pay attention to.

J.


topmind

Dec 26, 2006, 7:21:33 PM12/26/06
to

JXStern wrote:
> On 24 Dec 2006 20:27:12 -0800, "topmind" <top...@technologist.com>
> wrote:
>
> >> Explicit b-trees and b-trees crosslinked into "networks" can be very
> >> efficient and straightforward, so why did they lose so completely?
> >
> >Efficient, maybe. Straitforward? no way.
>
> OK, but the efficient can be a big deal.

I doubt a general-purpose navigational DBMS will be faster than a
general-purpose relational DBMS. Application-specific navigational DBMS
do exist and they are indeed fast. (Phone co's use them IIANM).
However, much is hard-wired in order to achieve such speed. A
hard-wired RDBMS could be created also.

>
> >Relational offers more
> >consistency. There are less different ways to model the same business
> >in relational. It sounds like a bad thing, but relational killed the
> >"creativity". It is similar to how structured blocks killed GOTO
> >creativity. IBM's IMS is dead for a reason.
>
> VSAM is more what I had in mind.

That is an implementation detail, isn't it? If one is going to create a
navigational DB standard, it should not assume a specific
implementation I would think.

>
> >> The textbook answer is that they are not reusable, and a normalized
> >> relational database, is. But I think the answer is elsewhere.
> >> Because of its independence, the relational database requires some
> >> kind of language interface, pretty much universally some form of SQL
> >> these days, which further hides the implementation from the
> >> application (I agree with topmind on this).
> >
> >Navigational query languages were proposed. They were ugly because
> >navigational is ugly.
>
> SQL isn't very pretty. I should try using MULTIPLY/DIVIDE and other
> terminology that is more precise - inner/natural join can be ambiguous
> as to why you're doing it, and optimizers can assume incorrectly.

Well, I am all for creating a new relational language and/or RDB
standard to replace or compete with SQL. I even proposed my own called
SMEQL. However, even with its warts, SQL beats what is currently out
there.

>
> The wordiness is not a problem for me, I tend to comment extensively
> anyway.
>
> The role of compilation and optimization in relational databases is
> VERY underemphasized. Maybe it's "only" about performance, but 100x
> or more is enough to pay attention to.

Please explain. Are you suggesting that if RDBMS were more "pure", then
automatic optimization would be more effective? Perhaps. But again, you
are talking about a brand of apples and not apples in general.

I'll be behind you if you wish to lobby for new relational standards.
At least relational has (semi) standards. Navigational has none in
usage that I know of (other than file systems, and perhaps to some
extent XML-DBs).

>
> J.

-T-

Thomas Gagne

Dec 26, 2006, 9:10:43 PM12/26/06
to
Are you and your buddies appropriate authorities to dismiss a Generally
Accepted Accounting Practice in use since the 12th century?

topmind

Dec 27, 2006, 1:59:47 AM12/27/06
to


Is it required to be implemented the same way as done on paper? Or is
one merely required to present it in double-entry form (which is just a
presentation issue)? It seems foolish to have laws that force
denormalized data. There are better ways to get integrity than to
mirror paper. In other words:

if forced by law then
law not rational
else
there are better ways to ensure data integrity
end if

>
> --
> Visit <http://blogs.instreamfinancial.com/anything.php>
> to read my rants on technology and the finance industry.

-T-

Dmitry A. Kazakov

Dec 27, 2006, 3:34:42 AM12/27/06
to
On 26 Dec 2006 09:08:02 -0800, topmind wrote:

> If I say "There is no
> evidence that unicorns exist", you cannot ask for an example.

No, we can immediately discard this statement as illegal. "There is no"
must be applied to an observable set. You can say "In Britannica there is no
evidence that unicorns exist." Then we could go and verify that.
Otherwise, it is always your burden to prove a universally quantified
statement. For example by showing that existing unicorns would necessarily
post in comp.object within each hour.

Frans Bouma

Dec 27, 2006, 4:27:49 AM12/27/06
to
Thomas Gagne wrote:
> Frans Bouma wrote:
> > Thomas Gagne wrote:
> >
> > <snip>
> >
> > Though what is 'correctness' in the DB? Isn't that close to
> > semantical interpretation of the data in such a way that you need
> > to transform the data into information first to be sure it's
> > correct?
> >
> > I mean, sure, there are referential integrity rules, but if I
> > store a row in the customer table with an address where city is New
> > York and Country is THe Netherlands, it's not correct, though the
> > db doesn't object.
> >
> For DB's to be correct, more is needed than referential integrity.
> For instance, in financial systems the data needs to balance.
> Transactions are supposed to balance, which in theory would keep the
> system "balanced", but production unit tests (doesn't everyone run
> those?) can prove on a daily (or more frequent) basis that the DB's
> state is both referentially correct, business rule correct, and in
> other respects, correct.

that's semantic interpretation of data, i.e. the transformation of
data into information. Though, why do you need an RDBMS for this?

About unit tests: they test what you wrote the test for, though they
don't give you an absolute answer if your code is correct. They only
prove whether the thing the test was written to prove is true.

> >>> <snip>
> >>>
> >>> I also think you lose something valuable with Stored Procs.
> >>> Excellent efforts have been expended to consider the basic
> >>> principles of RDBMS design in objects, and to create objects that
> >>> will effectively assist with Object Relational Mapping as a first
> >>> step to addressing the Impedence mismatch.
> >>>
> >> You're right. Some great research has been spent here--as there
> >> was in alchemy.
> >>
> >> Consider my situation, I have a single database which I know is
> >> correct because it's guarded by procedures, constraints, unit and
> >> integrity tests.
> >>
> > <snip>
> >
> > So what you're saying doesn't imply solely on a db with procs, it
> > also implies on db's with an external api.
> >
> Even stripped of all APIs, the database should be provably correct.

I still have a problem with what you mean with 'Correct'. Let's say
you mean by it the relational integrity correctness but also the
semantical correctness of the data, as you tried to explain above.
Isn't it so that in that situation I can write another program, also
using the same database which consumes a subset of your tables and
finds incorrect data? Simply because its semantic interpretation of the
data is different? (a dutch zip code format is different from a US zip
code format for example ;))

> APIs provide a rampart against corruption (data errors), as does
> referential integrity (structural errors). Whatever bugs may express
> themselves in the code it's more important to identify and isolate
> them in the database. Bad data in the database can infect multiple
> applications and be the cause for bad decisions--both manual and
> automated. This is one of the reasons we focus so strongly on DB
> integrity.

all great, but that doesn't imply you THUS should use a set of stored
procedures to create that. As part of your definition of a correct
database/dataset is based on semantical interpretation of the data, you
can also do that outside the DB, in whatever application you're writing.
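
A sketch of what such an application-side check could look like, using the zip code example from above. The patterns are simplified assumptions about Dutch and US postal code formats, and the function name is hypothetical. The point it illustrates is that the same stored string can be valid under one interpretation and invalid under another.

```python
import re

# Simplified format rules, assumed for illustration only.
POSTAL_PATTERNS = {
    "NL": re.compile(r"[1-9][0-9]{3} ?[A-Z]{2}"),   # e.g. "1012 AB"
    "US": re.compile(r"[0-9]{5}(-[0-9]{4})?"),      # e.g. "90210" or "12345-6789"
}


def is_valid_postal_code(country, code):
    """Check a postal code against the format rules of the given country."""
    pattern = POSTAL_PATTERNS.get(country)
    if pattern is None:
        raise ValueError(f"no postal code rule for country {country!r}")
    return pattern.fullmatch(code) is not None
```

Here "1012 AB" passes as a Dutch code and fails as a US one, although the database would store it the same way in both cases.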

> >> I have multiple applications--some of them share
> >> common data models but others of them do not. Which model is
> >> correct? Consider you have 20 programs each doing specific
> >> things. Between the 20 you've discovered there's three different
> >> object models that best reflect their dependent applications needs
> >> and designs. Which of the three should be mapped to the DB?
> >> Should the DB's model be massaged to reflect any of them, or
> >> should it be designed to be perfect for the business data?
> >>
> >
> > In my opinion, there's just 1 model possible, and it comes down to
> > a NIAM/ORM (object role modelling) model. With that model, you can
> > speak about an entity and how it's used in the persistent storage
> > and also in your application.
> >
> I don't agree there's a chicken-and-egg problem.

I must be missing something, but I didn't speak of any chicken-egg
problem ? :)

> Chicken and eggs
> dilemmas are so because both the chicken and egg are required
> entities. Object-relational design questions aren't characterized
> that way because objects and object models are optional. They exist
> only because system designers selected object oriented languages to
> program with. The relational model exists (and persists) with or
> without objects or object oriented languages.

they're all technical solutions to problems which arise when an
application has to be built. They're not the initiation of the problem,
they're a solution after the problem has been recognized. This means
that no-one starts with an O/R mapper or RDBMS and then looks around to
build an application. An application has to be built and then tools are
sought to make that building easier and/or result in more
maintainable software etc.

> Along the same theme, the article also errs in presenting a
> bifurcation. Even if we assume both object and relational models
> exist in symbiosis we do not have to map one model onto the other.
> We can instead implement to the interface, which is what transactions
> are all about.

that won't work in practice as you can't consume a set in an OOPL
unless you transform that set to a consumable object. You too have to
find a way to translate between imperative executed statements and set
oriented statements.

The mapping is necessary from a technical point of view. Semantically
you're dealing with the same thing: an entity, just in different
representation forms.
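
To make the mapping step concrete, here is a minimal sketch in Python with sqlite3. The customer table, the Customer class, and the helper names are hypothetical. The query returns a set of tuples, and the list comprehension is the translation into objects the program can consume.

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class Customer:
    """Object-side representation of one row of the customer relation."""
    customer_id: int
    name: str
    city: str


def load_customers(conn, city):
    """Run a set-oriented query and map each resulting row to an object."""
    rows = conn.execute(
        "SELECT customer_id, name, city FROM customer "
        "WHERE city = ? ORDER BY customer_id",
        (city,),
    )
    return [Customer(*row) for row in rows]


def sample_db():
    """Build a small in-memory database for trying out load_customers."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT, city TEXT)"
    )
    conn.executemany(
        "INSERT INTO customer VALUES (?, ?, ?)",
        [(1, "Ann", "Utrecht"), (2, "Bob", "Delft"), (3, "Cas", "Utrecht")],
    )
    return conn
```

Even this hand-written version is a mapping, just a very thin one: the column order in the SELECT has to agree with the field order of the class.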

You IMHO also make the mistake to confuse SQL with a relational model.

> In fact, I could probably prove transactions are how it should be
> done using the same proof for an intersection table's requirement to
> represent many-to-many relationships--but that proof is outside the
> scope of this reply.

Well, you just need 2 m:1 relations to have a m:n relation, so I can
have 1 single table and define a m:n relation, that's not that hard.
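
One reading of this, sketched with SQLite through Python's sqlite3 module: a single enrollment table carries two many-to-one references, and together they express the many-to-many relation between students and courses. All table and function names are hypothetical.

```python
import sqlite3


def build_enrollment_db():
    """Create two entity tables and one table holding two m:1 references."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE course (course_id INTEGER PRIMARY KEY, title TEXT NOT NULL);
        CREATE TABLE enrollment (
            student_id INTEGER NOT NULL REFERENCES student(student_id),
            course_id  INTEGER NOT NULL REFERENCES course(course_id),
            PRIMARY KEY (student_id, course_id)
        );
        INSERT INTO student VALUES (1, 'Ann'), (2, 'Bob');
        INSERT INTO course VALUES (10, 'Databases'), (20, 'Objects');
        INSERT INTO enrollment VALUES (1, 10), (1, 20), (2, 10);
    """)
    return conn


def courses_of(conn, student_id):
    """Follow the relation from one student to many courses."""
    rows = conn.execute(
        "SELECT title FROM course JOIN enrollment USING (course_id) "
        "WHERE student_id = ? ORDER BY title",
        (student_id,),
    )
    return [title for (title,) in rows]


def students_in(conn, course_id):
    """Follow the same relation from one course to many students."""
    rows = conn.execute(
        "SELECT name FROM student JOIN enrollment USING (student_id) "
        "WHERE course_id = ? ORDER BY name",
        (course_id,),
    )
    return [name for (name,) in rows]
```

The same table answers the question in both directions, which is what makes the relation many-to-many rather than two unrelated lookups.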

I also fail to see what transactions have to do with the concept of an
entity. A transaction is related to executing statements (also outside
the DB if I might add), an entity is a static concept, it's not
executing something, it's a definition of what a set of attributes mean
semantically.

> It is. however, a good subject for a subsequent
> article. It already proves the need for middleware, why not prove
> transactions are the most efficient and correct approach to joining
> applications to databases?

Because I don't think they are.

Frans Bouma

Dec 27, 2006, 4:37:14 AM12/27/06
to
A clarification

Frans Bouma wrote:

What I meant by this was that a 'transaction' is used in the process
of consuming an RDBMS in a client program but it doesn't cover the
whole conceptual theory behind what the client code actually
represents. As a transaction can also be used in code itself, without
ever going to a DB, I don't see it as a way to describe what you're
saying, also because my article was about a completely different thing,
namely concepts, not code.

Dmitry A. Kazakov

Dec 27, 2006, 6:10:45 AM12/27/06
to
On Tue, 26 Dec 2006 13:57:06 -0500, Thomas Gagne wrote:

> Frans Bouma wrote:
>> Dmitry A. Kazakov wrote:
>>
>>
>>> <snip>
>>> You address here one issue, that is, the RA rules applied to wrong
>>> data may produce wrong outcome. It is important, but it is not the
>>> problem. Which is that having a correct (consistent) set of rules
>>> (like RA) and some valid input (data set), one can still produce
>>> semantically wrong outcomes. This is universally true. Arithmetic is
>>> certainly consistent, but 1 apple + 1 orange is not 2 Ampere.
>>> Arithmetic does not define the meaning of 1 and +. Equivalently, the
>>> operations of RA should have a meaning in the application domain.
>>> This meaning lies outside RA, and nothing in RA can warranty anything
>>> about it. Same is true for any programming language. So the custom
>>> chat about "DB correctness" is either trivial or rubbish.
>>
>> Thanks Dmitry for correctly wording this. I tried to explain it, but
>> you said it way better. :)
>>

> I must be the only one that didn't follow it.
>
> Where did the wrong data come from? Why is it solely relational
> algebra's problem to detect it?

No, nobody claims that. The point is that there exist many consistent
systems, therefore one cannot point to the consistency of RA as a genuine
advantage. That makes the claim trivial:

- RA is consistent!
- Wow! Let's have a beer.

It would be rubbish to argue that RA solves the problem, but we have
already agreed that it does not, which is by no means RA's fault. So we are
left with triviality.

Thomas Gagne

unread,
Dec 27, 2006, 8:09:05 AM12/27/06
to
Frans Bouma wrote:
> Thomas Gagne wrote:
>
> <snip>
>
> I still have a problem with what you mean with 'Correct'. Let's say
> you mean by it the relational integrity correctness but also the
> semantical correctness of the data, as you tried to explain above.
> Isn't it so that in that situation I can write another program, also
> using the same database which consumes a subset of your tables and
> finds incorrect data?
In that example, where is the bug: the program or the database? If the
DB stored the zip code precisely as the human intended and it wasn't
lost, overwritten, or corrupted, then the DB is correct. If a user
entered an incorrect zip code the database can still be correct, though
its data may not.

But to the point, if a program was able to store an improperly formatted
zip code inside the DB, then whose fault is that? Something someplace
should make sure zip codes are properly formatted for all programs that
may update the database and, if both Dutch and US-formatted zip codes
are allowed, that both are properly formatted before they are added to
the DB.

One place that edit can happen is in the DB's interface. A stored
procedure can be written that both validates the zip code, and records
its old value, who changed it, when, and all that other good stuff.
Perhaps later if other zipcode formats are supported and new tables
created to represent them, the stored procedure can be modified without
the change domino-ing into applications.
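
A minimal sketch of that idea, not the system described in this thread. It uses Python with SQLite, which has no stored procedures, so the function set_zip stands in for one. The table names, column names, and the two format patterns are assumptions made up for illustration.

```python
import re
import sqlite3
from datetime import datetime, timezone

# Assumed formats: US ZIP (5 digits, optional +4) and Dutch postcode
# (4 digits, optional space, 2 capital letters).
ZIP_PATTERNS = {
    "US": re.compile(r"^\d{5}(-\d{4})?$"),
    "NL": re.compile(r"^[1-9]\d{3} ?[A-Z]{2}$"),
}


def open_db():
    """Create an in-memory database with hypothetical tables."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE customer (id INTEGER PRIMARY KEY, country TEXT, zip TEXT);
        CREATE TABLE zip_history (customer_id INTEGER, old_zip TEXT,
                                  new_zip TEXT, changed_by TEXT,
                                  changed_at TEXT);
    """)
    return db


def set_zip(db, customer_id, country, new_zip, changed_by):
    """Stand-in for a stored procedure: validate, update, and audit."""
    pattern = ZIP_PATTERNS.get(country)
    if pattern is None or not pattern.match(new_zip):
        raise ValueError(f"bad zip code {new_zip!r} for country {country!r}")
    with db:  # one transaction: commit on success, roll back on error
        row = db.execute("SELECT zip FROM customer WHERE id = ?",
                         (customer_id,)).fetchone()
        if row is None:
            raise LookupError(f"no customer {customer_id}")
        db.execute("UPDATE customer SET country = ?, zip = ? WHERE id = ?",
                   (country, new_zip, customer_id))
        # Record the old value, who changed it, and when.
        db.execute("INSERT INTO zip_history VALUES (?, ?, ?, ?, ?)",
                   (customer_id, row[0], new_zip, changed_by,
                    datetime.now(timezone.utc).isoformat()))
```

Supporting another format later would mean adding an entry to ZIP_PATTERNS (or changing the tables behind set_zip); callers keep making the same call.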


> Simply because its semantic interpretation of the
> data is different? (a dutch zip code format is different from a US zip
> code format for example ;))
>
>
>> APIs provide a rampart against corruption (data errors), as does
>> referential integrity (structural errors). Whatever bugs may express
>> themselves in the code it's more important to identify and isolate
>> them in the database. Bad data in the database can infect multiple
>> applications and be the cause for bad decisions--both manual and
>> automated. This is one of the reasons we focus so strongly on DB
>> integrity.
>>
>
> all great, but that doesn't imply you THUS should use a set of stored
> procedures to create that. As part of your definition of a correct
> database/dataset is based on semantical interpretation of the data, you
> can also do that outside the DB, in whatever application you're writing.
>

If there's more than one executable, which is responsible for semantic
correctness?

I once helped create a transaction processor for a credit union
product. Every online and batch application in the system funneled
through the transaction processor. Both it and the system's stored
procedures were created to maintain system integrity and process
transactions as fast as possible.

I wrote briefly about it a long time ago:
<http://gagne.homedns.org/~tgagne/articles/newdef.html#casestudy1>.


>
>>>> I have multiple applications--some of them share
>>>> common data models but others of them do not. Which model is
>>>> correct?
>>>> Consider you have 20 programs each doing specific things. Between
>>>> the 20 you've discovered there's three different object models that
>>>> best reflect their dependent applications needs and designs. Which
>>>> of the three should be mapped to the DB? Should the DB's model be
>>>> massaged to reflect any of them, or should it be designed to be
>>>> perfect for the business data?
>>>>
>>>>
>>>>
>>> In my opinion, there's just 1 model possible, and it comes down to
>>> a NIAM/ORM (object role modelling) model. With that model, you can
>>> speak about an entity and how it's used in the persistent storage
>>> and also in your application.
>>>
>>>
>> I don't agree there's a chicken-and-egg problem.
>>
>
> I must be missing something, but I didn't speak of any chicken-egg
> problem ? :)
>

I thought the web page you referenced did:

<http://weblogs.asp.net/fbouma/archive/2006/08/23/Essay_3A00_-The-Database-Model-is-the-Domain-Model.aspx>

> <snip>


>> Along the same theme, the article also errs in presenting a
>> bifurcation. Even if we assume both object and relational models
>> exist in symbiosis we do not have to map one model onto the other.
>> We can instead implement to the interface, which is what transactions
>> are all about.
>>
>
> that won't work in practise as you can't consume a set in an OOPL
> unless you transform that set to a consumable object. You too have to
> find a way to translate between imperative executed statements and set
> oriented statements.
>

What is returned is a projection, and that projection has tuples. If
RAM allows you may treat them as a collection or a read-only stream.
The application doesn't need to know what a set-oriented statement is if
it uses stored procedures.
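
A small sketch of the two ways of consuming a projection, using Python and SQLite. The account table and the funded_accounts function are hypothetical; the function stands in for a stored procedure so that the caller never sees a set-oriented statement.

```python
import sqlite3


def open_db():
    """Create an in-memory database with a hypothetical account table."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE account (id INTEGER PRIMARY KEY, owner TEXT,
                              balance INTEGER);
        INSERT INTO account VALUES (1, 'alice', 120);
        INSERT INTO account VALUES (2, 'bob', 0);
        INSERT INTO account VALUES (3, 'carol', 75);
    """)
    return db


def funded_accounts(db):
    """Interface call: returns a cursor over (owner, balance) tuples."""
    return db.execute(
        "SELECT owner, balance FROM account WHERE balance > 0 ORDER BY id")


db = open_db()

# If RAM allows: materialize the projection as a collection of tuples.
rows = funded_accounts(db).fetchall()

# Otherwise: walk it as a forward-only stream, one tuple at a time.
total = 0
for owner, balance in funded_accounts(db):
    total += balance
```

Either way the application handles plain tuples; whether they arrive as a list or a stream is the caller's choice, not the database's.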


> The mapping is necessary from a technical point of view. Semantically
> you're dealing with the same thing: an entity, just in different
> representation forms.
>
> You IMHO also make the mistake to confuse SQL with a relational model.
>

You'll have to elaborate on that--I don't know another way to converse
with an RDBMS without SQL. But one of the RDBMS's advantages over other
DBMSs is the provision of stored procedures. Other DBMSs can still be
given an interface, but it'll have to be home-rolled.

Thomas Gagne

unread,
Dec 27, 2006, 8:11:18 AM12/27/06
to
topmind wrote:
> Is it required to be implemented the same way as done on paper? Or is
> one merely required to present it in double-entry form (which is just a
> presentation issue)? It seems foolish to have laws that force
> denormalized data.
How does the law require denormalized data?

Thomas Gagne

unread,
Dec 27, 2006, 8:19:51 AM12/27/06
to
RA is great, but it proves only structural (relational) correctness. If
there's an algebra for network databases then it could prove structural
(network) correctness as well. I think both are good things.

In addition to structural correctness we can test for semantic
correctness. In banking systems one way to do that is to compare all
the debit and credit accounts to make sure they net $0.
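
As a rough illustration of that check in Python, assuming ledger entries arrive as (account, side, amount) tuples. That layout and the function name are made up for the example.

```python
from decimal import Decimal


def ledger_is_balanced(entries):
    """Return True when all debits and credits net to exactly zero.

    entries: iterable of (account, side, amount), where side is
    'debit' or 'credit' and amount is a non-negative Decimal.
    """
    net = Decimal("0")
    for account, side, amount in entries:
        if side == "debit":
            net += amount
        elif side == "credit":
            net -= amount
        else:
            raise ValueError(f"unknown side {side!r} for {account!r}")
    return net == 0
```

Decimal is used instead of float so that cents add up exactly.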

We can also walk through transaction history to make sure the final
answer is what's represented in derived locations, like the account
tables. Data change (DC) history is the same--make sure the value
represented in the table is the last value in DC history.
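
Both checks can be sketched in a few lines of Python. The data shapes here are assumptions: balances and current values come in as dicts, history as ordered tuples, and every account is assumed to start at zero.

```python
from decimal import Decimal


def find_balance_mismatches(balances, history):
    """Replay transaction history and compare with stored balances.

    balances: dict of account -> stored balance (the derived location).
    history:  iterable of (account, signed amount), oldest first.
    Returns a dict of account -> (stored, replayed) for every mismatch.
    """
    replayed = {}
    for account, amount in history:
        replayed[account] = replayed.get(account, Decimal("0")) + amount
    mismatches = {}
    for account in set(balances) | set(replayed):
        stored = balances.get(account, Decimal("0"))
        derived = replayed.get(account, Decimal("0"))
        if stored != derived:
            mismatches[account] = (stored, derived)
    return mismatches


def find_stale_values(current, change_log):
    """Check that each table value equals the last value in DC history.

    current:    dict of key -> value now in the table.
    change_log: iterable of (key, new_value), oldest first.
    Returns a dict of key -> (table value, last logged value).
    """
    last = {}
    for key, new_value in change_log:
        last[key] = new_value
    return {key: (current.get(key), value)
            for key, value in last.items()
            if current.get(key) != value}
```

An empty result from either function means the derived data agrees with its history.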

Depending on your hypostasis (DB+design) other tests are likely available.

Patrick May

unread,
Dec 27, 2006, 9:28:17 AM12/27/06
to
"topmind" <top...@technologist.com> writes:
> Okay, I declare "tangled pasta" a subjective opinion.

After only a half-dozen or so request/response pairs. Record
time.

> Mr. May likes nitty side-tracks to distract from the real issue.

Yes, I am rather fond of "nitty side-tracks" such as providing
evidence for one's claims. I can see how that would distract from
your "real issue" of spewing errant nonsense. Darn me. Darn me to
heck.

JXStern

unread,
Dec 27, 2006, 9:31:35 AM12/27/06
to
On 26 Dec 2006 16:21:33 -0800, "topmind" <top...@technologist.com>
wrote:

>I'll be behind you if you wish to lobby for new relational standards.
>At least relational has (semi) standards. Navigational has none in
>usage that I know of (other than file systems, and perhaps to some
>extent XML-DBs).

I don't have anything specific in mind, except some additional
definition of what the problems are that might be solved by a new
language/standard.

For one thing, I think the implementation underneath the common SQL
databases could be improved by caching more plans per statement. There
are too many stories of sites where performance goes to h*ll when an
inappropriate cached plan is used for variant values. I don't think
there's any language design issue there that could be resolved at
compile time. The whole need to reevaluate at runtime differentiates
SQL from familiar procedural languages.

J.
