Hierarchical Model and its Relevance in the Relational Model

Derek Asirvadem

Jan 28, 2015, 4:53:01 AM
to
James K Lowden
Jan Hidders

I have mentioned the two of you, and this issue, in a recent post. Far be it from me to snipe from the edges. I expect to deal with this matter squarely, so here is a new thread. Pasted here for your convenience:

----
> On Wednesday, 28 January 2015 12:50:48 UTC+11, in the "Topological Relational Algebra" thread, Derek Asirvadem wrote:
>
> There are Three Obstacles and One Consideration.
>
> The first is, mathematicians are scared of The Hierarchy. And they are in a state of denial re the evidenced fact, that the Relational Model is a progression of, not a substitute for, the Hierarchical Model. Note the comments of J Hidders and J K Lowden in recent threads. Of the two Normal Forms defined in the RM (which btw remain undefined by mathematicians after 44 years), the first is, if we were to name it without psychological impediments, the
> ____ Hierarchical Normal Form____
> - It destroys many of the problems that mathematicians, even today, are grappling with.
> - It deems many of the proposals of Date, Fagin, and Darwen *non-relational*, which is why they suppress the Hierarchical issue, in order that people won't identify their proposals as massive breaches of the RM.

...

> [Implementation of Data Hierarchies is] simple for me, difficult-to-impossible for those who are victims of the suppression of Hierarchies, as a result of Date's, Fagin's, Darwen's, Abiteboul-Hull-Vianu's "work", and as proliferated by virtually every mathematician, eg. as evidenced by J Hidders and J K Lowden.
----

There are four parts to this, as it is addressed to two people: a common context; one part each, which is specific to each of you; and one common part addressed to both.

==== 1 Context ====

> On Saturday, 3 January 2015 10:02:35 UTC+11, James K. Lowden wrote:
>
> "Nowadays it is accepted that ORM (Object-Role Modeling),
> Object Oriented (OO) and other post-OO methodologies, such as Agile
> Modeling, Agile Development (Erickson, Lyytinen, & Siau, 2005) are most
> appropriate not only for programming, but also for analyzing and
> designing information systems, including, of course, database design."
>
> Far from being "accepted" for the purpose, software development
> methodologies have exactly nothing to say about database design.

First let me say that I agree, and strongly. (I have issues with the exclusive wording, but that is irrelevant to this thread.)

> The
> author doesn't seem to know that that "OO databases" rest on the very
> same theoretical void that the hierarchical and network models did
> before them.

Let's get a couple of things out of the way, so that the scope of this thread is clear.
- While there are a number of mathematical papers that propose or reinforce various OO theories, which is of course what the OO boys use to justify their creatures, I whole-heartedly agree that if one were to remove the un-scientific papers from that body, one is left with a void.
- It is one of the main reasons that "analysing and designing" data, for either Relational or OO implementation, using OO/UML principles, is completely bankrupt. We are flooded, these days, with people using OO/UML to model what they intend to implement as a "relational database".

The Network Model, although very relevant, is not relevant to today's issues or to the Relational Model, so let's carve that off. So we are left with the theme of the thread: you have posted, and I paraphrase, that:

A. ____the Hierarchical Model rests on a theoretical void____

You did not state it, but the implication is, the HM is dead, gone the way of the NM. Therefore (again, I paraphrase):

B. ____the HM is dead, it has no relevance wrt the Relational Model___

-- Aside --

> Yet if we discard the fancy language and
> ... mathematical exactitude ...

I had a good laugh at that one.
c. you guys cannot even agree with each other (one man's proof is another man's void)
d. you guys cannot differentiate base relations from views (derived relations), you are forever making the gross error of "normalising" views, and failing to "normalise" base relations. "Normalise" is in quotes because you guys have your private "definitions", so I happily allow whatever you mean by the term, it is not intended to be an insult.
e. you guys insist that you know the RM, but are forever producing evidence that you do not
f. you will not allow yourselves to be pinned down on any single point, ie. you avoid exactitude.
Therefore "exactitude" is not a reasonable term to use in the existing, evidenced, context. But let's not get distracted.

-- End Aside --

> That's part of the beauty of the [Relational] model and why Codd developed it:
> relations are easy to understand and develop intuition for.

Agreed, completely.

F. The implication here, and in many other places, is that you know the Relational Model, and you know it well.

That is the scope, I hope I have identified it clearly.

==== 2 James K Lowden ====

If you are unhappy with my paraphrasing, please feel free to supply your own precise statements. Assuming you accept the paraphrasing as accurate:

1. I take issue with your proposal [A]. I charge that it is completely and totally false. But that is not as important as the implication [B], which is totally and completely false.

2. Further, I charge that belief in [A][B] substantially damages and hinders (i) the understanding of, and (ii) the application of, the RM.

3. There is a statement implied in [1][2], which I will make explicit. I declare, the Relational Model is not a replacement for, or a substitution for the Hierarchical Model; it is a progression of it. The RM was not concocted in isolation, in the far reaches of outer space. (Let's not mince words, of course, something that is a progression of its predecessor, something that is far superior, replaces the predecessor. However, the predecessor remains a fundamental part of the successor, and is visible in it.)

Given [F]; your valuable advice to seekers; your demonstrated concern for accuracy, which I support, I suggest you should respond, with:
- either a retraction of [A][B]
- or some real world evidence supporting [A][B], which thus far has been stated without evidence

==== 3 Jan Hidders ====

You have made several statements re the Hierarchical Model, in several posts, pretty much along the lines of JKL. I did search, and I did not find a post that states anything squarely (it is always an indirect reference [not suggesting that such was wrong, it may have been quite appropriate for the context] ). But in any case, I think I am reasonably correct in identifying that you sit in the same position as JKL on this matter. If you are unhappy with my paraphrasing, please feel free to supply your own precise statements. Assuming you accept the paraphrasing as accurate:

I believe you agree with [A][B]. Unquestionably, [F] applies to you. So I am taking it up with you as well, I make the same charges, and the same declarations [1][2][3].

If you understand me, and agree that I have picked up your position re the HM:RM accurately, and your comments about [F] accurately, then simply proceed and skip this next section.

-- 3.1 JH Specifics --

Iff you want something specific, as a starting point, try this post:
https://groups.google.com/d/msg/comp.databases.theory/r3BHhq1EFbs/iE5AIvLD6BYJ

The context of the thread, and your post itself, are not the issue, it is the statements/references/implied-statements that you made re the HM, and its value in the RM context, that I wish to take up:

4.
> On Sunday, 21 April 2013 23:35:37 UTC+10, Jan Hidders wrote:
> >
> > [RM and ERM are the context]
> >
> > http://en.wikipedia.org/wiki/Structured-Entity-Relationship-Model
>
> Right. SERM. Part of Aris, a SAP thing. The hierarchical database
> model, once again trying to make a come back in a new disguise. :-)

I declare, the HM (or the "hierarchical database model") never left. It has always been here. It therefore cannot make a comeback.

Please feel free to submit any evidence otherwise.

5.
> > [Rule: Circular references are not permitted]
>
> The German page is more explicit and does mention this rule.
>
> > Which appears to be a perfectly reasonable rule to me ...
> > Is it perfectly evident that this requirement must be enforced, since a
> > model with cyclic dependencies is plain "spaghetti", maybe even
> > violating some normal form?
>
> It has nothing to do with normalization in the classical relational
> theory sense of the word.

I declare, it has a direct and definitive relationship with Normalisation, as defined in the RM.

Please feel free to submit any evidence otherwise.

(Forget about "classical relational theory sense of the word" [AFAI am aware, "relational theory" does not deal with Normalisation, but hey, I am not a mathematician. But that is not the point I wish to take up.] I am dealing with the RM.)

-- End Specifics --

Clearly, from your various papers as well as posts, [F] applies to you, you are a prolific author in the OO/ORM/RM (ok, theory only) space, you are a teacher at university level, transferring knowledge to many young minds, therefore I expect you to take it up responsibly. Obviously, I support that, the transfer of accurate information. I suggest you should respond, with:
- either a retraction of [A][B]
- or some real world evidence supporting [A][B], which thus far has been stated without evidence.

==== 4 Further Context ====

In case it is not obvious, I am not interested in mathematical proofs. As one of my professors said, decades ago, a mathematician can prove that pigs can fly, it is up to you [as a scientist] to prove that they actually do. Therefore (a) in the first instance, let's avoid that skirmish, (b) but if you want it, I will engage.

In the event that there is a discussion, I too will be supplying real world evidence to support my statements, not mathematical proofs.

Mathematical proofs are only a fraction of the theory (the "t" in c.d.t). Theory is only a fraction of the science. I would like to limit the discussion to science. That means hard evidence, no "convincing", no "beliefs", which is the ambit of non-science; cultists; Jehovah's Witnesses, and which requires private "definitions"; use of private "bibles"; etc.

I am discussing the ***Relational Model***, and statements you have made about it, and its componentry, including the relational theory in it, as published. I am *not* discussing what mathematicians call "relational theory", which apparently has completely different descriptions (no definitions AFAIK), algebra, calculus, etc, which perhaps existed before 1970, and in any case, is not in agreement with, and bears little relation to, the RM. So please, let's not get side-tracked, dragged into RT or the RM::RT delta. The statements made were about the RM.
- if your statements were about RT, and were applied to the RM as a result of sloppiness, then it is a simple matter for you to correct the sloppiness, and state that they do not apply to the RM.

I appreciate that I am a practitioner, in the real universe, and that you are abstractionists, in the unreal universe. But you have made statements re the RM; the HM; which most definitely exist in the real universe, that cause problems for many people (eg. the referenced Topological Algebra post), which I wish to resolve once and for all.

-- Alternative --

I will also give you an alternative (to supplying evidence to support your statements, and the predictable results). After you retract them, redress
__the void in your profession re the Relational Model__
extant for forty four years, and write a formal paper, complete with mathematical proofs, that /formally/ defines the two Normal Forms that are /informally/ (but quite usefully, for practical purposes) defined in the Relational Model. As a "lay person", the obvious names for them are:
____Hierarchical Normal Form
____Relational Normal Form
If you take this up, I would be pleased to supply more context, and directions, in order that you avoid the infamy of frauds (such as Date, Fagin, Darwen, Abiteboul-Hull-Vianu), and you gain instead fame, in the physical universe. You might achieve the ACM Codd Award.

This is a public notice. I will endeavour to find your email addresses, and to email this to you as well.

I await your considered response.

Cheers
Derek Asirvadem

James K. Lowden

Jan 28, 2015, 9:09:08 PM
On Wed, 28 Jan 2015 01:52:59 -0800 (PST)
Derek Asirvadem <derek.a...@gmail.com> wrote:

Hello Derek,

Thank you. I accept the bait^W invitation. ;-)

> A. ____the Hierarchical Model rests on a theoretical void____
>
> You did not state it, but the implication is, the HM is dead, gone
> the way of the NM. Therefore (again, I paraphrase):
>
> B. ____the HM is dead, it has no relevance wrt the Relational Model___

I accept your paraphrase.

> > ... mathematical exactitude ...
>
> I had a good laugh at that one.

I suspect I wanted "seeming exactitude". We need a complement for
"exact" that expresses the same relationship as certainty/certitude.

> Therefore "exactitude" is not a reasonable term to use in the
> existing, evidenced, context. But let's not get distracted.

Fair enough.

> 1. I take issue with your proposal [A]. I charge that it is
> completely and totally false. But that is not as important as the
> implication [B], which is totally and completely false.

Regarding [A], it was Codd's observation. He had to invent the term
"hierarchical model" ex post facto to name the thing he was comparing
the relational model to. It's not even a "model", though, insofar as
it has no mathematical foundation.

He was never so inartful as to put it in so many words, but the
argument for the relational model might be boiled down to, "Look,
here's math, and there's a set of commonly accepted engineering
practices. Which is a better foundation?"

The RM has an algebra and a calculus, as you know. What is the
equivalent (or even analog) in the hierarchical model? There is no
"graph algebra", no set of operations closed over the domain. Graph
theory offers no sets, bears no connection to predicate logic.

What we have in graph theory is a taxonomy and a giant collection of
algorithms. I would say that puts it about where biology was before
germ theory: a museum of classification and observed behavior.

I wouldn't think that observation controversial on c.d.t.

Regarding [B], I suppose it depends on what we mean by "dead" and
"relevance".

> I declare, the Relational Model is not a replacement for, or a
> substitution for the Hierarchical Model; it is a progression of it.

Evidence, please. I can think of no way in which the hierarchical
model informs the relational model except as antithesis. Codd
contrasted the RM with "noninferential systems", which hardly
sounds like a source of inspiration.

I think you will recognize this, from the abstract:

"Existing noninferential, formatted data systems provide users
with tree-structured files or slightly more general network models of
the data. In Section 1, inadequacies of these models are discussed."

(I hereby propose that on c.t.d. we refer to this source as RMDLSBD.)

Is the hierarchical model relevant? Not to me. Dead? Not dead
enough.

It never dies because it never lived. It's the zombie of database
theory, preying on the unsuspecting and eating their brains. IBM still
sells IMS, presumably to someone, presumably at a profit. And there's a
cadre of Facebook-wannabes who think "graph databases" are the cat's
meow for finding out who's within 6 degrees of separation from Kevin
Bacon. It would seem that once you introduce the average programmer to
a hierarchical filesystem, you can never wean him of the notion that
that's the "natural" structure for data.

Returning to your points,

> A. ____the Hierarchical Model rests on a theoretical void____
> B. ____the HM is dead, it has no relevance wrt the Relational Model___

No theory informs the so-called hierarchical model. It is in every way
irrelevant as a theoretical database construct. That doesn't prevent
many naïve people from believing it has something to offer. Every lousy
idea has its ardent supporters.

In trying to fathom why you raised the subject (you being an advocate
of the relational model, I would say), I noticed:

> [Implementation of Data Hierarchies is] simple for me,
> difficult-to-impossible for those who are victims of the suppression
> of Hierarchies,

among whom I'm included. :-(

I think we could say that assertion is not even wrong.

As you well know, tables can represent graphs, ergo hierarchies. On a
few occasions I have implemented them. For example, a SQLite virtual
table representing a directory hierarchy
(http://www.schemamania.org/sql/sqlite/udf/). Now that SQLite supports
recursive queries, you can implement find(1) in SQL, if you like.
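
A minimal sketch of that last point, not taken from the original post: a directory tree stored as an adjacency-list table and walked with a recursive query, roughly what find(1) prints. The table name `node`, its columns, and the sample rows are assumptions made for illustration. It uses Python's standard sqlite3 module and needs SQLite 3.8.3 or later for WITH RECURSIVE.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database holding a small directory tree.

    The schema (node, parent_id, name) is hypothetical, chosen for the example.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE node ("
        " id INTEGER PRIMARY KEY,"
        " parent_id INTEGER REFERENCES node(id),"  # NULL marks the root
        " name TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO node (id, parent_id, name) VALUES (?, ?, ?)",
        [
            (1, None, "usr"),
            (2, 1, "bin"),
            (3, 1, "lib"),
            (4, 2, "find"),
            (5, 3, "libc.so"),
        ],
    )
    return conn


def find_paths(conn, root_id):
    """Return the path of root_id and of every descendant, sorted by path."""
    rows = conn.execute(
        """
        WITH RECURSIVE tree(id, path) AS (
            -- anchor: the starting directory
            SELECT id, name FROM node WHERE id = ?
            UNION ALL
            -- step: children of rows already found, with the path extended
            SELECT n.id, tree.path || '/' || n.name
            FROM node AS n JOIN tree ON n.parent_id = tree.id
        )
        SELECT path FROM tree ORDER BY path
        """,
        (root_id,),
    ).fetchall()
    return [path for (path,) in rows]
```

Calling `find_paths(build_demo_db(), 1)` lists `usr` and everything beneath it. The hierarchy lives in ordinary rows and the traversal is an ordinary query.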

So I would appreciate it if you would exclude me from the set of people
for whom the concept is "difficult-to-impossible". Please do count me
among those who usually find it needless, though.

Representing a hierarchy is one thing, and basing a DBMS on
data-as-hierarchy something else. Hierarchies exist. The hierarchical
model does not.

--jkl

Eric

Jan 29, 2015, 3:40:04 AM
On 2015-01-29, James K. Lowden <jklo...@speakeasy.net> wrote:
8>< ----
> (I hereby propose that on c.t.d. we refer to this source as RMDLSBD.)

RMDLSDB

Eric
--
ms fnd in a lbry

Jan Hidders

Jan 29, 2015, 6:39:37 AM
Hi Derek,

Let's see if I can clarify my position on the questions you raise. I'm a bit short on time, so my answers will probably not be as comprehensive as they should be and will only address some of the points you made. I have papers to write and exams to be graded. :-)

On Wednesday, 28 January 2015 10:53:01 UTC+1, Derek Asirvadem wrote:
>
> The Network Model, although very relevant, is not relevant to today's issues or to the Relational Model, so let's carve that off. So we are left with the theme of the thread: you have posted, and I paraphrase, that:
>
> A. ____the Hierarchical Model rests on a theoretical void____
>
> You did not state it, but the implication is, the HM is dead, gone the way of the NM. Therefore (again, I paraphrase):
>
> B. ____the HM is dead, it has no relevance wrt the Relational Model___

You ask whether I agree with these claims. The answer is that, no, they do not correctly represent my position.

Concerning A: This is sort of true in an uninteresting way. It is indeed true that one cannot point to a single paper that defines formally the data model and the associated languages. But there are plenty of papers on defining hierarchical data models and languages to query them, and this is in fact quite a well understood area.

Some people also take "theoretical foundation" to mean that there is a philosophical foundation for that type of knowledge representation, for example such as exists for first order logic, which is more or less inherited by the Relational Model. But there are also such theories for higher-order logics, so in that sense I don't think it is true. But I actually don't accept that this is an important observation. What matters is whether people in practice understand and can deal with hierarchical data. They can, and they do.

Concerning B: It becomes important here what precisely you mean by "the HM". If you include the traditional assumptions about how the data is stored and the pragmatics of how to effectively query and manage it, then, yes, that is pretty much dead. But most now understand the relevance of data independence. If you define the HM as only the idea that data can be nested, then as you know some promote the idea that the Relational Model should allow this and provide means to effectively deal with it. I agree to some extent with that idea, but not when we are talking about RM as the model for the conceptual level in the DBMS.

> -- 3.1 JH Specifics --
>
> Iff you want something specific, as a starting point, try this post:
> https://groups.google.com/d/msg/comp.databases.theory/r3BHhq1EFbs/iE5AIvLD6BYJ
>
> The context of the thread, and your post itself, are not the issue, it is the statements/references/implied-statements that you made re the HM, and its value in the RM context, that I wish to take up:
>
> 4.
> > On Sunday, 21 April 2013 23:35:37 UTC+10, Jan Hidders wrote:
> > >
> > > [RM and ERM are the context]
> > >
> > > http://en.wikipedia.org/wiki/Structured-Entity-Relationship-Model
> >
> > Right. SERM. Part of Aris, a SAP thing. The hierarchical database
> > model, once again trying to make a come back in a new disguise. :-)
>
> I declare, the HM (or the "hierarchical database model") never left. It has always been here. It therefore cannot make a comeback.

Depending on what you mean exactly by the HM I would agree with that. What I was talking about above is that probably (but I might be wrong) the reason for the introduction of the notion of hierarchy is not conceptual but a matter of implementation and efficiency. And that idea, linking the hierarchy to how the data is stored, at least in the communities I work in, is mostly dead. Except apparently at SAP. :-)

> 5.
> > > [Rule: Circular references are not permitted]
> >
> > The German page is more explicit and does mention this rule.
> >
> > > Which appears to be a perfectly reasonable rule to me ...
> > > Is it perfectly evident that this requirement must be enforced, since a
> > > model with cyclic dependencies is plain "spaghetti", maybe even
> > > violating some normal form?
> >
> > It has nothing to do with normalization in the classical relational
> > theory sense of the word.
>
> I declare, it has a direct and definitive relationship with Normalisation, as defined in the RM.

Really? Which of the classical normal forms (1NF, 2NF, 3NF, BCNF, 4NF, 5NF [in its different variants you find in textbooks]) forbids cyclic dependencies?
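
For concreteness, the rule quoted above (no circular references) can be checked mechanically on the graph of references between tables. A minimal sketch, assuming a hypothetical schema given as a mapping from each table to the tables it references; the function name and the sample table names are illustrative only, and the check says nothing about which normal form, if any, is involved.

```python
def find_reference_cycle(references):
    """Return one cycle of table names as a list, or None if there is none.

    `references` maps each table to the tables its foreign keys point at.
    A depth-first search marks tables as in progress (GREY) or finished
    (BLACK); meeting a GREY table again means the references loop back.
    """
    WHITE, GREY, BLACK = 0, 1, 2
    state = {}
    stack = []  # tables on the current search path, in visiting order

    def visit(table):
        state[table] = GREY
        stack.append(table)
        for target in references.get(table, ()):
            colour = state.get(target, WHITE)
            if colour == GREY:
                # Slice the path from the first visit of `target` to close the loop.
                return stack[stack.index(target):] + [target]
            if colour == WHITE:
                cycle = visit(target)
                if cycle:
                    return cycle
        stack.pop()
        state[table] = BLACK
        return None

    for table in references:
        if state.get(table, WHITE) == WHITE:
            cycle = visit(table)
            if cycle:
                return cycle
    return None
```

Under this definition a table that references itself (a hierarchy held in one table) also counts as a cycle; whether the rule is meant to forbid that case is not settled by the text quoted above.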

Kind regards,

-- Jan Hidders

Derek Asirvadem

Jan 30, 2015, 12:10:42 AM
Clarification

4.
The HM is a fundamental part of the RM. (To be clear, "fundamental" means "founded on".) Removing the HM from the RM would remove much of the integrity, power, and speed from the RM. Which mathematicians do, frequently. And then invent tiny fragments of integrity, to patch up a few of the gaping holes left by what they have lost. The patchwork never ends. I can determine, by virtue of the evidence, of the systems prescribed by these mathematicians, exactly what is missing; subtracted.

----

Addenda A (not Errata)

Of course, any theoretician who considers themselves in the category that I have described, and within the scope of the thread that I have given, should feel free to respond.

----

Addenda B (not Errata)

Let me tell you where I am coming from. Besides Ceylon. Besides Canada. Besides Australia.

> Nevertheless, I am willing to execute the thankless task

I love my profession.

It is easy for me to help others, either furthering their knowledge or implementing something. It is easy for me to correct mistakes in my profession and to protect it from damage. That is the spirit and intent with which I started this thread. I am hoping that you love your profession, at least half as much as I do.

That leads to ...

I love the Truth.

I hope I do not have to explain why, how, in any profession, and to humanity itself, falsity is damaging. A lot of unnecessary argument can be eliminated if we stick to the truth. There is no such thing as private definitions or private truths; they exist only for people who are severely isolated from society, people who do not have an authority. In the professions, we have authorities, standards, laws. If we observed them, there would be no conflict within each profession.

Which leaves us with conflict between professions, as we have here. This skirmish is about mathematicians publishing falsity about my profession, causing damage to it (as well as to themselves, refer my comments re Norbert's thread). This does not happen when I deal with the banking industry or the car manufacturing industry: both sides are well aware of the sciences involved. It only happens with mathematicians who declare themselves to be the theorists in my industry, the database implementation profession.

There is a gaping chasm between what the mathematicians do (published secretly amongst themselves, unknown to practitioners), ie. what they think practitioners should do, and what practitioners are actually doing. No other industry that I know of suffers this problem. Such a problem, an absence of theory, or an incapability to translate that theory to practical terms, would normally kill the industry. Fortunately the vendors have their own scientists, and the high end of practitioners are theoretically founded, so the industry proceeds without any input from mathematicians. The last recognisable contribution was 1970 to 1978. That is not to say that the mathematicians have been idle, no, a mountain of books have been published for the plebs, and they are all garbage, irrelevant to the actual application of the sciences, to successful implementations.

So that is the nugget of the conflict.

Any conflict is resolved by truth. Objective truth.

The only reason that there is conflict, and that it has remained for so long (forty five years that I am intimately aware of), is that the mathematicians in the database industry engage in various behaviours that are not science, and they do not have an authority. There are 42 "relational algebras"; 56 notations; papers contradict each other; nothing is resolved (among themselves, before the plebs see it as published).

Therefore, I requested that any discussion that may be had, be limited to the scientific realm. That means you cannot deny reality; other sciences; the laws of physics; etc. That means that you uphold the truth, that truth is beyond the world of abstract mathematics; that truth is higher than the god of mathematics. Otherwise the conflict will remain for another forty five years.

I am hoping that you love the truth, at least half as much as I do.

Cheers
Derek

Derek Asirvadem

Jan 30, 2015, 6:29:48 AM
> On Thursday, 29 January 2015 13:09:08 UTC+11, James K. Lowden wrote:
> On Wed, 28 Jan 2015 01:52:59 -0800 (PST)
> Derek Asirvadem <derek.a...@gmail.com> wrote:

I will probably run out of time on this post, so if it feels cut off, I will complete it another day.

> Thank you. I accept the bait^W invitation. ;-)

Come, come, James. A square and direct face isn't bait. I don't ask you to, and I cannot be asked to, watch my every turn of phrase.

I am trying to deal with what I consider a serious issue in our field. I would appreciate it if you treat it likewise. This is not the banter or baseless attacks that are commonplace here.

> > A. ____the Hierarchical Model rests on a theoretical void____

> > B. ____the HM is dead, it has no relevance wrt the Relational Model___

F. The implication here, and in many other places, is that you know the Relational Model, and you know it well.



> > 1. I take issue with your proposal [A]. I charge that it is
> > completely and totally false. But that is not as important as the
> > implication [B], which is totally and completely false.
>
> Regarding [A], it was Codd's observation. He had to invent the term
> "hierarchical model" ex post facto to name the thing he was comparing
> the relational model to.

That is not correct.

I was alive and kicking in those days, and I had moved from kicking inflated rubber balls to kicking IMS off its high horse. I was an Engineer (it meant something different in those days) for Cincom, with TOTAL as our Network DBMS. About a third of us were mathematicians (not me of course). All of us were scientists. We modelled before we wrote a line of code. Codd didn't invent the term. The Hierarchical Model was a fact, and we modelled (we did help IMS customers see the light, yes). Unlike these days, we did not have six million meaningless models with everyone choosing their own notation, we had five, and everyone who practised their craft knew each of them intimately.

In terms of beating the competition, we knew their gear inside-out. If we walked into a customer site and we could not deal with their models directly, we would have had no credibility. Nowadays cartoons pass off as models. We didn't have software to draw models on PCs, but we had excellent hand-drawn models, a set of symbols, notation, rules, stencils, etc. given to us by authorities in our field.

I remember fondly, when my models were ready to be published, I would visit a certain customer in Rochester, who had a secret drawing system (later turned out to be the Xerox Star system that preceded the Apple interface, which btw Windoze is a poor man's copy of), and in exchange for a few hours consulting I would use the XDS to draw my model and publish it with a level of precision that surpassed my otherwise hand-drawn models.

Even in the loose sense (as per your comments), even looking backwards to those days, the HM as a model, to compare against the NM or the RM, is an easily understood, and much used, concept.

> It's not even a "model", though, insofar as
> it has no mathematical foundation.

Hah!

So what.

There are millions of models in various fields of scientific endeavour that do not have a mathematical foundation. Eg. In those days we used, and many people still use, Gane and Sarson's SSADM (not the abortion it has become since others have been "taking care" of it), commonly known as Data Flow Diagrams. It has a scientific, theoretical foundation but not a mathematical one. UML doesn't have an equivalent. IDEF0 is the equivalent on the IDEF side.

We had both the HM and the NM as models, with a set notation, and they certainly had a scientific and theoretical basis. Every incremental version showed proof of it. We had mathematicians on board, but I never saw their proofs, even though they worked with us. It was enough for the guy who specialised in this or that aspect of the resource (affecting the codeline), that he said he had, or had not, researched something, and therefore he stands on, or doesn't, what he is saying. Credible people do the same these days. They don't have to be mathematicians.

Computers were orders of magnitude more expensive then, the codeline was precious in those days. We had a complete DBMS with OLTP in an 8-bit machine. By the '80s we had graduated to 16 bits, Britton-Lee and Tandem NonStop. We had moved from nipping at IBM's heels (they *were* the market) to crawling up their shins. Only complete idiots, who worked alone, coded without modelling first.

> He was never so inartful as to put it in so many words, but the
> argument for the relational model might be boiled down to, "Look,
> here's math, and there's a set of commonly accepted engineering
> practices. Which is a better foundation?"

I have no argument with that. As long as the previous historical facts are not denied.

> The RM has an algebra and a calculus, as you know. What is the
> equivalent (or even analog) in the hierarchical model?

Who cares. (Refer my previous comments.)

Analogue is graph theory of course.

The answer to your question does not prove anything, so I will drop it.

> There is no
> "graph algebra", no set of operations closed over the domain. Graph
> theory offers no sets, bears no connection to predicate logic.

So what? A scientific person can look at a graph, a tree, and instantly determine whether it has integrity or not. What theory is he practising? What algebra is he using?

I can look at a data model, and using the small knowledge of graph theory that I have, instantly identify precisely where the bottlenecks will be, where the lock contention will be. And it is always confirmed when the system goes into production. I have no algebra or calculus, and only a pittance of graph theory.

Ok, fair enough, a mathematician might have to sit down and produce a new algebra, and a calculus to work with it, and a year or three later, he might have a proof that the graph has integrity. The world passed the poor sod by.

> What we have in graph theory is a taxonomy and a giant collection of
> algorithms. I would say that puts it about where biology was before
> germ theory: a museum of classification and observed behavior.

Well, that does not compute with my non-mathematician's research on the subject (let's say I have invented two things, relevant to our field, and I did some research, of course not formal, in that field).

But I certainly do not know enough to argue with you on that point.

I simply disagree. Eg. we can compute whether a graph, the point sets of which exist in a Relational Database, has certain properties or not, fairly quickly, without needing a Cray or a mathematician. So be it a museum or whatever, it is far more accessible and use-able (by non-mathematicians) than the stuffed skins of non-existent animals that we have in our database museum. I have yet to see a paper in our field, that scientifically observes behaviour, all I have seen is papers that focus on skin and hair (microscopically of course) of museum pieces.
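To make the "compute whether a graph ... has certain properties" remark concrete, here is a minimal sketch, not taken from any poster's system. It assumes a hypothetical table edge(parent, child) held in SQLite, and checks one property only: whether the stored graph is a single rooted tree. The table name, column names and the function is_hierarchy are all illustrative assumptions.

```python
import sqlite3

def make_db(edges):
    """Build an in-memory database holding a hypothetical edge table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE edge ("
        " parent TEXT NOT NULL,"
        " child  TEXT NOT NULL,"
        " PRIMARY KEY (parent, child))"
    )
    conn.executemany("INSERT INTO edge VALUES (?, ?)", edges)
    return conn

def is_hierarchy(conn):
    """True if edge(parent, child) describes exactly one rooted tree."""
    cur = conn.cursor()
    nodes = "SELECT parent AS n FROM edge UNION SELECT child FROM edge"
    (node_count,) = cur.execute(f"SELECT COUNT(*) FROM ({nodes})").fetchone()
    # Roots are nodes that never appear as a child.
    roots = [r[0] for r in cur.execute(
        f"SELECT n FROM ({nodes}) WHERE n NOT IN (SELECT child FROM edge)")]
    # A tree allows at most one parent per node.
    (multi_parent,) = cur.execute(
        "SELECT COUNT(*) FROM"
        " (SELECT child FROM edge GROUP BY child HAVING COUNT(*) > 1)"
    ).fetchone()
    if len(roots) != 1 or multi_parent:
        return False
    # Every node must be reachable from the single root; UNION (not
    # UNION ALL) removes duplicates, so the recursion stops on cycles.
    (reached,) = cur.execute(
        """
        WITH RECURSIVE reach(n) AS (
            SELECT ?
            UNION
            SELECT e.child FROM edge e JOIN reach r ON e.parent = r.n
        )
        SELECT COUNT(*) FROM reach
        """,
        (roots[0],),
    ).fetchone()
    return reached == node_count
```

The whole check is three set-oriented queries plus one recursive one, which is the point being made: no special machinery is needed once the point sets are in tables.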

Please feel free to change that.

Take a look at Norbert's papers about his field (Architectural Spaces). AFAIC, they are excellent, they put the papers in the RDB field to shame. Well written, well structured, good explanations, and finally a mathematical proof. All we have over here is speculation, in isolation/denial of other sciences, followed by a short proof.

Unfortunately, some poor sod uses it to code a system somewhere.

----

Therefore I say, the notion that the Hierarchical Model rests on a theoretical void, is without merit, and in denial of historical facts. The HM and the NM had science and theory behind them, as well as all the articles required for modelling (just not PCs with drawing tools).

(It may well not have a mathematical proof, yes.)

----

> Regarding [B], I suppose it depends on what we mean by "dead" and
> "relevance".

Well, you said it (I paraphrased), so it is up to you to tell us what you meant, or to modulate or correct it in some way.

> > I declare, the Relational Model is not a replacement for, or a
> > substitution for the Hierarchical Model; it is a progression of it.
>
> Evidence, please. I can think of no way in which the hierarchical
> model informs the relational model except as antithesis. Codd
> contrasted the RM with "noninferential systems", which hardly
> sounds like a source of inspiration.

(Not avoiding this point, tomorrow, please.)

> I think you will recognize this, from the abstract:
>
> "Existing noninferential, formatted data systems provide users
> with tree-structured files or slightly more general network models of
> the data. In Section 1, inadequacies of these models are discussed."

Accepted. And both of us know that the abstract is not the paper. Details tomorrow.

> (I hereby propose that on c.t.d. we refer to this source as RMDLSBD.)

NO.

I am not saying that you are being dishonest. But that is exactly the kind of madness that you guys perform, that we have to swallow, and I am not swallowing it.

The database world (separate to the tiny fraction of mathematicians concerned with the RM, which is say 1%) knows the RM by the term RM. It consists of Codd's original paper, easily accessible, plus his 11 other papers, noting some are exploratory and retracted, some are commercial interest, etc. It is a body of work that is well known. It contains a relational algebra, and later a calculus. Vendors have spent thousands of manhours and implementers have spent millions of manhours, implementing that thing that is known as the RM. End of story.

It is therefore unacceptable that a mathematician, in the typical way of mathematicians in this poorly-served field, attempts to carve it up, or to call one of their 42 relational algebras THE relational algebra, or to call whatever thing they refer to as "the relational model", THE Relational Model. Although you have exposed it by another route, this is precisely the problem I am trying to deal with.

You made your comments about the RM, in a public forum about Relational Databases. I am confronting those comments. You did not say, the body of papers that the mathematicians call the "relational model". If you did not mean the RM when you made those comments, please just say so and retract them.

This is also relevant to [F].

> Is the hierarchical model relevant? Not to me. Dead? Not dead
> enough.
>
> It never dies because it never lived.

That is silly, plainly false. I don't think that you would dispute that it did live, at least until 1984, when the Relational Model took off. (I think we can dismiss the IMS and Network die-hards, who are most certainly alive today, as "not relevant, not worthy of discussion".)

> It's the zombie of database
> theory,

Hang on, you said it had no theory. You are contradicting yourself. I am not being silly here, not playing with words, think about this. How can something that is devoid of theory be a threat to something that has allegedly well-established theory ??? It is not possible. If anything, it (your comments) makes a statement about the vulnerability of what you call database theory.

It is quite different from where I sit. First I have to say, what you and I call "database theory" are two vastly different things, I think you know that. I reject 90% of the papers written on the subject. That is not as important as this: the 10% left standing, are invulnerable.

Any new theory is not going to do any damage to it. Iff it is proved (forget "well-received"), *and* it has a scientific basis (forget published as a textbook), then it ADDS to that body. Otherwise, it gets the same bin as the 90%.

Further, in that standing and invulnerable 10% of database theory, the Hierarchical <model|theory|whatever> has a place, it is already integrated.

(I did not say it is written up in theoretical papers or that it has a mathematical proof, something that would be exceedingly silly to demand, because it is an evidenced fact. That would be like saying, show me a mathematical proof that your legs work, before I will believe it. I would have to be in a state of denial that *my* legs work, get the men with the white coats.)

> preying on the unsuspecting and eating their brains. IBM still
> sells IMS, presumably to someone, presumably at a profit.

Cincom still sells TOTAL and MANTIS. I agree, that is an irrelevant market.

> And there's a
> cadre of Facebook-wannabes who think "graph databases" are the cat's
> meow for finding out who's within 6 degrees of separation from Kevin
> Bacon. It would seem that once you introduce the average programmer to
> a hierarchical filesystem, you can never wean him of the notion that
> that's the "natural" structure for data.

I agree, applying hierarchical <fill-in-substitute-word-for-model> to everything would be a serious mistake.

Ok. So then what, in your considered opinion, is the natural or "natural" structure for data, or is there none ? (I think we both agree, it is not network.)

(I am not selling hierarchical, I am just not in denial of it. I just don't have the fear and loathing that your emotional comments demonstrate.)

> Returning to your points,
>
> > A. ____the Hierarchical Model rests on a theoretical void____
> > B. ____the HM is dead, it has no relevance wrt the Relational Model___
>
> No theory informs the so-called hierarchical model.

Dismissed.

> It is in every way
> irrelevant as a theoretical database construct.

Considering my comments above, if not my declarations, definitely not.

The most I can accept from your statement is, It is in every way irrelevant as a theoretical database construct at the current state of play amongst the mathematicians in the field of database theory, which field is very poorly served. Not the scientists, not the theorists, just the mathematicians.

And the evidence is, yet again, they are demonstrably scared of it. For un-scientific reasons.

> It is in every way
> irrelevant as a theoretical database construct.

That you people think so, might be the crux of the problem in our industry.

> That doesn't prevent
> many naïve people from believing it has something to offer. Every lousy
> idea has its ardent supporters.

(I am not a supporter, just a non-denier.)

> In trying to fathom why you raised the subject (you being an advocate
> of the relational model, I would say)

Definitely. Codd, only Codd, and nothing but Codd, so help me God!

> I noticed:
>
> > [Implementation of Data Hierarchies is] simple for me,
> > difficult-to-impossible for those who are victims of the suppression
> > of Hierarchies,
>
> among whom I'm included. :-(

Excuse me, I did not include or exclude you, the determination is open. If and when I see your work, then a determination can be made.

> I think we could say that assertion is not even wrong.
>
> As you well know, tables can represent graphs, ergo hierarchies. On a
> few occasions I have implemented them. For example, a SQLite virtual
> table representing a directory hierarchy
> (http://www.schemamania.org/sql/sqlite/udf/).

(Although I only have time for skimming it for now, that looks like a seriously good contribution. I would be interested in a data model of the base table that supplies the virtual table, in return, I will show you mine.)

> Now that SQLite supports
> recursive queries, you can implement find(1) in SQL, if you like.
>
> So I would appreciate it if you would exclude me from the set of people
> for whom the concept is "difficult-to-impossible". Please do count me
> among those who usually find it needless, though.
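For readers who have not seen the recursive-query approach quoted above, here is a minimal sketch of a find(1)-like walk over a directory tree stored in a table. It is not the code behind the schemamania link. The table entry, its surrogate id column, the sample rows and the function find are all hypothetical, chosen only to keep the illustration short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE entry (
    id        INTEGER PRIMARY KEY,
    parent_id INTEGER REFERENCES entry (id),  -- NULL marks the root
    name      TEXT NOT NULL
);
INSERT INTO entry VALUES
    (1, NULL, 'usr'),
    (2, 1, 'bin'),
    (3, 1, 'lib'),
    (4, 2, 'find'),
    (5, 3, 'libc.so');
""")

def find(conn, pattern):
    """Return every full path matching a SQL LIKE pattern, sorted."""
    sql = """
    WITH RECURSIVE walk(id, path) AS (
        -- Anchor: start at the root entries.
        SELECT id, '/' || name FROM entry WHERE parent_id IS NULL
        UNION ALL
        -- Step: append each child's name to its parent's path.
        SELECT e.id, w.path || '/' || e.name
        FROM entry e JOIN walk w ON e.parent_id = w.id
    )
    SELECT path FROM walk WHERE path LIKE ? ORDER BY path
    """
    return [row[0] for row in conn.execute(sql, (pattern,))]
```

Calling find(conn, '%lib%') walks the tree from the root and keeps only the paths containing "lib". The adjacency-list layout used here is only one of several ways to store such a tree, and the thread itself does not endorse it.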

Two answers.

First, yes, you do not look like you should be included in that category of victims of the suppression of hierarchies, those who find it difficult-to-impossible to implement hierarchies. And I would very much like to make the determination, and thus a declaration to that effect. The sticking point, as I am sure you will appreciate, is that you have made the statements that you have about hierarchies; the hierarchical model; its place in the RM. Note your emotive comments. I don't have to be a psychologist to figure out that that would leave you indisposed (at best, and I won't mention the worst) to implementing one, properly, correctly, in a Relational database. More later.

Second.

Whoa. But, but, but. Wait a minute. Hang on.

Given that you have stated what you did about the hierarchies, and the hierarchical <model-or-whatever-word-you-use>, and that it is devoid of theory (not playing on words), what theory then, what seed, what concept, did you use when you implemented that directory ?

I did not let you fracture and split the RM (as the world knows it), and it may be that I should not let you carve up and quarter the HM, so that you can agree with some parts and disagree with others, but I will go with your quartering for now.

Where did you get that hierarchical concept from, for the directory, if not from the HM ?

Do you know that Oracle's Clustered Tables are a straightforward implementation of the HM ? Ie. the HM, not the Hierarchical concept, not an Oracle private definition, but the HM, pure and simple. Complete with pointers, and co-location of mixed data, non-tabular.

==== Task: JKL & JH ====

So here is a small task for you that I think will demonstrate many of the issues that we are dealing with in this thread. Given your demonstrated skill in the SQLite link and the RM (forget the "rm" of the mathematicians), I think this would take 30 mins or less.

Produce a data model for a set of tables that is fully compliant with the RM, ie, completely Relational, for the storage of data pertaining to:
- Countries (Name, FullName, FederationDate, ExpiredDate)
- State/Province/Territory (Name, FullName, Type)
- County (Name, Type)
- Town/Township/Metropolis (Name, Type)
- Suburb (StreetName(FK), StreetType(FK) )

- StreetName (PK: StreetName (CHAR(50)))
- StreetType (PK: StreetType (CHAR(3)), Name)

For all except the last two, the keys have been left out, the exercise is to determine and decide the keys. Everyone knows that there are ISO or ANSI codes for identifiers, for some of them, just choose good keys from that. Whatever you find as identifiers from a recognised standards body (eg. US ANSI County Codes) is deemed part of the "data".

Feel free to choose meaningful keys for the remainder. In case it needs to be said, the exercise is not about spending time finding info: if you have any trouble at all, please ask, and I will supply.

The data model can be in either the IDEF1X standard that practitioners have been using since 1985, or the text that mathematicians still use in the museum, in SQL DDL. The former is much faster. UML is somewhere in the middle, nowhere near as rich or specific as IDEF1X, but better, sort of, than DDL.

==== End Task ====
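For readers following along, here is a sketch of what the SQL DDL form mentioned in the task might look like for the first three levels only. It is not a solution offered by any poster in this thread. The key choices (ISO 3166 style codes, a three-character county code), the lower-case table and column names, and the helpers build and try_insert are all assumptions made for illustration. The only thing it demonstrates is that a child row cannot exist without its parent when the parent's key forms part of the child's key.

```python
import sqlite3

DDL = """
CREATE TABLE country (
    country_code CHAR(2) NOT NULL PRIMARY KEY,  -- assumed: ISO 3166-1 alpha-2
    name         TEXT    NOT NULL UNIQUE,
    full_name    TEXT    NOT NULL
);
CREATE TABLE state (
    country_code CHAR(2) NOT NULL REFERENCES country (country_code),
    state_code   CHAR(3) NOT NULL,              -- assumed: ISO 3166-2 suffix
    name         TEXT    NOT NULL,
    type         TEXT    NOT NULL,
    PRIMARY KEY (country_code, state_code)
);
CREATE TABLE county (
    country_code CHAR(2) NOT NULL,
    state_code   CHAR(3) NOT NULL,
    county_code  CHAR(3) NOT NULL,              -- assumed: a standards-body code
    name         TEXT    NOT NULL,
    PRIMARY KEY (country_code, state_code, county_code),
    FOREIGN KEY (country_code, state_code)
        REFERENCES state (country_code, state_code)
);
"""

def build():
    """Create the sketch schema with foreign-key enforcement switched on."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves this off by default
    conn.executescript(DDL)
    return conn

def try_insert(conn, sql, params):
    """Attempt one insert; report whether the declared constraints allowed it."""
    try:
        with conn:
            conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False
```

With this schema, inserting a county under a state that does not exist is rejected by the DBMS itself, with no application code involved. Town and Suburb, and the Street tables, are left out on purpose: choosing their keys is the exercise.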

> Representing a hierarchy is one thing, and basing a DBMS on
> data-as-hierarchy something else.

Er, I did not suggest, or recommend, "basing a DBMS on data-as-hierarchy". Of course, as you seem to agree, data sometimes exists in an hierarchy.

> Hierarchies exist. The hierarchical model does not.

Well that is still up in the air, and as long as we proceed as we have, that will be resolved soon. The HM vs "the concept of an Hierarchy" may have to be separated.

>
> --jkl

Question for you. In one para or less, what is your considered opinion of the Alice book.

In one sentence or less, how relevant is it to the field of Relational Database design.

Notice, I did not qualify that question with the term "mathematician" one way or the other.

Please do not avoid [F].

I will not forget two unanswered items, tomorrow.

Cheers
Derek

Jan Hidders

Jan 30, 2015, 6:35:51 AM
Op vrijdag 30 januari 2015 06:10:42 UTC+1 schreef Derek Asirvadem:
> Clarification
>
> 4.
> The HM is a fundamental part of the RM. (To be clear, "fundamental" means "founded on".) Removing the HM from the RM would remove much of the integrity, power, and speed from the RM.

As you know I agree with that to some extent, but not completely. So I wonder what arguments and observation you have to offer to support that position. And to avoid misunderstandings: I'm interpreting your position as that you think that the Nested Relational Model will lead to more effective DBMSs than the Flat Relational Model. Would that be fair?

> Which mathematicians do, frequently.

If you mean who I think you mean then that is not a fair characterization. Yes, much research went into the Flat Relational Model, but that was not just because they thought that was the best model. Even if you think the Nested Relational Model is better, it still has as a subset the flat model, which is easier to study, and so it makes sense to get a good understanding of that first.

You also seem to imply that there is no theoretical research on the Nested Relational Model. I know from direct experience that this is false, and have no idea why you would think that.

> And then invent tiny fragments of integrity, to patch up a few of the gaping holes, what they have lost.

I have no idea what you are referring to here.

> I love my profession.
>
> It is easy for me to help others, either furthering their knowledge or implementing something. It is easy for me to correct mistakes in my profession and to protect it from damage. That is the spirit and intent with which I started this thread. I am hoping that you love your profession, at least half as much as I do.

Yes, I do, and recognise this in you. It is the reason why, even though I strongly dislike your debating style which I think is counterproductive, I still think the conversation might lead to something since your intentions seem to be sincere.

> I hope I do not have to explain why, how, in any profession, and to humanity itself, falsity is damaging. A lot of unnecessary argument can be eliminated if we stick to the truth. There is no such thing as private definitions or private truths, they exist only for people who are severely isolated from society, people who do not have an authority. In the professions, we have authorities, standards, laws. If we observed them, there would be no conflict within each profession.

That's a bit too optimistic for me. :-) But, sure, commonly agreed definitions help and lack of them can make it hard to have meaningful discussions. But what can be far more damaging is the unwillingness to listen to the other side, even if their definitions are not exactly yours.

> Which leaves us with conflict between professions, as we have here. This skirmish is about mathematicians publishing falsity about my profession, causing damage to it (as well as to themselves, refer my comments re Norbert's thread).

Which falsities would that be?

> This does not happen when I deal with the banking industry or the car manufacturing industry: both sides are well aware of the sciences involved. It only happens with mathematicians who declare themselves to be the theorists in my industry, the database implementation profession.

When you say database implementation, do you mean indeed database implementation or DBMS implementation? But on both matters the theoreticians I know are actually very modest in their prescriptions. So I'm wondering what you are referring to here.

> There is a gaping chasm between what the mathematicians do (published secretly amongst themselves, unknown to practitioners), ie. what they think practitioners should do, and what practitioners are actually doing.

Any examples?

> Therefore, I requested that any discussion that may be had, be limited to the scientific realm.

That's not so easy, since most issues at hand are actually engineering questions rather than scientific questions. But ok, I support the demand to be as scientific as possible, and not only because my profession demands it.

-- Jan Hidders

Jan Hidders

Jan 30, 2015, 7:03:48 AM
Op vrijdag 30 januari 2015 12:29:48 UTC+1 schreef Derek Asirvadem:
>
> ==== Task: JKL & JH ====
>
> So here is a small task for you that I think will demonstrate many of the issues that we are dealing with in this thread. Given your demonstrated skill in the SQLite link and the RM (forget the "rm" of the mathematicians), I think this would take 30 mins or less.
>
> Produce a data model for a set of tables that is fully compliant with the RM, ie, completely Relational, for the storage of data pertaining to:
> - Countries (Name, FullName, FederationDate, ExpiredDate)
> - State/Province/Territory (Name, FullName, Type)
> - County (Name, Type)
> - Town/Township/Metropolis (Name, Type)
> - Suburb (StreetName(FK), StreetType(FK) )
>
> - StreetName (PK: StreetName (CHAR(50)))
> - StreetType (PK: StreetType (CHAR(3)), Name)
>
> For all except the last two, the keys have been left out, the exercise is to determine and decide the keys. Everyone knows that there are ISO or ANSI codes for identifiers, for some of them, just choose good keys from that. Whatever you find as identifiers from a recognised standards body (eg. US ANSI County Codes) is deemed part of the "data".

Usually it's me who's handing out homework. :-) Would you mind if I decline for the reason that it is not my position that at the conceptual level hierarchically nested data is a bad idea in a DBMS or "does not naturally exist"? In fact, I've defended in the past in this newsgroup the position (and still do) that the graph-based models such as (F)ORM data models are actually in that respect superior to the Relational Model, even the nested one.

So please don't confuse my position with James' position. We are almost diametrically opposed on all issues.

-- Jan Hidders

Derek Asirvadem

Jan 30, 2015, 7:58:19 AM
> On Thursday, 29 January 2015 22:39:37 UTC+11, Jan Hidders wrote:

I am too tired to write, but I am reading. Quick answer for now, on one point only. I will reply the rest tomorrow.

> Concerning B: It becomes important here what it is that you precisely mean with "the HM". If you include the traditional assumptions about how the data is stored and the pragmatics of how to effectively query and manage it, then, yes, that is pretty much dead.

Obviously. I think we all agree that it died /in that sense/ in 1984.

There are other senses.

> But most now understand the relevance of data independence.

Hah! You have to be kidding.

Please provide a single link to a paper written by a mathematician in the RDB space, that asserts that.

For each that you do provide, I will provide one that asserts the opposite. I would have thought that you are familiar with some of them.

For the few that do assert database independence, note that it has taken them forty five years to reverse their position. Practitioners who followed Codd, have never left that position.

That is precisely part of the problem I am describing, the one that contaminates and damages the RDB space.

> If you define the HM as only the idea that data can be nested, then as you know some promote the idea that the Relational Model should allow this and provide means to effectively deal with them.

No, I don't limit my idea of the HM (in the various senses it exists today, and as it influenced or was fundamental to the RM), to that sense. And there are other senses.

In scoping terms, let us exclude that issue (nested sets; nested relations; complex datatypes) from this thread.

I have already stated somewhere, I provide all that without the need to change the definition of atomic, etc, using today's SQL. But that too is out of scope. The reason I make that out-of-scope point is, the supply that I provide is totally within the RM. That is to say, the RM does not need to "allow" full and complete handling of hierarchical data, it already does. And only an idiot (ie. one who does not understand the RM; one who does not understand Hierarchies independently; and one who does not understand Hierarchies as prescribed within the RM), would have to ask for that.

Further, those creeps that ask for such changes, cannot change the RM. It is not theirs to change. If their claims were valid, they should make up their own model. Of course, it would be a Toy Model, with rubber bands and popsicle sticks.

Once again, the damage that people who have private definitions, do.

Sorry for the short treatment.

Cheers
Derek

Derek Asirvadem

Jan 30, 2015, 8:23:48 AM
Jan

Again, just on that one point.

> On Friday, 30 January 2015 22:35:51 UTC+11, Jan Hidders wrote:
>
> As you know I agree with that to some extent, but not completely. So I wonder what arguments and observation you have to offer to support that position. And to avoid misunderstandings: I'm interpreting your position as that you think that the Nested Relational Model will lead to more effective DBMSs than the Flat Relational Model. Would that be fair?

There is no such thing as the Nested Relational Model. There is only one Relational Model.

No. My initial post, this thread, is about real hierarchies. Not about the various papers that propose to implement data in some nested form (a view that is partially hierarchical) without understanding that the Relational Model has Hierarchies. Ignorant people are forever re-inventing the wheel, and forever coming up short and square.


> You also seem to imply that there is no theoretical research on the Nested Relational Model. I know from direct experience that this is false, and have no idea why you would think that.

I have read five papers. Two are very good. Three are very poor.

The two that are good, are ignorant of hierarchies, and they re-invent the wheel, in the front of the cart, and sideways.

The great problem with ALL the mathematical papers these days is that they are written in staggering ignorance of other sciences; with a narrow focus on their tiny area; in ignorance of the real world, where the thing that they are "researching" already exists, and can be readily observed.

> Yes, much research went into the Flat Relational Model

Barf bag.

> , but that was not just because they thought that was the best model. Even if you think the Nested Relational Model is better

Barf bag with holes in it.

> , it still has as a subset the flat model, which is easier to study, and so it makes sense to get a good understanding of that first.

Three separate answers.

1. I don't use barf bags manufactured by others, I only use my own.

2. Why don't they research the Relational Model, instead of inventing things that are in it, by some other, stupid, name ? By virtue of the evidence, mathematicians in this space have very little understanding of the RM. But they think otherwise, and they prove themselves wrong, when they invent the thing that already exists in the RM. They have no shame, no professional pride.

3. Ok, fine. Just don't tell the world about it. Don't try to change the RM. Don't publish it outside the protected space that mathematicians need to do their research. Don't tell anyone that you invented it. And if you do, there will be consequences.

Just look at Norbert's Topology thread. Excellent research but we have had that in commercial systems for over twenty years. And that research was done in isolation from the real world (note his question "Is this relevant?"). So when you carve off all the stuff that we already have; that stuff that is claimed to be relational without evidence, you are left with one thing: three tables plus a bunch of SQL/<some-language> queries to paint Spaces from within an app. Really great stuff, but he should have known that from the beginning.

Tomorrow.

Cheers
Derek

Jan Hidders

Jan 30, 2015, 10:12:27 AM
Op vrijdag 30 januari 2015 14:23:48 UTC+1 schreef Derek Asirvadem:
> Jan
>
> Again, just on that one point.
>
> > On Friday, 30 January 2015 22:35:51 UTC+11, Jan Hidders wrote:
> >
> > As you know I agree with that to some extent, but not completely. So I wonder what arguments and observation you have to offer to support that position. And to avoid misunderstandings: I'm interpreting your position as that you think that the Nested Relational Model will lead to more effective DBMSs than the Flat Relational Model. Would that be fair?
>
> [.. snip ..]
>
> No. My initial post, this thread, is about real hierarchies.

And are these essentially different from nested relations? Can you define how exactly? Just so it's clear what you are talking about. I'm guessing it is that there is a notion of logical relationships between parts of hierarchies. But I'm not sure, so it would help the discussion if you could confirm that. I'm also curious what your evidence is that this has always been part of how the Relational Model was widely understood. I know database researchers who were around at the time, with both contacts in academia and industry, and they disagree with that.

> Not about the various papers that propose to implement data in some nested form (a view that is partially hierarchical) without understanding that the Relational Model has Hierarchies. Ignorant people are forever re-inventing the wheel, and forever coming up short and square.

I don't know papers that talk about "implementing" in a nested form, but I do know papers about "representing" it that way. That is of course an essential difference if we take data independence into account. Why did you use the word "implementing" and which papers are doing that?

I also don't really get this accusation that these papers unfairly claim to have invented something new. As far as I can tell they usually don't do that; they rather investigate what happens if you do. Any paper in particular that you can point to that does this in a way you find unacceptable?

-- Jan Hidders

Derek Asirvadem

Jan 31, 2015, 12:01:12 AM
Jan

> On Saturday, 31 January 2015 02:12:27 UTC+11, Jan Hidders wrote:

Referring to the whole post identified above, plus parts of your first two posts.

You are getting into definitions, private definitions, and what yours and mine are. That is something that (thankfully) you agreed not to do.

That is all tangential.

You are posing questions instead of answering the issues on the table.

As stated from the outset, I wish to limit the scope of this thread. One reason for that, is to avoid the time-consuming address of tangential issues; private definitions; etc, which lead to arguments that never get resolved.

I want these issues resolved, with minimal argument.

I repeat, from my initial post, you made the statements, in your various papers, articles and posts. I am confronting those statements. When you stated "RM", if you meant something other than the RM which is clearly understood by 99% of the people in the industry, perhaps one of the many "rms" that the 1% may be aware of, then it is up to you to retract it, or modulate it or correct it in some way. It is up to the writer to make the distinction. As it stands, those statements are incorrect, and I am trying to deal with that, and to resolve that.

The papers concerned, including some written by you, are published in the public domain. They get used to justify the implementation of certain notions. Those notions are raised in those papers. Those notions are false (sure, they do have a mathematical proof; but that is not the whole theory in this industry; and it is definitely not science). That is one of the cancers that is damaging the industry, in particular, the understanding and application of the RM. I am hunting those cancer-causing agents down, and dealing with them. We can deal with it squarely, or you can get tangential and avoid it. In the latter case, you will damage your own credibility.

Likewise for your statements re the HM, hierarchies, and their relevance to the RM.

If you start going off on tangents, it means you are avoiding the resolution of the issues. Please do not do that. Notice, James is not doing that.

This thread has very little to do with definitions, and a lot to do with abusing established definitions, which allows claims to be made in the universe of that one percent, which gets used in the universe of the 99%. Deal with the abuse.

> A. ____the Hierarchical Model rests on a theoretical void____
>
> B. ____the HM is dead, it has no relevance wrt the Relational Model___
>
> F. ____The implication here, and in many other places, is that you know the Relational Model, and you know it well.____
>
> That is the scope, I hope I have identified it clearly.

And since then, that "nested relations" are excluded from "hierarchical", if only to preserve the stated scope (noting my comments that they are already catered for in the RM, and a "nested set r m" or a "bovine r m" is unacceptable if published in the public domain).

----

I will deal with the points which are not tangential, in your three posts, shortly.

Cheers
Derek

Derek Asirvadem

Jan 31, 2015, 1:53:38 AM
Jan

> On Thursday, 29 January 2015 22:39:37 UTC+11, Jan Hidders wrote:
> On Friday, 30 January 2015 22:35:51 UTC+11, Jan Hidders wrote:
> On Friday, 30 January 2015 23:03:48 UTC+11, Jan Hidders wrote:
> On Saturday, 31 January 2015 02:12:27 UTC+11, Jan Hidders wrote:
> > Op vrijdag 30 januari 2015 12:29:48 UTC+1 schreef Derek Asirvadem:

I accept that you two may be in "diametrically opposite" positions. But I had placed you two in the same basket wrt the subject of the thread. I will answer each of your posts separately, name at the top. But if I answer them /fully/ I will end up with duplication, which is onerous for me, and offensive to the readership. Therefore, please read all my responses, noting that each applies directly to the named person (answering an identified post), but applies to both of you in general.

> > A. ____the Hierarchical Model rests on a theoretical void____

> > B. ____the HM is dead, it has no relevance wrt the Relational Model___

> > F. ____The implication here, and in many other places, is that you know the Relational Model, and you know it well.____

> Concerning A: It is indeed true that one cannot point to a single paper that defines formally the data model and the associated languages.

So what. There are no papers re the Network model either. So what. It was all proprietary in those days. It is silly to expect papers for such.

There are no papers for what Sybase has been doing in the last twenty years, re (eg) caching data buffers in their server. So what. Any Sybase professional knows most of the content fairly well. The absence of papers is not a proof that the server does not cache buffers, or that the caching does or does not do something, or that it does use this or that algorithm. The fact that the buffers are cached, that a specific algorithm is operating, and that the algorithm is more advanced than the 2006 algorithm, is easily proved.

Unless one denies science, evidenced reality, one does not need a paper.

> I'm also curious what your evidence is that this [hierarchies] has always been part of how the Relational Model was widely understood.

I did not say that.

I said that:
- the 99% THEN understood it perfectly
- no idea what the 1% did THEN
- that in the forty five years since, the 1% have written about 100 papers that deny various parts of the RM, hierarchies being one of them
- that those papers (specifically those authors) are responsible for the diminution of the RM, in general
- specifically, re this thread, that hierarchies and the RM are mutually exclusive
- that hierarchies do not exist in the RM
- that a very important part of the data and referential integrity in the RM, that is dependent on hierarchies, is lost when those hierarchies are ignored or suppressed (you appear to understand that integrity is contained in the RM, but only the non-hierarchical part of it)
- So I am saying, NOW, based on the evidence, the 1% does not understand the RM, and specifically, does not understand that Hierarchies, the HM, exist within the RM.
- all that results in a widely-understood notion among the 99% NOW, of an RM that is legless, fragmented, fractional, devoid of its core principles

That the evidence is, the papers, those authors, have little knowledge of the RM. To wit, it WAS widely understood, but as a result of the mathematical papers, it is now NOT widely understood. It is starting to look like your understanding of the RM is based on papers that preceded yours, and not on the RM itself. Which categorically means, you should not be making statements about the RM.

> I know database researchers who were around at the time, with both contacts in academia and industry, and they disagree with that.

They are not in a position to agree or disagree. Refer my comments above re proprietary systems.

> But there are plenty of papers on defining hierarchical data models and languages to query them, and this is in fact quite a well understood area.

That is my point, agreed. It is silly to argue that the HM is not known, or that the model doesn't exist as a model, etc.

(This does appear to contradict your initial points, Concerning A.)

> Some people also take "theoretical foundation" to mean that there is a philosophical foundation for that type of knowledge representation,

Sure. And there was THEN a well-established implementation foundation, evidence that it worked well, really well, for what it was designed to do: DSS (inferential systems). (The market was moving and it had new demands, namely from Batch TP to OLTP, and the HM did not work well for those new demands. Network came along with instantaneous OLTP, and started wiping the floor with IMS. That did not mean that OLTP lives and HM died, it was not an exclusive issue. IBM, their customers, and others wanted OLTP /and/ DSS, in the one data bank.)

Therefore this:

> I know database researchers who were around at the time, with both contacts in academia and industry, and they disagree with that.

is false for another reason. Anyone in academia or the industry at the time, who did not know the basic issues; the "philosophy"; the implementation; the reality; the market demands, is simply not credible, he is passing himself off as someone knowledgeable in a subject while providing evidence that he is clueless.

> for example such as exists for first order logic, which is more or less inherited by the Relational Model.

That is not accurate. Codd defined a specific Relational algebra using first-order logic, for the purpose concerned. There is no suggestion that the RM contains the whole of FOL, or that the RA contains the whole of any RA that existed at that time.

> But there are also such theories for higher-order logics, so in that sense I don't think it is true.

Come on, Jan, higher-order logic does not apply. Please stay on the point.

> But I actually don't accept that this is an important observation. What matter is if people in practice understand and can deal with hierarchical data. They can, and they do.

Good. I am very happy that that is clear to you, that that is your position. I hope and pray that you have a similar position on the RM.

> Concerning B: It becomes important here what it is that you precisely mean with "the HM". If you include the traditional assumptions about how the data is stored and the pragmatics of how to effectively query and manage it,

No.

> then, yes, that is pretty much dead.

Agreed. In 1984.

> But most now understand the relevance of data independence.

I suppose I have to trust that you mean that in the fullness of the data integrity as prescribed in the RM.

Er ...

1. I have already stated in the Topological thread:

> > At this moment, I have two papers on my desk that are awaiting approval/rejection from a customer, written by well-established mathematicians (one who is a correspondent here), and they are both raised by an OO development team leader, to support his notion that the data & referential integrity problems that plague the customer's app plus record filing system can be corrected by further definitions, typing, classifiers, etc. Obviously that is false, and it is simply a continuation of the same promises that have been made for three years, none of which have been delivered. Except that this time, he has a few academic papers to support his promises. His credibility is long gone, and the app+RFS is going to be replaced by an RDB + app, which is why I was hired.

> > The papers are not science, therefore pointing out the errors and dismissing them is not difficult, but it is time-consuming, since I have to address the construction of the argument before stitching up in the proposal.

2. I have already stated in this thread:

> > Hah! you have to be kidding ...
> > Please provide a single link to a paper written by a mathematician in the RDB space, that asserts that.
> > For each that you do provide, I will provide one that asserts the opposite. I would have thought that you are familiar with some of them.
> > For the few that do assert database independence, note that it has taken them forty five years to reverse their position. Practitioners who followed Codd, have never left that position.

> > That is precisely part of the problem I am saying, that contaminated and damages the RDB space.

3.
> > The HM is a fundamental part of the RM. (To be clear, "fundamental" means "founded on".) Removing the HM from the RM would remove much of the integrity, power, and speed from the RM. Which mathematicians do, frequently. And then invent tiny fragments of integrity, to patch up a few of the gaping holes, what they have lost. The patchwork never ends. I can determine, by virtue of the evidence, of the systems prescribed by these mathematicians, exactly what is missing; subtracted.

> > And then invent tiny fragments of integrity, to patch up a few of the gaping holes, what they have lost.
>
> I have not idea what you are referring to here.

Let's cut to the chase.

One of those papers I refer to in [1] is yours.

Actually, one of the many papers I would produce as evidence that mathematicians in this space assert the opposite [2], could be the same paper.

And quickly skimming it again, it contains one perfect example for [3].

It contains many of the issues that I raise and wish to address in this thread:
a. the absence of understanding data, as data, independently, including that all constraints on data should be located in the RDb
b. the absence of understanding (perceiving ? observing? recognising ?) that the data is an hierarchy, which obviously is a result of the suppression of hierarchies that are contained in the RM
c. the proposition that, while data integrity as implemented in the program space has failed, yet more complexity and programming in the program space will solve the problem, complete with a mathematical proof to that effect
d. ignorance [of the relevant parts] of the RM, although it is referenced
e. ignorance of the fact that the RM handles hierarchies beautifully
f. ignorance of the fact that if the data were implemented as per the RM, the problem would not exist, therefore the proposal is totally without merit
g. that your statement now, re the value of data independence is in direct contradiction to papers that you have written in the past
h. re [f], I am not suggesting that you have based the paper on a Straw Man concept, but on ignorance, of the RM, such as mathematicians in this space commonly exhibit. A point that I assert in this thread. So it serves as a proof of that, too.

Of course, the great danger, the evidenced damage to the industry, is that people use such papers, believing the propositions to be true, the proofs to be valid, and they implement systems on that basis. People like me know that they were false from the beginning, that data independence, etc are irrefutable facts, a science, since 1970, clearly demonstrated in systems since 1985. However, people who implemented systems based on such papers don't know that, they are influenced to implement systems that contradict science. And they keep implementing them, because the papers have not been retracted, all we have is a statement from the author in an unrelated post on c_d_t stating that "most /now/ understand the relevance of data independence". Uhuh, twenty years later (1995 to 2015), after twenty years of causing damage, propositions and proofs that were made in denial of the RM; science, that existed in the previous twenty five years (1970 to 1995).

Therefore we can use that paper, which is still being used today, to deal with many of the issues raised in this thread, to prove (evidentially, not by mathematical proof) many of my declarations in this thread. As well as giving you an example of what I mean (the head of this section).

Rather than diving into it, I would like to ask your permission to do so. Of course, your credibility will be diminished if you do not accede, but that is a consequence of your actions, historically, and now. It is not a consequence of my actions.

> > 5.
> > > > [Rule: Circular references are not permitted]
> > >
> > > The German page is more explicit and does mention this rule.
> > >
> > > > Which appears to be a perfectly reasonable rule to me ...
> > > > Is it perfectly evident that this requirement must be enforced, since a
> > > > model with cyclic dependencies is plain "spaghetti", maybe even
> > > > violating some normal form?
> > >
> > > It has nothing to do with normalization in the classical relational
> > > theory sense of the word.
> >
> > I declare, it has a direct and definitive relationship with Normalisation, as defined in the RM.
>
> Really? Which of the classical normal forms (1NF, 2NF, 3NF, BCNF, 4NF, 5NF [in its different variants you find in text books]) forbids cyclic dependencies?

Please stick to the statements made, please read again. I did not refer to the NFs (which I have severally stated are abnormal; tiny fragments of Normalisation), I have specifically stated "Normalisation, as defined in the RM". I am under no obligation to defend statements that I did not make.

What do you have to say in response to my statement, in the posted context, refuting your statement, without reference to the NFs ?

But of course, you raise an important point, which I will not avoid. And it is a great point because it drags us back to the scope of this thread, my initial post:

> > The first [obstacle] is, mathematicians are scared of The Hierarchy. And they are in a state of denial re the evidenced fact, that the Relational Model is a progression of, not a substitute of, the Hierarchical Model. Note the comments of J Hidders and J K Lowden in recent threads. Of the two Normal Forms defined in the RM (which btw remain undefined by mathematicians after 44 years), the first is, if we were to name it without psychological impediments, the
> ____ Hierarchical Normal Form____
> - It destroys many of the problems that mathematicians, even today, are grappling with.
> - It deems many of the proposals of Date, Fagin, and Darwen *non-relational*, which is why they suppress the Hierarchical issue, in order that people won't identify their proposals as massive breaches of the RM.

The Hierarchical Normal Form, as defined in the RM, forbids cyclic dependencies. It demands a tree without circular references.

The definition in the RM is informal, but scientific, and easily understood by the 99%.

Since the mathematicians in this space have failed, miserably, to (a) understand the RM, and (b) to provide /formal/ definitions for the two Normal Forms in the RM, they may not be known to the 1%. Which again, is additional evidence that the 1% are clueless about the RM.

Separately, anyone who is programming in any language, or designing databases in any platform, who does not know that cyclic dependencies and references are wrong, wrong, wrong, proves themselves to be an un-scientific ignoramus, executing work in denial of science. Because science (many proofs in other, related sciences) tells us that. Because anyone working with an hierarchy (recognised as an hierarchy) knows that: what would happen if a directory structure, such as James mentioned, were to have a circular reference ?
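
A minimal sketch of what checking for that condition could look like, under stated assumptions: each table (or directory) is a node, and each reference is an edge from the child to its parent. The function name `find_cycle` and the table names are hypothetical, chosen only for illustration; nothing here is taken from the RM or from any DBMS. Note that this sketch also reports a table that references itself as a cycle, which is an assumption of the sketch, not a ruling on that case.

```python
def find_cycle(references):
    """Return one circular reference path as a list of names,
    or None if the reference graph is acyclic.

    `references` maps each name to the names it refers to
    (child -> parents). Hypothetical structure for illustration.
    """
    WHITE, GREY, BLACK = 0, 1, 2   # unvisited, on current path, finished
    colour = {}
    path = []

    def visit(node):
        colour[node] = GREY
        path.append(node)
        for parent in references.get(node, ()):
            state = colour.get(parent, WHITE)
            if state == GREY:
                # parent is already on the current path: a cycle
                return path[path.index(parent):] + [parent]
            if state == WHITE:
                found = visit(parent)
                if found:
                    return found
        path.pop()
        colour[node] = BLACK
        return None

    for node in list(references):
        if colour.get(node, WHITE) == WHITE:
            found = visit(node)
            if found:
                return found
    return None


# A tree of references: no cycle is reported.
tree = {"OrderItem": ["Order"], "Order": ["Customer"], "Customer": []}

# Two tables that reference each other: a cycle is reported.
looped = {"Employee": ["Department"], "Department": ["Employee"]}

print(find_cycle(tree))
print(find_cycle(looped))
```

The same check applies unchanged to a directory structure if each directory is mapped to the directories it links to.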

Circular references are invalid, period. Everywhere. Constraints. Programs. Database definitions. Data. Newspapers. Electricity grids. Puzzles, mazes. Marital bliss. Gender "orientation". Transit systems. Any set of manuals. Hospital procedures. In any system (using the word in the loosest sense) of related articles. The list is endless, and only necessary if one does not understand that Circular references are invalid, period.

It is not invalid /only/ because they are invalid in the HM, or /only/ because Codd defined them as invalid in the RM. It is a law of nature, established in every science (except the 1% in our field).

Unless one is in a state of denial re evidenced reality (schizophrenic), *and* ignorant of other sciences, one does not need a mathematical proof to prove that circular references are invalid, plain stupid. And the corollary is, in order to accept circular references, one must drink the Kool-Aid.

Now it must be said, separate to the fact that they are prohibited by the RM, in the mathematical desert that is the database space, idiots and schizophrenics write papers, with mathematical proofs, which deny reality and other sciences. The hilariously side-splitting evidenced fact is, there are papers here that provide mathematical proofs that circular references are "valid". (Here the mathematical proof which is final to mathematics, and its non-finality to scientists, is revealed. But let us not get distracted.) And the same freaks provide examples using circular references at every opportunity, in order to re-inforce the totally invalid proposal, the daily dose of Kool-Aid.

Of course, I am rather attached to science, and I expose such nonsense as being in denial of science, of reality (hence schizophrenic), and I dismiss all of it. But again, many people, also ignorant of science, believe that putrid garbage to be "valid", "proved", and they implement systems. Another example of the cancer in the industry, and the cancer-caused agents are easily identified. I would shoot the lot of them.

To finish this point, it must be said, this particular cancer is so wide-spread, that not only implementers implement it, but DBMS implementers implement DBMSs with it built in. PostgresNonSql and others are a sick joke, they legalise the insanity ("deferred constraint checking" which is only required if the cancer of circular references is deemed valid), they support schizophrenia and thus normalise it.

That is why DB2 and Sybase do not have support for "deferred constraint checking", for decades. We do not need it. Only idiots, who listen to and believe schizophrenics, need it. And their "systems", their "databases", are massively inefficient and problematic.
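
A hedged sketch of the reasoning behind that claim, not a description of how any DBMS works internally: if the references between tables contain no cycle, there is always a load order in which every parent table is populated before its children, so each reference can be checked immediately at insert time. If two tables reference each other (and neither reference may be left empty), no such order exists. The function name `load_order` and the table names are hypothetical; `graphlib` is in the Python standard library from 3.9.

```python
from graphlib import TopologicalSorter, CycleError


def load_order(references):
    """Return table names ordered parents-first, or None when the
    references are circular and no parents-first order exists.

    `references` maps each table to the tables it refers to
    (child -> parents). Hypothetical structure for illustration.
    """
    try:
        # TopologicalSorter takes node -> predecessors, so the
        # parents of each table are emitted before the table itself.
        return list(TopologicalSorter(references).static_order())
    except CycleError:
        return None


tree = {"OrderItem": ["Order"], "Order": ["Customer"], "Customer": []}
looped = {"Employee": ["Department"], "Department": ["Employee"]}

print(load_order(tree))    # parents appear before their children
print(load_order(looped))  # no valid order: circular reference
```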

> > 4.
> > The HM is a fundamental part of the RM. (To be clear, "fundamental" means "founded on".) Removing the HM from the RM would remove much of the integrity, power, and speed from the RM.
>
> As you know I agree with that to some extent, but not completely. So I wonder what arguments and observation you have to offer to support that position.

No. It is the other way around. You made statements that the HM has nothing to do with the RM, etc, etc. Contrary to the standing evidence (the RM). So it is up to you to produce evidence of that which contradicts established science. That route, addressing the points you might raise, would be the smallest piece of work.

In the event that you can't do so, sure, I will provide chapter and verse from the RM. Chapter and verse is required, because the HM is unknown these days, whereas the HM was the only model in the 70's and 80's, and very well known. It provides the historical context in which the RM came into existence (the RM was not written in isolation, in a vacuum, or on the fourth moon of Jupiter). Some terms in the RM need exposition, if it is to be understood today. That is a larger piece of work.

Third, again not avoiding your request, if we proceed with dealing with your paper, this issue [4] would be addressed and closed fully. So I await your permission.

> > I love my profession.
> >
> > It is easy for me to help others, either furthering their knowledge or implementing something. It is easy for me to correct mistakes in my professional and to protect it from damage. That is the spirit and intent with which I started this thread. I am hoping that you love your profession, at least half as much as I do.
>
> Yes, I do, and recognise this is in you.

Thank you.

> It is the reason why, even though I strongly dislike your debating style which I think is counterproductive, I still think the conversation might lead to something since your intentions seem to be sincere.

You misunderstand me. I am not here for debates. I am here for resolution.

It is not a conversation, it is a demand for evidence. Or "why are you writing this, in denial of established science".

If you consider my direct communication style "combative", then I am very sorry. And please observe that I did not create the combat. The person writing papers that are in denial of science did, my demand is merely the consequence.

> > I hope I do not have to explain why, how, in any profession, and to humanity itself, falsity is damaging. A lot of unnecessary argument can be eliminated if we stick to the truth. There is no such thing as private definitions or private truths, [they] exist only for people who are severely isolated from society, people who do not have an authority. In the professions, we have authorities, standards, laws. If we observed them, there would be no conflict within each profession.
>
> That's a bit too optimistic for me. :-) But, sure, commonly agreed definitions help and lack of them can make it hard to have meaningful discussions. But what can be fare more damaging is the unwillingness to listen to the other side, even if their definitions are not exactly yours.

I am not interested in any definitions other than the scientific ones that are established. There is no point in listening to someone who has a different definition, because the basis of such, is a rebellion against science, against established definitions.

Practitioners couldn't care less about the debates re the ever-changing definitions that mathematicians seem to enjoy. At best (the most politeness I can muster), they are irrelevant, precisely because they are not fixed, and cannot be relied upon to build anything. That is not to say that we do not allow theorists to engage in it, but even then, the relevance depends on whether it is scientific or not.

I don't know what the slang term is in Holland. The German one is incisive, but too rude to print. The Australian one for that sort of exercise is, mental masturbation. Getting others involved in the conversation makes it mutual mental masturbation.

> > Which leaves us with conflict between professions, as we have here. This skirmish is about mathematicians published falsity about my profession, causing damage to it (as well as to themselves, refer my comments re Norbert's thread).
>
> Which falsities would that be?

The list is endless. We have already started dealing with a few specific falsities, they will progress and close (as decidedly false) as the thread progresses. Eg. your paper. I trust that the short list on the table suffices for now.

> > This does not happen when I deal with the banking industry or the car manufacturing industry: both sides are well aware of the sciences involved. It only happens with mathematicians who declare themselves to be the theorists in my industry, the database implementation profession.
>
> When you say database implementation, do you mean indeed database implementation or DBMS implementation?

Both, and separately.

> But on both matters the theoreticians I know are actually very modest in their prescriptions. So I'm wondering what you are referring to here.

Jan, you do make me laugh.

It is not the modesty in the paper that is the operative issue. It is that the papers, with proofs, devoid of science, get used, to implement falsity, the result of which is garbage. And twenty years later, when the mathematicians finally accept that the said proofs are garbage, they do not have the professionalism to retract them, they just carry on with new proofs.

In other cases, eg. MVCC, they deny evidenced reality, that their previous proofs were wrong, and write new proofs about a new Wonderland. Vociferous denial.

Third, and this is probably the most important, these same freaks write books, and propagate those false theories, those cancers, to the masses. That is the wilful bombardment of previously undamaged humanity with disease.

> > There is a gaping chasm between what the mathematicians do (published secretly amongst themselves, unknown to practitioners), ie. what they think practitioners should do, and what practitioners are actually doing.
>
> Any examples?

Heaps. Let's deal with the ones identified in this post thus far. And again, the specific ones in your paper.

> > Therefore, I requested that any discussion that may be had, be limited to the scientific realm.
>
> That's not so easy, since most issues at hand are actually engineering questions rather then scientific questions.

I do not split those hairs.

> But ok, I support the demand to be as scientific as possible, and not only because my profession demands it.

Thank you. And in that case, it /should/ be easy.

> On Saturday, 31 January 2015 00:23:48 UTC+11, Derek Asirvadem wrote:

> > You also seem to imply that there is no theoretical research on the Nested Relational Model. I know form direct experience that this is false, and have no idea why you would think that.
>
> I have read five papers. Two are very good. Three are very poor.
>
> The two that are good, are ignorant of hierarchies, and they re-invent the wheel, in the front of the cart, and sideways.
>
> The great problem with ALL the mathematical papers these days is that they are written is staggering ignorant of other sciences; with a narrow focus on their tiny area; in ignorance of the real world, where the thing that they are "researching" already exists, and can be readily observed.

Ie. ALL mathematical papers in the RDB space. The car industry; manufacturing; computer hardware, industries have no such problem. The software industry suffers a fair amount, and within that, the database industry is the worst, pathetic.

To be clear, from where I sit, there are three easily identified categories of mathematicians or theoreticians.

1. Knowledge
Science, theory (including mathematics, which is a fraction) and implementation, that is based on all the sciences that apply; all the theories that are scientific; all the established rules in the field (eg. the RM).

2. Ignorance
Theory (including mathematics, which is a fraction) and implementation, that is based on one science, in ignorance of related sciences (eg. laws of physics, architectural principles); ignorance of the established rules in the field (eg. the RM, software architecture); and of the reality (current state) in the field. Un-scientific.

Here, the mathematical theory is asserting itself over everything else. Dangerous and stupid.

One of the two good papers (according to my informal review) is the Wok paper on Nested Normal forms. It is clearly in [2]. To some extent, that is forgive-able, and I have forgiven them, as described above, because the field is damaged; they have been taught cancer; they are not creating it; they are unaware that they have cancer; they are spreading it out of ignorance.

3. Cancer Causing Agent
Theory (limited to the mathematical fraction) and implementation, in violation of the science in the field; in violation of related sciences; in violation of the established rules in the field; and of the reality in the field. This is Anti-scientific. A state of denial, and if they keep doing it after it is pointed out to them, it is a pathological state of denial (which is why I use the term schizophrenic). These are the causative agents.

Here, the isolated mathematical proof is god, untouchable. New private definitions are welcomed with glee. Whether the proof is true or false, whether it violates the sciences is suppressed. The only way to do that is to suppress the other sciences, and to demand that all should think like these demented mathematicians (that is the point where you guys triggered me to confront you). That is the Kool-aid: in short, accept isolated and ignorant proofs, and deny the sciences.

Date, Darwen, Fagin, Fowler, Ambler, OMG, Abiteboul, Hull, Vianu, *all* fit squarely into category [3].

And as detailed further above, to the extent that they write books, or teach in universities, they are bombarding us with cancer.

> > So here is a small task ...
>
> Usually it's me who's handing out home work. :-)

It isn't homework.

It is a challenge to prove the extent of your understanding of the RM, re declaration [F], which based on the evidence that I have detailed, I am calling into question. It will also, secondly, deal with and close other issues that have been raised in this thread.

> Would you mind if I decline for the reason that it is not my position that at the conceptual level hierarchically nested data is a bad idea in a DBMS or "does not naturally exist"?

It has nothing to do with nested relations. It has everything to do with whether hierarchies, the HM, exist within the RM or not. And that if you are unaware that they do, then the tables that you create will have less integrity than an implementation that is not in denial of hierarchies in the RM. Ie. my initial post.

I will give you another chance to deal with the challenge or decline.

I don't mind, either way. If you decline, you will be declining an opportunity to assert your knowledge per stated. In which case, you have no business telling anyone anything about the RM or the HM or about Hierarchies.

> So please don't confuse my position what James' position. We are almost diametrically opposed on all issues.

Understood.

> > > As you know I agree with that to some extent, but not completely. So I wonder what arguments and observation you have to offer to support that position. And to avoid misunderstandings: I'm interpreting your position as that you think that the Nested Relational Model will lead to more effective DBMSs then the Flat Relational Model. Would that be fair?
> >
> > [.. snip ..]
> >
> > No. My initial post, this thread, is about real hierarchies.
>
> And are these essentially different from nested relations?

No, I have specifically stated that they are the same. That only mathematicians working in isolation, and in ignorance, assert that they are different.

> Can you define how exactly? Just so it's clear what you are talking about. I'm guessing

Please do not guess. I am quite literal in my posts.

> it is that there is a notion of logical relationships between parts of hierarchies.

Yes and no, depends what you mean. In any case, that is going far too much ahead of the current state of play of this discussion. Yes, it is important, but no, it is not relevant /now/, and when we get further, the question will not exist (it would have been resolved by the progression.)

I repeat, nested sets or relations or whatever they are variously called, are not relevant to the thread. Please do not get hung up about it.

Hierarchies (whether acknowledged from the HM or not; whether nested sets or not), are relevant. They are relevant conceptually at concept time; logically at logical time; and physically at physical time. The logical and the physical would not happen unless they were conceptual first. I am saying, hierarchies as they are prescribed in the RM, and as they exist naturally in data, are suppressed, by the mathematicians in this industry.

> > Not about the various papers that propose to implement data in some nested form (a view that is partially hierarchical) without understanding that the Relational Model has Hierarchies. ignorant people are forever re-inventing the wheel. and forever coming up short and square.
>
> I don't know papers that talk about "implementing" in a nested form, but I do know papers about "representing" it that way. That is of course an essential difference if we take data independence into account. Why did you use the word "implementing" and which papers are doing that?

1. Representing where ? In the back of the eyelids ???

Representing for what purpose ? If it has a purpose beyond MMM, then an implementation is next. Check Norbert's thread, he is trying to cross over from the eyes-closed universe to the eyes-open universe, check the chasm between them.

In any case, as long as the paper is published in the public domain, it will get used. By implementers.

2. Fine. Strike "implemented". Replace it with "mathematically defined and proofed". It is still garbage, executed in ignorance. I would be pleased if you would address that point, rather than the distinction between representation and implementation.

> I also don't really get this accusation that these paper unfairly claim to have invented something new. As far as I can tell they usually don't do that; they rather investigate what happens if you do.

When I say "invent" I mean propose a concept, as if it were new (novel, novelty), and provide a proof. Ignorant that the concept is not new, it already exists, under a different name, within the subject space. And they should be aware of that.

> Any paper in particular that you can point to that does this in a way you find unacceptable?

So the way they do it is not the problem, all the papers in this space do it. Even Norbert's Topological Space paper does it.

The unacceptable part is that they are ignorant of (a) the science and established rules in this field, and (b) the related sciences. As detailed above, so I won't repeat the detail again.

The result is two-fold, first others, as well as I, laugh our heads off because what is new to the darling author, is twenty or thirty-year-old hat to us. Second, the damage, others not so knowledgeable use the paper and implement the novel concept, in the novel location, which is wrong, because the science tells us the correct location. Thus it usurps existing science. Thus it is offensive.

All the papers in this field do that, including yours, which is why I am saying it is a great example, and dealing with it will answer many of your questions here. I do not wish to avoid your questions.

All the papers, including the Fagin, Darwen, Pascal papers on the NFs. Don't get me started on that one, please, let's concentrate on this thread and close it.

Most of the books. The Alice book. All the Date and Darwen books.

----

Ok, I have answered pretty much each of your points in detail. You do not need to do the same in return. If you read the whole of this and only respond to the points that are worthy, only the points that are within the declared scope, that are relevant to you, that would be fine with me.

If you wish to leave this post unanswered, and to proceed directly to dealing with your paper, which will expose specific examples of many of the issues in this thread, and address them to closure quickly, that would be the highest rate of progress.

Cheers
Derek

Eric

Jan 31, 2015, 7:20:04 AM
On 2015-01-30, Derek Asirvadem <derek.a...@gmail.com> wrote:
> On Thursday, 29 January 2015 13:09:08 UTC+11, James K. Lowden wrote:
8>< --------
>> (I hereby propose that on c.t.d. we refer to this source as RMDLSBD.)
>
> NO.
>
> I am not saying that you are being dishonest. But that is exactly the
> kind of madness that you guys perform, that we have to swallow, and I
> am not swallowing it.

Why on earth shouldn't there be an accepted abbreviation to refer to this
single very important paper, for everybody here, whether they accept it
or reject the paper, in whole or in part?

(although it should be RMDLSDB to get the words of the title in the
correct order)

> The database world (separate to the tiny fraction of mathematicians
> concerned with the RM, which is say 1%) knows the RM by the term RM.
> It consists of Codd's original paper, easily accessible, plus his 11 other
> papers, noting some are exploratory and retracted, some are commercial
> interest, etc. It is a body of work that is well known.

More or less well known, given that there are more than 11 papers (look
at some bibliographies), some of which are _not_ easily accessible. So
how about listing the 11, categorised into definitive, exploratory,
retracted, commercial interest and any other categories you like.

Thankyou.

Jan Hidders

Jan 31, 2015, 1:07:34 PM
Hi Derek,

Ok. You put quite a few things on the table that I would like to respond to, but I promised my girl-friend to help with cooking our favourite dish, Dutch-Finnish pea soup, and although I do not consider many things more important than database theory, this is one of them.

So a very quick response on a few points:

1. You don't need my permission to discuss one of my papers here, but if you would like my explicit permission, then you have it now. I would btw. be very curious to know what conclusions your client would think he or she could draw from them.

2. An important point of disagreement between us seems to be what is and what is not part of The Relational Model or how a certain group of people or individuals interpreted it in the past. To be honest, although I have opinions on these issues, I find such discussions unscientific and without any merit, even if it is about how Codd himself meant his model to be understood. It is akin to the argument by authority, which is a very weak type of argument. What matters is, which objective arguments were put forward to support that interpretation and what the evidence for its merit was. Which interpretation leads to the most effective DBMSs and what scientific evidence is there for this. I'd prefer it if we could focus on that in future exchanges.

3. I'm not a mathematician, so would like to ask you to not refer to me in that way. I use math in my research. I have real mathematicians as my coauthors. But I do not hold a PhD or MSc in mathematics, so cannot really make that claim.

Tegiri Nenashi

Jan 31, 2015, 3:02:13 PM
On Wednesday, January 28, 2015 at 1:53:01 AM UTC-8, Derek Asirvadem wrote:
> ...you avoid the infamy of frauds (such as Date, Fagin, Darwen, Abiteboul-Hull-Vianu), and you gain instead fame, in the physical universe. You might achieve the ACM Codd Award.

This reminds me of the most hilarious thread of the previous decade, where certain posters were ranked as frauds. AFAIR Marshall Spight was honored to be fraud #1, and Bob Badour was merely fraud #4. I bet if you start a similar thread with Date, Fagin, and the rest, you have their attention!

Derek Asirvadem

Jan 31, 2015, 9:38:28 PM

James

> > On Wed, 28 Jan 2015 01:52:59 -0800 (PST) Derek Asirvadem <derek.a...@gmail.com> wrote:

> > (Not avoiding this point, tomorrow, please.)

> > Accepted. And both of us know that the abstract is not the paper. Details tomorrow.

I am trying to provide the two responses that I owe you, to complete the response to your post. But I am running into difficulty re the amount of detail required, and I need a quick clarification, please.

> It would seem that once you introduce the average programmer to
> a hierarchical filesystem, you can never wean him of the notion that
> that's the "natural" structure for data.

Since this thread is about the RM, I assume you mean, he then implements that thing that he considers the "natural" structure for data, in a RDBMS. (Otherwise the point does not apply.)

Could you please give an example of what one of those guys did, that you consider to be incorrect, and also your version of what that should be, if you corrected it, to what you think the "natural" structure is. Ie. they use it on everything, and you use it sparingly.

Cheers
Derek

Derek Asirvadem

Feb 1, 2015, 12:33:40 AM

Jan

> On Sunday, 1 February 2015 05:07:34 UTC+11, Jan Hidders wrote:

No worries. Take your time. I am happy to wait for the long, considered response. I would like the thread to progress to completion, with considered responses all the way through.

> 1. You don't need my permission to discuss one of my papers here, but if you would like my explicit permission, then you have it now.

I know that. It was a matter of courtesy. Some people may not appreciate having their papers discussed in open forum.

> I would btw. be very curious to know what conclusions your client would think he or she could draw from them.

Since I have already given you a synopsis, a short chronology, I am not sure what you mean. Would you like a more complete one ?

Further, if we are going to work through the main issue in the paper, the conclusions (now) would be detailed (as opposed to the conclusions drawn by hundreds of others, previously, which are incorrect).

I suspect you and I are not on the same page on this one. So let me clarify, and ask for a clarification.

Now in this thread you have stated:

> > But most /now/ understand the relevance of data independence.

(My emphasis.)

To which I replied:

> I suppose I have to trust that you mean that in the fullness of the data integrity as prescribed in the RM.

Which you have not confirmed or denied. Which means, I still do not know the /extent/ to which you understand "data independence", and how it is administered.

So the clarification begs. The paper is Database Programming Languages, 1995. Are you aware:

1. That, on the face of it, your statement above, contradicts, or let's say unofficially retracts, the main thrust, the solution given, in your 1995 paper ?
__ (which is why I stated "... the papers have not been retracted, all we have is a statement from the author in an unrelated post on c_d_t stating that "most /now/ understand the relevance of data independence.")

Or, do you stand, on that paper, now ?

2. Of the Architectural Principle, established as science in our field, that Data must be separated from Process ?
__ (And it follows that there are separate and different methods for Analysing & Designing the two, etc, etc.)
__ It is clearly established in the industry, that implementers are specialists in either the /Data Space/ xor the /Program Space/ (those who cover both are few, and exceptional).

3. That [2] existed, as science, before Codd, 1970, the RM ?
__ (That it has been furthered ever since then, and rendered for whatever context one uses (eg. a RDB; an awk script). That it (as with everything in science) has only gotten stronger as an Architectural Principle, and applicable in more contexts.)

4. That in his paper, the RM, in 1970, Codd gave specific /further/ prescriptions and prohibitions re "data independence", without having to explain what "data independence" meant, because it was well-known ?
__ Which resulted in implementation of those concepts in the commercial RDBMS platforms, as well as in the implementations of RDBs.

5. The result being, that 100% of all controls upon data should be deployed in the RDB ?
__ (As I am sure you know, DKNF alludes to this. We implement a much fuller form, as standard practice.)

6. The corollary being, that controls on the data should not be deployed in the /Program Space/. Eg. OO Objects or classifiers ?
__ And if it is deployed there, (a) it will never be adequate, or (b) as complete, as a deployment in the /Data Space/. Something that has been painfully proved in millions of OO-centric implementations.
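
To make [5] and [6] concrete, here is a minimal sketch, using SQLite from Python purely for convenience. The tables (customer, sales_order), their columns, and the helper try_insert are hypothetical, invented for this illustration; they are not taken from any system discussed in this thread. The point shown is only that rules declared in the /Data Space/ are enforced no matter which program submits the row.

```python
import sqlite3


def build_db():
    """Create an in-memory database whose rules are declared in the schema."""
    conn = sqlite3.connect(":memory:")
    # SQLite enforces foreign keys only when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        CREATE TABLE customer (
            customer_code TEXT NOT NULL PRIMARY KEY,
            name          TEXT NOT NULL
        );
        CREATE TABLE sales_order (
            customer_code TEXT    NOT NULL REFERENCES customer (customer_code),
            order_no      INTEGER NOT NULL,
            quantity      INTEGER NOT NULL CHECK (quantity > 0),
            PRIMARY KEY (customer_code, order_no)
        );
    """)
    return conn


def try_insert(conn, sql, params):
    """Return True if the database accepted the row, False if a constraint rejected it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False


conn = build_db()
accepted_customer = try_insert(
    conn, "INSERT INTO customer VALUES (?, ?)", ("ACME", "Acme Pty Ltd"))
accepted_order = try_insert(
    conn, "INSERT INTO sales_order VALUES (?, ?, ?)", ("ACME", 1, 10))
# No such customer: rejected by the foreign key, not by application code.
orphan_order = try_insert(
    conn, "INSERT INTO sales_order VALUES (?, ?, ?)", ("NOPE", 1, 5))
# Quantity of zero: rejected by the CHECK constraint.
zero_quantity = try_insert(
    conn, "INSERT INTO sales_order VALUES (?, ?, ?)", ("ACME", 2, 0))
```

Any client, in any language, gets the same two rejections, which is the sense in which the control lives with the data rather than in a class or an object.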

Very short answers, please.

> 2. An important point of disagreement between us seems to be what is and what is not part of The Relational Model or how a certain group of people or individuals interpreted it in the past.

Yes and no. I am uncomfortable with the way that is stated. AFAIC, I don't have a disagreement with you (yet), which is why I have taken the approach that I have.

It is only by interacting with people here, mathematicians, theorists, others, that I have been able to determine the problems in this industry, and the causes. As per my OP, I charge that many of the mathematicians and theorists in this space write articles and posts re the RM, and they do so from a position of authority. What I am finding out, to my horror, sadly, is that they actually know sweet fanny adams about the RM. Which means they have no business theorising about it, and their claim to authority is false. I have singled you and James out because you have (a) made offending posts such as I describe, and (b) at the same time exhibit small knowledge of the RM. You represent the whole group, and you are direct enough that I can draw the problem out, confirm it, and thus address a massive hindrance in my profession.

-- Aside --

BTW, note from all this, it is not a personal or personality issue re you or James. I am very grateful that I can interact with you and James, that you two are honest enough, for me to chase this down, and to identify the exact causes of the problem. As I have clearly stated, you two are category [2], victims, you are not the cause category [3]. I can only determine and confirm the cause, through you. I need evidence, specifics, such as in your paper, who, which mentor, gave you that direction, etc. Again, this is not about trashing your paper in a public forum, this is about identifying the specific errors, and under whose direction you made them.

Eg. I took it up first with Date, Darwen, and Fagin, and they would not (could not ?) answer anything directly. They cannot read or understand data models; technical English; etc. So they are proved to be even less scientific and less capable than the posters here, working in the Dark Ages. I endured three years at TTM, so I know their game well. The suggestion that it has anything to do with the RM is false. I didn't drink the Kool-Aid, so I did not fit in.

-- End Aside --

What is and what is not part of the RM, is not an arguable issue. Simply read it, and determine that. There may well be related issues such as, a good standards-compliant implementer takes more out of the RM, gets more direction from it, than a poor one, but that is a separate point. What is in or out is simple.

There is an issue with some theorists, mathematicians (whom I have categorised as [2] Ignorant), in that they invent things (propose new contemplation, representations, and perspectives), that already exist in our science, or more specifically, in the RM. I am addressing that as an important but secondary issue, because that activity is a stupid and avoidable waste of time and money. This industry is very poorly served in terms of theoreticians, there has been zero advancement since Codd (all advances have been by scientists who are employed by the vendors, sans papers published, and to a lesser degree by faithful disciples like me). Those researchers should instead, study the sciences involved; study the RM; and then extend, advance it.

> To be honest, although I have opinions on these issues, I find such discussions unscientific and without any merit, even if it is about how Codd himself meant his model to be understood. It is akin to the argument by authority, which is a very weak type of argument.

Per details above, I do not expect that type of argument.

We do need to take Codd as the authority. Otherwise we can pack our bags and go home.

> What matters is, which objective arguments were put forward to support that interpretation and what the evidence for its merit was. Which interpretation leads to the most effective DBMSs [, RDBs] and what scientific evidence is there for this.

Note my insertion.

Yes, all very good points. But I think even that /could/ be avoided, or let's say, easily stated and closed: the /commercial/ vendors have already done that work; the high-end implementers implement it. Something that the theoreticians do not seem to be able to comprehend. They are about thirty years behind the industry that they theorise for. You will of course, have to accept evidenced reality as scientific evidence, not papers by theoreticians who have already established themselves as un-scientific. Mathematical proofs alone are pure garbage.

> I'd prefer it if we could focus on that in future exchanges.

Sure. As long as we keep progressing on the main issues, and close them.

> 3. I'm not a mathematician, so would like to ask you to not refer to me in that way. I use math in my research. I have real mathematicians as my coauthors. But I do not hold a PhD or MSc in mathematics, so cannot really make that claim.

My apologies. You sure write a lot of papers in the space. Do you then consider yourself a theoretician in this space ?

Actually, if you excised the mathematical proofs from your papers, it would increase their credibility. Because the mathematical proofs have been proven false in the course of time, or were false from the beginning due to their contradicting other sciences.

Cheers
Derek

Eric

Feb 1, 2015, 7:10:06 AM
On 2015-01-30, Derek Asirvadem <derek.a...@gmail.com> wrote:
8>< --------
>==== Task: JKL & JH ====
>
> So here is a small task for you that I think will demonstrate many
> of the issues that we are dealing with in this thread. Given your
> demonstrated skill in the SQLite link and the RM (forget the "rm" of
> the mathematicians), I think this would take 30 mins or less.
>
> Produce a data model for a set of tables that is fully compliant with
> the RM, ie, completely Relational, for the storage of data pertaining to:
> - Countries (Name, FullName, FederationDate, ExpiredDate)
> - State/Province/Territory (Name, FullName, Type)
> - County (Name, Type)
> - Town/Township/Metropolis (Name, Type)
> - Suburb (StreetName(FK), StreetType(FK) )
>
> - StreetName (PK: StreetName (CHAR(50) )
> - StreetType (PK: StreetType (CHAR(3), Name)
>

You left out the Street table!

Derek Asirvadem

Feb 1, 2015, 8:22:57 AM
I am concerned that there has been no progress on the small task.

- As I am sure that you can observe, most of the leg work is done, two chunks remain
- FDs, MVDs, JDs, BBs, FIIKs, are all complete (inasmuch as we assume that there /will/ be a PK for the attributes to be dependent on)
- Normalisation except for PKs and FKs is complete, the tables are practically given
- This is Task 1 in my Db Design course, the participants are supposed to be adults with some experience. It is structured to inform /me/ about the exact knowledge level of the participant re Relational Db Design and application of the RM. Specifically, which key concepts of the RM they do or do not know, and if they know a concept, whether they can apply it or not. Such that I know what items from the syllabus I can skip; teach only clarification of science; teach from scratch; etc. Such that I know what level to pitch the course content at.
- In the case of this thread, it will inform me, as to the degree of detail I need to post, when explaining things, ie. answering your questions, which are outstanding.
- In case it needs to be stated, the choice of keys, and their arrangement is crucial to any database, even more so in a Relational Database (due to related-by-key), the spec removes all other aspects, and allows you to focus on that one task.
- The exercise is not marked
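
To illustrate what "related-by-key" can look like in practice, here is a minimal sketch in SQLite via Python. It is NOT offered as the solution to the exercise, and it deliberately covers only three levels. All key columns (country_code, state_code, county_name) and the sample rows are assumptions made for this illustration. It shows one possible arrangement, in which each child table carries its parent's key as part of its own key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite needs this to enforce FKs
conn.executescript("""
    CREATE TABLE country (
        country_code TEXT NOT NULL PRIMARY KEY,
        name         TEXT NOT NULL UNIQUE
    );
    -- A state is identified within its country.
    CREATE TABLE state (
        country_code TEXT NOT NULL REFERENCES country (country_code),
        state_code   TEXT NOT NULL,
        name         TEXT NOT NULL,
        PRIMARY KEY (country_code, state_code)
    );
    -- A county is identified within its state, so it inherits the whole state key.
    CREATE TABLE county (
        country_code TEXT NOT NULL,
        state_code   TEXT NOT NULL,
        county_name  TEXT NOT NULL,
        PRIMARY KEY (country_code, state_code, county_name),
        FOREIGN KEY (country_code, state_code)
            REFERENCES state (country_code, state_code)
    );
    INSERT INTO country VALUES ('AU', 'Australia'), ('US', 'United States');
    INSERT INTO state VALUES
        ('AU', 'NSW', 'New South Wales'),
        ('US', 'CA',  'California'),
        ('US', 'FL',  'Florida');
    -- The same county name may recur under different states.
    INSERT INTO county VALUES
        ('AU', 'NSW', 'Cumberland'),
        ('US', 'CA',  'Orange'),
        ('US', 'FL',  'Orange');
""")

# Because county carries country_code in its key, the grandparent can be
# joined directly, without passing through the state table.
counties_per_country = conn.execute("""
    SELECT c.name, COUNT(*)
    FROM county AS k
    JOIN country AS c ON c.country_code = k.country_code
    GROUP BY c.name
    ORDER BY c.name
""").fetchall()
```

Whether this particular arrangement is the right one for the exercise is exactly the question the task is meant to expose; the sketch only shows that the choice of keys determines which joins are possible.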

Let's say the exercise is in three parts:

Part A:
__ The exercise, as given in a previous post
__ One small ambiguity has been left in (in case you are not aware, it is a classical old-style teaching method, notice the excitement on the noise channel, it engages personal interest; it inspires [that is, the Holy Ghost, not me!] )
__ 1. During that exercise, you are supposed to expose that ambiguity, and either fix it and complete the exercise (which a capable person does) or demand that I resolve the ambiguity and clarify the spec, before you can complete the exercise (which is what grasshoppers do).
__ 2. There is a small "research" task to be done, via google, ISO, ANSI, etc, to determine the standard-compliant components (but not the key itself) for each table.

Part B
__ I give you the "clarified" spec [A.1], eliminating said ambiguity
__ I give you [A.2] the standard-compliant identifying /columns/ for each table
__ you still have to determine the /keys/ for each table
__ This [A]+[B] remains as the small exercise, the result sought is a completed set of table specs that conform to the RM (whatever you think that is, whatever you mean when you have been making your posts and writing your articles).

Part C
__ There is no work involved in this part.
__ I add a new independent table to the database the participants have constructed, and we evaluate the effect it has on their tables and the SQL/DML code that would have been written to operate against those tables:
____ 1. no changes (meaning that the participant understands and implements Open Architecture; Standards; Ease of Extension. Ie. the high end of the RM).
____ 2. small changes (meaning some understanding, something to teach)
____ 3. structural changes (meaning no understanding, a lot to teach)
__ In terms of learning, this component can be substantial, but in terms of work, there is none: it is just a check, a verification, that classifies the previous work. (Of course, I will skip the learning part for this thread.)

Given that a few days have elapsed, and no solutions have been submitted, I will make it easier for you. Here is the offer: I give you Part [B], eliminating the research task.
- But, re [A.1], that will classify you as a less-than-fully-capable person in this task.
Hence I am offering the above, and not giving it right away, you need to confirm that you want it.

Cheers
Derek

Eric

Feb 1, 2015, 11:40:06 AM
On 2015-02-01, Derek Asirvadem <derek.a...@gmail.com> wrote:
8>< --------
> __ One small ambiguity has been left in (in case you are not aware,
> it is a classical old-style teaching method, notice the excitement on
> the noise channel, it engages personal interest; it inspires [that is,
> the Holy Ghost, not me!] )

Of course, it was not meant for me, but I saw it, and then thought
about it a little. One of the thoughts that I had was that there seemed
to be something missing at the end of the list of tables. The other
thoughts that I had did not encourage me to pursue it, but then a small
bell started ringing in the back of my mind (speaking figuratively,
of course). So I dug around in my very miscellaneous collection of
database-related stuff downloaded from the internet because it might be
interesting, and I found something called Order DM Advanced.pdf . This
prompted me to make an observation. Merely an observation, no excitement.

I will make another observation, namely that your task has to be treated
as an exam question. That is, do not consider whether the given partial
specification makes any sense in the real world, but just complete it
as if it did. That observation could also be a statement that, in my
opinion, the analysis that led up to this design is incomplete and/or
incorrect. Or maybe it's just the scope.

Good luck!

James K. Lowden

Feb 1, 2015, 6:49:58 PM
On Sat, 31 Jan 2015 18:38:26 -0800 (PST)
Derek Asirvadem <derek.a...@gmail.com> wrote:

> > It would seem that once you introduce the average programmer to
> > a hierarchical filesystem, you can never wean him of the notion that
> > that's the "natural" structure for data.
>
> Could you please give an example of what one of those guys did, that
> you consider to be incorrect

In two cases that I had direct contact with, the database design
reflected some form of object orientation. It used the table structure
to implement an elaborated version of the entity-attribute-value table,
and self-joins to construct what you or I would have designed as
tables. Naturally, the design was, er, designed to be "flexible".

The resulting databases were impossible to understand, comprised solely
abstract nouns, and defeated the system's ability to enforce integrity
constraints. Queries were tedious to read and executed poorly. When
I was approached for ways to make it faster (because speed is the only
thing that every application programmer and his manager understands),
they were nonplussed when I suggested a Dumpster, a clean slate, and
a class in database design, not necessarily in that order.
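
For anyone who hasn't met the pattern, a minimal sketch of the difference, in SQLite via Python. The tables (eav, employee) and the sample data are hypothetical, made up for this illustration; they are not the schemas from the two cases described above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- EAV: every fact is an (entity, attribute, value) triple of untyped text.
    CREATE TABLE eav (
        entity_id INTEGER NOT NULL,
        attribute TEXT    NOT NULL,
        value     TEXT,
        PRIMARY KEY (entity_id, attribute)
    );
    -- Conventional table: one column per attribute, with declared rules.
    CREATE TABLE employee (
        employee_id INTEGER NOT NULL PRIMARY KEY,
        name        TEXT    NOT NULL,
        salary      INTEGER NOT NULL CHECK (salary > 0)
    );
""")

conn.executemany("INSERT INTO eav VALUES (?, ?, ?)", [
    (1, "name", "Ann"), (1, "salary", "50000"),
    (2, "name", "Bob"), (2, "salary", "-5"),  # nonsense, accepted silently
])

# Rebuilding one logical "row" takes a self-join per attribute.
eav_rows = conn.execute("""
    SELECT n.entity_id, n.value, s.value
    FROM eav AS n
    JOIN eav AS s
      ON s.entity_id = n.entity_id AND s.attribute = 'salary'
    WHERE n.attribute = 'name'
    ORDER BY n.entity_id
""").fetchall()

# The same nonsense value is refused by the conventional table's CHECK.
try:
    conn.execute("INSERT INTO employee VALUES (2, 'Bob', -5)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
```

With two attributes the self-join is merely ugly; with dozens, every query becomes a wall of joins, and the system has no column on which to hang a constraint.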

--jkl


James K. Lowden

Feb 1, 2015, 6:50:09 PM
On Fri, 30 Jan 2015 03:29:45 -0800 (PST)
Derek Asirvadem <derek.a...@gmail.com> wrote:

> > On Thursday, 29 January 2015 13:09:08 UTC+11, James K. Lowden
> > wrote: On Wed, 28 Jan 2015 01:52:59 -0800 (PST)
> > Derek Asirvadem <derek.a...@gmail.com> wrote:
...
> > > 1. I take issue with your proposal [A]. I charge that it is
> > > completely and totally false. But that is not as important as the
> > > implication [B], which is totally and completely false.
> >
> > Regarding [A], it was Codd's observation. He had to invent the term
> > "hierarchical model" ex post facto to name the thing he was
> > comparing the relational model to.
>
> That is not correct.
>
> I was alive and kicking in those days, and I had moved from kicking
> inflated rubber balls to kicking IMS off its high horse. I was an
> Engineer (it meant something different in those days) for Cincom,
> with TOTAL as our Network DBMS. About a third of us were
> mathematicians (not me of course). All of us were scientists. We
> modelled before we wrote a line of code. Codd didn't invent the
> term. The Hierarchical Model was a fact, and we modelled (we did
> help IMS customers see the light, yes).

Yes, I remember Codasyl, too, and Charles Bachman Overdrive. ;-)

Acknowledged, those systems modelled data as hierarchy. And they were
useful, sure. I really doubt, though, that they construed their
systems as being based on "the" (or even "a") hierarchical model. The
idea was simply implicit.

> We didn't have software to draw models on PCs, but we had excellent
> hand-drawn models, a set of symbols, notation, rules, stencils, etc.
> given to us by authorities in our field.

Yes, which you very much needed, as I'm sure you also remember,
both because of the lack of data independence, and because navigating
the hierarchy required having a clear picture in mind of the database's
structure.

> ... Xerox Star system ...

Enjoyed your reminisce. :-)

> > It's not even a "model", though, insofar as
> > it has no mathematical foundation.
...
> There are millions of models in various fields of scientific
> endeavour that do not have a mathematical foundation.

Sure, but so what? Are there any in computer science without a
mathematical foundation? Even if there are, the distinguishing
feature of the relational model is that it's an inferential system. If
we define a "model" as something with an algebra or that supports first
order logic, then we have to exclude the hierarchical model from the
set.

> We had both the HM and the NM as models, with a set notation, and
> they certainly had a scientific and theoretical basis.

I would like to know more about that. To my knowledge it's not
possible.

> > There is no "graph algebra", no set of operations closed over the
> > domain. Graph theory offers no sets, bears no connection to
> > predicate logic.
>
> So what ? A scientific person can, look at a graph, a tree, and
> instantly determine that it has integrity or not. What theory is he
> practising ? What algebra is he using ?

None, and he's completely at a loss for making any logical inferences
from it. As far as I know, there's nothing like e.g. SQL EXISTS for
graphs. There's no way to say "find all <graph expression> where
<subgraph expression> = <graph expression>". Sure, there are ways to
iterate over the nodes. There are even things like XPATH to find nodes
having particular properties. But the solutions are all ad hoc and
complex, and still come nowhere near what RM does.
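
As a sketch of the kind of declarative question I mean, here is EXISTS applied to a graph that has been stored as relations (SQLite via Python). The node and edge tables and their contents are invented for the illustration. The query asks for every node that has at least one child which itself has no children, and nothing in it says how to walk the graph.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE node (node_id TEXT NOT NULL PRIMARY KEY);
    CREATE TABLE edge (
        src TEXT NOT NULL REFERENCES node (node_id),
        dst TEXT NOT NULL REFERENCES node (node_id),
        PRIMARY KEY (src, dst)
    );
    INSERT INTO node VALUES ('a'), ('b'), ('c'), ('d');
    -- a -> b, b -> c, b -> d
    INSERT INTO edge VALUES ('a', 'b'), ('b', 'c'), ('b', 'd');
""")

# Nodes for which there EXISTS an outgoing edge to a node
# that has NO outgoing edges of its own (a leaf).
parents_of_leaves = conn.execute("""
    SELECT n.node_id
    FROM node AS n
    WHERE EXISTS (
        SELECT 1
        FROM edge AS e
        WHERE e.src = n.node_id
          AND NOT EXISTS (SELECT 1 FROM edge AS e2 WHERE e2.src = e.dst)
    )
    ORDER BY n.node_id
""").fetchall()
```

Here only 'b' qualifies: 'a' has a child, but that child is not a leaf.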

I think it was Don Chamberlin who remembered Codd going around
scribbling 12-word relational queries equivalent to several pages of IMS
code. Relational didn't win because it was theoretically superior.
Relational won because its theoretical superiority permitted
implementation of superior systems, which were adopted because
they were easier to use, theory or no theory.

> Ok, fair enough, a mathematician might have to sit down and produce a
> new algebra, and a calculus to work with it

What I'm saying is that the mathematics doesn't exist. If 44 years hasn't been
long enough, I suspect "a year or three" more won't be, either.

> Take a look at Norbert's papers about his field (Architectural
> Spaces). AFAIC, they are excellent, they put the papers in the RDB
> field to shame. Well written, well structured, good explanations,
> and finally a mathematical proof.

I don't feel qualified to comment on his work. I don't know anything
about topological spaces. I was waiting to see how the dust settles.

> Therefore I say, the notion that the Hierarchical Model rests on a
> theoretical void, is without merit, and in denial of historical
> facts. The HM and the NM had science and theory behind it, as well
> as all the articles required for modelling (just not PCs with drawing
> tools).
>
> (It may well not have a mathematical proof, yes.)

I think we just mean different things by "hierarchical model", and by
"model". I still say Codd invented the term simply to contrast it with
his work. If you believe otherwise, please cite one paper in database
research before 1969 that refers to it.

I acknowledge the existence of IMS et al., and recognize that the data
were modelled on a hierarchy. I'm only saying that "model" lacked any
of the expressive power of the RM, to the extent they're not really
comparable. Hence it's a pretense even to call it a "model".

> > > I declare, the Relational Model is not a replacement for, or a
> > > substitution for the Hierarchical Model; it is a progression of
> > > it.
> >
> > Evidence, please.
>
> (Not avoiding this point, tomorrow, please.)

I haven't forgotten. :-)

I've deleted quite a bit of our back-and-forth because ISTM we meant
different things by "model".

> Ok. So then what, in your considered opinion, is the natural or
> "natural" structure for data, or is there none ?

Not to be sophistic about it, but I would say that data don't occur in
nature and have no natural structure. As a human construct, they
have no more "natural" structure than do novels or cities.

> Question for you. In one para or less, what is your considered
> opinion of the Alice book.

You're on record as calling it an "abortion". I don't know what about
it earned your enmity. I have an interest in database theory, and
query language complexity. I would like to see SQL replaced by a
language that is simpler, more expressive, less verbose, and that better
represents relational algebra. I read Part E with great interest,
although I don't claim to have mastered the subject.

I think it was from that book that I learned that there are some
problems that cannot be expressed in FOL. Hierarchies are in fact an
example: some problems require recursion, which, as you know, is
inexpressible in relational algebra. The book is quite deliberate
about the effects of introducing more power to a language. I have
not read a theoretical treatment of SQL's recursion. It seems like a
good job to me, a minimal extension. But it also seems harder to use
than necessary.
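
For reference, a minimal sketch of that extension, SQL's recursive common table expression, run in SQLite via Python. The part table and its rows are made up for the illustration. The query collects every descendant of one root to any depth, which is the transitive-closure kind of question that plain relational algebra cannot express.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE part (
        part_id   TEXT NOT NULL PRIMARY KEY,
        parent_id TEXT REFERENCES part (part_id)   -- NULL marks the root
    );
    INSERT INTO part VALUES
        ('car',    NULL),
        ('engine', 'car'),
        ('wheel',  'car'),
        ('piston', 'engine');
""")

descendants = conn.execute("""
    WITH RECURSIVE descendant (part_id, depth) AS (
        -- Anchor member: start from the chosen root.
        SELECT part_id, 0 FROM part WHERE part_id = 'car'
        UNION ALL
        -- Recursive member: add the children of rows found so far.
        SELECT p.part_id, d.depth + 1
        FROM part AS p
        JOIN descendant AS d ON p.parent_id = d.part_id
    )
    SELECT part_id, depth FROM descendant ORDER BY depth, part_id
""").fetchall()
```

The anchor-plus-recursive-member shape is the whole of the extension; the verbosity of writing it out is the "harder to use than necessary" part.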

> In one sentence or less, how relevant is it to the field of
> Relational Database design.

Not relevant at all.

> Please do not avoid [F].

Not sure what you mean by "avoid". I didn't intend to avoid anything.

--jkl

James K. Lowden

Feb 1, 2015, 6:54:24 PM
On Sun, 1 Feb 2015 05:22:56 -0800 (PST)
Derek Asirvadem <derek.a...@gmail.com> wrote:

> I am concerned that there has been no progress on the small task.

I'm not intending to spend time on it. I'm sorry if that offends you,
and I realize that may lower your esteem of me. I have bigger fish to
fry. I hope you understand.

--jkl

Derek Asirvadem

Feb 1, 2015, 10:19:41 PM

James

Thank you for your responses. Just a quick response for now, while I compile a full response.

First, I happily acknowledge that you are definitely not in the difficult-to-impossible category re implementation of hierarchical data. Actually as an implementer, period.

Second, I really appreciate the directness and concision of your responses.

Third, it is uncanny to me at least that you and I are so close on some items, so much in agreement, but so far apart on others. It feels like (speaking as a non-scientist!) you and I are playing on the same team, against the same enemy, and we are among the better players ... but you have a particular bone that is missing, and thus you have certain blind spots, can't-dos. And I readily acknowledge that if you use the same analogy, you would probably say the same thing about me. It is in no way an insult, more of a concern, how do I get this team member past his blind spot, how do I get him to insert that missing bone. Say the bone is a femur.

The missing bone is of course your perception of hierarchies. To you, it is entirely outside the RM, your body (note your emotive comments), the rest of which is so integrated.

Something earlier than the Alice book (Date ? Darwen ?) planted that notion in you. The Alice book has reinforced that notion (I will detail that later), so you see the bone as a tool, say a hockey stick, external to your body, the RM. And you think I am nuts, because I play, without a hockey stick.

And for me, who has read many of those books, binned them, retaining only Codd, for me who has practised only Codd, hierarchies are handled, completely and totally, within the RM. And SQL as the one-and-only data sub-language (definitely has faults, but let's not get distracted) that implements the RM, to whatever degree (let's not argue about "relations"), handles hierarchies completely and easily. Iff you understand the first sentence in this para. You cannot execute the second sentence otherwise, of course you will have "difficulty". Thus you revert to the bone, the hockey stick, outside your body, the RM.

You have been subverted.

Unless you are aware that you have been subverted, you will not be open to anything I say about this matter. You will not be able to hypothesise and examine what I say, the examples I give, or provide me with specific examples.

Thanks again. More, later.

Cheers
Derek

Jan Hidders

Feb 2, 2015, 5:47:19 AM
Op zondag 1 februari 2015 06:33:40 UTC+1 schreef Derek Asirvadem:
> Jan
>
> > On Sunday, 1 February 2015 05:07:34 UTC+11, Jan Hidders wrote:
>
> No worries. Take your time. I am happy to wait for the long, considered response. I would like the thread to progress to completion, with considered responses all the way through.
>
> > 1. You don't need my permission to discuss one of my papers here, but if you would like my explicit permission, then you have it now.
>
> I know that. It was a matter of courtesy. Some people may not appreciate having their papers discussed in open forum.

Ok. The courtesy is appreciated.

> > I would btw. be very curious to know what conclusions your client would think he or she could draw from them.
>
> Since I have already given you a synopsis, a short chronology, I am not sure what you mean. Would you like a more complete one ?

More detail, as in where the devil lives. Because I suspect more is concluded from the paper than is actually warranted and meant by the author. But if we get to that later, that's fine.

> I suspect you and I are not on the same page on this one. So let me clarify, and ask for a clarification.
>
> Now in this thread you have stated:
>
> > > But most /now/ understand the relevance of data independence.
>
> (My emphasis.)
>
> To which I replied:
>
> > I suppose I have to trust that you mean that in the fullness of the data integrity as prescribed in the RM.
>
> Which you have not confirmed or denied. Which means, I still do not know the /extent/ to which you understand "data independence", and how it is administered.

Administered? That seems a strange word to use here. I'm also not sure what you meant by "in the fullness of" here, so that makes it a bit hard to answer. Am I aware how current DBMSs realise (to some extent) data independence? Yes, I am. Am I aware of the available techniques that are not yet implemented? Yes, I am. Am I aware that under certain approaches there is a trade-off between the extent of data independence and complexity of integrity checking? Yes, I am. Not sure if that answered your question, but that's the best I can come up with at the moment.

> So the clarification begs. The paper is Database Programming Languages, 1995. Are you aware:

Yikes! My very first paper that I wrote as a beginning PhD student! :-) Ok. This is going to be interesting.

> 1. That, on the face of it, your statement above, contradicts, or let's say unofficially retracts, the main thrust, the solution given, in your 1995 paper ?
> __ (which is why I stated "... the papers have not been retracted, all we have is a statement from the author in an unrelated post on c_d_t stating that "most /now/ understand the relevance of data independence.")
>
> Or, do you stand, on that paper, now ?

I'm not sure which statement you mean, but I don't think I've said anything that strongly contradicts the results and assumptions in this paper. I'm also not sure what you mean by "presenting a solution" here. The paper does not introduce a new model, it studies an existing one and focuses on reasoning over union types within that model. But the results actually carry over into other data models.

You might mean that by studying that data model the paper implicitly gives a vote of approval to it. Do I still stand by that vote? Mostly yes, but these days I tend to think that simpler graph-based models (but with mechanisms to model nested values) would be more useful and effective.

> 2. Of the Architectural Principle, established as science in our field, that Data must be separated from Process ?
> __ (And it follows that there are separate and different methods for Analysing & Designing the two, etc, etc.)
> __ It is clearly established in the industry, that implementers are specialists in either the /Data Space/ xor the /Program Space/ (those who cover both are few, and exceptional).

I am aware of that, but not sure why you think this is relevant for the paper.

Btw. when you say "science" I have the impression you actually mean "engineering".

> 3. That [2] existed, as science, before Codd, 1970, the RM ?
> __ (That it has been furthered ever since then, and rendered for whatever context one uses (eg. a RDB; an awk script). That it (as with everything in science) has only gotten stronger as an Architectural Principle, and applicable in more contexts.)

That engineering principle has a long and venerable tradition, yes.

> 4. That in his paper, the RM, in 1970, Codd gave specific /further/ prescriptions and prohibitions re "data independence", without having to explain what "data independence" meant, because it was well-known ?
> __ Which resulted in implementation of those concepts in the commercial RDBMS platforms, as well as in the implementations of RDBs.

To some extent. From my colleagues who were around at the time I know that the concept already existed, but not everybody understood it in the same way.

> 5. The result being, that 100% of all controls upon data should be deployed in the RDB ?
> __ (As I am sure you know, DKNF alludes to this. We implement a much fuller form, as standard practice.)

Not sure why you drag poor little DKNF into this, since that only deals with a very small part of this, but, yes.

> 6. The corollary being, that controls on the data should not be deployed in the /Program Space/. Eg. OO Objects or classifiers ?
> __ And if it is deployed there, (a) it will never be adequate, or (b) as complete, as a deployment in the /Data Space/. Something that has been painfully proved in millions of OO-centric implementations.

Definitely, yes.

[... big snip ..]

> > To be honest, although I have opinions on these issues, I find such discussions unscientific and without any merit, even if it is about how Codd himself meant his model to be understood. It is akin to the argument by authority, which is a very weak type of argument.
>
> Per details above, I do not expect that type of argument.
>
> We do need to take Codd as the authority. Otherwise we can pack our bags and go home.

Quite the contrary. Codd's contributions were fantastic, some of them anyway, but they are by no means the last word on these matters, if only because technology and insight have progressed since then.

> > What matters is, which objective arguments were put forward to support that interpretation and what the evidence for its merit was. Which interpretation leads to the most effective DBMSs [, RDBs] and what scientific evidence is there for this.
>
> Note my insertion.

Yes, noted. But I disagree with the R there.

> Yes, all very good points. But I think even that /could/ be avoided, or let's say, easily stated and closed: the /commercial/ vendors have already done that work; the high-end implementers implement it. Something that the theoreticians do not seem to be able to comprehend. They are about thirty years behind the industry that they theorise for. You will of course, have to accept evidenced reality as scientific evidence, not papers by theoreticians who have already established themselves as un-scientific. Mathematical proofs alone are pure garbage.

Mathematical proofs can only prove mathematical facts, not whether a certain model is practical or not, although certain results can give some support. There the proof is really in the eating of the pudding. Any other position would be unscientific.

> Actually, if you excised the mathematical proofs from your papers, it would increase their credibility. Because the mathematical proofs have been proven false in the course of time, or were false from the beginning due to their contradicting other sciences.

Mathematical proofs can only be proven false by mathematics. But perhaps you mean that the underlying assumptions about how the models and assumptions are relevant in the real world might be shown to be incorrect by other sciences. Yes, that can happen. Has this happened in this case? I don't think so. The different OO data models have mostly fallen out of favour for non-technical reasons.

-- Jan Hidders

Derek Asirvadem

Feb 2, 2015, 6:46:19 AM
James

> On Monday, 2 February 2015 10:49:58 UTC+11, James K. Lowden wrote:
It looks like you are side-stepping the request, maybe not. Excellent description, etc, but that is neither hierarchical nor non-hierarchical: any EAV structure may, or may not be an hierarchy. And that description is not specific enough to prove anything one way or the other.

I am not a proponent of EAV, but I have corrected a few EAV systems from the dumpster; added a bit of science; some constraints (you most definitely can add constraints, once the EAV structures have been elevated); etc.

Over here, we do not "write" SQL. I think it was 1993 when I stopped doing that. That is one of the myths that theoreticians have about us, manufacture any fiction to keep the myth alive, to feel inflated. /They/ are used to typing strings of relation definitions, and /they/ think the same of SQL. Yeah, sure THEY write SQL. They do not understand IDEs or models, so they are clueless re what they do, what their purpose is, etc. We push a button on the screen and GENERATE SQL, hundreds of tables, procs, functions, of the current version of the model, with one click.

Tedious. Huh ? It is a data sub-language. If the column names are meaningful, they are long, then the SQL generated is long. If you use an IDE, it fills all that in as you type. The better IDEs provide drop-downs and point-and-click. Costs a lousy fifty bucks. When I walk into a site, if the customer doesn't have one, I down-load a free version of an IDE that I do know, usually the sexy functions are blocked, but the basic functions are all I need.

So, yeah, sure, it may well be tedious for those who live in the dark ages, and type everything, but it is more than silly to apply that impediment to the rest of the planet.

The users, of course, have modern reporting tools, such that they NEVER type.

Now the key element in your example is that they did not use Views. That eases the EAV problem greatly. And if you understand that we GENERATE View definitions, we do not type them, and mess with their hair, then most of the pain (after elevation), most of the tedium, the monotony that you guys mythologise about, is removed.

On one occasion I wrote an extension to the Sybase catalogue, so that we could automate the construction of views, but with good planning and minimising EAV, you don't need to go that far. So I didn't bother to publish it.
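To make the idea of generated (not typed) View definitions over an EAV structure concrete, here is a minimal sketch. It is an illustration only, not the Sybase catalogue extension mentioned above: the table `eav`, its columns, the view name `product_v` and the helper `build_pivot_view_sql` are all hypothetical, and SQLite (via Python's standard `sqlite3` module) stands in for the real platform.

```python
import sqlite3

def build_pivot_view_sql(view_name, attributes):
    """Generate a CREATE VIEW statement that pivots EAV rows into columns."""
    # Names are interpolated into SQL text, so accept plain identifiers only.
    for name in [view_name, *attributes]:
        if not name.isidentifier():
            raise ValueError(f"not a plain identifier: {name!r}")
    columns = ",\n".join(
        f"    MAX(CASE WHEN attribute = '{a}' THEN value END) AS {a}"
        for a in attributes
    )
    return (
        f"CREATE VIEW {view_name} AS\n"
        f"SELECT entity_id,\n{columns}\n"
        f"FROM eav\n"
        f"GROUP BY entity_id"
    )

conn = sqlite3.connect(":memory:")
# Hypothetical EAV table: one row per (entity, attribute) pair.
conn.execute(
    "CREATE TABLE eav ("
    " entity_id INTEGER NOT NULL,"
    " attribute TEXT NOT NULL,"
    " value TEXT,"
    " PRIMARY KEY (entity_id, attribute))"
)
conn.executemany(
    "INSERT INTO eav VALUES (?, ?, ?)",
    [(1, "name", "Widget"), (1, "colour", "red"), (2, "name", "Gadget")],
)

# The attribute list is read from the data itself; in a real system it
# would more likely come from a catalogue or from the data model.
attributes = [
    r[0]
    for r in conn.execute("SELECT DISTINCT attribute FROM eav ORDER BY attribute")
]
conn.execute(build_pivot_view_sql("product_v", attributes))

rows = conn.execute(
    "SELECT entity_id, colour, name FROM product_v ORDER BY entity_id"
).fetchall()
print(rows)
```

Once the view exists, users and reports can select from `product_v` as if it were an ordinary table, and the view can be regenerated whenever the set of attributes changes.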

Back to the request for example. Ok, so you can't give me an example of an implementation where the person incorrectly implemented an hierarchy for non-hierarchical data (because he thought hierarchies were "natural" or whatever); nor one of how you corrected it.

And you won't do the exercise, which would demonstrate several things re your knowledge of the RM, one aspect specifically being hierarchies, as prescribed in the RM.

I looked at your nonSqLite link a bit more. Very good work. But for our purposes, there is no model, and it is not clear to me what structures are in the database (relational or not). I am not sure what a "virtual table" in that platform is (yes, I can guess). Perusing the code, it looks to me that you are using ROW_IDS, which is anti-relational. To be clear, you are /treating/ the data as hierarchical, and your /program code/ handles it, but your /data/, which clearly is hierarchical, is not implemented as hierarchical data. (If I have got any of that wrong, please correct me.) Not a surprise, given that you think hierarchies are external to the RM. I would say that stands as evidence that you do not implement the RM (ie. besides the non-implementation of the hierarchical component of the RM).

To sum:
- given that you reject the HM as HM, the concept of the HM (you have not given me a word that you prefer)
- given that you have no specific examples of improper implementation of hierarchies, or corrections thereto
- given that you consider hierarchies and relational handling of them to be outside the RM
- given that the example of your work presented as "hierarchical" is contrary to the RM, and absent the HM component of the RM
- given that you will not demonstrate your ability to implement classic hierarchical data in a Relational Database
I have to say:
- to that extent, you are severely hindered in the implementation of hierarchies in Relational Databases
- (others are crippled, because they have a slew of hindrances, you have just one, and it is a major one)

Therefore my statement:
> First, I happily acknowledge that you are definitely not in the difficult-to-impossible category re implementation of hierarchical data. Actually as an implementer, period.

is incorrect. I retract it.

You are definitely a capable implementer, period. With regard to proper perception of, and the implementation of hierarchies, as they appear in data, and as prescribed in the Relational Model, which has been proved to be difficult-impossible to the theoreticians in this space (including those who write books on the subject), your declarations re your ability have not been demonstrated. Further, you refuse to execute an exercise that will demonstrate those declarations.

Let's try another avenue, in order to avoid stalling on this point. I want to close the issue re the perception and implementation of an hierarchy. Since you are so busy with the frying pan, let me do the work. Give me a detailed example (enough detail to implement; definitions in technical English, not gibberish), of an hierarchy that you or your mentors (authors whom you drink up) declare to be impossible to implement or outside the RM, and I will give you, overnight, the hierarchy, with full (i) integrity (ii) power and (iii) speed, that only the RM can provide, and within the express framework of the RM.
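For readers following the argument, here is a minimal sketch of what "integrity enforced in the database, not in program code" can look like for a tree. It is one common textbook arrangement (a self-referencing foreign key), not necessarily the design either poster has in mind; the table `part`, its columns and the helper `try_insert` are hypothetical, and SQLite via Python's standard `sqlite3` module is used only because it is easy to run.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled

# Hypothetical parts tree: every non-root row must name an existing parent.
conn.execute("""
    CREATE TABLE part (
        part_code   TEXT PRIMARY KEY,
        parent_code TEXT REFERENCES part (part_code),  -- NULL marks the root
        CHECK (parent_code <> part_code)               -- no row is its own parent
    )
""")
conn.executemany(
    "INSERT INTO part VALUES (?, ?)",
    [("bike", None), ("wheel", "bike"), ("spoke", "wheel")],
)

def try_insert(code, parent):
    """Return True if the DBMS accepts the row, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO part VALUES (?, ?)", (code, parent))
        return True
    except sqlite3.IntegrityError:
        return False

orphan_accepted = try_insert("bell", "scooter")    # parent does not exist
self_parent_accepted = try_insert("loop", "loop")  # violates the CHECK
children_of_bike = [
    r[0]
    for r in conn.execute(
        "SELECT part_code FROM part WHERE parent_code = ? ORDER BY part_code",
        ("bike",),
    )
]
print(orphan_accepted, self_parent_accepted, children_of_bike)
```

Both bad rows are rejected by declared constraints rather than by application logic. The sketch does not prevent longer cycles (A under B under A), and it uses a nullable column for the root; a stricter design would address both.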

That is second preference, because although demonstrating my skills, and puncturing one of your myths, it is not identifying where your impediment lies; it is not removing it, which the exercise would do, which is why that remains first preference. I don't need to be cured. I don't get anything out of demonstrating something that I have demonstrated a couple of dozen times.

> Yes, I remember Codasyl, too, and Charles Bachman Overdrive. ;-)

Yes, Charles was never as good without Turner.

> Acknowledged, those systems modelled data as hierarchy. And they were
> useful, sure. I really doubt, though, that they construed their
> systems as being based on "the" (or even "a") hierarchical model. The
> idea was simply implicit.

I have already provided a fair amount of evidence that it wasn't "simply implicit".

> > We didn't have software to draw models on PCs, but we had excellent
> > hand-drawn models, a set of symbols, notation, rules, stencils, etc.
> > given to us by authorities in our field.
>
> Yes, which you very much needed, as I'm sure you also remember,
> both because of the lack of data independence, and because navigating
> the hierarchy required having a clear picture in mind of the database's
> structure.

That is two separate points.

Data Independence. Huh ? That is nonsense. Before the RM, we had the concept, well and truly. It was well-known that a "good" program was "good" precisely because it had data independence, and a "bad" program was "bad" precisely because it had no data independence. That is separate to the fact that, because we worked with IMS or TOTAL or ISAM (for that matter), there were limits to the level of data independence that we could implement. A good programmer implemented the highest level of data independence that the file or database access method allowed. The result was, we could change one or two programs in the system, or one or two files, without having to change the whole system (ie. all the programs that made up the system). Whereas a poor programmer wrote systems unaware of Data Independence, such that you could not change such systems without changing a large part of it.

In my apprenticeship, working for a TimeShare service bureau, I spent half my life upgrading programs written by customers, such that they had data independence, such that we could make point changes.

We had another form of data independence, within the set of files taken as a unit, or within the HM database alone, or the NM database alone, separate to the programs. If the files were designed well, within the limits of the platform of course, it improved the data independence, if they were poorly designed, it reduced it.

Then the RM came along. Separate to the notion of sets and relations, yes, it gave us specific prescriptions and prohibitions, and a couple were specifically re data independence. We could not employ any of that, until 1984, when a genuine RDBMS platform was delivered, so until then we were implementing pre-relational data independence, a well-understood concept. And after 1984, sure, we implemented a much greater level of data independence, that of the RM.

And again, those of us who followed the RM as Law, implemented way more Data Independence, etc, than those of you, who picked and chose the bits of the RM that you understood and accepted. Refer my interaction with Jan. He is now accepting far more from the RM, and that deems his work from ten and twenty years ago dead set wrong. Meanwhile over on this side, we did not take twenty years to accept it, we always accepted it.

Clear picture in mind. Well that is my point, not yours. You need a "clear picture in mind" if you are going to implement ANYTHING, if we are going to navigate ANYTHING. A database today, a pre-relational file system or database. A network topology. That is why we have IDEF1X, a standard. Instead of funny pictures in Visio and UML. In those pre-relational days we had different diagrams to provide us with a "clear picture in mind", and a standard for the symbols, which were available as stencils at the newsagent. It proves once again, we had a MODEL.

> Enjoyed your reminisce. :-)

Yes, the vertical screen made people who hadn't seen one, just stop and stare. The first one had just the one icon, a page with a dog-ear.

> the distinguishing
> feature of the relational model that it's an inferential system.

Yes. Agreed. As long as you do not mean ONLY inferential. It is perfect for OLTP as well as DSS/OLAP, in the same database.

> If
> we define a "model" as something with an algebra or that supports first
> order logic, then we have to exclude the hierarchical model from the
> set.

I think we have argued that one to death. Yes, agreed. But it is silly to expect that level of definition from something that pre-dates the first occurrence of that level of definition. Very, very, silly. Look, the electric guitar was invented, say in 1960. It is silly, very silly, to say that guitars prior to 1960 were not precise because they did not have the precision that was required to use electricity. You are not that silly.

Second, you keep repeating your definition of "model". Sure it serves your purpose. Sure, I agree, that under that definition blah blah blah. But the problem is, you are denying that any other definition could be possible, you are denying historical facts, physical evidence. Very unscientific. There, you are crossing over from your esteemed scientific person into a religious freak.

> > We had both the HM and the NM as models, with a set notation, and
> > they certainly had a scientific and theoretical basis.
>
> I would like to know more about that. To my knowledge it's not
> possible.

Done to death. Evidence given. Read again.

It is not "to your knowledge", it is despite your knowledge, and to your religious convictions.

You seem to misunderstand, conveniently, I suppose, despite the details I have given, that I agreed there is no published theoretical basis for the HM, silly to expect, etc. Given my details, "set notation" above means set of notations, symbols, not notation of sets, which concept did not exist prior to Codd, and therefore is silly to expect, blah blah blah.

> > So what ? A scientific person can, look at a graph, a tree, and
> > instantly determine that it has integrity or not. What theory is he
> > practising ? What algebra is he using ?
>
> None, and he's completely at a loss for making any logical inferences
> from it.

Well, let me assure you that when I look at a graph, a tree, and instantly determine that it has integrity or not, I am practising a scientific theory. And further, I make logical inferences from that. And further, I am not the only person on the planet who can do that. Ok, I accept that you can't.

And I don't have the time to explain how it is done. Rest assured, I do not take drugs; or drop ACID; or drink the Kool-Aid.

One specific thing that a person who is reasonably familiar with the RM would understand: In the scientific exercise above, I am applying Codd's __Hierarchical Normal Form__. But you are in denial that such exists.

> I'm saying is that [the] mathematics [for graph theory] doesn't exist. If 44 years hasn't been
> long enough, I suspect "a year or three" more won't be, either.

You missed my point. And you are mixing up my comments re the void of theoreticians in the field of databases vs the field of graph theory.

For the relational database field, they have never tried, because, like you, they have a blind spot there, where hierarchies sit, because your religious scripture says it does not exist. For the forty four years since relational theory has been made public.

Re the graph theory field, I have no idea what their timeframes are, it just isn't that forty four years. While it would be fantastic if they applied Relational Theory to graph theory, etc, etc, they are under no obligation to do so. And that concerns The Relational Theory, after Codd. If you are talking about one of the 56 "relational theories" other than Codd's, that are so beloved of theoreticians who produce nothing in this space, then it is a Very Good Thing that the graph theorists do not touch it with a ten foot pole. They have not succumbed to some strange religion.

Let's put it another way, strictly from the pov of a scientific high end implementer. After I understood Codd's RM, I perceive all data as Relational. I don't need a RDBMS, I implement all data in Relational form, 3NF on steroids (all the NFs that are written plus all the NFs that will be defined in the future). In any language or platform. I have done it in ISAM, COBOL, DIBOL, various incarnations of C; awk, etc. When I describe data to people, outside an RDB context, I give them an IDEF1X data model (minus the attribute level), and they love it.

The RM is about a new way of perceiving data. It is not about an RDBMS. No data is excluded. I believe you will agree with me 100% thus far.

Now for the kicker. Please leave your religion for a few minutes and think scientifically. Since no data is excluded, that includes hierarchical data. Therefore hierarchical data is included in the RM. Don't ask for the Lemma.

> As far as I know, there's nothing like e.g. SQL EXISTS for
> graphs. There's no way to say "find all <graph expression> where
> <subgraph expression> = <graph expression>". Sure, there are ways to
> iterate over the nodes. There are even things like XPATH to find nodes
> having particular properties. But the solutions are all ad hoc and
> complex, and still come nowhere near what RM does.

So sure, if I were to implement a bunch of point sets for a customer, I would translate all that data into RM terms, give them vectors, dimensions, or whatever they need to IDENTIFY their data, and implement it in a RDBMS. It isn't ad hoc, it is formal. It does everything the RM states can be done for data. No funny constructs. All data instantly available via a single SELECT.

I realise you mean theory, above, but my point is, I don't need a theoretician to define it theoretically, in order to implement it. And that identifies yet another point where the theoreticians are crippled (note the other thread). Norbert believes (unscientifically) that he has way more work to do, before he can implement. All of you guys cannot understand that I can implement tomorrow, if you can define it in English today, supported by the evidence that consists of the entire physical universe (ie. others have done the same). You spend your lives defining something in gibberish, with all sorts of artificial limitations, that you impose upon yourself. This is why I am saying it is a religious issue for you guys, and the religion is false.

> I don't feel qualified to comment on [Norbert's] work. I don't know anything
> about topological spaces. I was waiting to see how the dust settles.

I am certainly not qualified either. He asked for, and I gave, an informal, non-theoretical, practitioner's review. You guys just do not get that painting rectangles on screen is twenty-to-thirty-year-old hat. Oh, oh, wait. It is me who doesn't understand. And even worse, it is you who can't communicate.

> I think we just mean different things by "hierarchical model", and by
> "model". I still say Codd invented the term simply to contrast it with
> his work. If you believe otherwise, please cite one paper in database
> research before 1969 that refers to it.
>
> I acknowledge the existence of IMS et al., and recognize that the data
> were modelled on a hierarchy. I'm only saying that "model" lacked any
> of the expressive power of the RM, to the extent they're not really
> comparable.

Beside Done to Death; Read Again; Silly Requests; Denial of Evidenced Historical Facts; etc, you are contradicting yourself, severely, between your two paras.

And I never compared the HM and the RM. The RM superseded the HM, while retaining everything that we had learned about data up to that point (not, the storage, Christ help me, not the literal model, Baby Jesus, but the essence). If one were to compare, which is stupid, the HM loses badly, for all the reasons you have mentioned. Plus one reason (at least) that you are unaware of. It was not a win-or-lose affair, it was a succession. Like the auto-mobile-carriage was a succession of the horse-and-carriage. Comparing the two is silly.

> Hence it's a pretense even to call it a "model".

For you. Not for the rest of the planet.

> > Ok. So then what, in your considered opinion, is the natural or
> > "natural" structure for data, or is there none ?
>
> Not to be sophistic about it, but I would say that data [hierarchies?] don't occur in
> nature and have no natural structure.

That sounds like you think that data (is therefore) fragments, isolated from each other, with oh, a relationship here, and ah, another relationship there. The very opposite of the RM.

My practised opinion is, after 39 years in the industry, 36 in databases, 30 years pure Relational, is that all data in a database, is related, to all data, in that database. Obviously, not limited to "by referential constraint", related in various ways. At the very least, it is all related simply because the database is a recovery unit. One step up from that, that you will understand, is the referential constraint, so that is the limit for you guys. Codd taught me that there are myriad other relations, and applying the RM, I implement some of them. But that is about thirty years ahead of you guys, and I will not try to explain any of it here. We are still, heh heh, arguing about what a relation is, and whether hierarchies are part of the RM.

> As a human construct, they
> have no more "natural" structure than do novels or cities.

Ok. So, according to you, there is no relation between novels or cities or novels-and-cities. And you say, you got that from the RM. Gee, I got the exact opposite. But hey, I am just a dumb practitioner, you guys are inflated theoreticians who produce nothing. I can't match that level of success.

> > Question for you. In one para or less, what is your considered
> > opinion of the Alice book.
>
> You're on record as calling it an "abortion".

Yes. And a Cancer Causing Agent, to the extent that it is used as a "textbook". Ready evidence: it has poisoned your mind re hierarchies are "not possible" in the RM.

> I don't know what about
> it earned your enmity.

I won't enumerate here, but to sum up, it is based on lies, Straw Man arguments, denial of reality, denial of the RM. To name just a few. And I can provide evidence for each of those charges. One really stand-out point is the chapter on the "relational model": 70% everything but, and in the remaining 30%, none of the main concepts of the RM. Fraud. Crime of Omission. Crime of Commission. Misrepresentation.

> I have an interest in database theory, and
> query language complexity. I would like to see SQL replaced by a
> language that is simpler, more expressive, less verbose,

All very good ideas, but about twenty years too late, in that, while all that mattered twenty years ago, it doesn't matter now. We do not type SQL. We have machinery that does that for us. We work at a level of productivity that is beyond the imagination of typists and their mythical, untested, ideas about the planet.

> and that better
> represents relational algebra.

Yes, I would definitely like that.

Given your demonstrated skills, I have a suggestion. Spend a month or three, and write a C program, that parses a text string (that is supposed to be one of the 56 RAs that you guys have), and produces SQL verbs, that are executed against any SQL (or NonSQL) platform, and you are home and hosed. It can run on any SQL platform, and any range of functions that the platform provides. Use libraries for the parsing of course, otherwise it will take years.

That way, you can implement FOL, SOL, 42OL.

Just do not, please, under any circumstances, implement anything less than the simple FOL in the RM.

The TTM cult have produced nothing for twenty years, not even 1% of SQL, but they promise to exceed it. They have a Toy Language semi-described, nothing defined yet. And of course they keep changing it. I hear they are once again, arguing about Typing, after dropping it unresolved four years ago (when I was there): they haven't figured out that the Typing is already in the one-and-only place it should be, the database. They are still worshipping at the Tower of Babel, it should be no problem to beat them to it.

Just don't build a monolith, we are in the age of components and layered software. Babel is about two thousand years out-of-date. It should be a straight program that runs on a PC (Unix and Windoze, in whatever order suits you), that calls a resident SQL client library.
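As a rough illustration of the translator idea (a front end for a relational algebra that emits SQL and hands it to an existing SQL engine), here is a minimal sketch. It is written in Python rather than C for brevity, it skips the text-parsing step and starts from an already-built expression tree, and it covers only four operators. The tuple encoding, the function `to_sql` and the tables `emp` and `dept` are all hypothetical; SQLite is used only as a convenient engine.

```python
import sqlite3

def to_sql(expr):
    """Translate a tiny relational-algebra tree into a SQL SELECT string."""
    op = expr[0]
    if op == "rel":        # ("rel", table_name)
        return f"SELECT * FROM {expr[1]}"
    if op == "project":    # ("project", [column, ...], sub_expression)
        return f"SELECT DISTINCT {', '.join(expr[1])} FROM ({to_sql(expr[2])})"
    if op == "select":     # ("select", predicate_text, sub_expression)
        return f"SELECT * FROM ({to_sql(expr[2])}) WHERE {expr[1]}"
    if op == "join":       # ("join", left, right), natural join on shared columns
        return f"SELECT * FROM ({to_sql(expr[1])}) NATURAL JOIN ({to_sql(expr[2])})"
    raise ValueError(f"unknown operator: {op}")

conn = sqlite3.connect(":memory:")
# Hypothetical sample tables.
conn.execute("CREATE TABLE emp (emp_no INTEGER PRIMARY KEY, name TEXT, dept_no INTEGER)")
conn.execute("CREATE TABLE dept (dept_no INTEGER PRIMARY KEY, dept_name TEXT)")
conn.executemany("INSERT INTO emp VALUES (?, ?, ?)",
                 [(1, "Ann", 10), (2, "Bob", 20), (3, "Cy", 10)])
conn.executemany("INSERT INTO dept VALUES (?, ?)", [(10, "Sales"), (20, "Ops")])

# project[name, dept_name]( select[dept_name = 'Sales']( emp JOIN dept ) )
expr = ("project", ["name", "dept_name"],
        ("select", "dept_name = 'Sales'",
         ("join", ("rel", "emp"), ("rel", "dept"))))

sql = to_sql(expr)
result = sorted(conn.execute(sql).fetchall())
print(sql)
print(result)
```

The generated SQL is deliberately naive (nested subqueries everywhere); the point is only that the algebra layer can be a separate component sitting on top of whatever SQL client library is available.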

> I read Part E with great interest,
> although I don't claim to have mastered the subject.

I haven't got to that yet. I can only handle small doses because the dishonesty and fraud, which happens on every single page, sometimes every paragraph on the page, drives me to commit sin, ie. more than the bad words, vomiting and spasms. So I have to stop, perform an act of contrition, and write it up for my next confession. But that doesn't tell the story, because we can't keep committing the same sin over and over, it means we are not sincere about giving it up. We are supposed to avoid the occasion of sin, which means maintaining a state of Grace is very simple: don't read the book. But for the sake of identifying and dealing with the cancer in my beloved profession, I have a dispensation to enter the world of the devil, and deal with his children. Of course, I believe in material penance, and that takes a lot of time. So the sum is, for every six or seven pages that I do read and ingest from the cursed book, not counting time spent in the toilet, I have to perform about 12 days of labour.

Basically, it is the penance that I have to do to redress all the sinning I have done in my life, which I gratefully accept, so while there is no problem re motivation, etc, there is a problem with the rate of progress. Compared to my three years penance that I performed at the TTM Kingdom Hall, in some ways it is worse, in other ways better. Eg. you can't pin a TweedlDum slave down via email (which btw is one reason I appreciate this interchange), in their own church, and of course, they never go outside. Vs AHV, where the crime, the evidence, is in black and white, permanent, so I can pin the freaks down, and stitch them up. Having just typed that, I realise now that my time at TTM was training, so, thanks be to God.

Thus far, I am up to ch 11, with parts of 14 and 15, and ch 21.

I won't be reading part E.

And it must be said, you and I are reading it from opposite sides of the fence. You are infected, enslaved, lapping it up, for the divine juice that you have been tricked into thinking it is. I am protected by the real thing, handling it with a mask and gloves, like the dangerous contagion it is. All tolerable, my small contribution to humanity, in order to arrest the enslavement, the subversion, of good young minds.

> I think it was from that book that I learned that there are some
> problems that cannot be expressed in FOL. Hierarchies are in fact an
> example: some problems require recursion, which, as you know, is
> inexpressible in relational algebra.

Excellent! We might have found another avenue (given your private definition, and that you refuse the exercise) to deal with Hierarchies vis-a-vis the Relational Model. Here, I will happily do all the work, there is nothing for you to do, except to communicate the need, the problem to me.

First, let's get that last clause in your para out of the road. Yes, I know, but what A RA or THE RA can or cannot do, is irrelevant. No idea why you think it is otherwise.

Second, isn't it amazing how the Holy Ghost works! Here I am, 57 years old, and still sometimes just blown away by that. There I was, just four paras further up, believing that the thread had stalled due to your refusals, but I praised God, spat out the devil, and acknowledged the work for what it is, and boom, four small paras later *you* give *me* an opening. Let me go and say a prayer and return.

Ok. So, given the previous interaction and stall, we need to draw up some boundaries. There has to be clear success or failure. None of this unscientific refusal to accept established definitions, no denial of physical evidence.

1. So will you please confirm, if I show you how hierarchies are dealt with in the RM, handled properly by a reasonable implementer, implementing the RM, then it will prove that hierarchies are catered for in the RM, and that application of the RM re hierarchies is as easy as you see it being done ?

2. This one might be harder. If you see (ie. not deny) that the constructs and concepts that I use, that are in the RM, are in the HM, will you concede, graciously, that the HM (in essence, in concept, in spirit) exists in the RM ?

3. Therefore the HM lives in the RM ?

4. And since you have been denying it thus far, therefore, you did not previously know [3].

5. Given [1][2], there was that content that you did not previously know.

6. Therefore, to that extent [1][2], you did not know the RM.

7. If I give you a full solution as Relational Db tables, and a data model to go with it, using an existing commercial SQL platform, and SQL DDL, will you concede that the solution is executed in current, Relational, SQL, RDBMS ?

8. If I give you a full solution including the code required to perform whatever tasks you give me (which you perceive as currently being impossible), using an existing commercial SQL platform, and SQL DML (ie. 100% SQL, a calling language used for calling and presentation convenience only, no subject code), will you concede that the solution is executed in current, Relational, SQL, RDBMS ?

9. Therefore whatever your teachers (other than Codd) taught you about the combination (FOL; Hierarchies: the RM; can't do), is wrong, wrong, wrong.

10. And the corollary, given that my solution is 100% RM, that Codd is right.

Iff you agree to the above, then please give me a real world example problem to work with, otherwise, please discuss.

Single relations are unacceptable, that is the classic trick that frauds use to erect their Straw Man arguments, we prefer lots of detail, even ten tables. In the three case studies that I use for my courses, each starts with ten or so tables, as we progress, it expands to 20, 30 tables, and it finishes with 80 tables. Nothing of value can be taught or learned, from a few relations. Except fraud.

> I have
> not read a theoretical treatment of SQL's recursion.

Why does one need a "theoretical treatment" of recursion ??? Do you have a "theoretical treatment" of the recursion in C ? For awk scripts ?

SQL is a data sub-language, period. It is not a language, why would anyone expect a full-program capability from a data sub-language ??? That is as stupid as writing a cursor in SQL, which is a set-processing engine (at least in the commercial end of the market).

You might be making the same mistake as Jan, that I enumerated in detail ? You two are reading the same books, lapping up the same poison. Failure to observe the scientific Architectural Principle in the industry, of differentiating /Data/ vs /Program/.

> It seems like a
> good job to me, a minimal extension.

No idea what you mean. The recursion is in the server, not the SQL code. Same as for an awk script: the recursion is in Unix, not in the awk script. There is no recursion "in" SQL. Of course, both the awk script, and the SQL function, must be written with the awareness THAT it will be executed recursively, but that is not recursion IN SQL or the awk script. We have had it since 1984.

> But it also seems harder to use
> than necessary.

How would you know ?

Or, are you talking about some freeware/shareware/vapourware NonsSql that has recursion IN the NonSql ? Such as those that have "deferred constraints checking", etc. Hilarious. They have read the same book, lapped up the same poison, ignored the same scientific Architectural Principle, and written puke into their program.

--

But thank you for answering my question, and your excellent comments. I do not mean to take your answer apart, I was just adding my comments under yours.

> > In one sentence or less, how relevant is it to the field of
> > Relational Database design.
>
> Not relevant at all.

Excellent. Like Harry Potter's Book of Magicke. Only for HP fans. Not for the Christened ones.

Cheers
Derek

Derek Asirvadem

Feb 2, 2015, 9:14:14 AM
> On Monday, 2 February 2015 22:46:19 UTC+11, Derek Asirvadem wrote:

> My practised opinion is, after 39 years in the industry, 36 in databases, 30 years pure Relational, is that all data in a database, is related, to all data, in that database. Obviously, not limited to "by referential constraint", related in various ways. At the very least, it is all related simply because the database is an recovery unit. One step up from that, that you will understand, is the referential constraint, so that is the limit for you guys. Codd taught me that their are myriad other relations, and applying the RM, I implement some of them. But that is about thirty years ahead of you guys, and I will not try to explain any of it here. We are stilt, heh heh, arguing about what a relation is, and whether hierarchies are part of the RM.

In case it needs to be said, I do not mean "Codd taught me" literally, in person. I mean through his papers and articles, through the RM. The more I applied the RM, the more I understood it, that it applied more than I previously thought. So I applied it more the next time, and in doing so, I learned that it applied in even more ways.

In that sense, Codd is still very much alive, much like my dead father is alive to me internally. His work is a living work, if you interact with it and respect it as the Commandments for Relational Databases, you get way more out of it than if you argued and fought. That might take you ten or twenty years of incorrect implementations, as some of you have mentioned. Genuine disciples of Codd never had that problem.

Of course I implement 100% of all data controls in the db, using RDBMS facilities, declaratively. But there is much more to data integrity than that. There is a whole level of what I call Logical Integrity. To portray that, to document that, I use a set of diagrams, separate to, and entirely bound to, the full IDEF1X data model. Only because IDEF1X doesn't have it. IDEF1X is not like UML, where everyone and his dog has a different set of symbols and notations, so I don't interfere with it, I don't add symbols, I add a separate diagram.

But that is beyond the scope of this thread. I just wanted to clarify the "Codd taught me" statement, in case someone took that literally.

Derek Asirvadem

Feb 3, 2015, 4:22:27 AM

James
Jan

> On Thursday, 29 January 2015 13:09:08 UTC+11, James K. Lowden wrote:
> On Wed, 28 Jan 2015 01:52:59 -0800 (PST)
> Derek Asirvadem <derek.a...@gmail.com> wrote:

> > A. ____the Hierarchical Model rests on a theoretical void____

> > B. ____the HM is dead, it has no relevance wrt the Relational Model___

> > F. The implication here, and in many other places, is that you know the Relational Model, and you know it well.

> > I declare, the Relational Model is not a replacement for, or a
> > substitution for the Hierarchical Model; it is a progression of it.
>
> Evidence, please. I can think of no way in which the hierarchical
> model informs the relational model except as antithesis. Codd
> contrasted the RM with "noninferential systems", which hardly
> sounds like a source of inspiration.

(Not avoiding this point, tomorrow, please.)

Tomorrow is here, after some delay.

I would like to make sure that we do not waste time with silly arguments, I have more respect for you than that. Last year, someone here got into such an argument with me, after I stated that SQL was the manifest language of the data sub-language that Codd defined in the RM. He had a mathematical proof that it was not (same as your demand for truth tables in the Null problem thread one year ago), and of course that was in denial of the historical facts. The issue on my side was simple: unless one is in a pathological state of denial, the father can be recognised, easily, by virtue of the unique characteristics of the father, in the son. Whether the son carries the father's name, or disowns the father, or the certificate of birth is lost, etc does not matter.

Codd is unarguably the father of the one-and-only data sub-language, from which SQL is directly derived. Separately, by route of the sequence of historical facts (System/R, SEQUEL, SQL), Codd is the grandfather of SQL itself. To deny this, separate to denying Codd his due, is to deny historical facts, and to deny the unique characteristics of the father that are evident in the son.

Third, there is one, and only one son (SQL), and one, and only one, definition of a data sub-language (Codd's), that the son can lay claim to being the manifestation of.

That is not to say that Codd was happy with his grandson, or that he agreed with the way he had turned out (IBM and the teams had their politics). Any negative comments from Codd (such as you posted) have to be viewed with the eyes wide open.

Likewise re Hierarchies. Much as you hate them. The HM exists in the RM, by virtue of the unique characteristics of the father in the son.

Now the thread has progressed, and you have split the HM /as I first stated/ into two pieces: hierarchies in general as they may exist in the data; and the HM as a model. Personally, I think the difference is trivial, not material to the thread, but as stated previously, I am willing to allow the split since it is important to you. In which case, I need to restate my declaration.

I think we all agree that the HM, as an implementation, died in 1984, when the RM became available as an implementation platform. So we are not talking about the HM living in the RM as an implementation method, or a data storage method. Whatever form it lives as, in the RM, the implementation method is RM.

I think we have cleared up the issue that it does or does not have a paper supporting it (and that it is silly to expect one), before accepting it as a model, because in those days everything was proprietary, the HM was the only kid on the block, and it was widely known and well-understood as a model. Even today it is well-understood, but very few will remember the [much simpler] modelling methods or the symbols.

Note that in the RM, more than half of Codd's references are to products and product manuals. There were not too many theoretical papers in the field.

So we are talking about its spirit, its philosophy, its design elements. In the logical sense, since the RM is mostly logical, and in the resultant physical sense, because that is not denied (the physical is a result of the logical, not isolated from the logical). Ie. Codd does give logical implementation prescriptions and prohibitions, and concerns himself with the physical, without giving prescriptions. Eg. Access Path Dependence is prohibited, we all know what that means, it is primarily logical, and of course, when implemented it is physical. Therefore I think we should not argue that the HM lives in the RM in a physical form or not, or that since we cannot see the physical HM in the RM, it means that it doesn't exist in the RM.

Last, in attempting to answer your request, I started typing up a justification, but that will be nothing short of a full exposition of the specific parts of the RM that relate to the HM. Very long. Open to argument. So I think the best thing to do is to give you the specific references to the HM in the RM, and have you confirm or deny that. Then I can respond to just those points. And I am quite aware, that due to the blind spot you have, denial of the HM, and the HM in the RM, the discussion might be quite short and sweet.

References. So the first preference I would say, is to read chapter one of the RM, with this thread, and what I have said, in mind, and come to your own NEW conclusions. Specifically:
- [1.4] Normal Form
- You need to understand the manner he uses to construct keys, "simple" and "non-simple domain", so the preceding sections
- special attention to the pre-requisites

There are many terms that Codd uses in the RM, which have gone out of fashion. I don't know which ones you know or don't know. Please feel free to ask me, and I will explain. There seems to be a bit of confusion over a couple of terms, so I will give them first.
deductive question-answering system = Decision Support System; Online Analytical Processing
Inferential system = DSS/OLAP
non-deductive system = Online Transaction Processing
non-inferential system = OLTP
graph = tree
tree = Hierarchy (graph) [13]
graph model = Hierarchical Model

Tree/Hierarchy
As you know, Normalisation was well and truly established, although we did not have titled "normal forms" until Codd declared them, first in the RM, and later with 1NF, 2NF, and 3NF. Therefore, when he gives the pre-requisites to his __Relational Normal Form__ [1.4](1)(2), and in (1) states "collections of trees", we take that to mean:
- trees with integrity
- normalised to the extent that we did prior to the RM
- no circular references
- what I am calling, in retrospect __Hierarchical Normal Form__

In the alternative, trees without integrity, with circular references, would fail the requirements and directions given on that entire page. That is further evidence that he meant trees with integrity. I am sure you will understand the ramifications of circular references, it is the same with recursion: an infinite loop. Deadly, and career suicide.
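The "no circular references" condition can be checked mechanically. The following is a minimal sketch in Python, not anything from Codd's paper: it assumes the hierarchy is held as a simple child-to-parent mapping, and the function name `find_cycle` is hypothetical.

```python
def find_cycle(parent_of):
    """Return the nodes of a circular reference, or None if every chain ends at a root.

    parent_of maps each node to its parent; a root maps to None
    (or its parent is simply absent from the mapping).
    """
    for start in parent_of:
        seen = []
        node = start
        while node is not None:
            if node in seen:
                # We came back to a node already on this chain: a loop.
                return seen[seen.index(node):]
            seen.append(node)
            node = parent_of.get(node)
    return None


# A proper tree: every chain of parents terminates at the root.
tree = {"root": None, "branch": "root", "leaf": "branch"}
# A structure with a circular reference: following parents never terminates.
looped = {"x": "y", "y": "z", "z": "x"}

tree_cycle = find_cycle(tree)
looped_cycle = find_cycle(looped)
```

Walking the parents of `looped` without the `seen` check is exactly the infinite loop described above; with the check, the offending nodes are reported instead.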

Note that he emphasises the relevance of the pre-requisites and the implied integrity with "The writer knows of no application which would require any relaxation of these conditions." The conditions are onerous, otherwise relaxation need not be mentioned.

Restatement

Given the content of previous discussions, a short restatement of my position re the HM and its relevance to the RM is in order. I am saying the HM lives in the RM:
- as a concept
- a previously well-known model, method, of organising data
- that is explicitly referenced, and used by Codd in the RM, throughout the paper
- if one implements data according to the _Relational Normal Form_ given, for which explicit steps are given, this leads to Relational Keys, which are compound keys ("non-simple")
- such keys, in the context of the series of tables in which each and all of those components ("simple" as well as all subsets of the "non-simple") are used, will form an hierarchy
- the set of such tables form an hierarchy
- such keys may well be called Hierarchical Keys

The corollary is:
- where one fails to implement Relational Keys as they naturally occur in the data, the value, speed, and Relational power (independent join power) that exists in the Hierarchy (whence they came) is lost
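To make the restatement concrete, here is a minimal sketch using Python's standard sqlite3 module. The three tables and all column names are hypothetical, chosen only for illustration; SQLite is used because it ships with Python, not because it is the platform under discussion. Each child key is the parent key plus one more component, so the keys, and the tables, form a hierarchy.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
conn.executescript("""
CREATE TABLE customer (
    customer_code TEXT NOT NULL,
    name          TEXT NOT NULL,
    PRIMARY KEY (customer_code)
);
CREATE TABLE sales_order (
    customer_code TEXT    NOT NULL,
    order_no      INTEGER NOT NULL,
    PRIMARY KEY (customer_code, order_no),
    FOREIGN KEY (customer_code) REFERENCES customer (customer_code)
);
CREATE TABLE order_item (
    customer_code TEXT    NOT NULL,
    order_no      INTEGER NOT NULL,
    item_no       INTEGER NOT NULL,
    product       TEXT    NOT NULL,
    PRIMARY KEY (customer_code, order_no, item_no),
    FOREIGN KEY (customer_code, order_no)
        REFERENCES sales_order (customer_code, order_no)
);
""")
conn.execute("INSERT INTO customer VALUES ('ACME', 'Acme Pty Ltd')")
conn.execute("INSERT INTO sales_order VALUES ('ACME', 1)")
conn.executemany(
    "INSERT INTO order_item VALUES ('ACME', 1, ?, ?)",
    [(1, "widget"), (2, "gadget")],
)

# The grandchild carries the grandparent's key, so the two can be joined
# directly, without visiting sales_order.
rows = conn.execute("""
    SELECT c.name, i.order_no, i.item_no, i.product
    FROM customer c
    JOIN order_item i ON i.customer_code = c.customer_code
    ORDER BY i.order_no, i.item_no
""").fetchall()

# An item for an order that does not exist is refused declaratively.
try:
    conn.execute("INSERT INTO order_item VALUES ('ACME', 99, 1, 'orphan')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
```

The direct join from `customer` to `order_item` is one reading of the "independent join power" named in the corollary.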

Unfortunately, during the intervening forty five years, while the theoreticians have invented all sorts of "normal forms" to suit non-relational purposes, they have been staggeringly unable to define the two Normal Forms that are /informally/ defined in the RM, /formally/.

In the middle of section [1.4] Normal Form, Codd gives an example of a tree (hierarchy) in fig 3(a). In case you are interested, I have a page that shows the HM, the hierarchical file structure, and the transformation according to Codd's instructions. Please ask.

> I think you will recognize this, from the abstract:
>
> "Existing noninferential, formatted data systems provide users
> with tree-structured files or slightly more general network models of
> the data. In Section 1, inadequacies of these models are discussed."

Accepted, that the RM succeeded the HM, that it was inadequate.

That says precisely nothing about whether the hierarchy is a valid structure for data. The big difference between pre-relational and relational, as you well know, is reference-by-key over reference-by-pointer. We had references and cross-references in those pre-relational days, we just did not have keys as pointers.

Further, at no point does Codd state that the hierarchy of the data itself (ie. other than the pointer-based storage and access path dependence) is inadequate or should be replaced.

And finally, the method he gives for the __Relational Normal Form__ uses a data hierarchy, and he transforms that hierarchy of data into Relational form (array, matrix, table), keeping that hierarchy of data intact.

> I can think of no way in which the hierarchical
> model informs the relational model except as antithesis.

I didn't say "informs". I said the RM is a progression of the HM. The evidence is in the paper, just that one page. To the degree that one understands Codd's numerous references to the hierarchy ("tree"), one can ascertain how much it "informed" Codd.

To the extent that any hierarchy that exists in the data, is maintained as an hierarchy, after transformation to the Relational Model, the hierarchy lives, exists, breathes (normally, not zombie-like), in the RM, and therefore to this day.

Antithesis. Proved false.

Thesis. Proved.

No one said thesis = 100%. Life is not a simplistic black-or-white issue. If I was asked, just how much the HM influenced the RM, I would take the time to enumerate the RM, and notice its features, I wouldn't put a number to it right now. That the HM is in the RM, as detailed in my restatement, is unarguable, irrefutable.

Cheers
Derek

Derek Asirvadem

Feb 3, 2015, 9:42:12 AM

James
Jan

> On Tuesday, 3 February 2015 20:22:27 UTC+11, Derek Asirvadem wrote:

Further Clarity

> Restatement
>
> Given the content of previous discussions, a short restatement of my position re the HM and its relevance to the RM is in order. I am saying the HM lives in the RM:
> - as a concept
> - a previously well-known model, method, of organising data
> - that is explicitly referenced, and used by Codd in the RM, throughout the paper
> - if one implements data according to the _Relational Normal Form_ given, for which explicit steps are given, this leads to Relational Keys, which are compound keys ("non-simple")
> - such keys, in the context of the series of tables in which each and all of those components ("simple" as well as all subsets of the "non-simple") are used, will form an hierarchy
> - the set of such table form an hierarchy
> - such keys may well be called Hierarchical Keys
>
> The corollary is:
> - where one fails to implement Relational Keys as they naturally occur in the data, the value, speed, and Relational power (independent join power) that exists in the Hierarchy (whence they came) is lost

Given the content of previous discussions, a short restatement of my position re the HM and its relevance to the RM is in order. I am saying the HM lives in the RM:
- as a concept
- a previously well-known model, method, of organising data
- that is explicitly referenced, and used by Codd in the RM, throughout the paper
- if one implements data according to the _Relational Normal Form_ given, for which explicit steps are given, this leads to Relational Keys, which are compound keys ("non-simple")
- such keys, in the context of the series of tables in which each and all of those components ("simple" as well as all subsets of the "non-simple") are used, will form an hierarchy of keys
- the set of such tables form an hierarchy of tables
- such keys may well be called Hierarchical Keys (they are well-known as Relational Keys, I am not suggesting that we change it)

The corollary is:
- where one fails to implement Relational Keys, such as they occur naturally in the data, such as they existed in the Hierarchical Model (whence they came), such as they would have existed in the Relational Model (had they been implemented), the value, speed, and Relational power (independent join power) is lost.

Cheers
Derek

Derek Asirvadem

Feb 4, 2015, 12:43:34 AM

James

> On Monday, 2 February 2015 22:46:19 UTC+11, Derek Asirvadem wrote:
>
> > Yes, which you very much needed, as I'm sure you also remember,
> > both because of the lack of data independence, and because navigating
> > the hierarchy required having a clear picture in mind of the database's
> > structure.
>
> That is two separate points.
>
> Clear picture in mind. Well that is my point, not yours. You need a "clear picture in mind" if you are going to implement ANYTHING, if we are going to navigate ANYTHING. A database today, a pre-relational file system or database. A network topology. That is why we have IDEF1X, a standard. Instead of funny pictures in Visio and UML. In those pre-relational [days] we had different diagrams to provide us with a "clear picture in mind", and a standard for the symbols, which were available as stencils at the newsagent. It proves once again, we had a MODEL.

Point being we need a "clear picture in mind" to navigate any topology, today's IDEF1X Relational data models or yesterday's Hierarchical Topology Navigation Map (since you deny the word "model" for the diagram). No less, no more.

Of course we had relationships, and they were maintained by the platform, but, and this is the great difference between pre-relational vs relational, the chains or links or pointers were record numbers, not keys. Codd details this in the RM.

----

> > the distinguishing
> > feature of the relational model that it's an inferential system.
>
> Yes. Agreed. As long as you do not mean ONLY inferential. It is perfect for OLTP as well as DSS/OLAP, in the same database.

And, importantly, the inferential capability of the system would be limited to the knowledge and capability of the modeller. If he did not implement the data hierarchies that occurred "naturally" in the data, into the database, as hierarchies, then the database would be severely hindered, in terms of DSS/OLAP.

Based on many projects completed with this issue being a key implementation point, I would go as far as saying, most of the replicated databases that you see these days, the purpose of which is to offer a data mart or data warehouse or report-only capability, can be eliminated if only people were aware of the relevance of hierarchies, and implemented them properly: the one Relational Database would support both OLTP and DSS/OLAP. There is an awful lot of wasted resources, hardware and manpower in the replicated dbs.

Same vein, another example, one I am sure you do understand, because you quoted the Rule. The typical "database" that people implement these days consists of "tables" with RecordIDs. That breaks Data Independence, specifically, Codd's Data Access Path Independence requirement. Every "table" is in fact a separate FILE, with an access path dependence. Further, the level of integrity that such schemes can implement is a tiny fraction of the integrity that is effortlessly afforded, if one were to Normalise that same data into HNF and RNF.
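One way to read the access-path point, sketched with Python's standard sqlite3 module. The schema is hypothetical and deliberately uses the RecordID style being criticised: each child row holds only the record number of its immediate parent, so the only route from an item to its customer is through the order row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (
    customer_id INTEGER PRIMARY KEY,   -- RecordID, carries no meaning
    name        TEXT NOT NULL
);
CREATE TABLE sales_order (
    order_id    INTEGER PRIMARY KEY,   -- RecordID
    customer_id INTEGER NOT NULL REFERENCES customer (customer_id)
);
CREATE TABLE order_item (
    item_id  INTEGER PRIMARY KEY,      -- RecordID
    order_id INTEGER NOT NULL REFERENCES sales_order (order_id),
    product  TEXT NOT NULL
);
INSERT INTO customer    VALUES (1, 'Acme Pty Ltd');
INSERT INTO sales_order VALUES (10, 1);
INSERT INTO order_item  VALUES (100, 10, 'widget');
""")

# Nothing in order_item identifies the customer.
item_columns = [row[1] for row in conn.execute("PRAGMA table_info(order_item)")]

# To reach the customer from an item, the query is forced to navigate
# item -> order -> customer, one pointer at a time.
rows = conn.execute("""
    SELECT c.name, i.product
    FROM order_item i
    JOIN sales_order o ON o.order_id = i.order_id
    JOIN customer c ON c.customer_id = o.customer_id
""").fetchall()
```

In this sketch the intermediate join cannot be dropped, because `order_item` has no column that relates it to `customer`; that is the dependence on a fixed path that the paragraph above objects to.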

----

> You seem to misunderstand, conveniently, I suppose, despite the details I have given, that I agreed there is no published theoretical basis for the HM, silly to expect, etc. Given my details, "set notation" above means set of notations, symbols, not notation of sets, which concept did not exist prior to Codd, and therefore is silly to expect, blah blah blah.

Another analogy for the issue you raise. Some years ago, most Western countries implemented Occupational Health & Safety, as law. That means a standard, that was understood and accepted by everyone. All new buildings were required to comply from the start of the project. All existing buildings were given one year to comply, or be demolished.

Now at that time it would be silly if someone trooped up and said, re one of the old buildings "hey that building over there is unsafe, it has no safety whatsoever, it is built on a safety theoretical void." Er, we had OHS for decades before. Every building had some form of it, depending on the integrity of the company. We just did not have a publicly available syllabus against which to measure the exact extent of safety. There was not a requirement to publish the safety aspects for every building.

Such is your position re the "HM was built on a theoretical void".

----

> > I have not read a theoretical treatment of SQL's recursion.
> > It seems like a good job to me, a minimal extension.
>
> No idea what you mean. The recursion is in the server, not the SQL code. Same as for an awk script: the recursion is in Unix, not in the awk script. There is no recursion "in" SQL. Of course, both the awk script, and the SQL function, must be written with the awareness THAT it will be executed recursively, but that is not recursion IN SQL or the awk script. We have had it since 1984.
>
> > But it also seems harder to use
> > than necessary.
>
> How would you know ?
>
> Or, are you talking about some freeware/shareware/vapourware NonsSql that has recursion IN the NonSql ? Such as those that have "deferred constraints checking", etc. Hilarious. They have read the same book, lapped up the same poison, ignored the same scientific Architectural Principle, and written puke into their program.

The full extent of the horror of the freeware/shareware/vapourware purveyors writing recursion =IN= SQL, instead of providing recursion in the execution engine (server or bunch-of-communicating-programs or whatever they have), sunk in overnight. My colleagues and I have difficulty discussing anything else today. Is this for real ? Is the shareware crowd /that/ stupid ? I mean, I know they are stupid from various items in the past, but I did not know they were /that/ stupid.

Of course it would be hard, very hard to use. The essence of the simplicity of relations would be lost.

And even harder to implement. So let me get this right: 10,000 PostGresNonSql (and other NonSql variant) developers across the planet are working on implementing recursion =IN= SQL. Have I got that right ?

My colleagues are eager to obtain a confirmation, please, because the news is unbelievable.

And whose idea is this, Darwen's ? Abiteboul's ? Of course it is ready evidence that he, too, does not know the Architectural Principle of separating data (relations) from program. Who is this guy that is lining himself up for the Darwin Award. We have to recognise him and give him credit for his efforts.

Cheers
Derek

Derek Asirvadem

Feb 4, 2015, 12:44:36 AM
Jan

> On Monday, 2 February 2015 21:47:19 UTC+11, Jan Hidders wrote:
> Op zondag 1 februari 2015 06:33:40 UTC+1 schreef Derek Asirvadem:
>
> > > I would btw. be very curious to know what conclusions your client would think he or she could draw from them.
> >
> > Since I have already given you a synopsis, a short chronology, I am not sure what you mean. Would you like a more complete one ?
>
> More detail, as in where the devil lives. Because I suspect more is concluded from the paper then is actually warranted and meant by the author. But if we get to that later, that's fine.

You will understand perhaps, that I am restricted due to contract and confidentiality issues.

Sydney. One of the larger Australian banks. From experience, we are much more conservative, and we have much more legislature governing, than American banks. Probably (not sure, from speaking to colleagues) somewhat more than EU banks. Customer has an app, OO, ORM, all the OO bells and whistles. The data store is in the app, ie. closed architecture, but deployed on MS SQL for convenience (SQL DML, backups). Typical OO monolith, typical OO minus RDB problems. Hundreds of objects have progressed to thousands, a maintenance and performance nightmare. Three years of failure, of various kinds, but the most important is lack of data integrity, and every time they install a new version, a whole set of new bugs are exposed. Team leader has a lot of influence, but the auditors gave the business an ultimatum: fix it or shut it down. Second most important: the problems getting the data in/out of the data store, and behind that (really the same problem) lack of Relational Database access. Performance is crap but they are used to it. So they came to me (I had replaced an app+Db in another division in NZ, and the auditors had nice words), and the directive is, replace it with an RDb+App, and teach the team standards, so that they never make these mistakes again. The business has no choice, he doesn't need to get a budget allocated, as it is a risk issue and bank level funds are already allocated. I work for the business, so there is some politics, but I answer to, get everything signed off by, the auditors.

Paper. The Team leader is on his last legs. There is no pressure that he will get fired, but he is dying from loss of credibility. He is adamant that if he adds more complexity to the app, to the objects and classifiers, the app will "work", the bugs will be fixed. They have heard this for three years. This time the difference is, he is making a formal presentation, along with two papers that support his position. I am pretty sure that he got that from his OO groupies, no idea who that is, probably some consulting firm that his wife or brother knows. I say that because the presentation was reasonably professional, while denying all the facts.

This is the typical state of the majority of the OO world: despite the past 100% failures, they will do it better next time, promise. Note that if nothing changes, nothing changes.

DBPL is yours. The central theme of your paper is <read abstract>. He is saying that there is scientific, theoretical, evidence that more complexity deployed in the OO classifier layer will fix the data integrity problems, thus it is his team's fault for not doing that properly, not a problem with the OO/ORM concept (specifically no independent RDb), and he deserves yet another chance.

After a quick consultation as to the veracity of each of his statements, proved to be zero, the auditors have asked me to take the presentation down formally. The business has asked me to do so softly.

Enough ?

Proper Use of Paper

Here is what he presents (what the whole OO/ORM world uses):
- the premise of the paper (central theme ?) is that there are data integrity issues that result from (eg) multiple inheritance
- that such issues can be corrected by increasing the level of complexity in the objects, the classifiers
- a proof and specific methods are given
From reading your paper, I cannot say that he (or the OO/ORM crowd) is using it improperly.

The paper is a good example because hundreds of this /sort/ of data integrity problem (ie. sort, not instances), occur only because of invalid beliefs such as this, and those same hundreds, can be eliminated by utilising established architectural principles in our science, by a RM-compliant Relational Database. The entire OO/ORM madness can be eliminated by a professional implementation, but I won't address that, I will address just your paper and the issues therein, just the project and that one issue.


< bulk snipped, skeleton retained ...>

> > I suspect you and I are not on the same page on this one. So let me clarify, and ask for a clarification.
> >
> > Now in this thread you have stated:
> >
> > > > But most /now/ understand the relevance of data independence.
> >
> > (My emphasis.)
> >
> > To which I replied:
> >
> > > I suppose I have to trust that you mean that in the fullness of the data integrity as prescribed in the RM.
> >
> > Which you have not confirmed or denied. Which means, I still do not know the /extent/ to which you understand "data independence", and how it is administered.
>
> Administered? That seems a strange word to use here.

Administered, analysed, designed, modelled, implemented, maintained, such that, through all those activities, one observes the rules that pertain to data independence, open architecture.

> I'm also not sure what you meant by "in the fullness of" here, so that makes it a bit hard to answer. Am I aware how current DBMSs realise (to some extent) data independence? Yes, I am. Am I aware of the available techniques that are not yet implemented? Yes. I Am. Am I aware that under certain approaches there is a trade-off between the extent of data independence and complexity of integrity checking? Yes, I am. Not sure if that answered your question, but that's the best I can come up with at the moment.

That's fine. Let me respond to each one quickly, I don't want to get distracted from the main issue, your paper.

> I'm also not sure what you meant by "in the fullness of" here

I think we already know, that you guys treat the RM as a pick-list, and with very little understanding of what it contains, what each of the items in the pick-list actually means, what they deliver, etc. And some of us over here in implementation land, treat the RM as The Law for Relational Databases. Not only have we taken every word as Law (no picking and choosing), not only have we implemented it, after having enjoyed the fruits of such lawful activity, we have implemented further specifications; finer categorisations; more application areas. Thus the fullness of the RM, and of any single item (data integrity is just one item), is applied fully, and completely, and after much experience, even more fully.

Thus there is a huge gap between one who is a picker, who understands very little of what he is picking, and none of the fruits of what he has not picked, and one who is lawful, full of fruits, and growing more fruits than the original RM described. To wit, we perceive far more from Codd's laws re data integrity (than you did in 1995, and what you do now), and therefore the commercial vendors have enabled it, and therefore we have implemented it. And after a decade or so of sitting on top of that MINIMUM level of data integrity we see more, and implement more. Whereas you are still perceiving far less than that minimum, and you do not appreciate the value of it as prescribed. I can't expect you to even imagine, in your wildest dreams, what we do beyond that minimum.

So for this paper, this issue, we are only dealing with prescriptions in (a) our science and (b) Codd's RM.

There is a good example, if you are interested, that I will be working through in the On Normalisation thread.

> Am I aware how current DBMSs realise (to some extent) data independence? Yes, I am.

Per details above, and per evidence in your paper, I really do not think so.

> Am I aware of the available techniques that are not yet implemented? Yes. I Am.

I think you are aware of a fraction (of what is implemented in the commercial platforms).

Separately, the evidence is, you are unaware of the concepts re data integrity/independence, that must be implemented IN the data.

> Am I aware that under certain approaches there is a trade-off between the extent of data independence and complexity of integrity checking? Yes, I am.

From where I sit, there is never a trade-off, the evaluation you give never happens. It is no problem to implement complete and total data integrity (that item will never be traded off), to any level of complexity, in the database. Imagine what my databases do: we maintain millions of public trades, to hundreds of complex legislative requirements. Both the declaration of those requirements, and the maintenance of the data to those requirements, is in the database. And any implementation of such, outside the database, is not only wrong, incomplete, etc, it breaks the architectural principle of separation of /Data/ vs /Program/, the result will be a sub-standard mish-mash of complex objects that fail anyway. Typical OO/ORM madness. As supported by your paper (among others, and by books such as AHV). The complement on the OO side is actually desirable: simple, rather than complex objects, that are less vulnerable to changes.

Further there is no merit in the monolith, we killed that in 1985. Only really uneducated people still (a) prescribe them and (b) build them.

From where the auditors sit, when they recognise that some project has broken those laws, those principles, they send the team off for re-education.


> > So the clarification begs. The paper is Database Programming Languages, 1995. Are you aware:
>
> Yikes! My very first paper that I wrote as a beginning PhD student! :-) Ok. This is going to be interesting.
>
> > 1. That, on the face of it, your statement above, contradicts, or let's say unofficially retracts, the main thrust, the solution given, in your 1995 paper ?
> > __ (which is why I stated "... the papers have not been retracted, all we have is a statement from the author in an unrelated post on c_d_t stating that "most /now/ understand the relevance of data independence.")
> >
> > Or, do you stand, on that paper, now ?
>
> I'm not sure which statement you mean, but I don't think I've said anything that strongly contradicts the results and assumptions in this paper. I'm also not sure what you mean by "presenting a solution" here. The paper does not introduce a new model, it studies an existing one and focuses on reasoning over union types within that model. But the results actually carry over into other data models.

Central theme described above.

Ok, so you have not retracted it, the paper stands.

Yes, I agree, the model is not yours. You support the model, and you provide methods /within/ that model, to fix problems /caused/ by that very model. You give a method to fix the problem, in the model.

You fail totally, to realise that the problem is not in the model, and therefore no amount of fixing it in the model, will fix it.

I accept, you understand more about data independence than you did in the past, but it is still a tiny fraction of that contained in the RM. And that limited perception, that inability to see the relevance of the items in the RM, hinders you from (a) dealing with data issues in the data (in the RDb, in the platform), and (b) maintains your venture of implementing data integrity in the object layers (the Program), which is at best, fragmented and only a tiny portion of [a].

> You might mean that by studying that data model the paper implicitly gives a vote of approval to it. Do I still stand by that vote? Mostly yes, but these days I tend to think that simpler graph-based models (but with mechanisms to model nested values) would be more useful and effective.

As long as those items are implemented in the /Program Space/ and not the /Data Space/, they break a number of laws established in science and in Relational Database. We have the laws, specifically to protect society from precisely the results of what you are describing.

And the fact that you are continuing in this path (the "model"), in denial of the evidence that this path has failed, increasing the complexity of the vehicle, means you are not observing basic scientific principles, you are simply addicted to the path.

> > 2. Of the Architectural Principle, established as science in our field, that Data must be separated from Process ?
> > __ (And it follows that there are separate and different methods for Analysing & Designing the two, etc, etc.)
> > __ It is clearly established in the industry, that implementers are specialists in either the /Data Space/ xor the /Program Space/ (those who cover both are few, and exceptional).
>
> I am aware of that, but not sure why you think this is relevant for the paper.
>
> Btw. when you say "science" I have the impression you actually mean "engineering".

I was brought up on science. I went to a scientific school. My tertiary education is science, computer science. I have a lot of interest and understanding of engineering, and most of what I do is engineering, yes, but that is the application of science. It rests on, and relies on, science. Now, in the thirty nine years since I left college, you guys might have changed the definition of "science" to some floating flying itinerant ever-changing object, but I am not about to do so.

> > 3. That [2] existed, as science, before Codd, 1970, the RM ?
> > __ (That it has been furthered ever since then, and rendered for whatever context one uses (eg. a RDB; an awk script). That it (as with everything in science) has only gotten stronger as an Architectural Principle, and applicable in more contexts.
>
> That engineering principle has a long and venerable tradition, yes.

Well, your answer to [3] contradicts your answer to [2]. [2] and [3] are inseparable. Codd did not invent [3] out of thin air, it was based, founded in, existing science, including the Hierarchical Model. You cannot carve off a specific implementation of [2], namely [3], and deny [2]. It is absurd.

Whatever you perceive as [3], in isolation from [2], is a deformed, not whole [3]. In which case, you do not understand that venerable engineering principle.

> > 4. That in his paper, the RM, in 1970, Codd gave specific /further/ prescriptions and prohibitions re "data independence", without having to explain what "data independence" meant, because it was well-known ?
> > __ Which resulted in implementation of those concepts in the commercial RDBMS platforms, as well as in the implementations of RDBs.
>
> To some extent. From my colleagues who were around at the time I know that the concept already existed, but not everybody understood it in the same way.

Well, then, the fact that there was a difference among them means that they, as a group, were partially ignorant, and that the ones who had the higher understanding could not, did not, bring those at the lower level, to the higher level. A direct result of picking and choosing from a list that you (they) did not understand.

And it must be said, they (you) live in ignorance of what the vendors did, and why they are doing it, why certain capabilities have been implemented, and others not.

Whereas, for those who observed the law, as law, the higher level was the only level.

> > 5. The result being, that 100% of all controls upon data should be deployed in the RDB ?
> > __ (As I am sure you know, DKNF alludes to this. We implement a much fuller form, as standard practice.)
>
> Not sure why you drag poor little DKNF into this, since that only deals with a very small part of this, but, yes.

Ok. Good.

But then we have a problem. Self-contradiction again. You cannot accept that 100% of the controls on data, data integrity, closely related to data independence, should be deployed in the RDB, and at the same time be supporting a model that deploys some large portion of said controls in the app layers, the objects and classifiers.

Sure, your answers allow you to argue on both sides of the fence, but it is incoherent. You damage your credibility.

> > 6. The corollary being, that controls on the data should not be deployed in the /Program Space/. Eg. OO Objects or classifiers ?
> > __ And if it is deployed there, (a) it will never be adequate, or (b) as complete, as a deployment in the /Data Space/. Something that has been painfully proved in millions of OO-centric implementations.
>
> Definitely, yes.

Good. Ok. But same response re self-contradiction above.

> > > To be honest, although I have opinions on these issues, I find such discussions unscientific and without any merit, even if it is about how Codd himself meant his model to be understood. It is akin to the argument by authority, which is a very weak type of argument.
> >
> > Per details above, I do not expect that type of argument.
> >
> > We do need to take Codd as the authority. Otherwise we can pack our bags and go home.
>
> Quite the contrary. Codd's contributions were fantastic, some of them anyway, but it is by no means the last word on these matters, if only because technology and insight has progressed since then.

Well I disagree strongly, but I won't take time to enumerate. Quickly, re these matters :
- Neither I nor any of my colleagues (the high end of the implementation space) know of any of Codd's contributions (minus the known retractions) to be anything less than fantastic; less than law; less than the last word
- there have been no insights of value published since Codd
- there has been no progress in technology that relates to these matters since 1984
___ except for improvements and enhancements in the platforms

In case you are talking about the theoretical fraction of our field, I am quite aware that there has been a lot of MMM activity, but there has been no published result of any value from the theoreticians, since Codd.

Please feel free to name one, or to provide a link.

> > > What matters is, which objective arguments were put forward to support that interpretation and what the evidence for its merit was. Which interpretation leads to the most effective DBMSs [, RDBs] and what scientific evidence is there for this.
> >
> > Note my insertion.
>
> Yes, noted. But I disagree with the R there.

What, in this day and age, you give assent to a database that is not Relational ?

What one of the hundred or so OODBMS that have come and gone in the last twenty years ? That makes the same mistake detailed above, failure to separate /Data/ and /Process/; fail Data Independence and Open Architecture; failure to decompose and deploy. Oh, the next one will work, will it.

> > Yes, all very good points. But I think even that /could/ be avoided, or let's say, easily stated and closed: the /commercial/ vendors have already done that work; the high-end implementers implement it. Something that the theoreticians do not seem to be able to comprehend. They are about thirty years behind the industry that they theorise for. You will of course, have to accept evidenced reality as scientific evidence, not papers by theoreticians who have already established themselves as un-scientific. Mathematical proofs alone are pure garbage.
>
> Mathematical proofs can only prove mathematical facts, not whether a certain model is practical or not, although certain results can give some support. There the proof is really in the eating of the pudding. Any other position would be unscientific.

Well and good, generally speaking.

But where it concerns the matters on the table, the principles discussed, and specifically your paper (a) there is a well-established and easily recognised model, (b) you are supporting it, your paper supports it, and (c) the people who propagate that model use your paper (as well as fifty more that I am aware of) to prop up that model.

Meanwhile, back at the farm, you are on record (above) as agreeing with specific scientific and architectural principles that the model breaks.

The model is broken. Despite twenty years of fixing and re-inventing, it remains broken. These guys are in denial of reality, evidenced facts. And you both support the model and add to it, while claiming that you agree with the principles that prevent such models.

> > Actually, if you excised the mathematical proofs from your papers, it would increase their credibility. Because the mathematical proofs have been proven false in the course of time, or were false from the beginning due to their contradicting other sciences [specific principles , now identified].
>
> Mathematical proofs can only be proven false by mathematics. But perhaps you mean that the underlying assumptions about how the models and assumptions are relevant in the real world might be shown to be incorrect by other sciences. Yes, that can happen.

Ok. A bit naïve, because thousands of people use those papers as proofs to propagate their model; millions of implementers use, and they rely on those proofs.

Not guys like me, because we have our feet on the ground, and we do not deny other sciences.

But for those who do have esteem for the model, ignorant of science, they end up in the situation where, when the model is broken, they believe the fault is with themselves, not in the model, and they try harder next time. So the real crime is not with them, it is with the people who invented the model; who propagate it; who implement contrary to the science; who deny other sciences; who write papers supporting it, fixing it; etc. That is why I say, they commit a massive fraud.

> Has this happened in this case? I don't think so. The different OO data models have mostly fallen out of favour for non-technical reasons.

The concept of the OO data model, or OO/ORM+data store, ie. minus an RDB, as detailed above, is a total scientific fraud, non-science.

Especially in light of the fact that the traditional model, 100% RDB for the /Data/ and 100% OO for the /Program/, ie. minimal ORM, and no writes allowed, works perfectly and does not break any scientific principles. I will refrain from listing the benefits and the money saved.

____

Ok, to summarise your paper. In the context of this thread.

- it acknowledges and supports the OO/ORM+data store model
- it acknowledges that there are myriad errors in the model
___ that data integrity (the tiny bit that you do understand) is broken
___ (interestingly, you do not attack the data integrity problem, you address only the display of data that is erroneous)
___ but it fails to determine the main errors or causative error in the model
- fails to mention RDBs; the principle of separating Data vs Program, and then dealing with each separately
___ thus it fails to recognise the problem for what it is: data integrity failures, due to
___ a. incomplete definition (analysis; classification; Normalisation; etc) of the data itself, and
___ b. absent constraints upon the data
- it continues to perceive data through the very skewed lens of the model
- it determines the problem to be caused by methodology within the model,
___ which is already complex, and has limits to its complexity
- it proposes a solution within the model
___ that is even more complex
___ with methods to deal with that added complexity

____

To summarise my official Response to the paper, paraphrased for the context of this thread:

- it acknowledges and supports the OO/ORM+data store model
___ that is well-established as broken, as per mountains of evidence in the field, including three years of consistent evidence on this project
___ for the main reason that it breaks the scientific and Architectural Principle of Separation of Data and Process, and the standards we have for data, and separately for the various processes that operate on the data
___ in this, the author contradicts established science, the paper should be viewed in that light
- therefore any and all proposals, as well as proofs, that support such a model, are null and void
___ the details need not be examined
- the paper is dismissed.

To the extent that the TeamLeader, prior to the scheduled education, needs to be informed:
- the problem described is real, we have over 200 different data integrity problems that are caused by use of that model, although only one is detailed in the paper
___ this response applies to all such data integrity problems both in the database, and in the display components
- since the author has failed to determine the location of the problem, that it is in the data, and incorrectly determines that it is in the object classifiers, no proposal from that position can address the problem
- the author makes three cardinal errors
__ 1 Failure to separate Data and Process
__ 1.1 consequently failure to deal with each properly, in its rightful location; absence of data controls in the database
__ 1.2 attempt to control data in the process space, and after the fact of storage
__ 1.3 typical of the OO/ORM model
__ 1.4 typical of Maslow's Hammer theory
__ 2 Failure to recognise data hierarchies and to implement them as such in the database
______ It is noted, that while the author's diagram, and the text, and the notion of inheritance all refer to hierarchies, the hierarchical component as relates to data is somehow invisible
__ 3 General ignorance re the methods used in the implementation space
__ 3.1 Relational Model in general
__ 3.2 The ordinary capabilities in RDBMS platforms in particular
- The data integrity problem itself (and all such problems) are caused by:
___ a. absent classification and treatment of data
___ b. absent /ordinary/ data integrity constraints upon such classified data (no additional or special constraints are required)
- the solution is detailed in that section (refer Response)
- once [a] is implemented, the problem (both as stated, and all problems related to data integrity) disappears
- Note that there is no work to be done on the object side re the problem
___ In fact less work is required
___ As per standard, all Updates to the database shall be via Transactions only, ie. no direct Updates to tables are permitted (included in the education)
___ The entire ORM problem is removed due to removal of Update issues; the remainder is trivial
___ The objects remain simple, there is no restriction whatsoever to the object side, multiple inheritance can be implemented without regard to the data content, and the integrity of such
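
To make the claim about ordinary constraints concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table names (account, trade), the columns and the add_trade function are hypothetical, invented for illustration; they are not from the project or the Response, and SQLite merely stands in for a commercial platform. The point shown is only that declarative constraints held in the database reject bad data no matter what the calling objects do, and that a single transactional write path is all the application needs.

```python
import sqlite3

def make_db():
    """Create an in-memory database with ordinary declarative constraints."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
    conn.executescript("""
        CREATE TABLE account (
            account_no INTEGER PRIMARY KEY,
            name       TEXT NOT NULL
        );
        CREATE TABLE trade (
            account_no INTEGER NOT NULL REFERENCES account (account_no),
            trade_no   INTEGER NOT NULL,
            quantity   INTEGER NOT NULL CHECK (quantity > 0),
            PRIMARY KEY (account_no, trade_no)   -- compound key
        );
    """)
    return conn

def add_trade(conn, account_no, trade_no, quantity):
    """Hypothetical 'transaction': the only write path offered to the app."""
    try:
        with conn:  # commits on success, rolls back on any exception
            conn.execute(
                "INSERT INTO trade (account_no, trade_no, quantity) VALUES (?, ?, ?)",
                (account_no, trade_no, quantity),
            )
        return True
    except sqlite3.IntegrityError:
        return False

conn = make_db()
with conn:
    conn.execute("INSERT INTO account VALUES (1, 'Alpha')")

results = [
    add_trade(conn, 1, 1, 100),  # valid row
    add_trade(conn, 2, 1, 100),  # no such account: foreign key rejects it
    add_trade(conn, 1, 2, -5),   # CHECK rejects a non-positive quantity
    add_trade(conn, 1, 1, 50),   # duplicate compound key is rejected
]
trade_count = conn.execute("SELECT COUNT(*) FROM trade").fetchone()[0]
print(results, trade_count)
```

The caller holds no integrity logic at all; it only learns whether the transaction succeeded. Whether this scales to the legislative requirements described above is not something a sketch this small can show.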

Please feel free to ask questions about anything you do not understand, I expect there will be a few.

I can't give you the whole Response, but I can obtain permission and provide a couple of the key pages from it, particularly those with the diagrams that explain the Solution. But first, I hope you don't mind me asking, please confirm that you can /read/ standard IDEF1X data models that we have been using since 1985, and UML classifiers. I am shocked to find out, as evidenced in the Normalisation thread, that many (all ?) theoreticians cannot do that, eg. they cannot read the predicates or the constraints that are in diagrammatic notation, and ask for them to be spelled out in text form.

Cheers
Derek

James K. Lowden

Feb 4, 2015, 12:45:27 AM

On Tue, 3 Feb 2015 06:42:11 -0800 (PST)
Derek Asirvadem <derek.a...@gmail.com> wrote:

> Given the content of previous discussions, a short restatement of my
> position re the HM and its relevance to the RM is in order. I am
> saying the HM lives in the RM:
> - as a concept
> - a previously well-known model, method, of organising data
> - that is explicitly referenced, and used by Codd in the RM,
> throughout the paper
> - if one implements data according to the _Relational Normal Form_
> given, for which explicit steps are given, this leads to Relational
> Keys, which are compound keys ("non-simple")
> --- such keys, in the context of the series of tables in which each
> and all of those components ("simple" as well as all subsets of the
> "non-simple") are used, will form an hierarchy of keys
> - the set of such tables form an hierarchy of tables
> - such keys may well be called Hierarchical Keys (they are well-known
> as Relational Keys, I am not suggesting that we change it)

I hardly know where to begin, Derek. It seems we've been talking past
each other to some degree, because when we agree on what words mean, we
seem, by and large, to agree on the concepts we each espouse. And I
want to thank you, because in preparing this reply I gained a new
appreciation for the meaning of "data independence".

You raise too many points for me to answer one by one. Let me call
out some I think are important. If I mistake your meaning at any
juncture, please correct me.

> Note that in the RM, more than half of Codd's references are to
> products and product manuals. There were not too many theoretical
> papers in the field.

You seem to think one implies the other, that commercial products
preclude theoretical papers. But you must know that's not true. The
reason there were no papers is that there was no theory. You don't feel
that's important; I suggest that's one reason pre-relational systems
were so inelegant.

One giant leap owed to Codd that I think was (and often still is)
underappreciated is his adoption of value semantics. Your helpful
citation illustrates that point quite well, see next.

> There are many terms that Codd uses in the RM, which have gone out of
> fashion.

It does take some work to read Codd's 1970 paper while trying to
embrace the technological perspective of his audience in the days of
punch cards and drum memory.

Looking at the example in section 1.4, I finally see what you mean by
"hierarchy". And, fair enough, Codd says of Figure 3 {employee,
jobhistory, salaryhistory, children}, "The tree ... shows just these
interrelationships...." Having worked with the relational model all
these years, I look at that diagram and I don't see a tree. I see an
ancestor of a Chen diagram, and automatically assume the "nonsimple
domains" are tables. To Codd's contemporaries, the tree-ness was
obvious.

> when he gives the pre-requisites to his __Relational Normal Form__
> [1.4](1)(2), and in (1) states "collections of trees", we take that
> to mean:
> - trees with integrity
> - normalised to the extent that we did prior to the RM
> - no circular references
> - what I am calling, in retrospect __Hierarchical Normal Form__

Codd certainly knew that a tree is a kind of DAG. I don't know what
"normalized [before] RM" refers to, and I wouldn't rely on this example
to prove circular references can't exist in a relational database, but
I don't want to argue that point just yet, because we're talking
hierarchies. Requoting,

> - the set of such tables form an hierarchy of tables
> - such keys may well be called Hierarchical Keys

Sure, but only at a severe cost to meaning!

Codd's reader doubtless saw a hierarchy (effortlessly, as you do, as I
did not). But the example shows that they are *not* a hierarchy,
despite appearances. Figure 3(b) shows each relation (his term) having
"man#" as part of the key. It is not necessary to go through
jobhistory to get to salaryhistory. It is perfectly possible, as you
know, to

select birthyear, salary
from salaryhistory as j join children as c on j.man# = c.man#

If that kind of access is possible, in what sense do the four tables
form a hierarchy? Are we to say they have "hierarchical keys" simply
because employee->jobhistory->salaryhistory are related through their
foreign keys?

If that's what you mean, OK. Given that the tables don't have to be
used hierarchically, ISTM that calling them a hierarchy is to adopt a
blinkered view.

> To the extent that any hierarchy that exists in the data, is
> maintained as an hierarchy, after transformation to the Relational
> Model, the hierarchy lives, exists, breathes

No. What you're really saying is that the tables are related, and that
their relationships are manifest in their keys.

The hierarchical systems you remember so well adopted the idea --
and required the schema to manifest the idea -- that e.g. jobhistory is
a *property* of employee. (They didn't use that term, of course.) One
could not access jobhistory records except through a *pointer* acquired
through an employee record. The hierarchy wasn't just a notional (or
notational) communication convention; it constituted the access path.

With that example, I really think the fairest thing to say is that it
shows it's *not* a hierarchy. By adopting value semantics -- by making
the keys values instead of pointers -- each relation becomes
free-standing and self-consistent. We can think of them as forming a
hierarchy as a convenience; perhaps they'll be commonly used that way in
some application. But we're not required to. The new, non-hierarchical
relations can be combined in arbitrary ways. We can find the highest
salary for each year, without ever learning the men's names.
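
The two readings can be put side by side in a minimal sketch, using Python's built-in sqlite3 module. The table and column names roughly follow Codd's normalised employee example, with man# spelled man_no because # is not a plain identifier character; the exact column list and all sample rows are assumptions made for illustration. The compound keys do nest (man_no, then man_no + jobdate, then man_no + jobdate + salarydate), yet both queries reach the data by value without walking that chain.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
    CREATE TABLE employee (
        man_no    INTEGER PRIMARY KEY,
        name      TEXT NOT NULL,
        birthdate TEXT NOT NULL
    );
    CREATE TABLE jobhistory (
        man_no  INTEGER NOT NULL REFERENCES employee (man_no),
        jobdate TEXT NOT NULL,
        title   TEXT NOT NULL,
        PRIMARY KEY (man_no, jobdate)
    );
    CREATE TABLE salaryhistory (
        man_no     INTEGER NOT NULL,
        jobdate    TEXT NOT NULL,
        salarydate TEXT NOT NULL,
        salary     INTEGER NOT NULL,
        PRIMARY KEY (man_no, jobdate, salarydate),
        FOREIGN KEY (man_no, jobdate) REFERENCES jobhistory (man_no, jobdate)
    );
    CREATE TABLE children (
        man_no    INTEGER NOT NULL REFERENCES employee (man_no),
        childname TEXT NOT NULL,
        birthyear INTEGER NOT NULL,
        PRIMARY KEY (man_no, childname)
    );
    -- Invented sample rows
    INSERT INTO employee VALUES (1, 'Smith', '1940-03-01'), (2, 'Jones', '1945-07-15');
    INSERT INTO jobhistory VALUES (1, '1965-01-01', 'Clerk'), (2, '1966-06-01', 'Analyst');
    INSERT INTO salaryhistory VALUES
        (1, '1965-01-01', '1965-01-01', 5000),
        (1, '1965-01-01', '1966-01-01', 5500),
        (2, '1966-06-01', '1966-06-01', 7000);
    INSERT INTO children VALUES (1, 'Ann', 1963), (2, 'Bob', 1967);
""")

# The join quoted earlier: salaryhistory to children on the value of man_no.
# Neither employee nor jobhistory is named in the query.
join_rows = conn.execute("""
    SELECT c.birthyear, s.salary
    FROM salaryhistory AS s JOIN children AS c ON s.man_no = c.man_no
    ORDER BY c.birthyear, s.salary
""").fetchall()

# Highest salary for each year, without ever reading a name.
top_by_year = conn.execute("""
    SELECT substr(salarydate, 1, 4) AS yr, MAX(salary)
    FROM salaryhistory
    GROUP BY yr
    ORDER BY yr
""").fetchall()

print(join_rows)
print(top_by_year)
```

The foreign keys still record employee -> jobhistory -> salaryhistory, so the sketch does not settle whether "hierarchy" is the right word for that; it shows only that the relationships are carried by key values and do not constrain the access path.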

> I said the RM is a progression of the HM.

I suppose that's true, in the sense that the United States as
constituted in 1789 was a progression of government from what had
existed in 1775. Something came before the thing that came later, and
many people would call it "progress".

The RM was also revolutionary in 1) using math as a foundation, and 2)
rejecting the tree -- and with it, pointer semantics -- as the basis
for data organization.

--jkl

Derek Asirvadem

Feb 4, 2015, 12:45:34 AM

James
Jan

Re a number of items that have been exposed on this thread, the OO/ORM+NonRdb Model in particular


Have you guys heard this Irish joke ?

Drunk guy, at two o'clock in the morning. On his hands and knees, in the street, between two cars, desperately looking for something. London bobby arrives.

Bobby: "Ello, ello, ello, what have we here ?"

Paddy: "Ello, ossifer, jusht looking for mah car keysh."

Bobby: "Where is your car ?"

Paddy, pointing 20 metres down the road: "Over there, ossifer."

Bobby: "Where did you lose your keys ?"

Paddy, pointing to his car: "Over there, ossifer."

Bobby: "So, tell me, if you lost your keys over there, why are you looking for them over here ?"

Paddy, pointing at the streetlight above their heads: "Becosh there is light over here."

----

Have you read Abraham Maslow's Hammer theory ?
"If I had a hammer, I'd be lookin' for a nai-i-a-i-ail."

Maslow wrote an excellent book on the dangers of reductionism. Unfortunately, instead of heeding it, and modulating the reductionism that their science is founded upon, theoreticians have taken it to the extreme. In the theoretical void in our field, data is reduced to single characters in italics, devoid of meaning and context.

Have you read Kruger and Dunning's 1999 paper ?
__Unskilled and Unaware of It: How Difficulties in Recognizing One's Own Incompetence Lead to Inflated Self-Assessments.
AFAIC it is essential reading for anyone who considers themselves a theoretician, mandatory reading for decision-makers in our industry. That paper is especially good, because there was an outcry from the pharisees, so K&D performed further research and further confirmed their findings in 2008
__Why the unskilled are unaware: Further explorations of (absent) self-insight among the incompetent

Cheers
Derek

Jan Hidders

Feb 4, 2015, 7:27:57 AM

Hi Derek,

It's the end of my lunch break, and I really need to get back to my grading work, so I am going to very aggressively zoom in on what I think are the main points. I will also not try to fully explain and answer everything, just try to make my position clear.

Op woensdag 4 februari 2015 06:44:36 UTC+1 schreef Derek Asirvadem:
> Jan
>
> > On Monday, 2 February 2015 21:47:19 UTC+11, Jan Hidders wrote:
> > Op zondag 1 februari 2015 06:33:40 UTC+1 schreef Derek Asirvadem:
> >
> > > > I would btw. be very curious to know what conclusions your client would think he or she could draw from them.
> > >
> > > Since I have already given you a synopsis, a short chronology, I am not sure what you mean. Would you like a more complete one ?
> >
> > More detail, as in where the devil lives. Because I suspect more is concluded from the paper than is actually warranted and meant by the author. But if we get to that later, that's fine.
>
> You will understand perhaps, that I am restricted due to contract and confidentiality issues.
>
> Sydney. One of the larger Australian banks. From experience, we are much more conservative, and we have much more legislature governing, than American banks. Probably (not sure, from speaking to colleagues) somewhat more than EU banks. Customer has an app, OO, ORM, all the OO bells and whistles. The data store is in the app, ie. closed architecture, but deployed on MS SQL for convenience (SQL DML, backups). Typical OO monolith, typical OO minus RDB problems. Hundreds of objects have progressed to thousands, a maintenance and performance nightmare. Three years of failure, of various kinds, but the most important is lack of data integrity, and every time they install a new version, a whole set of new bugs are exposed. Team leader has a lot of influence, but the auditors gave the business an ultimatum: fix it or shut it down. Second most important: the problems getting the data in/out of the data store, and behind that (really the same problem) lack of Relational Database access. Performance is crap but they are used to it. So they came to me (I had replaced an app+Db in another division in NZ, and the auditors had nice words), and the directive is, replace it with an RDb+App, and teach the team standards, so that they never make these mistakes again. The business has no choice, he doesn't need to get a budget allocated, as it is a risk issue and bank level funds are already allocated. I work for the business, so there is some politics, but I answer to, get everything signed off by, the auditors.
>
> Paper. The Team leader is on his last legs. There is no pressure that he will get fired, but he is dying from loss of credibility. He is adamant that if he adds more complexity to the app, to the objects and classifiers, the app will "work", the bugs will be fixed. They have heard this for three years. This time the difference is, he is making a formal presentation, along with two papers that support his position. I am pretty sure that he got that from his OO groupies, no idea who that is, probably some consulting firm that his wife or brother knows. I say that because the presentation was reasonably professional, while denying all the facts.
>
> This is the typical state of the majority of the OO world: despite the past 100% failures, they will do it better next time, promise. Note that if nothing changes, nothing changes.
>
> DBPL is yours. The central theme of your paper is <read abstract>. He is saying that there is scientific, theoretical, evidence that more complexity deployed in the OO classifier layer will fix the data integrity problems, thus it is his team's fault for not doing that properly, not a problem with the OO/ORM concept (specifically no independent RDb), and he deserves yet another chance.
>
> After a quick consultation as to the veracity of each of his statements, proved to be zero, the auditors have asked me to take the presentation down formally. The business has asked me to do so softly.
>
> Enough ?

Yes, I think so. Very interesting. Many thanks for that.

> Proper Use of Paper

Hard to be completely certain, but I think the answer is a very strong: No. Unless his project was to write a programming or query language for a similar data model and implementing the type inference mechanism in there. That seems unlikely from your description.

> Here is what he presents (what the whole OO/ORM world uses):
> - the premise of the paper (central theme ?) is that there are data integrity issues that result from (eg) multiple inheritance
> - that such issues can be corrected by increasing the level of complexity in the objects, the classifiers
> - a proof and specific methods are given
> From reading your paper, I cannot say that he (or the OO/ORM crowd) is using it improperly

"Improperly" does not even come close. :-)

> [.... completely unjustified snip ignoring many important issues ....]
>
> > > > What matters is, which objective arguments were put forward to support that interpretation and what the evidence for its merit was. Which interpretation leads to the most effective DBMSs [, RDBs] and what scientific evidence is there for this.
> > >
> > > Note my insertion.
> >
> > Yes, noted. But I disagree with the R there.
>
> What, in this day and age, you give assent to a database that is not Relational ?

I don't exclude the future possibility. That is not the same thing.

> Meanwhile, back at the farm, you are on record (above) as agreeing with specific scientific and architectural principles that the model breaks.

It doesn't. The paper is talking about a certain data model as a logical data model, not as a physical data model. It has nothing to say about how the data is to be stored or even in what type of database.

> ____
>
> Ok, to summarise your paper. In the context of this thread.
>
> - it acknowledges and supports the OO/ORM+data store model

No. Only as a logical data model.

> - it acknowledges that there are myriad errors in the model
> ___ that data integrity (the tiny bit that you do understand) is broken

No. It looks at how to reason over types and find inconsistencies in schema specifications. Something that happens in any reasonably expressive data modelling language where you can specify integrity constraints. That does not mean that such a language is broken. The opposite could be claimed: if your constraint language is so weak that it cannot express inconsistencies as studied here, it is probably too weak.
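
To make "finding inconsistencies in schema specifications" a bit more concrete, here is a minimal sketch. It is not the method from the paper under discussion; it is only a toy illustration under stated assumptions. The schema, the class names (Person, Employee, Customer, StaffCustomer) and the helper functions are all hypothetical. Each class restricts an attribute to a set of allowed types, and a class that inherits from two parents with incompatible restrictions ends up with an attribute that has no admissible type, so the class can never have instances.

```python
# Hypothetical toy schema: every class may restrict an attribute to a set of types.
SCHEMA = {
    "Person": {"parents": [], "attrs": {"id": {"int", "str"}}},
    "Employee": {"parents": ["Person"], "attrs": {"id": {"int"}}},
    "Customer": {"parents": ["Person"], "attrs": {"id": {"str"}}},
    "StaffCustomer": {"parents": ["Employee", "Customer"], "attrs": {}},
}


def effective_attrs(schema, cls):
    """Intersect the allowed types of each attribute over all inheritance paths."""
    merged = {}
    # Constraints inherited from every parent must all hold at once.
    for parent in schema[cls]["parents"]:
        for attr, types in effective_attrs(schema, parent).items():
            merged[attr] = merged[attr] & types if attr in merged else set(types)
    # The class's own restrictions narrow the inherited ones further.
    for attr, types in schema[cls]["attrs"].items():
        merged[attr] = merged[attr] & types if attr in merged else set(types)
    return merged


def inconsistent_classes(schema):
    """Map each unsatisfiable class to the attributes left with no admissible type."""
    result = {}
    for cls in schema:
        empty = sorted(a for a, t in effective_attrs(schema, cls).items() if not t)
        if empty:
            result[cls] = empty
    return result


if __name__ == "__main__":
    print(inconsistent_classes(SCHEMA))
```

In this toy schema only StaffCustomer is reported, because it inherits "id must be int" from Employee and "id must be str" from Customer. The point of the sketch is that the inconsistency is a property of the specification, and can be detected before any data is stored.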

> - fails to mention RDBs; the principle of separating Data vs Program, and then dealing with each separately

It doesn't since those are not relevant for the results presented in the paper.

> - the author makes three cardinal errors

The reviewer only one. He completely misunderstands what the paper is about and its relevance within its scientific context.

I would say normally at this point something like "feel free to ask for more clarification" but I'm not sure if I will find the time to answer. :-(

> I can't give you the whole Response, but I can obtain permission and provide a couple of the key pages from it, particularly those with the diagrams that explain the Solution.

I don't think that's necessary. I'm guessing something like a rewrite of the schema where all class combinations are made explicit.

> But first, I hope you don't mind me asking, please confirm that you can /read/ standard IDEF1X data models that we have been using since 1985, and UML classifiers. I am shocked to find out, as evidenced in the Normalisation thread, that many (all ?) theoreticians cannot do that, eg. they cannot read the predicates or the constraints that are in diagrammatic notation, and ask for them to be spelled out in text form.

I don't think that's what they were doing, but yes, I can read IDEF1X and UML Class diagrams. Even taught them for a while to give my students some feeling of how the different notations work and what their differences are.

-- Jan Hidders

Derek Asirvadem

Feb 5, 2015, 3:05:27 AM
Jan

> On Wednesday, 4 February 2015 23:27:57 UTC+11, Jan Hidders wrote:
> Op woensdag 4 februari 2015 06:44:36 UTC+1 schreef Derek Asirvadem:

Thank you for your response.

> It's the end of my lunch break, and I really need to get back to my grading work, so I am going to very aggressively zoom in on what I think are the main points. I will also not try to fully explain and answer everything, just try to make my position clear.

No problem.

> > ...Context...

> Very interesting. Many thanks for that.
>
> > Proper Use of Paper
>
> Hard to be completely certain, but I think the answer is a very strong: No. Unless his project was to write a programming or query language for a similar data model and implementing the type inference mechanism in there. That seems unlikely from your description.

Oh come on. I think I have explained it well enough. And you have not argued with the fact that it, and many papers like it, get used in the theoretical world that uses that model, and that model is implemented classically, in the OO/ORM world. I never said that it was a proper use; technically it isn't. But the fact is, in the physical universe, (a) it is used for the model and (b) the model is implemented. Therefore there is a direct responsibility between the theoreticians who write those papers and the implementation of OO/ORM, the madness that we have to deal with. And (c) they don't give up that madness, because the theoreticians are writing more papers.

If you are saying, you invented a small fold-up camping axe; someone used that axe to commit murder, yes, of course I agree. But it isn't that simple. You have written a paper on how to use that axe to circumvent forensics, the result being that many people are murdering many people, and the murder weapon cannot be found. Therefore, beyond inventing greatly-needed camping axes, to the extent that you subvert the [finding the] truth, you enable murderers. That is what I want to deal with.

I have also tried to make it clear that this is not an attack on you personally. It is an attack on all such papers that support a model that is unscientific, breaks well-known principles, etc. You are just the single one who is courageous enough to (a) deal with it in the physical implementation universe, and (b) have one of your papers used as an example.

> > From reading your paper, I cannot say that he (or the OO/ORM crowd) is using it improperly
>
> "Improperly" does not even come close. :-)

Agreed. Within the context explained previously, and above.

But he already has, it is a done deal. I have to deal with that, because his use of the paper was not questioned. And in the context, I could not identify any improper use. (You have identified it to me now, but even that, does not deny the context.)

> > > > > What matters is, which objective arguments were put forward to support that interpretation and what the evidence for its merit was. Which interpretation leads to the most effective DBMSs [, RDBs] and what scientific evidence is there for this.
> > > >
> > > > Note my insertion.
> > >
> > > Yes, noted. But I disagree with the R there.
> >
> > What, in this day and age, you give assent to a database that is not Relational ?
>
> I don't exclude the future possibility. That is not the same thing.

Ok. I don't think you are splitting hairs. You are saying that you support, the model supports any database, not necessarily Relational. Even though all the theories and the implementations post-relational-paradigm have been total failures. Is that correct ?

Note my insertion. I did not remove yours, I added mine, and noted it.

So this might be a great opportunity for you to entirely absolve yourself of responsibility. In such papers, you should clearly state two things:
a. the model used, that any "database" or data store will do, they are implementation details (which you have done, but not to a clear degree)
b. that the paper, the model, is written in ignorance of the RM, of RDBMS, of RDBs.

Otherwise, in this day and age, where only one database model (Relational) exists in terms of any significance (again, HM and NM don't), you run the risk of people (your side as well as implementers) thinking that the model applies to the Relational paradigm. And to let them go on thinking that, if it were not true, is disingenuous. Again, I am attacking the causes, the people who suggest such a model is valid, even if only for contemplation on the dark side of the moon.

> > Meanwhile, back at the farm, you are on record (above) as agreeing with specific scientific and architectural principles that the model breaks.
>
> It doesn't. The paper is talking about a certain data model as a logical data model, not as a physical data model. It has nothing to say about how the data is to be stored or even in what type of database.

(Separate to my notes above, which should be repeated here.)

So it is good for people who don't know the architectural principle, and who use ISAM. Not otherwise. Ok, fine.

So let us limit the issue discussed in your paper as it regards the model, only. It supports the model. The model is bankrupt. All my comments still apply (I went over them):
- you SHOULD know that the model breaks the architectural principle
- you SHOULD know that supporting such a model is anti-science
- you propose yet another fix-up within the model, you give a method
- you SHOULD know that the method breaks the architectural principle
- you SHOULD know that giving such a method is anti-science

What the OO/ORM crowd do with the model; how they implement it (RFS, not RDB); is no concern of yours. Fine. They are imbeciles, and they implement that model, your method included. Fine. Billions of dollars wasted, none of those systems work; millions of people waste millions of manhours fixing; improving; up-grading; elevating; inflating such systems, and still none of them work.

But the high end of the market, the 5%, we know that the architectural principle is broken; that the model violates science; and oh, btw we have had RDBs that have no such problems since 1984.

So why do theoreticians in this field exist ? What do they do ? They contribute nothing to science; nothing to the Relational paradigm, and any contributions that they do make, are to support the vertical column of pig poop. And even that, they deny responsibility for.

> > Ok, to summarise your paper. In the context of this thread.
> >
> > - it acknowledges and supports the OO/ORM+data store model
>
> No. Only as a logical data model.

Yes. Agreed. And as you know, some people implement that, physically. They rely on the fact that their physical implementation is sound, supported by theories (that are naturally limited to the logical). Only Codd dealt with the physical.

There is a huge danger in reductionism (abstraction). That has been established by greater minds than mine. Maslow and others.

> > - it acknowledges that there are myriad errors in the model
> > ___ that data integrity (the tiny bit that you do understand) is broken
>
> No. It looks at how to reason over types and find inconsistencies in schema specifications. Something that happens in any reasonably expressive data modelling language where you can specify integrity constraints. That does not mean that such a language is broken.

> The opposite could be claimed: if your constraint language is so weak that it cannot express inconsistencies as studied here, it is probably too weak.

Hang on. That is correct. My language (IDEF1X plus SQL, which exists, for over thirty years, and can be used for implementation), which you may consider primitive, compared to yours (which does not exist)
- expresses all those things: consistencies; inconsistencies
__ the capable modeller perceives the missing bits as well, the novice does not
- expresses types; classifications, in a manner, and to greater detail than yours
- ELIMINATES the problem that you have, that you propose a solution for
- a swag of other constraints on data that you people know nothing about (I am mentioning that but not elaborating, in order to avoid exceeding the scope)

Thus our primitive language is far superior to what you are theorising about.
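
For readers who have not seen classifications expressed declaratively in SQL, the following is a minimal sketch of one common technique, an exclusive subtype with a discriminator column. It is NOT the solution from the Response mentioned in this thread, which has not been posted; it only illustrates the general claim that SQL constraints can carry a classification. The table and column names (party, person, organisation, party_type) are hypothetical, and SQLite via Python's sqlite3 module is used purely for convenience.

```python
import sqlite3

# Hypothetical tables: "party" is the supertype, "person" and "organisation"
# are exclusive subtypes. The discriminator party_type is part of the
# foreign key, so a row can only exist in the subtype its supertype row names.
DDL = """
CREATE TABLE party (
    party_id   INTEGER PRIMARY KEY,
    party_type TEXT NOT NULL CHECK (party_type IN ('P', 'O')),
    UNIQUE (party_id, party_type)
);
CREATE TABLE person (
    party_id   INTEGER PRIMARY KEY,
    party_type TEXT NOT NULL CHECK (party_type = 'P'),
    full_name  TEXT NOT NULL,
    FOREIGN KEY (party_id, party_type) REFERENCES party (party_id, party_type)
);
CREATE TABLE organisation (
    party_id   INTEGER PRIMARY KEY,
    party_type TEXT NOT NULL CHECK (party_type = 'O'),
    legal_name TEXT NOT NULL,
    FOREIGN KEY (party_id, party_type) REFERENCES party (party_id, party_type)
);
"""


def build():
    """Create an in-memory database with foreign key enforcement switched on."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves this off by default
    conn.executescript(DDL)
    return conn


def try_insert(conn, sql, params):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False


if __name__ == "__main__":
    conn = build()
    print(try_insert(conn, "INSERT INTO party VALUES (?, ?)", (1, "P")))
    print(try_insert(conn, "INSERT INTO person VALUES (?, ?, ?)", (1, "P", "Ann")))
    # Rejected by the foreign key: party 1 is classified 'P', not 'O'.
    print(try_insert(conn, "INSERT INTO organisation VALUES (?, ?, ?)", (1, "O", "Acme")))
```

The design choice here is that the constraint lives with the data, in the database, rather than in application objects, so every program that touches the tables is subject to it.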

Whatever you guys are theorising about, you should first find out about what the RM, RDBs, SQL actually do, and then theorise about something better. It is extremely moronic to theorise about something that is far less than what we have, what we have had, since 1985.

This is like Darwen's (TweedleDee) Toy Language. He suggests that it "will be" better than SQL, while demonstrating that he is clueless about SQL. He suggests that it is better than the RM, that the RM is "incomplete", while demonstrating that he is clueless about the RM. Oh, wait, he has a 43rd private definition of what the RM is.

> > - fails to mention RDBs; the principle of separating Data vs Program, and then dealing with each separately
>
> It doesn't since those are not relevant for the results presented in the paper.
> It doesn't since RDBs are not relevant for the results presented in the paper.

Accepted. With notation per above.

> It doesn't since [the principle of separating Data vs Program is] not relevant for the results presented in the paper.

Rejected, as detailed above.
Rejected, in the new restricted scope that I have accepted.

> > - the author makes three cardinal errors
>
> The reviewer only one.

Whoa. I am not a reviewer. I have neither the qualification, nor the coverage of that theoretical space. I did not at any time suggest that my comments were a review of any kind, formal or informal. I apologise for any lack of clarity in this regard.

I did take the time to explain the context; you know who I am, a practitioner with a high regard for theory (not camel poop), and a good coverage of the implementation space for that logical model: OO/ORM with or without an RDB; and a reasonably good implementer of the RM, RDBs. That do not break. Due to reliance on science in our field and related sciences.

So, please take all my comments (including my dismissal of such papers) from that position.

Note also that my Response is in two parts. Ch 1 is for managers and auditors. Once the principle is broken, it is unscientific, all else is lost.

Ch 2 is for the implementers who need the whole thing spelled out, and the solution as well.

> He completely misunderstands what the paper is about and its relevance within its scientific context.
>
> I would say normally at this point something like "feel free to ask for more clarification" but I'm not sure if I will find the time to answer. :-(

We are way past that point. It is implemented in the fact that the model is used as is (no significant change) in the OO/ORM world; that they have implemented it, making it physical. That he has implemented a well-known and understood model, correctly. But it is broken.

So, if what you say is correct, then hundreds of OO/ORM proponents; millions of OO/ORM implementers, have done the same thing.

I do not accept the word "scientific" in your sentence. All such papers are un-scientific, for reasons already detailed.

He is not using such papers within the theoretical exploration context, yes. But you cannot deny that such papers are used to validate and justify the classic OO/ORM model, and their implementations. All your mentors, Abiteboul, Hull, Vianu, et al, support the OO/ORM crowd, and do so openly. Just look at this pig poop in ch 3 "Relational Model" and ch 11/Translating to RM in the Alice book: those are serious attacks against the RM, and constructed on a fraudulent basis.

> > I can't give you the whole Response, but I can obtain permission and provide a couple of the key pages from it, particularly those with the diagrams that explain the Solution.
>
> I don't think that's necessary. I'm guessing something like a rewrite of the schema where all class combinations are made explicit.

Definitely not. That would make the same mistake of using the broken model to fix the broken model. That would also validate the fiction that it is a class combination issue, as well as create masses of rows to make the combinations explicit.

> > But first, I hope you don't mind me asking, please confirm that you can /read/ standard IDEF1X data models that we have been using since 1985, and UML classifiers. I am shocked to find out, as evidenced in the Normalisation thread, that many (all ?) theoreticians cannot do that, eg. they cannot read the predicates or the constraints that are in diagrammatic notation, and ask for them to be spelled out in text form.
>
> I don't think that's what they were doing,

Well, they said the words:
> > dependencies haven't been stated
> > Assuming certain external predicates [ie. implying they have not been given]
> > your drawing of boxes and lines

Maybe the English in Belgium has been redefined.

> > unskilled people present a drawing of boxes

Clearly the beast has not recognised an IDEF1X model, despite the fact that that is expressly stated in the OP. Despite the fact that any donkey can google and get a bit of understanding re the boxes and squiggles. Despite the fact that my 6-page Introduction comes up in the top five.

Anyway, I must acknowledge the powers you have to know what people who state they can't read the notation were doing, because it is certainly beyond me.

> but yes, I can read IDEF1X and UML Class diagrams. Even taught them for a while to

So you are clear on _not only_ the visual difference between:
- solid and dashed lines
- square and round corners
but on the consequences, the ramifications ? Ala the "expressiveness" in the non-existent theoretical languages, that exist in the primitive languages ?

And you will *not* /read/ anything in the model that is not in the model ? In the Address A model which does not have "candidate keys", you perceived such. Indeed, your intuition correctly picked up one of the *missing* constraints, due specifically to the *expressiveness*.

While I cover the OO side quite nicely (I dictate standards) and I lead teams (but we use 4GLs that generate objects, we do not actually write objects, same as we do not actually write SQL), I do not profess to be a specialist in UML notation. As you will know, since you are teaching it, it is not a standard by any stretch of the imagination, it cannot be used for verification of the completeness of the model, let alone verification of the model (both of which we can do in IDEF1X). Everyone adds and drops whatever they want. There is one symbol and it is used, misused, and abused, for everything. Most of the other symbols and notation is lost. Therefore, while I do draw UML diagrams (as distinct from erecting IDEF1X models), I am not a UML Notation specialist, I would ask that you forgive my mistakes.

> give my students some feeling of how the different notations work and what their differences are.

Ugh!

The OO/ORM crowd make many mistakes. But if you ask me to name one, the most serious mistake, that has the most negative consequences, I would say, failure to observe the scientific and architectural principle of separation of Data vs Program. Sure, we are naming that on this thread, but we are dealing with only one of the consequences, and there are many. Here is another. The consequence of NOT observing that principle means they view Data and Program together, they make mashed potatoes with minced meat, instead of french fries and roast beef. And then they have all sorts of problems separating the bits of meat from the mash, because half the time, they need EITHER meat XOR potatoes. And their controls on their mince-and-mash objects start to fail after two levels of inheritance, therefore that model too, is broken.

Here you have hit another consequence. The OO/ORM crowd, due to NOT observing said principle, PERCEIVE data and program as the same. Thus their Analysis, Design, Modelling, and Implementation of Data is severely limited. They perceive data through the lens of OO. So at the minimum, their A, D, M, I of data is severely handicapped. The dotted vs solid lines; the square vs round corners, not only have meaning, they have serious ramifications. Further, UML is nowhere near as rich as IDEF1X in terms of expressiveness or precision, so whatever is drawn in UML re Data, is even less classified. Those are Crimes of commission.

Separately, they commit crimes of Omission. Eg. NOT perceiving Data as Data, and treating it with the separate A, D, M, I methods that pertain to Data.

Absence of an understanding of, an appreciation of, the Relational Model (not a pick-and-choose list, not the "rm" of the theoreticians) means that one will not, can not understand the relevance of many of the _concepts_ in IDEF1X. And thus one cannot A, D, M, I according to the RM. Thus any A, D, M, I that they do is devoid of the Relational Integrity, Power, Speed that is given in the RM.

Therefore, UML-for-data-objects and IDEF1X for Data are not comparable, they are worlds apart. I don't see how one can teach them together or compare them for differences. Especially not, if the RM is not taught first. Or stated otherwise, the only place where such a comparison can be had, is in the invalid place of denial of the principle *and* denial of the RM.

Think about it. Half the problems that manifest in every UML-only implementation are wiped out if they use an RDB. As this sub-thread is going to demonstrate.

So you ask me, ok, Derek, what the frog would you teach.
1 The principle re Data vs Process, its relevance, danger of ignoring it.
2 For Data, the RM, the whole RM, and nothing but the RM
2.1 IDEF1X as the standard for modelling Relational databases
3 For Program components, the principles of programming; process flow; data flow; decomposition as a modelling method
3.1 SSADM for simple projects or IDEF0 for complex projects
3.2 UML for program objects, classifiers, etc

Importantly, [3.1] addresses decomposition; genuine modelling and progression; both data and process flows, something UML does not cover; doesn't have the methods or the notation for.

I have four tools in my tool kit. Not one Hammer.

----

And if we do go down this path, it is NOT on the basis of proving your paper wrong, etc.

As per this thread, as stated in this sub-thread, it IS on the basis of proving that in any logical or physical context (the paper is simply one example of hundreds):

- ignorance of or rebellion against the architectural principle of separation of /Data/ and /Process/, will kill your model, project, system, career

- any proposals based on such absence can be dismissed, such as our TeamLeader's; the classic OO/ORM model

- re the modelling of a database of any kind, ignorance of or rebellion against the Relational Model and the Standard for Modelling Relational Databases (IDEF1X) will place such a database at the theoretical, logical, and physical level of pre-1970 theory (ISAM, HM, NM) and pre-1985 modelling methods

- re the implementation of a database of any kind, ignorance of or rebellion against the Relational Model (and secondly, in its fullness as implemented by the commercial vendors, but we will exclude that on c_d_t, because we are still getting to the unexpurgated RM), will place such a database at the theoretical, logical, and physical level of pre-1960 Record Filing Systems (ie. ISAM. HM in the 60's, and HM and NM in the 70's and 80's were far more advanced)

(
___ only because the self-importance of theoreticians in this space needs to be acknowledged, since they have developed nothing useable or implemented (in successful systems) since 1970, whatever they are theorising about, can be safely ignored. I won't belabour the fact that that content is actually damaging to standards, science, success, that has been exposed, slowly, in these threads.

___ and the corollary, since the only things the theoreticians have produced have been used by naïve implementers, who themselves ignore or rebel against science, and since all such systems are failures, disasters, train wrecks, the theoreticians can only produce train wrecks.
)

Do feel free to change that pathetic state of affairs.

Cheers
Derek

Jan Hidders

Feb 5, 2015, 5:53:58 AM
Op donderdag 5 februari 2015 09:05:27 UTC+1 schreef Derek Asirvadem:
> Jan
>
> > On Wednesday, 4 February 2015 23:27:57 UTC+11, Jan Hidders wrote:
> > Op woensdag 4 februari 2015 06:44:36 UTC+1 schreef Derek Asirvadem:
>
> Thank you for your response.
>
> > It's the end of my lunch break, and I really need to get back to my grading work, so I am going to very aggressively zoom in on what I think are the main points. I will also not try to fully explain and answer everything, just try to make my position clear.
>
> No problem.
>
> > > ...Context...
>
> > Very interesting. Many thanks for that.
> >
> > > Proper Use of Paper
> >
> > Hard to be completely certain, but I think the answer is a very strong: No. Unless his project was to write a programming or query language for a similar data model and implementing the type inference mechanism in there. That seems unlikely from your description.
>
> Oh come on. I think I have explained it well enough. And you have not argued with the fact that it, and many papers like it, get used in the theoretical world that uses that model, and that model is implemented classically, in the OO/ORM world. I never said that it was a proper use; technically it isn't. But the fact is, in the physical universe, (a) it is used for the model and (b) the model is implemented. Therefore there is a direct responsibility between the theoreticians who write those papers and the implementation of OO/ORM, the madness that we have to deal with. And (c) they don't give up that madness, because the theoreticians are writing more papers.
>
> If you are saying, you invented a small fold-up camping axe; someone used that axe to commit murder, yes, of course I agree. But it isn't that simple. You have written a paper on how to use that axe to circumvent forensics, the result being that many people are murdering many people, and the murder weapon cannot be found. Therefore, beyond inventing greatly-needed camping axes, to the extent that you subvert the [finding the] truth, you enable murderers. That is what I want to deal with.

[since this is the main point, I will focus on this]

I don't accept the premise that OODBMSs are necessarily evil, and disagree that they inherently break important engineering principles such as separation of process and data, and the principle of real data independence. There were research prototypes which you have probably never heard of that got this reasonably right. But for reasons we can go into later all the commercial products broke these principles. Not that this played the biggest role in their lack of success, but it did play a role.

I distinguish that from ORMs, which I consider an unhappy kludge, even if many practitioners seem very happy with it and defend it with both the same type of arguments (their practical experience, building real-world systems, having more experience with existing technology, etc.) and vigour that you are displaying here. In my own experience they can have their use, but can also cause great damage. In all cases, at the time that I wrote my paper it was far from clear that this would become a leading paradigm. It was certainly not what I had in mind. And there has actually been research, theoretical and practical, since then to allow easier access to data sources, relational or not, from programming languages without the need for ORMs. Understanding type inference is actually a part of that.

> > > From reading your paper, I cannot say that he (or the OO/ORM crowd) is using it improperly
> >
> > "Improperly" does not even come close. :-)
>
> Agreed. Within the context explained previously, and above.
>
> But he already has, it is a done deal. I have to deal with that, because his use of the paper was not questioned. And in the context, I could not identify any improper use. (You have identified it to me now, but even that, does not deny the context.)

You cannot blame scientists for non-scientists misunderstanding and abusing their papers. It's simply not practical to try and make them fool-proof in that sense.

> So let us limit the issue discussed in your paper as it regards the model, only. It supports the model. The model is bankrupt. All my comments still apply (I went over them):
> - you SHOULD know that the model breaks the architectural principle

Let me stop you right there. As you know I do not agree with that, so if you want a meaningful exchange of ideas, we'll first need to establish this.

I have on my side all the relevant research communities, not just the more theoretical ones such as PODS, ICDT and DBPL, but also the more applied communities such as VLDB, SIGMOD, ICDE, EDBT, etc. I visit all of them regularly. I have organised workshops in some of them. Been in discussion panels. I meet there both theoreticians and experts on DBMS construction, many of them also working at commercial DBMS suppliers and working on actual DBMSs. Did I already mention that I actually have done some systems research myself on indexing and query optimisation? Hence my interest in these communities.

Pretty much all the researchers in these research communities disagree with that claim of yours. These are the leading experts in the world on these matters.

Not that I consider this a very strong argument. I don't. It is argument by authority, which should be avoided in real scientific discussions. But I hope it brings home the point that you will have to come up with something better than your personal practical experience as the only argument justifying that claim.

> > > - it acknowledges that there are myriad errors in the model
> > > ___ that data integrity (the tiny bit that you do understand) is broken
> >
> > No. It looks at how to reason over types and find inconsistencies in schema specifications. Something that happens in any reasonably expressive data modelling language where you can specify integrity constraints. That does not mean that such a language is broken.
>
> The opposite could be claimed: if your constraint language is so weak that it cannot express inconsistencies as studied here, it is probably too weak.
>
> Hang on. That is correct. My language (IDEF1X plus SQL, which exists, for over thirty years, and can be used for implementation), which you may consider primitive, compared to yours (which does not exist)
> - expresses all those things: consistencies; inconsistencies

IDEF1X by itself cannot, but that's actually fine given the purpose it was designed for. A stronger constraint language is not always a better thing.

> He is not using such papers within the theoretical exploration context, yes. But you cannot deny that such papers are used to validate and justify the classic OO/ORM model, and their implementations. All your mentors, Abiteboul, Hull, Vianu, et al, support the OO/ORM crowd, and do so openly. Just look at this pig poop in ch 3 "Relational Model" and ch 11/Translating to RM in the Alice book: those are serious attacks against the RM, and constructed on a fraudulent basis.

Yeah, I know you claim that. I don't see it that way, and neither do any of the database researchers, both theoretical and applied, that I know. So unless you are going to supply some real argument for this claim, this discussion is not going to go forward.

> > > But first, I hope you don't mind me asking, please confirm that you can /read/ standard IDEF1X data models that we have been using since 1985, and UML classifiers. I am shocked to find out, as evidenced in the Normalisation thread, that many (all ?) theoreticians cannot do that, eg. they cannot read the predicates or the constraints that are in diagrammatic notation, and ask for them to be spelled out in text form.
> >
> > I don't think that's what they were doing,
>
> Well, they said the words:
> > > dependencies haven't been stated
> > > Assuming certain external predicates [ie. implying they have not been given]
> > > your drawing of boxes and lines

Indeed. The complaint was that it was not clear whether all relevant dependencies had been made explicit in the diagram and/or the description, which is necessary for correct normalization, and which, as you yourself indicated, was not the case.

> > but yes, I can read IDEF1X and UML Class diagrams. Even taught them for a while to
>
> So you are clear on _not only_ the visual difference between:
> - solid and dashed lines
> - square and round corners
> but on the ramifications, the consequences ? Ala the "expressiveness" in the non-existent theoretical languages, that exist in the primitive languages ?

Yes.

> And you will *not* /read/ anything in the model that is not in the model ? In the Address A model which does not have "candidate keys", you perceived such. Indeed, your intuition correctly picked up one of the *missing* constraints, due specifically to the *expressiveness*.

Not sure what expressiveness has to do with this, but yes, when questions of normalization are considered in practice it must be thoroughly checked that all valid dependencies have been made explicit. Many incompetent data modellers tend to forget that.

> While I cover the OO side quite nicely (I dictate standards) and I lead teams (but we use 4GLs that generate objects, we do not actually write objects, same as we do not actually write SQL), I do not profess to be a specialist in UML notation. As you will know, since you are teaching it, it is not a standard by any stretch of the imagination, it cannot be used for verification of the completeness of the model, let alone verification of the model (both of which we can do in IDEF1X). Everyone adds and drops whatever they want. There is one symbol and it is used, misused, and abused, for everything. Most of the other symbols and notation is lost. Therefore, while I do draw UML diagrams (as distinct from erecting IDEF1X models), I am not a UML Notation specialist, I would ask that you forgive my mistakes.

No problem. I'm not a big fan of UML for data modelling myself.

> The OO/ORM crowd make many mistakes. But if you ask me to name one, the most serious mistake, that has the most negative consequences, I would say, failure to observe the scientific and architectural principle of separation of Data vs Program.

The crowd that I'm in does not do that, so I'm not sure who you are talking about now.

-- Jan Hidders

Derek Asirvadem

unread,
Feb 5, 2015, 6:14:01 PM2/5/15
to
James

> On Wednesday, 4 February 2015 16:45:27 UTC+11, James K. Lowden wrote:
> On Tue, 3 Feb 2015 06:42:11 -0800 (PST) Derek Asirvadem <derek.a...@gmail.com> wrote:

Thank you for your response.

I was preparing one for you, but I ran out of time, Fri is chockers for me, the weekend dawns, your post will be left unanswered for a long time ... so I am going to take a different approach, taking into account that you may have a bit of time on the weekend as well.

The bottom line re my perception of your position is this. You have moved from "there is nothing from the hierarchies/HM in the RM" to "holy cow, there is something of hierarchies/HM in the RM". If it need be said, I am not trying to convert you (eg. to "the hierarchies/HM lives in the RM"), but given that I happen to be the one who has the pleasure of introducing this subject to you (and it is a big subject, with many ramifications and consequences), then it is my job to transmit the subject and to transmit it intact, not in pieces. Intact, over the next year, (a) you will not lose it, and (b) you will perceive more, and get more out of it. If all you have is one big chunk, over the next year, it will go back to zero.

It took me two projects, probably three years, to get the full quid, all the ramifications and consequences, to obtain the full Integrity, Power, and Speed from the RM. I had to read the RM probably four times, and each time, I got more out of it, and implemented one more increment of it. I have the experience of decades. For you, you are fighting (mentally) what has been drummed into you, what you believe from your teachers, ala HM is hopeless, no value; there is nothing hierarchical about the RM. So it is a huge step that you have crossed the line (but you have not accepted the full core), and I acknowledge you for that.

Yes, terms are very important, and private definitions, or thinking the other person means something other than the word he is using, damages communication.
- When I say "Hierarchy" I mean the English word, to its full meaning, and its additional meaning in the computer software industry.
- When I say "Hierarchical Model" I mean the very real thing that existed 1960 to 1984, as implemented by IBM/IMS, philosophically; theoretically; logically; physically. Logically and physically means the storage and Access Path are Dependent.
- Of course when I compare "Hierarchical Model" to something else, eg. the RM, or apply the concept to a set of tables implemented in an RDBMS, the physical part does not apply, the physical part that does apply is the RM (logically, first) and the particular storage methods available in the particular RDBMS.

To wit, when I said "Hierarchy", up to now (including your last post) you have taken that to mean "Hierarchical storage", and that is not correct, I meant "Hierarchy".

1. Therefore, you may wish to re-read my last couple of posts on the subject, with that in mind, if you do, you will get more meaning out of them, and I won't have to post detailed explanations. I am identifying this as an option, not necessarily a recommendation.

2. Therefore, if you have time before Mon, when I next expect to post, I recommend that you read those same sections of the RM again, (1.4) plus preceding, with the following in mind. And with the following notes, side by side.

a. Are you clear that Codd's "non-simple" means a Repeating Group. What you might call a nested set or nested relation. What would be implemented Relationally as a child table ?

b. Are you aware that Codd uses the word "Domain", beyond the way it is currently understood. Ie. it applies, each and every time (he uses it in many contexts), to each context, and fully ?

c. There is no Fig (3). There are two figures, Fig 3(a) and 3(b). Fig 3(a) has a top and bottom half.

d. When you get to [1.4 Normal Form, para 2], please see if this diagram explains Codd's Fig 3(a). They could not insert diagrams that were created outside the publishing house in publications in 1970, ie. the author's. This is my understanding of what Codd wishes to communicate.
http://www.softwaregems.com.au/Documents/Article/Normalisation/RM%20Foo%203_A_Top.pdf

e. When you read anything and everything that Codd has to say about the limitations of the HM, please see if this diagram helps. It is an overview, depicting the salient aspects that Codd raises. Note that it is not detailed enough to show the storage issue (which I believe you understand) to the full extent of its agony. Please ask, if you want that.
http://www.softwaregems.com.au/Documents/Article/Normalisation/RM%20Foo%203_A.pdf

f. Then read and understand all of Codd's material in [1.4] plus surrounds.

g. IFF you are happy with all the above [a] through [f], AND you have gotten more out of reading the RM with these in mind, THEN and only then, please examine this doc. It is the tables that anyone who accepts the RM, without hindrances such as a blind spot re Hierarchies, would have created, directly from Codd's [1.4 Normal Form] METHOD, from para 1, to the words "R(g).r.d". Nothing more, nothing less. No contemporaneous understanding required. I would call that which he has given __Relational_Normal_Form__

Codd gives the Keys in Fig 3(b), they are in italics, the non-key attributes are not in italics.

http://www.softwaregems.com.au/Documents/Article/Normalisation/RM%20Foo%203_B.pdf

It is given in response to your:
> If that kind of access is possible, in what sense do the four tables
> form a hierarchy? Are we to say they have "hierarchical keys" simply
> because employee->jobhistory->salaryhistory are related through their
> foreign keys?

The solid vs dashed lines; the square vs round corners, have (a) specific meaning, and (b) a slew of ramifications. If you are not familiar, really familiar with what that means, please check:
http://www.softwaregems.com.au/Documents/Documentary%20Examples/IDEF1X%20Introduction.pdf

I am not plugging "Hierarchical Key". I accept that "Relational Key" is the correct and widely-understood term. I was saying, they could as well have been called "Hierarchical Key", because the components in the Key reflect the Hierarchy of the tables, the hierarchy of the Domain, the hierarchy of the information. Regardless of storage.

Further, I would say that any configuration of the tables from Codd's example, that is NOT in the rendition that I have given in Relational form, is missing something serious from the RM (Integrity, Power, Speed, any combination of that), to the extent that it is different.

> And I
> want to thank you, because in preparing this reply I gained a new
> appreciation for the meaning of "data independence".

Data Independence is a large subject, with several dependencies, the hierarchical understanding of the implementer being just one component of it, so I will leave that out for now, and resume it after you have confirmed a better understanding of Codd's ideas re Hierarchies.

Ok, it wasn't short after all!

Cheers
Derek

Derek Asirvadem

unread,
Feb 5, 2015, 9:32:24 PM2/5/15
to
James

> On Friday, 6 February 2015 10:14:01 UTC+11, Derek Asirvadem wrote:
>
> 2. Therefore, if you have time before Mon, when I next expect to post, I recommend that you read those same sections of the RM again, (1.4) plus preceding, with the following in mind. And with the following notes, side by side.

To be clear. I think you have some respect for Codd, although that is diminished due to the barrage of misinformation from your teachers. Codd's words in the RM are everything. My words are nothing.
____ (Yeah, sure, this whole thread is about confronting the denial re the HM is in the RM, and those words are relevant, but when dealing with the RM proper, my words are nothing, Codd's words are everything.)

Thus the diagrams I have provided, are of mild interest only, and intended to assist in understanding what Codd meant in the RM, if only because the contemporaneous understanding is missing, suppressed, denied, these days. Whatever you get re Hierarchies in the RM, should be gotten from the RM, not from me.

Thus in the diagrams I have provided, I have used the fewest words possible, and then, only to assist in comprehension of the diagram.
- I use Helvetica for titles and Times for text
- Specific references to the RM are in Georgia, as it was the RM

> g. IFF you are happy with all the above [a] through [f], AND you have gotten more out of reading the RM with these in mind, THEN and only then, please examine this doc. It is the tables that anyone who accepts the RM, without hindrances such as a blind spot re Hierarchies, would have created, directly from Codd's [1.4 Normal Form] METHOD, from para 1, to the words "R(g).r.d". Nothing more, nothing less. No contemporaneous understanding required. I would call that which he has given __Relational_Normal_Form__

And thus, the pre-requisite Codd has given, although he calls it the "Unnormalised Form" in the context of the RM he is introducing, based on the specific terms used in the pre-requisite, is in fact what could be named the __Hierarchical_Normal_Form__.

We don't have to discuss that, or what it is, or what a good name for it is, but I say that that which is given is indisputable: an hierarchy, a tree, with integrity.

[h] Further, I would say that any configuration of the tables from Codd's example, that is NOT in the rendition that I have given in Relational form, is missing something serious from the RM (Integrity, Power, Speed, any combination of that), to the extent that it is different.

I expect some discussion there. People will want to arrange the tables in some manner that subtracts from the hierarchy I have given, and to that extent, the hierarchy, which is important for understanding the DATA, will not be transmitted to the reader. Likewise, another person may have keys in a sequence other than that which Codd has given, in which case he is rebelling and refusing to accept the whole that Codd has given.

Cheers
Derek

Derek Asirvadem

unread,
Feb 6, 2015, 5:58:49 AM2/6/15
to
> On Friday, 6 February 2015 10:14:01 UTC+11, Derek Asirvadem wrote:

All diagrams upgraded

Cheers
Derek

Derek Asirvadem

unread,
Feb 12, 2015, 2:00:54 AM2/12/15
to
James

> On Wednesday, 4 February 2015 16:45:27 UTC+11, James K. Lowden wrote:

I am reminded, again, just how far apart our worlds are.

> select birthyear, salary
> from salaryhistory as j join children as c on j.man# = c.man#
>
> If that kind of access is possible, in what sense do the four tables
> form a hierarchy? Are we to say they have "hierarchical keys" simply
> because employee->jobhistory->salaryhistory are related through their
> foreign keys?

First, I have to say, while some posters have demonstrated consistent self-contradiction, where virtually every para contradicts the next, you have not. Sure, you have been rigid; refused to take up minor challenges that would demonstrate your declared knowledge; refused to give me an example where I could demonstrate that the ideas you have are false; etc, but you have not been self-contradictory. Therefore when I saw the staggering contradiction between your two paras above, it stayed with me. How could someone who could figure out the SQL required in p1, NOT understand what he is asking in p2 ? The two paras from the one person just did not jibe.

> Are we to say they have "hierarchical keys" simply
> because employee->jobhistory->salaryhistory are related through their
> foreign keys?

Yes, of course that is true, but that is not the reason I suggest they are hierarchical keys. That would be a very weak argument to support the notion that they are "hierarchical keys".

The reason they are hierarchical keys is because employee->jobhistory->salaryhistory are related through their IDENTIFIERS, their PRIMARY keys (components of which, btw, happen to be foreign keys). Why did you not see that, why were you thinking that I was picking a weak reason when the massive reason is standing out there front and centre, like testosterone on a bull, the same massive reason that you must have used when you wrote that SQL ?

Despite the fact that Codd defined KEY in the preceding section, and then gave the keys:

> > Codd gives the Keys in Fig 3(b), they are in italics, the non-key attributes are not in italics.

you have implemented them as non-keys. Foreign keys, sure, but as attributes. Which means Non-identifying relations (dashed lines) rather than Identifying (solid lines).

The only way you would not see the advertisement on the bull, is that you are so very, very used to record filing systems; with surrogate record IDs (not "surrogate _keys_", there is no such thing); where all the relations are Non-identifying, that even when the Keys and Identifying relations are given, you see only non-keys, non-identifying relations. In that case, yes, the links (can't call them relations) that connect the files (can't call them tables) do not represent an hierarchy, one has to stretch to see it, the notion is weak.
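
A minimal sketch of the contrast drawn above, using Python's standard `sqlite3` module. The table layouts, column names, and sample rows are assumptions made for illustration only; neither layout is taken from Codd's paper or from anyone's actual implementation. The first layout uses surrogate record IDs with the parent referenced as a non-key attribute (Non-identifying), the second carries the parent's key inside the child's Primary Key (Identifying).

```python
import sqlite3

# Assumed "record filing" layout: every table has its own surrogate ID,
# and the parent is referenced through a non-key attribute.
SURROGATE = """
CREATE TABLE employee (employee_id INTEGER PRIMARY KEY,
                       man_no INTEGER NOT NULL UNIQUE);
CREATE TABLE jobhistory (jobhistory_id INTEGER PRIMARY KEY,
                         employee_id INTEGER NOT NULL REFERENCES employee (employee_id),
                         job_date TEXT NOT NULL);
CREATE TABLE salaryhistory (salary_id INTEGER PRIMARY KEY,
                            jobhistory_id INTEGER NOT NULL REFERENCES jobhistory (jobhistory_id),
                            salary_date TEXT NOT NULL,
                            salary INTEGER NOT NULL);
INSERT INTO employee VALUES (10, 1);
INSERT INTO jobhistory VALUES (20, 10, '1970-01-01');
INSERT INTO salaryhistory VALUES (30, 20, '1970-06-01', 10000);
"""

# Assumed layout with identifying relations: the parent key is part of the child key.
IDENTIFYING = """
CREATE TABLE employee (man_no INTEGER PRIMARY KEY);
CREATE TABLE jobhistory (man_no INTEGER NOT NULL REFERENCES employee (man_no),
                         job_date TEXT NOT NULL,
                         PRIMARY KEY (man_no, job_date));
CREATE TABLE salaryhistory (man_no INTEGER NOT NULL,
                            job_date TEXT NOT NULL,
                            salary_date TEXT NOT NULL,
                            salary INTEGER NOT NULL,
                            PRIMARY KEY (man_no, job_date, salary_date),
                            FOREIGN KEY (man_no, job_date)
                                REFERENCES jobhistory (man_no, job_date));
INSERT INTO employee VALUES (1);
INSERT INTO jobhistory VALUES (1, '1970-01-01');
INSERT INTO salaryhistory VALUES (1, '1970-01-01', '1970-06-01', 10000);
"""


def salaries_surrogate(man_no):
    conn = sqlite3.connect(":memory:")
    conn.executescript(SURROGATE)
    # Two joins are needed just to find out whose salary a row records.
    return conn.execute(
        """SELECT s.salary
           FROM salaryhistory AS s
           JOIN jobhistory AS j ON j.jobhistory_id = s.jobhistory_id
           JOIN employee   AS e ON e.employee_id   = j.employee_id
           WHERE e.man_no = ?""",
        (man_no,),
    ).fetchall()


def salaries_identifying(man_no):
    conn = sqlite3.connect(":memory:")
    conn.executescript(IDENTIFYING)
    # The employee's identifier is carried in the salary row's own key.
    return conn.execute(
        "SELECT salary FROM salaryhistory WHERE man_no = ?", (man_no,)
    ).fetchall()
```

Both functions return the same salary rows for a given `man_no`; the difference is only in how many tables must be navigated to reach them.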

I have known for some time, that theoreticians in this unserved space know only RFS, but I did not place you amongst them.

> Are we to say they have "hierarchical keys" simply
> because employee->jobhistory->salaryhistory are related through their
> foreign keys?

No. The reason they are hierarchical keys is because employee->jobhistory->salaryhistory are related through their IDENTIFIERS, their PRIMARY keys.

If you look at the keys Codd gave, the keys are Hierarchical:
__Employee ( ManNo )
____JobHistory ( ManNo, JobDate )
______Salary ( ManNo, JobDate, SalaryDate )
____Children ( ManNo, ChildName )

The FKs are used to form the PKs, as per Codd's definition of KEYS and construction in [1.4].
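
The key structure listed above can be sketched as DDL, here run through Python's standard `sqlite3` module so that it is executable. Only the Primary Key columns come from the listing; the non-key columns (`Name`, `Title`, `Amount`, `BirthYear`), the data types, and the helper names `build` and `orphan_salary_rejected` are assumptions for illustration.

```python
import sqlite3

# Hypothetical rendering of the four relations; only the key columns
# follow the listing in the post, everything else is assumed.
DDL = """
CREATE TABLE Employee (
    ManNo INTEGER NOT NULL,
    Name  TEXT    NOT NULL,
    PRIMARY KEY (ManNo)
);
CREATE TABLE JobHistory (
    ManNo   INTEGER NOT NULL,
    JobDate TEXT    NOT NULL,
    Title   TEXT    NOT NULL,
    PRIMARY KEY (ManNo, JobDate),
    FOREIGN KEY (ManNo) REFERENCES Employee (ManNo)
);
CREATE TABLE Salary (
    ManNo      INTEGER NOT NULL,
    JobDate    TEXT    NOT NULL,
    SalaryDate TEXT    NOT NULL,
    Amount     INTEGER NOT NULL,
    PRIMARY KEY (ManNo, JobDate, SalaryDate),
    FOREIGN KEY (ManNo, JobDate) REFERENCES JobHistory (ManNo, JobDate)
);
CREATE TABLE Children (
    ManNo     INTEGER NOT NULL,
    ChildName TEXT    NOT NULL,
    BirthYear INTEGER NOT NULL,
    PRIMARY KEY (ManNo, ChildName),
    FOREIGN KEY (ManNo) REFERENCES Employee (ManNo)
);
"""


def build():
    """Create an in-memory database with foreign key enforcement switched on."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves this off by default
    conn.executescript(DDL)
    return conn


def orphan_salary_rejected(conn):
    """Try to insert a Salary row whose JobHistory parent does not exist."""
    try:
        conn.execute(
            "INSERT INTO Salary VALUES (1, '1970-01-01', '1970-06-01', 10000)"
        )
    except sqlite3.IntegrityError:
        return True
    return False
```

In this sketch each child's Primary Key begins with the full Primary Key of its parent, so a `Salary` row cannot exist without its `JobHistory` row, which in turn cannot exist without its `Employee` row.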

> If that kind of access is possible, in what sense do the four tables
> form a hierarchy?

Refer my para immediately above, *and* my diagrams for Codd's Fig 3(b). The tables form a classic hierarchy, and the keys within each table form the classic Relational Hierarchy.

The mental block for me is that you said you had implemented many Relational systems, and I took you at your word. I thought, yeah sure, this guy knows some Relational, and I will get him across the line, to understand hierarchies in the RM.

So it is a bit of a shock, horror, for me to realise, hang on, he is a product of his teachers, they are clueless about the RM, they diminish the RM at every turn (attempting to show how their "vision" is somehow "better"), as evidenced, they know and implement only pre-1970 RFS; non-FD; non-key, so there is no way you could be anything but.

The reason your p1 contradicts your p2, is that you are using a non-relational Record Filing system, and even when Codd gives you Keys, you place them as fields, with an FK. Your rendition of Codd's words is not in Relational Normal Form.

----

Now for the rest.

> You raise too many points for me to answer one by one. Let me call
> out some I think are important. If I mistake your meaning at any
> juncture, please correct me.
>
> > Note that in the RM, more than half of Codd's references are to
> > products and product manuals. There were not too many theoretical
> > papers in the field.
>
> You seem to think one implies the other, that commercial products
> preclude theoretical papers.

No, I have already posted a fair amount of detail that the opposite is true.

Perhaps I should have stated, There were not too many theoretical papers PUBLISHED in the field, as detailed previously, we had lots of internal proprietary papers.

> But you must know that's not true. The
> reason there were no papers is that there was no theory. You don't feel
> that's important; I suggest that's one reason pre-relational systems
> were so inelegant.

(Already answered that we had sound theory, proprietary, that was not published.)

They weren't inelegant for the time, they were quite elegant, if you consider that ISAM was all we had before the HM came along. And I haven't even started on the Network Model, which was even more elegant, less restrictive, than the HM.

Sure, they are inelegant now, because we have the RM as the measure against which the comparison is made.

> One giant leap owed to Codd that I think was (and often still is)
> underappreciated is his adoption of value semantics. Your helpful
> citation illustrates that point quite well, see next.
>
> > There are many terms that Codd uses in the RM, which have gone out of
> > fashion.
>
> It does take some work to read Codd's 1970 paper while trying to
> embrace the technological perspective of his audience in the days of
> punch cards and drum memory.

Nonsense.

You don't honestly believe that ISAM and the HM were implemented using punched cards and drum memory, do you ? Whoever told you that, whoever taught you that, is a disgusting liar. It is a transparent attempt to demean the fact that the RM was founded on the HM; that the HM had a sound basis. By rewriting history (the tell-tale sign of a liar), they wipe out the Hierarchy in the RM, and suggest the RM was fresh, new, first-time based on theory, and all that bull dust that contradicts the evidenced facts.

In 1976, when I took my first job in a computer service bureau, as an apprentice programmer, we had no drums, no punched cards. We had a machine with one disk (for loading the o/s and programs, not for data), and eight mag tapes (for data). Each mag tape was a data file, accessed serially, only. Typically a program would use six files, six tapes, repositioning each of them. Just like a six-file merge that unix sort would perform.

During that year, we installed 16 fixed disks, and eliminated the tapes for files (kept them for backups only). We had complete databases implemented in ISAM, pointer-based, of course. And Normalisation, without declared NFs, of course.

By 1978 we had removable disks of various sizes.

By 1978 we had full OCR readers, all of Canada's Catholic schools did their exams on OCR forms, and our service bureau read them, and tabulated them. The rest of Canada switched over slowly. The readers often jammed, so when the exams were on, we had a 24-hour watch, we took it in turns. While waiting, we programmed extensions to the Star Trek game we had, written in BASIC-PLUS, with the clever bits in Assembler.

When DBMS platforms came along, they gave us a much easier and much more secure (eg. the pointers) method of implementing those databases that we did have; ACID Transactions; better concurrency; etc, etc. It would be silly to think that we didn't have databases or Normalisation until DBMS platforms came along. Once DBMS took off, any errors in the database design (file design), any errors in Normalisation (fields within the files), were magnified.

We did all that without the benefit of Date, Darwen, Fagin, Abiteboul, Hull, Vianu's contributions to mankind. Which, AFAIC, is still a great big zero.

> Looking at the example in section 1.4, I finally see what you mean by
> "hierarchy". And, fair enough, Codd says of Figure 3 {employee,
> jobhistory, salaryhistory children} ,"The tree ... shows just these
> interrelationships...." Having worked with the relational model all
> these years, I look at that diagram and I don't see a tree. I see an
> ancestor of a Chen diagram, and automatically assume the "nonsimple
> domains" are tables. To Codd's contemporaries, the tree-ness was
> obvious.
>
> > when he gives the pre-requisites to his __Relational Normal Form__
> > [1.4](1)(2), and in (1) states "collections of trees", we take that
> > to mean:
> > - trees with integrity
> > - normalised to the extent that we did prior to the RM
> > - no circular references
> > - what I am calling, in retrospect __Hierarchical Normal Form__
>
> Codd certainly knew that a tree is a kind of DAG.

No. He was a strong proponent of a single Large Shared Data Bank, the classic single-version-of-the-truth. The tree is the hierarchy, the tree is the Relational hierarchy. In a single location. Not a DAG at all. Distributed databases are for the birds, and a DAG is just the latest flavour of birdseed.

> I don't know what
> "normalized [before] RM" refers to,

Do you understand that DRY, Agile, etc, is Normalisation for a program ?

Imagine that "before the RM" disk space was extremely expensive, the waste was zero. Imagine that Update Anomalies had much greater negative effect (the duplicated field might be on a disk that was not mounted), to be avoided at all costs. We Normalised very carefully in those days, to eliminate data duplication. We just did not have a formal declaration and name. All our Normalisation prior to Codd was 3NF minus the Relational Key aspect, within the limits of whatever it was that we used, ISAM or HM or NM, as opposed to Codd's 3NF, which was Relational.

Imagine that even today, if one were to implement data in an ISAM system, one can Normalise to 3NF minus the Relational Key aspect.

Imagine that even today, if one were to implement data in awk arrays, which I have done recently, one can Normalise to 3NF fully, and implement the arrays as Relational tables, with full Relational (hierarchical) Keys. Of course, the other features of a DBMS are absent.
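
As a rough illustration of that idea, here is a sketch in Python rather than awk, with dictionaries keyed by tuples standing in for awk arrays with multi-part subscripts. The sample rows and the helper names (`parent_key`, `orphans`, `salaries_for`) are hypothetical, and this is not the awk code referred to above.

```python
# Hypothetical data; the tuple keys play the role of awk's multi-part subscripts.
employee = {(1,): {"name": "Smith"}}
job_history = {(1, "1970-01-01"): {"title": "Clerk"}}
salary = {(1, "1970-01-01", "1970-06-01"): {"amount": 10000}}


def parent_key(key):
    """The parent's key is the child's key minus its last component."""
    return key[:-1]


def orphans(child, parent):
    """Return the child keys whose parent row does not exist."""
    return [k for k in child if parent_key(k) not in parent]


def salaries_for(man_no):
    """Select salary amounts for one employee directly from the compound key."""
    return [row["amount"] for key, row in salary.items() if key[0] == man_no]
```

As the post says, everything else a DBMS provides is absent: the `orphans` check has to be called by the program, nothing enforces it.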

> and I wouldn't rely on this example
> to prove circular references can't exist in a relational database,

Not this example, but Codd's words in the RM. As explained in my para, ten paras above this one.

Do you honestly believe that a tree, in the days of the HM and NM, could survive a circular reference ? That a leaf node could point to a branch node ? What would the program that followed such a reference do ? I trust you understand the concept of the Infinite Loop, that it is to be avoided at all costs. In those days, we could not interrupt a program that was caught in a tight infinite loop, the o/s never got the chance (in its execution vector) to interrupt a tight loop, because the looping program never let go of the CPU. We had to HALT the machine, two hundred and fifty customer programs would be halted. The restart of all those programs was a massive affair. The HP-2000's and 3000's were excellent machines for the day.

No, a tree meant a tree, with the integrity properties of a tree, not a weed with incestuous properties.
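
A small sketch of why a program that follows references needs the structure to be a tree. The record-type names are borrowed from Codd's example, but the dictionaries and the function `has_circular_reference` are hypothetical. A naive walk over the second structure would never terminate, so the check below tracks the records on the current path.

```python
def has_circular_reference(children, root):
    """Depth-first walk; report True if any record is reached again
    while it is still on the current path from the root."""
    on_path = set()   # records on the path currently being followed
    done = set()      # records whose subtrees were fully walked

    def visit(node):
        if node in on_path:
            return True
        if node in done:
            return False
        on_path.add(node)
        for child in children.get(node, []):
            if visit(child):
                return True
        on_path.remove(node)
        done.add(node)
        return False

    return visit(root)


# A proper tree: every reference points downward.
tree = {"Employee": ["JobHistory", "Children"], "JobHistory": ["Salary"]}

# A leaf pointing back to a branch: following references would loop forever.
looped = {
    "Employee": ["JobHistory"],
    "JobHistory": ["Salary"],
    "Salary": ["JobHistory"],
}
```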

The tree that Codd refers to in the pre-requisite to the [1.4]_[Relational]_Normal_Form_ is a tree with integrity, no circular references. The pre-requisite is obviously the _Hierarchical_Normal_Form_.

Therefore no circular references are allowed in the RM, in the RNF.

Which is why the commercial RDBMSs do not have the idiotic features such as "deferred constraint checking" that the idiots "must have".

To be clear, while this simple example itself does not prove it, Codd's RM prohibits circular references, by definition (if one understands the contemporaneous definitions).

The corollary is, that is why the freaks that teach you guys suppress the Hierarchical Model; why they lie that there is nothing of the HM in the RM, etc, etc. The freaks need their circular references. Put another way, the ordering in the HM resolves circular references into trees.

Further, circular references are simply not necessary. If you Normalise the data into HNF, then RNF, as demanded by Codd, or Normalise with an overall understanding of Normalisation without specific reference to HNF and RNF, you will not need circular references. Which is why I ask for your example, it is easy to prove.

> but
> I don't want to argue that point just yet, because we're talking
> hierarchies.

Ok. Whenever you are ready.

> Requoting,
>
> > - the set of such tables form an hierarchy of tables
> > - such keys may well be called Hierarchical Keys
>
> Sure, but only at a severe cost to meaning!

I did state:
> > - such keys may well be called Hierarchical Keys (they are well-known
> > as Relational Keys, I am not suggesting that we change it)

> Codd's reader doubtless saw a hierarchy (effortlessly, as you do, as I
> did not). But the example shows that they are *not* a hierarchy,
> despite appearances.

(I have treated this section of your post in detail, earlier, so my comments here are to be taken with that in mind.)

It is a classic hierarchy, visually; in terms of the tables; in terms of the Keys.

> Figure 3(b) shows each relation (his term) having
> "man#" as part of the key. It is not necessary to go through
> jobhistory to get to salaryhistory.

Absolutely, it is Relational.

> It is perfectly possible, as you
> know, to
>
> select birthyear, salary
> from salaryhistory as j join children as c on j.man# = c.man#
>
> If that kind of access is possible, in what sense do the four tables
> form a hierarchy? Are we to say they have "hierarchical keys" simply
> because employee->jobhistory->salaryhistory are related through their
> foreign keys?
>
> If that's what you mean, OK. Given that the tables don't have to be
> used hierarchically, ISTM that calling them a hierarchy is to adopt a
> blinkered view.

(Treated)

> > To the extent that any hierarchy that exists in the data, is
> > maintained as an hierarchy, after transformation to the Relational
> > Model, the hierarchy lives, exists, breathes
>
> No. What you're really saying is that the tables are related, and that
> their relationships are manifest in their keys.

More. The relationships are manifest in their PRIMARY keys, which forms their identity, and which forms the hierarchy.

> The hierarchical systems you remember so well adopted the idea --
> and required the schema to manifest the idea -- that e.g. jobhistory is
> a *property* of employee. (They didn't use that term, of course.)

Not at all.

There was no *idea* to adopt, with or without the modern term.

The HM provided one file for each Hierarchy. Each Hierarchy consisted of several record-types. In Codd's example, he is clearly discussing a single hierarchy, a single file, with four record types. The DBMS handled grouping of records belonging to a single parent record type; fragmentation; etc, that was its job.

JobHistory would be a separate record-type, in a separate physical location (grouped) to the location (grouped) of the Employee record-type. There are multiple JobHistories per Employee. The Employee record had First/Last pointers to its JobHistories. The JobHistories had Previous/Next pointers to the JobHistories for that /one Employee record/.

That is hardly a *property* of employee.

Refer my diagram, which shows the logical storage in the HM (the hierarchies *modelled*, and any model is abstracted from the physical). If you request it, I will give you a diagram of the physical storage for the HM.

> One
> could not access jobhistory records except through a *pointer* acquired
> through an employee record.

Correct.

> The hierarchy wasn't just a notional (or
> notational) communication convention; it constituted the access path.

Correct.

And that Access Path Dependence is specifically prohibited in the RM.

> With that example, I really think the fairest thing to say is that it
> shows it's *not* a hierarchy.

(Treated. Look again.)

> By adopting value semantics -- by making
> the keys values instead of pointers --

Making the references keys instead of pointers, which is what Codd did, is quite different to "making the keys [there were none] values instead of pointers", the latter makes no sense.

> each relation becomes
> free-standing and self-consistent.

How, exactly, is that "self-consistency" achieved ? I don't see Codd's tables being "self-consistent" at all (except the head of the hierarchy, Employee). The other three are quite Dependent.

Yeah, but Hierarchical and Relational are not mutually exclusive. They can be independently-accessible tables, Relational, AND hierarchical.

If all you see in the RM is that each relation is now a "free-standing" table, with "independent" access, then you do not have the RM, you have a Record Filing System.

Now we know from your comments (at the top), that
- you don't have Codd's tables as defined in the RM Fig 3(b), as per my IDEF1X model of the same figure. You are just not getting that the Hierarchy is within the Primary Keys.
- So you have something else in mind (your post that I am responding to, your understanding of Codd's words)
- I presume it is not a Record Filing System of the first order (RFS 1), where records are referenced by record ID, and there is no integrity.
- I presume that you have been through the traps, that you have some integrity in them, that the records reference some unique key in the parent record (not the record ID which is NOT a key, and has no integrity).
- In order to proceed and close this, have a look at this page, please confirm that what you have in mind is RFS 5, not RFS 1
- or something in-between
http://www.softwaregems.com.au/Documents/Article/Normalisation/RM%20Foo%20RFS.pdf

Based on your response, then we can move ahead.

> We can think of them as forming a
> hierarchy as a convenience; perhaps they'll be commonly used that way in
> some application. But we're not required to.

Never said we were. (That would be an Access Path Dependence.)

> The new, non-hierarchical
> relations can be combined in arbitrary ways. We can find the highest
> salary for each year, without ever learning the men's names.

Yeah, yeah. You can do that with the new hierarchical relations as well.

> > I said the RM is a progression of the HM.
>
> I suppose that's true, in the sense that the United States as
> constituted in 1789 was a progression of government from what had
> existed in 1775. Something came before the thing that came later, and
> many people would call it "progress".

And the government that existed in 1775 was a progression of the government in England, etc. And that was a progression of Henry V and Elizabeth I. Et cetera.

But I have said much more than that, that the HM is fundamental to the RM. Until you understand Codd's words, you will not see that. (you have come a fair way.)

> The RM was also revolutionary in 1) using math as a foundation, and 2)
> rejecting the tree -- and with it, pointer semantics -- as the basis
> for data organization.

I agree with [1]. It rejected the pointers but it did not reject the tree, it kept the tree. As you can see from Codd's words, in [1.4 Normal Form], the tree lives, the keys in the tree are transformed to a Relational Normal Form, and retained in the new Relational Primary Key.

And further, the denial of the tree, the suppression of the tree, by those teachers, causes people to fail to see trees in the data (where such exists). Hence they have grossly inefficient "adjacency lists" and "nested sets". Just look at the agony that the theoreticians are experiencing in the Normalisation thread, they are completely impotent, unable to produce anything.

> On Friday, 6 February 2015 10:14:01 UTC+11, Derek Asirvadem wrote:

No response yet.

1. It would be good if you could confirm that you
- got something out of that "further detail" post of mine
- or this one, with additional diagrams
- that you have a better understanding of the hierarchies, and the HM in the RM, than you had in your last post
- that my diagrams assisted (or not) in the second reading of section [1.4] of the RM
- that you can now see the hierarchy that Codd gives in Fig 3(b), visually in my diagrams, as hierarchical keys

2. Further, it would be good if you could confirm that from the RM itself, hierarchies, and the HM (the essence, not the storage) exist in the RM.

3. Re your teachers' allegations that hierarchies cannot be implemented in the RM; that there is nothing of the HM in the RM, I have proved such claims to be false, and that such teachers are evidently quite ignorant re the RM.

3.a I repeat, if you are interested, I have a set of tables and code that I use to teach the proper implementation of Relational Hierarchies, that are online. They include the projection of "nested relation"-type data using ordinary SQL.

4. Re your teachers' allegations that Codd's FOL and RM doesn't provide for Hierarchies; that SOL/42OL is required for hierarchies; etc, I am waiting for the requested example, in order to show you how all that can be done within the RM, and to prove that such claims are false. If you give me a reference to the AHV book or whatever, that will be fine.

If you do not respond, do consider that based on the information I have provided in this thread thus far, which proved their foundation claims re hierarchies in the RM [3] to be blatantly false, and ignorant of the RM, their secondary claims [4], are likewise false, ignorant, and without basis. And when challenged, no evidence is given. Which proves in and of itself, that such claims are baseless.

Cheers
Derek

James K. Lowden
Feb 15, 2015, 8:07:48 PM
On Wed, 11 Feb 2015 23:00:52 -0800 (PST)
Derek Asirvadem <derek.a...@gmail.com> wrote:

> > Are we to say they have "hierarchical keys" simply
> > because employee->jobhistory->salaryhistory are related through
> > their foreign keys?
>
> The reason they are hierarchical keys is because
> employee->jobhistory->salaryhistory are related through their
> IDENTIFIERS, their PRIMARY keys (which, btw, components thereof
> happen to be foreign keys).

To me, that's a distinction without a difference. I just don't see
what you find significant about it.

employee is identified by {man#}
jobhistory is identified by {man#, jobdate}

For some reason I can't fathom, you believe:

1. It's vastly more important that the primary key for jobhistory
incorporates the primary key for employee than that

FOREIGN KEY (man#) REFERENCES employee(man#)

even though the two statements are equivalent.

2. The very fact that primary key for jobhistory incorporates the
primary key for employee deserves the special designation of a
"hierarchical" relationship, perhaps to reflect how the relationship
would have been designed in pre-relational DBMSs.

To support this assertion, you list the keys vertically, and note that
each longer one incorporates the shorter one above, ergo hierarchy.
Also you note Codd visually arranged the boxes in a way that suggests a
hierarchy. You'll forgive me if I find that unpersuasive?!

No, I don't think the fact that one table's primary key is a subset of
another's is interesting, let alone significant. It doesn't deserve any
special designation, "hierarchical" or other.

> > > Note that in the RM, more than half of Codd's references are to
> > > products and product manuals. There were not too many theoretical
> > > papers in the field.
> >
> > You seem to think one implies the other, that commercial products
> > preclude theoretical papers.
>
> No, I have already posted a fair amount of detail that the opposite
> is true.
>
> Perhaps I should have stated, There were not too many theoretical
> papers PUBLISHED in the field, as detailed previously, we had lots of
> internal proprietary papers.

If you say so. IBM was big in the field and not shy about publishing
papers. But, taking you at your word, I can't evaluate them without
having read them. I have this filed under "don't care" because even
today there's no model for databases comparable to the relational
model.

> > It does take some work to read Codd's 1970 paper while trying to
> > embrace the technological perspective of his audience in the days of
> > punch cards and drum memory.
>
> Nonsense.
...
> In 1976, when I took my first job in a computer service bureau, as an
> apprentice programmer, we had no drums, no punched cards. We had a
> machine with one disk (for loading the o/s and programs, not for
> data), and eight mag tapes (for data).

You arrived a little ahead of me. Doubtless you remember some things
I've only read about.

In 1976, though, you were already 6 years in, and there were still
plenty of punch cards around. I programmed with them in college after
that. When I arrived at work in 1982, we had CICS and 3270 terminals,
but they had only arrived two years before. My mother in the mid-70s
programmed on punch cards, too. (One compile per day, taken to the
computer and back by the unusual RJE mechanism known as a "station
wagon"). But, if you thought I meant punch cards were used for
data storage, no, sorry. I was being allusory.

Granted, "drum memory" is an exaggeration, but not much. The big IBM
360 machine sold in 1968 came with 1-4 MB core memory, but many came
with much less.

It's very easy to imagine people in IT management in those days whose
knowledge of computer science was nil and whose understanding of
programming was limited to whatever IBM classes the firm had sent them
to. No need to imagine them, in fact, because I worked with and under
some of them. Hurrah for Syncsort [TM]. But it does take some work to
try to read Codd's paper through their perspective.

> > Codd certainly knew that a tree is a kind of DAG.
>
> No. He was a strong proponent of a single Large Shared Data Bank, the
> classic single-version-of-the-truth. The tree is the hierarchy, the
> tree is the Relational hierarchy. In a single location. Not a DAG
> at all. Distributed databases are for the birds, and a DAG is just
> the latest flavour of birdseed.

I think you did not take my meaning: directed acyclic graph. I can't
account for your answer otherwise.

> > I don't know what "normalized [before] RM" refers to,
>
> Do you understand that DRY, Agile, etc, is Normalisation for a
> program ?

If you say so. Not Agile, which is just methodology fetish. I've
never once heard an application programmer call his data structures
"normalized", whether or not he knew what the term meant.

> We Normalised very carefully in those days, to eliminate data
> duplication. We just did not have a formal declaration and name.

OK, so now I know what you mean. But you can minimize redundancy
without eliminating repeating groups, a requirement for 1NF. And going
to 1NF for a repeating group means repeating the key, definitely *not*
minimizing redundancy wrt disk storage.

Are you going to claim you never used repeating groups in your
"normalized" HM databases? Surely you know their use was standard
practice, one that violated no theory.

So I can accept your definition for purposes of discussion, but I reject
it in general because the practice had no theoretical underpinning and
was a mere suggestion of what we mean by the term today.

> Do you honestly believe that a tree, in the days of the HM and NM,
> could survive a circular reference ?

No. In fact I would go further: a tree with a circular reference is not
a tree. A tree is a kind of directed acyclic graph. A "tree" with a
cycle is a cyclic graph (directed or not is hard to say).
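
The distinction can be stated as a check. The following is a minimal sketch, not taken from either poster: the function name is_tree and the child-to-parent dictionary are assumptions made for illustration. A structure passes only if it has exactly one root and following parents from any node never revisits a node.

```python
def is_tree(parent_of):
    """Return True if the child -> parent mapping describes a single rooted tree.

    parent_of maps each node to its parent; the root maps to None.
    Each node has one parent by construction, so only roots and cycles
    need checking.
    """
    roots = [n for n, p in parent_of.items() if p is None]
    if len(roots) != 1:
        return False
    for start in parent_of:
        seen = set()
        node = start
        while node is not None:
            if node in seen:
                return False  # walked back onto an earlier node: a cycle
            seen.add(node)
            if node not in parent_of:
                return False  # dangling reference to an unknown parent
            node = parent_of[node]
    return True


# The four tables of Fig 3(b), arranged as the thread describes them.
fig_3b = {
    "employee": None,
    "jobhistory": "employee",
    "salaryhistory": "jobhistory",
    "children": "employee",
}
# A hypothetical structure with a circular reference between b and c.
circular = {"a": None, "b": "c", "c": "b"}

print(is_tree(fig_3b), is_tree(circular))
```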

By posing the question as you do, I am led to think you're working with
an informal definition of "tree".

I hope at this point that we understand each other's position regarding
what the so-called hierarchical model means, why it's not a "model" in
the sense of "relational model", and why I think it's pointless to make
any claim about a "hierarchy" based on the components of the keys.

It was an interesting foray into the systems of that bygone era, and I
think I understand, vaguely, why you say that pre-relational systems,
Cullinet et al., influenced relational ones.

> But I have said much more than that, that the HM is fundamental to
> the RM. Until you understand Codd's words, you will not see that.

I reject that flatly. Whether or not I can convince you I understand
Codd's words, I cannot see any way in which "the HM is *fundamental* to
the RM". You say there's sound theory in proprietary papers that never
came to light, lo these 45 years later, despite the immense importance
of the RM and the ever-present reinvent-the-past interest in graph
databases today. I can't prove you're wrong. All I can say is I don't
believe you and won't until I can see for myself.

> 3. Re your teachers' allegations that hierarchies cannot be
> implemented in the RM

Au contraire. I said tables can represent graphs, and trees are graphs,
and hierarchies are trees. Therefore hierarchies can be represented
relationally. Furthermore, they can be represented more simply, because
value semantics allow both relation and relationship to be represented
using one structure.

Over to you.

--jkl

Derek Asirvadem
Feb 17, 2015, 12:34:09 AM
James
First a summary response to all that, then a point address to a couple of items. We are closing the gap between what I am trying to convey (Hierarchies in the RM, in various forms, using Codd's words, and only Codd's words) and what you are doing (which is evidently somewhat less than that, and that hinges on a denial of hierarchies in the RM).

We are progressing. But that denial is no longer an objective logical denial, it is a subjective, vociferous, psychological one. With your last post, I have now identified three denials on your side, of important technical issues, which I will attempt to address.

I did state:
> > Now we know from your comments (at the top), that
> > - you don't have Codd's tables as defined in the RM Fig 3(b), as per my IDEF1X model of the same figure. You are just not getting that the Hierarchy is within the Primary Keys.
> > - So you have something else in mind (your post that I am responding to, your understanding of Codd's words)
> > - I presume it is not a Record Filing System of the first order (RFS 1), where records are referenced by record ID, and there is no integrity.
> > - I presume that you have been through the traps, that you have some integrity in them, that the records reference some unique key in the parent record (not the record ID which is NOT a key, and has no integrity).
> > - In order to proceed and close this, have a look at this page, please confirm that what you have in mind is RFS 5, not RFS 1
> > - or something in-between
http://www.softwaregems.com.au/Documents/Article/Normalisation/RM%20Foo%20RFS.pdf

> > Based on your response, then we can move ahead.

But you have not responded to those specifics.

I was expecting some interaction based on the diagram. Nothing.

I take it then, that you have something between RFS 1 and RFS 5. Please note, you are not following Codd's words, you do not have his tables (which are Relational, ala [1.4 Normal Form] ). You have something substantially less than Codd's tables, and you see that whatever difference there is, is "a distinction without a difference".

Denial One - Codd's Words

If you genuinely wish to (a) understand Codd and the RM, and (b) understand my proposal (very much second), I have to ask that you absolutely follow his words, and then mine. Otherwise this discussion has to end, on the basis that, as evidenced, you do not, and you will not follow the RM, Codd's words.

I did not say, that the mere arrangement of components of the key makes it hierarchical. I said, if you read Codd's words, and look at his diagrams (modern equivalents supplied by me, to be read side-by-side with the RM) (again for your convenience:)
http://www.softwaregems.com.au/Documents/Article/Normalisation/RM%20Foo%203_B.pdf
you will see that the Primary Keys Codd gave, are hierarchical. I merely emphasised that, because you seem to have missed it (and still do). That emphasis is not a proof in and of itself. And that Codd's Primary Keys replace (RNF) the IDENTIFIERS in the HM example that he used, which is hierarchical using pointers.

In the diagram I have given for the Hierarchical Model, the Employee File in Codd's example, the one and only KEY is ManNo, used to access the Employee File, there are no other Keys. All the navigation within the Employee File is via pointers (open arrows). Repeating Groups ("non-simple domains") are represented by an open arrow with a double head. In the HM, JobDate, SalaryDate, and ChildName are not KEYS, they are identifiers within each Repeating Group (in retrospect, using today's context, sure, they may look like Keys, but they are not).

Denial Two - Difference is Significant

The second issue that you seem to be in vociferous denial about is this. First, let me make sure that you understand, I did not invent the RM or IDEF1X. I am speaking as a faithful practitioner of both. IDEF1X is a standard (unlike UML, it is a real standard). Robert Brown invented it, and did so using Chen's ERD as the starting point. He famously had Codd's input into the process. Which is why IDEF1X can claim that it has all the features that allow a person to *implement* the RM.

To that extent, it has specific features, that are normal, ordinary, pedestrian, to an RM implementer, that incorporate the RM (natural progressions, strictly within the RM ?), that are not described in detail in the RM. Two of those specific features are:

1. The difference between Identifying Relationships and Non-identifying ones. Solid vs dashed lines.
___ If the parent Primary Key is used to IDENTIFY the child, to form the child Primary Key, it is Identifying.
___ The entire example given by Codd uses Identifying relations.
___ Typically, the relationships for Reference tables (eg. SecurityType) are Non-identifying, there is no value in the parent identifying the child, eg. in SecurityType identifying the Security.
___ Whereas, there is a great value in the Employee Identifying the JobHistory and Children.

2. The difference between Independent tables vs Dependent tables. Square vs round corners.
___ Each Independent table represents the top of a Data Hierarchy. The equivalent in the Hierarchical model is a File.
___ Each such Independent table, or File, constitutes an Access Path Dependence, which is prohibited in the RM,
___ If the RM is followed, most of the tables in a database would be Dependent.
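
The Identifying / Non-identifying distinction in item 1 can be read off a schema mechanically. The following is a minimal sketch in Python with SQLite, under stated assumptions: the table and column names (man_no, security_code, security_type_code) and the helper relationship_kind are hypothetical, invented here for illustration, and are not taken from IDEF1X or from the diagrams referenced in this thread. A relationship is reported as identifying when every foreign key column of the child is also part of the child's Primary Key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (
        man_no INTEGER PRIMARY KEY,
        name   TEXT
    );
    -- Identifying: the parent key man_no is part of the child's primary key.
    CREATE TABLE jobhistory (
        man_no  INTEGER REFERENCES employee (man_no),
        jobdate TEXT,
        title   TEXT,
        PRIMARY KEY (man_no, jobdate)
    );
    CREATE TABLE security_type (
        security_type_code TEXT PRIMARY KEY
    );
    -- Non-identifying: the parent key is an ordinary column of the child.
    CREATE TABLE security (
        security_code      TEXT PRIMARY KEY,
        security_type_code TEXT REFERENCES security_type (security_type_code)
    );
""")


def relationship_kind(conn, child):
    """Classify each foreign key of `child` by whether it sits inside the PK."""
    # PRAGMA table_info: column 1 is the name, column 5 is the position in the PK (0 if absent).
    pk_cols = {row[1] for row in conn.execute(f"PRAGMA table_info({child})") if row[5] > 0}
    # PRAGMA foreign_key_list: column 0 is the FK id, 2 the parent table, 3 the child column.
    fks = {}
    for row in conn.execute(f"PRAGMA foreign_key_list({child})"):
        fks.setdefault((row[0], row[2]), []).append(row[3])
    return {
        parent: "identifying" if set(cols) <= pk_cols else "non-identifying"
        for (_, parent), cols in fks.items()
    }


print(relationship_kind(conn, "jobhistory"))
print(relationship_kind(conn, "security"))
```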

To which, I take it, your response is:
> No, I don't think the fact that one table's primary key is a subset of
> another's is interesting, let alone significant.
and:
> To me, that's a distinction without a difference. I just don't see
> what you find significant about it.

So you are denying something that millions of people who understand and practise the RM consider vitally important, that the standard for Relational modelling defines.

Which is understandable to some extent, given that your databases are in fact Record Filing Systems (possibly mature ones, as opposed to totally broken first-order ones). And notably, it is the method that your teachers teach.

Denial Three - Primary Key

This is a result of your teachers' garbage being embedded in your head. It is a trick they use to subvert the RM. The logic they use is the same as you use for Denial Two: a refusal to accept that the difference is significant, "oh sure, I can see that there is some difference, but the difference is insignificant. The two options amount to the same thing".

The issue concerns their use of "candidate key". The RM defines Primary Key. The RM does NOT define "candidate key".

Of course, in the 1980's R Brown, with Codd, following the RM:
___"A primary key is nonredundant if it is either a simple domain (not a combination) or a combination such that none of the participating simple domains is superfluous in uniquely identifying each element. A relation may possess more than one nonredundant primary key. This would be the case in the example if different parts were always given distinct names. Whenever a relation has two or more nonredundant primary keys, one of them is arbitrarily selected and called the primary key of that relation."

determined that all Keys on a relation other than the Primary Key, the Non-primary Keys, shall be named Alternate Keys.
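
The "nonredundant" condition in the quoted passage can be illustrated on sample data. The following is a minimal sketch under stated assumptions: the part rows, the attribute names, and the function names are invented for illustration, and a check against one sample of rows can only refute a proposed key, it cannot establish one, since a key is a declaration about every permitted state of the relation. The code lists the attribute combinations that identify each row uniquely with no superfluous member.

```python
from itertools import combinations


def is_unique(rows, attrs):
    """True if no two rows agree on every attribute in attrs."""
    seen = set()
    for row in rows:
        value = tuple(row[a] for a in attrs)
        if value in seen:
            return False
        seen.add(value)
    return True


def nonredundant_keys(rows, attributes):
    """Attribute combinations that identify rows uniquely with nothing superfluous."""
    keys = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(attributes, size):
            # Skip combinations that merely extend a key already found.
            if any(set(k) <= set(combo) for k in keys):
                continue
            if is_unique(rows, combo):
                keys.append(combo)
    return keys


# Hypothetical part rows in which different parts always have distinct names.
parts = [
    {"part_no": 1, "part_name": "bolt", "colour": "red"},
    {"part_no": 2, "part_name": "nut", "colour": "red"},
    {"part_no": 3, "part_name": "screw", "colour": "blue"},
]
found = nonredundant_keys(parts, ("part_no", "part_name", "colour"))
print(found)
```

On this sample two nonredundant keys are found. In the quoted passage's terms, one of them is selected as the Primary Key, and in the IDEF1X terms used above, the other is an Alternate Key.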

a. To the extent that you refuse to use Primary Key, you are in direct violation of the RM
b. To the extent that you use "candidate keys", you are using an invention outside the RM.
c. To the extent that you refuse to use Alternate Key, you are in direct violation of IDEF1X, the standard for modelling Relational Databases, and of the RM, for Non-primary keys.

Your teachers have a long and consistent history of being ignorant of the RM, and of subverting it. Here, fortunately, I can deal with you, and resolve this, one issue at a time.

So the bottom line on this point is, you are using RFSs of some state of maturity, and your refusal to use Primary Keys for the designated Keys, directly violates the RM.

----

> For some reason I can't fathom, you believe:
>
> 1. It's vastly more important that the primary key for jobhistory
> incorporates the primary key for employee

Not "you believe", the fact is, Codd states that. And RM adherents appreciate the value of that.

> than that
>
> FOREIGN KEY (man#) REFERENCES employee(man#)
>
> even though the two statements are equivalent.

Er, they are not equivalent by any stretch of the imagination. You need three things.

a. The FK you have given above.

b. *And* the PK definition:
____JobHistory: PK ( man#, jobdate )

c. In which case, your RECORD ID is *REMOVED*, because it is redundant, superfluous, an additional column and index that serves no purpose.

So the truth is your "equivalence" is hardly that, it is not [a][b][c]. Your "equivalence" is [a] and the prohibited [c].
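
The difference between [a] with a record ID and [a] plus [b] can be seen in what the DBMS will accept. The following is a minimal sketch in Python with SQLite, assuming (as this post does, and it is only an assumption) that the alternative design uses a record ID as its key. The table names, the man_no column standing in for man#, and the sample row are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
conn.executescript("""
    CREATE TABLE employee (man_no INTEGER PRIMARY KEY);

    -- [a] alone, with a record ID as the key: the FK is declared,
    -- but nothing identifies a row by (man_no, jobdate).
    CREATE TABLE jobhistory_record_id (
        record_id INTEGER PRIMARY KEY,
        man_no    INTEGER NOT NULL REFERENCES employee (man_no),
        jobdate   TEXT NOT NULL,
        title     TEXT
    );

    -- [a] and [b], no record ID: the parent key is part of the child key.
    CREATE TABLE jobhistory (
        man_no  INTEGER NOT NULL REFERENCES employee (man_no),
        jobdate TEXT NOT NULL,
        title   TEXT,
        PRIMARY KEY (man_no, jobdate)
    );
    INSERT INTO employee VALUES (1);
""")


def try_insert(sql, params):
    """Run one INSERT; report whether the DBMS accepted it."""
    try:
        conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False


row = (1, "1970-01-01", "clerk")
# Insert the same job history fact twice into each design.
dup_record_id = [
    try_insert(
        "INSERT INTO jobhistory_record_id (man_no, jobdate, title) VALUES (?, ?, ?)", row
    )
    for _ in range(2)
]
dup_composite = [
    try_insert("INSERT INTO jobhistory VALUES (?, ?, ?)", row) for _ in range(2)
]
print(dup_record_id, dup_composite)
```

In this sketch the record ID design accepts the duplicate fact and the composite key design rejects it. A record ID design could add a separate uniqueness constraint on (man_no, jobdate) to get the same rejection, at the cost of the extra column and index described in [c].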

> 2. The very fact that primary key for jobhistory incorporates the
> primary key for employee deserves the special designation of a
> "hierarchical" relationship, perhaps to reflect how the relationship
> would have been designed in pre-relational DBMSs.

I didn't say that, you are painting someone into a corner. Hope it isn't you.

> To support this assertion, you list the keys vertically, and note that
> each longer one incorporates the shorter one above, ergo hierarchy.
> Also you note Codd visually arranged the boxes in a way that suggests a
> hierarchy. You'll forgive me if I find that unpersuasive?!

Well, I did state that, to support another assertion. You have lost the thread and you think I am supporting some other assertion. Read again.

> No, I don't think the fact that one table's primary key is a subset of
> another's is interesting, let alone significant. It doesn't deserve any
> special designation, "hierarchical" or other.

Besides being indisputably hierarchical, whether you agree or not, which are very secondary items, you are missing the primary item, or that which I have been trying to transmit to you. Don't worry about the secondary items, try to understand the primary item: that the formation of the child PK includes the parent PK, and that that is very important to the integrity, power, and speed of the Relational database.

----

Now, I can demonstrate that (a) those two differences are significant, and (b) more important, the consequences are very significant, which is the reason we have such a difference in the diagrammatic notation. But in order to do that, you are going to have to accept that there *is* a difference, that greater minds than mine determined, as being part of the RM.

Therefore the request is, that you understand my post, and accept that you have three well-used denials, and in order to address them, you have to be able to put those denials aside, and implement Codd's words to the letter, without diverging on the basis of some weird interpretation of the spirit.

Next, in order for me to type less words in explanation, could you please answer: when you implement Codd's tables (we know you are not following his words, you do not have his Fig 3(b) ), do you have RFS 1, RFS 5, or something in-between ?

----

> I have this filed under "don't care" because even
> today there's no model for databases comparable to the relational
> model.

Of course.

> > > It does take some work to read Codd's 1970 paper while trying to
> > > embrace the technological perspective of his audience in the days of
> > > punch cards and drum memory.
> >
> > Nonsense.
> ...
> > In 1976, when I took my first job in a computer service bureau, as an
> > apprentice programmer, we had no drums, no punched cards. We had a
> > machine with one disk (for loading the o/s and programs, not for
> > data), and eight mag tapes (for data).
>
> You arrived a little ahead of me. Doubtless you remember some things
> I've only read about.
>
> In 1976, though, you were already 6 years in, and there were still
> plenty of punch cards around. I programmed with them in college after
> that. When I arrived at work in 1982, we had CICS and 3270 terminals,
> but they had only arrived two years before. My mother in the mid-70s
> programmed on punch cards, too. (One compile per day, taken to the
> computer and back by the unusual RJE mechanism known as a "station
> wagon").

I remember it well. In college, where we also had evening classes for "punch card operators", we used to write our programs on 80-column sheets, which the PC operators would type onto PCs (that was part of their practical), which we would then take to the computer operators, who would compile it overnight, and we would pick up the results in the morning. The big deal was to stay in the computer operations area, to check that the compile worked, such that we could throw out the PCs at the operations room, and save ourselves the effort of carrying the box of PCs around. For exam programs, we were given a maximum of three compiles.

In the US, most toll highways issued PCs at every on-ramp, which were made from recycled PCs, which you submitted to a teller at every off-ramp, until at least 1990 IIRC.

> But, if you thought I meant punch cards were used for
> data storage, no, sorry. I was being allusory.
>
> Granted, "drum memory" is an exaggeration, but not much. The big IBM
> 360 machine sold in 1968 came with 1-4 MB core memory, but many came
> with much less.

Yes. Our 360 had core. You could see it with the naked eye. Those little magnets NEVER failed. Actually, the HP-2000 had the same. It was the HP-3000 that had IC memory, no core.

360/CICS lived on well into the 90's. You will not believe how many are still running in Aussie and American banks. I have to write transports to/from them.

> It's very easy to imagine people in IT management in those days whose
> knowledge of computer science was nil and whose understanding of
> programming was limited to whatever IBM classes the firm had sent them
> to. No need to imagine them, in fact, because I worked with and under
> some of them. Hurrah for Syncsort [TM]. But it does take some work to
> try to read Codd's paper through their perspective.
>
> > > Codd certainly knew that a tree is a kind of DAG.
> >
> > No. He was a strong proponent of a single Large Shared Data Bank, the
> > classic single-version-of-the-truth. The tree is the hierarchy, the
> > tree is the Relational hierarchy. In a single location. Not a DAG
> > at all. Distributed databases are for the birds, and a DAG is just
> > the latest flavour of birdseed.
>
> I think you did not take my meaning: directed acyclic graph. I can't
> account for your answer otherwise.

I thought you meant, and my previous comments relate to, MS Database Availability Group.

Yes, Codd certainly knew that a tree is a kind of DAG.

But that means you understand something that you claim to not understand. More, later.

> > > I don't know what "normalized [before] RM" refers to,
> >
> > Do you understand that DRY, Agile, etc, is Normalisation for a
> > program ?
>
> If you say so. Not Agile, which is just methodology fetish.

True. Normalisation is the "big secret" behind Agile, DAD. Ambler could not call it Normalisation, because he spent two decades decrying it, propagating the myth that "de-normalisation improves performance."

Same as your teachers. When they implement an hierarchy (that occurs naturally in the data) they call it everything but, "adjacency lists", etc, and they make a right royal hash of it, using two to four times as many keys and indices as a Relational implementation requires.

> I've
> never once heard an application programmer call his data structures
> "normalized", whether or not he knew what the term meant.

1. Not just the data structure in programs, but the program elements (dependent on language: routines; subroutines; functions; sub-programs; etc). If they have a diagram of the program or system, the diagram is an hierarchy.

2. I didn't say that the programmers call it Normalised, I said it *is* that. Whether they are aware of Normalisation, or whether they have applied it, is a separate matter.

3. If they are aware of Normalisation as a science, that they can apply it to their program elements, then their programs are far better, because they Normalise through the entire exercise, and there is no "normalisation" or "reduction of redundancy" to be done.

4. If they are unaware, then when their programs are slow enough, they go and try to remove some redundancy, while remaining clueless that it is a Normalisation operation.

> > We Normalised very carefully in those days, to eliminate data
> > duplication. We just did not have a formal declaration and name.
>
> OK, so now I know what you mean. But you can minimize redundancy
> without eliminating repeating groups, a requirement for 1NF.

I think you mean 2NF (Normalise repeating groups), not 1NF (Atomic data).

(Date and Darwen have mounted an assault on 1NF, in order to squeeze their imbecility of derived relations into "satisfying" 1NF, they scramble the terms to maintain confusion. Further evidence of their putrescence. I trust you are not doing that.)

No you can't. To achieve 2NF, you have to remove repeating groups, which means placing them in a separate table, which means including an FK, which means a migrated PK. Such tables are 1NF, 2NF, and 3NF.
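
The step described here, moving a repeating group into a separate table that carries the migrated parent key, can be sketched as follows. This is a minimal illustration under stated assumptions: the function name split_repeating_group, the field names, and the sample records are invented, and the sketch takes no position on which Normal Form number the step belongs to.

```python
def split_repeating_group(records, key, group):
    """Split nested records into a parent table and a child table.

    Each child row carries the parent's key value (the migrated key),
    so the two tables are related by key rather than by nesting.
    """
    parents, children = [], []
    for record in records:
        parents.append({k: v for k, v in record.items() if k != group})
        for item in record[group]:
            children.append({key: record[key], **item})
    return parents, children


# Hypothetical employee records with jobhistory held as a repeating group.
employees = [
    {"man_no": 1, "name": "Jones",
     "jobhistory": [{"jobdate": "1968-03-01", "title": "clerk"},
                    {"jobdate": "1970-01-01", "title": "analyst"}]},
    {"man_no": 2, "name": "Smith", "jobhistory": []},
]
employee, jobhistory = split_repeating_group(employees, "man_no", "jobhistory")
print(employee)
print(jobhistory)
```

Each jobhistory row repeats man_no. That repetition is the migrated key under discussion in the surrounding posts, and in this sketch it is the only thing that relates a child row to its parent.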

> And going
> to 1NF for a repeating group means repeating the key, definitely *not*
> minimizing redundancy wrt disk storage.

Not sure what you are saying here. Disk storage then was limited, but
- we did carry the Key in ISAM (pre-HM), as a means of verification, and for rebuilding the file using one single scan, rather than by navigating the pointer chains for the entire file, which would be very slow.
- the HM (eg. IMS) carried the key, for the same purpose; the rebuild moved from our code into their command.
- Notice that Codd did not show those carried keys in his example; we didn't normally show them, same as we didn't normally show the pointer chain.
- such Keys were migrated, in exactly the same way that FKs are migrated in a RDb (but of course, not used as such).
- given that it is of value, and it is demanded for the method, in all three cases, the migrated key is not "redundant", it has a purpose that makes it non-redundant.
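
A rough sketch of the rebuild-by-single-scan idea, in Python. It is an in-memory illustration only, not ISAM or IMS code, and the record layout (parent_key, seq, data) is an assumption made up for the example. Because each child record carries its parent's key, one pass over the child records is enough to regroup them under their parents, with no pointer chain to follow.

```python
from collections import defaultdict

# Hypothetical child records, in arbitrary physical order.
# Each one carries the key of its parent record.
child_records = [
    {"parent_key": "A", "seq": 1, "data": "a1"},
    {"parent_key": "B", "seq": 1, "data": "b1"},
    {"parent_key": "A", "seq": 2, "data": "a2"},
]


def rebuild_by_scan(records):
    """Group child records under their parent in a single pass."""
    children = defaultdict(list)
    for rec in records:  # one scan, no navigation of pointer chains
        children[rec["parent_key"]].append(rec["data"])
    return dict(children)


rebuilt = rebuild_by_scan(child_records)
```

Without the carried key, the same regrouping would need the parent-to-child links to be intact and walked one by one, which is exactly what a rebuild cannot rely on.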

Now if you are saying that the migrated key in the RDb (as well as ISAM and HM) is "redundant" merely because it can be derived by other means (such as navigating the parent RECORD), you place yourself in the category of Nicola's imbecility, which I have responded to in detail. Darwen had the same imbecility. They miss the point that "related-by-key" demands that the related-key is carried, migrated, wherever it is used as a reference, an FK.

This is tightly related to the fact that Key is not "data"; the theoreticians cannot make that distinction; their non-FD-fragmenting and puzzling is based on denying that distinction.

> Are you going to claim you never used repeating groups in your
> "normalized" HM databases?

What ? Have you not looked at the diagrams I provided ?

The HM, IMS, and the others, all allowed for repeating groups to be handled correctly, ie. normalised, and placed the different record types (tables in today's vernacular) in separate physical areas of the one file.

For ISAM, which I programmed until the end of the 90's (due to some critical systems never having an HM or RM DBMS), we used separate files for each record type, again, fully Normalised to 3NF, repeating groups were never in the same file as the parent.

So, no, I have never implemented repeating groups incorrectly, in the same record as the parent. I have not seen any other doing that either. I repeat, Normalisation was well understood in those days.

I have only seen it done incorrectly in marginal situations, eg. where it was a temporary fix-up, until the file was rebuilt.

> Surely you know their use was standard
> practice, one that violated no theory.

Nonsense. Details above. Repeating groups implemented incorrectly (ie. not Normalised) would be a gross error. The standard (unless your teachers have a different definition of the word) was to eliminate such gross errors, by simple Normalisation.

Get this, your teachers are disgusting liars, as has been proved hundreds of times. The only case in which repeating groups appeared in the same row as the parent, is the equivalent of:
____SELECT ... FROM parent JOIN child
which of course, is a derived relation (not stored, de-normalised by definition), not a base relation (stored, Normalised).
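
A small sketch of that distinction, in Python with the standard sqlite3 module. The parent and child tables and their columns are invented for the example. The base relations store the parent once; the repetition of the parent's values appears only in the derived result of the join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE parent (
    parent_id INTEGER PRIMARY KEY,
    name      TEXT    NOT NULL
);
CREATE TABLE child (
    parent_id INTEGER NOT NULL REFERENCES parent (parent_id),
    child_no  INTEGER NOT NULL,
    item      TEXT    NOT NULL,
    PRIMARY KEY (parent_id, child_no)
);
INSERT INTO parent VALUES (1, 'Order-1');
INSERT INTO child  VALUES (1, 1, 'bolt'), (1, 2, 'nut'), (1, 3, 'washer');
""")

# Derived relation: produced by the query, not stored anywhere.
derived = conn.execute("""
    SELECT p.parent_id, p.name, c.child_no, c.item
    FROM parent p JOIN child c ON c.parent_id = p.parent_id
    ORDER BY c.child_no
""").fetchall()

# Base relation: the parent's name is stored exactly once.
stored_names = conn.execute("SELECT name FROM parent").fetchall()
```

Here derived has three rows, each repeating the parent's name, while stored_names has one row. The repetition belongs to the query result, not to anything that is stored.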

The gangsters use derived relations to "demonstrate" something, eg. Normalisation or lack thereof, that applies to base relations only. Disgusting. Sub-human.

> So I can accept your defintion for purposes of discussion, but I reject
> it in general because the practice had no theoretical underpinning and
> was a mere suggestion of what we mean by the term today.

No. I have specifically stated in detail, that we applied what is now known as 3NF (Codd's 3NF to you), except for the fact that we did not have the label, Codd's declaration. We just stated "Normalised" or not.

And of course the implementation was limited by the platform, the method.

You need a theoretical underpinning for 3NF (Codd's 3NF) ?

And one for "Normalised" ?

Sheesh.

> > Do you honestly believe that a tree, in the days of the HM and NM,
> > could survive a circular reference ?
>
> No. In fact I would go futher: a tree with a circular reference is not
> a tree. A tree is a kind of directed acyclic graph. A "tree" with a
> cycle is a cyclic graph (directed or not is hard to say).

Agreed, completely.
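
For the record, a minimal sketch of that definition as a check, in Python. The representation (a dict mapping each node to its parent, with the root mapped to None) is an assumption chosen for the example. A structure qualifies as a tree here only if it has exactly one root and every node reaches that root without revisiting a node.

```python
def is_tree(parent_of):
    """parent_of maps each node to its parent; the root maps to None."""
    roots = [node for node, parent in parent_of.items() if parent is None]
    if len(roots) != 1:
        return False
    for start in parent_of:
        seen = set()
        node = start
        while node is not None:
            # A revisited node means a cycle; an unknown node means a dangling parent.
            if node in seen or node not in parent_of:
                return False
            seen.add(node)
            node = parent_of[node]
    return True


tree = {"root": None, "a": "root", "b": "a"}
cyclic = {"root": None, "a": "b", "b": "a"}  # a and b refer to each other
```

With these inputs, is_tree(tree) holds and is_tree(cyclic) does not: the second structure has a circular reference, so it is a cyclic graph and not a tree.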

> By posing the question as you do, I am led to think you're working with
> an informal definition of "tree".

No. First, I was answering your questions re the RM, Codd's words, so the definition of tree in that context is what it meant in 1970.

Second, I agree with you, that Codd understood a tree is a DAG.

Third, I agree that the term tree remains, and is still used today, with more definition, it is a DAG. Today's definition does not contradict yesterday's.

Therefore, I am not using the term by any informal definition, only the formal ones.

> I hope at this point that we understand each others position regarding
> what the so-called hierchical model means, why it's not a "model" in
> the sense of "relational model",

Yes, we don't agree, but we understand each other's position.

Which btw is all I am aiming for. To have my views examined and understood, and still not accepted is fine. I have no intention of convincing you of anything, as stated from the outset.

I never said that the HM was a model in the sense that the RM is a model, that is your pivot, not my argument. Feel free.

> and why I think it's pointless to make
> any claim about a "hierarchy" based on the components of the keys.

Well, that remains open, on the table, waiting for you to answer questions about the state of your RFS, such that I can demonstrate that the difference, which you insist is insignificant, is indeed very significant, such that we can then form conclusions. I wouldn't be repeating the conclusion without that gap being closed.

If you refuse to proceed, it does not mean my claim is false, or pointless (I have posted evidence), it means that your argument against it lies without evidence.

> It was an interesting foray into the systems of that bygone era, and I
> think I understand, vaguely why you say that pre-relational systems,
> Cullinet et al., influenced relational ones.

Sure, and you are unwilling to penetrate that vagueness.

The memory lane experience was a consequence of your assertions, and my having to detail why they were invalid. Yes, they were marginal, and avoided the central issues.

> > But I have said much more than that, that the HM is fundamental to
> > the RM. Until you understand Codd's words, you will not see that.
>
> I reject that flatly. Whether or not I can convince you I understand
> Codd's words,

I don't need convincing.

It is you who are unwilling to implement Codd's words.

And you who has some weird interpretation of words.

So whether you understand them or not, as evidenced, you refuse to implement them. So from where I sit, you will never understand them.

> I cannot see any way in which "the HM is *fundamental* to
> the RM". You say there's sound theory in proprietary papers that never
> came to light, lo these 45 years later, despite the immense importance
> of the RM and the ever-present reinvent-the-past interest in graph
> databases today. I can't prove you're wrong. All I can say is I don't
> believe you and won't until I can see for myself.

That is all superficial, and marginal, to the central issues.

You've placed yourself in a double-bind. You cannot ever "see for yourself", if you won't implement his words.

> > Until you understand Codd's words, you will not see that.

You are avoiding dealing with the central issues, by refusing to implement Codd's words, denying various aspects of the RM, and insisting that there is "no significant difference".

The bottom line remains, that what you have been implementing for years as "relational", is substantially less than Relational (proved), and does not have [has only a fraction of] the integrity, power, speed of the RM. Which is easily demonstrated, I am waiting for your answers re my diagrams, in order to proceed.

As a result of the interaction in this and the Normalisation thread, there is a consistent body of evidence (presented by you), that demonstrates that you deny various aspects of the RM, and implement only portions of it.

> > 3. Re your teachers' allegations that hierarchies cannot be
> > implemented in the RM
>
> Au contraire. I said tables can represent graphs, and tree are graphs,
> and hierarchies are trees. Therefore hierarchies can be represented
> relationally. Furthermore, they can be represented simpler, because
> value semantics allow both relation and relationship to be represented
> using one structure.

Er, agreed.

But previously you did state, that they stated that, which is why I responded. Now you are stating the opposite.

Ok, now my [3] is closed.

My [4] remains open:

> > 4. Re your teachers' allegations that Codd's FOL and RM doesn't provide for Hierarchies; that SOL/42OL is required for hierarchies; etc, I am waiting for the requested example, in order to show you how all that can be done within the RM, and to prove that such claims are false. If you give me a reference to the AHV book or whatever, that will be fine.

> > If you do not respond, do consider that based on the information I have provided in this thread thus far, which proved their foundation claims re hierarchies in the RM [3] to be blatantly false, and ignorant of the RM, their secondary claims [4], are likewise false, ignorant, and without basis. And when challenged, no evidence is given. Which proves in and of itself, that such claims are baseless.

And all the questions I had, remain unanswered.

Cheers
Derek