Relational Model [Codd] vs Anti-Relational Muddle [Date/Darwen/Fagin/et al] • How to identify a movie?

Nicola

Dec 28, 2019, 8:13:51 AM
Suppose that you want to build a database of all the movies ever made (not
only feature films). How do you identify a movie?

For your reference, there are a couple of European standards:

http://filmstandards.org/fsc/index.php/Main_Page

Also keep in mind that, for movies as for literature, one must typically
distinguish between "works", "variants/expressions" (at the abstract
level), "manifestations" and "items" (at the physical level), as per the
terminology introduced by FRBR:

https://www.ifla.org/files/assets/cataloguing/frbr/frbr_2008.pdf

For instance, there is one "Blade Runner" movie (imagined as a unique
work of art), but with at least two variants, or expressions, (the
original director's cut and the version at the theater), each with
several manifestations (such as DVDs, Blu-rays, digital files in
various formats with or without DRM, etc.), each existing as many items
(my copy of the DVD and your copy of the DVD).
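The four levels in the Blade Runner example can be sketched as nested records. This is only an illustration of the hierarchy described above, not a schema from FRBR or from the film standards; the class and field names (`Work`, `Expression`, `Manifestation`, `Item`, `carrier`, `owner`) and the helper `count_items` are assumptions made up for the sketch.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Item:
    owner: str  # who holds this particular copy


@dataclass
class Manifestation:
    carrier: str  # e.g. "DVD", "Blu-ray", "digital file"
    items: List[Item] = field(default_factory=list)


@dataclass
class Expression:
    label: str  # e.g. "theatrical version", "director's cut"
    manifestations: List[Manifestation] = field(default_factory=list)


@dataclass
class Work:
    title: str
    expressions: List[Expression] = field(default_factory=list)


# One work, two expressions, manifestations per expression, items per manifestation.
blade_runner = Work(
    title="Blade Runner",
    expressions=[
        Expression("theatrical version", [
            Manifestation("DVD", [Item("my copy"), Item("your copy")]),
        ]),
        Expression("director's cut", [
            Manifestation("Blu-ray"),
        ]),
    ],
)


def count_items(work: Work) -> int:
    """Count the physical items reachable from a work."""
    return sum(len(m.items) for e in work.expressions for m in e.manifestations)
```

Note that nothing here identifies anything: each record is reachable only through its position in the tree, which is exactly the question being asked.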

So, how do you identify a movie (at each level)?

Btw, modelling in this context is relatively easy using RM/T, where you
trade flexibility for integrity.

Nicola

Derek Ignatius Asirvadem

Dec 29, 2019, 9:45:21 AM
> On Sunday, 29 December 2019 00:13:51 UTC+11, Nicola wrote:
>
> So, how do you identify a movie (at each level)?

You have to give us the "levels". And their use through the database. Eg. what is the database intended to do: track works for the purpose of ownership; copyright; licensing; etc, or track "items"; things that are sold. The former is intellectual, the latter is physical.

> Btw, modelling in this context is relatively easy using RM/T, where you
> trade flexibility for integrity.

That is like saying, I am willing to trade my wife (who was a virgin when we married) for a girl at the local brothel, because she is "flexible".

That is not a trade, it is the lazy man's excuse for not working, for not applying oneself and performing some genuine modelling. A database in 2019 (45 years since the advent of genuine Relational platforms; SQL) that has no integrity is not worth discussing.

No, that is definitely, absolutely, positively, not a justification for using RM/T (universal Surrogate) or common surrogate. Failure to model means, you have not the basis for using Relational anything, or RM/T. Just use surrogates, no pretence to RM/T, and store records, no pretence to domains; Codd's 3NF; etc..

If you are interested in a Relational database that has 100% data integrity, I would be happy to respond. As explained in the header thread, and expanded in the /1971 Paper/ thread, I wipe my backside with the 1971 paper; the 1974 paper; 1979 RM/T.

Cheers
Derek



>
> Nicola

Derek Ignatius Asirvadem

Dec 29, 2019, 9:47:18 AM
On Monday, 30 December 2019 01:45:21 UTC+11, Derek Ignatius Asirvadem wrote:
> > On Sunday, 29 December 2019 00:13:51 UTC+11, Nicola wrote:
> >
> > So, how do you identify a movie (at each level)?
>
> You have to give us the "levels".

Read a bit of the links.
Got the levels.

They have done some good work, but their modelling ability is not that good.

Cheers
Derek

Derek Ignatius Asirvadem

Dec 29, 2019, 10:04:41 AM
> On Monday, 30 December 2019 01:45:21 UTC+11, Derek Ignatius Asirvadem wrote:
> > On Sunday, 29 December 2019 00:13:51 UTC+11, Nicola wrote:
> >

Fig 3.3 is missing.

Cheers
Derek

Nicola

Dec 30, 2019, 4:42:53 AM
All figures are missing, apparently. Maybe, this is a better document:

https://www.fiafnet.org/images/tinyUpload/E-Resources/Commission-And-PIP-Resources/CDC-resources/20160920%20Fiaf%20Manual-WEB.pdf

It's inspired by FRBR, but it is specific for movies. It has some
diagrams.

Nicola

Nicola

Dec 30, 2019, 5:27:26 AM
On 2019-12-29, Derek Ignatius Asirvadem <derek.a...@gmail.com> wrote:
>> On Sunday, 29 December 2019 00:13:51 UTC+11, Nicola wrote:
>>
>> So, how do you identify a movie (at each level)?
>
> You have to give us the "levels".

You've got them already: work, variant, manifestation, item.

> And their use through the database.
> Eg. what is the database intended to do: track works for the purpose
> of ownership; copyright; licensing; etc, or track "items"; things that
> are sold. The former is intellectual, that latter is physical.

The former is more challenging. I post some more relevant links below.

>> Btw, modelling in this context is relatively easy using RM/T, where you
>> trade flexibility for integrity.
>
> That is not a trade, it is the lazy man's excuse for not working, for
> not applying oneself and performing some genuine modelling.

Of course, I am playing the devil's advocate here, as no one else so far
has argued in favor of surrogates. Continuing to do so, I will assert
that any model based on Relational Keys is doomed to be unstable, i.e.,
for any choice of the primary key you make, I can find an instance of
two different movies coinciding on the values of the chosen primary key,
or prove that for at least one attribute of the chosen primary key the
corresponding information does not always exist.
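The two failure modes in that claim (two different movies agreeing on the key, or a key attribute with no value) can be tested mechanically against any sample of records. A minimal sketch, assuming records are plain dictionaries; the function name `check_candidate_key` and the sample records are hypothetical and exist only to illustrate the test.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Sequence, Tuple

Record = Dict[str, Optional[str]]


def check_candidate_key(
    records: Sequence[Record], key: Sequence[str]
) -> Tuple[Dict[Tuple[str, ...], List[Record]], List[Record]]:
    """Return (collisions, incomplete) for a proposed key.

    collisions: key values shared by more than one record.
    incomplete: records in which at least one key attribute is missing.
    """
    groups: Dict[Tuple[str, ...], List[Record]] = defaultdict(list)
    incomplete: List[Record] = []
    for rec in records:
        values = tuple(rec.get(attr) for attr in key)
        if any(v is None for v in values):
            incomplete.append(rec)
        else:
            groups[values].append(rec)
    collisions = {k: v for k, v in groups.items() if len(v) > 1}
    return collisions, incomplete


# Hypothetical sample: two distinct films with the same title and year,
# and one found reel about which nothing is known but a working title.
films: List[Record] = [
    {"title": "Untitled", "year": "1910", "director": "Director A"},
    {"title": "Untitled", "year": "1910", "director": "Director B"},
    {"title": "Found reel", "year": None, "director": None},
]

collisions, incomplete = check_candidate_key(films, ("title", "year"))
```

With the key (title, year) the first two records collide and the third is incomplete. Adding the director to the key removes the collision in this sample but not the missing values, which is the shape of the challenge posed above.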

> If you are interested in a Relational database that has 100% data
> integrity, I would be happy to respond.

I am. Here is some more contextual information if you like:

http://filmstandards.org/fsc/index.php/Special:AllPages

Most of the links are worth reading, but I'd start from these:

- The case for reference models
- No entity without identity
- Relationships: An essential component of art and culture
- Description levels: A worked example
- Metadata specifications in context

Nicola

Derek Ignatius Asirvadem

Dec 30, 2019, 8:39:51 AM
> On Monday, 30 December 2019 21:27:26 UTC+11, Nicola wrote:
> On 2019-12-29, Derek Ignatius Asirvadem <derek.a...@gmail.com> wrote:
> >> On Sunday, 29 December 2019 00:13:51 UTC+11, Nicola wrote:

Quick post seeking clarifications only, not a reply to yours.

> > Eg. what is the database intended to do: track works for the purpose
> > of ownership; copyright; licensing; etc, or track "items"; things that
> > are sold. The former is intellectual, that latter is physical.
>
> The former is more challenging. I post some more relevant links below.


>
> >> Btw, modelling in this context is relatively easy using RM/T, where you
> >> trade flexibility for integrity.
> >
> > That is not a trade, it is the lazy man's excuse for not working, for
> > not applying oneself and performing some genuine modelling.
>
> Of course, I am playing the devil's advocate here, as no one else so far
> has argued in favor of surrogates. Continuing to do so, I will assert
> that any model based on Relational Keys is doomed to be unstable, i.e.,
> for any choice of the primary key you make, I can find an instance of
> two different movies coinciding on the values of the chosen primary key,
> or prove that for at least one attribute of the chosen primary key the
> corresponding information does not always exist.

Seems like a decent challenge.

But that is a rather strong claim that you make against RKs.

Are you saying that you gain anything from surrogates, that if surrogates were used
- there will never be more than 1 movie with the "value of the chosen AK" [Relational Key]
- there will never be an attribute value that does not exist
?

This, after accepting that with surrogates you "trade off integrity".

I don't understand. If you are willing to "trade off integrity" with a surrogate framework (it cannot be called "model"), on what basis, from what ground, are you attacking a Relational data model, that has generations of data integrity (more than the "theoreticians" have identified) that the surrogate framework does not have, that might have [you say you can prove] this or that fault.

It is like a serial killer calling out a shoplifter who has never dreamed of killing anyone.

> I am. Here is some more contextual information if you like:
>
> http://filmstandards.org/fsc/index.php/Special:AllPages

Much better.
We use the "Full hierarchy model (3 levels)" page 8 ?

> Most of the links are worth reading, but I'd start from these
>
> - The case for reference models
> - No entity without identity
> - Relationships: An essential component of art and culture
> - Description levels: A worked example
> - Metadata specifications in context

First, I have a time problem. Second, modernism is deplorable. Third, I stopped reading academic papers when the main frame of papers was obviously Straw Man, it is like academia = Straw Man now. Except yours of course. IIRC /No entity without identity/ is one nutcase arguing that some other nutcase was wrong, and neither nutcase realises that the reason they are locked up, their disease, is modernism.

The various devices (Straw Man with heroin) they use to find Aristotle "inadequate" are not as convincing of their rigour, as their drooling snot, and the pulmonary embolisms they died from. Not to mention the drugs and the orgies. You are free to hold them in regard, and to nurse them, please do not ask me to do so.

Fourth, you are making the claims, so I would ask you to back yourself up, with your own arguments, which may have a prior author, rather than citing the mountain of pig poop, which may have a few gems hidden in it. Unless you explicitly ask me to read one or the other.

They have even destroyed the definition of /ontology/ in their pathetic war against God, so as to make being-ness unimportant and their phantasmagorical non-being relevant. Why, now we have multiple "ontologies", one per library or dataset. And "description logics". Hysterical.

The Law of Identity is the first of the Four.

--

Again, this is clarification seeking only. Sorry for the 4 paras.

Can there be a Work without a creator (author; writer; conceptualiser; or whatever) ?
No.
Then why on earth are they defining Work as Independent, Identified by
"W" [gotta love the bunnykins]
+ "-" [regression to 1960's mainframes, before we introduced Normalisation]
surrogate.
Answer: they read Date & Darwen.

I do understand, the FIAF manual is about their requirement, not the database. But they do assert "Identifiers". Must have had an OO munchkin for IT consultancy.

There is no entity without Identity. So the schizophrenics confect an Identity, so that their unreal fantasies can be made, oo ooh oo, into entities.

Cheers
Derek

Nicola

Dec 31, 2019, 6:16:57 AM
On 2019-12-30, Derek Ignatius Asirvadem <derek.a...@gmail.com> wrote:

> Are you saying that you gain anything from surrogates, that if
> surrogates were used
> - there will never be more than 1 movie with the "value of the chosen
> AK [Relational Key]
> - there will never be an attribute value that does not existent
> ?

No and no. What do you gain you ask? (Recall that I'm playing the
devil's advocate here) I'd say you gain simplicity and flexibility. If
I define Work(WorkID), then I can link all the properties I need to any
work, e.g., Title(WorkID, Title, Language, Type). I don't have to decide
in advance whether there will ever be a movie with two distribution
titles in the same language: my framework can accommodate that. Do
I have record data about a movie with no title? No problem. And so on.

Is there a chance that I will insert data about a movie into
the database n times? Sure, but eventually someone will discover the
duplication and merge the data. Think as an archivist: it's better to
have more copies of the same thing than none at all. My count of movies
will be slightly off, but it will improve with time.

Is there a chance that I will insert contradictory data (e.g., the fact
that a movie has no title, and its title)? Sure, but that means that
there is conflicting information about a movie in the real world, which
must be resolved. In the meantime, I have a place to record such data,
so it doesn't get lost.
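To make the devil's-advocate position concrete, here is a rough sketch of Work(WorkID) and Title(WorkID, Title, Language, Type) in SQLite, driven from Python. The column types, the `add_work` helper and the sample rows are assumptions for illustration only. It shows the first two points: the same movie can be entered twice under different surrogates, and a work can exist with no title at all.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Work (WorkID INTEGER PRIMARY KEY);  -- surrogate only
    CREATE TABLE Title (
        WorkID   INTEGER NOT NULL REFERENCES Work(WorkID),
        Title    TEXT NOT NULL,
        Language TEXT NOT NULL,
        Type     TEXT NOT NULL
    );
""")


def add_work(titles):
    """Insert a work with zero or more (title, language, type) rows."""
    work_id = conn.execute("INSERT INTO Work DEFAULT VALUES").lastrowid
    conn.executemany(
        "INSERT INTO Title VALUES (?, ?, ?, ?)",
        [(work_id, title, lang, typ) for title, lang, typ in titles],
    )
    return work_id


a = add_work([("Blade Runner", "en", "original")])
b = add_work([("Blade Runner", "en", "original")])  # same film, entered again
c = add_work([])                                    # a work with no title

# Nothing stopped the duplicate; it has to be found after the fact.
duplicates = conn.execute("""
    SELECT Title, Language, Type, COUNT(*)
    FROM Title
    GROUP BY Title, Language, Type
    HAVING COUNT(*) > 1
""").fetchall()

untitled = conn.execute("""
    SELECT WorkID FROM Work
    WHERE WorkID NOT IN (SELECT WorkID FROM Title)
""").fetchall()
```

The duplicate query is the "eventually someone will discover the duplication" step: the database accepts both rows and leaves detection and merging to a later process.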

Of course, I will end up with a spaghetti model, essentially what
"linked data models" look like. But I can organize it using an
"ontology", which will put every piece of data within a well defined
hierarchy.

> This, after accepting that with surrogates you "trade off integrity".
>
> I don't understand. If you are willing to "trade off integrity" with
> a surrogate framework (it cannot be called "model"), on what basis,
> from what ground, are you attacking a Relational data model, that has
> generations of data integrity (more than the "theoreticians" have
> identified) that the surrogate framework does not have, that might
> have [you say you can prove] this or that fault.

Is data integrity so fundamental, or even a priority, in this context?
Cataloguers and archivists' main priority is to amass as much
information as possible. They are used to working with mess (I heard some
of them literally say that they "love working with XML"). Let's care
about getting things "in"; somehow, we will find a way (even if
complicated) to get things "out". A properly designed Relational Model
would put too many constraints on what gets "in", although I must admit
that it would be somewhat easier to get the right things "out".

(End of playing the devil's advocate)

>> I am. Here is some more contextual information if you like:
>>
>> http://filmstandards.org/fsc/index.php/Special:AllPages
>
> Much better.
> We use the "Full hierarchy model (3 levels)" page 8 ?

It's ok with me.

>> Most of the links are worth reading, but I'd start from these
>>
>> - The case for reference models
>> - No entity without identity
>> - Relationships: An essential component of art and culture
>> - Description levels: A worked example
>> - Metadata specifications in context
>
> First , I have a time problem. Second, modernism is deplorable.
> Third, I stopped reading academic papers when the main frame of papers
> was obviously Straw Man, it is like academia = Straw Man now. Except
> yours of course. IIRC /No entity without identity/ is one nutcase
> arguing that some other nutcase was wrong, and neither nutcase
> realises that the reason they are locked up, their disease, is
> modernism.

Those are not papers, but (AFAICS) explanations about the reasoning that
went behind the standards that were eventually published. Not mandatory
reading. Skip that if you wish.

> The various devices (Straw Man with heroin) they use to find Aristotle
> "inadequate" are not as convincing of their rigour, as their drooling
> snot, and the pulmonary embolisms they died from. Not to mention the
> drugs and the orgies. You are free to hold them in regard, and to
> nurse them, please do not ask me to do so.
>
> They have even destroyed the definition of /ontology/ in their
> pathetic war against God, so as to make being-ness unimportant and
> their phantasmagorical non-being relevant. Why, now we have multiple
> "ontologies", one per library or dataset. And "description logics".
> Hysterical.

(Playing the devil's advocate again) Ontologies help us clarify concepts
and classify things. Description logics allow us to perform inferences
(even reason about contradictory data), which SQL cannot.

> Can there be a Work without a creator (author; writer; conceptualiser;
> or whatever) ?

No, but it's not unreasonable to imagine that in some cases the creator
may not be known. Not thinking about feature movies, of course, but
obscure films or tapes found in the depths of some storehouse. Same as
every other form of art (think paintings by unknown author); one
difference, though, is that a movie typically has no single creator, but
rather a group of people with distinct roles (writer, screenwriter,
director, etc.). Not sure whether you may attribute a movie to a single
principal creator.
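A small sketch of that point, assuming a credit is a (role, person) pair in which the person may be unknown. The names `Credit` and `principal_creators`, and the choice of "director" as the default principal role, are assumptions for illustration, not something the standards prescribe.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Credit:
    role: str              # e.g. "director", "screenwriter"
    person: Optional[str]  # None when the creator exists but is not known


def principal_creators(credits: List[Credit], role: str = "director") -> List[str]:
    """Return the known people holding the given role."""
    return [c.person for c in credits if c.role == role and c.person is not None]


blade_runner_credits = [
    Credit("director", "Ridley Scott"),
    Credit("screenwriter", "Hampton Fancher"),
    Credit("screenwriter", "David Peoples"),
]

# A reel found in a storehouse: somebody made it, nobody knows who.
found_reel_credits = [Credit("director", None)]
```

For the found reel, `principal_creators` returns an empty list, so any identifier that depends on a named creator has nothing to work with in that case.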

Nicola

Derek Ignatius Asirvadem

Jan 1, 2020, 8:36:27 AM
> On Tuesday, 31 December 2019 22:16:57 UTC+11, Nicola wrote:
> On 2019-12-30, Derek Ignatius Asirvadem <derek.a...@gmail.com> wrote:

First off. Happy New Year, all the best to you and yours.

We are suffering 42°, high winds, and raging bushfires in every state. Nothing to do with climate change hysteria, but due to only one thing. (I have been riding horses in the bush for over 30 years, 9 years as a volunteer bush fireman.) 25 years of leftie; greenie; schizo bureaucrats disallowing permits to burn off the build-up of fuel on the ground, which we have been doing for 40,000 years (so our dear aboriginals say). So when the natural fire comes, and it does every 7 years, it has an enormous build-up of fuel, and thus it rages, with high intensity. It sends embers up into the heights, where the wind that the fire generates sends it to where the fire is not. And so it propagates itself.

I pray that 2020 is the year that the lefties; greenies; schizos consume themselves by way of their own created disasters.

I will answer in what I believe is a logical sequence, not that in your post or my previous one.

> Those are not papers, but (AFAICS) explanations

Sorry. They ARE the titles of philosophical papers. Which I have read, and dealt with long ago. Hence my rant re philosophy.

But wait on. Some of us do think about such things (eg. Identity) deeply. Which is why the entire body of Western Thought (ie. pre-modern) is so important. And then we do not confect what we think Identity is from scratch, in isolation, limited by the content of a scrambled cranium, but we implement what we /know/ to be Identity in a computer.

> Those are not papers, but (AFAICS) explanations about the reasoning that
> went behind the standards that were eventually published. Not mandatory
> reading. Skip that if you wish.

Now corrected, I will read the links.

Something is not a standard just because it is published. All manner of pig poop is published. Some of it achieves the status of a convention. Some of it, eg. UML is heavily promoted and declared as a "standard" but has nothing of a Standard.

At best, this is a set of considerations that they request we make, when implementing a computer system that stores the subject matter.

> [various comments accepting primitive, sub-standard, broken methods as a "database"]

Look, by the Grace of God, in the 70's and 80's, I was a Lead s/w Engineer for Cincom, supplier of the then 5th largest (of five!) DBMS, TOTAL. Look me up on LinkedIn. My team wrote the second generation for minicomputers, then the first great threat to mainframes. 100% multi-threaded and full Transaction control. We had integrity in the first version, much more so in the second. Only Britton-Lee with their Database Machine, when they moved to Relational; SQL, and became Sybase, eclipsed us.

I left primarily because Cincom would not move to Relational. Again only through His Charity, I moved directly into high-end consulting for high-end customers. Meaning that I stayed in the high-end, rigid standards, absolutism about integrity and quality. With no knowledge or regard of what Date; Darwen; Fagin; et al were doing to damage the /Relational Model/, because I was implementing the /RM/ at that high level, and without any problem.

So in the 00's, when I started helping smaller shops with their D&D&F RFS labelled as "relational", it was more than a shock to find that the theoretical and applied theoretical world was already destroyed.

Likewise Bill Gates has a lot to answer for. In 1981 when the first PCs started coming out, we were up in arms about the PCs, that failed all the time, that hung up, that froze with the "blue screen of death" (then a black screen). It is still legal in Australia to return something that does not work, that is not fit for purpose. But people seldom do that for PCs, they have been programmed into accepting, and working with filth that passes for merchandise.

Not me. Not my customers.

If we take life from the top, down (genuine authority; mainframe to mini; standards; quality), we have had great systems since the 1960's, and ongoing. None of the problems that you enumerate, which are in reality far worse, and which you (devilishly) accept.

If we take it from the bottom, up (rebellion against authority; PCs; no standards; no quality), we have filth. The great unwashed masses love their great unwashed filth. All the problems you enumerate, and much worse. Simply accepted, no devil's advocate required.

In my 30 years (since I left Cincom) of delivering high-end databases, mostly to large Aussie banks, it has been a total top-down rewrite, a "Version 2", of their terribly broken, bottom-up, devoid-of-standards unwashed filth. The point is, the filth can be borne for a limited time only, and market forces will drive a move to higher-level systems. Either the corp disintegrates to dust, or it has to elevate its game.

Usually I rewrite on their current platform, their current h/w. Always I guarantee that the new system will be 10 times faster, and that any report can be serviced by a single SELECT command. That is totally foreign to them. In many cases they already have a heavy-duty reporting system such as BusinessObjects, which has a "universe" that describes the V1 "database", so that developers & users CAN write reports (they absolutely cannot otherwise). After the V2 system is implemented, they disconnect BO, and stop paying the mega-fees for annual maintenance. Because every report can be written directly from the Relational Database. Because it is Logical. Because the data model is a logic map. Because Codd used FOPC. Because I did as well. Because SQL is a proper implementation of FOPC; of Codd's RA; of Codd’s RM.

In reality, as distinct from the contractual guarantee, I deliver two orders of magnitude, and SELECTS that used to take tens of minutes, in milliseconds.

(The "theoreticians" complain that SQL is this and that, that it is "broken". SQL is never the problem. The data model or spaghetti model of the RFS is the problem, and then yes, any data sub-language would be "inadequate". In my three years' hard labour at the TTM Gulag, Darwen would often present a problem, with the herald that "the RM is incomplete" or "Codd did not define" or "SQL is broken", and I would always, each and every time, give the pig poop eater a real Relational data model, which either eliminated the "problem", or provided the base for simple and straight-forward SQL, a single SELECT that produced the report that he alleged was "impossible".)

All of that is documented in TTM. (Includes me destroying the insane arguments that the TTM slaves slavishly erect. Straw Men orgies.)

Some of that is documented here on c.d.t.

There are other docs that I have given a better treatment to:
https://www.softwaregems.com.au/Documents/Article/Application%20Architecture/UTOOS%20Response.pdf
https://www.softwaregems.com.au/Documents/Article/Normalisation/DNF%20Data%20Model%20D.pdf

Here's one a friend of mine took up, but did not complete, or concede defeat:
http://www.softwaregems.com.au/Documents/Article/Normalisation/DNF%20Nicola%20C.pdf

(You can take up any of those here. Or privately. As declared in the header thread, failure to concede a point honourably constitutes concession of the point. If so, please open a new thread. This thread is Movie Identity.)

My projects include implementation of a great library system, competing with GEAC at the time. GEAC was a great system in the 80's. But sure, academics make a total mess of it. One of my college mates was an engineer at GEAC, before this deal, I was well versed in their offering. We competed for a contract with this system at Queens:
https://library.queensu.ca/techserv/cat/Sect01/history.html#geac
We stopped because the academics at the Queens library were idiots, there was no possibility of pleasing them. Here they have taken the world's best library system at the time, and implemented it their way, against the advice of GEAC, the supplier, and then they complain, falsely that the GEAC system was the problem. Er, their library system was broken, and they would not implement the changes that we at Cincom, or that GEAC required.

Academics love whatever they come up with, they hang on to the promise, in pathological denial of reality. Stonebraker invented the filth known as Ingres. It defecated on the users if more than five used the system. But it was highly praised by academia. Its bastard child, its resurrected ghost, is Postgres*NON*sql. It does not even remotely comply with SQL, but they will in their beloved Straw Man way, argue that SQL is broken. No, no, no. Pissgress is broken, it is not SQL. It has many cowbells and dog-whistles that assist an RFS (no assistance to an Rdb, because an Rdb has SQL), but none of the basics of SQL. The use of SQL in the label, and the slavish repetition on every manual page, is a total fraud. But hey, academics love it. It reminds them of their most famous idiot, Stonebraker. It gives them comfort that they too, can spend their entire academic life producing nothing that works, and filth that does not.

The worst thing in that category is, anything you write in pus-filled "sql", when you finally move to an SQL platform, has to be re-written, at the least. Every single line of code. But wait. Once you realise that you did have 42 work-arounds in your data model to cope with the pus-NONsql, you will rewrite the data model as well. I have assisted in scores of such ventures.

(
That is just about PusGrossNONsql vs SQL, that does not cover the other issues, such as No Server Architecture; No ACID Transactions; No possibility of concurrency. If you do not know what that is, have a look at this introductory level doc. For every instance of "Oracle", substitute "PissGrossNONsql".
https://www.softwaregems.com.au/Documents/Article/Oracle%20Circus/Oracle%20vs%20Sybase.pdf

Chase the links if you would like to know what a Server Architecture actually is, what we have had in the real world (as distinct from the academic fantasy world) since 1981. This is why I know, before the fact, that my V2 database will run on the existing customer h/w, at 10 times the speed of the V1.

Then look at the thousands of concurrency problems about the pig poop on help sites such as StackOverflow.
)

Of course, every crime is one of omission as well as one of commission. They have to pathologically deny what SQL really is, in order to elevate their pig poop as "sql". And don't forget, that same pathology, the denial of the Four Laws of Thought, allows them to bask in the Excluded Middle, denying resolution. They genuflect to the ghost of Stonebraker, they suppress Codd, and elevate the insanity of Date; Darwen; Fagin; et al. They love the fact that they are four decades removed from the real world. They sacrifice their daughters to the god of Open Scourge. As if the contents of 10,000 crania spread across the world, and each nicely scrambled, ignorant of what a server is, what integrity is, can ever have the integrity that is naturally present in any single undamaged cranium.

The Kool Aid is Pig Poop. The source is Date; Darwen; Fagin; et al. Distilled and matured in large intestines of friendly sows. Very very friendly.

In this field, database science, there is not a single theoretician or "theoretician" since Codd that serves the industry. The mountain of evidence is, the commercial suppliers drive the industry, and they employ great theoreticians. And a few great Applied Scientists who implement systems.

----

Therefore, if you would like to engage because you would like to know what actual Authority; Standards; Codd (RM & Twelve Rules only); genuine SQL; high-end customers have been doing for four decades, while academia has been hyper-ventilating about, and engorging themselves with, each other's backsides, we have a chance of having a meaningful discussion. Specifically, you will gain something.

But if you want to remain solidly in that bottomed-out position that academia is, by virtue of the mountain of evidence, and merely argue (devil's advocate or not) with the high-end real-world implementations That Do Not Break, that are unknown to academia FOR FORTY YEARS (fifty re the RM), if you are attached to the world where the minimum is good enough, spaghetti for logic; then no, we are not going to get anywhere. Specifically, you will gain nothing. Only frustration, because the attempted re-inforcement of the slavish arguments will fail to find the re-inforcement that is sought.

I am a technician, I have no sales skills at all. I cannot sell you up from that position. People seek me, I don't seek them. Because they have had a gutful of the pain they are suffering, and because they have identified that my systems have zero pain. And not before. So take it, that if you are not in enough intellectual pain, psychological pain, if you have not identified that the mountain you are standing in is made of pig shit, if your pain is not enough to drive you to seek freedom from it, you might not have much of a chance at grasping the concepts and standards that the painless enjoy, and have enjoyed for forty years.

I am not saying that you have to drop everything that has sustained you in your career before you can have a chance to attach yourself to the real world. That is true, because all academics in this field are addicted to the mountain of academic pig poop, exactly the same way a drug addict is. But that is not the condition I am declaring. I am saying you must have some serious pain, caused by the natural revulsion at the sophistry that passes for "logic", the foul odour of the mountain you are standing in, to move away from it.

----

> > Are you saying that you gain anything from surrogates, that if
> > surrogates were used
> > - there will never be more than 1 movie with the "value of the chosen
> > AK [Relational Key]
> > - there will never be an attribute value that does not existent
> > ?
>
> No and no.

Phew.

> What do you gain you ask? (Recall that I'm playing the
> devil's advocate here)

Otherwise known as throwing up the slavish arguments that constitute the “academic” “literature”.

> I'd say you gain simplicity and flexibility. If
> I define Work(WorkID), then I can link all the properties I need to any
> work, e.g., Title(WorkID, Title, Language, Type). I don't have to decide
> in advance whether there will ever be a movie with two distribution
> titles in the same language: my framework can accommodate that. Do
> I have record data about a movie with no title? No problem. And so on.

Oh come on. That is the most unscientific drivel that I have heard in a long time. You are a scientist. Why don't you observe that that is a moving, shifting anti-framework, that frames nothing. Why don't you observe that if you (personally, because you are a scientist) apply a little bit of science, all that can be replaced with an Entity-Attribute-Value set of files. Much cheaper. No need to even name anything, let the users name and number their things, according to whatever they define their things to be. Set up just four files in your RFS "database":
- Entity
- AttributeChar & Value
- AttributeNumeric & Value
- AttributeText & Value

There, done, they have a system that will serve every single one of their needs, even needs they have not dreamt of yet. Until they die. Or until they kill you.
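For readers who have not seen one, the four-file Entity-Attribute-Value layout named above can be sketched as follows. The column names (`EntityID`, `Name`, `Value`) and the sample rows are assumptions for illustration; the post names only the four files. The point of the sketch is what the layout accepts without complaint.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Entity           (EntityID INTEGER PRIMARY KEY);
    CREATE TABLE AttributeChar    (EntityID INTEGER, Name TEXT, Value TEXT);
    CREATE TABLE AttributeNumeric (EntityID INTEGER, Name TEXT, Value NUMERIC);
    CREATE TABLE AttributeText    (EntityID INTEGER, Name TEXT, Value TEXT);
""")

conn.execute("INSERT INTO Entity (EntityID) VALUES (1)")
conn.executemany("INSERT INTO AttributeChar VALUES (?, ?, ?)", [
    (1, "title", "Blade Runner"),
    (1, "Title", "Bladerunner"),   # second spelling of the attribute name: accepted
])
conn.executemany("INSERT INTO AttributeNumeric VALUES (?, ?, ?)", [
    (1, "year", 1982),
    (1, "year", 1892),             # conflicting value for the same attribute: accepted
    (99, "year", 2019),            # entity 99 was never created: accepted
])

years = [row[0] for row in conn.execute(
    "SELECT Value FROM AttributeNumeric "
    "WHERE EntityID = 1 AND Name = 'year' ORDER BY Value"
)]
orphans = conn.execute(
    "SELECT COUNT(*) FROM AttributeNumeric "
    "WHERE EntityID NOT IN (SELECT EntityID FROM Entity)"
).fetchone()[0]
```

Every decision about what an attribute is called, how many values it may have, and whether its entity exists has been moved out of the schema and onto whoever enters or reads the data.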

Now you are free. Go and do something worthwhile with your precious life.

> I'd say you gain simplicity and flexibility

No. That is not simplicity. Stop lying to yourself. That is ignoring the science that produces simplicity, and shifting the responsibility back to the user. That system is, in scientific fact, complexity fraudulently claimed to be "simplicity". What you gain is abject abdication of responsibility, same as the other theoreticians', the other "academics" in this field. It gives you the freedom to now write yet another paper about what could work if a certain fantasy were "true". Same as the fifty-year-old men who claim that they would be happy only if they were /inside/ a five-year-old girl's body. There is a clinical word for that, and it is not "scientist" or "academic" or "theoretician".

Simplicity derives from using more than one Standard, more than one science, that together are plaited; braided; inter-woven, that can be used precisely because they are all compliant with higher order truth.
Eg. FOPC-> RM-> Relational data model.
Eg. FOPC-> Predicate -> Relational data model -> Validation
Pre-modern science, /scientiam/, means KNOWLEDGE, specifically knowledge of truth. All science is Integrated, because all truth is Integrated.

Post-modern "science" is speculation claimed to be "knowledge", an ever-changing morass of filth, and that can only be contemplated after denial of science.

Science is not knowledge of fragments that exist in isolation, floating in space, with contrived relations, that require massive doses of insanity or good drugs to see. The science is Integrity, something totally foreign to the pig poop eaters that pass for "theoreticians" in this space, whose mountain of pseudo-science is ever-changing pig poop.

Truth, and only truth, is simplicity.

The task of science is to determine that truth. What you are doing (devil or not) is anti-science, abdicating your responsibility, and claiming the "simplicity" that you would have had if you had been a responsible scientist.

> Is there a chance that I will insert data about a movie into
> the database n times? Sure, but eventually someone will discover the
> duplication and merge the data. Think as an archivist: it's better to
> have more copies of the same thing than none at all. My count of movies
> will be slightly off, but it will improve with time.

That is the same argument that the poor old Jew in the ghetto uses, to sell copper as gold.

Science, Logic: Pig poop cannot “improve” over time or millennia. A reptile cannot “evolve” into a mammal. An ape cannot “evolve” into a human. A higher-order creature can deteriorate into the behaviour of a lower-order creature (but it remains the original creature), but a lower-order creature cannot “evolve” to a higher-order creature. The notion is idiotic, but sold on every propaganda channel. Think science. Think DNA. Think number of chromosome pairs in each species. Think gene editing.

No. If the defining order was pig poop, it can never be anything but pig poop, all fantasies to the contrary. Yes, every cripple, every retard, fancies themselves to be the next Einstein.

> Think as an archivist

I have, and I am. I have worked with real library systems (that do not break, that are still running 30 years later). In case you do not know, you can run a GEAC or OpenVMS system today, even though the hardware has not been manufactured for 30 years, on virtual machines that provide GEAC/OS or OpenVMS as THE platform. "Don't fix what ain't broken".

I have completed projects at the Dept of Minerals & Energy. Think full 3D geographic locations; channel definition, for mining operations, for licensing purposes (there had better not be any conflicts, or even liabilities that stem from mine shafts that are too close together, or too close to water channels, or too expensive to be commercial). An ancient GEAC system plus my Relational database to provide all the modern expectations, on Unix/Sybase. At a fraction of the price after they threw the multi-million dollar supplier out, specifically for promising like you, and delivering like you can, a piece of pig poop with no integrity. (Unfortunately I bid a six figure price, I did not know at the time that the previous supplier bid a seven figure price.) The system is still running today, GEAC over 30 years old and defunct, mine over 20 years old and never needing maintenance.

You are using the word "archivist" or "librarian", but you mean an imbecile who does not know the slightest thing about archival or libraries. You are happy with a "computer system" and a RFS fraudulently claimed as "database" (let alone Relational). As long as you seek employment from one-man companies, and the insecurity of such positions, which you are very very familiar with anyway, that will be fine.

If ever you get out of the garage and smell the roses, if ever you seek employment in the real world, you might find that before you make hysterical claims about what you know about "archival", you will have to get formal education, and a certificate and everything. (And that cannot be obtained from FIAF or FilmStandards.org.)

Nah, you are not thinking archivist, you are not thinking responsible scientist who delivers science such that the system is simple for the archivist. You are thinking cheapest bang for the buck, never see him again, simplest way to get a firetruck at the brothel (the sick ones are really cheap) and a feed at the Porcine Excreta Diner tonight, the hell with tomorrow, you won’t be around when he finds out you defrauded him.

Copper for gold.

> Is there a chance that I will insert contradictory data (e.g., the fact
> that a movie has no title, and its title)? Sure, but that means that
> there is conflicting information about a movie in the real world, which
> must be resolved. In the meantime, I have a place to record such data,
> so it doesn't get lost.

Get serious. It is lost already. You don't have a name by which you can find it (and you stopped writing ID numbers on a piece of paper five years ago).

Get serious. There is no conflicting info in the real world (it conforms to the Law of Non-Contradiction and the Law of the Excluded Middle). If there is, that proves you simply have not figured it out, you are simply not ready to enter anything authoritative into a computer registry (you can, into a mickey mouse RFS).

Get serious, the whole movie industry, in all countries, is run by Jews. The material goal (let's not discuss the formal goal) is about money; money; and nothing but money. They make sure that there are no conflicts about their money; money; money. Same with the Chinamen.

Get serious. There is no such thing as a movie without a title. (There is such a thing as a "working title".) Because in Reality, there is no entity without an identity.

> Of course, I will end up with a spaghetti model,

Minus the meatballs.

> essentially what
> "linked data models" look like.

What scientists call pig poop.

> But I can organize it using an
> "ontology",

Sure. Yet another substantial labour that I do not have to do. Because my Ontology is already in, is fully integrated with, the Relational data model. Same as not needing a massive BusinessObjects "Universe".

> which will put every piece of data within a well defined
> hierarchy.

False. You (and FIAF; etc, the “theoreticians”; et al) have no clue what Hierarchy means. The notion of Hierarchy has been, and continues to be suppressed in post-modern education; speech; "academic" papers; "literature"; every single propaganda channel; etc. Ever since the French Destruction, the unnatural notion of "equality" has been elevated, and the natural hierarchies have been suppressed. You are at the end of centuries of that suppression, the actual hierarchies are totally invisible to you, an inculcated blindness, demanded in order to foster the unnatural and destructive "eek-wally-tee". (Wally is Aussie slang for idiot.)

What you do have, is a fixed and meaningless (not "defined" in intellectual terms, but specified in physical-only terms) list. Look up Nested Sets; Adjacency Lists, beloved of imbeciles such as Date and Celko. The "definition" one has when one has no clue what definition is.

There are THREE types of genuine Hierarchies, which are inherent (yes, inherent) in the Relational Model. In the fifty years since Codd, not a single "theoretician" has defined or articulated them. Proved it is, that that is not known to you, by the fact that you are pushing linked lists of physical IDs as "hierarchies".

> > This, after accepting that with surrogates you "trade off integrity".
> >
> > I don't understand. If you are willing to "trade off integrity" with
> > a surrogate framework (it cannot be called "model"), on what basis,
> > from what ground, are you attacking a Relational data model, that has
> > generations of data integrity (more than the "theoreticians" have
> > identified) that the surrogate framework does not have, that might
> > have [you say you can prove] this or that fault.

Ok, before we get into the next point. As defined in the header thread, by definition, you have conceded that you are a flaming hypocrite (with or without the devil assisting you).

Conceded also. You can prove nothing about what you say you can prove. If you could, this would be the point at which delivery of alleged proof is begged. That point has now passed.

----

Ok, so now your new position is, since you (your anti-model) have no integrity, integrity is not important. Now that the Jew has been proved a fraud, that the gold is copper has been exposed, he is saying the gold is not important.

We should make a movie about this. Harvey Feldman. Bernie Goldstein. Einstein could play a cameo role, especially if we keep Poincaré and the Scientists out of the picture. Let's call it a collaborative Work. You and me and no-one else, 50-50. I'll write up the concept. Working title is, wait for it, /Copper for Gold, and the Pigs are Free/. (If that is foreign to you, look up the song by Dire Straits.) And it exists as a Work without a creator or creators. Quick, you need an ID for that.

> Is data integrity so fundamental, or even a priority, in this context?

No, not in your contrived context.

Absolutely, in the real world. Anyone suggesting otherwise classifies himself as unqualified for the job. Let me assure you, at the banks, I have witnessed or participated in instances where we phoned security and Human Resources, and had the termination cheque ready before the speaker had finished his presentation. There is a certain way that we warn each other that a person is strolling into termination territory, to STFU, by waving our security pass quietly.

The point is, even in corps that have less concern about integrity and security, such a statement or implication seriously damages the speaker’s credibility. Unrecoverable.

> Cataloguers and archivists'

In your contrived multiverse

> main priority is to amass as much
> information as possible.

Perfect, for the unwashed, the contrived garage "archivists", who think that mountains of form-less matter has value. Like a 500 page thesis that says nothing.

Pig poop, for the real world cataloguers and archivists who are qualified (certified), who understand that Matter is irrelevant, especially the volume, that Form (what you intellectually know about the matter; how the matter is intellectually organised) is relevant. Especially when finding anything in the archive. The same day as the request.

> They are used to work with mess (I heard some
> of them literally say that they "love working with XML"). Let's care
> about getting things "in"; somehow,

Sounds like the drunk at the brothel on a Saturday night. And the 15 mins is coming to a close. He can't get it up, but he insists on getting it in. The girls must love clients like that.

> somehow, we will find a way (even if
> complicated) to get things "out".

The adorable "logic" of the insane. Great hilarity for a New Years Day, thank you. You can't get it in; you can't find it after you get it in; but you promise you can get it out. Despite the duplicates and the mis-attributions and the registration of things that are not real. Truly. Oh yeah, and despite having proved all that, you promise that you are capable of thought, and even "complicated" thought when called for.

> A properly designed Relational Model
> would put too many constraints on what gets "in", although I must admit
> that it would be somewhat easier to get the right things "out".

That first part is a Straw Man argument. You have framed what it really is, as something that it is not. And then you have attacked the frame that you created. Congratulations, you win at destroying the nonsense that you created, I did not need to lift a finger. Great, what it really is has not been affected.

A properly designed Relational data model has many constraints, yes. A mature RDM has twice as many constraints on the same number of tables. Constraints of the type that are unknown to you and the "theoreticians" here. They deliver, and enforce, Integrity. Undamaged humans love Integrity. Dis-integration is virtually the definition of criminality, of insanity. But they are invisible to the user, the constraints or their number is not known, they do not know that it is especially hard to get things in, they only know:
- that it is natural (in nature, only hard things get in);
- that it is integrated with all the relevant science (truths about the profession of librarian or archivist) that they know
- that it does not contradict science or reality
- that it is therefore simple.
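A minimal sketch of what declared constraints do, using Python's built-in sqlite3 module. The table, its columns, and the choice of (Title, ReleaseYear) as the key are assumptions made purely for illustration; the thread has not declared a key for a Work at this point.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical table: the key choice is an assumption for this sketch.
    CREATE TABLE Work (
        Title       TEXT    NOT NULL CHECK (length(Title) > 0),
        ReleaseYear INTEGER NOT NULL,
        PRIMARY KEY (Title, ReleaseYear)
    );
""")

def try_insert(title, year):
    """Attempt one insert and report whether the constraints let it in."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO Work (Title, ReleaseYear) VALUES (?, ?)",
                (title, year),
            )
        return "accepted"
    except sqlite3.IntegrityError:
        return "rejected"

results = [
    try_insert("Blade Runner", 1982),  # first registration
    try_insert("Blade Runner", 1982),  # duplicate of the same key
    try_insert("", 1982),              # empty title fails the CHECK
    try_insert(None, 1982),            # missing title fails NOT NULL
]
print(results)
```

The person entering data never sees the constraints themselves, only that the duplicate and the untitled entries do not get in.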

> (End of playing the devil's advocate)

Thank God.

The devil has no chance at all, against one who loves the Truth.

> > "Academics" ...
> >
> > They have even destroyed the definition of /ontology/ in their
> > pathetic war against God, so as to make being-ness unimportant and
> > their phantasmagorical non-being relevant. Why, now we have multiple
> > "ontologies", one per library or dataset. And "description logics".
> > Hysterical.
>
> (Playing the devil's advocate again)

God help me.

> Ontologies help us clarify concepts and classify things.

No they do not. Ontology means one thing, and one thing only, for 2,000 years.

Your (and their) "ontology" is quite a different thing, contrived; man-made (actually damaged-man-made); confected. Using the label "ontology" is a gross fraud, which gives it a respectability that it has not, and it suppresses the real ontology. For those sins, you (as devil's advocate) will burn in hell, for eternity.

(
Let me say, I am currently assisting a PhD in AI. She sought me through a common network, precisely because they told her that I can help her with the "ontologies" and "descwiption logics" that were slowly exploding her head. Believe me, I do not go around looking for people whose brains I can unscramble. I am straightening that head out, but it will take a year or three. I can't give her any of my proprietary work, even under NDA, because she has Chinese agents in her University and in her company, but as I do for many others, I am happy to give her everything else. Point is, I have studied a few of them, and in each case, vomited.
)

Now, about that freaky thing that you fraudulently call "ontology". It is a pathetic, after-the fact, post-mortem, fragmented (even the best), disjointed, and very limited definition of concepts. Even the concepts of the concepts are fragmented and limited, usually to the level of the one and only fragmented designer. And so bad that no one else can modify or enhance it. Pig poop tastes the same even if honey is taken with it.

Oh, but an "academic" with a PhD wrote an "academic paper" defining anti-logical "ontologies". And oh, all "academics" have to protect and contrive to elevate other "academics" due to their fragility of mind. So hackerdemics all over the contrived world (they exist with total detachment from the real world) now dive into the hysteria and write "ontologies" for their beloved RFS.

Recall that every crime is two-fold. So the only way that this sort of pig poop can be elevated to "academic" matter, that can end up being used as a "definition of concepts", is that ALL the "theoreticians" involved, every single one, are totally ignorant of FOPC. Of the fact that anything and everything in the universe (the real one, not the fragmented subjective "reality" of the insane) can be defined in FOPC Predicates. That everything in a Relational database is a definition in terms of FOPC Predicates. That if their pig poop were defined in Relational terms, their hysterical "ontology" would be redundant; superfluous; idiotic.

Recall that a Relational database does not need a massive additional definition for reporting, such as six-figure BusinessObjects. Because it is already logically defined, and logically reportable. $500 per seat instead of six figures. Any user can write a report instead of only those who took the expensive (again) BO course. But a massive BO Universe is very very necessary for an RFS. It is the same thing with onto-ANTI-logies. Totally irrelevant for RDBs. Massively necessary for RFSs. And unchangeable.

> Description logics allow us to perform inferences
> (even reason about contradictory data), which SQL cannot.

It has nothing to do with SQL, fool. Trying to get SQL to get something out of an anti-relational morass of pig poop is stupid. And you cannot blame SQL for that. SQL is logical, based on FOPC and Codd's RA. Getting anything out of a database that conforms to the RM, which implies conformance to FOPC, is dead easy. I challenged the great god of pig poop himself, Hugh Darwen. Much as he tried, there was not one thing in three whole years that he could come up with.

Go for it, give me anything that "cannot be done in SQL". I will give you this whole year. If I supply SQL to solve the problem, it will prove that you are ignorant of SQL. If I give you the Relational model that solves the problem (because that and not SQL, is the seat of the problem), it will prove that you are ignorant of the RM and how to use it. Real world problems only, the more complex the better. No problems from the realm of fantasy, they need a different form of treatment.

"Description logics" is not logic. There is one, and only one Logic (the mechanism of the intellect is Logic). The label is fraudulent. Again, because it demeans logic, and elevates pig poop to "logic". Again, while Logic (FOPC, Predicates, the RM) are inherent in a Relational database, and thus give us all the inferences that do exist over the data in it, therefore "description logics" is irrelevant; superfluous; imbecilic.

Oh, but an "academic" with a PhD wrote an "academic paper" defining anti-logical "description logics". And oh, all "academics" have to protect and contrive to elevate other "academics" due to their fragility of mind. So hackerdemics all over the contrived world (they exist with total detachment from the real world) now dive into the hysteria and write anti-logical "description logics" for their beloved anti-logical RFS. It is insanity squared. And institutionalised. They have turned universities into asylums.

The growth of such insanity is directly related to the growth of cancer. After one hundred years of incubation, and linear growth, around 1970 the growth progressed to exponential. All universities in the occupied countries are heavily infected. The kids are deemed terminal before they walk in, before they pay their registration fees. Suicide is the new norm, and self murder is now legalised. The gravity of this is denied.

> (even reason about contradictory data),

I vomited twice already, I don't know if I can handle another one on this first day of the year.

A different point about that hysteria. First, nothing contradictory exists in the real world. Second, the Laws of Thought demand that any contradiction that may be /perceived/ be resolved. Therefore there is nothing that is in the data, or in the Relational (logical) database that is contradictory. So the anti-logical "description pig poop logicks" again, do nothing for undamaged humans.

But wait. In the anti-logical RFS, or the anti-logical data model that the anti-logical "academick" "defines", lo and behold, there is contradiction. And wait, wait. He has a "reason" or bunch of spaghetti "reasonings" that "reason" about the anti-reason contradiction. One cannot reason the unreasonable. That is why the Laws of Thought come first, are fundamental, to reason. Anything outside the laws, IS NOT REASON. And REASON cannot be had over matter that is NON-REASON.

Oh, but an "academic" with a PhD wrote an "academic paper" defining anti-logical, anti-reason "reasoning for anti-reason contradictions". And so on and so forth.

To those of us who have not been indoctrinated in pig poop, whose crania are still unscrambled, that is third generation anti-logic. Or anti-logic cubed. We do not allow contradiction in our data models, long before they become databases. We do not need the masses of third generation insanity because we stopped the first generation of madness from taking root.

I will take it that at this point, you have ended your role as Satan's offspring.

----

> > Can there be a Work without a creator (author; writer; conceptualiser;
> > or whatever) ?
>
> No

Then stop !

Realise the concept of defining a Work (creature) without having defined the Creator, which the Work is existentially Dependent upon, is hysterically stupid. You are no longer Moloch's child, I do not have a game that I have to play. They are, as evidenced, drooling imbeciles. You are not. All their inferences and speculations are based on that beginning premise which is therefore their first principle, all of it is based on drool. Pig poop. Academically validated pig poop. Now drag your humanity together from wherever it has been seduced to not-function, grab an espresso, and start thinking. Natural, non-contradictory thinking that conforms to the Four Laws of Thought. Do not join, or rely upon, those who lie swooning in pig poop.

Aristotle teaches us that:
//the least initial deviation from the truth is multiplied later a thousandfold ... a principle is great, rather in power, than in extent; hence that which was small [mistake] at the start turns out a giant [mistake] at the end.//
Paraphrased as, a small mistake at the beginning (eg. principles; definitions) turns out to be a large mistake at the end.

Forget FIAF and their drooling "definition" of the problem, or their myriad problems. Define the problem yourself, as a human being. To me as a human being. If the data exists in the real world, there is no problem at all to define it (FOPC; RM; RDM). In the real world, a Work is existentially Dependent on its Creator (that could be plural, a collaboration, no problem).
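The existential dependence of a Work on its Creator can be declared rather than merely described. The following is a minimal sketch using Python's built-in sqlite3 module. The table names, column names and the key (AgentName, Title) are hypothetical, chosen only to show the mechanism, and the sample values are placeholders.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only on request
conn.executescript("""
    -- Hypothetical names: an Agent may be a person, a corporation or a collective.
    CREATE TABLE Agent (AgentName TEXT NOT NULL PRIMARY KEY);
    CREATE TABLE Work (
        AgentName TEXT NOT NULL REFERENCES Agent (AgentName),
        Title     TEXT NOT NULL,
        PRIMARY KEY (AgentName, Title)
    );
""")

def attempt(sql, params=()):
    """Run one statement and report whether the declared constraints allow it."""
    try:
        with conn:
            conn.execute(sql, params)
        return "accepted"
    except sqlite3.IntegrityError:
        return "rejected"

outcomes = [
    # A Work cannot be registered before its Agent exists.
    attempt("INSERT INTO Work VALUES (?, ?)", ("Some Studio", "Some Film")),
    attempt("INSERT INTO Agent VALUES (?)", ("Some Studio",)),
    attempt("INSERT INTO Work VALUES (?, ?)", ("Some Studio", "Some Film")),
    # An Agent cannot be removed while a Work depends on it.
    attempt("DELETE FROM Agent WHERE AgentName = ?", ("Some Studio",)),
]
print(outcomes)
```

Because the Agent's key forms part of the Work's key in this sketch, a Work without a Creator cannot even be expressed, let alone stored.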

> but it's not unreasonable to imagine that in some cases the creator
> may not be known.

Send the drooling idiot home. Inform him that the database is for registration purposes, that the Works (and derivatives) are assets, legal assets, that represent Value. Money; money; money, in an industry that is all about money; money; money. So a real world movie cataloguing system does have basic legal requirements to ensure they do not get sued, and that people (droolers and others) are not permitted to commit fraud that can be prevented. Failure to do so is legal Negligence. In Australia there is no legal defence for Negligence. They need certain specific info to be documented, before a thing can be registered. Such systems are SIMPLE to use, and the users can make claims or disallow invalid actions on the basis of truth, legally defensible truth from the real world. So tell the person attempting an incomplete registration to fly two kites in opposite directions, and to come back if and when he has one kite that still flies.

There are no nulls in a Relational database. Don't you dare start yet another anti-logical "three-valued logics" argument. I have closed that already on c.d.t.

> Not thinking about feature movies, of course, but
> obscure films or tapes found in the deep of some store-house. Same as
> every other form of art (think paintings by unknown author);

Well, those things get catalogued by the facts that are known. The facts include ownership but not Creator; content but not intent; etc. Possibly heavy detail at the Item level, but the Work is Unknown. If and when the stored attributes match up with other Works; Manifestations; etc, the Item can be better Identified. If not, it remains in the [real world] status in which it was found: a mess of unknown stuff found in an un-catalogued facility.

https://www.dailymail.co.uk/news/article-7498341/Elderly-French-woman-discovers-Renaissance-masterpiece-worth-5million.html

In that case, if there were a real owner who is not the elderly woman who claimed ownership, with the media hype, he will come to know about his piece of artwork, and he can now sue to recover it.

Or if it has been mis-attributed, now it can be correctly attributed. $26M might end up being $20K.

That is the power of a genuine catalogue, as used by an undamaged human. That cannot be obtained from an RFS, plus an anti-logical "ontology" layer, plus an anti-logical "description logics" layer, plus an anti-logical "artificial intelligence" [yet another misrepresentation] layer. Like handing over $5 in real cash in one second vs one year of mining a cryptocurrency.

> one
> difference, though, is that a movie typically has no single creator, but
> rather a group of people with distinct roles (writer, screenwriter,
> director, etc.). Not sure whether you may attribute a movie to a single
> principal creator.

Author: I don't. They call it Agent. It can be a single person, or a corporation, or a collective of persons.

Cast: They have specific roles for each member of the cast. Minus the drug suppliers are the whores. For some reason, they never get credited.

Cheers
The best to you and yours
Derek

Derek Ignatius Asirvadem

Jan 3, 2020, 4:41:52 AM
> On Tuesday, 31 December 2019 22:16:57 UTC+11, Nicola wrote:
> On 2019-12-30, Derek Ignatius Asirvadem <derek.a...@gmail.com> wrote:
>
> What do you gain you ask? (Recall that I'm playing the
> devil's advocate here) I'd say you gain simplicity and flexibility. If
> I define Work(WorkID), then I can link all the properties I need to any
> work, e.g., Title(WorkID, Title, Language, Type).

(Having previously responded to the "simplicity" and "flexibility" promise, re a dis-integrated RFS)

In an RDM, you can "link" all the properties that are actually related to a Work (the Key is not declared at this point, but assuming it is a Relational Key, not a surrogate):
- in the immediate row (Codd's 3NF/"full" FD)
- in the descendent rows (2NF properly understood)
- in the ancestor rows (the attributes), because we have the Key
Which can be accessed using JOIN (NATURAL JOIN on SQL platforms, something else on others). What other form of "linking" does an RFS have, that the RDM does not have ?
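A minimal sketch of the third point (ancestor attributes reachable because the child carries the Key), using Python's built-in sqlite3 module. The four tables follow the Work / Variant / Manifestation / Item levels from the start of the thread, but every column and key choice here is an assumption made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    -- Hypothetical keys: each level's key contains its parent's key.
    CREATE TABLE Work (
        Title            TEXT NOT NULL PRIMARY KEY,
        OriginalLanguage TEXT NOT NULL
    );
    CREATE TABLE Variant (
        Title       TEXT NOT NULL REFERENCES Work (Title),
        VariantName TEXT NOT NULL,
        PRIMARY KEY (Title, VariantName)
    );
    CREATE TABLE Manifestation (
        Title       TEXT NOT NULL,
        VariantName TEXT NOT NULL,
        Format      TEXT NOT NULL,
        PRIMARY KEY (Title, VariantName, Format),
        FOREIGN KEY (Title, VariantName) REFERENCES Variant (Title, VariantName)
    );
    CREATE TABLE Item (
        Title       TEXT    NOT NULL,
        VariantName TEXT    NOT NULL,
        Format      TEXT    NOT NULL,
        CopyNo      INTEGER NOT NULL,
        PRIMARY KEY (Title, VariantName, Format, CopyNo),
        FOREIGN KEY (Title, VariantName, Format)
            REFERENCES Manifestation (Title, VariantName, Format)
    );
    INSERT INTO Work VALUES ('Blade Runner', 'English');
    INSERT INTO Variant VALUES ('Blade Runner', 'Director''s Cut');
    INSERT INTO Manifestation VALUES ('Blade Runner', 'Director''s Cut', 'DVD');
    INSERT INTO Item VALUES ('Blade Runner', 'Director''s Cut', 'DVD', 1);
""")

# The Item row already carries the Work's key, so a Work attribute is one
# join away; the Variant and Manifestation tables are not visited at all.
rows = conn.execute(
    "SELECT CopyNo, OriginalLanguage FROM Item NATURAL JOIN Work"
).fetchall()
print(rows)
```

With a surrogate ID on each table instead, the same question would need a join through every intermediate level.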

> I don't have to decide
> in advance whether there will ever be a movie with two distribution
> titles in the same language: my framework can accommodate that.

An RDB can do that as well.
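As a minimal sketch of that claim, using Python's built-in sqlite3 module: a title table keyed on all of its columns accepts two distribution titles in the same language and still refuses an exact repeat. The table name, the WorkCode column and the sample values are hypothetical; the Language, Type and Title attributes are taken from the quoted proposal.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical table: WorkCode stands in for whatever key a Work has.
    CREATE TABLE WorkTitle (
        WorkCode TEXT NOT NULL,
        Language TEXT NOT NULL,
        Type     TEXT NOT NULL,
        Title    TEXT NOT NULL,
        PRIMARY KEY (WorkCode, Language, Type, Title)
    );
""")

def add_title(work_code, language, title_type, title):
    """Insert one title row and report whether the key allowed it."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO WorkTitle VALUES (?, ?, ?, ?)",
                (work_code, language, title_type, title),
            )
        return "accepted"
    except sqlite3.IntegrityError:
        return "rejected"

results = [
    add_title("W1", "en", "Distribution", "First Title"),
    add_title("W1", "en", "Distribution", "Second Title"),  # same language, allowed
    add_title("W1", "en", "Distribution", "First Title"),   # exact repeat, refused
]
print(results)
```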

Your use of the word /framework/ scares me. Are you squarely in the OO/ORM stream where they think "we can solve everything in the muddleware" ? In total denial that it is a miserable failure, with monthly or quarterly "refactorings" ? Gee, now what effect does that have downstream, on "definition" for the reporting system; the anti-being "ontology"; the anti-logical "description logics".

> [mess, improvable] ...
> [contradictory data] ...
> [recorded & lost] ...
> ["linked data models"] ...
> "ontology", which will put every piece of data within a well defined
> hierarchy.

That indicates that you do know, at least intuitively, that a well defined hierarchy is required, as a fundamental part of the data definition. In the RDM as an intrinsic property, or in the RFS + "multiverse" + "ontology".

Given that the four levels defined in FIAF etc are not really a hierarchy (except if it were in a full RDM with Relational Keys in those four tables), how exactly would you define that "well defined hierarchy" ?

> > We use the "Full hierarchy model (3 levels)" page 8 ?
>
> It's ok with me.

Actually, it is turning out to be a pain, and the pain is caused by attempting three levels. I will give you the four levels.

> On Thursday, 2 January 2020 00:36:27 UTC+11, Derek Ignatius Asirvadem wrote:
> > On Tuesday, 31 December 2019 22:16:57 UTC+11, Nicola wrote:
>
> > Those are not papers, but (AFAICS) explanations
>
> Sorry. They ARE the titles of philosophical papers. Which I have read, and dealt with long ago. Hence my rant re philosophy.

https://www.academia.edu/484553/No_Entity_Without_Identity

Which is, of course a response to Quine's paper.

Neither is worth reading. The objective truth in the real world is, if it exists, it has an Identity. First Law, expanded. Hamilton. Boole. If there is no species identity, then the genus (taxonomy). If an instance must be recorded, the worst (ie. no proper identity defined by the user who owns the data) is a list of differentiators.

I do have a method for that, intellect plus three tables. But I am a bit reluctant to put that in, because in the past academics simply do not get it. Done properly, ie. to prevent duplicate Keywords, it requires a Function that is called recursively. All standard SQL in the real world. Now I am quite sure that you can code a recursive Function, but since you are using a pig poop NONsql that you think is "SQL", you may not be aware of how to use such. So, a qualifying question, using the simplest DDL, so that the issue is exposed and not detracted from: in this RDM (which is 80% complete, so please do not argue that it is not perfectly Relational, we know it is not)

http://www.softwaregems.com.au/Documents/Documentary%20Examples/Order%20DM%20Advanced.pdf
Keyword
Part
PartDescription
Object_V (View with computed columns)
Do you understand that Object_V.Description is the concatenated list of Keywords, that describes a PartCode ? That is derived by calling a recursive Function, eg. Part_Description_fn( PartCode ).

Not limited to four or ten "levels" as is normally used by NONsql boffins in a hard-coded ( via four or ten JOINs ) attempt to grab four or ten "levels" (unlimited Keywords here).

Null in a result set is normal. Occurs if the ObjectType is Asset, not Part.
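The linked data model is not reproduced in this thread, so the following is only a guess at the shape of the idea, sketched with Python's built-in sqlite3 module. It assumes that PartDescription holds an ordered list of Keywords per PartCode (the SeqNo column and all sample data are hypothetical) and uses a recursive Python function as a stand-in for Part_Description_fn.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    -- Assumed structure; the real tables may well differ.
    CREATE TABLE Keyword (Keyword TEXT NOT NULL PRIMARY KEY);
    CREATE TABLE PartDescription (
        PartCode TEXT    NOT NULL,
        SeqNo    INTEGER NOT NULL,
        Keyword  TEXT    NOT NULL REFERENCES Keyword (Keyword),
        PRIMARY KEY (PartCode, SeqNo),
        UNIQUE (PartCode, Keyword)  -- no duplicate Keywords within one Part
    );
    INSERT INTO Keyword VALUES ('Bolt'), ('Hex'), ('Steel');
    INSERT INTO PartDescription VALUES
        ('P100', 1, 'Bolt'), ('P100', 2, 'Hex'), ('P100', 3, 'Steel');
""")

def part_description_fn(part_code, seq_no=1):
    """Concatenate the Keywords of a Part by recursing on the next SeqNo.

    Returns None when there is nothing (more) to describe, which is the
    analogue of the Null shown for an object that is not a Part.
    """
    row = conn.execute(
        "SELECT Keyword FROM PartDescription WHERE PartCode = ? AND SeqNo = ?",
        (part_code, seq_no),
    ).fetchone()
    if row is None:
        return None
    rest = part_description_fn(part_code, seq_no + 1)
    return row[0] if rest is None else f"{row[0]} {rest}"

print(part_description_fn("P100"))
print(part_description_fn("A200"))  # no Keywords recorded for this code
```

The recursion handles any number of Keywords, where a fixed chain of JOINs would stop at however many were hard-coded.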

Cheers
Derek

Nicola

Jan 4, 2020, 9:18:32 AM
On 2020-01-01, Derek Ignatius Asirvadem <derek.a...@gmail.com> wrote:
>> On Tuesday, 31 December 2019 22:16:57 UTC+11, Nicola wrote:
>> On 2019-12-30, Derek Ignatius Asirvadem <derek.a...@gmail.com> wrote:
>
> First off. Happy New Year, all the best to you and yours.

Happy New Year to you, too.

> We are suffering 42º, high winds, and raging bushfires in every state.

Seen from the other part of the world, the situation in Australia looks
really bad. As someone coming from a country that does not have a good record
of respecting its land and its geology, resulting in landslides (the
worst disaster being https://en.wikipedia.org/wiki/Vajont_Dam), floods
and fires (not at the scale we are witnessing in Australia) often caused
by uncontrolled growth of the underwood, I have to agree that insane but
opportunistic political choices have a lot to be blamed for.

>> Those are not papers, but (AFAICS) explanations about the reasoning
>> that went behind the standards that were eventually published.

> Something is not a standard just because it is published.

My assertion did not imply a judgment. They are just called like that:
"metadata standards".

> At best, this is a set of considerations that they request we make,
> when implementing a computer system that stores the subject matter.

Yes.

>> [various comments accepting primitive, sub-standard, broken methods as a "database"]

I essentially agree with your extensive reply.

> Look, by the Grace of God, in the 70's and 80's, I was a Lead s/w
> Engineer for Cincom, supplier of the then 5th largest (of five!) DBMS,
> TOTAL.

https://books.google.lt/books?id=Wz-oh7ZQo8MC&pg=PA45&lpg=PA45&focus=viewport&hl=lt

Always nice to get a glimpse into less known parts of database history.

> With no knowledge or regard of what Date; Darwen; Fagin;

I understand why you want to put those names together, but I disagree
that Darwen and Date's contributions (or lack thereof) are at the same
level as Fagin's (no need to reply to this; let's keep our respective
opinions).

> My projects include implementation of a great library system,
> competing with GEAC at the time. GEAC was a great system in the 80's.

As above, nice to get to know about old computer systems.

> Academics love whatever they come up with, they hang on to the
> promise, in pathological denial of reality.

I beg to disagree. Don't generalize. Academic work is not different from
any other human endeavor: many people do it, some do it well, only a few
are exceptionally good at it. But let's not get sidetracked: your
opinion about academia is widely known in this group.

> Stonebraker invented the filth known as Ingres. It defecated on the
> users if more than five used the system. But it was highly praised by
> academia.

This is partly due to the fact that it was the only RDBMS coming from
academia at the time. But academia does recognize the fundamental merits
of other systems, such as System R and (to a lesser extent) IS/1 and PRTV.

> Its bastard child, its resurrected ghost, is Postgres*NON*sql.

The natural child being Sybase, I guess, given that it was made by
members of the Ingres team :)

> It does not even remotely comply with SQL,
> […]
> ( That is just about PusGrossNONsql vs SQL, that does not cover the
> other issues, such as No Server Architecture;

Kind of agree (if, as you say, "a list of Unix processes clamouring for
resources is not an architecture"). It still shines, IMO, compared to
Oracle.

> No ACID Transactions;

Do you refer to MVCC and stored procedures? Happy to discuss that in
a different thread.

> No possibility of concurrency. If you do not know what that is, have
> a look at this introductory level doc. For every instance of
> "Oracle", substitute "PissGrossNONsql".
> https://www.softwaregems.com.au/Documents/Article/Oracle%20Circus/Oracle%20vs%20Sybase.pdf

No objection here. Although PostgreSQL has seen several improvements
recently. Again, happy to discuss different implementations elsewhere,
but let's stick to the topic here.

> Therefore, if you would like to engage because you would like to know
> what actual Authority; Standards; Codd (RM & Twelve Rules only);
> genuine SQL; high-end customers have been doing for four decades,
> while academia has been hyper-ventilating about, and engorging
> themselves with, each others backsides, we have a chance of having
> a meaningful discussion. Specifically, you will gain something.

Yes, that's the prime and ultimate goal of this discussion!

>> I'd say you gain simplicity and flexibility.

> Oh come on. Why don't you observe that if you (personally, because
> you are a scientist) apply a little bit of science, all that can be
> replaced with an Entity-Attribute-Value set of files. Much cheaper.

I agree.

> No. That is not simplicity. Stop lying to yourself. That is
> ignoring the science that produces simplicity, and shifting the
> responsibility back to the user.

From here on, I essentially agree with your extensive reply (except for
some unimportant details).

> There are THREE types of genuine Hierarchies, which are inherent (yes,
> inherent) in the Relational Model. In the fifty years since Codd, not
> a single "theoretician" has defined or articulated them. Proved it
> is, that that is not known to you, by the fact that you are pushing
> linked lists of physical IDs as "hierarchies".

I see two types of hierarchies: at the instance level (e.g., the
classical bill of materials) and at the schema level (via dependent
entities, including categorizations). What is the third?

>> one difference, though, is that a movie typically has no single
>> creator, but rather a group of people with distinct roles (writer,
>> screenwriter, director, etc.). Not sure whether you may attribute
>> a movie to a single principal creator.
>
> Author: I don't. They call it Agent. It can be a single person, or
> a corporation, or a collective of persons.

Yes, that may be a good way to model that.

> Cast: They have specific roles for each member of the cast.

Sure. More accurately, each member of the cast may have more than one
role (e.g., an actor that is also the movie director).
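A minimal sketch of that last point, assuming a hypothetical MovieCredit table in which the role is part of the key, so that one Agent can hold several roles on the same movie. The table and column names are illustrative only (they do not come from the FIAF documents or from either poster's model), the movie title merely stands in for whatever key ends up identifying a movie, and SQLite driven from Python is used only so that the example runs as written.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE MovieCredit (
    MovieTitle TEXT NOT NULL,  -- placeholder for the real movie key
    AgentName  TEXT NOT NULL,  -- a person, a corporation, or a collective
    Role       TEXT NOT NULL,  -- e.g. 'Director', 'Actor'
    PRIMARY KEY (MovieTitle, AgentName, Role)
);
""")

def add_credit(movie, agent, role):
    # The key rejects a repeated (movie, agent, role) triple,
    # but allows the same agent to appear with different roles.
    with conn:
        conn.execute("INSERT INTO MovieCredit VALUES (?, ?, ?)",
                     (movie, agent, role))

def roles_of(movie, agent):
    rows = conn.execute(
        "SELECT Role FROM MovieCredit"
        " WHERE MovieTitle = ? AND AgentName = ? ORDER BY Role",
        (movie, agent))
    return [r[0] for r in rows]

add_credit("Citizen Kane", "Orson Welles", "Director")
add_credit("Citizen Kane", "Orson Welles", "Actor")
add_credit("Citizen Kane", "Joseph Cotten", "Actor")
print(roles_of("Citizen Kane", "Orson Welles"))
```

Because Role is part of the key, "actor who is also the director" needs no special case: it is simply two rows.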

Nicola

Nicola

Jan 4, 2020, 11:55:32 AM1/4/20
to
On 2020-01-03, Derek Ignatius Asirvadem <derek.a...@gmail.com> wrote:
>> On Tuesday, 31 December 2019 22:16:57 UTC+11, Nicola wrote:
>> On 2019-12-30, Derek Ignatius Asirvadem <derek.a...@gmail.com> wrote:
>>
>> What do you gain you ask? (Recall that I'm playing the
>> devil's advocate here) I'd say you gain simplicity and flexibility. If
>> I define Work(WorkID), then I can link all the properties I need to any
>> work, e.g., Title(WorkID, Title, Language, Type).
>
> (Having previously responded to the "simplicity" and "flexibility"
> promise, re a dis-integrated RFS)
>
> In an RDM, you can "link" all the properties that are actually related
> to a Work (the Key is not declared at this point, but assuming it is
> a Relational Key, not a surrogate):
> - in the immediate row (Codd's 3NF/"full" FD)
> - in the descendent rows (2NF properly understood)
> - in the ancestor rows (the attributes), because we have the Key
> Which can be accessed using JOIN (NATURAL JOIN on SQL platforms,
> something else on others). What other form of "linking" does an RFS
> have, that the RDM does not have ?

I'd say that one should ask the opposite question, as one of the
strengths of the RM is that it does not rely on "pointers" (which can
point only to one thing), but since connections are established on
value comparisons it's much more flexible. Record id's in RFSs can only
be used as pointers.

>> I don't have to decide
>> in advance whether there will ever be a movie with two distribution
>> titles in the same language: my framework can accommodate that.
>
> An RDB can do that as well.

Sure, if you account for that in advance. If you have defined
Movie(...,Title,...) then you have to redesign your database schema.
The advantage of not having a data model is that you don't have to
change one when the requirements change! (I have to write this down).

> Your use of the word /framework/ scares me.

Well, how do you solve problems in the real world in a timely and
cost-effective fashion, if not using this or that framework?

> Given that the four levels defined in FIAF etc are not really
> a hierarchy (except if it were in a full RDM with Relational Keys in
> those four tables), how exactly would you define that "well defined
> hierarchy" ?

Using some Descriptive Logic.

>> > We use the "Full hierarchy model (3 levels)" page 8 ?
>>
>> It's ok with me.
>
> Actually, it is turning out to be a pain, and the pain is caused by
> attempting three levels. I will give you the four levels.

Ok.

> The objective truth in the real world is, if it exists, it has an
> Identity. First Law, expanded. Hamilton. Boole. If there is no
> species identity, then the genus (taxonomy). If an instance must be
> recorded, the worst (ie. no proper identity defined by the user who
> owns the data) is a list of differentiators.
>
> I do have a method for that, intellect plus three tables. But I am
> a bit reluctant to put that in, because in the past academics simply
> do not get it. Done properly, ie. to prevent duplicate Keywords, it
> requires a Function that is called recursively. All standard SQL in
> the real world.

That intrigues me.

> Now I am quite sure that you can code a recursive
> Function, but since you are using a pig poop NONsql that you think is
> "SQL", you may not be aware of how to use such. So, a qualifying
> question, using the simplest DDL, so that the issue is exposed and not
> detracted from: in this RDM (which is 80% complete, so please do not
> argue that it is not perfectly Relational, we know it is not)
>
> http://www.softwaregems.com.au/Documents/Documentary%20Examples/Order%20DM%20Advanced.pdf
> Keyword
> Part
> PartDescription
> Object_V (View with computed columns)
> Do you understand that Object_V.Description is the concatenated list
> of Keywords, that describes a PartCode ? That is derived by calling
> a recursive Function, eg. Part_Description_fn( PartCode ).

Just a request for clarification: such function should concatenate the
keywords associated with the provided PartCode, or those keywords *plus*
the keywords associated to any (sub)component of the provided PartCode?

> Null in a result set is normal. Occurs if the ObjectType is Asset, not Part.

Ok.

Nicola

Derek Ignatius Asirvadem

Jan 5, 2020, 4:34:52 AM1/5/20
to
> On Sunday, 5 January 2020 03:55:32 UTC+11, Nicola wrote:
> On 2020-01-03, Derek Ignatius Asirvadem <derek.a...@gmail.com> wrote:
> >> On Tuesday, 31 December 2019 22:16:57 UTC+11, Nicola wrote:
> >> On 2019-12-30, Derek Ignatius Asirvadem <derek.a...@gmail.com> wrote:
> >>
> >> If
> >> I define Work(WorkID), then I can link all the properties I need to any
> >> work, e.g., Title(WorkID, Title, Language, Type).
> >
> > (Having previously responded to the "simplicity" and "flexibility"
> > promise, re a dis-integrated RFS)
> >
> > What other form of "linking" does an RFS
> > have, that the RDM does not have ?
>
> I'd say that one should ask the opposite question, as one of the
> strengths of the RM is that it does not rely on "pointers" (which can
> point only to one thing), but since connections are established on
> value comparisons it's much more flexible. Record id's in RFSs can only
> be used as pointers.

Ok, so that point is conceded. RFS has no other form of "linking".

I am not attempting to argue with your words, I am attempting to change your perception of the RM. The great difference between RFS and RM is that where the RFS was physical only (and therefore the pointers are physical), the RM is entirely logical. Logical in all respects. If you are climbing out of the RFS mindset, to the RM, one rung at a time, that will be small increments, and very slow. I am asking you to look at the RM from the top down, look for the Logical. In everything.
- Logical rows (not physical records),
- Logical Keys (not IDs, plus UNIQUE when errors are found and patched),
- Logical Relations (not the mathematical term, but the full meaning in English)
- Logical Sets (way, way, beyond what the "literature" states)
- etc

> >> I don't have to decide
> >> in advance whether there will ever be a movie with two distribution
> >> titles in the same language: my framework can accommodate that.
> >
> > An RDB can do that as well.
>
> Sure, if you account for that in advance. If you have defined
> Movie(...,Title,...) then you have to redesign your database schema.
> The advantage of not having a data model is that you don't have to
> change one when the requirements change! (I have to write this down).

Sure. The advantage of not attaching yourself to a belief is, you don't have to change it when your belief changes. The advantage of not having a wife is, you don't have to get rid of her when a slut comes along.

The point you are missing is huge. We do not create a model of what we want, because that is limited to what we understand based on the app that we are trying to build. That is one of the biggest and common mistakes that the OO/ORM crowd make. It leaves them with no understanding of the data, as data. The notion of "universe of discourse" is pathetic. Good only for classroom exercises, if at all.

What we need to do is to observe the universe (the one, objective reality), and model the subset of that that we need. Perceiving the data only, and nothing but the data.

Even the notion of attempting to model data using UML (separate to it being an anti-standard) is pathetic because the person already has, and does not have to release, his OO/ORM mindset.

If you understand my UTOOS Response, you will understand that we should use the right tool and mindset for the task: UML (if ignorant of real Process Modelling Standards) for the app; IDEF1X for the data (thus far it is not a database, much later, it will be).

There are huge advantages in modelling the data, as data, and nothing but data, as extracted from the perception of the real world.
- That, and only that, makes your model immune to change. (Here, of course I mean structural change, not merely adding an attribute, which has no effect on the existing code)

So
- if an Address has a Lot No (legal tract of rural land in Anglo countries), do not squeeze that into Number-in-Street, keep it as a separate attribute.
- if a movie has two distribution titles in the same language, model that

It is not "in advance". It is modelling reality vs modelling just what you need at the moment (anti-modelling).
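To make the distribution-title case concrete, here is a rough sketch under stated assumptions: a hypothetical WorkTitle table whose key includes the title itself, so that two distribution titles in the same language are simply two rows. WorkCode, TitleType and the placeholder titles are invented for illustration, and SQLite via Python is used only so the example can be run; none of it is taken from the data model under discussion.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE WorkTitle (
    WorkCode  TEXT NOT NULL,  -- hypothetical identifier of the Work
    Language  TEXT NOT NULL,
    TitleType TEXT NOT NULL,  -- e.g. 'original', 'distribution'
    Title     TEXT NOT NULL,
    PRIMARY KEY (WorkCode, Language, TitleType, Title)
);
""")

def add_title(work, language, title_type, title):
    # Several titles per (work, language, type) are allowed;
    # only an exact repeat of the whole key is rejected.
    with conn:
        conn.execute("INSERT INTO WorkTitle VALUES (?, ?, ?, ?)",
                     (work, language, title_type, title))

def titles(work, language, title_type):
    rows = conn.execute(
        "SELECT Title FROM WorkTitle"
        " WHERE WorkCode = ? AND Language = ? AND TitleType = ?"
        " ORDER BY Title",
        (work, language, title_type))
    return [r[0] for r in rows]

add_title("W1", "en", "distribution", "Placeholder Title A")
add_title("W1", "en", "distribution", "Placeholder Title B")
print(titles("W1", "en", "distribution"))
```

Had Title been a single column of the Work row, the second title would have forced a structural change; as a separate fact it is only another row.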

> > Your use of the word /framework/ scares me.
>
> Well, how do you solve problems in the real world in a timely and
> cost-effective fashion, if not using this or that framework?

That was meant strictly in the OO/ORM "framework" sense. As in, you have a framework for this, and a framework for that, all of which are big fat layers of code, in the muddleware, all of which fail Architecture (even though they use that word all the time).

http://geek-and-poke.com/geekandpoke/2013/7/13/foodprints

I solve problems using Architecture. For the purpose here, Architecture is defined as:
1. the overall design of platforms and services for the whole app, for the entire lifespan
2. the proper encapsulation and deployment of each code segment in [1]

Eg. All definition of the data must be in the database, not outside. That means all constraints (there is no constraint that cannot be defined, but that is a separate argument, if you pose it). You do understand that the database is a single Unit of Recovery, a self-contained and self-defining unit. Codd's Twelve Rules, if not Architecture rules, yes ?

> > Given that the four levels defined in FIAF etc are not really
> > a hierarchy (except if it were in a full RDM with Relational Keys in
> > those four tables), how exactly would you define that "well defined
> > hierarchy" ?
>
> Using some Descriptive Logic.

Yes, I know. Could you give us an idea of what that is ? Eg. If you asked me that question, I would give you the Predicates, which can be /read/ directly from the data model, but I would give it in text form for those who can't. I have worked with two anti-logical "description logics", and they are fetid messes but the one person who uses it thinks it is marvellous (she is the only one who understands what the person who coded it did). No one else can use it.

What ? What does a component or sub-component have to do with it ?

(The app should not (does not need to) access tables, that would be very silly. It should access Views.)

CREATE VIEW Object_V
AS
    SELECT ObjectCode,
        ...,
        Description = CASE ObjectTypeCode
            WHEN "P" THEN dbo.Part_Description_fn( PartCode )
            ELSE "[NoDesc]"
            END
        ...
    FROM ObjectType
        JOIN Object ...
        JOIN Part ...
        JOIN Asset ...
        ...
    WHERE
        ...

Where do components come into it ?

> or those keywords *plus*
> the keywords associated to any (sub)component of the provided PartCode?

(Still trying to figure out what you could have meant.)

Do you understand, each Key is an Atom ? Modular design (Architecture) of code segments means each Constraint; Function; etc, is associated with the one Atom, the right one. Just as we need only "full" FDs, because we have only a Relational Key, an Atom.

Ok, I think I know what you mean. There is another Function. If you understand Codd's RM § 1.4, the Relational Normal Form, the pre-requisite is "the Unnormalised Set", which is the Hierarchic Normal Form. That means Trees, strict Trees, Directed Acyclic Graphs, no circular references. So yes, there is a CONSTRAINT to prevent circular references.

> > Done properly, ie. to prevent duplicate Keywords, it
> > requires a Function that is called recursively.

And to prevent circular references. In the normal case (no hierarchies /in the data/), circular references are already prevented, simply by complying with the Relational Rules. So we only need to actively prevent circular references whenever we create the possibility: hierarchies /in the data/.

(Duplicate Keywords are already prevented in that 80% complete model. The 100% complete model gives PartDescription a fuller treatment.)

This will probably be the subject of questions, so let me take it from the top, if you do not mind. This is the second of the three types of Hierarchies in the RM. The example is the Unix Node (file & directory hierarchy), the data is:
http://www.softwaregems.com.au/Documents/Tutorial/Recursion/Hierarchy%20Inline.pdf

The data model is:
http://www.softwaregems.com.au/Documents/Tutorial/Recursion/Directory%20DM%20Inline.pdf

We need to produce the Path, as a column. Perfect candidate for a Function, that is recursive.
Node_Path_fn (
NodeNo, -- Starting point
ReturnNodeNo -- Boolean: if set, return CSV NodeNos,
else return "/" separated list of Node.Names
)
Returns Path ( CHAR(255) not TEXT) -- list of NodeNos/Node.Names
The same Function is used in the CONSTRAINT to CHECK that the NodeNo attempted on INSERT is not in the list of ancestors (circular reference).
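For readers who want to try the Path idea, the following is only a sketch and not the Function described above: it swaps the recursive Function for a recursive query (WITH RECURSIVE) run in SQLite from Python. The Node table, its columns, the sample rows, and the convention that the root is its own parent with an empty name are all assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Node (
    NodeNo       INTEGER PRIMARY KEY,
    ParentNodeNo INTEGER NOT NULL REFERENCES Node (NodeNo),
    Name         TEXT    NOT NULL
);
-- Assumption: the root is its own parent and has an empty name.
INSERT INTO Node VALUES
    (1, 1, ''), (2, 1, 'usr'), (3, 2, 'local'), (4, 3, 'bin');
""")

# Walk from the starting node up to the root, one parent per step.
PATH_SQL = """
WITH RECURSIVE Ancestry (NodeNo, ParentNodeNo, Name, Depth) AS (
    SELECT NodeNo, ParentNodeNo, Name, 0
      FROM Node
     WHERE NodeNo = :start
    UNION ALL
    SELECT p.NodeNo, p.ParentNodeNo, p.Name, a.Depth + 1
      FROM Node p
      JOIN Ancestry a ON p.NodeNo = a.ParentNodeNo
     WHERE a.NodeNo <> a.ParentNodeNo      -- stop once the root is reached
)
SELECT Name FROM Ancestry ORDER BY Depth DESC   -- root first
"""

def node_path(node_no):
    names = [r[0] for r in conn.execute(PATH_SQL, {"start": node_no})]
    if not names:
        return None                 # no such node
    # The root contributes an empty name, which yields the leading "/".
    return "/".join(names) or "/"

print(node_path(4))
```

The walk assumes the stored tree is acyclic; with a circular reference in the data the recursion would not terminate, which is exactly why the constraint matters.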

If you want the full write-up, the second & third types of Hierarchies in the RM, see this doc. Even more constraints, in the right place. This is a single doc that I use to refer to, from many Answers in various fora. (Please forgive the laboured detail, it is required to get today's Computer Science grads away from the pig poop that the "textbooks" and "professors" implant in them.)
http://www.softwaregems.com.au/Documents/Student%20Resolutions/Bill%20of%20Materials.pdf

Back to this question. Yes, we need a constraint to prevent circular references. Yes it calls a Function, which is recursive. That Function is one and the same for producing a list as a column. (eg. Path above). It calls a Function, again linear:
Part_Assembly_Get_fn( PartCode ) - list of ancestors for the PartCode

Simple and straight-forward in SQL.

If PusGrossNONsql has no recursion, note that technically, it is the platform (language processor), and not SQL (language) per se, that provides recursion. Again, why I directed you to the Oracle Circus/Oracle vs Sybase doc. We have had it, in commercial SQL platforms, since 1984. IBM customers have had it somewhat earlier, in the 70's, with System/R; SEQUEL; and proprietary SQL.

> Just a request for clarification: such function should concatenate the
> keywords associated with the provided PartCode, or [...]
> the keywords associated to any (sub)component of the provided PartCode?

The former.

In SQL, the latter (either the Assembly tree, or the Component tree) is serviced by a Stored Proc, not a Function. Last time I looked at the bastard grandson of Stonebraker, it did not have Stored Procs. And so "functions" (pathetic implementation aside) were overloaded with all sorts of tiny bits of functionality that is normal in a stored proc. Which, as I warned, is anti-SQL. Based on its purpose, there are certain things that a Function should, and should not, do, and certain things that a Stored Proc should, and should not, do. In PooGross it is all mixed up and a fraction of the mix-up is delivered in a "function". Point being, I understand why you expect to deliver that in a "function", but that is abnormal for a Function, normal for a Stored Proc.

The purpose of an SQL Function is to RETURN a scalar. That means a single domain (column) in a result set. Imagine a column in a two-dimensional table, the value changes for each row, depending on the PK. It might be a substitute for a Subquery. The point is, the Function is single-row.

The purpose of an SQL Stored Proc is to contain a code segment, either to execute an ACID Transaction (Return a Status) or to produce a result set (Return a Status), ie. multiple rows, not a scalar.
- In poor systems or RFS, Stored Procs are heavily misused. They might use a cursor, which is Procedural Processing, utterly foreign to a Relational database. Temporary tables abound, they need "intermediate storage".
- In Relational systems, such as the example model, we use Set Processing only. Temporary tables are simply not required.

In normal SQL, the Assembly or Component tree would be produced by a recursive Stored Proc that contains a single SELECT. With full indentation to show the levels in the tree, or a LevelNo which is derived, etc. Executes itself until it reaches the leaf level across all branches.
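As an illustration of the kind of result meant here (a component tree with a derived LevelNo and indentation), the sketch below uses a recursive query in SQLite from Python rather than a recursive Stored Proc, so it shows the output shape and not the technique described above. The Assembly table, its two columns and the sample parts are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Assembly (
    AssemblyCode  TEXT NOT NULL,   -- the containing part
    ComponentCode TEXT NOT NULL,   -- a part it contains
    PRIMARY KEY (AssemblyCode, ComponentCode)
);
INSERT INTO Assembly VALUES
    ('bike', 'frame'), ('bike', 'wheel'),
    ('wheel', 'rim'),  ('wheel', 'spoke');
""")

# LevelNo is derived during the walk; SortPath keeps each component
# directly under its assembly in the final ordering.
TREE_SQL = """
WITH RECURSIVE Tree (PartCode, LevelNo, SortPath) AS (
    SELECT :root, 0, :root
    UNION ALL
    SELECT a.ComponentCode,
           t.LevelNo + 1,
           t.SortPath || '/' || a.ComponentCode
      FROM Assembly a
      JOIN Tree t ON a.AssemblyCode = t.PartCode
)
SELECT PartCode, LevelNo FROM Tree ORDER BY SortPath
"""

def assembly_tree(root):
    return [(part, level)
            for part, level in conn.execute(TREE_SQL, {"root": root})]

def render(root):
    # Two spaces of indentation per level.
    return "\n".join("  " * level + part
                     for part, level in assembly_tree(root))

print(render("bike"))
```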

In MS/SQL and some others, a recent "feature" is that a "table valued function" can RETURN a TABLE. But that is a fudge made from a dog's breakfast. A pus-poor substitute for an SQL Stored Proc. In the PusGross case, it is the wrong way to deliver what a Stored Proc normally does.

(MS has all sorts of weirdness, they probably did it only because PooGross had it, and the Date; Darwen; Fagin Gulag was screaming for it. Think about this, anyone who has been on MS/SQL prior to the advent of "table valued function" would never use it.)

Therefore, if you would like to maintain an SQL mindset, get an SQL platform (cheap for universities). You can end the pain, and horrendous consequences of, thinking that NONsql is SQL ("SQL can't do this or that"). No joke, it has to be done at some point. The need for understanding the logical chain, understanding FOPC->RA->SQL, cannot be overemphasised.

Second option: just code a single pure SELECT, in a WHILE (forever) loop that traverses the tree. And after that, translate that into the NONsql that is required for your particular suite of freeware.
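A loose sketch of this second option, with the loop placed on the client side in Python instead of in server-side SQL, which is an assumption of the example and not what is proposed above. One plain SELECT is issued per level until a level comes back empty. The Assembly table and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Assembly (
    AssemblyCode  TEXT NOT NULL,
    ComponentCode TEXT NOT NULL,
    PRIMARY KEY (AssemblyCode, ComponentCode)
);
INSERT INTO Assembly VALUES
    ('bike', 'frame'), ('bike', 'wheel'),
    ('wheel', 'rim'),  ('wheel', 'spoke');
""")

def components_by_level(root):
    """Return (PartCode, LevelNo) pairs, level by level (breadth-first)."""
    result = [(root, 0)]
    frontier = [root]
    level = 0
    while frontier:                      # ends when a level has no components
        marks = ",".join("?" * len(frontier))
        rows = conn.execute(
            "SELECT ComponentCode FROM Assembly"
            f" WHERE AssemblyCode IN ({marks})"
            " ORDER BY ComponentCode",
            frontier).fetchall()
        level += 1
        frontier = [r[0] for r in rows]
        result.extend((code, level) for code in frontier)
    return result

print(components_by_level("bike"))
```

As with any traversal of this kind, the loop only terminates if the stored hierarchy has no circular references.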

Cheers
Derek

Lifepillar

Jan 5, 2020, 12:33:04 PM1/5/20
to
> where the RFS was physical only (and therefore the
> pointers are physical), the RM is entirely logical.
> […]
> We do not create a model of what
> we want, because that is limited to what we understand based on the
> app that we are trying to build. That is one of the biggest and
> common mistakes that the OO/ORM crowd make. It leaves them with no
> understanding of the data, as data.
> […]
> There are huge advantages in modelling the data, as data, and nothing
> but data, as extracted from the perception of the real world. - That,
> and only that, makes your model immune to [structural] change.
> […]
> Eg. All definition of the data must be in the database, not outside.
> That means all constraints

Not sure you've appreciated the irony in my responses. But you make good
points, with which I agree entirely. I'd add that too often too much
emphasis is put on making the life of developers easier, regardless
of the consequences for the users, and this trend, unfortunately,
encompasses the whole computer industry.

>> > So, a qualifying question, using the simplest DDL, so that the
>> > issue is exposed and not detracted from: in this RDM (which is 80%
>> > complete, so please do not argue that it is not perfectly
>> > Relational, we know it is not)
>> >
>> > http://www.softwaregems.com.au/Documents/Documentary%20Examples/Order%20DM%20Advanced.pdf
>> > Keyword
>> > Part
>> > PartDescription
>> > Object_V (View with computed columns)
>> > Do you understand that Object_V.Description is the concatenated
>> > list of Keywords, that describes a PartCode ? That is derived by
>> > calling a recursive Function, eg. Part_Description_fn( PartCode ).
>>
>> Just a request for clarification: such function should concatenate
>> the keywords associated with the provided PartCode, or those keywords
>> *plus* the keywords associated to any (sub)component of the provided
>> PartCode?
>
> What ? What does a component or sub-component have to do with it ?

Your model has an Assembly entity, so I was wondering whether the
function would traverse the part-component hierarchy (which would
require recursion) to retrieve the keywords associated to the components
of a given part, too.

To get the keywords directly associated with a given PartCode X, I would
not use recursion:

select PartCode, string_agg(Keyword, ', ' order by Keyword)
from PartDescription
where PartCode = X
group by PartCode;

> (The app should not (does not need to) access tables, that would be
> very silly. It should access Views.)

Yes.

> This will probably be the subject of questions, so let me take it from
> the top, if you do not mind. This is the second of the three types of
> Hierarchies in the RM. The example is the Unix Node (file & directory
> hierarchy), the data is:
> http://www.softwaregems.com.au/Documents/Tutorial/Recursion/Hierarchy%20Inline.pdf
>
> The data model is:
> http://www.softwaregems.com.au/Documents/Tutorial/Recursion/Directory%20DM%20Inline.pdf
>
> We need to produce the Path, as a column. Perfect candidate for
> a Function, that is recursive.
> Node_Path_fn (
> NodeNo, -- Starting point
> ReturnNodeNo -- Boolean: if set, return CSV NodeNos,
> else return "/" separated list of Node.Names
> )
> Returns Path ( CHAR(255) not TEXT) -- list of NodeNos/Node.Names

Yes, for that I'd use a recursive function, which in PostgreSQL I'd code
as follows, modulo some omitted details:

create function Node_Path_fn(_NodeNo integer)
returns char(255)
language sql as
$$
select case
when ParentNodeNo = _NodeNo then '/'
else Node_Path_fn(ParentNodeNo) || Name || '/'
end
from Node
where NodeNo = _NodeNo;
$$;

> The same Function is used in the CONSTRAINT to CHECK that the NodeNo
> attempted on INSERT is not in the list of ancestors (circular
> reference).

Mmh, no. A newly inserted node cannot be in the list of its ancestors.
Such a constraint would be useful to avoid circularity on UPDATEs.
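A hedged sketch of what such a check on UPDATE could look like, written as an application-side guard in Python over SQLite and not as a declarative CONSTRAINT (so it illustrates the rule, not how either platform would enforce it). The Node table, the sample rows and the helper move_node are assumptions for the example; the root is assumed to be its own parent.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Node (
    NodeNo       INTEGER PRIMARY KEY,
    ParentNodeNo INTEGER NOT NULL REFERENCES Node (NodeNo),
    Name         TEXT    NOT NULL
);
INSERT INTO Node VALUES
    (1, 1, ''), (2, 1, 'usr'), (3, 2, 'local'), (4, 3, 'bin');
""")

# The proposed parent together with all of its ancestors.
CHAIN_SQL = """
WITH RECURSIVE Up (NodeNo, ParentNodeNo) AS (
    SELECT NodeNo, ParentNodeNo FROM Node WHERE NodeNo = :start
    UNION ALL
    SELECT n.NodeNo, n.ParentNodeNo
      FROM Node n
      JOIN Up u ON n.NodeNo = u.ParentNodeNo
     WHERE u.NodeNo <> u.ParentNodeNo
)
SELECT NodeNo FROM Up
"""

def move_node(node_no, new_parent_no):
    chain = {r[0] for r in conn.execute(CHAIN_SQL, {"start": new_parent_no})}
    # If the node being moved is the new parent or one of its ancestors,
    # the move would put the node underneath itself.
    if node_no in chain:
        raise ValueError("circular reference")
    with conn:
        conn.execute("UPDATE Node SET ParentNodeNo = ? WHERE NodeNo = ?",
                     (new_parent_no, node_no))

move_node(4, 2)        # fine: 'bin' now sits directly under 'usr'
```

Moving node 2 under node 4 would be rejected, since 2 is an ancestor of 4.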

If you want to prevent multiple roots (i.e., more than one record with
NodeNo = ParentNodeNo), I would define a separate constraint.

> Simple and straight-forward in SQL.

Yes.

> If PusGrossNONsql has no recursion,

It does. Both recursive functions and recursive queries. And yes, we can
use them.

> In SQL, the latter (either the Assembly tree, or the Component tree)
> is serviced by a Stored Proc, not a Function. Last time I looked at
> the bastard grandson of Stonebraker, it did not have Stored Procs.

They were introduced very recently:

https://www.postgresql.org/docs/12/sql-createprocedure.html

You are right that functions are used for both purposes in PostgreSQL
and they can return anything, from nothing to tables.

> In normal SQL, the Assembly or Component tree would be produced by
> a recursive Stored Proc that contains a single SELECT. With full
> indentation to show the levels in the tree, or a LevelNo which is
> derived, etc. Executes itself until it reaches the leaf level across
> all branches.

You'd achieve that in PostgreSQL with a function returning a table.

> Second option: just code a single pure SELECT, in a WHILE (forever)
> loop that traverses the tree.

I prefer to avoid this kind of coding, except when it is needed because
it provably improves performance.

You have asked to keep the content of these posts to the subject, but
this "qualifying test" is a digression. Please move it to a different
thread if you want to continue discussing it.

The main question of this thread is (a genuine question of mine): How do
you identify a movie?

Nicola

Derek Ignatius Asirvadem

Jan 5, 2020, 8:45:03 PM1/5/20
to
Yes, that is the best doc.

The material to focus on would be p17 to p81, ch1 to ch3. But there are problems.
a. If that were written by a high-end user, as a formal or informal User Requirement doc, I would give them a 10 out of 10. Great start, thanks, now it is over to me. Wherein I would model the data (erect a formal data model, several iterations, in communication with the user, etc) and write a Technical Requirement doc to accompany the data model if necessary. At that stage the DM would be what you guys call "logical", and much more, because the real Logical is unknown to you.
b. But it is not. It is a Technical doc that is supposed to be written by a Technical person (film industry technical as opposed to IT technical), for use as a "standard", for an implementation. We have previously agreed that it is a set of considerations and an intent.
c. It is seriously deficient. I would give it 2 out of 10 as a User Technical doc, 0 out of 10 as a Standard. From a technical perspective (attempting a Relational data model [which is the best way to understand the data]; trying to make sense of what they are requiring), it is incoherent.

Let's say that my Key for Movie Title at this stage (some iteration of the DM that is not yet complete), based on the doc, is as follows. (I know it is not correct yet, that is not the point being argued here):
Work.PK ( CountryCode, Year, Creator, Title )
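Purely to show how such a key behaves, here is a sketch that takes the provisional key above at face value (it is stated to be not yet correct) and carries it into a hypothetical Variant table. VariantName, the column types and the sample rows are assumptions for the example, and SQLite via Python is used only so that it runs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")   # SQLite enforces FKs only on request
conn.executescript("""
CREATE TABLE Work (
    CountryCode TEXT    NOT NULL,
    Year        INTEGER NOT NULL,
    Creator     TEXT    NOT NULL,
    Title       TEXT    NOT NULL,
    PRIMARY KEY (CountryCode, Year, Creator, Title)
);
CREATE TABLE Variant (
    CountryCode TEXT    NOT NULL,
    Year        INTEGER NOT NULL,
    Creator     TEXT    NOT NULL,
    Title       TEXT    NOT NULL,
    VariantName TEXT    NOT NULL,   -- hypothetical discriminator
    PRIMARY KEY (CountryCode, Year, Creator, Title, VariantName),
    FOREIGN KEY (CountryCode, Year, Creator, Title)
        REFERENCES Work (CountryCode, Year, Creator, Title)
);
""")

WORK = ("US", 1982, "Ridley Scott", "Blade Runner")
with conn:
    conn.execute("INSERT INTO Work VALUES (?, ?, ?, ?)", WORK)
    conn.executemany(
        "INSERT INTO Variant VALUES (?, ?, ?, ?, ?)",
        [WORK + ("Theatrical",), WORK + ("Director's Cut",)])

def variants_of(title):
    # The Work's key is present in Variant, so no join to Work is needed.
    rows = conn.execute(
        "SELECT VariantName FROM Variant WHERE Title = ? ORDER BY VariantName",
        (title,))
    return [r[0] for r in rows]

print(variants_of("Blade Runner"))
```

With this shape, the same Work cannot be entered twice, and a Variant cannot exist without its Work; whether these four columns are the right identifier is the open question of the thread.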

Now take one eg. p22 § 1.1.1 Boundaries between Works

1. The examples given (to differentiate Works) that are mere changes to one or more component of the Key = Good Thing, confirms the Key. The other examples suck dead bears.

//Change in footage and/or changes in continuity (primary editing)// & the examples given

2. Um. Footage exists only in a Manifestation, it cannot exist in a Work (intellectual concept, not yet real)

//different language versions shot at the same time, released simultaneously//

3. Different Languages is not a Work, because a human can conceive of something in just one language (those who conceive of things in multiple languages can be safely assigned to an asylum). The Language would be the Language of the Country or of the Creator.
Nevertheless, I am willing to accept that multiple languages is an INTENT of the Work.

4. Shooting is a Manifestation (realisation), not a Work.

5. Release is a Manifestation (realisation), or even an Item, not a Work.

Et cetera. I have given just one example of incoherence, there are many such issues.

The same set of problems (incoherence) exists at each of the four Levels.

----

In the normal, real world case, following [a][b][c], I would give the customer a presentation of such errors, the result of which is:
d. either that he would retain responsibility and flick it back to his technical people to produce a Technical doc,
e. or he would commission me to produce a Technical doc that is resolved; free of errors; coherent. Which I would do side-by-side with a DM, in communication with the key users.

Obviously that is not going to happen here. Keeping in mind the central question in this thread, and not shirking the work required, I suggest the following.
- I play both roles, User and Data Modeller
- I make decisions about what each of Work; Variant; Manifestation; Item, actually is (a real world implementation). This will not be arbitrary decisions, but sensible ones
- which will be reflected in the DM, as well as in the conversation in this thread
- however that leaves a rather large gap for argument, and a devil's advocate argument that would be a veritable chasm, a Grand Canyon. Which would detract from the purpose of this thread, and allow you to avoid accepting a hard answer
--- In that case, you have to accept that that is not permitted
- the alternative is [d], you retract the doc, and come back when you have a coherent one, that is sensible enough (not perfect) to model from, to make reasonable decisions re what the Relational
{ Work | Variant | Manifestation | Item }.PK
should be.

Over to you.

Correct me if I am wrong, but the real intent, the real value of this thread, is going to be the discussion that occurs AFTER I give a Relational Key that Identifies a movie. Ie. a good Relational Key vs a surrogate, wrt the charges you have made at the top of this thread.

If, on the other hand, your question is really "how does one determine a Relational Key for a real world implementation, from this incoherent document", then let's confirm that that is the case, and let's have that discussion. It depends on your initial or progressed intent: have a discussion re:
q1. RFS/surrogate vs RDB/Relational Key
-- wherein a model with zero integrity, that does everything for everyone, has already been rejected
or
q2. What is a good Relational Key for this problem

Which is why it is over to you.

----

Re progress, I am more than half complete in either case.

Cheers
Derek

Derek Ignatius Asirvadem

Jan 6, 2020, 1:32:25 AM
> On Sunday, 5 January 2020 01:18:32 UTC+11, Nicola wrote:
> On 2020-01-01, Derek Ignatius Asirvadem <derek.a...@gmail.com> wrote:
> >> On Tuesday, 31 December 2019 22:16:57 UTC+11, Nicola wrote:
> >> On 2019-12-30, Derek Ignatius Asirvadem <derek.a...@gmail.com> wrote:
>
> I have to agree that insane but
> opportunistic political choices have a lot to be blamed for.

Yes, and more. The fat layer of bureaucracy in government, that is full of lefties; greenies; perverts; climate hysterics, implementing their extreme left political agenda. In Australia it is much worse than the big companies ruining the environment or the government (heads) making poor choices. This is the material establishment of the Deep State; the slaves of the Globalist Agenda.

> >> Those are not papers, but (AFAICS) explanations about the reasoning
> >> that went behind the standards that were eventually published.
>
> > Something is not a standard just because it is published.
>
> My assertion did not imply a judgment. They are just called like that:
> "metadata standards".

I did not mean you. I meant FIAF; FRBR, etc. Date; Darwen; Fagin et al. The whole leftist movement. The use of false labels that (at best) confuse, and (at worst) purposely misrepresent.

> > Look, by the Grace of God, in the 70's and 80's, I was a Lead s/w
> > Engineer for Cincom, supplier of the then 5th largest (of five!) DBMS,
> > TOTAL.
>
> https://books.google.lt/books?id=Wz-oh7ZQo8MC&pg=PA45&lpg=PA45&focus=viewport&hl=lt
>
> Always nice to get a glimpse into less known parts of database history.

That was a great walk down memory lane. Thank you !

If interested:
https://books.google.lt/books?id=Wz-oh7ZQo8MC&pg=PA24&lpg=PA24&focus=viewport&hl=lt
An IBM advert for SQL/DS. That ran as a stand-alone program, comparable to running SQL in interactive mode today (Sybase & MS isql). The "embedded SQL" version was a library: link it into COBOL or RPG, and you could write a genuine SQL statement (less than today's syntax) inside your program.

If you looked at this doc § 2, that I linked in my previous response:
http://www.softwaregems.com.au/Documents/Student%20Resolutions/Bill%20of%20Materials.pdf
The left diagram is the BoM solution for IBM/IMS then, simplified. Probably still true today (IMS is still sold and supported)
The right diagram was our Cincom/TOTAL BoM, that blew the doors off the IMS equivalent. It was so successful that Cincom made a full-blown product out of it: MRPS. Advert at the bottom of this page. Still runs in many sites across the world:
https://books.google.lt/books?id=Wz-oh7ZQo8MC&pg=PA24&lpg=PA24&focus=viewport&hl=lt

In the TOTAL NDBMS, the pointers are completely invisible, to both the user and the developer. They only need to be understood by the DBA.

> > With no knowledge or regard of what Date; Darwen; Fagin;
>
> I understand why you want to put those names together, but I disagree
> that Darwen and Date's contributions (or lack thereof) are at the same
> level as Fagin's (no need to reply to this; let's keep our respective
> opinions).

Yes and no. Agreed that Fagin is a real mathematician, the dolts are not. But he, not Codd, wrote the "mathematical definitions" for all the anti-Relational (RM/T etc) concepts, the abnormal "normal forms", etc. He flatly states that DKNF is not achievable, meaning in his/their RFS. I achieve far more than DKNF in every database. Far more, because there are many constraints that are not the result of Domain or Key constraints.

About 2000, I wrote to him with a full IDEF1X data model, and full definitions, concerning that point. He could not understand a graphical model. He could only understand text. He wanted a mathematical definition, which was way beyond the presentation I was making. The usual elitism of academics.

Therefore he is squarely in the same category, of (a) denying the RM and (b) elevating the RFS (RM/T) as "RM". He may not have pig poop on his hands or lips, but he has it in his cranium, same as the rest. From the evidence, he operates with the same goal as the rest.

> > Academics love whatever they come up with, they hang on to the
> > promise, in pathological denial of reality.
>
> I beg to disagree. Don't generalize. Academic work is not different from
> any other human endeavor: many people do it, some do it well, only a few
> are exceptionally good at it. But let's not get sidestepped: your
> opinion about academia is widely known in this group.

Labelling a person and attacking the person is a lame, /ad hominem/ attack, that distracts from the argument and focuses on the person, it permits you to avoid answering the argument. You are too honourable for that.

Ok, let's set aside the opinion about my opinion, let's look at evidenced facts. Needless to say, there are at least 120 academic papers (yours included) that deny RM and elevate RM/T. Name one single articulation of RM (Not RM/T) concepts that any academic since Codd (fifty years and counting) has written. Or a formal definition. These two scream out for a formal treatment:
RM § 1.4
Trees -> "Unnormalised Set" -> Hierarchic Normal Form
"Normalised Set" -> Relational Normal Form

Fifty years.
Fagin has written much purporting to be about the "RM" or "relational", but is actually nothing about the RM, everything about RFS (RM/T), same as the rest of his tribe.

> > Stonebraker invented the filth known as Ingres. It defecated on the
> > users if more than five used the system. But it was highly praised by
> > academia.
>
> This is partly due to the fact that it was the only RDBMS coming from
> academia at the time.

Nonsense. They ignored ALL the products that were available at the time. Pathological denial of reality.

And worse, by 1985, when the products were well-established and fully compliant to Codd's Twelve Rules (well-known before 1985 due to the ongoing discussion in ComputerWorld and Datamation), they still continued to deny them, and write only filth as if reality did not exist. Ingres never complied with the Twelve Rules. PusGross and Oracle today, do not comply with either the Twelve Rules or the SQL Standard.

And the Standards. Academia suppresses IDEF1X, and pushes ERD which is totally inappropriate, completely useless, for Relational data analysis or modelling. They teach that filth in universities. It is a commitment to denying the RM; Relational Keys; the Atom, and instead dealing with fragments of the Key (fragments of the Atom, denying its indivisibility), stored in Records, in an RFS, branded as "relational".

> But academia does recognize the fundamental merits
> of other systems, such as System R and (to a less extent) IS/1 and PRTV.

That is a perfect example of their dishonesty, thank you. No one has heard of the last two. System/R progressed to SEQUEL progressed to SQL/DS progressed to full SQL (proprietary). Then on to SQL as a Standard. Not mentioned. Every "good" lie needs an ounce of truth: mentioning some selected facts of history, which adds credibility as a historian, knowledge of obscure details, etc. The Omission part is omitting the truth, the progress, and even SQL. And the Commission part is presenting their filth as "sql".

Don't forget the promise that they will produce a replacement for SQL because it is "not relational" (FALSE) and "broken" (FALSE), and because they in their ivory tower, divorced from reality, can. Sure, thirty years and still nothing.

Maybe when Stephen Hawking comes up with the long-promised and much awaited Theory of Everything. Or when he redefines nothing as actually something. Oops, too late. Maybe Chris Hitchens. Aw shucks. Wait, they have the autistic boy genuflectus Elon Musk Ox himself. Electric cars made from petro-chemicals, requiring massive electric batteries that blow up, that are made from rare earth minerals, extracted using slave and child labour in desperately poor countries. The new definition of eco-friendly, denial of reality and propagation of fantasy as "reality".

Two of the Four Horsemen of the Apocalypse down, two to go.

But for the rest of us, we have pre-modern science. That does not change.

> >Its bastard child, its resurrected ghost, is Postgres*NON*sql.
>
> The natural child being Sybase, I guess, given that it was made by
> members of the Ingres' team :)

Not sure what you mean.

I meant PusGross is the bastard of Ingres.

Sybase is the natural child of Britton-Lee, a genuine pre-Relational database machine that competed against IBM, Cincom, etc. When the RM became accepted, they formed a new company and came out with the first genuine SQL platform outside IBM: Sybase. Sybase and IBM/DB2 are genuine database machines (refer Architecture doc). That was the Bob Epstein crowd. There were no Ingres team members at Sybase. (I was a Sybase Commercial Partner for 17 years, I worked directly with their Engineers, pushed for specific enhancements that the banks needed; presented at their Engineers' conference; etc.)

> > It does not even remotely comply with SQL,
> > […]
> > ( That is just about PusGrossNONsql vs SQL, that does not cover the
> > other issues, such as No Server Architecture;
>
> Kind of agree (if, as you say, "a list of Unix processes clamouring for
> resources is not an architecture"). It still shines, IMO, compared to
> Oracle.

(I did give a bit more definition re what is, and is not, Server Architecture. That is a very condensed introductory doc for comparison purposes. A technical person would understand the consequences, of each of the Architecture and the Non-Architecture.)

Ok, I agree. In my considered experience, new twenty-year-old pig poop shines in comparison to forty-year-old pig poop. But they are in the same category. From the evidence of the various papers, you, and the academics in general, have no idea what a real SQL platform looks like, let alone what it works like. There is no basis for you guys to declare "SQL is this ... SQL can't do that ...", but you do, all the time. In denial of reality.

> > No ACID Transactions;
>
> Do you refer to MVCC and stored procedures? Happy to discuss that in
> a different thread.

Yes, please. Although the definition is short, the discussion will be long. Might include different implementations. Need to understand the historical progression.

> > Therefore, if you would like to engage because you would like to know
> > what actual Authority; Standards; Codd (RM & Twelve Rules only);
> > genuine SQL; high-end customers have been doing for four decades,
> > while academia has been hyper-ventilating about, and engorging
> > themselves with, each others backsides, we have a chance of having
> > a meaningful discussion. Specifically, you will gain something.
>
> Yes, that's the prime and ultimate goal of this discussion!

Ok, great. But I did detail a caveat, ending with:

> > I am saying you must have some serious pain, caused by the natural revulsion at the sophistry that passes for "logic", the foul odour of the mountain you are standing in, to move away from it.

Which I take to mean that you have accepted that. The idea is to limit, if not eliminate, devil's advocate arguments for something that is not reasonable for a Scientist.

Now we need some boundaries and expectations.
1. After 34 years of practising the RM, and not the anti-Relational RM/T, without the contamination of the Date; Darwen; Fagin sophistry that markets the anti-Relational RFS (RM/T as the RM), I have a fair amount of progressions to the RM, which I credit to Codd, because anyone whose brains have not been scrambled would go down the same path and find the same progressions. These are documented in customer systems, confidential, and not in the public domain. These are covered by a customer contract which includes a Non-Disclosure Agreement. They continue to have commercial value.

2. This forum is public domain. [1] is explicitly excluded from public discussion.

- There is a huge gap between what the RM is marketed and promoted as (RFS based on RM/T plus plus, truly an RFS) and what the RM actually is. Which is defined in the header thread. That is the scope of discussion here. What a real world Applied Scientist would do, from the RM proper.

- Whereas [1] is the implementation, including Rules; extensions to existing Standards; new Standards where there are none; implementation constructs that can be immediately implemented, this [2] is everything minus that.

As evidenced in this and other threads I have participated in. Same as Codd, I accept Applied Theory and scoff at theory that has no application intent, or that contradicts application science.

> > No. That is not simplicity. Stop lying to yourself. That is
> > ignoring the science that produces simplicity, and shifting the
> > responsibility back to the user.
>
> From here on, I essentially agree with your extensive reply (except for
> some unimportant details).

Thank you.

> > There are THREE types of genuine Hierarchies, which are inherent (yes,
> > inherent) in the Relational Model. In the fifty years since Codd, not
> > a single "theoretician" has defined or articulated them. Proved it
> > is, that that is not known to you, by the fact that you are pushing
> > linked lists of physical IDs as "hierarchies".
>
> I see two types of hierarchies: at the instance level (e.g., the
> classical bill of materials) and at the schema level (via dependent
> entities, including categorizations). What is the third?

My numbering, since 1985, which is logical, in the sense of sequence of encounter as one applies the RM. I will use IDEF1X terms, because that is the one and only Standard for Relational data modelling. (You may wish to discuss why ERD is totally useless for Relational modelling; why the academics push ERD; why they deny the existence of IDEF1X; why they do not teach IDEF1X; etc).

My Intro to IDEF1X is:
http://www.softwaregems.com.au/Documents/Documentary%20Examples/IDEF1X%20Notation.pdf

1. The Relational Key
The central article of the RM.
Which is a composite, which itself reflects The Hierarchy. A single Hierarch (Independent table), followed by a sequence of Hierarchic tables (Dependent tables), the relations between being Identifying relations. In the example:
http://www.softwaregems.com.au/Documents/Documentary%20Examples/Order%20DM%20Advanced.pdf
- one perfect example would be Country ... Street
- one imperfect example would be Party ... OrderSaleItem. Imperfect because the Hierarch.PK is a surrogate (justification aside), and thus a Relational Breach; a violation of the Access Path Independence Rule. But the consequence, the impact, is small, due to it being at the top of the natural hierarchy.
- another imperfect example would be Object ... OrderSaleItem
- Address is a more impactful breach (the example DM is a teaching vehicle, and thus includes classic errors in order to deeply understand them), because in the pure RM model it would be a Hierarchic, in the middle of the natural hierarchy. The consequence of the Relational Breach/violation of the Access Path Independence Rule is more impactful, because it cuts off the descendent rows from the natural relation to their ancestor rows. Whereas Party & Object is not.

What it is not
Reference tables such as ObjectType, because it has no Identifying relations.
Address, because even though it has a hierarchy one level deep, it goes nowhere.

The central article of the RM provides (and RFS [RM/T] can not have):
- Relational Integrity (which is Logical, and distinct from Referential Integrity, which is physical)
- Relational Power (simplest example is JOIN power)
- Relational Speed (due to smallest tables; fewest indices; etc)
These three absolute features of the RM are identified here, but not expanded. These are the three features that must be considered carefully, when choosing to implement a surrogate, because they will be lost.
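
A minimal sketch of the Country ... Street chain, to make the point about the Key concrete. The table and column names here (Country, State, Suburb, Street and their columns) are assumptions for illustration, not taken from the linked Order DM, and SQLite via Python is used only so that the example runs anywhere. Each Dependent table carries its parent's Key as the leading part of its own Key:

```python
import sqlite3

# Hypothetical tables and columns, for illustration only.
con = sqlite3.connect(":memory:")
con.execute("PRAGMA foreign_keys = ON")
con.executescript("""
CREATE TABLE Country (
    CountryCode TEXT NOT NULL,
    Name        TEXT NOT NULL,
    PRIMARY KEY (CountryCode)
);
CREATE TABLE State (
    CountryCode TEXT NOT NULL,
    StateCode   TEXT NOT NULL,
    Name        TEXT NOT NULL,
    PRIMARY KEY (CountryCode, StateCode),
    FOREIGN KEY (CountryCode) REFERENCES Country (CountryCode)
);
CREATE TABLE Suburb (
    CountryCode TEXT NOT NULL,
    StateCode   TEXT NOT NULL,
    SuburbName  TEXT NOT NULL,
    PRIMARY KEY (CountryCode, StateCode, SuburbName),
    FOREIGN KEY (CountryCode, StateCode)
        REFERENCES State (CountryCode, StateCode)
);
CREATE TABLE Street (
    CountryCode TEXT NOT NULL,
    StateCode   TEXT NOT NULL,
    SuburbName  TEXT NOT NULL,
    StreetName  TEXT NOT NULL,
    PRIMARY KEY (CountryCode, StateCode, SuburbName, StreetName),
    FOREIGN KEY (CountryCode, StateCode, SuburbName)
        REFERENCES Suburb (CountryCode, StateCode, SuburbName)
);
""")
con.execute("INSERT INTO Country VALUES ('AU', 'Australia')")
con.execute("INSERT INTO State VALUES ('AU', 'NSW', 'New South Wales')")
con.execute("INSERT INTO Suburb VALUES ('AU', 'NSW', 'Sydney')")
con.execute("INSERT INTO Street VALUES ('AU', 'NSW', 'Sydney', 'George Street')")

# Street joins straight to Country: the ancestor's Key is part of the
# Street Key, so State and Suburb need not be visited.
rows = con.execute("""
    SELECT Country.Name, Street.StreetName
    FROM   Street
    JOIN   Country ON Country.CountryCode = Street.CountryCode
""").fetchall()

# A Street row whose ancestors do not exist is rejected by the DBMS.
try:
    con.execute(
        "INSERT INTO Street VALUES ('AU', 'VIC', 'Melbourne', 'Collins Street')"
    )
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
```

If State were given a surrogate PK instead, Street would carry only that surrogate, and the direct join from Street to Country above would have to go through every intermediate table.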

It is remarkable that academics do not know this, despite making grandiose declarations about what the RM is, and is not (promotion and marketing of the RFS as "the RM" aside).

For a related discussion, see:
https://groups.google.com/forum/#!topic/comp.databases.theory/5212JwYtip4
It is a bit laboured due to my having to overcome the dishonesty of academics, but it does have relevant background info, and a trip down memory lane.

If I were to /define/ the RM in a single sentence, I would not use the academics' declaration, that it was totally new, the first model because it has an RA, invented in isolation from the historic context, blah blah. Because that is a lie. It would be:
- the RM is based on (a) The Hierarchical Paradigm, (b) the Network Paradigm, and (c) FOPC, leading to the RA. Once the cataracts are removed from the eye, each of those items is readily visible.

2. Preamble
Whereas [1] is /in/ the Key, and migrated as FK::subordinate Key, and therefore /in/ a chain of tables, [2] is not, it is within the data rows of a single table.
Using this write-up, § 1:
http://www.softwaregems.com.au/Documents/Student%20Resolutions/Bill%20of%20Materials.pdf

2. Single-Parent Hierarchy
Commonly called "One Way".
Also commonly called "self-reference" which is an idiotic term (the row does not reference itself, the proponent is evidently confused re a table and a row) and therefore I do not use it.

The definition is, a row in the table references another single row in the same table, as a parent. The relation is Non-Identifying because the subject Key is already an Atom, and the reference does not define it, therefore the reference is an attribute of the subject Key.

The reference/relation does not have an attribute: if it did, it would have to be in a subordinate table, one Identifying relation and one Non-Identifying relation.

This describes a complete RM single-parent hierarchy, which is Logical and therefore without any limitation. Code that navigates the hierarchy can be simple or complex, based on the presentation demand. Typically, a recursive Function is used to produce the list of ancestors as a single column or variable. An example is Path in:
http://www.softwaregems.com.au/Documents/Tutorial/Recursion/Hierarchy%20Inline.pdf
(corresponds to Node in the BoM write-up)

Moving a branch requires an UPDATE to a single row. We have had these in the real world since 1985. SQL WITH is not required, CTE is not required.
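
A minimal sketch of the single-parent hierarchy, under stated assumptions: the Node table, its column names, the `path` helper and the sample rows are hypothetical, the NULL parent for the root is a simplification, and SQLite via Python stands in for a real SQL platform. The recursion is done in a plain function, without WITH or a CTE:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("PRAGMA foreign_keys = ON")
con.execute("""
    CREATE TABLE Node (
        NodeName   TEXT NOT NULL,
        ParentName TEXT,            -- simplification: NULL only for the root
        PRIMARY KEY (NodeName),
        FOREIGN KEY (ParentName) REFERENCES Node (NodeName)
    )
""")
con.executemany("INSERT INTO Node VALUES (?, ?)", [
    ("Root", None), ("A", "Root"), ("B", "Root"), ("A1", "A"), ("A1x", "A1"),
])

def path(con, node):
    """Return the path from the root down to node as a single string."""
    parent = con.execute(
        "SELECT ParentName FROM Node WHERE NodeName = ?", (node,)
    ).fetchone()[0]
    if parent is None:
        return node
    return path(con, parent) + "/" + node

before = path(con, "A1x")

# Move the whole branch headed by A1 from A to B: one row is updated,
# and every descendant follows because it references A1, not A.
moved = con.execute(
    "UPDATE Node SET ParentName = 'B' WHERE NodeName = 'A1'"
).rowcount

after = path(con, "A1x")
```

The parent reference is an ordinary attribute of the row, not part of its Key, which is why the move touches nothing but the head of the branch.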

For understanding, compare the RFS equivalents as promoted by Date; Darwen; Fagin; Celko (a whole book /SQL for Dummies/ ); et al: Adjacency lists; Nested Sets; still others, poured in concrete. Moving a leaf is horrendous, moving a branch is an overnight batch job that might blow the transaction log.

3. Multiple-Parent Hierarchy
Using the same BoM write-up, § 3.
Whereas [2] implements a full tree downstream, but a scalar upstream (single row per generation), this allows reference to multiple parent rows (multiple rows per generation). The classic Bill of Materials with zero duplication. In its simple form, the relation has no attributes, thus it is an Associative Table. Where the relation has attributes, it is an ordinary Dependent table.

It must be emphasised that solving this problem (it was a major problem in pre-Relational DBMS) was a specific task that Codd was given to solve in his RM project. Which he did brilliantly, due to the RM being Logical (FOPC: RA).

For understanding, the main difference between this and the RFS equivalent is simple code for navigation and no integrity errors vs complex navigation code and integrity errors. And worse, such integrity errors are not readily visible, because they hide in the undefined row or fragments of the row, which is inside the referenced record or records.
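
A minimal sketch of the multiple-parent case in its simple form (relation without attributes), again under stated assumptions: the Part and Assembly tables, their columns, the `explode` helper and the sample parts are hypothetical and are not the tables in the BoM write-up. A relation with attributes (eg. a quantity) would add columns to Assembly; cycle prevention is left out for brevity.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("PRAGMA foreign_keys = ON")
con.executescript("""
CREATE TABLE Part (
    PartCode TEXT NOT NULL,
    PRIMARY KEY (PartCode)
);
-- Associative table: its Key is the pair of Part Keys.
CREATE TABLE Assembly (
    AssemblyCode  TEXT NOT NULL,
    ComponentCode TEXT NOT NULL,
    PRIMARY KEY (AssemblyCode, ComponentCode),
    FOREIGN KEY (AssemblyCode)  REFERENCES Part (PartCode),
    FOREIGN KEY (ComponentCode) REFERENCES Part (PartCode),
    CHECK (AssemblyCode <> ComponentCode)
);
""")
con.executemany("INSERT INTO Part VALUES (?)",
                [("Bike",), ("Trike",), ("Wheel",), ("Spoke",)])
con.executemany("INSERT INTO Assembly VALUES (?, ?)",
                [("Bike", "Wheel"), ("Trike", "Wheel"), ("Wheel", "Spoke")])

def explode(con, part):
    """Return every component below part, depth first."""
    found = []
    children = con.execute(
        "SELECT ComponentCode FROM Assembly "
        "WHERE AssemblyCode = ? ORDER BY ComponentCode", (part,)
    ).fetchall()
    for (child,) in children:
        found.append(child)
        found.extend(explode(con, child))
    return found

bike_parts = explode(con, "Bike")
trike_parts = explode(con, "Trike")

# Wheel has two parents but is stored exactly once.
wheel_parents = [r[0] for r in con.execute(
    "SELECT AssemblyCode FROM Assembly "
    "WHERE ComponentCode = 'Wheel' ORDER BY AssemblyCode")]
wheel_rows = con.execute(
    "SELECT COUNT(*) FROM Part WHERE PartCode = 'Wheel'").fetchone()[0]
```

Both assemblies explode through the same Wheel row, so a change to what a Wheel contains is made in one place.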

----

> I see two types of hierarchies: ...
> including categorizations

Categorisation (generalisation::specialisation in OO/ORM terms) is not an hierarchy. Just as we had three NFs; ACID Transactions; etc, before Codd, we had Basetype::Subtype clusters before Codd. Codd gives licence for all that in his RM.

/Categories/ is the original IDEF1X term but it is horrible (think: Aristotle /Ten [Classical] Categories/, or the normal English meaning) and not used in the IDEF1X community. Further the Original IDEF1X notation is limited, same as the notation of Cardinality. Therefore IEEE notation and terms are used for both Cardinality and Subtypes in an IDEF1X model. (Detailed in my IDEF1X Intro.) All modelling tools that I have used over the decades, that provide IDEF1X modelling allow IDEF1X/IEEE notation for Cardinality and Subtypes.

We have a Basetype and a set of Subtypes, forming a single-level cluster. Detailed discussion here:
http://www.softwaregems.com.au/Documents/Article/Database/Relational%20Model/Subtype.pdf

The main concept is, a single Basetype::Subtype pair should be understood (perceived) as a single logical row, not two rows with a relation between them. In Logic terms, it is an OR GATE. XOR for Exclusive, ordinary OR for Non-Exclusive. Another reason why the IEEE nomenclature is more applicable.

If you mean an hierarchy because a Subtype may in turn be a Basetype, and so on, then no, that is an ordinary hierarchy type [1], as defined by the Key, not a different type of hierarchy.
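
For readers who want to see an Exclusive (XOR) cluster enforced, here is a rough sketch. Note loudly: it uses a discriminator column repeated in each Subtype, which is one widely used technique and is NOT claimed to be the method in the linked Subtype document. The Party, Person and Organisation tables and their columns are hypothetical, and SQLite via Python is used only so that it runs.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("PRAGMA foreign_keys = ON")
con.executescript("""
CREATE TABLE Party (                       -- Basetype
    PartyCode TEXT NOT NULL,
    PartyType TEXT NOT NULL CHECK (PartyType IN ('P', 'O')),
    PRIMARY KEY (PartyCode),
    UNIQUE (PartyCode, PartyType)          -- target for the Subtype FKs
);
CREATE TABLE Person (                      -- Subtype, same Key as Basetype
    PartyCode TEXT NOT NULL,
    PartyType TEXT NOT NULL CHECK (PartyType = 'P'),
    BirthDate TEXT NOT NULL,
    PRIMARY KEY (PartyCode),
    FOREIGN KEY (PartyCode, PartyType)
        REFERENCES Party (PartyCode, PartyType)
);
CREATE TABLE Organisation (                -- Subtype, same Key as Basetype
    PartyCode TEXT NOT NULL,
    PartyType TEXT NOT NULL CHECK (PartyType = 'O'),
    TradingName TEXT NOT NULL,
    PRIMARY KEY (PartyCode),
    FOREIGN KEY (PartyCode, PartyType)
        REFERENCES Party (PartyCode, PartyType)
);
""")
con.execute("INSERT INTO Party VALUES ('P1', 'P')")
con.execute("INSERT INTO Person VALUES ('P1', 'P', '1970-01-01')")

# XOR: P1 is declared a Person in the Basetype, so it cannot also be
# given an Organisation row, whichever discriminator value is tried.
rejected = 0
for party_type in ("O", "P"):
    try:
        con.execute("INSERT INTO Organisation VALUES ('P1', ?, 'Acme')",
                    (party_type,))
    except sqlite3.IntegrityError:
        rejected += 1

# The Basetype::Subtype pair read back as a single logical row.
logical_row = con.execute("""
    SELECT Party.PartyCode, Person.BirthDate
    FROM   Party JOIN Person ON Person.PartyCode = Party.PartyCode
""").fetchone()
```

The Subtype has exactly the Basetype's Key, which is why the pair reads back as one logical row rather than as a parent and a child.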

> >> one difference, though, is that a movie typically has no single
> >> creator, but rather a group of people with distinct roles (writer,
> >> screenwriter, director, etc.). Not sure whether you may attribute
> >> a movie to a single principal creator.
> >
> > Author: I don't. They call it Agent. It can be a single person, or
> > a corporation, or a collective of persons.
>
> Yes, that may be a good way to model that.
>
> > Cast: They have specific roles for each member of the cast.
>
> Sure. More accurately, each member of the cast may have more than one
> role (e.g., an actor that is also the movie director).

No problem at all.
Simple Dependent table with a Key that implements that.
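
A minimal sketch of such a Dependent table, assuming the provisional Work.PK ( CountryCode, Year, Creator, Title ) discussed in this thread, which is explicitly not final. The WorkCredit table name, the PersonName and RoleCode columns, and the placeholder data are all hypothetical; a real model would identify the person (Agent) properly rather than by a bare name.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("PRAGMA foreign_keys = ON")
con.executescript("""
CREATE TABLE Work (
    CountryCode TEXT    NOT NULL,
    Year        INTEGER NOT NULL,
    Creator     TEXT    NOT NULL,
    Title       TEXT    NOT NULL,
    PRIMARY KEY (CountryCode, Year, Creator, Title)
);
-- Dependent table: the Work Key plus person plus role.
CREATE TABLE WorkCredit (
    CountryCode TEXT    NOT NULL,
    Year        INTEGER NOT NULL,
    Creator     TEXT    NOT NULL,
    Title       TEXT    NOT NULL,
    PersonName  TEXT    NOT NULL,
    RoleCode    TEXT    NOT NULL,
    PRIMARY KEY (CountryCode, Year, Creator, Title, PersonName, RoleCode),
    FOREIGN KEY (CountryCode, Year, Creator, Title)
        REFERENCES Work (CountryCode, Year, Creator, Title)
);
""")
work = ("XX", 2000, "Example Creator", "Example Title")  # placeholder data
con.execute("INSERT INTO Work VALUES (?, ?, ?, ?)", work)

insert_credit = "INSERT INTO WorkCredit VALUES (?, ?, ?, ?, ?, ?)"
con.execute(insert_credit, work + ("Person One", "Actor"))
con.execute(insert_credit, work + ("Person One", "Director"))

# The same person holds two roles in the same Work.
roles = [r[0] for r in con.execute(
    "SELECT RoleCode FROM WorkCredit "
    "WHERE PersonName = 'Person One' ORDER BY RoleCode")]

# The same person in the same role twice is rejected by the Key.
try:
    con.execute(insert_credit, work + ("Person One", "Actor"))
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
```

Because RoleCode is part of the Key, one person may appear in several roles, but never twice in the same role for the same Work.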

Cheers
Derek

Nicola

Jan 6, 2020, 12:24:02 PM
> Yes, that is the best doc.
>
> Let's say that my Key for Movie Title at this stage (some iteration of
> the DM that is not yet complete), based on the doc, is as follows. (I
> know it is not correct yet, that is not the point being argued here):
> Work.PK ( CountryCode, Year, Creator, Title )
>
> Now take one eg. p22 § 1.1.1 Boundaries between Works
>
> 1. The examples given (to differentiate Works) that are mere changes
> to one or more component of the Key = Good Thing, confirms the Key.
> The other examples suck dead bears.
>
> //Change in footage and/or changes in continuity (primary
> editing)// & the examples given
>
> 2. Um. Footage exists only in a Manifestation, it cannot exist in
> a Work (intellectual concept, not yet real)

I'd say that you are right. The examples are taken from this paper:

https://escholarship.org/content/qt6hk8h9vp/qt6hk8h9vp.pdf

My interpretation is that, even at the intellectual level, the "moving
image" part of a movie ("footage" is probably not the best term here)
must be considered an intrinsic part of a (Moving Image) Work,
independent of any Manifestation and shared by all of the Manifestations
of the same Work, in the same way as Shakespeare's sonnets or
Aristotle's Metaphysics have an existence independent of any particular
printed or digital edition of them, and all printed or digital (or
spoken, for that matter) editions of them must share the same words (or
most of the same words). Similarly, all the Manifestations of the same
Moving Image Work must share the same "moving image" part, otherwise
they have to be considered expressions of different Works.

For example, if I interpret correctly those definitions, the recent
hyper-realistic remake of The Lion King by Disney even at the conceptual
(i.e., Work) level must be considered a distinct Work from the original,
even if it is based on the same screenplay, because the "moving image"
part is completely different.

One difficulty here is that we come to know Works only through their
Manifestations, so the only way we have to determine the existence of
a new Work is through the analysis of some Manifestation of it. Perhaps,
the simplest way to think of a Work is as the abstraction of the
collection of all of its Manifestations, similarly to the way we define
Person as the abstraction of the collection of all individuals.

> //different language versions shot at the same time, released
> simultaneously//
>
> 3. Different Languages is not a Work, because a human can conceive of
> something in just one language (those who conceive of things in
> multiple languages can be safely assigned to an asylum). The Language
> would be the Language of the Country or of the Creator. Nevertheless,
> I am willing to accept that multiple languages is an INTENT of the
> Work.

They provide Tod Browning's Dracula (1931), and other movies, as
examples. According to what I have read (I am not an expert), that movie
was shot at night on the same set both in English and in Spanish (with
Spanish-speaking actors). I *think* that they may have shot a scene in
English then, immediately after, shot the same scene in Spanish,
possibly changing some actors. According to FIAF, those are to be
considered two different Works (because of course the "footage"—hence,
more abstractly, the "moving image" part—cannot be the same).

I have searched some of those movies on IMDB: interestingly, Dracula
has one "record" (one page):

https://www.imdb.com/title/tt0021814/?ref_=fn_tt_tt_29

But, for example, Anna Christie (1930), which is another of the FIAF
examples, appears twice:

https://www.imdb.com/title/tt0020641/?ref_=fn_al_tt_1
https://www.imdb.com/title/tt0020642/?ref_=fn_al_tt_2

once attributed to USA and the other one to Germany.

> 4. Shooting is a Manifestation (realisation), not a Work.
>
> 5. Release is a Manifestation (realisation), or even an Item, not a Work.
>
> Et cetera. I have given just one example of incoherence, there are many such issues.
>
> The same set of problems (incoherence) exists at each of the four Levels.

I agree that they should have paid more attention to such aspects.

> ----
>
> In the normal, real world case, following [a][b][c], I would give the
> customer a presentation of such errors, the result of which is:
> d. either that he would retain responsibility and flick it back to
> his technical people to produce a Technical doc,
> e. or he would commission me to produce a Technical doc that is
> resolved; free of errors; coherent. Which I would do side-by-side
> with a DM, in communication with the key users.
>
> Obviously that is not going to happen here. Keeping in mind the
> central question in this thread, and not shirking the work required,
> I suggest the following.
> - I play both roles, User and Data Modeller
> - I make decisions about what each of Work; Variant; Manifestation;
> Item, actually is (a real world implementation). These will not be
> arbitrary decisions, but sensible ones
> - which will be reflected in the DM, as well as in the conversation in
> this thread

That's fine.

> - however that leaves a rather large gap for argument, and a devil's
> advocate argument that would be a veritable chasm, a Grand Canyon.
> Which would detract from the purpose of this thread, and allow you to
> avoid accepting a hard answer
> --- In that case, you have to accept that that is not permitted
> - the alternative is [d], you retract the doc, and come back when you
> have a coherent one, that is sensible enough (not perfect) to model
> from, to make reasonable decisions re what the Relational
> { Work | Variant | Manifestation | Item }.PK
> should be.
>
> Over to you.

I'm afraid I don't know other resources to refer you to. Appendix I has
some examples that might be interesting to browse, though.

> Correct me if I am wrong, but the real intent, the real value of this
> thread, is going to be the discussion that occurs AFTER I give
> a Relational Key that Identifies a movie. Ie. a good Relational Key vs
> a surrogate, wrt the charges you have made at the top of this thread.

Yes, that might be an interesting comparison. But I'd be happy to stop
after resolving the specific question I have posed.

> If, on the other hand, your question is really "how does one determine
> a Relational Key for a real world implementation, from this incoherent
> document", then let's confirm that that is the case, and let's have
> that discussion. It depends on your initial or progressed intent:
> have a discussion re:
> q1. RFS/surrogate vs RDB/Relational Key
> -- wherein a model with zero integrity, that does everything for
> everyone, has already been rejected
> or
> q2. What is a good Relational Key for this problem
>
> Which is why it is over to you.

First and foremost, the question is always the same: how do you identify
a movie (i.e., q2)?

Nicola

Nicola

Jan 6, 2020, 1:29:38 PM
On 2020-01-06, Derek Ignatius Asirvadem <derek.a...@gmail.com> wrote:
> That was a great walk down memory lane. Thank you !

Thanks for the links. Bookmarked.

>> I beg to disagree. Don't generalize. Academic work is not different from
>> any other human endeavor: many people do it, some do it well, only a few
>> are exceptionally good at it. But let's not get sidestepped: your
>> opinion about academia is widely known in this group.
>
> Labelling a person and attacking the person is a lame, /ad hominem/
> attack,

I didn't mean that as a lame personal attack, sorry if it sounded so.
I was just trying to limit the scope of this thread, into which we have
already mixed database history, concurrency control, general questions
about hierarchies, etc. (my fault, too). Nothing against discussing
a wide range of topics, but please let's stick to one topic per thread.
Time is limited for everyone (especially now that holidays are over),
and having to discuss several topics at once may become infeasible for
me.

I am still replying to your comments below, but it may become too
time-consuming for me to keep doing so. So, again, nothing personal, but
let's move tangential topics, like you did for the PostgreSQL thread.

> Name one single articulation of RM (Not RM/T) concepts that any
> academic since Codd (fifty years and counting) has written.

I think that, whatever I name, you won't accept it :) Off the top of my
head: Datalog.

But I must agree that most database theory has remained such. Besides,
I'd say that research on database theory has declined sharply since the
'90s, or maybe even earlier. AFAICS, current research in the database
field is primarily related to database *systems* (e.g., physical
optimization).

> ERD [...] is totally inappropriate

Agreed.

>> >Its bastard child, its resurrected ghost, is Postgres*NON*sql.
>>
>> The natural child being Sybase, I guess, given that it was made by
>> members of the Ingres' team :)
>
> Not sure what you mean.
>
> I meant PusGross is the bastard of Ingres.

> Sybase is the natural child of Britton-Lee, a genuine pre-Relational
> database machine that competed against IBM, Cincom, etc.

I quote from "Readings in Database Systems", Fourth Edition, p. 98:

"Perhaps the most commercially successful and technically impressive of
these systems was from a company called Britton-Lee, which was founded
by a number of alumni from the INGRES research group."

Nicola

Derek Ignatius Asirvadem

Jan 8, 2020, 6:38:15 PM
> On Tuesday, 7 January 2020 04:24:02 UTC+11, Nicola wrote:
> On 2020-01-06, Derek Ignatius Asirvadem <derek.a...@gmail.com> wrote:
> >
> > Let's say that my Key for Movie Title at this stage (some iteration of
> > the DM that is not yet complete), based on the doc, is as follows. (I
> > know it is not correct yet, that is not the point being argued here):
> > Work.PK ( CountryCode, Year, Creator, Title )
> >
> > Now take one eg. p22 § 1.1.1 Boundaries between Works
> >
> > 1. The examples given (to differentiate Works) that are mere changes
> > to one or more component of the Key = Good Thing, confirms the Key.
> > The other examples suck dead bears.
> >
> > //Change in footage and/or changes in continuity (primary
> > editing)// & the examples given
> >
> > 2. Um. Footage exists only in a Manifestation, it cannot exist in
> > a Work (intellectual concept, not yet real)
>
> I'd say that you are right. The examples are taken from this paper:
>
> https://escholarship.org/content/qt6hk8h9vp/qt6hk8h9vp.pdf

What a joke. I love the way millennials think that if they call themselves /scholar/ they magically mystically mysteriously BECOME /scholars/.

> My interpretation is that, even at the intellectual level, the "moving
> image" part of a movie ("footage" is probably not the best term here)
> must be considered an intrinsic part of a (Moving Image) Work,
> independent of any Manifestation and shared by all of the Manifestations
> of the same Work, in the same way as Shakespeare's sonnets or
> Aristotle's Metaphysics have an existence independent of any particular
> printed or digital edition of them, and all printed or digital (or
> spoken, for that matter) editions of them must share the same words (or
> most of the same words). Similarly, all the Manifestations of the same
> Moving Image Work must share the same "moving image" part, otherwise
> they have to be considered expressions of different Works.
>
> For example, if I interpret correctly those definitions, the recent
> hyper-realistic remake of The Lion King by Disney even at the conceptual
> (i.e., Work) level must be considered a distinct Work from the original,
> even if it is based on the same screenplay, because the "moving image"
> part is completely different.
>
> One difficulty here is that we come to know Works only through their
> Manifestations, so the only way we have to determine the existence of
> a new Work is through the analysis of some Manifestation of it.

Yes. That is exactly why I moved from the challenge to submit THREE levels to FOUR levels. They have to be worked, be modelled together.

> Perhaps,
> the simplest way to think of a Work is as the abstraction of the
> collection of all of its Manifestations, similarly to the way we define
> Person as the abstraction of the collection of all individuals.

(Not arguing with your wording here, but rather the logical concept. I will use the well-known example here, rather than the mystical magical thingee that FIAF/FRBR/UCLA/etc cannot define in 20 years, despite their familiarity with it.)

How is that an /abstraction/ ?

Person is the genus, wherein { Teacher; Lawyer; Imbecile; etc } is the species.
Teacher is the genus wherein { Secondary; Lecturer; Professor; etc } is the species.
Sure, when approaching the subject from a starting position of "unknown", one must define the species (plural), and thus the genus becomes known, and can be defined.

But but but, that is not /it/. Second level. What we model in other entities will expose attributes of either genus or species of Person, either Matter or Form (accidents or being or essence).

So no, from where I sit, /abstraction/ is not anywhere near the accurate description or definition.

(If you are using /abstraction/ in the way the OO/ORM boffins use, to mean absolutely anything they cannot define, please use another word.)

Labelling, naming, a thing is very very very important. The name carries meaning. From the progress thus far, /Work/ is an idiotic label. I am tending towards /Concept/.

> > //different language versions shot at the same time, released
> > simultaneously//
> >
> > 3. Different Languages is not a Work, because a human can conceive of
> > something in just one language (those who conceive of things in
> > multiple languages can be safely assigned to an asylum). The Language
> > would be the Language of the Country or of the Creator. Nevertheless,
> > I am willing to accept that multiple languages is an INTENT of the
> > Work.
>
> They provide Tod Browning's Dracula (1931), and other movies, as
> examples. According to what I have read (I am not an expert), that movie
> was shot at night on the same set both in English and in Spanish (with
> Spanish-speaking actors). I *think* that they may have shot a scene in
> English then, immediately after, shot the same scene in Spanish,
> possibly changing some actors. According to FIAF, those are to be
> considered two different Works (because of course the "footage"—hence,
> more abstractly, the "moving image" part—cannot be the same).

Well, you and I agreed at the outset, the first principle is the Four Laws of Thought. Operating outside those Laws is the definition of insanity. They literally live in the Excluded Middle, resolving nothing; having contradictory and/or confused or indeterminate meanings. I've got a job to do in the real world, a cataloguing system to build, from real physical articles (collection of movie articles just obtained from a deceased estate; etc). I can't be asked to join their insanity and wring my hands for the next 20 years, sounding knowledgeable about fantasy while doing nothing to define it. Therefore, applying the Law of the Excluded Middle, I will
--------
resolve
--------
what they cannot, and I will accept the responsibility for doing so. That means, if you find it broken, in the next ten years, you can come back to me and I will fix it free of charge.

Likewise, they have contradictions. The universe has no contradictions. Anything perceived to be a contradiction is squarely in the perceiver's intellect, and it must be resolved (one side is cancelled). They are welcome to their intellectual and spiritual pain, which is the hot maintained tension of opposites. I can't be asked to join them. Therefore, applying the Law of Non-Contradiction, I will
--------
resolve
--------
what they cannot, and I will accept the responsibility for doing so. That means, if you find it broken, in the next ten years, you can come back to me and I will fix it free of charge.

- No problem at all, that the Concept will become known from the Manifestation.
- Each of Dracula & Anna Christie are a single Concept, multiple Manifestations.
- Articles such as Aristotle's Metaphysics are each a single Concept, multiple Manifestations.
- /the same "moving image" part/ is eliminated as a Concept.
--- If and when the user (archivist; librarian; cataloguer; etc) determines THE or A /concept/, he can add it.
--- the notion of IDENTIFYING the Concept before the addition of A Manifestation is dismissed, because it reverses the order of the universe, the natural progression of determining what in heavens name the articles in a Collection are. Ie. we come to determine the genus through determination of, and further discernment of, the species.

> I have searched some of those movies on IMDB: interestingly, Dracula
> has one "record" (one page):
>
> https://www.imdb.com/title/tt0021814/?ref_=fn_tt_tt_29
>
> But, for example, Anna Christie (1930), which is another of the FIAF
> examples, appears twice:
>
> https://www.imdb.com/title/tt0020641/?ref_=fn_al_tt_1
> https://www.imdb.com/title/tt0020642/?ref_=fn_al_tt_2
>
> once attributed to USA and the other one to Germany.

See what I mean. When one does not follow the Laws, or the science, one ends up with massive duplicates, duplicates at more than one level, not just duplicate rows in a single table (which is simple to prevent). Dupes like this are the result of conceptual errors (you might say at the /abstraction/ level).

> > 4. Shooting is a Manifestation (realisation), not a Work.
> >
> > 5. Release is a Manifestation (realisation), or even an Item, not a Work.
> >
> > Et cetera. I have given just one example of incoherence, there are many such issues.
> >
> > The same set of problems (incoherence) exists at each of the four Levels.
>
> I agree that they should have paid more attention to such aspects.

Not just attention, which I am sure they have paid mountains of. But the Laws, the science. All of which they are evidently ignorant of.

> > In the normal, real world case, following [a][b][c], I would give the
> > customer a presentation of such errors, the result of which is:
> > d. either that he would retain responsibility and flick it back to
> > his technical people to produce a Technical doc,
> > e. or he would commission me to produce a Technical doc that is
> > resolved; free of errors; coherent. Which I would do side-by-side
> > with a DM, in communication with the key users.
> >
> > Obviously that is not going to happen here. Keeping in mind the
> > central question in this thread, and not shirking the work required,
> > I suggest the following.
> > - I play both roles, User and Data Modeller
> > - I make decisions about what each of Work; Variant; Manifestation;
> > Item, actually is (a real world implementation). This will not be
> > arbitrary decisions, but sensible ones
> > - which will be reflected in the DM, as well as in the conversation in
> > this thread
>
> That's fine.

Done.

> > - however that leaves a rather large gap for argument, and a devil's
> > advocate argument that would be a veritable chasm, a Grand Canyon.
> > Which would detract from the purpose of this thread, and allow you to
> > avoid accepting a hard answer
> > --- In that case, you have to accept that that is not permitted
> > - the alternative is [d], you retract the doc, and come back when you
> > have a coherent one, that is sensible enough (not perfect) to model
> > from, to make reasonable decisions re what the Relational
> > { Work | Variant | Manifestation | Item }.PK
> > should be.
> >
> > Over to you.
>
> I'm afraid I don't know other resources to refer you to. Appendix I has
> some examples that might be interesting to browse, though.

Ok. So no devil's advocate arguments. Only real world arguments. And per the detail above, any unresolved or contradictory article (eg. Anna Christie) will be resolved, and THAT it is unresolved in the magical mystical mysterious world of idiots will NOT be a basis for barracking for them. If you do, by that very act, you place yourself in the same category, and I cannot help you.

> > Correct me if I am wrong, but the real intent, the real value of this
> > thread, is going to be the discussion that occurs AFTER I give
> > a Relation Key that Identifies a movie. Ie. a good Relational Key vs
> > a surrogate, wrt the charges you have made at the top of this thread.
>
> Yes, that might be an interesting comparison. But I'd be happy to stop
> after resolving the specific question I have posed.
>
> > If, on the other hand, your question is really "how does one determine
> > a Relational Key for a real world implementation, from this incoherent
> > document", then let's confirm that that is the case, and let's have
> > that discussion. It depends on your initial or progressed intent:
> > have a discussion re:
> > q1. RFS/surrogate vs RDB/Relational Key
> > -- wherein a model with zero integrity, that does everything for
> > everyone, has already been rejected
> > or
> > q2. What is a good Relational Key for this problem
> >
> > Which is why it is over to you.
>
> First and foremost, the question is always the same: how do you identify
> a movie (i.e., q2)?

Ok.

But the thread is under the head of [q1], even if we do not have that discussion. So [q2] is primary, and [q1] is the context, the backdrop, secondary. We cannot divorce ourselves from the claims that you made in the initial post, re what you can do in an RFS.

Cheers
Derek

Derek Ignatius Asirvadem

Jan 8, 2020, 10:12:16 PM
First submission. Obviously I have done far more work than this, this is just what I am willing to show the "senior user", in order to foster discussion.

Ambiguities and contradictions and circular workflow paths resolved, as detailed in the previous post.

(
I don't have separate "conceptual model"; "logical model"; "physical model" (neither does /ERwin/), they are just points of progression in a project. So even though I have a progressed "logical model", I am submitting only the "conceptual model".
)

As such, we are not yet discussing Keys, but we are certainly evaluating, contemplating:

How is each fact Identified in the real world ?

How do we Identify each fact ?

With some regard to the downstream effect.

> Labelling, naming, a thing is very very very important. The name carries meaning. From the progress thus far, /Work/ is an idiotic label. I am tending towards /Concept/.

Did I say /very/ ?

Obviously, if I use their terms, then I must mean what they mean. Eg. Item.

Where the facts I have determined are different to their confused mess, I can't use their terms. Logical & meaningful names used.

> > > //different language versions shot at the same time, released
> > > simultaneously//
> > >
> > > 3. Different Languages is not a Work, because a human can conceive of
> > > something in just one language (those who conceive of things in
> > > multiple languages can be safely assigned to an asylum). The Language
> > > would be the Language of the Country or of the Creator. Nevertheless,
> > > I am willing to accept that multiple languages is an INTENT of the
> > > Work.

Each different language is a different Realisation, /shot/ means physicalisation in a particular medium. May be a Variant.

If and when the Concept becomes established, that fact can be registered.

> > > 4. Shooting is a Manifestation (realisation), not a Work.

Realisation

> > > 5. Release is a Manifestation (realisation), or even an Item, not a Work.

Realisation, possibly worked back from an Item.

> So no devil's advocate arguments. Only real world arguments. And per the detail above, any unresolved or contradictory article (eg. Anna Christie) will be resolved, and THAT it is unresolved in the magical mystical mysterious world of idiots will NOT be a basis for barracking for them. If you do, by that very act, you place yourself in the same category, and I cannot help you.

I do not wish to limit what you say. But at least for part of evaluating this submission, please play the role of a real world archivist/librarian/cataloguer.

http://www.softwaregems.com.au/Documents/Article/Database/Movie%20Title/Movie%20Title%20Progression%20V0_1.pdf

---------------------------------------
User Progression (Early Workflow)
---------------------------------------

Register Collection
Register Items, which are Undetermined
Physical properties ... leading to logical properties
As and when the Realisation is determined ("abstraction" is identified), register Realisation
As and when the Concept is determined ("abstraction" is identified), register Concept
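
A minimal sketch of the progression above, in Python with the standard sqlite3 module. It is an illustration only, not the model in the PDF: every table name, column name and single-column key (realisation_key, concept_key, item_no) is a placeholder assumed for the example, since the real Keys have not been chosen yet. The one idea it shows is that an Item can be registered while Undetermined, and the Realisation and Concept are added only as and when they are determined.

```python
import sqlite3

# All names below are hypothetical placeholders, not the Keys of the model.
SCHEMA = """
CREATE TABLE collection  (collection_name TEXT PRIMARY KEY);
CREATE TABLE realisation (realisation_key TEXT PRIMARY KEY);
CREATE TABLE concept     (concept_key     TEXT PRIMARY KEY);

-- Every Item is registered against its Collection, Undetermined at first.
CREATE TABLE item (
    collection_name TEXT    NOT NULL REFERENCES collection (collection_name),
    item_no         INTEGER NOT NULL,
    physical_desc   TEXT    NOT NULL,
    PRIMARY KEY (collection_name, item_no)
);

-- A row exists here only once the Realisation of the Item is determined.
CREATE TABLE determined_item (
    collection_name TEXT    NOT NULL,
    item_no         INTEGER NOT NULL,
    realisation_key TEXT    NOT NULL REFERENCES realisation (realisation_key),
    PRIMARY KEY (collection_name, item_no),
    FOREIGN KEY (collection_name, item_no)
        REFERENCES item (collection_name, item_no)
);

CREATE TABLE concept_realisation (
    concept_key     TEXT NOT NULL REFERENCES concept (concept_key),
    realisation_key TEXT NOT NULL REFERENCES realisation (realisation_key),
    PRIMARY KEY (concept_key, realisation_key)
);
"""

def open_catalogue():
    """Create an empty in-memory catalogue with foreign keys enforced."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn

def undetermined_items(conn):
    """Return (collection_name, item_no) for Items with no Realisation yet."""
    return conn.execute("""
        SELECT i.collection_name, i.item_no
        FROM item AS i
        WHERE NOT EXISTS (SELECT 1 FROM determined_item AS d
                          WHERE d.collection_name = i.collection_name
                            AND d.item_no = i.item_no)
        ORDER BY i.collection_name, i.item_no""").fetchall()

conn = open_catalogue()
# 1. Register Collection, then Items (Undetermined).
conn.execute("INSERT INTO collection VALUES ('Deceased estate')")
conn.execute("INSERT INTO item VALUES ('Deceased estate', 1, '35mm reel, unlabelled')")
print(undetermined_items(conn))
# 2. As and when the Realisation is determined, register it.
conn.execute("INSERT INTO realisation VALUES ('Dracula 1931 Spanish')")
conn.execute("INSERT INTO determined_item VALUES ('Deceased estate', 1, 'Dracula 1931 Spanish')")
# 3. As and when the Concept is determined, register it.
conn.execute("INSERT INTO concept VALUES ('Dracula')")
conn.execute("INSERT INTO concept_realisation VALUES ('Dracula', 'Dracula 1931 Spanish')")
print(undetermined_items(conn))
```

The determination is kept in its own table rather than as a Nullable column on the Item, which is one possible choice among several.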

Cheers
Derek

Nicola

Jan 9, 2020, 7:17:06 AM
On 2020-01-09, Derek Ignatius Asirvadem <derek.a...@gmail.com> wrote:

> I do not wish to limit what you say. But at least for part of
> evaluating this submission, please play the role of a real world
> archivist/librarian/cataloguer.
>
> http://www.softwaregems.com.au/Documents/Article/Database/Movie%20Title/Movie%20Title%20Progression%20V0_1.pdf
>
> ---------------------------------------
> User Progression (Early Workflow)
> ---------------------------------------
>
> Register Collection
> Register Items, which are Undetermined
> Physical properties ... leading to logical properties
> As and when the Realisation is determined ("abstraction" is identified), register Realisation
> As and when the Concept is determined ("abstraction" is identified), register Concept

That's fine: archivists usually start from the physical media, sometimes
without even knowing what it contains. So, working "bottom up"
definitely makes sense. The distinction between determined and
undetermined items is also sensible, from such perspective. The bottom
part of your diagram looks good to me so far.

Variant is a many-many association: that's ok. It's interesting that you
are taking the stance that it is an association between Realisations
rather than Concepts, so moving it down at the concrete level rather
than keeping it at the intellectual level, contrary to what FIAF, FRBR,
etc. do. This is debatable: probably, yes, you first recognize Variants
between Realisations, but later you may want to promote them as
different (abstract) Expressions of the same Concept. Note that FRBR
uses the term Expression rather than Variant (as you say, naming is very
important).

I have more comments on that, but I have run out of time.

Nicola

Derek Ignatius Asirvadem

Jan 9, 2020, 8:50:56 AM
> On Thursday, 9 January 2020 23:17:06 UTC+11, Nicola wrote:
> > On 2020-01-09, Derek Ignatius Asirvadem <derek.a...@gmail.com> wrote:
>
> [...]
>
> I have more comments on that, but I have run out of time.

I will wait for the rest before responding.

Well if you're going into all those aspects, then I will give you what I have. I previously gave you just the introduction to the basic structure, because it IS different to FRBR; FIAF; etc, and I thought that THAT would be the focus of the initial discussion.

http://www.softwaregems.com.au/Documents/Article/Database/Movie%20Title/Movie%20Title%20Progression%20V0_2.pdf

Non-Identifying relations are not shown (clutter at this stage, next stage is choosing Keys, relevant after that).

I am showing the following as separate SubjectAreas:
Agent ... Person ... Corp ... Collective
Country ... Address

Cheers
Derek

Nicola

Jan 9, 2020, 10:41:29 AM
Ok, that's much more than I asked for.

One question is whether establishing Variants can be done without an
overarching Concept already established. If that weren't the case (as
I believe), probably Variant should be moved under Concept Realisation.
As the model stands now, you may relate Variants of Realisations of
distinct Concepts.

In your model, would the FIAF example we have talked about:

Dracula (USA, 1931, Tod Browning, Spanish and English)

be a single instance of Concept, with two instances of Realisations (the
Spanish and English versions)? That would make sense to me, given the
circumstances in which it was filmed.

As another example (p. 29 from FIAF manual) various re-edited releases
of Blade Runner (1982, 1986, 1992, 2007) would be different Realisations
of the same Concept, each a Variant of another (or of the others),
right?

But then, each such Realisation would also have different
"manifestations" (using the FIAF term for now): theatrical releases,
VHS, DVD, Blu-rays, etc., each with their own technical specifications
and possibly other minor differences (e.g., subtitles, extra dubbing,
different titles, etc.). It seems to me that currently your model cannot
accommodate those, unless you consider each and every change a new
Realisation. I'd say that you need another entity between Realisation
and Item ("Medium/Media"?).

Which would make the backbone of your model a four-level hierarchy as
the original (Concept -> Realisation -> "Medium" -> Item), but with
a few important differences:

- only one "intellectual" entity (Concept); everything else is backed by
something concrete (this would remove the confusion existing between
Work and Variant in the FIAF manual);
- all entities (possibly except for "Medium") are independent (well, the
FIAF manual does not mention this explicitly, but I'd say that it
implies that the only independent entity is the top-level Work);
- Variant is a fifth key entity, and is an associative entity at the
"concrete", not "intellectual", level.

Do you agree to the above?

Nicola

Derek Ignatius Asirvadem

Jan 9, 2020, 6:34:28 PM
> On Thursday, 9 January 2020 23:17:06 UTC+11, Nicola wrote:
> > On 2020-01-09, Derek Ignatius Asirvadem <derek.a...@gmail.com> wrote:
>
> > http://www.softwaregems.com.au/Documents/Article/Database/Movie%20Title/Movie%20Title%20Progression%20V0_1.pdf
> >
> > ---------------------------------------
> > User Progression (Early Workflow)
> > ---------------------------------------
> >
> > Register Collection
> > Register Items, which are Undetermined
> > Physical properties ... leading to logical properties
> > As and when the Realisation is determined ("abstraction" is identified), register Realisation
> > As and when the Concept is determined ("abstraction" is identified), register Concept
>
> That's fine: archivists usually start from the physical media, sometimes
> without even knowing what it contains. So, working "bottom up"
> definitely makes sense.

I know, I am following your paper. Add one table Workflow subordinate to EstateItem, and you have it.

> The distinction between determined and
> undetermined items is also sensible, from such perspective. The bottom
> part of your diagram looks good to me so far.
>
> Variant is a many-many association: that's ok. It's interesting that you
> are taking the stance that it is an association between Realisations

First, there seems to be a bit of confusion, part of which I caused, so let me correct that.

> > > Labelling, naming, a thing is very very very important. The name carries meaning. From the progress thus far, /Work/ is an idiotic label. I am tending towards /Concept/.

> Did I say /very/ ?

> Obviously, if I use their terms, then I must mean what they mean. Eg. Item.

> Where the facts I have determined are different to their confused mess, I can't use their terms. Logical & meaningful names used.

My Item is the same as theirs, so I have named it the same.

I cringe when I look at Variant now, because I do use that name, but it is not the same at all. To them it is a THING (full set of attributes). To us it is a relation between two Realisations, a simple Fact about two Realisations.

> rather than Concepts, so moving it down at the concrete level rather
> than keeping it at the intellectual level, contrary to what FIAF, FRBR,
> etc. do.

(I would not use the word "contrary" or "contradictory". I have no problem at all appearing to "contradict" insanity, but that is not the point: insanity IS self-contradictory, and contradicts reality, so anything that is closely related to reality will, by definition, "contradict" insanity.)

> This is debatable:

Yes. The debate is welcomed (that is the back-and-forth on which the iterations in the modelling exercise are predicated). But as per the Four Laws, we must have resolution of each point, not non-resolution.

> probably, yes, you first recognize Variants
> between Realisations,

(At this stage E-R, we are evaluating the tables and relations, with the Identifying relations only, the Keys [although not specified] being the main consideration. They signify ownership, belonging, Matter. The whole signifies the Form under which the Matter exists. Which is why I need to consider all the tables and relations before we decide the Key for Realisation.)

To be clear:
- I recognise Realisation A, add all the attributes
- then recognise Realisation B, add all the attributes
--- at that point I don't have a Concept
- then recognise that Realisation B (only because I added it second) is a Variant of Realisation A, so I register that fact

> but later you may want to promote them as
> different (abstract) Expressions of the same Concept.

(I don't have a table for that.)

No. Later when I determine the Concept for Realisation A, I add the Concept A, all its attributes, and then add ConceptRealisation A::A.

I don't need to /register ConceptRealisation for Realisation B/ because they are already related at the concrete level. I don't need a ConceptVariant table.

(If I married your sister, I am related to you because of your relation to your sister, I don't have to add a relation /Derek is related to Nicola/.)

Also, I need just one Variant row A::B, not two (B::A). Because they are two sides of the one coin, not two coins. ("Deferred constraint" types usually argue at this point because they register two rows, and then agonise over the quandary.)
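
A minimal sketch of the one-row point, in Python with the standard sqlite3 module. The single-column realisation_key is a placeholder assumed for the example (the real Key for Realisation has not been decided), and the trigger is just one possible way to refuse the mirror-image row, not a technique taken from this thread. The fact is registered once, in the direction it was recognised, and read from either side.

```python
import sqlite3

# Hypothetical names; realisation_key stands in for the undecided real Key.
SCHEMA = """
CREATE TABLE realisation (realisation_key TEXT PRIMARY KEY);

-- One row per fact: variant_key is a Variant of realisation_key.
CREATE TABLE variant (
    realisation_key TEXT NOT NULL REFERENCES realisation (realisation_key),
    variant_key     TEXT NOT NULL REFERENCES realisation (realisation_key),
    PRIMARY KEY (realisation_key, variant_key),
    CHECK (realisation_key <> variant_key)
);

-- Refuse B::A when A::B is already registered (same coin, other side).
CREATE TRIGGER variant_one_row BEFORE INSERT ON variant
WHEN EXISTS (SELECT 1 FROM variant
             WHERE realisation_key = NEW.variant_key
               AND variant_key     = NEW.realisation_key)
BEGIN
    SELECT RAISE(ABORT, 'Variant already registered in the other direction');
END;
"""

def open_db():
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn

def add_variant(conn, realisation_key, variant_key):
    """Register the fact once; return False if any constraint rejects it."""
    try:
        conn.execute("INSERT INTO variant VALUES (?, ?)",
                     (realisation_key, variant_key))
        return True
    except sqlite3.IntegrityError:
        return False

def variants_of(conn, key):
    """Read the single row from either side of the relation."""
    rows = conn.execute(
        "SELECT variant_key FROM variant WHERE realisation_key = ? "
        "UNION "
        "SELECT realisation_key FROM variant WHERE variant_key = ?",
        (key, key)).fetchall()
    return sorted(row[0] for row in rows)

conn = open_db()
conn.executemany("INSERT INTO realisation VALUES (?)", [("A",), ("B",)])
print(add_variant(conn, "A", "B"))  # B recognised as a Variant of A
print(add_variant(conn, "B", "A"))  # the mirror-image row is refused
print(variants_of(conn, "A"), variants_of(conn, "B"))
```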

The alternative (Variant under Concept instead of under Realisation) would be limiting, because the Concept of A and B has to be recognised, and Concept A registered first. I am taking it that we work from the concrete up, to the intellectual. The Philosopher says, we recognise the Matter first, via the senses (physical attributes of a thing) before we can conceptualise the Form, the intellectual, non-material "soul" that drives the thing, that gives it the power to act.

Second, yes, it is my "stance", but the point is, I am recording facts about the real world, according to the Four Laws; Science; etc. They are in rainbow unicorn land, with innumerable ambiguities and contradictions. Therefore the two are not readily comparable. And "stance" is not applicable, it is not a personal thing, I am supposed to be a clear-eyed observer of reality, and a designer.

What we need to confirm is, is it true in the real world, that a thing has to exist (Realisation) before it can be determined to be a variation of another thing that exists (Realisation). The Spanish/English versions of Dracula. The examples of Blade Runner.

At that point, the Concept may not exist. Usually doesn't.

Or is it true that the whole tree for a single Realisation A is built, and therefore the Concept A exists, before Realisation B comes along and we need to register it. That is the implied workflow in FRBR/FIAF/IFSA/etc. That is false in your real world requirement for non-feature movie archival.

----

> Note that FRBR
> uses the term Expression rather than Variant (as you say, naming is very
> important).

And FIAF uses Variant.

----

Run out of time. More, later.

Cheers
Derek

Derek Ignatius Asirvadem

Jan 10, 2020, 1:03:08 AM
> On Friday, 10 January 2020 02:41:29 UTC+11, Nicola wrote:
> > On 2020-01-09, Derek Ignatius Asirvadem <derek.a...@gmail.com> wrote:
> >> On Thursday, 9 January 2020 23:17:06 UTC+11, Nicola wrote:
> >
> > http://www.softwaregems.com.au/Documents/Article/Database/Movie%20Title/Movie%20Title%20Progression%20V0_2.pdf
> >
> > Non-Identifying relations are not shown (clutter at this stage, next
> > stage is choosing Keys, relevant after that).
> >
> > I am showing the following as separate SubjectAreas:
> > Agent ... Person ... Corp ... Collective
> > Country ... Address
>
> Ok, that's much more than I asked for.

Not really. They are all done, from previous projects. We only need to work out roles.

> One question is whether establishing Variants can be done without an
> overarching Concept already established. If that weren't the case (as
> I believe), probably Variant should be moved under Concept Realisation.

(I have already responded to that, awaiting your response.)

> As the model stands now, you may relate Variants of Realisations of
> distinct Concepts.

Yes. That is unlikely, but possible. There are several ways to prevent that, the obvious one is one we have already discussed: a constraint that calls a function. We need such a function anyway, to check that the attempted AddVariant is valid (Realisations are in fact close enough to be a Variant, or not. Eg Language; length; no of cast; etc.)
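
A minimal sketch of such a check, in Python with the standard sqlite3 module. Here the function is called by the code that adds the Variant rather than by a declared constraint, which is a simplification for the example. All names (variant_is_valid, add_variant, the single-column keys) are hypothetical, and the rule covers only the Concept test; the "close enough" tests on Language, length, cast, etc. are left out.

```python
import sqlite3

# Hypothetical names and placeholder single-column keys throughout.
SCHEMA = """
CREATE TABLE concept     (concept_key     TEXT PRIMARY KEY);
CREATE TABLE realisation (realisation_key TEXT PRIMARY KEY);
CREATE TABLE concept_realisation (
    concept_key     TEXT NOT NULL REFERENCES concept (concept_key),
    realisation_key TEXT NOT NULL REFERENCES realisation (realisation_key),
    PRIMARY KEY (concept_key, realisation_key)
);
CREATE TABLE variant (
    realisation_key TEXT NOT NULL REFERENCES realisation (realisation_key),
    variant_key     TEXT NOT NULL REFERENCES realisation (realisation_key),
    PRIMARY KEY (realisation_key, variant_key)
);
"""

def open_db():
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn

def concepts_of(conn, realisation_key):
    """Set of Concepts already registered for a Realisation (may be empty)."""
    rows = conn.execute(
        "SELECT concept_key FROM concept_realisation WHERE realisation_key = ?",
        (realisation_key,))
    return {row[0] for row in rows}

def variant_is_valid(conn, realisation_key, variant_key):
    """Reject only when both sides have Concepts and they share none."""
    if realisation_key == variant_key:
        return False
    a = concepts_of(conn, realisation_key)
    b = concepts_of(conn, variant_key)
    return not a or not b or bool(a & b)

def add_variant(conn, realisation_key, variant_key):
    if not variant_is_valid(conn, realisation_key, variant_key):
        raise ValueError("Realisations belong to distinct Concepts")
    conn.execute("INSERT INTO variant VALUES (?, ?)",
                 (realisation_key, variant_key))

conn = open_db()
conn.executemany("INSERT INTO concept VALUES (?)",
                 [("Dracula",), ("Blade Runner",)])
conn.executemany("INSERT INTO realisation VALUES (?)",
                 [("Dracula 1931 English",), ("Dracula 1931 Spanish",),
                  ("Blade Runner 1982",)])
conn.executemany("INSERT INTO concept_realisation VALUES (?, ?)",
                 [("Dracula", "Dracula 1931 English"),
                  ("Blade Runner", "Blade Runner 1982")])
# The Spanish version has no Concept registered yet, so this is allowed.
add_variant(conn, "Dracula 1931 English", "Dracula 1931 Spanish")
print(variant_is_valid(conn, "Dracula 1931 English", "Blade Runner 1982"))
```

Note that the check passes whenever either Realisation has no Concept registered, so it does not force the Concept to exist before the Variant is added.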

But do not let that detract from the discussion (whether Variant is more correct under Concept or ConceptRealisation).

> In your model, would the FIAF example we have talked about:
>
> Dracula (USA, 1931, Tod Browning, Spanish and English)
>
> be a single instance of Concept, with two instances of Realisations (the
> Spanish and English versions)?

Yes.

> That would make sense to me, given the
> circumstances in which it was filmed.
>
> As another example (p. 29 from FIAF manual) various re-edited releases
> of Blade Runner (1982, 1986, 1992, 2007) would be different Realisations
> of the same Concept, each a Variant of another (or of the others),
> right?

Yes.

General point. I think that whole page 24, when understood without the FRBR/FIAF schizo mindset, demonstrates that each Variant is first a Realisation. They explain things backwards, typical of schizos: Work -> Variant -> Manifestation. And Manifestation relates back up again to Variant: Manifestation -> Variant. Eg. virtually all attributes are optional. Gee I wonder why. So let's get what they are actually doing: they are insisting on a framework (physical, coz there ain't no logic, and there ain't no logical Keys), with all attributes as optional. Same as your initial post, same as the OO/ORM crowd, which is an RFS with all fields Nullable. So really, they have no basis for insisting on the framework.

One cannot define a Concept until one has the thing (Realisation) that the Concept lies within, registers it, and then views it, and then determines the Concept. (Or accepts the Concept given by an Authority, a critic, etc.) Same as, one cannot define a Realisation until after the Item has been registered and examined.

> But then, each such Realisation would also have different
> "manifestations" (using the FIAF term for now): theatrical releases,
> VHS, DVD, Blu-rays, etc., each with their own technical specifications
> and possibly other minor differences (e.g., subtitles, extra dubbing,
> different titles, etc.). It seems to me that currently your model cannot
> accommodate those, unless you consider each and every change a new
> Realisation. I'd say that you need another entity between Realisation
> and Item ("Medium/Media"?).

(That is exactly the kind of discussion we need to have. No problem at all changing the model, there is no limit on the iterations.)

So my determination from all the docs was this. The Realisation is a "work", the thing that is produced at the end of a project. Definitely not the finished product (one DVD), but the 42 reels of movie files in a computer, of all the takes. Plus the script; deployment orders; etc. The things in the war chest, when all the takes are completed, but post-production has not started. Whatever the producer insures. That is why I chose Realisation, as in, the project has been realised.

As the docs require (confusedly), different edits would be different Realisations. A dubbing in another language would be a different Realisation. Etc.

(I understand your intent re Medium.)

So why is the finished product (one DVD or one movie reel) not an Item ? (We need to go beyond the definitions in the FRBR/FIAF docs, to take them only as ambiguous and confused, unresolved, considerations.)

At this point we could say, oooh. A dubbing in another language (with no other changes) is not a separate Realisation, but a second Item. Which makes a hell of a lot more sense than the FRBR/FIAF definition.

I would go so far as to say, they can edit as much as they like, but unless they actually create a finished product (an Item) all those edits are irrelevant to us (they are relevant to the production company; their inventory; their workflow).

I would ask, what is the precise distinction between Item and the proposed Medium ? It seems to me to be 1::1.

To sum up this point.
- Realisation is the fullness of a defined project; a production, that did take place. It is concrete but excludes the product (hah, I just figured out a better name for Item !)
- Item (new name Product) is the finished product, actually the product of post-production.
- Item is classified by Medium (not what you meant, but a simple reference table).

Dracula would still be two Realisations because (a) they are two languages (b) two separate takes (productions done at the same time). With one Item (Product) each. If an Item comes out in three different media, that would be three Items.

Blade Runner would still be four Realisations, with one Item (Product) each.
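
A minimal sketch of that summary, in Python with the standard sqlite3 module. It assumes, purely for illustration, that an Item (Product) is identified by its Realisation plus its Medium, and that Medium is a plain reference table; the single-column realisation_key and the media values are placeholders, not decided Keys.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE realisation (realisation_key TEXT PRIMARY KEY);
CREATE TABLE medium      (medium_code     TEXT PRIMARY KEY);  -- reference table

-- Assumed Key: the same Realisation on three media gives three Items.
CREATE TABLE item (
    realisation_key TEXT NOT NULL REFERENCES realisation (realisation_key),
    medium_code     TEXT NOT NULL REFERENCES medium (medium_code),
    PRIMARY KEY (realisation_key, medium_code)
);
""")

conn.executemany("INSERT INTO realisation VALUES (?)",
                 [("Dracula 1931 English",), ("Dracula 1931 Spanish",)])
conn.executemany("INSERT INTO medium VALUES (?)",
                 [("35mm",), ("VHS",), ("DVD",)])
conn.executemany("INSERT INTO item VALUES (?, ?)",
                 [("Dracula 1931 English", "35mm"),
                  ("Dracula 1931 English", "VHS"),
                  ("Dracula 1931 English", "DVD"),
                  ("Dracula 1931 Spanish", "35mm")])

def items_per_realisation(conn):
    """Map each Realisation to the number of Items registered for it."""
    return dict(conn.execute(
        "SELECT realisation_key, COUNT(*) FROM item GROUP BY realisation_key"))

print(items_per_realisation(conn))
```

Under this assumed Key, two copies of the same Realisation on the same Medium cannot be told apart, so it covers only the summary as stated here.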

> Which would make the backbone of your model a four-level hierarchy as
> the original (Concept -> Realisation -> "Medium" -> Item),

Yes. And the backbone is visually rendered by Identifying relations, the Keys, the solid lines.

> but with
> a few important differences:
>
> - only one "intellectual" entity (Concept); everything else is backed by
> something concrete (this would remove the confusion existing between
> Work and Variant in the FIAF manual);

No kidding. Purposely done.

> - all entities (possibly except for "Medium") are independent (well, the
> FIAF manual does not mention this explicitly, but I'd say that it
> implies that the only independent entity is the top-level Work);

If /Independent/ has the normal meaning, not the RM/IDEF1X meaning, yes. But why is independence relevant ?

If /Independent/ has the RM/IDEF1X meaning, no. Only Concept & Realisation are Independent.

> - Variant is a fifth key entity, and is an associative entity at the
> "concrete", not "intellectual", level.

Associative at the concrete level, yes.
Not sure what you mean by the rest. What is a "fifth key entity":
- a fifth column in the Movie Key, fifth because we have four ? No. Variant has two four-column Keys, one for the referenced, one for the referencing.
- a fifth entity to the four ? Yes.

----

Item is no longer the FRBR/FIAF Item. For a better name for Item, would you prefer Product, or Article ?

Cheers
Derek

Nicola

Jan 10, 2020, 3:38:29 AM
Derek,
quick reply on only a couple of the points you raise. I will come back
on the rest later.

On 2020-01-10, Derek Ignatius Asirvadem <derek.a...@gmail.com> wrote:
> I would ask, what is the precise distinction between Item and the
> proposed Medium ? It seems to me to be 1::1.

You must distinguish between a particular edition of a movie (e.g., the
Final cut's version of Blade Runner distributed by, say, Warner Bros in
2007 on DVD featuring also extra material such as interviews with the
actor and backstage footage), and the single items of that edition (my
copy of the DVD and your copy of the DVD). The former is a Manifestation
in FIAF's terminology and the latter is an Item. Such distinction is
important.

Each Realisation may have many editions through time and on different
media. And, in general, the single physical items of a particular
edition (my copy of the DVD and your copy of the DVD) may have different
characteristics for some reason, which I may want to record (e.g., the
state of conservation of the physical disc, the fact that my copy has
a torn cover while your copy is still intact in its envelope, etc.).

> Item is no longer the FRBR/FIAF Item. For a better name for Item,
> would you prefer Product, or Article ?

With the distinction above, Item would retain its meaning. So, perhaps:
Edition and Item.

Concept -> Realisation -> Edition -> Item
            |    |
           Variant
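
A minimal sketch of that backbone, assuming SQLite through Python's sqlite3 module. The key columns (concept_id, realisation_no, edition_no, item_no) are hypothetical placeholders, since the real Keys have not been discussed yet. It only illustrates that each level carries its parent's key and that per-copy facts, such as the state of conservation, belong to Item and not to Edition.

```python
import sqlite3

# Placeholder keys only (concept_id, *_no); the real Keys are not decided yet.
SCHEMA = """
CREATE TABLE concept (
    concept_id TEXT NOT NULL PRIMARY KEY
);
CREATE TABLE realisation (
    concept_id     TEXT    NOT NULL,
    realisation_no INTEGER NOT NULL,
    PRIMARY KEY (concept_id, realisation_no),
    FOREIGN KEY (concept_id) REFERENCES concept (concept_id)
);
CREATE TABLE edition (
    concept_id     TEXT    NOT NULL,
    realisation_no INTEGER NOT NULL,
    edition_no     INTEGER NOT NULL,
    medium         TEXT    NOT NULL,   -- e.g. 'DVD'
    PRIMARY KEY (concept_id, realisation_no, edition_no),
    FOREIGN KEY (concept_id, realisation_no)
        REFERENCES realisation (concept_id, realisation_no)
);
CREATE TABLE item (
    concept_id     TEXT    NOT NULL,
    realisation_no INTEGER NOT NULL,
    edition_no     INTEGER NOT NULL,
    item_no        INTEGER NOT NULL,
    conservation   TEXT    NOT NULL,   -- per-copy fact, e.g. 'torn cover'
    PRIMARY KEY (concept_id, realisation_no, edition_no, item_no),
    FOREIGN KEY (concept_id, realisation_no, edition_no)
        REFERENCES edition (concept_id, realisation_no, edition_no)
);
"""


def open_catalogue() -> sqlite3.Connection:
    """Create an empty in-memory catalogue with foreign keys enforced."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
    conn.executescript(SCHEMA)
    return conn


conn = open_catalogue()
conn.execute("INSERT INTO concept VALUES ('Blade Runner')")
conn.execute("INSERT INTO realisation VALUES ('Blade Runner', 1)")
conn.execute("INSERT INTO edition VALUES ('Blade Runner', 1, 1, 'DVD')")
# Two physical copies of the same Edition, each with its own condition.
conn.execute("INSERT INTO item VALUES ('Blade Runner', 1, 1, 1, 'torn cover')")
conn.execute("INSERT INTO item VALUES ('Blade Runner', 1, 1, 2, 'intact')")
```

With foreign keys enforced, an Item cannot be registered for an Edition that does not exist, which is the sense in which the relations are Identifying.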

Nicola

Nicola

Jan 10, 2020, 11:26:35 AM
On 2020-01-10, Derek Ignatius Asirvadem <derek.a...@gmail.com> wrote:
>> One question is whether establishing Variants can be done without an
>> overarching Concept already established. If that weren't the case (as
>> I believe), probably Variant should be moved under Concept Realisation.
>
> (I have already responded to that, awaiting your response.)

No need for change, at least for now. I agree that it's not the main
focus of the discussion.

>> As the model stands now, you may relate Variants of Realisations of
>> distinct Concepts.
>
> Yes. That is unlikely, but possible. There are several ways to
> prevent that

Sure.

> But do not let that subtract from the discussion (that Variant is more
> correct under Concept or ConceptRealisation).

Ok.

> General point. [Details...]

Agreed.

> So why is the finished product (one DVD or one movie reel) not an Item
> ? (We need to go beyond the definitions in the FRBR/FIAF docs, to
> take them only as ambiguous and confused, unresolved, considerations.)

I think I have clarified that in my previous post. The finished product
(the Product, or the Edition) may exist in multiple physical copies (the
Items).

> At this point we could say, oooh. A dubbing in another language (with
> no other changes) is not a separate Realisation, but a second Item.

A second Product/Edition, yes. E.g., the Final cut version of Blade
Runner (Realisation) may be distributed as a Product/Edition (in
English) in the US (as many physical Items that you can buy at a store)
and as a second Product/Edition (dubbed in Italian, French, Spanish,
German) in Europe.

> Which makes a hell of a lot more sense than the FRBR/FIAF definition.

Yes.

> I would go so far as to say, they can edit as much as they like, but
> unless they actually create a finished product (an Item) all those
> edits are irrelevant to us (they are relevant to the production
> company; their inventory; their workflow).

Yes.

> I would ask, what is the precise distinction between Item and the
> proposed Medium ? It seems to me to be 1::1.

Answered before.

> To sum up this point.
> - Realisation is the fullness of a defined project; a production, that
> did take place. It is concrete but excludes the product (hah, I just
> figured out a better name for Item !)
> - Item (new name Product) is the finished product, actually the
> product of post-production.
> - Item is classified by Medium (not what you meant, but a simple
> reference table).

Fine. I interpret "Item" in the last sentence as Product/Edition.

> Dracula would still be two Realisations because (a) they are two
> languages (b) two separate takes (productions done at the same time).
> With one Item (Product) each. If an Item comes out in three different
> media, that would be three Items.

Agreed, with "Item" replaced by "Product"/"Edition". Besides, that
Dracula would be one Concept (it's basically the same script)—which,
again, makes way more sense than considering it two distinct Works as
FIAF suggests.

> Blade Runner would still be four Realisations, with one Item (Product)
> each.

Yes.

>> Which would make the backbone of your model a four-level hierarchy as
>> the original (Concept -> Realisation -> "Medium" -> Item),
>
> Yes. And the backbone is visually rendered by Identifying relations,
> the Keys, the solid lines.
>
>> but with
>> a few important differences:
>>
>> - only one "intellectual" entity (Concept); everything else is backed by
>> something concrete (this would remove the confusion existing between
>> Work and Variant in the FIAF manual);
>
> No kidding. Purposely done.
>
>> - all entities (possibly except for "Medium") are independent (well, the
>> FIAF manual does not mention this explicitly, but I'd say that it
>> implies that the only independent entity is the top-level Work);
>
> If /Independent/ has the normal meaning, not the RM/IDEF1X meaning,
> yes. But why is independence relevant ?
>
> If /Independent/ has the RM/IDEF1X meaning, no. Only Concept
> & Realisation are Independent.

I mean independent in the IDEF1X sense. In your model (Movie Title
Progression V0_2) Item is independent, (Item Determined is not).

>> - Variant is a fifth key entity, and is an associative entity at the
>> "concrete", not "intellectual", level.
>
> Associative at the concrete level, yes.
> Not sure what you mean by the rest. What is a "fifth key entity":

I meant a fifth entity to the four.

All this reasoning makes sense to me. I wait for your answer re the
Product/Edition thing. If we converge on that, I think we can finally
start talking about the keys.

Nicola

Nicola

Jan 10, 2020, 11:37:11 AM
On 2020-01-10, Nicola <nic...@nohost.org> wrote:
>> Blade Runner would still be four Realisations, with one Item (Product)
>> each.
>
> Yes.

Correction: Blade Runner would still be four Realisations, with one *or
more* Products each.

One of those Products may be considered "the master" (the 42 reels
etc.), but I wouldn't insist much on such a notion, because there are
cases (mostly outside the feature film context) where it's difficult or
even impossible to identify a "master" product. That might be an
additional, optional, fact. As one might add additional facts to relate
Products among each other.

Nicola

Derek Ignatius Asirvadem

Jan 10, 2020, 12:21:23 PM
Nicola

Quick answer on a couple of points only. I will get to the detail tomorrow.

> On Friday, 10 January 2020 19:38:29 UTC+11, Nicola wrote:
>
> You must distinguish between a particular edition of a movie (e.g., the
> Final cut's version of Blade Runner distributed by, say, Warner Bros in
> 2007 on DVD featuring also extra material such as interviews with the
> actor and backstage footage), and the single items of that edition (my
> copy of the DVD and your copy of the DVD). The former is a Manifestation
> in FIAF's terminology and the latter is an Item. Such distinction is
> important.
>
> Each Realisation may have many editions through time and on different
> media. And, in general, the single physical items of a particular
> edition (my copy of the DVD and your copy of the DVD) may have different
> characteristics for some reason, which I may want to record (e.g., the
> state of conservation of the physical disc, the fact that my copy has
> a torn cover while your copy is still intact in its envelope, etc.).

Edition.
Better than Product.
We retain the definition of Edition, that it is the output of the project/production cycle, as distinct from the Item, which is the output of the merchandising cycle (if no merchandising, as in non-feature films, then just one Item).

> > Item is no longer the FRBR/FIAF Item. For a better name for Item,
> > would you prefer Product, or Article ?
>
> With the distinction above, Item would retain its meaning. So, perhaps:
> Edition and Item.

We would be comparable to FRBR/FIAF/IFSA only loosely, in that we have given it due consideration. I would not say that we "retain its meaning" on any issue. We have tight definitions (unambiguous; no circular references; implementation-ready; etc), they do not.

> Concept -> Realisation -> Edition -> Item
> | |
> Variant

Nice

> On Saturday, 11 January 2020 03:26:35 UTC+11, Nicola wrote:
>
> > If /Independent/ has the RM/IDEF1X meaning, no. Only Concept
> > & Realisation are Independent.
>
> I mean independent in the IDEF1X sense. In your model (Movie Title
> Progression V0_2) Item is independent, (Item Determined is not).

Sorry. My mistake. (My head was already in V0_3)

> All this reasoning makes sense to me. I wait for your answer re the
> Product/Edition thing.

Edition is perfect.

> If we converge on that, I think we can finally
> start talking about the keys.

"Finally" is not a problem. Three iterations at this level (concepts; entities; relations) before Keys is nothing. We are defining the logical structure of the database, yes ? Concepts (first) are Identified by Keys (second), the sequence cannot be reversed, yes ?

Please confirm
- the ER diagram I sent you is /semantic/
- you can //read// all the semantics from the diagram

The next one will be a bit more intense.

----

Concept. In V0.2 a Realisation can be registered without having determined its Concept. My understanding has progressed. Is that realistic ? When would a Realisation NOT have a determined Concept ? Even including the non-feature film registry.

Cheers
Derek

Derek Ignatius Asirvadem

Jan 10, 2020, 1:52:37 PM
Nicola

> On Saturday, 11 January 2020 03:37:11 UTC+11, Nicola wrote:

I've taken all your comments into account, the next iteration is ready. I don't have the words in response. Tomorrow. Since we are flipping night-day, instead of losing a cycle, here it is.
- I have changed Concept to Story
- since we are ready for Keys, I have included the elements for it.

http://www.softwaregems.com.au/Documents/Article/Database/Movie%20Title/Movie%20Title%20Progression%20V0_3.pdf

Style:
- Data hierarchies, Identifying Relations, ie. the solid lines, are shown vertically.
- Non-Identifying relations are attached to the side, so as not to interfere.

Cheers
Derek

Nicola

Jan 10, 2020, 4:02:58 PM
On 2020-01-10, Derek Ignatius Asirvadem <derek.a...@gmail.com> wrote:
>Please confirm
>- the ER diagram I sent you is /semantic/
>- you can //read// all the semantics from the diagram

Yes and yes.

>Concept. In V0.2 a Realisation can be registered without having
>determined its Concept. My understanding has progressed. Is that
>realistic ? When would a Realisation NOT have a determined Concept
>? Even including the non-feature film registry.

I don't have a definitive answer. The very term "realisation" prompts
the question: a realisation... of what? I think you can make
a Realisation subordinate to a Concept—perhaps one that is described as
tentative/draft/not validated.

>> On Saturday, 11 January 2020 03:37:11 UTC+11, Nicola wrote:
>
> I've taken all your comments into account, the next iteration is
> ready. I don't have the words in response. Tomorrow. Since we are
> flipping night-day, instead of losing a cycle, here it is.
> - I have changed Concept to Story
> - since we are ready for Keys, I have included the elements for it.

> http://www.softwaregems.com.au/Documents/Article/Database/Movie%20Title/Movie%20Title%20Progression%20V0_3.pdf

* Realisation.PK ( Country, Creator, Year, Title )

That wouldn't be enough for the Dracula movie, unless Title includes
some differentiating element (e.g., the primary key of Title
includes a TitleComment attribute):

Country  Creator       Year  Title    TitleComment
US       Tod Browning  1931  Dracula  [English cast]
US       Tod Browning  1931  Dracula  [Spanish cast]

TitleComment would be an empty string for most movies.
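
To make the point concrete, here is a small sketch, assuming SQLite through Python's sqlite3 module. The table and column names are illustrative, not taken from the V0_3 model. It loads the two Dracula rows into a table keyed on ( Country, Creator, Year, Title ) and then into one whose key also includes TitleComment.

```python
import sqlite3

# The two 1931 Dracula rows from the table above.
DRACULA = [
    ("US", "Tod Browning", 1931, "Dracula", "[English cast]"),
    ("US", "Tod Browning", 1931, "Dracula", "[Spanish cast]"),
]


def build_realisation(with_comment: bool) -> sqlite3.Connection:
    """Create the table, optionally adding title_comment to the primary key."""
    key = "country, creator, year, title"
    if with_comment:
        key += ", title_comment"
    conn = sqlite3.connect(":memory:")
    conn.execute(f"""
        CREATE TABLE realisation (
            country       TEXT    NOT NULL,
            creator       TEXT    NOT NULL,
            year          INTEGER NOT NULL,
            title         TEXT    NOT NULL,
            title_comment TEXT    NOT NULL DEFAULT '',  -- empty for most movies
            PRIMARY KEY ({key})
        )""")
    return conn


def load(conn: sqlite3.Connection) -> bool:
    """Return True if both Dracula rows fit under the table's key."""
    try:
        conn.executemany("INSERT INTO realisation VALUES (?, ?, ?, ?, ?)", DRACULA)
        return True
    except sqlite3.IntegrityError:
        return False


four_column_key_ok = load(build_realisation(with_comment=False))
five_column_key_ok = load(build_realisation(with_comment=True))
```

Under the four-column key the second row is rejected as a duplicate; with TitleComment in the key both rows are accepted.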

Titles are complex (see FIAF, §1.3.2 and App. A). You have Alternate
Title at the Realisation level, but maybe you need something similar at
the intellectual level. E.g., if you're recording data about Russian or
Chinese movies, perhaps you want the transliterated title along with the
one in the original alphabet.

Other comments:

- Country: although determining "the" country of a movie may be
difficult (think multi-national productions), FIAF gives several
criteria (p. 39–41) to choose one. So this attribute seems fine.
- Creator: that may be individual or collective. Ok.
- Year: let's say it's ok, but keep in mind FIAF, §1.3.4. Quoting:

"For Works, the date is typically related to events such as its
creation, availability (i.e., publication, release, distribution,
broadcast or transmission) or registration (e.g., for copyright or
intellectual property purposes)."

And also:

"a Work may have a production date of 1962, a copyright date of
December 1963, and a first release date of January 1964."

So, which year is that? A conventional one? Or would you qualify it
with a DateType?

* Realisation.PK ( Language, Creator, Year, Title )

Language as part of the primary key is questionable, IMO. There are
silent movies, movies in multiple languages, etc. Country is better: at
least there are already detailed directions from FIAF on how to choose
one.

Nicola

Derek Ignatius Asirvadem

Jan 12, 2020, 1:21:53 AM
> On Friday, 10 January 2020 19:38:29 UTC+11, Nicola wrote:
>
> Concept -> Realisation -> Edition -> Item
> | |
> Variant

Does that mean Variant is:
- a child of Realisation (I have that)
AND
- Variant is a child of Concept (I do not have that) ?
If the latter, no. That is not necessary. Since we have the former, the latter is a simple JOIN. In either direction. If that is not clear, please ask.
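
A minimal sketch of that JOIN, assuming SQLite through Python's sqlite3 module. Every key here is a single placeholder column, not the four-column Keys of the model, and the sample rows are illustrative only. Variant references Realisation twice (the referencing and the referenced), and Concept is reached through Realisation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE concept (
    concept_key TEXT NOT NULL PRIMARY KEY
);
CREATE TABLE realisation (
    realisation_key TEXT NOT NULL PRIMARY KEY,
    concept_key     TEXT NOT NULL REFERENCES concept (concept_key)
);
-- Variant is a child of Realisation only; it has no direct link to Concept.
CREATE TABLE variant (
    realisation_key TEXT NOT NULL REFERENCES realisation (realisation_key),
    variant_of_key  TEXT NOT NULL REFERENCES realisation (realisation_key),
    PRIMARY KEY (realisation_key, variant_of_key)
);
INSERT INTO concept VALUES ('Blade Runner');
INSERT INTO realisation VALUES ('BR theatrical', 'Blade Runner');
INSERT INTO realisation VALUES ('BR final cut', 'Blade Runner');
INSERT INTO variant VALUES ('BR final cut', 'BR theatrical');
""")


def concept_of_each_variant(conn):
    """Variant -> Realisation -> Concept."""
    return conn.execute("""
        SELECT v.realisation_key, r.concept_key
        FROM variant AS v
        JOIN realisation AS r ON r.realisation_key = v.realisation_key
        ORDER BY v.realisation_key""").fetchall()


def variants_under(conn, concept_key):
    """Concept -> Realisation -> Variant (the other direction)."""
    return conn.execute("""
        SELECT v.realisation_key, v.variant_of_key
        FROM realisation AS r
        JOIN variant AS v ON v.realisation_key = r.realisation_key
        WHERE r.concept_key = ?
        ORDER BY v.realisation_key""", (concept_key,)).fetchall()
```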

(Also more, later, when we discuss Title.)

> On Saturday, 11 January 2020 03:26:35 UTC+11, Nicola wrote:
>
> All this reasoning makes sense to me. I wait for your answer re the
> Product/Edition thing. If we converge on that, I think we can finally
> start talking about the keys.

General point re scope.
1. I appreciate that you may consider some parts of the data model “not central” to the discussion. But those are precisely the entities that need to be considered ... because they form the movie Key (which is the central article). That non-consideration, or tendency to narrow down something to only what the entity is (ie. movie table; therefore movie key only; everything else is not relevant to movie) is anti-Relational.

More precisely anti-logical because (FOPC -> RM) comes before the RM. Because everything in a Relational database is related. If we do not properly define {Country} {Creator} {Year} {Title}, and the keys for each, we have no chance of getting the key for {Realisation} right.

2. As stated, I am not bound to the FRBR/FIAF fiasco that is passed off as a “standard”. While it is distinctly better than the pig excreta that is heavily marketed by the Date; Darwen; Fagin; et al, it is still squarely in the myopic OO/ORM mindset (refer your initial devil’s advocate propositions) with no integrity and no logical relations, but that everything can be “defined” or “linked” outside the database. As agreed, what is important, is that the data model reflects the real world, and does so accurately. Thus when you counter with “FIAF does this”, I would ask that instead you consider what the real world requirement is.

Eg. in V0.3 I have given two ways of getting the Story (previously Concept) into the Realisation. That is for discussion. I don’t have a fixed idea, and you are closer to the [real world] data. Not as good as a real user, but certainly closer than I.

> On Saturday, 11 January 2020 08:02:58 UTC+11, Nicola wrote:
> >On 2020-01-10, Derek Ignatius Asirvadem <derek.a...@gmail.com> wrote:
> >Please confirm
> >- the ER diagram I sent you is /semantic/
> >- you can //read// all the semantics from the diagram
>
> Yes and yes.

Excellent. Now for the next iteration, I will include keys, one step more than E-R. Since you know that ERD is totally inadequate for that level, what do you use; what do you teach ? If you don’t have one, I will provide IDEF1X/Table-Key level.

> >Concept. In V0.2 a Realisation can be registered without having
> >determined its Concept. My understanding has progressed. Is that
> >realistic ? When would a Realisation NOT have a determined Concept
> >? Even including the non-feature film registry.
>
> I don't have a definitive answer. The very term "realisation" prompts
> the question: a realisation... of what?

Exactly what I mean.

> I think you can make
> a Realisation subordinate to a Concept

Ok. Then it is an Identifying (defining) relation, a parent of Realisation (not as erected in V0.3).
Realisation is a realisation of 1 Concept
or
Realisation is a realisation of 1 Story

> —perhaps one that is described as
> tentative/draft/not validated.

Ok, that would be an attribute of the Realisation [of the Concept]
Realisation.ConceptConfirmationLevel { NotValidated | | Tentative | Draft }

But due to the previous point, the column is mandatory.
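
As a sketch only, assuming SQLite through Python's sqlite3 module, a mandatory column restricted to the values listed above could be declared as follows. The table is reduced to a placeholder key plus this one attribute; no value beyond the three named above is assumed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE realisation (
        realisation_key            TEXT NOT NULL PRIMARY KEY,  -- placeholder key
        concept_confirmation_level TEXT NOT NULL               -- mandatory
            CHECK (concept_confirmation_level IN
                   ('NotValidated', 'Tentative', 'Draft'))
    )""")


def register(key, level) -> bool:
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO realisation VALUES (?, ?)", (key, level))
        return True
    except sqlite3.IntegrityError:
        return False
```

A missing level (NULL) and a level outside the list are both rejected by the declaration itself, without application code.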

> * Realisation.PK ( Country, Creator, Year, Title )
>
> That wouldn't be enough for the Dracula movie, unless Title doesn't
> include some differentiating element (e.g., the primary key of Title
> includes a TitleComment attribute):
>
> Country Creator Year Title TitleComment
> US Tod Browning 1931 Dracula [English cast]
> US Tod Browning 1931 Dracula [Spanish cast]

Yes.

> TitleComment would be an empty string for most movies.

Well, there are many such examples of differentiators in the FIAF Manual. Different cast is just one such. So let's name it
Differentiator
It cannot be constrained quite as much as a normal column is constrained, because it is very much "up to the librarian".

And this is not to be confused with Language (spoken in the movie).

> Titles are complex (see FIAF, §1.3.2 and App. A).

Original or first Edition Title.
The long form of the one original or first Edition Title will be in TitleAlt
The reduced form TitleReduced will be in the PK.

> You have Alternate
> Title at the Realisation level, but maybe you need something similar at
> the intellectual level. E.g., if you're recording data about Russian or
> Chinese movies, perhaps you want the transliterated title along with the
> one in the original alphabet.

Ok. But why is that at the Concept level ? It is an action, a "doing", as opposed to the Concept of the "doing". I do not expect to have a full discussion re Existence ... but as per my question [is Concept mandatory to Realisation], Concept is the genus, Realisation is the species.

> Other comments:
>
> - Country: although determining "the" country of a movie may be
> difficult (think multi-national productions), FIAF gives several
> criteria (p. 39–41) to choose one. So this attribute seems fine.

(It appears the copy of the FIAF manual that you gave me via link, and the copy you are quoting from, are different. § 1.3.3 Country of Reference is on p32.)

Yes.
ISO CountryCode.
Along with the Year, it will differentiate Czechoslovakia vs Czech Republic/Slovakia

> - Creator: that may be individual or collective. Ok.

Agent{ Individual | Collective | Corporation }

> - Year: let's say it's ok, but keep in mind FIAF, §1.3.4. Quoting:
>
> "For Works, the date is typically related to events such as its
> creation, availability (i.e., publication, release, distribution,
> broadcast or transmission) or registration (e.g., for copyright or
> intellectual property purposes)."

Year of first Edition, or year of production if no Edition.

> And also:
>
> "a Work may have a production date of 1962, a copyright date of
> December 1963, and a first release date of January 1964."
>
> So, which year is that? A conventional one? Or would you qualify it
> with a DateType?

Year in the PK: as above.
CopyrightDate is an Attribute of Concept, Realisation.
ReleaseDate is an attribute of Edition.
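
The rule for the Year in the PK can be sketched as a small function. This is an illustration in Python, assuming that "Year of first Edition" means the earliest release year among the Editions; the function and parameter names are hypothetical.

```python
from typing import Sequence


def key_year(production_year: int, edition_release_years: Sequence[int]) -> int:
    """Year of first Edition, or year of production if there is no Edition."""
    if edition_release_years:
        return min(edition_release_years)
    return production_year


# FIAF's example: produced 1962, copyright December 1963, first released January 1964.
# The copyright date plays no part in the key; it stays an ordinary attribute.
released = key_year(1962, [1964])
unreleased = key_year(1962, [])
```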

(FIAF has a stupid XML approach, typical of OO/ORM. They warn about Date representation without understanding that there is a difference between storage and representation. And actually give us recommendations re indexing. Hilarious.)

> * Realisation.PK ( Language, Creator, Year, Title )
>
> Language as part of the primary key is questionable, IMO.

Well, I am proposing it for discussion. No decision has been made, it can only be made with confidence after discussion.

> There are
> silent movies, movies in multiple languages, etc. Country is better: at
> least there are already detailed directions from FIAF on how to choose
> one.

So the starting point for discussion of the proposed Language as an element in the Movie Key, is beyond the Matter that FRBR/FIAF have thought of, thus not an alternative to that which is in the manual.

> silent movies

Language of the Country (origin)

> movies in multiple languages,

One Language has to be primary, usually the country of audience which is also country of origin.

Discussion
-------------

Realisation.PK[ Language, Creator, Year, Title, Differentiator ]

Concept[ English, GB, 1897, Bram Stoker ]
Realisation[ English, US, 1931, Tod Browning, English cast ]
Realisation[ Spanish, US, 1931, Tod Browning, Spanish cast ]
...
Realisation[ English, US, 1958, Jimmy Sangster, "" ] = Christopher Lee
...

Country does not inform us re the Language of the movie (spoken in the movie), whereas Language more often does. If we use Country in the PK, Language has to be an attribute; if we use Language, Country has to be an attribute.

In this eg, Differentiator is not the same as Language, even if it seems so. The Fact of a different cast is not relevant //to the PK//, because it would be two different Realisations due to having to populate the Cast table with different Actors (Persons). That is to say, if we used Language in the PK instead of Country, Differentiator is not required (does not apply in other cases).

That is not as important as the next point. But first, I have to mention why I changed Concept to Story, and why I have erected it as under Title (ie. TitleReduced as PK, full Title as attribute). The thing is this, the Title for Concept should be the Title for Realisation. And if there is any Title in Realisation other than the Concept.Title, then it is one of the TitleAlts, and attributed Preferred or Original; etc.

Second point. I have erected Title -> Story -> StoryRealisation because the Title = Story, and if the above point is understood, then Title = Realisation. What I am saying in V0.3 is, Title is a higher level, in the intellectual realm, than Story. And it is that higher-level Title, and not Story-without-a-title, that is Realised.

**In addition**, I have drawn a loop Title -> Story -> Story Realisation, so that we can evaluate that alternative, without having to give you two models. The model as it stands is otherwise invalid, because it has that circular reference. We have to choose one xor the other.

Third, and please evaluate this point third. Now what is the Title exactly ?
Quoting FIAF Manual § 1.3.2 p31 (these are given as Variants, to us these are Realisations, but I am asking you to focus on the Title as an Independent entity):
"
Аленький цветочек (USSR, 1952, Lev Atamanov)47
Аленький цветочек – Title of the Work
Alenkiy tsvetochek: Alternative (transliterated) title of Work/ Variant (Preferred title if systems don’t use Cyrillic)
Feuerrotes Blümchen – Variant title – Dubbed (German)
The Scarlet Flower – Variant title – Dubbed (English)
"
Why are those NOT one Title in four Languages ?

How can anyone render those Titles without the prefix Language ?

Thus Title, wherever it is stored, is Identified as [ Language, TitleReduced ]

Thus Title.PK is [ Language, TitleReduced ] =TitleFull, ...

Thus Story.PK is [ Language, TitleReduced ] = Story

Thus Realisation.PK is [ Language, TitleReduced, Creator, Year [, Differentiator] ] = Title{ Original | Preferred | Screen | etc }, Story, ...

(Realisation.Story is different to Story.Story)

(Language here, throughout, not Country.)
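
A minimal sketch of Title identified as [ Language, TitleReduced ], assuming SQLite through Python's sqlite3 module. The four rows are the FIAF example quoted above. The label "Russian (Latin script)" for the transliterated form, and using the title as given for both TitleReduced and TitleFull, are assumptions made only for illustration, because the reduction rule has not been stated.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE title (
        language      TEXT NOT NULL,
        title_reduced TEXT NOT NULL,
        title_full    TEXT NOT NULL,
        PRIMARY KEY (language, title_reduced)
    )""")

# One Title rendered in four "Languages" (the transliteration label is an assumption).
SCARLET_FLOWER = [
    ("Russian", "Аленький цветочек"),
    ("Russian (Latin script)", "Alenkiy tsvetochek"),
    ("German", "Feuerrotes Blümchen"),
    ("English", "The Scarlet Flower"),
]
conn.executemany(
    "INSERT INTO title VALUES (?, ?, ?)",
    [(lang, text, text) for lang, text in SCARLET_FLOWER],
)


def titles_in(language: str) -> list:
    """All reduced titles registered for one language."""
    rows = conn.execute(
        "SELECT title_reduced FROM title WHERE language = ? ORDER BY title_reduced",
        (language,),
    )
    return [row[0] for row in rows]
```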

Please discuss. Please note my /General point re scope/ at the top.

The purpose of V0.3 was to introduce these concepts, without giving the relations, so that we can have this discussion. If I showed the relations, it would have been too intense, too much of a difference from V0.2. For the next iteration, I will erect whatever is reasonable, based on your responses to this post.

----

Please indicate your preference: Concept or Story.

Do we need Characters for Story ?

Cheers
Derek


Derek Ignatius Asirvadem

Jan 13, 2020, 3:51:58 AM
> On Sunday, 12 January 2020 17:21:53 UTC+11, Derek Ignatius Asirvadem wrote:
>
> Now for the next iteration, I will include keys, one step more than E-R. Since you know that ERD is totally inadequate for that level, what do you use; what do you teach ? If you don’t have one, I will provide IDEF1X/Table-Key level.

> The purpose of V0.3 was to introduce these concepts, without giving the relations, so that we can have this discussion.

As I was progressing the model, it became plain to me that I have raised many new issues in the previous post, and they cannot be reasonably contemplated without a data model. I understand that academics suppress graphic models and insist on text (DDL), but that is not reasonable at this "conceptual" stage, or the next "logical" stage ... and from the progress in this thread thus far, it appears you accept a graphic model.

Here is the next iteration, remaining in the conceptual stage, probably the last before one that defines the Keys.

http://www.softwaregems.com.au/Documents/Article/Database/Movie%20Title/Movie%20Title%20Progression%20V0_4.pdf

The grey tables are Reference or Lookup tables. In this increment, I have introduced most of them, drawing some, but not all, of their Non-Identifying relations. The short form (small squares) are:
- [C]lassifier
- [D]iscriminator
- other [R]eference

We are up to 45 tables plus 20 Reference. No screaming so far. We must be progressing well.

When you respond, which will be against either V0.3 or V0.4 please indicate which.

Cheers
Derek

Nicola

Jan 14, 2020, 5:51:51 AM
On 2020-01-12, Derek Ignatius Asirvadem <derek.a...@gmail.com> wrote:
>> On Friday, 10 January 2020 19:38:29 UTC+11, Nicola wrote:
>>
>> Concept -> Realisation -> Edition -> Item
>> | |
>> Variant
>
> Does that mean Variant is:
> - a child of Realisation (I have that)

Yes

> AND
> - Variant is a child of Concept (I do not have that) ?

No. That was just a reproduction of your model.

>> On Saturday, 11 January 2020 03:26:35 UTC+11, Nicola wrote:
>>
>> All this reasoning makes sense to me. I wait for your answer re the
>> Product/Edition thing. If we converge on that, I think we can finally
>> start talking about the keys.
>
> General point re scope.

Understood.

> Thus when you
> counter with “FIAF does this”, I would ask that instead you consider
> what the real world requirement is.

Sure. When I say "FIAF does this" or "the standard says that", I am not
using an "appeal to authority" argument: I am just pointing out how
people with deep knowledge of the domain have addressed some issues,
perhaps unsatisfactorily (for various reasons). I believe in the
authority of the argument, not vice versa.

> Since you know that ERD is totally inadequate for that
> level, what do you use; what do you teach ?

IDEF1X.

>> I think you can make
>> a Realisation subordinate to a Concept
>
> Ok. Then it is an Identifying (defining) relation, a parent of
> Realisation (not as erected in V0.3).
> Realisation is a realisation of 1 Concept
> or
> Realisation is a realisation of 1 Story

Ok. I see that in V0.4.

>> —perhaps one that is described as
>> tentative/draft/not validated.
>
> Ok, that would be an attribute of the Realisation [of the Concept]
> Realisation.ConceptConfirmationLevel { NotValidated | | Tentative | Draft }
>
> But due to the previous point, the column is mandatory.

Yes.

>> * Realisation.PK ( Country, Creator, Year, Title )
>>
>> That wouldn't be enough for the Dracula movie, unless Title doesn't
>> include some differentiating element (e.g., the primary key of Title
>> includes a TitleComment attribute):
>>
>> Country Creator Year Title TitleComment
>> US Tod Browning 1931 Dracula [English cast]
>> US Tod Browning 1931 Dracula [Spanish cast]
>
> Yes.
>
>> TitleComment would be an empty string for most movies.
>
> Well, there are many such examples of differentiators in the FIAF
> Manual. Different cast is just one such. So let's name it
> Differentiator
> It cannot be constrained quite as much as a normal column is
> constrained, because it is very much "up to the librarian".

Yes.

> And this is not to be confused with Language (spoken in the movie).

Indeed.

>> Titles are complex (see FIAF, §1.3.2 and App. A).
>
> Original or first Edition Title.
> The long form of the one original or first Edition Title will be in TitleAlt
> The reduced form TitleReduced will be in the PK.

Ok. After reflecting on how Titles and Stories should be related and
comparing your v0.3 and v0.4 models, I think that you should reintroduce
an independent Title entity in v0.4. These are the essential facts to be
modelled, IMO:

1. A Title identifies (is the "main"/"reduced" title of) 0-N Stories.
2. A Story has one identifying/main/reduced title.
3. A Title may be an Alternate Title.
4. A Story has zero or more Alternate Titles.

AFAICS, in v0.4 you are modelling only 4 (Story Title).
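
The four facts can be sketched as three tables, assuming SQLite through Python's sqlite3 module. This is only one possible reading, not the v0.4 model: the column names are illustrative, and story_no is a hypothetical tie-breaker introduced because fact 1 allows several Stories under one main Title.

```python
import sqlite3

SCHEMA = """
CREATE TABLE title (
    language      TEXT NOT NULL,
    title_reduced TEXT NOT NULL,
    PRIMARY KEY (language, title_reduced)
);
-- Facts 1 and 2: a Story has exactly one main Title; a Title heads 0-N Stories.
CREATE TABLE story (
    language      TEXT    NOT NULL,
    title_reduced TEXT    NOT NULL,
    story_no      INTEGER NOT NULL,   -- hypothetical tie-breaker
    PRIMARY KEY (language, title_reduced, story_no),
    FOREIGN KEY (language, title_reduced)
        REFERENCES title (language, title_reduced)
);
-- Facts 3 and 4: a Story has 0-N Alternate Titles, each of which is itself a Title.
CREATE TABLE story_alternate_title (
    language          TEXT    NOT NULL,
    title_reduced     TEXT    NOT NULL,
    story_no          INTEGER NOT NULL,
    alt_language      TEXT    NOT NULL,
    alt_title_reduced TEXT    NOT NULL,
    PRIMARY KEY (language, title_reduced, story_no, alt_language, alt_title_reduced),
    FOREIGN KEY (language, title_reduced, story_no)
        REFERENCES story (language, title_reduced, story_no),
    FOREIGN KEY (alt_language, alt_title_reduced)
        REFERENCES title (language, title_reduced)
);
"""


def open_titles() -> sqlite3.Connection:
    """Create the three empty tables with foreign keys enforced."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # off by default in SQLite
    conn.executescript(SCHEMA)
    return conn


conn = open_titles()
conn.execute("INSERT INTO title VALUES ('Russian', 'Аленький цветочек')")
conn.execute("INSERT INTO title VALUES ('English', 'The Scarlet Flower')")
conn.execute("INSERT INTO story VALUES ('Russian', 'Аленький цветочек', 1)")
conn.execute(
    "INSERT INTO story_alternate_title VALUES "
    "('Russian', 'Аленький цветочек', 1, 'English', 'The Scarlet Flower')"
)
```

Here the English title exists as a Title with no Story of its own (the "0" in fact 1) while serving as an Alternate Title of the Russian-titled Story.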

> (It appears the copy of the FIAF manual that you gave me via link, and
> the copy you are quoting from, is different. § 1.3.3 Country of
> Reference is on p32.)

Ah, right, I was looking at a draft copy.

>> - Creator: that may be individual or collective. Ok.
>
> Agent{ Individual | Collective | Corporation }
>
>> - Year: let's say it's ok, but keep in mind FIAF, §1.3.4. Quoting:
>>
>> "For Works, the date is typically related to events such as its
>> creation, availability (i.e., publication, release, distribution,
>> broadcast or transmission) or registration (e.g., for copyright or
>> intellectual property purposes)."
>
> Year of first Edition, or year of production if no Edition.
>
>> And also:
>>
>> "a Work may have a production date of 1962, a copyright date of
>> December 1963, and a first release date of January 1964."
>>
>> So, which year is that? A conventional one? Or would you qualify it
>> with a DateType?
>
> Year in the PK: as above.
> CopyrightDate is an Attribute of Concept, Realisation.
> ReleaseDate is an attribute of Edition.

I think that these may be fine.

>> * Realisation.PK ( Language, Creator, Year, Title )
>>
>> Language as part of the primary key is questionable, IMO.
>
> Well, I am proposing it for discussion. No decision has been made, it
> can only be made with confidence after discussion.

It might be part of an alternate key, probably.

>> There are
>> silent movies, movies in multiple languages, etc. Country is better: at
>> least there are already detailed directions from FIAF on how to choose
>> one.
>
> So the starting point for discussion of the proposed Language as an
> element in the Movie Key, is beyond the Matter that FRBR/FIAF have
> thought of, thus not an alternative to that which is in the manual.
>
>> silent movies
>
> Language of the Country (origin)

Not sure how well languages map to countries. There might always be
a "main" language.

>> movies in multiple languages,
>
> One Language has to be primary, usually the country of audience which
> is also country of origin.

Seems reasonable.

> Discussion
> -------------
>
> Realisation.PK[ Language, Creator, Year, Title, Differentiator ]
>
> Concept[ English, GB, 1897, Bram Stoker ]
> Realisation[ English, US, 1931, Tod Browning, English cast ]
> Realisation[ Spanish, US, 1931, Tod Browning, Spanish cast ]
> ...
> Realisation[ English, US, 1958, Jimmy Sangster, "" ] = Christopher Lee

I think you forgot to mention the title there, otherwise it seems ok.

I would say that the Concept is [English, US, 1931, Tod Browning,
Dracula], i.e., the movie is a different concept than the novel it was
inspired from. In the same vein, I wouldn't put all Dracula movies under
one Concept, as they might be wildly different stories (horror, comedy,
romantic, etc.).

> Country does not inform us re the Language of the movie (spoken in the
> movie), whereas Language more often does. If we use Country in the PK
> , Language has to be an attribute, if we use Language, Country has to
> be an attribute.

Both pieces of information have to be stored, as they are somewhat
orthogonal (e.g., a US producer that decides to film in Spanish). It
seems to me that Country is what is usually reported beside the title
and the creator. So, I would be inclined to use Country in the PK (with
Language possibly in some AK).

> In this eg, Differentiator is not the same as Language, even if it
> seems so. The Fact of a different cast is not relevant //to the PK//,
> because it would be two different Realisations due to having to
> populate the Cast table with different Actors (Persons).

Well, mine was just an example. One could use I and II instead (as IMDb
has done for Anna Christie, 1930). Or something else.

> That is to
> say, if we used Language in the PK instead of Country, Differentiator
> is not required (does not apply in other cases).

Exactly. There is a wide range of possible situations. There are
certainly cases of movies in English with the same title made in US,
Canada, UK or Australia.

> That is not as important as the next point. But first, I have to
> mention why I changed Concept to Story, and why I have erected it as
> under Title (ie. TitleReduced as PK, full Title as attribute). The
> thing is this, the Title for Concept should be the Title for
> Realisation. And if there is any Title in Realisation other than the
> Concept.Title, then it is one of the TitleAlts, and attributed
> Preferred or Original; etc.

Ok.

> Second point. I have erected Title -> Story -> StoryRealisation
> because the Title = Story, and if the above point is understood, then
> Title = Realisation. What I am saying in V0.3 is, Title is a higher
> level, in the intellectual realm, than Story. And it is that
> higher-level Title, and not Story-without-a-title, that is Realised.

I think this is correct. That was changed in v0.4, but I think you
should revise that.

> **In addition**, I have drawn a loop Title -> Story -> Story
> Realisation, so that we can evaluate that alternative, without having
> to give you two models. The model as it stands is otherwise invalid,
> because it has that circular reference. We have to choose one xor the
> other.

I don't see any circular reference. Anyway, I think that this part
should be modeled as I have explained above.

> Third, and please evaluate this point third. Now what is the Title
> exactly ?
> Quoting FIAF Manual § 1.3.2 p31 (these are given as Variants, to us
> these are Realisations, but I am asking you to focus on the Title as
> an Independent entity):
> "
> Аленький цветочек (USSR, 1952, Lev Atamanov)47
> Аленький цветочек – Title of the Work
> Alenkiy tsvetochek: Alternative (transliterated) title of Work/ Variant (Preferred title if systems don’t use Cyrillic)
> Feuerrotes Blümchen – Variant title – Dubbed (German)
> The Scarlet Flower – Variant title – Dubbed (English)
> "
> Why are those NOT one Title in four Languages ?

That is one title in four languages and should be treated as such,
I believe. Title translations are not so easy, though: it is quite
common for a movie to have a title in one language that is totally
unrelated to the original title, e.g., Eternal Sunshine of the Spotless
Mind (2004) in Italian was distributed with a title that sounds like "If
you leave me, I'll erase you" (!).

Thinking about it, the Italian title would be an Alternate Title of
a Realisation or perhaps even an Edition (Italian dubbing) of the movie.
So, it should not be considered a "translation" of the Story's title.

In the example above, "Feuerrotes Blümchen" could be both (a) the German
translation (I think, I don't speak Russian or German) of the
original title of the Story, and (b) the title of a Realisation (or
Edition) of the movie.

What are dubbed movies? Realisations or Editions? Does the Cast include
dubbers?

> How can anyone render those Titles without the prefix Language ?

How about a title like 8½ (by Fellini, 1963)? Ok, you might prefix it
with Italian...

> Thus Title, wherever it is stored, is Identified as [ Language,
> TitleReduced ]

And script? How would you distinguish "Аленький цветочек" and
"Alenkiy tsvetochek"?

> Thus Title.PK is [ Language, TitleReduced ] =TitleFull, ...
>
> Thus Story.PK is [ Language, TitleReduced ] = Story
>
> Thus Realisation.PK is [ Language, TitleReduced, Creator, Year [,
> Differentiator] ] = Title{ Original | Preferred | Screen | etc },
> Story, ...
>
> (Realisation.Story is different to Story.Story)
>
> Language here, throughout, not Country.)
>
> Please discuss. Please note my /General point re scope/ at the top.

I'll respond after we have solved the points raised above.

> Please indicate your preference: Concept or Story.

Concept is more encompassing (e.g., a documentary might not have
a proper "storyline"). It's a very general term, but within the model it
is contextualized, so I think it is acceptable (Moving Image Concept
would be more accurate, but perhaps a bit too long).

> Do we need Characters for Story ?

Good question. In a fully developed model I would probably include them
(the set of characters of a Story does not necessarily coincide with any
of the sets of characters of its realisations), but I think that they
can be added at a later stage.

Nicola

Nicola

Jan 14, 2020, 9:44:42 AM
On 2020-01-14, Nicola <nic...@nohost.org> wrote:
>> Thus Title, wherever it is stored, is Identified as [ Language,
>> TitleReduced ]
>
> And script? How would you distinguish "Аленький цветочек" and
> "Alenkiy tsvetochek"?

Sorry, forget this. Lack of coffee struck back.

Nicola

Derek Ignatius Asirvadem

Jan 15, 2020, 4:48:03 AM
Nicola

We are getting into a nice rhythm, especially re the day-night difference and ability to turn around a response during each other's overnight. I will improve my posts as follows.
a. following yours, which are responses, I will give responses that are required to close any issues, only, and exclude any discussion that causes a next iteration
b. give the discussion from your response, which serves as an intro, and the next iteration, in a separate post.

-------------------------------------
This post is one such [a], re V0.4
-------------------------------------

> On Tuesday, 14 January 2020 21:51:51 UTC+11, Nicola wrote:
> > On 2020-01-12, Derek Ignatius Asirvadem <derek.a...@gmail.com> wrote:
> > > On Friday, 10 January 2020 19:38:29 UTC+11, Nicola wrote:
>
> > Thus when you
> > counter with “FIAF does this”, I would ask that instead you consider
> > what the real world requirement is.
>
> Sure. When I say "FIAF does this" or "the standard says that", I am not
> using an "appeal to authority" argument: I am just pointing out how
> people with deep knowledge of the domain have addressed some issues,
> perhaps unsatisfactorily (for various reasons). I believe in the
> authority of the argument, not vice versa.

Understood.

I don't have a problem with authority. It is a post-modern tactic to demean the authority (in order to elevate the "everyone is equal and my thoughts are as good as the authority" nonsense) just because it is the authority. Eg. real Standards; Codd; etc, are genuine authorities. UML; Date; Darwen; Fagin; et al; the Alice book; are not authorities ... the proof is, their work cannot be relied upon. That is separate to the gigantic fraud.

The diminution of an argument with a charge of "appealing to authority" or that it is "weak" has no logic in it. It is an attack with no substance, by someone obsessed with rebellion against authority. Try driving through a red light at peak hour. Real authorities are, well, real. We do not have to buy it, or pander to their destructive needs.

The FRBR/FIAF people may well have deep knowledge, but they do not follow the Four Laws of Thought; Science; etc. Their standard is ambiguous; confused. It cannot be relied upon, it cannot be readily used for an implementation, the same old considerations have to be repeated, with the addition that a resolution is achieved. Just look at the work that we (you and I) have to do, to get beyond what they have given us, to a resolution. I would say, a full 75% of what we are doing, is not data modelling per se but resolving the ambiguities and confusion that those docs have left us with, using data modelling (Codd, Relational, logical) as the tool or method to do that.

Actually, I love authority. God. Truth. Science. And so on down the hierarchy.

> Ok. After reflecting on how Titles and Stories should be related and
> comparing your v0.3 and v0.4 models, I think that you should reintroduce
> an independent Title entity in v0.4. These are the essential facts to be
> modelled, IMO:
>
> 1. A Title identifies (is the "main"/"reduced" title of) 0-N Stories.
> 2. A Story has one identifying/main/reduced title.
> 3. A Title may be an Alternate Title;
> 4. A Story has zero or more Alternate Titles.
>
> AFAICS, in v0.4 you are modelling only 4 (Story Title).

No. In V0.4, the Predicates touching Story are as follows. (The model itself does not give Keys but I have given five of them at the top right):
Language
Each Language is independent
Each Language is engendered by 1 Country
Each Language is primarily identified by ( Language )
Each Language identifies, and is expressed in, 1-to-n Stories

Agent
Each Agent is independent
Each Agent is hosted by 1 Country
...
Each Agent creates 0-to-n Stories
...

Story
Each Story is dependent on 1 Agent
Each Story is created by 1 Agent
Each Story is identified by 1 Agent
Each Story is dependent on 1 Language
Each Story is an expression of 1 Language
Each Story is identified by 1 Language
Each Story is primarily identified by ( Language, TitleReduced, Creator, Year )
Each Story is known as 1-to-n StoryTitles
...

Taking each in turn:
> 0. reintroduce an independent Title entity

The predicate for that is false. A Title does not exist independently. It exists only because a Story [with that Title] exists.

> 1. A Title identifies (is the "main"/"reduced" title of) 0-N Stories.

A ( Language, TitleReduced, Creator, Year ) identifies 1 Story

> 2. A Story has one identifying/main/reduced title.

Each Story is primarily identified by ( Language, TitleReduced, Creator, Year )
Each Story is known as 1-to-n StoryTitles
One StoryTitle is the TitleReduced in long form

> 3. A Title may be an Alternate Title;

The /rest/ in StoryTitle are Alternate Titles

> 4. A Story has zero or more Alternate Titles.

Yes. The 1 in the /1-to-n/ is the main Title un-reduced.
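
The predicates above can be sketched as a schema. This is only a minimal illustration in Python with SQLite, not the V0.4 model itself: the table layout, the TitleType column and its values, and the helper names add_story and add_title are assumptions. Only the key ( Language, TitleReduced, Creator, Year ) and the dependence of StoryTitle on Story come from the post.

```python
import sqlite3

# Table and column names are illustrative; only the key columns are taken
# from the predicates above. TitleType ('Main' / 'Alternate') is assumed.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
CREATE TABLE Story (
    Language     TEXT    NOT NULL,
    TitleReduced TEXT    NOT NULL,
    Creator      TEXT    NOT NULL,
    Year         INTEGER NOT NULL,
    PRIMARY KEY (Language, TitleReduced, Creator, Year)
);
CREATE TABLE StoryTitle (
    Language     TEXT    NOT NULL,
    TitleReduced TEXT    NOT NULL,
    Creator      TEXT    NOT NULL,
    Year         INTEGER NOT NULL,
    TitleType    TEXT    NOT NULL,
    Title        TEXT    NOT NULL,
    PRIMARY KEY (Language, TitleReduced, Creator, Year, Title),
    FOREIGN KEY (Language, TitleReduced, Creator, Year)
        REFERENCES Story (Language, TitleReduced, Creator, Year)
);
""")

def add_story(key):
    """Insert a Story identified by (Language, TitleReduced, Creator, Year)."""
    conn.execute("INSERT INTO Story VALUES (?, ?, ?, ?)", key)

def add_title(key, title_type, title):
    """Insert a StoryTitle; it can exist only under an existing Story."""
    conn.execute(
        "INSERT INTO StoryTitle VALUES (?, ?, ?, ?, ?, ?)",
        (*key, title_type, title),
    )

dracula = ("English", "Dracula", "Bram Stoker", 1897)
add_story(dracula)
add_title(dracula, "Main", "Dracula")  # the TitleReduced in long form

# A Title without a Story is rejected: Title does not exist independently.
try:
    add_title(("English", "No Such Story", "Nobody", 1900), "Main", "No Such Story")
    orphan_accepted = True
except sqlite3.IntegrityError:
    orphan_accepted = False
```

The foreign key covers the "Title exists only because a Story exists" half. The "1" in /1-to-n/ (every Story has at least its main title) is not enforced declaratively here; in this sketch it would have to be checked in the transaction that inserts the Story.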

> > Discussion
> > -------------
> >
> > Realisation.PK[ Language, Creator, Year, Title, Differentiator ]
> >
> > Concept[ English, GB, 1897, Bram Stoker ]
> > Realisation[ English, US, 1931, Tod Browning, English cast ]
> > Realisation[ Spanish, US, 1931, Tod Browning, Spanish cast ]
> > ...
> > Realisation[ English, US, 1958, Jimmy Sangster, "" ] = Christopher Lee
>
> I think you forgot to mention the title there, otherwise it seems ok.

Yes, sorry.

Cheers
Derek

Derek Ignatius Asirvadem

Jan 15, 2020, 4:59:15 AM
Nicola

> On Wednesday, 15 January 2020 20:48:03 UTC+11, Derek Ignatius Asirvadem wrote:
>
> We are getting into a nice rhythm, especially re the day-night difference and ability to turn around a response during each other's overnight. I will improve my posts as follows.
> a. following yours, which are responses, I will give responses that are required to close any issues, only, and exclude any discussion that causes a next iteration
> b. give the discussion from your response, which serves as an intro, and the next iteration, in a separate post.

------------------------------------------------
This post is one such [b], introducing V0.5
------------------------------------------------

http://www.softwaregems.com.au/Documents/Article/Database/Movie%20Title/Movie%20Title%20Progression%20V0_5.pdf

If your uni is anything like ours, it is a bit hard to find an A2 colour printer, but easy to find an A3 colour printer, so here it is in A3x2:
http://www.softwaregems.com.au/Documents/Article/Database/Movie%20Title/Movie%20Title%20Progression%20V0_5%20A3x2.pdf

> > Please indicate your preference: Concept or Story.
>
> Concept is more encompassing (e.g., a documentary might not have
> a proper "storyline"). It's a very general term, but within the model it
> is contextualized, so I think it is acceptable (Moving Image Concept
> would be more accurate, but perhaps a bit too long).

Done.

> > Do we need Characters for Story ?
>
> Good question. In a fully developed model I would probably include them
> (the set of characters of a Story does not necessarily coincide with any
> of the sets of characters of its realisations), but I think that they
> can be added at a later stage.

Done.

> >> silent movies
> >
> > Language of the Country (origin)
>
> Not sure how well languages map to countries. There might always be
> a "main" language.

Yes.
If I don't use Language as modelled, then the Pk would be ( Country, ... ). In that case, I need an Associative Table CountryLanguage.

> > Discussion [MINE CORRECTED]
> > -------------
> >
> > Realisation.PK[ Language, Creator, Year, Title, Differentiator ]
> >
> > Concept[ English, Bram Stoker, 1897, Dracula ]
> > Realisation[ English, Tod Browning, 1931, Dracula, English cast ]
> > Realisation[ Spanish, Tod Browning, 1931, Dracula, Spanish cast ]
> > ...
> > Realisation[ English, Jimmy Sangster, 1958, Dracula, "" ] = Christopher Lee

> I would say that the Concept is [English, US, 1931, Tod Browning,
> Dracula], i.e., the movie is a different concept than the novel it was
> inspired from.

Whoa.
The Concept is intellectual, not concrete.
So we need a Concept [ English, Tod Browning, 1931, Dracula ] that is derived from Concept [ English, Tod Browning, 1931, Dracula ]
For you (a non-theatrical movies-only system), the movie is concrete, it is the first Realisation of the Concept
For other systems, the 1897 book is the first Realisation of the first Concept, the 1931 movie is the first Realisation of the second Concept.

> In the same vein, I wouldn't put all Dracula movies under
> one Concept, as they might be wildly different stories (horror, comedy,
> romantic, etc.).

In general, I agree.
Realisation.Story may be the full script; etc.
Whereas Concept.Story is a shorter or simpler form, enough to define the Concept, but nowhere near a script.

In the specifics, for Realisations that are true to the Concept (not exactly, but directly inspired by), they will be under the original 1897 Story, and they will be differentiated by Genre; Cast; etc
For a Realisation that is wildly different, yes, it is a new Concept, derived from the original Concept.

That exposes another requirement: we need a simple hierarchy (single parent, like Node) in Concept. The wildly different comedy Concept [ ..., 2019, Dracula ] must refer to the original Concept [ English, Bram Stoker, 1897, Dracula ] as /derived from/.

----
How about this one:

Concept Play [ Greek, Pygmalion, Unknown, 350BC ] is rendered as:
Realisation Play [ Greek, Pygmalion, Unknown, 350BC, "" ]

Concept Play [ English, Pygmalion, W S Gilbert, 1871 ]
is derived from:
Concept Play [ Greek, Pygmalion, Unknown, 350BC ]
is rendered as:
Realisation Play [ English, Pygmalion, W S Gilbert, 1871, "" ]

Concept Play [ English, Pygmalion, G B Shaw, 1913 ]
is derived from:
Concept Play [ English, Pygmalion, W S Gilbert, 1871 ]
is rendered as:
Realisation Play [ English, Pygmalion, G B Shaw, 1913, "" ]

Concept Play [ English, My Fair Lady, A J Lerner, 1956 ]
is derived from:
Concept Play [ English, Pygmalion, G B Shaw, 1913 ]
is rendered as:
Realisation Play [ English, My Fair Lady, A J Lerner, 1956, "" ]
Realisation Movie [ English, My Fair Lady, A J Lerner, 1964, "" ]

You were right: we may need Variant at the Concept level (as well as the Realisation level). Ok, so for lineage, we need
a. Either Concept /is varied as/ Variant.Concept -- allows multiple parents
XOR
b. Concept /has variant / Concept -- single parent
I am tending towards the latter, a derivation can be from only one parent.
Done. For both Concept & Realisation.
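
Option [b] (single parent) can be sketched with the Pygmalion chain above. This is a hedged illustration only: the dict derived_from and the function lineage are hypothetical names, Year is kept as text so that "350BC" fits, and the keys simply copy the examples as posted.

```python
# Keys follow the layout used in the examples: (Language, Title, Creator, Year).
# Year is text here only so that "350BC" can be stored alongside the others.
GREEK = ("Greek", "Pygmalion", "Unknown", "350BC")
GILBERT = ("English", "Pygmalion", "W S Gilbert", "1871")
SHAW = ("English", "Pygmalion", "G B Shaw", "1913")
LERNER = ("English", "My Fair Lady", "A J Lerner", "1956")

# child -> parent ("is derived from"). A dict allows at most one parent per
# Concept, which is exactly the single-parent restriction of option [b].
derived_from = {
    GILBERT: GREEK,
    SHAW: GILBERT,
    LERNER: SHAW,
}

def lineage(key):
    """Return the chain from a Concept back to its original ancestor."""
    chain = [key]
    seen = {key}
    while chain[-1] in derived_from:
        parent = derived_from[chain[-1]]
        if parent in seen:  # guard against a malformed, cyclic hierarchy
            raise ValueError("cycle in derivation hierarchy")
        chain.append(parent)
        seen.add(parent)
    return chain

for concept in lineage(LERNER):
    print(concept)
```

Option [a] would replace the dict with a set of (child, parent) pairs, at which point the lineage is a graph rather than a chain.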

> > Country does not inform us re the Language of the movie (spoken in the
> > movie), whereas Language more often does. If we use Country in the PK
> > , Language has to be an attribute, if we use Language, Country has to
> > be an attribute.
>
> Both pieces of information have to be stored, as they are somewhat
> orthogonal (e.g., a US producer that decides to film in Spanish). It
> seems to me that Country is what is usually reported beside the title
> and the creator. So, I would be inclined to use Country in the PK (with
> Language possibly in some AK).

Ok, I will show that in the next model.
(Which means the above examples should now be read with Country substituted for Language).

> > In this eg, Differentiator is not the same as Language, even if it
> > seems so. The Fact of a different cast is not relevant //to the PK//,
> > because it would be two different Realisations due to having to
> > populate the Cast table with different Actors (Persons).
>
> Well, mine was just an example. One could use I and II instead (as IMDb
> has done for Anna Christie, 1930). Or something else.
>
> > That is to
> > say, if we used Language in the PK instead of Country, Differentiator
> > is not required (does not apply in other cases).
>
> Exactly. There is a wide range of possible situations. There are
> certainly cases of movies in English with the same title made in US,
> Canada, UK or Australia.

Ok, Country it is.

> > That is not as important as the next point. But first, I have to
> > mention why I changed Concept to Story, and why I have erected it as
> > under Title (ie. TitleReduced as PK, full Title as attribute). The
> > thing is this, the Title for Concept should be the Title for
> > Realisation. And if there is any Title in Realisation other than the
> > Concept.Title, then it is one of the TitleAlts, and attributed
> > Preferred or Original; etc.
>
> Ok.
>
> > Second point. I have erected Title -> Story -> StoryRealisation
> > because the Title = Story, and if the above point is understood, then
> > Title = Realisation. What I am saying in V0.3 is, Title is a higher
> > level, in the intellectual realm, than Story. And it is that
> > higher-level Title, and not Story-without-a-title, that is Realised.
>
> I think this is correct. That was changed in v0.4, but I think you
> should revise that.

It appears I have not been clear. I am not advocating Title as an entity. I am saying TitleReduced is the defining element of Concept:
Concept = Title
Title = Concept
V0.3 was primitive in relation to V0.4, which resolves the /Title is above Concept/ issue:
There is no Title without [at least one] Concept
There is no Concept without its defining Title

> > **In addition**, I have drawn a loop Title -> Story -> Story
> > Realisation, so that we can evaluate that alternative, without having
> > to give you two models. The model as it stands is otherwise invalid,
> > because it has that circular reference. We have to choose one xor the
> > other.
>
> I don't see any circular reference.

Sorry, I was not clear. V0.3:
Title -> Story -> StoryRealisation
Title -> Realisation -+ StoryRealisation
is not a true circular reference. Perhaps better called a loop. It is one of those gotchas that one watches out for during the modelling exercise, that has to be resolved. There should not be two paths to StoryRealisation, unless they are actually true, actually two separate predicates.
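
That check can be made mechanical. The following is a rough sketch, assuming the V0.3 dependencies are exactly the two paths listed above; the dict edges and the function paths are hypothetical names, and the function assumes the dependency graph has no true cycles.

```python
# Parent -> children, transcribed from the two V0.3 paths listed above.
edges = {
    "Title": ["Story", "Realisation"],
    "Story": ["StoryRealisation"],
    "Realisation": ["StoryRealisation"],
    "StoryRealisation": [],
}

def paths(src, dst):
    """List every dependency path from src to dst (graph assumed acyclic)."""
    if src == dst:
        return [[dst]]
    return [
        [src] + rest
        for child in edges.get(src, [])
        for rest in paths(child, dst)
    ]

found = paths("Title", "StoryRealisation")
for p in found:
    print(" -> ".join(p))

# More than one path means a loop that must be justified by two separate
# predicates, or resolved by dropping one of the paths.
needs_review = len(found) > 1
```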

> Anyway, I think that this part
> should be modeled as I have explained above.
>
> > Third, and please evaluate this point third. Now what is the Title
> > exactly ?
> > Quoting FIAF Manual § 1.3.2 p31 (these are given as Variants, to us
> > these are Realisations, but I am asking you to focus on the Title as
> > an Independent entity):

I have messed up there, using "Independent entity". I meant, evaluate Title independently, as a Fact, I did not mean actual Independent entity. In V0.3, Title is Story, Story is Title.

> > "
> > Аленький цветочек (USSR, 1952, Lev Atamanov)47
> > Аленький цветочек – Title of the Work
> > Alenkiy tsvetochek: Alternative (transliterated) title of Work/ Variant (Preferred title if systems don’t use Cyrillic)
> > Feuerrotes Blümchen – Variant title – Dubbed (German)
> > The Scarlet Flower – Variant title – Dubbed (English)
> > "
> > Why are those NOT one Title in four Languages ?

Should be:
Why are those NOT one Concept.Title in four Languages ?

> That is one title in four languages and should be treated as such,
> I believe. Title translations are not so easy, though: it is quite
> common for a movie to have a title in one language that is totally
> unrelated to the original title, e.g., Eternal Sunshine of the Spotless
> Mind (2004) in Italian was distributed with a title that sounds like "If
> you leave me, I'll erase you" (!).

<< If you understand pathological Denial, ie. schizophrenia ... the fragmented mind ... the person just deleted one compartment ... achieved spotlessness ... ah, the sunshine. Contrived, of course. >>

Ok, so for that we would have two Concepts, one Realisation each. Each Concept would have a different PK, because of the different Title (and Country, etc). But wait ...

> Thinking about it, the Italian title would be an Alternate Title of
> a Realisation or perhaps even an Edition (Italian dubbing) of the movie.
> So, it should not be considered a "translation" of the Story's title.

One Concept, two Realisations with the second Differentiator { Italian }, one Edition each.

This is not resolved.
Language raises its persistent head, as being a possible defining feature.

> In the example above, "Feuerrotes Blümchen" could be both (a) the German
> translation (I think, I don't speak Russian or German) of the
> original title of the Story, and (b) the title of a Realisation (or
> Edition) of the movie.

It is [a]. "Feuerrotes Blümchen" means fire-red (scarlet) flower.
- If the language has not changed (not dubbed), one Realisation with a second Edition, Differentiator { German }
- If the language has changed (dubbed), a separate Realisation with Differentiator { German }

> What are dubbed movies? Realisations or Editions?

We should make a separate clear decision that a dubbing is
a. second Edition (not second Realisation) with Differentiator{ Language }
b. second Realisation with Differentiator{ Language }

True dub, where the language of the film has been changed, as opposed to subtitles. As per the FIAF manual discussion, I would say, the substance "footage" has changed; the Language has changed; it is a "primary edit", therefore it is [b]. Which allows different (Alternate) Title in RealisationTitle

For subtitles (not dubbed), if it were a separate Edition released, then a separate Edition. In modern times, since many subtitles are inserted in the one image file, that is not even a separate Edition.

----
So this:
> > Аленький цветочек (USSR, 1952, Lev Atamanov)47
> > Аленький цветочек – Title of the Work
> > Alenkiy tsvetochek: Alternative (transliterated) title of Work/ Variant (Preferred title if systems don’t use Cyrillic)
> > Feuerrotes Blümchen – Variant title – Dubbed (German)
> > The Scarlet Flower – Variant title – Dubbed (English)

for a non-Cyrillic system would be:
Concept[ USSR, Alenkiy tsvetochek, Lev Atamanov, 1952 ]
Realisation Movie[ USSR, Alenkiy tsvetochek, Lev Atamanov, 1952, "" ]
RealisationTitle[ ..., Preferred ] = Alenkiy tsvetochek
RealisationTitle[ ..., Original ] = Russian[ Аленький цветочек ]
(Full definition of Realisation)

Realisation Movie[ USSR, Alenkiy tsvetochek, Lev Atamanov, 1952, German Dub ]
is a derivative of Realisation Movie[ USSR, Alenkiy tsvetochek, Lev Atamanov, 1952, "" ]
RealisationTitle[ ..., Preferred ] = German[ Feuerrotes Blümchen ]
Crew[ ... ] = German Dub crew

Realisation Movie[ USSR, Alenkiy tsvetochek, Lev Atamanov, 1952, English Dub ]
is a derivative of Realisation Movie[ USSR, Alenkiy tsvetochek, Lev Atamanov, 1952, "" ]
RealisationTitle[ ..., Preferred ] = English[ The Scarlet Flower ]
Crew[ ... ] = English Dub crew
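
The same three Realisations can be written down as keys, to show how the Differentiator keeps them distinct while the derivation is recorded separately. This is a sketch under assumptions: RealisationKey, derivative_of and add_dub are hypothetical names, and the "<Language> Dub" convention is simply copied from the example above.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class RealisationKey:
    """Key columns as used in the example above; the names are illustrative."""
    country: str
    title_reduced: str
    creator: str
    year: int
    differentiator: str = ""  # empty for the original Realisation

# dub -> source ("is a derivative of"); one source per dub.
derivative_of = {}

def add_dub(source, language):
    """Create the key for a dubbed Realisation derived from source."""
    dub = replace(source, differentiator=f"{language} Dub")
    derivative_of[dub] = source
    return dub

original = RealisationKey("USSR", "Alenkiy tsvetochek", "Lev Atamanov", 1952)
german = add_dub(original, "German")
english = add_dub(original, "English")

# Same Country, Title, Creator and Year, yet three distinct keys.
distinct_keys = len({original, german, english})
print(distinct_keys, derivative_of[german] == original)
```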

> Does the Cast include
> dubbers?

No.
IFF credited, they would be in the Crew. They are voice-over actors, paid like the cast for a TV advert, not at all like TV or film actors are paid. Which would cause a new Realisation with Differentiator { Dub crew }.

> > How can anyone render those Titles without the prefix Language ?
>
> How about a title like 8½ (by Fellini, 1963)? Ok, you might prefix it
> with Italian...

FIAF Manual § 2.3.1 p51 is not clear. Assuming the original Concept was Italian:
Concept[ Italy, Otto e mezzo, Federico Fellini, 1962 ]
ConceptExternalIdentifier[ ..., 0000-0000-161F-0000-W-0000-0000-F ] = ISAN
Realisation[ Italy, Otto e mezzo, Federico Fellini, 1962, "" ]
RealisationTitle[ ..., Original ] = Otto e mezzo
RealisationTitle[ ..., Alternate ] = 8½
RealisationExternalIdentifier[ ..., 0000-0000-161F-0000-W-0000-0002-B ] = V-ISAN
Edition[ ..., OpenReel, 1962.06.01 ]
Edition[ ..., Blu-ray, 1962.07.01 ] = Japanese subtitles

Language again. Title is dependent on Language. So wherever Title is stored, the Atom must be [ Language, Title ]. Which is why I had Language not Country in the PK. Response please.
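
A small sketch of what "the Atom must be [ Language, Title ]" means in practice. TitleAtom and render are hypothetical names; the titles and their languages are the ones already quoted in this thread, and the English/Italian pair for 8½ is only there to show that the same string is a different Title once the Language differs.

```python
from typing import NamedTuple

class TitleAtom(NamedTuple):
    """A Title is never stored bare: it always carries its Language."""
    language: str
    title: str

def render(atom):
    """Render in the Language[ Title ] notation used in the examples above."""
    return f"{atom.language}[ {atom.title} ]"

scarlet_flower = [
    TitleAtom("Russian", "Аленький цветочек"),
    TitleAtom("German", "Feuerrotes Blümchen"),
    TitleAtom("English", "The Scarlet Flower"),
]
for atom in scarlet_flower:
    print(render(atom))

# The same string under two Languages is two different atoms.
same_string_differs = TitleAtom("Italian", "8½") != TitleAtom("English", "8½")
```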

Cheers
Derek

Derek Ignatius Asirvadem

Jan 15, 2020, 3:59:11 PM
Nicola

> On Wednesday, 15 January 2020 20:59:15 UTC+11, Derek Ignatius Asirvadem wrote:
>
> [...]
>
> Language again. Title is dependent on Language. So wherever Title is stored, the Atom must be [ Language, Title ]. Which is why I had Language not Country in the PK. Response please.

Before responding, I would ask that you obtain familiarity with SQL language and CharacterSet implementation, both server-side (storage) and client-side (representation). The concept of Locales.

(
I don't know how the NONsql suites do it, and I really don't want to know. I trust you will agree, since we are working with the RM, not the pig poop marketed as "relational", we should concern ourselves with the data sub-language that is defined in the RM: SQL, and not anything else. As always, the high-end suppliers implemented language and charset support decades before the SQL Standard defined it.
)

Note also, my design for Keyword. I allow Keywords to be specified for:
a. { Concept | Realisation }, and
b. { Concept | Realisation }.Title, and
c. I ensure they are not repeated for a given { Concept | Realisation }.

Since Title is Language dependent, any Keyword that is related to a Title will thusly be Language dependent. Default Locale, etc.

Cheers
Derek

Nicola

Jan 18, 2020, 5:54:21 AM
>> On Tuesday, 14 January 2020 21:51:51 UTC+11, Nicola wrote:

>> Ok. After reflecting on how Titles and Stories should be related and
>> comparing your v0.3 and v0.4 models, I think that you should reintroduce
>> an independent Title entity in v0.4.

On a second thought, no, I agree that Titles should be modelled as
dependent entities.

> In V0.4, the Predicates touching Story are as follows.

I understand those.

> Taking each in turn:
>> 0. reintroduce an independent Title entity
>
> The predicate for that is false. A Title does not exist
> independently. It exists only because a Story [with that Title]
> exists.

Agreed.

>> 1. A Title identifies (is the "main"/"reduced" title of) 0-N Stories.

Here I did not explain clearly: what I meant to say is that the Title
entity is identifying for Story (i.e., Story is dependent on Title). But
we have already agreed that the opposite makes more sense.

> A ( Language, TitleReduced, Creator, Year ) identifies 1 Story

Ok, that's your current choice for the PK. We'll come back to that.

>> 2. A Story has one identifying/main/reduced title.
>
> Each Story is primarily identified by ( Language, TitleReduced, Creator, Year )
> Each Story is known as 1-to-n StoryTitles
> One StoryTitle is the TitleReduced in long form

By TitleReduced, do you mean a title with fewer attributes (e.g., without
the specification of the title's language)? Also, the language of
a Story and the language of a title are distinct, so perhaps we could
call the latter TitleLanguage for clarity.

>> 3. A Title may be an Alternate Title;
>
> The /rest/ in StoryTitle are Alternate Titles

Ok.

>> 4. A Story has zero or more Alternate Titles.
>
> Yes. The 1 in the /1-to-n/ is the main Title un-reduced.

Ok.

Nicola

Nicola

Jan 18, 2020, 11:14:49 AM
On 2020-01-15, Derek Ignatius Asirvadem <derek.a...@gmail.com> wrote:
> ------------------------------------------------
> This post is one such [b], introducing V0.5
> ------------------------------------------------
>
> http://www.softwaregems.com.au/Documents/Article/Database/Movie%20Title/Movie%20Title%20Progression%20V0_5.pdf

Got it.

> If your uni is anything like ours, it is a bit hard to find an A2
> colour printer, but easy to find an A3 colour printer, so here it is
> in A3x2:
> http://www.softwaregems.com.au/Documents/Article/Database/Movie%20Title/Movie%20Title%20Progression%20V0_5%20A3x2.pdf

I think I've never seen A2 printers around here. With large (and
even not so large) models, I usually split the diagrams into several
pages anyway.

>> >> silent movies
>> >
>> > Language of the Country (origin)
>>
>> Not sure how well languages map to countries. There might always be
>> a "main" language.
>
> Yes.
> If I don't use Language as modelled, then the Pk would be ( Country,
> ... ). In that case, I need an Associative Table CountryLanguage.

Yes, the relationships between countries and languages are loose (they
vary with time, boundaries, wars, etc). A (Movie) Concept is expressed
in one (principal) Language, and it is produced by one (principal)
Country: that's right.

>> > Discussion [MINE CORRECTED]
>> > -------------
>> >
>> > Realisation.PK[ Language, Creator, Year, Title, Differentiator ]
>> >
>> > Concept[ English, Bram Stoker, 1897, Dracula ]
>> > Realisation[ English, Tod Browning, 1931, Dracula, English cast ]
>> > Realisation[ Spanish, Tod Browning, 1931, Dracula, Spanish cast ]
>> > ...
>> > Realisation[ English, Jimmy Sangster, 1958, Dracula, "" ] = Christopher Lee
>
>> I would say that the Concept is [English, US, 1931, Tod Browning,
>> Dracula], i.e., the movie is a different concept than the novel it was
>> inspired from.
>
> Whoa.
> The Concept is intellectual, not concrete.
> So we need a Concept [ English, Tod Browning, 1931, Dracula ] that is
> derived from Concept [ English, Tod Browning, 1931, Dracula ]

I assume that you meant the latter to be Bram Stoker's Dracula. Yes,
concepts are related to each other and those relationships are useful
facts to model.

A remark: my intended definition of "Concept" in this context is
strictly as "Moving Image Concept", i.e., it would exclude other forms
of intellectual creations. So, if a movie is derived from a novel, that
should be expressed as an association between a (Moving Image) Concept
instance and an instance of another entity, not existing in the current
model. I see that your definition is broader: that's not necessarily
wrong, but I would be cautious with including literary works into the
picture: the facts to be associated to literary concepts might be
significantly different from those associated to movie concepts. As
a more concrete example, in your model a Concept has many Realisations,
each with its own Cast and Crew. If one of your Concepts is Bram
Stoker's novel, that doesn't make sense. (I know that the relationships
are 0-N, but still...)

> That exposes another requirement: we need a simple hierarchy (single
> parent, like Node) in Concept. The wildly different comedy Concept
> [ ..., 2019, Dracula ] must refer to the original Concept [ English,
> Bram Stoker, 1897, Dracula ] as /derived from/.

Yes, sure.

> ----
> How about this one:
>
> Concept Play [ Greek, Pygmalion, Unknown, 350BC ] is rendered as:
> Realisation Play [ Greek, Pygmalion, Unknown, 350BC, "" ]
>
> Concept Play [ English, Pygmalion, W S Gilbert, 1871 ]
> is derived from:
> Concept Play [ Greek, Pygmalion, Unknown, 350BC ]
> is rendered as:
> Realisation Play [ English, Pygmalion, W S Gilbert, 1871, "" ]
>
> Concept Play [ English, Pygmalion, G B Shaw, 1913 ]
> is derived from:
> Concept Play [ English, Pygmalion, W S Gilbert, 1871 ]
> is rendered as:
> Realisation Play [ English, Pygmalion, G B Shaw, 1913, "" ]
>
> Concept Play [ English, My Fair Lady, A J Lerner, 1956 ]
> is derived from:
> Concept Play [ English, Pygmalion, G B Shaw, 1913 ]
> is rendered as:
> Realisation Play [ English, My Fair Lady, A J Lerner, 1956, "" ]
> Realisation Movie [ English, My Fair Lady, A J Lerner, 1964, "" ]

Not commenting about the details of this specific example (e.g., one
might introduce a Concept [English, My Fair Lady, G Cukor, 1964] derived
from either Shaw's Pygmalion or Lerner's My Fair Lady...), but the
overall idea makes sense, yes.

> You were right: we may need Variant at the Concept level (as well as
> the Realisation level). Ok, so for lineage, we need
> a. Either Concept /is varied as/ Variant.Concept -- allows multiple parents
> XOR
> b. Concept /has variant / Concept -- single parent
> I am tending towards the latter, a derivation can be from only one parent.
> Done. For both Concept & Realisation.

Ok, I see this in your v0.5.

>> Eternal Sunshine of the Spotless
>> Mind (2004) in Italian was distributed with a title that sounds like "If
>> you leave me, I'll erase you" (!).
>> [...]
> One Concept, two Realisations with the second Differentiator ( Italian
> }, one Edition each.

In the case of dubbing, I'd say that there is one Concept, *one*
Realisation (same footage), and two Editions (one in the original
language, one with added dubbing).

> This is not resolved.
> Language raises it persistent head, as being a possible defining feature.

The Language of a dubbed version would still be English (the original
language), because Language is the language the movie was filmed in.
Dubbing information is a characteristic of an Edition, IMO. In your
v0.5 model I would add an association between Edition and Realisation
Title (an Edition adopts one of the Realisation Titles).

The Dracula (1931) example may be partially confusing here: we say that
the difference in the case of Dracula is that one version is in English and
the other is in Spanish, but the key point is that the *footage* is
different, so there are clearly two different Realisations (of the same
Concept). We use Spanish/English as the differentiator, but the
difference is not only in the language, it's more profound than that
(e.g., one might differentiate the two versions by mentioning the name
of an actor that acted in one version but not in the other).

So:

Concept: [US, Eternal Sunshine..., M Gondry, 2004]
Realisation: [US, Eternal Sunshine..., M Gondry, 2004, ""]
  Is Known As: [US, Eternal Sunshine...]
Is Known As: [IT, Se mi lasci ti cancello]
Edition: [US, Eternal Sunshine..., M Gondry, 2004, "", DVD, Some Publisher, 2011]
Edition Title: [IT, Se mi lasci ti cancello]

Concept: [US, Dracula, T Browning, 1931]
Realisation: [US, Dracula, T Browning, 1931, "English cast"]
Is Known As: [US, Dracula]
Is Known As: [LT, Drakula]
Realisation: [US, Dracula, T Browning, 1931, "Spanish cast"]
Is Known As: [US, Dracula]
Is Known As: [LT, Drakula]

Note that while Concepts and Realisations may have alternate titles, an
Edition should have only one title. Of course, such title must be among
the titles by which the corresponding Realisation instance is known.

But wait:

> We should make a separate clear decision that a dubbing is
> a. second Edition (not second Realisation) with Differentiator{ Language }
> b. second Realisation with Differentiator{ Language }

You say b, I have suggested a. I have reviewed your other example ("The
Scarlet Flower") and the accompanying comments and I don't have a strong
opinion now. If you create a new Realisation instance for a dubbed
version, such instance has the same cast and characters and crew (except
for dub crew) as the one it is derived from. Perhaps it's this
duplication that makes me uneasy about your decision.

> Language again. Title is dependent on Language. So wherever Title is
> stored, the Atom must be [ Language, Title ]. Which is why I had
> Language not Country in the PK. Response please.

What is the meaning of an atom [Language: Y, Title: X]?

- "Title X is written in language Y"?

No, titles may have no meaning in any language or be ambiguous (e.g.,
"Gift" means different things in English and German) and such ambiguity
may exist on purpose (e.g., a Spanish movie about a poisonous present).

- "X is the title with which the movie was distributed in a country in
which language Y is spoken"?

Then, why not [Title, Country] instead?

- Something else?

Nicola

Nicola

Jan 18, 2020, 11:50:44 AM
On 2020-01-15, Derek Ignatius Asirvadem <derek.a...@gmail.com> wrote:
> Nicola
>
>> On Wednesday, 15 January 2020 20:59:15 UTC+11, Derek Ignatius Asirvadem wrote:
>>
>> [...]
>>
>> Language again. Title is dependent on Language. So wherever Title
>> is stored, the Atom must be [ Language, Title ]. Which is why I had
>> Language not Country in the PK. Response please.
>
> Before responding, I would ask that you obtain familiarity with SQL
> language and CharacterSet implementation, both server-side (storage)
> and client-side (representation). The concept of Locales.
>
> ( I don't know how the NONsql suites do it, and I really don't want to
> know. I trust you will agree, since we are working with the RM, not
> the pig poop marketed as "relational", we should concern ourselves
> with the data sub-language that is defined in the RM: SQL, and not
> anything else. As always, the high-end suppliers implemented language
> and charset support decades before the SQL Standard defined it.)
>
> Note also, my design for Keyword. I allow Keywords to be specified for:
> a. { Concept | Realisation }, and
> b. { Concept | Realisation }.Title, and
> c. I ensure they are not repeated for a given { Concept | Realisation }.

That's fine.

> Since Title is Language dependent, any Keyword that is related to
> a Title will thusly be Language dependent. Default Locale, etc.

Again, I believe that you are making too strong assumptions about
titles.

Nicola

Derek Ignatius Asirvadem

Jan 20, 2020, 8:58:11 AM
> On Sunday, 19 January 2020 03:14:49 UTC+11, Nicola wrote:\

Nicola

I was preparing a detailed response but I ran out of time. Problem is, Tues is a very busy day for me. I will get back to this on Wed.

Cheers
Derek

Derek Ignatius Asirvadem

Jan 23, 2020, 11:07:59 PM
> On Saturday, 18 January 2020 21:54:21 UTC+11, Nicola wrote:
>
> >> 2. A Story has one identifying/main/reduced title.
> >
> > Each Story is primarily identified by ( Language, TitleReduced, Creator, Year )
> > Each Story is known as 1-to-n StoryTitles
> > One StoryTitle is the TitleReduced in long form
>
> By TitleReduced, do you mean with less attributes (e.g., without
> the specification of the title's language?).

No.
Wherever Title is stored, it must be stored with TitleLanguage.

> By TitleReduced

I have previously given a quick description of what that is, which I thought was understood ...
OrderAdvanced ...
Keyword ...
PartDescription ...

The same structure will be used for:
- Title
- minus prepositions; conjunctions; etc.
Fully searchable at the SQL level.
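
A minimal sketch of what a reduced title could look like, assuming "reduced" means the title's words with prepositions, conjunctions and articles dropped. The stop-word list, the lower-casing and the name reduce_title are illustrative assumptions, not part of the model:

```python
import re

# Hypothetical per-language stop-word list; the thread does not enumerate one.
STOP_WORDS = {
    "en": {"a", "an", "and", "of", "or", "the", "to"},
}


def reduce_title(title_language, title):
    """Return the title's words, lower-cased, minus that language's stop words."""
    words = re.findall(r"\w+", title.lower())
    stop = STOP_WORDS.get(title_language, set())
    return [word for word in words if word not in stop]


reduced = reduce_title("en", "Eternal Sunshine of the Spotless Mind")
```

Each remaining word would then be one row in a child table keyed by the parent plus the word, which is what makes the title searchable with plain SQL predicates. The language is passed in because the stop words depend on it.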

> Also, the language of
> a Story and the language of a title are distinct, so perhaps we could
> call the latter TitleLanguage for clarity.

Yes.

Cheers
Derek

Derek Ignatius Asirvadem

Jan 23, 2020, 11:08:19 PM
> On Sunday, 19 January 2020 03:14:49 UTC+11, Nicola wrote:
> > On 2020-01-15, Derek Ignatius Asirvadem <derek.a...@gmail.com> wrote:
>
> > ------------------------------------------------
> > This post is one such [b], introducing V0.5
> > ------------------------------------------------
> >
> >> > Discussion [MINE CORRECTED]
> >> > -------------
> >> >
> >> > Realisation.PK[ Language, Creator, Year, Title, Differentiator ]
> >> >
> >> > Concept[ English, Bram Stoker, 1897, Dracula ]
> >> > Realisation[ English, Tod Browning, 1931, Dracula, English cast ]
> >> > Realisation[ Spanish, Tod Browning, 1931, Dracula, Spanish cast ]
> >> > ...
> >> > Realisation[ English, Jimmy Sangster, 1958, Dracula, "" ] = Christopher Lee
> >
> >> I would say that the Concept is [English, US, 1931, Tod Browning,
> >> Dracula], i.e., the movie is a different concept than the novel it was
> >> inspired from.
> >
> > Whoa.
> > The Concept is intellectual, not concrete.
> > So we need a Concept [ English, Tod Browning, 1931, Dracula ] that is
> > derived from Concept [ English, Tod Browning, 1931, Dracula ]
>
> I assume that you meant the latter to be Bram Stoker's Dracula.

Yes, sorry.

// The Concept is intellectual, not concrete.
// So we need a Concept [ English, Tod Browning, 1931, Dracula ] that is
// derived from Concept [ English, Bram Stoker, 1897, Dracula ]

> Yes,
> concepts are related to each other and those relationships are useful
> facts to model.

Now that the logic is getting tighter, strictly through:
- Each Varied.Concept is varied as 0-to-n Variant.Concepts
--- Each Variant.Concept is a variant of 1 Varied.Concept
- Each Derived.Concept is derived as 0-to-n Derivative.Concepts
--- Each Derivative.Concept is a derivative of 1 Derived.Concept

Gives:
- Concept [ English, Bram Stoker, 1897, Dracula ] is derived as Concept [ English, Tod Browning, 1931, Dracula ]
--- Concept [ English, Tod Browning, 1931, Dracula ] is a derivative of Concept [ English, Bram Stoker, 1897, Dracula ]
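
The single-parent rule can be illustrated with a short sketch. It assumes each Concept records at most one Concept it is a derivative of; the names ConceptKey, derived_from and lineage are hypothetical and only serve the illustration:

```python
from typing import NamedTuple


class ConceptKey(NamedTuple):
    language: str
    creator: str
    year: int
    title: str


stoker = ConceptKey("English", "Bram Stoker", 1897, "Dracula")
browning = ConceptKey("English", "Tod Browning", 1931, "Dracula")

# Derivative Concept -> the one Concept it is a derivative of.
# A Concept that is absent from this mapping is an original.
derived_from = {browning: stoker}


def lineage(concept):
    """Walk from a Concept up to its original, one parent at a time."""
    chain = [concept]
    seen = {concept}
    parent = derived_from.get(concept)
    while parent is not None:
        if parent in seen:
            raise ValueError("cycle in derivation chain")
        chain.append(parent)
        seen.add(parent)
        parent = derived_from.get(parent)
    return chain


dracula_lineage = lineage(browning)
```

Because a derivative has exactly one parent, the lineage is a simple chain rather than a graph, so reporting it for any Realisation is a plain walk upwards.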

> A remark: my intended definition of "Concept" in this context is
> strictly as "Moving Image Concept", i.e., it would exclude other forms
> of intellectual creations. So, if a movie is derived from a novel, that
> should be expressed as an association between a (Moving Image) Concept
> instance and an instance of another entity, not existing in the current
> model.

I don't understand. How can one associate [or reference] a thing with another thing that does not exist ?

> I see that your definition is broader: that's not necessarily
> wrong, but I would be cautious with including literary works into the
> picture: the facts to be associated to literary concepts might be
> significantly different from those associated to movie concepts. As
> a more concrete example, in your model a Concept has many Realisations,
> each with its own Cast and Crew. If one of your Concepts is Bram
> Stoker's novel, that doesn't make sense. (I know that the relationships
> are 0-N, but still...)

Why does it not make sense ?
(Given that you are restricting the content of the database (as distinct from the data model) to "Moving Images" only ...)
EITHER
- you register the original Concept in whatever form that is (Dracula 1897, Pygmalion 350BC)
--- and the chain of derivative Concepts
--- such that you can register the first Realisation that you need (the driver)
--- and thus the tree is complete (for anyone reporting the Realisation, seeking its lineage)
--- (that some Concepts have no Realisations is not relevant)
OR
- you register only the Concept that is immediate to the Realisation that you need
--- and obtain none of the above
--- (change the cardinality to 1-to-n ... because it is true for the content, not true for reality)

In the second instance, it takes the scope of content as being only the moving picture industry as if it came into existence from nowhere, with nothing in its lineage, which is false, for every Concept that is not an original movie Concept. Schizophrenia. Like the echo chamber in an asylum. Any visitor who knows anything about a particular film will laugh at the absence of proper lineage.

> >> Eternal Sunshine of the Spotless
> >> Mind (2004) in Italian was distributed with a title that sounds like "If
> >> you leave me, I'll erase you" (!).
> >> [...]
> > One Concept, two Realisations with the second Differentiator ( Italian
> > }, one Edition each.
>
> In the case of dubbing, I'd say that there is one Concept, *one*
> Realisation (same footage), and two Editions (one in the original
> language, one with added dubbing).

Not according to my reading of the FIAF manual. E.g. the sound (audio channel for the movie) is different (which according to you does not change the "footage", but according to me it does). Second, there would be a dub cast, who should be registered as such.

> > This is not resolved.
> > Language raises it persistent head, as being a possible defining feature.
>
> The Language of a dubbed version would still be English (the original
> language), because Language is the language the movie was filmed in.
> Dubbing information are characteristics of an Edition, IMO.

Disagree, as above.

There is no difference between a Realisation [ Country[England], Language[English] ] and Realisation [ Country[England], Language[English] ] with Italian subtitles.

There is a difference between a Realisation [ Country[England], Language[English] ] and Realisation [ Country[England], ? Language[English] ? ] with Italian dubbing.
- In the model that is Realisation [ Country[England], Language[Italian] ]
- the Title may also be Language[Italian]
- Dub cast, which the original does not have


> In your
> v0.05 model I would add an association between Edition and Realisation
> Title (an Edition adopts one of the Realisation Titles).

Ok. I am not against that but then we have an item (loop in the model, accidentally called a circular reference in a previous instance) that needs resolution.

Each Title in RealisationTitle is qualified by Language.

Why is such an Edition not IDENTIFIED BY that single RealisationTitle (instead of being identified by the Realisation PK plus having a single RealisationTitle) ?

My experience of Europe is that such things as videos (training or feature or copy of a TV show) come on a DVD in multiple languages ... and /optionally/ one Language is primary. If you are saying that an Edition is one Language, then yes,
- an Edition should be Identified by a single RealisationTitle
not
- an Edition should be Identified by a single Realisation, and has one RealisationTitle

> The Dracula (1931) example may be partially confusing here: we say that
> the difference in the case of Dracula is that one version in English and
> the other is in Spanish, but the key point is that the *footage* is
> different, so there are clearly two different Realisations (of the same
> Concept).

Yes.
But there is more: the Cast is different; one has one Title, in English only; the other has one Title, in Spanish only.

> We use Spanish/English as the differentiator, but the
> difference is not only in the language, it's more profound than that
> (e.g., one might differentiate the two versions by mentioning the name
> of an actor that acted in one version but not in the other).

Yes. We use Differentiator { English | Spanish } for convenience only, something that is meaningful to us. Yes, we could just as well use { FirstShoot | SecondShoot } or { ActorFoo | ActorBar }.

> So:
>
> Concept: [US, Eternal Sunshine..., M Gondry, 2004]
> Realisation: [US, Eternal Sunshine..., M Gondry, 2004, ""]
> Is Known As: [US, Eternal, Sunshine]
> Is Known As: [IT, Se mi lasci ti cancello]
> Edition: [US, Eternal Sunshine..., M Gondry, 2004, "", DVD, Some Publisher, 2011]
> Edition Title: [IT, Se mi lasci ti cancello]
>
> Concept: [US, Dracula, T Browning, 1931]
> Realisation: [US, Dracula, T Browning, 1931, "English cast"]
> Is Known As: [US, Dracula]
> Is Known As: [LT, Drakula]
> Realisation: [US, Dracula, T Browning, 1931, "Spanish cast"]
> Is Known As: [US, Dracula]
> Is Known As: [LT, Drakula]

Ok. (Assuming the lineage is there, but not shown.)

> Note that while Concepts and Realisations may have alternate titles, an
> Edition should have only one title. Of course, such title must be among
> the titles by which the corresponding Realisation instance is known as.

Ok, then Edition is Identified by RealisationTitle, not Realisation.
- Each RealisationTitle manifests as 0-to-n Editions
- Each Edition is a manifestation of 1 RealisationTitle
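
A hedged sketch of that identification, using SQLite through Python only because it is self-contained. The column names are assumptions, and RealisationKey is a hypothetical single-column stand-in for the full Realisation key, used purely to keep the example short:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE RealisationTitle (
    RealisationKey TEXT NOT NULL,  -- stand-in for the full Realisation key
    TitleLanguage  TEXT NOT NULL,
    Title          TEXT NOT NULL,
    PRIMARY KEY (RealisationKey, TitleLanguage, Title)
);
CREATE TABLE Edition (
    RealisationKey TEXT NOT NULL,
    TitleLanguage  TEXT NOT NULL,
    Title          TEXT NOT NULL,
    Medium         TEXT NOT NULL,
    Publisher      TEXT NOT NULL,
    Year           INTEGER NOT NULL,
    PRIMARY KEY (RealisationKey, TitleLanguage, Title, Medium, Publisher, Year),
    FOREIGN KEY (RealisationKey, TitleLanguage, Title)
        REFERENCES RealisationTitle (RealisationKey, TitleLanguage, Title)
);
""")

with conn:
    conn.executemany(
        "INSERT INTO RealisationTitle VALUES (?, ?, ?)",
        [
            ("Eternal Sunshine 2004", "en", "Eternal Sunshine of the Spotless Mind"),
            ("Eternal Sunshine 2004", "it", "Se mi lasci ti cancello"),
        ],
    )


def add_edition(realisation, title_language, title, medium, publisher, year):
    """Insert an Edition; return False if its title is not a known RealisationTitle."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO Edition VALUES (?, ?, ?, ?, ?, ?)",
                (realisation, title_language, title, medium, publisher, year),
            )
        return True
    except sqlite3.IntegrityError:
        return False


accepted = add_edition("Eternal Sunshine 2004", "it", "Se mi lasci ti cancello",
                       "DVD", "Some Publisher", 2011)
rejected = add_edition("Eternal Sunshine 2004", "fr", "Not a registered title",
                       "DVD", "Some Publisher", 2011)
```

Since the RealisationTitle key is part of the Edition key, the rule that an Edition's title must be one of the titles its Realisation is known by is enforced by the foreign key alone, with no extra constraint.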

> But wait:
>
> > We should make a separate clear decision that a dubbing is
> > a. second Edition (not second Realisation) with Differentiator{ Language }
> > b. second Realisation with Differentiator{ Language }
>
> You say b, I have suggested a. I have reviewed your other example ("The
> Scarlet Flower") and the accompanying comments and I don't have a strong
> opinion now. If you create a new Realisation instance for a dubbed
> version, such instance has the same cast and characters and crew (except
> for dub crew)

Well, then, it is a different Crew. Therefore a different Realisation.

> than the one it is derived from. Perhaps it's this
> duplication that makes me uneasy about your decision.

Where is the duplication ? Think about it from the physical back to the logical. Two DVDs. One has one Crew. The other has that same Crew plus credits for the dubbing Crew. Per Codd's 3NF & FD, the attributes that depend on the Key are different, therefore it must have a different Key.

Second, the Language is different. And the Language is in the Key itself.

The alternative is, make a separate table RealisationDub, a child of Realisation, which has 1-to-n DubCrew. That would be a single Title of the many in RealisationTitle.

Consider that dubbing is post-production, not Production. That RealisationDub is basically Edition. So then we move onto this: A Dub is a different Edition (with a DubCrew), not a second Realisation. Poof, the duplication is eliminated. Normalisation rocks.

> > Language again. Title is dependent on Language. So wherever Title is
> > stored, the Atom must be [ Language, Title ]. Which is why I had
> > Language not Country in the PK. Response please.
>
> What is the meaning of an atom [Language: Y, Title: X]?
>
> - "Title X is written in language Y"?
>
> No, titles may have no meaning in any language or be ambiguous (e.g,
> "Gift" means different things in English and German) and such ambiguity
> may exist on purpose (e.g., a Spanish movie about a poisonous present).
>
> - "X is the title with which the movie was distributed in a country in
> which language Y is spoken"?
>
> Then, why not [Title, Country] instead?
>
> - Something else?

Something else. Think Europe, where multiple Languages and their alphabets (Character Sets) are common, and commonly supported on computer systems (it is 2020, not 1984). When a Title is rendered on the screen (GUI; Report tool; etc), what CharSet do we display the content of the column /Title/ in ? In order to display this (Title only is the concern), we need to know the Language, which in turn has one CharSet:
Аленький цветочек (Russian, Cyrillic)
Alenkiy tsvetochek (Russian, no Cyrillic = Western Europe)
Feuerrotes Blümchen (German, Western Europe)
The Scarlet Flower (English, Western Europe)
Το κόκκινο λουλούδι (Greek, Greek)

Language codes:
ISO 639-1 { en | de | ru }

CharSets (Alphabet):
ISO 8859-1 Western Europe (meaning Latin alphabet plus accents, etc)
ISO 8859-5 Cyrillic
ISO 8859-7 Greek

So wherever Title is stored, the Atom must be [ Language, Title ], preferable to [ CharSet, Title ].
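
To make the point concrete, here is a small sketch in which the Language stored with the Title selects the CharSet. The mapping table and the names TitleAtom and encode_title are assumptions for illustration; a real server would take this from its locale configuration:

```python
from typing import NamedTuple


class TitleAtom(NamedTuple):
    language: str  # ISO 639-1 code
    title: str


# Assumed mapping from Language to one CharSet, following the lists above.
CHARSET_FOR_LANGUAGE = {
    "en": "iso-8859-1",
    "de": "iso-8859-1",
    "ru": "iso-8859-5",
    "el": "iso-8859-7",
}


def encode_title(atom):
    """Encode the Title in the single-byte CharSet implied by its Language."""
    return atom.title.encode(CHARSET_FOR_LANGUAGE[atom.language])


russian = TitleAtom("ru", "Аленький цветочек")
german = TitleAtom("de", "Feuerrotes Blümchen")
greek = TitleAtom("el", "Το κόκκινο λουλούδι")

encoded_russian = encode_title(russian)

# Without the Language, the wrong CharSet may be chosen and the Title cannot be stored.
try:
    russian.title.encode("iso-8859-1")
    wrong_charset_failed = False
except UnicodeEncodeError:
    wrong_charset_failed = True
```

The romanised form "Alenkiy tsvetochek" shows the limit of a one-to-one Language to CharSet mapping: it is Russian but needs the Western European set, so a fuller treatment would need a script or locale qualifier as well.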


(
This is the full high end enterprise-level implementation
http://infocenter.sybase.com/help/index.jsp?topic=/com.sybase.infocenter.dc31654.1600/doc/html/san1360629204598.html

I am not saying you should do that, because I know your SQL or NONsql is not that capable. The minimum is that which I have given: wherever there is a Title column store [ Language, Title ]
)

Which is why I said ...

> On Sunday, 19 January 2020 03:50:44 UTC+11, Nicola wrote:
> >
> > Before responding, I would ask that you obtain familiarity with SQL
> > language and CharacterSet implementation, both server-side (storage)
> > and client-side (representation). The concept of Locales.
> >
> > ( I don't know how the NONsql suites do it, and I really don't want to
> > know. I trust you will agree, since we are working with the RM, not
> > the pig poop marketed as "relational", we should concern ourselves
> > with the data sub-language that is defined in the RM: SQL, and not
> > anything else. As always, the high-end suppliers implemented language
> > and charset support decades before the SQL Standard defined it.)


> > Since Title is Language dependent, any Keyword that is related to
> > a Title will thusly be Language dependent. Default Locale, etc.
>
> Again, I believe that you are making too strong assumptions about
> titles.

No problem. Let me know where to back off.

----

I don't know if you would agree. My perspective (real world, about 40 years):
- what we have been doing is Logical level modelling
--- you probably call it "conceptual level"
- this is where most, if not all constraints are determined (there are many more, but that is at a lower level, and pedestrian)
- we are about 75-80% complete
- I don't have a "conceptual level", it is an irrelevant notion, relevant to the academics only, for some purpose that has nothing to do with reality (an implementation or intent of an implementation)
- to me "conceptual" is simply the early stages of Logical

In ERwin ( IDEF1X modelling s/w ) there is just one model (file), it progresses as the modelling progresses. Logical/Physical is just Domain/RawDataType, etc, meaning what will be shown in the rectangles, so it is a rendition, not a next-stage. At any time, you can show { Table { Key { Attribute } } } at { Logical | Physical }.

The notion of separate models (files) for { conceptual | logical | physical } is ridiculous to me. Other modelling tools allow such ridiculous actions, and sometimes have a "match" or "sync" feature to synch up the ridiculous duplications and differences that people like me never have. 90% of the data modellers that use such software are stuck in the maintenance of that nonsense.

Cheers
Derek





>
> Nicola

Derek Ignatius Asirvadem

Jan 25, 2020, 3:35:35 AM
===================
This post introduces V0.6
===================

http://www.softwaregems.com.au/Documents/Article/Database/Movie%20Title/Movie%20Title%20Progression%20V0_6.pdf

The Keys are defined along with some notes at top right.

> On Friday, 24 January 2020 15:08:19 UTC+11, Derek Ignatius Asirvadem wrote:

All issues discussed in that post have been modelled.

> So wherever Title is stored, the Atom must be [ Language, Title ], preferable to [ CharSet, Title ].

That is expanded to cover any attribute:
Wherever an attribute that is not in the default Language is stored, it is stored as [ Language, <attribute> ]

> [ I said somewhere, that the un-reduced version of Concept.TitleReduced will be in ConceptTitles. ]
That is an error. According to 3NF/FD, it will be an attribute in Concept.Title.

----

In the same way that Edition is a manifestation of 1 RealisationTitle (not Realisation), and therefore a single Language, in this model, I expect to clarify whether Realisation is a rendition of 1 ConceptTitle (not Concept), and therefore a single Language.


Cheers
Derek

Nicola

Jan 25, 2020, 5:00:03 AM
Ok.

>> Yes,
>> concepts are related to each other and those relationships are useful
>> facts to model.
>
> Now that the logic is getting tighter, strictly through:
> - Each Varied.Concept is varied as 0-to-n Variant.Concepts
> --- Each Variant.Concept is a variant of 1 Varied.Concept
> - Each Derived.Concept is derived as 0-to-n Derivative.Concepts
> --- Each Derivative.Concept is a derivative of 1 Derived.Concept
>
> Gives:
> - Concept [ English, Bram Stoker, 1897, Dracula ] is derived as Concept [ English, Tod Browning, 1931, Dracula ]
> --- Concept [ English, Tod Browning, 1931, Dracula ] is a derivative of Concept [ English, Bram Stoker, 1897, Dracula ]

Ok.

>> A remark: my intended definition of "Concept" in this context is
>> strictly as "Moving Image Concept", i.e., it would exclude other forms
>> of intellectual creations. So, if a movie is derived from a novel, that
>> should be expressed as an association between a (Moving Image) Concept
>> instance and an instance of another entity, not existing in the current
>> model.
>
> I don't understand. How can one associate [or reference] a thing with
> another thing that does not exist ?

You won't reference it, of course. In the example above, you would stop
at Concept [ English, Tod Browning, 1931, Dracula ], without ever
recording the fact that it is derived from Bram Stoker's Dracula. If
you'd want to record such additional fact, you would extend your model.

But I see what you mean, and I agree: there's no need to multiply
entities without necessity. Both pieces of information are Concepts and
both must be recorded as such. Correctly managing the Concepts
hierarchy, as you explain below, is not difficult.

>> I see that your definition is broader: that's not necessarily
>> wrong, but I would be cautious with including literary works into the
>> picture: the facts to be associated to literary concepts might be
>> significantly different from those associated to movie concepts. As
>> a more concrete example, in your model a Concept has many Realisations,
>> each with its own Cast and Crew. If one of your Concepts is Bram
>> Stoker's novel, that doesn't make sense. (I know that the relationships
>> are 0-N, but still...)
>
> Why does it not make sense ?
> (Given that you are restricting the content of the database (as
> distinct from the data model) to "Moving Images" only ...)
> EITHER
> - you register the original Concept in whatever form that is (Dracula 1897, Pygmalion 350BC)
> --- and the chain of derivative Concepts
> --- such that you can register the first Realisation that you need (the driver)
> --- and thus the tree is complete (for anyone reporting the Realisation, seeking its lineage)
> --- (that some Concepts have no Realisations is not relevant)

Understood. That's fine then.

>> >> Eternal Sunshine of the Spotless
>> >> Mind (2004) in Italian was distributed with a title that sounds like "If
>> >> you leave me, I'll erase you" (!).
>> >> [...]
>> > One Concept, two Realisations with the second Differentiator ( Italian
>> > }, one Edition each.
>>
>> In the case of dubbing, I'd say that there is one Concept, *one*
>> Realisation (same footage), and two Editions (one in the original
>> language, one with added dubbing).
>
> Not according to my reading of the FIAF manual. Eg. the sound (audio
> channel for the movie) is different (which according to you does not
> change the "footage", but according to me it does). Second, there
> would be a dub cats who should be registered as such.

Ok, re-read the relevant FIAF guidelines. Agreed.

>> > This is not resolved.
>> > Language raises it persistent head, as being a possible defining feature.
>>
>> The Language of a dubbed version would still be English (the original
>> language), because Language is the language the movie was filmed in.
>> Dubbing information are characteristics of an Edition, IMO.
>
> Disagree, as above.
>
> There is no difference between a Realisation [ Country[England],
> Language[English] ] and Realisation [ Country[England],
> Language[English] ] with Italian subtitles.
>
> There is a difference between a Realisation [ Country[England],
> Language[English] ] and Realisation [ Country[England],
> ? Language[English] ? ] with Italian dubbing.
> - In the model that is Realisation [ Country[England], Language[Italian] ]
> - the Title may also be Language[Italian]
> - Dub cast, which the original does not have

Ok.

>> In your
>> v0.05 model I would add an association between Edition and Realisation
>> Title (an Edition adopts one of the Realisation Titles).
>
> Ok. I am not against that but then we have an item (loop in the
> model, accidentally called a circular reference in a previous
> instance) that needs resolution.
>
> Each Title in RealisationTitle is qualified by Language.
>
> Why is such an Edition not IDENTIFIED BY that single RealisationTitle
> (instead of being identified by the Realisation PK plus having
> a single RealisationTitle) ?

My reasoning was a consequence of putting dubbing at the Edition level.
Now that we have agreed that it is a new Realisation, what you say is ok
with me.

> My experience of Europe is that in such things as videos (training or
> feature or copy of a TV show) comes on a DVD in multiple languages ...
> and /optionally/ one Language is primary. If you are saying that an
> Edition is one Language, then yes,
> - an Edition should be Identified by a single RealisationTitle

Ok.

>> But wait:
>>
>> > We should make a separate clear decision that a dubbing is
>> > a. second Edition (not second Realisation) with Differentiator{ Language }
>> > b. second Realisation with Differentiator{ Language }
>>
>> You say b, I have suggested a. I have reviewed your other example ("The
>> Scarlet Flower") and the accompanying comments and I don't have a strong
>> opinion now. If you create a new Realisation instance for a dubbed
>> version, such instance has the same cast and characters and crew (except
>> for dub crew)
>
> Well, then, it is a different Crew. Therefore a different Realisation.
>
>> than the one it is derived from. Perhaps it's this
>> duplication that makes me uneasy about your decision.
>
> Where is the duplication ? Think about it from the physical back to
> the logical. Two DVDs. One has one Crew. The other has that same
> Crew plus credits for the dubbing Crew. (Codd's 3NF & FD) the
> attributes that depend on the Key are different, therefore it must
> have a different Key.
>
> Second, the Language is different. And the Language is in the Key itself.

Ok. At the E-R level I believe that you have nailed things down. I think
that at this point we might start moving to a key-based model, where it
will be clearer whether further normalisation is needed.
Then, you can probably assume UTF-8 or UTF-16 and dispose of CharSet. In
either case, now that I better understand your model, I accept that.

>> Again, I believe that you are making too strong assumptions about
>> titles.

That was a misunderstanding on my part. Your assumptions are reasonable.

> I don't know if you would agree. My perspective (real world, about 40 years):
> - what we have been doing in Logical level modelling
> --- you probably call it "conceptual level"

I am with Codd when he says that the distinction between "conceptual
level" and "logical level" is fuzzy. Nowadays, I take the two terms as
synonyms.

Nicola

Nicola

Jan 25, 2020, 6:15:33 AM
On 2020-01-25, Derek Ignatius Asirvadem <derek.a...@gmail.com> wrote:
>===================
> This post introduces V0.6
>===================
>
> http://www.softwaregems.com.au/Documents/Article/Database/Movie%20Title/Movie%20Title%20Progression%20V0_6.pdf

A few thoughts:

- Does Concept need to be identified by Country? Couldn't that just be
[TitleReduced, Creator, Year]? Each Realisation of a Concept, but not
the Concept itself, is "produced by" a (principal) country. The
Concept itself is a creation of its Creator at some point in time.
How about:

Concept [TitleReduced, Creator, Year] -> Title
Realisation [TitleReduced, Creator, Year, Differentiator] -> Country

?

- Besides, I would remove Title from Concept, because that information
belongs to ConceptTitle. A ConceptTitle can additionally provide all
the required information about each title.

- Most of the non-identifying relations you have marked with L do not
need to be explicitly modelled, because, I think, they are implied by
the various referential integrity constraints.

What does "SG" stand for in "SG Relational Notation"?

> In the same way that Edition is a manifestation of 1 RealisationTitle
> (not Realisation), and therefore a single Language, in this model,

Ok.

> I expect to clarify whether Realisation is rendition of 1 ConceptTitle
> (not Concept), and therefore a single Language.

I'd lean towards answering affirmatively, if we regard a movie dubbed in
a different language as a different Realisation.

Nicola

Derek Ignatius Asirvadem

Jan 25, 2020, 6:56:05 AM
> On Saturday, 25 January 2020 21:00:03 UTC+11, Nicola wrote:
> > On 2020-01-24, Derek Ignatius Asirvadem <derek.a...@gmail.com> wrote:
> >> On Sunday, 19 January 2020 03:14:49 UTC+11, Nicola wrote:
> >> > On 2020-01-15, Derek Ignatius Asirvadem <derek.a...@gmail.com> wrote:
> >>
> >> > ------------------------------------------------
> >> > This post is one such [b], introducing V0.5
> >> > ------------------------------------------------
> >>
> >> A remark: my intended definition of "Concept" in this context is
> >> strictly as "Moving Image Concept", i.e., it would exclude other forms
> >> of intellectual creations. So, if a movie is derived from a novel, that
> >> should be expressed as an association between a (Moving Image) Concept
> >> instance and an instance of another entity, not existing in the current
> >> model.
> >
> > I don't understand. How can one associate [or reference] a thing with
> > another thing that does not exist ?
>
> You won't reference it, of course. In the example above, you would stop
> at Concept [ English, Tod Browning, 1931, Dracula ], without ever
> recording the fact that it is derived from Bram Stoker's Dracula. If
> you'd want to record such additional fact, you would extend your model.

No. You would:
+ Concept [ English, Bram Stoker, 1897, Dracula ] All relevant attributes
Concept [ English, Tod Browning, 1931, Dracula ] Parent = [ English, Bram Stoker, 1897, Dracula ]
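A minimal sketch of that Parent reference, using Python's built-in sqlite3 module. The table and column names (concept, parent_language, and so on) are assumptions made for illustration and do not come from the model PDFs; the key shown is the [ Language, Creator, Year, Title ] form used in the example above, not the later versions.

```python
import sqlite3

# Autocommit mode, so the PRAGMA takes effect immediately.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("PRAGMA foreign_keys = ON")

# Hypothetical layout: a Concept optionally references its Parent Concept
# through the Parent's full key.
conn.execute("""
    CREATE TABLE concept (
        language        TEXT    NOT NULL,
        creator         TEXT    NOT NULL,
        year            INTEGER NOT NULL,
        title           TEXT    NOT NULL,
        parent_language TEXT,
        parent_creator  TEXT,
        parent_year     INTEGER,
        parent_title    TEXT,
        PRIMARY KEY (language, creator, year, title),
        FOREIGN KEY (parent_language, parent_creator, parent_year, parent_title)
            REFERENCES concept (language, creator, year, title)
    )
""")

novel = ("English", "Bram Stoker", 1897, "Dracula")
film = ("English", "Tod Browning", 1931, "Dracula")

# The novel is recorded first, with no Parent.
conn.execute(
    "INSERT INTO concept (language, creator, year, title) VALUES (?, ?, ?, ?)",
    novel,
)
# The film names the novel as its Parent.
conn.execute("INSERT INTO concept VALUES (?, ?, ?, ?, ?, ?, ?, ?)", film + novel)

parent_of_film = conn.execute(
    "SELECT parent_language, parent_creator, parent_year, parent_title "
    "FROM concept WHERE language = ? AND creator = ? AND year = ? AND title = ?",
    film,
).fetchone()

# A Concept cannot name a Parent that has not been recorded.
try:
    conn.execute(
        "INSERT INTO concept VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
        ("English", "Someone", 1950, "Untitled", "English", "Nobody", 1900, "Missing"),
    )
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
```

Because the Parent is recorded as a Concept in the same table, no extra entity is needed to express the derivation.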

> But I see what you mean, and I agree: there's no need to multiply
> entities without necessity. Both pieces of information are Concepts and
> both must be recorded as such. Correctly managing the Concepts
> hierarchy, as you explain below, is not difficult.

Yes. I don't promise an "artificial intelligence" system, I promise a Logical one. To the extent that any constraint that is declared by the user, is Logical.

There is a distinction between automating the curator's automatable tasks and providing curator services.

> Ok. At the E-R level I believe that you have nailed things down. I think
> that at this point we might start moving to a key-based model, where it
> will be clearer whether further normalisation is needed.

Yes.

> > So wherever Title is stored, the Atom must be [ Language, Title ],
> > preferable to [ CharSet, Title ].
>
> Then, you can probably assume UTF-8 or UTF-16 and dispose of CharSet. In
> either case, now that I better understand your model, I accept that.
>
> > I don't know if you would agree. My perspective (real world, about 40 years):
> > - what we have been doing in Logical level modelling
> > --- you probably call it "conceptual level"
>
> I am with Codd when he says that the distinction between "conceptual
> level" and "logical level" is fuzzy. Nowadays, I take the two terms as
> synonyms.

There cannot be a conceptual that is not Logical (ok, there can be, but it will not be the kind that you and I care about). Therefore the only Conceptual that I accept is /within/ the domain of Logical. And FOPC is the first order of that Logical.

Therefore, if the domain of Logical were perceived as a spectrum:
- Conceptual would be the left side,
- Physical would be the right side
- with a number of specific stages in-between

But even that does not describe it properly. First, I am totally with Codd. Not just hanging on his every word (not saying you are), but:
- what he actually meant when he said something, and
- how that fits into the grand scheme of creating something (true).
http://www.softwaregems.com.au/Documents/Article/Database/Movie%20Title/Creative%20Act.png

Derek Ignatius Asirvadem

Jan 25, 2020, 8:11:12 AM
> On Saturday, 25 January 2020 22:15:33 UTC+11, Nicola wrote:
> > On 2020-01-25, Derek Ignatius Asirvadem <derek.a...@gmail.com> wrote:
> >
> > ===================
> > This post introduces V0.6
> > ===================
> >
> > http://www.softwaregems.com.au/Documents/Article/Database/Movie%20Title/Movie%20Title%20Progression%20V0_6.pdf
>
> A few thoughts:

(I like where you are going with this ...)

> - Does Concept need to be identified by Country? Couldn't that just be
> [TitleReduced, Creator, Year]? Each Realisation of a Concept, but not
> the Concept itself, is "produced by" a (principal) country. The
> Concept itself is a creation of its Creator at some point in time.
> How about:
>
> Concept [TitleReduced, Creator, Year] -> Title
> Realisation [TitleReduced, Creator, Year, Differentiator] -> Country
>
> ?

Ok.
Person[ ... ] → BirthCountry

We could say
Concept [TitleReduced, Creator, Year] → Country.
Baba Yaga comes from a distinct Country

We could say
Concept [TitleReduced, Creator, Year] → Language.
Baba Yaga comes from a distinct Culture, which is defined by Language which is better than defining it by Country

And then,
Realisation [TitleReduced, Creator, Year, Differentiator] → Concept.Country (parental attribute)

OR, as you say:
Realisation [TitleReduced, Creator, Year, Differentiator] → Country (local attribute).

If that is not perfectly clear, examine this. In:
http://www.softwaregems.com.au/Documents/Documentary%20Examples/Order%20DM%20Advanced.pdf
for Street:
Street[ ... ] → County.Name

Just as
Table[ Key ] → Attribute
If 3NF/"Full" FD is understood AND a Relational Key is used:
Table[ Key ] → Parent.Attribute

> - Besides, I would remove Title from Concept, because that information
> belongs to ConceptTitle. A ConceptTitle can additionally provide all
> the required information about each title.

Disagree. This is correct:
Concept [TitleReduced, Creator, Year] → 1 Title (local attribute)
And
Concept [TitleReduced, Creator, Year] → 0-to-n ConceptTitles (subordinate table)
Not 1-to-n as I have explained elsewhere.

> - Most of the non-identifying relations you have marked with L do not
> need to be explicitly modelled, because, I think, they are implied by
> the various referential integrity constraints.

Yes, for some. No, not for others. Therefore it must be explicit (a model should never be ambiguous).

> What does "SG" stand for in "SG Relational Notation"?

Software Gems/Derek Asirvadem

First, my intellect and the fact that it is protected from the insanity of corrupted "science", is a Grace from God. Second, I am just a disciple of Codd, having implemented his /RM/ faithfully for about 40 years, without reading the pig poop. Therefore, even though I have many inventions re the RM, I state that they are progressions of Codd's work, and not isolated inventions.

To fully understand what I am doing with that Notation, one must consider the FOPC Predicates (the set that is required for an implementation of the RM) in text form. (Some examples have been given previously.) And since we use Relational Keys, everything is stated in terms of English words. That. Have. Meaning.

Customers love the natural use of English, anything that gets them out of the mind-numbing r, s, t with subscripts and superscripts and Greek letters.

There is more to the Notation, that is just the definition, to get started.

> > In the same way that Edition is a manifestation of 1 RealisationTitle
> > (not Realisation), and therefore a single Language, in this model,
>
> Ok.
>
> > I expect to clarify whether Realisation is rendition of 1 ConceptTitle
> > (not Concept), and therefore a single Language.
>
> I'd lean towards answering affirmatively, if we regard a movie dubbed in
> a different language as a different Realisation.

That logic still holds, but /in that instance/ it has been trumped by declaring a Dub as an Edition, not a Realisation.

Ok, so we have:
Concept[ TitleReduced, Creator, Year ] → 1 Title (local attribute)
Concept[ TitleReduced, Creator, Year ] → 0-to-n ConceptTitles (subordinate table)
ConceptTitle[ TitleReduced, Creator, Year, Language_CT, TitleReduced_CT ] → Language_CT[ Title ]
Realisation[ TitleReduced, Creator, Year [, Differentiator ], Language_CT, TitleReduced_CT ]

Time for another one of those progression-of-Codd inventions that is fully-integrated-with-science-and-logic. That PK is getting silly, because both TitleReduced and TitleReduced_CT have a Language. Title is duplicated, only one is required. Resolving all that, this becomes the Primary Key:
Realisation[ Language_CT, TitleReduced_CT, Creator, Year [, Differentiator ] ]

This is Alternated Key Normal Form. It is dependent on (the row must be in) Codd's 3NF/"Full" Functional Dependency ... which I declare as Key Dependency Normal Form. It is not the Date; Darwen; Fagin; et al "3NF", it is more than their "BCNF", "4NF", "5NF", all of which contradict the /RM/, and a bit more articulated than, while remaining strictly within, Codd's.
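The keys listed above can be sketched in SQLite via Python's sqlite3 module. This is only an approximation under stated assumptions: the snake_case names are invented for the sketch, Creator is held as text, the optional Differentiator is modelled as a NOT NULL column defaulting to an empty string, and the alternate key on ConceptTitle is the one pointed out elsewhere in the thread. The Realisation key is built by migrating that alternate key rather than the ConceptTitle primary key.

```python
import sqlite3

conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE concept (
    title_reduced TEXT    NOT NULL,
    creator       TEXT    NOT NULL,
    year          INTEGER NOT NULL,
    title         TEXT    NOT NULL,          -- the 1 Title (local attribute)
    PRIMARY KEY (title_reduced, creator, year)
);
CREATE TABLE concept_title (                 -- 0-to-n per Concept
    title_reduced    TEXT    NOT NULL,
    creator          TEXT    NOT NULL,
    year             INTEGER NOT NULL,
    language_ct      TEXT    NOT NULL,
    title_reduced_ct TEXT    NOT NULL,
    title            TEXT    NOT NULL,
    PRIMARY KEY (title_reduced, creator, year, language_ct, title_reduced_ct),
    UNIQUE (language_ct, title_reduced_ct, creator, year),   -- alternate key
    FOREIGN KEY (title_reduced, creator, year)
        REFERENCES concept (title_reduced, creator, year)
);
CREATE TABLE realisation (                   -- PK formed from the migrated AK
    language_ct      TEXT    NOT NULL,
    title_reduced_ct TEXT    NOT NULL,
    creator          TEXT    NOT NULL,
    year             INTEGER NOT NULL,
    differentiator   TEXT    NOT NULL DEFAULT '',
    PRIMARY KEY (language_ct, title_reduced_ct, creator, year, differentiator),
    FOREIGN KEY (language_ct, title_reduced_ct, creator, year)
        REFERENCES concept_title (language_ct, title_reduced_ct, creator, year)
);
""")

# Illustrative rows only.
conn.execute("INSERT INTO concept VALUES ('dracula', 'Tod Browning', 1931, 'Dracula')")
conn.execute(
    "INSERT INTO concept_title VALUES "
    "('dracula', 'Tod Browning', 1931, 'English', 'dracula', 'Dracula')"
)
conn.execute(
    "INSERT INTO realisation (language_ct, title_reduced_ct, creator, year) "
    "VALUES ('English', 'dracula', 'Tod Browning', 1931)"
)
realisation_count = conn.execute("SELECT COUNT(*) FROM realisation").fetchone()[0]

# A Realisation whose [Language_CT, TitleReduced_CT] is not a recorded
# ConceptTitle is rejected.
try:
    conn.execute(
        "INSERT INTO realisation (language_ct, title_reduced_ct, creator, year) "
        "VALUES ('French', 'dracula', 'Tod Browning', 1931)"
    )
    unknown_title_rejected = False
except sqlite3.IntegrityError:
    unknown_title_rejected = True
```

The Concept remains reachable from a Realisation through ConceptTitle, so dropping the Concept's own TitleReduced from the Realisation key loses nothing in this sketch.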

Cheers
Derek

Nicola

Jan 25, 2020, 1:42:42 PM
On 2020-01-25, Derek Ignatius Asirvadem <derek.a...@gmail.com> wrote:

> Just as
> Table[ Key ] → Attribute
> If 3NF/"Full" FD is understood AND a Relational Key is used:
> Table[ Key ] → Parent.Attribute

Sure, FDs are not necessarily intra-relational.

>> - Besides, I would remove Title from Concept, because that information
>> belongs to ConceptTitle. A ConceptTitle can additionally provide all
>> the required information about each title.
>
> Disagree. This is correct:
> Concept [TitleReduced, Creator, Year] → 1 Title (local attribute)
> And
> Concept [TitleReduced, Creator, Year] → 0-to-n ConceptTitles (subordinate table)
> Not 1-to-n as I have explained elsewhere.

Well, let it be then.

> Ok, so we have:
> Concept[ TitleReduced, Creator, Year ] → 1 Title (local attribute)
> Concept[ TitleReduced, Creator, Year ] → 0-to-n ConceptTitles (subordinate table)
> ConceptTitle[ TitleReduced, Creator, Year, Language_CT, TitleReduced_CT ] → Language_CT[ Title ]

Where [TitleReduced_CT, Language_CT, Creator, Year] is also a key. Nice!

> Realisation[ TitleReduced, Creator, Year [, Differentiator ], Language_CT, TitleReduced_CT ]

Ok.

> Time for another one of those progression-of-Codd inventions that is
> fully-integrated-with-science-and-logic. That PK is getting silly,
> because both TitleReduced and TitleReduced_CT have a Language. Title
> is duplicated, only one is required. Resolving all that, this becomes
> the Primary Key:
> Realisation[ Language_CT, TitleReduced_CT, Creator, Year [, Differentiator ] ]

Fine, by migrating an AK (this is one aspect where the IDEF1X standard
should relax its rules, IMO).

That's nice, because each Realisation is naturally identified by its own
title, even if they realise the same Concept.

Nicola

Derek Ignatius Asirvadem

Jan 26, 2020, 9:24:21 PM
> On Saturday, 25 January 2020 22:56:05 UTC+11, Derek Ignatius Asirvadem wrote:
>
> > I am with Codd when he says that the distinction between "conceptual
> > level" and "logical level" is fuzzy. Nowadays, I take the two terms as
> > synonyms.
>
> There cannot be a conceptual that is not Logical (ok, there can be, but it will not be the kind that you and I care about). Therefore the only Conceptual that I accept is /within/ the domain of Logical. And FOPC is the first order of that Logical.
>
> Therefore Conceptual is the left side of
> If the domain of Logical were perceived as a spectrum:
> - Conceptual would be the left side,
> - Physical would be the right side
> - with a number of specific stages in-between
>
> But even that does not describe it properly. First, I am totally with Codd. Not just hanging on his every word (not saying you are), but:
> - what he actually meant when he said something, and
> - how that fits into the grand scheme of creating something (true).
> http://www.softwaregems.com.au/Documents/Article/Database/Movie%20Title/Creative%20Act.png

(The post got cut off)

At the asylum end of the academic spectrum we have Date; Darwen; Fagin; et al, and their followers.
- Half their life activity is "argument"; sophistry, about fragments of the atoms in the /RM/ that undamaged human beings do not have ... because we have not split Codd's atom into fragments.
- The other half is bleating about what Codd did not do, labouring that the /RM/ is "incomplete" because it does not give rules for database design.

The evidence is, these people have not been to uni, they have not written a single academic paper. If they had, they would know that a paper defines the new article, it is not required to define the larger article, the context that the paper addresses.

Database design and Normalisation existed long before Codd, the /RM/ is not required to define it.

So when Codd states a directive, I take it that he is not rebelling against Logic; the Four Laws; Science; etc, but that his directives are to be taken within that.

All of which are unknown at the TTM Gulag. Which is why they fault Codd for not having taught them how to blow their nose; how to suck their thumbs.

Cheers
Derek

Derek Ignatius Asirvadem

Jan 26, 2020, 9:41:00 PM
> On Sunday, 26 January 2020 05:42:42 UTC+11, Nicola wrote:
> > On 2020-01-25, Derek Ignatius Asirvadem <derek.a...@gmail.com> wrote:
>
> > Just as
> > Table[ Key ] → Attribute
> > If 3NF/"Full" FD is understood AND a Relational Key is used:
> > Table[ Key ] → Parent.Attribute
>
> Sure, FDs are not necessarily intra-relational.

No.
If 3NF/"Full" FD is understood AND a Relational Key is used:
Parent[ ParentKey ] → Parent.Attribute
Child[ ParentKey, Differentiator ] → Child.Attribute
Child[ [ ParentKey ] ] → Parent.Attribute

> >> - Besides, I would remove Title from Concept, because that information
> >> belongs to ConceptTitle. A ConceptTitle can additionally provide all
> >> the required information about each title.
> >
> > Disagree. This is correct:
> > Concept [TitleReduced, Creator, Year] → 1 Title (local attribute)
> > And
> > Concept [TitleReduced, Creator, Year] → 0-to-n ConceptTitles (subordinate table)
> > Not 1-to-n as I have explained elsewhere.
>
> Well, let it be then.

Done.

> > Ok, so we have:
> > Concept[ TitleReduced, Creator, Year ] → 1 Title (local attribute)
> > Concept[ TitleReduced, Creator, Year ] → 0-to-n ConceptTitles (subordinate table)
> > ConceptTitle[ TitleReduced, Creator, Year, Language_CT, TitleReduced_CT ] → Language_CT[ Title ]
>
> Where [TitleReduced_CT, Language_CT, Creator, Year] is also a key. Nice!

Yes. In my courses, having previously taught the /Principle/ that
Each truth is integrated with every other truth
this section of the course is titled /Fun with Relational Keys/.

> > Time for another one of those progression-of-Codd inventions that is
> > fully-integrated-with-science-and-logic. That PK is getting silly,
> > because both TitleReduced and TitleReduced_CT have a Language. Title
> > is duplicated, only one is required. Resolving all that, this becomes
> > the Primary Key:
> > Realisation[ Language_CT, TitleReduced_CT, Creator, Year [, Differentiator ] ]
>
> Fine, by migrating an AK (this is one aspect where the IDEF1X standard
> should relax its rules, IMO).

Yes and no. It is Codd's directive, in his /RM/. IDEF1X is simply implementing the directive. Therefore I would not change that.

What we have here is a finer grain, a progression, /after/ having implemented that rule. Which is why I have a separate NF defined for it.

> That's nice, because each Realisation is naturally identified by its own
> title, even if they realise the same Concept.

All truth is integrated.

Cheers
Derek

Derek Ignatius Asirvadem

Jan 26, 2020, 9:43:02 PM
Nicola

===================
This post introduces V0.7
===================

http://www.softwaregems.com.au/Documents/Article/Database/Movie%20Title/Movie%20Title%20Progression%20V0_7.pdf

The Keys are defined along with some notes at top right. That is getting a bit tedious, but it is required for understanding because I have not produced the IDEF1X/Key level model yet.

> On Monday, 27 January 2020 13:41:00 UTC+11, Derek Ignatius Asirvadem wrote:

All issues determined in that post have been modelled.

Cheers
Derek

Nicola

Jan 27, 2020, 4:03:16 AM
On 2020-01-27, Derek Ignatius Asirvadem <derek.a...@gmail.com> wrote:
> Nicola
>
>===================
> This post introduces V0.7
>===================
>
> http://www.softwaregems.com.au/Documents/Article/Database/Movie%20Title/Movie%20Title%20Progression%20V0_7.pdf

That looks good to me.

Nicola

Derek Ignatius Asirvadem

Jan 30, 2020, 6:43:25 AM
On Monday, 27 January 2020 20:03:16 UTC+11, Nicola wrote:
>
> That looks good to me.

Sorry. I am sick. Can't concentrate.

Cheers
Derek

Derek Ignatius Asirvadem

Feb 14, 2020, 11:38:57 PM
> On Monday, 27 January 2020 20:03:16 UTC+11, Nicola wrote:
> > On 2020-01-27, Derek Ignatius Asirvadem <derek.a...@gmail.com> wrote:
> >
> > This post introduces V0.7
> >
>
> That looks good to me.

Sorry about the delay. Fires. Floods. Sickness.
But by the Grace of God, we don't eat rats; bats; or marmots, and we don't have a plague of locusts.

===================
This post introduces V0.8
===================

Having agreed that V0.7 Table-Relational level is good, there is no /substantial/ difference between V0.7 and V0.8. As you may appreciate, in erecting the Table-Attribute level, minor errors or omissions in the Table-Relation level were exposed, and so there is a new version.

Table-Relational
http://www.softwaregems.com.au/Documents/Article/Database/Movie%20Title/Movie%20Title%20TR%20V0_8.pdf

Table-Attribute (many FKs are in the Attributes, not the Keys)
http://www.softwaregems.com.au/Documents/Article/Database/Movie%20Title/Movie%20Title%20TA%20V0_8.pdf

Cheers
Derek

Derek Ignatius Asirvadem

Feb 14, 2020, 11:47:34 PM
> On Saturday, 15 February 2020 15:38:57 UTC+11, Derek Ignatius Asirvadem wrote:
>
> ===================
> This post introduces V0.8
> ===================
>
> Having agreed that V0.7 Table-Relational level is good, there is no /substantial/ difference between V0.7 and V0.8. As you may appreciate, in erecting the Table-Attribute level, minor errors or omissions in the Table-Relation level were exposed, and so there is a new version.

Not sure if it is possible, but I will ask anyway. I don't have a capable person to check my work right now, and I won't until 06 Mar. There is only so much that self-checking exposes. Could you please:
1. check this at the clerical, not modelling, level. Ie. typos; missed migrations; etc. If any errors I will fix them up.
2. IFF [1] is correct, proceed with ordinary discussion, which is modelling. No point in [2] if there are silly errors.

Cheers
Derek

Derek Ignatius Asirvadem

Feb 17, 2020, 12:26:34 AM
> On Saturday, 15 February 2020 15:47:34 UTC+11, Derek Ignatius Asirvadem wrote:
>
> Not sure if it is possible, but I will ask anyway. I don't have a capable person to check my work right now, and I won't until 06 Mar. There is only so much that self-checking exposes. Could you please;
> 1. check this at the clerical, not modelling, level. Ie. typos; missed migrations; etc. If any errors I will fix them up.
> 2. IFF [1] is correct, proceed with ordinary discussion, which is modelling. No point in [2] if there are silly errors.

Found and fixed such [1] clerical errors, no change to the substance or intent. It does not warrant a new version number. Note the Date 17 Feb 2020.

==================
This post corrects V0.8
==================

Cheers
Derek

Derek Ignatius Asirvadem

Feb 17, 2020, 6:47:35 AM
Nicola

===================
This post introduces V0.9
===================

I have:

1. Language
Fixed up Language throughout, particularly where it counts (this excludes tables which are loosely defined, for which you will probably use UTF-8 and not worry about a declared Language).

2. Keyword
Added clarity, so as to exclude rubbish words from actual Keywords. This is going to be one of the most used search vectors, so I have given it the full definition. (All Hierarchs in a Relational database are "Dimensions" in the "Dimension-Fact" paradigm ...) but this one is rather special, one I developed decades ago, and used in many databases with great success. Here we have:
- Keywords from the Title (in whatever Language the Title is)
- additional Keywords to recall the movie
- without duplication
- prevention of silly words being used as Keywords

Nice little structure for implementing EXCLUDE, which may be of interest to you.
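The four rules above can be sketched in a few lines of Python. This is not the structure in the model (which lives in the database); it is an in-memory approximation, and the function name title_keywords and the contents of EXCLUDED_WORDS are hypothetical.

```python
import re

# Hypothetical exclusion list: the "silly words" that must never be Keywords.
EXCLUDED_WORDS = {"the", "a", "an", "of", "and", "in"}


def title_keywords(title, extra=(), excluded=EXCLUDED_WORDS):
    """Return Keywords for a movie: words from the Title plus any additional
    recall words, without duplication and without excluded words."""
    # \w+ is Unicode-aware for str, so Titles in other Languages are split too.
    candidates = re.findall(r"\w+", title.lower())
    candidates += [word.lower() for word in extra]

    keywords = []
    for word in candidates:
        if word not in excluded and word not in keywords:
            keywords.append(word)
    return keywords


example = title_keywords("The Bride of Frankenstein", extra=["monster", "bride"])
```

Here example is ["bride", "frankenstein", "monster"]: "the" and "of" are excluded, and the second "bride" is not duplicated. In a database the exclusion list would be a table rather than a Python set.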

3. Squeezed
Since you do not have access to an A2 printer, I have set it all out in A3 (each SubjectArea), but that does squeeze things a bit. Sorry.

(In Australia, we just go to the local print shop or stationers, where they can print A2; A1; A0, on media that is cheap paper; expensive paper; vinyl; etc, to suit the budget. Colour A2 costs $5 for cheap; $12 for expensive; $20 and up for vinyl; etc.)

Which means a new Version 0.9:

Table-Relation
http://www.softwaregems.com.au/Documents/Article/Database/Movie%20Title/Movie%20Title%20TR%20V0_9.pdf

Table-Attribute
(many FKs are in the Attributes, not the Keys)
http://www.softwaregems.com.au/Documents/Article/Database/Movie%20Title/Movie%20Title%20TA%20V0_9.pdf

Cheers
Derek

Nicola

Feb 19, 2020, 3:20:34 PM
On 2020-02-17, Derek Ignatius Asirvadem <derek.a...@gmail.com> wrote:
> Nicola
>
>===================
> This post introduces V0.9
>===================

I need a few days before coming back to this at full throttle. Life
sometimes gets in the way.

At a quick glance: Agent is an interesting case, because apparently it
*requires* the use of a surrogate (AgentNo)...

Nicola

Derek Ignatius Asirvadem

Feb 20, 2020, 5:51:59 AM
> On Thursday, 20 February 2020 07:20:34 UTC+11, Nicola wrote:
> > On 2020-02-17, Derek Ignatius Asirvadem <derek.a...@gmail.com> wrote:
>
> I need a few days before coming back to this at full throttle. Life
> sometimes gets in the way.

By all means. We need full throttle on both sides for this one. (When I am sick, usually I can still concentrate, at least for short durations. But not that last time, which I felt quite bad about, I couldn't even implement an FK without screwing up.)

When you do come back, please evaluate just the last model, eg. V0_10, those in-between can be safely skipped.

>>> 1. check this at the clerical, not modelling, level. Ie. typos; missed migrations; etc. If any errors I will fix them up.
>>> 2. IFF [1] is correct, proceed with ordinary discussion, which is modelling. No point in [2] if there are silly errors.

I think I have caught all of [1]. You can proceed directly into argumentation of the logical.

> At a quick glance: Agent is an interesting case, because apparently it
> *requires* the use of a surrogate (AgentNo)...

Yes, of course.

If you think I have declared surrogates are banned, period, that is false. I have declared that surrogates are banned AS A STARTING POINT in the exercise of modelling a database that is intended to comply with the /Relational Model/.

There are only two scientifically valid reasons for a surrogate (as opposed to the anti-scientific, schizophrenic "reasons" of the Date; Darwen; Fagin; et al Gulag). In both cases, because it is a surrogate, it still constitutes a Relational Breach, specifically
a. a breach of the Independent Access Path Rule. It cuts off access from the tables below the breach, to their genuine ancestors (above the breach).
b. and of course the Relational Key Normal Form.

The surrogate breaks the natural Hierarchy, and creates a new Hierarch (in my IDEF1X models, I show this visually, because I render all Hierarchies visually, vertically). Please feel free to discuss in detail.
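To make the access-path point concrete, here is a small sketch in Python's sqlite3. The three-level hierarchy (customer, customer_order, order_item) and its surrogate counterpart (the s_ tables) are invented for illustration and are not part of the movie model. With Relational Keys the lowest table carries its ancestors' keys, so it can be queried by ancestor directly; with surrogates the same question requires walking up through every level.

```python
import sqlite3

conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
-- Relational Keys: each child key contains its parent's key.
CREATE TABLE customer (customer_code TEXT PRIMARY KEY);
CREATE TABLE customer_order (
    customer_code TEXT    NOT NULL REFERENCES customer (customer_code),
    order_no      INTEGER NOT NULL,
    PRIMARY KEY (customer_code, order_no)
);
CREATE TABLE order_item (
    customer_code TEXT    NOT NULL,
    order_no      INTEGER NOT NULL,
    item_no       INTEGER NOT NULL,
    PRIMARY KEY (customer_code, order_no, item_no),
    FOREIGN KEY (customer_code, order_no)
        REFERENCES customer_order (customer_code, order_no)
);

-- Surrogates: each child only knows its immediate parent's id.
CREATE TABLE s_customer (
    customer_id   INTEGER PRIMARY KEY,
    customer_code TEXT NOT NULL UNIQUE
);
CREATE TABLE s_order (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES s_customer (customer_id),
    order_no    INTEGER NOT NULL
);
CREATE TABLE s_item (
    item_id  INTEGER PRIMARY KEY,
    order_id INTEGER NOT NULL REFERENCES s_order (order_id),
    item_no  INTEGER NOT NULL
);

INSERT INTO customer VALUES ('ACME'), ('ZENITH');
INSERT INTO customer_order VALUES ('ACME', 1), ('ZENITH', 1);
INSERT INTO order_item VALUES ('ACME', 1, 1), ('ACME', 1, 2), ('ZENITH', 1, 1);

INSERT INTO s_customer VALUES (10, 'ACME'), (20, 'ZENITH');
INSERT INTO s_order VALUES (100, 10, 1), (200, 20, 1);
INSERT INTO s_item VALUES (1000, 100, 1), (1001, 100, 2), (2000, 200, 1);
""")

# Relational Keys: the ancestor's key is present in the row, no join needed.
direct_count = conn.execute(
    "SELECT COUNT(*) FROM order_item WHERE customer_code = 'ACME'"
).fetchone()[0]

# Surrogates: the same question must navigate both levels above the item.
navigated_count = conn.execute(
    "SELECT COUNT(*) FROM s_item "
    "JOIN s_order USING (order_id) "
    "JOIN s_customer USING (customer_id) "
    "WHERE customer_code = 'ACME'"
).fetchone()[0]
```

Both queries return the same answer; the difference is that the second one cannot be answered from the item table alone.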

> I render all Hierarchies visually, vertically

I have an even more superior method, but I can't show that in a public forum, sorry.

1. The PK has become too wide to be used (migrated) throughout the hierarchy, and so a surrogate is /substituted/ for the PK. This is exemplified in Address and Person. I think you understand that usage.

2. The table is a Basetype, and the Subtypes have quite different PKs, such that a common PK cannot be established. Thus the common PK in the Basetype is manufactured, it is a surrogate. Please feel free to discuss this usage in detail.

In any case, if a surrogate is chosen, it must be late in the modelling exercise, so as not to interfere with the process: the exposure; determination; and deployment of discrete FOPC Facts. Which you have seen a bit of in this exercise, this thread.

-- The corollary is the more relevant point, due to the pig poop promoted and marketed by the Anti-Relational pig poop purveyors. Starting the exercise by stamping a surrogate on every file (they are not tables) cripples the modelling exercise, the exposure; determination; and deployment of discrete Facts.

--------------------------------------
-- This post introduces V0.10 --
--------------------------------------

1. Corrections
- Creator should be CreatorNo

2. Minor but important changes
- RealisationCast renamed RealisationPersonRole
- PK changed, allowing elimination of AK
- the relation
/Each PersonRole plays 0-n RealisationCasts/
changed to
/Each PersonRole fills 0-n RealisationPersonRoles/

3. RealisationPlayer added
/Each RealisationCharacter is played by 0-1 RealisationPlayer/
/Each RealisationPersonRole plays 0-1 RealisationPlayer/

4. I have changed Realisation to Project.
I wanted to propose it; have some discussion; obtain acceptance/rejection; etc ... but with a view to reducing the back-and-forth, given the minor delays on both sides, I have simply implemented it in this version. If you do not like it, I can change it back to Realisation.

/ConceptTitle[ ... ] manifests as Project/
/ProjectTitle [ ... ] is merchandised as Edition/

I considered Materialisation, and rejected it.

5. Table names improved (minor).

Table-Relation
http://www.softwaregems.com.au/Documents/Article/Database/Movie%20Title/Movie%20Title%20TR%20V0_10.pdf

Table-Attribute
http://www.softwaregems.com.au/Documents/Article/Database/Movie%20Title/Movie%20Title%20TA%20V0_10.pdf

Question
-----------
Re working protocol. To reduce my work in erection of the PDFs. As you can imagine, I cannot work (model) at the A3/SubjectArea level, which involves duplication of tables, I have to work at the level of the model on a single page. Otherwise I can make quite preventable mistakes. (Separately, there is no substitute for the visual facility of the model on a single page, even if hundreds of tables.)

For each version, the last thing I do is erect a PDF in A3 (which I understand you need, at least for printing purposes). This is an exercise of getting the SubjectArea with the full context (I won't exclude the context, it renders the SubjectArea invalid) onto an A3 page. Which includes some squeezing, as well as ugliness. No complaint.

So the question is this. While we are going back and forth with the modelling, can I give you just the whole model on one page ? Right now it is A1.

And whenever you need any Version in A3/SubjectArea, please ask. (I don't know what your requirements are ... print always; print only to read on the train; or whatever). Please do not be shy.

Cheers
Derek

Nicola

Feb 20, 2020, 5:08:45 PM
On 2020-02-20, Derek Ignatius Asirvadem <derek.a...@gmail.com> wrote:
>> On Thursday, 20 February 2020 07:20:34 UTC+11, Nicola wrote:
>> > On 2020-02-17, Derek Ignatius Asirvadem <derek.a...@gmail.com> wrote:

> When you do come back, please evaluate just the last model, eg. V0_10,
> those in-between can be safely skipped.

Not yet there, but I'll answer a few points you have raised (for
feedback about the model, I need more time).

>> At a quick glance: Agent is an interesting case, because apparently it
>> *requires* the use of a surrogate (AgentNo)...
>
> Yes, of course.
>
> If you think I have declared surrogates are banned, period, that is
> false. I have declared that surrogates are banned AS A STARTING
> POINT in the exercise of modelling a database that is intended to
> comply with the /Relational Model/.

Good.

> There are only two scientifically valid reasons for a surrogate (as
> opposed to the anti-scientific, schizophrenic "reasons" of the Date;
> Darwen; Fagin; et al Gulag). In both cases, because it is
> a surrogate, it still constitutes a Relational Breach, specifically a.
> a breach of the Independent Access Path Rule. It cuts off access from
> the tables below the breach, to their genuine ancestors (above the
> breach). b. and of course the Relational Key Normal Form.

Agreed.

> The surrogate breaks the natural Hierarchy, and creates a new Hierarch
> (in my IDEF1X models, I show this visually, because I render all
> Hierarchies visually, vertically). Please feel free to discuss in
> detail.

That's clear to me, both the fact that surrogates force you to
"navigate" a model and that a hierarchical visual layout is a good idea.

For the former, I recently had to evaluate an "optimization" of
a somewhat acceptable Relational model for an ideal telephony company,
which consisted in adding an ID attribute to almost every table ("so
that less attributes have to be stored several times"). As
a consequence, queries that would require joining two tables ended up
requiring joining six tables, and inserting data about each new call
passed from a single INSERT to a transaction with a few SELECTs and one
INSERT.

For the latter, circularity in a data model is undesirable for several
reasons (by circularity I mean cyclic referential integrity constraints,
i.e., a FK from R1 to R2, a FK from R2 to R3, ..., a FK from Rn back to
R1):

1. conceptual: a child instance is a descendant of itself;
2. computational: problems such as implication of dependencies easily
become highly complex (even undecidable) in the presence of circular
dependencies;
3. pragmatic: an "egg and chicken" problem requires defining tables in
more steps (e.g., create table A, create table B with FK to A, add FK
from A to B), plus transactions with deferred constraints for
inserting data.
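
The pragmatic point (3) can be reproduced in a few lines. This is only a minimal sketch, assuming SQLite driven from Python; the tables A and B and their columns are hypothetical, and SQLite lets a FK name a table that does not exist yet, so the separate "add FK from A to B" step is folded into the CREATE TABLE here:

```python
import sqlite3

def circular_fk_demo():
    # Autocommit mode: a transaction exists only between our explicit BEGIN/COMMIT.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("PRAGMA foreign_keys = ON")
    # Hypothetical tables A and B, each with a mandatory FK to the other.
    conn.executescript("""
        CREATE TABLE A (
            a_id INTEGER PRIMARY KEY,
            b_id INTEGER NOT NULL
                 REFERENCES B (b_id) DEFERRABLE INITIALLY DEFERRED
        );
        CREATE TABLE B (
            b_id INTEGER PRIMARY KEY,
            a_id INTEGER NOT NULL
                 REFERENCES A (a_id) DEFERRABLE INITIALLY DEFERRED
        );
    """)
    # Outside a transaction neither row can go in first: its parent is missing.
    try:
        conn.execute("INSERT INTO A (a_id, b_id) VALUES (1, 1)")
        lone_insert_rejected = False
    except sqlite3.IntegrityError:
        lone_insert_rejected = True
    # Inside one transaction the deferred FK checks wait until COMMIT.
    conn.execute("BEGIN")
    conn.execute("INSERT INTO A (a_id, b_id) VALUES (1, 1)")
    conn.execute("INSERT INTO B (b_id, a_id) VALUES (1, 1)")
    conn.execute("COMMIT")
    rows_in_a = conn.execute("SELECT COUNT(*) FROM A").fetchone()[0]
    conn.close()
    return lone_insert_rejected, rows_in_a
```

The single INSERT is rejected, and the pair only goes in when both rows are wrapped in one transaction with deferred checking, which is exactly the extra machinery a cycle demands.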

Such things are well studied. Circularity can always be eliminated with
minimal impact on the semantics of a model. Once eliminated, a data
model, viewed as a graph (nodes = entities, edges = associations), is
a DAG (associations are oriented from parent to child). So, it's only
natural to use a hierarchical layout to display such a model. I see that
you make finer distinctions, e.g., by drawing categorizations
(specializations) on the side, and that's fine, too. I am not sure
I understand your usage of colors, though.
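
To make the graph view concrete, here is a small sketch (Python 3.9+, standard-library graphlib) that takes the FK parents of each table and either returns the levels of a hierarchical layout or reports a cycle. The function names are hypothetical; the sample tables are borrowed from the Country example in this thread:

```python
from graphlib import CycleError, TopologicalSorter

def hierarchy_levels(parents):
    """parents maps each table to the set of tables it references via FKs.

    Returns the tables grouped level by level, roots first.
    Raises CycleError if the FK graph is not a DAG.
    """
    sorter = TopologicalSorter(parents)
    sorter.prepare()                      # cycle detection happens here
    levels = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())   # tables whose parents are all placed
        levels.append(ready)
        sorter.done(*ready)
    return levels

def has_cycle(parents):
    """True if the FK graph contains a cycle R1 -> R2 -> ... -> R1."""
    try:
        hierarchy_levels(parents)
        return False
    except CycleError:
        return True

# Sample FK graph: each child lists its parent.
model = {
    "Country": set(),
    "State": {"Country"},
    "County": {"State"},
    "Town": {"County"},
}
```

Each returned level corresponds to one row of tables in a top-down drawing, so the layout falls out of the FK graph directly once cycles are gone.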

I see that you don't eliminate circularity completely, because you
retain "self-loops" (FKs from an entity to itself). Such foreign keys
don't cause problems of type (3) above (and most likely not even
problems of type (2)), but they may be eliminated as well, and it's
often desirable to avoid the unnatural situation of having a tuple
reference itself. For instance, instead of:

Employee(EmployeeNo, Name, ManagerNo)

which forces at least one employee to be her own manager, one may
define:

Employee(EmployeeNo, Name)
Manager(EmployeeNo, ManagerNo)
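
A sketch of the difference, again assuming SQLite from Python. Two things are my assumptions and not part of the notation above: ManagerNo is taken to be mandatory (NOT NULL), and the CHECK that forbids a self-reference in Manager is added for illustration. The names 'Ada' and 'Bob' are made-up sample data:

```python
import sqlite3

def new_db():
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("PRAGMA foreign_keys = ON")
    return conn

def self_loop_forces_self_manager():
    conn = new_db()
    # Assumption: ManagerNo is mandatory.
    conn.execute("""
        CREATE TABLE Employee (
            EmployeeNo INTEGER PRIMARY KEY,
            Name       TEXT    NOT NULL,
            ManagerNo  INTEGER NOT NULL REFERENCES Employee (EmployeeNo)
        )""")
    try:
        # The first row cannot name anyone else: nobody else exists yet.
        conn.execute("INSERT INTO Employee VALUES (1, 'Ada', 2)")
        rejected = False
    except sqlite3.IntegrityError:
        rejected = True
    # The only row that can go in first is one that references itself.
    conn.execute("INSERT INTO Employee VALUES (1, 'Ada', 1)")
    conn.close()
    return rejected

def split_model_top_of_hierarchy():
    conn = new_db()
    conn.executescript("""
        CREATE TABLE Employee (
            EmployeeNo INTEGER PRIMARY KEY,
            Name       TEXT NOT NULL
        );
        CREATE TABLE Manager (
            EmployeeNo INTEGER PRIMARY KEY REFERENCES Employee (EmployeeNo),
            ManagerNo  INTEGER NOT NULL    REFERENCES Employee (EmployeeNo),
            CHECK (ManagerNo <> EmployeeNo)   -- illustrative: no self-management
        );
        INSERT INTO Employee VALUES (1, 'Ada'), (2, 'Bob');
        INSERT INTO Manager  VALUES (2, 1);   -- Bob reports to Ada
    """)
    # Ada heads the hierarchy simply by having no Manager row.
    top = conn.execute("""
        SELECT e.Name
          FROM Employee e
         WHERE NOT EXISTS (SELECT 1 FROM Manager m
                            WHERE m.EmployeeNo = e.EmployeeNo)
    """).fetchall()
    conn.close()
    return top
```

In the split version the top of the hierarchy is expressed by the absence of a Manager row, so no tuple has to reference itself.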

> 1. The PK has become too wide to be used (migrated) throughout the
> hierarchy, and so a surrogate is /substituted/ for the PK. This is
> exemplified in Address and Person. I think you understand that usage.

Yes, that's a pragmatical (not logically necessary) use of a surrogate,
which is introduced out of convenience: a trade-off that should be
carefully evaluated, for the reasons mentioned above.

> 2. The table is a Basetype, and the Subtypes have quite different
> PKs, such that a common PK cannot be established. Thus the common PK
> in the Basetype is manufactured, it is a surrogate. Please feel free
> to discuss this usage in detail.

That is the Agent case, where introducing a surrogate is necessary to be
able to reference heterogeneous entity types. That's a textbook use
case for surrogates, e.g., Elmasri and Navathe's book discusses it (they
call it a "union type", although I don't like the name).

> Starting the exercise by stamping a surrogate on every file (they are
> not tables) cripples the modelling exercise, the exposure;
> determination; and deployment of discrete Facts.

Sure.

> --------------------------------------
> -- This post introduces V0.10 --
> --------------------------------------

I'll come back to that.

As for the format, please know that I am not going to print anything
larger than A3. I don't commute, I am (still) comfortable viewing large
models on a big screen, and I can easily read (good) models split on
several pages, with "collapsed" entities (as you call them) or "shadow"
entities (as the recent IDEF1X revision calls them) that reference
entities from different pages.

Nicola

Derek Ignatius Asirvadem

Feb 21, 2020, 7:48:47 PM
> On Thursday, 20 February 2020 21:51:59 UTC+11, Derek Ignatius Asirvadem wrote:
>
> There are only two scientifically valid reasons for a surrogate ...
> In both cases, because it is a surrogate, it still constitutes a Relational Breach, specifically
> a. a breach of the Independent Access Path Rule. It cuts off access from the tables below the breach, to their genuine ancestors (above the breach).
> b. and of course the Relational Key Normal Form.
>
> The surrogate breaks the natural Hierarchy, and creates a new Hierarch (in my IDEF1X models, I show this visually, because I render all Hierarchies visually, vertically). Please feel free to discuss in detail.

That fact, that it is a Relational Breach, is further confirmed because one cannot use FOPC, the first order in the Logical realm, and therefore cannot use the second order, the /Relational Model/.

More precisely, FOPC *can* be applied (because FOPC can be applied to absolutely anything), *and* the FOPC Predicates that are distilled from an RFS model or surrogated table are FALSE (wrt Reality).

Take an unrelated eg. In the Relational data model, Country cluster:
__Each Country is independent
__Each Country consists of 0-n States
____Each State is dependent on, and identified by, and a constituent of, 1 Country
____Each State is made up of 0-n Counties
______Each County is dependent on, and identified by, and a possession of, 1 State
______Each County has 0-n Towns
each of which is true wrt Reality.

That is a partial set, I have not included the other Predicates, eg:
__Each <row> is identified by <RelationalKey>.
__etc

In the equivalent RFS record model:
__Each Country is independent
__Each Country consists of 0-n States
____Each State is independent <= FALSE
____Each [independent] State is made up of 0-n Counties <= FALSE
______Each County is independent <= FALSE
______Each [independent] County has 0-n Towns <= FALSE
each of which is FALSE wrt Reality, but implemented anyway for some /physical/ purpose.

And:
__Each <record> is identified by <physical_record_no> <= FALSE
__etc

Which is just one of the several reasons the pig poop Gulag avoid mentioning FOPC or Predicates (and if they do, it is a degenerate, misleading form).

A second reason they can't is worth mentioning here. The RFS (even though the RM/T, which supposedly supports it, contains the word /model/) is not a model in the mathematical sense, and therefore FOPC or FOPC Predicates simply cannot be used. Which is why, when the freaks do suggest a Predicate, it is of the degenerate kind (eg. /Person is Called/).

The freaks actually teach this degenerate stuff as "education". This is Hugh Darwen's "computer science/database" course at "university of warwick", for decades. He does a great job of perverting Predicates, avoiding the proper introduction of FOPC. He also perverts /Intension/. If you are in a hurry, look at the result only, § 10:
https://www.dcs.warwick.ac.uk/~hugh/CS252/CS252-HACD-Notes4.pdf

(Every lecture is chock-full of pig poop, serious perversions ... I am directing your attention to a single item: perversion of the concept of Predicate.)

Cheers
Derek

Derek Ignatius Asirvadem

Feb 22, 2020, 1:16:29 AM
> On Friday, 21 February 2020 09:08:45 UTC+11, Nicola wrote:
> > On 2020-02-20, Derek Ignatius Asirvadem <derek.a...@gmail.com> wrote:
> > > On Thursday, 20 February 2020 07:20:34 UTC+11, Nicola wrote:
> > > > On 2020-02-17, Derek Ignatius Asirvadem <derek.a...@gmail.com> wrote:
>
> > The surrogate breaks the natural Hierarchy, and creates a new Hierarch
> > (in my IDEF1X models, I show this visually, because I render all
> > Hierarchies visually, vertically). Please feel free to discuss in
> > detail.
>
> That's clear to me, both the fact that surrogates force you to
> "navigate" a model

Navigate is the right term. Specifically, that the row at the breach must be /additionally/ JOINed.

In a Relational database we can:
SELECT Name, -- Country.Name
________StreetName
____FROM Street
________JOIN Country ON Street.CountryCode = Country.CountryCode
____WHERE StreetName = "Washington"

Whereas, in an RFS falsely labelled "relational", we /have to/ instead:
SELECT Country.Name,
________StreetName
____FROM Street
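________JOIN Suburb ON Street.SuburbId = Suburb.SuburbId
________JOIN Town ON Suburb.TownId = Town.TownId
________JOIN County ON Town.CountyId = County.CountyId
________JOIN State ON County.StateId = State.StateId
________JOIN Country ON State.CountryId = Country.CountryId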
____WHERE StreetName = "Washington"
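
For anyone who wants to run the comparison, here is a minimal sketch using SQLite from Python. It is cut down to three levels (Country, State, Street) to keep it short; the `_R` suffix for the record-ID tables, the StateCode column and the sample rows are all assumptions made for the illustration:

```python
import sqlite3

def country_of_street_demo():
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        -- Relational keys: each child carries its parent's key.
        CREATE TABLE Country (
            CountryCode TEXT PRIMARY KEY,
            Name        TEXT NOT NULL
        );
        CREATE TABLE State (
            CountryCode TEXT NOT NULL REFERENCES Country (CountryCode),
            StateCode   TEXT NOT NULL,
            PRIMARY KEY (CountryCode, StateCode)
        );
        CREATE TABLE Street (
            CountryCode TEXT NOT NULL,
            StateCode   TEXT NOT NULL,
            StreetName  TEXT NOT NULL,
            PRIMARY KEY (CountryCode, StateCode, StreetName),
            FOREIGN KEY (CountryCode, StateCode)
                REFERENCES State (CountryCode, StateCode)
        );
        -- Record-ID variant of the same data.
        CREATE TABLE Country_R (
            CountryId INTEGER PRIMARY KEY,
            Name      TEXT NOT NULL
        );
        CREATE TABLE State_R (
            StateId   INTEGER PRIMARY KEY,
            CountryId INTEGER NOT NULL REFERENCES Country_R (CountryId)
        );
        CREATE TABLE Street_R (
            StreetId   INTEGER PRIMARY KEY,
            StateId    INTEGER NOT NULL REFERENCES State_R (StateId),
            StreetName TEXT NOT NULL
        );
        -- Made-up sample rows.
        INSERT INTO Country VALUES ('US', 'United States');
        INSERT INTO State   VALUES ('US', 'VA');
        INSERT INTO Street  VALUES ('US', 'VA', 'Washington');
        INSERT INTO Country_R VALUES (1, 'United States');
        INSERT INTO State_R   VALUES (10, 1);
        INSERT INTO Street_R  VALUES (100, 10, 'Washington');
    """)
    # Relational keys: Street reaches Country in one JOIN.
    relational = conn.execute("""
        SELECT Country.Name, Street.StreetName
          FROM Street
          JOIN Country ON Street.CountryCode = Country.CountryCode
         WHERE Street.StreetName = 'Washington'
    """).fetchall()
    # Record IDs: every intermediate level must be navigated.
    record_id = conn.execute("""
        SELECT Country_R.Name, Street_R.StreetName
          FROM Street_R
          JOIN State_R   ON Street_R.StateId  = State_R.StateId
          JOIN Country_R ON State_R.CountryId = Country_R.CountryId
         WHERE Street_R.StreetName = 'Washington'
    """).fetchall()
    conn.close()
    return relational, record_id
```

Both queries return the same row; the difference is that the second one has to touch one extra table per level of the hierarchy, so with the six levels above it becomes the five-JOIN query.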

> and that a hierarchical visual layout is a good idea.
>
> For the former, I recently had to evaluate an "optimization" of
> a somewhat acceptable Relational model for an ideal telephony company,
> which consisted in adding an ID attribute to almost every table ("so
> that less attributes have to be stored several times"). As
> a consequence, queries that would require joining two tables ended up
> requiring joining six tables, and inserting data about each new call
> passed from a single INSERT to a transaction with a few SELECTs and one
> INSERT.

Exactly.

In case it helps. The larger problem there is, their OO/ORM mindset, specifically CRUD, which is totally stupid, but very common.

The Transaction part is correct: they should have had Transactions only, for every user function, throughout. They see it as "bad" only because they had a substandard non-architecture, and it is "work" to elevate one corrected user function to semi-standard Transaction.

The client should never read/write directly to tables. It should read only Views, and write (execute) only Transactions. Here is an extremely condensed diagram. Discuss:
http://www.softwaregems.com.au/Documents/Article/Application%20Architecture/Open%20Architecture.pdf

Further discussion re Transactions at:
https://groups.google.com/forum/#!topic/comp.databases.theory/qqmnhu036FQ

----

> For the latter, circularity in a data model is undesirable for several
> reasons (by circularity I mean cyclic referential integrity constraints,
> i.e., a FK from R1 to R2, a FK from R2 to R3, ..., a FK from Rn back to
> R1):
>
> 1. conceptual: a child instance is a descendant of itself;
> 2. computational: problems such as implication of dependencies easily
> become highly complex (even undecidable) in the presence of circular
> dependencies;
> 3. pragmatic: an "egg and chicken" problem requires defining tables in
> more steps (e.g., create table A, create table B with FK to A, add FK
> from A to B), plus transactions with deferred constraints for
> inserting data.
>
> Such things are well studied. Circularity can always be eliminated with
> minimal impact on the semantics of a model. Once eliminated, a data
> model, viewed as a graph (nodes = entities, edges = associations), is
> a DAG (associations are oriented from parent to child). So, it's only
> natural to use a hierarchical layout to display such a model.

Whoa. Most of what you say is correct. But there are great slabs that are missing. And you are sitting squarely in the RFS (RM/T as "relational") mindset, trying to understand the goodness of the Relational Model (application), without yet taking on the Relational mindset. You are following the historical progress of the high end of the academics, which is very very slow (fifty years) and still not the Relational mindset, meaning that they had the Date; Darwen; Fagin pig poop RFS as "relational", and slowly, very slowly, they are finding problems and correcting them. It is slow because it is bottom-up, problem identification and fix-up. (And this is a damn sight better than the 90% of academia that slavishly follow the pig poop, and teach it.)

I invite you to understand the /Relational Model/ deeply, and to approach all issues top-down. It needs to be appreciated that the paper is seminal (dense, every word counts), and the terms are dated (not readily understood today, but well understood in the 1970's; 80's; and 90's).

In the /Relational Model/ § 1.4 Normal Form ...
In defining the pre-requisites ...
"If normalization as described above is to be applicable, the unnormalized collection of relations must satisfy the following conditions:
(1) The graph of interrelationships of the nonsimple domains is a collection of trees."

We understood completely that GRAPH means TREE. The notion of TREE in both the mathematical sense, and the implementation (platform) sense, was well-known. A Directed Acyclic Graph means specifically not cyclic, not circular. The Hierarchical DBMS of the day implemented TREES. From the opening paras in the /RM/, Codd discusses "tree-structured files".

(The Independent/Dependent design is obtained from the Network DBMS.)

That means circular references are prohibited in the /RM/.

Circular references were prohibited in HDBMS. Because they are stupid. Believe me, I spent a full 10 years implementing NDBMS/TOTAL to replace several badly implemented HDBMS ... and the common fault in the badly implemented HDBMS was attempts to form circular references. They did not appreciate that the consequent problems were due to their breaking of a cardinal rule.

To be clear, the directive "all data must be arranged in trees" applies to Identifying Relations. (Non-Identifying Relations have a different purpose, which is obviated in the directive.)

Further discussion (laboured because JKL is not honest) at:
https://groups.google.com/forum/#!topic/comp.databases.theory/5212JwYtip4
https://groups.google.com/forum/#!topic/comp.databases.theory/4ZnIhiX7fnw

Therefore, instead of implementing circular references, as prescribed by the Pig Poop Gulag (and the consequently demanded "deferred constraint checking" forcibly implemented in pus-filled non-platforms such as PusGreNONsql) ... and then finding out decades later that it is stupid ... and then seeking methods to overcome the problems ... which methods are severely limited by being bottom up ... those of us who followed the /RM/ without the contamination of the Pig Poop Gulag and all the "textbooks" written by their slaves, never implemented circular references in the first place, and therefore never suffered the consequent problems.

High end SQL platforms do not have, and do not need "deferred constraint checking".

Now I will respond to each of your points, which are lower level, in turn.

> For the latter, circularity in a data model is undesirable for several
> reasons (by circularity I mean cyclic referential integrity constraints,
> i.e., a FK from R1 to R2, a FK from R2 to R3, ..., a FK from Rn back to
> R1):
>
> 1. conceptual: a child instance is a descendant of itself;
> 2. computational: problems such as implication of dependencies easily
> become highly complex (even undecidable) in the presence of circular
> dependencies;

Prohibited. If implemented, it FAILS RM/Relational Normal Form.

More. Because the pre-requisite to
/RM/ § 1.4 Normal Form/Fig 3(b) Normalised Set
is actually
/RM/ § 1.4 Normal Form/Fig 3(a) Unnormalised Set
which defines the pre-requisite criteria of (a) "tree", which we have previously agreed is a true Normal Form, the obvious name being Hierarchic Normal Form, if a circular reference is implemented, it FAILS RM/Hierarchic Normal Form.

> 3. pragmatic: an "egg and chicken" problem requires defining tables in
> more steps (e.g., create table A, create table B with FK to A, add FK
> from A to B),

Prohibited. Fails RM/Hierarchic Normal Form and RM/Relational Normal Form.

More importantly, for understanding, there are no circular references, no "chicken and egg" problems, in the universe (it is a perfect Hierarchy). If one does manage to conceive a circular reference, it is
(a) a gross error at the intellectual level, of something that is not there (does not exist) and
(b) a confection in the intellect, not in the real world.
So it is doubly stupid to implement that confection, that fantasy ... and then suffer all the consequent problems.

Further, even //having// the "chicken and egg" problem, is a problem that scientists do not have. If we always apply the Four Laws of Thought in our science (both theoretical and applied), we cannot leave it a problem that is not resolved (per the Law of Non-Contradiction and the Law of the Excluded Middle). Maintaining the problem is practically living in the Excluded Middle without resolution (aka insanity).

That is to say, the scientific mind does not have, and cannot have, the "chicken and egg" problem. It is well-known that unscientific minds have problems, and consequent problems squared, that scientific minds do not have.

Again, because you are moving from the bottom, up, fixing a problem that you have accepted as a necessary problem, you might not appreciate the elegance of not having the problem in the first place. Or that that problem does not exist in the universe, and it does not exist in a scientific mind, and has not existed in high end practitioners and high end platforms. For fifty years.

> plus transactions with deferred constraints for
> inserting data.

Transactions are demanded anyway, they cannot be avoided.

"Deferred constraints" and "deferred constraint checking" in the anti-platform is not required.

> Such things are well studied.

By virtue of the evidence, that is completely false.

If the idiots who pass for "academics" and "theoreticians" studied the /RM/ even a little bit, they would not have the problem. So the mountain of evidence is, they have studied the pig poop that is marketed and promoted as "relational", which has the problem as a fixture, and after forty years of hard labour at the gulag, they have found a problem
"ooooo, circular references present consequent problems,
ooo aaa ooo, circular references are undesirable",
and they have brain-dead methods of working with the problem without eliminating it, such as "deferred constain chicken".

> Circularity can always be eliminated with
> minimal impact on the semantics of a model.

No, no, it is the other way around. The model IS FOPC; the model IS Logic; the model IS Semantic. And therefore circular references are visually exposed, and eliminated (by complying with the RM/directive: "the data must be arranged in trees" = DAG = no circular references).

> Once eliminated, a data
> model, viewed as a graph (nodes = entities, edges = associations), is
> a DAG (associations are oriented from parent to child).

No, no, it is the other way around. The model IS a graph; the model IS a tree; the model IS a DAG. DAG means no circular references.

> So, it's only
> natural to use a hierarchical layout to display such a model.

Yes.

All models (not just a corrected one).

Now for a bit more understanding. The layout I have used for each hierarchy is VERTICAL (meaning Identifying Relations are connected at the top of each table, Non-Identifying Relations are connected at the left side), it follows the "organisation chart" layout, which is immediately understood by most people, and that is why I use it in the introductory levels.

However, it has limitations (such as being extremely wide ... an A1 page is required for this rather small model with 57 tables). I do have an advanced layout for depicting hierarchies, which I might give you in a future progression. It does take a bit of getting used to, and in any case, for the understanding of the model, this introductory level must be established first.

----

> I see that
> you make finer distinctions, e.g., by drawing categorizations
> (specializations) on the side, and that's fine, too.

*Category* is original IDEF1X nomenclature. It (and its cardinality) is limited, no one uses it. We use IEEE nomenclature and cardinality. (All modelling tools allow one to choose which { IDEF1X | IEEE } symbols to display on the relation lines).

*Generalisation-Specialisation* is [OMG] OO/ORM pig poop. Relevant only to the myopic mindset of that insanity. Along with the [OMG] UML insanity. It is not Relational terminology, and other than exceptions such as you, who are seeking /what is genuine Relational/, it is Anti-Relational. In any case the definition is too loose and confused for a scientific person to use.

1. One-to-Zero-or-One Relation

Yes.

Whereas a 1::0-n child is always drawn at least one grid increment below the parent, I draw a 1::0-1 child to the right. The term "sibling" often used for this is false (the siblings are the separate 1::0-n child tables, eg. ProjectPersonRole and ProjectCharacter).

The mind works top-to-bottom, left-to-right (unless the person is schizophrenic, wherein the damaged mind reverses the order of everything). So genuine child tables are below the parent and Extensions (the mathematical term) are to the right.

These are usually progressions of events. Take a look at the sequence Reading-Alert-Acknowledgement-Action on page 4:
http://www.softwaregems.com.au/Documents/Student%20Resolutions/Mark%20Simonetti/Mark%20Data%20Model%20V0_23.pdf

I can't draw the Extension of ProjectCharacter, ProjectPlayer, to the right for two reasons: it would exceed the A1 width; and it interferes with the fact that the Non-Identifying Extension of ProjectPersonRole is also ProjectPlayer. (Even this little difficulty is eliminated in my advanced layout.)

2. Basetype-Subtype Cluster
The IDEF1X+IEEE nomenclature is Basetype-Subtype. "Supertype" is false, and as usual, beloved by the Date; Darwen; Fagin; et al pig sty, along with their false "Superkeys"; etc.

A single Basetype-Subtype pair should be viewed as a single logical row.

The Subtypes are dependent on the Basetype, with a [pure Logic] OR or XOR Gate **on the relation line**. The cardinality is 1::1 for each pair. Therefore the Subtypes are drawn to the right of the Basetype, clustered to the right of the OR/XOR Gate.

> I am not sure
> I understand your usage of colors, though.

To begin with, yes, of course I have a standard or starting set of colours. Refer to the Legend at bottom right of the linked DM above. Colour affects the right hemisphere of the brain, and therefore has a huge and immediate effect (one that is not usually registered in the left hemisphere). Use colours with that in mind. Use very mild pastel colours only, to avoid the colour dominating over the logic of the model.

Therefore:
- simple Reference elements are grey
- Reference clusters or Standards are blue (the traditional colour of Authority)
- Identifying elements are green
- Transactions are yellow (starting to get warm)
- TransactionItems are pink (warm)
- History or Audit (change all the time) are light purple

In your model, I use two recognisably different /shades/ of green to differentiate the one Identifying hierarchy:
Concept
Project
Edition
Instance

(
I am assuming that you understand that:
- the Identifying Relations define a data hierarchy
--- the Non-Identifying Relations do not
- each Hierarch (Independent, square corners) is the root of a genuine data hierarchy, a "Dimension" in the "Dimension-Fact" paradigm.
If you appreciate that, you will appreciate that the Relational database is a true data warehouse, as per Codd's declared intent in the /RM/.
)

Variations. For understanding only. Eg. we could colour all the child tables in the Keyword Hierarchy the same colour as Keyword (say slate blue). Eg. we could colour all child tables in the Agent hierarchy the same, light purple. It is not that one colour scheme is right and the other is wrong, it is more about what one wants to emphasise, knowing that colour has a huge effect. I am emphasising the main data hierarchy over the Keyword and Agent hierarchies.

I will change it in the next progression, so that you can evaluate the difference.

> I see that you don't eliminate circularity completely, because you
> retain "self-loops" (FKs from an entity to itself). Such foreign keys
> don't cause problems of type (3) above (and most likely not even
> problems of type (2)), but they may be eliminated as well, and it's
> often desirable to avoid the unnatural situation of having a tuple
> reference itself. For instance, instead of:
>
> Employee(EmployeeNo, Name, ManagerNo)
>
> which forces at least one employee to be her own manager, one may
> define:
>
> Employee(EmployeeNo, Name)
> Manager(EmployeeNo, ManagerNo)

Whoa. Many errors of understanding in that. As with other matters, it addresses the lower levels of a larger problem, without reference to the larger problem, and does so bottom up, and we really should address the whole subject, top down. New thread.

> > 1. The PK has become too wide to be used (migrated) throughout the
> > hierarchy, and so a surrogate is /substituted/ for the PK. This is
> > exemplified in Address and Person. I think you understand that usage.
>
> Yes, that's a pragmatical (not logically necessary) use of a surrogate,
> which is introduced out of convenience: a trade-off that should be
> carefully evaluated, for the reasons mentioned above.

Yes.

It is not for convenience, but for implementation considerations, performance. And if one uses a non SQL non-platform (which is usually fraudulently branded "xxxxxxSQL"), to overcome the non-platform limitations.

(
I reject the notion that theory should be isolated from application, as idiotic, the realm of the asylum. Such theory, is by evidenced fact, fantasy, that is not applicable. The Date; Darwen; Fagin; et al Gulag. People do implement (apply) it, because the Gulag prescribes it, and they suffer the predictable problems. The intent of the Gulag is nothing short of sabotage.

For theory to be valid (not just useful), it must have an application intent. Codd's work is both theory and application.

Likewise application without a foundational theory is dangerous and stupid.
)

> > 2. The table is a Basetype, and the Subtypes have quite different
> > PKs, such that a common PK cannot be established. Thus the common PK
> > in the Basetype is manufactured, it is a surrogate. Please feel free
> > to discuss this usage in detail.
>
> That is the Agent case, where introducing a surrogate is necessary to be
> able to reference heterogeneous entity types. That's a textbook use
> case for surrogates, e.g., Elmasri and Navathe's book

We were doing it for decades before that textbook was written. Possibly decades before Elmasri and Navathe were born. We did have genuine databases; database platforms; Normalisation, long before the /Relational Model/. All of which is suppressed by the Pig Poop Asylum, in order to propagate their anti-Relational RFS as "relational".

I have critiqued the "Alice Book" (Abiteboul, Hull & Vianu), and written it off as pure unadulterated filth, a slavish rendition of the Date; Darwen; Fagin; et al Asylum, a great Pig Poop Manual. I have not critiqued the Elmasri & Navathe book, but I have been told by my protégés that it is in the same category. Neither book promotes or articulates the /RM/, while presenting that they are "relational".

> discusses it (they
> call it a "union type", although I don't like the name).

We can't use that term, it is already established and defined in English, and in SQL as UNION, and in OO/ORM as UnionTypes. This is definitely not a "Union", same as this is definitely not a "Supertype".

Why can't they use the IDEF1X or IEEE term that has been established for fifty and seventy years, respectively? Again, they prove that they are anti-Relational.

This is only a slight variation on a vanilla Basetype-Subtype cluster. Whereas the vanilla version is data as it exists, the slight variation is the Basetype is consciously (not accidentally) manufactured, to establish a Common base for a "heterogeneous" set of Subtypes.
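
As an illustration only, one common way to declare such a cluster in plain SQL is a discriminator column plus a composite FK from each Subtype back to the Basetype. This is a sketch under assumptions, not the declaration used in the V0.10 model: the Organisation subtype, the AgentType codes and the name columns are hypothetical, and the Subtypes' own (quite different) keys are left out for brevity. It runs on SQLite from Python:

```python
import sqlite3

def agent_subtype_demo():
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        CREATE TABLE Agent (
            AgentNo   INTEGER PRIMARY KEY,      -- manufactured common key
            AgentType TEXT NOT NULL CHECK (AgentType IN ('P', 'O')),
            UNIQUE (AgentNo, AgentType)         -- target of the Subtype FKs
        );
        CREATE TABLE Person (
            AgentNo   INTEGER PRIMARY KEY,      -- 1::1 with the Basetype row
            AgentType TEXT NOT NULL CHECK (AgentType = 'P'),
            FullName  TEXT NOT NULL,
            FOREIGN KEY (AgentNo, AgentType)
                REFERENCES Agent (AgentNo, AgentType)
        );
        CREATE TABLE Organisation (
            AgentNo   INTEGER PRIMARY KEY,
            AgentType TEXT NOT NULL CHECK (AgentType = 'O'),
            LegalName TEXT NOT NULL,
            FOREIGN KEY (AgentNo, AgentType)
                REFERENCES Agent (AgentNo, AgentType)
        );
        -- Made-up sample rows.
        INSERT INTO Agent VALUES (1, 'P'), (2, 'O');
        INSERT INTO Person       VALUES (1, 'P', 'Example Person');
        INSERT INTO Organisation VALUES (2, 'O', 'Example Organisation');
    """)
    # Agent 2 is typed 'O', so it cannot also acquire a Person row.
    try:
        conn.execute("INSERT INTO Person VALUES (2, 'P', 'Wrong subtype')")
        wrong_subtype_rejected = False
    except sqlite3.IntegrityError:
        wrong_subtype_rejected = True
    conn.close()
    return wrong_subtype_rejected
```

The composite FK gives the exclusive (XOR) half of the rule: a Basetype row can match at most one Subtype. It does not by itself guarantee that every Agent row has a Subtype row; that half needs a Transaction that inserts the pair together.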

----

Now that AgentNo is understood, I will give you an advancement in the next iteration.

> As for the format, please know that I am not going to print anything
> larger than A3. I don't commute, I am (still) comfortable viewing large
> models on a big screen, and I can easily read (good) models split on
> several pages, with "collapsed" entities (as you call them) or "shadow"
> entities (as the recent IDEF1X revision calls them)

The freaks love to use terms that are false and confusing. The word /shadow/ has a specific meaning, and this is definitely not a shadow.

Collapsed is a correct term because it shows only the table name; type { Independent | Dependent }; and colour. This allows me (and the viewer) to have just one version of the full definition of the table, avoiding duplication, and the consequences of duplication. The reduced space requirement is a minor issue.

Where I provide a fully enabled PDF, I use those capabilities, such that one can click on the collapsed entity and the PDFViewer will take you to the single full definition. Such as in:
http://www.softwaregems.com.au/Documents/Documentary%20Examples/Order%20DM%20Advanced.pdf

Most modelling tools will not allow tables and collapsed tables on the same page, because the presentation of any window or page is based on "show level". IDEF1X does not specify the requirement. I happily place both on the same page, but only when the model is split into multiple pages.

Ok, but you have not answered the question: if you ever want to print the data model (which must be A3, multiple pages), please ask. Is that ok ?

Cheers
Derek

Derek Ignatius Asirvadem

Feb 22, 2020, 9:47:36 PM

Nicola

Feb 24, 2020, 10:44:32 AM
On 2020-02-20, Derek Ignatius Asirvadem <derek.a...@gmail.com> wrote:

>https://www.zerohedge.com/geopolitical/coronavirus-deaths-outside-china-spike-who-team-visits-wuhan
>
>God bless you and your family

Thanks. Things will be quiet this week (universities, schools, gyms,
libraries and other public places are all closed).

I am coming back to discussing the model.

> --------------------------------------
> -- This post introduces V0.10 --
> --------------------------------------
> Table-Attribute
> http://www.softwaregems.com.au/Documents/Article/Database/Movie%20Title/Movie%20Title%20TA%20V0_10.pdf

First, some minor details:

- Typos:
- "Project Constranits" (bottom left);
- "Is Known As" is identifying (Concept -< ConceptTitle and Project -< ProjectTitle);
- "AK.i" (ConceptIdentifier).
- Year and/or TitleReduced are not marked as FKs (in bold) in
ConceptTitle, ConceptCharacter, and ConceptIdentifier.
- Key migrations:
- ComponentType not migrated into ProjectComponent
- missing migrated key for "Project Is Varied As 0-N Projects";
- ProjectTitle should include LanguageCode_P (or TitleReduced_P) in the key.
- ConceptTitle allows one Title per LanguageCode (in the scope of the
same Concept): I don't think that's intentional (see below).
- Others:
- KeywordPermitted is 1:1-N with both ConceptKeyword and
ProjectKeyword: I think that should be 1:0-N (a simple reason is
that a Concept may have no associated Projects).

The rest looks fine to me.

One issue with the term "Project" is that a full database model would
likely include "cataloguing projects", so the term is at risk of being
overloaded. But for this thread I believe it's fine.

I want to propose a few changes to the model, but first I need a better
grasp on titles and languages (again!) as they are currently modeled.

First, when one populates the database, are "reduced" titles assumed to
be in their original language or in the cataloguer's language? For
instance, should an Australian cataloguer insert the title of a new
Concept "Godzilla" as (a) "Gojira" (in Japanese, I assume), (b) "ゴジラ"
(which *should* be the Japanese script for Godzilla, at least according
to Google Translate) or (c) "Godzilla" (in the cataloguer's language)?
Or doesn't it really matter?

I ask because there is no LanguageCode in Concept, so I assume you mean
(c). IMO, that's an unneeded restriction (why not adding LanguageCode?).

Second, according to IMDb, the movie "Gojira ni-sen mireniamu" is
translated into English as "Godzilla 2000" and also as "Godzilla 2000:
Millennium". In my understanding, those would be two titles with the
same LanguageCode. So, shouldn't the key of ConceptTitle include
TitleReduced_C rather than LanguageCode?

> So the question is this. While we are going back and forth with the
> modelling, can I give you just the whole model on one page ? Right
> now it is A1.

If you work more comfortably on one page, sure.

Nicola

Derek Ignatius Asirvadem

Feb 25, 2020, 12:03:10 AM
> On Tuesday, 25 February 2020 02:44:32 UTC+11, Nicola wrote:
> > On 2020-02-20, Derek Ignatius Asirvadem <derek.a...@gmail.com> wrote:
>
> Things will be quiet this **week** (universities, schools, gyms,
> libraries and other public places are all closed).

More like a month or three. Europe is isolating Italy.

> > --------------------------------------
> > -- This post introduces V0.10 --
> > --------------------------------------

> First, some minor details:
>
> - Typos:
> - "Project Constranits" (bottom left);
> - "AK.i" (ConceptIdentifier).
> - Year and/or TitleReduced are not marked as FKs (in bold) in
> ConceptTitle, ConceptCharacter, and ConceptIdentifier.

Thanks.
Yes, those are clerical level mistakes.

> - "Is Known As" is identifying (Concept -< ConceptTitle and Project -< ProjectTitle);

That is not a clerical level mistake. That is unchanged from V0.9. So that is either a matter of understanding or modelling issue (not an error) that needs to be progressed in the next iteration.

In case it is a matter of understanding. First, note that that /difference/ ie.
/Concept is known as ConceptTitle/ (Non-Identifying)
vs the other children of Concept being Identifying, was a conscious discussed choice.

Second, when the **full** PK
Concept[ Creator, Year, TitleReduced ]
is **not** used to Identify the child
ConceptTitle[ Creator, Year, LanguageCode, TitleReduced ]
the relation is Non-Identifying. Of course, the FK is migrated as normal:
/Concept[ Creator, Year, TitleReduced ] is known as ConceptTitle[ Creator, Year, TitleReduced_C ]/
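
The key migration described above can be sketched as follows. This is only a minimal illustration in Python with SQLite, not the V0.10 DDL: the column types are assumed, and the sample values ("Creator X", the year 2000, "Some Title") are placeholders that do not come from the data model. The parent key is migrated into ConceptTitle under the role name TitleReduced_C, while the child's own PK is formed with LanguageCode and its own TitleReduced.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked
conn.executescript("""
CREATE TABLE Concept (
    Creator      TEXT    NOT NULL,
    Year         INTEGER NOT NULL,
    TitleReduced TEXT    NOT NULL,
    PRIMARY KEY (Creator, Year, TitleReduced)
);
CREATE TABLE ConceptTitle (
    Creator        TEXT    NOT NULL,
    Year           INTEGER NOT NULL,
    LanguageCode   TEXT    NOT NULL,
    TitleReduced   TEXT    NOT NULL,  -- the title in LanguageCode
    TitleReduced_C TEXT    NOT NULL,  -- migrated from Concept (role-named)
    PRIMARY KEY (Creator, Year, LanguageCode, TitleReduced),
    FOREIGN KEY (Creator, Year, TitleReduced_C)
        REFERENCES Concept (Creator, Year, TitleReduced)
);
""")

# Placeholder rows: creator and year are hypothetical.
conn.execute("INSERT INTO Concept VALUES (?, ?, ?)",
             ("Creator X", 2000, "Gojira ni-sen mireniamu"))
conn.execute("INSERT INTO ConceptTitle VALUES (?, ?, ?, ?, ?)",
             ("Creator X", 2000, "en", "Godzilla 2000", "Gojira ni-sen mireniamu"))

# A ConceptTitle whose migrated key matches no Concept is rejected.
try:
    conn.execute("INSERT INTO ConceptTitle VALUES (?, ?, ?, ?, ?)",
                 ("Creator X", 2000, "it", "Some Title", "No Such Concept"))
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

title_rows = conn.execute("SELECT COUNT(*) FROM ConceptTitle").fetchone()[0]
```

Because the full parent key does not appear in the child's PK, the relation is Non-Identifying, yet the FK still guarantees that every ConceptTitle belongs to an existing Concept.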


> - Key migrations:
> - ComponentType not migrated into ProjectComponent
> - missing migrated key for "Project Is Varied As 0-N Projects";

Yes. Thanks and sorry.

> - ProjectTitle should include LanguageCode_P (or TitleReduced_P) in the key.

It has both (V0.10).

> - ConceptTitle allows one Title per LanguageCode (in the scope of the
> same Concept): I don't think that's intentional (see below).

Absent discussion, that is intentional on my side. Feel free to expand as a modelling change (as opposed to a drawing mistake).

> - Others:
> - KeywordPermitted is 1:1-N with both ConceptKeyword and
> ProjectKeyword: I think that should be 1:0-N (a simple reason is
> that a Concept may have no associated Projects).

Yes and no. I realise that academics; "academics"; and certainly all the "theoreticians" (there are no theoreticians serving this space) contemplate each FK (indeed every element in the data model) as a single element, isolated from its context, divorced from all other elements. But that is wrong:
- the database is an integrated unit (no element is isolated; every element exists in a context)
- the FK relates to the parent, and to the child
- where there are two FKs, there are two parents with a common child
- so the two relations, and their cardinality, must be viewed as involving three parties (not two times two parties)

The cardinality declared in V0.10 states, a Keyword will not be added unless it has a child in { ConceptKeyword | ProjectKeyword }. Ie. we do not walk up to the database and add Keywords first, then add Titles. No, the user function is, we add Titles, and as a result, if the words do not exist in Keyword, we add the Keyword as part of the Transaction. (Keyword first, in the Transaction sequence.)

The academics; "academics"; and certainly all the "theoreticians", are totally ignorant of ACID Transactions, which is a huge problem. We had ACID Transactions in every *DBMS long before the /RM/, and SQL, the data sublanguage defined in the /RM/ has ACID Transactions. So while it could be said that the /RM/ does not define Transactions, that is irrelevant, because it does not have to define anything other than the /Relational Model/ (eg. it does not define Basetype-Subtypes either, which we also had before). The relevant point is, all DBMS platforms, Relational or otherwise, have ACID Transactions, as a necessary demand for any implementation, and we had better teach it.

(Non-platforms do not have Transactions and various other implementation requirements, let's not discuss them. They just simply should not be used for any academic or teaching purpose.)

> (a simple reason is
> that a Concept may have no associated Projects).

(A Concept does not have Projects:)

/Each ConceptTitle manifests as 0-n Projects/

And where ConceptTitle exists
/Each ConceptTitle is recalled by 1-n Keywords[Permitted]/
And therefore those Keywords
/Each Keyword[Permitted] recalls 1-n (not 0-n) ConceptTitles/

So no problem there.

> The rest looks fine to me.
>
> One issue with the term "Project" is that a full database model would
> likely include "cataloguing projects", so the term is at risk of being
> overloaded. But for this thread I believe it's fine.

(I would much rather model a real world database than a purely academic one for this thread. The former has a veracity that the latter does not have, and the former covers the latter plus more, not less.)

If you mean intake at La Camera Ottica Laboratory, I would prefer to include their functional requirements. So Estate (collection of Articles) is the intake unit ... or "cataloguing project". And we should build a workflow cluster around that. No problem to differentiate Project (Movie Title sense) from "cataloguing project". Feel free to suggest table names.

> I want to propose a few changes to the model,

By all means.

> but first I need a better
> grasp on titles and languages (again!) as they are currently modeled.

By all means.

> First, when one populates the database, are "reduced" titles assumed to
> be in their original language or in the cataloguer's language?

(The "un-reduced" Title is in the row, so the question applies to both Title and TitleReduced.)

For Concept, in the cataloguer's Language (implicit).
For everything below Concept, including ConceptTitle, it is in LanguageCode, which is in the row (explicit).

> For
> instance, should an Australian cataloguer insert the title of a new
> Concept "Godzilla" as
> (a) "Gojira" (in Japanese, I assume),

Hepburn actually, Japanese language in the Roman (be proud!) alphabet.

> (b) "ゴジラ"
> (which *should* be the Japanese script for Godzilla, at least according
> to Google Translate)

It is. Japanese in the Japanese charset (technically East Asians do not have an alphabet, they have only symbols for words).

> or
> (c) "Godzilla" (in the cataloguer's language)?
> Or doesn't it really matter?

In one situation (eg. La Camera Ottica) it matters only if the cataloguer is going to add a Project ... and therefore needs to add only the Concept and ConceptTitle that are required for the Project being added.
- In that situation, they would add only [c] because they have a Project[ ?, ?, en, Godzilla, ? ] in English.

In another situation (a proper curator, doing full research to the extent of their ability, with a view to providing a full definition of the history and geography regarding a Project), they would add all known ConceptTitles (and a bit of description). But not necessarily at the same time that they add the Project and the minimum ConceptTitle.
- In that situation, they would add [c] because they have to add a Project[ ?, ?, en, Godzilla, ? ], and later when they research the history and associations for the new Project, add [a] [b].

Likewise, they would fiddle with the Derivatives and Variants, to form an accurate and full history.

(We discussed Pygmalion ... My Fair Lady earlier.)

> I ask because there is no LanguageCode in Concept, so I assume you mean
> (c).

Yes.

> IMO, that's an unneeded restriction (why not adding LanguageCode?).

Ok. Next iteration.

That means, instead of:
/Each Country produced 0-n Concepts/
/Each Country produced 0-n Projects/
we would have (more logical, less restricted; maintains intent):
/Each CountryLanguage produced 0-n Concepts/
/Each CountryLanguage produced 0-n Projects/
Much nicer.

> Second, according to IMDb, the movie "Gojira ni-sen mireniamu" is
> translated into English as "Godzilla 2000" and also as "Godzilla 2000:
> Millennium". In my understanding, those would be two titles with the
> same LanguageCode. So, shouldn't the key of ConceptTitle include
> TitleReduced_C rather than LanguageCode?

That is a discussion point, that leads to a decision, as opposed to a mistake. The current intention is, we have just one ConceptTitle (and ProjectTitle) in any one Language. So if we need multiple titles per Language ... I would say, the determinant is ...

TitleType, and move that into the PK, rather than TitleReduced_C (the title itself).
ConceptTitle[ ?, ?, en, Original, Godzilla 2000 ]
ConceptTitle[ ?, ?, en, Alternate, Godzilla 2000: Millennium ]
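
A minimal sketch of that proposal, in Python with SQLite. It assumes a ConceptTitle key of (Creator, Year, LanguageCode, TitleType); the column types are assumptions, and the "?" creator and the year 0 stand in for the values elided in the two rows above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE ConceptTitle (
    Creator      TEXT    NOT NULL,
    Year         INTEGER NOT NULL,
    LanguageCode TEXT    NOT NULL,
    TitleType    TEXT    NOT NULL,  -- eg. Original, Alternate
    Title        TEXT    NOT NULL,
    PRIMARY KEY (Creator, Year, LanguageCode, TitleType)
)
""")

CREATOR, YEAR = "?", 0  # placeholders for the elided key values

# Two English titles for the same Concept, told apart by TitleType.
conn.execute("INSERT INTO ConceptTitle VALUES (?, ?, 'en', 'Original', ?)",
             (CREATOR, YEAR, "Godzilla 2000"))
conn.execute("INSERT INTO ConceptTitle VALUES (?, ?, 'en', 'Alternate', ?)",
             (CREATOR, YEAR, "Godzilla 2000: Millennium"))

# A second 'Original' title in the same Language violates the key.
try:
    conn.execute("INSERT INTO ConceptTitle VALUES (?, ?, 'en', 'Original', ?)",
                 (CREATOR, YEAR, "Yet Another Title"))
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

english_titles = conn.execute(
    "SELECT COUNT(*) FROM ConceptTitle WHERE LanguageCode = 'en'"
).fetchone()[0]
```

With this key there may be several titles per Language, but only one of each TitleType per Language.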

> > So the question is this. While we are going back and forth with the
> > modelling, can I give you just the whole model on one page ? Right
> > now it is A1.
>
> If you work more comfortably on one page, sure.

Much. Devoid of duplicates.

But please, do ask for A3 whenever you want a data model that you would like to print.

Cheers
Derek

Derek Ignatius Asirvadem

Feb 25, 2020, 3:46:35 AM
> On Tuesday, 25 February 2020 16:03:10 UTC+11, Derek Ignatius Asirvadem wrote:
> > On Tuesday, 25 February 2020 02:44:32 UTC+11, Nicola wrote:
>
> > - Others:
> > - KeywordPermitted is 1:1-N with both ConceptKeyword and
> > ProjectKeyword: I think that should be 1:0-N (a simple reason is
> > that a Concept may have no associated Projects).
>
> Yes and no. I realise that academics; "academics"; and certainly all the "theoreticians" (there are no theoreticians serving this space) contemplate each FK (indeed every element in the data model) as a single element, isolated from its context, divorced from all other elements. But that is wrong:
> - the database is an integrated unit (no element is isolated; every element exists in a context)
> - the FK relates to the parent, and to the child
> - where there are two FKs, there are two parents with a common child
> - so the two relations, and their cardinality, must be viewed as involving three parties (not two times two parties)
>
> The cardinality declared in V0.10 states, a Keyword will not be added unless it has a child in { ConceptKeyword | ProjectKeyword }. Ie. we do not walk up to the database and add Keywords first, then add Titles. No, the user function is, we add Titles, and as a result, if the words do not exist in Keyword, we add the Keyword as part of the Transaction. (Keyword first, in the Transaction sequence.)
>
> The academics; "academics"; and certainly all the "theoreticians", are totally ignorant of ACID Transactions, which is a huge problem. We had ACID Transactions in every *DBMS long before the /RM/, and SQL, the data sublanguage defined in the /RM/ has ACID Transactions. So while it could be said that the /RM/ does not define Transactions, that is irrelevant, because it does not have to define anything other than the /Relational Model/ (eg. it does not define Basetype-Subtypes either, which we also had before). The relevant point is, all DBMS platforms, Relational or otherwise, have ACID Transactions, as a necessary demand for any implementation, and we had better teach it.
>
> (Non-platforms do not have Transactions and various other implementation requirements, let's not discuss them. They just simply should not be used for any academic or teaching purpose.)

For clarity, teaching purpose, etc. Transactions are best understood from the user perspective (action; function), rather than the database perspective, and then add the understanding of the database requirement. When I (a) discuss or (b) name Transactions, I purposely:
- do NOT use insert/update/delete
- DO use add/modify/drop

So the Transaction discussed is ConceptTitle_Add_tr (because it is independent of Concept_Add_tr, ConceptTitle can be added independently of Concept). It does (in order):
- remove the KeywordsExcluded from Title = TitleReduced
- check that the remaining words are in KeywordsPermitted
- foreach that does not exist in KeywordsPermitted:
--- INSERT Keyword
--- INSERT KeywordPermitted
- then foreach word in TitleReduced
--- INSERT ConceptKeyword
--- INSERT ConceptKeywordTitle
- INSERT ConceptTitle

The order is usually the logical tree, reversed. The database is [C]onsistent at all times, [C]onsistency is not broken for the duration of the Transaction, as some non-platforms prescribe (that is Anti-ACID). We do not need "deferred constraint checking".
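
The steps above can be sketched as a single transaction. This is a rough illustration in Python with SQLite, not the actual ConceptTitle_Add_tr: the tables are cut down to four (Keyword and KeywordPermitted are collapsed into one Keyword table), the keys are simplified, the function name concept_title_add and the sample values are hypothetical, and the FK directions are assumed. The sketch inserts each parent row before its children so that immediate FK checking passes with no deferred constraints; the exact statement order in the real procedure depends on the FK directions in the data model, which are not reproduced here.

```python
import sqlite3

SCHEMA = """
CREATE TABLE KeywordExcluded (Keyword TEXT PRIMARY KEY);
CREATE TABLE Keyword         (Keyword TEXT PRIMARY KEY);
CREATE TABLE ConceptTitle (
    Creator TEXT NOT NULL, Year INTEGER NOT NULL,
    LanguageCode TEXT NOT NULL, Title TEXT NOT NULL,
    PRIMARY KEY (Creator, Year, LanguageCode)
);
CREATE TABLE ConceptKeywordTitle (
    Creator TEXT NOT NULL, Year INTEGER NOT NULL,
    LanguageCode TEXT NOT NULL, Keyword TEXT NOT NULL,
    PRIMARY KEY (Creator, Year, LanguageCode, Keyword),
    FOREIGN KEY (Creator, Year, LanguageCode)
        REFERENCES ConceptTitle (Creator, Year, LanguageCode),
    FOREIGN KEY (Keyword) REFERENCES Keyword (Keyword)
);
"""

def concept_title_add(conn, creator, year, language_code, title):
    """Add a title and its keywords atomically; return the keywords used."""
    excluded = {r[0] for r in conn.execute("SELECT Keyword FROM KeywordExcluded")}
    words = []
    for w in title.lower().split():
        if w not in excluded and w not in words:
            words.append(w)
    with conn:  # commits on success, rolls back on any exception
        for w in words:  # keywords first: add only those not yet known
            conn.execute("INSERT OR IGNORE INTO Keyword VALUES (?)", (w,))
        conn.execute("INSERT INTO ConceptTitle VALUES (?, ?, ?, ?)",
                     (creator, year, language_code, title))
        for w in words:
            conn.execute("INSERT INTO ConceptKeywordTitle VALUES (?, ?, ?, ?)",
                         (creator, year, language_code, w))
    return words

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(SCHEMA)
conn.execute("INSERT INTO KeywordExcluded VALUES ('the')")
conn.commit()

added = concept_title_add(conn, "Creator X", 2000, "en", "The Monster Returns")

# A failing call (duplicate title key) leaves no stray keywords behind.
try:
    concept_title_add(conn, "Creator X", 2000, "en", "Different Words")
except sqlite3.IntegrityError:
    pass
keyword_count = conn.execute("SELECT COUNT(*) FROM Keyword").fetchone()[0]
```

The second call fails on the ConceptTitle key, and the keywords it had already inserted are rolled back with it, so no Keyword exists without a title that uses it.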

> > For
> > instance, should an Australian cataloguer insert the title of a new
> > Concept "Godzilla" as
> > (a) "Gojira" (in Japanese, I assume),
>
> Hepburn actually, Japanese language in the Roman (be proud!) alphabet.

Ie. phonetic. The Japanese pronunciation of "ゴジラ" sounds like "Gojira". The English word is Godzilla, because it is an amalgam of sea monster (they have deities but no unifying God) plus gorilla.

> > (b) "ゴジラ"
> > (which *should* be the Japanese script for Godzilla, at least according
> > to Google Translate)
>
> It is. Japanese in the Japanese charset (technically East Asians do not have an alphabet, they have only symbols for words).

The East Asians are all primitive as F. No alphabet. No CONCEPT of words that are made up of root words plus prefixes and suffixes to denote tense; particles; etc. Just words as concepts, meaning dependent on context (the string of concepts) and words each have a symbol. Therefore they do not have any literature or poetry (the concept is absent). "Simplified" Chinese:
- symbol for TREE is 木 (notice the stick figure tree)
- symbol for garden is 2 x TREES = 庭 (there are 2 x stick figure trees in that, plus house)
- symbol for forest is 3 x TREES = 森 (notice 3 x stick figure trees)

If I want to tell a Chinese to piss off, I have to say the symbol for /bad egg/.

The Japanese, Vietnamese, and Koreans are a bit ahead of the Chinese, because they have a Roman alphabet for phonetic use. The Vietnamese are the best in this pathetic category, because the Jesuits gave them a massive alphabet (Roman plus diacritical marks) for their clicks and sucks.

In any case, their entire total dictionary is only 40,000 words/concepts/symbols. The minimum a person needs to function is 2,000 symbols, for business 4,000. A person is considered /learned/ if they know 6,000 symbols.

That is why learning those stupid "languages" is a simple matter of memory. That is also why East Asian adults can never learn English or Italian, etc: the notion of root words plus suffixes is way too much for them.

Due to this deep intellectual issue, and other deep issues, they have a huge inferiority complex.

Cheers
Derek

Nicola

Feb 25, 2020, 4:20:36 AM
On 2020-02-25, Derek Ignatius Asirvadem <derek.a...@gmail.com> wrote:

>> - "Is Known As" is identifying (Concept -< ConceptTitle and Project -< ProjectTitle);
>
> That is not a clerical level mistake. That is unchanged from V0.9.
> So that is either a matter of understanding or modelling issue (not
> an error) that needs to be progressed in the next iteration.
>
> In case it is a matter of understanding. First, note that that /difference/ ie.
> /Concept is known as ConceptTitle/ (Non-Identifying)
> vs the other children of Concept being Identifying, was a conscious discussed choice.
>
> Second, when the **full** PK
> Concept[ Creator, Year, TitleReduced ]
> is **not** used to Identify the child
> ConceptTitle[ Creator, Year, LanguageCode, TitleReduced ]
> the relation is Non-Identifying. Of course, the FK is migrated as normal:
> /Concept[ Creator, Year, TitleReduced ] is known as ConceptTitle[ Creator, Year, TitleReduced_C ]/

That naming of the attributes has got me confused. Now, I've got it.
Ditto for Project/ProjectTitle.

>> - Others:
>> - KeywordPermitted is 1:1-N with both ConceptKeyword and
>> ProjectKeyword: I think that should be 1:0-N (a simple reason is
>> that a Concept may have no associated Projects).
>
> The cardinality declared in V0.10 states, a Keyword will not be added
> unless it has a child in { ConceptKeyword | ProjectKeyword }.

...unless it has a child in { ConceptKeyword **&** ProjectKeyword }. To
express "one or the other or both (but at least one)", you should make
the associations 0–N and add a footnote to explain that at least one
association must hold for each keyword. The way I understand your model
is that you cannot add a Keyword unless it is used as both
a ConceptKeyword and as a ProjectKeyword.

>> One issue with the term "Project" is that a full database model would
>> likely include "cataloguing projects", so the term is at risk of being
>> overloaded. But for this thread I believe it's fine.
>
> (I would much rather model a real world database than a purely
> academic one for this thread. The former has a veracity that the
> latter does not have, and the former covers the latter plus more, not
> less.)

How about just... Movie (or Moving Image)?

>> First, when one populates the database, are "reduced" titles assumed to
>> be in their original language or in the cataloguer's language?
>
> (The "un-reduced" Title is in the row, so the question applies to both
> Title and TitleReduced.)
>
> For Concept, in the cataloguer's Language (implicit).

>> IMO, that's an unneeded restriction (why not adding LanguageCode?).
>
> Ok. Next iteration.
>
> That means, instead of:
> /Each Country produced 0-n Concepts/
> /Each Country produced 0-n Projects/
> we would have (more logical, less restricted; maintains intent):
> /Each CountryLanguage produced 0-n Concepts/
> /Each CountryLanguage produced 0-n Projects/
> Much nicer.

Ok.

>> Second, according to IMDb, the movie "Gojira ni-sen mireniamu" is
>> translated into English as "Godzilla 2000" and also as "Godzilla 2000:
>> Millennium". In my understanding, those would be two titles with the
>> same LanguageCode. So, shouldn't the key of ConceptTitle include
>> TitleReduced_C rather than LanguageCode?
>
> That is a discussion point, that leads to a decision, as opposed to
> a mistake. The current intention is, we have just one ConceptTitle
> (and ProjectTitle) in any one Language. So if we need multiple titles
> per Language ... I would say, the determinant is ...
>
> TitleType, and move that into the PK, rather than TitleReduced_C (the title itself).
> ConceptTitle[ ?, ?, en, Original, Godzilla 2000 ]
> ConceptTitle[ ?, ?, en, Alternate, Godzilla 2000: Millennium ]

Do you need LanguageCode in the key, though? I'd say that both
LanguageCode and TitleType are determined by the title.

Nicola

Derek Ignatius Asirvadem

Feb 25, 2020, 6:40:50 AM
> On Tuesday, 25 February 2020 20:20:36 UTC+11, Nicola wrote:
> > On 2020-02-25, Derek Ignatius Asirvadem <derek.a...@gmail.com> wrote:
>
> > Second, when the **full** PK
> > Concept[ Creator, Year, TitleReduced ]
> > is **not** used to Identify the child
> > ConceptTitle[ Creator, Year, LanguageCode, TitleReduced ]
> > the relation is Non-Identifying. Of course, the FK is migrated as normal:
> > /Concept[ Creator, Year, TitleReduced ] is known as ConceptTitle[ Creator, Year, TitleReduced_C ]/
>
> That naming of the attributes has got me confused. Now, I've got it.
> Ditto for Project/ProjectTitle.

Sorry. Expanded as TitleReduced_Concept, etc in the next iteration.

> >> - Others:
> >> - KeywordPermitted is 1:1-N with both ConceptKeyword and
> >> ProjectKeyword: I think that should be 1:0-N (a simple reason is
> >> that a Concept may have no associated Projects).
> >
> > The cardinality declared in V0.10 states, a Keyword will not be added
> > unless it has a child in { ConceptKeyword | ProjectKeyword }.
>
> ...unless it has a child in { ConceptKeyword **&** ProjectKeyword }. To
> express "one or the other or both (but at least one)", you should make
> the associations 0–N and add a footnote to explain that at least one
> association must hold for each keyword. The way I understand your model
> is that you cannot add a Keyword unless it is used as both
> a ConceptKeyword and as a ProjectKeyword.

Discussion. I am trying to get you to appreciate higher meanings in cardinality when the DM is taken as a whole, not each relation in isolation. What you are saying is perfectly reasonable in the latter perspective, but not in the former.

(Consider, no need to answer each /why/.)

Why ?

Why, when taking the whole model into account, here the Concept cluster, does that mean that ? Why **and**, not **or** ?

Why, in your statement, does the Concept relations not prevail ?

(Transactions have a bearing on this, they are properly considered in the former realm, ignored in the latter.) I repeat, in Transaction code, the order of INSERTS is usually the reverse of the subject tree. In terms of higher meaning, that means there is no Keyword that is NOT IN either ConceptKeyword{ Direct | Title } **or** ProjectKeyword{ Direct | Title }. (It does NOT mean that for every Keyword, there must be a ConceptKeyword **and** a ProjectKeyword.)

Why, when adding a ConceptTitle via execution of ConceptTitle_Add_tr, does the DM not suffice ? Why do the unrelated ProjectTitle requirements (and then its Keyword requirements) have to be considered ?

There is no Keyword_Add_tr. Keywords & KeywordsPermitted will be added as and when Concepts; ConceptTitles; Projects; ProjectTitles, are added.

In a nutshell, why does the Keyword tree have priority over the Concept (or Project) tree ?

(The above is for discussion.)

> To
> express "one or the other or both (but at least one)", you should make
> the associations 0–N and add a footnote to explain that at least one
> association must hold for each keyword.

Separate point. Generic answer. No. The correct way is to add a pair of Non-Exclusive Subtypes for KeywordPermitted { Concept | Project }.

> >> One issue with the term "Project" is that a full database model would
> >> likely include "cataloguing projects", so the term is at risk of being
> >> overloaded. But for this thread I believe it's fine.
> >
> > (I would much rather model a real world database than a purely
> > academic one for this thread. The former has a veracity that the
> > latter does not have, and the former covers the latter plus more, not
> > less.)
>
> How about just... Movie (or Moving Image)?

Done.

> >> Second, according to IMDb, the movie "Gojira ni-sen mireniamu" is
> >> translated into English as "Godzilla 2000" and also as "Godzilla 2000:
> >> Millennium". In my understanding, those would be two titles with the
> >> same LanguageCode. So, shouldn't the key of ConceptTitle include
> >> TitleReduced_C rather than LanguageCode?
> >
> > That is a discussion point, that leads to a decision, as opposed to
> > a mistake. The current intention is, we have just one ConceptTitle
> > (and ProjectTitle) in any one Language. So if we need multiple titles
> > per Language ... I would say, the determinant is ...
> >
> > TitleType, and move that into the PK, rather than TitleReduced_C (the title itself).
> > ConceptTitle[ ?, ?, en, Original, Godzilla 2000 ]
> > ConceptTitle[ ?, ?, en, Alternate, Godzilla 2000: Millennium ]
>
> Do you need LanguageCode in the key, though? I'd say that both
> LanguageCode and TitleType are determined by the title.

> I'd say that ...
> LanguageCode ... are determined by the title.

1. From our previous discussions, LanguageCode has to go with a Title where the Title may not be in the cataloguer's language.

2. How can LanguageCode be determined by the Title ?
For uneventful presentation (any GUI, any platform ... Localisation), I thought it is the reverse.

> I'd say that ...
> TitleType are determined by the title.

Please explain. Context is of course, our previous discussion: that there are more than one ConceptTitles in a particular Language for a given Concept:

>>> So, shouldn't the key of ConceptTitle include
>>> TitleReduced_C rather than LanguageCode?

I am addressing that problem, by nominating TitleType as the determinant, which is also the determinant of the TitleReduced in the PK.

Cheers
Derek

Derek Ignatius Asirvadem

Feb 27, 2020, 1:59:55 AM
Nicola

======================
This post introduces V0.11
======================

1. Hierarchy & Layout
---------------------------

We discussed Layout, OrgChart (all DMs thus far) vs Hierarchic. This one is Hierarchic Layout. It does take a bit of getting used to, but once gotten, it is preferred. Why ? Because it emphasises the nature of the Hierarchy, ie. all data hierarchies in the model, which are the hierarchies in the real world that we are modelling.

I intended to provide a write-up on hierarchies; their destruction and suppression in the mental enslavement system (otherwise known as "education"). Because the destruction and suppression is so effective, people do not perceive the hierarchies that exist in the universe, and even when I give them a purely Relational database, with all the hierarchies intact (and a nice data model in OrgChart Layout), they still do not appreciate the data hierarchies. Your comments again prove that effectiveness, and I had planned to respond to them. But I have run out of time today.

Please humour me. I am trying to elevate your understanding of data hierarchies. When you first open the doc, view the data model such that it fills the window (your large screen). This would be the lowest magnification so that the whole fits in the window. This is the view of the database from 10,000 metres. Study this view for at least 15 mins before proceeding.

If you do not appreciate the database at this level, you will not be able to fully appreciate it at any other level. The solid lines (barely differentiated from the dashed lines at this distance) denote the Logical Structure of the database, and therefore the physical structure.

http://www.softwaregems.com.au/Documents/Article/Database/Movie%20Title/DM%20at%2010000%20Metres.png

Please spend some time with the new Layout, and compare, before making a decision as to which is best for you. I will produce the next iteration in whatever layout you choose.

2. Changes
---------------

All discussed changes and corrections have been implemented.

3. CreatorName Replaces CreatorNo
-----------------------------------------------

We discussed the method of choosing a surrogate in order to form a Common Key for a disparate ("heterogeneous") set of Subtypes. Here is an advanced method. Whereas CorporationName (the short one as distinct from the FullName which is the official and long one) and CollectiveName are single column, and can easily be used as Keys, Person is not, because PersonName is LastName plus FirstName plus MiddleName. In the previous method a surrogate PersonNo; CorporationNo; CollectiveNo was used.

The advanced method uses
LastName + ", " + FirstName + " " + MiddleName
to form PersonName. We now have a Common Key. Yes, it is de-normalised (which always results in duplicated columns). Yes, it breaks 1NF, but it is a computed or derived column. Acceptable only in a mature DM that has no errors. That is for convenience: the Agent Key migrated throughout is now a much more useful one.

Concept[ [Shaw, George Bernard], 1913, Pygmalion ]
Concept[ Lerner & Loewe, 1956, My Fair Lady ] -- Variant
Movie[ Lerner & Loewe, 1964, en, My Fair Lady, ] -- Collective
MoviePersonRole[ Lerner & Loewe, 1964, en, My Fair Lady, , [Harrison, Rex], Lead Actor ]

It is advanced because it has no surrogate, the Keys are indeed "made up from the data", and therefore the additional SELECT demanded by a surrogate is eliminated. One of the many progressions that can only be found if the /RM/ has been implemented faithfully, and experience of playing with Keys is gained.
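
As a small illustration of the derived PersonName, here is a sketch in Python. The function name person_name is hypothetical, and dropping the separator when MiddleName is empty is an assumption, made so that the result matches the "Harrison, Rex" form shown in the rows above.

```python
def person_name(last_name: str, first_name: str, middle_name: str = "") -> str:
    """Derive PersonName as LastName + ", " + FirstName + " " + MiddleName."""
    name = f"{last_name}, {first_name}"
    if middle_name:  # assumption: no trailing space when there is no MiddleName
        name += f" {middle_name}"
    return name

shaw = person_name("Shaw", "George", "Bernard")
harrison = person_name("Harrison", "Rex")
```

The derived value can then sit alongside CorporationName and CollectiveName as the single-column Common Key for the Agent Subtypes.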

Compare with AddressNo.

4. Colour
------------

I have used a colour scheme to differentiate Concept (belongs to Agent); Movie; Edition; Instance.

For (eg) ConceptKeyword*, I have used the Concept colour, which emphasises its belonging to Concept, instead of slate blue, which would emphasise its belonging to Keyword.

5. Constraint
-----------------

First, we have arrived at that point in the modelling exercise wherein all the tables are constrained by Domain (one meaning of Domain, of the 83 occurrences in the /RM/) and Key. Therefore we have achieved "DKNF", which even its author and RM saboteur Ronald Fagin states as impossible to achieve (he wrote the maffemythical definitions for the anti-Relational "NFs", so he was thinking of such abominations, RFS fraudulently marketed as "relational"). As you have seen, achieving "DKNF" is a no-brainer. Every single database I have written since 1993 is "DKNF".

Second, there are of course more Constraints to be declared for each table. I have started declaring them. Feel free to declare whatever you see fit.

The additional symbols I have used are Extensions to IDEF1X, I do not think they need explanation. However, it does make the DM a bit busy. I do have an Extension that eliminates that issue, which you may be interested in, after we settle down with this one.

When the Constraints are actually complete, we could say we have achieved Fully Constrained Normal Form, "DKNF" + C + C ...

6. Data Model
------------------

http://www.softwaregems.com.au/Documents/Article/Database/Movie%20Title/Movie%20Title%20TA%20V0_11.pdf

> On Tuesday, 25 February 2020 02:44:32 UTC+11, Nicola wrote:
>
> I am coming back to discussing the model.

I was expecting more discussion.

Also, when you check the data model, could you please confirm that you are checking the Predicates (all expressed in the data model in symbols, not in text form).

Cheers
Derek


Nicola

Feb 27, 2020, 3:31:35 AM
On 2020-02-25, Derek Ignatius Asirvadem <derek.a...@gmail.com> wrote:
>> >> - Others:
>> >> - KeywordPermitted is 1:1-N with both ConceptKeyword and
>> >> ProjectKeyword: I think that should be 1:0-N (a simple reason is
>> >> that a Concept may have no associated Projects).
>> >
>> > The cardinality declared in V0.10 states, a Keyword will not be added
>> > unless it has a child in { ConceptKeyword | ProjectKeyword }.
>>
>> ...unless it has a child in { ConceptKeyword **&** ProjectKeyword }. To
>> express "one or the other or both (but at least one)", you should make
>> the associations 0–N and add a footnote to explain that at least one
>> association must hold for each keyword. The way I understand your model
>> is that you cannot add a Keyword unless it is used as both
>> a ConceptKeyword and as a ProjectKeyword.
>
> Discussion. I am trying to get you to appreciate higher meanings in
> cardinality when the DM is taken as a whole, not each relation in
> isolation. What you are saying is perfectly reasonable in the latter
> perspective, but not in the former.
>
> (Consider, no need to answer each /why/.)

The relevant predicates corresponding to your model (V0.10) are:

1. Each KeywordPermitted recalls one or more ConceptKeywords;
2. Each KeywordPermitted recalls one or more ProjectKeywords.

A transaction inserting (in order) into:

- Concept
- ConceptTitle
- Keyword
- KeywordPermitted
- ConceptKeyword

would leave predicate (2) violated at its end. Such a transaction should
be considered correct, though. Hence, the constraint imposed by the
model is too strict. That is easily fixed as you suggest below.
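
To make the two readings concrete, here is a small sketch in plain Python. Each table is reduced to the set of keywords that appear in it, the function names are made up for illustration, and the sample keyword is only an example.

```python
def violations_and_reading(permitted, concept_kw, project_kw):
    """Keywords that break 'at least one ConceptKeyword AND at least one ProjectKeyword'."""
    return {k for k in permitted if k not in concept_kw or k not in project_kw}

def violations_or_reading(permitted, concept_kw, project_kw):
    """Keywords that break 'at least one child in ConceptKeyword OR ProjectKeyword'."""
    return {k for k in permitted if k not in concept_kw and k not in project_kw}

# State after the transaction above: the keyword is used by a Concept only.
keyword_permitted = {"godzilla"}
concept_keyword = {"godzilla"}
project_keyword = set()

and_violations = violations_and_reading(keyword_permitted, concept_keyword, project_keyword)
or_violations = violations_or_reading(keyword_permitted, concept_keyword, project_keyword)
```

Under the "and" reading (predicates 1 and 2 taken together) the keyword is reported as a violation; under the "or" reading it is not.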

>> To
>> express "one or the other or both (but at least one)", you should make
>> the associations 0–N and add a footnote to explain that at least one
>> association must hold for each keyword.
>
> Separate point. Generic answer. No. The correct way is to add
> a pair of Non-Exclusive Subtypes for KeywordPermitted { Concept
> | Project }.

Right. I stand corrected.

>> >> Second, according to IMDb, the movie "Gojira ni-sen mireniamu" is
>> >> translated into English as "Godzilla 2000" and also as "Godzilla 2000:
>> >> Millennium". In my understanding, those would be two titles with the
>> >> same LanguageCode. So, shouldn't the key of ConceptTitle include
>> >> TitleReduced_C rather than LanguageCode?
>> >
>> > That is a discussion point, that leads to a decision, as opposed to
>> > a mistake. The current intention is, we have just one ConceptTitle
>> > (and ProjectTitle) in any one Language. So if we need multiple titles
>> > per Language ... I would say, the determinant is ...
>> >
>> > TitleType, and move that into the PK, rather than TitleReduced_C (the title itself).
>> > ConceptTitle[ ?, ?, en, Original, Godzilla 2000 ]
>> > ConceptTitle[ ?, ?, en, Alternate, Godzilla 2000: Millennium ]
>>
>> Do you need LanguageCode in the key, though? I'd say that both
>> LanguageCode and TitleType are determined by the title.
>
>> I'd say that ...
>> LanguageCode ... are determined by the title.
>
> 1. From our previous discussions, LanguageCode has to go with a Title
> where the Title may not be in the cataloguer's language.

> 2. How can LanguageCode be determined by the Title?
> For uneventful presentation (any GUI, any platform ... Localisation),
> I thought it is the reverse.

Upon reflection, yes, I am wrong. To properly track titles in multiple
languages, LanguageCode must be part of the key.

>> I'd say that ...
>> TitleType are determined by the title.
>
> Please explain. Context is of course, our previous discussion: that
> there are more than one ConceptTitles in a particular Language for
> a given Concept:

The title type is a discriminator for titles (a title is "original" xor
"alternate" xor "preferred" ...).

Now I see that there's a v11. I'll continue the discussion there.

Nicola

Derek Ignatius Asirvadem

Feb 27, 2020, 4:18:56 AM
> On Thursday, 27 February 2020 17:59:55 UTC+11, Derek Ignatius Asirvadem wrote:
>
> 5. Constraint
> -----------------
>
> [...]
>
> Second, there are of course more Constraints to be declared for each table. I have started declaring them. Feel free to declare whatever you see fit.
>
> [...]
>
> When the Constraints are actually complete, we could say we have achieved Fully Constrained Normal Form, "DKNF" + C + C ...
>
> Also, when you check the data model, could you please confirm that you are checking the Predicates (all expressed in the data model in symbols, not in text form).

>> An earlier post ...
>> [mess, improvable] ...
>> [contradictory data] ...
>> [recorded & lost] ...
>> ["linked data models"] ...
>> "ontology", which will put every piece of data within a well defined
>> hierarchy.

>> and a "description logics"

The larger goal is, a fully defined Relational database, based on FOPC and the /RM/, which is therefore stable and fully integrated [all senses], plus a full set of Open Architecture Transactions (which is the Database API), and serves all needs, instead of:
- Record Filing System with zero integrity, refactored frequently
- plus a middleware layer that attempts to do what Transactions do, badly
- plus an "ontology", which has to change for every refactor
- plus a "description logics"

Thus for Constraints, I would like you to ensure that all of them are declared, such that we can work through any and all difficulties, such that all functions in the RFS+MW+O+DL are covered, and the imbecile OO/ORM method can be seen for what it is.

In case I have not stated it before, the Predicates are a powerful feedback loop for checking the veracity of the graphical data model. Ie. validation. If there is any difficulty at all in /reading/ them from the DM, please ask, and I will furnish a text version.

Cheers
Derek

Derek Ignatius Asirvadem

Feb 27, 2020, 4:33:21 AM
> On Thursday, 27 February 2020 19:31:35 UTC+11, Nicola wrote:
> > On 2020-02-25, Derek Ignatius Asirvadem <derek.a...@gmail.com> wrote:
>
> The relevant predicates corresponding to your model (v.10) are:
>
> 1. Each KeywordPermitted recalls one or more ConceptKeywords;
> 2. Each KeywordPermitted recalls one or more ProjectKeywords.
>
> At the end of a transaction inserting (in order) into:
>
> - Concept
> - ConceptTitle
> - Keyword
> - KeywordPermitted
> - ConceptKeyword
>
> would result in predicate (2) to be violated. Such a transaction should
> be considered correct, though. Hence, the constraint imposed by the
> model is too strict.

You are correct, the V0.10 DM is incorrect.

> That is easily fixed as you suggest below.
>
> >> To
> >> express "one or the other or both (but at least one)", you should make
> >> the associations 0–N and add a footnote to explain that at least one
> >> association must hold for each keyword.
> >
> > Separate point. Generic answer. No. The correct way is to add
> > a pair of Non-Exclusive Subtypes for KeywordPermitted { Concept
> > | Project }.
>
> Right. I stand corrected.

There is nothing, absolutely nothing, that cannot be defined in FOPC. The tendency of OO/ORM types is to put everything in the middle tier, and nothing in the RFS. Switching to putting everything in the database does require conscious effort.

> >> I'd say that ...
> >> TitleType are determined by the title.
> >
> > Please explain. Context is of course, our previous discussion: that
> > there are more than one ConceptTitles in a particular Language for
> > a given Concept:
>
> The title type is a discriminator for titles (a title is "original" xor
> "alternate" xor "preferred" ...).

Ok. At this stage it is not a Discriminator, it is a Classifier. As a Reference table, the referencing row can only reference a single row in the referenced table, so sure, it has a "discriminating" effect, same as a classifying effect. But we can't use the term Discriminator because it means a specific thing: the determinant for an Exclusive Subtype cluster.

So, TitleType is a Classifier, of which only one can be referenced by any one ConceptTitle or MovieTitle. TitleType determines the Title, much the same as LanguageCode does.

Cheers
Derek

Nicola

Feb 27, 2020, 7:10:38 AM
With a model this large, perhaps this new layout is preferable. Among
other things, it tends to be narrower and taller, which is probably better
suited to the way we read, e.g., books.

That looks good to me. As we agree that data is manipulated only through
transactions, I don't see issues with such an approach. Besides, which
duplicates can you have? Person still has the AK - but I'll come back to
that AK below (*).

A common complaint with taking the step you take is that "the key is
larger" => "indexes will be larger" => "searches, and especially
updates, will be more expensive and the database will occupy more
space".

You are correctly pointing out that the surrogate requires an additional
SELECT because data is searched through real data values anyway. The
purported advantage is that *additional* queries will benefit from the
surrogate, i.e., you retrieve it once and you use it several times, so
that further queries (e.g., in the same transaction) will filter using

...where id = X;

rather than

...where pk_attr1 = Y and pk_attr2 = W and ... and pk_attr_n = Z;

And you will join on one attribute rather than many.
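
To make the purported advantage concrete, here is a minimal sketch using SQLite from Python. The table names, column names and sample rows are hypothetical. The first SELECT searches by real data values to obtain the surrogate; the later query filters on the surrogate alone.

```python
import sqlite3


def make_db():
    """Hypothetical schema: Person has a surrogate id plus a natural AK."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE Person (
            id        INTEGER PRIMARY KEY,
            LastName  TEXT NOT NULL,
            FirstName TEXT NOT NULL,
            BirthDate TEXT NOT NULL,
            UNIQUE (LastName, FirstName, BirthDate)
        );
        CREATE TABLE Credit (
            PersonId INTEGER NOT NULL REFERENCES Person (id),
            Movie    TEXT NOT NULL,
            PRIMARY KEY (PersonId, Movie)
        );
        """
    )
    return conn


def movies_of(conn, last_name, first_name, birth_date):
    """Look the person up by data values once, then reuse the surrogate."""
    row = conn.execute(
        "SELECT id FROM Person"
        " WHERE LastName = ? AND FirstName = ? AND BirthDate = ?",
        (last_name, first_name, birth_date),
    ).fetchone()
    if row is None:
        return []
    (person_id,) = row
    # Every further query in the same unit of work filters on one attribute.
    rows = conn.execute(
        "SELECT Movie FROM Credit WHERE PersonId = ? ORDER BY Movie",
        (person_id,),
    ).fetchall()
    return [r[0] for r in rows]
```

Note that the first SELECT is unavoidable either way: it is the additional lookup that the surrogate demands.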

Other (perhaps more interesting) reasons in favor of surrogates have
been elaborated in Hall, Owlett and Todd's "Relations and Entities",
which Codd acknowledges as a major inspiration for RM/T and summarizes
in his 1979 paper, §4. Feel free to discuss (Codd explains three
difficulties with "user-controlled" keys—more details are in the other
papers). As the paper is not easy to find, here's a temporary link:

https://send.firefox.com/download/57050c8af86ed25e/#SIg0c7Uuovr7dRMFu6s_Ww

One of the motivating examples in the above mentioned paper is this:
consider an Employee entity with primary key EmployeeNo. Suppose that an
EmployeeNo is updated in the database: does that mean that an employee
has been fired and a new employee has been hired or that the employee
number of a current employee has changed? The point here is the semantic
ambiguity of the update operation on a key (am I recording a change of
the key value for a single entity or am I replacing an entity with a new
one having the same descriptors?).

Note, again, that I am using "surrogate" as "entity identifier"
(following the authors above), *not* as "record identifier" (as I said,
I am not interested in the latter). In particular, surrogates are never
updated (and possibly never deleted either).

(*) Regarding the now AK of Person: you may want to record data about
persons whose birth date, birth town, etc. are not known. That might be
another case for a surrogate. Here is some data:

https://blog.aniljohn.com/2013/06/how-to-choose-attributes-to-uniquely-identify-a-person.html

> Compare with AddressNo.

Why wouldn't you do the same as for Agent?

> 4. Colour
> ------------
>
> I have used a colour scheme to differentiate Concept (belongs to
> Agent); Movie; Edition; Instance.
>
> For (eg) ConceptKeyword*, I have used the Concept colour, which
> emphasises its belong to Concept, instead of slate blue, which would
> emphasise its belonging to Keyword.

Ok.

> 5. Constraint
> -----------------
>
> First, we have arrived at that point in the modelling exercise wherein
> all the tables are constrained by Domain (one meaning of Domain, of
> the 83 occurrences in the /RM/) and Key. Therefore we have achieved
> "DKNF", which even its author and RM saboteur Ronald Fagin states as
> impossible to achieve

That depends on the class of constraints you consider, and also on the
problem you want to solve. Many problems related to first-order logic
are undecidable. Constraints that are encountered in practice are
usually a very restricted subset of what you can express in FO logic, so
they are typically much easier to deal with. As a simple example, Fagin
shows that, if the only constraints you consider are functional
dependencies, then any schema may be easily transformed into a DKNF
database schema (with inclusion dependencies). But functional
dependencies are constraints of the form:

forall x,y,z . R(x,y) and R(x,z) -> y=z

which is a very, very special kind of FO formula.
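
As a small illustration of how restricted that kind of formula is, checking a functional dependency against a given set of rows takes only a few lines. This is a sketch in Python; the function name and the representation of rows as dictionaries are my own choices, and the sample rows reuse the two English titles discussed earlier in the thread.

```python
def satisfies_fd(rows, lhs, rhs):
    """Return True if the functional dependency lhs -> rhs holds in rows.

    rows is a list of dicts; lhs and rhs are lists of attribute names.
    """
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        # Two rows agreeing on lhs must agree on rhs.
        if seen.setdefault(x, y) != y:
            return False
    return True


titles = [
    {"LanguageCode": "en", "TitleType": "Original",
     "Title": "Godzilla 2000"},
    {"LanguageCode": "en", "TitleType": "Alternate",
     "Title": "Godzilla 2000: Millennium"},
]
```

With these rows, LanguageCode alone does not determine Title, whereas the pair (LanguageCode, TitleType) does.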

Also, note that "undecidable" means that there is no general algorithm
to solve the problem. That doesn't mean that the instances of that
problem which you encounter in practice cannot be solved.

> (he wrote the maffemythical definitions for the
> anti-Relational "NFs", so he was thinking of such abominations, RFS
> fraudulently marketed as "relational"). As you have seen, achieving
> "DKNF" is a no-brainer. Every single database I have written since
> 1993 is "DKNF".

That may well be the case. "Your" DKNF is different from the
maffemythical (whatever that means) definition of Fagin's "DKNF". You
haven't said exactly what "your" DKNF means (that's probably also the
issue Fagin had when you wrote to him). My impression is that it can be
explained as a collection of design principles and a holistic view of
the data model (most normal forms, DKNF included, totally lack the
latter).

> Second, there are of course more Constraints to be declared for each
> table. I have started declaring them. Feel free to declare whatever
> you see fit.

Could you clarify with an example why TitleType must be in the key of
ConceptTitle and ProjectTitle?

I believe that my question "How to identify a movie?" is now answered.
One thing I'd still change in the model, though: I'd turn the
associations "Is Varied As" and "Is Derived As" into (associative)
entities. The reason is that one would want to add more information
related to such associations (e.g., data about sources that justify
a particular relationship between movies).

> The additional symbols I have used are Extensions to IDEF1X, I do not
> think they need explanation. However, it does make the DM a bit busy.
> I do have an Extension that eliminates that issue, which you may be
> interested in, after we settle down with this one.
>
> When the Constraints are actually complete, we could say we have
> achieved Fully Constrained Normal Form, "DKNF" + C + C ...

Forget about DKNF and call it FCNF then!

> I was expecting more discussion.

I hope I am catching up. Over the next few days I'll review your
constraints and validate the model against the examples from the FIAF
manual.

> Also, when you check the data model, could you please confirm that you
> are checking the Predicates (all expressed in the data model in
> symbols, not in text form).

Sure I do.

Nicola

Derek Ignatius Asirvadem

Feb 29, 2020, 4:41:08 AM
> On Thursday, 27 February 2020 23:10:38 UTC+11, Nicola wrote:
> > On 2020-02-27, Derek Ignatius Asirvadem <derek.a...@gmail.com> wrote:
> >
> > It is advanced because it has no surrogate, the Keys are indeed "made
> > up from the data", and therefore the additional SELECT demanded by
> > a surrogate is eliminated. One of the many progressions that can only
> > be found if the /RM/ has been implemented faithfully, and experience
> > of playing with Keys is gained.
>
> That looks good to me. As we agree that data is manipulated only through
> transactions, I don't see issues with such an approach. Besides, which
> duplicates can you have? Person still has the AK - but I'll come back to
> that AK below (*).

Ok, I'll respond to this at that stage.

> A common complaint with taking the step you take is that "the key is
> larger" => "indexes will be larger" => [...] and especially
> updates, will be more expensive and the database will occupy more
> space".

1.
> the key is larger"

No, the Key is added.
Yes, it is larger than a surrogate.

> => "indexes will be larger" => "searches, and especially
> updates, will be more expensive and the database will occupy more
> space".

That was a valid consideration in the days when we counted disk space in megabytes and RAM in kilobytes, **and** we did not have caches in the server that cached the indices. It lost its relevance when we moved to counting in gigabytes & megabytes. It is completely irrelevant in servers that have caches and cache their indices. (I have performed many benchmarks ... which, although for other, specific purposes, directly and indirectly prove this.) The generic cost became irrelevant. Now that we are in the days of counting in terabytes & gigabytes, the irrelevant issue has moved to being doubly irrelevant.

Either a surrogate or a made-up Key is always:
- one additional column
- plus one additional index

For the specific cost, yes, an additional column (wide or not) and an additional index cost more than not having them. But it is the cost of a trade-off which is carefully considered: it is done precisely because it increases Relational Power (fewer JOINs). So whatever the cost is (not denied), it is traded off against the overall lower cost of access.

For those who suggest otherwise, just run a benchmark for them, and they shut up. It is arguing in the tiny corners of the Bell curve ... that cast doubt on a consciously chosen solution ... without attacking it squarely.

That is for commercial SQL platforms. For the mickey mouse program suites (they do not have a server architecture, they are not platforms), sure, they might cack themselves.

> especially
> updates, will be more expensive

Assuming you mean update of the Key (which might be worth discussing), and not update of the row (re which the statement is false). Well, the idea is to choose stable Keys (there is no such thing as an immutable Key, so let's not discuss that impossibility). Which means update of the Key is rare (only when the Identifying properties of the Fact in the real world change ... eg. Mary Smith becomes Mary Brown after marriage).

Updating the Key is **not** more expensive in the way you mean (see above).

Updating the Key in such circumstances is, yes, more expensive and onerous, but not in the way you mean. Because all the occurrences of the migrated Key need to be changed, where such is required, one must write a Transaction (we call these Batch Transactions) to:
- add the entire tree below the affected Key, with the new Key value
- drop the entire tree below the affected Key
- this is done in several loops (WHILE 1=1), the number of which equals the depth of the tree
- in batches of 500 or 1000 rows (SET ROWCOUNT 500)
- BREAKing out when the branch is empty
Otherwise you would strangle the server (affect all online users) **and** blow the transaction log file, which is suicide (the code blocks itself).
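
The shape of such a Batch Transaction can be sketched roughly as follows. This is not the Sybase code (which would use WHILE 1=1 and SET ROWCOUNT as described); it is a minimal Python/SQLite illustration in which LIMIT stands in for SET ROWCOUNT, the tree is only one level deep, and the table and function names are hypothetical.

```python
import sqlite3


def make_db():
    """Hypothetical two-level tree: Person, with Credit as its child."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(
        """
        CREATE TABLE Person (PersonName TEXT PRIMARY KEY);
        CREATE TABLE Credit (
            PersonName TEXT NOT NULL REFERENCES Person (PersonName),
            Movie      TEXT NOT NULL,
            PRIMARY KEY (PersonName, Movie)
        );
        """
    )
    return conn


def change_key(conn, old, new, batch=500):
    """Move the tree below `old` to `new`, one short transaction per batch.

    Returns the number of non-empty batches processed.
    """
    with conn:  # add the new parent first, so children can reference it
        conn.execute("INSERT INTO Person (PersonName) VALUES (?)", (new,))
    batches = 0
    while True:
        with conn:  # each batch commits on its own, keeping locks short
            rows = conn.execute(
                "SELECT Movie FROM Credit WHERE PersonName = ? LIMIT ?",
                (old, batch),
            ).fetchall()
            if not rows:
                break  # the branch is empty
            conn.executemany(
                "INSERT INTO Credit (PersonName, Movie) VALUES (?, ?)",
                [(new, movie) for (movie,) in rows],
            )
            conn.executemany(
                "DELETE FROM Credit WHERE PersonName = ? AND Movie = ?",
                [(old, movie) for (movie,) in rows],
            )
        batches += 1
    with conn:  # finally drop the old parent row
        conn.execute("DELETE FROM Person WHERE PersonName = ?", (old,))
    return batches
```

A deeper tree would need one such loop per level, working from the top down for the adds and from the bottom up for the drops.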

> => "indexes will be larger" => "searches,

False.
Any normal searches would be against the components of the AK (not the surrogate, not the made-up Key).
Any searches wrt JOINs will be much faster for either a surrogate (vs the full AK) or a made-up Key (vs the full AK).

2.
The real cost (actually worth consideration) related to these issues is actually whether the rows have a good, Relational index (meaning Relational Key of many columns), such that the INSERT/UPDATE/DELETEs are spread across the table, such that there is no contention (high concurrency). Imbeciles who propose surrogates do not understand this.

A surrogate guarantees a "hot spot" in a particular location in the physical table (if IDENTITY or AUTOINCREMENT is used, that is the current last page), (a) with all INSERTs fighting over the last page, (b) while holding locks on other resources, (c) guaranteeing lock contention and (d) slowness (low concurrency).

There are two common methods that are used to alleviate this, and both are non-starters, but still commonly used.

(Due to my high performance profile, when I do use a horrible surrogate, yes, I have a solution that I have not seen others use. I submitted it to the Sybase Engineers, who loved it, but then they were acquired by SAP, which uses 100% RFS and 1960's Batch Processing ... SAP is not a genuine online system.)

> You are correctly pointing out that the surrogate requires an additional
> SELECT because data is searched through real data values anyway. The
> purported advantage is that *additional* queries will benefit from the
> surrogate, i.e., you retrieve it once and you use it several times, so
> that further queries (e.g., in the same transaction) will filter using
>
> ...where id = X;
>
> rather than
>
> ...where pk_attr1 = Y and pk_attr2 = W and ... and pk_attr_n = Z;
>
> And you will join on one attribute rather than many.

1. No one actually does that.
So it is science fiction about what could be if I were a unicorn, and your hair was green. We can't enhance the performance (or prevent poor performance) of a circumstance that does not happen. And I certainly won't impede the performance of circumstances that **do** happen for the sake of some nonsense that does not happen.

2. At the performance level,
It is 2020, SQL servers do not have that issue (they did in the early days, the 1980's.)

In fact, the opposite is true. In true servers, when a query is processed, a Query Tree is built first, and it is based on Statistics. Then the Query Tree is executed, and that is based on cache contents. The unit that is cached is a Page (2048 bytes default in Sybase, normal for OLTP, up to 16384 bytes configurable for DW). Therefore, the Pages in cache are retained (more useful for the next query by any user) when rows are physically clustered based on a Relational Key (refer CLUSTERED INDEX). Something that is not possible with a surrogate, because:
- serial access (any DW usage, eg. SUM() for a particular Customer's InvoiceIDs): has to take long jumps across the entire table, the relevant portion of each Page is very small
- online access: all users fight over the last Page, the cache is unused

Whereas for a Relational Key:
- serial access: all Pages containing Invoices for a given Customer{ No | Code | Name } are located in a small physical area of the table, a large portion of the Pages are useful, and the entire table is **NOT** read
- online access: all users are spread nicely across the table

3. At the coding level
Irrelevant. Come on. SQL is a data sublanguage. It is, must needs be, a low level language (equivalent to Assembler or MACRO-32 or awk). As such, it is necessarily cumbersome. Only idiots (such as "academics" [which you are not]; and our cursed "theoreticians") expect SQL to be high level, or expect high level functionality from a low level language.

It is 2020. Get an IDE. Even the free ones eliminate coding that stuff, the burden of low-level fiddling is heavily reduced. The commercial ones fill in the JOINs and columns (all occs) for you.

> Other (perhaps more interesting) reasons in favor of surrogates have
> been elaborated in Hall, Owlett and Todd's "Relations and Entities",
> which Codd acknowledges as a major inspiration for RM/T and summarizes
> in his 1979 paper, §4. Feel free to discuss (Codd explains three
> difficulties with "user-controlled" keys—more details are in the other
> papers). As the paper is not easy to find, here's a temporary link:
>
> https://send.firefox.com/download/57050c8af86ed25e/#SIg0c7Uuovr7dRMFu6s_Ww

Thanks for making it available.

All complete rubbish. Even the argumentation is anti-science; speculation. The usual tribal descent into emotion and feelings.

This applies to all anti-Relational papers, both Codd (1971; 1974; 1979; RM/T, collectively called RM/T) and others.

As I have stated several times, per the Law of Non-Contradiction, we either have the Relational Model mindset, or we don't. In the period 1970 (advent) to 1984 (first genuine RDBMS platform outside IBM), both:
- Codd
was marketing to the then prevailing HDBMS users, who were intractably RecordID based, for the purpose of elevating their RecordID based hierarchic physical structures (much like Oracle's Index-Organized Tables) to obtain some small benefit by implementing bits of the /RM/.
--- Codd is a mathematician; an engineer, like me, he is hopeless at marketing.
--- He was tricked by the tribe who were professing to support him, but actually as per evidence, sabotaging him, and he did not realise it
- others,
such as those authors, were also finding ways to enhance their RecordID based systems. Woohooo, knock yourself out. But after 1984, all such conceptions and methods are invalid, schizophrenic (denial of reality).
- others still,
were sabotaging the /RM/ by tricking Codd into selling to the RFS market, providing all the mathematical definitions for RFS structures, such as transitive FDs; the 17 abnormal "normal forms"; etc.

THAT these authors (and Codd as referenced in the paper, as well as in the RM/T) are writing about a context that is RecordID based, and ONLY RecordID based, and NOTHING BUT RecordID based, is evident in the paper.
1. "unified data model"
2. "unified data model" again, when referring to Chen, after reference [23]
3. Take a look at the diagram on page 208. I discuss just one aspect (the basis) of the many errors, here, and therefore re-confirm that it is totally irrelevant after 1984, the Relational paradigm:

http://www.softwaregems.com.au/Documents/Article/Database/Movie%20Title/Response%20to%20Relations%20and%20Entities.pdf

The paper is full of errors; unscientific absurdities; arguments from the tiny corners of the Bell curve (irrelevant but establishes doubt); appeal to emotions (anti-science). I have discussed just one aspect. Feel free to discuss or argue for it. On that aspect or any of the others in the paper.

<<<One aspect that you may discuss or argue for, is this business of a R

> One of the motivating examples in the above mentioned paper is this:
> consider an Employee entity with primary key EmployeeNo. Suppose that an
> EmployeeNo is updated in the database: does that mean that an employee
> has been fired and a new employee has been hired or that the employee
> number of a current employee has changed?

It means whatever the data modeller meant it to mean. That it is an ambiguity is false, and it is fraudulent and anti-science to propose it as such.

1. Generic situation. In a normal secured database, one is not allowed to change a Key. It does not matter what one feels like or what one had for breakfast, the database is a corporate asset, with various constraints and security implemented, it is not a personal asset to do with as one pleases or feels.

Same response to those who feel like because they set the EmployeeNo, therefore they feel they can change it. I tell people who communicate their feelings in the workplace please to go home and tell their mother. It is pathetically childish to concern oneself with feelings in a professional technical (scientific) environment.

2. A few special cases. In a normal secured database, where one IS allowed to change a Key, on a select few tables, it is done by a Batch Transaction, and with Permission GRANTed to a specific ROLE. Ie. not all users who can add an Employee but one who has authority to change an Employee Key. It means only that the Identifier for the FOPC Fact (not entity) in the real world has changed, and you are changing the database to reflect the change in the real world.

3. Under no circumstances can it mean that an employee has been fired.

4. Under no circumstances is an EmployeeNo re-used. Such Identifiers as
- EmployeeNo - bad example
- SocialSecurityNo (US)
- SocialInsuranceNo (CA)
- TaxFileNo (AU)
are usually retained for the life of the database or organisation (not the life of the person or the duration of employment). Such Identifiers are provided by the authorities, once, for the life of the Person, and remain unchanged if they change names or genders. Only completely insane or criminal people attempt to change recorded history.

> The point here is the semantic
> ambiguity of the update operation on a key (am I recording a change of
> the key value for a single entity or am I replacing an entity with a new
> one having the same descriptors?).

An issue which a normal person in the real world simply does not have.
Detailed above.

> Note, again, that I am using "surrogate" as "entity identifier"
> (following the authors above), *not* as "record identifier" (as I said,
> I am not interested in the latter).

You are so seduced by idiotic papers such as this or papers by the RM/T as "relational" Gulag, that ordinary logic has been overruled. You should not be so seduced, because your mind is clearer than the rest of the slaves. Let me clarify it:
- there is no scientific difference (cause; effect; material) between a "surrogate" and a "record identifier". They are merely different names for the same thing, and the thing is a physical record identifier.
--- As per the /RM/, Logical means data as it exists in the real world, not manufactured by the system
--- Physical means it does not exist in the real world, it is manufactured by the system
--- Logical means addressing a list of attributes
--- Physical means addressing the entire record (all attributes stored for the surrogate), which must be accessed first, in order to access a specific attribute second. What you call it does not matter, the effect it causes does matter (Access Path Independence Rule broken, detailed in earlier posts)
- the Hall; Owlett, and Todd idiots and the Gulag slaves think that if you call a dog a horse, it magically becomes a horse. It does not.
- they then go on to treat the dog as a horse. That is insanity squared.
--- the concept of /entity/ is a concept, it does not exist in the real world, and the conception is not scientific or reliable enough for others to use
--- the real world consists of objects or events, only. Real objects, not concepts of objects.
--- for the educated mind, the real world consists of Facts (an abstraction) about objects or events, only
--- any other abstraction, such as /entity/ is false.
--- here, it serves one purpose only, to conceptualise a thing in the real world in order to elevate the surrogate to something the surrogate is not.

The only difference between a "surrogate" and a "record identifier" is intellectual, and only in the mind of the data modeller, which means treatment. You cannot define what that difference in treatment is. Or where it is valid.

I flatly reject the notion of /entity/.

Yes, I agree that there are things in the real world that sometimes cannot be identified by a set of attribute values, but they have not defined (at least in a reliable form) how to solve that, and I reject the solution they have given (concoction of /entity/ plus a surrogate). As evidenced, I do use surrogates, without calling them not-record-identifiers, but the validity (location and reasons) may not be exposed.

Feel free to discuss. If you do, please provide a real world example, where a surrogate is NOT a record identifier.

This usually leads to a discussion re a much more important issue:
Determination of the Correct Identifier for an Object in the Real World
whenever you are ready. We are going to touch on it, again, in the next point, without dealing with it squarely.

> In particular, surrogates are never
> updated (and possibly never deleted either).

So what.
- EmployeeNo is not a surrogate, but it is a bad Identifier.
- AddressNo and PersonNo are surrogates/record identifiers.
- PersonName is superior, eliminating the surrogate/record identifier, but it is a duplicate of other columns. It is a database design device, for coding and JOIN convenience, but it breaks Relational and Normalisation rules. It remains controlled by the system (code) and not the user.

Surrogates/record identifiers are not seen by the user. So what.

> (*) Regarding the now AK of Person:

(Always was the AK. It is the PK that I have changed in V0.11.)

> you may want to record data about
> persons whose birth date, birth town, etc. are not known. That might be
> another case for a surrogate. Here is some data:
>
> https://blog.aniljohn.com/2013/06/how-to-choose-attributes-to-uniquely-identify-a-person.html

Too superficial and incomplete to deal with.

Let me deal with the issue squarely.

1. For decades, the convention (we still do not have a Standard AFAIK) for proper identification of a Person has been the method used internationally for passport identification:
- LastName
- FirstName
- MiddleName (or initial, placed in a MiddleName column)
- Date of birth
- Country of birth
- Place of birth (meaning Town, that can be found on an ordinary map, not village, not hamlet, not city)

Address or parts thereof are considered too unstable (buildings change, houses are torn down and more houses are built in the same lot, etc).

In our database, the Identifier for Town is [ CountryCode, StateCode, CountyCode, Town ]. The first three are all ISO or ANSI Standard values.

2. In our database, when a Person is added, that AK, the entire set of attributes has to be entered. A Person whose birth date, or birth town, or birth country; etc. are not known cannot be added. End of story.

3. In some OTHER database, where a Person whose birth date, or birth town, or birth country; etc. are not known CAN be added ... well, we would have to design that, won't we. It would be silly to expect the capability and functions of database [3] in database [2].

Without thinking very hard, which I really should, I would implement:
- Person with a pair of Exclusive Subtypes { Complete | Partial }
- PersonComplete is defined as you see in the V0.11 DM
- PersonComplete is the table that ALL relevant child tables reference, meaning that a Partially Identified Person cannot be an Agent or an Employee, etc.
- PersonPartial is NOT referenced by any other table
- PersonPartial is defined as:
--- mandatory columns and Key [ LastName, FirstName, MiddleName, Differentiator ]
--- Differentiator can be set by the system, but by all means, let the user provide it, and it will be more meaningful
--- optional columns: BirthDate; BirthCountryCode; BirthTown ( StateCode, CountyCode, Town )
- in code, whenever an optional column is added, if all optional columns are filled, convert the PersonPartial to a PersonComplete.
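
A rough sketch of that last step, in Python over SQLite. Everything here is an assumption for illustration: the Person supertype and the Discriminator are omitted, StateCode and CountyCode are left out of the birth town, and the function name is made up. Only the PersonPartial and PersonComplete names and their columns come from the outline above.

```python
import sqlite3

OPTIONAL = ("BirthDate", "BirthCountryCode", "BirthTown")
KEY = "LastName = ? AND FirstName = ? AND MiddleName = ? AND Differentiator = ?"


def make_db():
    """Two subtype tables only; the supertype is omitted for brevity."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE PersonPartial (
            LastName TEXT NOT NULL, FirstName TEXT NOT NULL,
            MiddleName TEXT NOT NULL, Differentiator TEXT NOT NULL,
            BirthDate TEXT, BirthCountryCode TEXT, BirthTown TEXT,
            PRIMARY KEY (LastName, FirstName, MiddleName, Differentiator)
        );
        CREATE TABLE PersonComplete (
            LastName TEXT NOT NULL, FirstName TEXT NOT NULL,
            MiddleName TEXT NOT NULL, BirthDate TEXT NOT NULL,
            BirthCountryCode TEXT NOT NULL, BirthTown TEXT NOT NULL,
            PRIMARY KEY (LastName, FirstName, MiddleName,
                         BirthDate, BirthCountryCode, BirthTown)
        );
        """
    )
    return conn


def set_optional(conn, key, column, value):
    """Fill one optional column; convert to PersonComplete when all are set.

    key is (LastName, FirstName, MiddleName, Differentiator).
    Returns True if the row was converted.
    """
    if column not in OPTIONAL:
        raise ValueError(column)
    with conn:  # update and possible conversion form one transaction
        conn.execute(
            f"UPDATE PersonPartial SET {column} = ? WHERE {KEY}",
            (value, *key),
        )
        row = conn.execute(
            "SELECT BirthDate, BirthCountryCode, BirthTown"
            f" FROM PersonPartial WHERE {KEY}",
            key,
        ).fetchone()
        if row is None or any(v is None for v in row):
            return False
        conn.execute(
            "INSERT INTO PersonComplete VALUES (?, ?, ?, ?, ?, ?)",
            (*key[:3], *row),
        )
        conn.execute(f"DELETE FROM PersonPartial WHERE {KEY}", key)
    return True
```

The Differentiator is dropped on conversion because the complete Key no longer needs it.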

4. In the MovieTitle database, there is a different requirement, because the above optional columns are usually not known for historic people, and we need the Person, Partial or Complete, to be referenced. In that case, I would:
- dispense with the Subtypes in [3]
- the mandatory columns are already mandatory, and already the PK
- implement one table for each optional column: PersonBirthDate; PersonBirthCountry; PersonBirthTown.
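
Until that iteration is available, the idea can be sketched as below, in Python over SQLite. The single-column PersonName key is a simplification (an assumption), and the function name is made up. An unknown birth fact is simply the absence of a row, so nothing optional is stored in Person itself.

```python
import sqlite3


def make_db():
    """One table per optional fact; PersonName as the key is a simplification."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE Person (PersonName TEXT PRIMARY KEY);
        CREATE TABLE PersonBirthDate (
            PersonName TEXT PRIMARY KEY REFERENCES Person (PersonName),
            BirthDate  TEXT NOT NULL
        );
        CREATE TABLE PersonBirthCountry (
            PersonName       TEXT PRIMARY KEY REFERENCES Person (PersonName),
            BirthCountryCode TEXT NOT NULL
        );
        CREATE TABLE PersonBirthTown (
            PersonName TEXT PRIMARY KEY REFERENCES Person (PersonName),
            BirthTown  TEXT NOT NULL
        );
        """
    )
    return conn


def birth_facts(conn, person_name):
    """Return (BirthDate, BirthCountryCode, BirthTown); None where unknown."""
    return conn.execute(
        """
        SELECT d.BirthDate, c.BirthCountryCode, t.BirthTown
        FROM Person AS p
        LEFT JOIN PersonBirthDate    AS d ON d.PersonName = p.PersonName
        LEFT JOIN PersonBirthCountry AS c ON c.PersonName = p.PersonName
        LEFT JOIN PersonBirthTown    AS t ON t.PersonName = p.PersonName
        WHERE p.PersonName = ?
        """,
        (person_name,),
    ).fetchone()
```

Any table that needs to reference a Person does so regardless of how many of the optional facts are known.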

I will give you that in the next iteration.

The summary points are two;
a. Do not expect database capability of [3] or [4] from a database with capability [2]. Do not wring your hands because you cannot exceed what is made to not be exceeded. Sitting in that place without Resolution (the paper, your question) is insanity.
b. Per the Four Laws of Thought, I have resolved it. Based on the specific requirement of the specific database, implement the Solution as above.

Which proves, again, that the paper is unscientific junk. Yes, even Codd, when he kowtows to such absurdity (when seduced into selling to the Record ID based crowd).

The same applies to the example in the paper, of adding an Employee with a known name and salary, for which an EmployeeNo has not yet been assigned. In the normal case, only Human Resources department would be adding Employees (not the interviewer or manager), and the Employee will not be added until an EmployeeNo has been assigned. In the real world, people fill in pieces of paper, usually a form (pro forma) that is issued by HR. The database is secured against an interviewer or manager adding employees. Therefore the example is stupid.

Adding a Person with partial Identifiers is not stupid, but acceptable only in the specific requirement, which is what defines the precise columns, etc.

> > Compare with AddressNo.
>
> Why wouldn't you do the same as for Agent?

1. Because in this particular requirement I do not need it, and a wider Key is not justified (unless it has a purpose).

2. In some other requirement ...
look at our beloved Order Advanced DM
let's say that we need reports that show PartCodes that are sold by Country and PostCode
a. The perfect (not further perfectible) method would be to implement all Address AK columns as PK. More storage, but the structure would never need change.
b. Because that system has online order entry, including Credit Card payments and exact Address, I don't have to worry about which columns are optional: all are mandatory.
c. The more reasonable implementation would be whatever the requirement states. In this example, I would implement the following in PartyAddress, as the superior PK, and eliminate the surrogate AddressNo as Key.
[ PartyNo, CountryCode, StateCode, PostCode, Differentiator ]
d. Differentiator is arbitrary, and there are a gazillion operators taking online orders, who I don't wish to rely upon, so it is set by the system.
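
A rough sketch of [2c] and [2d] in Python with SQLite. The Street column and the function `add_address` are hypothetical, and the MAX + 1 allocation is only one way to have the system set the Differentiator; on a multi-user server it would need the platform's own locking to be safe.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE PartyAddress (
        PartyNo INTEGER NOT NULL,
        CountryCode TEXT NOT NULL,
        StateCode TEXT NOT NULL,
        PostCode TEXT NOT NULL,
        Differentiator INTEGER NOT NULL,
        Street TEXT NOT NULL,  -- hypothetical non-key column
        PRIMARY KEY (PartyNo, CountryCode, StateCode, PostCode, Differentiator)
    )
""")

def add_address(conn, party_no, country, state, post_code, street):
    """Insert an address; the system, not the operator, sets Differentiator."""
    prefix = (party_no, country, state, post_code)
    with conn:  # read and insert in one transaction
        (next_diff,) = conn.execute(
            "SELECT COALESCE(MAX(Differentiator), 0) + 1 FROM PartyAddress "
            "WHERE PartyNo=? AND CountryCode=? AND StateCode=? AND PostCode=?",
            prefix).fetchone()
        conn.execute(
            "INSERT INTO PartyAddress VALUES (?,?,?,?,?,?)",
            (*prefix, next_diff, street))
    return next_diff
```

With CountryCode and PostCode in the Key, a report by Country and PostCode can filter on any table that carries this Key, without first resolving an AddressNo.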

> > 5. Constraint
> > -----------------
> >
> > First, we have arrived at that point in the modelling exercise wherein
> > all the tables are constrained by Domain (one meaning of Domain, of
> > the 83 occurrences in the /RM/) and Key. Therefore we have achieved
> > "DKNF", which even its author and RM saboteur Ronald Fagin states as
> > impossible to achieve
>
> That depends on the class of constraints you consider, and also on the
> problem you want to solve.

At the stage that I made the statement ... as per the "DKNF" definition ... that includes all Domains via datatype; all Domains via References [which means a Domain in the Referenced table]; all Keys [which were carefully determined, and compliant with the /RM/]. It excludes all CHECK constraints which are { row | intra-Table | Extra-Table }. Addition of those, achieves FCNF, yes.

> Many problems related to first-order logic
> are undecidable.

Nonsense. Give me even one real world example.

I insist that there is no such thing as an intractable (or "undecidable") problem in FOL. AFAIC, a person who says there is, simply does not understand FOL or FOPC, he does not have Logic aptitude.

> Constraints that are encountered in practice are
> usually a very restricted subset of what you can express in FO logic, so
> they are typically much easier to deal with. As a simple example, Fagin
> shows that, if the only constraints you consider are functional
> dependencies, then any schema may be easily transformed into a DKNF
> database schema (with inclusion dependencies). But functional
> dependencies are constraints of the form:
>
> forall x,y,z . R(x,y) and R(x,z) -> y=z
>
> which is a very, very special kind of FO formula.
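
For readers who want the quoted formula in executable form, here is a small Python sketch that checks whether a set of rows satisfies a functional dependency lhs -> rhs. The function name `satisfies_fd` and the dict-per-row representation are illustrative assumptions.

```python
def satisfies_fd(rows, lhs, rhs):
    """Return True if every pair of rows that agrees on lhs also agrees on rhs.

    rows: iterable of dicts (one dict per tuple of the relation)
    lhs, rhs: sequences of attribute names
    """
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        # The first y seen for each x is remembered; any later
        # disagreement is a violation of R(x,y) and R(x,z) -> y=z.
        if seen.setdefault(x, y) != y:
            return False
    return True
```

Note that this checks a single instance of the data; it says nothing about whether the dependency holds for every valid state of the table, which is a matter of declaration, not observation.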

I do not have that restriction, and would scoff at the demand for it. Which is why "DKNF" is broken, half the reason why I place it in quotation marks. The other half is it is incomplete, eg. excludes Constraints. Not to mention, the transitive and partial FDs which are anti-Relational.

> Also, note that "undecidable" means that there is no general algorithm
> to solve the problem. That doesn't mean that the instances of that
> problem which you encounter in practice cannot be solved.

Understood. But it is stupid. The typical Straw Man arguments that "academics" and "theoreticians" love to erect, to excuse their impotence. They cannot figure out a general algorithm, so they pervert an English term, and give it a new definition. Pig poop.

If ever I find the time, I might try writing the general algorithm that the demented freaks cannot. I won't call it "artificial intelligence" though, it is just normal intelligence, un-perverted by the pig poop brigade.

> > (he wrote the maffemythical definitions for the
> > anti-Relational "NFs", so he was thinking of such abominations, RFS
> > fraudulently marketed as "relational"). As you have seen, achieving
> > "DKNF" is a no-brainer. Every single database I have written since
> > 1993 is "DKNF".
>
> It may as well be the case. "Your" DKNF is different from the
> maffemythical (whatever that means)

I am mocking the freaks' use of /mathematical definition/, because these freaks use mathematical definitions to:
a. declare that it proves something
--- [it does not, math is only a formalisation; a distillation of Logic, in and of itself, it is not a proof of anything ... it is the starting premise that is true/false regardless of the math that is used to articulate it]
b. maintain isolation, their elitism. Which I scoff at.

> definition of Fagin's "DKNF". You
> haven't said exactly what "your" DKNF

(Further below.)

> means (that's probably also the
> issue Fagin had when you wrote to him).

No.
1. I gave him a full IDEF1X DM, and declared that it achieved "DKNF", that he said was impossible, and asked him please to confirm. He did not answer that question. From his comments, I realised that he (like all other "academics" and "theoreticians") cannot READ a graphical DM.
2. I gave him a second IDEF1X DM, which was a higher level design of [1] and asked him to confirm that it achieved "DKNF". He asked for a maffematical mystical mythological definition. Pushing the elitism. Asserting the cabalist mindset. It did not need a mathematical definition, it was fully explained in technical English. Again, his comments proved he could not read the graphical. And that he was totally ignorant of FOPC Predicates.

> My impression is that it can be
> explained as a collection of design principles and a holistic view of
> the data model (most normal forms, DKNF included, totally lack the
> latter).

Yes. Mine do not suffer such disabilities. FCNF is ...

Codd's 1NF
Codd's 2NF
Hierarchic NF (per /RM/ Unnormalised Set, pre-requisite to ...)
Relational Key NF (per /RM/ Normalised Set)
< one more, confidential >
Relational NF (which covers the cross-database issues that you mention)
Codd's 3NF (which I have to call Key Dependency NF, because (a) the Pig Poop Brigade has perverted 3NF, and (b) Codd's 3NF requires "full" FDs, no transitives, no partials, no tribal thinking)
Fully Constrained NF

Definition.
- Achieved KDNF
- all Domains via datatype;
- all Domains via References [which means a Domain in the Referenced table];
- all Domains via Keys [which were carefully determined, and compliant with the /RM/].
- all CHECK constraints, which are { row | intra-Table | Extra-Table }. Addition of those, achieves FCNF, yes.

Sorry, out of time. I will pick up here next time.

Cheers
Derek

Derek Ignatius Asirvadem

Feb 29, 2020, 8:50:06 AM
> On Saturday, 29 February 2020 20:41:08 UTC+11, Derek Ignatius Asirvadem wrote:
> > > On Thursday, 27 February 2020 23:10:38 UTC+11, Nicola wrote:
> > > > On 2020-02-27, Derek Ignatius Asirvadem <derek.a...@gmail.com> wrote:

Clarification to my previous post.

> > You are correctly pointing out that the surrogate requires an additional
> > SELECT

Not only the additional SELECT, but worse: one additional column and one additional index.

> because data is searched through real data values anyway. The
> > purported advantage is that *additional* queries will benefit from the
> > surrogate, i.e., you retrieve it once and you use it several times, so
> > that further queries (e.g., in the same transaction) will filter using
> >
> > ...where id = X;
> >
> > rather than
> >
> > ...where pk_attr1 = Y and pk_attr2 = W and ... and pk_attr_n = Z;
> >
> > And you will join on one attribute rather than many.

0. That is because some SQL platforms, as well as all the NON-sql suites, are brain-dead. In Sybase, I do not have to give the column names for a NATURAL JOIN, the Query Optimiser in the server (not the IDE) finds it and fills it in.

> 1. No one actually does that.

I mean "use it several times, so that further queries (e.g., in the same transaction) ...". If they SELECT the same row more than once in a single Transaction code segment, they really should not be coding, they need to change oil or clean toilets or something more suited to their faculties.

> So it is science fiction about what could be if I were a unicorn, and your hair was green. We can't enhance the performance (or prevent poor performance) of a circumstance that does not happen. And I certainly won't impede the performance of circumstances that **do** happen for the sake of some nonsense that does not happen.

Further, if ever a "use it several times, so that further queries (e.g., in the same transaction)" event does happen in the mythology, whether one filters on one column or several makes no difference.

> 2. At the performance level,
> It is 2020, SQL servers do not have that issue (they did in the early days, the 1980's.)
>
> In fact, the opposite is true. In true servers, when a query is processed, a Query Tree is built first, and it is based on Statistics. Then the QueryTree is executed, and that is based on cache contents. The unit that is cached is a Page (2048 bytes default in Sybase, normal for OLTP, up to 16384 bytes configurable for DW). Therefore, the Pages in cache are retained (more useful for the next query by any user) when rows are physically clustered based on a Relational Key (refer CLUSTERED INDEX). Something that is not possible with a surrogate, because:
> - serial access (any DW usage, eg. SUM() for a particular Customer's InvoiceIDs): has to take long jumps across the entire table, the relevant portions of each Page is very low

--< Which is a TABLE SCAN regardless of the size of the result set >--

> - online access: all users fight over the last Page, the cache is unused

--< Sybase & MS only, have an internal enhancement such that it does not freeze out the cache for a TABLE SCAN >--

> Whereas for a Relational Key;
> - serial access: all Pages containing Invoices for a given Customer{ No | Code | Name } are located in a small physical area of the table, a large portion of the Pages are useful, and the entire table is **NOT** read

--< the cache CAN be used, especially the "hot" Pages >--

> - online access: all users are spread nicely across the table
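
The page-locality argument can be illustrated with a toy model in Python. This is not a measurement of any server: the page size of 50 rows, the 100 customers, and the round-robin arrival of invoices are all assumptions chosen to make the arithmetic easy to follow.

```python
ROWS_PER_PAGE = 50  # assumed; real page capacity depends on row width

def pages_touched(rows, order_key, customer):
    """Count the distinct pages holding one customer's rows when the
    table is physically ordered (clustered) by order_key."""
    ordered = sorted(rows, key=order_key)
    return len({
        position // ROWS_PER_PAGE
        for position, row in enumerate(ordered)
        if row[0] == customer
    })

# Rows are (CustomerNo, InvoiceNo, InvoiceId). InvoiceId is a serial
# number handed out in arrival order, with customers interleaved.
rows = []
invoice_id = 0
for invoice_no in range(1, 101):        # 100 invoices per customer
    for customer_no in range(1, 101):   # 100 customers
        invoice_id += 1
        rows.append((customer_no, invoice_no, invoice_id))

# Clustered on the serial InvoiceId: one customer's rows are scattered.
by_surrogate = pages_touched(rows, lambda r: r[2], customer=42)
# Clustered on (CustomerNo, InvoiceNo): the rows sit side by side.
by_relational = pages_touched(rows, lambda r: (r[0], r[1]), customer=42)
```

Under these assumptions the customer's 100 invoices occupy 100 different pages when clustered on the serial number and 2 pages when clustered on the compound Key; the real ratio depends on row width, page size, and arrival pattern.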

> 4. Under no circumstances is an EmployeeNo re-used. Such Identifiers as
> - EmployeeNo - bad example
> - SocialSecurityNo (US)
> - SocialInsuranceNo (CA)
> - TaxFileNo (AU)
> are usually retained for the life of the database or organisation (not the life of the person or the duration of employment). Such Identifiers are provided by the authorities, once, for the life of the Person, and remain unchanged if they change names or genders. Only completely insane or criminal people attempt to change recorded history.

In the physical data model, wherein I show the physical [!] tables and columns, from a platform-specific perspective, every table has the following, which is not shown in the Logical or Conceptual-Logical model:
- UpdatedDateTime
--- which is sometimes called TimeStamp, very incorrectly
--- it is actually the Row Version, for use in Optimistic Locking in OLTP Transactions
- UpdatedUserId -- which references the system User table:
- optionally CreatedDate and CreatedUserId
That UserId prevents deletion of the UserId, ever. Terminated employees stay on file forever. We just disable their Login when terminated.
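
A minimal sketch of UpdatedDateTime used as a Row Version for Optimistic Locking, in Python with SQLite. The Account table, its columns, and `update_balance` are hypothetical; the reference from UpdatedUserId to the User table is left out for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Account (
        AccountNo INTEGER PRIMARY KEY,
        Balance INTEGER NOT NULL,
        UpdatedDateTime TEXT NOT NULL,  -- the Row Version
        UpdatedUserId TEXT NOT NULL
    )
""")
conn.execute(
    "INSERT INTO Account VALUES (1, 100, '2020-02-29T08:00:00', 'setup')")
conn.commit()

def update_balance(conn, account_no, new_balance,
                   read_version, new_version, user_id):
    """Update only if the row is unchanged since it was read.

    read_version is the UpdatedDateTime the caller saw when it read the
    row. Returns False if someone else updated the row in the meantime.
    """
    with conn:
        cur = conn.execute(
            "UPDATE Account "
            "SET Balance=?, UpdatedDateTime=?, UpdatedUserId=? "
            "WHERE AccountNo=? AND UpdatedDateTime=?",
            (new_balance, new_version, user_id, account_no, read_version))
    return cur.rowcount == 1
```

No lock is held between the read and the write; a stale writer simply updates zero rows and must re-read and retry.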

> > Note, again, that I am using "surrogate" as "entity identifier"
> > (following the authors above), *not* as "record identifier" (as I said,
> > I am not interested in the latter).

Fully Constrained NF.

> Definition.
> - Achieved KDNF
> - all Domains via datatype;

- all Domains via Subtypes (OR/XOR Gates)

> - all Domains via References [which means a Domain in the Referenced table];
> - all Domains via Keys [which were carefully determined, and compliant with the /RM/].
> - all CHECK constraints, which are { row | intra-Table | Extra-Table }. Addition of those, achieves FCNF, yes.

----

Surrogate vs Record ID discussion
I think the way we have discussed it is not clear. Everybody and his dog, including those three puppies, has a different definition of /surrogate/, something special to suit their special needs. Which is the opposite of an objective scientific definition that does not, cannot, should not, be changed.

a. Since the /RM/ demands a Relational Key that is "made up from the data", therefore a surrogate is a domain that is not made up from the data, but made up by the system.
b. Retaining the full English meaning, it is a literal surrogate for the otherwise proper Key.
c. The datatype does not define it as a surrogate, [b] does.
d. Therefore PersonNo; PersonId; PersonName, are all surrogates.
e. A Record Id is a poor choice for a surrogate, the last possibility.
f. An IDENTITY or AUTOINCREMENT column is the worst of the worst. Created and guaranteed "hot spot" on the last Page.

Now address, what is the difference between your "surrogate" vs "record identifier".

AFAIC, any form of an integer, is a Record Id. AddressNo; PersonNo are true for [a..e], and an enhancement prevents it from falling into the toilet [f], and eliminates only the created hot spot issue.

Cheers
Derek

Derek Ignatius Asirvadem

Feb 29, 2020, 10:00:38 AM
Nicola

Associative tables for Concept & Movie /Is varied as/ and /is derived as/.

Really means multiple parents instead of single. I can see how a thing (Concept or Movie or any thing) can be derived from multiple things. But I cannot see how a thing can be a variant of more than one thing.

It might have to do with the Identifier, which is as required per the Law of Identity ... as opposed to simply /what columns identify the fact ?/. Which I might have a better grasp of. Lineage + Instance (Differentiator) is indeed the best form of Identity.

A single parent provides a single lineage, obtained by a recursive Function, possible because it is a scalar, but multiple parents cannot because it is a two-dimensional result set. There is no real sense of lineage, like a foster kid.
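
A sketch of the single-parent lineage in Python with SQLite. It uses a recursive query (WITH RECURSIVE) in place of the recursive Function mentioned above; the MovieVariant table, its columns, and the sample names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- The PK on MovieName alone is what enforces "at most one parent".
    -- A multi-parent (derived-from) table would need (MovieName, ParentName).
    CREATE TABLE MovieVariant (
        MovieName TEXT NOT NULL PRIMARY KEY,
        ParentName TEXT NOT NULL
    );
    INSERT INTO MovieVariant VALUES ('DirectorsCut', 'Theatrical');
    INSERT INTO MovieVariant VALUES ('RestoredCut', 'DirectorsCut');
""")

def lineage(conn, name):
    """Return the chain of variants from the root down to name."""
    rows = conn.execute("""
        WITH RECURSIVE Line(Name, Depth) AS (
            SELECT ?, 0
            UNION ALL
            SELECT v.ParentName, Line.Depth + 1
            FROM MovieVariant v JOIN Line ON v.MovieName = Line.Name
        )
        SELECT Name FROM Line ORDER BY Depth DESC
    """, (name,)).fetchall()
    return [r[0] for r in rows]
```

With one parent per row the walk upward yields exactly one name per level, so the result is a single path. With several parents per row the same query would return a set of names per level, which is the two-dimensional result set described above.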

Cheers
Derek

Derek Ignatius Asirvadem

Feb 29, 2020, 5:59:34 PM
> On Thursday, 27 February 2020 23:10:38 UTC+11, Nicola wrote:
> > On 2020-02-27, Derek Ignatius Asirvadem <derek.a...@gmail.com> wrote:

... continued.

> > Second, there are of course more Constraints to be declared for each
> > table. I have started declaring them. Feel free to declare whatever
> > you see fit.
>
> Could you clarify with an example why TitleType must be in the key of
> ConceptTitle and ProjectTitle?

Whoa. In another post, you stated:

----
> On Thursday, 27 February 2020 19:31:35 UTC+11, Nicola wrote:
> > On 2020-02-25, Derek Ignatius Asirvadem <derek.a...@gmail.com> wrote:
>
> >> I'd say that ...
> >> LanguageCode ... are determined by the title.
> >
> > 1. From our previous discussions, LanguageCode has to go with a Title
> > where the Title may not be in the cataloguer's language.
>
> > 2. How can LanguageCode be determined by the Title ?
> > For uneventful presentation (any GUI, any platform ... Localisation),
> > I thought it is the reverse.
>
> Upon reflection, yes, I am wrong. To properly track titles in multiple
> languages, LanguageCode must be part of the key.
----

> I believe that my question "How to identify a movie?" is now answered.

Great.

78 posts of interaction to achieve that. 73 tables thus far. I know you said the database is large, but for me, I would say around 100 tables is the average size of a database for a particular application (as opposed to an enterprise), and that it is small. And the interaction (given the lack of face-to-face meetings and workshops) is small. Very successful modelling exercise, due to precision in comms.

200 tables is medium. 500 tables is large, and normal for an enterprise level database or a financial system. I once wrote a Version 2 financial system that was 300 tables, from a 550 table Version 1.

I would like to continue until:
1. You have a database that is complete enough for La Camera Ottica.
2. You confirm that the Relational database which is ...
FOPC + RM + Relational Modelling + Transactions <-- stable
is far superior, not comparable to ...
RFS + "ontology" + "descipshun logicks" + middleware <-- ever-changing, unstable

> One thing I'd still change in the model, though: I'd turn the
> associations "Is Varied As" and "Is Derived As" into (associative)
> entities. The reason is that one would want to add more information
> related to such associations (e.g., data about sources that justify
> a particular relationship between movies).

As detailed in the earlier post, I can see the validity in a Concept or Movie being derived from more than one Concept or Movie, and therefore it is not a lineage. But I cannot see how a Variant has more than one parent, and how it is not a [single-parent] lineage.

> > The additional symbols I have used are Extensions to IDEF1X, I do not
> > think they need explanation. However, it does make the DM a bit busy.
> > I do have an Extension that eliminates that issue, which you may be
> > interested in, after we settle down with this one.
> >
> > When the Constraints are actually complete, we could say we have
> > achieved Fully Constrained Normal Form, "DKNF" + C + C ...
>
> Forget about DKNF and call it FCNF then!

Thank you. Ok.

> > I was expecting more discussion.
>
> I hope I am catching up. During the next days I'll review your
> constraints and validate the model against the examples from the FIAF
> manual.

In your own time. I can see that Piedmont and Tirol are exploding. The weaponised Chinese virus sure is effective.

Cheers
Derek

Derek Ignatius Asirvadem

Mar 1, 2020, 4:56:07 AM
Nicola

====================
This post introduces V0.12
====================

All resolved items implemented.

67 Tables after 12 iterations. That is starting from scratch (determining entities), probably 8 iterations in the usual situation. Good work.

More constraints for you to shake a stick at.

http://www.softwaregems.com.au/Documents/Article/Database/Movie%20Title/Movie%20Title%20TA%20V0_12.pdf

Cheers
Derek

Derek Ignatius Asirvadem

Mar 2, 2020, 8:48:12 AM
Nicola

=====================
This post introduces V0.13
=====================

Another walk-through.

1. Just caught one mistake. Although the columns in ConceptTitle were correct, the relation /Language expresses ConceptTitle/ was missing. That probably caused your confusion about the table. Even when pointed out, and examined again, I corrected the shortened column names, but I missed the missing relation. Sorry.

2. A number of tiny mistakes corrected, spelling or transposition only. Too small to detail.

> 67 Tables after 12 iterations. That is starting from scratch (determining entities), probably 8 iterations in the usual situation. Good work.
>
> More constraints for you to shake a stick at.

For me, delivering databases that are fully self-defining per FOPC and /RM/, the number of additional Constraints (beyond "DKNF" or beyond those implicit in the CREATE TABLE commands taken together), ie. CHECK commands, is:
1.5 to 2.0 times the number of tables.
I count these at the table rather than column level because most are at the table level, some are intra-table. That means 100 to 134. So far we have 29.

http://www.softwaregems.com.au/Documents/Article/Database/Movie%20Title/Movie%20Title%20TA%20V0_13.pdf

Trust you and your family are safe.

Cheers
Derek

Derek Ignatius Asirvadem

Mar 2, 2020, 1:13:32 PM
> On Tuesday, 3 March 2020 00:48:12 UTC+11, Derek Ignatius Asirvadem wrote:

3. One reference table added: MediumType. The values need some work. It may progress to MediumType vs FormatType.

> > 67 Tables after 12 iterations.

Correction: 63. I don't count reference tables.

Cheers
Derek

Derek Ignatius Asirvadem

Mar 4, 2020, 12:40:39 AM
Nicola

I trust you and yours are safe.

> On Tuesday, 3 March 2020 00:48:12 UTC+11, Derek Ignatius Asirvadem wrote:
>
> =====================
> This post introduces V0.13
> =====================
>
> 1. Just caught one mistake.

Two

> Although the columns in ConceptTitle were correct, the relation /Language expresses ConceptTitle/ was missing. That probably caused your confusion about the table. Even when pointed out, and examined again, I corrected the shortened column names, but I missed the missing relation. Sorry.

- /ConceptTitle manifest as 0-n Movie/ -- is Identifying
- /MovieTitle is merchandised as 0-n Editions/ -- is Identifying
- The Key in the Edition cluster has been corrected.

EditionType needs addressing.

Cheers
Derek

Derek Ignatius Asirvadem

Mar 5, 2020, 5:12:59 PM
> On Wednesday, 4 March 2020 16:40:39 UTC+11, Derek Ignatius Asirvadem wrote:
>
> - /ConceptTitle manifest as 0-n Movie/ -- is Identifying

Retracted. Non-Identifying.

> - /MovieTitle is merchandised as 0-n Editions/ -- is Identifying

Retracted. Non-Identifying.

Sorry.

Cheers
Derek

Nicola

Mar 5, 2020, 6:19:08 PM
Derek,
some logistic issues due to the current situation have forced me to
focus on other matters and I have delayed my replies (health is good,
thanks God). But I have read them all.

>Now address, what is the difference between your "surrogate" vs "record
>identifier".

Two surrogates are equal iff they represent the same real world entity.
That is not true in general for record identifiers.

A join of any two tables on surrogates always joins all and only the
semantically related records. A join of any two tables on record
identifiers produces random records.

>I would like to continue until:
>1. You have a database that is complete enough

You have done a lot more work than I requested (my question was very
specific after all). I appreciate that. I also understand that correctly
answering my question entailed "looking at the bigger picture". This was
a fruitful interaction with fruitful results, from the Concept -> Movie
-> Edition -> Instance hierarchy to the PersonPartial of v13. Regarding
the subject area we have been discussing, I think that the database *is*
complete enough. Perhaps not "complete" in a "production-ready" sense,
but the most difficult issues within the subject area we have been
discussing are resolved. There are other topics we have not touched,
e.g., see ISO 18923, ISO 11179, https://pro.europeana.eu/project/dca,
IFLA guidelines, probably others—not to mention regional requirements (I
have a ~30 A4-page data model derived from well over a hundred pages of
Italian regulations for cataloguing contemporary cultural assets, yet to
be integrated with international guidelines). This matter is wide. But
that was not the point. I wanted to point out a case where the issue of
"identification" was relatively complex. In fact, a situation where even
ontologically things are not so simple (what *is* a movie? If you have
watched Blade Runner's director's cut and I have watched the theatrical
release, have we watched the same movie? If you have watched it in its
original language and I have watched it dubbed, have we watched the same
movie? When does a change to a movie make it "another movie"? Etc.).
I think that you have convincingly shown that you can satisfactorily
deal with such issues within the realm of the Relational model.

>2. You confirm that the Relational database which is ...
> FOPC + RM + Relational Modelling + Transactions <-- stable
> is far superior, not comparable to ...
> RFS + "ontology" + "descipshun logicks" + middleware <-- ever-changing, unstable

Yes, we are on the same track here.

>> One thing I'd still change in the model, though: I'd turn the
>> associations "Is Varied As" and "Is Derived As" into (associative)
>> entities.

>As detailed in the earlier post, I can see the validity in a Concept or
>Movie being derived from more than one Concept or Movie, and therefore
>it is not a lineage. But I cannot see how a Variant has more than one
>parent, and how it is not a [single-parent] lineage.
> [...]
>Associative tables for Concept & Movie /Is varied as/ and /is derived as/.
>
>Really means multiple parents instead of single. I can see how a thing
>(Concept or Movie or any thing) can be derived from multiple things.
>But I cannot see how a thing can be a variant of more than one thing.

Ok, agreed.

>> A common complaint with taking the step you take is that "the key is
>> larger" => "indexes will be larger" => [...] and especially updates,
>> will be more expensive and the database will occupy more space".

>That was a valid consideration in the days when we counted disk space in
>megabytes and RAM kilobytes, **and** we did not have caches in the
>server that cached the indices.
>[...]
>It is completely irrelevant in servers that have caches and caches its
>indices.
>[...]

Ok, but perhaps not completely irrelevant in all circumstances. Computer
systems have become much more powerful, but also the scale of problems
solved by computers has changed significantly.

>> especially updates, will be more expensive
>
>Assuming you mean update of the Key (which might be worth discussing), and
> not update of the row (re which the statement is false).

I did not express myself clearly. I meant any change to the database: in
this context, I was thinking especially of INSERTs, rather than UPDATEs.

>Updating the Key in such circumstances is, yes, more expensive and
>onerous, but not in the way you mean. Because all the occurrences of
>the migrated Key need to be changed, where such is required, one must
>write a Transaction (we call these Batch Transactions) to:
>- add the entire tree below the affected Key, with the new Key value
>- drop the entire tree below the affected Key
>- this is done in several loops (WHILE 1=1), the number of which equals the depth of the tree
>- in batches of 500 or 1000 rows (SET ROWCOUNT 500)
>- BREAKing out when the branch is empty
>Otherwise you would strangle the server (affect all online users)
>**and** blow the transaction log file, which is suicide (the code
>blocks itself).
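
The batched Key change quoted above can be sketched in Python with SQLite as follows. This is an approximation under stated assumptions: the Parent/Child tables are hypothetical, the tree is only one level deep (so one loop), LIMIT stands in for SET ROWCOUNT, and each batch is copied and then dropped in the same short transaction.

```python
import sqlite3

BATCH = 500  # rows per transaction, as in the quoted SET ROWCOUNT 500

def make_db(n_children):
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        CREATE TABLE Parent (Code TEXT NOT NULL PRIMARY KEY);
        CREATE TABLE Child (
            Code TEXT NOT NULL REFERENCES Parent (Code),
            Seq INTEGER NOT NULL,
            PRIMARY KEY (Code, Seq)
        );
        INSERT INTO Parent VALUES ('OLD');
    """)
    with conn:
        conn.executemany("INSERT INTO Child VALUES ('OLD', ?)",
                         [(i,) for i in range(n_children)])
    return conn

def change_key(conn, old, new):
    """Move the branch under old to new in small transactions.

    Returns the number of batches that moved rows.
    """
    with conn:
        conn.execute("INSERT INTO Parent VALUES (?)", (new,))
    batches = 0
    while True:
        with conn:  # one short transaction per batch
            rows = conn.execute(
                "SELECT Seq FROM Child WHERE Code = ? ORDER BY Seq LIMIT ?",
                (old, BATCH)).fetchall()
            if not rows:
                break  # the old branch is empty
            lo, hi = rows[0][0], rows[-1][0]
            conn.execute(
                "INSERT INTO Child SELECT ?, Seq FROM Child "
                "WHERE Code = ? AND Seq BETWEEN ? AND ?", (new, old, lo, hi))
            conn.execute(
                "DELETE FROM Child WHERE Code = ? AND Seq BETWEEN ? AND ?",
                (old, lo, hi))
        batches += 1
    with conn:
        conn.execute("DELETE FROM Parent WHERE Code = ?", (old,))
    return batches
```

Each transaction touches at most BATCH rows, so locks are held briefly and the log never has to hold the whole branch at once; the old parent row is removed only after its branch is empty.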

In principle, one should just declare "on update cascade" and update as
usual, letting the system choose how to avoid strangling the server. The
considerations you make are probably the best workaround (because
cascading updates would be disastrous) for the system you work with, but
do not necessarily carry over to other architectures or other systems.
This is not to say that there currently is a system where such
workarounds are not needed. Research on physical optimization is still
hot nowadays.
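
For comparison, the declarative form looks like this in Python with SQLite, where ON UPDATE CASCADE is supported once foreign keys are switched on. The Country/State tables and the codes are illustrative only, and nothing here says how well a cascade would behave on a large tree in any particular server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # off by default in SQLite
conn.executescript("""
    CREATE TABLE Country (CountryCode TEXT NOT NULL PRIMARY KEY);
    CREATE TABLE State (
        CountryCode TEXT NOT NULL
            REFERENCES Country (CountryCode) ON UPDATE CASCADE,
        StateCode TEXT NOT NULL,
        PRIMARY KEY (CountryCode, StateCode)
    );
    INSERT INTO Country VALUES ('UK');
    INSERT INTO State VALUES ('UK', 'ENG'), ('UK', 'SCT');
""")

# One statement; the system propagates the new Key value to the children.
with conn:
    conn.execute(
        "UPDATE Country SET CountryCode = 'GB' WHERE CountryCode = 'UK'")

states = conn.execute(
    "SELECT CountryCode, StateCode FROM State ORDER BY StateCode").fetchall()
```

The whole cascade runs as a single transaction, which is exactly the property the batched approach avoids on a busy server.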

>The real cost (actually worth consideration) related to these issues is
>actually whether the rows have a good, Relational index (meaning
>Relational Key of many columns), such that the INSERT/UPDATE/DELETEs
>are spread across the table, such that there is no contention (high
>concurrency). Imbeciles who propose surrogates do not understand this.

Am I correct that you are thinking of tables clustered on the primary
key and spread over several disks?

>> Many problems related to first-order logic are undecidable.
>
>Nonsense. Give me even one real world example.

"Is formula F satisfiable?" But perhaps this is not "real world" for
you.

"Is this cryptographic protocol secure?" This sounds real world enough
to me (albeit expressed very informally). But you won't find any program
that always answers correctly to that question (note that both "always"
and "correctly" are critical here).

>I insist that there is no such thing as an intractable (or
>"undecidable") problem in FOL. AFAIC, a person who says there is,
>simply does not understand FOL or FOPC, he does not have Logic
>aptitude.

Do not confuse problems (sets of instances) with the instances
themselves. Undecidability refers to the former. For the cryptography
question above, you may find several computer programs, and they are
even useful in practice. They can solve correctly and in a finite time
many instances of the problem. Yet none of them can solve the stated
problem correctly in every case.

Back to databases, I'd say that most instances of FO-related problems
that you encounter in the "real world" are "easy" (on the scale of
computational complexity, not on the scale of human understanding).
Example: the implication problem for functional and inclusion
dependencies taken together is undecidable (in general). But I bet that
you can always say whether or not a given constraint is implied by the
functional dependencies and foreign keys of your model. Those two facts
are not contradictory.
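
To make the "easy" side concrete: for functional dependencies alone, implication is decidable by the standard attribute-closure algorithm, sketched below in Python. The function names are illustrative. This covers FDs only; it does not address the combined problem with inclusion dependencies, which is the undecidable one.

```python
def closure(attrs, fds):
    """Return the set of attributes determined by attrs under fds.

    fds: iterable of (lhs, rhs) pairs, each a tuple of attribute names.
    """
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Fire an FD when its left side is already determined
            # and it contributes something new.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def implies(fds, lhs, rhs):
    """Return True if lhs -> rhs follows from the given FDs."""
    return set(rhs) <= closure(lhs, fds)
```

The loop always terminates because each pass either adds an attribute or stops, and there are finitely many attributes.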

Nicola

Nicola

Mar 5, 2020, 6:21:24 PM
On 2020-03-05, Nicola <nic...@nohost.org> wrote:
> some logistic issues due to the current situation have forced me to
> focus on other matters and I have delayed my replies (health is good,
> thanks God). But I have read them all.

Eh, of course: them = *your* replies.

Nicola

Derek Ignatius Asirvadem

Mar 8, 2020, 5:01:08 AM
> On Friday, 6 March 2020 10:19:08 UTC+11, Nicola wrote:

> some logistic issues due to the current situation have forced me to
> focus on other matters and I have delayed my replies (health is good,
> thanks God).

Hopefully something good will come out of this ... perhaps the great fraud that China has perpetrated on the world, the economic war (enabled by the same international banksters, the globalists, that feast on pig poop) that they have prosecuted for 30 years, will unravel.

Answering just a few items. Will get to the rest later. Some of your previous items, too, beg a response.

> >Now address, what is the difference between your "surrogate" vs "record
> >identifier".

(My stated position being, there is no scientific difference.)

> Two surrogates are equal iff they represent the same real world entity.
> That is not true in general for record identifiers.

Mumbo jumbo. That says precisely nothing. Scientific declarations please.

It is like saying, 2 + 3 = 5 because 5 - 3 = 2. We need to know what each of the numerals represent, not that it is a representation. I reject the entire RM/T, and the Hall, Owlett & Todd paper as anti-scientific pig poop in the same vein as the Date; Darwen; Fagin; et al (Abiteboul; Hull; Vianu) pig poop.

Here the fraud is committed by the famous schizophrenic method of redefining a term (eg. the much redefined term "entity", which now means nothing at all, and anything the imagination [not intellect] can conjure) to mean what it does not mean, and then elevating those non-meanings to something of "value". The value is precisely zero. But people trained in schizophrenic thinking think that zero is a value.

> A join of any two tables on surrogates always joins all and only the
> semantically related records.

The point we are arguing is, there is no semantics in a surrogate. If you magically invest semantics in a surrogate (which one can do, only in one's imagination or emotional state, but one cannot do in one's intellect), it does not actually result in the surrogate holding semantics (or logic, or meaning). You are holding to the pig poop redefinitions, which I have already rejected. Give me a scientific definition that does not fail logic.

Eg. use a real world example.

Eg. AddressNo is a surrogate, and unarguably so. Try proving that it has "semantics".

Eg. my scientific proof is (a) the effect, and (b) I do not pervert established terms to obtain false credibility (which is easily proved, as I am doing to yours). There is no difference in effect between a "semantic" or ordinary surrogate that "identifies" an "entity" and a Record ID.
1. It breaches the RM Relational Key requirement
2. it fails the Access Path Independence Rule
---- meaning that the descendant tables are cut off from their ancestor tables (the data hierarchy), ---- and
---- a JOIN is forced at the level of the breach
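
To make point 2 concrete, here is a minimal sketch in Python with SQLite (chosen only because it runs anywhere, it is not one of the platforms discussed here). Every table and column name is hypothetical, made up for illustration, not taken from any data model in this thread. With the Key migrated down the hierarchy, a County can be selected by Country from the County row itself; with surrogates, the same question forces a JOIN through State.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Relational Keys: the parent Key is migrated into each descendant
CREATE TABLE State  (CountryCode TEXT, StateCode TEXT,
                     PRIMARY KEY (CountryCode, StateCode));
CREATE TABLE County (CountryCode TEXT, StateCode TEXT, CountyName TEXT,
                     PRIMARY KEY (CountryCode, StateCode, CountyName),
                     FOREIGN KEY (CountryCode, StateCode)
                         REFERENCES State (CountryCode, StateCode));

-- Surrogates (hypothetical *S tables): each row carries only a number for its parent
CREATE TABLE StateS  (StateId INTEGER PRIMARY KEY, CountryCode TEXT, StateCode TEXT);
CREATE TABLE CountyS (CountyId INTEGER PRIMARY KEY,
                      StateId INTEGER REFERENCES StateS (StateId),
                      CountyName TEXT);

INSERT INTO State   VALUES ('US', 'TX'), ('AU', 'NSW');
INSERT INTO County  VALUES ('US', 'TX', 'Harris'), ('AU', 'NSW', 'Cumberland');
INSERT INTO StateS  VALUES (1, 'US', 'TX'), (2, 'AU', 'NSW');
INSERT INTO CountyS VALUES (10, 1, 'Harris'), (20, 2, 'Cumberland');
""")

# Relational Key: the ancestor is reachable from the descendant row itself
rk_rows = conn.execute(
    "SELECT CountyName FROM County WHERE CountryCode = 'AU'").fetchall()

# Surrogate: the same question cannot be answered without joining to StateS
sg_rows = conn.execute(
    "SELECT c.CountyName FROM CountyS c "
    "JOIN StateS s ON s.StateId = c.StateId "
    "WHERE s.CountryCode = 'AU'").fetchall()
```

Both queries return the same rows; the difference is only in what has to be visited to get them. In a two-level toy the extra JOIN looks harmless, the cost grows with each further level of the hierarchy.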

Now I like my AddressNo, and it serves a purpose, but there is no way that I will propose that AddressNo is anything but a Record ID. I would not say "A join of any two tables on surrogate AddressNo always joins all and only the semantically related records"
because that is drivel,
because a join of any two tables on the Record ID AddressNo always joins all and only the semantically related records.
Because there is no semantics in a surrogate and no semantics in a Record ID.

> A join of any two tables on record
> identifiers produces random records.

Nonsense. Assuming that one has the correct indices on the tables, it produces precisely the joined records. Please provide an example.

Neither the surrogate nor the Record ID provides Relational Integrity (logical), they only provide Referential Integrity (physical). Whereas I can guarantee that a State or County is **not** added to the wrong Country, I cannot guarantee that an AddressNo is **not** added to the wrong Agent (MovieTitle) or Party (Order Advanced DM).
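
A minimal sketch of that difference, in Python with SQLite, using hypothetical tables (a Country→State→County chain stands in for the real model, and the *S tables are an invented surrogate version). It assumes the surrogate child carries two independent numbers, which is one common way the breach shows up, not the only one. Under Relational Keys the composite FOREIGN KEY rejects a County placed under a State that does not exist in that Country; under surrogates, each number is validated only against its own parent, so a self-contradicting row is accepted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked
conn.executescript("""
CREATE TABLE Country (CountryCode TEXT PRIMARY KEY);
CREATE TABLE State   (CountryCode TEXT REFERENCES Country,
                      StateCode TEXT,
                      PRIMARY KEY (CountryCode, StateCode));
CREATE TABLE County  (CountryCode TEXT, StateCode TEXT, CountyName TEXT,
                      PRIMARY KEY (CountryCode, StateCode, CountyName),
                      FOREIGN KEY (CountryCode, StateCode)
                          REFERENCES State (CountryCode, StateCode));
INSERT INTO Country VALUES ('US'), ('AU');
INSERT INTO State   VALUES ('US', 'TX'), ('AU', 'NSW');

CREATE TABLE CountryS (CountryId INTEGER PRIMARY KEY, CountryCode TEXT);
CREATE TABLE StateS   (StateId INTEGER PRIMARY KEY,
                       CountryId INTEGER REFERENCES CountryS (CountryId),
                       StateCode TEXT);
CREATE TABLE AddressS (AddressNo INTEGER PRIMARY KEY,
                       CountryId INTEGER REFERENCES CountryS (CountryId),
                       StateId   INTEGER REFERENCES StateS (StateId));
INSERT INTO CountryS VALUES (1, 'US'), (2, 'AU');
INSERT INTO StateS   VALUES (10, 1, 'TX'), (20, 2, 'NSW');
""")

# Relational Key: Texas under Australia is rejected by the composite FK
try:
    conn.execute("INSERT INTO County VALUES ('AU', 'TX', 'Harris')")
    rk_rejected = False
except sqlite3.IntegrityError:
    rk_rejected = True

# Surrogates: Country 2 (AU) with State 10 (TX, which belongs to US) is accepted,
# because each number is checked only against its own parent table
conn.execute("INSERT INTO AddressS VALUES (100, 2, 10)")
sg_accepted = conn.execute("SELECT COUNT(*) FROM AddressS").fetchone()[0] == 1
```

Both inserts pass Referential Integrity in the surrogate tables. Only the composite Key carries enough of the data to reject the bad row.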

Genuine semantics lies in a logical Relational Key, which has to be made up from the data (otherwise it is not logical), there is no semantics in a number. Any "semantics" that you anoint the number or a number with, is not semantics but imagination. And thanks for the invitation, but I am not joining you in your imaginings.

Date; Darwen; Fagin; Hall; Owlett; Todd; Hull; Vianu; et al imagine warm fresh pig poop straight from the backside of a sow to be gourmet food for humans. It isn't. It is food only for devil worshippers and other animals. Or humans that have been brain-washed into thinking that pig poop is human food, demonstrative of intelligence. Evidence of massive denial of reality.

The question has not been answered: over to you.

> >I would like to continue until:
> >1. You have a database that is complete enough
>
> You have done a lot more work than I requested (my question was very
> specific after all). I appreciate that.

You are most welcome.

But the question was not specific. No such question is, or can be.

After decades of fighting the "academics", who argue academic questions that have a carefully scoped result, and who avoid any real world question that would prove the relevance of context (which gives meaning, which is why the /RM/ is absolute), it was nice to see a question in which the context could not be avoided, and thus a meaningful interaction could be had.

Second, after dealing with me for some years now, I relied on the fact that you had moved from the earlier slavish "academic" argumentation (pig poop) and into a more open and scientific position. That your question was earnest. That you have written a paper on the subject, using RM/T+MW+"O"+ "DL", which is RFS with a false science to validate it, regarding which I could wipe the floor using a FOPC+RM solution.

Declaring that "RM is better" vs "RM/T+++ can do anything" would not be enough, we needed the full interaction, with someone I could trust.

> I also understand that correctly
> answering my question

Any real world question. "Academics" purposely remain in simple puzzle type questions, which are good for the classroom, and irrelevant for the real world.

(And there exists, even their simple puzzle type questions, that they cannot resolve. But I can. If you ever want to take one of those up.)

> entailed "looking at the bigger picture".

Yes. When I went to college,
1. we were taught genuine Logic (it was understood to be an aptitude, that split the class up)
2. Data Analysis (it was called Systems Analysis then)
3. the big assignment for each semester was a full-blown case study
4. idiot questions were for classroom exercises only, to re-inforce something taught in a lecture.

Nowadays (I went back to uni recently and did a couple of semesters to research this), [1][2] are not taught at all, pig poop (schizophrenic thinking) is taught instead, labelled as "logicks" [play on magicke and cabalist divination] and "cuticle thinking" [play on the alleged "critical thinking" which is anti-science]. There are no case studies. The idiot questions are the only tests. The Four Laws are suppressed, Non-Resolution, the Excluded Middle, is the new norm. We are creating learned imbeciles, people who cannot produce anything that lasts, but can give 42 "scientific" reasons, and cite 56 papers, as to why they can't.

The "bigger picture" is the minimum context **for each table**, when determining the PK and AKs. Which in turn can only be understood by appreciating the whole (entire DM on one A1 page). Recall, you have agreed, the scientific NFs (Codd's and mine, not the abnormal "NFs" commonly used for the common RFS-RM/T falsely labelled "relational") have to be across all tables.

> This was
> a fruitful interaction with fruitful results, from the Concept -> Movie
> -> Edition -> Instance hierarchy to the PersonPartial of v13.

Thank you.

> Regarding
> the subject area we have been discussing, I think that the database *is*
> complete enough. Perhaps not "complete" in a "production-ready" sense,
> but the most difficult issues within the subject area we have been
> discussing are resolved.

Ok. But I would prefer if we close some of the issues that (to me, of course) are not yet resolved, without going to a production-ready (more accurately an implementation ready) state, a case study level of resolution. Recall, we are dealing with true meaning; deeper meaning, far beyond what the FIAF lunatics with their RFS mindset and loose definitions can contemplate. Eg.

1. I am not sure that Language should be the determinant (the way it is in V0.13), Culture produces a story, not a Language, not a Country.

2. More rigorous testing re the cascade of Keys, and the alternation based on *Title, for the Concept→Movie→Edition→Instance data hierarchy. The more precise data hierarchies as defined are:
Concept = Agent ... Concept
Movie = Language ... TitleType ... ConceptTitle
Edition = Language ... TitleType ... MovieTitle

I am not sure that I can communicate this via IDEF1X ... but I will try (it is explicitly communicated in my IDEF1X Extension).
http://www.softwaregems.com.au/Documents/Article/Database/Movie%20Title/Movie%20Title%20TA%20V0_13%20Annotated.pdf

> There are other topics we have not touched,
> e.g., see ISO 18923, ISO 11179, https://pro.europeana.eu/project/dca,
> IFLA guidelines, probably others—not to mention regional requirements (I
> have a ~30 A4-page data model derived from well over a hundred pages of
> Italian regulations for cataloguing contemporary cultural assets, yet to
> be integrated with international guidelines).

Sure.

> This matter is wide.

No. It is wide, and confused, only because they have idiots for "academics" directing them, and they are forced into doing more than users should, in order to get something more than the academics can give them. No one is capable of taking responsibility and actually delivering. Common problem. Get a real practitioner who follows the Four Laws; Science, and rejects pig poop, and any problem becomes tractable.

> But
> that was not the point. I wanted to point out a case where the issue of
> "identification" was relatively complex. In fact, a situation where even
> ontologically

(
I reject the use of that term (of course I have to accept what the freaks have done: create a definition outside the RFS for all the FOPC Predicates that they cannot place in the RDB), because it perverts the meaning of the term as it has stood for two millennia. The study of the nature of BEING, which only humans and angels have. Their "ontology" is mere declarations of relations and rules (Predicates). There is no BEING-NESS or BEING in that simple straight-forward definition, inside or outside the database. Thus it serves only to remove BEING-NESS from humans, part of their ongoing war against humanity, against the definition of the SOUL. This ties in with the bestialists elevating animals, and anointing them with being-ness ("sentient beings", another fraudulent term) which animals do not have. Good for those who suckle pigs, or who place their reproductive organs where animals collect their waste matter. Degrading and degenerate for human beings.

Ontology = philosophy[ metaphysics[ science ] ] of the nature of Being.

"Ontology" = Fragments of FOPC Predicates placed outside the database by a damaged person who is ignorant of the science (FOPC & RM).
)

> things are no so simple (what *is* a movie?

Yes. Great case study, as detailed above.

> If you have
> watched Blade Runner's director's cut and I have watched the theatrical
> release, have we watched the same movie?

Well, now you are departing from science (single, objective reality), and entering into subjective reality, which is multiple. The "movie" that is "watched", more precisely, absorbed by each person, is subjective, and personal. Irrelevant to science. A thousand people who watched either of the two { Director's Cut | Commercial Edition } would have "seen" at least one thousand "movies". In occupied countries they would have seen 1,500 "movies".

I will answer only from science.

There are precisely two Editions.

> If you have watched it in its
> original language and I have watched it dubbed, have we watched the same
> movie?

Yes.
In different Languages.

(But we subjectively "saw" different "movies".)

> When does a change to a movie make it "another movie"? Etc.).

Again. That can be determined easily using science. As evidenced in this thread.

But it can never be determined by those who are ignorant of the Four Laws; science, and who rely on feelings, who are committed to Non-Resolution.

We scientists just do it (the determination, using precise scientific documented steps), and TELL the freaks what it is, from a position of Authority (science is an authority, which is why the freaks reject it and use instead their personal imaginings and feelings and anal warts). Those users who follow us (via the correctly set up GUI and database) are completely resolved, no questions remain. Those who rebel against authority, who eat pig poop, cannot be helped due to their insanity. Insanity eventually leads to suicide or homicide, we can only hope it is the former.

> I think that you have convincingly shown that you can satisfactorily
> deal with such issues within the realm of the Relational model.

Thank you.

To be precise, this has been an exercise in ordinary database design, using design science, which includes Logic and FOPC, that is [after 1984] qualified by the RM. (The RM defines the Relational Model only, as with any academic paper, it cannot be expected to define anything outside its scope, such as database design; Normalisation; how to avoid eating pig poop; etc.) The RM does not inform us about how to do what we did, it just gives us (a) a defined mindset, (b) particulars for certain specific steps, and (c) strict rules that qualify/disqualify the resulting data model.

> >2. You confirm that the Relational database which is ...
> > FOPC + RM + Relational Modelling + Transactions <-- stable
> > is far superior, not comparable to ...
> > RFS + "ontology" + "descipshun logicks" + middleware <-- ever-changing, unstable
>
> Yes, we are on the same track here.

Excellent.
So let's get the data model to the point where all things contained in the non-ontology and the anti-logical "description logics", which is far removed from the database, are defined and deployed in the correct place: the Relational database.

Rather than
How to Identify a Movie
let's do
How to Identify a Movie for a Curation (museum, etc) in an Open Architecture Relational Database that serves all purposes (without Fragmentation and misplaced logic segments)

> >> A common complaint with taking the step you take is that "the key is
> >> larger" => "indexes will be larger" => [...] and especially updates,
> >> will be more expensive and the database will occupy more space".
>
> >That was a valid consideration in the days when we counted disk space in
> >megabytes and RAM kilobytes, **and** we did not have caches in the
> >server that cached the indices.
> >[...]
> >It is completely irrelevant in servers that have caches and caches its
> >indices.
> >[...]
>
> Ok, but perhaps not completely irrelevant in all circumstances. Computer
> systems have become much more powerful, but also the scale of problems
> solved by computers has changed significantly.

Those points are true, but they do not affect the declarations I have made.

Eg. I have used AddressNo and PersonNo since 1993, because then the true PKs were too wide, and the cost was relevant. Over the decades, both (a) the cost has become irrelevant, and (b) the customers want more meaning (search vectors; "Dimensions"; etc) in their lowest level tables, so I have placed more and more Person columns in (eg) OrderSaleItem or SecurityPriceDate.

While nothing is fit for all circumstances, all circumstances demand a trade-off that is made by a capable person, not one who is slavishly believing in RFS or one who is ignorant of the higher order principles in the RM or arguing that zero is equal to one.

> >> especially updates, will be more expensive
> >
> >Assuming you mean update of the Key (which might be worth discussing), and
> > not update of the row (re which the statement is false).
>
> I did not express myself clearly. I meant any change to the database: in
> this context, I was thinking especially of INSERTs, rather than UPDATEs.

Let's call the lot of them /writes/, and INSERT/UPDATE/DELETE when we are being specific.

In that case the assertion is completely false.
a. At the logical level, there is no difference (expense; cost) between a Record ID write and a RK write.
b. At the physical level, the RK write is far superior (detailed elsewhere), **AND** the entire table is maintained in efficient order (for all SELECTs and writes)
c. INSERTs are particularly stupid in Record ID environments, and particularly catered for in RK environments [fast INSERT into a physical location where an INSERT can actually happen (modulated by the amount of space the administrator reserves for this purpose)]. Further, the table can be rebuilt periodically (generally annually, overnight ... whereas I rebuild Statistics weekly). Whereas for RFS there is no point in reserving space because **every** INSERT is located on the last Page, guaranteeing a contentious "hot spot", and Stats are irrelevant, ie. normal table maintenance that does elevate speed (both SELECT and writes) cannot be done (has no effect) on a Record ID file.

Whereas those with RID tables have to rebuild every week to maintain speed. I know several customers who rebuild some volatile RID tables nightly. Obviously not in my databases, in third-party or their home-grown pig poo pickles.
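
The "hot spot" can be seen even in a toy model. The sketch below is plain Python and assumes a great deal: rows are held in Key order, a fixed number per page, with no page splits, no reserved space, no locking. The function name `page_hits` and the sample Keys are invented for illustration. It only counts which page each new row would land on.

```python
import bisect
from collections import Counter

def page_hits(existing_keys, new_keys, rows_per_page=4):
    """Toy model: rows are stored in key order, `rows_per_page` per page, no page splits.
    Returns a Counter of page number -> how many of `new_keys` would land on that page."""
    ordered = sorted(existing_keys)
    first_key_of_page = ordered[::rows_per_page]
    hits = Counter()
    for key in new_keys:
        # the target page is the last page whose first key is <= the new key
        page = max(bisect.bisect_right(first_key_of_page, key) - 1, 0)
        hits[page] += 1
    return hits

# Record ID: 16 existing rows numbered 1..16, the next 8 INSERTs get 17..24
rid_hits = page_hits(range(1, 17), range(17, 25))

# Relational Key (CustomerCode, OrderNo): 4 customers with 4 orders each,
# then each customer places 2 more orders
existing = [(c, n) for c in "ABCD" for n in range(1, 5)]
incoming = [(c, n) for n in (5, 6) for c in "ABCD"]
rk_hits = page_hits(existing, incoming)
```

In this model all eight Record ID inserts queue on the last page, while the eight RK inserts land two per page across all four pages. A real server adds page splits, fill factors and lock management on top, which this sketch does not attempt to represent.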

Read up on Clustered Index in Sybase and MSSQL. We have had this since 1984, IBM/DB2 has a similar, not quite as great, facility. That is the commercial or high end of the platforms. The low end, the freeware, are still finding out how to tie their shoelaces such that they do not trip themselves all on their own. You have no idea what you are missing.

- Concern yourself only with "All-Pages Locked table", skip "Data-Only Locked table". The former is for RKs, the latter is for RIDs.
- Concern yourself only with Clustered Index, skip Non-Clustered Index. The former is for RKs, the latter is for all AKs (RKs or RIDs). It would be really, really stupid to Cluster on an RID.
Start here:
http://infocenter.sybase.com/help/index.jsp?topic=/com.sybase.infocenter.dc00841.1570/html/phys_tune/X26562.htm

This doc might be easier to understand (this is the public version, which I provide free, paying customers get even more detail, plus configuration directions):
http://www.softwaregems.com.au/Documents/Sybase%20GEM%20Documents/Sybase%20Data%20Storage/3_2_Clustered_Index.html
http://www.softwaregems.com.au/Documents/Sybase%20GEM%20Documents/Sybase%20Fragmentation/12_II_Page_Chain.html
http://www.softwaregems.com.au/Documents/Sybase%20GEM%20Documents/Sybase%20Fragmentation/14_II_Unused_Space_Extent.html
http://www.softwaregems.com.au/Documents/Sybase%20GEM%20Documents/Sybase%20Fragmentation/15_II_Unused_Space_Page.html

Follow the [blue] links only if you have the interest.

Sybase has yet one more performance tuning facility for RKs. Starting here, use ▶ to navigate the next 5 pages (less reading, more diagrams).
http://infocenter.sybase.com/help/index.jsp?topic=/com.sybase.infocenter.dc00842.1502/html/spsysmon/X60643.htm

And all that is ONLY the physical structure of the tables. The Set Up and ongoing management of other, closely-related, resources such as Cache; the Transaction Log; Lock Management; etc, have significant impact on both table performance (queries and writes) and overall server response.

> >Updating the Key in such circumstances is, yes, more expensive and
> >onerous, but not in the way you mean. Because all the occurrences of
> >the migrated Key need to be changed, where such is required, one must
> >write a Transaction (we call these Batch Transactions) to:
> >- add the entire tree below the affected Key, with the new Key value
> >- drop the entire tree below the affected Key
> >- this is done in several loops (WHILE 1=1), the number of which equals the depth of the tree
> >- in batches of 500 or 1000 rows (SET ROWCOUNT 500)
> >- BREAKing out when the branch is empty
> >Otherwise you would strangle the server (affect all online users)
> >**and** blow the transaction log file, which is suicide (the code
> >blocks itself).
>
> In principle, one should just declare "on update cascade" and update as
> usual, letting the system choose how to avoid strangling the server.

Not possible on commercial or high end platforms, because THEY DO NOT HAVE CASCADE !!!

CASCADE is for fools (otherwise known as "academics" or "theoreticians") who cannot think. This is where I say, again, the notion that theory should be divorced from practical concerns is mind-numbingly stupid, it lets such theoreticians invent all sorts of idiocies in their dreams. Theory without a practical intent is anti-science, personal fantasy, that one should keep to themselves and their close intimates, such as others who share the same sow at night.

Science means theory + practice, only theory that has a practical intent.

There is no such thing as "letting the system choose how to avoid strangling the server", because only the person who submits the command [directly or via code execution of code segments] knows what the extent, the full effect, of the command is. Strangling the server is theoretically allowed, due to permissions (eg. give yourself Administrator privilege and you can do anything). But practically prevented, by use of OLTP Transactions.

Configuring a server, and designing an app+database is purpose-driven. Say we build a set of roadways for a city (server resources), for use by cars of usual 3, max 8 people (OLTP Transactions), if you come along with one of your CASCADEs, that is a freight train or 100 freight cars, running along the roadways. That will cause massive gridlock at every intersection, it will strangle the city, all commuters. That is why we do not run trains on roads.

According to your thinking, you say you concern yourself with commuters, and say that the freight trains are just fine for moving freight, but you ignore the consequences, saying that that is the city's problem. Typical anti-social disorder, typical victim, typical of those incapable of taking responsibility.

According to my SG Standards (actually IBM 360/CICS Standards from the 1960's, restated in RM and SQL terms), the freight is split up into box trucks, max one tonne. No trains allowed on roadways.
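
The box-truck rule, applied to a Key change, can be sketched as follows. This is Python with SQLite, not Sybase, so it shows only the shape of the Method: small transactions, a fixed batch size, break out when the branch is empty. The tables, the `rekey_customer` function and the batch size of 500 are illustrative assumptions, and the tree here is only two levels deep, so there is one loop where a deeper tree would need one per level.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Customer  (CustomerCode TEXT PRIMARY KEY, Name TEXT NOT NULL);
CREATE TABLE OrderSale (CustomerCode TEXT NOT NULL REFERENCES Customer,
                        OrderNo INTEGER NOT NULL,
                        Amount  INTEGER NOT NULL,
                        PRIMARY KEY (CustomerCode, OrderNo));
INSERT INTO Customer VALUES ('OLD', 'Acme');
""")
with conn:
    conn.executemany("INSERT INTO OrderSale VALUES ('OLD', ?, ?)",
                     [(n, n * 10) for n in range(1, 1201)])

def rekey_customer(conn, old, new, batch=500):
    """Move a Customer and its OrderSale rows from Key `old` to Key `new`,
    one small transaction per batch. Returns the number of batches moved."""
    with conn:  # short transaction: add the parent row under the new Key
        conn.execute("INSERT INTO Customer (CustomerCode, Name) "
                     "SELECT ?, Name FROM Customer WHERE CustomerCode = ?",
                     (new, old))
    batches = 0
    while True:
        with conn:  # each pass commits on its own, so locks and log stay small
            rows = conn.execute(
                "SELECT OrderNo, Amount FROM OrderSale "
                "WHERE CustomerCode = ? ORDER BY OrderNo LIMIT ?",
                (old, batch)).fetchall()
            if not rows:
                break  # the branch is empty
            conn.executemany("INSERT INTO OrderSale VALUES (?, ?, ?)",
                             [(new, order_no, amount) for order_no, amount in rows])
            conn.executemany("DELETE FROM OrderSale "
                             "WHERE CustomerCode = ? AND OrderNo = ?",
                             [(old, order_no) for order_no, _ in rows])
        batches += 1
    with conn:  # short transaction: drop the now childless old parent
        conn.execute("DELETE FROM Customer WHERE CustomerCode = ?", (old,))
    return batches

batches = rekey_customer(conn, "OLD", "NEW")
moved = conn.execute(
    "SELECT COUNT(*) FROM OrderSale WHERE CustomerCode = 'NEW'").fetchone()[0]
old_left = conn.execute(
    "SELECT COUNT(*) FROM Customer WHERE CustomerCode = 'OLD'").fetchone()[0]
```

With 1,200 child rows and a batch of 500, the loop commits three times and then breaks. Other users get in between the batches, which is the whole point.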

> The
> considerations you make are probably the best workaround (because
> cascading updates would be disastrous) for the system you work with,

It is not a "workaround". It is a scientific Method, reasoned and documented, by responsible adults, who can. Known by scientific types since 1960 (I learned it in 1976 and have used it ever since, same as other OLTP Standards).

> but
> do not necessarily carry over to other architectures or other systems.
> This is not to say that there currently is a system where such
> workarounds are not needed.

I state flatly, every system requires it, and you are irresponsible if you don't implement it. It is not based on need, or table population or size of database, it is a demand of Standards. Systems without it (95% of the "databases" out in the world) are SUB-STANDARD. As prescribed and heavily marketed by imbeciles who pass for "theoreticians".

> Research on physical optimization is still
> hot nowadays.

There is no genuine scientific research being done in this area, and there has not been since 1984-1987. All genuine research that is done (including that 1984-1987 content) is done by commercial enterprises such as Sybase and IBM. Codd was a product of precisely such a venture, and he is the only one outside the platform suppliers who gave theory that was based on practice.

The freaks that come up with "physical optimisation" are re-inventing the wheel, in their isolated ivory towers, purposefully ignorant that (a) the wheel was invented thousands of years ago, and (b) perfected for particular purposes. They are doing "research" about what we have had since 1987, and "inventing" it. That is what happens when clueless "theoreticians" who are addicted to pig poop find out that their "theories" do not work in the practical world, and then take THIRTY YEARS to come up with a new fix-it "theory".

MVCC is a great example of such monstrosities. Beloved of the "academics" who, all 10,000 of them spread across the planet, have no clue that they are breaking scientific principles, and who "invent" [copy, badly] what Oracle had that does not work.

> >The real cost (actually worth consideration) related to these issues is
> >actually whether the rows have a good, Relational index (meaning
> >Relational Key of many columns), such that the INSERT/UPDATE/DELETEs
> >are spread across the table, such that there is no contention (high
> >concurrency). Imbeciles who propose surrogates do not understand this.
>
> Am I correct that you are thinking of tables clustered on the primary
> key and spread over several disks?

No.
Tables Clustered on the PK, yes.
Whether they are on one disk or several is a separate matter, which in the commercial world is called Partitioning (the mickey mouse concept of "partitioning" is not at all the same), which is an **additional** physical level of distribution.

I did not previously mention Partitioning, my comments were about the physical distribution of rows in a table by virtue of the fact that the PK is a Clustered Index, and can only be had because the PK is a multi-column RK. I think the diagrams in the links above might explain it.

So first, the rows are physically distributed because the PK is a **Logical** RK, **and** the PK is a Clustered Index (which means Sybase; MSSQL, and DB2 only). Something that is not possible with a Record ID.

Second, an Additional level of physical distribution may be obtained by using Partitions.

In all circumstances, the physical structure (including but not limited to distribution), affects performance, and particularly contention vs concurrency.
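
For readers who have no Sybase, MSSQL or DB2 server at hand, the nearest thing that can be run anywhere is a SQLite `WITHOUT ROWID` table, which stores the rows in the PRIMARY KEY B-tree itself. It is a stand-in only: it does not reproduce the page chain, locking schemes or Partitions of the platforms named above, and the table and sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE OrderSaleItem (
    CustomerCode TEXT    NOT NULL,
    OrderNo      INTEGER NOT NULL,
    ItemNo       INTEGER NOT NULL,
    Quantity     INTEGER NOT NULL,
    PRIMARY KEY (CustomerCode, OrderNo, ItemNo)
) WITHOUT ROWID;   -- rows live in the PRIMARY KEY B-tree, there is no Record ID
INSERT INTO OrderSaleItem VALUES
    ('C2', 7, 1, 5), ('C1', 3, 2, 1), ('C1', 3, 1, 4), ('C2', 6, 1, 2);
""")

# Ask SQLite how it would locate the items of one order (leading Key columns given)
plan = conn.execute(
    "EXPLAIN QUERY PLAN "
    "SELECT ItemNo, Quantity FROM OrderSaleItem "
    "WHERE CustomerCode = 'C1' AND OrderNo = 3").fetchall()
plan_text = " ".join(str(row[-1]) for row in plan)  # last column is the description

items = conn.execute(
    "SELECT ItemNo, Quantity FROM OrderSaleItem "
    "WHERE CustomerCode = 'C1' AND OrderNo = 3 ORDER BY ItemNo").fetchall()
```

Printing `plan_text` shows how the engine reaches the rows; the wording varies between SQLite versions, so no particular text is promised here. The design point is that the rows of one order sit together under the leading columns of the Key, rather than wherever a running number happened to put them.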

The notion of Logical has been suppressed by the pig poop Gulag. Logical means:
- an abstraction
- at least one level of abstraction away from the physical
- irrelevant if not concerned with the physical (which is the implementation in a dumb machine)

Just as Codd concerned himself with the physical distribution of records in a HDBMS (Record IDs), which he was commissioned to overcome, he remained concerned about the physical distribution of rows in a RDBMS. The Clustered Index is the 1984 rendition of the 1960's Hierarchic File Structure in the HDBMS. The Independent Access Path is the 1984 rendition of the Network File Access in the 1970's NDBMS (Index instead of Randomised Key). In a commercial RDBMS, with a CI, the RK **IS** the physical distribution.

It is accurate to say that the /Relational Model/ is:
a. the HDBMS (all Hierarchic principles retained [here we are discussing only the RK] )
b. married to
c. the NDBMS (Independent Access Path, per table)
d. with a mathematical definition to form the foundation

"There is nothing new under the sun."

The freaks and gooks suppress that knowledge, purposely, in order to pretend that there was nothing before the /RM/, and to pervert the /RM/, promoting and marketing and teaching the RFS [RM/T, which is false after 1984] as "relational".

I will say again,
- everything in the universe exists in an hierarchy.
- all propositions about the universe should therefore be in **that** hierarchy (a faithful subset, not a separate conception of what the universe is)
--- otherwise it will keep changing as the need in the database changes
--- preventing structural change in the database means complying with reality, the structure of reality
- the database definition should be understood as propositions,
- expressed as FOPC Predicates,
- if they are to conform to the /RM/ faithfully (because RM is founded on FOPC)

The engineers in the high end of SQL platforms (including me) knew all that from 1970, and implemented that from 1984 onwards. They implemented CI as the HFS with RK instead of RID, and splitting each table off as per Network Access (the original HFS had many "tables" in a single file structure [eg. Customer-OrderSale-OrderSaleItem] ). So the HFS is now in the CI.

The low end, the program suites, the PusGreNONsqls, the darling monkeys who code such, are clueless about all that. They learn by repeating trial-and-error, in isolation from the real world, re-inventing from scratch, what has been available since 1984.

> >> Many problems related to first-order logic are undecidable.
> >
> >Nonsense. Give me even one real world example.
>
> "Is formula F satisfiable?" But perhaps this is not "real world" for
> you.
>
> "Is this cryptographic protocol secure?" This sounds real world enough
> to me (albeit expressed very informally). But you won't find any program
> that always answers correctly to that question (note that both "always"
> and "correctly" are critical here).

(Converted to a FOPC proposition, that is to be tested using FOL, 2-valued Logic only:
This formula is satisfiable
This cryptographic protocol is secure)

Oh, come on. It is not only that those are not real world (they are not), but that those are just variations of the usual puzzle type problems ["this statement is false"] that lecturers use to make students think. With an added kink that makes the statement itself inaccessible. They have no value except for that, they do not even exist outside the classroom.

The answer is, each such statement cancels itself, the second qualifying clause NULLIFIES the first clause. AND is assumed.
[1..∞] x 0 = 0
Therefore there is nothing (a null proposition) to solve. I am not so stupid as to solve a proposition that does not exist, due to it nullifying itself.

"Always" is arguing in the tiny corners of the Bell curve. Irrelevant in practice, in the real world. "Correctly" is demanded in the real world.

(I already know, the "academics" declare that certain things are just so, and therefore it becomes established as an academic truth, which no academic will dare to puncture. Such things remain false. Eg. Every set contains the Empty Set. ROTFLMAO. First they create a concept for something that does not exist, that they "need" for "academic" purposes, which is fine. But then they try to impose that on the real world, which does not have the non-existent thing.)

You are better than that. I ask again, give me a real world intractable or un-declarable example.

> >I insist that there is no such thing as an intractable (or
> >"undecidable") problem in FOL. AFAIC, a person who says there is,
> >simply does not understand FOL or FOPC, he does not have Logic
> >aptitude.
>
> Do not confuse problems (sets of instances) with the instances
> themselves. Undecidability refers to the former. For the cryptography
> question above, you may find several computer programs, and they are
> even useful in practice. They can solve correctly and in a finite time
> many instances of the problem. Yet none of them can solve the stated
> problem correctly in every case.

See above.

> Back to databases, I'd say that most instances of FO-related problems
> that you encounter in the "real world" are "easy" (on the scale on
> computational complexity, not on the scale of human understanding).

Ok.

> Example: the implication problem for functional and inclusion
> dependencies taken together is undecidable (in general). But I bet that
> you can always say whether or not a given constraint is implied by the
> functional dependencies and foreign keys of your model.

Irrelevant. Because anything other than Full FDs are irrelevant, a notion created by frauds and freaks who push RFS as "relational".

> you can always say whether or not a given constraint is implied

Not just Keys and FKs, but CHECK Constraints that may or may not call a Function.

Not just by FDs or partial or transitive or inclusive, but by actually understanding the data; how the Predicate is structured in relation to the real world Fact that is being stored. And of course understanding of Sets, which can only be obtained by using RKs, and appreciating Sets that are implied. Eg. my *Annotated link* above shows that a little bit.

> Those two facts
> are not contradictory.

Ok. But one is irrelevant.

>> all instances ...
>> every case ...

The preponderance on that is Modernist filth. It serves only to cripple the academic, to hinder him from completing his theory, his formulæ. It stultifies the progress of science. It stops him from progressing to the next demanded step, that of testing the formulæ on many, varied cases.

What matters is the mass in the middle of the Bell curve, not the tiny abnormal cases in the corners. The Modernists reverse that, and elevate the cases in the corners of the Bell curve, and seek to pervert the definition of NORMAL. You are already caught in the insanity. See if I can get you out of it.

We knew all the formulæ that are required for the physics of building bridges by the 1800's. The Romans knew the main of it in 300 BC, just check the aqueducts; viaducts that still stand, and stand perfectly. All that has been added since then relates to the physics of the materials that are used; the connections that are made; the stress on those connections.

So we built bridges. That never fail. All over the world.

But there are a few bridges that fail. Not because the formulæ are false, or because there is some law of physics that has not yet been discovered. But because the builder did not follow them.

And there are many places that, for hundreds of years, could not have a bridge built across them, due to various circumstances. If the physicists were hindered in publishing their formulæ until bridges could be built in every circumstance, we would have no bridges.

The notion to cover every case; all circumstances, is a Modernist trick to disable progress, to enable insane thinking.

Cheers
Derek

Derek Ignatius Asirvadem

Mar 9, 2020, 6:06:18 AM
> On Sunday, 8 March 2020 20:01:08 UTC+11, Derek Ignatius Asirvadem wrote:
> > On Friday, 6 March 2020 10:19:08 UTC+11, Nicola wrote:

Corrections & clarifications to my previous post

> Eg. my scientific proof [that a surrogate = Record ID] is (a) the effect, and (b) I do not pervert established terms to obtain false credibility (which is easily proved, as I am doing to yours). There is no difference in effect between a "semantic" or ordinary surrogate that "identifies" an "entity" and a Record ID.
> 1. It breaches the RM Relational Key requirement
> 2. it fails the Access Path Independence Rule
> ---- meaning that the descendant tables are cut off from their ancestor tables (the data hierarchy), ---- and
> ---- a JOIN is forced at the level of the breach

> > A join of any two tables on surrogates always joins all and only the
> > semantically related records.

How exactly do you ensure that it "joins all and only the **semantically related** records" ?

I have seen this trick. Something like 30 years after pushing RFS (RM/T) as "relational", and finding out that there is no data integrity whatsoever, the freaks at the Date; Darwen; Fagin; et al Gulag figured out that ooo, if they declare an FK (in the child) referencing a UNIQUE KEY (in the parent), they can obtain a tiny bit of data integrity. So they hammered the SQL Committee until they got it.

The imbeciles are clueless that although they have obtained a tiny bit of integrity between two tables only (parent and the specific child), they still do not have:
- data integrity overall
- Relational Integrity (not even a hope, not even a devil's chance in heaven)
--- which requires a composite Key ... and they are still playing with the brainless notion that the single-column Record ID in the parent is a "key", which is "semantically" referenced by the records in the child.

It remains a trick, that seduces only the uneducated, it is not real semantics (in the RM sense, which is the Standard against which comparisons can be made). Again, give me an example and we can hash it out. Doesn't have to be a real world example, just a real example that has this nonsense in it.

> > When does a change to a movie make it "another movie"? Etc.).
>
> Again. That can be determined easily using science. As evidenced in this thread.

That is, we will define the single objective reality, which is the movie that is shown, not the soft and gooey multiple realities, which is the movies that are seen by various people.

Eg. Blade Runner is one Movie, for which there are two Editions
Movie[ Blade Runner ... ]
Edition[ Blade Runner ... { Director's Cut | Commercial Edition } ]

In some other case it may be two Movies, one Edition each.

If you go with the FIAF silliness that if the "footage" changes at all (I say, only if the change is substantial), it is a different "manifestation", then:
Movie[ Blade Runner ... ]
Movie[ Blade Runner Director's Cut ] -- Variant
One Edition each.

> > >> especially updates, will be more expensive
> > >
> > >Assuming you mean update of the Key (which might be worth discussing), and
> > > not update of the row (re which the statement is false).
> >
> > I did not express myself clearly. I meant any change to the database: in
> > this context, I was thinking especially of INSERTs, rather than UPDATEs.
>
> Let's call the lot of them /writes/, and INSERT/UPDATE/DELETE when we are being specific.
>
> In that case the assertion is completely false.
> a. At the logical level, there is no difference (expense; cost) between a Record ID write and a RK write.
> b. At the physical level, the RK write is far superior (detailed elsewhere), **AND** the entire table is maintained in efficient order (for all SELECTS and writes)
> c. INSERTs are particularly stupid in Record ID environments, and particularly catered for [fast INSERT into a physical location that an INSERT can actually happen (modulated by the amount of space the administrator reserves for this purpose). Further the table can be rebuilt periodically (generally annually, overnight

Meaning a small batch job for the specific table, that is run overnight.

> ... whereas I rebuild Statistics weekly.)

Meaning all tables, large batch job that is run Sunday night.

> Whereas for RFS there is no point in reserving space because **every** INSERT is located on the last Page, guaranteeing a contentious "hot spot", and Stats are irrelevant, ie. Normal table maintenance that does elevate speed (both SELECT and writes) cannot be done (has no effect) on a Record ID file.

Added punctuation ...

> Whereas for RFS, there is no point in reserving space because **every** INSERT is located on the last Page, guaranteeing a contentious "hot spot". And Stats are irrelevant. Ie. Normal table maintenance, that DOES elevate speed (both SELECT and writes) for normal tables, cannot be done (has no effect) on a Record ID file.

> Whereas those with RID tables have to rebuild every week to maintain speed. I know several customers who rebuild some volatile RID tables nightly. Obviously not in my databases, in third-party or their home-grown pig poo pickles.

Millions of such Idiots

> Read up on Clustered Index in Sybase and MSSQL. We have had this since 1984, IBM/DB2 has a similar, not quite as great, facility. That is the commercial or high end of the platforms. The low end, the freeware, are still finding out how to tie their shoelaces such that they do not trio

trip

> themselves

up

> all on their own. You have no idea what you are missing.

> Sybase has yet one more performance tuning facility of RKs. Starting here, use ▶ to navigate the next 5 pages (less reading. more diagrams).
> http://infocenter.sybase.com/help/index.jsp?topic=/com.sybase.infocenter.dc00842.1502/html/spsysmon/X60643.htm

The point is, when a Page Split is required (due to a new row being INSERTed when the relevant Page is full):
- in the non-RK case, it simply splits the content of the Page, leaving the first half in the current Page, and writing a new Page for the second half at the nearest location [for which space has or has not been allocated]
- but in the RK case, it splits the Page on an RK boundary, leaving both the first RK [first /section/ of the Page] and the second RK [second /section/ of the Page] to grow [in future] on their respective Pages.

With a Clustered Index, the rows are in physical order of the Clustered Index (composite RK), which spreads the INSERT [and DELETE, and UPDATE] contention across the entire table. And with this enhancement, it keeps a RK on a Page as much as possible.
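
The two behaviours described above can be sketched with a toy model. This is only an illustration of the description in this post, not Sybase's actual algorithm: a Page is modelled as a sorted Python list of (leading key value, sequence) rows, and the function names and sample rows are invented for the example.

```python
def split_midpoint(page):
    """Non-RK case: cut the full page in half, wherever the middle falls."""
    mid = len(page) // 2
    return page[:mid], page[mid:]


def split_on_key_boundary(page):
    """RK case (toy version): cut where the leading key value changes,
    choosing the change that lies closest to the middle of the page."""
    mid = len(page) // 2
    boundaries = [i for i in range(1, len(page)) if page[i][0] != page[i - 1][0]]
    if not boundaries:
        # Only one leading key value on the page: nothing better than the midpoint.
        return split_midpoint(page)
    cut = min(boundaries, key=lambda i: abs(i - mid))
    return page[:cut], page[cut:]


# A full page holding rows for two leading key values, "A" and "B" (made-up data).
page = [("A", 1), ("A", 2), ("A", 3),
        ("B", 1), ("B", 2), ("B", 3), ("B", 4), ("B", 5)]

mid_left, mid_right = split_midpoint(page)
key_left, key_right = split_on_key_boundary(page)

print(mid_left, mid_right)   # "B" rows end up on both pages
print(key_left, key_right)   # all "A" rows on one page, all "B" rows on the other
```

In the toy model the midpoint split leaves the "B" rows straddling two pages, while the boundary split gives each key value its own page to grow on.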

> And all that is ONLY the physical structure of the tables. The Set Up and ongoing management of other, closely-related, resources such as Cache; the Transaction Log; Lock Management; etc, have significant impact on both table performance (queries and writes) and overall server response.

Such other considerations are mentioned, but not discussed here.

> > Am I correct that you are thinking of tables clustered on the primary
> > key and spread over several disks?
>
> No.
> Tables Clustered on the PK, yes.
> Whether they are on one disk or several is a separate matter, which in the commercial world is called Partitioning (the mickey mouse concept of "partitioning" is not at all the same), which is an **additional** physical level of distribution.
>
> I did not previously mention Partitioning, my comments were about the physical distribution of rows in a table by virtue of the fact that the PK is a Clustered Index, and can only be had because the PK is a multi-column RK. I think the diagrams in the links above might explain it.
>
> So first, the rows are physically distributed because the PK is a **Logical** RK, **and** the PK is a Clustered Index (which means Sybase; MSSQL, and DB2 only). Something that is not possible with a Record ID.
>
> Second, an Additional level of physical distribution may be obtained by using Partitions.

Partitions may be on the same disk (sounds silly, but there is a good reason which I won't detail here); on different disks; or striped across a set of disks (the best). The purpose of Partitions is to allow parallel processing of a **query** (SELECT or writes). The Sybase server is massively parallel, at both the o/s level (uses processor Cores or Threads) and the resource level (Caches; Disks; Query Processing). Which goofy myNONsql or PusGresNUNsql cannot do, because it is not a server, it runs just one thread. Oracle is the same non-architecture, but it does run multiple "instances", not Threads.

> >> all instances ...
> >> every case ...
>
> The preponderance on that is Modernist filth. It serves only to cripple the academic, to hinder him from completing his theory, his formulæ. It stultifies the progress of science. It stops him from progressing to the next demanded step, that of testing the formulæ on many, varied cases.
>
> What matters is the mass in the middle of the Bell curve, not the tiny abnormal cases in the corners. The Modernists reverse that, and elevate the cases in the corners of the Bell curve, and seek to pervert the definition of NORMAL. You are already caught in the insanity. See if I can get you out of it.
>
> We knew all the formulæ that are required for the physics of building bridges by the 1800's. The Romans knew the main of it in 300 BC, just check the aqueducts; viaducts that still stand, and stand perfectly. All that has been added since then relates to the physics of the materials that are used; the connections that are made; the stress on those connections.
>
> So we built bridges. That never fail. All over the world.
>
> But there are a few bridges that fail. Not because the formulæ are false, or because there is some law of physics that has not yet been discovered. But because the builder did not follow them.
>
> And there are many places that, for hundreds of years, could not have a bridge built across them, due to various circumstances. If the physicists were hindered in publishing their formulæ until bridges could be built in every circumstance, we would have no bridges.
>
> The notion to cover every case; all circumstances, is a Modernist trick to disable progress, to enable insane thinking.

The point is, it is the application engineer who makes all the decisions that relate to the tiny corners of the Bell curve (eg. difficult geographic location; gorges; difference in elevation at start and finish; poor ground for foundations; etc). He would rely on all those sciences, which are beyond the [bridge builder] theoretician. It is not possible for the theoretician to cover all that. And certainly not something that should stop him from getting the formulæ that cover the mass of the Bell curve complete and published.

Cheers
Derek

Nicola

Mar 13, 2020, 9:11:23 AM
On 2020-03-08, Derek Ignatius Asirvadem <derek.a...@gmail.com> wrote:
>> On Friday, 6 March 2020 10:19:08 UTC+11, Nicola wrote:

>> A join of any two tables on surrogates always joins all and only the
>> semantically related records.
>
> The point we are arguing is, there is no semantics in a surrogate.

The semantics of a surrogate derives from the bijective correspondence
of the surrogate to a real key (which is a totally arbitrary bijection,
granted). Once added, address number 123 in your entire database denotes
one and only one real address. More importantly, there is no other
surrogate in your entire database with value 123. More often than not,
databases that make extensive use of ID attributes do not guarantee the
latter property; alas, many do not even guarantee the former.

> Eg. my scientific proof is (a) the effect, and (b) I do not pervert
> established terms to obtain false credibility (which is easily proved,
> as I am doing to yours). There is no difference in effect between
> a "semantic" or ordinary surrogate that "identifies" an "entity" and
> a Record ID.

That is a valid position, and I am not going to convince you otherwise.
Below, I try to explain in what sense I think that the two concepts
differ (whether the difference is important is a separate issue).

> 1. It breaches the RM Relational Key requirement
> 2. it fails the Access Path Independence Rule
> ---- meaning that the descendant tables are cut off from their ancestor tables (the data hierarchy), ---- and
> ---- a JOIN is forced at the level of the breach

Yes.

>> A join of any two tables on record
>> identifiers produces random records.

Poor usage of words on my part. What I meant is, if your model has:

Country(ID, Name)
State(ID, Name, CountryID)
County(ID, Name, StateID)
Town(ID, Name, CountyID)

(with ID as a PK in each table) a join of Town and Country on ID doesn't
give any meaningful result (ID = record ID). If the IDs were surrogates
in the sense in which I am using the word, such a join would always be
(correctly) empty. You may say that it would be stupid to write such
a query, or that the purported advantage of the additional property
I grant surrogates over record IDs is irrelevant. Still, if I have to
use surrogates/record IDs, I'd rather like them to enjoy such property.
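
A minimal sketch of that difference, using Python's built-in sqlite3 module purely for illustration. Only Country and Town are created (State and County are left out for brevity), and the names and ID values are made up.

```python
import sqlite3


def join_town_country(country_rows, town_rows):
    """Load the given rows and join Town to Country on ID, as in the query above."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE Country (ID INTEGER PRIMARY KEY, Name TEXT)")
    con.execute("CREATE TABLE Town (ID INTEGER PRIMARY KEY, Name TEXT)")
    con.executemany("INSERT INTO Country VALUES (?, ?)", country_rows)
    con.executemany("INSERT INTO Town VALUES (?, ?)", town_rows)
    rows = con.execute(
        "SELECT Town.Name, Country.Name "
        "FROM Town JOIN Country ON Town.ID = Country.ID "
        "ORDER BY Town.ID"
    ).fetchall()
    con.close()
    return rows


countries = [(1, "Italy"), (2, "Australia")]

# Record IDs: each table numbers its own records starting from 1.
per_table = join_town_country(countries, [(1, "Springfield"), (2, "Udine")])

# Surrogates in the sense used here: no value is reused anywhere in the database.
database_wide = join_town_country(countries, [(3, "Springfield"), (4, "Udine")])

print(per_table)      # pairs of unrelated records that happen to share a number
print(database_wide)  # empty result
```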

> Neither the surrogate nor the Record ID provides Relational Integrity
> (logical), they only provide Referential Integrity (physical).
> Whereas I can guarantee that a State or County is **not** added to the
> wrong Country, I cannot guarantee that an AddressNo is **not** added
> to the wrong Agent (MovieTitle) or Party (Order Advanced DM).

No doubt that surrogates/record IDs reduce referential integrity.

>> Regarding
>> the subject area we have been discussing, I think that the database *is*
>> complete enough. Perhaps not "complete" in a "production-ready" sense,
>> but the most difficult issues within the subject area we have been
>> discussing are resolved.
>
> Ok. But I would prefer if we close some of the issues that (to me, of
> course) are not yet resolved, without going to a production-ready
> (more accurately an implementation ready) state, a case study level of
> resolution. Recall, we are dealing with true meaning; deeper meaning,
> far beyond what the FIAF lunatics with their RFS mindset and loose
> definitions can contemplate. Eg.
>
> 1. I am not sure that Language should be the determinant (the way it
> is in V0.13), Culture produces a story, not a Language, not a Country.

Can't a country and a language be used to identify a culture? I don't
grasp what alternative you are considering.

> 2. More rigourous testing re the cascade of Keys, and the alternation
> based on *Title, for the Concept→Movie→Edition→Instance data
> hierarchy. The more precise data hierarchies as defined are:
> Concept = Agent ... Concept
> Movie = Language ... TitleType ... ConceptTitle
> Edition = Language ... TitleType ... MovieTitle
>
> I am not sure that I can communicate this via IDEF1X ... but I will
> try (it is explicitly communicated in my IDEF1X Extension).
> http://www.softwaregems.com.au/Documents/Article/Database/Movie%20Title/Movie%20Title%20TA%20V0_13%20Annotated.pdf

Sorry, I am not sure I understand the purpose of that annotation.

>> But
>> that was not the point. I wanted to point out a case where the issue of
>> "identification" was relatively complex. In fact, a situation where even
>> ontologically
>
> I reject the use of that term (of course I have to accept what the
> freaks have done: create a definition outside the RFS for all the FOPC
> Predicates that they cannot place in the RDB), because it perverts the
> meaning of the term as it has stood for two millennia.

Here, I was using the word in its original sense, not in the "computer
science" sense. Before defining the facts (the database), you need to
know what the facts talk about. In the domain of arts and humanities,
that is often far from crystal clear, even to domain experts.

>> >> especially updates, will be more expensive
>> >
>> >Assuming you mean update of the Key (which might be worth discussing), and
>> > not update of the row (re which the statement is false).
>>
>> I did not express myself clearly. I meant any change to the database: in
>> this context, I was thinking especially of INSERTs, rather than UPDATEs.
>
> Let's call the lot of them /writes/, and INSERT/UPDATE/DELETE when we
> are being specific.
>
> In that case the assertion is completely false.
> a. At the logical level, there is no difference (expense; cost)
> between a Record ID write and a RK write.

Sure.

> b. At the physical level, the RK write is far superior (detailed
> elsewhere), **AND** the entire table is maintained in efficient order
> (for all SELECTS and writes)

Yes.

> c. INSERTs are particularly stupid in Record ID environments, and
> particularly catered for [fast INSERT into a physical location that an
> INSERT can actually happen (modulated by the amount of space the
> administrator reserves for this purpose). Further the table can be
> rebuilt periodically (generally annually, overnight ... whereas
> I rebuild Statistics weekly.) Whereas for RFS there is no point in
> reserving space because **every** INSERT is located on the last Page,
> guaranteeing a contentious "hot spot", and Stats are irrelevant, ie.
> Normal table maintenance that does elevate speed (both SELECT and
> writes) cannot be done (has no effect) on a Record ID file.
>
> Whereas those with RID tables have to rebuild every week to maintain
> speed. I know several customers who rebuild some volatile RID tables
> nightly. Obviously not in my databases, in third-party or their
> home-grown pig poo pickles.
>
> Read up on Clustered Index in Sybase and MSSQL. We have had this
> since 1984, IBM/DB2 has a similar, not quite as great, facility. That
> is the commercial or high end of the platforms. The low end, the
> freeware, are still finding out how to tie their shoelaces such that
> they do not trio themselves all on their own. You have no idea what
> you are missing.

Clustering tables is possible in some open-source systems (e.g., CLUSTER
in PostgreSQL - nothing like a clustered index, though); it's true that
such limited support is far from satisfactory.

Thanks for the documentation pointers.

>> Research on physical optimization is still hot nowadays.
>
> There is no genuine scientific research being done in this area, and
> there has not been since 1984-1987. All genuine research that is done
> (including that 1984-1987 content,) is done by commercial enterprises
> such as Sybase and IBM. Codd was a product of precisely such
> a venture, and he is the only one outside the platform suppliers who
> gave theory that was based on practice.

That sounds like the adage: "things were better in (my) old times". That
a good deal of fundamental research was done in the '70s and '80s,
especially by "big names", is a fact, no question. But today's computers
are not '80s computers, and there is a lot of research that is done and
that has to be done yet. Take a look at VLDB or at
https://15721.courses.cs.cmu.edu/spring2020/schedule.html.

> The freaks that come up with "physical optimisation" are re-inventing
> the wheel

Sure, that happens all the time. But from time to time some new very
good ideas pop up. That's how research works.

> in their isolated ivory towers, purposefully ignorant that
> (a) the wheel was invented thousands of years ago, and (b) perfected
> for particular purposes.

"For particular purposes" is crucial. The wheel may be significantly
un-perfected for other purposes, unforeseen decades ago, but relevant
nowadays.

> They are doing "research:" about what we
> have had since 1987, and "inventing" it. That is what happens when
> clueless "theoreticians" who are addicted to pig poop find out that
> their "theories" do not work in the practical world, and then take
> THIRTY YEARS to come up with a new fix-it "theory".
>
> MVCC is a great example of such monstrosities. Beloved of the
> "academics" who, all 10,000 of them spread across the planet, have no
> clue that they are breaking scientific principles, and who "invent"
> [copy, badly] what Oracle had that does not work.

MVCC has its drawbacks and some advantages, especially re concurrency
and performance, compared to 2PL (two-phase locking). Systems that
implement MVCC sometimes do also provide explicit lock mechanisms for the
situations where a 2PL-like behaviour is required. The consensus seems to be that such
applications are a minority and for the rest MVCC is adequate. If you
want to discuss this topic further, please move it to a new thread.

>> Am I correct that you are thinking of tables clustered on the primary
>> key and spread over several disks?
>
> No.
> Tables Clustered on the PK, yes.
> Whether they are on one disk or several is a separate matter, which in
> the commercial world is called Partitioning (the mickey mouse concept
> of "partitioning" is not at all the same), which is an **additional**
> physical level of distribution.

Ok.

> It is accurate to say that the /Relational Model/ is:
> a. the HDBMS (all Hierarchic principles retained [here we are discussing only the RK] )
> b. married to
> c. the NDBMS (Independent Access Path, per table)
> d. with a mathematical definition to form the foundation

This is a perspective which, in general terms, I wouldn't disagree with.

> "There is nothing new under the sun."

Do you think that Codd was just "reinventing the wheel"?

Nicola

Derek Ignatius Asirvadem

Mar 14, 2020, 2:48:07 AM
> On Saturday, 14 March 2020 00:11:23 UTC+11, Nicola wrote:
> > On 2020-03-08, Derek Ignatius Asirvadem <derek.a...@gmail.com> wrote:
> >> On Friday, 6 March 2020 10:19:08 UTC+11, Nicola wrote:
>
> >> A join of any two tables on surrogates always joins all and only the
> >> semantically related records.
> >
> > The point we are arguing is, there is no semantics in a surrogate.
>
> The semantics of a surrogate derives from the bijective correspondence
> of the surrogate to a real key (which is a totally arbitrary bijection,
> granted). Once added, address number 123 in your entire database denotes
> one and only one real address. More importantly, there is no other
> surrogate in your entire database with value 123.

If I understand you correctly:
- surrogates are assigned database-wide (not file-wide)
- surrogate 123 is one and only one Address record.
- surrogate 345 may be one and only one record in some other record file.

The method is clear, but the semantics or "semantics" [something else, but using the term fraudulently] have still not been explained.

Look, when I cook a curry, it certainly brings back the memory of my great aunt, who first taught me how to cook properly, and my feelings about her ... but I cannot say that my great aunt is in the curry. The influence, the teaching, is in the Form, not in the Matter. No part of a sculptor is **in** the sculpture (if there were, the sculptor would be missing pieces of himself, and cease to be the sculptor, eventually dying from loss of flesh and blood; my great aunt [bless her soul] is rotting in the ground, and there are no pieces that can be used).

There is only an esoteric [definitely not abstract] sense that the sculptor is **in** the sculpture; that my great aunt is **in** the curry, declarations that only unscientific artistic types would make.

The investment is completely subjective and personal (yes, you guys have a shared subjective reality, an echo chamber), but it has zero effect on the Matter, the material. The material remains unchanged, and obeys laws of Physics unchanged.

> More often than not,
> databases that make extensive use of ID attributes do not guarantee the
> latter property; alas, many do not even guarantee the former.

Agreed.
The former is obtained by Referential Integrity. Full stop.
The latter is ... More later.
Yes.
I am saying that it would be stupid to write such a query. That no one would. That if they did, it would result in indications that the JOIN was false, and they would then correct the JOIN to get what they intended. The point that the JOIN fails is scientifically irrelevant. It only has value in the realm of feelings, a subjective reality shared by those who share the same feeling.

It is like saying, if I take poison and the antidote, nothing will happen. Sane people do not take poison in the first place, and nothing happens. I obtain the result of "nothing happens" another, less laborious and less risky way.

But in all cases, that attempt is a mistake. If he looks at the documentation, or the catalogue in the SQL platform/suite, he would find out that he has to step through:
Town→County→State→Country.

A good modeller (while remaining in the RFS mindset) would define the following (every Key or wannabe "key" should have an explicit and private name (and datatype); never use "ID" alone):
Country( CountryID, Name )
State( StateID, Name, CountryID )
County( CountyID, Name, StateID )
Town( TownID, Name, CountyID )
That would assist in preventing stupid JOINs.
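
A minimal sketch of how the explicit names help, assuming SQLite via Python's sqlite3 module and invented sample rows. With USING, a join can only be written on a column that both tables actually have, so a direct Town-to-Country join is rejected and the query has to step through the hierarchy. A join written with ON and mismatched columns would still be accepted, so this is an aid, not a guarantee.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE Country (CountryID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE State   (StateID   INTEGER PRIMARY KEY, Name TEXT, CountryID INTEGER);
    CREATE TABLE County  (CountyID  INTEGER PRIMARY KEY, Name TEXT, StateID   INTEGER);
    CREATE TABLE Town    (TownID    INTEGER PRIMARY KEY, Name TEXT, CountyID  INTEGER);
    -- Illustrative rows only.
    INSERT INTO Country VALUES (1, 'Australia');
    INSERT INTO State   VALUES (1, 'New South Wales', 1);
    INSERT INTO County  VALUES (1, 'Cumberland', 1);
    INSERT INTO Town    VALUES (1, 'Sydney', 1);
""")

# Stepping through Town -> County -> State -> Country:
# each USING names a column that both sides of that join share.
town_with_country = con.execute("""
    SELECT Town.Name, Country.Name
    FROM Town
    JOIN County  USING (CountyID)
    JOIN State   USING (StateID)
    JOIN Country USING (CountryID)
""").fetchall()

# The shortcut cannot be expressed this way: Town has no CountryID column.
try:
    con.execute("SELECT * FROM Town JOIN Country USING (CountryID)")
    direct_join_error = None
except sqlite3.OperationalError as exc:
    direct_join_error = str(exc)

print(town_with_country)
print(direct_join_error)
```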

Yes.
I am saying that the purported advantage of the additional property you grant surrogates over record IDs is not scientific, it is scientifically irrelevant. The point that there is some advantage is not proved (or even explained as to how it is an advantage).

I am not seeking to be convinced, I am seeking specifically, what is the difference in scientific terms. Thus far, there is none.

----

On the other hand, being a scientist, and not being affected by feelings and purported advantages that are based on unscientific notions, I can prove that there is a GROSS DISADVANTAGE in the single universal surrogate.

1. In the normal RFS case
- say 200 files, one would have to have a file that stores the last-used surrogate for each file ( one record per file, in a file named say RFS_Surrogate, 200 records). Typical for RFSs written in Oracle.
--- This can be eliminated by using
INSERT ... SELECT RecordID = ( SELECT MAX( RecordID ) + 1 FROM File )
but that requires a correct understanding of ACID Transactions (which cannot be had in MVCC), so very few people actually do that
--- It is more commonly handled via { IDENTITY | AUTOINCREMENT }, in platforms that have such a facility
--- which merely pushes the function onto the platform, making it very slightly more efficient, it does NOT eliminate it
--- and as you know, maintaining such Record IDs (changing; resetting; moving) in an RFS is horrendous. (I have done my fair share over the decades, in a fraction of the time that the resident DBA does it, due to him being stuck in Record IDs, and me elevating the issue to logical rows.)
- guaranteed "hot spot" per each file, a max of one user per file can INSERT without contention
- GUID; UUID; etc, make no difference at all (except in the realm of feelings), but of course they widen the record.

2. RFS with a universal surrogate (if you can call it that)
- how, pray tell, does one ensure that the surrogate required for the next INSERT is unique across the database ?
- one would have to have a file that stores the single, cross-database surrogate ( a single record in a file named RFS_Surrogate ).
- Instead of one surrogate for each of those 200 files, you have one surrogate for the entire RFS
- Instead of one guaranteed "hot spot" per INSERT per file (200 hot spots), you have one guaranteed "hot spot" per INSERT.
- "Academics" call this progress, or "let the platform figure it out".
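
The two allocation schemes in [1] and [2] can be sketched as follows. This assumes a counter file named RFS_Surrogate as described above, with hypothetical column names FileName and LastUsed, and uses SQLite through Python only for illustration. It shows which counter record each INSERT has to update, not the locking cost itself.

```python
import sqlite3


def make_counter_file(counter_names):
    """Create RFS_Surrogate with one counter record per given name."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE RFS_Surrogate ("
        " FileName TEXT PRIMARY KEY,"      # hypothetical column name
        " LastUsed INTEGER NOT NULL)"      # hypothetical column name
    )
    con.executemany("INSERT INTO RFS_Surrogate VALUES (?, 0)",
                    [(name,) for name in counter_names])
    con.commit()
    return con


def next_surrogate(con, counter_name):
    """Increment the named counter and return the new value, in one transaction."""
    with con:
        con.execute(
            "UPDATE RFS_Surrogate SET LastUsed = LastUsed + 1 WHERE FileName = ?",
            (counter_name,))
        (value,) = con.execute(
            "SELECT LastUsed FROM RFS_Surrogate WHERE FileName = ?",
            (counter_name,)).fetchone()
    return value


# [1] One counter record per file: INSERTs into different files update different records.
per_file = make_counter_file(["Country", "Town"])
per_file_ids = [next_surrogate(per_file, "Country"),
                next_surrogate(per_file, "Town"),
                next_surrogate(per_file, "Country")]

# [2] One counter record for the whole database: every INSERT updates the same record.
universal = make_counter_file(["*"])
universal_ids = [next_surrogate(universal, "*") for _ in range(3)]

print(per_file_ids)
print(universal_ids)
```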

It takes a certain amount of technical "academic" blindness to (a) create such a hindrance [2], (b) which is horrendously worse than [1], and (c) then justify it, which (d) exposes the ivory tower isolation from reality.

3. RKs in an RDb
- In each table, the INSERT is guaranteed to be spread across the entire table. "Hot spots" are not created (I can't say they are eliminated, because we do not have any to eliminate).
- the only contention is (eg) when two people from the same Customer try to INSERT OrderSales at the exact same millisecond. And even that is resolved in milliseconds.

> Still, if I have to
> use surrogates/record IDs, I'd rather like them to enjoy such property.

That has no effect in reality, the enjoyment is cerebral only. And you suffer/enjoy the consequences that go with it (created horrendous contention), worse than one surrogate per file, regardless of whether you deny it or not.

> > Neither the surrogate nor the Record ID provides Relational Integrity
> > (logical), they only provide Referential Integrity (physical).
> > Whereas I can guarantee that a State or County is **not** added to the
> > wrong Country, I cannot guarantee that an AddressNo is **not** added
> > to the wrong Agent (MovieTitle) or Party (Order Advanced DM).
>
> No doubt that surrogates/record IDs reduce referential integrity.

No. It appears you are missing a crucial point here, re a facility that the /RM/ provides, that RFS (RM/T) cannot.
-- Referential Integrity is an SQL term, it applies at the physical level only.
---- If a RID is used, there is one additional field and one additional index
-- Relational Integrity is my term, an articulation of the capability in the /RM/. It is Logical, both in concept, and in terms of data (not manufactured or generated by the system). Of course, it is implemented at the physical level using Referential Integrity.
---- No additional column, no additional index

1, 2. Surrogates/Record IDs
-- have only Referential Integrity, between physical RECORDS.
-- Relational Integrity is not possible
-- it is NOT POSSIBLE to guarantee that (using your example definition above)
---- a State is **not** added to the wrong Country, or
---- a County is **not** added to the wrong Country or wrong Country-State
---- etc

3. RKs
-- have RELATIONAL Integrity, between ROWS
-- it is already guaranteed (no additional work or definition required) that
---- a State can **not** be added to the wrong Country, or
---- a County can **not** be added to the wrong Country or wrong Country-State
---- etc
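
A minimal sketch of one case where the difference between [1, 2] and [3] shows up, assuming hypothetical Party, Address and Order tables in SQLite via Python (suffix _S for the surrogate version, _R for the composite-key version). It is an illustration of the idea, not the model from the linked documents. The rule to be enforced is that an Order goes to an Address of the same Party.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
con.executescript("""
    -- Surrogate / Record ID style: each file has its own single-column ID.
    CREATE TABLE Party_S   (PartyID   INTEGER PRIMARY KEY, Name TEXT NOT NULL UNIQUE);
    CREATE TABLE Address_S (AddressID INTEGER PRIMARY KEY,
                            PartyID   INTEGER NOT NULL REFERENCES Party_S (PartyID),
                            Street    TEXT NOT NULL);
    CREATE TABLE Order_S   (OrderID   INTEGER PRIMARY KEY,
                            PartyID   INTEGER NOT NULL REFERENCES Party_S (PartyID),
                            AddressID INTEGER NOT NULL REFERENCES Address_S (AddressID));

    -- Composite-key style: the parent's key is carried down into the child's key.
    CREATE TABLE Party_R   (PartyName TEXT PRIMARY KEY);
    CREATE TABLE Address_R (PartyName TEXT NOT NULL REFERENCES Party_R (PartyName),
                            AddressNo INTEGER NOT NULL,
                            Street    TEXT NOT NULL,
                            PRIMARY KEY (PartyName, AddressNo));
    CREATE TABLE Order_R   (PartyName TEXT NOT NULL,
                            OrderNo   INTEGER NOT NULL,
                            AddressNo INTEGER NOT NULL,
                            PRIMARY KEY (PartyName, OrderNo),
                            FOREIGN KEY (PartyName, AddressNo)
                                REFERENCES Address_R (PartyName, AddressNo));

    -- Made-up rows: Alice has one address, Bob has two.
    INSERT INTO Party_S   VALUES (1, 'Alice'), (2, 'Bob');
    INSERT INTO Address_S VALUES (10, 1, '1 First St'), (20, 2, '2 Second St');
    INSERT INTO Party_R   VALUES ('Alice'), ('Bob');
    INSERT INTO Address_R VALUES ('Alice', 1, '1 First St'),
                                 ('Bob', 1, '2 Second St'), ('Bob', 2, '3 Third St');
""")


def accepted(sql):
    """Return True if the INSERT is accepted, False if a constraint rejects it."""
    try:
        with con:
            con.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


# Order for Alice (PartyID 1) sent to Bob's address (AddressID 20).
surrogate_accepts_mismatch = accepted("INSERT INTO Order_S VALUES (100, 1, 20)")

# Order for Alice naming AddressNo 2, which only Bob has.
composite_accepts_mismatch = accepted("INSERT INTO Order_R VALUES ('Alice', 1, 2)")

# Order for Alice at her own AddressNo 1.
composite_accepts_match = accepted("INSERT INTO Order_R VALUES ('Alice', 1, 1)")

print(surrogate_accepts_mismatch, composite_accepts_mismatch, composite_accepts_match)
```

In the _S tables both references are individually valid, so the mismatched Order is stored. In the _R tables the Order can only name an AddressNo within its own Party, so the mismatch has no way to be expressed.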

If this is not clear, if you would like more explanation, please ask. (We took up an example years ago on c.d.t, but you dropped it due to me not genuflecting to academia.)

> >> Regarding
> >> the subject area we have been discussing, I think that the database *is*
> >> complete enough. Perhaps not "complete" in a "production-ready" sense,
> >> but the most difficult issues within the subject area we have been
> >> discussing are resolved.
> >
> > Ok. But I would prefer if we close some of the issues that (to me, of
> > course) are not yet resolved, without going to a production-ready
> > (more accurately an implementation ready) state, a case study level of
> > resolution. Recall, we are dealing with true meaning; deeper meaning,
> > far beyond what the FIAF lunatics with their RFS mindset and loose
> > definitions can contemplate. Eg.
> >
> > 1. I am not sure that Language should be the determinant (the way it
> > is in V0.13), Culture produces a story, not a Language, not a Country.
>
> Can't a country and a language be used to identify a culture? I don't
> grasp what alternative you are considering.

Wherever we currently have a table that references { Language | Country }, we should instead reference CountryLanguage.

> > 2. More rigourous testing re the cascade of Keys, and the alternation
> > based on *Title, for the Concept→Movie→Edition→Instance data
> > hierarchy. The more precise data hierarchies as defined are:
> > Concept = Agent ... Concept
> > Movie = Language ... TitleType ... ConceptTitle
> > Edition = Language ... TitleType ... MovieTitle
> >
> > I am not sure that I can communicate this via IDEF1X ... but I will
> > try (it is explicitly communicated in my IDEF1X Extension).
> > http://www.softwaregems.com.au/Documents/Article/Database/Movie%20Title/Movie%20Title%20TA%20V0_13%20Annotated.pdf
>
> Sorry, I am not sure I understand the purpose of that annotation.

Pardon me.
1. Codd asks us to think in terms of Sets. Here the relevant Sets to be held in mind are the Domain of each Relational Key.
2. The Data Hierarchies of course form the components of each Relational Key.
Therefore:
- for Concept, it appears (visually, leading to meaning) that the Data Hierarchy starts at Concept ...
--- No, it actually starts at Agent. This is clear if you inspect the Concept PK
- Movie is actually dependent (not in the IDEF1X sense, but existentially dependent) on ConceptTitle
--- the hierarchy for ConceptTitle starts at (a) Language, and (b) TitleType ... ConceptTitle is a "binary relation", a cross-over between two hierarchies
- for Movie, the Data Hierarchy actually starts at Language, and then ConceptTitle
- for Edition, the Data Hierarchy starts at Language; another at TitleType; and then MovieTitle
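
To make the cascade of Keys concrete, here is a minimal sketch in Python with SQLite. It is not the V0.13 model: the table layouts and the column names (AgentName, ConceptName, Title) are assumptions chosen only for illustration. It shows just the point made above, that the Concept key starts at Agent, and that ConceptTitle carries the whole Concept key plus Language and TitleType.

```python
import sqlite3

# Hypothetical, simplified tables: the column names are assumptions,
# not the columns of the V0.13 data model.
SCHEMA = """
CREATE TABLE Agent     (AgentName    TEXT PRIMARY KEY);
CREATE TABLE Language  (LanguageCode TEXT PRIMARY KEY);
CREATE TABLE TitleType (TitleType    TEXT PRIMARY KEY);

-- Concept: the hierarchy starts at Agent, so AgentName leads the key.
CREATE TABLE Concept (
    AgentName   TEXT NOT NULL REFERENCES Agent (AgentName),
    ConceptName TEXT NOT NULL,
    PRIMARY KEY (AgentName, ConceptName)
);

-- ConceptTitle: the Concept key, crossed with Language and TitleType.
CREATE TABLE ConceptTitle (
    AgentName    TEXT NOT NULL,
    ConceptName  TEXT NOT NULL,
    LanguageCode TEXT NOT NULL REFERENCES Language (LanguageCode),
    TitleType    TEXT NOT NULL REFERENCES TitleType (TitleType),
    Title        TEXT NOT NULL,
    PRIMARY KEY (AgentName, ConceptName, LanguageCode, TitleType),
    FOREIGN KEY (AgentName, ConceptName)
        REFERENCES Concept (AgentName, ConceptName)
);
"""


def open_sketch():
    """Create an in-memory database with foreign keys enforced."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def key_columns(conn, table):
    """Return the primary-key columns of `table`, in key order."""
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    # Each row is (cid, name, type, notnull, default, pk); pk is 1-based.
    keyed = [(row[5], row[1]) for row in rows if row[5] > 0]
    return [name for _, name in sorted(keyed)]
```

Reading key_columns(conn, "ConceptTitle") off left to right gives the Data Hierarchy for that table, which is the sense in which the hierarchies "form the components of each Relational Key".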

(Oops, I just spotted an error. Movie is Independent, but shown as Dependent. Calls for another iteration. I will wait until some of these questions get resolved.)

> >> But
> >> that was not the point. I wanted to point out a case where the issue of
> >> "identification" was relatively complex. In fact, a situation where even
> >> ontologically
> >
> > I reject the use of that term (of course I have to accept what the
> > freaks have done: create a definition outside the RFS for all the FOPC
> > Predicates that they cannot place in the RDB), because it perverts the
> > meaning of the term as it has stood for two millennia.
>
> Here, I was using the word in its original sense, not in the "computer
> science" sense. Before defining the facts (the database), you need to
> know what the facts talk about. In the domain of arts and humanities,
> that is often far from crystal clear, even to domain experts.

No kidding.
Whereas the current "scientists" and "academics" deal with un-resolved "facts", using non-science; speculation; redefined terms; etc, the arts and humanities people deal with mush that keeps changing, using an even worse set of non-science. AFAIC there are no domain experts in those areas, only people who keep some track of undefined, ever-changing mush.

None of them have heard of Aristotle, let alone that all such things were scientifically defined by him in 350BC. Needless to say, I use his methods, and thus have no problem at all defining the "undefinable", resolving the "un-resolvable". Like I said, give me an example of an intractable or "un-decidable" problem.

> > Read up on Clustered Index in Sybase and MSSQL. We have had this
> > since 1984, IBM/DB2 has a similar, not quite as great, facility. That
> > is the commercial or high end of the platforms. The low end, the
> > freeware, are still finding out how to tie their shoelaces such that
> > they do not trip themselves all on their own. You have no idea what
> > you are missing.
>
> Clustering tables is possible in some open-source systems (e.g., CLUSTER
> in PostgreSQL - nothing like a clustered index, though); it's true that
> such limited support is far from satisfactory.

As I said, they are re-inventing the wheel that was perfected on commercial platforms in 1984 (thirty six years), they are still learning how to tie their shoelaces without tripping themselves up all on their own (exclusive table lock for the duration of the CLUSTER command ... and it is a command, not an index). Maybe in another thirty years they will achieve what we had in 1984.

They will need thirty more years to achieve multiple threads per query, that we had in 1993.

> Thanks for the documentations pointers.

You are welcome.

> >> Research on physical optimization is still hot nowadays.
> >
> > There is no genuine scientific research being done in this area, and
> > there has not been since 1984-1987. All genuine research that is done
> > (including that 1984-1987 content,) is done by commercial enterprises
> > such as Sybase and IBM. Codd was a product of precisely such
> > a venture, and he is the only one outside the platform suppliers who
> > gave theory that was based on practice.
>
> That sounds like the adage: "things were better in (my) old times".

I did not mean it as such. This is a scientific discussion. The emphasis was on these so-called researchers, who are totally ignorant of wheels, that they have been perfected for specific purposes, who go about inventing a wheel, anyway, from scratch.

> That
> a good deal of fundamental research was done in the '70s and '80s,
> especially by "big names", is a fact, no question. But today's computers
> are not '80s computers, and there is a lot of research that is done and
> that has to be done yet. Take a look at VLDB or at
> https://15721.courses.cs.cmu.edu/spring2020/schedule.html.

It appears that that proves my case more than it does yours !!!

>>> CMU 15-721
This course is a comprehensive study of the internals of modern database management systems. It will cover the core concepts and fundamentals of the components that are used in both high-performance transaction processing systems (OLTP) and large-scale analytical systems (OLAP). The class will stress both efficiency and correctness of the implementation of these ideas. All class projects will be in the context of a real in-memory, multi-core database system. The course is appropriate for graduate students in software systems and for advanced undergraduates with dirty systems programming skills.
<<<

That is how to teach young minds how to be stupid. Exclude actual modern database management systems such as Sybase; DB2; MSSQL, and include only pieces of pig poop such as PusGrossNONsql. Stonebraker is right there at the top, proving again, that academics love academics and have no clue about the world outside academia. Invent a wheel from scratch, in isolation from the real world. Then spend thirty years getting it to work. It is based on the elevation of personal power and capability, in an ignorant state, in stead of real education, which [latter] would eliminate the thirty years. The very definition of Modernist "education".

> > The freaks that come up with "physical optimisation" are re-inventing
> > the wheel
>
> Sure, that happens all the time. But from time to time some new very
> good ideas pop up. That's how research works.

Name one. In this field. Please.
(There are genuine advances made by genuine research in other hard sciences. Eg. biology has proved Darwin's theory false [separate to the fact that there has never been any proof], using Darwin's own measure of proof.)

The point I was making is, that all, a full 100%, of the progress in database science comes from the engineers of the large commercial SQL platforms, same as in the 1960's and 1970's, when absolutely all the progress in *DBMS and database technology came from them. Codd is included in that, because he was employed by IBM and given a commissioned project. Never from an academic or theoretician.

What academia does, what they have produced in FIFTY Years (since the /RM/), using their Modernist "science", is a big fat zero.

Not one single paper that articulates the /RM/.
Not one single paper that progresses the /RM/.
About 120 papers that propagandise RFS [RM/T] as "relational".
We call that sabotage and regression, you guys call it "science".

> > in their isolated ivory towers, purposefully ignorant that
> > (a) the wheel was invented thousands of years ago, and (b) perfected
> > for particular purposes.
>
> "For particular purposes" is crucial. The wheel may be significantly
> un-perfected for other purposes, unforeseen decades ago, but relevant
> nowadays.

Absolutely.

The real progress, if that can be had, is to perfect a wheel for a new specific purpose, one that needs a wheel that does not yet exist.

What they are doing is, "creating" a wheel that was perfected in 1984, from scratch and in ignorance, in 2002, which is badly broken, and then "perfecting" over 20 years and counting. There will never be a day when PusGrossNONsql has either a CLUSTER TABLE or a CLUSTERED INDEX that comes even remotely close to a Sybase Clustered Index of 1984.

> > They are doing "research" about what we
> > have had since 1987, and "inventing" it. That is what happens when
> > clueless "theoreticians" who are addicted to pig poop find out that
> > their "theories" do not work in the practical world, and then take
> > THIRTY YEARS to come up with a new fix-it "theory".
> >
> > MVCC is a great example of such monstrosities. Beloved of the
> > "academics" who, all 10,000 of them spread across the planet, have no
> > clue that they are breaking scientific principles, and who "invent"
> > [copy, badly] what Oracle had that does not work.
>
> MVCC has its drawbacks and some advantages, especially re concurrency
> and performance, compared to 2PC. Systems that implement MVCC sometimes
> do also provide explicit lock mechanisms for the situations where
> a 2PC-like behaviour is required. The consensus seems to be that such
> applications are a minority and for the rest MVCC is adequate. If you
> want to discuss this topic further, please move it to a new thread.

Done.

> > It is accurate to say that the /Relational Model/ is:
> > a. the HDBMS (all Hierarchic principles retained [here we are discussing only the RK] )
> > b. married to
> > c. the NDBMS (Independent Access Path, per table)
> > d. with a mathematical definition to form the foundation
>
> This is a perspective which, in general terms, I wouldn't disagree with.

You are one of very few. The others, the great mass of "theoreticians" in this field, deny the evidenced facts. And promote RFS as "relational". You are the single academic in my 44 years in this industry (about 22 large projects, and about 80 small projects, all involving many academics) who has become (a) open to the reality of the real world, and (b) wishes to know the real, unsuppressed /Relational Model/.

> > "There is nothing new under the sun."
>
> Do you think that Codd was just "reinventing the wheel"?

Using the established analogy, and the declaration I made immediately above. No.

a. The HDBMS wheel was perfected, for its specific purpose. In fact, perfected differently for each different wheel (commercial platform, there were four big ones), due to their internal physical structure being different.

c. The NDBMS wheel was perfected, for its specific purpose. There was, and still is, just one platform supplier. There has been no change to that perfected wheel (or the underlying physical structure that predicates its particulars) since 1978. And that is very small fine-tuning. No substantial change since 1965.

While the wheels were fit for purpose, there was still more progress to be made, more demands to be met. In those days, only large manufacturers had the money to own the big expensive machines of the day: 3M; Corning; Kodak; many steel and aluminium companies (all my customers, when I was at Cincom, and fondly remembered).
1. The Bill of Materials problem was a big deal.
2. Running OLTP on one database and then OLAP on a second was a major problem to be overcome (nowadays it is accepted as the norm !!! And I am the exception).

The point is, Codd was **commissioned** by IBM to find a solution to those two problems (not to perfect the already perfect wheels). Being a rock solid researcher, he researched everything about the subject, and all the problems in the contemporary platforms (as enumerated in the /RM/). Additionally, he had indications and directions from other IBM ventures.
3. IBM was well aware that their RKs were fabulously successful, but they were still RID based at the file level, a known integrity problem
4. and they were well aware that their ISAM access lacked the speed of NDBMS Randomised Access for OLTP (Cincom was chomping up that market, which was previously HDBMS-only).

>>> RM
Acknowledgment. It was C. T. Davies of IBM Poughkeepsie who convinced the author of the need for data independence in future information systems.
<<<

Data Independence is the Relational Key. The whole Relational Key. And nothing but the Relational Key. So help me Codd.

So from those two perfected wheels [a][c], plus a few unfinished, nowhere-close-to-perfect lab-model wheels, he created one new wheel, for a specific new purpose:
- OLTP
- plus OLAP on the same database (eliminated a separate db)
- Relational keys (eliminating RIDs)
--- reinforced the Hierarchic structure of data, and the storage thereof
- Bill of Materials, without duplication of records (duplication which was required previously)
- Network access to separate tables (elimination of the many-files-in-one-hierarchy structure)

Which is progress of [a][c], and the invention of [b], as a machine.

And then [d] a mathematical foundation.

Cheers
Derek

Derek Ignatius Asirvadem

Mar 15, 2020, 3:28:17 AM
Nicola

> (Oops, I just spotted an error. Movie is Independent, but shown as Dependent. Calls for another iteration. I will wait until some of these questions get resolved.)

With a view to getting the DM complete enough to be useful for this exercise (not "production ready"), I would like to resolve all un-resolved items. I have identified several such items in previous posts, these await your discussion. Here is another.

Consider:
- In the FIAF manual, where the definitions are loose and floppy, and anything can be any or all of { Work | Variant | Manifestation | Item }, they have Identifiers (our StandardType) on all four elements of their four-level hierarchy.
- In our DM, where definitions are tight, I had implemented StandardType (FIAF Identifiers) on the three levels of the hierarchy.
But our Concept is purely cerebral, not a realisation/manifestation, which is our Project/Movie. So the question is, can there be a FIAF Identifier/StandardType for a Concept ? I suspect it is for material things only, for Movie & Edition, not for [our definition of] cerebral things. But I am not sure, our Concept may well be a material thing to them.
- I have gone through the FIAF Manual again, and could not resolve this point. Over to you.
- (Btw, that little exercise re-inforced our notion that our treatment of Variant & Derivative is superior to theirs.)

Another reason the DM is not complete is, I have not removed the Nulls. Which absolutely have to be removed before I let a DM be considered complete. It is one of those second-to-last steps.

As discussed, more Constraints, please. Especially difficult ones.

I can appreciate that you are busy, with the conditions in Northern Italy making life very difficult. If you have any time to respond to this thread, this subject is the one I would like you to prioritise.

As I am walking through the DM, I am still finding little errors. I was hoping that your walk-through, and reading off the Predicates as you go, would have caught some of those.

Cheers
Derek

Nicola

Mar 16, 2020, 2:24:07 PM
On 2020-03-15, Derek Ignatius Asirvadem <derek.a...@gmail.com> wrote:
>> Can't a country and a language be used to identify a culture? I don't
>> grasp what alternative you are considering.
>
>Wherever we currently have a table that references { Language | Country},
>we should instead reference CountryLanguage.

Language still gets me confused. A language is described by Language
(e.g., Russian) and CharacterSet (e.g., cyrillic). Defined this way, it
seems appropriate to be referenced, say, by Titles, which are written in
a certain language and character set. But why would a Culture or Person
need to be associated to a specific character set? That does not make
sense to me. Perhaps, if you tell me which ISO LanguageCodes are
supposed to be, I will understand better.

That said, what are you proposing? That (Concept|Movie)Titles should
incorporate CountryCode? I would go a different route (see below).

>> > 2. More rigourous testing re the cascade of Keys, and the alternation
>> > based on *Title, for the Concept→Movie→Edition→Instance data
>> > hierarchy. The more precise data hierarchies as defined are:
>> > Concept = Agent ... Concept
>> > Movie = Language ... TitleType ... ConceptTitle
>> > Edition = Language ... TitleType ... MovieTitle
>> >
>> > I am not sure that I can communicate this via IDEF1X ... but I will
>> > try (it is explicitly communicated in my IDEF1X Extension).
>> > http://www.softwaregems.com.au/Documents/Article/Database/Movie%20Title/Movie%20Title%20TA%20V0_13%20Annotated.pdf
>>
>> Sorry, I am not sure I understand the purpose of that annotation.
>
>Pardon me.
>1. Codd asks us to think in terms of Sets. Here the relevant Sets to be held in mind are the Domain of each Relational Key.
>2. The Data Hierarchies of course form the components of each Relational Key.
>Therefore:
>- for Concept, it appears (visually, leading to meaning) that the Data Hierarchy starts at Concept ...
>--- No, it actually starts at Agent. This is clear if you inspect the Concept PK
>- Movie is actually dependent (not in the IDEF1X sense, but existentially dependent) on ConceptTitle
>--- the hierarchy for ConceptTitle starts at (a) Language, and (b) TitleType ... ConceptTitle is a "binary relation", a cross-over between two hierarchies
>- for Movie, the Data Hierarchy actually starts at Language, and then ConceptTitle
>- for Edition, the Data Hierarchy starts at Language; another at TitleType; and then MovieTitle

I am considering some (perhaps too naive) changes. Please take a look at
this:

https://send.firefox.com/download/b2c409b72171dddf/#x_7fCElwXgWmgV2-MVWETg

Sorry, there's only so much time I can devote to writing posts in this
period. I hope the diagram is self-explaining. The biggest difference is
that I am putting Movie (now Moving Image) under Concept rather than
ConceptTitle. I have not redrawn the whole model: what is not there is
assumed to be unchanged, except for obvious changes, e.g., to foreign
keys.

One thing to be noted (in your model as well as in my sketched changes):
consider a Concept B which is a variant of Concept A. Then, B and
A cannot have a common ConceptTitle. That may be too restrictive. One
may assume that a variant "inherits" all the titles of the Concept it is
a variant of, but that may not be satisfactory either. Something to
think about (at least for me).

Ditto for Movie/MovieTitle.

>(Oops, I just spotted an error. Movie is Independent, but shown as
>Dependent. Calls for another iteration. I will wait until some of
>these questions get resolved.)

Correct, but see above. I'd make Movie dependent on Concept.

> Consider:
> - In the FIAF manual, where the definitions are loose and floppy, and
> anything can be any or all of { Work | Variant | Manifestation | Item
> }, they have Identifiers (our StandardType) on all four elements of
> their four-level hierarchy.
> - In our DM, where definitions are tight, I had implemented
> StandardType (FIAF Identifiers) on the three levels of the hierarchy.
> But our Concept is purely cerebral, not a realisation/manifestation,
> which is our Project/Movie. So the question is, can there be a FIAF
> Identifier/StandardType for a Concept ? I suspect it is for material
> things only, for Movie & Edition, not for [our definition of] cerebral
> things. But I am not sure, our Concept may well be a material thing
> to them.

But it is absolutely not material in our model. AFAIK, the various
standard identifiers are for what we have called Movie or Edition
(Instances of course come with their own codes, barcodes, serial numbers
or whatever).

> - I have gone through the FIAF Manual again, and could not resolve
> this point. Over to you.

That's because what a "Work" is, is not clear. They sometimes say Work is
"abstract", sometimes that it is "concrete". E.g., at p. 9, Work is
labeled as an "abstract entity". Then, at p. 224 you find this pearl:
«moving image "works" are more easily conceptualized as concrete
entities».

> - (Btw, that little exercise re-inforced our notion that our treatment
> of Variant & Derivative is superior to theirs.)

I agree.

> Another reason the DM is not complete is, I have not removed the
> Nulls. Which absolutely have to be removed before I let a DM be
> considered complete. It is one of those second-to-last steps.

Have you marked optional attributes in some way? I have assumed that
all attributes are mandatory.

> As discussed, more Constraints, please. Especially difficult ones.

I'd like to resolve the above first.

> I can appreciate that you are busy, with the conditions in Northern
> Italy making life very difficult. If you have any time to respond to
> this thread, this subject is the one I would like you to prioritise.
>
> As I am walking through the DM, I am still finding little errors.
> I was hoping that your walk-through, and reading off the Predicates as
> you go, would have caught some of those.

Things were moving fast here (locking down, etc.) and we had to adapt
quickly, e.g., to giving remote classes. I have to focus mostly on that
now. Besides, working from home bears its own issues.

Nicola

Derek Ignatius Asirvadem

Mar 17, 2020, 11:54:48 AM
> On Tuesday, 17 March 2020 05:24:07 UTC+11, Nicola wrote:
> > On 2020-03-15, Derek Ignatius Asirvadem <derek.a...@gmail.com> wrote:
>
> >> Can't a country and a language be used to identify a culture? I don't
> >> grasp what alternative you are considering.
> >
> >Wherever we currently have a table that references { Language | Country},
> >we should instead reference CountryLanguage.

That means EITHER Language OR Country

The purpose of the suggestion is to make the reference more specific: Dostoyevsky was Russian, and wrote in Russian. Khodorkovsky is Jewish (there is no such thing as a Russian Jew [ or an Italian Jew ...} but I am not arguing that point here) in Russian, writing mostly is English when he transfers his stolen money to Mayer Rothschild. Russian to his second wife. English to his Soros funded English-speaking anti-Russian protesters. Each of them define a culture, rather than only a Country, or only a Language.

Eg. Chinese (because it is not a Language in the normal [Indo-European group of Languages] sense) has no alphabet, but has five common renderings: three different symbol sets implemented in CharacterSets (using LanguageCode[ zh ] ); plus two more in supposedly phonetic English (using LanguageCode[ en ] ).

(The proper term for the Chinese characters is ideographs: symbols, but not hieroglyphs, because hieroglyphs include sounds, and Chinese symbols are for concepts, not sounds. Therefore their entire vocabulary, which is 40,000, is, um, their dictionary.)

CharSets are a different thing. ISO is starting to go insane, due to Modernist "science". ISO 8859-1 is UTF-8 and UTF-16 plus a few bits of nonsense pandering to peoples who had no written language, but now pretend that they do (ie. they use the Latin alphabet plus diacritical marks to produce the phonetics).

> Language still gets me confused. A language is described by Language
> (e.g., Russian) and CharacterSet (e.g., cyrillic).

That is true. But I would not state it that way. A Language has just one CharacterSet. Full FD. Therefore, a Language USES one CharacterSet. Sure, it is a Descriptor, but any of the non-Key columns are Descriptors of the Key. The Key is a Definer, so just LanguageCode defines a Language.

> Defined this way, it
> seems appropriate to be referenced, say, by Titles, which are written in
> a certain language and character set.

If only to display the Title correctly on any device, in any place (whether the device has the CharSet or not; etc). UTF-8 gets past some, but not all, of that. Eg. the notion of 1-to-4 bytes per character is problematic.

So yes, it is appropriate.
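
For reference, the 1-to-4 issue can be seen with a few lines of Python. This is only an illustrative sketch; the sample characters are my own choice, not titles from the DM. It counts how many bytes UTF-8 needs for one character from each of several scripts.

```python
def utf8_width(ch: str) -> int:
    """Number of bytes UTF-8 needs to encode a single character."""
    return len(ch.encode("utf-8"))


# Sample characters chosen for illustration only.
samples = {"Latin": "P", "Cyrillic": "П", "CJK": "中", "Emoji": "🎬"}

for script, ch in samples.items():
    print(script, ch, utf8_width(ch))
```

The widths come out as 1, 2, 3 and 4 bytes respectively, so a Title column sized in bytes cannot be assumed to hold the same number of characters in every Language.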

> But why would a Culture or Person
> need to be associated to a specific character set? That does not make
> sense to me.

Well, it isn't. (I don't know how you concluded that.)

As per Note [ 2 ] in the right margin:
CountryCode & LanguageCode:
• in PersonBirthCountryCode & PersonFull,
it is that of birth
• in Agent, it is the operative value, relevant to
Concept & Movie.

(PersonBirthCountryCode = PersonBirthCountry -- mistake)

So for a curated system, it is good to know what ( Country, Language ) he was born into. And separately, what ( Country, Language ) he operated in when he created a Concept or Movie. That latter may be multiple.

Separately, if and when a Title is displayed, the database informs the developer as to which CharSet is required ... allowing him to display it as is ( goop in UTF-8 ), or throw an error [ Title requires CharSet[ goop ], which is not loaded on this device ]
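
A rough sketch of that developer-side check, in Python. The function name render_title and the exception CharSetNotLoaded are hypothetical, and Python's codec registry stands in for "CharSets loaded on this device", which is an assumption about how a real device would expose that information.

```python
import codecs


class CharSetNotLoaded(Exception):
    """Raised when the CharSet a Title requires is not available."""


def render_title(title: str, charset: str) -> bytes:
    """Encode `title` in the CharSet the database says it requires.

    The codec registry is used here as a stand-in for the set of
    CharSets loaded on the device (an assumption for this sketch).
    """
    try:
        codecs.lookup(charset)
    except LookupError:
        raise CharSetNotLoaded(
            f"Title requires CharSet[ {charset} ], "
            "which is not loaded on this device"
        ) from None
    return title.encode(charset)
```

For example, render_title("Пигмалион", "iso8859_5") succeeds because the Cyrillic codec is present, whereas render_title("Пигмалион", "goop") raises CharSetNotLoaded.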

> Perhaps, if you tell me which ISO LanguageCodes are
> supposed to be, I will understand better.

That is a different (unrelated) stand-alone question. I don't think it will clarify our issue.

Overview:
https://www.iso.org/iso-639-language-codes.html

List:
https://www.loc.gov/standards/iso639-2/php/code_list.php
(Nothing is too naïve or too trivial.)
Beautiful diagram !

> Sorry, there's only so much time I can devote to writing posts in this
> period. I hope the diagram is self-explaining.

> I have not redrawn the whole model: what is not there is
> assumed to be unchanged, except for obvious changes, e.g., to foreign
> keys.

No problem. Yes, it is self-explanatory.

> The biggest difference is
> that I am putting Movie (now Moving Image) under Concept rather than
> ConceptTitle.

(In responding, I am torn between answering what you have given me, which will be more specific to your need/question, vs answering from DM V0.13, which is built up over 80 or so posts plus the manuals. I think the latter, because it will be less text, sorry. And at the end, I will respond to any loose ends.)

So the question is why do you want that, what do we gain that the model does not now provide ? (And then, less or more constraints)

The set of Predicates we have cultivated thus far in V0.13 provides:
- a Concept is abstract and does not exist materially
- it starts existing materially when it has a ConceptTitle, that is how it gets established; gets known
--- so there is at least one ConceptTitle (the original Language)
--- and it may be known in other cultures as a local ConceptTitle (different Language)
--- which means 1-to-n ConceptTitles. Eg. any story that is known across cultures, usually mythology; Bible stories; children's stories; etc
- when a Movie is created, in a different time and space from the Concept, it is from one of those [known as] ConceptTitles, which would be in the Concept[ CreatorName & Language ]
Concept[ [ GreekCulture ], 350BC, Pygmalion] -- Greek Characters (eg. Galatea)
ConceptTitle[ [ GreekCulture ], 350BC, el, Original, πυροβελόνη ] -- /Pygmalion/ translated by Systran
ConceptTitle[ [ GreekCulture ], 350BC, en, Translated, Pygmalion ]
ConceptVariant[ [ Shaw, George Bernard], 1913, en, Original, Pygmalion ] -- play, English Characters (eg. Eliza Doolittle)
- Then the Movie, the first was in German (less famous)
ConceptVariant[ [ Shaw, George Bernard], 1913, de, Translation, Pygmalion ] -- same name in German (they know Greek mythology)
Movie[ [ Oberländer, Heinrich ], 1935, de, Pygmalion, ] -- no Differentiator
TitleType[ Translation ], CountryCode_Produced[ de ], LanguageCode_Produced[ de ]
- Then the Musical and Movie (more famous)
ConceptVariant[ Lerner & Loewe, 1956, en, Original, My Fair Lady ]
-- because you have only movies, books; plays; musicals; etc, which are necessary only for the addition of a Movie, have to be stored as a ConceptVariant
Movie[ Lerner & Loewe, 1964, en, My Fair Lady, ] -- no Differentiator
TitleType[ Original ], CountryCode_Produced[ us ], LanguageCode_Produced[ en ]

> One thing to be noted (in your model as well as in my sketched changes):
> consider a Concept B which is a variant of Concept A. Then, B and
> A cannot have a common ConceptTitle.

Yes.

> That may be too restrictive.

(Not sure what you mean by "too restrictive")

Concept[ A ] and ConceptVariant[ B ] certainly have different Titles, at least they are allowed to.

Which is [one reason] why a Movie is a child of ConceptTitle and not of Concept (which would have many ConceptTitles)

> One
> may assume that a variant "inherits" all the titles of the Concept it is
> a variant of,

Definitely not. That would lead to insane coding. All valid possibilities must be (a) designed, and (b) constrained.

>>>> Eg.
Movie_TitleIsValid_ck has a typo, sorry, it should be:
CHECK TitleType IN ( "Original", "Preferred", "Alternate" )
<<<<
And maybe we should add Translated to that list.
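
To show the effect of such a constraint, here is a small sketch using SQLite from Python. The constraint name and the three permitted values are taken from the correction above; the table MovieTitle and its two columns are simplified assumptions, not the V0.13 definition. SQLite wants single quotes for string literals, hence the change of quote style.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical, cut-down table: only the CHECK itself follows the post.
conn.execute("""
    CREATE TABLE MovieTitle (
        Title     TEXT NOT NULL,
        TitleType TEXT NOT NULL,
        CONSTRAINT Movie_TitleIsValid_ck
            CHECK (TitleType IN ('Original', 'Preferred', 'Alternate'))
    )
""")


def try_insert(title: str, title_type: str) -> bool:
    """Return True if the row is accepted, False if the CHECK rejects it."""
    try:
        conn.execute(
            "INSERT INTO MovieTitle VALUES (?, ?)", (title, title_type)
        )
        return True
    except sqlite3.IntegrityError:
        return False
```

With this in place, try_insert("My Fair Lady", "Original") is accepted and try_insert("My Fair Lady", "Screen") is rejected. Adding Translated to the list would be a one-word change to the IN clause.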

(I would recommend strongly that you give up the notion of "inherit", because it is established in the OO/ORM insanity ... where they deny proper design (decomposition ... analysis) and deny the natural hierarchies, but then have to add fragments of other isolated fragments (called Objects) to make a particular isolated Object work. They end up with 40 or 50 "inherits", that are all relevant only to the particular version of the developer's output, and otherwise totally irrelevant. Refactoring monthly. Total insanity.

In FOPC & RM, we have hierarchies, we are not denying the Facts that exist in nature. In which the child is OWNED (Matter) by the parent if Identifying, or EMPOWERED (Form, power to Act) by the parent if Non-Identifying. Far superior to the stupefying "inherits".

The myopic idiots, in attempting to understand the /RM/, say that a Subtype "inherits" the columns in the Basetype. It does not. The columns in the Basetype are COMMON.)

> but that may not be satisfactory either. Something to
> think about (at least for me).
>
> Ditto for Movie/MovieTitle.

Can you name a situation in which a Movie with multiple MovieTitles is merchandised as a single Edition ... but with the many MovieTitles unresolved ? What exactly would be printed on the cover of the DVD ?

The FIAF Manual does not resolve it (we did), but it does give quite a few examples of the Edition having to be one Language & Title. That is why we have Screen as a possibility in TitleType.

Therefore, I am saying, Edition is dependent on MovieTitle.

----
Ok, now for your very good DM (it is not a sketch, it is a DM of the relevant part).

> May a Moving Image be produced by a Culture different from the one of the associated Concept? I would rather create a new Concept instance in that case

Yes. I have tried to illustrate that in the example above. The columns as they stand (V0.13) ensure that: in order to have a different { CountryCode, LanguageCode }, it has to be a different Concept, which is a ConceptVariant.

> Relation Culture..Movie

We already have relations:
CountryLanguage produces 0-to-n Concepts
CountryLanguage produces 0-to-n Movies
But yes, the FK columns are missing in Movie. There are a few errors of that ilk that I have found and corrected. Next version.

I will change CountryLanguage to Culture.

> Is it true that every Moving Image title is a Concept title? I’d say so: after all, a Moving Image instance is a manifestation of a Concept.

Yes. And each MovingImage.Title is not one of the many ConceptTitles via Concept, but precisely one ConceptTitle.

> Relation ConceptTitle..MovieTitle

Three items.

1. It is not necessary because what you are trying to do (ensure the Movie references exactly one MovieTitle) is what I have already done (that you had not understood, but hopefully my explanation in this post suffices).

2. It is not cyclic in the DAG sense, but it is redundant, commonly and incorrectly called "cyclic" because it is visually a loop. It nevertheless raises a flag that the modelling is incorrect or incomplete.

3. IDEF1X and data modelling. You cannot have a Basetype with a single Subtype (which must be /one of/ or /any of/, and is not /zero-or-one of/). That is properly an optional column and therefore an optional table. See ConceptVariant in V0.14.

> (1) Formerly called TitleReduced.

Definitely better. Keeping with my Naming Convention: Title_Concept. The underscore means something, a qualifier. See other uses.

> (3) Acyclic.

Yes. Enforced. By the Constraint. Further defined in ConceptVariant to remove the NULL FK. Detailed below.
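
The logic such a Constraint has to enforce can be sketched in a few lines of Python. This is not the Constraint in the DM: the function name creates_cycle and the plain-dictionary representation of "is a variant of" are assumptions made only for illustration.

```python
def creates_cycle(variant_of: dict, child: str, parent: str) -> bool:
    """True if recording `child` as a variant of `parent` would close a loop.

    `variant_of` maps each variant to the Concept it is a variant of
    (a hypothetical in-memory stand-in for the ConceptVariant rows).
    """
    node = parent
    seen = set()
    # Walk up from the proposed parent; meeting `child` means a loop.
    while node is not None and node not in seen:
        if node == child:
            return True
        seen.add(node)
        node = variant_of.get(node)
    return False
```

For example, given a chain in which My Fair Lady is a variant of Shaw's Pygmalion, which is a variant of the Greek Pygmalion, an attempt to record the Greek Pygmalion as a variant of My Fair Lady is reported as a cycle and should be rejected.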

PDFs
Please feel free to annotate my PDFs with notes and comments such as this. That may be quicker than creating a new diagram. My PDFs are not protected, Preview (and any Adobe viewer) has an AddNote facility. Toolbox.

> >(Oops, I just spotted an error. Movie is Independent, but shown as
> >Dependent. Calls for another iteration. I will wait until some of
> >these questions get resolved.)
>
> Correct, but see above. I'd make Movie dependent on Concept.

I have given my response. Over to you to see if that is acceptable, or if you still have reasons for Movie dependent on Concept ... that has many ConceptTitles. Or more discussion.

> > Consider:
> > - In the FIAF manual, where the definitions are loose and floppy, and
> > anything can be any or all of { Work | Variant | Manifestation | Item
> > }, they have Identifiers (our StandardType) on all four elements of
> > their four-level hierarchy.
> > - In our DM, where definitions are tight, I had implemented
> > StandardType (FIAF Identifiers) on the three levels of the hierarchy.
> > But our Concept is purely cerebral, not a realisation/manifestation,
> > which is our Project/Movie. So the question is, can there be a FIAF
> > Identifier/StandardType for a Concept ? I suspect it is for material
> > things only, for Movie & Edition, not for [our definition of] cerebral
> > things. But I am not sure, our Concept may well be a material thing
> > to them.
>
> But it is absolutely not material in our model. AFAIK, the various
> standard identifiers are for what we have called Movie or Edition
> (Instances of course come with their own codes, barcodes, serial numbers
> or whatever).

Excellent. Done.

> > - I have gone through the FIAF Manual again, and could not resolve
> > this point. Over to you.
>
> That's because what a "Work" is, is not clear. They sometimes say Work is
> "abstract", sometimes that it is "concrete". E.g., at p. 9, Work is
> labeled as an "abstract entity". Then, at p. 224 you find this pearl:
> «moving image "works" are more easily conceptualized as concrete
> entities».

Exactly.

Arts and humanities types have difficulty with definitions: they either treat definitions as fluid; or they keep redefining them to suit their nefarious purposes (subjective reality instead of objective).

For example, FIAF, on the model implied by their FIAF Manual, could not achieve the precision that we have, in our Concept vs ConceptTitle for Movie discussion above. Not even in their dreams.

> > - (Btw, that little exercise re-inforced our notion that our treatment
> > of Variant & Derivative is superior to theirs.)
>
> I agree.

The differences just keep adding up.

> > Another reason the DM is not complete is, I have not removed the
> > Nulls. Which absolutely have to be removed before I let a DM be
> > considered complete. It is one of those second-to-last steps.
>
> Have your marked optional attributes in some way? I have assumed that
> all attributes are mandatory.

Well data modelling is a progressive exercise. At the beginning we don't worry about things like that, somewhere in the middle, we start to worry, and long before the end we get them all perfect. So no, thus far (up to V0.13) I have not worried. Now we are entering the territory where I want the DM to be really useful, so yes, I will make that clear.

(In another thread we discussed hierarchies in a different sense: what has to be done at the physical level; why "deferred constraint checking" is hilariously absurd; etc. Eg. if handled improperly [or when the DM is not yet complete] the head of every hierarchy is a NULL. When handled properly that column is an optional table, and the NULL is eliminated. Again, INSERT/UPDATE/DELETE is true, and "deferred constraint checking" is not required.)

ConceptVariant and MovieVariant will be such tables, new in the next iteration.

From V0.14 onwards, all columns are mandatory. No Nulls. No ambiguities (which is why I wanted the full set of constraints).

Up to V0.13, it has been all columns are mandatory but the model is not complete because this and that has not been drawn. Eg. everyone on the planet knows that Derek would never leave a single NULL in the database, so that is taken for granted. But not drawn, will be drawn. Remember, we started with Movie Title Progression. We have come a long way. Good work.

> > As discussed, more Constraints, please. Especially difficult ones.
>
> I'd like to resolve the above first.
>
> > I can appreciate that you are busy, with the conditions in Northern
> > Italy making life very difficult. If you have any time to respond to
> > this thread, this subject is the one I would like you to prioritise.
> >
> > As I am walking through the DM, I am still finding little errors.
> > I was hoping that your walk-through, and reading off the Predicates as
> > you go, would have caught some of those.
>
> Things were moving fast here (locking down, etc.) and we had to adapt
> quickly, e.g., to giving remote classes. I have to focus mostly on that
> now. Besides, working from home bears its own issues.

I can only imagine. Here it is preparation, and dealing with all the cheaters (Chinese cheat at absolutely everything, and our universities have turned to prostitution, genuflecting to Chinese students who can't tie their shoelaces). No lockdown yet. Not even face masks, except for some Chinese.

----

This semester, I am busy on Wednesdays, not Tuesdays.

Cheers
Derek

Derek Ignatius Asirvadem

Mar 17, 2020, 12:00:25 PM
Nicola

I opened a new thread for MVCC.

Cheers
Derek

Derek Ignatius Asirvadem

Mar 18, 2020, 5:05:37 AM
> On Wednesday, 18 March 2020 02:54:48 UTC+11, Derek Ignatius Asirvadem wrote:

Further to that post. Changes and clarifications.

> > On Tuesday, 17 March 2020 05:24:07 UTC+11, Nicola wrote:
> > > On 2020-03-15, Derek Ignatius Asirvadem <derek.a...@gmail.com> wrote:
>
> > I am considering some (perhaps too naive) changes.
>
> > The biggest difference is
> > that I am putting Movie (now Moving Image) under Concept rather than
> > ConceptTitle.
>
> [...]
>
> - Then the Musical and Movie (more famous)
> ConceptVariant[ Lerner & Loewe, 1956, en, Original, My Fair Lady ]
> -- because you have only movies, books; plays; musicals; etc, which are necessary only for the addition of a Movie, have to be stored as a ConceptVariant
> Movie[ Lerner & Loewe, 1964, en, My Fair Lady, ] -- no Differentiator
> TitleType[ Original ], CountryCode_Produced[ us ], LanguageCode_Produced[ de ]

Of course that should be:
TitleType[ Original ], CountryCode_Produced[ us ], LanguageCode_Produced[ en ]

Further, that is just an example. If the variations were substantial, we could have used ConceptDerivative instead.

> > Relation ConceptTitle..MovieTitle
>
> Three items.

A couple of minor items re IDEF1X requirements. If we used a modelling tool, it would do all this for us.

FK
There is no "FK" or "(FK)" notation. Each FK column is bold only.

AK
The PK notation already shows the actual sequence of columns. The AK notation has to show the actual sequence of columns; it cannot be assumed to be in the order of appearance. Eg:
AK1.1
AK1.3
AK1.2.
If there is just one AK, I drop the "1.".

> > (1) Formerly called TitleReduced.
>
> Definitely better. Keeping with my Naming Convention: Title_Concept. The underscore means something, a qualifier. See other uses.

Upon implementing it, actually, no. It is already carefully thought out. There are two aspects, I will take them in turn.

1. The full Title is in the row. TitleReduced states clearly what it is, as differentiated from Title. There is no difference whether you take /Reduced/ off and add /Full/.
TitleReduced & Title
vs
Title & FullTitle

2. The addition of /Concept/ to Title looked good to begin with. In the Naming Convention I use, the rule for columns that form a Key, is that it must be explicit (in another discussion, ID was corrected to CountryID).

But we get into difficulty when we consider implementing that in ConceptTitle (with both Titles for each of those rows), and in Concept.ConceptTitle. Therefore I pared it back to Title. The same applies to Movie.Title.

I await your comments.

> (3) Acyclic.
>
> Yes. Enforced. By the Constraint.

Movie_VariantLineageIsValid_ck
CHECK -- PK NOT IN dbo.Movie_VariantLineage_fn( PK )

The method was discussed in the previous thread. The Function produces the Lineage, for any purpose. Here it is used to ensure that the INSERTED row is not in the Lineage, thus preventing a circular reference.
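For readers who want to see the shape of that check, here is a minimal sketch in Python rather than SQL. It is not the Function from the previous thread: variant_lineage, insert_variant and the movie keys are hypothetical names, and the single-parent dictionary merely stands in for the table that records which Movie is a Variant of which. It only illustrates the idea that a row is rejected when it already appears in the Lineage it is about to join.

```python
from typing import Dict, List


def variant_lineage(parent_of: Dict[str, str], key: str) -> List[str]:
    """Return the ancestors of `key`, nearest first, by following parent links."""
    lineage: List[str] = []
    seen = set()  # guards against looping forever if the data is already cyclic
    current = parent_of.get(key)
    while current is not None and current not in seen:
        lineage.append(current)
        seen.add(current)
        current = parent_of.get(current)
    return lineage


def insert_variant(parent_of: Dict[str, str], child: str, parent: str) -> None:
    """Record `child` as a Variant of `parent`, unless that would close a loop."""
    # The inserted row must not be its own parent, nor appear in the parent's Lineage.
    if child == parent or child in variant_lineage(parent_of, parent):
        raise ValueError(f"{child!r} is already in the lineage of {parent!r}")
    parent_of[child] = parent


# Hypothetical keys, for illustration only.
variants: Dict[str, str] = {}
insert_variant(variants, "DirectorsCut", "Theatrical")
insert_variant(variants, "FinalCut", "DirectorsCut")
print(variant_lineage(variants, "FinalCut"))
```

Trying insert_variant(variants, "Theatrical", "FinalCut") after that raises ValueError, because Theatrical is already an ancestor of FinalCut.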

> PDFs
> Please feel free to annotate my PDFs with notes and comments such as this. That may be quicker than creating a new diagram. My PDFs are not protected, Preview (and any Adobe viewer) has an AddNote facility. Toolbox.

I don't know if this point was a consideration for you, I will cover it just in case. Each country has a slightly different understanding of copyright (East Asia has no understanding at all). Americans have the most stringent, but it is not correct. If a document is published in the public domain, it is public, available to all, and therefore can be copied. If it has a Copyright notification, it means it can be copied, but only with the Copyright notice intact.

Feel free to annotate any PDF I send you. Except the final version of the DM because that is the finished one.

> > > Another reason the DM is not complete is, I have not removed the
> > > Nulls. Which absolutely have to be removed before I let a DM be
> > > considered complete. It is one of those second-to-last steps.
> >
> > Have your marked optional attributes in some way? I have assumed that
> > all attributes are mandatory.
>
> Well data modelling is a progressive exercise. At the beginning we don't worry about things like that, somewhere in the middle, we start to worry, and long before the end we get them all perfect. So no, thus far (up to V0.13) I have not worried. Now we are entering the territory where I want the DM to be really useful, so yes, I will make that clear.
>
> (In another thread we discussed hierarchies in a different sense: what has to be done at the physical level; why "deferred constraint checking" is hilariously absurd; etc. Eg. if handled improperly [or when the DM is not yet complete] the head of every hierarchy is a NULL. When handled properly that column is an optional table, and the NULL is eliminated. Again, INSERT/UPDATE/DELETE is true, and "deferred constraint checking" is not required.)
>
> ConceptVariant and MovieVariant will be such tables, new in the next iteration.

The removal of NULL by adding a table for the optional column (the Fact that the row is a Variant, and thus that the parent row is or is not), was discussed in the previous thread. That eliminates the need for treating the head of a hierarchy as a "special case" (code or allow zero or "deferred constraint checking").
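As an illustration only, the optional-table idea can be sketched with Python's sqlite3 module. The table and column names below (MovieKey, ParentMovieKey) are assumptions, and the single-column key is a stand-in for the real composite Keys in the DM. The point is that the head of a hierarchy simply has no row in the optional table, so no column is nullable and no special case is needed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
conn.executescript(
    """
    CREATE TABLE Movie (
        MovieKey TEXT NOT NULL PRIMARY KEY
    );
    -- Optional table: a row exists only when the Movie is a Variant of another Movie.
    CREATE TABLE MovieVariant (
        MovieKey       TEXT NOT NULL PRIMARY KEY REFERENCES Movie (MovieKey),
        ParentMovieKey TEXT NOT NULL REFERENCES Movie (MovieKey),
        CHECK (MovieKey <> ParentMovieKey)
    );
    """
)

# Hypothetical data: one head of a hierarchy and one Variant of it.
with conn:
    conn.executemany(
        "INSERT INTO Movie (MovieKey) VALUES (?)",
        [("Theatrical",), ("DirectorsCut",)],
    )
    conn.execute(
        "INSERT INTO MovieVariant (MovieKey, ParentMovieKey) VALUES (?, ?)",
        ("DirectorsCut", "Theatrical"),
    )

# Heads of hierarchies are the Movies with no MovieVariant row; no NULL is involved.
heads = [
    row[0]
    for row in conn.execute(
        "SELECT MovieKey FROM Movie "
        "WHERE MovieKey NOT IN (SELECT MovieKey FROM MovieVariant) "
        "ORDER BY MovieKey"
    )
]
print(heads)
```

Each INSERT is checked immediately against the declared constraints, which is the behaviour the paragraph above relies on.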

> > > As I am walking through the DM, I am still finding little errors.
> > > I was hoping that your walk-through, and reading off the Predicates as
> > > you go, would have caught some of those.

The problem is of course that my situation is abnormal. Normally I use a modelling tool, ERwin, and then erect OmniGraffle diagrams only when pretty ones are required. But I have a new Mac, and Windows & ERwin are not installed yet. So it is OG from start to finish. Pretty, but with the risk of typos.

Cheers
Derek

Derek Ignatius Asirvadem

Mar 18, 2020, 7:32:23 AM
> On Tuesday, 17 March 2020 05:24:07 UTC+11, Nicola wrote:
> > On 2020-03-15, Derek Ignatius Asirvadem <derek.a...@gmail.com> wrote:
>
> That said, what are you proposing? That (Concept|Movie)Titles should
> incorporate CountryCode? I would go a different route (see below).
>
> I am considering some (perhaps too naive) changes. Please take a look at
> this:
>
> https://send.firefox.com/download/b2c409b72171dddf/#x_7fCElwXgWmgV2-MVWETg

In my previous two posts, I addressed what I thought you were trying to do, and thus my comments were within that scope. On further study of your DM, it appears that you are trying to do much more, trying to communicate much more, than I first thought. But that means I have to take the definitions as given, that there are no mistakes. (The AK notation issue is too small to consider as a mistake that warrants a discussion here.)

In that case, rather than argue the small points, I have to ask, in your DM, have you communicated the changes that you want, without mistakes ? Assuming the answer is yes ...

Concept
is ok (discussed, awaiting your response)

ConceptTitle
I do not understand what or why you have changed the Key columns. Don't think of ConceptTitle as a list of Titles for a Concept, think of it as a 1::0-n Fact about a Concept, regarding its Titles. And then, how we are going to constrain those Facts. Speaking from V0.13:
-- AK --
As is typical, when the PK is different from the parent PK plus a differentiator, the AK (a) preserves the relation from Concept PK, plus (b) the differentiator. Which in this case is TitleType & LanguageCode, meaning the ConceptTitle is allowed a max of 1 LanguageCode in each TitleType. It prevents silly and duplicate ConceptTitles, it forces the curator to identify what the Concept is known as in each Language. Once.
-- PK --
Is what the ConceptTitle really is, keeping in mind what we want re the migration to child tables. Which in this case is NOT the Concept PK (because it contains a TitleReduced) and we want a different TitleReduced. Which may be in a different LanguageCode. And again max 1 in each TitleType.

So, given that explanation, I do not understand what you are trying to do by:
a. removing TitleType from the PK ... that would allow more than 1 TitleType per LanguageCode
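To make the AK point concrete, here is a small sketch using Python's sqlite3 module. The table is deliberately cut down and is not the V0.13 definition: ConceptKey is a hypothetical single column standing in for the Concept PK, the real PK of ConceptTitle is left out, and only the AK described above is declared, as a UNIQUE constraint. The helper add_title is likewise hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE ConceptTitle (
        ConceptKey   TEXT NOT NULL,  -- assumption: stands in for the Concept PK
        TitleType    TEXT NOT NULL,
        LanguageCode TEXT NOT NULL,
        Title        TEXT NOT NULL,
        -- The AK: at most one Title per Concept, TitleType and LanguageCode.
        UNIQUE (ConceptKey, TitleType, LanguageCode)
    )
    """
)


def add_title(concept_key: str, title_type: str, language_code: str, title: str) -> bool:
    """Insert one ConceptTitle row; return False if the AK rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO ConceptTitle VALUES (?, ?, ?, ?)",
                (concept_key, title_type, language_code, title),
            )
        return True
    except sqlite3.IntegrityError:
        return False


# Hypothetical rows, for illustration only.
print(add_title("MyFairLady", "Original", "en", "My Fair Lady"))
print(add_title("MyFairLady", "Original", "en", "My Fair Lady (again)"))
print(add_title("MyFairLady", "Original", "it", "My Fair Lady"))
```

The second call is refused because the Concept already has an Original title in en; the third succeeds because the LanguageCode differs.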

Movie, MovieTitle
Most items discussed in the previous post.

New item. Separate to whether it is a child of Concept or of ConceptTitle. Given the Key has Differentiator, why do we additionally need ProductionDate in the Key? As per previous discussions, ProductionDate pertains to the Project; the Movie production, so there should be just one. Not to be confused with EditionDate which is the date of each Edition, of which there is one per Edition, many per MovieTitle.

Again, apologies, I should have understood your intent better, and included all this in the previous post.

Side-by-side comparison, in case it helps
http://www.softwaregems.com.au/Documents/Article/Database/Movie%20Title/Concept-Project-Edition%20Response.pdf

Cheers
Derek

Derek Ignatius Asirvadem

Mar 18, 2020, 8:30:08 AM
Nicola

=======================
This post introduces V0.14
=======================

It includes all items that have been discussed and resolved. Start with the view from 10,000 metres.

Table Relation
http://www.softwaregems.com.au/Documents/Article/Database/Movie%20Title/Movie%20Title%20TR%20V0_14.pdf

Table Attribute
http://www.softwaregems.com.au/Documents/Article/Database/Movie%20Title/Movie%20Title%20TA%20V0_14.pdf

Consolidated PDF in A3
http://www.softwaregems.com.au/Documents/Article/Database/Movie%20Title/Movie%20Title%20TA%20V0_14%20A3.pdf

- Only because you never ask. Probably too premature because the db is not quite stable yet.
- All collapsed items are clickable
- The understanding of Data Hierarchies is crucial to (a) the understanding, and (b) the use of the database. Page 2 is the first instalment of that (I give customers more detail of course)
- Split into SubjectAreas, which are the main hierarchies

Cheers
Derek

Derek Ignatius Asirvadem

Mar 21, 2020, 7:59:31 AM
> On Wednesday, 18 March 2020 23:30:08 UTC+11, Derek Ignatius Asirvadem wrote:
>
> =======================
> This post introduces V0.14
> =======================

Updated just now

Cheers
Derek
