
some information about anchor modeling


vldm10

Jul 14, 2012, 3:38:13 PM
I began the thread, “The Original Version” on May 26th of 2010 on this
user group. In this thread, the following three facts were portrayed
about the paper “Anchor Modeling – An agile modeling technique using
the sixth normal form for structurally and temporally evolving data”:

(i) that the paper is inaccurate
(ii) that there are parts of the paper “Anchor Modeling” (AM) that are
of significance, but that these parts are actually only specific cases
of the ideas from my paper published in 2005 on my website
http://www.dbdesign10.com and on this user group. My paper was
published four years before the AM paper and was intensely discussed
within this user group.
(iii) that the paper is about the most important things in database
theory

At the end of the mentioned thread I wrote the following:
“Given the significance of the results of my paper, published four
(five) years before "Anchor Modeling" on this user group, I have
decided to submit complaints to Springer and Data & Knowledge
Engineering publishers - the publishers of the first and second
versions of "Anchor Modeling". When I find out the results of these
complaints I will post them on this user group.”

On May 20th, 2012 I sent the following email to the person responsible
for my case, at Springer Publishing Company:

“Dear Ms Anna Kramer,

I am writing to inquire about the status of a complaint of plagiarism
I filed over a year ago against the paper with the title “An Agile
Modeling Technique using the Sixth Normal Form for Structurally and
Temporally Evolving Data” by: Olle Regardt, Lars Rönnbäck, Maria
Bergholtz, Paul Johannesson, Petia Wohed.
This paper was published by Springer.

I would appreciate it if you could notify me of the status of this
complaint, as it has been over a year since it has been filed and I
have not yet received a response.

Sincerely,
Vladimir Odrljin ”

The next day I received the following answer:

“Dear Vladimir Odrljin,

Having contacted all parties involved, we decided against retracting
this paper. The claims of plagiarism were not strong enough to warrant
taking this step.

Best regards,

Anna Kramer “
--

With regard to this response, I would like to express my opinion. In
spite of the conclusions of the publishing house Springer, I hold that
the paper Anchor Modeling is a plagiarism of my paper published on the
webpage http://www.dbdesign10.com.

(i)
“According to U.S. law, the expression of original ideas is considered
intellectual property, and is protected by copyright laws, just like
original inventions. Almost all forms of expression fall under
copyright protection as long as they are recorded in some way (such as
a book or a computer file)”
(This description of U.S. law is quoted from the web page http://plagiarism.org.
There are more applicable clauses on that web page than I have cited here.)

(ii)
The following basic ideas from my paper are used by the authors of
Anchor Modeling as their own:

a) My database solution introduces a new idea that enables the modeling and maintaining of knowledge about the states of objects (and relationships) and the changing of that knowledge about entities (and relationships). This enables database solutions of a general character. Existing database theory, in contrast, deals with simple and static databases.

b) My database solution introduces and enables the construction and maintenance of a “history of changes”. My work gives the first complete solution of the “history” problem.

c) My database solution introduces and determines the construction of binary structures which result from the decomposition of database structures. I want to note that E. Codd unsuccessfully attempted to achieve binary decomposition in his paper on RM/T. Perhaps we could say that the entire database movement dealing with normal forms converges towards binary structures. There have also been numerous unsuccessful patents related to the binary decomposition of database structures.

d) My solution introduces only one operation on data: the addition of new data to the database. There is no deleting or updating of data in the database. This solution therefore controls redundancy and has no add, delete or update anomalies. My design calls for everything that is done in a database to be saved, forever.

e) Note that there are some existing theories about changes, but all of them use undefined terms like “the world”, “the situation of the world”, “the state of the world”, “states of affairs”, etc. In contrast to this, my solution models only changes of entities and relationships, which are terms that are defined. Secondly, in my papers the following terms are introduced for the first time: identifiers of states of entities, states of relationships, identifiers of states of relationships, history of changes of states of entities, and history of changes of states of relationships.

f) An important part of solving “temporal”, “historical” and other complex databases consists of the following three sub-steps:
1. Constructing an identifier of an entity or relationship.
2. Connecting all the changes of states of one entity (or relationship) to the identifier of this entity (or relationship). The identifier of the entity therefore always remains unchanged; excluding the identifier, all other attributes of the entity can change states.
3. Constructing the times that correspond to the “creation event” and the “closing event”.

I named these three sub-steps “procedure (a)” in my thread “The Original Version”. In that thread I wrote in more detail about the importance of procedure (a) and its completely new approach to database theory.

Regarding the above case (f), we can note that “Anchor Modeling” uses a surrogate key as the identifier of an entity. A surrogate key is a special sub-case of my solution. In general, it is a very bad database solution; it can be a solution only in a limited number of cases, which are themselves sub-cases of my solution.

Regarding the above case (f), we can also note that “Anchor Modeling” uses time in a way which is a sub-case of my solution.
Let me give an example. My solution can support a situation in which three people enter into the database three different colors of a single car at the same time. They do this because they have different opinions about the real car’s attribute, i.e. the car’s color.
My solution can support and store these three sets of information (opinions). “Anchor Modeling” cannot support this case at all; this kind of event is impossible in AM. (A minimal sketch of recording such concurrent opinions is given after item g) below.)

In fact, with regard to time, my solution is much more general than the solution in “Anchor Modeling”. My solution is event-oriented, and I use only two kinds of events. These two events determine time in a general sense: they can support a case in which time is continuous, and they can also support a case in which time is discrete. “Anchor Modeling” supports only one case.

g) The data structures of “Anchor Modeling” (historized attributes, historized ties, …) are sub-cases of the binary structures from my paper.
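
To make points d) and f) and the car-color example above more concrete, here is a minimal Python sketch of an append-only store of entity states. It is my own illustration, not code from either paper; all names (EntityState, HistoryStore, entity_id, and so on) are invented for the example.

from dataclasses import dataclass
from datetime import datetime
from typing import Optional

@dataclass
class EntityState:
    entity_id: str                         # unchanging identifier of the entity (sub-step 1)
    state_id: int                          # identifier of this particular state
    attribute: str                         # which property this state is about
    value: str                             # the value asserted for that property
    recorded_by: str                       # who asserted it
    created_at: datetime                   # "creation event" (sub-step 3)
    closed_at: Optional[datetime] = None   # "closing event"; None while the state is current

class HistoryStore:
    """Append-only store: states are only ever added or closed, never deleted or updated."""
    def __init__(self):
        self._states = []
        self._next_state_id = 1

    def create(self, entity_id, attribute, value, recorded_by, when):
        s = EntityState(entity_id, self._next_state_id, attribute, value, recorded_by, when)
        self._next_state_id += 1
        self._states.append(s)             # sub-step 2: every state is tied to entity_id
        return s

    def close(self, state_id, when):
        # "closing" only marks the end of a state's validity; nothing is removed
        for s in self._states:
            if s.state_id == state_id and s.closed_at is None:
                s.closed_at = when

    def history(self, entity_id):
        return [s for s in self._states if s.entity_id == entity_id]

store = HistoryStore()
t = datetime(2012, 7, 14, 12, 0)
# three people record three different opinions about the same car's colour at the same time
store.create("VIN-1", "color", "red",   "alice", t)
store.create("VIN-1", "color", "blue",  "bob",   t)
store.create("VIN-1", "color", "green", "carol", t)
assert len(store.history("VIN-1")) == 3    # all three opinions are kept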

(iii)
I explained my reasons for filing the complaint to Springer in 27 arguments. Each argument is brief and precisely explained. As one can see from their response pasted above, I received only a generalized response from Springer; that is, they did not respond to any of my arguments specifically. Had Springer shown even one of these 27 arguments to be false, I would have accepted this as a legitimate rejection of my complaint.

I also think my work solves certain significant issues pertaining to
other scientific fields, including Logic, Semantics and Mathematics
(Model Theory, for instance). I defined certain truth conditions; see
my paper, “Semantic Databases and Semantic Machines”, Sections 3.1.2
and 3.2.1 on http://www.dbdesign11.com.
Thus, the issue in question is the plagiarism of a work that is of
importance for a group of scientific fields.

Working on this case, I realized that the participants in user groups
have limited opportunities to protect their ideas. I believe members
of the user group have the right to give and take information when it
is related to the activities of the group.

Vladimir Odrljin

vldm10

Jul 15, 2012, 7:41:04 AM
In my thread “The Original Version”, posted on this user group, I pointed out the errors in “Anchor Modeling” and pointed to the correct solutions to these errors in my paper.
The authors of “Anchor Modeling” then fixed some of these mistakes, using important constructions which are presented in my paper. They sent me a private email, but I responded to them, on this user group, that I only give my professional opinions about “Anchor Modeling” publicly on this user group (not privately).
The new version of “Anchor Modeling” was published in the Data & Knowledge Engineering journal. In April 2011, I sent my claim of plagiarism to Dr. Peter Chen, the Editor-in-Chief of the mentioned journal. He replied that he would respond to me after an investigation. I recently sent Dr. P. Chen an email inquiring about when his response would come, and have still received no reply.

Vladimir Odrljin

vldm10

Jul 18, 2012, 10:20:11 AM
Why Is the Surrogate Key a Bad Solution?

First, we can note that a few database theories use the surrogate key as a basic construct. I will mention the Object-Oriented Approach (abbreviated OOA), Codd’s RM/T and “Anchor Modeling”. Secondly, we can note that OOA and Codd’s RM/T do not apply any of the points (ii) a, b, c, d, e, f that I mentioned in my first message in this thread from 14 July, and this is very bad. OOA uses states of an entity but does not use states of relationships. Note that OOA does not maintain states and histories of states.
The concept of object identity in OOA is mostly about the existence of an object independently of its value; this concept is less about states. A key in the Relational Model is on the level of a relation, while a key in OOA is on the level of a database. In OOA every object has a system-supplied identity, which is a surrogate key.

Let me give some examples which show that the surrogate key is a bad
solution.

1. Two distinct objects can have the same state. Here, the problem is to find the two corresponding real-world objects. But if someone works, for example, with the VIN or ISBN, then they will not have this problem.
This case shows that the surrogate key is very convenient for hackers, insiders, wrongly entered data, etc.
Note also that in OOA two distinct objects can have the same internal state.

2. Today, more than 90% of databases have an identifier which is a
property of the corresponding entity. The VIN – Vehicle Identification
Number is a good example. For these databases the following is true:
a) This identifier is much better than the surrogate key.
b) These databases do not have any reason to use a surrogate key.

3. On page 183 of his latest book, “Database Design & Relational Theory – Normal Forms & All That Jazz”, C.J. Date gives us Example 6. I think that this example contains some misunderstandings and mistakes, but I will use a more general version of it to show that the surrogate key is a bad solution. (In this example C. Date writes about redundancy.)

Let R be a relvar which has the surrogate key K and three properties A, B, C. Now, according to C.J. Date’s recommendation, we will apply the “RM/T discipline”. Here, Date refers to Codd’s paper on RM/T. This is the first mistake. Codd’s main goal in RM/T was to get a decomposition of a relvar into binary relvars. He was unsuccessful at this and was not able to show how it is done. So there is the question of how C. Date got binary relations in this example.

However, let us suppose that it is “somehow” possible to decompose the above relvar into binary relvars using the “RM/T discipline”. Let the following be one possible situation:

K       K  A        K  B        K  C
---     -------     -------     -------
k1      k1  a1      k1  b1      k1  c3
k2                              k3  c3
k3                              k4  c3
k4
k5
k6

The above decomposition is very bad. For instance, there is the question: how will a user find the real-world entity that has the attribute C = c3 and the surrogate key K = k4? Note that a surrogate key exists only in the database; it does not exist in the real world.
So, my point here is that the surrogate key makes this table so bad that it is not an acceptable design.
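
As a rough illustration of the problem just described (my own sketch, not taken from Date's book), the decomposed relvars can be modeled as plain Python dictionaries; nothing in them connects the surrogate k4 to any real-world entity:

# The anchor holds only surrogates; each property gets its own binary relation.
K   = ["k1", "k2", "k3", "k4", "k5", "k6"]          # anchor: surrogate keys only
K_A = {"k1": "a1"}                                   # binary relation for property A
K_B = {"k1": "b1"}                                   # binary relation for property B
K_C = {"k1": "c3", "k3": "c3", "k4": "c3"}           # binary relation for property C

# The user can find every surrogate whose C-value is c3 ...
matches = [k for k, c in K_C.items() if c == "c3"]   # ['k1', 'k3', 'k4']

# ... but for k4 there is no A-value, no B-value, and the surrogate itself has no
# meaning outside the database, so nothing here points to the real-world entity.
print(matches, K_A.get("k4"), K_B.get("k4"))         # ['k1', 'k3', 'k4'] None None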

4. In RM/T, section 4, E. Codd introduced cases (1), (2), (3). If multiple databases use one and the same entity, and if every one of these databases uses a surrogate key for that entity, then the mentioned three cases are possible. If one does not know how to maintain a history of events and does not know how to decompose a relvar into binary relvars, then maybe one can try Codd's suggestion in section 4: "Database users may cause the system to generate or delete a surrogate, but they have no control over its value, nor is its value ever displayed to them.”
Note that the surrogate key in these cases is also a real problem. Regarding surrogates, E. Codd wrote: "The capability of making equi-joins on surrogates implies that users see the headings of such columns but not the specific values in those columns." (see section 4).

Obviously, this unusual kind of design, based on surrogates, in fact has a lot of disadvantages.

Vladimir Odrljin

Eric

Jul 18, 2012, 2:21:10 PM
On 2012-07-18, vldm10 <vld...@yahoo.com> wrote:
> ...
> 2. Today, more than 90% of databases have an identifier which is a
> property of the corresponding entity. The VIN – Vehicle Identification
> Number is a good example. For these databases the following is true:
> a) This identifier is much better than the surrogate key.
> b) These databases do not have any reason to use a surrogate key.
> ...

The VIN (and most other such things) is nothing more or less than
Somebody Else's Surrogate Key. It is not an intrinsic property of the
object it refers to.

Eric
--
ms fnd in a lbry

vldm10

Jul 18, 2012, 8:48:02 PM
The VIN exists in a database and in the real world.

Each surrogate key exists only in a database.
This implies (for example) that the VIN and a surrogate key have very different semantics.

The VIN is an intrinsic property.
A property is an intrinsic property of an entity if we can say: “yes, this entity has this property”. Each attribute, i.e. a particular value of a property, we determine by applying identification.
For example, an address is not an intrinsic property of a person.

Note also that most entities are made by man (cars, buildings, credit cards, goods, invoices, etc.).

Vladimir Odrljin

Eric

Jul 19, 2012, 3:53:06 PM
On 2012-07-19, vldm10 <vld...@yahoo.com> wrote:
> On Jul 18, 8:21 pm, Eric <e...@deptj.eu> wrote:
>> On 2012-07-18, vldm10 <vld...@yahoo.com> wrote:
>>
>> > ...
>> > 2. Today, more than 90% of databases have an identifier which is a
>> > property of the corresponding entity. The VIN – Vehicle Identification
>> > Number is a good example. For these databases the following is true:
>> > a) This identifier is much better than the surrogate key.
>> > b) These databases do not have any reason to use a surrogate key.
>> > ...
>>
>> The VIN (and most other such things) is nothing more or less than
>> Somebody Else's Surrogate Key. It is not an intrinsic property of the
>> object it refers to.
>>
>> Eric
>> --
>> ms fnd in a lbry
>
>
> The VIN exists in a database and in the real world.
>
> Each surrogate key only exists in a database.
> This imply(for example) that the VIN and a surrogate key have very
> different semantics.

Suppose I am designing a database and I have decided (rightly or wrongly,
but I _have_ decided) to use a surrogate key for a table whose rows are
about some real-world object. We have to communicate with other people
about these real-world objects, so we suggest it would be easier for
all concerned if everyone used the same arbitrary numbers (i.e. our
surrogate key) to refer to each particular real-world object. In fact,
since we are a government department, we can tell them that they have to
use it. Oh, we seem to have just invented something very like a VIN!

> The VIN is an intrinsic property.

No it is not. It is not even an acceptable candidate key. If you think
that it is, consider a database about car crime and insurance fraud.

> A property is an intrinsic property of an entity if we can say: "yes,
> this entity has this property".

You forgot "and the property is permanent".

> Each attribute i.e. a particular value
> of a property, we determine by applying identification.

> For example, an adrress is not an intrinsic property of a person.

No, an address is not an intrinsic property of a person. It is not even
an intrinsic property of a building.

> Note also that most entities are made by man.( Cars, buildings, credit
> cards, goods, invoices, etc )

Are you trying to say that whoever makes something can assign it an
identifier? Of course they can, but that doesn't necessarily make it
intrinsic; painting or tattooing a number on something does not make
that number an intrinsic property of what would still be the same thing
without the number.

David Portas

Jul 19, 2012, 5:02:30 PM
"Eric" wrote in message news:slrnk0dvk...@teckel.deptj.eu...

> The VIN (and most other such things) is nothing more or less than
> Somebody Else's Surrogate Key. It is not an intrinsic property of the
> object it refers to.

I have very little idea what you mean by "intrinsic" here. All data in
database systems are symbols assigned to things by people and/or machines.
Who does the assigning is not necessarily important. What matters is what
propositions the database user wishes to assert by using those symbols.

Different sources may disagree somewhat about what is a "surrogate" and what
isn't. Codd's RM/T defines them as hidden attributes not exposed to database
users and therefore not used to identify things in the external UoD. It
follows that other (non-surrogate) keys are required to be used as external
identifiers. This I think is the only sound and useful way in which a
surrogate is distinguishable from a key that isn't a surrogate. Quite
simply: either a key is being used to identify a correspondence between
database propositions and facts in the external UoD (a domain key) or it is
not (surrogate key).

> > The VIN is an intrinsic property.

> No it is not. It is not even an acceptable candidate key. If you think
> that it is, consider a database about car crime and insurance fraud.

Surely a VIN is an "acceptable" candidate key if you desire to ensure it is
unique in a relation and you create a database constraint to guarantee that
it is. That is all. The acceptability or otherwise of making such a
constraint depends on what business rules you wish to enforce. If you are an
insurer then I suggest it *could* be perfectly reasonable to enforce such a
constraint on a Policy relation - to provide some degree of fraud
prevention.

> > A property is an intrinsic property of an entity if we can say: "yes,
> > this entity has this property".

> You forgot "and the property is permanent".

Permanence is obviously not a requirement of candidate keys however. Again,
I fail to see why the notion of something being "intrinsic" is interesting
or relevant in a discussion about keys.

David

George Neuner

Jul 19, 2012, 12:27:07 PM
On Sat, 14 Jul 2012 12:38:13 -0700 (PDT), vldm10 <vld...@yahoo.com>
wrote:

>(ii) that there are parts of the paper "Anchor Modeling" (AM) that are
>of significance, but that these parts are actually only specific cases
>of the ideas from my paper published in 2005 on my website
>http://www.dbdesign10.com and on this user group. My paper was
>published four years before the AM paper and was intensely discussed
>within this user group.
>
> :
>
>(iii)
>I explained my reasons for filing the complaint to Springer in 27
>arguments. Each argument is brief and precisely explained. As one can
>see from their above-pasted response, I received a generalized
>response from Springer, that is, they did not respond to any of my
>arguments specifically. Had Springer shown one of these 27 arguments
>to be false, I would have accepted this as a legitimate rejection of
>my complaint.
>
>I also think my work solves certain significant issues pertaining to
>other scientific fields, including Logic, Semantics and Mathematics
>(Model Theory, for instance). I defined certain truth conditions, see
>my paper, “Semantic Databases and Semantic Machines”, Sections 3.1.2
>and 3.2.1 on http://www.dbdesign11.com
>Thus, the issue in question is the plagiarism of a work that is of
>importance for a group of scientific fields.
>
>Working on this case, I realized that the participants in user groups
>have limited opportunities to protect their ideas. I believe members
>of the user group have the right to give and take information when it
>is related to the activities of the group.
>
>Vladimir Odrljin

I'm not familiar with the publication involved, but unless it is a
text book or a refereed journal, few publishers will respond to
complaints concerning accuracy of content.

If you are seeking credit for your ideas, IANAL, so be sure to consult
with a qualified IP attorney (be sure to go to an IP specialist, *not*
a general practitioner). However, you need to be aware that under US
law, only rights pertaining to *registered* copyrights can be legally
enforced ... the implicit copyright granted by virtue of authorship
under the Berne convention carries absolutely no legal weight.

You may have some limited options if you can prove that, in every case
where the material appeared online, an explicit copyright notice was
included (even if the copyright was unregistered), and also can prove
that the ideas of the paper in question really are based on your work
(i.e. that the paper is a "derivative" work). But even so the odds of
a satisfactory outcome are not terribly good ... court decisions in
cases involving unregistered copyrights are entirely ad hoc. If you
somehow could make it a "prior art" patent case it would have
(slightly) more chance of success.

Unfortunately, online "publishing" by individuals in most cases really
is not considered publishing under the law. Too many authors are
unaware of this and of the consequences of putting their ideas online.
In many cases all the authors have accomplished is to add their ideas
to the public domain.

Good luck.
George

Erwin

Jul 21, 2012, 6:29:33 PM
> I began the thread, “The Original Version” on May 26th of 2010 on this
> user group. In this thread, the following three facts were portrayed
> about the paper “Anchor Modeling – An agile modeling technique using
> the sixth normal form for structurally and temporally evolving data”:
>
> a) My database solution introduces a new idea that enables the
> modeling and maintaining
> of knowledge about the states of objects (and relationships) and
> changing of the
> knowledge about entities (and relationships). This enables
> solutions of databases of a
> general character . The existing database theory, in contrast,
> deals with simple and
> static databases.
>
> b) My database solution introduces and enables the construction and
> maintenance of a
> “history of changes”. My work gives the first complete solution of
> the “history”
> problem.

Are you familiar with the work by one Nikos Lorentzos ? I have a book in my library that was published 2003, summarizing his approach.

And since you claim that your solution is "complete", I reckon you also know what to do about and how to deal with, say, cyclic point types ?



> d) My solution introduces only one operation with data, and that is
> the addition of new
> data to the database. There is no deleting or updating of data in
> the database. This
> solution, therefore, controls redundancy

Wait a sec. Just because you allow only additions in your databases, implies that it is impossible to have redundancy in your databases ?



> e) Note that there are some existing theories about changes, but that
> all of them use
> undefined terms like “the world”, “the situation of the world”,
> “the state of the world”,
> “states of affairs”, etc.

"Closed WORLD assumption", anybody ?



> In contrast to this, my solution models
> only changes of entities
> and relationships, which are terms that are defined.

No kidding. Are they ?



vldm10

Jul 21, 2012, 7:59:15 PM
On Jul 19, 9:53 pm, Eric <e...@deptj.eu> wrote:
> Suppose I am designing a database and I have decided (rightly or wrongly,
> but I _have_ decided) to use a surrogate key for a table whose rows are
> about some real-world object. We have to communicate with other people
> about these real-world objects, so we suggest it would be easier for
> all concerned if everyone used the same arbitrary numbers (i.e. our
> surrogate key) to refer to each particular real-world object. In fact,
> since we are a government department, we can tell them that they have to
> use it. Oh, we seem to have just invented something very like a VIN!

Note that if one uses the name of an identifier of an entity in a
communication, then it does not mean that this one can identify this
entity in the real world.

Imagine that a Honda dealer has 2000 identical new Hondas, and that
neither of them has a VIN number. In this situation, a database
application that uses surrogate keys will not work at all. All the
Hondas will have the same attributes and the unique surrogates. This
database will be a total confusion and a database disaster. This is a
very clear example that surrogates are bad solution.
However, if I apply a database based on VIN numbers, then everything
will be ok, and I do not need surrogate keys at all.

Note that many products have all the same intrinsic attributes. For
this case, I introduced the law which is a generalization of Leibniz’s
Law. (see my paper, Semantic Databases and Semantic Machines, section
5.6 at http://www.dbdesign11.com )


> > The VIN is an intrinsic property.
>
> No it is not. It is not even an acceptable candidate key. If you think
> that it is, consider a database about car crime and insurance fraud.


The VIN is an intrinsic property, you can see the surrogate on each
car. The VIN is based on international standards. For example in US it
is used by important institutions to identify individual motor
vehicle.


> Are you trying to say that whoever makes something can assign it an
> identifier? Of course they can, but that doesn't necessarily make it
> intrinsic; painting or tattooing a number on something does not make
> that number an intrinsic property of what would still be the same thing
> without the number.


If you are interested in intrinsic properties and the identification
of attributes, then you can see my paper "Database design and data
model founded on concept and knowledge constructs" at http://www.dbdesign11.com
. In section 2, I introduce intrinsic properties and in section 3.3
identification of attribute is defined in (3.3.3).

Vladimir Odrljin

Eric

Jul 22, 2012, 12:25:08 PM
On 2012-07-21, vldm10 <vld...@yahoo.com> wrote:
> On Jul 19, 9:53 pm, Eric <e...@deptj.eu> wrote:
>> Suppose I am designing a database and I have decided (rightly or wrongly,
>> but I _have_ decided) to use a surrogate key for a table whose rows are
>> about some real-world object. We have to communicate with other people
>> about these real-world objects, so we suggest it would be easier for
>> all concerned if everyone used the same arbitrary numbers (i.e. our
>> surrogate key) to refer to each particular real-world object. In fact,
>> since we are a government department, we can tell them that they have to
>> use it. Oh, we seem to have just invented something very like a VIN!
>
> Note that if one uses the name of an identifier of an entity in a
> communication, then it does not mean that this one can identify this
> entity in the real world.

Not the name, the value!!! But no, it does not guarantee the ability to
identify the real world entity, only that everyone is (apparently)
talking about the same entity.

> Imagine that a Honda dealer has 2000 identical new Hondas, and that
> neither of them has a VIN number. In this situation, a database
> application that uses surrogate keys will not work at all. All the
> Hondas will have the same attributes and the unique surrogates.

No serious objection to that statement (except for English usage -
"none" rather than "neither").

> This database will be a total confusion and a database disaster.
> This is a very clear example that surrogates are bad solution.

Only if you expect to use them directly to find the real-world entity.

> However, if I apply a database based on VIN numbers, then everything
> will be ok, and I do not need surrogate keys at all.

Unless the database contains "illegal" vehicles and is expected to do
so.

> Note that many products have all the same intrinsic attributes. For
> this case, I introduced the law which is a generalization of Leibniz’s
> Law. (see my paper, Semantic Databases and Semantic Machines, section
> 5.6 at http://www.dbdesign11.com )

To some level of achievable accuracy. Which is why a database will only
count the "identical" entities (usually those in some definable
location).

>> > The VIN is an intrinsic property.
>>
>> No it is not. It is not even an acceptable candidate key. If you think
>> that it is, consider a database about car crime and insurance fraud.

> The VIN is an intrinsic property, you can see the surrogate on each
> car. The VIN is based on international standards. For example in US it
> is used by important institutions to identify individual motor
> vehicle.

A high-value vehicle stolen to order will have its identity changed,
including the VIN. Of course the original VIN still technically refers
to it, but how can you find out? The substitute VIN may or may not be
a duplicate of some other. A cut-and-shut job made out of parts of two
vehicles may inherit the VIN of one of them, or not, but this VIN is no
longer valid. Are the rules for kit-cars and extensive modifications by
legal owners adequate and adequately enforced?

So no, in general a VIN is not intrinsic, and therefore not a very good
example to choose.

>> Are you trying to say that whoever makes something can assign it an
>> identifier? Of course they can, but that doesn't necessarily make it
>> intrinsic; painting or tattooing a number on something does not make
>> that number an intrinsic property of what would still be the same thing
>> without the number.

> If you are interested in intrinsic properties and the identification
> of attributes, then you can see my paper "Database design and
> data model founded on concept and knowledge constructs" at
> http://www.dbdesign11.com . In section 2, I introduce intrinsic
> properties and in section 3.3 identification of attribute is defined in
> (3.3.3).

I have not read this yet, I may do so at some point, but for the moment
I do not see that it could make any difference to my main point:

Any identifier assigned to a natural or man-made entity (by anyone
whatsoever) does not become an intrinsic property of the entity no matter
how many times or where it is recorded on the entity. Either there will
be a way for the entity to continue as the same entity if it is removed
or altered, or removing it will destroy the entity in which case the
identifier refers only to an entity that no longer exists. All such
identifiers are merely the assigner's surrogate key that they choose to
disseminate in some way.

vldm10

Jul 22, 2012, 7:15:54 PM

> Note that if one uses the name of an identifier of an entity in a
> communication, then it does not mean that this one can identify this
> entity in the real world.
Here I made a mistake. Instead of "an identifier of an entity" it should say "a surrogate key of an entity".




> The VIN is an intrinsic property, you can see the surrogate on each
> car. The VIN is based on international standards. For example in US it
> is used by important institutions to identify individual motor
> vehicle.
Instead of "you can see the surrogate on each car." should be "you can see the VIN on each car."


I used an old version of the user group interface, and I noticed that the formatting is not well represented in the new version.
My apologies for these errors.

vldm10

Jul 23, 2012, 9:23:30 PM

> > b) My database solution introduces and enables the construction and
> > maintenance of a “history of changes”. My work gives the first complete
> > solution of the “history” problem.
>
> Are you familiar with the work by one Nikos Lorentzos ? I have a book in my
> library that was published 2003, summarizing his approach.
>
> And since you claim that your solution is "complete", I reckon you also know
> what to do about and how to deal with, say, cyclic point types ?


If you refer to the book “Temporal Data & the Relational Model” by C. J. Date, Hugh Darwen, and Nikos Lorentzos, then you are wrong, because this book is about temporal data rather than about the history of data.


> > d) My solution introduces only one operation with data, and that is
> > the addition of new data to the database. There is no deleting or
> > updating of data in the database. This solution, therefore, controls
> > redundancy
>
> Wait a sec. Just because you allow only additions in your databases, implies
> that it is impossible to have redundancy in your databases ?


I wrote “controls redundancy”. In short, by “controls redundancy” I mean the following:
My solution does not update and delete data. So it does not have update and delete anomalies (usually caused by redundancy). The intention of this solution is to save all that is entered into the database. My solution uses the binary structures and only two events. Therefore, it is possible to set powerful constraints on the level of data entry. The possible events are: “create new data” and “close the existing data”.
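
As a rough sketch of how such entry-level constraints might look when only two events exist (my own code and names, not taken from the paper):

from datetime import datetime

# Append-only log: each entry keeps its value forever; it can only be created or closed.
log = {}

def apply_event(event, data_id, value=None, when=None):
    """Accept only the two events 'create' and 'close'; reject everything else."""
    when = when or datetime.now()
    if event == "create":
        if data_id in log:
            raise ValueError("data_id already exists; create a new data_id instead")
        log[data_id] = {"value": value, "created_at": when, "closed_at": None}
    elif event == "close":
        entry = log.get(data_id)
        if entry is None or entry["closed_at"] is not None:
            raise ValueError("only existing, still-open data can be closed")
        entry["closed_at"] = when          # the stored value itself is never touched
    else:
        raise ValueError("only 'create' and 'close' are allowed; no update, no delete")

apply_event("create", "reading-1", value=128)
apply_event("close", "reading-1")          # closed, but still fully visible in the log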


> > e) Note that there are some existing theories about changes, but that
> > all of them use undefined terms like “the world”, “the situation of the
> > world”, “the state of the world”, “states of affairs”, etc.
>
> "Closed WORLD assumption", anybody ?


There is a precise definition of the "Closed WORLD assumption". However, "the world" and "a change of the world" are the most undefined words in the world.


> > In contrast to this, my solution models only changes of entities
> > and relationships, which are terms that are defined.
>
> No kidding. Are they ?


Please note that the entity and relationship are precisely defined in my papers. But if you believe that the entity and relationship are not well defined in other papers, then you may be right.

Vladimir Odrljin

Erwin

Jul 24, 2012, 10:12:56 AM
On Tuesday, 24 July 2012 03:23:30 UTC+2, vldm10 wrote the following:
> > > b) My database solution introduces and enables the construction and
> > > maintenance of a “history of changes”. My work gives the first complete
> > > solution of the “history” problem.
> >
> > Are you familiar with the work by one Nikos Lorentzos ? I have a book in my
> > library that was published 2003, summarizing his approach.
> >
> > And since you claim that your solution is "complete", I reckon you also
> > know what to do about and how to deal with, say, cyclic point types ?
>
> If you refer to the book “Temporal Data & the Relational Model” by C. J.
> Date, Hugh Darwen, and Nikos Lorentzos, then you are wrong, because this
> book is about temporal data rather than about the history of data.

Chapter 15 of that book is entirely devoted to the particular subject of "history of the data itself". And if you think that there is a logical difference between "a statement about the modeled reality that was true for some period of time" and "a statement about the beliefs recorded in the database and the period during which they were recorded in said database", then you are wrong too, and pathetically wrong at that.

There is nothing to stop relational theory from still applying, even when the "modeled reality" is the very set of beliefs itself that were recorded in the database, and there is nothing that stops the presented concepts of temporal data management from applying to that case too.

There is a _practical_ difference (the issue of retroactive updates can, or should, never arise when it comes to "the history of the beliefs that were recorded in the database"), but that does not count as a _logical_ difference.

There may be differences in the algorithms that turn out to be the most optimal for dealing with either of both kinds of history, and there may be differences in the syntactic shorthands that are built into the language and offered to the user for dealing with either kind of history, but none of those count as logical differences either.



> > Wait a sec. Just because you allow only additions in your databases, implies that it is impossible to have redundancy in your databases ?
>
>
> I wrote “controls redundancy”. In short, by “controls redundancy” I mean the following:

OK. I'll rephrase.

So just because you have dropped, in a certain sense, UPDATE and DELETE from your DML language, does this imply by and of itself that redundancy in the database has now come under control [whereas in a language that does have UPDATE and DELETE, redundancy and derivability must necessarily be beyond the DBA's control]?

Redundancy (the full name for the subject topic was "redundancy and derivability", IIRC) comes from what the external predicates are for each of the elements in your database design (in a TR system, those elements are relvars; I don't know what you call them in your system, so I'll assume "entity" henceforth).

If the truth value for any instantiation of a predicate (that's a proposition), or for any portion thereof, is derivable by looking at _other_ predicates and propositions, then you [inevitably] have redundancy and/or derivability. In each such case, you [inevitably] have an update anomaly problem, because there are then two distinct possible ways for determining the truth value of said proposition [or portion thereof] , let's call this proposition p :

(a) By looking at the database element (relvar/entity) and inspecting whether the tuple (entity occurrence) corresponding to p appears or not (making the truth value of p true or false, respectively).

(b) By looking at the _other_ database elements, determining the truth value of whatever it is that allows us to derive p, and doing that derivation.

The two might yield different results, and if they do, then you have a contradiction at hand. The only way to avoid this is by imposing a constraint. Whether or not the DML has UPDATE and/or DELETE has nothing to do with it.
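
A small worked example of this point, with made-up element names: one element stores order totals that are derivable from another element's order lines, so the two can contradict each other regardless of which DML verbs exist, and only a constraint can prevent it.

# Two "database elements" whose predicates overlap: the totals are derivable from the lines.
order_lines  = [("o1", 10), ("o1", 15), ("o2", 7)]   # (order_id, line_amount)
order_totals = {"o1": 25, "o2": 9}                    # (order_id, total) -- redundant

def derived_total(order_id):
    # way (b): derive the truth value from the *other* element
    return sum(amount for oid, amount in order_lines if oid == order_id)

def check_constraint():
    # the constraint that must hold whether or not the DML has UPDATE or DELETE
    for oid, stored in order_totals.items():          # way (a): look it up directly
        if stored != derived_total(oid):
            raise ValueError(f"contradiction for {oid}: stored {stored}, derived {derived_total(oid)}")

try:
    check_constraint()
except ValueError as err:
    print(err)      # contradiction for o2: stored 9, derived 7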



> My solution does not update and delete data. So it does not have update and delete anomalies (usually caused by redundancy). The intention of this solution is to save all that is entered into the database. My solution uses the binary structures and only two events. Therefore, it is possible to set powerful constraints on the level of data entry. The possible events are: “create new data” and “close the existing data”.

The only thing you have done is renamed DELETE (to CLOSE, IIRC). Doing that does not eliminate update anomalies. At best, you only replace DELETE anomalies with CLOSE anomalies.

vldm10

Jul 28, 2012, 11:30:54 AM

> Chpater 15 of that book is entirely devoted to the particular subject of
> "history of the data itself".

These authors do not apply any of the points (ii) a, b, c, d, e, f that I mentioned in my first message in this thread from 14 July. These points relate to large and important parts of the "history" problem.






> And if you think that there is a logical difference between "a statement about the modeled reality that was true for some period of time" and "a statement about the beliefs recorded in the database and the period during which they were recorded in said database", then you are wrong too, and pathetically wrong at that.
>
> There is nothing to stop relational theory from still applying, even when the "modeled reality" is the very set of beliefs itself that were recorded in the database, and there is nothing that stops the presented concepts of temporal data management from applying to that case too.
>
> There is a _practical_ difference (the issue of retroactive updates can, or should, never arise when it comes to "the history of the beliefs that were recorded in the database"), but that does not count as a _logical_ difference.
>
> There may be differences in the algorithms that turn out to be the most optimal for dealing with either of both kinds of history, and there may be differences in the syntactic shorthands that are built into the language and offered to the user for dealing with either kind of history, but none of those count as logical differences either.


I am not sure that you understand my db design. In my database design, I model states of entities and relationships. To be precise, a state of an entity (or relationship) is knowledge about the entity (relationship); see my paper “Database design and data model founded on concept constructs and knowledge constructs”, sections 3.9 and 4.2.4.1 at http://www.dbdesign11.com .
My db design changes the definition of truth and truth conditions. Truth and truth conditions depend on the state of the corresponding entity. I think that this is a very important result.

The relationship between meaning and truth is also defined in my model. (See my paper “Semantic databases and semantic machines”, section 3, at http://www.dbdesign11.com .)




> The only thing you have done is renamed DELETE (to CLOSE, IIRC). Doing that does not eliminate update anomalies. At best, you only replace DELETE anomalies with CLOSE anomalies.


There is a huge difference between "close" and "delete". These things cannot be compared at all.

Vladimir Odrljin

vldm10

Jul 31, 2012, 5:59:42 PM
How should we design a database, so that it supports the decomposition of relvars into corresponding binary relvars?


In current database theory, database design is influenced by functional dependencies. My approach to database design is reversed. That is what I wanted to express with the title.

Let me briefly explain my solution to this problem. My solution only works with entities, relationships, intrinsic/extrinsic properties and simple keys. First, I will describe the general case:

Case 1. All-key entity. In my data model the general case is that the entity has intrinsic properties. This means that properties of the entity take values freely. This entity corresponds to a relation which has mutually independent attributes.

Case 2. All the entities from a set of entities have the same attributes. For instance, they are all Honda Civics with the same attributes. In this case we introduce the identifier id = VIN as a property of the entity. We can note that there are many identifiers introduced into business applications by international standards.

(In the case of a relationship, the key is always predefined.)

These two cases are general cases in my database design, and we can note that all FDs here are implied only by keys. We can also note that here a join can be done only via the key, because the key is the only part that all these attributes have in common.

I introduced Simple Form on May 15, 2006 (see http://www.dbdesign10.com ), and the above two cases satisfy the conditions for Simple Form. Simple Form is much better than other normal forms. We don’t need to put a relation into 2NF, 3NF, BCNF, 5NF … because the binary schemas can be constructed immediately.
In both of the two general cases, the schema of the entity will be: E (identifier, attribute1, ..., attributeN). This schema meets the conditions for Simple Form, so now we can construct the corresponding binary relations.
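
As a rough sketch (my own code, with an invented example entity) of decomposing a schema E(identifier, attribute1, ..., attributeN) into binary relations, each non-key attribute is paired with the identifier, and the identifier is the only attribute through which the binary relations can be joined back together:

# One tuple of E(identifier, attribute1, ..., attributeN), keyed here by a VIN.
E = {"id": "VIN-123", "make": "Honda", "model": "Civic", "color": "red"}

def to_binary_relations(row, key="id"):
    """Decompose one row into one binary (key, attribute) relation per non-key attribute."""
    return {attr: {row[key]: value}
            for attr, value in row.items() if attr != key}

binary = to_binary_relations(E)
# {'make': {'VIN-123': 'Honda'}, 'model': {'VIN-123': 'Civic'}, 'color': {'VIN-123': 'red'}}

# Joining the binary relations back together is possible only through the key,
# since the key is the only attribute they share.
rejoined = {"id": "VIN-123", **{attr: rel["VIN-123"] for attr, rel in binary.items()}}
assert rejoined == E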

Now we can ask what other cases in database design exist besides the two above-mentioned ones? A better question would be: What generates the other cases?

The other cases for the design of entities (or relationships) can be generated by the following:
Case-a. By introducing business rules and constraints on entities (or relationships)
Case-b. By applying the operations add, delete and update to entities (or relationships)

The mentioned business rules and constraints can ruin every database.
Business rules in my database solution are handled by using binary structures, because business rules and constraints should be defined at the level of binary structures.

Note that business rules can be set up so badly that they cannot be brought into compliance with the theory of databases.
Case-b in my database model is solved by the creation of “History”. Databases which maintain “History” have neither delete nor update operations; my database model keeps all data.

In my data model, for any piece of data in the database it is known how it was made and who (or which procedure) made it. In my opinion this property is a huge theoretical and practical advantage of my data model.
We can ask the question, “Is there a way that someone can break the maintenance of history in my data model?” The answer is no.

Note that in the object-oriented approach there are constructors, destructors, encapsulation, and states, but it is not possible to determine many things related to data. Deleting and updating data is allowed, thus there is no maintenance of states. Identity in OOA often relies on a physical location in memory.
In “Anchor Modeling” the deleting of data is allowed. This implies that updating of data is also allowed. Therefore there is no history. Deleting or updating of data impedes online (internet-supported) databases.

In my data model I work with the conceptual model and the logical model. To switch from one model to another, I use the mapping from one data model to the other. This mapping consists of two mappings: schema mapping and data mapping. Both mappings are determined by binary structures and by identifiers of states of entities (or relationships). See my paper “Database design and data model founded on concepts and knowledge constructs”, Section 6.4 on http://www.dbdesign11.com , from 2008.

Simple Form is on the level of the relational model. However, there is the following problem: how to decompose a structure into binary structures on the conceptual model level? I showed how this can be solved in my paper “Database design and data model founded on concepts and knowledge constructs”, Section 4.2.3, Example 6 on http://www.dbdesign11.com , from 2008 and in my paper “Semantic databases and semantic machines” Section 5.8, on http://www.dbdesign11.com

--
The above text relates to simple databases. Now we will show how to construct a binary structure for complex databases (databases that manage changes and keep history).
For complex databases, the decomposition into the binary structure is realized by means of states and identifiers of states. (See my paper “Database design and data model founded on concept and knowledge constructs", Section 4.2.9.)

(Note that in RM/T and "Anchor Modeling" this decomposition is not proven. For example, it is not done at the level of the ER model.)

vldm10

Sep 12, 2012, 5:03:22 PM
On Tuesday, July 24, 2012 4:12:56 PM UTC+2, Erwin wrote:

> The only thing you have done is renamed DELETE (to CLOSE, IIRC). Doing that does not eliminate update anomalies. At best, you only replace DELETE anomalies with CLOSE anomalies.



Let me give you two examples. These two examples show that the combination of “Surrogate key” and “Delete” is a very bad combination.

Example1.
State1. Person X from a company that supplies consumers with electricity reads a device for measuring the energy consumption for a person Y. Person X writes down on paper that, on May 1st, 2011, person Y fully spent 128 units of energy.
State2. Person X submits this list to the IT department on June 1st, 2011, a month later.
State3. The person from the IT department has put this list in a drawer and forgotten about it. After two months, he was reminded of this list and promptly handed it over to the person doing data entry.
State4. Thus, the data was entered into the database on September 1st, 2011.
State5. Then the IT department filed a lawsuit against a person Y, because this person has not paid the electricity consumed.
State6. However, it turns out person Y passed away on May 15th, 2011.
State7. During the trial it is determined that the person who entered the data made a mistake and entered 728 instead of 128 (as the amount of energy consumed).
The son of person Y, who is a good lawyer, represents his late father in court.

In this example, it is clear who created each state. We also know the procedure responsible for creating each state. Finally, we have an operation of assignment. More formally, we have the “who” from the real world, the “which procedure” in the db, and the “name” of the location in memory in which the “state” became permanent (assigned).

However, if one uses existing db theory (write and delete), then one does not know what is going on or who is responsible for a particular piece of data.

Example2.
Let there be a relation that represents the entity Car. Aside from other attributes, a car has the attribute VIN (vehicle identification number). If we use the surrogate key, then one of the binary relations has the following scheme: R1 (SurrogateKey, VIN). For example, one of the possible instances of this scheme can be the following relation: R1(SurrogateKey228, VIN1).

In the OO approach, in Codd’s RM/T approach and in Anchor Modeling, it is possible to delete this binary relation. Anchor Modeling allows the deleting of “erroneous data”. Thus, using the mentioned theories, we can delete R1. Later, we can insert a new R1(SurrogateKey228, VIN2). Here, VIN1 and VIN2 are two real VIN numbers.

This example shows that anyone can change the identification of an entity. Of course, this is nonsense.
We can notice that the “VIN” identification is determined in accordance with international standards, procedures and laws.
So, it is possible for a malicious data entry person, an insider, an outsider or a car company to change the identity of a car.
In this way, any country can change someone’s identity. My point here is that this is a kind of "disaster" database design.
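
For contrast, here is a minimal sketch (my own names and data) of the two behaviours described in Example 2: delete-and-reinsert leaves no trace of the original identification, while an append-only close-and-create keeps the change itself visible.

from datetime import datetime

# (a) delete + reinsert: the original fact R1(SurrogateKey228, VIN1) simply vanishes.
R1 = {228: "VIN1"}
del R1[228]
R1[228] = "VIN2"
print(R1)            # {228: 'VIN2'} -- no trace that VIN1 was ever recorded

# (b) append-only: the old row is closed and a new one created, so both remain visible.
R1_history = [
    {"surrogate": 228, "vin": "VIN1", "entered_by": "clerk-7",
     "created": datetime(2011, 5, 1), "closed": datetime(2011, 9, 1)},
    {"surrogate": 228, "vin": "VIN2", "entered_by": "clerk-7",
     "created": datetime(2011, 9, 1), "closed": None},
]
current = [row for row in R1_history if row["closed"] is None]
print(current[0]["vin"])   # VIN2 -- the earlier identification, and who changed it, are still there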

Codd, and the authors of OO-approach did not do anything that is relevant to the “history” of events.

Vladimir Odrljin

vldm10

Sep 17, 2012, 9:07:45 AM
Thank you, George, for this information. I think this information is important to those who use the user group to express their opinions, their ideas and their creativity. This information is also necessary for all professionals from the IT industry who are not "scientists". Probably more should be written on this subject.
Little attention is paid to protecting the intellectual property rights of these large groups of IT professionals. On the other hand, it is enough for some scientists to publish a paper which is the same as work that has already been done by some IT professional, and the IT professional loses all rights in such a case.
It took time for me to do serious science. I quit my solid job position and took lower-level, part-time jobs, because it was not possible to work intensively on computer science and be employed full-time at another job. This resulted in a number of other material and non-material consequences. I am writing about this to demonstrate a small fragment of the things that a court can never fix.

In my case, I have published my work on my website and this user group, simultaneously. I have paid for my website. My work has been extensively discussed on this user group. At that time, these user groups were being translated into numerous languages. So for my work, there are plenty of witnesses. My results were published four years before the "Anchor Modeling" results.

Anchor Modeling uses a Surrogate Key. In my definition of an identifier of an entity (or relationship), a Surrogate Key is included. That definition states:

“Besides Ack, every entity has an attribute which is the identifier of the entity or can provide identification of the entity. This identifier has one value for all the states of one entity or relationship.” (See Section 1.1 at http://www.dbdesign10.com, from the year 2005.)

In this definition, the second part, "... or can provide identification of the entity", refers to a surrogate, but only to those surrogates that can provide identification of entities.
In fact, my intention with this part of the definition of the identifier of the entity was to identify the relationship, especially when the relationship is represented via surrogates. This is a much more serious problem, which the authors of Anchor Modeling have not even noticed.
In my paper, a relationship is, for the first time, defined as an abstract object. I also give a definition of identifiers of abstract objects and a definition of identifiers of states of relationships.

In my complaints to the publishing houses Springer and Elsevier, I stated that the paper Anchor Modeling is incorrect, that it has scandalous theoretical errors and that it uses very important sections from my paper. I showed the same on this user group. After that, a new version of Anchor Modeling was published. In this new version, my critique and important ideas from my paper were used to correct certain errors in the paper. As I have indicated on this user group, surrogates can be applied only in very limited cases.
Note that this is substantial work in computer science, as well as for some other sciences.

Vladimir Odrljin

vldm10

Oct 31, 2012, 5:19:19 AM
On page 211, Appendix A / Primary Keys Are Nice but Not Essential,
of his last book, "Database Design & Relational Theory – Normal Forms & All That Jazz” C.J. Date writes about “anchor relvars”

What is the role of these maritime terms in the theory of databases?
Does anyone know what this is?

Vladimir Odrljin

com...@hotmail.com

Nov 1, 2012, 1:01:53 AM
On Wednesday, October 31, 2012 2:19:19 AM UTC-7, vldm10 wrote:
> On page 211, Appendix A / Primary Keys Are Nice but Not Essential,
> of his last book, "Database Design & Relational Theory – Normal Forms & All That Jazz” C.J. Date writes about “anchor relvars”

He is just using it in its everyday metaphoric sense of a rooted base.
Google gives a Safari book preview of a few lines, below. (I have the book; this is enough to give you an idea of his usage, introduced here.) See http://en.wikipedia.org/wiki/Relational_Model/Tasmania re kernel entities of RM/T.

philip

A. Primary Keys Are Nice but Not Essential
ONE PRIMARY KEY PER ENTITY TYPE?
I turn now to the second of the two issues mentioned in the introduction to this appendix: viz., that entities of a given type are supposed to be identified in exactly the same way everywhere in the database. What this means, loosely speaking, is that there’ll typically be:
A single “anchor” relvar for the pertinent entity type, having some particular primary key, together with
Zero or more subsidiary relvars giving further information about entities of that type, each having a foreign key that refers back to the primary key of that anchor relvar.
(Does this state of affairs remind you of the RM/T discipline discussed in Chapter 15?)



vldm10

Nov 4, 2012, 1:31:33 PM
On Thursday, November 1, 2012 6:01:54 AM UTC+1, com...@hotmail.com wrote:
> On Wednesday, October 31, 2012 2:19:19 AM UTC-7, vldm10 wrote:
>
> > On page 211, Appendix A / Primary Keys Are Nice but Not Essential,
>
> > of his last book, "Database Design & Relational Theory – Normal Forms & All That Jazz” C.J. Date writes about “anchor relvars”
>
>
>
> He is just using it in its everyday metaphoric sense of a rooted base.

Here, C. Date tries to declare the RM/T discipline; however, RM/T is not a discipline. RM/T did not provide a solution and has many very serious mistakes. This state of affairs reminds me of the use of other people's papers in order to prove that RM/T is actually a "discipline". Speaking in the style of maritime terms, it seems to me that the "anchor relation" as a kind of a release ink.

On July 18, 2012, in this thread, I explained with examples that the surrogate key is a very bad solution. Note that these examples are given only for entities; relationships with surrogates lead to much worse solutions.
Note that there are more serious problems with RM/T at the theoretical level.


> Google gives a Safari book preview of a few lines, below. (I have the book, this enough to give you an idea of his usage, introduced here.) See http://en.wikipedia.org/wiki/Relational_Model/Tasmania re kernel entities of RM/T.


Let me comment only few basic things from this web site. In the section "Summary of RM/T", the following basic concepts are defined:

(i) Definition
Surrogates. A surrogate is a unique value assigned to each entity.

my comment: The entity is the real world object and it is not possible to assign value to the real world object.

(ii) Definition
A nonentity is some thing that is not an entity…

my comment: Note that “entity” and “thing” are synonyms.

(iii) Definition
The RM/T addresses atomic semantics by…

my Comment: With the most carefully observing the RM / T, one could not find a single atom of the semantics, because in the RM/T, section 4, E. Codd wrote:
"Database users may cause the system to generate or delete a surrogate, but they have no control over its value, nor is its value ever displayed to them.”



>
>
>
> philip
>
>
>
> A. Primary Keys Are Nice but Not Essential
>
> ONE PRIMARY KEY PER ENTITY TYPE?
>
> I turn now to the second of the two issues mentioned in the introduction to this appendix: viz., that entities of a given type are supposed to be identified in exactly the same way everywhere in the database. What this means, loosely speaking, is that there’ll typically be:
>
> A single “anchor” relvar for the pertinent entity type, having some particular primary key, together with
>
> Zero or more subsidiary relvars giving further information about entities of that type, each having a foreign key that refers back to the primary key of that anchor relvar.
>
> (Does this state of affairs remind you of the RM/T discipline discussed in Chapter 15?)
>
>
>
>
>
> he

Vladimir Odrljin

com...@hotmail.com

unread,
Nov 5, 2012, 4:27:36 PM11/5/12
to
On Sunday, November 4, 2012 10:31:33 AM UTC-8, vldm10 wrote:
> Speaking in the style of maritime terms, it seems to me that the "anchor relation" as a kind of a release ink.

I cannot make sense of "as [has? is?] a release ink [link?]" but I am interested to know what you meant. ("Speaking in the style of maritime terms," an anchor has a chain that has links but "has/is a release link" still makes no sense.)

philip

Eric

unread,
Nov 5, 2012, 4:44:14 PM11/5/12
to
On 2012-11-04, vldm10 <vld...@yahoo.com> wrote:
...
>> Google gives a Safari book preview of a few lines, below. (I have the
>> book, this enough to give you an idea of his usage, introduced here.) See
>> http://en.wikipedia.org/wiki/Relational_Model/Tasmania re kernel entities
>> of RM/T.

> Let me comment only few basic things from this web site. In the section
> "Summary of RM/T", the following basic concepts are defined:

(re-ordered)

> (ii) Definition
> A nonentity is some thing that is not an entity…
>
> my comment: Note that “entity” and “thing” are synonyms.

Selective quoting is never a good way to try to make a point. The
sentence before says "An entity is some thing in the modelled universe",
i.e. a definition of "entity" in the current context. A thing that is
not in the modelled universe is not an entity so the two words are not
synonyms. Simple.

>
> (i) Definition
> Surrogates. A surrogate is a unique value assigned to each entity.
>
> my comment: The entity is the real world object and it is not possible
> to assign value to the real world object.

I can assign value to anything I choose. I can make my system assign *a*
value to its representation of an entity. There are two distinct
meanings of "value" there. Either way your comment is just wrong.

> (iii) Definition
> The RM/T addresses atomic semantics by…
>
> my Comment: With the most carefully observing the RM / T, one could not
> find a single atom of the semantics, because in the RM/T, section 4,
> E. Codd wrote: "Database users may cause the system to generate or delete
> a surrogate, but they have no control over its value, nor is its value
> ever displayed to them.”

There are various things one could say about Codd's statement you have
quoted, but what the blazes has it got to do with atomic semantics? I
suspect that you have totally misunderstood the meaning of "atomic
semantics" in the context of that web page.

Or, to put it another way, nothing you are saying makes any sense at
all.

paul c

unread,
Nov 5, 2012, 10:30:12 PM11/5/12
to
I'm more interested to hear why people think keys matter and how they
matter, not about anchors. Ordinary English is full of nautical idioms,
that's just the result of its rapid development during the host
country's 19th century adventures, during a time when I gather that the
other "Western" languages such as French didn't change much. (I don't
know if it's merely a coincidence that most of the people who strike me
as the deepest RT thinkers happen to be English.)

I wish people would ask about the importance of keys in the first place,
instead of this "anchor" variant. I wish they would keep aspects of
English out of it. Codd was surely a pragmatist who I think chose his
adjectives to help encourage his ideas among commercial users, not to
embellish any credentials of rigour (although I imagine he could have if
he had wanted to - nobody can ever know that now). Not to disparage
rigour in general, but where practical advantages are concerned, one
eventually has to make choices. Forums like this one have been obsessed
for years with topics that had to do with choices that are quite
subjective, in other words quite concerned with the ease of automating a
particular application.

For myself, the chief advantage of what Codd called a key is to enable
the identification of a proposition without going to the trouble of
specifying all of the proposition. (There are many other ways to
express that sentence which I don't argue with.)

I wish this forum would get back to basic theoretical questions instead
of talking about language nuances (I realize avoiding them is hard,
especially for people whose first language isn't English, nothing I can
do about that.)

vldm10

unread,
Nov 28, 2012, 3:54:33 AM11/28/12
to
I did not use “selective quoting”, I wrote about surrogate keys, entities and atomic semantics. These are the basic concepts of RM/T. In fact, I wrote about nonsense in the paper RM/T.

For example, the following sentence: “An entity is some thing in the modelled universe and is typically identified by a surrogate.” is not true, because Codd’s surrogate does not identify an entity. In many cases, the surrogate introduced by E. Codd, cannot identify anything. In fact, Codd’s surrogate is serious nonsense.
The paper RM/T has many other mistakes; some of these mistakes are related to fundamentals from the theory of database.

I am writing about this, because I have impression that there are people who try to “fix” RM/T paper, by using works which are done by others. Note that these are the most important things from the theory of database.

Vladimir Odrljin

Eric

unread,
Nov 29, 2012, 6:23:20 PM11/29/12
to
On 2012-11-28, vldm10 <vld...@yahoo.com> wrote:
> On Monday, November 5, 2012 11:10:05 PM UTC+1, Eric wrote:


>> On 2012-11-04, vldm10 <vld...@yahoo.com> wrote:
>> ...
>>>> Google gives a Safari book preview of a few lines, below. (I have the
>>>> book, this enough to give you an idea of his usage, introduced here.)
>>>> See http://en.wikipedia.org/wiki/Relational_Model/Tasmania re kernel
>>>> entities of RM/T.
>>
>>> Let me comment only few basic things from this web site. In the section
>>> "Summary of RM/T", the following basic concepts are defined:
>>
>> (re-ordered)
>>
>>> (ii) Definition
>>> A nonentity is some thing that is not an entity.
>>>
>>> my comment: Note that "entity" and "thing" are synonyms.
>>
>> Selective quoting is never a good way to try to make a point. The
>> sentence before says "An entity is some thing in the modelled universe",
>> i.e. a definition of "entity" in the current context. A thing that is
>> not in the modelled universe is not an entity so the two words are not
>> synonyms. Simple.
>>
>>>
>>> (i) Definition
>>> Surrogates. A surrogate is a unique value assigned to each entity.
>>>
>>> my comment: The entity is the real world object and it is not possible
>>> to assign value to the real world object.
>>
>> I can assign value to anything I choose. I can make my system assign *a*
>> value to its representation of an entity. There are two distinct
>> meanings of "value" there. Either way your comment is just wrong.
>>
>>> (iii) Definition
>>> The RM/T addresses atomic semantics by.
>>>
>>> my Comment: With the most carefully observing the RM / T, one could not
>>> find a single atom of the semantics, because in the RM/T, section 4,
>>> E. Codd wrote: "Database users may cause the system to generate or delete
>>> a surrogate, but they have no control over its value, nor is its value
>>> ever displayed to them."
>>
>> There are various things one could say about Codd's statement you have
>> quoted, but what the blazes has it got to do with atomic semantics? I
>> suspect that you have totally misunderstood the meaning of "atomic
>> semantics" in the context of that web page.
>>
>> Or, to put it another way, nothing you are saying makes any sense at
>> all.
>
> I did not use "selective quoting", I wrote about surrogate keys, entities
> and atomic semantics. These are the basic concepts of RM/T. In fact,
> I wrote about nonsense in the paper RM/T.

You quoted one sentence from the web site. You chose not to quote the
sentence before it. You selected what to quote so that your (otherwise
nonsensical) comment on it would make some sort of sense. _That_ is
selective quoting, one of the many forms of argument by deception.

> For example, the following sentence: "An entity is some thing in the
> modelled universe and is typically identified by a surrogate." is not
> true, because Codd's surrogate does not identify an entity. In many cases,
> the surrogate introduced by E. Codd, cannot identify anything. In fact,
> Codd's surrogate is serious nonsense.

You are now saying that something is wrong because it is wrong. The
concept of the surrogate is not nonsense, you simply do not understand
it!

> The paper RM/T has many other mistakes; some of these mistakes are
> related to fundamentals from the theory of database.

The paper may or may not contain mistakes. I don't think you have made a
proper identification of any, let alone explaining why they are
mistakes or how they might be fixed.

> I am writing about this, because I have impression that there are people
> who try to "fix" RM/T paper, by using works which are done by others.

And you object to this? Why?

vldm10

unread,
Dec 5, 2012, 6:26:04 AM12/5/12
to

> You are now saying that something is wrong because it is wrong. The
>
> concept of the surrogate is not nonsense, you simply do not understand
>
> it!


In this thread I showed why the surrogate key is a bad solution. See my posts from 18 July and 12 September in this thread.
There are many other cases, which show that the surrogate key is a bad solution. The cases that I have indicated are very serious.
As far as I know, this is the first time that someone has explained why the surrogate key is a bad solution. This explanation is supported with very important examples.

Obviously, RM / T can not support nulls. A person who enters the data has to know all the data of the corresponding entity. So, someone could set the following question: why RM / T uses binary relations, when data entry person has to know the entire entity.

Logical operations that Codd introduced for nulls do not work and do not help.

The surrogate keys cannot be fixed.

There are other serious problems in RM/T that are not due to surrogate keys. What is much worse is that RM/T contains serious misunderstandings at the theoretical level. There are also serious mistakes at the theoretical level. Let me show you just three of them.

(i) Kurt Gödel 1944:
“By the theory of simple types I mean the doctrine which says that the objects of thought (or, in another interpretation, the symbolic expressions) are divided into types, namely: individuals, properties of individuals, relations between individuals, properties of such relations, etc. (with a similar hierarchy for extensions), and that sentences of the form: " a has the property φ ", " b bears the relation R to c ", etc. are meaningless, if a, b, c, R, φ are not of types fitting together. Mixed types (such as classes containing individuals and classes as elements) and therefore also transfinite types (such as the class of all classes of finite types) are excluded…”

Note that Kurt Gödel uses the term "individuals" rather than “entities”. 35 years later, E. Codd extensively uses the term "entity type" in his work RM/T.
What is wrong here?
This “job” is wrong because no one knows how Codd transfers data from one data model to another data model using the entity type (from the E/R data model to the RM data model and vice versa).

Please note that I have completely solved this problem. See my paper “Database design and data model founded on concept and knowledge constructs” from 2008 at http://www.dbdesign11.com , sections 1, 4.1, 4.2.7 and 6.4.
The mapping between data models for simple databases is defined by identifiers of entities (relationships). The mapping for complex databases is defined by identifiers of states of entities (relationships).

(ii) I distinguish real objects from abstract objects. I defined abstract objects and the identification of each of these objects. See (3.3.3), section 3.3, of the above-mentioned paper “Database design and data model founded on concept and knowledge constructs”.
In my paper only the "m-attributes" are determined by our perceptual abilities. All other (more complex) objects are defined recursively, according to their complexity (see m-entities, m-relationships and m-states). The complex objects are determined by our mental activities.

We can notice that E. Codd did not distinguish these objects. He did not even notice these objects.

On the other hand, we can notice that (3.3.3) is important because it defines the relationship between concepts and identification; that is, it determines the relationship between the satisfaction relation and the corresponding identification.

(iii) E. Codd did not prove the decomposition of a relation into binary relations, which is suggested in his paper RM/T.

Note that it is obvious that the binary relation must have one attribute and a simple key. Obviously, E. Codd was aware that there was no point in having a binary relation with a composite key and one attribute. So, he introduced the “surrogate key”.

Many people have tried to solve the problem of "binary decomposition". This is perhaps the most important problem in database theory.
--

In his book "Go Faster! - The TransRelational ™ Approach to DBMS Implementation" C. Date published a letter which was written by E. Codd. This letter E. Codd wrote to Steve Tarin. C. Date represents Steve Tarin as the author of The TransRelational™ Model.
(This book is free downloaded at bookboon.com). In this letter E. Codd expressed congratulations to S. Tarin for his “revolutionize” work and that this technology can be “extremely effective”. It seems to me that with this letter E. Codd admitted that did not resolve the problem of the mentioned "binary decomposition".

But in his last book "Database Design & Relational Theory – Normal Forms & All That Jazz” C.J. Date writes about “anchor relvars” and “(Does this state of affairs remind you of the RM/T discipline discussed in Chapter 15?) “
I have an impression that this claim about “RM / T discipline” stands in contradiction with the obvious E. Codd assertion that The TransRelational ™ Model superiorly solved the mentioned problems.


> > I am writing about this, because I have impression that there are people
>
> > who try to "fix" RM/T paper, by using works which are done by others.
>
>
>
> And you object to this? Why?

The following sentence: “An entity is some thing in the modelled universe and is typically identified by a surrogate.” is not true, because Codd’s surrogate does not identify an entity. (See http://en.wikipedia.org/wiki/Relational_Model/Tasmania )

In many cases, the surrogate introduced by E. Codd, cannot identify anything. In fact, Codd’s surrogate is serious nonsense. See examples in my posts from 18 July and 12 September in this thread.

Given that the mentioned sentence is posted on Wikipedia, and considering that Wikipedia has a global reach and that a lot of people use Wikipedia as reference material, I find it important to inform the public about these inaccuracies.
Of course, if you think that my examples are not correct, please post it. But please be specific. So, please specify a concrete my example which is not correct, and give an explanation. Given that your comments were not specific, I can not accept them seriously.


> Eric
>
> --
>
> ms fnd in a lbry

Vladimir Odrljin

Eric

unread,
Dec 5, 2012, 4:41:56 PM12/5/12
to
On 2012-12-05, vldm10 <vld...@yahoo.com> wrote:
> On 2012-11-29, Eric <er...@deptj.eu> wrote:
>
>> You are now saying that something is wrong because it is wrong. The
>> concept of the surrogate is not nonsense, you simply do not understand
>> it!
>
>
> In this thread I showed why the surrogate key is a bad solution. See
> my posts from 18 July

I see no point in responding to something I have already responded to!

> and 12 September

This is a couple of examples which are good indicators for the need for
a "temporal database", a subject about which there is now a substantial
literature which you do not seem to have referred to at any time.

> in this thread. There are many other cases, which show that the surrogate
> key is a bad solution. The cases that I have indicated are very serious.
> As far as I know, this is the first time that someone has explained why
> the surrogate key is a bad solution. This explanation is supported with
> very important examples.

I do not understand why you have made surrogate keys your target,
especially as I still think you do not understand them. Where you have
raised a real problem it is always about what operations should be allowed
in a database and what, if any, record of those operations is kept in the
database system. Surrogate keys are neither the problem nor the solution.

> Obviously, RM / T can not support nulls. A person who enters the data
> has to know all the data of the corresponding entity. So, someone could
> set the following question: why RM / T uses binary relations, when data
> entry person has to know the entire entity.

In a system which uses only binary relations there is no such thing as a
null, but there is still the possibility of missing/unknown data. The
data entry person does not have to know the entire entity.

<snip enormous amount of confused garbage>

>>> I am writing about this, because I have impression that there are people
>>> who try to "fix" RM/T paper, by using works which are done by others.
>>
>> And you object to this? Why?

<snip more stuff which does not answer my question at all!>

> Of course, if you think that my examples are not correct, please post
> it. But please be specific. So, please specify a concrete my example
> which is not correct, and give an explanation. Given that your comments
> were not specific, I can not accept them seriously.

It has taken me this long to make any sense out of your examples at all,
and anyway picking on them individually is pointless when you seem to
have so many more fundamental confusions. Also, as I said above, your
examples do not explain why it is surrogates specifically that are the
problem. One might as well claim that you cause car crashes by looking
out of a nearby window.

vldm10

unread,
Dec 9, 2012, 10:57:07 AM12/9/12
to
I'm sorry, but I will not read your posts any more. I have no time
for such phraseology, at such a level.
In fact, it seems to me, I need a parachute to get down to that
level.


> One might as well claim that you cause car crashes by looking
>
> out of a nearby window.


I'm not sure, that one day, you will not initiate discussions about
the construction of the space shuttle, on this user group.

Vladimir Odrljin

Eric

unread,
Dec 9, 2012, 12:13:51 PM12/9/12
to
On 2012-12-09, vldm10 <vld...@yahoo.com> wrote:
> Dana srijeda, 5. prosinca 2012. 22:41:56 UTC+1, korisnik Eric napisao je:
>> On 2012-12-05, vldm10 <vld...@yahoo.com> wrote:
>>> On 2012-11-29, Eric <er...@deptj.eu> wrote:
>>> Of course, if you think that my examples are not correct, please post
>>> it. But please be specific. So, please specify a concrete my example
>>> which is not correct, and give an explanation. Given that your comments
>>> were not specific, I can not accept them seriously.
>>
>> It has taken me this long to make any sense out of your examples at all,
>> and anyway picking on them individually is pointless when you seem to
>> have so many more fundamental confusions. Also, as I said above, your
>> examples do not explain why it is surrogates specifically that are the
>> problem.
>
>
> Given that your comments were not specific, I can not accept them
> seriously.

So he says "I won't argue with you except on my terms." The trouble is,
his terms are only part of the things that are wrong (choose your own
word there!)

> I'm sorry, but I will not read your posts any more. I have no time
> for such phraseology, at such a level.
> In fact, it seems to me, I need a parachute to get down to that
> level.

I think he needs a ladder rather than a parachute, to climb out of the
hole he has dug for himself.

Anyway I suspect he is only posting here so that his words survive
indefinitely in usenet archives.

<context snipped by Vladimir to mask what this bit was about>

>> One might as well claim that you cause car crashes by looking
>> out of a nearby window.

> I'm not sure, that one day, you will not initiate discussions about
> the construction of the space shuttle, on this user group.

Which doesn't make sense even if you know what the context was.

vldm10

unread,
Jan 2, 2013, 4:50:02 AM1/2/13
to

I have written several times on this group that the work of E. Codd is very important for the theory of databases. But we cannot say that E. Codd did something if he did not do it. So I wonder why C. Date wrote the following:


“...
ONE PRIMARY KEY PER ENTITY TYPE?
I turn now to the second of the two issues mentioned in the introduction to this appendix: viz., that entities of a given type are supposed to be identified in exactly the same way everywhere in the database. What this means, loosely speaking, is that there’ll typically be:

A single “anchor” relvar for the pertinent entity type, having some particular primary key, together with

Zero or more subsidiary relvars giving further information about entities of that type, each having a foreign key that refers back to the primary key of that anchor relvar.

(Does this state of affairs remind you of the RM/T discipline discussed in Chapter 15?)
...”


First, RM / T is not discipline. There is no industry or theories, which are based on the RM / T.
Second, I showed (with examples) in this thread why the surrogate key is a bad solution.
Thirdly (the most important), RM/T has major theoretical drawbacks. For example, RM/T does not have anything related to the history of events, but without a history of events the "binary decomposition" cannot be done.
Also, it is not clear how one transfers structures from the ER model into the RM model, and vice versa, using RM/T. Is a binary decomposition valid in the ER model, using an "invisible" surrogate?
There are other theoretical problems in RM/T, but discussing them takes much more time and space.

"Anchor Modeling" has bridged some theoretical problems of the RM / T, using some of the results that I have published four years before them on this group. "Anchor Modeling" also uses the surrogate key and has some other incorrect theoretical results. Let me mention that it is allowed to delete some data in "Anchor Modeling." If someone has a delete operation and adding, then it implies that he has the update operation. And that means that there is no history.

Vladimir Odrljin

paul c

unread,
Jan 4, 2013, 1:29:50 PM1/4/13
to
On 02/01/2013 1:50 AM, vldm10 wrote:
> First, RM / T is not discipline. There is no industry or theories, which are based on the RM / T.

Ha! I'll bet many people would agree that RM/T is where Codd first tried
to justify nulls. Today practically the whole known English-speaking
world uses so-called industrial dbms'es that embrace the idea of
recording what we don't know. Maybe the rest of the world, too for all
I know.

vldm10

unread,
Jan 17, 2013, 1:43:11 AM1/17/13
to
In this thread I gave examples, which show that the surrogate key is a bad solution. These examples show that the missing data can not be solved by applying surrogates. One such example, I wrote in this thread on 18/07/2012:

Let R be Relvar, which has the surrogate key K and three properties
A,B,C,
Let us suppose that it is somehow possible to decompose the
above relvar into binary relvars using the “RM/T discipline”. Let the
following example be one possible situation:


K      K  A      K  B      K  C
---------------------------------
k1     k1 a1     k1 b1     k1 c3
k2
k3                         k3 c3
k4                         k4 c3
k5
k6


The above decomposition is very bad. For instance, there is the
question: how will a user find the real world entity that has the
attribute C=c3 and the surrogate key K=k3? Note that a surrogate key
is only in the database, it is not in the real world.
So, my point here is that the surrogate key makes this table so bad
that it becomes not an acceptable design.
--

I did not want to show all the bad consequences of the use of surrogate key in this example, because I think that this example is enough. But apparently it's a good idea to show some other disadvantages that result from the use of surrogate key.

1. For example, there is the question: A data entry person should enter the attribute b7, into the corresponding binary relation. This binary relation has a surrogate key whose value is k3. How will the data entry person find the corresponding binary relation? Obviously, with a surrogate key, there is no solution to this problem.

2. How will one make m-n relationships with entities in the above example? Obviously, a surrogate key is not solution to this problem.

3. How would one apply the update, and delete operation on the data from the mentioned binary relations?

Suppose someone wants to solve the above problems by applying Codd’s Three-Value logic. This approach is not correct because the above problems are not a matter of logic; they are a matter of database design.
For example, if in the above example, we apply VINs (Vehicle Identification Numbers) instead of surrogates, then we will not have the problems mentioned above.
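To make the comparison concrete, here is a rough SQL sketch of the two designs; the table and column names are invented purely for illustration and are not taken from RM/T or from any of the mentioned papers.

-- Binary decomposition keyed by a surrogate K, as in the table above.
-- K exists only inside the database, so a tuple such as (k3, c3) cannot be
-- traced back to a particular real-world object.
CREATE TABLE e  (k VARCHAR(10) PRIMARY KEY);    -- E-relation (K)
CREATE TABLE ka (k VARCHAR(10) PRIMARY KEY REFERENCES e (k), a VARCHAR(30) NOT NULL);
CREATE TABLE kb (k VARCHAR(10) PRIMARY KEY REFERENCES e (k), b VARCHAR(30) NOT NULL);
CREATE TABLE kc (k VARCHAR(10) PRIMARY KEY REFERENCES e (k), c VARCHAR(30) NOT NULL);

-- The same binary structure keyed by an identifier that also exists in the
-- real world, for example a VIN: every tuple can be checked against the car.
CREATE TABLE car   (vin CHAR(17) PRIMARY KEY);
CREATE TABLE car_a (vin CHAR(17) PRIMARY KEY REFERENCES car (vin), a VARCHAR(30) NOT NULL);
CREATE TABLE car_b (vin CHAR(17) PRIMARY KEY REFERENCES car (vin), b VARCHAR(30) NOT NULL);
CREATE TABLE car_c (vin CHAR(17) PRIMARY KEY REFERENCES car (vin), c VARCHAR(30) NOT NULL);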

Note that Lukasiewicz and Kleene use Three-Valued Logic before Codd. Codd uses nulls as a method of representing missing data in the relational model.

Vladimir Odrljin

Eric

unread,
Jan 17, 2013, 6:26:14 PM1/17/13
to
On 2013-01-17, vldm10 <vld...@yahoo.com> wrote:
> In this thread I gave examples, which show that the surrogate key is
> a bad solution. These examples show that the missing data can not be
> solved by applying surrogates. One such example, I wrote in this thread
> on 18/07/2012:

Saying it more than once doesn't make it any more true, but at least
this time it is a simple, coherent chunk. I use coherent as in "hanging
together", not as in "making sense".

> Let R be Relvar, which has the surrogate key K and three properties
> A,B,C,

OK

> Let us suppose that it is somehow possible to decompose the
> above relvar into binary relvars using the "RM/T discipline".

Last July you had quotes around 'somehow', and tried to imply that there
was no method available to do this. You gave no evidence for that, and
in fact you are wrong. It is possible as a purely logical operation on
the relvar, with no semantic information needed. If it is so difficult,
how did you manage to produce the result below?

> Let the following example be one possible situation:
>
> K      K  A      K  B      K  C
> ---------------------------------
> k1     k1 a1     k1 b1     k1 c3
> k2
> k3                         k3 c3
> k4                         k4 c3
> k5
> k6

This is a perfectly reasonable example of the solution for a particular
case. We need to assume that A, B, and C are the only properties we
consider it sensible to record for a real-world object, and also that
the real-world objects are all members of some class, e.g. "road
vehicles", or "people".

If the values in (K) are really surrogate keys, the presence of k2, k5,
and k6 in (K) is incorrect because we have no information whatsoever
about the objects referred to. Also, only one of k3 or k4 should be
present in any relvar, because they are indistinguishable. So, in fact,
we have only k1 and (arbitrarily) k3.

Note: I am using "object" rather than "entity" because the latter
has a technical meaning that is too close to the subject under
discussion. The actual choice of word is not at issue except that
using "entity" bothers me.

> The above decomposition is very bad. For instance, there is the question:
> how will a user find the real world entity that has the attribute
> C = c3 and the surrogate key K = k3?

All the user can do is observe real-world objects in the appropriate
class, and look for one that has [C = c3] but neither [A = a1] nor [B =
b1]. If this turns out not to be a simple procedure (as seems likely),
then the only possible conclusion is that we should have recorded more
properties for our objects, i.e. that our design is flawed. However,
it is not flawed because we have used surrogate keys, but because we
have recorded insufficient information to identify real-world objects.

At this point you, Vladimir, may be thinking "but that's what I said".
However, you seem to have wrongly assumed that the surrogate is actually
a substitute for having a natural key usable for identification, then
attacked the concept because it doesn't do real-world identification.
Thereby you have missed the point. A surrogate key is an artificially
created alternate key because the available natural keys are too complex
for our database system to use as keys. They may be too complex because
the necessary property list for the natural key is too large, or has
alternative or optional parts, or for other reasons.

> Note that a surrogate key is only in the database, it is not in the
> real world.

Well, yes, that is rather the whole point of a surrogate key.

> So, my point here is that the surrogate key makes this table so bad that
> it becomes not an acceptable design.

An incorrect point. See above.

> I did not want to show all the bad consequences of the use of surrogate
> key in this example, because I think that this example is enough. But
> apparently it's a good idea to show some other disadvantages that result
> from the use of surrogate key.

OK, but...

> 1. For example, there is the question: A data entry person
> should enter the attribute b7, into the corresponding binary
> relation. This binary relation has a surrogate key whose value is
> k3. How will the data entry person find the corresponding binary
> relation? Obviously, with a surrogate key, there is no solution
> to this problem.

Finding the right binary relation is easy, it is (K,B). The user's
problem is to find the right value for K to use in the new row. This is
also easy, it is the intersection of all the K values found by querying
other binary relations for known properties of the real-world object.
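As a rough SQL illustration (treating the binary relations (K,A) and (K,C) from the example as tables ka and kc, names invented for this sketch only):

-- The K value for an object known to have A = 'a1' and C = 'c3' is the
-- intersection of the K values recorded for those known properties.
SELECT k FROM ka WHERE a = 'a1'
INTERSECT
SELECT k FROM kc WHERE c = 'c3';
-- With the example data this returns k1; the new B value is then inserted
-- into the (K,B) relation using that K.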

> 2. How will one make m-n relationships with entities in the above
> example? Obviously, a surrogate key is not solution to this problem.

I can see no problem here, just use a new binary relation (K,K).
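(Sketched in SQL, with invented names: the m-n relationship is itself just another table whose key is a pair of K values.)

-- Illustration only: an m-n relationship between surrogate-identified
-- entities, recorded as a binary relation over pairs of K values.
CREATE TABLE related (
    k_left  VARCHAR(10) NOT NULL,
    k_right VARCHAR(10) NOT NULL,
    PRIMARY KEY (k_left, k_right)
);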

> 3. How would one apply the update, and delete operation on the data
> from the mentioned binary relations?

This is just number 1 again - find the proper K value and use it!

> Suppose someone wants to solve the above problems by applying Codd's
> Three-Value logic.

Why would they want to do that? You don't need three-valued logic to
construct the binary relations, and once you have them you don't need
it for data manipulation either.

> This approach is not correct because the above problems are not a matter
> of logic; they are a matter of database design. For example, if in the
> above example, we apply VINs (Vehicle Identification Numbers) instead
> of surrogates, then we will not have the problems mentioned above.

No form of logic is any help if your database design is inadequate. To
that extent I think we may even agree. You need to stop getting stuck on
surrogates because you have missed the point of what they are and what
they are for.

As for the VIN, I have tried to tell you before, it is just someone
else's surrogate key exported into the real world, and it is not even
usable in all possible applications about motor vehicles.

Perhaps I will find the time to write another post that is the true
story of a surrogate key (with nothing to do with motor vehicles).

> Note that Lukasiewicz and Kleene use Three-Valued Logic before Codd.

Was this the same three-valued logic, and even if it was, is this in
any way relevant (unless you merely wish to discredit Ted Codd)?

> Codd uses nulls as a method of representing missing data in the relational
> model.

Perhaps he did, but considering the subsequent literature on the subject,
is this even relevant any more? In any case I don't think it is relevant
to this discussion.

vldm10

unread,
Jan 21, 2013, 6:43:07 AM1/21/13
to


> Let R be Relvar, which has the surrogate key K and three properties
> A,B,C,
> Let us suppose that it is somehow possible to decompose the
> above relvar into binary relvars using the “RM/T discipline”. Let the
> following example be one possible situation:
>
> K      K  A      K  B      K  C
> ---------------------------------
> k1     k1 a1     k1 b1     k1 c3
> k2
> k3                         k3 c3
> k4                         k4 c3
> k5
> k6



This table shows that the surrogate key is a very bad solution. In this table, I added the column K. Column K represents the so-called E-relation, which was introduced by Codd as a unary relation. The authors of Anchor Modeling also introduced this unary relation; they gave it the naval name "anchor"!?

The table shows six surrogate keys, but none of them work. Note that the relation that has the surrogate key "k1" is also poorly designed. A malicious programmer can delete one of the attributes a1, b1, c3. In this case, the project leader does not know what to do, because there is no "history".

On the other hand, if someone uses my solution in this table, then ALL of these six relations are useful. For instance, the identifier can be a VIN (Vehicle Identification Number).
For example, suppose k2 in this table is a VIN. In this case, a police officer can check, over the radio, who the owners of the car were.

Now we see that "RM/T" needs to know all the attributes of the entity so that a binary decomposition of the entity is possible. This is due to the use of surrogates. Note that this fact, in a sense, degrades the idea of binary (atomic) structures. Because of this, someone has to collect all the information about the entity, preserve and record this information somewhere, and then enter it into the binary structures (into a database).

The authors of "Anchor Modeling" claim that their model solves problems with nulls. They wrote the following:
“Absence of null values – There are no null values in an anchor database. This eliminates the need to interpret null values [27] as well as waste of storage space.” See
Anchor Modeling – Agile Information Modeling in Evolving Data Environments , Section 9.2 (The article published in Data & Knowledge Engineering in 2010)
Moreover, these authors claim the following: “Anchor Modeling is a technique that has been proven to work in practice for managing data warehouses.” See section 11 of this paper.

Now we have the following situation:
1. I have shown that surrogates cannot manage "nulls."
2. The authors of Anchor Modeling claim that their model manages "nulls", even better than existing solutions.
Since Anchor Modeling uses surrogates, it seems to me that the statements of these authors are not accurate.

I first posted my results related to the design of databases that manage the history of events on this user group in September 2005. The name of the thread was "Database Design, Keys and some other things." In this thread, Joe Celko posted a very good comment regarding surrogates:

David Cressey: 9/27/05
>> A VIN, a bank account number, and an SSN are all surrogate keys. <<

Celko:
No; read Codd's definiiton of a surrogate key. These are all
industry-standard, externally verifiable keys with known validation
rules. Honking BIG difference!! The big part of this is that they
are EXTERNAL to the database.
When you use (longitude, latitude) for a location, is it also a
surrogate? If so, wouldn't every key be a surrogate? I verify a
location with a GPS; I verify a VIN, a bank account number, and an SSN
by computers or phone calls. Just a different device.

Vladimir Odrljin

vldm10

unread,
Jan 28, 2013, 2:54:17 PM1/28/13
to
K      K  A      K  B      K  C
---------------------------------
k1     k1 a1     k1 b1     k1 c3
k2
k3                         k3 c3
k4                         k4 c3
k5
k6

From this post it can be seen that surrogates have problems with nulls at the level of data entry, and that these problems cannot be solved at all.

I showed in this example that no operation can be performed on the data in the relation with the surrogate k3, because the user has no way to find this relation. It is also shown that even if the user knows exactly which relation has k3, he cannot identify the corresponding object from the real world.
Note that the user has exactly the same problems if he writes the data from the relation with K = k3 on paper. For example, if a user first tries to gradually gather all the data for the relation with K = k3 on paper, he would still have the same problems as when entering the data into the DB. So the following question arises: how does one do the elementary thing, that is, how does one collect, and where does one keep, the collected data (with nulls)?

On the other hand, the identifier given in my solution has many advantages compared to surrogates. However, notice that my identifier also has certain problems with nulls. If I have this key and nulls, then I can solve many of the mentioned problems. The key given in my solution can find a real object and vice versa. If I have nulls, then I can apply three-valued logic, or I can extract the tuples with nulls and handle them with a programming language, etc. It is not possible to apply nulls, surrogates and three-valued logic all together.

Today more than 90% of databases have identifiers that are part of my solution; these are industry-standard identifiers. For entities with these keys it does not make sense to introduce surrogates. So, for over 90% of today's databases, it is nonsense to apply surrogates. This number is an astonishing example of the amount of misunderstanding. I am referring to the wide usage of surrogates in scientific papers related to OOA, RM/T and Anchor Modeling.

On the other hand, the identifiers given in my solution do not have to be industry-standard identifiers. Every company can define its own system of identifiers and identification, based on my solution. This db design is a great advantage and gives great independence to each company. In this database design, it is essential that these identifiers are placed on the real objects of our business; for example, these identifiers should be on the documentation, receipts, invoices, etc. In this way, the identification is completely under our control. Of course, there are many variations on this solution.

Imagine now a situation where everyone uses their own surrogate keys. For example, instead of the ISBN standard for books, every project leader uses his own surrogate system. It is obvious that such a solution is impossible in real life. It is also obvious that if we use the ISBN identifier, then we do not need the surrogate key at all.

In this thread I pointed to a large group of objects from business applications that cannot be handled with surrogates. This is the example of a Honda dealer who sells Honda cars which all have the same attributes. Here we cannot use surrogates, because the cars would appear as identical entities in the database. Therefore, we must introduce the VIN. And, again, it is obvious that if we use the VIN identifier, then we do not need the surrogate key at all.
---------------
Now, after the above examples, we can raise an important question: is there a good theory of surrogates? Note that such a theory does not exist, and this is the main problem with surrogates.

In my model, the identifier is an intrinsic or extrinsic attribute of an object. It has been designed in accordance with the rules of identification. See my paper "Semantic databases and semantic machines", sections 5.5 and 5.6, at http://www.dbdesign11.com
As regards identification, my model identifies the following: attributes, entities, relationships and states. Attributes and entities are in the real world, but relationships and states are slightly different structures.
Note that the interpretations and abstractions of these objects exist in our mind. I call these abstractions by the following names: m-attributes, m-entities, m-relationships, and m-states, and I defined them as abstract objects. I also introduced a definition of abstract objects. In this way I have tried to give a formalization of these objects.
Note that we store these abstract objects in a db, using a data model.
Identifiers of attributes and entities can be found in the real world and in the database, while identifiers of relationships and states can be found only in the database. In my model, m-states and m-relationships are complex abstract objects that are constructed from less complex abstract objects. For example, an m-relationship is constructed from m-entities. Identifiers of
m-entities provide the link between an m-relationship and the real world.
So, between abstract objects, I have introduced a hierarchy according to the complexity of the abstract object. For attributes we have innate abilities. We identify the entities by the mentioned identifiers; they are placed on both the real entity and the m-entity. We identify the relationships by the corresponding entities. The states of relationships are identified by the corresponding relationships. See the comment on Example 8 in my paper “Database design and data model founded on concept and knowledge constructs” at http://www.dbdesign11.com
In this paper, I introduced a relation marked (3.3.3). This is the first time that the concept has been defined in an accurate manner. Note that Russell's paradox shows that defining a concept using properties leads to a paradox. Formula (3.3.3) allows a definition of the concept based on properties. This formula provides an important link between the satisfaction relation and the identification of the members of the extension.
In my opinion the formula (3.3.3) solves Russell's paradox. As I already wrote about this, Russell made two mistakes:
1. This is a semantic procedure, but Russell used logic.
2. When we work with concepts, we need to identify the objects to which we apply the concept.

In the design of the concept, I introduced the identifiers of the objects that satisfy the concept. Identification of abstract objects is also introduced. Knowledge is defined in a new way and it is incorporated into the construction of the concept.

My identifier is associated with columns of knowledge, while surrogates and anchors have a strictly unary structure.

I can change the identifier of the entity and maintain the history of these identifiers. See "Semantic databases and semantic machines" section 5.12 at http://www.dbdesign11.com

Vladimir Odrljin

derek.a...@gmail.com

unread,
Feb 4, 2013, 6:43:18 AM2/4/13
to
Vladimir

First, let me say that I like your work very much, although I do not agree with some aspects of it.

Second, let me say that there are a few here who are just here for the argument, with no genuine commitment to resolution. They are the ones who bellow I AM RIGHT and YOU ARE WRONG, with nothing to support their bellowing. I suggest you do not answer their posts. I do not even read them.

Third, there is only one RM, and Codd is the only author. There are many authors who write about the RM, who subtract from the RM, and add their own weird stuff to it, then they package it all together and market it as the RM. That is fraud. Initially I accepted these authors had some point because everyone in databases talked about their various theories. Over the years, no, decades, and as a result of sometimes intense interaction with them (or non-response from them), I have formed the evidenced conclusion that they are at best, neurotic and obsessed with irrelevant details, and at worst they are subversives who seek to damage the RM and the use of it in the world of db implementation. I suggest that you do not quote from, or try to understand such writings.

There are three subjects that you have raised here that I would like to discuss with you. I don't know if I can cover all of it in a single post.

-------------
Surrogates
-------------

I agree with most of what you have written about it. They are definitely bad news.

> As far as I know, this is the first time that someone has explained why the surrogate key is a bad solution.

Definitely not. We knew about it from the early days of the Relational paradigm. I have some of Codd's papers with me at my current location, but not the RM/T paper, so I can't argue specifics about that, but I can argue specifics about the others.

1. Codd decried surrogates. He may well have talked about them in RM/T but I doubt that he supported them, given what he wrote in RM. Perhaps he was just discussing them.

2. I refuse to call them "surrogate keys". They are not Keys by any definition. The insane who twist Codd's definition of Keys in the RM, prove their insanity, they do not prove that they are "keys". They are attached to their databases that are full of Id[iot] surrogates, and anything that appears to attack that is scary, so they react defensively.

3. Most (all?) the authors other than Codd, do not understand Relational Keys; therefore they do not understand the VALUE of Relational Keys. Another reason why you will not get any sense from some posters here, you will get arguments about semantics and intrinsics and implications. Because they do not value RKs, they are open to surrogates, and they usually stamp a surrogate on every table. I value RKs, I am not open to surrogates willy nilly.

4. There are a few papers written by some neurotics, who have since become famous (to me and a few genuine Relational types, they are famous neurotics; to the rest of the world they are just famous). But note that the neurotics cite each others papers, and thus elevate each other, a form of mutual masturbation. These papers support surrogates. They jump through hoops to justify surrogates. They have "normal forms" of surrogates. Of course it is madness, and constantly needs to be maintained, so they now have "normal forms" of "normal forms" of surrogates. (The "normal forms" have serious problems, but let's not get distracted.) All this has the result that most db implementers these days have no knowledge of Relational Keys, or their value, and implement surrogates across the db. Of course, such dbs have no Relational power or speed at all ... but they do not know that, because they have read the famous books and they think they are implementing famous "relational" structures. Db implementers are robbed of the value of the RM, and they are stuck with some monstrosity that they believe is the RM. This is one reason I believe that these authors may be subversives.

5. A pure Relational database will have no surrogates. It will therefore supply OLTP and OLAP from the one single database (Codd did write an OLAP paper). I supply that as a matter of course, with no fanfare. A few others talk about doing the same, but I have not seen a database of theirs that actually does so.

6. There is one condition, and one condition only, that justifies a surrogate, and one cannot get around it. That has not been described by anyone in the entire thread above. But that does not subtract from the fact that whenever you use a surrogate, you break the relational capability at that location. Therefore I cannot state "never use surrogates". No, avoid surrogates as much as possible; use one only when you have to; and when you do, choose the location carefully.

-------------
Plagiarism
-------------

Yes, I understand, from painful experience. So let me start out by saying I am generally on your side, I agree and empathise.

But I think you need to understand that although there are laws against it, etc, it is sadly very common in the west. Especially in the last ten years, where universities are no longer centres of learning; they are centres of programming humans to be herd animals, and to compete without resolution. I am not saying "deal with it", I am saying, protect yourself.

---------------------------------
Highly Normalised Tables
---------------------------------

Let me say that the "normal forms" are forms of insanity. Mathematicians who have no IT qualifications have come up with abstractions about concrete objects. And now, abstractions of abstractions of concrete objects. The most important issue is, thousands of people try to Normalise their data by using these "normal forms", and fail miserably. If you write to the cretins who wrote them and ask for a method, they tell you that there is no method; that their "normal forms" do not have a method of achieving Normalisation, it is a measurement after the fact. Of course, the seasoned practitioner knows that, but the masses don't. So the sad fact is, the name "normal forms" is a lie, they have nothing to do with Normalisation.

They have nothing to do with Relational Keys, either.

And a distinctly different point, they have nothing to do with Codd's Normalisation. Since it is in the RM, we can call it Relational Normalisation. So they have nothing to do with the RM or Relational Normalisation.

I have stopped using the term "normal forms", because I do not want to participate in their fraud.

These neurotics have a veritable orgy of defining "normal forms", citing each other, elevating otherwise hopeless papers. If you write to them about the RM or RKs, or Normalisation, they blink and say they know nothing about it, and request that you communicate only in mathematical definitions. Most implementers can deal with logic and IT definitions, but not mathematical definitions, so I write for that audience. This is a trick the mathematicians who have no IT qualifications use, to avoid robust discussion, to avoid exposure, to maintain the relevance of abstraction. When you realise that the objects they are "abstracting" are not abstract, the bubble is punctured, their value is lost, so they defend their abstraction to the death.

Re your issue with Anchor Modelling. I think the best way to explain what I have to say to you is to provide a little chronology.

1. I was a software engineer for one of the pre-relational db vendors. In those days, computers were expensive; IT people were properly educated; we had standards and we stuck to them. I was privileged to work with great customers such as 3M and Kodak, I was at the cutting edge of db technology (not the abstractions of it). When the RM came out, we all knew what it meant (different from what the masses understand it to be now, for the reasons above); we all worked towards it. As I embraced the RM and moved into working with it with my high-end customers, I was shielded from a lot of the nonsense that is marketed as the "RM". I went into consulting and still enjoyed my high-end customers who understood the RM, and I was delivering high performance RM only. It is only in the last, say six years, where I have started answering questions on fora that I realise the sorry state of the majority of databases, and the sadly misrepresented RM.

2. When 3NF was the highest NF, I was delivering 3NF by definition. When 5NF became the NF to be accepted as minimum in the financial markets, I was asked to go back and "upgrade" one of my previous 3NF databases to 5NF. After studying it, I simply wrote a declaration, at no charge, that the db was 5NF. How ? Because, before the neurotics wrote the definition, I was Normalising as a principle, producing dbs with zero update anomalies (which means zero duplication of any kind). MVDs were pedestrian to me, because I already had RKs.

If you are not neurotic, the FDs taken wholly and completely; the famous "every attribute must depend on the key, the whole key, and nothing but the key" taken to heart is MVDs. I do not need a neurotic definition to figure it out.

3. Many of us have fairly intense requirements in our databases. I had situations where I needed data that was stored in rows, to be displayed as columns, etc. Without duplication, of course. I did not have books on the subject 20 years ago, I just Normalised the hell out of it, and came up with a table structure that served row or column requirements at the same speed. SQL did not provide the constraints I needed, so I wrote a little catalogue, or as some like to call it "meta-data". Over time, I perfected it, and used it in many situations. I did not give it any special name, except "Highly Normalised Table".

4. I do not suffer the "null problem", it is a total non-issue to me, and the great number of papers that have been written about it are, to me, the sad meanderings of neurotics, who get lost trying to find the toilet. I have never, ever, stored a null in a database. I won't get distracted with a discourse re The Null Problem here, but it does deserve one at some point.
- Missing info is a bad name, because, given that the entity is defined, you either enter the whole row or not at all.
- Optional column is a better name, because it identifies the issue being dealt with exactly.
- Optional columns simply need an Optional table. That is a natural result if one Normalises to the point where there are no unpopulated columns or unknowns or "unknowns" in the database.
- I also do not have a problem with the methods that Codd suggested (in the RM/T, I believe), which allow either a bit, or using a value that is out-of-range, to indicate that a column is not being filled. That was demanded in the old days, due to a SELECT being limited to 16 tables; that is no longer demanded, as the limit is now 50 tables.
(I am making this point because you seem to be saying that Codd's RM/T does not handle Nulls correctly: it does, if you get the arguments that he was having with the neurotics out of the way, and just use the ideas.) Sure, the extreme end of it, for guys like me, is that there is no "Null Problem", but for most people there is one, and 35 years after the issue was closed, they are still arguing about it.
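A minimal SQL sketch of the Optional table idea (names invented for illustration only):

-- Illustration only: instead of a nullable column, the optional attribute
-- gets its own table. A row exists only when the value is known, so no
-- null is ever stored.
CREATE TABLE employee (
    employee_no CHAR(8)      PRIMARY KEY,
    name        VARCHAR(100) NOT NULL
);

CREATE TABLE employee_termination (          -- the Optional table
    employee_no      CHAR(8) PRIMARY KEY REFERENCES employee (employee_no),
    termination_date DATE    NOT NULL
);

-- When the wider picture is wanted, an outer join reassembles it
-- (absence appears only in the query result, never in storage).
SELECT e.employee_no, e.name, t.termination_date
FROM   employee e
LEFT JOIN employee_termination t ON t.employee_no = e.employee_no;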

5. Something like 6 or 7 years ago, it was brought to my attention that someone I had never heard of had written a paper identifying 6NF as the ultimate solution to the "null problem" (which I did not suffer, but they wanted me to look at the theoretical alternatives). After I got past the silliness, and got to the definition, lo and behold, it was none other than the Optional Table that I had been using for decades. Now it had an official name. Scores of my tables and my insistence on a catalogue were validated. I did not realise then that he was an abstract neurotic, I was told by many about his works, etc, so I treated him with respect, joined his website as requested; interacted; etc.

6. I specifically wrote to him about the VALUE of his 6NF; about the way I had used it for both Optional Columns (which he had identified) and Highly Normalised Tables (which he had not identified); about the catalogue; about pushing the SQL Committee to incorporate support for it, and thus eliminate the need for the catalogue. Nothing. Instead, more invitations to interact about his baby. One year later, I wrote a reminder. Nothing. Instead, more invitations. I formed the conclusion that he was a neurotic, an abstractionist, and he had no clue about the relevance of, or the application of, his mathematical definition.

Separately, after three years of interaction on his website, I formed the conclusion that his baby had no value at all, except to attack and demean the RM. There is no replacement for the RM. There is no replacement for SQL.

Why is this important? Because it provides further evidence that the neurotic abstractionists have no clue about what they are writing about, about what they become famous for. They do not have any genuine understanding of the RM; they are obsessed with something that is not the RM, and they find problems there. There is no problem with the actual RM. Vendors completed any bits that could be considered incomplete 20 years ago, and these poor people are still discussing its incompleteness.

The NFs are useless, they cannot be used even by the people who wrote them.

7. For people like me, who understand Normalisation as a principle, we just Normalise; we know the RM, and we apply Relational Normalisation. We can pretty much guarantee that whatever the neurotics define as a "normal form" anytime in the future, our databases of the past will qualify for it. As I did with 5NF, and again with 6NF.

8. I already comply with DKNF *as the goal that Codd defined* but could not articulate in those days. The definition of DKNF came later, and it is hilarious (do not use Wiki for anything serious). When I wrote to the author, innocently, I found out that it was just another abstract mathematical definition of the concrete world, and to my horror, that he knew absolutely nothing about the RM (but like the other neurotics, he insisted that he did). He could not even confirm whether the DM I submitted was in DKNF by his definition. The definition has nothing to do with Codd's goal; it has everything to do with orgies and justifying surrogates.

Therefore:
Forget about the "normal forms", they are a disease that prevents you from achieving anything of value.

9. You and 6NF and Anchor Modelling. As per above, for years, I was able to say, there is only one other company that I know of that (by their technical literature) produces structures in the databases that support OLAP and effortless "pivoting" as I do (note I do not pivot, but that is what most people know it as).

Ok, now there are three.

And that is now called 6NF. Well it isn't. The author is clueless to the value. 6NF is a simple definition. The tables aren't a simple definition. They are the result of disciplined Normalisation, which the author has no knowledge about, and does not recognise when it is presented to him. He called his definition 6NF. So we can't, we have to use another name for the object that delivers features that he knows nothing about. If we call our tables 6NF, we elevate him and his definition, and subtract from our techniques, which came years before the definition. So I have reverted to calling those tables Optional Tables and Highly Normalised Tables, depending on their use, because the terms identify what they are, exactly, and what their purpose is.

10. And the last point is this. (I have not read your paper.) I have no problem that you wrote the papers first, and Anchor Modelling implemented a database and wrote their docs five years later. I have no problem that they plagiarised your paper. But there is no way that you can assert that the design or 6NF (not the paper) is yours. I had early forms of it worked out 20 years ago, and final forms say 14 years ago, without naming it 6NF. I don't know how you arrived at it; I arrived at it because (a) I Normalise as a principle, not as a bunch of definitions from the insane, and (b) I was seeking speed for rows-as-columns requirements. Normalisation and performance go hand-in-hand; a progression of one progresses the other.

I am sure that I am not the only one who did that. So it is quite possible that Anchor Modelling came up with the designs, the implementation, etc, all on their own. Although they seem to have plagiarised your paper. Sybase have had a special db offering that provides "columnar access" at lightning speeds, for over ten years. I don't like Anna's response to you; they could have been more direct and given specifics and reasons.

Cheers
Derek

derek.a...@gmail.com

unread,
Feb 4, 2013, 7:18:11 AM2/4/13
to
On Monday, 4 February 2013 22:43:18 UTC+11, derek.a...@gmail.com wrote:
>
> (I have not read your paper.)

I have read this entire thread (awful lot of repetition!) and scanned the "clairified" pdf (read a few pages) on the linked website.

Cheers
Derek

vldm10

unread,
Feb 10, 2013, 10:02:53 AM2/10/13
to
Hi Derek,
Thanks for your support. Your text contains a number of important questions related to db theory. I'll comment on the three subjects that you defined here as "Surrogates", "Plagiarism" and "Highly Normalized Tables". Today I'll write just about Surrogates. In the next few days I will comment on the rest.
First, I would like to clarify my opinion on Codd's contribution to the theory of database. I think that the RM is very important and it is well done. In my opinion, RM is a mathematical theory, and therefore, I think the theory of the database is a mathematical theory. I mean Codd’s greatest contribution is that he has devised the theory of database as a mathematical theory. So I agree with your opinion about importance of RM.
However, I think that there is a certain group of people that unrealistically exaggerate Codd’s contribution to the theory of databases. I've always argued that the theory that explains the story about "relation-predicate-proposition" is Gottlob Frege theory and this mathematical theory is fundamental.

I agree with you that there is a poor use of the so-called IDs. But in this thread, I am writing about the surrogates used by Codd in his paper RM / T. Anchor Modeling and Object Oriented Approach also use the same surrogates. I also think that the RM / T is a bad paper, with many mistakes.

In this thread, I presented five examples that show fundamental weaknesses of the surrogates. I also explained that it would be chaos to try to merge two business applications if their (different) surrogates denote the same entities. I write about these examples because this is the first time they are clearly presented in one place. There was some doubt earlier, some mistakes were observed. But problems with surrogates were never clearly and fully understood. The best-known experts in OO db, for years have tried unsuccessfully to resolve their problems. Now, my solution makes it possible to completely solve OO db. My solution also solves important problems in RM. For example my solution completely solves what Codd unsuccessfully tried to solve with RM / T.

But more important than these specific examples is the theory behind all of this. One of the important theoretical questions is identification, surrogates are about identification. This is about the identification of abstract objects. Some of my abstract objects have more than one identifier. Identifiers of the complex abstract objects exist only in the databases, similar to surrogates, but unlike surrogates, my identifiers are doing well.
All objects that are stored in the human's memory or in the db's memory are abstract objects.
In my solution, objects and attributes have real identifiers, while relationships and states have only identifiers in the database. Identifiers that are only in the database are linked to the identifiers that are in the real world. Note that in my solution, the attributes are treated as identifiers.
You also can use my thread: Does the phrase "Russell's paradox" should be replaced with another phrase? Here I write about the identification of individuals and their relationship to the plurality.
See also my post in this thread since 28.January, 2013, this is also related to identification.
You can find my definition of abstract objects in my paper "Semantic Databases and Semantic Machines" section 1.1 at http://www.dbdesign11.com/

At the end, I want to say that my goal was to give a general procedure to allow proper db design that will allow the full solution to the problem of identification. Identification of objects in the real world and identification of the objects in a memory (remembrance) as well as vice versa.

Vladimir Odrljin

Derek Asirvadem

unread,
Feb 11, 2013, 2:41:14 AM2/11/13
to
On Monday, 11 February 2013 02:02:53 UTC+11, vldm10 wrote:

Vladimir

Thank you for your response.

> First, I would like to clarify my opinion on Codd's contribution to the theory of database. I think that the RM is very important and it is well done. In my opinion, RM is a mathematical theory, and therefore, I think the theory of the database is a mathematical theory. I mean Codd’s greatest contribution is that he has devised the theory of database as a mathematical theory. So I agree with your opinion about importance of RM.

That's not Codd's greatest contribution.

Agreed, he applied mathematical theory to databases (and did so completely, with full examples and discussion). Note that I state it in chronological order; personally, I would not say he "devised the theory of database as a mathematical theory", no, he applied pre-existing mathematical theory to database design.

I disagree, he did not suggest that he invented said theory (others have suggested that he did, which is incorrect).

> However, I think that there is a certain group of people that unrealistically exaggerate Codd’s contribution to the theory of databases. I've always argued that the theory that explains the story about "relation-predicate-proposition" is Gottlob Frege theory and this mathematical theory is fundamental.

Codd did not cite the source of the mathematical theories that he applied in the RM. I am not disputing your attribution of Frege (I have not researched his article on this subject), but what I do know, before I read your post, is that more than one great author before Codd had developed parts of it. Boole's Logical Calculus comes to mind. So at this stage I would not grant that it was one person, or that that one person was Frege (much as I respect him, for the articles that I have read).

I agree about the story and the fundament.

> I agree with you that there is a poor use of the so-called IDs. But in this thread, I am writing about the surrogates used by Codd in his paper RM / T. Anchor Modeling and Object Oriented Approach also use the same surrogates. I also think that the RM / T is a bad paper, with many mistakes.

I think it is a bad paper as well, which is why it is not in my briefcase when I travel. Perhaps explained by the notion that it is exploratory, whereas the RM is definitive.

Ok, AM uses surrogates as per the RM/T. The majority (90%) of the database implementers out there use surrogates in a more primitive manner, with no reference to or knowledge of RM/T. I do not see this as a remarkable difference when discussing surrogates and their problems.

Which begs the question: what exactly are you defining as surrogates ?

[I do not use wiki because it is merely the popular or propagandised view of the uneducated masses, and it cannot be relied upon because it changes all the time. But since I do not have my copy of RM/T with me, if it is alright with you, I will use wiki in this instance. People who read this post on some date other than today should note that the horribly written wiki entry of today will have changed several times, and thus may not reflect the interchange here.]

So let me take the definition of surrogate from today's wiki entry (gratefully "surrogate" and not "surrogate key"). And let me assume that your definition of surrogate is the same.

[While we are here, it should be noted that the RM/T gives a full and proper definition of what some imbecile has "defined" as "sixth normal form", written a "paper" about it, without attribution, and without knowledge of its purpose and use, and that his "colleagues" have cited and elevated in their orgies. One hundred percent plagiarism.]

Let me try to relate the issue in a chronological order or a set of progressions.

2. My Highly Normalised Tables are exactly RM/T, minus the surrogates. In IDEF1X, Entities and Non-entities are named Independent and Dependent, but that is a bit too general re the categories identified in RM/T (eg. I would say that Non-entities/Dependents are not allowed to be related (have relationships); only Entities/Independents can be related). It is more of a stricture to constrain the novice modeller from making silly mistakes.

2.1. In my "5NF" databases (no update anomalies; no duplication *of any kind*; all FDs and MVDs resolved [amongst themselves first, before application to the db!]; all present and future "normal forms" satisfied), given an entity has been Normalised correctly, all the attributes (P-Relations) that are mandatory are located in a single Independent table in which the Relational Key (I will not use the term E-Relation, in order to avoid confusion, and because you seem to have a problem with that) is the Primary Key and not a surrogate. All the optional attributes (P-Relations) are located in Dependent P-Relation tables, where the Primary key is the same as the Independent table to which it belongs. There are no Nulls in the db. There are no ambiguities in the db. No surrogates.

2.2. If any of those mandatory attributes were to be used in OLAP fashion, eg. "pivoting" or "columnar access", then I would remove it from the Independent table, and locate it in a separate Dependent P-Relation table, where the Primary key is the same as the Independent table to which it belongs. No Nulls. No surrogates.

2.3. We could do the same for all tables. Remove the consideration of mandatory/optional, and treat all attributes as P-Relations only. Then we have only Independent tables with RKs, no attributes, and Dependent tables with the RKs and one attribute each. In order to avoid royally confusing oneself, relationships are allowed only between Independent tables.

2.4 Generally, I provide a single View for each independent table and its entire cluster of Dependent tables. (Sometimes a series of Views, but that, and the predicates that drive it, are not relevant to this thread.)

3. Another way of stating that is:
• the Identifiers are all unary relations
• the Dependent tables have all been reduced to binary relations
• the Associative tables are all ternary relations.
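To make 2.1 to 2.4 and the restatement in 3 concrete, here is a minimal sketch in SQL; the names (Employee, Salary, Phone) are invented purely for illustration:

-- Independent table (E-Relation role): the Relational Key, a unary relation
CREATE TABLE Employee (
    EmployeeNo  CHAR(6) NOT NULL,
    CONSTRAINT PK_Employee PRIMARY KEY (EmployeeNo)
);

-- Dependent tables (P-Relation role): the same key plus exactly one attribute, ie. binary relations
CREATE TABLE Employee_Salary (
    EmployeeNo  CHAR(6)       NOT NULL,
    Salary      DECIMAL(9,2)  NOT NULL,
    CONSTRAINT PK_Employee_Salary PRIMARY KEY (EmployeeNo),
    CONSTRAINT FK_Employee_Salary FOREIGN KEY (EmployeeNo) REFERENCES Employee (EmployeeNo)
);

CREATE TABLE Employee_Phone (
    EmployeeNo  CHAR(6)      NOT NULL,
    Phone       VARCHAR(20)  NOT NULL,
    CONSTRAINT PK_Employee_Phone PRIMARY KEY (EmployeeNo),
    CONSTRAINT FK_Employee_Phone FOREIGN KEY (EmployeeNo) REFERENCES Employee (EmployeeNo)
);

-- 2.4: one View over the Independent table and its cluster of Dependents.
-- Absent Dependent rows surface as missing values only in the projection; nothing is stored as Null.
CREATE VIEW Employee_V AS
SELECT  e.EmployeeNo, s.Salary, p.Phone
FROM    Employee e
        LEFT JOIN Employee_Salary s ON s.EmployeeNo = e.EmployeeNo
        LEFT JOIN Employee_Phone  p ON p.EmployeeNo = e.EmployeeNo;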

> > Now, we see that the “RM / T” needs to know all attributes of the entity, so that it is possible a binary decomposition of the entity.

Yes.

But it is not the RM/T that needs to know all the attributes. The designer decides what attributes are mandatory/optional.

[ I think it would be silly to argue that the mandatory attributes are onerous. If an Employee must have a salary, then we should not be attempting to add an Employee for whom the salary is missing or "missing" or "unknown". That has nothing to do with database design or RM or RM/T, it has everything to do with what the business has decided is required for an Employee.]

> > This is due to the use of surrogates.

No.

I have the same requirements in my HNTs, which have no surrogates. That is due to whatever business rules are implemented, not the use of surrogates.

4. At this stage I still have my genuine Relational Keys; no Nulls; no surrogates.

> > The authors of "Anchor Modeling" claim that their model solves problems with nulls. They wrote the following:
“Absence of null values – There are no null values in an anchor database. This eliminates the need to interpret null values [27] as well as waste of storage space.” See
Anchor Modeling – Agile Information Modeling in Evolving Data Environments , Section 9.2 (The article published in Data & Knowledge Engineering in 2010)
Moreover, these authors claim the following: “Anchor Modeling is a technique that has been proven to work in practice for managing data warehouses.” See section 11 of this paper.

To be fair, I read their earlier paper; the one you quote is more about their temporal implementation. Nevertheless ...

Well, the claims are correct, although I would not state the resolution of the Null issue that way.

5. The big difference (the only one?) between my HNTs and AM is that they use surrogates where I use RKs. I think this is the issue you have with them as well. Please agree or disagree with my simple chronology up to this point, before we launch into the next part.

---------------
dbdesign10
---------------

5.1. I would like to be able to say, at this point that my [4] is the same as your "DbDesign 10 Knowledge Data Model", at least in the sense that [4] is an implementation of dbdesign10, and dbdesign10 is a generic or template definition (not an implementation). But I can't say that yet, because:
• the one big difference that stands out (in my reading thus far) is that I totally accept RKs, and RKs are compound keys, that AFAIC cannot be decomposed. Whereas, your "Keys" do not allow compound keys.
• on the face of it your "Keys" are surrogates, but since you decry surrogates, I am sure you are trying to convey something else, that I have not absorbed yet.
••• CarId is the Car Key. CarKey is not a Key, it is a surrogate, and the column is therefore incorrectly named, and leads to confusion.
• (I think dbdesign10 needs to be elevated in terms of specific statements and clarity, because it takes undue effort to understand it, but let's not get into that here)

--------------------------------------
Relational Keys vs Surrogates
--------------------------------------

6. In general, before getting into the specifics of your claims, I agree that surrogates do not work. But I state that, and the reasons why, in quite a different way.

7. You seem to think that in RM/T, Codd defined and therefore prescribed surrogates. I totally disagree with that. If he did, he would be contradicting himself re what he defined in the RM. The essence of "relational" is relation-by-key, as opposed to the previous paradigm which was relation-by-record_number (or pointer). So my view of RM/T is that either for convenience, or to avoid dealing with the Relational Key requirement of Relational databases, which would complicate the new concepts presented in RM/T, Codd used surrogates. I am sure he is kicking himself in the shins, now that surrogates in the RM/T are used for vastly more than he intended. Note that I apply the RM/T in [4] and I have no surrogates; I do not accept the absence of surrogates in my [4] as indicating that it does not comply with RM/T.

I am viewing RM/T in the context of RM. I do not view it as a stand-alone paper, or as a discipline. Direction and guidelines, yes, but it is not complete enough to be viewed as a discipline. I can erect a perfectly good Semantic Model using IDEF1X (plus a bit of textual documentation, which would be required for any Relational model), using primarily the RM as doctrine, discipline, mandate, and secondarily the RM/T as guidelines and direction. And do so without any angst or contradiction.

From my various reading, I notice that people either use RKs and refer to the RM, xor they use surrogates and refer to the RM/T. The latter is dishonest; as per the two paras above, they take the RM/T out of context.

I have no need to change the papers that were written, or to write books interpreting them, or misrepresenting them (the majority), so I hope I do not fall into your category of people who elevate the RM/T to something that it is not.

You seem to be taking RM/T as a stand-alone paper, a decree. Well, it isn't. But if you do take it like that, then yes, it would fail.

8. I think AM are foolish in giving up RKs and implementing surrogates. Not for the reasons you mention, but because the displacement of the RK-as-PK with a surrogate at each location in the db eliminates the essential Relational property of the RM, deems the database non-Relational, and is a complete loss of Relational power between the *parents* of the subject table (not the subject table itself) and the child tables of the subject table.

8.1. I disagree that AM *substitutes* or replaces the RK with a surrogate. Clearly, one of their attribute tables (P-Relation) contains the RK, the K-Relation or K-Role. So the surrogate is used in the normal manner, as a permanent Identifier, a substitute PK, that is an FK in all its child tables.

9. This is doubly foolish in the context of a data warehouse, where Dimensional access is implemented. AFAIC, If the db is Normalised to the degree required for DW use, ie. my HNTs, well then, the Dimensions are already there, as ordinary Entities (per RM/T) or Independents or E-Relations, with RKs, ready for use, in every table in which each Dimension has some content. There is no additional work to do, to produce a DW, or Dimensions. Note it is relying on RKs. So the displacement of RKs [8] where Dimensional access is required is doubly foolish: Relational capability is broken between every related pair of tables. That results in many more joins being demanded, than for a [4] database.

====================

10. Now let's take your specific points.

> > Now we have the following situation:
> > 1. I have shown that surrogates can not manage "nulls."

Reference please; I cannot easily find a passage that shows that explicitly.

On the assumption that I understand where you are heading with this one, to be clear, it is the decomposition to binary relations that eliminates Null. Whether the RK-as-PK is displaced with a surrogate-as-PK has nothing to do with it, for or against managing or removing Nulls.

> > 2. Authors of anchor modeling claim that their model manages "nulls", even better than existing solutions.
Since Anchor modeling using surrogates, then it seems to me that the statements of these authors are not accurate.

If by "manage" they mean eliminate, I agree, their design does.

If by "existing solutions", they mean generally "most implementations", I agree.

If I take "existing solutions" to be the body of technical information available to us, which includes the RM and the RM/T, then their claim is nonsense.

Surrogates do not imply elimination or removal of Nulls. The fact that they use surrogates is bad news, but it does not inhibit the removal of Nulls, or demand Nulls.

> > My results related to the design of databases that manage the history of events, the first time I put on this user group in September 2005. The name of the thread was "Database Design, Keys and some other things." In this thread, Joe Celko posted a very good comment regarding surrogates:

(Allow me to ignore the temporal issue, until it becomes something that I cannot ignore. I appreciate that dbdesign10 is primarily to support temporal requirements.)

> > David Cressey: 9/27/05
> > A VIN, a bank account number, and an SSN are all surrogate keys. <<

> > Celko:
> > No; read Codd's definiiton of a surrogate key. These are all
industry-standard, externally verifiable keys with known validation
rules. Honking BIG difference!! The big part of this is that they
are EXTERNAL to the database.

For the record, Celko is an idiot, and Cressey is an even bigger idiot, that even Celko can destroy.

I would not state it the way Celko does, he has it backwards. I would state:
• read Codd's definition of Relational Key, in the RM
• by that definition, a surrogate is not a Key
• a surrogate is entirely internal to the database, invisible to the user
• VIN, BankAccountNo, SocialSecurityNo are each data, clearly Keys (unique row identifier from data) completely visible to the user, and relied upon as Keys both inside the database and by the user.

(Since the purpose of any database is to record facts about the real world, well, all the data in the database is, er, um, gee whiz, "external".)

> In this thread, I presented five examples that show fundamental weaknesses of the surrogates.

1. Agreed

2. Agreed.

3. I do not like your proposition or the way you have presented it. I think I understand your intent, so let's continue, rather than be hindered by that.

> > Note that a surrogate key is only in the database, it is not in the real world.

Noted.

> > The above decomposition is very bad. For instance, there is the
question: how will a user find the real world entity that has the
attribute C=c3 and the surrogate key K=k4?

Date, Darwen and some others use (a) examples that are ridiculous, then they (b) propose some absurd nonsense, which can only be entertained if suspension of disbelief (hollywood style) has been achieved (hence the hollywood style presentation in the "scientific" papers and books), then they (c) propose the danger involved is huge, due to some difficulty that the "user" will have when inserting rows into the database, which a user never does (more suspension of disbelief required). All that, each point, is laughable, and dishonest, the whole proposition is laughable, but those who have been programmed for hollywood suck it up. Honest people present examples from the real world (no suspension of disbelief required) that apply to the proposition, thereby giving it a credible foundation, and do not suggest dangers that do not exist in the real world. At best, these disgusting papers are entertainment, but they are marketed as "science".

I think you are honest, you have put (a) and (b) squarely. But you damage its credibility because your (c) is laughable. Users do not walk up to databases (let alone highly normalised ones, at the cutting edge, which are not common), and make changes to single RM/T rows. No. First, they are isolated from the low level implementation of the database (if anything, they will see Views [2.4] ); second, there will be various constraints in place that prevent incorrect updates; third, whatever they are attempting will be encapsulated in a transaction (that converts the logical business action into a series of single-row updates, all of which together constitute an Atomic change to the db).

So, no, the user will not be doing any such thing, and as we agree (I think) the surrogate is not visible to them anyway, so they will not be looking for k4 that they cannot see. The user will execute the relevant transaction, that looks up C=c3, and finding that 3 identifiers exist, it will fail. Or else the user will look up c3 on a search window first, find that 3 logical 5NF rows exist for it; choose one via proper Keys (let's assume column A is the Key), then execute the relevant transaction using Key A{value}, which will succeed or fail. Since column A is not given in your example, there is not enough detail to suggest either success or failure.

If you clean up the example, and tighten up the failure, you can use it. But as it is, you damage your paper by using the dishonest method (c).

3.1. I do not accept that "[Codd] was unsuccessful at [decomposition of a relvar into binary relvars] and was not able to show how this is done. " I think it is clear in RM/T, and I do it all the time. There may be marginal cases where the technique does not apply or where further techniques are necessary in order to provide resolution, but that does not subtract from the technique given, and you are not one of those idiots who argue at the margins (straining at the gnat and swallowing the camel).

4. I cannot say one way or the other, since I do not have RM/T paper at hand. But I will say that I would not show a surrogate to the user, or give the user a heading for the surrogate column.

5.

I cannot find the fifth one.

So I agree with your proposition that surrogates are bad.

Note that this proposition appears to contradict your dbdesign10, which is based on surrogates (incorrectly named "Keys").

> I also explained that it would be chaos to try to merge two business applications if their (different) surrogates denote the same entities. I write about these examples because this is the first time they are clearly presented in one place.

That is not correct. In the RM, Codd identified the exact issue of surrogates, and of "merging" data, although he used terminology from that age. The former is clearly defined. He stated the latter point thus:

"The simplicity of the array representation which becomes
feasible when all relations are cast in normal form is not
only an advantage for storage purposes but also for com-
munication of bulk data between systems which use widely
different representations of the data."

Of course the "normal form" referred to, is that which he provided in the RM, converting Hierarchical Normal Form into Relational Normal Form, and further, he provided details re the superiority of Relational Keys over surrogates.

And of course, most capable people, even if they had not read Codd, when they attempted to communicate bulk data between systems, without using Proper Keys, found that out. Which has been happening since the 1970s. It is hardly a first time or first time clearly presented.

Nevertheless, I agree with the main point, that surrogates cause chaos. Particularly because of the loss of Relational power [8], the exponential number of constraints that are required in its absence, and the consequences therefrom. But it is possible for someone who is fixated on using surrogates, *and* who is diligent in their implementation of the morass of constraints, to substantially reduce the chaos. They still have the poor performance.

> > There was some doubt earlier, some mistakes were observed. But problems with surrogates were never clearly and fully understood.

I dare say, I understand it better than most, and certainly better than authors of books allegedly about the RM.

I dare say, that *if* Anchor Modelling databases do not suffer from the consequences of data and referential integrity loss, then they have a good handle on it as well.

> > The best-known experts in OO db, for years have tried unsuccessfully to resolve their problems. Now, my solution makes it possible to completely solve OO db. My solution also solves important problems in RM. For example my solution completely solves what Codd unsuccessfully tried to solve with RM / T.

I do not argue the statements above because I have not finished reading your paper; the one thing I disagree with is that there are "important problems in RM". Please identify specifics, and start a new thread for that subject.

(OO experts have not produced anything that lasted, they come and go like bright flashes of algae in the ocean, so I suggest you do not worry about them.)

> But more important than these specific examples is the theory behind all of this. One of the important theoretical questions is identification, surrogates are about identification. This is about the identification of abstract objects. Some of my abstract objects have more than one identifier. Identifiers of the complex abstract objects exist only in the databases, similar to surrogates, but unlike surrogates, my identifiers are doing well.
>
> All objects that are stored in the human's memory or in the db's memory are abstract objects.

(Again, the question begs: what is the exact difference between your "identifiers" and surrogates?)

I don't agree. In order to be able to discuss anything reasonably, we need to be using the same terminology, and meaning the same things. The dishonest people who write nonsensical papers and cite each other in mutual masturbation do that (abusing terminology, creating new conflicted application of existing terminology, and thus destroying it), on purpose. That is clearly not what you are doing, but you are using established terms in a way that means something else, which I would ask you to correct: either use the established term strictly in the established sense, or create a new term.

There is nothing abstract stored in the database, about:
• a car
••• the attributes of a single car
• a person
••• the attributes of a single person
••• at any given time

AFAIC, all data in any database is non-abstract.

Likewise, I think you should tighten up your loose use of these terms:
• Key
• Identifier (in the RM & IDEF1X it explicitly means Key)
• Surrogate (use record or slot identifier, and do not use terms "row" or "key" in that definition)

Following that, statements in your para above are ambiguous and confusing. And I do not wish to dismiss it wholesale.

> In my solution, objects and attributes have real identifiers, while relationships and states have only identifiers in the database. Identifiers that are only in the database are linked to the identifiers that are in the real world. Note that in my solution, the attributes are treated as identifiers.

That is even more confusing than the previous para.

If the attributes are treated as Identifiers, then your solution *IS* the RM/T, unmodified! ... but you say it is not, which begs the question: what is the exact difference ?

> You also can use my thread: Does the phrase "Russell's paradox" should be replaced with another phrase? Here I write about the identification of individuals and their relationship to the plurality.

If you don't mind, I would like to stay with one set of issues until we close them, rather than starting on another set of issues.

> See also my post in this thread since 28.January, 2013, this is also related to identification.

> > Note that this proposition appears to contradict your dbdesign10, which is based on surrogates (incorrectly named "Keys").
> On the other hand, identifier, which is given in my solution, has many advantages compared to surrogates.

Ok, so surrogates and "identifiers" are clearly different from your perspective. But that difference has not been identified clearly, at least to me. You mix up the terms in your different papers. I need a clear statement that uses terms in their established sense only; provides new terms for that which is not established; and does not mix up the terms.

> However, notice that my identifier also has certain problems with nulls. If I have this key and nulls, then I can solve many of the mentioned problems. The key, which is given in my solution, could find a real object and vice verse. If I have nulls, then I can apply three-valued logic, or I can extract the tuples with nulls and implement some of programming languages, etc. It is not possible apply nulls, surrogates and the three-valued logic, all together.

That makes me baulk. Now you are confusing "identifier" and "key" within two paras. Second, you have real objects (not abstract ones, which you suggest elsewhere are the only content of a database). Third, and this is a long point, a Key cannot be Null; so the rest does not apply; even if for some reason the user had some problem in finding the Keys for which (in the earlier rendition) attribute C has the value c3, we would be relying on Keys, not attributes, for identification of the qualifying rows. In this rendition of the same example, you are using k3, which is a different problem again, nonetheless a problem.

Last, that has the stinking thinking of those who *implement* the "Null problem" in their databases. If they did not blindly implement that, they would not have the "Null problem". I do not think you are blind.

> Today more than 90% of the database has identifiers that are part of my solution; these are industry-standard identifiers. For entities with these keys does not make sense to introduce surrogates.

Yes, if you mean Identifiers in the normal established sense, but since they have not read your paper, you cannot suggest that they are using your solution, or part of it. Further, I dispute the 90%: certainly good databases have good Identifiers, but they are 5 or 10%; the other 90% are chock-full of surrogates and the Identifiers are badly handled.

> So, for over 90% of today's databases, it is nonsense to apply surrogates. This number is an astonishing example of the amount of misunderstanding. I am referring to the wide usage of surrogates in scientific papers, which are related to OOA, RM / T and Anchor Modeling.

Yes, I decry the wide use of surrogates in "scientific" papers, especially if they purport to explore aspects of the RM (but not RM/T, of course). OO is a joke. AM: well they can, if they have worked out the constraints diligently.

The number for me is 5 to 10%, the surrogates-users are 90%; they are already badly damaged; used; and abused. Your point is valid.

> On the other hand, the identifiers that are given in my solution do not have to be like industry-standard identifiers. Every company can define its own system of identifiers and identification, which is based on my solution. This db design is a great advantage and a great independence for each company. In this database design, it is essential that these identifiers are placed on the real objects of our business, for example, these identifiers should be in the documentation, receipts, invoices, etc. In this way, the identification is completely under our control. Of course, there are many variations on this solution.

That remains to be seen. The requirement to make an *additional* Identifier (correct use of the term here, in your para) visible and maintained is an onerous one. I think (not sure yet) that your purpose is to support temporal requirements. If that is the case, there are other methods for that. Yours looks like it (a) implements surrogates, (b) elevates them to Identifiers, (c) demands the business take on the burden. Xor, yours *is* RM/T, with some minor refinements.

> Imagine now a situation that everyone uses some of their surrogate keys. For example, that instead of the ISBN standard for books, every project leader uses his surrogates system. It is obvious that such a solution is impossible in real life. It is also obvious that if we use the ISBN identifier, then we do not need the surrogate key, at all.

I agree with the first part.

I disagree that we do not need surrogates *at all*. As per my first post, there is one condition where we have to use them, and when we do, we have to do so properly; in that situation, it remains a surrogate, it is not elevated to a Key.

> In this thread I pointed to a large group of objects from the business applications that can not be resolved with surrogates. This is the example about an Honda dealer who sells Honda cars, which all have the same attributes. Here we can not use surrogates, because they would show the same entities in a database. Therefore, we must introduce the VIN. And, again, it is obvious that if we use the VIN identifier, then we do not need the surrogate key, at all.

No.
• the dealer is crazy if the system does not use VIN as an Identifier, agreed
• they may have to use a surrogate *as well*
••• a surrogate is always an *additional* key and index. It does not *replace* the Key or Identifier. It is not either-one-or-the-other. It displaces the position of the PK to an AK, and takes the position of the PK. That is why we call it a surrogate (it has the same meaning in English; the term would not be correct otherwise). The original PK (VIN) cannot be released. It is now an Alternate Key (unique, Identifier, visible). The surrogate (invisible) is now the PK, and the FK in child tables.
• so no, I disagree that the surrogate is not needed *at all*. Sometimes yes; sometimes no; sometimes essential. A sketch of the arrangement described above follows.
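A minimal sketch of that arrangement in SQL, using the VIN example; the names and datatypes are illustrative only:

-- The surrogate is an *additional* key; the VIN remains the visible, unique Identifier (Alternate Key)
CREATE TABLE Car (
    CarId  INT          NOT NULL,  -- surrogate: generated, invisible to the user, used as PK and FK in children
    VIN    CHAR(17)     NOT NULL,  -- the real Identifier: not released, merely displaced from the PK position
    Make   VARCHAR(30)  NOT NULL,
    CONSTRAINT PK_Car PRIMARY KEY (CarId),
    CONSTRAINT AK_Car_VIN UNIQUE (VIN)
);

-- Child table carries the surrogate as its FK
CREATE TABLE CarService (
    CarId        INT  NOT NULL,
    ServiceDate  DATE NOT NULL,
    CONSTRAINT PK_CarService PRIMARY KEY (CarId, ServiceDate),
    CONSTRAINT FK_CarService_Car FOREIGN KEY (CarId) REFERENCES Car (CarId)
);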

> Now after the above examples, we can set an important issue, it is the following question: is there a good theory of the surrogates. Note that such theory does not exist and this is the main problem with surrogates.

Let me assure you that it exists, and I have used it since the 1980's. I do not have it written up as a theory; it is a practice or standard, explicitly identified in the documentation of every database that I wrote since then. I also write it up when I execute assignments, eg. a Technical Audit of a database, when they use too many surrogates, or use *all* surrogates. I am quite sure that other capable Relational types have such a theory or practice, that I am not alone.

I am also quite sure that 90% of the database implementers out there have no clue about surrogates: they have neither theory nor practice/standards statement; they use surrogates on every table without appreciating the problems, and without adding the required protection.

> You can find my definition of abstract objects in my paper "Semantic Databases and Semantic Machines" section 1.1 at http://www.dbdesign11.com/

I am still labouring with dbdesign10. But I will get around to it.

> At the end, I want to say that my goal was to give a general procedure to allow proper db design that will allow the full solution to the problem of identification. Identification of objects in the real world and identification of the objects in a memory (remembrance) as well as vice versa.

That is great, and I would love to see a solution that does that, defined clearly. You do not mention here, but it appears to be a stated goal in dbdesign10, that your solution provides a temporal database.

But there seems to be a contradiction between that concept of "object" (para above) and the concept that "objects" are abstract (elsewhere).

> For example my solution completely solves what Codd unsuccessfully tried to solve with RM / T.

If that is the case, you should write that up clearly, as part of your paper (appendix ?): what RM/T provides; what RM/T does not provide or what it is missing or how it is incomplete: what your solution provides (that "completely solves" it).

To iterate that which I have already detailed, I do not view RM/T as either stand-alone or incomplete. I have been able to use RM/T in the context of RM, for the stated purpose and beyond, therefore I do not find it incomplete, or that it does not solve the problem identified in the synopsis. Since I use Relational Keys in 100% of my databases, and I am aware of the problems with surrogates, I use them (a) only when I have to, and (b) implement the requirements for the data & referential integrity that is lost by their use.

I would like to understand your paper and solution thoroughly, because I have a lot of experience implementing temporal requirements into genuine Relational databases, without any of the insanity proposed by Snotgrass, Wilderstein and Escher. But I have the biggest project of my life coming up, and it is going to be 100% temporal. I am currently polishing up my own temporal documentation and code, that is required (I have an extension to the SQL catalogue). I also have a completely different method that has a theoretical foundation, but I have no substantial experience (real implementations) with it, and it is surprisingly unheard of, not discussed. For this reason, I want to understand your solution and either accept it or reject it, sooner rather than later. Therefore I would ask you to maintain focus; to avoid repetition; and to just answer the points that are not closed (by number or by short quote).

I will erect a few models, which will hopefully be ready after your response to this.

Cheers
Derek

vldm10

unread,
Feb 13, 2013, 1:04:14 PM2/13/13
to
Hi Derek

I divide databases into two kinds: Simple databases and General databases. See my paper “Database design and data model founded on concept constructs and knowledge constructs” section 1 at http://www.dbdesign11.com/
My papers and solutions are about General databases. Existing database theory mainly covers what I am calling Simple databases. RM/T also belongs to Simple databases.
What is the difference between General and Simple databases? I'll answer this question in the following way: I'll list what is solved in my papers but has not been resolved in Simple databases. So this is the list:

1. I introduce an effective solution which decomposes concepts, relations, and files into the corresponding binary data structures. This decomposition is related to Simple databases.
See "Some ideas about a new Data Model" Section 4 from May 15, 2006 at http://www.dbdesign10.com/
Note that the conditions for binary decomposition determine how to design a db. In other words, I do not need normal forms; I need appropriate conditions that will determine, from the beginning and immediately, a good db design. I wrote: "Today's database theory holds that an entity (that is, the corresponding relation) must be normalized. Contrastingly, in our data model the general case is that the entity has intrinsic properties. This means that properties of the entity take values freely. This entity corresponds to a relation which has mutually independent attributes. Once again, this is a general case in database design. If the database designer needs to introduce constraints in his application, then he needs to define those limitations on the mentioned entity with intrinsic properties. For example, he can define functional dependencies, after he has defined the entity (in the general case)." See "Semantic Databases and Semantic Machines", sections 5.1(2) and 5.9, at http://www.dbdesign11.com/ Of course it would be best to use constraints only at the level of the binary (decomposed) structures.

These conditions show that 6NF cannot solve these problems. For example, if you have a relation with 5 mutually independent attributes, then 6NF cannot decompose this relation. However, in my model, mutually independent attributes are the basic condition for the binary decomposition.
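Purely as an illustration of the decomposition being described (my own sketch, with invented names), a relation with a key K and five mutually independent attributes, and its binary decomposition:

-- Original relation: key K plus five mutually independent attributes
CREATE TABLE R (
    K  INT NOT NULL PRIMARY KEY,
    A  VARCHAR(30), B VARCHAR(30), C VARCHAR(30), D VARCHAR(30), E VARCHAR(30)
);

-- Binary decomposition: one structure per attribute, each keyed by K alone
CREATE TABLE R_A (K INT NOT NULL PRIMARY KEY, A VARCHAR(30) NOT NULL);
CREATE TABLE R_B (K INT NOT NULL PRIMARY KEY, B VARCHAR(30) NOT NULL);
CREATE TABLE R_C (K INT NOT NULL PRIMARY KEY, C VARCHAR(30) NOT NULL);
CREATE TABLE R_D (K INT NOT NULL PRIMARY KEY, D VARCHAR(30) NOT NULL);
CREATE TABLE R_E (K INT NOT NULL PRIMARY KEY, E VARCHAR(30) NOT NULL);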

2. I introduce an effective solution which decomposes concepts, relations, and files into the corresponding binary data structures. This decomposition is related to General databases.
This decomposition immediately follows from my results published in 2005. For more details see my paper "Database design and data model founded on concept constructs and knowledge constructs" section 4.2.9 at http://www.dbdesign11.com/

3. My solution enables us, for any data, to determine who constructed that data. These structures are designed so that no one can knock them down. So this db design enables the following: it is always possible to determine who stored any data in the db.

4. In my solution I have introduced completely new techniques. These techniques allow databases to maintain the history of events. They also allow some other capabilities specific to General databases. Some of these techniques are as follows:
a) Immutable identifiers;
b) n-temporal model;
c) Full support for working with states of entities and relationships;
d) Working with multiple identifiers on a single object, where these identifiers are related to each other;
e) A minimal set of events that describe the data in the database;

5. My solution is the first which, totally and without exception, allows a database that fully maintains the history of events.

6. My solution constructs databases which are supported by the Internet. So online work is immediately supported with these databases, and all data from the database is immediately accessible. As for Internet-supported databases, the following two advantages of my model are important:
a) The history of any data
b) The ability to accurately identify who is responsible for certain data, including incorrect information.

7. In my papers the concept is well defined. Also, all other things related to concepts are defined. To understand the importance of introducing the definition of the concept in my papers, note the following:
In the famous paper by P. Chen, "The Entity-Relationship Model - Toward a Unified View of Data", there is nothing at all about concepts. In the ER model, there is no definition of the concept, although this model is called a conceptual model.
This is the most important part of my model. My model is primarily conceptual. I introduced the concept of state; I also introduced identifiers, events and knowledge into the concepts. Russell's paradox is resolved. The binary (atomic) concepts have been introduced. I think this is the first real conceptual model. Note that this model was done in my paper from 2008.

8. My model does not require data warehouses. It is sufficient on its own. The reason for this is that any data that has ever been entered into the database is stored permanently in the database.
Another reason is that my database is supported online, so I can combine my data with data from the Internet.

9. My model is not a temporal db. My database is event-oriented, which is much more general than a temporal db.

10. My solution, as mentioned above, is based on concepts. I actually use G. Frege's theory a lot. To this theory, I have added a part which I think was missing. This part is about identification. I expanded Leibniz's Law; see section 5.6 of my paper "Semantic databases and semantic machines".
In connection with Leibniz's Law, intrinsic object properties form an object as an independent entity in relation to other entities. Extrinsic object properties form an object as a dependent in relation to other entities.
Also, I did a lot of other things related to identification. Identification of individuals and their link to pluralities is resolved in my model. This enables databases that contain individual objects. Identification is defined as a mind-world link. So concepts and identification are two mind-world links.
The relation (3.3.3) (see section 3.3 of my paper "Database design and data model founded on concept and knowledge constructs") is important because it defines the relationship between concepts and identification; that is, it determines the relationship between the relation of satisfaction and the corresponding identification.

11. My model provides a simple and direct mapping between different data models.
In the case of Simple databases, identifiers of entities or relationships define the mapping between the corresponding binary structures.
In the case of General databases, identifiers of states of entities or relationships define the mapping between the corresponding binary structures.

12. Knowledge is introduced in a new way and has great application in my model. Knowledge in my model consists only of facts. The facts in my model are always atomic. The facts in my model are generated from binary structures; they are almost always determined by relation (3.3.3) mentioned in this post (see point 10 above).
Another important thing about knowledge is that knowledge is structured. One piece of data can have considerable associated knowledge, for example:
Knowledge1 can be: who said that this attribute has this value in the real world, and when.
Knowledge2 can be: who received this information, and when (who from the IT department).
Knowledge3 can be: who entered this data into the db, and when.
Knowledge4 can be: who transferred this data into another db, and when.
So just for one fact, the associated knowledge can be substantial. Of course you can add more knowledge to this data if you need it.
(This example demonstrates the knowledge stored in the db for just one piece of data; a sketch in SQL follows at the end of this point.)

I distinguish between a fact and a factual sentence. Also, a subject is aware of facts. Knowledge is a set of facts, so a subject is aware of knowledge; I mean that knowledge is not just some stored facts. See my paper "Semantic databases and semantic machines" section 1.

You can see structured knowledge in my paper "Database design and data model founded on concept and knowledge constructs", sections 3.6 - 3.9.
It is important to realize that knowledge is distributed in the structures, so you can add different structures of knowledge to one piece of data.
It is also important to realize that knowledge, introduced in this way, is much more general than what some have called "metadata". I personally think that db design that uses the idea of "metadata" is incorrect and very bad.
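As a rough sketch only, and not the structures from the papers themselves, one way to attach the four layers of knowledge from point 12 to a single atomic fact could look like this in SQL (all names invented for the illustration):

-- One binary (atomic) fact: an attribute value for some entity
CREATE TABLE FactValue (
    FactId     INT          NOT NULL,
    EntityId   CHAR(10)     NOT NULL,
    AttrValue  VARCHAR(60)  NOT NULL,
    CONSTRAINT PK_FactValue PRIMARY KEY (FactId)
);

-- Structured knowledge about that one fact: each row records who did what, and when
CREATE TABLE FactKnowledge (
    FactId       INT          NOT NULL,
    KnowledgeNo  SMALLINT     NOT NULL,   -- 1 = stated, 2 = received, 3 = entered into db, 4 = transferred to another db
    Person       VARCHAR(60)  NOT NULL,
    OccurredAt   TIMESTAMP    NOT NULL,
    CONSTRAINT PK_FactKnowledge PRIMARY KEY (FactId, KnowledgeNo),
    CONSTRAINT FK_FactKnowledge FOREIGN KEY (FactId) REFERENCES FactValue (FactId)
);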



-----------------------------------------
As seen here, Codd's RM/T paper does not address these areas at all.

Those 12 points above are a rough division; there are other details. I completed very complex projects using the above elements. The real strength of this model can be seen by applying it to enormous and complex db applications.

I do not know if you're familiar with the semantic web and the corresponding software, which is led by the W3C.

This is about the theory and software for data on the Web.

It seems to me that my solution is much better. My theory of concepts and semantics, and the possibility to maintain a complete history of events, clearly show the benefits of my solution. My model allows you to work with individuals (objects) and with pluralities. It enables the organization of data using binary (atomic) structures. Apparently the atomic structures can be distributed to various web resources.

I will post one more message and then we can analyze an example. It will be good to understand example 8 from my paper "Database design and data model founded on concept and knowledge constructs”, section 6.5. If you understand that example, then you can create your own db application.

Vladimir Odrljin

Derek Asirvadem

unread,
Feb 13, 2013, 11:11:04 PM2/13/13
to
On Thursday, 14 February 2013 05:04:14 UTC+11, vldm10 wrote:
>
Vladimir

I was going to say "Thank you for your response", but I can't, because you have not responded. I do not understand why you are posting new information, when I have posted some views, and asked specific questions. I won't be going through this new information until I receive answers for the outstanding requests.

> section 6.5. If you understand that example, then you can create your own db application.

The issue is not about me (or anyone else) understanding your paper ... it is about you writing a paper that is understandable, by technical people. So far the paper is not clear, and I have to labour through it.

Until I understand your solution, I cannot confirm if any of the claims you make, are true or false, or analyse an example.

Please do not post an example or more information. Just answer the questions in my previous post. Otherwise we cannot progress.

Cheers
Derek

vldm10

unread,
Feb 19, 2013, 1:50:11 PM2/19/13
to
Hi Derek

> I disagree, he did not suggest that he invented said theory (others have suggested that he did, which is incorrect).

I do not think that E. Codd invented significant mathematical theories, but I think he did some things that should be classified as applied mathematics (relational algebra, functional dependencies). He, along with others, began to establish this theory of databases as a mathematical theory. This paved the way for the further development of the theory of databases as a science.


> Which begs the question: what exactly are you defining as surrogates ?


1. Before I give you a definition of the surrogate key, I would like to say that the surrogate key is not the main thing here. The main thing here is the decomposition into binary (atomic) structures. This is crucial not only for database theory but also in some other areas.

So far in this thread, I gave examples showing that the surrogate key does not work. Some of these examples show that the surrogate key cannot be used for over 90% of business applications. These examples were more of a practical character, that is, they were not theoretical.

Now I would like to present some theoretical problems related to RM and RM/T which are very serious. I will try to present them as far as is possible on the user group. First, E. Codd did this decomposition of relations into binary relations without any proof. Let me now give you examples which are of a theoretical character and which show that RM cannot solve some serious problems. These problems form a large field in the theory of databases.
Example A:
State1. Person X, from a company that supplies consumers with electricity, reads the device that measures the energy consumption for a person Y. Person X writes down on paper that, on May 1st, 2011, person Y had consumed a total of 128 units of energy.
State2. Person X submits this list to the IT department on June 1st, 2011, a month later.
State3. The person from the IT department has put this list in a drawer and forgotten about it. After two months, he was reminded of this list and promptly handed it over to the person doing data entry.
State4. Thus, the data was entered into the database on September 1st, 2011.
State5. Then the IT department filed a lawsuit against a person Y, because this person has not paid the electricity consumed.
State6. However, it turns out person Y passed away on May 15th, 2011.
State7. During the trial it is determined that the person who entered the data made a mistake and entered 728 instead of 128 (as the amount of energy consumed).
The son of person Y, who is a good lawyer, represents his late father in court.

This kind of problem cannot be solved using RM/T. It cannot be solved using the RM (or other db models). Of course, for this kind of problem, the decomposition into binary relations with surrogates, as demonstrated by Codd, does not work at all.

This set of problems was solved for the first time in 2005, by my solution, which was presented on this user group in 2005.

If you want to see how to do the decomposition into binary structures for these cases, then please see my post from February 13, 2013 (the second case).
In that post from February 13, 2013, I explained the problems that General databases can solve while Simple databases cannot. I use the term "General database theory" ("General databases" for short) for the group of database fields that I roughly presented in my post from February 13, 2013 in this thread.

Keep in mind that "General databases" do not have Delete and Update operations, so there are no Delete anomalies or Update anomalies. The purpose of a General database is to keep all the data that have ever been entered into it; therefore there are no Insert anomalies.
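To make this more concrete, here is a minimal sketch, in Python, of such an insert-only store of states for Example A. The names (StateRecord, record_state) and the date of the correction are only illustrative; they are not taken from any paper.

# Minimal sketch of an insert-only store of states for Example A.
# All names and the correction date are illustrative only.
from dataclasses import dataclass
from datetime import date

@dataclass(frozen=True)
class StateRecord:
    state_id: int        # identifier of the state (an abstract object)
    entity_id: str       # identifier of the real-world entity (the customer/meter)
    attribute: str       # which attribute this state describes
    value: object        # the recorded value
    event_date: date     # when the fact held in the real world
    recorded_date: date  # when the fact was entered into the database
    recorded_by: str     # who entered it

log = []                 # insert-only: there is no Update and no Delete

def record_state(**fields):
    state = StateRecord(state_id=len(log) + 1, **fields)
    log.append(state)    # earlier states are never touched
    return state

# State4: the erroneous reading (728 instead of 128) is entered on September 1st, 2011.
record_state(entity_id="customer-Y", attribute="consumption", value=728,
             event_date=date(2011, 5, 1), recorded_date=date(2011, 9, 1),
             recorded_by="data entry person")

# State7: the error is discovered during the trial; the correction is simply
# a new state, so the database still shows who entered 728 and when.
record_state(entity_id="customer-Y", attribute="consumption", value=128,
             event_date=date(2011, 5, 1), recorded_date=date(2012, 2, 1),
             recorded_by="correction after the trial")

for state in log:
    print(state)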

I would like to emphasize that the paper "RM/T" has so many practical and theoretical errors that it is scandalous that the paper was published.


> So let me take the definition of surrogate from todays wiki entry (gratefully "surrogate" and not "surrogate key"). And let me assume that your definition of surrogate is the same.

I take the definition of the surrogate from RM/T. This is Codd's only official definition of the surrogate key.
By the way, today I visited the Wikipedia page on the surrogate key. It is really sad what some people write.


> ---------------
>
> dbdesign10
>
> ---------------
>
>
>
> 5.1. I would like to be able to say, at this point that my [4] is the same as your "DbDesign 10 Knowledge Data Model", at least in the sense that [4] is an implementation of dbdesign10, and dbdesign10 is a generic or template definition (not an implementation). But I can't say that yet, because:
>
> • the one big difference that stands out (in my reading thus far) is that I totally accept RKs, and RKs are compound keys, that AFAIC cannot be decomposed. Whereas, your "Keys" do not allow compound keys.
>
> • on the face of it your "Keys" are surrogates, but since you decry surrogates, I am sure you are trying to convey something else, that I have not absorbed yet.
>
> ••• CarId is the Car Key. CarKey is not a Key, it is a surrogate, and the column is therefore incorrectly and named, and leads to confusion.
>
> • (I think dbdesign10 needs to be elevated in terms of specific statements and clarity, because it takes undue effort to understand it, but let's not get into that here)



In this example, I have two identifiers, both belonging to one database structure. The first identifier identifies a real object, a car, while the second identifier identifies an abstract object, namely a state of this car. Everything else in this relation represents the knowledge about one attribute. Once again, notice that the state of an object is an abstract object. For example, you cannot touch the state, while you can touch the object car.
A state of an entity I have defined as any knowledge about that entity which has some subject. As I said, these two identifiers are linked and are located in one database structure, which I call the state of an entity. The identifier of the state identifies the state of the entity, but cannot identify the entity. The identifier of the entity identifies the entity, but cannot identify a state of the entity. However, the state structure connects these two identifiers. This is my solution, which connects the identification of an abstract object with the identification of the real object, and vice versa.


Obviously, my solution is not a surrogate.
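As a rough illustration of the two identifiers discussed above, here is a sketch of mine (the column names CarId and CarKey are used only for this discussion): the structure below links the identifier of the real object with the identifier of each of its states, and neither identifier can do the job of the other.

# Sketch only: one "state of an entity" structure holding both identifiers.
states = [
    # CarKey identifies the state (abstract object); CarId identifies the car (real object).
    {"CarKey": "k1", "CarId": "car-7", "attribute": "color", "value": "blue"},
    {"CarKey": "k2", "CarId": "car-7", "attribute": "color", "value": "red"},
    {"CarKey": "k3", "CarId": "car-9", "attribute": "color", "value": "red"},
]

def state_by_key(car_key):
    # The identifier of the state finds exactly one state, never the car itself.
    return next(s for s in states if s["CarKey"] == car_key)

def states_of_car(car_id):
    # The identifier of the car finds all of its states, not any single state.
    return [s for s in states if s["CarId"] == car_id]

print(state_by_key("k2"))      # one state of one car
print(states_of_car("car-7"))  # the whole "history" of states of the real car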


> For the record, Celko is an idiot, and Cressey is an even bigger idiot, that even Celko can destroy.

Joe and David are IT professionals. In the past I too have, in some way, insulted these guys, but now I know it was my mistake.

I have already explained to you that the industry-standard identifiers are of the same type as my identifiers. More than 90% of all business applications use these identifiers.
You and I are discussing at the level of "db designer - database". Joe talks about “the industry-standard, externally verifiable keys with known validation rules”. It is not possible to check surrogates externally.
The question is how a police officer can check a licence using RM/T. How should RM/T be implemented so that an officer can check a traveller's passport? How should RM/T be implemented so that ATMs can be used?
There is another important question here: how is logical and semantic content conveyed between a db and a user? Using surrogates this is not possible. In RM/T this is possible only between the db designer and the database.

Since Anchor Modeling and RM/T use surrogates, it is clear that the damage is global. I want to show you the extent of the damage that occurs when RM/T is implemented in industry and science.


> I think you are honest, you have put (a) and (b) squarely. But you damage its credibility because your (c) is laughable. Users do not walk up to databases (let alone highly normalised ones, at the cutting edge, which are not common), and make changes to single RM/T rows. No. First, they are isolated from the low level implementation of the database (if anything, they will see Views [2.4] ); second, there will be various constraints in place that prevent incorrect updates; third, whatever they are attempting will be encapsulated in a transactions (that convert the logical business action into a series of single-row updates, all of which together, constitute an Atomic change to the db).
>
> So, no, the user will not be doing any such thing, and as we agree (I think) the surrogate is not visible to them anyway, so they will not be looking for k4 that they cannot see. The user will execute the relevant transaction, that looks up C=c3, and finding that 3 identifiers exist, it will fail. Or else the user will look up c3 on a search window first, find that 3 logical 5NF rows exist for it; choose one via proper Keys (let's assume column A is the Key), then execute the relevant transaction using Key A{value}, which will succeed or fail. Since column A is not given in your example, there is not enough detail to suggest either success or failure.
>


Earlier in this thread, I explained that I took this example because it is similar to an example from C. Date's book. I was surprised by the db design in Date's book. In the example I wanted to show the following:

1. RM/T cannot work with Nulls at all.
2. If someone is working with Nulls, then he should be aware that Codd's recommended three-valued logic does not help at all.
3. The surrogate key should never be displayed.
4. It is shown that a binary relation with Nulls looks really bad, and it would be best to forbid working with Nulls, especially because the user cannot see the primary key (that is, the surrogate key) of the binary relation.
5. Where and how should incomplete data be kept?
6. If, instead of a surrogate key, one uses the VIN (that is, an industry-standard key), then this can be handled properly.

Vladimir Odrljin


vldm10

unread,
Feb 25, 2013, 7:16:26 AM2/25/13
to

> In this example, I have two identifiers, both belonging to one database structure. The first identifier identifies a real object, a car, while the second identifier identifies an abstract object, namely a state of this car. Everything else in this relation represents the knowledge about one attribute. Once again, notice that the state of an object is an abstract object. For example, you cannot touch the state, while you can touch the object car.
>
> A state of an entity I have defined as any knowledge about that entity which has some subject. As I said, these two identifiers are linked and are located in one database structure, which I call the state of an entity. The identifier of the state identifies the state of the entity, but cannot identify the entity. The identifier of the entity identifies the entity, but cannot identify a state of the entity. However, the state structure connects these two identifiers. This is my solution, which connects the identification of an abstract object with the identification of the real object, and vice versa.
>
>
>
>
>
> Obviously, my solution is not a surrogate.


A. Now I would like to separate out this subject and give it an appropriate title. I think this will help a better understanding of this very complex matter.

How does a database store an object, and how can a database remember its objects?

So how does this work? Is there a general algorithm that manages the objects in a database? Such an algorithm should proceed along the following lines:

First, I will define the basic concepts. In the database, each object is selected using a procedure which performs the identification of this object. Each object in the database has its identifier.
All objects stored in databases are treated as abstract objects. In the general case, we have the following three kinds of abstract objects:

1. An abstract object that represents a real object. In this case, the real object and the corresponding abstract object in the database have the same identifier.
2. An abstract object that is the result of the human imagination (for example, the horse Pegasus, Mickey Mouse, etc.). These abstract objects do not represent real objects. I wrote about these objects; see my comments from November 28, 2007 on this user group.
3. Abstract objects which do not represent real objects, but which represent some specific abstraction of the real objects. For example, a state of an entity is one such abstract object. This kind of abstract object is a special database structure. The structure contains the corresponding identifiers that determine the identification of the abstract object, and it has a link between those identifiers. The main parts of my model are the states: the state of an entity (object) and the state of a relationship. In my previous post I explained how to realize the identification of the abstract object, that is, how to identify the state of an entity.

Here it is shown how the identification of abstract objects works in my db model, and in my model there are only states. However, entities, relationships, attributes, and states are fairly general terms. It is very important that in my model, knowledge is associated with attributes, objects, relationships and states.
Note that abstract objects are well defined in my model. I wrote about abstract objects in my papers; for example, see my paper “Semantic databases and semantic machines”, section 1, at http://www.dbdesign11.com
I introduce m-attributes, m-entities, m-relationships and m-states. These objects are interpretations and abstractions of their corresponding real-world objects. I name them abstract objects. Note that the four mentioned “m” objects are objects of the most general character.

Properties of abstract objects are:
(i) Abstract objects are recorded, meaning that they are permanent.
(This implies that abstract objects need a language.)
(ii) Abstract objects hold meaning for corresponding subjects.
(iii) Abstract objects differ in the way they are constructed.
1. The simplest of them – m-attributes, are direct abstractions of the
real world through our perceptual abilities.
2. More complex abstract objects are constructed from simpler ones.
For instance, m-entities are constructed from m-attributes.

Therefore, hierarchy among abstract objects is determined by the
level of abstraction of their corresponding real world objects.

(iv) We identify abstract objects through their identifiers.


Here in this post I am trying to write a little more about identification. In connection with identification I have done some other things besides what is written here. I will mention the identification of attributes, entities, relationships and states. See my post “Does the phrase "Russell's paradox" need to be replaced with another phrase?”.
Also see my paper “Semantic databases and semantic machines”, section 5.6.


B. Here is another example of the low theoretical level of RM/T. E. Codd often made the transition from the ER model to the RM and vice versa. He works with entities, associations and relations, but he did not show how these transitions between data models work. He did not even notice that he has to explain and prove them.
If you want to see how to build a theory of the mapping between data models, then you can refer to the works of other authors (see, for example, Ron Fagin, Phil Bernstein, S. Alagić, S. Melnik, …).
You can see my solution for the mapping between data models in my post of February 13, 2013 (the 11th case).

C. The identifiers defined in my solution are keys, because they belong to objects. They belong either to real objects or to abstract objects.
Surrogates do not belong to real objects, so, as Derek noticed, they are in contradiction with the definition of relational keys.

Note that the key defined in my solution is a general solution; the surrogate key is just a special case. The key defined in my solution can do the following:
(i) My key can do what the surrogate key can do.
(ii) My key can do what the surrogate key cannot do. (For example, my key is able to identify the states and can maintain a "history". When we work with surrogates, it may happen that two surrogate keys have all the same attributes; my key can handle this case.)

With this message I wanted to show that surrogates are just a "technical detail", and that there are other things which are more important.

Vladimir Odrljin

vldm10

unread,
Mar 14, 2013, 9:21:01 AM3/14/13
to
On Monday, February 11, 2013, at 08:41:14 UTC+1, Derek Asirvadem wrote:


> (Again, the question begs: what is the exact difference between your "identifiers" and surrogates?)

1. I would like to mention that the part about the surrogate key is not important; it is maybe 5% of the problem. The decomposition of a structure into binary structures is what is important here. Atomic structures imply many important consequences, among them atomic semantics and atomic sentences.
E. Codd and Date & Darwen form a group that has worked extensively, and unsuccessfully, on the problem of decomposing structures into binary structures.
The decomposition of structures into binary structures shown in the RM/T paper is not correct, because it is only valid for Simple databases. These are databases which maintain only the current state. RM/T is not correct for databases that maintain a history of events. RM/T also does not hold for General database theory. (I outlined the basic ideas of General database theory in this thread in my post of February 13, 2013.)
In my papers it is shown how to construct the aforementioned decomposition for the following databases: Simple databases, General databases, databases that implement the RM, databases that implement the ER Model, and databases that implement file systems. It is also shown how to do the mapping between these data models.

Codd introduced surrogates for only one reason: by using the surrogates he tried to obtain binary (atomic) structures.

2. The RM/T "solution" is not supported by theory. For example, in the RM/T paper, the binary decomposition is not proven at the level of the ER Model. It was not proven in the RM/T paper that this decomposition into binary structures is valid in the ER model. Codd introduced the E and P atomic structures directly into the ERM without any evidence, because he needed the binary structures. I would say that he desperately needed the binary decomposition, because this decomposition is the "A to Z" of RM/T. Then Codd immediately proclaimed the binary structures of the ERM to be binary relations; thus he introduced the E and P relations. By the way, this is not science; even more, in my opinion, it is not a “discipline” as C. Date presents RM/T.
If one wants to transfer the binary structures from the ERM into the RM (and also do the inverse mapping), then one has to deal with a theory of the mapping between the two data models. As I have already written, Codd did not notice this at all.

3. In RM/T, section 4, Codd wrote: "There are three Difficulties in employing user-controlled keys as permanent surrogates for entities."
My comment: The surrogates are not user-controlled by definition.

Also in RM/T, section 4, Codd wrote: "Introduction of the E-domain, E-attributes and surrogates does not make user-controlled keys obsolete. Users will often need entity identifiers (such as part serial numbers) that are totally under their control, although they are no longer compelled to invent a user-controlled key if they do not wish to."
My comment: the surrogates are not user-controlled by definition.

4. It is not clear what the surrogate key is. Is the surrogate key part of a theory, or is it a technical solution? Is the surrogate key part of an entity? Codd also needs to explain why he abandons the relational key.
It is not clear what Codd modeled by using a surrogate key. Note that RM/T is a data model. What does RM/T describe and explain? Does RM/T model real things?

In my solution, for example, the states of entities are modeled. The entities and relationships are from the real world.
I also introduced abstract objects, which I defined precisely. In working with abstract objects, the main tool is identification. How the memory of the database operates with abstract objects and how memory identifies abstract objects is an important part of my paper. So, in my theory of identification, identifiers (labels, tokens) play important roles, especially in memory.

5. What did Codd do in his paper "RM/T"? He was aware that a binary relation must have a simple key. If the key is composite, then there is no point in creating a binary relation which has a key with many attributes. So Codd introduced a simple key. He understood that surrogate keys can cause problems; therefore he introduced some additional things - he proposed that the surrogate key should be invisible to the user.
He was also aware that if someone were to solve this decomposition into binary structures, then those binary structures would have to have a simple key and one attribute. So any new solution must be plagiarism. Obviously this approach is very useful and cheap.

There is another approach to this problem: 6NF. However, 6NF has one big minus: the authors of 6NF do not give a procedure that places a relvar in 6NF. Note that the authors of 6NF are here actually trying to solve the aforementioned binary (atomic) decomposition.

On this user group, I gave an example of a relation whose attributes are mutually independent. This example shows that 6NF is absurd, because there are relations which are in 6NF but whose keys are composed of a large number of attributes. Theoretically, the number of these attributes can be any finite number.
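Here is a small worked example of this kind of relation, as a sketch of mine (the relation and its attribute names are made up). An all-key relation with mutually independent attributes satisfies no nontrivial join dependency, so, assuming no such dependency is declared for it, it already meets the 6NF condition, yet its key is the whole heading and its binary projections do not reconstruct it:

# An all-key relation R(student, course, term) with mutually independent attributes.
# Assuming no nontrivial join dependency is declared for it, it already meets
# the 6NF condition, yet its key is all three attributes and a binary
# decomposition loses information.
r = {("alice", "db", "fall"),
     ("alice", "ai", "spring"),
     ("bob",   "db", "spring")}

proj_sc = {(s, c) for (s, c, t) in r}    # projection on (student, course)
proj_ct = {(c, t) for (s, c, t) in r}    # projection on (course, term)
proj_st = {(s, t) for (s, c, t) in r}    # projection on (student, term)

# Natural join of the three binary projections.
rejoined = {(s, c, t)
            for (s, c) in proj_sc
            for (c2, t) in proj_ct
            if c2 == c and (s, t) in proj_st}

print(rejoined - r)   # {('alice', 'db', 'spring')}: a spurious tuple, so the
                      # relation cannot be replaced by its binary projections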

6. Construction of the surrogate key.

The RM/T relation has a primary key, which is usually a composite key. The primary key is in the database and has corresponding attributes in the real world.

In addition to this primary key, Codd added another primary key which exists only in the database: the surrogate key. In this way, there are two primary keys at the database level, and the corresponding attributes are at a third level, that is, at the level of the corresponding real entity.
As I already wrote, Codd introduced surrogates for only one reason: by using the surrogates he tried to obtain binary (atomic) structures.

Can you imagine how easily an experienced programmer can create chaos by intentionally changing the value of a key at one of these three levels?

To improve matters, the surrogates are used only as parts of binary structures.
So, to avoid working with two parallel relational databases, Codd uses only binary relations. But now new problems arise, caused by the surrogate key. I will mention the following two such problems:
(i) How can one identify an entity in the real world using a surrogate key from a database? Note that the surrogate key does not exist in the real world. Note also that in RM/T we work only with binary relations.
(ii) Two distinct entities can have the same state. This implies the following problem: how do we find the two corresponding real-world objects? Note that a key in the RM is at the level of a relation, while a key in RM/T is at the level of the database. The same problem exists in OOA. Note that the surrogate key is similar to an OO identifier.

7. In my solution, the real entity has a (real) identifier. This identifier also exists in the corresponding database structure. So this identifier is not a surrogate key; moreover, it is not even a key in my solution.
Note that my structure has the identifier of the state as its primary key. The identifier of the state is different from the identifier of the entity.

8. Note that Codd did not understand the difference between the surrogate key and an externally verifiable key; see point 3 above. There are two types of externally verifiable keys. The first type is the global, industry-standard externally verifiable key. The other type is not a global industry standard; rather, it is the local-standard externally verifiable key.
The local standard is very important because it gives a great opportunity for good design.
One example is the employee number at a company. A company can maintain a list of the employee numbers on paper at Human Resources, or it can maintain documentation, or it can put the list on the web with limited access, etc. Thus all departments can use these numbers. What is important is that a user can find and verify the identity in the real world; it is not necessary that the employee number be part of the real employee. Of course, a company can associate badges with employees if it wants to.
Another example of a local standard is addresses. Different countries have different systems of addresses, but an address can be externally verified.

It is very important to understand that each company may have its own system of identification and technology, which can be public or private, and which can be externally verifiable. The local standard is an example of good design. Of course, general industry-standard keys and local-standard keys are not surrogates. Note that there is a profound difference between externally verifiable keys and surrogate keys. Surrogates are at the level of the link “db designer–database”, while externally verifiable keys are at the level of the link "user–database".
Surrogates are related to memory manipulation, while externally verifiable keys are related to the transfer of semantic and logical content. It seems to me that Codd did not understand the nature of this problem.

9. The following paper was selected as the best paper of ER’09:
Anchor Modeling - An Agile Modeling Technique Using the Sixth Normal Form for Structurally and Temporally Evolving Data.
This paper has, among others, three important parts: the surrogate key from RM/T, 6NF, and the main results of my paper published in 2005. In this post I have shown that RM/T is not correct, i.e. its binary decomposition is not correct. RM/T did not resolve the problems in the domain of the general theory of databases. In contrast to the RM/T paper, these problems are solved in my papers.
I have also shown that 6NF has no scientific value. The paper "Anchor Modeling" promotes RM/T and 6NF in conjunction with the most important results in database theory, although these works are incorrect and inaccurate concerning important parts of the general theory of databases. So among these three important parts of "Anchor Modeling", the only accurate part is the one whose results I published four years before the paper "Anchor Modeling".
One of the main purposes of normalization in the RM is to avoid redundancy. In contrast to the RM, my model keeps all redundancy. This shows that the difference between my model and the RM is complete. There are other differences between my model and the RM which indicate that these two models are opposites. Therefore, the aforementioned "bridging" between the RM and my model is not correct.

10. I wrote such a long post primarily to protect my work. But this post was also written because people have been debating this subject for years and, in my opinion, spend a great deal of time on things that are not clearly presented.
Of course, everyone is entitled to their opinion.

Vladimir Odrljin

vldm10

unread,
Mar 25, 2013, 5:08:52 AM3/25/13
to
On Monday, February 11, 2013, at 08:41:14 UTC+1, Derek Asirvadem wrote:


Hi Derek,

> Which begs the question: what exactly are you defining as surrogates ?


(a) Codd defined the surrogate key in his paper RM/T. Wikipedia also has the same definition of the surrogate key. The surrogate key is the primary key of the corresponding binary relations. The surrogate key exists only in the corresponding database (not in the real world). Codd wrote: “now the surrogate that is the primary key and provides truly permanent identification of each entity.” Note that he wrote “provides identification of each entity”; Codd did not write “identifies the entity”. This means that he uses the corresponding primary key from the original relation (the RM relation) to identify the real-world entity, not the surrogate key, which is the primary key of the binary relations. This implies that he must maintain the two primary keys, and this in turn implies that he must join the attributes of the original primary key and keep them unique. (I hope that I have not made a mistake in this explanation.) Note that Codd did not maintain history; he did not even know of a “history of events”. So my explanation here is my guess as to how “system-assigned surrogates” must work. Note that Codd did not explain how these things work; he only said that a system does this behind the scenes.
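As a rough sketch of this guess (mine alone; nothing below is taken from RM/T, and the names are invented), the bookkeeping would look roughly like this: the surrogate is the primary key inside the database, while the original user-controlled key must still be kept unique so that the real-world entity can be found again.

# Rough sketch of the guessed bookkeeping for system-assigned surrogates.
import itertools

_next_surrogate = itertools.count(1)
rows_by_surrogate = {}      # surrogate -> row: the primary key inside the database
surrogate_by_user_key = {}  # user-controlled key -> surrogate: must stay unique

def insert_entity(user_key, **attributes):
    if user_key in surrogate_by_user_key:
        raise ValueError("duplicate user-controlled key")
    surrogate = next(_next_surrogate)          # assigned by the system, never seen by users
    rows_by_surrogate[surrogate] = {"user_key": user_key, **attributes}
    surrogate_by_user_key[user_key] = surrogate
    return surrogate

insert_entity(("ACME", "part-42"), weight=3.5)

# Identifying the real-world entity still has to go through the user-controlled key:
print(rows_by_surrogate[surrogate_by_user_key[("ACME", "part-42")]])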
==========================

In my paper “Some ideas about a new Data Model”, from September 17, 2005, at http://www.dbdesign10.com/ (see section 1.1), I wrote:
“Besides Ack, every entity has an attribute which is the Identifier of the entity or can provide identification of the entity. This Identifier has one value for all the states of one entity or relationship.”
In this definition I wrote “…or can provide identification of the entity”. This “can provide” means that my solution covers the case of the surrogate key. So the surrogate key is just a special case of my solution. But here the identifier does not necessarily have to be a surrogate. I was not thinking specifically about the surrogate key here; I was thinking of any identifier which can “identify or provide identification”. For example, my identifier of a state is 100 times more complex than the surrogate key.

My approach to identifiers, surrogates and keys is different from Codd's; it is based on abstract objects and on memory manipulation with abstract objects. My approach solves more complex things than surrogates do. (See the algorithm in my post from February 25 in this thread. I named that algorithm “How does a database store an object, and how can a database remember its objects?”)
===========================

The paper “Anchor Modeling An Agile Modeling Technique Using the Sixth Normal Form for Structurally and Temporally Evolving Data” has reference [19].

In [19] the authors of Anchor Modeling wrote that the anchor is the surrogate key. In their paper they also “proved” that all structures from Anchor Modeling are in “6NF”. This implies that Anchor Modeling is based on nothing, because I have shown in this thread that “6NF” is absurd. This paper was signed by all five authors. Why did the authors of “Anchor Modeling” put "6NF" in the title? It is because they use these atomic structures in their paper without proof. They use "6NF" and RM/T as an implicit proof of their decomposition. As I wrote, "6NF" is nonsense.

There are also problems in Anchor Modeling related to transitions from one data model to another. It seems to me that the authors of Anchor Modeling stride through these data models in giant steps. First they have entities, then they have a set consisting of different types of attributes.
But nowhere is the decomposition of this entity into these attributes proven. Nor is the reverse proven: it is not proven that these attributes form this entity. What is surprising is that at the beginning of this paper the authors write that Anchor Modeling is the Relational Model: "An anchor model is a relational database schema" (see section 1).

On August 6, 2010 (see my thread “The original version”) I wrote on this user group the following:
“There is one other thing here, which is more important, which is badly
done in the "Anchor Modeling. This is about how to do the transition
from E / R model in the relational model and vice versa. I think it is
necessary to define the mapping from E / R to RM, then the inverse
mapping for the given mapping and in the end it is necessary to define
the composition mapping. In my model I have at the outset, the binary
concepts. Each binary structure has its own unique identifier of the
state. Therefore, each tuple or a binary concept is uniquely defined.
In "Anchor Modeling" They start from the E / R and go in the RM, so do
6NF, and return to the E / R. But it was not discussed in the paper,
so it's not clear how to do it. We can note that mapping of schemas between
two db models can be complex, for examples it can include constrains.”

The basic terms are not correct, or they do not belong to the theory of databases. For example, section 2.1 starts with the following text: “An anchor represents a set of entities, such as a set of actors…”
A set of entities does not exist, because we do not put physical entities into sets. For example, we can say that we have a set whose elements denote actors. Databases work mostly with names, not with physical objects.

Just after the above-mentioned sentence, there are the following definitions:
Def1 Let ID be an infinite set of symbols, which are used as identities.

Def2 An anchor A(C) is a table with one column. The domain of C is ID . The primary key for A is C.

If somebody wants to check what an identity is, he can visit the web site of the Stanford Encyclopedia of Philosophy; the articles on that site are written by prominent scholars. On that site there is no single definition of identity, but there are tens of pages about it. This is among the most important terms in philosophy. However, databases do not work with philosophical terms. I just want to point out that the most basic terms in Anchor Modeling are defined inaccurately.

This paper has the following title: “Anchor Modeling An Agile Modeling Technique Using the Sixth Normal Form for Structurally and Temporally Evolving Data”

I was looking just for the part related to “Using the Sixth Normal Form for Structurally Evolving Data”. I mean, this is a really impressive title and notation. However, I could not find anything about "using the sixth normal form for structurally evolving data".
If the authors of this paper believed they could add attributes to existing entities in an "agile" way, then they have to realize that they would need to swap the existing "identity" of the corresponding entity, i.e. they would have to change the "anchor".
I have many doubts about "agile evolving", especially because there is no explanation or example of it.

This paper was awarded first prize at the conference that bears the name "International Conference on Conceptual Modeling", but nowhere in the paper is there a definition of a concept. Note that the authors introduce unusual concepts, which are about how to keep the identity of an entity that is changing. Also, we are talking about atomic entities. Therefore definitions of the concepts are important. Note that P. Chen also did not give any definition of a concept in his ERM paper, and the ERM is a conceptual model.

On the Anchor Modeling web site the authors write about “metadata”. They have a discussion forum there, they correct errors and announce new, improved versions. I want to say the following:
1. “metadata” is an undefined concept.
2. The authors did not write anything in their paper about “metadata”. For example, they did not include “metadata” in the schema, and they did not define which “metadata” are included.
3. I am sure that Anchor Modeling cannot support “metadata”, but it would be frivolous to criticize something that is not defined. So no reasonable person can say anything about it.

My point here is that these authors do not understand what “metadata” is, and this implies that they do not know what history is. So their solution cannot handle history. In fact these authors think that history is a kind of temporal database, which is not true at all. Obviously the editors of this paper do not understand the nature of history.
With these few examples I wanted to draw attention to the low level of this work. It also raises the question of how the editors of this paper did not notice its low level.
By the way, reference [19] has disappeared; it is no longer at the given address.




> 5.1. I would like to be able to say, at this point that my [4] is the same as your "DbDesign 10 Knowledge Data Model", at least in the sense that [4] is an implementation of dbdesign10, and dbdesign10 is a generic or template definition (not an implementation). But I can't say that yet, because:
>
> • the one big difference that stands out (in my reading thus far) is that I totally accept RKs, and RKs are compound keys, that AFAIC cannot be decomposed. Whereas, your "Keys" do not allow compound keys.
>
> • on the face of it your "Keys" are surrogates, but since you decry surrogates, I am sure you are trying to convey something else, that I have not absorbed yet.
>
> ••• CarId is the Car Key. CarKey is not a Key, it is a surrogate, and the column is therefore incorrectly and named, and leads to confusion.




====================================
(b) CarKey is not a surrogate key by definition. Users can see the value of this key, and if a user wants, he can delete the value he has seen. CarKey is the identifier of the abstract object, i.e. it is the key of the state of an entity. In contrast to Codd’s surrogate key, which is related to a real entity, CarKey is related to the abstract object.
CarKey directly identifies the corresponding state, so it is a key. Surrogate keys cannot identify the entities; the surrogate key identifies the real-world entity only indirectly, and it needs additional database structures to do so.
=====================================

More importantly, CarKey is about how to store complex objects in a memory and how to recall complex objects from a memory. (Man can remember ideas, emotions, music, thoughts, shapes and other very complex objects. Here I mean a memory for databases, but this can be a clue for a general theory of memories, and CarKey goes in that direction. See the algorithm in my post from February 25 in this thread; I named that algorithm “How does a database store an object, and how can a database remember its objects?”)
The surrogate is about simple objects, and the surrogate does not work correctly. The surrogate is a naïve technical solution. (The surrogate looks to me like a kind of index.)
In particular, Codd’s surrogate does not work for General database theory. For example, Codd did not notice a very important and huge field in database theory: history. Now it seems that the authors of Anchor Modeling want to “include” Codd’s surrogate in the theory of the history of events, although Codd did not notice this very important field.
===================================

Note that each primary key is an identifier. So the corresponding identifier can be physically associated with the corresponding object from the real world. See my paper “Semantic databases and semantic machines”, section 5.1(i), at http://www.dbdesign11.com/ , which is about primary keys.

Keep in mind that a lot of people do not understand what a surrogate key is. For example, if we have an invoice and the invoice has an identifier (an invoice number), then this is not a surrogate key by definition, because the identifier is placed on the object in the real world.
I also think that the authors of Anchor Modeling have not fully understood the surrogate key. I mean, this is obvious.


>
> • (I think dbdesign10 needs to be elevated in terms of specific statements and clarity, because it takes undue effort to understand it, but let's not get into that here)
>


(c) I posted my solution for the first time on September 23, 2005 on this user group.
Many of the members understood my paper, and they immediately started a discussion.
It is OK if one does not understand something; you (or anybody else) can post your questions to the group or send them to my email, and I will respond.





> 8.1. I disagree that AM *substitutes* or replaces the RK with a surrogate. Clearly, one of their attribute tables (P-Relation) contains the RK, the K-Relation or K-Role. So the surrogate is used in the normal manner, as a permanent Identifier, a substitute PK, that is an FK in all its child tables.


(d) An entity is a fundamental term here. It is also a fundamental semantic unit. Entities are the basic units of relationships.
Surrogate keys are the keys for entities. The anchor is the surrogate key; this is written in reference [19].
In my opinion the surrogate is used wrongly in Anchor Modeling, because users can see the surrogates in Anchor Modeling; then it is not a surrogate key by definition.
Therefore the consequence here is the following question: on what is the decomposition of entities in Anchor Modeling based? Is it based on "6NF", on E. Codd's "decomposition", or perhaps on some combination of these approaches?

Vladimir Odrljin

vldm10

unread,
Apr 1, 2013, 1:55:09 PM4/1/13
to
On Monday, February 11, 2013, at 08:41:14 UTC+1, Derek Asirvadem wrote:





> I dare say, that *if* Anchor Modelling databases do not suffer from the consequences of data and referential integrity loss, then they have a good handle on it as well.



Hi Derek,

The surrogate key, that is, "Anchor Modeling", cannot even handle some major areas related to the "history" of a database. In other words, the “Anchor Model” cannot solve many of the problems it was intended to solve.

1.
On November 29, I posted the following example in my thread “The original version” on this user group:

The following example shows that this model cannot solve the most
important of the cases for which it is intended. This is an example of
wrongly-entered data (erroneous or wrong data). The question arises:
why hasn't this been fixed in the new version? Rather, it has been
explained unclearly. The answer is that AnchorModeling cannot solve
the most important part.
Let me explain this with the following example: Let “Historized
Attribute” be given (for entity Car).
Car ( id, color, start-date-for-color)
===============================
22 Blue 16 August 2002
22 Red 16 August 2002
In the first row the data entry person made a mistake and entered the
wrong color “blue”, then the manager fixed the mistake and entered
“red”, which is the actual color of the car. This cannot be done in AM
because we get a double key, key = (id, date). For this reason wrong
data in AM must be deleted. Of course, there is no history.
This implies many other things. For instance, AM can’t be used to
design online applications, as well as to solve other serious
problems.
This example clearly indicates that the authors of AM are incompetent
and fail to understand the nature of databases and the problems of
changes and history.
-------------------------------------
I submitted complaints to DKE and Springer. Among other examples, I sent this example as part of my complaint. At the beginning of this thread I posted the answers that I received from P. Chen (DKE journal) and the response from the Springer journal.

Now I am going to analyze this example in more detail. Suppose that we have built a business application that allows teachers to enter grades for students interactively, using a database that is supported on-line, that is, one which can be accessed via the Internet.
The company that sells the software promises the following:
(a) It is impossible to commit a crime using this database, because our solution preserves the history of events;
(b) This is the first solution that provides 100% on-line operation. In the event that someone, some procedure, or an external user makes a mistake, this database will show exactly who made the mistake and who needs to fix the error. Wrong data are not the responsibility of designers or theorists; the designers and the theory should produce a db which shows who is responsible for the faulty data.
(c) Our solution solves a lot more than a "Data Warehouse" does, because our database keeps everything that has ever been done in the database. In addition, this database has on-line support, so you can use the current data on the Internet or on a network. So we do not need a “Data Warehouse”, and we do not need to transfer data to a “Data Warehouse”.

Suppose that a school purchased Anchor Modeling software because it maintains history. Suppose that the teacher Smith gave grade A to the student John. Using a data entry screen, Smith entered the grade A into the corresponding database. He deliberately made a "mistake" when entering the grade A; John's real knowledge in fact corresponds to the grade D. After two months, John was admitted to a good college thanks to the grade A. Smith then declared that the grade A was a mistake and deleted it, since the authors of Anchor Modeling allow the deletion of wrong data. So everybody is OK: John did not make a mistake, and John's college does not even know what happened with his record. Smith made a mistake and fixed it. The authors of Anchor Modeling are responsible in this case, because they claim that their solution maintains history, although that is not correct. Even more, they obviously do not understand what history is. Obviously they do not understand the above-mentioned items (a), (b) and (c). Otherwise, if they understood history, they would not allow the deletion of data.
We note that history works well with the insert operation. However, if one is allowed to insert and delete, then the update operation is in effect also allowed.
Now imagine that in the above example with the school, some manager introduces the following rule: if the teacher enters a wrong grade into the database, then such data can be repaired only by the System Administrator, at the written request of the teacher. Now we have a new problem, namely who maintains the written documentation. More importantly, the benefits (a), (b), (c) no longer hold in the database. Of course, organized crime with insider and outsider members is now possible.

It may happen that the System Administrator makes a mistake while entering data. It is possible that two teachers deliberately enter erroneous grades for one student. It is clear that many bad combinations are now possible with this database design.

Thus, deleting erroneous data, as the authors of Anchor Modeling do, shows that they do not know exactly what history is. I am not sure that the editors of the paper "Anchor Modeling", which was published by Springer and in the DKE journal, have a complete understanding of these issues.
In my opinion, this can be a bad experience for those who use the Anchor Modeling paper for their scientific work, a PhD, scientific presentations and the like. Note that Springer and DKE are very well-known scientific publishers. Of course, Springer and DKE should be held responsible.
The application of this software for "history" can have dangerous consequences for complex enterprise applications such as banks, airlines, military applications, etc.
If you (or anyone else) think that I'm wrong here, then please explain it in this user group.
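To restate the technical point of Example 1 in code, here is a sketch of mine (not Anchor Modeling code; the names are made up): if the historized attribute is keyed on (id, start-date), the corrected value collides with the erroneous one, so the wrong row can only be overwritten or deleted; if every entry is a new state with its own identifier, the error and the correction are both kept, together with who recorded each and when.

# Sketch of the duplicate-key problem; this is not Anchor Modeling code.
from datetime import date

# Keyed on (car id, start date): the correction collides with the erroneous
# entry, so the error can only be overwritten or deleted, and the evidence
# of who entered what disappears.
historized = {}
historized[(22, date(2002, 8, 16))] = "Blue"   # wrong value entered
historized[(22, date(2002, 8, 16))] = "Red"    # the "fix" silently replaces it
print(historized)                               # only Red survives

# An insert-only log keyed on a state identifier keeps the error and the
# correction side by side.
audit_log = [
    {"state_id": 1, "car_id": 22, "color": "Blue",
     "valid_from": date(2002, 8, 16), "recorded_by": "data entry person",
     "recorded_on": date(2002, 8, 16)},
    {"state_id": 2, "car_id": 22, "color": "Red",
     "valid_from": date(2002, 8, 16), "recorded_by": "manager (correction)",
     "recorded_on": date(2002, 8, 17)},
]
for row in audit_log:
    print(row)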

2.
Example 2 (December 22, 2010, thread “The original version”)

Anchor Modeling solves relationships using the following explanation:
“When a transaction has some property, like PEDAT-PerformanceDate in
Fig. 4, it should be modeled as an anchor. It can be modeled as a tie
only if the transaction has no properties.” (See AnchorModeling,
section 4.1 Modeling Core Entities and Transactions)
However no scientific explanation for the above statement is given. In
contrast to Anchor Modeling, many university lectures include
definitions of relationships by their attributes - often giving
examples for relationships with attributes. Wikipedia also defines
relationships through attributes.

In fact, Anchor Modeling cannot solve a relationship that has an
attribute.
Explanation: in this case the Historized Tie would have the following form: Htie(C1, C2, ..., Cn, A, T), where A is an attribute. Here it is not clear what T is: is T a time related to the set of anchors (as it is in Anchor Modeling), or is T a time related to A? Here problems with the key arise.
Note that a relationship can hold at an arbitrary time, independently of its attributes.
--------------------------------------------
Let me do a small analysis of this example. In his paper
“The Entity – Relationship Model” P. Chen wrote “Note that relationships also have attributes.” (See section 2.2)
This statement by P. Chen is in contradiction with the following claim by the authors of "Anchor Modeling": relationships do not have attributes.
I also raised this treatment of relationships in "Anchor Modeling" in my complaint addressed to P. Chen.
Databases with relationships are the most complex structures. I have had the opportunity to work on very complex relationship problems. Therefore, I am very surprised by this contradictory approach to the problem.

3.
In the historized attribute Hatt(C,D,T), the attribute T is wrongly designed. T is the time at which the old value of a property is no longer valid and, at the same time, the time at which a new value of the property becomes valid. However, this is incorrect. For instance, let us consider the property "the color of a car". A car can be sent to the mechanic to be fixed. The old color can be removed right away, and the company can enter this into the database. After a series of repair jobs, the new color may be painted on 10 days after the previous color was removed, and this data may be entered into the database 12 days after the car has been in the shop. Therefore, one of the design foundations of Anchor Modeling is flawed.
There are numerous examples of business applications where entities’ attributes do not work in this way.
---------------------
Let me do a small analysis of this example.
As we saw, Anchor Modeling supports only a start date in Hatt(C, D, T). Is it possible in Anchor Modeling to store an end date? No, it is not, because that would be very bad db design: if you try to store both dates, the start date and the end date, then you will have a duplicate key. In fact, it is not possible to work with an end date in “Anchor Modeling” at all.
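As a small sketch of this point (my own illustration, with invented names and dates): when the old color is removed on one date and the new color is painted on ten days later, the period with no color and the later data-entry date can be kept only if each change is recorded as its own dated state, which a single time attribute T cannot express.

# Sketch: recording each change of the color as its own dated state, so that
# the removal, the repainting and the date of data entry are all kept.
from datetime import date

color_states = [
    {"state_id": 1, "car_id": 7, "color": "blue",
     "event_date": date(2012, 3, 1),  "recorded_on": date(2012, 3, 1)},
    {"state_id": 2, "car_id": 7, "color": None,            # old paint removed
     "event_date": date(2012, 6, 1),  "recorded_on": date(2012, 6, 1)},
    {"state_id": 3, "car_id": 7, "color": "green",         # repainted 10 days later,
     "event_date": date(2012, 6, 11), "recorded_on": date(2012, 6, 13)},  # entered later still
]

def color_on(car_id, day):
    # Latest state whose real-world event date is not after the given day.
    rows = [r for r in color_states
            if r["car_id"] == car_id and r["event_date"] <= day]
    return max(rows, key=lambda r: r["event_date"])["color"] if rows else None

print(color_on(7, date(2012, 6, 5)))   # None: the car had no color on that day
print(color_on(7, date(2012, 6, 20)))  # 'green'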

4.
The authors of Anchor Modeling have not shown how they work with “metadata”. In the first version of Anchor Modeling (section 2) they mention that “Although important, metadata is not discussed further since its use does not differ from that of other modeling techniques.” This is a very important theoretical part. There is no example, definition or schema of “metadata”; there is absolutely nothing about “metadata” in a scientific paper that is supposed to present fundamental results. Of course, this is not a scientific approach.
As I already wrote, the authors of "Anchor Modeling" had an intense discussion about "metadata" on their website and repeatedly announced fixes and improvements.
You can see my paper “Some ideas about a new data model”, example 2.5, at http://www.dbdesign10.com . Note that this example has more than one solution.

-------------
Surrogates
-------------

5.
The technical solution known as the "surrogate key" is bad for a database that maintains "history", because it is possible that two (or more) different surrogates identify one entity. This is very bad.

6.
Business applications built on global industry-standard identifiers form a large group of applications that do not need surrogate keys at all.

7.
There is a large group of objects in business applications that cannot be resolved by using surrogates. An example is a Honda dealer who sells Honda cars which all have the same attributes. This is an interesting case because we cannot apply Leibniz's Law. Here we cannot use surrogates, because the cars would appear in the database as the same entity. Therefore, we must introduce the VIN. Again, it is obvious that if we use the VIN identifier, then we do not need the surrogate key at all.

8.
The surrogate keys are not externally verifiable.

9.
Locally verifiable keys are applied in a local environment, for example in a company, a city, a library, a public-transportation card, a local shop card, etc. They do not need surrogates at all. Instead of the surrogate key, an identifier is the better solution.

10.
Identifiers that provide good db design. This example shows how identifiers can hold the entire database structure and provide good db design.
Take, for example, an auto service that repairs cars. We can organize the whole business by applying the database. For example, the manager of the auto service takes the necessary data from a customer and the data about the car. Then a worker checks the car and enters the necessary information about the job that is needed on the car, as well as the price and the time. Then the manager prints a sheet that contains all the information and the corresponding identifier.
This identifier is very different from the surrogate key. This identifier belongs to the real entity; the identifier is placed on the printout. This identifier holds the whole deal. The identifier also makes the auto service one complete little world. One copy goes to the customer, another copy goes to the worker, etc. All the information about cars, customers, repairs and prices goes into the database.

This database is fully functional; it allows the technician to present, in a documented manner, what needs to be done, how much it will cost and when it will be finished. The database provides the customer with all the necessary information so that he can accept or reject the job.
The auto service has all the information necessary for the operation of the company: the financial part, materials and parts, accounting, information about the customer, the car, the worker, etc. This identifier can be physically present on every paper that is related to the job. Obviously we do not need a surrogate key here. Note that the surrogate key has none of the functionality that this identifier has. In this kind of database, there is no place for a surrogate key. People who work with databases often forget that a man makes these little worlds.
I put this example here with the intention of showing how bad the surrogate key is for database design. Another reason I put this example here is the importance of database design; it is the most fundamental step in working with databases. In this small example we see that the constructed identifier is better than a surrogate key.
Database design means the construction of a small world that is fully functional for what is intended. Database design is the most important step, so a technical solution such as the surrogate key should be in compliance with the db design. It is a fundamental mistake if a technical solution determines the database design. Note that the surrogate key is a kind of technical solution; it is similar to an index, and its working is unknown because it is part of the system.
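As a hedged sketch of the design described above (all table and column names are my own, purely illustrative), the printed repair-order identifier could carry the whole deal like this:

CREATE TABLE RepairOrder (
  OrderId        varchar(20) PRIMARY KEY,  -- the visible identifier printed on the sheet
  VIN            varchar(17) NOT NULL,     -- the car, identified by its VIN
  CustomerId     varchar(20) NOT NULL,     -- the customer who ordered the repair
  WorkerId       varchar(20) NOT NULL,     -- the technician who checked the car
  JobDescription varchar(200),             -- what has to be done
  Price          numeric(10,2),            -- the agreed price
  PromisedDate   date                      -- when the job should be finished
);
-- The same OrderId appears on the customer's copy, on the worker's copy and
-- on every related paper, so any row can be verified against the physical
-- documents. A system-generated surrogate, invisible outside the DBMS,
-- offers none of this functionality.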

11.
I explained my approach to memory and keys in my post from February 25, 2013. I gave the following heading to this topic:

How does a database store an object, and how can a database remember its objects?

In my post from February 25, 2013, I showed the main steps of the algorithm related to memory and remembrance. You can look at my solution with the identifier of the state to see the complexity of the problem and of the solution. Certainly this is a serious issue, and it is clear that the surrogate key has no theoretical importance. As I said, the surrogate key is a technical solution, a kind of index. If you look carefully at cases 5, 6, 7, 8, 9 and 10 described in this post, it is clear that the surrogate can be applied only in a small number of cases, say less than 2%.

On the other hand, my solution is based on the states of entities and relationships. I define the state of an entity as the total knowledge about the entity. I think my solution is theoretically and practically at a much higher level. Speaking of the theoretical level, I would like to note that the state of an object is one of the most complex topics in the history of science.
These questions concern the states of an object, the identity of an object that changes, the semantics of the subject who generates knowledge about the changing entity, and the construction of the corresponding concepts. In fact, it is also about the generation of the present, the past and the future.
Obviously, Codd did not even notice these things; his thinking is at the level of indexes. “Anchor Modeling” is at the same level, except that it uses the theory about history. I have developed and built the complete theory of "history". I posted the main results related to "history" on this user group in 2005. So here in Paragraph 11 I want to say that my first objection is that the theoretical level of RM/T and Anchor Modeling is in fact of low significance.
I want to state clearly that my second objection is the following: instead of creating a theory, Anchor Modeling anticipates a theory that others might discover, and in this way occupies the theoretical space in advance.
This index-like duplicate key is presented as the most important thing, and it is now tied to my theory that solves "history". My theory about “history” is the only thing here that is important.

My database, in addition to maintaining history, also builds a small world in which "everyone knows all the others." In this world there are two important events: when something is created and when something has ceased to exist. This world knows the existing entities, the former entities, and the procedures that create all of this. In my opinion, this is what Codd did not understand, nor do the authors of Anchor Modeling understand it.

It is absurd that someone uses the surrogate key instead of large-scale systems such as barcodes, driver licenses, passport identification, credit cards, VINs, network user IDs and thousands upon thousands of complex systems which, in addition to the primary key, implement complex identification, verification, authentication, and more. These identifiers can do everything that the surrogate key can do, plus much more that surrogates cannot do.

As far as I know, Microsoft also uses the states of entities in its EDM data model, but note that I defined the states in 2005.
As for the Microsoft EDM data model and my solution, see my post "The priority of the idea", which I posted on December 3, 2009, on this user group at the following page:

https://groups.google.com/forum/?fromgroups=#!topic/comp.databases.theory/rkhDdJ8XD_o


Vladimir Odrljin


vldm10

unread,
Apr 12, 2013, 6:50:51 AM4/12/13
to
On Monday, February 11, 2013 at 08:41:14 UTC+1, user Derek Asirvadem wrote:

Hi Derek,

>
> 3.1. I do not accept that "[Codd] was unsuccessful at [decomposition of a relvar into binary relvars] and was not able to show how this is done. " I think it is clear in RM/T, and I do it all the time. There may be marginal cases where the technique does not apply or where further techniques are necessary in order to provide resolution, but that does not subtract from the technique given, and you are not one of those idiots who argue at the margins (straining at the gnat and swallowing the camel).
>

1.
In 2005, I solved the binary decomposition for General databases. These are databases that can maintain "history". This decomposition is determined by applying the identifier of the state. The drawback of this decomposition is that it can be applied only in the RM, and only to General databases.

2.
I published the binary decomposition of Simple databases on this user group in May 2006. Then I published "Simple Form". Simple Form defines the following:
a) It determines to which relations it can be applied.
b) It defines how the relations can be decomposed into binary relations.
With this, things fit together somewhat better, but the major issue was still unresolved: we need the binary decomposition at the level of the conceptual model.

3.
Conceptual model is always the beginning of the construction of the database. But here, in the conceptual model nothing was done, even there was no definition of the concept.
Within the ERM, it was necessary to solve the decomposition of the Simple and the General databases. After that it was necessary to define a mapping from the ERM into RM.
All solutions in the ERM should be done with the tools that belong to the ERM. So we have to work with the concepts.

The "Binary decomposition" at the level of ERM I published in 2008. This paper I had to write in a limited number of pages, because Croatian journal in which I submitted the paper, gives a very small number of pages. That's why this paper has become more difficult to read than usual. I showed this decomposition in Example 6, Section 4.2.3, "Database design and data model founded on concept and knowledge constructs”. In this example, implicitly I used my definition of the concept and the extension of the concept. See also my paper "Semantic databases and semantic machines", section 5 at http://www.dbdesign11.com Here you can find more on this very complex matter.

Example 5 shows work with binary files. I showed only this example, which relates to m-n relationships in the file data model, because that is the hardest part.

4.
In my model, knowledge is associated with the concept. A new definition of knowledge has been given. Knowledge is defined using atomic facts. The difference between a fact and a factual sentence is introduced: a fact is always tied to a subject, that is, the subject knows that fact. This implies that the subject must be "aware" of the fact.
"Truth conditions" are defined, i.e. the relationship between meaning and truth is introduced.

5.
The mapping (and the corresponding inverse mapping) between the ERM and the RM is defined. Based on the above-mentioned four items, it is obvious that these mappings are defined in the following way:
a) The identifier of the entity defines the mapping for "Simple database";
b) The identifier of the state defines the mapping for “General databases".

6.
First I defined knowledge, and then I defined entities that are based on knowledge. In my model, the construction of the entity does not use undefined terms such as "metadata".

7.
My model allows one to work on all aspects related to the history of events. This is a large area in its own right, and therefore I will not explain it here. History is one of the greatest contributions of my solution; it drastically changes database theory.

8.
In item 8 I will discuss the thing that has been at the forefront of this thread: the story about surrogates. This story has inflated the importance of surrogates to a size that has nothing to do with real surrogates.
I will try to show two things here that are an important part of the "big picture" and that represent some "general" conditions.

One important thing here is my algorithm related to memory management. It is an algorithm that defines how to store an entity in memory, how to recall it from memory, and for which kinds of objects it works. I wrote about this algorithm in more detail in my post from February 25, 2013, in this thread.
In generalized terms, it can be said that this algorithm is about procedures for memory and remembrance (recollection).

Another thing is about some “general” conditions:

(a)
The main part of solving “temporal”, “historical” and other complex databases consists of two sub-steps:
1. Constructing an identifier of an entity or relationship.
2. Connecting all changes of states of one entity (or relationship) to the identifier of this entity (or relationship).

I had published this idea on my website http://www.dbdesign10.com and
in this user group in 2005 (see section 1. and 2.)

(b)
“We determine the Conceptual Model so that every entity and every relationship has only one attribute, all of whose values are distinct. So this attribute doesn’t have two of the same values. We will call this attribute the Identifier of the state of an entity or relationship. We will denote this attribute by the symbol Ack. All other attributes can have values which are the same for some different members of an entity set or a relationship set. Besides Ack, every entity has an attribute which is the Identifier of the entity or can provide identification of the entity. This Identifier has one value for all the states of one entity or relationship.” (See section 1.1)
--
Here I have given the general conditions for the construction of the procedures that are necessary to maintain the history of events, and their application to identifiers and keys. I would like to emphasize once more: the general conditions are the mentioned procedures, not an “immutable key”. I mean that AM only took the “immutable key” from my procedures. I wrote “the general conditions” because there are many specific conditions in the construction of a General database theory. I would also like to mention that the part of the definition that reads "... or can provide identification of the entity", from case (b), does not apply so much to surrogates as it applies to other cases. These cases are more general than surrogates. As I wrote, the surrogate key is an unimportant case.

I also want to show that Codd’s thinking about these issues was rather narrowly oriented towards technical solutions. I mean that he did not understand the nature of these matters. The authors of Anchor Modeling applied the surrogate key with the same misunderstanding of the difference between a technical solution and a theoretical solution, and of the difference between a special case and the general case.

The authors of AM gave the name "immutable" to certain keys. Recently, I glanced quickly at what Microsoft is doing with their "EDM", and I noticed that they use the term "immutable type."
In my model, I gave two options for the identifier of the entity: it may be immutable or it may be mutable, depending on the db design. So immutability depends on the db designer. For example, it is known that in the U.S. a man can, in particular situations, change his identity. In real database use there are situations when an identifier should be immutable only for one period; for example, the identifier must be changed each year, so it is immutable within one year and after that it must be changed.
In my paper "Database design and data model founded on knowledge constructs", section 4.2.4, I wrote: "In the states of an entity, the identifier of the entity stays unchanged through all the states of the entity, because the states are from one entity (however if we want, then we can decide and determine which of the states belong to one entity). The identifier of the entity determines which states belong to the entity. If an entity only has one state, then the identifier of the entity is semantically equal to the identifier of the state.”
In my paper “Semantic databases and semantic machines”, section 5.12, I demonstrated that the history of entities that change their identifiers can be maintained by the “seq” structure.
Note that the unary relations from RM/T are a simple special case of the structure “seq”. The anchors are also a very simple special case of the structure “seq”.
However, RM/T and AM cannot solve some important matters that the structure “seq” can solve.

--------------------------------------------
Conclusion:

E. Codd did not do any of the above-mentioned eight items. These items are very important. Item 7, history, changed a lot in db theory. I think Codd did not even notice these areas. The same can be said for AM. Note that I published the solution for history 4 years before Anchor Modeling did.
Therefore we might ask the following question: is the decomposition presented in RM/T correct? The same question can be asked for AM: is the decomposition of the structures shown in AM correct?

I mean, there are no proofs for these decompositions. These are the most important things, and many scientists have tried to prove them. As you can see, Codd and AM are the exceptions; they were able to publish their solutions without proof.

From the papers it is obvious that these decompositions were done without any proof. How is it possible that such well-known journals have such a scandalous omission? AM uses Codd's surrogates. Note that the paper about AM has the following title: "An Agile Modeling Technique Using the Sixth Normal Form for Structurally and Temporally Evolving Data", despite the fact that 6NF is unusable.

As I wrote in this thread, the surrogate key is not the main thing here. The main thing is the decomposition into binary (atomic) structures. This is crucial not only for database theory but also for some other areas.

Vladimir Odrljin

vldm10

unread,
May 13, 2013, 7:58:56 AM5/13/13
to
Hi Derek,


> -------------
>
> Plagiarism
>
> -------------
>
>
>
> Yes, I understand, from painful experience. So let me start out by saying I am generally on your side, I agree and empathise.
>
>
>
> But I think you need to understand that although there are laws against it, etc, it is sadly very common in the west. Especially in the last ten years, where universities are no longer centres of learning; they are centres of programming humans to be herd animals, and to compete without resolution. I am not saying "deal with it", I am saying, protect yourself.
> -----------


I appreciate your comments. On this occasion I would like to quote the following: “During times of universal deceit, telling the truth becomes a revolutionary act.”
--George Orwell

I think that behind each act there is a concrete man, with his name. Therefore, in this thread I concentrate on facts and names.

1. My paper completely solves the area of database theory which I call the "history of events". This area is actually a general approach to the theory and practice of databases. I called this general approach the general theory of databases, or, abbreviated, "General databases". In my post on February 13, 2013, I wrote about the areas with which General databases deal.

2.
I had published my ideas on my website http://www.dbdesign10.com and in this user group in September 2005. Here is a link to the comp.databases.theory user group, where I presented my ideas in 2005:
http://groups.google.com/group/comp.databases.theory/browse_frm/thread/c79f846beb00cc56# (there are also many other links where the ideas were clearly presented)

Anchor Modeling was published in November 2009, and a fixed version was published in October 2010. Everybody can see these facts on the internet. The internet is a global auditorium; I mean, I have global witnesses.

3.
On April 12, 2013, in this thread, I wrote about some general conditions that must be satisfied. It is about identification, the simple key, and the algorithm for storing and recalling different kinds of objects into and from memory. So the following ProcedureA gives the general conditions:
===============================
ProcedureA
(a)
The main part of solving “historical” and General databases consists of two sub-steps:
1. Constructing an identifier of an entity or relationship.
2. Connecting all changes of states of one entity (or relationship) to the identifier of this entity (or relationship).

I had published this idea on my website http://www.dbdesign10.com and in this user group in 2005 (see section 1. and 2.)

(b)
“We determine the Conceptual Model so that every entity and every relationship has only one attribute, all of whose values are distinct. So this attribute doesn’t have two of the same values. We will call this attribute the Identifier of the state of an entity or relationship. We will denote this attribute by the symbol Ack. All other attributes can have values which are the same for some different members of an entity set or a relationship set. Besides Ack, every entity has an attribute which is the Identifier of the entity or can provide identification of the entity. This Identifier has one value for all the states of one entity or relationship.” (See section 1.1)
===============================
Anchor Modeling uses the schema given in part (a); it uses both of the following sub-steps:
1. Constructing an identifier of an entity. (This is the construction of the simple key.)
2. Connecting all changes of one entity to the identifier of this entity.

This is plagiarism.

The identifier is a more general solution than a surrogate key. The idea of the identifier allows the key to be presented as a simple key instead of a complex key.

ProcedureA is also important for other fields, for example philosophy, logic and semantics. People have always held that a name denotes a certain entity, although this entity has been changed many times. But the following problem has always existed: how is an entity that has changed into another entity, in fact, the same entity? This problem is solved in my paper. An anchor surrogate key, or a surrogate key without ProcedureA, does not help at all.
In my paper I gave the corresponding procedures, constructions and semantics for solving this problem. Note that I introduced the concepts related to these fields and procedures.

ProcedureA caused a lot of problems. At the conceptual level, the decomposition of data structures had not been solved.
How does one explain the binary concept, which consists of the identifier of the entity and one attribute? I introduced intrinsic, extrinsic and universal attributes. This led to an extension of Leibniz's Law of identity. I am writing this because I want to show the complexity of the problems that ProcedureA solves.

ProcedureA has a general character. It solves the following:

i) The surrogate key. (See the sentence “Besides Ack, every entity has an attribute which is the Identifier of the entity or can provide identification of the entity.” in ProcedureA, case (b).)
To be precise, this is not 100% the same as a surrogate key. The surrogate key is a technical solution; it is an index, it is not a theory. The mentioned part of my definition of the key is more general than the surrogate key.
What is the difference between this part of my definition and the surrogate key?
My key is always "visible" and it always uses ProcedureA, which determines the general conditions.
Another important difference between my identifiers and surrogates is that my identifier always belongs to an object. It belongs to a real entity or to an abstract object, or to both the real and the abstract object.
(ii) ProcedureA enables work with abstract objects. My main structure is the state of an entity or relationship. The key of this structure is the identifier of the state of the entity or relationship. ProcedureA is the basic procedure for the construction of History.
As I have already said, the state of an entity is defined as the total knowledge about the entity (or relationship). Total knowledge means that knowledge from more than one man is also included. It is possible that one man says the color of the car is blue while another man claims the color of the car is dark green. In this case, my solution can still maintain History.

The identifier of a state of an entity can identify the entity. A surrogate key can also identify an entity. However, the identifier of a state of an entity is not the key of the entity. An entity and the state of the entity are two very different things.
I wrote in more detail about the algorithm related to memory management in my post from February 25, 2013, in this thread. My state structure has both identifiers: the identifier of the entity and the identifier of the state of the entity. The state is not a fully abstract object; it is the state of a real entity. So the state is a combination of the real object and the abstract object. This situation is resolved by using the "combination" of the appropriate identifiers. In fact, this "combination" enables the storing and recalling of complex objects into and from memory.

So this is the rule: the combination of identifiers solves the "Store" and "Recall" of complex objects into and from memory.

I also wrote about relationship between concept and identification, see the semantic procedure (3.3.3) in my paper “Database design and data model founded on knowledge constructs".

This semantic procedure shows that the so-called Russell's paradox is not correct (see my thread “Does the phrase “Russell’s paradox” should be replaced with another phrase?" posted on this user group). More importantly, it shows that identification is another mind–real world link and that attributes are identifiers.
As I wrote above, the identifier in my model is an attribute (it belongs to a real or abstract object). So the identifier from my ProcedureA is part of a complex construction (procedure). It is not a matter of the system, because it also has a semantic nature (it is a mind–real world link).
In contrast to my solution, a solution that uses the surrogate key has no semantic nature; even more, the surrogate key is invisible to users.
(iii) There are mathematical theories that partially deal with history, for example Modal Logic, which deals with "possible worlds". There is also Situation Theory, etc. However, these theories have a very general nature and use undefined terms such as "world".
I have already mentioned that Microsoft introduces history in its EDM model.
Another theory, called Abstract State Machines, now also introduces History.
(See http://www.w3.org/TR/scxml/ and look for History.)
It is also clear that we now need an adequate mathematical theory for the "History of events".

I write about this here, so that one can understand the scale and importance of this plagiarism.

Note that in working with the History, the surrogate key is not important. ProcedureA is essential.

4. Anchor Modeling uses other essential elements from my work. If someone links these elements together, then he can get a strong theory of history. These are the following elements:
(i) Decomposition into atomic structures

The decomposition into atomic structures in Anchor Modeling was done at the conceptual level without a proof. They just put some “atomic structures” into their paper.
Anchor Modeling makes a transition of these “atomic structures” from the conceptual model into the RM without a proof or explanation. The same “technique” was applied in RM/T.

Then they proved that the corresponding relvars are in 6NF. The subtitle of the paper is: “An Agile Modeling Technique using the Sixth Normal Form for Structurally and Temporally Evolving Data”. The sixth normal form is just a list of desires about the decomposition into atomic structures. It does not provide a solution, procedure or algorithm that allows the decomposition of a relvar into 6NF. I explained that 6NF does not work properly with the following example: if we have a relvar with mutually independent attributes, then 6NF makes no sense.

I want to say clearly that a lot of people have spent their time trying to obtain atomic structures. A lot of people have also tried to construct mappings between data models. In RM/T and AM this was done without proof and published as a scientific paper.
One important step in the construction of atomic structures and of mappings between data models is History.

In reference [19] of the paper Anchor Modeling, all five authors claim that they use the surrogate key ("H(C, D, T) contains, as seen above, an anchor surrogate key”, see page 2). Even more, they use the surrogate key in SQL!

By definition the surrogate key is not visible and never displayed. Everybody can see the definition of the surrogate key in RM/T, on Wikipedia, etc. In my opinion this is very serious. Springer and Data & Knowledge Engineering are well-known, frequently referenced journals used by many scientists. These journals must not allow the use of an erroneous construction, especially since E. Codd used surrogates for the decomposition into binary structures, which is not proven in RM/T. Now we can ask the question: on what was Anchor Modeling founded? Obviously, Anchor Modeling is based on nothing.

Note that the authors of Anchor Modeling first used the term "surrogate key", then "identities" and finally "identifier", which are very different things. It is obvious that these fundamental concepts are not clear to the authors of Anchor Modeling or to the editors of the paper.

However if you think that I am wrong here then let me know.

This paper, in which all five authors claimed that the anchor's values are surrogate keys, disappeared from the list of references. I had started to write about reference [19] (see my thread "The original version"). The authors of Anchor Modeling then returned the paper to the list of references, but in their other paper.

Please note that the decomposition into binary structures is proved in my papers.

(ii) In my model, entities have different sets of data. These are: the knowledge about the entity, knowledge about the attributes and knowledge about the data. In my model it is possible to implement various additional knowledge.
Anchor Modeling also uses different sets of data, but a fixed and limited set of data.
Note that my data model is more general than Anchor Modeling and is not based on undefined terms such as "metadata".
Note that Anchor Modeling did not present work with "metadata" at all. In fact, Anchor Modeling can resolve only very simple "metadata". You can take my example 2.5 from my website http://www.dbdesign10.com and notice that there is no structure in Anchor Modeling that can solve History for this example.

(iii) The date when some information is created and the date when this information ceased to exist.
Anchor Modeling also uses two dates. The authors of Anchor Modeling pay attention to “bitemporal” data. In fact, they did not notice that my model enables n-temporal data.
For example, for one attribute you can have four start dates: StartDate1, when the attribute gets its value in the real world; StartDate2, when the value of the attribute was entered into the database; StartDate3, when the value of the attribute was loaded into the warehouse; StartDate4, when the attribute was transferred to XML. Note that I do not use a data warehouse, because I have only the database, but Anchor Modeling uses a data warehouse; they have a section devoted to it. I mean that in theory it should be n-temporal data instead of bitemporal.

Note that in Anchor Modeling the temporal data and the “metadata” are outside the scope of predicate logic, which is a kind of disaster for the RM. In contrast to Anchor Modeling, my model enables the formalization of tensed propositions.

Vladimir Odrljin

vldm10

unread,
Jun 15, 2013, 3:00:29 PM6/15/13
to
On Monday, May 13, 2013 at 13:58:56 UTC+2, user vldm10 wrote:
Hi Derek,

> > Plagiarism
>
> >
>
> > -------------
>
> >
>
> >
>
> >
>
> > Yes, I understand, from painful experience. So let me start out by saying I am generally on your side, I agree and empathise.
>
> >
>
> >
>
> >
>
> > But I think you need to understand that although there are laws against it, etc, it is sadly very common in the west. Especially in the last ten years, where universities are no longer centres of learning; they are centres of programming humans to be herd animals, and to compete without resolution. I am not saying "deal with it", I am saying, protect yourself.
>
> > -----------

> ===============================
>
> ProcedureA
>
> (a)
>
> The main part of solving “historical” and General databases consists of two sub-steps:
>
> 1. Constructing an identifier of an entity or relationship.
>
> 2. Connecting all changes of states of one entity (or relationship) to the identifier of this entity (or relationship).
>
>
>
> I had published this idea on my website http://www.dbdesign10.com and in this user group in 2005 (see section 1. and 2.)
>
>
>
> (b)
>
> “We determine the Conceptual Model so that every entity and every relationship has only one attribute, all of whose values are distinct. So this attribute doesn’t have two of the same values. We will call this attribute the Identifier of the state of an entity or relationship. We will denote this attribute by the symbol Ack. All other attributes can have values which are the same for some different members of an entity set or a relationship set. Besides Ack, every entity has an attribute which is the Identifier of the entity or can provide identification of the entity. This Identifier has one value for all the states of one entity or relationship.” (See section 1.1)
>
> ===============================
>
> Anchor modeling uses the schema which is given in part (a); it uses both of the following sub-steps:
>
> 1. Constructing an identifier of an entity.(This is about the construction of the simple key)
>
> 2. Connecting all changes of one entity to the identifier of this entity.
>
>
>
> This is plagiarism.
>
>
>
> The identifier is a more general solution than a surrogate key. The idea of the identifier allows presenting key as simple key instead of a complex key.
-----------------------------



In the text written above, it was shown that ProcedureA is crucial for the "History of Events". It was also shown that Anchor Modeling uses ProcedureA for the most important part, which I called “History”.
===========================================
Now, in this post, I will prove that the identifier of an entity which I defined is a much more general and better solution than the surrogate key used by the authors of Anchor Modeling.
===========================================

My main structure, roughly speaking, has the following schema:
IdentifierOfEntity, IdentifierOfState, Knowledge
“Knowledge” is the total knowledge about an entity, held by one or more subjects.
The key is IdentifierOfState. The key is a man's abstraction; the key does not exist in the real entity, and the key is one wholeness.
Note also that IdentifierOfEntity is not the key of my main structure.
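A minimal SQL sketch of this main structure, under my own assumed table and column names, might look as follows:

CREATE TABLE PersonState (
  StateId  varchar(20) PRIMARY KEY,  -- IdentifierOfState: the key of the structure
  PersonId varchar(20) NOT NULL,     -- IdentifierOfEntity: not the key
  -- "Knowledge": what is known about the person in this state,
  -- together with who recorded it and when
  LastName   varchar(50),
  Address    varchar(100),
  RecordedBy varchar(30),
  RecordedAt timestamp
);
-- All states of one person share the same PersonId, while every state has
-- its own StateId; the entity identifier collects the history and the state
-- identifier is the key, exactly as stated above.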

I usually have three groups of knowledge: knowledge about an entity, knowledge about attributes and knowledge about data. However, in my papers I wrote that the design of Knowledge depends on the real-world situation. For one application I can have one kind of total knowledge, and for another application another kind, with maybe 5 groups of knowledge, etc. My solution enables variable design of the Knowledge structures.
Anchor Modeling has a fixed number of structures: knots, anchors, and so on. This db design is nonsense, because business applications are very different, and I am surprised that the editors of Springer and the editors of the Data & Knowledge Engineering journal can accept this limitation on the fundamentals of db design.

(1)
Note that the authors of Anchor Modeling define the anchor key as the surrogate key; see their paper “Analysis of normal forms for anchor models”, which is a reference in both of their papers: “Anchor Modeling” from 2009 and the repaired version of their first paper, from 2010.
===========================================
By definition, the surrogate key is internally generated by the system and never visible to the user or application. See Codd's RM/T paper.
===========================================
(2)
Let me cite the following part of the above mentioned ProcedureA:
“Besides Ack, every entity has an attribute which is the identifier of the entity or can provide identification of the entity.”
This sentence means that the following two cases are possible:
===========================================
CaseA: The identifier of an entity is a part of the entity and it is also a part of the m-entity. As I wrote, in the RM a tuple is an m-entity. So the identifier of the entity is embedded in both entities: in the real entity and in the m-entity.

CaseB: The identifier of an entity belongs only to the corresponding m-entity.
===========================================
In CaseA, I solved the problems related to "industry standard" keys by using the identifier of an entity. In this case Anchor Modeling is heavy nonsense. See my explanation in my post from April 1, 2013 in this thread, cases 6, 7, 8, 9 and 10. We have already discussed the industry-standard keys in detail.

CaseB is the case in which the authors of Anchor Modeling apply the anchor keys. In fact, this case is applicable to a small number of database applications, say 2% of all databases.

In CaseB, the anchor key is partly a plagiarism of the key from my IdentifierOfEntity:
(i) It is a simple key.
(ii) It does not belong to the real-world entity.
(iii) It uses the two steps defined in ProcedureA, case (a).
(iv) The anchor key is not the same as my key, because it is not visible.

However, the authors of AM did not understand the following:
(i) My key belongs to an abstract object, that is, it belongs to the m-entity.
(ii) My key is the identifier of the abstract object. This means that the key “enables” the identification of the real object, but the key is part of the abstract object.
The user can always identify the real-world entity using information from the corresponding abstract object, that is, from the m-entity. That is why, for this case, ProcedureA says "or can provide identification of the entity".
As I wrote in my ProcedureA, the identifier of an entity in this case "can" provide the identification of the entity. This means it does not identify the entity directly; rather, the identifier can do it indirectly.
(iii) The authors of AM do not understand that there is no need for a surrogate key here. I have no need to hide the key, for the following reasons:

a) The identifier of a state of an entity can “join” all binary relations that are related to a state of an entity. My solution is based on the states, so I know what is modeled in my database. In contrast, it seems to me that the authors of "Anchor Modeling" trace the changes, which is a kind of programming rather than a database.

b) Codd used a surrogate key so that he could do some things "behind the scenes". In my opinion, these are the things of which Codd was aware that they were not entirely clear to him.

Codd introduced the invisible key because he did not know how to solve all the problems related to the decomposition of data structures into the corresponding binary structures.

As for my paper, I have solved the problems that are related to General databases. Therefore I have no need to hide the keys. So my conclusion is that the authors of Anchor Modeling did not understand that Codd defined the key as invisible because he did not know how to solve these problems, while databases which can solve “History” do not need a surrogate key.

That is my answer to your question: what is the difference between the surrogate key and the identifier?
An identifier is a well-known (visible) part of an (abstract) object. So my design uses “visible” objects, in contrast to the surrogate key. For example, I can use identifiers in SQL, but nobody can do that with a surrogate key.
More importantly, the identifier is not an "identity". The authors of Anchor Modeling define the anchor key as a symbol which is used as identity!
Let me give you one naive example. All of molecular biology (genetics) works by using receptors (these are small parts of molecules). The receptors are a kind of identifier, and the receptors work by using identification. The receptors are too simple for a philosophical concept such as identity. A virus just needs to catch a receptor; after that it can cause very bad things to a person.

---------------
Conclusion
---------------
CaseA: The surrogate key is nonsense in this case.

CaseB: The surrogate key is nonsense in this case too, because it is invisible. However, if the authors of Anchor Modeling try to cheat, if they say that they do not use an invisible surrogate key in their paper, then their key (for CaseB) is a brutal plagiarism of my identifier of an entity.

Please note that I defined concepts for identifiers. I wrote a whole chapter on the identification and much more which is related to identifiers.

Finally, once again, let me repeat that I consider Codd's contribution to the theory of databases to be one of the most important. However, his paper RM/T is not good.
I did not intend to write about this paper. (I think that the editors and the journal where RM/T was released are obliged to explain some things.)

I write about RM/T because some very bad frauds have occurred that are related to my work and to RM/T.

In my last two posts in this thread I showed two things:
1. History in Anchor Modeling is a plagiarism of my paper (see ProcedureA).
2. My identifier of an entity is a much more general and better solution than any surrogate key.
-------------------
My paper was presented in this user group four years before the publication of Anchor Modeling. My solution was not accepted at all by the people in the many years of discussions. I say this so that one can understand how contrary it was to the then-current image of databases. I also constantly had doubts that I had made a mistake somewhere, despite the fact that I had done a very complex "history" database project. The project worked, and what bothered me was the fact that in the implementation phase the project had no need for programmers or maintenance at all. When someone works with the database, my project takes all the information from him, without his knowledge: his id, the id of the computer, the time, etc., and all this for each datum. Therefore, it was clear who, or which procedure, is responsible for each datum. When the Internet came, I had some new problems. My solution is roughly the following: the database distinguishes two types of online users. The first type is the registered user, who can work with data; the second type is visitors, who only look at the data. From users of the second type, if I need them, I take their data, for example if I want to bore them with advertisements. A data-entry person can work at home, at the office or in a cafe.

Johan Sjöström, MSc, MCAD, Stockholm (Sweden), posted a problem about history and databases on this user group on September 1, 2006. The name of the thread is “Order details table reference live data”. In that thread you can find one example with my ad hoc solution.

Vladimir Odrljin

vldm10

unread,
Jul 25, 2013, 4:15:12 PM7/25/13
to
Hi Derek,

Now I would like to summarize my posts to you regarding the following two points:

1.
In my post from April 1, 2013, I listed eleven major fields that "Anchor Modeling" cannot solve at all. This shows that "Anchor Modeling" is not a solution at all.

2.
All the main ideas of the paper Anchor Modeling can be found in my paper.
My paper was published four years before the Anchor Modeling paper, and it was presented to, and discussed by, many participants in this user group over the years. Examples of these important ideas are listed in this thread and in the thread “The original version" in this user group.

In all these cases, which are very important for database theory, my solution is more general and gives an accurate solution, as opposed to "Anchor Modeling". The authors of Anchor Modeling do not understand the basic ideas that are used in their work. They did not understand the nature of these fundamental ideas and their essential properties.

The authors of Anchor Modeling have made some "cosmetic" changes, so that the similarities with my solution are not obvious at first glance. What did they change?
First, they introduced naval terms. Then they changed the two main things from my model: the identifier of an entity and the identifier of the state of the entity:

(i) Instead of the identifier of an entity, they introduced the "anchor key". In fact, they introduced the "surrogate key". Here, right at the start, the authors of Anchor Modeling show their lack of understanding of database theory. In fact, nearly all databases use keys that are defined by the International Organization for Standardization. These keys are not surrogates; they are originals, and they are externally verifiable. For example, you can verify a VIN by phone, but you cannot verify a surrogate key. The industry-standard keys usually have procedures for verification, decoding, etc. In contrast to the industry-standard keys, the surrogate keys do not have these procedures.

(ii) Instead of the identifier of the state, the authors of Anchor Modeling introduce the key K(C, T). Here C is the "anchor key", that is, a surrogate key, and T is time. Obviously the authors of Anchor Modeling think that "the identifier of the entity + time" can stand in for my identifier of a state. Again, this is a serious misunderstanding of database theory and logic. In this thread I showed that this key is a bad solution.

Note that this T is not defined in the paper "Anchor Modeling". For example, it is not clear what T is. Is T the time when information about the new attribute value was received by the IT department? Is T the time when the attribute got its new value in the real world? Is T the time when the new attribute value was entered into the database? Note that the key is the basic concept in db theory. For more details see Def 5 in the paper “Anchor Modeling”.

I will now briefly summarize all the main ideas of Anchor Modeling, which can also be found in my paper:

(a)
The idea of “history”.
This is a very important idea in my paper. The idea of history is not understood in the paper "Anchor Modeling". For example, Anchor Modeling allows data to be deleted. If you allow deletion of data, then you have no history. Even worse, if you allow deletion of data, then you have no database that can be supported online. In the Internet age this design is extremely bad. These are capital items in the design phase of the database. Obviously, besides the authors of "Anchor Modeling", the editors of their work also do not understand these things. Nevertheless, the authors of Anchor Modeling use all the important constructs related to history from my paper.

(b)
The immutable key.
The immutable key is the identifier of the entity (real or abstract); that is, it belongs to the entity. The identifier is not the "identity" of the entity that is changing, as it is defined in the paper Anchor Modeling (see "Def 1" and "Def 2"). The immutable key is a matter of identification; it is not a matter of identity. The idea of the immutable key is not related to the surrogate key. In this thread I have explained that identity is not a defined term.
In my model the immutable key is a general solution, in contrast to "Anchor Modeling". In my solution this key can be immutable without limit, but it can also be immutable only within a limited period of time, depending on the real database application. This second case is important in practice. I wrote about it in more detail earlier in this thread.

(c)
Bitemporal Data in Anchor Modeling.
My data model provides the general solution; it supports n-temporal data.

(d)
The simple key.
In all my published papers the key is simple. I introduced Simple Form in May 2006. One of the main reasons for the introduction of Simple Form was the construction of the simple key. It was clear to me that this was one of the most important steps in the construction of atomic structures, and also for some other important things.
Viewed from the standpoint of theory, the simple key is incomparably more important than the surrogate key. Obviously, Codd did not realize this, nor did the authors of Anchor Modeling. They thought that a technical solution belongs here. They also did not realize all the other things that are associated with the construction of the simple key.
Of course, if you have a simple key, then you can easily introduce a visible surrogate key. However, these authors use the simple key only in the very restricted form of the surrogate key.

Note that my db design always starts from entities and relationships. The construction of a key of a relationship is predetermined. Entities have only intrinsic attributes.
These predefined conditions significantly improve the structure of entities from the beginning of the db design. Note that only constraints can introduce the need for normal forms.

In 2006 I presented “Simple Form”, see section 4 at http://www.dbdesign10.com :
Relation schema R (K, A1, A2,…, An) is in Simple Form if R satisfies:
R (K, A1, A2, …,An) = R1 (K, A1) join R2 (K, A2), join … join Rn (K, An)
if and only if
1. Key K is simple
2. A1, A2,…, An are mutually independent.
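Read in SQL terms (my own table and column types, purely illustrative), the definition says that a relation in Simple Form splits into binary projections over the simple key K and is recovered by joins:

CREATE TABLE R1 (K integer PRIMARY KEY, A1 varchar(30));
CREATE TABLE R2 (K integer PRIMARY KEY, A2 date);

-- R(K, A1, A2) is then recovered as the join of its binary projections:
CREATE VIEW R AS
SELECT R1.K, R1.A1, R2.A2
FROM R1
JOIN R2 ON R1.K = R2.K;
-- The two stated conditions are what the definition relies on:
-- K must be a simple key, and A1, A2 must be mutually independent.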

Note that a relvar which is in Simple Form is an “all-key” relvar. So the simple key corresponds to the “all-key” in Simple Form.
I use the term relvar because it is the traditional term in the RM. However, I think that “relation schema” is the more appropriate term; “relation schema” is in compliance with Model Theory.

In fact, in the current theory of database design it is not precisely defined what the first steps are. Simple Form defines the first step in db design. Simple Form is introduced for only one reason: it determines and constructs the attributes and the simple key for entities. It separates the construction of an entity from the construction of the corresponding constraints. Note that even an axiom is a constraint.
In contrast to so-called “6NF”, Simple Form completely determines the natural (constraint-free) construction of entities. However, the most important thing in Simple Form is the simple key; I will elaborate this statement later.

(e)
My model is more general than the ERM. I start from states of entities (or relationships), which is essentially more general than entities and relationships. Historized attributes from AM are a special case of the entities.
My model defines the concept. This definition introduces, for the first time, the correct definition of the concept. In my paper "Russell's paradox" has been resolved and dismissed as an erroneous idea. My solution uses Frege's results and adds what is lacking in Frege's theory. It is shown that identification is another mind–real world link. It has also been shown that identification and the concept are related and form an integrated semantic structure.

(f)
Anchor Modeling builds its model on a finite number of structures (knots, ties, ...), and there is no evidence that every business application can be presented through these structures.

(g)
In my paper, I have demonstrated the decomposition of entities (or relationships) into the corresponding atomic structures. This has been proven only by using tools from my general ERM.
The Anchor Modeling paper has no proof that entities and relationships can be decomposed into the corresponding atomic structures. For more details see my post from March 25, 2013 in this thread.

(h)
The authors of Anchor Modeling transfer their ERM "atomic structures" into the RM. They do this without proof. I also wrote about this in my post from March 25, 2013. Note that many well-known scientists deal with this difficult topic, known as schema mapping and data transfer.

(i)
Although Anchor Modeling is at the conceptual level, the authors did not define the concept. In fact, there is not one word that has anything to do with concepts.

(j)
StartDate - EndDate. In Anchor Modeling only the StartDate is used for history structures; the EndDate is "implicitly" defined. In this thread I showed that this technique, based on an "implicitly" defined EndDate, is wrong. Most importantly, this is not a temporal DB. General databases, which include databases that maintain the history of events, are not temporal DBs, as the authors of Anchor Modeling claim. Obviously they do not understand the importance of this model. My data model is event-oriented, which means that my data model is much more general than a temporal DB. For example, I use only two events to describe all operations on the data, and I have shown that these two (existential) events define time. I introduced these two events in 2005. See also my paper from 2009, Database design and data model founded on concept constructs and knowledge constructs, section 7.4.

Note that the authors of Anchor Modeling use the combination StartDate, EndDate at the level of the atomic structure. As I said, the atomic structures were introduced in Anchor Modeling without proof.
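As a hedged illustration of the two existential events (my own column names; the original papers should be consulted for the exact constructs), every recorded fact could simply carry both events:

CREATE TABLE AttributeState (
  StateId   varchar(20) PRIMARY KEY,  -- identifier of the state
  EntityId  varchar(20) NOT NULL,     -- identifier of the entity
  Value     varchar(50),              -- the recorded value
  CreatedAt timestamp NOT NULL,       -- event: this datum came into existence
  CeasedAt  timestamp                 -- event: this datum ceased to exist (NULL while it still holds)
);
-- Nothing is ever deleted: ceasing to exist is itself recorded as an event,
-- so the full history of events is preserved and the two events define the
-- timeline, as argued above.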

(k)
In my model I use the idea of knowledge. My main structure, roughly speaking, has the following schema: IdentifierOfEntity, IdentifierOfState, Knowledge

“Knowledge” is the total knowledge about an entity; therefore the Anchor Modeling data structures - knots, static attributes, etc. - are just a part of my solution.
In my model a subject has this knowledge, and more than one subject may have knowledge about an entity. Here I came to the idea of using flexible structures. For example, in my paper from 2005, example 5, I wrote: "Now we can assign a different number of columns of knowledge to each attribute. These columns can also be different, concerning what they represent."
In contrast to my solution, Anchor Modeling has a fixed and limited number of data structures.
In my model, one can add or delete an arbitrary part of knowledge, depending on the real-world application. In my paper I wrote: if there are no changes of a certain entity, then the identifier of the entity is equal to the identifier of the state of the entity. Note also that my db design enables the construction of databases which can maintain an individual entity that does not belong to an entity set.
Note also that the “knot” structure can cause very bad consequences, for example if someone enters wrong data (accidentally or intentionally).

Knowledge is precisely defined in my papers. Let me mention some main characteristics of Knowledge as introduced in my model:
(i) Knowledge is based on atomic facts; I have a procedure which decomposes entities into atomic structures.
(ii) Knowledge is strictly related to a subject (or subjects). This implies that in my model different subjects can have different knowledge about one attribute.
(iii) I distinguish a factual sentence and the corresponding fact; these are two very different things in my approach to knowledge. The fact in my model is on the level of thoughts; it is related to meaning and awareness, it is subjective, and it links the corresponding data in memory to the subject. So facts and data are very different things.
The factual sentence is just a set of symbols.
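Following the quotation above from my 2005 paper about "columns of knowledge", here is a hedged sketch (my own column names) of one attribute with its attached knowledge columns:

CREATE TABLE CarColorKnowledge (
  StateId   varchar(20) PRIMARY KEY,  -- identifier of the state
  CarId     varchar(20) NOT NULL,     -- identifier of the car
  Color     varchar(20),              -- the attribute value
  -- columns of knowledge about this value:
  Subject   varchar(50),              -- who claims it (different subjects may disagree)
  ClaimedAt timestamp,                -- when the subject made the claim
  Source    varchar(100)              -- how the subject came to know it
);
-- Another application may attach more, fewer, or different knowledge columns
-- to each attribute; the set of structures is not fixed in advance, unlike
-- the fixed knots, anchors and ties criticized above.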

(l)
Meta data.
Metadata is defined as "data about data", which is a kind of circular definition. On the other hand, "metadata" can have its own "metadata", so in that case we have "meta-metadata", which is a really unusual db design. In my paper I use the term knowledge, which I explained in the text above. Note that the theoretical case "meta ... metadata" is possible.
I wrote in this thread that the authors of Anchor Modeling did not define metadata in their db schemas at all. That is because they did not solve this problem. In Anchor Modeling there is neither any theoretical text nor an example about “metadata”. Note that “metadata” is among the most important constructs in Anchor Modeling.

As I wrote above, I use "knowledge" in my data model. Knowledge is based on atomic facts. Facts that are stored in memory become data, and so the data become permanent.
(For more details about facts, knowledge and permanence, see section 3 of my paper "Database design and data model founded on concepts and knowledge constructs". About facts, also see section 3.5 from January 2006 at my website http://www.dbdesign10.com and section 2 of my paper "Semantic databases and semantic machines".)

(m)
Surrogate key.
The name “surrogate key” is very useful; this name carries the influence of the authority of Codd's name. E. Codd defined the surrogate key as invisible. E. Codd is the biggest authority for the RM, therefore people use E. Codd's definition of the surrogate key. People use Codd's definition of the surrogate key especially when the surrogate key is part of binary relations.
Today you can find Codd's definition on Wikipedia, but it is a false definition. On Wikipedia someone wrote that by Codd's definition the surrogate key is visible. This is not true. In my opinion, someone committed this fraud intentionally. Obviously, this person is trying to fix this huge mistake of Codd's without using scientific means. This person claims that Codd defined the surrogate key as visible in 1976. To my knowledge, Codd did not publish a single paper in 1976. You can see Codd's definition of the invisible surrogate key in his paper RM/T.

The authors of Anchor Modeling defined their anchor key as the surrogate key. You can find this in their paper Anchor Modeling, reference [19] "Analysis of normal forms for anchor models, http://www.anchormodeling.com/tiedostot/6nf.pdf". This reference has disappeared, so you can no longer find it at this address. After my writings about this paper, it appeared in a fixed version of Anchor Modeling. I have the original version of the mentioned article.

So applying the surrogate key in Anchor Modeling is nonsense, because by that definition the surrogate key is invisible.

If the authors of Anchor Modeling apply some other definition, then they should say which version they use, because the anchor key is a basic term in their paper. This is especially so because there is another definition of the surrogate key, in which the surrogate key is invisible; see Wieringa and De Jonge (1991).

If they use the surrogate key as a visible key, then it is plagiarism of part of my definition of the identifier of an entity.

My definition of the identifier of an entity is divided into two parts, as I wrote in my post from June 15, 2013. I want to say that this definition is not simple; in fact the matter is very complex and must be examined case by case. The authors of Anchor Modeling did not notice this important fact at all. The same holds for Codd and RM/T.

There is one bigger mistake regarding the surrogate key. The surrogate key is a kind of technical solution. However, a theoretical solution is much more important; it is more general and it fits into a theory. I was looking for something theoretical. In May 2006 I introduced the Simple Form (see section (d) in this post, which is about the Simple Form). The Simple Form has the simple key, and that is it; the simple key can solve everything. If you have the simple key, then instead of this simple key you can put some "industry-standard" key or a surrogate key, etc.

Note that Anchor Modeling assigns the surrogate key to an entity unconditionally. They only wrote that the surrogate key is the "identity" of an entity. Note that AM is on the ERM level.
Now, there are many questions. Let me mention two of them:
1. What if the entity's corresponding relations have anomalies?
Note that in the paper Anchor Modeling they do not have a procedure which enables mappings from ERM to RM.
2. If they can somehow transfer their structures from AM (that is, ERM) to RM, and if they do some lossless decomposition, then no one knows what these identities (surrogates) represent. I mean, which entity do these surrogates represent after the decompositions? (Note that the same question can be applied to RM/T.)
This part should be explained by the Springer Company.


Definition of The Identifier of the entity
===========================
Here is this definition from year 2005 that I posted on this thread on June 15, 2013:
“Besides Ack, every entity has an attribute which is the Identifier of the entity or can provide identification of the entity. This Identifier has one value for all the states of one entity or relationship.”

(See section 1.1 from my paper at my website http://www.dbdesign10.com )
Look for the thread "Database design, Keys and some other things", from
September 2005, on this user group, and for the corresponding discussion.
===========================


As I have already said, the definition that defines the identifier of the entity is complex; there are several cases, and the surrogate key is an irrelevant case.
The first part of this definition I called CaseA in my post from June 15, 2013, and I explained it in detail.

Now I am going to analyze the second part of this definition, which has the following text: "or can provide identification of the entity." I called it CaseB. "Anchor Modeling" is based on this CaseB. As I already wrote, surrogates can support only a small part of real-world applications. Because of this, I did not devote a lot of attention to surrogate keys. This CaseB part is related to a small number of real-world business applications. This part can be divided into sub-cases; here I chose only a portion of all the cases that are of type CaseB. These are the following cases:

CaseB1. This case is appropriate for my Simple Form. The construction of the simple key should be done by using the values from the attributes which form the primary key. The key can be constructed by concatenating the corresponding values. (See 5.1.(iii) in my paper "Semantic database and semantic machines".) Note that this simple key should be the immutable key in my General database model. Therefore one can apply the immutable key starting from an arbitrary state and replace this simple key with the immutable key. The simple key becomes the immutable key, and the states of the entity are determined by applying the corresponding identifier of the state.
Even more, it is possible to set "certain" constraints on the corresponding attributes, but this is a separate field. So this case shows how one can start from a Simple database and then switch to the General database, and vice versa.
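A minimal sketch of CaseB1 in Python (the attribute names are hypothetical): the simple key is built by concatenating the values of the primary-key attributes, and from some state on it can be frozen and used as the immutable identifier, while the states are told apart by the identifier of the state.

# Build the simple key by concatenating the values of the primary-key attributes.
def simple_key(row, pk_attributes, sep="|"):
    return sep.join(str(row[a]) for a in pk_attributes)

car = {"make": "Honda", "model": "Civic", "vin": "1HGEM21991L000001"}
key = simple_key(car, ["make", "model", "vin"])   # 'Honda|Civic|1HGEM21991L000001'

# From some state on, the simple key is frozen and used as the immutable
# identifier of the entity; the states are distinguished by the state identifier.
entity = {"entity_id": key}
states = [
    {"entity_id": key, "state_id": key + "#1", "color": "red"},
    {"entity_id": key, "state_id": key + "#2", "color": "blue"},
]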

========================
Note that this procedure is better and more general than the surrogate key from Anchor Modeling. The procedure can solve all the cases which can be solved by using the surrogate key.

As I have already said, the basic idea is the simple key. "Simple Key" is the general theoretical idea and solution. That is what the authors of Anchor Modeling did not understand.
========================

CaseB2. In this case the start is again the Simple Form, but now the values assigned to the simple key come from an arbitrary domain.

CaseB3. In this case, we can apply the visible surrogate key.

CaseB4. In this case, we can apply indexes as the starting values for the immutable keys.
Note that in this case we can also use some other technique that is proven in the maintenance of keys.

CaseB5. In this case, we can apply my structure "sequence" as the starting point. This structure was defined in section 5.12 of my paper "Semantic databases and semantic machines", at http://www.dbdesign.com. Note that this structure can maintain entities whose identifiers are immutable for a limited period of time. In fact, this is the idea of changing the identity of an entity. Note that the authors of Anchor Modeling define the immutable key as "eternal".
Sequence is a powerful structure and can support any kind of identifier of an entity (industry-standard keys, as well as visible surrogate keys which use different domains, etc.).
-----------------------------------------
CaseC. This is the case which deals with complex problems. Note that my solution is based on states. Technically speaking, history is solved by using two identifiers: the identifiers of states and the identifiers of entities. However, if you pay more attention to this case, then you can see that this is a theoretical approach with some complex cases. The states are complex db structures. See my posts from February 25, 2013 and from May 13, 2013 in this thread.



==========================
These B-cases show that the visible surrogate key is just one of many technical solutions. Note that Anchor Modeling could not solve these problems even with the visible surrogate key. In this post I show that all the important ideas from "Anchor Modeling" exist in my papers in a more general form. My ideas and solutions were published in 2005, and the paper Anchor Modeling in 2009. The corrected version of the paper Anchor Modeling was published in 2010.
When I started working in this field, it had not yet been crystallized which are the main things in this area. Moreover, at that time it was not known which ideas are in play. You saw how far Codd was from the solution for the decomposition into atomic structures.
==========================



5. Identification
I devoted special attention to the process of identification. I started with this field in 2007; see section 7 at http://www.dbdesign10.com. I introduced new results related to identification in my papers from 2008 and 2012.

(i) CASE - SIMPLE DATABASE
In my opinion identification is fundamental for database theory. I will mention the aspect of identification that is linked to identifiers and concepts. I would also like to say that the surrogate key has a specific solution on the level of the corresponding concept. The identification process is recursively defined in my work. First, I defined the concept and identification for properties, then for entities, relationships, and finally for states. An entity is determined by its properties, a relationship is determined by entities and other relationships, and states are determined by entities and relationships. Note that in my papers a property is a concept, while an attribute is an instance of a property. For example, color is the concept, while red is the attribute.
I will explain the concept and identification only for properties. It is determined by (3.3.3) in section 3.3 of my paper "Database design and data model founded on concept and knowledge constructs", at http://www.dbdesign11.com :
==================
S (the m-attribute, the concept of the property) = T iff the m-attribute matches the entity’s attribute
==================
1. On the left side of this equivalence, we have the relation: "satisfies this concept".
2. On the right side of this equivalence, we have the identification of the attribute.

Note that in section 3.2.1(i) of the mentioned paper, the following is stated: “The m-attribute is created by the match between an entity’s attribute and the corresponding attribute in our mind.”

The identification of entities goes similarly to that of attributes.
In my opinion, the above-mentioned (3.3.3) "crashes" Russell's paradox. For more details about this see my thread: “Does the phrase “Russell’s paradox " should be replaced with another phrase?”
-------------------------------
(ii) CASE – GENERAL DATABASE. This includes databases that maintain history. Here I am using my main data structure that fully models states. This structure has the following scheme: ConceptStateName (P, E, A1… An, Kp1…Knr, Dp1,…,Dns)
where P is the concept of the identifier of a state of the entity (or relationship);
E is the concept of the identifier of the entity;
A1,…,An are concepts of the properties of an entity (or relationship);
Each property, including E and P, can have different sets of knowledge K associated to them, defined in 3.6 – 3.9. Thus:
P has knowledge Kp1, Kp2… Kpi;
E has knowledge Ke1, Ke2… Kej;
A1 has knowledge K11, K12…K1k;
….
An has knowledge Kn1, Kn2… Knr.
Knowledge Dp1,…,Dns is defined in 3.8.

For more details see sections 4.2.5 and 4.2.6 of my paper "Database design and data model founded on concept and knowledge constructs", at http://www.dbdesign11.com


====================
Conclusion:
When doing a database design, I usually start from the Simple Form and make schemas for the concepts of the entities. So again, I emphasize that there are two kinds of identifiers of entities:
The first group is where the identifier of the entity belongs to the entity and to the m-entity. This is CaseA, discussed above.
The second group is where the identifier of the entity belongs only to the m-entity. These are the CaseBs.
The authors of Anchor Modeling did not realize that there are two groups of identifiers. E. Codd also did not understand this fact in RM/T.
On the level of concepts, the two groups differ in the phase of identifying the attribute; this is the right side of the above (3.3.3).

In the above-mentioned CaseC I use the mentioned Concept for the General database. Note that in all cases I have the Identifier of the Entity, which enables the simple key. The simple key implies the surrogate keys and much more.
=====================


Vladimir Odrljin

vldm10

unread,
Sep 10, 2013, 8:54:07 AM9/10/13
to
Hi Derek,

1.
In my last post (posted on Jul 25, 2013), I showed that my solution is a general solution, and that the surrogate key from Anchor Modeling is just a special case of my solution. In addition, the surrogate key is a technical solution; it is not a theoretical solution.
It was also shown that the surrogate key is irrelevant, since it can be applied in only a small number of business applications, say 2% of real-world business applications.

2.
In my papers, the construction of the simple key is precisely determined on the theoretical level. There are the following two important cases:

(a) SIMPLE DATABASES – these are databases which maintain only the current state.
In 2006 I presented “Simple Form”, see section 4 at http://www.dbdesign10.com . The simple key is defined in the Simple Form. The simple key was realized as the identifier of an entity (or as the identifier of a relationship).
Introducing the Simple Form enables the following important constructions (a small sketch follows this list):
(i) It separates the construction of an entity from the construction of the entity's constraints. The current theory does not separate the construction of the entity's constraints from the construction of the corresponding entity.
(ii) It enables the construction of an entity by using the simple identifiers and intrinsic attributes.
(iii) It enables a construction of the simple key.
(iv) The Simple Form enables the construction of the atomic concepts, predicates and propositions.
(v) It enables the construction of constraints on the level of the atomic structures.
(vi) The simple key is also important in the process of the identification of a plurality and the individuals that belong to this plurality.
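As announced above, here is a small sketch (with hypothetical names) of points (i) and (ii): the entity is built only from a simple identifier and intrinsic attributes, and the constraints are a separate construction that is applied to the entity afterwards.

# The entity: a simple identifier plus intrinsic attributes only.
entity = {"id": "E1", "vin": "1HGEM21991L000001", "color": "red", "weight": 1200}

# Constraints are defined apart from the entity and applied when needed.
constraints = [
    lambda e: len(e["vin"]) == 17,      # VIN format constraint
    lambda e: e["weight"] > 0,          # weight must be positive
]

assert all(check(entity) for check in constraints)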

(b) GENERAL DATABASES - databases that maintain history and other important things.
Here I introduced identifiers. The identifier of a state is the main construct in my db design and theory. Here I will mention only two things that this identifier enables. The identifier of states is a powerful tool that determines the decomposition of the most complex data structures into the corresponding atomic structures. The identifier of states determines the mapping between my model and the ER model and the RM, and vice versa. The identifier of states determines the mapping between the ERM and the RM, and vice versa. The identifier of states in fact determines two kinds of mappings: the schema mapping between data models and the mapping of the corresponding instances. This solution is of a general character.

In 2005 I presented on this user group this solution that enables the decomposition of general data structures into the corresponding atomic data structures. I call it the General Form. In 2008 I added the concepts to my solution, and because of that I named it the "Schema of the Concept of a State"; see sections 4.2.5 and 4.2.6 in my paper from 2008 at http://www.dbdesign11.com . The General Form, i.e. the schema of the concept of a state of an entity, takes the following form:


(Left side)
ConceptStateName (P, E, A1… An, Kp1… Knr, Dp1,…,Dns)

(Right side)
where P is the concept of the identifier of a state of the entity (or relationship); E is the concept of the identifier of the entity; A1,…,An are concepts of the properties of an entity (or relationship); Each property, including E and P, can have different sets of knowledge K associated to them and defined in 3.6 – 3.9. Thus:
P has knowledge Kp1, Kp2… Kpi;
E has knowledge Ke1, Ke2… Kej;
A1 has knowledge K11, K12…K1k;
….
An has knowledge Kn1, Kn2… Knr.

Knowledge Dp1,…,Dns is defined in 3.8.

Here is (Left) = (Right). The symbol "=" denotes that schemas (Left) and (Right) determine and represent the same states. Note that the identifier of a state, that is P, uniquely determines each component of the ConceptStateName. I will point out again the formula (3.3.3) from my paper:

S (the m-attribute, the concept of the property) = T iff the m-attribute matches
the entity’s attribute. … (3.3.3)

Given that P, E, A1, A2, ..., An are identifiers, and considering (3.3.3), we can construct the schemas of the following relationships: (P, E), (P, A1), ..., (P, An).
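A minimal sketch of this decomposition (my own names, one state with two attributes): every binary atomic structure is keyed by the identifier of the state P, and because P determines every component, the state can be reassembled without loss.

state = {"P": "S42", "E": "E17", "A1": "red", "A2": 12.5}

def decompose(state):
    # Produce the binary atomic structures (P, E), (P, A1), ..., (P, An).
    p = state["P"]
    return [(p, name, value) for name, value in state.items() if name != "P"]

atomic = decompose(state)
# [('S42', 'E', 'E17'), ('S42', 'A1', 'red'), ('S42', 'A2', 12.5)]

def recompose(atomic_rows):
    # P uniquely determines every component, so the state is rebuilt without loss.
    rebuilt = {"P": atomic_rows[0][0]}
    for p, name, value in atomic_rows:
        rebuilt[name] = value
    return rebuilt

assert recompose(atomic) == state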

So (Left) = (Right) is the general solution. I named this solution the General Form. I will mention a few notes for this occasion:
(i) This General Form is based on concepts of states, extensions, structured knowledge and (3.3.3). The General Form is not a schema of an entity or a relation. The General Form is not about a predicate, although we can construct the predicate that corresponds to this schema. The General Form is a set of semantic constructs and procedures.
We cannot say that states and concepts are real-world entities. So my model is a strong extension of the ER model. The identifier of a state is an identifier of a subject's subjective knowledge about the state of an entity.
(ii) The above-mentioned knowledge is divided into 'primitive knowledge about an entity', knowledge about attributes, and knowledge about data. (See sections 3.6 – 3.9 in my paper from 2008.)
Primitive knowledge about the entity is related to the construction of concepts, i.e. it is related to the left side of (3.3.3).
Knowledge about attributes is related to the right side of (3.3.3).

On this user group I presented the example which belongs to Hilary Putnam: “Lemon is a lemon, even when it is green”. The French philosopher Maurice Merleau-Ponty has another interesting example: when we look at a cherry tree at night, we can conclude that the cherries are black; however, we know they are red. I solve these two examples by applying the right side of (3.3.3). So, in order to identify an entity's attribute, we use our knowledge about the entity's attribute.
Knowledge about data is knowledge about how data are stored.

(iii) I wrote in my paper that knowledge is not limited only to the above-mentioned knowledge. We can expand or restrict knowledge, depending on the real-world application.
So my solution is not limited to an unproven and pre-limited set of ties, knots and anchors. My solution is flexible, i.e. my solution is of a general character.

(iv) Note that I do not use the undefined term "metadata." I use the aforementioned knowledge, which is precisely defined.

In both cases (a) and (b), the construction of the simple key is defined. Note that both the Simple Form and the General Form are determined on the level of concepts.
=======================
So my general solution has a precisely defined structure of the simple key. If we have a construction for the simple key on the theoretical level, then we can easily assign the surrogate key to the simple key (the same holds for all technical solutions which maintain keys on the db level; see also my definition of the identifier of an entity at http://www.dbdesign10.com, section 1.1).
In my previous post I explained that my definition of the key covers technical cases, including the surrogate key.

In this section I explained what the difference between a theoretical and a technical solution is. I also explained what the difference between a general and a special solution is. These are the basic things. The surrogate key does not belong to the fundamentals of database theory.
Very roughly speaking, the technique of my model consists of the following two steps: I construct the entities using the Simple Form, and I use the "General Form" for the construction of atomic structures to model "General Databases".

3.
In my last post the main attention was paid to the fundamental problem, which is the decomposition of a data structure into atomic structures. Please note that these are fundamental problems that were not solved in database theory. In accordance with this, note the following major errors in the paper Anchor Modeling:

(a) Decomposition of the data structures at the level of the Anchor model was done without proof.
(b) Decomposition of the data structures at the level of the ERM was done without proof.
(c) Decomposition of the data structures at the level of the RM was done without proof.
(d) Mapping of data structures from the ERM into the RM was done without proof.
(e) Mapping of data structures from AM into the ERM and RM was done without proof.
(f) The authors of Anchor Modeling refer to 6NF. Moreover, they put 6NF in the title of their work. However, nowhere in this paper is there a single word about 6NF. Note that 6NF has no significance for the theory, because it does not provide any procedure or anything effective. The name 6NF is an unofficial abbreviation for the following text: "Relvar R is in sixth normal form, 6nf, if and only if it can't be nonloss decomposed at all."
The paper Anchor Modeling has many inaccuracies and even nonsense; I wrote about many of them, and about some I did not write. The title of this article contains amazing claims. The title "Anchor Modeling An agile modeling technique using the sixth normal form for structurally and temporally evolving data" has the following part: "technique for structurally evolving data". This part shows a misunderstanding of basic things, because "structurally evolving data" cannot be solved in Anchor Modeling.

=====================================
Conclusion: The paper Anchor Modeling won the first prize at "Conceptual Modeling - ER 2009, 28th International Conference on Conceptual Modeling, Gramado, Brazil, November 9-12, 2009".
I described the scandalous proceedings in connection with this work on this user group, in my thread "The original version". Please note that this first paper on Anchor Modeling, besides the above-mentioned cases from (a) to (f), also has other flaws, which I wrote about in this thread and in my thread "The original version".
=====================================

4.
I have explained in detail many of the errors from Anchor Modeling in my thread "The original version". In May 2010 I started the thread "The original version", and the authors of "Anchor Modeling" submitted their second paper in September. In the second paper they corrected some major theoretical errors from their first paper, which I had presented in detail in the thread "The original version". For the errors in the first paper I thoroughly and publicly explained how it should be done, and the authors of "Anchor Modeling" corrected them in their new paper. The corrected version of Anchor Modeling was published in Data & Knowledge Engineering.

Here are some examples of these corrections done in the corrected version of Anchor Modeling, which are of fundamental character:
(i) Identifier
The authors of "Anchor Modeling" use term "key" and the term "identity." In my model, I have introduced a chapter on the identification and concept of identifiers which I explained in this thread. I have defined the identification, as another mind - the real world link. The RM theory exclusively uses the idea of "natural key" instead of identifiers. I also explained that the term "identity" in db theory is nonsense. This term is not defined even in philosophy. In this thread I wrote that attributes are identifiers - this is based on (3.3.3). I also wrote that identification of complex objects is based on the combination of the corresponding identifiers.
In the paper Anchor Modeling I run command "find" for the term "identifier". I got no response. Word identifier has not once mentioned in this paper.

However, in the second, corrected paper of "Anchor Modeling", the term identifier is introduced. Note that the identifier is one of the very important things in my model, and I explained it in my threads on this group. Note that I am using the concept of an identifier in both the Simple Form and the General Form. Both forms are mentioned in this post as the general solution. Note that these authors used the concept of the visible surrogate key in the original version of Anchor Modeling.

(ii) Events
In my critique of Anchor Modeling, I wrote that my model is not a temporal database but is event-oriented. Moreover, I think I have given a new, precise definition of time, using only two events (see my paper at http://www.dbdesign10.com , section 1). The authors of Anchor Modeling claim that their model is a temporal db. In the paper Anchor Modeling, I ran the "find" command for the term "event" and got no hits. The word event is not mentioned once in this paper in the sense that their model is event-oriented.

However, in the second corrected paper, these authors of Anchor Modeling introduced events. In my opinion the events are defined very badly in Anchor Modeling.

(iii) Concepts
My model is based on concepts. In the paper Anchor Modeling there is no definition of the concept at all. Keep in mind that their model is presented as conceptual and was presented at the conference named "28th International Conference on Conceptual Modeling". However, the main things here are not consistent. For example, on page 2 they say "A anchor model is a relational database schema." Immediately following this sentence, also on page 2, they say "An anchor is a set of entities." Further, on page 3, it says:
Def 2. An Anchor A (C) is a table with one column. The domain of C is ID. The primary key is C.
This means that this is an ER model, i.e. this is a conceptual model.

In the corrected work on Anchor Modeling, the authors introduce some new terms. These terms are very poorly done, and obviously the authors do not understand the important things. First of all, I have in mind the semantics of a given data model.

My model has clear semantics. Unlike the Entity-Relationship model, my model has concepts, and these concepts have meaning in the real world. The meaning in my model is precisely defined. Note that although the ER model claims to have stronger semantics than the RM, this is not true. The ER model has entities but does not have semantics. The RM has a primitive semantics, which is implemented through the corresponding generated spoken language.

(iv) The identifiers of states
In my critique of paper Anchor Modeling, I wrote that their technique is actually a log of changes; it is more like programming and less like a database. I mean, I have a model that models something. My solution models the states of entities and relationships.
In the new, corrected version of the Anchor Modeling, the authors introduce the identifier of a state. As I have written in this post, the identifier of a state contains the main ideas of my model. It is new and one of the main ideas of my work which contains a number of other important ideas. These authors have plagiarized most complex case for the identifier of states; it is the case for the states of relationships (ties).

On this user group I explained that without the states and without the identifiers of states, the most important things cannot be done. For example, the decomposition of a relationship into atomic structures cannot be done. The mapping between two data models also cannot be done. Without the identifier of the state, many other things are not possible.

In the last five years I wrote in detail on this user group about the identifier of a state. I introduced the identifier of a state in 2005, and the authors of Anchor Modeling plagiarized this identifier in 2010.

(v) Decompositions and mappings
In this post I wrote that the authors of Anchor Modeling widely use mappings between data models, and they also use the decomposition of data structures, all of this without any proof. See section 3 of this post. Please note that many eminent scientists work on these issues. In the corrected version of Anchor Modeling, the authors try unsuccessfully to resolve this problem.
I have solved this problem completely. As far as I know, no one else has solved this problem. In connection with the mapping between the data models, I distinguish the following two mappings:
(i) Schema mapping
(ii) The mapping between instances of these schemas
Note that, both of these mappings are completely determined by identifiers of states.

There is one more important issue here, which is related to meaning. The question is whether a sentence from one model means the same as the sentence from another (mapped) model. This problem is not even noticed in Anchor Modeling.
I have solved this problem in a broader context; see sections 1, 2, and 3 of my paper "Semantic databases and semantic machines" at http://www.dbdesign11.com .

As I already wrote in this thread, I complained to Peter Chen, who is editor in chief of the journal Data & Knowledge Engineering, the journal in which the corrected version of Anchor Modeling was published. He never replied to my complaint.

In addition to these aforementioned theoretical failures, the "Anchor Model" cannot solve many of the specific problems it is intended to solve. See my post from April 1, 2013 in this thread, where I showed real problems from 11 areas that Anchor Modeling cannot resolve.

Vladimir Odrljin

vldm10

unread,
Sep 16, 2013, 12:31:48 PM9/16/13
to
On Tuesday, September 10, 2013 at 14:54:07 UTC+2, user vldm10 wrote:
> Hi Derek,


> (a) SIMPLE DATABASES – these are databases which maintain only the current state.
>
> In 2006 I presented “Simple Form”, see section 4 at http://www.dbdesign10.com . The simple key is defined in the Simple Form. The simple key was realized as the identifier of an entity (or as the identifier of a relationship).
>


This simple key is the theoretical basis for the identifier of an entity. Note that this simple key is also the theoretical basis for the surrogate key. I have explained that the identifier of an entity is the basis for all kinds of simple keys. The simple key, which is on the database level, does not always correspond to one attribute of the corresponding entity from the real world.
The design of a database depends on the db designer. For example, he can maintain history for just one entity; in this case he can use this simple key as the immutable identifier of the entity (from some point in time) and combine it with the identifier of a state in order to maintain a history, the general database approach, etc. Note that this is logical db design, and therefore one can use only a technical solution that is in compliance with the logical database design.
Note also that a design with the surrogate key has its limitations. For some of them see my post from April 1, 2013, where I explained, using examples from 11 areas, that surrogate keys cannot even give a solution.
Let me explain the most important case in more detail. Within the last 20 years there has been a strong development of the use of keys that are a global industry standard. Today there is almost no serious business application without the use of these keys. As far as I know, this phenomenon is not sufficiently explained theoretically. In my paper from 2008, I introduced a clear division between m-objects and real objects. M-objects are objects in memory; they are abstract objects that can be recorded.
What the authors of Anchor Modeling and RM/T did not realize is that the surrogate keys exist only as m-attributes, while the industry-standard keys exist as m-attributes and also as real attributes. Therefore industry-standard keys are not surrogates.
==========================
I have explained in this thread, how complex objects are stored in memory, and how they are called from memory using a combination of identifiers of other objects.
I also explained that the identifier of a complex object is built using a combination of identifiers of the objects that make up the complex object. In my paper from 2008 I defined identifiers recursively. First, I define attributes as identifiers by the formula (3.3.3); see my paper from 2008. The identifier of an m-entity is constructed using the identifiers of the corresponding attributes. Then I apply formula (3.3.3) to the m-entity. The procedures for relations and states are similar. Note that the identifiers of complex objects are derived. However, identifiers for attributes are not derived; they are given.

Note that industry-standard keys as well as the surrogate keys are the identifiers of the m-entities. In contrast to the industry-standard keys, note that the surrogate keys can not be the identifiers of the real entities.
==============================
There is one other thing about identifiers, and that is their use by people who have little knowledge about theory. Therefore, professionals have become irritated with the use of identifiers by the amateurs.
Note that industry-standard keys are much better than the surrogate keys. If there is an industry standard key in the data structure, then there is no reason to use a surrogate key. If someone uses a surrogate key in addition to the existing industry-standard key, then it would be the same as if someone had killed a steer to make a steak for lunch, which is about half a pound.
Note that in my db design the identifier of an entity does not have to be constructed as a separate single-column data structure. The identifier of an entity may have associated knowledge. When we talk about the design of surrogate keys, my solution is more general. For example, in the case of the surrogate key I can join to it the date when it was created, who (or which procedure) created it, etc.
If you must have an identifier of an entity as a column, then do so. But forcing a strict one-column structure into every design is a serious error, because this is a matter of db design. Note that Anchor Modeling and RM/T have the surrogate key strictly as a one-column structure.
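A small sketch of the contrast (all names are hypothetical): a bare one-column surrogate versus an identifier of an entity that carries its own knowledge, such as when and by which procedure it was created.

# The bare one-column style (as in Anchor Modeling / RM/T).
bare_surrogate = 100234

# An identifier with associated knowledge, as described above.
identifier_with_knowledge = {
    "value": 100234,
    "knowledge": {
        "created_on": "2008-03-01",
        "created_by": "load_procedure_7",   # hypothetical procedure name
    },
}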



> (b) GENERAL DATABASES - databases that maintain history and other important things.
>
> Here I introduced identifiers. The identifier of a state is the main construct in my db design and theory. Here I will mention only two things that this identifier enables. The identifier of states is a powerful tool that determines the decomposition of the most complex data structures into the corresponding atomic structures. The identifier of states determines the mapping between my model and the ER model and the RM, and vice versa. The identifier of states determines the mapping between the ERM and the RM, and vice versa. The identifier of states in fact determines two kinds of mappings: the schema mapping between data models and the mapping of the corresponding instances. This solution is of a general character.
>

Note that the identifier of a state is not an “anchor”, it is not the identifier of an entity.


> (iv) The identifiers of states
>
> In my critique of paper Anchor Modeling, I wrote that their technique is actually a log of changes; it is more like programming and less like a database. I mean, I have a model that models something. My solution models the states of entities and relationships.
>
> In the new, corrected version of the Anchor Modeling, the authors introduce the identifier of a state. As I have written in this post, the identifier of a state contains the main ideas of my model. It is new and one of the main ideas of my work which contains a number of other important ideas. These authors have plagiarized most complex case for the identifier of states; it is the case for the states of relationships (ties).
>
> On this user group I explained that without the states and without the identifiers of states, the most important things cannot be done. For example, the decomposition of a relationship into atomic structures cannot be done. The mapping between two data models also cannot be done. Without the identifier of the state, many other things are not possible.
>
> In the last five years I wrote in detail on this user group about the identifier of a state. I introduced the identifier of a state in 2005, and the authors of Anchor Modeling plagiarized this identifier in 2010.
>


The following Definition 16 from the corrected version of Anchor Modeling is plagiarism:
--------------------------------------
Definition 16 (identifier). Let T be a (static, historized, knoted or knotted historized) tie. An identifier for T is a subset of T containing at least one anchor role. Furthermore, if T is historized or knotted historized tie, where t is the time type in T, every identifier for T must contain t.
---------------------------------------
The “ties” from Anchor Modeling are a special case of the structures from my papers. Note that the identifier defined in Definition 16 of the corrected paper on Anchor Modeling is a special case of my identifier of a state. Note that every time something is changed in the data structure from Definition 16, the identifier from Definition 16 is also changed; therefore this is the identifier of states. Note that I introduced the identifier of a state in 2005.
Note that without the identifier of a state there is no proof for the decomposition into atomic structures and there is no mapping between data models.
I mean, the first paper on Anchor Modeling started from unproved structures and unproved mappings. They just put in the title of their first paper that they are using 6NF.

After my public explanations of these fundamental errors from the first paper on Anchor Modeling, which I presented on this user group, Definition 16 was published in the corrected version of Anchor Modeling. As I previously wrote, I complained to the chief editor Peter Chen, but I never got an answer.

On September 25, 2010, I received an email from one of the authors, Lars Rönnbäck. Here is a part of the email:
"... Some of the points you address are clarified in our upcoming paper.
Unfortunately we have not read yours. Have you got a URL to where we
might find it? Some of your critique is also based on
misunderstandings. I am sorry if we were not precise enough in our
presentation to not leave room for interpretation."
I wrote about this email on the user group, and explained that I express my opinion about database theory only publicly, on the corresponding user groups.
I wrote about this email because this thread may seem unrealistic. However, this topic and matter is extremely important for database theory.

================================================

The corrected version of Anchor Modeling has the title "Anchor Modeling - Agile information modeling in evolving data environment." In section 6, page 12, the authors explained the idea of "evolving" and wrote: "This is solved by adding an attribute PR_DUR_Program_Duration to the PR_Program anchor ...". This means Anchor Modeling has types of entities that have a different number of attributes but the same concept. It also raises the question: what does the corresponding "relvar" for this concept look like?

Vladimir Odrljin

vldm10

unread,
Oct 14, 2013, 6:04:24 PM10/14/13
to
On Monday, February 11, 2013 at 08:41:14 UTC+1, user Derek Asirvadem wrote:

Hi Derek,

In this post I would like to comment on the "Conceptual Model" in the work on Anchor Modeling.
This paper works with concepts and builds a database using the conceptual model, but the definition of the concept does not exist in this paper. Even more, this paper does not mention the concept.
RM and ERM have a contradiction which I think is fundamental. Both models, RM and ERM, implicitly use Frege's definition of the concept. They use attributes, which are actually properties. We know that a model that uses properties is confronted with Russell's paradox. Russell's paradox states that the definition of the concept over the properties leads to paradoxes.
So, the question of the definition of the concept is a fundamental question.

1.
But when it comes to concepts, Anchor Modeling builds some unusual constructions. For example, in section 2, Basic notations, of Anchor Modeling, a "set of actors" is mentioned. Of course, these sets do not exist; the elements of a set are not physical objects.
Next, this section provides a definition of identities. I already wrote that this is one of the most complex concepts in philosophy, to which hundreds of pages are dedicated. This is the definition:
Def 1. Let ID be an infinite set of symbols, which are used as identities.
Now the main concept is defined.
Def 2. An anchor A(C) is a table with one column. The domain of C is ID. The primary key for A is C.
On the same page it is written: Attributes are used to present properties of anchors. This is in contradiction with Def 2, which states that the anchor has one column.
There are also the following questions: how are the concepts of attributes, which are represented as atomic structures, constructed? How did they construct the concept of time? Is it a time attribute?

2.
In the improved version of Anchor Modeling, a new definition of the anchor is introduced:
Def 4. An anchor A is a string. An extension of an anchor is a subset of I.
(Here, "I" means the same as ID from the aforementioned Def 1.)
In this version of Anchor Modeling there is again not one word about concepts. What is wrong here is that the authors did not write which theory they use for this definition. It is not written which axioms they use. In my opinion, this work cannot be published, because it is not known on what the major and fundamental concept of the anchor is defined.
Def 4 uses, side by side, the following terms: set and extension. Note that the basic concepts of set theory (the primitives) are the following two: set and element. In Frege's theory, the primitives are concept (falling under) and extension.
Another problem is the question: which data model do they use? Do they use data sets as a model? Or do they use tables, as in their paper Anchor Modeling?

3.
In my work "Database design and data model founded on concept and knowledge constructs" from 2008 (see http://www.dbdesign11.com) I use Frege's definition of the concept plus Frege's definition of the extension. Concepts are defined by Law V, that is, by properties. Extensions are introduced in the following way: the extensions of two concepts F and G are identical objects if and only if all and only the objects that fall under F fall under G (see sections 2 and 4.2.1).

I also showed that Russell's Paradox does not make sense and is based on wrong conceptions. I also added a part of the theory which Frege missed. The following cases show why Russell's paradox does not make sense.

(a)
I introduced formula (3.3.3) in my paper from 2008. This formula for attributes
is as follows:

S (the m-attribute, the concept of the property) = T iff
the m-attribute matches the entity’s attribute. … (3.3.3)

This formula is written as the identity in the propositional logic. This equivalence is true only if both sides are true, that is, when both semantic procedures work. The corresponding m-attribute must satisfy the concept, and this m-attribute must be identified. All other cases in (3.3.3) do not make sense.

Russell's paradox can be explained in the following way: we call the set of all sets that are not members of themselves "N". The following two cases are possible:
(i) If N is a member of itself, then by definition it must not be a member of itself.
(ii) If N is not a member of itself, then by definition it must be a member of itself.

So we cannot construct the set N, which means we cannot identify this object. Therefore, the object N does not satisfy (3.3.3).
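For reference, the same argument in standard set-builder notation (this restatement is mine, not the notation used in the paper):

% N is the set of all sets that are not members of themselves.
N = \{\, x \mid x \notin x \,\}, \qquad N \in N \iff N \notin N

The biconditional is a contradiction, so no object N satisfying the definition can be constructed, i.e. identified.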

Note also that (3.3.3) works only with abstract objects; these are m-attributes. So you cannot construct a "set of actors" as is done in Anchor Modeling.
In my paper, I can apply (3.3.3) also for m-entities, m-relationships and m-states.

(b)
In my paper from 2008, I introduced the following procedure which defines the
purpose of concepts:
5.3 Definition of Concept
A concept is a construct which determines one or both of the following:

(i) A plurality of things in which all the things satisfy the concept;
(ii) A particular thing from the plurality determined by (i)■
In order to identify an entity we use the following procedures:
Procedure1: Identifying the plurality.
Procedure2: Identifying individuals.
Procedure2 is not effective without Procedure1.
--
For example, if one needs to find a certain entity, then he will first use the concept which defines a plurality with the corresponding properties. Then he will look for this individual entity in the plurality. To determine a plurality we need a concept. To determine an individual we use identification.
Russell's Paradox in fact asks for only one individual, that is, the set N; so we do not need a concept, we need only the identification of this individual.

(c)
Note that my formula (3.3.3) is not an axiom. It specifies two semantic procedures that are interconnected.
My definition of the concept is totally new. In addition to the concept, it involves identification and structured knowledge (i.e. knowledge about the entity, data, attributes, ...). My definition of the concept is associated with the history of events.
The decomposition of concepts of entities (or relationships) into the corresponding atomic structures has been done.

4.
By accepting Frege's definition about the extension, we can write the following:

(1) Ǝx€xX
(2) €xX & €yY => (x = y  X ≡ Y)

Here €xX stands for “x is the extension of X”
X stands for the concept X (I use notation which is used in the works from J.
Burgess)

From the above it is clear that:
(3) Ǝ!x €xX

The corrected version of Anchor Modeling has the title "Anchor Modeling - Agile information modeling in evolving data environment." In section 6, page 12, the authors explained the idea of "evolving" and wrote: "This is solved by adding an attribute PR_DUR_Program_Duration to the PR_Program anchor ...".
In section 2 there are the following definitions:
Definition1 (Identities). Let I be an infinite set of symbols, which are used as
identities.
Definition4 (Anchor). An anchor A is a string. An extension of an anchor is a
subset of I.

Obviously, the above statements from Anchor Modeling are in contradiction with (3).

Vladimir Odrljin

vldm10

unread,
Oct 22, 2013, 6:11:55 PM10/22/13
to
Instead of “This formula is written as the identity in the propositional logic.” should be:
This formula is written as the equivalence from the propositional logic.
(iff stands for “ if and only if ”)
Here should be: €xX & €yY => (x = y < = > X ≡ Y)

vldm10

unread,
Nov 8, 2013, 12:02:53 PM11/8/13
to
On Monday, February 11, 2013 at 08:41:14 UTC+1, user Derek Asirvadem wrote:

Hi Derek,

In the last few posts I wrote about the concepts and the problems that were caused by Russell's Paradox, related to Frege's definition of the concept. These problems affect Set theory, which is the foundation of mathematics. The conceptual model is essential in my model, in which I use Frege's definition of the concept (see my paper from 2008, section 4.2.1, at http://www.dbdesign11.com ). But I have added a number of other things to this Frege's definition of the concept. In the next post I would like to point out the importance of these supplements to conceptual modeling.
In this post I want to point out the low level of current conceptual modeling. Here I mean, first of all, the low theoretical level.
Today E/RM is taken as a paradigm for the conceptual modeling of databases. Under the leadership of people from E/RM, the International Conference on Conceptual Modeling is organized every year, with honorary president P. Chen. At these conferences one large group is beginning to form. This group comprises all those that have to do with concepts and semantics; this primarily refers to E/RM, semantic db, the semantic web, OO databases, and Ontologies.

1.
It seems that in 1976 Codd requested a ban on publishing Chen's paper. At ER 2009, The International Conference on Conceptual Modeling, P. Chen presented the paper "Thirty Years of ER Conferences: Milestones, Achievements, and Future Directions." In the paper, according to Chen's words, "Codd wrote a long letter to the editor of ACM Transaction on Database Systems criticizing the author's paper ..."
In this paper Chen writes that Codd had begun to accept E/RM in his paper RM/T. Chen wrote in this paper: "Furthermore, in the 90's, the Codd and Date Consulting Group invited author to serve as a keynote speaker (together with Codd) several times in their database Symposia in London." (Here, "author" denotes P. Chen.)
Note also that at this ER 2009 conference, the paper "Anchor Modeling - An Agile Modeling Technique using the Sixth Normal Form for structurally and Temporally Evolving Data" was awarded as the best.

2.
It is true that P. Chen introduced a data model that uses tables. However, tables have two main disadvantages. They have no semantics; from them you cannot determine the meaning. Tables have a weak mathematical apparatus. But the biggest weakness of tables is that they do not agree with Frege's theory.
In my opinion Chen did not understand that tables do not form predicates and propositions. Codd did this part correctly: Codd's model has relations, while Chen's model has tables.
In my opinion both of them, Codd and Chen, did not understand what connects concepts with predicates, and what separates them. Codd did not mention concepts in his RM.
It is usual to treat E/RM as the leading conceptual model, although the concept is not defined in E/RM. It is true that the introduction of entities and relationships slightly improved the semantic approach to db theory, but it was done very naively. In Chen's paper and model, the concept has not been defined at all. If Chen called his model conceptual, then an elementary thing to do is to define the concept.
In contrast to Chen's work with concepts, I have recently presented on this user group techniques done by very good mathematicians. These techniques are related to concepts and extensions, sets, membership relations, logical axioms, etc. See my post that is related to Frege's Theorem.
These techniques are superior to Chen's conceptual model, both as theoretical and as practical solutions. This approach to concepts determines the basis of mathematics, i.e. it determines Set Theory.

We can notice that the entities, relationships and attributes were introduced long before the E / RM. For example, Kurt Gödel wrote in 1944:
“By the theory of simple types I mean the doctrine which says that the objects of thought (or, in another interpretation, the symbolic expressions) are divided into types, namely: individuals, properties of individuals, relations between individuals, properties of such relations, etc. (with a similar hierarchy for extensions), and that sentences of the form: " a has the property φ ", " b bears the relation R to c ", etc. are meaningless, if a, b, c, R, φ are not of types fitting together. Mixed types (such as classes containing individuals and classes as elements) and therefore also transfinite types (such as the class of all classes of finite types) are excluded. That the theory of simple types suffices for avoiding also the epistemological paradoxes is shown by a closer analysis of these.”

Gödel used the term individuals instead of entities. Note that Gödel used the term "types" for entities, attributes and relationships. "Types" are now just part of today's database theory. As you can see from this quote, the main structures of E/RM were completely determined already in 1944. Therefore we cannot say that Chen introduced these constructs; rather, we can say that Chen applied the existing theory about attributes, entities and relationships to the theory of databases. Note that there are many philosophers, mathematicians and logicians who had been working on entities, attributes, and relationships before Chen applied them to databases.

3.
P. Chen, in his work on E/RM, defines the entity as follows: An entity is a "thing" which can be distinctly identified. (See section 2.2 of Chen's paper.)

Note that the entity and the "thing" are synonyms. Note that we can determine the difference between two entities, only by applying Leibniz's Law, not by using the mentioned Chen's "definition". Note that Chen determines entities by using intrinsic properties. On this user group, I presented example about 2000 Honda cars that have all its attributes the same. It means that (Chen's) intrinsic properties are not a solution. The surrogate key also can not help here; we must use the VIN number which is:

1. The real attribute of the entity Car;
2. The primary key (not a surrogate);
3. This is about db design.
In my paper “Semantic databases and semantic machines” I introduced the “General Law” (see section 5.6). This law gives a very different picture of entities from Chen's picture of entities.

4.
In the Abstract of Chen's paper the following is written: "The entity-relationship model can be used as a basis for unification of different views of data: the network model, the relational model, and the entity set model." However, this is not true, because Chen did not define the mapping from E/RM into RM, the OO model, etc. Mapping between two data models is a very complex matter, and it is not universally resolved. In addition to the mapping between two models, there is also the concept of translation of one model into another. This implies that it is not enough just to do the mapping between the two models; it is also necessary that the appropriate things from the two models have the same meaning. So the translation should preserve meaning.
Please note that this problem has recently emerged as important because translation has become important. For example, Google has translators for many spoken languages. In db theory we use formal languages. R. Montague, probably Tarski's best student, did a lot of work on the formalization of spoken languages. However, for translating, Frege's Principle of Compositionality is the most used. This crucial principle says that the meaning of a whole expression is composed of the meaning of its component parts. This matter is very complex and has many cases. For example, changing the order of words in a sentence changes its meaning.
So, as a conclusion, it can be said that the mapping and translation from E/RM to another data model has not been done at all.

5.
Chen's E/RM is not correct. For example, in E/RM the attributes of entities are not defined. Anchor Modeling uses normal forms. If we want to do normalization, then it is necessary to do the mapping and translation from E/RM into RM. In this text I have already explained that Chen did not specify the mapping and translation at all. So there is the following question: who guarantees that a beginner (who believes the claims of P. Chen) does not create an entity whose attributes in fact belong to two different entities?

6.
As I already mentioned, both Chen and Codd did not understand the essence and role of concepts. They did not understand what it is that connects a concept and the corresponding predicate. They also did not understand what it is that separates the concept from the predicate.
To explain this, we need a good understanding of Frege's work. I will simplify these things so that they are understandable for those who are not familiar with Frege's work.
Frege here has a big picture of very important things. He divides this matter into two parts, thoughts and language. The concepts belong to the thought level, while predicates belong to the language level. A concept and the corresponding predicate are one thing which has different constructs at different levels.
(i) The construction of Frege's concepts can be seen in my post from September 26, 2013 about Frege's Theorem at:
http://plato.stanford.edu/entries/frege-theorem/

(ii) The construction of Frege's predicates can be seen in my post from September 4, 2013 at: https://groups.google.com/forum/?hl=en#!msg/comp.databases.theory/IfFnvnKoP4w/KkqT0DFeEzQJ
(In fact, we can say that predicates are grammatical constructs; they are about sentences and names.)
In this manner I introduced "assignment" as a grammatical construct which binds names. For example, we can bind the name of a variable to the name of a value. In languages (programming and db languages) the assignment is the only atomic command. In my db solution it is only possible to assign a new value or to "close" an existing value (i.e. data).
(See my paper "Semantic databases and semantic machines", section 7.3.)

Here, the constructions of concepts and predicates are presented very shortly. To know complete Frege,s theory one must spent a lot of time.
Note that Frege introduced these constructs as reality. Some scientists call it the third realm, i.e. the realm of semantics. According to Frege these objects are a realm and they enable semantics to us. Before Frege, there were two realm accepted in science: the realm of the external world and the realm of the purely mental.

Codd and Chen did not even define the concept. Apparently they did not realize the important relationships in conceptual modeling.

Vladimir Odrljin

vldm10

Nov 27, 2013, 1:12:52 PM
Hi Derek

With this post I would like to conclude this discussion about the conceptual model. In fact, very little has been written about conceptual modelling. In my opinion, the E/RM is not a conceptual model at all.
In this post I will show that the E/RM is based on inaccurate or naive statements. The two most important things in the E/RM were not done: the concept is not defined, and the conceptual model is not built as a fully (well established) theory.
In my opinion Anchor Modeling is used to correct large gaps in several major db theories.

In subsections 1, 2, 3, 4, and 5, I am going to show some serious errors of the E/RM:

1.
In the paper The Entity-Relationship Model - Toward a Unified View of Data, Chen says, in section 2.2, at the conceptual level, the following: "There is a predicate associated with each entity set to test whether an entity belongs to it."

It appears that P. Chen is not clear enough about what is and what is not a predicate. Note that predicates do not belong to the conceptual level. Note that at the conceptual level predicates do not test anything. At the conceptual level there is a different kind of testing. For example, there is a test of whether a particular attribute "falls under the corresponding concept." Predicates are on the sentence level; more precisely, they are a kind of grammatical construct. Concepts are on the level of thoughts.

2.
There are no "conceptual objects," as Chen claims. In fact, there is a big difference between concepts and objects (entities). Therefore, the following cannot be: "The entities, relationships, and values at level 1 are conceptual objects in our minds," as Chen wrote in section 2.3.

It seems that Chen believes that an entity (object) is some kind of concept. Let me quote Frege: "A concept - as I understand the word - is predicative. On the other hand, the name of an object, a proper name, is quite incapable of being used as a grammatical predicate."

3.
Chen did not write a definition of the concept, which is a fundamental thing in conceptual modeling.

Chen does not explain what it is that connects the predicate with the concept. He does not understand the nature of the whole that consists of the concept and the predicate. So the fundamentals are not defined in the E/RM.
Note that here we are speaking about the most important things related to mental activities. Therefore the definition of the basic terms is important. This is about concepts. Concepts are also important for set theory. This matter is also related to the following crucial issue: how does a person grasp the meaning of a word or expression?

4.
I looked again at Chen's paper about the E/RM. Chen introduced entity/relationship relations (see section 2.3.2). It should be said clearly that this was already done in the RM, by Codd. More specifically, these entity/relationship relations are a special case in the RM. It is true that entities and relationships improve semantics. But, let me say, that is about 10% of the job. Note that Codd was the first to apply Frege's theory to database theory, and thus the first to achieve semantics. He developed the RM completely. In my opinion, the following conclusion is correct:

(i) Chen applied existing results from theory to the creation of the E/RM. He accepted Godel's view that the world is discrete, that is, that the world consists of individuals, relationships among individuals, and attributes. (Note that some other scientists arrived at the same or similar conclusions.)
(ii) Further, Chen uses the RM, which Codd had already made.
(iii) Chen did not do the conceptual part, the part which is fundamental to science. Chen did it at the level of a few naive, very vague, and confusing observations.
For example, in section 2.1 Chen wrote the following:
"(1) Information concerning entities and relationships which exists in our minds."
Chen also claims the following: "The entities, relationships, and values at level 1 are conceptual objects in our minds."

In section 2.3 Chen wrote the following: “Basically, an entity key is a
group of attributes such that mapping from the entity set to the
corresponding group of value sets is one-to-one. If we cannot find such
one-to-one mapping on available data, or if simplicity in identifying
entities is desired, we may define an artificial attribute and a value set
so that mapping is possible.”
I must say that an entity with an artificial attribute does not exist in the real world.
Therefore a concept of a real entity with an artificial attribute does not exist.
=============================================

In the text that follows I will show that Anchor Modeling is trying to fix the big mistakes of other db theories using my results.

5.
As I showed in my last two posts, conceptual modelling is not well defined in many of its parts.
Let me mention just some of them:

(i) There is no good definition of the concept.
(ii) There is no solution that associates the appropriate mathematical structures with the concept.
(iii) There is no theory that defines the mapping from the conceptual model to other data models.
(iv) There is no solution for the decomposition into atomic structures.
(v) There is no general theory that connects all the parts.

Anchor Modeling has published two papers. In the first paper (which won first prize at ER2009, the Conceptual Modeling Conference) the two main structures are defined on the same data model that the E/RM uses. The data model consists of tables. For example, the main structures are defined as follows:
Def 2. An anchor A(C) is a table with one column. The domain of C is ID. The primary key for A is C.
Def 5. A historized attribute Hatt(C, D, T) for an anchor A(C) is a table with three columns. The domain of C is ID, of D a non-null data type, and of T a non-null time type. Hatt.C is a non-null foreign key with respect to A.C. (Hatt.C, Hatt.T) is a primary key for Hatt.
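For readers who want to see what Def 2 and Def 5 amount to, here is a minimal sketch that represents them as in-memory structures in Python; this is only my own illustration, with made-up identities and values, not the authors' definitions or code:
----------------------
# Minimal sketch of Def 2 and Def 5 as in-memory "tables" (an illustrative assumption).
from datetime import date

# Def 2: an anchor is a one-column table of identities; the primary key is the identity itself.
anchor_AC = {1, 2}                      # column C

# Def 5: a historized attribute Hatt(C, D, T) has three columns; (C, T) is the primary key.
hatt = {
    (1, date(2009, 1, 1)): "red",       # key (C, T) maps to the data value D
    (1, date(2010, 6, 1)): "blue",      # a later value for the same anchor identity
    (2, date(2009, 1, 1)): "green",
}

# Foreign-key check: every Hatt.C must appear in the anchor.
assert all(c in anchor_AC for (c, _t) in hatt)
----------------------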

------------------------------------------------------------
In the improved version of Anchor Modeling (which was published in DKE, whose editor-in-chief is P. Chen) a new definition of the anchor is introduced.

Def 4. An anchor A is a string. An extension of an anchor is a subset of I.
Def 7. A historized attribute BH is a string. A historized attribute BH has an anchor A for domain, a data type D for range, and a time type T as time range. An extension of a historized attribute BH is a relation over I x D x T.

What is wrong here is that this data model with "extensions" and "sets" is not defined. It is not known whether the authors used an existing theory or whether it is their own invention. The authors did not provide any reference. There are no definitions of basic terms. Note that this matter is about fundamentals. In my opinion this is not science, and this is not the way a scientific paper should be presented.

Note that my model has extensions and concepts. Does the Anchor model have two data models? The first one uses tables, defined in the first paper, while the new model uses "extensions" and "sets", defined in the second paper. Obviously, the latter was published in order to fix the existing conceptual model.
Note also that this version of Anchor Modeling has improvements based on my critique from my thread "The original version". I think that a conceptual model (which is on the level of thoughts) cannot be built on tables.

6.
In the OO data model there is one big unsolved problem. OO uses surrogates; all entities stored in the OO model are at the db level. Note that in the RM, entities are at the relvar level. As entities can change their attributes over time, the following situation, for example, is possible:
Five entities can end up with the same values for all their attributes. For example, entity1 has the same values of all its attributes as the corresponding attributes of entity2. The same holds for the attributes of entity1 and entity3, entity1 and entity4, and entity1 and entity5.


Anchor Modeling also uses surrogates, but the surrogates are not at the db level, because the anchors are at the entity level. This means that Anchor Modeling has the same collapse, but at the entity level.

In the RM this case is forbidden. In the RM it is not possible for two tuples to have the same values for all attributes. So the RM is limited, and obviously the RM cannot support this kind of problem (entities with the same attributes).

The authors of Anchor Modeling have tried to fix all these problems using my solution. They tried to apply the history of events in solving these problems. However, they did not understand many things in my solution.

Note that my solution ensures that OOM and RM applications always have complete solutions. I want to say that only by using my solution can the OOM and the RM solve this case for the first time. As I already wrote, my data model belongs to General database theory.

7.
I started to write about this plagiarism on May 26, 2010 in my thread "The
original version".


On 8 June, 2010, in my thread “The original version” I wrote on this user group
the following: “An identifier of a state of an entity allows decomposition of
the concept of the state into binary concepts. The same hold true for relations.
The identifier of the state of an entity provides straightforward mapping
between the binary schemas as well as inverse mapping. “

In short, I tried to explain that the transition from the E/RM to the RM or OOM must be done using some mapping. I also wrote that you cannot use tables for this mapping. Then I explained that my "identifier of a state" completely determines the "decomposition into atomic structures" and all these mappings.

In September 2010, the authors of Anchor Modeling submitted an improved version of Anchor Modeling. In this version the authors of Anchor Modeling plagiarized the most important part of my work, which is the identifier of a relationship. Using my solution they have done the "decomposition into atomic structures" and the mapping between two data models.
In addition, they solved problems for themselves which previously could not be solved; the authors of Anchor Modeling enabled the E/RM, for the first time, to do the mapping into the RM, OOM, XML, and other data models.

The way the publication of the improved version of Anchor Modeling was handled is very interesting. This paper is published in the journal DKE, whose editor-in-chief is P. Chen. But the two probably most important results in the theory of databases are not published in this paper; instead they were given as references [29] and [30]. These references are in fact a private website, with the address www.anchormodeling.com

Note that my identifiers of states and the maintenance of states were published in 2005 on my website. All this was presented on this user group and thoroughly discussed in 2005, five years before the publication of the improved version of Anchor Modeling.

The following three things are key in the mapping between two data models:
(i) Mapping between schemas
(ii) Mapping data between the data models
(iii) Preserving the meaning of the corresponding data in the two data models
(i) and (ii) are completely determined by my “identifiers of states”.
The relationships between meaning, truth, and facts are presented in my paper “Semantic databases and semantic machines”, sections 1.3 to 3.2.1.

8.
In the Entity-Relationship Model – Toward a Unified View of Data, Chen wrote the following: In a joint meeting of the RDF and Schema Working Groups over one year ago, they issued the Cambridge Communiqué that states: “… RDF can be viewed as a member of the Entity-Relationship model family…”

The mentioned Communiqué can be seen at http://www.w3.org/TR/schema-arch
Note that the decomposition into atomic structures, which was done in my papers, enables the atomic structures to be deployed separately on the www. The atomic structures also enable current data, which are globally distributed and separated, to be easily and formally integrated into a global db. Obviously, atomic structures, the “history of events”, and General database theory enable tremendous possibilities on the www.

So my position is that databases are a key part of the solution for global communication with information; the www is just a technical resource.

It seems to me that the w3c philosophy is wrong; they think the opposite, that web resources are the leading part of the solution.
Note also that temporal and spatial knowledge has a superior solution in my data model. My solution has a complete history of events. These events can be related to many objects and subjects.


Vladimir Odrljin

I intend to end this thread soon. Maybe I will add two or three posts, but at an accelerated pace. So if you have questions or comments, or if you notice any mistakes of mine, then let me know. After that I will be very busy.


vldm10

Dec 13, 2013, 8:03:36 PM
Hi Derek,

In this post I will briefly explain the main steps in my conceptual model. As far as I know, this is the first data model that was developed in full compliance with Frege's theory. I also think that this is the first data model in which the conceptual model is completely worked out. My data model has the following elements of the conceptual model:

1.
My data model has a precise definition of the concept. I use Frege's definition of the concept, which I improved so that Russell's Paradox does not hold under my definition of the concept.

What is significant in this definition of the concept is the object which Frege introduced and called the extension of the concept. Frege also defines when two extensions are identical, which is fundamental. See my paper “Database design and data model founded on concept and knowledge constructs”, sections 2 and 4.2.1, at http://www.dbdesign11.com
=======================
In my post from 23 October, 2013 in this thread I wrote: by accepting Frege's definition of the extension, we can write the following:

(i) Ǝx€xX
(ii) €xX & €yY => (x = y <=> X ≡ Y)

It is possible to derive comprehension from (i) and extensionality from (ii), for sets. Note that Russell’s paradox does not hold in the set theory which I apply.
======================

Why am I writing about set theory? I write about it because the E/RM and the RM use sets. These are, for example, the "entity set" in the E/RM and the relation as a set of n-tuples in the RM. However, the E/RM and the RM do not say how they arrived at sets at the conceptual level.
Here it is shown how we come from concepts to sets. In conclusion, we can say that the definition of the concept is of fundamental character. So my data model is based on sets. More precisely, my data model consists of sets whose elements represent states of entities or states of relationships.

Note that Frege's principle of subordination applies here:
First come the following relations: "an element falls under the concept" and "the element is in the corresponding extension."

After that comes the relation "an element belongs to a set", and this element is a derivative of the corresponding element from the first level.

2.
Now let me try to explain the “relationship” between concepts and predicates from Frege’s theory. In my post from October 23, 2013, I schematically presented Frege’s theory of predicates. The unsaturated expressions of the form S/NN…N represent relations and entities from the RM and the E/RM.
===========================
In Frege's theory, predicates are language (grammatical) constructs which denote concepts. Besides denotation, Frege also developed a theory of meaning, thoughts, and statements that contain actual knowledge as important components of this relationship between predicates and concepts.
===========================

This part of Frege's theory is very large and very significant. Therefore, this matter cannot be briefly outlined in the user group. Keep in mind that many other theories started in Frege's work, such as the above-mentioned unsaturated expressions S/NN…N, which are the beginning of the theory of interpretation.

In his book, M. Dummett wrote: “Frege would therefore have had within his grasp the concepts necessary to frame the notion of the completeness of a formalization of logic, as well as its soundness.”
With these few remarks about the importance of Frege's theory I want to emphasize that a data model which is based on Frege's approach is good, because it includes the foundations of mathematics.

3.
At this point I will write about identification. I introduced identification as a semantic procedure. So in my model there are two semantic procedures: in addition to concepts, there is identification.

In my paper “Database design and data model founded on concept and knowledge constructs”, section 5, at http://www.dbdesign11.com I wrote: the process of identifying goes from a subject to the real world, and this implies that the subject has some knowledge about the entity which it tries to identify.
If we connect this with my definition of the "Limitation of Interpretation" (see section 3 and formula (3.3.3)) and the definitions of particular and universal attributes (see section 3.3), then we come to several new conclusions.
For example, we conclude that we can only work with the attributes that are known to us, i.e. which we can identify, directly or gradually. For example, we can work only with the concepts of colors that we can identify. Note that when working with an entity's attributes we identify the particular attributes. (See the definition of particular attributes; I wrote in more detail about these terms in the improved version of my paper from 2009.)

From this part of my text follows the solution of Russell's Paradox, about which I wrote in the thread “Does the phrase “Russell’s paradox” should be replaced with another phrase?”

In my data model identification is realized by using an identifier. First, the identification of attributes is defined. The attributes are identifiers; these attributes I named universal attributes. Note that my data model, i.e. the sets, works with abstract objects. These abstract objects I denote with the prefix m, for example m-attributes, m-entities, etc. It is clear that the identification occurs between the universal attributes and the m-attributes. The procedure for identifying the attributes is described by formula (3.3.3) from my paper.

The next level is entities. They are determined by the identifier of the entity. The identifier of an entity determines all of the entity's particular attributes. Note also that the identifier of an entity enables the decomposition of the entity into particular attributes, that is, into atomic structures.

The next level is the states of an entity. The states are determined by the identifier of the corresponding state of the entity.

If we name the knowledge about one state of an entity the particular knowledge, then the corresponding identifier of the state enables the decomposition of the particular knowledge into atomic data structures.
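A minimal sketch, under my own naming assumptions (p for the identifier of a state, e for the identifier of the entity, made-up attribute names), of how the identifier of a state decomposes the particular knowledge into binary atomic structures:
----------------------
# Minimal sketch: the identifier of a state (p) decomposes the particular
# knowledge about that state into atomic structures of the form (name, p, value).

def decompose_state(p, e, attributes: dict):
    """Return the atomic structures for one state of one entity."""
    atoms = [("E", p, e)]                                   # (p, e): which entity this state belongs to
    atoms += [(name, p, value) for name, value in attributes.items()]
    return atoms

state = decompose_state(p="s-17", e="car-1", attributes={"color": "red", "owner": "Ann"})
print(state)
# [('E', 's-17', 'car-1'), ('color', 's-17', 'red'), ('owner', 's-17', 'Ann')]
----------------------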

4.
This section is about meaning.
I will concentrate here only on two aspects of meaning to which I think my work has made some contribution:

(i) Identification.
Identification of a certain entity helps us to get quick access to the entity that is stored in memory. We can say it this way: identifying helps us to quickly recall an entity that is stored in memory.

However, here we do not have an entity; we have a state of an entity. A state of an entity is a very complex thing. As I wrote earlier in this thread, we store complex objects in memory, and we can get them from memory by combining multiple identifiers. I have defined this rule as a law of general character in working with memory. In this way I get the meaning of complex objects, such as the meaning of a state of an entity. In my data model this is achieved by applying the following two identifiers: the identifier of an entity and the identifier of the state of the entity.

Similarly, the atomic structure (IdentifierOfEntity, Attribute1) is constructed from two identifiers: the IdentifierOfEntity and the Attribute1 identifier. These two identifiers provide the following meaning: Attribute1 is a particular attribute of the entity, i.e. this attribute belongs to this entity.

When I have the complete state, it means that I have complete actual knowledge about the state of the entity in the real world. This state of the entity in the real world is determined by the corresponding events. Of course, I can also seek only part of the knowledge about the state of an entity. In that case, I only have the particular actual knowledge about the state of the entity.

=================================
Total or any particular knowledge about the state of an entity has a significant influence on the meaning of the entity. Note that particular knowledge about an entity is very similar to what Frege called "sense" or "the mode of presentation."
=================================


(ii) Links between truth, meaning, and facts
I wrote about the links between truth, meaning, and facts in my paper “Semantic databases and semantic machines”, sections 1, 2, and 3, at http://www.dbdesign11.com
-------------------------------

So, roughly speaking, the above-mentioned four sections of this post make up the major steps in my conceptual model. Section 1 provides a brief definition of the concept and of extensions, and describes the transition from the concept to a set. Section 2 briefly describes the relationship between predicates and concepts. Section 3 describes the semantic identification procedure. Section 4 discusses the construction of meaning for the entity.
The E/RM does not have any of these four parts. Therefore, in my opinion the E/RM is not a conceptual model. It has some intuitive elements of semantics. In my opinion the best name for this model is the E/RM, without mixing it up with conceptual modeling.
Please note that the conferences on conceptual modeling are run under the leadership of people from the E/RM, with Honorary Chairman P. Chen. I think this conference should have the name "entity-relationship modeling".
Note also that E. Codd did not notice concepts, although concepts are highly associated with predicates.

In my opinion, conceptual modeling is important because it is the foundation for database theory.

Vladimir Odrljin

vldm10

Dec 16, 2013, 7:16:11 PM
Here I made a mistake. Instead of “In my post from October 23, 2013”, it should read:
In my post from September 24, 2013 (see the thread “Sensible and NonsenSQL Aspects of the NoSQL Hoopla”).

vldm10

Dec 21, 2013, 10:03:28 AM
Hi Derek,

In this post I will describe the main steps of database design in my data model, so that one can get a picture of the main things in my solution.
My data model is a complete generalization of the E/R model, and it represents entities in a completely different way than was done in the E/RM.

I accept Godel's philosophy that the world is discrete. Kurt Gödel described this philosophy in 1944 as follows: "By the theory of simple types I mean the doctrine which says that the objects of thought (or, in another interpretation, the symbolic expressions) are divided into types, namely: individuals, properties of individuals, relations between individuals, properties of such relations, etc."
(I am aware that many scientists worked on entities prior to Gödel. Actually, here I used the name Gödel as a symbol of authority.)

I will now give a definition of the entity. Note that P. Chen defined the entity as follows: "An entity is a "thing" which can be distinctly identified." See section 2.2 in Chen's paper about the E/RM.
I use Frege's definition of the object, because the term "thing" is not defined in Chen's definition. Note also that a relationship, too, "can be distinctly identified."

Frege defines the object as something that can be the referent of a name. According to Frege, a proper name has a "sense" and a "referent". The referent of a proper name is the object it denotes. The sense of a proper name is related to the meaning. As you may have noticed, I wrote in my previous post about the sense and the referent of predicates.

Step 1. The entities are constructed only of intrinsic properties.
Step 2. The construction of an entity is determined by applying Leibniz's Law. If it is not possible to apply Leibniz's Law, then I apply the "General law that determines the difference between entities." (See my paper Semantic databases and semantic machines, section 5.6.)
Step 3. In this step the "Simple Form" should be applied and the decomposition of the entity into atomic structures should be done. I introduced Simple Form in May 2006, at http://www.dbdesign10.com
----------------------
Put simply, Simple Form can be written as follows:
R (K, A1, A2, …, An) = R1 (K, A1) join R2 (K, A2) join … join Rn (K, An) iff

(i) the key K is simple
(ii) A1, A2, …, An are mutually independent attributes
----------------------
Obviously Simple Form can be presented in the E/RM. To do this, we can use entities instead of the relations, and we can use relationships instead of the (equi)joins.
Regarding the key K of Simple Form, the following cases are possible:
(i) The key K can take unique values from the corresponding domains. In this case we have a surrogate key.
(ii) The key K can take unique values that are defined as an international industry standard or as a locally defined standard. Such a key is usually determined by the above-mentioned "General law that determines the difference between entities." Note that this key is introduced as an attribute of the corresponding entity.
(iii) The key K can take values that are obtained by concatenation of the values of attributes. For more details about this case see my paper "Semantic databases and semantic machines", section 5.1 (iii). Note that in this case the key K is simple.
(iv) In my data model the attributes are mutually independent, so the key K can be a "natural key", that is, an "all-attributes key". This case has a bad characteristic. The key K can in general have, say, ten attributes. Then the "atomic structure" has the key K (with ten attributes) and one attribute (which is outside the key). Of course this kind of atomic structure does not look very "atomic". Note that in this case the key K is not simple and does not meet the conditions for Simple Form.
===========================================
Note that my Simple Form is a general, theoretical result which completely determines the surrogate key. So my paper gives general conditions related to the surrogate key. Anchor Modeling surrogates, OO surrogates, and Codd’s RM/T surrogates are just special cases of Simple Form.
============================================

The first three steps enable the construction of the entity and the construction of Simple Form. After these three steps we can apply the other things that are related to the entity. For example, we can analyze the constraints at the level of the atomic structures.

Domains should be constructed in compliance with section 5 in my paper Semantic
databases and semantic machines. Domains are associated with the process of
identification.
Step 4. In this step the states of entities (relationships) should be introduced.
The concept of a state of an entity has the following schema:
ConceptStateName (P, E, A1, …, An, Kp1, …, Knr, Dp1, …, Dns) where
P is the concept of the identifier of a state of the entity (or relationship);
E is the concept of the identifier of the entity;
A1, …, An are concepts of the properties of the entity (or relationship);
Each property, including E and P, can have a different set of knowledge K associated with it, defined in 3.6 – 3.9. Thus:
P has knowledge Kp1, Kp2, …, Kpi;
E has knowledge Ke1, Ke2, …, Kej;
A1 has knowledge K11, K12, …, K1k;
…
An has knowledge Kn1, Kn2, …, Knr.
The knowledge Dp1, …, Dns is defined in 3.8.
--------------------------------------------------------
The schema of a concept of a state can be represented by the schemas of the following binary concepts: (P, E), (P, A1), …, (P, An), to which we associate the corresponding knowledge and get the following concepts:

(i) Schemas of the K-concepts:
Ck1 (P, A1, K11, …, K1k, D1, …, D1m);
…
Ckn (P, An, Kn1, …, Knr, Dn1, …, Dnq);

(ii) Schema of the E-concept:
Ce (P, E, Kp1, …, Kpi, Dp1, …, Dps);
=======================================
Here in Step 4 many questions about concepts and extensions can be raised. I will emphasize only one: my concepts are constructed from knowledge, not from properties.
Knowledge in my data model is defined precisely and in a new way. Knowledge is based on atomic facts, and knowledge is structured. For example, in the above schema of the concept of a state of an entity, there is knowledge about the entity, knowledge about the attributes, and knowledge about the data. If one wants to add some additional structural knowledge (or remove some existing knowledge), one can do it in a formal way. So there are no undefined terms in my data model, such as "metadata".
The approach to concepts that uses knowledge enables many problems to be solved. For example, H. Putnam's claim that a green lemon is a lemon can be explained by using knowledge.
These concepts are denoted by the corresponding predicates.

Anchor Modeling is a kind of conceptual modeling. The Anchor Modeling paper won the 2009 Best Paper Award at the 28th International Conference on Conceptual Modeling (ER).
In this paper concepts are not even mentioned. The main structure is defined as a table:
Def 2 (Anchor). An anchor A (C) is a table with one column ...
In this award-winning paper nothing was written about the "metadata". In section 2 the authors wrote: "Although important, the metadata is not discussed further since its use does not differ from that of other modeling techniques."
At this point, in Step 4, I presented the relationship between the "metadata" and concepts.

Step 5. I represent my db structures by schemas. I introduced a simple schema language. The main reason I use schemas is that in this case Model Theory can be applied. In my opinion Model Theory is a powerful mathematical tool.
The Model Theory for my data model works very well. For example, an interpretation I1 satisfies those sentences that are related to a state of an entity. This means that the interpretation I1 satisfies a sentence (say S1) which refers to that state of the entity. If the state of the entity is changed, then the interpretation I1 no longer satisfies sentences related to the new state.

Obviously this model describes sentences about the real world much more accurately than models that do not maintain the history.

Step 6. My data model requires only two, very important procedures. One procedure creates new data in the database; I call this procedure the "Creator." The second procedure is used when data in the database ceases to be valid; I call this procedure the "Closer." For all operations with data in my data model, only these two procedures are used.
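A minimal sketch of these two procedures in Python, under my own illustrative assumptions about the field names (entity_id, state_id, value, closed); it is not an implementation from any paper:
----------------------
# Minimal sketch of the two procedures described in Step 6.
from itertools import count

_state_ids = count(1)
history = []            # the database keeps every state ever created

def creator(entity_id, value):
    """Create new data: append a new, open state for the entity."""
    row = {"entity_id": entity_id, "state_id": next(_state_ids), "value": value, "closed": False}
    history.append(row)
    return row["state_id"]

def closer(state_id):
    """Close existing data: mark the state as no longer valid; never delete it."""
    for row in history:
        if row["state_id"] == state_id:
            row["closed"] = True

s1 = creator("car-1", {"color": "red"})
closer(s1)                               # the old state stays in the history
creator("car-1", {"color": "blue"})      # the new current state
print(history)
----------------------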

---------------------
In Step 3 and Step 4 the decomposition of entities and of the states of entities into atomic data structures is done.

Vladimir Odrljin

vldm10

Dec 30, 2013, 1:03:59 PM
Hi Derek,

In this post I will present a list of all my results that were published for the first time in the scientific world. My results have always been published on my website and in this user group. So my papers were presented globally, on the www.
The reason for this list is that it makes it possible to determine the priority of ideas and results.

All the important and accurate results from Anchor Modeling I published four or five years before these results were published in Anchor Modeling. With this list of my results anyone can check this, and if I am wrong, then one can show that my assertions are nonsense. On the other hand, I stand behind my statements with facts.

1.
For the first time, the history of events for the entity/relationship model is solved. Generally speaking, the General Databases are resolved. What "General Databases" are can be seen in my post from February 13, 2013 in this thread. This post is a supplement to my post from February 13, 2013.

Why is it important to start from entities and relationships? First, it is the important construct of the E/RM, as K. Godel wrote in 1944. Secondly, this view of the world is precisely defined. It is not a philosophy like modal logic, where the basic concept is the "world", but the "world" is not precisely defined. Thirdly, we are talking about real objects and their real relationships. In my data model there are appropriate techniques that also allow working with abstract objects. So my data model works with real-world objects and with abstract objects.
Note that, in contrast to the E/RM, the object-oriented approach is actually oriented to "objects" that are stored in the memory of a computer.
Frege’s definition of an entity (object) is introduced. What is most important in this definition is that it is essentially a semantic definition. This definition provides a connection between man and the real world.
By applying the concepts we achieve that real objects can be represented as sets whose elements represent the real-world entities. We can apply current (axiomatic) set theory, or we can apply a new set theory in which Russell’s paradox is not acceptable, because it makes no sense when identification is involved.

2.
In my work, the decomposition of an entity/relationship into atomic structures is solved for the first time. (See my Simple Form.)
The decomposition of the states of entities/relationships into atomic structures is also solved. (See my General Form.)

The decomposition of states of entities was done in 2005, while the decompositions for entities were presented in May 2006. My main goal was to get the atomic structures and the simple keys. So the simple key is a much more important and more general thing than the surrogate key. Although the surrogate key can be applied in Simple Form, it is not of a general character, because the surrogate key cannot be applied as an international standard.

Note that the decomposition into atomic structures was not proved in Anchor Modeling!
The authors of Anchor Modeling introduced their atomic structures without any proof. Note that many scientists have devoted virtually their entire work to this problem. E. Codd tried to do it in the RM/T, unsuccessfully.
There is also the question: who guarantees that every business application can be represented via Anchor Modeling's structures (knots, historized attributes, and static attributes)? Maybe there is another structure that is needed in Anchor Modeling for some applications?

3.
My model is based on the states of entities or relationships. This is an original approach that is very complex. It contains structured knowledge, atomic data structures, procedures for maintaining the history of events, procedures for identification, events, concepts, extensions, meanings, and facts. Note that the theories that are based on the surrogate key do not maintain the history of events, so they have only the current state; that is, realistically speaking, these approaches do not have states. First of all, here I have in mind the OO approach and Codd's RM/T.

My model is completely new. It is somewhat in contrast to existing models. For example, my model has no deletion, update, or insertion anomalies at all. Simply speaking, the main structure I named procedure (a), which is based on two identifiers: the identifier of an entity and the identifier of a state. (See my thread “The original version”.)

However, in my model, if we want, we can change the identifier of an entity. So, for different groups of states of one entity, we can assign different identifiers to that entity. See my paper “Database design and data model founded on concept and knowledge constructs”, section 4.2.4, at http://www.dbdesign11.com

The states depend on the project and business requirements. So the "knowledge columns" are variable, depending on many things. If one needs to, one can add some new structural knowledge. See section 6.5.

These last two remarks alone indicate the advantage of my data model. Note that Anchor Modeling has a fixed structure for all applications.

4.
Identification was introduced as the second semantic procedure, which is connected with the concepts. Attributes are defined for the first time as identifiers. Further, identifiers of entities (relationships) were introduced recursively. Identifiers of states of entities (relationships) are introduced. (See (3.3.3) in my mentioned paper.)
Identification of complex objects is realized by using a combination of identifiers.
A generalized Leibniz's Law has been introduced for the purpose of identification.

5.
For the first time, the conceptual model was constructed with the use of Frege's semantic theory. Russell's paradox is resolved. The design of concepts was done by applying structured knowledge.
In my conceptual model the following steps are required:
- Concepts;
- Procedures for identification;
- Extensions and the switch to the appropriate sets;
- Construction of meaning: the link between predicates and concepts, atomic sentences, and the links between truth, meaning, and facts.

6.
For the first time, the data in the database are based on only two events. These two events were used to introduce the definition of time.

7.
My data model uses some new mathematical ideas:
(a) A theory of sets in which Russell's paradox does not hold.
(b) A model theory that takes into account factors in the real world:
- states of entities and relationships;
- atomic structures;
- links between facts, truth, and meaning, as defined in my paper Semantic databases and semantic machines, sections 1, 2, and 3;
- the conceptual model that is determined by my data model.

So this kind of Model Theory in fact "works" with the real world. It can realistically interpret sentences about the real world.

8.
The sequence is introduced as a database structure. This structure enables the identifiers of entities to be changed. It allows someone to “learn about an entity.” This allows monitoring of changes in the number of an entity's attributes during the development of the “knowledge” about the entity.
This structure is much more powerful than "anchors" and Codd's E-relations.

Vladimir Odrljin

vldm10

Dec 31, 2013, 4:15:11 PM
Hi Derek,

With this post I am finishing my writing in this thread. Given that these are the most important topics for database theory, feel free, you or anyone else, to present your opinions, criticism, and anything else associated with this thread.
Regarding the plagiarism of my work by the authors of Anchor Modeling, that is what I already wrote about in this thread and in the thread "The original version".

1.
As I understood it, your main question is related to the “surrogate key”. My answer now is short because I have already explained this controversy. My Simple Form gives general conditions for the decomposition of an entity into atomic structures. An atomic structure has a simple key and one attribute. Here, in the atomic structure, you can apply a surrogate key instead of the simple key.

In my paper “Some ideas about a new data model” from September 17, 2005, at http://www.dbdesign10.com, section 1.1, I wrote: “besides Ack, every entity has an attribute which is the identifier of the entity or can provide identification of the entity.” Here the part “or can provide identification” is related to the surrogate key.

I think this thread has shown that the surrogate key can be applied only in a small number of business applications.
In my post from April 1, 2013, in this thread, I showed that Anchor Modeling is wrong in a number of cases.

-------------------------------------------
Related to the surrogate key, here is another piece of information. A few days ago I found a book by Joe Celko in a bookstore in Croatia. The book is titled "Joe Celko's Data, Measurements and Standards in SQL." In the book there is a section about Codd's surrogate key. Joe wrote the following:
==========================
“This means that a surrogate ought to act like index; created by the user,
managed by the system and never seen by a user. That means never used in
queries, DRI, or anything else that user does.”
==========================
About ten years ago I rudely attacked Joe's post about Codd's surrogate key on this user group. At that time Codd was, for me, the greatest authority on databases, so my attack on Joe was provoked by the suspicion that Joe was slowly degrading Codd's paper without justifiable reasons. My poor knowledge of English also contributed to my post being harsh. However, Joe was the first to realize the incorrect nature of the RM/T.
All my results I obtained quite independently of others. Luckily I worked completely independently. If I had been studying other people's work, it is certain that I would not have done anything serious. I worked on specific projects with specific requirements.
Later, when I published my ideas, a couple of times I tried to figure out exactly how Codd's surrogate key actually works. Eventually I came to conclusions similar to Joe's. I think that today this controversy about surrogates is fully resolved. So the surrogate key was solved and explained by the members of this user group. It was not done by academia.

2.
In my opinion, the authors of Anchor Modeling plagiarized major things from my work. Some minor things were not plagiarized but were changed, and all these changes are the cause of large errors in Anchor Modeling.
I do not like this role, publicly and persistently defending my work; such disputes are not really my style. It is also not pleasant to write such a long thread in front of all those who know me personally. But this is unfortunately the only way to defend my work.
If I did not write anything about this plagiarism, my work would be in vain. On the other hand, the authors of Anchor Modeling would have an "open door" and unlimited time to repair the errors of their paper, as they have largely done in the corrected version of Anchor Modeling.

Vladimir Odrljin

vldm10

Apr 21, 2014, 2:32:59 PM
Hi Derek

Considering that you partially agree with me on anchor modeling but do not agree that anchor modeling is plagiarism, in this post I will give facts which confirm that anchor modeling is plagiarism.
Here I will mention only my solutions that are relevant to database theory and which are used in anchor modeling. These solutions were posted on my website and on this user group in 2005. The anchor modeling paper was published in 2009. The fixed version of anchor modeling was published in 2010, after the presentation of my critique of anchor modeling.
The anchor modeling paper published the following things, and these things had been published previously in my papers:

1. The simple key, which I named the identifier of an entity.
This key is plagiarized in anchor modeling under another name, the anchor key. What is especially important about this key is that it enables a construction which preserves and maintains the history of entities and relationships. This whole idea was plagiarized in anchor modeling. This technique has been extensively discussed on this user group. Note that in my paper from May 2006 (see the text from September 2005, section 4, at http://www.dbdesign11.com ), in Simple Form, there is the general form of the simple key. Simple Form provides a complete theoretical basis for surrogate keys, anchor keys, and indexes. In this paper the decomposition of entities (relationships) was done.

2. My paper presents the first data model that enables the modeling of the history of entities and relationships.

3. My paper is the only one that has done the decomposition of entities and relationships into atomic structures.
In 2005 I solved the problem of the decomposition into atomic structures for General databases (General Form), and in 2006 I solved the problem of the decomposition for databases that maintain only the current state (Simple Form). Note that anchor modeling uses the decomposition into atomic structures.

4. My model is the only one that defined and introduced states of entities and relationships.

5. My data model introduces the identifier of states of entities.
After my critique of serious theoretical mistakes in anchor modeling, the authors published a fixed version of their paper in 2010. In that paper they introduced the identifier of states of relationships. The states are the most important part of my paper. The identifier of a state of a relationship enables solutions for the most complex databases.

6. My model defines states with Structural Knowledge.
The authors of anchor modeling use part of this structural knowledge. This part they call metadata.
=========================

In my threads "The original version" and "Some information about anchor modeling", I have described these plagiarisms in detail. Each of the above examples is an important part of db theory, especially when they are all together in one paper.

If you think that something in this post is not correct, please let me know. We can analyze it.

Vladimir Odrljin

vldm10

May 25, 2014, 3:26:38 PM
On Monday, April 21, 2014 at 8:32:59 PM UTC+2, vldm10 wrote:

Hi Derek,
Since you are not interested in the topic of plagiarism, I will finish this topic. Before that, I will mention a few things.
As I already wrote earlier, the authors of Anchor Modeling plagiarized all the important parts of my work. In some unimportant details they gave their own solutions. I have demonstrated that all of these solutions of theirs are wrong.
In this post I will show only two examples of plagiarism. These two examples show that the plagiarized things are of the utmost importance for theory and practice.

1. For a long time I have been thinking about and working on the following problem: people have always held that a name denotes a certain entity, although this entity has been changed many times. How is an entity which has changed into another entity, in fact, the same entity? How can one explain the relationship between Leibniz's Law and the possibility of a changing entity? This question exists because persisting entities can change their intrinsic properties.

This problem is known as Theseus's paradox.

I solved this problem by using identifiers of entities. In my paper from 2005 I gave the corresponding procedures, constructions, and semantics for solving this problem. The main part of solving "temporal", "historical", and other complex databases consists of two sub-steps:

ProcedureA
1. Constructing an identifier of an entity or relationship.
2. Connecting all changes of states of one entity (or relationship) to the identifier of this entity (or relationship).
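A minimal sketch of ProcedureA in Python; the ship data and field names are illustrative assumptions of mine, used only to show how all state changes are connected to one identifier of the entity:
----------------------
# Minimal sketch of ProcedureA: (1) construct an identifier of the entity,
# (2) connect all changes of its states to that identifier.

states = []   # each entry: (entity_id, state_id, attributes)

def record_change(entity_id, state_id, attributes):
    states.append((entity_id, state_id, attributes))

# The ship of Theseus keeps the same entity identifier through every change.
record_change("ship-theseus", 1, {"planks": "original"})
record_change("ship-theseus", 2, {"planks": "half replaced"})
record_change("ship-theseus", 3, {"planks": "all replaced"})

def history_of(entity_id):
    """All states of one entity, connected to it by its identifier."""
    return [(sid, attrs) for (eid, sid, attrs) in states if eid == entity_id]

print(history_of("ship-theseus"))
----------------------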

My solution was totally new. You can see the reaction of the users of this group
to the first presentation of my solution. (See my thread "Database design, Keys
and some other things", from 2005).

Note that ProcedureA contains other important solutions, such as the simple key, the decomposition into atomic structures, and the understanding of reality as changes of entities and relationships.

I explained in this thread that there are three important cases for identifiers of entities:
a) identifiers that are defined as an international standard;
b) locally defined identifiers;
c) surrogate keys.
The identifiers from case a) and case b) can identify the real-world entities, because they belong to the real entities.

The identifiers from c) are surrogate keys; they cannot identify the real-world entities, but they "can provide identification of the entity". How does this work? First, we must determine the attributes that correspond to the surrogate key. Then, by using these attributes, we can determine the real-world entity.

Now we can see that my definition is good. My definition works for all three kinds of the mentioned identifiers. That is the reason why I wrote the following: "...besides Ack, every entity has an attribute which is an identifier of the entity or can provide identification of the entity..." See my paper "Some ideas about a new data model", section 1.1, posted in September 2005 on this user group and at http://www.dbdesign10.com .

Note also that the locally defined identifier is very useful. This identifier can be applied at any company which does not use international standards. Companies that are well organized can introduce their own local standards. For example, companies that work with invoices and bills, public utility companies, etc., can introduce their own technology with locally defined identifiers on their invoices, papers, etc.

The authors of Anchor Modeling use the complete ProcedureA and call it Anchor Modeling. The identifier from ProcedureA they named the "anchor surrogate key"; see their paper "Analysis of normal form for anchor models", page 2 (this is reference [19] in their paper, and all five authors signed the paper).

Now I would like to mention some more serious matters that make this plagiarism more serious. In my opinion, my solution has changed important things in Logic. Let me present it briefly. In my solution the truth value of a statement depends on the following:

(A) What time the statement relates to.
(B) When the statement is uttered.
(C) What knowledge about the entity's state is available to the subject.

Obviously, my solution precisely determines the truth value in the above situations. Note that here I use the term "statement"; some mathematicians use the term "proposition". (Note that this is a very serious theme that needs much more space.)

My solution also affects the application of Model Theory to databases. For example, several times I have presented the problem in which an IT department is suing a person who died a long time ago; that is, the mentioned person does not exist in the real world.
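To illustrate how the three factors (A), (B), and (C) above could interact, here is a minimal sketch in Python; the years, the attribute "color", and the layout of the recorded states are my own illustrative assumptions, not a formal definition from any paper:
----------------------
# Minimal sketch: the truth value of "the car is red" depends on
# (A) the time the statement refers to, (B) when it is uttered,
# (C) which knowledge (recorded states) is available at that moment.

# Each recorded state: (valid_from, recorded_at, color)
states = [
    (2010, 2010, "red"),
    (2012, 2013, "blue"),    # the change became known (recorded) only in 2013
]

def is_red(reference_time, utterance_time):
    # (C) only states recorded no later than the utterance time are available knowledge
    known = [s for s in states if s[1] <= utterance_time]
    # (A) pick the latest known state valid at the reference time
    valid = [s for s in known if s[0] <= reference_time]
    return bool(valid) and max(valid, key=lambda s: s[0])[2] == "red"

print(is_red(reference_time=2012, utterance_time=2012))  # True: the change was not yet known
print(is_red(reference_time=2012, utterance_time=2014))  # False: with fuller knowledge
----------------------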

2. The second example of plagiarism refers to the identifier of a state.
In my model the states are the main part. I presented the states in 2005. In my thread "The original version" I presented many mistakes in Anchor Modeling. After that the authors of Anchor Modeling published a new paper in December 2010, in which they tried to correct their mistakes using my results. They introduced the identifier of states of relationships, which I had introduced and defined in 2005. The identifiers of states of relationships are the most complex things related to states.

Note that I am the only one who has defined states of entities and relationships. For example, note that the authors of Anchor Modeling plagiarized the identifier of states, but they did not define states. Of course this is serious nonsense.

I want to emphasize that states are a very serious problem.
I solved states for entities and relationships. Entities and relationships are the most general categories. I defined states as knowledge, and I defined knowledge, atomic structures, and the relationships between meaning and truth for the atomic structures. I also introduced the identifier of states and some other things.
The states are done on a real conceptual model.

Vladimir Odrljin