Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

A Topological Relational Algebra in Lisp

270 views
Skip to first unread message

Norbert_Paul

unread,
Jan 11, 2015, 8:25:31 AM1/11/15
to
I would like to draw your attention to
http://pavel.gik.kit.edu/
which contains a result from a project of former times when I used to
work as a researcher.

I am mainly interested in comments from practicioners if they see any
potential practical application of my results. I am thinking of databases
for geo-information, CAD-data, or, in general, applications that must
handle topological data with arbitrary topologies.

A brief description:
During my work I have realized that there is an interesting link between
Relational Algebra and Topology (in the mathematical sense):
(1) Every finite topological space has a simple and efficient
relational representation.
(2) Every Relational Algebra operator has a corresponding topological
construction. This yields a relationally complete query language
for topological spaces.

So the idea was to extend the relational model in the following way:
(i) Given a database table $X$ add a relational representation of
a topology $T_X$ for $X$ such that the pair $(X,T_X)$ becomes
a topological space. The records in $X$ constitute the pointset
and a binary relation $R$ on $X$ represents the topology.
(ii) Provide a version for each database query operator that takes
/spaces/ as input and produces result spaces:
First act as usual on the pointsets.
Then each result set of such query has a uniquely determined
result topology, (see http://en.wikipedia.org/wiki/Final_topology
and http://en.wikipedia.org/wiki/Initial_topology) which yields
a query result space.
(iii) Do this in Lisp because Lisp is fun.
(iv) I wanted to use (and extend) CLSQL first but then decided to
change from SQL (after finding the word "madness" in the
grammar file of PostgreSQL) to Relational Algebra of which I
defined a Lispy syntax.

The web-page describes everything in more detail, contains the sources,
and has a running instance of the experimental database server. The
server is just meant as proof-of-concept and, of course, still far far
away from parcatical usefullness.

I am interested in a discussion here, because it seemed that reviewers
tended to frown on my (any my co-workers) work because, for example:
(1) The applied mathematics is too challenging for the average
audience at scientific conferences (that was an actual remark)
(2) Why should we need database systems particularly designed for
mathematicians?
(3) Arbitrary topological dimension leads to an "combinatorial
explosion of complexity" (another actual remark. Funny, because
obviously wrong)


Derek Asirvadem

unread,
Jan 12, 2015, 5:12:06 AM1/12/15
to
> On Monday, 12 January 2015 00:25:31 UTC+11, Norbert_Paul wrote:
>
> I would like to draw your attention to ...
>
> A
> I am mainly interested in comments from practicioners if they see any
> potential practical application of my results. I am thinking of databases
> for geo-information, CAD-data, or, in general, applications that must
> handle topological data with arbitrary topologies.
>
> B
> So the idea was to extend the relational model in the following way ...
>
> C
> I am interested in a discussion here, because it seemed that reviewers
> tended to frown on my (any my co-workers) work ...

It is not clear to me, precisely what you are looking for from us:

A. Confirm/deny that the research is valuable and that it has many applications.

B. [Your ongoing work] Confirm or discuss the method you are embarking on, to make the server and db more useful, more functional, to progress the research. Discuss any methods we may have used.

C. Read some of your papers, and provide more relevant reviews, than your past reviewers. Hopefully because we are not scared of the mathematics involved. Discuss the three concerns your past reviewers raised.

Cheers
Derek

com...@hotmail.com

unread,
Jan 16, 2015, 4:05:36 AM1/16/15
to
On Sunday, January 11, 2015 at 5:25:31 AM UTC-8, Norbert_Paul wrote:

> I am interested in a discussion here, because it seemed that reviewers
> tended to frown on my (any my co-workers) work because, for example:

What database-oriented field(s) do you consider this work to be contributing to? (Eg the reviewers' and audiences'.)

Just because you are using relational operators and persisting doesn't mean you are involving "relational databases". Moreover just because you are using a relation(al) algebra doesn't mean you are being "relational" in the database sense.

Besides that, you are mostly using relation operators among others to implement non-relation operators on non-relation structures that are merely represented by relations among other structures. Moreover some of your relations are specifically limited to binary relations and some are specifically representing matrices. So really at best you could say you have a partly relation(al) algebra implementation of a topological space algebra, some of the parts and operators of which are actually relation-based in the relation(al) algebra sense.

But almost none of this involves relations in the "relational database" sense. You're just using relations as an appropriate abstract and implementation sturcture; you're not using them "relationally". Because the significance of the "relational" in "relational database" is that there is a certain correspondence between relation(al) algebra operators and logic non-terminals, in such a way that there is a certain correspondence between equivalent relation expressions and predicate expressions, in such a way that not only can one query (in the sense of logical deduction, not the trivial sense of evaluating a relation expression) using either notation but the result can be automatically evaluated with certain complexity and optimization opportunities.

Here is part of what I wrote a while ago here when someone asked about a similar situation where a paper discussed analogues of relational database theory and probablitiy distributions, after I read its abstract. You can just read "topological space" for "probability distribution".

On Wednesday, April 11, 2012 at 3:02:41 PM UTC-7, com...@hotmail.com wrote:
>
> A (named attribute) relation can be seen as a set of or mapping on multi(-named-)dimensional points. Functional dependencies are properties of relation values and expressions.
>
> Database relational operators are designed so that there is a correspondence between relation expressions and predicates (and predicate expressions aka wffs). The value of a relation expression is the extension of a corresponding predicate (and wff) where the relation value's attributes are the predicate's parameters (and the wff's free variables). A relation expression has an associated predicate (and wff) built from it in a certain way according to its operators and its variables' given predicates (and wffs). The fundamental theorem of the relational model is that IF the body of each relation variable's value is the set of tuples that make a given predicate (or wff) true THEN the body of each relation expression's value is the set of tuples that make that expression's predicate (or wff) true. Eg if the predicate of relation R is R(X,Y) "person X loves person Y" and the predicate of relation variable S is S(Y,Z) "person Y loves food Z" then the expression for (R JOIN S) PROJECT_AWAY Z is EXISTS Z [S(X,Y) AND R(Y,Z)] "there exists a Z such that person X loves person Y and person Y loves food Z" ie "person X loves person Y who loves some food".
>
> Probability operators for treating relations as probability distributions will do different things (in general) than database operators. They will satisfy different theorems.
>
> A relation with a functional dependency can represent a function. Composition and images are relevant in databases when a relation expression corresponds to a function's (or wff term's) value. To the extent that distributions are used as (relational or functional) mappings such representation-independent mapping-oriented operators will appear in that system.
>
> So what we can expect is that what the two systems have in common is... they both somehow involve a relation as a set of or mapping on multi(-named-)dimensional points. Correspondences between operators other than ones that are oriented to mappings would be coincidental. I don't call that much of an analogy/parallel.


It turned out like I predicted. (Although the paper's relation-based representation of probability distributions was simpler than I expected.)

So from reading your page http://pavel.gik.kit.edu/doc/howto.html (particularly sections Topological Data Model and Example Session) my not very topologically informed impression is that your work uses relation operators but isn't "relational" in the database sense. (Even when you select from relations, it's by wffs and oblivious to the relation(al) algebra making that both possible and unnecessary.) Of course, I could be missing "relational" analogues. I hope you will try to justify any you have found or will find in light of this message.

Perhaps I can distill it down to your system's notion of "query" merely being an expression of mostly non-relation operators calculating mostly non-relation results that are also somewhat implemented by relation operators, rather than an enquiry, even most of the time they involve relations.

Norbert_Paul

unread,
Jan 17, 2015, 7:20:09 AM1/17/15
to
com...@hotmail.com wrote:
> On Sunday, January 11, 2015 at 5:25:31 AM UTC-8, Norbert_Paul wrote:
>
>> I am interested in a discussion here, because it seemed that reviewers
>> tended to frown on my (any my co-workers) work because, for example:
>
> What database-oriented field(s) do you consider this work to be contributing
> to? (Eg the reviewers' and audiences'.)
Thank you for your remarks.

My specific interest are CAD databases and, maybe, geo-information.
Hovever, I suppose, however, topological data has a wider field of application.

> Just because you are using relational operators and persisting doesn't mean
> you are involving "relational databases". Moreover just because you are using
> a relation(al) algebra doesn't mean you are being "relational" in the
> database sense.

Actually, your statement is a bit ambiguous. Doy you mean that my approach
is "non-relational" in some sense or do you say that its being "relational"
is still an open issue?

> Besides that, you are mostly using relation operators among others to
> implement non-relation operators on non-relation structures that are merely
> represented by relations among other structures.

How can structures that are represented by relations be "non-relational"?
Could you please identify the non-relation operators you are talking about?

> Moreover some of your
> relations are specifically limited to binary relations and some are
> specifically representing matrices.

The concept is the following:
Relational Model: Take relations (database tables) R1, R2, ... and operate
on them using a relational operator op to produce a
result relation
R = op(R1,R2,...).
My extension: Attach a topology T1, T2, ... to each relation to get
topological spaces (R1,T1), (R2,T2), ....
Take the same operator op to produce the same result
relation R = op(R1,R2,...).
Then there exists a unique result topology T, such that
(R,T) is the result /space/ of op.
The result topology can be computed by a relational query
on all relations involved. If there is no dimension upper
bound on the topolgoies the query languange /must/ support
recursive join on binary relations to produce the correct
result topology.
The topologies, can always be represented by a binary relation (They are
"Alexandroff topologies"). This is, where binary relations enter the scene, but
the original relational model is still present.
Note that no apprach to store a topology can have a better worst-case storage
space-complexity than O(n^2) when arbitrary topologies of dimension more than 0
are to be stored for a given point-set of sitze n.
So binary relations are asymptotically optimal with respect to storage space.
(This is even true for one-dimensional spaces only).

> So really at best you could say you have
> a partly relation(al) algebra implementation of a topological space algebra,
> some of the parts and operators of which are actually relation-based in the
> relation(al) algebra sense.

I have topologized /all/ relational operators that are necessary to get a
relationally complete query language. So I have a relationally complete query
language for topological spaces.

> But almost none of this involves relations in the "relational database"
> sense. You're just using relations as an appropriate abstract and
> implementation sturcture; you're not using them "relationally". Because the
> significance of the "relational" in "relational database" is that there is a
> certain correspondence between relation(al) algebra operators and logic
> non-terminals, in such a way that there is a certain correspondence between
> equivalent relation expressions and predicate expressions, in such a way that
> not only can one query (in the sense of logical deduction, not the trivial
> sense of evaluating a relation expression) using either notation but the
> result can be automatically evaluated with certain complexity and
> optimization opportunities.

I am aware of complexity issues and still want to focus on effectiveness
and not so much on efficiency.

Could you please provide the definition of "relational" your judgement is
based on? THis would be helpful in the discussion.
Actually the sitaution is different: I do not have an "analogy". I realized
that topological constructions (product space, subspace, glueing, wedge sum,
,...) constitute a relationally complete "query languge" for topological
spaces and that each database relation can be turned into such a topological
space by attaching a binary relation to it. So this is an extension.

> So from reading your page http://pavel.gik.kit.edu/doc/howto.html
> (particularly sections Topological Data Model and Example Session) my not
> very topologically informed impression is that your work uses relation
> operators but isn't "relational" in the database sense. (Even when you select
> from relations, it's by wffs and oblivious to the relation(al) algebra making
> that both possible and unnecessary.) Of course, I could be missing
> "relational" analogues. I hope you will try to justify any you have found or
> will find in light of this message.

I can only justify, if you give me an accepted definition of "relational".
This is, accoring to Codd RMV2:
(http://books.google.de/books/about/The_Relational_Model_for_Database_Manage.html?id=ZZckAQAAIAAJ)

1,2 The Relational Model
1.2.1 Relation is the only comound data type:

Every relation X of a relational database is a candidates as point set of a
topological space (X,T).
Every topology T is represented by a binary relation R on X.
So there only exist relations.

1.2.2 Inter-relating the Information Contained in Distinct Relations:
"All inter-relating is achieved by means of comparison of values" (Codd).

So everything of the above is satisfied, and I guess Codd's definition of
"relational" is authoritative.

You may want to compare what I did to Codd's Chapter
28.1 Requested Exensions
(3) computer-aided engineering design.

My spatial variants of queries are (formally) composed of relational
query operators.
Hence I followed Codd's general rule for making extensions:
"I strongly recommend that the problem be examined first at the level of
abstraction of the relational model. This means treating the relational
model, together with any necessary and relevant extensions, as a tool
for solving the problem."

> Perhaps I can distill it down to your system's notion of "query" merely being
> an expression of mostly non-relation operators calculating mostly
> non-relation results that are also somewhat implemented by relation
> operators, rather than an enquiry, even most of the time they involve
> relations.

I merely replaced each database query that takes a number of relations to
produce a result by a modified query that takes pairs of relations and produces
result pairs of relations. So i still don't get why this is "non-relational".

Thank you for your remarks

Norbert



Norbert_Paul

unread,
Jan 17, 2015, 7:38:53 AM1/17/15
to
So, I am interested in A,B, and, partially, C.
A: I personally see many practical applications because the approach
unifies any attempt to model topological data.
B: The discussion I experienced within the scientific community
(Conferences, Conference and Journal Reviewers) was extremely
frustrating. So I thought this group could be a better forum
for two reasons:
(1) every post is public. I know that publishing reviewer remarks
is considerd bad habit, even if they are abviously nonsense.
(2) The personal interests are different. I have had a paper rejected
for not being able to prove that the approach is novel, according to the
reviewer. Then, last year, the editor of that journal co-authored an
article with the concept of a "complex" that very much resembles
http://comjnl.oxfordjournals.org/content/early/2009/02/27/comjnl.bxn054.short
So it depends on who you are, iff your ideas are considered "novel" or
not.
C: You are, of course, welcome to read my papers:
http://scholar.google.de/citations?user=ZZMZCY4AAAAJ
(Note that another Norbert Paul is listed there, too.)
I am posting here as an experiment with a different discussion platform.

> Cheers
> Derek

Than you
Norbert

Tegiri Nenashi

unread,
Jan 17, 2015, 2:48:34 PM1/17/15
to
Can we lower discussion level a little? Instead of topologies lets focus on metric spaces. Now consider multivariate relation:

x y z
-----
1 1 1
2 1 1
3 2 1
3 2 2

What metric space do you expend this relation to?

Tegiri Nenashi

unread,
Jan 17, 2015, 2:50:56 PM1/17/15
to
"extend" (what a dinosaur forum where one can't edit typos in the post).

Norbert_Paul

unread,
Jan 17, 2015, 6:07:12 PM1/17/15
to
I will call that relation "X".

Then, of course, the metric space will be the discrete space (X.P(X)) where
P(X) is the power set of X. It will be modelled as
(X,\emptyset)
because the empty set creates the discrete topology.

Actually, when considering databases, I am not (yet) interested very much in
metric topologies for databases.

The reason is that a database relation is always finite (one might dispute that
but it is a assumption a make). My model so far only considers finite database
relations, and takes them as topological point sets. But a finite metric space
is always discrete (where every subset of the point set is open), and, hence,
quite uninteresting.

Whereas I consider the discrete topology a "default" topology for bare sets
(database relations with no explicitly defined topoology) in my concept,
it is a commonly known fact that finite metric topologies are always discrete.
Discrete topologies are an important extreme (like the universal relation in
relation theory) but not of much practical interest within my concept.

On the other hand, topological spaces that serve as a physical model of
topologically organized data (say, R^3 for spaes or R^4 for space-time
models) have their natural (metric) topology. So I do not neglect metric
topologies. IMHO they just do not have a relational representation for
interesting practical applications.

However, when spatial entities that live in some R^n (R^2 - the plane, R^3 -
the Euclidean space, R^4 - the space-time, ...) are modelled and
(according to my assumption) are partitioned into a a finite set of spatial
entities (say, faces, edges, volumes, hyper-volumes, ...) then the set of
entities immediately fits into my model (for being finite). Interestingly the
entities get their topology by the same operations that produce topological
result spaces from input spaces (so-called !quotient space"). Anyhow, the
resulting topology for these entities (like an edge being bounded by vertices)
usually does not constitute a metric sapce any more.

Maybe, however, your post might be the start of developing a concept to
introduce concepts for metric topologies. Note, however, that, in order to get
interesting topologies (not only the dioscrete one), the point set must be
infinite.

Cheers
Norbert

Norbert_Paul

unread,
Jan 17, 2015, 6:09:18 PM1/17/15
to
Yeah. An you can't reliably cancel your post because Google groups even
archives posts that have beeen cancelled by the original posters.



Tegiri Nenashi

unread,
Jan 17, 2015, 6:58:58 PM1/17/15
to
On Saturday, January 17, 2015 at 3:07:12 PM UTC-8, Norbert_Paul wrote:
> Tegiri Nenashi wrote:
> > Can we lower discussion level a little? Instead of topologies lets focus on metric spaces. Now consider multivariate relation:
> >
> > x y z
> > -----
> > 1 1 1
> > 2 1 1
> > 3 2 1
> > 3 2 2
> >
> > What metric space do you expend this relation to?
>
> I will call that relation "X".

X capital is fine (as long as reader is not confusing it with lower x which is the first attribute).

> Then, of course, the metric space will be the discrete space (X.P(X)) where
> P(X) is the power set of X. It will be modelled as
> (X,\emptyset)
> because the empty set creates the discrete topology.

What is the power set of a relation? Most of the time, when people refer to set structure of relation they mean set of tuples. Therefore, I interpret your answer as

{{},{(1,1,1)},{(1,1,1),(2,1,1)},...,{(1,1,1),(2,1,1),(3,2,1),(3,2,2)}}

Why this is an interesting object? It seems that further along you argue it isn't?

> Actually, when considering databases, I am not (yet) interested very much in
> metric topologies for databases.

Metric space is something that you can explain to undergraduate student from STEM field. I doubt every math major undergraduate can coherently explain what abstract topology is.

This is one of the reasons why databases enjoyed such a success. The basic object is a predicate -- a concept easily grasped by anybody taking introductory course in mathematical logic.

> The reason is that a database relation is always finite (one might dispute that
> but it is a assumption a make). My model so far only considers finite database
> relations, and takes them as topological point sets. But a finite metric space
> is always discrete (where every subset of the point set is open), and, hence,
> quite uninteresting.

Forget about metric spaces then. Can you elaborate a toy example with one or two relations and what topological objects they correspond to?

> Whereas I consider the discrete topology a "default" topology for bare sets
> (database relations with no explicitly defined topoology) in my concept,
> it is a commonly known fact that finite metric topologies are always discrete.
> Discrete topologies are an important extreme (like the universal relation in
> relation theory) but not of much practical interest within my concept.
>
> On the other hand, topological spaces that serve as a physical model of
> topologically organized data (say, R^3 for spaes or R^4 for space-time
> models) have their natural (metric) topology. So I do not neglect metric
> topologies. IMHO they just do not have a relational representation for
> interesting practical applications.
>
> However, when spatial entities that live in some R^n (R^2 - the plane, R^3 -
> the Euclidean space, R^4 - the space-time, ...) are modelled and
> (according to my assumption) are partitioned into a a finite set of spatial
> entities (say, faces, edges, volumes, hyper-volumes, ...) then the set of
> entities immediately fits into my model (for being finite). Interestingly the
> entities get their topology by the same operations that produce topological
> result spaces from input spaces (so-called !quotient space"). Anyhow, the
> resulting topology for these entities (like an edge being bounded by vertices)
> usually does not constitute a metric sapce any more.

Again, in order not to loose your reader (me), can you please exhibit an example? For example, take a concrete simplicial complex, are you saying you have innovative way to represent it "relationally"? (Please keep the example concrete).

> Maybe, however, your post might be the start of developing a concept to
> introduce concepts for metric topologies. Note, however, that, in order to get
> interesting topologies (not only the dioscrete one), the point set must be
> infinite.

No, if metric space correspondence is vacuous, there is no point elaborating this -- I understand that metric space view can be not interesting, while topology offering some nontrivial insight.

Norbert_Paul

unread,
Jan 18, 2015, 4:03:27 AM1/18/15
to
Tegiri Nenashi wrote:
> On Saturday, January 17, 2015 at 3:07:12 PM UTC-8, Norbert_Paul wrote:
>> Tegiri Nenashi wrote:
>>> Can we lower discussion level a little? Instead of topologies lets focus
>>> on metric spaces. Now consider multivariate relation:
>>>
>>> x y z
>>> -----
>>> 1 1 1
>>> 2 1 1
>>> 3 2 1
>>> 3 2 2
>>>
>>> What metric space do you expend this relation to?
>>
>> I will call that relation "X".
>
> X capital is fine (as long as reader is not confusing it with lower x which
> is the first attribute).

OK. It is commonplace among topologists to denote the point set by X and the
topology by T (or, Tau). I overlooked the possible confusion.
>
>> Then, of course, the metric space will be the discrete space (X.P(X))
>> where P(X) is the power set of X. It will be modelled as (X,\emptyset)
>> because the empty set creates the discrete topology.
>
> What is the power set of a relation? Most of the time, when people refer to
> set structure of relation they mean set of tuples. Therefore, I interpret
> your answer as
>
> {{},{(1,1,1)},{(1,1,1),(2,1,1)},...,{(1,1,1),(2,1,1),(3,2,1),(3,2,2)}}
>
> Why this is an interesting object? It seems that further along you argue it
> isn't?

Yes, the power set P(X) of a set is the set of all subsets of X.

It is interesting for formal reasons: All topologies for a set form al lattice
where the discrete topology is the maximal element and its counterpart, the
indiscrete topology {{}, X}, is the minimal element. It is also often used in
proofs but it has not many practical application. One might be its being a
suitable default topology (IMHO) if no other topology is specified.

>> Actually, when considering databases, I am not (yet) interested very much
>> in metric topologies for databases.
>
> Metric space is something that you can explain to undergraduate student from
> STEM field. I doubt every math major undergraduate can coherently explain
> what abstract topology is.
>
> This is one of the reasons why databases enjoyed such a success. The basic
> object is a predicate -- a concept easily grasped by anybody taking
> introductory course in mathematical logic.

Maybe future introductory courses in mathematics for STEM also cover some
topology. For example, Calculus uses mucht topology (using the metric topology
of the reals).

>> The reason is that a database relation is always finite (one might dispute
>> that but it is a assumption a make). My model so far only considers finite
>> database relations, and takes them as topological point sets. But a finite
>> metric space is always discrete (where every subset of the point set is
>> open), and, hence, quite uninteresting.
>
> Forget about metric spaces then. Can you elaborate a toy example with one or
> two relations and what topological objects they correspond to?

Take a set

X = {bathroom, bdoor, hallway}.

Each element can be considered a subset of the 3D real space R^3 (its
"geometry"). The set could be coded as a database table

roomid name floor department
--------------------------------------------
bathroom "Bathroom 007" 2nd 007
hallway "Hall 007" 2nd 007
bdoor "Bathroom door" 2nd 007.

A topology could be

T = { {}, {bathroom}, {hallway}, X}.

Note that it does not contain the set {bdoor}.
This topology tells us, that bdoor is close to the
bathroom and also close to hallway.
So it represents the fact, that bdoor connects
the bathroom to the hallway. (Here also the connection
to the metric topology of the space in R^3 where these
three entities live, could be discussed.)

The above topology can be coded as a binary relation on X:

R = bounded boundary
-----------------
bathroom bdoor
hallway bdoor

If you transpose the relation you get

RT = bounded boundary
-----------------
bdoor bathroom
bdoor hallway

This gives a topology

{ {}, {bdoor}, X}

the so-called "dual topology".
In the example case this topology is a graph where bdoor is an edge
connecting bathroom and hallway. These graphs are often called
"room adjacency graphs" by planners.
Example:
http://www.arch.virginia.edu/insightlab/student.php?postid=68

>> However, when spatial entities that live in some R^n (R^2 - the plane, R^3
>> - the Euclidean space, R^4 - the space-time, ...) are modelled and
>> (according to my assumption) are partitioned into a a finite set of
>> spatial entities (say, faces, edges, volumes, hyper-volumes, ...) then the
>> set of entities immediately fits into my model (for being finite).
>> Interestingly the entities get their topology by the same operations that
>> produce topological result spaces from input spaces (so-called !quotient
>> space"). Anyhow, the resulting topology for these entities (like an edge
>> being bounded by vertices) usually does not constitute a metric sapce any
>> more.
>
> Again, in order not to loose your reader (me), can you please exhibit an
> example? For example, take a concrete simplicial complex, are you saying you
> have innovative way to represent it "relationally"? (Please keep the example
> concrete).

See the above example.

The relational representaion of a topology is an old and well-established
fact in mathematics. The innovation is (IMHO) to combine it with the relational
model.

>> Maybe, however, your post might be the start of developing a concept to
>> introduce concepts for metric topologies. Note, however, that, in order to
>> get interesting topologies (not only the dioscrete one), the point set must
>> be infinite.
>
> No, if metric space correspondence is vacuous, there is no point elaborating
> this -- I understand that metric space view can be not interesting, while
> topology offering some nontrivial insight.

The bathroom door example topology is derived from the metric space
that door and rooms live in.

Norbert_Paul

unread,
Jan 18, 2015, 4:24:12 AM1/18/15
to
OOPS!!!

Norbert_Paul wrote:
> A topology could be
>
> T = { {}, {bathroom}, {hallway}, X}.
>
This is wrong. The topology looks like
T = { {}, {bathroom}, {hallway}, {bathroom, hallway}, X}.


> This gives a topology
>
> { {}, {bdoor}, X}
>
> the so-called "dual topology".
Also wrong. The dual topology looks like
{ {}, {bdoor}, {bdoor, bathroom}, {bdoor, hallway}, X }.

Seems, I am a bit out of practice.

Eric

unread,
Jan 18, 2015, 12:40:06 PM1/18/15
to
On 2015-01-17, Norbert_Paul <norbertpau...@yahoo.com> wrote:
> Tegiri Nenashi wrote:
>> On Saturday, January 17, 2015 at 11:48:34 AM UTC-8, Tegiri Nenashi wrote:

>> ... (what a dinosaur forum where one can't edit typos in the post).
> Yeah. An you can't reliably cancel your post because Google groups even
> archives posts that have beeen cancelled by the original posters.

Not a forum in the current "web forum" sense, though of course it is a
forum in the sense of "a place to discuss things". As soon as you click
the Post button the message is on its way to the world of NNTP servers,
controlled by a variety of people and organisations. There is no central
control, there is no master copy that you can edit.

If you post from Google Groups you can delete you own post, but your
chances of doing that before it is on its way round the world are very
low.

If you are not using Google Groups, in theory you can "cancel" or
"supersede" your post, but this usually does not work, partly because
message distribution is asynchronous and multi-path so the cancellation
etc may not reach some server that has the original message, but mostly
because many NNTP servers ignore at least some cancellations etc, and
the policies for which (if any) will be obeyed vary between servers. The
reason for this is that fake cancellations etc are extremely common and
ignoring them is seen as the lesser of two evils.

Apparently Google Groups ignores cancellations etc, it would be
pointless to be the odd one out.

also, there are Google Groups which are not Usenet Groups and do not
behave like this.

This is a Usenet Group, and the behaviour you dislike is, in fact, the
price of freedom.

Eric
--
ms fnd in a lbry

Tegiri Nenashi

unread,
Jan 18, 2015, 5:18:48 PM1/18/15
to
On Sunday, January 18, 2015 at 1:03:27 AM UTC-8, Norbert_Paul wrote:
> Take a set
>
> X = {bathroom, bdoor, hallway}.
>
> Each element can be considered a subset of the 3D real space R^3 (its
> "geometry"). The set could be coded as a database table
>
> roomid name floor department
> --------------------------------------------
> bathroom "Bathroom 007" 2nd 007
> hallway "Hall 007" 2nd 007
> bdoor "Bathroom door" 2nd 007.
>
> A topology could be
>
> T = { {}, {bathroom}, {hallway}, X}.

OK, this example illustrates what your topological objects are (modulo corrections in your followup post). The next question is how do you fit this object into database relation.
OK, this is intuitive, but I would like to mention that you are still quite far away from equipping [database] relations with topology. The technical difficulty here is that binary relations are different from multivariate relations studied in database theory. Specifically, the most important algebraic operation among binary relations is composition, while the most important operation among multivariate relations is natural join. The analog of composition in database world is natural join projected to the set of "distinct" attributes (formally symmetric difference). Composition is associative in the world of binary relations, but not associative in general in the database world.

Many people in database community disagree, insisting that binary relations are just special case of multivariate relations. This is one of the most glossed over topics in [database] relational theory. Here is the challenge. Take a relation R with three attributes x,y,z and define transitive closure. Even simpler problem: define composition of this relation with itself.

> The relational representaion of a topology is an old and well-established
> fact in mathematics. The innovation is (IMHO) to combine it with the relational
> model.

Again, you are referring to the theory of binary relations
http://en.wikipedia.org/wiki/Relation_algebra
which is different from relational model used in database theory.

> The bathroom door example topology is derived from the metric space
> that door and rooms live in.

Thank you for clarification. However, I still don't quite understand what mathematical object that you study. If this is a [database] relation amended with topology, can you please elaborate the details? In your example, do you leave the Rooms table alone, and introduce additional binary RoomNeighbourhood relation? Or you combine the two into some multivariate RoomsWithTopology relation, or even generalize database relation into entirely new object? Perhaps you can elaborate your example and work out the join of two topological relations?

Regarding Alexandroff topology, you have mentioned that it is a set together with binary relation. Is this relation special in any way; what of its properties can be expressed formally?
http://en.wikipedia.org/wiki/Relation_algebra#Expressing_properties_of_binary_relations_in_RA

Norbert_Paul

unread,
Jan 19, 2015, 2:58:42 PM1/19/15
to
Tegiri Nenashi wrote:
> On Sunday, January 18, 2015 at 1:03:27 AM UTC-8, Norbert_Paul wrote:
>> Take a set
>>
>> X = {bathroom, bdoor, hallway}.
>>
>> Each element can be considered a subset of the 3D real space R^3 (its
>> "geometry"). The set could be coded as a database table
>>
>> X = roomid name floor department
>> --------------------------------------------
>> bathroom "Bathroom 007" 2nd 007
>> hallway "Hall 007" 2nd 007
>> bdoor "Bathroom door" 2nd 007.
>>
>> A topology could be
>>
>> T = { {}, {bathroom}, {hallway}, X}.
>>
>> The above topology can be coded as a binary relation on X:
>>
>> R = bounded boundary
>> -----------------
>> bathroom bdoor
>> hallway bdoor
> OK, this is intuitive, but I would like to mention that you are still quite
> far away from equipping [database] relations with topology. The technical
> difficulty here is that binary relations are different from multivariate
> relations studied in database theory. Specifically, the most important
> algebraic operation among binary relations is composition, while the most
> important operation among multivariate relations is natural join. The analog
> of composition in database world is natural join projected to the set of
> "distinct" attributes (formally symmetric difference). Composition is
> associative in the world of binary relations, but not associative in general
> in the database world.

Just a the pair (X,T) is called "topological space", I call the pair (X,R)
a topological data type, because I consider the relation R a binary relation
R \subseteq X \times X
on X where, by abuse of notion, R will only contain primary key attributes from
X. Then R generates a topology T(R) for X this representing the tpological
spcae
(X,T(R)).
I am, hence, thinking of registering such pairs in the database catalog as
"spaces".

So I have both: a "multivariant" relation X accompanied by a binary relation
R on X. In the ER-model X is usually called an "entity type" and R a
"relationship type".

> Many people in database community disagree, insisting that binary relations
> are just special case of multivariate relations. This is one of the most
> glossed over topics in [database] relational theory. Here is the challenge.
> Take a relation R with three attributes x,y,z and define transitive closure.
> Even simpler problem: define composition of this relation with itself.

I am only interested in topology and, hence a binary relation is sufficient.
However I can imagine a binary relation with three attributes:

create table PointSet
planid integer not null,
objectid integer not null,
-- ... additional attributes ...
primary key (planid, objectid)
-- ... additional constraints ...
);

create table Topology(
planid integer not null,
bounded integer not null,
boundary integer not null,
primary key(planid, bounded, boundary),
foreign key (planid, bounded) references point_set,
foreign key (planid, boundary) references point_set);

Then the pair (PointSet, Topology) represents topological space that is a
so-called direct sum, a disjoint union of topological spaces each indexed by
planid: http://en.wikipedia.org/wiki/Disjoint_union_(topology)
The ineresting thing is, that the above construction would enforce the
"dircet sum" property as a consistency rule.

Also a ternary relation can be used to model a binary relation where each
association has an integer attribute denoting the orientation of the boundary.
But this is future work. I want also be able to model complexes.
http://en.wikipedia.org/wiki/Chain_complex

>> The relational representaion of a topology is an old and well-established
>> fact in mathematics. The innovation is (IMHO) to combine it with the
>> relational model.
>
> Again, you are referring to the theory of binary relations
> http://en.wikipedia.org/wiki/Relation_algebra which is different from
> relational model used in database theory.

Again, I am always talking of a /pair/ (X,R) of relations where X is arbitrary
and R a binary relation on X (thereby, of course, omitting all non-key
attribute values from X).

If you register such space in a hypothetical catalogue of spaces, say by
sort of

create space OuterSpace(
PointSet, -- the point set
Topology, -- the relation
(planid, bounded) -- the FK on the "bounded side".
);

I willl give an example query to illustrate the idea: When the pair
(PointSet, Topology) is registered as a space you could query OuterSpace
just like an ordinary relation:

select *
from OuterSpace
where planid = 9;

would give a pair (X9, R9) which would represent plan 9 from OuterSpace
as a tpopological space (X9, T9) and which then would consist of the two
relations

X9 = select * from PointSet where planid = 9;

R9 = select R.planid, R.bounded, R.boundary
from cltopology R, X9 bded, X9 bdry
where R.planid = bded.planid and R.bounded = bded.objectid
and R.pplanid = bdry.pplanid and boundary = bry.objectid --.

Here cltopology is the transitive closure on topology. R9 then generates the
"subspace topology" of X9 in OuterSpace:
http://en.wikipedia.org/wiki/Subspace_topology
In the above example, the transitive closure is not necessary but in general,
correct subspace topology computation requires transitive closure. Its
efficient use requires a good query optimizer.

The entire Relational Algebra can be "topologized" this way. Every operator
of Relational Algebra can be applied to such a space and will then return a
correct result space. Some examples:

group by: http://en.wikipedia.org/wiki/Quotient_space_(topology)
cross join: http://en.wikipedia.org/wiki/Product_topology
projection: http://en.wikipedia.org/wiki/Final_topology
equijoin: http://en.wikipedia.org/wiki/Pullback_(category_theory)

>> The bathroom door example topology is derived from the metric space that
>> door and rooms live in.
>
> Thank you for clarification. However, I still don't quite understand what
> mathematical object that you study. If this is a [database] relation amended
> with topology, can you please elaborate the details? In your example, do you
> leave the Rooms table alone, and introduce additional binary
> RoomNeighbourhood relation? Or you combine the two into some multivariate
> RoomsWithTopology relation, or even generalize database relation into
> entirely new object? Perhaps you can elaborate your example and work out the
> join of two topological relations?

It is a RoomsWithTopology object that is represented by a pair of relations
just as described above. I am thinking of expanding the relational catalog to
register topological spaces. Then a query on spaces always uses the topological
counterparts of each relational operator involved. A mix of spatial and
non-spatial operators combining "ordinary" database relations with "spaces" is
also possible.

> Regarding Alexandroff topology, you have mentioned that it is a set together
> with binary relation. Is this relation special in any way; what of its
> properties can be expressed formally?
> http://en.wikipedia.org/wiki/Relation_algebra#Expressing_properties_of_binary_relations_in_RA

Usually, the relations are considered preorders (called "specialization
preorder"). However this restriction can be dropped (See property (1) below).
Besides that, there are no particular restrictions on the binary relation. Some
properties of R, however, yield particular topological properties of the
topological space (X,T(R)), where T(R) is the topoogiy on X represented by the
relation R:

(1) Two relations R and S produce the same topology, iff their transitive and
reflexive closures R^* and S^* are equal, hence, R^* = S^* <=> T(R) = T(S)
always holds for arbitrary relations R,S \subseteq X.
Reflexive closure is always relative to X:
R^* = R^+ \cup I(X)
Where R^+ is the transitive closure of R and I(X) the identity on X
(modelled according to the the schema of R, of course).

(2) Iff the relation has no cycles, the topological space it generates is
called T_0 separable: http://en.wikipedia.org/wiki/Kolmogorov_space

(3) The pair (X,R) can be considered a directed graph with vertex set X
and edge set R. All "connectedness" notions of (X,R) matsch the
corresponding topologigal notions of (X,T(R)). (X,R) is a connected graph
iff (X,T(R)) is a connected space and the connected components of
(X,R) (as graphs) represent the connected components of thr topological
space (X,T(R)).

(4) The space is discrete iff R is coreflexive.

(5) Reflexivity and transitivity have no influence on the topology but may
affect the costs of query processing.

(6) Iff R^* is an equivalence then the topological space is a disjoint sum of
trivial spaces: http://en.wikipedia.org/wiki/Trivial_topology
In particular, iff R^* is the universal relation on X, then
(X,T(R)) is a trivial space.

(7) The height of R (the length n of the maximal acyclic chain
x0 R x1 R ... R xn) gives the topological Krull-Dimension
(http://de.wikipedia.org/wiki/Krulldimension - German link, because the
English article only contains the algebraic definition) of the space.

... these are the ones I already know. The list you gave me is handy. I could
check for each property mentioned there if and how it affects the topology.

Instead of restricting R, one could provide consistency rules for spaces to
enforce topological properties of modelled spaces.

Hope this clarifies my idea.

Norbert

Tegiri Nenashi

unread,
Jan 19, 2015, 5:53:05 PM1/19/15
to
On Monday, January 19, 2015 at 11:58:42 AM UTC-8, Norbert_Paul wrote:
> Tegiri Nenashi wrote:
> > On Sunday, January 18, 2015 at 1:03:27 AM UTC-8, Norbert_Paul wrote:
> >> Take a set
> >>
> >> X = {bathroom, bdoor, hallway}.
> >>
> >> Each element can be considered a subset of the 3D real space R^3 (its
> >> "geometry"). The set could be coded as a database table
> >>
> >> X = roomid name floor department
> >> --------------------------------------------
> >> bathroom "Bathroom 007" 2nd 007
> >> hallway "Hall 007" 2nd 007
> >> bdoor "Bathroom door" 2nd 007.
> >>
> >> A topology could be
> >>
> >> T = { {}, {bathroom}, {hallway}, X}.
> >>
> >> The above topology can be coded as a binary relation on X:
> >>
> >> R = bounded boundary
> >> -----------------
> >> bathroom bdoor
> >> hallway bdoor
> >
> > ... The technical difficulty here is that binary relations
> > are different from multivariate
> > relations studied in database theory...
>
> Just a the pair (X,T) is called "topological space", I call the pair (X,R)
> a topological data type, because I consider the relation R a binary relation
> R \subseteq X \times X
> on X where, by abuse of notion, R will only contain primary key attributes from
> X...

OK, now it is clear: the objects you are focusing on are pairs of multivariate
relation X in the first component, and binary relation R in the second.

Now you can define operations in Norbert algebra "component-wise". For example you can define "multiplication" as natural join in the first component and composition in the second. This would be not entirely without merit, because multiplication would be commutative and associative operation. However, natural join in [multivariate] relational algebra is idempotent, while composition in algebra of binary relations is not. Therefore, I'm not certain how well defined your algebra is. Can you please describe each and every of the following operations: natural join (as it might be something other than my guess), projection, and union?

> So I have both: a "multivariant" relation X accompanied by a binary relation
> R on X...

Multivariate, not "multivariant". Algebraic geometry studies systems of multivariate polynomials. If you think about zeroes of multivariate polynomials, they are exactly the tuples from database relations. I simply prefer the shorter term "multivariate relation" compared to more verbose "relation with named attributes". Again, it is important to clarify what mathematical structure are we talking about: Pierce-Tarski algebra of binary relations, or Codd's algebra of relations with named attributes (AKA "multivariate" relations).

> > Many people in database community disagree, insisting that binary relations
> > are just special case of multivariate relations. This is one of the most
> > glossed over topics in [database] relational theory. Here is the challenge.
> > Take a relation R with three attributes x,y,z and define transitive closure.
> > Even simpler problem: define composition of this relation with itself.
>
> I am only interested in topology and, hence a binary relation is sufficient.
> However I can imagine a binary relation with three attributes:

You lost me: binary relation with three attributes?

>
> create table PointSet
> planid integer not null,
> objectid integer not null,
> -- ... additional attributes ...
> primary key (planid, objectid)
> -- ... additional constraints ...
> );
>
> create table Topology(
> planid integer not null,
> bounded integer not null,
> boundary integer not null,
> primary key(planid, bounded, boundary),
> foreign key (planid, bounded) references point_set,
> foreign key (planid, boundary) references point_set);

You defined topology as binary relation. Here you exhibited ternary relation. Granted you may recover topology as a binary relation from it, that is not the same as "binary relation with three attributes". Perhaps, if we dwell little more on this idea I would be able to express my discomfort with more precision.
This gives me a flavor how you define selection. However, selection can be
considered as a special case of natural join. Here I see that my earlier guess about natural join defined as composition in the second component is simply wrong.

> Here cltopology is the transitive closure on topology. R9 then generates the
> "subspace topology" of X9 in OuterSpace:
> http://en.wikipedia.org/wiki/Subspace_topology
> In the above example, the transitive closure is not necessary but in general,
> correct subspace topology computation requires transitive closure. Its
> efficient use requires a good query optimizer.
>
> The entire Relational Algebra can be "topologized" this way. Every operator
> of Relational Algebra can be applied to such a space and will then return a
> correct result space. Some examples:
>
> group by: http://en.wikipedia.org/wiki/Quotient_space_(topology)
> cross join: http://en.wikipedia.org/wiki/Product_topology
> projection: http://en.wikipedia.org/wiki/Final_topology
> equijoin: http://en.wikipedia.org/wiki/Pullback_(category_theory)

"group by" generalizes projection. Does Quotient Space generalize Final Topology?

Since you offered definition of topological relation as ordered pair of
multivarite relation together with binary relation, you can specify operations in your algebra without referring to topology. Again, I'm interested to see formal definitions of natural join, projection, and union.

> It is a RoomsWithTopology object that is represented by a pair of relations
> just as described above. I am thinking of expanding the relational catalog to
> register topological spaces. Then a query on spaces always uses the topological
> counterparts of each relational operator involved. A mix of spatial and
> non-spatial operators combining "ordinary" database relations with "spaces" is
> also possible.

Again, can you please take the (Rooms,RoomNeighbourhood) pair from your example, then introduce one more pair and, next, work out the natural join of these two pairs (is it something meaningful)?

Derek Asirvadem

unread,
Jan 20, 2015, 4:36:48 AM1/20/15
to
Norbert

Please forgive the length; my comments are terse (feel free to ask questions about anything that is not completely clear). Hopefully you will find them to be a value. Take your time responding.

> On Saturday, January 17, 2015 at 11:38:53 PM UTC+11, Norbert_Paul wrote:

-----------
-- A & B --
-----------

Relevance & Value
-----------------

It is absolutely valuable. Of course, for GIS and CAD, but nowadays, for much more.

1. Years ago I wrote a app+Rdb for the Dept of Mineral & Energy (mining). They had a huge GEAC GIS system, proprietary with an SQL database. My system was quite separate, for Registration & Correspondence, it kept all the legal docs and correspondence together, on a registration basis (read lease-holder), whereas the GIS system kept every claim; every mining shaft; every hydrant; every nook and cranny, together. Of course the GIS system was full 3D, and printed something like the architectural plan for a house. On the registry side, some of the legal docs, when they are printed, need a proper diagram: not a full 3D, but a simple 2D-looking-like-3D, or 3D simplified. The same as your "flattening" function.

So I wrote a simple transport to query the GSI system and obtain the GIS data for a claim or whatever, and to construct a simple diagram. Only reqd for some printouts. I used awk for the ETL and computation, and gnuplot for the erection of the graphics. I didn't keep any GIS data on my side, in order to avoid duplication. But there was a rather large study of doing so, before I convinced them that they shouldn't.

Point being, if I had your tables and queries, I would have used them.

2. There are thousands of GIS & CAD type systems out there. They need enhancement, extension, additional capabilities that have been contemplated but not implemented, that needs to be written.

3. More than the GIS & CAD systems, nowadays we need to erect and display 3D objects, from virtually any system. Right now, I am working on a model (I can't discuss its specifics due to confidentiality) that is:
- to get an idea of the type of drawing, the structure of the objects: imagine eg. a typical standard-complaint IDEF1X model of a Rdb, with all the objects hanging in 3D space, and each object is full 3D (shape; volume; colour; shadow; etc)
- that has many layers, which have to be drawn and manipulated separately (separate parties, separate departments)
- all dragged together, into one layered corporate-level diagram, so that everyone can henceforth use one diagram, and conflicts can be eliminated
- all the issues of different people drawing objects differently have to be resolved
- I just demand vector graphics, with the dimensions stated, and some rules observed (relate to zero axis, etc)
Now I am doing this in a very good drawing tool, but not a CAD tool (on purpose). If I had something like your tables & queries, it would save me a lot of time.

There are millions of needs like this.

4. When I was in Germany last, I visited the factory of a company that builds 100% wood houses, using 50 or 60 different types of wood. You design the house on a computer; they make up all the components in a couple of weeks; then they go to the site with a few trucks and equipment and build the entire house in one week. They had it the other way around: an Rdb, which contained all the drawing elements (not the drawings) and inventory (components of a house). They then exported that data to a CAD system, where the drawing itself took place. The CAD system was not very good at topology. Ie. they had respect for the Rdb, and wouldn't dream of placing their company data on something that was not Relational. If they had your tables & queries, they could do it all from one system.

Topology in Relational Form
---------------------------

I find it amazing that mathematical types have a well-established topology set ... that is not Relational. What on earth is the point of defining anything, in terms that are not Relational. I mean, it is 2015, and Codd wrote his paper in 1970. Congratulations on developing a fully Relational set of topological relations.

Next Steps
----------

I would advise you to find a suitable company that has a need, such as the examples above, and team up with them. They will most probably fund your research. But then you would have to use industrial-strength tools, platforms, languages. You would have to consider:
- their data (inventory of objects) already exists in a Relational DB, usually SQL
- for a first cut, translate that data into your topology, then query (erect diagram)
- for the long term, change their database such that it stores your topology in relations that can be used to query directly

So the real goal is, to produce a paper that defines your topology in terms of relations, with a fully detailed data model, such that one can understand it, and implement one's objects in a relational topology, such that one can use your queries to erect the diagrams.

Method
------

So the result would be a set of routines in some callable language, delivered as library, plus a set of adequate documentation. Then anyone can link it in and use it, in any app+Rdb (3rd party product or in-house).

Therefore, LISP is fine for now ("fun" is a very good reason at this stage), but you must keep your eye on the long-term goal of a packaged library, which you work up to, over time.

Most important, stay away from the notion of doing everything in one monolithic program. As a stand-alone program, the capability is irrelevant. This is a common problem amongst academics.
- SQL is the data sub-language for accessing an RDb, which is deployed in an RDBMSS
- it is not a language (do not expect the capabilities of a language)
- trying to do everything in SQL, or everything in (C, Java, a "d", etc) is stupid. It is 2015, we use various components and tie them together to construct a solution to a problem, not one monolithic program that "does everything", which is severely limited to "everything" that it can do.
- eg. the notion of a "d" might be sound, for drawing in the sand, whilst contemplating some mathematical definition or other. The sound-ness ends there. Eg. only a tiny fraction of the RDBMS users can use it, because only that same tiny fraction care about writing database access (definiton as well as manipulation0 in terms of mathematical definitions of "relations". Eg. note the gross architectural limits of the monolith.

Right now, any of my product deliveries consist of:
- Sybase: Client, plus Server; plus Unix utilities
- a complete, stand-alone RDb, which can be added to without limitation [*1]
- a full set of Transactions (the ACID-compliant, high concurrency variety, not the imbecilic self-locking MVCC variety)
- a full set of reports in whatever s/w the customer has, for report generation (because every report is a single SELECT command, because the Rdb is truly Relational). On top of which, they are free to construct any further reports. No massive report tool such as BusinessObjects is required (such are only required in order to provide reporting [Relational] access to Record Filing Systems)
- all maintenance & housekeeping tasks, as shell scripts [that call awk,] that call SQL, callable from a GUI menu. This includes everything from nightly jobs that maintain the Rdb, as well as an array of back-end monitoring.
- all ETL, both import and export, in awk, which first loads a set of awk-arrays with data from the Rdb, and then does its work (perl is also quite valid here)
- the client-side program(s) can be built in any language: Java; C or any derivative.
- which may call gnuplot to erect graphs, etc.
- it is strange that many people use .NET, etc as a *development* language. I and many of my colleagues use it strictly as a delivery platform, the idea of coding in it is insane, much like coding in a 2GL, Assembler, etc. No. We code in 4GLs, such as PowerBuilder (there are many similar), and *deploy* on .NET or Java or Web, etc, by simply pressing a button.
I am trying to give you an idea of the componentry, which is normal, demanded, since 1990, for a single product delivery. Note that each component can be expanded and extended, and it can be deployed according to the specific requirements of the site (security model; actual s/w package used; etc).

*1 if you had your topology delivered as a set of tables plus a library, they would fit in here. That would allow me to erect topological diagrams directly from the client-dside program. Sure, it would have its own set of topological queries, which eventually have to call SQL. In the same way that GUI client-side programs now call a library to erect 2D graphs within the GUI (a new window is just fine).

Practical Application
---------------------

Personally, I can't see the point in doing *ANY* research that doesn't have a practical application. If it doesn't have a practical *use*, well, it is *useless*. Once you cross the line into the insanity, of researching useless things, there is no end to it. Sure, these days the academic world is chock-ful of research that has no purpose, it is not like the old days. Which is what, in large part, leads to the stupidity of reviewers and their criteria: they prioritise category items on a basis that is, well, insane, non-logical. "Novelty is more relevant than relevance."

Note that I did not say that all research should be commercially relevant, which is a different thing. Let's not get distracted.

Quality

One thing I must say is that your papers are much better than most papers in these subject areas, both the Relational and the BIM/CIM. They have far more definition and far more detail, they are well laid-out, and they are more understandable. The concise number of citations is great (lousy papers have hundreds of citations, 99% of which are irrelevant; they have the structure but lack substance).

Possible Improvement
--------------------

As for improvements (to the already great papers), personally, I would:
- increase the number of diagrams
- include standard-compliant data models. Better if you had one single data model as an appendix., and refer to it or parts of it.
- use terms that are accepted in the industry
- use a better drawing tool, so that the diagrams are cleaner, more precise. I think you know that the arrow-lines and the fill-pattern is too faint to show up in a PDF. All that has to be fixed

Which leads to ...
- use the same symbols in the diagram as it the text (you have done that, I am asking for more symbols in the diagram, and a more drafting-style diagram)
- eg. fig 1.2 in /Topological Database for Architectural Spaces/ is too crowded: it has to standardised and expanded
- eg. fig 1.5: draw the dots and arrowheads much smaller; assuming the perspective is from above, use three separate shades of grey and one of white.

I offer this (page 1) as a suggestion (the legend is obviously wrong, but it is not clear which line or plane in the source diagram is being referred to; if the greater-than sign is a pointer or if it is part of the function):
http://www.softwaregems.com.au/Documents/Student%20Resolutions/Norbet%20Paul/TDAS%20Note.pdf

All that is minor, except for the data model.

That is the all I can think of.

Documentation
-------------

The documentation required for a package is of course, different to the papers. The papers remain as the foundation, not to be set aside. It must be aimed at the technical user, who does not understand mathematical definitions, such that they can understand the relations; the requirements for storage,; etc; and use the queries.

Who would you like to be using this:
- only people who (a) understand mathematical definitions, *and* (b) topology, expressed in that form. In that case, the doco as is, is fine, noting the improvements. The target market is tiny.
- (c) technical people who do not have (a), and understand (b) but not in terms of mathematical definitions. Probably quite good at their CIM or BIM, but weak on the maths. In that case another layer of documentation is needed, so that they can use your tables and queries directly.

Relational or Not
-----------------

1. I didn't find that your relations *when described as relations* (eg. TDAS p95 top) were non-relational. AFAIC, it is not possible to determine relational-ness from the terse mathematical definitions of relations, and without all other relations being given.

(By virtue of the evidence (what they actually produce), there are many posters here who have mathematical definitions for Record Filing Systems only, in text format only, they do not recognise Relational when it is fed to them through a feeding tube, in the form of standard-compliant models, and they can't Normalise their "relations". But they anoint themselves with the pig fat, that allows only themselves to determine what is "relational", and nitpick about anything that they themselves did not produce. Their idea of "theory" is limited to mathematical definitions. The "not invented here" syndrome is alive and well in the so-called academic world that pertains to "relational".)

2. On the other hand, your relations *as depicted in SQL or UML* (eg. TDAS p95 bottom, p180) are definitely not Relational. And they have to be.

a. If all (or the majority of) your tables have a surrogate (ID column; Identity; autoincrement), by definition, it is not a Relational database. It implements the Access Path Dependence prohibition in the RM. And it results in many more tables involved in joins, and many more indices, than a Relational Database.

b. Relational database Integrity is very important, but it is not restricted to, it does not consist of, merely Referential Integrity. The RM defines Relational Keys, and you get much, much, more Integrity from than, over and above DRI. That is also, precisely where the (i) speed and (ii) power of Relational databases lie. Note that (ii) is where the ease (or not) of coding queries lies, both planned and unplanned.

c. Tables such as Personen (as implied in TDAS p180 Addressen, but not given, or as given in TDAS p 198) are non-relation for a different reason. In addition to [a], they allow duplicate rows, which are prohibited in the RM.

d. My understanding of the proposition might not be correct, but on the face of it, we do Mitglied and Mitglied[Anything] in a much better, much more simple way, as per Codd's RM.

e. If you think that "SQL is broken", as propagandised by the self-described "hexperts in the Relational Model", let me assure you that that is false. If you think that "you can't implement a Relational Database in SQL", and thus you have not bothered to, let me assure you that I have close to one hundred, and they are acknowledged as such by technical auditors.

All of this is easier to explain with a diagram, so (page 2):
http://www.softwaregems.com.au/Documents/Student%20Resolutions/Norbet%20Paul/TDAS%20Note.pdf

Please forgive my translation to German.

f. Please feel free to give me your SQL tables (which are currently non-relational and have *only a fraction* of Integrity), and I will give you a Relational Database in SQL, with *full Relational* Integrity. And Speed. Probably overnight.

A full data model is required, hopefully in a Standard notation. I might need ch 4 from TDAS translated, but I might not, I am willing to try it without the translation.

Re TDAS 4.5.1, I am not sure my translation of it is correct, therefore I will not comment on specifics, but let me assure you that any Rdb that I provide has:
- "fifth Normal form" which really means no duplication whatsoever; Codd's famous 3NF applied across all tables; and no Update Anomalies.
- "sixth normal form" for the elements that require it (not per the definition, because the author has no clue about what its two quite different purposes are, despite me providing a fully documented data model and code). Just the Columnar Normalisation, to remove optional columns to a separate table.
- "domain key normal form" as Codd intended it (not the hysterical collection of fragmented mathematical definitions, which the author himself says is impossible to achieve: he is fixated in non-relational Record Filing Systems). Ie. all data and all Integrity is controlled within the database, declaratively, by constraints of various types. Far, far beyond the fragmented definitions.

Relational (IDEF1X) vs OO (UML)
-------------------------------

A lot can be said here, but I will keep it short. Note each point, although terse, is very important.
- UML is marvellous, absolutely smashing, the bees knees. For modelling program components. For a OO mindset.
- UML cannot be reasonably used for modelling data. Its simply doesn't have the mindset, and therefore doesn't have the symbols.
- Data vis-a-vis Program, are very different, and the Analysis; Design; and Modelling methods, are quite different. Similar to the difference between the wiring for Control vs the tubes for transporting material, in a factory. Or a body. They are never mixed up. The rules for usage are different.
- OO types tend to treat data with disdain, as the servant or 'populater' of an Object, and view the database as a set of simple tables, which are in fact Files (note the Access Path Dependence of each), that merely store (make persistent) their Objects.
- They are quite happy to "re-factor" their Record Filing system every time they change their Objects. They have no clue about Data Integrity or Relational Integrity; about Relational Keys and their speed and power.
- All the boxes are the same (square corners), the lines are the same (solid), the labels on the lines are meaningless wrt Data (may be meaningful wrt Objects, Actions). It is not a database by any means, and it is not Relational by miles.

- Data Modelling means modelling the data, as data, and nothing but the data, so help me Codd. No reference the the usage or the app.
- The result is permanent tables. Extensible, easily, but permanent. Data does not change, apps that use the data change every few years, and OO apps, due to their "re-factoring", change with every version.
- Therefore Data Modelling uses quite different rules, and a different Methodology. And requires a notation that has the symbols, the richness, that is required. Eg. one diagram, instead of a diagram plus a CREATE TABLE statement.

- We have had IDEF1X as the Standard, a Methodology, for modelling Relational databases since 1984, and available as a tool ERwin since 1987.
- Unfortunately the dear people who write books allegedly about the RM and teach courses allegedly about the subject
__a. have not heard of it [until I introduced them to it on TTM a few years ago],
__b. do not use it
__c. do not understand pictures (right brain, 96% brain power), they prefer text (left brain, 4% brain power), and especially like isolated fragments of text
__d. the very pictures that the practitioners have been using for thirty years, to model and implement Relational databases
__e. but they remain convinced, that they "know" the subject
__f. The 'not-invented-here' syndrome is badge of honour.

- Relational Modelling, and IDEF1X as its method, elevate and implement all the requirements of the Relational Model. That means (to name a few):
__a. Relational Keys, or Identifiers. Used correctly, across the db
__b. The square vs round corners are relevant: Independent vs Dependent tables
__c. The solid vs dashed lines are relevant: Identifying vs Non-identifying Relations
__d. Of course we differentiate base Relations vs Derived Relations, therefore much of the activity and argument that the academic types engage in, for four decades, is irrelevant to us.
__e. Thus the Relations (as opposed to tables, which are of course base "relations", which we call the TABLES) are meaningful, not mere lines that "connect" or "link" Files. The tables are the SUBJECTS; the relation lines are the ACTIONS that take place between the subjects, expressed as a Verb Phrase. (By "Projekt [is a] client [of] Person", you may be using this concept, but from here, it looks like you mean it in the OO sense alone, meaning dependency, Client object.)
__f. And the database, all the Business Rules, are implemented declaratively, in constraints of various types. As much as possible for most qualified practitioners, 100% for me.

You won't get any of that in a Record Filing System, or from a person who uses UML for Data Modelling.

Therefore:
- To the extent that you provide data models (as opposed to relation definitions in text strings), it demonstrates the extent of your Relational mindset
- To the extent that you provide data models in UML, it demonstrates the extent of your OO mindset, your non-relational mindset, a superficial approach to Data Modelling
- To the extent that you provide data models in UML, the models are absent a number of Relational requirements, concepts and constructs, therefore it is impossible that it is a Relational Database
- To the extent that the tables have a RecordId as the PK, it is guaranteed that the thing is a Record Filing System, deployed in a database container, for convenience (use of SQL, backups, etc).

The fact that the data models provided are examples, some of which are not directly related to the present work, is irrelevant, you are demonstrating.

Therefore, I would strongly recommend that:
a. remove all the examples (pure examples, not related to the architectural spaces) from the body, place them in a single Data Model, as an appendix, referred to from all parts of the body
b. remove all the definitions of "relations" related to architectural spaces from the body, place them in a single Data Model, as a second appendix, referred to from all parts of the body
c. Ensure that [a] and [b] are fully Relational. This assures the reader that the constructs; the tables; the queries, enjoy all the integrity, power and speed of the RM
d. which means, you can't use UML, you need to use IDEF1X
e. Your tables and columns MUST be correctly and meaningfully named. Single character names, and acronyms are not allowed. If your RDb is to be Open Architecture, you have make it so.
____SELECT FROM t WHERE x = y AND EXISTS (...) ...
is fine for the paper. But totally unacceptable for any person who will use the RDb (ie. your productised package as detailed in the Next Steps section). Not acceptable for your Lisp project, you have to get a start there.

Note that the sequence [c], [d], is relevant, it isn't the other way around.

Note that I didn't mention Normalisation in the above section. Of course, the data has to be Normalised, and Relationally Normalised, if you want people to see that you are following the RM. Codd defined two specific Normal Forms in the RM. The creatures who write books allegedly about the RM, and the pseudo-mathematicians who write maffemafical diffinishuns of abnormal "normal forms", have failed, miserably, in forty four years, to define those two Normal Forms. So the community is left with the pre-relational, Record Filing Systems that the authors know, the abnormal methods of "normalising", and the fragmented "normal forms" (that suffice for RFS).

What I am saying is, it is not your fault, that your models and tables are not Relational, you sit in the 95% of the RDb community, who are victims of such cursed authors. And at the high end of that 95%, because your work is very good.

Versioning
----------

In TDAS 4.4.8, you state:
> Since the spatial planning also works with versions of rooms,
it makes sense to extend this concept of version control on topological spaces.

Doesn't make sense to me.

In the RM, Codd gives a method for versioning, which is admittedly very terse. It works beautifully. (I use it for "temporal" databases as well, but let's not get side-tracked.) The assumption is that all the data elements for your architectural spaces are deployed in SQL tables (separate to the OO Objects). Therefore, the versioning concerned is about the actual architectural spaces changing over time. Eg. I would like to erect a diagram of the current kitchen door plus doorframe, alongside the 2007 version of that door, or determine the differences (ala diff), at the data content level, between them.

I think you appreciate the value of solving a problem via a Relational Database, first and foremost. You quoted Codd to that effect in one of your papers, I forget which one.

If you implement versioning in the SQL tables, you can perform all related functions in SQL queries, rather than
- export;
- RCS/CVS/diff/patch;
- then import.

(You still need to save all your code, SQL and other, in RCS/CVS/Subversion. Just as everyone one has to. But that is purely for backups, restores, etc. Not for versioning of architectural objects.)

-------
-- C --
-------

> C: You are, of course, welcome to read my papers:
> http://scholar.google.de/citations?user=ZZMZCY4AAAAJ

Yes, I know, I have already read some of them, at academia.edu.

Note, I did not read TDAS completely, which requires a full translation. I skimmed it, and my German is horrible; picked up what I thought was relevant; translated that; and commented.

But [what I think are] the two main papers
- Basic Topological Notions and their Relation to BIM
- Topological Databases for Architectural Spaces
are not downloadable or cost money or unavailable in English.

Could you please email anything that you are happy to provide, that is in English, to me.

Feel free to recommend specific papers that are limited to the scope of you original enquiry. There are many and it is sometimes hard to figure out which ones are relevant to the OP. You certainly have been productive, and the produce is high quality.

More on [C], later.

Cheers
Derek

Message has been deleted

Derek Asirvadem

unread,
Jan 20, 2015, 8:06:46 AM1/20/15
to
Norbert

> On Tuesday, 20 January 2015 20:36:48 UTC+11, Derek Asirvadem wrote:

A couple of clarifications.

> Relational or Not
> -----------------

> Relational (IDEF1X) vs OO (UML)
> -------------------------------

> Therefore, I would strongly recommend that:

> c. Ensure that [a] and [b] are fully Relational. This assures the reader that the constructs; the tables; the queries, enjoy all the integrity, power and speed of the RM

I forgot to mention, it goes without saying, the purpose of my detailed comments under this heading, if you wish to make the claim that the Topology you are publishing is Relational, it had (a) better be Relational, as (b) demonstrated in every instance that you expose it or parts of it.

Claim
-------

i. I might repeat, it is not possible, not feasible, to determine the Relational-ness (qualification against the Relational Model), for your "relations" (terse text strings), unless they are provided in toto and with complete definitions (eg. Foreign Keys; constraints). And especially not if the names of the "relations" and attributes are single character or terse. If you do not understand this, you will get into silly arguments like "how can a relation be non-relational".

Claims re whether such "relations" are Relational cannot be confirmed, and they are unsafe to make. Of course, if the claim is made *only* to other mathematicians, and *not* to implementers, since they are under the same limits, they can argue equally well, that the claim is true, or that it is false. Without resolution one way or the other.

"Relations" are an abstraction. I cannot evaluate the thought that you are thinking, and determine if it is a sin or not.

Therefore, the place where the 'rubber hits the road', where the "relations" become defined in real terms, is the place where one can determine Relational-ness. Pieces of paper, with ink marks on them, that define *all* the "relations" in set, and all the relevant constraints. Now argument is avoided or at least minimised, and resolution is possible.

ii. Until 1985, the marks defining the "relations" were ASCII characters, and the statements were SQL DDL. That engages 4% of ones brain. Determination of whether such "relations" are Relational or not, and indeed whether the database as a whole is Relational or not, can now be made. Engaging the same 4% of the brain. Time-consuming and prone to error.

The claim can now be confirmed, with some difficulty.

The abstract "relations" are no longer abstract, they are real. I can see that you are looking at my daughter with a naughty intention.

iii. Since 1985, we use Data Modelling diagrams, rendered in the IDEF1X Standard. That engages 96% plus the 4% of ones brain. No DDL is required (if implementation was planned, sure, that is the next step). Determination of whether such "relations" are Relational or not, and indeed whether the database as a whole is Relational or not, can now be made. Engaging the whole brain. Far less time-consuming and far less prone to error.

The claim can now be confirmed, with ease.

I can now see all your mannerisms, your glances, when you are around my daughter, and that they are different when she is not there.

Frivolous Claim
--------------------

For your understanding. Note that it is by the exact same protocol, that:
a. the pseudo-scientists, the boffins who write books allegedly about the RM, are able to make claims that some "relation" (either theirs or some other's) is Relational or not
b. that the RM itself is "relationally incomplete"
c. and such claims are idiotic, frivolous, because they can be neither confirmed nor denied (as detailed above)
d. therefore scientists do not accept such claims
e. only frauds and thieves make unfounded and unprove-able claims
f. they do so petulantly, and continuously (scientists know that a truth needs to be proved, and proved just once)
g. why they steer clear of Data Models and diagrams
h. why they avoid SQL DDL (the denigration of SQL being a separate matter, but nonetheless servicing this need as well: "can't define the 'welation' in SQL because it is bwoken")
i. and why they restrict themselves to describing (not defining) their "relations" in ASCII characters, text strings, sans DDL, sans DM, at the pre-1985 level of capability
j. why that is their permanent cave.
k. why scientists who can erect DMs call [i], their "art", Degenerate Art

Their language, their mantra, is (i) mathematical definition, (ii) only mathematical definitions, and (iii) nothing but mathematical definitions. Now, I have every respect for (i). But given the range of developments in the Relational arena since 1985, given the English language for technical use, given the ease of DDL and Data Models, (ii) and (iii) are for sub-humans, backward cavemen. Unfortunately they write books ! Even worse, some of those books are used as textbooks !!!

Which is why, in order to "hear" the music, to "see" the kaleidoscope, to sing and dance at the private orgy, first you must speak the language, and second, you need the Kool-Aid.

Back to you. The same protocol as I have given you, applies to them. The difference is, you are honest, and willing to engage in a discussion with a view to resolution. They desperately need their continued petulance, the non-resolution.

(I might have to publish my Relational Scorecard. Which thus far, has been for customers only. Just to kill the argument that I expect will develop from the sewer rats.)

Relational Algebra
------------------------

I think you can see from my previous post, that in the context of whole scheme, it is probably not as relevant as you thought at first. It remains absolutely relevant, in that it forms the foundation, an essential part (but not the whole) of the theory that the package (either standalone monolith or components plus doco) is built upon.

Depending on your target market, with consideration to [1] to [111] above, if you limit yourself to mathematicians who understand Topology, then you need to:
a. document all the "relations" together, along with all their constraints, and
b. avoid the claim. Perhaps declare that you created them with the RM in mind.

But if you want to reach everyone who can use your packaged tables plus SQL queries plus <language> queries (or at least a large portion of that market), *or* if you wish to make the Relational claim, then in addition to [a], you need to:
c. produce a Data Model, in IDEF1X.
d. make the Relational claim. And [c] will be used to prove the claim, or to deny it.

> Versioning
> ----------

> In TDAS 4.4.8, you state:
> > Since the spatial planning also works with versions of rooms,
> > it makes sense to extend this concept of version control on topological spaces.
>
> Doesn't make sense to me.

Mistake, incomplete quotation. Sorry, that does make sense.

> In TDAS 4.4.8, you state:
> > In UNIX, there are the commands diff and patch to compare and update text files ...
> > RCS, CVS, Subversion ...
> > Since the spatial planning also works with versions of rooms,
> > it makes sense to extend this concept of version control on topological spaces ...
> > ... [and to use diff, patch, RCS, CVS, Subversion]

That doesn't make sense to me.

(And then, my comments follow, and make sense.)

Cheers
Derek

Norbert_Paul

unread,
Jan 22, 2015, 3:20:12 PM1/22/15
to
Tegiri Nenashi wrote:
> OK, now it is clear: the objects you are focusing on are pairs of
> multivariate relation X in the first component, and binary relation R in the
> second.
>
> Now you can define operations in Norbert algebra "component-wise". For
> example you can define "multiplication" as natural join in the first
> component and composition in the second. This would be not entirely without
> merit, because multiplication would be commutative and associative operation.
> However, natural join in [multivariate] relational algebra is idempotent,
> while composition in algebra of binary relations is not. Therefore, I'm not
> certain how well defined your algebra is. Can you please describe each and
> every of the following operations: natural join (as it might be something
> other than my guess), projection, and union?
OK, but first I will go into some mathematical principles:

Given two topological spaces (X,Tx) and (Y,Ty).
A function f:X->Y from a set X to a set Y is called /continuous/,
from (X,Tx) to (Y,Ty) if and only if ("iff") the original image

f^(-1)(B) = { a in X | f(a) in B }

of every element B in Ty is an element in Tx. Note that the bigger
("finer") Tx (or the smaller ("coarser") Ty) the more likely is it
that f is continuous.

Given two relational representations (X,R) and (Y,S) of topologiacal
spaces (X,T(R)) and (Y,T(S)) then a function

f : X -> Y

is continuous iff for every pair (a,b) in R the image pair
(f(a),f(b)) is an element of S^*, the transitive and reflexive closure of S,
hence, iff

(a,b) in R => (f(a),f(b)) in S^*

holds.
For example, f may be a foreign key such that each tuple tx in X references a
tuple f(tx) in Y.
Note that here the smaller R ("finer") or the bigger S ("coarser")
the more likely is it that f is continuous.

Now when you have a query operation op(Y) on a database table Y may get
a result table

X = op(Y) with a function f : X -> Y

that maps each the query result tuple tx in X to its input tuple f(tx) in Y
or you get a result table

Z = op(Y) with a function g : Y -> Z

that maps each input tuple to a result tuple.
In the first case I say, the query operator is of /initial/ type, because the
result is at the starting side of the function arrow. The second case is, hence,
of /final/ type.

When you have queries that involve more than one tables, say Y1, Y2, Y3,
you may get

X = op(Y1,Y2,Y3) with functions f1:X -> Y1, f2:X -> Y2, f3:X->Y3

or you get

Z = op(Y1,Y2,Y3) with functions g1:Y1 -> Z, g2:Y2 -> Z, g3:Y3 -> Z

Now if each such query input table has an attached binary relation
R1, R2, R3, the relation of the topology for the query result table should
contain as much topological information from the input as possible but still all
functions, f1,f2,f3 or g1,g2,g3 -- depending if your query operator is
initial or final (or both) -- should stay continuous.

Trivially, the f1,f2,f3 are always continuous with the empty relation {} subset
X x X and the g1,g2,g3 are always continuous with the universal relation X x X.
So there always exists a relation that makes all functions involved continuous.

So we are either looking for a maximal relation Rx for X that makes all fs
continuous or we are looking for a minimal relation Rz that makes all gs
continuous.

This can be done component-wise, for each function:
Let R be the relation associated with Y. Hence op queries the
spaces (Y1,R1), (Y2,R2), (Y3,R3) to get (X,Rx) or else (X,Rz).

The final case gives:
R1 = { (f1(a), f1(b)) | a R b }
R2 = { (f2(a), f2(b)) | a R b }
R3 = { (f3(a), f3(b)) | a R b }
Then a result relation is Rz = R1 u R u R3 (where "u" is set union).
So this case is easy and can be efficiently computed.

The initial case gives:
R1 = { (a,b) in X x X | g1(a) R+ g1(b)) }
R2 = { (a,b) in X x X | g2(a) R+ g2(b)) }
R3 = { (a,b) in X x X | g3(a) R+ g3(b)) }
Then a result relation is Rz = R1 /\ R /\ R3 (where /\ is set intersection).
Here R+ is the transitive closure on R. So this case is more expensive and
one should take more care in efficiency. Whereas transitive closure always
gives a correct relation one may reduce the result relation to get
Rx' = OPEN(Rz) by removing tupels thet do not affect the topology.
For example
Ri = { (a,b) in X x X | a != b, gi(a) R+ gi(b)) }
is smaller but topologically equivalent to
Ri = { (a,b) in X x X | gi(a) R+ gi(b)) }.

Every standard query operator from Relational Algebra falls into one of these
two categories. The renaming query which just renames attributes and leaves the
data unaffected even falls in both categories.

natural join:

assume two relations R(a,b,c) and S(c,d,e).
I will not specify the key to simplify the presentation and, hence, take
(a,b,c) and (c,d,e) , which are superkeys, as keys.

Then RS(a,b,c,d,e) is the schema of the natural join
Each tuple (a,b,c,d,e) in RS has a projection

pR(a,b,c,d,e) := (a,b,c) in R

and

pS(a,b,c,d,e) := (c,d,e) in S

So we have two functions

pR : RS -> R and pS : RS -> S

that map each result tuple back to its "origin" within the input relations.
As the result RS is at the domain side of each of these functions we have an
/initial/ query. So when the relations R ans S have binary relations
T(ia,ib,ic, da,db,dc) and Q(ic,id,ie, dc,dd,de) associated such that (R,T) and
(S.Q) represent topological spaces, then
a straight-forward result relation TQ for the topology is

TQ(ia,ib,ic,id,ie,da,db,dc,dd,de)
= { (ia,ib,ic,id,ie,da,db,dc,dd,de) |
(ia,ib,ic) T^+ (da,db,dc) andalso (ic,id,ie) Q^+ (dc,dd,de) }.

Then (RS,TQ) represents the fiber product space of the topological spaces
represented by (R,T) and (S,Q). Of course the relatton TYQ may be "trimmed"
by projecting only onto the key attributes of TQ and by removing "spurious"
tuples (ix,dz) in TQ if the tuples (ix,y) and (y,dz) exist in TQ.
(Codd called this operator "OPEN").

projection:

Projection is simple, as it is a final operator.

Take a relation X(a,b,c) and project on a,c, say p[a,c](x,y,z) = (x,z).
Again, I take all attributes as (super)-keys.
Then Y = p[a,c](X) is the query result table.
So p[x,y] is the function that takes a query input tuple (x,y,z) to the
query result (x,z).
When R represents a topolgoy for X, it defines which pairs
(t1,t2) in X x X are "topologically" close.
The query result query merely the image of R under p[a,c]:

If R(ia,ib,ic,da,db,dc) encodes the binary relation (ia,ib,uc) R (da,db,dc)
then the prokction pi[ia,ic,da,dc](R) gives the query result topolgoy
for the space (X,R).

union:

union is also a final operator.

Let (X,R) and (Y,S) be representations of topological spaces and
XuY be the union of X and Y. Then we have the two inclusion functions

ix : X -> XuY , iy : Y -> XuY ,

which are simply defined as ix(s) = s abd iy(t) = t.
The result topology is represented by RuS. Hence
(X,R) u (Y,S) = (XuR, RuS) .

If R and S are not union compatible they can easily made so.

Note that intersection is a special case of the natural join:

(X,R) /\ (Y,S) = (X/\R, R^+ /\ S^+) .

Here, in general, transitive closure is ncessary.

I hope this gives an idea if the operations and their topological

ackground.


> This gives me a flavor how you define selection. However, selection can be
> considered as a special case of natural join. Here I see that my earlier
> guess about natural join defined as composition in the second component is
> simply wrong.

Selection is an initial operator:

Les s[p] be the selection operator with predicate p.

s[p](Y) on a relation Y gives a subset X = s[p](Y)

If R is a relation for Y and (Y,R) represents a topological space, then
the result topology T(R)|_{s[p](Y) (subspace topology) is represented by

sR = { (a,b) in R+ /\ X^2 } .

This is the restriction of the transitive closure on R to the selected tuples.
The selection operator s[p] (with predicate p) applied to a space gives

s[p](Y,R) = (s[p](Y), sR) .

Here again, in general, transitive closure must be possibe to always give
corerct query results. Otherwise the spaces involved must have a constant
upper dimension limit (say, 42, which means that queries on 43-dimensional
spaces may give wrong results).

> "group by" generalizes projection. Does Quotient Space generalize Final
> Topology?

I'd say it the other way: Final Topology generalizes all final operators:
projection, quotient, union.

> Since you offered definition of topological relation as ordered pair of
> multivarite relation together with binary relation, you can specify
> operations in your algebra without referring to topology. Again, I'm
> interested to see formal definitions of natural join, projection, and union.

I hope you have seen them above ;).

> Again, can you please take the (Rooms,RoomNeighbourhood) pair from your
> example, then introduce one more pair and, next, work out the natural join of
> these two pairs (is it something meaningful)?

The natural join of two topological spaces to me is extremely interseting.

Take the example space S (the "sketch"):

----------|--------- < this is an "ascii-art" sketch
room1 door outside

As tables
S = id
----
room1
door
outside

RS = iid bdid
--------------
room1 door
outside door

such that (S,RS) represents a topological space. Attributes prefixed by "bd"
denote elements that are in the topological closure of those having the
attribute prefixed with "i". Hence each pair (iid, bdid) in RS specifies that
bdid in cl({iid})
holds.

You may specify yet another space D, I call "details", such as in ascii-art:

interior--> |1 <--innerDoorMaterial--> 2| <--interior

interior--> |x <--entranceDoorMaterial--> || <--exterior

where the vertical bar denotes a database object that represents the
material that may be used as surface towards an interior room.
The innerDoorMaterial is bounded by two different surfaces which are also in
the boundary of interior.
The arrows --> denote the relation
a-->b :<=> b in cl({a}) .
Two vertical bars || denote a surface that must be used on teh exterior side
of entrance doors which is exposed to weathering.

These are the tables:

D = id comment
----------------------------------------------------
interior an interior room
exterior the exterior
surface1| Surface towards interior
surface2| Opposite surface towards interior
surface|| Surface towards exterior
surfacex| Exterior door's surface towards interior
idoormat material of interior doors
xdoormat material for entry doors

RD = ii bdid
------------------
interior surface1|
idoormat surface2|
interior surfacex|
exterior surface||
xdoormat surface||

Now a third table may define aggregations I (the "index") among details D:

I = id comment
--------------------------
int an interior space
idoor an interior door
xdoor an entry door
ext the exterior

with relations

RI = iid bdid
------------
int idoor
int xdoor
ext xdoor

Now the example space S may reference these detail aggregates
(via a foreign key from S to I):

object detail
-----------------
room1 -> int
door -> xdoor
outside -> ext .

That mapping is continuous.
The following mapping however

object detail
-----------------
room1 -> int
door -> idoor
outside -> ext

is not continuous, because the relation RD that represents the topology
of D contains the pair (outside, door). The foreign key then maps this pair
to (ext, idoor) which is not in the transitive and reflexive closure
of RA. Hence "continuity of foreign keys" could be a helpful instrument for
CAD systems. Id does not allow to put an interor at the common boundary
between interior and exterior.

Now when the details as a space (D,RD) map to their aggregates the following
way (again, say, via a foreign key to I):

id detail
---------------------
interior int
exterior ext
surface| idoor
surface|| xdoor
idoormat xdoor
xdoormat xdoor

then the natural join (S,RS) >< (D,RD) = (SD, RSD) gives:

------------->|<====================>||<-------------------- (ascii-"art")
(room1) (door) (door) (door) (outside)
(int) (xdoor) (xdoor) (xdoor) (ext)
(interior) (xdsurf|) (xdoormat) (surface||) (exterior)

Whis is denoted with vertically stacked tuples. In tables this looks loke:

SD S.id I.id D.id
-----------------------------
room1 int interior
door xdoor xdsurf|
door xdoor xdoormat
door xdoor surface||
outside ext exterior

The topology is then generated by

RSD S.iid I.iid D.iid S.bdid I.bdid D.bdid
-----------------------------------------------------
room1 int interior door xdoor xdsurf|
door xdoor xdoormat door xdoor xdsurf|
door xdoor xdoormat door xdoor surface||
outside ext exterior door xdoor surface||

For example the tuple (door, xdoor, xdsurf|) is in the topological closure
of the set {(room1, int, interior}).

The applicaton of the details (D,RD) to the sketch (S,RS) gives a detailled
version of (S,RD). So natural join could be an interesting operator to organize
levels of detail (LoD) in CAD system where the same detail is repeatedly used.
This may be extended to a relational generalization of CAD-macro (or BLOCK)
objects.

For the topologistst: the above natural join (SD,RSD) = (S,RS)><(D,RD) its the
following pullback (S,RS) x_I (D,RD)

(SD,RSD) --------p_D-------> (D, RD)
| |
| |
| |
p_S FK_D
| |
| |
V V
(S, RS) -------FK_S-------> (I, RI)

with projections p_D and p_S and continuous foreign keys FK_D and FK_S

This "details library" concept still has some open questions because in some
situations details can get lumped together too much. This problem has yet to be
solved, but I consider it only a not too difficult database design problem for
an extremely interesting practical application.

Hope that clarifies.

Norbert


Tegiri Nenashi

unread,
Jan 22, 2015, 5:38:04 PM1/22/15
to
On Thursday, January 22, 2015 at 12:20:12 PM UTC-8, Norbert_Paul wrote:
> Given two topological spaces (X,Tx) and (Y,Ty).
> A function f:X->Y from a set X to a set Y is called /continuous/,
> from (X,Tx) to (Y,Ty) if and only if ("iff") the original image
>
> f^(-1)(B) = { a in X | f(a) in B }
>
> of every element B in Ty is an element in Tx. Note that the bigger
> ("finer") Tx (or the smaller ("coarser") Ty) the more likely is it
> that f is continuous.

Which brings up
http://en.wikipedia.org/wiki/Comparison_of_topologies#Lattice_of_topologies

Similar in spirit construction -- Lattice of Partitions -- is widely used in Database Dependency theory
http://liris.cnrs.fr/Documents/Liris-6427.pdf (fig.2)
Partitions are reflexive, symmetric, transitive binary relations.

> Now when you have a query operation op(Y) on a database table Y may get
> a result table
>
> X = op(Y) with a function f : X -> Y
>
> that maps each the query result tuple tx in X to its input tuple f(tx) in Y
> or you get a result table
>
> Z = op(Y) with a function g : Y -> Z
>
> that maps each input tuple to a result tuple.
> In the first case I say, the query operator is of /initial/ type, because the
> result is at the starting side of the function arrow. The second case is, hence,
> of /final/ type.
>
> When you have queries that involve more than one tables, say Y1, Y2, Y3,
> you may get
>
> X = op(Y1,Y2,Y3) with functions f1:X -> Y1, f2:X -> Y2, f3:X->Y3
>
> or you get
>
> Z = op(Y1,Y2,Y3) with functions g1:Y1 -> Z, g2:Y2 -> Z, g3:Y3 -> Z

I'm not sure why you are focusing here on operations with three operands. Relational algebra operators are dyadic, and operations with higher arity are composed from basic operations (natural join, projection, union, etc).

Now, consider

op(Y1,Y2,Y3) = (Y1 join Y2) union Y3

Is the first operations (join) allowed to be of initial type, and the second one
(union) be of final type?

> Note that intersection is a special case of the natural join:
>
> (X,R) /\ (Y,S) = (X/\R, R^+ /\ S^+) .
>

Yes, natural join is ubiquitous because there are 3 special cases:
1. Intersection is a join when both operands have the same attributes.
2. Selection is a join where the first operand set of attributes is a subset of the second operand attributes. (One have to allow relations/predicate of infinite content in order to express something like "select * from employees where salary > 1000" via natural join)
3. Cartesian product is a join for a relations with disjoint set of attributes.

In other words, if you convince the reader that topological extension of natural join operation is flawless, then most likely the whole algebra is well defined.
If I join S with D, the result is not particularly illuminating, partially because their sets of ids are disjoint. Which prompts

> Now a third table may define aggregations I (the "index") among details D:
>
> I = id comment
> --------------------------
> int an interior space
> idoor an interior door
> xdoor an entry door
> ext the exterior
>
> with relations
>
> RI = iid bdid
> ------------
> int idoor
> int xdoor
> ext xdoor
>
> Now the example space S may reference these detail aggregates
> (via a foreign key from S to I):
>
> object detail
> -----------------
> room1 -> int
> door -> xdoor
> outside -> ext .

OK, now you are introducing something new -- aggregation. Are you implying that topological extension of a join is not a dyadic operation anymore, and requires auxiliary pair (I,RI)?

I'm uncomfortable at this stage because, one can explain classic natural join between Employees and Departments in a single paragraph and suggest that perhaps a more succinct example is wanted. For example, can you amend either (S,RS) or (D,RD) with (I,RI) and construct a pair of topologically enhanced relations which natural join can be explained without the help of auxiliary pair (I,RI)?


Norbert_Paul

unread,
Jan 23, 2015, 3:32:12 PM1/23/15
to
Tegiri Nenashi wrote:
> On Thursday, January 22, 2015 at 12:20:12 PM UTC-8, Norbert_Paul wrote:
>> .... Note that the bigger
>> ("finer") Tx (or the smaller ("coarser") Ty) the more likely is it that f
>> is continuous.
>
> Which brings up
> http://en.wikipedia.org/wiki/Comparison_of_topologies#Lattice_of_topologies
>
> Similar in spirit construction -- Lattice of Partitions -- is widely used in
> Database Dependency theory http://liris.cnrs.fr/Documents/Liris-6427.pdf
> (fig.2) Partitions are reflexive, symmetric, transitive binary relations.

Interesting, but I didn't want to populate my post too much with text-book
material on topology.

>> When you have queries that involve more than one tables, say Y1, Y2, Y3,
>> you may get
>>
>> X = op(Y1,Y2,Y3) with functions f1:X -> Y1, f2:X -> Y2, f3:X->Y3
>>
>> or you get
>>
>> Z = op(Y1,Y2,Y3) with functions g1:Y1 -> Z, g2:Y2 -> Z, g3:Y3 -> Z
>
> I'm not sure why you are focusing here on operations with three operands.
> Relational algebra operators are dyadic, and operations with higher arity are
> composed from basic operations (natural join, projection, union, etc).

I wanted to express by "say" that three is an arbitrary number.
(I don't know if three is always arbitrary but here ist is.)
Note that topologists even have operations that operate on infinitely
many spaces. For example the Cartesian product space may be applied to
infinitely many topologies.

> Now, consider
>
> op(Y1,Y2,Y3) = (Y1 join Y2) union Y3
>
> Is the first operations (join) allowed to be of initial type, and the second
> one (union) be of final type?

Of course the above expression gives a meaningful topology. I admit that my
wording was not correct. It is the /basic/ operations of Relational Algebra
(except outer join) which are always either initial or final. Compound
expressions, however, may have neither property. Maybe it is possible to
immediately compute the topology for expressions like

(Y1 join Y2) union Y3

by looking at the three relations between the the query result amd the input
tables. But these relations then aren't functions any more. So you cannot use
the standard machinery for topological constructions and have to invent
something else. This could be a "further research" question.

>> Note that intersection is a special case of the natural join:
>>
>> (X,R) /\ (Y,S) = (X/\R, R^+ /\ S^+) .
>>
>
> Yes, natural join is ubiquitous because there are 3 special cases: 1.
> Intersection is a join when both operands have the same attributes. 2.
> Selection is a join where the first operand set of attributes is a subset of
> the second operand attributes. (One have to allow relations/predicate of
> infinite content in order to express something like "select * from employees
> where salary> 1000" via natural join) 3. Cartesian product is a join for a
> relations with disjoint set of attributes.

I consider your "selection" argument overly formal. You must then use the
trivial topology for your infinite infinite relation to get selection work this
way. Usualy I consider the discrete topology the default for "bare" tables
without an explicitely given topology. But I found authors who consider the
trivial topology a better default. But don't ask me who it was. That was more
than ten years ago when I stumbled over that paper.

> In other words, if you convince the reader that topological extension of
> natural join operation is flawless, then most likely the whole algebra is
> well defined.

I guess it is. I am convinced it is correct, and it has been checked by by two
mathematicians.

>> Take the example space S (the "sketch"):
...
>> Now the example space S may reference these detail aggregates (via a
>> foreign key from S to I):

Note "foreign key". I meant to extend both relations S and D by an additional
attribute "detail". I know, I should have been clearer here. Mea culpa.
The relational schemata of S an D are hence

S(id, detail)
D(id, detail)

OOPS! I just realize that I have to rename the id-attributtes to get the
natural join in the way I had it in mind:

S(sid, detail)
D(did, detail)

There is no need for a foreign key into another ("auxiliary") table, but I
copied that setup from topological pullback. If this table is omitted, you
get a natural join of relational representation of topological spaces. The
"auxiliary" table already goes beyon join and was intended to illustrate
an application I have in mind.

>> object detail
>> -----------------
>> room1 -> int
>> door -> xdoor
>> outside -> ext .
>
> OK, now you are introducing something new -- aggregation. Are you implying
> that topological extension of a join is not a dyadic operation anymore, and
> requires auxiliary pair (I,RI)?

No, it is a simple natural join on S and D (with the schemata corrected ;) ).
The point of table I is to demonstrate yet another feature: "/continuous/
foreign keys" as a topological consistency rule for spatial data. The
"aggregation" is an application I have in mind. Natural join also works without
all that extra decoration.

> I'm uncomfortable at this stage because, one can explain classic natural join
> between Employees and Departments in a single paragraph and suggest that
> perhaps a more succinct example is wanted. For example, can you amend either
> (S,RS) or (D,RD) with (I,RI) and construct a pair of topologically enhanced
> relations which natural join can be explained without the help of auxiliary
> pair (I,RI)?

Natural join on topological data represented as (S,RS) and (D,RD) works just
that simple way on S and D to get, say, SD. But then in order to get a
meaningful topology for SD you can compute a binary relation RSD from RS, RD,
and SD that represents the initial topology of the two projection mappings

pi_S : SD -> S and pi_D : SD -> D .

Everything else was decoration to illustrate an application of
topological natural join, which I consider interesting. In particular, the
additional relation I (or, the space (I, RI)) is no prerequisite. Sorry for the
errors in defining the schemta, which give intersection, and sorry for all
other sloppiness in my post.

Norbert

Tegiri Nenashi

unread,
Jan 23, 2015, 5:38:03 PM1/23/15
to
On Friday, January 23, 2015 at 12:32:12 PM UTC-8, Norbert_Paul wrote:

> >> Take the example space S (the "sketch"):
> ...
> >> Now the example space S may reference these detail aggregates (via a
> >> foreign key from S to I):
>
> Note "foreign key". I meant to extend both relations S and D by an additional
> attribute "detail". I know, I should have been clearer here. Mea culpa.
> The relational schemata of S an D are hence
>
> S(id, detail)
> D(id, detail)

You are joining S ("Sketches") with D ("Details"). The fact that both S and D have an attribute called "detail" is confusing.

Better names for attribute values would also help. For example, innerDoorMaterial could be glass, entranceDoorMaterial could be veneer.

Please note how intuitive for a lay person is a typical database query:
"Find the names of all females who eat at least one pizza served by Straw Hat"
(https://class.stanford.edu/c4x/Engineering/CS145/asset/opt-rel-algebra.html)

> OOPS! I just realize that I have to rename the id-attributtes to get the
> natural join in the way I had it in mind:
>
> S(sid, detail)
> D(did, detail)

A well known approach here is to represent detail attribute as complex datatype (AKA spacial datatype). The challenge there is defining operators; most important the equality. In that respect your solution seems novel.

> There is no need for a foreign key into another ("auxiliary") table, but I
> copied that setup from topological pullback. If this table is omitted, you
> get a natural join of relational representation of topological spaces. The
> "auxiliary" table already goes beyon join and was intended to illustrate
> an application I have in mind.

May I ask you to revisit the example?

> >> object detail
> >> -----------------
> >> room1 -> int
> >> door -> xdoor
> >> outside -> ext .
> >
> > OK, now you are introducing something new -- aggregation. Are you implying
> > that topological extension of a join is not a dyadic operation anymore, and
> > requires auxiliary pair (I,RI)?
>
> No, it is a simple natural join on S and D (with the schemata corrected ;) ).

Perhaps with better naming and correct schema it would be simple. It is much harder to understand what is joining Sketches with Details.

> The point of table I is to demonstrate yet another feature: "/continuous/
> foreign keys" as a topological consistency rule for spatial data. The
> "aggregation" is an application I have in mind. Natural join also works without
> all that extra decoration.
>
> > I'm uncomfortable at this stage because, one can explain classic natural join
> > between Employees and Departments in a single paragraph and suggest that
> > perhaps a more succinct example is wanted. For example, can you amend either
> > (S,RS) or (D,RD) with (I,RI) and construct a pair of topologically enhanced
> > relations which natural join can be explained without the help of auxiliary
> > pair (I,RI)?
>
> Natural join on topological data represented as (S,RS) and (D,RD) works just
> that simple way on S and D to get, say, SD. But then in order to get a
> meaningful topology for SD you can compute a binary relation RSD from RS, RD,
> and SD that represents the initial topology of the two projection mappings
>
> pi_S : SD -> S and pi_D : SD -> D .
>
> Everything else was decoration to illustrate an application of
> topological natural join, which I consider interesting. In particular, the
> additional relation I (or, the space (I, RI)) is no prerequisite. Sorry for the
> errors in defining the schemta, which give intersection, and sorry for all
> other sloppiness in my post.

Please note that at this moment I'm confused on the meaning of both

S join D

together with

RS join RD

This is because you have introduced common attribute "detail" into both S and D while omitting the example. I can make guesses interpreting the id as identification of parent object in the aggregation, and detail being child. Then, join S with D is finding objects with common set of children. Again, a fixed example would be nice.

> > Yes, natural join is ubiquitous because there are 3 special cases:
> > 1. ...
> > 2.
> > Selection is a join where the first operand set of attributes is a subset of
> > the second operand attributes. (One have to allow relations/predicate of
> > infinite content in order to express something like "select * from employees
> > where salary> 1000" via natural join)
> > 3. ...
>
> I consider your "selection" argument overly formal. You must then use the
> trivial topology for your infinite infinite relation to get selection work this
> way. Usualy I consider the discrete topology the default for "bare" tables
> without an explicitely given topology. But I found authors who consider the
> trivial topology a better default. But don't ask me who it was. That was more
> than ten years ago when I stumbled over that paper.

I moved this part towards the end of the discussion: assuming you helped to resolve my confusion over natural join, this is the second issue to clarify. Why do you endow the real line interval (1000 < x < infinity) with trivial topology? Isn't it standard metric space?
http://en.wikipedia.org/wiki/Real_line#As_a_metric_space




Norbert_Paul

unread,
Jan 24, 2015, 12:11:36 PM1/24/15
to
Norbert_Paul wrote:
> The initial case gives:
> R1 = { (a,b) in X x X | g1(a) R+ g1(b)) }
> R2 = { (a,b) in X x X | g2(a) R+ g2(b)) }
> R3 = { (a,b) in X x X | g3(a) R+ g3(b)) }
> Then a result relation is Rz = R1 /\ R /\ R3 (where /\ is set
> intersection).
YA error: This must be
R1 = { (a,b) in X x X | g1(a) R^* g1(b)) }
R2 = { (a,b) in X x X | g2(a) R^* g2(b)) }
R3 = { (a,b) in X x X | g3(a) R^* g3(b)) }
Where R^* is the transitive and reflexive closure of R.

Norbert_Paul

unread,
Jan 25, 2015, 5:30:29 AM1/25/15
to
Tegiri Nenashi wrote:
> On Friday, January 23, 2015 at 12:32:12 PM UTC-8, Norbert_Paul wrote:
> Please note that at this moment I'm confused on the meaning of both
>
> S join D
>
> together with
>
> RS join RD
>
> This is because you have introduced common attribute "detail" into both S and
> D while omitting the example. I can make guesses interpreting the id as
> identification of parent object in the aggregation, and detail being child.
> Then, join S with D is finding objects with common set of children. Again, a
> fixed example would be nice.

OK, I will start over again and only do the natjoin part without any decoration:

Assume that there are two topological spaces (S,T(RS)) and (D,T(RD)) represented
by the four relations

S = sid detail
---------------
int1 interior
door xdoor
garden exterior

<---interior--->|<----boudary--->
RS = isid idetail - bdsid bddetail
----------------------------------
int1 interior - door xdoor
garden exterior - door xdoor


for (S,T(RS)) and


D = did detail
----------------------------------------------------
interior interior
exterior exterior
surface1| idoor
surface2| idoor
surface|| xdoor
surfacex| xdoor
idoormat idoor
xdoormat xdoor

<-----interior---->|<-----boudary------>
RD = idid idetail bddid bddetail
---------------------------------------
interior interior surface1| idoor
idoormat idoor surface2| idoor
interior interior surfacex| xdoor
exterior exterior surface|| xdoor
xdoormat xdoor surface|| xdoor


for (D,T(RD)). The labels "interior" and "boundary" denote, how a tuple in D
references two tuples in D. The topology-defining binary relations RS on S and
RD on D are defined by the rule:

For each pair of tuples d1, d2 in D holds:
exists r in RD s.t.
d1.did=r.idid and d1.detail=r.idetail
and
d2.did=r.bddid and d2.detail=r.bddetail
<=>
d2 in cl({d1}) (1)

with respect to the topology T(RD). T(RD) itself is the finest (maximal)
topology that satisfies the above rule.
An attribute prefix "i" is mnemonic for "interior" and "bd" for "boundary".
The topology T(RS) is defined in a similar manner.

Now we carry out the natural join on S and D:

SD = S natjoin D
= sid detail did
--------------------------
int1 interior interior
door xdoor surface||
door xdoor surfacex|
door xdoor xdoormat
garden exterior exterior .

Then we have the two projections ps and pd:

ps : SD -> S,
ps(int1, interior, interior) = (int1, interior)
ps(door, xdoor, surface||) = (door, xdoor)
ps(door, xdoor, surfacex|) = (door, xdoor)
ps(door, xdoor, xdoormat) = (door, xdoor)
ps(garden, exterior, exterior) = (garden, exterior)

pd : SD -> D,
pd(int1, interior, interior) = (interior, interior)
pd(door, xdoor, surface||) = (surface||, xdoor)
pd(door, xdoor, surfacex|) = (surfacex|, xdoor)
pd(door, xdoor, xdoormat) = (xdoormat, xdoor)
pd(garden, exterior, exterior) = (exterior, exterior)

Now we need the coarsest topology that makey both functions continuous.
This is generated by the maximal binary relation on SD, that makes ps and pd
continuous: So I will check each pair (sd1, sd2) of tuples from SD x SD, if
it satisfies the two donditions

(ps(sd1),ps(sd2)) in RS^*
(pd(sd1),pd(sd2)) in RD^*

(here a mande an error in previous posts (and publications), where I used the
transitive closures RS^+ and RD^+ instead of the transitive and reflexive
closures RS^* and RD^*).
These pairs as a binary relation on SD generate the topology for the natural
join of the spaces. I will not check diagonal entries (sd1,sd1) because it is
not necessary. Note that RS and RD are transitive, hence in this example
(a,b) in Rx^* <=> a=b or (a,b) in Rx holds for Rx = RS and Rx = RD.

sd1 as first tuple in SD:
sd1=(int1, interior, interior) sd2=(door, xdoor, surface||)
ps: (int1,interior, door,xdoor) is in RS
pd: (interior,interior, surface||,xdoor) is not in RD
So this tuple is not in the result

sd1=(int1, interior, interior) sd2=(door, xdoor, surfacex|)
ps: (int1,interior, door,xdoor) is in RS
pd: (interior,interior surfacex|,xdoor) is in RD
So this tuple will be taken into the result relation.


sd1=(int1, interior, interior) sd2=(door, xdoor, xdoormat)
ps: (int1,interior, door,xdoor) is in RS
pd: (interior,interior xdoormat,xdoor) is not in RD
So this tuple is not in the result


sd1=(int1, interior, interior) sd2=(garden, exterior, exterior)
ps: (int1,interior, garden,exterior) is not in RS
pd: (interior,interior exterior,exterior) is not in RD
So this tuple is not in the result

sd1 as second tuple in SD:
sd1=(door, xdoor, surface||) sd2=(int1, interior, interior)
ps: (door,xdoor, int1,interior) is not in RS
pd : need not be tested
So this tuple is not in the result

sd1=(door, xdoor, surface||) sd2=(door, xdoor, surfacex|)
ps: (door,xdoor, door,xdoor) is a diagonal element, hence it is in RS^*
pd: (surface||,xdoor, surfacex|,xdoor) is not in RD
So this tuple is not in the result

sd1=(door, xdoor, surface||) sd2=(door, xdoor, xdoormat)
ps: (door,xdoor, door,xdoor) is in RS^*
pd: (surface||,xdoor, xdoormat,xdoor) is not in RD^*
So this tuple is not in the result

sd1=(door, xdoor, surface||) sd2=(garden, exterior, exterior)
ps: (door,xdoor, exterior,garden) is not in RS^*
pd: need not be tested
So this tuple is not in the result

sd1 as third tuple in SD:
sd1=(door, xdoor, surfacex|) sd2 = (int1, interior, interior)
ps: (door,xdoor, int1,interior) not in RS^*
So this tuple is not in the result

sd1=(door, xdoor, surfacex|) sd2 = (door, xdoor, surface||)
ps: (door,xdoor, door,xdoor) yet another diagonal element in RS^*
pd: (surfacex|,xdoor, surface||,xdoor) not in RD^*
So this tuple is not in the result

sd1=(door, xdoor, surfacex|) sd2 = (door, xdoor, xdoormat)
ps: diagonal
pd: (surfacex|,xdoor, xdoormat,xdoor) not in RD^*
So this tuple is not in the result

sd1=(door, xdoor, surfacex|) sd2 = (garden, exterior, exterior)
ps: (door,xdoor, garden,exterior) not in RS^*
pd: need not be tested.
So this tuple is not in the result

sd1 as fourth in SD:
sd1=(door, xdoor, xdoormat) sd2=(int1, interior, interior)
ps: (door,xdoor, int1,interior) not in RS^*
So this tuple is not in the result

sd1=(door, xdoor, xdoormat) sd2=(door, xdoor, surface||)
ps: (door,xdoor, door,xdoor) is in RS^*
pd: (xdoormat,xdoor, surface||,xdoor) is in RD
So this tuple will be taken into the result relation.

sd1=(door, xdoor, xdoormat) sd2=(door, xdoor, surfacex|)
ps: (door,xdoor, door,xdoor) is in RS^*
pd: (xdoormat,xdoor, surfacex|,xdoor) is in RD
So this tuple will be taken into the result relation.

sd1=(door, xdoor, xdoormat) sd2=(garden, exterior, exterior)
ps: (door,xdoor, garden,exterior) is not in RS^*
So this tuple is not in the result

fifth row:

sd1=(garden, exterior, exterior) sd2=(int1, interior, interior)
ps: (garden,exterior, int1,interior) not in RS^*
So this tuple is not in the result

sd1=(garden, exterior, exterior) sd2=(door, xdoor, surface||)
ps: (garden, exterior, door, xdoor) is in RS
pd: (exterior, exterior) sd2=(surface||, xdoor) is in RD
So this tuple will be taken into the result relation.

sd1=(garden, exterior, exterior) sd2=(door, xdoor, surfacex|)
ps: (garden,exterior, door,xdoor) is in RS
pd: (exterior,exterior, surfacex|,xdoor) not in RD^*
So this tuple is not in the result

sd1=(garden, exterior, exterior) sd2=(door, xdoor, xdoormat)
ps: (garden,exterior, door,xdoor) is in RS
pd: (exterior,exterior, xdoormat,xdoor) is not in RD^*
So this tuple is not in the result


So the topology of the natural join SD is generated by the binary relation

RSD = isid idetail idid bdsid bddetail bddid
---------------------------------------------------------
int1, interior, interior, door, xdoor, surfacex|
door, xdoor, xdoormat, door, xdoor, surface||
door, xdoor, xdoormat, door, xdoor, surfacex|
garden, exterior, exterior, door, xdoor, surface||

on SD.

I hope this exhaustive exapmple helps to understand my approach.
(It made me spot an error in a previous post, too :) )

> I moved this part towards the end of the discussion: assuming you helped to
> resolve my confusion over natural join, this is the second issue to clarify.
> Why do you endow the real line interval (1000< x< infinity) with trivial
> topology? Isn't it standard metric space?
> http://en.wikipedia.org/wiki/Real_line#As_a_metric_space

The real line, when considered a topological space, is usually considered
metric, not trivial. So the topology is implicitly given by convention.
One could search for similar conventions for standard topologies for relational
database table. A metric topology would be OK for me, because then we would
happen to meet in the special case of finite tables.

Norbert
Message has been deleted

Tegiri Nenashi

unread,
Jan 25, 2015, 3:56:27 PM1/25/15
to
On Sunday, January 25, 2015 at 2:30:29 AM UTC-8, Norbert_Paul wrote:
> Assume that there are two topological spaces (S,T(RS)) and (D,T(RD)) represented
> by the four relations
>
> S = sid detail
> ---------------
> int1 interior
> door xdoor
> garden exterior
>
> <---interior--->|<----boudary--->
> RS = isid idetail - bdsid bddetail
> ----------------------------------
> int1 interior - door xdoor
> garden exterior - door xdoor

RS is not transitive. (Later, you mention that it is.)

> for (S,T(RS)) and
>
>
> D = did detail
> ----------------------------------------------------
> interior interior
> exterior exterior
> surface1| idoor
> surface2| idoor
> surface|| xdoor
> surfacex| xdoor
> idoormat idoor
> xdoormat xdoor
>
> <-----interior---->|<-----boudary------>
> RD = idid idetail bddid bddetail
> ---------------------------------------
> interior interior surface1| idoor
> idoormat idoor surface2| idoor
> interior interior surfacex| xdoor
> exterior exterior surface|| xdoor
> xdoormat xdoor surface|| xdoor
>
> for (D,T(RD)). The labels "interior" and "boundary" denote, how a tuple in D
> references two tuples in D.

OK: having composite attributes (such as idid+idetail) allows you to avoid explicit references to tuples with introduction of auxiliary concepts such as FK, which you did in earlier posts.

> The topology-defining binary relations RS on S and
> RD on D are defined by the rule:
>
> For each pair of tuples d1, d2 in D holds:
> exists r in RD s.t.
> d1.did=r.idid and d1.detail=r.idetail
> and
> d2.did=r.bddid and d2.detail=r.bddetail
> <=>
> d2 in cl({d1}) (1)
>
> with respect to the topology T(RD). T(RD) itself is the finest (maximal)
> topology that satisfies the above rule.

I seem to grasp the high level idea: given the two topologies T(RS) and T(RD), their "natural join" is a lattice meet T(RS)^T(RD):
http://en.wikipedia.org/wiki/Comparison_of_topologies#Lattice_of_topologies

[Relational algebra] natural join is lattice operation as well:
http://www.dcs.bbk.ac.uk/~szabolcs/ramics-final.pdf
This makes it trivial to establish that topology plays well along with Database Relational Algebra: the objects of Norbert Paul algebra are pairs (DatabaseRelation, Topology) with each component defining their own lattice meet and join. The Norbert Paul lattice is just a lattice product.

I might have jumped too far, because we didn't discuss union and projection yet. I simply point out that while natural join covers 3 cases (selection, intersection, and Cartesian product), its dual operation covers 2 cases -- union and projection -- at once.

> An attribute prefix "i" is mnemonic for "interior" and "bd" for "boundary".
> The topology T(RS) is defined in a similar manner.
>
> Now we carry out the natural join on S and D:
>
> SD = S natjoin D
> = sid detail did
> --------------------------
> int1 interior interior
> door xdoor surface||
> door xdoor surfacex|
> door xdoor xdoormat
> garden exterior exterior .

Crystal clear now, thank you for clarifying it.
Typo? Did you mean RS^* and RD^* are transitive?

> (a,b) in Rx^* <=> a=b or (a,b) in Rx holds for Rx = RS and Rx = RD.
>
> sd1 as first tuple in SD:
> sd1=(int1, interior, interior) sd2=(door, xdoor, surface||)
> ps: (int1,interior, door,xdoor) is in RS
> pd: (interior,interior, surface||,xdoor) is not in RD
> So this tuple is not in the result

Same question here: RS^* and RD^*? I guess this can be easily recovered because "... is in RS" entails "... is in RS^*".

ps: (int1,interior, door,xdoor) is in RS (and, therefore, in RS^*)
OK, this convinces me that the natural join, and the entire algebra is consistent. It would also be helpful to demonstrate that it is meaningful. I have already complained about relation and attribute names. The Sketch can be architectural schema, but what is Detail? Is it mapping of the archeological site as described in the reference in one of your earlier posts?

There are some values in your example, such as door, which intend to appeal to layman intuition. Still, I can imagine database application where home improvement chain store has an inventory of all doors:

Doors(sku,vendor)
DoorsDimensions(doorType,x,y,z)


I can't figure out realistic schema where "door" is a value of some attribute.

Norbert_Paul

unread,
Jan 27, 2015, 1:48:33 PM1/27/15
to
Tegiri Nenashi wrote:
> On Sunday, January 25, 2015 at 2:30:29 AM UTC-8, Norbert_Paul wrote:
>> Assume that there are two topological spaces (S,T(RS)) and (D,T(RD))
>> represented by the four relations
>>
>> S = sid detail
>> ---------------
>> int1 interior
>> door xdoor
>> garden exterior
>>
>> <---interior--->|<----boudary--->
>> RS = isid idetail - bdsid bddetail
>> ----------------------------------
>> int1 interior - door xdoor
>> garden exterior - door xdoor
>
> RS is not transitive. (Later, you mention that it is.)

http://en.wikipedia.org/wiki/Transitive_relation :

Forall a,b,c in X : (a R b and b R c) => a R c. (1)

For each tuple t in RS the projection (t.isid, t.idetail) is the
"left-hand-side" and = (t,bdsid, t.bddetail) is the "right-hand side"
of the binary relation RS on S, hence

(t.isid, t.idetail) RS (t.bdsid, t.bddetail)

is equivalent to t in RS.

Let a,b,c in S be arbitrary chosen, such that a RS b and b RS c hold.
Then there must be two tuples (a,b) = t1, (b,c) = t2 in RS such that
b = (t1.bsid, t1.bddetail) = (t2,isid, t2,idetail)
holds.

All right-hand sides (t1.bsid, t1.bddetail) of the tupels in RS
look like (door,xdooor).
Hence t2 must have a left-hand-side

b = (t2.isid, t2.idetail) = (door, xdoor).

No such tuple exists in RS, hence the premise (a RS b and b RS c)
ist wrong for all a,b,c in S.
Hence the implication
(a R b and b R c) => a R c.
is true for all a,b,c in RS.
Hence RS satisfies property (1).
Hence RS is transitive.
\qed

> OK: having composite attributes (such as idid+idetail) allows you to avoid
> explicit references to tuples with introduction of auxiliary concepts such as
> FK, which you did in earlier posts.

The "detail" attribute is still a "foreign key" into
select detail from S
uinion
select detail from D;


>
>> The topology-defining binary relations RS on S and RD on D are defined by
>> the rule:
>>
>> For each pair of tuples d1, d2 in D holds: exists r in RD s.t.
>> d1.did=r.idid and d1.detail=r.idetail and d2.did=r.bddid and
>> d2.detail=r.bddetail <=> d2 in cl({d1})
>> (1)
>>
>> with respect to the topology T(RD). T(RD) itself is the finest (maximal)
>> topology that satisfies the above rule.
>
> I seem to grasp the high level idea: given the two topologies T(RS) and
> T(RD), their "natural join" is a lattice meet T(RS)^T(RD):
> http://en.wikipedia.org/wiki/Comparison_of_topologies#Lattice_of_topologies

No, it is the lattice "meet" of the two topologies induced by the projections
from the join to the input tables. In general a result relation may also be the
lattice "join", depending on the query operator.

> [Relational algebra] natural join is lattice operation as well:
> http://www.dcs.bbk.ac.uk/~szabolcs/ramics-final.pdf This makes it trivial to
> establish that topology plays well along with Database Relational Algebra:
> the objects of Norbert Paul algebra are pairs (DatabaseRelation, Topology)
> with each component defining their own lattice meet and join. The Norbert
> Paul lattice is just a lattice product.

No. it is not that simple:
First you have to find induced topologies (or co-induced topologies depending on
the type operator). The result topology is either the infimum or the
supremum topology. The infimum topology is simply achieved by the union of
the relations. The supremum topology, however, requires transitive closure
computation in general. There are exceptions though. For example the topolgoy
for the Cartesian Product "x" can be computed without transitive closure:

(A,R) x (B,S) = (A x B, Ia x S + R x Ib)

where Ia is the identity relation (a,a) on A and Ib the identity (b,b) on B.
"+" is set union. It is, for example, an initial topology which can be created
by a relational query expression without transitive closure.
The tuples in Ia x S look like
((a,b1), (a,b2)) where a in A and b1 S b2 holds.
The tuples in R x Ib look like
((a1,b), (a2,b)) where a1 R a2 and b in B holds.

> Crystal clear now, thank you for clarifying it.

You are welcome.

>> Now we need the coarsest topology that makey both functions continuous.
>> This is generated by the maximal binary relation on SD, that makes ps and
>> pd continuous: So I will check each pair (sd1, sd2) of tuples from SD x SD,
>> if it satisfies the two donditions
>>
>> (ps(sd1),ps(sd2)) in RS^* (pd(sd1),pd(sd2)) in RD^*
>>
>> (here a mande an error in previous posts (and publications), where I used
>> the transitive closures RS^+ and RD^+ instead of the transitive and
>> reflexive closures RS^* and RD^*). These pairs as a binary relation on SD
>> generate the topology for the natural join of the spaces. I will not check
>> diagonal entries (sd1,sd1) because it is not necessary. Note that RS and RD
>> are transitive, hence in this example
>
> Typo? Did you mean RS^* and RD^* are transitive?

No. See proof above: RS and RD /are/ transitive.

> Same question here: RS^* and RD^*? I guess this can be easily recovered
> because "... is in RS" entails "... is in RS^*".
>
> ps: (int1,interior, door,xdoor) is in RS (and, therefore, in RS^*)

The rule to get the topology induced by a function f is, in general

(f(x1),f(x2)) in R^* .

You need noct check (f(x),f(x)) in R^*, because this trivially holds.
When you have already verified

(f(x1),f(x2)) in R, then (f(x1),f(x2)) in R^* holds.

>> ....
>> on SD.
>
> OK, this convinces me that the natural join, and the entire algebra is
> consistent. It would also be helpful to demonstrate that it is meaningful. I
> have already complained about relation and attribute names. The Sketch can be
> architectural schema, but what is Detail? Is it mapping of the archeological
> site as described in the reference in one of your earlier posts?

This concept may not be suitable for archeological sites. I think of a
topological generalizaiton of kind of a "details library" which can be
referenced from a coarser "sketch" to get a "detailled" plan iff details
(say, windows, doors) are repeatedly used.

So imagine you work on a plan where you have an element 0815 of type "door".
Then refence that 0815-door object to a family of finer objects aggregated to a
description of that door. The details library may then also describe
topologically, how a door fits into a wall. The detailled plan (the natjoin)
then topologically connects the, say, 0816-wall to the 0815-door, iff this
connection is as well present in the sketch as also in the details library.

However, this is still only an idea. There is much work to be done to get this
running.

> There are some values in your example, such as door, which intend to appeal
> to layman intuition. Still, I can imagine database application where home
> improvement chain store has an inventory of all doors:
>
> Doors(sku,vendor) DoorsDimensions(doorType,x,y,z)
>
> I can't figure out realistic schema where "door" is a value of some
> attribute.

Actually, this "door" value is only intendet for the small example.
If there are many doors, of course, each will get its identifier.

Norbert

Tegiri Nenashi

unread,
Jan 27, 2015, 5:00:51 PM1/27/15
to
On Tuesday, January 27, 2015 at 10:48:33 AM UTC-8, Norbert_Paul wrote:
> Tegiri Nenashi wrote:
> > On Sunday, January 25, 2015 at 2:30:29 AM UTC-8, Norbert_Paul wrote:
> >> <---interior--->|<----boudary--->
> >> RS = isid idetail - bdsid bddetail
> >> ----------------------------------
> >> int1 interior - door xdoor
> >> garden exterior - door xdoor
> >
> > RS is not transitive. (Later, you mention that it is.)
> ...
> Hence RS is transitive.

Oops, what a trivial mistake I made: RS^2 is empty, therefore RS^2+RS^3+... is also empty, thus RS = RS+RS^2+RS^3+... .

> > I seem to grasp the high level idea: given the two topologies T(RS) and
> > T(RD), their "natural join" is a lattice meet T(RS)^T(RD):
> > http://en.wikipedia.org/wiki/Comparison_of_topologies#Lattice_of_topologies
>
> No, it is the lattice "meet" of the two topologies induced by the projections
> from the join to the input tables. In general a result relation may also be the
> lattice "join", depending on the query operator.

Proposition. Natural join is of /final/ type; Inner union is of /initial/ type.

Corollary. Intersection, Cartesian product, and Selection is of /final/ type. Projection and Union is of /initial/ type.

I still stand by my assertion that behind your construction is cross product of two lattices. More specifically, you extend natural join of two database relations with lattice meet operation of two corresponding topologies. Likewise, you extend inner union-like operation (projection&union) with join of two lattice topologies.


> > [Relational algebra] natural join is lattice operation as well:
> > http://www.dcs.bbk.ac.uk/~szabolcs/ramics-final.pdf This makes it trivial to
> > establish that topology plays well along with Database Relational Algebra:
> > the objects of Norbert Paul algebra are pairs (DatabaseRelation, Topology)
> > with each component defining their own lattice meet and join. The Norbert
> > Paul lattice is just a lattice product.
>
> No. it is not that simple:
> First you have to find induced topologies (or co-induced topologies depending on
> the type operator). The result topology is either the infimum or the
> supremum topology. The infimum topology is simply achieved by the union of
> the relations. The supremum topology, however, requires transitive closure
> computation in general.

Isn't transitive closure is just an implementation detail? If I wanted to work with partitions instead of topologies, the situation would be very similar.
Lattice meet operation is partition intersection. In example of partitions of the set {1,2,3}:

1 | 2 3
^
1 2 | 3
=
1 | 2 | 3

Lattice join definition is a little more involved, so that partitions are the equivalence classes of element reachable via transitive closure. In example of partitions of the set {1,2,3,4,5,6}:

1 2 | 3 4 | 5 | 6
v
1 | 2 3 | 4 5 | 6
=
1 2 3 4 5 | 6

Here 5 is reachable from 1 via the chain 1->2->3->4->5: both 1 and 2 are in the same equivalence class of the first partition set, then, 2 and 3 are in the same equivalence class of the second partition set, and so on.

In other words [partition] lattice meet is just set intersection, while lattice join requires transitive closure. The fact that transitive closure was used can be ignored, and further development can be hinged on join and meet satisfying lattice algebraic laws.

> There are exceptions though. For example the topolgoy
> for the Cartesian Product "x" can be computed without transitive closure:
>
> (A,R) x (B,S) = (A x B, Ia x S + R x Ib)
>
> where Ia is the identity relation (a,a) on A and Ib the identity (b,b) on B.
> "+" is set union. It is, for example, an initial topology which can be created
> by a relational query expression without transitive closure.
> The tuples in Ia x S look like
> ((a,b1), (a,b2)) where a in A and b1 S b2 holds.
> The tuples in R x Ib look like
> ((a1,b), (a2,b)) where a1 R a2 and b in B holds.

This just reinforces my impression that transitive closure is not fundamental to the issue how you match relational algebra operations against topological lattice join and meet.

> > OK, this convinces me that the natural join, and the entire algebra is
> > consistent. It would also be helpful to demonstrate that it is meaningful. I
> > have already complained about relation and attribute names. The Sketch can be
> > architectural schema, but what is Detail? Is it mapping of the archeological
> > site as described in the reference in one of your earlier posts?
>
> This concept may not be suitable for archeological sites. I think of a
> topological generalizaiton of kind of a "details library" which can be
> referenced from a coarser "sketch" to get a "detailled" plan iff details
> (say, windows, doors) are repeatedly used.
>
> So imagine you work on a plan where you have an element 0815 of type "door".
> Then refence that 0815-door object to a family of finer objects aggregated to a
> description of that door. The details library may then also describe
> topologically, how a door fits into a wall. The detailled plan (the natjoin)
> then topologically connects the, say, 0816-wall to the 0815-door, iff this
> connection is as well present in the sketch as also in the details library.

More explanation (detail:-) is wanted. Say, I'm working on home improvement project changing the entrance door (any Ask This Old House fans here?). I could have a spec what features are wanted: the door thermal/sound insulation properties, is it a glass door (genus-1 surface), what door knob should it have, etc. I might also be concerned about the door fitting the door entrance, so this has to be also a part of my query against improvement home store catalog. What is entirely escaping my intuition is that the door topological neighbors (rooms) are also relevant in such an application.

Tegiri Nenashi

unread,
Jan 27, 2015, 5:32:52 PM1/27/15
to
On Tuesday, January 27, 2015 at 2:00:51 PM UTC-8, Tegiri Nenashi wrote:
> Corollary. Intersection, Cartesian product, and Selection is of /final/ type. Projection and Union is of /initial/ type.
>
> I still stand by my assertion that behind your construction is cross product of two lattices. More specifically, you extend natural join of two database relations with lattice meet operation of two corresponding topologies. Likewise, you extend inner union-like operation (projection&union) with join of two lattice topologies.

[Lattice] "join" is quite sloppy term here. Either synonym "least upper bound" or "supremum" (which you used in followup snippet) is better. Ditto for meet. Less confusion-prone assertion:

The elements of topological database are pairs (database relation, topology). The operations are defined component-wise so that they respect both lattice structures: relational lattice in the first component, and topological lattice in the second.

Derek Asirvadem

unread,
Jan 27, 2015, 8:48:34 PM1/27/15
to
Norbert

Following Tegiri's hint, I went back and re-read your previous posts. I have deleted my previous post, this is the revised version.

> On Saturday, 24 January 2015 09:38:03 UTC+11, Tegiri Nenashi wrote:

(Tegiri: I am getting value from your posts, so please do not take my comments as subtracting from that.)

> You are joining S ("Sketches") with D ("Details"). The fact that both S and D have an attribute called "detail" is confusing.
>
> Better names for attribute values would also help. For example, innerDoorMaterial could be glass, entranceDoorMaterial could be veneer.

Definitely.

> Please note how intuitive for a lay person

Lay person ??? Is mathematics a religious affair ? Maybe that is why they think they are different from the rest of us ! Maybe that is why there is a chasm between mathematicians and the real world !!

Well, God already has me, and I am Commanded against idolatry, I am happy to be a "lay person" in the context of the god of mathematics. You guys can spend your time inventing mathematical poofs to lay at his feet or hooves or whatever !

> is a typical database query:
> "Find the names of all females who eat at least one pizza served by Straw Hat"
> (https://class.stanford.edu/c4x/Engineering/CS145/asset/opt-rel-algebra.html)

Agreed. That was the point of Codd's work, to make the database accessible to users.

Considering:
> > On Wednesday, 21 January 2015 00:06:46 UTC+11, Derek Asirvadem wrote:
> > Claim
> > ii. Until 1985, the marks defining the "relations" were ASCII characters, and the statements were SQL DDL. That engages 4% of ones brain. Determination of whether such "relations" are Relational or not, and indeed whether the database as a whole is Relational or not, can now be made. Engaging the same 4% of the brain. Time-consuming and prone to error.
>
> The claim can now (at [ii] ) be confirmed, with some difficulty.

Stanford's claim that the relations in the example are Relational, can be determined, and confirmed, easily:
a. They use names, not letters, for their relations and attributes
b. They give the keys
c. The keys are not IDs
(which means record numbers, not rows, and the RM demands that "keys are made up from the data", which means IDs are prohibited. Further, the main different between Relational data and Record filing systems, is that they are related by Key vs related by record ID.)
d. One parent relation is missing, but it is easily identified:
____Pizzeria(pizzeria) (pizzeria) is a key
While that is fine for a classroom exercise, outside a class, it is not.
e. Not only are the given relations Relational, they are all base relations. Therefore the class can progress without hindrance. (In the great majority of papers written by mathematicians, they confuse base relations and derived relations. Sometimes without realising it, and other times, in order to erect a Straw Man argument.)

> I hope this exhaustive exapmple helps to understand my approach.

No, it did not.

Many mathematicians make this mistake.

At this moment, I have two papers on my desk that are awaiting approval/rejection from a customer, written by well-established mathematicians (one who is a correspondent here), and they are both raised by an OO development team leader, to support his notion that the data & referential integrity problems that plague the customer's app plus record filing system can be corrected by further definitions, typing, classifiers, etc. Obviously that is false, and it is simply a continuation of the same promises that have been made for three years, none of which have been delivered. Except that this time, he has a few academic papers to support his promises. His credibility is long gone, and the app+RFS is going to be replaced by an RDB + app, which is why I was hired.

The papers are not science, therefore pointing out the errors and dismissing them is not difficult, but it is time-consuming, since I have to address the construction of the argument before stitching up in the proposal.

Back to you. The common mistake in such papers, and in your long post, is this. You focus on the data content, one the different "relations". Instead of on the relation definitions.

I am not sure, but I have a good idea that mathematicians, especially the ones who provide partial UML diagrams, think of the relations through the lens of the objects (classes, classifiers, inheritance) that they have coded or planned. And that causes your perception of the data, as data, minus the objects, to be severely limited. The perception is data-the-way-it-looks (note the content of your post), as opposed to the definition of relations; what relations each relation is related to; dependent on; etc.

So please, do not type so much examples, it does not increase the understanding of the data in people who do not know the data. Give us definitions. Standard-compliant definitions (iii) are preferred, but simple text defns (ii) will be quite acceptable.

--

Stanford definitions. I realise that at this stage, all your relations are in fact records, not relations (they fail on (a) and (b)). Which has to be corrected in your next rendition of the paper, before you can make the claim.

But we can set that aside, on the basis that you have backed off from the claim, and try to understand the relations-as-intended, so that we can understand your papers, your topological spaces and the topological queries that you have constructed. Only if you give us relation definitions.

Your descriptions fail massively on (e). But that is the gap that has to be worked out before you can realise any implementation. What you and Tegiri have stated is "a lot of work to be done" is a small effort for me to do. Once I understand you.

<Retracted>

If I have a somewhat correct understanding of [1] and [2] ...
> for (D,T(RD)). The labels "interior" and "boundary" denote, how a tuple in D
> references two tuples in D.

No.

a. The labels are RoleNames, as defined in the RM.

b. The RoleNames give meaning to the left and right side of the two FK references in the record, and differentiate each from the other.

c. Since each RD record is, well, one RD record, they do not denote how one tuple references two tuples.
- In each record, there are two references.
- Each reference is one fact, one discrete relationship, to one parent record
- A relationship definition is a two-way street (two derived facts) but a single definition (if Fred is Sally's father, then I know that Sally is Fred's daughter, I don't need a second relationship to tell me that the first relationship backwards is a fact)
- If you are talking about the record in Detail, then there are two separate relationships, *each* to *way more* than one record in R(Detail). ("Two" is incorrect)
--- If you mean that for any given Detail, one particular relationship (Interior or Boundary) there are at most 2 records in R(Detail), then you must state that, as a constraint. If it is not stated, then the cardinality is 1:0-n.
- If you are talking about the record in R(Detail), then one record in R(Detail) (not Detail) relates to one record in Detail (via Interior) and relates to one record in Detail (via Boundary), which may or may not be the same Detail-record.

In OO/UML terms, sure, Detail (or Sketch) is an aggregator. So what. In your intended-relational terms, Detail is the parent relation, R(Detail) is the child relation. In a Relational Database, aggregation can occur at *any* level in the data hierarchy, for *any* descendent, it is pedestrian. The great-grand-father can aggregate the children; the grand-children; the great-grand-children. Effortlessly.

6.
I don't need the data content for natural joins or projections (but don't allow me stop you). I am pretty sure I know what a projection is. I need the projection and its purpose, what it means (hopefully in terms of spaces; boundaries; interiors; exteriors), to be explained, in one or two English sentences.

For continuations and topology generation, I think the data content is relevant, but only if the above is resolved and understood first.

Cheers
Derek

Derek Asirvadem

unread,
Jan 27, 2015, 8:50:48 PM1/27/15
to

Norbert, Tegiri

Thanks, Tegiri, for the hint.

Ok, I went back and re-read the earlier posts.

Ok, so S is Sketch and D is Detail. Those "relations", that are /so/ different to you, (which caused my misunderstanding) are one and the same to me. Perhaps my mind is Normalised. Let me try and put it in your terms: there is a single relation variable over /all/ the "relations" you describe in your examples. When Normalising, we care only about base relations, not derived relations, in order to create, to make-concrete, the variable.

Since I realise that mathematicians cannot make the distinction between base relations and derived relations, I am not going to ask you to do that. Please, just realise, that that is an impediment in the progress of your project, from The Abstract to The Real. And to communications.

> On Sunday, 18 January 2015 10:07:12 UTC+11, Norbert_Paul wrote:
>
> ...
>
> However, when spatial entities that live in some R^n (R^2 - the plane, R^3 -
> the Euclidean space, R^4 - the space-time, ...) are modelled and
> (according to my assumption) are partitioned into a a finite set of spatial
> entities (say, faces, edges, volumes, hyper-volumes, ...) then the set of
> entities immediately fits into my model (for being finite). Interestingly the
> entities get their topology by the same operations that produce topological
> result spaces from input spaces (so-called !quotient space").

In case it is not clear, given your original request, for practitioners to respond, a relational database implementation, etc, the above paragraph is the place where we do "meet", where we start from.

It is disappointing that since databases are finite, they are uninteresting to you. Because that is meeting point is for me. I am happy to deal with your request, not being a mathematician, being a practitioner, and so on. Likewise, in view of your original request, you might have to entertain a non-mathematician.

I am not sure how you are going to implement "interesting" database schema, Relational ones, if you will not first implement "uninteresting" database schema, and ensure they are Relational. There is already a gap between what is in your mind, in abstract terms, and what is required to implement that ("there is a lot of work to be done"). And the exercise of closing that gap is proceeding at a snails pace. Because:
a. you have difficulty communicating with real-world people, in concrete terms
b. and conversely, you do not understand what they communicate to you
c. you will not accept a practitioner's knowledge of what is important, to an implementation, even though you are aware that you are not an implementer.

====

I understood the uncorrected versions of this, I think 95%, and accepted it (ie. it did not jar me with some intuitive error). But the corrected versions ...

> On Sunday, 18 January 2015 20:24:12 UTC+11, Norbert_Paul wrote:

> > A topology could be
> >
> > T = { {}, {bathroom}, {hallway}, X}.
> >
> This is wrong. The topology looks like
> T = { {}, {bathroom}, {hallway}, {bathroom, hallway}, X}.

Is the fourth element supposed to identify that there is a passage between {bathroom} and {hallway}, while the previous enumeration did not have that passage. And we know (from another row) that that passage is occupied by {bdoor} ?

In that case, I do not accept that {room_a, room_b} is a valid IDENTIFIER for passages. I suggest {room,passage}. Otherwise you encounter problems with the implementation of the key which is two elements from the previous level of the hierarchy {Room}, where all other keys in that table are (thus far) one element from the previous level of the hierarchy plus a designator, eg. {bdoor}.

> > This gives a topology
> >
> > { {}, {bdoor}, X}
> >
> > the so-called "dual topology".
> Also wrong. The dual topology looks like
> { {}, {bdoor}, {bdoor, bathroom}, {bdoor, hallway}, X }.

Ok, as long as, as a consequence, there are more rows in your previous enumeration R = bounded boundary.

Dual topology. If you mean the transposition, that is trivial. RT is a derived relation of R, not a base relation.

I have produced a second sketch of the data model in the distance.
http://www.softwaregems.com.au/Documents/Student%20Resolutions/Norbert%20Paul/TDAS%20Note%20p4.pdf
If you would be so kind as to ignore it again, and confirm the elements thus far, I can produce the next sketch, in my own plodding way. You need an Hierarchy for Articles (Architectural Spaces) in a physical location (Room).

In case you are unfamiliar with diagrams or standard-compliant data models, I have included a link to IDEF1X Notation.

If you have difficulty with diagrams, and you would prefer text, let me know, I will translate.

====

> On Monday, 19 January 2015 09:18:48 UTC+11, Tegiri Nenashi wrote:
>
> OK, this example illustrates what your topological objects are (modulo corrections in your followup post). The next question is how do you fit this object into database relation.

Please, let's not "fit it in", let's (a) understand it, (b) Normalise it, and then (c) implement it with some intelligence.

> OK, this is intuitive, but I would like to mention that you are still quite far away from equipping [database] relations with topology.

Agreed. That is precisely the gap that I was trying to close, coming from a completely different location to that of yours, Tegiri. Imagine that Norbert is lost in the Grand Canyon, and he started from a town on the West side. Tegiri is from a city in the West. I am starting from a town on the East side. We are both trying to find Norbert and lead him out.

So far, it appears I am a voice crying in the wilderness, and you (Norbert) have no idea that the Messias (ie. your stated goal, not me) is close at hand. That is alright, St John lost his head to a slut who couldn't bear to hear The Truth. I am in very good company.

I appreciate that you know your subject well, and you think your relations might be Relational, etc. Please note, you are quite loose with your identifiers. That hinders your project, when communicating with others, whose assistance you have requested.

> > The bathroom door example topology is derived from the metric space
> > that door and rooms live in.

Clarified yes, but the confusion about identifiers remains.

Instead of identify a passage as a connection between two rooms, ie. two keys, compounded, why don't you identify it as an item in metric space, ie. x, y, x co-ords from a nominal zero co-ord point, such as the north east corner of the floor.

Or identify the passage as belonging to (being part of) one room only.

> On Tuesday, 20 January 2015 06:58:42 UTC+11, Norbert_Paul wrote:
> > Tegiri Nenashi wrote:

> Just a the pair (X,T) is called "topological space", I call the pair (X,R)
> a topological data type, because I consider the relation R a binary relation
> R \subseteq X \times X
> on X where, by abuse of notion,

That will get you in the end. Here it interferes with the identification.

> R will only contain primary key attributes from
> X. Then R generates a topology T(R) for X this representing the tpological
> spcae
> (X,T(R)).
> I am, hence, thinking of registering such pairs in the database catalog as
> "spaces".

Yeah, sure, at the high level, with no rubber on the ground.

The issue is:
a. Those Spaces are quite varied (room, door, passage, veneer)
b. The way you have been identifying them is very loose (coords; names; composite names; etc)
c. and you want to relate these Spaces to other Spaces of different types
d. therefore you had better have an universal and meaningful identifier for "spaces"

> So I have both: a "multivariant" relation X accompanied by a binary relation
> R on X.

Tegiri
Can you please provide a one line definition of a multivariate relation. Google finds nothing in this space, only such that is related to multiple regression..

> In the ER-model X is usually called an "entity type" and R a
> "relationship type".
>
> I am only interested in topology and, hence a binary relation is sufficient.
> However I can imagine a binary relation with three attributes:
>
> create table PointSet
> planid integer not null,
> objectid integer not null,
> -- ... additional attributes ...
> primary key (planid, objectid)
> -- ... additional constraints ...

x ...
y ...
z ...

> );

What article is PointSet a point set of, ie. what are the various things that objectid will contain ? (I think you are going to answer, 50 or 60 "relations".)

> create table Topology(
> planid integer not null,
> bounded integer not null,
> boundary integer not null,
> primary key(planid, bounded, boundary),
> foreign key (planid, bounded) references point_set,
> foreign key (planid, boundary) references point_set);

I don't think it is a big deal at this point, but in my second sketch, thus far, my name for your PointSet is Article, and Topology is Space. I think Topology is a result set or derived relation, as such, there is no table for it (and if there were, it would be a massive duplication of data, which I will not participate in helping you to create).

I think the essence is, a bounded article is defined in terms of its boundaries.

If so, then:
a. how are unbounded articles defined ? Just the point set ?
b. And are they included in some Topology ?

> Then the pair (PointSet, Topology) represents topological space

See what I mean by the naming. X and Y cannot represent Y. That is how they produced Mad Cow Disease.

> that is a
> so-called direct sum, a disjoint union of topological spaces each indexed by
> planid: http://en.wikipedia.org/wiki/Disjoint_union_(topology)
> The ineresting thing is, that the above construction would enforce the
> "dircet sum" property as a consistency rule.
>
> Also a ternary relation can be used to model a binary relation where each
> association has an integer attribute denoting the orientation of the boundary.
> But this is future work. I want also be able to model complexes.
> http://en.wikipedia.org/wiki/Chain_complex

Sure, and I am happy to support that, with these notes in mind
a. ternary relations (ie. for the identifier, as opposed to a third or fourth FK for some attribute) are a nightmare (that might be a restatement of Tegiri's comment, if by "multi-variate", he means n-ary in the relational database world)
b. If you accept Codd, then n-ary relations can be reduced to binary relations. So the solution is given but you must implement it
c. the difficulty is, again, your concept of how Spaces and PointSets are Identified.

Which brings up the next (after cleaning up the identifiers) important issue, that mathematicians (not you) are scared to death of. You need
____An Hierarchy____
with full navigation capability.

- Room actually consists of door, window, wall
- window consists of sill, frame, inner glaze, space, outer glaze
- frame consists of ... etc
- and each of those items is an Architectural Space. Which relates to other Spaces (bounded, boundary), etc.
- and Topology is some combination of those Spaces, expressed as a /function/ of those Spaces.
- in an hierarchy, obtaining the "direct sum" is a simple matter
- your "consistency rule" is not an enforcement, it is a direct result of a few smaller consistency rules that are implemented in the structure. Ie. you are trying to enforce a rule on the result relation (instead of enforcing more appropriate rules on the base relations, which is the appropriate place), such that the derived relations (*all* of them) are consistent.
- that is a classic error that mathematicians make

Which is why I am trying to get you to be clear about the relationships, so that the hierarchical ones (normal) are differentiated; what you require for boundary and bounded.



> >> The relational representaion of a topology is an old and well-established
> >> fact in mathematics. The innovation is (IMHO) to combine it with the
> >> relational model.
> >
> > Again, you are referring to the theory of binary relations
> > http://en.wikipedia.org/wiki/Relation_algebra which is different from
> > relational model used in database theory.
>
> Again, I am always talking of a /pair/ (X,R) of relations where X is arbitrary
> and R a binary relation on X (thereby, of course, omitting all non-key
> attribute values from X).
>
> If you register such space in a hypothetical catalogue of spaces, say by
> sort of
>
> create space OuterSpace(
> PointSet, -- the point set
> Topology, -- the relation
> (planid, bounded) -- the FK on the "bounded side".
> );

Ok, we are getting there.
- What is the key that you use to identify the space when you register it ?
- Thus far, there is no PK (planid, bounded) anywhere for that space to be an FK reference /to/. Do you mean (planid, objectid) for a bounded row ?

See what I mean, the confusion of base relations vs derived relations causes manifold problems. That have to be resolved when an implementation is contemplated. Better to be clear about the distinction right from day one.

> I willl give an example query to illustrate the idea: When the pair
> (PointSet, Topology) is registered as a space you could query OuterSpace
> just like an ordinary relation:
>
> select *
> from OuterSpace
> where planid = 9;
>
> would give a pair (X9, R9) which would represent plan 9 from OuterSpace
> as a tpopological space (X9, T9) and which then would consist of the two
> relations
>
> X9 = select * from PointSet where planid = 9;
>
> R9 = select R.planid, R.bounded, R.boundary
> from cltopology R, X9 bded, X9 bdry
> where R.planid = bded.planid and R.bounded = bded.objectid
> and R.pplanid = bdry.pplanid and boundary = bry.objectid --.
>
> Here cltopology is the transitive closure on topology. R9 then generates the
> "subspace topology" of X9 in OuterSpace:
> http://en.wikipedia.org/wiki/Subspace_topology
> In the above example, the transitive closure is not necessary but in general,
> correct subspace topology computation requires transitive closure. Its
> efficient use requires a good query optimizer.

All commercial SQL platforms will do that quite easily.

We do not necessarily need the exact structure that you appear to be fixed upon. Any Normalised Relational structure, as long as it provides the hierarchy correctly, will do. You can whack a DISTINCT in the query, but it is better to have a constraint that prevents self-references.

> The entire Relational Algebra can be "topologized" this way. Every operator
> of Relational Algebra can be applied to such a space and will then return a
> correct result space. Some examples:
>
> group by: http://en.wikipedia.org/wiki/Quotient_space_(topology)
> cross join: http://en.wikipedia.org/wiki/Product_topology
> projection: http://en.wikipedia.org/wiki/Final_topology
> equijoin: http://en.wikipedia.org/wiki/Pullback_(category_theory)

Oh dear.

If that is all there is to it, then it is no big deal at all (I concede that it may be novel, and it may be a big deal, in the mathematicians circles.) We have been doing that in Relational Databases, for over thirty years that I am aware of. We have "geographised" spaces (think google maps plus customised objects); "geneologises" spaces (mum; dad; the kids; the grannies, for people; horses; dogs); manufacturing Bill of Materials spaces (assembly-components, construction; building materials). All with:
- full computation
- transitive closure
- instantaneous traversal
- unlimited levels
- etc, etc.
The Unix inode file system is a straight hierarchy, with a true Relational implementation (one does not need an RDBMS to implement data, Relationally, on any platform). There are millions of Hierarchies. Unknown to mathematicians, of course.

I wrote my first hierarchical system in 1979, on a Network DBMS, and then years later, as a consultant for a vendor, advised the same team on how to convert it to Relational. I have written at least 40 hierarchical components in various databases over the years, and given full solution for others to implement in at least 50 more. I wrote one for Multiple Regression. Hierarchies are common in nature, and therefore almost every genuine Relational database will have some hierarchical capability implemented, if not the full hog, transitive closure, computation, etc.

There are Three Obstacles and One Consideration.

The first is, mathematicians are scared of The Hierarchy. And they are in a state of denial re the evidenced fact, that the Relational Model is a progression of, not a substitute of, the Hierarchical Model. Note the comments of J Hidders and J K Lowden in recent threads. Of the two Normal Forms defined in the RM (which btw remain undefined by mathematicians after 44 years), the first is, if we were to name it without psychological impediments, the
____ Hierarchical Normal Form____
- It destroys many of the problems that mathematicians, even today, are grappling with.
- It deems many of the proposals of Date, Fagin, and Darwen *non-relational*, which is why they suppress the Hierarchical issue, in order that people won't identify their proposals as massive breaches of the RM.

The second obstacle is, due to the fear of the HM, and the suppression of Hierarchies, the recognition of an Hierarchy in the data (which is an ordinary, natural occurrence) is made difficult or impossible. As a result the method for both implementing an Hierarchy in a Relational Database (ie. in the /Data Space/, and for navigating it (the code, in the /Process Space/), are unknown.

This state of affairs results in many funny situations, all of them massively inefficient, with many more indices and rows and limitations (compared to the simple, direct implementation). They can't call it "hierarchy" because they deny it and fear it, so they have funny names for it: nested sets; adjacency lists. And they can only implement a fraction of it. We just call them Hierarchies. MS even has a HierarchyID datatype, which is the worst, because it fixes the hierarchy in concrete (any change to the hierarchy requires a large part of it to be re-written). In an unhindered implementation, one updates only the rows that one needs.

The need is so common, and the method is so pedestrian, that I no longer charge for the advice (concerning an implementation of an Hierarchy in a Relational Database), I give it away free of charge, and I have the tutorials with a full set of examples and code available online.

Third, the hierarchy must be implemented in the Data Space, not the Program Space. The traversal (navigation of levels in the hierarchy) is of course, implemented in the Program Space, not the Data Space. If you do not understand this /architectural principle/ (the same one that I have stated that the OO/UML types and mathematicians commonly break, when they try to control the data from the program space, and build monoliths to offer to the god of mathematics), _you will build a very poor system_, that needs a massive amount of maintenance.

Fourth, a consideration, not an obstacle. Traversal requires recursion. Which means, if you don't have a commercial SQL platform, that provides recursion /in the Program Space/, you are stuffed. Check Joe Celko's method of traversing an hierarchy, only a fraction of which is implemented, as an "adjacency list", if you want a really good laugh.

> It is a RoomsWithTopology object that is represented by a pair of relations
> just as described above. I am thinking of expanding the relational catalog to
> register topological spaces. Then a query on spaces always uses the topological
> counterparts of each relational operator involved. A mix of spatial and
> non-spatial operators combining "ordinary" database relations with "spaces" is
> also possible.

I do not see how the "space" relations are different from "ordinary" relations. They are all just relations (keeping in mind that we implement base relations as tables). If the relations contain "space" data, sure, they are space relations. If they contain "mummy and daddy" data, sure, they are geneological relations.

> ... these are the ones I already know. The list you gave me is handy. I could
> check for each property mentioned there if and how it affects the topology.
>
> Instead of restricting R, one could provide consistency rules for spaces to
> enforce topological properties of modelled spaces.

Exactly right. When you get close to realising your tables, there will be quite literally 40 or 50 constraints that you can implement on the database ... as long as you have:
- correctly Normalised it first,
- and any hierarchies that do exist in the data, have been implemented directly (without the impediments detailed above).
That is the mass of Integrity (all declarative) that a Relational Database is capable of, that a OO/UML program is not capable of. ANd that will "ensure" that any and all result sets (your "relations" are consistent; have transitive closure; allow full computation (including "direct sum"); etc. Attempts to enforce constraints on the result set is of course a gross error.

My last point regards relevance. Separate to the novelty of "implementing toplogical space <or whatever> relations, Relationally". Separate to overcoming the impediments and implementing an hierarchy as a Relational Hierarchy. At the outset, from your text descriptions, I had the idea that this was a substantial project; that there was a huge gap between the 40 or 50 mathematical relations, and a physical implementation. It is turning out, that it is:
- Four Relational tables.
---- Pedestrian for a practitioner (as long as Normalisation is understood).
- That are implemented as a Relational Hierarchy.
---- Simple for me, difficult-to-impossible for those who are victims of the suppression of Hierarchies, as a result of Date's, Fagin's, Darwen's, Abiteboul-Hull-Vianu's "work", and as proliferated by virtually every mathematician, eg. as evidenced by J Hidders and J K Lowden.
- There are already hundreds of thousands of similar implementations in the real world. Refer my first post in this thread. Millions, if you count licences.

Therefore, it is a tiny project, if implemented by an experienced practitioner who is not crippled by the impediments.
---- And it is difficult, for anyone else, with the extent of difficulty directly related to the extent that they are affected by the impediments.
---- This is a case in point, of how mathematicians cripple mathematicians, all by themselves.

Therefore, what remains is, whether the relations are Relational.
- that is trivial to me, I can ensure that they are
---- Given the evidenced level of knowledge of the Relational Model amongst mathematicians, sure, that might (a) be a big deal, and (b) never complete (due to not knowing what Relational means in the physical universe)
- that might well be a big deal in the world of mathematical papers (refer my initial comments that,, for the life of me, I cannot understand why people even consider non-relational relations), and therefore, go ahead and ensure that your relations are Relational (considering my initial comments that it is not possible to determine that, until you have a model that is close to implement-able)

Therefore, I fail to see the relevance of the tiny project.
---- I concede, it may be relevant, novel, first time relations are Relational, "a lot of work to be done", etc ... for mathematicians.

Nevertheless, I am willing to execute the thankless task of assisting you in your progress to your stated goal of implementation. Ie. translate your mathematicians descriptions into implement-able form; Normalised to <all>NFs (ie. all current NFs and any NF that will be "defined" in the future); with an Hierarchy implemented directly and squarely (full "direct sum" and transitive closure); without redundancy; and produce a complete (ala complete documentation) data model for a Relational Database.

Cheers
Derek

Tegiri Nenashi

unread,
Jan 27, 2015, 11:41:01 PM1/27/15
to
On Tuesday, January 27, 2015 at 5:50:48 PM UTC-8, Derek Asirvadem wrote:
> Tegiri
> Can you please provide a one line definition of a multivariate relation. Google finds nothing in this space, only such that is related to multiple regression..

Section 3.2 from the Alice book
http://wiki.epfl.ch/provenance2011/documents/foundations+of+databases-abiteboul-1995.pdf
reads:

"Under the named perspective, these attributes are viewed as an explicit part of a database schema and may be used (e.g., by query languages and dependencies). Under the unnamed perspective, the specific attributes in the sort of a relation name are ignored, and only the
arity of a relation schema is available (e.g., to query languages). "

Unnamed perspective is something that has been used in mathematics for a long long time. Please note that Tarski's algebra of binary relations also uses unnamed perspective. It seems that only after Codd's invention of relational model the named perspective reigned supreme, so that database practitioners might be even not aware of relations with unnamed/positional attributes. In those rare cases where SQL uses positional notation, it is more of misnomer. Compare

select name from employees order by 1

with

select name from employees order by 1.000001

Apparently, a person who originally designed SQL "order by" clause was not familiar with continuous functions.

Getting back to your question, the term "multivariate relation" is synonym for "relation with named attributes", which emphasizes named perspective.

Derek Asirvadem

unread,
Jan 28, 2015, 1:01:12 AM1/28/15
to

Tegiri

Thank you very much for answering my question, both concisely, and (separately) completely. Reminds me of someone.

> On Wednesday, 28 January 2015 15:41:01 UTC+11, Tegiri Nenashi wrote:

> Getting back to your question, the term "multivariate relation" is synonym for "relation with named attributes", which emphasizes named perspective.

Perfect.

(So it is not what I suspected, it is not n-ary relations.)

> Section 3.2 from the Alice book

Oh God ! A well-known abortion. An intended textbook, available free. Unfortunately cited in many papers. Which damages the credibility of, and declares the watery foundation of, said papers. I take it, it is beloved of mathematicians.

> Unnamed perspective is something that has been used in mathematics for a long long time. Please note that Tarski's algebra of binary relations also uses unnamed perspective. It seems that only after Codd's invention of relational model the named perspective reigned supreme, so that database practitioners might not be even aware of relations with unnamed/positional attributes.

Confirmed. I do know about it, but I readily agree that most practitioners do not.

> In those rare cases where SQL uses positional notation, it is more of misnomer.

Meaning, in SQL, not mathematics, it is a misnomer.

> Getting back to your question, the term "multivariate relation" is synonym for "relation with named attributes", which emphasizes named perspective.

By "emphasises", do you mean "implements". The unnamed perspective is not implemented.

> Apparently, a person who originally designed SQL "order by" clause was not familiar with continuous functions.

I think not. He understood completely, that in the practical world, the physical universe, the unnamed perspective is irrelevant. Or, the unnamed/posititional perspective would be a nightmare for the developer, and for an SQL implementation, ridiculous in a practical universe.

And then it follows, that the positional in SQL is the position number of the [named] attributes (in the SELECT list).

> Compare
>
> select name from employees order by 1
>
> with
>
> select name from employees order by 1.000001

Well, only the commercial SQLs (not "rare") handle things like that, and handle them correctly. The second example has a value distinction that is meaningless, irrelevant, to the operation. This might give a better understanding:

SELECT /column_name_list/
____FROM { /table/ | /view/ | /derived_table/ }
____ORDER BY { /column_name/ | /column_no } [, { /column_name/ | /column_no } ] ...

Where /view/ and /derived_table/ are merely SELECT statements.

SQL is a practical-only data sublanguage.

It is not reasonable to expect theoretical closure from it. We do have full transitive closure (if one implemented that in the tables, not by magic); join transitive closure (by magic, fully automatic); etc, that is relevant in the physical universe.

If the TTM groupies collected half a brain amongst the lot of them, they would simply write an interpreter for relational algebra, and stick it on top of SQL. I said so six years ago; someone else said the same thing three years ago. Deaf ears. But for the mathematical papers (un-scientific) that support, and re-inforce, the monolith, the Tower of Babel, the everything-in-one-language. In these days of components and libraries.

To anyone who says that SQL is "not relational" or that something in a Relational Database can't be done in SQL, I have always said: give me the problem, and I will give you the fully Relational solution, free of charge. While practitioners and developers take up my offer frequently, and leave satisfied, mathematicians never take it up. They prefer to hang on to the notion, un-evidenced, that "SQL is broken, it is not relational".

Have you seen the O.2.SQL abortion ? The creatures the OO boys come up with these days, sheeesh. I do all that in plain SQL, on a pure RDb, with standards and simple rules for implementation. No fuss, no mess, no fanfare. That one is not free.

Last question. Why then, call the unnamed perspective "multi-variate relations" ?

Cheers
Derek

Tegiri Nenashi

unread,
Jan 28, 2015, 12:51:56 PM1/28/15
to
On Tuesday, January 27, 2015 at 10:01:12 PM UTC-8, Derek Asirvadem wrote:
>
> SELECT /column_name_list/
> ____FROM { /table/ | /view/ | /derived_table/ }
> ____ORDER BY { /column_name/ | /column_no } [, { /column_name/ | /column_no } ] ...

This syntax is wrong, because "order by" clause allows arbitrary expressions, e.g.

select * from employees order by salary+1

The "1" there, is it a column number? What column does it refer to? I understand that it is possible that the original SQL standard might not have expressions. Still, in normal programming language as soon as one feature is introduced, other is deprecated. What are those column numbers are for, to save a user few keystrokes? In modern world with auto completion?

> Last question. Why then, call the unnamed perspective "multi-variate relations" ?

Consider multivariate polynomial system

$latex x^2-3 x+2 = 0 $
$latex y-1 = 0 $
$latex z-1 = 0 $

it has 2 zeroes:

x=1, y=1, z=1
x=2, y=1, z=1

In other words, we have relation

x y z
- - -
Message has been deleted

Tegiri Nenashi

unread,
Jan 30, 2015, 12:29:17 PM1/30/15
to
On Thursday, January 29, 2015 at 8:13:16 PM UTC-8, Derek Asirvadem wrote:
> > because "order by" clause allows arbitrary expressions,
>
> No it doesn't. As per the syntax above, it allows either a /column_name/ or a /column_no/ (where 0 < /column_no/ ≤ /no_of_columns/

"order by emp_no+1" works in Oracle and MySql, the two the most ubiquitous databases on the planet. I honestly don't understand why are you reluctant to allow users to sort their result by salary+bonus. Finally,
SQL is the verbose champion of programming languages, what difference couple of saved keystrokes at the end of the query does it make?

Derek Asirvadem

unread,
Jan 31, 2015, 1:54:54 AM1/31/15
to
> On Saturday, 31 January 2015 04:29:17 UTC+11, Tegiri Nenashi wrote:
> On Thursday, January 29, 2015 at 8:13:16 PM UTC-8, Derek Asirvadem wrote:
> > > because "order by" clause allows arbitrary expressions,
> >
> > No it doesn't. As per the syntax above, it allows either a /column_name/ or a /column_no/ (where 0 < /column_no/ ≤ /no_of_columns/
>
> "order by emp_no+1" works in Oracle and MySql ... I honestly don't understand why are you reluctant to allow users to sort their result by salary+bonus.

> Ouch. You are right. I have made a mistake, sorry. I should have examined it instead of glancing at the manuals. Thank you for staying with it until it was resolved. So let me back up to the point where we were before I took the wrong fork in the road.

> This syntax is wrong, because "order by" clause allows arbitrary expressions, e.g.

Yes.

Corrected:

SELECT /column_name_list/
____FROM { /table/ | /view/ | /derived_table/ }
____ORDER BY {
________/column_name/ |
________/expression/ |
________/column_no/
________}
________[, { /column_name/ | /expression/ | /column_no } ] ...

Where /expression/ may or may not reference a column; subquery; etc.
Where /column_no/ may be an expression, that resolves to an integer (a real is truncated) 0 < cn ≤ /no_of_cols/.

If you have a problem with that, note that the DB2 and Sybase Parsers are far more intelligent (Sybase implemented it fourth major Parser version in 2007; the first was 1984) than Oracle and MyNonSql, which I would say are hardly parsers. Ie. it handles context well.

> "order by emp_no+1" works in Oracle and MySql

And now that I have had my coffee, it works in Sybase.

> the two the most ubiquitous databases on the planet.

[Distraction] Sure, but they are not SQL compliant. Their use of "SQL" is fraudulent.

> select * from employees order by salary+1
>
> The "1" there, is it a column number? What column does it refer to?

No, that is an /expression/, which happens to be a /column_name/ and some arithmetic. Any SQL arithmetic will be fine. Any whitespace is fine. It resolves to the value of salary for the result set row plus the value 1.

If the /expression/ does not reference a column_name, then it is a /column_no/.

> Still, in normal programming language as soon as one feature is introduced, other is deprecated. What are those column numbers are for, to save a user few keystrokes? In modern world with auto completion?

My comments on those issues stand.

Cheers
Derek
0 new messages