teaching relational basics to people, questions

Sampo Syreeni

unread,

Nov 16, 2009, 2:42:21 PM11/16/09

to

Right now it is to be expected that I will be spreading the good
relational word among my peers, in the near future. That is an
opportunity one doesn't want to fuck up; many enough have gone down
that road already. So I've been going over, and over, and over the
basics. Don't want them to be able to catch me off guard with the
minutiae, after all...

So now I bump into my first real surprise, and the chills immediately
go down my spine. That's Date et al.'s answer regarding the
implications between 6NF and DK/NF, at http://www.dbdebunk.com/page/page/621935.htm
. In there they flat out state that DK/NF doesn't imply 6NF.

So, my first question is, can this really be true? I mean, this seems
highly suspect to me: since 6NF is a normal form like any other and is
as such defined by the constraints it upholds by design, and on the
other hand DK/NF is by definition a normal form where any constraint
whatsoever follows from the domain and key ones, shouldn't it be self-
evident that DK/NF logically implies 6NF, and in fact any other form?
No matter the fact that there might well be databases which could be
put into 6NF which cannot attain DK/NF? I think at the very least said
implication should follow at the price of making it a vacuous truth
(i.e. all (non-trivial?) 6NF databases could be such that they cannot
be put into DK/NF)?

In particular I suspect that the seeming lack of implication follows
from not treating the time dimension(s) on an equal footing with the
rest of the attributes in a relation. That, then, would at least to me
seem like a rather grave violation of the information principle.

The second point ain't as much a rebuke as a retort: I wonder whether
Date and Darwen chose their model of time -- which 6NF is defined on
top of -- based on convenience and familiarity, instead of some deeper
theoretical reasoning. To me the idea that time in a relational
database should be treated as a discrete, countably infinite set of
disjoint moments at a preset temporal granularity seems just
unnatural, and unnecessarily limiting.

To me it would seem much more natural to model time as a full
continuum of precise moments in time, and to constrain such real life
models using a finite (but otherwise unlimited cardinality) set of
FOPL constraints, relying on the full linear order on top of the
reals, on top. I.e. to model time using CW-complexes over the real
line (i.e. finite unions of open, closed and semi-closed intervals of
reals), in a fully discrete but also fully variable precision
approximation.

Model-wise, 6NF as D&D define it immediately generalizes to this --
all that needs to be changed is to quantify every defining formula
over the corresponding nondenumerable set -- yet the possibility of
rigorously modelling the interaction between open and closed intervals
as well is a considerable plus when dealing with general intersection
queries. I also consider the the fact that imputing any kind of chosen-
ahead granularity parameter into the basic model suddenly becomes
unnecessary a huge plus. So, do you think this sort of approach is
sound?

Finally, I of course have the firm intention of covering the
essentials, including the basics of dependency and normalization
theory at least upto 6NF and DK/NF. If my audience proves to be game,
I'd also like to mention in passing some of the lesser known, more
esoteric, and less fully researched topics in dependency theory like
(E)(B)MVD's (cf. e.g. http://www2.cs.uregina.ca/~butz/publications/ipmu00.pdf
), just to make sure people don't accidentally think they've mastered
the subject after what is a mere, hurried, introduction. I'd hope to
pique some genuine interest in the relational way of thought, among
people who perhaps haven't been exposed to the mindset, eventhough
otherwise more than capable in modelling data. If you could suggest
other ways to accomplish the feat, I would greatly appreciate a hint.
--
Sampo

com...@hotmail.com

unread,

Nov 17, 2009, 6:08:30 PM11/17/09

to

On Nov 16, 11:42 am, Sampo Syreeni <de...@iki.fi> wrote:
> other hand DK/NF is by definition a normal form where any constraint
> whatsoever follows from the domain and key ones

Go read a definition carefully. Eg wiki dknf.
It's in DK/NF iff all constraints follow from the domain and key ones.
So it's not in DK/NF iff some constraint doesn't follow from the
domain and key ones,
In other words, if there other constraints is just isn't *in* DK/NF.
(It's not a very useful concept.)

philip

Casey Hawthorne

unread,

Nov 18, 2009, 2:37:25 AM11/18/09

to

Somebody once called me pretentious.

I replied, "Moi?"

On Mon, 16 Nov 2009 11:42:21 -0800 (PST), Sampo Syreeni <de...@iki.fi>
wrote:

Regards,
Casey

Sampo Syreeni

unread,

Nov 18, 2009, 11:53:00 AM11/18/09

to

On Nov 18, 1:08 am, com...@hotmail.com wrote:

> Go read a definition carefully. Eg wiki dknf.
> It's in DK/NF iff all constraints follow from the domain and key ones.
> So it's not in DK/NF iff some constraint doesn't follow from the
> domain and key ones,
> In other words, if there other constraints is just isn't *in* DK/NF.

But that's just my point: we *assumed* DK/NF, so there cannot be any
other constraints. 6NF should logically be subsumed, if not else then
vacuously.

> (It's not a very useful concept.)

Personally I find it very useful, at least as a mental aid, because it
forces one to concentrate on the possibility of constraining your
schema just a little bit further. No matter the means.
--
Sampo

com...@hotmail.com

unread,

Nov 18, 2009, 10:56:48 PM11/18/09

to

So what sense do make then of Date's example?

"EMP {EMP#,DEPT#,SALARY} KEY {EMP#}
(with the obvious semantics)"

You are confused. This is why I said to read some definitions.

If some relations and constraints meet certain criteria,
we say they are in some certain normal form.
The example has one relation and no (non-key-or-type) constraints.
Since there are no non-key-or-type constraints, that collection
of relations and constraints is in DK/NF.
But the relation in the example has more than one non-key attribute.
So that collection of relations and constraints is not in 6NF.

> we *assumed* DK/NF, so there cannot be any
> other constraints. 6NF should logically be subsumed, if not else then
> vacuously.

> Personally I find it very useful, at least as a mental aid, because it

> forces one to concentrate on the possibility of constraining your
> schema just a little bit further.

The sense of "constraint" in the first quote is the normal dbms one.
But you seem to think 6NF is some kind of "constraint".
We could say that in some sense
"a given set of relations and constraints rearranged
into some normal form is more constrained than it was".
But this is using the word "constraint" with a different meaning.
This seems to be where you are confused.

It would probably help if you didn't use vague and/or suggestive
(thus, meaningless) phrases like "logically subsumed".
You do this a lot.
Force yourself to identify and refer to the things, notions and
terms you find in definitions
(like, a collection of some relations and some constraints).

philip

Sampo Syreeni

unread,

Nov 19, 2009, 4:32:41 AM11/19/09

to

On Nov 19, 5:56 am, com...@hotmail.com wrote:

> But the relation in the example has more than one non-key attribute.
> So that collection of relations and constraints is not in 6NF.

The definition of 6NF is that the schema must not satisfy any non-
trivial join dependencies at all. This implies at most one non-prime
attribute per relation. As I see it, the condition that there be at
most one non-prime attribute is a constraint on the relation in the
usual sense. It does not follow from key and domain constraints,
however. Hence, the relation may not be in 6NF, but I don't think it
is in DK/NF either. If so, it isn't a counter-example to DK/NF
implying 6NF like Date seems to think.

> It would probably help if you didn't use vague and/or suggestive
> (thus, meaningless) phrases like "logically subsumed".

At least that one has a perfectly standard meaning in mathematics.
It's strictly the same as "is implied by".
--
Sampo

com...@hotmail.com

unread,

Nov 19, 2009, 9:56:25 PM11/19/09

to

On Nov 19, 1:32 am, Sampo Syreeni <de...@iki.fi> wrote:

> As I see it, the condition that there be at
> most one non-prime attribute is a constraint on the relation in the
> usual sense.

> > (thus, meaningless) phrases like "logically subsumed".

> At least that one has a perfectly standard meaning in mathematics.
> It's strictly the same as "is implied by".

If you look into it you might find differently.

Regardless (or perhaps, with regard to those "other ones"),
it has been my experience that phrases like "As I see it"
are rhetorical wafflings. As if, when later shown wrong, you can say,
ok, that might not be the way it is, but that's the way I saw it,
and all I claimed was that I saw it that way, and I did, so I wasn't
wrong.
I am not trying to catch you at anything.
I'm trying to help (about normal forms and your writing, both).
(And because of the exercise in improving my reasoning,
understanding and writing.)
As I said, it helps to force oneself to be direct and precise,
because it (painfully) makes one think and write more clearly.

Regarding constraints & you vs Date we remain in disagreement.
Hope someone else can help.

philip

rpost

unread,

Nov 20, 2009, 3:18:56 PM11/20/09

to

Sampo Syreeni wrote:

>To me it would seem much more natural to model time as a full
>continuum of precise moments in time, and to constrain such real life
>models using a finite (but otherwise unlimited cardinality) set of
>FOPL constraints, relying on the full linear order on top of the
>reals, on top. I.e. to model time using CW-complexes over the real
>line (i.e. finite unions of open, closed and semi-closed intervals of
>reals), in a fully discrete but also fully variable precision
>approximation.

Doesn't this make it a bit hard to predict how long queries
on time-valued domains may take? The amount of time required for
manipulating a value will greatly vary. My guess is that why
Date and Darwen rule it out for this reason. Then again, Date
defines the use of essentially arbitrary domains elsewhere.

--
Reinier

Sampo Syreeni

unread,

Nov 21, 2009, 7:30:25 AM11/21/09

to

On Nov 20, 10:18 pm, rp...@pcwin518.campus.tue.nl (rpost) wrote:

> Doesn't this make it a bit hard to predict how long queries
> on time-valued domains may take? The amount of time required for
> manipulating a value will greatly vary. My guess is that why
> Date and Darwen rule it out for this reason. Then again, Date
> defines the use of essentially arbitrary domains elsewhere.

I don't think so, because all of the logic would still revolve around
similar comparison operations, on an identical number of endpoints.
The only difference would be that in some cases there would be a less-
than-or-equal-to where a less-than was before, the numerical values
used to store time ranges would have their lowest order bit used to
denote the distinction between an open and a closed end point, and
variable precision would be available at least at the conceptual
level. At the physical one, it could always be implemented so that the
complexity if bounded by the less precise of the compared values, or
in fact limited to some fixed value at physical design time. And since
one would have to have a bona fide range datatype, building in
handling for infinite ranges would also be easy; that'd get rid of one
of the most persistent reasons why people incorporate nulls into
designs.
--
Sampo

Roy Hann

unread,

Nov 21, 2009, 11:24:20 AM11/21/09

to

Sampo Syreeni wrote:

> [snip] And since

> one would have to have a bona fide range datatype, building in
> handling for infinite ranges would also be easy; that'd get rid of one
> of the most persistent reasons why people incorporate nulls into
> designs.

I think you are being excessively optimistic. The most persistent (and
most common) reason people incorporate nullable columns into designs is
because they have a misplaced desire to minimize the number of tables in
the design, and think that conflating multiple fact types in one table
is clever, efficient, and harmless.

--
Roy

Bob Badour

unread,

Nov 21, 2009, 11:56:40 AM11/21/09

to

Roy Hann wrote:

Very well put!

Reinier Post

unread,

Nov 21, 2009, 2:12:28 PM11/21/09

to

Sampo Syreeni wrote:

[...]

>variable precision would be available at least at the conceptual
>level. At the physical one, it could always be implemented so that the
>complexity if bounded by the less precise of the compared values, or
>in fact limited to some fixed value at physical design time.

The most common types of database values can be represented
in constant space and compared in constant time. For other
types such as stringsor blobs of arbitrary size at least comparison
is linear in the size of the values. In your proposal, the values
are *arbitrary* reals. For all we know, they may be represented
by algorithms to compute them - in which case comparing them may be
pretty expensive - or even worse, they may be uncomputable.
You'll need to constrain them at least to the extent that comparing them
becomes a decidable problem.

>And since
>one would have to have a bona fide range datatype, building in
>handling for infinite ranges would also be easy; that'd get rid of one
>of the most persistent reasons why people incorporate nulls into
>designs.

Yes, I agree with your basic idea, but I really think you want to
replace 'arbitrary real number' with 'arbitrary precision integer'
or 'arbitrary precision integer quotient' or something similar.
They are similar to the strings and blobs we already have.

>--
>Sampo

--
Reinier

Mr. Scott

unread,

Nov 22, 2009, 3:26:23 AM11/22/09

to

"Roy Hann" <spec...@processed.almost.meat> wrote in message
news:3eudnYx_I7CpiZXW...@pipex.net...

I think you are oversimplifying. The most persistent (and most common)
reason people incorporate nullable columns into designs is the not so
misplaced desire to provide for information that is relevant but not
required, and that doesn't necessarily involve conflating multiple fact
types in one table. Each table design represents a family of interdependent
predicates. For example, the the simple table design,

{EMPLOYEE, DEPENDENTS} KEY(EMPLOYEE)

could represent the interdependent predicates,

'there is an employee <EMPLOYEE>' iff
'employee <EMPLOYEE> claims <DEPENDENTS> dependents.'

The predicates are interdependent. Given a row (EMP1, 2), the assertion
that 'there is an employee EMP1' implies that 'employee EMP1 claims 2
dependents' due to the functional dependency EMPLOYEE -> DEPENDENTS, while
at the same time the assertion that 'employee EMP1 claims 2 dependents'
implies that 'there is an employee EMP1.'

Permitting nulls in the DEPENDENTS column doesn't change the fact that each
employee claims some number of dependents, so long as the predicates remain
interdependent. What it does do is to permit recording the assertion that
there is an employee even when the precise number of dependents to be
claimed is as yet unknown. I should point out that splitting up the table
in order to eliminate the nulls doesn't fix anything. For example,
splitting the table into

T1 {EMPLOYEE} and T2 {EMPLOYEE, DEPENDENTS}

where T1 represents the predicate,

'there is an employee <EMPLOYEE>,'

and T2 represents the predicate,

'employee <EMPLOYEE> claims <DEPENDENT> dependents,'

requires the introduction of referential constraints from T2 to T1 and also
from T1 to T2 since the original predicates are interdependent. Dropping
the referential constraint from T1 to T2 is a problem because the fact that
each employee claims some number of dependents is no longer explicitly
specified nor is it a logical consequence of the constraints that remain.

The problem you're referring to, the one which involves conflating multiple
fact types, occurs only when the relationship between the predicates is
disjunctive. For example,

'there is an employee <EMPLOYEE>' or
('there is an employee <EMPLOYEE>' iff
'employee <EMPLOYEE> claims <DEPENDENT> dependents.')

It should be easy to see that this mirrors the split tables T1 and T2
without the referential constraint from T1 to T2. While one can't determine
whether null indicates that an employee doesn't claim dependents or that the
number of dependents claimed is not yet known, one also can't determine
whether the absence of a row in T2 indicates that an employee doesn't claim
dependents or just that the number of dependents claimed is not yet known.

What troubles me most about eliminating nulls just for the sake of
eliminating nulls is that there isn't always a precise mapping between a
database scheme that permits nulls and one that doesn't. Information is
lost, such as the fact that each employee claims some number of dependents
in the example above.

> --
> Roy
>
>

vldm10

unread,

Nov 22, 2009, 6:23:41 AM11/22/09

to

On Nov 16, 8:42 pm, Sampo Syreeni <de...@iki.fi> wrote:
> Right now it is to be expected that I will be spreading the good
> relational word among my peers, in the near future. That is an
> opportunity one doesn't want to fuck up; many enough have gone down
> that road already. So I've been going over, and over, and over the
> basics. Don't want them to be able to catch me off guard with the
> minutiae, after all...
>
> So now I bump into my first real surprise, and the chills immediately
> go down my spine. That's Date et al.'s answer regarding the

> implications between 6NF and DK/NF, athttp://www.dbdebunk.com/page/page/621935.htm

> (E)(B)MVD's (cf. e.g.http://www2.cs.uregina.ca/~butz/publications/ipmu00.pdf

> ), just to make sure people don't accidentally think they've mastered
> the subject after what is a mere, hurried, introduction. I'd hope to
> pique some genuine interest in the relational way of thought, among
> people who perhaps haven't been exposed to the mindset, eventhough
> otherwise more than capable in modelling data. If you could suggest
> other ways to accomplish the feat, I would greatly appreciate a hint.
> --
> Sampo

Regarding DK/NF and 6NF you can see my solution at www.dbdesign11.com.
There I introduce “Simple Form” an effective solution which decomposes
any relation to Binary Relations. My solution distinguishes two kinds
of DBs. DBs that don’t maintain changes (Naive DBs) and DBs which
maintain changes (General DBs). The Simple Form for Naive DBs is given
in 6.5 while Simple Form for General DBs is given in 4.2.9.

DK/NF and 6NF don’t distinguish Naive DBs from General DBs.
R. Fagin in his paper wrote: “The very important practical question of
exactly when DK/NF can be obtained (and how to obtain it) is open.”
What is a purpose of 6NF if key is compound? I mean very compound.

The Simple Form don’t need any other NF. The binary schema can be
immediately constructed. Obviously Binary Relations are superior to
NFs.

The idea of Binary Relations can be applied to Binary Files (see
example 5). However the most general case is Binary Concept. Here I
introduce Identifier of Entity which is very important for
construction of entity’s abstraction. Binary Concepts are related to
facts see (3.4 – 3.9). I introduce facts as primitives i.e. a fact is
the smallest complete unit by means of which a semantic act.

Regarding time my solution is event oriented. There are only two kinds
of events related to information (see 3.1). So - no events imply no
time. These events are in the real world. DB’s events are also in the
real world. So my DB design is very much related to the real world and
always it enables determination of the corresponding real world
events.
Hope this can help to determine answers on your questions.
Vladimir Odrljin

toby

unread,

Nov 22, 2009, 4:34:30 PM11/22/09

to

On Nov 18, 2:37 am, Casey Hawthorne <caseyhHAMMER_T...@istar.ca>
wrote:

> Somebody once called me pretentious.
>
> I replied, "Moi?"

A joke used in Fawlty Towers :)

vldm10

unread,

Nov 23, 2009, 5:23:07 AM11/23/09

to

On Nov 22, 12:23 pm, vldm10 <vld...@yahoo.com> wrote:
> On Nov 16, 8:42 pm, Sampo Syreeni <de...@iki.fi> wrote:

> What is a purpose of 6NF if key is compound? I mean very compound.

> Vladimir Odrljin- Hide quoted text -

To illustrate above question I will give you the following example:
1.
R1{A,B,C,D} and KEY{A,B,C,D}.
My question is: What is an equivalent set of 6NF relvars for R1?

2.
Now we want to maintain changes of attributes. Then R1 become:
R1{A, StartDateA, EndDateA,…,EndDateD}
My question is: What is an equivalent set of 6NF relvars for R1?

3.Now I can add for example one more relvar:
R2{A1, StartDateA1, EndDateA1,…, EndDateA10}.

4.
Now I have m-n relationship between R1 and R2. The relationship also
changes its attributes.
What is the key for this relationship
What is an equivalent set of 6NF relvars for this m-n relationship?

This is really simple example. There are dynamic db applications with
hundreds of relvars.
I can have an additional columns per one attribute. For examle in my
solution (see www.dbdesign11.com example 9) I have 6 additional
columns per one attribute.
For example I can insist that all date be in form YY, MM, DD, HH, MIN,
SEC – but I must take date as timestamp (technical reasons - in the
theory)
What will be 6NF equivalent for last m-n relationship?

Vladimir Odrljin

Roy Hann

unread,

Nov 23, 2009, 6:05:21 AM11/23/09

to

Mr. Scott wrote:

>
> "Roy Hann" <spec...@processed.almost.meat> wrote in message
> news:3eudnYx_I7CpiZXW...@pipex.net...
>> Sampo Syreeni wrote:
>>
>>> [snip] And since
>>> one would have to have a bona fide range datatype, building in
>>> handling for infinite ranges would also be easy; that'd get rid of one
>>> of the most persistent reasons why people incorporate nulls into
>>> designs.
>>
>> I think you are being excessively optimistic. The most persistent (and
>> most common) reason people incorporate nullable columns into designs is
>> because they have a misplaced desire to minimize the number of tables in
>> the design, and think that conflating multiple fact types in one table
>> is clever, efficient, and harmless.
>
> I think you are oversimplifying. The most persistent (and most common)
> reason people incorporate nullable columns into designs is the not so
> misplaced desire to provide for information that is relevant but not
> required, and that doesn't necessarily involve conflating multiple fact
> types in one table. Each table design represents a family of interdependent
> predicates.

That last sentence is the crux of your objection to my comment. I guess
since you and I are implicitly talking about SQL and since SQL makes few
if any claims to fidelity with the relational model, you are entitled
to think whatever you like about tables that represent "a family of
interdependent predicates". Personally I have no idea what those are
supposed to behave like, but I am pretty sure that in practice they
must suck. Certainly every example I've ever had to deal with did. At
this point I invoke the Principle of Incoherence and drop out of the
discussion.

I will however allow that you may be right that yours is an even more
common reason people introduce nullable columns. It would be worth
investigating which really is most common, so that appropriate remedial
training can be devised.

[much erudite-looking stuff snipped]

--
Roy

Mr. Scott

unread,

Nov 26, 2009, 5:19:14 AM11/26/09

to

"Roy Hann" <spec...@processed.almost.meat> wrote in message

news:N8Sdnf8ODpfs8ZfW...@pipex.net...

That sentence applies not only to SQL tables but also to Codd's time-varying
relations and Date and Darwen's relvars. Date defines a relvar's predicate
as the conjunction of all of the constraints that apply to it, but I think
he's wrong. The logical connective should be IFF rather than AND. The
difference is subtle, and may at first glance appear problematic since IFF
is true whenever none of its operands are true, but under the closed world
assumption, the only atomic formulas that are ever represented in the
database are those that are supposed to be true, so there is no harm in
choosing IFF over AND. The main reason I think the connective should be IFF
rather than AND involves deletes. While inserting a row effectively asserts
that all of the atomic formulas represented by the row are true, regardless
of the whether the logical connective is IFF or AND, the same can't be said
for deletes. When the logical connective is IFF, deleting a row effectively
denies that any (not all) of the atomic formulas represented by the row are
true, but when the logical connective is AND, deleting a row effectively
denies that all (not any) of the atomic formulas represented by the row are
true, which can be even if only one isn't. If some but not all of the
atomic formulas represented by a row were true, but a row can't be in the
table unless they're all true, then where is that positive information to be
stored? Without anywhere to store it, the database is inconsistent.

Clifford Heath

unread,

Nov 26, 2009, 11:13:19 PM11/26/09

to

vldm10 wrote:
> Regarding DK/NF and 6NF you can see my solution at www.dbdesign11.com.

> There I introduce ï¿½Simple Formï¿½ an effective solution which decomposes

> any relation to Binary Relations.

Your approach seems very similar in general to NIAM
and other fact-oriented modeling approaches like ORM2.
They are built on "Elementary Form", which in ORM2 can
include fact types that are ternary and higher (though
these are always trivial to binarise, the higher-order
fact types are useful in modeling because they reflect
natural verbalisations).

The main difference is that in fact orientation, the
model is expected always to be *constructed* in elementary
form. For efficiency of storage and access, uniqueness
constraints allow automatic and invisible aggregation
into non-elementary structures. This means you can model
in elementary form, and get a correct and efficient schema
which holds the same facts to use with your SQL DBMS. This
is the implementation principle behind my "Constellation
Query Language" as well.

You use the term "attribute" a lot. In fact orientation,
there are no attributes, they are just fact types that
encompass a functional dependency between objects. To drop
the notion of "attribute" is a major win for a number of
reasons, including that it isn't a clearly-defined concept
in natural usage; vis the conflict between relational and
object-oriented aggregation; and also the fact that
"attribute migration" is a major driver of schema evolution,
hence project scope inflation and failure.

In any case, before you go claiming your work as "novel",
it'd be good if you did some reading on fact orientation
and see whether others haven't been there before you, like,
for example, more than twenty years ago ;-). This isn't a
criticism actually; I haven't looked in sufficient depth
at your work to decide whether you've done something new.
But a lot of it does seem very familiar...

> Regarding time my solution is event oriented.

I like this view - it accords with my thoughts on the
quantization of time where the general principle that
"time is just G*d's way of keeping everything from happening
at once" ;-).

Terry Halpin (creator of ORM2) has a recent article series
on temporal modeling published at brcommunity.com, if
you're interested.

Clifford Heath, Data Constellation, http://dataconstellation.com
Agile Information Management and Design.

Message has been deleted

com...@hotmail.com

unread,

Nov 27, 2009, 10:32:35 PM11/27/09

to

On Nov 26, 2:19 am, "Mr. Scott" <do_not_re...@noone.com> wrote:

You really don't understand the relational model.

> That sentence applies not only to SQL tables but also to Codd's time-varying
> relations and Date and Darwen's relvars. Date defines a relvar's predicate
> as the conjunction of all of the constraints that apply to it, but I think
> he's wrong.

The "relvar predicate" that is the conjunction of all its constraints
is the characteristic predicate of the set of tuples that can ever be
in the relation. Date now calls this the "total relvar constraint" but
it used to be called the "internal predicate" (which term is now
deprecated by Date). But there is a different "relvar predicate",
which the designer specifies to tell the user what the relvar is
saying about the world. Date now calls this the "relvar predicate"; it
used to be called the "external predicate" (which term is now
deprecated by Date).

The relationship between the two "predicates" is as follows. The
designer specifies the relvar predicate so that a user updating the
database can observe the world and figure out whether a given tuple
makes it true and so that a user looking at the database can find out
what is true of the world. Next the designer figures out all the
possible tuples that could ever turn up in a relation because of the
way the world can be and then writes the total relvar constraint so
the dbms can tell the user they made an error if they ever try to put
some other tuple into the relation.

> The logical connective should be IFF rather than AND. The
> difference is subtle, and may at first glance appear problematic since IFF
> is true whenever none of its operands are true, but under the closed world
> assumption, the only atomic formulas that are ever represented in the
> database are those that are supposed to be true, so there is no harm in
> choosing IFF over AND. The main reason I think the connective should be IFF
> rather than AND involves deletes. While inserting a row effectively asserts
> that all of the atomic formulas represented by the row are true

A row in a relation asserts one proposition for each syntactically
valid row. If a such a row is in the relation then the proposition is
its relvar predicate with free variables replaced by attribute values.
If such a row is not in the relation then the proposition is its
relvar predicate with free variables replaced by attribute values,
negated. Since these are all asserted, that's the same as asserting
their conjunction.

If one were to use the "open world assumption" then the latter
propositions are not asserted. But then a dbms cannot calculate the
answer for a query that can only be written using a "NOT" (or in
relational algebra, MINUS, assuming the other primitives are JOIN,
UNION, PROJECT, EQUALS-RESTRICT and RENAME).

, regardless
> of the whether the logical connective is IFF or AND,

I don't know what "atomic formulas" you think are being ANDed/IFFed.
But if the database were asserting the IFF of some propositions then
it would be saying they are either all true or all false. So a user
would know that either they are all true or all false, but not which.
(That's a rather moot hypothetical since it *isn't* what a database
asserts.) Whereas in the relational model the user knows they are all
true. (That is, all the propositions above, some of which are ground
terms and some of which are negated ground terms, are all true.)

> the same can't be said
> for deletes. When the logical connective is IFF, deleting a row effectively
> denies that any (not all) of the atomic formulas represented by the row are
> true, but when the logical connective is AND, deleting a row effectively
> denies that all (not any) of the atomic formulas represented by the row are
> true, which can be even if only one isn't.

I don't know what you mean by "the atomic formulas represented by a
row". If one thinks of a relation's predicate as being an arbitrary
wff (it's usually thought of as being in natural language, or as being
wffs with natural language for ground terms) then the truth value of a
particular constituent atomic formula when the overall predicate is
true depends (like usual in predicate logic) on the connectives/
quantifiers. A row does not have a bunch of things ANDed together; the
propositions above are ANDed together (or more simply, just asserted).

After a delete a row it is no longer in the relation. So the database
is asserting the negation of the proposition that you get by
substituting its attribute values for attribute names in the
proposition.

> If some but not all of the
> atomic formulas represented by a row were true, but a row can't be in the
> table unless they're all true, then where is that positive information to be
> stored? Without anywhere to store it, the database is inconsistent.

Once again, a row does not in general assert the conjunction of wffs.

If you think otherwise please explain very clearly, hopefully with a
fully worked out simple example, because you won't be able to assume I
know what you are talking about, because what you have written is not
how the relational model works. From this message you should also be
able to work that example out "correctly".

philip

paul c

unread,

Nov 28, 2009, 12:06:40 AM11/28/09

to

com...@hotmail.com wrote:
> On Nov 26, 2:19 am, "Mr. Scott" <do_not_re...@noone.com> wrote:

...
...

>> The logical connective should be IFF rather than AND. The
>> difference is subtle, and may at first glance appear problematic since IFF
>> is true whenever none of its operands are true, but under the closed world
>> assumption, the only atomic formulas that are ever represented in the
>> database are those that are supposed to be true, so there is no harm in
>> choosing IFF over AND. The main reason I think the connective should be IFF
>> rather than AND involves deletes. While inserting a row effectively asserts
>> that all of the atomic formulas represented by the row are true

> ...

>
> , regardless
>> of the whether the logical connective is IFF or AND,

> ...

I believe IFF as well as AND can be expressed with NAND. So, what is the
argument?

The only thing I wonder about is why constraints must be truth-valued.
I'd rather they were allowed to have relation values other than dee and
dum. Seems the only way to allow defaults/mandatory tuples without
introducing some other concept.

Mr. Scott

unread,

Nov 28, 2009, 5:12:59 AM11/28/09

to

"paul c" <toledob...@oohay.ac> wrote in message
news:Ad2Qm.55460$PH1.49121@edtnps82...

> com...@hotmail.com wrote:
>> On Nov 26, 2:19 am, "Mr. Scott" <do_not_re...@noone.com> wrote:
> ...
> ...
>>> The logical connective should be IFF rather than AND. The
>>> difference is subtle, and may at first glance appear problematic since
>>> IFF
>>> is true whenever none of its operands are true, but under the closed
>>> world
>>> assumption, the only atomic formulas that are ever represented in the
>>> database are those that are supposed to be true, so there is no harm in
>>> choosing IFF over AND. The main reason I think the connective should be
>>> IFF
>>> rather than AND involves deletes. While inserting a row effectively
>>> asserts
>>> that all of the atomic formulas represented by the row are true
>> ...
>>
>> , regardless
>>> of the whether the logical connective is IFF or AND,
>> ...
>
> I believe IFF as well as AND can be expressed with NAND. So, what is the
> argument?

The issue is that denying a conjunction merely asserts that at least one of
the conjuncts is false, but not which. When the logical connective is IFF,
denial of any of the operands denies all of the operands, otherwise the
result would be inconsistent.

An example might help. In the typical table,

CTRS {COURSE, TEACHER, ROOM, STUDENT},

each row states that a particular COURSE is taught by a particular TEACHER
in a particular ROOM to a particular STUDENT.

Now, while it can be argued that there can't be a course without a teacher,
or that there can't be a course without a student, or that there can't be a
student without a teacher, the room exists independent of whether there is a
course, or a teacher or a student. It therefore follows that locating the
fact that 'there is a room <ROOM>' only in table CTRS is a problem because
then there could only be a room whenever there is at least one course and at
least one teacher and at least one student. When one inserts a row into an
empty CTRS, one effectively asserts

that 'there is a course <COURSE>,'
that 'there is a teacher <TEACHER>,'
that 'course <COURSE> is taught by teacher <TEACHER>,'
that 'there is a room <ROOM>,'
that 'course <COURSE> is held in room <ROOM>,'
that 'teacher <TEACHER> teaches in room <ROOM>,'
that 'course <COURSE> is taught by teacher <TEACHER> in room <ROOM>,'
that 'there is a student <STUDENT>,'
that 'course <COURSE> is taken by student <STUDENT>,'
that 'teacher <TEACHER> teaches student <STUDENT>,'
that 'course <COURSE> is taught by teacher <TEACHER> to student
<STUDENT>,'
that 'student <STUDENT> studies in room <ROOM>,'
that 'teacher <TEACHER> teaches student <STUDENT> in room <ROOM>,'
that 'student <STUDENT> takes course <COURSE> in room <ROOM>,' and
that 'course <COURSE> is taught by teacher <TEACHER> in room <ROOM> to
student <STUDENT>.'

When one deletes that only row, one effectively denies them, every one.
It's an all or nothing proposition. Either all of the atomic formulas
represented by the row are true, or none of them are. That is consistent
with the logical connective between those formulas being IFF rather than
AND.

But what if there is more than one row? The information content of a table
is the logical sum (disjunction) of the information represented by each row.
Isn't it true that

(P IFF Q) OR (P IFF R) = P IFF (Q OR R)?

As a result, denying (P IFF R) in this context doesn't necessarily deny P.

>
> The only thing I wonder about is why constraints must be truth-valued. I'd
> rather they were allowed to have relation values other than dee and dum.

But those can easily be transformed into into truth-valued constraints,
can't they?

paul c

unread,

Nov 28, 2009, 10:47:32 AM11/28/09

to

Mr. Scott wrote:
...

> An example might help. In the typical table,
>
> CTRS {COURSE, TEACHER, ROOM, STUDENT},
>
> each row states that a particular COURSE is taught by a particular TEACHER
> in a particular ROOM to a particular STUDENT.
>
> Now, while it can be argued that there can't be a course without a teacher,
> or that there can't be a course without a student, or that there can't be a
> student without a teacher, the room exists independent of whether there is a

> course, or a teacher or a student. ...

That's a good example of mysticism. Unless an application requirement
is given that rooms are independent in this way, one might just as
easily conclude that CTRS is the only base relation in the db. In that
case, the set of rooms must be a projection of CTRS. Without further
information, I'd have no choice but to conclude that second choice. In
my experience, the implementation of unstated requirements has been a
huge unnecessary cost in many db's.

paul c

unread,

Nov 28, 2009, 3:33:36 PM11/28/09

to

Mr. Scott wrote:
...

> But what if there is more than one row? The information content of a table
> is the logical sum (disjunction) of the information represented by each row.

...

The conventional view is that is that the information in a table is the
logical conjunction of the information represented by the rows in the
table. Just because the table is formed by a summing operation doesn't
change that.

...

> But those can easily be transformed into into truth-valued constraints,
> can't they?

> ...

Not easy if users aren't required to know default values. I don't see
why they should. Contrary to Dr. Strangelove, the whole point of
defaults is to avoid users having to know them!

Bob Badour

unread,

Nov 28, 2009, 4:56:36 PM11/28/09

to

paul c wrote:

> Mr. Scott wrote:
>
>> But what if there is more than one row? The information content of a
>> table is the logical sum (disjunction) of the information represented
>> by each row.
>

> The conventional view is that is that the information in a table is the
> logical conjunction of the information represented by the rows in the
> table. Just because the table is formed by a summing operation doesn't
> change that.

Ahem. Conjunction of the domains. A relation is the extension of a
predicate. That predicate can be represented as the disjunction of the
tuples.

>> But those can easily be transformed into into truth-valued
>> constraints, can't they?
>

> Not easy if users aren't required to know default values. I don't see
> why they should. Contrary to Dr. Strangelove, the whole point of
> defaults is to avoid users having to know them!

Defaults are not constraints.

--
is there something in it for them, like maybe bailouts, if they can
panic us into doing something politically to cover them?

November 19, 2007 - John S Bolton

http://tinyurl.com/y9e4vxh

Mr. Scott

unread,

Nov 28, 2009, 5:43:07 PM11/28/09

to

"paul c" <toledob...@oohay.ac> wrote in message

news:oCbQm.55478$PH1.3123@edtnps82...

> Mr. Scott wrote:
> ...
>> An example might help. In the typical table,
>>
>> CTRS {COURSE, TEACHER, ROOM, STUDENT},
>>
>> each row states that a particular COURSE is taught by a particular
>> TEACHER in a particular ROOM to a particular STUDENT.
>>
>> Now, while it can be argued that there can't be a course without a
>> teacher, or that there can't be a course without a student, or that there
>> can't be a student without a teacher, the room exists independent of
>> whether there is a course, or a teacher or a student. ...
>
> That's a good example of mysticism. Unless an application requirement is
> given that rooms are independent in this way,

What do you mean by mysticism? A room is a room even if it isn't being used
to hold classes. It's existence is therefore independent of the existence
of any courses, teachers, or students. That's just plain ordinary common
sense--the opposite of mysticism.

> one might just as easily conclude that CTRS is the only base relation in
> the db.

That is my point exactly: if CTRS is the only table in the db, then there
are delete anomalies due to the existence of a room being dependent upon the
existence of at least one each of courses, teachers and students, and
further that those delete anomalies are a direct consequence of the logical
connective that interconnects the atomic formulas that are represented by
each row. The point of my example was to emphasize that whenever a row
represents a non-atomic proposition, the logical connective between the
atomic formulas that the proposition is composed of is not AND but IFF.
Since each row exemplifies the table's predicate, that predicate must also
be composed of a collection of interconnected atomic formulas, and the
logical connective must be IFF rather than AND. It therefore doesn't make
sense to define a relvar's predicate as the logical /conjunction/ of all of
the constraints that mention it.

com...@hotmail.com

unread,

Nov 28, 2009, 6:45:59 PM11/28/09

to

On Nov 27, 7:32 pm, com...@hotmail.com wrote:
> The relationship between the two "predicates" is as follows. The
> designer specifies the relvar predicate so that a user updating the
> database can observe the world and figure out whether a given tuple
> makes it true and so that a user looking at the database can find out
> what is true of the world. Next the designer figures out all the
> possible tuples that could ever turn up in a relation because of the
> way the world can be and then writes the total relvar constraint so
> the dbms can tell the user they made an error if they ever try to put
> some other tuple into the relation.

The line

> possible tuples that could ever turn up in a relation because of the

is wrong; I should have written:
"possible values of the relation that could ever turn up because of
the"
That is, the designer communicates the set of. (They actually
communicate all the possible database states that could ever arise,
via the "total database constraint". But above I was just addressing
the term "total relvar constraint".)

philip

Mr. Scott

unread,

Nov 29, 2009, 3:31:25 AM11/29/09

to

<com...@hotmail.com> wrote in message
news:9a0f4cc3-ef8c-47b5...@13g2000prl.googlegroups.com...

<snip>

> I don't know what you mean by "the atomic formulas represented by a
> row". If one thinks of a relation's predicate as being an arbitrary
> wff (it's usually thought of as being in natural language, or as being
> wffs with natural language for ground terms) then the truth value of a
> particular constituent atomic formula when the overall predicate is
> true depends (like usual in predicate logic) on the connectives/
> quantifiers. A row does not have a bunch of things ANDed together; the
> propositions above are ANDed together (or more simply, just asserted).
>
> After a delete a row it is no longer in the relation. So the database
> is asserting the negation of the proposition that you get by
> substituting its attribute values for attribute names in the
> proposition.
>
>> If some but not all of the
>> atomic formulas represented by a row were true, but a row can't be in the
>> table unless they're all true, then where is that positive information to
>> be
>> stored? Without anywhere to store it, the database is inconsistent.
>
> Once again, a row does not in general assert the conjunction of wffs.
>
> If you think otherwise please explain very clearly, hopefully with a
> fully worked out simple example, because you won't be able to assume I
> know what you are talking about, because what you have written is not
> how the relational model works. From this message you should also be
> able to work that example out "correctly".
>
> philip

I am going to revisit the example I posted for paul.

In a typical table,

CTRS {COURSE, TEACHER, ROOM, STUDENT},

each row states that a particular COURSE is taught by a particular TEACHER
in a particular ROOM to a particular STUDENT.

Now, while it can be argued that there can't be a course without a teacher,
or that there can't be a course without a student, or that there can't be a
student without a teacher, the room exists independent of whether there is a

course, or a teacher or a student. It therefore follows that locating the
fact that 'there is a room <ROOM>' only in table CTRS is a problem because
then there could only be a room whenever there is at least one course and at
least one teacher and at least one student. When one inserts a row into an
empty CTRS, one effectively asserts

that 'there is a course <COURSE>,'
that 'there is a teacher <TEACHER>,'
that 'course <COURSE> is taught by teacher <TEACHER>,'
that 'there is a room <ROOM>,'
that 'course <COURSE> is held in room <ROOM>,'
that 'teacher <TEACHER> teaches in room <ROOM>,'
that 'course <COURSE> is taught by teacher <TEACHER> in room <ROOM>,'
that 'there is a student <STUDENT>,'
that 'course <COURSE> is taken by student <STUDENT>,'
that 'teacher <TEACHER> teaches student <STUDENT>,'
that 'course <COURSE> is taught by teacher <TEACHER> to student
<STUDENT>,'
that 'student <STUDENT> studies in room <ROOM>,'
that 'teacher <TEACHER> teaches student <STUDENT> in room <ROOM>,'
that 'student <STUDENT> takes course <COURSE> in room <ROOM>,' and
that 'course <COURSE> is taught by teacher <TEACHER> in room <ROOM> to
student <STUDENT>.'

If you disagree that all of these assertions are a consequence of inserting
a single row, then how is it that there can be an answer to the query, "is
course <COURSE> held in room <ROOM>?" Unless there is the possibility that
'course <COURSE> is held in room <ROOM>,' there can be no answer (at least
not a yes or a no) to the query.

When one deletes that only row, one effectively denies all of those
assertions, every one.
It's an all or nothing proposition. Either all of the atomic formulas

represented by the row are true, or none of them are. That is consistent
with the logical connective between those formulas being IFF rather than

AND. If the logical connective were AND, then denying just some of the
above assertions should be allowable, but since it is the last row being
deleted, there would no longer be anywhere for the positive assertions that
remain, leaving the database inconsistent.

That should explain how things work when the first row is being inserted or
when the last row is being deleted, but what if there is more than one row?

The information content of a table
is the logical sum (disjunction) of the information represented by each row.

Isn't it true that

(P IFF Q) OR (P IFF R) = P IFF (Q OR R)?

As a result, denying (P IFF R) in this context doesn't deny P. It follows
that if there were more than one course taught in room <ROOM>, deleting only
one row wouldn't deny that 'there is a room <ROOM>.'

vldm10

unread,

Nov 29, 2009, 12:03:34 PM11/29/09

to

On Nov 27, 5:13 am, Clifford Heath <no.s...@please.net> wrote:
> vldm10 wrote:
> > Regarding DK/NF and 6NF you can see my solution atwww.dbdesign11.com.

> > There I introduce “Simple Form” an effective solution which decomposes

> > any relation to Binary Relations.
>
> Your approach seems very similar in general to NIAM
> and other fact-oriented modeling approaches like ORM2.
> They are built on "Elementary Form", which in ORM2 can
> include fact types that are ternary and higher (though
> these are always trivial to binarise, the higher-order
> fact types are useful in modeling because they reflect
> natural verbalisations).

> The main difference is that in fact orientation, the
> model is expected always to be *constructed* in elementary
> form. For efficiency of storage and access, uniqueness
> constraints allow automatic and invisible aggregation
> into non-elementary structures. This means you can model
> in elementary form, and get a correct and efficient schema
> which holds the same facts to use with your SQL DBMS. This
> is the implementation principle behind my "Constellation
> Query Language" as well.

No, my approach is very different from NIAM/ORM. My approach is a kind
of ER, while NIAM/ORM isn’t. I have entities and relationships and a
lot of semantic modeling.
I don’t have any of basic things from NIAM/ORM, such as object, roles
and fact types.
ORM uses concepts and I also use concepts. In ORM the concept is not
defined although it is basic thing. I defined the term concept with
general definition and I also defined the following particular
concepts: properties, entities, relationships and states. When we
speak about basic facts, I must say that I haven’t noticed a
definition of facts in ORM. So, for example how do you know that facts
are types if you haven’t defined facts?
As you wrote, I use attributes a lot and this is also a big difference
between my approach and ORM. I believe that more than 90% of the data
in my DBs is related to attributes.
All of the above mentioned are fundamental differences between my
approach and the one in NIAM/ORM.

> You use the term "attribute" a lot. In fact orientation,
> there are no attributes, they are just fact types that
> encompass a functional dependency between objects. To drop
> the notion of "attribute" is a major win for a number of
> reasons, including that it isn't a clearly-defined concept
> in natural usage; vis the conflict between relational and
> object-oriented aggregation; and also the fact that
> "attribute migration" is a major driver of schema evolution,
> hence project scope inflation and failure.

However, there is one thing I believe you haven’t noticed. It is about
a sentence and its corresponding thought. No doubt a sentence denotes
its truth-value. But sentence is not a possessor of truth-value. So a
sentence is not a fact, rather a fact is related to the its
corresponding thought.
This is the reason why I introduced two very different things: a fact
about an entity and the corresponding factual sentence.
I also determined the construction of a fact about an entity. It
involves the construction of Binary Concept of an entity, the
construction of attribute’s abstraction (see 3.3.3 in my paper),
etc.
And in my opinion this is very different from ORM and from any other
DB models and designs.

> In any case, before you go claiming your work as "novel",
> it'd be good if you did some reading on fact orientation
> and see whether others haven't been there before you, like,
> for example, more than twenty years ago ;-). This isn't a
> criticism actually; I haven't looked in sufficient depth
> at your work to decide whether you've done something new.
> But a lot of it does seem very familiar...
>
> > Regarding time my solution is event oriented.
>
> I like this view - it accords with my thoughts on the
> quantization of time where the general principle that
> "time is just G*d's way of keeping everything from happening
> at once" ;-).
>
> Terry Halpin (creator of ORM2) has a recent article series
> on temporal modeling published at brcommunity.com, if
> you're interested.

Regarding “temporal” DBs, I gave a solution. You can see it on my web
site. Note that the solution is more general than the field of
“temporal” DBs. I put the main ideas on my web page in 2005 and have
completely integrated them in this paper. I am fairly certain that my
solution for the first time solves “temporal” and historical DBs and
that is one of the things which is novel.

>
> Clifford Heath, Data Constellation,http://dataconstellation.com

> Agile Information Management and Design.

Vladimir Odrljin

Gene Wirchenko

unread,

Nov 30, 2009, 9:23:40 PM11/30/09

to

On Sat, 28 Nov 2009 05:06:40 GMT, paul c <toledob...@oohay.ac>
wrote:

[snip]

>The only thing I wonder about is why constraints must be truth-valued.
>I'd rather they were allowed to have relation values other than dee and
>dum. Seems the only way to allow defaults/mandatory tuples without
>introducing some other concept.

Is the given constraint true of the data? Either it is, or it is
not. That is why constraints are truth-valued.

What other values could you use? What would having a general
relation value as a result tell you about the constraint being
followed or not?

Sincerely,

Gene Wirchenko

paul c

unread,

Dec 1, 2009, 7:12:38 PM12/1/09

to

Bob Badour wrote:
...

>> The conventional view is that is that the information in a table is
>> the logical conjunction of the information represented by the rows in
>> the table. Just because the table is formed by a summing operation
>> doesn't change that.
>
> Ahem. Conjunction of the domains. A relation is the extension of a
> predicate. That predicate can be represented as the disjunction of the
> tuples.

> ...

Yes, yes and yes, thanks. Maybe I was a bit clumsy. Still, once people
start talking about propositions, I have to note that any two tuples in
a relation are conventionally treated as if the ones they represent are
both true.

Sampo Syreeni

unread,

Dec 4, 2009, 2:35:27 PM12/4/09

to

On Nov 21, 6:24 pm, Roy Hann <specia...@processed.almost.meat> wrote:

> The most persistent (and most common) reason people incorporate nullable
> columns into designs is because they have a misplaced desire to minimize
> the number of tables in the design, and think that conflating multiple fact
> types in one table is clever, efficient, and harmless.

(Apologies for the delay.) That is absolutely correct. That is the
general mistake, and my particular example (validity time spans) was
just one, if recurring, instance of the more general class. Here, too,
we refuse to consider relations with and without an end date two
separate types of objects, and then conflate them into one relation
with nulls allowed in the end date.

I think there are two aspects to this problem. The first and obvious
one is the psychological: people do have immense problems getting a
hold on a data model with lots of separate entities. E.g. think of SAP
or the foundation schemata Teradata peddles, with their thousands of
relation types. Trying to sort through something like that, e.g. to
update it, is a real pain and you can never be quite certain you've
actually taken everything necessary into consideration.

But then there's the second, more subtle aspect: it is quite possible
that the problem is in sparse metadata and/or in the substandard tools/
interfaces used to access it, so that it could in fact be solved with
the right set of tools and methodology/process. I mean, envision the
metadata heaven where we have all of the formal semantics of our data
model neatly encoded within the database catalog, and further, we have
tools which automatically, graphically, and in a procedure oriented
way guide our way through all of the possible dependencies,
remembering for us which ones we've already covered. In that case
there is no possibility that you could have forgotten something once
you're done. The hassle in learning the necessary semantic
information, and *only* the necessary information, is minimal, because
all of the formal aspects are recorded within the database and your
tools show it to you on the fly, plus any linkages to the outside,
semantic world are also documented in a human readable and easily
accessible form.

In this ideal world, when you're about to handle one of those thousand
relations, you just click a column in the database client and out pops
a representation of the full generalization hierarchy of this
particular type. Want a semantic description of your entity? One click
more. Want the constraints? Another click, and voilà, it gives you
every single one of them in a nice chart alongside the top and bottom
parts of the type lattice, so that with one click more, you can get
the human readable version which maps the constraint to the external
model/business process equivalent that was used to derive the formal
constraint in the first place. Want to see where the particular domain
you're handling is used? One click. Want to see the domain definition
itself? Click.

Want to find a suitable, existing domain for your new relation,
beginning with a blank slate? Your database/client of course easily
handles a mixture of both formal and free form queries, at the level
of specificity that rivals Google, so that when you put in something
like "person company (cardinality>10) (last access within two years)
(list all relations with conflicting keys)", in a couple of seconds
you can find every single existing relation in the system which might
have something to do with your problem domain, in a well-ordered query
result form. Plus of course you should be able to dig in from the
perspective of a humanly composed ontology, or three, of everything
that is in the system, ODP style.

That sort of thing is of course a pipe dream at the moment, but then
quite a lot could already be done to organize database schemata using
existing technology. I mean, even a freely manipulable selection list
of "my relations" which retains a history of every relation you've
touched, with metadata on what you have done to them, would be a real
asset. Or when we talk about this typical case of relations related by
contraction of nulls, a UI side tool which lets us work with views of
otherwise identical relations, some of which contain "null, value not
applicable" values while updating multiple underlying tables/views,
would take 95% out of the steam to not work with more than one
relation at the same time.

So I think this thing is mostly about UI design and formal conceptual
organization of the data model, and not about any intrinsic cognitive
limitation (which do exist as well: I for one don't think there is a
single person who could reliably spot an irreducible, ten component
decomposition into PJ/NF).
--
Sampo

com...@hotmail.com

unread,

Dec 4, 2009, 7:26:42 PM12/4/09

to

On Nov 29, 12:31 am, "Mr. Scott" <do_not_re...@noone.com> wrote:
> I am going to revisit the example I posted for paul.

Thank you for giving an example. Especially when my stated intent is
to show you misunderstand something.

Your explanation is still unclear.

The user looks at the world and the relation variable's predicate(s)
and combines them to determine what rows to put in and what rows to
leave out. Similarly, they look at the rows present and absent in a
relation variable and combine them with the predicate(s) to produce
statements about the world. Please tell me what the designer gives
with a relation variable (a predicate, or set of predicates?) and the
correspondence between an overall proposition about the world and the
rows in and the rows not in that variable (now and/or previously).

> CTRS {COURSE, TEACHER, ROOM, STUDENT},

> each row states that a particular COURSE is taught by a particular TEACHER
> in a particular ROOM to a particular STUDENT.

> one effectively asserts

> that 'there is a course <COURSE>,'
> that 'there is a teacher <TEACHER>,'

> [...]

The designer gives a relation variable. It's not clear what else you
claim they give. A single predicate per variable? A set of predicates
per variable? Do you mean that (some function of) this list is a
clarification of the preceding (parameterized) statement? If so,
exactly what predicate does the user form from the list that is
supposed to be equivalent to the statement? Or do you mean that the
designer specifies the statement and you think the list is derivable
from the statement by the system?

I can't tell when you write "when one inserts a row" or "inserting a
single row", whether you just mean mean "when a given row is in a
relation". Or whether by "inserting a single row" or "deletes that
only row" you specifically mean the old value of a relation variable
is also involved in mapping to instantiated predicates.

> Either all of the atomic formulas
> represented by the row are true, or none of them are. That is consistent
> with the logical connective between those formulas being IFF rather than
> AND.

First you write "one effectively asserts" the things in the list
(whatever "effectively" means); asserting a number of things is the
same as asserting their conjunction. But then you write "the logical
connective between those formulas being IFF", so I guess you mean the
IFF of the things in the list is asserted. So you're not being clear.

And no, it's not consistent with the logical connective between those
formulas being IFF. For example, although
a1 IFF a2 IFF a3
is true when all ai are true and it is false when all ai are false, it
is also true when a1 and a2 are false and a3 is true. So the above
expression does not assert that they're all true or all false. For
that you would want something like
(a1 AND a2 AND a3) OR ((NOT A1) AND (NOT a2) AND (NOT a3))
ie (NOT (a1 OR a2 OR a3)) OR (a1 AND a2 AND a3)
ie (a1 OR a2 OR a3) IMPLIES (a1 AND a2 AND a3)
ie the whole thing is true when all are false or all are true but it's
false when there's a mix. (Not that I think you ever want the ai all
individually false.)

On the other hand I can imagine that you are thinking that for each
syntactically valid row it is in the relation iff all the
corresponding instantiated assertions in the list are true. But that's
not what you have said. (In fact you've said "all of the atomic
formulas represented by the row are true, or none of them are".) And
anyway that's not a relation's statement about the world; it is a
statement about how the value of a relation variable maps to its
statement about the world.

> If

> The information content of a table
> is the logical sum (disjunction) of the information represented by each row.

This is clearer, but as I said I can't figure out what you think is
asserted by a row in the table (or by a row not in 'the table) to be
ORed together. (Note that it makes sense to OR together equalities
like COURSE='Math' as the characteristic predicate of a set, but to
AND together things like "there is a course 'Math'" as a statement or
assertion about the world.) But I also don't know whether you mean
this applies to all relation variable values, or only the values of
ones that that are empty, or just became empty or nonempty, or didn't,
or what.

So please start out with what the designer gives, then tell me clearly
how to form the proposition is that is "the information represented by
each row" that is syntactically valid, then what the proposition is
that is "the information content of a table". And whether it depends
on the old state as well as the new. And tell me whether I am using
that predicate as a statement about the world or as the characteristic
predicate of some set. In other words, be clear. Then I can discuss
your approach (ie show it doesn't do what you think it does).

It's also pointless for me to address your justifications for
individual steps if I don't understand the overall process.

philip

Mr. Scott

unread,

Dec 10, 2009, 9:12:26 PM12/10/09

to

<com...@hotmail.com> wrote in message
news:35bb0b6c-af79-4c41...@y32g2000prd.googlegroups.com...

Thank you for pointing that out. I really wasn't very clear. By
interdependent I mean that the connective between any two atomic formulas
(atoms) is a biconditional, not that the only connectives are
biconditionals, so for three atoms, it's not

a1 iff a2 iff a3,

but rather,

(a1 iff a2) and (a1 iff a3) and (a2 iff a3)

which is true whenever all ai are true or all ai are false and false
otherwise.

For four atoms, it's not

a1 iff a2 iff a3 iff a4,

but rather,

(a1 iff a2) and (a1 iff a3) and (a1 iff a4) and
(a2 iff a3) and (a2 iff a4) and (a3 iff a4)

which is true whenever all ai are true or all ai are false and false
otherwise.

> For
> that you would want something like
> (a1 AND a2 AND a3) OR ((NOT A1) AND (NOT a2) AND (NOT a3))
> ie (NOT (a1 OR a2 OR a3)) OR (a1 AND a2 AND a3)
> ie (a1 OR a2 OR a3) IMPLIES (a1 AND a2 AND a3)
> ie the whole thing is true when all are false or all are true but it's
> false when there's a mix. (Not that I think you ever want the ai all
> individually false.)
>
> On the other hand I can imagine that you are thinking that for each
> syntactically valid row it is in the relation iff all the
> corresponding instantiated assertions in the list are true. But that's
> not what you have said. (In fact you've said "all of the atomic
> formulas represented by the row are true, or none of them are".) And
> anyway that's not a relation's statement about the world; it is a
> statement about how the value of a relation variable maps to its
> statement about the world.

Constraints specify what can be represented in a table. They don't say
anything about what actually is the case other than that it can be the case.
What actually is the case is a matter of interpretation.

>
>> If
>> The information content of a table
>> is the logical sum (disjunction) of the information represented by each
>> row.
>
> This is clearer, but as I said I can't figure out what you think is
> asserted by a row in the table (or by a row not in 'the table) to be
> ORed together. (Note that it makes sense to OR together equalities
> like COURSE='Math' as the characteristic predicate of a set, but to
> AND together things like "there is a course 'Math'" as a statement or
> assertion about the world.) But I also don't know whether you mean
> this applies to all relation variable values, or only the values of
> ones that that are empty, or just became empty or nonempty, or didn't,
> or what.
>

Consider a typical row for CTRS,

{COURSE:Physics, TEACHER:Brown, ROOM:145, STUDENT:Jones}

This row doesn't "state" anything. That it is in the table is evidence that
someone with authority asserted as fact that Brown teaches Jones Physics in
room 145, and that it is still in the table is evidence that noone with
authority has since denied it. It is reasonable, therefore, to treat as
fact that Brown teaches Jones Physics in room 145. From this fact a number
(14 to be exact) of other facts can be inferred. That's 15 distinct facts
in total represented by a single row. Each row of CTRS isn't just a single
fact, but a family of interdependent facts. Brown can't teach Jones Physics
in room 145 if he doesn't teach Physics or if there is no room 145, and
under the assumptions that CTRS is the only table in the database and that
there is only a single row that mentions Physics, Brown, 145 or Jones, there
can't be a student called Jones unless he is being taught Physics by Brown
in room 145. The addition of a second row, for example,

{COURSE:Calculus, TEACHER:Green, ROOM:154, STUDENT:Jones},

weakens that condition so that there can't be a student called Jones unless
he is being taught Physics by Brown in room 145 or he is being taught
Calculus by Green in room 154. The addition of a separate table, S
{STUDENT}, for example, also weakens the condition because then there could
be a student called Jones even if he isn't taking any courses from any
teachers in any rooms. It doesn't change the form of the assertion
represented by each row, though, where the logical connective between any
two atoms is IFF, yielding a non-atomic formula, but the logical connective
between any two of those formulas is AND.

The logical connective that separates the facts represented by one row from
the facts represented by another is OR (unless there is a self-referencing
inclusion dependency), as is the logical connective that separates the facts
represented in one table from the facts represented in another (unless there
is an inclusion dependency).

com...@hotmail.com

unread,

Dec 11, 2009, 3:07:30 AM12/11/09

to

On Dec 10, 6:12 pm, "Mr. Scott" <do_not_re...@noone.com> wrote:

Thanks, but this is no clearer.
philip

Kevin Kirkpatrick

unread,

Dec 11, 2009, 4:29:26 PM12/11/09

to

> > philip- Hide quoted text -
>
> - Show quoted text -- Hide quoted text -
>
> - Show quoted text -- Hide quoted text -
>
> - Show quoted text -- Hide quoted text -
>
> - Show quoted text -

Hello Mr. Scott,

If I'm understanding your explanation properly, in a database with
only table:

BIG_US_CITIES {CITY_NAME, STATE_CODE}

(that is, lacking table STATES {STATE_CODE}) one must logically infer,
from the CWA, that states with no big cities, e.g. 'RI', do not exist?

Gene Wirchenko

unread,

Dec 11, 2009, 4:36:58 PM12/11/09

to

On Fri, 11 Dec 2009 13:29:26 -0800 (PST), Kevin Kirkpatrick
<kvnkr...@gmail.com> wrote:

[snip]

>If I'm understanding your explanation properly, in a database with
>only table:
>
>BIG_US_CITIES {CITY_NAME, STATE_CODE}
>
>(that is, lacking table STATES {STATE_CODE}) one must logically infer,
>from the CWA, that states with no big cities, e.g. 'RI', do not exist?

Effectively yes. You can only go by what is in the database. If
that is not sufficient for your purposes, then the database should be
redesigned. If it is sufficient, well and good. The idea of a
database is not to model all that exists but only that which you are
interested in.

Sincerely,

Gene Wirchenko

Mr. Scott

unread,

Dec 11, 2009, 11:15:17 PM12/11/09

to

"Kevin Kirkpatrick" <kvnkr...@gmail.com> wrote in message
news:4be6ab47-e654-4925...@g12g2000yqa.googlegroups.com...

> On Dec 10, 8:12 pm, "Mr. Scott" <do_not_re...@noone.com> wrote:

<snip>
> Hello Mr. Scott,

> If I'm understanding your explanation properly, in a database with
> only table:

> BIG_US_CITIES {CITY_NAME, STATE_CODE}

> (that is, lacking table STATES {STATE_CODE}) one must logically infer,
> from the CWA, that states with no big cities, e.g. 'RI', do not exist?

The inference is valid, but it is not due to the CWA. If the only place in
the database for assertions like 'there is a state called <STATE_CODE>,' is
in BIG_US_CITIES, which houses facts like 'There is a big city named
<CITY_NAME> in a state called <STATE_CODE>,' then one can logically conclude
that there aren't any states with no big cities. Even under the OWA, there
can't be any states with no big cities. Under the CWA, all and only states
with big cities are represented in the database, and under the OWA, only but
not necessarily all states with big cities are represented in the database.
In both cases, only states with big cities can be represented in the
database. There isn't anywhere to record the assertion that there is a
state with no big cities, so there can't be any. The table STATES
{STATE_CODE} provides a place to record assertions like 'there is a state
called <STATE_CODE>' independent of whether there is also a big city.

paul c

unread,

Dec 12, 2009, 2:57:23 PM12/12/09

to

Mr. Scott wrote:
...

>
> In a typical table,
>
> CTRS {COURSE, TEACHER, ROOM, STUDENT},
>
> each row states that a particular COURSE is taught by a particular TEACHER
> in a particular ROOM to a particular STUDENT.
>
> Now, while it can be argued that there can't be a course without a teacher,
> or that there can't be a course without a student, or that there can't be a
> student without a teacher, the room exists independent of whether there is a
> course, or a teacher or a student. It therefore follows that locating the
> fact that 'there is a room <ROOM>' only in table CTRS is a problem because
> then there could only be a room whenever there is at least one course and at
> least one teacher and at least one student. When one inserts a row into an
> empty CTRS, one effectively asserts
>

> ...
> that 'there is a room <ROOM>,' ...

This presumes that the predicate for a table R { ROOM } is the same as
the predicate of CTRS { ROOM } but from the dbms perspective it's patent
they can't be, since R and CTRS don't even have the same heading. If
CTRS is not base, maybe you can say such but if it is base, one can only
assert that there exists some teacher, student and course such that room
<ROOM> is involved.

>
> If you disagree that all of these assertions are a consequence of inserting
> a single row, then how is it that there can be an answer to the query, "is
> course <COURSE> held in room <ROOM>?" Unless there is the possibility that
> 'course <COURSE> is held in room <ROOM>,' there can be no answer (at least
> not a yes or a no) to the query.

> ...

If CTRS is base, that question is strictly not answerable. I suspect
that the man in the street might think such a query is possible (not to
mention many db designers) but to be precise I think it's important to
distinguish what is from what could be, otherwise we lapse into
mysticism. A query that is answerable from CTRS is "are there some
teachers and students such that course <COURSE> and room <ROOM> are
involved?".

Just because SQL and the like might not be expressive enough to prevent
the query doesn't mean the most basic condition, ie., the definition of
CTRS, can be ignored. A language that might reflect this might have two
sets of operands for projection instead of one.

(I also realize that the usual explanations of projection operators
don't make this clear. Not that table/relvar names carry any
significance other than as a device to segregate rows/tuples according
to predicate but it might be more suggestive of the actual situation if
the name of CTRS were changed to 'EXPELLED'.)

Mr. Scott

unread,

Dec 12, 2009, 5:37:49 PM12/12/09

to

"paul c" <toledob...@oohay.ac> wrote in message

news:DASUm.55772$Db2.7870@edtnps83...

There is no table R. The only place in the database to record assertions
that there is a particular room is in CTRS, but one cannot assert that there
is a particular room without also asserting that there is a particular
teacher, that there is a particular student, and that said teacher is
teaching a particular course to said student in that room.

>>
>> If you disagree that all of these assertions are a consequence of
>> inserting a single row, then how is it that there can be an answer to the
>> query, "is course <COURSE> held in room <ROOM>?" Unless there is the
>> possibility that 'course <COURSE> is held in room <ROOM>,' there can be
>> no answer (at least not a yes or a no) to the query.
>> ...
>
> If CTRS is base, that question is strictly not answerable.

Yes, it is answerable. The fact that a particular teacher is teaching a
particular student a particular course in a particular room implies the fact
that that particular course is held in that particular room.

> I suspect that the man in the street might think such a query is possible
> (not to mention many db designers) but to be precise I think it's
> important to distinguish what is from what could be, otherwise we lapse
> into mysticism. A query that is answerable from CTRS is "are there some
> teachers and students such that course <COURSE> and room <ROOM> are
> involved?".

I think you're assuming that it is possible for there to be courses in rooms
even if there aren't any students or teachers, but that hasn't been
established, and in fact since there is no place in the database for such
assertions, there cannot be courses in rooms without students or teachers.
The queries that are answerable are therefore not limited to just those that
involve all four attributes.

If there were a place in the database for such assertions, such as a table

CR {COURSE,ROOM}

Then what is in the database would consist of the logical sum of all courses
in rooms without students or teachers and all courses in rooms with students
and teachers. The query "Is course <COURSE> held in room <ROOM>?" would
involve both CR and CTRS, unless there is an inclusion dependency from
CTRS[COURSE,ROOM] to CR[COURSE,ROOM], in which case the query could be
answered without involving CTRS.

>
> Just because SQL and the like might not be expressive enough to prevent
> the query doesn't mean the most basic condition, ie., the definition of
> CTRS, can be ignored. A language that might reflect this might have two
> sets of operands for projection instead of one.
>
> (I also realize that the usual explanations of projection operators don't
> make this clear. Not that table/relvar names carry any significance other
> than as a device to segregate rows/tuples according to predicate but it
> might be more suggestive of the actual situation if the name of CTRS were
> changed to 'EXPELLED'.)
>

Table names are significant. There can be more than one table with the same
set of columns. The closest logical analog to a table name is a predicate
symbol. Predicate symbols are significant.

P(X) may be true for a given X even if Q(X) is false for the same X.

paul c

unread,

Dec 13, 2009, 2:10:38 PM12/13/09

to

Perhaps the most remarkably bald of these assertions is that a
'particular ROOM' has 'independent' 'existence' when a 'particular'
STUDENT or TEACHER or COURSE doesn't. I've never even liked the word
'exists'... 'forall' is okay but 'is present' would help keep one's eye
on the ball. To avoid the chronic mystical persuasion, basically all we
really need to be concerned with is the presence of symbols.

com...@hotmail.com

unread,

Dec 13, 2009, 2:16:03 PM12/13/09

to

On Dec 11, 1:29 pm, Kevin Kirkpatrick <kvnkrkpt...@gmail.com> wrote:
> Hello Mr. Scott,
> If I'm understanding your explanation properly, in a database with
> only table:
> BIG_US_CITIES {CITY_NAME, STATE_CODE}
> (that is, lacking table STATES {STATE_CODE}) one must logically infer,
> from the CWA, that states with no big cities, e.g. 'RI', do not exist?

On Dec 11, 1:36 pm, Gene Wirchenko <ge...@ocis.net> wrote:
> Effectively yes.

On Dec 11, 8:15 pm, "Mr. Scott" <do_not_re...@noone.com> wrote:
> There isn't anywhere to record the assertion that there is a
> state with no big cities, so there can't be any.

It is more reasonable to say that the database is silent about the
existence of states and cities. But even if it did, the inference that
some state doesn't exist is wrong.

If the predicate is "the city named city_name in state state_code is
big", then what you know is that for every tuple <c, s> in
big_us_cities "city named c in state s is big", and that (by the cwa)
for every tuple <c, s> typed by but not in in big_us_cities "it is not
the case that: the city named c in state s is big". (Maybe there is
no such named city anywhere; or maybe there's one in one or more in
other states; and/or maybe they're just not big; or there's no such
state; or that city in s is small; etc.) (If it were the case, the
tuple would have been in the relation. The use of this interpretation
of relations, which is done in a relational dbms and the relational
algebra needs in order to answer queries, is "assuming a closed
world".) This particular database (ie relation variable plus
predicate) has nothing to say about the existence of states or cities.
There is no way to write a relation expression whose predicate
involves existence starting from only the given predicate.

Now, if the user is allowed to assume (which they would using the
everyday meaning of the predicate above) that "exists c: the city
named c in state s is big" implies "there is a state named s" then
they could reason from <c,s> being in big_us_cities that "there is a
state named s". But from <c,s> not in big_us_cities they still
couldn't infer that "s is not a state". (Maybe there's just no such
city; or it's small in state s; etc.)

Note that the database has no idea what the predicate is, let alone
what "big" means or what a "state" is or the consequences (like
"existence") of there being or not being a big city named c in state
s. With this relation and predicate you can write a query that tells
you whether (ie its predicate is) for some city c "there exists an s
such that the city named c in state s is big". But there is no query
that answers "is there a state s". (But I told you how to derive it).

This should not be surprising since if we change the predicate for
big_us_cities to "the number of characters in string city_name is not
the age of any dog whose tag says state_id" we wouldn't expect any
query to give a different answer for any given value of big_us_cities.
Regarding silence on existence, we also wouldn't expect any such
change if we do add explicit information about existence.

Also regarding non-existence: Suppose existence is made explicit by
the predicate being something like "state state_code exists and city
city_name exists in that state and that city is big". Then when <c, s>
is in big_us_cities you know "state s exists" because it follows from
"state s exists and ...". But if <c,s> is not in big_us_cities you
still don't know which (one or more) of the conjuncts is false to make
the whole thing false. You only know the whole predicate is fakse
using <c,s>. So you don't know that "state s doesn't exist".

philip

com...@hotmail.com

unread,

Dec 14, 2009, 2:33:33 PM12/14/09

to

On Dec 4, 4:26 pm, com...@hotmail.com wrote:
> So please start out with what the designer gives, then tell me clearly
> how to form the proposition is that is "the information represented by
> each row" that is syntactically valid, then what the proposition is
> that is "the information content of a table".

On Dec 11, 12:07 am, com...@hotmail.com wrote:
> Thanks, but this is no clearer.

Mr. Scott,

I received an early xmas gift from my database designer: a relational
dbms. Its database has relation variables vr{A,B,C,D} and vs{A,X} with
respective predicates pr(A,B,C,D) and ps(A,X). Observing the world I
can determine for a particular tuple whether a given predicate is true
with corresponding attributes/parameters substituted. All attributes
are of the same type.

At some point when the database reflects the world:

If pr(a,b,c,d) is true, what does this imply about the tuples in vr?
If pr(a,b,c,d) is false, what does this imply about the tuples in vr?
If a tuple <a,b,c,d> is in vr, what does this imply about the world,
in terms of pr?
If a tuple <a,b,c,d> is not in vr, what does this imply about the
world, in terms of pr?
What does vr imply about the world?
What does the database imply about the world?

What query (relation expression) returns:
The set of tuples <a,b,c,d> for which pr(a,b,c,d)?
The set of tuples <b,c,d> for which there exists an A such that pr
(A,b,c,d)?
The set of tuples <a,b,c,d,x> for which pr(a,b,c,d) and ps(a,x)?
The set of tuples <a,b,c,d> for which pr(a,b,c,d) and not ps
(a',b',c',d')?

Each of the following queries returns a relation. For each query, what
does the returned value imply about the world?
vr
vr project all but {A}
vr join vs
vr minus relation{A a',B b',C c',D d'}

Maybe you don't consider these to be appropriate questions. If so
please give precise rules mapping between the world and the database
state and between queries and predicates.

philip

Mr. Scott

unread,

Dec 14, 2009, 7:39:15 PM12/14/09

to

<com...@hotmail.com> wrote in message
news:f8fb0e4b-100a-4466...@2g2000prl.googlegroups.com...

> On Dec 11, 1:29 pm, Kevin Kirkpatrick <kvnkrkpt...@gmail.com> wrote:
>> Hello Mr. Scott,
>> If I'm understanding your explanation properly, in a database with
>> only table:
>> BIG_US_CITIES {CITY_NAME, STATE_CODE}
>> (that is, lacking table STATES {STATE_CODE}) one must logically infer,
>> from the CWA, that states with no big cities, e.g. 'RI', do not exist?
>
> On Dec 11, 1:36 pm, Gene Wirchenko <ge...@ocis.net> wrote:
>> Effectively yes.
>
> On Dec 11, 8:15 pm, "Mr. Scott" <do_not_re...@noone.com> wrote:
>> There isn't anywhere to record the assertion that there is a
>> state with no big cities, so there can't be any.
>
> It is more reasonable to say that the database is silent about the
> existence of states and cities. But even if it did, the inference that
> some state doesn't exist is wrong.

No, it isn't wrong. An assertion that there is a state with no big cities
is inconsistent with the definition of the database: it cannot be true.

<snip>

Mr. Scott

unread,

Dec 15, 2009, 5:13:00 AM12/15/09

to

> <com...@hotmail.com> wrote in message
> news:d84d3f25-b398-4c35...@o9g2000prg.googlegroups.com...

> On Dec 4, 4:26 pm, com...@hotmail.com wrote:
> > So please start out with what the designer gives, then tell me clearly
> > how to form the proposition is that is "the information represented by
> > each row" that is syntactically valid, then what the proposition is
> > that is "the information content of a table".
>
> On Dec 11, 12:07 am, com...@hotmail.com wrote:
> > Thanks, but this is no clearer.
>
> Mr. Scott,
>
> I received an early xmas gift from my database designer: a relational
> dbms. Its database has relation variables vr{A,B,C,D} and vs{A,X} with
> respective predicates pr(A,B,C,D) and ps(A,X). Observing the world I
> can determine for a particular tuple whether a given predicate is true
> with corresponding attributes/parameters substituted. All attributes
> are of the same type.
>
> At some point when the database reflects the world:
>
> If pr(a,b,c,d) is true, what does this imply about the tuples in vr?

the fact that there is an (a,b,c,d) such that pr (a,b,c,d) is represented
in the database as the tuple {A:a,B:b,C:c,D:d} in vr.

> If pr(a,b,c,d) is false, what does this imply about the tuples in vr?

the fact that there is not an (a,b,c,d) such that pr (a,b,c,d) is
represented
in the database as the absence of the tuple {A:a,B:b,C:c,D:d} in vr.

> If a tuple <a,b,c,d> is in vr, what does this imply about the world,
> in terms of pr?

that there is an (a,b,c,d) such that pr (a,b,c,d).

> If a tuple <a,b,c,d> is not in vr, what does this imply about the
> world, in terms of pr?

that there is not an (a,b,c,d) such that pr (a,b,c,d).

> What does vr imply about the world?

that there can be an (A,B,C,D) such that pr (A,B,C,D)

> What does the database imply about the world?

that there can be an (A,B,C,D) such that pr (A,B,C,D) OR there can be an
(A,X) such that ps (A,X)

(you didn't specify whether there is an inclusion dependency from vr[A] to
vs[A], so the logical connective is OR).

>
> What query (relation expression) returns:
> The set of tuples <a,b,c,d> for which pr(a,b,c,d)?
> The set of tuples <b,c,d> for which there exists an A such that pr
> (A,b,c,d)?
> The set of tuples <a,b,c,d,x> for which pr(a,b,c,d) and ps(a,x)?
> The set of tuples <a,b,c,d> for which pr(a,b,c,d) and not ps
> (a',b',c',d')?

You are being inconsistent. Is <a,b,c,d> a tuple or a set of tuples?

>
> Each of the following queries returns a relation. For each query, what
> does the returned value imply about the world?
> vr

forall A forall B forall C forall D,
which (A,B,C,D) are such that pr (A,B,C,D)?

> vr project all but {A}

This is really beyond the scope of a single newsgroup post, but,

assuming that A,B,C,D are free in pr,

pr (A,B,C,D) IFF (((pr A) B) C) D IFF (pr A) (B,C,D)

so vr project all but {A} is

forall A forall B forall C forall D,
which (B,C,D) are such that (pr A) (B,C,D)?

> vr join vs

forall A forall B forall C forall D forall X,
which (A,B,C,D,Z) are such that pr(A,B,C,D) and ps(A,X)?

> vr minus relation{A a',B b',C c',D d'}

forall A forall B forall C forall D,
which (A,B,C,D) are such that pr (A,B,C,D)
and not (A = a' and B = b' and C = c' and D = d')?

Roy Hann

unread,

Dec 15, 2009, 5:34:07 AM12/15/09

to

Mr. Scott wrote:

I really do hesitate to comment on this ever more imbecilic thread, but
this is a kind of discussion that always gets my goat.

Relational databases (and being generous, I will include SQL databases)
use logic and set theory to manage human testimony about the world. We
are not required to deny our own certain knowledge to satisfy the
machinations of the DBMS. It is the DBMS that is constrained to use the
closed world hypothesis, not us. But knowing that it is constrained in
that way, we are obliged to design databases to accomodate what we
know is true, and to never tell us anything we know is untrue.

But we have to accept that a DBMS can't tell us all that we know is
true. We human users are never obliged to use a closed world when we
interpret the result of a query.

--
Roy

Mr. Scott

unread,

Dec 15, 2009, 7:37:31 AM12/15/09

to

"paul c" <toledob...@oohay.ac> wrote in message

news:O_aVm.55874$Db2.49482@edtnps83...

The idea that symbols can be 'present' is very strange. For each table in a
database there is a collection of assertions that can be represented by a
row in the table. Some of those assertions may be true, and some may be
false, but the symbols from which those assertions are composed are always
'present' regardless of whether the assertions are true or not. They are
part of the language. The non-logical symbols of a typical first-order
language consists of a set of predicate symbols of various arities, a set of
function symbols of various arities, and a set of variable names. Arbitrary
propositions are 0-ary predicate symbols. Constants are 0-ary function
symbols. The idea that those sets of symbols can somehow change undermines
the basis of the logic. Moreover, the unique name assumption demands that
everything that has a name has one and only one name for all time, but if
the pool of names can change, then there's no longer any guarantee that the
name of something won't be different at some future time.

I think you're continual reference to mysticism is misplaced. Ignorance may
be bliss, but it has no place in database theory. It is a mistake to
deliberately ignore the fact that since databases can change, the
propositions represented by the rows that can be in the database must
concern concrete rather than abstract objects, since abstract objects are
independent of time. Propositions that concern abstract objects are either
necessarily true or necessarily false, so they have no place in a
time-varying relation. Think:

P -> (~#~P /\ ~#P): P implies possibly P but not necessarily P.

Every proposition that can be represented by a row in the database has to
have a similar form. If the contents of a table can change, then the
propositions that can be represented by a row in the table must be possible
but not necessary, and the only way they can be possible but not necessary
is if they concern concrete rather than abstract objects.

Probably the most important property of concrete objects is that they have
lifetimes. They can come into existence and they can cease to exist. This
is not mysticism: it is a logical consequence of the fact that the contents
of a table can change. But I understand your dislike of 'exists' as a
quantifier. I prefer to use 'there is' instead of 'there exists,' not
because there is something mystical about actual existence, but because a
fixed domain is easier to work with, especially for temporal data. A fixed
domain includes everything that ever was, everything that actually is, and
everything that can ever be. When the domain is fixed, the collection of
assertions that can be represented in the database is also fixed, so the
determination of which are represented at a given time is as simple as
assigning a truth value to each. There is no need to extend every time
there is an update. For example, suppose that someone tries to insert a row
for part '123455,' but there actually is no part '123455.' The row is
inconsistent with the database because there is no part '123455' in the
domain of parts, so there wouldn't be any propositions in the extension that
concern part '123455,' but what if someone at some later time defines a part
'123455?' At that time the row would be consistent because the part
'123455' would be in the domain of parts. How would the information that
'123455' is in the domain of parts be maintained so that consistency can be
ensured? Domains are not tables. One can't insert values into a domain
without altering the database scheme. However, with a fixed domain, it is
always the case that '123455' can be a part, and the fact that '123455'
actually is a part would have to be recorded in a table somewhere in the
database. It is also a simple matter with a fixed domain to represent
something in the database that no longer actually exists or something that
may occur in the future.

Kevin Kirkpatrick

unread,

Dec 15, 2009, 12:02:09 PM12/15/09

to

> Roy- Hide quoted text -

>
> - Show quoted text -

In the real world, I qualify the results of all queries as, "According
to the database [inference from the query]". I'll recast my point in
this light. My question on this thread was: given a database with
only one relvar BIG_US_CITIES {CITY, STATE} and associated predicate
"<CITY> is a big city in the state of <STATE>", if I run a query with
selection "STATE='Rhode Island'" and get no rows back, should I
interpret this result as:

A: "According to the database, Rhode Island is not a state" or
B: "According to the database, there are no large cities in Rhode
Island"

Mr. Scott is [I believe] claiming that both of these are valid
interpretations of the query result. I'd argue that only the latter
interpretation valid, and feel the example itself shows mine to be the
more sensible approach.

Mr. Scott

unread,

Dec 15, 2009, 12:40:29 PM12/15/09

to

"Roy Hann" <spec...@processed.almost.meat> wrote in message
news:NpSdnXvYDN2C-7rW...@pipex.net...

You're missing the point. The constraints that define the database clearly
and precisely specify what can and cannot be true in the context of the
database. That context is the micro-world that the database is supposed to
model, which may or may not be a subset of the real world. It might even be
a completely fictional world, but within that context, only assertions that
are consistent with the definition of the database can be true. The above
mentioned database is concerned only with states with big cities because
there is only a place for assertions about states with big cities, and in
that context, there can be no states with no big cities--even though we know
that in reality there are indeed states with no big cities.

> --
> Roy
>
>

Jan Hidders

unread,

Dec 18, 2009, 3:23:54 AM12/18/09

to

On 16 nov, 20:42, Sampo Syreeni <de...@iki.fi> wrote:
> Right now it is to be expected that I will be spreading the good
> relational word among my peers, in the near future. That is an
> opportunity one doesn't want to fuck up; many enough have gone down
> that road already. So I've been going over, and over, and over the
> basics. Don't want them to be able to catch me off guard with the
> minutiae, after all...
>
> So now I bump into my first real surprise, and the chills immediately
> go down my spine. That's Date et al.'s answer regarding the
> implications between 6NF and DK/NF, athttp://www.dbdebunk.com/page/page/621935.htm
> . In there they flat out state that DK/NF doesn't imply 6NF.
>
> So, my first question is, can this really be true? I mean, this seems
> highly suspect to me: since 6NF is a normal form like any other and is
> as such defined by the constraints it upholds by design, and on the
> other hand DK/NF is by definition a normal form where any constraint
> whatsoever follows from the domain and key ones, shouldn't it be self-
> evident that DK/NF logically implies 6NF, and in fact any other form?

Can I ask whether your question has been answered yet? I'm a bit
puzzled by it because this is basically a mathematical theorem and as
such the answer to it lies in understanding the math. But in your
additional text you seem to question more the philosophical /
intuitive foundations of these notions. Is that correct? And what is
exactly your problem with that? FWIW I'd certainly agree that the
justification of 6NF is suspect and weak, maybe even based on
misunderstanding the relational model, and it is certainly not widely
accepted.

Please be brief. ;-)

-- Jan Hidders

vldm10

unread,

Dec 21, 2009, 3:24:45 PM12/21/09

to

On Dec 15, 1:37 pm, "Mr. Scott" <do_not_re...@noone.com> wrote:
> "paul c" <toledobythe...@oohay.ac> wrote in message

>
> news:O_aVm.55874$Db2.49482@edtnps83...
>
>
>
>
>
> > Mr. Scott wrote:

> >> "paul c" <toledobythe...@oohay.ac> wrote in message

> may occur in the future.- Hide quoted text -

>
> - Show quoted text -

I would appreciate it if you could give a definition of abstract and
concrete objects. Any definition, or rough description of abstract
objects can clarify these important things in your message which was
interesting.
As far as I know the following questions don’t have good answers:
What abstract objects are; how we know that they exist?
What is the distinction between concrete and abstract objects – what
is the criterion of distinction?

The notation of objects is also questionable. As far as I know this
notation was first introduced by G. Frege.

If you use the terms abstract and concrete objects in the context of a
certain theory it might be interesting to know which framework you
used. I get the impression that these themes are becoming very
important in modern mathematics and some other sciences.

Vladimir Odrljin

Reinier Post

unread,

Dec 21, 2009, 5:07:31 PM12/21/09

to

Jan Hidders wrote:

>On 16 nov, 20:42, Sampo Syreeni <de...@iki.fi> wrote:
>>
>> So now I bump into my first real surprise, and the chills immediately
>> go down my spine. That's Date et al.'s answer regarding the
>> implications between 6NF and DK/NF,
>athttp://www.dbdebunk.com/page/page/621935.htm
>> . In there they flat out state that DK/NF doesn't imply 6NF.
>>
>> So, my first question is, can this really be true? I mean, this seems
>> highly suspect to me: since 6NF is a normal form like any other

I don't think so. It's hard fo me to tell, because I just did the
required math and it turns out don't have $60 to spend on the book
which contains the definition of 6NF required for this discussion,
but if I can get by the Google Books preview, it appears to involve some
degree of interpretation of domain values (as being totally ordered).
Normal normal forms don't do this.

>> and is
>> as such defined by the constraints it upholds by design, and on the
>> other hand DK/NF is by definition a normal form where any constraint
>> whatsoever follows from the domain and key ones, shouldn't it be self-
>> evident that DK/NF logically implies 6NF, and in fact any other form?

Any constraint of the kind used in normal forms, perhaps.
Constraints that involve interpretation of domain values
aren't among those.

>Can I ask whether your question has been answered yet? I'm a bit
>puzzled by it because this is basically a mathematical theorem and as
>such the answer to it lies in understanding the math.

It's probably a mathematical theorem, once you've bought it.
I believe Sampo's objection is a purely mathematical one.

--
Reinier

Jan Hidders

unread,

Dec 21, 2009, 6:13:40 PM12/21/09

to

On 21 dec, 23:07, r...@raampje.lan (Reinier Post) wrote:
> Jan Hidders wrote:
> >On 16 nov, 20:42, Sampo Syreeni <de...@iki.fi> wrote:
>
> >> So now I bump into my first real surprise, and the chills immediately
> >> go down my spine. That's Date et al.'s answer regarding the
> >> implications between 6NF and DK/NF,
> >athttp://www.dbdebunk.com/page/page/621935.htm
> >> . In there they flat out state that DK/NF doesn't imply 6NF.
>
> >> So, my first question is, can this really be true? I mean, this seems
> >> highly suspect to me: since 6NF is a normal form like any other
>
> I don't think so. It's hard fo me to tell, because I just did the
> required math and it turns out don't have $60 to spend on the book
> which contains the definition of 6NF required for this discussion,
> but if I can get by the Google Books preview, it appears to involve some
> degree of interpretation of domain values (as being totally ordered).
> Normal normal forms don't do this.

Wikipedia is your friend:

<http://en.wikipedia.org/wiki/Sixth_normal_form>

Whether you can call it a normal form or not is of course largely a
matter of definition, but it clearly is different from the other ones.
It does for example not remove redundancy or update anomalies, which
was pretty much the whole point of the other normal forms.

-- Jan Hidders

Bob Badour

unread,

Dec 21, 2009, 11:01:24 PM12/21/09

to

Jan Hidders wrote:

1st normal form through 5th normal form involve lossless decomposition
by project. Fagin proved 5th normal form is necessary and sufficient to
avoid the update anomalies avoidable by lossless decomposition by project.

It has long been noted that one could create other normal forms using
different operations. e.g. lossless decomposition by restrict. However,
to my knowledge, nobody has done so or found a reason to.

If I am not mistaken, 6th normal form does involve lossless
decomposition by project, which puts it into the same class of normal
forms as 1st normal form through 5th normal form; even if the reasons
for choosing 6th normal form do not involve avoiding update anomalies.

Of course, everything is a matter of definition.

--
is there something in it for them, like maybe bailouts, if they can
panic us into doing something politically to cover them?

November 19, 2007 - John S Bolton

http://tinyurl.com/y9e4vxh

Mr. Scott

unread,

Dec 22, 2009, 8:59:19 AM12/22/09

to

"vldm10" <vld...@yahoo.com> wrote in message
news:2568ebc5-3d4f-4f71...@m16g2000yqc.googlegroups.com...

<snip>

>
>I would appreciate it if you could give a definition of abstract and
>concrete objects. Any definition, or rough description of abstract
>objects can clarify these important things in your message which was
>interesting.
>
>As far as I know the following questions don�t have good answers:
>What abstract objects are; how we know that they exist?
>
>What is the distinction between concrete and abstract objects � what
>is the criterion of distinction?

Time is the criterion. Abstract objects are independent of time. Concrete
objects aren't. Concrete objects can come into existence or can cease to
exist. Abstract objects just are. The integer three just is.

paul c

unread,

Dec 22, 2009, 8:59:15 PM12/22/09

to

Both of you succinct as ever.

vldm10

unread,

Dec 25, 2009, 9:09:46 PM12/25/09

to

On Dec 22, 12:13 am, Jan Hidders <hidd...@gmail.com> wrote:

> Whether you can call it a normal form or not is of course largely a
> matter of definition, but it clearly is different from the other ones.
> It does for example not remove redundancy or update anomalies, which
> was pretty much the whole point of the other normal forms.
>

> -- Jan Hidders- Hide quoted text -
>

Hello Jan. As I told you a few years ago, my “simple form” is the way
in which db design should be done. Now you can see it as part of a
“bigger picture” at
http:// www.dbdesign11.com

Vladimir Odrljin

vldm10

unread,

Dec 27, 2009, 5:41:33 AM12/27/09

to

Sorry, the address is
http://www.dbdesign11.com

vldm10

unread,

Dec 27, 2009, 9:36:19 AM12/27/09

to

On Dec 22, 2:59 pm, "Mr. Scott" <do_not_re...@noone.com> wrote:

> Time is the criterion. Abstract objects are independent of time.

This statement may imply strong consequences, but it seems to be
correct.

> Concrete objects aren't. Concrete objects can come into existence or can cease to
> exist. Abstract objects just are. The integer three just is.

In fact, here we have the following three abstract objects:
1. integer;
2. the particular number 3;
3. a triad, for example three cars.

For G. Frege “3” in “3 + 5 = 8” is a name. Or “three” in “ There are
three cars on the street” is a name.
Frege’s notation of an object is characterized as the kind of thing
which can be the referent of a name. So a number is an object.
Also there are the different kinds of abstract objects regarding
universal and existential quantification (for example numbers and
relations).
As far as I know Frege’s model is the only general model or framework
for abstract objects.

There are some other views. “Some say that what is true or false is
not the sentence, but the meaning or thought expressed by the
sentence.”

There is group of mathematicians and philosophers named “Platonists”
who “believes that there are abstract objects not perceptible to our
senses that exist independently of us. Such objects can be perceived
by us only through our intellect.”
I write about these things because your post in fact is about the
fundamentals.

I use abstract objects in my DB design. I treat abstractions of
properties, attributes, entities, etc. as abstract objects. I pay
attention to identification of objects as crucial
question related to objects. So I have identification of pluralities,
entities, attributes, states etc.

Vladimir Odrljin

Reinier Post

unread,

Dec 27, 2009, 6:37:48 PM12/27/09

to

Jan Hidders wrote:

>On 21 dec, 23:07, r...@raampje.lan (Reinier Post) wrote:

[...]

>> [...] It's hard fo me to tell, because I just did the

>> required math and it turns out don't have $60 to spend on the book
>> which contains the definition of 6NF required for this discussion,
>> but if I can get by the Google Books preview, it appears to involve some
>> degree of interpretation of domain values (as being totally ordered).
>> Normal normal forms don't do this.
>
>Wikipedia is your friend:

I know ...

><http://en.wikipedia.org/wiki/Sixth_normal_form>

... that's why I'm so forgiving when it disappoints me, I guess.
I even try to remedy its shortcomings every now and then, which
is why you see half a definition on that page right now - I couldn't
find the definition of "U-projection" to complete it, and apparently,
nobody else can, which makes me wonder whether the subject should
be on Wikipedia in the first place. So please help out if you can.

--
Reinier

floronDBA

unread,

Jan 4, 2010, 9:09:06 AM1/4/10

to

Google "U projection" relational.

msh.revues.org/docannexe9233.html
PROJECTIVE OPERATIONS ON RELATIONAL CONSTRAINTS Luigi BURIGANA1

"2. EXISTENTIAL AND UNIVERSAL PROJECTIONS
Let A O[W] be any constraint and U W any set of variables. The e-
projection
(existential projection) of A relative to U, denoted by A " U, is the
set containing
each assignment in context (W,O) such that there is a way of modifying
its subassignment
on U so that the resulting assignment is in A. The u-projection
(universal
projection) of A relative to U, denoted by A # U, is the set
containing each assignment
in context (W,O) such that for every modification of its sub-
assignment on U,
the resulting assignment is in A. The two concepts are formally
rendered by these
equations:

A " U ={p + q : p0 + q 2 A for some p0 2 O[U]} (2)
A # U ={p + q : p0 + q 2 A for all p0 2 O[U]} (3)

in which + is catenation between local assignments and q stands for a
generic member
of O[W \ U]."

Sorry, most symbols come out wrong if I copy and paste.

Mr. Scott

unread,

Jan 5, 2010, 4:39:03 PM1/5/10

to

"Reinier Post" <r...@raampje.lan> wrote in message
news:4b37efcc$0$28718$703f...@news.kpn.nl...

6NF is intended for data that involves intervals. For example, a typical
row that records the reduced price of a part for a sale,

{A0131245, $4.95, [09/01/09,09/03/09]}

States that part A0131245 has price $4.95 from the start of 09/01/09 through
the end of 09/03/09.

This row is equivalent to the following three rows

{A0131245, $4.95, [09/01/09,09/01/09]}
{A0131245, $4.95, [09/02/09,09/02/09]}
{A0131245, $4.95, [09/03/09,09/03/09]}

Date, Darwen and Lorentzos' system is discrete rather than continuous, so an
interval that is not atomic, like [09/01/09,09/03/09], can be subdivided
into a series of consecutive intervals that are. The authors call the
operation of splitting a row with a non-atomic interval into a collection of
consecutive rows with atomic intervals "UNPACKing." The inverse, "PACKing"
combines consecutive rows, adjusting only the interval. They then
generalized the ordinary relational operators to accomodate the packing and
unpacking operators. A U-projection involves unpacking the operand, taking
a projection over the result of the unpack, and then packing the result of
taking the projection.

Hope this helps.