
Normalization by Composing, not just Decomposing


Dawn M. Wolthuis

Apr 8, 2004, 2:51:04 PM
Sorry I have so many questions, but I do appreciate the help I have received
from this list. I just read, or rather, skimmed the document Jan pointed me
to related to XML and normal forms. There were other more accessible papers
there that I skimmed too.

If I am understanding correctly, the process of normalization for any set of
data attributes is a process of decomposing from one large set to several
smaller ones. That makes sense when starting from scratch.

But tests for determining whether data is normalized also seem to focus on
whether it has been fragmented sufficiently and do not take into account
whether the data has been TOO fragmented.

For example, if we have attributes: ID, First Name, Last Name, Nick Name
where the ID is a primary key (or candidate key if you prefer) and for each
ID there is precisely one list of Nick Names and the Nick Name list
(relation, if you prefer) is determined by the ID, the whole ID, and nothing
but the ID, then in the relational model, most folks would still split out
Nick Names into a separate relation simply because it is, itself, a
relation.

More progressive relational modelers might decide it is OK to model the
relation-valued attribute of Nick Names within the first relation. But
either option would then be acceptable and considered normalized (using
newer definitions of 1NF).
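As a rough sketch of the two shapes being compared here (illustrative names and data only, not PICK or SQL syntax):

# Option 1: split the nick names into their own relation, keyed by (ID, nick name).
person     = {1: {"first_name": "Pat", "last_name": "Smith"}}
nick_names = [(1, "Patty"), (1, "Tricia")]

# Option 2: keep a relation-valued (multivalued) Nick Names attribute with the person.
person_nested = {1: {"first_name": "Pat",
                     "last_name": "Smith",
                     "nick_names": ["Patty", "Tricia"]}}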

But there seem to be no "rules" or even guidelines that are provided to
COMPOSE or keep together the Nick Names with the ID. Such rules are the
ones I would add to what I have seen related to XML modeling; they are
used, without being explicitly stated, by PICK developers. The imprecise
description of this rule is:

If it is dependent on the key, the whole key, and nothing but the key, then
don't split it out!

More precision, but not absolute precision, would give us something like:
Let A be the set of all Attributes and FD be the set of all functional
dependencies among the attributes. If a is an element of A and is a key and
mv is another element (named to give a hint that it might be multivalued,
aka relation-valued) and a-->mv is in FD (but no subcomponent of a implies
mv), then

mv should be an attribute in a relation where a is a key, and for all
attributes b with this same relationship to a, mv should be in the same
relation as b.
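Or, in notation (this is only one possible reading of the rule above, not a definitive formulation):

\forall\, mv \in A:\;
  \bigl( a \to mv \in FD^{+} \;\wedge\; \neg\exists\, a' \subsetneq a :\; a' \to mv \in FD^{+} \bigr)
  \;\Longrightarrow\; mv \text{ is kept in the relation whose key is } a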

In other words, there ought to be some "rules" that govern when we ought not
split out data attributes, in general, as well as when we should decompose
them.

Or am I missing something? Perhaps what I skimmed includes this, but I just
didn't pick it up. I know I haven't read everything out there -- are there
other places where normalization or rules related to data modeling are not
focussed exclusively on when to split attributes out, but also include
bringing them together when they have already been unnecessarily decomposed?

Thanks. --dawn


Laconic2

Apr 8, 2004, 3:37:04 PM
Just because the remedy for a given lack of normalization is generally
decomposition doesn't necessarily mean that decomposition is what normal
forms are ABOUT.

For each normal form, it's worth asking, what problems come about if one
does not adhere to this normal form? And in fact the best treatments of
normal forms do precisely that.

For 1NF the problem is an access problem: you have to look in more than one
column to find the answer to a single question.

For all the other normal forms, the problem is an update problem. You have
to change more than one row to change a single fact, and, if you don't, you
could end up with a self contradictory database.
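A tiny illustration of that update problem (invented data; the department name is stored redundantly with each employee):

# Non-3NF rows: dept_name depends on dept_id, not on the key emp_id.
employees = [
    {"emp_id": 1, "dept_id": 10, "dept_name": "Sales"},
    {"emp_id": 2, "dept_id": 10, "dept_name": "Sales"},
]

# Renaming department 10 means touching every matching row; miss one and the
# database quietly contradicts itself.
for row in employees:
    if row["dept_id"] == 10:
        row["dept_name"] = "Field Sales"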

Your previous answers suggest you already know this, so I'm wondering where
the rest of your comment is leading.

BTW, the fact that 1NF solves a different problem than all the others could
be related to your claim that PICK data is normalized with regard to the 2NF
and 3NF rules, and maybe more, but not 1NF.

The fact that normal forms avoid (rather than solve) certain problems also
gives a handle on WHEN to normalize: if it ain't broke, don't fix it.

Alan

Apr 8, 2004, 3:43:19 PM
You are assuming that (good) normalization is a science. It is not. It is
part science and part art - that's where experience (as well as ESP to read
the users' minds and clairvoyance to predict future needs) comes into play.
Oh, it is also part voodoo. Sometimes waving a dead chicken in a paper bag
over your head produces the results you need. By the way, the process of
putting it back together is called denormalization, not composing, and is
not uncommon, but as you noted, there are no rules. That's why experienced
data modelers get paid more than newbies.


"Dawn M. Wolthuis" <dw...@tincat-group.com> wrote in message
news:c546v4$a76$1...@news.netins.net...

Dawn M. Wolthuis

Apr 8, 2004, 4:20:22 PM
"Alan" <al...@erols.com> wrote in message
news:c54a0e$2ohurg$1...@ID-114862.news.uni-berlin.de...

> You are assuming that (good) normalization is a science. It is not. It is
> part science and part art- that's where experience (as well as ESP to read
> the user's minds and clairvoiance to predict future needs) comes in to
play.
> Oh, it is also part voodoo. Sometimes waving a dead chicken in a paper bag
> over your head produces the results you need.

You are preachin' to the choir-ish -- that's the type of thing I would say
if I were not trying, oh so hard, to learn what makes relational theorists
tick and trying, oh so hard, to use the same way of thinking so that I can
really learn what it is about relational theory that is keeping it the king
of the hill. There are very formalized statements, using very mathematical
terminology and all, that show the process of normalization to be
NP-complete or whatever else makes some folks feel all warm and fuzzy (not
the mathematical use of the term "fuzzy"). When I ask what it is about
relational theory that makes it king, I hear that it is because it is based
on mathematics, including predicate logic.

> By the way, the process of
> putting it back together is called denormalization, not composing, and is
> not uncommon, but as you noted, there are no rules. That's why experienced
> data modelers get paid more than newbies.

Yes, you are right and I'm quite familiar with denormalization used with
OLAP, which is why I avoided that term. From what I have seen, folks talk
about denormalization when going away from transactional data processing and
I didn't want to inadvertently take the thread in that direction.

So, with your statements about formalization of the rules for good data
modeling/design/implementation, are you in "the relational camp" or among
the less orthodox (of us)? Thanks. --dawn

Jan Hidders

Apr 8, 2004, 4:26:50 PM
Dawn M. Wolthuis wrote:
>
> If I am understanding correctly, the process of normalization for any set of
> data attributes is a process of decomposing from one large set to several
> smaller ones. That makes sense when starting from scratch.
>
> But tests for determining whether data is normalized also seem to focus on
> whether it has been fragmented sufficiently and do not take into account
> whether the data has been TOO fragmented.

Of course they take that into account! Besides the obvious
requirement to be information preserving, you might have a requirement
to be dependency preserving. Even beyond that the theory doesn't tell
you when you should and should not split. It only tells you in what
stage you are, what the potential problems in that stage are and how you
might get rid of them. Whether you want to accept these potential
problems is up to you as a database designer. But without a thorough
understanding of normalization theory you cannot make a well-founded
decision about this.
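For reference, those two requirements are usually stated along these lines (standard textbook definitions, paraphrased; R with FD set F is decomposed into R_1, ..., R_n):

\text{lossless join:}\quad \pi_{R_1}(r) \bowtie \cdots \bowtie \pi_{R_n}(r) = r \quad \text{for every legal instance } r \text{ of } R

\text{dependency preservation:}\quad \bigl( \pi_{R_1}(F) \cup \cdots \cup \pi_{R_n}(F) \bigr)^{+} = F^{+}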

By the way, if you think normalization for the 1NF relational model is
tricky, then I seriously doubt you will master the art for the NFNF
relational model. Never mind a semistructured datamodel like XML or the
Pick data model where things get much much more complicated.

> In other words, there ought to be some "rules" that govern when we ought not
> split out data attributes, in general, as well as when we should decompose
> them.

That's what the rules already "do" now. If you want to be in 3NF / BCNF
/ PJNF / DKNF or whatever, and you can join two tables without breaking
the NF requirements, then there is nothing in normalization theory that
tells you to keep them separate. In that respect they are much like
physical laws: they don't tell you what to do, just what the
consequences of your choices will be.

-- Jan Hidders

Dawn M. Wolthuis

Apr 8, 2004, 4:46:58 PM
"Jan Hidders" <jan.h...@REMOVETHIS.pandora.be> wrote in message
news:eAidc.64823$6j6.4...@phobos.telenet-ops.be...

> Dawn M. Wolthuis wrote:
> >
> > If I am understanding correctly, the process of normalization for any
set of
> > data attributes is a process of decomposing from one large set to
several
> > smaller ones. That makes sense when starting from scratch.
> >
> > But tests for determining whether data is normalized also seem to focus
on
> > whether it has been fragmented sufficiently and do not take into account
> > whether the data has been TOO fragmented.
>
> Of course they take that into account! Except for the obvious
> requirement to be information preserving, you might have a requirement
> to be dependency preserving. Even beyond that the theory doesn't tell
> you when you should and should not split. It only tells you in what
> stage you are, what the potential problems in that stage are and how you
> might get rid of them. Whether you want to accept these potential
> problems is up to you as a database designer. But without a thorough
> understanding of normalization theory you cannot make a well-founded
> decision about this.
>
> By the way, if you think normalization for the 1NF relational model is
> tricky,

not sure I said that

> then I seriously doubt you will master the art for the NFNF
> relational model. Never mind a semistructured datamodel like XML or the
> Pick data model where things get much much more complicated.

Do you HAVE a description of any type of normalization or other data
modeling theories related to PICK? I sort of think I'm at least in the 3.5
(tennis 5-point scale) camp on PICK data modeling & design and am working to
take what I KNOW somewhere in the depths of my brain and formalize it so
that it doesn't appear to relational theorists as being so loosey-goosey.

If someone has already done that -- PLEASE POINT ME TO IT! I know Dick Pick
& Don Nelson did not do that, nor did some of the heavy-weights like Henry
Eggers (who sort of gave me his blessing to go for it last year while sitting
in a little cafe by the Pacific Ocean with Jonathan Sisk, a fine PICK author
and historian).

> > In other words, there ought to be some "rules" that govern when we ought
not
> > split out data attributes, in general, as well as when we should
decompose
> > them.
>
> That's what the rules already "do" now. If you want to be in 3NF / BCNF
> / PJNF / DKNF or whatever, and you can join two tables without breaking
> the NF requirements, then there is nothing in normalization theory that
> tells you to keep them separate.

But there is nothing that tells you to put them back together, right? You
can obey all the rules of normalization and have a set of N resulting
relations when someone else also has a normalized set of the same attributes
but with N-500 relations, right? Again, I might very well be wrong on this
point -- I want to know if I am. Thanks. --dawn

mAsterdam

Apr 8, 2004, 5:47:53 PM
Dawn M. Wolthuis wrote:

> If I am understanding correctly, the process of normalization for any set of
> data attributes is a process of decomposing from one large set to several
> smaller ones. That makes sense when starting from scratch.
>
> But tests for determining whether data is normalized also seem to focus on
> whether it has been fragmented sufficiently and do not take into account
> whether the data has been TOO fragmented.

Lossless decomposition is the magic phrase: slicing up the data from the
information needs *without* losing information.
Having said that, I think your thread on Order & meaning
at least intuitively shows that when moving information from utterances
in natural language into data structures *some* of its meaning is lost
anyway.
Furthermore, a heavy burden is placed on
1. stating the information needs - it should be done perfectly,
and
2. keeping everything connected, to indeed have a decomposition
that is lossless.

Anyway, the irreducible normal form (one non-key attribute only)
was recently mentioned again by Chris Date as the 6th normal form.
I can't find my books on it, so I must do this from memory; please
forgive the inaccuracies:
AFAIK it (irreducible normal form - by another name) first popped up in
NIAM (when it was called Nijssen's Information Analysis Method, late
1970s, after one of its originators, prof. Nijssen. Now the N stands for
Natural) - I don't remember what it was called in that context -
something like 'elementary sentence'. NIAM is mentioned at the ORM site
Jan Hidders kindly referred us to, http://www.orm.net, so maybe I can
find the origins there.

> More progressive relational modelers might decide it is OK to model the
> relation-valued attribute of Nick Names within the first relation. But
> either option would then be acceptable and considered normalized (using
> newer definitions of 1NF).

From here on I did not get your points, sorry. I don't know Pick (only
from your example), and though I like XML - it wins some points over HTML
as a markup language - maturity for modelling data along those lines is
beyond the horizon, IMHO.

> If it is dependent on the key, the whole key, and nothing but the key, then
> don't split it out!

There are some anomalies which sometimes make it necessary to split
beyond BCNF (i.e. Boyce-Codd normal form, easy to recognize with the
phrase: all non-key attributes are dependent on the key, the whole key,
and nothing but the key).

> ... In other words, there ought to be some "rules" that govern when we ought not


> split out data attributes, in general, as well as when we should decompose
> them.
>
> Or am I missing something?

Maybe the magic *lossless* I just mentioned?

Jan Hidders

Apr 8, 2004, 8:02:20 PM
Dawn M. Wolthuis wrote:
>
> Do you HAVE a description of any type of normalization or other data
> modeling theories related to PICK?

That's what the Arenas & Libkin paper was. I know the math is tough but
that's to a large extent because normalization in such data models is
inherently more complex then in the relational model. Explaining it in
detail would take more time than I have. If you want things to be
simpeler then I suggest you stick to the NFNF relational model. See, for
example:

http://citeseer.ist.psu.edu/ozsoyoglu87new.html

although that is not a very good paper; it does contain more references
if you want them. If you can find them, look at the Fischer and Van Gucht
paper, and the Roth, Korth and Silberschatz paper.

And if that is too difficult for you, just stick to the usual
dependencies (functional, multi-valued, join), which would basically
mean you are just dealing with the good old flat relational model. Or,
for starters, perhaps even just the functional dependencies.

I had the impression from your attempted formalization of your
denormalization rule that this is what you were doing anyway. I just
hope you realize that this is just a very tiny tip of a really huge
iceberg ... I'm sure you can come up with a nice pun on the word "pick"
here somewhere ... ;-)

> But there is nothing that tells you to put them back together, right? You
> can obey all the rules of normalization and have a set of N resulting
> relations when someone else also has a normalized set of the same attributes
> but with N-500 relations, right?

Only if you normalize in a very naive way, i.e., splitting off offending
FDs one by one. The usual algorithm that gets you to 3NF in one step
(the one using the minimal cover) splits as little as possible. See for
example sheet 46 on:

http://cs.ulb.ac.be/cours/info364/relnormnotes.pdf
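For anyone without the slides handy, here is a minimal Python sketch of that minimal-cover (Bernstein-style) synthesis idea. It assumes the FDs handed in already form a minimal cover and uses single-character attribute names; it is an illustration of the approach, not the slides' exact algorithm:

def closure(attrs, fds):
    """Attribute closure of attrs under fds (a list of (lhs, rhs) frozenset pairs)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return frozenset(result)

def synthesize_3nf(attributes, minimal_cover):
    """Bernstein-style 3NF synthesis: one schema per group of FDs with the same LHS,
    plus a schema holding a key if no generated schema already contains one."""
    attributes = frozenset(attributes)
    groups = {}
    for lhs, rhs in minimal_cover:                 # group FDs by left-hand side
        groups.setdefault(lhs, set()).update(rhs)
    schemas = [frozenset(lhs | rhs) for lhs, rhs in groups.items()]
    if not any(closure(s, minimal_cover) == attributes for s in schemas):
        key = set(attributes)                      # greedily shrink to a candidate key
        for a in sorted(attributes):
            if closure(key - {a}, minimal_cover) == attributes:
                key.remove(a)
        schemas.append(frozenset(key))
    return [s for s in schemas if not any(s < t for t in schemas)]  # drop subsumed schemas

# Hypothetical example: R(A,B,C,D) with minimal cover {AB->C, BC->D}
fds = [(frozenset("AB"), frozenset("C")), (frozenset("BC"), frozenset("D"))]
print(sorted("".join(sorted(s)) for s in synthesize_3nf("ABCD", fds)))   # ['ABC', 'BCD']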

-- Jan Hidders

Laconic2

Apr 8, 2004, 10:13:52 PM
The trouble with normalization is that it tells you what not to
do, but it doesn't tell you what to do.

It's sort of like asking a carpenter how to build a dome shaped residence,
and having him say, "don't use right angles."

Jan Hidders

Apr 9, 2004, 1:36:38 PM
mAsterdam wrote:
>
> Anyway, the irreducible normal form (one non-key attribute only)
> was recently mentioned again by Chris Date as the 6th normal form.

Are you sure? Date's 6NF is a special normal form for when you have
temporal data. It's not uncontroversial, by the way.

> AFAIK it (irreducible normal form - by another name) first popped up in
> NIAM (when it was called Nijssen's Information Analysis Method, late
> 1970's, after one of its originators, prof Nijssen. Now the N stand for
> Natural) - I don't remember what is was called in that context -
> something like 'elementary sentence'.

I believe the term to be "elementary facts". Actually finding out what
the elementary facts are is essentially the same as normalizing to 5NF.

> NIAM is mentioned at the ORM site
> Jan Hidders kindly referred us to, http://www.orm.net, so maybe I can
> find the origins there.

To a large extent ORM *is* NIAM.

-- Jan Hidders

Jan Hidders

Apr 9, 2004, 3:04:34 PM
Jan Hidders wrote:
>
> [...] The usual algorithm that gets you to 3NF in one step
> (the one using the minimal cover) splits as little as possible. See for
> example sheet 46 on:
>
> http://cs.ulb.ac.be/cours/info364/relnormnotes.pdf

Did anyone notice that this algorithm is actually not correct? Take the
following example of a relation R(A,B,C,D,E) with the set of FDs:

{ AB->C, AB->D, BC->D }

It is clear that the relation ABCD is not in 3NF. Since the set of FDs
is already a minimal cover, the resulting decomposition is:

{ ABCD, BCD }

But that gives us our old relation back (plus a projection) so this is
definitely not in 3NF.
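That claim about ABCD is easy to check mechanically. A small Python sketch (it tests only the textbook 3NF condition - for every applicable FD X->Y, X is a superkey or Y is prime - against the FDs exactly as listed, without computing full FD projections):

from itertools import chain, combinations

fds = [("AB", "C"), ("AB", "D"), ("BC", "D")]      # the FDs from the example above

def closure(attrs, fds):
    result = set(attrs)
    while True:
        before = len(result)
        for lhs, rhs in fds:
            if set(lhs) <= result:
                result |= set(rhs)
        if len(result) == before:
            return result

def candidate_keys(schema, fds):
    subsets = chain.from_iterable(combinations(schema, n) for n in range(1, len(schema) + 1))
    supers = [set(s) for s in subsets if closure(s, fds) >= set(schema)]
    return [k for k in supers if not any(other < k for other in supers)]

def three_nf_violations(schema, fds):
    # Simplification: only the FDs as listed are tested, not all projected FDs.
    keys = candidate_keys(schema, fds)
    prime = set().union(*keys)
    return [(lhs, rhs) for lhs, rhs in fds
            if set(lhs) | set(rhs) <= set(schema)        # the FD applies to this schema
            and not any(k <= set(lhs) for k in keys)     # LHS is not a superkey
            and not set(rhs) <= prime]                   # RHS is not a prime attribute

for schema in ("ABCD", "BCD"):
    print(schema, three_nf_violations(schema, fds))
# ABCD [('BC', 'D')]   <- BC is not a superkey of ABCD and D is not prime
# BCD []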

The strange thing is that this algorithm appears as such in the Elmasri
and Navathe and also in Date (but not Ullman). Surely these two major
textbooks would not get the most fundamental algorithm in normalization
theory wrong? Or would they? Reminds me a little of the
misrepresentation of 5NF in many textbooks.

-- Jan Hidders

Dawn M. Wolthuis

Apr 9, 2004, 5:03:28 PM
"Jan Hidders" <jan.h...@REMOVETHIS.pandora.be> wrote in message
news:gKldc.65103$2I5.4...@phobos.telenet-ops.be...

> Dawn M. Wolthuis wrote:
> >
> > Do you HAVE a description of any type of normalization or other data
> > modeling theories related to PICK?
>
> That's what the Arenas & Libkin paper was. I know the math is tough but
> that's to a large extent because normalization in such data models is
> inherently more complex then in the relational model.

OK, I read it rather than skimming it and I have a few parts where I'm not
tracking with the notation, but I mostly followed the mathematics this time,
and this does take care of one aspect of what I care about -- thanks!

> Explaining it in
> detail would take more time than I have. If you want things to be
> simpeler then I suggest you stick to the NFNF relational model. See, for
> example:
>
> http://citeseer.ist.psu.edu/ozsoyoglu87new.html
> although that is not a very good paper, but it contains more references
> if you want them. If you can find them look at the Fischer and Van Gucht
> paper, and the Roth, Korth and Silberschatz paper.

Yes, I have read at least one such.

> And if that is too dificult for you,

Don't misdiagnose laziness. ;-)

> just stick to the usual
> dependencies (functional, multi-valued, join), which would basically
> mean you are just dealing with the good old flat relational model. Or,
> for starters, perhaps even just the functional dependencies.
> I had the impression from you attempted formalization of your
> denormalization rule, that this is what you were doing anyway. I just
> hope you realize that this is just a very tiny tip of a really huge
> iceberg. .. I'm sure you can come up with a nice pun on the word "pick"
> here somewhere .. ;-)

There have been plenty of those over the history of PICK (including one of
my favorite names for a book -- The PICK Pocket Guide).

Yes, agreed on the iceberg. PICK is not a "database management system" per
se, doesn't have strong typing or declarative constraints, and just plain
doesn't follow many of what are assumed to be best practices for a database
system.

> > But there is nothing that tells you to put them back together, right?
You
> > can obey all the rules of normalization and have a set of N resulting
> > relations when someone else also has a normalized set of the same
attributes
> > but with N-500 relations, right?
>
> Only if you normalize in a very naive way, i.e., splitting off offending
> FDs one by one. The usual algorithm that gets you to 3NF in one step
> (the one using the minimal cover) splits as little as possible. See for
> example sheet 46 on:
>
> http://cs.ulb.ac.be/cours/info364/relnormnotes.pdf

Another good link -- thanks! --dawn


Jonathan Leffler

Apr 10, 2004, 12:31:06 AM
Jan Hidders wrote:
> Jan Hidders wrote:
>> [...] The usual algorithm that gets you to 3NF in one step (the one
>> using the minimal cover) splits as little as possible. See for example
>> sheet 46 on:
>>
>> http://cs.ulb.ac.be/cours/info364/relnormnotes.pdf
>
> Did anyone notice that this algorithm is actually not correct? Take the
> following example of a relation R(A,B,C,D,E) with the set of FDs:
>
> { AB->C, AB->D, BC->D }

You've lost E - was that a mistake in the FD's or in the example relation?

> It is clear that the relation ABCD is not in 3NF. Since the set of FDs
> it is already a minimal cover the resulting decomposition is:
>
> { ABCD, BCD }
>
> But that gives us our old relation back (plus a projection) so this is
> definitely not in 3NF.
>
> The strange thing is that this algorithm appears as such in the Elmasri
> and Navathe and also in Date (but not Ullman). Surely these two major
> textbooks would not get the most fundamental algorithm in normalization
> theory wrong? Or would they? Reminds me a little of the
> misrepresentation of 5NF in many textbooks.
>
> -- Jan Hidders


--
Jonathan Leffler #include <disclaimer.h>
Email: jlef...@earthlink.net, jlef...@us.ibm.com
Guardian of DBD::Informix v2003.04 -- http://dbi.perl.org/

Bert Blink

Apr 10, 2004, 1:10:12 AM
On Sat, 10 Apr 2004 04:31:06 GMT, Jonathan Leffler
<jlef...@earthlink.net> wrote:

>Jan Hidders wrote:
>> Jan Hidders wrote:
>>> [...] The usual algorithm that gets you to 3NF in one step (the one
>>> using the minimal cover) splits as little as possible. See for example
>>> sheet 46 on:
>>>
>>> http://cs.ulb.ac.be/cours/info364/relnormnotes.pdf
>>
>> Did anyone notice that this algorithm is actually not correct? Take the
>> following example of a relation R(A,B,C,D,E) with the set of FDs:
>>
>> { AB->C, AB->D, BC->D }
>
>You've lost E - was that a mistake in the FD's or in the example relation?
>
>> It is clear that the relation ABCD is not in 3NF. Since the set of FDs
>> it is already a minimal cover the resulting decomposition is:
>>
>> { ABCD, BCD }
>>
>> But that gives us our old relation back (plus a projection) so this is
>> definitely not in 3NF.
>>
>> The strange thing is that this algorithm appears as such in the Elmasri
>> and Navathe and also in Date (but not Ullman). Surely these two major
>> textbooks would not get the most fundamental algorithm in normalization
>> theory wrong? Or would they? Reminds me a little of the
>> misrepresentation of 5NF in many textbooks.
>>
>> -- Jan Hidders

See p. 321, Table 10.1 in E&N 4th Edition & elsewhere in the text.

It specifically mentions the need to preserve the Candidate Key (CK)
as a separate relation in particular when the CK is not on the LHS of
any FD in the minimal cover.

So you need an extra relation r3(A, B, E).

Jan Hidders

Apr 10, 2004, 4:30:29 AM
Jonathan Leffler wrote:
> Jan Hidders wrote:
>
>> Jan Hidders wrote:
>>
>>> [...] The usual algorithm that gets you to 3NF in one step (the one
>>> using the minimal cover) splits as little as possible. See for
>>> example sheet 46 on:
>>>
>>> http://cs.ulb.ac.be/cours/info364/relnormnotes.pdf
>>
>>
>> Did anyone notice that this algorithm is actually not correct? Take
>> the following example of a relation R(A,B,C,D,E) with the set of FDs:
>>
>> { AB->C, AB->D, BC->D }
>
>
> You've lost E - was that a mistake in the FD's or in the example relation?

Oops. That's a mistake in the example relation. Sorry about that.

-- Jan Hidders

Jan Hidders

Apr 10, 2004, 4:41:45 AM
Bert Blink wrote:
> On Sat, 10 Apr 2004 04:31:06 GMT, Jonathan Leffler
> <jlef...@earthlink.net> wrote:
>>Jan Hidders wrote:
>>
>>>Jan Hidders wrote:
>>>
>>>>[...] The usual algorithm that gets you to 3NF in one step (the one
>>>>using the minimal cover) splits as little as possible. See for example
>>>>sheet 46 on:
>>>>
>>>> http://cs.ulb.ac.be/cours/info364/relnormnotes.pdf
>>>
>>>Did anyone notice that this algorithm is actually not correct? Take the
>>>following example of a relation R(A,B,C,D,E) with the set of FDs:
>>>
>>> { AB->C, AB->D, BC->D }
>>
>>You've lost E - was that a mistake in the FD's or in the example relation?
>>
>>
>>>It is clear that the relation ABCD is not in 3NF. Since the set of FDs
>>>it is already a minimal cover the resulting decomposition is:
>>>
>>> { ABCD, BCD }
>>>
>>>But that gives us our old relation back (plus a projection) so this is
>>>definitely not in 3NF.
>>>
>>>The strange thing is that this algorithm appears as such in the Elmasri
>>>and Navathe and also in Date (but not Ullman). Surely these two major
>>>textbooks would not get the most fundamental algorithm in normalization
>>>theory wrong? Or would they? Reminds me a little of the
>>>misrepresentation of 5NF in many textbooks.
>
> See p321Table 10.1 in E&N 4th Edition & elswhere in the text.
>
> It specifically mentions the need to preserve the Candidate Key (CK)
> as a separate relation in particular when the CK is not on the LHS of
> any FD in the minimal cover.
>
> So you need an extra relation r3(A, B, E).

The E shouldn't have been there. But even if it is, that doesn't solve
the problem. The decomposition { ABCD, BCD, ABE } is also not in 3NF.

-- Jan Hidders

mAsterdam

Apr 10, 2004, 6:13:55 AM
Jan Hidders wrote:

> mAsterdam wrote:
>> Anyway, the irreducible normal form (one non-key attribute only)
>> was recently mentioned again by Chris Date as the 6th normal form.
>
> Are you sure?

Your question got me to think about this again, and the only honest
answer I can come up with is: no, I am not 100% sure. Either I read it
somewhere or at some time concluded that - for all practical purposes -
6NF is INF. Sorry for presenting this as a fact.

I could back out by rephrasing it so:

> Anyway, the irreducible normal form (one non-key attribute

> only) was recently mentioned again by Chris Date in the
> prelude to the 6th normal form.

... which would be correct, but would miss anything about
6NF and INF being the same or not. I think they are.
Now I'm in an uneasy spot. I find that I cannot reproduce how I arrived at
the conclusion that they are, or provide a quote or proof.

I can just give you some of my thoughts about it.
First: some nuance.

> Date's 6NF is a special normal form for when you have
> temporal data.

1. Which is always, *if* you look at it this way:
"The database is not the database - the log is the database, and the
database is just an optimized access path to the most recent version of
the log." - B.-M Shueler, prominently quoted by Date,
Darwen and Lorentzos in their recent book
"Temporal Data and the Relational Model".

2. Not just when you have temporal data.
All data involving interval attributes in the key benefits
from decomposing 5NF into 6NF. Temporal data, handled as proposed by
Date, Darwen and Lorentzos, is a special case of interval attributes in
the key. (Temporal Data and the Relational Model, read e.g. pages
[172:177]). So yes, you need 6NF when dealing with temporal data, but
that is not its only purpose.

[Date's 6NF]


> It's not uncontroversial, by the way.

Could you share some of the controversy?

> ... Actually finding out what

> the elementary facts are is essentially the same as normalizing to 5NF.

That is, only when you exclude intervals as key-attributes.
When you allow intervals as key-attributes (and... why not?)
it maps to 6NF.

My take is that Date, Darwen and Lorentzos formulated 6NF the way they
did to make it fairly obvious that 6NF is more strict than PJNF (5NF)
(i.o.w. that every set of relations (relational variables) in 6NF is by
definition also in 5NF, so 6NF is another step on the lossless
decomposition ladder). However, until I see a counterexample -
preferably pizza-order related - I'll look at 6NF as an alternative
predicate for the INF, the irreducible normal form (loose definition:
just one non-key attribute) (BTW great acronym, don't you think? :-).


Alan

Apr 12, 2004, 9:48:04 AM
Relational theory as a data theory is analogous to democracy as a form of
government - it may not be perfect, but so far, there's nothing better in
most cases.

Denormalization in itself has nothing directly to do with OLAP, except that
one may denormalize more for an OLAP application than an OLTP application.
However, in OLAP, you are not necessarily denormalizing so much as
"re-normalizing", in that you are really developing a diiferent distribution
among entities for the same data, such as in a star schema. It's not
normalized, but it's not denormalized either. It's just different. I suppose
an argument could be made that (in the case of a star schema), you start
with a normalized schema, and then apply transformation rules (no, don't ask
me what they are- there are books on the topic) to transform it into a star
schema. Think about it- a basic star schema is essentially a giant
many-to-many linking table (the fact table) with a bunch of descriptive data
tables (dimensions).
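As a concrete (and entirely made-up) sketch of that shape - one wide fact table keyed by the dimension keys, plus small descriptive dimension tables:

# Hypothetical star schema sketched as plain Python data (all names invented).
dim_date    = {20040412: {"year": 2004, "month": 4, "day": 12}}
dim_store   = {7:   {"city": "Des Moines", "region": "Midwest"}}
dim_product = {101: {"name": "Widget", "category": "Hardware"}}

# The fact table: essentially a many-to-many association of the dimension keys,
# carrying the numeric measures.
fact_sales = [
    {"date_key": 20040412, "store_key": 7, "product_key": 101, "qty": 3, "amount": 29.97},
]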


"Dawn M. Wolthuis" <dw...@tincat-group.com> wrote in message

news:c54c6j$cbq$1...@news.netins.net...

Dawn M. Wolthuis

Apr 12, 2004, 11:24:09 AM
"Alan" <al...@erols.com> wrote in message
news:c5e6m0$lo87$1...@ID-114862.news.uni-berlin.de...

> Realtional theory as a data theory is analagous to democracy as a form of
> government- it may not be perfect, but so far, there's nothing better in
> most cases.

Have you concluded this by reviewing some empirical data that has been
collected or because you adhere to some philosophy or what? Could I state
something contradictory with as much logical backing?

> Denormalization in itself has nothing directly to do with OLAP, except
that
> one may denormalize more for an OLAP application than an OLTP application.
> However, in OLAP, you are not necessarily denormalizing so much as
> "re-normalizing", in that you are really developing a diiferent
distribution
> among entities for the same data, such as in a star schema. It's not
> normalized, but it's not denormalized either. It's just different. I
suppose
> an argument could be made that (in the case of a star schema), you start
> with a normalized schema, and then apply transformation rules (no, don't
ask
> me what they are- there are books on the topic) to transform it into a
star
> schema. Think about it- a basic star schema is essentially a giant
> many-to-many linking table (the fact table) with a bunch of descriptive
data
> tables (dimensions).

Are you adhering to relational theory when deploying an OLAP database where
the data is in fact & dimension tables? I have not seen a star schema for
the purposes of OLAP without some rules broken such as duplication of data.
But I'm fine with it either way -- just curious whether most relational
theorists would view data modeled as OLAP cubes as following "the
rules". --dawn
<snip>


Eric Kaun

Apr 12, 2004, 12:11:37 PM
"Alan" <al...@erols.com> wrote in message
news:c54a0e$2ohurg$1...@ID-114862.news.uni-berlin.de...

> You are assuming that (good) normalization is a science. It is not. It is
> part science and part art- that's where experience (as well as ESP to read
> the user's minds and clairvoiance to predict future needs) comes in to
play.
> Oh, it is also part voodoo. Sometimes waving a dead chicken in a paper bag
> over your head produces the results you need.

It might not be science, but it's at least a discipline based on logic
(specifically functional dependencies). It's always going to require
interpretation with respect to the domain being modeled, because we're
trying to model part of reality, which is messy, in such a way that we (and
computers) can extract meaningful data, which requires clarity.

That's all a far cry from voodoo, unless you're defining voodoo as
everything which is not science. And you might be surprised what real
science is like...

> By the way, the process of
> putting it back together is called denormalization,

Putting it back together implies that information was lost during
normalization, which isn't the case - in fact, the normalized schema doesn't
risk data loss (e.g. inconsistency) the way a denormalized schema does.

- erk


Eric Kaun

Apr 12, 2004, 12:16:02 PM
"Dawn M. Wolthuis" <dw...@tincat-group.com> wrote in message
news:c54c6j$cbq$1...@news.netins.net...

> "Alan" <al...@erols.com> wrote in message
> news:c54a0e$2ohurg$1...@ID-114862.news.uni-berlin.de...
> > You are assuming that (good) normalization is a science. It is not. It
is
> > part science and part art- that's where experience (as well as ESP to
read
> > the user's minds and clairvoiance to predict future needs) comes in to
> play.
> > Oh, it is also part voodoo. Sometimes waving a dead chicken in a paper
bag
> > over your head produces the results you need.
>
> You are preachin' to the choir-ish -- that's the type of thing I would say
> if I were not trying, oh so hard, to learn what makes relational theorists
> tick and trying, oh so hard, to use the same way of thinking so that I can
> really learn what it is about relational theory that is keeping it the
king
> of the hill. There are very formalized statements,

I think it's formal, but would expect that any software developer with a bit
of patience could understand the math. Date writes in a very clear style,
and avoids complex explanations and excessive notations. But sometimes a
formula really is worth a thousand pictures (that's from Dijkstra).

> using very mathematical
> terminology and all, that show the process of normalization to be
> np-complete or whatever else makes some folks feel all warm and fuzzy (not
> the mathematical use of the term "fuzzy").

Yes, a close correlation with logic does make me (a programmer) feel warm
and fuzzy. I would have thought that if anything were a solid basis for
computing, it would be logic. I could, of course, be wrong.

> When I ask what it is about
> relational theory that makes it king, I hear that it is because it is
based
> on mathematics, including predicate logic.

Specifically a closer correlation with it than other models, which add
additional complexity with no additional computational or expressive power.
You can still apply logic (different ones) to those data structures, but
it's much harder. Witness the mathematical complexities of generalized graph
theory, for example. Relational is about avoiding complexity, not adding it.
It's not about bondage to mathematics for its own sake.

- Eric


Eric Kaun

Apr 12, 2004, 12:18:32 PM
"Alan" <al...@erols.com> wrote in message
news:c5e6m0$lo87$1...@ID-114862.news.uni-berlin.de...

> Realtional theory as a data theory is analagous to democracy as a form of
> government- it may not be perfect, but so far, there's nothing better in
> most cases.
>
> Denormalization in itself has nothing directly to do with OLAP, except
that
> one may denormalize more for an OLAP application than an OLTP application.
> However, in OLAP, you are not necessarily denormalizing so much as
> "re-normalizing", in that you are really developing a diiferent
distribution
> among entities for the same data, such as in a star schema. It's not
> normalized, but it's not denormalized either. It's just different.

That's just nonsense. At least have the courtesy to define new terms -
"normalize" was created and defined fairly precisely by the relational camp.
Star schemas duplicate information in the alleged name of performance, and
as such are less normalized, not just "different."

> I suppose
> an argument could be made that (in the case of a star schema), you start
> with a normalized schema, and then apply transformation rules (no, don't
ask
> me what they are- there are books on the topic) to transform it into a
star
> schema. Think about it- a basic star schema is essentially a giant
> many-to-many linking table (the fact table) with a bunch of descriptive
data
> tables (dimensions).

It's far more than that. The fact table typically combines many different
predicates, again in the name of performance (to avoid joins). So of course
it's not normalized. It's normalized if you squint really hard and bang your
head against the table a few times, then look at it.

- Eric


Eric Kaun

Apr 12, 2004, 12:19:59 PM
"Laconic2" <laco...@comcast.net> wrote in message
news:rY-dndm14aE...@comcast.com...

Normalization rules are tests, not a process, so you're right... but there
are procedures you can derive from the rules.


Laconic2

Apr 12, 2004, 12:52:16 PM
Your original question started me thinking in a new direction. While others
and I would like to keep the word "normalization" reserved for a relatively
specific concept, there's no reason why we can't start with individual data
values or variables, and compose our way up to schemas.

But first, I want to suggest that you can analyze a body of data values
(which I'll just call "items"), back to entities without regard to
relations as such.

The following are assumptions of mine, although some of them can be derived
from some minimal set of assumptions. You may agree or disagree.

Every (instance of a) data value specifies an attribute.

Every attribute has a domain, which is the set of values that can specify
the attribute.

Every attribute describes either a relationship or an entity.

Every relationship associates two or more entities.

The entities are discovered from the underlying ontology of some subject
matter.

Notice that, in the above, I've said nothing about columns, rows, tables,
schemas, tuples or relations. Or for that matter about databases, files,
lists, or the like. It's just a body of data consisting of values.

Now we can proceed to the question of composition. We can compose lists,
fields, records, files, columns, rows, tables. Take your pick, or go
consult the oracle at Delphi. The question is, when does it make sense to
compose data into a structure, and when does the composition do more harm
than good?

Laconic2

Apr 12, 2004, 1:43:22 PM

"Eric Kaun" <ek...@yahoo.com> wrote in message
news:6hzec.53531$YU6....@newssvr16.news.prodigy.com...

> Yes, a close correlation with logic does make me (a programmer) feel warm
> and fuzzy. I would have thought that if anything were a solid basis for
> computing, it would be logic. I could, of course, be wrong.

Le coeur a ses raisons que la raison ne connait point. ("The heart has its
reasons, of which reason knows nothing.")

Pardon my French (especially the spelling!)


Laconic2

Apr 12, 2004, 2:26:07 PM
I'm still struggling with something that both you and Eric seem to have
accepted, namely that normalization is prescriptive rather than
descriptive.

To me, the normalization "rules" are just tests to determine what normal
form describes the data.

It isn't necessarily a set of rules for "good doobies" to follow.


Alan

Apr 12, 2004, 3:43:58 PM
Normalization rules are Codd's rules, not God's rules. They are a _guide_ to
distributing data among entities, not a dogmatic recipe. You seem to want to
project a certain amount of dogmatism on everything, as if life were black
and white. It isn't, it's an infinite number of shades of gray (well 16,384
at least).

Did you hear about the programmer they found dead in the shower? He was
stiff, grasping a bottle of shampoo, his eyes apparently fixed on the
instructions, "Lather, rinse repeat."

Here's the general rule of thumb: Normalize to 3NF, and then see if that
works for you in your situation. If it doesn't, then denormalize or
normalize further. Iterate.

You ask, "Are you adhering to relational theory when deploying an OLAP
database where
the data is in fact & dimension tables?" Did you read my message? What did I
write about normalization and the star schema?

Data is not _modeled_ as "OLAP cubes". Cubes are an implementation,
modelling is analysis and maybe design.


"Dawn M. Wolthuis" <dw...@tincat-group.com> wrote in message

news:c5ecb7$e2f$1...@news.netins.net...

Eric Kaun

Apr 12, 2004, 3:56:05 PM
"Laconic2" <laco...@comcast.net> wrote in message
news:j4ydnTr4kfS...@comcast.com...

> Your original question started me thinking in a new direction. While
others
> and I would like to keep the word "normalization" reserved for a
relatively
> specific concept, there's no reason why we can't start with individual
data
> values or variables, and compose our way up to schemas.
>
> But first, I want to suggest that you can analyze a body of data values
> (which I'll just call "items"), back to entities without regard to
> relations as such.
>
> The following are assumptions of mine, although some of them can be
derived
> from some minimal set of assumptions. You may agree or disagree.
>
> Every (instance of a) data value specifies an attribute.

I honestly don't have any idea what this means. When you say "instance of a
data value", do you mean a specific physical representation? An appearance
or specific encoding? 7 is a value - how does that figure into the above?

I really don't know what "specifies an attribute" means. An attribute
typically "has" a name and a type.

> Every attribute has a domain, which is the set of values that can specify
> the attribute.

"Specify" is again confusing. An attribute is of a specific type (domain),
and therefore something (we'll remain deliberately vague here) "with" that
attribute will "have" a value chosen from the set designated by the type
(domain), subject also to additional constraints.

> Every attribute describes either a relationship or an entity.

1. What's the difference between a relationship and an entity?
2. Can an attribute "describe" more than one? For example, an attribute
called Name of type CHARACTER (unlimited length) - can that attribute be
used by multiple "things"?

> Every relationship associates two or more entities.

Association meaning any relationship at all? Since a relationship can have
attributes, and entities can have attributes, what's the difference?

> The entities are discovered from the underlying ontology of some subject
> matter.

Oh how I loathe the word "ontology" - much like "methodology", it's been
absconded with and sorely molested.

> Notice that, in the above, I've said nothing about columns, rows, tables,
> schemas, tuples or relations. Or for that matter about databases, files,
> lists, or the like. It's just a body of data consiting of values.
>
> Now we can proceed to the question of composition.

But you already have - for example, you've described the composition of
relationships, and something about their attributes. What separates that
level of composition from the one you're about to describe below?

> We can compose lists,
> fields, records, files, columns, rows, tables. Take your pick, or go
> consult the oracle at Delphi. The question is, when does it make sense
to
> compose data into a structure, and when does the composition do more harm
> than good?

Your basic definitions are of course the building blocks of other things,
but the above needs some solidification. I would say it makes sense to
compose data into a structure when that structure is what you want - in
order to specify what you want, you need some sort of query or specification
language, along with rules for how scalars are referenced by propositions
(or something like them), and how those propositions and scalars are
composed with others to build something else (e.g. the structure you're
looking for, or something along the path to it).

- Eric


Dawn M. Wolthuis

Apr 12, 2004, 3:59:40 PM
"Alan" <al...@erols.com> wrote in message
news:c5erh9$qt9b$1...@ID-114862.news.uni-berlin.de...

> Normalization rules are Codd's rules, not God's rules. They are a _guide_
to
> distributing data among entities, not a dogmatic recipe. You seem to want
to
> project a certain amount of dogmatism on everything, as if life were black
> and white. It isn't, it's an infinite number of shades of gray (well
16,384
> at least).

Who, me? Perhaps it comes across that way because I am new to doing any sort of
significant study of database theory and compared to everything I have
worked on before, relational theory IS very tightly pre/described with a
mathematical model. The model is tight but that doesn't mean it is equally
tightly implemented. I'll agree completely with the infinite number of
shades of gray -- I'm guessing even uncountably infinite.

> Did you hear about the programmer they found dead in the shower? He was
> stiff, grasping a bottle of shampoo, his eyes apparently fixed on the
> instructions, "Lather, rinse repeat."
>
> Here's the general rule of thumb: Normalize to 3NF, and then see if that
> works for you in your situation. If it doesn't, then denormalize or
> normalize further. Iterate.

But I don't want to put the data in 1NF -- there is no reason to do so from
my perspective. Since all other normal forms require the data to be in 1NF
first, that pretty much kills the process as it is written. However, since
I can put data into (2NF - 1NF) and (3NF - 1NF) that is what I do and then
proceed as you describe to refactor the model until it fits.

> You ask, "Are you adhering to relational theory when deploying an OLAP
> database where
> the data is in fact & dimension tables?" Did you read my message? What did
I
> write about normalization and the star schema?

I'll re-read and see if it is clearer, but it seems to me that you were making
a pitch for OLAP data being "relational" in some way. Sorry if I
misunderstood.

> Data is not _modeled_ as "OLAP cubes". Cubes are an implementation,
> modelling is analysis and maybe design.

Cubes could be an implementation, but they could also be used to model the
data. There is not always a need to take the data, put it into a relational
data model and reform it for OLAP -- one could go from requirements to OLAP
data model, right? --dawn

Alan

Apr 12, 2004, 4:03:24 PM
Haven't you ever heard of a joke (voodoo)?

"Putting it back together" was not my concept, it was Dawn's, and I was
explaining what the correct term is for what she was describing. I made no
mention of lossy joins, etc.

The final product (a physical implementation) is not just based on the
domain being modeled, though, as you point out, it is based on a certain
reality that must be considered. The implementation of a
normalized/denormalized schema also considers physical performance factors.

BTW, the best scientists do more than just follow rules. They show a great
deal of creativity, intuition, and insight. A bit like voodoo. _You_ might
be surprised...


"Eric Kaun" <ek...@yahoo.com> wrote in message

news:Zczec.53528$aU6....@newssvr16.news.prodigy.com...

Alan

Apr 12, 2004, 4:34:03 PM
See in line... --->


"Dawn M. Wolthuis" <dw...@tincat-group.com> wrote in message

news:c5esfv$hi$1...@news.netins.net...


> "Alan" <al...@erols.com> wrote in message
> news:c5erh9$qt9b$1...@ID-114862.news.uni-berlin.de...
> > Normalization rules are Codd's rules, not God's rules. They are a
_guide_
> to
> > distributing data among entities, not a dogmatic recipe. You seem to
want
> to
> > project a certain amount of dogmatism on everything, as if life were
black
> > and white. It isn't, it's an infinite number of shades of gray (well
> 16,384
> > at least).
>
> Who, me? Perhaps it comes across that way bz I am new to doing any sort
of
> significant study of database theory and compared to everything I have
> worked on before, relational theory IS very tightly pre/described with a
> mathematical model. The model is tight but that doesn't mean it is
equally

---> In the strictest sense, yes, as a theory, but in practice, after the
rules are followed, you go back and bend them quite often. I think of it as
a starting point, not an ending point- perhaps that's the explanation you're
looking for.

> tightly implemented. I'll agree completely with the infinite number of
> shades of gray -- I'm guessing even uncountably infinite.
>
> > Did you hear about the programmer they found dead in the shower? He was
> > stiff, grasping a bottle of shampoo, his eyes apparently fixed on the
> > instructions, "Lather, rinse repeat."
> >
> > Here's the general rule of thumb: Normalize to 3NF, and then see if that
> > works for you in your situation. If it doesn't, then denormalize or
> > normalize further. Iterate.
>
> But I don't want to put the data in 1NF -- there is no reason to do so
from
> my perspective. Since all other normal forms require the data to be in
1NF
> first, that pretty much kills the process as it is written. However,
since
> I can put data into (2NF - 1NF) and (3NF - 1NF) that is what I do and then
> proceed as you describe to refactor the model until it fits.
>

---> Ah, the rules DEscribe the data as being in 1NF first, they do not
PREscribe the data to be in 1NF first. You find the data however it is, and
then move it around as necessary. The data may happen to be in 1NF when you
start, but maybe it isn't. Often, especially after some experience, you
start out by placing the data you are familiar with in 3NF, and then see
where the rest goes. For example: an HR database - well, you know you have EMP
and DEPT, and you have a pretty good idea what's going to go in them. No
sense starting in 1NF when you already know where you're going. OTOH, if you
are working in a domain about which you know very little, you have no choice
but to start by asking a lot of questions and placing the data in 1NF, and
take it from there. Note that this is coming from an implementer's POV, not a
theorist. The theorists are probably ready to explode by now.


> > You ask, "Are you adhering to relational theory when deploying an OLAP
> > database where
> > the data is in fact & dimension tables?" Did you read my message? What
did
> I
> > write about normalization and the star schema?
>
> I'll re-read and see if it clearer, but it seems to me that you were
making
> a pitch for OLAP data being "relational" in some way. Sorry if I
> misunderstood.

---> No, I was making a pitch that it is not necessary, and can even be
confusing, to view OLAP data as being relational in some way. Read Ralph
Kimball (the "father" of data warehouses). He refers to what we are talking
about as "dimensional modeling", and as something that is _applied to_
realtional databases. Note that he is _not_ talking about realtional
modeling.


>
> > Data is not _modeled_ as "OLAP cubes". Cubes are an implementation,
> > modelling is analysis and maybe design.
>
> Cubes could be an implementation, but they could also be used to model the
> data. There is not always a need to take the data, put it into a
relational
> data model and reform it for OLAP -- one could go from requirements to
OLAP
> data model, right? --dawn


---> Yes and no. How would you model an eight (or more) dimension cube? I
don't think it is practical, as models are used for analysis and design. You
do build a model of the cube (using various cube-creation software) before
you actually create the cube, but you really can't easily model from scratch
(perform analysis) with it. You can readily model a star schema, however,
and you then build the cube model from a star schema. While you can take
"loose" data and turn it into a star schema, the practical reality is that
you are almost always starting out with an existing relational database, so
it is wise to discuss it from that point of view.

Eric Kaun

Apr 12, 2004, 4:55:14 PM
"Alan" <al...@erols.com> wrote in message
news:c5esln$suso$1...@ID-114862.news.uni-berlin.de...

> Haven't you ever heard of a joke (voodoo)?

Sorry, I did understand that part was a joke, but had my sense of humor
turned off... I was just implying that it was "closer" to science than to
art, due to the principles and definitions that comprise normalization.

> "Putting it back together" was not my concept, it was Dawn's, and I was
> explaining what the correct term is for what she was describing. I made no
> mention of lossy joins, etc.
>
> The final product (a physical implementation) is not just based on the
> domain being modeled, though, as you point out, it is based on a certain
> reality that must be considered. The implementation of a
> normalized/denormalized schema also considers physical performance
factors.
>
> BTW, the best scientists do more than just follow rules. They show a great
> deal of creativity, intuition, and insight. A bit like voodoo. _You_ might
> be surprised...

Hmmm. That's actually the point I was trying to make - that the process of
science differs from its outcome, and that creativity and intuition play a
role in launching the processes, etc. etc.

I did a very poor job expressing myself below...

- erk

Eric Kaun

Apr 12, 2004, 4:58:52 PM
"Alan" <al...@erols.com> wrote in message
news:c5erh9$qt9b$1...@ID-114862.news.uni-berlin.de...

> Normalization rules are Codd's rules, not God's rules. They are a _guide_
to
> distributing data among entities, not a dogmatic recipe.

They're more than a guide, though certainly not divine law. They're based on
some fairly black-and-white rules, once you can decide on what propositions
you're modeling, and give far more solid advice than other modeling
approaches (entity-relationship, anything XML-ish, etc.), which are far
fuzzier.

> You seem to want to
> project a certain amount of dogmatism on everything, as if life were black
> and white.

Dawn? I never got that impression from her - more the opposite (she's far
more situationally-minded than I am, for example).

> It isn't, it's an infinite number of shades of gray (well 16,384 at
least).

Yes, there is ambiguity in life, and there are choices in data and systems
design. That doesn't imply all of them are of equal merit.

> Here's the general rule of thumb: Normalize to 3NF, and then see if that
> works for you in your situation.

So what's the situation? The app you're working on? The needs of the
business? How long a timeframe?

> If it doesn't, then denormalize or normalize further. Iterate.

Uh... ok. Sound advice (cough).

- erk


Dawn M. Wolthuis

Apr 12, 2004, 5:23:21 PM
"Alan" <al...@erols.com> wrote in message
news:c5euf6$v5g3$1...@ID-114862.news.uni-berlin.de...

I don't think that is the general opinion of relational theorists, but even
given your approach, I'm contesting the "starting point". I don't mind
having some logically-based rules from which to deviate, but I'd rather
start with some closer to what I want to end up with and 1NF is my biggest
issue of the normalization "rules" from relational theory.

Data is rarely in 1NF when you start looking at a problem domain. And I
have "no choice" but to put the data in 1NF under certain circumstances? I
beg to differ -- I can name that tune with 2 & 3NF sans 1NF, but I'm not
saddled with an RDBMS as the target environment. And, by the way, the
theorists on the list have been, for the most part, generous in handling the
questions of this "implementor" with relatively minor explosions ;-)

> > > You ask, "Are you adhering to relational theory when deploying an OLAP
> > > database where
> > > the data is in fact & dimension tables?" Did you read my message? What
> did
> > I
> > > write about normalization and the star schema?
> >
> > I'll re-read and see if it clearer, but it seems to me that you were
> making
> > a pitch for OLAP data being "relational" in some way. Sorry if I
> > misunderstood.
>
> ---> No, I was making a pitch that it is not necessary, and can even be
> confusing, to view OLAP data as being relational in some way. Read Ralph
> Kimball (the "father" of data warehouses). He refers to what we are
talking
> about as "dimensional modeling", and as something that is _applied to_
> relational databases. Note that he is _not_ talking about relational
> modeling.

Yes, and, in fact, Kimball's "dimensional modeling" is what I was referring
to when talking about modeling cubes. Modeling stars or snowflakes for fact
& dimension tables IS cube-modeling.

>
> >
> > > Data is not _modeled_ as "OLAP cubes". Cubes are an implementation,
> > > modelling is analysis and maybe design.
> >
> > Cubes could be an implementation, but they could also be used to model
the
> > data. There is not always a need to take the data, put it into a
> relational
> > data model and reform it for OLAP -- one could go from requirements to
> OLAP
> > data model, right? --dawn
>
>
> ---> Yes and no. How would you model an eight (or more) dimension cube?

With a fact table with an 8-part key pointing to eight dimension tables (or
something like that).

>I
> don't think it is practical, as models are used for analysis and design.
You
> do build a model of the cube (using various cube-creation software) before
> you actually create the cube, but you really can't easily model from
scratch
> (perform analysis) with it. You can readily model a star schema, however,
> and you then build the cube model from a star schema. While you can take
> "loose" data and turn it into a star schema, the practical reality is that
> you are almost always starting out with an existing relational database,
so
> it is wise to discuss it from that point of view.
>

Seems like a terminology issue, so I'll use yours -- in that case I was
referring to dimensional modeling of data, which is not the same as
relational modeling and is also, typically, not used for the same purposes.
Oh, wouldn't it be oh-so-grand if we could model the data and use that
same model for both OLTP & OLAP? Although the modeling for PICK is neither
relational modeling (because it is not in 1NF) nor dimensional modeling,
PICK software developers seem to get away without building data marts or
warehouses, using the operational data, including stored historical data,
for most of their reporting needs. This is, in part, due to the fact that
many PICK shops are in the small-to-midsize business category rather than
in large businesses. But there might be something more to it: that there
isn't as much need to take the data and reformat it for reporting/analysis
purposes. Hmmmm. I'll have to think more about that. --dawn

<snip>


Laconic2

unread,
Apr 13, 2004, 7:57:04 AM4/13/04
to
Eric,

I am sure that what I wrote could be worded much better, but I'm going to
plead not guilty to word molestation. Next thing, you'll want a law
requiring pedants to register with the local police.

I hesitate to try to fix up the wording, because I don't know if the missed
communication between you and me is solely a matter of word choice.

Let me try just the beginning,

A body of data is made up of data items. Each data item expresses a value.
Each data item also specifies the state of an instance of an attribute.

It's unfortunate that the word "attribute" is used in both relational
modeling and ER modeling. I think both uses of the word are fair, but they
sometimes confuse the issue.


Laconic2

unread,
Apr 13, 2004, 8:09:58 AM4/13/04
to
It's easier to form desired sets if you start with 1NF. That's why Codd
described 1NF as "normal form".
By "easier" I don't mean fewer machine cycles. I mean conceptually easier.

Thinking in terms of sets is basic to relational transforms of data.

It's easy to turn a body of data in 1NF into a hierarchical report. Even
the Datatrieve report writer would do that, way back before there were any
relational DBMS products on the VAX!
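To make "conceptually easier" concrete, here is a minimal sketch in Python
(the department/employee rows are made up) of turning flat, 1NF-style data
into a hierarchical report just by sorting and grouping on a key:

from itertools import groupby
from operator import itemgetter

# Flat, 1NF-style rows: (department, employee) -- hypothetical data
rows = [
    ("Sales", "Alice"),
    ("Support", "Carol"),
    ("Sales", "Bob"),
]

# Sort on the grouping key, then emit a simple hierarchical report.
for dept, members in groupby(sorted(rows, key=itemgetter(0)), key=itemgetter(0)):
    print(dept)
    for _, name in members:
        print("    " + name)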

Laconic2

unread,
Apr 13, 2004, 8:23:35 AM4/13/04
to
"The key, the whole key, and nothing but the key, so help me Codd."

Now there's an old chestnut!

Laconic2

unread,
Apr 13, 2004, 8:48:22 AM4/13/04
to
Dawn,

You can go the other way. Let's just agree not to call it normalization.

Let's say you start from a body of data items. I don't care whether these
items are pulled off a report, or a screen, or out of data files or
whatever. Let's say you've decided to group the items into columns, and
the columns into tables, and the tables into a schema.

Let me skip over the question of whether to put two data items in the same
column or not. It's an important question, and in the real world, it
should be addressed before the question of whether two columns belong in the
same table.

But let's say we have composed the data items into useful columns.

Now, when we ask, "shall we join these columns together", we can really
look at it three ways:

The first choice: the two columns are unrelated. Putting the two columns in
the same table would put two items next to each other that have no
relationship to each other. Leave them apart.

The second choice: go ahead and put them together in the base tables. In
other words, "materialize the join".

The third choice: use the foreign key mechanism to mark the relationship,
but leave the columns in separate tables. Anyone who knows the foreign keys,
and what they represent, and knows how to do an equijoin can paste them
together whenever necessary.

Which of these three choices is better?

Well, when you decide on composition, you can use normal forms and
knowledge of the FD's to determine what normal form the materialized join
will be in, and you can use that knowledge, along with other knowledge, to
decide whether the composition buys you more than it costs you.

There's more than one way to skin a cat, but some ways are better than
others.
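To make the third choice concrete, here is a toy sketch in Python (lists of
tuples standing in for tables; every name and number is made up). The columns
stay in separate tables, and an equijoin on the foreign key pastes them
together on demand -- the "materialized join" of the second choice would
simply store that joined result instead:

# Two "tables" kept apart, linked by a foreign key (dept_id)
employee = [          # (emp_id, name, dept_id)
    (1, "Alice", 10),
    (2, "Bob", 20),
]
department = [        # (dept_id, dept_name)
    (10, "Sales"),
    (20, "Support"),
]

# Equijoin on the foreign key, done only when somebody needs it
joined = [
    (emp_id, name, dept_id, dept_name)
    for (emp_id, name, dept_id) in employee
    for (d_id, dept_name) in department
    if dept_id == d_id
]
print(joined)   # [(1, 'Alice', 10, 'Sales'), (2, 'Bob', 20, 'Support')]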


Alan

unread,
Apr 13, 2004, 9:49:55 AM4/13/04
to

In line, again --->>>

"Dawn M. Wolthuis" <dw...@tincat-group.com> wrote in message

news:c5f1cm$qo5$1...@news.netins.net...


--->>> I wrote that when you are unfamiliar with the problem domain, you
will need to put the data in 1NF first. You simply cannot jump right to
3NF. If you are familiar with the problem domain, then, yes, it is possible
to go right to 3NF, to a degree that depends on your experience. The more
familiar you are, the more correct your 3NF will be.


>
> > > > You ask, "Are you adhering to relational theory when deploying an
OLAP
> > > > database where
> > > > the data is in fact & dimension tables?" Did you read my message?
What
> > did
> > > I
> > > > write about normalization and the star schema?
> > >
> > > I'll re-read and see if it clearer, but it seems to me that you were
> > making
> > > a pitch for OLAP data being "relational" in some way. Sorry if I
> > > misunderstood.
> >
> > ---> No, I was making a pitch that it is not necessary, and can even be
> > confusing, to view OLAP data as being relational in some way. Read Ralph
> > Kimball (the "father" of data warehouses). He refers to what we are
> talking
> > about as "dimensional modeling", and as something that is _applied to_
> > relational databases. Note that he is _not_ talking about relational
> > modeling.
>
> Yes, and, in fact, Kimball's "dimensional modeling" is what I was
referring
> to when talking about modeling cubes. Modeling stars or snowflakes for
fact
> & dimension tables IS cube-modeling.


----->>> If you have ever created a cube, you would know that creating a
star schema is not modeling a cube. A star schema is a logical
representation of the arrangement of the data in a certain way in entities.
A cube is but one (albeit perhaps the most common) possible physical
implementation of the star schema. We may actually agree in general -- we
could be getting hung up on semantics.

>
> >
> > >
> > > > Data is not _modeled_ as "OLAP cubes". Cubes are an implementation,
> > > > modelling is analysis and maybe design.
> > >
> > > Cubes could be an implementation, but they could also be used to model
> the
> > > data. There is not always a need to take the data, put it into a
> > relational
> > > data model and reform it for OLAP -- one could go from requirements to
> > OLAP
> > > data model, right? --dawn
> >
> >
> > ---> Yes and no. How would you model an eight (or more) dimension cube?
>
> With a fact table with an 8-part key pointing to eight dimension tables
(or
> something like that).

----->>> No, that is a star schema, not a cube. If you've ever created a
pivot table (minus the data to make it a model only) in Excel, you have an
idea of what a cube model begins to look like. Cognos Powerplay Transformer
is a cube modeler and constructor. Google it -- I haven't, but there is
probably a white paper with examples.

Laconic2

unread,
Apr 13, 2004, 10:39:05 AM4/13/04
to
Alan,

I once built a star schema to validate a Powerplay cube (by providing an
alternate route for the data from the OLTP system into the cube).

Here's what I didn't expect: it took less time to copy the data from the
OLTP system to the star schema, and then from the star schema into the cube,
than it took to load the cube directly from the OLTP data. I don't know
whether that was just a fluke or was to be expected.


Interesting discussion you are having concerning dimensional modeling,
relational modeling and star schemas.

I think of a star schema as a projection of a dimensional model into the
world of tables, columns etc. I don't think of a star schema as having been
particularly derived from a relational model.


Alan

unread,
Apr 13, 2004, 12:32:56 PM4/13/04
to
Powerplay is optimized to build a cube from a flat file (which would have
been extracted from a star schema). I view the star schema as a variation
of the relational model. The constructs and constraints are the same; they're
just assembled differently. It's a relational model that follows rules other
than normalization rules, IMO...


"Laconic2" <laco...@comcast.net> wrote in message

news:1-ednaJKnbb...@comcast.com...

Dawn M. Wolthuis

unread,
Apr 13, 2004, 1:23:18 PM4/13/04
to
"Laconic2" <laco...@comcast.net> wrote in message
news:7uadnZH0GZb...@comcast.com...

> It's easier to form desired sets if you start with 1NF. That's why Codd
> described 1NF as "normal form".
> By "easier" I don't mean fewer machine cycles. I mean conceptually
easier.

Would it be conceptually easier to talk that way too?
I would like a pizza with pepperoni on it
I would like a pizza with mozzarella cheese on it
I would like a pizza with feta cheese on it
I would like a pizza with a tomato-based sauce on it
I would like a pizza with a Chicago-style crust

There, I've got my talkin' normalized! We should teach children to talk
this way since it is conceptually so much easier, eh? smiles. --dawn

Dawn M. Wolthuis

unread,
Apr 13, 2004, 1:37:17 PM4/13/04
to
"Alan" <al...@erols.com> wrote in message
news:c5gr5c$1dbtj$1...@ID-114862.news.uni-berlin.de...

>
> In line, again --->>>
>
> "Dawn M. Wolthuis" <dw...@tincat-group.com> wrote in message
> news:c5f1cm$qo5$1...@news.netins.net...
> > "Alan" <al...@erols.com> wrote in message
> > news:c5euf6$v5g3$1...@ID-114862.news.uni-berlin.de...
> > > See in line... --->
> > >
> > >
> > > "Dawn M. Wolthuis" <dw...@tincat-group.com> wrote in message
> > > news:c5esfv$hi$1...@news.netins.net...
> > > > "Alan" <al...@erols.com> wrote in message
> > > > news:c5erh9$qt9b$1...@ID-114862.news.uni-berlin.de...
<snip>

> ----->>> If you have ever created a cube, you would know that creating a
> star schema is not modeling a cube. A star schema is a logical
> representation of the arrangement of the data in a certain way in
> entities. A cube is but one (albeit perhaps the most common) possible
> physical implementation of the star schema. We may actually agree in
> general -- we could be getting hung up on semantics.

I've created many cubes before, with varying products for implementation, but
I'm far from an expert in this field and am likely using incorrect
terminology. I consider cubes an implementation of a star schema, for
example, to the extent that I don't have a distinction between star schema
modeling and cube modeling except for user-defined fields (whether stored or
virtual) being added in for the cube (measures that are summed, averaged,
etc.).

But please continue to correct me until I have the terminology correct. I
think of doing the data modeling as dimensional modeling or star schema
modeling but "designing" (rather than modeling) for the implementation of a
cube. So I just used the term "cube modeling" for what gets done with data
modeling prior to cube design. Otherwise what would be the distinction
between dimensional modeling and cube modeling -- couldn't those terms be
used interchangeably?

>
> >
> > >
> > > >
> > > > > Data is not _modeled_ as "OLAP cubes". Cubes are an
implementation,
> > > > > modelling is analysis and maybe design.
> > > >
> > > > Cubes could be an implementation, but they could also be used to
model
> > the
> > > > data. There is not always a need to take the data, put it into a
> > > relational
> > > > data model and reform it for OLAP -- one could go from requirements
to
> > > OLAP
> > > > data model, right? --dawn
> > >
> > >
> > > ---> Yes and no. How would you model an eight (or more) dimension
cube?
> >
> > With a fact table with an 8-part key pointing to eight dimension tables
> (or
> > something like that).
>
> ----->>> No, that is a star schema, not a cube. If you've ever created a
> pivot table (minus the data to make it a model only) in Excel, you have an
> idea of what a cube model begins to look like. Cognos Powerplay
Transformer
> is a cube modeler and constructor. Google it- I haven't but there is
> probably a white paper with examples.

I haven't seen Cognos Powerplay Transformer, but I'm guessing it is more for
cube design, specification, and implementation than for the data modeling
that might precede those steps.

Isn't a star schema a data model for cube design and then cube
implementation? I have created a pivot table, so I have an idea what cube
DESIGN is. I've also created cubes for Data Beacon and other OLAP tools.
The "cube modeling" (even if technically called dimension modeling) is by
way of star schema, right? The "cube design" extends that model to identify
additional attributes based on measures and such.

I'm probably just muddying the waters, but I purposely called it "cube
modeling" since it was modeling for the future implementation in a cube,
where relational model is for future implementation in a relational
structure, for example. I'm open for correction in the concepts as well as
the terminology. Thanks. --dawn


Laconic2

unread,
Apr 13, 2004, 3:09:45 PM4/13/04
to
Dawn,

Don't be silly. I would no more normalize speech than teach people how to
count in hexadecimal or binary.

But just because decimal is more "intuitive" for people doesn't mean I
should build the numerical architecture of the computer around base 10.

Laconic2

unread,
Apr 13, 2004, 3:23:17 PM4/13/04
to
The terminology, by Kimball, refers to dimensional modeling. I would put
dimensional modeling on a par with ER modeling. It's abstract enough so
that it's not particularly tuned to any implementation.

A star schema is a dimensional model mapped into tables. This makes it easy
to implement in something like Oracle or DB2.

A cube is a dimensional model mapped into some kind of multidimensional
database system. Cognos powerplay is one example.

You can realize a dimensional model by building a cube, just as you can
realize an ER model by building a database.
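To make the mapping into tables concrete, here is a toy sketch in Python;
the table names, keys, and figures are all invented, and no particular
product is assumed. One fact table carries a composite key of foreign keys
into the dimension tables, plus additive measures, and a roll-up resolves
keys through a dimension:

# Dimension "tables"
date_dim    = {1: "2004-04-13", 2: "2004-04-14"}        # date_key -> calendar date
product_dim = {10: "widget", 20: "gadget"}              # product_key -> product name
store_dim   = {100: "Grand Rapids", 200: "Amsterdam"}   # store_key -> store name

# Fact table: composite key of dimension keys plus additive measures
sales_fact = [
    # (date_key, product_key, store_key, units, revenue)
    (1, 10, 100, 5, 50.0),
    (1, 20, 100, 2, 80.0),
    (2, 10, 200, 7, 70.0),
]

# A simple roll-up: total revenue by product
revenue_by_product = {}
for date_key, product_key, store_key, units, revenue in sales_fact:
    name = product_dim[product_key]
    revenue_by_product[name] = revenue_by_product.get(name, 0.0) + revenue
print(revenue_by_product)   # {'widget': 120.0, 'gadget': 80.0}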

However, it may be better to do the modeling first, then the building,
rather than the other way around, especially for large projects. But, at
first, we tend to do things backward. We learn how to program before we
learn how to design programs, and we learn how to design programs before we
learn how to analyze requirements. Funny the way we think, huh?


Alan

unread,
Apr 13, 2004, 3:28:24 PM4/13/04
to
Dawn,

Laconic2's explanation is essentially the same as mine. There is no further
clarification.


"Laconic2" <laco...@comcast.net> wrote in message

news:_LednVPTGP1...@comcast.com...

Jan Hidders

unread,
Apr 13, 2004, 3:54:04 PM4/13/04
to
mAsterdam wrote:
> Jan Hidders wrote:
>> mAsterdam wrote:
>
> [Date's 6NF]
>
>> It's not uncontroversial, by the way.
>
> Could you share some of the controversy?

Well, to start with, most experts on temporal databases that I know
don't agree with Date's PACK/UNPACK approach and consider these algebra
operators fundamentally flawed. This is because they don't correspond to
easily implementable operators like the classical algebra operations. As
it happens I don't agree completely with that criticism, but that is
another matter.
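(For anyone who hasn't met the operators: PACK, roughly speaking, coalesces
tuples that agree on everything except an interval attribute and whose
intervals overlap or meet, and UNPACK goes the other way, expanding intervals
into unit intervals. The following is a toy Python sketch of the packing idea
only, not Date's exact definition:)

# Toy "pack" over closed integer intervals (start, end): per value, coalesce
# the intervals that overlap or meet.
def pack(rows):
    by_value = {}
    for value, interval in rows:
        by_value.setdefault(value, []).append(interval)
    packed = []
    for value, intervals in by_value.items():
        intervals.sort()
        merged = [list(intervals[0])]
        for start, end in intervals[1:]:
            if start <= merged[-1][1] + 1:      # overlaps or meets the previous one
                merged[-1][1] = max(merged[-1][1], end)
            else:
                merged.append([start, end])
        packed.extend((value, (s, e)) for s, e in merged)
    return packed

print(pack([("S1", (1, 3)), ("S1", (4, 6)), ("S1", (9, 9)), ("S2", (2, 5))]))
# [('S1', (1, 6)), ('S1', (9, 9)), ('S2', (2, 5))]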

Keep in mind that just because Date writes something in his
"Introduction ..." or in of his other books, that does not mean that it
is widely accepted by the research community and all the other experts.
Usually it is the non-experts that are impressed the most by Chris
Date's writings. You may also notice that Date doesn't get the
definition of 5NF right in his "Introduction ...". Not a good start for
somebody who wants to tell the world how 6NF should be defined.... But I
digress.

Another criticism is that 6NF is too strict. If there are no intervals
anywhere, it still says you have to normalize to INF (all JDs trivial), and
not just to PJNF. Why would you do that? Sure, you could argue that we
always want access to historical information, so we always have an
interval, but why not be modest here and let the data modeler decide.

>> ... Actually finding out what the elementary facts are is essentially
>> the same as normalizing to 5NF.
>
> That is, only when you exclude intervals as key-attributes.
> When you allow intervals as key-attributes (and... why not?)
> it maps to 6NF.

Oops, I'm afraid I have to backpedal a little here, so it is my turn to
apologize. Splitting a relation into elementary facts is of course
equivalent to normalizing to INF which is evidently not the same as
normalizing to PJNF.

> My take is that Date, Darwen and Lorentzos formulated 6NF the way they
> did to make it fairly obvious that 6NF is more strict than PJNF (5NF)
> (i.o.w. that every set of relations (relational variables) in 6NF is by
> definition also in 5NF, so 6NF is another step on the lossless
> decomposition ladder). However, until I see a counterexample --
> preferably pizza-order related -- I'll look at 6NF as an alternative
> predicate for the INF, the irreducible normal form (loose definition:
> just one non-key attribute) (BTW, great acronym, don't you think? :-).

Hmmm, actually, you might be on to something there. My first thought was
that *clearly* Date's 6NF is strictly stronger than INF, but then I
realized the following. If there is a generalized JD (in Date's terms)
then that means there is a classical JD on the unpacked relation. But
from that it follows that the same JD also holds on the packed relation!
So for every generalized JD on the (packed) relation there also holds
the equivalent (w/o the ACL) classical JD on the (packed) relation. As a
consequence Date's 6NF and INF are always equivalent.

I wonder if Date et al. have realized that.

-- Jan Hidders

Dawn M. Wolthuis

unread,
Apr 13, 2004, 4:41:17 PM4/13/04
to
OK, I'll call your analogy and raise the fact that I would not then have my
software developers code directly in either of these -- binary or decimal.
Is it the computer that needs to have the data normalized or the people who
require that? My contention is that the people don't need it -- it does
nothing for THEM. Now, I'm NOT talking about 2nd and 3rd normal forms (or
even 5th for that matter) -- it is at least 1NF that is not needed by people
(and what follows from that is removing 4NF too, I think). --dawn

"Laconic2" <laco...@comcast.net> wrote in message

news:xYidnT8G3Z9...@comcast.com...

Dawn M. Wolthuis

unread,
Apr 13, 2004, 4:45:05 PM4/13/04
to
This terminology is just peachy -- I'll try to stick to it better.
Dimensional modeling --> Star Schema design (whether or not it is for a
relational database) --> Cube specification --> Cube construction.

It seems to me that the dimensional modeling is more like relational
modeling than ER modeling (which is more conceptual) but that is not a major
point and if I'm wrong on that small point, it pains me naught.
Thanks. --dawn

"Laconic2" <laco...@comcast.net> wrote in message

news:_LednVPTGP1...@comcast.com...

Jan Hidders

unread,
Apr 13, 2004, 4:56:39 PM4/13/04
to
Dawn M. Wolthuis wrote:
>
> Would it be conceptually easier to talk that way too?
> I would like a pizza with pepperoni on it
> I would like a pizza with mozzarella cheese on it
> I would like a pizza with feta cheese on it
> I would like a pizza with a tomato-based sauce on it
> I would like a pizza with a Chicago-style crust
>
> There, I've got my talkin' normalized! We should teach children to talk
> this way since it is conceptually so much easier, eh?

That's not what the relational model claims. The idea is that if you
have multiple application that look at the data in different ways, then
it could be that some application would like to see the data nested like
this:

{ (ordered-pizza, { (ingredients) }) }

and another one might want

{ (ingredient, { (ordered-pizza) }) }

In that case it is easier and more practical to formulate the central
conceptual data model in a flat way, so that it does not favour any of
the applications inadvertently, and also does not favour a specific
physical implementation inadvertently. Data-independence and all that.
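A minimal sketch of that point in Python (the pizza data is made up): one
flat relation, from which either nesting can be derived on demand:

# One flat relation: (ordered_pizza, ingredient) pairs
ordered = [
    ("pizza-1", "pepperoni"),
    ("pizza-1", "mozzarella"),
    ("pizza-2", "feta"),
    ("pizza-2", "mozzarella"),
]

# Nesting 1: { (ordered-pizza, { (ingredient) }) }
by_pizza = {}
for pizza, ingredient in ordered:
    by_pizza.setdefault(pizza, set()).add(ingredient)   # each pizza -> its ingredients

# Nesting 2: { (ingredient, { (ordered-pizza) }) }
by_ingredient = {}
for pizza, ingredient in ordered:
    by_ingredient.setdefault(ingredient, set()).add(pizza)   # each ingredient -> its pizzas

print(by_pizza)
print(by_ingredient)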

In theory, anyway. ;-)

-- Jan Hidders

mAsterdam

unread,
Apr 13, 2004, 5:10:56 PM4/13/04
to
Dawn M. Wolthuis wrote:

> I would like a pizza with pepperoni on it
> I would like a pizza with mozzarella cheese on it
> I would like a pizza with feta cheese on it
> I would like a pizza with a tomato-based sauce on it
> I would like a pizza with a Chicago-style crust

Four pizza's comin' up! ;-)

Dawn M. Wolthuis

unread,
Apr 13, 2004, 5:40:17 PM4/13/04
to
I counted five, but otherwise we are on the same wavelength.
smiles. --dawn

"mAsterdam" <mAst...@vrijdag.org> wrote in message
news:407c5762$0$570$e4fe...@news.xs4all.nl...

Eric Kaun

unread,
Apr 13, 2004, 6:18:28 PM4/13/04
to
"Laconic2" <laco...@comcast.net> wrote in message
news:PJmdnXgEZ9L...@comcast.com...

> Eric,
>
> I am sure that what I wrote could be worded much better, but I'm going to
> plead not guilty to word molestation. Next thing, you'll want a law
> requiring pedants to register with the local police.
>
> I hesitate to try to fix up the wording, because I don't know if the
> missed communication between you and me is solely a matter of word choice.

I'm not sure whether it's word choice or the difficulties in choosing basic
definitions and attendant axioms. Am I the only one confused by the
definitions?

> Let me try just the beginning,
>
> A body of data is made up of data items.

So data is the plural of "data item"? Does the body itself have additional
attributes, or is it just a collection of some sort?

> Each data item expresses a value.

What does it mean to "express a value" - does that mean a data item has a
value, or is a value, or points to a value?

> Each data item also specifies the state of an instance of an attribute.

So a data item has 2 parts: a value that it expresses, plus the state of an
instance of an attribute... I'm lost already. I get a hint of a whiff of
what you're suggesting, but that's all. I'm not being deliberately overly
pedantic here either.

Typically an attribute has a type and a name. Several attributes can have
the same type, and an attribute can be referenced in several "places" (e.g.
data items - maybe?). Attributes can be attached to something (and here's
where it gets tricky), and each occurrence of such a something (which of
necessity also has a type) has a value for each of its attributes. Each such
value must be a member of the set defined by its attribute's type.
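Here is a minimal sketch in Python of the reading I have in mind; the class
and attribute names are mine, not anyone's standard vocabulary:

# An attribute pairs a name with a type; a value supplied for that attribute
# must belong to the set the type defines.
class Type:
    def __init__(self, name, members):
        self.name = name
        self.members = frozenset(members)

class Attribute:
    def __init__(self, name, type_):
        self.name = name
        self.type = type_

    def accepts(self, value):
        return value in self.type.members

color = Type("Color", {"red", "green", "blue"})
shirt_color = Attribute("shirt_color", color)   # two attributes sharing one type
wall_color  = Attribute("wall_color", color)

print(shirt_color.accepts("green"))   # True
print(wall_color.accepts("plaid"))    # False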

Given all that, I'm not sure how it matches up with your use of "state" and
"instance of an attribute." I could guess, but...

> It's unfortunate that the word "attribute" is used in both relational
> modeling and ER modeling. I think both uses of the word are fair, but
they
> sometimes confuse the issue.

Agreed. Guess there are just too few (English) words to go around.

- erk


mAsterdam

unread,
Apr 13, 2004, 7:59:06 PM4/13/04
to
Jan Hidders wrote:

[Date's 6NF]


> mAsterdam wrote:
>> Jan Hidders wrote:
>>> It's not uncontroversial, by the way.
>> Could you share some of the controversy?
>
> Well, to start with, most experts on temporal databases that I know
> don't agree with Date's PACK/UNPACK approach and consider these algebra
> operators fundamentally flawed. This is because they don't correspond to
> easily implementable operators like the classical algebra operations. As
> it happens I don't agree completely with that criticism, but that is
> another matter.

Maybe we should just write them - or is the criticism about
something else?

> Keep in mind that just because Date writes something in his
> "Introduction ..." or in of his other books, that does not mean that it
> is widely accepted by the research community and all the other experts.
> Usually it is the non-experts that are impressed the most by Chris
> Date's writings. You may also notice that Date doesn't get the
> definition of 5NF right in his "Introduction ...". Not a good start for
> somebody who wants to tell the world how 6NF should be defined.... But I
> digress.

Heh -- when I read that chapter a long time ago I constructed
an example myself (about tracking manufactured parts placed by
subcontractors at projects for guarantee litigation), and
I could not map it to Date's explanation. I never got around
to discussing it with another reader :-)

> Another criticism is that 6NF is too strict. If there are no intervals
> anywhere, it still says you have to normalize to INF (all JDs trivial), and
> not just to PJNF. Why would you do that? Sure, you could argue that we
> always want access to historical information, so we always have an
> interval, but why not be modest here and let the data modeler decide.

See below.

I think they did (just a hunch).
But it may depend on *not* letting the modeller decide.
My bet is: there are more books in the making :-)

Jan Hidders

unread,
Apr 14, 2004, 3:42:08 PM4/14/04
to
Jan Hidders wrote:
> Jan Hidders wrote:
>> [...] The usual algorithm that gets you to 3NF in one step (the one
>> using the minimal cover) splits as little as possible. See for example
>> sheet 46 on:
>>
>> http://cs.ulb.ac.be/cours/info364/relnormnotes.pdf
>
>
> Did anyone notice that this algorithm is actually not correct? Take the
> following example of a relation R(A,B,C,D,E) with the set of FDs:
>
> { AB->C, AB->D, BC->D }
>
> It is clear that the relation ABCD is not in 3NF. Since the set of FDs
> is already a minimal cover, the resulting decomposition is:
>
> { ABCD, BCD }
>
> But that gives us our old relation back (plus a projection) so this is
> definitely not in 3NF.

As was pointed out to me by Ramez Elmasri, the counterexample is not
correct since the set of FDs is not a minimal cover. The reason for this
is that AB->D can be derived from AB->C and BC->D. So a proper minimal
cover would be

{ AB->C, BC->D }

and that leads to the decomposition

{ ABC, BCD }

which is indeed in 3NF.
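For anyone who wants to check the AB->D point mechanically, here is a small
attribute-closure routine in Python (the standard textbook algorithm; the
variable names are mine):

# Attribute closure: which attributes does `attrs` determine under `fds`?
def closure(attrs, fds):
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

fds = [("AB", "C"), ("BC", "D")]   # the proposed minimal cover
print(closure("AB", fds))          # contains A, B, C and D (set order may vary)

Since the closure of AB under { AB->C, BC->D } already contains D, the
dependency AB->D in the original set was indeed redundant.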

I now officially declare this thread closed and will stop replying to
myself. :-)

-- Jan Hidders

Dan

unread,
Apr 14, 2004, 11:34:41 PM4/14/04
to
Jan Hidders <jan.h...@REMOVETHIS.pandora.be> wrote in message news:<kugfc.71448$OG1.4...@phobos.telenet-ops.be>...
I know this thread is officially closed, though I'm disappointed
because I like the discourses you hold with yourself.

Date in his *Intro* textbook also demonstrates finding minimal cover
using Armstrong's axioms (versus the algorithm), correct?

Thanks,

Dan
> -- Jan Hidders

Jan Hidders

unread,
Apr 15, 2004, 3:43:05 PM4/15/04
to
Dan wrote:
>
> Date in his *Intro* textbook also demonstrates finding minimal cover
> using Armstrong's axioms (versus the algorithm), correct?

I'm not sure what you mean by "versus the algorithm", or do you mean
"for the algorithm"? He does indeed use the the concept of minimal cover
and calls it an irreducible cover, and uses it to present the algorithm
for getting to 3NF in one step.

-- Jan Hidders

Dan

unread,
Apr 15, 2004, 6:12:09 PM4/15/04
to

"Jan Hidders" <jan.h...@REMOVETHIS.pandora.be> wrote in message
news:dBBfc.72339$xu1.4...@phobos.telenet-ops.be...

Ahh, well, it's been many years now since my most thorough read of the 7th
edition. I did not recall seeing this algorithm in chapter 10 on functional
dependencies, but if I had done exercise 11.3 and then checked the
answer, I would have found the algorithm prominently displayed. :-)

Thanks. I learn something new every time.

Dan

