Who first (publicly) asserted 3NF is "good enough"?

Roy Hann

unread,

Sep 18, 2006, 5:18:01 AM9/18/06

to

It has always seemed to me that nothing looks more obviously and intuitively
wrong than a table that is in third normal form (3NF) but not also 4NF.
Furthermore I rarely see violations of 4NF "in the wild" in databases
intended to support core business transaction processing (as opposed to say
DBs for decision support or data warehousing). This makes me think even
ill-informed database designers can immediately spot the problem and avoid
it even if they don't know what it's called.

Assuming I'm right about the above it is baffling that one so frequently
sees books and articles in which it is asserted that 3NF is "good enough".
I assume the authors don't actually know what 4NF (and 5NF) is and they are
quoting and re-quoting some original source that gave them permission to
ignore 4NF etc.

It's probably far too late to get a confident answer now, but does anyone
know if there was a single authoritative writer (long ago) who expressed
this foolish idea?

Roy

David Cressey

unread,

Sep 18, 2006, 7:20:58 AM9/18/06

to

"Roy Hann" <spec...@processed.almost.meat> wrote in message
news:iqKdneSFqv3...@pipex.net...

I heard it as early as 1984. I can't tell you who I heard it from, but it
was someone who knew a lot more about it than I did at the time.

Bob Badour

unread,

Sep 18, 2006, 9:15:22 AM9/18/06

to

Roy Hann wrote:

It's not so much that 3nf is good enough as it is that almost all 3nf
designs are also in 5nf. Violating the higher normal forms requires
compound keys and complex dependencies, and designers tend to avoid both.

Having a thorough understanding of 3nf may be good enough as long as one
also knows to open a reference text at the first hint of a complex
dependency or a compound key.

Roy Hann

unread,

Sep 18, 2006, 11:38:20 AM9/18/06

to

"Bob Badour" <bba...@pei.sympatico.ca> wrote in message
news:KZwPg.22209$9u.2...@ursa-nb00s0.nbnet.nb.ca...

>> It's probably far too late to get a confident answer now, but does anyone
>> know if there was a single authoritative writer (long ago) who expressed
>> this foolish idea?
>
> It's not so much that 3nf is good enough as it is that almost all 3nf
> designs are also in 5nf.

Indeed. That is almost verbatim what I tell people too.

I wanted to know if there was some specific culprit I could name and shame.
I am happy enough to laboriously work through the whys and wherefores if
challenged, but in in the interests of brevity in a presentation I'm doing
on Wednesday I want to just slander someone and move on.

Roy

Alfredo Novoa

unread,

Sep 18, 2006, 1:00:50 PM9/18/06

to

Bob Badour wrote:

> Having a thorough understanding of 3nf may be good enough as long as one
> also knows to open a reference text at the first hint of a complex
> dependency or a compound key.

I think that it is a lot better to use BCNF.

BCNF is "better" than 3NF and even easier to check IMO.

Regards
Alfredo

David Cressey

unread,

Sep 18, 2006, 4:16:53 PM9/18/06

to

"Roy Hann" <spec...@processed.almost.meat> wrote in message

news:fpednXTGmYH...@pipex.net...

I think I know the name of the person that first gave me the idea that 3NF
is "good enough". It's Jeff Tash.

Back in 1983-1984 when I was first learning relational database, I was an
employee at Digital, and so was Jeff.
Jeff was teaching an internal 3 day seminar on Rdb and DSRI at that time,
and I'm pretty sure it was him.

So, if you need someone to blame, he's my nominee.

But what Bob said is true. 3NF is 5NF except for complex cases.

-CELKO-

unread,

Sep 18, 2006, 5:55:36 PM9/18/06

to

>> .. one so frequently sees books and articles in which it is asserted that 3NF is "good enough" .. does anyone know if there was a single authoritative writer (long ago) who expressed this foolish idea? <<

I am not sure the source of the myth, I have a feeling it goes back to
the early forms of ER diagrams which did not have much in the way of
showing compound keys or "tricky" relationships. A lot of the early ER
tools could only go to 3NF, so maybe that was where it started.

And, as Bob pointed out, for a lot of commercial apps, there is an
industry standard or internally defined key that you have to use (UPC,
EAN, VIN, ISBN, etc.), so most of the tables have a simple key. Also,
if a table is all key, then it is in 5NF.

When I have to teach a class, I pull out a (buyer, seller, lender)
table that demonstrates JPNF problems and a (classroom,course, teacher,
period) table where any three columns are a key for overlapping UNIQUE
constraints.

Johnny

unread,

Sep 18, 2006, 11:00:04 PM9/18/06

to

A good question! Another question as a followup. Have you
actually seen a 3NF model in production? Or have you only
seen it as the logical represenation of a physical database?
I'm still waiting to find a database in 3NF. Apparently
3NF isn't good enough.

Bob Badour

unread,

Sep 19, 2006, 12:46:48 AM9/19/06

to

Alfredo Novoa wrote:

Would the real 3nf please stand up? BCNF is what 3nf was intended to be
except somebody goofed. If asked to describe 3nf, most folks would
describe BCNF instead.

Jan Hidders

unread,

Sep 19, 2006, 4:51:38 AM9/19/06

to

Moving to BCNF you may no longer be "dependency preserving", so whether
that is "better" or not may depend.

-- Jan Hidders

Sampo Syreeni

unread,

Sep 19, 2006, 4:53:51 AM9/19/06

to

On 2006-09-18, Roy Hann wrote:

> It has always seemed to me that nothing looks more obviously and
> intuitively wrong than a table that is in third normal form (3NF) but

> not also 4NF. [...] It's probably far too late to get a confident

> answer now, but does anyone know if there was a single authoritative
> writer (long ago) who expressed this foolish idea?

I believe the practical sufficiency of 3NF became an appropriate topic
of academic discussion for two reasons. First, [BB79] shows that the
database design problem for 3NF is solvable in at most quadratic time,
while in general it's NP-hard already for BCNF. This limits our ability
to show that complex schemas in fact *are* in 4NF, and around '79 they
seem to have worried a whole lot about that sort of thing. Second, while
lossless join, dependency preserving decomposition into 3NF is always
possible, the same does not hold for BCNF. This is even worse because
loss of dependency preservation implies a run time penalty for
constraint checking.

I wasn't able to come up with the original reference for the second
half, but the issue is spelled out at least in [Ma83] (at 6.7.1), and is
illustrated by example at least as early as [Ri77] (at 3, second
example).

As people have already said, the practical impact of such results is
limited [DF92]. But that doesn't mean it's nil, especially after you
take into consideration the limitations of existing constraint checking
engines. My favorite example in this vein is the so called scientific
dataset, like a pile of images or higher rank tensors, where the key of
the aggregate/image/function multidetermines the set of valid indices
for each of its dimensions {A->>B, A->>C} and they in turn determine the
value of the function at the indexed point {ABC->D}. How would you
represent that if you want to maintain it under completeness semantics
[GM86], and you're given, say, Oracle?

[BB79] Beeri, Catriel and Bernstein, Philip: Computational Problems
Related to the Design of Normal Form Relational Schemas, ACM
Transactions on Database Systems, Vol. 4, No. 1, March 1979
[DF92] Date, Chris and Fagin, Ronald: Simple Conditions for
Guaranteeing Higher Normal Forms in Relational Databases, ACM
Transactions on Database Systems, Vol. 17, No. 3, September 1992
[GM86] Graham, Marc, Mendelzon, Alberto and Vardi, Moshe: Notions of
Dependency Satisfaction, Journal of the ACM, Vol. 33, No. 1,
January 1986
[Ma83] Maier, David: The Theory of Relational Databases, Computer
Science Press, 1983
[Ri77] Rissanen, Jorma: Independent Components of Relations, ACM
Transactions on Database Systems, Vol. 2, No. 4, December 1977
--
Sampo Syreeni, aka decoy - mailto:de...@iki.fi, tel:+358-50-5756111
student/math+cs/helsinki university, http://www.iki.fi/~decoy/front
openpgp: 050985C2/025E D175 ABE5 027C 9494 EEB0 E090 8BA9 0509 85C2

Jan Hidders

unread,

Sep 19, 2006, 4:56:03 AM9/19/06

to

-CELKO- wrote:
>
> And, as Bob pointed out, for a lot of commercial apps, there is an
> industry standard or internally defined key that you have to use (UPC,
> EAN, VIN, ISBN, etc.), so most of the tables have a simple key. Also,
> if a table is all key, then it is in 5NF.

What exactly do you mean by "a table is all key"? Is it "every column
is by itself a candidate key" or "all columns together are a candidate
key"? In the first case you are right, but in the second case your
statement would not be correct.

-- Jan Hidders

David Cressey

unread,

Sep 19, 2006, 8:18:23 AM9/19/06

to

"Jan Hidders" <hid...@gmail.com> wrote in message
news:1158656163....@e3g2000cwe.googlegroups.com...

I take "all key" to mean that there is no candidate key other the entire
relation.

-CELKO-

unread,

Sep 19, 2006, 10:31:17 AM9/19/06

to

I was thinking of Ron Fagin's proof that
1) Table is in 3NF
2) every key is one column

The phrase "all keys" was what Chris Date used when he wrote a short
piece about this in DATABASE PROGRAMMING & DESIGN back in 1992

The same article also had Zaniolo's definition of 3NF:

T is a table, X is any set of columns of T and C is any single column
of T. T is in 3NF iff
for every functional dependency X -> C at least one of the following is
true:

1) X contains C -- a trivial dependency like C -> C or ABC -> C
2) X contains a key of T
3) C is contained in a key of T

The nice part is that if you drop #3, you get a definition of BCNF, so
you can come up with a good teaching example for 3NF versus BCNF.

Jan Hidders

unread,

Sep 19, 2006, 11:39:57 AM9/19/06

to

-CELKO- wrote:
> I was thinking of Ron Fagin's proof that
> 1) Table is in 3NF
> 2) every key is one column

For those that want to know more about that:

http://www.almaden.ibm.com/cs/people/fagin/tods92.pdf#search=%22simple%205nf%20fagin%20date%22

> The phrase "all keys" was what Chris Date used when he wrote a short
> piece about this in DATABASE PROGRAMMING & DESIGN back in 1992

They used it in the way you just did? In the TODS article they also
used it, but in the meaning of "the set of all attributes is the only
candidate key". Can I be so bold as to suggest that you perhaps
misremembered? :-)

-- Jan Hidders