On Apr 17, 1:12 am,
com...@hotmail.com wrote:
I would seriously consider looking at the cluster of articles
revolving around this one:
http://dl.acm.org/citation.cfm?id=588011.588050
I mean those "weak multivalued dependencies" and their ilk. Two
specific articles, which I cannot find, whose author I don't
remember, and which I no longer have access to thanks to being laid
off, basically prove that fourth normal form in database normalization
is equivalent to the natural factorization within Bayesian
probabilistic networks.
The only difference is that you put in an extra attribute in your
database which denotes the basic marginal probability of your tuple
being true, and then follow the natural rules of probability calculus
when taking joins or cartesian products: the rest of the columns
become attributes/selection conditions upon the state of a closed
world, any union of such states becomes a sum on the probability
column, any intersection becomes a minus, and any join becomes a
multiplication+sum.
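To make the "probability column" idea concrete, here is a minimal
sketch in Python. It is not taken from the articles; the relations,
attribute names and numbers are made up for illustration, and it
assumes that A and C are conditionally independent given B. The join
multiplies the probability columns, and projecting an attribute away
sums the probabilities of the tuples that get merged.

```python
from collections import defaultdict

# Hypothetical relations: the last field of each tuple is the
# probability column. All names and numbers are illustrative.
p_ab = [  # P(A, B)
    ("a1", "b1", 0.3),
    ("a1", "b2", 0.2),
    ("a2", "b1", 0.1),
    ("a2", "b2", 0.4),
]
p_c_given_b = [  # P(C | B)
    ("b1", "c1", 0.5),
    ("b1", "c2", 0.5),
    ("b2", "c1", 0.25),
    ("b2", "c2", 0.75),
]

def product_join(r_ab, r_bc):
    """Natural join on B that multiplies the two probability columns."""
    return [
        (a, b, c, p * q)
        for (a, b, p) in r_ab
        for (b2, c, q) in r_bc
        if b == b2
    ]

def marginalize(rows, keep):
    """Project onto the column indices in `keep`, summing the
    probabilities of tuples that collapse into the same key."""
    totals = defaultdict(float)
    for row in rows:
        totals[tuple(row[i] for i in keep)] += row[-1]
    return dict(totals)

# P(A, B, C) under the assumed independence of A and C given B.
joint = product_join(p_ab, p_c_given_b)
# P(A, C): join (multiply), then project B away (sum).
p_ac = marginalize(joint, keep=(0, 2))
```

Here joint has eight tuples whose probabilities sum to 1, and p_ac is
the marginal over A and C obtained by "multiplication+sum".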
The first paper in the series shows that normalizing something to 4NF,
with the probability column present, is consistent with the resulting
database representing a base form Bayesian network. The second one
responds to criticism where it was claimed that the RM-BN-analogy
leads to a contradiction. The way it shows that is essentially the
same argument that was once used to show that not all universal
relations can be decomposed into a natural join of their projections
onto their constituent, smaller relations. Only in this case the
argument is run in something of a reverse: since Bayesian networks,
by their structure, can never lead to the kinds of problems described
with respect to the relational model, we can derive an easy
contradiction from the supposed claim of inequivalence.
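For reference, the classical fact being alluded to (that a relation
need not equal the natural join of its projections) can be checked
with a tiny example. This is only a sketch of the textbook
relational argument, with a made-up relation r(A, B, C); it is not
the construction used in either article.

```python
def project(rows, idx):
    """Relational projection onto the column indices in `idx`."""
    return {tuple(r[i] for i in idx) for r in rows}

def natural_join_on_b(ab, bc):
    """Natural join of AB and BC relations on the shared attribute B."""
    return {(a, b, c) for (a, b) in ab for (b2, c) in bc if b == b2}

# Hypothetical relation in which A and C are tied together within
# the same B value, so the multivalued dependency B ->> A fails.
r = {("a1", "b", "c1"), ("a2", "b", "c2")}

ab = project(r, (0, 1))
bc = project(r, (1, 2))
rejoined = natural_join_on_b(ab, bc)

# Tuples that the join invents but that were never in r.
spurious = rejoined - r
```

The rejoined relation has four tuples, two of which are spurious, so
this r cannot be split into AB and BC without losing information.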
Seriously, look it up, even if I can't find a proper reference to the
first article. It has something to do with 4NF, Bayesianism, and weak
multivalued dependencies; it's only two little-known articles by the
same single author; it was probably published by the ACM; and so on.
But I at least thought it was a
beautiful result, if superficially trivial, when I saw it. At the
deeper level it tells us something about why that "independent
component" and "normalization" business happened in the first place,
in the history of the development of the relational model.