Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

multilevel modelling and constraints

1 view
Skip to first unread message

Sampo Syreeni

unread,
Oct 23, 2009, 11:40:22 PM10/23/09
to
I just realized that, for a number of years already, I've continuously
been bumping into something that could perhaps be generalized into a
bona fide data modelling (problem?) pattern. Bear with me...

As an example, think about a generalization hierarchy where we have a)
a bunch of organizations at a higher level, and b) also the
disaggregated mother companies and the physical business locations
which the parent company controls, the two of which naturally can
differ rather wildly in the detailed data we have on them, plus c) all
of the extra metadata that can mesh between companies and business
locations, while staying fully clear of the aggregated level data.

That's all the data we have so far, it's been neatly normalized and
all of the necessary integrity constraints are in place. Us DBAs are
living the good life, pretty much having already completed the job
when we put up all of them neat productivity enhancing constraints.
Now we're basically dreaming about our happy place, with complimentary
bunnies and stuff.

Except that then comes down the latest business requirement: we
suddenly know for a fact that Albania(tm) does not allow companies to
own other companies. Ever. So, since we're conscientious DBAs, we
immediately add an assertion which effectively says: "relationships
from companies in Albania(tm) involved in a relationship with the type
of "owned-by", to another company which also lives in Albania(tm), ist
verboten". Everything is fine from there on, and the DBA can once
again go to sleep.

But do notice what happened. At the higher level of companies, there
would have been a numbre of integrity constraints. They couldn't have
catched the case of our very special Albania(tm), though. Instead we
had to build a second line of constraints, essentially a special case,
at the lower level of abstraction, to deal with that.

If we had flattened the generalization hierarchy by using trickery
like nulls and the like, and dealt with all of the special cases by
hand, nothing like that would have been necessary. But then, we
couldn't have managed the complexity. I mean, even now I'm dealing
with an instance of this precise logic which would take manual, expert
assessment of the validity of some 20K+ assertions to make it work at
the lowest level. Using general rules and exceptions, I think the
problem can be reduced to far under 500 human decisions.

Thus I think constraints should often be set at multiple levels of
generalization, simultaneously. After proper normalization this
implies that under an RDBMS, such constraints/assertions might spread
wide and far, and be a bit more complicated than most are used to.
Still, this is an *extremely* common pattern in everyday database
work, so I think shying away from the implications woudl be a bit
foolish.

0 new messages