Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

Hierarchy in a Database

18 views
Skip to first unread message

Christophe Courtois

unread,
Sep 13, 1999, 3:00:00 AM9/13/99
to
I need to design a hierarchy (country, site , zone, building, level,
area, equipment, component... for example) in a database (Access or
Oracle). As the number of levels is not defined, we'll put everything in
a single table.

Does someone know of an URL or some advice on how to deal with this, as
it seems that SQL queries will be a headache ?
(Examples : 'sum of the values of all equipments in this building' ;
'list of everuthing in a area'...)

Thanks in advance for any idea.

--
Christophe Courtois - ccou...@cybercable.tm.fr
Strasbourg, France
http://www.multimania.com/courtois

Wolfgang Keller

unread,
Sep 13, 1999, 3:00:00 AM9/13/99
to
Christophe,


> I need to design a hierarchy (country, site , zone, building, level,
> area, equipment, component... for example) in a database (Access or
> Oracle). As the number of levels is not defined, we'll put everything in
> a single table.
>

my best advice in this case is ... use an object database. They are better
suited
for arbitrary tree structures ... from the project I'm actually working for,
we
derived an Antipattern named "storing trees in a relational database" - know

what I mean ;-).

If you need a few reasons, have a look at ..
Pitfall: Rules in a Relational Database
in the STJA paper at ...
http://www.objectarchitects.de/ObjectArchitects/papers/WhitePapers/

Cheers

Wolfgang

=========================================================
EMail: w...@objectarchitects.de
Web: http://www.objectarchitects.de/ObjectArchitects/

David Cressey

unread,
Sep 13, 1999, 3:00:00 AM9/13/99
to
Use two tables: one for the hierarchy itself, and another for the
occupants
of the hierarchy. The occupant table can be pretty fluid. The only column
I'll mention is OCCUPANT_ID. This is the key.

the hierarchy table has four columns:

SEQ_NO, END_SEQ_NO, LEVEL_NO, and OCCUPANT_ID.

SEQ_NO is the number you get if you assign numbers sequentially, traversing
the tree
in logical order. It's the key to the hierarchy, and has the added feature
that order
in SEQ no is meangingful. Note that all children of a node has a higher
SEQ_NO than the node.

END_SEQ_NO is the highest SEQ_NO among this node and all its children. For
a leaf,
SEQ_NO and END_SEQ_NO are the same.

LEVEL_NO says how far below the root the node is. The root itself is level
1, its children
all all level two.

OCCUPANT_ID is a foreign key. It's a reference, from within a slot of the
hierarchy, to the
current occupant of that slot.

now, if w look up say, node 101, and its end seq_no is 149, we can get
the sum of the
value of all the occupants in the subtree by just:

select sum (OCCUPANT.VALUE)
from OCCUPANT, HIERARCHY
where OCCUPANT.OCCUPANT_ID = HIERARCHY.OCCUPANT_ID
and HIERARCHY.SEQ_NO between 101 and 149

Other standard queries are similarly easy. It isn't hard to
define views of the aggregate values of subtrees, or of the
path from the root to any given node, etc.

Regards,
David Cressey mailto:da...@dcressey.com


Christophe Courtois wrote in message <37DD2646...@cybercable.tm.fr>...


>I need to design a hierarchy (country, site , zone, building, level,
>area, equipment, component... for example) in a database (Access or
>Oracle). As the number of levels is not defined, we'll put everything in
>a single table.
>

Christophe Courtois

unread,
Sep 14, 1999, 3:00:00 AM9/14/99
to David Cressey
First, thank you very much for the answer.

> the hierarchy table has four columns:
> SEQ_NO, END_SEQ_NO, LEVEL_NO, and OCCUPANT_ID.

> select sum (OCCUPANT.VALUE)


> from OCCUPANT, HIERARCHY
> where OCCUPANT.OCCUPANT_ID = HIERARCHY.OCCUPANT_ID
> and HIERARCHY.SEQ_NO between 101 and 149

Yes, it is easy... when the numbers are already there !

With this system, every change to the top of the tree could involve to
re-calculate all the SEQ_NO. That's the same for moving a single object.
We could deal with more than 10,000 objects...

joe_...@my-deja.com

unread,
Sep 18, 1999, 3:00:00 AM9/18/99
to

>> ... a hierarchy ... the number of levels is not defined <<

Another way of representing trees is to show them as nested sets.
Since SQL is a set oriented language, this is a better model than the
usual adjacency list approach you see in most text books. Let us
define a simple Personnel table like this, ignoring the left (lft) and
right (rgt) columns for now. This problem is always given with a
column for the employee and one for his boss in the textbooks:

CREATE TABLE Personnel
(emp CHAR(10) PRIMARY KEY,
boss CHAR(10), -- this column is unneeded & denormalizes the
table
salary DECIMAL(6,2) NOT NULL,
lft INTEGER NOT NULL,
rgt INTEGER NOT NULL);

Personnel
emp boss salary lft rgt
===================================
Albert NULL 1000.00 1 12
Bert Albert 900.00 2 3
Chuck Albert 900.00 4 11
Donna Chuck 800.00 5 6
Eddie Chuck 700.00 7 8
Fred Chuck 600.00 9 10

which would look like this as a directed graph:

Albert (1,12)
/ \
/ \
Bert (2,3) Chuck (4,11)
/ | \
/ | \
/ | \
/ | \
Donna (5,6) Eddie (7,8) Fred (9,10)

This (without the lft and rgt columns) is called the adjacency list
model, after the graph theory technique of the same name; the pairs of
nodes are adjacent to each other. The problem with the adjacency list
model is that the boss and employee columns are the same kind of thing
(i.e. names of personnel), and therefore should be shown in only one
column in a normalized table. To prove that this is not normalized,
assume that "Chuck" changes his name to "Charles"; you have to change
his name in both columns and several places. The defining
characteristic of a normalized table is that you have one fact, one
place, one time.

To show a tree as nested sets, replace the nodes with ovals, then nest
subordinate ovals inside each other. The root will be the largest oval
and will contain every other node. The leaf nodes will be the
innermost ovals with nothing else inside them and the nesting will show
the hierarchical relationship. The rgt and lft columns (I cannot use
the reserved words LEFT and RIGHT in SQL) are what shows the nesting.

If that mental model does not work, then imagine a little worm crawling
anti-clockwise along the tree. Every time he gets to the left or right
side of a node, he numbers it. The worm stops when he gets all the way
around the tree and back to the top.

This is a natural way to model a parts explosion, since a final
assembly is made of physically nested assemblies that final break down
into separate parts.

At this point, the boss column is both redundant and denormalized, so
it can be dropped. Also, note that the tree structure can be kept in
one table and all the information about a node can be put in a second
table and they can be joined on employee number for queries.

To convert the graph into a nested sets model think of a little worm
crawling along the tree. The worm starts at the top, the root, makes a
complete trip around the tree. When he comes to a node, he puts a
number in the cell on the side that he is visiting and increments his
counter. Each node will get two numbers, one of the right side and one
for the left. Computer Science majors will recognize this as a
modified preorder tree traversal algorithm.

Finally, drop the unneeded Personnel.boss column which used to
represent the edges of a graph.

This has some predictable results that we can use for building
queries. The root is always (left = 1, right = 2 * (SELECT COUNT(*)
FROM TreeTable)); leaf nodes always have (left + 1 = right); subtrees
are defined by the BETWEEN predicate; etc. Here are two common queries
which can be used to build others:

1. An employee and all their Supervisors, no matter how deep the tree.

SELECT P2.*
FROM Personnel AS P1, Personnel AS P2
WHERE P1.lft BETWEEN P2.lft AND P2.rgt
AND P1.emp = :myemployee;

2. The employee and all subordinates. There is a nice symmetry here.

SELECT P2.*
FROM Personnel AS P1, Personnel AS P2
WHERE P1.lft BETWEEN P2.lft AND P2.rgt
AND P2.emp = :myemployee;

3. Add a GROUP BY and aggregate functions to these basic queries and
you havehierarchical reports. For example, the total salaries which
each employee controls:

SELECT P2.emp, SUM(P1.salary)
FROM Personnel AS P1, Personnel AS P2
WHERE P1.lft BETWEEN P2.lft AND P2.rgt
GROUP BY P2.emp;

This will be two to three orders of magnitude faster than the adjacency
list model.

For details, see the chapter in my book JOE CELKO'S SQL FOR SMARTIES by
Joe Celko, Morgan-Kaufmann, 1995, ISBN 1-55860-323-9 or my columns in
DBMS magazine in 1996 March, April, May and June issues. I went into
details as to how to manipulate this model.

--CELKO--


Sent via Deja.com http://www.deja.com/
Share what you know. Learn what you don't.

Philip Lijnzaad

unread,
Sep 20, 1999, 3:00:00 AM9/20/99
to

>>> ... a hierarchy ... the number of levels is not defined <<

> Another way of representing trees is to show them as nested sets.

( haven't I seen this post before :-)

> 1. An employee and all their Supervisors, no matter how deep the tree.

> SELECT P2.*
> FROM Personnel AS P1, Personnel AS P2
> WHERE P1.lft BETWEEN P2.lft AND P2.rgt
> AND P1.emp = :myemployee;

> 2. The employee and all subordinates. There is a nice symmetry here.

> SELECT P2.*
> FROM Personnel AS P1, Personnel AS P2
> WHERE P1.lft BETWEEN P2.lft AND P2.rgt
> AND P2.emp = :myemployee;

This should be:

SELECT P1.*


FROM Personnel AS P1, Personnel AS P2
WHERE P1.lft BETWEEN P2.lft AND P2.rgt
AND P2.emp = :myemployee;

Philip
--
Enough is more than a lot.
-----------------------------------------------------------------------------
Philip Lijnzaad, lijn...@ebi.ac.uk | European Bioinformatics Institute,rm A2-24
+44 (0)1223 49 4639 | Wellcome Trust Genome Campus, Hinxton
+44 (0)1223 49 4468 (fax) | Cambridgeshire CB10 1SD, GREAT BRITAIN
PGP fingerprint: E1 03 BF 80 94 61 B6 FC 50 3D 1F 64 40 75 FB 53

0 new messages