Nested set model with large gaps and spreads in the numbering

--CELKO--

unread,

Sep 9, 2002, 11:14:15 PM9/9/02

to

There is a version of the nested set model in which we use a larger
spread than just one between the numbers. The idea is that while you
lose the algebriac property that (rgt -lft+1)/2, you can insert new
nodes without renumbers the nodes.

Well, up to a point. Eventually, the insertions and deletions will
leave serious gaps. I need a good compact procedure that will shift
the children under a parent node to the left, leaving the largest gap
possi.ble on the right

Here is a first attempt:

CREATE PROCEDURE ShiftLeft ()
AS BEGIN
DECLARE squeeze IN INTEGER;
SET squeeze
= (SELECT CASE WHEN MIN(O2.lft - O1.rgt) - 1 > 1
THEN MIN(O2.lft - O1.rgt) - 1
ELSE 1 END
FROM OrgChart AS O1, OrgChart AS O2
WHERE O1.rgt < O2.lft
AND O1.emp <> O2.emp);
UPDATE OrgChart
SET lft = (lft - squeeze),
rgt = (rgt - squeeze)
WHERE (lft - squeeze) > 0
AND NOT EXISTS
(SELECT *
FROM OrgChart AS O1
WHERE O1.emp <> OrgChart.emp
AND (O1.lft
BETWEEN (OrgChart.lft - squeeze)
AND (OrgChart.rgt - squeeze)
OR O1.rgt
BETWEEN (OrgChart.lft - squeeze)
AND (OrgChart.rgt - squeeze)));
END;

This routine can be executed over and over, producing some shifting.
The trouble is that it does not work as desired.

Each node should be packed to the left and the largest possible gap
built on the right. Another problem is that it "slows down" rather
quickly and depends on the value of the squeeze parameter.

first call:
emp lft rgt
======================
Albert 100 1200
Bert 101 201
Chuck 400 1100
Donna 401 501
Eddie 601 701
Fred 801 901

second call:
emp lft rgt
======================
Albert 100 1200
Bert 101 201
Chuck 400 1100
Donna 401 501
Eddie 502 602
Fred 702 802

Third call:
emp lft rgt
======================
Albert 100 1200
Bert 101 201
Chuck 400 1100
Donna 401 501
Eddie 502 602
Fred 701 801

Notice how 'Fred' does not move much after the second call to
ShiftLeft().

How can we improve this procedure?

Heinz Huber

unread,

Sep 10, 2002, 3:05:26 AM9/10/02

to

71062...@compuserve.com (--CELKO--) wrote in
news:c0d87ec0.02090...@posting.google.com:

If I understand you correctly, the wanted result is:
emp lft rgt
======================
Albert 1 1101
Bert 2 102
Chuck 103 803
Donna 104 204
Eddie 205 305
Fred 306 406

The structure and the differences between lft and rgt have remained the
same, but the gaps between the employees have been closed.

I think the problem is that there is no "universal" shift factor. Instead
the shift is different for each employee. Therefore some employees have a
shift factor of 1 while others have still a rather high shift factor.

If you don't want to implement a cursor approach, I can only imagine adding
a column in which you store the shift factor:
UPDATE OrgChart
SET shift = NULL;

-- Calculate shift factor within one parent.
-- First siblings are corrected later.
UPDATE OrgChart
SET shift = lft - 1 -
(SELECT MAX(sibling.rgt)
FROM OrgChart AS sibling
WHERE sibling.rgt < OrgChart.lft)
WHERE shift IS NULL AND EXISTS
(SELECT *
FROM OrgChart AS sibling
WHERE sibling.rgt < OrgChart.lft);

emp lft rgt shift
==============================
Albert 100 1200 NULL
Bert 101 201 NULL
Chuck 400 1100 198
Donna 401 501 199
Eddie 601 701 99
Fred 801 901 99

UPDATE OrgChart
SET shift = lft - 1 -
(SELECT MAX(parent.lft)
FROM OrgChart parent
WHERE parent.lft < OrgChart.lft AND parent.rgt > OrgChart.rgt)
WHERE shift IS NULL OR (lft - shift) <
(SELECT MAX(parent.lft)
FROM OrgChart parent
WHERE parent.lft < OrgChart.lft AND parent.rgt > OrgChart.rgt);

emp lft rgt shift
==============================
Albert 100 1200 NULL
Bert 101 201 0
Chuck 400 1100 198
Donna 401 501 0
Eddie 601 701 99
Fred 801 901 99

UPDATE OrgChart
SET shift = lft - 1
WHERE shift IS NULL;

emp lft rgt shift
==============================
Albert 100 1200 99
Bert 101 201 0
Chuck 400 1100 198
Donna 401 501 0
Eddie 601 701 99
Fred 801 901 99

For the shift itself, you need another additional column which contains the
original order.

UPDATE OrgChart
SET nr =
(SELECT COUNT(*)
FROM OrgChart before
WHERE before.lft <= OrgChart.lft);

emp lft rgt shift nr
====================================
Albert 100 1200 99 1
Bert 101 201 0 2
Chuck 400 1100 198 3
Donna 401 501 0 4
Eddie 601 701 99 5
Fred 801 901 99 6

UPDATE OrgChart
SET lft = lft -
(SELECT SUM(shift)
FROM OrgChart before
WHERE before.nr <= OrgChart.nr),
rgt = rgt -
(SELECT SUM(shift)
FROM OrgChart before
WHERE before.nr <= OrgChart.nr);

emp lft rgt shift nr
====================================
Albert 1 1101 99 1
Bert 2 102 0 2
Chuck 103 803 198 3
Donna 104 204 0 4
Eddie 205 305 99 5
Fred 306 406 99 6

Of course you can also have those two extra columns in an extra table
together with the key so that they don't interfere with the normal
operations on the OrgChart table.

The problem with a cursor approach is that you need a stack for rgt column
of all parents.

Regards,
Heinz

Mikito Harakiri

unread,

Sep 10, 2002, 9:41:44 PM9/10/02

to

71062...@compuserve.com (--CELKO--) wrote in message news:<c0d87ec0.02090...@posting.google.com>...

> There is a version of the nested set model in which we use a larger
> spread than just one between the numbers. The idea is that while you
> lose the algebriac property that (rgt -lft+1)/2, you can insert new
> nodes without renumbers the nodes.

Are you willing to switch from integers to real or rational numbers?
Both reals and rationals are dense, so that you can insert new nodes
without being forced to shift other values. For example, you root node
is [0,1) and then you insert 2 children. That's easy, the new
intervals are [1/5,2/5) and [3/5,4/5). Now, suppose we want to insert
a new child of [0,1) between the [1/5,2/5) and [3/5,4/5). Also easy:
[1/5+1*(3/5-2/5)/3, 1/5+2*(3/5-2/5)/3).

--CELKO--

unread,

Sep 11, 2002, 8:48:45 PM9/11/02

to

>> Are you willing to switch from integers to real or rational
numbers? <<

Yes, but only in a perfect world. The REAL and FLOAT datatypes start
to get floating point rounding errors and will lose the "BETWEEN-ness"
property we need. This approach starts to fail when the tree gets
very deep or very large. And this is the worst time for such a
failure.

On the other hand, a DECIMAL(s,p) datatype can often be over 30 digits
-- Oracle is a max of 38 if I remember correctly. If I give each
level a spread that is a power of ten, the math is pretty simple.

>> Now, suppose we want to insert a new child of [0,1) between the
[1/5,2/5) and [3/5,4/5). <<

I work with the rule that the newest sibling goes to the rightmost
side, not between existing siblings. This is why I want to be able to
tell the existing children to move left and make room for their new
brother on the right side of their parent's (lft, rgt) spread.

Heinz Huber

unread,

Sep 12, 2002, 1:56:40 AM9/12/02

to

Heinz Huber <hhu...@racon-linz.at> wrote in
news:Xns92855C7979C8A...@195.3.96.116:

[snipped problem and solution]

On thinking further, you should be able to pack the whole thing in a stored
procedure that works with a temp table:

The procedure gets the table name and the names of the lft and rgt columns.
Get the keys of the table from the system tables.
Create a temporary table containing the key columns of the table to pack
and the additional columns shift and nr.
Then apply the updates in the last post (modified to use an additional
table).
Drop the temporary table.

Regards,
Heinz

Pablo Sanchez

unread,

Sep 12, 2002, 10:30:50 AM9/12/02

to

Heinz Huber <hhu...@racon-linz.at> wrote in

news:Xns928750D09C151...@195.3.96.116:

The problem with creating and destroying temporary tables within a
stored proc is during system stress, this wreaks havoc on the
system's catalogs. In other words, it doesn't scale well.

Thx!
--
Pablo Sanchez, High-Performance Database Engineering
http://www.hpdbe.com

--CELKO--

unread,

Sep 12, 2002, 11:06:20 AM9/12/02

to

>> On thinking further, you should be able to pack the whole thing in
a stored procedure that works with a temp table: <<

Richard Romley came up with the same idea and some code in SQL Server.
His temp table had columns for the original (lft,rgt) pairs and
columns for the new (lft,rgt) pairs. The condition that had to be
preserved is that ((old_rgt - old_lft) = (new_rgt - new-lft)); that
is, the spread stays the same, while the gaps between the siblings
gets smaller. The process was iterative in his code. I am thinking
that this might be one of those problems where an iterative solution
is best.

Mikito Harakiri

unread,

Sep 12, 2002, 1:00:57 PM9/12/02

to

71062...@compuserve.com (--CELKO--) wrote in message news:<c0d87ec0.0209...@posting.google.com>...

> >> Are you willing to switch from integers to real or rational
> numbers? <<
>
> Yes, but only in a perfect world. The REAL and FLOAT datatypes start
> to get floating point rounding errors and will lose the "BETWEEN-ness"
> property we need. This approach starts to fail when the tree gets
> very deep or very large. And this is the worst time for such a
> failure.

First, it depends if we want to admit a closed-form representation
(like sqrt(2), for example), or just naive approximations (like 1.4).
Although, nearly every programming framework -- including databases --
implements the latter, there are some -- notably, computer algebra
systems -- that have the former.

However, it's really much more simple than "closed-form REALs":
RATIONAL numbers are much simpler solution.

> On the other hand, a DECIMAL(s,p) datatype can often be over 30 digits
> -- Oracle is a max of 38 if I remember correctly. If I give each
> level a spread that is a power of ten, the math is pretty simple.

Similarly, one can overflow INTEGERs in your nesed set model. Again,
the only nearly perfect number representation exists in computer
algebra systems like Mapple and Mathematica.

> >> Now, suppose we want to insert a new child of [0,1) between the
> [1/5,2/5) and [3/5,4/5). <<
>
> I work with the rule that the newest sibling goes to the rightmost
> side, not between existing siblings. This is why I want to be able to
> tell the existing children to move left and make room for their new
> brother on the right side of their parent's (lft, rgt) spread.

Well, I insert new intervals with the gaps on each side. New members
can be fit at any position there.

Essentially, the heart of your Nested Sets model is a set of intervals
with containment relation. (You have some additional properties like
(left-right+1)=level*2, but I'm not sure of their significance). When
we invert the order at, say, the right node, then we essentially have
points on a 2 dimensional cartesian plain (X,Y) with a partial order
defined by:

(X1,Y1)<=(X2,Y2) <==> X1<=X2 & Y1<=Y2

Now we can easily see why nested sets can't model arbitrary DAGs: 2
dimensions is just not enough for representing any partial order.

Heinz Huber

unread,

Sep 13, 2002, 5:27:50 AM9/13/02

to

71062...@compuserve.com (--CELKO--) wrote in
news:c0d87ec0.02091...@posting.google.com:

I thought about an iterative approach, too. But as soon as tried a short
pencil test, I realized that you need a stack of parents to do that.

The problem is (on stepping from one row to the next) how to find out
whether the new row is a sibling or to which parent it belongs. It works
only if you have a stack of parents and peek at the top or if you fire
another query for every row.

On the other hand, the approach I have given works set oriented with two
extra columns (whereever one wants to define those).

Regards,
Heinz

Kendall

unread,

Sep 13, 2002, 1:59:01 PM9/13/02

to

In article <c0d87ec0.02090...@posting.google.com>, "--CELKO--"
<71062...@compuserve.com> wrote:

>
> How can we improve this procedure?

I'm a bit confused about what this is trying to achieve, but it looks
like a lot of work, so why not try this:

update Orgchart set
lft =
(select count(*) from Orgchart o1 where o1.lft <= Orgchart.lft),
rgt =
(select count(*) from Orgchart o2 where o2.rgt <= Orgchart.rgt)

The numbers can be multiplied by some constant factor if gaps are desired.

A good factor might be
2^floor( 31 - log(select count(*) from Orgchart)/log(2))

--CELKO--

unread,

Sep 13, 2002, 7:24:37 PM9/13/02

to

>> I'm a bit confused about what this is trying to achieve, but it

looks like a lot of work, so why not try this: .. <<

That is the standard way to close up all the gaps in a Nested Sets
model. But what I want to do is keep the spread the same and close
only the gaps. A spread is defined as (rgt -lft) -- the size of a
given node and how much roo it has for children. A gap is the
different between the (rgt) value of one and the (lft) value of its
sibling on the immeidate left or right side.

I want to retain the spread, but close the gaps by moving the nodes to
the left. This will leave me with a single large gap on the right of
each parent.

In thought about doing this way, then moving all of the (rgt) values
further to the right. But frankly, I have been working on my new book
about "Trees in SQL" for several days straight and my brain is getting
very tired.

>> A good factor might be
2^floor( 31 - log(select count(*) from Orgchart)/log(2)) <<

I like that factor because it looks scientific <g>, but I was thinking
that I might wan tot use the level of the node inthe tree to determine
how far to shift its (rgt) value ...

David Cressey

unread,

Sep 16, 2002, 11:10:33 AM9/16/02

to

Joe,

For some reason, what you are trying to do reminds me of the process of
defragmenting a disk.

In the defragmenting case, the goal is almost the opposite: to coalesce all
the "gaps" into one giant block of free space.

Still, if the analogy is worthwhile, maybe there are some algorithms in the
world of defragging that can be adapted to the task at hand.

--
Regards,
David Cressey

John Gilson

unread,

Sep 16, 2002, 4:47:48 PM9/16/02

to

The issue is to move the gaps between siblings in a tree to the right
of the rightmost sibling while also maintaining the spread between
each node's left and right values. Additionally, it would be useful
to be able to set the root node's left value to 1 before left shifting
so that the tree is put into "normal form". This would be useful if
we wanted to present a subtree normalized. Furthermore, we might want
to begin the left shifting at any node in the tree and not just the
tree's root node. When working with a very large tree it could be
much more efficient to left shift only certain subtrees, indicated by
giving the subtree's root node.

Before we can left shift a tree we need to identify the gaps to
close. There are two types of gaps:
1. Between the left value of the oldest, i.e., leftmost, child N1L and
the left value of its parent N2L. We look for instances where
N1L - N2L > 1.
2. Between the left value of a node N1L and the right value of its
sibling N2R to the immediate left. We look for instances where
N1L - N2R > 1.

For all gaps on level N to be closed, all gaps in the previous N-1
levels must have been closed. This would lead to a top-down
breadth-first iterative algorithm. However, note that when left
shifting a level N the same amount of shifting must be simultaneously
applied to all descendants of this level so that the tree's
ancestor-descendant relationship, represented by nested sets, is
always maintained.

Below is some T-SQL code, tested on SQL Server, that implements this
followed by a few examples. The code might appear a bit complex on a
first pass but the algorithm is straightforward. Note that while
iteration is used, no temp tables or extra columns are required.

First, let's show the DDL and sample data.

CREATE TABLE Tree
(
value VARCHAR(10) NOT NULL PRIMARY KEY,
lft INT NOT NULL CHECK (lft > 0),
rgt INT NOT NULL CHECK (rgt > 0),
CHECK (rgt > lft)
)

INSERT INTO TREE
SELECT 'Albert', 100, 1200
UNION ALL
SELECT 'Bert', 101, 201
UNION ALL
SELECT 'Ernie', 250, 300
UNION ALL
SELECT 'Chuck', 400, 1100
UNION ALL
SELECT 'Donna', 501, 601
UNION ALL
SELECT 'Eddie', 701, 801
UNION ALL
SELECT 'Fred', 901, 1001

Next, let's look at some views that provide structural information
about the tree.

-- Define tree levels with the root node being at level 1,
-- not the standard level 0
CREATE VIEW NodeLevels
AS
SELECT T2.value AS value,
COUNT(T1.value) AS level
FROM Tree AS T1
INNER JOIN
Tree AS T2
ON T2.lft BETWEEN T1.lft AND T1.rgt
GROUP BY T2.value

-- Define tree parent-child relationship
CREATE VIEW Children
AS
SELECT T1.value AS parent,
T2.value AS child,
T1.lft AS lftParent,
T1.rgt AS rgtParent,
T2.lft AS lftChild,
T2.rgt AS rgtChild
FROM Tree AS T1
INNER JOIN
Tree AS T2
ON T2.lft > T1.lft AND
T2.rgt < T1.rgt AND
NOT EXISTS (SELECT *
FROM Tree AS T3
WHERE T3.lft > T1.lft AND
T3.lft < T2.lft AND
T3.rgt < T1.rgt AND
T3.rgt > T2.rgt)

-- Define tree ancestor-descendant relationship
CREATE VIEW Descendants
AS
SELECT T1.value AS ancestor,
T2.value AS descendant
FROM Tree AS T1
INNER JOIN
Tree AS T2
ON T2.lft > T1.lft AND
T2.rgt < T1.rgt

-- Consecutive siblings with a gap > 0
CREATE VIEW ConsecutiveSiblingGaps
AS
SELECT C1.child AS lftChild,
C2.child AS rgtChild,
L.level AS level,
C2.lftChild - C1.rgtChild - 1 AS gap
FROM Children AS C1
INNER JOIN
Children AS C2
ON C1.parent = C2.parent AND
C1.rgtChild < C2.lftChild - 1 AND
NOT EXISTS (SELECT *
FROM Children AS C3
WHERE C3.parent = C1.parent AND
C3.lftChild > C1.rgtChild AND
C3.rgtChild < C2.lftChild)
INNER JOIN
NodeLevels AS L
ON L.value = C1.child

-- Gap between parent and oldest child, i.e., leftmost child, > 0
CREATE VIEW ParentOldestChildGaps
AS
SELECT C1.parent AS parent,
C1.child AS child,
L.level AS childLevel,
C1.lftChild - C1.lftParent - 1 AS gap
FROM Children AS C1
INNER JOIN
NodeLevels AS L
ON NOT EXISTS (SELECT *
FROM Children AS C2
WHERE C2.parent = C1.parent AND
C2.lftChild < C1.lftChild) AND
C1.lftParent < C1.lftChild - 1 AND
L.value = C1.child

Finally, the stored procedure that does the left shifting.

-- Procedures to left-shift tree
-- All gaps are moved rightward
-- All spreads remain the same

-- Only left-shift subtree rooted at the given node
CREATE PROCEDURE ShiftLeftSubtree
@node VARCHAR(10)
AS
DECLARE @level INT
SELECT @level = level
FROM NodeLevels
WHERE value = @node
IF @level IS NOT NULL
EXEC ShiftLeftHelper @node, @level

-- Left-shift complete tree where the root can optionally be shifted so
-- that its left value is 1
CREATE PROCEDURE ShiftLeft
@rootLeftAt1 BIT = 0
AS
DECLARE @root VARCHAR(10)
SELECT @root = value
FROM NodeLevels
WHERE level = 1
IF @root IS NOT NULL
EXEC ShiftLeftHelper @root, 1, @rootLeftAt1

-- This is the main procedure that does all left shifting
CREATE PROCEDURE ShiftLeftHelper
@node VARCHAR(10), -- node to start shifting from
@level INT, -- node's level (root is level 1)
@rootLeftAt1 BIT = 0 -- set root's left value to 1?
AS
BEGIN

-- Current tree level to visit
DECLARE @currentLevel INT
SELECT @currentLevel = @level

-- Tree height
DECLARE @height INT
SELECT @height = MAX(level)
FROM NodeLevels

-- Handle left shifting of root node when it's left value is to be 1
IF @currentLevel = 1
BEGIN
IF @rootLeftAt1 = 1
BEGIN
-- Left shift tree based on gap provided by root node
DECLARE @rootGap INT
SELECT @rootGap = T1.lft - 1
FROM Tree AS T1
WHERE NOT EXISTS (SELECT *
FROM Tree AS T2
WHERE T2.lft < T1.lft AND
T2.rgt > T1.rgt)
UPDATE Tree
SET lft = CASE WHEN Tree.value = @node
THEN 1
ELSE Tree.lft - @rootGap
END,
rgt = Tree.rgt - @rootGap
END
END
ELSE
BEGIN
-- Left shift starting from a node at a level > 1
-- This is the first pass at left shifting a subtree based only on
-- the gap before a subtree's root. All subsequent left shifting of
-- this subtree provided by levels greater than the subtree's root
-- level will take place in the loop following this block.

DECLARE @consecutiveSiblingGap INT
SELECT @consecutiveSiblingGap = G.gap
FROM ConsecutiveSiblingGaps AS G
WHERE G.level = @currentLevel AND
G.rgtChild = @node

DECLARE @parentOldestChildGap INT
SELECT @parentOldestChildGap = G.gap
FROM ParentOldestChildGaps AS G
WHERE G.childLevel = @currentLevel AND
G.child = @node

UPDATE Tree
SET lft = COALESCE(Tree.lft - @consecutiveSiblingGap,
Tree.lft - @parentOldestChildGap,
Tree.lft),
rgt = COALESCE(Tree.rgt - @consecutiveSiblingGap,
Tree.rgt - @parentOldestChildGap,
Tree.rgt)
WHERE Tree.value = @node OR
EXISTS (SELECT *
FROM Descendants
WHERE ancestor = @node AND
descendant = Tree.value)
END

-- Proceed to the level just after the level of the tree's (or
-- subtree's) root node.
SELECT @currentLevel = @currentLevel + 1

-- Loop through each level up to the height of the tree
WHILE @currentlevel <= @height
BEGIN
-- Each iteration of this loop will shift a level's nodes further to
-- the left until there are no more gaps on this level.
-- Concurrently, all shifts applied to the nodes on this level are
-- also applied to all of those nodes' descendants so as to maintain
-- the tree's parent-child relationship.
WHILE EXISTS (SELECT *
FROM ConsecutiveSiblingGaps AS G
INNER JOIN
Descendants AS D
ON G.level = @currentLevel AND
D.descendant = G.rgtChild AND
D.ancestor = @node) OR
EXISTS (SELECT *
FROM ParentOldestChildGaps AS G
INNER JOIN
Descendants AS D
ON G.childLevel = @currentLevel AND
D.descendant = G.child AND
D.ancestor = @node)
BEGIN
UPDATE Tree
SET lft = COALESCE((SELECT Tree.lft - G.gap
FROM ConsecutiveSiblingGaps AS G
WHERE G.level = @currentLevel AND
(G.rgtChild = Tree.value OR
EXISTS (SELECT *
FROM Descendants AS D
WHERE D.descendant = Tree.value AND
D.ancestor = G.rgtChild))),
(SELECT Tree.lft - G.gap
FROM ParentOldestChildGaps AS G
WHERE G.childLevel = @currentLevel AND
(G.child = Tree.value OR
EXISTS (SELECT *
FROM Descendants AS D
WHERE D.descendant = Tree.value AND
D.ancestor = G.child))),
Tree.lft),
rgt = COALESCE((SELECT Tree.rgt - G.gap
FROM ConsecutiveSiblingGaps AS G
WHERE G.level = @currentLevel AND
(G.rgtChild = Tree.value OR
EXISTS (SELECT *
FROM Descendants AS D
WHERE D.descendant = Tree.value AND
D.ancestor = G.rgtChild))),
(SELECT Tree.rgt - G.gap
FROM ParentOldestChildGaps AS G
WHERE G.childLevel = @currentLevel AND
(G.child = Tree.value OR
EXISTS (SELECT *
FROM Descendants AS D
WHERE D.descendant = Tree.value AND
D.ancestor = G.child))),
Tree.rgt)
-- Only interested in shifting nodes >= the current level that are
-- also descendants of the supplied root node
WHERE EXISTS (SELECT *
FROM Descendants AS D
INNER JOIN
NodeLevels AS L
ON D.descendant = Tree.value AND
D.ancestor = @node AND
L.value = Tree.value AND
L.level >= @currentLevel)
END
-- Prepare to visit next level of nodes
SELECT @currentLevel = @currentLevel + 1
END

END

Some examples might be illustrative. The tree we start with is

value lft rgt

Albert 100 1200
Bert 101 201

Ernie 250 300
Chuck 400 1100
Donna 501 601
Eddie 701 801
Fred 901 1001

If we want to left shift the entire tree we execute

EXEC ShiftLeft

so the resulting tree is

value lft rgt

Albert 100 1200
Bert 101 201

Ernie 202 252
Chuck 253 953
Donna 254 354
Eddie 355 455
Fred 456 556

If we want to shift the entire tree but start with a left value of 1
for the root node we execute

EXEC ShiftLeft 1

which results in the following tree

value lft rgt

Albert 1 1101
Bert 2 102

Ernie 103 153
Chuck 154 854
Donna 155 255
Eddie 256 356
Fred 357 457

Finally, we can just shift a subtree by giving that subtree's root
node by executing

EXEC ShiftLeftSubtree 'Chuck'

which modifies the original table to

value lft rgt

Albert 100 1200
Bert 101 201

Ernie 250 300
Chuck 301 1001
Donna 302 402
Eddie 403 503
Fred 504 604

or

EXEC ShiftLeftSubtree 'Eddie'

value lft rgt

Albert 100 1200
Bert 101 201

Ernie 250 300
Chuck 400 1100
Donna 501 601
Eddie 602 702
Fred 901 1001

Regards,
jag

Kendall

unread,

Sep 17, 2002, 12:28:00 AM9/17/02

to

In article <c0d87ec0.02091...@posting.google.com>, "--CELKO--"
<71062...@compuserve.com> wrote:

>>> I'm a bit confused about what this is trying to achieve, but it
> looks like a lot of work, so why not try this: .. <<
>

(OK, got it - just needed a reality check.)

> That is the standard way to close up all the gaps in a Nested Sets
> model. But what I want to do is keep the spread the same and close only
> the gaps. A spread is defined as (rgt -lft) -- the size of a given node
> and how much roo it has for children. A gap is the different between
> the (rgt) value of one and the (lft) value of its sibling on the
> immeidate left or right side.
>
> I want to retain the spread, but close the gaps by moving the nodes to
> the left. This will leave me with a single large gap on the right of
> each parent.
>

I guess I have misgivings about where the free space is needed - between
the siblings would be one place, but you're squishing them together.

If you want to slide all the siblings on all the levels together, you're
going to have to accumulate the offset starting from root. There's a
proc at http://willets.org/superdfs.sql that does the same thing with no
offset - it just calculates the preorder numbering by adding up the
parent's lft and descendant counts across each level.

If you replace descendant counts with spreads you should be able to do
the same thing. It's iterative, but I don't see an easy way out
otherwise.

> In thought about doing this way, then moving all of the (rgt) values
> further to the right. But frankly, I have been working on my new book
> about "Trees in SQL" for several days straight and my brain is getting
> very tired.
>

This problem is akin to building balanced trees from random data -
there's no way that's going to work every time.

--CELKO--

unread,

Sep 17, 2002, 9:39:05 AM9/17/02

to

>> For some reason, what you are trying to do reminds me of the
process of
defragmenting a disk. <<

Good thought!

Richard Romley sent me a recursive UDF that computes the new left
value for a node. It can be put into one statement then used in
UPDATE Tree
SET lft = ShiftLeft(left),
rgt = ShiftLeft(left) + (rgt-lft);

Very neat. I have to go out of town, but I will post it when I get
back.

--CELKO--

unread,

Sep 17, 2002, 9:47:15 AM9/17/02

to

>> Below is some T-SQL code, tested on SQL Server, that implements
this followed by a few examples. The code might appear a bit complex
on a first pass but the algorithm is straightforward. Note that while
iteration is used, no temp tables or extra columns are required. <<

I am out of town today, but I will play with this when I get back.
Looks like a lot of the same stuff that Richard Romley got, but he
used recursion in a UDF instead of iteration to get the results.

John Gilson

unread,

Sep 18, 2002, 3:32:07 AM9/18/02

to

The algorithm I posted previously to solve this problem, while
correct, was unnecessarily constrained. I had applied shifting level
by level, i.e., before level N is shifted all levels before N are
fully shifted. This constraint has been removed leading to a cleaner
algorithm and code. Even though this solution is essentially the same
as the previous one, for completeness and ease of understanding I'll
make this posting self-contained.

The issue is to move the gaps between siblings in a tree to the right
of the rightmost sibling while also maintaining the spread between
each node's left and right values. Additionally, it would be useful
to be able to set the root node's left value to 1 before left shifting
so that the tree is put into "normal form". This would be useful if
we wanted to present a subtree normalized. Furthermore, we might want
to begin the left shifting at any node in the tree and not just the
tree's root node. When working with a very large tree it could be
much more efficient to left shift only certain subtrees, indicated by
giving the subtree's root node.

Before we can left shift a tree we need to identify the gaps to
close. There are two types of gaps:

1. Between the left value of the oldest, i.e., leftmost, child CL and
the left value of its parent PL. We look for instances where
CL - PL > 1. Let's call this kind of gap a ParentOldestChildGap.
2. Between the left value of a node S2L and the right value of the
sibling S1R to its immediate left. We look for instances where
S2L - S1R > 1. Let's call this kind of gap a
ConsecutiveSiblingGap.

Using this notion of gaps, the algorithm to left shift a tree or
subtree rooted at node N is as follows:

declare N TreeNode -- root node
while(exists TreeNode N1 where N1 = N or N1 is a descendant of N and
N1 has a ConsecutiveSiblingGap or a ParentOldestChildGap)
begin
forall(TreeNode N1 where N1 = N or N1 is a descendant of N and
(N1 has a ConsecutiveSiblingGap SG; SG is 0 if none) or
(N1 has a ParentOldestChildGap PCG; PCG is 0 if none) or
forall(TreeNode N2 where N2 = N or N2 is a descendant of N
and N2 is an ancestor of N1 and N2 has an ancestor
ConsecutiveSiblingGap ASG or an ancestor
ParentOldestChildGap APCG; ASG and APCG are 0 if none))
begin
update N1.lft = N1.lft - SG - PCG - SUM(ASG) - SUM(APCG),
N1.rgt = N1.rgt - SG - PCG - SUM(ASG) - SUM(APCG)
from N1
end
end

Note that when left shifting a node with a gap, the same amount of

shifting must be simultaneously applied to all descendants of this

node so that the tree's ancestor-descendant relationship, represented

by nested sets, is always maintained.

Below is T-SQL code, tested on SQL Server, that implements this
algorithm followed by a few examples. Note that while iteration is
used, no temp tables or extra columns are required and no local
variables are needed in the loop.

First, let's show the DDL code and sample data.

CREATE TABLE Tree
(
value VARCHAR(10) NOT NULL PRIMARY KEY,
lft INT NOT NULL CHECK (lft > 0),
rgt INT NOT NULL CHECK (rgt > 0),
CHECK (rgt > lft)
)

INSERT INTO TREE
SELECT 'Albert', 100, 1200
UNION ALL

SELECT 'Bert', 111, 201

UNION ALL
SELECT 'Ernie', 250, 300
UNION ALL
SELECT 'Chuck', 400, 1100
UNION ALL
SELECT 'Donna', 501, 601
UNION ALL
SELECT 'Eddie', 701, 801
UNION ALL
SELECT 'Fred', 901, 1001

Next, let's look at some views that provide structural information
about the tree.

-- Tree's root node
CREATE VIEW RootNode
AS
SELECT T1.value, T1.lft, T1.rgt

FROM Tree AS T1
WHERE NOT EXISTS (SELECT *
FROM Tree AS T2
WHERE T2.lft < T1.lft AND
T2.rgt > T1.rgt)

-- Define tree parent-child relationship

CREATE VIEW Children
AS
SELECT T1.value AS parent,
T2.value AS child,
T1.lft AS lftParent,
T1.rgt AS rgtParent,
T2.lft AS lftChild,
T2.rgt AS rgtChild
FROM Tree AS T1
INNER JOIN
Tree AS T2
ON T2.lft > T1.lft AND

T2.rgt < T1.rgt AND
NOT EXISTS (SELECT *

FROM Tree AS T3
WHERE T3.lft > T1.lft AND
T3.lft < T2.lft AND
T3.rgt < T1.rgt AND
T3.rgt > T2.rgt)

-- Define tree ancestor-descendant relationship
CREATE VIEW Descendants
AS
SELECT T1.value AS ancestor,
T2.value AS descendant
FROM Tree AS T1
INNER JOIN
Tree AS T2
ON T2.lft > T1.lft AND
T2.rgt < T1.rgt

-- Consecutive siblings with a gap > 0
CREATE VIEW ConsecutiveSiblingGaps
AS
SELECT C1.child AS lftChild,
C2.child AS rgtChild,

C2.lftChild - C1.rgtChild - 1 AS gap
FROM Children AS C1
INNER JOIN
Children AS C2
ON C1.parent = C2.parent AND

C1.rgtChild < C2.lftChild - 1 AND
NOT EXISTS (SELECT *

FROM Children AS C3
WHERE C3.parent = C1.parent AND
C3.lftChild > C1.rgtChild AND
C3.rgtChild < C2.lftChild)

-- Gap between parent and oldest child, i.e., leftmost child, > 0

CREATE VIEW ParentOldestChildGaps
AS
SELECT C1.parent AS parent,
C1.child AS child,

C1.lftChild - C1.lftParent - 1 AS gap
FROM Children AS C1

WHERE NOT EXISTS (SELECT *

FROM Children AS C2
WHERE C2.parent = C1.parent AND
C2.lftChild < C1.lftChild) AND
C1.lftParent < C1.lftChild - 1

Finally, the stored procedure that does the left shifting.

-- Procedures to left-shift tree
-- All gaps are moved rightward
-- All spreads remain the same

CREATE PROCEDURE ShiftLeftHelper
@node VARCHAR(10), -- root node of tree or subtree to begin shifting
@normalizeTree BIT = 0 -- normalize tree?
AS
BEGIN

IF EXISTS (SELECT *
FROM RootNode
WHERE value = @node AND
lft <> 1) AND
@normalizeTree = 1
BEGIN
DECLARE @rootGap INT
SELECT @rootGap = lft - 1
FROM RootNode
UPDATE Tree
SET lft = Tree.lft - @rootGap,

rgt = Tree.rgt - @rootGap

END -- IF

-- While any gaps exist at or below provided root node, left shift nodes
-- with gaps and all of their descendants by the necessary amounts

WHILE EXISTS (SELECT *
FROM ConsecutiveSiblingGaps AS G

WHERE G.rgtChild = @node OR
EXISTS (SELECT *
FROM Descendants AS D
WHERE D.ancestor = @node AND
D.descendant = G.rgtChild)) OR

EXISTS (SELECT *
FROM ParentOldestChildGaps AS G

WHERE G.child = @node OR
EXISTS (SELECT *
FROM Descendants AS D
WHERE D.ancestor = @node AND
D.descendant = G.child))
BEGIN
UPDATE Tree
SET lft = Tree.lft -
COALESCE((SELECT gap
FROM ConsecutiveSiblingGaps
WHERE rgtChild = Tree.value),
(SELECT gap
FROM ParentOldestChildGaps
WHERE child = Tree.value),
0) -
COALESCE((SELECT SUM(G.gap)

FROM ConsecutiveSiblingGaps AS G
INNER JOIN

Descendants AS D1
ON D1.ancestor = G.rgtChild AND
D1.descendant = Tree.value AND
(G.rgtChild = @node OR
EXISTS
(SELECT *
FROM Descendants AS D2
WHERE D2.ancestor = @node AND
D2.descendant = G.rgtChild))), 0) -
COALESCE((SELECT SUM(G.gap)

FROM ParentOldestChildGaps AS G
INNER JOIN

Descendants AS D1
ON D1.ancestor = G.child AND
D1.descendant = Tree.value AND
(G.child = @node OR
EXISTS
(SELECT *
FROM Descendants AS D2
WHERE D2.ancestor = @node AND
D2.descendant = G.child))), 0),
rgt = Tree.rgt -
COALESCE((SELECT gap
FROM ConsecutiveSiblingGaps
WHERE rgtChild = Tree.value),
(SELECT gap
FROM ParentOldestChildGaps
WHERE child = Tree.value),
0) -
COALESCE((SELECT SUM(G.gap)

FROM ConsecutiveSiblingGaps AS G
INNER JOIN

Descendants AS D1
ON D1.ancestor = G.rgtChild AND
D1.descendant = Tree.value AND
(G.rgtChild = @node OR
EXISTS
(SELECT *
FROM Descendants AS D2
WHERE D2.ancestor = @node AND
D2.descendant = G.rgtChild))), 0) -
COALESCE((SELECT SUM(G.gap)

FROM ParentOldestChildGaps AS G
INNER JOIN

Descendants AS D1
ON D1.ancestor = G.child AND
D1.descendant = Tree.value AND
(G.child = @node OR
EXISTS
(SELECT *
FROM Descendants AS D2
WHERE D2.ancestor = @node AND
D2.descendant = G.child))), 0)
WHERE (Tree.value = @node OR

EXISTS (SELECT *
FROM Descendants

WHERE descendant = Tree.value AND
ancestor = @node)) AND
(EXISTS (SELECT *
FROM ConsecutiveSiblingGaps AS G
WHERE G.rgtChild = Tree.value) OR
EXISTS (SELECT *

FROM ConsecutiveSiblingGaps AS G
INNER JOIN

Descendants AS D1
ON D1.ancestor = G.rgtChild AND
D1.descendant = Tree.value AND
(G.rgtChild = @node OR
EXISTS (SELECT *
FROM Descendants AS D2
WHERE D2.ancestor = @node AND
D2.descendant = G.rgtChild))) OR

EXISTS (SELECT *
FROM ParentOldestChildGaps AS G

WHERE G.child = Tree.value) OR

EXISTS (SELECT *
FROM ParentOldestChildGaps AS G
INNER JOIN

Descendants AS D1
ON D1.ancestor = G.child AND
D1.descendant = Tree.value AND
(G.child = @node OR
EXISTS (SELECT *
FROM Descendants AS D2
WHERE D2.ancestor = @node AND
D2.descendant = G.child))))
END -- WHILE
END -- PROCEDURE

-- Left-shift complete tree
CREATE PROCEDURE ShiftLeft

AS
DECLARE @root VARCHAR(10)
SELECT @root = value

FROM RootNode

IF @root IS NOT NULL
EXEC ShiftLeftHelper @root

-- Left-shift complete tree with the root's left value set to 1, i.e.,
-- placed in normal form
CREATE PROCEDURE ShiftLeftNormalized

AS
DECLARE @root VARCHAR(10)
SELECT @root = value

FROM RootNode

IF @root IS NOT NULL
EXEC ShiftLeftHelper @root, 1

-- Only left-shift subtree rooted at the given node

CREATE PROCEDURE ShiftLeftSubtree
@node VARCHAR(10)
AS

IF EXISTS (SELECT * FROM Tree WHERE value = @node)
EXEC ShiftLeftHelper @node

Some examples might be illustrative. The tree we start with is

value lft rgt
Albert 100 1200

Bert 111 201

Ernie 250 300
Chuck 400 1100
Donna 501 601
Eddie 701 801
Fred 901 1001

If we want to left shift the entire tree we execute

EXEC ShiftLeft

so the resulting tree is

value lft rgt
Albert 100 1200
Bert 101 191
Ernie 192 242
Chuck 243 943
Donna 244 344
Eddie 345 445
Fred 446 546

If we want to shift the entire tree but start with a left value of 1
for the root node we execute

EXEC ShiftLeftNormalized

which results in the following tree

value lft rgt
Albert 1 1101

Bert 2 92
Ernie 93 143
Chuck 144 844
Donna 145 245
Eddie 246 346
Fred 347 447

Finally, we can just shift a subtree by giving that subtree's root
node by executing

EXEC ShiftLeftSubtree 'Chuck'

which modifies the original table to

value lft rgt
Albert 100 1200

Bert 111 201

Ernie 250 300
Chuck 301 1001
Donna 302 402
Eddie 403 503
Fred 504 604

or

EXEC ShiftLeftSubtree 'Ernie'

value lft rgt
Albert 100 1200

Bert 111 201
Ernie 202 252

Chuck 400 1100
Donna 501 601
Eddie 701 801
Fred 901 1001

Regards,
jag

John Gilson

unread,

Sep 18, 2002, 3:45:15 AM9/18/02

to

"--CELKO--" <71062...@compuserve.com> wrote in message
news:c0d87ec0.0209...@posting.google.com...

I posted a slightly simplified version of my previous algorithm. Both
should give you the same results but the latter is cleaner and, hopefully,
clearer.

Regards,
jag

John Gilson

unread,

Sep 18, 2002, 12:33:06 PM9/18/02

to

"John Gilson" <j...@acm.org> wrote in message news:XZVh9.2110$GJ3.4...@twister.nyc.rr.com...

[...]

Here's the exact same algorithm as described above but with a further simplified
and more efficient implementation. The code below should supersede the
previous version posted.

The change is in dropping the ancestor-descendant relationship (implemented
previously by the Descendants view) in favor of a path relationship (implemented
by the Paths view below). The paths relationship is conceptually a relation of triples
(head of path, tail of path, node in path) for a tree. For example,

SELECT head, tail, node
FROM Paths
WHERE head = 'Albert' AND tail = 'Donna'
ORDER BY lftNode

returns

head tail node
Albert Donna Albert
Albert Donna Chuck
Albert Donna Donna

Here's the new version of the code. The only differences from the previous
implementation are adding the Paths view, dropping the Descendants view,
and significantly simplifying the implementation of the ShiftLeftHelper procedure
by using the Paths view.

-- Enumerate all nodes in all paths in tree from a head node to a
-- tail node, both inclusive.
CREATE VIEW Paths
AS
SELECT T1.value AS head,
T2.value AS tail,
T3.value AS node,
T3.lft AS lftNode,
T3.rgt AS rgtNode

FROM Tree AS T1
INNER JOIN
Tree AS T2

ON T1.lft <= T2.lft AND
T1.rgt >= T2.rgt
INNER JOIN
Tree AS T3
ON T3.lft BETWEEN T1.lft AND T2.lft AND
T3.rgt BETWEEN T2.rgt AND T1.rgt

-- Procedures to left-shift tree

INNER JOIN
Paths AS P
ON P.head = @node AND
P.node = G.rgtChild) OR

EXISTS (SELECT *
FROM ParentOldestChildGaps AS G
INNER JOIN

Paths AS P
ON P.head = @node AND
P.node = G.child)

BEGIN
UPDATE Tree
SET lft = Tree.lft -

COALESCE((SELECT SUM(G.gap)
FROM ConsecutiveSiblingGaps AS G
INNER JOIN

Paths AS P
ON P.head = @node AND
P.tail = Tree.value AND
P.node = G.rgtChild), 0) -

COALESCE((SELECT SUM(G.gap)
FROM ParentOldestChildGaps AS G
INNER JOIN

Paths AS P
ON P.head = @node AND
P.tail = Tree.value AND
P.node = G.child), 0),
rgt = Tree.rgt -

COALESCE((SELECT SUM(G.gap)
FROM ConsecutiveSiblingGaps AS G
INNER JOIN

Paths AS P
ON P.head = @node AND
P.tail = Tree.value AND
P.node = G.rgtChild), 0) -

COALESCE((SELECT SUM(G.gap)
FROM ParentOldestChildGaps AS G
INNER JOIN

Paths AS P
ON P.head = @node AND
P.tail = Tree.value AND
P.node = G.child), 0)
WHERE EXISTS (SELECT *

FROM ConsecutiveSiblingGaps AS G
INNER JOIN

Paths AS P
ON P.head = @node AND
P.tail = Tree.value AND
P.node = G.rgtChild) OR

EXISTS (SELECT *
FROM ParentOldestChildGaps AS G
INNER JOIN

Paths AS P
ON P.head = @node AND
P.tail = Tree.value AND
P.node = G.child)

John Gilson

unread,

Sep 18, 2002, 5:19:14 PM9/18/02

to

"--CELKO--" <71062...@compuserve.com> wrote in message

news:c0d87ec0.02091...@posting.google.com...

Actually, it's possible to write a single view, with a few supporting
views for clarity, that can calculate the total amount every node in
the tree needs to be shifted, relative to some reference node, to move
all gaps rightward. The algorithm is as follows:

forall(Paths P with head node H and tail node T where H = T or T is a
descendant of H, have T be the node whose shift amount will be
calculated with respect to the reference node H)
begin
headNodeLevel := Level(H)
tailNodeLevel := Level(T)
forall(Nodes N where Level(N) >= headNodeLevel and
Level(N) <= tailNodeLevel and
PN is the node at Level(N) within Path P and
N's lft value is <= PN's lft value)
begin
The total shift amount of T with respect to H is
sum(ConsecutiveSiblingGap of N) + sum(ParentOldestChildGap of N)
end
end

The translation of this to SQL is

-- For every pair of nodes (N1, N2) where N2 = N1 or N2 is a descendant
-- of N1, calculate the amount of left shifting that needs to be applied
-- to N2 when the tree rooted at N1 is completely left shifted
CREATE VIEW ShiftedLeftValues
AS
SELECT P.head AS rootNode,
P.tail AS shiftNode,
SUM(COALESCE(G1.gap, G2.gap, 0)) AS gap
FROM Paths AS P
INNER JOIN
NodeLevels AS L1
ON L1.value = P.tail
INNER JOIN
NodeLevels AS L2
ON L2.value = P.node

INNER JOIN
Descendants AS D

ON D.ancestor = P.head
INNER JOIN
NodeLevels AS L3
ON L3.value = D.descendant AND
L3.level = L2.level AND
D.lft <= P.lftNode
LEFT OUTER JOIN
ConsecutiveSiblingGaps AS G1
ON G1.rgtChild = D.descendant
LEFT OUTER JOIN
ParentOldestChildGaps AS G2
ON G2.child = D.descendant
GROUP BY P.head, P.tail

With the helping views defined as

-- Define tree levels with the root node being at level 1,
-- not the standard level 0
CREATE VIEW NodeLevels
AS
SELECT T2.value AS value,
COUNT(T1.value) AS level

FROM Tree AS T1
INNER JOIN
Tree AS T2

ON T2.lft BETWEEN T1.lft AND T1.rgt
GROUP BY T2.value

-- Define tree ancestor-descendant relationship where a node
-- is also considered its own descendant

CREATE VIEW Descendants
AS
SELECT T1.value AS ancestor,

T2.value AS descendant,
T2.lft AS lft,
T2.rgt AS rgt

FROM Tree AS T1
INNER JOIN
Tree AS T2

ON T2.lft >= T1.lft AND
T2.rgt <= T1.rgt

-- Define tree parent-child relationship
CREATE VIEW Children
AS
SELECT T1.value AS parent,
T2.value AS child,
T1.lft AS lftParent,
T1.rgt AS rgtParent,
T2.lft AS lftChild,
T2.rgt AS rgtChild
FROM Tree AS T1
INNER JOIN
Tree AS T2
ON T2.lft > T1.lft AND

T2.rgt < T1.rgt AND
NOT EXISTS (SELECT *

FROM Tree AS T3
WHERE T3.lft > T1.lft AND
T3.lft < T2.lft AND
T3.rgt < T1.rgt AND
T3.rgt > T2.rgt)

-- Consecutive siblings with a gap > 0

CREATE VIEW ConsecutiveSiblingGaps
AS
SELECT C1.child AS lftChild,
C2.child AS rgtChild,
C2.lftChild - C1.rgtChild - 1 AS gap
FROM Children AS C1
INNER JOIN
Children AS C2
ON C1.parent = C2.parent AND

C1.rgtChild < C2.lftChild - 1 AND
NOT EXISTS (SELECT *

FROM Children AS C3
WHERE C3.parent = C1.parent AND
C3.lftChild > C1.rgtChild AND
C3.rgtChild < C2.lftChild)

-- Gap between parent and oldest child, i.e., leftmost child, > 0
CREATE VIEW ParentOldestChildGaps
AS
SELECT C1.parent AS parent,
C1.child AS child,
C1.lftChild - C1.lftParent - 1 AS gap
FROM Children AS C1
WHERE NOT EXISTS (SELECT *
FROM Children AS C2
WHERE C2.parent = C1.parent AND
C2.lftChild < C1.lftChild) AND
C1.lftParent < C1.lftChild - 1

-- Enumerate all nodes in all paths in tree from a head node to a

-- tail node, both inclusive.
CREATE VIEW Paths
AS
SELECT T1.value AS head,
T2.value AS tail,
T3.value AS node,
T3.lft AS lftNode,
T3.rgt AS rgtNode
FROM Tree AS T1
INNER JOIN
Tree AS T2
ON T1.lft <= T2.lft AND
T1.rgt >= T2.rgt
INNER JOIN
Tree AS T3
ON T3.lft BETWEEN T1.lft AND T2.lft AND
T3.rgt BETWEEN T2.rgt AND T1.rgt

The procedure to actually shift the tree is now very straightforward
and succinct

CREATE PROCEDURE ShiftLeftHelper
@node VARCHAR(10) -- reference node of tree or subtree to begin shifting
AS
BEGIN

UPDATE Tree
SET lft = Tree.lft - (SELECT gap
FROM ShiftedLeftValues
WHERE rootNode = @node AND
shiftNode = Tree.value),
rgt = Tree.rgt - (SELECT gap
FROM ShiftedLeftValues
WHERE rootNode = @node AND
shiftNode = Tree.value)
WHERE (SELECT gap
FROM ShiftedLeftValues
WHERE rootNode = @node AND
shiftNode = Tree.value) > 0

END -- PROCEDURE

-- Tree's root node
CREATE VIEW RootNode
AS
SELECT T1.value, T1.lft, T1.rgt
FROM Tree AS T1
WHERE NOT EXISTS (SELECT *
FROM Tree AS T2
WHERE T2.lft < T1.lft AND
T2.rgt > T1.rgt)

-- Left-shift complete tree

CREATE PROCEDURE ShiftLeft
AS
DECLARE @root VARCHAR(10)
SELECT @root = value
FROM RootNode
IF @root IS NOT NULL
EXEC ShiftLeftHelper @root

-- Only left-shift subtree rooted at the given node

CREATE PROCEDURE ShiftLeftSubtree
@node VARCHAR(10)
AS
IF EXISTS (SELECT * FROM Tree WHERE value = @node)
EXEC ShiftLeftHelper @node

This will produce the same results as my other algorithm. It would be
interesting to evaluate the performance of these two approaches,
especially on large trees.

Let's look at a couple of examples.

CREATE TABLE Tree
(
value VARCHAR(10) NOT NULL PRIMARY KEY,
lft INT NOT NULL CHECK (lft > 0),
rgt INT NOT NULL CHECK (rgt > 0),
CHECK (rgt > lft)
)

INSERT INTO TREE
SELECT 'Albert', 100, 1200
UNION ALL
SELECT 'Bert', 111, 201
UNION ALL
SELECT 'Ernie', 250, 300
UNION ALL
SELECT 'Chuck', 400, 1100
UNION ALL
SELECT 'Donna', 501, 601
UNION ALL
SELECT 'Eddie', 701, 801
UNION ALL
SELECT 'Fred', 901, 1001

To shift the entire tree

EXEC ShiftLeft

Tree becomes

value lft rgt
Albert 100 1200

Bert 101 191
Ernie 192 242
Chuck 243 943
Donna 244 344
Eddie 345 445
Fred 446 546

and to shift a subtree, for example,

EXEC ShiftLeftSubtree 'Chuck'

Tree becomes

Albert 100 1200
Bert 111 201
Ernie 250 300
Chuck 301 1001
Donna 302 402
Eddie 403 503
Fred 504 604

Regards,
jag

Message has been deleted

--CELKO--

unread,

Sep 20, 2002, 11:09:22 AM9/20/02

to

>> It could be possible to do this with one sql statement, but it's
difficult. <<

Yes, very difficult <g>. So far, the shortest answer involves a user
defined recursive function that builds the total offset when it is
called.

>> So, the offset for a given node n from its parent's lft is
1+ sum( 1+ leftsibling.rgt - leftsibling.lft )
where leftsibling.parent_id = n.parent_id and leftsibling.rgt <
n.lft.

A view for the same:

create view siblingoffset( node_id int, lft int, rgt int,
lft_offset, rgt_offset )
as
select
n.node_id, n.lft, n.rgt,
1 + sum( 1+ leftsibling.rgt - leftsibling.lft ),
1 + sum( 1+ leftsibling.rgt - leftsibling.lft ) + (n.rgt -
n.lft)
from
orgchart n, orgchart leftsibling
where n.parent_id = leftsibling.parent_id and leftsibling.rgt <
n.lft; <<

Actually, you only need the lft_offset value, since the rgt_offset can
be calculated by using the (lft, rgt) spread within the node.

>> Given the sibling offsets, it's a matter of summing them for all
nodes x
in the path from root, ie the usual nested sets predicate, where x.lft
< n.lft and x.rgt > n.rgt. The sum is the offset from root's lft
value (1
or 0).

update orgchart
set lft =
(select sum(x.lft_offset)
from siblingoffset AS x
where orgchart.lft >= x.lft
and orgchart.rgt <= x.rgt),
rgt =
(select sum(x.rgt_offset) from siblingoffset x
....same....)

The update could be more elegant, but this form is fairly clear. <<

I think that this looks pretty clear, too. I think we could change
the VIEW to show the total of the offsets rather than doing the
summation inside the scalar subqueries in the UPDATE, but I will have
to play around a bit. I am assuming the VIEW would be materialized
once.

--CELKO--

unread,

Sep 20, 2002, 11:20:00 AM9/20/02

to

You put in a lot of work on this one! I also like the idea of giving
the gaps names. One trick that might help is that the immediate
subordinates (the children) can be found with this query:

SELECT B.emp AS boss, P.emp
FROM OrgChart AS P
LEFT OUTER JOIN
OrgChart AS B
ON B.lft
= (SELECT MAX(lft)
FROM OrgChart AS S
WHERE P.lft > S.lft
AND P.lft < S.rgt);

Kendall

unread,

Sep 20, 2002, 3:00:03 PM9/20/02

to

In article <c0d87ec0.02092...@posting.google.com>, "--CELKO--"
<71062...@compuserve.com> wrote:

> Actually, you only need the lft_offset value, since the rgt_offset can
> be calculated by using the (lft, rgt) spread within the node.
>
>

Yes, you're right, and I think I screwed up the rgt summation as well. No
need to add an extra term at each level, as I did below :-).

> rgt =
> (select sum(x.rgt_offset) from siblingoffset x
> ....same....)
>
>

The correct form would be sum(x.lft_offset) + (n.rgt - n.lft).

> The update could be more elegant, but this form is fairly clear. <<
>
> I think that this looks pretty clear, too. I think we could change the
> VIEW to show the total of the offsets rather than doing the summation
> inside the scalar subqueries in the UPDATE, but I will have to play
> around a bit. I am assuming the VIEW would be materialized once.

Yes, that makes some sense. The combined predicate for the summation
terms would be something like "for node n, every node xx to the left of a
node a on the path from root to n, inclusive". Summing the xx's spreads
would yield the total offset from root.

a=root
/ | \
xx xx a
/ | \
xx xx a
/ \
xx a=n

All the xx's are summed to get the total offset down to n.

select n.node_id, sum(1 + xx.rgt - xx.lft) + count(distinct a.node_id) totaloffset
from
orgchart n,
orgchart a, /* ancestor from root to n inclusive */
orgchart xx /* left sibling of an a */
where n.lft >= a.lft and n.lft < a.rgt
and xx.parent_id = a.parent_id
and xx.rgt < a.lft
group by n.node_id

That count(distinct a) term is there to add 1 for each "a" node as we go
down.

Mike

unread,

Sep 23, 2002, 7:00:24 AM9/23/02

to

This is very interesting. I'm left wondering one important thing
though, how can you choose when you should perform the compacting? I'm
thinking of a threaded newsgroup application, one that's very, very
busy (lots of posts). Eventually you need to compact, but once you
compact, you end up going back to a cascading update when new posts
arrive, so is there some happy middle ground? Or a rough hueristic to
avoid bouncing between compacting & cascading updates?

--CELKO--

unread,

Sep 23, 2002, 8:18:32 PM9/23/02

to

>> ... how can you choose when you should perform the compacting? I'm

thinking of a threaded newsgroup application, one that's very, very
busy (lots of posts).<<

Well, I start with a DECIMAL (38,0) -- the Oracle max -- for the
(lft,rgt) numbering and steps of 100 per level -- and prayer. Very
few threads go pass 500 messages. Most of the additions are made on
the end of the last post. So a reorg woudl be a once every few months
task.

I wish I had a magic answer of eternal expansion within the bounds of
finite storage.

Mike

unread,

Sep 24, 2002, 5:31:54 AM9/24/02

to

Whens your next book due out?

--CELKO--

unread,

Sep 25, 2002, 1:44:55 PM9/25/02

to

>> Whens your next book due out? <<

I am trying to move to Salt Lake City in October and my wife is not
her to help. My plan is to have it to Morgan-Kaufmann before the end
of the year for publication in early 2003 (?)

milkchaser

unread,

Oct 1, 2002, 1:12:17 PM10/1/02

to

Originally posted by --Celko--

Writing a book AND moving! You're a glutton for punishment. :-)

Looking over the many posts in this forum concerning the nested set
model, I can't help but wonder if one could define the sets equally well
by replacing the "left" and "right" columns with "node_id" and
"num_descendants" columns. Using your standard org chart table, that
would yield:

emp . node_id . num_descendants
=========.===========.===================
'Albert' . 1 . 5
'Bert' . 2 . 0
'Chuck' . 3 . 3
'Donna' . 4 . 0
'Eddie' . 5 . 0
'Fred' . 6 . 0

I don't know as this is an advantage or disadvantage, but it uses fewer
node ids. Using the left/right numbering process, each node enumerates
a non-existent node (ie. left=2,right=3 enumerates the node itself,
namely 2, and an empty neighbor, namely 3).

--
Posted via http://dbforums.com

--CELKO--

unread,

Oct 4, 2002, 3:51:04 PM10/4/02

to

>> Writing a book AND moving! You're a glutton for punishment. :-) <<

This was not quite my plan, but life is what happens while you are
working on your dreams.

>> Looking over the many posts in this forum concerning the nested set
model, I can't help but wonder if one could define the sets equally
well by replacing the "left" and "right" columns with "node_id" and
"num_descendants" columns. <<

Actually I have a suggestion somewhat like that. use the pre-order
traversal number and the rightmost son.

OrgChart
emp trv_nbr rgt_most
==========================
'Albert' 1 6
'Bert' 2 2
'Chuck' 3 6
'Donna' 4 4
'Eddie' 5 5
'Fred' 6 6

The organizational chart would look like this as a directed graph:

Albert (1,6)
/ \
/ \
Bert (2,2) Chuck (3,6)
/ | \
/ | \
/ | \
/ | \
Donna (4,4) Eddie (5,5) Fred (6,6)

Remember a shorthand in PL/I where you coudl replace a string of END
keywords with a single "END <block name>" construct? Same idea.

--CELKO--

unread,

Oct 4, 2002, 3:56:57 PM10/4/02

to

>> how can you choose when you should perform the compacting? I'm
thinking of a threaded newsgroup application, one that's very, very
busy (lots of posts). Eventually you need to compact, but once you
compact, you end up going back to a cascading update when new posts
arrive, <<

No, think about it. I am gathering all the gaps and merging them into
a huge gap for new nodes on the right side of every sibling group.
What would mess me up is a tree that is a right-side only binary tree
-- one looooong list. In that situation I might have to clean up
things every 20-30 postings.

But I am willing to bet against the worst case -- which is what we do
when we use QuickSort and other algorithms.

Jan.Hidders

unread,

Oct 7, 2002, 11:37:50 AM10/7/02

to

In article <c0d87ec0.02100...@posting.google.com>,

--CELKO-- <71062...@compuserve.com> wrote:
>
>Actually I have a suggestion somewhat like that. use the pre-order
>traversal number and the rightmost son.

You meant rightmost descendant, right? :-)

-- Jan Hidders

Heinz Huber

unread,

Oct 8, 2002, 1:48:20 AM10/8/02

to

71062...@compuserve.com (--CELKO--) wrote in
news:c0d87ec0.02100...@posting.google.com:

I've finally found a "simple" sql that calculates all the shifts in this
model or also in the original nested set model. The only condition is that
you can calculate the nr of "slots" that a row really occupies (I call it
the gap).

In your case above that would be:
emp trv_nbr rgt_most gap
================================
'Albert' 1 6 6
'Bert' 2 2 1
'Chuck' 3 6 4
'Donna' 4 4 1
'Eddie' 5 5 1
'Fred' 6 6 1

This is incidently milkchaser's num_descendants + 1. But only because you
don't have any empty "slots".

To calculate the shift of every row, you take all the gaps of its siblings,
the siblings of its parent, the siblings of its grandparent, ... which come
before the row. That is the gap of all rows that come before the row and
don't have an ancestor that is not an ancestor of this row.
Then you add the number of ancestors, since each row needs 1 slot. Here
only ancestors are relevant, since other rows have been dealt with by the
gaps of their respective ancestor.

SELECT trv_nbr, rght_most,
(SELECT SUM(rght_most - trv_nbr + 1)
FROM Employees before
WHERE before.rght_most < emp.trv_nbr AND NOT EXISTS
(SELECT trv_nbr
FROM Employees parent
WHERE parent.trv_nbr < before.trv_nbr AND
before.trv_nbr <= par.rght_most AND
parent.rght_most < emp.trv_nbr)) AS gaps,
(SELECT COUNT(*)
FROM Employees parent
WHERE parent.trv_nbr < emp.trv_nbr AND
emp.trv_nbr <= parent.rght_most) AS parents,
COALESCE(gaps, 0) + parents + 1 AS shift
FROM Employees emp
ORDER BY trv_nbr;

I'm not sure whether you could directly plug this into an update according
to the SQL standard, since I don't know whether all the values of shift
have to calculated before the update is made.

But you can surely write a stored procedure that opens a cursor (as a
snapshot) on the select above and adjust trv_nbr and rght_most for every
row.

Regards,
Heinz

--CELKO--

unread,

Oct 9, 2002, 8:17:58 PM10/9/02

to

> > I've finally found a "simple" sql that calculates all the shifts in this model or also in the original nested set model. The only condition is that
you can calculate the nr of "slots" that a row really occupies (I call
it
the gap). <<

This looks good, but I will not have time to play with it because I
need to get ready to move before the end of the month. And things are
going wrong -- of course.