Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

Adding history/versioning to a Nested Set model (is it possible?)

103 views
Skip to first unread message

Ed Lucas

unread,
Oct 28, 2009, 1:55:35 PM10/28/09
to
Hi All,

I'm trying to implement a tree structure, hence I'd like the Nested
Set model, but I need to be able to record the history of the tree
structure *somehow* so that I can access previous versions.

I've searched up and down online but all I can find are transactions
+rollbacks.

As far as I can see...

The simplest option would be to add a version number to the model, but
then every member of the set would have to be updated each time a
change is made - not feasible because of how fast that would make the
table would grow, and how slow that would make it to update?

I imagine I could log each action and then replay all the log actions
in a temporary table when I needed them, but that would get pretty
slow after a few updates.

The previous Nested Set could be serialised to an blob and stored.
That's currently looking like the best option, but only by process of
elimination...

I've probably missed something obvious that's staring me in the face
here! - any suggestions?

Thanks for taking the time to read this far, best regards,
Ed

Vladimir Odrljin

unread,
Oct 30, 2009, 4:39:09 AM10/30/09
to


You can find the solution at www.dbdesign11.com – see states of
relationships.
Vladimir Odrljin

Tegiri Nenashi

unread,
Oct 30, 2009, 12:04:34 PM10/30/09
to

When you make even small change to a tree (e.g. insert a node), you
have to recalculate half of the node encodings on average. This
feature alone invalidates nested sets approach for temporal
hierarchies. Google nested intervals.

Ed Lucas

unread,
Oct 30, 2009, 12:37:39 PM10/30/09
to


Aha! Thank you very much to you and Vladimir!

Now I just need to sit in a darkened room with a lot of coffee while I
get my head round it all...

Vladimir Odrljin

unread,
Oct 31, 2009, 2:03:47 PM10/31/09
to
> get my head round it all...- Hide quoted text -
>
> - Show quoted text -

vldm10

unread,
Oct 31, 2009, 2:17:11 PM10/31/09
to
On Oct 30, 5:37 pm, Ed Lucas <edward.lu...@gmail.com> wrote:
> get my head round it all...- Hide quoted text -
>
> - Show quoted text -
Because there are many kinds of hierarchies, I can suggest the most
common example of this kind of data – an organizational chart. Let me
give an example which maybe can give a better idea about my solution.

Mary Smith is a boss of 100 employees, and one employee is Jane
Jones. Try to solve the following using my solution:
1. Mary Smith changed her last name (she got divorced)
Jane Jones changed her last names (she got married)
Generally, every attribute of an entity can be changed at any
time.
2. One week Jane Jones works the morning shift. From 8-12 she works
with
customers and her boss is Mary Smith. From 12- 4pm Jane does data
entry
and her boss is Jim.
Every second week she works the afternoon shift and also has two
different bosses.
(Note that Jane can have in different periods some new bosses)
3. Jane can enter false data, to get a better salary. Or generally,
solve the problem of any
kind of crime, false data or tries to broke the DB solution.

Now try to apply my solution so that it keeps all information from1.,
2. and 3.
Note that Mary Smith --- Jane Jones is just one line in this hierarchy
and this example is a simple hierarchy, but this is a kind of the real
world example. Hope this example will help.

Vladimir Odrljin

-CELKO-

unread,
Oct 31, 2009, 11:32:11 PM10/31/09
to
First, get a copy of my TREES & HIERARCHIES IN SQL so you will have
some code to work with.

In this model, the structure is kept in a table like this: Tree (lft,
rgt, node) where the node is a reference to the Nodes table and (lft,
rgt) is an integer pair. It is very small, and if you leave an
appropriate fill factor on the data pages, updates are very fast.
Think about how much fits on a 5K data page, with three 4-byte
integers. .

You could do snapshots of the whole tree with a timestamp. Or add a
timestamp column and keep it all in one table. This will let you do a
COMPLETE re-structuring every time -- probably not likely in the real
world.

If the table is growing, then we can put wide gaps in the (lft, rgt)
and keep adding nodes to the tree as long as we can preserve the
"between-ness" property of new nodes. This means you have to give up
the algebraic property (i.e sub-tree size rooted at this node = (rgt -
lft +1) /2 for all nodes); but do you need it?

My experience with applications that use Nested Set model, you handle
~1,000 nodes per thread in a newsgroup forum where trees grow in a
predictable fashion (i.e. add leaf nodes to an existing structure).

Talk to me off-line.

0 new messages