Implement Closure Table for Hierarchical data

1,154 views
Skip to first unread message

Hugo

unread,
Aug 21, 2011, 7:37:00 AM8/21/11
to Redis DB
Hi all,

I'm trying to see how to implement a closure table (http://
karwin.blogspot.com/2010/03/rendering-trees-with-closure-tables.html
and http://www.slideshare.net/billkarwin/models-for-hierarchical-data
- slide 64) in Redis.

In SQL this is done by creating the following table:

| ancestor | descendant | length
| 1 | 2 | 1
| 1 | 4 | 1
| 1 | 3 | 3
| 2 | 3 | 1

for this tree:
1
|
------------
| |
2 4
|
3

My first attempt is to create two sets for each node:

ancestors:<node_id>
descendants:<node_id>

So I can get all the ancestors or descendants for a specific node in
one query.

Besides not being sure if the is the right way to go, I don't know how
to map the length column( so I can query for instance the immediate
descendants or ancestors for a specific node).

For my application will have at most four levels in the tree and the
most important is to have fast reads for:
- Getting all node immediate ancestors or descendants (level = 1).
- Getting all node ancestors or descendants.
- Show all the network tree.

Can anyone help me?
Thanks.
Best regards,
Hugo

Didier Spezia

unread,
Aug 21, 2011, 11:03:44 AM8/21/11
to redi...@googlegroups.com

Here is a solution based on sorted sets:

for node n, you store:
- a zset whose key is desc:n, with score is the relative path length,
  and value is 'node:ascendant of this node'
- a zset whose key is asc:n, with score is the relative path length,
  and values is just 'node' 

Your example tree is represented by:
zadd desc:1 1 2:1
zadd desc:1 1 4:1
zadd desc:1 2 3:2
zadd desc:2 1 3:2
zadd asc:2 1 1
zadd asc:3 2 1
zadd asc:3 1 2
zadd asc:4 1 1

- Getting all node 3 immediate ancestors:
zrangebyscore asc:3 1 1

- Getting all node 1 immediate descendants:
zrangebyscore desc:1 1 1

- Getting all node 1 descendants (i.e. getting all the tree):
zrangebyscore desc:1 1 +inf WITHSCORES

- Getting all node 3 ancestors:
zrangebyscore asc:3 1 +inf WITHSCORES

Because the values stored in the desc zsets contain both the node and
its immediate ancestor, fetching values from a desc zset is enough
for a client to quickly build a memory representation of the subtree.

Regards,
Didier.

Hugo

unread,
Aug 23, 2011, 6:45:57 AM8/23/11
to Redis DB
Hi Didier,

Thanks for your help!

I've started to implement the solution that you suggested but I'm
having some problems with it.
For instance when adding a new node (4) bellow 3, I need to use MULTI
to:
- add the immediate ancestor (3) to asc:4.
- add the created node (4) to desc:3.
- get all upper ancestors and recursive add them to key 4:asc and 4
to the ancestor (ancestor_id:desc).

Because I have the a query in the middle of the multi (zrangebyscore)
this does not seem to work. Can you help me?

Code:
MULTI
// add ancestor.
// score = proximity, value = ancestor.
zadd asc:4 1 3
// add descendant to immediate ancestor.
// score = proximity, value = end : origin.
zadd desc:3 1 "4:3"

// find all 3 ancestors:
zrangebyscore asc:3 1 +inf WITHSCORES
// recursive add upper ancestor to 4.
multi.zadd asc:3 (score+1) upper_ancestor
// recursive add 4 to upper ancestor.
multi.zadd desc:upper_ancestor 1 "4:3"

Note: I'm using node.js to do this.

Thanks again.
Best regards,
Hugo

Hugo

unread,
Aug 23, 2011, 7:42:11 AM8/23/11
to Redis DB
Hello again Didier,

Sorry for the previous message. The answer is very simple: I just need
to fetch all the ancestors before creating the multi.

Thanks again for your help.

Best regards,
Hugo

Didier Spezia

unread,
Aug 24, 2011, 6:24:07 AM8/24/11
to redi...@googlegroups.com
Hi,

maintaining the closure table (creation, deletion, node move between branches, etc ...)
is clearly the most complex part. If the tree can be updated concurrently, I'm not sure
the implementation is possible without relying on Lua scripting ...

Another downside of this approach is the memory consumption. zsets are huge in
memory. You mentioned your trees have 4 levels max. If they are small,
you should consider using Redis 2.4 with its specific optimizations for
small zsets.

Regards,
Didier.

Hugo

unread,
Aug 24, 2011, 6:56:18 AM8/24/11
to Redis DB
Hi Didier,

The tree will have 4 levels max and the 5000 nodes max.
It might happen that two users are changing two different nodes at
this same time, so yes it it can be updated concurrently. WATCH does
not help in this case?

Also, unfortunately due to a change in the requirements, a specific
node can have one or more parents, so the closure table might not be
the best for my use case.

For instance:

1
|
-----------
| |
| 4
2 |
| 5
| |
-----------
|
3


In the example above I will have two asc for node 3 that link to 1
with different scores:

zadd asc:3 1 2
zadd asc:3 2 1 <----
zadd asc:3 1 5
zadd asc:3 2 4
zadd asc:3 3 1 <----

How can I save this structure in redis?

I've downloaded redis 2.4 Release Candidate 6 and will check it.

Thanks again.

Best regards,
Hugo

Didier Spezia

unread,
Aug 24, 2011, 9:12:59 AM8/24/11
to redi...@googlegroups.com

Yes, you can use WATCH to implement an optimistic locking mechanism.
It adds some more complexity in the code though.

IMO, the granularity of the lock should be the tree itself (coarse locking),
because it looks hard to determine the optimal subset of keys to lock
depending on the operation.

Regarding the new requirements, you want to move from a tree to
a directed acyclic graph. 

It is possible to store a graph with a similar closure table.
You could keep the desc keys unchanged, and for the asc keys
store the ascendant nodes with their immediate descendant.
So your example would be:

zadd asc:3 1 2:3 
zadd asc:3 2 1:2
zadd asc:3 1 5:3 
zadd asc:3 2 4:5 
zadd asc:3 3 1:4


Please note node 1 is stored twice because it has 2 descendants.
It is up to the client to merge the two items in a single node when
the ascendants are retrieved.

Actually the definition of the desc and asc content is now completely
symmetrical, and the graph can be reversed by exchanging the content
of the two keys for all the nodes ...

Maintenance of this structure is even more complex (and probably
twice as expensive) than for the tree.

You should perhaps consider another storage engine, more adapted
to this kind of data. For instance, OrientDB and Neo4j are both excellent
open-source graph oriented databases for which it should be possible
to find Node.js drivers.

Best regards,
Didier.

Hugo

unread,
Aug 24, 2011, 11:10:55 AM8/24/11
to Redis DB
Hi again Didier,

Thanks for your help.
In the project I'm developing I'm using mongodb and redis and didn't
want to add another database, so I was trying to see if I could do
this in Redis.

I've considered the databases you suggested but:
Neo4J: The price tag is high for my project.
OrientDB: The driver that I found (https://github.com/maggiolo00/node-
orientdb) does not have any documentation. I'm going to ask for help
in the mailing list.

Thanks again.
Best regards,
Hugo

Reply all
Reply to author
Forward
0 new messages