Details on the Grouping algorithm.

2 views
Skip to first unread message

Dhruv Matani

unread,
Jun 25, 2006, 3:11:47 AM6/25/06
to The Distibuted DataBase
Hello,
Described below is the details for the grouping algorithm, which is
to be used instead of the Transitive-closure algorithm. Since we don't
need an all-to-all pairs closure, but instead require only a
reachability matrix, it is possible to use this algorithm instead.


[1] Assign each term of the expression of the form LHS=RHS an ID. So,
in this case, LHS will get an ID 0, and RHS gets ID 1.

[2] Perform step[1], assigning a unique ID to each unique LHS & RHS
term(s) that appear in the WHERE clause. Example:
x=y AND x=z AND y=a AND a=b
x ==> 0
y ==> 1
z ==> 2
a ==> 3
b ==> 4

[3] Create a reachability matrix, and set all those points where there
is a join.
So, for the above example, the following pairs will have a set value:
{0,1} {1,0} {0,2} {2,0} {1,3} {3,1} {3,4} {4,3}

[4] Start from any one node, and start adding it to group 0. So, if we
start from node 0, then 1 gets added, then all those nodes which are
reachanble from node-1. They are 0(which is already there), and 3.
Then all nodes reachable from 3(1 & 4), all nodes reachable from (1 &
4), and so on.... we don't expand into a node if it has already been
visited. So, in the above example, we won't actually be visiting node
1 again.

[5] Check if any node is still unvisited. If so, repeat from step[4],
and add those nodes to a group numbered 1 more than the previous
group.

[6] Finally, we will get one or more sets of graphs which are fully connected.

The algorithm can be implemented as a DFS search on the nodes. Overall
complexity is O(n^2) in worst case(when there is only 1 graph
generated).


--
-Dhruv Matani.
http://www.geocities.com/dhruvbird/

"The biggest room is the room for improvement."
-- Navjot Singh Siddhu.

Reply all
Reply to author
Forward
0 new messages