Transitive closure for query optimization.

8 views
Skip to first unread message

Dhruv Matani

unread,
Jun 25, 2006, 12:26:33 AM6/25/06
to The Distibuted DataBase
Hello,
The next stage in query optimization would be to add transitive
closures for simple queries involving joins.

[1] SELECT * FROM t1, t2, t3, t4 WHERE
t1.x=23 AND
t1.x=t2.x AND
t2.x = t3.x AND
t4.x = 11;

Should get simplified, so that the table reader has the following information:
t1.x=23
t2.x=23
t3.x=23

This operation can be trivially done by computing the Transitive
closure of all fields involved in the where clause when they are
connected using the '=' operator. This can be done using the
Floyd-Warshall algorithm. However, if we observe closely, then we need
not compute the full transitive closure at all. This will save us from
using the O(n^3) floy-warshall algorithm.

Instead, we use a different "grouping algorithm", which accomplishes
the same goal, but in O(n^2) time.

If we notice closely, then we observe that we need NOT compute the
Transitive Closure at all for such queries. Instead, a simpler
grouping mechanism, wherein all reachable fields are grouped together
would suffice. Consider for example:
t1.x=t2.x AND t2.x=t3.x AND t1.x=23;

A Transitive closure operation would give:
t1.x=t2.x
t2.x=t3.x
t3.x=t1.x

However, it is sufficient to know that [t1.x, t2.x, t3.x] all have the
same value, and given that any one of them is compared with a
constant(t1.x=23 in this case), it is sufficient to say that the rest
should also equal the same constant value(23 in this case).

This can be done in O(n^2) time by the algorithm which will be
proposed & discussed later.


--
-Dhruv Matani.
http://www.geocities.com/dhruvbird/

"The biggest room is the room for improvement."
-- Navjot Singh Siddhu.

Reply all
Reply to author
Forward
0 new messages