Suppose table tab1 constains 5000 rows.
SELECT * FROM tab1 as t1, tab2 as t2 WHERE t1.x=23 AND t2.x=t1.x;
Will in the current scenario result in 2,50,00,000 tuples being
examined by the Join algorithm. This takes approximately 34sec. When
compiled with -O1 optimization level. Mysql too takes about 32sec. to
run Nested Loop joins on similar queries if forced to do so using an
OR condition, or if we leave out the constant term(23).
With the new scenario, with the WHERE clause with constants being
pushed down, it will result in just 1*5000 rows being selected for
further processing(assuming that there is just 1 row which satisfies
that constraint in the target table).
This is a _big_ win in terms of performance. Further, we can reduce
this even more in case of multi-table JOINs by using Hash-Join for the
above case. Thus, full processing will have to be done for just 1*1
row(s) in the above case.
The new function added is is_simple() which returns TRUE if every term
in the WHERE clause is connected by AND only. Next step would be to
add accessors to check if exactly 1 side of a binary_term is a
constant. If so, we check which table it belongs to, and group all
such terms having 1 constant(eg: x=23, y<23, etc....) and for a
separate sql_condition object for each table used. This sql_condition
object will be passed to the table reader which will select only those
rows which match the constraints.
--
-Dhruv Matani.
http://www.geocities.com/dhruvbird/
"The biggest room is the room for improvement."
-- Navjot Singh Siddhu.