How to dramatically improve search performance (in some cases)

6 views
Skip to first unread message

Bluebutton

unread,
Jul 27, 2008, 6:14:41 AM7/27/08
to ActiveScaffold : Ruby on Rails plugin
I have a large database with a column of serial numbers. I noticed
that although I had
indexed that column, performance was not very good when searching on
it.

It turns out that ActiveScaffold::Finder does case-insensitive
searches by using LOWER
in the SQL it generates to case-lower the column data. But this is
really bad:

explain select * from units where (LOWER(serial_number) like
'001099003436%' );

+----+-------------+-------+------+---------------+------+---------
+------+--------+-------------+
| id | select_type | table | type | possible_keys | key | key_len |
ref | rows | Extra |
+----+-------------+-------+------+---------------+------+---------
+------+--------+-------------+
| 1 | SIMPLE | units | ALL | NULL | NULL | NULL |
NULL | 481025 | Using where |
+----+-------------+-------+------+---------------+------+---------
+------+--------+-------------+

This shows that it's not using the index at all, and is searching
every row..

If I remove LOWER, however:

mysql> explain select * from units where (serial_number like
'001099003436%' );
+----+-------------+-------+-------+------------------------------
+------------------------------+---------+------+------+-------------+
| id | select_type | table | type | possible_keys |
key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+------------------------------
+------------------------------+---------+------+------+-------------+
| 1 | SIMPLE | units | range | index_units_on_serial_number |
index_units_on_serial_number | 258 | NULL | 1 | Using where |
+----+-------------+-------+-------+------------------------------
+------------------------------+---------+------+------+-------------+


This is MUCH better! The number of rows searched drops from 481025 to
1!

So it seems that generating SQL with LOWER should be made optional
(perhaps with a flag like 'case_insensitive'. Given the performance
consequences of LOWER, if one really wants to do
a case-insensitive search, while preserving the original case of the
data, it would pay in some cases
to store the data twice: once as originally entered, and once as a
case lowered string to use as a
search column (after also case-lowering the key). In my case, I don't
need to preserve the original
case; I can store in the database already case-lowered and do a fast
case-sensitive indexed
search using a case-lowered key.

Tim Uckun

unread,
Jul 28, 2008, 8:10:43 AM7/28/08
to actives...@googlegroups.com
>
> I have a large database with a column of serial numbers. I noticed
> that although I had
> indexed that column, performance was not very good when searching on
> it.
>
> It turns out that ActiveScaffold::Finder does case-insensitive
> searches by using LOWER
> in the SQL it generates to case-lower the column data. But this is
> really bad:
>
> explain select * from units where (LOWER(serial_number) like
> '001099003436%' );


Postgres doesn't have case insensitive collations so that's probably
why it's there. In postgres you can build an index on a function can
you do that with mysql? if so that might fix your problem.

Reply all
Reply to author
Forward
0 new messages